# THE GAUSS MARKOV THEOREM: A PEDAGOGICAL NOTE.

Ralph C. Allen [*]

Jack H. Stone [**]

Abstract

When stating the Gauss-Markov theorem, undergraduate econometric textbooks generally imply that the ordinary least squares (OLS) estimator has minimum variance. However, the proof of the Gauss-Markov theorem indicates that the weights produced by the OLS estimator, not the formula per se, produce the unique minimum-variance estimator. An example demonstrates that other linear unbiased estimators can yield the same variance as the OLS estimator; however, the weights from such formulas are identical to the OLS weights. To avoid this pedagogical confusion, the statement of the theorem should emphasize the weights rather than the estimator itself.

Undergraduate econometric textbooks generally state the Gauss-Markov theorem as follows: Among all linear unbiased estimators, the ordinary least squares (OLS) estimator is best in the sense that it has minimum variance [Gujarati 1995; Pindyck and Rubinfeld 1997; Studenmund 1997]. These textbook assertions that the OLS estimator is best imply that the process (formula) that OLS uses to generate the numerical estimate uniquely minimizes the variance. However, the crucial feature of the proof of the Gauss-Markov theorem is that, of all possible sets of weights formed from the observations on the independent variables and applied to the observations on the dependent variables to form a linear unbiased estimator, the weights produced by the OLS estimator will produce the minimum variance estimator. The proof implies that the weights are unique but not the OLS estimator formula per se. [1]

Unfortunately, when authors state the theorem, they often ignore the proof that stresses the weights and at best just state the theorem in a form that stresses the OLS formula. For example, Gujarati (1995), Pindyck and Rubinfeld (1997), and Studenmund (1997) state the theorem and then refer the reader to an appendix for a proof. Brown (1991) refers to the Gauss-Markov theorem but does not state it or provide a proof even in an appendix. Maddala (1988) only refers to the theorem (in contrast to stating it) and presents a proof in an appendix. Pindyck and Rubinfeld (1999), Gujarati (1999), and Mirer (1995) indicate that the OLS formula can be written as a linear combination of the observations of the dependent variable but then state the theorem in the form that indicates that the OLS formula produces a unique, minimum variance estimator.

The distinction between the formula and the weights may sound like a distinction without a difference. However, given that undergraduate econometrics courses rarely formally prove the Gauss-Markov theorem, embarrassing pedagogical errors can result as is suggested by the following incident based on a classroom experience by one of the authors. [2]

An instructor wishes to illustrate the Gauss-Markov theorem by selecting an alternative, linear unbiased estimator to estimate the slope coefficient. The intent is to show that this alternative estimator has a larger variance than the OLS estimator. The instructor employs the following conventional, simple regression model that satisfies all of the standard OLS assumptions: Yi = 60 + .10xi + Ui, where Ui has the following probability distribution for any i:

P(Ui = - 1) = 1/2

P(Ui = + 1) = 1/2.

The independent variable, Xi, is expressed as a deviation from its own mean, xi. The expected value of Ui equals 0, and the variance of Ui equals 1.

In each possible sample consisting of three observations drawn from the process described by the linear model above, let Xi = 100 for i = 1 (that is, Xi = 100 for the first observation), Xi = 200 for i = 2, and Xi = 300 for i = 3. Therefore, xi = -100, 0, 100 for i = 1, 2, 3, respectively.

Consider two estimators, OLS and an alternative called J, that could be used to estimate the slope parameter $\beta_1$ (that is, .10). The OLS estimator for $\beta_1$ is $\sum x_i Y_i / \sum x_i^2 = [x_1/\sum x_i^2]Y_1 + [x_2/\sum x_i^2]Y_2 + [x_3/\sum x_i^2]Y_3$, where the weights are the expressions in the square brackets, [], and $Y_1$, $Y_2$, and $Y_3$ are the observations on the dependent variable. The alternative J estimator is $(Y_3 - Y_1)/(x_3 - x_1) = [-1/(x_3 - x_1)]Y_1 + [0]Y_2 + [1/(x_3 - x_1)]Y_3$, where again the weights are the expressions in the square brackets. Note that the formula that generates the weights for the J estimator is significantly different from the formula of the OLS estimator.

Both OLS and J are linear and unbiased, and they both have expected values equal to .10. Because the variance of Ui equals 1, the variance of the OLS estimator is equal to $1/\sum x_i^2$, which, in the sample, is equal to 1/20000. Because the errors are uncorrelated across observations, the variance of J is equal to $\mathrm{VAR}\{[1/(x_3 - x_1)](Y_3 - Y_1)\} = [1/(x_3 - x_1)^2]\,\mathrm{VAR}(Y_3 - Y_1) = [1/(x_3 - x_1)^2]\,[\mathrm{VAR}(Y_3) + \mathrm{VAR}(Y_1)] = (1 + 1)/200^2 = 2/40000 = 1/20000$. Therefore, the variances of the OLS and J estimators are the same for this sample.
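Because each Ui takes only two values, the three-observation design has just eight equally likely samples, so the mean and variance of each estimator can be checked by exact enumeration rather than by formula. The following is a minimal sketch under that reading of the example; the function names (`ols_slope`, `j_slope`, `moments`) are illustrative and do not come from the note.

```python
from fractions import Fraction
from itertools import product

# Model from the example: Yi = 60 + .10*xi + Ui, with xi in deviation form.
BETA0, BETA1 = Fraction(60), Fraction(1, 10)
x = [Fraction(-100), Fraction(0), Fraction(100)]

def ols_slope(y):
    """OLS slope: sum(xi * Yi) / sum(xi^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def j_slope(y):
    """Alternative estimator J: (Y3 - Y1) / (x3 - x1)."""
    return (y[2] - y[0]) / (x[2] - x[0])

def moments(estimator):
    """Exact mean and variance over the 8 equally likely error vectors."""
    estimates = []
    for u in product((-1, 1), repeat=3):
        y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]
        estimates.append(estimator(y))
    mean = sum(estimates) / len(estimates)
    var = sum((e - mean) ** 2 for e in estimates) / len(estimates)
    return mean, var

print("OLS:", moments(ols_slope))
print("J:  ", moments(j_slope))
```

Exact fractions are used instead of floating point so that the equality of the two variances is not blurred by rounding.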

The Gauss-Markov theorem implies that OLS must be unique; however, J is an alternative, linear unbiased estimator that yields the same variance as OLS. In addition, for any sample consisting of three observations such that the Xi's in the new sample are some additive displacement of the Xi's used in the example, the variances are the same. Therefore, an infinite number of samples of size three exist for which the variances of the OLS and J estimators are the same.
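The additive-displacement claim can be checked numerically. The sketch below assumes uncorrelated errors with variance 1, as in the example, and computes the variance of a linear estimator as the sum of its squared weights; the helper names and the particular shifts tried are illustrative choices, not part of the note.

```python
from fractions import Fraction

def ols_weights(X):
    """OLS slope weights x_i / sum(x_j^2), with x_i = X_i - mean(X)."""
    mean = Fraction(sum(X), len(X))
    x = [X_i - mean for X_i in X]
    sxx = sum(x_i * x_i for x_i in x)
    return [x_i / sxx for x_i in x]

def j_weights(X):
    """J weights: -1/(X_n - X_1) on the first Y, +1/(X_n - X_1) on the last, 0 elsewhere."""
    span = Fraction(X[-1] - X[0])
    return [-1 / span] + [Fraction(0)] * (len(X) - 2) + [1 / span]

def variance(weights, sigma2=1):
    """Variance of sum(w_i * Y_i) for uncorrelated errors with variance sigma2."""
    return sigma2 * sum(w * w for w in weights)

base = [100, 200, 300]
results = {}
for shift in (0, 50, -100, 1000):  # arbitrary illustrative displacements
    X = [X_i + shift for X_i in base]
    results[shift] = (ols_weights(X), j_weights(X))
    print(shift, variance(results[shift][0]), variance(results[shift][1]))
```

Shifting every Xi by the same constant leaves the deviations xi unchanged, which is why both sets of weights, and therefore both variances, stay the same.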

The instructor (and the students) undertaking such a classroom exercise may be genuinely confused. Initially assuming that the OLS estimator was uniquely best, the instructor now believes a case exists that violates this uniqueness property. However, upon further investigation the instructor finds the weights in the linear estimator J are identical to the OLS weights, i.e., [-1/200]Y1 + [0]Y2 + [1/200]Y3, even though the estimators themselves are different.
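Why equal variance forces equal weights can be sketched with the decomposition used in standard proofs of the theorem. The symbols $c_i$, $w_i$, $d_i$, and $\sigma^2$ are notation introduced here for illustration; the sketch assumes the deviation-form model $Y_i = \beta_0 + \beta_1 x_i + U_i$ with uncorrelated errors of common variance $\sigma^2$.

$$
\begin{aligned}
&\text{Linear estimator: } \textstyle\sum c_i Y_i, \qquad \text{unbiased for } \beta_1 \iff \textstyle\sum c_i = 0 \ \text{and}\ \textstyle\sum c_i x_i = 1. \\
&\text{OLS weights: } w_i = x_i / \textstyle\sum x_j^2, \qquad \text{write } c_i = w_i + d_i. \\
&\textstyle\sum d_i = 0, \quad \textstyle\sum d_i x_i = 0 \ \Longrightarrow\ \textstyle\sum w_i d_i = \frac{\sum x_i d_i}{\sum x_j^2} = 0. \\
&\mathrm{VAR}\!\left(\textstyle\sum c_i Y_i\right) = \sigma^2 \textstyle\sum c_i^2 = \sigma^2 \textstyle\sum w_i^2 + \sigma^2 \textstyle\sum d_i^2 \ \ge\ \sigma^2 \textstyle\sum w_i^2 .
\end{aligned}
$$

Equality holds only when every $d_i = 0$, that is, when $c_i = w_i$ for all $i$. A linear unbiased estimator that matches the OLS variance in a given sample must therefore carry the OLS weights in that sample, whatever formula generated them.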

After reviewing a standard proof of the Gauss-Markov theorem, the instructor concludes that the critical factor to demonstrate to the students is that the weights formed from the independent variables are uniquely best and not the estimator formula. To implement this approach, the instructor now believes that the formula for the OLS estimator in undergraduate econometrics texts should be expressed with the weights generated by this formula explicitly displayed. In addition, the Gauss-Markov theorem should be restated as follows: For any set of sample data and for the class of linear unbiased estimators, the OLS estimator generates a set of weights formed from the observations on the independent variables that, when applied to the observations on the dependent variables, uniquely yield the smallest variance compared to any other set of weights. Given that the formal proof is omitted or relegated to an appendix (that is usually not read by the student), stating the Gauss-Markov theorem in this manner has two advantages.

First, this approach emphasizes that the weights (and not the estimator) are unique. Second, it emphasizes that the OLS estimator produces the ideal weights for any sample of data (in contrast to estimators such as J, which produce minimum variance weights for a more limited set of samples).
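The second advantage can be illustrated with a sample in which the Xi's are not evenly spaced. The sketch below uses a hypothetical sample, Xi = 100, 200, 600, that does not appear in the note, and again assumes uncorrelated errors with variance 1; the helper names are illustrative.

```python
from fractions import Fraction

def deviations(X):
    """Express each X_i as a deviation from the sample mean."""
    mean = Fraction(sum(X), len(X))
    return [X_i - mean for X_i in X]

def ols_weights(x):
    """OLS slope weights x_i / sum(x_j^2)."""
    sxx = sum(x_i * x_i for x_i in x)
    return [x_i / sxx for x_i in x]

def j_weights(x):
    """J weights: -1/(x_n - x_1), zeros on interior observations, +1/(x_n - x_1)."""
    span = x[-1] - x[0]
    return [-1 / span] + [Fraction(0)] * (len(x) - 2) + [1 / span]

def is_unbiased(weights, x):
    """Slope-unbiasedness conditions: sum(c_i) = 0 and sum(c_i * x_i) = 1."""
    return sum(weights) == 0 and sum(w * x_i for w, x_i in zip(weights, x)) == 1

def variance(weights, sigma2=1):
    """Variance of sum(c_i * Y_i) for uncorrelated errors with variance sigma2."""
    return sigma2 * sum(w * w for w in weights)

x = deviations([100, 200, 600])  # hypothetical, unevenly spaced sample
w_ols, w_j = ols_weights(x), j_weights(x)
print("OLS weights:", w_ols, "variance:", variance(w_ols))
print("J weights:  ", w_j, "variance:", variance(w_j))
```

In this sample J remains linear and unbiased, but its weights differ from the OLS weights and its variance (1/125000) exceeds the OLS variance (1/140000).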

(*.) Professor of Economics, Valdosta State University. 912-245-2243 (Voice) 912-245-2248 (Fax) rcallen@valdosta.edu

(**.) Associate Professor of Economics, Spelman College. 404-223-7579 (Voice) jstone@spelman.edu

Notes

(1.) At a Teaching Conference on Quantitative Methods, Econometrics, and Mathematical Analysis held at Miami University on October 20-22, 1996, a summary of undergraduate texts used by participants was compiled. With one exception, all current or previous editions of the sample of texts identified in this paragraph were included in the list of fourteen identified in the summary. The remaining text, Mirer (1995), was used by one of the authors in his econometrics course.

(2.) Erekson (1996), who argued that the typical student might not be prepared for formal statistical theory, underscores the widespread omission of formal proofs in undergraduate economics courses. Erekson further argued that undergraduate econometrics should limit exposure to formal statistical theory and use intuitive techniques such as Monte Carlo exercises.

References

Brown, William S. (1991). Introducing Econometrics. West Publishing Company.

Erekson, Homer (1996). "Designing an Undergraduate Econometrics Course." Presented at Miami University Teaching Conference: Quantitative Methods, Econometrics, and Mathematical Analysis in Teaching Undergraduate Economics, October 20-22.

Gujarati, Damodar N. (1999). Essentials of Econometrics. McGraw-Hill, Second Edition.

Gujarati, Damodar N. (1995). Basic Econometrics. McGraw-Hill, Third Edition.

Maddala, G. S. (1988). Introduction to Econometrics. Macmillan Publishing Company.

Mirer, Thad (1995). Economic Statistics and Econometrics. Prentice Hall, Third Edition.

Pindyck, Robert S. and Rubinfeld, Daniel L. (1997). Econometric Models & Economic Forecasts. McGraw-Hill Book Company, Fourth Edition.

Studenmund, A. H. (1997). Using Econometrics. Addison Wesley, Third Edition.

Author: Allen, Ralph C.; Stone, Jack H.

Publication: American Economist

Date: Mar 22, 2001
