Chapter 4: Properties of OLS Estimators
Assumptions of the Simple Regression Model (in terms of the error term):
1. $y = \beta_1 + \beta_2 x + e$
2. $E(e) = 0$
3. $\mathrm{Var}(e) = \sigma^2$
4. $\mathrm{Cov}(e_i, e_j) = 0$ for $i \neq j$
5. The x-values are non-random and take at least 2 different values
6. (Optional) $e \sim N(0, \sigma^2)$
Notes:
a) The population parameters $\beta_1$ and $\beta_2$ are unknown constants.
b) The formulas that produce the estimates $b_1$ and $b_2$ are called the estimators of $\beta_1$ and $\beta_2$.
c) Estimators (formulas) are random variables because they give different results with different samples.
d) Since estimators are random variables, their mean, variance and probability distribution can be calculated.
e) The properties of the OLS estimators can be compared with those of alternative estimators.
Properties of OLS Estimators:
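It can be proven that the means of the estimators $b_1$ and $b_2$ are equal to the true population parameters $\beta_1$ and $\beta_2$, i.e. $E(b_1) = \beta_1$ and $E(b_2) = \beta_2$. The OLS estimators are therefore unbiased.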
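The idea that an estimator is a random variable whose mean equals the true parameter can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the true parameters (2.0 and 0.5), the error standard deviation, the x-values and the function names `ols_fit` and `simulate_estimates` are all hypothetical choices, not part of the notes.

```python
import numpy as np

def ols_fit(x, y):
    """Return (b1, b2): the OLS intercept and slope for y = b1 + b2*x."""
    x_bar, y_bar = x.mean(), y.mean()
    b2 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b1 = y_bar - b2 * x_bar
    return b1, b2

def simulate_estimates(beta1, beta2, sigma, x, n_samples, seed=0):
    """Draw many samples with the same fixed x and collect the OLS estimates."""
    rng = np.random.default_rng(seed)
    estimates = np.empty((n_samples, 2))
    for i in range(n_samples):
        e = rng.normal(0.0, sigma, size=x.size)  # errors satisfy assumptions 2-4
        y = beta1 + beta2 * x + e
        estimates[i] = ols_fit(x, y)
    return estimates

# Assumed (hypothetical) population values and fixed x-values.
x = np.linspace(1.0, 10.0, 20)
estimates = simulate_estimates(beta1=2.0, beta2=0.5, sigma=1.0, x=x, n_samples=5000)
mean_b1, mean_b2 = estimates.mean(axis=0)
print(mean_b1, mean_b2)  # averages should lie close to 2.0 and 0.5
```

Each row of `estimates` is a different pair of estimates from a different sample, while the column averages settle near the assumed true values.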
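Variances and covariance (where $\bar{x}$ is the sample mean of the x-values and the sums run over $t = 1, \dots, T$):
$$\mathrm{Var}(b_2) = \sigma^2 \left[ \frac{1}{\sum (x_t - \bar{x})^2} \right]$$
$$\mathrm{Var}(b_1) = \sigma^2 \left[ \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2} \right]$$
$$\mathrm{Cov}(b_1, b_2) = \sigma^2 \left[ \frac{-\bar{x}}{\sum (x_t - \bar{x})^2} \right]$$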
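As a rough check on how these three formulas are evaluated, they can be coded directly. This is a minimal sketch: the function name `ols_variances`, the five x-values and the error variance of 4.0 are assumptions made for the example, and the error variance is treated as known.

```python
import numpy as np

def ols_variances(x, sigma2):
    """Variances and covariance of the OLS intercept (b1) and slope (b2)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)  # sum of squared deviations of x from its mean
    var_b1 = sigma2 * np.sum(x ** 2) / (T * sxx)
    var_b2 = sigma2 / sxx
    cov_b1_b2 = -sigma2 * x_bar / sxx
    return var_b1, var_b2, cov_b1_b2

# Hypothetical data: mean of x is 3, sum of squared deviations is 10, sum of x^2 is 55.
var_b1, var_b2, cov_b1_b2 = ols_variances([1.0, 2.0, 3.0, 4.0, 5.0], sigma2=4.0)
print(var_b1, var_b2, cov_b1_b2)  # approximately 4.4, 0.4 and -1.2
```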
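Factors that determine the size of the variances/covariances:
1) The uncertainty about y, i.e. $\sigma^2$: the larger the error variance, the larger the variances and the covariance.
2) The more spread out the x-values, the larger the denominator $\sum (x_t - \bar{x})^2$ and the more precise the estimates.
3) The larger the sample size T, the smaller the variances/covariance.
4) The variance of $b_1$ is larger if the sum of squares of the x's, $\sum x_t^2$, is larger (i.e. the further the x-values lie from zero).
5) $\mathrm{Cov}(b_1, b_2)$ has the opposite sign to $\bar{x}$, the sample mean of the x-values. If $\bar{x} = 0$ the covariance is zero, so $b_1$ and $b_2$ are uncorrelated.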
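Points 3) and 4) can be seen together by splitting the variance of the intercept estimator into two parts. The following is a short derivation sketch, writing $\bar{x}$ for the sample mean of the x-values and using the identity $\sum x_t^2 = \sum (x_t - \bar{x})^2 + T\bar{x}^2$:

```latex
\begin{aligned}
\mathrm{Var}(b_1)
  &= \sigma^2 \, \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2} \\
  &= \sigma^2 \, \frac{\sum (x_t - \bar{x})^2 + T\bar{x}^2}{T \sum (x_t - \bar{x})^2} \\
  &= \sigma^2 \left[ \frac{1}{T} + \frac{\bar{x}^2}{\sum (x_t - \bar{x})^2} \right]
\end{aligned}
```

The first term shrinks with the sample size, and the second term grows when the x-values are centred far from zero relative to their spread.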
The Gauss-Markov Theorem
Under the first 5 assumptions of the simple linear regression model, the Ordinary Least Squares estimators $b_1$ and $b_2$ have the smallest variance of all linear and unbiased estimators of $\beta_1$ and $\beta_2$. In other words, they are the Best Linear Unbiased Estimators (BLUE) of $\beta_1$ and $\beta_2$.
Notes:
1) The Gauss-Markov theorem does not require normality (assumption 6).
2) The Gauss-Markov theorem is not based upon the least squares principle, but only on the formulas for $b_1$ and $b_2$, however derived.
3) It is possible that non-linear and/or biased estimators can be found with smaller variance.
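To make the theorem concrete, the OLS slope can be compared with one alternative linear unbiased estimator. The sketch below uses an "endpoint" estimator, the slope of the line through the first and last observations; this competitor, the function names and the x-values are assumptions chosen for illustration only. Any linear estimator of the slope can be written as a weighted sum of the y-values, and under assumptions 3 and 4 its variance is the error variance times the sum of squared weights.

```python
import numpy as np

def ols_slope_weights(x):
    """Weights w_t such that the OLS slope equals sum(w_t * y_t)."""
    d = x - x.mean()
    return d / np.sum(d ** 2)

def endpoint_slope_weights(x):
    """Weights for the hypothetical estimator (y_T - y_1) / (x_T - x_1)."""
    w = np.zeros_like(x)
    w[0] = -1.0 / (x[-1] - x[0])
    w[-1] = 1.0 / (x[-1] - x[0])
    return w

def is_unbiased_for_slope(w, x):
    """A linear estimator of the slope is unbiased if sum(w)=0 and sum(w*x)=1."""
    return bool(np.isclose(w.sum(), 0.0) and np.isclose(np.dot(w, x), 1.0))

def variance_factor(w):
    """Var(sum(w_t * y_t)) divided by the error variance."""
    return float(np.sum(w ** 2))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical x-values
w_ols = ols_slope_weights(x)
w_end = endpoint_slope_weights(x)
print(is_unbiased_for_slope(w_ols, x), is_unbiased_for_slope(w_end, x))
print(variance_factor(w_ols), variance_factor(w_end))  # about 0.1 versus 0.125
```

Both estimators are linear and unbiased, but the endpoint estimator discards the interior observations and so has the larger variance, as the theorem predicts.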
The Probability Distribution of the OLS Estimators
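It can be shown that both estimators are normally distributed, i.e.:
$$b_1 \sim N\left( \beta_1, \; \sigma^2 \left[ \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2} \right] \right)$$
$$b_2 \sim N\left( \beta_2, \; \sigma^2 \left[ \frac{1}{\sum (x_t - \bar{x})^2} \right] \right)$$
where $\bar{x}$ is the sample mean of the x-values.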
Justification:
1. If assumption 6 holds: since $b_1$ and $b_2$ are linear combinations of the y-values, which are normally distributed, it follows that $b_1$ and $b_2$ are themselves normally distributed [Rule: linear combinations of normally distributed variables are themselves normally distributed].
2. If assumption 6 does not hold: apply the Central Limit Theorem.
Central Limit Theorem: If the first 5 Gauss-Markov assumptions hold, and the sample size T is sufficiently large (as a rule of thumb, > 50 observations), then the OLS estimators have a distribution that approximates the normal (with greater accuracy the larger the value of the sample size T).
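The large-sample argument can be explored with a simulation in which the errors are deliberately non-normal. In the sketch below the errors are exponential draws shifted to have mean 0 and variance 1; the distribution, the sample size of 100, the parameter values and the function name are all assumptions for illustration. If the normal approximation is good, about 95% of the standardized slope estimates should fall within 1.96 of zero.

```python
import numpy as np

def simulate_slope_z_scores(x, beta1, beta2, n_samples, seed=0):
    """Standardized OLS slopes when the errors are skewed rather than normal."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    sxx = np.sum(d ** 2)
    # Exponential(1) minus 1 has mean 0 and variance 1 but is clearly non-normal.
    errors = rng.exponential(1.0, size=(n_samples, x.size)) - 1.0
    y = beta1 + beta2 * x + errors   # one simulated sample per row
    b2 = y @ d / sxx                 # OLS slope for every row (sum of d is zero)
    # Divide by the theoretical standard deviation of b2 (error variance is 1).
    return (b2 - beta2) / np.sqrt(1.0 / sxx)

x = np.linspace(0.0, 10.0, 100)      # hypothetical fixed x-values, T = 100
z = simulate_slope_z_scores(x, beta1=1.0, beta2=2.0, n_samples=5000)
share_within = np.mean(np.abs(z) < 1.96)
print(share_within)                  # should be near 0.95 if the approximation holds
```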
Consistency:
Another desired property of estimators is consistency. It is used when it cannot be proven that an estimator is BLUE. Consistency requires that the estimators $b_1$ and $b_2$ collapse to their true values $\beta_1$ and $\beta_2$ as the sample size goes to infinity. For unbiased estimators, it is sufficient that the variances go to zero as T goes to infinity.
For the OLS estimators: it can be shown that as T goes to infinity, $\sum (x_t - \bar{x})^2$ also goes to infinity (provided the x-values keep varying) and both variances go to zero, i.e. the estimators are consistent. Here $\bar{x}$ is the sample mean of the x-values.
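The shrinking variances can be tabulated for growing sample sizes. This sketch assumes, purely for illustration, that the x-values cycle through 1 to 5 as the sample grows and that the error variance is 1; the function name is hypothetical.

```python
import numpy as np

def ols_variances(x, sigma2):
    """Return (Var(b1), Var(b2)) for fixed x-values and known error variance."""
    x = np.asarray(x, dtype=float)
    T = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    return sigma2 * np.sum(x ** 2) / (T * sxx), sigma2 / sxx

results = {}
for T in (10, 100, 1000):
    # Hypothetical design: the pattern 1, 2, 3, 4, 5 repeated until T observations.
    x = np.tile([1.0, 2.0, 3.0, 4.0, 5.0], T // 5)
    results[T] = ols_variances(x, sigma2=1.0)
    print(T, results[T])
```

With this design the sum of squared deviations equals 2T, so the variance of the slope estimator is 1/(2T) and the variance of the intercept estimator is 5.5/T; both fall by a factor of ten each time T increases tenfold.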
Estimating the variance of the Error Term
To calculate the variances of the estimators, we need the variance of the error term ($\sigma^2$). But this variance is unknown since the errors are unobserved. The best we can do is to estimate it using the residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$. Thus the estimated variance is:
$$\hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2}$$
It can be shown that $\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$. (Note that T-2 is the degrees of freedom: T observations less the 2 estimated parameters.)
Substituting $\hat{\sigma}^2$ for $\sigma^2$ in the variance formulas gives the estimated variances of $b_1$ and $b_2$.
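The calculation can be sketched in a few lines: fit the line, form the residuals, divide their sum of squares by T-2, and plug the result into the variance formulas. The data and the function name below are made up for the example.

```python
import numpy as np

def estimate_variances(x, y):
    """Return the estimated error variance and estimated Var(b1), Var(b2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    T = x.size
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)
    b2 = np.sum((x - x_bar) * (y - y.mean())) / sxx   # OLS slope
    b1 = y.mean() - b2 * x_bar                        # OLS intercept
    residuals = y - (b1 + b2 * x)
    sigma2_hat = np.sum(residuals ** 2) / (T - 2)     # T-2 degrees of freedom
    var_b1_hat = sigma2_hat * np.sum(x ** 2) / (T * sxx)
    var_b2_hat = sigma2_hat / sxx
    return sigma2_hat, var_b1_hat, var_b2_hat

# Hypothetical sample of T = 5 observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
sigma2_hat, var_b1_hat, var_b2_hat = estimate_variances(x, y)
print(sigma2_hat, var_b1_hat, var_b2_hat)  # approximately 0.8, 0.88 and 0.08
```

Taking square roots of the two estimated variances gives the standard errors of the intercept and slope estimates.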
The Least Squares Predictor:
Given the value of x one period after the end of the sample, i.e. $x_{T+1}$, we can predict the corresponding value of y, i.e. $y_{T+1}$:
$$\hat{y}_{T+1} = b_1 + b_2 x_{T+1}$$
It is interesting to evaluate the forecast error that we will be making:
$$f = \hat{y}_{T+1} - y_{T+1} = (b_1 - \beta_1) + (b_2 - \beta_2) x_{T+1} - e_{T+1}$$
It easily follows (from unbiasedness and assumption 2) that
$$E(f) = 0$$
and it can be shown that the variance of the forecast error is:
$$\mathrm{Var}(f) = \sigma^2 \left[ 1 + \frac{1}{T} + \frac{(x_{T+1} - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right]$$
where $\bar{x}$ is the sample mean of the x-values.
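A short numerical sketch shows how the forecast error variance depends on where the new x-value lies. The function name, the x-values and the error variance of 4.0 are assumptions for the example, and the error variance is treated as known; in practice it would be replaced by its estimate.

```python
import numpy as np

def forecast_error_variance(x, x_new, sigma2):
    """Variance of the forecast error for a new observation at x_new."""
    x = np.asarray(x, dtype=float)
    T = x.size
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)
    return sigma2 * (1.0 + 1.0 / T + (x_new - x_bar) ** 2 / sxx)

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical sample x-values (mean 3)
var_at_mean = forecast_error_variance(x, x_new=3.0, sigma2=4.0)
var_outside = forecast_error_variance(x, x_new=6.0, sigma2=4.0)
print(var_at_mean, var_outside)  # approximately 4.8 and 8.4
```

The variance is smallest when the new x-value equals the sample mean and grows as the forecast moves away from the centre of the data.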