Chapter 4: Properties of OLS Estimators

Assumptions of the Simple Regression Model (in terms of the error term):
  
            1. $y_t = \beta_1 + \beta_2 x_t + e_t$
            2. $E(e_t) = 0$
            3. $\mathrm{Var}(e_t) = \sigma^2$
            4. $\mathrm{Cov}(e_i, e_j) = 0$ for $i \neq j$
            5. The x-values are non-random and take at least 2 different values
            6. (Optional) $e_t \sim N(0, \sigma^2)$

Notes:  a) The population parameters $\beta_1$ and $\beta_2$ are unknown constants.
            b) The formulas that produce the estimates $b_1$ and $b_2$ are called the estimators of $\beta_1$ and $\beta_2$.
            c) Estimators (formulas) are random variables because they give different results with different samples.
            d) Since estimators are random variables, their mean, variance and probability distribution can be calculated.
            e) The properties of the OLS estimators can be compared with those of alternative estimators.

   Properties of OLS Estimators:
        It can be proven that the means of the estimators $b_1$ and $b_2$ are equal to the true population parameters, i.e. $E(b_1) = \beta_1$ and $E(b_2) = \beta_2$. In other words, the OLS estimators are unbiased.
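
        The unbiasedness claim can be illustrated (not proven) by simulation. The sketch below is a minimal example under stated assumptions: the "true" parameter values, the error standard deviation, the x-values and the helper name ols are all made up for the illustration. It draws many samples from the model, applies the estimator formulas to each, and averages the results.

```python
import random

def ols(x, y):
    """Return the least squares estimates (b1, b2) for y = beta1 + beta2*x + e."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    sxy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Hypothetical "true" values, chosen only for this illustration.
BETA1, BETA2, SIGMA = 1.0, 2.0, 1.5
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # fixed (non-random) x-values

rng = random.Random(0)  # seeded so the run is reproducible
estimates = []
for _ in range(5000):
    # Each replication draws fresh errors, i.e. a new sample of y-values.
    y = [BETA1 + BETA2 * xt + rng.gauss(0.0, SIGMA) for xt in x]
    estimates.append(ols(x, y))

mean_b1 = sum(b1 for b1, _ in estimates) / len(estimates)
mean_b2 = sum(b2 for _, b2 in estimates) / len(estimates)

print("first three (b1, b2) pairs:", estimates[:3])
print("average b1:", round(mean_b1, 3), " average b2:", round(mean_b2, 3))
```

        Individual samples give different estimates (the estimators are random variables), while the averages over many samples should lie close to the assumed true values.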

          Variances and covariance:
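                        $\mathrm{Var}(b_2) = \sigma^2 \left[ \frac{1}{\sum (x_t - \bar{x})^2} \right]$
                        $\mathrm{Var}(b_1) = \sigma^2 \left[ \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2} \right]$
                        $\mathrm{Cov}(b_1, b_2) = \sigma^2 \left[ \frac{-\bar{x}}{\sum (x_t - \bar{x})^2} \right]$
                        where $\bar{x}$ is the sample mean of the x-values and T is the sample size.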
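
          As a worked example, the three formulas can be evaluated directly for a given set of x-values. This is only a sketch: the function name ols_variances, the x-values and the error variance of 4 are assumptions made for the illustration (in practice the error variance is unknown).

```python
def ols_variances(x, sigma2):
    """Return (Var(b1), Var(b2), Cov(b1, b2)) for fixed x-values and error variance sigma2."""
    T = len(x)
    x_bar = sum(x) / T
    sxx = sum((xt - x_bar) ** 2 for xt in x)   # sum of squared deviations of x
    sum_x2 = sum(xt ** 2 for xt in x)          # sum of squares of x
    var_b2 = sigma2 / sxx
    var_b1 = sigma2 * sum_x2 / (T * sxx)
    cov_b1_b2 = sigma2 * (-x_bar) / sxx
    return var_b1, var_b2, cov_b1_b2

# Made-up inputs: x = 1..5 gives x_bar = 3, sum of squared deviations = 10, sum of squares = 55.
var_b1, var_b2, cov_b1_b2 = ols_variances([1.0, 2.0, 3.0, 4.0, 5.0], sigma2=4.0)
print("Var(b1) =", var_b1)
print("Var(b2) =", var_b2)
print("Cov(b1, b2) =", cov_b1_b2)
```

          By hand: Var(b2) = 4/10 = 0.4, Var(b1) = 4 x 55/(5 x 10) = 4.4 and Cov(b1, b2) = 4 x (-3)/10 = -1.2.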

    Factors that determine the size of the variances/covariances:
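     1) The uncertainty about y, i.e. $\sigma^2$: the larger the error variance, the larger the variances and the covariance.
     2) The more spread out the x-values, the larger the denominator $\sum (x_t - \bar{x})^2$ and the smaller the variances, i.e. the more precise the estimates.
     3) The larger the sample size T, the smaller the variances and the covariance.
     4) The variance of $b_1$ is larger if the sum of squares of the x's, $\sum x_t^2$, is larger.
     5) $\mathrm{Cov}(b_1, b_2)$ has the opposite sign to $\bar{x}$, the sample mean of the x-values. If $\bar{x} = 0$ the covariance is zero, so $b_1$ and $b_2$ are uncorrelated.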
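
     These factors can be checked numerically. The sketch below uses made-up x-values and an assumed error variance of 4; the helper names are hypothetical. It compares the slope variance for narrowly and widely spread x-values and for a doubled sample, and evaluates the covariance for x-values with positive, negative and zero mean.

```python
def var_b2(x, sigma2):
    """Variance of the slope estimator: sigma2 / sum of squared deviations of x."""
    x_bar = sum(x) / len(x)
    return sigma2 / sum((xt - x_bar) ** 2 for xt in x)

def cov_b1_b2(x, sigma2):
    """Covariance of intercept and slope estimators: sigma2 * (-x_bar) / sum of squared deviations."""
    x_bar = sum(x) / len(x)
    return sigma2 * (-x_bar) / sum((xt - x_bar) ** 2 for xt in x)

sigma2 = 4.0                              # assumed error variance
narrow = [2.0, 2.5, 3.0, 3.5, 4.0]        # x-values bunched around 3
wide = [1.0, 2.0, 3.0, 4.0, 5.0]          # same mean, more spread out
wide_twice = wide + wide                  # same spread pattern, twice the sample size

print("Var(b2), narrow x:", var_b2(narrow, sigma2))
print("Var(b2), wide x:  ", var_b2(wide, sigma2))
print("Var(b2), T doubled:", var_b2(wide_twice, sigma2))

positive_mean = [1.0, 2.0, 3.0, 4.0, 5.0]
negative_mean = [-5.0, -4.0, -3.0, -2.0, -1.0]
zero_mean = [-2.0, -1.0, 0.0, 1.0, 2.0]
print("Cov, mean of x > 0:", cov_b1_b2(positive_mean, sigma2))
print("Cov, mean of x < 0:", cov_b1_b2(negative_mean, sigma2))
print("Cov, mean of x = 0:", cov_b1_b2(zero_mean, sigma2))
```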
 

   The Gauss-Markov Theorem
        Under the first 5 assumptions of the simple linear regression model, the Ordinary Least Squares estimators $b_1$ and $b_2$ have the smallest variance of all linear and unbiased estimators of $\beta_1$ and $\beta_2$. In other words, they are the Best Linear Unbiased Estimators (BLUE) of $\beta_1$ and $\beta_2$.

        Note:  1) The Gauss-Markov theorem does not require normality!
                  2) The Gauss-Markov theorem does not depend on the least squares principle itself; it applies to the formulas for $b_1$ and $b_2$, however they are derived.
                  3) It is possible that non-linear and/or biased estimators can be found with smaller variance.
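
        To make the theorem concrete, the OLS slope estimator can be compared with one alternative linear unbiased estimator. The alternative used here, the slope of the line through the first and last observations, is chosen only for illustration and is not from the notes; the function names and x-values are likewise assumptions. Any linear estimator can be written as a weighted sum of the y-values; it is unbiased for the slope when the weights sum to zero and the weighted sum of the x-values is one, and under assumptions 3 and 4 its variance is the error variance times the sum of squared weights.

```python
def ols_slope_weights(x):
    """Weights w_t such that the OLS slope estimate equals sum(w_t * y_t)."""
    x_bar = sum(x) / len(x)
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    return [(xt - x_bar) / sxx for xt in x]

def endpoint_slope_weights(x):
    """Weights of the alternative estimator (y_T - y_1) / (x_T - x_1)."""
    w = [0.0] * len(x)
    span = x[-1] - x[0]
    w[0] = -1.0 / span
    w[-1] = 1.0 / span
    return w

def is_unbiased_for_slope(w, x, tol=1e-9):
    """E(sum w_t y_t) = beta1*sum(w_t) + beta2*sum(w_t x_t), so we need sum(w)=0 and sum(w*x)=1."""
    return abs(sum(w)) < tol and abs(sum(wt * xt for wt, xt in zip(w, x)) - 1.0) < tol

def variance_of_linear_estimator(w, sigma2):
    """Var(sum w_t y_t) = sigma2 * sum(w_t^2) for uncorrelated, equal-variance errors."""
    return sigma2 * sum(wt ** 2 for wt in w)

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up x-values
sigma2 = 1.0                    # assumed error variance

w_ols = ols_slope_weights(x)
w_alt = endpoint_slope_weights(x)
var_ols = variance_of_linear_estimator(w_ols, sigma2)
var_alt = variance_of_linear_estimator(w_alt, sigma2)

print("OLS unbiased:", is_unbiased_for_slope(w_ols, x), " variance:", var_ols)
print("Endpoint unbiased:", is_unbiased_for_slope(w_alt, x), " variance:", var_alt)
```

        Both estimators are linear and unbiased, but for these x-values the OLS variance (1/10 of the error variance) is smaller than that of the endpoint estimator (1/8 of the error variance), as the theorem predicts.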

    The Probability Distribution of the OLS Estimators
      It can be shown that both estimators are normally distributed, i.e.:

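            $b_1 \sim N\left( \beta_1,\ \sigma^2 \left[ \frac{\sum x_t^2}{T \sum (x_t - \bar{x})^2} \right] \right)$,

            $b_2 \sim N\left( \beta_2,\ \sigma^2 \left[ \frac{1}{\sum (x_t - \bar{x})^2} \right] \right)$.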

        Justification:
    1. If assumption 6 holds:  Since b1 and b2  are linear combinations of the y-values, which are normally distributed, it follows that b1 and b2 are themselves normally distributed [Rule: Linear combinations of normally distributed variables are themselves normally distributed].

    2. If assumption 6 does not hold: Apply the Central Limit Theorem.
        Central Limit Theorem: If the first 5 Gauss-Markov assumptions hold and the sample size T is sufficiently large (as a rule of thumb, T > 50 observations), then the OLS estimators have a distribution that approximates the normal, with greater accuracy the larger the sample size T.
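
    The approximation can be illustrated by simulation. The sketch below is a minimal example under stated assumptions: the errors are drawn from a centred exponential distribution (mean 0, variance 1, strongly skewed, so assumption 6 fails), T = 60, and the parameter values and x-values are made up. If the slope estimator is approximately normal, about 95% of its standardized values should fall within plus or minus 1.96.

```python
import math
import random

rng = random.Random(1)          # seeded so the run is reproducible
T = 60
x = [float(t) for t in range(1, T + 1)]   # hypothetical fixed x-values
x_bar = sum(x) / T
sxx = sum((xt - x_bar) ** 2 for xt in x)

BETA1, BETA2 = 1.0, 2.0         # assumed "true" parameters
sd_b2 = math.sqrt(1.0 / sxx)    # sqrt(Var(b2)) with error variance 1

reps = 4000
inside = 0
for _ in range(reps):
    # Centred exponential errors: mean 0, variance 1, not normally distributed.
    y = [BETA1 + BETA2 * xt + (rng.expovariate(1.0) - 1.0) for xt in x]
    y_bar = sum(y) / T
    b2 = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / sxx
    z = (b2 - BETA2) / sd_b2    # standardized slope estimate
    if abs(z) <= 1.96:
        inside += 1

coverage = inside / reps
print("share of standardized b2 values within +/-1.96:", coverage)
```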

    Consistency:
   Another desirable property of estimators is consistency. It is used when it cannot be proven that an estimator is BLUE. Consistency requires that the estimators $b_1$ and $b_2$ collapse to the true values $\beta_1$ and $\beta_2$ as the sample size goes to infinity. For unbiased estimators, it is sufficient that the variances go to zero as T goes to infinity.

    For the OLS estimators: It can be shown that as T goes to infinity, $\sum (x_t - \bar{x})^2$ also goes to infinity and both variances go to zero, i.e. the estimators are consistent.
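
    A short numerical sketch of this argument follows. It assumes a hypothetical design in which the t-th x-value equals t, so that the x-values keep spreading out as T grows, and an assumed error variance of 4; the function name is made up for the illustration.

```python
def ols_variances(x, sigma2):
    """Return (Var(b1), Var(b2)) for fixed x-values and error variance sigma2."""
    T = len(x)
    x_bar = sum(x) / T
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    sum_x2 = sum(xt ** 2 for xt in x)
    return sigma2 * sum_x2 / (T * sxx), sigma2 / sxx

sigma2 = 4.0   # assumed error variance
results = []
for T in (10, 100, 1000, 10000):
    x = [float(t) for t in range(1, T + 1)]   # hypothetical design: x_t = t
    var_b1, var_b2 = ols_variances(x, sigma2)
    results.append((T, var_b1, var_b2))
    print("T =", T, " Var(b1) =", var_b1, " Var(b2) =", var_b2)
```

    Both variances shrink toward zero as T increases. With x-values that stop varying as the sample grows, the sum of squared deviations would not go to infinity and the argument would not go through.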

     Estimating the variance of the Error Term
   To calculate the variances of the estimators, we need the variance of the error term ($\sigma^2$). But this variance is unknown since the errors are unknown. The best we can do is to estimate it using the residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$. Thus the estimated variance is:

             $\hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2}$

It can be shown that $\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$. (Note that T - 2 is the number of degrees of freedom: T observations minus the 2 estimated parameters.)
Substituting $\hat{\sigma}^2$ for $\sigma^2$ in the variance formulas gives the estimated variances of the estimators.
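
As a worked example, the sketch below computes the residuals, the estimated error variance and the estimated variances of the estimators for a small made-up data set. The data and the function names are assumptions made for the illustration.

```python
def ols(x, y):
    """Return the least squares estimates (b1, b2)."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    sxy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    b2 = sxy / sxx
    return y_bar - b2 * x_bar, b2

def estimate_error_variance(x, y):
    """Sum of squared residuals divided by the degrees of freedom T - 2."""
    b1, b2 = ols(x, y)
    residuals = [yt - (b1 + b2 * xt) for xt, yt in zip(x, y)]
    return sum(r ** 2 for r in residuals) / (len(x) - 2)

# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 5.8, 8.3, 9.6]

b1, b2 = ols(x, y)
sigma2_hat = estimate_error_variance(x, y)

# Plug the estimate into the variance formulas.
T = len(x)
x_bar = sum(x) / T
sxx = sum((xt - x_bar) ** 2 for xt in x)
est_var_b2 = sigma2_hat / sxx
est_var_b1 = sigma2_hat * sum(xt ** 2 for xt in x) / (T * sxx)

print("b1 =", b1, " b2 =", b2)
print("estimated error variance =", sigma2_hat)
print("estimated Var(b1) =", est_var_b1, " estimated Var(b2) =", est_var_b2)
```

By hand: b2 = 19/10 = 1.9 and b1 = 6 - 1.9 x 3 = 0.3; the residuals are 0, 0, -0.2, 0.4, -0.2, so the estimated error variance is 0.24/3 = 0.08.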

The Least Squares Predictor:

Given the value of x one period after the end of the sample, i.e. $x_{T+1}$, we can predict the corresponding value of y, i.e. $y_{T+1}$:

$\hat{y}_{T+1} = b_1 + b_2 x_{T+1}$

It is interesting to evaluate the forecast error that we will be making:
forecast error: $f = \hat{y}_{T+1} - y_{T+1} = (b_1 - \beta_1) + (b_2 - \beta_2) x_{T+1} - e_{T+1}$
        It easily follows (from unbiasedness and assumption 2) that
            $E(f) = 0$
        and it can be shown that the variance of the forecast error is:
            $\mathrm{Var}(f) = \sigma^2 \left[ 1 + \frac{1}{T} + \frac{(x_{T+1} - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right]$
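
The predictor and the forecast error variance can be sketched in a few lines. The data, the function names and the value 0.08 used for the error variance are assumptions made for this illustration; in practice the unknown error variance would be replaced by its estimate.

```python
def ols(x, y):
    """Return the least squares estimates (b1, b2)."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    sxy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    b2 = sxy / sxx
    return y_bar - b2 * x_bar, b2

def forecast(x, y, x_new):
    """Least squares prediction b1 + b2 * x_new."""
    b1, b2 = ols(x, y)
    return b1 + b2 * x_new

def forecast_error_variance(x, x_new, sigma2):
    """sigma2 * [1 + 1/T + (x_new - x_bar)^2 / sum of squared deviations of x]."""
    T = len(x)
    x_bar = sum(x) / T
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    return sigma2 * (1.0 + 1.0 / T + (x_new - x_bar) ** 2 / sxx)

# Made-up sample and an assumed value for the error variance.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 5.8, 8.3, 9.6]
sigma2 = 0.08

y_hat = forecast(x, y, 6.0)
var_f_at_6 = forecast_error_variance(x, 6.0, sigma2)
var_f_at_mean = forecast_error_variance(x, 3.0, sigma2)

print("predicted y at x = 6:", y_hat)
print("Var(f) at x = 6:", var_f_at_6)
print("Var(f) at x = 3 (the sample mean):", var_f_at_mean)
```

The forecast error variance is smallest when the new x-value equals the sample mean and grows as the new x-value moves away from it.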