Abstract

In regression theory, it is stated that the disturbance term follows the normal distribution when the sample size is large. In Professor J. Johnston's words: "In view of the many factors involved, an appeal to the Central Limit Theorem would further suggest a normal distribution for u." This paper includes an elementary proof that the disturbance term follows the normal distribution when n is large.


arXiv [math.PR]

A normal distribution for the disturbance term in regression theory

Lambros Iossif
Athens, Greece
September 2007

Consider the regression equation

$$Y_i = \alpha + \beta X_i + u_i .$$

The assumptions about the disturbance term $u_i$ are summarized as follows:

$$E(u_i) = 0, \qquad E(u_i u_j) = \begin{cases} \sigma^2 & \text{when } i = j, \quad i = 1, 2, 3, \ldots, n, \\ 0 & \text{when } i \neq j, \quad j = 1, 2, 3, \ldots, n. \end{cases}$$

No assumption is made regarding the distribution of the disturbance term $u_i$. We wish to show that for large $n$ the term $u_i$ follows the normal distribution $N(0, \sigma^2)$ with mean 0 and variance $\sigma^2$.

We proceed as follows: let

$$u_i = \sum_{k=1}^{n} u_k - \sum_{\substack{k=1 \\ k \neq i}}^{n} u_k = \frac{n\sigma}{\sqrt{n}} \left[ \frac{\sum_{k=1}^{n} \frac{u_k}{n} - 0}{\sigma/\sqrt{n}} \right] + \frac{\sigma (n-1)}{\sqrt{n-1}} \left[ \frac{\sum_{\substack{k=1 \\ k \neq i}}^{n} \frac{-u_k}{n-1} - 0}{\sigma/\sqrt{n-1}} \right].$$

Let

$$x = \frac{\sum_{k=1}^{n} \frac{u_k}{n} - 0}{\sigma/\sqrt{n}} \qquad \text{and} \qquad y = \frac{\sum_{\substack{k=1 \\ k \neq i}}^{n} \frac{-u_k}{n-1} - 0}{\sigma/\sqrt{n-1}} .$$

$X$ and $Y$ approach the $N(0, 1)$ distribution for large $n$. By the Central Limit Theorem:

$$f_X(x) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) \qquad \text{and} \qquad f_Y(y) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{y^2}{2} \right)$$

for large $n$. The joint density $f_{XY}(x, y)$ is completely determined if we can calculate the correlation $\rho_{XY}$ between $X$ and $Y$. Since $\sigma_X = \sigma_Y = 1$ and $E(X) = E(Y) = 0$, we have:

$$
\begin{aligned}
\rho_{XY} &= \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = E\big[ (X - E(X)) (Y - E(Y)) \big] \\
&= E\left[ \frac{\sum_{k=1}^{n} \frac{u_k}{n}}{\sigma/\sqrt{n}} \cdot \frac{\sum_{\substack{k=1 \\ k \neq i}}^{n} \frac{-u_k}{n-1}}{\sigma/\sqrt{n-1}} \right] \\
&= E\left[ \frac{\sum_{k=1}^{n} \frac{u_k}{n}}{\sigma/\sqrt{n}} \cdot \frac{\sum_{k=1}^{n} \left( \frac{-u_k}{n-1} \right) + \frac{u_i}{n-1}}{\sigma/\sqrt{n-1}} \right] \\
&= \frac{1}{\sigma^2 / \sqrt{n(n-1)}} \, E\left[ -\frac{\left( \sum_{k=1}^{n} u_k \right)^2}{n(n-1)} + \frac{u_i \sum_{k=1}^{n} u_k}{n(n-1)} \right] \\
&= \frac{\sqrt{n(n-1)}}{\sigma^2} \, E\left[ -\frac{\sum_{k=1}^{n} \sum_{l=1}^{n} u_k u_l}{n(n-1)} + \frac{u_i \sum_{k=1}^{n} u_k}{n(n-1)} \right] \\
&= \frac{\sqrt{n(n-1)}}{\sigma^2} \left[ -\frac{\sum_{k=1}^{n} \sum_{\substack{l=1 \\ l \neq k}}^{n} E(u_k u_l)}{n(n-1)} - \frac{\sum_{k=1}^{n} E(u_k^2)}{n(n-1)} + \frac{\sum_{\substack{k=1 \\ k \neq i}}^{n} E(u_i u_k)}{n(n-1)} + \frac{E(u_i^2)}{n(n-1)} \right] \\
&= \frac{\sqrt{n(n-1)}}{\sigma^2} \left[ \frac{-0 - n\sigma^2 + 0 + \sigma^2}{n(n-1)} \right] \\
&= -\frac{(n-1) \sqrt{n(n-1)}}{n(n-1)} = -\frac{\sqrt{n(n-1)}}{n} .
\end{aligned}
$$

Thus the joint density $f_{XY}$ of $X$ and $Y$ is the bivariate normal density function

$$f_{XY}(x, y) = \frac{1}{2\pi \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ x^2 - 2\rho x y + y^2 \right] \right\}$$

with $\sigma_x = 1$, $\sigma_y = 1$, $\mu_x = 0$, $\mu_y = 0$, and $\rho = -\sqrt{n(n-1)}/n$. Since $1 - \rho^2 = 1/n$, this is

$$f_{XY}(x, y) = \frac{\sqrt{n}}{2\pi} \exp\left\{ -\frac{1}{2} \left( n x^2 + 2\sqrt{n(n-1)}\, x y + n y^2 \right) \right\}$$

for large $n$.

In order to find the density of $u_i$ for large $n$ we consider the transformation (see [?, p. 204]):

$$u_i = \frac{n\sigma}{\sqrt{n}}\, x + \frac{\sigma (n-1)}{\sqrt{n-1}}\, y = \sqrt{n}\, \sigma x + \sigma \sqrt{n-1}\, y, \qquad v = y .$$

Let the above transformation be a one-to-one transformation of the $xy$ plane onto the $u_i v$ plane with inverse transformation given by:

$$x = \frac{u_i}{\sigma \sqrt{n}} - \frac{\sqrt{n-1}}{\sqrt{n}}\, v, \qquad y = v .$$

The Jacobian of the above transformation is:

$$J = \begin{vmatrix} \dfrac{\partial x}{\partial u_i} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u_i} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \dfrac{1}{\sigma \sqrt{n}} & -\dfrac{\sqrt{n-1}}{\sqrt{n}} \\[2ex] 0 & 1 \end{vmatrix} = \frac{1}{\sigma \sqrt{n}} .$$

We note in passing that the partial derivatives $\partial x / \partial u_i$, $\partial x / \partial v$, $\partial y / \partial u_i$, and $\partial y / \partial v$ are all continuous functions of $u_i$ and $v$, as they are constant.

We are interested in the absolute value of the Jacobian above:

$$|J| = \frac{1}{\sigma \sqrt{n}} .$$

Then the joint density of $u_i$ and $v$ is

$$
\begin{aligned}
f(u_i, v) &= \frac{\sqrt{n}}{2\pi \sigma \sqrt{n}} \exp\left\{ -\frac{1}{2} \left[ n \left( \frac{u_i}{\sigma \sqrt{n}} - \frac{\sqrt{n-1}}{\sqrt{n}}\, v \right)^2 + 2\sqrt{n(n-1)} \left( \frac{u_i}{\sigma \sqrt{n}} - \frac{\sqrt{n-1}}{\sqrt{n}}\, v \right) v + n v^2 \right] \right\} \\
&= \frac{1}{2\pi\sigma} \exp\left[ -\frac{1}{2} \left( \frac{u_i^2}{\sigma^2} + (n-1) v^2 - \frac{2 u_i v \sqrt{n-1}}{\sigma} + \frac{2 u_i v \sqrt{n-1}}{\sigma} - 2(n-1) v^2 + n v^2 \right) \right] \\
&= \frac{1}{\sqrt{2\pi}\, \sigma} \cdot \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left[ \frac{u_i^2}{\sigma^2} + n v^2 + v^2 - 2 n v^2 + n v^2 \right] \right\} \\
&= \frac{1}{\sqrt{2\pi}\, \sigma} \cdot \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{u_i^2}{2\sigma^2} \right) \exp\left( -\frac{v^2}{2} \right).
\end{aligned}
$$

Thus

$$f(u_i) = \frac{1}{\sqrt{2\pi}\, \sigma} \exp\left( -\frac{u_i^2}{2\sigma^2} \right) \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{v^2}{2} \right) dv$$

when $n$ is large. From our study of the standard normal random variable $v$ with density

$$f(v) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{v^2}{2} \right),$$

the integral above is unity, i.e.:

$$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{v^2}{2} \right) dv = 1 .$$
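The algebraic step in which the exponent of the joint density collapses to $u_i^2/\sigma^2 + v^2$ can be spot-checked numerically. The following Python sketch is an illustration only and not part of the original argument; the function names `exponent_in_xy` and `exponent_in_uv`, as well as the sample values of $n$, $\sigma$, $u_i$ and $v$, are hypothetical choices made for this check.

```python
import math


def exponent_in_xy(u_i, v, n, sigma):
    """Quadratic form n*x^2 + 2*sqrt(n(n-1))*x*y + n*y^2 after substituting
    the inverse transformation x = u_i/(sigma*sqrt(n)) - sqrt((n-1)/n)*v, y = v."""
    x = u_i / (sigma * math.sqrt(n)) - math.sqrt((n - 1) / n) * v
    y = v
    return n * x**2 + 2.0 * math.sqrt(n * (n - 1)) * x * y + n * y**2


def exponent_in_uv(u_i, v, sigma):
    """Simplified form of the same exponent: u_i^2 / sigma^2 + v^2."""
    return (u_i / sigma) ** 2 + v**2


# Arbitrary (assumed) test points; any real values should satisfy the identity.
SIGMA = 2.0
for n in (2, 10, 1000):
    for u_i, v in ((0.3, -1.2), (2.0, 0.5), (-1.5, 1.5)):
        lhs = exponent_in_xy(u_i, v, n, SIGMA)
        rhs = exponent_in_uv(u_i, v, SIGMA)
        assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9)
```

The check only confirms the algebra of the change of variables; it says nothing about how well the bivariate normal density approximates the joint distribution of $X$ and $Y$ for a given $n$.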
Thus the density of $u_i$ alone is given by

$$f(u_i) = \frac{1}{\sqrt{2\pi}\, \sigma} \exp\left( -\frac{u_i^2}{2\sigma^2} \right),$$

that is, $u_i$ follows the normal distribution $N(0, \sigma^2)$ with mean 0 and variance $\sigma^2$. This is what we set out to prove when $n$ is large. Q.E.D.

Note

We observe that $u_i$ and $v$ are independent, as their joint density factors into the product of a function of $u_i$ alone and a function of $v$ alone. This is expected in view of the definition

$$v = y = \frac{\sum_{\substack{k=1 \\ k \neq i}}^{n} \frac{-u_k}{n-1}}{\sigma/\sqrt{n-1}},$$

in which the sum runs over $k \neq i$ only.
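The correlation $\rho_{XY} = -\sqrt{n(n-1)}/n$ and the absence of correlation between $u_i$ and $v$ can be checked by simulation. The sketch below is a hedged illustration, not part of the paper: it assumes independent disturbances drawn from a uniform distribution scaled to variance $\sigma^2$ (the paper itself makes no distributional assumption), and the names `sample_correlation` and `simulate`, the seed, and the values of $n$ and the number of trials are all hypothetical.

```python
import math
import random


def sample_correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    m = len(a)
    mean_a = sum(a) / m
    mean_b = sum(b) / m
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=5, i=0, sigma=1.0, trials=20000, seed=0):
    """Draw `trials` samples of (u_1, ..., u_n) and return the sample
    correlations corr(X, Y) and corr(u_i, v), with v = y."""
    rng = random.Random(seed)
    # Assumption: uniform on [-a, a] has variance a^2 / 3, so a = sigma * sqrt(3).
    half_width = sigma * math.sqrt(3.0)
    xs, ys, uis = [], [], []
    for _ in range(trials):
        u = [rng.uniform(-half_width, half_width) for _ in range(n)]
        total = sum(u)
        xs.append(total / (sigma * math.sqrt(n)))                 # x
        ys.append(-(total - u[i]) / (sigma * math.sqrt(n - 1)))   # y = v
        uis.append(u[i])
    return sample_correlation(xs, ys), sample_correlation(uis, ys)


N = 5
rho_xy, rho_uv = simulate(n=N)
rho_theory = -math.sqrt(N * (N - 1)) / N
print("sample corr(X, Y):", rho_xy, "theory:", rho_theory)
print("sample corr(u_i, v):", rho_uv)
```

Under these assumptions the first sample correlation should lie close to the theoretical value and the second close to zero, up to Monte Carlo error. The simulation checks second moments only and does not test the shape of the distribution of $u_i$.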