ECOM30001/ECOM90001 Basic Econometrics
Semester 1, 2021, Week 1, Lecture 2
Basic Linear Model: Assumptions
Reading: Hill et al., Chapter 5

Outline
1. The Linear Regression Model
   - Introduction
   - The Error Term
   - Assumptions about the Error Term
2. The Least Squares Principle
3. Properties of the OLS Residuals

Lecture Objectives: The Basic Linear Model
1. What is the interpretation of the error term?
2. Why is the assumption E[ε|X] = 0 so important?
3. What are the additional assumptions?
4. What is the Least Squares (OLS) Principle?
5. What are the least squares (OLS) residuals, and what are their properties?
6. What is the difference between an estimator and an estimate?

Introduction

- (Economic) theory describes the average behaviour of many individuals: it identifies relationships between economic variables and makes predictions about the direction of outcomes when a variable is altered.
- We are ultimately interested in whether one variable (x) 'causes' another variable (y), all else equal.
- Problem: how can we uncover the magnitude of these 'causal' relationships without experimental data?
- Technique: use the regression model with appropriate assumptions.

Introduction (continued)

The regression model relates the dependent variable (y) to a number of explanatory variables (X = X_1, X_2, ..., X_K) through the linear (in parameters) equation:

    y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i

where β_0, β_1, ..., β_K are unknown parameters.

- Econometrics recognizes that actual behaviour depends upon both average behaviour and a random component.
- Econometrics uses a sample of data on (y, X) to learn about the magnitude of the unknown parameters β_0, β_1, ..., β_K.
- The random error ε can never be known, because the population parameters β_0, β_1, ..., β_K governing the relationship between y and X are unknown.

The Error Term ε

The error term ε represents:
- any factors other than X that affect y and are not included in the model;
- approximation error arising from the assumed linear functional form;
- unpredictable random behaviour: knowledge of all variables that affect y might not be sufficient to perfectly predict y.

The Economic Model

    y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i

1. The intercept β_0 represents the average value of y when all the X's are zero.
2. The slope parameter β_j represents the expected change in y associated with a unit change in X_j, all else constant:

    β_j = ΔE[y|X] / ΔX_j   (all other X's held constant), for j = 1, 2, ..., K

Assumptions about the Error Term

The econometric model:

    y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i

- Ultimately we want to answer questions such as: what is the effect of X_2 on y, holding all other factors (X_1, X_3, ..., X_K, ε) fixed?
- But we have grouped all of these other factors into the error term ε.
- To answer such questions we need to make assumptions about the error term and about the relationship between ε and the X's.

Assumption 1: E[ε|X] = 0

- The critical assumption is that the error term has an expected value of zero, given any value of the X's.
- This means that knowledge of the X's does not change the expected value of the random error ε: the average of the omitted 'other factors' does not depend upon the X's.
- This does not mean that, in any sample of data, the errors 'average out': the (conditional) expectation is taken over all possible realizations of the error.
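A short simulation makes the point concrete. The sketch below is illustrative and not from the lecture: the variable names and parameter values are assumptions, and the simple-regression slope is computed as the sample covariance of x and y divided by the sample variance of x. When E[ε|X] = 0 the estimate recovers the true slope; when an omitted factor that moves with x is folded into the error, it does not.

```python
# Minimal simulation (illustrative values, not the lecture's data) of why
# E[eps|X] = 0 matters for recovering the true slope parameter.
import numpy as np

rng = np.random.default_rng(42)
N = 100_000
beta0, beta1 = 2.0, 0.5                       # assumed "true" population parameters

# Case 1: E[eps|X] = 0, the omitted factors are unrelated to x
x = rng.normal(size=N)
eps = rng.normal(size=N)
y = beta0 + beta1 * x + eps
b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # simple-regression OLS slope
print(f"E[eps|X] = 0:      slope estimate = {b1:.3f}")      # close to 0.5

# Case 2: E[eps|X] != 0, an omitted factor that moves with x sits in the error
eps_bad = 0.8 * x + rng.normal(size=N)
y_bad = beta0 + beta1 * x + eps_bad
b1_bad = np.cov(x, y_bad)[0, 1] / np.var(x, ddof=1)
print(f"E[eps|X] non-zero: slope estimate = {b1_bad:.3f}")  # near 1.3, not 0.5
```

The second estimate is still a valid summary of the conditional mean of y given x; what it loses is the causal interpretation. That is exactly the distinction drawn on the next slide.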
Assumption 1: E[ε|X] = 0 (continued)

Given:

    y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i

the assumption E[ε_i|X_i] = 0 implies:

    E[y_i|X_i] = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki}

so the 'other factors' do not systematically affect 'average' behaviour:

    y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i = E[y_i|X_i] + ε_i

The econometric model thus decomposes y into:
1. a systematic component of y 'explained' by X;
2. a random component of y not explained by X.

Why is E[ε|X] = 0 Important?

The econometric model:

    y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i

with:

    E[y_i|X_i] = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki}

1. Whether or not E[ε|X] = 0 holds, we can always calculate 'estimates' of the (statistical) conditional mean:

    Ê[y_i|X_i] = β̂_0 + β̂_1 X_{1i} + β̂_2 X_{2i} + ... + β̂_K X_{Ki}

   This can be used to predict or forecast our outcome variable.
2. Ultimately we want to answer questions such as: what is the causal effect of X_2 on y, holding all other factors (X_1, X_3, ..., X_K, ε) fixed? Only when E[ε|X] = 0 can we interpret the estimated parameters as causal or behavioural parameters.

Assumptions

MR1: The correct model is y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i.
MR2: The conditional expected value of the random errors is zero: E[ε_i|X_i] = 0. (We will relax this assumption later in the subject.)
MR3 (homoskedasticity): The variance of the random errors is constant and independent of the X's: VAR[ε_i|X_i] = σ². (We will relax this assumption later in the subject.)
MR4: Any pair of random errors is uncorrelated: COV(ε_i, ε_j|X_i, X_j) = 0 for all i, j = 1, 2, ..., N with i ≠ j. (We will relax this assumption later in the subject.)

Assumptions about the Explanatory Variables

1. MR5a: The explanatory variables are non-random: the values of all of the X's are known prior to observing the (realized) values of the dependent variable.
   - as if the X's are 'pre-determined'
   - the 'fixed in repeated sampling' assumption
   - we will relax this assumption later in the subject
2. MR5b: No one of the X's is an exact linear function of any of the other X's. This is equivalent to assuming that none of the X's is redundant. If this assumption is violated we have a condition called exact collinearity, and the least squares procedure will fail (more on this later).

The Least Squares Principle

- The Least Squares Principle chooses estimates of (β_0, β_1, ..., β_K) so that the squared difference between the fitted line and the observed values of y is minimized.
- Let (b_0, b_1, ..., b_K) be estimates of the unknown parameters (β_0, β_1, ..., β_K).
- Define the fitted line:

    ŷ_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + ... + b_K X_{Ki}

- and the least squares residuals:

    ê_i = y_i − ŷ_i = y_i − (b_0 + b_1 X_{1i} + b_2 X_{2i} + ... + b_K X_{Ki})

The Least Squares Principle (continued)

Why minimize the squared difference between the fitted line and the observed values of y?
- Why not minimize Σ ê_i? This criterion has no minimum: the fitted line could be set arbitrarily 'high', driving Σ ê_i towards −∞.
- Why not choose the fitted line so that Σ ê_i = 0? A large positive residual (ê_i > 0) would cancel out a large negative residual (ê_i < 0). In addition, any fitted line passing through the (sample) means of y and X would satisfy this criterion, so there are infinitely many candidate fitted lines.

The Least Squares Principle (continued)

Choose (β_0, β_1, ..., β_K) to minimize the sum of squared errors:

    min_{β_0, β_1, ..., β_K}  S = Σ_{i=1}^{N} [y_i − β_0 − β_1 X_{1i} − ... − β_K X_{Ki}]²

First-order conditions:

    ∂S/∂β_0 = −2 Σ_{i=1}^{N} [y_i − β_0 − β_1 X_{1i} − ... − β_K X_{Ki}] = 0

and, for j = 1, 2, ..., K:

    ∂S/∂β_j = −2 Σ_{i=1}^{N} [y_i − β_0 − β_1 X_{1i} − ... − β_K X_{Ki}] X_{ji} = 0
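Stacked over observations, these (K + 1) first-order conditions are the normal equations X′X b = X′y, a linear system that can be solved directly. The sketch below is illustrative (invented data and parameter values), not the lecture's own example.

```python
# A minimal sketch (invented data) of solving the least squares first-order
# conditions as the normal equations X'X b = X'y.
import numpy as np

rng = np.random.default_rng(0)
N, K = 500, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K))])  # column of ones for b0
beta_true = np.array([1.0, 0.5, -2.0, 0.3])                 # illustrative values
y = X @ beta_true + rng.normal(size=N)

# The (K + 1) first-order conditions stack into X'(y - Xb) = 0, i.e. X'X b = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)   # least squares estimates (b0, b1, ..., bK), close to beta_true
```

Note that np.linalg.solve raises an error when X′X is singular, which is precisely the exact-collinearity case ruled out by assumption MR5b.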
The Least Squares Principle (continued)

Let (b_0, b_1, ..., b_K) be the values of (β_0, β_1, ..., β_K) that minimize the sum of squares function S:

    ∂S/∂β_0 |_{(b_0, b_1, ..., b_K)} = −2 Σ_{i=1}^{N} [y_i − b_0 − b_1 X_{1i} − ... − b_K X_{Ki}] = 0

and, for j = 1, 2, ..., K:

    ∂S/∂β_j |_{(b_0, b_1, ..., b_K)} = −2 Σ_{i=1}^{N} [y_i − b_0 − b_1 X_{1i} − ... − b_K X_{Ki}] X_{ji} = 0

We have (K + 1) linear equations in (K + 1) unknowns.

Objective

Our objective is to use a random sample of data on y and X, together with the econometric model, to learn about the values of the population parameters (β_0, β_1, ..., β_K).

- We use the Ordinary Least Squares estimators of the population parameters.
- The expressions for (b_0, b_1, ..., b_K) are estimators: formulas that deliver values of (b_0, b_1, ..., b_K) no matter what the values of y and the X's turn out to be. The formulas are the same for different samples of data.
- (b_0, b_1, ..., b_K) are random variables: their values depend upon the sample data y and X.
- When the sample data are substituted into the formulas, we obtain numbers that are the observed values of these random variables. We call these the least squares estimates.

Properties of the OLS Residuals

1. When there is an intercept term (β_0), the residuals sum to zero:

    Σ_{i=1}^{N} ê_i = 0

2. The residuals are orthogonal to each regressor: for each j = 1, 2, ..., K,

    Σ_{i=1}^{N} ê_i X_{ji} = 0

3. These two properties imply that the residuals are also orthogonal to the fitted values:

    Σ ê_i ŷ_i = Σ ê_i [b_0 + b_1 X_{1i} + ... + b_K X_{Ki}]
              = b_0 Σ ê_i + b_1 Σ ê_i X_{1i} + ... + b_K Σ ê_i X_{Ki} = 0
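Because these properties follow algebraically from the first-order conditions, they hold exactly (up to floating-point error) in every sample, whatever the true model. A quick numerical check, continuing the illustrative numpy setup used in the earlier sketches:

```python
# Numerical check (invented data) of the three OLS residual properties.
import numpy as np

rng = np.random.default_rng(1)
N, K = 200, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, K))])  # intercept included
y = X @ np.array([2.0, 1.0, -0.5]) + rng.normal(size=N)

b = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimates
y_hat = X @ b                           # fitted values
e_hat = y - y_hat                       # OLS residuals

print(e_hat.sum())     # property 1: ~0, since an intercept is included
print(X.T @ e_hat)     # property 2: each sum of e_hat * X_ji is ~0
print(e_hat @ y_hat)   # property 3: sum of e_hat * y_hat is ~0
```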