Summer 2021 Exam

EC220

Introduction to Econometrics

Suitable for all candidates

Instructions to candidates

This paper contains FOUR questions, divided into two sections. Section A contains ONE question related to Michaelmas Term and Section B contains THREE questions related to Lent Term. You should answer ALL questions from Section A and ALL questions from Section B.

If at any point in this exam you feel that anything is unclear, please make additional assumptions that you feel are necessary and state them clearly.

For Section A: Please type your answer using word-processing software on a computer (e.g. Word). You may combine the typed document with scanned or photographed hand-drawn diagrams and computations. The maximum word count is 1500 words, beyond which nothing will be marked. There is no minimum word count and concise answers will be rewarded.

For Section B: Please use pen and paper and scan (or photograph) your answers. You may also use an iPad or a tablet. There is no maximum word count for Section B. Please annotate your answers clearly.

The answers must then be converted to PDF and uploaded to Moodle as ONE individual file together with the Coversheet. Please make sure every single scanned page is legible and properly ordered. The file will be run through Turnitin to ensure academic integrity.

Time Allowed: Submit PDF with answers within 24 hours after official start of the exam

Expected effort: Reading Time: 15 minutes; Answering Time: 3 hours

You are supplied with: Lindley & Scott Cambridge Statistical Tables; Table A5 Durbin-Watson d-statistic

You may also use: Open book examination

Calculators: Calculators are allowed in this exam

© LSE ST 2021/EC220 Page 1 of 7

Section A

(Answer all questions.)

Question 1

[33.34 marks]

A blowout of the BP Deepwater Horizon oil well in April 2010 led to the largest marine oil spill in history, lasting until July of that year. Researchers would like to analyse whether consumers reacted to the disaster by reducing their consumption of BP-branded petrol during the oil spill. They collected data on the prices and quantities sold at BP-branded and non-BP petrol stations across zip codes (postal codes; small local areas) in the US. Either a zip code contains BP stations, in which case the average price and average number of gallons sold for each of these BP stations are recorded (and the indicator variable BP = 1), or a zip code contains no BP station, in which case the price and quantity at these non-BP stations are recorded (and BP = 0). An observation is a particular petrol station. Non-BP stations in BP zip codes are not used in the sample.

Prices and quantities are the averages either for the period January 2009 to March 2010 (before the oil spill) in columns (1) and (2) or for April 2010 to July 2010 (during the oil spill) in columns (3) to (6) of the table below. Prices are in US Dollars per gallon and coded as Price. Quantities are in logarithms and coded as ln(sales). The researchers also constructed a variable called Green Index, which is supposed to measure the environmental orientation of consumers in the zip code. The Green Index is constructed by combining the share of hybrid vehicle registrations, per capita membership in the Sierra Club (an environmental organisation), and per capita contributions to Green Party election funds in the zip code, all measured prior to 2010. The Green Index is then standardised to have mean 0 and standard deviation 1. Using either Price or ln(sales) as the dependent variable, the researchers obtain the following results.

                    Jan 2009 - Mar 2010     Apr 2010 - Jul 2010
                    Price       ln(sales)   Price       ln(sales)   Price       ln(sales)
                    (1)         (2)         (3)         (4)         (5)         (6)
BP                  0.003       -0.004      -0.043      -0.036      -0.041      -0.033
                    (0.002)     (0.007)     (0.002)     (0.009)     (0.002)     (0.009)
Green Index                                                         0.006       -0.002
                                                                    (0.001)     (0.002)
BP × Green Index                                                    -0.006      0.013
                                                                    (0.002)     (0.008)

Notes: All regressions also contain a constant. Robust standard errors are in parentheses. The regressions have 5,686 observations.

(a) Define the treatment, the outcome, and the counterfactuals implicit in the regression in column (3). What do the researchers use as the control group in this regression?

[3.34 marks]

(b) Why do the researchers run the regressions in columns (1) and (2) for the period before the oil spill? What do you conclude from this exercise?

[6 marks]

(c) What is the average effect of the oil spill on BP prices in column (3)? Discuss whether this is likely a causal effect.

[6 marks]

(d) The researchers also have a variable available which measures the advertising expenditures by BP in a particular zip code from April to July 2010. Would this variable be useful for the analysis? Explain either how it could be used productively or why you think it is not useful.

[4 marks]

(e) What is the interpretation of the coefficient on the interaction term BP × Green Index in column (5)? How much lower is the BP price at a station in a zip code which is 2 standard deviations above the mean in the Green Index compared to a station in a similarly green control zip code? Carefully explain your answer.

[6 marks]

(f) The average BP price during the sample period (2009-2010) is US$2.70. Use the estimates in columns (3) and (4) to construct a price elasticity for BP-branded petrol. Is this a demand or supply elasticity or neither? Carefully explain your answer.

[8 marks]
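
The arithmetic behind parts (e) and (f) can be sketched as follows. This is only an illustration of the mechanics, not a model answer: the coefficients are copied from the results table, the function and constant names are made up for this sketch, and the log difference in sales is treated as an approximate proportional change. Whether the resulting ratio deserves to be called a demand or supply elasticity is left to the discussion the question asks for.

```python
# Coefficients copied from the results table (BP rows); all names are illustrative.
BP_PRICE_COL3 = -0.043      # column (3): BP coefficient, Price in USD per gallon
BP_LNSALES_COL4 = -0.036    # column (4): BP coefficient, ln(sales)
BP_PRICE_COL5 = -0.041      # column (5): BP coefficient
INTERACTION_COL5 = -0.006   # column (5): BP x Green Index coefficient
AVG_BP_PRICE = 2.70         # average BP price stated in part (f)


def price_gap_at_green(green_index: float) -> float:
    """BP minus non-BP price gap implied by column (5) at a given Green Index."""
    return BP_PRICE_COL5 + INTERACTION_COL5 * green_index


def implied_elasticity() -> float:
    """Ratio of the approximate % change in quantity to the % change in price."""
    pct_price = BP_PRICE_COL3 / AVG_BP_PRICE   # dollar change as a share of the average price
    pct_quantity = BP_LNSALES_COL4             # a log difference approximates a proportional change
    return pct_quantity / pct_price


if __name__ == "__main__":
    print(f"Price gap at Green Index = 2: {price_gap_at_green(2.0):.3f} USD per gallon")
    print(f"Implied elasticity: {implied_elasticity():.2f}")
```

Both the price and the quantity coefficients are negative, so the ratio comes out positive; interpreting that sign is part of what (f) is asking.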

Section B

(Answer all questions.)

Question 2

[22.33 marks]

Consider the bivariate regression model without intercept

$$y_i = \beta x_i + u_i,$$

for $i = 1, \dots, n$. We impose the following assumptions.

SLR.1 The population model is $y = \beta x + u$.

SLR.2 We have a random sample of size $n$, $\{(y_i, x_i) : i = 1, \dots, n\}$, following the population model in SLR.1.

SLR.3 The sample outcomes on $\{x_i : i = 1, \dots, n\}$ are not all the same value.

SLR.4 The error term $u$ satisfies $E(u \mid x) = 0$ for any value of $x$.

(a) Let $\hat{\beta}$ be the OLS estimator for the regression of $y$ on $x$ (without intercept). Show that $\hat{\beta}$ is a consistent estimator for $\beta$ under SLR.1-4.

[3 marks]

(b) In addition to SLR.1-4, suppose we know that

$$\mathrm{Var}(u \mid x) = \sigma^2 x^2.$$

Derive the (conditional) variance $\mathrm{Var}(\hat{\beta} \mid X)$, where $X = (x_1, \dots, x_n)$.

[4 marks]

(c) Find an estimator for $\mathrm{Var}(\hat{\beta} \mid X)$ derived in (b), and explain how to compute the 95% confidence interval for $\beta$.

[3.33 marks]

(d) Now consider the weighted OLS estimator $\tilde{\beta}$ defined as

$$\tilde{\beta} = \arg\min_{b} \sum_{i=1}^{n} w(x_i) \, (y_i - b x_i)^2,$$

where $w(x) > 0$ is a positive weight function of $x$. Derive the expression for $\tilde{\beta}$.

[4 marks]

(e) Show that $\tilde{\beta}$ is an unbiased estimator for $\beta$ under SLR.1-4.

[4 marks]

(f) Additionally, suppose:

SLR.5 The error term $u$ satisfies $\mathrm{Var}(u \mid x) = \sigma^2$ for any value of $x$ (homoskedasticity).

Under SLR.1-5, derive the (conditional) variance $\mathrm{Var}(\tilde{\beta} \mid X)$.

[4 marks]
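
As a numerical companion to part (d), the sketch below evaluates the weighted objective on simulated data and checks that the value solving the first-order condition does no worse than nearby values of $b$. Everything about the data is an assumption made for illustration: the sample size, the true slope of 1.5, the uniform draw for $x$, the error standard deviation proportional to $x$, and the weight choice $w(x) = 1/x^2$ are not part of the question.

```python
import random


def weighted_objective(b, x, y, w):
    """Weighted sum of squared residuals for a regression through the origin."""
    return sum(wi * (yi - b * xi) ** 2 for xi, yi, wi in zip(x, y, w))


def weighted_slope(x, y, w):
    """Solution of the first-order condition sum_i w_i x_i (y_i - b x_i) = 0."""
    num = sum(wi * xi * yi for xi, yi, wi in zip(x, y, w))
    den = sum(wi * xi ** 2 for xi, wi in zip(x, w))
    return num / den


def simulate(n=200, beta=1.5, seed=0):
    """Hypothetical data: x uniform on [1, 3], error sd proportional to x."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 3.0) for _ in range(n)]
    y = [beta * xi + xi * rng.gauss(0.0, 1.0) for xi in x]
    w = [1.0 / xi ** 2 for xi in x]   # one possible choice of w(x) > 0
    return x, y, w


if __name__ == "__main__":
    x, y, w = simulate()
    b_tilde = weighted_slope(x, y, w)
    best = weighted_objective(b_tilde, x, y, w)
    # The objective is a convex quadratic in b, so nearby values cannot do better.
    assert best <= weighted_objective(b_tilde + 0.01, x, y, w)
    assert best <= weighted_objective(b_tilde - 0.01, x, y, w)
    print(f"weighted slope estimate: {b_tilde:.4f}")
```

With the illustrative weights $w(x) = 1/x^2$ the estimate reduces to the sample average of $y_i / x_i$, and with $w(x) = 1$ it reduces to the unweighted no-intercept OLS slope.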

Question 3

[22.33 marks]

(a) Are the following statements true or false? Explain your answers.

(i) Consider the bivariate regression model $y_i = \beta_0 + \beta_1 x_i + u_i$ for $i = 1, \dots, n$. If the $t$ statistic to test the null hypothesis $H_0: \beta_1 = 0$ is zero, then $R^2$ is also zero.

[4 marks]

(ii) Consider the bivariate regression model $y_i = \beta_0 + \beta_1 x_i + u_i$ for $i = 1, \dots, n$. If one rejects the null hypothesis

$$H_0: \beta_1 = 10 \text{ in favor of } H_1: \beta_1 > 10$$

at significance level $\alpha$, then one would certainly reject the null hypothesis

$$H_0: \beta_1 = 10 \text{ in favor of } H_1: \beta_1 \neq 10$$

at the same significance level.

[4 marks]

(iii) Consider a multiple regression model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, where $u$ is independent of $(x_1, x_2)$ and $u \sim \text{Normal}(0, \sigma^2)$ (i.e., Assumptions MLR.1-6 are satisfied). If the sample correlation between $x_1$ and $x_2$ is extremely high (say, 0.99), then the $t$ statistic for testing the hypothesis $H_0: \beta_2 = 0$ does not follow the $t$ distribution under $H_0$.

[3 marks]

(b) We are interested in evaluating whether the decision by loan officials to deny a mortgage may be racially motivated. Let the binary variable deny equal 1 if the application for a mortgage was denied, and deny = 0 if an application for a mortgage was successful. The variable minority is a dummy variable which indicates if the applicant belongs to an ethnic minority group (1 = yes) or not (0 = no). The data set contains information on a wide range of variables which a loan officer might legally consider when deciding on a mortgage application. We will restrict our attention to the variables: pirat (ratio of total monthly debt payment to total monthly income), lvrat_med and lvrat_high (dummy variables indicating whether the loan-to-value ratio is intermediate or high, with the excluded dummy being low), and a consumer credit score, chist (which ranges from 1 to 5, where 5 is the worst rating). We have a random sample of 2,380 observations. There are 285 denied applications, the average of pirat equals 0.33, and the average of chist equals 2.12. The following table provides various regression results based on this data that may shed light on this question.

                LPM         Logit (1)   Logit (2)   Logit (3)    Logit (4)
                                                    (minority)   (non-minority)
minority        .113        .777        -           -            -
                (.024)      (.160)
pirat           .483        5.039       5.283       5.165        5.013
                (.072)      (.752)      (.759)      (1.579)      (.860)
lvrat_med       .053        .643        .748        .190         .788
                (.014)      (.145)      (.143)      (.285)       (.168)
lvrat_high      .251        1.759       1.904       1.182        1.973
                (.051)      (.281)      (.277)      (.497)       (.336)
chist           .041        .332        .366        .318         .340
                (.005)      (.035)      (.034)      (.065)       (.041)
intercept       −.171       −5.144      −5.209      −4.039       −5.238
                (.024)      (.305)      (.307)      (.643)       (.352)
n               2,380       2,380       2,380       339          2,041
R^2             .1427
logL                        −726.469    −737.605    −177.477     −547.051

The usual standard errors for the logit model and the robust standard errors for the LPM are reported in parentheses.

(i) Let us consider the results that are based on the linear probability model (LPM). Provide the interpretation of the parameter estimate on minority, denoted $\hat{\beta}_{\text{minority}}$, and test whether the effect is statistically significant. Clearly indicate the assumptions you make use of for your interpretation and for the test you conduct.

[4 marks]

(ii) Obtain the partial effect of being a minority on the mortgage denial rate using the logit model, when evaluated at the mean values of the explanatory variables and given an intermediate loan-to-value ratio. Briefly indicate whether you expect this effect to be significant (no formal test expected). Should we expect this effect to be the same given a high loan-to-value ratio?

[4.33 marks]

(iii) Let us define

$$\Pr(\text{deny} = 1 \mid x) = \Lambda(\alpha_0 + \alpha_1 \text{pirat} + \alpha_2 \text{lvrat\_med} + \alpha_3 \text{lvrat\_high} + \alpha_4 \text{chist}),$$

where $\Lambda(z) = \exp(z)/(1 + \exp(z))$. You are interested in deciding whether there is evidence that the decision rule taken by the loan officer is the same for minorities as it is for non-minorities. Using the results provided, conduct this test. Clearly specify the null and the alternative hypothesis, the test statistic and its distribution under the null, the rejection rule, and interpret your findings.

[3 marks]
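
The mechanics behind parts (ii) and (iii) can be sketched in a few lines. The logistic function follows the definition of $\Lambda(z)$ given in (iii), and the numbers are copied from the regression table and the stated sample means. The helper names are made up for this sketch, and two choices are assumptions rather than something the paper prescribes: the partial effect is computed as a discrete change (minority switching from 0 to 1) using Logit (1), and the comparison in (iii) is done with a likelihood ratio statistic that sets the sum of the log-likelihoods of Logit (3) and Logit (4) against the pooled Logit (2).

```python
import math


def logistic(z: float) -> float:
    """Lambda(z) = exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dummy_partial_effect(index_without_dummy: float, dummy_coef: float) -> float:
    """Change in Pr(deny = 1) when a dummy switches from 0 to 1, other regressors fixed."""
    return logistic(index_without_dummy + dummy_coef) - logistic(index_without_dummy)


def likelihood_ratio(logl_unrestricted: float, logl_restricted: float) -> float:
    """LR = 2 * (logL_unrestricted - logL_restricted)."""
    return 2.0 * (logl_unrestricted - logl_restricted)


if __name__ == "__main__":
    # Logit (1) index at mean pirat (0.33) and mean chist (2.12),
    # with lvrat_med = 1, lvrat_high = 0 and minority = 0.
    index0 = -5.144 + 5.039 * 0.33 + 0.643 * 1 + 0.332 * 2.12
    effect = dummy_partial_effect(index0, 0.777)
    print(f"Pr(deny) for non-minority: {logistic(index0):.3f}")
    print(f"Discrete partial effect of minority: {effect:.3f}")

    # Separate fits, Logit (3) + Logit (4), versus the pooled Logit (2).
    lr = likelihood_ratio(-177.477 + -547.051, -737.605)
    # Five coefficients (intercept plus four slopes) are allowed to differ,
    # so the statistic would be compared with a chi-square(5) critical value
    # (about 11.07 at the 5% level in standard tables).
    print(f"LR statistic: {lr:.3f}")
```

Because $\Lambda$ is nonlinear, the same coefficient on minority implies a different change in probability at a different starting index, which is the point behind the last sentence of (ii).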

Question 4

[22 marks]

Let us consider the following stationary, weakly dependent time series model

$$y_t = \alpha + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \beta x_t + \varepsilon_t, \quad t = 1, \dots, T.$$

The error process has mean zero and exhibits autocorrelation. You may assume that $\varepsilon_t$ is independent of $x_s$ for all $s \le t$, and that $y_0$ and $y_{-1}$ are available and equal 0.

(a) Discuss how you would test the null hypothesis that the error process does not exhibit autocorrelation against the alternative that the error can be represented by a stationary AR(2) process. In your answer you should clearly indicate what an AR(2) process is.

[5 marks]

(b) Briefly discuss the properties of the OLS estimator when applied to the above model (unbiasedness, consistency), recognizing the presence of autocorrelation. Support your answers with suitable arguments.

[4 marks]

(c) What is the purpose of heteroskedasticity and autocorrelation robust (HAC) standard errors? Discuss whether HAC standard errors can resolve the main problem associated with estimating the model using OLS.

[3 marks]

(d) Describe in detail a method to resolve the main problem with using OLS to estimate $(\alpha, \rho_1, \rho_2, \beta)$. Clearly indicate the assumptions underlying this method. If you think there is no method that can resolve the main problem, explain why.

[5 marks]

(e) What is the Long Run Propensity (LRP; the long run effect that a permanent change in $x_t$ has on $y_t$) in this model and what is the Impact Propensity? Describe how you would conduct, using a single linear hypothesis, a test for the hypothesis that the LRP is the same as the Impact Propensity.

[5 marks]
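
The two propensities in part (e) can be illustrated by iterating the model with the errors switched off and comparing a path where $x_t$ is permanently one unit higher with a baseline path. The parameter values below are hypothetical, chosen only so that the process is stationary, and the function names are made up for this sketch; none of them come from the paper.

```python
def simulate_path(alpha, rho1, rho2, beta, x_path):
    """Iterate y_t = alpha + rho1*y_{t-1} + rho2*y_{t-2} + beta*x_t with errors set to zero."""
    y_lag1, y_lag2 = 0.0, 0.0   # y_0 = y_{-1} = 0, as stated in the question
    path = []
    for x_t in x_path:
        y_t = alpha + rho1 * y_lag1 + rho2 * y_lag2 + beta * x_t
        path.append(y_t)
        y_lag1, y_lag2 = y_t, y_lag1
    return path


def long_run_propensity(rho1, rho2, beta):
    """Steady-state response of y to a permanent unit change in x."""
    return beta / (1.0 - rho1 - rho2)


if __name__ == "__main__":
    # Hypothetical stationary parameter values, for illustration only.
    alpha, rho1, rho2, beta = 1.0, 0.5, 0.2, 2.0
    periods = 200
    raised = simulate_path(alpha, rho1, rho2, beta, [1.0] * periods)
    baseline = simulate_path(alpha, rho1, rho2, beta, [0.0] * periods)
    print(f"Response in the first period: {raised[0] - baseline[0]:.4f}")
    print(f"Response after {periods} periods: {raised[-1] - baseline[-1]:.4f}")
    print(f"beta / (1 - rho1 - rho2): {long_run_propensity(rho1, rho2, beta):.4f}")
```

In this sketch the first-period response equals $\beta$ and the long-run response equals $\beta / (1 - \rho_1 - \rho_2)$, so for $\beta \neq 0$ the two coincide exactly when $\rho_1 + \rho_2 = 0$.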

END OF PAPER
