SCHOOL OF ECONOMICS
ECO-7000A: Econometric Methods
ECO-7009A: Financial Econometrics
ECO-7014A: Banking Econometrics
Take-home Assignment - Autumn Semester 2020

This piece of work should be submitted on Blackboard no later than 3pm on Monday 9 November 2020
(week 7). It accounts for 40% of the overall mark for the module. The exercise is divided into many
parts. Parts marked with an asterisk (*) are the most important.

The file HOUSE_2020 contains data on 4,150 residential properties traded in Norwich between January 2014
and October 2020. The variables are:
house: Property number
price: Sale price in thousands of pounds
beds: Number of bedrooms
baths: Number of bathrooms
recs: Number of recreation rooms
garages: Number of garages
type: 1 if empty plot of land
      2 if flat
      3 if bungalow
      4 if chalet
      5 if terraced house
      6 if end-terraced house
      7 if semi-detached house
      8 if detached house
pcode: 1 if post code is NR1 (South and East Central Norwich)
       2 if NR2 (West Central Norwich)
       3 if NR3 (North Central Norwich)
       4 if NR4 (South-West Norwich)
       5 if NR5 (West Norwich)
       6 if NR6 (North Norwich)
       7 if NR7 (East Norwich)
       8 if NR8 (North-West Norwich)
sqm: Internal area in square metres.
dg: One if property has double glazing; zero otherwise
ch: One if property has central heating; zero otherwise
gsize: Size of garden in square metres
poll: Air pollution at property (measured in millionths of a gram of particulate matter
per cubic metre of air)
noise: Level of traffic noise at property (measured in decibels, dB)
age: Age of property in years
month: Month of transaction: 1 if Jan 2014;
                             2 if Feb 2014;
                             :
                             82 if Oct 2020.

(a) Compare mean and median price (sum price, detail), and obtain a histogram of price (hist
price). Describe the distribution of property prices in Norwich. What does the comparison of mean
and median tell us about the nature of the distribution?
(b) Find the mean price for each of the eight postcodes (tab pcode, sum(price)). Rank the eight
postcodes by mean property price.
(c) Estimate a regression model using the OLS estimator, with price as the dependent variable, with sqm,
beds, baths, recs and garages as quantitative explanatory variables, and with a set of dummy
variables for type, using “empty plot” as the base case. (To obtain the type dummies, you just need to
include i.type in the regress command).[1] Present the results in a table.
[1] Dummy variables are explanatory variables that take on only two values, 0 and 1. We will talk about dummy
variables in the lectures soon.
(d) Explain why one of the eight type dummies must be omitted in order for estimation to be possible.
(e) Evaluate how well the model in part (c) fits the data. That is, quote and interpret R²; quote the
F-statistic for overall significance, conduct an F-test for overall significance, and interpret the result.
(f) Does the intercept parameter have a meaningful interpretation in the model of (c)?
(g) Number of bedrooms and number of recreation rooms both appear to have a negative effect on price.
Does this have a logical explanation?
(h) Using economic concepts where appropriate, interpret each of the other coefficients in the model of (c).
Briefly indicate which are significantly different from zero.
(i) Still using the results from the model of (c), report a 95% confidence interval for the slope parameter
associated with “sqm”. Interpret this interval estimate.
(j) Extend the model of part (c) to include a set of postcode dummies, with NR1 as the “base case” (just
add i.pcode to the regress command). Report the regression results.
(k) Using the formula for the F-test given in the lecture notes, conduct an F-test for the significance of
the postcode dummies, in order to assess the importance of location in price determination. (This is a
test of the model of (c) as a restricted version of the model of (j)).
(l)* Draw up a ranking of the eight postcodes, based on ceteris paribus price comparisons (i.e. based on
your regression results). Does your answer contradict your answer to (b)? If so, why?
(m) Using the model with postcode dummies (part j), predict the price of a terraced house in East Norwich,
with 2 bedrooms, 1 bathroom, 1 recreation room, no garage and an internal area of 60 square metres.
(Give the answer in pounds.) Predict the price of a detached house in North-West Norwich, with 5
bedrooms, 2 bathrooms, 3 recreation rooms, two garages and an internal area of 200 square metres.
(n) Which of the 4,150 properties appears to have been the best in terms of “value for money”, and which
worst? (Hint: Look at the residuals).
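
As a rough illustration of the arithmetic behind part (k), the sketch below computes the F-statistic for a set of exclusion restrictions in its standard textbook form, F = [(RSS_R - RSS_U)/q] / [RSS_U/(n - k)], which may be written differently in the lecture notes. The function name and the two residual sums of squares are hypothetical placeholders, not results from HOUSE_2020. The values q = 7 and k = 20 come from counting the parameters in the models of (c) and (j), assuming every type and postcode category appears in the data.

```python
def f_statistic(rss_restricted, rss_unrestricted, q, n, k):
    """F-statistic for q linear restrictions.

    n is the sample size and k the number of parameters (including the
    intercept) in the unrestricted model.
    """
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator


# Hypothetical residual sums of squares, for illustration only.
RSS_R = 1200.0  # restricted model: part (c), no postcode dummies
RSS_U = 1000.0  # unrestricted model: part (j), with postcode dummies

N = 4150  # properties in the data set
Q = 7     # postcode dummies being tested (NR1 is the base case)
K = 20    # intercept + 5 quantitative variables + 7 type + 7 postcode dummies

F = f_statistic(RSS_R, RSS_U, Q, N, K)
print(f"F({Q}, {N - K}) = {F:.2f}")
```

The statistic is then compared with the critical value of the F distribution with q and n - k degrees of freedom.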

FOR PARTS (o)-(r), DO NOT PROVIDE TABLES OF RESULTS; JUST FOCUS ON THE PARTS OF THE
RESULTS THAT ARE RELEVANT IN ANSWERING THE QUESTIONS.

(o)* Add the variables age and age-squared to the model of (j). (To do this, you need to add
c.age##c.age to the regress command). Test the individual and joint significance of age and
age-squared. Plot predicted price against age, with other variables set to means (to do this, you need the
two commands: margins, at(age=(0(20)200)) atmeans, followed by marginsplot.) There
are two economic arguments for why age affects property price: the depreciation effect (the value of a
property declines as it gets older); and the vintage effect (older properties attract a premium). Are you
finding evidence of the depreciation effect, or the vintage effect, or both?
(p) Add the variable month to the model of (o), and interpret the coefficient of month. Why is this
information useful to a homeowner?
(q)* Add the variable month-squared as well as month. (To do this, add c.month##c.month to the
regress command). Similarly to part (o), obtain a plot of predicted price against month. In which month
(if any) did prices reach a maximum? If they did not reach one within the sample, predict when a maximum price will be reached.
(r) Test for heteroskedasticity in the model of (q) using the command hettest sqm age, fstat.
Suggest reasons why the variables sqm and age are likely to cause heteroskedasticity in the present
situation.
(s)* Other variables are available in the data set. Experiment with these by adding them in different
combinations (with squared terms and interaction terms where appropriate) to the model (you should
continue to include the variables you have previously used). You might also try making the necessary
correction for heteroskedasticity (if you found evidence of it in (r)). REPORT ONLY ONE COMPLETE
SET OF RESULTS; THIS SET OF RESULTS SHOULD BE FROM YOUR MOST PREFERRED
SPECIFICATION. Make sure you explain why it is your most preferred specification. Interpret the
coefficients of the variables you have added (there is no need to interpret coefficients of variables
previously used).
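
For part (q), a quadratic in month of the form b1*month + b2*month^2 has a stationary point at month = -b1/(2*b2), which is a maximum only when b2 is negative. The sketch below evaluates that turning point and converts a month index back to a calendar month using the coding given in the variable list (1 = Jan 2014, 82 = Oct 2020). The coefficient values and function names are hypothetical, chosen only to show the arithmetic.

```python
MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def turning_point(b1, b2):
    """Month index at which b1*month + b2*month**2 is stationary."""
    if b2 == 0:
        raise ValueError("no turning point when the squared term is zero")
    return -b1 / (2 * b2)


def month_label(index):
    """Convert the data set's month index (1 = Jan 2014) to 'Mon YYYY'."""
    years, month = divmod(index - 1, 12)
    return f"{MONTH_NAMES[month]} {2014 + years}"


# Hypothetical coefficients on month and month-squared.
b_month, b_month_sq = 0.9, -0.005

peak = turning_point(b_month, b_month_sq)
print(round(peak), month_label(round(peak)))
```

A turning point above 82 lies beyond the sample period, so it would be a prediction rather than an observed peak.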

Hints for part (s):

• The relationship between price and garden size is almost certainly non-linear, with the marginal value
of an additional square metre falling as garden size rises. For this reason it is inappropriate to use
gsize itself as an explanatory variable. Try log(gsize). Take care in interpreting the coefficient.

• Something called “Covid19” happened from March 2020 onwards. Did it have a significant effect on
property prices in Norwich? Try including a “Covid dummy” in your model.
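
The two hints can be sketched as follows. With log(gsize) as a regressor, a p% rise in garden size changes predicted price by b*ln(1 + p/100) thousand pounds, which is roughly b/100 for a 1% rise. Note that log(gsize) is undefined for a property with no garden, so any such observations would need separate handling. A Covid dummy can be built from month, since March 2020 is month 75 under the coding 1 = Jan 2014, 82 = Oct 2020. The coefficient value and function names below are hypothetical.

```python
import math

COVID_START_MONTH = 75  # Mar 2020, given 1 = Jan 2014 and 82 = Oct 2020


def covid_dummy(month):
    """1 for transactions from March 2020 onwards, 0 otherwise."""
    return 1 if month >= COVID_START_MONTH else 0


def price_change_from_garden(b_log_gsize, pct_increase):
    """Change in predicted price (thousands of pounds) when gsize rises
    by pct_increase percent, holding other variables fixed."""
    return b_log_gsize * math.log(1 + pct_increase / 100)


b = 10.0  # hypothetical coefficient on log(gsize)
print(price_change_from_garden(b, 1))    # close to b / 100
print(price_change_from_garden(b, 100))  # doubling the garden: b * ln(2)
print([covid_dummy(m) for m in (74, 75, 82)])
```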
