ECOM30002/90002 Econometrics 2, Semester 2, 2020

ASSIGNMENT 3

Instructions and information

• Submit online through LMS no later than 8am Monday, 28 September; assignments submitted late (for whatever reason) incur a five-mark per (full) hour late penalty.
• Assignments can be completed individually (on your own) or by a group (of up to four students). Students in a group do not have to be from the same tutorial. Equal marks are awarded to each group member.
• Assignment groups must be formed through LMS (People section) before they are ‘locked’ at 10am, Thursday, 24 September; after that, they cannot be changed. Please do not submit your assignment until after group formation has closed.
• Please include on the front page the names and ID numbers of all group members.
• Assignments should be submitted as a fully typed document (pdf or Word). Question numbers should be clearly indicated.
• For questions requiring interpretation, explanation and/or discussion, concise correct answers will be valued over lengthier unclear off-topic ones.
• If performing calculator-based calculations, to maximize accuracy, please do not round intermediate results, and report final answers to four decimal places.
• Each question starts on a new page.

Notes

• This assignment has two questions comprising 10 parts and will be marked out of 50 (all question parts, i.e. (a), (b), (c) etc., are worth five marks each).
• Up to five marks in total can be deducted for not doing the following: providing your R code. (Code can be in an appendix.)

QUESTION 1

Suppose that the reduced form equations for the simultaneously determined variables and are

    = (1 + ) + (1 + ) + 1,    (1.1)
    = (1 + ) + + 2,    (1.2)

where �1,�,� = �2,�,� = 0, 1, and 2, are uncorrelated and is a positive parameter ( > 0).
The structural equations corresponding to (1.1) and (1.2) have the form

    = 1 + 1 + 1,    (1.3)
    = 2 + 2 + 2,    (1.4)

(a) Use the reduced form equations, (1.1) and (1.2), to derive the structural form equations, (1.3) and (1.4), and obtain an expression for 1, and expressions for 1, 2 and 2 in terms of .

(b) Report in a table (as formatted below) the results of a simulation based on 1,000 replications under the following assumptions and settings (please set the seed to 12):

• The true data generating process (DGP) is given by (1.1) and (1.2)
• The parameter of interest is 1
• (1.3) is estimated separately by OLS and 2SLS using and as instruments
• ~iid (0, 1)
• ~iid (0, 1)
• 1,~iid (0, 1)
• 2,~iid (0, 1)
• the sample size () is 250
• Note: to regress Y on X without an intercept, use the R code Y~0+X

Table 1(b). Means of � and �, with varying , based on 1,000 replications

        | 2SLS | OLS
    1   |      |
    4   |      |
    10  |      |
    100 |      |

Notes: (1) The IV and LS superscripts correspond to 2SLS and OLS, respectively. (2) Each cell entry is the mean of 1,000 estimates of 1 from equation (1.3).

(c) Explain thoroughly how and why the OLS estimates differ from the 2SLS estimates.

(d) Compute by simulation the rejection frequencies for 0: 1 = 1 vs. 1: 1 ≠ 1 under 2SLS and OLS, assuming a 5% level of significance, report these rejection frequencies in a table as formatted below (please set the seed to 12), and explain why the results in the 2SLS and OLS columns differ.

Table 1(d). Rejection frequencies for : = , with varying (1,000 replications)

        | 2SLS | OLS
    1   |      |
    4   |      |
    10  |      |
    100 |      |

Note: Each cell entry is the proportion of times out of 1,000 that 1 = 1 is rejected at the 5% level.

QUESTION 2

This question uses CP.csv, which is an abridged version of the dataset used by Baltagi (2006) to analyse determinants of the crime rate,¹ comprising data on 90 counties in North Carolina from 1981 to 1987 (total sample size 630). Except for the regional dummies (, and ), the variables from the dataset that we use (listed below) are all in natural logs.
In the datafile, the cross-section identifier is “county” and the time-series identifier is “year”. The and subscripts denote county and year, respectively.²

    , crimes committed per person;
    , police per capita;
    , a deterrence measure based on Baltagi’s ‘probabilities’ of arrest, conviction and imprisonment;³
    , percentage in county that are young males;
    = 1 if county in western North Carolina, 0 otherwise;
    = 1 if in central North Carolina, 0 otherwise;
    = 1 if county in metropolitan (urban) area, 0 otherwise;
    , tax revenue per capita.

Based on the variables listed above, consider the following simultaneous equations model of the crime rate (,) and police numbers (,):

    , = 0 + 1, + 2, + 3, + 4 + 5 + 6 + 1,,    (2.1)
    , = 0 + 1, + 2, + 3 + 4 + 5 + 2,,    (2.2)

where 1,, and 2,, are i.i.d. error terms.

¹ Badi H. Baltagi, “Estimating an Economic Model of Crime using Panel Data from North Carolina”, Journal of Applied Econometrics, Vol. 21, No. 4, 2006, pp. 543–547 (available in LMS > Assignment 3 and worth a read).
² For details of the data and variable definitions, see Baltagi.
³ To be specific, this variable is the sum of Baltagi’s ‘probabilities’ of arrest, conviction and prison sentence (see his first paragraph of section 1). One might think of the deterrence measure as reflecting the (logged) probability of the following sequence: perpetrator arrested, perpetrator convicted, perpetrator imprisoned.

(a) Note: In this part, we ignore the obvious endogeneity of , in (2.1) that arises because of simultaneity so that we can look at the effects of dealing with its endogeneity later. So, treating all right-side variables (regressors) as exogenous (for now), estimate equation (2.1) using the within estimator with cross-section (individual/entity) fixed effects only and report your coefficient estimates (“Est. coef.”) and heteroskedasticity-consistent standard errors (“HCSE”s) in column (a) of a table equivalent to Table 2 on the final page.
(Just write “See column (a) of Table 2 for results.”)

(b) Is the sign of your estimate of 1 from (a) as expected? If so, thoroughly explain why. If not, thoroughly explain the difference between the expected and estimated signs.

(c) Estimate the dummy variables version of the fixed effects model described in (a) above and report the results in column (c) of a table equivalent to Table 2 on the final page. (Note that where comparison is possible, the estimated coefficients and standard errors in columns (a) and (c) should be the same.)

(d) Note: From now on, as is appropriate, we treat the simultaneously determined , and , as endogenous, but treat all other variables in (2.1) and (2.2) as exogenous. Estimate the dummy variables version of the reduced form equation for police numbers (,) and report the results in column (d) of a table equivalent to Table 2 on the final page.⁴

(e) Using the fitted values from the regression in (d) above, re-estimate the dummy variables version of equation (2.1) with �, replacing ,, and report the results in column (e) of a table equivalent to Table 2 on the final page.

(f) Has the procedure applied in (e) above dealt with the unexpected sign problem referred to in (b) above? Suggest how the model might be developed further to address this issue.

⁴ Please ignore the message about three coefficients being dropped because of singularities. This is presumably due to keeping the regional dummies in the model even though these must be based on the county locations.

Table 2. Panel data estimates of crime and police equations (please fill in!)

    Regressor |            | (a) dv = , | (c) dv = , | (d) dv = , | (e) dv = ,
    ,         | Est. coef. |            |            | –          | –
              | HCSE       |            |            | –          | –
    �,        | Est. coef. | –          | –          | –          |
              | HCSE       | –          | –          | –          |
    ,         | Est. coef. |            |            |            |
              | HCSE       |            |            |            |
    ,         | Est. coef. |            |            |            |
              | HCSE       |            |            |            |
              | Est. coef. | –          |            |            |
              | HCSE       | –          |            |            |
              | Est. coef. | –          |            |            |
              | HCSE       | –          |            |            |
              | Est. coef. | –          |            |            |
              | HCSE       | –          |            |            |
    ,         | Est. coef. | –          | –          |            | –
              | HCSE       | –          | –          |            | –

Notes
1. dv = dependent variable
2.
The cell entry – (dash) indicates that either the regressor is included in the model but its estimated coefficient cannot be reported or the regressor is excluded from the model (you have to figure out which!).
3. The regressions in columns (c), (d) and (e) include county dummy variables whose estimated coefficients are not reported.
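
Illustration. The note in Question 1(b) on regressing without an intercept (Y~0+X), together with the idea of averaging OLS and 2SLS estimates over replications, can be sketched in base R as follows. This is only a sketch under stated assumptions: the data generating process, the true coefficient of 1, the single instrument and the object names (one_rep, x_hat, estimates) are all hypothetical and are not equations (1.1) to (1.4). 2SLS is computed here by hand as two lm fits.

```r
# Hypothetical DGP for illustration only; NOT the assignment's equations.
set.seed(12)
n_reps <- 1000   # number of Monte Carlo replications
n_obs  <- 250    # sample size in each replication
beta   <- 1      # assumed true coefficient on the endogenous regressor

one_rep <- function() {
  z <- rnorm(n_obs)                  # instrument, independent of the errors
  e <- rnorm(n_obs)                  # shock shared by x and the error in y
  x <- z + e + rnorm(n_obs)          # endogenous regressor (depends on e)
  y <- beta * x + e + rnorm(n_obs)   # outcome; its error e + noise is correlated with x

  # OLS without an intercept, using the 0 + syntax
  b_ols <- coef(lm(y ~ 0 + x))[["x"]]

  # 2SLS by hand: first stage fitted values, then second stage
  x_hat  <- fitted(lm(x ~ 0 + z))
  b_tsls <- coef(lm(y ~ 0 + x_hat))[["x_hat"]]

  c(OLS = b_ols, TSLS = b_tsls)
}

# 2 x n_reps matrix: one column of estimates per replication
estimates <- replicate(n_reps, one_rep())

# Mean of each estimator across replications
print(rowMeans(estimates))
```

In this hypothetical design the OLS mean settles near 1 + 1/3 because the regressor has variance 3 and covariance 1 with the error, while the 2SLS mean stays close to the assumed true value of 1. Note that the second-stage lm fit gives the 2SLS point estimate only; the standard errors it prints are based on the wrong residuals and are not valid 2SLS standard errors.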