Nencka Econometrics Eco311 SP21 Midterm 1

Name

Instructions:
1. You have from 8:35 to 9:35 to take the test. Please upload your answers typed, as a separate Word Doc or PDF.
2. Answer all the questions to the best of your ability; I will give partial credit. There are 100 total points. Ask me questions if anything is unclear.
3. If a question asks you to justify or explain your result, do so. Otherwise you will not get full credit.
4. Be “precise and concise” in your explanations. Do not write a lot just to write a lot.
5. Have fun and good luck!
6. Do not forget to write your name at the top of your page.
7. This exam is open note and open internet, but collaboration with fellow students is prohibited. Do not share this exam with any other student or post it online.

1. (25 points) A recent paper in the Quarterly Journal of Economics begins with the following sentences:

Sustained growth in medical spending has prompted policy makers, insurers, and employers to search for ways to reduce health care costs. One widely touted solution is to increase the use of [workplace] “wellness programs,” interventions designed to encourage preventive care and discourage unhealthy behaviors, such as inactivity or smoking. (Jones et al., 2019)

1a) (10 points) Why is a simple comparison of firms that do and do not choose to implement workplace wellness programs unlikely to identify the causal effect of these policies on health care costs? Identify two possible forms of firm/employer selection that might affect this simple comparison. For each example, which direction would the bias go?

1b) (7.5 points) Jones et al. (2019) describe a randomized control trial within a single workplace to test the effectiveness of these programs. Alternatively, we could imagine a randomized trial across workplaces. How would you set up this program? How would you know if the randomization worked?

1c) (7.5 points) Suppose that you successfully randomize workplace wellness programs across firms.
You have individual data on each firm’s workers and whether each worker smokes, the main outcome that you care about. Write down the equation that you would use to estimate the causal effect of workplace wellness programs on smoking. Define each variable (in words) and explain the interpretation of the treatment effect coefficient in words. If relevant, be sure to explain the percentage interpretation of any coefficient.

2. (25 points) A question in U.S. economic history is how much the development of the railroad affected economic growth. There was massive growth in the US railroad system in the 19th century – for example, compare the US railroad system in 1870 (left) vs 1900 (right).

2a) (10 points) Suppose that I calculated the miles of railroads in each state in 1870 and 1900. Using this data, I then calculated the change in miles over those years for each state s: GainedMiles_s = Miles_1900s - Miles_1870s. For example, perhaps Ohio gained an additional 500 miles while California gained 4000 miles. Interpret the β_1 coefficient of this regression of 1900 state log GDP on state railroad growth in words.

log(GDP_1900s) = β_0 + β_1 GainedMiles_s + ε_s

2b) (5 points) GainedMiles_s is likely right skewed – some eastern states gained little, while western states dramatically expanded the railroad. What transformation would be appropriate to account for this? Write the new regression and carefully interpret the meaning of the new β_1 parameter in words.

2c) (10 points) The estimated β_1 from your model in part 2b (and also part 2a) is likely to be biased. Why? What are two possible omitted variables that could bias the relationship between changes in railroad length and state GDP? For each omitted variable, carefully justify your answer and explain how controlling for that variable would affect the estimated β_1.

3) 10 Points each – True or False. For each question, briefly explain your answer with an example. A true statement is always true, a false statement needs just one alternative example to be false.
3a) True or False: If an omitted variable A is correlated with an outcome variable but is not correlated with the treatment variable, it will cause estimated treatment effects to be biased.

3b) True or False: In a randomized control trial, we expect all measured and unmeasured characteristics of the treatment and control groups to be similar if we have a large enough sample.

3c) True or False: If a regression model suffers from omitted variable bias, it should not be used for prediction.

4) (20 points) This question uses a dataset of diamond characteristics and their prices, in dollars. Diamonds vary in their clarity, cut quality, carats (“size”), color, and many other dimensions. Here is a look at the data. Cut, Color, and Clarity are categories – carat is a continuous measure of size.

4a) (10 points) Briefly describe what each line of the below R code does. Intuitively, what information would the first row of the returned tibble “diamonds_summary” tell you about the data if you ran this code?

4b) (5 points) Briefly describe each part of this ggplot code.

4c) (5 points, harder) What is the equation of the fitted regression line in this graph? (Hint: you can and should estimate!)

EC: If two products are substitutes, the cross-price elasticity of demand estimated by a log-log regression model will be ____________________

THANKS!