PPHA 34600: Program Evaluation
Spring 2021

Problem Set 1
Due: Thursday, April 15, at 9PM Chicago time to Gradescope

Instructions:

This problem set consists of two files: (1) this document, with instructions and questions; and (2) a dataset, which you will use to answer the questions below. You can work in groups of up to three. Please identify your group members. Groups can share code, but each group member must turn in their own problem set and must have separate written answers to the questions. You may not share any written work (including drafts) with other members of your group.

You should submit written answers, which should be parsimonious, along with your code and results for the data analysis. You must use R. If you know how to use them, I recommend that you use RMarkdown or knitr, which will allow you to intersperse your code and written answers (but this is not required). If you do not use RMarkdown, you must still include a printout of your code in the document. Note that you are primarily being graded on your written answers.

Problem sets must be submitted in PDF format. Problem sets must be turned in via Gradescope; no late submissions will be considered.

Questions:

You have been asked by a well-meaning NGO, Monsoon Agricultural Preparatory Learning and Extension (MAPLE), to help them learn about the impacts of their monsoon forecast product, the Local End-user Agronomic Forecast Service (MAPLE LEAFS). MAPLE LEAFS is a pilot monsoon forecast that can tell farmers up to 2 months in advance when the seasonal monsoon will arrive in India (and everyone knows it’s just good branding to share your name with the future Stanley Cup winner). MAPLE hypothesizes that these forecasts lead farmers to improve their rice yields by tailoring their agricultural practices to the year’s expected rainfall.

1. MAPLE would like to know about the yield impacts of LEAFS.
They say they’re interested in measuring the impact of their forecasts, but don’t exactly know what that means. Use the potential outcomes framework to describe the impact of treatment (defined as “seeing the monsoon forecast”) for farm i on rice yield (measured in tonnes per hectare) formally (in math) and in words.

2. MAPLE are extremely impressed. They want to know how they can go about measuring τ_i. Let them down gently, but explain to them why estimating τ_i is impossible.

3. MAPLE are on board with the idea that they can’t estimate individual-specific treatment effects. They suggest estimating the average treatment effect instead. They are willing to give you some of their early data on yields. They have data on farmers who did and didn’t have access to LEAFS, and want you to compare the average yield across the two sets of farms. Describe what this is actually measuring, and provide an example of why this may differ from the average treatment effect.

4. MAPLE have realized the error of their ways. Their CEO tells you, “Okay, we understand that our data won’t let us estimate the average treatment effect. But can’t we estimate the average treatment effect on the treated?” First formally (in math) define the ATT in this context, and then explain whether or not the MAPLE LEAFS data will allow you to estimate it. If so, describe how what you see in the data corresponds to the necessary components of the ATT. If not, explain why not, and describe what you can’t see in the data that you’d need to observe.

5. MAPLE forgot to tell you that they ran a pilot randomized study to estimate the effects of LEAFS on yields. They’re happy to share those data with you: find them in ps1_data.csv. This experience has made you a little bit skeptical of MAPLE’s skills, so start by checking (with a proper statistical test) that the treatment group and control group are balanced in pre-treatment yields, profits, number of workers, number of plots, and owner age.
Use leafs_trt as your treatment variable. Report your results. What do you find?

6. Plot a histogram of yields for treated farms and control farms. What do you see? Re-do your balance table to reflect any necessary adjustments. What does this table tell you about whether or not MAPLE’s randomization worked? What assumption do we need to make on unobserved characteristics in order to be able to estimate the causal effect of leafs_trt?

7. Assuming that leafs_trt is indeed randomly assigned, describe how to use it to estimate the average treatment effect, and then do so. Please describe your estimate: what is the interpretation of your coefficient (be clear about your units)? Is your result statistically significant? Is the effect you find large or small, relative to the mean in the control group?

8. MAPLE is convinced that the reason their forecasts are effective is because they are getting farmers to acquire more plots of land. They want you to estimate the effects of LEAFS, but controlling for endline number of plots. Is this a good idea? Why or why not? Run this regression and describe your estimates. How do they differ from your results in (7)? What about controlling for baseline number of plots? Run this regression and describe your estimates. How do they differ from your results in (7)? How do the two estimates differ? What is driving any differences between them?

9. One of the MAPLE RAs (the real workforce!) informs you that not everybody who was assigned to treatment -- or was offered a forecast -- (leafs_trt = 1) actually took up MAPLE’s offer and saw LEAFS. She tells you that the actual treatment indicator is leafs_trt_yes. (Since LEAFS is new, we know for a fact that nobody in the control group got the information). In light of this new information, what did you actually estimate in question (7)? How does this differ from what you thought you were estimating?

10.
MAPLE aren’t actually interested in the effect of assignment to treatment; they want to know about the actual effects of their forecasts. Describe (in math, and then in words) what you can estimate using the two treatment variables we observe, leafs_trt and leafs_trt_yes. Estimate this object (you can ignore standard errors just this once). Interpret your findings. How does this compare to what you estimated in (7)?