Assignment #6 – ECON 323 – Fall 2020
Professor Mikal Skuterud
Date assigned: October 27, 2020 at 9:00
Date due: November 2, 2020 at 23:30

1. In this assignment we are going to analyze the Wooldridge dataset CPS78_85. This dataset is an example of pooled cross-sections. The data come from the Current Population Survey, which is the U.S. equivalent of the Canadian Labour Force Survey. This particular file contains a random sample of Americans from 1978 and 1985.

2. Open the dataset in R. To make the coding a little simpler in this assignment, I recommend attaching the data frame to the R search path. This can be done using the attach() function. The dataset contains a variable called lwage, which is the log hourly wage of an individual. Create a new variable equal to the wage level (as opposed to the log wage). Call this variable wage.

3. The dataset contains a dummy variable called female, which is equal to 1 for women and 0 for men. It also contains a dummy variable called y85, which is equal to 1 for observations that were sampled in 1985 and equal to 0 for observations sampled in 1978. Use the command mean(wage[female==1&y85==0]) to estimate the average hourly wage of women in 1978. Confirm that it was $4.79 per hour.

4. Estimate the average hourly wage of men in 1978. How does it compare to the average wage of women in 1978? What is the percentage difference in the average wage of women and men?

5. What is the difference (in the level, not the percentage) in the mean log wage of women and men? How does it compare to your answer in question 4? Is it a coincidence that the values are reasonably close? [Hint: log(x) - log(y) ≈ (x - y)/y]

6. Estimate the simple linear regression

\[ \log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{female} + u \]

using only the sample of observations from 1978. A simple way to restrict the sample is to use the subset() function when specifying the data in the lm() function. How does your estimate of β1 compare to your answer in question 5? Is this a coincidence? Can you reject the null hypothesis that the true difference in the average log hourly wage of men and women is zero at the 5% significance level?

7. What is the OLS estimate of β0 in the regression you estimated in question 6? How does it compare to your estimate of the mean log hourly wage of men in 1978 in question 4? Is this a coincidence? Explain.

8. Using the same simple linear regression used in question 6, estimate the difference in the mean log hourly wage of women and men in 1985. Is it bigger or smaller than it was in 1978? How much bigger or smaller exactly? Are you surprised, or is this what you would have expected? Why or why not? Explain. Can you reject the null hypothesis that the true difference in the average log hourly wage of men and women in 1985 was zero at the 5% significance level?

9. Estimate the multiple linear regression model using the OLS estimator:

\[ \log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{female} + \beta_2\,\mathit{y85} + \beta_3\,(\mathit{female} \cdot \mathit{y85}) + u \]

Interpret the estimates of β0, β1, β2, and β3. [Hint: you have already estimated all these values in the previous questions in this assignment!] How much did the male-female log wage difference change between 1978 and 1985? Can you reject the null hypothesis that the difference between the 1978 difference and the 1985 difference (the difference-in-differences value) is zero at the 5% significance level?

10. A reason why the relative wages of women may have changed between 1978 and 1985 is that women’s education levels increased more than men’s over this period of time. Examine to what extent this is true by estimating the regression

\[ \log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{female} + \beta_2\,\mathit{y85} + \beta_3\,(\mathit{female} \cdot \mathit{y85}) + \beta_4\,\mathit{educ} + u \]

where educ is equal to an individual’s years of schooling. Interpret the estimate of β4. Be specific. What happens to your estimate of β3 when you condition on an individual’s education? Interpret this result.
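The R calls referred to in the questions above can be sketched as follows. This is a minimal illustration, not a solution: it runs on a simulated data frame (here called cps, a hypothetical name) that only borrows the variable names lwage, female, y85 and educ, so none of the numbers it prints correspond to the real CPS78_85 file. It also passes the data through with() and the data argument of lm() rather than attach(); either approach should work.

```r
# Simulated stand-in for CPS78_85 (NOT the real data): only the variable
# names match the assignment. All coefficients below are made up.
set.seed(323)
n <- 1000
cps <- data.frame(
  female = rbinom(n, 1, 0.5),              # 1 = woman, 0 = man
  y85    = rbinom(n, 1, 0.5),              # 1 = sampled in 1985, 0 = 1978
  educ   = sample(8:18, n, replace = TRUE) # years of schooling
)
cps$lwage <- 1.0 + 0.07 * cps$educ - 0.30 * cps$female + 0.40 * cps$y85 +
  0.08 * cps$female * cps$y85 + rnorm(n, sd = 0.4)

# Question 2: wage level from the log wage
cps$wage <- exp(cps$lwage)

# Questions 3-4: group means of the wage level in 1978
mean_w_f78 <- with(cps, mean(wage[female == 1 & y85 == 0]))
mean_w_m78 <- with(cps, mean(wage[female == 0 & y85 == 0]))
pct_gap78  <- (mean_w_f78 - mean_w_m78) / mean_w_m78

# Question 5: difference in mean log wages in 1978
gap_log78 <- with(cps, mean(lwage[female == 1 & y85 == 0]) -
                       mean(lwage[female == 0 & y85 == 0]))

# The hint in question 5, checked on two arbitrary numbers
x <- 5.5; y <- 5
hint_check <- c(log_diff = log(x) - log(y), ratio = (x - y) / y)

# Questions 6-8: simple regression on each year's subsample
fit78 <- lm(lwage ~ female, data = subset(cps, y85 == 0))
fit85 <- lm(lwage ~ female, data = subset(cps, y85 == 1))

# Question 9: pooled regression with the female x y85 interaction
fit_did <- lm(lwage ~ female + y85 + female:y85, data = cps)

# Question 10: the same regression, conditioning on education
fit_educ <- lm(lwage ~ female + y85 + female:y85 + educ, data = cps)

print(c(pct_gap78 = pct_gap78, gap_log78 = gap_log78))
print(hint_check)
print(summary(fit78)$coefficients)   # estimates, std. errors, t and p values
print(summary(fit_did)$coefficients)
print(summary(fit_educ)$coefficients)
```

With the real data frame attached via attach(), the bare form mean(wage[female==1&y85==0]) from question 3 can be used in place of the with() calls. The p-values needed for the 5% tests appear in the last column of each coefficient table.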