Homework 2
STAT 425 - Yu
Due: Sept 26, 2019 11:59:00pm
Question 1 – Teen gambling (11 points)
The dataset teengamb (from the faraway library) concerns a study of teenage gambling in Britain. Fit
a regression model with the expenditure on gambling as the response and sex, status, income
and verbal scores as predictors.
a) What percentage of variation in the response is explained by these predictors? (1 pt)
b) Give the case (observation) number that corresponds to the largest positive residual, and the
one that corresponds to the most negative residual. What are the mean and median of the
residuals? (1 pt)
c) When all other predictors are held constant, what would be the difference in the predicted
expenditure on gambling for a male compared to a female? (1 pt)
d) Predict the amount that a male with average status, income and verbal score would gamble
along with a 95% prediction interval. (1 pt)
e) Generate 95% prediction bands. (1 pt)
f) Fit a model with just the variables that are significant at the 0.05 significance level. What
percentage of variation in the response is explained by this new model? Use an F-test to
formally compare it to the full model. (2 points)
g) Fit a simple linear regression model with the expenditure on gambling as the response and one
of sex, status, income and verbal scores as the predictor. Which predictor gives you the highest R²?
Compare the selected model with the full model (i.e., the model with all four predictors) via an
F-test. What is your (statistical) conclusion? (2 points)
h) From the model you chose in part (g), record R². Then fit 3 more models by adding predictors
back in, one at a time (i.e., you should have a model with 2 predictors, a model with 3 predictors,
and the full model). Record R² for all of these models. Make a graph of R² vs. the number of
(non-intercept) predictors in the model. Comment on the trend in this plot. (2 points)
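The workflow in Question 1 can be sketched in R roughly as follows. This is a minimal sketch, not a worked solution: it assumes the faraway package is installed, that the response column is named gamble, and that sex is coded 0 = male, 1 = female (as in the package documentation). The reduced model and the order in which predictors are added are placeholders chosen for illustration only.

```r
# Sketch for Question 1; assumes the faraway package is installed.
library(faraway)
data(teengamb)

# Full model: gambling expenditure on all four predictors.
# Assumption: sex is coded 0 = male, 1 = female in teengamb.
full <- lm(gamble ~ sex + status + income + verbal, data = teengamb)
summary(full)$r.squared   # proportion of variation explained

# Prediction interval at a new point (a male with average covariates).
new_male <- data.frame(
  sex    = 0,
  status = mean(teengamb$status),
  income = mean(teengamb$income),
  verbal = mean(teengamb$verbal)
)
predict(full, newdata = new_male, interval = "prediction", level = 0.95)

# F-test of a nested reduced model against the full model.
# The reduced formula is a placeholder, not the answer to (f) or (g).
reduced <- lm(gamble ~ income, data = teengamb)
anova(reduced, full)

# R-squared as predictors are added one at a time (arbitrary order).
formulas <- list(
  gamble ~ income,
  gamble ~ income + sex,
  gamble ~ income + sex + verbal,
  gamble ~ income + sex + verbal + status
)
r2 <- sapply(formulas, function(f) summary(lm(f, data = teengamb))$r.squared)
plot(seq_along(r2), r2, type = "b",
     xlab = "Number of non-intercept predictors", ylab = "R-squared")
```

Because each model in the list is nested in the next, R² cannot decrease along the path; the plot shows how much each added predictor contributes.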



Question 2 – A fun test question (9 points)
The following are outputs from R; some values have been removed on purpose. Answer the following
questions based on the provided information:


a) What is R² for myfit? Show all your work/steps for full credit. (4 points)
b) What’s the value of the test statistic for the following command? What’s its distribution under
H0?
> anova(newfit1, myfit) (3 points)
c) What are the value and the null distribution of the F statistic for the following test?
Full model vs. reduced model Y ~ X1 + X2


(in other words, what is the value and null distribution of F if I ran the following command:)

anova(myfit, newfit2) (2 points)
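For reference, the quantities asked about in Question 2 follow the standard definitions below. The symbols are introduced here for illustration and do not come from the R output: n is the number of observations, p and q are the numbers of estimated coefficients (including the intercept) in the full and reduced models, RSS is a residual sum of squares, and TSS is the total sum of squares about the mean.

```latex
R^2 = 1 - \frac{\mathrm{RSS}_{\text{full}}}{\mathrm{TSS}},
\qquad
F = \frac{(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}})/(p - q)}
         {\mathrm{RSS}_{\text{full}}/(n - p)}
\;\sim\; F_{\,p-q,\; n-p} \quad \text{under } H_0 .
```

The residual degrees of freedom reported by R for a fitted linear model equal n minus its number of coefficients, so p - q can be read off as the difference between the residual degrees of freedom of the two models.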
