
Recap
Assumptions of Linear Regression
$y = X\beta + \epsilon$, $\epsilon \sim \mathrm{MVN}(0, \sigma^2 I)$
Assumptions:
1. Linearity
2. Independence
3. Normality
4. Equal variance (homoskedasticity)
Estimate via OLS:
$\min_\beta \sum_i (y_i - x_i^T \beta)^2$
yields $\hat\beta = (X^T X)^{-1} X^T y$, and we have shown
$\hat\beta \sim N(\beta, \sigma^2 (X^T X)^{-1})$
STAT 331: Applied Linear Models 1
Diagnosing Problems
• Linearity
  • Partial regression plots
• Independence
  • Subject matter knowledge/design
• Normality
  • Histogram of residuals, QQ plots
• Equal variance (homoskedasticity)
  • Residuals vs fitted values
Violating Assumptions
We have seen how violating assumptions can make our results invalid
We have seen how to assess whether our assumptions are broken
How do we fix our models?
Linearity
Suppose our assessment suggests that linearity isn’t met
Might consider transforming $x_j$
• Instead include $\log(x_j)$
• Instead use a quadratic model ($x_j$ and $x_j^2$)
• Etc.
Caveat: this changes the interpretation!
Transforming a Covariate
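As a rough illustration of transforming a covariate, the sketch below fits the same outcome on $x_j$, on $\log(x_j)$, and on a quadratic in $x_j$, and compares residual sums of squares. The data are simulated (an assumption for illustration, with a true relationship that is linear in $\log x$), and the helper `fit_ols` is a hypothetical name, not something from the course materials.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

# Simulated data (assumption): y is linear in log(x), not in x.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 50.0, size=n)
y = 2.0 + 3.0 * np.log(x) + rng.normal(scale=0.5, size=n)

ones = np.ones(n)
X_raw = np.column_stack([ones, x])           # covariate x
X_log = np.column_stack([ones, np.log(x)])   # covariate log(x)
X_quad = np.column_stack([ones, x, x**2])    # covariates x and x^2

beta_raw, ss_raw = fit_ols(X_raw, y)
beta_log, ss_log = fit_ols(X_log, y)
beta_quad, ss_quad = fit_ols(X_quad, y)

print("SSRes with x:        ", ss_raw)
print("SSRes with log(x):   ", ss_log)
print("SSRes with x and x^2:", ss_quad)
```

Because the outcome is the same in all three fits, the residual sums of squares are on the same scale and can be compared. The interpretation of the slope does change: in the log model it is the change in the mean of y per unit change in $\log(x_j)$, not per unit change in $x_j$.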
Independence
Violations of independence require more advanced regression methods
Options:
• Estimates are still unbiased but standard errors are broken
  • ⇒ replace SEs with robust alternatives (sandwich form, GEE)
• Explicitly model the dependence structure
  • Mixed effects models
Normality
Violations of normality might not be a big deal
• E.g. if we have a large sample size
However, Normality is required for valid prediction intervals
• Could consider transforming Y
  • E.g. model log(Y)
  • This again changes interpretations!
  • Might not be a problem if we’re interested in prediction
• Could consider other regression approaches: GLMs, etc. (not covered in this course)
Homoskedasticity
If our errors are heteroskedastic, we have a few options:
• Transform outcome (see above)
• Variance stabilizing transform
• Weighted Least Squares
• Bootstrap (time permitting...)
Weighted Least Squares
Suppose we have heteroskedasticity:
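$y = X\beta + \epsilon$, s.t. $\epsilon \sim N(0, \Sigma)$ where
$\Sigma = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix}$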
Likelihood: $L(\beta, \Sigma) = \prod_i \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[ -\frac{1}{2\sigma_i^2} (y_i - x_i^T \beta)^2 \right]$
Maximizing the likelihood is equivalent to:
$\min_\beta \sum_i w_i (y_i - x_i^T \beta)^2$, where $w_i = \frac{1}{\sigma_i^2}$
This is weighted least squares, as opposed to ordinary least squares.
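One way to see the equivalence is to take the log of the likelihood. This is a sketch of the intermediate step, treating the $\sigma_i^2$ as fixed:

```latex
\log L(\beta, \Sigma)
  = -\frac{1}{2} \sum_{i=1}^{n} \log\left(2\pi\sigma_i^2\right)
    - \frac{1}{2} \sum_{i=1}^{n} \frac{(y_i - x_i^T\beta)^2}{\sigma_i^2}
```

The first sum does not involve $\beta$, so maximizing over $\beta$ amounts to minimizing $\sum_i (y_i - x_i^T\beta)^2 / \sigma_i^2$, which is the weighted sum of squares with $w_i = 1/\sigma_i^2$.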
Weighted Least Squares
In matrix notation:
$\min_\beta (y - X\beta)^T W (y - X\beta)$, where $W = \mathrm{diag}(w_1, \ldots, w_n)$
Taking W as fixed for the moment:
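$\frac{\partial L}{\partial \beta} = \frac{\partial}{\partial \beta} \left[ (y - X\beta)^T W (y - X\beta) \right]$
$= \frac{\partial}{\partial \beta} \left[ y^T W y - y^T W X \beta - \beta^T X^T W y + \beta^T X^T W X \beta \right]$
$0 = -2 X^T W y + 2 (X^T W X) \beta$
$X^T W y = (X^T W X) \beta$
$(X^T W X)^{-1} X^T W y = \hat\beta_W$
$\hat\beta_W$ is our WLS estimator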
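The closed form can be sketched in NumPy as follows. This is a minimal illustration, assuming a diagonal W stored as a vector of weights; the function name `wls`, the simulated data, and the assumed-known error SDs are all hypothetical choices for the example. It also checks numerically that the gradient of the weighted objective vanishes at the estimate.

```python
import numpy as np

def wls(X, y, w):
    """Solve (X^T W X) beta = X^T W y for W = diag(w)."""
    Xw = X * w[:, None]                  # W X: row i of X scaled by w_i
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Simulated heteroskedastic data (assumption): error SD grows with x.
rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0.0, 10.0, size=n)
sigma = 0.5 + 0.3 * x                    # treated as known here
y = 1.0 + 2.0 * x + rng.normal(scale=sigma)

X = np.column_stack([np.ones(n), x])
w = 1.0 / sigma**2                       # w_i = 1 / sigma_i^2
beta_w = wls(X, y, w)

# Gradient of (y - X b)^T W (y - X b) at b = beta_w:
# -2 X^T W y + 2 (X^T W X) b, which should be numerically zero.
grad = -2 * X.T @ (w * y) + 2 * X.T @ (w[:, None] * X) @ beta_w

print("WLS estimate:", beta_w)
print("gradient at estimate:", grad)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly, and storing W as a vector avoids building an n × n matrix.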
WLS Estimator
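$E[\hat\beta_W] = E[(X^T W X)^{-1} X^T W y]$
$= (X^T W X)^{-1} X^T W E[y]$
$= (X^T W X)^{-1} X^T W X \beta = \beta$
$\mathrm{Var}[\hat\beta_W] = \mathrm{Var}[(X^T W X)^{-1} X^T W y]$
$= (X^T W X)^{-1} X^T W \, \mathrm{Var}[y] \, W^T X (X^T W X)^{-1}$
$= (X^T W X)^{-1} X^T W \Sigma W^T X (X^T W X)^{-1}$
$= (X^T W X)^{-1} X^T W X (X^T W X)^{-1}$ (since $W = \Sigma^{-1}$ is symmetric, $W \Sigma W^T = W$)
$= (X^T W X)^{-1}$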
An Alternative View of WLS
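$y = X\beta + \epsilon$, s.t. $\epsilon \sim N(0, \Sigma)$ where $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$
Let $W^{1/2} = \mathrm{diag}(w_1^{1/2}, \ldots, w_n^{1/2})$.
We could pre-multiply our model by $W^{1/2}$:
$W^{1/2} y = W^{1/2} X \beta + W^{1/2} \epsilon$
$y_w = X_w \beta + \epsilon_w$
$E[\epsilon_w] = 0$
$\mathrm{Var}[\epsilon_w] = \mathrm{Var}(W^{1/2} \epsilon)$
$= W^{1/2} \mathrm{Var}(\epsilon) W^{1/2}$
$= W^{1/2} \Sigma W^{1/2} = I$
So we could achieve $\hat\beta_W$ just by OLS of $y_w$ on $X_w$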
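A quick numerical check of this equivalence, as a sketch on simulated data with variances that are assumed known (all variable names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
x = rng.uniform(1.0, 5.0, size=n)
sigma2 = x**2                            # assumed known variances
y = 0.5 + 1.5 * x + rng.normal(scale=np.sqrt(sigma2))

X = np.column_stack([np.ones(n), x])
w = 1.0 / sigma2
sqrt_w = np.sqrt(w)

# Closed-form WLS: (X^T W X)^{-1} X^T W y
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# OLS on the rescaled model: y_w = W^{1/2} y, X_w = W^{1/2} X
X_w = sqrt_w[:, None] * X
y_w = sqrt_w * y
beta_ols_w, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)

# W^{1/2} Sigma W^{1/2} should be the identity matrix
scaled_cov = np.diag(sqrt_w) @ np.diag(sigma2) @ np.diag(sqrt_w)

print("closed-form WLS:     ", beta_wls)
print("OLS on rescaled data:", beta_ols_w)
```

Note that the intercept column is rescaled too, so the transformed design has no constant column.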
Fitting WLS
Structure of W may be known by design in some special cases (see Practice)
In practice we don't know W
• We need to plug in values for $w_i = 1/\sigma_i^2$
We could estimate $\sigma_i^2$ via $e_i^2$.
Do this in a few ways:
• Directly: set $\sigma_i^2 = e_i^2$. (This is pretty unstable)
• Binning: estimate a single $\sigma_i^2$ for a group of observations
• Model $\sigma_i^2$
  • E.g. $|e_i| = \alpha_0 + \alpha_1 \hat{y}_i + \epsilon'$ and then $\hat\sigma_i^2 = (\widehat{|e_i|})^2$
  • E.g. $e_i^2 = \alpha_0 + \alpha_1 \hat{y}_i + \epsilon'$ and then $\hat\sigma_i^2 = \widehat{e_i^2}$
  • Could also regress against covariates instead of fitted values
But how do we get $e_i = y_i - \hat{y}_i$ without first estimating $\hat{y}_i$?
• A bit circular...
• ⇒ Iteratively reweighted least squares algorithm!
Iteratively Reweighted Least Squares
1. Fit OLS: $\hat\beta = (X^T X)^{-1} X^T y$
  • Get fitted values $\hat{y}_i$ and residuals $e_i$
2. Using fitted values and residuals, estimate $\sigma_i^2$ as described above and set $w_i = 1/\sigma_i^2$
  • E.g. regress $|e_i| = \alpha_0 + \alpha_1 \hat{y}_i + \epsilon'$
3. Fit WLS: $\hat\beta_W = (X^T W X)^{-1} X^T W y$
  • Get updated fitted values and residuals
4. Repeat steps 2 and 3 until $\hat\beta_W$ converges (stops changing)
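The four steps above can be sketched as follows. This is one possible implementation under stated assumptions, not the course's reference code: the error SD is modeled by regressing $|e_i|$ on the fitted values, the data are simulated, and the function names, the convergence tolerance, and the floor on the predicted SD are ad hoc choices for this example.

```python
import numpy as np

def wls(X, y, w):
    """Solve (X^T W X) beta = X^T W y for W = diag(w)."""
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

def irls(X, y, max_iter=50, tol=1e-8):
    """IRLS with the error SD modeled as linear in the fitted values."""
    n = len(y)
    beta = wls(X, y, np.ones(n))                 # step 1: OLS
    w = np.ones(n)
    for iteration in range(1, max_iter + 1):
        fitted = X @ beta
        abs_resid = np.abs(y - fitted)
        # Step 2: regress |e_i| on fitted values, use predictions as SDs.
        Z = np.column_stack([np.ones(n), fitted])
        alpha, *_ = np.linalg.lstsq(Z, abs_resid, rcond=None)
        # Predictions can be non-positive; this floor is an ad hoc guard.
        sd_hat = np.maximum(Z @ alpha, 0.05 * abs_resid.mean())
        w = 1.0 / sd_hat**2
        beta_new = wls(X, y, w)                  # step 3: WLS
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:                            # step 4: stop when stable
            break
    return beta, w, iteration

# Simulated data (assumption): error SD increases with x.
rng = np.random.default_rng(3)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0 + 0.3 * x)
X = np.column_stack([np.ones(n), x])

beta_ols = wls(X, y, np.ones(n))
beta_irls, w, n_iter = irls(X, y)
print("OLS: ", beta_ols)
print("IRLS:", beta_irls, "after", n_iter, "iterations")
```

Swapping the variance model (for example, regressing $e_i^2$ on covariates instead) only changes the lines that compute `sd_hat`.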
Practice Q1
The hospitals.csv dataset consists of measurements of post-surgery
complications in 31 hospitals in New York state. The variables in the
dataset are:
• SurgQual: A quality index for post-surgery complications averaged
over all patients in the hospital (higher means fewer complications).
• Difficulty: A measure of the average difficulty of surgeries for that
hospital (higher means more difficult cases).
• N: The number of patients used to calculate these indices.
Consider a simple linear regression model, regressing SurgQual on
Difficulty.
(a) Why does the structure/design of the data suggest homoskedasticity
may not hold?
(b) Compute WLS estimates of regression coefficients, accounting for the
fact that outcomes are averaged over N patients, and compare with OLS
estimates.
Soln Q1
Instead of the underlying measurements (the $z$'s), each of our observations
corresponds to a group average $y_i = \frac{1}{N_i} \sum_{j=1}^{N_i} z_j$, where the size of each
group, $N_i$, varies.
In this case, if the underlying $z$'s are iid, $\mathrm{Var}(\epsilon_i) = \frac{1}{N_i} \sigma^2$, indicating
heteroskedasticity.
Hence we could consider $w_i \propto N_i$
• we could pre-multiply by $\sqrt{N_i}$
This is a special case when the structure of W is known.
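A sketch of part (b) on simulated data is given below. The hospitals.csv file is not reproduced here, so the group sizes, covariate values, and coefficients are made up for illustration, and the variable names `N`, `difficulty`, and `surg_qual` are stand-ins for the dataset's columns.

```python
import numpy as np

# Simulated stand-in for the hospital data (NOT the real hospitals.csv).
rng = np.random.default_rng(4)
n_groups = 31
N = rng.integers(20, 500, size=n_groups)             # patients per hospital
difficulty = rng.uniform(0.0, 10.0, size=n_groups)
sigma = 4.0                                          # SD of the underlying z's
# Each outcome is an average over N_i patients, so its error SD is sigma / sqrt(N_i).
surg_qual = 50.0 - 1.2 * difficulty + rng.normal(scale=sigma / np.sqrt(N))

X = np.column_stack([np.ones(n_groups), difficulty])

# OLS ignores the differing group sizes.
beta_ols, *_ = np.linalg.lstsq(X, surg_qual, rcond=None)

# WLS with w_i = N_i.
beta_wls = np.linalg.solve(X.T @ (N[:, None] * X), X.T @ (N * surg_qual))

# Weights only need to be known up to a constant: w_i = N_i / sigma^2 gives the same fit.
w_full = N / sigma**2
beta_wls_full = np.linalg.solve(X.T @ (w_full[:, None] * X), X.T @ (w_full * surg_qual))

# Same estimate via pre-multiplying by sqrt(N_i) and running OLS.
root_n = np.sqrt(N)
beta_scaled, *_ = np.linalg.lstsq(root_n[:, None] * X, root_n * surg_qual, rcond=None)

print("OLS:", beta_ols)
print("WLS:", beta_wls)
```

The common factor $\sigma^2$ cancels in $(X^T W X)^{-1} X^T W y$, which is why $w_i \propto N_i$ is enough and $\sigma^2$ does not need to be known to compute the estimate.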
Practice Q2
You are considering two transformations of the outcome in a linear
model, $g_1(y)$ or $g_2(y)$, to address some violation of our usual
assumptions. To compare the two different choices, suppose you regress
both transformed outcomes on your covariates and examine model fit by
computing $\hat\sigma$ in each analysis.
Why might this not be a useful comparison?
Soln Q2
The scales of $g_1(y)$ and $g_2(y)$ differ, and as a result they no longer have
the same SST, for example. Their respective SSRes values will also be on
different scales and are not directly comparable.
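A small numerical illustration of the scale problem, on simulated data (an assumption for this sketch; `sigma_hat` is a hypothetical helper name). Multiplying a transformed outcome by a constant changes $\hat\sigma$ by that same constant even though the fit is no better or worse, so the raw value of $\hat\sigma$ says little when outcomes are on different scales.

```python
import numpy as np

def sigma_hat(X, g_y):
    """Residual standard error from an OLS fit of g_y on X."""
    beta, *_ = np.linalg.lstsq(X, g_y, rcond=None)
    resid = g_y - X @ beta
    return float(np.sqrt(resid @ resid / (len(g_y) - X.shape[1])))

# Simulated positive outcome (assumption): log(y) is linear in x.
rng = np.random.default_rng(5)
n = 150
x = rng.uniform(1.0, 10.0, size=n)
y = np.exp(0.5 + 0.3 * x + rng.normal(scale=0.4, size=n))
X = np.column_stack([np.ones(n), x])

s_log = sigma_hat(X, np.log(y))                    # g1(y) = log(y)
s_sqrt = sigma_hat(X, np.sqrt(y))                  # g2(y) = sqrt(y)
s_log_rescaled = sigma_hat(X, 100.0 * np.log(y))   # same fit, different units

print("sigma-hat, log(y):      ", s_log)
print("sigma-hat, sqrt(y):     ", s_sqrt)
print("sigma-hat, 100 * log(y):", s_log_rescaled)
```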
