Generalised Linear Models (GLMs) I

Learning goals
- Be able to define a Generalised Linear Model (GLM).
- Be able to obtain the canonical link function for a GLM.
- (Challenging) Be able to explain the ideas behind the iterated weighted least squares (IWLS) algorithm.
- Be able to compute the variance of parameter estimates in a GLM using the IWLS algorithm.
- Understand (scaled) deviance:
  - Be able to define and compute (scaled) deviance.
  - Be able to use it to test model adequacy and to perform model selection for nested and non-nested models.
- Be able to perform diagnostics for a GLM using R:
  - Understand what the different diagnostic measurements aim to measure.
  - Be able to obtain those diagnostic measurements from the glm output.

Generalised Linear Model

Definition: $Y$ follows a GLM if its distribution is from an exponential family and
  $\mu := EY = g^{-1}(x^T \beta),$
where $g$ is a monotonic, differentiable function called the link function, $x$ is a vector of independent (predictor) variables, and $\beta$ is a vector of parameters.

Remark: we model the location through the linear predictor $\eta = x^T \beta$ and let the scale sort itself out; that is, we do not model the scale explicitly.

Canonical link

Recall that $Y$ is from an exponential family if
  $f(y; \theta, \phi) = \exp\left[ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right].$
If $g(\mu) = g(EY) = x^T \beta = \theta$, then $g$ is called the canonical link. Since $\mu = b'(\theta)$, the canonical link must be $(b')^{-1}$.
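The identity "canonical link $= (b')^{-1}$" can be checked numerically. The following is my own illustrative sketch (not from the slides), in Python rather than the course's R, for the Poisson family, where $b(\theta) = e^\theta$:

```python
import math

# Sketch (assumed setup): for the Poisson family the cumulant function is
# b(theta) = exp(theta), so the mean is mu = b'(theta) = exp(theta), and
# the canonical link g = (b')^{-1} is the log function.

def b_prime(theta):
    """Derivative of the Poisson cumulant function b(theta) = exp(theta)."""
    return math.exp(theta)

def canonical_link(mu):
    """Inverse of b', i.e. the log link."""
    return math.log(mu)

theta = 0.7
mu = b_prime(theta)                    # mean implied by theta
recovered = canonical_link(mu)         # g(mu) should recover theta
print(abs(recovered - theta) < 1e-12)  # True: g = (b')^{-1}
```

The same check works for any exponential family once $b'$ is written down, which is exactly how the examples on the next slide arise.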
Examples: canonical links
- Normal: $\theta = \mu$, so $g(\mu) = \mu$ (the identity link).
- Poisson: $\theta = \log\lambda = \log\mu$, so $g(\mu) = \log\mu$ (the log link).
- Binomial: $\theta = \log\frac{p}{1-p} = \log\frac{\mu}{m-\mu}$, so $g(\mu) = \log\frac{\mu}{m-\mu}$ (the logit link).

Estimation of parameters in a GLM
- The iterated weighted least squares (IWLS) algorithm.
- Implemented in the glm function in R.

Estimation of parameters in a GLM using maximum likelihood

Suppose we have independent observations $y_i$ from an exponential family, with canonical parameter $\theta_i$ and dispersion parameter $\phi$, for $i = 1, \ldots, n$. Suppose further that $y_i$ has mean $\mu_i = b'(\theta_i) = g^{-1}(x_i^T \beta)$. If $g$ is the canonical link, then $\theta_i = x_i^T \beta$. The log-likelihood is then
  $l(\beta, \phi; y) = \sum_{i=1}^n \left( \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right).$
Maximum likelihood methods find the parameter values that maximise this log-likelihood.

Aside: Newton-Raphson method

If $f(x)$ has a continuous derivative $f'(x)$, then finding the value that maximises $f(x)$ is equivalent to (1) finding the roots $x_1^*, \ldots, x_k^*$ of $f'(x)$, and then (2) choosing, among them, the one that maximises $f(x)$. We apply the Newton-Raphson root-finding method to $f'(x)$:
  $x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}.$

Newton-Raphson method to find the MLE

To maximise a log-likelihood $l(\theta)$ with the Newton-Raphson method, the update step is
  $\theta_{n+1} = \theta_n - H(\theta_n)^{-1} U(\theta_n),$
where $U(\theta) = \partial l(\theta)/\partial\theta$ is the score and $H(\theta) = \partial^2 l(\theta)/\partial\theta\,\partial\theta^T = -J(\theta)$, with $J$ the observed information. If we replace $J$ by $I$, the Fisher information, the algorithm is called Fisher scoring. The Fisher information is guaranteed to be positive definite (unlike the observed information).

Connection with the iterated weighted least squares (IWLS) algorithm?
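Before turning to that connection, here is a toy illustration of the Newton-Raphson update above (my own example, not from the slides): maximising the one-parameter Poisson log-likelihood, whose MLE is the sample mean.

```python
# Toy Newton-Raphson sketch (illustrative data): maximise the Poisson
# log-likelihood l(lam) = sum(y) * log(lam) - n * lam (constants dropped)
# by applying Newton-Raphson to its derivative. The MLE is the sample mean.

y = [2, 4, 3, 5, 1, 3]
n, s = len(y), sum(y)

def l_prime(lam):         # score: dl/dlam
    return s / lam - n

def l_double_prime(lam):  # second derivative; the observed information is -l''
    return -s / lam ** 2

lam = 1.0                 # starting value
for _ in range(50):
    step = l_prime(lam) / l_double_prime(lam)
    lam = lam - step      # x_{n+1} = x_n - f'(x_n) / f''(x_n)
    if abs(step) < 1e-12:
        break

print(abs(lam - s / n) < 1e-10)  # True: converged to the MLE, the sample mean
```

For this family the observed and Fisher information coincide at the optimum, so Fisher scoring would take essentially the same steps; in general the two differ, and Fisher scoring trades some speed for the guarantee of a positive definite information matrix.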
It turns out that Fisher scoring applied to a GLM can be interpreted as a least squares problem (if you want the details, read "McCullagh & Nelder on IWLS" posted on LMS). The glm function in R computes the MLEs of the parameters in a GLM using an iterated weighted "least squares" algorithm.

Aside: weighted least squares method

Suppose $Y = X\beta + \varepsilon$, where $\varepsilon \sim N(0, \Sigma)$ and both $X$ and $\Sigma$ are full rank. Multiplying by $\Sigma^{-1/2}$ gives $\Sigma^{-1/2} Y = \Sigma^{-1/2} X \beta + \varepsilon'$, where $\varepsilon' \sim N(0, I)$. What is the least squares estimator of $\beta$? The estimator that minimises the sum of squares
  $(\Sigma^{-1/2} y - \Sigma^{-1/2} X \beta)^T (\Sigma^{-1/2} y - \Sigma^{-1/2} X \beta) = (y - X\beta)^T \Sigma^{-1} (y - X\beta)$
is
  $\hat\beta = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y = (X^T W X)^{-1} X^T W y,$
where the weight matrix is $W = \Sigma^{-1}$.

Weighted least squares for a GLM

A first-order Taylor expansion of $g$ about $\mu_i$ gives
  $g(Y_i) \approx g(\mu_i) + (Y_i - \mu_i) g'(\mu_i).$
Let $Z_i = g(\mu_i) + (Y_i - \mu_i) g'(\mu_i)$ and $\varepsilon_i = (Y_i - \mu_i) g'(\mu_i)$. Then
  $Z_i = x_i^T \beta + \varepsilon_i$, where $\operatorname{Var}\varepsilon_i = (g'(\mu_i))^2 \operatorname{Var} Y_i.$
If we knew $\operatorname{Var}\varepsilon_i$, the estimator of $\beta$ would be the solution to the weighted least squares problem
  $\min_\beta\; (z - X\beta)^T \Sigma^{-1} (z - X\beta),$
where $\Sigma$ is diagonal with $\Sigma_{ii} = \operatorname{Var}\varepsilon_i$.

In summary, $g(Y) \approx Z = X\beta + \varepsilon$, where $\varepsilon_i = (Y_i - \mu_i) g'(\mu_i)$ and $\Sigma = \operatorname{Var}\varepsilon$ is diagonal with entries
  $\Sigma_{ii} = (g'(\mu_i))^2 \operatorname{Var} Y_i = (g'(\mu_i))^2 v(\mu_i)\, a(\phi).$
This suggests $\hat\beta = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} z$. Problem: $z_i$ and $\Sigma_{ii}$ depend on $\beta$, because $\mu_i = g^{-1}(x_i^T \beta)$. Solution: iterate! Note that the $a(\phi)$ factor in the expression for $\hat\beta$ cancels out.

Iterated weighted least squares (IWLS) algorithm

IWLS algorithm (for finding the MLE of $\beta$):
1. Start with $\hat\mu^0 = y$.
2. Given $\hat\mu^n$, calculate, for each $i$,
   $z_i^n = g(\hat\mu_i^n) + (y_i - \hat\mu_i^n) g'(\hat\mu_i^n)$ and $W_{ii}^n = \dfrac{1}{g'(\hat\mu_i^n)^2 v(\hat\mu_i^n)}$
   (the $a(\phi)$ factor in $\Sigma_{ii}$ is omitted, since it cancels).
3. Put $\hat\beta^{n+1} = (X^T W^n X)^{-1} X^T W^n z^n$ and, for each $i$, $\hat\mu_i^{n+1} = g^{-1}(x_i^T \hat\beta^{n+1})$.
4. If $\hat\beta^{n+1}$ is sufficiently close to $\hat\beta^n$, stop; otherwise return to step 2.
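The steps above can be sketched in a few lines of code. This is my own illustrative translation (the course uses R's glm; the data and function name here are made up) for a Poisson GLM with the canonical log link, so $g(\mu) = \log\mu$, $g'(\mu) = 1/\mu$, and $v(\mu) = \mu$:

```python
import numpy as np

# Sketch of the IWLS steps for a Poisson GLM with the canonical log link:
# g(mu) = log(mu), g'(mu) = 1/mu, variance function v(mu) = mu.
# The data below are made up for illustration.

def iwls_poisson(X, y, tol=1e-10, max_iter=50):
    mu = np.maximum(y.astype(float), 0.5)  # step 1: mu^0 = y (guarded away from 0)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        # step 2: working response z and weights W (the a(phi) factor cancels)
        z = np.log(mu) + (y - mu) / mu     # z_i = g(mu_i) + (y_i - mu_i) g'(mu_i)
        w = mu                             # 1 / (g'(mu)^2 v(mu)) = mu
        # step 3: weighted least squares update
        WX = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ WX, X.T @ (w * z))
        mu = np.exp(X @ beta_new)          # mu_i = g^{-1}(x_i^T beta)
        # step 4: stop when the update is small
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, mu

# Illustrative data: intercept plus one covariate.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1, 2, 2, 4, 7, 11])

beta_hat, mu_hat = iwls_poisson(X, y)
# At the MLE with a canonical link, the score X^T (y - mu_hat) is zero.
print(np.allclose(X.T @ (y - mu_hat), 0.0, atol=1e-8))
```

The zero-score check at the end is the practical signature that IWLS has found the maximum likelihood estimate, matching the Fisher-scoring equivalence stated next.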
For a GLM, the IWLS algorithm is equivalent to Fisher scoring. Example: see the section "IWLS" in Bliss.pdf.

Variance of $\hat\beta$ from the IWLS algorithm

Suppose the IWLS algorithm converges to the estimate $\hat\beta$. Then
  $\hat\beta = (X^T \hat\Sigma^{-1} X)^{-1} X^T \hat\Sigma^{-1} z,$
where, elementwise,
  $\hat\mu_i = g^{-1}(x_i^T \hat\beta),$
  $z_i = g(\hat\mu_i) + (y_i - \hat\mu_i) g'(\hat\mu_i),$
  $\hat\Sigma_{ii} = (g'(\hat\mu_i))^2 v(\hat\mu_i)\, a(\phi).$
Since $\operatorname{Var} z = \hat\Sigma$, we have
  $\operatorname{Var}\hat\beta = [(X^T \hat\Sigma^{-1} X)^{-1} X^T \hat\Sigma^{-1}]\, \hat\Sigma\, [(X^T \hat\Sigma^{-1} X)^{-1} X^T \hat\Sigma^{-1}]^T = (X^T \hat\Sigma^{-1} X)^{-1}.$
Note that the $a(\phi)$ term in $\hat\Sigma$ does not cancel here, as it did in the IWLS algorithm, so we need to estimate it.

Estimator for $a(\phi)$

$Y_i$ has mean $\mu_i$ and variance $v(\mu_i) a(\phi)$, so
  $\sum_i \frac{(Y_i - \hat\mu_i)^2}{v(\hat\mu_i)\, a(\phi)} \approx \chi^2_{n-p},$
where $p$ is the number of parameters used to estimate $\mu$. Let
  $X^2 := \sum_i \frac{(Y_i - \hat\mu_i)^2}{v(\hat\mu_i)}.$
Then $X^2/(n-p)$ is an estimator of $a(\phi)$. $X^2$ is called Pearson's $\chi^2$ statistic, and it can be shown that $X^2/(n-p)$ is a consistent estimator of $a(\phi)$.

Variance of $\hat\beta$ from the IWLS algorithm, continued

$\operatorname{Var}\hat\beta = (X^T \hat\Sigma^{-1} X)^{-1}$, where
  $\hat\Sigma_{ii} = (g'(\hat\mu_i))^2 v(\hat\mu_i)\, \widehat{a(\phi)} = (g'(\hat\mu_i))^2 v(\hat\mu_i)\, X^2/(n-p).$
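To make these formulas concrete, here is my own worked example (not from the slides) for the simplest case, the normal family with identity link. There $g'(\mu) = 1$ and $v(\mu) = 1$, so Pearson's $X^2$ is just the residual sum of squares, $X^2/(n-p)$ is the usual residual variance estimate, and $\operatorname{Var}\hat\beta$ reduces to $\hat\sigma^2 (X^T X)^{-1}$:

```python
import numpy as np

# Sketch (illustrative data): normal family, identity link. Here g'(mu) = 1
# and v(mu) = 1, so Pearson's X^2 is the residual sum of squares,
# X^2 / (n - p) estimates a(phi) = sigma^2, and
# Var(beta_hat) = (X^T Sigma_hat^{-1} X)^{-1} = sigma_hat^2 (X^T X)^{-1}.

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0.1, 1.2, 1.9, 3.2, 3.8, 5.1])
n, p = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # IWLS converges in one step here
mu_hat = X @ beta_hat

# Pearson's chi^2 statistic with v(mu) = 1, and the dispersion estimate.
X2 = np.sum((y - mu_hat) ** 2)
a_phi_hat = X2 / (n - p)

# Var(beta_hat) = (X^T Sigma^{-1} X)^{-1} with Sigma_ii = a_phi_hat.
var_beta = a_phi_hat * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(var_beta))               # standard errors of the estimates
print(np.round(se, 4))
```

These are exactly the standard errors reported by summary output of a fitted linear model, which is reassuring: the GLM variance formula contains ordinary least squares as a special case.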