Generalised Linear Models (GLMs) I

Generalised Linear Models (GLMs) I 1 / 17

Learning goals

Be able to define a Generalised Linear Model (GLM).

Be able to obtain the canonical link function for a GLM.

(Challenging) Be able to explain the ideas behind the iterated weighted least squares (IWLS) algorithm.

Be able to compute the variance of parameter estimates in a GLM using the IWLS algorithm.

Understand (scaled) deviance:
- Be able to define (scaled) deviance.
- Be able to compute (scaled) deviance.
- Be able to use it to test model adequacy and to perform model selection for nested and non-nested models.

Be able to perform diagnostics for a GLM using R:
- Understand what the different 'diagnostic measurements' aim to measure.
- Be able to obtain those diagnostic measurements from the glm output.

Generalised Linear Model

Definition: Y follows a GLM if it is from an exponential family and

µ := E[Y] = g^{-1}(x^T β),

where
- g is a monotonic, differentiable function called the link function,
- x is a vector of independent (predictor) variables, and
- β is a vector of parameters.

Remark: We model the location using η = x^T β and let the scale sort itself out; that is, we do not model the scale explicitly.

Canonical link

Recall that Y is from an exponential family if

f(y; θ, φ) = exp{ [yθ − b(θ)] / a(φ) + c(y, φ) }.

If g(µ) = g(E[Y]) = x^T β = θ, then g is called the canonical link. Since
µ = b′(θ), it follows that the canonical link must be (b′)^{-1}.
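The identity µ = b′(θ) used above is the standard exponential-family mean identity; a short sketch of the derivation (not spelled out on the slide):

```latex
% Differentiating the normalisation \int f(y;\theta,\phi)\,dy = 1
% with respect to \theta (exchanging differentiation and integration):
0 = \int \frac{y - b'(\theta)}{a(\phi)}\, f(y;\theta,\phi)\, dy
  = \frac{\mathbb{E}[Y] - b'(\theta)}{a(\phi)},
\qquad\text{so}\qquad \mu = \mathbb{E}[Y] = b'(\theta).
```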

Examples: canonical links

normal: θ = µ, so g(µ) = µ

Poisson: θ = log λ = log µ, so g(µ) = log µ

binomial: θ = log[p/(1 − p)] = log[µ/(m − µ)], so g(µ) = log[µ/(m − µ)]
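A quick numeric sanity check (illustrative, not from the slides): the canonical link g should be the inverse of b′, i.e. g(b′(θ)) = θ.

```python
import math

# Hypothetical helper: verify g(b'(theta)) == theta on a grid of values.
def check(b_prime, g, thetas):
    return all(abs(g(b_prime(t)) - t) < 1e-12 for t in thetas)

thetas = [-2.0, -0.5, 0.0, 1.3]

# Poisson: b(theta) = exp(theta), so b'(theta) = exp(theta); g = log.
print(check(math.exp, math.log, thetas))  # True

# Binomial with m trials: b(theta) = m*log(1 + exp(theta)),
# so b'(theta) = m*exp(theta)/(1 + exp(theta)); g(mu) = log(mu/(m - mu)).
m = 10
b_prime = lambda t: m * math.exp(t) / (1 + math.exp(t))
g = lambda mu: math.log(mu / (m - mu))
print(check(b_prime, g, thetas))  # True
```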

Estimation of parameters in GLM

Iterated Weighted Least Squares (IWLS) algorithm

Implemented in the glm function in R

Estimation of parameters in GLM using maximum likelihood methods

Suppose we have independent observations y_i from an exponential family, with canonical parameter θ_i and dispersion parameter φ, for i = 1, …, n. Furthermore, suppose that y_i has mean

µ_i = b′(θ_i) = g^{-1}(x_i^T β).

If g is the canonical link then θ_i = x_i^T β.

The log-likelihood is then

l(β, φ; y) = Σ_{i=1}^n { [y_i θ_i − b(θ_i)] / a(φ) + c(y_i, φ) }.

Maximum likelihood methods aim to find the parameter values that maximise this log-likelihood.

Aside: Newton-Raphson method

If f(x) has a continuous derivative f′(x), then the problem of finding the value that maximises f(x) is equivalent to 1) finding the roots x*_1, …, x*_k of f′(x), and then 2) among them, finding the one that maximises f(x).

We apply the Newton-Raphson method for root-finding to f′(x):

x_{n+1} = x_n − f′(x_n) / f″(x_n).
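A minimal illustration (my own example, not from the slides) of this iteration maximising a concave function:

```python
# Newton-Raphson iteration x_{n+1} = x_n - f'(x_n)/f''(x_n),
# applied to the concave function f(x) = log(x) - x, which has
# f'(x) = 1/x - 1 and f''(x) = -1/x^2, so the maximiser is x* = 1.
def newton_raphson(f_prime, f_second, x0, tol=1e-10, max_iter=100):
    x = x0
    for _ in range(max_iter):
        step = f_prime(x) / f_second(x)
        x -= step
        if abs(step) < tol:
            break
    return x

x_star = newton_raphson(lambda x: 1.0 / x - 1.0,
                        lambda x: -1.0 / x ** 2,
                        x0=0.5)
print(round(x_star, 8))  # 1.0
```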

Newton-Raphson method to find MLE

Suppose we wish to find the value of θ that maximises a log-likelihood l(θ) using the Newton-Raphson method. Our update step is

θ_{n+1} = θ_n − H(θ_n)^{-1} U(θ_n),

where

U(θ) = ∂l(θ)/∂θ   and   H(θ) = ∂²l(θ)/∂θ∂θ^T = −J(θ),

and J(θ) is the observed information.

If we replace J by I, the Fisher information, then the algorithm is called Fisher scoring. The Fisher information is guaranteed to be positive definite (unlike the observed information).

Connection with the Iterated Weighted Least Squares (IWLS) algorithm?

It turns out that Fisher scoring applied to a GLM can be interpreted as a least squares problem (for the details, read "McCullagh & Nelder on IWLS" posted on the LMS).

The glm function in R computes the MLEs of the parameters in a GLM using the iterated weighted least squares algorithm.

Aside: Weighted Least Squares method

Suppose Y = Xβ + ε where ε ∼ N(0, Σ), and X and Σ are of full rank. Multiplying by Σ^{-1/2} we get Σ^{-1/2}Y = Σ^{-1/2}Xβ + ε′, where ε′ ∼ N(0, I).

What is the least squares estimator of β?

The estimator of β that minimises the sum of squares

(Σ^{-1/2}y − Σ^{-1/2}Xβ)^T (Σ^{-1/2}y − Σ^{-1/2}Xβ) = (y − Xβ)^T Σ^{-1} (y − Xβ)

is

β̂ = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} y = (X^T W X)^{-1} X^T W y,

where the weight matrix is W = Σ^{-1}.
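A sketch of this estimator on synthetic heteroscedastic data (the data-generating choices here are my own, for illustration only):

```python
import numpy as np

# Weighted least squares: beta_hat = (X^T W X)^{-1} X^T W y, W = Sigma^{-1}.
rng = np.random.default_rng(0)

n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, -2.0])

# Known diagonal Sigma: each observation has its own error variance.
sigma2 = rng.uniform(0.5, 3.0, size=n)               # Var(eps_i)
y = X @ beta_true + rng.normal(size=n) * np.sqrt(sigma2)

W = np.diag(1.0 / sigma2)                            # W = Sigma^{-1}
beta_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta_hat)  # close to beta_true
```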

Weighted Least Squares for GLM

Linearise g(Y_i) around µ_i:

g(Y_i) ≈ g(µ_i) + (Y_i − µ_i) g′(µ_i).

Let Z_i = g(µ_i) + (Y_i − µ_i) g′(µ_i) and ε_i = (Y_i − µ_i) g′(µ_i). Then

Z_i = x_i^T β + ε_i,

where

Var ε_i = (g′(µ_i))² Var Y_i.

If we knew Var ε_i then the estimator of β would be the solution to the weighted least squares problem:

min_β (z − Xβ)^T Σ^{-1} (z − Xβ),

where Σ is diagonal with Σ_ii = Var ε_i.

Weighted Least Squares for GLM

We have g(Y) ≈ Z = Xβ + ε, where ε_i = (Y_i − µ_i) g′(µ_i) and Σ = Var ε is diagonal with entries

Σ_ii = (g′(µ_i))² Var Y_i = (g′(µ_i))² v(µ_i) a(φ).

This suggests

β̂ = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} z.

Problem: z_i and Σ_ii depend on β, because µ_i = g^{-1}(x_i^T β).

Solution: iterate!

Note that the a(φ) factor in the expression for β̂ cancels out.

Iterated Weighted Least Squares (IWLS) algorithm

IWLS algorithm (for finding MLE of β)

1. Start with µ̂^0 = y.

2. Given µ̂^n, calculate, for each i,

   z_i^n = g(µ̂_i^n) + (y_i − µ̂_i^n) g′(µ̂_i^n)   and
   W_ii^n = 1/Σ_ii = 1 / [g′(µ̂_i^n)² v(µ̂_i^n)].

3. Put β̂^{n+1} = (X^T W^n X)^{-1} X^T W^n z^n and, for each i,

   µ̂_i^{n+1} = g^{-1}(x_i^T β̂^{n+1}).

4. If β̂^{n+1} is sufficiently close to β̂^n then stop; otherwise return to step 2.

For a GLM, the IWLS algorithm is equivalent to Fisher scoring.

Example: see the section "IWLS" in Bliss.pdf.
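The steps above can be sketched numerically. This is an illustrative implementation for Poisson regression with the canonical log link (g = log, g′(µ) = 1/µ, v(µ) = µ); the synthetic data and tolerances are my own choices, not from the slides:

```python
import numpy as np

# IWLS for Poisson regression with log link.
rng = np.random.default_rng(1)

n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.8])
y = rng.poisson(np.exp(X @ beta_true))

mu = np.clip(y, 0.5, None)          # step 1: mu_hat^0 = y (avoiding mu = 0)
beta = np.zeros(2)
for _ in range(50):
    z = np.log(mu) + (y - mu) / mu  # step 2: working response z_i
    W = mu                          # W_ii = 1/(g'(mu)^2 v(mu)) = mu
    XtW = X.T * W                   # X^T W for diagonal W
    beta_new = np.linalg.solve(XtW @ X, XtW @ z)  # step 3: weighted LS
    mu = np.exp(X @ beta_new)       # updated fitted means
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break                       # step 4: converged
    beta = beta_new
print(beta)                         # close to beta_true
```

In R, `glm(y ~ x, family = poisson)` performs the same iteration internally.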

Variance of βˆ from IWLS algorithm

Suppose that the IWLS algorithm converges to the estimate β̂; then

β̂ = (X^T Σ̂^{-1} X)^{-1} X^T Σ̂^{-1} z,

where, elementwise, we have

µ̂_i = g^{-1}(x_i^T β̂),
z_i = g(µ̂_i) + (y_i − µ̂_i) g′(µ̂_i),
Σ̂_ii = (g′(µ̂_i))² v(µ̂_i) a(φ).

Since Var z = Σ̂, we have

Var β̂ = [(X^T Σ̂^{-1} X)^{-1} X^T Σ̂^{-1}] Σ̂ [(X^T Σ̂^{-1} X)^{-1} X^T Σ̂^{-1}]^T
       = (X^T Σ̂^{-1} X)^{-1}.

Note that the a(φ) term in Σ̂ does not cancel here, as it did in the IWLS algorithm, so we need to estimate it.

Estimator for a(φ)

Y_i has mean µ_i and variance v(µ_i) a(φ). So

Σ_i (Y_i − µ̂_i)² / [v(µ̂_i) a(φ)] ≈ χ²_{n−p},

where p is the number of parameters used to estimate µ.

Let

X² := Σ_i (Y_i − µ̂_i)² / v(µ̂_i).

Then X²/(n − p) will be an estimator for a(φ).

X² is called Pearson's χ² statistic, and it can be shown that X²/(n − p) is a consistent estimator of a(φ).
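A small sketch of this estimator (hypothetical data, not from the slides): for a Poisson-type fit, v(µ) = µ, so X²/(n − p) should be close to 1 when the data really are Poisson.

```python
import numpy as np

# Pearson's X^2 and the dispersion estimate X^2/(n - p), with v(mu) = mu.
rng = np.random.default_rng(2)

n, p = 1000, 2
mu_hat = rng.uniform(2.0, 10.0, size=n)   # stand-in for fitted means
y = rng.poisson(mu_hat)                   # data generated at those means

X2 = np.sum((y - mu_hat) ** 2 / mu_hat)   # Pearson chi-square statistic
a_phi_hat = X2 / (n - p)                  # estimate of a(phi)
print(a_phi_hat)                          # near 1 for genuine Poisson data
```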

Variance of βˆ from IWLS algorithm

Var β̂ = (X^T Σ̂^{-1} X)^{-1},

where

Σ̂_ii = (g′(µ̂_i))² v(µ̂_i) â(φ) = (g′(µ̂_i))² v(µ̂_i) X²/(n − p).
