© 2020 Imperial College London
MATH97075 MATH97183
BSc, MSci and MSc EXAMINATIONS (MATHEMATICS)
May-June 2020
This paper is also taken for the relevant examination for the
Associateship of the Royal College of Science
Survival Models and Actuarial Applications
SUBMIT YOUR ANSWERS AS ONE PDF TO THE RELEVANT DROPBOX ON BLACKBOARD
INCLUDING A COMPLETED COVERSHEET WITH YOUR CID NUMBER, QUESTION
NUMBERS ANSWERED AND PAGE NUMBERS PER QUESTION.
Date: 11th May 2020
Time: 09.00am - 11.30am (BST)
Time Allowed: 2 Hours 30 Minutes
Upload Time Allowed: 30 Minutes
This paper has 5 Questions.
Candidates should start their solutions to each question on a new sheet of paper.
Each sheet of paper should have your CID, Question Number and Page Number on the
top.
Only use 1 side of the paper.
Allow margins for marking.
Any required additional material(s) will be provided.
Credit will be given for all questions attempted.
Each question carries equal weight.
1. (a) State the Kaplan-Meier estimator, $\hat{S}(t)$, of the survivor function $S(t)$ and the Nelson-Aalen
estimator, $\hat{M}(t)$, of the cumulative hazard function $M(t)$. Clearly define all quantities
appearing in the estimators. (5 marks)
(b) Give 3 equivalent expressions that can be useful in different contexts for the hazard rate $\mu(t)$
of a continuous random variable $T$.
(3 marks)
(c) Derive Greenwood's estimator of the standard error of $\hat{S}(t)$, $\mathrm{s.e.}\{\hat{S}(t)\}$. (5 marks)
(d) Pointwise $(1-\alpha)$-confidence intervals for $S(t)$ can be constructed based on the approximate
distribution
$$\hat{S}(t) \sim N\left(S(t),\ \mathrm{s.e.}\{\hat{S}(t)\}^2\right).$$
However, the resulting intervals may include negative values for $S(t)$. Derive an estimator
of the standard error of $\log \hat{S}(t)$. Using a normal approximation, explain how to construct a
non-negative, asymptotically valid $(1-\alpha)$-confidence interval for $S(t)$ based on your estimate
of $\mathrm{s.e.}\{\log \hat{S}(t)\}$. (3 marks)
(e) Construct an alternative non-parametric estimator of the survivor function, $\tilde{S}(t)$, based on
* the Nelson-Aalen estimator $\hat{M}(t)$; and
* the relationship between the survivor function and cumulative hazard rate.
Show that $\hat{S}(t) \le \tilde{S}(t)$ for all $t \ge 0$. (4 marks)
(Total: 20 marks)
MATH96048/MATH97075/MATH97183 Survival Models and Actuarial Applications (2020) Page 2
2. Consider the proportional hazards model $\mu(t; z) = \mu_0(t)\exp(z\beta)$ for $z, \beta \in \mathbb{R}$.
(a) State the partial likelihood $L(\beta)$ for this model based on a random sample where some
observations are subject to right-censoring in two cases: (i) all death times are distinct, (ii)
accounting for ties using the Breslow approximation. (5 marks)
(b) Define the likelihood ratio test statistic $\Lambda$ of the hypothesis $H_0: \beta = 0$ against $H_1: \beta \ne 0$.
What is the asymptotic distribution of the statistic? (3 marks)
(c) Consider the following dataset arising from the proportional hazards model:
  i    t_obs,i    z_i
  1    1.1        1
  2    1.3+       0
  3    1.4        1
  4    1.4        0
  5    1.6+       0
  6    2.3        1
Write down and simplify the partial likelihood for these data. (5 marks)
(d) It is known that $U(\beta) := \frac{\partial}{\partial\beta}\log L(\beta)$ has an approximate $N(0, I(\beta))$ distribution. Show that
when $z \in \{0, 1\}$
$$U(0) = \sum_{j=1}^{k}\left(d_{1j} - d_j\frac{n_{1j}}{n_j}\right) \quad\text{and}\quad I(0) = \sum_{j=1}^{k}\frac{n_{1j}n_{0j}d_j}{n_j^2},$$
where $t_1 < \dots < t_k$ denote the distinct death times, $d_{ij}$ denotes the number of deaths with
$z = i$ at time $t_j$, $n_{ij}$ denotes the number of individuals at risk with $z = i$ just before $t_j$, and
$n_j = n_{1j} + n_{0j}$, $d_j = d_{1j} + d_{0j}$. (4 marks)
(e) Define an approximate level $\alpha$ test of the hypothesis $H_0: \beta = 0$ against $H_1: \beta \ne 0$ based on
the statistic $U(0)$ from part (d). Briefly explain which single quantity from $\sum_{j=1}^{k} n_j$, $\sum_{j=1}^{k} d_j$
and $\sum_{j=1}^{k}(n_j + d_j)$ has most effect on the appropriateness of the approximation.
(3 marks)
(Total: 20 marks)
3. The Binomial model of mortality is used to model the number of deaths before age $x + 1$ in a
sample of $n$ individuals still alive at the exact integer age $x$, with independent probability of death
$q_x \equiv {}_1q_x$ for each individual. Individual $i$ is assumed to be available for observation only within a
sub-interval $[x + a_i, x + b_i)$, for $0 \le a_i < b_i \le 1$, with corresponding death probability ${}_{b_i - a_i}q_{x + a_i}$
during this observation period. Let $d_i \in \{0, 1\}$ be equal to 1 if individual $i$ within the age group
$[x, x + 1)$ died, and let $d = \sum_{i=1}^{n} d_i$ be the total number of deaths.
(a) The Initial Exposed to Risk is $E_x := \sum_{i=1}^{n}(1 - a_i) - \sum_{i=1}^{n}(1 - d_i)(1 - b_i)$.
(i) Compare the contribution to $E_x$ of individual $i$ that is censored or uncensored.
(2 marks)
(ii) Express the Central Exposed to Risk, $E^c_x$, in terms of the $a_i$, $b_i$ and $d_i$. (2 marks)
(iii) Derive the formula $E_x \approx E^c_x + \frac{1}{2}d$. Clearly state all required assumptions. (4 marks)
(b) An entomologist is studying the lifespan of a particular species of dragonfly that has a
maximum lifespan of ω ≈ 4 months. They have obtained the Central Exposed to Risk
at age x and number of deaths in the interval [x, x + 1) from a sample of dragonflies for
x = 0, 1, 2, 3, where x is the age of a dragonfly in months.
Estimate $q_x$ for $x = 0, 1, 2, 3$ from this data:
  x    E^c_x    d
  0    35       10
  1    45       6
  2    75       30
  3    60       40
(5 marks)
(c) From the data in part (b), estimate the curtate expectation of life for a dragonfly currently
aged $x = 1$ month. That is, estimate $e_1$. (3 marks)
(d) In part (b), estimates of $q_x$ are obtained separately for each interval $[x, x + 1)$. In reality,
it is believed that $q_x$ is a smooth function of $x$. The process of smoothing crude actuarial
estimates is called graduation, and results in graduated estimates.
By smoothing the estimates of $q_x$ from part (b), the entomologist obtains graduated estimates
$\mathring{q}_0 = 0.2$, $\mathring{q}_1 = 0.2$, $\mathring{q}_2 = 0.3$, and $\mathring{q}_3 = 0.5$.
Using the data in part (b), conduct a cumulative deviations test at the $\alpha = 0.05$ level, using
the graduated values to define the null hypothesis, and simplifying the test statistic as much
as possible. Note that $P(|X| > 1.96) \approx 0.05$ for $X \sim N(0, 1)$. (4 marks)
(Total: 20 marks)
4. The following diagram illustrates the resulting 3-state model for a homogeneous Markov jump
process X(t) denoting the status of a given patient receiving a bone marrow transplant:
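[Diagram: three states, 0 (Transplant), 1 (Relapse or death) and 2 (Platelet recovery), with transitions
0 → 2 at intensity $\mu^{02}$, 0 → 1 at intensity $\mu^{01}$, and 2 → 1 at intensity $\mu^{21}$.]
Here, $\{\mu^{ij}\}$ denote the transition intensities for $X(t)$.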
(a) (i) Write down the generator matrix G for this process. (4 marks)
(ii) State the distribution of the holding time in state i for i = 0, 2. (3 marks)
(iii) What is the probability that an individual that has just had a bone marrow transplant
relapses or dies without their platelets recovering? (2 marks)
(iv) Find the expected time from transplant to either relapse or death. (4 marks)
(b) Let $Y_i(j, t) = 1$ indicate that patient $i$ is in state $j$ at time $t$, and $Y_i(j, t) = 0$ otherwise. Let
$N_i(t)$ denote the number of transitions made by the $i$th individual between times 0 and $t$.
(i) Suppose that we observe $n$ individuals. Find the intensity $\lambda(t)$ for the counting process
$N(t) := \sum_{i=1}^{n} N_i(t)$. (4 marks)
(ii) Derive expressions for the compensator $\Lambda(t)$ and counting process martingale $D(t)$ for
the process $N(t)$ guaranteed by the Doob-Meyer Decomposition Theorem. Write your
expressions in terms of the total time $V^j(t)$ spent by the $n$ individuals in state $j$ up to
time $t$ for $j = 0, 1, 2$. (3 marks)
(Total: 20 marks)
5. This question is based on the following paper:
Mackenzie, Todd (2012) “Survival Curve Estimation with Dependent Left Truncated
Data Using Cox’s Model,” The International Journal of Biostatistics: Vol. 8: Iss. 1,
Article 29. DOI: 10.1515/1557-4679.1312
The notation used in this question follows the conventions used in that paper.
(a) (i) Write (T,∆, V ) in terms of the unobserved (Y, V,E) and explain what is meant by
dependent left truncation. (3 marks)
(ii) What untestable assumption is being made in the paper? How do the parameters β and
Λ relate to this assumption? (3 marks)
(b) Verify that
$$Q := \Pr[Y \ge V] = 1 \Big/ E\left\{ \frac{1}{\Pr[Y \ge V \mid V = v]} \,\middle|\, Y \ge V \right\},$$
hence justifying the inverse probability weighted approach to estimation of $Q$ by $\hat{Q}$. To avoid
measurability complications, consider discrete distributions for the time-to-event $Y$ and truncating
variable $V$. (Hint: start by considering the probability $\Pr[Y \ge V \mid V = v]$.)
(4 marks)
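(c) Consider the estimators
$$\hat{F}_V(v) := \frac{\hat{Q}}{n}\sum_{v_i \le v}\exp[g(v_i; \hat{\beta})\hat{\Lambda}(v_i)]$$
and
$$\hat{S}_Y(y) := \frac{\hat{Q}}{n}\sum_{i=1}^{n}\exp[-g(v_i; \hat{\beta})\{\hat{\Lambda}(y) - \hat{\Lambda}(v_i)\}].$$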
(i) For any measurable function $f$, show that
$$E[f(V)] = E\left\{ Q\,\frac{f(V)}{\Pr(Y \ge V \mid V = v)} \,\middle|\, Y \ge V \right\}.$$
To avoid measurability complications, consider discrete distributions for the time-to-event
$Y$ and truncating variable $V$. (4 marks)
(ii) Explain how the result in part (c)(i) can be used to justify the estimators $\hat{F}_V(v)$ and
$\hat{S}_Y(y)$. Your answer should specify the function $f(\cdot)$ for each estimator.
(2 marks)
(d) The author estimates the overall survival curve for users of the VA health system using both
their method and the Kaplan-Meier estimate. Comment on the appropriateness of the Kaplan-
Meier estimate in this data analysis.
(2 marks)
(e) Based on the simulation studies in this paper, how would you expect the bias of the survival
curve estimator $\hat{S}_Y(y)$ to behave in a setting with a negative correlation between $Y$ and $V$
with $Q = 0.1$? How would your answer change for values of $Q > 0.5$? (2 marks)
(Total: 20 marks)
Course: MATH96048/MATH97075/MATH97183
Setter: Whitney
Checker: Heard
Editor: Hallsworth
External: Jennison
Date: April 16, 2020
MSc EXAMINATIONS (MATHEMATICS)
May 2020
MATH96048/MATH97075/MATH97183
Survival Models and Actuarial Applications
Calculators may not be used.
1. (a) [Seen] Let $t_1 < \dots < t_k$ denote the unique, ordered death times. For $i = 1, 2, \dots, k$, let $d_i$ denote
the number of individuals that died at time $t_i$ and let $n_i$ be the number of individuals that are at
risk just before time $t_i$. Then
$$\hat{S}(t) = \prod_{i: t_i \le t}\left(1 - \frac{d_i}{n_i}\right) \quad\text{and}\quad \hat{M}(t) = \sum_{i: t_i \le t}\frac{d_i}{n_i}.$$
Answers receive 1 mark for providing all 3 definitions (death times, number of deaths, and number
at risk), 2 marks for correctly stating $\hat{S}(t)$ and 2 marks for correctly stating $\hat{M}(t)$.
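As an illustration of the two estimators in part (a), here is a minimal Python sketch. The function name `km_and_na` and the toy sample are hypothetical and not part of the paper; the sketch assumes that an individual censored at a death time still counts as at risk at that time.

```python
from collections import Counter

def km_and_na(times, events):
    """Return [(t_i, S_hat(t_i), M_hat(t_i))] at each distinct death time.

    times  : observed times (death or censoring)
    events : 1 if the observation is a death, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    s_hat, m_hat, out = 1.0, 0.0, []
    for t in sorted(deaths):
        d = deaths[t]                          # d_i: deaths at t_i
        n = sum(1 for u in times if u >= t)    # n_i: at risk just before t_i
        s_hat *= 1.0 - d / n                   # Kaplan-Meier product term
        m_hat += d / n                         # Nelson-Aalen sum term
        out.append((t, s_hat, m_hat))
    return out

# Hypothetical toy sample (not from the paper): 5 lives, one censored at time 3.
toy_times = [2, 3, 3, 5, 7]
toy_events = [1, 1, 0, 1, 1]
toy_fit = km_and_na(toy_times, toy_events)
```

Each tuple in `toy_fit` holds a death time together with the values of both estimators at that time.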
(b) [Seen] Answers receive 1 mark for each (correct) expression. Various possibilities include:
$$\mu(t) = \lim_{h \downarrow 0}\frac{\Pr(T \le t + h \mid T > t)}{h} = f_t(0) = \frac{d}{dt}M(t) = -\frac{d}{dt}\log S(t) = \frac{f(t)}{S(t)},$$
where $f_t(\cdot)$ is the density of $T_t$ (the future lifetime for an individual aged $t$), $f_0(\cdot) = f(\cdot)$ is the
density of $T \equiv T_0$, $M(t)$ is the cumulative hazard rate, and $S(t)$ is the survivor function.
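(c) [Seen] There are roughly 5 steps to the derivation. Partial credit is based on the number of steps
completed:
1. Consider $\log \hat{S}(t) = \sum_{i: t_i \le t}\log\left(1 - \frac{d_i}{n_i}\right)$. Assuming approximate independence,
$$\mathrm{var}\{\log \hat{S}(t)\} = \sum_{i: t_i \le t}\mathrm{var}\left\{\log\left(1 - \frac{d_i}{n_i}\right)\right\}.$$
2. Each $d_i$ is a Binomial count out of the $n_i$ individuals at risk, with estimated probability of death equal to $d_i/n_i$, so
$$\mathrm{var}(d_i/n_i) \approx \frac{d_i(n_i - d_i)}{n_i^3}.$$
3. By the delta method, using $\frac{d}{dx}\log(1 - x) = -\frac{1}{1 - x}$ evaluated at $x = d_i/n_i$,
$$\mathrm{var}\left\{\log\left(1 - \frac{d_i}{n_i}\right)\right\} \approx \frac{n_i^2}{(n_i - d_i)^2}\,\mathrm{var}(d_i/n_i) \approx \frac{d_i}{n_i(n_i - d_i)}.$$
So far, we have derived the approximation
$$\mathrm{var}\{\log \hat{S}(t)\} \approx \sum_{i: t_i \le t}\frac{d_i}{n_i(n_i - d_i)}.$$
4. By the delta method, $\mathrm{var}\{\log \hat{S}(t)\} \approx \frac{1}{\hat{S}(t)^2}\mathrm{var}\{\hat{S}(t)\}$. Hence
$$\mathrm{var}\{\hat{S}(t)\} \approx \hat{S}(t)^2\sum_{i: t_i \le t}\frac{d_i}{n_i(n_i - d_i)}.$$
5. Taking the square root of this expression yields Greenwood's formula,
$$\mathrm{s.e.}\{\hat{S}(t)\} \approx \hat{S}(t)\sqrt{\sum_{i: t_i \le t}\frac{d_i}{n_i(n_i - d_i)}}.$$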
(d) [Unseen] From the derivation of Greenwood's formula, it is known that
$$\mathrm{var}\{\log \hat{S}(t)\} \approx \sum_{i: t_i \le t}\frac{d_i}{n_i(n_i - d_i)}.$$
Hence the standard error estimate is (1 mark)
$$\mathrm{s.e.}\{\log \hat{S}(t)\} \approx \sqrt{\sum_{i: t_i \le t}\frac{d_i}{n_i(n_i - d_i)}}.$$
Let $c$ be the value such that $P(X > c) = \alpha/2$ for $X \sim N(0, 1)$. Then
$$\log \hat{S}(t) - c \times \mathrm{s.e.}\{\log \hat{S}(t)\} \quad\text{to}\quad \log \hat{S}(t) + c \times \mathrm{s.e.}\{\log \hat{S}(t)\}$$
is an asymptotically valid $(1 - \alpha)$-confidence interval for $\log S(t)$. Hence (1 mark)
$$\hat{S}(t)\,e^{-c \times \mathrm{s.e.}\{\log \hat{S}(t)\}} \quad\text{to}\quad \hat{S}(t)\,e^{c \times \mathrm{s.e.}\{\log \hat{S}(t)\}}$$
is an asymptotically valid $(1 - \alpha)$-confidence interval for $S(t)$. Since each term in the products
involved is non-negative, the confidence interval is also non-negative. (1 mark)
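The Greenwood standard error from part (c) and the log-scale interval from part (d) can be sketched numerically as follows. The function name `greenwood_log_ci` and the counts are hypothetical, chosen only so that the plain normal interval dips below zero while the log-scale interval does not; the sketch assumes $d_i < n_i$ at every death time so that each term is finite.

```python
from math import exp, sqrt

def greenwood_log_ci(d, n, c=1.96):
    """d[i], n[i]: deaths and numbers at risk at the death times t_i <= t.

    Returns (S_hat, se_S, lower, upper), where (lower, upper) is the
    log-scale interval S_hat * exp(-c * se_log) to S_hat * exp(c * se_log).
    Assumes d[i] < n[i] for every i.
    """
    s_hat = 1.0
    var_log = 0.0
    for di, ni in zip(d, n):
        s_hat *= 1.0 - di / ni
        var_log += di / (ni * (ni - di))   # sum of d_i / {n_i (n_i - d_i)}
    se_log = sqrt(var_log)                 # s.e.{log S_hat(t)}
    se_s = s_hat * se_log                  # Greenwood's formula
    return s_hat, se_s, s_hat * exp(-c * se_log), s_hat * exp(c * se_log)

# Hypothetical counts (not from the paper): three death times.
s_hat, se_s, lower, upper = greenwood_log_ci(d=[1, 1, 1], n=[5, 4, 2])
plain_lower = s_hat - 1.96 * se_s          # lower limit of the untransformed interval
```

With these counts `plain_lower` is negative while `lower` is positive. The upper limit of the log-scale interval can exceed 1, so in practice it is often truncated at 1.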
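(e) [Unseen] We have that $S(t) = \exp\{-M(t)\}$, so the estimator of interest is
$$\tilde{S}(t) := \exp\{-\hat{M}(t)\} = \exp\left\{-\sum_{i: t_i \le t}\frac{d_i}{n_i}\right\} = \prod_{i: t_i \le t}\exp\left\{-\frac{d_i}{n_i}\right\} \ge \prod_{i: t_i \le t}\left(1 - \frac{d_i}{n_i}\right) = \hat{S}(t),$$
where the inequality follows since $\exp(x) \ge 1 + x$ for all $x$ (by convexity of $\exp(x)$), and in particular
for each $x = -d_i/n_i$. The factors $1 - d_i/n_i$ are non-negative because $d_i \le n_i$, so the inequality is preserved when taking the product.
Solutions receive 2 marks for any correct definition of $\tilde{S}(t)$, and the remaining 2 marks for a correct
derivation of the inequality.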
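2. (a) [Seen] Let $R_t$ denote the indices of individuals still at risk at time $t$ and let
$t_1 < \dots < t_k$ be the death times. Let $d_i$ be the number of deaths at time $t_i$ and define $s_i$ as the sum
of the covariates $z_j$ over the individuals $j$ that die at time $t_i$.
(i) When all death times are distinct, writing $z_i$ for the covariate of the individual that dies at $t_i$,
$$L(\beta) = \prod_{i=1}^{k}\frac{e^{z_i\beta}}{\sum_{j \in R_{t_i}} e^{z_j\beta}}.$$
(ii) When there are possible ties
$$L(\beta) = \prod_{i=1}^{k}\frac{e^{s_i\beta}}{\left[\sum_{j \in R_{t_i}} e^{z_j\beta}\right]^{d_i}}.$$
Assign 1 mark for defining the notations and 2 marks for each of the two cases.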
(b) [Seen] The likelihood ratio test statistic is
$$\Lambda = 2\{\log L(\hat{\beta}) - \log L(0)\}$$
with an asymptotic $\chi^2_1$ distribution.
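(c) [Seen Method] Note that there are ties in the death times. Letting $\psi = e^{\beta}$, the Breslow partial
likelihood is:
$$L(\beta) = \frac{\psi}{3 + 3\psi} \times \frac{\psi}{(2 + 2\psi)^2} \times \frac{\psi}{\psi} = \frac{1}{12} \times \frac{\psi^2}{(1 + \psi)^3}.$$
To maximise $L(\beta)$, we take the derivative of the log-likelihood
$$\ell(\beta) = \log L(\beta) = -\log 12 + 2\log\psi - 3\log(1 + \psi),$$
$$\frac{d\ell}{d\beta} = \frac{2}{\psi}\frac{d\psi}{d\beta} - \frac{3}{1 + \psi}\frac{d\psi}{d\beta} = 2 - 3\frac{\psi}{1 + \psi}.$$
It follows that
$$2 - 3\frac{\psi}{1 + \psi} = 0 \iff \frac{2}{3} = \frac{\psi}{1 + \psi} \iff \beta = \log\left(\frac{2/3}{1/3}\right) = \log 2.$$
Hence, $\hat{\beta} = \log 2 \approx 0.7$ is the partial maximum likelihood estimate. Indeed, this maximises $L(\beta)$, as
$$\frac{d^2\ell}{d\beta^2} = -3\frac{\psi}{(1 + \psi)^2} = -3\frac{\exp(\beta)}{(1 + \exp(\beta))^2} < 0.$$
Answers receive 2 marks for the correct likelihood, 2 marks for the partial MLE, and 1 mark for
verifying it is a maximum.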
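The simplification in part (c) can be checked numerically. The sketch below evaluates the Breslow log partial likelihood directly from the six observations and compares it with the closed form $\psi^2/\{12(1+\psi)^3\}$. The function names are hypothetical, and the data encoding (time, death indicator, covariate) is an assumption about how the table is read, with "+" treated as right-censoring.

```python
from math import exp, log

# Data from part (c): (observed time, death indicator, covariate z).
data = [(1.1, 1, 1), (1.3, 0, 0), (1.4, 1, 1), (1.4, 1, 0), (1.6, 0, 0), (2.3, 1, 1)]

def breslow_log_partial_likelihood(beta, rows=data):
    """log L(beta) under the Breslow approximation for tied death times."""
    total = 0.0
    for t in sorted({t for t, died, _ in rows if died}):
        s = sum(z for u, died, z in rows if died and u == t)      # s_i
        d = sum(1 for u, died, _ in rows if died and u == t)      # d_i
        risk = sum(exp(beta * z) for u, _, z in rows if u >= t)   # sum over R_{t_i}
        total += s * beta - d * log(risk)
    return total

def closed_form(beta):
    """log of psi^2 / {12 (1 + psi)^3} with psi = exp(beta)."""
    psi = exp(beta)
    return 2 * log(psi) - log(12) - 3 * log(1 + psi)

beta_hat = log(2)   # partial maximum likelihood estimate derived above
```

Evaluating both functions on a grid of `beta` values, and comparing the value at `beta_hat` with nearby values, gives a quick sanity check of the algebra.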
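(d) [Unseen] Summing over the set $U$ of uncensored individuals, we have (2 marks)
$$\frac{d\log L(\beta)}{d\beta} = \sum_{i \in U}\left[z_i - \frac{\sum_{j \in R_{t_i}} z_j\exp(\beta z_j)}{\sum_{j \in R_{t_i}}\exp(\beta z_j)}\right] \overset{(\beta \equiv 0)}{=} \sum_{i \in U}\left[z_i - \frac{\sum_{j \in R_{t_i}} z_j}{|R_{t_i}|}\right] = \sum_{j=1}^{k}\left(d_{1j} - d_j\frac{n_{1j}}{n_j}\right) = U(0)$$
and (2 marks)
$$-\frac{d^2\log L(\beta)}{d\beta^2} = \sum_{i \in U}\frac{d}{d\beta}\left[\frac{\sum_{j \in R_{t_i}} z_j\exp(\beta z_j)}{\sum_{j \in R_{t_i}}\exp(\beta z_j)}\right]$$
$$= \sum_{i \in U}\frac{\left[\sum_{j \in R_{t_i}}\exp(\beta z_j)\right]\left[\sum_{j \in R_{t_i}} z_j^2\exp(\beta z_j)\right] - \left[\sum_{j \in R_{t_i}} z_j\exp(\beta z_j)\right]^2}{\left[\sum_{j \in R_{t_i}}\exp(\beta z_j)\right]^2}$$
$$\overset{(\beta \equiv 0)}{=} \sum_{i \in U}\frac{|R_{t_i}|\left[\sum_{j \in R_{t_i}} z_j\right] - \left[\sum_{j \in R_{t_i}} z_j\right]^2}{|R_{t_i}|^2} = \sum_{j=1}^{k} d_j\frac{n_j n_{1j} - n_{1j}^2}{n_j^2} = \sum_{j=1}^{k}\frac{n_{1j}n_{0j}d_j}{n_j^2} = I(0),$$
where the third line uses $z_j^2 = z_j$ for $z_j \in \{0, 1\}$.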
(e) [Unseen] Let $X = U(0)^2/I(0)$. Under $H_0$, $X$ has an asymptotic $\chi^2_1$ distribution. To obtain a level $\alpha$
test, we reject $H_0$ when $X > c$, where $c$ is the value such that $P(\chi^2_1 > c) = \alpha$. (2 marks)
The last mark is awarded for correctly identifying $\sum_{j=1}^{k} d_j$ as the most relevant quantity. It is the
number of observed failures, rather than the number of observations, that really matters. (1 mark)
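For a concrete feel of the score test, the following sketch evaluates $U(0)$, $I(0)$ and $X = U(0)^2/I(0)$ for the six observations of part (c). This calculation is not requested by the paper; the per-death-time counts in `table` are read off the data by hand and the function name is hypothetical.

```python
from fractions import Fraction

# Per death time t_j for the part (c) data: (n1j, n0j, d1j, d0j).
table = [(3, 3, 1, 0),   # t = 1.1
         (2, 2, 1, 1),   # t = 1.4 (tied deaths)
         (1, 0, 1, 0)]   # t = 2.3

def score_and_information(rows):
    """Return (U(0), I(0)) using the formulas from part (d)."""
    u = Fraction(0)
    info = Fraction(0)
    for n1, n0, d1, d0 in rows:
        n, d = n1 + n0, d1 + d0
        u += d1 - Fraction(d * n1, n)           # d1j - dj * n1j / nj
        info += Fraction(n1 * n0 * d, n * n)    # n1j * n0j * dj / nj^2
    return u, info

U0, I0 = score_and_information(table)
X = U0 ** 2 / I0   # compare with the chi-squared(1) 5% critical value, about 3.84
```

With only four observed deaths the chi-squared approximation is rough here, which is the point made in part (e) about the number of failures.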
3. (a) [Seen]
(i) We have
$$E_x = \sum_{i=1}^{n}(1 - a_i) - \sum_{i=1}^{n}(1 - d_i)(1 - b_i) = \sum_{i: d_i = 1}(1 - a_i) + \sum_{i: d_i = 0}(b_i - a_i),$$
so if individual $i$ dies, they contribute $(1 - a_i)$. If they are right-censored, then they contribute
$(b_i - a_i)$. (2 marks)
(ii) $E^c_x = \sum_{i=1}^{n}(b_i - a_i)$ is the total time observed. (2 marks)
(iii) First, we have the equality (2 marks)
$$E_x = \sum_{i: d_i = 1}(1 - b_i + b_i - a_i) + \sum_{i: d_i = 0}(b_i - a_i) = \sum_{i: d_i = 1}(1 - b_i) + E^c_x.$$
Now, assuming that the deaths happen, on average, at time $x + 1/2$, we find
$$\sum_{i: d_i = 1}(1 - b_i) \approx \sum_{i: d_i = 1}\left(1 - \frac{1}{2}\right) = \frac{1}{2}d$$
and (2 marks)
$$E_x \approx E^c_x + \frac{1}{2}d.$$
(b) [Seen Method] Using the approximation from 3(a)(iii), we have $\hat{q}_x = d/(E^c_x + d/2)$ for $x = 0, 1, 2, 3$.
In particular
$$\hat{q}_0 = \frac{10}{35 + 5} = \frac{1}{4}, \quad \hat{q}_1 = \frac{6}{45 + 3} = \frac{1}{8}, \quad \hat{q}_2 = \frac{30}{75 + 15} = \frac{1}{3}, \quad \hat{q}_3 = \frac{40}{60 + 20} = \frac{1}{2}.$$
Note: 1 mark received for each correct value. 1 mark for using the approximation.
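The four estimates can be reproduced with exact rational arithmetic. This is only a sketch of the calculation above; the dictionary names are hypothetical.

```python
from fractions import Fraction

# Data from part (b): age x -> (central exposed to risk E^c_x, deaths d).
data = {0: (35, 10), 1: (45, 6), 2: (75, 30), 3: (60, 40)}

# q_hat_x = d / (E^c_x + d/2), using E_x ≈ E^c_x + d/2 from part (a)(iii).
q_hat = {x: Fraction(d) / (ecx + Fraction(d, 2)) for x, (ecx, d) in data.items()}
```

Using `Fraction` keeps the values as 1/4, 1/8, 1/3 and 1/2 rather than rounded decimals.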
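(c) [Seen Similar] From a result in the notes
$$e_1 = \sum_{k=1}^{3} {}_kp_1 = {}_1p_1 + {}_2p_1 + {}_3p_1$$
$$= (1 - q_1) + (1 - q_1)(1 - q_2) + (1 - q_1)(1 - q_2)(1 - q_3)$$
$$= (1 - q_1)\{1 + (1 - q_2)[1 + (1 - q_3)]\}.$$
Substituting our estimates
$$\hat{e}_1 = (1 - \hat{q}_1)\{1 + (1 - \hat{q}_2)[1 + (1 - \hat{q}_3)]\} = \frac{7}{8}\left\{1 + \frac{2}{3}\left[1 + \frac{1}{2}\right]\right\} = \frac{7}{4} = 1.75.$$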
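The same sum of survival probabilities can be accumulated in a short loop. This is a sketch only; `curtate_expectation` is a hypothetical helper and the dictionary holds the crude estimates from part (b).

```python
from fractions import Fraction

# Crude estimates of q_x from part (b), for ages 1, 2 and 3 months.
q_hat = {1: Fraction(1, 8), 2: Fraction(1, 3), 3: Fraction(1, 2)}

def curtate_expectation(q, start, last):
    """Sum of the k-month survival probabilities kp_start, for ages start..last."""
    total, surv = Fraction(0), Fraction(1)
    for x in range(start, last + 1):
        surv *= 1 - q[x]     # probability of surviving from age `start` to age x + 1
        total += surv
    return total

e1_hat = curtate_expectation(q_hat, start=1, last=3)
```

The sum stops at age 3 because the maximum lifespan is taken to be about 4 months.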
(d) [Unseen] For the cumulative deviations test, we use the statistic
$$z = \frac{\sum_x (d_x - E_x\mathring{q}_x)}{\sqrt{\sum_x E_x\mathring{q}_x(1 - \mathring{q}_x)}}.$$
The relevant quantities are calculated in the following table:
  x       d_x   E^c_x   E_x             E_x q°_x        E_x q°_x (1 − q°_x)
  0       10    35      35 + 5 = 40     40(0.2) = 8     8(0.8) = 6.40
  1       6     45      45 + 3 = 48     48(0.2) = 9.6   9.6(0.8) = 7.68
  2       30    75      75 + 15 = 90    90(0.3) = 27    27(0.7) = 18.9
  3       40    60      60 + 20 = 80    80(0.5) = 40    40(0.5) = 20
  Totals  86    -       -               84.6            52.98
Plugging these in,
$$z = \frac{86 - 84.6}{\sqrt{52.98}} = \frac{1.4}{\sqrt{52.98}},$$
we see that
$$-1.96 < 0.175 = \frac{1.4}{8} < z < \frac{1.4}{7} = 0.2 < 1.96,$$
so we do not reject at the 0.05 level.
Answers receive 3 marks for the correct z value and 1 mark for the correct decision.
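The table totals and the bounds on $z$ can be confirmed with a few lines of Python. This is a sketch of the calculation above, not part of the mark scheme; variable names are hypothetical.

```python
from math import sqrt

# (deaths d_x, central exposed to risk E^c_x, graduated rate) for x = 0, 1, 2, 3.
rows = [(10, 35, 0.2), (6, 45, 0.2), (30, 75, 0.3), (40, 60, 0.5)]

observed = expected = variance = 0.0
for d, ecx, q in rows:
    ex = ecx + d / 2               # initial exposed to risk, E_x ≈ E^c_x + d/2
    observed += d
    expected += ex * q             # expected deaths under the graduated rates
    variance += ex * q * (1 - q)   # binomial variance of the deaths

z = (observed - expected) / sqrt(variance)
reject = abs(z) > 1.96             # two-sided test at the 0.05 level
```

The computed `z` lies between 0.175 and 0.2, in line with the hand bounds, so `reject` is false.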
4. (a) (i) [Seen Similar] The generator matrix $G$ for this process is
$$G = \begin{pmatrix}\mu^{00} & \mu^{01} & \mu^{02}\\ \mu^{10} & \mu^{11} & \mu^{12}\\ \mu^{20} & \mu^{21} & \mu^{22}\end{pmatrix} = \begin{pmatrix}-(\mu^{01} + \mu^{02}) & \mu^{01} & \mu^{02}\\ 0 & 0 & 0\\ 0 & \mu^{21} & -\mu^{21}\end{pmatrix}.$$
One mark is awarded if at least the first matrix is written. For each correct row in the matrix,
award a mark (e.g. 3 marks if the second row is wrong, but the others are correct).
(ii) [Seen] The holding time in a non-absorbing state $i$ has distribution function $1 - e^{-(-\mu^{ii})t}$, that is,
it is Exponential with rate $-\mu^{ii}$. Hence the distribution of the holding
time in state 0 is Exponential($\mu^{01} + \mu^{02}$) and the distribution of the holding time in state 2 is
Exponential($\mu^{21}$).
Partial marks: 1 mark each, and 1 mark for at least recognising the exponential distribution.
(iii) [Seen Similar] (2 marks) This is the probability of a transition $0 \to 1$ for a patient currently in
state 0. Hence, the jump probability is
$$\Pr(0 \to 1) = \frac{\mu^{01}}{\mu^{01} + \mu^{02}}.$$
(iv) [Unseen] We can write the expectation
$$E(\text{time from transplant to either relapse or death})$$
as the following sum over times in each state (2 marks)
$$E(\text{holding time in state 0}) + \Pr(0 \to 1) \cdot 0 + \Pr(0 \to 2)\,E(\text{holding time in state 2}).$$
From part (a)(ii), we can evaluate the expected holding times
$$E(\text{holding time in state 0}) = \frac{1}{\mu^{01} + \mu^{02}} \quad\text{and}\quad E(\text{holding time in state 2}) = \frac{1}{\mu^{21}}$$
so that (1 mark)
$$E(\text{time from transplant to either relapse or death}) = \frac{1}{\mu^{01} + \mu^{02}} + \Pr(0 \to 2)\frac{1}{\mu^{21}}.$$
From part (a)(iii), we can evaluate the jump probability (1 mark)
$$\Pr(0 \to 2) = \frac{\mu^{02}}{\mu^{01} + \mu^{02}}$$
so that $E(\text{time from transplant to either relapse or death})$ equals
$$\frac{1}{\mu^{01} + \mu^{02}} + \frac{\mu^{02}}{\mu^{01} + \mu^{02}}\cdot\frac{1}{\mu^{21}} = \frac{1}{\mu^{01} + \mu^{02}}\left(1 + \frac{\mu^{02}}{\mu^{21}}\right) = \frac{\mu^{02}}{\mu^{01} + \mu^{02}}\left(\frac{1}{\mu^{02}} + \frac{1}{\mu^{21}}\right).$$
Note: any of the above expressions are acceptable.
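The three equivalent expressions can be checked against each other for a particular choice of intensities. The sketch below uses the first-step equations for the mean time to absorption; the function name and the numerical intensities are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def expected_time_to_state_1(mu01, mu02, mu21):
    """Expected time from state 0 to absorption in state 1.

    First-step equations for the mean hitting times m0 and m2:
        m2 = 1/mu21
        m0 = 1/(mu01 + mu02) + mu02/(mu01 + mu02) * m2
    """
    m2 = 1 / Fraction(mu21)
    out_rate = Fraction(mu01) + Fraction(mu02)   # total rate of leaving state 0
    return 1 / out_rate + Fraction(mu02) / out_rate * m2

# Hypothetical intensities, chosen only for illustration.
mu01, mu02, mu21 = 1, 3, 2
m0 = expected_time_to_state_1(mu01, mu02, mu21)

# The two alternative closed forms from the solution.
form_b = Fraction(1, mu01 + mu02) * (1 + Fraction(mu02, mu21))
form_c = Fraction(mu02, mu01 + mu02) * (Fraction(1, mu02) + Fraction(1, mu21))
```

For these intensities all three expressions give 5/8.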
(b) (i) [Seen Similar] Suppose that we observe $n$ individuals. Find the intensity $\lambda(t)$ for the counting
process $N(t) := \sum_{i=1}^{n} N_i(t)$.
For individual $i$, the intensity of $N_i(t)$ is
$$\lambda_i(t) = -\sum_{j=0}^{2} Y_i(j, t)\mu^{jj} = Y_i(0, t)(\mu^{01} + \mu^{02}) + Y_i(1, t) \cdot 0 + Y_i(2, t)\mu^{21} = Y_i(0, t)(\mu^{01} + \mu^{02}) + Y_i(2, t)\mu^{21}.$$
For $N(t)$, we have
$$\lambda(t) = \sum_{i=1}^{n}\lambda_i(t) = (\mu^{01} + \mu^{02})\sum_{i=1}^{n} Y_i(0, t) + \mu^{21}\sum_{i=1}^{n} Y_i(2, t).$$
Partial marks: 2 of 4 marks if $\lambda_i(t)$ is correctly stated, but $\lambda(t)$ is wrong.
(ii) The compensator is $\Lambda(t) = \int_0^t \lambda(s)\,ds$. For the $i$th individual, this is
$$\Lambda_i(t) = \int_0^t \lambda_i(s)\,ds = (\mu^{01} + \mu^{02})\int_0^t Y_i(0, s)\,ds + \mu^{21}\int_0^t Y_i(2, s)\,ds = (\mu^{01} + \mu^{02})V_i^0(t) + \mu^{21}V_i^2(t),$$
where $V_i^j(t)$ is the time spent by individual $i$ in state $j$ up to time $t$. Hence,
$$\Lambda(t) = \sum_{i=1}^{n}\Lambda_i(t) = (\mu^{01} + \mu^{02})\sum_{i=1}^{n} V_i^0(t) + \mu^{21}\sum_{i=1}^{n} V_i^2(t) = (\mu^{01} + \mu^{02})V^0(t) + \mu^{21}V^2(t),$$
where $V^j(t)$ is the total time spent by the $n$ individuals in state $j$ up to time $t$. (2 marks)
The counting process martingale is $D(t) = N(t) - \Lambda(t)$, so (1 mark)
$$D(t) = \sum_{i=1}^{n} N_i(t) - (\mu^{01} + \mu^{02})V^0(t) - \mu^{21}V^2(t).$$
5. (a) (i) (2 marks) Observations are
$$(T, \Delta, V) = \left(\min(Y, V + E),\ I\{Y \le V + E\},\ V\right).$$
(1 mark) Dependent left truncation means that we only observe the distribution conditional on
the event $\{Y \ge V\}$, and we do not assume that $Y$ and $V$ are independent.
(ii) The untestable assumption is that Y depends on V in a Cox proportional hazards model (1
mark), even when Y < V (1 mark). Parameter β is the log-hazard ratio and Λ is the cumulative
baseline hazard function (1 mark).
(b) For each $v$, we have
$$\Pr(Y \ge V \mid V = v) = \frac{\Pr(Y \ge V, V = v)}{\Pr(V = v)} = \frac{\Pr(V = v \mid Y \ge V)\Pr(Y \ge V)}{\Pr(V = v)} = \frac{\Pr(V = v \mid Y \ge V)\,Q}{\Pr(V = v)}.$$
Hence,
$$\frac{1}{Q}\Pr(V = v) = \frac{\Pr(V = v \mid Y \ge V)}{\Pr(Y \ge V \mid V = v)}.$$
Summing over all $v$, we obtain
$$\frac{1}{Q} = \frac{1}{Q}\sum_v \Pr(V = v) = \sum_v \frac{\Pr(V = v \mid Y \ge V)}{\Pr(Y \ge V \mid V = v)} = E\left\{\frac{1}{\Pr[Y \ge V \mid V = v]} \,\middle|\, Y \ge V\right\}.$$
Taking reciprocals completes the proof. If the student gets to the point just before summing over
all $v$, award 2 out of 4 marks.
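The identity can be checked on a small discrete example. The joint distribution below is hypothetical (it is not taken from the paper or from Mackenzie's article) and is chosen so that $Y$ and $V$ are dependent; all probabilities are exact fractions.

```python
from fractions import Fraction

# Hypothetical joint pmf of (Y, V) on a small grid; Y and V are dependent.
joint = {(1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
         (2, 1): Fraction(3, 10), (2, 2): Fraction(1, 10),
         (3, 1): Fraction(1, 10), (3, 2): Fraction(2, 10)}

Q = sum(p for (y, v), p in joint.items() if y >= v)   # Pr(Y >= V)

def pr_v(v):
    """Pr(V = v)."""
    return sum(p for (_, w), p in joint.items() if w == v)

def pr_obs_given_v(v):
    """Pr(Y >= V | V = v)."""
    return sum(p for (y, w), p in joint.items() if w == v and y >= v) / pr_v(v)

def pr_v_given_obs(v):
    """Pr(V = v | Y >= V)."""
    return sum(p for (y, w), p in joint.items() if w == v and y >= v) / Q

# E{ 1 / Pr(Y >= V | V = v) | Y >= V }, summing over the support of V.
inverse_weight_mean = sum(pr_v_given_obs(v) / pr_obs_given_v(v) for v in (1, 2))
```

Here `1 / inverse_weight_mean` recovers `Q` exactly, which is the inverse probability weighting identity in this toy setting.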
(c) (i) From the calculations in (b), we have that
$$\Pr(V = v) = \frac{Q}{\Pr(Y \ge V \mid V = v)}\Pr(V = v \mid Y \ge V)$$
and hence we have that
$$E[f(V)] = \sum_v f(v)\Pr(V = v) = \sum_v f(v)\frac{Q}{\Pr(Y \ge V \mid V = v)}\Pr(V = v \mid Y \ge V) = E\left\{Q\,\frac{f(V)}{\Pr(Y \ge V \mid V = v)} \,\middle|\, Y \ge V\right\}.$$
(ii) $\hat{F}_V(v)$ is obtained by setting
$$f(\cdot) := I(\cdot \le v)$$
and $\hat{S}_Y(y)$ is obtained by setting
$$f(\cdot) := \exp[-g(\cdot; \beta)\Lambda(y)].$$
(1 mark each)
(d) The Kaplan-Meier estimate seems inappropriate as it assumes independent truncation. The test
of independence over the observed region was significant, which provides evidence of dependent
truncation. (2 marks)
(e) In a setting with a negative correlation between Y and V with Q = 0.1, we expect a negative bias
in the survival curve estimator (1 mark). For values of Q > 0.5, this bias would tend to zero (or at
least decrease in magnitude) (1 mark).
Comments for Students (exam module codes MATH97075, MATH97183)
Question 1:
PARTS A, B, C were generally well answered. PART D had many answers that did not clearly
demonstrate a non-negative confidence interval, though in many cases the standard error
estimator was obtained. In PART E, many students were unable to justify the inequality.
Question 2:
PART A was generally well answered. PART B had many responses that did not specify the
degrees of freedom (1, here). In PART C several students had difficulty writing the likelihood
accounting for the tied event times. PART D had several issues simplifying the derivatives. PART E
was typically well answered.
Question 3:
In PART A a surprising number of students had difficulty connecting the observation intervals to
the central exposed to risk. PART B was answered well for the most part. Several students had
difficulty evaluating the expectation for PART C. In PART D many students had difficulty
calculating the cumulative deviations test statistic for the binomial model.
Question 4:
In PART A, many students had difficulty recognising that they should calculate the jump
probability in part III and most students struggled to find the expectation in part IV. In PART B,
many students did not demonstrate understanding of the processes beyond the basic
definitions.
Question 5:
PART A was fairly well answered, but students often did not address all parts of the question.
PARTS B and C were in all cases either not attempted or did not display the desired level of
understanding of the estimator in the article. PARTS D and E were generally well answered.
