MATH3871/MATH5960

Assignment 1

This assignment covers material in Lectures 1–3. The assignment is worth 15% of the final course grade, and 5% of the grade will be allocated for neat and concise presentation. Please refer to the following instructions:

• The assignment is to be submitted via Moodle by 7 October, 11:55 PM AEDT.

• Include in your assignment any relevant R code, R output, and mathematical derivations. Embed the code and plots into your assignment (please don’t attach R Markdown or other R script files).

• The total number of submitted pages should not exceed 6 A4 pages. Any pages submitted in excess of 6 pages will not be graded.

• Print, sign and attach this cover sheet with your assignment (not included in the page count).

• Refer to the course handout for grading of late submissions.

Plagiarism Statement

I declare that this assessment item is my own work, except where acknowledged, and has not

been submitted for academic credit elsewhere. I acknowledge that the assessor of this item may,

for the purpose of assessing this item reproduce this assessment item and provide a copy to

another member of UNSW; and/or communicate a copy of this assessment item to a plagiarism

checking service (which may then retain a copy of the assessment item on its database for the

purpose of future plagiarism checking).

I certify that I have read and understood UNSW Rules in respect of Student Academic Misconduct.

Name (print clearly):

Student Number:

Signature:

Date:

1. Inference: Let θ be the true proportion of people over the age of 40 in your community

with hypertension. Consider the following thought experiment:

(a) Though you may have little or no expertise in this area, give an initial point estimate

of θ.

(b) Now suppose a survey to estimate θ is established in your community, and of the first

5 randomly selected people, 4 are hypertensive. How does this information affect

your initial estimate of θ?

(c) Finally, suppose that at the survey’s completion, 400 of 1000 people have emerged

as hypertensive. Now what is your estimate of θ?
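
One way to make this thought experiment concrete is a conjugate Beta-Binomial update. The sketch below is only an illustration: the question does not specify a prior, so the Beta(3, 7) prior (initial point estimate 0.3) and the function name `beta_posterior_mean` are assumptions chosen for the example.

```python
def beta_posterior_mean(a, b, successes, trials):
    """Posterior mean of theta under a Beta(a, b) prior and binomial data."""
    return (a + successes) / (a + b + trials)

# Hypothetical prior (an assumption, not part of the question): Beta(3, 7),
# which corresponds to an initial point estimate of 0.3.
a, b = 3.0, 7.0
prior_mean = a / (a + b)

after_5 = beta_posterior_mean(a, b, 4, 5)          # 4 of the first 5 people
after_1000 = beta_posterior_mean(a, b, 400, 1000)  # 400 of 1000 people

print(f"prior: {prior_mean:.3f}, after 5: {after_5:.3f}, after 1000: {after_1000:.3f}")
```

Under this assumed prior, five observations move the estimate only part of the way towards the sample proportion 0.8, while the full survey of 1000 people pulls it very close to the sample proportion 0.4.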

2. Multivariate Priors: Let $x_1, \ldots, x_n \in \mathbb{R}^d$ be n iid d-dimensional vectors. Suppose that we wish to model $x_i \sim N_d(\mu, \Sigma)$ for $i = 1, \ldots, n$, where $\mu \in \mathbb{R}^d$ is an unknown mean vector, and $\Sigma$ is a known positive semi-definite covariance matrix.

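(a) Adopting the conjugate prior $\mu \sim N_d(\mu_0, \Sigma_0)$, show that the resulting posterior distribution for $\mu \mid x_1, \ldots, x_n$ is $N_d(\hat{\mu}, \hat{\Sigma})$, where

$$\hat{\mu} = (\Sigma_0^{-1} + n\Sigma^{-1})^{-1}(\Sigma_0^{-1}\mu_0 + n\Sigma^{-1}\bar{x})$$

and

$$\hat{\Sigma} = (\Sigma_0^{-1} + n\Sigma^{-1})^{-1}.$$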

(b) Derive Jeffreys’ prior $\pi_J(\mu)$ for $\mu$.

Hint: If you need help with vector differentiation, you can find out about this in various places on the internet. One such place is https://en.wikipedia.org/wiki/Matrix_calculus.
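
As a possible starting point for part (a), the log posterior can be written by collecting the terms in µ. The sketch below assumes that Σ and Σ0 are invertible (the stated formulas already use their inverses) and symmetric, and "const" absorbs every term that does not depend on µ, so it differs from line to line. For part (b), the last display is the standard definition of Jeffreys' prior in terms of the Fisher information; the symbol I(µ) is introduced here and does not appear in the question.

```latex
\begin{align*}
\log p(\mu \mid x_1, \ldots, x_n)
  &= \text{const} - \tfrac{1}{2}(\mu - \mu_0)^\top \Sigma_0^{-1} (\mu - \mu_0)
     - \tfrac{1}{2}\sum_{i=1}^{n} (x_i - \mu)^\top \Sigma^{-1} (x_i - \mu) \\
  &= \text{const} - \tfrac{1}{2}\left[ \mu^\top (\Sigma_0^{-1} + n\Sigma^{-1}) \mu
     - 2\mu^\top (\Sigma_0^{-1}\mu_0 + n\Sigma^{-1}\bar{x}) \right],
\\[1ex]
\pi_J(\mu) &\propto \left| I(\mu) \right|^{1/2}, \qquad
I(\mu) = -\mathbb{E}\left[ \frac{\partial^2}{\partial\mu\,\partial\mu^\top}
         \log p(x_1, \ldots, x_n \mid \mu) \right].
\end{align*}
```

Matching the quadratic and linear terms in µ against the kernel of a multivariate normal density is one route to identifying the posterior mean and covariance.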

3. Importance Sampling: There are many ways to compute or estimate $\pi$. A very simple estimation procedure is via importance sampling. Suppose that samples $x_1, \ldots, x_n$ were obtained uniformly inside a square with side length $2r$ (see diagram), where each $x_i = (x_i^{(1)}, x_i^{(2)})$ for $i = 1, \ldots, n$.

[Diagram: a circle of radius r inside a square of side length 2r.]

Now define $b_i = 1$ if $x_i$ is also inside the circle of radius $r$, and $b_i = 0$ otherwise. Then $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} b_i$ is an estimate of the ratio of the area of the circle to the area of the square. Given that we know the true value of $p$ for this setting, we can then obtain an estimate of $\pi$.
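
The sampling scheme described above can be sketched in a few lines. The assignment asks for R; the version below is an illustrative Python sketch using only the standard library. It assumes the circle is centred in the square, with both centred at the origin, and the function name `estimate_p`, the default r = 1 and the seed are choices made for this example.

```python
import random

def estimate_p(n, r=1.0, seed=None):
    """Fraction of n uniform points in the square [-r, r]^2 that land inside
    the circle of radius r centred at the origin (assumed geometry)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x1 = rng.uniform(-r, r)
        x2 = rng.uniform(-r, r)
        if x1 * x1 + x2 * x2 <= r * r:  # b_i = 1 when the point is in the circle
            hits += 1
    return hits / n

p_hat = estimate_p(1000, seed=1)
# The area ratio is (pi r^2) / (2r)^2 = pi / 4, so rescale by 4.
pi_hat = 4 * p_hat
print(p_hat, pi_hat)
```

Fixing the seed makes a single run reproducible; changing or dropping it shows how much the estimate varies from run to run.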

(a) Show that the estimate of $\pi$ is given by $4\hat{p}$.

(b) Estimate $\pi$ using n = 1,000 samples.

(c) Using the central limit theorem, determine the Monte Carlo sampling variability of $\hat{\pi}$ (i.e. derive the asymptotic distribution of $\hat{\pi}$ as n gets large).

(d) Construct a histogram of 1,000 estimates $\hat{\pi}$, each based on n = 1,000 samples. Superimpose the Monte Carlo sampling variability distribution from part (c) under the assumption that the true value of p is 0.7854, and verify that it matches the experimental result.

(e) Without using the true value of p, and based on the Monte Carlo sampling variability, determine what sample size, n, is needed if we require the estimate of $\pi$ to be within 0.01 of the true value with at least 95% probability.

(Hint: You will need to use a value for p in order to obtain this sample size. Choose the value of p that gives the most conservative value of n, so that you can be sure that you have estimated $\pi$ to the desired accuracy.)
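
For parts (c) to (e), the normal approximation can be explored numerically. The sketch below is illustrative Python (the assignment itself asks for R) and assumes that the central limit theorem gives an approximately normal estimate 4p̂ with standard deviation 4·sqrt(p(1 − p)/n). The function names `pi_hat_sd` and `required_n` are hypothetical helpers, not part of the assignment.

```python
import math
from statistics import NormalDist

def pi_hat_sd(p, n):
    """Approximate standard deviation of pi_hat = 4 * p_hat under the CLT."""
    return 4.0 * math.sqrt(p * (1.0 - p) / n)

def required_n(tol, p, conf=0.95):
    """Smallest n such that, under the normal approximation,
    P(|pi_hat - 4p| <= tol) is at least conf."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)  # two-sided normal quantile
    return math.ceil((4.0 * z) ** 2 * p * (1.0 - p) / tol ** 2)

sd_1000 = pi_hat_sd(0.7854, 1000)       # spread to compare with the histogram in (d)
n_conservative = required_n(0.01, 0.5)  # p = 0.5 maximises p * (1 - p)
print(sd_1000, n_conservative)
```

Because p(1 − p) is largest at p = 0.5, the sample size computed there is at least as large as the one needed for any other value of p, which is what makes it the conservative choice.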
