MATH3871/MATH5960

Assignment 2

This assignment covers material from the lectures in weeks 6-8. It is worth 15% of the final course grade; 5% of the grade will be allocated for neat and concise presentation.

Please refer to the following instructions:

• The assignment is to be submitted via Moodle by 14 November, 11:55 PM AEDT.

• You may work individually or in groups of at most 3 people. Each group should submit only one submission (any member can upload it to Moodle). All members must sign the plagiarism statement, attach it to the submission, and clearly indicate who is part of the group.

• Include any relevant R code, output and mathematical derivations in your assignment. Embed the code and plots into your assignment (please don’t attach R Markdown or other R script files).

• The total number of submitted pages should not exceed 7 A4 pages. Any pages submitted in excess of 7 pages will not be graded.

• Print, sign and attach this cover sheet to your assignment (not included in the page count).

• Refer to the course handout for the grading of late submissions.

Plagiarism Statement

I declare that this assessment item is my own work, except where acknowledged, and has not been submitted for academic credit elsewhere. I acknowledge that the assessor of this item may, for the purpose of assessing this item reproduce this assessment item and provide a copy to another member of UNSW; and/or communicate a copy of this assessment item to a plagiarism checking service (which may then retain a copy of the assessment item on its database for the purpose of future plagiarism checking).

I certify that I have read and understood UNSW Rules in respect of Student Academic Misconduct.

Name (print clearly):

Student Number:

Signature:

Date:

Questions

1. Archaeology provides a rich source of complex, non-standard problems in which prior information, where it is available, needs careful elicitation. Here we look at a study of corbelling, a method of roofing spaces with blocks of stone that was widely used in prehistory.

For many decades, archaeologists and historians have been fascinated by the ability of prehistoric communities to develop sufficient skills to construct these domes, some of which have survived for over 4,000 years. This interest has led to the development of applied mathematical models as an aid to understanding why the domes stand up and how they were constructed.

Consider the simplest of these models

$$y_i = \alpha x_i^{\beta},$$

where $y$ denotes the radius of the dome at which measurements were taken, and $x$ is the depth from the apex of the dome to the point at which measurements were taken. It is easier to work with the log-linear model

$$\ln y_i = \ln\alpha + \beta \ln x_i + \epsilon_i,$$

where $\epsilon_i \sim N(0, \sigma^2)$ are iid error terms.
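The step from the power law to the log-linear form can be written out as follows. This is a sketch under one stated assumption: the error enters multiplicatively on the original scale, which is what an additive normal error on the log scale implies.

```latex
\begin{align*}
y_i &= \alpha\, x_i^{\beta}\, e^{\epsilon_i},
  \qquad \epsilon_i \sim N(0, \sigma^2) \ \text{iid} \\
\ln y_i &= \ln\alpha + \beta \ln x_i + \epsilon_i
\end{align*}
```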

Below are 24 measurements from the late Minoan tholos dome at Stylos, on Crete, in Greece.

x (depth)   0.04  0.24  0.44  0.64  0.84  1.04  1.24  1.44  1.64  1.84  2.04  2.24
y (radius)  0.40  0.53  0.70  0.90  1.06  1.16  1.26  1.36  1.47  1.62  1.67  1.68

x (depth)   2.44  2.64  2.84  3.04  3.24  3.44  3.64  3.84  4.04  4.24  4.44  4.64
y (radius)  1.77  1.82  1.89  1.96  2.00  2.05  2.10  2.10  2.14  2.13  2.15  2.14

Table 1: Measurements for the late Minoan tholos at Stylos.
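The data in Table 1 can be entered and log-transformed along the following lines. This is a minimal sketch in Python for illustration only (the assignment itself asks for R). The names `x`, `y` and `ols_loglog` are hypothetical, and the least-squares fit is just one assumed way of obtaining rough starting values on the log scale, not a required step.

```python
import math

# Table 1: depth x and radius y for the tholos at Stylos (24 points).
x = [0.04, 0.24, 0.44, 0.64, 0.84, 1.04, 1.24, 1.44, 1.64, 1.84, 2.04, 2.24,
     2.44, 2.64, 2.84, 3.04, 3.24, 3.44, 3.64, 3.84, 4.04, 4.24, 4.44, 4.64]
y = [0.40, 0.53, 0.70, 0.90, 1.06, 1.16, 1.26, 1.36, 1.47, 1.62, 1.67, 1.68,
     1.77, 1.82, 1.89, 1.96, 2.00, 2.05, 2.10, 2.10, 2.14, 2.13, 2.15, 2.14]


def ols_loglog(x, y):
    """Least-squares fit of ln y = ln(alpha) + beta * ln x.

    Returns (ln_alpha, beta). Hypothetical helper, used only for rough
    starting values.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    beta = sxy / sxx
    return my - beta * mx, beta


ln_alpha_hat, beta_hat = ols_loglog(x, y)
print(f"ln(alpha) ~ {ln_alpha_hat:.3f}, beta ~ {beta_hat:.3f}")
```

Working with `ln x` and `ln y` directly keeps the model linear in the unknowns $\ln\alpha$ and $\beta$.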

Use the Metropolis-Hastings (M-H) algorithm to compute posterior estimates for the parameters $\ln\alpha$, $\beta$ and $\sigma^2$. Code your M-H sampler from scratch in R rather than using in-built packages. In your answer, provide:

(a) The specific settings of your M-H algorithm (e.g. the chosen proposal $q$).

(b) The priors chosen, and their justification. Carry out a prior sensitivity analysis (i.e. assess the sensitivity of your posterior distribution to moderate changes in your prior specification).

(c) The likelihood function.

(d) The results of your convergence assessments.

(e) Traceplots, density plots of the posterior samples, and a suitable point estimate for each parameter.

(f) The R code developed (not included in the page count).
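For orientation, the general shape of a random-walk M-H sampler is sketched below in Python (the assignment itself requires R). Everything here is an assumption made for illustration: the function name `rw_metropolis`, the symmetric Gaussian proposal, the step size, and above all the toy standard-normal target, which is not the dome model and says nothing about which priors or proposal to choose.

```python
import math
import random


def rw_metropolis(log_target, init, step_sd, n_iter, seed=0):
    """Random-walk Metropolis with symmetric Gaussian proposals.

    log_target returns the log of the (unnormalised) target density.
    Returns (list of samples, acceptance rate).
    """
    rng = random.Random(seed)
    current = list(init)
    current_lp = log_target(current)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = [c + rng.gauss(0.0, step_sd) for c in current]
        proposal_lp = log_target(proposal)
        # Symmetric proposal: the Hastings ratio reduces to the target ratio.
        log_ratio = proposal_lp - current_lp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(list(current))
    return samples, accepted / n_iter


# Toy target (an assumption for illustration): a standard normal density.
def log_std_normal(theta):
    return -0.5 * theta[0] ** 2


draws, acc_rate = rw_metropolis(log_std_normal, init=[3.0],
                                step_sd=2.4, n_iter=20000)
kept = [d[0] for d in draws[2000:]]  # discard the first 2000 draws as burn-in
post_mean = sum(kept) / len(kept)
print(f"acceptance rate {acc_rate:.2f}, mean {post_mean:.3f}")
```

Working with log densities avoids numerical underflow when the target is a product of many likelihood terms.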

2. Benford’s law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit 1 occurs much more often than the others (about 30% of the time). Furthermore, the larger the digit, the less likely it is to occur as the leading digit of a number. The approximate distribution of leading digits (which exclude 0 by definition) is given by:

Leading digit  1      2      3      4      5      6      7      8      9
Probability    0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046

See http://en.wikipedia.org/wiki/Benford%27s_law for more information on this phenomenon.
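The tabulated probabilities agree, to three decimal places, with the usual closed form of Benford's law, $P(d) = \log_{10}(1 + 1/d)$. The short Python check below is illustrative only; the variable names are hypothetical.

```python
import math

# Benford's law: P(leading digit = d) = log10(1 + 1/d) for d = 1, ..., 9.
benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Rounded values as tabulated in the assignment.
table = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

for d, (p, t) in enumerate(zip(benford, table), start=1):
    print(d, f"{p:.4f}", t)
```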

Benford’s law can be used to expose cheating on taxes and in elections (among other things). Below are the numbers of polling stations (out of 192) that recorded each leading digit in the number of pro-Hugo Chavez votes cast in a past Venezuelan election.

Leading digit              1   2   3   4   5   6   7   8   9
Number of voting stations  31  32  29  20  18  18  21  13  10

(a) Assuming a multinomial likelihood and a Dirichlet prior with parameter vector $a = (a_1, \ldots, a_9)^\top$, show that the Bayes factor for testing the conjecture that the observed counts $n = (n_1, \ldots, n_9)^\top$ are consistent with Benford’s law is given by

$$B_{01} = \frac{B(a)}{B(a + n)} \prod_{j=1}^{9} p_{0j}^{n_j},$$

where $p_0 = (p_{01}, \ldots, p_{09})^\top$ are Benford’s hypothesised proportions and

$$B(a) = B(a_1, \ldots, a_9) = \frac{\prod_{j=1}^{9} \Gamma(a_j)}{\Gamma\left(\sum_{j=1}^{9} a_j\right)}.$$

(b) Derive a similar expression for the fractional Bayes factor with training fraction $b$, and plot the fractional Bayes factor against $b$ for $0 \le b \le 1$.

(c) Assess the adherence of the Venezuelan election counts to Benford’s law by evaluating both the standard and the fractional Bayes factors, and draw appropriate conclusions.
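The expression for $B_{01}$ in part (a) involves gamma functions of large arguments, so it is usually evaluated on the log scale. The Python sketch below is illustrative only: the function names are hypothetical, and the uniform Dirichlet prior `a = [1, ..., 1]` is an assumption chosen for the example, not a recommended prior.

```python
import math


def log_mv_beta(a):
    """Log of B(a) = prod_j Gamma(a_j) / Gamma(sum_j a_j)."""
    return sum(math.lgamma(v) for v in a) - math.lgamma(sum(a))


def log_bayes_factor_01(counts, p0, a):
    """log B01 = log B(a) - log B(a + n) + sum_j n_j * log p0_j."""
    a_post = [ai + ni for ai, ni in zip(a, counts)]
    log_lik_h0 = sum(ni * math.log(pi) for ni, pi in zip(counts, p0))
    return log_mv_beta(a) - log_mv_beta(a_post) + log_lik_h0


# Leading-digit counts and Benford proportions as tabulated in the assignment.
counts = [31, 32, 29, 20, 18, 18, 21, 13, 10]
p0 = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]
a = [1.0] * 9  # assumption: uniform Dirichlet prior, for illustration only

log_b01 = log_bayes_factor_01(counts, p0, a)
print(f"log B01 = {log_b01:.3f}")
```

`math.lgamma` returns the log of the gamma function directly, which avoids overflow in terms such as $\Gamma(\sum_j (a_j + n_j))$.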

3. Let $x_1, \ldots, x_n$ be independent binary observations. Under model 1, $\Pr(x_i = 1) = \theta$ with improper prior distribution $\pi(\theta) = c_1 \theta^{-1}(1 - \theta)^{-1}$ for $0 < \theta < 1$, while under model 2, $\Pr(x_i = 1) = \theta_0$, a fixed value. Let $r = \sum_{i=1}^{n} x_i$ be the number of ones in the data, and suppose that $0 < r < n$.

(a) Show that all minimal training samples produce the same partial Bayes factor, and hence that both the arithmetic ($B^{A}_{12}$) and geometric ($B^{G}_{12}$) intrinsic Bayes factors take the value

$$B^{A}_{12} = B^{G}_{12} = B(r, n - r)\, \theta_0^{1-r} (1 - \theta_0)^{1-n+r},$$

where $B(p, q) = \Gamma(p)\Gamma(q)/\Gamma(p + q)$ is the beta function.

(b) Derive the fractional Bayes factor $B^{F}_{12}$.

(c) Contrast $B^{G}_{12}$ and $B^{F}_{12}$ in the case $\theta_0 = 0$ and $r = 1$ by evaluating which model each Bayes factor supports. Are these consistent with each other? Which is correct?
