Australian National University
RSFAS, College of Business and Economics

INTRODUCTION TO BAYESIAN DATA ANALYSIS (STAT3016/4116/7016)
SEMESTER 2 2021
ASSIGNMENT 1

DUE DATE: Friday 3 September 2021, by 11:59pm (15% of total course grade)
(Total Marks: 55 (STAT3016); 70 (STAT4116/7016))

INSTRUCTIONS:

1. All students must hand in an assignment of their own writing.
2. The assignment should be submitted using the online submission facility Turnitin on the course Wattle site under ‘ASSIGNMENTS/Assignment 1’.
3. Begin each question on a new page. The questions are not equally weighted.
4. Typed solutions are preferred. However, you may scan and submit handwritten solutions to problems which require mathematical derivations. Be sure your handwritten work is legible on the computer once scanned.
5. Where required, provide sufficient computer output to support your answers. Provide enough intermediate numerical calculations to justify the working for your final answer.
6. Computer output must be interpreted in written format. A solution solely highlighting the computer output is not acceptable.
7. No late assignments will be accepted without prior permission, obtained from the course convenor before the due date and time.

COLLABORATION POLICY

University policies on plagiarism will be strictly enforced. You are encouraged to (orally) discuss your assignments with your classmates, but each student must write up solutions separately. Be sure that you have worked through each problem yourself and that all answers you submit are the results of your own efforts. This includes all computer code and output. The submission facility Turnitin will provide a similarity score after matching your submission against other student submissions and external sources.

Problem 1 [10 marks]

How many ambulance vehicles are there in Canberra? Suppose I know there cannot be more than 60 ambulance vehicles in Canberra.
Whilst driving around last week I observed six ambulances numbered 13, 14, 21, 26, 37 and 38, so the sample size is $n = 6$. I assume that ambulances in Canberra are numbered from 1 to $N$, and that I am equally likely to observe any numbered ambulance at any time. I also assume observations are independent.

To solve this problem, suppose one takes independent observations $y_1, \dots, y_n$ from a discrete uniform distribution on the set $\{1, 2, \dots, N\}$, where the upper bound $N$ is unknown. Suppose one places a discrete uniform prior for $N$ on the values $1, \dots, B$, where $B$ is known.

(a) [3 marks] Derive the posterior distribution of $N$ up to a proportionality constant. Be sure to specify the bounds on the parameter space of $N$ in your posterior distribution.

(b) [3 marks] Compute posterior probabilities of $N$ over a grid of values.

(c) [2 marks] Compute the posterior mean and posterior standard deviation of $N$.

(d) [2 marks] Find the posterior probability that there are more than 50 ambulance vehicles in Canberra.

Problem 2 [5 marks]

Suppose for a binary sampling problem we plan on using a uniform, or Beta(1,1), prior for the population proportion $\theta$. Perhaps our reasoning is that this represents “no prior information about $\theta$”. However, some people like to look at proportions on the log-odds scale; that is, they are interested in

$$\gamma = \log\left(\frac{\theta}{1-\theta}\right).$$

Via Monte Carlo sampling or otherwise, find the prior distribution for $\gamma$ that is induced by the uniform prior for $\theta$. Is this prior informative about $\gamma$?

Problem 3 [15 marks]

The speed limit on Gungahlin Drive in Canberra between Barton Highway and the Glenloch interchange is 90 km/h. Ben frequently drives on Gungahlin Drive and typically drives at a constant speed of 90 km/h (that is, at the speed limit) on this section of road. One day, he passes 3 cars and is passed by 17 cars on this section of road.
Suppose that car speeds on this section of road are normally distributed with unknown mean $\mu$ and known standard deviation $\sigma = 4.5$. Let $s = 3$ denote the number of cars that Ben overtakes and let their unobserved car speeds be $y_1, y_2, y_3$. Let $t = 17$ denote the number of cars that overtake Ben and let their unobserved car speeds be $y_4, y_5, \dots, y_{20}$.

(a) [3 marks] Assign the unknown mean $\mu$ a flat prior density. Write down the mathematical expression for the posterior density of $\mu$ (that is, $p(\mu \mid \sigma, s, t, y)$) up to a proportionality constant. Hint: the actual car speeds $y_i$ ($i = 1, \dots, 20$) are not observed, but if Ben passes, say, Car A, then we know that the speed of Car A must be less than 90 km/h. Similarly, if Car B passes Ben, then we know that the speed of Car B must be greater than 90 km/h.

(b) [2 marks] Plot the posterior density of $\mu$.

(c) [1 mark] Using the density found in part (b), provide a 95% interval estimate for the average speed at which cars travel along this section of Gungahlin Drive between Barton Highway and the Glenloch interchange.

(d) [1 mark] Estimate the probability that the average speed of the cars on this section of Gungahlin Drive exceeds the 90 km/h speed limit.

(e) [2 marks] Now let’s assume $\sigma$ is unknown. Assume the non-informative joint prior distribution $p(\mu, \sigma^2) \propto (\sigma^2)^{-1}$. Derive the joint posterior distribution of $(\mu, \sigma^2)$ up to a proportionality constant.

(f) [3 marks] Create a contour plot of the joint posterior density of $\mu$ and $\sigma^2$.

(g) [3 marks] Using the joint density found in part (f), provide a 95% interval estimate for the average speed at which cars travel along this section of Gungahlin Drive between Barton Highway and the Glenloch interchange. How does your answer compare to your previous answer in part (c)?

Problem 4 [8 marks]

Due to COVID restrictions, final exams are currently administered online.
For exams that do not use invigilation software there has been an increase in reports of potential collusion between students when sitting the online exam. Identical answers or matching submission times for each question raise red flags for a potential breach of academic integrity.

Consider an online exam with 20 questions in which answers are entered directly into input boxes on Wattle (that is, no file uploads are permitted). Two students, say Student A and Student B, each provided an answer to all 20 questions. Suppose for 15 of the 20 questions, the answers from Student A are identical to the answers from Student B, and for the remaining 5 questions, the answers are very similar. As further evidence of collusion, we check the timestamps of when each of the 20 answers was submitted.

Let $y_i = 1$ if there is a match between the timestamp of Question $i$ for Student A and the timestamp of Question $i$ for Student B ($i = 1, \dots, 20$), and $y_i = 0$ otherwise. Let $\theta$ be the common probability of a match in timestamps for any question. The assumed likelihood function is

$$y_i \mid \theta \overset{\text{iid}}{\sim} \text{Bern}(\theta).$$

Therefore, if $n = 20$ is fixed, the joint likelihood is

$$p(y_1, \dots, y_{20} \mid \theta) \propto \theta^{\sum_{i=1}^{n} y_i} (1-\theta)^{n - \sum_{i=1}^{n} y_i}.$$

Suppose the observed data are, in order, 1,1,1,0,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,1.

What criteria should we use to establish a case for collusion between Student A and Student B based on the observed timestamps? Suppose the protocol for measurement is to stop once 15 ones have appeared.

(a) [2 marks] Assume a uniform prior on $\theta$. What is the posterior distribution $p(\theta \mid y)$ under the new protocol where $n$ is not fixed?

(b) [6 marks] Let’s run some posterior predictive checks. Define the test quantity $T$ = number of switches between 0 and 1 in the sequence. Simulate the replications $y^{\mathrm{rep}}$ under the new measurement protocol to stop once 15 ones have appeared. Display the predictive simulations $T(y^{\mathrm{rep}})$ in a histogram. Compare to the distribution of $T(y^{\mathrm{rep}})$ when $n = 20$ is fixed and explain any differences.
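
As a concrete reading of the test quantity $T$ and of the stopping rule in part (b), the following Python sketch may help. It is only an illustration under stated assumptions: the function names `count_switches` and `simulate_until_ones` are hypothetical, and the value $\theta = 0.8$ is a placeholder, not a draw from the posterior in part (a).

```python
import random


def count_switches(seq):
    """Count adjacent pairs whose values differ (a 0 -> 1 or 1 -> 0 switch)."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def simulate_until_ones(theta, n_ones, rng):
    """Draw Bernoulli(theta) outcomes until n_ones ones have appeared."""
    seq = []
    while sum(seq) < n_ones:
        seq.append(1 if rng.random() < theta else 0)
    return seq


# Observed timestamp-match indicators from the problem statement.
y_obs = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
t_obs = count_switches(y_obs)

# One replicate under the stopping rule.
# theta = 0.8 is an assumed placeholder, not a posterior draw.
rng = random.Random(2021)
y_rep = simulate_until_ones(theta=0.8, n_ones=15, rng=rng)
t_rep = count_switches(y_rep)
print(t_obs, len(y_rep), t_rep)
```

Under the stopping rule the length of each replicate is random and the final entry is always a one, whereas under the fixed-$n$ design every replicate has length 20.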

Problem 5 [17 marks]

We often use the t-distribution to model continuous random variables where the distribution is approximately symmetric but with a higher possibility of outliers compared to a normal distribution. Consider the following course grade data for a class of $n = 15$ students:

78, 50, 72, 72, 75, 72, 68, 94, 66, 92, 66, 90, 64, 71, 45.

With lower and upper outliers in the observed data, a t-distribution may be more appropriate to model the distribution of course grades. Suppose $y_1, \dots, y_n$ are a sample from a t-distribution with location parameter $\mu$, scale parameter $\sigma$, and known degrees of freedom $\nu$. Assuming conditional independence given the parameters, the likelihood function is given by

$$p(y_1, \dots, y_n \mid \mu, \sigma, \nu) \propto \prod_{i=1}^{n} \frac{1}{\sigma} \left(1 + \frac{(y_i - \mu)^2}{\nu\sigma^2}\right)^{-(\nu+1)/2}.$$

Assuming a non-informative prior $g(\mu, \sigma) \propto 1/\sigma$, the posterior density is given by

$$p(\mu, \sigma \mid y_1, \dots, y_n, \nu) \propto \frac{1}{\sigma} \prod_{i=1}^{n} \frac{1}{\sigma} \left(1 + \frac{(y_i - \mu)^2}{\nu\sigma^2}\right)^{-(\nu+1)/2}.$$

To obtain posterior draws of $\mu$ and $\sigma$, we can implement a Gibbs sampler. First, we need to introduce new scale parameters $\lambda_i$ and represent each observation $y_i$ as a scale mixture of normals. The new representation of the model is

$$y_i \mid \mu, \sigma, \lambda_i \sim \text{Normal}(\mu, \sigma/\sqrt{\lambda_i})$$
$$\lambda_i \mid \nu \sim \text{Gamma}(\nu/2, \nu/2)$$
$$p(\mu, \sigma) \propto 1/\sigma,$$

where $\nu$ is assumed fixed and known.

(a) [3 marks] Reparameterise the model in terms of $\sigma^2$ instead of $\sigma$, and write out the joint posterior density of all parameters $(\mu, \sigma^2, \lambda_i)$ ($i = 1, \dots, n$).

(b) [2 marks] Derive the full conditional distribution of $\lambda_i$ ($i = 1, \dots, n$) given $\mu$, $\sigma^2$ and $\nu$.

(c) [2 marks] Derive the full conditional distribution of $\mu$ given $\lambda_i$ ($i = 1, \dots, n$), $\sigma^2$ and $\nu$.

(d) [2 marks] Derive the full conditional distribution of $\sigma^2$ given $\mu$, $\lambda_i$ ($i = 1, \dots, n$) and $\nu$.

(e) [3 marks] Assume $\nu = 4$.
Write some code in R (or another computing package) to implement your Gibbs sampling algorithm.

(f) [2 marks] Obtain 95% posterior interval estimates for $\mu$ and $\sigma^2$.

(g) [3 marks] Show some convergence diagnostics for your Gibbs sampling algorithm.

Problem 6 [STAT4116/STAT7016 ONLY] [15 marks]

Consider the problem of comparing proportions from two binomial distributions, $\theta_1$ and $\theta_2$. We observe $y_1$ distributed as Binomial($n_1, \theta_1$) and $y_2$ distributed as Binomial($n_2, \theta_2$). We want to derive the posterior distributions of $\theta_1$ and $\theta_2$. Let’s consider the case of dependent priors for $\theta_1$ and $\theta_2$. That is, knowledge of the value of $\theta_1$ may influence the prior belief about the location of the second proportion $\theta_2$.

For example, the Australian Technical Advisory Group on Immunisation (ATAGI) has recommended that Pfizer is the preferred vaccine for people aged 60 and under. So what proportion of people under age 60 are willing to jump the Pfizer queue and get the alternative AstraZeneca vaccine, which is more readily available? Let $\theta_1$ denote the proportion of people aged 30-39 who are willing to get the AstraZeneca vaccine. Let $\theta_2$ denote the proportion of people aged 40-49 who are willing to get the AstraZeneca vaccine. Because we are considering adjacent age groups, the vaccine preferences of people in the first age group may affect the vaccine preferences of people in the second age group and vice versa. That is, the belief that $\theta_1$ is close to, say, 7% might lead us to believe that the value of $\theta_2$ is also close to 7%. This belief implies the use of dependent priors for $\theta_1$ and $\theta_2$.

What are the options for a dependent prior? Howard (1998) proposed a special form of dependent prior between $\theta_1$ and $\theta_2$, expressed as follows. First, consider a logit transformation of the parameters $\theta_1$ and $\theta_2$. That is, define

$$\gamma_1 = \log\frac{\theta_1}{1-\theta_1} \quad \text{and} \quad \gamma_2 = \log\frac{\theta_2}{1-\theta_2}.$$

To model the dependency, let $\gamma_2 \mid \gamma_1 \sim \text{Normal}(\text{mean} = \gamma_1, \text{stdev} = \sigma)$.
Howard (1998) proposed the following general form of the dependent prior:

$$p(\theta_1, \theta_2) \propto e^{-\frac{1}{2}u^2}\, \theta_1^{\alpha-1} (1-\theta_1)^{\beta-1}\, \theta_2^{\kappa-1} (1-\theta_2)^{\delta-1},$$

where $u = \frac{1}{\sigma}(\gamma_1 - \gamma_2)$.

(a) [3 marks] Explain the role of each of the hyperparameters $(\alpha, \beta, \kappa, \delta, \sigma)$.

(b) [2 marks] Is the joint prior $p(\theta_1, \theta_2)$ defined above a conjugate prior? Explain why or why not.

(c) [6 marks] Suppose the following data are observed from a sample of 30-39 year old people and 40-49 year old people in Canberra on their willingness to receive an AstraZeneca vaccine.

Age group   Yes   No   Total
30-39         3   15      18
40-49         1   11      12

Write an R function to compute the value of the posterior density $p(\theta_1, \theta_2 \mid y_1, y_2)$. Draw contour plots of the posterior distribution for values of the parameter $\sigma$ = 2, 1, 0.5 and 0.25 respectively.

(d) [4 marks] For each of the four assumed values of $\sigma$ in part (c), compute the posterior probability that $\theta_1 > \theta_2$ (that is, $\Pr(\theta_1 > \theta_2 \mid y_1, y_2)$) by simulating samples from the posterior distribution $p(\theta_1, \theta_2 \mid y_1, y_2)$.
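
To make the form of the dependent prior in Problem 6 concrete, the sketch below evaluates the logarithm of its unnormalised kernel at a single point. It is written in Python purely for illustration (part (c) asks for an R function), it covers the prior only and not the posterior, and the function names and the hyperparameter values $\alpha = \beta = \kappa = \delta = 1$ are assumptions chosen for the example, not values given in the assignment.

```python
import math


def logit(theta):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(theta / (1.0 - theta))


def howard_log_prior(theta1, theta2, alpha, beta, kappa, delta, sigma):
    """Log of the unnormalised dependent prior p(theta1, theta2)."""
    u = (logit(theta1) - logit(theta2)) / sigma
    return (
        -0.5 * u ** 2
        + (alpha - 1.0) * math.log(theta1)
        + (beta - 1.0) * math.log(1.0 - theta1)
        + (kappa - 1.0) * math.log(theta2)
        + (delta - 1.0) * math.log(1.0 - theta2)
    )


# Placeholder hyperparameters (alpha = beta = kappa = delta = 1),
# chosen only for illustration.
for sigma in (2.0, 1.0, 0.5, 0.25):
    value = howard_log_prior(0.2, 0.1, 1.0, 1.0, 1.0, 1.0, sigma)
    print(sigma, round(value, 4))
```

Working on the log scale avoids numerical underflow when the kernel is evaluated over a fine grid of $(\theta_1, \theta_2)$ values. The kernel is largest along $\theta_1 = \theta_2$, where $u = 0$, and smaller values of $\sigma$ penalise departures from that line more heavily.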