Simulation, Semester II 2021-2022 ST3247: Assignment 21 • Due date: April 10th (Sunday of Week 12) • There are THREE questions in this question sheet. • Your answer sheet should contain answers to ALL questions. Do not leave answers in your code file. Marking mainly depends on your answer sheet. • Please submit an answer sheet with an R file in the submission folder separately, and name them with the same name. Please do NOT use compressed files. • The R file should contain your coding work as a supplementary file. Please make it ready for reproductions. • For all the questions, you can only use standard uniform random variables unless there are particular statements in the problem. 1. Suppose X1, X2, · · · , Xn ∼ N(0, θ) where θ ∼ Gamma(3, 0.5), that p(θ) = θ2e−θ/2 23Γ(3) , θ > 0. Note. For this question, you can use any random variable generation function in R, such as rnorm(), rgamma() and so on. You can also use gamma() to calculate Γ(x) for any x. (a) Find the posterior distribution that p(θ|x1, x2, · · · , xn), subject to a constant cp. (b) We are interested in the mean of the posterior distribution, which is E = ∫ ∞ 0 θ ∗ p(θ|x1, x2, · · · , xn)dθ. The posterior distribution is hard to sample even with the rejection algorithm. So we want to use the importance sampling algorithm. Suppose we have the random variables Y1, Y2, · · · , Ym ∼ q, where q is a distribution. Please write out the estimation in terms of m, p(θ) = p(θ|x1, x2, · · · , xn), q, and Yi’s. 1All rights reserved by NUS. Reproduction or distribution of lecture notes/tutorials/quiz/exam without the written permission of the sponsor is prohibited. 1 (c) To get rid of the constants, let p = cpp0 and q = cqq0, where cp and cq are constants and p0, q0 are kernel functions. Prove that∫ f(θ)p(θ) q(θ) q(θ)dθ = ∫ f(θ)p(θ) q(θ) q(θ)dθ∫ p(θ) q(θ)q(θ)dθ = ∫ f(θ)p0(θ) q0(θ) q(θ)dθ∫ p0(θ) q0(θ) q(θ)dθ . (d) According to part (b) and (c), find the estimation in terms of m, p0, q0, and Yi’s. Further, express them in terms of log p0, log q0 and Yi’s. (e) We set the function q to be Gamma(cs2, c), i.e. g(θ) = ccs 2 θcs 2−1e−cθ Γ(cs2) , θ > 0. Here, s2 is the sample variance and c > 0 is a parameter to decide. We use c to make the function flat. With the density functions of p and q, write R functions to calculate log p0 and log q0. (f) With all the previous definitions of p, q and the discussion, write a pseudo-code to estimate E. (g) Generate 100 data points with distribution N(0, 5), denoted by x1, x2, · · · , x100. It means the ground truth is θ = 5. With your generated data, run your code to estimate E with c ranges from 0.04 to 20 with step size 0.04, m = 1000. Draw one plot of Eˆ versus c and comment. 2 2. Messages arrive at a communications facility in accordance with a Poisson process having a rate of λ = 2 per hour. The facility consists of three channels, and an arriving message will either go to a free channel if any of them are free or else will be lost if all channels are busy. The amount of time that a message ties up a channel is a random variable that depends on the weather condition at the time the message arrives. Specifically, if the message arrives when the weather is “good,” then its processing time is a random variable having distribution function F (x) = x, 0 < x < 1 whereas if the weather is “bad” when a message arrives, then its processing time has distribution function F (x) = x3, 0 < x < 1. Initially, the weather is good, and it alternates between good and bad periods— with the good periods having fixed lengths of 2 hours and the bad periods having fixed lengths of 1 hour. (Thus, for example, at time 5 the weather changes from good to bad.) Suppose we are interested in the distribution of the number of lost messages by time T = 100. (a) Set up a vector with length T , that decide the weather at time point 1 ≤ t ≤ T . (b) Define the events and variables that enable us to use the DES approach. (c) Write the pseudo code of how to update the variables for each iteration (d) Implement the R code. Run your program to give an interval estimate with 95% confidence of the expected number of lost messages in the first 100 hours of operation. (No need to include code here.) (e) Now I want to check the code under a simplified case. For the following, assume that we are always in GOOD weather and there is only ONE channel. i. What is the inter-arrival time between two messages? ii. The status of this channel will be Busy, Idle,Busy, Idle, · · · . Busy means it is processing a message and Idle means there is no message under processing now. Let tB denote the length of one busy period. What is the distribution of tB? What is E[tB]? iii. Let tI be the length of one idle period. What is the distribution of tI? Hint. tI is the time until next arriving message. iv. The messages arrived during the busy period are lost. Let N denote the number of lost messages. Let TB be the total busy time of the channel until T = 100. 3 Prove that E[N ] = λE[TB]. v. According to the property of stochastic processes, in the long run, the proportion of all busy periods is approximately E[tB]/(E[tB] + E[tI ]). Let TB be the total time that the channel is busy until T = 100. What is E[TB]? vi. Find E[N ] according to parts (iv) and (v). vii. Adjust your code for this case. Run 1000 simulations for T = 100. Compare your results with the results in part (vi). 4 3. Let Q be a symmetric matrix, that is, qij = qji for all i, j. It satisfies the requirements for a transition probability matrix, that every entry is non-negative and the row sum is 1. Consider a Markov chain based onQ, where the transitions due to the following procedure, Step 1. Generate Y ∼ qi, where i is the current state. qi is the distribution defined by i-th row of Q. Step 2. Generate U ∼ Unif(0, 1). Step 3. If U ≤ bYbi+bY , set next step to be Y Step 4. Otherwise, set next step to be the current state i. Here, bj ’s are specified positive numbers. (a) Say X0 = i, what is the distribution of X1? (b) Find the transition probability matrix of the current Markov Chain. (c) Actually this MC is time reversible. Please set up the local balanced equations and find the corresponding limiting distribution pi. (d) Consider the distribution on the permutations of (1, 2, 3, · · · , 10). The probability of a permutation (x1, x2, · · · , x10) is that P ((x1, x2, · · · , x10)) = C ∗ 10∑ i=1 xi/i, where C is the normalising constant. Please identify a procedure according to the results of this problem. Identify Q and bi’s, and then write down the pseudo code. (e) With the pseudo code in part (d), please run simulations to find the expectation of∑10 i=1 i ∗ xi. Write down the steps you get your samples (take that in part (d) as a single function) and the result. 5
欢迎咨询51作业君