Programming Assignment Example - R2
M3S9/M4S9 Stochastic Simulation (2017)

1. Consider a region in $\mathbb{R}^2$ defined by
$$C_h = \left\{ (u, v) \in \mathbb{R}^2 : 0 \le u \le \sqrt{h\!\left(\tfrac{v}{u}\right)} \right\},$$
where $h(x) \ge 0$ for all $x \in \mathbb{R}$ and $\int_{\mathbb{R}} h(x)\,dx < \infty$.

(a) (i) Show that $C_h$ has finite area.
(ii) Show that, if the bivariate vector $(U, V)$ is distributed uniformly on $C_h$, then the random variable $X = V/U$ has probability density function
$$f_X(x) = \frac{h(x)}{\int_{\mathbb{R}} h(y)\,dy}.$$

Now define
$$h(x) = \begin{cases} \exp\{-\lambda x^2\} & x \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$

(b) Calculate values for $a$, $b$ and $c$ such that $R_h = \{ (u, v) \in \mathbb{R}^2 : 0 \le u \le a,\ b \le v \le c \}$ is the smallest rectangle satisfying $C_h \subseteq R_h$.

The following pseudo-code outlines a Ratio-of-Uniforms procedure for sampling from $f_X(x)$, which uses squeezing in Steps 4 and 5:

1. Choose values for the tuning parameters $\alpha$ and $\beta$.
2. Generate $U_1, U_2 \sim U(0, 1)$.
3. Set $U = aU_1$, $V = b + (c - b)U_2$, where $a$, $b$ and $c$ are given in part (b).
4. If $V \le T_1(U; \alpha)$, set $X = V/U$.
5. If $V \ge T_2(U; \beta)$, return to Step 2.
6. If $V \le T_3(U)$, set $X = V/U$; otherwise return to Step 2.

(c) (i) Derive the expression $T_3(U)$, which appears in Step 6.
(ii) By using a Taylor series expansion for $\exp(y)$, derive expressions for the squeezing functions $T_1(U; \alpha)$ and $T_2(U; \beta)$.

(d) Explain how the above Ratio-of-Uniforms procedure could be used to generate a homogeneous Poisson process of rate $\lambda$.
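As a concrete illustration, here is a minimal Python sketch of the Question 1 procedure for this choice of $h$. The rectangle bounds are my own working for part (b), and the log-free squeeze bounds below come from the elementary inequalities $1 - u \le -\log u \le (1 - u)/u$ on $(0, 1]$, not from the Taylor-series $T_1, T_2$ the exam asks for; the function name is my own.

```python
import numpy as np

def rou_halfnormal(lam, n, rng=None):
    """Ratio-of-Uniforms sampler for f_X(x) proportional to exp(-lam*x^2), x >= 0.

    A minimal sketch of the Question 1 procedure; the squeeze bounds used
    here are log-free alternatives, not the exam's Taylor-series T1, T2.
    """
    rng = np.random.default_rng() if rng is None else rng
    a, b = 1.0, 0.0                        # a = sup sqrt(h(x)) = 1 (at x = 0); v >= 0 so b = 0
    c = 1.0 / np.sqrt(lam * np.e)          # c = sup x*sqrt(h(x)), attained at x = 1/sqrt(lam)
    out = np.empty(n)
    i = 0
    while i < n:
        u = a * rng.random()
        v = b + (c - b) * rng.random()
        if u == 0.0:                       # avoid division by zero in v/u
            continue
        # Exact test is v <= T3(u) = u*sqrt(-2*log(u)/lam).  Squeeze it:
        # quick accept, since -log(u) >= 1 - u on (0, 1]:
        if v <= u * np.sqrt(2.0 * (1.0 - u) / lam):
            out[i] = v / u
            i += 1
            continue
        # quick reject, since -log(u) <= (1 - u)/u on (0, 1]:
        if v >= np.sqrt(2.0 * u * (1.0 - u) / lam):
            continue
        # fall through to the exact (expensive) test T3:
        if v <= u * np.sqrt(-2.0 * np.log(u) / lam):
            out[i] = v / u
            i += 1
    return out

# Example: 10,000 draws with lam = 2.
x = rou_halfnormal(2.0, 10_000)
```

The point of Steps 4 and 5 is visible here: most proposals are resolved by the two cheap squeeze tests, so the logarithm in the exact test $T_3$ is evaluated only rarely.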
2. Consider the problem of estimating the quantity $\eta = E_{f_X}[X]$, where $X$ is a random variable with some known probability density function $f_X(\cdot)$. Suppose that $\hat\eta_1$ and $\hat\eta_2$ are two distinct, unbiased and negatively correlated estimators for $\eta$, both of which have variance $\sigma^2$.

(a) Show that the arithmetic mean of $\hat\eta_1$ and $\hat\eta_2$ is an unbiased estimator with variance less than $\sigma^2/2$.

(b) Prove that if $U \sim U(0, 1)$, and $\phi : [0, 1] \to \mathbb{R}$ is a monotonic increasing function, then $\phi(U)$ and $\phi(1 - U)$ will be negatively correlated.
Hint: you may wish to consider defining a point $t \in [0, 1]$ such that $1 - t = \inf\{ u : \phi(u) > E[\phi(U)] \}$.

Suppose now that $X$ has a truncated Cauchy distribution, i.e.
$$f_X(x) = \begin{cases} \dfrac{k}{\pi(1 + x^2)} & 0 \le x \le 1 \\ 0 & \text{otherwise.} \end{cases}$$

(c) Calculate the constant of proportionality, $k$.

(d) (i) State an expression for an unbiased estimator for $\eta$, in terms of an i.i.d. sample $X_1, \dots, X_n \sim f_X$; denote this estimator $\hat\eta_3$.
(ii) Now express your estimator $\hat\eta_3$ in terms of an i.i.d. sample $U_1, \dots, U_n \sim U(0, 1)$.
(iii) Derive expressions for two further, different unbiased estimators for $\eta$. These estimators should be written as a function of an i.i.d. sample $U_1, \dots, U_n \sim U(0, 1)$.
(iv) State how the variance of each of your estimators in (d)(iii) compares with the variance of $\hat\eta_3$.

Suppose we use Importance Sampling to estimate $\eta$, choosing an auxiliary density $g(x)$.

(e) Derive the Importance Sampling estimator $\hat\eta_{IS}$ when the auxiliary density corresponds to a $U(0, 1)$ distribution.

3. The central Laplace distribution has probability density function
$$f_X(x) = \frac{1}{2\sigma} \exp\left\{ -\frac{|x|}{\sigma} \right\}, \qquad x \in \mathbb{R},$$
and it is known that this is also the distribution of the random variable
$$X = \begin{cases} -\sigma \log(1 - 2|W|) & W \ge 0 \\ \sigma \log(1 - 2|W|) & W < 0, \end{cases}$$
where $W \sim U\left(-\tfrac{1}{2}, \tfrac{1}{2}\right)$.

(a) Suppose we have access to a stream of $n + 1$ i.i.d. Exponentially-distributed random variables $Y_1, \dots, Y_{n+1} \sim \text{Exp}(1)$.
(i) By defining $S_k = \sum_{i=1}^{k} Y_i$, $k = 1, \dots, n + 1$, show that the joint density of $S = (S_1, \dots, S_{n+1})$ can be written $f_S(s_1, \dots, s_{n+1}) = \exp\{-s_{n+1}\}$.
(ii) Using part (i), and defining $V_k = S_k / S_{n+1}$, $k = 1, \dots, n$, and $V_{n+1} = S_{n+1}$, show that the joint density of $V = (V_1, \dots, V_{n+1})$ can be written $f_V(v_1, \dots, v_{n+1}) = \exp\{-v_{n+1}\}\, v_{n+1}^n$.
(iii) Using part (ii), show that $V_1, \dots, V_n$ can be treated as a set of $n$ ordered $U(0, 1)$ random variates.
(iv) Hence, briefly explain how we can use $Y_1, \dots, Y_{n+1}$ to obtain an ordered sample $X_{(1)}, \dots, X_{(n)} \sim f_X$.

The President of the United States of America tweets the following: "I've just observed an ordered data set $x_{(1)}, \dots, x_{(n)}$ from the central Laplace distribution. The ratio $x_{(n)}/x_{(1)}$ is $-0.5$. So central! Amazing!"

In order to fact-check this tweet, you decide to carry out a two-tailed Monte Carlo test with Type I error $\alpha$, using the observed test statistic $t = x_{(n)}/x_{(1)}$. Your null hypothesis, $H_0$, is that the observed data are distributed according to $f_X(x)$. The distribution of $T = X_{(n)}/X_{(1)}$ under $H_0$ is unknown but continuous. You generate $m$ distinct ordered datasets from $f_X$, and use these to obtain simulated values $t^*_1, \dots, t^*_m$ of the test statistic. These simulated values are then sorted, to provide an ordered set of simulated test statistics, which we denote $t^*_{(1)}, \dots, t^*_{(m)}$.

(b) Give expressions for $k_1$ and $k_2$ such that we would reject $H_0$ if either $t < t^*_{(k_1)}$ or $t > t^*_{(k_2)}$.

(c) Show that the Monte Carlo test rejects $H_0$ with probability
$$p = \sum_{r=0}^{k_1 - 1} \binom{m}{r} \left\{ u^r (1 - u)^{m-r} + u^{m-r} (1 - u)^r \right\},$$
where $u = P(T \le t)$.

Having implemented the Monte Carlo test, suppose we do not reject $H_0$. Define $T_i = t^*_i$, $i = 1, \dots, m$, and $T_{m+1} = t$, such that $T = (T_1, \dots, T_{m+1})$ is a vector of $m + 1$ i.i.d., unordered replicates of the test statistic $T$, and denote the corresponding sample mean by $\bar{T}$.

(d) Explain how Jackknife resampling can be used to estimate the variance of $\bar{T}$. Provide an expression for the Jackknife estimator for the variance of $\bar{T}$.
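Looking back at Question 2, here is a short Python sketch of the antithetic-variates idea behind parts (b) and (d). It takes the normalising constant to be $k = 4$ (which follows from $\int_0^1 (1 + x^2)^{-1}\,dx = \pi/4$), so the inverse CDF of the truncated Cauchy is $F^{-1}(u) = \tan(\pi u / 4)$; since this map is monotonic increasing on $[0, 1]$, part (b) guarantees the antithetic pair is negatively correlated. The name `phi` is my own.

```python
import numpy as np

rng = np.random.default_rng(42)

# Inverse CDF of the truncated Cauchy on [0, 1]: F(x) = (4/pi) * arctan(x),
# hence F^{-1}(u) = tan(pi*u/4); phi is monotonic increasing on [0, 1].
def phi(u):
    return np.tan(np.pi * u / 4.0)

n = 100_000
u = rng.random(n)

eta3 = np.mean(phi(u))                            # plain Monte Carlo estimator, eta_hat_3
eta_av = np.mean(0.5 * (phi(u) + phi(1.0 - u)))   # antithetic-variates estimator

exact = 2.0 * np.log(2.0) / np.pi                 # eta = E[X] = 2*log(2)/pi ~ 0.4413
print(eta3, eta_av, exact)
```

Repeating this over many independent replications would show the antithetic estimator's variance sitting below half that of the plain estimator per pair of draws, in line with part (a).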
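And for Question 3(a)(iv), the following sketch is one possible implementation: $n + 1$ Exp(1) draws give $n$ ordered $U(0, 1)$ variates via $V_k = S_k / S_{n+1}$, which are then pushed through the transformation quoted at the start of the question. Both branches of that piecewise definition are combined into one line with `np.sign`; the function name is my own.

```python
import numpy as np

def ordered_laplace_sample(n, sigma=1.0, rng=None):
    """One possible implementation of Question 3(a)(iv):
    n+1 Exp(1) draws -> n ordered U(0,1) variates -> ordered Laplace sample."""
    rng = np.random.default_rng() if rng is None else rng
    y = rng.exponential(1.0, size=n + 1)   # Y_1, ..., Y_{n+1} ~ Exp(1)
    s = np.cumsum(y)                       # partial sums S_1 < ... < S_{n+1}
    v = s[:-1] / s[-1]                     # V_k = S_k/S_{n+1}: n ordered U(0,1) variates
    w = v - 0.5                            # ordered W ~ U(-1/2, 1/2)
    # X = -sigma*log(1-2|W|) for W >= 0 and +sigma*log(1-2|W|) for W < 0;
    # this map is increasing in W, so the ordering of the V_k is preserved.
    return -np.sign(w) * sigma * np.log1p(-2.0 * np.abs(w))

x_ord = ordered_laplace_sample(10)         # x_ord[0] = X_(1), ..., x_ord[-1] = X_(10)
```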
4. Suppose we wish to sample from a distribution with pdf $\pi(x)$, defined on a continuous univariate state space $\mathcal{X} \subseteq \mathbb{R}$.

(a) We can sample from $\pi(x)$ using the following static (i.e. non-adaptive) Metropolis-Hastings procedure to construct a Markov chain $\{X_n\}_{n \ge 0}$ with $\pi(x)$ as its stationary distribution:

Step 1: Initialise the chain by sampling from some initial distribution: $X_0 \sim \pi^{(0)}$. Set $n = 1$.
Step 2: Given $X_{n-1} = x$, generate a candidate value $Y = y$ from the proposal density
$$q_h(y \mid x) = \frac{1}{h}\, k\!\left(\frac{y - x}{h}\right), \qquad x, y \in \mathcal{X},\ h > 0.$$
Step 3: Set $X_n = y$ with probability $\alpha(x, y)$; otherwise set $X_n = x$.
Step 4: Replace $n$ by $n + 1$ and return to Step 2.

(i) Give an expression for $\alpha(x, y)$ and state the property of the resulting Markov chain that guarantees the existence of a stationary distribution.
(ii) What condition must $k(\cdot)$ satisfy in order for the above Metropolis-Hastings procedure to reduce to a Metropolis procedure?
(iii) The given proposal density, $q_h$, allows the user to control the performance of the Metropolis procedure through the fixed parameter $h$. Assuming the condition in part (ii) holds, describe and explain the effect on the acceptance probability $\alpha(x, y)$ that comes from decreasing $h$.

Suppose we are now interested in using adaptive MCMC methods to sample from the multivariate distribution with pdf $\pi_d(x)$, which is defined on the continuous state space $\mathcal{X}^d \subseteq \mathbb{R}^d$, $d > 1$.

(b) The Adaptive Metropolis (AM) algorithm is a version of the Metropolis procedure that uses the following proposal density at iteration $n$:
$$q_n(y \mid x) = \begin{cases} f_N(y \mid x,\ 0.1^2 I_d / d) & n \le 2d \\ (1 - \beta)\, f_N(y \mid x,\ 2.38^2 S_n / d) + \beta\, f_N(y \mid x,\ 0.1^2 I_d / d) & n > 2d, \end{cases}$$
where $f_N(\cdot \mid \mu, \Sigma)$ is a multivariate Gaussian density with mean $\mu$ and covariance matrix $\Sigma$, and where $I_d$ is the $d$-dimensional identity matrix and $\beta \in (0, 1)$ is a user-specified constant.

(i) Describe the quantity denoted by $S_n$, and briefly explain why the value $2.38^2 / d$ is chosen to scale $S_n$ in the above density.
(ii) Briefly explain why the non-adaptive density $f_N(y \mid x, 0.1^2 I_d / d)$ is included in the proposal mechanism.

(c) As an alternative to the AM procedure in part (b), one can employ component-wise adaptive scaling methods, such as the Single Component Adaptive Metropolis (SCAM) algorithm and the Adaptive Metropolis-within-Gibbs (AMWG) algorithm. Briefly explain the primary difference between the SCAM and AMWG procedures.

(d) Suppose that, at the start of iteration $n$, the existing output of an adaptive Metropolis procedure is denoted $\mathbf{x}_n = (x'_0, \dots, x'_{n-1}) \in \mathcal{X}^{d \times n}$, where $x_j = (x^1_j, \dots, x^d_j) \in \mathcal{X}^d$ for $j = 0, \dots, n - 1$, and where $x'_j$ denotes the transpose of $x_j$. Denote the row vector containing only component $i$ of this output by $\mathbf{x}^i_n = (x^i_0, \dots, x^i_{n-1}) \in \mathcal{X}^n$, and define the sample mean $\bar{x}^i_n$ and sample variance $g^i_n$ for this univariate component of the output in the usual way:
$$\bar{x}^i_n = \frac{1}{n} \sum_{j=0}^{n-1} x^i_j, \qquad g^i_n = \frac{1}{n - 1} \sum_{j=0}^{n-1} \left( x^i_j - \bar{x}^i_n \right)^2.$$
Show that $g^i_n$ satisfies the following relationship:
$$g^i_n = \frac{n - 2}{n - 1}\, g^i_{n-1} + \left( \bar{x}^i_{n-1} \right)^2 + \frac{1}{n - 1} \left( x^i_{n-1} \right)^2 - \frac{n - 1}{n} \left( \bar{x}^i_{n-1} + \frac{x^i_{n-1}}{n - 1} \right)^2.$$

(e) Roberts & Rosenthal ("Examples of Adaptive MCMC", 2009) use integrated autocorrelation time and average square jump distance as indicators of performance for adaptive MCMC procedures.
(i) Which aspect of performance is being measured by these quantities?
(ii) For each of these indicators, state whether good performance of the procedure is indicated by high or low values.

EXAMINATION SOLUTIONS 2016-17, Course M3/4S9

Question 1

Part (a)(i) [seen]
$$\text{Area}(C_h) = \iint_{C_h} du\, dv.$$
Use a change of variables $(U, V) \to (U, X = V/U)$: $V = XU \implies \frac{dv}{dx} = u$, i.e. $dv = u\, dx$. So,
$$\text{Area}(C_h) = \int_{\mathbb{R}} \left[ \int_0^{\sqrt{h(x)}} u\, du \right] dx = \frac{1}{2} \int_{\mathbb{R}} h(x)\, dx < \infty.$$

Part (a)(ii) [seen]
The Jacobian for the above change of variables is $\dfrac{\partial(u, x)}{\partial(u, v)}$
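As a quick numerical sanity check on the recursion in Question 4(d) (under the subscript convention above, where $\bar{x}^i_n$ and $g^i_n$ summarise the first $n$ values), the following snippet compares the recursive update against recomputing the variance from scratch; the synthetic "chain output" is just Gaussian noise standing in for one component of an MCMC trace.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)        # stand-in for one component's chain output x_0, ..., x_{199}

for n in range(3, len(x) + 1):
    xbar_prev = x[:n - 1].mean()        # mean of x_0, ..., x_{n-2}
    g_prev = x[:n - 1].var(ddof=1)      # sample variance of x_0, ..., x_{n-2}
    x_new = x[n - 1]                    # the newest point, x_{n-1}
    g_rec = ((n - 2) / (n - 1)) * g_prev + xbar_prev**2 \
            + x_new**2 / (n - 1) \
            - ((n - 1) / n) * (xbar_prev + x_new / (n - 1))**2
    # The O(1) recursive update agrees with the direct O(n) computation:
    assert np.isclose(g_rec, x[:n].var(ddof=1))
```

This constant-cost update is what makes per-iteration adaptation of the proposal scale cheap in component-wise schemes such as SCAM and AMWG.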