GU4206-GR5206 Final Exam Practice Problems Student (UNI) Part 1: Warm Up Recall the strikes dataset from Week 8 lecture notes. This data set, compiled by Bruce Western has information on 18 countries over 35 years. strikes <- read.csv("strikes.csv", as.is = TRUE) dim(strikes) ## [1] 625 8 head(strikes, 3) ## country year strike.volume unemployment inflation left.parliament ## 1 Australia 1951 296 1.3 19.8 43 ## 2 Australia 1952 397 2.2 17.2 43 ## 3 Australia 1953 360 2.5 4.3 43 ## centralization density ## 1 0.3748588 NA ## 2 0.3751829 NA ## 3 0.3745076 NA In this problem, you will use the variable unemployment. The goal of this exercise is to take the following inelegant nested-loop and condense it to as few of lines of code as possible. For full credit, perform the same task in no more that two lines of code. countries <- unique(strikes$country) unemployment_statistic <- rep(NA,length(countries)) counter <- 1 for (c in countries) { one_country <- strikes[strikes$country==c,] unemployment_temp <- one_country[,c("unemployment")] stat <- 0 for (i in 1:length(unemployment_temp)){ stat <- stat + unemployment_temp[i]/length(unemployment_temp) } unemployment_statistic[counter] <- stat counter <- counter+1 } data.frame(country=countries,unemployment=unemployment_statistic) ## country unemployment ## 1 Australia 3.5057143 ## 2 Austria 2.5400000 ## 3 Belgium 3.6466667 1 ## 4 Canada 6.0428571 ## 5 Denmark 5.7114286 ## 6 Finland 2.5714286 ## 7 France 3.1828571 ## 8 Germany 3.1171429 ## 9 Ireland 7.7714286 ## 10 Italy 6.7257143 ## 11 Japan 1.6028571 ## 12 Netherlands 3.6914286 ## 13 New.Zealand 1.0028571 ## 14 Norway 1.4285714 ## 15 Sweden 2.1371429 ## 16 Switzerland 0.3285714 ## 17 UK 3.4514286 ## 18 USA 5.5428571 Write your solution below. ## Solution goes here ------- Part 2: Simulation The goal of this section is to study the random variable X = U1 + U2, where U1 and U2 are independent uniform random variables, each over the unit interval. This section of the final exam has five major components; simulate X = U1 + U2 directly, plot the pdf of X = U1 + U2, use the inverse-transform method to simulate X, use the accept-reject method to simulate X, and use the simulated random variable X to compute a Monte-Carlo integral of the function b(x) = sin(sin(x)), over [0, 2]. Perform the following tasks: Part 2.i) Simulate 10,000 cases of the random variable X = U1 + U2 directly by using runif() for U1 and U2 and then adding the resulting vectors. Display the first 10 simulated values of X and use ggplot to construct a histogram of the distribution of X. Use a Base R plot for partial credit. ## Solution goes here ------- Part 2.ii) Using ggplot, plot the pdf of X over the range [−1, 3]. The pdf is given by the following piecewise function: f(x) = x 0 ≤ x < 1 2− x 1 ≤ x < 2 0 otherwise For guidance on defining and plotting a piecewise function, please see Part 2.v. 2 ## Solution goes here ------- Part 2.iii) Use the inverse-transform method to simulate 10,000 draws of X. For full credit, perform the following tasks: a) Define a R function called F.cdf.inv which is the inverse of the cdf F (x). Test your function using the points .1,.45,.7. More specifically, run the code F.cdf.inv(F.cdf(c(.1,.45,.7))), where F.cdf() is the cdf. b) Use the inverse-transform method to simulate 10,000 draws of X and display the first 10 simulated draws. c) Plot the 10,000 simulated cases using ggplot, i.e., plot a histogram. Also overlay the pdf f(x) on the the histogram. For partial credit, use Base R graphics. Note that the cumulative distribution function (cdf) of f(x) is F (x) = 0 x < 0 1 2x 2 0 ≤ x < 1 − 12x2 + 2x− 1 1 ≤ x < 2 1 x ≥ 2 Note that the function F (x) appears to be non-invertible because of the quadratic terms. However, the function is invertible over the domain of each piecewise interval. Hint: y = ax2 + bx+ c −→ x = − b2a ± √ 1 a (y − c) + ( b 2a )2 Part 2.iii.a) ## Solution goes here ------- Part 2.iii.b) ## Solution goes here ------- Part 2.iii.c) ## Solution goes here ------- Part 2.iv) Use the accept-reject method to simulate 10,000 draws of X. For full credit, perform the following tasks: a) Clearly identify an easy to simulate distribution g(x). b) Identify a suitable envelope function e(x) that satisfies f(x) ≤ e(x) = g(x)/α, where 0 < α < 1. c) Simulate 10,000 draws from the target distribution f(x) using the accept-reject algorithm. Display the first 10 simulated values. 3 d) Using ggplot or Base R, construct a histogram of the simulated distribution with the pdf f(x) overlayed on the plot. Part 2.vi.a) Part 2.vi.b) ## Solution goes here ------- Part 2.iv.c) ## Solution goes here ------- Part 2.iv.d) ## Solution goes here ------- Part 2.v) The goal of this section is to use the simulated cases coming from target distribution f(x) to numerically integrate b(x) via Monte-Carlo. You must draw directly from f(x) by using any of the previous three methods. The integral of interest is:∫ 2 0 b(x)dx, where b(x) = { sin(sin(x)) 0 < x < 2 0 otherwise To visualize b(x), see the code below. x <- seq(-1,3,by=.01) n.x <- length(x) b <- function(x) { out <- ifelse((x<=0)|(x>2),0,sin(sin(x))) return(out) } plot_data <- data.frame(x=x,b=b(x)) library(ggplot2) ggplot(data = plot_data) + geom_abline(slope=0,intercept=0,linetype = "dashed")+ geom_line(mapping = aes(x = x, y = b),color="red")+ labs(title = expression(sin(sin(x))),y="b(x)") 4 0.0 0.2 0.4 0.6 0.8 −1 0 1 2 3 x b(x ) sin(sin(x)) ## Solution goes here ----- Part 3: MLE and Bayes’ Posterior MCMC Data Generating Process and Dataset: Consider the simple linear regression model: Yi = β0 + β1xi + i, where i = 1, 2, . . . , 100 and i iid∼ N(0, σ2 = 4). Note that we are interested in estimating β0, β1 and we are assuming known variance σ2 = 4. The dataset of interest is MLE_MCMC_Problem.csv. The scatter plot follows: MCMC.data <- read.csv("MLE_MCMC_Problem.csv") plot(MCMC.data$x,MCMC.data$y,xlab="x",ylab="y") 5 3 4 5 6 7 4 6 8 10 12 14 16 x y Estimation Procedure 1: Maximum Likelihood The goal of this section is to estimate parameters β0 and β1 using maximum likelihood. The likelihood function is L(β|y1, . . . , yn) = L(β0, β1|y1, . . . , yn) = 100∏ i=1 f(yi|β0, β1). The function f(yi|β0, β1) comes directly from our simple linear regression model. Perform the following tasks: Part 3.i) Define the regular likelihood function L(β0, β1|y1, . . . , y100) in R. This should be a function of the vector( β0 β1 ) . Name the likelihood function my.likeliood and evaluate your likelihood at the point c(0,0). ## Solution goes here # my.likeliood(beta=c(0,0)) Part 3.ii) Define the negative log-likelihood function −`(β0, β1|y1, . . . , y100) in R. This should be a function of the vector ( β0 β1 ) . Name the likelihood function my.NL.likeliood. Evaluate your negative log-likelihood at the point c(0,0). ## Solution goes here # my.NL.likelihood(beta=c(0,0)) 6 Part 3.iii) Check that your answer from problem (2) matches problem (1) by taking evaluating: #exp(-1*my.NL.likelihood(beta=c(0,0))) # Compare #my.likelihood(beta=c(0,0)) Part 3.iv) Use the nlm() function to compute the maximum likelihood estimates of β0, β1. Use the starting value p=c(0,0). Compare your answer to the estimated coefficients produced from the linear model function lm(). Note: the least squares and maximum likelihood estimates are analytically the same in this scenario. ## Solution goes here Estimation Procedure 2: Bayes’ MCMC The goal of this section is to estimate parameters β0 and β1 using Bayesian techniques. Students are required to preform the famous MCMC approach to simulate from target posterior distribution pi(β0, β1|y1, . . . , y100). The empirical Bayes’ estimators βˆ0,B and βˆ1,B can then be computed from the simulated posterior. Note that both posteriors are simulated simultaneously. To solve this problem, we fist choose a prior distribution pi(β0, β1) and we must also set up the likelihood function L(θ|y1, . . . , yn) = L(β0, β1|y1, . . . , yn) = 100∏ i=1 f(yi|β0, β1). The likelihood was solved in problem (1). Choice of Prior: In ordinary linear regression, the objective function for least squares or maximum likelihood has no constraints placed on our parameters. Thus the unknown parameters β0, β1 can theoretically take on any real numbers. Consequently, a first natural choice of our prior should be a joint distribution that allows for real-valued coefficients. In our setting, choose the prior distribution pi(β0, β1) to be bivariate normal with pdf defined by: pi(β) = pi(β0, β1) = (2pi)−1 det(Σ)−1/2 exp ( − 12(β − µ) TΣ−1(β − µ) ) , where β ∈ R2, and β = ( β0 β1 )T , µ = ( µ0 µ1 )T , Σ = ( σ20 ρσ0σ1 ρσ0σ1 σ 2 1 ) . This prior has parameters µ0, µ1, σ0, σ1, ρ. The following code, which is hidden in the knitted markdown pdf, is the joint pdf of the bivariate normal. The density is plotted for parameter values µ0 = 0, µ1 = 0, σ0 = 1, σ1 = 1, ρ = 0. Note: the code displayed (or hidden) below will not be used in the MCMC algorithm. Recall that the proposal density, in our case g(β0, β1) = pi(β0, β1), cancels out in the Metropolis-Hastings ratio. This code is included to give students a visual representation of our bivariate normal prior. 7 u1 −2 0 2 4u2 −2 0 2 4 f(u) 0.0 0.2 MCMC Setup Suppose that a pilot study provided some prior knowledge to our application. Namely, the true intercept should be near zero and the true slope is likely between 1 and 3. Choose the prior pi(β0, β1) as bivariate normal with parameters µ0 = 0, µ1 = 2, σ0 = .25, σ1 = .5, ρ = 0. To simulate from proposal pi(β0, β1), use the following function sim.bivariate.norm. library(MASS) sim.bivariate.norm <- function(n,mu0=0,mu1=0,sigma0=1,sigma1=1,rho=0) { mu <- c(mu0,mu1) Sigma <- matrix(c(sigma0^2,sigma0*sigma1*rho,sigma0*sigma1*rho,sigma1^2), nrow=2) return(mvrnorm(n = n, mu=mu, Sigma=Sigma)) } Perform the following tasks: Part 3.v) Simulate 1 case and 10 cases from our prior pi(β0, β1). Recall the choice of parameters: µ0 = 0, µ1 = 2, σ0 = .25, σ1 = .5, ρ = 0. ## Solution goes here Part 3.vi) Run the MCMC algorithm using the bivariate normal prior with µ0 = 0, µ1 = 2, σ0 = .25, σ1 = .5, ρ = 0. Run 10,000 MCMC iterations and display the first 20 simulated cases of your simulated chain β(t). ## Solution goes here 8 Part 3.vii) Construct traceplots (or lineplots) of the simulated chains as a function of their iterations. There should be two plots, one for β0 and one for β1. ## Solution goes here Part 3.viii) Remove the first 1000 simulated cases (10% burn-in) and display the simulated posterior pi(β0, β1|y1, . . . , y100). To accomplish this, simply plot two histograms, one for β0 and one for β1. Also calculate the posterior Bayes’ estimators for both β0 and β1. ## Solution goes here Part 3.ix) Based on our prior information (β0 ≈ 0 and 1 < β1 < 3), choose a prior distribution that exhibits poor mixing behavior. Note that you should still choose a bivariate normal for prior pi(β0, β1). Clearly specify your choice of the prior and run 10,000 iterations of the MCMC algorithm. Also provide traceplots showing the chain’s mixing behavior. Solution ## Solution goes here Part 3.x) Remove the first 1000 simulated cases (10% burn-in) and display the simulated posterior pi(β0, β1|y1, . . . , y100). To accomplish this, simply plot two histograms, one for β0 and one for β1. Also calculate the posterior Bayes’ estimators for both β0 and β1. ## Solution goes here Part 3.xii) Compare your Bayes estimators (for both priors) to the least squares estimator (or MLE). To assess the quality of your estimators, compute the Euclidean norm: ‖βˆ − β‖. Let the true coefficients be defined as β = ( 0 2 )T . ## Solution goes here 9
欢迎咨询51作业君