经典案例 > R代写 >

Statistics 471 – Homework 3 Part I

Due March 27, 2019

51作业君 | 最靠谱的留学生CS代写之一 | 自营菁英辅导团队 | 为您专业代写CS作业,我们高度保障客户隐私,提供Java/Python/C/C++代写,金融代写,quiz代写,assignment代写,project代写,代修网课,远程助考,CS课程辅导服务; 自营团队无中介环节,沟通交流零障碍,北美学霸, 非中介良心CS代写, 专注诚信代写Computer Science留学生作业代写; 专业CS程序/CS作业代写, C/C++/JAVA/python/assignment/算法/web/安卓/Operation System/AI/Machine Learning/R代写全覆盖, 业务遍及北美/澳洲/英国/新西兰/加拿大作业代写, PhD实力代写, 轻松A+  

Homework Assignment Policy and Guidelines

  1. Homework assignments should be well organized and reasonably neat. It is required that you show your work in order to receive credit.

  2. Unless otherwise stated in a problem, please use R Markdown to write homework answers.

  3. Homework assignments are due in class unless otherwise noted. Credit will not be given for homework turned in late.

  4. You may be asked to submit some homework problems online, in addition to a hard

    copy that you turn in in class. Such homework problems will be marked online submission . Your submission should be combined into one PDF or HTML document.

  5. Unless it is specifically stated otherwise, you may work on and submit your homework in groups of 1 or 2. If you choose to work as a group of 2, both of you should con- tribute significantly to the solution for every question and submit only one copy of the homework with both your names on it. Whether you submit on your own or with a partner, discussing homework with your fellow students is encouraged. However, after discussions, every group must ultimately produce their own homework to be graded. Verbatim copying of homework is absolutely forbidden.

  6. For simulation studies, please refrain from reporting intermediate results; report out- puts that are most relevant to the questions.

  1. online submission Write a program to implement the secant root-finding method:

    xn1 xn

    1. xn+1 = xn f (xn) f (x


      n1

      .

      ) f (xn)


      1. First test your program by finding the root of the function f (x) = cos(x) x.

      2. Next see how the secant method performs in finding the root of f (x) = log(x)exp(x) using x0 = 1 and x1 = 2. Compare its performance with that of the other two methods.

      3. Write a function secant show.r that plots the sequence of iterates produced by the secant algorithm.


  2. online submission In the textbook (Jones et al), the Newton Raphson code is given. I reproduce it here for convenience. I strongly suggest playing about with this code and looking at example inputs / outputs to see what this function does, before continuing on.

1. newtonraphson <- function(ftn, x0, tol = 1e-9, max.iter = 100) { 2.

  1. # Newton_Raphson algorithm for solving ftn(x)[1] == 0

  2. # we assume that ftn is a function of a single variable that returns

  3. # the function value and the first derivative as a vector of length 2 6. #

  1. # x0 is the initial guess at the root

  2. # the algorithm terminates when the function value is within distance

  3. # tol of 0, or the number of iterations exceeds max.iter 10.

11. # initialise

12. x <- x0

  1. fx <- ftn(x)

  2. iter <- 0

  3. # continue iterating until stopping conditions are met 16.

17. while ((abs(fx[1]) > tol) && (iter < max.iter)) { 18. x <- x - fx[1]/fx[2]

  1. fx <- ftn(x)

  2. iter <- iter + 1

  3. cat("At iteration", iter, "value of x is:", x, "\n")

22. }

23. # output depends on success of algorithm 24.

  1. if (abs(fx[1]) > tol) {

  2. cat("Algorithm failed to converge\n")

  3. return(NULL)

  4. } else {

  5. cat("Algorithm converged\n")

  6. return(x)

    31. }

    32. }

    We are going to use the Newton Raphson Algorithm as a baseline when looking at another root finding method: Halley’s method.

    1. Modify lines 29-30 of the function newtonraphson to give the output x together with the number of iterations needed for the algorithm to converge. You may add addi- tional lines of code. The output can be in any format you want. Call this function newtonraphson1.

    2. In addition to the first derivative, Halley’s method requires second derivative informa- tion as well. Thus, if xn is our current solution then:

      f (xn)

      n

      xn+1 = xn f t(x

      ) (f (xn

      )f tt(xn

      )/2f t(xn))

      Write a function halley1 that takes in four inputs:

      halley1<-function(ftn, x0, tol, max.iter){


      # ftn - a function of a single variable that returns

      # the function value, the first derivative, and the

      # second derivative as a vector of length 3.


      # x0 is the initial guess at the root


      # the algorithm terminates when the function value is within distance

      # tol of 0, or the number of iterations exceeds max.iter


      return(c(root_x, num_iter))

      }


      The output is a vector of length 2. If the algorithm does not converge, then root x should be NA, and num iter = -1. Else, root x should be the approximate root x returned by the algorithm, and num iter the number of iterations it takes for conver- gence.

    3. In both newtonraphson1 and halley1, we had an input ftn that was a function of a single variable which returns the function value, the first derivative, and the second derivative (for halley1). However, this means we would have to write a new function for every f (x) we have. For example, if we wanted to find the root of f (x) = x2 using halley1, we would have to write:

      ftnxsquare<-function(x){ fx = x^2

      dfx = 2*x ddfx = 2

      return(c(fx,dfx,ddfx))

      }


      and then run:

      results = halley1(ftnxsquare, 5, 1e-6, 100)


      Let’s see if we can simplify this process by resorting to calculus. Recall that we have the definition of:


      f t(x) = lim

      s0

      f (x + s) f (x) s

      Using this idea, write a function FindDeriv of the following form:

      FindDeriv<-function(ftn, x, eps = ...){


      # a ftn of a single variable f(x), which takes in x

      # and returns f(x), f’(x) and f’’(x)


      return(c(fx,dfx,ddfx))

      }

      where eps is an optional parameter. The function should return a vector of ftn’s output at x, the first derivative at x, and second derivative at x.

    4. Write newtonraphson2 and halley2 where the respective derivatives are analytically computed.

    5. We now compare how these four methods do in root finding.

      For each question, report how fast each method takes to converge (number of itera- tions), report the root it converges to, report the true root, for all of the given starting values of x0.

      You are encouraged to present the answers in a format which allows easy comparison.


      1. The square root of 26 with x0 = 1, 3, 6

        (ii) 2x 3x + 1 with x0 = 2, 0, 2

        (iii) x3 7x2 + 14x 8 with x0 = 1.1, 1.2, . . . , 1.9

    6. Based on the answers you get in Part (e), which method would you prefer to use in general cases of root finding? Are there any methods you would prefer to use in specialized cases? Explain your answers.


  1. This problem involves data from the Challenger space shuttle which exploded after takeoff on January 28, 1986. The data in “shuttle.csv” contains three variables: flight, ndo and temp. The variable flight is simply a code for the launch numbers (prior to Jan. 28, 1986) and can be ignored in what follows. The variable ndo is the number of damage o-rings, which was determined after the solid rocket boosters were retrieved from the Atlantic ocean following each launch. The variable temp is the temperature at the time of each launch. Let yi be an indicator of o-ring damage during the ith launch and let πi = P (yi = 1). Consider the following logistic regression model for the probability of o-ring damage:

    πi

    ln

    1 − πi

    = β0 + β1xi ,


    where xi is the temperature at the ith launch.


    1. Assuming all the launches are independent (which seems reasonable), derive the log- likelihood functions for the regression parameter vector β = (β0, β1).

    2. Derive formulas for the gradient vector and the Hessian matrix of the log-likelihood function.

    3. online submission Write a function oringNR to implement the Newton-Raphson algorithm to determine the maximum likelihood estimates of the regression coefficients

      β0 and β1. Use the starting values β0 = ln

      1

      and β1 = 0, and report the number of

      iterations required for the algorithm to converge.

      Your function should be of this form:

      oringNR = function(filename){ data = read.csv(filename)


      # Code here return(histOR)

      }


      You can assume that filename is a name of a CSV file. Function checking will comprise of using CSV files similar to the provided shuttle.csv file.

      histOR should be a matrix with 3 columns. The first column should be the iteration number, second column the values of β0 at each iteration, and the third column the values of β1 at each iteration.

    4. online submission In order to make inference about the parameters, we consider bootstrap. Use the function NewtonRaphson2 in Notes #11 to obtain the maximum likelihood estimates of β0 and β1 for each bootstrap sample. Visualize the empirical

      bootstrap distributions for

      βˆ0 and

      βˆ1 respectively. Comment on the shape of the

      distributions. Report the bootstrap estimates of the standard errors and the 95%

      bootstrap Confidence Intervals for both parameters. Compare your results with the outputs produced by the glm function. What do you conclude?

    5. online submission Construct a plot of the probability of o-ring damage as a function of temperature. What is the predicted probability of o-ring damage when temperature at launch is 31◦F?


  2. Plateau models are used in agriculture to estimate the optimal amount of nitrogen (N ) to apply to soil to increase crop yields (Y ). The linear plateau model is given by

    Y = β0 + β1 min(N, Nmax) + E

    where E denotes random error, and Nmax > 0 is the nitrogen content beyond which there is no improvement (increase) in yield.

    The following data was collected on nitrogen amount and yield of a particular crop.

    data.frame("nitrogen"= c(rep(0,4),rep(30,4),rep(60,4),rep(90,4),rep(120,4)),

    "yield"=c(1.41,1.75,2.02,2.13,1.93,2.24,2.29,2.35,2.12,2.38,

    2.49,2.57,2.16,2.20,2.28,2.49,2.34,2.45,2.59,2.62))

    The first four yield values correspond to zero nitrogen application, the second four correspond to nitrogen values of 30, etc..

    1. online submission Fit a simple linear regression model to the data. Display the fitted model on a scatterplot of the data.

    2. Determine the form of the gradient vector of the model function

      f (x, θ) = β0 + β1 min(x, Nmax)

      with respect to the parameter θ = (β0, β1, Nmax). (Hint: For Nmax consider the two cases x < Nmax and x > Nmax.)

    3. online submission Write a function LPmodel to fit the linear plateau model using the Gauss-Newton method discussed in Notes #11. Use the estimated coefficients from your simple linear regression fit as starting values for β0 and β1, and max(N ) − 5 as the starting value for Nmax.

      LPmodel = function(Y,N,tol=1e-5,maxiter=100){

      ## compute starting values inside the function

      ## del=change in sum of squares between iterations return(list(b0,b1,Nmax,iter,del))

      }

      (Note: you can check your answer using the nls function.)

    4. online submission Display the fitted model on the scatterplot. Use different colors for the simple linear regression and linear plateau model fits and indicate which is which using a legend.


  3. Let X1, X2, . . . , Xn be a random sample from a Poisson distribution with mean µ.

    1. Write down the likelihood function for µ. Show that the sample mean, maximum likelihood estimator for µ.

      X¯n, is the


    2. The score statistic for testing H0 : µ = µ0 against a two-sided alternative is

      . .

      .2

      2

      1. S =

      i Xi − nµ0

      √nµ0

      =

      .

      . X¯n µ0 .

      ,µ0/n

      Explain why this statistic has a chi-squared distribution (approximately) if H0 is true. Determine its degrees of freedom.

    3. What is the corresponding likelihood ratio statistic?

    4. Show that the endpoints of a confidence interval for µ obtained by inverting the score test are the solutions of a quadratic equation. Show that the endpoints are always positive unless the all the data values are zero.


  4. See hw3-part2.pdf.


51作业君 51作业君

扫码添加客服微信

点击跳转淘宝店