Assignment #3 STA457H1F/2202H1F Due Friday, November 15, 2024 Instructions: Solutions to problems 1 and 2 are to be submitted on Quercus (PDF files only). You are strongly encouraged to do problems 3 and 4 but these are not to be submitted for grading. 1. Data on the number of monthly traffic fatalities in Ontario from 1960 to 1974 are contained in a file fatalities.txt on Quercus. You may want to analyze the logs of the data. (a) Use the function stl to seasonally adjust the data. (Some details on stl are available using the help facility in R – help(stl) – and more in the paper by Cleveland et al. on Quercus. The two key tuning parameters in stl are s.window and t.window, which control the num- ber of observations used by loess in the estimation of the seasonal and trend components, respectively; these parameters must be odd numbers. For example, > fatal <- scan("fatalities.txt") > fatal <- ts(fatal,start=c(1960,1),end=c(1974,12),freq=12) > r <- stl(fatal,s.window = 3, t.window = 51, robust=T) > plot(r) > r <- stl(fatal,s.window = 5, t.window = 61, robust = T) > plot(r) > r <- stl(fatal,s.window = "periodic", t.window = 41, robust = T) # seasonal periodic > plot(r) The option robust = T allows one to better see anomalous observations or outliers in the irregular component. (b) For one of set of parameter values used in part (a), look at the estimated irregular component. Does it look like white noise? Would you expect it to look like white noise? (c) The function stl estimates trend, seasonal, and irregular components. Other seasonal adjustment procedures can also estimate a calendar component in order to reflect variation due to the number of weekend days, holidays etc. For these data, do you think a calendar component would be useful? 2. The file speech.txt contains a “speech record” of a person saying the syllable ahh; this was sampled at 10000 points per second. (These data represent a subset of a larger data set.) The data can be read into R as follows: > speech <- ts(scan("speech.txt"),frequency=10000) The argument frequency=10000 reflects that fact that we have 10000 measurements per second and we are using seconds as the unit of measurement and Hertz (cycles per second) as the unit of frequency. (a) Autoregressive spectral density estimates can be computed using spec.ar. For example, > r <- spec.ar(speech,order=10,method="burg") will give an estimate obtained by fitting an AR(10) model to the time series (using Burg’s estimates) while > r <- spec.ar(speech,method="burg") will use AIC to choose the AR order (using Yule-Walker estimates). Again play around with different AR orders and compare them to the estimates in parts (b) and (c) below. (b) Estimate the spectral density function using the multitaper method; the function spec.mtm is available in the package multitaper. The library can be loaded as follows: > library(multitaper) The two key parameters in spec.mtm are nw, the time-bandwidth parameter, and k, the number of tapers used to construct the estimate; k should be less than 2× nw. Try different values of nw (and corresponding k). (c) An R function spec.parzen is available for doing spectral density estimation using Parzen’s lag window. For example, > speech <- ts(scan("speech.txt"),frequency=10000) > r <- spec.parzen(speech,lag.max=60,plot=T) will compute the estimate using M = 60 (lag.max=60); the plot will give approximate pointwise 95% confidence intervals for the spectral density function. Play around with different values of M to see how the estimates change with M . (d) Which frequencies (measured in Hertz or cycles per second) seem to be most dominant? 3. Suppose that {Xt} is an ARIMA(p, d, q) process and define X̂n+s to be the best linear predictor of Xn+s given Xn = xn, Xn−1 = xn−1, · · · Then σ2(s) = σ2 s−1∑ u=0 ψ2u where σ2 is the white noise variance and {ψu} are defined by the identity ∞∑ u=0 ψuz u = 1 + ∑q u=1 βuz u (1− z)d(1−∑pu=1 ϕuzu (a) Suppose that {Xt} is an ARIMA(1,1,1) process with σ2 = 1, ϕ1 = 0.95 and β1 = 0.2. Evaluate σ2(s) for s = 1, · · · , 10. (You may want to write a simple program in R to do the computations.) (b) Suppose that {Xt} is an ARMA(p,q) process whose parameters satisfy the usual condi- tions. Show that σ2(s)→ Var(Xt) as s→∞. (Hint: Note that Xt can be written as Xt = µ+ ∞∑ u=0 ψuεt−u where {εt} is white noise.) 4. Let {Xt} be an MA(1) process Xt = εt + βεt−1 where {εt} is a zero mean white noise process, and |β| < 1. Let X̂t be the best linear predictor of Xt based on X1, · · · , Xt−1. (a) Show that X̂2 = β 1 + β2 X1. (b) Show that X̂n+1 = β θn (Xn − X̂n) where θ1 = 1+β 2 and θn = 1+β 2−β2/θn−1. (Note: This is quite difficult; you may be able to use the Levinson algorithm although it may be easier to work directly from the definition of the best linear predictor using the fact that ρ(s) = 0 for s ≥ 2.) (c) Show that limn→∞ θn = 1. (Hint: You can use the following “fixed point theorem”: Let f be a continuous function on some interval interval [a, b] with a ≤ f(x) ≤ b for a ≤ x ≤ b. Define a sequence {θn} such that θn = f(θn−1). If |f ′(x)| ≤ k < 1 for a < x < b and a ≤ θi ≤ b for some i then limn→∞ θn = θ0 where f(θ0) = θ0.) 51作业君版权所有