Assignment #1 STA457H1F/2202H1F
due Friday September 27, 2024 (11:59pm)

Instructions: Solutions to problems 1–3 are to be submitted on Quercus (PDF files only). You are strongly encouraged to do problems 4 through 6 but these are not to be submitted for grading.

1. Daily Canadian/U.S. dollar exchange rates ($US/$CAN) from Jan. 2, 1997 to Dec. 29, 2000 are given in the file dollar.txt on Quercus. Analyze the data on the log-scale:

    > dollar <- scan("dollar.txt")
    > dollar <- ts(log(dollar))

Define the first differences of the time series as follows:

    > returns <- diff(dollar)

(a) Plot the correlogram and periodogram of the original data (i.e. dollar).

(b) Plot the correlogram and periodogram of the first differences.

(c) Comment on the results obtained in parts (a) and (b). In particular, how are the correlograms and periodograms different? A simple model for the logarithm of exchange rates is a random walk. Are the correlograms and periodograms in (a) and (b) consistent with this model?

(d) Now look at the correlogram and periodogram of the absolute values of the first differences (i.e. abs(returns)). Comment on the differences between the results for returns and abs(returns), in particular with respect to the applicability of the random walk model.

2. Average monthly concentrations of carbon dioxide (CO2) from March 1958 to July 2024 at Mauna Loa volcano in Hawaii are given in the file CO2.txt on Quercus. Read the data into R and create a time series object as follows:

    > CO2 <- scan("CO2.txt")
    > CO2 <- ts(CO2,start=c(1958,3),end=c(2024,7),frequency=12)

Thus CO2 has a start date of March 1958 and an end date of July 2024, while the argument frequency=12 indicates that there are 12 observations per year.

(a) Plot the periodogram of the time series. At what frequencies are there peaks? To which features of the time series do these peaks correspond? (Note that we have 12 observations per year so that \(\Delta = 1/12\) and so the Nyquist frequency is 6 cycles per year.)
(b) An estimate of the trend is given in the file CO2-trend.txt; you will need to construct a time series object for these data using the same approach as for the data in CO2.txt. Subtract the trend from the original data and look at the periodogram of the detrended data. Comment on the differences between the periodograms in parts (a) and (b). (It is useful here to overlay the two periodograms on the same plot.) In particular, how effective is the detrending in emphasizing the seasonality in the data?

3. Suppose that \(S\) is a smoothing matrix and \(x^T = (x_1, \cdots, x_n)\) is a time series so that \(\hat{x}_1 = S x\) is an estimate of a trend. Twicing is a procedure whereby \(S\) is applied to the residuals \(x - \hat{x}_1\) and the result is added to \(\hat{x}_1\) to produce a new trend estimate:

\[ \hat{x}_2 = \hat{x}_1 + S(x - \hat{x}_1) = \{I - (I - S)^2\} x \]

Twicing can be applied iteratively:

\[ \hat{x}_k = \hat{x}_{k-1} + S(x - \hat{x}_{k-1}) = \{I - (I - S)^k\} x \]

Twicing is a special case of boosting, which is a method used in machine learning.

Twicing can be applied to linear filtering. Suppose that \(\{X_t\}\) is a stationary stochastic process with spectral density function \(f_X(\omega)\) and define

\[ Y_t = \sum_{u=-\infty}^{\infty} c_u X_{t-u} \]

\[ Z_t = Y_t + \sum_{u=-\infty}^{\infty} c_u (X_{t-u} - Y_{t-u}) \]

(a) Show that the spectral density function of \(\{Z_t\}\) is

\[ f_Z(\omega) = \left| 1 - (1 - \Psi(\omega))^2 \right|^2 f_X(\omega) \]

where

\[ \Psi(\omega) = \sum_{u=-\infty}^{\infty} c_u \exp(2\pi i \omega u). \]

(Hint: The result of Problem 4 may be useful here.)

(b) In lecture, we showed that for the Hodrick-Prescott filter

\[ \Psi(\omega) = \frac{1}{1 + 16 \lambda \sin^4(\omega\pi)} \]

where \(\lambda > 0\) is a tuning parameter. Take \(\lambda = 100\) and suppose that

\[ Z_t = \sum_{u=-\infty}^{\infty} d_u X_{t-u} \]

Use the method shown in lecture to compute \(\{d_u\}\) for \(|u| \le 40\). If \(Y_t = \sum_{u=-\infty}^{\infty} c_u X_{t-u}\), how do \(\{d_u\}\) compare to \(\{c_u\}\)?

Note: Phillips & Shi (2019) introduced the boosted Hodrick-Prescott filter; a nice (but very technical) survey paper on this method by Mei, Phillips & Shi (2024) is on Quercus.

4. Let \(\{x_t\}\) and \(\{y_t\}\) be two infinite sequences of real numbers and define

\[ X(\omega) = \sum_{t=-\infty}^{\infty} x_t \exp(2\pi i \omega t) \quad \text{and} \quad Y(\omega) = \sum_{t=-\infty}^{\infty} y_t \exp(2\pi i \omega t) \]

to be their (discrete) Fourier transforms.
Define \(\{z_t\}\) to be the convolution of \(\{x_t\}\) and \(\{y_t\}\); that is,

\[ z_t = \sum_{u=-\infty}^{\infty} x_u y_{t-u}. \]

(a) If \(Z(\omega)\) is the Fourier transform of \(\{z_t\}\), show that \(Z(\omega) = X(\omega) Y(\omega)\).

(b) Suppose that \(\{x_t\}\) is a time series and \(\{y_t\}\) is defined by

\[ y_t = \sum_{u=0}^{p} c_u x_{t-u} \]

Then from (a), it follows that the periodograms of \(\{y_t\}\) and \(\{x_t\}\) are related by

\[ I_y(\omega) \approx |\Gamma(\omega)|^2 I_x(\omega) \]

where \(\Gamma(\omega) = \sum_{u=0}^{p} c_u \exp(2\pi i \omega u)\). How does this explain the behaviour of the periodograms in parts (a) and (b) of Question 1? (Hint: Take \(c_0 = 1\), \(c_1 = -1\).)

(c) Suppose that \(c_u > 0\) for \(u = 0, \cdots, p\) with \(c_0 + \cdots + c_p = 1\). How does the periodogram of \(\{y_t\}\) (defined in (b)) compare to that of \(\{x_t\}\)?

5. (a) Show that

\[ \sum_{k=0}^{n-1} \exp(i k \theta) = \sum_{k=0}^{n-1} [\exp(i\theta)]^k = \frac{1 - \exp(i n \theta)}{1 - \exp(i\theta)} \]

if \(\theta\) is not a multiple of \(2\pi\), and use this result to give simple expressions for

\[ \sum_{k=0}^{n-1} \cos(k\theta) \quad \text{and} \quad \sum_{k=0}^{n-1} \sin(k\theta). \]

(b) Let \(x_1, \cdots, x_n\) be a sequence of numbers (for example, a time series) and define the discrete Fourier transform

\[ X(\omega) = \sum_{t=1}^{n} x_t \exp(2\pi i \omega t). \]

Define \(\omega_k = k/n\) for \(k = 0, \cdots, n-1\). Show that

\[ x_s = \frac{1}{n} \sum_{k=0}^{n-1} X(\omega_k) \exp(-2\pi i \omega_k s). \]

6. An \(n \times n\) matrix \(A\) is said to be centrosymmetric if \(JA = AJ\) where

\[ J = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 1 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}. \]

An equivalent condition for \(A\) to be centrosymmetric is \(JAJ = A\).

(a) Suppose that \(A\) and \(B\) are \(n \times n\) centrosymmetric matrices. Show that \(A + B\) and \(AB\) are both centrosymmetric.

(b) Suppose that \(A\) is invertible and centrosymmetric. Show that \(A^{-1}\) is centrosymmetric.

(c) Define \(A = X(X^T X)^{-1} X^T\) to be the standard least squares projection matrix (also known as the “hat” matrix) where \(X\) is an \(n \times p\) matrix. For simplicity, assume that \(X\) has orthonormal columns so that \((X^T X)^{-1} = I\) and so \(A = X X^T\). If \(x_1, \cdots, x_p\) are the columns of \(X\), show that \(A\) is centrosymmetric if \(J x_k = \pm x_k\) for \(k = 1, \cdots, p\).
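
The definition in Problem 6 can be explored numerically before attempting the proofs. The following is only a minimal sketch in R, not part of the assignment and not a solution to any part of it: it builds the exchange matrix J and checks the condition JAJ = A on two example matrices. The helper name is_centrosymmetric, the tolerance tol, and the two test matrices are all assumptions chosen for illustration.

```r
# Illustrative check of the condition JAJ = A (not a proof of anything).
n <- 5

# Exchange matrix J: ones on the anti-diagonal, zeros elsewhere,
# obtained by reversing the columns of the identity matrix.
J <- diag(n)[, n:1]

# Hypothetical example 1: a symmetric Toeplitz matrix with
# entries 0.5^|i - j|, which depends on (i, j) only through |i - j|.
A <- toeplitz(0.5^(0:(n - 1)))

# Hypothetical example 2: a generic matrix filled column by column.
B <- matrix(1:(n^2), nrow = n)

# Hypothetical helper: TRUE if J M J equals M up to a small tolerance.
is_centrosymmetric <- function(M, J, tol = 1e-12) {
  max(abs(J %*% M %*% J - M)) < tol
}

is_centrosymmetric(A, J)  # symmetric Toeplitz example
is_centrosymmetric(B, J)  # generic example
```

Under these assumptions the first call should return TRUE and the second FALSE. A numerical check of this kind only confirms the definition on particular matrices; parts (a) to (c) still require a general argument.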