Semester 1 - Final, 2020 STAT4102/8002 Applied Time Series Analysis

Page 1 of 7
Student Number u____________________

Research School of Finance, Actuarial Studies and Statistics
FINAL EXAMINATION
Semester 1, 2020

STAT4102/8002 Applied Time Series Analysis

Release Time: 9:25am, 23 June 2020 (Canberra Time)
Start Time: 9:30am, 23 June 2020 (Canberra Time)
Duration: 240 minutes
End of submission time: 1:30pm, 23 June 2020 (Canberra Time)

Exam Conditions:
Centrally scheduled remote examination

Permitted Materials:
Any
Please prepare your own laptop, other electronic devices, scratch paper, etc.
Please install all the R packages used in this course when this cover sheet is released.

Instructions to Students:
1. This exam paper comprises a total of 7 pages. Please ensure your paper has the correct number of pages.
2. The exam includes a total of 5 questions. The questions are of unequal value, with marks indicated for
each question.
3. STAT4102 students must attempt all questions except Question 5. Maximum points: 73.
4. STAT8002 students must answer all questions. Maximum points: 78.
5. While you may use course material, computer software, the internet, or other resources, you must complete
the exam individually. Identical submissions, even for a single question, are treated as cheating
(similarity will be checked by Turnitin), and all students involved in those identical submissions will be
marked 0 for this final exam.
6. You can use any result, formula or statement from the course material without proof. In fact, doing so
will help you in the exam.
7. Please round all of your final numeric answers to four decimal places and please do not round in the
middle of the computation process.
8. Please include all working, as marks will not be awarded for answers that do not include it.
9. Please type, paste, and combine all of your answers in the answer sheet file on Wattle, and sign the
declaration form on the cover page. Please check the quality of the file when you are preparing the
submission, and in particular that the text on each scanned/photographed image is legible.
10. Primary mode of submission: Turnitin link (file size limit: 40MB; Turnitin will only accept a file in which some, but
not all, pages are scanned). Backup mode of submission: see “Final Exam Instruction” on Wattle.
11. Late submission is not accepted. Please prepare to submit your answer sheet at least 30 minutes before
the end of submission time.
12. Any queries about this exam can be sent to [email protected] during the exam period.
13. The significance level is set to be 0.05 throughout the exam.

Question 1 (13 points)

Part 1. (3 points) Consider the time series model $X_t = 3X_{t-1} + Z_t - 3.3Z_{t-1} + 0.9Z_{t-2}$,
where $\{Z_t\}$ is white noise (WN) with variance $\sigma^2$, denoted by $Z_t \sim \mathrm{WN}(0, \sigma^2)$. Is this model
causal? Why or why not?
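
One way to examine causality numerically is to inspect the roots of the AR and MA polynomials. The sketch below assumes the model reads $X_t = 3X_{t-1} + Z_t - 3.3Z_{t-1} + 0.9Z_{t-2}$, so that $\phi(z) = 1 - 3z$ and $\theta(z) = 1 - 3.3z + 0.9z^2$. The variable names are illustrative and not part of the exam.

```r
# Coefficients in increasing powers of z, as polyroot() expects.
ar_poly <- c(1, -3)          # phi(z)   = 1 - 3z (assumed reading of the model)
ma_poly <- c(1, -3.3, 0.9)   # theta(z) = 1 - 3.3z + 0.9z^2

ar_roots <- polyroot(ar_poly)
ma_roots <- polyroot(ma_poly)

# Moduli of the roots; a root inside the unit circle needs a closer look.
print(Mod(ar_roots))
print(Mod(ma_roots))

# Flag AR roots that also appear among the MA roots (a common factor).
shared <- sapply(ar_roots, function(r) any(Mod(ma_roots - r) < 1e-6))
print(shared)
```

A root shared by both polynomials indicates a factor that cancels from both sides of the model equation, so causality should be judged on the reduced model rather than on the polynomials as written.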

Part 2. (4 points) Consider the time series model $X_t = -X_{t-1} - \frac{3}{16}X_{t-2} + Z_t - Z_{t-1} + Z_{t-2}$,
with $Z_t \sim \mathrm{WN}(0, \sigma^2)$.

a) (2 points) Is this model invertible? Why or why not?

b) (2 points) Is $X_t$ stationary? Why or why not?

Part 3. (6 points) Consider the time series model (1 − 0.65()! = ! with !~WN(0, %$),
where $B$ is the backshift operator.

a) (2 points) Is this model causal? Why or why not?

b) (2 points) What is the value of the partial autocorrelation function (PACF) of $X_t$ at lag 7?

c) (2 points) What is the value of the partial autocorrelation function (PACF) of $X_t$ at lag 5?

Question 2 (20 points)

In this question, please input

> set.seed(uniID)

at the beginning of your R code, where uniID is your student number (without the “u”). Do not
call “set.seed()” again for the rest of the question. If you do not
follow this, you will be marked 0 for the whole question.

Part 1. (9 points)
a) (1 point) Using R, please simulate a time series $\{X_t : t = 1, 2, \dots, n\}$ from the model $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t$ with $Z_t \sim \mathrm{WN}(0, \sigma^2)$, where $n = 100$, $\phi_1 = -0.5$, $\phi_2 = -0.5$ and $\sigma^2 = 1$.
Please paste the R code (not R output) in the answer sheet.
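
A minimal sketch of such a simulation, assuming the model is the AR(2) process $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t$ with $\phi_1 = \phi_2 = -0.5$ and Gaussian white noise (one possible choice of WN). The value of `uniID` here is a hypothetical placeholder, not a real student number.

```r
uniID <- 1234567  # hypothetical student number; replace with your own
set.seed(uniID)

n   <- 100
phi <- c(-0.5, -0.5)

# arima.sim() draws Gaussian innovations by default; sd = 1 gives sigma^2 = 1.
x <- arima.sim(model = list(ar = phi), n = n, sd = 1)
```

`arima.sim()` discards a burn-in period internally, so the returned series of length `n` is approximately a draw from the stationary process.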

b) (2 points) Please use R to verify that the ordinary least squares (OLS) estimates of $\phi_1$
and $\phi_2$ (denoted by $\hat{\phi}_1$, $\hat{\phi}_2$) are the same as the minimum CSS estimates for the simulated
time series $\{X_t : t = 1, 2, \dots, n\}$. Please paste the corresponding R code and R output/figures,
together with your written analysis to support your answer in the answer sheet. If you only
provide R code/output without any analysis or description in the answer sheet, you will be
marked 0 for this question.

c) (2 points) Using $\hat{\phi}_1$, $\hat{\phi}_2$ defined in b), consider the time series model $X_t = \hat{\phi}_1 X_{t-1} + \hat{\phi}_2 X_{t-2} + Z_t$. Is this model causal? Why or why not?

d) (2 points) Using the simulated time series $\{X_t : t = 1, 2, \dots, n\}$ in a), and the Yule-Walker
estimation results obtained by R, what is the one-step-ahead forecast of $X_{n+1}$? What is the
truncated two-step-ahead forecast of $X_{n+2}$? Please paste the corresponding R code and the
numeric answers in the answer sheet.

e) (2 points) What is the value of the sample autocorrelation function (SACF) at lag 2
for this simulated time series $\{X_t : t = 1, 2, \dots, n\}$? What is the value of the
sample partial autocorrelation function (SPACF) at lag 2 for this simulated
time series $\{X_t : t = 1, 2, \dots, n\}$? Please paste the corresponding R code and the numeric
answers in the answer sheet.

Part 2. (11 points)
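a) (1 point) Using R, please simulate a time series $\{X_t : t = 1, 2, \dots, n\}$ from a new model $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2}$ with $Z_t \sim \mathrm{WN}(0, \sigma^2)$, where $n = 100$, $\phi_1 = 0.6$, $\phi_2 = -0.09$, $\theta_1 = 0.3$, $\theta_2 = -0.1$, and $\sigma^2 = 1$. Please paste the R code (not R output) in the
answer sheet.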

b) (2 points) What is the value of the autocorrelation function (ACF) of $X_t$ at lag 3? What is the value of the partial autocorrelation function (PACF) of $X_t$ at lag 3? Please paste the corresponding R code and the numeric answers in the answer sheet.
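
Theoretical ACF and PACF values of an ARMA model can be obtained in base R with `ARMAacf()`. The sketch below assumes the ARMA(2,2) parameters are $\phi = (0.6, -0.09)$ and $\theta = (0.3, -0.1)$; the object names are illustrative.

```r
phi   <- c(0.6, -0.09)   # assumed AR coefficients
theta <- c(0.3, -0.1)    # assumed MA coefficients

# Theoretical ACF at lags 0..3 (a named vector with names "0", "1", "2", "3").
rho <- ARMAacf(ar = phi, ma = theta, lag.max = 3)

# Theoretical PACF at lags 1..3.
pacf_vals <- ARMAacf(ar = phi, ma = theta, lag.max = 3, pacf = TRUE)

print(round(rho["3"], 4))
print(round(pacf_vals[3], 4))
```

These are population quantities implied by the parameters, so they do not depend on the simulated sample or on the random seed.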

c) (2 points) Using the simulated time series $\{X_t : t = 1, 2, \dots, n\}$ in a), and the maximum
likelihood estimation (MLE) results obtained by R, what is the 95% confidence interval of #?
Please paste the corresponding R code and the numeric answers in the answer sheet.

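d) (2 points) Suppose the MLEs of $\phi_1$, $\phi_2$, $\theta_1$, $\theta_2$ and $\sigma^2$ are denoted by $\hat{\phi}_1$, $\hat{\phi}_2$, $\hat{\theta}_1$, $\hat{\theta}_2$ and $\hat{\sigma}^2$. The causal form of the model $X_t = \hat{\phi}_1 X_{t-1} + \hat{\phi}_2 X_{t-2} + Z_t + \hat{\theta}_1 Z_{t-1} + \hat{\theta}_2 Z_{t-2}$ with $Z_t \sim \mathrm{WN}(0, \hat{\sigma}^2)$ is denoted by $X_t = \sum_{j=0}^{\infty} \hat{\psi}_j Z_{t-j}$.
What are the values of $\hat{\psi}_0$, $\hat{\psi}_1$, $\hat{\psi}_2$ and $\hat{\psi}_3$? Please paste the corresponding R code and the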
numeric answers in the answer sheet.
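
The coefficients of a causal representation can be computed in base R with `ARMAtoMA()`. As an illustration only, the sketch below plugs in the assumed true parameter values $\phi = (0.6, -0.09)$ and $\theta = (0.3, -0.1)$ rather than the MLEs; in practice the estimated coefficients from the fitted model would be substituted.

```r
phi   <- c(0.6, -0.09)   # assumed AR coefficients (true values, not MLEs)
theta <- c(0.3, -0.1)    # assumed MA coefficients (true values, not MLEs)

# ARMAtoMA() returns psi_1, psi_2, ...; psi_0 = 1 is prepended by hand.
psi <- c(1, ARMAtoMA(ar = phi, ma = theta, lag.max = 3))
print(round(psi, 4))
```

The same values follow from the recursion $\psi_j = \theta_j + \sum_{k=1}^{\min(j,2)} \phi_k \psi_{j-k}$ with $\psi_0 = 1$ and $\theta_j = 0$ for $j > 2$.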

e) (2 points) If we additionally assume that the $Z_t$ are independent and normally distributed in the
model, using the simulated time series $\{X_t : t = 1, 2, \dots, n\}$ in a), and the MLE results obtained
by R, what is the 95% prediction interval of *+0? Please paste the corresponding R code
and the numeric answers in the answer sheet.

f) (2 points) Is the prediction interval that you obtain in e) accurate for the simulated time
series $\{X_t : t = 1, 2, \dots, n\}$ in a)? Why or why not? Please paste the corresponding R code and
R output/figures, together with your written analysis to support your answer in the answer
sheet. If you only provide R code/output without any analysis or description in the answer
sheet, you will be marked 0 for this question.

Question 3 (30 points, Data Competition)

Suppose now you are in a data competition organised by Kaggle
(https://www.kaggle.com/competitions). The dataset is the quarterly U.S. gross national
product (GNP) time series from 1947 (Quarter 1) to 2002 (Quarter 3), 223 observations. The
data are real U.S. GNP in billions of chained 1996 dollars. The data were obtained from the
Federal Reserve Bank of St. Louis (http://research.stlouisfed.org/). This dataset is stored in
the object “gnp” in the R library “astsa”, and is shown using the following R code:

> rm(list=ls())
> library(astsa)
> gnp
Qtr1 Qtr2 Qtr3 Qtr4
1947 1488.9 1496.9 1500.5 1524.3
1948 1546.6 1571.1 1577.6 1580.5
1949 1558.2 1553.6 1570.7 1553.9
1950 1618.4 1667.2 1733.1 1763.9
1951 1782.9 1814.9 1851.6 1855.8
1952 1876.7 1878.2 1889.9 1951.9
1953 1987.4 2004.3 1990.2 1958.6
1954 1949.7 1952.6 1973.7 2014.1
1955 2071.6 2104.3 2132.4 2143.9
1956 2136.4 2152.8 2150.8 2184.1
1957 2198.8 2195.0 2215.5 2189.2
1958 2131.0 2143.6 2190.9 2239.7
1959 2286.2 2345.5 2345.5 2354.1
1960 2405.4 2393.9 2398.9 2369.3
1961 2383.7 2427.1 2467.2 2517.5
1962 2561.0 2590.3 2615.7 2625.1
1963 2654.8 2688.2 2739.8 2760.3
1964 2823.2 2855.7 2894.7 2900.5
1965 2974.0 3014.6 3073.6 3144.5
1966 3222.6 3234.8 3254.7 3283.7
1967 3313.4 3310.7 3336.6 3360.8
1968 3429.2 3488.3 3513.4 3528.1
1969 3582.2 3590.6 3610.3 3593.3
1970 3589.1 3597.4 3628.3 3587.6
1971 3691.3 3712.8 3738.4 3749.2
1972 3823.4 3910.0 3950.7 4018.7
1973 4125.0 4168.3 4158.0 4192.5
1974 4168.1 4176.5 4126.5 4098.0
1975 4040.1 4075.6 4148.4 4206.7
1976 4304.2 4341.2 4362.0 4398.4
1977 4457.6 4535.9 4616.4 4616.6
1978 4636.0 4804.8 4854.6 4925.8
1979 4939.6 4949.3 4995.6 5011.4
1980 5028.8 4922.5 4911.3 4986.3
1981 5086.4 5048.1 5110.5 5056.8
1982 4969.4 4996.9 4963.4 4964.8
1983 5021.5 5142.2 5233.9 5342.0
1984 5452.6 5544.3 5591.1 5627.1
1985 5664.3 5710.9 5788.6 5839.6
1986 5887.3 5901.9 5959.0 5981.7
1987 6027.6 6095.8 6145.8 6254.1
1988 6302.0 6372.8 6402.0 6487.4
1989 6565.6 6599.7 6633.4 6663.4
1990 6743.6 6760.8 6742.6 6713.3
1991 6667.4 6692.1 6704.7 6749.4
1992 6811.1 6873.8 6923.3 7015.1
1993 7020.9 7056.0 7092.4 7182.1
1994 7249.8 7346.3 7385.1 7476.0
1995 7510.2 7528.6 7572.3 7645.2
1996 7703.1 7820.4 7853.5 7947.9
1997 8025.1 8145.6 8225.1 8276.9
1998 8405.4 8448.7 8517.6 8662.0
1999 8755.5 8801.8 8906.4 9071.1
2000 9119.7 9233.0 9238.2 9274.0
2001 9241.7 9224.3 9199.8 9283.5
2002 9367.5 9376.7 9477.9

Now suppose Kaggle releases the first $n = 210$ observations ($t = 1, 2, \dots, n$) to you so that you can
perform the time series analysis; denote those observations by $\{X_t : t = 1, 2, \dots, n\}$.
The remaining observations, denoted by $X_{n+1}, X_{n+2}, \dots, X_{n+13}$, are held out by Kaggle and
will be used to assess the forecast performance of the models you build.

Part 1. (4 points) Are the time series $\{X_t : t = 1, 2, \dots, n\}$ and $\{\log(X_t) : t = 1, 2, \dots, n\}$
stationary? Please state at least two reasons why each of $\{X_t : t = 1, 2, \dots, n\}$ and $\{\log(X_t) : t = 1, 2, \dots, n\}$ is stationary or not stationary. Please paste the corresponding R
code and R output/figures, together with your written analysis to support your answer in the
answer sheet. If you only provide R code/output without any analysis or description in the
answer sheet, you will be marked 0 for this question.

Part 2. (10 points) Please build proper ARIMA models for the time series $\{X_t : t = 1, 2, \dots, n\}$.
The resulting models should include

a) ARIMA$(0, d, q)$, $q \neq 0$;

b) ARIMA$(p, d, 0)$, $p \neq 0$;

c) ARIMA$(p, d, q)$, $p \neq 0$, $q \neq 0$; and

d) SARIMA$(p, d, q)(P, D, Q)_s$.

You can only use SACF, SPACF, SEACF or AIC to determine the orders $p$, $q$, $P$, $Q$ in the
above models. Please paste the corresponding R code and R output/figures, together with
your analysis to support how you build the above models in the answer sheet. Note that the
resulting four models should all pass the time series model diagnostics. In your answer,
please also use R code and R output to clearly indicate the orders and parameter estimates
for each of the four models, respectively. Model d) can also reduce to ARIMA$(p, d, q)$ if
proper reasons are provided. If you only provide R code/output without any analysis or
description in the answer sheet, you will be marked 0 for this question.

Part 3. (10 points) Please build proper ARIMA models for the time series $\{\log(X_t) : t = 1, 2, \dots, n\}$. The resulting models should include

a) ARIMA$(0, d, q)$, $q \neq 0$;

b) ARIMA$(p, d, 0)$, $p \neq 0$;

c) ARIMA$(p, d, q)$, $p \neq 0$, $q \neq 0$; and

d) SARIMA$(p, d, q)(P, D, Q)_s$.

You can only use SACF, SPACF, SEACF or AIC to determine the orders $p$, $q$, $P$, $Q$ in the
above models. Please paste the corresponding R code and R output/figures, together with
your analysis to support how you build the above models in the answer sheet. Note that the
resulting four models should all pass the time series model diagnostics. In your answer,
please also use R code and R output to clearly indicate the orders and parameter estimates
for each of the four models, respectively. Model d) can also reduce to ARIMA$(p, d, q)$ if
proper reasons are provided. If you only provide R code/output without any analysis or
description in the answer sheet, you will be marked 0 for this question.

Part 4. (3 points) Please use “auto.arima()” function from R package “forecast” to
automatically obtain an ARIMA$(p, d, q)$ or SARIMA$(p, d, q)(P, D, Q)_s$ model for the time series $\{X_t : t = 1, 2, \dots, n\}$. Does this fitted model pass the time series model diagnostics? Why or
why not? Please paste the corresponding R code and R output/figures, together with your
written analysis to support your answer in the answer sheet. In your answer, please also use
R code and R output to clearly indicate the orders and parameter estimates of the fitted
model that you obtain from “auto.arima()”, respectively. If you only provide R code/output
without any analysis or description in the answer sheet, you will be marked 0 for this
question.

Part 5. (3 points) Now suppose you submit all of your nine models obtained from Part 2,
Part 3 and Part 4 to Kaggle, and Kaggle will check which model performs the best for
forecasting the on-hold data $X_{n+1}, X_{n+2}, \dots, X_{n+13}$. For each submitted model,
Kaggle can compute the truncated $h$-step-ahead forecast $\tilde{X}_{n+h|n}$ for $h = 1, 2, \dots, 13$, and
then compute the averaged square prediction error (ASPE) $\mathrm{ASPE} = \frac{1}{13} \sum_{h=1}^{13} \left(X_{n+h} - \tilde{X}_{n+h|n}\right)^2$.
Based on this measure, which one of your nine models leads to the best forecast and wins
this data competition? Please paste the corresponding R code and R output/figures, together
with your analysis to support your answer in the answer sheet. If you only provide R
code/output without any analysis or description in the answer sheet, you will be marked 0 for
this question.
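
The ASPE measure can be written as a small helper function. This is a sketch only: the function name `aspe` and the toy numbers are hypothetical and are not taken from the GNP data.

```r
# Average squared prediction error between held-out values and forecasts.
aspe <- function(actual, forecast) {
  stopifnot(length(actual) == length(forecast))
  mean((actual - forecast)^2)
}

# Toy numbers, purely illustrative (not GNP data).
actual   <- c(1, 2, 3)
forecast <- c(1.5, 2, 2)
print(aspe(actual, forecast))
```

For a model fitted with `arima()`, forecasts for the 13 held-out quarters could be obtained with `predict(fit, n.ahead = 13)$pred`, where `fit` is a hypothetical fitted object. Forecasts from a model fitted to the logged series would need to be transformed back to the original scale before being compared with the held-out values.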

Question 4 (10 points)

Part 1. (2 points) Consider a general time series $\{X_t : t = 1, \dots, n\}$ and its single exponential
smoothing series $S_t = (1 - \alpha) X_t + \alpha S_{t-1}$, $t = 1, \dots, n$,

where $\alpha$ is the smoothing parameter, $|\alpha| < 1$, and define $S_0 = X_1$. Please use the iterating
backward method to give the expression of $S_n$ by using $\alpha$ and $\{X_t : t = 1, \dots, n\}$ in the answer
sheet. Please simplify this expression as much as you can.
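
The recursion can be checked numerically. The sketch below assumes the smoothing series is $S_t = (1 - \alpha) X_t + \alpha S_{t-1}$ with $S_0 = X_1$; the symbol $\alpha$, the function name `exp_smooth` and the toy data are assumptions made for illustration. It compares the iterated value $S_n$ with the sum obtained by substituting the recursion backward $n$ times.

```r
# Iterate S_t = (1 - alpha) * X_t + alpha * S_{t-1}, starting from S_0 = X_1.
exp_smooth <- function(x, alpha) {
  s_prev <- x[1]                 # S_0 = X_1
  s <- numeric(length(x))
  for (t in seq_along(x)) {
    s[t] <- (1 - alpha) * x[t] + alpha * s_prev
    s_prev <- s[t]
  }
  s
}

x     <- c(2, 4, 3, 5)   # toy data, purely illustrative
alpha <- 0.4             # hypothetical smoothing parameter
s     <- exp_smooth(x, alpha)

# Backward substitution: S_n = (1 - alpha) * sum_j alpha^j X_{n-j} + alpha^n S_0.
n      <- length(x)
closed <- (1 - alpha) * sum(alpha^(0:(n - 1)) * x[n:1]) + alpha^n * x[1]
print(all.equal(s[n], closed))
```

Agreement between the two values is only a sanity check on the algebra; the written answer still needs the expression derived and simplified by hand.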

Part 2. (8 points) Consider the time series model $X_t = X_{t-1} + Z_t + \theta Z_{t-1}$, with $Z_t \sim \mathrm{WN}(0, \sigma^2)$
and $|\theta| < 1$. Assume the parameters $\theta$ and $\sigma^2$ are known.

a) (4 points) Suppose that we observe data $\{X_t : t = 1, \dots, n\}$, which follow this model. Please
give the expression of the truncated one-step-ahead forecast $\tilde{X}_{n+1|n}$ by using $\theta$ and $\{X_t : t = 1, \dots, n\}$ in the answer sheet. Please simplify this expression as much as you can.

b) (2 points) Suppose we define a new forecast $\check{X}_{n+1} = S_n$,

where $S_n$ is derived in Part 1, and this forecast is called the single exponential smoothing
forecast. Comparing the expressions of $\check{X}_{n+1}$ and the truncated one-step-ahead forecast $\tilde{X}_{n+1|n}$ in a), what relationship do you find between $\check{X}_{n+1}$ and $\tilde{X}_{n+1|n}$?

c) (2 points) We have learned that the mean squared prediction error (MSPE) of the
truncated forecast can be approximated when $n$ is large. Please give the expression of the
MSPE approximation of the truncated one-step-ahead forecast $\tilde{X}_{n+1|n}$ in a) by using $\theta$ and $\sigma^2$ in the answer sheet. Please simplify the expressions as much as you can.

Question 5 (5 points, STAT8002 Only)

Students enrolled in STAT4102 should not answer this question (0 points for
STAT4102 students). This question is only compulsory for the students enrolled in
STAT8002.

Suppose we have data $\{X_t : t = 1, \dots, n\}$ which follow an invertible time series model $X_t = Z_t + \theta_1 Z_{t-1} + \theta_2 Z_{t-2} + \theta_3 Z_{t-3}$, where $Z_t \sim \mathrm{WN}(0, \sigma^2)$ with the $Z_t$ being independent and
normally distributed. If we want to obtain the conditional maximum likelihood estimate
(conditional MLE) based on data $\{X_t : t = 1, \dots, n\}$, and conditional on $Z_0 = Z_{-1} = Z_{-2} = 0$,
please give the expression of the conditional log-likelihood function of the parameters $\theta_1$, $\theta_2$, $\theta_3$
and $\sigma^2$ in the answer sheet. Please simplify this expression as much as you can. Please
also explain how to use the concentrated likelihood approach to obtain the conditional MLE
of the parameters $\theta_1$, $\theta_2$, $\theta_3$ and $\sigma^2$ in the answer sheet.

END OF EXAMINATION