MATH6011 FORECASTING ASSIGNMENT 2020

Your coursework must be submitted electronically via Blackboard by 3pm on Friday

March 20th. Any work handed in after this time will be subject to the following

penalties: 10% of your marks lost per working day up to 5 working days. Do not write

your name anywhere on your work, as marking will be anonymous. Your student ID

should be included in the filenames but not your name; see further instructions on

file naming in Section 3 below. An extension, for bona fide reasons, may be allowed

by prior agreement, but only well before the deadline. Computer crashes or file losses

a day or two before the deadline will not be an acceptable reason. It is therefore

advisable to keep back-up copies of your work. Components of the project will receive

different weightings in producing your final mark: 50 marks for exponential smoothing,

20 for ARIMA, 20 for regression, and 10 for the description of the codes/files.

1. Background and analysis

In light of the current economic situation, you have been employed as a consultant to prepare

a report for an ad hoc Stock Exchange committee, Future Stocks. The report is to forecast

the behaviour of a number of key economic indicators until December 2020; the relevant data

sets are to be obtained from the UK Government Office for National Statistics website. A

recommendation is sought as to whether these economic indicators can be used to forecast

the FTSE 100 Financial Times Index itself.

1.1. How to get the data. From the four websites below, download the datasets by clicking

on xlsx or xls format. An Excel spreadsheet is then downloaded, which contains several

columns of data. Copy the data sets from the required columns as described below; i.e.,

K54D, EAFV, K226, and JQ2J, scrolling down to find the monthly statistics.

(a) Average weekly earnings data set (column CG, from row 550 down):

https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/datasets/averageweeklyearnings

K54D: Monthly average of private sector weekly pay.

(b) From retail sales time series data (column FP, from row 168 down):

https://www.ons.gov.uk/businessindustryandtrade/retailindustry/datasets/retailsales

EAFV: Retail sales index, household goods, all businesses.

(c) From the index of production data set (column BY, from row 692 down):

https://www.ons.gov.uk/economy/economicoutputandproductivity/output/datasets/indexofproduction

K226: Extraction of crude petroleum and natural gas.

(d) From turnover and orders in the production and services industries, xls file under

“Latest version” (column B, from row 143 down):

https://www.ons.gov.uk/businessindustryandtrade/manufacturingandproductionindustry/timeseries/jq2j/ios1/previous

JQ2J: The manufacturing and business sector of Great Britain, total turnover and orders.
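
As a rough illustration of the copying step above, the column letters and starting rows listed in (a)-(d) can be turned into a small Python loader. This is only a sketch: the function names, the `path` argument, and the choice of the first worksheet (`sheet_name=0`) are assumptions and not part of the assignment, and the layout of each downloaded workbook should be checked by eye before relying on it. pandas, together with an Excel reader such as openpyxl or xlrd, is assumed to be installed when `load_series` is called.

```python
# Column letter and first data row for each series, as listed in (a)-(d).
SERIES_LOCATIONS = {
    "K54D": ("CG", 550),
    "EAFV": ("FP", 168),
    "K226": ("BY", 692),
    "JQ2J": ("B", 143),
}


def column_index(letters):
    """Convert an Excel column label such as 'CG' to a zero-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def load_series(path, code, sheet_name=0):
    """Read one monthly series from a downloaded workbook.

    `path` is the local file name and `sheet_name` the worksheet to read;
    both are assumptions to be adapted to the file actually downloaded.
    """
    # Imported here so that column_index can be used without pandas installed.
    import pandas as pd

    letters, first_row = SERIES_LOCATIONS[code]
    frame = pd.read_excel(
        path,
        sheet_name=sheet_name,
        header=None,                       # the data rows carry no header
        usecols=[column_index(letters)],   # keep only the required column
        skiprows=first_row - 1,            # start reading at `first_row`
    )
    return frame.iloc[:, 0].dropna().reset_index(drop=True)
```

A call might look like `load_series("K54Ddata.xlsx", "K54D")`, where the file name is hypothetical.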

1.2. Tasks. As so often happens in the real world, the data sets are of different lengths. You

will have to use your own judgment in inspecting and preparing the data before carrying out

any technical analysis. The analysis is in three parts:

(a) You are asked to take all four series separately and to forecast monthly behaviour until

December 2020, using exponential smoothing-type forecasting methods.

(b) Future Stocks have been satisfied in the past with exponential smoothing–type forecasting

methods and are happy to see them used in the analysis. However, they are interested in

the possible use of ARIMA methodology to predict K54D. You are asked to fit an ARIMA

model to K54D and to compare the ARIMA forecasting method with an appropriate

exponential smoothing technique. You should make a recommendation

as to future use of ARIMA on this time series.

(c) You are finally asked to use the above four time series as explanatory variables in a

multiple regression model to forecast the UK Footsie 100 share index: FTSE. The data

can be found at https://finance.yahoo.com/ and is in column 1 of the table (after the date

column). It is recommended that you save the table into a text file, using MS Word or

Notepad for example, and then open the file as space delimited using MS Excel. A data sort

can be used to change the data to date order.

Develop a multiple regression model, use it for the prediction of FTSE until December 2020,

and report on whether you think the model is satisfactory or not.
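
A minimal sketch of the regression step, under stated assumptions: each series is held as a dictionary keyed by month ('YYYY-MM'), the series are first restricted to the months they have in common (because they are of different lengths), and the coefficients are then obtained by ordinary least squares through the normal equations. All function names are hypothetical, only two explanatory series are shown to keep it short, and the figures are made up; they are not ONS or FTSE data. A library routine could equally be used.

```python
def common_months(*series):
    """Return the sorted months that appear in every series."""
    shared = set(series[0])
    for s in series[1:]:
        shared &= set(s)
    return sorted(shared)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, ..., bk]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Fitted or forecast value for one row of explanatory values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Made-up series of different lengths (illustrative numbers only).
x1 = {"2019-01": 1.0, "2019-02": 2.0, "2019-03": 3.0, "2019-04": 4.0, "2019-05": 5.0}
x2 = {"2019-02": 1.0, "2019-03": 0.0, "2019-04": 2.0, "2019-05": 1.0, "2019-06": 3.0}
ftse = {"2019-01": 5.0, "2019-02": 8.0, "2019-03": 7.0, "2019-04": 15.0, "2019-05": 14.0}

months = common_months(x1, x2, ftse)           # months present in all three
rows = [(x1[m], x2[m]) for m in months]
target = [ftse[m] for m in months]
coefs = fit_ols(rows, target)
print(months, [round(c, 6) for c in coefs])
```

To predict beyond the last observed month, values of the explanatory variables for those months are needed; the forecasts from part (a) are one possible source.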

2. What you must produce

You must produce a technical report describing all the analysis done to select the most

suitable forecasting method, as well as the results obtained. The report must be accompanied

by the codes used to perform the technical analysis, as well as the resulting graphs. More

details on each of the aspects of the work are given in the next subsections.

2.1. The technical report. The technical report must follow the structure described in

Subsection 2.4. It should address the three parts of the analysis: exponential smoothing,

ARIMA, and regression. For each part, give details of the preliminary analysis, data

preparation, models chosen and analysis carried out. Also describe why each model was built and

explain the analysis carried out, including an evaluation of the effectiveness of the models.
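
One way to evaluate the effectiveness of an exponential smoothing model is to hold back the last few observations, forecast them, and measure the error. A minimal sketch follows, using a hand-written version of Holt's linear method. The function names, the initialisation of level and trend, the smoothing constants, and the numbers are all illustrative assumptions rather than a prescribed approach, and a series with seasonality would need a seasonal method instead.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing.

    Returns `horizon` forecasts following the last observation.
    Initialisation (one common choice, assumed here): the level starts at
    the first observation and the trend at the first difference.
    """
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up monthly values: the last two are held back for evaluation.
observations = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.5, 23.5]
train, holdout = observations[:-2], observations[-2:]

forecasts = holt_forecast(train, alpha=0.5, beta=0.5, horizon=len(holdout))
error = mean_absolute_error(holdout, forecasts)
print(forecasts, error)
```

Repeating the calculation for several candidate models or smoothing constants and comparing the hold-out errors gives one basis for choosing between them.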

2.2. Python codes. You must also prepare and submit the Python codes that you use to

generate the results that will be included in your technical report. If any preliminary operations

on your data are needed before applying/developing a Python code for your analysis, it is fine

to include these in the corresponding Excel file containing your data sets. However, you must

complete all the main tasks of your analysis using Python. You can use the codes from the

course, use different ones or develop your own. Marking on this aspect of your work will not

be based on how well you can program in Python, but rather on the functionality of your

codes and their relevance in the corresponding analysis.

To help us see easily what each code does, you must produce a single-page document,

as Appendix A to your technical report, giving a brief description, of one or two sentences,

of what each code does. If you carry out any preliminary operations on your data in the Excel file

containing your data set, a line or two should also be included to describe this.

2.3. Analysis and forecast graphs. You are expected to produce graphs to illustrate your

analysis in the technical report. Do not include these graphs in the main part of the report

(Sections 1 - 3; see details in next subsection), but rather, put all of them in Appendix B. You

are allowed up to 10 pages for the graphs produced for your analysis. Organise the graphs

in 3 main parts, each corresponding to one of the main sections of the technical report. Also

number each of your graphs accordingly to be able to easily refer to them, as necessary, in

Sections 1, 2, and 3. You do not need to repeat graphs in Appendix B. For example, if you

want to refer to a graph under the ARIMA section that already appears in the section

dedicated to exponential smoothing, you are encouraged to use the figure number of

that graph rather than repeating it.

2.4. Organizing your technical report. The report must be organized as follows:

1. Exponential smoothing (maximal length: 2 pages; total marks: 50)

Marks to be attributed based on how well you articulate the following aspects:

• Describe data preparation (and its effects) prior to the implementation of

exponential smoothing methods.

• Describe preliminary analysis undertaken (and conclusions drawn) prior to

the implementation of exponential smoothing methods.

• Give details of how exponential smoothing models were selected for each of

the time series, and how effective these methods are at forecasting.

• Clarity and quality of presentation.

• Functionality of Python codes.

• Quality and suitability of illustrative or forecast result graphs.

2. ARIMA forecasting (maximal length: 1 page; total marks: 20)

Marks to be attributed based on how well you articulate the following aspects:

• Describe any data preparation prior to ARIMA, and its effects.

• Describe preliminary analysis undertaken prior to ARIMA modelling, and

the conclusions drawn.

• Give details of how an ARIMA model was selected and tested, and how its

effectiveness was evaluated.

• Compare ARIMA and exponential smoothing forecasting, both in general

terms and in this particular instance.

• Clarity and quality of presentation.

• Functionality of Python codes.

• Quality and suitability of illustrative or forecast result graphs.

3. Regression prediction (maximal length: 1 page; total marks: 20)

Marks to be attributed based on how well you articulate the following aspects:

• Describe any data preparation prior to regression.

• Describe any preliminary analysis undertaken prior to regression and the

conclusions drawn.

• Give details of how a regression model has been selected and comment on

its suitability for prediction.

• Clarity and quality of presentation.

• Functionality of Python codes.

• Quality and suitability of illustrative or forecast result graphs.

Appendix A: Description of your codes (maximal length: 1 page; full marks: 10)

Marks here will be attributed based on how clear, informative, and brief your

description is of what each of your Python codes (or Excel file, in case any preliminary

operations are carried out there) does.

Appendix B: Analysis and forecast graphs (maximal length: 10 pages)

This appendix should be organised in 3 subsections, with the first, second, and third

dedicated to graphs related to the exponential smoothing, ARIMA, and

regression methods, respectively. As you can see above, marks dedicated to this

appendix are attributed under the corresponding sections; i.e., Sections 1, 2, and

3, respectively.

In summary, the following guidelines must be followed while producing the technical report:

• The technical report must be organized as described above, with a maximum of 15 pages

in total: a maximum of 5 pages in total for Sections 1, 2, 3, and Appendix A, and

a maximum of 10 pages dedicated to the analysis and forecast graphs (Appendix B).

• No need to include graphs in Sections 1, 2, and 3. All graphs should be included

under Appendix B with appropriate numbering, so that you can easily refer to them

in your discussions under these sections.

• No theory of forecasting or repetition of the material from lectures is required, unless

you have used models not included in the notes.

• Formal English should be used, avoiding contractions (such as “doesn’t”), slang,

and casual vocabulary.

• In Sections 1, 2, and 3, references to codes developed/used for specific tasks can

be made by using the corresponding code’s name. But no other details of Python

modelling are needed in those sections.

• At most 2 sentences are needed in Appendix A to explain what each Python code (or

Excel file, if necessary) does.

• Feel free to include subsections to Sections 1, 2, 3, and Appendices A and B, if they

seem necessary to help make some parts clearer.

• No introduction, table of contents or conclusions should be written for the report.

3. Submission

All submissions should be done under the corresponding assignment tab in Blackboard.

Submit one zipped folder (.zip), not an archived file (.rar), without internal folders, which

contains a pdf copy of the technical report and five spreadsheets with the data sets provided for

the analysis. You should also include an adequate number of files with your Python codes.

Remember not to put your name anywhere on your work, as marking is anonymous. Include

your student ID in your technical report and use the following naming pattern for all the

files to be submitted via Blackboard:

• 1 pdf file with the technical report: TechnicalReport StudentID.pdf;

• 5 data files: K54Ddata StudentID.xls, EAFVdata StudentID.xls,

K226data StudentID.xls, JQ2Jdata StudentID.xls, and FTSEdata StudentID.xls.

• Python codes: each file name should have 3 components, with the first related to the

corresponding methodology, the second to the specific task, and the third being the student

ID. For example, if you produce/use a code to illustrate something related to the

exponential smoothing, ARIMA, and regression methods, you should respectively

apply the following naming pattern to your files:

– ExponentialSmoothing K54DTimePlot StudentID.py

– ARIMA ACFPlot StudentID.py

– Regression Correlation StudentID.py

The middle terms K54DTimePlot, ACFPlot, and Correlation are related to specific

tasks that could be carried out under the corresponding parts. This middle term

should not exceed 15 characters.
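
The naming pattern for code files can be checked mechanically. The helper below is a hypothetical sketch, not a required tool: `code_filename` and the student ID `12345678` are invented for illustration, and the separator defaults to the space shown in the examples above (pass a different `sep` if your copy of these instructions shows another character between the components).

```python
def code_filename(method, task, student_id, sep=" "):
    """Build '<method><sep><task><sep><student_id>.py'.

    Raises ValueError if the middle term is longer than 15 characters.
    """
    if len(task) > 15:
        raise ValueError(
            f"middle term {task!r} has {len(task)} characters; the limit is 15"
        )
    return f"{method}{sep}{task}{sep}{student_id}.py"


# Hypothetical student ID, used only to show the pattern.
print(code_filename("ExponentialSmoothing", "K54DTimePlot", "12345678"))
print(code_filename("ARIMA", "ACFPlot", "12345678"))
print(code_filename("Regression", "Correlation", "12345678"))
```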