Machine Learning Practical 2020/21: Coursework 1
Released: Monday 19 October 2020
Submission due: 16:00 Friday 30 October 2020
1 Introduction
The aim of this coursework is to explore the classification of images of handwritten digits using neural networks.
The first part of this coursework concerns the identification and discussion of a fundamental problem in
machine learning, as shown in Figure 1. Following this preliminary discussion, you will investigate the
problem further in wider and deeper neural networks, studying it in terms of network width and depth. The second part
involves implementing different methods to combat the problem identified in Task 1 and then comparing these
methods empirically and theoretically. In the final part, you will briefly discuss the main strengths and weaknesses
of one work related to the methods examined in Task 2.
The coursework will use an extended version of the MNIST database, the EMNIST Balanced dataset,
described in Section 2. Section 3 describes the additional code provided for the coursework (in branch
mlp2020-21/coursework_1 of the MLP github), and Section 4 describes how the coursework is structured
into three tasks. The main deliverable of this coursework is a report, discussed in section 8, using a template that
is available on the github. Section 9 discusses the details of carrying out and submitting the coursework, and the
marking scheme is discussed in Section 10.
You will need to submit your completed report as a PDF file and your local version of the mlp code, including
any changes you made to the provided .py files. The detailed submission instructions are given in Section 9.2 –
please follow these instructions carefully.
2 EMNIST dataset
In this coursework we shall use the EMNIST (Extended MNIST) Balanced dataset [Cohen et al., 2017],
https://www.nist.gov/itl/iad/image-group/emnist-dataset. EMNIST extends MNIST by including images of
handwritten letters (upper and lower case) as well as handwritten digits. Both EMNIST and MNIST are extracted
from the same underlying dataset, referred to as NIST Special Database 19. Both use the same conversion
process resulting in centred images of dimension 28×28.
There are 62 potential classes for EMNIST (10 digits, 26 lower case letters, and 26 upper case letters). However,
we shall use a reduced label set of 47 different labels. This is because (following the data conversion process)
there are 15 letters for which it is confusing to discriminate between upper-case and lower-case versions. In the
47 label set, upper- and lower-case labels are merged for the following letters:
C, I, J, K, L, M, O, P, S, U, V, W, X, Y, Z.
The training set for Balanced EMNIST has about twice the number of examples as the MNIST training set, thus
you should expect the run-time of your experiments to be about twice as long. The expected accuracy rates
are lower for EMNIST than for MNIST (as EMNIST has more classes, and more confusable examples), and
differences in accuracy between different systems should be larger. Cohen et al. [2017] present some baseline
results for EMNIST.
You do not need to directly download the EMNIST database from the nist.gov website, as it is part of the
coursework_1 branch in the mlpractical Github repository, discussed in Section 3 below.
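For orientation, below is a minimal sketch of iterating over the data in code. It assumes that EMNISTDataProvider (described in Section 3) follows the same constructor and iteration interface as the MNISTDataProvider used in the labs, and that targets are returned one-of-k encoded; check the class in your checkout before relying on either assumption.

from mlp.data_providers import EMNISTDataProvider

# Assumed to mirror MNISTDataProvider: which_set is one of
# 'train', 'valid' or 'test'.
train_data = EMNISTDataProvider('train', batch_size=100)

# Each batch is an (inputs, targets) pair: inputs of shape
# (batch_size, 784) holding flattened 28x28 images, and targets
# assumed one-of-k encoded over the 47 classes.
for inputs, targets in train_data:
    print(inputs.shape, targets.shape)  # expected: (100, 784) (100, 47)
    break  # only inspect the first batch here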
3 Github branch mlp2020-21/coursework_1
You should run all of the experiments for the coursework inside the Conda environment you set up
for the labs. The code for the coursework is available on the course Github repository on a branch
mlp2020-21/coursework_1. To create a local working copy of this branch in your local repository you need
to do the following.
1. Make sure all modified files on the branch you are currently on have been committed (see
notes/getting-started-in-a-lab.md if you are unsure how to do this).
2. Fetch changes to the upstream origin repository by running
git fetch origin
3. Checkout a new local branch from the fetched branch using
git checkout -b coursework_1 origin/mlp2020-21/coursework_1
You will now have a new branch in your local repository with all the code necessary for the coursework in it.
This branch includes the following additions to your setup:
• A new EMNISTDataProvider class in the mlp.data_providers module. This class makes some
changes to the MNISTDataProvider class, linking to the EMNIST Balanced data, and setting the number
of classes to 47.
• Training, validation, and test sets for the EMNIST Balanced dataset that you will use in this coursework.
• In order to further improve performance and mitigate the problem identified in Task 1, you will
also need to implement a new class in the mlp.layers module:
DropoutLayer
and also two weight penalty techniques in the mlp.penalties module:
L1Penalty and L2Penalty.
• A DropoutandPenalty_tests.ipynb Jupyter notebook, to be used for testing your implementations
of the DropoutLayer, L1Penalty and L2Penalty classes.
The tests serve as a safeguard to prevent experimentation with faulty code which might lead to wrong
conclusions. Tests in general are a vital ingredient for good software development, and especially important
for building correct and efficient deep learning systems.
Please note that passing these preliminary tests does not necessarily mean your classes are absolutely
bug-free. If you get unexpected curves during model training, re-check your implementation of the classes.
• A directory called report which contains the LaTeX template and style files for your report. You should
copy all these files into the directory which will contain your report.
Figure 1: Error and accuracy curves for a baseline model on the EMNIST dataset. (a) Error curves on the
EMNIST training and validation sets. (b) Accuracy curves on the EMNIST training and validation sets.
4 Tasks
The coursework is structured into three tasks; the first two are supported by experiments on the EMNIST dataset.
1. Identification of a fundamental problem in machine learning as shown in Fig 1 and setting up a baseline
system on EMNIST by a valid hyper-parameter search.
2. A research investigation and analysis into whether using Dropout and/or Weight Penalty (L1Penalty and
L2Penalty) addresses the problem found in training machine learning models (Fig 1). How do these two
approaches improve/degrade the model’s performance?
3. A brief literature review of any one work as discussed in Section 7.
5 Task 1: Problem identification
Figure 1 shows the training and validation error curves (Figure 1a) and the training and validation accuracy
curves (Figure 1b) for a model with one hidden layer and a ReLU activation function, trained on the EMNIST
dataset using the cross-entropy error function. These curves can be reproduced by running the model settings
defined in the Coursework1.ipynb notebook in the github repository. First identify the problem shown by the
curves in Figure 1, discuss it, and briefly outline potential solutions for overcoming it.
Varying number of hidden units. Initially you will train several 1-hidden-layer networks with either
32, 64 or 128 ReLU hidden units per layer on EMNIST, using stochastic gradient descent (SGD) without any
regularization. Make sure you use an appropriate learning rate and train each network for 100 epochs. Visualise
and discuss how increasing the number of hidden units affects the validation performance and whether it worsens
or mitigates the problem.
(10 Marks)
Varying number of layers. Here you will train several neural networks with either 1, 2 or 3 hidden layers, each with
128 ReLU hidden units, on EMNIST using stochastic gradient descent (SGD) without any regularization.
Make sure you use an appropriate learning rate and train each network for 100 epochs. Visualise and discuss how
increasing the number of layers affects the validation performance and whether it worsens or mitigates the problem.
(10 Marks)
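As a starting point for both of the experiments above, the sketch below shows one way to assemble fully-connected ReLU networks of varying depth and width with the mlp framework. The class names (MultipleLayerModel, AffineLayer, ReluLayer and the initialisers) are the ones used in the lab notebooks, but treat the exact constructor signatures as assumptions to verify against your checkout.

import numpy as np
from mlp.layers import AffineLayer, ReluLayer
from mlp.models import MultipleLayerModel
from mlp.initialisers import GlorotUniformInit, ConstantInit

rng = np.random.RandomState(31102020)  # fixed seed for reproducibility
weights_init = GlorotUniformInit(rng=rng)
biases_init = ConstantInit(0.)
input_dim, output_dim = 784, 47  # flattened 28x28 inputs, 47 classes

def make_network(num_hidden_layers, hidden_dim):
    # Stack affine + ReLU blocks, then a final affine output layer.
    layers, in_dim = [], input_dim
    for _ in range(num_hidden_layers):
        layers.append(AffineLayer(in_dim, hidden_dim, weights_init, biases_init))
        layers.append(ReluLayer())
        in_dim = hidden_dim
    layers.append(AffineLayer(in_dim, output_dim, weights_init, biases_init))
    return MultipleLayerModel(layers)

# Width experiment: 1 hidden layer with 32, 64 or 128 units
width_models = {width: make_network(1, width) for width in (32, 64, 128)}
# Depth experiment: 1, 2 or 3 hidden layers of 128 units
depth_models = {depth: make_network(depth, 128) for depth in (1, 2, 3)}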
6 Task 2: Mitigating the problem with Dropout and Weight Penalty
Definition and Motivation. Here you will explain
• the DropoutLayer, L1Penalty and L2Penalty classes, including their formulations and implementation
details (do not copy/paste your code here),
• how/why/to what extent each one can alleviate the problem above,
• how they differ from each other in theory.
These explanations must be in your own words.
(10 Marks)
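For reference, one standard way of writing the penalties and dropout down is given below; λ denotes the penalty coefficient and p the inclusion probability (notation chosen here for illustration, not taken from the codebase), and your report must still explain these in your own words:

\[
E_{\mathrm{L1}}(\mathbf{w}) = E(\mathbf{w}) + \lambda \sum_i |w_i|,
\qquad
\frac{\partial E_{\mathrm{L1}}}{\partial w_i} = \frac{\partial E}{\partial w_i} + \lambda\,\operatorname{sgn}(w_i)
\]
\[
E_{\mathrm{L2}}(\mathbf{w}) = E(\mathbf{w}) + \frac{\lambda}{2} \sum_i w_i^2,
\qquad
\frac{\partial E_{\mathrm{L2}}}{\partial w_i} = \frac{\partial E}{\partial w_i} + \lambda w_i
\]
\[
\text{Dropout:}\qquad m_i \sim \mathrm{Bernoulli}(p), \qquad \tilde{h}_i = m_i\, h_i
\]

Conventions vary (e.g. the factor of 1/2 in the L2 term, and whether dropout rescales activations at training or at test time), so state the convention you use.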
Implementing Dropout and Weight Penalty. Here you will implement DropoutLayer, L1Penalty and
L2Penalty and test their correctness. Here are the steps to follow:
1. Implement the DropoutLayer class in the mlp.layers module. You need to implement the
fprop and bprop methods for this class.
2. Implement the L1Penalty and L2Penalty classes in the mlp.penalties module. You need to implement
the __call__ and grad methods for these classes. Once defined, instances of these classes can be passed
as the weights_penalty and biases_penalty parameters of the AffineLayer class when creating the
multi-layer neural network. A hedged sketch of the core logic for this item and the previous one is given
after this list.
3. Verify the correctness of your implementation using the supplied unit tests in
DropoutandPenalty_tests.ipynb
4. Automatically create the test outputs xxxxxxx_regularization_test_pack.npy by running the provided
script scripts/generate_regularization_layer_test_outputs.py, which uses your code for
the previously mentioned layers to run your fprop, bprop, __call__ and grad methods where necessary
for each layer on a unique test vector generated using your student ID number.
To do this, simply go to the scripts/ folder and run
python generate_regularization_layer_test_outputs.py --student_id sxxxxxxx
replacing the student id with yours. A file called xxxxxxx_regularization_test_pack.npy will be
generated under data, which you need to submit with your report.
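For orientation, below is a minimal sketch of the core logic of all three classes. It is deliberately written against a simplified, self-contained interface: the real skeletons in mlp.layers and mlp.penalties define the exact constructor arguments and method signatures (including how the random state and train/evaluation modes are plumbed through), so treat everything here beyond the mathematics as an assumption to check against your checkout.

import numpy as np

class DropoutLayerSketch:
    # Illustrative dropout logic only; the real DropoutLayer skeleton
    # in mlp.layers defines the interface you must implement.

    def __init__(self, rng, incl_prob=0.5):
        self.rng = rng              # numpy RandomState
        self.incl_prob = incl_prob  # probability of *keeping* a unit
        self._mask = None

    def fprop(self, inputs):
        # Sample a fresh binary mask and zero out the dropped units.
        self._mask = self.rng.uniform(size=inputs.shape) < self.incl_prob
        return inputs * self._mask

    def bprop(self, inputs, outputs, grads_wrt_outputs):
        # Gradients flow only through the units that were kept in fprop.
        return grads_wrt_outputs * self._mask

class L1PenaltySketch:
    def __init__(self, coefficient):
        self.coefficient = coefficient

    def __call__(self, parameter):
        # Value added to the error: lambda * sum(|w|)
        return self.coefficient * np.sum(np.abs(parameter))

    def grad(self, parameter):
        # Gradient of the penalty term: lambda * sign(w)
        return self.coefficient * np.sign(parameter)

class L2PenaltySketch:
    def __init__(self, coefficient):
        self.coefficient = coefficient

    def __call__(self, parameter):
        # Value added to the error: 0.5 * lambda * sum(w^2)
        return 0.5 * self.coefficient * np.sum(parameter ** 2)

    def grad(self, parameter):
        # Gradient of the penalty term: lambda * w
        return self.coefficient * parameter

Note that whether the L2 value carries the factor of 0.5, and whether dropout rescales kept activations by 1/incl_prob during training (inverted dropout) or rescales at evaluation time, are conventions; the unit tests in DropoutandPenalty_tests.ipynb fix the conventions your implementation must follow.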
(20 Marks)
EMNIST Experiments. In this section you should modify your baseline network to one that uses one of, or a
combination of, DropoutLayer and the weight penalties (L1Penalty or L2Penalty), and train a model. For the experiments,
your baseline network should contain 3 hidden layers, each with 128 ReLU hidden units. Your
main aim is to i) investigate whether/how each of these functions addresses the above-mentioned problem, ii)
study the generalization performance of your network when used with one of these functions or a combination
of them, iii) discover the best possible network configuration, when the only available options to choose from
are Dropout and Weight Penalty functions and the hyper-parameters (learning rate, Dropout Probability and
penalty coefficient for the Weight Penalty functions).
The Dropout probability is a float value in the range (0, 1), e.g. 0.5, chosen manually. The penalty coefficient is also
a manually selected float value, e.g. 0.001, usually in the range 0.1 to 0.00001. For model selection, you
should use validation performance to pick the best model and finally report test performance of the best
model.
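One way to organise this selection is sketched below. train_model_on_emnist is a hypothetical helper standing in for whatever training routine you build from the lab notebooks (for example around mlp.optimisers.Optimiser); it is not a function provided by the codebase.

import itertools

def train_model_on_emnist(incl_prob, penalty_coeff):
    # Hypothetical stand-in: replace with your actual training loop,
    # returning the final validation and test accuracies for this
    # configuration of dropout probability and penalty coefficient.
    raise NotImplementedError

incl_probs = [None, 0.5, 0.8]        # None = no dropout layer
penalty_coeffs = [None, 1e-4, 1e-3]  # None = no weight penalty
best = None

for incl_prob, coeff in itertools.product(incl_probs, penalty_coeffs):
    valid_acc, test_acc = train_model_on_emnist(incl_prob, coeff)
    # Model selection must use validation performance only.
    if best is None or valid_acc > best[0]:
        best = (valid_acc, test_acc, incl_prob, coeff)

valid_acc, test_acc, incl_prob, coeff = best
print(f'Best configuration: incl_prob={incl_prob}, penalty_coeff={coeff}')
print(f'Validation accuracy: {valid_acc:.4f}')
# Report the test accuracy of the selected model once, at the very end.
print(f'Test accuracy: {test_acc:.4f}')

Treat this as scaffolding for a small, deliberately chosen set of configurations rather than an exhaustive grid, in line with the note below about avoiding brute-force exploration.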
Ensure that you thoroughly describe in your report how these functions affect performance when used together and
separately with different hyperparameters, ideally at both the theoretical and empirical level. Note that
the expected amount of work in this part is not a brute-force exploration of all possible variations of
network configurations and hyperparameters, but a carefully designed set of experiments that provides
meaningful analysis and insights.
(40 Marks)
7 Task 3: Literature Review
In this section, you will explore one related work in the research area of regularization methods. You should
summarise the paper and discuss the strengths and limitations of the work in fewer than 500 words. Note
that this review must be in your own words.
Below is a list of papers that you are recommended to consider when selecting one. If for any particular
reason you do not wish to choose from the given list, you are free to discuss any other paper that relates to this
coursework and is published in an AI/ML/CV/NLP conference such as ICML, NeurIPS, ICLR, AAAI,
IJCAI, CVPR, ECCV, ICCV, EMNLP or ACL. This is not an exhaustive list of conferences, but it covers
most of the main ones.
• Dropout: a simple way to prevent neural networks from overfitting, Srivastava et al. [2014].
• Maxout Networks, Goodfellow et al. [2013].
• Understanding deep learning requires rethinking generalization, Zhang et al. [2016].
(10 Marks)
8 Report
Your coursework will be primarily assessed based on your submitted report.
The report template is divided into sections which correspond to the tasks in the coursework specification,
especially from Section 2 to Section 5 of the report. The Abstract, Introduction and Conclusion sections will not
be graded for the final marks, but we highly encourage you to attempt them in the report, as we will provide
feedback on them. Please note that these sections will be graded in later courseworks, so it is beneficial to take
the feedback on coursework 1 seriously.
The directory coursework_1/report contains a template for your report (mlp-cw1-template.tex); the
generated pdf file (mlp-cw1-template.pdf) is also provided, and you should read this file carefully as it
contains some useful information about the required structure and content. The template is written in LaTeX,
and we strongly recommend that you write your own report using LaTeX, using the supplied document style
mlp2020 (as in the template).
You should copy the files in the report directory to the directory containing the LaTeX file of your report, as
pdflatex will need to access these files when building the pdf document from the LaTeX source file.
Your report should be in a 2-column format, based on the document format used for the ICML conference. The
report should be a maximum of 5 pages long, not including references. We will not read or assess any parts of
the report beyond this limit.
Ideally, all figures should be included in your report file as vector graphics rather than raster images, as this
will make sure all detail in the plot is visible. Matplotlib supports saving high quality figures in a wide range
of common image formats using the savefig function. You should use savefig rather than copying the
screen-resolution raster images output in the notebook. An example of using savefig to save a figure
as a PDF file (which can be included as graphics in LaTeX compiled with pdflatex) is given below.
import matplotlib.pyplot as plt
import numpy as np

# Generate some example data to plot
x = np.linspace(0., 1., 100)
y1 = np.sin(2. * np.pi * x)
y2 = np.cos(2. * np.pi * x)

fig_size = (6, 3)  # Set figure size in inches (width, height)
fig = plt.figure(figsize=fig_size)  # Create a new figure object
ax = fig.add_subplot(1, 1, 1)  # Add a single axes to the figure

# Plot lines giving each a label for the legend and setting line width to 2
ax.plot(x, y1, linewidth=2, label=r'$y = \sin(2\pi x)$')
ax.plot(x, y2, linewidth=2, label=r'$y = \cos(2\pi x)$')

# Set the axes labels. Can use LaTeX in labels within $...$ delimiters.
ax.set_xlabel('$x$', fontsize=12)
ax.set_ylabel('$y$', fontsize=12)
ax.grid(True)  # Turn axes grid on
ax.legend(loc='best', fontsize=11)  # Add a legend

fig.tight_layout()  # This minimises whitespace around the axes.
fig.savefig('file-name.pdf')  # Save figure to current directory in PDF format
If you make use of any books, articles, web pages or other resources you should appropriately cite these in
your report. You do not need to cite material from the course lecture slides or lab notebooks.
To create a pdf file mlp-cw1-template.pdf from a LaTeX source file (mlp-cw1-template.tex), you can
run the following in a terminal:
pdflatex mlp-cw1-template
bibtex mlp-cw1-template
pdflatex mlp-cw1-template
pdflatex mlp-cw1-template
(Yes, you have to run pdflatex multiple times, in order for latex to construct the internal document references.)
An alternative, simpler approach uses the latexmk program:
latexmk -pdf mlp-cw1-template
