CMSC 25025 / STAT 37601
Machine Learning and Large Scale Data Analysis
Assignment 4 Due: Tuesday, May 19, 2020 at 2:00 pm.
Please hand in this homework in 4 files:
1. A pdf of your jupyter notebook for problem 1. The derivation of the solution to 1(a)(ii) can be written in a markdown cell of the notebook.
2. The ipynb file for problem 1.
3. A pdf of your jupyter notebook for problem 2.
4. The ipynb file for problem 2.
1. Sparse coding of natural images and digits (40 points)
In this problem you will implement the sparse coding procedure as described in class on
image patches of size 12x12. This was proposed as a possible computational mechanism
underlying the evolution of neural representations in the visual cortex of mammals [1]. You
will use the actual images used in this landmark paper.
To run the sparse coding algorithm over the images, we have provided a function that selects
random patches. This can be run on the Olshausen-Field images using the following code
import scipy.io
%matplotlib inline
import matplotlib.pyplot as plt
import random
import numpy

data = scipy.io.loadmat('/project2/cmsc25025/sparsecoding/IMAGES_RAW.mat')
images = data['IMAGESr']

# Show the first image.
plt.imshow(images[:,:,0], cmap='gray')

# Function to sample image patches from the large images.
def sample_random_square_patches(image, num, width):
    patches = numpy.zeros([width, width, num])
    for k in range(num):
        i, j = random.sample(range(image.shape[0] - width), 2)
        patches[:,:,k] = image[i:i+width, j:j+width]
    return patches
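For reference, a batch of patches sampled with this function could be flattened into a data matrix X ∈ R^(d×b) along the following lines (this assumes the code above has been run; the batch size, the column-wise flattening, and the per-patch mean removal are our own choices, not part of the assignment):

# Hypothetical usage: build a d x b data matrix from randomly sampled patches.
width, b = 12, 100                                      # patch side length and batch size
img = images[:, :, random.randrange(images.shape[2])]   # pick one of the provided images
patches = sample_random_square_patches(img, b, width)   # array of shape (width, width, b)
X = patches.reshape(width * width, b)                   # each column is one flattened patch (d = 144)
X = X - X.mean(axis=0, keepdims=True)                   # optional: subtract each patch's mean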
We want to run the sparse coding scheme over the images.
(a) We will implement this with SGD, alternating between the following two steps.
i. With the current codebook, find the coefficients α^(i), i = 1, . . . , b for each example X^(i)
in the batch by solving a Lasso problem.
ii. Fix the coefficients α^(i) and compute the gradient of the loss with respect to each
vector of the codebook. Write the gradient with respect to the codebook matrix
V = [V^(1), . . . , V^(L)] ∈ R^(d×L) as one matrix computation using V, the batch data
X ∈ R^(d×b), and A = [α^(1), . . . , α^(b)] ∈ R^(L×b).
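As a point of reference, a minimal sketch of this update is given below. It assumes the batch loss is the average squared reconstruction error (1/b) Σ_i ||X^(i) − V α^(i)||² plus the L1 penalty on the coefficients, and it renormalizes the codebook columns after each step (a common convention, but an assumption here); the constants and signs should match whatever loss you derive in 1(a)(ii).

import numpy

def codebook_gradient(V, X, A):
    # V: d x L codebook, X: d x b batch, A: L x b coefficient matrix.
    # Gradient of (1/b) * ||X - V A||_F^2 with respect to V, as one matrix computation.
    b = X.shape[1]
    return -(2.0 / b) * (X - V @ A) @ A.T

def sgd_codebook_step(V, X, A, eta):
    # One SGD step on the codebook with step size eta, followed by renormalizing
    # each codebook vector to unit length (assumed convention, not specified above).
    V = V - eta * codebook_gradient(V, X, A)
    return V / numpy.linalg.norm(V, axis=0, keepdims=True)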
[1] B. Olshausen and D. Field, “Emergence of simple-cell receptive field properties by learning a sparse code for
natural images,” Nature 381, 607–609, 1996.
(b) Use the class sklearn.linear_model.Lasso for the coefficient estimation step.
We want to fully exploit the vectorized computation capabilities of Python. Make sure
there are no loops inside the main loop that iterates over the SGD steps; all
computations should be done with matrix operations. For the Lasso step, you can
have the fit function of the Lasso class fit all training points in the batch at once.
(You may want to compare the time this takes to a loop that calls Lasso on each
training point separately.)
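A minimal sketch of such a batch Lasso fit might look like the following; the regularization weight lam is a placeholder value, and fit_intercept=False assumes the patches have already been centered:

from sklearn.linear_model import Lasso

def fit_coefficients(V, X, lam=0.1):
    # Treat the codebook V (d x L) as the design matrix and the batch X (d x b) as a
    # multi-target response, so sklearn solves all b Lasso problems in a single call.
    lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=1000)
    lasso.fit(V, X)
    return lasso.coef_.T   # coef_ has shape (b, L); transpose to get A of shape (L, b)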
(c) Monitor the convergence of the SGD algorithm by checking the change in the codebook.
How long does it take to converge? Experiment with the step size η for your
algorithm, either constant or decreasing. Display the codebook after initialization, after
convergence, and at several intermediate stages. Comment on your results. Are they
consistent with the results presented in the paper?
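One possible convergence measure, for instance, is the Frobenius norm of the difference between successive codebooks; a tiny helper along these lines (the name is ours):

import numpy

def codebook_change(V_old, V_new):
    # Frobenius norm of the update between consecutive SGD iterations; plotting this
    # against the iteration number is one way to judge convergence.
    return numpy.linalg.norm(V_new - V_old, 'fro')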
(d) Show reconstructions of image patches using the sparse representation.
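For a single patch the reconstruction is just the codebook times its coefficient vector; a possible display helper (the 12x12 patch size and the side-by-side layout are our assumptions):

import matplotlib.pyplot as plt

def show_reconstruction(V, alpha, patch, width=12):
    # Compare an original width x width patch with its sparse reconstruction V @ alpha.
    recon = (V @ alpha).reshape(width, width)
    fig, axes = plt.subplots(1, 2)
    axes[0].imshow(patch, cmap='gray'); axes[0].set_title('original')
    axes[1].imshow(recon, cmap='gray'); axes[1].set_title('reconstruction')
    plt.show()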
2. Convolutional networks for MNIST (60 points)
The code in this notebook allows you to train a particular convolutional neural network
(which we call the original model) on MNIST data. It also saves the model in your directory
and has code to reload the model and continue training or to simply test the model on a new
data set. You have two options:
• You can download the notebook to your computer and remove the first cell, which is
specific to Google Drive.
• Or, you can save this notebook to your Google Drive by going to the File menu and
choosing Save a copy in Drive. This will be saved in your Google Colab folder,
as explained in an earlier message. Upload the MNIST data to your Google Drive and
you're ready to go. The advantage is that you can activate a GPU and have the algorithm
run very fast. Once you open your own Colab notebook, go to the Runtime menu, choose
Change runtime type, and pick GPU from the dropdown menu.
(a) Compute the total number of parameters in the original model, and run this model.
You shouldn't run more than 20 epochs. (On the RCC with 8 cores it takes about 90
seconds per epoch with all the training data.) You can use only 10,000 training
examples to expedite the experiments. For each experiment, plot the error rate on training
and validation data as a function of the epoch number.
Show an image with the 32 5×5 filters that are estimated in the first layer of the model.
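If the provided notebook is written in PyTorch (an assumption; a Keras model reports its parameter count directly via model.summary()), the parameter count and the first-layer filter grid could be obtained along these lines, where model and its first convolutional layer are placeholders for the objects defined in the notebook:

import matplotlib.pyplot as plt

def count_parameters(model):
    # Total number of trainable parameters in a PyTorch model.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def show_first_layer_filters(conv_layer, rows=4, cols=8):
    # Display the 32 first-layer 5x5 filters as a grid of grayscale images.
    W = conv_layer.weight.detach().cpu().numpy()   # expected shape (32, 1, 5, 5) for MNIST
    fig, axes = plt.subplots(rows, cols, figsize=(cols, rows))
    for k, ax in enumerate(axes.ravel()):
        ax.imshow(W[k, 0], cmap='gray')
        ax.axis('off')
    plt.show()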
(b) Experiment with changing the parameters of the network:
i. Keep the same number of layers and change the layer parameters, reducing the number
of parameters by half in one experiment and doubling the number of parameters in
another. Try a few different options. Report the results.
ii. Design a deeper network with more or less the same number of parameters as the
original network. Report the results.
iii. Once you pick the best configuration, try it on the full training set and report the
result.
(c) Handling variability. A transformed data set has been created at
/project2/cmsc25025/mnist/MNIST_TRANSFORM.npy by taking each digit,
rotating it by a random angle in [-40,-20] or [20,40] degrees, applying a random shift of
+/- 3 pixels in each direction, and applying a random scale factor in [0.9, 1.1].
Display a few of these examples alongside the original digits.
Use the original architecture to test on this data set; the classification rate drops
dramatically.
Propose changes to the network architecture so that, while still training on the original
training set, you perform better on the transformed test set. Perform some experiments
using a transformed validation set and show the final results on the transformed
test set.
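To display such examples, one option is to regenerate the described transformation on a few original digits; the sketch below uses scipy.ndimage and follows the description above, but the exact order of operations and interpolation used to build MNIST_TRANSFORM.npy are not specified, so treat it as an approximation for visualization only:

import numpy
import random
from scipy.ndimage import rotate, shift, zoom

def transform_digit(img):
    # img: a 28 x 28 digit. Apply a random rotation in [-40,-20] or [20,40] degrees,
    # a random shift of +/- 3 pixels in each direction, and a random scale in [0.9, 1.1],
    # then crop or zero-pad back to 28 x 28 around the center.
    angle = random.uniform(20, 40) * random.choice([-1, 1])
    out = rotate(img, angle, reshape=False)
    out = shift(out, (random.uniform(-3, 3), random.uniform(-3, 3)))
    out = zoom(out, random.uniform(0.9, 1.1))
    h = out.shape[0]                     # zoom keeps the array square here
    canvas = numpy.zeros((28, 28))
    if h >= 28:
        r0 = (h - 28) // 2
        canvas = out[r0:r0 + 28, r0:r0 + 28]
    else:
        r0 = (28 - h) // 2
        canvas[r0:r0 + h, r0:r0 + h] = out
    return canvas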