Assignment 4 - Foundations of Machine Learning CSCI3151 - Dalhousie University

Q1 (30%)
Clustering
In this question we are going to explore two different clustering methods on the wine dataset
and evaluate them using two measures: one is an intrinsic measure (no labels), while the other
makes use of the available labels.

a) Cluster the dataset using the k-Means clustering algorithm without using the class information
as part of the features. Experiment with different numbers of clusters ranging from 2 to 5. What
is the variability of the resulting clusters as a function of different initializations? Use the
Silhouette coefficient and Adjusted Rand Index as metrics for evaluation.
b) Cluster the dataset using DBSCAN without using the class information as part of the features.
Experiment with different values for the parameters Eps and minPts. It’s up to you how you deal
with any outliers, but make sure your criterion is properly justified. What is the variability of the
resulting clusters as a function of different initializations? Use the Silhouette coefficient and
Adjusted Rand Index as metrics for evaluation.
c) Summarize your findings and comparison after experimenting with these two clustering
methods.
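A minimal sketch of the evaluation loop for parts (a) and (b), assuming scikit-learn is used and that the "wine dataset" is the copy bundled as `sklearn.datasets.load_wine` (the assignment names neither). Standardizing the features, the list of seeds, the Eps/minPts values, and the choice to drop DBSCAN noise points before scoring are illustrative assumptions, not prescribed settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import load_wine
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler

# Assumption: the scikit-learn copy of the wine data. The labels are kept aside
# and used only for the Adjusted Rand Index, never as features.
X_raw, y_true = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X_raw)


def kmeans_scores(X, y_true, k, seeds):
    """Run k-Means once per seed and return one (silhouette, ARI) row per run."""
    scores = []
    for seed in seeds:
        # n_init=1 so that each run reflects a single random initialization.
        labels = KMeans(n_clusters=k, n_init=1, random_state=seed).fit_predict(X)
        scores.append((silhouette_score(X, labels), adjusted_rand_score(y_true, labels)))
    return np.array(scores)


def dbscan_scores(X, y_true, eps, min_pts):
    """Run DBSCAN and score it on the non-noise points only (one possible choice)."""
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X)
    keep = labels != -1  # DBSCAN marks noise points with the label -1
    n_clusters = len(set(labels[keep]))
    n_noise = int((~keep).sum())
    if n_clusters < 2:
        # The Silhouette coefficient is undefined for fewer than two clusters.
        return n_clusters, n_noise, None, None
    sil = silhouette_score(X[keep], labels[keep])
    ari = adjusted_rand_score(y_true[keep], labels[keep])
    return n_clusters, n_noise, sil, ari


kmeans_results = {k: kmeans_scores(X, y_true, k, seeds=range(10)) for k in range(2, 6)}
for k, scores in kmeans_results.items():
    mean, std = scores.mean(axis=0), scores.std(axis=0)
    print(f"k={k}: silhouette {mean[0]:.3f} +/- {std[0]:.3f}, ARI {mean[1]:.3f} +/- {std[1]:.3f}")

for eps in (2.0, 2.5, 3.0):  # illustrative values only
    print(eps, dbscan_scores(X, y_true, eps=eps, min_pts=5))
```

Reporting the mean and standard deviation over seeds is one way to quantify the variability asked about in part (a). Whatever treatment of noise points is chosen for part (b), the number of discarded points should be reported next to the scores, since dropping them tends to make both metrics look better.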

Q2 (40%)
Convolutional neural networks
In this question you will construct a convolutional neural network to classify a large set of low
resolution images. Similarly to what you have done with A2, we would like you to describe the
behavior of the network as you modify certain parameters.

Use the CIFAR-100 dataset (available from Keras)

from keras.datasets import cifar100

(x_train_original, y_train_original), (x_test_original, y_test_original) = cifar100.load_data(label_mode='fine')

a) Using two convolutional layers, explore the impact of different choices for the number of
nodes and filter sizes for the two layers. Summarize your observations.
b) For the hyperparameters for the layers that you determined in part (a), experiment with a
higher number of epochs. Summarize your observations.
c) Experiment with two more activation functions in the convolutional layers using the optimal
number of epochs from part (b). Are you able to improve the validation accuracy?
d) For the result in (c), calculate test accuracy for each class. Do you observe lower accuracy
for any specific class or classes? Do you observe higher accuracy for any specific class or
classes?
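For part (d), per-class test accuracy can be computed directly from the true and predicted class indices. The sketch below assumes `y_true` holds integer fine labels (for example `y_test_original.ravel()`) and `y_pred` holds the arg-max of the model's predicted probabilities. The helper name `per_class_accuracy` and the toy arrays are hypothetical.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, n_classes):
    """Return an array whose entry c is the accuracy on the samples of class c."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    totals = np.bincount(y_true, minlength=n_classes)
    hits = np.bincount(y_true[y_true == y_pred], minlength=n_classes)
    # Classes with no test samples get NaN instead of a division error.
    return np.where(totals > 0, hits / np.maximum(totals, 1), np.nan)


# Toy example with 3 classes; with CIFAR-100 fine labels n_classes would be 100.
y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 0, 2])
acc = per_class_accuracy(y_true, y_pred, n_classes=3)
print(acc)                  # class 0: 1/2, class 1: 2/3, class 2: 1/1
print(np.argsort(acc)[:2])  # indices of the two classes with the lowest accuracy
```

Sorting the resulting array, as in the last line, is a quick way to list the hardest and easiest classes.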

Q3 (30%)
Recurrent Neural Networks
In this question you will experiment with a simple recurrent neural network, where you will try to
model a sinusoidal function with noise, whose amplitude becomes larger and larger as the
independent variable t increases (0 ≤ t ≤ N). This function can be expressed in Python as
x=(np.sin(0.02*t)+2*np.random.rand(N))*(t/N) . For N=5000 we have:

[Figure: plot of x against t for N=5000]

The idea is that you will train a recurrent neural network with points up to a certain value, Tp.
That is, all the training points will satisfy t ≤ Tp. The length of the sequence provided to the network
is a parameter that you can tune.
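A minimal sketch of how the series and the fixed-length training sequences might be built with NumPy. It assumes `t` is the integer grid `np.arange(N)` and that each window of `length` consecutive values is used to predict the next value; the provided notebook may do this differently. The value of `Tp` below and the helper name `make_windows` are placeholders for illustration, not the notebook's settings.

```python
import numpy as np

N = 5000
Tp = 4000   # hypothetical split point, for illustration only
length = 4  # number of past values fed to the network per sample

np.random.seed(0)  # fixed seed so the sketch is repeatable
t = np.arange(N)   # assumption: integer time grid 0, 1, ..., N-1
x = (np.sin(0.02 * t) + 2 * np.random.rand(N)) * (t / N)


def make_windows(series, length):
    """Slice a 1-D series into (window, next value) pairs."""
    # Row i holds series[i : i + length]; the last window has no following value.
    windows = np.lib.stride_tricks.sliding_window_view(series, length)[:-1]
    targets = series[length:]
    # Keras recurrent layers expect input of shape (samples, timesteps, features).
    return windows[..., np.newaxis], targets


train, test = x[t <= Tp], x[t > Tp]
X_train, y_train = make_windows(train, length)
X_test, y_test = make_windows(test, length)
print(X_train.shape, y_train.shape)  # (3997, 4, 1) (3997,)
```

Changing `length` only changes the second dimension of the input array, which makes it easy to rerun the same experiment for several sequence lengths.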
You will use the notebook provided for this question named “A4Q3.ipynb”
a) Complete the notebook provided to fulfill this task. You need to add at least one
SimpleRNN layer and a proper output layer.
b) Starting with length = 4, discuss how different choices of the length of the sequence fed
to the network can have an impact on performance.
c) Discuss how different network architectures can have an impact on performance.
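To make parts (b) and (c) more concrete, the NumPy sketch below spells out the recurrence that a simple (Elman-style) recurrent layer applies, h_t = tanh(x_t W + h_{t-1} U + b), which is the computation Keras documents for `SimpleRNN` with its default tanh activation. The weights here are random and untrained, and the function name and sizes are hypothetical. The point is only that the same weights are reused at every time step, so the sequence length changes how much history reaches the output but not the number of parameters.

```python
import numpy as np


def simple_rnn_forward(x_seq, W, U, b):
    """Return the final hidden state after reading x_seq of shape (timesteps, features)."""
    h = np.zeros(U.shape[0])
    for x_t in x_seq:
        # The same W, U and b are applied at every step (weight sharing over time).
        h = np.tanh(x_t @ W + h @ U + b)
    return h


rng = np.random.default_rng(0)
features, units = 1, 8  # hypothetical sizes
W = rng.normal(scale=0.5, size=(features, units))  # input-to-hidden weights
U = rng.normal(scale=0.5, size=(units, units))     # hidden-to-hidden weights
b = np.zeros(units)

for length in (4, 16, 64):
    x_seq = rng.random((length, features))
    h_last = simple_rnn_forward(x_seq, W, U, b)
    # The parameter count stays at units * (units + features + 1) for every length.
    print(length, h_last.shape, W.size + U.size + b.size)
```

Architectural changes such as more units or stacked recurrent layers do change the parameter count, which is one axis along which the discussion in part (c) can be organized.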

Note 1: for simplicity, don’t change the values of the variables N and Tp.
Note 2: make sure you understand the code before addressing these questions.

Submitting the assignment
Note that there will be three separate Assignment 4 submissions on Brightspace, i.e. one for each
question (A4-Q1, A4-Q2, and A4-Q3).
1. Your assignment as a single .ipynb file including your answers should be submitted for each
question before the deadline on Brightspace.
Use markdown syntax to format your answers.
2. Do not clear your notebook. Keep all the results in your notebook. (NEW)
3. You can submit multiple versions of your assignment. Only the last one will be marked. It is
recommended to upload a complete submission, even if you are still improving it, so that you
have something in the system if your computer fails for whatever reason.
4. IMPORTANT: PLEASE NAME YOUR PYTHON NOTEBOOK FILE AS:
--Assignment-N-Q.ipynb, for example
Soto-Axel-Assignment-2-1.ipynb (for the first question of the second assignment)
A penalty applies if the format is not correct.
5. The markers will enter your marks and their overall feedback on Brightspace. If there is
any important feedback, it will be given to you; otherwise you will need to refer to
the model solutions.

Marking the assignment

Criteria and weights. Each criterion is marked by a letter grade. The overall mark is the
weighted average of the grades of the criteria.
For the experimental questions:
0.2 Clarity: All steps are clearly described. The origin of all code used is clearly stated. Markdown is
used effectively to format the answer to make it easier to read and grasp the main points. Links
have been added to all online resources used (markdown syntax is: [AnchorText](URL) ).
0.2 Justification: Parameter choices or processes are well justified.
0.2 Results: The results are complete. The results are presented in a manner that is easy to
understand. The answer is selective in the amount and diversity of the experimental results
presented.
Only key results that support the insights are presented. There is no need to present every
single experiment you carried out. Only the interesting results are presented, where the
behaviour of the ML model varies.
0.4 Insights: The insights obtained from the experimental results are clearly explained. The
insights are connected with the concepts discussed in the lectures.
The insights can also include statistical considerations (separate training-test data,
cross-validation, variance). Preliminary investigation of the statistical properties of the attributes
(e.g. histogram, mean, standard deviation) is included.