Statistical Learning-Classification
STAT 841 / 441, CM 763
Assignment 2
Department of Statistics and Actuarial Science
University of Waterloo
Policy on Lateness: No assignments are accepted after the due date.
1. Face detection:
Download the faces.mat from the course webpage.
The file faces.mat is composed of train_faces, train_nonfaces, test_faces, and test_nonfaces.
Make a training and a test set as follows:
training_data=[train_faces' train_nonfaces'];
% (This will be a 361 by 4858 matrix.)
test_data=[test_faces' test_nonfaces'];
% (This will be a 361 by 944 matrix.)
• Write a program to fit a logistic regression model to the training data. Report
the first 5 components of the optimum value of the logistic parameter β, as well
as the training error and the test error.
Note: Attach your code to your assignment as an appendix, and submit the code to
the assignment drop box in Learn as well.
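As a starting point, the fitting step in Question 1 might be sketched as follows. This is a minimal illustration in plain Python, not a prescribed method: it assumes batch gradient descent (Newton-Raphson is an equally valid choice), labels coded as 0/1, and an intercept stored as the first component of β. The names `fit_logistic`, `error_rate`, `X_toy`, and `y_toy` are hypothetical, and the toy data stand in for faces.mat.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit beta (intercept first) by batch gradient descent
    on the mean negative log-likelihood."""
    n, d = len(X), len(X[0])
    beta = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)          # prepend the constant feature
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += (p - yi) * v     # gradient of the negative log-likelihood
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta

def predict(beta, X):
    # Classify as 1 when the fitted probability is at least 0.5.
    return [1 if sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) >= 0.5 else 0
            for xi in X]

def error_rate(beta, X, y):
    preds = predict(beta, X)
    return sum(p != t for p, t in zip(preds, y)) / len(y)

# Hypothetical toy data (one feature, overlapping classes), NOT faces.mat.
X_toy = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_toy = [0, 0, 1, 0, 1, 1]
beta_hat = fit_logistic(X_toy, y_toy)
print("first components of beta:", beta_hat[:5])
print("training error:", error_rate(beta_hat, X_toy, y_toy))
```

For the actual data, each column of training_data would become one row of `X`, with label 1 for faces and 0 for nonfaces (an assumed coding). The same `error_rate` call on the test set would give the test error.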
2. Download the Ionosphere dataset from the course webpage:
a) Write a program to fit a single-hidden-layer neural network via back-propagation
and weight decay. You cannot use built-in functions or deep network frameworks
(e.g., Keras, PyTorch). You need to implement back-propagation yourself.
b) Apply your program from part a) to the data. Choose Ion.test as the test set and
Ion.trin as the training set. Plot the training and test error curves as a function
of the number of epochs for four different values of the weight decay parameter.
Discuss the overfitting behavior in each case.
c) Set the value of weight decay equal to zero, then vary the number of hidden units
in the network (starting from 1 unit), and determine the minimum number needed
to perform well on this task. Plot the training and test error curves as a function
of the number of hidden units.
d) Select the best model (the optimum number of hidden nodes or the best value for
weight decay) and classify the test data using the network and report the observed
misclassification error rate. Construct a 2 by 2 table of the form
          ĥ(x) = 0    ĥ(x) = 1
y = 0        ?           ?
y = 1        ?           ?
Note: Attach your code to your assignment as an appendix and submit the code to
the assignment drop box in D2L as well.
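The pieces that Question 2 asks for (back-propagation, weight decay, error curves per epoch, and the 2 by 2 table) might be organized roughly as in the sketch below. It is an illustration under stated assumptions, not a reference solution: sigmoid hidden and output units, cross-entropy loss, per-sample stochastic gradient descent, labels coded 0/1, and weight decay applied to the weights but not the biases. All function names and the toy data are hypothetical, and the toy data are not the Ionosphere set.

```python
import math
import random

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def init_network(d, h, rng):
    # W1: h x d hidden weights, b1: hidden biases, w2: output weights, b2: output bias.
    return {"W1": [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(h)],
            "b1": [0.0] * h,
            "w2": [rng.uniform(-0.5, 0.5) for _ in range(h)],
            "b2": 0.0}

def forward(net, x):
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    out = sigmoid(sum(w * a for w, a in zip(net["w2"], hidden)) + net["b2"])
    return hidden, out

def train_epoch(net, X, y, lr, decay):
    """One pass of per-sample gradient descent with L2 weight decay."""
    for x, t in zip(X, y):
        hidden, out = forward(net, x)
        delta_out = out - t  # output delta for cross-entropy loss with a sigmoid output
        # Back-propagate through the output weights as they were before this update.
        delta_hid = [delta_out * w * a * (1.0 - a) for w, a in zip(net["w2"], hidden)]
        net["w2"] = [w - lr * (delta_out * a + decay * w)
                     for w, a in zip(net["w2"], hidden)]
        net["b2"] -= lr * delta_out
        for j, dj in enumerate(delta_hid):
            net["W1"][j] = [w - lr * (dj * v + decay * w)
                            for w, v in zip(net["W1"][j], x)]
            net["b1"][j] -= lr * dj

def error_rate(net, X, y):
    wrong = sum((forward(net, x)[1] >= 0.5) != (t == 1) for x, t in zip(X, y))
    return wrong / len(y)

def error_curves(X_tr, y_tr, X_te, y_te, hidden_units, decay, epochs, lr=0.5, seed=0):
    net = init_network(len(X_tr[0]), hidden_units, random.Random(seed))
    train_err, test_err = [], []
    for _ in range(epochs):
        train_epoch(net, X_tr, y_tr, lr, decay)
        train_err.append(error_rate(net, X_tr, y_tr))
        test_err.append(error_rate(net, X_te, y_te))
    return net, train_err, test_err

def confusion_table(net, X, y):
    table = [[0, 0], [0, 0]]  # rows: true y, columns: predicted class
    for x, t in zip(X, y):
        table[t][1 if forward(net, x)[1] >= 0.5 else 0] += 1
    return table

# Hypothetical toy data: class 1 when the first feature is large.
X_tr = [[0.0, 0.0], [0.1, 0.9], [0.2, 0.4], [0.8, 0.1], [0.9, 0.7], [1.0, 1.0]]
y_tr = [0, 0, 0, 1, 1, 1]
X_te = [[0.1, 0.5], [0.9, 0.2]]
y_te = [0, 1]
net, train_err, test_err = error_curves(X_tr, y_tr, X_te, y_te,
                                        hidden_units=3, decay=0.0, epochs=3000)
print("final training error:", train_err[-1])
print("confusion table (train):", confusion_table(net, X_tr, y_tr))
```

Calling `error_curves` once per weight decay value (part b) or once per number of hidden units (part c) yields the lists to plot. Because the decay term is applied at every sample update, its effective strength scales with the training set size, which is a design choice of this sketch.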
3. In a maximum likelihood problem, we can define an error function by taking the negative
logarithm of the likelihood. Show that the error function for the logistic regression
model is a convex function of β, and hence show that it has a unique minimum value.
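For orientation, the quantities involved in Question 3 can be written out as below. This is a sketch under assumptions not stated in the question: labels y_i in {0, 1}, each x_i augmented with a constant 1 so that β includes the intercept, and σ denoting the logistic function.

```latex
\begin{align*}
p_i &= \sigma(\beta^\top x_i) = \frac{1}{1 + e^{-\beta^\top x_i}}, \\
E(\beta) &= -\sum_{i=1}^{n} \bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \bigr], \\
\nabla E(\beta) &= \sum_{i=1}^{n} (p_i - y_i)\, x_i, \\
\nabla^2 E(\beta) &= \sum_{i=1}^{n} p_i (1 - p_i)\, x_i x_i^\top, \\
v^\top \nabla^2 E(\beta)\, v &= \sum_{i=1}^{n} p_i (1 - p_i)\, (v^\top x_i)^2 \;\ge\; 0 \quad \text{for all } v.
\end{align*}
```

The last line says the Hessian is positive semidefinite, which is the usual route to convexity. Strict convexity additionally needs the x_i to span the input space, a condition assumed here rather than given in the question.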
Only for Grad Students
4. Consider a multiclass logistic regression model (multilogit model) applied to d-dimensional
data with K classes. Let β be the (d + 1)(K − 1)-vector consisting of all the coefficients.
Define a suitably enlarged version of the input vector x to accommodate this
vectorized coefficient vector. Derive the Newton-Raphson algorithm for maximizing
the log-likelihood, and describe how you would implement this algorithm.
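One possible layout for the setup in Question 4 is sketched below. The notation is an assumption rather than the course's: x_i is augmented with a constant 1, class K is the reference class, y_{ik} = 1 when observation i belongs to class k, X is the n by (d + 1) data matrix, y and p stack the indicators and fitted probabilities class by class, and the enlarged design matrix is block diagonal with K − 1 copies of X.

```latex
\begin{align*}
p_k(x;\beta) &= \frac{\exp(\beta_k^\top x)}{1 + \sum_{l=1}^{K-1} \exp(\beta_l^\top x)},
  \qquad k = 1, \dots, K-1, \\
\ell(\beta) &= \sum_{i=1}^{n} \Bigl[ \sum_{k=1}^{K-1} y_{ik}\, \beta_k^\top x_i
  - \log\Bigl( 1 + \sum_{l=1}^{K-1} \exp(\beta_l^\top x_i) \Bigr) \Bigr], \\
\tilde{X} &= I_{K-1} \otimes X, \qquad
W_{km} = \operatorname{diag}\bigl( p_{ik} (\delta_{km} - p_{im}) \bigr)_{i=1}^{n}, \\
\frac{\partial \ell}{\partial \beta} &= \tilde{X}^\top (y - p), \qquad
\frac{\partial^2 \ell}{\partial \beta\, \partial \beta^\top} = -\tilde{X}^\top W \tilde{X}, \\
\beta^{\text{new}} &= \beta^{\text{old}}
  + \bigl( \tilde{X}^\top W \tilde{X} \bigr)^{-1} \tilde{X}^\top (y - p).
\end{align*}
```

Here W is the (K − 1) by (K − 1) array of diagonal blocks W_{km}. In an implementation, the update would typically be computed by solving the linear system rather than forming the inverse explicitly.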
5. Consider a classification model for two classes with prior class probabilities π_k, k = 1, 2.
Suppose that the class-conditional densities are given by Gaussian distributions with
a shared covariance matrix. Suppose we are given a training data set {(x_i, y_i)}, where
i = 1, ..., n, and y_i ∈ {0, 1} are class labels. Assume that the data points are drawn
independently from this model.
a) Compute the maximum-likelihood estimate of the prior probabilities.
b) Compute the maximum-likelihood estimate of the mean of the Gaussian distribution
for each class.
c) Compute the maximum-likelihood estimate of the shared covariance matrix.
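Once the three estimates in Question 5 have been derived, a small numerical check can help confirm them. The sketch below assumes the standard closed forms (class proportions, per-class sample means, and a pooled covariance divided by n rather than n − 2) and uses hypothetical toy data and a hypothetical function name; it does not replace the derivation.

```python
def gaussian_shared_cov_mle(X, y):
    """Return (priors, means, cov) for Gaussian classes with a shared covariance.
    priors and means are dicts keyed by class label; cov is a d x d list of lists."""
    n, d = len(X), len(X[0])
    priors, means = {}, {}
    for k in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == k]
        priors[k] = len(rows) / n                      # class proportion
        means[k] = [sum(col) / len(rows) for col in zip(*rows)]
    cov = [[0.0] * d for _ in range(d)]
    for x, t in zip(X, y):
        diff = [v - m for v, m in zip(x, means[t])]    # deviation from own class mean
        for a in range(d):
            for b in range(d):
                cov[a][b] += diff[a] * diff[b] / n     # maximum likelihood divides by n
    return priors, means, cov

# Hypothetical toy data: four points in class 0, two points in class 1.
X_toy = [[0, 0], [2, 0], [0, 2], [2, 2], [4, 4], [6, 6]]
y_toy = [0, 0, 0, 0, 1, 1]
priors, means, cov = gaussian_shared_cov_mle(X_toy, y_toy)
print(priors, means, cov)
```

On this toy set the class means come out as (1, 1) and (5, 5), and the pooled covariance has unit variances with covariance 1/3, which can be verified by hand.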