Assignment 1
FIT5201: Machine Learning
Due date: 21:59:59, 15 Sep 2021
Please note that,
1. Even a 1-second delay will be penalized as a 1-3 day delay, so please submit your assignment in advance (allowing for possible internet delays) and do not wait until the last minute.
2. We will not accept any resubmitted version, so please double-check your assignment before submission.
Objectives. This assignment assesses your understanding of model complexity, model selection, uncertainty in prediction with bootstrapping, probabilistic machine learning, and linear models for regression and classification, covered in Modules 1, 2, and 3. The total mark for this assignment is 150, and it constitutes 25% of your final mark for this unit.
Section A. Model Complexity and Model Selection In this section, you study the effect of model complexity on the training and testing error. You also demonstrate your programming skills by developing a regression algorithm and a cross-validation technique that will be used to select the model with the most effective complexity.
Background. A KNN regressor is similar to a KNN classifier (covered in Activity 1.1) in that it finds the K nearest neighbours and estimates the value of the given test point based on the values of its neighbours. The main difference is that a KNN classifier returns the label with the majority vote in the neighbourhood, whilst a KNN regressor returns the average of the neighbours' values. In Activity 1 of Module 1, we use the number of misclassifications as the measure of training and testing error for the KNN classifier. For the KNN regressor, you need to choose another error function (e.g., the sum of squared errors) as the measure of training and testing error.
Question 1 [KNN Regressor, 20 Marks] I. [5 marks] Implement the KNN regressor function:
knn(train.data, train.label, test.data, K=4)
which takes the training data and their labels (continuous values), the test data, and the size of the neighbourhood (K), and returns the regressed values for the test data points. Note that you need a distance function to choose the neighbours; use the Euclidean distance function to measure the distance between a pair of data points.
Hint: You are allowed to use KNN classifier code from Activity 1 of Module 1.
II. [5 marks] Plot the training and the testing errors versus 1/K for K=1,...,25 in one plot, using the Task1A_train.csv and Task1A_test.csv datasets provided for this assignment. Save the plot in your Jupyter Notebook file for Question 1. Report your chosen error function in your Jupyter Notebook file.
III. [10 marks] Report (in your Jupyter Notebook file) the optimum value for K in terms of the testing error. Discuss the values of K and the model complexity corresponding to underfitting and overfitting based on your plot in the previous part (Part II).
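As a sketch of what Part I asks for (Python shown for illustration; the parameter names follow the signature above, with dots replaced by underscores since dots are not valid in Python identifiers):

```python
import numpy as np

def knn(train_data, train_label, test_data, K=4):
    """KNN regressor: for each test point, return the mean label of its
    K nearest training points under Euclidean distance."""
    train_data = np.asarray(train_data, float)
    train_label = np.asarray(train_label, float)
    test_data = np.asarray(test_data, float)
    preds = np.empty(len(test_data))
    for i, x in enumerate(test_data):
        dist = np.sqrt(((train_data - x) ** 2).sum(axis=1))  # Euclidean distances
        nearest = np.argsort(dist)[:K]          # indices of the K closest points
        preds[i] = train_label[nearest].mean()  # average, not majority vote
    return preds
```

For example, with training points 0,1,2,3 carrying labels 0,1,2,3, a query at 0.1 with K=2 averages the labels of points 0 and 1 and returns 0.5.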
Question 2 [L-fold Cross Validation, 15 Marks]
I. [5 marks] Implement an L-fold Cross Validation (CV) function for your KNN regressor: cv(train.data, train.label, K, numFold) which takes the training data and their labels (continuous values), the neighbourhood size K, and the number of folds, and returns the errors for the different folds of the training data.
II. [8 marks] Using the training data from Question 1, run your L-fold CV with numFold set to 10. Vary K=1,...,15 in your KNN regressor, and for each K compute the average of the 10 error numbers you obtained. Plot these average errors versus 1/K for K=1,...,15. Save the plot in your Jupyter Notebook file for Question 2.
III. [2 marks] Report (in your Jupyter Notebook file) the optimum value for K based on your plot for this 10-fold cross validation in the previous part (Part II).
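A minimal sketch of the cross-validation loop, assuming the knn regressor from Question 1 (a compact copy is included so the snippet is self-contained) and mean squared error as the error function:

```python
import numpy as np

def knn(tr_x, tr_y, te_x, K=4):
    """Compact KNN regressor (Question 1): average of K nearest labels."""
    tr_x, tr_y, te_x = (np.asarray(a, float) for a in (tr_x, tr_y, te_x))
    d = np.sqrt(((te_x[:, None, :] - tr_x[None, :, :]) ** 2).sum(-1))
    return tr_y[np.argsort(d, axis=1)[:, :K]].mean(axis=1)

def cv(train_data, train_label, K, numFold=10):
    """L-fold CV for the KNN regressor: returns one error value per fold."""
    X = np.asarray(train_data, float)
    y = np.asarray(train_label, float)
    folds = np.array_split(np.random.permutation(len(X)), numFold)  # shuffled split
    errors = []
    for f in folds:
        mask = np.ones(len(X), dtype=bool)
        mask[f] = False                                   # hold out this fold
        pred = knn(X[mask], y[mask], X[f], K=K)
        errors.append(float(np.mean((pred - y[f]) ** 2)))  # MSE on held-out fold
    return errors
```

Averaging the returned list for each K gives the single number plotted against 1/K in Part II.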
Section B. Prediction Uncertainty with Bootstrapping This section adapts Activity 1.2 from KNN classification to KNN regression. You use the bootstrapping technique to quantify the uncertainty of predictions for the KNN regressor that you implemented in Section A.
Background. Please refer to the background in Section A.
Question 3 [Bootstrapping, 25 Marks]
I. [5 marks] Modify the code in Activity 1.2 to handle bootstrapping for KNN regression.
II. [5 marks] Load the Task1B_train.csv and Task1B_test.csv sets. Apply your bootstrapping for KNN regression with times = 50 (the number of subsets), size = 60 (the size of each subset), and vary K=1,...,15 (the neighbourhood size). Now create a boxplot where the x-axis is K and the y-axis is the average test error (and the uncertainty around it) corresponding to each K. Save the plot in your Jupyter Notebook file for Question 3.
Hint: You can refer to the boxplot in Activity 1.2 of Module 1, but note that the error is measured differently than for the KNN classifier.
III. [5 marks] Based on the plot in the previous part (Part II), how do the test error and its uncertainty behave as K increases? Explain in your Jupyter Notebook file.
IV. [5 marks] Load the Task1B_train.csv and Task1B_test.csv sets. Apply your bootstrapping for KNN regression with K=10 (the neighbourhood size), size = 40 (the size of each subset), and vary times = 10, 20, 30,...,200 (the number of subsets). Now create a boxplot where the x-axis is 'times' and the y-axis is the average error (and the uncertainty around it) corresponding to each value of 'times'. Save the plot in your Jupyter Notebook file for Question 3.
V. [5 marks] Based on the plot in the previous part (Part IV), how do the test error and its uncertainty behave as the number of subsets in bootstrapping increases? Explain in your Jupyter Notebook file.
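The bootstrap loop of Parts II and IV could be sketched as follows (an illustrative Python version, again assuming the Question 1 regressor and mean squared error; times, size, and K follow the names in the question):

```python
import numpy as np

def knn(tr_x, tr_y, te_x, K=4):
    """Compact KNN regressor (Question 1): average of K nearest labels."""
    tr_x, tr_y, te_x = (np.asarray(a, float) for a in (tr_x, tr_y, te_x))
    d = np.sqrt(((te_x[:, None, :] - tr_x[None, :, :]) ** 2).sum(-1))
    return tr_y[np.argsort(d, axis=1)[:, :K]].mean(axis=1)

def boot_knn_errors(tr_x, tr_y, te_x, te_y, K=4, times=50, size=60, seed=0):
    """One test-set MSE per bootstrap subset; the spread of the returned
    list is the uncertainty shown in the boxplot."""
    rng = np.random.default_rng(seed)
    tr_x, tr_y = np.asarray(tr_x, float), np.asarray(tr_y, float)
    errs = []
    for _ in range(times):
        idx = rng.integers(0, len(tr_x), size=size)   # sample with replacement
        pred = knn(tr_x[idx], tr_y[idx], te_x, K=K)
        errs.append(float(np.mean((pred - np.asarray(te_y, float)) ** 2)))
    return errs
```

Calling this once per K (or once per value of 'times') and feeding the resulting lists to a boxplot produces the figures asked for above.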
Section C. Probabilistic Machine Learning
In this section, you show your knowledge of the foundations of probabilistic machine learning (i.e., probabilistic inference and modeling) by solving a simple but fundamental statistical inference problem. Solve the following problem based on the probability concepts you learned in Module 1, using the same mathematical conventions.
Question 4 [Bayes Rule, 20 Marks] Recall the simple example from Appendix A of Module 1. Suppose we have one red, one blue, and one yellow box. In the red box we have 3 apples and 5 oranges, in the blue box we have 4 apples and 4 oranges, and in the yellow box we have 1 apple and 1 orange. Now suppose we randomly selected one of the boxes and picked a fruit. If the picked fruit is an apple, what is the probability that it was picked from the yellow box? Note that the chances of picking the red, blue, and yellow boxes are 50%, 30%, and 20% respectively, and the selection chance is equal for all the pieces in a box. Please show your work in your PDF report.
Hint: You can formalize this problem following the notation in the "Random Variable" paragraph in Appendix A of Module 1.
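As a quick numerical check of the Bayes-rule computation (the box contents and selection probabilities are taken from the question above; the written derivation in your report is still required):

```python
# Prior probability of selecting each box (given in the question)
prior = {'red': 0.5, 'blue': 0.3, 'yellow': 0.2}
# P(apple | box), from the stated box contents
p_apple = {'red': 3 / 8, 'blue': 4 / 8, 'yellow': 1 / 2}

# Evidence P(apple), by the law of total probability
p_a = sum(prior[b] * p_apple[b] for b in prior)
# Bayes rule: P(yellow | apple) = P(apple | yellow) P(yellow) / P(apple)
posterior_yellow = prior['yellow'] * p_apple['yellow'] / p_a
print(round(posterior_yellow, 4))  # prints 0.2286
```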
Section D. Ridge Regression In this section, you develop Ridge Regression by adding L2-norm regularization to linear regression (covered in Activity 2.1 of Module 2) and study the effect of the L2-norm regularization on the training and testing errors. This section assesses your mathematical (derivation), programming, and analytical skills.
Question 5 [Ridge Regression, 25 Marks]
I. [10 marks] Given the gradient descent algorithm for linear regression (discussed in Chapter 2 of Module 2), derive the weight update steps of stochastic gradient descent (SGD) for linear regression with L2-norm regularisation. Show your work with enough explanation in your PDF report; you should provide the steps of SGD.
Hint: Recall that for linear regression we defined the error function E. For this assignment, you only need to add an L2 regularization term to the error function (error term plus the regularization term). This question is similar to Activity 2.1 of Module 2.
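For reference, the derivation amounts to the following (a sketch, assuming the squared-error per-example loss from Module 2; whether the regularizer carries a factor of 1/2 is a convention, so state yours):

```latex
E_n(\mathbf{w}) \;=\; \tfrac{1}{2}\bigl(\mathbf{w}^{\top}\mathbf{x}_n - t_n\bigr)^2
  \;+\; \tfrac{\lambda}{2}\,\lVert\mathbf{w}\rVert^2,
\qquad
\nabla E_n(\mathbf{w}) \;=\; \bigl(\mathbf{w}^{\top}\mathbf{x}_n - t_n\bigr)\,\mathbf{x}_n
  \;+\; \lambda\,\mathbf{w},
\qquad
\mathbf{w}^{(\tau+1)} \;:=\; \mathbf{w}^{(\tau)}
  - \eta\Bigl[\bigl(\mathbf{w}^{(\tau)\top}\mathbf{x}_n - t_n\bigr)\,\mathbf{x}_n
  + \lambda\,\mathbf{w}^{(\tau)}\Bigr]
```

That is, each SGD step is the ordinary least-squares step plus a shrinkage term proportional to the current weights.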
II. [5 marks] Using R (with no use of special libraries), implement the SGD algorithm that you derived in Part I. The implementation is straightforward, as you are allowed to use the code examples provided.
III. Now let’s study the effect of the L2 norm regularization on the training and testing errors:
a. Load Task1C_train.csv and Task1C_test.csv sets.
b. [5 marks] For each lambda in {0, 0.4, 0.8, ..., 10}, build a regression model and compute the training and testing errors using the provided data sets. While building each model, all parameter settings (initial values, learning rate, etc.) are exactly the same, except the lambda value. Set the termination criterion to a maximum of 20 x N weight updates (where N is the number of training data points). Create a plot of error rates (use different colors for the training and testing errors), where the x-axis is log lambda and the y-axis is the error rate. Save your plot in your Jupyter Notebook file for Question 5.
c. [5 marks] Based on your plot in the previous part (Part b), what is the best value for lambda? Discuss the lambda values, model complexity, and error rates corresponding to underfitting and overfitting, by observing your plot. (Include all your answers in your Jupyter Notebook file.)
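The per-example update rule from Part I could be implemented as below. The assignment asks for R; this Python sketch (with the hypothetical function name sgd_ridge) is for illustration only and translates line by line to R:

```python
import numpy as np

def sgd_ridge(X, t, lam=0.1, eta=0.01, max_updates=None, seed=0):
    """SGD for linear regression with an L2 penalty.
    Per-example update: w <- w - eta * ((w . x_n - t_n) * x_n + lam * w).
    Terminates after max_updates weight updates (the question uses 20 * N)."""
    X = np.asarray(X, float)
    t = np.asarray(t, float)
    N, D = X.shape
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=D)              # random initial weights
    if max_updates is None:
        max_updates = 20 * N
    for u in range(max_updates):
        n = u % N                                  # cycle through the training data
        grad = (w @ X[n] - t[n]) * X[n] + lam * w  # error-term gradient + L2 term
        w -= eta * grad
    return w
```

Running this once per lambda value, with identical initial values and learning rate, yields the models compared in Part III.b.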
Section E. Multiclass Perceptron
In this section, you are asked to demonstrate your understanding of linear models for classification. You expand the binary-class perceptron algorithm covered in Activity 3.1 of Module 3 into a multiclass classifier. Then, you study the effect of the learning rate on the error rate. This section assesses your programming and analytical skills.
Background. Assume we have N training examples {(x1,t1),...,(xN,tN)} where tn can take one of K discrete values {C1, ..., CK}, i.e. a K-class classification problem. We use y to denote the predicted label of a data point x.
Model. To solve a K-class classification problem, we can learn K weight vectors wk, each of which corresponds to one of the classes.
Prediction. At prediction time, a data point x is classified as argmax_k (w_k . x).
Training Algorithm. We train the multiclass perceptron based on the following algorithm:
• Initialise the weight vectors w_1, ..., w_K randomly
• While not converged do:
o For n = 1 to N do:
▪ y = argmax_k (w_k . x_n)
▪ If y != t_n do:
• w_y := w_y − η x_n
• w_{t_n} := w_{t_n} + η x_n
In what follows, we look into the convergence properties of the training algorithm for multiclass perceptron (similar to Activity 3.1 of Module 3).
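The training algorithm above can be sketched as follows (an illustrative Python version; the course materials may use R, and the epoch cap and convergence check are assumptions):

```python
import numpy as np

def train_multiclass_perceptron(X, t, K, eta=0.1, epochs=100, seed=0):
    """Multiclass perceptron following the algorithm above.
    X: (N, D) data matrix; t: integer class labels in {0, ..., K-1}."""
    X = np.asarray(X, float)
    t = np.asarray(t, int)
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(K, X.shape[1]))  # one weight vector per class
    for _ in range(epochs):
        mistakes = 0
        for n in range(len(X)):
            y = int(np.argmax(W @ X[n]))   # predicted class: argmax_k w_k . x_n
            if y != t[n]:
                W[y] -= eta * X[n]         # demote the wrongly predicted class
                W[t[n]] += eta * X[n]      # promote the true class
                mistakes += 1
        if mistakes == 0:                  # a full pass with no errors: converged
            break
    return W

def predict(W, X):
    """Classify each row of X as argmax_k w_k . x."""
    return np.argmax(np.asarray(X, float) @ W.T, axis=1)
```

For Question 6 III you would additionally record the test error after every 5 processed training points rather than only at the end.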
Question 6 [Multiclass Perceptron, 20 Marks]
I. Load Task1D_train.csv and Task1D_test.csv sets.
II. [10 marks] Implement the multiclass perceptron as explained above. Please provide enough comments for your code in your submission.
III. [10 marks] Train two multiclass perceptron models on the provided training data by setting the learning rate η to .09 and .01 respectively. Note that all parameter settings stay the same, except the learning rate, when building each model. For each model, evaluate the error of the model on the test data after processing every 5 training data points (also known as a mini-batch). Then, plot the testing errors of the two models built with learning rates .09 and .01 (with different colors) versus the number of mini-batches. Include it in your Jupyter Notebook file for Question 6. Now, explain how the testing errors of the two models behave differently as the training data increases, by observing your plot. (Include all your answers in your Jupyter Notebook file.)
Section F. Logistic Regression vs. Bayesian Classifier
This task assesses your analytical skills. You study the performance of two well-known generative and discriminative models, i.e. the Bayesian classifier and logistic regression, as the size of the training set increases. Then, you show your understanding of the behavior of the learning curves of typical generative and discriminative models.
Question 7 [Discriminative vs Generative Models, 25 Marks]
I. Load Task1E_train.csv and Task1E_test.csv as well as the Bayesian classifier (BC) and logistic regression (LR) codes from Activities 3.2 and 3.3 in Module 3.
II. [10marks] Using the first 5 data points from the training set, train a BC and a LR model, and compute their training and testing errors. In a “for loop”, increase the size of training set (5 data points at a time), retrain the models and
calculate their training and testing errors until all training data points are used. In one figure, plot the training errors of the BC and LR models (with different colors) versus the size of the training set, and in another figure, plot the testing errors of the BC and LR models (with different colors) versus the size of the training set; include both plots in your Jupyter Notebook file for Question 7.
III. Explain your observations in your Jupyter Notebook file:
a. [5 marks] What happens to each classifier as the number of training data points is increased?
b. [5 marks] Which classifier is best suited when the training set is small, and which is best suited when the training set is big?
c. [5 marks] Justify your observations in the previous questions (III.a & III.b) by providing some speculations and possible reasons.
Hint: Think about model complexity and the fundamental concepts of machine learning covered in Module 1.
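The experimental loop of Part II could be sketched as below. Note this is an assumption-laden stand-in: it uses scikit-learn's GaussianNB and LogisticRegression in place of the course's own BC and LR code from Activities 3.2 and 3.3, which you should use in your actual submission:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

def learning_curves(X_tr, t_tr, X_te, t_te, step=5):
    """Train both classifiers on growing prefixes of the training set
    (step points at a time) and record each model's test error."""
    sizes, bc_err, lr_err = [], [], []
    for m in range(step, len(X_tr) + 1, step):
        Xs, ts = X_tr[:m], t_tr[:m]
        if len(np.unique(ts)) < 2:
            continue                      # need both classes present to fit
        bc = GaussianNB().fit(Xs, ts)
        lr = LogisticRegression(max_iter=1000).fit(Xs, ts)
        sizes.append(m)
        bc_err.append(float(np.mean(bc.predict(X_te) != t_te)))
        lr_err.append(float(np.mean(lr.predict(X_te) != t_te)))
    return sizes, bc_err, lr_err
```

Plotting bc_err and lr_err against sizes (and doing the same with training errors) gives the two learning-curve figures the question asks for.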
Submission & Due Date:
The files that you need to submit are:
1. Jupyter Notebook files containing the code and your answers for Questions {1,2,3,5,6,7}, with the extension ".ipynb". The file names should be in the following format: STUDENTID_assessment_1_qX.ipynb, where X=1,2,3,5,6,7 is the question number. For example, the Notebook for Question 2 should be named
STUDENTID_assessment_1_q2.ipynb
2. You must add enough comments to your code to make it readable and understandable by the tutor. Furthermore, you will be asked to meet (online) with your tutor when your assessment is marked to complete your interview.
3. A PDF file that contains your report; the file name should be in the following format: STUDENTID_assessment_1_report.pdf, where you replace STUDENTID with your own student ID. Only include Q4 and Q5(I) in your PDF report. All files must be submitted via Moodle before the due date and time.
4. Zip all of your files and submit the archive via Moodle. The name of your file must be in the following format:
STUDENTID_FirstName_LastName_assessment_1_report.zip
where in addition to your student ID, you need to use your first name and last name as well.
Assessment Criteria: The following outlines the criteria which you will be assessed against:
• Ability to understand the fundamentals of machine learning and linear models.
• Working code: the code executes without errors and produces correct results.
• Quality of report: your report should show your understanding of the fundamentals of machine learning and linear models by answering the questions in this assessment and attaching the required figures.
Penalties:
• Late submission (−10% of full marks per day; 0 marks awarded after 6 days)
• Jupyter Notebook file is not properly named (-5%)
• The report PDF file is not properly named (-5%)
