STAT 542: Homework 11
Fall 2020, by Ruoqing Zhu (rqzhu)
Due: Monday, Nov 23, 11:59 PM CT

Contents
Instruction
About HW11
Question 1 [50 Points] Linear SVM
Question 2 [50 Points] Non-linear SVM

Instruction

Students are encouraged to work together on homework. However, sharing, copying, or providing any part of a homework solution or code is an infraction of the University's rules on Academic Integrity. Any violation will be punished as severely as possible. Final submissions must be uploaded to compass2g. No email or hardcopy will be accepted. For the late submission policy and grading rubrics, please refer to the course website.

• You are required to submit two files:
  – Your .rmd RMarkdown (or Python) file, which should be saved as HWx_yourNetID.Rmd. For example, HW1_rqzhu.Rmd.
  – The result of knitting your RMarkdown file as HWx_yourNetID.pdf. For example, HW1_rqzhu.pdf. Please note that this must be a .pdf file; .html format cannot be accepted.
• Include your Name and NetID in your report.
• If you use this file or the example homework .Rmd file as a template, be sure to remove this instruction section.
• Your .Rmd file should be written such that, if it is placed in a folder with any data you utilize, it will knit properly without modification.
• Make sure that you set the seed properly so that the results can be replicated.
• For some questions, there will be restrictions on what packages you can use. Please read the requirements carefully.

About HW11

We continue the SVM example from HW10 to perform linear and nonlinear classification using the penalized loss framework.
Both questions are based on the following logistic loss function:

$$L(y, f(x)) = \log\left(1 + e^{-y f(x)}\right)$$

The rest of the job is to solve this optimization problem given the functional form of f(x). To do this, we will utilize a general-purpose optimization package/function. For example, in R, you can use the optim() function. Read the documentation of this function (or equivalent ones in Python) and set up the objective function properly to solve for the parameters. If you need an example of how to use the optim() function, read the corresponding part in the example file provided on our course website (Section 10).

Question 1 [50 Points] Linear SVM

When f(x) is a linear function, SVM can be solved by optimizing the penalized loss:

$$\underset{\beta_0, \, \boldsymbol{\beta}}{\arg\min} \; \sum_{i=1}^{n} L(y_i, \beta_0 + x_i^T \boldsymbol{\beta}) + \lambda \lVert \boldsymbol{\beta} \rVert^2$$

You should generate the data using the code provided below (or write similar code in Python), and answer these questions:

• Write a function to define the penalized loss objective function. The R function optim() can run faster if you further define the gradient function. Hence, you should also define the gradient function properly and implement it in the optimization.
• Choose a reasonable λ value so that your optimization can run properly. In addition, I recommend using the BFGS method in the optimization.
• After solving the optimization problem, plot all data and the decision line.
• If needed, modify your λ so that the model fits reasonably well and re-plot. You don't have to tune λ as long as you obtain a reasonable decision line.
• Report your training error.
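Since the homework permits Python, here is a minimal sketch of the optim()-style workflow using scipy.optimize.minimize, which plays the role of optim() with method = "BFGS" and an analytic gradient. The sample size, the λ value (1.0 here), and the toy data below are illustrative assumptions, not the homework's prescribed values; the gradient follows from differentiating the penalized logistic loss above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable logistic function

def make_objective(x, y, lam):
    """Penalized logistic loss for f(x) = beta0 + x' beta, with its gradient."""
    def obj(theta):
        b0, b = theta[0], theta[1:]
        f = b0 + x @ b
        # log(1 + exp(-y f)) computed stably via logaddexp
        return np.logaddexp(0.0, -y * f).sum() + lam * (b @ b)

    def grad(theta):
        b0, b = theta[0], theta[1:]
        f = b0 + x @ b
        s = -y * expit(-y * f)  # dL/df_i for each sample
        # gradient w.r.t. (beta0, beta): intercept has no penalty
        return np.concatenate(([s.sum()], x.T @ s + 2.0 * lam * b))

    return obj, grad

# Illustrative two-class data in the spirit of Question 1 (assumed values)
rng = np.random.default_rng(0)
n = 20
x = np.vstack([rng.normal(0.0, 1.0, (n, 2)), rng.normal(1.5, 1.0, (n, 2))])
y = np.concatenate([-np.ones(n), np.ones(n)])

obj, grad = make_objective(x, y, lam=1.0)
fit = minimize(obj, np.zeros(3), jac=grad, method="BFGS")
beta0, beta = fit.x[0], fit.x[1:]
train_err = np.mean(np.sign(beta0 + x @ beta) != y)
```

Supplying jac=grad mirrors passing gr to optim(); without it, BFGS falls back to slower finite-difference gradients.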
```r
set.seed(20)
n = 100 # number of data points for each class
p = 2   # dimension

# Generate the positive and negative examples
xpos <- matrix(rnorm(n*p, mean = 0, sd = 1), n, p)
xneg <- matrix(rnorm(n*p, mean = 1.5, sd = 1), n, p)
x <- rbind(xpos, xneg)
y <- c(rep(-1, n), rep(1, n))

plot(x, col = ifelse(y > 0, "darkorange", "deepskyblue"),
     pch = 19, xlab = "x1", ylab = "x2")
legend("topleft", c("Positive", "Negative"),
       col = c("darkorange", "deepskyblue"), pch = c(19, 19),
       text.col = c("darkorange", "deepskyblue"))
```

[Figure: scatter plot of the two classes, x1 vs. x2]

Question 2 [50 Points] Non-linear SVM

You should generate the data using the code provided below (or write similar code in Python), and answer these questions:

• Use the kernel trick to solve for a nonlinear decision rule. Consider using the penalized optimization of the following form:

$$\sum_{i=1}^{n} L(y_i, \mathbf{w}^T K_i) + \lambda \, \mathbf{w}^T K \mathbf{w}$$

where K_i is the i-th column of the n × n kernel matrix K, and the (i, j)-th entry of K is K(x_i, x_j), for any two sample points. For this problem, we consider K(·, ·) to be the Gaussian kernel. For the bandwidth of the kernel function, you can borrow ideas from our previous homework. You do not have to optimize the bandwidth.
• Pre-calculate the n × n kernel matrix K of the observed data.
• Write a function to define the objective function. You should also define the gradient function properly and implement it in the optimization.
• Choose a reasonable λ value so that your optimization can run properly.
• It could be difficult to obtain the decision line itself. However, it is relatively easy to obtain the fitted labels. Hence, calculate the fitted labels (in-sample prediction) of your model and report the classification error.
• Plot the data using the fitted labels. This would allow you to visualize (approximately) the decision line. If needed, modify your λ so that the model fits reasonably well and re-plot. You don't have to tune λ as long as your result is reasonable.
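The pre-calculation step can be sketched as follows (in Python, which the homework permits). The n × n Gaussian kernel matrix is built from pairwise squared distances; the median-distance heuristic used for the bandwidth here is one common choice and an assumption on my part, since the homework only says to borrow ideas from the previous assignment.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(size=(50, 2))  # illustrative sample, not the homework data

# Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
sq = np.sum(x**2, axis=1)
d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)

# Median heuristic bandwidth (assumed choice, not prescribed by the homework)
bw = np.median(np.sqrt(d2[np.triu_indices(len(x), k=1)]))

# Gaussian kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 bw^2))
K = np.exp(-d2 / (2.0 * bw**2))
```

Computing d2 with the expansion above avoids an explicit double loop; clamping at zero guards against tiny negative values from floating-point round-off.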
You can judge if your model is reasonable based on the true model.

```r
set.seed(1)
n = 400
p = 2 # dimension

# Generate the positive and negative examples
x <- matrix(runif(n*p), n, p)
side <- (x[, 2] > 0.5 + 0.3*sin(3*pi*x[, 1]))
y <- sample(c(1, -1), n, TRUE, c(0.9, 0.1))*(side == 1) +
     sample(c(1, -1), n, TRUE, c(0.1, 0.9))*(side == 0)

plot(x, col = ifelse(y > 0, "darkorange", "deepskyblue"),
     pch = 19, xlab = "x1", ylab = "x2")
legend("topleft", c("Positive", "Negative"),
       col = c("darkorange", "deepskyblue"), pch = c(19, 19),
       text.col = c("darkorange", "deepskyblue"))
```

[Figure: scatter plot of the two classes with the nonlinear boundary, x1 vs. x2]
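The kernelized objective and its gradient can be set up the same way as in Question 1; here is a hedged Python sketch. The bandwidth (0.2), λ (0.01), sample size, and the noise-free labels below are all illustrative assumptions, not the homework's values. The gradient uses the fact that K is symmetric, so the loss term contributes K s and the penalty contributes 2λKw.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def kernel_objective(K, y, lam):
    """Objective sum_i L(y_i, (Kw)_i) + lam * w'Kw and its gradient (K symmetric)."""
    def obj(w):
        f = K @ w
        return np.logaddexp(0.0, -y * f).sum() + lam * (w @ K @ w)

    def grad(w):
        f = K @ w
        s = -y * expit(-y * f)  # dL/df_i for each sample
        return K @ s + 2.0 * lam * (K @ w)

    return obj, grad

# Illustrative data in the spirit of Question 2, without the label noise;
# n, the bandwidth, and lambda are assumed values
rng = np.random.default_rng(1)
n = 120
x = rng.uniform(size=(n, 2))
y = np.where(x[:, 1] > 0.5 + 0.3 * np.sin(3 * np.pi * x[:, 0]), 1.0, -1.0)

sq = np.sum(x**2, axis=1)
K = np.exp(-np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
           / (2.0 * 0.2**2))

obj, grad = kernel_objective(K, y, lam=0.01)
fit = minimize(obj, np.zeros(n), jac=grad, method="BFGS")
yhat = np.sign(K @ fit.x)  # fitted (in-sample) labels
train_err = np.mean(yhat != y)
```

As the question notes, the fitted labels sign(Kw) are easy to obtain even though the decision line itself is not; plotting the data colored by yhat approximates the decision boundary.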