CS 4340/5340 Project 3
Points: 100 (for 4340) or 200 (for 5340)
Due: Oct 14, 11:59 pm

Please do not use any package in which logistic regression or (stochastic) gradient descent/ascent is already available; write your own C/C++/Java/Python code. Submission instructions remain the same. Please write legibly if you are submitting any handwritten (and scanned) material.

Question A: Logistic regression. 100 points

    X    Y
    1    0
    2    1
    3    0
    4    1
    5    0
    6    1
    7    1
    8    1

(a) Use the above data set (feel free to use +1/-1 or 1/0, whichever is convenient, for Y). Assume that X represents the time (in weeks) during which a student did not do any coursework for a certain course she was enrolled in, and let Y = 1 represent the fact that the student failed (with Y = 0 or -1 denoting passing). Implement logistic regression to obtain a prediction for the probability of failure, given the number of weeks of inaction (you will need to implement gradient ascent or gradient descent as part of this problem). In your report, clearly indicate:
    (i) the initial choice of the weights;
    (ii) the learning rate constant (if you choose to use a variable learning rate, indicate how you reduce it with iterations);
    (iii) the total number of weight vector updates before your algorithm stops;
    (iv) your final solution; and
    (v) the quality of the final solution (briefly explain what you mean by the "quality" of a solution).

(b) Show the results of using different values of the learning rate (step size) constant when the weights are all initialized to zero (that is, show outputs from independent runs that use different learning rates but the same all-zero initialization).

(c) Using a fixed learning rate, show the effect of varying the initial weights (that is, show outputs from independent runs that use the same learning rate constant but different initializations).
(d) Using your solution in part (a) above, predict the chances of a student passing a course if she does not study for (i) 3 weeks, (ii) 5 weeks.

(e) Can logistic regression be used for classification? If not, explain why. If yes, show how your solution in part (a) above would classify the points X = 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5. State the assumption(s), if any, that you make.

(f) If there were two or more rows in the given data set with the same X value but different Y values (e.g., (X = 3, Y = 1) and (X = 3, Y = 0 or -1)), would we still be able to obtain a valid logistic regression of Y on X, or would logistic regression make no sense in that case?

Question B: Logistic regression by SGD [for the graduate section 5340 only]. 100 points

Redo Question A (parts (a) through (d)), replacing gradient ascent (descent) with stochastic gradient ascent (descent). Indicate the final solution obtained. Write a brief note on:
    (i) the relative speed of computation (with respect to that in Question A);
    (ii) the relative quality of the final solution (with respect to that in Question A); and
    (iii) the effect, if any, of varying the order in which you consider the given points.
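
As an illustration of the two update schemes named in Questions A and B, the following is a minimal Python sketch, not a reference solution. It assumes the 0/1 encoding of Y and the model P(Y = 1 | x) = sigmoid(w0 + w1*x). The function names, the learning rate of 0.01, the stopping tolerance, and the epoch count are all hypothetical choices made for this sketch; nothing in the assignment prescribes them.

```python
import math
import random

# Data set from Question A (X = weeks of inaction, Y = 1 means the student failed).
X = [1, 2, 3, 4, 5, 6, 7, 8]
Y = [0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(w0, w1):
    """Bernoulli log-likelihood of the data; one possible measure of 'quality'."""
    total = 0.0
    for x, y in zip(X, Y):
        p = sigmoid(w0 + w1 * x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def batch_gradient_ascent(w0=0.0, w1=0.0, lr=0.01, tol=1e-6, max_updates=100000):
    """Full-batch gradient ascent; returns (w0, w1, number_of_updates).

    Assumed stopping rule: the gradient norm falls below tol.
    """
    for step in range(1, max_updates + 1):
        # Gradient of the log-likelihood, summed over all data points.
        g0 = sum(y - sigmoid(w0 + w1 * x) for x, y in zip(X, Y))
        g1 = sum((y - sigmoid(w0 + w1 * x)) * x for x, y in zip(X, Y))
        w0 += lr * g0
        w1 += lr * g1
        if math.hypot(g0, g1) < tol:
            return w0, w1, step
    return w0, w1, max_updates


def stochastic_gradient_ascent(w0=0.0, w1=0.0, lr=0.01, epochs=2000, seed=0):
    """One weight update per data point; the visiting order is reshuffled each epoch."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    updates = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = Y[i] - sigmoid(w0 + w1 * X[i])
            w0 += lr * err
            w1 += lr * err * X[i]
            updates += 1
    return w0, w1, updates


if __name__ == "__main__":
    w0, w1, n = batch_gradient_ascent()
    print("batch:", w0, w1, "updates:", n, "log-likelihood:", log_likelihood(w0, w1))
    s0, s1, m = stochastic_gradient_ascent()
    print("SGD:  ", s0, s1, "updates:", m, "log-likelihood:", log_likelihood(s0, s1))
```

Under these assumptions, the estimated probability of failure after x weeks is sigmoid(w0 + w1*x), and the probability of passing is one minus that value. Changing lr, the initial weights, or seed gives the kind of independent runs that parts (b), (c), and Question B (iii) ask you to compare.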