CS 391: Midterm October 15, 2020 Due date: Oct 16, 8am Boston time Remember: For any question you answer “I do not know” you get 20% of the grade associated with this question. A totally wrong answer gets 0. Collaboration and Citing Policy: You must cite any sources that you use, including the lecture slides. Additionally, collaboration with other students is not allowed. You are encouraged to find the answers yourselves. If you use external sources please cite them. • All solutions should be typed in LaTeX. • Plots generated in python should be incorporated in the LaTeX file you submit. • Your python code should also be submitted. Exercise 1 (20 points) You and your friend decide to test whether a coin is a fair coin. You decide to flip it 2000 times and count the number of heads and tails. Your experiment results in 1200 heads and 800 tails. How confident can you be that the coin is not fair? In other words, if the coin were fair, how likely is the observed event? Give three upper bounds using the Markov, Chebyshev and Chernoff inequalities. Which one is the best? Exercise 2 (30 points) • (20 points) Assume that you are given a dataset and you are asked to produce a soft clustering using the EM algorithm. How are you going to decide the number of components (i.e., the value of k) by running the EM algorithm as many times as you want (and without visualizing the data). Describe your algorithm and explain why it is a reasonable algorithm. • (10 points) The problem of describing the data using a mixture of k Gaussians is solved by the EM algorithm, which finds a locally optimal solution (not necessarily the best). Show that when k = 1 this problem can be solved exactly. Describe your algorithm and prove its correctness. 1 Exercice 3 (30 points) Use the dataset “spatial data.txt” which contains as rows Facebook users and as columns 210 content categories on Facebook. Let’s denote this matrix as A. • (5 points )Compute the cosine similarity between users in A and store it in matrix C. • (10 points) Using code similar to the one we described in class, compute a low rank approximation of A, using its first 25 singular vectors. • (5 points )Compute the cosine similarity between users as described in this new 25-dimensional space. Let these similarities be stored in matrix C ′ • (10 points ) Plot a histogram of the entries in D = C − C ′. Exercise 4 (20 points) 1. (10 points) Show that A and AT have the same nonzero singular values. How are their SVD’s related? 2. (10 points) Show that if σ is a singular value of A then there exists a nonzero vector ~x such that: σ = ||A~x||2 ||~x||2 . (Hint: A vector x can be written as a linear combination of scaled unit vectors. For instance, let φ = {u1, u2, . . . , un} be a set of orthonormal vectors. We can then express x as x = ∑n i=1 aiui for some scalar values {a1, a2, . . . , an} along each unit vector. How can we express Ax using singular values?) 2
欢迎咨询51作业君