程序代写案例-CS 541

欢迎使用51辅导，51作业君孵化低价透明的学长辅导平台，服务保持优质，平均费用压低50%以上！ 51fudao.top

CS 541-B Artificial Intelligence: Final Exam
Instructor: Jie Shen
12/14/2020, 6:30 pm - 9:00 pm EST
Instructions:
• Open book exam, feel free to use any lecture notes;
• Discussion is not permitted;
• Always give your answer and explain it (guaranteed 5 point for nonempty answer);
• 20 points per problem, totally 110 points (20 * 5 + 10).
0. Write down your name. (10 pts)
1. Consider the following data points (represented by circles) in 2-dimensional space.
• Illustrate the group structure discovered by sparse subspace clustering in the left panel;
• Illustrate the clustering result of single iteration of k-means with k = 3 and initial centers
(−10, 0), (0,−10), (10, 10) in the right panel.
-10 -5 0 5 10
-10
-5
0
5
10
-10 -5 0 5 10
-10
-5
0
5
10
1
2. An outlier is a data point that behaves abnormally, that its magnitude is out of control. For
example, consider the age of a patient in his transcript: it is an outlier if it shows age = 2000, or if
age = 25.5. Even if we observe a “normal” value of 85 is may also be an outlier provided that his
actual age is 8.
Outlier may incur for a variety of reasons, e.g. human mistakes. In this case, we have to take such
dirty data into algorithmic design. Consider the simple regression model in high-dimensional regime:
yi = 〈ai,wtrue〉+ ei, ‖wtrue‖0 ≤ k, i = 1, . . . , n. (0.1)
In the above expression, ai ∈ Rd is the feature vector, wtrue is the groundtruth model we aim to
estimate, ei ∈ R is the possible outlier for the ith sample and yi ∈ R is the corrupted response.
• Given {ai, yi}ni=1 for sufficiently large n, say n → ∞, is it possible to learn the true model
without any prior knowledge/condition on {ei}ni=1?
• Now suppose that only n1 samples are corrupted by outliers where n1 n. In other words,
among {ei}ni=1, n− n1 of them are zeros (but we do not know which of them have zero values).
Give a proper formulation under which it is possible to simultaneously recover wtrue and detect
the outliers.
3. Suppose we have a set of data from n patients as in Table 1, where for each column and each
row there are some missing entries (represented by the symbol “?”). State when and how we can
estimate all the missing values.
Table 1: Patient data.
Age Weight Height Gender Blood Pressure · · · Sharp Pain
Patient 1 ? z12 z13 z14 ? · · · z1m
Patient 2 ? z22 ? z24 z25 · · · ?
Patient 3 z31 ? z33 ? ? · · · z3m
...
...
...
...
...
...
...
...
Patient n ? zn2 zn3 ? zn5 · · · ?
4. In real-world applications the positive and negative classes are often imbalanced. For example,
there are a massive amount of data from airplanes that operate normally, while the number of
aviation accidents is relatively few. Therefore, in order for a machine learning system to detect
abnormal status, the algorithm must appreciate those negative examples (i.e. the data from flight
recorder). Given a set of training samples {(xi, yi)}ni=1 ⊂ Rd × {+1,−1}, state how you will design
a learning algorithm for such abnormality detection system.
5. Consider two functions F1(w) and F2(w): both of them are strongly convex, but F1 is smooth
and F2 is non-smooth. Suppose we apply GD to optimize these two functions. The following figure
shows two convergence curves: a solid line and a dashed line. One is for F1(wt) and another for
F2(w
t).
• Explain which curve may correspond to F2;
2
• Plot another possible convergence curve of applying GD to optimize F2 with a different initial
iterate or learning rate.
5 10 15 20
t
0
0.2
0.4
0.6
0.8
1
F(
wt
)
3

欢迎咨询51作业君