Instructor: L.-P. Chen
SS9850B Assignment #2
(Due April 3 2020)
Please carefully read the following instructions:
• There are 3 questions in this assignment. Each subquestion is worth 4 marks.
• Show all your work in detail and provide your code for the computations. Also,
summarize your numerical results clearly in each question.
• This assignment is INDIVIDUAL work. Plagiarism will earn a mark of ZERO.
• ONLY R functions/packages mentioned in this course are allowed. Using other
functions/packages not covered in this course will lose marks.
• Submit your solutions to the Drop Box on the course site. Late submissions
will lose marks.
1. (Wine Data Set) These data are the results of a chemical analysis of wines grown in
the same region in Italy but derived from three different cultivars. The analysis determined
the quantities of 13 constituents (including Alcohol, Malic acid, Ash, Alcalinity of
ash, Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins,
Color intensity, Hue, OD280/OD315 of diluted wines, and Proline) found in each of the
three types of wines. The sample size is 178. The dataset is available on the course site. The
main interest in this dataset is to study multiclassification of the three types of wines. Let ŷ
denote the predicted class of observations.
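A loading step along the following lines may help get started; the file name "wine.csv", the label-in-the-first-column layout, and the standardization are assumptions to adjust against the file actually posted on the course site.

```r
## Sketch: read the wine data (file name and column layout are assumptions).
wine <- read.csv("wine.csv")
y <- factor(wine[, 1])             # cultivar label: 1, 2, or 3
X <- as.matrix(scale(wine[, -1]))  # 13 standardized chemical measurements
table(y)                           # class sizes (should total 178)
```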
(a) Instead of estimating Σ (or Σi) by an empirical estimate and then taking its inverse, here we
consider using glasso to estimate Θ ≜ Σ⁻¹ (or Θi ≜ Σi⁻¹) directly. After that, substituting
the estimator Θ̂ (or Θ̂i) into the linear discriminant function (or quadratic discriminant
function) gives graphical-based linear discriminant analysis (or quadratic discriminant
analysis). In this question, use the graphical-based linear discriminant analysis and
quadratic discriminant analysis to obtain ŷ. In addition, summarize the confusion table
for y and ŷ, use macro-averaged metrics to evaluate recall, precision, and F-measure, and
then assess the classification performance.
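A minimal sketch of the plug-in idea for the linear case is below, assuming X (a 178 × 13 matrix) and y (a 3-level factor) are already loaded and the glasso package is available; the penalty value rho = 0.1 is purely illustrative, not a recommendation.

```r
library(glasso)

classes <- levels(y)
pri <- table(y) / length(y)                           # class priors
mus <- lapply(classes, function(k) colMeans(X[y == k, ]))

## pooled covariance -> glasso -> pooled precision estimate Theta-hat
Sp <- Reduce(`+`, lapply(classes, function(k)
        (sum(y == k) - 1) * cov(X[y == k, ]))) / (nrow(X) - length(classes))
Theta <- glasso(Sp, rho = 0.1)$wi                     # rho = 0.1 is illustrative

## graphical-based linear discriminant scores and predicted classes
delta <- sapply(seq_along(classes), function(j) {
  m <- mus[[j]]
  X %*% Theta %*% m - 0.5 * drop(t(m) %*% Theta %*% m) + log(pri[j])
})
yhat <- factor(classes[max.col(delta)], levels = classes)
table(y, yhat)                                        # confusion table
```

For the quadratic version, apply glasso to each class's own covariance matrix and use the class-specific Θ̂i (together with its ½ log det Θ̂i term) in the quadratic discriminant function.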
(b) Use the support vector machine method to obtain ŷ. In addition, summarize the confusion
table for y and ŷ, use macro-averaged metrics to evaluate recall, precision, and F-measure,
and then assess the classification performance.
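A sketch with e1071::svm follows, again assuming X and y are loaded; the radial kernel and cost = 1 are default-style choices rather than course requirements. The macro-averaged metrics are computed directly from the confusion table, which is the same post-processing needed in part (a).

```r
library(e1071)

fit  <- svm(X, y, kernel = "radial", cost = 1)
yhat <- predict(fit, X)
cm   <- table(y, yhat)                    # confusion table (rows: truth)

## macro-averaged metrics: average the per-class values
recall    <- diag(cm) / rowSums(cm)
precision <- diag(cm) / colSums(cm)
f1        <- 2 * recall * precision / (recall + precision)
c(recall = mean(recall), precision = mean(precision), F = mean(f1))
```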
(c) Suppose that we only consider the predictor Proline. Use the nonparametric density
estimation method in Section 4.1 to explore the multiclassification. For the choice
of kernel function, we examine the Gaussian kernel and the biweight kernel; for the
bandwidth selection, we use the cross-validation (CV) method.
After obtaining ŷ, summarize the confusion table for y and ŷ, use macro-averaged metrics
to evaluate recall, precision, and F-measure, and then assess the classification performance.
## Hint: programming code of the CV function:
J <- function(h) {
  fhat  <- Vectorize(function(x) density(X, from = x, to = x, n = 1, bw = h)$y)
  fhati <- Vectorize(function(i) density(X[-i], from = X[i], to = X[i], n = 1, bw = h)$y)
  F <- fhati(1:length(X))
  return(integrate(function(x) fhat(x)^2, -Inf, Inf)$value - 2 * mean(F))
}
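A usage sketch for the hinted criterion, under stated assumptions: X is taken to be the vector of Proline values for one class, and the search interval c(1, 300) is an illustrative guess at this variable's scale.

```r
## Minimize the CV criterion J over an interval to get the bandwidth,
## then reuse it in a class-conditional density estimate.
hcv <- optimize(J, interval = c(1, 300))$minimum

## Density estimate at new points x0; density() uses the Gaussian kernel by
## default, and kernel = "biweight" switches to the biweight kernel.
fhat <- Vectorize(function(x0) density(X, from = x0, to = x0, n = 1, bw = hcv)$y)
```

Repeating this for each class gives f̂1, f̂2, f̂3, and a new Proline value x0 can be assigned to the class maximizing π̂k f̂k(x0), where π̂k is the class proportion.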
(d) Summarize your findings in (a)-(c).
2. (Simulation studies) Consider the following linear model:

y = X1β1 + X2β2 + X3β3 + X5β5 − 5√ρ X6β6 + X7β7 + ε,  (1)

where X = (X1, · · · , Xp) is a p-dimensional vector of covariates, each Xk is generated
from N(0, 1), and ε is the random error. The pairwise correlations among all Xk except
X6 are ρ, while X6 has correlation √ρ with all other p − 1 variables. Suppose that the
sample size is n = 200.
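A generation sketch for one artificial data set from model (1) is below. The shared-factor construction induces correlation ρ between any two covariates other than X6 and √ρ between X6 and each other covariate; the N(0, 1) error is an assumption, since the assignment does not state the error distribution.

```r
## One data set from model (1) with p = 1000, rho = 0.8 as in part (b).
n <- 200; p <- 1000; rho <- 0.8
Z0 <- rnorm(n)                                    # shared factor
X  <- sqrt(rho) * Z0 + sqrt(1 - rho) * matrix(rnorm(n * p), n, p)
X[, 6] <- Z0                                      # Corr(X6, Xk) = sqrt(rho)

beta <- numeric(p)
beta[c(1, 2, 3, 5, 7)] <- 1                       # coefficients in model (1)
y <- X %*% beta - 5 * sqrt(rho) * X[, 6] + rnorm(n)   # beta6 = 1
```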
(a) Show that X6 is marginally independent of y.
(b) Now, consider p = 1000 and generate the artificial data based on model (1) for 1000
repetitions. Specifically, let βi = 1 for every i = 1, · · · , 7 and set ρ = 0.8. After that, use
the SIS and iterated SIS methods to perform variable selection and estimate the parameters
associated with the selected covariates. Finally, summarize the estimators in the following
table:
Table 1: Simulation results for (b)

                        ‖∆β‖1    ‖∆β‖2    #S    #FN
SIS
Iterated SIS
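One repetition might be sketched with the SIS package as below; the interface (the SIS() function, its iter argument, and the returned ix / coef.est components) is assumed from that package's documentation, so check it against the version used in class. Here X and y are one generated data set and beta.true is an assumed name for the true p-vector of coefficients in model (1).

```r
library(SIS)
fit.sis  <- SIS(X, as.vector(y), family = "gaussian", iter = FALSE)  # plain SIS
fit.isis <- SIS(X, as.vector(y), family = "gaussian", iter = TRUE)   # iterated SIS

bhat <- numeric(p)
bhat[fit.sis$ix] <- fit.sis$coef.est      # drop the intercept first if reported

## summaries for Table 1 (true active set of model (1): {1, 2, 3, 5, 6, 7})
sum(abs(bhat - beta.true))                # l1 norm of Delta-beta
sqrt(sum((bhat - beta.true)^2))           # l2 norm of Delta-beta
length(fit.sis$ix)                        # #S: number of selected covariates
sum(!c(1, 2, 3, 5, 6, 7) %in% fit.sis$ix) # #FN: missed true covariates
```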
(c) Here we consider a scenario that is different from (b). Let p = 50 and X ∼ N(0, ΣX)
with entry (j, k) of ΣX being 0.6^|j−k| for j, k = 1, · · · , p. We generate the artificial data
based on (1) for 1000 repetitions with βi = 1 for every i = 1, · · · , 7. After that, use the
lasso, adaptive lasso, and elastic net (set α = 0.5) methods to estimate the parameters.
Finally, summarize the numerical results in the following table.
Table 2: Simulation results for (c)

                        ‖∆β‖1    ‖∆β‖2    #S    #FN
lasso
adaptive lasso
Elastic net (α = 0.5)
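One repetition for the three penalized estimators can be sketched with glmnet (alpha = 1: lasso; alpha = 0: ridge; alpha = 0.5: elastic net). Choosing λ by cv.glmnet and building the adaptive-lasso weights from an initial ridge fit are common but illustrative choices, not course-mandated ones; X and y are one generated data set, and [-1] drops the intercept.

```r
library(glmnet)

b.lasso <- as.vector(coef(cv.glmnet(X, y, alpha = 1), s = "lambda.min"))[-1]

b.ridge <- as.vector(coef(cv.glmnet(X, y, alpha = 0), s = "lambda.min"))[-1]
b.ada   <- as.vector(coef(cv.glmnet(X, y, alpha = 1,
                                    penalty.factor = 1 / abs(b.ridge)),
                          s = "lambda.min"))[-1]

b.enet  <- as.vector(coef(cv.glmnet(X, y, alpha = 0.5), s = "lambda.min"))[-1]
```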
(d) Summarize your findings for parts (b) and (c), respectively.
Note: Let β̂ be the estimator; then ∆β is defined as ∆β = β̂ − β, with ith component
β̂i − βi. Therefore, ‖∆β‖1 and ‖∆β‖2 are defined as
• ‖∆β‖1 = ∑_{i=1}^{p} |β̂i − βi|;
• ‖∆β‖2 = ( ∑_{i=1}^{p} (β̂i − βi)² )^{1/2}.
3. In the class, we have discussed that the estimator of Θ can be obtained by solving

Θ⁻¹ − S − λΨ = 0,

where S and Ψ are defined in the notes.
(a) Let W denote the working version of Θ⁻¹ such that WΘ = I, where I is the identity
matrix. Show that

W11β − s12 − λψ12 = 0,  (2)

where β = −θ12/θ22, and θ12, θ22, s12, and ψ12 are defined in the notes.
(b) Suppose that β̂ is obtained by solving (2). Please show that θ̂12 = −β̂ · θ̂22 and
θ̂22 = (w22 − w12ᵀ W11⁻¹ w12)⁻¹. Also, interpret the meaning of θ̂12.
Hint: Regarding the simulation studies with 1000 repetitions.
In Question 2, you are asked to use simulation studies with 1000 repetitions to estimate the
parameters. Specifically, based on the kth artificial data set, which is generated independently,
you can obtain an estimator, denoted β̂(k). As a result, with 1000 repetitions, the final
estimator is given by

β̂ = (1/1000) ∑_{k=1}^{1000} β̂(k).
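The averaging above amounts to a simple loop; gen_data() and estimate() below are hypothetical placeholders for the data-generating and fitting steps chosen in parts (b) and (c).

```r
## Sketch of the 1000-repetition average (placeholder helper names).
R <- 1000
bhat.all <- matrix(0, nrow = R, ncol = p)
for (k in 1:R) {
  dat <- gen_data()                        # hypothetical: returns list(X, y)
  bhat.all[k, ] <- estimate(dat$X, dat$y)  # hypothetical: returns a p-vector
}
bhat.final <- colMeans(bhat.all)           # beta-hat averaged over repetitions
```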