STAT3017/STAT7017 Homework 1
Homework 1
Due by 10 August 2020 23:59
Question 1 [2 Points]
Let $X = (x_1, \dots, x_p)'$ be a $p$-dimensional random vector with independent and identically
distributed entries $x_i$ such that $E[x_i] = 0$ and $E[x_i^2] = 1$. Let $M$ be a $p \times p$ matrix. Show
that $E[X'MX] = \operatorname{tr}(M)$.
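The identity can be sanity-checked numerically before attempting a proof. The sketch below is only an illustration, not a solution: the matrix `M_example`, the choice of standard normal entries (one distribution with mean 0 and second moment 1), the number of draws, and all function names are assumptions made for this example.

```python
import random


def quadratic_form(x, M):
    """Return x' M x for a vector x (list) and a square matrix M (list of rows)."""
    p = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(p) for j in range(p))


def trace(M):
    """Return the sum of the diagonal entries of M."""
    return sum(M[i][i] for i in range(len(M)))


def mc_expected_quadratic_form(M, n_draws=50_000, seed=0):
    """Monte Carlo estimate of E[X' M X] with i.i.d. N(0, 1) entries in X."""
    rng = random.Random(seed)
    p = len(M)
    total = 0.0
    for _ in range(n_draws):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        total += quadratic_form(x, M)
    return total / n_draws


# Hypothetical, deliberately non-symmetric 3 x 3 matrix with trace 4.
M_example = [[2.0, 1.0, 0.0],
             [-1.0, 3.0, 4.0],
             [0.5, 0.0, -1.0]]

mc_estimate = mc_expected_quadratic_form(M_example)
print(f"Monte Carlo estimate: {mc_estimate:.3f}, tr(M) = {trace(M_example):.3f}")
```

The estimate should land close to the trace, up to Monte Carlo error. Agreement for one matrix and one distribution is of course not a proof.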
Question 2 [3 Points]
Consider the situation where we may want to explain each response variable Y ∈ R by a
$p$-dimensional variable $X \sim \mathrm{Unif}([0, 1]^p)$.
Suppose our data consists of $n$ i.i.d. observations $(Y_i, X_i)_{i=1,\dots,n}$ of the variables $Y$ and
$X$. We could then model them with the classic regression equation
$$Y_i = f(X_i) + \varepsilon_i, \quad i = 1, \dots, n,$$
where $f : [0, 1]^p \to \mathbb{R}$ and $\varepsilon_1, \dots, \varepsilon_n$ are independent and centered random variables.
It is typical to assume that the function $f$ is smooth and we can estimate $f(x)$ by some
averaging of the $Y_i$ associated to the $X_i$ in the vicinity of $x$. The simplest version of this
idea is the k-nearest neighbor estimator where $f(x)$ is estimated by the mean of the $Y_i$
associated with the $k$ points $X_i$ that are nearest to $x$. This works well in a low-dimensional
setting as it is easy to make sense of what “nearest points” means.
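The k-nearest neighbor estimator described above can be sketched in a few lines. This is a minimal illustration using Euclidean distance. The regression function `f_true`, the noise level, the sample size, and the value of `k` are all hypothetical choices, not part of the assignment.

```python
import math
import random


def knn_estimate(x, X, Y, k):
    """Estimate f(x) as the mean of the Y_i whose X_i are the k nearest to x."""
    # Sort sample indices by Euclidean distance from x and keep the first k.
    order = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))
    nearest = order[:k]
    return sum(Y[i] for i in nearest) / k


def f_true(x):
    """Hypothetical smooth regression function on [0, 1]^2."""
    return x[0] + x[1]


# Simulated low-dimensional data (p = 2) with small centered Gaussian noise.
rng = random.Random(0)
X_sample = [[rng.random(), rng.random()] for _ in range(500)]
Y_sample = [f_true(x) + rng.gauss(0.0, 0.1) for x in X_sample]

knn_value = knn_estimate([0.5, 0.5], X_sample, Y_sample, k=10)
print(f"k-NN estimate at (0.5, 0.5): {knn_value:.3f}, true value: {f_true([0.5, 0.5]):.3f}")
```

Sorting all points is the simplest approach and is adequate for small samples. Larger data sets would typically use a spatial index instead.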
(a) Show that the notion of nearest points vanishes as the dimensionality $p$ increases by
plotting the histogram of the distribution of pairwise distances
$\{\|X_i - X_j\| : 1 \le i < j \le n\}$
for $n = 100$ and dimensions $p = 2, 10, 100$ and $1000$. [2 Points]
(b) What do you observe? [1 Point]
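As a possible starting point for part (a), the pairwise distances can be generated as in the sketch below. It only computes the distances and a few summary numbers; the plotting is left out. The function names, the seed, and the "relative spread" summary are assumptions introduced here for illustration, and Euclidean distance is assumed for the norm.

```python
import itertools
import math
import random


def pairwise_distances(n, p, seed=0):
    """Draw n points uniformly on [0, 1]^p and return all distances with i < j."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(p)] for _ in range(n)]
    return [math.dist(a, b) for a, b in itertools.combinations(points, 2)]


def relative_spread(distances):
    """Range of the distances divided by their mean (an illustrative summary)."""
    mean = sum(distances) / len(distances)
    return (max(distances) - min(distances)) / mean


spread_by_dim = {}
for p in (2, 10, 100, 1000):
    d = pairwise_distances(n=100, p=p)
    mean_d = sum(d) / len(d)
    spread_by_dim[p] = relative_spread(d)
    print(f"p={p:5d}  pairs={len(d)}  min={min(d):.3f}  "
          f"mean={mean_d:.3f}  max={max(d):.3f}  spread={spread_by_dim[p]:.3f}")
```

Each list returned by `pairwise_distances` can be passed to a histogram routine, for example `matplotlib.pyplot.hist` if matplotlib is available, to produce the plots the question asks for.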
This homework is to be submitted through Wattle in digital form only as per ANU policy. If
you use any references (note: this will never count against you), please clearly indicate which
ones. This homework has 15% weight.
Dale Roberts - Australian National University
Last updated: July 28, 2020
