STAT3017/STAT7017 Homework 1

Due by 10 August 2020 23:59

Question 1 [2 Points]

Let $X = (x_1, \ldots, x_p)'$ be a $p$-dimensional random vector with independent and identically distributed entries $x_i$ such that $E[x_i] = 0$ and $E[x_i^2] = 1$. Let $M$ be a $p \times p$ matrix. Show that $E[X'MX] = \mathrm{tr}(M)$.

Question 2 [3 Points]

Consider the situation where we may want to explain each response variable $Y \in \mathbb{R}$ by a $p$-dimensional variable $X \sim \mathrm{Unif}([0,1]^p)$. Suppose our data consists of $n$ i.i.d. observations $(Y_i, X_i)_{i=1,\ldots,n}$ of the variables $Y$ and $X$. We could then model them with the classic regression equation

$$Y_i = f(X_i) + \varepsilon_i, \quad i = 1, \ldots, n,$$

with $f : [0,1]^p \to \mathbb{R}$ and $\varepsilon_1, \ldots, \varepsilon_n$ independent and centered random variables.

It is typical to assume that the function $f$ is smooth and we can estimate $f(x)$ by some averaging of the $Y_i$ associated to the $X_i$ in the vicinity of $x$. The simplest version of this idea is the k-nearest neighbor estimator where $f(x)$ is estimated by the mean of the $Y_i$ associated with the $k$ points $X_i$ that are nearest to $x$. This works well in a low-dimensional setting as it is easy to make sense of what "nearest points" means.

(a) Show that the notion of nearest points vanishes as the dimensionality $p$ increases by plotting the histogram of the distribution of pairwise distances $\{\|X_i - X_j\| : 1 \le i < j \le n\}$ for $n = 100$ and dimensions $p = 2, 10, 100$ and $1000$. [2 Points]

(b) What do you observe? [1 Point]

This homework is to be submitted through Wattle in digital form only as per ANU policy. If you use any references (note: this will never count against you), please clearly indicate which ones. This homework has 15% weight.

Dale Roberts - Australian National University
Last updated: July 28, 2020
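
For Question 2(a), the pairwise distances could be simulated along the following lines. This is only a minimal sketch using the Python standard library, not a prescribed solution: the function names `pairwise_distances` and `summarize`, the fixed seed, and the choice of summary statistics (mean, standard deviation, min/max ratio) are all assumptions made for illustration. The sheet asks for histograms, so the list returned by `pairwise_distances` would still need to be passed to a plotting routine of your choice.

```python
import math
import random
import statistics
from itertools import combinations


def pairwise_distances(n, p, rng):
    """Euclidean distances between all pairs of n points drawn from Unif([0, 1]^p)."""
    points = [[rng.random() for _ in range(p)] for _ in range(n)]
    # combinations(points, 2) enumerates every pair (i, j) with i < j exactly once
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def summarize(n=100, dims=(2, 10, 100, 1000), seed=0):
    """Return {p: (mean, std, min/max ratio)} of the pairwise distances."""
    rng = random.Random(seed)  # fixed seed is an arbitrary choice, for reproducibility
    summary = {}
    for p in dims:
        d = pairwise_distances(n, p, rng)
        summary[p] = (statistics.fmean(d), statistics.pstdev(d), min(d) / max(d))
    return summary


if __name__ == "__main__":
    for p, (mean, std, ratio) in summarize().items():
        print(f"p={p:5d}  mean={mean:7.3f}  std={std:.3f}  "
              f"std/mean={std / mean:.3f}  min/max={ratio:.3f}")
```

Comparing the relative spread `std/mean` and the `min/max` ratio across the four dimensions gives a numerical counterpart to what the histograms display.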