Tutoring case: ST5222
ST5222: Advanced Topics in Applied Statistics
Midterm 1
Deadline for submission: midnight, 9th of October, 2019.

1. (10 points)

   (a) Suppose $(x_1, x_2, x_3)$ follow a multivariate normal distribution with mean $(\mu_1, \mu_2, \mu_3)$ and covariance matrix

       $$\Sigma = \begin{pmatrix} 1 & \rho & \rho^2 \\ \rho & 1 & 0 \\ \rho^2 & 0 & 1 \end{pmatrix}.$$

       Show that the conditional distribution of $(x_1, x_2)$ given $x_3$ has mean vector $[\mu_1 + \rho^2 (x_3 - \mu_3),\ \mu_2]^T$ and covariance matrix

       $$\begin{pmatrix} 1 - \rho^4 & \rho \\ \rho & 1 \end{pmatrix}.$$

   (b) If $x \sim N_p(\mu, \Sigma)$ and $Q \Sigma Q^T$ ($q \times q$) is non-singular, then, given $Qx = \mathbf{q}$, show that the conditional distribution of $x$ is normal with mean $\mu + \Sigma Q^T (Q \Sigma Q^T)^{-1} (\mathbf{q} - Q\mu)$ and covariance matrix $\Sigma - \Sigma Q^T (Q \Sigma Q^T)^{-1} Q \Sigma$.

2. (15 points) A naturalist for the Alaska Fish and Game Department studies grizzly bears with the goal of maintaining a healthy population. Measurements on n = 61 bears provided the following summary statistics:

   Variable              Sample mean
   Weight (kg)                 95.52
   Body length (cm)           164.38
   Neck (cm)                   55.69
   Girth (cm)                  93.39
   Head length (cm)            17.98
   Head width (cm)             31.13

   Covariance matrix (rows and columns in the variable order above):

   $$S = \begin{pmatrix}
   3266.46 & 1343.97 & 731.54 & 1175.50 & 162.68 & 238.37 \\
   1343.97 & 721.91 & 324.25 & 537.35 & 80.17 & 117.73 \\
   731.54 & 324.25 & 179.28 & 281.17 & 39.15 & 56.80 \\
   1175.50 & 537.35 & 281.17 & 474.98 & 63.73 & 94.85 \\
   162.68 & 80.17 & 39.15 & 63.73 & 9.95 & 13.88 \\
   238.37 & 117.73 & 56.80 & 94.85 & 13.88 & 21.26
   \end{pmatrix}$$

   (a) Perform a principal component analysis using the covariance matrix. Can the data be effectively summarised in fewer than six dimensions?

   (b) Perform a principal component analysis using the correlation matrix.

   (c) Comment on the similarities and differences between the two analyses.

3. (15 points) Consider the data in file data3.txt. Cluster the data using the K-means method. Comment on your findings.
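
For question 2, the two principal component analyses can be set up numerically from the summary statistics alone, since PCA only needs the eigen-decomposition of S (or of the correlation matrix derived from it). The following is a minimal sketch, assuming NumPy is available; the helper name `pca_from_matrix` and the variable names are illustrative choices, not part of the exam, and the interpretation of the output is left to the reader.

```python
import numpy as np

# Sample covariance matrix S from question 2. Row/column order:
# weight, body length, neck, girth, head length, head width.
S = np.array([
    [3266.46, 1343.97, 731.54, 1175.50, 162.68, 238.37],
    [1343.97,  721.91, 324.25,  537.35,  80.17, 117.73],
    [ 731.54,  324.25, 179.28,  281.17,  39.15,  56.80],
    [1175.50,  537.35, 281.17,  474.98,  63.73,  94.85],
    [ 162.68,   80.17,  39.15,   63.73,   9.95,  13.88],
    [ 238.37,  117.73,  56.80,   94.85,  13.88,  21.26],
])


def pca_from_matrix(M):
    """Hypothetical helper: PCA from a symmetric covariance/correlation matrix.

    Returns eigenvalues in descending order, the matching eigenvectors
    (as columns), and each component's share of the total variance.
    """
    eigvals, eigvecs = np.linalg.eigh(M)   # eigh is meant for symmetric input
    order = np.argsort(eigvals)[::-1]      # eigh returns ascending order
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    return eigvals, eigvecs, eigvals / eigvals.sum()


# Correlation matrix R = D^{-1/2} S D^{-1/2}, where D = diag(S).
sd = np.sqrt(np.diag(S))
R = S / np.outer(sd, sd)

cov_vals, cov_vecs, cov_prop = pca_from_matrix(S)
cor_vals, cor_vecs, cor_prop = pca_from_matrix(R)

np.set_printoptions(precision=4, suppress=True)
print("Covariance PCA, cumulative proportion: ", np.cumsum(cov_prop))
print("Correlation PCA, cumulative proportion:", np.cumsum(cor_prop))
print("First covariance loading vector: ", cov_vecs[:, 0])
print("First correlation loading vector:", cor_vecs[:, 0])
```

The cumulative proportions indicate how many components are needed in each analysis, and comparing the two leading loading vectors shows how much the large variance of weight dominates the covariance-based analysis relative to the standardised one. The sign of each eigenvector is arbitrary, so loadings may appear negated.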