DSC 212: Probability and Statistics for Data Science Due date: 5:00 pm PST, 13th March, 2023


Assignment 2


1. Consider the Bernoulli(p) observations 0 0 1 0 1 0 1 1. Plot the posterior distribution of p for each of the following prior distributions: Beta(1,1), Beta(10,10), and Beta(1,10). Mark the inflection points in the plots.

Note: Hand-drawn plots are sufficient.
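If you want to check your hand-drawn sketches numerically, here is a minimal sketch assuming SciPy is available. It uses the standard Beta-Bernoulli conjugacy fact: a Beta(a, b) prior combined with k successes in n Bernoulli trials yields a Beta(a + k, b + n − k) posterior. Plotting calls (e.g. via matplotlib) are left out; the code just evaluates each posterior density on a grid.

```python
import numpy as np
from scipy.stats import beta

obs = [0, 0, 1, 0, 1, 0, 1, 1]
k, n = sum(obs), len(obs)          # k = 4 successes out of n = 8 trials

priors = [(1, 1), (10, 10), (1, 10)]
# conjugate update: Beta(a, b) prior -> Beta(a + k, b + n - k) posterior
posteriors = {(a, b): (a + k, b + n - k) for a, b in priors}

for (a, b), (a_post, b_post) in posteriors.items():
    grid = np.linspace(0.001, 0.999, 999)
    dens = beta(a_post, b_post).pdf(grid)   # curve one would plot against grid
    print(f"Beta({a},{b}) prior -> Beta({a_post},{b_post}) posterior, "
          f"mode near p = {grid[dens.argmax()]:.3f}")
```

The printed modes give a quick check that each sketch peaks in the right place.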

2. Let X1, X2, . . . , Xn i.i.d. ∼ Uniform(0, θ). Calculate the posterior distribution corresponding to the prior density f(θ) ∝ 1/θ.

3. In this problem, we derive a general approach—called iteratively reweighted least-squares—for obtaining the empirical risk minimization solution with a smooth loss and an unconstrained linear function class. Consider minimizing a function L : Rd → R, and assume that L is twice continuously differentiable. The Newton iteration for obtaining the minimizer of L is

θt+1 = θt − [∇2L(θt)]−1 ∇L(θt),   (1)

where ∇2L and ∇L are the Hessian and gradient of L, respectively.

(a) Show that iteration (1) is equivalent to minimizing the second-order Taylor expansion of L around θt.

(b) Let L(θ) = ∑_{i=1}^n l(xi⊤θ, yi), where l : R × Y → R is smooth in its first argument. Argue that

∇L(θ) = ∑_{i=1}^n αi(θ) xi,   ∇2L(θ) = ∑_{i=1}^n wi(θ) xixi⊤,   (2)

for weights αi(θ) and wi(θ) that you specify.

(c) Let X ∈ Rn×d be the matrix whose ith row is xi⊤. Let α(θ) = (α1(θ), . . . , αn(θ)) and W(θ) = diag(w1(θ), . . . , wn(θ)). Show that iteration (1) is equivalent to solving

θt+1 = argmin_{θ∈Rd} ∥W^{1/2}(θt)(Xθ − z(θt))∥²   (3)

for a vector z(θt) ∈ Rn which you specify. Thus the Newton iteration in this context is equivalent to solving a reweighted least-squares problem at each iteration.

Hint: ∇L(θ) = X⊤α(θ) and ∇2L(θ) = X⊤W(θ)X.

(d) Consider the case where l(t, y) = −yt + log(1 + et) is the logistic loss, and let σ(t) = 1/(1 + e−t) denote the sigmoid function. Show that in this case, wi(θ) = σ(θ⊤xi)(1 − σ(θ⊤xi)), and find an expression for zi(θ).
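As a numerical sanity check for this problem, the Newton iteration (1) can be run directly on the logistic empirical risk using the structure from the hint, ∇L(θ) = X⊤α(θ) and ∇2L(θ) = X⊤W(θ)X. A minimal NumPy sketch follows; the toy data and the function name `newton_logistic` are made-up for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def newton_logistic(X, y, iters=20):
    """Newton iteration (1) for L(theta) = sum_i l(x_i^T theta, y_i),
    with the logistic loss l(t, y) = -y*t + log(1 + e^t)."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        s = sigmoid(X @ theta)
        alpha = s - y                   # per-example gradient weights alpha_i(theta)
        w = s * (1.0 - s)               # per-example Hessian weights w_i(theta)
        grad = X.T @ alpha              # = X^T alpha(theta)
        hess = X.T @ (w[:, None] * X)   # = X^T W(theta) X
        theta = theta - np.linalg.solve(hess, grad)
    return theta

# toy non-separable data (hypothetical example)
rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=100)]
y = (rng.random(100) < sigmoid(X @ np.array([-0.5, 2.0]))).astype(float)
theta_hat = newton_logistic(X, y)
```

At convergence the gradient X⊤α(θ) vanishes, which is an easy property to verify on the fitted `theta_hat`.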

4. (Bayesian linear model) Consider the fixed-design linear model with X = [x1, x2, . . . , xn]⊤ and observations yi = xi⊤β + σεi, where the εi are i.i.d. standard normal. Consider the likelihood and prior given below.

fY|β(y|w) ∝ exp(−(1/(2σ²)) ∥y − Xw∥²)   (4)

fβ(w) ∝ exp(−∥w∥²_Γ)   (5)

Here Γ is a positive definite precision matrix and ∥w∥²_Γ := w⊤Γw. Find the posterior distribution Pβ|Y.

5. Let X ∼ N(μ, Σ), where μ ∈ Rd and Σ ∈ Rd×d is a positive definite matrix. Find the distribution of the vector AX ∈ Rp for a fixed matrix A ∈ Rp×d.
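Whatever distribution you derive can be checked by Monte Carlo: draw many samples of X, apply A, and compare the empirical mean and covariance of AX against your formulas. A minimal NumPy sketch, where μ, Σ, and A are made-up example values:

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0, 0.5])
L = rng.normal(size=(3, 3))
Sigma = L @ L.T + 3 * np.eye(3)     # a positive definite covariance matrix
A = np.array([[1.0, 0.0, 2.0],
              [0.0, -1.0, 1.0]])    # a fixed 2x3 matrix

# draw samples of X ~ N(mu, Sigma) and map them through A
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T                         # rows are samples of AX

emp_mean = Y.mean(axis=0)           # compare against your formula for E[AX]
emp_cov = np.cov(Y.T)               # compare against your formula for Cov(AX)
```

With 200,000 samples the empirical moments should match your derived ones to within small sampling error.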

