MTH3251/ETC3510/ETC5351 APPLIED CLASS 1 SOLUTIONS

Revision of probability theory

Recall the following concepts and definitions from probability theory. Let $X$ be an arbitrary random variable.

(1) $F_X(x) := P(X \le x)$ denotes the distribution function (or cdf) of $X$.
(2) $m_X(u) := E(e^{uX})$, where $u \in \mathbb{R}$, denotes the moment generating function (mgf) of $X$.
(3) $\Phi$ denotes the standard normal distribution function. Specifically,
\[ \Phi(x) := \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du. \]

Question 1. Let $Z \sim N(0,1)$.

(a) Show that $E(Z) = 0$ and $\mathrm{Var}(Z) = 1$.

Solution. These expressions are calculated through direct computation. Notice that
\[ \frac{d}{dx}\left(-e^{-x^2/2}\right) = x e^{-x^2/2}, \]
and so we have an antiderivative for $x e^{-x^2/2}$. Hence
\[ E(Z) = \int_{-\infty}^{\infty} x \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}} \left[-e^{-x^2/2}\right]_{-\infty}^{\infty} = 0. \]
For the variance,
\[ \mathrm{Var}(Z) = E(Z^2) - \underbrace{E(Z)^2}_{=0} = E(Z^2) = \int_{-\infty}^{\infty} x^2 \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, x \left(x e^{-x^2/2}\right) dx. \]
Integrating by parts,
\[ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, x \left(x e^{-x^2/2}\right) dx = \underbrace{-\frac{1}{\sqrt{2\pi}} \left[x e^{-x^2/2}\right]_{-\infty}^{\infty}}_{=0} + \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx}_{=1} = 1. \]

(b) Let $Y = -Z$. Find the distribution of $Y$. Discuss your findings.

Solution. $Y$ is a scaled normal random variable, and thus is again normal. Its parameters are $E(Y) = -E(Z) = 0$ and $\mathrm{Var}(Y) = (-1)^2\,\mathrm{Var}(Z) = 1$. Thus $Y \sim N(0,1)$.

Alternatively, we can calculate the distribution function of $Y$:
\[ F_Y(x) = P(Y \le x) = P(-Z \le x) = P(Z \ge -x) = \int_{-x}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du. \]
Now let $m = -u$. Then
\[ \int_{-x}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-m^2/2}\,dm = \Phi(x). \]
Thus $F_Y$ coincides with the standard normal distribution function $\Phi$.

Our conclusion is that $Z$ and $Y$ are equal in distribution (often written $Y \stackrel{d}{=} Z$), but they are clearly not the same random variable: their outcomes have the same magnitude but opposite signs. Thus they take a value around any given $x \in \mathbb{R}$ with the same probability, yet their realised outcomes are always of opposite sign.

(c) Find the moment generating function of $Z$. That is, calculate $m_Z(u) = E(e^{uZ})$.

Hint.
Complete the square in the exponent of the integrand.

Solution.
\[ E(e^{uZ}) = \int_{-\infty}^{\infty} e^{ux} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x^2 - 2ux)}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\left((x-u)^2 - u^2\right)}\,dx = e^{u^2/2} \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-u)^2}\,dx}_{=1}, \]
where we have recognised the integral of a $N(u,1)$ density function over the whole real line. Thus $m_Z(u) = e^{u^2/2}$.

(d) Denote by $\Phi^{-1}$ the inverse standard normal distribution function. Let $U \sim U(0,1)$; that is, $U$ is a random variable uniformly distributed on the interval $(0,1)$. (Recall this means that its density is $f_U(x) = 1$ for $x \in (0,1)$ and $0$ otherwise.) Show that
\[ \Phi^{-1}(U) \stackrel{d}{=} Z. \tag{1} \]
Assuming that a computer can simulate an observation of a $U(0,1)$ random variable as well as calculate $\Phi^{-1}$, write an algorithm to simulate an observation of a $N(0,1)$ random variable. What is this procedure called?

Hint. This assertion states that the distributions of the random variables $\Phi^{-1}(U)$ and $Z$ coincide. A distribution is characterised by the distribution function of the random variable. Compute the distribution functions of both random variables and show they are the same.

Solution. Our goal is to show that $P(\Phi^{-1}(U) \le x) = \Phi(x)$. $\Phi$ is a strictly increasing function, and hence its inverse enjoys the same property. Thus the events $\{\Phi^{-1}(U) \le x\}$ and $\{U \le \Phi(x)\}$ coincide. Hence
\[ P(\Phi^{-1}(U) \le x) = P(U \le \Phi(x)) = F_U(\Phi(x)). \]
The distribution function of $U$ is
\[ F_U(x) = \begin{cases} 1, & x > 1, \\ x, & 0 < x \le 1, \\ 0, & x \le 0. \end{cases} \]
Now since $\Phi(x) \in [0,1]$, we get that $F_U(\Phi(x)) = \Phi(x)$. Thus $P(\Phi^{-1}(U) \le x) = \Phi(x)$.

An algorithm to simulate an observation of a $N(0,1)$ random variable is the following.
• Generate an observation of a $U(0,1)$ random variable. Call the observation $u$.
• Compute $\Phi^{-1}(u)$. Call this number $z$.
By eq. (1), $z$ is equivalent to an observation of a $N(0,1)$ random variable. This method is called the inverse sampling (or inverse transform) method.

Question 2. Let $X \sim N(\mu, \sigma^2)$. In the following, use the fact that $X \stackrel{d}{=} \mu + \sigma Z$, where $Z \sim N(0,1)$.
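As a sketch (not part of the original solutions), the inverse sampling method from Question 1 part (d), together with the location-scale relation $X \stackrel{d}{=} \mu + \sigma Z$ just stated, can be implemented in Python. We assume the standard-library `statistics.NormalDist` as the routine computing $\Phi^{-1}$; the function names are our own:

```python
import random
from statistics import NormalDist

def simulate_standard_normal(rng=random):
    """Inverse sampling: push a U(0,1) observation through Phi^{-1}."""
    u = rng.random()                     # step 1: observation u of U(0,1)
    return NormalDist(0, 1).inv_cdf(u)   # step 2: z = Phi^{-1}(u), distributed N(0,1)

def simulate_normal(mu, sigma):
    """Simulate an observation of N(mu, sigma^2) via x = mu + sigma * z."""
    return mu + sigma * simulate_standard_normal()
```

Note that `rng.random()` returns a value in $[0,1)$; the chance of drawing exactly $0$ (where $\Phi^{-1}$ is undefined) is negligible in practice.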
(a) Find the moment generating function of $X$. That is, find $m_X(u) = E(e^{uX})$.

Hint. Use the result of Question 1 part (c) and exploit properties of expectation.

Solution. Since $X \stackrel{d}{=} \mu + \sigma Z$, we have
\[ E(e^{uX}) = E(e^{u(\mu + \sigma Z)}) = e^{u\mu}\, E(e^{u\sigma Z}). \]
Recognise from Question 1 part (c) that $E(e^{u\sigma Z}) = m_Z(u\sigma) = e^{\frac{1}{2}u^2\sigma^2}$. Thus $E(e^{uX}) = e^{u\mu}\, e^{\frac{1}{2}u^2\sigma^2}$. Hence
\[ m_X(u) = e^{u\mu + \frac{1}{2}u^2\sigma^2}. \]

(b) Write an algorithm to simulate an observation of a $N(\mu, \sigma^2)$ random variable.

Hint. Notice you have already found a way to simulate an observation of a $N(0,1)$ random variable in Question 1 part (d).

Solution. An algorithm to simulate an observation of a $N(\mu, \sigma^2)$ random variable is the following.
• Simulate an observation of a $N(0,1)$ random variable through the method from Question 1 part (d). Call the observation $z$.
• Compute $\mu + \sigma z$. Call this number $x$.
Then $x$ is equivalent to an observation of a $N(\mu, \sigma^2)$ random variable.

Question 3. Let $(X_i)_{i \ge 1}$ be an i.i.d. sequence of random variables, where each $X_i$ has a $N(\mu, \sigma^2)$ distribution. Define $S_n := \sum_{i=1}^{n} X_i$.

(a) Find $E(S_n)$ and $\mathrm{Var}(S_n)$.

Solution. By linearity of expectation,
\[ E(S_n) = E\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} E(X_i) = n\mu. \]
In addition, since the $(X_i)_{i \ge 1}$ are independent, they are uncorrelated, and thus
\[ \mathrm{Var}(S_n) = \mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = n\sigma^2. \]

(b) Find the distribution of $S_n$.

Solution. As $S_n$ is a sum of independent normal random variables, it is again normal. (To obtain this result rigorously, calculate the mgf of $S_n$ and show it corresponds to a normal mgf; see Question 4 part (b) for more details.) Its mean and variance were calculated in part (a). Thus $S_n \sim N(n\mu, n\sigma^2)$.

(c) Let $T_n := \frac{S_n - nE(X_1)}{\sqrt{n\,\mathrm{Var}(X_1)}}$. Find the distribution of $T_n$.

Solution. Recognise that $T_n$ is just the normalised version of $S_n$: it is normal as a linear transformation of $S_n$, with mean $0$ and variance $1$. Hence $T_n \sim N(0,1)$.

(d) Suppose now that $(X_i)_{i \ge 1}$ is an i.i.d. sequence of random variables, where each $X_i$ has finite variance, but their distribution is not necessarily normal. Then $T_n$ no longer has a normal distribution. However, in some sense $T_n$ 'approaches' a normal distribution. What is this result called?

Solution.
As $n$ tends to infinity (that is, as we sum up more and more random variables), the distribution of $T_n$ tends towards a standard normal distribution. Specifically,
\[ \lim_{n \to \infty} P(T_n \le x) = \Phi(x). \]
(Such convergence of $T_n$ is called convergence in distribution.) This result is called the central limit theorem, and it is frequently used in statistics to approximate sums of random variables by a normal distribution.

Question 4. Let $X$ be an arbitrary random variable and consider its moment generating function $m_X(u) = E(e^{uX})$.

(a) Show how the moment generating function can be used to obtain the $k$-th moment, $E(X^k)$.

Hint. It may (or may not) be useful to write $e^x$ as a Taylor series around $x = 0$ to obtain an alternative expression for $m_X$.

Solution. Consider the following heuristics. By writing the exponential as a Taylor series we obtain
\[ m_X(u) = E\left(\sum_{k=0}^{\infty} \frac{(uX)^k}{k!}\right) = E\left(1 + uX + \frac{1}{2}u^2X^2 + \frac{1}{3!}u^3X^3 + \cdots\right). \]
We can now 'see' all the moments of $X$ within the expectation's argument. Assume some regularity properties hold so that we can exchange expectation and infinite sum with differentiation at will. Suppose we take the third derivative of $m_X$. This yields
\[ m_X'''(u) = E\left(\frac{d^3}{du^3}\left(1 + uX + \frac{1}{2}u^2X^2 + \frac{1}{3!}u^3X^3 + \cdots\right)\right) = E\left(X^3 + \frac{4 \cdot 3 \cdot 2}{4!}\,uX^4 + \frac{5 \cdot 4 \cdot 3}{5!}\,u^2X^5 + \cdots\right). \]
Then, by letting $u = 0$, we obtain the third moment. That is, $m_X'''(0) = E(X^3)$. Following this thought process, it is clear that
\[ m_X^{(k)}(0) = E(X^k), \quad k \in \mathbb{N}, \]
where $m_X^{(k)}$ denotes the $k$-th derivative of $m_X$.

A more elegant but perhaps less illuminating approach is the following:
\[ m_X^{(k)}(u) = E\left(\frac{d^k}{du^k}\, e^{uX}\right) = E(X^k e^{uX}). \]
Letting $u = 0$ gives the same result.

(b) When it exists, the moment generating function fully characterises the distribution of a random variable. Elaborate on what this means mathematically.

Solution.
Suppose we have random variables $X$ and $Y$ with moment generating functions $m_X$ and $m_Y$ respectively. Then the following result is true:
\[ X \stackrel{d}{=} Y \quad \text{if and only if there exists some } \varepsilon > 0 \text{ such that } m_X(u) = m_Y(u) \ \text{for all } u \in (-\varepsilon, \varepsilon). \]
That is, if the moment generating functions of two random variables agree in some neighbourhood of $u = 0$, then those random variables have the same distribution.

Question 5. Bonus (non-examinable!!!). Define $\varphi_X(u) := E(e^{iuX})$, where $i := \sqrt{-1}$. $\varphi_X$ is called the characteristic function of $X$. Clearly the characteristic function is reminiscent of the moment generating function.

(a) Express the $k$-th moment of $X$ in terms of its characteristic function $\varphi_X$.

Solution. Ignoring the technicalities of complex analysis, it is clear that the characteristic function can generate moments just like the moment generating function. A simple calculation leads to
\[ (-i)^k \varphi_X^{(k)}(0) = E(X^k). \]

(b) Define the random variable $Y = e^Z$, where $Z \sim N(0,1)$. Calculate $m_Y(u)$. Does there seem to be a problem? Does $m_Y$ characterise the distribution of $Y$?

Solution. Let us attempt to calculate $m_Y(u)$:
\[ m_Y(u) = E(e^{uY}) = E(e^{ue^Z}) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{ue^x} e^{-x^2/2}\,dx. \]
For $u > 0$, the factor $e^{ue^x}$ grows doubly exponentially as $x \to \infty$ and overwhelms the Gaussian factor $e^{-x^2/2}$, so the integral diverges and $m_Y(u)$ does not exist for any $u > 0$. Thus $m_Y$ does not exist in any neighbourhood of $u = 0$, and therefore does not characterise the distribution of $Y$. This is not a niche example: $Y$ is called a log-normal random variable, which we will study in the weeks to come, and which is extremely important in financial modelling.

(c) Suppose the random variable $X$ has density $f_X$. Obtain a bound on $\varphi_X$.

Solution. Using the property from complex analysis that $|e^{im}| = 1$ for any $m \in \mathbb{R}$, we obtain
\[ |\varphi_X(u)| = \left|\int_{-\infty}^{\infty} e^{iux} f_X(x)\,dx\right| \le \int_{-\infty}^{\infty} |e^{iux}|\, f_X(x)\,dx = \int_{-\infty}^{\infty} f_X(x)\,dx = 1. \]
This means that the characteristic function always exists.
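The divergence in part (b) can be seen numerically. The following sketch (a simple trapezoidal rule; the function name and truncation parameters are our own choices, not part of the solutions) approximates the integral defining $m_Y(u)$ over $[-L, L]$: for $u = 1$ the partial integrals explode as $L$ grows, while for $u = -1$ they settle down to a finite value.

```python
import math

def lognormal_mgf_partial(u, L, n=20000):
    """Trapezoidal approximation of the truncated mgf integral
       int_{-L}^{L} exp(u*e^x - x^2/2) / sqrt(2*pi) dx  for Y = e^Z, Z ~ N(0,1)."""
    h = 2 * L / n
    total = 0.0
    for k in range(n + 1):
        x = -L + k * h
        w = 0.5 if k in (0, n) else 1.0   # trapezoid endpoint weights
        total += w * math.exp(u * math.exp(x) - x * x / 2)
    return total * h / math.sqrt(2 * math.pi)

# For u = 1.0, widening the window L makes the partial integral blow up,
# whereas for u = -1.0 the partial integrals converge as L increases.
```

(Keep $L \le 6$ for $u = 1$: beyond that, $e^{e^x}$ overflows double-precision floats, which is itself a hint of the divergence.)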
(d) From your findings in the previous parts of this question, give your thoughts on when one should use the characteristic function over the moment generating function and vice versa.

Solution. One would use the characteristic function over the moment generating function when:
• We are studying a random variable whose moment generating function does not exist.
• We are trying to prove a result for an arbitrary random variable. Since the random variable is arbitrary, we cannot rely on its moment generating function, as there is no guarantee it exists. For example, the general proof of the central limit theorem uses the characteristic function; a proof based on the moment generating function cannot cover all cases, since the mgf need not exist.

One would use the moment generating function rather than the characteristic function when:
• We are studying a random variable whose moment generating function exists. Indeed, if it exists, the moment generating function does everything the characteristic function does. However, it is simpler to use, as it is real-valued rather than complex-valued.
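To illustrate the point numerically (a sketch under our own assumptions: a trapezoidal quadrature with hand-picked truncation, not part of the solutions), one can check that the characteristic function of the log-normal $Y = e^Z$ from part (b) is perfectly well behaved, with $|\varphi_Y(u)| \le 1$ for every $u$ tried, even though $m_Y(u)$ fails to exist for any $u > 0$.

```python
import cmath
import math

def lognormal_cf(u, L=8.0, n=40000):
    """Trapezoidal approximation of phi_Y(u) = E(e^{iuY}) for Y = e^Z, Z ~ N(0,1),
       i.e. int exp(i*u*e^x - x^2/2) / sqrt(2*pi) dx over [-L, L].
       Unlike the mgf, the integrand has modulus e^{-x^2/2}, so this always converges."""
    h = 2 * L / n
    total = 0j
    for k in range(n + 1):
        x = -L + k * h
        w = 0.5 if k in (0, n) else 1.0   # trapezoid endpoint weights
        total += w * cmath.exp(1j * u * math.exp(x) - x * x / 2)
    return total * h / math.sqrt(2 * math.pi)
```

At $u = 0$ the approximation recovers $\varphi_Y(0) = 1$ (the integral of the density), and for other values of $u$ the bound $|\varphi_Y(u)| \le 1$ from part (c) is respected.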