# Tutoring Case: ETC3400 Assignment 3

ETC3400-BEX3400-ETC5340: PRINCIPLES OF ECONOMETRICS
Solutions to Assignment 3, Semester 2, 2020
1. Suppose that the discrete random variable $Y$ has a one-parameter probability density function which is given by

$$f(y \mid \theta) = (1-\theta)\,\theta^{y}; \qquad y = 0, 1, 2, \ldots; \qquad 0 < \theta < 1.$$

The mean $\mu$ and variance $\sigma^2$ of this distribution are equal to $\dfrac{\theta}{1-\theta}$ and $\dfrac{\theta}{(1-\theta)^2}$ respectively.

Suppose that we have a sample of $n$ i.i.d. observations on this variable, $\{y_1, y_2, \ldots, y_n\}$.

(a) Show that the maximum likelihood estimator of $\theta$ is

$$\hat\theta_{MLE} = \frac{\bar y}{\bar y + 1},$$

where $\bar y$ is the sample mean of $\{y_1, y_2, \ldots, y_n\}$.
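$$L(\theta; y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} (1-\theta)\,\theta^{y_i} = (1-\theta)^{n}\,\theta^{\sum_{i=1}^{n} y_i} \;\Longrightarrow\;$$

$$l(\theta; y_1, y_2, \ldots, y_n) = n\log(1-\theta) + \sum_{i=1}^{n} y_i \log\theta \;\Longrightarrow\; \frac{\partial l}{\partial\theta} = -\frac{n}{1-\theta} + \frac{1}{\theta}\sum_{i=1}^{n} y_i,$$

so setting $\dfrac{\partial l}{\partial\theta} = 0$ gives

$$\frac{n}{1-\theta} = \frac{1}{\theta}\sum_{i=1}^{n} y_i \;\Longrightarrow\; \theta = (1-\theta)\,\bar y \;\Longrightarrow\; \theta\,(1+\bar y) = \bar y \;\Longrightarrow\; \hat\theta_{MLE} = \frac{\bar y}{\bar y + 1}$$

as required. To check that we have a maximum,

$$\frac{\partial^2 l}{\partial\theta^2} = -\frac{n}{(1-\theta)^2} - \frac{1}{\theta^2}\sum_{i=1}^{n} y_i < 0 \quad (\text{since } y_i \ge 0),$$

so we have a maximum.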
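
As a numerical sanity check on the closed form above, the following sketch draws a sample from the distribution by inverse transform and compares the sample mean divided by (sample mean plus one) with a brute-force grid maximiser of the log-likelihood. The function names, the seed, the parameter value 0.6 and the grid are illustrative assumptions and not part of the assignment.

```python
import math
import random


def log_likelihood(theta, n, total):
    """Log-likelihood n*log(1 - theta) + (sum of y_i)*log(theta), for 0 < theta < 1."""
    return n * math.log(1.0 - theta) + total * math.log(theta)


def mle(ys):
    """Closed-form MLE from part (a): ybar / (ybar + 1)."""
    ybar = sum(ys) / len(ys)
    return ybar / (ybar + 1.0)


def draw_sample(theta, n, rng):
    """Draw n values with P(Y = y) = (1 - theta) * theta**y by inverse transform."""
    log_theta = math.log(theta)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [int(math.log(1.0 - rng.random()) / log_theta) for _ in range(n)]


rng = random.Random(0)  # arbitrary seed, used only for reproducibility
ys = draw_sample(theta=0.6, n=500, rng=rng)
theta_hat = mle(ys)

# Brute-force maximisation over a grid with spacing 1e-4 (hypothetical choice).
grid = [k / 10000 for k in range(1, 10000)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, len(ys), sum(ys)))

print(theta_hat, theta_grid)
```

Because the log-likelihood is strictly concave on (0, 1), the grid maximiser should sit within one grid step of the closed-form estimate.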
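(b) Use the Weak Law of Large Numbers for sequences of i.i.d. random variables together with Slutsky's theorem to show that $\hat\theta_{MLE}$ is a consistent estimator of $\theta_0$, carefully explaining each of the steps.

The WLLN tells us that $\operatorname{plim}(\bar y) = E(Y) = \dfrac{\theta_0}{1-\theta_0}$.

Hence, since $\hat\theta_{MLE} = \dfrac{\bar y}{\bar y+1}$, from Slutsky's theorem (or the CMT) we obtain

$$\operatorname{plim}(\hat\theta_{MLE}) = \operatorname{plim}\left(\frac{\bar y}{\bar y+1}\right) = \frac{\operatorname{plim}(\bar y)}{\operatorname{plim}(\bar y+1)} = \frac{\frac{\theta_0}{1-\theta_0}}{\operatorname{plim}(\bar y)+1} = \frac{\frac{\theta_0}{1-\theta_0}}{\frac{\theta_0}{1-\theta_0}+1} = \frac{\frac{\theta_0}{1-\theta_0}}{\frac{1}{1-\theta_0}} = \theta_0.$$

Therefore, since $\operatorname{plim}(\hat\theta_{MLE}) = \theta_0$, we can conclude that $\hat\theta_{MLE}$ is a consistent estimator of $\theta_0$.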
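(c) Find the information matrix $I(\theta)$ (a $1 \times 1$ matrix in this case), and hence write down the limiting distribution for $\sqrt{n}\,\big(\hat\theta_{MLE} - \theta_0\big)$.

$$\begin{aligned}
I(\theta)\big|_{\theta=\theta_0} &= -E\big(H(\theta)\big)\big|_{\theta=\theta_0} = -E\left(\frac{\partial^2 l}{\partial\theta^2}\right)\bigg|_{\theta=\theta_0} = -E\left(-\frac{n}{(1-\theta)^2} - \frac{1}{\theta^2}\sum_{i=1}^{n} y_i\right)\bigg|_{\theta=\theta_0} \\
&= \frac{n}{(1-\theta_0)^2} + \frac{n\,E(Y)}{\theta_0^2} = \frac{n}{(1-\theta_0)^2} + \frac{n}{\theta_0^2}\cdot\frac{\theta_0}{1-\theta_0} \\
&= \frac{n}{1-\theta_0}\left[\frac{1}{1-\theta_0} + \frac{1}{\theta_0}\right] = \frac{n}{(1-\theta_0)^2\,\theta_0}.
\end{aligned}$$

Thus $\big(I(\theta)|_{\theta=\theta_0}\big)^{-1} = \dfrac{(1-\theta_0)^2\,\theta_0}{n} \;\Longrightarrow\; \big(i(\theta)|_{\theta=\theta_0}\big)^{-1} = (1-\theta_0)^2\,\theta_0$, and

$$\sqrt{n}\,\big(\hat\theta_{MLE} - \theta_0\big) \xrightarrow{d} N\big(0,\ (1-\theta_0)^2\,\theta_0\big).$$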
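
The limiting variance in (c) can be checked informally by Monte Carlo. The sketch below simulates the scaled estimation error many times and compares its sample variance with the squared quantity (1 minus the true parameter) times the true parameter. The true value 0.4, the sample size, the number of replications and the seed are all illustrative assumptions.

```python
import math
import random
import statistics


def draw_sample(theta, n, rng):
    """Draw n values with P(Y = y) = (1 - theta) * theta**y via inverse transform."""
    log_theta = math.log(theta)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [int(math.log(1.0 - rng.random()) / log_theta) for _ in range(n)]


def mle(ys):
    """Closed-form MLE: ybar / (ybar + 1)."""
    ybar = sum(ys) / len(ys)
    return ybar / (ybar + 1.0)


theta0, n, reps = 0.4, 400, 2000  # illustrative values only
rng = random.Random(1)            # arbitrary seed

# Replications of sqrt(n) * (theta_hat - theta0).
z = [math.sqrt(n) * (mle(draw_sample(theta0, n, rng)) - theta0) for _ in range(reps)]

simulated_var = statistics.pvariance(z)
asymptotic_var = (1.0 - theta0) ** 2 * theta0  # variance from part (c)
print(simulated_var, asymptotic_var)
```

With a finite sample size the two numbers should be close but not identical; the gap should shrink as the sample size and the number of replications grow.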
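(d) Define $\hat\sigma^2_N = \gamma(\hat\theta_{MLE}) = \dfrac{\hat\theta_{MLE}}{(1-\hat\theta_{MLE})^2}$, and suppose that $n = N$, where $N$ is a large but finite number. Find the "asymptotic" distribution of $\hat\sigma^2_N$.

We apply the delta method here, and note that

$$\frac{\partial\gamma(\theta)}{\partial\theta}\bigg|_{\theta=\theta_0} = \frac{\partial}{\partial\theta}\left[\frac{\theta}{(1-\theta)^2}\right]_{\theta=\theta_0} = \left[\frac{(1-\theta)^2 + 2\theta(1-\theta)}{(1-\theta)^4}\right]_{\theta=\theta_0} = \left[\frac{1-\theta^2}{(1-\theta)^4}\right]_{\theta=\theta_0} = \left[\frac{1+\theta}{(1-\theta)^3}\right]_{\theta=\theta_0}.$$

Hence the limit distribution is

$$\sqrt{n}\left(\hat\sigma^2_{MLE} - \frac{\theta_0}{(1-\theta_0)^2}\right) \xrightarrow{d} N\left(0,\ \left(\frac{1+\theta_0}{(1-\theta_0)^3}\right)^{2} \big(i(\theta_0)\big)^{-1}\right),$$

i.e.

$$\sqrt{n}\left(\hat\sigma^2_{MLE} - \frac{\theta_0}{(1-\theta_0)^2}\right) \xrightarrow{d} N\left(0,\ \left(\frac{1+\theta_0}{(1-\theta_0)^3}\right)^{2} (1-\theta_0)^2\,\theta_0\right)$$

using the result from (c), so

$$\sqrt{n}\left(\hat\sigma^2_{MLE} - \frac{\theta_0}{(1-\theta_0)^2}\right) \xrightarrow{d} N\left(0,\ \theta_0\left(\frac{1+\theta_0}{(1-\theta_0)^2}\right)^{2}\right)$$

and the requested "asymptotic" distribution is

$$\hat\sigma^2_N \overset{a}{\sim} N\left(\frac{\theta_0}{(1-\theta_0)^2},\ \frac{\theta_0}{N}\left(\frac{1+\theta_0}{(1-\theta_0)^2}\right)^{2}\right).$$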
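
The derivative used in the delta method step can be verified numerically. The sketch below compares a central finite difference of the variance function with the closed-form derivative, and confirms that squaring the derivative and multiplying by the inverse information from (c) reproduces the simplified variance. The evaluation point 0.4, the step size and the value of N are illustrative assumptions.

```python
import math


def gamma(theta):
    """Variance of the distribution as a function of theta: theta / (1 - theta)^2."""
    return theta / (1.0 - theta) ** 2


def gamma_prime(theta):
    """Closed-form derivative: (1 + theta) / (1 - theta)^3."""
    return (1.0 + theta) / (1.0 - theta) ** 3


def approx_variance(theta0, N):
    """Approximate variance of the plug-in estimator for a sample of size N."""
    return theta0 / N * ((1.0 + theta0) / (1.0 - theta0) ** 2) ** 2


theta0, h, N = 0.4, 1e-6, 500  # illustrative values only

# Central finite difference as an independent check on the derivative.
numeric_derivative = (gamma(theta0 + h) - gamma(theta0 - h)) / (2.0 * h)

# Delta-method variance built from its pieces: derivative^2 * i(theta0)^{-1} / N.
from_pieces = gamma_prime(theta0) ** 2 * (1.0 - theta0) ** 2 * theta0 / N

print(numeric_derivative, gamma_prime(theta0))
print(from_pieces, approx_variance(theta0, N))
```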
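(e) Consider maximum likelihood estimation of $\gamma(\theta) = \begin{bmatrix}\mu \\ \sigma^2\end{bmatrix}$, and provide the limit distribution for $\sqrt{n}\left(\begin{bmatrix}\hat\mu_{MLE} \\ \hat\sigma^2_{MLE}\end{bmatrix} - \begin{bmatrix}\mu \\ \sigma^2\end{bmatrix}\right)$ in terms of the observations and the parameter $\theta$.

We set

$$\underbrace{\gamma(\theta)}_{2\times 1} = \begin{bmatrix}\gamma_1(\theta) \\ \gamma_2(\theta)\end{bmatrix} = \begin{bmatrix}\dfrac{\theta}{1-\theta} \\[6pt] \dfrac{\theta}{(1-\theta)^2}\end{bmatrix},$$

and given regularity, we have

$$\sqrt{n}\,\big(\gamma(\hat\theta_{MLE}) - \gamma(\theta_0)\big) \xrightarrow{d} N\big(0,\ \underbrace{\Gamma(\theta_0)}_{2\times 1}\ \underbrace{i(\theta_0)^{-1}}_{1\times 1}\ \underbrace{\Gamma(\theta_0)'}_{1\times 2}\big), \quad \text{where } \Gamma(\theta_0) = \frac{\partial\gamma}{\partial\theta'}\bigg|_{\theta=\theta_0} = \begin{bmatrix}\dfrac{\partial\gamma_1}{\partial\theta} \\[6pt] \dfrac{\partial\gamma_2}{\partial\theta}\end{bmatrix}_{\theta=\theta_0}.$$

Here, the details for $\Gamma(\theta_0)$ are

$$\Gamma(\theta_0) = \begin{bmatrix}\dfrac{\partial}{\partial\theta}\left(\dfrac{\theta}{1-\theta}\right) \\[8pt] \dfrac{\partial}{\partial\theta}\left(\dfrac{\theta}{(1-\theta)^2}\right)\end{bmatrix}_{\theta=\theta_0} = \begin{bmatrix}\dfrac{1-\theta+\theta}{(1-\theta)^2} \\[8pt] \dfrac{(1-\theta)^2+2\theta(1-\theta)}{(1-\theta)^4}\end{bmatrix}_{\theta=\theta_0} = \begin{bmatrix}\dfrac{1}{(1-\theta)^2} \\[8pt] \dfrac{1-\theta^2}{(1-\theta)^4}\end{bmatrix}_{\theta=\theta_0} = \begin{bmatrix}\dfrac{1}{(1-\theta_0)^2} \\[8pt] \dfrac{1-\theta_0^2}{(1-\theta_0)^4}\end{bmatrix},$$

so using results from (c) again, the variance of the limiting distribution becomes

$$\begin{bmatrix}\dfrac{1}{(1-\theta_0)^2} \\[8pt] \dfrac{1-\theta_0^2}{(1-\theta_0)^4}\end{bmatrix} (1-\theta_0)^2\,\theta_0 \begin{bmatrix}\dfrac{1}{(1-\theta_0)^2} & \dfrac{1-\theta_0^2}{(1-\theta_0)^4}\end{bmatrix} = \begin{bmatrix}\dfrac{1}{(1-\theta_0)^2} \\[8pt] \dfrac{1-\theta_0^2}{(1-\theta_0)^4}\end{bmatrix} \begin{bmatrix}\theta_0 & \dfrac{(1-\theta_0^2)\,\theta_0}{(1-\theta_0)^2}\end{bmatrix} = \begin{bmatrix}\dfrac{\theta_0}{(1-\theta_0)^2} & \dfrac{(1+\theta_0)\,\theta_0}{(1-\theta_0)^3} \\[8pt] \dfrac{(1+\theta_0)\,\theta_0}{(1-\theta_0)^3} & \dfrac{(1+\theta_0)^2\,\theta_0}{(1-\theta_0)^4}\end{bmatrix}.$$

Hence

$$\sqrt{n}\left(\begin{bmatrix}\hat\mu_{MLE} \\ \hat\sigma^2_{MLE}\end{bmatrix} - \begin{bmatrix}\dfrac{\theta_0}{1-\theta_0} \\[6pt] \dfrac{\theta_0}{(1-\theta_0)^2}\end{bmatrix}\right) \xrightarrow{d} N\left(\begin{bmatrix}0 \\ 0\end{bmatrix},\ \begin{bmatrix}\dfrac{\theta_0}{(1-\theta_0)^2} & \dfrac{(1+\theta_0)\,\theta_0}{(1-\theta_0)^3} \\[8pt] \dfrac{(1+\theta_0)\,\theta_0}{(1-\theta_0)^3} & \dfrac{(1+\theta_0)^2\,\theta_0}{(1-\theta_0)^4}\end{bmatrix}\right).$$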
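
The covariance matrix in (e) can be checked entry by entry. The sketch below forms the outer product of the derivative vector with itself, scaled by the inverse information from (c), and compares it with the simplified closed-form entries. The function names and the evaluation point 0.3 are illustrative assumptions.

```python
import math


def covariance_from_pieces(theta0):
    """Derivative vector times inverse information times its transpose (2x2)."""
    g1 = 1.0 / (1.0 - theta0) ** 2                    # derivative of theta/(1-theta)
    g2 = (1.0 - theta0 ** 2) / (1.0 - theta0) ** 4    # derivative of theta/(1-theta)^2
    i_inv = (1.0 - theta0) ** 2 * theta0              # inverse information from (c)
    g = [g1, g2]
    return [[g[r] * i_inv * g[c] for c in range(2)] for r in range(2)]


def covariance_closed_form(theta0):
    """Simplified entries of the limiting covariance matrix."""
    a = theta0 / (1.0 - theta0) ** 2
    b = (1.0 + theta0) * theta0 / (1.0 - theta0) ** 3
    d = (1.0 + theta0) ** 2 * theta0 / (1.0 - theta0) ** 4
    return [[a, b], [b, d]]


theta0 = 0.3  # illustrative value only
V_pieces = covariance_from_pieces(theta0)
V_closed = covariance_closed_form(theta0)
determinant = V_closed[0][0] * V_closed[1][1] - V_closed[0][1] * V_closed[1][0]

print(V_pieces)
print(V_closed)
print(determinant)
```

The matrix is symmetric and has rank one (its determinant is zero up to rounding), which is expected because both the mean and the variance are functions of a single scalar parameter.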
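2. Consider the CNLM,

$$\underset{(n\times 1)}{y} = \underset{(n\times k)}{X}\ \underset{(k\times 1)}{\beta} + \underset{(n\times 1)}{u} \tag{1}$$

$$\underset{(n\times 1)}{u} \sim N(0, \sigma^2 I_n)$$

with all notation as defined in lectures, and $X$ being treated as fixed. Perform the following:

(a) Derive the MLE of $\theta = (\beta', \sigma)'$ (noting that we are interested in estimating $\sigma$ rather than $\sigma^2$). You can assume that the second order conditions for a maximum are satisfied.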
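The definition of $u$ implies that $u_i \overset{i.i.d.}{\sim} N(0, \sigma^2)$ for $i = 1, 2, \ldots, n$, and since we are treating $X$ as fixed and $y_i = x_i'\beta + u_i$, we can transform the pdf of $u_i$ to a pdf of $y_i$ via

$$f(y_i) = f(u_i)\left|\frac{\partial u_i}{\partial y_i}\right| = f(u_i = y_i - x_i'\beta) \times 1, \quad \text{where } f(u_i) = (2\pi\sigma^2)^{-1/2} \exp\left\{-\frac{1}{2\sigma^2}\,u_i^2\right\},$$

so

$$f(y_i \mid \beta, \sigma) = (2\pi\sigma^2)^{-1/2} \exp\left\{-\frac{1}{2\sigma^2}\,(y_i - x_i'\beta)^2\right\}.$$

Hence, the log likelihood function can be expressed as

$$\begin{aligned}
l &= \log L(\theta \mid y_1, y_2, \ldots, y_n) \\
&= \log \prod_{i=1}^{n} f(y_i \mid \beta, \sigma) \\
&= \log \prod_{i=1}^{n} (2\pi\sigma^2)^{-1/2} \exp\left\{-\frac{1}{2\sigma^2}\,(y_i - x_i'\beta)^2\right\} \\
&= -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - x_i'\beta)^2 \\
&= -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\,(y - X\beta)'(y - X\beta).
\end{aligned}$$

The MLE of $\theta = (\beta', \sigma)'$, $\hat\theta = (\hat\beta', \hat\sigma)'$, is obtained by solving the set of first order conditions $\dfrac{\partial l}{\partial\beta} = 0$ and $\dfrac{\partial l}{\partial\sigma} = 0$, where

$$\begin{aligned}
\frac{\partial l}{\partial\beta} &= -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\beta}\,(y - X\beta)'(y - X\beta) \\
&= -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\beta}\,(y' - \beta'X')(y - X\beta) \\
&= -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\beta}\,(y'y - y'X\beta - \beta'X'y + \beta'X'X\beta) \\
&= -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\beta}\,(y'y - 2\beta'X'y + \beta'X'X\beta) \\
&= -\frac{1}{2\sigma^2}\,(-2X'y + 2X'X\beta) \\
&= \frac{1}{\sigma^2}\,(X'y - X'X\beta)
\end{aligned} \tag{2}$$

and

$$\frac{\partial l}{\partial\sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\,(y - X\beta)'(y - X\beta). \tag{3}$$

We set (2) to be 0 and find that

$$\frac{1}{\hat\sigma^2}\,\big(X'y - X'X\hat\beta\big) = 0, \quad \text{so } \hat\beta = (X'X)^{-1}X'y$$

(assuming that $X$ has full column rank so that $(X'X)^{-1}$ will exist).

Next, we set (3) to be 0:

$$-\frac{n}{\hat\sigma} + \frac{1}{\hat\sigma^3}\,\big(y - X\hat\beta\big)'\big(y - X\hat\beta\big) = 0 \;\Longrightarrow\; \hat\sigma = \sqrt{\frac{\big(y' - \hat\beta'X'\big)\big(y - X\hat\beta\big)}{n}}.$$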
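
A minimal numerical sketch of these two estimators follows, assuming NumPy is available. It simulates data from the model, computes both estimates from their closed forms, and evaluates the two first order conditions (2) and (3) at the estimates to confirm they vanish. The design matrix, true parameter values and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed
n, k = 200, 3                           # illustrative dimensions

# Hypothetical design: an intercept plus two standard normal regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, -2.0, 0.5])  # assumed true coefficients
sigma_true = 1.5                        # assumed true error standard deviation
y = X @ beta_true + sigma_true * rng.normal(size=n)

# MLE of beta: solve the normal equations X'X b = X'y (avoids an explicit inverse).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# MLE of sigma: square root of the residual sum of squares divided by n (not n - k).
resid = y - X @ beta_hat
sigma_hat = np.sqrt(resid @ resid / n)

# First order conditions (2) and (3) evaluated at the estimates.
score_beta = (X.T @ y - X.T @ X @ beta_hat) / sigma_hat**2
score_sigma = -n / sigma_hat + (resid @ resid) / sigma_hat**3

print(beta_hat, sigma_hat)
print(score_beta, score_sigma)
```

Note that the divisor is the sample size itself, so this estimate of the error standard deviation differs from the usual degrees-of-freedom-corrected one.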
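(b) Consider the following vector of parameter functions:

$$\gamma(\theta) = \begin{bmatrix}\gamma_1(\theta) \\ \gamma_2(\theta)\end{bmatrix} = \begin{bmatrix}\beta/\sigma^4 \\ \sigma^{-2}\end{bmatrix}.$$

Specify all components of the limiting distribution of the MLE of $\gamma(\theta)$:

$$\sqrt{n}\,\big(\gamma(\hat\theta) - \gamma(\theta_0)\big) \xrightarrow{d} N\big(0,\ \Gamma(\theta_0)\,i(\theta_0)^{-1}\,\Gamma(\theta_0)'\big), \tag{4}$$

with all notation as defined in lectures.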
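We use the delta method for vector MLE and need to find

$$\Gamma(\theta_0) = \begin{bmatrix}\dfrac{\partial}{\partial\beta'}\big(\beta/\sigma^4\big) & \dfrac{\partial}{\partial\sigma}\big(\beta/\sigma^4\big) \\[8pt] \dfrac{\partial}{\partial\beta'}\,\sigma^{-2} & \dfrac{\partial}{\partial\sigma}\,\sigma^{-2}\end{bmatrix}_{\theta=\theta_0} = \begin{bmatrix}I_k/\sigma^4 & -4\beta/\sigma^5 \\[4pt] 0 & -2\sigma^{-3}\end{bmatrix}_{\theta=\theta_0} = \begin{bmatrix}\dfrac{I_k}{\sigma_0^4} & -\dfrac{4\beta_0}{\sigma_0^5} \\[8pt] 0 & -2\sigma_0^{-3}\end{bmatrix}.$$

The estimated and true values of the parameters of interest are

$$\gamma(\hat\theta_{MLE}) = \begin{bmatrix}\hat\beta_{MLE}/\hat\sigma^4_{MLE} \\ \hat\sigma^{-2}_{MLE}\end{bmatrix} \quad \text{and} \quad \gamma(\theta_0) = \begin{bmatrix}\beta_0/\sigma_0^4 \\ \sigma_0^{-2}\end{bmatrix}.$$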
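The (per observation) information matrix is

$$i(\theta_0) = \frac{1}{n}\left\{-E\left[\frac{\partial^2 l}{\partial\theta\,\partial\theta'}\right]_{\theta=\theta_0}\right\} = \frac{1}{n}\left\{-E\begin{bmatrix}\dfrac{\partial^2 l}{\partial\beta\,\partial\beta'} & \dfrac{\partial^2 l}{\partial\beta\,\partial\sigma} \\[8pt] \dfrac{\partial^2 l}{\partial\sigma\,\partial\beta'} & \dfrac{\partial^2 l}{\partial\sigma^2}\end{bmatrix}_{\theta=\theta_0}\right\}$$

where

$$\frac{\partial^2 l}{\partial\beta\,\partial\beta'} = \frac{\partial}{\partial\beta}\left(\frac{\partial l}{\partial\beta}\right)' = \frac{\partial}{\partial\beta}\left(\frac{1}{\sigma^2}\,(X'y - X'X\beta)\right)' = -\frac{X'X}{\sigma^2},$$

$$\frac{\partial^2 l}{\partial\beta\,\partial\sigma} = \frac{\partial^2 l}{\partial\sigma\,\partial\beta} = \frac{\partial}{\partial\sigma}\left(\frac{\partial l}{\partial\beta}\right) = \frac{\partial}{\partial\sigma}\left(\frac{1}{\sigma^2}\,X'u\right) = -\frac{2}{\sigma^3}\,X'u,$$

$$\frac{\partial^2 l}{\partial\sigma^2} = \frac{\partial}{\partial\sigma}\left(\frac{\partial l}{\partial\sigma}\right) = \frac{\partial}{\partial\sigma}\left(-\frac{n}{\sigma} + \frac{1}{\sigma^3}\,(y - X\beta)'(y - X\beta)\right) = \frac{n}{\sigma^2} - \frac{3}{\sigma^4}\,(y - X\beta)'(y - X\beta) = \frac{n}{\sigma^2} - \frac{3}{\sigma^4}\,u'u,$$

so that $i(\theta_0)$ is therefore

$$\begin{aligned}
i(\theta_0) &= \frac{1}{n}\left\{-E\begin{bmatrix}-\dfrac{X'X}{\sigma^2} & -\dfrac{2}{\sigma^3}\,X'u \\[8pt] -\dfrac{2}{\sigma^3}\,u'X & \dfrac{n}{\sigma^2} - \dfrac{3}{\sigma^4}\,u'u\end{bmatrix}_{\theta=\theta_0}\right\} \\
&= \frac{1}{n}\left\{E\begin{bmatrix}\dfrac{X'X}{\sigma^2} & \dfrac{2}{\sigma^3}\,X'u \\[8pt] \dfrac{2}{\sigma^3}\,u'X & -\dfrac{n}{\sigma^2} + \dfrac{3}{\sigma^4}\,u'u\end{bmatrix}_{\theta=\theta_0}\right\} \\
&= \frac{1}{n}\begin{bmatrix}\dfrac{X'X}{\sigma^2} & 0 \\[8pt] 0 & -\dfrac{n}{\sigma^2} + \dfrac{3}{\sigma^4}\,E[u'u]\end{bmatrix}_{\theta=\theta_0} \\
&= \frac{1}{n}\begin{bmatrix}\dfrac{X'X}{\sigma^2} & 0 \\[8pt] 0 & -\dfrac{n}{\sigma^2} + \dfrac{3}{\sigma^4}\,n\sigma^2\end{bmatrix}_{\theta=\theta_0} \\
&= \frac{1}{n}\begin{bmatrix}\dfrac{X'X}{\sigma^2} & 0 \\[8pt] 0 & -\dfrac{n}{\sigma^2} + \dfrac{3n}{\sigma^2}\end{bmatrix}_{\theta=\theta_0} \\
&= \frac{1}{n}\begin{bmatrix}\dfrac{X'X}{\sigma^2} & 0 \\[8pt] 0 & \dfrac{2n}{\sigma^2}\end{bmatrix}_{\theta=\theta_0} \\
&= \begin{bmatrix}\dfrac{X'X}{n\sigma_0^2} & 0 \\[8pt] 0 & \dfrac{2}{\sigma_0^2}\end{bmatrix}.
\end{aligned}$$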
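The variance of the limit distribution is $\Gamma(\theta_0)\,i(\theta_0)^{-1}\,\Gamma(\theta_0)'$

$$\begin{aligned}
&= \begin{bmatrix}\dfrac{I_k}{\sigma_0^4} & -\dfrac{4\beta_0}{\sigma_0^5} \\[8pt] 0 & -2\sigma_0^{-3}\end{bmatrix} \begin{bmatrix}\dfrac{X'X}{n\sigma_0^2} & 0 \\[8pt] 0 & \dfrac{2}{\sigma_0^2}\end{bmatrix}^{-1} \begin{bmatrix}\dfrac{I_k}{\sigma_0^4} & -\dfrac{4\beta_0}{\sigma_0^5} \\[8pt] 0 & -2\sigma_0^{-3}\end{bmatrix}' \\
&= \begin{bmatrix}\dfrac{I_k}{\sigma_0^4} & -\dfrac{4\beta_0}{\sigma_0^5} \\[8pt] 0 & -2\sigma_0^{-3}\end{bmatrix} \begin{bmatrix}n\sigma_0^2\,(X'X)^{-1} & 0 \\[4pt] 0 & \dfrac{\sigma_0^2}{2}\end{bmatrix} \begin{bmatrix}\dfrac{I_k}{\sigma_0^4} & 0 \\[8pt] -\dfrac{4\beta_0'}{\sigma_0^5} & -2\sigma_0^{-3}\end{bmatrix} \\
&= \begin{bmatrix}\dfrac{I_k}{\sigma_0^4} & -\dfrac{4\beta_0}{\sigma_0^5} \\[8pt] 0 & -2\sigma_0^{-3}\end{bmatrix} \begin{bmatrix}\dfrac{n\,(X'X)^{-1}}{\sigma_0^2} & 0 \\[8pt] -\dfrac{2\beta_0'}{\sigma_0^3} & -\dfrac{1}{\sigma_0}\end{bmatrix} \\
&= \begin{bmatrix}\dfrac{n\,(X'X)^{-1}}{\sigma_0^6} + \dfrac{8\beta_0\beta_0'}{\sigma_0^8} & \dfrac{4\beta_0}{\sigma_0^6} \\[8pt] \dfrac{4\beta_0'}{\sigma_0^6} & \dfrac{2}{\sigma_0^4}\end{bmatrix}.
\end{aligned}$$

Hence

$$\sqrt{n}\left(\begin{bmatrix}\hat\beta_{MLE}/\hat\sigma^4_{MLE} \\ \hat\sigma^{-2}_{MLE}\end{bmatrix} - \begin{bmatrix}\beta_0/\sigma_0^4 \\ \sigma_0^{-2}\end{bmatrix}\right) \xrightarrow{d} N\left(\begin{bmatrix}0 \\ 0\end{bmatrix},\ \begin{bmatrix}\dfrac{n\,(X'X)^{-1}}{\sigma_0^6} + \dfrac{8\beta_0\beta_0'}{\sigma_0^8} & \dfrac{4\beta_0}{\sigma_0^6} \\[8pt] \dfrac{4\beta_0'}{\sigma_0^6} & \dfrac{2}{\sigma_0^4}\end{bmatrix}\right).$$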
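
The block-matrix algebra above is easy to get wrong by hand, so a numerical cross-check may be useful. The sketch below, assuming NumPy is available, builds the Jacobian of the parameter functions and the per-observation information matrix for arbitrary illustrative values, multiplies Jacobian, inverse information and transposed Jacobian directly, and compares the result with the simplified closed-form blocks. All names and numerical values are assumptions made for the illustration.

```python
import numpy as np


def variance_by_multiplication(XtX, n, beta0, sigma0):
    """Jacobian @ inverse(information) @ Jacobian', computed numerically."""
    k = beta0.size
    jac = np.zeros((k + 1, k + 1))
    jac[:k, :k] = np.eye(k) / sigma0**4        # d(beta/sigma^4)/d beta'
    jac[:k, k] = -4.0 * beta0 / sigma0**5      # d(beta/sigma^4)/d sigma
    jac[k, k] = -2.0 / sigma0**3               # d(sigma^-2)/d sigma
    info = np.zeros((k + 1, k + 1))
    info[:k, :k] = XtX / (n * sigma0**2)
    info[k, k] = 2.0 / sigma0**2
    return jac @ np.linalg.inv(info) @ jac.T


def variance_closed_form(XtX, n, beta0, sigma0):
    """Simplified blocks of the limiting variance."""
    k = beta0.size
    V = np.zeros((k + 1, k + 1))
    V[:k, :k] = (n * np.linalg.inv(XtX) / sigma0**6
                 + 8.0 * np.outer(beta0, beta0) / sigma0**8)
    V[:k, k] = 4.0 * beta0 / sigma0**6
    V[k, :k] = 4.0 * beta0 / sigma0**6
    V[k, k] = 2.0 / sigma0**4
    return V


rng = np.random.default_rng(0)                 # arbitrary seed
n, k = 50, 2                                   # illustrative dimensions
X = rng.normal(size=(n, k))                    # hypothetical fixed design
beta0 = np.array([0.7, -1.2])                  # assumed true coefficients
sigma0 = 1.3                                   # assumed true standard deviation

V_mult = variance_by_multiplication(X.T @ X, n, beta0, sigma0)
V_closed = variance_closed_form(X.T @ X, n, beta0, sigma0)
print(np.max(np.abs(V_mult - V_closed)))
```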
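3. Assume $n$ i.i.d. draws from $Y \sim \text{Exponential}(\lambda)$ with pdf:

$$f(y \mid \lambda) = \frac{1}{\lambda}\exp\left(-\frac{1}{\lambda}\,y\right) \ \text{for } y > 0, \qquad E(Y \mid \lambda) = \lambda; \quad \operatorname{var}(Y \mid \lambda) = \lambda^2,$$

and consider the use of $s^2 = \sum_{i=1}^{n}\big(Y_i - \bar Y\big)^2 / (n-1)$ as an estimator of $\operatorname{var}(Y \mid \lambda) = \lambda^2$.

(a) It is well known that the sample variance is an unbiased estimator for the population variance, and for the exponential distribution it is also the case that $\operatorname{var}(s^2) = \dfrac{\lambda^4}{n}\left[9 - \dfrac{n-3}{n-1}\right]$. Use these two pieces of information to explain why the sufficient conditions for $s^2$ to be a consistent estimator of $\lambda^2$ hold.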
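The sufficient conditions for $s^2$ to be a consistent estimator for $\lambda^2$ are that

(i) $s^2$ is an unbiased estimator for $\lambda^2$ (i.e. $E(s^2) = \lambda^2$), and since our first piece of information says that $s^2$ is an unbiased estimator for $\operatorname{var}(Y \mid \lambda)$, we know that $s^2$ is an unbiased estimator for $\lambda^2$ so that $E(s^2) = \lambda^2$ holds true; and

(ii) $\operatorname{var}(s^2) \to 0$ as $n \to \infty$, which also holds since

$$\begin{aligned}
\lim_{n\to\infty}\operatorname{var}(s^2) &= \lim_{n\to\infty}\frac{1}{n}\left[9\lambda^4 - \lambda^4\,\frac{n-3}{n-1}\right] && (5) \\
&= \lim_{n\to\infty}\frac{1}{n}\left[9\lambda^4 - \lambda^4\,\frac{n-1-2}{n-1}\right] && (6) \\
&= \lim_{n\to\infty}\frac{\lambda^4}{n}\left[9 - \left(1 - \frac{2}{n-1}\right)\right] && (7) \\
&= 0. && (8)
\end{aligned}$$

Hence, $s^2$ is a consistent estimator of $\operatorname{var}(Y) = \lambda^2$.

The above two conditions ensure that the distribution of $s^2$ will become degenerate as $n \to \infty$, with all probability mass of $s^2$ concentrated on $\lambda^2$.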
(b) Use $M = 5000$ independent samples of size $n = 100$ to estimate $E(s^2)$ and $\operatorname{var}(s^2)$ from an exponential distribution with $\lambda = 5$. (You will need to generate samples of $s^2$ to do this.) Report your estimates, compare them with their theoretical values, and also provide kernel estimates of the distribution of your simulated $s^2$ values. Repeat this exercise with $n = 1000$, and then again with $n = 10000$.
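Summary statistics from the three simulations are provided on the last page of these answers, and the requested sample and theoretical results are tabulated below:

| $n$ | Simulated mean $\widehat{s^2}$ | Theoretical $\lambda^2$ | Simulated $\widehat{\operatorname{Var}}(s^2)$ | Theoretical $\frac{\lambda^4}{n}\left[9 - \frac{n-3}{n-1}\right]$ |
|---|---|---|---|---|
| 100 | 24.8362 | 25.0 | 48.4487 | 50.1262 |
| 1000 | 24.9521 | 25.0 | 4.8407 | 5.0012 |
| 10000 | 24.9948 | 25.0 | 0.4866 | 0.5000 |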
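
A simulation of this kind can be sketched as follows, using only the Python standard library. This is not the code that produced the table above: the seed and function names are assumptions, so the simulated figures will differ from the tabulated ones, and only the case with 100 observations per sample is run here to keep the example quick. Kernel density estimation is not included.

```python
import random


def simulate_s2(n, M, lam, seed=0):
    """Return M simulated sample variances, each from n exponential draws with mean lam."""
    rng = random.Random(seed)
    draws = []
    for _ in range(M):
        # random.expovariate takes the rate, i.e. 1 / mean.
        ys = [rng.expovariate(1.0 / lam) for _ in range(n)]
        ybar = sum(ys) / n
        draws.append(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    return draws


def theoretical_var_s2(n, lam):
    """Variance of the sample variance stated in part (a)."""
    return lam ** 4 / n * (9.0 - (n - 3) / (n - 1))


def mean_and_variance(xs):
    """Sample mean and (divisor len - 1) sample variance of a list."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


lam, n, M = 5.0, 100, 5000
s2_draws = simulate_s2(n=n, M=M, lam=lam)
est_mean, est_var = mean_and_variance(s2_draws)

print("simulated mean of s2:", est_mean, "theoretical:", lam ** 2)
print("simulated var of s2: ", est_var, "theoretical:", theoretical_var_s2(n, lam))
```

Rerunning with larger sample sizes only requires changing `n`; the pure-Python loop becomes slow for 10000 observations per sample, so a vectorised library would be a sensible substitute there.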
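The kernel densities of the simulated $s^2$ values for the three sample sizes are:

[Figure: kernel density estimates of the simulated $s^2$ values for $n = 100$, $n = 1000$ and $n = 10000$]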
(c) Outline how the results found in part (b) provide numerical demonstration of the consistency of $s^2$.

Comparison of the simulated $\widehat{s^2}$ and $\lambda^2$ shows that there is very little apparent bias, and further, as the sample size grows, this evidence of bias is disappearing. This provides numerical evidence that $E(s^2) = \lambda^2$.

Looking at the simulated $\widehat{\operatorname{Var}}(s^2)$, we see that they are following their theoretical values, and in particular, they are falling as the sample size grows. Thus it appears that $\operatorname{var}(s^2) \to 0$ as $n \to \infty$.

Taking these two numerical observations together, they provide evidence that the two sufficient conditions for $s^2$ to be consistent for $\lambda^2$ are satisfied.

Further evidence that $s^2$ is a consistent estimator for $\lambda^2$ is that as the sample size grows the kernel densities become more concentrated around the theoretical value for $\lambda^2$, which is 25 in this case.