STA261 - Lecture 8
Interval Estimation: Part II
Rob Zimmerman
University of Toronto
July 29, 2020
Rob Zimmerman (University of Toronto) STA261 - Lecture 8 July 29, 2020 1 / 15
Example 9.3.1: Optimizing the Length of a Confidence Interval
Let $X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2$ known. Among all the $1-\alpha$ confidence intervals for $\mu$, can we find the one with the minimum length?
From last lecture, we know that to find a $1-\alpha$ confidence interval, we can use the pivot $\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$ and find constants $a$ and $b$ such that $P(a \le Z \le b) = 1-\alpha$, which would yield the $1-\alpha$ confidence interval
\[ \left[ \bar{X} - b\,\frac{\sigma}{\sqrt{n}},\ \bar{X} - a\,\frac{\sigma}{\sqrt{n}} \right]. \]
Its length is $(b-a)\frac{\sigma}{\sqrt{n}}$, and $\frac{\sigma}{\sqrt{n}}$ is a known constant, so this is the same as finding $a$ and $b$ such that $1-\alpha = P(a \le Z \le b)$ and $(b-a)$ is minimized.
Previously, we chose $a = -z_{\alpha/2}$ and $b = z_{\alpha/2}$ without any particular justification, although the symmetry of the Normal pdf lends some intuition towards this choice. It turns out that splitting the probability $\alpha$ equally between the two tails is optimal here, although that is not true in general.
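To see the effect of the split numerically, here is a minimal Python sketch that is not part of the original slides. It uses only the standard library's `statistics.NormalDist`; the helper name `interval_width` and the particular splits tried are illustrative assumptions.

```python
from statistics import NormalDist

def interval_width(alpha, lower_tail):
    """Width b - a when P(Z < a) = lower_tail and P(Z > b) = alpha - lower_tail."""
    z = NormalDist()  # standard Normal, i.e. the distribution of the pivot
    a = z.inv_cdf(lower_tail)
    b = z.inv_cdf(1.0 - (alpha - lower_tail))
    return b - a

alpha = 0.05
# Several ways of splitting alpha between the lower and upper tails.
splits = [0.005, 0.01, 0.02, 0.025, 0.03, 0.04, 0.045]
widths = {s: interval_width(alpha, s) for s in splits}
best_split = min(widths, key=widths.get)

for s in splits:
    print(f"lower tail {s:.3f}: b - a = {widths[s]:.4f}")
print("shortest width at lower tail =", best_split)
```

Among the splits tried, the equal split `alpha / 2` gives the smallest `b - a`, in line with the claim above. Multiplying each width by $\sigma/\sqrt{n}$ gives the corresponding interval length.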
Theorem 9.3.2: Shortest Intervals for Unimodal Densities
Let $f(x)$ be a unimodal pdf. Suppose the interval $[a, b]$ satisfies the following three conditions:
1. $\int_a^b f(x)\,dx = 1-\alpha$
2. $f(a) = f(b) > 0$
3. $a \le x^* \le b$, where $x^*$ is a mode of $f(x)$
Then $[a, b]$ is the shortest among all intervals that satisfy $\int_a^b f(x)\,dx = 1-\alpha$.
Proof. Let $[a', b']$ be any interval with shorter length (that is, $b'-a' < b-a$). Suppose also that $a' \le a$. Our goal is to show that $\int_{a'}^{b'} f(x)\,dx < 1-\alpha$. We break the proof into two cases: one for $b' \le a$, and the other for $b' > a$.
Theorem 9.3.2: Shortest Intervals for Unimodal Densities (Continued)
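For the first case, suppose that $b' \le a$. Then $a' \le b' \le a \le x^*$, so that
\begin{align*}
\int_{a'}^{b'} f(x)\,dx &\le \int_{a'}^{b'} f(b')\,dx && \text{since } x \le b' \le x^* \implies f(x) \le f(b') \\
&= f(b')(b'-a') \\
&\le f(a)(b'-a') && \text{since } b' \le a \le x^* \implies f(b') \le f(a) \\
&< f(a)(b-a) && \text{since } b'-a' < b-a \text{ and } f(a) > 0 \\
&= \int_a^b f(a)\,dx \\
&\le \int_a^b f(x)\,dx && \text{since } f(a) \le f(x) \text{ for } x \in [a,b] \text{ by Conditions 2 and 3 and unimodality} \\
&= 1-\alpha.
\end{align*}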
For the second case, now suppose that $b' > a$. Then $a' \le a < b' < b$, so that
\[ \int_{a'}^{b'} f(x)\,dx = \underbrace{\int_a^b f(x)\,dx}_{=\,1-\alpha} + \left[ \int_{a'}^{a} f(x)\,dx - \int_{b'}^{b} f(x)\,dx \right]. \]
Theorem 9.3.2: Shortest Intervals for Unimodal Densities (Continued)
If we can show that the expression in the square brackets above is negative, then we'll be done. To that end, the unimodality of $f$ and $a' \le a < b' < b$ give us
\[ \int_{a'}^{a} f(x)\,dx \le \int_{a'}^{a} f(a)\,dx = f(a)(a-a') \]
and
\[ \int_{b'}^{b} f(x)\,dx \ge \int_{b'}^{b} f(b)\,dx = f(b)(b-b'). \]
Therefore, the expression in the square brackets is
\[ \int_{a'}^{a} f(x)\,dx - \int_{b'}^{b} f(x)\,dx \le f(a)(a-a') - f(b)(b-b'). \]
Condition 2 forces the term on the right to be equal to
\[ \underbrace{f(a)}_{>\,0} \Big[ \underbrace{(b'-a') - (b-a)}_{<\,0} \Big] < 0. \]
All of that was for $a' \le a$. If $a' > a$, the proof proceeds along the same lines (with most of the weak inequalities becoming strict).
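As a numerical illustration of Theorem 9.3.2 on a skewed density, the following Python sketch compares the equal-tailed interval with the interval satisfying all three conditions. The $\Gamma(2, 1)$ density is an assumption chosen only because its cdf has a closed form, and the helpers `bisect`, `quantile`, and `upper_endpoint` are hypothetical names written for this sketch using the standard library alone.

```python
import math

def pdf(y):
    """Gamma(2, 1) density: unimodal with mode at y = 1."""
    return y * math.exp(-y)

def cdf(y):
    """Gamma(2, 1) cdf in closed form."""
    return 1.0 - math.exp(-y) * (1.0 + y)

def bisect(func, lo, hi, tol=1e-12):
    """Bisection root finder; assumes func(lo) and func(hi) have opposite signs."""
    lo_positive = func(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def quantile(p):
    return bisect(lambda y: cdf(y) - p, 0.0, 60.0)

def upper_endpoint(a, alpha):
    """b such that the pdf integrates to 1 - alpha over [a, b] (Condition 1)."""
    return quantile(cdf(a) + 1.0 - alpha)

alpha = 0.10

# Equal-tailed interval: alpha/2 in each tail.
a_eq, b_eq = quantile(alpha / 2), quantile(1 - alpha / 2)

# Interval from Theorem 9.3.2: same coverage, but with pdf(a) == pdf(b).
a_max = quantile(alpha)  # for a >= a_max no b gives coverage 1 - alpha
a_opt = bisect(lambda a: pdf(a) - pdf(upper_endpoint(a, alpha)), 1e-9, a_max - 1e-9)
b_opt = upper_endpoint(a_opt, alpha)

print(f"equal-tailed: [{a_eq:.4f}, {b_eq:.4f}], length {b_eq - a_eq:.4f}")
print(f"f(a) = f(b):  [{a_opt:.4f}, {b_opt:.4f}], length {b_opt - a_opt:.4f}")
```

For this density the interval with $f(a) = f(b)$ contains the mode and is shorter than the equal-tailed one, so splitting $\alpha$ equally is not optimal without symmetry.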
Theorem: Shortest Intervals for Symmetric Unimodal Densities
Suppose that $X \sim f(x)$, where $f(x)$ is a symmetric, unimodal, and continuous pdf. Of all the intervals $[a, b]$ which satisfy $\int_a^b f(x)\,dx = 1-\alpha$, the shortest is obtained by choosing $a$ and $b$ such that $P(X \le a) = P(X > b) = \frac{\alpha}{2}$.
Proof. Let $x^*$ be the mode of $f(x)$, and suppose that $\int_{-\infty}^{a} f(x)\,dx = \frac{\alpha}{2}$ for some $a$. Since the integral is positive and $f(x)$ is unimodal, we must have that $f(a) > 0$.
Next, we show that the only possible value of $b$ that makes $\int_a^b f(x)\,dx = 1-\alpha$ is $b = 2x^* - a$. To that end, the substitution $y = 2x^* - x$ below gives us
\[ \frac{\alpha}{2} = \int_{-\infty}^{a} f(x)\,dx = -\int_{\infty}^{2x^*-a} f(2x^*-y)\,dy = -\int_{\infty}^{b} f(x^* + (x^*-y))\,dy. \]
Now, since $f(x)$ is clearly symmetric around its mode $x^*$, we must have that $f(x^*+z) = f(x^*-z)$ for any $z \in \mathbb{R}$. In particular, taking $z = x^*-y$ gives us $f(x^* + (x^*-y)) = f(x^* - (x^*-y)) = f(y)$, so the integral on the right is equal to
\[ -\int_{\infty}^{b} f(y)\,dy = \int_{b}^{\infty} f(x)\,dx. \]
Theorem: Shortest Intervals for Symmetric Unimodal Densities (Continued)
That is, $\frac{\alpha}{2} = \int_b^{\infty} f(x)\,dx$, and $b$ is the unique value for which this is true, since $f(x)$ is unimodal and
\[ f(b) = f(2x^*-a) = f(x^* + (x^*-a)) = f(a) > 0. \]
Finally, it is clear that $a \le x^*$, since we have $P(X \le a) = \frac{\alpha}{2} \le \frac{1}{2} = P(X \le x^*)$. Similarly, we must also have $x^* \le b$.
To summarize, choosing $a$ and $b$ such that $P(X \le a) = P(X > b) = \frac{\alpha}{2}$ implies the following:
1. $\int_a^b f(x)\,dx = 1-\alpha$
2. $f(a) = f(b) > 0$
3. $a \le x^* \le b$, where $x^*$ is a mode of $f(x)$
Thus, the three conditions of Theorem 9.3.2 are satisfied, and it follows that $[a, b]$ is the shortest among all intervals which satisfy $\int_a^b f(x)\,dx = 1-\alpha$.
Example 9.3.3: Optimizing Expected Length
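Let $X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$. By Theorem 9.3.2, we now know that our observed $1-\alpha$ confidence interval for $\mu$ given by
\[ \left[ \bar{x} - b\,\frac{s}{\sqrt{n}},\ \bar{x} - a\,\frac{s}{\sqrt{n}} \right] \]
has shortest length when $a = -b = -t_{n-1,\alpha/2}$. But this is after seeing the data; what about minimizing the length of the actual $1-\alpha$ confidence interval
\[ \left[ \bar{X} - b\,\frac{S}{\sqrt{n}},\ \bar{X} - a\,\frac{S}{\sqrt{n}} \right]? \]
The length of that is $(b-a)\frac{S}{\sqrt{n}}$, which is random. Therefore we might consider trying to minimize the expected length $E_{\sigma^2}\!\left[(b-a)\frac{S}{\sqrt{n}}\right] = \frac{b-a}{\sqrt{n}}\,E_{\sigma^2}[S]$. One can show that
\[ E_{\sigma^2}[S] = \sigma \sqrt{\frac{2}{n-1}}\,\frac{\Gamma\!\left(\frac{n}{2}\right)}{\Gamma\!\left(\frac{n-1}{2}\right)}, \]
so the expected length is $(b-a)\,c(n)\frac{\sigma}{\sqrt{n}}$ for a function $c(\cdot)$ of $n$ alone. The factor $c(n)\frac{\sigma}{\sqrt{n}}$ does not depend on $a$ or $b$, so minimizing $(b-a)\,c(n)\frac{\sigma}{\sqrt{n}}$ subject to the $1-\alpha$ constraint is exactly the same as minimizing $b-a$ itself, and we still get $a = -b = -t_{n-1,\alpha/2}$ from Theorem 9.3.2.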
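The factor $c(n)$ is easy to evaluate. The short Python sketch below is not from the slides: it computes $c(n) = \sqrt{2/(n-1)}\,\Gamma(n/2)/\Gamma((n-1)/2)$ with the standard library's `math.lgamma`, and the function names `c` and `expected_length` are illustrative assumptions.

```python
import math

def c(n):
    """c(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2), so that E[S] = c(n) * sigma."""
    # Work on the log scale so that large n does not overflow the Gamma function.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def expected_length(a, b, n, sigma):
    """Expected length (b - a) * c(n) * sigma / sqrt(n) of the interval above."""
    return (b - a) * c(n) * sigma / math.sqrt(n)

for n in (2, 5, 10, 30, 100):
    print(f"n = {n:3d}: c(n) = {c(n):.4f}")
```

The values lie below 1 and approach 1 as $n$ grows, which reflects that $S$ underestimates $\sigma$ on average. Since `c(n)` never involves `a` or `b`, the minimizing pair is unaffected by it.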
Example 9.3.4: Shortest Gamma Pivotal Interval
Let $X \sim \Gamma(k, \frac{1}{\beta})$, so $Y = X/\beta \sim \Gamma(k, 1)$ is a pivot. Can we apply Theorem 9.3.2 to $Y$ to get the shortest $1-\alpha$ confidence interval for $\beta$?
Not directly. The theorem would seem to imply that we want constants $a$ and $b$ to satisfy $P(a \le Y \le b) = 1-\alpha$ and $f_Y(a) = f_Y(b)$. However, the resulting confidence interval for $\beta$ is of the form $\left[\frac{X}{b}, \frac{X}{a}\right]$, and the length of that is proportional to $\frac{1}{a} - \frac{1}{b} = \frac{b-a}{ab}$, which is definitely not the same as $(b-a)$. So blindly applying Theorem 9.3.2 wouldn't give us a minimum length confidence interval for $\beta$.
However, the general idea can be rescued. Observe that the only unknowns in Condition 1 of Theorem 9.3.2 are $a$ and $b$, so we can formally consider $b = b(a)$ as a smooth function of $a$. Thus, the problem of finding the shortest length $1-\alpha$ confidence interval based on the pivot $Y$ turns into the optimization problem
\[ \min_{a} \left( \frac{1}{a} - \frac{1}{b(a)} \right) \quad \text{subject to} \quad \int_a^{b(a)} f_Y(y)\,dy = 1-\alpha. \]
A bit of calculus shows that this implies $f_Y(b)\,b^2 = f_Y(a)\,a^2$: differentiating the constraint gives $b'(a) = f_Y(a)/f_Y(b)$, and setting the derivative of the objective, $-\frac{1}{a^2} + \frac{b'(a)}{b^2}$, equal to zero yields the condition. Thus, the problem reduces to minimizing $\frac{b-a}{ab}$ subject to the coverage constraint and $f_Y(b)\,b^2 = f_Y(a)\,a^2$, and this can usually be solved using numerical methods (if not analytically).
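One possible numerical approach is sketched below in Python; it is an illustration under stated assumptions rather than a method given in the slides. It assumes an integer shape $k$ so that the $\Gamma(k, 1)$ cdf has a closed (Erlang) form, uses only the standard library, and the helper names `bisect`, `gamma_quantile`, and `shortest_interval_constants` are hypothetical.

```python
import math

def gamma_pdf(y, k):
    """Gamma(k, 1) density for a positive integer shape k."""
    return y ** (k - 1) * math.exp(-y) / math.factorial(k - 1)

def gamma_cdf(y, k):
    """Gamma(k, 1) cdf for integer k: 1 - exp(-y) * sum_{i<k} y^i / i!."""
    return 1.0 - math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(k))

def bisect(func, lo, hi, tol=1e-12):
    """Bisection root finder; assumes func(lo) and func(hi) have opposite signs."""
    lo_positive = func(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma_quantile(p, k):
    return bisect(lambda y: gamma_cdf(y, k) - p, 0.0, 200.0)

def shortest_interval_constants(k, alpha):
    """Solve coverage = 1 - alpha together with a^2 f_Y(a) = b^2 f_Y(b)."""
    def upper(a):
        # b(a): the upper endpoint forced by the coverage constraint.
        return gamma_quantile(gamma_cdf(a, k) + 1.0 - alpha, k)

    def gap(a):
        b = upper(a)
        return a * a * gamma_pdf(a, k) - b * b * gamma_pdf(b, k)

    a_max = gamma_quantile(alpha, k)  # for a >= a_max no b gives the coverage
    a = bisect(gap, 1e-9, a_max - 1e-9)
    return a, upper(a)

k, alpha = 3, 0.10
a_opt, b_opt = shortest_interval_constants(k, alpha)
a_eq, b_eq = gamma_quantile(alpha / 2, k), gamma_quantile(1 - alpha / 2, k)

print(f"equal-tailed: a = {a_eq:.4f}, b = {b_eq:.4f}, 1/a - 1/b = {1/a_eq - 1/b_eq:.4f}")
print(f"optimized:    a = {a_opt:.4f}, b = {b_opt:.4f}, 1/a - 1/b = {1/a_opt - 1/b_opt:.4f}")
```

Given an observation `x`, the resulting interval for $\beta$ would be `[x / b_opt, x / a_opt]`. In this example the optimized constants give a smaller value of $\frac{1}{a} - \frac{1}{b}$ than the equal-tailed ones.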
Definition: Probability of False Coverage
Let $(X_1, X_2, \ldots, X_n) \sim f(x \mid \theta)$, and suppose that $C(X)$ is a $1-\alpha$ confidence set for the parameter $\theta$. For a parameter $\theta' \ne \theta$, the probability of false coverage for $C(X)$ is the function
\[ \begin{cases} P_\theta(\theta' \in C(X)), & \text{when } C(X) = [L(X), U(X)] \text{ and } \theta \ne \theta' \\ P_\theta(\theta' \in C(X)), & \text{when } C(X) = [L(X), \infty) \text{ and } \theta' < \theta \\ P_\theta(\theta' \in C(X)), & \text{when } C(X) = (-\infty, U(X)] \text{ and } \theta' > \theta. \end{cases} \]
That is, the probability of false coverage is the probability that $C(X)$ covers another $\theta'$ when the true parameter is $\theta$.
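To make the definition concrete, the following Python sketch estimates a probability of false coverage by simulation and compares it with a closed form. The setting is an assumption chosen for illustration: the two-sided $z$-interval $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ for a Normal mean with known $\sigma$. The function names and the choice to draw $\bar{X}$ directly from $N(\theta, \sigma^2/n)$ are also assumptions of this sketch.

```python
import random
from statistics import NormalDist

def false_coverage_mc(theta, theta_prime, sigma, n, alpha, reps=20000, seed=0):
    """Monte Carlo estimate of P_theta(theta' in C(X)) for the two-sided z-interval."""
    rng = random.Random(seed)
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = rng.gauss(theta, sigma / n ** 0.5)  # sample mean under the true theta
        if xbar - half_width <= theta_prime <= xbar + half_width:
            hits += 1
    return hits / reps

def false_coverage_exact(theta, theta_prime, sigma, n, alpha):
    """Closed form: Phi(z - delta) - Phi(-z - delta), delta = sqrt(n)(theta - theta')/sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    delta = n ** 0.5 * (theta - theta_prime) / sigma
    phi = NormalDist().cdf
    return phi(z - delta) - phi(-z - delta)

estimate = false_coverage_mc(theta=0.0, theta_prime=0.3, sigma=1.0, n=25, alpha=0.05)
exact = false_coverage_exact(theta=0.0, theta_prime=0.3, sigma=1.0, n=25, alpha=0.05)
print(f"simulated: {estimate:.3f}, closed form: {exact:.3f}")
```

Setting `theta_prime` equal to `theta` in the closed form returns the ordinary coverage probability $1-\alpha$, and the false coverage shrinks as `theta_prime` moves away from `theta`.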
Definition: Uniformly Most Accurate Confidence Set
A $1-\alpha$ confidence set $C(X)$ is called a uniformly most accurate (UMA) confidence set if it minimizes the probability of false coverage over all $1-\alpha$ confidence sets. That is, if $P_\theta(\theta' \in C(X))$ is the probability of false coverage for $C(X)$, then
\[ P_\theta\left(\theta' \in C(X)\right) \le P_\theta\left(\theta' \in C^*(X)\right) \quad \text{for all } \theta, \theta' \in \Omega \text{ with } \theta' \ne \theta, \]
where $C^*(X)$ is any other $1-\alpha$ confidence set.
Theorem 9.3.5: One-Sided UMP Tests Yield One-Sided UMA Bounds
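Let $(X_1, X_2, \ldots, X_n) \sim f(x \mid \theta)$, where $\theta \in \Omega \subseteq \mathbb{R}$. For each $\theta_0 \in \Omega$, let $A^*(\theta_0)$ be the UMP level-$\alpha$ acceptance region of a test of $H_0 : \theta = \theta_0$ versus $H_A : \theta > \theta_0$, and let $C^*(X)$ be the $1-\alpha$ confidence set formed by inverting the UMP acceptance region. Then for any other $1-\alpha$ confidence set $C(X)$, we have
\[ P_\theta\left(\theta' \in C^*(X)\right) \le P_\theta\left(\theta' \in C(X)\right) \quad \text{for all } \theta' < \theta. \]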
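Proof. Fix $\theta' < \theta$, and let $A(\theta')$ be the acceptance region of the level-$\alpha$ test of $H_0 : \theta = \theta'$ formed by inverting $C(X)$, which exists by Theorem 9.2.2. Since $A^*(\theta')$ is the UMP acceptance region for testing $H_0 : \theta = \theta'$ versus $H_A : \theta > \theta'$, we must have that
\[ P_\theta\left(\theta' \in C^*(X)\right) = P_\theta\left(X \in A^*(\theta')\right) \le P_\theta\left(X \in A(\theta')\right) = P_\theta\left(\theta' \in C(X)\right), \]
where the inequality holds because $A^*$ corresponds to a UMP test, the true value $\theta$ lies in the alternative $\{\theta > \theta'\}$, and
\[ P_\theta\left(X \in A^*(\theta')\right) = 1 - P_\theta\left(X \in R^*(\theta')\right) = 1 - \beta^*(\theta) \le 1 - \beta(\theta) = P_\theta\left(X \in A(\theta')\right). \]
Here $R^*(\theta')$ is the rejection region complementary to $A^*(\theta')$, and $\beta^*(\cdot)$ and $\beta(\cdot)$ are the power functions of the tests of $H_0 : \theta = \theta'$ corresponding to $A^*(\theta')$ and $A(\theta')$, respectively.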
Example 9.3.6: Normal UMA Confidence Bounds
Let $X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2$ known.
An analysis similar to that of Example 8.3.18 shows that the UMP level-$\alpha$ test of $H_0 : \mu = \mu_0$ versus $H_A : \mu > \mu_0$ has acceptance region $A(\mu_0) = \left\{x : \bar{x} \le \mu_0 + \frac{\sigma}{\sqrt{n}} z_\alpha\right\}$, and inverting this leads to the $1-\alpha$ lower confidence bound
\[ C(X) = \left[ \bar{X} - \frac{\sigma}{\sqrt{n}} z_\alpha,\ \infty \right). \]
By Theorem 9.3.5, this must be a $1-\alpha$ UMA lower confidence bound for $\mu$, since it was obtained by inverting a UMP level-$\alpha$ acceptance region.
On the other hand, the classic two-sided interval that we've already seen many times,
\[ C'(X) = \left[ \bar{X} - \frac{\sigma}{\sqrt{n}} z_{\alpha/2},\ \bar{X} + \frac{\sigma}{\sqrt{n}} z_{\alpha/2} \right], \]
is not a UMA confidence interval for $\mu$, since it was obtained by inverting the two-sided acceptance region of $H_0 : \mu = \mu_0$ versus $H_A : \mu \ne \mu_0$, and we showed in Example 8.3.19 that no UMP test exists for testing that set of hypotheses.
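The two sets in this example can be compared numerically. The Python sketch below is an illustration added here, not part of the slides: it evaluates the probability of covering a false value $\mu' < \mu$ for the lower bound $C(X)$ and for the two-sided interval $C'(X)$, written in terms of the standardized distance $\delta = \sqrt{n}(\mu - \mu')/\sigma > 0$. The function names are hypothetical, and only the standard library is used.

```python
from statistics import NormalDist

PHI = NormalDist().cdf          # standard Normal cdf
Z_QUANTILE = NormalDist().inv_cdf

def false_cov_lower_bound(delta, alpha):
    """P_mu(mu' in [Xbar - z_alpha*sigma/sqrt(n), inf)) = Phi(z_alpha - delta)."""
    return PHI(Z_QUANTILE(1 - alpha) - delta)

def false_cov_two_sided(delta, alpha):
    """P_mu(mu' in C'(X)) = Phi(z_{alpha/2} - delta) - Phi(-z_{alpha/2} - delta)."""
    z = Z_QUANTILE(1 - alpha / 2)
    return PHI(z - delta) - PHI(-z - delta)

alpha = 0.05
for delta in (0.1, 0.5, 1.0, 2.0, 3.0, 5.0):
    lower = false_cov_lower_bound(delta, alpha)
    two_sided = false_cov_two_sided(delta, alpha)
    print(f"delta = {delta:3.1f}: lower bound {lower:.4f}, two-sided {two_sided:.4f}")
```

For every $\delta > 0$ tried, the lower bound covers the false value less often than the two-sided interval, as Theorem 9.3.5 guarantees for $\mu' < \mu$. The comparison says nothing about $\mu' > \mu$, which a lower bound always has a high chance of covering.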
Theorem 9.3.9: Pratt’s Theorem
Let $T \sim g(t \mid \theta)$, $\theta \in \Omega \subseteq \mathbb{R}$, where $g(t \mid \theta)$ is continuous. Let $C(T) = [L(T), U(T)]$ be an interval estimator for $\theta$. If $U(t)$ and $L(t)$ are both increasing in $t$, then for any $\theta^* \in \mathbb{R}$,
\[ E_{\theta^*}\left[\mathrm{Length}(C(T))\right] = \int_{\theta \ne \theta^*} P_{\theta^*}\left(\theta \in C(T)\right) d\theta. \]
That is, the expected length of an interval estimator is the integral of the probabilities of false coverage, taken over all "false" values of the parameter.
Proof. Since $L(\cdot)$ and $U(\cdot)$ are both increasing, we can invert the parameter region and sample region:
\[ \theta \in C(T) \iff \theta \in [L(T), U(T)] \iff T \in [U^{-1}(\theta), L^{-1}(\theta)]. \]
We have that
\[ E_{\theta^*}\left[\mathrm{Length}(C(T))\right] = \int_{\mathcal{T}} [U(t) - L(t)]\, g(t \mid \theta^*)\,dt = \int_{\mathcal{T}} \left[ \int_{L(t)}^{U(t)} d\theta \right] g(t \mid \theta^*)\,dt. \]
Theorem 9.3.9: Pratt’s Theorem (Continued)
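Now, since the integrand is nonnegative, we can swap the order of integration by Fubini's theorem (in the form due to Tonelli):
\begin{align*}
\int_{\mathcal{T}} \left[ \int_{L(t)}^{U(t)} d\theta \right] g(t \mid \theta^*)\,dt &= \int_{\Omega} \left[ \int_{U^{-1}(\theta)}^{L^{-1}(\theta)} g(t \mid \theta^*)\,dt \right] d\theta \\
&= \int_{\Omega} P_{\theta^*}\left(T \in [U^{-1}(\theta), L^{-1}(\theta)]\right) d\theta \\
&= \int_{\Omega} P_{\theta^*}\left(\theta \in C(T)\right) d\theta \\
&= \int_{\theta \ne \theta^*} P_{\theta^*}\left(\theta \in C(T)\right) d\theta.
\end{align*}
The last equality holds because leaving out one point of the region of integration does not change the value of the integral.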
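Pratt's identity can be checked numerically in a simple case. The Python sketch below is an illustration under assumptions not made in the slides: $T$ is exponential with mean $\theta^*$, and the interval is the pivotal one $C(T) = [T/b, T/a]$ with equal-tailed constants, so that $L(t) = t/b$ and $U(t) = t/a$ are both increasing. The helper names and the trapezoid rule are choices made for this sketch.

```python
import math

def false_coverage(theta, theta_star, a, b):
    """P_{theta*}(theta in [T/b, T/a]) = P(a*theta <= T <= b*theta), T ~ Exp(mean theta*)."""
    return math.exp(-a * theta / theta_star) - math.exp(-b * theta / theta_star)

def integrate(func, lo, hi, steps):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

alpha, theta_star = 0.10, 2.0
a = -math.log(1 - alpha / 2)  # lower alpha/2 quantile of the pivot T/theta ~ Exp(1)
b = -math.log(alpha / 2)      # upper alpha/2 quantile of the pivot

# Left side of Pratt's identity: E[T/a - T/b] with E[T] = theta*.
expected_length = theta_star * (1 / a - 1 / b)

# Right side: integral of the false coverage over theta > 0, truncated where
# exp(-a * theta / theta*) has decayed to about e^-50.
upper_limit = 50 * theta_star / a
pratt_integral = integrate(
    lambda th: false_coverage(th, theta_star, a, b), 0.0, upper_limit, 200_000
)

print(f"expected length:            {expected_length:.4f}")
print(f"integral of false coverage: {pratt_integral:.4f}")
```

The two quantities agree up to the numerical integration error, which is what the theorem predicts for this interval.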