STA305/1004 - Week 4
(adapted from N. Taback)
Finding Power, Intro to Causal Inference
Week 4 Outline
I Finding Power
I Replication and Power: Case study on Power Poses
I Power and Sample size formulae: Two-sample proportions
I Power via simulation
I Introduction to causal inference:
I The fundamental problem
I The assignment mechanism: Weight gain study
Replication and Power: Case Study on Power Poses
(Carney et al. (2010), Power Posing: Brief Nonverbal Displays Affect
Neuroendocrine Levels and Risk Tolerance, Psychological Science, 21(10),
1363-1368)
Can power poses significantly change outcomes in your life?
Study methods (Carney et al. (2010)):
I Randomly assigned 42 participants to the high-power-pose or the
low-power-pose condition.
I Participants believed that the study was about the science of physiological
recordings and was focused on how placement of electrocardiography
electrodes above and below the heart could influence data collection.
I Participants’ bodies were posed by an experimenter into high-power or
low-power poses. Each participant held two poses for 1 min each.
I Participants’ risk taking was measured with a gambling task; feelings of
power were measured with self-reports.
I Saliva samples, which were used to test cortisol and testosterone levels, were
taken before and approximately 17 min after the power-pose manipulation.
Can power poses significantly change outcomes in your life?
Study results (Carney et al. (2010)):
As hypothesized, high-power poses caused an increase in testosterone
compared with low-power poses, which caused a decrease in testosterone,
F(1, 39) = 4.29, p < .05; r = .34. Also as hypothesized, high-power
poses caused a decrease in cortisol compared with low-power poses,
which caused an increase in cortisol, F(1, 38) = 7.45, p < .02; r = .43
Can power poses significantly change outcomes in your life?
I The study was replicated by Ranehill et al. (2015)
I An initial power analysis based on the effect sizes in Carney et al. (power =
0.8, α = .05) indicated that a sample size of 100 participants would be
suitable.
library(pwr)
pwr.t.test(d=0.6,power = 0.8)
Two-sample t test power calculation
n = 44.58577
d = 0.6
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
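This n can be sanity-checked against the normal-approximation formula n ≈ 2(zα/2 + zβ)²/d² per group. A quick sketch (in Python rather than R, purely for the arithmetic; `n_per_group` is my own name):

```python
from statistics import NormalDist  # standard normal quantiles

z = NormalDist().inv_cdf
alpha, power, d = 0.05, 0.80, 0.6

# Normal approximation to the per-group n for a two-sample t-test
n_per_group = 2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2
print(round(n_per_group, 1))  # 43.6, close to pwr.t.test's 44.59
```

The small gap (43.6 vs. 44.59) is the t-distribution correction that pwr.t.test() includes.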
Can power poses significantly change outcomes in your life?
I The Ranehill et al. (2015) study used a sample of 200 participants to increase reliability.
I This study found none of the significant differences found in Carney et al.’s
study.
I The replication study obtained very precise estimates of the effects.
I What happened?
Can power poses significantly change outcomes in your life?
I Sampling theory predicts that the variation between samples is proportional
to 1/√n.
I In small samples we can expect substantial variability.
I Many researchers often expect that these samples will be more similar than
sampling theory predicts.
Study replication
Suppose that you have run an experiment on 20 subjects and have obtained a
significant result from a two-sided z-test (H0 : µ = 0 vs. H1 : µ ≠ 0) which
confirms your theory (z = 2.23, p < 0.05, two-tailed). You are now planning to
run the same experiment on an additional 10 subjects. What is the probability
that the results of this second group will be significant at the 5% level by a
one-tailed test (H1 : µ > 0)?
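This is the problem analyzed by Tversky and Kahneman (1971). One hedged way to compute an answer is to assume the true effect equals the original estimate, so the new group's z-statistic has mean 2.23·√(10/20). Sketched in Python (illustrative, not the unique answer, since it conditions on the estimated effect):

```python
from math import sqrt
from statistics import NormalDist

Phi = NormalDist().cdf
z_obs, n_old, n_new = 2.23, 20, 10

# Assume the true standardized effect equals the original estimate
delta = z_obs / sqrt(n_old)            # estimated effect per observation
mean_z_new = delta * sqrt(n_new)       # expected z in the new group
z_crit = NormalDist().inv_cdf(0.95)    # one-tailed 5% cutoff

power = 1 - Phi(z_crit - mean_z_new)
print(round(power, 2))  # ≈ 0.47, much lower than most researchers guess
```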
Week 4 Outline
I Power and Sample size formulae: Two-sample proportions
Comparing Proportions for Binary Outcomes
I In many clinical trials, the primary endpoint is dichotomous, for example,
whether a patient has responded to the treatment, or whether a patient has
experienced toxicity.
I Consider a two-arm randomized trial with binary outcomes. Let p1 denote
the response rate of the experimental drug, p2 that of the standard drug,
and let the difference be θ = p1 − p2.
Comparing Proportions for Binary Outcomes
Let Yik be the binary outcome for subject i in arm k; that is,

Yik = 1 with probability pk, and Yik = 0 with probability 1 − pk,

for i = 1, ..., nk and k = 1, 2. The sum of independent and identically distributed
Bernoulli random variables has a binomial distribution,

Σ_{i=1}^{nk} Yik ∼ Bin(nk, pk), k = 1, 2.
(Yin, pg. 173-174)
Comparing Proportions for Binary Outcomes
The sample proportion for group k is

p̂k = Ȳk = (1/nk) Σ_{i=1}^{nk} Yik, k = 1, 2,

and E(Ȳk) = pk and Var(Ȳk) = pk(1 − pk)/nk.
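These moment formulas are easy to check by simulation; in this Python sketch the values pk = 0.3 and nk = 50 are arbitrary illustrative choices:

```python
import random
from statistics import pvariance

random.seed(1)
p, n, reps = 0.3, 50, 20000

# Draw many sample proportions p-hat = (1/n) * sum(Y_i), Y_i ~ Bernoulli(p)
phats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

v = pvariance(phats)
print(round(v, 5))  # theoretical Var = p*(1-p)/n = 0.0042
```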
The goal of the clinical trial is to determine if there is a difference between the
two groups using a binary endpoint. That is, we want to test H0 : θ = 0 versus
H1 : θ ≠ 0.
The test statistic (assuming that H0 is true) is:

T = (p̂1 − p̂2) / √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ∼ N(0, 1).
Comparing Proportions for Binary Outcomes
The test rejects at level α if and only if

|T| ≥ zα/2.

Using the same argument as the case with continuous endpoints, and ignoring
terms smaller than α/2, we can solve for β:

β ≈ Φ( zα/2 − |θ1| / √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ).
Comparing Proportions for Binary Outcomes
Using this formula we can solve for the sample size. If n1 = r · n2 then

n2 = (zα/2 + zβ)² (p1(1 − p1)/r + p2(1 − p2)) / θ².
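Coding the formula directly gives a useful cross-check (Python sketch; `n2_for_power` is a made-up helper name). For the 20% vs. 25% example below it returns about 1091 per arm, close to power.prop.test()'s 1093.7; the small gap is because power.prop.test() treats the variance under H0 slightly differently:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf

def n2_for_power(p1, p2, alpha=0.05, power=0.80, r=1.0):
    """Sample size in arm 2 when n1 = r * n2 (formula above)."""
    theta = p1 - p2
    return ((z(1 - alpha / 2) + z(power)) ** 2
            * (p1 * (1 - p1) / r + p2 * (1 - p2)) / theta ** 2)

print(round(n2_for_power(0.25, 0.20)))  # ≈ 1091 per arm
```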
Comparing Proportions for Binary Outcomes
I The built-in R function power.prop.test() can be used to calculate
sample size or power.
I For example, suppose that the standard treatment for a disease has a
response rate of 20%, and an experimental treatment is anticipated to have
a response rate of 25%.
I The researchers want both arms to have an equal number of subjects. How
many patients should be enrolled if the study will conduct a two-sided test
at the 5% level with 80% power?
power.prop.test(p1 = 0.2,p2 = 0.25,power = 0.8)
Two-sample comparison of proportions power calculation
n = 1093.739
p1 = 0.2
p2 = 0.25
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
Week 4 Outline
I Power via simulation
Calculating Power by Simulation
I If data can be simulated under the alternative hypothesis, then the power
of a test can be estimated via simulation.
I Consider a two-sample t-test with 30 subjects per group and the standard
deviation of the clinical outcome is known to be 1.
I What is the power of the test of H0 : µ1 − µ2 = 0 versus H1 : µ1 − µ2 ≠ 0
at the 5% significance level, when the true difference is µ1 − µ2 = 0.5?
I The power is the proportion of times that the test correctly rejects the null
hypothesis in repeated sampling.
Calculating Power by Simulation
We can simulate a single study using the rnorm() command. Let’s assume that
n1 = n2 = 30, µ1 = 3.5, µ2 = 3, σ = 1, α = 0.05.
set.seed(2301)
t.test(rnorm(30,mean=3.5,sd=1),rnorm(30,mean=3,sd=1),var.equal = T)
Two Sample t-test
data: rnorm(30, mean = 3.5, sd = 1) and rnorm(30, mean = 3, sd = 1)
t = 2.1462, df = 58, p-value = 0.03605
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.03458122 0.99248595
sample estimates:
mean of x mean of y
3.339362 2.825828
Should you reject H0?
Calculating Power by Simulation
I Suppose that 10 studies are simulated.
I What proportion of these 10 studies will reject the null hypothesis at the
5% level?
I To investigate how many times the two-sample t-test will reject at the 5%
level the replicate() command will be used to generate 10 studies and
calculate the p-value in each study.
I It will still be assumed that
n1 = n2 = 30, µ1 = 3.5, µ2 = 3, σ = 1, α = 0.05.
set.seed(2301)
pvals <- replicate(10,t.test(rnorm(30,mean=3.5,sd=1),
rnorm(30,mean=3,sd=1),
var.equal = T)$p.value)
pvals # print out 10 p-values
[1] 0.03604893 0.15477655 0.01777959 0.40851999 0.34580930 0.11131007
[7] 0.14788381 0.00317709 0.09452230 0.39173723
#power is the number of times the test rejects at the 5% level
sum(pvals<=0.05)/10
[1] 0.3
Calculating Power by Simulation
But, since we only simulated 10 studies the estimate of power will have a large
standard error. So let’s try simulating 10,000 studies so that we can obtain a
more precise estimate of power.
set.seed(2301)
pvals <- replicate(10000,t.test(rnorm(30,mean=3.5,sd=1),
rnorm(30,mean=3,sd=1),
var.equal = T)$p.value)
sum(pvals<=0.05)/10000
[1] 0.4881
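The phrase "large standard error" can be made precise: a power estimate p̂ based on m simulated studies has standard error √(p̂(1 − p̂)/m). A small sketch (`se_power` is a made-up name):

```python
from math import sqrt

def se_power(phat, m):
    # Binomial standard error of a power estimate from m simulated studies
    return sqrt(phat * (1 - phat) / m)

print(round(se_power(0.3, 10), 3))        # ≈ 0.145: 10 studies is nearly useless
print(round(se_power(0.488, 10_000), 3))  # ≈ 0.005: 10,000 studies pin power down
```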
Calculating Power by Simulation
This is much closer to the theoretical power obtained from power.t.test().
power.t.test(n = 30,delta = 0.5,sd = 1,sig.level = 0.05)
Two-sample t test power calculation
n = 30
delta = 0.5
sd = 1
sig.level = 0.05
power = 0.477841
alternative = two.sided
NOTE: n is number in *each* group
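power.t.test()'s answer can itself be approximated by hand, using the same argument as the β formula above: power ≈ Φ(δ/√(σ²/n1 + σ²/n2) − zα/2). A sketch (the normal approximation slightly overstates the power because it ignores the t correction):

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()
n, delta, sigma, alpha = 30, 0.5, 1.0, 0.05

z_half = nd.inv_cdf(1 - alpha / 2)
noncentrality = delta / sqrt(sigma**2 / n + sigma**2 / n)

power = nd.cdf(noncentrality - z_half)  # ignoring the negligible other tail
print(round(power, 3))  # ≈ 0.49, vs. power.t.test's 0.478
```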
Calculating Power by Simulation
I The built-in R functions power.t.test() and power.prop.test() don't
have an option for calculating power when there is unequal allocation
of subjects between groups.
I These built-in functions don’t have an option to investigate power if other
assumptions don’t hold (e.g., normality).
I One option is to simulate power for the scenarios that are of interest.
Another option is to write your own function using the formula derived
above.
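For example, the β formula above extends immediately to unequal arms. A hedged Python sketch (`power_two_props` is my own helper; it uses the unpooled variance, so it will not agree exactly with a prop.test()-based simulation):

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()

def power_two_props(p1, p2, n1, n2, alpha=0.05):
    """Normal-approximation power for H0: p1 - p2 = 0 (formula above)."""
    theta = abs(p1 - p2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return nd.cdf(theta / se - nd.inv_cdf(1 - alpha / 2))

print(round(power_two_props(0.25, 0.20, 1094, 1094), 2))  # equal arms: ≈ 0.80
print(round(power_two_props(0.25, 0.20, 1500, 500), 2))   # 3:1 split: ≈ 0.66
```

The 3:1 answer is somewhat higher than a pooled-variance simulation gives, but it shows the same loss of power from unbalanced allocation.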
Calculating Power by Simulation
I Suppose the standard treatment for a disease has a response rate of 20%,
and an experimental treatment is anticipated to have a response rate of
25%.
I The researchers want both arms to have an equal number of subjects.
I The power calculation above revealed that the study will require 1094
patients per arm (2188 in total) for 80% power.
I What would happen to the power if the researchers put 1500 patients in the
experimental arm and 500 patients in the control arm?
Calculating Power by Simulation
I The number of subjects in the experimental arm that have a positive
response to treatment will be an observation from a Bin(1500, 0.25)
distribution.
I The number of subjects that have a positive response to the standard
treatment will be an observation from a Bin(500, 0.2) distribution.
I We can obtain simulated responses from these distributions using the
rbinom() command in R.
set.seed(2301)
rbinom(1, 1500, 0.25)
rbinom(1, 500, 0.20)
Calculating Power by Simulation
I The p-value for this simulated study can be obtained using prop.test().
set.seed(2301)
set.seed(2301)
prop.test(x = c(rbinom(1, 1500, 0.25), rbinom(1, 500, 0.20)),
          n = c(1500, 500), correct = F)
Calculating Power by Simulation
I A power simulation repeats this process a large number of times.
I In the example below we simulate 10,000 hypothetical studies to calculate
power.
set.seed(2301)
pvals <- replicate(10000,
prop.test(x=c(rbinom(n = 1,size = 1500,prob = 0.25),
rbinom(n=1,size=500,prob=0.20)),
n=c(1500,500),correct=F)$p.value)
sum(pvals<=0.05)/10000
[1] 0.6231
If the researchers decide to have a 3:1 allocation ratio of patients in the
treatment to control arm then the power will be _____?
Week 4 Outline
I Introduction to Causal Inference
Introduction to causal inference - Bob’s headache
I Suppose Bob, at a particular point in time, is contemplating whether or not
to take an aspirin for a headache.
I There are two treatment levels, taking an aspirin, and not taking an aspirin.
I If Bob takes the aspirin, his headache may be gone, or it may remain, say,
an hour later; we denote this outcome, which can be either “Headache” or
“No Headache,” by Y (Aspirin).
I Similarly, if Bob does not take the aspirin, his headache may remain an hour
later, or it may not; we denote this potential outcome by Y (No Aspirin),
which also can be either “Headache,” or “No Headache.”
I There are therefore two potential outcomes, Y (Aspirin) and Y (No Aspirin),
one for each level of the treatment. The causal effect of the treatment
involves the comparison of these two potential outcomes.
Introduction to causal inference - Bob’s headache
Because in this example each potential outcome can take on only two values, the
unit-level causal effect – the comparison of these two outcomes for the same
unit – involves one of four (two by two) possibilities:
1. Headache gone only with aspirin: Y(Aspirin) = No Headache, Y(No
Aspirin) = Headache
2. No effect of aspirin, with a headache in both cases: Y(Aspirin) = Headache,
Y(No Aspirin) = Headache
3. No effect of aspirin, with the headache gone in both cases: Y(Aspirin) =
No Headache, Y(No Aspirin) = No Headache
4. Headache gone only without aspirin: Y(Aspirin) = Headache, Y(No Aspirin)
= No Headache
Introduction to causal inference - Bob’s headache
There are two important aspects of this definition of a causal effect.
1. The definition of the causal effect depends on the potential outcomes, but
it does not depend on which outcome is actually observed.
2. The causal effect is the comparison of potential outcomes, for the same
unit, at the same moment in time post-treatment.
I The causal effect is not defined in terms of comparisons of outcomes at
different times, as in a before-and-after comparison of Bob's headache
before and after he decides whether or not to take the aspirin.
The fundamental problem of causal inference
“The fundamental problem of causal inference” (Holland, 1986, p. 947) is the
problem that at most one of the potential outcomes can be realized and thus
observed.
I If the action you take is Aspirin, you observe Y (Aspirin) and will never
know the value of Y (No Aspirin) because you cannot go back in time.
I Similarly, if your action is No Aspirin, you observe Y (No Aspirin) but
cannot know the value of Y (Aspirin).
I In general, therefore, even though the unit-level causal effect (the
comparison of the two potential outcomes) may be well defined, by
definition we cannot learn its value from just the single realized potential
outcome.
The fundamental problem of causal inference
The outcomes that would be observed under control and treatment conditions
are often called counterfactuals or potential outcomes.
I If Bob took aspirin for his headache then he would be assigned to the
treatment condition so Ti = 1.
I Then Y (Aspirin) is observed and Y (No Aspirin) is the unobserved
counterfactual outcome—it represents what would have happened to Bob if
he had not taken aspirin.
I Conversely, if Bob had not taken aspirin then Y (No Aspirin) is observed
and Y (Aspirin) is counterfactual.
I In either case, a simple treatment effect for Bob can be defined as
treatment effect for Bob = Y (Aspirin)− Y (No Aspirin).
I The problem is that we can only observe one outcome.
The assignment mechanism
I Assignment mechanism: The process for deciding which units receive
treatment and which receive control.
I Ignorable Assignment Mechanism: The assignment of treatment or
control for all units is independent of the unobserved potential outcomes
(“nonignorable” means not ignorable)
I Unconfounded Assignment Mechanism: The assignment of treatment or
control for all units is independent of all potential outcomes, observed or
unobserved (“confounded” means not unconfounded)
The assignment mechanism
I Suppose that a doctor prescribes surgery (labeled 1) or drug (labeled 0) for
a certain condition.
I The doctor knows enough about the potential outcomes of the patients so
assigns each patient the treatment that is more beneficial to that patient.
unit Yi(0) Yi(1) Yi(1)− Yi(0)
patient #1 1 7 6
patient #2 6 5 -1
patient #3 1 5 4
patient #4 8 7 -1
Average 4 6 2
Y is years of post-treatment survival.
The assignment mechanism
I Patients 1 and 3 will receive surgery and patients 2 and 4 will receive drug
treatment.
I The observed treatments and outcomes are in this table.
unit        Ti   Y_obs   Yi(1)   Yi(0)
patient #1  1    7       7       ?
patient #2  0    6       ?       6
patient #3  1    5       5       ?
patient #4  0    8       ?       8
Average observed outcome, drug (Ti = 0): 7
Average observed outcome, surgery (Ti = 1): 6
I This shows that we can reach invalid conclusions if we look at the observed
values of potential outcomes without considering how the treatments were
assigned.
I The assignment mechanism depended on the potential outcomes and was
therefore nonignorable (implying that it was confounded).
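The arithmetic behind this conclusion can be made explicit. A short sketch using the two tables above:

```python
y0 = [1, 6, 1, 8]          # Yi(0): survival under drug
y1 = [7, 5, 5, 7]          # Yi(1): survival under surgery
t  = [1, 0, 1, 0]          # doctor assigns whichever treatment is better

true_ate = sum(b - a for a, b in zip(y0, y1)) / 4          # 2.0 years
obs_surg = sum(y for y, ti in zip(y1, t) if ti == 1) / 2   # mean 6.0
obs_drug = sum(y for y, ti in zip(y0, t) if ti == 0) / 2   # mean 7.0

# The observed comparison has the wrong sign: drug looks better by 1 year,
# even though surgery is better by 2 years on average.
print(true_ate, obs_surg - obs_drug)  # 2.0 -1.0
```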
The assignment mechanism
The observed difference in means is entirely misleading in this situation. The
biggest problem when using the difference of sample means here is that we have
effectively pretended that we had an unconfounded treatment assignment when
in fact we did not. This example demonstrates the importance of finding a
statistic that is appropriate for the actual assignment mechanism.
The assignment mechanism
Is the treatment assignment ignorable?
I The doctor knows enough about the potential outcomes of the patients so
assigns each patient the treatment that is more beneficial to that patient.
I Suppose that a doctor prescribes surgery (labeled 1) or drug (labeled 0) for
a certain condition by tossing a biased coin that depends on Y (0) and
Y (1), where Y is years of post-treatment survival.
I If Y (1) ≥ Y (0) then P(Ti = 1|Yi(0),Yi(1)) = 0.8.
I If Y (1) < Y (0) then P(Ti = 1|Yi(0),Yi(1)) = 0.3.
unit Yi(0) Yi(1) p1 p0
patient #1 1 7 0.8 0.2
patient #2 6 5 0.3 0.7
patient #3 1 5 0.8 0.2
patient #4 8 7 0.3 0.7
where p1 = P(Ti = 1 | Yi(0), Yi(1)) and p0 = P(Ti = 0 | Yi(0), Yi(1)).
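Simulating this biased-coin mechanism shows how badly the naive comparison can miss: averaged over assignments, the observed difference in means is far below the true average causal effect of 2 years. A Python sketch (the seed and number of replications are arbitrary choices of mine):

```python
import random

random.seed(305)
y0, y1 = [1, 6, 1, 8], [7, 5, 5, 7]
p1 = [0.8, 0.3, 0.8, 0.3]   # P(Ti = 1 | Yi(0), Yi(1)) from the table

diffs = []
for _ in range(10_000):
    t = [1 if random.random() < p else 0 for p in p1]
    surg = [y for y, ti in zip(y1, t) if ti == 1]
    drug = [y for y, ti in zip(y0, t) if ti == 0]
    if surg and drug:  # skip the rare draws where one arm is empty
        diffs.append(sum(surg) / len(surg) - sum(drug) / len(drug))

est = sum(diffs) / len(diffs)
print(round(est, 2))  # well below the true average causal effect of 2
```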
Weight gain study
From Holland and Rubin (1983).
“A large university is interested in investigating the effects on the students of the
diet provided in the university dining halls and any sex differences in these effects.
Various types of data are gathered. In particular, the weight of each student at
the time of his [or her] arrival in September and his [or her] weight the following
June are recorded.”
I The average weight for Males was 180 in both September and June. Thus,
the average weight gain for Males was zero.
I The average weight for Females was 130 in both September and June.
Thus, the average weight gain for Females was zero.
I Question: What is the differential causal effect of the diet on male weights
and on female weights?
I Statistician 1: Look at gain scores: No effect of diet on weight for either
males or females, and no evidence of differential effect of the two sexes,
because no group shows any systematic change.
I Statistician 2: Compare June weight for males and females with the same
weight in September: On average, for a given September weight, men weigh
more in June than women. Thus, the new diet leads to more weight gain
for men.
I Is Statistician 1 correct? Statistician 2? Neither? Both?
Weight gain study
Questions:
1. What are the units?
2. What are the treatments?
3. What is the assignment mechanism?
4. Is the assignment mechanism useful for causal inference?
5. Would it have helped if all males received the dining hall diet and all
females received the control diet?
6. Is Statistician 1 or Statistician 2 correct?
Getting around the fundamental problem by using close substitutes
I Are there situations where you can measure both Yi(0) and Yi(1) on the same
unit?
I Drink tea one night and milk another night then measure the amount of
sleep. What has been assumed?
I Divide a piece of plastic into two parts then expose each piece to a
corrosive chemical. What has been assumed?
I Measure the effect of a new diet by comparing your weight before the diet
and your weight after. What has been assumed?
I There are strong assumptions implicit in these types of strategies.
Getting around the fundamental problem by using randomization and
experimentation
I The “statistical” idea of using the outcomes observed on a sample of units
to learn about the distribution of outcomes in the population.
I The basic idea is that since we cannot compare treatment and control
outcomes for the same units, we try to compare them on similar units.
I Similarity can be attained by using randomization to decide which units are
assigned to the treatment group and which units are assigned to the control
group.
Getting around the fundamental problem by using randomization and
experimentation
I It is not always possible to achieve close similarity between the treated and
control groups in a causal study.
I In observational studies, units often end up treated or not based on
characteristics that are predictive of the outcome of interest (for example,
men enter a job training program because they have low earnings and future
earnings is the outcome of interest).
I Randomized experiments can be impractical or unethical.
I When treatment and control groups are not similar, modeling or other
forms of statistical adjustment can be used to fill in the gap.
Fisherian Randomization Test
I The randomization test is related to a stochastic proof by contradiction
assessing the plausibility of the null hypothesis of no treatment effect.
I The null hypothesis is Yi(0) = Yi(1), for all units.
I Under the null hypothesis all potential outcomes are known from Y_obs since
Y_obs = Y(1) = Y(0).
I Under the null hypothesis, the observed value of any statistic, such as
ȳ1 − ȳ0, is known for all possible treatment assignments.
I The randomization distribution of ȳ1 − ȳ0 can then be obtained.
I Unless the data suggest that the null hypothesis of no treatment effect is
false, it is difficult to claim evidence that the treatments differ.
