Exam 2
Design of Experiments: STAT 3515Q
Problem 1 
Five different washing solutions are being compared to study their effectiveness in retarding bacteria growth
in five-gallon milk containers. The analysis is done in a laboratory, and only five trials can be run on any day.
Because days could represent a potential source of variability, the experimenter decides to use a randomized
block design. Observations are taken for four days, and the data are shown here. Analyze the data from this
experiment and draw conclusions.
             Day
Solution     1     2     3     4
1           13    22    18    39
2           16    24    17    44
3            5     4     1    22
4            8     6     5    28
5           20    29    24    50
a.  Write the model equation, assumptions and constraints.
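$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$, $i = 1, 2, \dots, 5$, $j = 1, 2, \dots, 4$.
$\tau_i$: treatment effect and $\beta_j$: block effect.
$\varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, $\sum_{i=1}^{5} \tau_i = 0$, $\sum_{j=1}^{4} \beta_j = 0$.
$\tau_i$ and $\beta_j$ are fixed effects.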
b.  Estimate all parameters of the model in part (a).
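$\hat{\mu} = \bar{y}_{..} = 19.75$, $\hat{\sigma}^2 = MS_E = 7.908$
$\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$
$\hat{\tau}_1 = 3.25$, $\hat{\tau}_2 = 5.5$, $\hat{\tau}_3 = -11.75$, $\hat{\tau}_4 = -8$, $\hat{\tau}_5 = 11$
$\hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$
$\hat{\beta}_1 = -7.35$, $\hat{\beta}_2 = -2.75$, $\hat{\beta}_3 = -6.75$, $\hat{\beta}_4 = 16.85$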
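The estimates above can be reproduced directly from the data table. The following is a minimal sketch in plain Python; the variable names (`data`, `tau_hat`, `beta_hat`, `mse`) are chosen here for illustration and are not part of the original solution.

```python
# Bacteria counts from the table: rows are solutions 1-5, columns are days 1-4.
data = [
    [13, 22, 18, 39],
    [16, 24, 17, 44],
    [5, 4, 1, 22],
    [8, 6, 5, 28],
    [20, 29, 24, 50],
]
t = len(data)      # number of treatments (solutions)
b = len(data[0])   # number of blocks (days)

# mu_hat is the grand mean; tau_hat and beta_hat are deviations of the
# row (solution) and column (day) means from the grand mean.
grand_mean = sum(sum(row) for row in data) / (t * b)
tau_hat = [sum(row) / b - grand_mean for row in data]
beta_hat = [sum(data[i][j] for i in range(t)) / t - grand_mean for j in range(b)]

# Residual e_ij = y_ij - mu_hat - tau_hat_i - beta_hat_j,
# and MSE = SSE / ((t - 1)(b - 1)).
sse = sum(
    (data[i][j] - grand_mean - tau_hat[i] - beta_hat[j]) ** 2
    for i in range(t)
    for j in range(b)
)
mse = sse / ((t - 1) * (b - 1))

print("mu_hat   =", grand_mean)
print("tau_hat  =", [round(x, 2) for x in tau_hat])
print("beta_hat =", [round(x, 2) for x in beta_hat])
print("MSE      =", round(mse, 3))
```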
c.  Write the hypotheses of the test of this design and draw conclusions for the test(s) with the two
approaches discussed in the class.
$H_0: \tau_1 = \cdots = \tau_5 = 0$
$H_1: \tau_i \ne 0$ for at least one $i$.
• Critical value approach:
For solution: $F_0 = 46.01$, critical value $= F_{0.05,\,t-1,\,(t-1)(b-1)} = F_{0.05,4,12} = 3.26$, so $F_0 > F_{0.05,4,12}$.
For day: $F_0 = 82.42$, critical value $= F_{0.05,\,b-1,\,(t-1)(b-1)} = F_{0.05,3,12} = 3.49$, so $F_0 > F_{0.05,3,12}$.
So reject $H_0$.
• p-value approach:
Both the p-values are less than 0.05.
So reject H0.
Thus the effect of the washing solutions on bacteria growth is significant.
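The two F statistics quoted in this part come from the usual randomized block sums of squares, with error degrees of freedom $(t-1)(b-1) = 12$. The sketch below recomputes them in plain Python as a check; the variable names are illustrative only, and the critical values and p-values still have to be read from an F table or statistical software.

```python
# Rows are solutions 1-5, columns are days 1-4.
data = [
    [13, 22, 18, 39],
    [16, 24, 17, 44],
    [5, 4, 1, 22],
    [8, 6, 5, 28],
    [20, 29, 24, 50],
]
t, b = len(data), len(data[0])
n_total = t * b

# Correction factor (grand total squared over the number of observations).
cf = sum(map(sum, data)) ** 2 / n_total

ss_total = sum(y * y for row in data for y in row) - cf
ss_treat = sum(sum(row) ** 2 for row in data) / b - cf        # solutions
ss_block = sum(sum(col) ** 2 for col in zip(*data)) / t - cf  # days
ss_error = ss_total - ss_treat - ss_block

df_treat, df_block = t - 1, b - 1
df_error = df_treat * df_block

ms_error = ss_error / df_error
f_treat = (ss_treat / df_treat) / ms_error
f_block = (ss_block / df_block) / ms_error

print("F (solution) =", round(f_treat, 2), "on", (df_treat, df_error), "df")
print("F (day)      =", round(f_block, 2), "on", (df_block, df_error), "df")
```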
d.  What does the model in part (a) assume about the interaction between day and solution?
The model assumes that there is no interaction between day and solution.
e.  Write the null hypothesis of the test for pairwise comparison. Compute the cutoff value for Tukey’s
comparison test and interpret the results of the Tukey’s test of all pairwise comparisons.
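$H_0: \tau_i = \tau_j$ for all $1 \le i \ne j \le 5$
We reject $H_0$ at the 5% significance level if
$|\bar{y}_{i.} - \bar{y}_{j.}| > q_{0.05}(5, 12)\sqrt{MS_E/b} = 4.507 \times \sqrt{7.908/4} = 6.337$
$\hat{\tau}_5 > \hat{\tau}_2 > \hat{\tau}_1 > \hat{\tau}_4 > \hat{\tau}_3$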
Tukey's test finds no significant difference between $\tau_5$ and $\tau_2$, between $\tau_2$ and $\tau_1$, or between $\tau_4$ and $\tau_3$.
$\tau_5$ is significantly different from $\tau_1$. Both $\tau_4$ and $\tau_3$ are significantly different from each of $\tau_1$, $\tau_2$, and $\tau_5$.
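The pairwise comparisons can be checked mechanically against the Tukey cutoff. In the sketch below, the studentized range value 4.507 and the MSE of 7.908 are taken from the solution text rather than computed, the solution means are the row means of the data table, and the names `means`, `cutoff`, and `significant` are illustrative.

```python
from itertools import combinations
from math import sqrt

# Solution means (row means of the data table).
means = {1: 23.0, 2: 25.25, 3: 8.0, 4: 11.75, 5: 30.75}

mse = 7.908   # from part (b)
b = 4         # observations per solution (one per day)
q = 4.507     # q_0.05(5, 12), taken from the solution text, not computed here

cutoff = q * sqrt(mse / b)

# A pair is declared different when the absolute mean difference exceeds the cutoff.
significant = {
    (i, j): abs(means[i] - means[j]) > cutoff
    for i, j in combinations(sorted(means), 2)
}

print("cutoff =", round(cutoff, 3))
for (i, j), flag in significant.items():
    diff = abs(means[i] - means[j])
    print(f"solutions {i} vs {j}: |diff| = {diff:5.2f}",
          "significant" if flag else "not significant")
```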
f.  Suppose that this is a minimization problem. What washing solution(s) do you recommend?
Based on the results of Tukey's test from part (e), either solution 3 or solution 4: they have the two smallest means and are not significantly different from each other.
g.  Analyze the residuals from this experiment. Are the analysis of variance assumptions satisfied?
The Q-Q plot supports the normality assumption.
The residual vs. day plot suggests that the constant-variance assumption is not satisfied, and there appears to be
curvature in the residual vs. predicted value plot.
Problem 2 
An engineer suspects that the surface finish of a metal part is influenced by the feed rate and the depth of
cut. He selects three feed rates and four depths of cut. He then conducts a factorial experiment and obtains
the following data:
                      Depth of Cut (in)
Feed Rate (in/min)    0.15    0.18    0.20    0.25
0.20                    74      79      82      99
                        64      68      88     104
                        60      73      92      96
0.25                    92      98      99     104
                        86     104     108     110
                        88      88      95      99
0.30                    99     104     108     114
                        98      99     110     111
                       102      95      99     107
a.  Briefly describe how to conduct the randomization for this design.
There are $3 \times 4 = 12$ treatment combinations with 3 replicates each, so 36 runs are required in
total. The order in which the 36 runs are conducted should be determined at random, for example by drawing
run numbers or by some other random process.
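One way to carry out such a randomization is sketched below in plain Python. The seed value is arbitrary and is fixed only so that the run order can be reproduced; it is an assumption of this example, not part of the original solution.

```python
import random
from itertools import product

feed_rates = [0.20, 0.25, 0.30]      # in/min
depths = [0.15, 0.18, 0.20, 0.25]    # in
replicates = 3

# List every (feed rate, depth) combination once per replicate: 12 x 3 = 36 runs.
runs = [(f, d) for f, d in product(feed_rates, depths) for _ in range(replicates)]

# Shuffle the whole list so that all 36 runs are performed in random order.
rng = random.Random(3515)  # arbitrary seed, only for reproducibility
rng.shuffle(runs)

for order, (f, d) in enumerate(runs, start=1):
    print(f"run {order:2d}: feed rate {f:.2f}, depth of cut {d:.2f}")
```

Shuffling the complete list of 36 runs corresponds to a completely randomized design; if each replicate were run as a block, the 12 combinations would instead be shuffled separately within each block.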
b.  Specify the statistical model and the corresponding assumptions (including constraints). Then set up
the appropriate hypotheses. Use mathematical notation, and explain the symbols that you are using.
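$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$
$\varepsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$
$\tau$, $\beta$, and $(\tau\beta)$ stand for the effects of feed rate, depth of cut, and their interaction, respectively.
Constraints: $\sum_i \tau_i = 0$, $\sum_j \beta_j = 0$, $\sum_i (\tau\beta)_{ij} = 0$, $\sum_j (\tau\beta)_{ij} = 0$.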
We test 3 hypotheses individually:
$H_{01}: \tau_1 = \tau_2 = \tau_3 = 0$
$H_{02}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_{03}: (\tau\beta)_{ij} = 0$ for all $i, j$
The alternatives of all 3 hypotheses have the same form: $H_1$: at least one equality does not hold.
c.  Show the formula for the test statistics and compute their values.
Test 1: $F_1 = MS_{feed}/MS_E = 55.02$
Test 2: $F_2 = MS_{cut}/MS_E = 24.66$
Test 3: $F_3 = MS_{interaction}/MS_E = 3.23$
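The three F ratios follow from the standard two-factor sums of squares with 3 replicates per cell. The sketch below recomputes them in plain Python; the layout of `cells` and all variable names are choices made for this example.

```python
# cells[i][j] holds the 3 replicates for feed rate i (0.20, 0.25, 0.30)
# and depth of cut j (0.15, 0.18, 0.20, 0.25).
cells = [
    [[74, 64, 60], [79, 68, 73], [82, 88, 92], [99, 104, 96]],
    [[92, 86, 88], [98, 104, 88], [99, 108, 95], [104, 110, 99]],
    [[99, 98, 102], [104, 99, 95], [108, 110, 99], [114, 111, 107]],
]
a, b, n = 3, 4, 3   # feed-rate levels, depth levels, replicates per cell

all_obs = [y for row in cells for cell in row for y in cell]
cf = sum(all_obs) ** 2 / (a * b * n)   # correction factor

ss_total = sum(y * y for y in all_obs) - cf
ss_feed = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (b * n) - cf
ss_cut = sum(
    sum(sum(cells[i][j]) for i in range(a)) ** 2 for j in range(b)
) / (a * n) - cf
ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / n - cf
ss_inter = ss_cells - ss_feed - ss_cut
ss_error = ss_total - ss_cells

ms_error = ss_error / (a * b * (n - 1))            # 24 error df
f_feed = (ss_feed / (a - 1)) / ms_error
f_cut = (ss_cut / (b - 1)) / ms_error
f_inter = (ss_inter / ((a - 1) * (b - 1))) / ms_error

print("F1 (feed rate)    =", round(f_feed, 2))
print("F2 (depth of cut) =", round(f_cut, 2))
print("F3 (interaction)  =", round(f_inter, 2))
```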
d.  What are distributions of the test statistics under the null hypothesis?
$F_1 \sim F(2, 24)$, $F_2 \sim F(3, 24)$, $F_3 \sim F(6, 24)$
e.  Show the steps to compute the p-value. What is your conclusion based on p-value?
Reject $H_0$ when $F_0 > F_{\alpha}$.
P-value $= P(F > F_0) = 1 - P(F < F_0) = 1 - \mathrm{CDF}(F_0) = 1 - \mathrm{PROBF}(F_0, df_1, df_2)$
Reject all 3 null hypotheses.
f.  Obtain parameter estimates for the fitted model.
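$\hat{\mu} = 94.33$
$\hat{\tau}_1 = -12.75$, $\hat{\tau}_2 = 3.25$, $\hat{\tau}_3 = 9.5$
$\hat{\beta}_1 = -9.56$, $\hat{\beta}_2 = -4.56$, $\hat{\beta}_3 = 3.56$, $\hat{\beta}_4 = 10.56$
$\widehat{(\tau\beta)}_{11} = -6.03$, $\widehat{(\tau\beta)}_{12} = -3.7$, $\widehat{(\tau\beta)}_{13} = 2.19$, $\widehat{(\tau\beta)}_{14} = 7.52$
$\widehat{(\tau\beta)}_{21} = 0.64$, $\widehat{(\tau\beta)}_{22} = 3.64$, $\widehat{(\tau\beta)}_{23} = -0.47$, $\widehat{(\tau\beta)}_{24} = -3.8$
$\widehat{(\tau\beta)}_{31} = 5.387$, $\widehat{(\tau\beta)}_{32} = 0.053$, $\widehat{(\tau\beta)}_{33} = -1.72$, $\widehat{(\tau\beta)}_{34} = -3.723$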
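Under the sum-to-zero constraints from part (b), these estimates are deviations of the row, column, and cell means from the grand mean. The following plain Python sketch shows the computation; the variable names are illustrative, and small differences in the last digit relative to the values above are due to rounding.

```python
# cells[i][j] holds the 3 replicates for feed rate i and depth of cut j.
cells = [
    [[74, 64, 60], [79, 68, 73], [82, 88, 92], [99, 104, 96]],
    [[92, 86, 88], [98, 104, 88], [99, 108, 95], [104, 110, 99]],
    [[99, 98, 102], [104, 99, 95], [108, 110, 99], [114, 111, 107]],
]
a, b, n = 3, 4, 3

cell_mean = [[sum(cell) / n for cell in row] for row in cells]
mu_hat = sum(sum(row) for row in cell_mean) / (a * b)
row_mean = [sum(row) / b for row in cell_mean]                        # feed rate means
col_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]  # depth means

tau_hat = [m - mu_hat for m in row_mean]
beta_hat = [m - mu_hat for m in col_mean]
# Interaction estimate: cell mean - row mean - column mean + grand mean.
taubeta_hat = [
    [cell_mean[i][j] - row_mean[i] - col_mean[j] + mu_hat for j in range(b)]
    for i in range(a)
]

print("mu_hat   =", round(mu_hat, 2))
print("tau_hat  =", [round(x, 2) for x in tau_hat])
print("beta_hat =", [round(x, 2) for x in beta_hat])
for i, row in enumerate(taubeta_hat, start=1):
    print(f"(tau beta)_hat, i = {i}:", [round(x, 2) for x in row])
```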
g.  Use the Tukey’s method to make comparisons among different feed rates and draw conclusions. What
are the conclusions for comparisons among different depths of cut?
For feed rate, all three levels differ significantly from one another in mean. The 0.2 level has the smallest mean
and the 0.3 level has the largest.
For depth of cut, no significant difference is detected between the 0.15 in and 0.18 in levels. All other
pairs differ significantly in mean.
h.  Conduct model adequacy checking.
For model assumption checking, the residual plot shows no obvious pattern. The points lie randomly
around y = 0, which indicates that the homoscedasticity assumption is valid. No obvious outliers are
detected by the jackknife residuals, leverage, or Cook's distance. The normal Q-Q plot is approximately a straight
line, and the Shapiro-Wilk test gives a p-value of 0.4397, so the normality assumption is also valid.
i.  Set up reduced models with the given data and find the best fitted model. How do you interpret
PRESS statistics? Report $R^2_{pred}$ of all chosen models.
j.  Suppose that this experiment had been conducted in three blocks, with each replicate a block. Assume
that the observations in the data table are given in order, that is, the first observation in each cell comes
from the first replicate, and so on (see the table below).
                               Depth of Cut (in)
Feed Rate (in/min)    Block    0.15    0.18    0.20    0.25
0.20                  1          74      79      82      99
                      2          64      68      88     104
                      3          60      73      92      96
0.25                  1          92      98      99     104
                      2          86     104     108     110
                      3          88      88      95      99
0.30                  1          99     104     108     114
                      2          98      99     110     111
                      3         102      95      99     107
Rework parts (b)-(e) by analyzing the data as a factorial experiment in blocks, assuming that the blocks are
fixed [you’re not required to show the steps to compute p value any more].
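$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \delta_k + \varepsilon_{ijk}$
$\varepsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$
$\tau$, $\beta$, $(\tau\beta)$, and $\delta$ stand for the effects of feed rate, depth of cut, their interaction, and block, respectively.
Constraints: $\sum_i \tau_i = 0$, $\sum_j \beta_j = 0$, $\sum_i (\tau\beta)_{ij} = 0$, $\sum_j (\tau\beta)_{ij} = 0$, $\sum_k \delta_k = 0$.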
We test 3 hypotheses individually:
$H_{01}: \tau_1 = \tau_2 = \tau_3 = 0$
$H_{02}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_{03}: (\tau\beta)_{ij} = 0$ for all $i, j$
The alternatives of all 3 hypotheses have the same form: $H_1$: at least one equality does not hold.
Test 1: $F_1 = MS_{feed}/MS_E = 68.35$
Test 2: $F_2 = MS_{cut}/MS_E = 30.64$
Test 3: $F_3 = MS_{interaction}/MS_E = 4.02$
$F_1 \sim F(2, 22)$, $F_2 \sim F(3, 22)$, $F_3 \sim F(6, 22)$
Based on the p-values, reject all 3 null hypotheses.
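Treating each replicate as a block moves the block sum of squares (2 degrees of freedom) out of the error term, which leaves 22 error degrees of freedom. The sketch below recomputes the blocked F ratios in plain Python, assuming, as stated in the problem, that the k-th observation in each cell belongs to block k; the variable names are illustrative.

```python
# cells[i][j][k]: feed rate i, depth of cut j, block (replicate) k.
cells = [
    [[74, 64, 60], [79, 68, 73], [82, 88, 92], [99, 104, 96]],
    [[92, 86, 88], [98, 104, 88], [99, 108, 95], [104, 110, 99]],
    [[99, 98, 102], [104, 99, 95], [108, 110, 99], [114, 111, 107]],
]
a, b, n = 3, 4, 3

all_obs = [y for row in cells for cell in row for y in cell]
cf = sum(all_obs) ** 2 / (a * b * n)

ss_total = sum(y * y for y in all_obs) - cf
ss_feed = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (b * n) - cf
ss_cut = sum(
    sum(sum(cells[i][j]) for i in range(a)) ** 2 for j in range(b)
) / (a * n) - cf
ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / n - cf
ss_inter = ss_cells - ss_feed - ss_cut

# Block totals: sum of the k-th observation over all 12 cells.
ss_block = sum(
    sum(cells[i][j][k] for i in range(a) for j in range(b)) ** 2 for k in range(n)
) / (a * b) - cf

ss_error = ss_total - ss_cells - ss_block
df_error = (a * b - 1) * (n - 1)   # 22
ms_error = ss_error / df_error

f_feed = (ss_feed / (a - 1)) / ms_error
f_cut = (ss_cut / (b - 1)) / ms_error
f_inter = (ss_inter / ((a - 1) * (b - 1))) / ms_error

print("SS block =", round(ss_block, 2), " error df =", df_error)
print("F1, F2, F3 =", round(f_feed, 2), round(f_cut, 2), round(f_inter, 2))
```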
Problem 3 
A chemical product is produced in a pressure vessel. A factorial experiment is carried out in the pilot plant
to study the factors thought to influence the filtration rate of this product. The four factors are temperature
(A), pressure (B), concentration of formaldehyde (C), and stirring rate (D). Each factor has two levels, and a
single replicate of the design is considered. The data are given as follows.
A    B    C    D = -ABC    Treatment combination    Filtration rate
+    −    −    −           a                        18
−    +    −    −           b                        13
−    −    +    −           c                        17
+    +    +    −           abc                      15
−    −    −    +           d                        10
+    +    −    +           abd                      24
+    −    +    +           acd                      21
−    +    +    +           bcd                      17
a.  What is this experimental design, and what is its defining relation?
Unreplicated $2^{4-1}$ fractional factorial design.
Defining relation: $I = -ABCD$
b.  What is the resolution of this design? Describe a property of designs with such resolution.
Resolution IV design. No main effect is aliased with any other main effect or with any two-factor interaction, but two-factor interactions are aliased with each other.
c.  Write out the alias structure for this design.
$I = -ABCD$
Alias structure:
A = -BCD     [A] → A - BCD
B = -ACD     [B] → B - ACD
C = -ABD     [C] → C - ABD
D = -ABC     [D] → D - ABC
AB = -CD     [AB] → AB - CD
AC = -BD     [AC] → AC - BD
AD = -BC     [AD] → AD - BC
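Each alias is obtained by multiplying the effect by the defining word and cancelling any letter that appears twice (for example, $A \cdot (-ABCD) = -A^2BCD = -BCD$). The sketch below automates that rule in plain Python; the helper name `multiply` is introduced here for illustration.

```python
def multiply(word1, word2):
    """Multiply two effect words; a letter appearing in both cancels (A*A = I)."""
    letters = set(word1) ^ set(word2)   # symmetric difference drops squared letters
    return "".join(sorted(letters)) or "I"


defining_word = "ABCD"
sign = "-"   # the defining relation is I = -ABCD

effects = ["A", "B", "C", "D", "AB", "AC", "AD"]
aliases = {effect: sign + multiply(effect, defining_word) for effect in effects}

for effect, alias in aliases.items():
    print(f"{effect} = {alias}")
```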
d.  Calculate the effect estimate of AB and MSAB .
$\text{contrast}_{AB} = -a - b + c + abc + d + abd - acd - bcd = -3$
$\text{effect}_{AB} = \text{contrast}_{AB}/(8/2) = -0.75$
$SS_{AB} = \text{contrast}_{AB}^2/8 = 1.125$
$MS_{AB} = SS_{AB}/1 = 1.125$
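The sign of each run in the AB contrast is the product of its A and B levels. The sketch below applies that rule to the eight runs in plain Python; the tuple layout and variable names are choices made for this example.

```python
# (treatment combination, level of A, level of B, filtration rate)
runs = [
    ("a",   +1, -1, 18),
    ("b",   -1, +1, 13),
    ("c",   -1, -1, 17),
    ("abc", +1, +1, 15),
    ("d",   -1, -1, 10),
    ("abd", +1, +1, 24),
    ("acd", +1, -1, 21),
    ("bcd", -1, +1, 17),
]
n_runs = len(runs)

# Contrast: sum of (sign of A) * (sign of B) * response over all runs.
contrast_ab = sum(a * b * y for _, a, b, y in runs)
effect_ab = contrast_ab / (n_runs / 2)
ss_ab = contrast_ab ** 2 / n_runs
ms_ab = ss_ab / 1   # one degree of freedom

print("contrast_AB =", contrast_ab)
print("effect_AB   =", effect_ab)
print("SS_AB = MS_AB =", ss_ab)
```

Because AB is aliased with -CD in this design, the quantity computed here estimates AB - CD rather than AB alone.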
e.  Write out the decomposition of degrees of freedoms for this design.
Source of variation    Df
A                      1
B                      1
C                      1
D                      1
AB                     1
AC                     1
AD                     1
Error                  0
Total                  $2^3 - 1 = 7$
f.  Briefly describe a graphical or quantitative method of selecting important effects in the screening
experiment.
Use a normal probability plot of the effect estimates and select the effects that fall far from the straight line.
Problem 4 
The following table contains an incomplete output from the ANOVA model. Fill in the missing values.
Source    Df    Sum Sq    Mean Sq    F value
A          1        50         50         50
B          2        80         40         40
AB         2        30         15         15
Error     12        12          1
Total     17       172
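The entries of the table are tied together by three relations: Mean Sq = Sum Sq / Df, F = Mean Sq / Mean Sq(Error), and the Df and Sum Sq columns each add up to the Total row. The sketch below checks the completed table against these relations in plain Python; since the text does not say which cells were originally blank, it simply recomputes every derived entry.

```python
df = {"A": 1, "B": 2, "AB": 2, "Error": 12}
ss = {"A": 50, "B": 80, "AB": 30, "Error": 12}

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in df}

# F value = mean square of the effect / mean square error.
f_value = {source: ms[source] / ms["Error"] for source in df if source != "Error"}

# The Total row is the column sum.
df_total = sum(df.values())
ss_total = sum(ss.values())

for source in df:
    print(source, df[source], ss[source], ms[source], f_value.get(source, ""))
print("Total", df_total, ss_total)
```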