Exam2 Design of Experiments: STAT 3515Q

Problem 1 [25] Five different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five-gallon milk containers. The analysis is done in a laboratory, and only five trials can be run on any day. Because days could represent a potential source of variability, the experimenter decides to use a randomized block design. Observations are taken for four days, and the data are shown here. Analyze the data from this experiment and draw conclusions.

                  Day
Solution     1     2     3     4
   1        13    22    18    39
   2        16    24    17    44
   3         5     4     1    22
   4         8     6     5    28
   5        20    29    24    50

a. [3] Write the model equation, assumptions and constraints.

$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$, $i = 1, 2, \dots, 5$, $j = 1, 2, \dots, 4$.

$\tau_i$: treatment (solution) effect and $\beta_j$: block (day) effect.
$\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, $\sum_{i=1}^{5} \tau_i = 0$, $\sum_{j=1}^{4} \beta_j = 0$.
$\tau_i$ and $\beta_j$ are fixed effects.

b. [3] Estimate all parameters of the model in part (a).

$\hat{\mu} = \bar{y}_{..} = 19.75$
$\hat{\sigma}^2 = MSE = 7.908$
$\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$: $\hat{\tau}_1 = 3.25$, $\hat{\tau}_2 = 5.5$, $\hat{\tau}_3 = -11.75$, $\hat{\tau}_4 = -8$, $\hat{\tau}_5 = 11$
$\hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$: $\hat{\beta}_1 = -7.35$, $\hat{\beta}_2 = -2.75$, $\hat{\beta}_3 = -6.75$, $\hat{\beta}_4 = 16.85$

c. [6] Write the hypotheses of the test of this design and draw conclusions for the test(s) with the two approaches discussed in the class.

$H_0: \tau_1 = \dots = \tau_5 = 0$
$H_1: \tau_i \neq 0$ for at least one $i$.

• Critical value approach: the error degrees of freedom are $(5-1)(4-1) = 12$.
For solution: $F_0 = 46.01$, critical value $= F_{0.05,\,4,\,12} = 3.26$, so $F_0 > F_{0.05,\,4,\,12}$.
For day: $F_0 = 82.42$, critical value $= F_{0.05,\,3,\,12} = 3.49$, so $F_0 > F_{0.05,\,3,\,12}$.
So reject $H_0$.

• p-value approach: Both p-values are less than 0.05. So reject $H_0$.

Thus the effect of the washing solutions on bacteria growth is significant.

d. [2] What does the model in part (a) assume about the interaction between day and solution?

The model assumes that there is no interaction between day and solution.

e. [5] Write the null hypothesis of the test for pairwise comparison.
Compute the cutoff value for Tukey's comparison test and interpret the results of the Tukey's test of all pairwise comparisons.

$H_0: \tau_i = \tau_j$ for all $1 \le i \neq j \le 5$

We reject $H_0$ at the 5% significance level if
$|\bar{y}_{i.} - \bar{y}_{j.}| > q_{0.05}(5, 12)\sqrt{MSE/b} = 4.507 \times \sqrt{7.908/4} = 6.337$

$\hat{\tau}_5 > \hat{\tau}_2 > \hat{\tau}_1 > \hat{\tau}_4 > \hat{\tau}_3$

Tukey's test suggests that there is no significant difference between $\tau_5$ and $\tau_2$, between $\tau_2$ and $\tau_1$, or between $\tau_4$ and $\tau_3$. $\tau_5$ is significantly different from $\tau_1$. Both $\tau_4$ and $\tau_3$ are significantly different from all of the others.

f. [3] Suppose that this is a minimization problem. What washing solution(s) do you recommend?

Based on the results of Tukey's test from part (e), either solution 3 or 4.

g. [3] Analyze the residuals from this experiment. Are the analysis of variance assumptions satisfied?

The Q-Q plot supports the normality assumption. The residuals vs. day plot suggests that the constant-variance assumption is not satisfied, and there appears to be curvature in the residuals vs. predicted values plot.

Problem 2 [40] An engineer suspects that the surface finish of a metal part is influenced by the feed rate and the depth of cut. He selects three feed rates and four depths of cut. He then conducts a factorial experiment and obtains the following data:

                           Depth of Cut (in)
Feed Rate (in/min)     0.15    0.18    0.20    0.25
0.20                     74      79      82      99
                         64      68      88     104
                         60      73      92      96
0.25                     92      98      99     104
                         86     104     108     110
                         88      88      95      99
0.30                     99     104     108     114
                         98      99     110     111
                        102      95      99     107

a. [2] Briefly describe how to conduct the randomization for this design.

There are 3 × 4 = 12 treatment combinations and 3 replicates of each, so 36 runs are required in total. The order in which the 36 runs are conducted should be decided at random, for example by a random draw of run orders.

b. [5] Specify the statistical model and the corresponding assumptions (including constraints). Then set up the appropriate hypotheses. Use mathematical notation, and explain the symbols that you are using.
$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$, $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$

$\tau$, $\beta$ and $(\tau\beta)$ stand for the effects of feed rate, depth of cut and their interaction, respectively.

Constraints: $\sum_i \tau_i = 0$, $\sum_j \beta_j = 0$, $\sum_i (\tau\beta)_{ij} = 0$, $\sum_j (\tau\beta)_{ij} = 0$.

We test 3 hypotheses individually:
$H_{01}: \tau_1 = \tau_2 = \tau_3 = 0$
$H_{02}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_{03}: (\tau\beta)_{ij} = 0$ for all $i, j$
The alternatives of all 3 hypotheses have the same form: $H_1$: at least one equality does not hold.

c. [3] Show the formula for the test statistics and compute their values.

Test 1: $F_1 = MS_{feed}/MSE = 55.02$
Test 2: $F_2 = MS_{cut}/MSE = 24.66$
Test 3: $F_3 = MS_{interaction}/MSE = 3.23$

d. [2] What are distributions of the test statistics under the null hypothesis?

$F_1 \sim F(2, 24)$, $F_2 \sim F(3, 24)$, $F_3 \sim F(6, 24)$

e. [4] Show the steps to compute the p-value. What is your conclusion based on p-value?

Reject $H_0$ when $F_0 > F_\alpha$.
P-value $= P(F > F_0) = 1 - P(F < F_0) = 1 - \mathrm{CDF}(F_0)$ = 1 − PROBF(F0, DF1, DF2)
Reject all 3 null hypotheses.

f. [3] Obtain parameter estimates for the fitted model.

$\hat{\mu} = 94.33$
$\hat{\tau}_1 = -12.75$, $\hat{\tau}_2 = 3.25$, $\hat{\tau}_3 = 9.5$
$\hat{\beta}_1 = -9.56$, $\hat{\beta}_2 = -4.56$, $\hat{\beta}_3 = 3.56$, $\hat{\beta}_4 = 10.56$
$(\widehat{\tau\beta})_{11} = -6.03$, $(\widehat{\tau\beta})_{12} = -3.7$, $(\widehat{\tau\beta})_{13} = 2.19$, $(\widehat{\tau\beta})_{14} = 7.52$,
$(\widehat{\tau\beta})_{21} = 0.64$, $(\widehat{\tau\beta})_{22} = 3.64$, $(\widehat{\tau\beta})_{23} = -0.47$, $(\widehat{\tau\beta})_{24} = -3.8$,
$(\widehat{\tau\beta})_{31} = 5.387$, $(\widehat{\tau\beta})_{32} = 0.053$, $(\widehat{\tau\beta})_{33} = -1.72$, $(\widehat{\tau\beta})_{34} = -3.723$

g. [4] Use the Tukey's method to make comparisons among different feed rates and draw conclusions. What are the conclusions for comparisons among different depths of cut?

For feed rate, all 3 levels differ significantly from one another in mean. The 0.2 level has the smallest mean and the 0.3 level has the largest mean.
For depth of cut, no significant difference is detected between the 0.15 in and 0.18 in levels. All other pairs differ significantly in mean.

h. [3] Conduct model adequacy checking.

For the model assumption checking, the residual plot does not show an obvious pattern. The points lie randomly around y = 0, which indicates that the homoscedasticity assumption is valid.
There are no obvious outliers detected by the jackknife residuals, leverage or Cook's distance. The normal Q-Q plot is approximately a straight line, and the Shapiro-Wilk test gives a p-value of 0.4397, so the normality assumption is also valid.

i. [6] Set up reduced models with the given data and find the best fitted model. How do you interpret PRESS statistics? Report $R^2_{pred}$ of all chosen models.

j. [8] Suppose that this experiment had been conducted in three blocks, with each replicate a block. Assume that the observations in the data table are given in order, that is, the first observation in each cell comes from the first replicate, and so on (see the table below).

                                  Depth of Cut (in)
Feed Rate (in/min)   Block    0.15    0.18    0.20    0.25
0.20                   1        74      79      82      99
                       2        64      68      88     104
                       3        60      73      92      96
0.25                   1        92      98      99     104
                       2        86     104     108     110
                       3        88      88      95      99
0.30                   1        99     104     108     114
                       2        98      99     110     111
                       3       102      95      99     107

Rework parts (b)-(e) by analyzing the data as a factorial experiment in blocks, assuming that the blocks are fixed [you're not required to show the steps to compute the p-value any more].

$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \delta_k + \epsilon_{ijk}$, $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$

$\tau$, $\beta$, $(\tau\beta)$ and $\delta$ stand for the effects of feed rate, depth of cut, their interaction, and block, respectively.

Constraints: $\sum_i \tau_i = 0$, $\sum_j \beta_j = 0$, $\sum_i (\tau\beta)_{ij} = 0$, $\sum_j (\tau\beta)_{ij} = 0$, $\sum_k \delta_k = 0$.

We test 3 hypotheses individually:
$H_{01}: \tau_1 = \tau_2 = \tau_3 = 0$
$H_{02}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_{03}: (\tau\beta)_{ij} = 0$ for all $i, j$
The alternatives of all 3 hypotheses have the same form: $H_1$: at least one equality does not hold.

Test 1: $F_1 = MS_{feed}/MSE = 68.35$
Test 2: $F_2 = MS_{cut}/MSE = 30.64$
Test 3: $F_3 = MS_{interaction}/MSE = 4.02$

$F_1 \sim F(2, 22)$, $F_2 \sim F(3, 22)$, $F_3 \sim F(6, 22)$

Based on the p-values, reject all 3 null hypotheses.

Problem 3 [25] A chemical product is produced in a pressure vessel. A factorial experiment is carried out in the pilot plant to study the factors thought to influence the filtration rate of this product.
The four factors are temperature (A), pressure (B), concentration of formaldehyde (C), and stirring rate (D). Each factor has two levels, and a single replicate of the design is considered. The data are given as follows.

       Factor
A    B    C    D = -ABC    Treatment combination    Filtration rate
+    -    -       -                a                      18
-    +    -       -                b                      13
-    -    +       -                c                      17
+    +    +       -                abc                    15
-    -    -       +                d                      10
+    +    -       +                abd                    24
+    -    +       +                acd                    21
-    +    +       +                bcd                    17

a. [4] What is this experimental design, and what is its defining relation?

Unreplicated $2^{4-1}$ fractional factorial design.
Defining relation: $I = -ABCD$

b. [4] What is the resolution of this design? Describe a property of designs with such resolution.

Resolution IV design. No main effect is aliased with any other main effect or with any two-factor interaction.

c. [4] Write out the alias structure for this design.

$I = -ABCD$
Alias structure:
A = -BCD     [A] → A - BCD
B = -ACD     [B] → B - ACD
C = -ABD     [C] → C - ABD
D = -ABC     [D] → D - ABC
AB = -CD     [AB] → AB - CD
AC = -BD     [AC] → AC - BD
AD = -BC     [AD] → AD - BC

d. [4] Calculate the effect estimate of AB and $MS_{AB}$.

$contrast_{AB} = -a - b + c + abc + d + abd - acd - bcd = -3$
$effect_{AB} = contrast_{AB}/(8/2) = -0.75$
$SS_{AB} = contrast_{AB}^2/8 = 1.125$
$MS_{AB} = SS_{AB}/1 = 1.125$

e. [4] Write out the decomposition of degrees of freedoms for this design.

Source of variation    Df
A                      1
B                      1
C                      1
D                      1
AB                     1
AC                     1
AD                     1
Error                  0
Total                  $2^3 - 1 = 7$

f. [5] Briefly describe a graphical or quantitative method of selecting important effects in the screening experiment.

Use a normal probability plot of the effect estimates and select the effects that fall far away from the straight line.

Problem 4 [10] The following table contains an incomplete output from the ANOVA model. Fill in the missing values.

Source    Df    Sum Sq    Mean Sq    F value
A          1      50        50         50
B          2      80        40         40
AB         2      30        15         15
Error     12      12         1
Total     17     172
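
The completed Problem 4 table can be checked with a few lines of Python. This is only a sketch of the usual bookkeeping (Mean Sq = Sum Sq / Df, F = Mean Sq / error Mean Sq, and the Total row as the column sums). The function name `complete_anova` is hypothetical, and because the exam's original blanks are not recoverable from the table above, the sketch assumes that Df and Sum Sq are known for every non-total row and recomputes everything else.

```python
def complete_anova(effects, error_df, error_ss):
    """Recompute mean squares, F values and the Total row of an ANOVA table.

    effects maps a source name to (df, sum_sq). Assumption: df and sum_sq are
    known for every effect and for the error row.
    """
    mse = error_ss / error_df  # error mean square
    table = {}
    for source, (df, ss) in effects.items():
        ms = ss / df  # Mean Sq = Sum Sq / Df
        table[source] = {"df": df, "ss": ss, "ms": ms, "f": ms / mse}
    table["Error"] = {"df": error_df, "ss": error_ss, "ms": mse, "f": None}
    # The Total row is the sum of the Df and Sum Sq columns.
    table["Total"] = {
        "df": error_df + sum(df for df, _ in effects.values()),
        "ss": error_ss + sum(ss for _, ss in effects.values()),
        "ms": None,
        "f": None,
    }
    return table


# Values taken from the Problem 4 table.
problem4 = complete_anova(
    {"A": (1, 50), "B": (2, 80), "AB": (2, 30)}, error_df=12, error_ss=12
)
for source, row in problem4.items():
    print(source, row)
```

The recomputed rows agree with the table: the error mean square is 1, so each F value equals the corresponding mean square.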