
MTH 542 Chapter 8 – Transformations on y – The Box-Cox Method

Clathrate formation data (from Montgomery, Vining, Peck)

Data: y = Clathrate formation (mass%), x1 = amount of surfactant, x2 = Time (minutes)
clathrate compounds: A type of inclusion compound in which small molecules are trapped in the
cage-like lattice of macromolecules.

surfactant: A surface-active agent, including substances commonly referred to as wetting agents,
surface tension depressants, detergents, dispersing agents, emulsifiers, and quaternary ammonium
antiseptics.

library(MPV)
attach(table.b8)
table.b8
x1 x2 y
1 0.00 10 7.5
2 0.00 50 15.0
3 0.00 85 22.0
-----
34 0.05 90 46.5
35 0.05 120 50.0
36 0.05 150 51.9

pairs(y~x1+x2,gap=0.4,cex.labels=1.5)

[Figure: scatterplot matrix of y, x1, and x2]

> plot(fitted(lm(y~x1+x2)), resid(lm(y~x1+x2)))

> qqnorm(rstandard(lm(y~x1+x2)))




Next we look for the most appropriate transformation of the response y to correct the
non-normality of the residuals, using the Box-Cox method.


Step I.
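• For each value of λ, obtain the transformed response

$$
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda \, gm(y)^{\lambda - 1}}, & \lambda \neq 0, \\[1ex]
gm(y) \, \ln y, & \lambda = 0,
\end{cases}
$$

where gm(y) is the geometric mean of the observations y_1, ..., y_n.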
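The λ = 0 case is the limit of the λ ≠ 0 formula, which is why the two pieces join continuously. A short check, writing $y^{\lambda} = e^{\lambda \ln y}$:

$$
\lim_{\lambda \to 0} \frac{y^{\lambda} - 1}{\lambda}
= \lim_{\lambda \to 0} \frac{e^{\lambda \ln y} - 1}{\lambda}
= \ln y,
\qquad
\lim_{\lambda \to 0} gm(y)^{\lambda - 1} = \frac{1}{gm(y)},
$$

so that

$$
\lim_{\lambda \to 0} \frac{y^{\lambda} - 1}{\lambda \, gm(y)^{\lambda - 1}} = gm(y) \, \ln y .
$$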

Step II.
• For each λ, calculate the residual sum of squares RSS(λ) from fitting the model

$$
y^{(\lambda)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + e .
$$

lambda RSS.lambda
1 -2.00 24794.9525
2 -1.75 15571.3368
3 -1.50 9931.1368
4 -1.25 6446.7541
5 -1.00 4271.8840
6 -0.75 2901.3226
7 -0.50 2031.3151
8 -0.25 1477.9404
9 0.00 1129.1560
10 0.25 916.4661
11 0.50 798.0994
12 0.75 748.9798
13 1.00 754.7438
14 1.25 808.1919
15 1.50 907.2317
16 1.75 1053.7580
17 2.00 1253.1491
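
Steps I and II can be sketched in a few lines of R. This is a minimal sketch, not necessarily how the table above was produced: the function names bc_scaled and rss_lambda are hypothetical, and the sketch assumes the data set table.b8 can be loaded from the MPV package with data().

```r
# Scaled Box-Cox transformation of Step I; gm is the geometric mean of y.
bc_scaled <- function(y, lambda) {
  gm <- exp(mean(log(y)))
  if (abs(lambda) < 1e-8) {
    gm * log(y)
  } else {
    (y^lambda - 1) / (lambda * gm^(lambda - 1))
  }
}

# RSS(lambda): residual sum of squares from regressing the
# transformed response on x1 and x2 (Step II).
rss_lambda <- function(lambda, y, x1, x2) {
  z <- bc_scaled(y, lambda)
  sum(resid(lm(z ~ x1 + x2))^2)
}

# Evaluate on the grid used in the table, if the MPV data are available.
if (requireNamespace("MPV", quietly = TRUE)) {
  data("table.b8", package = "MPV")
  lambda <- seq(-2, 2, by = 0.25)
  RSS.lambda <- sapply(lambda, rss_lambda,
                       y = table.b8$y, x1 = table.b8$x1, x2 = table.b8$x2)
  print(data.frame(lambda, RSS.lambda))
  print(lambda[which.min(RSS.lambda)])  # grid value with the smallest RSS
}
```

Because the model contains an intercept, RSS at λ = 1 equals the residual sum of squares of the untransformed fit, which gives a quick sanity check on the transformation.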

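On this grid, RSS(λ) is smallest at λ = 0.75 (RSS = 748.98), so the optimal value of λ is approximately 0.75.
From theory we know that the value of λ that minimizes RSS(λ) is the same as the value of λ that maximizes
the log-likelihood function

$$
L(\lambda) = -\frac{n}{2} \log\!\left( \frac{RSS_{\lambda}}{n} \right) + (\lambda - 1) \sum_{i=1}^{n} \log y_i ,
$$

where, in this form, $RSS_{\lambda}$ is the residual sum of squares for the unscaled transformation
$(y^{\lambda} - 1)/\lambda$. The term $(\lambda - 1) \sum \log y_i$ is exactly what the division by
$gm(y)^{\lambda - 1}$ in Step I absorbs, so that $L(\lambda) = -\frac{n}{2} \log\left( RSS(\lambda)/n \right)$
in terms of the RSS(λ) of Step II.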
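
The equivalence between minimizing the RSS of the scaled response and maximizing the log-likelihood can be checked directly. Write $w_i = (y_i^{\lambda} - 1)/\lambda$ for the unscaled transformation, with residual sum of squares $RSS_w(\lambda)$, and $z_i = w_i / gm(y)^{\lambda - 1}$ for the scaled response, with residual sum of squares $RSS_z(\lambda)$ (the subscripts w and z are notation introduced only for this check). Since z is a constant multiple of w,

$$
RSS_z(\lambda) = \frac{RSS_w(\lambda)}{gm(y)^{2(\lambda - 1)}},
\qquad
n \log gm(y) = \sum_{i=1}^{n} \log y_i ,
$$

and therefore

$$
-\frac{n}{2} \log\!\left( \frac{RSS_z(\lambda)}{n} \right)
= -\frac{n}{2} \log\!\left( \frac{RSS_w(\lambda)}{n} \right) + n(\lambda - 1) \log gm(y)
= -\frac{n}{2} \log\!\left( \frac{RSS_w(\lambda)}{n} \right) + (\lambda - 1) \sum_{i=1}^{n} \log y_i .
$$

The left-hand side is a decreasing function of $RSS_z(\lambda)$, so the λ with the smallest scaled RSS is the λ with the largest log-likelihood.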

Using the boxcox function from the MASS package (the package must be loaded first with library(MASS)), we obtain a
graph of the above log-likelihood function:

> library(MASS)
> boxcox(lm(y~x1+x2), plotit=TRUE)


Indeed, the maximum is attained at some point around 0.75.

A Q-Q plot for the residuals corresponding to the model with the optimal transformation of the
response is:
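
A plot of this kind could be produced along the following lines. This is a minimal sketch under stated assumptions: it takes λ = 0.75 from the grid search, uses standardized residuals as in the earlier qqnorm call, and the helper name qq_boxcox is hypothetical.

```r
# Hypothetical helper: Q-Q plot of the standardized residuals after a
# Box-Cox power transformation of the response with parameter lambda.
qq_boxcox <- function(y, x1, x2, lambda) {
  z <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  fit <- lm(z ~ x1 + x2)
  r <- rstandard(fit)
  qqnorm(r)
  qqline(r)  # reference line through the quartiles
  invisible(fit)
}

# Apply to the clathrate data with lambda = 0.75, if MPV is available.
if (requireNamespace("MPV", quietly = TRUE)) {
  data("table.b8", package = "MPV")
  qq_boxcox(table.b8$y, table.b8$x1, table.b8$x2, lambda = 0.75)
}
```

The unscaled transformation is used here because dividing by a positive constant changes neither the standardized residuals nor the shape of the Q-Q plot.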


Observe that the plot produced by the boxcox function in R also marks a 95% confidence interval for λ.
Given this interval, we can select a convenient value of λ from within it.
In particular, if the interval contains the value 1, we may simply take λ = 1, which corresponds to no
transformation.
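
The maximizing λ and the interval can also be read off numerically rather than from the plot. The sketch below assumes that boxcox() called with plotit = FALSE returns a list with the λ grid in x and the log-likelihood values in y, and that the interval is the set of λ whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum. The helper name bc_interval is hypothetical.

```r
# Likelihood-ratio interval for lambda from a list with components
# x (lambda grid) and y (log-likelihood values on that grid).
bc_interval <- function(bc, level = 0.95) {
  cutoff <- max(bc$y) - qchisq(level, df = 1) / 2
  range(bc$x[bc$y >= cutoff])
}

# Apply to the clathrate data, if both packages are available.
if (requireNamespace("MASS", quietly = TRUE) &&
    requireNamespace("MPV", quietly = TRUE)) {
  data("table.b8", package = "MPV")
  fit <- lm(y ~ x1 + x2, data = table.b8)
  bc <- MASS::boxcox(fit, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)
  print(bc$x[which.max(bc$y)])  # lambda maximizing the log-likelihood
  print(bc_interval(bc))        # approximate 95% interval on the grid
}
```

The endpoints are only as precise as the λ grid, so a finer grid gives a sharper interval.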

Some comments on the Box-Cox method:
• It is sensitive to outliers.
• It requires a positive response. If some yi are negative, we can add a constant to all the y's to make them positive.
• It is debatable whether estimating the parameter λ should cost the model one degree of freedom.
There is no clear answer to this.
• There are other ways to transform the response.



