Assessment of Interactions in Chemical Mixtures by ANOVA Method

National Chiao Tung University

Institute of Statistics

Master's Thesis

Student: Shu-wei Lin (林書維)

Advisor: Dr. Lin-An Chen (陳鄰安)

June 2010

Assessment of Interactions in Chemical Mixtures by ANOVA Method

Student: Shu-wei Lin

Advisor: Dr. Lin-An Chen

Institute of Statistics, National Chiao Tung University

Master's Thesis

A Thesis

Submitted to the Institute of Statistics, College of Science, National Chiao Tung University

in Partial Fulfillment of the Requirements for the Degree of Master

in Statistics

June 2010

Hsinchu, Taiwan, Republic of China

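Assessment of Interactions in Chemical Mixtures by ANOVA Method

Student: Shu-wei Lin

Advisor: Dr. Lin-An Chen

Institute of Statistics, National Chiao Tung University

Abstract (translated from the Chinese)

Partitioning chemical concentrations into interval levels is a common way to study interactions. Unfortunately, the analysis of variance cannot tell us whether the interaction at a specific level is positive (synergistic) or negative (antagonistic). For the problem of partitioning into interval levels, we propose a decomposition method to define the main effects and the interactions. We derive the population forms of these effects (main effects and interactions). Estimation and hypothesis testing for the interactions are also discussed.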


Assessment of Interactions in Chemical Mixtures

by ANOVA Method

Student: Shu-wei Lin

Advisor: Dr. Lin-An Chen

Institute of Statistics

National Chiao Tung University

SUMMARY

Interactions are commonly investigated by classifying the chemicals into interval levels and testing through the analysis of variance technique, which unfortunately cannot tell us whether an interaction at a specific level is positive (synergistic) or negative (antagonistic). We propose a decomposition method to define main effects and interactions for these interval levels. Population-type formulations of these effects are developed. Estimation and hypothesis testing are also discussed.

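Acknowledgements

These two years of master's study at the institute have been both fulfilling and enjoyable. Eighteen years of student life are about to come to a close here in Hsinchu, opening another stage of my life. I am truly grateful to the professors of the institute for their guidance and teaching during these two years. Thanks to their dedicated teaching, I learned many statistical analysis techniques and many statistical software packages, which strengthened my ability to analyze data.

I sincerely thank my advisor, Professor Lin-An Chen (陳鄰安). From the very beginning he guided me with great patience, leading us step by step toward completing this thesis on unfamiliar ground, and whether the problem was small or large, he was always tireless in resolving my doubts. Beyond that, he often reminded us of lessons about life and taught us to have goals and the right attitude. I am truly honored to be Professor Chen's student; thank you.

I especially thank the oral examination committee members, Professor 彭南夫, Professor 黃信誠 and Professor 江永進, for their corrections and suggestions, which made the thesis more complete.

Of course, these two years also owe much to the classmates and friends around me. We took courses together, discussed coursework together, played and laughed together, shared our feelings, got through all kinds of difficulties together and grew together. Because of you, my master's years were so colorful; thank you. I want to make special mention of one person, Ms. 郭碧芬, forever the youngest lady in our institute office. She always sat quietly in the office listening to our complaints, served as our mentor for two years, handled matters large and small for us, and even acted as a bridge between us students and the professors. It was truly hard work; thank you.

Finally, I thank my girlfriend 玄君, my elder brother 龍和 and my mother 綠雲. They always encouraged me when I was down, quietly shared my joy when I was happy, and sternly urged me on when I was lazy. Their love gave me the drive to keep moving forward and allowed me to complete my graduate studies. Thank you, my dearest family.

Shu-wei Lin, Institute of Statistics, National Chiao Tung University, June 2010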


1. Introduction

Toxicological research has long been devoted to assessing the risk of exposure to single chemicals in the environment. However, organisms are rarely exposed to single chemicals in isolation. More typically, exposures occur to multiple chemicals simultaneously. It has long been understood that the behavior of one chemical in the body is affected by other chemicals. Recently, much of the literature has investigated the important area of the toxicology of mixed chemicals. One very important topic in the study of chemical mixtures is detecting the existence of interactions and characterizing an interaction as a synergistic or an antagonistic effect. This matters because, under the assumption of additive effects, one may overestimate the true risk associated with a mixture of chemicals when an antagonistic effect occurs, and one may underestimate it when a synergistic effect occurs.

There are several approaches for studying chemical interactions. The most common technique in the analysis of toxicologic interactions is to classify the chemicals into interval levels and verify the interaction through the analysis of variance (ANOVA). This technique can detect the existence of interactions; however, it gives no description of the interaction. The isobolographic method has a long history but has recently become popular as an alternative method for the study of chemical interactions. Berenbaum (1981) defined the interaction index through fixed ratio ray designs to detect whether the chemical mixture is additive, synergistic or antagonistic. However, the isobole technique requires experimental iterations to obtain the doses of the chemicals under study that cause the same magnitude of effect, which is not only labor intensive and demands a large number of animal experiments but is also not applicable in real data analysis. For references on various interaction detecting techniques and discussions, see Rider and LeBlanc (2005), Ei-Masri, Reardon and Yang (1997), Charles et al. (2002) and Mumtaz et al. (1998). A systematic investigation of mixed chemicals in the environment or workplace is highly desired, while the isobolographic method is not applicable for this practical investigation of interaction characterization. It is interesting to see whether we can develop an ANOVA-like model that offers the benefit, provided by the isobolographic method, of insight into whether a detected interaction is synergistic or antagonistic.

In Section 2, we state the fundamental framework of a grouping ANOVA model for one response variable and several chemical variables and, in Section 3, we introduce the parameter-type main effects and a theory for the formulation of these effects. In Section 4, we introduce the main concept of interactions and their relationships and, in Section 5, we provide estimation and hypothesis testing for the unknown interactions.

2. Development of Grouping Two Way ANOVA Model

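Let $Y$ be the response variable representing the combined effects and let $X_1$ and $X_2$ be two variables representing, respectively, the exposures or doses of two chemicals. Let $A_1=(-\infty,a_1],\ A_2=(a_1,a_2],\ \ldots,\ A_m=(a_{m-1},\infty)$ and $B_1=(-\infty,b_1],\ B_2=(b_1,b_2],\ \ldots,\ B_\ell=(b_{\ell-1},\infty)$ be, respectively, the interval-type partitions of the spaces of $X_1$ and $X_2$, where the $a_j$'s and $b_g$'s are two known increasing sequences. We assume that we have observations

\[
\begin{pmatrix} y_1 \\ x_{11} \\ x_{21} \end{pmatrix},\ \ldots,\ \begin{pmatrix} y_n \\ x_{1n} \\ x_{2n} \end{pmatrix}.
\]

We can distribute the observations $y_1,\ldots,y_n$ into the rectangle sets $A_j\times B_g=\{(x_1,x_2): x_1\in A_j,\ x_2\in B_g\}$, which gives the distributed observations as follows:

\[
\begin{array}{c|cccc}
 & B_1 & B_2 & \cdots & B_\ell \\ \hline
A_1 & y_{11i},\ i=1,\ldots,n_{11} & y_{12i},\ i=1,\ldots,n_{12} & \cdots & y_{1\ell i},\ i=1,\ldots,n_{1\ell} \\
A_2 & y_{21i},\ i=1,\ldots,n_{21} & y_{22i},\ i=1,\ldots,n_{22} & \cdots & y_{2\ell i},\ i=1,\ldots,n_{2\ell} \\
\vdots & \vdots & \vdots & & \vdots \\
A_m & y_{m1i},\ i=1,\ldots,n_{m1} & y_{m2i},\ i=1,\ldots,n_{m2} & \cdots & y_{m\ell i},\ i=1,\ldots,n_{m\ell}
\end{array}
\]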

What is an appropriate definition of the conditional mean of $y$ on the rectangle level $A_j\times B_g$? We denote the joint probability density function (pdf) of $Y$, $X_1$ and $X_2$ by $f_{y,x_1,x_2}$ and the joint pdf of $X_1$ and $X_2$ by $f_{x_1,x_2}$. Further letting $f_{y|x_1,x_2}(y)$ be the conditional pdf of $y$ given $X_1=x_1$ and $X_2=x_2$, the conditional pdf of $y$ given $A_j\times B_g$, denoted by $f_{y|A_j\times B_g}(y)$, can be defined as the average of $f_{y|x_1,x_2}(y)$ with respect to the variables $X_1$ and $X_2$ on $A_j\times B_g$, i.e.,

\[
f_{y|A_j\times B_g}(y)=\int_{A_j\times B_g} f_{y|x_1,x_2}(y)\,\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}\,f_{x_1,x_2}(x_1,x_2)\,dx_1\,dx_2
\]

since $\frac{1}{P((X_1,X_2)\in A_j\times B_g)}f_{x_1,x_2}(x_1,x_2)$ is the truncated pdf of $X_1$ and $X_2$ on $A_j\times B_g$. However, it may be reformulated as

\[
f_{y|A_j\times B_g}(y)=E\big[f_{y|X_1,X_2}\,I\big((X_1,X_2)\in A_j\times B_g\big)\big]
=\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}\int_{A_j\times B_g} f_{y,x_1,x_2}(y,x_1,x_2)\,dx_1\,dx_2 .
\]

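The group mean (conditional mean) of the response variable $y$ given an interval level $A_j\times B_g$ is $\mu_{jg}=\int_{-\infty}^{\infty} y\,f_{y|A_j\times B_g}(y)\,dy$ for $j=1,\ldots,m$, $g=1,\ldots,\ell$. Furthermore, by defining the error variables as $\varepsilon_{jgi}=y_{jgi}-\mu_{jg}$, we may transform the bivariate sample into location models, which we call an interval grouping ANOVA model:

\[
\begin{array}{c|cccc}
 & B_1 & B_2 & \cdots & B_\ell \\ \hline
A_1 & y_{11i}=\mu_{11}+\varepsilon_{11i},\ i=1,\ldots,n_{11} & y_{12i}=\mu_{12}+\varepsilon_{12i},\ i=1,\ldots,n_{12} & \cdots & y_{1\ell i}=\mu_{1\ell}+\varepsilon_{1\ell i},\ i=1,\ldots,n_{1\ell} \\
A_2 & y_{21i}=\mu_{21}+\varepsilon_{21i},\ i=1,\ldots,n_{21} & y_{22i}=\mu_{22}+\varepsilon_{22i},\ i=1,\ldots,n_{22} & \cdots & y_{2\ell i}=\mu_{2\ell}+\varepsilon_{2\ell i},\ i=1,\ldots,n_{2\ell} \\
\vdots & \vdots & \vdots & & \vdots \\
A_m & y_{m1i}=\mu_{m1}+\varepsilon_{m1i},\ i=1,\ldots,n_{m1} & y_{m2i}=\mu_{m2}+\varepsilon_{m2i},\ i=1,\ldots,n_{m2} & \cdots & y_{m\ell i}=\mu_{m\ell}+\varepsilon_{m\ell i},\ i=1,\ldots,n_{m\ell}
\end{array}
\]

where $\varepsilon_{jg1},\ldots,\varepsilon_{jgn_{jg}}$ are iid random variables with zero means.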
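The distribution of observations into rectangle levels described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `level_index`, `group_observations` and `cell_means` are hypothetical, and the cut points are assumed to be supplied as sorted lists, with each interval closed on the right as in the partitions $A_j$ and $B_g$.

```python
from bisect import bisect_left


def level_index(x, cuts):
    """0-based index j of the interval (cuts[j-1], cuts[j]] that contains x."""
    # bisect_left returns the first position whose cut point is >= x,
    # so a value equal to a cut point falls in the lower interval.
    return bisect_left(cuts, x)


def group_observations(y, x1, x2, a_cuts, b_cuts):
    """Distribute responses into rectangle cells keyed by 1-based (j, g)."""
    cells = {}
    for yi, u, v in zip(y, x1, x2):
        key = (level_index(u, a_cuts) + 1, level_index(v, b_cuts) + 1)
        cells.setdefault(key, []).append(yi)
    return cells


def cell_means(cells):
    """Sample analogue of the group means, one value per non-empty cell."""
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}


# Hypothetical toy data: three levels for X1 and two levels for X2.
cells = group_observations(
    y=[1.0, 2.0, 3.0, 4.0],
    x1=[0.5, 1.0, 1.5, 3.0],
    x2=[0.0, 5.0, 0.0, 5.0],
    a_cuts=[1.0, 2.0],
    b_cuts=[2.5],
)
print(cell_means(cells))
```

Cells that receive no observations simply do not appear in the dictionary, which is one reason the choice of cut points matters in small samples.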

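Let us define the grand mean $\mu=\mu_{\cdot\cdot}=\frac{1}{m\ell}\sum_{g=1}^{\ell}\sum_{j=1}^{m}\mu_{jg}$ and the group means $\mu_{j\cdot}=\frac{1}{\ell}\sum_{g=1}^{\ell}\mu_{jg}$, $j=1,\ldots,m$, and $\mu_{\cdot g}=\frac{1}{m}\sum_{j=1}^{m}\mu_{jg}$, $g=1,\ldots,\ell$. The parameters of the classical ANOVA model are $\alpha_j=\mu_{j\cdot}-\mu$, $j=1,\ldots,m$, $\beta_g=\mu_{\cdot g}-\mu$, $g=1,\ldots,\ell$, and $\gamma_{jg}=\mu_{jg}-(\mu+\alpha_j+\beta_g)$, $j=1,\ldots,m$, $g=1,\ldots,\ell$, where we call the $\alpha_j$'s the row effects, the $\beta_g$'s the column effects and the $\gamma_{jg}$'s the interaction effects. The classical two way ANOVA technique applied to this interval grouping problem assumes the following ANOVA model

\[
y_{jgi}=\mu+\alpha_j+\beta_g+\gamma_{jg}+\varepsilon_{jgi},\quad i=1,\ldots,n_{jg}, \tag{2.1}
\]

where $\sum_{j=1}^{m}\alpha_j=\sum_{g=1}^{\ell}\beta_g=\sum_{j=1}^{m}\gamma_{jg}=\sum_{g=1}^{\ell}\gamma_{jg}=0$.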

Several comments can be drawn from this ANOVA model for analyzing the health effects caused by chemical mixtures:

(a) Model (2.1) is classically analyzed under the assumption of normality and constant variance for the error variables $\varepsilon_{jgi}$'s. However, this assumption has never been validated by theory.

(b) The fact that $\sum_{j=1}^{m}\gamma_{jg}=\sum_{g=1}^{\ell}\gamma_{jg}=0$ indicates that the term $\gamma_{jg}$ does not characterize the interaction effect at level $A_j\times B_g$, since its sign, positive or negative, is forced by the parametrization. Likewise, $\alpha_j$ and $\beta_g$ do not, respectively, represent the main effects of the chemical variables $X_1$ and $X_2$.

(c) Once we have observations of the exposure or dose of the chemicals from the environment, we are expected to estimate, or test hypotheses about, whether the interaction is synergistic or antagonistic at this level. However, this model does not allow us to achieve this aim.
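Comment (b) can be made concrete with a small numerical sketch. The code below applies the classical decomposition of model (2.1) to a hypothetical 2 by 2 table of cell means (the numbers are invented for illustration) and shows that the classical interaction terms are forced to sum to zero over rows and columns, so some of them must be negative whatever the underlying mechanism is. The function name `classical_effects` is an assumption of this sketch.

```python
def classical_effects(mu):
    """Classical two-way decomposition of a table of cell means mu[j][g]."""
    m, l = len(mu), len(mu[0])
    grand = sum(sum(row) for row in mu) / (m * l)
    # Row and column effects are deviations of marginal means from the grand mean.
    row = [sum(mu[j]) / l - grand for j in range(m)]
    col = [sum(mu[j][g] for j in range(m)) / m - grand for g in range(l)]
    # Interaction is what remains after removing grand mean, row and column effects.
    inter = [[mu[j][g] - (grand + row[j] + col[g]) for g in range(l)]
             for j in range(m)]
    return grand, row, col, inter


# Hypothetical cell means.
grand, row, col, inter = classical_effects([[1.0, 2.0], [3.0, 8.0]])
print(inter)
```

For this table the classical interactions come out as +1 and -1 in alternating cells, which illustrates why their signs cannot be read as synergism or antagonism at a given level.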

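3. Formulation of Grouping Individual Effects

Consider an experiment in an environment in which only one chemical affects the response variable, so that we may define its main effect. Assume that we have a response variable $Y$ and a chemical variable $X_1$ with joint distribution

\[
\begin{pmatrix} Y \\ X_1 \end{pmatrix}\sim N_2\left(\begin{pmatrix}\mu_y \\ \mu_1\end{pmatrix},\ \begin{pmatrix}\sigma_y^2 & \sigma_{1y} \\ \sigma_{y1} & \sigma_1^2\end{pmatrix}\right).
\]

The population mean of the response variable $Y$ on interval level $A_j$ is $\mu^a_j=E[Y\mid X_1\in A_j]$. We consider whether there are constants $\mu_{y1}$ and $b$ such that this population mean $\mu^a_j$ can be decomposed uniformly in $j$ as $\mu^a_j=\mu_{y1}+bE[X_1I(A_j)]$, where $E[X_1I(A_j)]$ denotes the mean of $X_1$ on the level $A_j$. We then call $\tau^a_j=bE[X_1I(A_j)]$ the main effect of chemical $X_1$ at interval level $A_j$, and the population mean decomposition is

\[
\mu^a_j=\mu_{y1}+\tau^a_j .
\]

The response variable $Y$ and the chemical variable $X_2$ have joint distribution

\[
\begin{pmatrix} Y \\ X_2 \end{pmatrix}\sim N_2\left(\begin{pmatrix}\mu_y \\ \mu_2\end{pmatrix},\ \begin{pmatrix}\sigma_y^2 & \sigma_{2y} \\ \sigma_{y2} & \sigma_2^2\end{pmatrix}\right).
\]

Similarly, if there is a decomposition of the interval level conditional mean $\mu^b_g=E[Y\mid X_2\in B_g]$ as

\[
\mu^b_g=\mu_{y2}+\tau^b_g
\]

with $\tau^b_g=dE[X_2I(B_g)]$ for some constant $d$, we call $\tau^b_g$ the main effect of chemical $X_2$ at interval level $B_g$.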

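Following the results derived in Chan et al. (2008), we have the following theorem.

Theorem 3.1.

With the normality assumption, we have the decomposition $\mu^a_j=\mu_{y1}+\tau^a_j$ with $\mu_{y1}=\mu_y-\rho_{y1}\frac{\sigma_y}{\sigma_1}\mu_1$ and main effects for the chemical variable $X_1$ given by

\[
\begin{aligned}
\tau^a_1 &= \rho_{y1}\frac{\sigma_y}{\sigma_1}\mu_1-\frac{\rho_{y1}\sigma_y}{\sqrt{2\pi}\,\Phi\!\left(\frac{a_1-\mu_1}{\sigma_1}\right)}\,e^{-\frac12\left(\frac{a_1-\mu_1}{\sigma_1}\right)^2},\\
\tau^a_j &= \rho_{y1}\frac{\sigma_y}{\sigma_1}\mu_1+\frac{\rho_{y1}\sigma_y}{\sqrt{2\pi}\left(\Phi\!\left(\frac{a_j-\mu_1}{\sigma_1}\right)-\Phi\!\left(\frac{a_{j-1}-\mu_1}{\sigma_1}\right)\right)}\left\{e^{-\frac12\left(\frac{a_{j-1}-\mu_1}{\sigma_1}\right)^2}-e^{-\frac12\left(\frac{a_j-\mu_1}{\sigma_1}\right)^2}\right\},\quad j=2,\ldots,m-1,\\
\tau^a_m &= \rho_{y1}\frac{\sigma_y}{\sigma_1}\mu_1+\frac{\rho_{y1}\sigma_y}{\sqrt{2\pi}\left(1-\Phi\!\left(\frac{a_{m-1}-\mu_1}{\sigma_1}\right)\right)}\,e^{-\frac12\left(\frac{a_{m-1}-\mu_1}{\sigma_1}\right)^2},
\end{aligned}
\tag{3.1}
\]

where $\rho_{y1}=\frac{\sigma_{y1}}{\sigma_y\sigma_1}$ is the correlation coefficient between $Y$ and $X_1$ and $\Phi$ is the distribution function of the standard normal distribution. On the other hand, we have the decomposition $\mu^b_g=\mu_{y2}+\tau^b_g$ with $\mu_{y2}=\mu_y-\rho_{y2}\frac{\sigma_y}{\sigma_2}\mu_2$ and main effects for the chemical variable $X_2$ given by

\[
\begin{aligned}
\tau^b_1 &= \rho_{y2}\frac{\sigma_y}{\sigma_2}\mu_2-\frac{\rho_{y2}\sigma_y}{\sqrt{2\pi}\,\Phi\!\left(\frac{b_1-\mu_2}{\sigma_2}\right)}\,e^{-\frac12\left(\frac{b_1-\mu_2}{\sigma_2}\right)^2},\\
\tau^b_g &= \rho_{y2}\frac{\sigma_y}{\sigma_2}\mu_2+\frac{\rho_{y2}\sigma_y}{\sqrt{2\pi}\left(\Phi\!\left(\frac{b_g-\mu_2}{\sigma_2}\right)-\Phi\!\left(\frac{b_{g-1}-\mu_2}{\sigma_2}\right)\right)}\left\{e^{-\frac12\left(\frac{b_{g-1}-\mu_2}{\sigma_2}\right)^2}-e^{-\frac12\left(\frac{b_g-\mu_2}{\sigma_2}\right)^2}\right\},\quad g=2,\ldots,\ell-1,\\
\tau^b_\ell &= \rho_{y2}\frac{\sigma_y}{\sigma_2}\mu_2+\frac{\rho_{y2}\sigma_y}{\sqrt{2\pi}\left(1-\Phi\!\left(\frac{b_{\ell-1}-\mu_2}{\sigma_2}\right)\right)}\,e^{-\frac12\left(\frac{b_{\ell-1}-\mu_2}{\sigma_2}\right)^2},
\end{aligned}
\tag{3.2}
\]

where $\rho_{y2}=\frac{\sigma_{y2}}{\sigma_y\sigma_2}$ is the correlation coefficient between $Y$ and $X_2$.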

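Let us consider an example to illustrate the main effects, where $Y$ and $X_1$ have a bivariate normal distribution with mean and covariance matrix, respectively,

\[
\mu=\begin{pmatrix}0\\5\end{pmatrix},\qquad \Sigma=\begin{pmatrix}1 & \sigma_{y1}\\ \sigma_{y1} & 1\end{pmatrix}.
\]

We also consider a three-level ANOVA with cutoff points $a_1=F^{-1}_{x_1}(1/3)$ and $a_2=F^{-1}_{x_1}(2/3)$. The corresponding main effects associated with $\sigma_{1y}$ are displayed in the following table.

Table 1.

Main effects

\[
\begin{array}{l|rrr}
 & \tau^a_1 & \tau^a_2 & \tau^a_3 \\ \hline
\sigma_{1y}=0.2 & 0.781 & 1 & 1.218 \\
\sigma_{1y}=-0.2 & -0.781 & -1 & -1.218 \\
\sigma_{1y}=0.4 & 1.563 & 2 & 2.436 \\
\sigma_{1y}=0.6 & 2.345 & 3 & 3.654
\end{array}
\]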
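The closed-form main effects of Theorem 3.1 can be evaluated numerically. The sketch below is one way to do so with Python's standard library, under the assumption that the main effect at level $A_j$ equals $\rho_{y1}\sigma_y/\sigma_1$ times the mean of $X_1$ truncated to $A_j$, which is the form the formulas in (3.1) take; the function name `main_effects` and its argument names are hypothetical.

```python
from statistics import NormalDist


def main_effects(cuts, mu1, sigma1, sigma_y, rho):
    """Main effects of one chemical on the levels defined by sorted cut points."""
    std = NormalDist()
    z = [(a - mu1) / sigma1 for a in cuts]
    # Standard normal pdf/cdf at the cut points, padded with their limits
    # at minus infinity and plus infinity.
    pdf = [0.0] + [std.pdf(v) for v in z] + [0.0]
    cdf = [0.0] + [std.cdf(v) for v in z] + [1.0]
    slope = rho * sigma_y / sigma1
    effects = []
    for j in range(len(cuts) + 1):
        # Mean of a normal variable truncated to the j-th interval.
        truncated_mean = mu1 + sigma1 * (pdf[j] - pdf[j + 1]) / (cdf[j + 1] - cdf[j])
        effects.append(slope * truncated_mean)
    return effects


# Setting of the example: mean of X1 is 5, unit variances, tercile cut points.
x1 = NormalDist(5.0, 1.0)
cuts = [x1.inv_cdf(1 / 3), x1.inv_cdf(2 / 3)]
for cov in (0.2, -0.2, 0.4, 0.6):
    print(cov, [round(e, 4) for e in main_effects(cuts, 5.0, 1.0, 1.0, cov)])
```

With unit variances the covariance and the correlation coincide, so the covariance values of the example can be passed directly as `rho`. The middle effect equals the covariance times the mean of $X_1$ because the middle tercile is symmetric about that mean.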

This example shows that the main effects may be all positive or all negative. We then have a theorem on one property of the main effects.

Theorem 3.2.

The group main effects $\tau^a_j$'s satisfy one of the following three orderings:

(a) $\tau^a_1=\tau^a_2=\cdots=\tau^a_m$ if $\rho_{y1}=0$,

(b) $\tau^a_1<\tau^a_2<\cdots<\tau^a_m$ if $\rho_{y1}>0$,

(c) $\tau^a_1>\tau^a_2>\cdots>\tau^a_m$ if $\rho_{y1}<0$.

The conclusions for the main effects $\tau^b_g$'s are similar.

The above theorem indicates one important property: showing monotone main effects is equivalent to showing that $\rho_{y1}$ is nonzero. This topic belongs to the restricted statistical inference discussed in Robertson et al. (1988), where likelihood ratio tests are the main techniques applied. However, the tests developed in the literature are not appropriate for the interval grouping ANOVA model, since the likelihood ratio tests require known or partially known variances, which is not the case in this framework.

With the established main effect formulations, we may define a new one way ANOVA model as

\[
\begin{array}{cccc}
A_1 & A_2 & \cdots & A_m \\ \hline
y_{1i}=\mu_{y1}+\tau^a_1+\varepsilon_{1i},\ i=1,\ldots,n_1 & y_{2i}=\mu_{y1}+\tau^a_2+\varepsilon_{2i},\ i=1,\ldots,n_2 & \cdots & y_{mi}=\mu_{y1}+\tau^a_m+\varepsilon_{mi},\ i=1,\ldots,n_m
\end{array}
\]

For the chemical variable $X_2$, the one way ANOVA model is

\[
\begin{array}{cccc}
B_1 & B_2 & \cdots & B_\ell \\ \hline
y_{1i}=\mu_{y2}+\tau^b_1+\varepsilon_{1i},\ i=1,\ldots,n_1 & y_{2i}=\mu_{y2}+\tau^b_2+\varepsilon_{2i},\ i=1,\ldots,n_2 & \cdots & y_{\ell i}=\mu_{y2}+\tau^b_\ell+\varepsilon_{\ell i},\ i=1,\ldots,n_\ell
\end{array}
\]

These one way ANOVA models are not identical to the classical one way ANOVA models, since their main effects are not restricted to have zero sums.

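4. Additive Effects Model and Additive with Interactions Effects Model

The aim in this section is to formulate the interaction effects in an ANOVA model. We assume that the response variable $Y$ and the two chemical variables $(X_1,X_2)$ are jointly normal as

\[
\begin{pmatrix} Y\\ X_1\\ X_2\end{pmatrix}\sim N_3\left(\begin{pmatrix}\mu_y\\ \mu_1\\ \mu_2\end{pmatrix},\ \begin{pmatrix}\sigma_y^2 & \sigma_{1y} & \sigma_{2y}\\ \sigma_{y1} & \sigma_1^2 & \sigma_{12}\\ \sigma_{y2} & \sigma_{21} & \sigma_2^2\end{pmatrix}\right) \tag{4.1}
\]

where $(\mu_y,\mu_1,\mu_2)'$ is the mean vector and the $3\times 3$ matrix is the covariance matrix.

Again, when the level $A_j\times B_g$ conditional mean $\mu_{jg}=E[Y\mid X_1\in A_j,\ X_2\in B_g]$ may be written as $a+g'E\left[\begin{pmatrix}X_1\\X_2\end{pmatrix}I(X_1\in A_j,\ X_2\in B_g)\right]$, we call

\[
\tau^{comb}_{jg}=g'E\begin{pmatrix}X_1I(A_j)\\ X_2I(B_g)\end{pmatrix}
\]

the level $A_j\times B_g$ combined effect for chemicals $X_1$ and $X_2$. We define the difference between the combined effect and the sum of the two main effects as the interaction, $\delta_{jg}=\tau^{comb}_{jg}-(\tau^a_j+\tau^b_g)$.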

Theorem 4.1.

With the normality assumption, we have $\mu_{jg}=\mu_{y12}+\tau^{comb}_{jg}$ with

\[
\mu_{y12}=\mu_y-(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{21} & \sigma_2^2\end{pmatrix}^{-1}\begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix}
\]

and

\[
\tau^{comb}_{jg}=\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}\,(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{21} & \sigma_2^2\end{pmatrix}^{-1}\begin{pmatrix}\int_{B_g}\int_{A_j}x_1f_{x_1,x_2}(x_1,x_2)\,dx_1dx_2\\[2pt] \int_{B_g}\int_{A_j}x_2f_{x_1,x_2}(x_1,x_2)\,dx_1dx_2\end{pmatrix}. \tag{4.2}
\]

With the above theorem, the response variables $y_{jgi}$ in the interval level $A_j\times B_g$ may be formulated into an additive effects model.

Definition 4.2.

We say that the response variable follows the two way ANOVA model if it may be written as

\[
y_{jgi}=\mu_{y12}+\tau^a_j+\tau^b_g+\delta_{jg}+\varepsilon_{jgi},\quad i=1,\ldots,n_{jg},
\]

with $\delta_{jg}=\tau^{comb}_{jg}-(\tau^a_j+\tau^b_g)$, where $\tau^a_j$ and $\tau^b_g$ are the main effects defined in (3.1) and (3.2) and $\delta_{jg}$ is called the interaction effect at interval level $A_j\times B_g$.

The combination of chemical variables $X_1$ and $X_2$ contributes to toxicity through the sum of the individual effects and the interaction effect. When $\delta_{jg}>0$ the interaction is characterized as synergistic, and when $\delta_{jg}<0$ the interaction is characterized as antagonistic.

The combined effects can be estimated from samples drawn from the natural environment. An interesting question is when the combined effect equals the sum of the two main effects, so that the ANOVA model is additive. Generally, the combination of chemical variables $X_1$ and $X_2$ contributes to toxicity $Y$ through a common mechanism of the sum of the individual effects and the interaction effect.

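Let us give an example to explain the interactions, where we consider the three dimensional normal distribution for $(Y,X_1,X_2)$ with mean and covariance matrix

\[
\mu=\begin{pmatrix}10\\5\\5\end{pmatrix}\quad\text{and}\quad \Sigma=\begin{pmatrix}1 & 0.5 & 0.5\\ 0.5 & 1 & \rho\\ 0.5 & \rho & 1\end{pmatrix}.
\]

Here we choose $\sigma_{y1}=\sigma_{y2}=0.5>0$ because the chemicals in our research are harmful to health as quantified by the variable $Y$. The interval levels are determined by $a_1=F^{-1}_{x_1}(1/3)$, $a_2=F^{-1}_{x_1}(2/3)$ and $b_1=F^{-1}_{x_2}(1/3)$, $b_2=F^{-1}_{x_2}(2/3)$. In the following table, we display the true interactions for these interval levels.

Table 2.

Interaction effects

\[
\begin{array}{c|rrr}
 & \rho=0.5 & \rho=-0.4 & \rho=0 \\ \hline
\delta_{11} & -1.383 & 2.777 & 0 \\
\delta_{12} & -1.453 & 3.072 & 0 \\
\delta_{13} & -1.670 & 3.344 & 0 \\
\delta_{21} & -1.449 & 3.069 & 0 \\
\delta_{22} & -1.664 & 3.301 & 0 \\
\delta_{23} & -1.871 & 3.627 & 0 \\
\delta_{31} & -1.663 & 3.329 & 0 \\
\delta_{32} & -1.871 & 3.640 & 0 \\
\delta_{33} & -1.961 & 3.798 & 0
\end{array}
\]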

Several comments can be made on the results displayed in Table 2:

(a) The interactions are antagonistic when $\rho$ is positive, synergistic when $\rho$ is negative, and the model is additive when $\rho$ is zero.

(b) The interactions have a monotone property, with

\[
\delta_{ij}<\delta_{i+1,j}\quad\text{and}\quad \delta_{ij}<\delta_{i,j+1}.
\]

This is interesting, but we have not been able to verify it theoretically.

(c) Whether the interaction is synergistic or antagonistic is determined by the sign of the correlation coefficient $\rho$ of the variables $X_1$ and $X_2$. The interactions are negative if $\rho>0$ and positive if $\rho<0$.
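The sign pattern in comments (a) and (c) can be checked with a small Monte Carlo sketch. It assumes the setting of the example (unit variances, both covariances with $Y$ equal to 0.5, mean 5 for both chemicals, tercile cut points) and approximates the interaction as the combined effect minus the sum of the two main effects, with each effect written as a weight times a conditional mean of the chemical variables. The function name `true_interactions`, the sample size and the seed are assumptions of this sketch, not the procedure used to produce Table 2.

```python
import random
from statistics import NormalDist


def true_interactions(rho, cov_yx=0.5, mu_x=5.0, n=200_000, seed=1):
    """Monte Carlo approximation of the 3x3 table of interactions."""
    rng = random.Random(seed)
    cuts = [NormalDist(mu_x, 1.0).inv_cdf(q) for q in (1 / 3, 2 / 3)]
    s = (1.0 - rho * rho) ** 0.5
    # Weight of each chemical in the combined effect: the inverse of the
    # covariance matrix of (X1, X2) times (cov_yx, cov_yx)', which reduces to
    # cov_yx / (1 + rho) for unit variances and correlation rho.
    g = cov_yx / (1.0 + rho)
    cell_sum = [[0.0] * 3 for _ in range(3)]
    cell_n = [[0] * 3 for _ in range(3)]
    row_sum, row_n = [0.0] * 3, [0] * 3
    col_sum, col_n = [0.0] * 3, [0] * 3
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = mu_x + z1
        x2 = mu_x + rho * z1 + s * z2
        j = (x1 > cuts[0]) + (x1 > cuts[1])  # level of X1: 0, 1 or 2
        k = (x2 > cuts[0]) + (x2 > cuts[1])  # level of X2: 0, 1 or 2
        cell_sum[j][k] += x1 + x2
        cell_n[j][k] += 1
        row_sum[j] += x1
        row_n[j] += 1
        col_sum[k] += x2
        col_n[k] += 1
    table = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        for k in range(3):
            combined = g * cell_sum[j][k] / cell_n[j][k]
            mains = cov_yx * (row_sum[j] / row_n[j] + col_sum[k] / col_n[k])
            table[j][k] = combined - mains
    return table


for rho in (0.5, -0.4, 0.0):
    print(rho, [[round(v, 2) for v in r] for r in true_interactions(rho)])
```

The response $Y$ never needs to be simulated here, because under normality the population effects depend only on the covariances and on conditional means of $X_1$ and $X_2$.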

For further investigation of the interactions, we consider the following design:

\[
\Sigma=\begin{pmatrix}1 & 0.2 & 0.2\\ 0.2 & 1 & \rho\\ 0.2 & \rho & 1\end{pmatrix},\qquad \mu=\begin{pmatrix}10\\5\\5\end{pmatrix},\qquad p_{jg}=\frac{\delta_{jg}}{\tau^a_j+\tau^b_g},
\]

where $p_{jg}$ measures the ratio between the interaction and the sum of the main effects. The true interactions and the interaction to main effects ratios are displayed in Table 3.

Table 3.

Interaction effects, with interaction to main effects ratios in parentheses

\[
\begin{array}{c|llll}
 & \rho=-0.2 & \rho=-0.4 & \rho=-0.6 & \rho=-0.8 \\ \hline
\delta_{11} & 0.416\ (0.266) & 1.144\ (0.732) & 2.584\ (1.653) & 7.120\ (4.553) \\
\delta_{12} & 0.443\ (0.249) & 1.218\ (0.684) & 2.756\ (1.547) & 7.515\ (4.217) \\
\delta_{13} & 0.493\ (0.246) & 1.335\ (0.667) & 3.008\ (1.504) & 7.992\ (3.996) \\
\delta_{21} & 0.460\ (0.258) & 1.223\ (0.686) & 2.766\ (1.552) & 7.579\ (4.253) \\
\delta_{22} & 0.497\ (0.248) & 1.325\ (0.662) & 3.016\ (1.508) & 7.960\ (3.980) \\
\delta_{23} & 0.553\ (0.249) & 1.455\ (0.656) & 3.234\ (1.458) & 8.448\ (3.808) \\
\delta_{31} & 0.507\ (0.253) & 1.334\ (0.667) & 3.006\ (1.503) & 8.030\ (4.015) \\
\delta_{32} & 0.556\ (0.250) & 1.442\ (0.650) & 3.202\ (1.443) & 8.443\ (3.806) \\
\delta_{33} & 0.570\ (0.234) & 1.513\ (0.621) & 3.379\ (1.386) & 8.979\ (3.685)
\end{array}
\]

Two comments can be drawn from the results shown in Table 3:

(a) The magnitude of the interaction increases when the magnitude of the correlation coefficient increases.

(b) The magnitude of the interaction to main effects ratio also increases when the magnitude of the correlation coefficient increases.

The additive model in this new ANOVA framework is defined in the following definition.

Definition 4.3.

A two way ANOVA model is additive if it may be written as

\[
y_{jgi}=\mu_{y12}+\tau^a_j+\tau^b_g+\varepsilon_{jgi},\quad j=1,\ldots,m,\ g=1,\ldots,\ell, \tag{3.3}
\]

where $i=1,\ldots,n_{jg}$.

in laboratories, but they do not appear in the natural environment unless there are no combined effects of the chemicals.

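Theorem 4.4.

Let us assume that $X_1$ and $X_2$ are uncorrelated, i.e., $\sigma_{12}=0$. We have

\[
\tau^{comb}_{jg}=\tau^a_j+\tau^b_g, \tag{4.3}
\]

indicating that $\delta_{jg}=0$ for all $(j,g)$'s, and the two way ANOVA model is additive with $\mu_{y12}=\mu_y-\frac{\sigma_{y1}}{\sigma_1^2}\mu_1-\frac{\sigma_{y2}}{\sigma_2^2}\mu_2$.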

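We conduct a simulation with 100000 replications from a normal distribution with mean and covariance matrix, respectively,

\[
\mu=\begin{pmatrix}0\\5\\5\end{pmatrix},\qquad \Sigma=\begin{pmatrix}1 & 0.2 & 0.2\\ 0.2 & 1 & -0.2\\ 0.2 & -0.2 & 1\end{pmatrix},
\]

and the levels are set by $a_1=\hat F^{-1}_{x_1}(0.3)$ and $b_1=\hat F^{-1}_{x_2}(0.3)$. Let the sample means and sample variances of $Y$, $X_1$, $X_2$ be denoted, respectively, by $\bar y,\bar x_1,\bar x_2$ and $S_y^2,S_1^2,S_2^2$. Also, let the sample correlation coefficients of $\{Y,X_1\}$ and $\{Y,X_2\}$ be denoted, respectively, by $r_{y1}$ and $r_{y2}$. The estimates are defined below:

\[
\begin{aligned}
\hat\mu_{y1}&=\bar y-r_{y1}\frac{S_y}{S_1}\bar x_1,\qquad \hat\mu_{y2}=\bar y-r_{y2}\frac{S_y}{S_2}\bar x_2,\\
\hat\mu^a_1&=\frac{\sum_{i=1}^{n}y_iI\big(-\infty<x_{1i}\le \hat F^{-1}_{x_1}(0.3)\big)}{\sum_{i=1}^{n}I\big(-\infty<x_{1i}\le \hat F^{-1}_{x_1}(0.3)\big)},\qquad
\hat\mu^a_2=\frac{\sum_{i=1}^{n}y_iI\big(\hat F^{-1}_{x_1}(0.3)<x_{1i}<\infty\big)}{\sum_{i=1}^{n}I\big(\hat F^{-1}_{x_1}(0.3)<x_{1i}<\infty\big)},\\
\hat\tau^a_j&=\hat\mu^a_j-\hat\mu_{y1},\ j=1,2,\qquad \hat\tau^b_g=\hat\mu^b_g-\hat\mu_{y2},\ g=1,2,\\
\hat\xi_{1jg}&=\frac1n\sum_{i=1}^{n}x_{1i}I(x_{1i}\in A_j,\ x_{2i}\in B_g),\qquad \hat\xi_{2jg}=\frac1n\sum_{i=1}^{n}x_{2i}I(x_{1i}\in A_j,\ x_{2i}\in B_g),\\
\hat\pi_{jg}&=\frac1n\sum_{i=1}^{n}I(x_{1i}\in A_j,\ x_{2i}\in B_g),\\
\hat\tau^{comb}_{jg}&=\frac{1}{\hat\pi_{jg}}(\hat\sigma_{y1},\hat\sigma_{y2})\begin{pmatrix}S_1^2 & \hat\sigma_{12}\\ \hat\sigma_{21} & S_2^2\end{pmatrix}^{-1}\begin{pmatrix}\hat\xi_{1jg}\\ \hat\xi_{2jg}\end{pmatrix},\\
\hat\delta_{jg}&=\hat\tau^{comb}_{jg}-(\hat\tau^a_j+\hat\tau^b_g),\\
MSE_{jg}&=\frac{1}{100000}\sum_{i=1}^{100000}(\hat\delta_{jgi}-\delta_{jg})^2 .
\end{aligned}
\]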

The simulated interaction estimates and the corresponding MSE's, $(\hat\delta_{jg},\ MSE_{jg})$, are displayed in Table 4.

Table 4.

Performance of interaction effect estimation: estimate (MSE)

\[
\begin{array}{l|llll}
\text{sample size} & \delta_{11}=0.4318 & \delta_{12}=0.4937 & \delta_{21}=0.4884 & \delta_{22}=0.5179 \\ \hline
n=30 & 0.4003\ (0.3526) & 0.4908\ (0.4191) & 0.4898\ (0.4187) & 0.5198\ (0.4352) \\
n=50 & 0.4157\ (0.1922) & 0.4889\ (0.2117) & 0.4879\ (0.2169) & 0.5197\ (0.2250) \\
n=100 & 0.4188\ (0.0878) & 0.4905\ (0.0985) & 0.4891\ (0.0985) & 0.5219\ (0.1016)
\end{array}
\]
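The plug-in estimator described by the formulas above can be sketched for the two-level case as follows. This is a hedged illustration rather than the code used for Table 4: the function names, the use of the unbiased sample covariance, the order-statistic definition of the sample quantile and the way the data are simulated are all assumptions of this sketch. It uses the identity $r_{y1}S_y/S_1=\hat\sigma_{y1}/S_1^2$, and it replaces $\hat\xi_{1jg}/\hat\pi_{jg}$ by the equivalent cell mean of $x_1$.

```python
import math
import random


def sample_cov(u, v):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def estimate_interactions(y, x1, x2, p=0.3):
    """Plug-in interaction estimates on the 2x2 levels cut at the p-quantiles."""
    n = len(y)
    k = max(0, math.ceil(p * n) - 1)
    a1, b1 = sorted(x1)[k], sorted(x2)[k]  # sample p-quantiles as cut points
    s11, s22, s12 = sample_cov(x1, x1), sample_cov(x2, x2), sample_cov(x1, x2)
    sy1, sy2 = sample_cov(y, x1), sample_cov(y, x2)
    ybar = sum(y) / n
    mu_y1 = ybar - sy1 / s11 * sum(x1) / n  # intercept estimate for X1
    mu_y2 = ybar - sy2 / s22 * sum(x2) / n  # intercept estimate for X2
    det = s11 * s22 - s12 ** 2
    w1 = (s22 * sy1 - s12 * sy2) / det  # weights: inverse covariance times (sy1, sy2)
    w2 = (s11 * sy2 - s12 * sy1) / det
    rows = [int(v > a1) for v in x1]
    cols = [int(v > b1) for v in x2]

    def avg(values, keep):
        picked = [v for v, flag in zip(values, keep) if flag]
        return sum(picked) / len(picked)  # fails if a level or cell is empty

    result = {}
    for j in (0, 1):
        tau_a = avg(y, [r == j for r in rows]) - mu_y1
        for g in (0, 1):
            tau_b = avg(y, [c == g for c in cols]) - mu_y2
            in_cell = [r == j and c == g for r, c in zip(rows, cols)]
            combined = w1 * avg(x1, in_cell) + w2 * avg(x2, in_cell)
            result[(j + 1, g + 1)] = combined - (tau_a + tau_b)
    return result


def simulate_sample(n, rho=-0.2, cov_yx=0.2, seed=0):
    """Draw (y, x1, x2) with unit variances, cov(y, x) = cov_yx, corr(x1, x2) = rho."""
    rng = random.Random(seed)
    b = cov_yx / (1 + rho)  # regression slope of y on each centred x
    sd_e = math.sqrt(1 - 2 * b * cov_yx)  # residual sd giving var(y) = 1
    s = math.sqrt(1 - rho ** 2)
    y, x1, x2 = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        u, v = z1, rho * z1 + s * z2
        x1.append(5 + u)
        x2.append(5 + v)
        y.append(b * (u + v) + rng.gauss(0, sd_e))
    return y, x1, x2


estimates = estimate_interactions(*simulate_sample(100_000))
print({cell: round(value, 3) for cell, value in estimates.items()})
```

With a large sample the four estimates should be positive and of the same order as the true values listed in the header of Table 4; for the small sample sizes in the table the estimates are much noisier, as the reported MSE's indicate.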

5. Detection of Interactions

The practical problem in interaction detection is that we have a data set of the variables $Y$, $X_1$ and $X_2$, and we want to detect whether the interaction at some interval level of $X_1$ and $X_2$ is positive or greater than some specified critical point. This can be answered by statistical inference on the unknown population interaction at that level, and it is very popular to address it through hypothesis testing. However, point estimation can also achieve this purpose.

The first thing we want to investigate is the efficiency of point estimation in detecting the existence of positive interactions. That is, we evaluate the probability of observing a positive interaction estimate when a positive interaction exists. With 100000 replications, we evaluate the power of observing a positive interaction by estimation, when the true interaction $\delta_{jg}$ is some value greater than zero, as

\[
\frac{1}{100000}\sum_{k=1}^{100000}I(\hat\delta_{jgk}>0\mid \delta_{jg}>0)
\]

for various situations of positive interactions, where $\hat\delta_{jgk}$ is the estimate at the $k$th replication. The simulation draws data from the following distribution

\[
\begin{pmatrix}y\\x_1\\x_2\end{pmatrix}\sim N\left(\begin{pmatrix}0\\5\\5\end{pmatrix},\ \begin{pmatrix}1 & 0.2 & 0.2\\ 0.2 & 1 & r\\ 0.2 & r & 1\end{pmatrix}\right).
\]

We consider an ANOVA model of two levels with cutoff points $a_1=\hat F^{-1}_{X_1}(0.3)$ and $b_1=\hat F^{-1}_{X_2}(0.3)$.

The simulated levels are displayed in Table 5, where $n$ represents the sample size and the true interactions are given in parentheses.

Table 5.

Confidence level performance

\[
\begin{array}{l|llll}
\text{sample size} & \delta_{11} & \delta_{12} & \delta_{21} & \delta_{22} \\ \hline
r=-0.2 & (0.4318) & (0.4937) & (0.4884) & (0.5179) \\
n=30 & 0.7587 & 0.7964 & 0.7966 & 0.8147 \\
n=50 & 0.8435 & 0.8784 & 0.8788 & 0.8968 \\
n=100 & 0.9391 & 0.9623 & 0.9621 & 0.9726 \\ \hline
r=-0.4 & (1.1366) & (1.3042) & (1.3179) & (1.3786) \\
n=30 & 0.9374 & 0.9557 & 0.9560 & 0.9625 \\
n=50 & 0.9850 & 0.9911 & 0.9905 & 0.9931 \\
n=100 & 0.9994 & 0.9998 & 0.9998 & 0.9998 \\ \hline
r=-0.6 & (2.5216) & (2.9593) & (2.9624) & (3.0788) \\
n=30 & 0.9938 & 0.9938 & 0.9939 & 0.9941 \\
n=50 & 0.9993 & 0.9994 & 0.9994 & 0.9994 \\
n=100 & 1 & 1 & 1 & 1
\end{array}
\]

Two comments can be drawn from the results in Table 5:

(a) For interpretation, the powers are 0.7587, 0.8435 and 0.9391, respectively, for sample sizes 30, 50 and 100 when $\delta_{11}=0.4318$ with $r=-0.2$. The power values are all more than 0.75 when the true interaction value is 0.43 or more.

(b) The power increases when the sample size is larger. This meets our expectation.

A more interesting question is whether the interaction estimate is higher than a critical point, say 0.5, when the true interaction is some value $c$ greater than 0.5. This can be evaluated by the following index,

\[
\frac{1}{100000}\sum_{k=1}^{100000}I(\hat\delta_{jgk}>0.5\mid \delta_{jg}=c).
\]

We display the simulated results in the following table, where the true values $c$ are given in parentheses.

\[
\begin{array}{l|llll}
\text{sample size} & \delta_{11} & \delta_{12} & \delta_{21} & \delta_{22} \\ \hline
r=-0.27\ (c) & (0.6240) & (0.7180) & (0.7355) & (0.7688) \\
n=30 & 0.5097 & 0.5743 & 0.5752 & 0.5925 \\
n=50 & 0.5550 & 0.6289 & 0.6296 & 0.6567 \\
n=100 & 0.6212 & 0.7219 & 0.7236 & 0.7623 \\
n=500 & 0.8003 & 0.9363 & 0.9358 & 0.9664 \\ \hline
r=-0.33\ (c) & (0.8521) & (0.9587) & (0.9771) & (1.0175) \\
n=30 & 0.6244 & 0.6951 & 0.6966 & 0.7140 \\
n=50 & 0.7049 & 0.7782 & 0.7782 & 0.8028 \\
n=100 & 0.8159 & 0.8908 & 0.8912 & 0.9161 \\
n=500 & 0.9861 & 0.9986 & 0.9986 & 0.9996 \\ \hline
r=-0.43\ (c) & (1.2864) & (1.4728) & (1.4949) & (1.5593) \\
n=30 & 0.8022 & 0.8072 & 0.8574 & 0.8724 \\
n=50 & 0.8951 & 0.9350 & 0.9349 & 0.9457 \\
n=100 & 0.9742 & 0.9901 & 0.9896 & 0.9932 \\
n=500 & 1 & 1 & 1 & 1
\end{array}
\]

The simulated results show that the estimation technique is satisfactory for observing that the interaction estimate reaches the risk point.

Next, we consider testing the hypothesis $H_0:\delta_{jg}=\delta_0$ vs $H_1:\delta_{jg}>\delta_0$. Suppose that we have observations $(y_i,x_{1i},x_{2i})'$ from the normal distribution (4.1). We consider a test for this hypothesis that rejects $H_0$ if

\[
\frac{\hat\delta_{jg}-\delta_0}{s_{jg}}\ge h_\alpha
\]

where $\hat\delta_{jg}$ is an estimate of $\delta_{jg}$, $s_{jg}$ is a scale estimate of $\hat\delta_{jg}$ used for standardization, and $h_\alpha$ is the level $\alpha$ critical point. To evaluate the power performance, we repeat this data generation $m$ times and obtain the corresponding estimates $\hat\delta^c_{jg}$ and $s^c_{jg}$, $c=1,\ldots,m$; the power is estimated as

\[
p_\alpha=\frac1m\sum_{c=1}^{m}I\left(\frac{\hat\delta^c_{jg}-\delta_0}{s^c_{jg}}\ge h_\alpha\right). \tag{5.2}
\]

We then need to decide the scale estimate $s_{jg}$ and the level $\alpha$ critical point $h_\alpha$. The distribution theory of the estimator of the interaction $\delta_{jg}$ does not support using the normal distribution to construct $h_\alpha$.

(a) We resample $k=1000$ times from this data set and compute the resulting estimates $\hat\delta^c_{jg}$.

(b) The scale parameter estimate is defined as $s^2_{jg}=\frac{1}{1000}\sum_{c=1}^{1000}(\hat\delta^c_{jg}-\delta_{jg})^2$.

(c) The level $\alpha$ critical point $h_\alpha$ is defined as the $k\cdot 100(1-\alpha)\%$ order statistic of $\frac{\hat\delta^c_{jg}-\delta_0}{s_{jg}}$.

(d) We draw 2000 samples from (4.1) and denote the interaction estimates by $\hat\delta^c_{jg}$, $c=1,\ldots,2000$. The simulated power is

\[
p_\alpha=\frac{1}{2000}\sum_{c=1}^{2000}I\left(\frac{\hat\delta^c_{jg}-\delta_0}{s^c_{jg}}\ge h_\alpha\right).
\]
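Steps (a) to (c) follow a generic resampling pattern that can be sketched independently of the interaction estimator. The code below is only an illustration under stated assumptions: resampling is done with replacement from the observed data, the scale estimate is centred at the estimate from the original sample (the text does not make the centring explicit), and the estimator is passed in as an arbitrary function, with the sample mean used in the usage example purely as a stand-in. The names `resampling_scale_and_critical` and `reject_null` are hypothetical.

```python
import random


def resampling_scale_and_critical(data, estimator, theta0, k=1000, alpha=0.05, seed=0):
    """Resampling-based scale estimate s and critical point h for a one-sided test."""
    rng = random.Random(seed)
    n = len(data)
    # (a) resample k times with replacement and recompute the estimate each time
    boot = [estimator([data[rng.randrange(n)] for _ in range(n)]) for _ in range(k)]
    # (b) scale estimate; centring at the original-sample estimate is an assumption
    centre = estimator(data)
    s = (sum((b - centre) ** 2 for b in boot) / k) ** 0.5
    # (c) critical point: upper (1 - alpha) order statistic of the standardized values
    standardized = sorted((b - theta0) / s for b in boot)
    h = standardized[min(k - 1, int(k * (1 - alpha)))]
    return s, h


def reject_null(data, estimator, theta0, s, h):
    """Reject H0: theta = theta0 in favour of theta > theta0 when the statistic reaches h."""
    return (estimator(data) - theta0) / s >= h


def sample_mean(values):
    return sum(values) / len(values)


# Stand-in usage with the sample mean of hypothetical N(0, 1) data.
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(400)]
s, h = resampling_scale_and_critical(data, sample_mean, theta0=0.0)
print(round(s, 4), round(h, 4))
```

To mirror step (d), one would wrap this in an outer loop over freshly simulated data sets and average the rejection indicators; replacing `sample_mean` with an interaction estimator requires each element of `data` to be an observation triple.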

Unfortunately, there is no fixed value $h_\alpha$ that makes the probabilities of type I error equal for different sample sizes. Hence, we search for $h_\alpha$ for each sample size $n$ so that the level of significance is fixed at 0.05, and then we evaluate the powers when $H_1$ is true for some given values of $\delta_{jg}$. The simulated results are displayed in Table 7, where the true value is $\delta_0=0.49778$.

Table 7.

Power performance when the significance level is fixed

\[
\begin{array}{l|lllll}
\text{sample size} & H_0 & \delta_{jg}=0.8626 & \delta_{jg}=1.3255 & \delta_{jg}=2.0048 & \delta_{jg}=3.0169 \\ \hline
n=50 & 0.0512\ (h_\alpha=1.2815) & 0.154 & 0.3796 & 0.656 & 0.8845 \\
n=100 & 0.049\ (h_\alpha=1.3105) & 0.244 & 0.635 & 0.918 & 0.995 \\
n=500 & 0.049\ (h_\alpha=1.645) & 0.623 & 0.995 & 1 & 1
\end{array}
\]

The results are not very satisfactory, but this is a first step in developing interaction detection for an ANOVA-like model.

6. ANOVA Analysis for Unknown Quantiles

It is desirable to develop a large sample theory for the estimator of the interactions so that we may construct distribution based tests for hypotheses on the interactions. We have tried, but this task is difficult to accomplish. In the following, we display a result on the asymptotic distribution of the group means that will help in deriving asymptotic distributions of the main effects.

In practice the quantile functions $F^{-1}_x(\lambda_j)$'s are unknown, and then the interval levels $A_j$'s are unknown as well. We assume that a random sample

\[
\begin{pmatrix}Y_1\\X_1\end{pmatrix},\ \ldots,\ \begin{pmatrix}Y_N\\X_N\end{pmatrix}
\]

is available from this distribution. The cutoff points are generally defined as quantiles of the observations of the grouping variable $X$, giving the monotone and disjoint interval levels

\[
\hat A_0=(-\infty,\hat F^{-1}_x(\lambda_1)],\quad \hat A_1=(\hat F^{-1}_x(\lambda_1),\hat F^{-1}_x(\lambda_2)],\quad\ldots,\quad \hat A_k=(\hat F^{-1}_x(\lambda_k),\infty). \tag{6.1}
\]

By letting

\[
\bar Y=\frac{\sum_{i=1}^{n}Y_i}{n},\qquad \bar Y_j=\frac{\sum_{i=1}^{n}Y_iI\big(\hat F^{-1}_x(\lambda_j)\le X_i\le \hat F^{-1}_x(\lambda_{j+1})\big)}{\sum_{i=1}^{n}I\big(\hat F^{-1}_x(\lambda_j)\le X_i\le \hat F^{-1}_x(\lambda_{j+1})\big)},\quad j=0,1,\ldots,k,
\]

we define the two parameter estimates $\hat\mu_y=\bar Y$ and $\hat\mu^a_j=\bar Y_j$, and the main effect estimate is set as

\[
\hat\tau^a_j=\bar Y_j-\hat\mu_y. \tag{6.2}
\]
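The quantile-based grouping in (6.1) and the estimate (6.2) can be sketched as follows. The sample quantile is taken here as an order statistic and each level is closed on the right, both of which are assumptions of this sketch (the text does not fix a quantile convention); the function names are hypothetical.

```python
import math


def empirical_quantile(values, p):
    """Order-statistic version of the sample p-quantile."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]


def main_effect_estimates(y, x, probs):
    """Group means of y on levels cut at sample quantiles of x, minus the grand mean."""
    cuts = [empirical_quantile(x, p) for p in probs]
    groups = [[] for _ in range(len(cuts) + 1)]
    for yi, xi in zip(y, x):
        # Level index = number of cut points strictly below xi, so a value equal
        # to a cut point falls in the lower level.
        groups[sum(xi > c for c in cuts)].append(yi)
    grand_mean = sum(y) / len(y)
    # Assumes every level is non-empty; an empty level would divide by zero.
    return [sum(g) / len(g) - grand_mean for g in groups]


# Hypothetical toy data with y = 2x and a single cut at the sample median.
x = list(range(1, 11))
y = [2.0 * v for v in x]
print(main_effect_estimates(y, x, [0.5]))  # prints [-5.0, 5.0]
```

Because the estimates are deviations of group means from the overall sample mean, they are weighted to sum to zero across levels, unlike the population main effects of Section 3, which are centred at $\mu_{y1}$.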

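The following theorem provides a step toward constructing tests for the main effects.

Theorem 6.1.

\[
\sqrt n(\bar Y_j-A_j)=\frac{1}{\lambda_{j+1}-\lambda_j}\,n^{-1/2}\sum_{i=1}^{n}\big(\psi_j(Y_i,X_i)-E[\psi_j(Y,X)]\big)+o_p(1)
\]

with

\[
\psi_j(Y,X)=\begin{cases}
E\big(Y-\mu_y\mid F^{-1}_X(\lambda_j)\big) & \text{if } X\le F^{-1}_X(\lambda_j),\\
Y-\mu_y & \text{if } F^{-1}_X(\lambda_j)<X<F^{-1}_X(\lambda_{j+1}),\\
E\big(Y-\mu_y\mid F^{-1}_X(\lambda_{j+1})\big) & \text{if } X\ge F^{-1}_X(\lambda_{j+1}),
\end{cases}
\]

and centering constant $A_j=\mu_y+\frac{1}{\lambda_{j+1}-\lambda_j}E\big[(Y-\mu_y)I\big(F^{-1}_X(\lambda_j)\le X\le F^{-1}_X(\lambda_{j+1})\big)\big]$.

Corollary 6.2.

$\sqrt n(\bar Y_j-A_j)$ is asymptotically normal with distribution $N(0,\sigma^2(\lambda_j,\lambda_{j+1}))$, where

\[
\begin{aligned}
\sigma^2(\lambda_j,\lambda_{j+1})=\frac{1}{(\lambda_{j+1}-\lambda_j)^2}\Big\{&\lambda_j\big[E\big(Y-\mu_y\mid F^{-1}_X(\lambda_j)\big)\big]^2+(1-\lambda_{j+1})\big[E\big(Y-\mu_y\mid F^{-1}_X(\lambda_{j+1})\big)\big]^2\\
&+E\big[(Y-\mu_y)^2I\big(F^{-1}_X(\lambda_j)\le X\le F^{-1}_X(\lambda_{j+1})\big)\big]\\
&-\Big(\lambda_jE\big[Y-\mu_y\mid F^{-1}_X(\lambda_j)\big]+(1-\lambda_{j+1})E\big[Y-\mu_y\mid F^{-1}_X(\lambda_{j+1})\big]\\
&\qquad+E\big[(Y-\mu_y)I\big(F^{-1}_X(\lambda_j)\le X\le F^{-1}_X(\lambda_{j+1})\big)\big]\Big)^2\Big\}.
\end{aligned}
\]

From this theory, we may expect the main effect estimators to be asymptotically normal, which helps in constructing tests for hypotheses on the main effects. However, in our attempts we found that the estimators of the interactions behave like products of two correlated normal variables, so that we were unable to develop their asymptotic distributions, since the correlations are too complicated.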

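7. Appendix

The following proof is rewritten from Chan et al. (2008).

Proof of Theorem 3.1:

From the well known property $E(y\mid x_1)=\mu_y+\rho_{y1}\frac{\sigma_y}{\sigma_1}(x_1-\mu_1)$, where $x_1$ is a given value, we have

\[
\begin{aligned}
\mu^a_1&=\int_{-\infty}^{\infty}yf_{y|A_1}(y)\,dy=\int_{-\infty}^{\infty}y\,\frac{1}{P(X_1\le a_1)}\int_{-\infty}^{a_1}f(y,x_1)\,dx_1\,dy\\
&=\frac{1}{P(X_1\le a_1)}\int_{-\infty}^{a_1}\Big[\int_{-\infty}^{\infty}yf(y\mid x_1)\,dy\Big]f_{x_1}(x_1)\,dx_1\\
&=\frac{1}{P(X_1\le a_1)}\int_{-\infty}^{a_1}\Big[\mu_y+\rho_{y1}\frac{\sigma_y}{\sigma_1}(x_1-\mu_1)\Big]f_{x_1}(x_1)\,dx_1\\
&=\frac{1}{P(X_1\le a_1)}\Big\{\mu_yP(X_1\le a_1)+\rho_{y1}\frac{\sigma_y}{\sigma_1}\Big[\int_{-\infty}^{a_1}x_1f_{x_1}(x_1)\,dx_1-\mu_1P(X_1\le a_1)\Big]\Big\}\\
&=\mu_{y1}+\rho_{y1}\frac{\sigma_y}{\sigma_1}\mu_1-\frac{1}{\Phi\!\left(\frac{a_1-\mu_1}{\sigma_1}\right)}\frac{\rho_{y1}\sigma_y}{\sqrt{2\pi}}\,e^{-\frac12\left(\frac{a_1-\mu_1}{\sigma_1}\right)^2}
\end{aligned}
\]

from the fact that $\int_{-\infty}^{a_1}xf_x(x)\,dx=\mu_xP(x\le a_1)-\frac{\sigma_x}{\sqrt{2\pi}}e^{-\frac12\left(\frac{a_1-\mu_x}{\sigma_x}\right)^2}$. The other $\mu^a_j$'s may be derived analogously and are skipped. We note here that the main effects appearing in this proof may also be represented as

\[
\tau^a_j=\frac{\sigma_{y1}}{\sigma_1^2}\,\frac{\int_{A_j}x_1f_1(x_1)\,dx_1}{P(X_1\in A_j)}\quad\text{and}\quad \tau^b_g=\frac{\sigma_{y2}}{\sigma_2^2}\,\frac{\int_{B_g}x_2f_2(x_2)\,dx_2}{P(X_2\in B_g)}.
\]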

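Proof of Theorem 3.2:

Next, from the proof of Theorem 3.1, we may see that the main effects for this interval grouping ANOVA model have an alternative form that can be expressed as follows:

\[
\begin{aligned}
\tau^a_1&=\rho_{y1}\frac{\sigma_y}{\sigma_1}\int_{A_1}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_1)}\,dx_1,\\
\tau^a_2&=\rho_{y1}\frac{\sigma_y}{\sigma_1}\int_{A_2}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_2)}\,dx_1,\\
&\ \ \vdots\\
\tau^a_{m-1}&=\rho_{y1}\frac{\sigma_y}{\sigma_1}\int_{A_{m-1}}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_{m-1})}\,dx_1,\\
\tau^a_m&=\rho_{y1}\frac{\sigma_y}{\sigma_1}\int_{A_m}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_m)}\,dx_1,
\end{aligned}
\]

where $A_1,\ldots,A_m$ is a monotone sequence of intervals forming a partition of the support of the grouping variable. Since $\frac{f_{x_1}(x_1)}{P(X_1\in A_j)}$ is a truncated pdf on the space $A_j$, we have

\[
\int_{A_1}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_1)}\,dx_1<\int_{A_2}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_2)}\,dx_1<\cdots<\int_{A_m}x_1\frac{f_{x_1}(x_1)}{P(X_1\in A_m)}\,dx_1
\]

and the possible values of $\rho_{y1}$ must fall in one of the three sets

\[
[-1,0),\quad \{0\}\quad\text{or}\quad (0,1],
\]

which leads to the theorem.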

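Proof of Theorem 4.1:

\[
\begin{aligned}
\mu_{jg}&=\int_{-\infty}^{\infty}yf_{y|A_j\times B_g}(y)\,dy\\
&=\int_{-\infty}^{\infty}y\,\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}\int_{B_g}\int_{A_j}f(y,x_1,x_2)\,dx_1dx_2\,dy\\
&=\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}\int_{B_g}\int_{A_j}\Big[\int_{-\infty}^{\infty}yf(y\mid x_1,x_2)\,dy\Big]f_{x_1,x_2}(x_1,x_2)\,dx_1dx_2\\
&=\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}\int_{B_g}\int_{A_j}\Big[\mu_y+(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{21} & \sigma_2^2\end{pmatrix}^{-1}\Big(\begin{pmatrix}x_1\\x_2\end{pmatrix}-\begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix}\Big)\Big]f_{x_1,x_2}(x_1,x_2)\,dx_1dx_2\\
&=\mu_{y12}+\frac{1}{P\big((X_1,X_2)\in A_j\times B_g\big)}(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{21} & \sigma_2^2\end{pmatrix}^{-1}\begin{pmatrix}\int_{B_g}\int_{A_j}x_1f_{x_1,x_2}(x_1,x_2)\,dx_1dx_2\\[2pt] \int_{B_g}\int_{A_j}x_2f_{x_1,x_2}(x_1,x_2)\,dx_1dx_2\end{pmatrix}
\end{aligned}
\]

which leads to the result in Theorem 4.1.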

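Proof of Theorem 4.4:

Assuming that $X_1$ and $X_2$ are uncorrelated, they are independent under the normality assumption, so the group mean may be expressed as follows:

\[
\begin{aligned}
\mu_{jg}&=\mu_y+\frac{1}{P(X_1\in A_j)P(X_2\in B_g)}(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\sigma_1^2 & 0\\ 0 & \sigma_2^2\end{pmatrix}^{-1}\begin{pmatrix}\int_{A_j}\int_{B_g}x_1f_1(x_1)f_2(x_2)\,dx_1dx_2\\[2pt] \int_{A_j}\int_{B_g}x_2f_1(x_1)f_2(x_2)\,dx_1dx_2\end{pmatrix}-(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\sigma_1^2 & 0\\ 0 & \sigma_2^2\end{pmatrix}^{-1}\begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix}\\
&=\mu_y+\frac{1}{P(X_1\in A_j)P(X_2\in B_g)}(\sigma_{y1},\sigma_{y2})\begin{pmatrix}\frac{1}{\sigma_1^2}P(X_2\in B_g)\int_{A_j}x_1f_1(x_1)\,dx_1\\[2pt] \frac{1}{\sigma_2^2}P(X_1\in A_j)\int_{B_g}x_2f_2(x_2)\,dx_2\end{pmatrix}-\Big[\frac{\sigma_{y1}}{\sigma_1^2}\mu_1+\frac{\sigma_{y2}}{\sigma_2^2}\mu_2\Big]\\
&=\mu_y+\frac{\sigma_{y1}}{\sigma_1^2}\Big(\frac{\int_{A_j}x_1f_1(x_1)\,dx_1}{P(X_1\in A_j)}-\mu_1\Big)+\frac{\sigma_{y2}}{\sigma_2^2}\Big(\frac{\int_{B_g}x_2f_2(x_2)\,dx_2}{P(X_2\in B_g)}-\mu_2\Big)\\
&=\mu_{y12}+\tau^a_j+\tau^b_g .
\end{aligned}
\]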

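Proof of Theorem 6.1.

The sample group mean may be formulated as

\[
\bar Y_j=\mu_y+\frac{\sum_{i=1}^{n}(Y_i-\mu_y)I\big(\hat F^{-1}_x(\lambda_j)\le X_i\le \hat F^{-1}_x(\lambda_{j+1})\big)}{\sum_{i=1}^{n}I\big(\hat F^{-1}_x(\lambda_j)\le X_i\le \hat F^{-1}_x(\lambda_{j+1})\big)}.
\]

The trimmed mean $\bar Y_j$ may be rewritten as

\[
\begin{aligned}
\sqrt n(\bar Y_j-\mu_y)=\Big[n^{-1}\sum_{i=1}^{n}I\big(\hat F^{-1}_x(\lambda_j)\le X_i\le \hat F^{-1}_x(\lambda_{j+1})\big)\Big]^{-1}\Big[&n^{-1/2}\sum_{i=1}^{n}(Y_i-\mu_y)\big(I(X_i\le \hat F^{-1}_x(\lambda_{j+1}))-I(X_i\le F^{-1}_x(\lambda_{j+1}))\big)\\
&-n^{-1/2}\sum_{i=1}^{n}(Y_i-\mu_y)\big(I(X_i\le \hat F^{-1}_x(\lambda_j))-I(X_i\le F^{-1}_x(\lambda_j))\big)\\
&+n^{-1/2}\sum_{i=1}^{n}(Y_i-\mu_y)I\big(F^{-1}_x(\lambda_j)\le X_i\le F^{-1}_x(\lambda_{j+1})\big)\Big].
\end{aligned}
\tag{7.1}
\]

By letting $T_x=\sqrt n\big(\hat F^{-1}_x(\lambda)-F^{-1}_x(\lambda)\big)$, we see that $I\big(X_i\le \hat F^{-1}_x(\lambda)\big)=I\big(X_i\le F^{-1}_X(\lambda)+n^{-1/2}T_x\big)$. We have

\[
n^{-1/2}\sum_{i=1}^{n}(Y_i-\mu_y)\big[I\big(X_i\le F^{-1}_X(\lambda)+n^{-1/2}T_n\big)-I\big(X_i\le F^{-1}_X(\lambda)\big)\big]=E\big(Y-\mu_y\mid F^{-1}_X(\lambda)\big)f_X\big(F^{-1}_X(\lambda)\big)T_n+o_p(1) \tag{7.2}
\]

for any sequence $T_n=O_p(1)$, and

\[
\sqrt n\big(\hat F^{-1}_x(\lambda)-F^{-1}_x(\lambda)\big)=f_X^{-1}\big(F^{-1}_X(\lambda)\big)\,n^{-1/2}\sum_{i=1}^{n}\big(\lambda-I(X_i\le F^{-1}_X(\lambda))\big)+o_p(1). \tag{7.3}
\]

Moreover, we also have

\[
n^{-1}\sum_{i=1}^{n}I\big(\hat F^{-1}_x(\lambda_j)\le X_i\le \hat F^{-1}_x(\lambda_{j+1})\big)=\lambda_{j+1}-\lambda_j+o_p(1). \tag{7.4}
\]

Substituting the results in (7.2)-(7.3) into (7.1), we have the theorem.

REFERENCES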

Berenbaum, M. C. (1981). Criteria for analyzing interactions between biologically active agents. Advances in Cancer Research, 35, 269-335.

Chan, W., Chen, L.-A. and Han, Y. (2008). Interval grouping analysis of variance model. Submitted for possible publication.

Charles, G. D., Gennings, C., Zacharewski, T. R., Gollapudi, B. B. and Carney, E. W. (2002). An approach for assessing estrogen receptor-mediated interactions in mixtures of three chemicals: a pilot study. Toxicological Sciences, 68, 349-360.

Ei-Masri, H. A., Reardon, K. F. and Yang, R. S. H. (1997). Integrated approaches for the analysis of toxicologic interactions of chemical mixtures. Critical Reviews in Toxicology, 27, 175-197.

Mumtaz, M. M., De Rosa, C. T., Groten, J., Feron, V. J., Hansen, H. and Durkin, P. R. (1998). Estimation of toxicity of chemical mixtures through modeling of chemical interactions. Environmental Health Perspectives, 106, 1353-1360.

Rider, C. V. and LeBlanc, G. A. (2005). An integrated addition and interaction model for assessing toxicity of chemical mixtures. Toxicological Sciences, 87, 520-528.

Robertson, T., Dykstra, R. L. and Wright, F. T. (1988). Order Restricted Statistical Inference. Barnes & Noble.
