A General Criterion for Factorial Designs Under Model Uncertainty

Academic year: 2022

Pi-Wen Tsai

Department of Mathematics National Taiwan Normal University

Taipei 116, Taiwan, R.O.C.

E-mail: pwtsai@math.ntnu.edu.tw

Steven G. Gilmour

Queen Mary University of London School of Mathematical Sciences

Mile End Road London E1 4NS, UK E-mail: s.g.gilmour@qmul.ac.uk

Abstract

Motivated by two industrial experiments in which rather extreme prior knowledge was used to choose the design, we show that the QB criterion, which aims to improve the estimation in as many models as possible by incorporating experimenters’ prior knowledge along with an approximation to the $A_s$ criterion, is more general and has a better statistical interpretation than many standard criteria. The generalization and application of the criterion to different types of designs are presented. The relationships between QB and other criteria for different situations are explored. It is shown that the $E(s^2)$ criterion is a special case of QB and several aberration-type criteria are limiting cases of our criterion, so that QB provides a bridge between alphabetic optimality and aberration. The two case studies illustrate the potential benefits of the QB criterion.

R programs for calculating QB are available online as supplemental materials.

Keywords: aberration, generalized minimum aberration, model robust, design optimality criterion, projection efficiency, supersaturated design.

1 INTRODUCTION

Many experiments in industrial research and development involve studying the effects of several factors on one or more responses. Multifactor designs of various types form one of the main contributions of statistics to industrial research. The broad class of multifactor designs in common use includes response surface designs, regular and irregular fractional factorials with two or three levels and supersaturated designs. The different types of design are defined largely by the models expected to be useful for analyzing the resulting data.

The models might be polynomial models for quantitative factors, or involve factorial effects for qualitative factors. The appropriate polynomial or factorial model might be of first, second or higher order, or of mixed order, e.g. including some, but not all, second order terms.

Another important distinction that can be made is between designs for situations in which the model is known in advance and those for situations of model uncertainty. In practice, the model is almost never known for certain in advance, but “the model is known” should be interpreted as meaning that, when designing the experiment, only one model is considered. Designs for a single model are often based on optimal design criteria, such as D-optimality or A-optimality. These criteria are very general in that they can be used whether the factors are qualitative or quantitative, however many levels the factors have and whatever the order of the model. These so-called alphabetic optimality criteria and their applications are well summarized by Atkinson et al. (2007). Extensions to situations of model uncertainty have been developed, e.g. Läuter (1974) suggested a weighted average, over several models, of a standard criterion and Zhao et al. (2003) gave this a Bayesian interpretation when using the A criterion. Heredia-Langner et al. (2004) developed an algorithm for constructing designs which are optimal averaged across several candidate models, while Ozol-Godfrey et al. (2005) studied the prediction variance properties of designs under different models. However, these papers concentrate on small numbers of alternative models, rather than the hundreds of models which can be considered for multifactor designs.

In the case of design under model uncertainty, different criteria are used in different situations. Two-level supersaturated designs are chosen using the $E(s^2)$ criterion, three-level supersaturated designs using Ave($\chi^2$), regular fractional factorials using resolution and aberration, irregular fractions using various generalized aberration criteria and response surface designs using near-orthogonality. Sometimes described as model-free, these criteria consider the estimation of main effects and interactions or polynomial effects of different orders and so implicitly assume that the model to be fitted will be one of the many possible submodels of the model containing all of these factorial effects.

In this paper we study a general criterion that can be used in any of the above situations for design under model uncertainty. The QB criterion requires the definition of a maximal model of interest (which need not be estimable from the experiment), and we assume that the models fitted will be submodels of the maximal model. The criterion allows different prior weights for each possible model, so that experimenters’ beliefs about the model are incorporated into the design selection process.

In Section 2 we motivate our work through two case studies, which are superficially similar, although very different designs were used. The QB criterion is defined in Section 3 and some notation for the generalized word count, which represents an overall measure of the partial aliasing of the factorial effects, is introduced. In Sections 4 to 6, the use of the QB criterion in different situations and its relationship with other criteria for these situations are studied. In Section 7, the relationship with the weighted-A criterion for regular fractions is explored. In Section 8 we discuss alternative ways in which the case studies could have been designed. Finally, in Section 9, our overall conclusions are discussed, along with the limitations of the QB criterion.

2 CASE STUDIES

The motivation for the work described here comes from discussions with industrial statisticians and the experimenters they support. It is clear that, although the hierarchical ordering of effects in factorial structures is usually sensible, in most experiments any statement that effects up to order z are of interest, or that effects of order higher than z can be assumed negligible, is at best a very crude simplification of the real prior knowledge. What is needed is not a discrete collection of design-types, described by Bailey (1997) as “butterfly collecting”, but a unified theory of how to choose multifactor designs in the face of model uncertainty.

Since most industrial experiments we are directly familiar with are not in the public domain, we will use two published case studies to make our discussions more concrete.

Logothetis (1990) gave a detailed analysis of experimental data from a plasma etching process. The aim of the experiment was to identify the effects of six quantitative factors, the pressure in the reaction chamber, the radio frequency power level, the temperature of the lower electrode on which the wafer is supported, the flow rate of boron trichloride, the flow rate of chlorine and the flow rate of silicon tetrachloride, on three response variables, the etch rate of the aluminum-silicon alloy layer, the photoresist etch rate and the oxide etch rate. Three levels of each factor were used, so that nonlinearity could be assessed, and an 18-run orthogonal main effects design was chosen, based on the assumption that there would be no interactions. Logothetis (1990) stated that the “levels for the factors were chosen by the engineers so that the possibility of existence of strong interaction effects was minimized. This was based on previous evidence.” This is a very strong prior assumption and to base complete faith in the absence of interactions seems unwise. Fearn (1992), Tsai et al. (1996) and Cox and Reid (2000) reported further analyses of these data and all of these authors included at least one interaction in their best-fitting models. In this paper we show how the prior belief that interactions were unlikely could have been used, without going to the extreme of excluding the possibility of their existence. All that is required is a prior probability of them being active.

An experiment run at Ford Motor Company Ltd. was described by Grove and Davis (1992). The aim was to learn how to improve the quality of the welding of two halves of a fuel tank. The effects of six quantitative factors, weld time, cool time, weld current, cradle speed, wheel condition and wheel clamping pressure, on two responses, the minimum seam width and the strength of the weld, were to be studied. It was decided that the factors would be run at two levels each and that, in addition to the main effects, the interactions of cool time with cradle speed, cradle speed with wheel condition and weld current with wheel condition should be estimated. A sixteen-run regular fractional factorial design was used.

Once again, the assumption that all interactions except those declared are zero seems very extreme. When the data were analysed, the second largest estimated effect on the mean strength and the largest on the strength dispersion were both from contrasts which measured interactions other than those expected. In this experiment, one could also question whether the restriction to two levels of each factor was ideal. In this paper, we show how specifying less extreme prior information allows the experimenters’ knowledge and experience to be used, but the limits of that knowledge to also be recognized.

These two experiments are actually rather similar, in that they have six factors and similar numbers of runs, but the designs used were very different in the two cases. While this was, quite correctly, based on the engineers’ prior knowledge, we believe that we ought to be more skeptical about such knowledge, or at least to formally recognize the limits of that knowledge. Experimenters have sometimes expressed to us the view that factorial designs “just tell us what we already know”. Asking for what is known before the experiment to be clearly stated in probabilistic terms allows this view to be challenged more convincingly. It also allows designs to be tailored more precisely to finding out what is not known.

3 METHODOLOGICAL BACKGROUND

3.1 The QB Criterion

To explore the projection efficiency of a design under model uncertainty, Tsai et al. (2007) introduced the QB criterion to compare three-level main-effects designs for quantitative factors that allow the consideration of interactions in addition to main effects. The criterion takes into account experimenters’ prior probabilities of effects being non-negligible in advance of collecting their data.

Let $y = X\beta + \epsilon$ denote a maximal model of interest, which need not be estimable from the experiment, where $y$ is the $n \times 1$ vector of responses, $X$ is the $n \times (v + 1)$ design matrix, $\beta$ is the $(v + 1) \times 1$ vector of unknown parameters and $\epsilon$ is the error vector, with $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2 I$. We assume that we will end up fitting a submodel of the maximal model, i.e. one containing a subset of the parameters $\beta$, but we do not know in advance which one.

Here, the marginality principle (McCullagh and Nelder (1989)) is used to define the class of eligible models that might be fitted. Marginality means that every term in the model must be accompanied by all terms marginal to it, whether their estimates are large or small, e.g. a two-factor interaction cannot appear without the main effects of both factors involved and a quadratic effect cannot appear without the corresponding linear effect.
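As a minimal sketch of how marginality restricts the class of eligible models, the following Python fragment enumerates the submodels of a maximal model containing main effects and two-factor interactions for two-level factors. The function name `eligible_models` and the representation of a model as a pair of tuples are illustrative assumptions; they are not taken from the paper or from its supplementary R programs.

```python
from itertools import combinations

def eligible_models(m):
    """Yield (main_effects, interactions) pairs obeying marginality.

    Hypothetical helper: factors are labelled 0..m-1, and an interaction
    (i, j) is allowed only if both main effects i and j are in the model.
    """
    for k in range(m + 1):
        for mains in combinations(range(m), k):
            # Interactions may only involve factors whose main effects are present.
            pairs = list(combinations(mains, 2))
            for r in range(len(pairs) + 1):
                for inter in combinations(pairs, r):
                    yield mains, inter

models = list(eligible_models(3))
print(len(models))
```

With three factors this gives 18 eligible models (counting the intercept-only model), compared with the $2^6 = 64$ unrestricted subsets of the six effects.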

The QB criterion is to minimize the weighted average of an approximation to the $A_s$ criterion,

$$A_s = \sum_{i=1}^{v} \left\{ (X_s' X_s)^{-1} \right\}_{ii},$$

over all the eligible models, defined by design matrices $X_s$ and with weights defined by the probability of each model being the best model. In the common case that effects of the same type have the same probability of being in the best model, the prior probability of each model being the best can be easily calculated by obtaining experimenters’ prior beliefs about each individual effect being in the best model.

Let $a_{ij}$, for $i, j = 0, \ldots, v$, be the $(i, j)$th entry of the information matrix $A = X'X$ for the maximal model of interest. Tsai et al. (2007) replaced the $A_s$ criterion by a simple first-order approximation and defined the QB criterion as

$$Q_B = \sum_{i=1}^{v} \sum_{j=0}^{v} \frac{1}{a_{ii}} \, \frac{a_{ij}^2}{a_{ii} a_{jj}} \, p_{ij}, \tag{1}$$

where pij is the sum, over models in which terms i and j are both included, of the prior probabilities of a model being the best. Minimizing QB gives low variances for parameter estimates across a range of models, but is inexpensive computationally. When all eligible models have the same prior probability, the reciprocal of the total number of eligible models, QB reduces to the Q criterion defined earlier by Tsai et al. (2000) for three-level main effects designs.

The definition of the QB criterion originated from finding good three-level main-effects designs for quantitative factors. Depending on the prior knowledge of effects being non-negligible, different designs will be chosen among the class of main effects designs to ensure robustness to the two-factor interactions. However, QB can be applied to measure the performance of a design under any given maximal model and so can be used in different design problems under model uncertainty. For example, if the prior knowledge is such that large interactions are expected, then a broader class of designs than main effects plans should be considered.

We note that the QB criterion is based on the $A_s$ criterion, so that different scales of the contrast functions for the effects in the model will produce different values of QB and different design orderings. In this paper we specify some particular scales for the effects, details of which are given in Appendix A.

The QB criterion has both similarities to and differences from the Bayesian D, or DB, criterion of DuMouchel and Jones (1994). Both use the concept of a maximal model, but whereas DB requires the parameters of the model to be split into only two groups, primary and potential terms, on the basis of prior beliefs, QB allows different prior beliefs to be put on each parameter in the model. The aims are also different, DB focusing mainly on estimating the primary terms, but without biases caused by the potential terms, while QB tries to allow estimation of many different models. QB will sometimes prefer a Resolution-III design over a Resolution-IV design, whereas DB will not, at least with the usual priors. Ultimately the choice between them will depend on how experimenters think about their prior knowledge and how they want the experiment to increase it. We will see that QB is closely related to several other commonly used criteria; no such relationships exist with DB. Tsai et al. (2007) also noted that QB is less computationally expensive, so that it is sometimes possible to do a complete search when DB requires an exchange algorithm.

A clear division has grown up between research using alphabetic optimality and that using aberration-type criteria. The QB criterion was developed from the alphabetic optimality viewpoint, but using a computationally cheap approximation to the $A_s$ criterion in order to consider many models. By showing how this criterion is closely related to the aberration-type criteria, we provide a bridge between the two strands of research. This criterion possesses the advantages of alphabetic optimality of being statistically meaningful and applicable to very general situations, and the advantage of aberration-type criteria of being able to consider many different candidate models.

3.2 Definitions and Notation

Making the effect hierarchy assumption that lower order effects are more important than higher order effects, Fries and Hunter (1980) introduced minimum aberration as a refinement of maximum resolution in searching for a regular factorial design. Let $W_i(d)$ denote the number of $i$-factor interactions appearing in the defining contrast subgroup of a regular fractional factorial design $d$. Given two regular $s^{m-l}$ designs, say $d_1$ and $d_2$, $d_1$ is said to have lower aberration than $d_2$ if there exists an integer $r$, $1 \le r \le m$, such that $W_r(d_1) < W_r(d_2)$ and $W_j(d_1) = W_j(d_2)$ for $j = 1, \ldots, r - 1$. A factorial design is said to have minimum aberration if no other factorial design has lower aberration. See, for example, Fang and Mukerjee (2000); Butler (2003); Jacroux (2004); Cheng and Tang (2005) for recent research on finding minimum aberration regular factorial designs.
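The definition of lower aberration can be illustrated with a short Python sketch for the two-level case ($s = 2$), in which the product of two defining words is the symmetric difference of their letters. The helper names below are hypothetical, the generators are assumed to be independent, and the second design is an arbitrary comparison chosen only for illustration.

```python
from itertools import combinations

def defining_words(generators):
    """Non-identity words of the defining contrast subgroup (two-level case)."""
    gens = [frozenset(g) for g in generators]
    words = set()
    for r in range(1, len(gens) + 1):
        for combo in combinations(gens, r):
            word = frozenset()
            for g in combo:
                word = word ^ g  # letters appearing twice cancel
            if word:  # skip the identity, which only arises for dependent generators
                words.add(word)
    return words

def word_length_pattern(generators, m):
    """Return [W_1, ..., W_m] for a regular two-level design with m factors."""
    counts = [0] * m
    for w in defining_words(generators):
        counts[len(w) - 1] += 1
    return counts

def has_lower_aberration(w1, w2):
    """True if pattern w1 is smaller than w2 at the first position where they differ."""
    for a, b in zip(w1, w2):
        if a != b:
            return a < b
    return False

wlp_a = word_length_pattern(["ACD", "BDE"], 5)  # I = ACD = ABCE = BDE
wlp_b = word_length_pattern(["AB", "CDE"], 5)   # illustrative alternative
print(wlp_a, wlp_b, has_lower_aberration(wlp_a, wlp_b))
```

The first set of generators reproduces the defining relation I = ACD = ABCE = BDE used later in the example of Section 4.1, which has two words of length three and one of length four.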

Recently, various generalized aberration criteria have been defined for irregular fractional replicates in different situations. For example, Tang and Deng (1999) proposed the $G_2$-aberration criterion for comparing irregular two-level designs, Xu and Wu (2001) and, independently, Ma and Fang (2001) suggested an ANOVA-type generalized aberration criterion for asymmetrical factorial designs and Cheng and Ye (2004) defined the β-aberration criterion for designs with quantitative factors.

In the context of model robustness, particularly for the situation where the models of interest contain all the main effects and a selection of two-factor interactions, Cheng et al. (1999) defined estimation capacity in the context of regular designs and showed that the minimum aberration criterion gives a good surrogate for model robustness. Li and Nachtsheim (2000) proposed information capacity for comparing so-called model-robust factorial designs. Cheng et al. (2002) showed that the generalized minimum aberration criterion indirectly takes efficiencies into account and tends to provide a good surrogate for model robustness for irregular two-level designs when a few two-factor interactions are of interest.

In this section, we introduce a concept, the generalized word count, which can be useful in many different design problems. For an $n$-run design with $m$ factors, each with $p + 1$ levels, let $D^{(z)}$ be the $n \times m$ matrix in which the columns correspond to the $z$th order polynomial effects of individual factors, $z = 1, \ldots, p$. Let $v^{(z)}_S$ be an $n \times 1$ vector for the componentwise product of the $c_1$th, $\ldots$, $c_s$th columns of $D^{(z)}$, where $S = \{c_1, \ldots, c_s\}$ is a subset of $\{1, \ldots, m\}$, $1 \le s \le m$. For $S = \emptyset$, $v^{(z)}_S$ is defined as a vector of 1s.

Let $j(R_1, \ldots, R_p)$ denote a measure of aliasing (or partial aliasing) between the intercept and a polynomial effect with factors $R_z = \{c_1, \ldots, c_{r_z}\}$ appearing as $z$th order effects, where $r_z = |R_z|$, $|\cdot|$ denotes the number of elements in the argument set and $k = \sum_{z=1}^{p} r_z$. This can be computed as the sum of the elements of the componentwise product of $v^{(1)}_{R_1}, v^{(2)}_{R_2}, \ldots$ and $v^{(p)}_{R_p}$. We have $j(R_1, \ldots, R_p) = 0$ when the effect is orthogonal to the intercept, and $j(R_1, \ldots, R_p) = n$ when the effect is completely aliased with the intercept.

We further define $b_k(i_1, \ldots, i_p)$ to be the overall aliasing between the intercept and all the sets of $k$-factor effects with $i_z$ factors appearing as $z$th order effects. This is the sum of squared $j(R_1, \ldots, R_p)$ divided by $n^2$, where the sum is taken over all mutually disjoint $R$’s with $|R_z| = i_z$ and $\sum_{z=1}^{p} i_z = k$. This is formulated as

$$b_k(i_1, \ldots, i_p) = \sum_{R} \left\{ j(R_1, \ldots, R_p) \right\}^2 \Big/ n^2 = \sum_{R} \left\{ \sum_{g=1}^{n} \prod_{H} d^{(1)}_{g h_1} \cdots d^{(p)}_{g h_p} \right\}^2 \Big/ n^2, \tag{2}$$

where $R = \{R_1, \ldots, R_p : |R_1| = i_1, \ldots, |R_p| = i_p\}$, $H = \{h_1, \ldots, h_p : h_1 \in R_1, \ldots, h_p \in R_p\}$, $d^{(z)}_{g h_s}$ is the $(g, h_s)$th element of $D^{(z)}$, $1 \le k \le m$ and $\sum_{z=1}^{p} i_z = k$. This is called the generalized word count for words referring to the effects of $k$ factors where $i_z$ factors appear as $z$th order effects. It measures the partial aliasing of all the effects with $k$ factors having the same number of factors at a particular order. If all these effects are orthogonal to the intercept, then the generalized word count for words referring to these effects is 0.

For designs with two-level factors, i.e. $p = 1$, the definition of $j(R)$ is the same as the J-characteristics defined by Tang and Deng (1999). Therefore the generalized word count, $b_k(k)$, defined above is the same as the $B_k$ defined in their paper for irregular designs in that context. It is a measure of the partial aliasing between the intercept and the interactions with $k$ factors. For regular factorial designs, the generalized word count is the same as the usual word count. For three-level designs in a second order polynomial model, $b_k(i_1, i_2)$ is a measure of the partial aliasing between the intercept and the interactions with $k$ factors where $i_1$ factors appear as linear effects and $i_2$ factors appear as quadratic effects, where $k = i_1 + i_2$.
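For the two-level case, the generalized word count can be computed directly from the design. The sketch below assumes the design is coded as rows of $\pm 1$ values; the function names `j_characteristic` and `b_k` are illustrative choices rather than the authors’ implementation.

```python
from itertools import combinations
from math import prod

def j_characteristic(design, cols):
    """Sum over runs of the product of the chosen +/-1 columns."""
    return sum(prod(row[c] for c in cols) for row in design)

def b_k(design, k):
    """Generalized word count b_k(k) for a two-level design coded as +/-1."""
    n = len(design)
    m = len(design[0])
    total = sum(j_characteristic(design, cols) ** 2
                for cols in combinations(range(m), k))
    return total / n ** 2

# Regular half fraction of the 2^3 design with defining relation I = ABC.
half_fraction = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
counts = [b_k(half_fraction, k) for k in (1, 2, 3)]
print(counts)
```

For this regular fraction the result agrees with the usual word count: there are no words of length one or two and a single word of length three.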

4 FIRST ORDER MODELS

First order maximal models are used for screening experiments when the goal is to identify a few important main effects from a large number of potential factors under investigation.

The designs might be nearly saturated, saturated or supersaturated, i.e. there might be a few extra degrees of freedom after fitting main effects, just enough to fit all main effects or not enough to fit all main effects. As noted in Appendix A, qualitative factors are easily dealt with by giving equal weights in the QB criterion to all main effects, i.e. linear main effects, quadratic main effects and so on.

4.1 Two-Level Factors

For an $n$-run design with $m$ factors, each with two levels, let the first-order model be the maximal model of interest, so that $v = m$. The QB criterion in (1) selects a two-level design that minimizes

$$\sum_{i=1}^{m} p_{i0} \frac{a_{i0}^2}{n^2} + \sum_{i=1}^{m} \sum_{\substack{j=1 \\ j \neq i}}^{m} p_{ij} \frac{a_{ij}^2}{n^2},$$

where $p_{ij}$ is the sum of the probabilities of a model being the best model, the sum being taken over models containing both terms $i$ and $j$, and is determined by the given prior for each effect being in the model. In the case that each factor has the same probability of being included in the best model, each model containing a particular number of factors has the same prior probability of being the best model. Then $p_{i_1 0}$, the sum of probabilities of a model being the best model when the sum is taken over models containing the $i_1$th factor, is equal to $p_{i_2 0}$, the sum for models containing the $i_2$th factor, for $i_1 \neq i_2$ and $i_1, i_2 = 1, \ldots, m$; and $p_{i_1 i_2}$, the sum of probabilities of a model being the best when the sum is taken over models containing both the $i_1$th and $i_2$th factors, is equal to $p_{i_3 i_4}$, the sum for models containing the $i_3$th and $i_4$th factors, $i_1 \neq i_2 \neq i_3 \neq i_4$. We let $p_{i0} = \xi_1$ for all $i$ for the sum of probabilities of models containing the main effect of the $i$th factor and $p_{ij} = \xi_2$ for all $i \neq j$ for those containing the main effects of the $i$th and $j$th factors.
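One way to obtain $\xi_1$ and $\xi_2$ from a prior on individual effects can be sketched as follows. The sketch assumes, purely for illustration, that each factor is active independently with the same probability `pi`; this independence assumption and the function names are ours and are not stated in the text above.

```python
from itertools import combinations

def model_priors(m, pi):
    """Prior probability of each first-order submodel being the best model.

    Assumes (hypothetically) that the m factors are active independently,
    each with probability pi.
    """
    priors = {}
    for k in range(m + 1):
        for model in combinations(range(m), k):
            priors[model] = pi ** k * (1 - pi) ** (m - k)
    return priors

def xi(priors, required):
    """Sum of prior probabilities over models containing all required factors."""
    needed = set(required)
    return sum(p for model, p in priors.items() if needed <= set(model))

priors = model_priors(5, 0.3)
xi1 = xi(priors, [0])      # models containing factor 0
xi2 = xi(priors, [0, 1])   # models containing factors 0 and 1
print(round(xi1, 6), round(xi2, 6))
```

Under this assumption the sums reduce to $\xi_1 = \pi$ and $\xi_2 = \pi^2$, and they do not depend on which factor or pair of factors is chosen.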

We note that for two-level factors, $a_{ii} = n$ for all $i$. By writing out the elements $a_{ij}$ and carefully collecting them, we note that the sum of $a_{i0}^2/n^2$, for all $i$, is the sum of the aliasing between a factor’s main effect and the intercept. This is equal to $b_1(1)$, the generalized word count for words referring to the effects of one factor as defined in (2) with $p = 1$ and $k = 1$. Similarly, the sum of $a_{ij}^2/n^2$, for all $i \neq j$, is the sum of the aliasing between a pair of factors’ main effects, which is equal to $2b_2(2)$, where $b_2(2)$ is the generalized word count for the effects of two factors.

Then the QB criterion is equal to

$$\xi_1 b_1(1) + 2 \xi_2 b_2(2),$$

which is a weighted average of $b_1(1)$ and $b_2(2)$, with weights $\xi_1$ and $\xi_2$ depending on the sum of prior probabilities of the relevant models being the best model. The following example shows that different best designs may be selected once experimenters’ prior beliefs are taken into account, and that the QB criterion provides an adequate framework to accommodate such prior beliefs.

We note that when it is further assumed that each submodel of the first-order model is equally likely to be the best model, then the QB criterion reduces to the Q criterion which minimizes

$$\delta_1 b_1(1) + 2 \delta_2 b_2(2),$$

where $\delta_1 = \xi_1 \delta_0$, $\delta_2 = \xi_2 \delta_0$ and $\delta_0$ is the total number of eligible models obeying functional marginality. In most cases ($n \ge 4$), $\delta_1 = 2\delta_2$, and the Q criterion selects a design that minimizes $b_1(1) + b_2(2)$.

For a level-balanced design in which the two levels appear the same number of times, $b_1(1) = 0$, i.e. the main effect of each factor is orthogonal to the intercept. In this case, the QB criterion selects a design that minimizes $b_2(2)$. This is proportional to the $E(s^2)$ criterion, defined by Booth and Cox (1962) in the context of supersaturated designs, where $E(s^2) = \sum_{i<j} a_{ij}^2 \big/ \binom{m}{2}$. This criterion has become the standard method of choosing two-level supersaturated designs.
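The proportionality between $b_2(2)$ and $E(s^2)$ can be checked numerically. The following sketch uses a small hypothetical level-balanced design with six runs and four factors (not one of the designs discussed in the paper, and not supersaturated), and verifies that $E(s^2) = n^2 b_2(2) / \binom{m}{2}$.

```python
from itertools import combinations
from math import comb

def inner(design, i, j):
    """Entry a_ij of X'X for two +/-1 main-effect columns."""
    return sum(row[i] * row[j] for row in design)

def e_s2(design):
    """E(s^2): average squared inner product over pairs of columns."""
    m = len(design[0])
    return sum(inner(design, i, j) ** 2
               for i, j in combinations(range(m), 2)) / comb(m, 2)

def b2(design):
    """Generalized word count b_2(2) for a two-level design."""
    n, m = len(design), len(design[0])
    return sum(inner(design, i, j) ** 2
               for i, j in combinations(range(m), 2)) / n ** 2

# Hypothetical level-balanced design: each column has three +1 and three -1.
design = [
    (1, 1, 1, 1),
    (1, 1, -1, -1),
    (1, -1, 1, -1),
    (-1, 1, -1, 1),
    (-1, -1, 1, 1),
    (-1, -1, -1, -1),
]
n, m = len(design), len(design[0])
print(e_s2(design), b2(design), n ** 2 * b2(design) / comb(m, 2))
```

Because $n$ and $m$ are fixed when comparing candidate designs, ranking by $b_2(2)$ and ranking by $E(s^2)$ give the same ordering.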

The QB criterion allows more flexibility than this. For example, if a subset of the factors can be identified which is thought more likely to have non-negligible effects than the others, more prior weight can be put on their estimation. This provides an alternative method of obtaining designs for such situations to that of Yamada and Lin (1997) who chose $E(s^2)$-optimal designs subject to all $|a_{ij}|$ involving the more important factors being below a particular bound.

The QB criterion is also closely related to the p-efficient designs of Lin (1993) who studied the projection efficiencies of first-order saturated designs. To obtain p-efficient designs, he minimized $E(s^2)$ among the class of near-level-balanced and near-orthogonal designs. Under the near-level-balance property, $b_1(1) = 0$ for even $n$ and $b_1(1) = m/n^2$ for odd $n$. Therefore, the approach of Lin is equivalent to first minimizing $b_1(1)$, secondly minimizing $\max |a_{ij}|$ and thirdly minimizing $b_2(2)$. An alternative way to choose a p-efficient design would be to use QB with $\xi_1 \gg \xi_2$. This would have the disadvantage of not guaranteeing near-orthogonality, but would be more flexible by allowing different priors to lead to different designs.

Example. Here we give a simple example of five factors in six runs in which each factor has two levels. First we restrict ourselves to level-balanced designs and obtain all possible designs. Up to isomorphism of the generalized word count pattern, there are nine different first-order saturated designs. With these designs, every main effect is orthogonal to the intercept, but the main effects of a pair of factors are neither completely aliased together nor estimated independently. The Q-optimal level-balanced design, which is also the p-efficient design obtained by Lin (1993), has $b_1(1) = 0$ and $b_2(2) = 1.11$.

Alternatively, consider the 3/16 fractional replicate of John (1971), shown in Table 1, obtained by deleting two points from the regular $2^{5-2}$ design with defining relation I = ACD = ABCE = BDE. In this design, the main effects of factors A and E are partially aliased with the intercept and $b_1(1) = 0.22$. For effects of pairs of factors, each of the main effects of A and E is orthogonal to each of the main effects of B, C and D, while the remaining pairs, {(A,E), (B,C), (B,D), (C,D)}, are partially aliased together and $b_2(2) = 0.44$. In terms of the Q criterion, this design is better than the best level-balanced design. That is, by sacrificing some orthogonality for factors’ main effects, we can gain more design efficiency when models with different numbers of factors’ main effects are of interest.

If $\xi_1 \ge 6.11\,\xi_2$, then Lin’s design is optimal with respect to QB. This would imply that there is a strong prior expectation that a model with a small number of main effects will be fitted which, of course, is the situation Lin’s designs were intended for. At the other extreme, with $p_{ij} = 1$ for all $i, j$, only the model with all main effects is considered and John’s design can be shown to be QB-optimal and A-optimal. Thus, whereas Lin contrasted his designs with D- and A-optimal designs, the use of QB shows that there is a smooth transition from one to the other as the prior probability of factors being active changes.
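The dependence of the preferred design on the prior can be illustrated by evaluating $\xi_1 b_1(1) + 2\xi_2 b_2(2)$ for the two designs in this example, using the rounded word counts reported above. The two pairs of $\xi$ values below are hypothetical choices made only to show the two extremes; the sketch compares just these two designs and does not attempt to reproduce the stated threshold.

```python
def qb_first_order(b1, b2, xi1, xi2):
    """First-order two-level QB value: xi1 * b1(1) + 2 * xi2 * b2(2)."""
    return xi1 * b1 + 2 * xi2 * b2

# (b1(1), b2(2)) as reported in the example, rounded to two decimals.
lin = (0.0, 1.11)
john = (0.22, 0.44)

# Hypothetical priors: full model certain, then few factors expected active.
for xi1, xi2 in [(1.0, 1.0), (0.2, 0.01)]:
    q_lin = qb_first_order(*lin, xi1, xi2)
    q_john = qb_first_order(*john, xi1, xi2)
    better = "Lin" if q_lin < q_john else "John"
    print(f"xi1={xi1}, xi2={xi2}: Lin={q_lin:.4f}, John={q_john:.4f}, preferred={better}")
```

When all main effects are certain to be in the model, John’s design has the smaller value; when $\xi_1$ is much larger than $\xi_2$, the level-balanced design is preferred.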

4.2 Three-Level Qualitative Factors

For a three-level factor, two comparisons are made for the main effect. With a qualitative factor, these effects are either both included in or both excluded from the model. The first- order maximal model for three-level qualitative factors includes main effects of the factors with a total of 2m effects.

The relationship between QB and the generalized word counts for three-level qualitative factors is given in Appendix B.1. By writing out the form of elements $a_{ij}$, it is shown that QB selects designs which minimize

$$\xi_1 \{2 b_1(1, 0) + b_1(0, 1)\} + 2 \xi_2 \{b_2(2, 0) + b_2(1, 1) + b_2(0, 2)\}. \tag{3}$$

For a level-balanced three-level design, the QB criterion selects a design that minimizes

$$b_2(2, 0) + b_2(1, 1) + b_2(0, 2). \tag{4}$$

This criterion is similar to the Ave($\chi^2$) criterion defined by Yamada and Lin (1999) in the context of three-level supersaturated designs, where Ave($\chi^2$) is a measure of the dependency between every pair of columns of a design. This criterion minimizes the sum of squared $n_{ij}$, where $n_{ij}$ is the number of occurrences for each of the nine combinations of levels of a pair of factors. For any design with all $n_{ij}$ equal, Ave($\chi^2$) is minimized, as is QB, since in this case $b_2(2, 0) = b_2(1, 1) = b_2(0, 2) = 0$. However, departures from this ideal are penalized by the two criteria in different ways. We believe that the direct statistical interpretation of QB makes it preferable and again QB with different weights could be used if some factors are thought more likely to be active than others.

4.3 Other Cases

The cases of all factors having the same number of levels, usually being two or three, are by far the most commonly studied and also arise frequently in practice. However, it is not uncommon in practice to have some factors at two levels and others at three levels, or to have some factors having more than three levels. When working with the traditional criteria, it is necessary to define a new criterion for each new class of problems. For example, an extension of the Ave($\chi^2$) criterion, known as $E(f_{NOD})$, has been used for mixed-level supersaturated designs; see Yamada and Matsui (2002); Yamada and Lin (2002); Fang et al. (2003).

In contrast, QB can be applied to any problem without modification, apart from a careful choice of the prior probabilities, and it has the same, statistically meaningful, interpretation in every case. A particular choice of prior probabilities will allow it to closely mimic any sensible criterion that could be suggested for any particular class of problems.

5 SECOND ORDER MODELS FOR QUALITATIVE FACTORS

5.1 Two-Level Factors

The second-order model for a two-level design has $m$ main effects and $\binom{m}{2}$ two-factor interactions, with a total of $v = m + \binom{m}{2}$ effects. By writing out the elements $a_{ij}$ and carefully collecting and counting them, we find that the QB criterion defined in (1) selects a design that minimizes

$$\{\xi_{10} + 2(m - 1)\xi_{21}\} b_1(1) + \{2\xi_{20} + \xi_{21} + 2(m - 2)\xi_{32}\} b_2(2) + 6\xi_{31} b_3(3) + 6\xi_{42} b_4(4), \tag{5}$$

where $\xi_{ij}$ denotes the sum, over models in which at least main effects of $i$ factors and $j$ two-factor interactions of these $i$ factors are included, of the prior probabilities of a model being the best model. Details of this relationship between QB and the generalized word counts are given in Appendix B.2.

In the case of orthogonal main effects designs, i.e. $b_1(1) = b_2(2) = 0$, the QB criterion minimizes

$$\xi_{31} b_3(3) + \xi_{42} b_4(4),$$

which is a linear combination of the generalized word counts for words of effects involving sets of three and four factors respectively. This can be used to select the best second-order design among the class of orthogonal main effects designs when the estimation of two-factor interactions is of interest. In addition, one can use it as a refinement of the $E(s^2)$ criterion among those designs whose $E(s^2)$ values are the same in the context of saturated or supersaturated designs. Liu and Dean (2004) suggested modifying the $E(s^2)$ criterion for supersaturated designs including a few interactions, but did not pursue the idea further. The QB criterion with appropriate priors provides such a modification.

An increasingly popular criterion for choosing two-level designs is the G2 generalized aberration criterion defined by Tang and Deng (1999). The G2-aberration criterion sequentially minimizes b2(2), b3(3), b4(4) and so on. This criterion is consistent with the belief that aliasing among lower-order effects is less desirable, so it concentrates first on minimizing aliasing between pairs of main effects, then on minimizing aliasing between main effects and two-factor interactions, then on minimizing aliasing between pairs of two-factor interactions and so on. The QB criterion, on the other hand, aims to improve the estimation in as many models as possible by jointly minimizing aliasing of these three types, with different weights on each depending on the number of models in which some particular effects are included and the prior information of each effect being in the model. The G2 criterion is a limiting form of QB. If $2\xi_{20} + \xi_{21} + 2(m-2)\xi_{32} \gg 6\xi_{31} \gg 6\xi_{42}$, then the QB criterion converges to the G2 criterion.
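The limiting behaviour can be illustrated numerically. The following is a minimal sketch under stated assumptions: the word-count vectors of the three candidate designs and both weight vectors are invented for illustration, and `weighted_score` is a hypothetical helper standing in for a linear combination of b2(2), b3(3) and b4(4) of the kind appearing in (5).

```python
# Hypothetical word-count vectors (b2, b3, b4) for three candidate designs.
designs = {
    "d1": (0.0, 1.0, 3.0),
    "d2": (0.0, 2.0, 0.0),
    "d3": (0.1, 0.0, 0.0),
}

def weighted_score(counts, weights):
    """Linear combination w2*b2 + w3*b3 + w4*b4 of the word counts."""
    return sum(w * b for w, b in zip(weights, counts))

def best_by_weights(candidates, weights):
    """Design minimizing the weighted combination of word counts."""
    return min(candidates, key=lambda name: weighted_score(candidates[name], weights))

def best_by_sequential(candidates):
    """Sequential minimization of b2, then b3, then b4 (lexicographic order)."""
    return min(candidates, key=lambda name: candidates[name])

# Strongly separated weights (assumed values) mimic sequential minimization,
# while comparable weights allow trade-offs between word lengths.
separated = (1e6, 1e3, 1.0)
comparable = (1.0, 1.0, 1.0)

choice_sequential = best_by_sequential(designs)
choice_separated = best_by_weights(designs, separated)
choice_comparable = best_by_weights(designs, comparable)
```

With strongly separated weights the weighted criterion picks the same design as sequential minimization, whereas comparable weights can prefer a design with a little low-order aliasing and much less higher-order aliasing.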

5.2 Three-Level Factors

For a three-level qualitative factor the maximal model of interest is the model with all m factors' main effects and all $\binom{m}{2}$ two-factor interactions. Xu and Wu (2001) defined an ANOVA-type generalized minimum aberration (GMA) criterion for designs with qualitative factors. Effects with the same number of factors are regarded as equally important. Their criterion functions can be written as

$$A_k = \sum_{K} b_k(i_1, \ldots, i_p), \tag{6}$$

where $K = \{i_1, \ldots, i_p : i_1 + \cdots + i_p = k\}$. Their criterion is to sequentially minimize $A_3, A_4, \ldots$ among the class of orthogonal main effects designs. This criterion is the same as the minimization of the generalized word count for words of effects with k factors sequentially for k = 3, . . . , m.

For orthogonal main effects designs, where all main effects are orthogonal to each other, the QB-criterion selects a design that minimizes

$$2\xi_3\{b_3(3,0) + b_3(2,1) + b_3(1,2) + b_3(0,3)\} + \xi_4\{b_4(4,0) + b_4(3,1) + b_4(2,2) + b_4(1,3) + b_4(0,4)\}, \tag{7}$$

where $\xi_s$ is the sum of prior probabilities for models in which at least effects of particular sets of s factors are included. This is a linear combination of the generalized word counts for words of effects involving sets of 3 and 4 factors respectively.

The GMA criterion concentrates first on minimizing aliasing between effects involving three factors and then on minimizing aliasing between effects involving four factors and so on. The QB criterion, on the other hand, jointly minimizes such aliasing with different weights depending on the model space and the prior information of effects being in the model. The GMA and QB criteria will select the same optimal designs when $2\xi_3 \gg \xi_4$. Thus, GMA is a limiting form of QB as $2\xi_3/\xi_4 \to \infty$.

5.3 Other Cases

As is the case with first order models, the QB criterion can be used with second order models when different factors have different numbers of levels or when there are factors with more than three levels. On the other hand, a new generalized aberration criterion would have to be defined for each of these cases; see, for example, Joseph, Ai and Wu (2009). Since most sensible aberration criteria would correspond to limiting cases of QB, we recommend that QB be used instead. Again, the interpretation of QB is the same in every case.


6 POLYNOMIAL MODELS

6.1 Second Order Model with Three-Level Factors

When the second-order polynomial model is the maximal model of interest, the model has m linear main effects, m quadratic main effects and $\binom{m}{2}$ linear×linear two-factor interactions.

For level-balanced designs, the QB criterion selects a design which minimizes

$$\{\xi_{201} + 2\xi_{200} + 2\xi_{211} + 2(m-2)\xi_{302}\}\,b_2(2,0) + 2\xi_{220}\,b_2(0,2) + \{2\xi_{210} + \xi_{201}\}\,b_2(1,1) + 6\xi_{301}\,b_3(3,0) + \{2\xi_{311} + \xi_{302}\}\,b_3(2,1) + 6\xi_{402}\,b_4(4,0),$$

where $\xi_{abc}$ denotes the sum, over models which include at least linear effects of a factors, b out of the a factors' quadratic effects and c linear×linear interactions of these a factors, of the prior probabilities of a model being the best among the eligible models.

As an alternative, consider the β-aberration criterion of Cheng and Ye (2004) in which effects with the same order are regarded as equally important. For designs with three levels, their criterion functions can be written as

$$B_t = \sum_{k=\left[\frac{t+1}{2}\right]}^{t} b_k(2k - t,\; t - k), \tag{8}$$

where $\left[\frac{t+1}{2}\right]$ denotes the integer part of $\frac{t+1}{2}$ and $t = 1, \ldots, 2m$. In general, for designs with (p + 1)-level quantitative factors, the β-aberration criterion sequentially minimizes the sum of the generalized word counts for effects of zth order, where $z = i_1 + 2i_2 + \cdots + p\,i_p$ for $z = 1, \ldots, pm$.
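The index sets in (8) are easy to enumerate, which gives a quick check on which word counts enter each $B_t$. The sketch below is illustrative only: `beta_terms` is a hypothetical helper name, and it does not truncate the sum at k = m, so it applies as written only for t no larger than the number of factors.

```python
def beta_terms(t):
    """Index triples (k, i1, i2) of the word counts b_k(i1, i2) summed in B_t.

    Three-level case of (8): k runs from the integer part of (t + 1) / 2
    up to t, with i1 = 2k - t factors entering linearly and i2 = t - k
    factors entering quadratically.
    """
    return [(k, 2 * k - t, t - k) for k in range((t + 1) // 2, t + 1)]

def polynomial_order(i1, i2):
    """Order of an effect with i1 linear and i2 quadratic components."""
    return i1 + 2 * i2

# Terms of B_1, ..., B_4; every term of B_t has polynomial order t.
terms = {t: beta_terms(t) for t in range(1, 5)}
```

For example, `beta_terms(4)` lists the triples for b2(0, 2), b3(2, 1) and b4(4, 0), all of which are fourth-order effects.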

For designs with three levels, B1 = b1(1, 0), B2 = b1(0, 1) + b2(2, 0), B3 = b2(1, 1) + b3(3, 0), B4 = b2(0, 2) + b3(2, 1) + b4(4, 0) and so on. For level-balanced designs, the β-aberration criterion sequentially minimizes B2, B3, B4 and so on, while the QB criterion minimizes a linear combination of the individual word counts appearing in B2, B3 and B4. We see that the β-aberration criterion is not a special case of QB, but is a limiting form. If $\xi_{201} + 2\xi_{200} + 2\xi_{211} + 2(m-2)\xi_{302} = c_2$, $6\xi_{301} = 2\xi_{210} + \xi_{201} = c_3$ and $6\xi_{402} = 2\xi_{311} + \xi_{302} = 2\xi_{220} = c_4$, then $Q_B = c_2 B_2 + c_3 B_3 + c_4 B_4$. We see that as $c_2/c_3 \to \infty$ and $c_3/c_4 \to \infty$, QB converges to the β-aberration criterion.

In general, the β-aberration criterion aims to minimize aliasing, concentrating first on aliasing between linear and linear×linear interaction effects, then on aliasing between pairs of linear×linear interactions and between quadratic and linear×linear interaction effects.


The QB criterion is a combination of these three types of aliasing, with weights for each model being the best model determined by the experimenters’ prior belief in each effect being in the best model. Tsai et al. (2007) demonstrated that the minimum β-aberration design tends to be QB-optimal if there is more weight on linear effects and the prior information leads to a model of small size; on the other hand, when more weight is placed on the quadratic effects, different designs will be chosen for the given prior.

6.2 Mixed-Level Designs

Nguyen (1996) extended the E(s2) criterion to factorial designs whose factors have different numbers of levels, although in a rather limited way. He considered designs with one or two three-level factors, which had to be level-balanced and orthogonal to each other and then calculated E(s2) by considering the off-diagonal terms relating to the linear and quadratic contrasts of these factors. The criteria illustrated below are much more general than this.

Note also that Nguyen did not consider the appropriate scaling of the effects, although this would be a straightforward modification.

Table 2 gives three 12-run mixed-level designs with four two-level factors and one three-level factor. The designs are obtained by adding a three-level factor, say X5, to four-factor projections of the Hadamard matrix of order 12. For this design with one three-level factor, let bs(s) denote the generalized word count for words involving s two-level factors, {bs(s) ⊕ b1(1, 0)} denote the generalized word count for words involving s two-level factors and the linear effect of the three-level factor and {bs(s) ⊕ b1(0, 1)} denote the generalized word count for words involving s two-level factors and the quadratic effect of the three-level factor. The values of the three types of word multiplied by 144 are given in Table 3. With F4D1 and F4D2, {b1(1) ⊕ b1(1, 0)} = {b1(1) ⊕ b1(0, 1)} = 0 and all the two-level factors are orthogonal to the linear and quadratic effects of X5; with F4D3, the two-level factors are orthogonal to the linear effect of X5 but not to the quadratic effect. When the main effects model for the five factors is taken as the maximal model of interest, the QB criterion selects designs that minimize a linear combination of the generalized word counts of words referring to effects involving two factors, i.e. b2(2), {b1(1) ⊕ b1(1, 0)} and {b1(1) ⊕ b1(0, 1)}. In this case, we find that F4D1 and F4D2 are better than F4D3.

When the second-order maximal model involving only the linear interactions is the maximal model of interest, the Q criterion uses the following combination of the generalized word counts of effects of 2nd, 3rd and 4th order to select the best design:

$$450\,[b_3(3) + \{b_2(2) \oplus b_1(1,0)\}] + \frac{197}{2}\,\big(\{b_2(2) \oplus b_1(0,1)\} + 179\,[b_4(4) + \{b_3(3) \oplus b_1(1,0)\}]\big),$$

where 450, 197, and 179 are the values of $\delta_{31}$, $\delta_{32}$ and $\delta_{42}$ respectively. In this case, design F4D2 has a lower value of Q than design F4D1.

Another example is a saturated design with nine two-level factors and one three-level factor which is also listed in Table 2 and whose properties are given in Table 3. We note that for this saturated design, the design has two-level factors orthogonal to each other but aliased with the main effects of the three-level factor. These examples show that it is possible to apply the QB criterion to new situations like this and that it keeps its interpretation and remains easy to calculate. On the other hand, new aberration criteria have to be invented for each such new situation.

7 REGULAR FRACTIONAL FACTORIAL DESIGNS

In regular designs with two levels, the generalized word counts defined in this paper are the same as the numbers of words of different lengths in the defining relation. The value of Wi(d) is equal to the generalized word count bi(i) for words referring to effects involving i factors. If Wi(d) = 0, then the i-factor interactions are orthogonal to the intercept so bi(i) = 0. If Wi(d) = 1, then one of the i-factor interactions is completely aliased with the intercept so that bi(i) = 1. Thus for regular factorial designs of Resolution-III or higher, for the second-order maximal model, the QB criterion selects designs that minimize

ξ31 W3(d) + ξ42 W4(d),

where ξab is the sum of prior probabilities for models in which main effects of a factors and b two-factor interactions are included. This is a linear combination of the numbers of words of length 3 and 4 in the defining contrast subgroup of the design. The minimum aberration criterion of Fries and Hunter (1980) sequentially minimizes W3(d), W4(d), etc. Thus, exactly as for the generalized aberration criteria, QB is related to aberration in regular designs.
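For a regular two-level fraction the word counts can be obtained directly from the defining words. The sketch below is a minimal illustration under stated assumptions: the generator words are standard textbook choices and are not taken from the tables of this paper, the weights passed to `qb_regular` are hypothetical values of ξ31 and ξ42, and both function names are invented for the example.

```python
from itertools import combinations

def word_length_pattern(generators, max_len=5):
    """Return (W3, ..., W_max_len) for the defining contrast subgroup.

    Each generator is a defining word written as a string of factor
    letters, e.g. "ABCE" for I = ABCE.
    """
    gens = [frozenset(g) for g in generators]
    counts = [0] * (max_len + 1)
    # Every non-empty subset of generators multiplies to one word of the
    # subgroup; letters appearing an even number of times cancel.
    for r in range(1, len(gens) + 1):
        for subset in combinations(gens, r):
            word = frozenset()
            for g in subset:
                word = word ^ g
            if len(word) <= max_len:
                counts[len(word)] += 1
    return tuple(counts[3:])

def qb_regular(pattern, xi31, xi42):
    """xi31 * W3 + xi42 * W4, with assumed (hypothetical) weights."""
    return xi31 * pattern[0] + xi42 * pattern[1]

# Two 16-run fractions for six factors (generator words are assumptions).
resolution_iv = word_length_pattern(["ABCE", "BCDF"])
resolution_iii = word_length_pattern(["ABE", "ACDF"])
scores = {
    "IV": qb_regular(resolution_iv, xi31=1.0, xi42=0.25),
    "III": qb_regular(resolution_iii, xi31=1.0, xi42=0.25),
}
```

The first set of generators gives the word length pattern (W3, W4, W5) = (0, 3, 0) and the second gives (1, 1, 1); which of the two has the smaller weighted value depends entirely on the ratio of the weights.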

For regular fractions in which the model is estimable, the approximation to the As criterion used in QB is exact. If the maximal model is estimable then QB represents a true weighted average As criterion over all candidate models. Since the variances of parameter estimates do not depend on which other parameters are estimated, because of orthogonality, it is a simple weighted-A criterion for the maximal model. For regular fractional replicates, therefore, this establishes an exact link between alphabetic optimality and aberration.

8 APPLICATION TO CASE STUDIES

As noted in Section 2 the two case studies were superficially similar, both having six factors and both using similar numbers of runs, but the designs used were quite different. Although this was based on prior knowledge, in both experiments the prior knowledge stated was rather extreme and, in both experiments, the results subsequently obtained from the data analysis strongly indicated that the certainty implied by such extreme prior assumptions was not appropriate. We believe the experiments could have been better designed by expressing the prior beliefs in less extreme form.

We first note that the design used in the plasma etching experiment was probably not the best among the class of three-level orthogonal main effects designs. Table 4 gives the values of B3, B4 and Q for the second-order maximal model for some 18-run three-level main effects designs. Design 220 was the one used, which is based on six columns of the L18 orthogonal array. By broadening the class of designs, we find that the minimum β-aberration design is 25 and the Q-optimal design is 2 (designs are given in Table 2 of Tsai et al. (2007)). Comparing these, we find that a trade-off between the aliasing of linear and interaction effects and that of quadratic and interaction effects occurs. Clearly, the final choice of design will depend on the prior knowledge.

For the fuel tank welding experiment, the data suggest that we should have considered regular two-level designs in 16 runs which do not assume that only three specific interactions will be important, but consider the possibility of any interactions. The minimum aberration design is a resolution IV design with (B3, B4, B5) = (0, 3, 0). However, if we do not restrict ourselves to the idea of maximum resolution, some other designs can also be of interest.

For example, there is a design with word count pattern (B3, B4, B5) = (1, 1, 1) (columns 1 2 3 4 8 13 in Table 5). This design is a second-order saturated design in which all degrees of freedom are used to estimate main effects and two-factor interactions, and has 6 clear two-factor interactions which are not aliased with other two-factor interactions.

Here, we consider comparing all of these designs for the two experiments, with less extreme prior knowledge specified in each case. Two center points are added to the above two-level designs, giving 18-run designs denoted Df 2.1 and Df 2.2, respectively. Table 6 gives the values of QB for the Q-optimal (D2), β-aberration optimal (D25) and these two designs. π1, π2 and π3 are, respectively, the prior probability of each linear effect being in the best model, the probability of each quadratic effect being in the best model, given that the corresponding linear effect is, and the probability of each interaction being in the best model, given that the corresponding linear effects are.
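To make the meaning of π1, π2 and π3 concrete, the following sketch converts them into marginal inclusion probabilities and an expected model size. It rests on two assumptions that are not stated in this section: effects of different factors enter the model independently, and a quadratic effect or an interaction can be present only when the corresponding linear effects are. The function names are hypothetical.

```python
from math import comb

def marginal_inclusion(pi1, pi2, pi3):
    """Marginal prior probability that one effect of each type is in the model.

    Assumes independence across factors and that quadratic and interaction
    effects require the corresponding linear effects to be present.
    """
    return {
        "linear": pi1,
        "quadratic": pi1 * pi2,        # P(linear) * P(quadratic | linear)
        "interaction": pi1**2 * pi3,   # P(both linear) * P(interaction | both)
    }

def expected_model_size(m, pi1, pi2, pi3):
    """Expected number of effects (excluding the intercept) for m factors."""
    p = marginal_inclusion(pi1, pi2, pi3)
    return m * p["linear"] + m * p["quadratic"] + comb(m, 2) * p["interaction"]

# Six factors with the prior (0.9, 0.3, 0.2).
size = expected_model_size(6, 0.9, 0.3, 0.2)
```

Under these assumptions, a prior with large π1 and small π2 and π3 concentrates the probability on models that are dominated by linear effects.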

We can see that when experimenters are more interested in the linear and interaction effects and less interested in the quadratic effects, then a two-level design might be good to use. In this case if simpler models are of interest then Df 2.1 would be preferred since its two-factor interactions are orthogonal to main effects; if models with many interactions are of interest, then Df 2.2 might be a better choice since it has more clear two-factor interactions and would allow better parameter estimates for the interactions.

On the other hand, if experimenters want to estimate some factors' quadratic effects as well as linear and interaction effects, then a three-level main effects design might be a better choice. Among the three-level main effects designs, different sets of π1, π2 and π3 can lead to different QB-optimal designs. In the case that second-order polynomial models of small size are of interest, D25 is better than the Q-optimal designs since it has the least aliasing between linear main effects and interactions, but the aliasing between quadratic main effects and interactions is high. In the case of more weight on quadratic effects or for models with more parameters, the Q-optimal design is preferred.

Thus there are reasonable interpretations of the prior beliefs stated which suggest that the designs used, or similar designs, were suitable. However, there are other interpretations of the prior beliefs which suggest otherwise. This indicates that there are benefits to be gained from stating prior beliefs formally as probabilities.

We also calculate the values of the DB criterion of DuMouchel and Jones (1994), for the cases when (1) linear main effects are the primary terms and (2) linear and quadratic main effects are the primary terms. (Note that for six-factor designs with 18 runs there are not enough degrees of freedom to consider linear and two-factor interaction effects as the primary terms.) Table 7 gives the values of $D_B = \log|X'X + K/\tau^2|$, where K is a diagonal matrix, having diagonal element 0 for each primary term and 1 for each potential term, with τ = 0.25, 1, 4, where 1 is the default value suggested by DuMouchel and Jones (1994), τ = 4 corresponds to a strong faith in the necessity of the potential terms, and τ = 0.25 corresponds to potential terms that are not really expected to be large. For the case of linear primary terms, Df 2.1 and Df 2.2 are preferred. As we increase τ, Df 2.2, which has more clear interactions, is preferred, which agrees with our results for QB. For the case with linear and quadratic primary terms, D2 and D25 are preferred since the linear and quadratic main effects model is not estimable in Df 2.1 and Df 2.2. In the cases shown D2 is clearly preferred.
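The calculation of DB can be sketched in a few lines. This is a minimal illustration, not the computation behind Table 7: it assumes that the criterion is the log determinant of $X'X + K/\tau^2$, it omits any rescaling or centering of the potential terms, and the toy design (a 2² factorial with the interaction as the only potential term) and the function name `bayesian_d` are assumptions made for the example.

```python
import numpy as np

def bayesian_d(X, is_potential, tau):
    """log det(X'X + K / tau^2), with K = diag(is_potential).

    is_potential holds 0 for each primary term and 1 for each potential term.
    """
    X = np.asarray(X, dtype=float)
    K = np.diag(np.asarray(is_potential, dtype=float))
    sign, logdet = np.linalg.slogdet(X.T @ X + K / tau**2)
    if sign <= 0:
        raise ValueError("X'X + K/tau^2 is not positive definite")
    return logdet

# Toy example: 2^2 factorial, columns are intercept, A, B and AB.
A = np.array([-1, 1, -1, 1])
B = np.array([-1, -1, 1, 1])
X = np.column_stack([np.ones(4), A, B, A * B])
is_potential = [0, 0, 0, 1]   # only AB is treated as a potential term

values = {tau: bayesian_d(X, is_potential, tau) for tau in (0.25, 1, 4)}
```

Because the term $K/\tau^2$ shrinks as τ grows, a larger τ gives the data a greater share of the information about the potential terms, so designs that estimate those terms well are rewarded more strongly.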

Using DB it is not easy to get results which correspond to those in Table 6 with the priors (0.9,0.3,0.2) or (1,0.4,0.4), even though both these cases seem to fit naturally into the description of linear effects as primary and others as potential. It is also impossible to reproduce results like those from the prior (0.7,0.4,0.1), where there are clearly three distinct groups of effects. One advantage of QB is that it separates the prior probability of an effect being in the final model from the size of that effect. For example, a linear effect can be small, but still have a high probability of being in the final model due to effects to which it is marginal being large. So there are at least some respects in which QB is more flexible than DB. A possible area for future research would be to develop modifications of DB which can match the flexibility of QB.

9 SUMMARY AND CONCLUSIONS

In this paper, we have established connections between the QB criterion and other criteria for multifactor designs in different situations. It has been shown that the QB criterion, based on a weighted average of As criteria, has better statistical justification than some popular criteria. This criterion aims to find designs which can provide good parameter estimates in as many models as possible and is recommended for many situations.

Note also that the somewhat elaborate results presented in Sections 4-7 are developed here just to prove the relationships between QB and other criteria. They are not needed to use QB in practice. All that is needed are:

• a maximal model, usually bigger than that we expect eventually to fit;

• a prior probability of each model being the best, usually obtained from prior probabilities of each type of effect being active; and


• a method of searching for an optimal design, which will be different for different classes of designs, but will be the same as for any other optimality criterion for a given class of designs.

The QB criterion concentrates on estimating many models and does not explicitly take account of how to discriminate between these models. However, since all models are submodels of the maximal model, discrimination between models is indirectly taken account of. To discriminate between two models it is sufficient to be able to fit the model which includes all terms which appear in either model. We have not chosen the weights specifically to reflect this and how to do so might be a fruitful avenue for further exploration. But, clearly, for finding an efficient factorial design under model uncertainty, this criterion is preferable to other related criteria since it has a better statistical justification. We recommend it for practical use.

ACKNOWLEDGEMENTS

We acknowledge financial support from Queen Mary, University of London, the UK Engineering and Physical Sciences Research Council and the Isaac Newton Institute for Mathematical Sciences. The first author's research was partly supported by a grant from the National Science Council of Taiwan. We also thank Academia Sinica for inviting the second author to visit and Ching-Shui Cheng for encouraging us to continue our work on this topic.

We thank the referees for thoughtful and constructive comments which led to considerable changes in the way we present this work.

APPENDIX A: SCALING OF FACTORIAL EFFECTS

For different design situations, we need to clarify the definitions of effects in the maximal model. In general, for a factor with p + 1 levels, there are p comparisons to be made for the factor's main effect. A set of multiplicative constants for each of the contrast functions is used so that the p effects of a factor are orthogonal to each other and are estimated with equal variances in the full factorial. For a factor with two levels, the main effect of the factor is usually represented by a function with coefficients (−1, 1) which is the comparison of the response at the two different levels. With three levels, there are two comparisons to be made. Often, two orthogonal comparisons with coefficients such as $\sqrt{3/2}\,(-1, 0, 1)$ and $\sqrt{1/2}\,(1, -2, 1)$ are chosen to define a factor's main effects. When the three levels of the factor are equally spaced quantitative values, these two orthogonal comparisons represent the linear and quadratic effects of the factor. For a factor with four levels, there are three orthogonal effects to be defined. If the levels are equally spaced levels of a quantitative factor, then we use $\sqrt{1/5}\,(-3, -1, 1, 3)$, $(-1, 1, 1, -1)$ and $\sqrt{1/5}\,(1, -3, 3, -1)$ to represent linear, quadratic and cubic effects of the factor, respectively. In all cases, interaction contrast functions are defined by multiplying the corresponding main effect columns in the full factorial in the usual way. Note that these are normalized contrasts in the full factorial but not, in general, in the designs we are comparing (which each have different normalized contrasts).
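The scaling can be checked numerically. The sketch below, written with hypothetical names, stacks the contrast vectors quoted above for two, three and four levels and forms their Gram matrix; orthogonality and equal scaling within a factor correspond to a Gram matrix equal to the number of levels times the identity.

```python
import numpy as np

# Contrast coefficients as quoted in the text, one row per effect.
CONTRASTS = {
    2: np.array([[-1.0, 1.0]]),
    3: np.array([
        np.sqrt(3 / 2) * np.array([-1, 0, 1]),    # linear
        np.sqrt(1 / 2) * np.array([1, -2, 1]),    # quadratic
    ]),
    4: np.array([
        np.sqrt(1 / 5) * np.array([-3, -1, 1, 3]),  # linear
        np.array([-1.0, 1.0, 1.0, -1.0]),           # quadratic
        np.sqrt(1 / 5) * np.array([1, -3, 3, -1]),  # cubic
    ]),
}

def gram(levels):
    """Matrix of inner products between the contrasts of one factor."""
    C = CONTRASTS[levels]
    return C @ C.T

def sums_to_zero(levels):
    """True if every contrast is orthogonal to the intercept column."""
    return bool(np.allclose(CONTRASTS[levels].sum(axis=1), 0.0))
```

In each case the contrasts also sum to zero over the levels, so they are orthogonal to the intercept in the full factorial.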

For factors with qualitative levels, some alternative definitions of effects might also be useful. For example, for a factor with three levels, one might be interested in all pairwise comparisons among levels. It can be shown that the contrasts for three-level qualitative factors can be derived from the linear and quadratic contrasts, and that the variances for the estimates of the pairwise comparisons are on the same scale, which is $\frac{3}{2}$ times those for the comparisons with three-level quantitative factors. Thus choosing a design for linear and quadratic effects having equal weight is equivalent to choosing a design for all pairwise comparisons having equal weight.

APPENDIX B: EXPRESSING QB IN TERMS OF GENERALIZED WORD COUNTS

B1: First order model for three-level qualitative factors

For designs with m three-level factors, let $x_L = \sqrt{3/2}\,x_i$ and $x_Q = \sqrt{2}\,(\tfrac{3}{2}x_i^2 - 1)$ denote the linear and quadratic main effects of a three-level factor with the levels coded as −1, 0 and 1. When all the diagonal elements of the $X'X$ matrix have the same value, we use the generalized word count $b_k(i_1, i_2)$, defined in equation (2), to represent the terms appearing in the QB-criterion. Here, $b_k(i_1, i_2)$ is the overall measure of the partial aliasing of all the factorial effects with $k\,(= i_1 + i_2)$ factors where $i_1$ factors appear in the linear effects and $i_2\,(= k - i_1)$ factors appear in the quadratic effects.

1. For i referring to the intercept and j referring to a linear effect, the sum of $a_{ij}^2/n^2$ is equal to $b_1(1, 0)$, which is an overall measure of the partial aliasing between the intercept and linear effects. With level-balanced designs, $b_1(1, 0) = 0$, i.e. linear effects are all orthogonal to the intercept.

2. For i referring to the intercept and j referring to a quadratic effect, the sum of $a_{ij}^2/n^2$ is equal to $b_1(0, 1)$, which is an overall measure of the partial aliasing between the intercept and quadratic effects.

3. For i, j referring to linear effects of a pair of factors, the sum of $a_{ij}^2/n^2$ is equal to $2b_2(2, 0)$, which is twice the overall measure of the partial aliasing between pairs of factors' linear effects.

4. For i, j referring to quadratic effects of a pair of factors, the sum of $a_{ij}^2/n^2$ is equal to $2b_2(0, 2)$, which is twice the overall measure of the partial aliasing between pairs of factors' quadratic effects.

5. For i referring to a linear effect and j referring to a quadratic effect of a factor or vice versa, the sum of $a_{ij}^2/n^2$ is equal to $b_1(1, 0)$ for i, j referring to effects of the same factor, and is equal to $2b_2(1, 1)$ for i, j referring to effects of two different factors. We note that for a three-level design, with the coding used in the current paper, the sum of the componentwise product of a factor's linear effect and its quadratic effect, say $x_L$ and $x_Q$, is equal to $1/\sqrt{2}$ times the sum of the componentwise product of the intercept and the factor's linear effect, $j(\{x_L x_Q\}) = \frac{1}{\sqrt{2}}\, j(\{x_L\})$. Thus, the sum of the $a_{ij}^2/n^2$ terms for those i, j referring to a factor's linear and quadratic effects appearing in QB is equal to $b_1(1, 0)$, which accounts for the coefficient 2 on $b_1(1, 0)$ in (3).

We note that in QB, $p_{ij} = \xi_1$ for i and j referring to effects of the same factor, and $p_{ij} = \xi_2$ for i and j referring to effects involving two different factors. The QB criterion in equation (1) selects a design that minimizes

$$\xi_1\{2b_1(1,0) + b_1(0,1)\} + 2\xi_2\{b_2(2,0) + b_2(1,1) + b_2(0,2)\},$$

as in (3).
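The relation between a factor's linear-by-quadratic product and its linear effect, used in item 5, follows from $x^3 = x$ for levels coded −1, 0 and 1, since $x_L x_Q = \sqrt{3}\,(\tfrac{3}{2}x^3 - x) = \tfrac{\sqrt{3}}{2}\,x = x_L/\sqrt{2}$. The sketch below checks this numerically on an arbitrary, deliberately unbalanced column; the column and the function names are assumptions made for illustration.

```python
import numpy as np

def x_linear(x):
    """Linear contrast sqrt(3/2) * x for levels coded -1, 0, 1."""
    return np.sqrt(3 / 2) * np.asarray(x, dtype=float)

def x_quadratic(x):
    """Quadratic contrast sqrt(2) * (3/2 * x^2 - 1) for levels coded -1, 0, 1."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(2) * (1.5 * x**2 - 1)

# Hypothetical, unbalanced three-level column so that the sums are non-zero.
column = np.array([-1, 0, 1, 1, 1, 0])

product = x_linear(column) * x_quadratic(column)
lhs = product.sum()                           # sum of x_L * x_Q over the runs
rhs = x_linear(column).sum() / np.sqrt(2)     # (1 / sqrt 2) * sum of x_L
```

The identity holds run by run, not only for the column sums, so it does not depend on the design being level-balanced.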


B2: Second order model for two-level qualitative factors

For designs with m two-level factors, the maximal second-order model contains m main effects and $\binom{m}{2}$ two-factor interactions. Let $a_{ij}$, for $i, j = 0, \ldots, v$, be the (i, j)th entry of the $X'X$ matrix for the maximal second-order model, where $v = m + \binom{m}{2}$ is the number of effects in the model. For two-level factors, $a_{ii} = n$ for all i and we will use the generalized word count $b_k(k)$ defined in equation (2) to represent the sum of the $a_{ij}^2/n^2$ terms in the QB criterion in equation (1).

1. For i referring to the intercept and j referring to the main effect of a factor, the sum of $a_{ij}^2/n^2$ is equal to $b_1(1)$. With level-balanced designs, $b_1(1) = 0$, i.e. main effects are all orthogonal to the intercept.

2. For i referring to the intercept and j referring to a two-factor interaction, the sum of $a_{ij}^2/n^2$ is equal to $b_2(2)$, which is an overall measure of the partial aliasing between pairs of factors' main effects.

3. For i, j referring to main effects of a pair of factors, the sum of $a_{ij}^2/n^2$ is equal to $2b_2(2)$.

4. For i, j referring to a pair of interactions, the sum of $a_{ij}^2/n^2$ is equal to $2(m-2)\,b_2(2)$ for i, j referring to interactions with a common factor, and is equal to $6b_4(4)$ for i, j referring to interactions with no common factor. We note that with two-level factors, the aliasing between two interactions with a common factor, say AB and AC, is equal to the aliasing between the intercept and the BC interaction. For a given interaction, there are $2(m-2)$ other interactions with a common factor and $\binom{m-2}{2}$ interactions with no common factor. Thus, for i, j referring to interactions, the sum of $a_{ij}^2/n^2$ is $2(m-2)\,b_2(2)$ for those with a common factor and $6b_4(4)$ for those with four different factors.

5. For i referring to the main effect of a factor and j referring to an interaction effect or vice versa, the sum of $a_{ij}^2/n^2$ is equal to $2(m-1)\,b_1(1)$ for i, j referring to effects of a common factor, and is equal to $6b_3(3)$ for those with no common factor.
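The counting in items 3 to 5 can be confirmed by brute force on a small example. The sketch below is an illustration under stated assumptions: equation (2) is not reproduced in this appendix, so $b_k(k)$ is taken to be the sum, over all sets of k factors, of the squared column-product sums divided by $n^2$, which agrees with the identification of $b_i(i)$ with $W_i(d)$ for regular designs in Section 7. The design is an arbitrary random ±1 matrix and all names are hypothetical.

```python
from itertools import combinations
import numpy as np

def word_count(D, k):
    """Assumed b_k(k): sum over k-factor sets of (column-product sum / n)^2."""
    n, m = D.shape
    return sum((D[:, list(s)].prod(axis=1).sum() / n) ** 2
               for s in combinations(range(m), k))

def aliasing_sums(D):
    """Sums of (a_ij / n)^2 over ordered pairs of distinct effects, by type."""
    n, m = D.shape
    main = {(i,): D[:, i] for i in range(m)}
    inter = {(i, j): D[:, i] * D[:, j] for i, j in combinations(range(m), 2)}

    def sq(u, v):
        return (u @ v / n) ** 2

    out = {"main-main": 0.0, "int-int common": 0.0, "int-int disjoint": 0.0,
           "main-int common": 0.0, "main-int disjoint": 0.0}
    # Factor 2 throughout: both (i, j) and (j, i) appear in the double sum.
    for (_, u), (_, v) in combinations(main.items(), 2):
        out["main-main"] += 2 * sq(u, v)
    for (p, u), (q, v) in combinations(inter.items(), 2):
        kind = "common" if set(p) & set(q) else "disjoint"
        out["int-int " + kind] += 2 * sq(u, v)
    for p, u in main.items():
        for q, v in inter.items():
            kind = "common" if p[0] in q else "disjoint"
            out["main-int " + kind] += 2 * sq(u, v)
    return out

# Arbitrary 12-run, five-factor two-level design (not from the paper).
rng = np.random.default_rng(0)
D = rng.choice([-1, 1], size=(12, 5))
m = D.shape[1]
b = {k: word_count(D, k) for k in range(1, 5)}
sums = aliasing_sums(D)
```

Because the design is not required to be balanced or orthogonal, the check exercises every term, including those that vanish for orthogonal main effects designs.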

Let ξab denote the sum, over models which include at least main effects of a factors and b interaction effects of these a factors, of the prior probabilities of a model being the best
