A comparative study of power and sample size calculations for multivariate general linear models

(1)

This article was downloaded by: [National Chiao Tung University 國立交通大學] On: 27 April 2014, At: 20:42

Publisher: Routledge

Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Multivariate Behavioral Research

Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmbr20

A Comparative Study of Power and Sample Size

Calculations for Multivariate General Linear Models

Gwowen Shieh

Published online: 10 Jun 2010.

To cite this article: Gwowen Shieh (2003) A Comparative Study of Power and Sample Size Calculations for Multivariate General Linear Models, Multivariate Behavioral Research, 38:3, 285-307, DOI: 10.1207/S15327906MBR3803_01

To link to this article: http://dx.doi.org/10.1207/S15327906MBR3803_01

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no

representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any

form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions

(2)

A Comparative Study of Power and Sample Size

Calculations for Multivariate General Linear Models

Gwowen Shieh

Department of Management Science National Chiao Tung University

Repeated measures and longitudinal studies arise often in social and behavioral science research. During the planning stage of such studies, the calculations of sample size are of particular interest to the investigators and should be an integral part of the research projects. In this article, we consider the power and sample size calculations for normal outcomes within the framework of multivariate general linear models that represent the most fundamental method for the analysis of repeated measures and longitudinal data. Direct extensions of the existing generalized estimating equation and likelihood-based approaches are presented. The major feature of the proposed modification is the accommodation of both fixed and random models. A child development example is provided to illustrate the usefulness of the methods. The adequacies of the sample size formulas are evaluated through Monte Carlo simulation study.

Introduction

Repeated measures and longitudinal studies arise often in social and behavioral analyses, in which repeated observations of a response variable and a set of independent variables are recorded on subjects across occasions. Because repeated observations are recorded on the same subject, the response variables are usually correlated. Methods for analyzing correlated data have recently received considerable attention in the literature, see Keselman, Algina and Kowalchuk (2001) for a comprehensive review and their references for related discussions. The generalized estimating equation (GEE) approach proposed by Liang and Zeger (1986), in particular, is widely used by researchers in a number of fields for the analysis of longitudinal data. In recent years, a number of articles are devoted to exemplifying the use of the GEE method. For application of the GEE method relevant to behavioral studies, see Duncan et al. (1995). Two of the attractive properties of the GEE approach are that it

This research was partially supported by the National Science Council. I wish to convey my appreciation to the reviewers, whose suggestions extended and strengthened the article’s content immensely.

(3)

accommodates both discrete and continuous data, and it also allows for the flexibility in the correlation structure under a single framework. Furthermore, it only requires specification of the forms of the first two moments. The full joint distribution of correlated responses is not required. Sample size calculations and power analyses are often critical for researchers to address specific scientific hypotheses and confirm credible treatment effects. Accordingly, it is of practical importance to be able to perform these tasks in a GEE setting. Liu and Liang (1997) proposed a sample size and power formula derived from the score statistic. Shih (1997) and Rochon (1998) proposed similar Wald test procedures to compute sample sizes and statistical powers in the context of GEE. All these GEE-based approaches represent a unified tool for both normal and non-normal responses of repeated measures and longitudinal studies. These procedures have been illustrated in these articles with simulated and real data sets for binary, count and normal outcomes. Due to the complex nature, however, there is little information on the discrepancy of these methods. It would be helpful to make direct comparisons of these approaches in terms of both calculated sample sizes and precision of estimated power with a familiar content, such as the commonly used analysis of variance models in the repeated measures studies of continuous outcome variables.

In this article, we examine the adequacy of the sample size formula for Gaussian outcomes of multivariate general linear models. Since the GEE approach is closely related to quasi-likelihood methods, a question of interest is how the GEE approach compares with the methods obtained from a likelihood-based viewpoint when the correlated responses have a joint multivariate normal distribution. Therefore, the existing likelihood-based approaches of O'Brien and Shieh (1992) are also discussed. However, it is important to note that these approaches are applicable to fixed (conditional) models that assumed all the levels of the independent variables to be predetermined before data collection. The results would be specific to the particular values of the independent variables that are observed or preset by the researcher. However, it is quite common in behavioral research to have studies in which the levels of the independent variables for each experimental unit cannot be controlled and are available only after the observations are made. These models are referred to as random (unconditional) models. A natural generalization to incorporate both fixed and random independent variables should be essential to these approaches for performing power and sample size calculations in practice. According to theoretical arguments, the GEE and likelihood-based approaches are modified to accommodate the two types of models.

(4)

To test the multivariate general linear hypothesis, the test statistics and corresponding modification of sample size formulas of both the GEE and likelihood-based approaches are presented. These methods are illustrated with the child development example from Muller, LaVange, Ramey, & Ramey (1992). Since all the approaches considered here use large sample approximations, simulation studies are conducted to assess their adequacy for finite sample and robustness for noncentrality structure under various model configurations.

The Fixed Model

Consider the standard multivariate general linear model with all the levels of independent variables fixed a priori

(1) Y = XB + εεεεε,

where Y = (y₁, ..., y_N)T_{is a N × p matrix with y}

i as the p × 1 vector of observed

sequence of measurements for the ith_{subject; X = (x}

1, ..., xN)

T_{is a N × r design}

matrix with full column rank r < N, where x_i is the r × 1 vector of independent variables associated with the ith_{subject; B is the r × p matrix of unknown} regression coefficients; and εεεεε = (εεεεε₁, ..., εεεεε_N)T_{is a N × p matrix with}εεεεε

i as the p × 1 vector of random errors associated the ith_{subject, for i = 1, ..., N. The} errorsεεεεε_i are assumed to have independent and identical normal distribution

N_p(0,⌺), where ⌺ is a p × p positive-definite covariance matrix. We are concerned with the general linear hypothesis H₀: CBA = ⌿₀, where C is the

c × r matrix of between-subject contrasts with full row rank cⱕ r, and A

is the p × a matrix of within-subject contrasts with full column rank aⱕ p. The maximum likelihood estimators for B and ⌺ are ˆB = (XT_X)-1_XT_{Y and} _Sˆ

= (Y – X ˆB)T_{(Y – X ˆ}_B_{)/N, respectively. The common statistics for H}

0 are obtained from the eigenvalues of E-1_{H, where E = (YA – X ˆ}_BA₎T_{(YA – X ˆ}_BA₎

and H = (C ˆBA – ⌿₀)T_[C(XT_X)-1_CT_]-1_{(C ˆ}_BA_–⌿

0). It follows from standard results that E has the Wishart distribution W_a(N – r, AT⌺A), H has the

Wishart distribution W_a(c, AT⌺A, ␦), and E and H are independent, where

␦ = (AT⌺A)-1(CBA – ⌿ 0)

T_[C(XT_X)-1_CT_]-1(CBA – ⌿

0) is the noncentrality parameter matrix. See Timm (2002) and Morrison (1990) for further discussion of the Wilks likelihood ratio |E(E + H)-1_{|, Pillai trace tr[H(E + H)}-1_{] and} Hotelling-Lawley trace tr(E-1_{H) test statistics, where tr(•) is the trace of a} matrix.

Note that the analysis of repeated observations can be viewed as a regression model with correlated errors. This is exactly the formulation of the GEE approach. A brief review of the GEE methods is provided in the

(5)

Appendix. Next, we present the important details of the test statistics of the GEE and likelihood-based approaches.

The GEE Approach

With the subject-wise orientation, the following results are derived through the GEE setting of Rochon (1998) and Liu and Liang (1997) for the multivariate general linear model shown by Equation 1. The Wald test statistic or generalized T2

0statistic proposed in Rochon is

(2) Q_W = N•tr(E-1_H)

where tr(E-1_{H) is the Hotelling-Lawley trace mentioned above.} Further-more, it can be shown that the quasi-score statistic or Rao’s score statistic in Liu and Liang (1997) is

(3) Q_S = N•tr[H(E + H)-1_]

where tr[H(E + H)-1_{] is the aforementioned Pillai trace. The actual test is} performed by referring the test statistics Q_W and Q_S to their asymptotic distribution under H₀, which is a central chi-square distribution with ca degrees of freedom.

In order to evaluate the power under alternative hypothesis, the distributions of both statistics shown by Equations 2 and 3 are approximated by a noncentral chi-square distribution with ca degrees of freedom. It follows from Equation 6 of Rochon (1998) and Equation 4 of Liu and Liang (1997) that the noncentrality parameter ␯ = tr(␦), where ␦ is defined earlier. In this case, the asymptotic property of Rochon’s Q_W and Liu and Liang’s

Q_S agree with that of the Hotelling-Lawley trace and Pillai trace, respectively (see Anderson, 1984, p. 330 and Seber, 1984, p. 415 for details). Consequently, the Wald statistic Q_W and Rao’s statistic Q_S have identical asymptotic distributions under both null and alternative hypotheses.

The Likelihood-based Approach

To test a general linear hypothesis under multivariate general linear model shown by Equation 1, the common statistics are computed from |E(E + H)-1_{|, tr[H(E + H)}-1_{] and tr(E}-1_{H), and their critical values have been} widely tabled and charted (for example, see Seber, 1984, pp. 562-564). However, in practice, more tractable ones have been proposed by transforming them to F-type statistics. In this article, we focus on F

(6)

approximations of the Hotelling-Lawley trace statistic tr(E-1_{H) for two} principle reasons. First, it resembles very much the aforementioned statistic (Equation 2). Second, it has more desired properties than the other competing tests. To be specific, two transformations of tr(E-1_{H) are} considered here. Pillai and Samson (1959) proposed

(4) F_T1 = df2_T1tr(E-1_H)/sca

where df2_T1 = s(N – r – a – 1) + 2 and s = minimum(c, a). Moreover, McKeon (1974) presented

(5) F_T2 = df2_T2tr(E-1_H)/hca,

where df2_T2 = (ca + 2)g + 4, g = [(N – r)2_{– (N – r)(2a + 3) + a(a + 3)]/} [(N – r)(c + a + 1) – (c + 2a + a2_{– 1)], and h = (df2}

T2 – 2)/(N – r – a –1).

Under the null hypothesis, both F_T1 and F_T2 are compared to an F distribution with numerator degrees of freedom ca, and denominator degrees of freedom

df2_T1 and df2_T2, respectively.

To approximate the distributions of F_T1 and F_T2 under the alternative hypothesis, O'Brien and Shieh (1992) described a direct extension of the respective F distributions to their noncentral counterpart with the noncentrality parameter ␯ as mentioned earlier. These procedures are motivated and partially justified by the knowledge of the following facts. First, a noncentral

F distribution with numerator and denominator degrees of freedom (df1, df2)

and noncentrality parameter ⌬ converges to a noncentral chi-square distribution with degrees of freedom df1 and noncentrality parameter ⌬ when

df2 tends to infinity. In this case, both F_T1 and F_T2 converge to the same asymptotic distribution of N•tr(E-1_{H), which is a noncentral chi-square} distribution with degrees of freedom ca and noncentrality parameter ␯. Second, when s = 1, the proposed noncentral F distribution is exact for both F-type statistics. Although similar transformation has been presented by Muller and Peterson (1984) and Muller et al. (1992), their method provides smaller noncentrality values for s > 1 as discussed in O'Brien and Shieh (1992).

The Random Model

Traditionally, we treat the model described in Equation 1 as a conditional random model because the values of the independent variables are fixed and known. It is important to recognize and account for the extra variability stemming from the fact that, in another replication of the same study, different settings for the independent variables will be obtained. Thus, the

(7)

random model is more appropriate for the studies in which the values of the independent variables cannot be predetermined. We now define the formal random model associated with the multivariate general linear model.

Assume that the independent variables (x = x_i, i = 1, ..., N) follow a distribution f(x) with finite moments. The form of f(x) is assumed to be depended on none of the unknown parameters B and ⌺. It follows from the standard asymptotic result that

1 / / N T T i i i N N = =

∑

X X x x

converges in probability to K where K = E_x(xxT_{) and E}

x(•) denotes the

expectation taken with respect to the distribution of x. With this additional assumption and the application of Slutsky’s Theorem, it can be demonstrated that

N1/2_{( ˆ}_B_{– B) = N}1/2_(XT_X)-1_XTεεεεε has a limiting matrix normal distribution N r × p(0, K-1_, ⌺). Similarly, [C(XT_X)-1_CT_]-1/2_{(C ˆ}_{B A}_{– CBA) has a limiting matrix} normal distribution N_{c × a}(0, I_c, AT⌺A), where I

c is the c × c identity matrix.

Hence, under the null hypothesis H₀: CBA = ⌿₀, H converges in distribution to the Wishart distribution W_a(c, AT⌺A, 0). It can be shown that the

distribution of E is the Wishart distribution W_a(N – r, AT⌺A) under both

hypotheses for the fixed and random models. Consequently, under the null hypothesis, the proposed central chi-square and F approximations for the four statistics Q_W, Q_S, F_T1 and F_T2 described above are extended to the case of random independent variables.

For the statistics under the alternative hypothesis, we note that [C(XT_X)-1_CT_]-1/2_{(C ˆ}_{B A}_–⌿

0) and N

1/2_(CK-1_CT₎-1/2_{(C ˆ}_{B A}_–⌿ 0) are equivalent in asymptotic distribution. For the purpose of relating asymptotic power function calculations to the local alternatives to H₀, an asymptotically equivalent distribution of [C(XT_X)-1_CT_]-1/2_{(C ˆ}_{B A}_–⌿

0) can be defined in the form of

N_{c × a}[N1/2_(CK-1_CT₎-1/2_{(CBA –}⌿

0), Ic, A T⌺A],

in which case, we propose to consider the distribution of H with the operational and asymptotically equivalent Wishart distribution W_a(c, AT⌺A, Nd ) and d = (AT⌺A)-1_{(CBA –}⌿

0)

T_(CK-1_CT₎-1_{(CBA –}⌿ 0).

We proceed to approximate the distributions of the GEE-based statistics

Q_W and Q_S by a noncentral chi-square distribution with ca degrees of freedom. The noncentrality parameter used in the approximation is defined as

(6) ␯_N = N

n

,

(8)

where

n

= tr(d ). Similarly, we approximate the distributions of the likelihood-based statistics F_T1 and F_T2 by noncentral F distributions with noncentral parameter ␯_N, common numerator degrees of freedom ca and denominator degrees of freedom df2_T1 and df2_T2 defined in Equations 3 and 4, respectively. Essentially, the noncentral parameter ␯_N is the counterpart of␯ with substitution of XT_{X with NK.}

Note that the probability distribution f(x) of x can be discrete, continuous and both. In general, there is no simple and tractable expression for the distribution ˆB except in some special cases. For the common additional assumption that x has a multivariate normal distribution as in Sampson (1974), it can be shown that ˆB follows a matrix t distribution, see Dickey (1967) for detailed discussion. As described in Dickey (1967), the limiting distribution of a matrix t agrees with the matrix normal distribution. Therefore, all the asymptotic properties claimed previously for ˆB and related statistics still apply under the normality assumption of x.

Sample Size Calculations The GEE Approach

The actual implementation of sample size calculations of the GEE approaches using the statistics Q_S and Q_W is as follows. With specified parameter values B and ⌺, and chosen probability distribution f(x), the sample size needed to test hypothesis H₀: CBA = ⌿₀ with specified significance level ␣ and power 1 – ␤ against the alternative H₁: CBA⫽ ⌿₀ is determined by the following two steps. First, find the noncentrality ␯_N of a noncentral chi-square distribution with ca degrees of freedom such that its 100•␤th_{percentile is equivalent to the 100(1 –}␣)th_{percentile of a central} chi-square distribution with ca degrees of freedom. Second, the sample size estimate is computed as ␯_N/

n

, where

n

is defined in Equation 6. Note that both methods result in the same estimated sample size.

Note that although the notion of fixed values for independent variables was introduced in Liu and Liang (1997), their presentation and simulation studies did not emphasize the special feature of random independent variable. Moreover, our proposed formulation naturally accommodates the extension of unequal allocation in Rochon (1998). Therefore, the approach described here differs from that of Liu and Liang (1997) and Rochon (1998) with respect to the independent variables distribution where they proposed to consider only the fixed model. This is naturally extended here to the random case and the distribution f(x) could be either discrete or continuous with a finite number of levels such as Bernoulli or multinomial distributions or an infinite number of

(9)

values, for example, Poisson and normal distributions. If x is discrete with m distinct values x_uj and f(x_uj) = ␲_j, j = 1, …, m, where

1 1 m j j p = =

∑

, then K = E_x(xxT_{) =} 1 . m T j uj uj j p =

∑

x x

Accordingly, this representation is also valid for the case of x_uj being considered as fixed levels rather than random component as in Liu and Liang (1997) and Rochon (1998).

The Likelihood-based Approach

In a manner analogous to the GEE approach described above, we keep the same assumption that the independent variables x has a joint probability function f(x), which can be either discrete or continuous, and propose the factorization defined in Equation 6 for the noncentral parameter ␯_N of the noncentral F approximations of F_T1 and F_T2 under the alternative hypothesis. Therefore, this procedure becomes a direct generalization of O'Brien and Shieh (1992) in which their results are limited to those applications where all the levels of the independent variables in the model are fixed in advance. Hence, given parameter values B and ⌺, chosen probability distribution f(x), and sample size N, the statistical power achieved for testing hypothesis H₀:

CBA = ⌿₀ with specified significance level ␣ against the alternative H₁:

CBA⫽ ⌿₀ is the probability

P[F(ca, df2, N

n

) > F_{ca, df2,}_␣]

with df2 = df2_T1 and df2_T2 for the two F-type test statistics F_T1 and F_T2 defined in Equations 4 and 5, respectively, where F(df1, df2,⌬) denotes a noncentral F random variable with (df1, df2) degrees of freedom and noncentrality parameter ⌬, and Fdf1, df2,␣ denotes the 100(1 – ␣)th percentile of a central F random variable with (df1, df2) degrees of freedom. Note that this procedure can be reversed to calculate the sample size needed in order to attain the specified power. However, it usually involves an iterative process to find the solution because both

F(df1, df2,⌬) and F_{df1, df2,}_␣ depend on the sample size N.

(10)

An Example

We illustrate in this section the power and sample size calculation procedures in a child development example that has motivated our work. Muller et al. (1992) presented a power analysis for a longitudinal study of a child’s intellectual performance as a function of mother’s estimated verbal intelligence. With child IQ measurements at 12, 24, and 36 months (p = 3), and with intercept, linear, quadratic, and cubic trends in mother’s standardized IQ (MSIQ) as independent variables (r = 4), this yields

.12 .24 .36 .12 .24 .36 2 .12 .24 .36 3 .12 .24 .36 1 and , I I I L L L Q Q Q C C C MSIQ MSIQ MSIQ b b b b b b b b b b b b         =  =        _ _   x B

where ␤_I.t is the intercept, while ␤_L.t, ␤_Q.t, and ␤_C.t are the corresponding coefficients of linear, quadratic, and cubic values of MSIQ for time t = 12, 24, and 36, respectively. Here the hypothesized relationship between mother and child competence of interest corresponds to a test of the time × mother’s IQ interaction. Hence, the between-subject and within-subject contrast matrices (c = 3 and a = 2) are

1/ 2 1/ 6 0 1 0 0 0 0 1 0 and 0 2 / 6 , 0 0 0 1 _{1/ 2} _{1/ 6} ₋    _ _   = =_ − _   _ _   _ _ C A

respectively. According to the previous study of the Infant Health and Development Program (IHDP, see Ramey, Bryant, Wasik, Sparling, Fendt, & LaVange, 1992), the model parameter estimates are

114.46 104.66 98.83 218.48 83.66 72.19 2.88 8.77 10.67 _ˆ ˆ _and _83.66 _{251.92 158.60 .} 0.71 0.90 1.30 72.19 158.60 244.58 0.21 0.54 0.72   _ _   _ _ =_₋ ₋ ₋ _ =   ₋ ₋ ₋  _ _     B S

As for the values of the independent variables, one of the three schemes assumed that through screening the sample would be evenly distributed across the four groups of mother’s IQ: namely retarded (IQ < 70), borderline

(11)

(IQ 70-85), normal (IQ 85-100) and high (IQ > 100). Furthermore, it was also assumed that the spread of mother’s IQ scores within each group would follow the IHDP pattern, and the actual mother’s IQ scores were treated as continuous values in the power analysis. At first sight, this may produce reasonable approximation to the true mother’s IQ score distribution of the current study. However, the authors considered it impractical and expected an increase in power associated with over-sampling of the extreme values of mother’s IQ. The primary reason of inducing such scheme and other approximations is due to the lack of a proper procedure that accounts for the nature of continuous distribution. Therefore, the absence of consensus in determining the discretization of continuous independent variables distribution and the failure to embed the method in a general setup are obvious limitations of the existing approaches in Muller et al. (1992), O'Brien and Shieh (1992), Liu and Liang (1997) and Rochon (1998).

For illustrative purpose, we assume that the mother’s standardized IQ (MSIQ) has a standard normal distribution. Note that the mother’s IQ score has a population mean 100 and standard deviation 15. Thus, it follows that

1 2 3 1 2 3 4 2 3 4 5 3 4 5 6 1 1 0 1 0 0 1 0 3 ( ) , 1 0 3 0 0 3 0 15 T X m m m m m m m E m m m m m m m m         = =  =              K xx

where m_i is the ith_{moment of a standard normal distribution. We are} interested in how many subjects are needed to detect the time × mother’s IQ interaction H₀: CBA = 0 in terms of the matrices C and A just stated. With these specifications, it follows from Equation 6 that n = 0.1328. Assuming the significance level ␣ = 0.05, the sample size estimates of the Wald test (Equation 2) and Rao’s test (Equation 3) and F transforms (Equations 4 and 5) are 132, 132, 135 and 137, respectively, for power 1 – ␤ = 0.90, while the corresponding sample size estimates are 158, 158, 161 and 162, respectively, for power 1 – ␤ = 0.95. The achieved power levels of the four tests are 0.9858, 0.9858, 0.9843 and 0.9836 for a given sample size N = 200. These numbers are comparable with the results obtained from Muller et al. (1992) shown in Table 6.

With these sample size and power calculations, the most powerful approach can be easily identified. However, the proposed approaches use large sample approximations, there is no guarantee that the one that gives higher power will always be more accurate in achieving the nominal power. Hence, we continue to compare the accuracy of these formulas in terms of

(12)

the discrepancy between estimated actual power and nominal power, where they all use the same sample size. This is demonstrated in the following simulation studies.

Simulation Studies

In order to evaluate the accuracy of the proposed approaches, Monte Carlo simulation studies are performed for three different multivariate general linear models with Gaussian responses. To reinforce the concept of fixed or random models and to emphasize the flexibility of the proposed methods, the three models are chosen as follows.

Fixed MANOVA Model

For the fixed MANOVA model, we consider two designs for the standard MANOVA model with equal group sizes. The first design has r = 4, whereas the second design is set to have r = 3. For the first design, the vector of independent variable contains only indicator variables, taking the values of zero or one, x = [1 0 0 0]T_{, [0 1 0 0]}T_{, [0 0 1 0]}T_{or [0 0 0 1]}T_{, and each of}

these vectors is replicated with the same number of times in the design matrix

X. In this case, it is trivial that XT_{X = (N/4)I}

4. Hence, even for the fixed independent variable structure, the noncentrality parameter can still be written in the form ␯_N in Equation 6 with K = (1/4)I₄. We are interested in a test of no group effects (c = 3) with

1 1 0 0 1 0 1 0 . 1 0 0 1 −     = −  ₋    C

Accordingly, the independent variable vector in the second design is x = [1 0 0]T_{, [0 1 0]}T_{or [0 0 1]}T_{. Furthermore, X}T_{X = (N/3)I} 3 and 1 1 0 with 2. 1 0 1 c −   =_ ₋ _ =   C

Random MANOVA Model

In contrast to the fixed MANOVA model, we generalize the configurations of independent variables into a random setting of discrete uniform distribution

(13)

with␲_j = 1/r, j = 1, ..., r. Hence, the design matrix X is now composed of vectors, which follow a discrete uniform distribution. It is interesting to note that K = E_x(xxT_{) = (1/r)I}

r for r = 3 and 4. Hence, for any one of the four proposed

approaches, the sample size formulas are identical for the first two models.

Development Model

The third model is patterned after the prescribed child development data set with r = 4 repeated measures. We consider the vector of independent variables composed of powers of a standard normal variable, specifically

x = [1 z z2_z3_]T_{, where z has a standard normal distribution. Hence, the matrix} K has the same form as defined in the previous section for the child

development example. Here, the concern is whether there is a linear, quadratic and cubic trend relationship between response and the between-subject contrast matrix is defined as

0 1 0 0 0 0 1 0 with 3. 0 0 0 1 c     = =     C

As in the previous two models, two different designs are studied as well. Note that all four sample size formulas depend on the identical noncentrality parameter ␯_N as defined in Equation 6, which in turn relies on

tr(d ) or the sum of eigenvalues of d , where d = (AT⌺A)-1_{(CBA –} ⌿0)

T_(CK-1_CT₎-1_{(CBA –}⌿

0). Therefore, without loss of generality, we assume AT⌺A = I

a, and ⌿0 is a c × c null matrix throughout the simulation study. Furthermore, we let d = diag(

d

₁, ...,

d

_a), a diagonal matrix be with diagonal elements (

d

₁, ...,

d

_a), where (

d

₁, ...,

d

_a) are the eigenvalues of d and tr(d ) = 1 . a l l d =

∑

For each of the three models described above, the values of a are set as 3 and 2 for the two designs, respectively. In order to study the robustness properties of the sample size formulas with respect to a wide range of designs and sample sizes, the parameter matrix BA = ( ₁T

B , B )₂T T_{is defined}

such that (

d

₁, ...,

d

_a) has the following four different structures:

(14)

Equal:

d

_l = n */a, l = 1, ..., a; Linear:

d

_l = l·n */( 1 a l l =

∑

), l = 1, ..., a; Geometric:

d

_l = 2l-1_·n */( 1 1 2 a l l − =

∑

), l = 1, ..., a; Extreme:

d

₁ = n *, and

d

_l = 0, l = 2, ..., a. The value of n * is predetermined by P[F(ca, df2_T1, Nn *) >

1,

, 2T 0.05

ca df

F ] =

0.70 for each design. In fact, for simplicity, B₁ is set as a 1 × a null vector in all cases, and B₂ can thus be decided, since it is the only unknown element in d for each of the four different eigenvalue structures. It is important to note that the equal and extreme eigenvalue structures cover the two opposite cases of (

d

₁, ...,

d

_a) for all possible combinations of A,⌺, B, and ⌿₀ under the specifications of C and K. Obviously, the other two structures stand for some of the intermediate situations. Given the sample sizes N and eigenvalues structure n *, the nominal powers of the four approaches can be computed according to the procedures described earlier. Note that the nominal power of the F_T1 statistic is always 0.70. In general, the nominal power of F_T2 statistic is slightly lower, whereas the nominal power of the Wald and Rao’s statistics tends to be higher than 0.70.

Estimates of actual Type I error rate and power associated with the given sample sizes and model configurations are then computed through Monte Carlo simulation using 5000 replicate data sets. The adequacy of the sample size formula is determined by the difference between the estimated and nominal values of Type I error rate and power. All calculations are performed using programs written with SAS/IML (SAS Institute, 1999). The results of the simulation studies are presented in Tables 1–6 for the three models and each with two different designs. We also conducted the simulation study for two different levels of power, namely P[F(ca, df2_T1,

Nn *) >

1,

, 2_T 0.05

ca df

F ] = 0.50 and 0.95. The results are similar to those of power 0.70, and are thus not shown here.

As would be expected, the results in Tables 1–6 suggest that the accuracy of the competing approaches increases with the sample size, and varies with the structure of the eigenvalues. For the errors between the estimated Type I error rate and the nominal level 0.05, it is clear that the method Q_W gives the largest errors while the other three methods have much better performance of achieving the nominal level in all six tables. With respect to the accuracy of power calculations, the Rao’s statistic Q_S of the GEE approach is the most sensitive one among the four approaches with respect to four different eigenvalue structures. This situation is more

(15)

prominent when the sample size is small in all models. On the contrary, the two likelihood-based approaches are much more robust with respect to the level of variation among eigenvalues. Furthermore, the performance of the two F-transform approaches appears to be excellent over the whole range of conditions that we have considered. Although the differences between the two approaches are small, it shows a clear pattern that the F_T2 approach Table 1

The Errors Between Actual and Nominal Values of Type I Error Rate and Power for Fixed MANOVA Model with Equal Group Size (c = 3, r = 4, a = 3, ␣ = 0.05)

Sample Method Error Nominal Error for power

size for␣ power Eigenvalues structure of d

Equal Linear Geometric Extreme

N = 20 Q_W 0.2772 0.8075 0.1367 0.1361 0.1359 0.1281 Q_S -0.0160 0.8075 -0.1509 -0.1743 -0.1855 -0.4079 F_T1 0.0148 0.7000 -0.0336 -0.0322 -0.0322 -0.0556 F_T2 0.0018 0.6008 0.0156 0.0150 0.0170 -0.0076 N = 40 Q_W 0.1034 0.7446 0.1086 0.1040 0.1054 0.0920 Q_S -0.0098 0.7446 -0.0712 -0.0774 -0.0806 -0.1562 F_T1 0.0024 0.7000 -0.0162 -0.0160 -0.0174 -0.0208 F_T2 -0.0026 0.6591 0.0013 0.0029 0.0029 -0.0019 N = 60 Q_W 0.0586 0.7281 0.0807 0.0753 0.0745 0.0597 Q_S -0.0036 0.7281 -0.0431 -0.0475 -0.0471 -0.1027 F_T1 0.0048 0.7000 -0.0100 -0.0076 -0.0076 -0.0220 F_T2 0.0002 0.6743 0.0045 0.0051 0.0021 -0.0097 N = 80 Q_W 0.0394 0.7205 0.0453 0.0463 0.0443 0.0451 Q_S -0.0072 0.7205 -0.0421 -0.0389 -0.0423 -0.0751 F_T1 -0.0004 0.7000 -0.0146 -0.0122 -0.0152 -0.0204 F_T2 -0.0024 0.6813 -0.0045 -0.0049 -0.0063 -0.0099 N = 100 Q_W 0.0250 0.7161 0.0515 0.0499 0.0499 0.0409 Q_S -0.0050 0.7161 -0.0251 -0.0281 -0.0287 -0.0541 F_T1 -0.0004 0.7000 -0.0048 -0.0044 -0.0052 -0.0110 F_T2 -0.0024 0.6853 0.0029 0.0021 0.0013 -0.0061

(16)

is more accurate with the estimated actual power than the other F_T1 approach. The performance for the method Q_W is very good throughout the six tables regardless of the eigenvalues structure and sample size. However, this is counteracted by the inflated estimates of Type I error rate ␣ mentioned earlier. Therefore, the practical use of Q_W is questionable at least for the multivariate general linear models.

Table 2

The Errors Between Actual and Nominal Values of Type I Error Rate and Power for Fixed MANOVA Model with Equal Group Size (c = 2, r = 3, a = 2, ␣ = 0.05)

Equal Linear/Geometric Extreme

N = 15 Q_W 0.1602 0.8106 0.0964 0.0912 0.0802 Q_S -0.0126 0.8106 -0.1218 -0.1384 -0.2852 F_T1 0.0080 0.7000 -0.0172 -0.0170 -0.0446 F_T2 0.0006 0.6293 0.0237 0.0215 -0.0031 N = 30 Q_W 0.0708 0.7462 0.0744 0.0696 0.0588 Q_S 0.0000 0.7462 -0.0546 -0.0590 -0.1072 F_T1 0.0076 0.7000 -0.0078 -0.0064 -0.0228 F_T2 0.0034 0.6696 0.0088 0.0112 -0.0024 N = 45 Q_W 0.0432 0.7291 0.0395 0.0441 0.0407 Q_S -0.0014 0.7291 -0.0443 -0.0457 -0.0713 F_T1 0.0034 0.7000 -0.0128 -0.0140 -0.0174 F_T2 0.0016 0.6807 -0.0019 -0.0023 -0.0071 N = 60 Q_W 0.0312 0.7213 0.0409 0.0423 0.0313 Q_S -0.0010 0.7213 -0.0285 -0.0321 -0.0495 F_T1 0.0034 0.7000 -0.0036 -0.0044 -0.0128 F_T2 0.0008 0.6859 0.0041 0.0045 -0.0051 N = 75 Q_W 0.0214 0.7167 0.0367 0.0331 0.0371 Q_S 0.0010 0.7167 -0.0159 -0.0193 -0.0313 F_T1 0.0044 0.7000 0.0022 -0.0018 -0.0022 F_T2 0.0036 0.6889 0.0093 0.0055 0.0045

(17)

As mentioned on the previous page, the sample size formulas in the first two models are identical. Hence, the discrepancy between the results in Tables 1–2 and Tables 3–4 are the direct consequence of using fixed or random independent variables. In general, the absolute errors in Tables 3– 4 are slightly larger than those in Tables 1–2. Such phenomena shall continue Table 3

The Errors Between Actual and Nominal Values of Type I Error Rate and Power for Random MANOVA Model with Uniformly Distributed Group Size (c = 3, r = 4, a = 3, ␣ = 0.05)

N = 20 Q_W 0.2804 0.8075 0.1239 0.1229 0.1233 0.1101 Q_S -0.0192 0.8075 -0.1793 -0.2049 -0.2169 -0.4195 F_T1 0.0102 0.7000 -0.0490 -0.0574 -0.0580 -0.0830 F_T2 -0.0030 0.6008 -0.0032 -0.0072 -0.0102 -0.0376 N = 40 Q_W 0.1072 0.7446 0.0954 0.0868 0.0830 0.0754 Q_S -0.0094 0.7446 -0.0864 -0.0974 -0.0998 -0.1784 F_T1 0.0048 0.7000 -0.0316 -0.0368 -0.0346 -0.0482 F_T2 -0.0022 0.6591 -0.0145 -0.0167 -0.0157 -0.0271 N = 60 Q_W 0.0644 0.7281 0.0663 0.0677 0.0667 0.0519 Q_S -0.0016 0.7281 -0.0547 -0.0561 -0.0643 -0.1069 F_T1 0.0082 0.7000 -0.0200 -0.0160 -0.0194 -0.0340 F_T2 0.0032 0.6743 -0.0065 -0.0023 -0.0065 -0.0211 N = 80 Q_W 0.0506 0.7205 0.0525 0.0487 0.0507 0.0427 Q_S 0.0018 0.7205 -0.0433 -0.0405 -0.0439 -0.0709 F_T1 0.0098 0.7000 -0.0198 -0.0136 -0.0114 -0.0172 F_T2 0.0062 0.6813 -0.0099 -0.0041 -0.0021 -0.0065 N = 100 Q_W 0.0290 0.7161 0.0443 0.0475 0.0469 0.0371 Q_S -0.0060 0.7161 -0.0329 -0.0293 -0.0309 -0.0589 F_T1 -0.0010 0.7000 -0.0110 -0.0074 -0.0062 -0.0116 F_T2 -0.0032 0.6853 -0.0071 -0.0013 -0.0005 -0.0053

(18)

to exist between random and fixed models in other settings. For the results associated with the development model in Tables 5–6, comparatively much larger sample sizes are required to achieve the same accuracy than the previous two cases in Tables 1–4. Due to the polynomial function of the standard normal covariates, the magnitude of the errors between the Table 4

The Errors Between Actual and Nominal Values of Type I Error Rate and Power for Random MANOVA Model with Uniformly Distributed Group Size (c = 2, r = 3, a = 2, ␣ = 0.05)

N = 15 Q_W 0.1614 0.8106 0.0788 0.0756 0.0536 Q_S -0.0166 0.8106 -0.1634 -0.1796 -0.3102 F_T1 0.0024 0.7000 -0.0476 -0.0532 -0.0754 F_T2 -0.0054 0.6293 -0.0019 -0.0075 -0.0315 N = 30 Q_W 0.0624 0.7462 0.0508 0.0522 0.0476 Q_S -0.0022 0.7462 -0.0782 -0.0816 -0.1248 F_T1 0.0042 0.7000 -0.0286 -0.0314 -0.0368 F_T2 0.0004 0.6696 -0.0112 -0.0148 -0.0208 N = 45 Q_W 0.0388 0.7291 0.0299 0.0283 0.0249 Q_S -0.0008 0.7291 -0.0581 -0.0573 -0.0857 F_T1 0.0058 0.7000 -0.0278 -0.0226 -0.0346 F_T2 0.0034 0.6807 -0.0157 -0.0101 -0.0211 N = 60 Q_W 0.0342 0.7213 0.0315 0.0301 0.0227 Q_S 0.0058 0.7213 -0.0373 -0.0369 -0.0583 F_T1 0.0100 0.7000 -0.0110 -0.0106 -0.0198 F_T2 0.0074 0.6859 -0.0021 -0.0019 -0.0139 N = 75 Q_W 0.0212 0.7167 0.0267 0.0249 0.0261 Q_S -0.0048 0.7167 -0.0239 -0.0279 -0.0427 F_T1 0.0000 0.7000 -0.0050 -0.0100 -0.0086 F_T2 -0.0016 0.6889 -0.0003 -0.0021 -0.0037

(19)

estimated power and the nominal power tends to decrease when the eigenvalue structure changes from “equal” to “extreme”. The results in the two MANOVA models show the opposite pattern that the “equal” eigenvalue structure produces the smallest errors among the four different structures, while the “extreme” eigenvalue structure gives the largest errors. This should be the case that is commonly encountered in the standard MANOVA model.

Table 5

The Errors Between Actual and Nominal Values of Type I Error Rate and Power for Development Model with Standard Normal Independent Variable (c = 3, r = 4, a = 3, ␣ = 0.05)

N = 100 Q_W 0.0294 0.7161 -0.0019 0.0057 0.0079 0.0181 Q_S -0.0042 0.7161 -0.0823 -0.0751 -0.0727 -0.0885 F_T1 -0.0002 0.7000 -0.0554 -0.0514 -0.0454 -0.0432 F_T2 -0.0024 0.6853 -0.0481 -0.0425 -0.0393 -0.0375 N = 150 Q_W 0.0206 0.7105 0.0069 0.0141 0.0135 0.0085 Q_S 0.0018 0.7105 -0.0415 -0.0381 -0.0391 -0.0569 F_T1 0.0048 0.7000 -0.0280 -0.0212 -0.0212 -0.0304 F_T2 0.0032 0.6904 -0.0230 -0.0166 -0.0170 -0.0258 N = 200 Q_W 0.0194 0.7078 -0.0044 -0.0024 0.0034 0.0144 Q_S -0.0004 0.7078 -0.0432 -0.0390 -0.0386 -0.0334 F_T1 0.0026 0.7000 -0.0336 -0.0284 -0.0264 -0.0090 F_T2 0.0012 0.6929 -0.0301 -0.0245 -0.0225 -0.0047 N = 250 Q_W 0.0162 0.7062 0.0008 0.0080 0.0098 0.0024 Q_S 0.0028 0.7062 -0.0278 -0.0226 -0.0230 -0.0392 F_T1 0.0060 0.7000 -0.0192 -0.0130 -0.0144 -0.0196 F_T2 0.0056 0.6943 -0.0159 -0.0103 -0.0127 -0.0177 N = 300 Q_W 0.0094 0.7052 -0.0032 0.0034 0.0056 0.0104 Q_S 0.0006 0.7052 -0.0298 -0.0242 -0.0226 -0.0226 F_T1 0.0024 0.7000 -0.0234 -0.0154 -0.0146 -0.0084 F_T2 0.0010 0.6953 -0.0203 -0.0133 -0.0125 -0.0067

(20)

Conclusions

The primary aim of the present article is to provide guidance in the choice of approach for sample size and power calculations within the framework of multivariate general linear models with Gaussian responses. The proposed approaches are the direct extension of the work by Liu and Liang (1997) Rochon (1998) and O'Brien and Shieh (1992) to accommodate both fixed and Table 6

The Errors Between Actual and Nominal Values of Type I Error Rate and Power for Development Model with Standard Normal Independent Variable (c = 3, r = 4, a = 2, ␣ = 0.05)

N = 40 Q_W 0.0726 0.7464 0.0174 0.0202 0.0274 Q_S -0.0020 0.7464 -0.1296 -0.1288 -0.1614 F_T1 0.0048 0.7000 -0.0704 -0.0626 -0.0672 F_T2 0.0020 0.6764 -0.0590 -0.0498 -0.0560 N = 80 Q_W 0.0310 0.7217 -0.0003 0.0009 -0.0001 Q_S -0.0016 0.7217 -0.0761 -0.0745 -0.0869 F_T1 0.0028 0.7000 -0.0480 -0.0424 -0.0458 F_T2 0.0018 0.6891 -0.0417 -0.0377 -0.0385 N = 120 Q_W 0.0230 0.7142 -0.0098 -0.0094 -0.0080 Q_S 0.0020 0.7142 -0.0542 -0.0560 -0.0662 F_T1 0.0046 0.7000 -0.0362 -0.0370 -0.0380 F_T2 0.0032 0.6929 -0.0319 -0.0333 -0.0357 N = 160 Q_W 0.0138 0.7105 -0.0003 0.0015 -0.0051 Q_S 0.0006 0.7105 -0.0381 -0.0357 -0.0437 F_T1 0.0032 0.7000 -0.0252 -0.0212 -0.0246 F_T2 0.0026 0.6947 -0.0229 -0.0173 -0.0217 N = 200 Q_W 0.0108 0.7083 0.0019 -0.0001 -0.0039 Q_S 0.0000 0.7083 -0.0293 -0.0323 -0.0409 F_T1 0.0014 0.7000 -0.0184 -0.0200 -0.0246 F_T2 0.0008 0.6958 -0.0164 -0.0186 -0.0226

(21)

random independent variables. Their methods have been restricted to the simplifying assumption that all levels of the independent variables in the model are completely fixed nonrandom. According to the explicit sample size formulas obtained in this article, the general setup of the proposed approaches provides feasible solutions for studies involving both fixed and random independent variables.

The GEE approach described in Liu and Liang (1997) and Rochon (1998) represents a unified tool for sample size and power calculations for both normal and non-normal responses of repeated measures and longitudinal studies. Unfortunately, it is found in this article that their approaches have potential problem associated with the control of Type I error rate or the sensitivity to the unbalanced design for the most common research paradigm of multivariate general linear models with normal responses. More importantly, it is outperformed by the likelihood-based approaches of two F transforms of Hotelling-Lawley trace statistic. Such phenomenon continues to exist in the proposed modification of their approaches over all the conditions that we have considered. The simulation results suggest the proposed F_T2 approach developed from the approximation proposed by McKeon (1974) performs extremely well and shall be of practical use.

References

Anderson, T. W. (1984). An introduction to multivariate statistical analysis (2nd Ed.), New York: John Wiley.

Dickey, J. M. (1967). Matricvariate generalizations of the multivariate t distribution and the inverted multivariate t distribution. Annals of the Mathematical Statistics, 38, 511-518. Duncan, T. E., Duncan, S. C., Hops, H., & Stoolmiller, M. (1995). An analysis of the relationship between parents and adolescent marijuana use via generalized estimating equation methodology. Multivariate Behavioral Research, 30, 317-339.

Dunlop, D. D. (1994). Regression for longitudinal data: A bridge from least squares regression. The American Statistician, 48, 299-303.

Horton, N. J. & Lipsitz, S. R. (1999). Review of software to fit generalized estimating equation regression models. The American Statistician, 53, 160-169.

Keselman, H. J., Algina, J., & Kowalchuk, R. K. (2001). The analysis of repeated measures designs: a review. British Journal of Mathematical and Statistical Psychology, 54, 1-20. Liang, K. Y. & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear

models. Biometrika, 73, 13-22.

Liu, G. & Liang, K. Y. (1997). Sample size calculations for studies with correlated observations. Biometrics, 53, 937-947.

McKeon, J. J. (1974). F approximations to the distribution of Hotelling’s T2

0 . Biometrika,

61, 381-383.

Morrison, D. F. (1990). Multivariate statistical methods. McGraw Hill: New York. Muller, K. E., LaVange, L. M., Ramey, S. L., & Ramey, C. T. (1992). Power calculations

for general linear multivariate models including repeated measures applications. Journal of the American Statistical Association, 87, 1209-1226.

(22)

Muller, K. E. & Peterson, B. L. (1984). Practical methods for computing power in testing the multivariate general linear hypothesis. Computational Statistics and Data Analysis, 2, 143-158.

O'Brien, R. G. & Shieh, G. (1992, August). Pragmatic, unifying algorithm gives power probabilities for common F tests of the multivariate general linear hypothesis. Paper presented at the Annual Joint Statistical Meetings of the American Statistical Association, Boston, Massachusetts.

Pillai, K. C. S. & Samson, P., Jr. (1959). On Hotelling’s generalization of T2_{. Biometrika,}

46, 160-168.

Ramey, C. T., Bryant, D. M., Wasik, B. H., Sparling, J. J., Fendt, K. H., & LaVange, L. M. (1992). The infant health and development program for low birth weight, premature infants: program elements, family participation. Pediatrics, 89, 454-465.

Rochon, J. (1998). Application of GEE procedures for sample size calculations in repeated measures experiments. Statistics in Medicine, 17, 1643-1658.

Sampson, A. R. (1974). A tale of two regressions. Journal of the American Statistician Association, 69, 682-689.

SAS Institute. (1999). SAS/IML software: Usage and reference, Version 8. Carey, NC: Author.

Seber, G. A. F. (1984). Multivariate observations. New York: John Wiley.

Shih, W. J. (1997). Sample size and power calculations for periodontal and other studies with clustered samples using the method of generalized estimating equations. Biomedical Journal, 39, 899-908.

Timm, N. E. (2002). Applied multivariate analysis. New York: Springer-Verlag. Ziegler, A. & Gromping, U. (1998). The generalised estimating equations: a comparison of

procedures available in commercial statistical software packages. Biometrical Journal, 40, 245-260.

Accepted February, 2003.

Appendix

Brief Review of Generalized Estimating Equations (GEE)

Consider the repeated measures design or longitudinal study, and let y be_ij*

the response and x be the K × 1 vector of independent variables or covariates*_it

for the ith_{subject and the t}th_{condition or time point, where i = 1, ..., N and t = 1,} ..., T_i. Here for simplicity we consider the situation that all T_i = T; the case of varying T_i can be handled in a similar way. Let *

i y be the T × 1 vector ( * * 1,..., i iT y y )T_and * i

X be the T × K matrix (x*_i₁,...,x )*_iT T_{. It is assumed throughout}

thaty and *_i y are independent for any i*j ⫽ j. The GEE approach fits a “marginal

model” in which a mean function and a covariance structure are specified, but a full likelihood is not required. The mean function specifies the relationship between the marginal mean E y( _it*|x*_it) = ␮_it and the vector x through a*_it

generalized linear model g(␮_it) = x*T_it␤, where ␤ is a K × 1 vector of parameters,

and g is a known link function. Common choices for the link function might be

(23)

identity link g(␮) = ␮ for Gaussian data, log link g(␮) = log(␮) for count data, and logit link g(␮) = log[␮/(1 – ␮)] for binary data. The variance function V(y_it*|x*_it) =␯_it is expressed as ␯_it = ␯(␮_it)/␾, where ␯ is a known function and ␾ is a scale parameter which may be estimated. Under the GEE model, the key step is to write the covariance matrix of *

i y given X as V*i i = 1/ 2 i A R_i(␣) 1/ 2 i A , where A_i is a T × T diagonal matrix with ␯_it as the tth_{diagonal element, and R}

i(␣) is the

working correlation matrix which is fully specified by the q × 1 vector of unknown parameters␣. The vector ␣ typically contains parameters that characterize the correlation as a function of time lag or distance separation.

The major feature of GEE is the relaxation of the full likelihood function. For the classical univariate and multivariate linear models with normal responses, the mean and variance functions (the first two moments) fully determine the likelihood function, but this assumption is violated for many types of data such as binary or count outcomes. To specify the entire likelihood, additional assumptions about higher order moments are also necessary. Even if additional assumptions are made, the likelihood is often intractable and involves many nuisance parameters in addition to ␣ and ␤ that must be estimated. GEE alleviate these restrictions.

A link can be built from familiar linear regression and generalized linear models to GEE methodology through the form of the estimating equations. Estimating equations represent a set of equations the solution of which give parameter estimates. For linear regression, it is well known that the least squares estimates are obtained from solving the normal equations. In similar fashion, the parameter estimates of generalized linear model are solutions to the estimating equations obtained by maximizing the likelihood function of the exponential family. The estimating equations of generalized linear models were extended by Liang and Zeger (1986) to account for correlated measurements from longitudinal data. Specifically, the GEE approach estimates␤ by solving the following generalized estimating equations:

(

)

1 * 1 , N T i i i i i − = − =

∑

D V y m 0

where D_i = ⭸␮_i/⭸␤T_{is the T × K gradient matrix and}␮

i = (␮i1, ..., ␮iT) T_{. The}

extra term “generalized” distinguishes these as estimating equations in generalized linear models that accommodate the correlation structure R_i(␣). Liang and Zeger (1986) showed that the solution to the generalized estimating equations gives a consistent estimate ˆb of ␤ that is asymptotically multivariate normal with the covariance matrix given by

(24)

( )

1 1 1 1 * 1 1 1 1 1 ˆ cov( ) cov . N N N T T T i i i i i i i i i i i i i i − − − − − − = = =      =_ _  _ _   

∑

D V D  

∑

D V y V D 

∑

D V D  b

Liang and Zeger (1986) proposed to estimate cov( ˆb ) by replacing ␤, ␾ and ␣ with their estimators and replacing cov( *

i y ) by (y – ˆ*i m )(i y – ˆ*i m )i T, where ˆ_i m = (mˆi1, ...,m ) and ˆˆiT m = giT -1₍ *T_ˆ it

x b ), t = 1, ..., T. Another useful feature

of the GEE methodology is that ␤ and cov( ˆb ) are consistently estimated even if the correlation structure is misspecified. A good discussion of the connection between the GEE approach and the well-known least squares regression methodology was given by Dunlop (1994). Review of commercial software packages to fit GEE models can be found in Ziegler and Gromping (1998) and Horton and Lipsitz (1999).