Estimation and Prediction of Generalized Growth Curve
with Grouping Variances in AR(q) Dependence Structure
Jack C. Lee* and Ying-Lin Hsu
Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan, R.O.C.
Summary
In this paper we consider maximum likelihood analysis of the generalized growth curve model with the Box-Cox transformation when the covariance matrix has an AR(q) dependence structure with grouping variances. The covariance matrix under consideration is $\Sigma = D_\sigma C D_\sigma$, where $C$ is the correlation matrix of a stationary autoregressive process of order $q$, $q < p$, and $D_\sigma$ is a diagonal matrix with $p$ elements divided into $g\,(\le p)$ groups; i.e., $D_\sigma$ is a function of $\{\sigma_1, \ldots, \sigma_g\}$, where $-1 < \rho < 1$ and $\sigma_l$, $l = 1, \ldots, g$, are unknown. We consider both parameter estimation and prediction of future values. Results are illustrated with real and simulated data.
Key words and phrases: Box-Cox transformation; Heteroscedasticity; Maximum likelihood estimates; Predictions; Simulations
1. Introduction
We consider a generalized multivariate analysis of variance model useful especially for many types of growth curve problems. The model was first proposed by Potthoff and Roy (1964) and subsequently considered by many authors, including Khatri (1966), Geisser (1970, 1980), Lee and Geisser (1972, 1975), Fearn (1975), Jennrich and Schluchter (1986), Rao (1987), Lee (1988, 1991), von Rosen (1991), and Pan, Fang and von Rosen (1999), among others.
The growth curve model is defined as
$$Y_{p \times N} = X_{p \times m}\, \tau_{m \times r}\, A_{r \times N} + E_{p \times N}, \qquad (1.1)$$
where $\tau$ is an unknown matrix and $X$ and $A$ are known design matrices of ranks $m < p$ and $r < N$, respectively. The columns of $E$ are independent $p$-variate normal, with mean vector 0 and common covariance matrix $\Sigma$. In general, $p$ is the number of time (or spatial) points observed on each of the $N$ individuals; $m$ and $r$, which specify the degree of polynomial in time (or space) and the number of distinct groups, respectively, are assumed known. The design matrices $X$ and $A$ will therefore characterize the degree of the growth function and the distinct grouping out of the $N$ independent vector observations. Potthoff and Roy (1964) gave many examples of growth curve applications for the model (1.1). Lee and Geisser (1975), Rao (1987) and Lee (1988), among others, applied the model to some biological data. Keramidas and Lee (1990) applied the model to the forecasting of technological substitutions.

© 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 0323-3847/03/0203-0165 $17.50+.50/0
Lee (1988, 1991), Lee and Lu (1987) and Keramidas and Lee (1990, 1995) demonstrated repeatedly the importance of the AR(1) dependence, or serial covariance structure, for the covariance matrix $\Sigma$ in the model (1.1).
Mansour, Nordheim and Rutledge (1985) allowed for time heteroscedasticity by applying a nonstationary AR(1) error process for repeated measures experiments. Geary (1989) presented a more general case. Rochon (1992) presented stationary ARMA covariance structures with time heteroscedasticity. Laird and Ware (1982) considered the random effects model for longitudinal data, which includes the growth curve as a special case. They provided for polynomial trends over time, e.g. a random slope. Jennrich and Schluchter (1986) discussed the use of covariance structures which include autoregressive as well as random effects error terms. In our paper, we provide for grouping the variances in order to construct a more parsimonious and realistic covariance structure, and we provide the Box-Cox transformation in order to make the growth function linear or piecewise linear, which is helpful for prediction. We conduct the likelihood ratio test for the adequacy of grouping the variances; in conducting the test, we are aided by the plot of confidence intervals for the standard deviations in the model. In theory, however, some of the results given in Laird and Ware (1982) and Jennrich and Schluchter (1986) could be extended to provide the features considered in this paper.
The estimation of parameters and prediction of future values for the model (1.1) having the covariance structure $\Sigma = \sigma^2 C$, where $C$ is the correlation matrix of a stationary autoregressive process of order 1, or AR(1), have been studied using the ML method by Lee (1988). The purpose of this paper is to consider an extension of this covariance structure to an AR(q) process with grouping variances. An AR(q) dependence structure with grouping variances in the growth curve model is much more general and quite useful in practice because it includes the AR(1) dependence with a common variance as a special case. When the AR(q) dependence structure with grouping variances holds for $\Sigma$, we have
$$\Sigma = D_\sigma C D_\sigma, \qquad (1.2)$$
where $C = (c_{ij})$ for $i, j = 1, \ldots, p$; i.e.,
$$c_{ij} = \begin{cases} \rho_{|i-j|}, & \text{when } i \ne j, \\ 1, & \text{when } i = j, \end{cases} \qquad (1.3)$$
$$\rho_i = \sum_{j=1}^{q} \phi_j \rho_{|i-j|}, \qquad \text{for } i \ge 1, \qquad (1.4)$$
$D_\sigma$ is a $p \times p$ diagonal matrix and a function of $\{\sigma_1, \ldots, \sigma_g\}$; the $\phi_j$, $j = 1, \ldots, q$, are unknown and satisfy the assumption that the roots of the equation $1 - \sum_{j=1}^{q} \phi_j x^j = 0$ lie outside the unit circle $\|x\| = 1$ in the complex plane; and $\sigma_l > 0$ for $l = 1, \ldots, g$ are unknown. In the grouping of the variances, we assume that there are $g_i$ members in the $i$th group for $i = 1, \ldots, g$. The purpose of this paper is to consider this covariance structure from the maximum likelihood point of view, hoping that a more effective and practical solution can be furnished. We will compare our results with those based on the ML method for the AR(1) dependence as given in Lee (1988) via real and simulated data. Since $\sigma_i > 0$, there is a one-to-one correspondence between $\sigma_i$ and $\sigma_i^2$. Thus, for convenience, we will be dealing with $\sigma_i$ instead of $\sigma_i^2$ mathematically for the rest of this paper. Also, it is trivial to note that equality of variances is equivalent to equality of standard deviations.
In addition to the inferences on the parameters $\tau$, $\phi = (\phi_1, \ldots, \phi_q)'$ and $\sigma = (\sigma_1, \ldots, \sigma_g)'$, we will also consider the conditional prediction problem for the growth curve model as specified by (1.1)–(1.4). Let $V$ be $p \times K$ future observations drawn from the generalized growth curve model; that is, the set of future observations is such that, given the parameters $\tau$ and $\Sigma$,
$$E(V) = X \tau F, \qquad (1.5)$$
where $E(\cdot)$ denotes expected value, $F$ is a known $r \times K$ matrix, and the columns of $V$ are independent and multivariate normal with a common covariance matrix $\Sigma$. It is noted that $F = 1$ in the special situation in which $r = K = 1$. Lee and Geisser (1972, 1975), Fearn (1975), and Lee (1988, 1991) considered the problem of predicting $V^{(2)}$, given $V^{(1)}$ and $Y$, if $V$ is partitioned as $V = (V^{(1)\prime}, V^{(2)\prime})'$, where $V^{(i)}$ is $p_i \times K$ $(i = 1, 2)$ and $p_1 + p_2 = p$. If $p$ is interpreted as the number of points in time being observed, then the problem is mainly concerned with predicting the generalized growth curve for future values for the same set of $p$ time points, or a subset of size $p_2$. When $p_2 < p$ and $K = 1$, it is also called the conditional prediction of the unobserved portion of a partially observed vector.
Meanwhile, we will also consider the situation in which the Box-Cox transformations are applied to the observations $Y$ in (1.1). The model, called the data-based transformed (DBT) model, is
$$Y^{(\lambda)} = X \tau A + E, \qquad (1.6)$$
where $Y^{(\lambda)} = (Y^{(\lambda)}_{ij})$ and
$$Y^{(\lambda)}_{ij} = \begin{cases} \{(Y_{ij} + \gamma)^{\lambda} - 1\}/\lambda, & \text{when } \lambda \ne 0, \\ \ln (Y_{ij} + \gamma), & \text{when } \lambda = 0, \end{cases} \qquad (1.7)$$
$\gamma$ is assumed to be a known constant such that $Y_{ij} + \gamma > 0$ for all $i$ and $j$, $\lambda$ is an unknown parameter, and the columns of $E$ are independently and normally distributed with mean vector 0 and common covariance matrix $\Sigma$ as specified in (1.2)–(1.4). It is noted that $\lambda$ could assume different values for different segments of observations. In Lee and Lu (1987), tremendous improvement was found in predictive accuracy using the DBT model for technology substitutions. This is primarily due to the fact that the linearity assumption for the growth function can be enhanced significantly with the Box-Cox transformation, along with incorporating into the model the proper dependence structure among the observations. Keramidas and Lee (1990) combined the concepts of the Box-Cox transformation and a generalized growth curve model with serial structure for forecasting technological substitutions based on the maximum likelihood method when repeated measurements of short time series are available. If the growth function is not linear, we can use the Box-Cox transformation so that the function is linear or piecewise linear, but the transformation may not stabilize the variance. Thus, it is appropriate to construct grouping of variances with AR(q) dependence.
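For concreteness, the transformation (1.7) and the inverse map used later in the point predictor (2.11) can be sketched as follows (a minimal illustration; the function names are ours, not the paper's):

```python
import math

def box_cox(y, lam, gamma=0.0):
    """Data-based transform of (1.7): {(y + gamma)^lam - 1}/lam, or log when lam = 0."""
    if lam != 0.0:
        return ((y + gamma) ** lam - 1.0) / lam
    return math.log(y + gamma)

def inv_box_cox(z, lam, gamma=0.0):
    """Back-transform to the original scale, as used in the predictor (2.11)."""
    if lam != 0.0:
        return (1.0 + lam * z) ** (1.0 / lam) - gamma
    return math.exp(z) - gamma
```

As a sanity check, the transform is continuous in $\lambda$ at 0 and the two functions are mutual inverses for admissible arguments.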
In Section 2, maximum likelihood estimation of parameters and prediction of future values are considered. In Section 3, we derive an asymptotic result and tests for the equality and grouping of variances. The results are illustrated in Section 4 with two real data sets and a simulation. Some concluding remarks are given in Section 5. Finally, the derivation of the Hessian matrix is given in the Appendix.
2. Parameter estimation and prediction of $V^{(2)}$ based on the maximum likelihood theory
From equation (1.4), we obtain the following equations:
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \ldots + \phi_q \rho_{k-q}, \qquad \text{for } k > 0, \qquad (2.1)$$
where for $j < 0$, $\rho_j = \rho_{-j}$ and $\rho_0 = 1$. From equation (2.1), for $k = 1, \ldots, q$, we obtain the well-known Yule-Walker equations by replacing $\rho_j$ with $\hat{\rho}_j$. The Yule-Walker estimate solves these equations for $\phi = (\phi_1, \ldots, \phi_q)'$ as a function of the $\rho$'s. Here, for the constraints on $\rho$, we would express $\rho$ as a function of $\phi$, the parameters to be estimated. From equation (2.1), if $q$ is odd,
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \ldots + \phi_{\frac{q+1}{2}-1} \rho_1 + \phi_{\frac{q+1}{2}} \rho_0 + \phi_{\frac{q+1}{2}+1} \rho_1 + \ldots + \phi_q \rho_{k-q};$$
else,
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \ldots + \phi_{\frac{q}{2}-1} \rho_1 + \phi_{\frac{q}{2}} \rho_0 + \phi_{\frac{q}{2}+1} \rho_1 + \ldots + \phi_q \rho_{k-q}.$$
Since $\rho_{-j} = \rho_j$ and $\rho_0 = 1$, it follows that
$$\rho_k = \begin{cases} \phi_k + \sum\limits_{j=1}^{k-1} (\phi_j + \phi_{2k-j})\, \rho_{k-j}, & \text{if } k = \dfrac{q+1}{2}, \\[1ex] \phi_k + \sum\limits_{j=1}^{k-1} (\phi_j + \phi_{2k-j})\, \rho_{k-j} + \sum\limits_{j'=2k}^{q} \phi_{j'} \rho_{j'-k}, & \text{if } k < \dfrac{q+1}{2}, \\[1ex] \phi_k + \sum\limits_{j=1}^{2k-q-1} \phi_j \rho_{k-j} + \sum\limits_{j'=k+1}^{q} (\phi_{2k-j'} + \phi_{j'})\, \rho_{j'-k}, & \text{if } k > \dfrac{q+1}{2}. \end{cases} \qquad (2.2)$$
Hence, we have natural constraints on $\rho$ that can be expressed as
$$G \rho + \phi = 0, \qquad (2.3)$$
where the matrix $G$ is a function of $\phi$; element-wise,
$$G_{ij} = \phi_{i+j} + \phi_{i-j} - \delta_{ij}, \qquad (2.4)$$
with $\phi_k \equiv 0$ if $k \notin \{1, \ldots, q\}$, and $\delta_{ij}$ equal to 1 if $i = j$ and 0 otherwise. Hence, for the AR(q) dependence structure, $C$ in (1.3) can be expressed as a function of $\phi$.
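The construction in (2.3)–(2.4) is easy to mechanize: build $G$ from $\phi$, solve $G\rho = -\phi$ for $(\rho_1, \ldots, \rho_q)$, extend by the recursion (2.1), and fill $C$ with the entries $\rho_{|i-j|}$. A pure-Python sketch (function names are ours; a naive Gaussian elimination stands in for a linear solver):

```python
def solve(A, b):
    """Naive Gaussian elimination with partial pivoting (adequate for small q)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ar_correlations(phi, p):
    """rho_1..rho_{p-1} of a stationary AR(q): solve G rho = -phi as in (2.3),
    then extend with the recursion (2.1)."""
    q = len(phi)
    f = lambda k: phi[k - 1] if 1 <= k <= q else 0.0   # phi_k = 0 outside 1..q
    # G_ij = phi_{i+j} + phi_{i-j} - delta_ij, i, j = 1..q   (2.4)
    G = [[f(i + j) + f(i - j) - (1.0 if i == j else 0.0)
          for j in range(1, q + 1)] for i in range(1, q + 1)]
    rho = solve(G, [-x for x in phi])                  # rho_1..rho_q
    r = lambda k: 1.0 if k == 0 else rho[abs(k) - 1]
    for k in range(q + 1, p):                          # recursion (2.1)
        rho.append(sum(phi[j - 1] * r(k - j) for j in range(1, q + 1)))
    return rho

def ar_corr_matrix(phi, p):
    """The p x p correlation matrix C of (1.3): C_ij = rho_{|i-j|}."""
    r = [1.0] + ar_correlations(phi, p)
    return [[r[abs(i - j)] for j in range(p)] for i in range(p)]
```

For $q = 1$ this reproduces the familiar AR(1) pattern $\rho_k = \phi_1^k$, and for $q = 2$ the Yule-Walker values $\rho_1 = \phi_1/(1 - \phi_2)$, $\rho_2 = \phi_2 + \phi_1 \rho_1$.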
From model (1.6), we have the following likelihood of $\tau$, $\sigma$, $\phi$ and $\lambda$:
$$L(\tau, \sigma, \phi, \lambda \mid Y) \propto |\Sigma|^{-\frac{N}{2}} \exp \left\{ -\tfrac{1}{2} \,\mathrm{tr}\, \Sigma^{-1} (Y^{(\lambda)} - X \tau A)(Y^{(\lambda)} - X \tau A)' \right\} J_Y, \qquad (2.5)$$
where $J_Y$, the Jacobian of the transformation from $Y^{(\lambda)}$ to $Y$, is defined as
$$J_Y = \prod_{j=1}^{N} \prod_{i=1}^{p} (Y_{ij} + \gamma)^{\lambda - 1}. \qquad (2.6)$$
From (3.6) of Lee (1988), we have
$$\mathrm{tr}\, \Sigma^{-1} (Y^{(\lambda)} - X \tau A)(Y^{(\lambda)} - X \tau A)' = \mathrm{tr}\, (X' \Sigma^{-1} X) \left[ (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} S \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} + (\tau - \hat{\tau})\, A A' (\tau - \hat{\tau})' \right] + \mathrm{tr}\, (Z' \Sigma Z)^{-1} Z' Y^{(\lambda)} Y^{(\lambda)\prime} Z,$$
and using the fact
$$\Sigma^{-1} = \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} + Z (Z' \Sigma Z)^{-1} Z',$$
where
$$S = Y^{(\lambda)} (I - A'(AA')^{-1} A)\, Y^{(\lambda)\prime} \qquad (2.7)$$
and $Z$ is a known $p \times (p - m)$ matrix with rank $p - m$ such that $X'Z = 0$, and $\hat{\tau}$, the MLE of $\tau$, is
$$\hat{\tau}(\Sigma) = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} Y^{(\lambda)} A' (A A')^{-1}, \qquad (2.8)$$
with
$$\Sigma = D_\sigma C D_\sigma. \qquad (2.9)$$
The MLEs of $\sigma$, $\phi$ and $\lambda$, denoted by $\hat{\sigma} = (\hat{\sigma}_1, \ldots, \hat{\sigma}_g)'$, $\hat{\phi} = (\hat{\phi}_1, \ldots, \hat{\phi}_q)'$ and $\hat{\lambda}$, respectively, are obtained by maximizing the profile likelihood function
$$L_{\max}(\sigma, \phi, \lambda) = |\Sigma|^{-N/2} \exp \left\{ -\tfrac{1}{2} \,\mathrm{tr} \left[ \Sigma^{-1} S + Z (Z' \Sigma Z)^{-1} Z' Y^{(\lambda)} A' (A A')^{-1} A Y^{(\lambda)\prime} \right] \right\} J_Y. \qquad (2.10)$$
The comments regarding the $\lambda$ values also apply when there are different values of $\lambda$ for different segments of observations.
We consider the conditional prediction of the future matrix $V^{(2)}$, given $Y$ and $V^{(1)}$. Schematically, the matrices $Y$, $V^{(1)}$ and $V^{(2)}$ are arranged as
$$\left( Y \;\middle|\; \begin{matrix} V^{(1)} \\ V^{(2)} \end{matrix} \right),$$
where $Y$, $p \times N$, is the complete sample; $V^{(1)}$, $p_1 \times K$, is the partially observed matrix; and $V^{(2)}$, $p_2 \times K$, is the unobserved portion to be predicted. Of course, $p_1 + p_2 = p$.
We restrict attention to the special situation in which $K = 1$ and $r = 1$; that is, only one vector is being predicted in the same group. This is similar to Lee and Geisser (1975). The approximate mean (denoted by $\hat{V}^{(2)}$), when the covariance structure given by (1.2) holds, can be obtained as
$$\hat{V}^{(2)} = \begin{cases} \left\{ 1 + \hat{\lambda} \left[ X^{(2)} \hat{\tau} + \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} (V^{(1)(\hat{\lambda})} - X^{(1)} \hat{\tau}) \right] \right\}^{1/\hat{\lambda}} - \gamma, & \text{if } \hat{\lambda} \ne 0, \\ \exp \left[ X^{(2)} \hat{\tau} + \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} (V^{(1)(\hat{\lambda})} - X^{(1)} \hat{\tau}) \right] - \gamma, & \text{if } \hat{\lambda} = 0, \end{cases} \qquad (2.11)$$
and, applying the delta method, an estimate of the approximate covariance matrix of $\hat{V}^{(2)}$ is
$$\hat{\Sigma}_{\hat{V}^{(2)}} \approx \begin{cases} \mathrm{diag}\, ((\hat{\lambda} X^{(2)} \hat{\tau} + 1)^{1/\hat{\lambda} - 1})\; \hat{W} \;\mathrm{diag}\, ((\hat{\lambda} X^{(2)} \hat{\tau} + 1)^{1/\hat{\lambda} - 1}), & \text{if } \hat{\lambda} \ne 0, \\ \mathrm{diag}\, (\exp (X^{(2)} \hat{\tau}))\; \hat{W} \;\mathrm{diag}\, (\exp (X^{(2)} \hat{\tau})), & \text{if } \hat{\lambda} = 0, \end{cases} \qquad (2.12)$$
with
$$\hat{W} = \frac{1}{N} (X^{(2)} - \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} X^{(1)}) (X' \hat{\Sigma}^{-1} X)^{-1} (X^{(2)} - \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} X^{(1)})' + \hat{\Sigma}_{22} - \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} \hat{\Sigma}_{12},$$
where $V^{(1)(\lambda)}$ is defined in a manner similar to $Y^{(\lambda)}$ given in (1.6) and (1.7), $\hat{\tau}$ is as given in (2.8) with $\Sigma$ replaced by $\hat{\Sigma}$, $X$ is partitioned as $X = (X^{(1)\prime}, X^{(2)\prime})'$, $X^{(i)}$ is $p_i \times m$ $(i = 1, 2)$, $\mathrm{diag}\,(a)$ is a diagonal matrix with elements $a$, and
$$\hat{\Sigma} = \hat{D}_\sigma \hat{C} \hat{D}_\sigma = \begin{pmatrix} \hat{\Sigma}_{11} & \hat{\Sigma}_{12} \\ \hat{\Sigma}_{21} & \hat{\Sigma}_{22} \end{pmatrix}, \qquad (2.13)$$
where $\hat{\Sigma}_{ij}$ is of dimension $p_i \times p_j$ ($p_1 + p_2 = p$). It is noted that $x^a = (x_1^a, \ldots, x_p^a)'$ where $x = (x_1, \ldots, x_p)'$ and $a$ is a scalar.
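Given the fitted quantities, the point predictor (2.11) is mechanical: a conditional-mean adjustment on the transformed scale followed by the inverse Box-Cox map. A sketch for $K = 1$ with plain Python lists (argument names are ours; the $\hat{\Sigma}$ blocks and $\hat{\tau}$ are assumed already computed elsewhere):

```python
import math

def predict_v2(x2_tau, s21, s11_inv, v1_lam, x1_tau, lam, gamma=0.0):
    """Point prediction (2.11) for K = 1.

    x2_tau, x1_tau : X^(2) tau-hat (length p2) and X^(1) tau-hat (length p1)
    s21            : p2 x p1 block Sigma-hat_21
    s11_inv        : p1 x p1 inverse of Sigma-hat_11
    v1_lam         : transformed partially observed vector V^(1)(lam)
    """
    p1, p2 = len(x1_tau), len(x2_tau)
    resid = [v1_lam[i] - x1_tau[i] for i in range(p1)]     # V^(1)(lam) - X^(1) tau
    adj = [sum(s11_inv[i][j] * resid[j] for j in range(p1)) for i in range(p1)]
    out = []
    for k in range(p2):
        z = x2_tau[k] + sum(s21[k][i] * adj[i] for i in range(p1))
        # inverse Box-Cox back to the original scale
        out.append((1.0 + lam * z) ** (1.0 / lam) - gamma if lam != 0.0
                   else math.exp(z) - gamma)
    return out
```

With $\hat{\Sigma}_{21} = 0$ the predictor reduces to the back-transformed marginal mean, as one would expect.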
3. Asymptotic result and testing equality and grouping of variances
In this section, we will derive the information matrix which is useful for obtaining the standard errors of the MLEs and testing equality and grouping of variances.
3.1 Information matrix
For ease of presentation and some practical considerations, we will restrict our attention to the special situation in which $r = 1$. The log-likelihood obtained from (2.5) is
$$l = l(\tau, \sigma, \phi, \lambda \mid Y) = -\frac{N}{2} \log (|\Sigma|) - \frac{1}{2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X \tau)' \Sigma^{-1} (Y^{(\lambda)}_i - X \tau) + \log (J_Y). \qquad (3.1)$$
We are unable to take the expectation of the Hessian matrix, which is too complicated; only the observed Hessian matrix is used in this paper. This is still useful in computing the standard errors of the MLEs. Let $\theta = (\tau', \sigma', \phi', \lambda)'$, and suppose there are $g$ groups with $g_i$ members in the $i$th group. The Hessian matrix, $H = \partial^2 l / \partial \theta\, \partial \theta'$, has the following block partitioned form:
$$H = \begin{pmatrix} H_{11} & H_{12} & H_{13} & H_{14} \\ H_{12}' & H_{22} & H_{23} & H_{24} \\ H_{13}' & H_{23}' & H_{33} & H_{34} \\ H_{14}' & H_{24}' & H_{34}' & H_{44} \end{pmatrix}, \qquad (3.2)$$
with blocks $H_{11} = \partial^2 l/\partial \tau\, \partial \tau'$, $H_{12} = \partial^2 l/\partial \tau\, \partial \sigma'$, $H_{13} = \partial^2 l/\partial \tau\, \partial \phi'$, $H_{14} = \partial^2 l/\partial \tau\, \partial \lambda$, $H_{22} = \partial^2 l/\partial \sigma\, \partial \sigma'$, $H_{23} = \partial^2 l/\partial \sigma\, \partial \phi'$, $H_{24} = \partial^2 l/\partial \sigma\, \partial \lambda$, $H_{33} = \partial^2 l/\partial \phi\, \partial \phi'$, $H_{34} = \partial^2 l/\partial \phi\, \partial \lambda$ and $H_{44} = \partial^2 l/\partial \lambda^2$, where $H_{11} = -N X' \Sigma^{-1} X$ and the other derivatives are given in the Appendix.
Thus, an approximate information matrix is
$$I_\theta \approx -H(\hat{\theta}), \qquad (3.3)$$
where $\hat{\theta}$ is the mode of the likelihood function. Under some regularity conditions the estimator $\hat{\theta}$ is asymptotically normal with mean $\theta$ and covariance matrix $I_\theta^{-1}$.
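When the analytic blocks are unwieldy, $H(\hat{\theta})$ can also be approximated numerically: difference the log-likelihood twice around the mode and invert $-H$ to get standard errors. A generic sketch (the quadratic toy log-likelihood below is ours, standing in for (3.1)):

```python
import math

def hessian(f, x, h=1e-5):
    """Central-difference Hessian of a scalar function f at the point x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp = x[:]; xpm = x[:]; xmp = x[:]; xmm = x[:]
            xpp[i] += h; xpp[j] += h
            xpm[i] += h; xpm[j] -= h
            xmp[i] -= h; xmp[j] += h
            xmm[i] -= h; xmm[j] -= h
            H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4.0 * h * h)
    return H

# Toy check: for l(theta) = -0.5 * n * theta^2 (a N(theta, 1) log-likelihood core
# with n observations at 0), -H = n, so the standard error is 1/sqrt(n).
n = 25
loglik = lambda t: -0.5 * n * t[0] ** 2
H = hessian(loglik, [0.0])
se = 1.0 / math.sqrt(-H[0][0])
```

For a multi-parameter $\theta$ the same routine fills the full block matrix (3.2), and the standard errors are the square roots of the diagonal of $(-H)^{-1}$.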
3.2 Testing equality and grouping of variances
We will first test the equality of the $g$ different variances under the assumption of grouping variances, that is,
$$H_0: D\sigma = 0 \quad \text{versus} \quad H_1: D\sigma \ne 0,$$
where
$$D_{(g-1) \times g} = \begin{pmatrix} 1 & -1 & 0 & \ldots & 0 \\ 0 & 1 & -1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \ldots & 0 & 1 & -1 \end{pmatrix}.$$
The procedure is based on the likelihood ratio criterion,
$$\lambda(Y) = \frac{\sup_{\theta \in \Omega_0} L(\theta \mid Y)}{\sup_{\theta \in \Omega} L(\theta \mid Y)},$$
and $\lambda_\alpha$ is chosen so that
$$\sup_{\theta \in \Omega_0} \Pr \{ \lambda(Y) \le \lambda_\alpha; \theta \} \le \alpha,$$
where $\Omega_0$ is the null space when $H_0$ is true while $\Omega$ is the entire parameter space under the assumption of grouping variances. Let $Q(Y) = -2 \ln \lambda(Y)$. Under some mild regularity conditions, the statistic $Q(Y)$ is asymptotically Chi-square distributed with $g - 1$ degrees of freedom under the null hypothesis.
The likelihood ratio test procedure can also be applied to test the null hypothesis $H_0^*$: grouping variances, versus $H_1^*$: all $p$ variances are distinct. The likelihood ratio criterion is
$$\lambda^*(Y) = \frac{\sup_{\theta \in \Omega_0^*} L(\theta \mid Y)}{\sup_{\theta \in \Omega^*} L(\theta \mid Y)},$$
where $\Omega_0^*$ is the null space when $H_0^*$ is true while $\Omega^*$ is the entire parameter space when the $p$ variances are distinct. Let $Q^*(Y) = -2 \ln \lambda^*(Y)$. Then, under some mild regularity conditions, $Q^*(Y)$ is asymptotically Chi-square distributed with $p - g$ degrees of freedom under the null hypothesis.
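Both tests reduce to doubling the gap between maximized log-likelihoods and referring the result to a $\chi^2$ law. A sketch with a closed-form survival function for integer degrees of freedom (the recurrence used is standard; the helper names are ours):

```python
import math

def chi2_sf(x, df):
    """P(X > x) for X ~ chi-square(df), integer df, via the standard recurrence
    Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) e^(-x/2) / Gamma(v/2 + 1)."""
    if df % 2 == 1:
        q, v = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        q, v = math.exp(-x / 2.0), 2
    while v < df:
        q += (x / 2.0) ** (v / 2.0) * math.exp(-x / 2.0) / math.gamma(v / 2.0 + 1.0)
        v += 2
    return q

def lr_test(loglik_null, loglik_alt, df):
    """Q = 2(l1 - l0) and its asymptotic chi-square p-value."""
    Q = 2.0 * (loglik_alt - loglik_null)
    return Q, chi2_sf(Q, df)
```

For instance, feeding in the Section 4.2 values $-113.88$ (grouped) and $-111.98$ (all distinct) with $p - g = 1$ degree of freedom reproduces $Q^* = 3.80$ and $p \approx 0.051$.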
4. Illustrative example
This section is devoted to the illustration of the conditional prediction of $V^{(2)}$ given $V^{(1)}$ and $Y$. For the conditional prediction, we will set $K = 1$ and $p_2 = 1, 2, 3, 4$; that is, we will predict the last four observations of a partially observed vector. If $p = 7$ and $p_2 = 1, 2, 3, 4$, we denote the predictions as $V_7$, $V_6$–$V_7$, $V_5$–$V_7$ and $V_4$–$V_7$, respectively. For the parameter $\tau$, we will assume that the $N$ observations have come from the same group. We will use the predictive sample reuse (PSR), or leave-one-out (LOO), method. The method was used by Lee and Geisser (1975), Rao (1987) and Lee (1988), among others. The discrepancy measure is the mean absolute relative deviation (MARD) of the predictions from the actuals, defined as
$$\mathrm{MARD} = (N p_2)^{-1} \sum_{j=1}^{p_2} \sum_{i=1}^{N} \frac{ | Y^{(2)}_{ij} - \hat{Y}^{(2)}_{ij} | }{ Y^{(2)}_{ij} },$$
where $Y = (Y_1, \ldots, Y_N)$, $Y_i = \begin{pmatrix} Y^{(1)}_i \\ Y^{(2)}_i \end{pmatrix}$, $Y^{(1)}_i$ is $p_1 \times 1$, $Y^{(2)}_i$ is $p_2 \times 1$, $p_1 + p_2 = p$, and $\hat{Y}^{(2)}_i$ is its predicted value. It is noted that when $Y^{(2)}_i$ is being predicted, the complete sample is $Y_{(i)} = (Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_N)$ and the partially observed vector is $Y^{(1)}_i$.
4.1 The mice data
The mice data set was reported by Williams and Izenman (1981) and analyzed by Rao (1984, 1987) and Lee (1988, 1991). It consists of the weights of 13 male mice, measured at intervals of 3 days, over the 21 days from birth to weaning. The data are plotted in Figure 1. We assume a second degree polynomial for the growth curve, as noted by Rao (1984). When the growth function is quadratic, the design matrix $X$ is
$$X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 4 & 9 & 16 & 25 & 36 & 49 \end{pmatrix}',$$
and the design matrix $A$ is a $1 \times 13$ vector consisting of all 1s.
The ML estimates, with standard errors in parentheses, for the model (1.6) with $q = 1$, $m = 3$ and distinct variances are given in Table 1. When the same $\lambda$ is applied to all observations for each mouse, the estimate of $\sigma$ with a 95% confidence interval for each time point is plotted in Figure 2. We found that the mice data with the same Box-Cox transformation for different time points have a possible change point at day 15. From Table 1, we also found that the standard deviation varies with time and stabilizes at 12 days, which is also displayed in Figure 2. Hence, a possible model is to group the last three variances as a single group and the rest as four different variances. To test the adequacy of this grouping variances structure: the value of the log-likelihood function is 169.19 under AR(1) dependence with the above grouping variances, while under the equality of variances it is 126.52, resulting in a likelihood ratio statistic $Q = 85.34$ and p-value $< 0.0001$, indicating that the homogeneity of variances is not acceptable. Meanwhile, we can test the grouping variances ($H_0^*$) against the alternative hypothesis that all 7 variances are distinct ($H_1^*$). When all 7 variances are distinct, the value of the log-likelihood function is 170.10, resulting in a likelihood ratio statistic $Q^* = 1.82$ and p-value $= 0.5975$, indicating that the grouping of variances is acceptable. Thus, the AR(1) dependence with the above grouping of variances is a possible model for the data.

Table 1
ML estimation results for the distinct variances with AR(1) errors

$\hat{\tau}$: 1.1125 (0.0090), 0.2507 (0.0085), 0.0152 (0.0011)
$\hat{\sigma}$: 0.0333 (0.0099), 0.0461 (0.0113), 0.0666 (0.0144), 0.1093 (0.0226), 0.1279 (0.0242), 0.1333 (0.0266), 0.1100 (0.0198)
$\hat{\rho}$: 0.9006 (0.054)
$\hat{\lambda}$: 0.8372 (0.1420)
Comparisons of the predictive performance of different grouping variances structures for conditional predictions of $V^{(2)}$ given $V^{(1)}$ and $Y$, without and with the Box-Cox transformation, are summarized in Table 2 and Table 3, respectively. In the tables, the notation $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ denotes the grouping of the variances corresponding to the seven time points of the mice data into five groups with $\sigma_5 = \sigma_6 = \sigma_7$. We found that the grouping variances are better than the common variance with the AR(q) dependence structure when $q = 1$ or 2, coupled with or without the Box-Cox transformation. The criterion used in the comparison is MARD. For example, for the prediction of $V_7$, the best prediction result occurs when a common variance is assumed for the AR(2) dependence coupled with the Box-Cox transformation; the resulting MARD is 0.0354, which is slightly better than the result with the above grouping variances. Meanwhile, the best model for the prediction of $V_6$–$V_7$, $V_5$–$V_7$, and $V_4$–$V_7$ is the untransformed AR(1) dependence, having a quadratic growth function and with the grouping variances $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$. Thus, with the exception of predicting $V_7$ alone, the best model seems to be the untransformed AR(1) dependence, having a quadratic growth function and with the grouping variances $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$. Of course, this finding is consistent with the variance estimates as shown in Figure 2.
Table 2
The MARD between the predicted and the actual values of $V^{(2)}$: mice data

Growth function | Covariance structure | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
linear | $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(1) | 0.0477 | 0.0602 | 0.0617 | 0.0808
linear | $\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(1) | 0.0454 | 0.0822 | 0.0866 | 0.1097
quadratic | $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(1) | 0.0403 | 0.0523 | 0.0588 | 0.0722
quadratic | $\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(1) | 0.0406 | 0.0525 | 0.0628 | 0.0965

Table 3
The MARD between the predicted and the actual values of $V^{(2)}$ with the Box-Cox transformation and linear growth function: mice data

Covariance structure | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
$\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(1) | 0.0391 | 0.0549 | 0.0610 | 0.0748
$\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(1) | 0.0418 | 0.0729 | 0.0798 | 0.1067
$\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(2) | 0.0367 | 0.0638 | 0.0654 | 0.0847
$\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(2) | 0.0354 | 0.0836 | 0.0777 | 0.1084
4.2 The drug dissolution data
The data given in Lee et al. (1999) are the dissolution rates of three standard lots and one test lot. For each lot there are twelve tablets, and for each tablet the dissolution rates are measured at seven time points: 1, 2, 3, 4, 6, 8 and 10 minutes. We used the pooled data of the three lots and removed the observations at times 1 and 3 to create an equally-spaced dataset. The dissolution function $F(t)$ of a drug is defined to be the percentage of a tablet that has dissolved at time $t$, and $R(t)$ is defined by $F(t)/(100 - F(t))$. The transformed data are plotted in Figure 3. Since $0 \le F(t) \le 100$ and $R(t) \ge 0$, neither range is the entire real line, which may cause an out-of-range problem if we model them directly. We will therefore consider applying the Box-Cox transformations to $R(t)$, which avoids this problem. We observed that there is one possible change point at time 8. Thus, the design matrix $X$ for the growth curve model is
$$X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 2 & 4 & 6 & 8 & 8 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}'.$$
From Figure 3, it is clear that $F/(100 - F)$ is not quite linear, but piecewise linear, in time. The Box-Cox transformation applied to $F/(100 - F)$ will help achieve the linearity property. If time is cut into two pieces, 2–6 and 8–10, just as in the design matrix, and the Box-Cox transformation is applied to the two pieces separately, the variances are more stable. When different Box-Cox transformations are applied to observations from the two different segments, the estimates of $\sigma_1, \ldots, \sigma_5$ for the drug dissolution data are plotted with confidence intervals in Figure 4.
From Figure 4, we found that the standard deviation of the pooled data varies with time. Hence, a possible model is to treat $\sigma_2 = \sigma_3$ and the rest as three different variances in an AR(1) dependence. The value of the log-likelihood function increased from $-134.35$ under the equality of variances to $-113.88$ under this grouping variances structure, resulting in a likelihood ratio statistic $Q = 40.94$ with p-value $< 0.0001$, indicating that the homogeneity of variances is not acceptable. When all 5 variances are distinct, the value of the log-likelihood function is $-111.98$, resulting in a likelihood ratio statistic $Q^* = 3.80$ and p-value $= 0.0513$, indicating that the grouping of variances is acceptable. From Table 4, we found that the model with $q = 1$ and two different variance groups is better than that with an equal variance.
Table 4
The MARD between the predicted and the actual values of $V^{(2)}$ with the Box-Cox transformation: drug data

Drug data | Covariance structure | $V_5$ | $V_4$–$V_5$
Pooled | $\{(1)\,(2, 3)\,(4)\,(5)\}$ & AR(1) | 0.02672 | 0.02369
Pooled | $\{(1)\,(2)\,(3)\,(4)\,(5)\}$ & AR(1) | 0.02689 | 0.02436
Pooled | $\{(1, 2, 3, 4, 5)\}$ & AR(1) | 0.02742 | 0.02713

Fig. 4. The estimates of $\sigma_1, \ldots, \sigma_5$ of the drug dissolution data with 95% confidence intervals

4.3 Simulation

In this subsection we present a simulation study regarding the grouping variances structure. In order to compare different grouping variances, we set $\tau = (0.1, 0.2)'$, the design matrix
$$X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{pmatrix}',$$
$\phi = 0.93$, and the number of replications is 1000. We generate data from the grouping variances $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.014)$ and $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.021)$, and present the conditional predictive performance in Table 5 and Table 6, respectively. From the two tables, it is clear that the grouping variances structure yields better and more stable prediction results, as expected.
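One replication of such data can be generated by drawing the columns of $E$ from $N(0, D_\sigma C D_\sigma)$ via a Cholesky factor and adding $X\tau$. A stdlib-only sketch under the first setting of Table 5 (function names are ours):

```python
import math, random

def cholesky(A):
    """Lower-triangular L with L L' = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def simulate_growth_data(tau, sigmas, rho, n, seed=0):
    """One replication: Y_i = X tau + e_i, e_i ~ N(0, D_sigma C D_sigma),
    with the linear design X = (1, t)' of Section 4.3 and AR(1) correlation."""
    rng = random.Random(seed)
    p = len(sigmas)
    mean = [tau[0] + tau[1] * (t + 1) for t in range(p)]          # X tau, t = 1..p
    cov = [[sigmas[i] * sigmas[j] * rho ** abs(i - j) for j in range(p)]
           for i in range(p)]
    L = cholesky(cov)
    data = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        e = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        data.append([mean[i] + e[i] for i in range(p)])
    return data

# e.g. the first setting of Table 5 with N = 12:
Y = simulate_growth_data((0.1, 0.2), [0.007] * 5 + [0.014] * 2, 0.93, 12)
```

Repeating this 1000 times and running the leave-one-out predictions of Section 4 under each candidate covariance structure reproduces the kind of comparison reported in Tables 5 and 6.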
5. Concluding remarks
The model with grouping variances in an AR(q) dependence structure provides an effective and practical means of dealing with growth curve data. An appropriate grouping among the variances in an AR(q) dependence structure gives better and more stable predictive performance. If the growth function is not linear, we can apply the Box-Cox transformation so that the function is linear. If there are change points, we can apply different Box-Cox transformations for different segments of observations so that the function is piecewise linear. If the variance varies over time, we can group the variances appropriately in an AR(q) covariance structure. We can plot confidence intervals for the standard deviations at each time point in order to observe an adequate grouping, and use the likelihood ratio test for the adequacy of grouping the variances. The final grouping of variances is obtained from the prediction results for the last few rows of $Y$, the observed matrix. When the time series is short and the sample size is small, usually $q = 1$ or 2 will suffice. It is conceivable that the grouping of variances will not be a good idea if the variances are somewhat similar. However, if the magnitudes are distinct and in clear clusters, the grouping of variances will certainly improve the predictive performance. Hence, the procedure developed in this paper will be useful for dealing with growth curve data.

Table 5
Prediction performance for data generated from grouping variances $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.014)$ with AR(1) dependence: simulated data

MARD | N | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
Grouping variances | 12 | 0.00270 | 0.00352 | 0.00358 | 0.00367
Grouping variances | 36 | 0.00274 | 0.00355 | 0.00356 | 0.00375
Identical variance | 12 | 0.00273 | 0.00518 | 0.00449 | 0.00430
Identical variance | 36 | 0.00276 | 0.00516 | 0.00443 | 0.00431

Table 6
Prediction performance for data generated from grouping variances $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.021)$ with AR(1) dependence: simulated data

MARD | N | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
Grouping variances | 12 | 0.00414 | 0.00536 | 0.00522 | 0.00502
Grouping variances | 36 | 0.00407 | 0.00532 | 0.00511 | 0.00503
Identical variance | 12 | 0.00428 | 0.00922 | 0.00741 | 0.00674
Identical variance | 36 | 0.00416 | 0.00925 | 0.00723 | 0.00646
Appendix

In this appendix, we derive the Hessian matrix. $H_{22}$, $H_{23}$, $H_{33}$, $H_{12}$, $H_{13}$, $H_{14}$, $H_{24}$ and $H_{34}$ can be obtained from the following equations:
$$\frac{\partial^2 l}{\partial \sigma_k\, \partial \sigma_j} = \begin{cases} -\dfrac{1}{2}\, \sigma_j^{-2} \sigma_k^{-2} \sum\limits_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ E_{kk} C^{-1} E_{jj} + E_{jj} C^{-1} E_{kk} \right] (Y^{(\lambda)}_i - X\tau), & \text{if } k \ne j, \\[1.5ex] N g_j \sigma_j^{-2} - \sigma_j^{-3} \sum\limits_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ D_\sigma^{-1} C^{-1} E_{jj} + E_{jj} C^{-1} D_\sigma^{-1} + \sigma_j^{-1} E_{jj} C^{-1} E_{jj} \right] (Y^{(\lambda)}_i - X\tau), & \text{if } k = j, \end{cases}$$
$$\frac{\partial^2 l}{\partial \sigma_j\, \partial \phi_k} = -\frac{1}{2}\, \sigma_j^{-2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ D_\sigma^{-1} C^{-1} \frac{\partial C}{\partial \phi_k} C^{-1} E_{jj} + E_{jj} C^{-1} \frac{\partial C}{\partial \phi_k} C^{-1} D_\sigma^{-1} \right] (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \phi_j\, \partial \phi_k} = -\frac{N}{2}\, \mathrm{tr} \left[ C^{-1} \frac{\partial^2 C}{\partial \phi_j\, \partial \phi_k} - C^{-1} \frac{\partial C}{\partial \phi_j} C^{-1} \frac{\partial C}{\partial \phi_k} \right] - \frac{1}{2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' D_\sigma^{-1} C^{-1} \left[ \frac{\partial^2 C}{\partial \phi_j\, \partial \phi_k} - \frac{\partial C}{\partial \phi_j} C^{-1} \frac{\partial C}{\partial \phi_k} - \frac{\partial C}{\partial \phi_k} C^{-1} \frac{\partial C}{\partial \phi_j} \right] C^{-1} D_\sigma^{-1} (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \tau\, \partial \sigma_j} = -\sigma_j^{-2} \sum_{i=1}^{N} X' \left( D_\sigma^{-1} C^{-1} E_{jj} + E_{jj} C^{-1} D_\sigma^{-1} \right) (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \tau\, \partial \phi_j} = -\sum_{i=1}^{N} X' D_\sigma^{-1} C^{-1} \frac{\partial C}{\partial \phi_j} C^{-1} D_\sigma^{-1} (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \tau\, \partial \lambda} = \sum_{i=1}^{N} X' \Sigma^{-1} \frac{\partial Y^{(\lambda)}_i}{\partial \lambda},$$
$$\frac{\partial^2 l}{\partial \sigma_j\, \partial \lambda} = \sigma_j^{-2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ D_\sigma^{-1} C^{-1} E_{jj} + E_{jj} C^{-1} D_\sigma^{-1} \right] \frac{\partial Y^{(\lambda)}_i}{\partial \lambda},$$
$$\frac{\partial^2 l}{\partial \phi_j\, \partial \lambda} = \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' D_\sigma^{-1} C^{-1} \frac{\partial C}{\partial \phi_j} C^{-1} D_\sigma^{-1} \frac{\partial Y^{(\lambda)}_i}{\partial \lambda},$$
$$\frac{\partial^2 l}{\partial \lambda^2} = -\sum_{i=1}^{N} \frac{\partial Y^{(\lambda)\prime}_i}{\partial \lambda}\, \Sigma^{-1} \frac{\partial Y^{(\lambda)}_i}{\partial \lambda} - \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \Sigma^{-1} \frac{\partial^2 Y^{(\lambda)}_i}{\partial \lambda^2},$$
where $E_{jj}$ is a $p \times p$ matrix with 1 on the diagonal from position $\sum_{i=1}^{j-1} g_i + 1$ to $\sum_{i=1}^{j} g_i$ and zero elsewhere. Since $C$ is a function of $\rho$, the first two derivatives of $C$ with respect to $\phi$ can be obtained from the first two derivatives of $\rho$ as given below:
$$\frac{\partial \rho}{\partial \phi_j} = G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} \phi - G^{-1} e_j,$$
$$\frac{\partial^2 \rho}{\partial \phi_j\, \partial \phi_k} = G^{-1} \frac{\partial^2 G}{\partial \phi_j\, \partial \phi_k} G^{-1} \phi + G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} e_k + G^{-1} \frac{\partial G}{\partial \phi_k} G^{-1} e_j - G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} \frac{\partial G}{\partial \phi_k} G^{-1} \phi - G^{-1} \frac{\partial G}{\partial \phi_k} G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} \phi,$$
where $e_j$ is the $j$th column vector of a $q \times q$ identity matrix $I_q$. Finally, the first two derivatives of $Y^{(\lambda)}_i$ with respect to $\lambda$ can be easily obtained from the following formulas:
$$\frac{\partial Y^{(\lambda)}_{ij}}{\partial \lambda} = \begin{cases} -\dfrac{(Y_{ij} + \gamma)^{\lambda} - 1}{\lambda^2} + \dfrac{(Y_{ij} + \gamma)^{\lambda} \log (Y_{ij} + \gamma)}{\lambda}, & \lambda \ne 0, \\ 0, & \lambda = 0, \end{cases}$$
$$\frac{\partial^2 Y^{(\lambda)}_{ij}}{\partial \lambda^2} = \begin{cases} \dfrac{2((Y_{ij} + \gamma)^{\lambda} - 1)}{\lambda^3} - \dfrac{2 (Y_{ij} + \gamma)^{\lambda} \log (Y_{ij} + \gamma)}{\lambda^2} + \dfrac{(Y_{ij} + \gamma)^{\lambda} (\log (Y_{ij} + \gamma))^2}{\lambda}, & \lambda \ne 0, \\ 0, & \lambda = 0. \end{cases}$$

Acknowledgement
The authors thank a referee for some helpful comments on the earlier version of the paper.
References
Fearn, T., 1975: A Bayesian Approach to Growth Curves. Biometrika 62, 89–100.
Geary, D. N., 1989: Modeling the Covariance Structure of Repeated Measurements. Biometrics 45, 1183–1195.
Geisser, S., 1970: Bayesian Estimation in Multivariate Analysis. Sankhya A32, 53–64.
Geisser, S., 1980: Growth Curve Analysis. In P. R. Krishnaiah (ed.): Handbook of Statistics (Vol. 1). North-Holland, Amsterdam, 89–115.
Jennrich, R. I. and Schluchter, M. D., 1986: Unbalanced Repeated-Measures Models with Structured Covariance Matrices. Biometrics 42, 805–820.
Keramidas, E. M. and Lee, J. C., 1990: Forecasting Technological Substitutions with Concurrent Short Time Series. Journal of the American Statistical Association 85, 625–632.
Keramidas, E. M. and Lee, J. C., 1995: Selection of a Covariance Structure for Growth Curves. Biometrical Journal 37, 783–797.
Khatri, C. G., 1966: A Note on MANOVA Model Applied to Problems in Growth Curves. Annals of the Institute of Statistical Mathematics 18, 75–86.
Laird, N. M. and Ware, J. H., 1982: Random-Effects Models for Longitudinal Data. Biometrics 38, 963–974.
Lee, J. C., 1988: Prediction and Estimation of Growth Curves with Special Covariance Structures. Journal of the American Statistical Association 83, 432–440.
Lee, J. C., 1991: Test and Model Selection for the General Growth Curve Model. Biometrics 47, 147–159.
Lee, J. C., Chen, D. T., Hung, H. N. and Chen, J. J., 1999: Analysis of Drug Dissolution. Statistics in Medicine 18, 799–814.
Lee, J. C. and Geisser, S., 1972: Growth Curve Prediction. Sankhya A34, 393–412.
Lee, J. C. and Geisser, S., 1975: Applications of Growth Curve Prediction. Sankhya A37, 239–256.
Lee, J. C. and Lu, K. W., 1987: On a Family of Data-based Transformed Models Useful in Forecasting Technological Substitutions. Technological Forecasting and Social Change 31, 61–78.
Mansour, H., Nordheim, E. V. and Rutledge, J. J., 1985: Maximum Likelihood Estimation of Variance Components in Repeated-Measures Designs Assuming Autoregressive Errors. Biometrics 41, 287–294.
Pan, J. X., Fang, K. T. and von Rosen, D., 1999: Bayesian Local Influence in Growth Curve Model with Unstructured Covariance. Biometrical Journal 41, 641–658.
Potthoff, R. F. and Roy, S. N., 1964: A Generalized Multivariate Analysis of Variance Model Useful Especially for Growth Curve Problems. Biometrika 51, 313–326.
Rao, C. R., 1984: Prediction of Future Observations in Polynomial Growth Curve Model. In Applications and New Directions: Proc. Indian Statistical Institute Golden Jubilee International Conference on Statistics. Indian Statistical Institute, Calcutta, 512–520.
Rao, C. R., 1987: Prediction of Future Observations in Growth Curve Model. Statistical Science 2, 434–471.
Rochon, J., 1992: ARMA Covariance Structures with Time Heteroscedasticity for Repeated Measures Experiments. Journal of the American Statistical Association 87, 777–784.
von Rosen, D., 1991: The Growth Curve Model: A Review. Communications in Statistics - Theory and Methods 20, 2791–2822.
Williams, J. S. and Izenman, A. J., 1981: A Class of Linear Spectral Models and Analysis for the Study of Longitudinal Data. Technical Report, Dept. of Statistics, Colorado State University.
Received August 2001; revised January 2002; revised May 2002; accepted August 2002.