Estimation and Prediction of Generalized Growth Curve
with Grouping Variances in AR(q) Dependence Structure
Jack C. Lee* and Ying-Lin Hsu
Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan, R.O.C.
Summary
In this paper we consider maximum likelihood analysis of the generalized growth curve model with the Box-Cox transformation when the covariance matrix has an AR(q) dependence structure with grouping variances. The covariance matrix under consideration is $\Sigma = D_\sigma C D_\sigma$, where $C$ is the correlation matrix of a stationary autoregressive process of order $q$, $q < p$, and $D_\sigma$ is a diagonal matrix with $p$ elements divided into $g\,(\le p)$ groups; i.e., $D_\sigma$ is a function of $\{\sigma_1, \ldots, \sigma_g\}$, where $-1 < \rho < 1$ and $\sigma_l$, $l = 1, \ldots, g$, are unknown. We consider both parameter estimation and prediction of future values. Results are illustrated with real and simulated data.
Key words and phrases: Box-Cox transformation; Heteroscedasticity; Maximum likelihood estimates; Predictions; Simulations
1. Introduction
We consider a generalized multivariate analysis of variance model useful especially for many types of growth curve problems. The model was first proposed by Potthoff and Roy (1964) and subsequently considered by many authors, including Khatri (1966), Geisser (1970, 1980), Lee and Geisser (1972, 1975), Fearn (1975), Jennrich and Schluchter (1986), Rao (1987), Lee (1988, 1991), von Rosen (1991), and Pan, Fang and von Rosen (1999), among others.
The growth curve model is defined as
$$Y_{p \times N} = X_{p \times m}\, \tau_{m \times r}\, A_{r \times N} + E_{p \times N}, \qquad (1.1)$$
where $\tau$ is an unknown matrix and $X$ and $A$ are known design matrices of ranks $m < p$ and $r < N$, respectively. The columns of $E$ are independent $p$-variate normal, with mean vector 0 and common covariance matrix $\Sigma$. In general, $p$ is the number of time (or spatial) points observed on each of the $N$ individuals; $m$ and $r$, which specify the degree of polynomial in time (or space) and the number of distinct groups, respectively, are assumed known. The design matrices $X$ and $A$ will therefore characterize the degree of the growth function and the distinct grouping out of the $N$ independent vector observations. Potthoff and Roy (1964) gave many examples of growth curve applications for the model (1.1). Lee and Geisser (1975), Rao (1987) and Lee (1988), among others, applied the model to some biological data. Keramidas and Lee (1990) applied the model to the forecasting of technological substitutions.

© 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 0323-3847/03/0203-0165 $17.50+.50/0
Lee (1988, 1991), Lee and Lu (1987) and Keramidas and Lee (1990, 1995) demonstrated repeatedly the importance of the AR(1) dependence, or serial covariance structure, for the covariance matrix $\Sigma$ in the model (1.1).
Mansour, Nordheim and Rutledge (1985) allowed for time heteroscedasticity by applying a nonstationary AR(1) error process for repeated measures experiments. Geary (1989) presented a more general case. Rochon (1992) presented stationary ARMA covariance structures with time heteroscedasticity. Laird and Ware (1982) considered the random effects model for longitudinal data, which includes the growth curve as a special case. They provided for polynomial trends over time, e.g. a random slope. Jennrich and Schluchter (1986) discussed the use of covariance structures which include autoregressive as well as random effects error terms. In our paper, we provide for grouping the variances in order to construct a more parsimonious and realistic covariance structure, and we provide the Box-Cox transformation in order to make the growth function linear or piecewise linear, which is helpful for prediction. We conduct the likelihood ratio test for the adequacy of grouping the variances; in conducting the test, we are aided by the plot of confidence intervals for the standard deviations in the model. In theory, however, some of the results given in Laird and Ware (1982) and Jennrich and Schluchter (1986) could be extended to provide the features considered in this paper.
The estimation of parameters and prediction of future values for the model (1.1) having the covariance structure $\Sigma = \sigma^2 C$, where $C$ is the correlation matrix of a stationary autoregressive process of order 1, or AR(1), have been studied using the ML method by Lee (1988). The purpose of this paper is to consider an extension of this covariance structure to an AR(q) process with grouping variances. An AR(q) dependence structure with grouping variances in the growth curve model is much more general and quite useful in practice because it includes the AR(1) dependence with a common variance as a special case. When the AR(q) dependence structure with grouping variances holds for $\Sigma$, we have
$$\Sigma = D_\sigma C D_\sigma, \qquad (1.2)$$
where $C = (c_{ij})$ for $i, j = 1, \ldots, p$; i.e.,
$$c_{ij} = \begin{cases} \rho_{|i-j|}, & \text{when } i \ne j, \\ 1, & \text{when } i = j, \end{cases} \qquad (1.3)$$
$$\rho_i = \sum_{j=1}^{q} \phi_j \rho_{|i-j|}, \qquad \text{for } i \ge 1, \qquad (1.4)$$
$D_\sigma$ is a $p \times p$ diagonal matrix and a function of $\{\sigma_1, \ldots, \sigma_g\}$; the $\phi_j$, $j = 1, \ldots, q$, are unknown and satisfy the assumption that the roots of the equation $1 - \sum_{j=1}^{q} \phi_j x^j = 0$ lie outside the unit circle $\|x\| = 1$ in the complex plane; and $\sigma_l > 0$ for $l = 1, \ldots, g$ are unknown. In the grouping of the variances, we assume that there are $g_i$ members in the $i$th group for $i = 1, \ldots, g$. The purpose of this paper is to consider this covariance structure from the maximum likelihood point of view, hoping that a more effective and practical solution can be furnished. We will compare our results with those based on the ML method for the AR(1) dependence as given in Lee (1988) via real and simulated data. Since $\sigma_i > 0$, there is a one-to-one correspondence between $\sigma_i$ and $\sigma_i^2$. Thus, for convenience, we will be dealing with $\sigma_i$ instead of $\sigma_i^2$ mathematically for the rest of this paper. Also, it is trivial to note that equality of variances is equivalent to equality of standard deviations.
In addition to the inferences on the parameters $\tau$, $\phi = (\phi_1, \ldots, \phi_q)'$ and $\sigma = (\sigma_1, \ldots, \sigma_g)'$, we will also consider the conditional prediction problem for the growth curve model as specified by (1.1)–(1.4). Let $V$ be $p \times K$ future observations drawn from the generalized growth curve model; that is, the set of future observations is such that, given the parameters $\tau$ and $\Sigma$,
$$E(V) = X \tau F, \qquad (1.5)$$
where $E(\cdot)$ denotes expected value, $F$ is a known $r \times K$ matrix, and the columns of $V$ are independent and multivariate normal with a common covariance matrix $\Sigma$. It is noted that $F = 1$ in the special situation in which $r = K = 1$. Lee and Geisser (1972, 1975), Fearn (1975), and Lee (1988, 1991) considered the problem of predicting $V^{(2)}$, given $V^{(1)}$ and $Y$, if $V$ is partitioned as $V = (V^{(1)\prime}, V^{(2)\prime})'$, where $V^{(i)}$ is $p_i \times K$ $(i = 1, 2)$ and $p_1 + p_2 = p$. If $p$ is interpreted as the number of points in time being observed, then the problem is mainly concerned with predicting the generalized growth curve for future values for the same set of $p$ time points, or a subset of size $p_2$. When $p_2 < p$ and $K = 1$, it is also called the conditional prediction of the unobserved portion of a partially observed vector.
Meanwhile, we will also consider the situation in which the Box-Cox transformations are applied to the observations $Y$ in (1.1). The model, called the data-based transformed (DBT) model, is
$$Y^{(\lambda)} = X \tau A + E, \qquad (1.6)$$
where $Y^{(\lambda)} = (Y^{(\lambda)}_{ij})$ and
$$Y^{(\lambda)}_{ij} = \begin{cases} \{(Y_{ij} + \gamma)^{\lambda} - 1\}/\lambda, & \text{when } \lambda \ne 0, \\ \ln (Y_{ij} + \gamma), & \text{when } \lambda = 0, \end{cases} \qquad (1.7)$$
$\gamma$ is assumed to be a known constant such that $Y_{ij} + \gamma > 0$ for all $i$ and $j$, $\lambda$ is an unknown parameter, and the columns of $E$ are independently and normally distributed with mean vector 0 and common covariance matrix $\Sigma$ as specified in (1.2)–(1.4). It is noted that $\lambda$ could assume different values for different segments of observations. In Lee and Lu (1987), tremendous improvement was found in predictive accuracy using the DBT model for technology substitutions. This is primarily due to the fact that the linearity assumption for the growth function can be enhanced significantly with the Box-Cox transformation, along with incorporating into the model the proper dependence structure among the observations. Keramidas and Lee (1990) combined the concepts of the Box-Cox transformation and a generalized growth curve model with serial structure for forecasting technological substitutions based on the maximum likelihood method when repeated measurements of short time series are available. If the growth function is not linear, we can use the Box-Cox transformation so that the function is linear or piecewise linear, but the transformation may not stabilize the variance. Thus, it is appropriate to construct grouping of variances with AR(q) dependence.
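For concreteness, the transformation (1.7) and the inverse map used later in the point predictor (2.11) can be sketched as follows (a minimal illustration; the function names are ours, not the paper's):

```python
import math

def box_cox(y, lam, gamma=0.0):
    """Data-based transform of (1.7): {(y + gamma)^lam - 1}/lam, or log when lam = 0."""
    if lam != 0.0:
        return ((y + gamma) ** lam - 1.0) / lam
    return math.log(y + gamma)

def inv_box_cox(z, lam, gamma=0.0):
    """Back-transform to the original scale, as used in the predictor (2.11)."""
    if lam != 0.0:
        return (1.0 + lam * z) ** (1.0 / lam) - gamma
    return math.exp(z) - gamma
```

As a sanity check, the transform is continuous in $\lambda$ at 0 and the two functions are mutual inverses for admissible arguments.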
In Section 2, maximum likelihood estimation of parameters and prediction of future values are considered. In Section 3, we derive an asymptotic result and tests for the equality and grouping of variances. The results are illustrated in Section 4 with two real data sets and a simulation. Some concluding remarks are given in Section 5. Finally, the derivation of the Hessian matrix is given in the Appendix.
2. Parameter estimation and prediction of $V^{(2)}$ based on the maximum likelihood theory
From equation (1.4), we obtain the following equations:
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \ldots + \phi_q \rho_{k-q}, \qquad \text{for } k > 0, \qquad (2.1)$$
where for $j < 0$, $\rho_j = \rho_{-j}$ and $\rho_0 = 1$. From equation (2.1), for $k = 1, \ldots, q$, we obtain the well-known Yule-Walker equations by replacing $\rho_j$ with $\hat{\rho}_j$. The Yule-Walker estimate solves these equations for $\phi = (\phi_1, \ldots, \phi_q)'$ as a function of the $\rho$'s. Here, for the constraints on $\rho$, we would express $\rho$ as a function of $\phi$, the parameters to be estimated. From equation (2.1), if $q$ is odd,
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \ldots + \phi_{\frac{q+1}{2}-1} \rho_1 + \phi_{\frac{q+1}{2}} \rho_0 + \phi_{\frac{q+1}{2}+1} \rho_1 + \ldots + \phi_q \rho_{k-q};$$
else,
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \ldots + \phi_{\frac{q}{2}-1} \rho_1 + \phi_{\frac{q}{2}} \rho_0 + \phi_{\frac{q}{2}+1} \rho_1 + \ldots + \phi_q \rho_{k-q}.$$
Since $\rho_{-j} = \rho_j$ and $\rho_0 = 1$, it follows that
$$\rho_k = \begin{cases} \phi_k + \sum\limits_{j=1}^{k-1} (\phi_j + \phi_{2k-j})\, \rho_{k-j}, & \text{if } k = \dfrac{q+1}{2}, \\[1ex] \phi_k + \sum\limits_{j=1}^{k-1} (\phi_j + \phi_{2k-j})\, \rho_{k-j} + \sum\limits_{j'=2k}^{q} \phi_{j'} \rho_{j'-k}, & \text{if } k < \dfrac{q+1}{2}, \\[1ex] \phi_k + \sum\limits_{j=1}^{2k-q-1} \phi_j \rho_{k-j} + \sum\limits_{j'=k+1}^{q} (\phi_{2k-j'} + \phi_{j'})\, \rho_{j'-k}, & \text{if } k > \dfrac{q+1}{2}. \end{cases} \qquad (2.2)$$
Hence, we have natural constraints on $\rho$ that can be expressed as
$$G \rho + \phi = 0, \qquad (2.3)$$
where the matrix $G$ is a function of $\phi$; element-wise,
$$G_{ij} = \phi_{i+j} + \phi_{i-j} - \delta_{ij}, \qquad (2.4)$$
with $\phi_k \equiv 0$ if $k \notin \{1, \ldots, q\}$, and $\delta_{ij}$ equal to 1 if $i = j$ and 0 otherwise. Hence, for the AR(q) dependence structure, $C$ in (1.3) can be expressed as a function of $\phi$.
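The construction in (2.3)–(2.4) is easy to mechanize: build $G$ from $\phi$, solve $G\rho = -\phi$ for $(\rho_1, \ldots, \rho_q)$, extend by the recursion (2.1), and fill $C$ with the entries $\rho_{|i-j|}$. A pure-Python sketch (function names are ours; a naive Gaussian elimination stands in for a linear solver):

```python
def solve(A, b):
    """Naive Gaussian elimination with partial pivoting (adequate for small q)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ar_correlations(phi, p):
    """rho_1..rho_{p-1} of a stationary AR(q): solve G rho = -phi as in (2.3),
    then extend with the recursion (2.1)."""
    q = len(phi)
    f = lambda k: phi[k - 1] if 1 <= k <= q else 0.0   # phi_k = 0 outside 1..q
    # G_ij = phi_{i+j} + phi_{i-j} - delta_ij, i, j = 1..q   (2.4)
    G = [[f(i + j) + f(i - j) - (1.0 if i == j else 0.0)
          for j in range(1, q + 1)] for i in range(1, q + 1)]
    rho = solve(G, [-x for x in phi])                  # rho_1..rho_q
    r = lambda k: 1.0 if k == 0 else rho[abs(k) - 1]
    for k in range(q + 1, p):                          # recursion (2.1)
        rho.append(sum(phi[j - 1] * r(k - j) for j in range(1, q + 1)))
    return rho

def ar_corr_matrix(phi, p):
    """The p x p correlation matrix C of (1.3): C_ij = rho_{|i-j|}."""
    r = [1.0] + ar_correlations(phi, p)
    return [[r[abs(i - j)] for j in range(p)] for i in range(p)]
```

For $q = 1$ this reproduces the familiar AR(1) pattern $\rho_k = \phi_1^k$, and for $q = 2$ the Yule-Walker values $\rho_1 = \phi_1/(1 - \phi_2)$, $\rho_2 = \phi_2 + \phi_1 \rho_1$.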
From model (1.6), we have the following likelihood of $\tau$, $\sigma$, $\phi$ and $\lambda$:
$$L(\tau, \sigma, \phi, \lambda \mid Y) \propto |\Sigma|^{-\frac{N}{2}} \exp \left\{ -\tfrac{1}{2} \,\mathrm{tr}\, \Sigma^{-1} (Y^{(\lambda)} - X \tau A)(Y^{(\lambda)} - X \tau A)' \right\} J_Y, \qquad (2.5)$$
where $J_Y$, the Jacobian of the transformation from $Y^{(\lambda)}$ to $Y$, is defined as
$$J_Y = \prod_{j=1}^{N} \prod_{i=1}^{p} (Y_{ij} + \gamma)^{\lambda - 1}. \qquad (2.6)$$
From (3.6) of Lee (1988), we have
$$\mathrm{tr}\, \Sigma^{-1} (Y^{(\lambda)} - X \tau A)(Y^{(\lambda)} - X \tau A)' = \mathrm{tr}\, (X' \Sigma^{-1} X) \left[ (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} S \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} + (\tau - \hat{\tau})\, A A' (\tau - \hat{\tau})' \right] + \mathrm{tr}\, (Z' \Sigma Z)^{-1} Z' Y^{(\lambda)} Y^{(\lambda)\prime} Z,$$
and using the fact
$$\Sigma^{-1} = \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} + Z (Z' \Sigma Z)^{-1} Z',$$
where
$$S = Y^{(\lambda)} (I - A'(AA')^{-1} A)\, Y^{(\lambda)\prime} \qquad (2.7)$$
and $Z$ is a known $p \times (p - m)$ matrix with rank $p - m$ such that $X'Z = 0$, and $\hat{\tau}$, the MLE of $\tau$, is
$$\hat{\tau}(\Sigma) = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} Y^{(\lambda)} A' (A A')^{-1}, \qquad (2.8)$$
with
$$\Sigma = D_\sigma C D_\sigma. \qquad (2.9)$$
The MLEs of $\sigma$, $\phi$ and $\lambda$, denoted by $\hat{\sigma} = (\hat{\sigma}_1, \ldots, \hat{\sigma}_g)'$, $\hat{\phi} = (\hat{\phi}_1, \ldots, \hat{\phi}_q)'$ and $\hat{\lambda}$, respectively, are obtained by maximizing the profile likelihood function
$$L_{\max}(\sigma, \phi, \lambda) = |\Sigma|^{-N/2} \exp \left\{ -\tfrac{1}{2} \,\mathrm{tr} \left[ \Sigma^{-1} S + Z (Z' \Sigma Z)^{-1} Z' Y^{(\lambda)} A' (A A')^{-1} A Y^{(\lambda)\prime} \right] \right\} J_Y. \qquad (2.10)$$
The comments regarding the $\lambda$ values also apply when there are different values of $\lambda$ for different segments of observations.
We consider the conditional prediction of the future matrix $V^{(2)}$, given $Y$ and $V^{(1)}$. Schematically, the matrices $Y$, $V^{(1)}$ and $V^{(2)}$ are arranged as
$$\left( Y \;\middle|\; \begin{matrix} V^{(1)} \\ V^{(2)} \end{matrix} \right),$$
where $Y$, $p \times N$, is the complete sample; $V^{(1)}$, $p_1 \times K$, is the partially observed matrix; and $V^{(2)}$, $p_2 \times K$, is the unobserved portion to be predicted. Of course, $p_1 + p_2 = p$.
We restrict attention to the special situation in which $K = 1$ and $r = 1$; that is, only one vector is being predicted in the same group. This is similar to Lee and Geisser (1975). The approximate mean (denoted by $\hat{V}^{(2)}$), when the covariance structure given by (1.2) holds, can be obtained as
$$\hat{V}^{(2)} = \begin{cases} \left\{ 1 + \hat{\lambda} \left[ X^{(2)} \hat{\tau} + \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} (V^{(1)(\hat{\lambda})} - X^{(1)} \hat{\tau}) \right] \right\}^{1/\hat{\lambda}} - \gamma, & \text{if } \hat{\lambda} \ne 0, \\ \exp \left[ X^{(2)} \hat{\tau} + \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} (V^{(1)(\hat{\lambda})} - X^{(1)} \hat{\tau}) \right] - \gamma, & \text{if } \hat{\lambda} = 0, \end{cases} \qquad (2.11)$$
and, applying the delta method, an estimate of the approximate covariance matrix of $\hat{V}^{(2)}$ is
$$\hat{\Sigma}_{\hat{V}^{(2)}} \approx \begin{cases} \mathrm{diag}\, ((\hat{\lambda} X^{(2)} \hat{\tau} + 1)^{1/\hat{\lambda} - 1})\; \hat{W} \;\mathrm{diag}\, ((\hat{\lambda} X^{(2)} \hat{\tau} + 1)^{1/\hat{\lambda} - 1}), & \text{if } \hat{\lambda} \ne 0, \\ \mathrm{diag}\, (\exp (X^{(2)} \hat{\tau}))\; \hat{W} \;\mathrm{diag}\, (\exp (X^{(2)} \hat{\tau})), & \text{if } \hat{\lambda} = 0, \end{cases} \qquad (2.12)$$
with
$$\hat{W} = \frac{1}{N} (X^{(2)} - \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} X^{(1)}) (X' \hat{\Sigma}^{-1} X)^{-1} (X^{(2)} - \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} X^{(1)})' + \hat{\Sigma}_{22} - \hat{\Sigma}_{21} \hat{\Sigma}_{11}^{-1} \hat{\Sigma}_{12},$$
where $V^{(1)(\lambda)}$ is defined in a manner similar to $Y^{(\lambda)}$ given in (1.6) and (1.7), $\hat{\tau}$ is as given in (2.8) with $\Sigma$ replaced by $\hat{\Sigma}$, $X$ is partitioned as $X = (X^{(1)\prime}, X^{(2)\prime})'$, $X^{(i)}$ is $p_i \times m$ $(i = 1, 2)$, $\mathrm{diag}\,(a)$ is a diagonal matrix with elements $a$, and
$$\hat{\Sigma} = \hat{D}_\sigma \hat{C} \hat{D}_\sigma = \begin{pmatrix} \hat{\Sigma}_{11} & \hat{\Sigma}_{12} \\ \hat{\Sigma}_{21} & \hat{\Sigma}_{22} \end{pmatrix}, \qquad (2.13)$$
where $\hat{\Sigma}_{ij}$ is of dimension $p_i \times p_j$ ($p_1 + p_2 = p$). It is noted that $x^a = (x_1^a, \ldots, x_p^a)'$ where $x = (x_1, \ldots, x_p)'$ and $a$ is a scalar.
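Given the fitted quantities, the point predictor (2.11) is mechanical: a conditional-mean adjustment on the transformed scale followed by the inverse Box-Cox map. A sketch for $K = 1$ with plain Python lists (argument names are ours; the $\hat{\Sigma}$ blocks and $\hat{\tau}$ are assumed already computed elsewhere):

```python
import math

def predict_v2(x2_tau, s21, s11_inv, v1_lam, x1_tau, lam, gamma=0.0):
    """Point prediction (2.11) for K = 1.

    x2_tau, x1_tau : X^(2) tau-hat (length p2) and X^(1) tau-hat (length p1)
    s21            : p2 x p1 block Sigma-hat_21
    s11_inv        : p1 x p1 inverse of Sigma-hat_11
    v1_lam         : transformed partially observed vector V^(1)(lam)
    """
    p1, p2 = len(x1_tau), len(x2_tau)
    resid = [v1_lam[i] - x1_tau[i] for i in range(p1)]     # V^(1)(lam) - X^(1) tau
    adj = [sum(s11_inv[i][j] * resid[j] for j in range(p1)) for i in range(p1)]
    out = []
    for k in range(p2):
        z = x2_tau[k] + sum(s21[k][i] * adj[i] for i in range(p1))
        # inverse Box-Cox back to the original scale
        out.append((1.0 + lam * z) ** (1.0 / lam) - gamma if lam != 0.0
                   else math.exp(z) - gamma)
    return out
```

With $\hat{\Sigma}_{21} = 0$ the predictor reduces to the back-transformed marginal mean, as one would expect.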
3. Asymptotic result and testing equality and grouping of variances
In this section, we will derive the information matrix which is useful for obtaining the standard errors of the MLEs and testing equality and grouping of variances.
3.1 Information matrix
For ease of presentation and some practical considerations, we will restrict our attention to the special situation in which $r = 1$. The log-likelihood obtained from (2.5) is
$$l = l(\tau, \sigma, \phi, \lambda \mid Y) = -\frac{N}{2} \log (|\Sigma|) - \frac{1}{2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X \tau)' \Sigma^{-1} (Y^{(\lambda)}_i - X \tau) + \log (J_Y). \qquad (3.1)$$
We are unable to take the expectation of the Hessian matrix, which is too complicated; only the observed Hessian matrix is used in this paper. This is still useful in computing the standard errors of the MLEs. Let $\theta = (\tau', \sigma', \phi', \lambda)'$, and suppose there are $g$ groups with $g_i$ members in the $i$th group. The Hessian matrix, $H = \partial^2 l / \partial \theta\, \partial \theta'$, has the following block partitioned form:
$$H = \begin{pmatrix} H_{11} & H_{12} & H_{13} & H_{14} \\ H_{12}' & H_{22} & H_{23} & H_{24} \\ H_{13}' & H_{23}' & H_{33} & H_{34} \\ H_{14}' & H_{24}' & H_{34}' & H_{44} \end{pmatrix}, \qquad (3.2)$$
with blocks $H_{11} = \partial^2 l/\partial \tau\, \partial \tau'$, $H_{12} = \partial^2 l/\partial \tau\, \partial \sigma'$, $H_{13} = \partial^2 l/\partial \tau\, \partial \phi'$, $H_{14} = \partial^2 l/\partial \tau\, \partial \lambda$, $H_{22} = \partial^2 l/\partial \sigma\, \partial \sigma'$, $H_{23} = \partial^2 l/\partial \sigma\, \partial \phi'$, $H_{24} = \partial^2 l/\partial \sigma\, \partial \lambda$, $H_{33} = \partial^2 l/\partial \phi\, \partial \phi'$, $H_{34} = \partial^2 l/\partial \phi\, \partial \lambda$ and $H_{44} = \partial^2 l/\partial \lambda^2$, where $H_{11} = -N X' \Sigma^{-1} X$ and the other derivatives are given in the Appendix.
Thus, an approximate information matrix is
$$I_\theta \approx -H(\hat{\theta}), \qquad (3.3)$$
where $\hat{\theta}$ is the mode of the likelihood function. Under some regularity conditions the estimator $\hat{\theta}$ is asymptotically normal with mean $\theta$ and covariance matrix $I_\theta^{-1}$.
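When the analytic blocks are unwieldy, $H(\hat{\theta})$ can also be approximated numerically: difference the log-likelihood twice around the mode and invert $-H$ to get standard errors. A generic sketch (the quadratic toy log-likelihood below is ours, standing in for (3.1)):

```python
import math

def hessian(f, x, h=1e-5):
    """Central-difference Hessian of a scalar function f at the point x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp = x[:]; xpm = x[:]; xmp = x[:]; xmm = x[:]
            xpp[i] += h; xpp[j] += h
            xpm[i] += h; xpm[j] -= h
            xmp[i] -= h; xmp[j] += h
            xmm[i] -= h; xmm[j] -= h
            H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4.0 * h * h)
    return H

# Toy check: for l(theta) = -0.5 * n * theta^2 (a N(theta, 1) log-likelihood core
# with n observations at 0), -H = n, so the standard error is 1/sqrt(n).
n = 25
loglik = lambda t: -0.5 * n * t[0] ** 2
H = hessian(loglik, [0.0])
se = 1.0 / math.sqrt(-H[0][0])
```

For a multi-parameter $\theta$ the same routine fills the full block matrix (3.2), and the standard errors are the square roots of the diagonal of $(-H)^{-1}$.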
3.2 Testing equality and grouping of variances
We will first test the equality of the $g$ different variances under the assumption of grouping variances, that is,
$$H_0: D\sigma = 0 \quad \text{versus} \quad H_1: D\sigma \ne 0,$$
where
$$D_{(g-1) \times g} = \begin{pmatrix} 1 & -1 & 0 & \ldots & 0 \\ 0 & 1 & -1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \ldots & 0 & 1 & -1 \end{pmatrix}.$$
The procedure is based on the likelihood ratio criterion,
$$\lambda(Y) = \frac{\sup_{\theta \in \Omega_0} L(\theta \mid Y)}{\sup_{\theta \in \Omega} L(\theta \mid Y)},$$
and $\lambda_\alpha$ is chosen so that
$$\sup_{\theta \in \Omega_0} \Pr \{ \lambda(Y) \le \lambda_\alpha; \theta \} \le \alpha,$$
where $\Omega_0$ is the null space when $H_0$ is true while $\Omega$ is the entire parameter space under the assumption of grouping variances. Let $Q(Y) = -2 \ln \lambda(Y)$. Under some mild regularity conditions, the statistic $Q(Y)$ is asymptotically Chi-square distributed with $g - 1$ degrees of freedom under the null hypothesis.
The likelihood ratio test procedure can also be applied to test the null hypothesis $H_0^*$: grouping variances, versus $H_1^*$: all $p$ variances are distinct. The likelihood ratio criterion is
$$\lambda^*(Y) = \frac{\sup_{\theta \in \Omega_0^*} L(\theta \mid Y)}{\sup_{\theta \in \Omega^*} L(\theta \mid Y)},$$
where $\Omega_0^*$ is the null space when $H_0^*$ is true while $\Omega^*$ is the entire parameter space when the $p$ variances are distinct. Let $Q^*(Y) = -2 \ln \lambda^*(Y)$. Then, under some mild regularity conditions, $Q^*(Y)$ is asymptotically Chi-square distributed with $p - g$ degrees of freedom under the null hypothesis.
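Both tests reduce to doubling the gap between maximized log-likelihoods and referring the result to a $\chi^2$ law. A sketch with a closed-form survival function for integer degrees of freedom (the recurrence used is standard; the helper names are ours):

```python
import math

def chi2_sf(x, df):
    """P(X > x) for X ~ chi-square(df), integer df, via the standard recurrence
    Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) e^(-x/2) / Gamma(v/2 + 1)."""
    if df % 2 == 1:
        q, v = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        q, v = math.exp(-x / 2.0), 2
    while v < df:
        q += (x / 2.0) ** (v / 2.0) * math.exp(-x / 2.0) / math.gamma(v / 2.0 + 1.0)
        v += 2
    return q

def lr_test(loglik_null, loglik_alt, df):
    """Q = 2(l1 - l0) and its asymptotic chi-square p-value."""
    Q = 2.0 * (loglik_alt - loglik_null)
    return Q, chi2_sf(Q, df)
```

For instance, feeding in the Section 4.2 values $-113.88$ (grouped) and $-111.98$ (all distinct) with $p - g = 1$ degree of freedom reproduces $Q^* = 3.80$ and $p \approx 0.051$.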
4. Illustrative example
This section is devoted to the illustration of the conditional prediction of $V^{(2)}$ given $V^{(1)}$ and $Y$. For the conditional prediction, we will set $K = 1$ and $p_2 = 1, 2, 3, 4$; that is, we will predict the last four observations of a partially observed vector. If $p = 7$ and $p_2 = 1, 2, 3, 4$, we denote the predictions as $V_7$, $V_6$–$V_7$, $V_5$–$V_7$ and $V_4$–$V_7$, respectively. For the parameter $\tau$, we will assume that the $N$ observations have come from the same group. We will use the predictive sample reuse (PSR), or leave-one-out (LOO), method. The method was used by Lee and Geisser (1975), Rao (1987) and Lee (1988), among others. The discrepancy measure is the mean absolute relative deviation (MARD) of the predictions from the actuals, defined as
$$\mathrm{MARD} = (N p_2)^{-1} \sum_{j=1}^{p_2} \sum_{i=1}^{N} \frac{ | Y^{(2)}_{ij} - \hat{Y}^{(2)}_{ij} | }{ Y^{(2)}_{ij} },$$
where $Y = (Y_1, \ldots, Y_N)$, $Y_i = \begin{pmatrix} Y^{(1)}_i \\ Y^{(2)}_i \end{pmatrix}$, $Y^{(1)}_i$ is $p_1 \times 1$, $Y^{(2)}_i$ is $p_2 \times 1$, $p_1 + p_2 = p$, and $\hat{Y}^{(2)}_i$ is its predicted value. It is noted that when $Y^{(2)}_i$ is being predicted, the complete sample is $Y_{(i)} = (Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_N)$ and the partially observed vector is $Y^{(1)}_i$.
4.1 The mice data
The mice data set was reported by Williams and Izenman (1981) and analyzed by Rao (1984, 1987) and Lee (1988, 1991). It consists of the weights of 13 male mice, measured at intervals of 3 days, over the 21 days from birth to weaning. The data are plotted in Figure 1. We assume a second degree polynomial for the growth curve, as noted by Rao (1984). When the growth function is quadratic, the design matrix $X$ is
$$X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 4 & 9 & 16 & 25 & 36 & 49 \end{pmatrix}',$$
and the design matrix $A$ is a $1 \times 13$ vector consisting of all 1s.
The ML estimates, with standard errors in parentheses, for the model (1.6) with $q = 1$, $m = 3$ and distinct variances are given in Table 1. When the same $\lambda$ is applied to all observations for each mouse, the estimate of $\sigma$ with a 95% confidence interval for each time point is plotted in Figure 2. We found that the mice data with the same Box-Cox transformation for different time points have a possible change point at day 15. From Table 1, we also found that the standard deviation varies with time and stabilizes at 12 days, which is also displayed in Figure 2. Hence, a possible model is to group the last three variances as a single group and the rest as four different variances. To test the adequacy of this grouping variances structure: the value of the log-likelihood function is 169.19 under AR(1) dependence with the above grouping variances, while under the equality of variances it is 126.52, resulting in a likelihood ratio statistic $Q = 85.34$ and p-value $< 0.0001$, indicating that the homogeneity of variances is not acceptable. Meanwhile, we can test the grouping variances ($H_0^*$) against the alternative hypothesis that all 7 variances are distinct ($H_1^*$). When all 7 variances are distinct, the value of the log-likelihood function is 170.10, resulting in a likelihood ratio statistic $Q^* = 1.82$ and p-value $= 0.5975$, indicating that the grouping of variances is acceptable. Thus, the AR(1) dependence with the above grouping of variances is a possible model for the data.

Table 1
ML estimation results for the distinct variances with AR(1) errors

$\hat{\tau}$: 1.1125 (0.0090), 0.2507 (0.0085), 0.0152 (0.0011)
$\hat{\sigma}$: 0.0333 (0.0099), 0.0461 (0.0113), 0.0666 (0.0144), 0.1093 (0.0226), 0.1279 (0.0242), 0.1333 (0.0266), 0.1100 (0.0198)
$\hat{\rho}$: 0.9006 (0.054)
$\hat{\lambda}$: 0.8372 (0.1420)
Comparisons of the predictive performance of different grouping variances structures for conditional predictions of $V^{(2)}$ given $V^{(1)}$ and $Y$, without and with the Box-Cox transformation, are summarized in Table 2 and Table 3, respectively. In the tables, the notation $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ denotes the grouping of the variances corresponding to the seven time points of the mice data into five groups with $\sigma_5 = \sigma_6 = \sigma_7$. We found that the grouping variances are better than the common variance with the AR(q) dependence structure when $q = 1$ or 2, coupled with or without the Box-Cox transformation. The criterion used in the comparison is MARD. For example, for the prediction of $V_7$, the best prediction result occurs when a common variance is assumed for the AR(2) dependence coupled with the Box-Cox transformation; the resulting MARD is 0.0354, which is slightly better than the result with the above grouping variances. Meanwhile, the best model for the prediction of $V_6$–$V_7$, $V_5$–$V_7$, and $V_4$–$V_7$ is the untransformed AR(1) dependence, having a quadratic growth function and with the grouping variances $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$. Thus, with the exception of predicting $V_7$ alone, the best model seems to be the untransformed AR(1) dependence, having a quadratic growth function and with the grouping variances $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$. Of course, this finding is consistent with the variance estimates as shown in Figure 2.
Table 2
The MARD between the predicted and the actual values of $V^{(2)}$: mice data

Growth function | Covariance structure | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
linear | $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(1) | 0.0477 | 0.0602 | 0.0617 | 0.0808
linear | $\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(1) | 0.0454 | 0.0822 | 0.0866 | 0.1097
quadratic | $\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(1) | 0.0403 | 0.0523 | 0.0588 | 0.0722
quadratic | $\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(1) | 0.0406 | 0.0525 | 0.0628 | 0.0965

Table 3
The MARD between the predicted and the actual values of $V^{(2)}$ with the Box-Cox transformation and linear growth function: mice data

Covariance structure | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
$\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(1) | 0.0391 | 0.0549 | 0.0610 | 0.0748
$\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(1) | 0.0418 | 0.0729 | 0.0798 | 0.1067
$\{(1)\,(2)\,(3)\,(4)\,(5, 6, 7)\}$ & AR(2) | 0.0367 | 0.0638 | 0.0654 | 0.0847
$\{(1, 2, 3, 4, 5, 6, 7)\}$ & AR(2) | 0.0354 | 0.0836 | 0.0777 | 0.1084
4.2 The drug dissolution data
The data given in Lee et al. (1999) are the dissolution rates of three standard lots and one test lot. For each lot there are twelve tablets, and for each tablet the dissolution rates are measured at seven time points: 1, 2, 3, 4, 6, 8 and 10 minutes. We used the pooled data of the three lots and removed the observations at times 1 and 3 to create an equally-spaced dataset. The dissolution function $F(t)$ of a drug is defined to be the percentage of a tablet that has dissolved at time $t$, and $R(t)$ is defined by $F(t)/(100 - F(t))$. The transformed data are plotted in Figure 3. Since $0 \le F(t) \le 100$ and $R(t) \ge 0$, neither range is the entire real line, which may cause an out-of-range problem if we model them directly. We will therefore consider applying the Box-Cox transformations to $R(t)$, which avoids this problem. We observed that there is one possible change point at time 8. Thus, the design matrix $X$ for the growth curve model is
$$X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 2 & 4 & 6 & 8 & 8 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}'.$$
From Figure 3, it is clear that $F/(100 - F)$ is not quite linear, but piecewise linear, in time. The Box-Cox transformation applied to $F/(100 - F)$ will help achieve the linearity property. If time is cut into two pieces, 2–6 and 8–10, just as in the design matrix, and the Box-Cox transformation is applied to the two pieces separately, the variances are more stable. When different Box-Cox transformations are applied to observations from the two different segments, the estimates of $\sigma_1, \ldots, \sigma_5$ for the drug dissolution data are plotted with confidence intervals in Figure 4.
From Figure 4, we found that the standard deviation of the pooled data varies with time. Hence, a possible model is to treat $\sigma_2 = \sigma_3$ and the rest as three different variances in an AR(1) dependence. The value of the log-likelihood function increased from $-134.35$ under the equality of variances to $-113.88$ under this grouping variances structure, resulting in a likelihood ratio statistic $Q = 40.94$ with p-value $< 0.0001$, indicating that the homogeneity of variances is not acceptable. When all 5 variances are distinct, the value of the log-likelihood function is $-111.98$, resulting in a likelihood ratio statistic $Q^* = 3.80$ and p-value $= 0.0513$, indicating that the grouping of variances is acceptable. From Table 4, we found that the model with $q = 1$ and two different variance groups is better than that with an equal variance.
Table 4
The MARD between the predicted and the actual values of $V^{(2)}$ with the Box-Cox transformation: drug data

Drug data | Covariance structure | $V_5$ | $V_4$–$V_5$
Pooled | $\{(1)\,(2, 3)\,(4)\,(5)\}$ & AR(1) | 0.02672 | 0.02369
Pooled | $\{(1)\,(2)\,(3)\,(4)\,(5)\}$ & AR(1) | 0.02689 | 0.02436
Pooled | $\{(1, 2, 3, 4, 5)\}$ & AR(1) | 0.02742 | 0.02713

Fig. 4. The estimates of $\sigma_1, \ldots, \sigma_5$ of the drug dissolution data with 95% confidence intervals

4.3 Simulation

In this subsection we present a simulation study regarding the grouping variances structure. In order to compare different grouping variances, we set $\tau = (0.1, 0.2)'$, the design matrix
$$X = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{pmatrix}',$$
$\phi = 0.93$, and the number of replications is 1000. We generate data from the grouping variances $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.014)$ and $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.021)$, and present the conditional predictive performance in Table 5 and Table 6, respectively. From the two tables, it is clear that the grouping variances structure yields better and more stable prediction results, as expected.
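One replication of such data can be generated by drawing the columns of $E$ from $N(0, D_\sigma C D_\sigma)$ via a Cholesky factor and adding $X\tau$. A stdlib-only sketch under the first setting of Table 5 (function names are ours):

```python
import math, random

def cholesky(A):
    """Lower-triangular L with L L' = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def simulate_growth_data(tau, sigmas, rho, n, seed=0):
    """One replication: Y_i = X tau + e_i, e_i ~ N(0, D_sigma C D_sigma),
    with the linear design X = (1, t)' of Section 4.3 and AR(1) correlation."""
    rng = random.Random(seed)
    p = len(sigmas)
    mean = [tau[0] + tau[1] * (t + 1) for t in range(p)]          # X tau, t = 1..p
    cov = [[sigmas[i] * sigmas[j] * rho ** abs(i - j) for j in range(p)]
           for i in range(p)]
    L = cholesky(cov)
    data = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        e = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        data.append([mean[i] + e[i] for i in range(p)])
    return data

# e.g. the first setting of Table 5 with N = 12:
Y = simulate_growth_data((0.1, 0.2), [0.007] * 5 + [0.014] * 2, 0.93, 12)
```

Repeating this 1000 times and running the leave-one-out predictions of Section 4 under each candidate covariance structure reproduces the kind of comparison reported in Tables 5 and 6.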
5. Concluding remarks
The model with grouping variances in an AR(q) dependence structure provides an effective and practical means of dealing with growth curve data. An appropriate grouping among the variances in an AR(q) dependence structure gives better and more stable predictive performance. If the growth function is not linear, we can apply the Box-Cox transformation so that the function is linear. If there are change points, we can apply different Box-Cox transformations for different segments of observations so that the function is piecewise linear. If the variance varies over time, we can group the variances appropriately in an AR(q) covariance structure. We can plot confidence intervals for the standard deviations at each time point in order to observe an adequate grouping, and use the likelihood ratio test for the adequacy of grouping the variances. The final grouping of variances is obtained from the prediction results for the last few rows of $Y$, the observed matrix. When the time series is short and the sample size is small, usually $q = 1$ or 2 will suffice. It is conceivable that the grouping of variances will not be a good idea if the variances are somewhat similar. However, if the magnitudes are distinct and in clear clusters, the grouping of variances will certainly improve the predictive performance. Hence, the procedure developed in this paper will be useful for dealing with growth curve data.

Table 5
Prediction performance for data generated from grouping variances $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.014)$ with AR(1) dependence: simulated data

MARD | N | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
Grouping variances | 12 | 0.00270 | 0.00352 | 0.00358 | 0.00367
Grouping variances | 36 | 0.00274 | 0.00355 | 0.00356 | 0.00375
Identical variance | 12 | 0.00273 | 0.00518 | 0.00449 | 0.00430
Identical variance | 36 | 0.00276 | 0.00516 | 0.00443 | 0.00431

Table 6
Prediction performance for data generated from grouping variances $(\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = \sigma_5 = 0.007$ and $\sigma_6 = \sigma_7 = 0.021)$ with AR(1) dependence: simulated data

MARD | N | $V_7$ | $V_6$–$V_7$ | $V_5$–$V_7$ | $V_4$–$V_7$
Grouping variances | 12 | 0.00414 | 0.00536 | 0.00522 | 0.00502
Grouping variances | 36 | 0.00407 | 0.00532 | 0.00511 | 0.00503
Identical variance | 12 | 0.00428 | 0.00922 | 0.00741 | 0.00674
Identical variance | 36 | 0.00416 | 0.00925 | 0.00723 | 0.00646
Appendix

In this appendix, we derive the Hessian matrix. $H_{22}$, $H_{23}$, $H_{33}$, $H_{12}$, $H_{13}$, $H_{14}$, $H_{24}$ and $H_{34}$ can be obtained from the following equations:
$$\frac{\partial^2 l}{\partial \sigma_k\, \partial \sigma_j} = \begin{cases} -\dfrac{1}{2}\, \sigma_j^{-2} \sigma_k^{-2} \sum\limits_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ E_{kk} C^{-1} E_{jj} + E_{jj} C^{-1} E_{kk} \right] (Y^{(\lambda)}_i - X\tau), & \text{if } k \ne j, \\[1.5ex] N g_j \sigma_j^{-2} - \sigma_j^{-3} \sum\limits_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ D_\sigma^{-1} C^{-1} E_{jj} + E_{jj} C^{-1} D_\sigma^{-1} + \sigma_j^{-1} E_{jj} C^{-1} E_{jj} \right] (Y^{(\lambda)}_i - X\tau), & \text{if } k = j, \end{cases}$$
$$\frac{\partial^2 l}{\partial \sigma_j\, \partial \phi_k} = -\frac{1}{2}\, \sigma_j^{-2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ D_\sigma^{-1} C^{-1} \frac{\partial C}{\partial \phi_k} C^{-1} E_{jj} + E_{jj} C^{-1} \frac{\partial C}{\partial \phi_k} C^{-1} D_\sigma^{-1} \right] (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \phi_j\, \partial \phi_k} = -\frac{N}{2}\, \mathrm{tr} \left[ C^{-1} \frac{\partial^2 C}{\partial \phi_j\, \partial \phi_k} - C^{-1} \frac{\partial C}{\partial \phi_j} C^{-1} \frac{\partial C}{\partial \phi_k} \right] - \frac{1}{2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' D_\sigma^{-1} C^{-1} \left[ \frac{\partial^2 C}{\partial \phi_j\, \partial \phi_k} - \frac{\partial C}{\partial \phi_j} C^{-1} \frac{\partial C}{\partial \phi_k} - \frac{\partial C}{\partial \phi_k} C^{-1} \frac{\partial C}{\partial \phi_j} \right] C^{-1} D_\sigma^{-1} (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \tau\, \partial \sigma_j} = -\sigma_j^{-2} \sum_{i=1}^{N} X' \left( D_\sigma^{-1} C^{-1} E_{jj} + E_{jj} C^{-1} D_\sigma^{-1} \right) (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \tau\, \partial \phi_j} = -\sum_{i=1}^{N} X' D_\sigma^{-1} C^{-1} \frac{\partial C}{\partial \phi_j} C^{-1} D_\sigma^{-1} (Y^{(\lambda)}_i - X\tau),$$
$$\frac{\partial^2 l}{\partial \tau\, \partial \lambda} = \sum_{i=1}^{N} X' \Sigma^{-1} \frac{\partial Y^{(\lambda)}_i}{\partial \lambda},$$
$$\frac{\partial^2 l}{\partial \sigma_j\, \partial \lambda} = \sigma_j^{-2} \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \left[ D_\sigma^{-1} C^{-1} E_{jj} + E_{jj} C^{-1} D_\sigma^{-1} \right] \frac{\partial Y^{(\lambda)}_i}{\partial \lambda},$$
$$\frac{\partial^2 l}{\partial \phi_j\, \partial \lambda} = \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' D_\sigma^{-1} C^{-1} \frac{\partial C}{\partial \phi_j} C^{-1} D_\sigma^{-1} \frac{\partial Y^{(\lambda)}_i}{\partial \lambda},$$
$$\frac{\partial^2 l}{\partial \lambda^2} = -\sum_{i=1}^{N} \frac{\partial Y^{(\lambda)\prime}_i}{\partial \lambda}\, \Sigma^{-1} \frac{\partial Y^{(\lambda)}_i}{\partial \lambda} - \sum_{i=1}^{N} (Y^{(\lambda)}_i - X\tau)' \Sigma^{-1} \frac{\partial^2 Y^{(\lambda)}_i}{\partial \lambda^2},$$
where $E_{jj}$ is a $p \times p$ matrix with 1 on the diagonal from position $\sum_{i=1}^{j-1} g_i + 1$ to $\sum_{i=1}^{j} g_i$ and zero elsewhere. Since $C$ is a function of $\rho$, the first two derivatives of $C$ with respect to $\phi$ can be obtained from the first two derivatives of $\rho$ as given below:
$$\frac{\partial \rho}{\partial \phi_j} = G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} \phi - G^{-1} e_j,$$
$$\frac{\partial^2 \rho}{\partial \phi_j\, \partial \phi_k} = G^{-1} \frac{\partial^2 G}{\partial \phi_j\, \partial \phi_k} G^{-1} \phi + G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} e_k + G^{-1} \frac{\partial G}{\partial \phi_k} G^{-1} e_j - G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} \frac{\partial G}{\partial \phi_k} G^{-1} \phi - G^{-1} \frac{\partial G}{\partial \phi_k} G^{-1} \frac{\partial G}{\partial \phi_j} G^{-1} \phi,$$
where $e_j$ is the $j$th column vector of a $q \times q$ identity matrix $I_q$. Finally, the first two derivatives of $Y^{(\lambda)}_i$ with respect to $\lambda$ can be easily obtained from the following formulas:
$$\frac{\partial Y^{(\lambda)}_{ij}}{\partial \lambda} = \begin{cases} -\dfrac{(Y_{ij} + \gamma)^{\lambda} - 1}{\lambda^2} + \dfrac{(Y_{ij} + \gamma)^{\lambda} \log (Y_{ij} + \gamma)}{\lambda}, & \lambda \ne 0, \\ 0, & \lambda = 0, \end{cases}$$
$$\frac{\partial^2 Y^{(\lambda)}_{ij}}{\partial \lambda^2} = \begin{cases} \dfrac{2((Y_{ij} + \gamma)^{\lambda} - 1)}{\lambda^3} - \dfrac{2 (Y_{ij} + \gamma)^{\lambda} \log (Y_{ij} + \gamma)}{\lambda^2} + \dfrac{(Y_{ij} + \gamma)^{\lambda} (\log (Y_{ij} + \gamma))^2}{\lambda}, & \lambda \ne 0, \\ 0, & \lambda = 0. \end{cases}$$

Acknowledgement
The authors thank a referee for some helpful comments on the earlier version of the paper.
References
Fearn, T., 1975: A Bayesian Approach to Growth Curves. Biometrika 62, 89–100.
Geary, D. N., 1989: Modeling the Covariance Structure of Repeated Measurements. Biometrics 45, 1183–1195.
Geisser, S., 1970: Bayesian Estimation in Multivariate Analysis. Sankhya A32, 53–64.
Geisser, S., 1980: Growth Curve Analysis. In P. R. Krishnaiah (ed.): Handbook of Statistics (Vol. 1). North-Holland, Amsterdam, 89–115.
Jennrich, R. I. and Schluchter, M. D., 1986: Unbalanced Repeated-Measures Models with Structured Covariance Matrices. Biometrics 42, 805–820.
Keramidas, E. M. and Lee, J. C., 1990: Forecasting Technological Substitutions with Concurrent Short Time Series. Journal of the American Statistical Association 85, 625–632.
Keramidas, E. M. and Lee, J. C., 1995: Selection of a Covariance Structure for Growth Curves. Biometrical Journal 37, 783–797.
Khatri, C. G., 1966: A Note on MANOVA Model Applied to Problems in Growth Curves. Annals of the Institute of Statistical Mathematics 18, 75–86.
Laird, N. M. and Ware, J. H., 1982: Random-Effects Models for Longitudinal Data. Biometrics 38, 963–974.
Lee, J. C., 1988: Prediction and Estimation of Growth Curves with Special Covariance Structures. Journal of the American Statistical Association 83, 432–440.
Lee, J. C., 1991: Test and Model Selection for the General Growth Curve Model. Biometrics 47, 147–159.
Lee, J. C., Chen, D. T., Hung, H. N. and Chen, J. J., 1999: Analysis of Drug Dissolution. Statistics in Medicine 18, 799–814.
Lee, J. C. and Geisser, S., 1972: Growth Curve Prediction. Sankhya A34, 393–412.
Lee, J. C. and Geisser, S., 1975: Applications of Growth Curve Prediction. Sankhya A37, 239–256.
Lee, J. C. and Lu, K. W., 1987: On a Family of Data-based Transformed Models Useful in Forecasting Technological Substitutions. Technological Forecasting and Social Change 31, 61–78.
Mansour, H., Nordheim, E. V. and Rutledge, J. J., 1985: Maximum Likelihood Estimation of Variance Components in Repeated-Measures Designs Assuming Autoregressive Errors. Biometrics 41, 287–294.
Pan, J. X., Fang, K. T. and von Rosen, D., 1999: Bayesian Local Influence in Growth Curve Model with Unstructured Covariance. Biometrical Journal 41, 641–658.
Potthoff, R. F. and Roy, S. N., 1964: A Generalized Multivariate Analysis of Variance Model Useful Especially for Growth Curve Problems. Biometrika 51, 313–326.
Rao, C. R., 1984: Prediction of Future Observations in Polynomial Growth Curve Model. In Applications and New Directions: Proc. Indian Statistical Institute Golden Jubilee International Conference on Statistics. Indian Statistical Institute, Calcutta, 512–520.
Rao, C. R., 1987: Prediction of Future Observations in Growth Curve Model. Statistical Science 2, 434–471.
Rochon, J., 1992: ARMA Covariance Structures with Time Heteroscedasticity for Repeated Measures Experiments. Journal of the American Statistical Association 87, 777–784.
von Rosen, D., 1991: The Growth Curve Model: A Review. Communications in Statistics - Theory and Methods 20, 2791–2822.
Williams, J. S. and Izenman, A. J., 1981: A Class of Linear Spectral Models and Analysis for the Study of Longitudinal Data. Technical Report, Dept. of Statistics, Colorado State University.
Received August 2001; revised January 2002; revised May 2002; accepted August 2002.