Generalized Inference in Heteroscedastic Multivariate Linear Models


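Doctoral dissertation, Institute of Statistics, National Chiao Tung University. Generalized Inference in Heteroscedastic Multivariate Linear Models. Student: Ren-Sheng Wang. Advisors: Professor Hui-Nien Hung and Professor Shu-Hui Lin. July 2008 (year 97 of the Republic of China).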

Generalized Inference in Heteroscedastic Multivariate Linear Models. Student: Ren-Sheng Wang. Advisors: Dr. Hui-Nien Hung and Dr. Shu-Hui Lin. A Dissertation Submitted to the Institute of Statistics, College of Science, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Statistics. July 2008, Hsinchu, Taiwan, Republic of China.

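Generalized Inference in Heteroscedastic Multivariate Linear Models. Student: Ren-Sheng Wang. Advisors: Professor Hui-Nien Hung and Professor Shu-Hui Lin. Institute of Statistics, National Chiao Tung University.

Abstract (translated from the Chinese version)

This dissertation applies the generalized method to regression models with heteroscedastic AR(1) covariance matrices. The concepts of the generalized p-value and the generalized confidence interval, proposed by Tsui and Weerahandi (1989) and Weerahandi (1993), offer an alternative to the traditional ways of handling heteroscedasticity, and we extend them to a standardized form of the generalized multivariate test statistic. Lin and Lee (2003) applied generalized p-values and generalized confidence intervals to the multivariate analysis of variance problem with heteroscedastic uniform covariance matrices. We adopt their procedure, with suitable modifications, to handle regression models with heteroscedastic AR(1) covariance matrices, and the resulting coverage probabilities and expected areas are satisfactory. Our method also applies to uniform covariance matrices without requiring the design matrices $X_i$ to take a particular form.

Keywords: AR(1); generalized confidence interval; generalized p-value; generalized test statistic; heteroscedasticity; regression model; uniform covariance matrix.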

Generalized Inference in Heteroscedastic Multivariate Linear Models. Student: Ren-Sheng Wang. Advisors: Dr. Hui-Nien Hung and Dr. Shu-Hui Lin. Institute of Statistics, National Chiao Tung University.

ABSTRACT

The main subject of this dissertation is the application of the generalized method to regression models with heteroscedastic AR(1) covariance matrices. The concepts of generalized p-values and generalized confidence intervals, proposed by Tsui and Weerahandi (1989) and Weerahandi (1993), respectively, provide an alternative way to handle heteroscedasticity. We extend these concepts to the standardized expression of the generalized multivariate test variable. Lin and Lee (2003) applied the generalized method to the MANOVA model with unequal uniform covariance structures among multiple groups. We use their procedure, with modifications, to handle regression models with heteroscedastic serial dependence. The coverage probabilities and expected areas obtained from the proposed procedure are satisfactory. We also find that our method can be applied to uniform covariance structures without assuming a special form for the design matrices $X_i$.

Key words and phrases: AR(1); Generalized confidence intervals; Generalized p-values; Generalized test variable; Heteroscedasticity; Regression model; Uniform covariance structures.

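Acknowledgements

For the completion of this dissertation and the award of the degree, I thank Professor Shu-Hui Lin, who, alongside her teaching, set aside half a day to a full day every week for discussion, guided me carefully through new statistical methods, and shared her own experience with them and the points that call for care. After Professor Jack C. Lee (李昭勝) met with his tragic accident, the teachers and the senior and junior students supported one another through the grief, and with the continued encouragement of Professor Shu-Hui Lin and Professor Hui-Nien Hung I was able to bring this stage to a close.

During my master's studies at this institute, I often felt the energy and passion that my advisor, Professor Lee, brought to teaching and research, and he always paid attention to his students' coursework, lives and health. While I was in military service, he conveyed his concern for me from time to time, directly or through senior students. After my discharge, with his encouragement, I returned to the Institute of Statistics at National Chiao Tung University for more advanced and in-depth education and research training. After Professor Lee passed away, the recollections shared by his family and friends showed how hard he studied, how much he loved learning, and how willing he was to try new fields. As I write this, his voice and figure come to mind.

For the oral defense, I am grateful to Professor Nan-Jung Hsu (徐南蓉) of National Tsing Hua University, Dr. Hsin-Cheng Huang (黃信誠) of Academia Sinica, and Professor Henry Horng-Shing Lu (盧鴻興) of this institute, who made time during the summer even while preparing to travel abroad for conferences. They generously corrected details, pointed out other directions, and offered insightful and valuable comments and suggestions that made this dissertation more complete. I thank every member of the committee for sharing their knowledge so freely.

I also thank every teacher at the institute again for timely help and advice during my studies, both on coursework and on career planning, and the administrative assistant, Ms. Kuo Pi-Fen (郭碧芬), whose clear guidance on institute and student affairs allowed teachers and students to concentrate on coursework and research. There are too many people to thank individually. Finally, I thank my family for their quiet support and encouragement along the way.

Ren-Sheng Wang, Institute of Statistics, National Chiao Tung University, July 2008.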

Contents

1 Introduction
2 The Theory of Generalized Inference
   2.1 The theories of generalized p-values and generalized confidence intervals
   2.2 Substitution method
   2.3 Illustrative example
3 Inferences on a Linear Combination of K Multivariate Normal Mean Vectors
   3.1 Introduction
   3.2 Hypothesis testing and confidence region estimation for Gµ
      3.2.1 Solutions based on the generalized method
      3.2.2 Solutions based on the classical methods
   3.3 The multivariate Behrens-Fisher problem
4 The Theories of the Regression Model and the Growth-Curve Model
   4.1 Regression model with known covariance matrices
   4.2 The growth-curve model
   4.3 Growth curve models with heteroscedastic uniform covariance structure
5 Generalized Inferences on Regression Models with Unequal AR(1) Covariance Matrices
   5.1 Introduction
   5.2 Regression model with AR(1) errors
      5.2.1 Single group based on the generalized method
      5.2.2 Multiple groups based on the generalized method
   5.3 The other methods
6 Results and Concluding Remarks
   6.1 Simulation studies about Gµ
      6.1.1 The multivariate Behrens-Fisher problem
      6.1.2 The expected areas and coverage probabilities
   6.2 Illustrative examples of linear combination of mean vectors
      6.2.1 Example 1
      6.2.2 Example 2
   6.3 Illustrative examples of serial dependence
      6.3.1 Simulated studies (Comparison of coverage probabilities)
      6.3.2 Example 3: the dental data
      6.3.3 Example 4: the simulated data (Testing equality of the trends)
   6.4 Concluding remarks
References
Appendix

List of Tables

6.1 Type I error with 1,000 iterations, $\Sigma_1 = I_2$, $\Sigma_2 = aI_2$
6.2 Type I error with 1,000 iterations, $\Sigma_1 = I_4$, $\Sigma_2 = aI_4$
6.3 Expected areas of 95% confidence regions and coverage probabilities of $\mu_1 - \mu_2$ under $\Sigma_1 = I_2$ and $\Sigma_2 = \frac{n_2}{n_1} aI_2$
6.4 Expected areas of 95% confidence regions and coverage probabilities of $\frac{\mu_1}{2} + \frac{\mu_2}{2} - \mu_3$ under $\Sigma_1 = I_2$, $\Sigma_2 = 3I_2$ and $\Sigma_3 = aI_2$
6.5 Sample means of plasma inorganic phosphate (mg/dl)
6.6 Various comparisons of mean flux curves over selected time intervals following oral glucose challenge
6.7 Sample means of GBV and the 95% confidence region of $G\mu$
6.8 Comparison of 95% coverage probabilities of $\beta$ under $\sigma_1 = 1$, $\rho_1 = 0.1$
6.9 Expected areas of 95% confidence regions of $\beta$ under $\sigma_1 = 1$, $\rho_1 = 0.1$
6.10 Estimated trends, expected areas and hypotheses of the dental data set
6.11 The simulated data set for 5 groups
6.12 The generalized p-values for testing equality of the trends
6.13 The generalized p-values for testing equality of growth curves with uniform covariance matrices

Chapter 1 Introduction

The main subject of this dissertation is a method for handling regression models with heteroscedastic AR(1) covariance matrices. Heteroscedasticity, the phenomenon in which a set of statistical distributions have different variances, has long attracted the attention of researchers. Some heteroscedasticity may be attributable to unknown variables, while some may be related to variables of interest. For instance, the behavior of a chemical reaction might be affected by temperature or reaction time, the heights of children may be affected by gender, and differences in corn yields may be affected by the variety of corn. It is therefore desirable to find a method for handling problems that exhibit heteroscedasticity.

The Behrens-Fisher problem is the typical case: the variances of two normal populations are not equal, that is, there is heteroscedasticity between two groups. Linnik (1968) showed that inference about the difference of the two population means admits no exact fixed-level (conventional) tests based on the complete sufficient statistics, that is, on the two sample means and the two sample variances. However, exact conventional solutions based on other statistics, and approximate solutions based on the complete sufficient statistics, do exist. For example, Scheffé (1943) gave a class of exact solutions to the Behrens-Fisher problem, but Scheffé-type solutions are inefficient in the sense that they do not use all the information in the data about the true value of the parameter. The expected length of the confidence intervals given by the Scheffé solution is much larger than that of the intervals given by approximate solutions (see Welch (1947), Lee and Gurland (1975), and Scheffé (1970)).

With prior distributions, the Bayesian method can make inferences about the difference of the means from the posterior distribution, which combines the information about the parameters in the prior distributions with the information in the data (the likelihood function). Some statisticians, however, believe that it is not appropriate to speak of a prior distribution when the parameter is known to be an unknown fixed number rather than a random variable.

The concepts of generalized p-values and generalized confidence intervals were proposed by Tsui and Weerahandi (1989) and Weerahandi (1993), respectively. Although the generalized approach shares the Bayesian philosophy that inferences should be made with special regard to the data at hand, it does not treat the parameters as random variables. In contrast to classical tests, generalized p-values may be based on several statistics, whereas conventional p-values are based on a single test statistic. The methods are exact in the sense that the tests and confidence intervals are based on exact probability statements rather than on asymptotic approximations.

The method of generalized p-values is frequently applied to practical problems involving unequal variances or unequal covariance matrices. For example, Thursby (1992), Weerahandi (1995), Ananda and Weerahandi (1996), Chang and Huang (2000), McNally, Iyer and Mathew (2003), Krishnamoorthy and Lu (2003), Mathew and Krishnamoorthy (2003, 2004), Lee and Lin (2004), Hannig, Iyer and Patterson (2006) and many others have investigated and applied generalized p-values to inference on the difference of two exponential means, extreme values under normality, the ratio of the means of two normal populations, functions of the means of lognormal distributions, the Behrens-Fisher problem, and the common mean of several normal populations, among others. The generalized method has also been applied to traditional multivariate problems in which the presence of nuisance parameters makes inference difficult.

Griffiths and Judge (1992), Chi and Weerahandi (1998), Gamage and Weerahandi (1998), Gamage, Mathew and Weerahandi (2004) and others presented the generalized method as an alternative way of handling multivariate statistical problems, such as regression models, linear models and mixed models, with different covariance matrices among multiple groups. The generalized method in the multivariate case nevertheless deserves more attention. We propose a new generalized test variable for inference on a linear combination of multivariate normal mean vectors from multiple populations. In simulation studies with only two populations, our results are equivalent to those of Gamage et al. (2004) in the bivariate case, which is also known as the bivariate Behrens-Fisher problem. In some higher-dimensional cases, however, the two sets of results are quite different. The details are discussed later.

With the notions of generalized p-values and generalized confidence regions, we provide exact inferences on the multivariate analysis-of-variance (MANOVA) model, including growth curve models with uniform covariance structures and with serial covariance structures. The growth curve model was first proposed by Potthoff and Roy (1964). Lee (1988) applied the growth curve model to the multivariate linear model with two special covariance structures, namely the uniform and the serial covariance structures. Lee and Geisser (1975), Lee (1988) and many others have shown that the growth curve model is one of the most useful methods for the MANOVA model with serial covariance structures. However, the growth curve model is restricted to a single group, or to multiple groups under the assumption of identical error correlation among the groups. As with many traditional methods, it has difficulty with models in which the error correlations differ among groups. Hence, we apply the generalized method to the regression model with heteroscedastic AR(1) covariance matrices.

Lin and Lee (2003) showed that the generalized method provides an alternative way of handling the MANOVA model with unequal uniform covariance structures among multiple groups. Their procedure, however, was based on the assumption of special design matrices. We therefore extend the idea, with some modifications, to the growth curve model with possibly unequal serial covariance matrices across groups.

This dissertation is organized as follows. Chapter 2 gives a brief introduction to generalized inference, including generalized p-values and generalized confidence intervals. Chapter 3 develops generalized inferences on a linear combination of mean vectors under unequal covariance matrices. Chapter 4 describes the traditional procedure for regression models with known covariance matrices, together with the growth curve model. Chapter 5 discusses regression models with unequal serial covariance structures. Chapter 6 presents several numerical examples and simulation studies that illustrate the advantages of the proposed methods, followed by concluding remarks. The Appendix gives algorithms, based on the standardized expression of the generalized test variable (GTV), for computing the generalized p-value and the generalized confidence region.

Chapter 2 The Theory of Generalized Inference

2.1 The theories of generalized p-values and generalized confidence intervals

Let $\mathbf{W}$ be a random variable whose distribution $f(\mathbf{W} \mid \boldsymbol{\zeta})$ depends on a vector of unknown parameters $\boldsymbol{\zeta} = (\theta, \boldsymbol{\eta})$, where $\theta$ is the parameter of interest and $\boldsymbol{\eta}$ is a vector of nuisance parameters. Suppose we are interested in testing

$$H_0: \theta \le \theta_0 \quad \text{vs.} \quad H_1: \theta > \theta_0, \tag{2.1}$$

where $\theta_0$ is a pre-specified quantity. The concepts of generalized p-values and generalized confidence intervals were developed by Tsui and Weerahandi (1989) and Weerahandi (1993), respectively, for problems in which nuisance parameters make inference by classical methods difficult. We introduce these concepts briefly.

A generalized test variable (GTV) of the form $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$, with $\mathbf{w}$ the observed value of $\mathbf{W}$, is chosen to satisfy the following requirements, referred to collectively as (2.2):

(i) For fixed $\mathbf{w}$, the distribution of $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ is free of the vector of nuisance parameters $\boldsymbol{\eta}$.
(ii) The value of $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ at $\mathbf{W} = \mathbf{w}$ is free of any unknown parameters.
(iii) For fixed $\mathbf{w}$ and $\boldsymbol{\eta}$, $\Pr[H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta}) \ge h]$ is either an increasing or a decreasing function of $\theta$ for any given $h$.

Under these conditions, if $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ is stochastically increasing in $\theta$, the generalized p-value for testing the hypothesis in (2.1) is defined as

$$p = \sup_{\theta \le \theta_0} \Pr[H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta}) \ge h_0] = \Pr[H(\mathbf{W}; \mathbf{w}, \theta_0, \boldsymbol{\eta}) \ge h_0], \tag{2.3}$$

where $h_0 = H(\mathbf{w}; \mathbf{w}, \theta_0, \boldsymbol{\eta})$.

Under the same setup, a generalized pivotal quantity (GPQ), $D(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$, satisfies the following conditions:

(i) The distribution of $D(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ is free of unknown parameters.
(ii) The observed value of $D(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ is free of the nuisance parameters $\boldsymbol{\eta}$.

Condition (i) allows us to write probability statements, leading to confidence intervals, that can be evaluated regardless of the values of the unknown parameters. Condition (ii) ensures that, given the observed value $D(\mathbf{w}; \mathbf{w}, \theta, \boldsymbol{\eta})$, we can obtain a subset of the parameter space that can be computed without knowing the values of the nuisance parameters. Let $c_1$ and $c_2$ be such that

$$\Pr[c_1 \le D(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta}) \le c_2] = 1 - \alpha; \tag{2.4}$$

then $\{\theta : c_1 \le D(\mathbf{w}; \mathbf{w}, \theta, \boldsymbol{\eta}) \le c_2\}$ is a $100(1-\alpha)\%$ generalized confidence interval for $\theta$. Furthermore, if the value of $D(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ at $\mathbf{W} = \mathbf{w}$ is $\theta$, then $\{D(\mathbf{w}; \alpha/2),\ D(\mathbf{w}; 1-\alpha/2)\}$ is a $100(1-\alpha)\%$ confidence interval for $\theta$, where $D(\mathbf{w}; \gamma)$ denotes the $\gamma$th quantile of $D(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$.

2.2 Substitution method

To obtain a usable GTV or GPQ, Peterson, Berger, and Weerahandi (2003) proposed a systematic approach, the substitution method. Let $(V_1, \ldots, V_k)$ be a set of random variables whose distributions are free of unknown parameters and whose joint distribution is known. Suppose there is also a set of observable statistics $(W_1, \ldots, W_k)$, with observed values $(w_1, \ldots, w_k)$ and known distributions, such that the number $k$ of statistics equals the number of unknown parameters of the problem, say $(\lambda_1, \ldots, \lambda_k)$. The substitution method then proceeds as follows.

1. Write the parameter of interest, $\theta$, as a function of $(\lambda_1, \ldots, \lambda_k)$, or express $\theta$ in terms of $(W_1, \ldots, W_k)$ and $(V_1, \ldots, V_k)$.
2. Obtain a GTV $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ by replacing $(W_1, \ldots, W_k)$ with $(w_1, \ldots, w_k)$ in the expression from step 1 and subtracting $\theta$.
3. Check whether $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ satisfies properties (i) and (iii) in (2.2).
4. Rewrite the $(V_1, \ldots, V_k)$ terms appearing in $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$ in terms of $(W_1, \ldots, W_k)$ and $(\lambda_1, \ldots, \lambda_k)$. Then check property (ii) in (2.2) and show that the observed sample point lies on the boundary of the extreme region.
5. Calculate the generalized p-value based on $H(\mathbf{W}; \mathbf{w}, \theta, \boldsymbol{\eta})$.
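The five steps can be traced on the univariate Behrens-Fisher problem mentioned in Chapter 1. The sketch below is an illustration only and is not taken from this chapter: it assumes two independent normal samples, sample variances with divisor $n_i$, $U_i = n_i S_i^2/\sigma_i^2 \sim \chi^2_{n_i-1}$, and the one-sided hypothesis $H_0: \mu_1 - \mu_2 \le \delta_0$. The function names `chi2_draw` and `behrens_fisher_gpv` are hypothetical.

```python
import math
import random


def chi2_draw(rng, df):
    # A chi-square(df) variate is a Gamma(shape=df/2, scale=2) variate.
    return rng.gammavariate(df / 2.0, 2.0)


def behrens_fisher_gpv(x1, x2, delta0=0.0, draws=20000, seed=0):
    """Monte Carlo generalized p-value for H0: mu1 - mu2 <= delta0 (illustrative)."""
    rng = random.Random(seed)
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    # Observed variances with divisor n_i, so that U_i = n_i S_i^2 / sigma_i^2.
    s1sq = sum((v - m1) ** 2 for v in x1) / n1
    s2sq = sum((v - m2) ** 2 for v in x2) / n2
    hits = 0
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        u1 = chi2_draw(rng, n1 - 1)
        u2 = chi2_draw(rng, n2 - 1)
        # Steps 1-2: theta = Xbar1 - Xbar2 - Z * sqrt(sigma1^2/n1 + sigma2^2/n2),
        # with sigma_i^2 / n_i = S_i^2 / U_i and statistics replaced by observed values.
        d = (m1 - m2) - z * math.sqrt(s1sq / u1 + s2sq / u2)
        # Steps 3-5: H = d - theta is stochastically decreasing in theta and has
        # observed value 0, so p = Pr(H <= 0 | theta = delta0) = Pr(d <= delta0).
        if d <= delta0:
            hits += 1
    return hits / draws
```

A small generalized p-value is evidence against $H_0$; the number of draws and the seed are arbitrary choices for the sketch.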
It should be noted that, in searching for a potential GTV or GPQ, various replacements of parameters by random variables and substitutions of random variables by their observed values are possible in steps 1 to 5.

2.3 Illustrative example

Weerahandi (2004) gave several examples of the substitution method; two of them are used here to show how it yields a GTV and a GPQ. Suppose that $X_1, X_2, \ldots, X_n$ are independent and identically distributed as $N(\mu, \sigma^2)$, with mean $\mu$ and variance $\sigma^2$, and let $\bar{X}$ and $S^2$ be the sample mean and sample variance, respectively.

Example of the generalized p-value

Suppose $\theta = \mu + \sigma^2$ is the function of the normal parameters of interest. The parameter can be expressed in terms of the sufficient statistics and random variables as

$$\theta = \bar{X} - Z\frac{\sigma}{\sqrt{n}} + \sigma^2 \tag{2.5}$$

$$\phantom{\theta} = \bar{X} - Z\frac{S}{\sqrt{U}} + \frac{nS^2}{U}, \tag{2.6}$$

where $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ and $U = \dfrac{nS^2}{\sigma^2}$ are independent standard normal and chi-squared random variables. Let $\bar{x}$ and $s^2$ be the observed values of $\bar{X}$ and $S^2$, respectively. We obtain the potential test variable

$$H = \bar{x} - Z\frac{s}{\sqrt{U}} + \frac{ns^2}{U} - \theta = \bar{x} - \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \cdot \frac{s\sigma}{S\sqrt{n}} + \frac{s^2\sigma^2}{S^2} - \theta = \bar{x} - \frac{s(\bar{X} - \mu)}{S} + \frac{s^2\sigma^2}{S^2} - \theta. \tag{2.7}$$

Given this identity, which relates the parameter to the sufficient statistics and to random variables free of unknown parameters, it is clear that the observed value of $H$ is zero and that its distribution does not depend on nuisance parameters. It also follows from (2.7) that $H$ is stochastically decreasing in the parameter of interest $\theta$. Hence $H$ is indeed a GTV. Hypotheses of the form $H_0: \theta \le \theta_0$, for instance, can be tested using the generalized p-value

$$p = \Pr(H \le 0 \mid \theta = \theta_0) = \Pr\left(\frac{sZ}{\sqrt{U}} - \frac{ns^2}{U} \ge \bar{x} - \theta_0\right). \tag{2.8}$$

In this example, the p-value can be computed by numerical integration with respect to the independent variables $Z$ and $U$. The probability of the inequality in (2.8) can also be evaluated by the Monte Carlo method: generate a large number of random pairs $(Z, U)$ and find the fraction of pairs for which the inequality is satisfied.

Example of the generalized confidence interval

Suppose $\theta = (\mu + \sigma)/(\mu^2 + \sigma^2)$ is the parameter of interest, where $\mu$ and $\sigma$ are the mean and the standard deviation of the normal distribution. Let $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ and $U = \dfrac{nS^2}{\sigma^2}$ be the independent standard normal and chi-squared random variables. Then

$$\theta = \frac{\mu + \sigma}{\mu^2 + \sigma^2} = \frac{\bar{X} - Z\sigma/\sqrt{n} + \sigma}{(\bar{X} - Z\sigma/\sqrt{n})^2 + \sigma^2} = \frac{\bar{X} - ZS/\sqrt{U} + S\sqrt{n}/\sqrt{U}}{(\bar{X} - ZS/\sqrt{U})^2 + nS^2/U}.$$

Hence we can define two representations of the GPQ:

$$D = \frac{\bar{x} - Zs/\sqrt{U} + s\sqrt{n}/\sqrt{U}}{(\bar{x} - Zs/\sqrt{U})^2 + ns^2/U} \tag{2.9}$$

$$\phantom{D} = \frac{\bar{x} - s(\bar{X} - \mu)/S + s\sigma/S}{(\bar{x} - s(\bar{X} - \mu)/S)^2 + (s\sigma/S)^2}. \tag{2.10}$$

From (2.9), the distribution of $D$ is free of unknown parameters, and (2.10) implies that the observed value of $D$ is $\theta$. Then $\{D(\mathbf{w}; \alpha/2),\ D(\mathbf{w}; 1-\alpha/2)\}$ is a $100(1-\alpha)\%$ generalized confidence interval for $\theta$; that is, with $\mathbf{w}' = (\bar{x}, s)$,

$$1 - \alpha = \Pr\left[D(\mathbf{w}; \alpha/2) \le \frac{\bar{x} - Zs/\sqrt{U} + s\sqrt{n}/\sqrt{U}}{(\bar{x} - Zs/\sqrt{U})^2 + ns^2/U} \le D(\mathbf{w}; 1-\alpha/2)\right]. \tag{2.11}$$

The probability can be evaluated by numerical integration with respect to $(Z, U)$ or by Monte Carlo integration.

Further details on the concepts of generalized p-values and generalized confidence intervals can be found in Weerahandi (1995, 2004). When there is more than one parameter of interest, as is usually the case in linear models, the substitution method must be modified to obtain a potential GTV and GPQ.
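The Monte Carlo evaluations described for the two examples of Section 2.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: $S^2$ is taken with divisor $n$ so that $U = nS^2/\sigma^2$ has a chi-squared distribution with $n - 1$ degrees of freedom, the second parameter is read as the ratio $(\mu+\sigma)/(\mu^2+\sigma^2)$, and the function names, the number of draws and the seed are arbitrary choices.

```python
import math
import random


def summarize(x):
    """Return n, the sample mean, and the standard deviation with divisor n."""
    n = len(x)
    xbar = sum(x) / n
    s = math.sqrt(sum((v - xbar) ** 2 for v in x) / n)
    return n, xbar, s


def draw_z_u(rng, n):
    """One independent pair (Z, U): Z ~ N(0, 1), U ~ chi-square(n - 1)."""
    return rng.gauss(0.0, 1.0), rng.gammavariate((n - 1) / 2.0, 2.0)


def gpv_mu_plus_var(x, theta0, draws=20000, seed=0):
    """Estimate (2.8): Pr(s Z / sqrt(U) - n s^2 / U >= xbar - theta0)."""
    rng = random.Random(seed)
    n, xbar, s = summarize(x)
    hits = 0
    for _ in range(draws):
        z, u = draw_z_u(rng, n)
        if s * z / math.sqrt(u) - n * s * s / u >= xbar - theta0:
            hits += 1
    return hits / draws


def gci_ratio(x, alpha=0.05, draws=20000, seed=0):
    """Equal-tailed empirical quantiles of D in (2.9)."""
    rng = random.Random(seed)
    n, xbar, s = summarize(x)
    d = []
    for _ in range(draws):
        z, u = draw_z_u(rng, n)
        mu_star = xbar - z * s / math.sqrt(u)   # term standing in for mu
        sigma_star = s * math.sqrt(n / u)       # term standing in for sigma
        d.append((mu_star + sigma_star) / (mu_star ** 2 + sigma_star ** 2))
    d.sort()
    lower = d[int(math.floor(alpha / 2 * (draws - 1)))]
    upper = d[int(math.ceil((1 - alpha / 2) * (draws - 1)))]
    return lower, upper
```

For a sample `x`, `gpv_mu_plus_var(x, theta0)` approximates the generalized p-value for $H_0: \theta \le \theta_0$ with $\theta = \mu + \sigma^2$, and `gci_ratio(x)` returns approximate endpoints $D(\mathbf{w}; \alpha/2)$ and $D(\mathbf{w}; 1-\alpha/2)$. Numerical integration over $(Z, U)$ would serve equally well.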

Chapter 3 Inferences on a Linear Combination of K Multivariate Normal Mean Vectors

3.1 Introduction

Suppose there are $K$ independent $d$-variate normal populations with mean vectors $\boldsymbol{\mu}_i$ and covariance matrices $\boldsymbol{\Sigma}_i$, $i = 1, 2, \ldots, K$, where the $\boldsymbol{\mu}_i$ and $\boldsymbol{\Sigma}_i$ are possibly unknown and unequal among groups. We want to make inferences on a linear combination of the $K$ mean vectors. This problem arises when there is a theoretical reason to believe that the mean vectors of the populations are related, or when practitioners want to know some characteristics of a compound material. For example, for Edgar Anderson's famous Iris data, the gene structures of the species suggest that the mean vectors of the three populations, (1) Iris versicolor, (2) Iris setosa and (3) Iris virginica, are related by $3\boldsymbol{\mu}_1 = 2\boldsymbol{\mu}_2 + \boldsymbol{\mu}_3$ (Anderson, 2003). If the differences between the covariance matrices are small and the sample sizes are large, Hotelling's $T^2$-test for a linear combination of mean vectors performs well. However, if the covariance matrices are quite different and/or the sample sizes are small, the nominal significance level may be distorted. We therefore develop a procedure that provides generalized inferences for a linear combination of the mean vectors, $\boldsymbol{\theta} = \mathbf{G}\boldsymbol{\mu}$, where $\mathbf{G}$ is a designed $d \times dK$ matrix and $\boldsymbol{\mu}$ is the $dK$-variate mean vector with $\boldsymbol{\mu}' = (\boldsymbol{\mu}_1', \ldots, \boldsymbol{\mu}_K')$. That is, we provide a generalized confidence region for $\boldsymbol{\theta}$ and test the hypothesis

$$H_0: \mathbf{G}\boldsymbol{\mu} = \boldsymbol{\theta}_0 \quad \text{vs.} \quad H_1: \mathbf{G}\boldsymbol{\mu} \ne \boldsymbol{\theta}_0, \tag{3.1}$$

where $\boldsymbol{\theta}_0$ is a given vector. For example, for the Iris data we can set $\mathbf{G} = (3\mathbf{I}_d, -2\mathbf{I}_d, -\mathbf{I}_d)$ and $\boldsymbol{\theta}_0 = \mathbf{0}$ to test this hypothesis.

Suppose $\mathbf{X}_{ij}$, $j = 1, \ldots, n_i$, are independent random vectors forming a sample of size $n_i$ from the $i$th population. Define the $i$th sample mean vector and sample covariance matrix as

$$\bar{\mathbf{X}}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} \mathbf{X}_{ij} \quad \text{and} \quad \mathbf{S}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} (\mathbf{X}_{ij} - \bar{\mathbf{X}}_i)(\mathbf{X}_{ij} - \bar{\mathbf{X}}_i)', \quad i = 1, \ldots, K. \tag{3.2}$$

It can be shown that

$$\bar{\mathbf{X}}_i \sim N_d\left(\boldsymbol{\mu}_i, \frac{\boldsymbol{\Sigma}_i}{n_i}\right) \quad \text{and} \quad \mathbf{A}_i = n_i\mathbf{S}_i \sim W_d(n_i - 1, \boldsymbol{\Sigma}_i), \quad i = 1, \ldots, K, \tag{3.3}$$

and that they are independently distributed, where $N_d(\boldsymbol{\pi}, \boldsymbol{\Psi})$ denotes the $d$-variate normal distribution with mean vector $\boldsymbol{\pi}$ and covariance matrix $\boldsymbol{\Psi}$, and $W_d(r, \boldsymbol{\Psi})$ is the $d$-dimensional Wishart distribution with $r$ degrees of freedom and scale matrix $\boldsymbol{\Psi}$. Furthermore, $n_i$ is assumed to be greater than $d$, $i = 1, \ldots, K$, to ensure that $\mathbf{S}_i^{-1}$ exists with probability one. Because the distributions of $\bar{\mathbf{X}}_i$ and $\mathbf{S}_i$ are affine invariant, we test (3.1) and construct a confidence region for $\boldsymbol{\theta}\,(= \mathbf{G}\boldsymbol{\mu})$ based on this judicious condensation of the data. Under the stated distributional assumptions, our procedures are associated with an exact probability statement and a repeated-sampling interpretation.

For $K = 2$, $\mathbf{G} = (\mathbf{I}_d, -\mathbf{I}_d)$ and $\boldsymbol{\theta}_0 = \mathbf{0}$, (3.1) reduces to the well-known multivariate Behrens-Fisher problem. Several exact and approximate tests for this problem have been considered in the literature over the past five decades. For example, Christensen and Rencher (1997) compared seven solutions for their Type I error rates and powers, and suggested that the solutions of Kim (1992) and of Nel and Van der Merwe (1986) had the highest powers among solutions whose Type I error rates were not inflated. Krishnamoorthy and Yu (2004) modified the Nel and Van der Merwe (1986) test and provided an approximate invariant solution. In addition to these approximate procedures, Bennett (1951) provided an exact solution for the generalized Behrens-Fisher problem. However, the power of Bennett's method was poor under unequal sample sizes because the method was not based on sufficient statistics. Johnson and Weerahandi (1988) provided an exact Bayesian solution, and Gamage, Mathew and Weerahandi (2004) provided generalized p-values and a generalized confidence region for the Behrens-Fisher problem.
We further consider $K$ non-homogeneous multivariate normal populations with unequal sample sizes and unequal covariance matrices, provide an invariant generalized test variable, and construct a generalized confidence region for a linear combination of $K$ multivariate normal mean vectors. The multivariate Behrens-Fisher problem is a special case of the proposed model.

The concepts of generalized p-values and generalized confidence intervals have turned out to be extremely fruitful for obtaining tests and confidence intervals involving "non-standard" parameters. We therefore use the idea to derive a new generalized pivotal quantity that is simple to use for both hypothesis testing and confidence region estimation of $\mathbf{G}\boldsymbol{\mu}$. Our procedures for hypothesis testing and for constructing the generalized confidence region of $\mathbf{G}\boldsymbol{\mu}$ are presented in Section 3.2. Several methods for the multivariate Behrens-Fisher problem are briefly introduced in Section 3.3. Results are illustrated with real and simulated data in Chapter 6. Two simulation studies are presented in Section 6.1 to compare the Type I error rates, expected areas and coverage probabilities of different procedures under different combinations of sample sizes and covariance matrices, and two data sets are then analyzed with our procedures in Section 6.2.

3.2 Hypothesis testing and confidence region estimation for Gµ

Suppose we have $K$ independent $d$-variate normal populations with mean vectors $\boldsymbol{\mu}_i$ and unequal covariance matrices $\boldsymbol{\Sigma}_i$. Let $\bar{\mathbf{X}}_i$ and $\mathbf{S}_i$ be the sample mean vector and sample covariance matrix for the $i$th population, as defined in (3.2). We consider the problem of estimating a linear combination of the $K$ multivariate normal mean vectors, $\mathbf{G}\boldsymbol{\mu}$, based on the minimal sufficient statistics $(\bar{\mathbf{X}}_1, \ldots, \bar{\mathbf{X}}_K, \mathbf{S}_1, \ldots, \mathbf{S}_K)$.

In this section, we first derive the generalized p-value and construct a generalized confidence region for $\mathbf{G}\boldsymbol{\mu}$ based on the generalized method, and then review some commonly used methods. For some special cases, especially the multivariate Behrens-Fisher problem, several methods are also reviewed in Section 3.3.

3.2.1 Solutions based on the generalized method

Note that $\bar{\mathbf{X}}_i$ and $\mathbf{S}_i$ are mutually independent with $\bar{\mathbf{X}}_i \sim N_d(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i/n_i)$, $\mathbf{S}_i \sim W_d(n_i - 1, \boldsymbol{\Sigma}_i/n_i)$ and $\mathbf{A}_i = n_i\mathbf{S}_i \sim W_d(n_i - 1, \boldsymbol{\Sigma}_i)$, $i = 1, \ldots, K$. Let $\bar{\mathbf{X}}' = (\bar{\mathbf{X}}_1', \ldots, \bar{\mathbf{X}}_K')$; then the MLE (maximum likelihood estimator) of $\boldsymbol{\theta}$ is

$$\hat{\boldsymbol{\theta}} = \mathbf{G}\bar{\mathbf{X}} \sim N_d(\boldsymbol{\theta}, \mathbf{G}\boldsymbol{\Phi}\mathbf{G}'), \tag{3.4}$$

where $\boldsymbol{\Phi}$ is the block diagonal matrix (Bdiag)

$$\boldsymbol{\Phi} = \mathrm{Bdiag}\left(\frac{\boldsymbol{\Sigma}_1}{n_1}, \ldots, \frac{\boldsymbol{\Sigma}_K}{n_K}\right) \equiv \begin{pmatrix} n_1^{-1}\boldsymbol{\Sigma}_1 & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & n_K^{-1}\boldsymbol{\Sigma}_K \end{pmatrix}.$$

If the covariance matrices $\boldsymbol{\Sigma}_i$ are given, it follows from (3.4) that

$$(\mathbf{G}\boldsymbol{\Phi}\mathbf{G}')^{-1/2}\,\mathbf{G}(\bar{\mathbf{X}} - \boldsymbol{\mu}) \equiv \mathbf{Z}_d \sim N_d(\mathbf{0}, \mathbf{I}_d). \tag{3.5}$$

If the covariance matrix $\boldsymbol{\Sigma}_i$ of the $i$th population is unknown, let $\mathbf{S} = \mathrm{Bdiag}(\mathbf{S}_1, \ldots, \mathbf{S}_K)$ and let $\mathbf{s} = \mathrm{Bdiag}(\mathbf{s}_1, \ldots, \mathbf{s}_K)$ be the observed value of $\mathbf{S}$. We can then define

$$\mathbf{R} = \left[\mathbf{s}^{-1/2}\boldsymbol{\Phi}\mathbf{s}^{-1/2}\right]^{-1/2}\left[\mathbf{s}^{-1/2}\mathbf{S}\mathbf{s}^{-1/2}\right]\left[\mathbf{s}^{-1/2}\boldsymbol{\Phi}\mathbf{s}^{-1/2}\right]^{-1/2}, \tag{3.6}$$

where $\boldsymbol{\Psi}^{1/2}$ denotes the positive definite square root of the positive definite matrix $\boldsymbol{\Psi}$ and $\boldsymbol{\Psi}^{-1/2} = (\boldsymbol{\Psi}^{1/2})^{-1}$. Note that $\mathbf{R}$ is also block diagonal, $\mathbf{R} = \mathrm{Bdiag}(\mathbf{R}_1, \ldots, \mathbf{R}_K)$, where

$$\mathbf{R}_i = \left[\mathbf{s}_i^{-1/2}(\boldsymbol{\Sigma}_i/n_i)\mathbf{s}_i^{-1/2}\right]^{-1/2}\left[\mathbf{s}_i^{-1/2}\mathbf{S}_i\mathbf{s}_i^{-1/2}\right]\left[\mathbf{s}_i^{-1/2}(\boldsymbol{\Sigma}_i/n_i)\mathbf{s}_i^{-1/2}\right]^{-1/2}. \tag{3.7}$$

Since $\mathbf{R}_i \sim W_d(n_i - 1, \mathbf{I}_d)$ is free of any unknown parameters, and since at $\mathbf{S} = \mathbf{s}$ the observed value $\mathbf{r}$ of $\mathbf{R}$ is $\left[\mathbf{s}^{-1/2}\boldsymbol{\Phi}\mathbf{s}^{-1/2}\right]^{-1}$, it is clear that $\mathbf{s}^{1/2}\mathbf{R}^{-1}\mathbf{s}^{1/2} = \boldsymbol{\Phi}$ at $\mathbf{S} = \mathbf{s}$. That is, we can use the information in $\mathbf{s}$ and $\mathbf{R}$ to make inference about the nuisance parameters $\boldsymbol{\Phi}$. We now derive the generalized inferences for $\mathbf{G}\boldsymbol{\mu}$ based on $\bar{\mathbf{X}}$ and $\mathbf{R}$. Let $\bar{\mathbf{x}}$ and $\mathbf{r}$ be the observed values of $\bar{\mathbf{X}}$ and $\mathbf{R}$, respectively. The generalized pivotal quantity can be expressed as

$$\mathbf{T}(\bar{\mathbf{X}}, \mathbf{R}; \bar{\mathbf{x}}, \mathbf{r}) = \mathbf{G}\bar{\mathbf{x}} - \left(\mathbf{G}\mathbf{s}^{1/2}\mathbf{R}^{-1}\mathbf{s}^{1/2}\mathbf{G}'\right)^{1/2}(\mathbf{G}\boldsymbol{\Phi}\mathbf{G}')^{-1/2}\,\mathbf{G}(\bar{\mathbf{X}} - \boldsymbol{\mu}) = \mathbf{G}\bar{\mathbf{x}} - \left(\mathbf{G}\mathbf{s}^{1/2}\mathbf{R}^{-1}\mathbf{s}^{1/2}\mathbf{G}'\right)^{1/2}\mathbf{Z}_d. \tag{3.8}$$

The value of $\mathbf{T}$ in (3.8) at $(\bar{\mathbf{X}}, \mathbf{S}) = (\bar{\mathbf{x}}, \mathbf{s})$ is $\mathbf{G}\boldsymbol{\mu}$, the parameter of interest. Furthermore, given $(\bar{\mathbf{x}}, \mathbf{s})$, the distribution of $\mathbf{T}$ is free of any unknown parameters. Therefore $\mathbf{T}$ in (3.8) satisfies the two GPQ conditions stated before (2.4) and is truly a GPQ, which can be used to construct a confidence region for $\mathbf{G}\boldsymbol{\mu}$.

The generalized p-value
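Monte Carlo draws of the GPQ $\mathbf{T}$ in (3.8) are the basic ingredient of the computations that follow. The sketch below shows one way such draws might be generated. It is not the algorithm of the Appendix: it assumes NumPy is available, builds each $\mathbf{R}_i \sim W_d(n_i - 1, \mathbf{I}_d)$ from $n_i - 1$ standard normal columns, and uses the hypothetical helper names `sym_power` and `draw_gpq`.

```python
import numpy as np


def sym_power(a, p):
    """Power of a symmetric positive definite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * vals ** p) @ vecs.T


def draw_gpq(xbars, s_list, ns, G, draws=5000, seed=0):
    """Monte Carlo draws of T in (3.8); returns one draw per row."""
    rng = np.random.default_rng(seed)
    d = xbars[0].shape[0]
    K = len(ns)
    gx = G @ np.concatenate(xbars)                  # G x-bar
    s_half = [sym_power(s, 0.5) for s in s_list]    # s_i^{1/2}
    out = np.empty((draws, G.shape[0]))
    for b in range(draws):
        phi_star = np.zeros((d * K, d * K))
        for i, n in enumerate(ns):
            # R_i ~ W_d(n_i - 1, I_d); requires n_i > d so that R_i is invertible.
            e = rng.standard_normal((d, n - 1))
            r_inv = np.linalg.inv(e @ e.T)
            blk = slice(i * d, (i + 1) * d)
            # Block i of s^{1/2} R^{-1} s^{1/2}, which stands in for Sigma_i / n_i.
            phi_star[blk, blk] = s_half[i] @ r_inv @ s_half[i]
        root = sym_power(G @ phi_star @ G.T, 0.5)
        out[b] = gx - root @ rng.standard_normal(G.shape[0])
    return out
```

Given an array `T` of draws, `T.mean(axis=0)` and `np.cov(T, rowvar=False)` give Monte Carlo estimates of the mean and covariance matrix of $\mathbf{T}$, which is one possible way to obtain the standardizing quantities used below.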

(23) For given ( x, s) , the distribution in (3.8) is independent of unknown parameters and hence the Monte Carlo method can be utilized to construct a confidence region of Gµ , and test the hypothesis H 0 : Gµ = θ0. vs.. H1 : Gµ ≠ θ0 ,. (3.9). where θ0 is a given vector. Suppose m T and S T are the mean and covariance = S −1/ 2 (T − m ) is the standardized expression of T , then the matrix of T , and T T T. generalized p-value for testing (3.9) can be computed by | x, r} , > θ p = Pr{ T 0 = S −1/ 2 (θ − m ) , where θ T T 0 0. and θ 0. T. (3.10). , respectively, and θ are norms of T 0. = T ′T , and the null hypothesis (3.9) will be rejected whenever p ≤ α . with T. Furthermore, if we want to test the MANOVA problem of the form H 0 : µ1 = ... = µ K which can be expressed as H 0 : G *µ = 0 . One convenient choice for G* in this particular problem is I d I * G = d   I d. where G. (i ). = (c I , (i ) 1 d. −I d. 0. 0. −I d. 0. 0. , c I ), (i ) K d. c. 0   G (2)    0 ... 0   G (3)  , =       ... 0 −I d  G ( K )  0 .... (i ) j.  1 j= 1  =  −1 j= i .  0 o.w. . Similar to T in (3.8), the generalized test variable can be expressed as 1/ 2. T =G x −  G *s1/ 2 R −1s1/ 2G *′    *. *. Z d ( K −1) . And the p-value can also be computed in the. similar way as (3.10). The generalized confidence region If we are interested in constructing confidence interval of θ . Since T in (3.8) also fulfills two requirements of the generalized pivotal quantity and the observed value of T is θ , so it can be used to construct the confidence region of θ . Let q T ; 1-α be the { } , such that 100(1 − α ) th percentile of T. 13.

(24) {. }. ′T = (T − m )′S −1 (T − m ) ≤ q 2 = 1-α , Pr T T T T { T ; 1-α}. (3.11). Therefore, the 100(1 − α )% confidence region of θ can be solved through. { θ : (θ − m )′S (θ − m ) ≤ q{ }} . −1 T. T. 2. T. (3.12). ; 1-α T. Some remarks about confidence region are given in the Appendix. 3.2.2 Solutions based on the classical methods. In the classical procedure, the Hotelling’s T 2 test and the Chi-square test are the commonly used methods. In Hotelling’s T 2 test, we assume the population covariance matrices are the same, whereas in the classical Chi-square method, practitioners usually replace the population covariance matrices with the sample covariance matrices. We will briefly introduce these two methods to deal with our problem. The Hotelling’s T 2 test. In this method, we will assume that Σ1 = ... = Σ K = Σ and G = (c1I d ," , cK I d ) , K. then the point estimator of θ = Gµ = ∑ ci µ i and the pool covariance matrix are i =1. µˆ = ∑ i =1 ci Xi K. SH =. and. 1 N −K. ∑ ∑ K. ni. i =1. j =1. ( Xij − Xi )( Xij − Xi )′ =. 1 N −K. ∑. K. n Si ,. i =1 i. (3.13). K. respectively, where N = ∑ ni and Xi and Si are defined in (3.2), respectively. The i =1. −1. K K K criterion is Q 2 = (∑ i =1 ci Xi − θ)′  ∑ i =1 ci2S H / ni  (∑ i =1 ci Xi − θ)  . = (µˆ − θ)′ ( bS H ) (µˆ − θ) , −1. Q 2 has the Hotelling’s T 2 -distribution with N − K degrees of freedom and. where. b = ∑ i =1 ci2 / ni . Thus K. Q2 N − K − d +1 × ~ Fd , N − K − d +1 , N −K d K. so the p-value for testing H 0 : ∑ ci µi = θ0 , where θ0 is a given vector, is i =1. 14. (3.14).

(25)

$$p = \Pr\left\{ F_{d,\,N-K-d+1} > \left( \sum_{i=1}^K c_i \bar{x}_i - \theta_0 \right)' S_H^{-1} \left( \sum_{i=1}^K c_i \bar{x}_i - \theta_0 \right) \cdot \frac{N-K-d+1}{bd(N-K)} \right\}, \tag{3.15}$$

and the $100(1-\alpha)\%$ confidence region of $\theta$ can be obtained from the inequality

$$\left\{ \theta : (\hat{\mu} - \theta)' S_H^{-1} (\hat{\mu} - \theta) \le \frac{bd(N-K)}{N-K-d+1} F_{1-\alpha}(d, N-K-d+1) \right\}, \tag{3.16}$$

where $F_{1-\alpha}(d, N-K-d+1)$ is the $100(1-\alpha)$th percentile of the $F_{d,\,N-K-d+1}$ distribution.

The classical Chi-square test

The classical Chi-square method is valid when the covariance matrices are known. The statistic

$$H_d^2 = (\hat{\mu} - \theta)' \left[ \sum_{i=1}^K c_i^2 S_i / (n_i - 1) \right]^{-1} (\hat{\mu} - \theta)$$

is distributed approximately as a Chi-square distribution with $d$ degrees of freedom when the sample sizes tend to infinity, where $\hat{\mu} = \sum_{i=1}^K c_i \bar{X}_i$ and $\theta = \sum_{i=1}^K c_i \mu_i$. The p-value for testing $H_0: \sum_{i=1}^K c_i \mu_i = \theta_0$ is

$$p = \Pr\left\{ \chi_d^2 > \left( \sum_{i=1}^K c_i \bar{x}_i - \theta_0 \right)' \left[ \sum_{i=1}^K c_i^2 S_i / (n_i - 1) \right]^{-1} \left( \sum_{i=1}^K c_i \bar{x}_i - \theta_0 \right) \right\}, \tag{3.17}$$

and the approximate $100(1-\alpha)\%$ confidence region of $\theta$ may be obtained by evaluating

$$\left\{ \theta : (\hat{\mu} - \theta)' \left( \sum_{i=1}^K c_i^2 S_i / (n_i - 1) \right)^{-1} (\hat{\mu} - \theta) \le \chi^2_{1-\alpha}(d) \right\}, \tag{3.18}$$

where $\chi^2_{1-\alpha}(d)$ is the $100(1-\alpha)$th percentile of the $\chi^2$ distribution with $d$ degrees of freedom.

3.3 The multivariate Behrens-Fisher problem

If we are only interested in the multivariate Behrens-Fisher problem, that is, only two populations are involved and $c_1 = 1$ and $c_2 = -1$, i.e., $G = (I_d, -I_d)$, then the generalized pivotal quantity (3.8) becomes

$$T_1(X, S; x, s) = (\bar{x}_1 - \bar{x}_2) - \left( s_1^{1/2} R_1^{-1} s_1^{1/2} + s_2^{1/2} R_2^{-1} s_2^{1/2} \right)^{1/2} Z_d. \tag{3.19}$$

The p-value for testing

$$H_0: \mu_1 = \mu_2 \quad \text{vs.} \quad H_1: \mu_1 \neq \mu_2 \tag{3.20}$$

(26) is similar to (3.10), with $T$ and $\theta_0$ replaced by $T_1$ and $0$, respectively. Some other methods for dealing with the multivariate Behrens-Fisher problem are briefly reviewed as follows.

Gamage, Mathew and Weerahandi (2004)

The p-value for testing (3.20) derived by Gamage et al. (2004) is

$$p = \Pr\left\{ T_{Gam} \ge (\bar{x}_1 - \bar{x}_2)' \left( \frac{s_1}{n_1 - 1} + \frac{s_2}{n_2 - 1} \right)^{-1} (\bar{x}_1 - \bar{x}_2) \,\Big|\, H_0 \right\}, \tag{3.21}$$

where $T_{Gam}$ is defined as

$$T_{Gam} = Z' \left[ v_1^{1/2} \Psi_1^{-1} v_1^{1/2} + v_2^{1/2} \Psi_2^{-1} v_2^{1/2} \right] Z, \tag{3.22}$$

with

$$V_i = \left( \frac{s_1}{n_1 - 1} + \frac{s_2}{n_2 - 1} \right)^{-1/2} S_i \left( \frac{s_1}{n_1 - 1} + \frac{s_2}{n_2 - 1} \right)^{-1/2},$$

$v_i$ being the observed value of $V_i$, $\Psi_i \sim W_d(n_i - 1, I_d)$, $i = 1, 2$, and $Z \sim N_d(0, I_d)$.

Furthermore, they also defined $T^*_{Gam} / t^*_{Gam}$ to test the MANOVA problem of the form $H_0: \mu_1 = \cdots = \mu_K$, where

$$T^*_{Gam}(\Sigma_1, \ldots, \Sigma_K) = \sum_{i=1}^K n_i (\bar{X}_i - \hat{\mu})' \Sigma_i^{-1} (\bar{X}_i - \hat{\mu}),$$

$t^*_{Gam}$ is the observed value of $T^*_{Gam}$, and $\hat{\mu} = \left( \sum_{i=1}^K n_i \Sigma_i^{-1} \right)^{-1} \sum_{i=1}^K n_i \Sigma_i^{-1} \bar{X}_i$. However, as the authors mention in their paper, this new generalized test variable $T^*_{Gam} / t^*_{Gam}$ is not invariant under non-singular transformations (Gamage et al., 2004).

Krishnamoorthy and Yu (2004)

Krishnamoorthy and Yu (2004) modified the test of Nel and Van der Merwe (1986) and provided an approximate invariant solution for the multivariate Behrens-Fisher problem. They obtained a nonsingular invariant statistic

$$T_{Kri} = \left[ (\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2) \right]' \left[ (n_1 - 1)^{-1} S_1 + (n_2 - 1)^{-1} S_2 \right]^{-1} \left[ (\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2) \right], \tag{3.23}$$

which is approximately distributed as $\nu d F_{d,\,\nu-d+1} / (\nu - d + 1)$, where

$$\nu = \frac{d(d+1)}{(n_1 - 1)^{-1} \left[ \operatorname{tr} \Lambda_1^2 + (\operatorname{tr} \Lambda_1)^2 \right] + (n_2 - 1)^{-1} \left[ \operatorname{tr} \Lambda_2^2 + (\operatorname{tr} \Lambda_2)^2 \right]},$$

$$\Lambda_1 = \frac{S_1}{n_1 - 1} \left( \frac{S_1}{n_1 - 1} + \frac{S_2}{n_2 - 1} \right)^{-1},$$

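(27)

$$\Lambda_2 = \frac{S_2}{n_2 - 1} \left( \frac{S_1}{n_1 - 1} + \frac{S_2}{n_2 - 1} \right)^{-1}.$$

The p-value for testing (3.20) is

$$p = \Pr\left\{ F_{d,\,\nu-d+1} \ge \frac{\nu - d + 1}{d\nu} \cdot (\bar{x}_1 - \bar{x}_2)' \left( \frac{s_1}{n_1 - 1} + \frac{s_2}{n_2 - 1} \right)^{-1} (\bar{x}_1 - \bar{x}_2) \,\Big|\, H_0 \right\}. \tag{3.24}$$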
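As a quick sanity check on (3.23) and (3.24), it may help to look at the univariate case $d = 1$. The reduction below is an illustration added for the reader, not part of the cited derivation; the shorthand $a_i$ is introduced only for this check, and $S_i$ is taken to be the divisor-$n_i$ sample variance implied by (3.13).

```latex
% Univariate (d = 1) check of the Krishnamoorthy-Yu approximation.
% a_i is shorthand used only in this illustration.
\begin{align*}
a_i &= \frac{S_i}{n_i - 1}, \qquad
\Lambda_i = \frac{a_i}{a_1 + a_2}, \qquad
\operatorname{tr}\Lambda_i^2 = (\operatorname{tr}\Lambda_i)^2 = \Lambda_i^2, \\[4pt]
\nu &= \frac{1 \cdot 2}{\dfrac{2\Lambda_1^2}{n_1 - 1} + \dfrac{2\Lambda_2^2}{n_2 - 1}}
     = \frac{(a_1 + a_2)^2}{\dfrac{a_1^2}{n_1 - 1} + \dfrac{a_2^2}{n_2 - 1}}, \\[4pt]
T_{Kri} &= \frac{\{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)\}^2}{a_1 + a_2}
  \;\overset{\text{approx.}}{\sim}\; \frac{\nu \cdot 1}{\nu - 1 + 1}\, F_{1,\nu} = F_{1,\nu} .
\end{align*}
```

Under these assumptions $\nu$ reduces to the Welch-Satterthwaite degrees of freedom, and since $F_{1,\nu}$ is the distribution of a squared $t_\nu$ variate, (3.24) reduces to the two-sided p-value of Welch's approximate $t$ test.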

(28) Chapter 4 The Theories of the Regression Model and the Growth-Curve Model

In this chapter, repeated measurements with different covariance matrices among groups are expressed by the regression model in matrix form as follows:

$$Y_{ij} = X_i \beta_i + \varepsilon_{ij}, \tag{4.1}$$

where $Y_{ij} = (Y_{ij1}, \ldots, Y_{ijT})'$, $Y_{ijt}$ is the measurement at time point $t$ for subject $j$ in group $i$, for $i = 1, \ldots, I$, $j = 1, \ldots, J_i$, $t = 1, \ldots, T$, and the $X_i$'s are the $T \times K$ design matrices with rank $K$, $1 \le K \le T$. Further, the $\varepsilon_{ij}$ are independent $T$-variate normal, with mean vector $0$ and positive definite covariance matrices $\Sigma_i$.

Estimating and making inferences on the $\beta_i$'s are important aspects of regression analysis. If the $\Sigma_i$'s are known, the best linear unbiased estimator (BLUE) of $\beta_i$ can be readily obtained via standard procedures. If the error covariance matrices are not known but are assumed to be identical, maximum likelihood estimation (MLE) via the growth-curve method is one of the approximate methods for dealing with this model when the sample size is large. However, if the $\Sigma_i$'s are unknown and differ between groups, the traditional methods have serious drawbacks in making inferences about the $\beta_i$'s. Even when the covariance matrices are identical among different groups, the growth-curve method can only provide an approximate result. We will briefly introduce the traditional regression model when the covariance matrices are known, and the growth-curve model with two special covariance structures.

4.1 Regression model with known covariance matrices

In this section, we briefly introduce the traditional method for making inferences on the $\beta_i$'s when the covariance matrices are known (Arnold (1981), Scheffé (1999) and Anderson (2003)). If the covariance matrices $\Sigma_i$ of the regression model (4.1) are known and given, we can pre-multiply both sides of the regression model (4.1) by $\Sigma_i^{-1/2}$, where $\Sigma_i^{-1/2}$ denotes a positive definite square root matrix of $\Sigma_i^{-1}$;
therefore we get the following standardized regression model:

(29)

$$\tilde{Y}_{ij} = \tilde{X}_i \beta_i + \tilde{\varepsilon}_{ij}, \quad j = 1, \ldots, J_i;\ i = 1, \ldots, I, \tag{4.2}$$

where $\tilde{Y}_{ij} = \Sigma_i^{-1/2} Y_{ij}$, $\tilde{X}_i = \Sigma_i^{-1/2} X_i$, $\tilde{\varepsilon}_{ij} = \Sigma_i^{-1/2} \varepsilon_{ij} \sim N_T(0, I_T)$, and $I_T$ is the $T$-dimensional identity matrix. The best linear unbiased estimator (BLUE) of $\beta_i$ is

$$\hat{\beta}_i = (J_i \tilde{X}_i' \tilde{X}_i)^{-1} \tilde{X}_i' \sum_j \tilde{Y}_{ij}, \tag{4.3}$$

and $\hat{\beta}_i \sim N_K(\beta_i, (J_i X_i' \Sigma_i^{-1} X_i)^{-1})$, $i = 1, \ldots, I$. Hence the quantities $J_i (\hat{\beta}_i - \beta_i)' X_i' \Sigma_i^{-1} X_i (\hat{\beta}_i - \beta_i)$ are independently distributed as the $\chi^2$-distribution with $K$ degrees of freedom for $i = 1, \ldots, I$. Researchers are interested in testing the equality of the trends with heteroscedastic phenomena, that is,

$$H_0: \beta_1 = \cdots = \beta_I = \beta. \tag{4.4}$$

Under the null hypothesis (4.4), the estimator of the common $\beta$ is

$$\hat{\beta} = \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \right)^{-1} \left( \sum_i \sum_j \tilde{X}_i' \tilde{Y}_{ij} \right) \sim N_K(\beta, \Psi), \tag{4.5}$$

where $\Psi = \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \right)^{-1} = \left( \sum_i J_i X_i' \Sigma_i^{-1} X_i \right)^{-1}$. Let $S_0^2 = \sum_i \sum_j (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta})' (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta})$ be the standardized residual sum of squares under the null hypothesis and $S_a^2 = \sum_i \sum_j (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta}_i)' (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta}_i)$ be the standardized residual sum of squares under the alternative hypothesis. We can then obtain the F statistic

$$F = \frac{NT - IK}{(I-1)K} \cdot \frac{S_0^2 - S_a^2}{S_a^2} \sim F_{(I-1)K,\,NT-IK}, \tag{4.6}$$

where $N = \sum_i J_i$. The p-value for testing (4.4) can be calculated by

$$\text{p-value} = \Pr\left\{ F_{(I-1)K,\,NT-IK} \ge \frac{NT - IK}{(I-1)K} \cdot \frac{s_0^2 - s_a^2}{s_a^2} \right\}, \tag{4.7}$$

where $s_0^2$ and $s_a^2$ are the observed values of $S_0^2$ and $S_a^2$, respectively, and hypothesis (4.4) is rejected if the p-value is at most $\alpha$. If the null hypothesis cannot be rejected, we may assume that the populations have the common trend $\beta$. The estimation of $\beta$ is then important. From (4.5), the confidence region with confidence coefficient $1 - \alpha$ for the common trend $\beta$ is

$$\left\{ \beta : (\beta - \hat{\beta})' \Psi^{-1} (\beta - \hat{\beta}) \le \chi_K^2(1 - \alpha) \right\}, \tag{4.8}$$

where $\Psi^{-1} = \sum_i J_i X_i' \Sigma_i^{-1} X_i$ and $\chi_K^2(1 - \alpha)$ is the $100(1-\alpha)$ percent point of the

(30) $\chi^2$-distribution with $K$ degrees of freedom.

4.2 The growth-curve model

Potthoff and Roy (1964) proposed the growth-curve model, which is a useful generalized multivariate analysis-of-variance model, especially for growth-curve problems. Rao (1967, 1975, 1977), Grizzle and Allen (1969), Geisser (1970, 1981), Fearn (1977) and others applied the growth-curve model to biological data, the forecasting of technology substitutions, and Bayesian analysis. The regression model (4.1) can be expressed as a growth-curve model if the design matrices are identical. The growth-curve model can be defined as

$$\underset{T \times N}{Y} = \underset{T \times K}{X}\ \underset{K \times I}{B}\ \underset{I \times N}{F} + \underset{T \times N}{\varepsilon}, \tag{4.9}$$

where $Y = (Y_{11}, \ldots, Y_{IJ_I})$, $\varepsilon = (\varepsilon_{11}, \ldots, \varepsilon_{IJ_I})$, $B = (\beta_1, \ldots, \beta_I)$, and $F$ is the $I \times N$ design matrix characterizing the distinct grouping of the $N$ independent vector observations, where $N = \sum_i J_i$. Let $Z$ be a known $T \times (T - K)$ matrix with rank $T - K$ such that $X'Z = 0$. We will use the results for the growth-curve model with two special covariance matrices proposed by Lee (1988) to make inferences on the $\beta_i$'s.

Uniform covariance structure

When the design matrix is $X = (1_T, X_2)$ with $1_T = (1, \ldots, 1)'$, and the covariance matrix has the uniform structure, that is,

$$\Sigma = \sigma_u^2 \left[ (1 - \rho_u) I + \rho_u 1_T 1_T' \right] = \sigma_u^2 (1 - \rho_u) I + [\sigma_u^2 \rho_u] 1_T 1_T' \tag{4.10}$$

with $-\frac{1}{T-1} < \rho_u < 1$, then the MLE's of $B$, $\sigma_u^2$ and $\rho_u$ derived by Lee (1988) are

$$\hat{B} = (X'X)^{-1} X' Y F' (FF')^{-1}, \qquad \hat{\sigma}_u^2 = \frac{\operatorname{tr} S^*}{TN}, \qquad \hat{\rho}_u = \frac{1_T' S^* 1_T - \operatorname{tr} S^*}{(T - 1) \operatorname{tr} S^*}, \tag{4.11}$$

where $S^* = Y (I - F'(FF')^{-1} F) Y' + Z (Z'Z)^{-1} Z' Y F' (FF')^{-1} F Y' Z (Z'Z)^{-1} Z'$.
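The scalar part of (4.11) can be sketched in a few lines. The following is a minimal illustration in plain Python, not Lee's (1988) implementation: it assumes the matrix $S^*$ has already been computed and is passed as a nested list, and the function names `uniform_cov`, `uniform_eigenvalues` and `uniform_mles` are hypothetical names chosen for this sketch.

```python
def uniform_cov(sigma2, rho, t):
    """Uniform covariance matrix sigma2 * [(1 - rho) I + rho 1 1'] as in (4.10)."""
    return [[sigma2 * (1.0 if m == n else rho) for n in range(t)] for m in range(t)]


def uniform_eigenvalues(sigma2, rho, t):
    """The two distinct eigenvalues of the uniform covariance matrix.

    sigma2 * (1 - rho) has multiplicity t - 1; sigma2 * (1 + (t - 1) * rho)
    belongs to the eigenvector 1_T. Both are positive exactly when
    -1 / (t - 1) < rho < 1.
    """
    return sigma2 * (1.0 - rho), sigma2 * (1.0 + (t - 1) * rho)


def uniform_mles(s_star, n_subjects):
    """sigma_u^2 and rho_u computed from S* following (4.11)."""
    t = len(s_star)
    trace = sum(s_star[k][k] for k in range(t))
    total = sum(sum(row) for row in s_star)  # equals 1_T' S* 1_T
    sigma2_hat = trace / (t * n_subjects)
    rho_hat = (total - trace) / ((t - 1) * trace)
    return sigma2_hat, rho_hat
```

As a check on the formulas, feeding in $S^* = N\Sigma$ for an exactly uniform $\Sigma$ returns the $\sigma_u^2$ and $\rho_u$ that generated it, which is what one would expect of estimators based on the trace and the off-diagonal sum.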

(31) Serial covariance structure

When the covariance matrix has the serial structure, i.e., AR(1) error correlation,

$$\Sigma = \sigma^2 C, \tag{4.12}$$

where $C = (\rho^{|m-n|})$, $1 \le m, n \le T$, $\sigma^2 > 0$, and $\rho$ is restricted to $|\rho| < 1$, which ensures that $\Sigma$ is positive definite. The MLE's of $B$ and $\sigma^2$ are

$$\begin{aligned} \hat{B}(\hat{\rho}) &= (X' \hat{C}^{-1} X)^{-1} X' \hat{C}^{-1} Y F' (FF')^{-1}, \\ \hat{\sigma}^2(\hat{\rho}) &= \frac{1}{NT} \left[ \operatorname{tr}\, (X' \hat{C}^{-1} X)^{-1} X' \hat{C}^{-1} Y (I - F'(FF')^{-1} F) Y' \hat{C}^{-1} X + \operatorname{tr}\, (Z' \hat{C} Z)^{-1} Z' Y Y' Z \right], \end{aligned} \tag{4.13}$$

respectively, where $\hat{C} = (\hat{\rho}^{|m-n|})$ and $\hat{\rho}$ is obtained by maximizing the profile likelihood function

$$L_{\max}(\rho) = (\hat{\sigma}^2(\rho))^{-NT/2} (1 - \rho^2)^{-N(T-1)/2}. \tag{4.14}$$

For the single group, $F = (1, \ldots, 1)$, $FF' = N$ and $YF' = \sum_i \sum_j Y_{ij} = N\bar{Y}$, and the MLE's of (4.13) can be written as

$$\begin{aligned} \hat{\beta}_G(\hat{\rho}) &= (X' \hat{C}^{-1} X)^{-1} X' \hat{C}^{-1} Y F' (FF')^{-1} = (X' \hat{C}^{-1} X)^{-1} X' \hat{C}^{-1} \bar{Y}, \\ \hat{\sigma}^2(\hat{\rho}) &= \frac{1}{NT} \left[ \operatorname{tr}\, (X' \hat{C}^{-1} X)^{-1} X' \hat{C}^{-1} Y \left( I - \tfrac{1}{N} F'F \right) Y' \hat{C}^{-1} X + \operatorname{tr}\, (Z' \hat{C} Z)^{-1} Z' Y Y' Z \right], \end{aligned} \tag{4.15}$$

where $\hat{\rho}$ is obtained by maximizing the profile likelihood function. The approximate $100(1-\alpha)\%$ confidence region for $\beta$ under (4.4) is

$$\left\{ \beta : N \hat{\sigma}^{-2} (\beta - \hat{\beta}_G)' (X' \hat{C}^{-1} X) (\beta - \hat{\beta}_G) \le \chi_K^2(1 - \alpha) \right\}. \tag{4.16}$$

When $K = 2$, the area of the approximate $100(1-\alpha)\%$ confidence region for $\beta$ is

$$A_G(\beta, 1 - \alpha) = \frac{\pi}{N} \left| \hat{\sigma}^{-2} (X' \hat{C}^{-1} X) \right|^{-1/2} \chi_K^2(1 - \alpha). \tag{4.17}$$

4.3 Growth curve models with heteroscedastic uniform covariance structure

Lin and Lee (2003) considered unbalanced data and unequal design matrices $X_i = (1_T, X_{i2})$ for heteroscedastic variances. The model is expressed in matrix form as follows:

$$Y_{ij} = X_i \beta_i + \alpha_{ij} 1_T + \varepsilon_{ij}, \quad j = 1, \ldots, J_i,\ i = 1, \ldots, I, \tag{4.18}$$

where $\varepsilon_{ij} \sim N(0, \Sigma_{ei})$, the random effects $\alpha_{ij} \sim N(0, \sigma_\alpha^2)$ vary independently, and $\Sigma_{ei} = \sigma_i^2 [(1 - \rho) I + \rho 1_T 1_T']$ has the uniform correlation structure. The covariance matrix of

(32) $Y_{ij}$ also has the uniform correlation structure; that is, for $i = 1, \ldots, I$,

$$\operatorname{Cov}(Y_{ij}) = \Sigma_i = \sigma_\alpha^2 1_T 1_T' + \Sigma_{ei} = \sigma_i^2 (1 - \rho) I + (\rho \sigma_i^2 + \sigma_\alpha^2) 1_T 1_T', \tag{4.19}$$

and

$$\Sigma_i^{-1} = [\sigma_i^2 (1 - \rho)]^{-1} \left[ I - \frac{\phi_i^2 - \sigma_i^2 (1 - \rho)}{T \phi_i^2} 1_T 1_T' \right], \tag{4.20}$$

with $\phi_i^2 = \sigma_i^2 (1 - \rho) + T (\rho \sigma_i^2 + \sigma_\alpha^2)$. The inverse of $\Sigma_i$ depends on $\sigma_i^2 (1 - \rho)$ and $\phi_i^2$, but not on $\rho$ itself; therefore $\Sigma_i^{-1}$ can be expressed as $\Sigma_i^{-1} = \Sigma_i^{-1}(\sigma_i^2 (1 - \rho), \phi_i^2)$. Furthermore,

$$\hat{\beta}_i = (X_i' \Sigma_i^{-1} X_i)^{-1} X_i' \Sigma_i^{-1} \bar{Y}_i = (X_i' X_i)^{-1} X_i' \bar{Y}_i \sim N\left( \beta_i, \frac{(X_i' \Sigma_i^{-1} X_i)^{-1}}{J_i} \right), \tag{4.21}$$

where $\bar{Y}_i = \frac{1}{J_i} \sum_j Y_{ij}$. The residual sum of squares is

$$SSE = \sum_{i=1}^I \sum_{j=1}^{J_i} (Y_{ij} - X_i \hat{\beta}_i)' (Y_{ij} - X_i \hat{\beta}_i) = \sum_{i=1}^I S_{W,i} + \sum_{i=1}^I S_{B,i}, \tag{4.22}$$

where

$$S_{W,i} \equiv \sum_{j=1}^{J_i} [Y_{ij} - X_i \hat{\beta}_i - (\bar{Y}_{ij.} - \bar{Y}_{i..}) 1_T]' [Y_{ij} - X_i \hat{\beta}_i - (\bar{Y}_{ij.} - \bar{Y}_{i..}) 1_T] \quad \text{and} \quad S_{B,i} \equiv T \sum_{j=1}^{J_i} (\bar{Y}_{ij.} - \bar{Y}_{i..})^2,$$

with $1_T = (1, 1, \ldots, 1)'$ and, within group $i$, $\bar{Y}_{ij.} = \frac{1}{T} \sum_t Y_{ijt} = \frac{1}{T} 1_T' Y_{ij}$ and $\bar{Y}_{i..} = \frac{1}{J_i} \sum_j \bar{Y}_{ij.} = \frac{1}{J_i} \sum_j \frac{1}{T} 1_T' Y_{ij} = \frac{1}{T} 1_T' \bar{Y}_i$.

For $i = 1, \ldots, I$, $S_{W,i}$ and $S_{B,i}$ are independent, with

$$U_{W,i} = \frac{S_{W,i}}{\sigma_i^2 (1 - \rho)} \sim \chi^2_{J_i(T-1)-(K-1)} \quad \text{and} \quad U_{B,i} = \frac{S_{B,i}}{\phi_i^2} \sim \chi^2_{J_i - 1}, \tag{4.23}$$

respectively. Pre-multiplying both sides of Equation (4.18) by $\Sigma_i^{-1/2} = \Sigma_i^{-1/2}(\sigma_i^2 (1 - \rho), \phi_i^2)$, the model with identity covariance matrix can be rewritten as

$$\tilde{Y}_{ij} = \tilde{X}_i \beta_i + \tilde{\varepsilon}_{ij}, \tag{4.24}$$

where $\tilde{\varepsilon}_{ij} \sim N_T(0, I_T)$, which is the same as (4.2).

Let $S_0^2(\sigma_1^2 (1 - \rho), \ldots, \sigma_I^2 (1 - \rho), \phi_1^2, \ldots, \phi_I^2)$ be the standardized residual sum of squares under the null hypothesis (4.4) and $S_a^2(\sigma_1^2 (1 - \rho), \ldots, \sigma_I^2 (1 - \rho), \phi_1^2, \ldots, \phi_I^2)$ be the standardized residual sum of squares under the alternative. The generalized p-value for

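(33) testing the hypothesis (4.4), $H_0: \beta_1 = \cdots = \beta_I = \beta$, can be expressed as

$$\begin{aligned} p &= \Pr\left\{ S_0^2(\sigma_1^2 (1 - \rho), \ldots, \sigma_I^2 (1 - \rho), \phi_1^2, \ldots, \phi_I^2) > s_0^2\left( \frac{s_{w,1}}{U_{W,1}}, \ldots, \frac{s_{w,I}}{U_{W,I}}, \frac{s_{b,1}}{U_{B,1}}, \ldots, \frac{s_{b,I}}{U_{B,I}} \right) \right\} \\ &= \Pr\left\{ \frac{S_0^2 - S_a^2}{S_a^2} > s_0^2\left( \frac{s_{w,1}}{U_{W,1}/U_T}, \ldots, \frac{s_{w,I}}{U_{W,I}/U_T}, \frac{s_{b,1}}{U_{B,1}/U_T}, \ldots, \frac{s_{b,I}}{U_{B,I}/U_T} \right) - 1 \right\} \\ &= 1 - E_\Delta\left\{ F_{\nu_1,\nu_2}\left[ \frac{\nu_2}{\nu_1} \left\{ s_0^2\left( \frac{s_{w,1}}{M_1 M_2 \cdots M_{2I}}, \ldots, \frac{s_{w,I}}{(1 - M_{I-1}) M_I \cdots M_{2I}}, \frac{s_{b,1}}{(1 - M_I) M_{I+1} \cdots M_{2I}}, \ldots, \frac{s_{b,I}}{(1 - M_{2I-1}) M_{2I}} \right) - 1 \right\} \right] \right\}, \end{aligned} \tag{4.25}$$

where $\nu_1 = (I - 1)K$, $\nu_2 = NT - IK$ and $U_T = \sum_{i=1}^I (U_{W,i} + U_{B,i}) \sim \chi^2_{\nu_2}$ with $N = \sum_{i=1}^I J_i$. Here $E_\Delta$ is the expected value with respect to the independent Beta random variables

$$M_r = \frac{\sum_{i=1}^r \lambda_i}{\sum_{i=1}^{r+1} \lambda_i} \sim \text{Beta}\left( \frac{\sum_{i=1}^r q_i}{2}, \frac{q_{r+1}}{2} \right), \quad r = 1, \ldots, (2I - 1),$$

with two auxiliary constants $M_0 \equiv 0$ and $M_{2I} \equiv 1$, where $(\lambda_1, \ldots, \lambda_{2I}) = (U_{W,1}, \ldots, U_{W,I}, U_{B,1}, \ldots, U_{B,I})$ and $q_r$ is the degrees of freedom of $\lambda_r$, with

$$q_r = \begin{cases} J_r (T - 1) - (K - 1), & r = 1, \ldots, I, \\ J_{r-I} - 1, & r = I + 1, \ldots, 2I. \end{cases}$$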
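The change of variables from the chi-square variates $\lambda_r$ to the Beta variates $M_r$ can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions, not the authors' implementation: all function names are hypothetical, and the sketch only covers the bookkeeping of degrees of freedom and the identity $\lambda_r / U_T = (1 - M_{r-1}) M_r \cdots M_{2I}$ that underlies the last line of (4.25).

```python
import random
from math import prod


def degrees_of_freedom(J, T, K):
    """q_1..q_2I: within-subject df for each group, then between-subject df."""
    return [j * (T - 1) - (K - 1) for j in J] + [j - 1 for j in J]


def chi2(df, rng):
    """Draw a chi-square(df) variate as Gamma(shape=df/2, scale=2)."""
    return rng.gammavariate(df / 2.0, 2.0)


def ratios_from_chi2(lam):
    """Return [M_0, ..., M_n] with M_r = (lam_1+...+lam_r) / (lam_1+...+lam_{r+1})."""
    partial, total = [], 0.0
    for x in lam:
        total += x
        partial.append(total)
    n = len(lam)
    return [0.0] + [partial[r - 1] / partial[r] for r in range(1, n)] + [1.0]


def sample_ratios(q, rng):
    """Draw M_1..M_{n-1} directly as independent Beta variates.

    M_r ~ Beta((q_1 + ... + q_r) / 2, q_{r+1} / 2); the list is padded with
    the auxiliary constants M_0 = 0 and M_n = 1.
    """
    n = len(q)
    inner = [rng.betavariate(sum(q[:r]) / 2.0, q[r] / 2.0) for r in range(1, n)]
    return [0.0] + inner + [1.0]


def shares_from_ratios(m):
    """Return lam_r / U_T = (1 - M_{r-1}) * M_r * ... * M_n for r = 1..n."""
    n = len(m) - 1
    return [(1.0 - m[r - 1]) * prod(m[r:]) for r in range(1, n + 1)]
```

Dividing the observed $s_{w,i}$ and $s_{b,i}$ by the corresponding shares gives the arguments of $s_0^2$ in the last line of (4.25), so a Monte Carlo estimate of $E_\Delta$ would only need repeated calls to something like `sample_ratios` rather than draws of all $2I$ chi-square variates.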

(34) Chapter 5 Generalized Inferences on Regression Models with Unequal AR(1) Covariance Matrices

5.1 Introduction

In many fields, such as business, engineering, medical studies and meteorology, serial dependence, i.e., AR(1) error correlation, is considered one of the most important correlation structures. In particular, a regression model with a polynomial trend (including a linear trend, especially when few measurements are taken over time) and serial dependence is one of the strong candidates for analyzing data sets collected at equally spaced time intervals. Repeated measurements with serial dependence can be expressed by the regression model in matrix form as follows:

$$Y_{ij} = X_i \beta_i + \varepsilon_{ij}, \quad i = 1, \ldots, I, \tag{5.1}$$

where $Y_{ij} = (Y_{ij1}, \ldots, Y_{ijT})'$ for $i = 1, \ldots, I$, $j = 1, \ldots, J_i$, $t = 1, \ldots, T$, and the $X_i$'s are design matrices. Further, the $\varepsilon_{ij}$ are independent $T$-variate normal, with mean vector $0$ and the AR(1) covariance matrix $\Sigma_i = \sigma_i^2 C_i$, $C_i = (\rho_i^{|m-n|})$, $1 \le m, n \le T$, $\sigma_i^2 > 0$, and $\rho_i$ is restricted to $|\rho_i| < 1$, which ensures that $\Sigma_i$ is positive definite.

Potthoff and Roy (1964), Lee and Geisser (1975), Lee (1988) and many others have shown that the growth-curve model is one of the most useful methods for dealing with the regression model (5.1) with AR(1) dependence. However, the growth-curve model is restricted to handling either a single group, or multiple groups only under the assumption of identical error correlation among the groups. As with many traditional methods, the growth-curve model has difficulty dealing with models in which the error correlations differ among groups. In this chapter, we propose a method based on the concepts of generalized p-values and generalized confidence intervals to handle the problem with heteroscedastic phenomena.

Estimating and making inferences on the $\beta_i$'s are important aspects of regression analysis.
If the error covariance matrices are not known but are assumed to be identical, maximum likelihood estimation (MLE) via the growth-curve method is one of the approximate methods for dealing with this model when the sample size is large.

(35) However, if the nuisance parameters $\sigma_i^2$ and $\rho_i$ are unknown and differ between groups, the traditional methods have serious drawbacks in making inferences about the $\beta_i$'s. Thus, an exact procedure for making inferences on the fixed effects $\beta_i$ when the serial covariance matrices are unknown and unequal among groups needs to be explored. As reviewed in Section 4.3, Lin and Lee (2003) showed that the generalized method provides an alternative way of dealing with the regression model (5.1) with unequal uniform covariance structures among multiple groups. We therefore extend the idea to the regression model (5.1) without making the equal serial dependence assumption. We perform hypothesis testing for the equality of the fixed effects and derive the distribution of the common trend if the null hypothesis cannot be rejected. Our procedures for dealing with a single group and with multiple groups are both presented in Section 5.2. The other commonly used methods, namely the growth-curve model, the classical Hotelling's $T^2$ and the classical Chi-square method, are presented in Section 5.3. Illustrative examples with real and simulated data sets are provided in Section 6.3 to compare the different methods with respect to their coverage probabilities, expected areas and p-values.

5.2 Regression model with AR(1) errors

In this section, we first introduce our method for dealing with the single group in Section 5.2.1 and then consider multiple groups with and without the assumption of identical AR(1) covariance matrices in Section 5.2.2. Other methods, such as the ML method via the growth-curve model, the classical Chi-square approximation and Hotelling's $T^2$ statistic, are briefly introduced in Section 5.3.

5.2.1 Single group based on the generalized method

In the single group, the model (5.1) reduces to

$$Y_j = X\beta + \varepsilon_j, \quad j = 1, \ldots, J, \tag{5.2}$$

where $\varepsilon_1, \ldots, \varepsilon_J$ are independent and identically distributed multivariate normal with mean vector $0$ and the AR(1) covariance matrix $\Sigma = \sigma^2 C$ with $C = (\rho^{|m-n|})$, $1 \le m, n \le T$. Let $1_T = (1, 1, \ldots, 1)'$,

$$\bar{Y}_{j.} = \frac{1}{T} \sum_t Y_{jt} = \frac{1}{T} 1_T' Y_j, \qquad \bar{Y}_{..} = \frac{1}{J} \sum_j \bar{Y}_{j.} = \frac{1}{J} \sum_j \frac{1}{T} 1_T' Y_j = \frac{1}{T} 1_T' \bar{Y}$$

(36) and $\bar{Y} = \frac{1}{J} \sum_j Y_j$. We obtain a linear unbiased estimator $b = (X'X)^{-1} X' \bar{Y} = A \bar{Y}$ with $A = (X'X)^{-1} X'$, and $b$ is distributed as $N_K(\beta, \frac{1}{J} \sigma^2 A C A')$. We use the estimator $b$ to make inferences on the unknown AR(1) covariance matrix through two independent random variables. One is the sum of squared errors about $Xb$ within subjects,

$$SSW(Xb) = \sum_j [Y_j - Xb - (\bar{Y}_{j.} - \bar{Y}_{..}) 1_T]' [Y_j - Xb - (\bar{Y}_{j.} - \bar{Y}_{..}) 1_T],$$

and the other is the sum of squared errors between subjects,

$$SSB(Xb) = T \sum_j (\bar{Y}_{j.} - \bar{Y}_{..})^2.$$

The sum of squared errors about $Xb$, $SST(Xb) = \sum_j (Y_j - Xb)' (Y_j - Xb)$, can be expressed as the sum of $SSW(Xb)$ and $SSB(Xb)$. Through the distributions and the expected values of $SSW(Xb)$ and $SSB(Xb)$, we can get information about $\Sigma$. The expectations of $SST(Xb)$ and $SSB(Xb)$ are

$$E(SST(Xb)) = E(SSW(Xb)) + E(SSB(Xb)) = \sum_j \operatorname{tr}[\operatorname{Cov}(Y_j - Xb)] = \sigma^2 (JT - \operatorname{tr}(XAC))$$

and

$$E(SSB(Xb)) = \frac{(J - 1)}{T} \sigma^2 1_T' C 1_T.$$

Let $e_b = \frac{\sigma^2}{T} 1_T' C 1_T$ and $e_w = \frac{\sigma^2}{J(T-1) - (K-1)} \left( JT - \operatorname{tr}(XAC) - \frac{J-1}{T} 1_T' C 1_T \right)$; then

$$U_W = \frac{SSW(Xb)}{e_w} \sim \chi^2_{J(T-1)-(K-1)}, \tag{5.3}$$

$$U_B = \frac{SSB(Xb)}{e_b} \sim \chi^2_{J-1}, \tag{5.4}$$

and $U_W$ and $U_B$ are independently distributed. Since the pair $\langle \sigma^2, \rho \rangle$ can be uniquely determined by the pair $\langle e_w, e_b \rangle$, we can get information about the nuisance parameters $\sigma^2$ and $\rho$ through $e_w$ and $e_b$. Hence, $\Sigma$ can be expressed as $\Sigma \equiv \Sigma(e_w, e_b)$. For any positive number $\lambda$, we have $\Sigma(\lambda e_w, \lambda e_b) = \lambda \Sigma(e_w, e_b)$ and $\lambda \Sigma^{-1}(\lambda e_w, \lambda e_b) = \Sigma^{-1}(e_w, e_b)$. Thus

$$\Sigma^{-1}\left( e_w \frac{ssw(Xb)}{SSW(Xb)}, e_b \frac{ssb(Xb)}{SSB(Xb)} \right) = \Sigma^{-1}\left( \frac{ssw(Xb)}{U_W}, \frac{ssb(Xb)}{U_B} \right)$$

(37)

$$= U_T \Sigma^{-1}\left( U_T \frac{ssw(Xb)}{U_W}, U_T \frac{ssb(Xb)}{U_B} \right) = U_T \Sigma^{-1}\left( \frac{ssw(Xb)}{B_{\nu_1,\nu_2}}, \frac{ssb(Xb)}{1 - B_{\nu_1,\nu_2}} \right), \tag{5.5}$$

where $ssw(Xb)$ and $ssb(Xb)$ are the observed values of $SSW(Xb)$ and $SSB(Xb)$, respectively, $U_T = U_B + U_W \sim \chi^2_{JT-K}$, and $B_{\nu_1,\nu_2}$ is the Beta random variable with $\nu_1 = \frac{J(T-1) - (K-1)}{2}$ and $\nu_2 = \frac{J-1}{2}$.

If $e_w$ and $e_b$ are known, pre-multiplying both sides of Eq. (5.2) by $\Sigma^{-1/2}$, we obtain the standardized regression model with identity covariance matrix as follows:

$$\tilde{Y}_j = \tilde{X} \beta + \tilde{\varepsilon}_j, \quad j = 1, \ldots, J, \quad \text{where } \tilde{\varepsilon}_j \sim N_T(0, I_T), \tag{5.6}$$

with $\tilde{Y}_j = \Sigma^{-1/2} Y_j$, $\tilde{X} = \Sigma^{-1/2} X$ and $\tilde{\varepsilon}_j = \Sigma^{-1/2} \varepsilon_j$, which is equivalent to model (3.2). Based on (5.6), the BLUE of $\beta$, denoted as $\hat{\beta}_P$, is

$$\hat{\beta}_P = (\tilde{X}'\tilde{X})^{-1} \tilde{X}' \left( \frac{1}{J} \sum_j \tilde{Y}_j \right) \quad \text{and} \quad \hat{\beta}_P \sim N_K(\beta, (J X' \Sigma^{-1} X)^{-1}).$$

Since $J (\beta - \hat{\beta}_P)' X' \Sigma^{-1} X (\beta - \hat{\beta}_P)$ is distributed as $\chi_K^2$, the random variable $\frac{J(JT-K)}{K} (\beta - \hat{\beta}_P)' X' (U_T \Sigma)^{-1} X (\beta - \hat{\beta}_P)$ is distributed as an F distribution with degrees of freedom $K$ and $JT - K$. When $K = 2$, the expected area of the $100(1-\alpha)\%$ confidence region of $\beta$ can be obtained by

$$A_P(\beta, 1 - \alpha) = \frac{\pi K}{J(JT - K)} E_{B_{\nu_1,\nu_2}}\left[ \left| X' \Sigma^{-1}\left( \frac{ssw(Xb)}{B_{\nu_1,\nu_2}}, \frac{ssb(Xb)}{1 - B_{\nu_1,\nu_2}} \right) X \right|^{-1/2} \right] F_{K,\,JT-K}(1 - \alpha), \tag{5.7}$$

where $B_{\nu_1,\nu_2}$ is as defined in (5.5).

5.2.2 Multiple groups based on the generalized method

In this section, we incorporate the generalized method into the traditional regression procedure. Our proposed method provides an alternative process for making inferences on the $\beta_i$'s of the regression model. Inferences under the assumption of distinct AR(1) covariance matrices among groups, and under the equal AR(1) covariance matrices case, are both introduced in this section.

Different covariance matrices among groups

For the situation with distinct covariance matrices among groups, we utilize similar

(38) steps as in the single group model with some modifications. First, we have to obtain the information for $\Sigma_i^{-1}$ and pre-multiply both sides of the regression model (5.1) by $\Sigma_i^{-1/2}$, $i = 1, \ldots, I$; then we get the standardized regression model:

$$\tilde{Y}_{ij} = \tilde{X}_i \beta_i + \tilde{\varepsilon}_{ij}, \quad j = 1, \ldots, J_i;\ i = 1, \ldots, I, \tag{5.8}$$

where $\tilde{Y}_{ij} = \Sigma_i^{-1/2} Y_{ij}$, $\tilde{X}_i = \Sigma_i^{-1/2} X_i$ and $\tilde{\varepsilon}_{ij} = \Sigma_i^{-1/2} \varepsilon_{ij} \sim N_T(0, I_T)$. In Section 5.2.1, the AR(1) covariance matrix $\Sigma$ is expressed as $\Sigma \equiv \Sigma(e_w, e_b)$ through the generalized method. Similarly, we will obtain the AR(1) covariance matrices $\Sigma_i$ with some modification; we can then make inferences for the common trend $\beta$ based on the standardized regression model (5.8) via the traditional regression procedure. The procedure is as follows.

Let $\bar{Y}_i = \frac{1}{J_i} \sum_j Y_{ij}$, $\bar{Y}_{ij.} = \frac{1}{T} 1_T' Y_{ij}$, $\bar{Y}_{i..} = \frac{1}{T} 1_T' \bar{Y}_i$ and $A_i = (X_i' X_i)^{-1} X_i'$; then the estimator $b_i = (X_i' X_i)^{-1} X_i' \bar{Y}_i = A_i \bar{Y}_i$ is distributed as $N_K(\beta_i, \frac{1}{J_i} \sigma_i^2 A_i C_i A_i')$. The sums of squared errors "within" subjects and "between" subjects are

$$S_{W,i} \equiv SSW(X_i b_i) = \sum_j [Y_{ij} - X_i b_i - (\bar{Y}_{ij.} - \bar{Y}_{i..}) 1_T]' [Y_{ij} - X_i b_i - (\bar{Y}_{ij.} - \bar{Y}_{i..}) 1_T]$$

and

$$S_{B,i} \equiv SSB(X_i b_i) = T \sum_j (\bar{Y}_{ij.} - \bar{Y}_{i..})^2,$$

respectively. Let $U_{W,i} = \frac{S_{W,i}}{e_{w,i}}$ and $U_{B,i} = \frac{S_{B,i}}{e_{b,i}}$, with

$$e_{w,i} = \frac{\sigma_i^2}{J_i(T-1) - (K-1)} \left( J_i T - \operatorname{tr}(X_i A_i C_i) - \frac{J_i - 1}{T} 1_T' C_i 1_T \right) \quad \text{and} \quad e_{b,i} = \frac{\sigma_i^2}{T} 1_T' C_i 1_T;$$

then it is known that $U_{W,i}$ and $U_{B,i}$ are independently distributed as $\chi^2_{J_i(T-1)-(K-1)}$ and $\chi^2_{J_i-1}$, respectively. Suppose $s_{w,i}$ and $s_{b,i}$ are the observed values of $S_{W,i}$ and $S_{B,i}$, respectively; then

$$\Sigma_i\left( e_{w,i} \frac{s_{w,i}}{S_{W,i}}, e_{b,i} \frac{s_{b,i}}{S_{B,i}} \right) = \Sigma_i\left( \frac{s_{w,i}}{U_{W,i}}, \frac{s_{b,i}}{U_{B,i}} \right). \tag{5.9}$$

Hence we can obtain the generalized estimator $\hat{\beta}_{P,i}$ for the individual group as

$$\hat{\beta}_{P,i} = (\tilde{X}_i' \tilde{X}_i)^{-1} \tilde{X}_i' \left( \frac{1}{J_i} \sum_j \tilde{Y}_{ij} \right).$$

The standardized model (5.8) can also be obtained by pre-multiplying both sides of the regression model (5.1) by $\Sigma_i^{-1/2}$, the inverse square root of the matrix in (5.9), $i = 1, \ldots, I$. We are
interested in testing the equality of the trends with heteroscedastic phenomena, that is,

(39)

$$H_0: \beta_1 = \cdots = \beta_I = \beta. \tag{5.10}$$

Under the null hypothesis (5.10), the common trend estimator $\hat{\beta}_P$ is defined as

$$\hat{\beta}_P = \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \right)^{-1} \left( \sum_i \sum_j \tilde{X}_i' \tilde{Y}_{ij} \right) = \Psi_P \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \hat{\beta}_{P,i} \right), \tag{5.11}$$

which is distributed as $N_K(\beta, \Psi_P)$, where

$$\Psi_P = \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \right)^{-1}. \tag{5.12}$$

We utilize $S_0^2(e_{w,1}, \ldots, e_{w,I}, e_{b,1}, \ldots, e_{b,I}) \equiv S_0^2 = \sum_i \sum_j (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta}_P)' (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta}_P)$ and $S_a^2(e_{w,1}, \ldots, e_{w,I}, e_{b,1}, \ldots, e_{b,I}) \equiv S_a^2 = \sum_i \sum_j (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta}_{P,i})' (\tilde{Y}_{ij} - \tilde{X}_i \hat{\beta}_{P,i})$ to test the null hypothesis (5.10). Note that $S_0^2$ and $S_a^2$ are distributed as $\chi^2$-distributions with degrees of freedom $NT - K$ and $NT - IK$, respectively, where $N = \sum_i J_i$. Then the generalized p-value for testing (5.10), the hypothesis of the equality of the trends, can be calculated by

$$\begin{aligned} p &= \Pr\left\{ S_0^2(e_{w,1}, \ldots, e_{w,I}, e_{b,1}, \ldots, e_{b,I}) > s_0^2\left( \frac{s_{w,1}}{U_{W,1}}, \ldots, \frac{s_{w,I}}{U_{W,I}}, \frac{s_{b,1}}{U_{B,1}}, \ldots, \frac{s_{b,I}}{U_{B,I}} \right) \right\} \\ &= \Pr\left\{ \frac{S_0^2 - S_a^2}{S_a^2} > s_0^2\left( \frac{s_{w,1}}{U_{W,1}/U_T}, \ldots, \frac{s_{w,I}}{U_{W,I}/U_T}, \frac{s_{b,1}}{U_{B,1}/U_T}, \ldots, \frac{s_{b,I}}{U_{B,I}/U_T} \right) - 1 \right\} \\ &= 1 - E_\Delta\left\{ F_{\nu_1,\nu_2}\left[ \frac{\nu_2}{\nu_1} \left\{ s_0^2\left( \frac{s_{w,1}}{M_1 M_2 \cdots M_{2I}}, \ldots, \frac{s_{w,I}}{(1 - M_{I-1}) M_I \cdots M_{2I}}, \frac{s_{b,1}}{(1 - M_I) M_{I+1} \cdots M_{2I}}, \ldots, \frac{s_{b,I}}{(1 - M_{2I-1}) M_{2I}} \right) - 1 \right\} \right] \right\}, \end{aligned} \tag{5.13}$$

where $U_T = \sum_i (U_{W,i} + U_{B,i}) \sim \chi^2_{NT-IK}$, and $F_{\nu_1,\nu_2}$ is the cumulative distribution function (cdf) of the F distribution with degrees of freedom $\nu_1 = (I - 1)K$ and $\nu_2 = NT - IK$. Here $E_\Delta$ is the expected value with respect to the independent Beta random variables

$$M_r = \frac{\sum_{i=1}^r \lambda_i}{\sum_{i=1}^{r+1} \lambda_i} \sim \text{Beta}\left( \frac{\sum_{i=1}^r q_i}{2}, \frac{q_{r+1}}{2} \right), \quad r = 1, \ldots, (2I - 1),$$

with two auxiliary constants $M_0 \equiv 0$ and $M_{2I} \equiv 1$, where $(\lambda_1, \ldots, \lambda_{2I}) = (U_{W,1}, \ldots, U_{W,I}, U_{B,1}, \ldots, U_{B,I})$ and $q_r$ is the degrees of freedom of $\lambda_r$, with

$$q_r = \begin{cases} J_r (T - 1) - (K - 1), & r = 1, \ldots, I, \\ J_{r-I} - 1, & r = I + 1, \ldots, 2I. \end{cases}$$

If the null hypothesis cannot be rejected, the common trend $\beta$ can be estimated by $\hat{\beta}_P$, and $(\beta - \hat{\beta}_P)' \Psi_P^{-1} (\beta - \hat{\beta}_P)$ is distributed as $\chi_K^2$. Hence, the random variable

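(40) $\frac{NT - IK}{K} (\beta - \hat{\beta}_P)' [U_T \Psi_P]^{-1} (\beta - \hat{\beta}_P)$ is distributed as an F distribution with degrees of freedom $K$ and $NT - IK$. When $K = 2$, the expected area, $A_P(\beta, 1 - \alpha)$, of the $100(1-\alpha)\%$ confidence region of $\beta$ can be obtained by

$$a_P\, E_\Delta\left[ \left| \sum_i J_i X_i' \Sigma_i^{-1}\left( \frac{s_{w,i}}{(1 - M_{i-1}) M_i \cdots M_{2I}}, \frac{s_{b,i}}{(1 - M_{i+I-1}) M_{i+I} \cdots M_{2I}} \right) X_i \right|^{-1/2} \right], \tag{5.14}$$

where the constant $a_P = \frac{\pi K}{NT - IK} F_{K,\,NT-IK}(1 - \alpha)$.

Equal covariance matrices among groups

When the AR(1) covariance matrices are equal among groups, i.e., $\sigma_1^2 = \cdots = \sigma_I^2$ and $\rho_1 = \cdots = \rho_I$, set $S_{cT} = \sum_i S_{T,i}$, $S_{cB} = \sum_i S_{B,i}$ and $S_{cW} = \sum_i S_{W,i}$. Let

$$e_{cb} = \frac{\sigma^2}{T} 1_T' C 1_T \quad \text{and} \quad e_{cw} = \frac{\sigma^2}{N(T-1) - I(K-1)} \left( NT - \operatorname{tr}\left( \sum_i X_i A_i C \right) - \frac{N - I}{T} 1_T' C 1_T \right);$$

then $U_{cW} = \frac{S_{cW}}{e_{cw}}$ and $U_{cB} = \frac{S_{cB}}{e_{cb}}$ are independently distributed as $\chi^2_{N(T-1)-I(K-1)}$ and $\chi^2_{N-I}$, respectively. The subscript $c$ in these notations stands for the case of a "common covariance matrix." Similar to the previous procedure, let $s_{cw}$ and $s_{cb}$ be the observed values of $S_{cW}$ and $S_{cB}$, respectively. Then the identical covariance matrix can be expressed as

$$\Sigma\left( e_{cw} \frac{s_{cw}}{S_{cW}}, e_{cb} \frac{s_{cb}}{S_{cB}} \right) = \Sigma\left( \frac{s_{cw}}{U_{cW}}, \frac{s_{cb}}{U_{cB}} \right). \tag{5.15}$$

Hence, $\hat{\beta}_{cP} = \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \right)^{-1} \left( \sum_i \sum_j \tilde{X}_i' \tilde{Y}_{ij} \right) = \Psi_{cP} \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \hat{\beta}_{cP,i} \right)$, the estimator under the null hypothesis, is distributed as $N_K(\beta, \Psi_{cP})$, where $\tilde{X}_i$ and $\tilde{Y}_{ij}$ denote $X_i$ and $Y_{ij}$ pre-multiplied by the inverse square root of the matrix in (5.15),

$$\hat{\beta}_{cP,i} = (\tilde{X}_i' \tilde{X}_i)^{-1} \tilde{X}_i' \left( \frac{1}{J_i} \sum_j \tilde{Y}_{ij} \right) \quad \text{and} \quad \Psi_{cP} = \left( \sum_i J_i \tilde{X}_i' \tilde{X}_i \right)^{-1}.$$

The generalized p-value for testing (5.10) can be calculated as

$$p = \Pr\left\{ S_0^2(e_{cw}, e_{cb}) > s_0^2\left( \frac{s_{cw}}{U_{cW}}, \frac{s_{cb}}{U_{cB}} \right) \right\}$$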

(41)

$$\begin{aligned} &= \Pr\left\{ \frac{S_0^2 - S_a^2}{S_a^2} > s_0^2\left( \frac{s_{cw}}{U_{cW}/U_{cT}}, \frac{s_{cb}}{U_{cB}/U_{cT}} \right) - 1 \right\} \\ &= 1 - E_B\left\{ F_{\nu_1,\nu_2}\left[ \frac{\nu_2}{\nu_1} \left\{ s_0^2\left( \frac{s_{cw}}{B}, \frac{s_{cb}}{1 - B} \right) - 1 \right\} \right] \right\}, \end{aligned} \tag{5.16}$$

where $U_{cT} = U_{cW} + U_{cB} \sim \chi^2_{NT-IK}$, $F_{\nu_1,\nu_2}[\cdot]$ is the cdf of the F distribution with degrees of freedom $\nu_1 = (I - 1)K$ and $\nu_2 = NT - IK$, and $E_B$ is the expected value with respect to the Beta random variable $B \sim \text{Beta}\left( \frac{N(T-1) - I(K-1)}{2}, \frac{N - I}{2} \right)$.

If the null hypothesis cannot be rejected, the common trend $\beta$ is estimated by $\hat{\beta}_{cP}$; then $\frac{NT - IK}{K} (\beta - \hat{\beta}_{cP})' [U_{cT} \Psi_{cP}]^{-1} (\beta - \hat{\beta}_{cP}) \sim F_{K,\,NT-IK}$. When $K = 2$, the area of the $100(1-\alpha)\%$ confidence region of $\beta$ is

$$A_{cP}(\beta, 1 - \alpha) = \frac{\pi K}{NT - IK} E_B\left[ \left| \sum_i J_i X_i' \Sigma^{-1}\left( \frac{s_{cw}}{B}, \frac{s_{cb}}{1 - B} \right) X_i \right|^{-1/2} \right] F_{K,\,NT-IK}(1 - \alpha). \tag{5.17}$$

5.3 The other methods

The growth-curve model

The regression model (5.1) can be expressed as a growth-curve model if the design matrices are identical. The results are given in Section 4.2.

The classical Chi-square approximation

In the classical Chi-square method, researchers often substitute the unknown $\Sigma_i$ with the sample covariance matrices $S_i = \frac{1}{J_i - 1} \sum_j (Y_{ij} - \bar{Y}_i)(Y_{ij} - \bar{Y}_i)'$, where $\bar{Y}_i = \frac{1}{J_i} \sum_j Y_{ij}$ for $i = 1, \ldots, I$. Let $a_i = (J_i - T - 2)/(J_i - 1)$; it is then easy to show that $E(a_i S_i^{-1}) = \Sigma_i^{-1}$. Under hypothesis (5.10), the estimate of the common trend $\beta$ can be expressed as $\hat{\beta}_{Chi} = \Psi_{Chi} \left( \sum_i J_i X_i' a_i S_i^{-1} X_i \hat{\beta}_{(chi)i} \right)$, where $\Psi_{Chi} = \left( \sum_i J_i X_i' a_i S_i^{-1} X_i \right)^{-1}$ and $\hat{\beta}_{(chi)i} = (X_i' S_i^{-1} X_i)^{-1} X_i' S_i^{-1} \bar{Y}_i$.

The approximate $100(1-\alpha)\%$ confidence region for the common trend $\beta$ is