Statistical Inference Using Generalized p-Values, Generalized Confidence Intervals, and Characteristic Functions (III)

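National Science Council, Executive Yuan: Research Project Final Report

Statistical Inference Using Generalized p-Values, Generalized Confidence Intervals, and Characteristic Functions (3/3)

Project type: Individual project
Project number: NSC94-2118-M-009-001-
Project period: August 1, 2005 to July 31, 2006
Institution: Graduate Institute of Finance, National Chiao Tung University
Principal investigator: 李昭勝
Project participant: 林淑惠
Report type: Full report
Report attachments: Report on attending an international conference, and published papers
Availability: This project is open to public inquiry

October 18, 2006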

This report contains two completed research papers.

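I. Generalized Inferences on the Common Mean Vector of Several Multivariate Normal Populations

This paper is joint work with Professor 林淑惠 and doctoral student 王仁聖. It will appear in the Journal of Statistical Planning and Inference (an SCI journal). The Chinese and English abstracts follow.

(1) Chinese abstract

This paper considers inference on the common mean vector of several multivariate normal distributions whose covariance matrices are unknown and unequal. Using generalized inference, we propose a new method for constructing an exact confidence region for the common mean vector. We illustrate the method with two real examples. We show, in terms of expected area and coverage probability, that the method outperforms other methods.

(2) English abstract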

The hypothesis testing and confidence region are considered for the common mean vector of several multivariate normal populations when the covariance matrices are unknown and possibly unequal. A generalized confidence region is derived using the concepts of generalized method based on the generalized p-value. The generalized confidence region is illustrated with two numerical examples. The merits of the proposed method are numerically compared with those of existing methods with respect to their expected area or expected d-dimensional volumes and coverage probabilities under different scenarios.

(3) Report content

Estimating the common mean vector of several multivariate normal populations with unknown and possibly unequal covariance matrices is one of the oldest and most interesting problems in the statistical literature. This problem arises, for example, when two or more independent measuring instruments or agencies measure like products, effects, or substances produced by the same production process in order to estimate the average quality in terms of several characteristics. If the samples collected by independent studies are assumed to come from multivariate normal populations with a common mean vector and unknown covariance matrices, then the problem of interest may be to estimate, or construct a confidence region for, the common mean vector µ of these populations. If the unknown covariance matrices are assumed to be identical, optimal methods are available for making inferences on µ. However, when the covariance matrices are unknown and unequal, the distribution of any combined estimator of µ involves nuisance parameters, so standard methods have serious limitations for finding an exact test and confidence region for µ. Therefore, constructing a generalized confidence region of µ for models involving variance components deserves further attention.

Suppose there are $I$ ($I \ge 2$) $d$-variate normal populations with common mean vector $\mu$ and unknown covariance matrices $\Sigma_1, \ldots, \Sigma_I$. Let $X_{i1}, \ldots, X_{in_i}$ be independent $d$-variate vector observations from the $i$th population, $i = 1, \ldots, I$, with $X_{ij} \sim N_d(\mu, \Sigma_i)$, $j = 1, \ldots, n_i$. For the $i$th population, let

$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij} \quad \text{and} \quad S_i = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)' \qquad (1.1)$$

be the sample mean vector and sample covariance matrix. We are interested in estimating the common mean vector $\mu$ based on the minimal sufficient statistics $(\bar{X}_1, \ldots, \bar{X}_I, S_1, \ldots, S_I)$.
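The statistics in (1.1) can be computed directly. The following is a minimal sketch in Python with NumPy; the function name `sample_mean_and_cov`, the simulated data, and the chosen parameter values are illustrative assumptions, not part of the original study.

```python
import numpy as np


def sample_mean_and_cov(x):
    """Return the sample mean vector and sample covariance matrix of one group.

    x is an (n_i, d) array whose rows are the observations X_i1, ..., X_in_i.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    mean = x.sum(axis=0) / n                  # sample mean vector in (1.1)
    centered = x - mean
    cov = centered.T @ centered / (n - 1)     # sample covariance matrix in (1.1)
    return mean, cov


# Hypothetical example: I = 2 bivariate populations sharing the mean vector mu
# but with different covariance matrices (all values are made up).
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
sigmas = [np.array([[1.0, 0.3], [0.3, 2.0]]),
          np.array([[4.0, -1.0], [-1.0, 1.0]])]
sizes = [15, 25]
samples = [rng.multivariate_normal(mu, s, size=n) for s, n in zip(sigmas, sizes)]
stats = [sample_mean_and_cov(x) for x in samples]
```

The pairs in `stats` play the role of $(\bar{X}_i, S_i)$, the minimal sufficient statistics on which the inference is based.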

In the univariate case, the common mean problem has received considerable attention in the statistical literature; we refer the reader to Meier (1953), Maric and Graybill (1979), Pagurova and Gurskii (1979), Sinha (1985), Eberhardt et al. (1989), Fairweather (1972), Jordan and Krishnamoorthy (1996), Krishnamoorthy and Lu (2003), Lin and Lee (2005) and the references therein.

In the multivariate case, Chiou and Cohen (1985) showed that $\hat{\mu}_{GD_d}$,

$$\hat{\mu}_{GD_d} = \Bigl( \sum_{j=1}^{I} n_j S_j^{-1} \Bigr)^{-1} \sum_{i=1}^{I} n_i S_i^{-1} \bar{x}_i, \qquad (1.2)$$

dominates neither $\bar{X}_1$ nor $\bar{X}_2$ when $I = 2$ and $d \ge 2$, with respect to the covariance criterion, although Graybill and Deal (1959) obtained the opposite result in the univariate two-sample case. Loh (1991) considered estimation of the common mean vector under a symmetric loss function, giving alternatives to $\hat{\mu}_{GD_d}$. Zhou and Mathew (1994) proposed several combined tests for the common mean vector, but did not discuss the problem of multiple comparisons when the null hypothesis is rejected. Jordan and Krishnamoorthy (1995) provided a confidence region for $\mu$ centered at a weighted Graybill and Deal estimator $\hat{\mu}_{JK}$,

$$\hat{\mu}_{JK} = \Bigl( \sum_{j=1}^{I} c_j n_j S_j^{-1} \Bigr)^{-1} \sum_{i=1}^{I} c_i n_i S_i^{-1} \bar{x}_i, \qquad (1.3)$$

but this region is not always non-empty. Moreover, the percentile points needed to construct the confidence region for $\mu$ are quite difficult to determine in practice, so approximation is necessary.
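To make the structure of estimators (1.2) and (1.3) concrete, here is a small sketch in Python with NumPy. The function name `graybill_deal` and the toy inputs are assumptions. The weights $c_i$ of (1.3) are not defined in this report, so they are simply passed in by the caller; leaving them at 1 gives (1.2).

```python
import numpy as np


def graybill_deal(means, covs, sizes, weights=None):
    """Combine group means with weights c_i * n_i * S_i^{-1}.

    With weights=None (all c_i = 1) this is the form of estimator (1.2);
    with user-supplied c_i it has the form of estimator (1.3).
    """
    if weights is None:
        weights = np.ones(len(sizes))
    d = np.asarray(means[0]).shape[0]
    precision_sum = np.zeros((d, d))   # sum_j c_j n_j S_j^{-1}
    weighted_sum = np.zeros(d)         # sum_i c_i n_i S_i^{-1} xbar_i
    for m, s, n, c in zip(means, covs, sizes, weights):
        w = c * n * np.linalg.inv(np.asarray(s, dtype=float))
        precision_sum += w
        weighted_sum += w @ np.asarray(m, dtype=float)
    # Solve the linear system instead of forming the outer inverse explicitly.
    return np.linalg.solve(precision_sum, weighted_sum)


# Toy check with identity covariance matrices (made-up numbers): the estimate
# reduces to an average of the group means weighted by sample size.
means = [np.array([0.0, 0.0]), np.array([2.0, 4.0])]
covs = [np.eye(2), np.eye(2)]
equal_sizes = graybill_deal(means, covs, sizes=[10, 10])
unequal_sizes = graybill_deal(means, covs, sizes=[10, 30])
```

With identity covariance matrices, `equal_sizes` is the midpoint of the two means, and `unequal_sizes` is pulled toward the larger sample. When the $S_i$ differ, groups with smaller estimated covariance receive more weight.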

In this paper, we provide a method that is readily applicable to both hypothesis testing and confidence region construction for the common mean vector µ. Our approach is based on the concepts of generalized p-values and generalized confidence intervals, introduced by Tsui and Weerahandi (1989) and Weerahandi (1993), respectively. These ideas have proved very satisfactory for obtaining tests and confidence intervals in many complex problems; see Lin and Lee (2003), Lee and Lin (2004) and many others. Gamage et al. (2004) provided a generalized p-value and a generalized confidence region for the multivariate Behrens-Fisher problem and MANOVA. For a discussion of several applications, we refer the reader to the book by Weerahandi (1995). In terms of the expected area or d-dimensional volume and the coverage probability, our method is compared with the methods derived by the classical approach, by Graybill and Deal (1959), and by Jordan and Krishnamoorthy (1995). The numerical results in sections 4 and 5 also show that our method performs better than these methods.
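The idea of a generalized confidence interval can be seen in the simplest textbook setting: a single univariate normal mean, where the generalized pivotal quantity reduces to the usual t pivot. The sketch below shows only that illustration, not the multivariate construction of this paper. The function name, the sample data, and the Monte Carlo settings are assumptions.

```python
import numpy as np


def generalized_ci_normal_mean(x, level=0.95, draws=100_000, seed=0):
    """Monte Carlo generalized confidence interval for a normal mean.

    Generalized pivotal quantity:
        R = xbar - (Z / sqrt(U / (n - 1))) * s / sqrt(n),
    with Z ~ N(0, 1) and U ~ chi-square(n - 1) independent, and xbar, s the
    observed sample mean and standard deviation. The distribution of R is
    free of unknown parameters, so its percentiles give the interval.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    s = x.std(ddof=1)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(draws)
    u = rng.chisquare(n - 1, size=draws)
    r = xbar - z / np.sqrt(u / (n - 1)) * s / np.sqrt(n)
    alpha = 1.0 - level
    lower, upper = np.quantile(r, [alpha / 2, 1 - alpha / 2])
    return lower, upper


# Made-up measurements, for illustration only.
data = np.array([4.1, 5.3, 4.8, 5.9, 5.1, 4.4, 5.6, 5.0])
lower, upper = generalized_ci_normal_mean(data)
```

In this one-sample case the interval agrees, up to Monte Carlo error, with the classical t interval. The generalized approach earns its keep in problems such as the common mean with unequal covariance matrices, where no classical pivot is available.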

(4) References

Chiou, W., Cohen, A., 1985. On estimating a common multivariate normal mean vector. Ann. Inst. Statist. Math. 37, 499-506.

Eberhardt, K. R., Reeve, C. P., Spiegelman, C. H., 1989. A minimax approach to combining means, with practical examples. Chemometrics Intell. Lab. Systems 5, 129-148.

Fairweather, W. R., 1972. A method of obtaining an exact confidence interval for the common mean of several normal populations. Appl. Statist. 21, 229-233.

Fisher, R. A., 1932. Statistical methods for research workers. Fourth ed. Oliver & Boyd, London.

Gamage, J., Mathew, T., Weerahandi, S., 2004. Generalized p-values and generalized confidence regions for the multivariate Behrens–Fisher problem and MANOVA. Journal of Multivariate Analysis 88, 177-189.

Graybill, F. A., Deal, R. B., 1959. Combining unbiased estimators. Biometrics 15, 543-550.

Jordan, S. M., Krishnamoorthy, K., 1995. Confidence regions for the common mean vector of several multivariate normal populations. The Canadian Journal of Statistics 23, 283-297.

Jordan, S. M., Krishnamoorthy, K., 1996. Exact confidence intervals for the common mean of several normal populations. Biometrics 52, 77-86.

Krishnamoorthy, K., Lu, Y., 2003. Inferences on the common mean of several normal populations based on the generalized variable method. Biometrics 59, 237-247.

Lee, J. C., Lin, S. H., 2004. Generalized confidence intervals for the ratio of means of two normal populations. Journal of Statistical Planning and Inference 123, 49-60.

Lin, S. H., Lee, J. C., 2003. Exact tests in simple growth curve models and one-way ANOVA with equicorrelation error structure. Journal of Multivariate Analysis 84, 351-368.

Lin, S. H., Lee, J. C., 2005. Generalized confidence interval for the common mean of several normal populations. Journal of Statistical Planning and Inference 134, 568-582.

Loh, W. L., 1991. Estimating the common mean of two multivariate normal distributions. Ann. Statist. 19, 283-296.

Maric, N., Graybill, F. A., 1979. Small samples confidence intervals on common mean of two normal distributions with unequal variances. Communications in Statistics-Theory and Methods A 8, 1255-1269.

Meier, P., 1953. Variance of a weighted mean. Biometrics 9, 59-73.

Pagurova, V. I., Gurskii, V. V., 1979. A confidence interval for the common mean of several normal distributions. Theory of Probability and Its Applications 88, 882-888.

Rao, C. R., 1973. Linear statistical inference and its applications. Wiley, New York.

Sinha, B. K., 1985. Unbiased estimation of the variance of the Graybill-Deal estimator of the common mean of several normal populations. The Canadian Journal of Statistics 13, 243-247.

Tsui, K., Weerahandi, S., 1989. Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters. Journal of the American Statistical Association 84, 602-607.

Weerahandi, S., 1993. Generalized confidence intervals. Journal of the American Statistical Association 88, 899-905.

Weerahandi, S., 1995. Exact statistical methods for data analysis. Springer-Verlag, New York.

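Zhou and Mathew, 1994. [...] models. Journal of Multivariate Analysis 51, 265-276.

(5) Self-evaluation of project results

This result is part of what the project proposed. It will appear in JSPI, a well-regarded journal among the SCI statistics journals, and merits recognition.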

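II. A Robust Approach to t Linear Mixed Models Applied to Multiple Sclerosis Data

This paper is joint work with Professor 林宗儀. It has been published in Statistics in Medicine, 2006, 1397-1412 (an SCI journal). The Chinese and English abstracts follow.

(6) Chinese abstract

We consider a robust linear mixed model based on the multivariate t distribution. Because longitudinal data are collected successively over time and are generally autocorrelated, we let each subject have first-order autocorrelation. We derive a score test statistic for the presence of autocorrelation, obtain an explicit scoring procedure for the maximum likelihood estimates, and obtain the standard errors of the estimates. We also discuss the prediction of future observations from past values. These results are illustrated with real data from a multiple sclerosis clinical trial.

(7) English abstract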

We discuss a robust extension of linear mixed models based on the multivariate t distribution. Since longitudinal data are successively collected over time and typically tend to be autocorrelated, we employ a parsimonious first-order autoregressive dependence structure for the within-subject errors. A score test statistic for testing the existence of autocorrelation among the within-subject errors is derived. Moreover, we develop an explicit scoring procedure for the maximum likelihood estimation with standard errors as a by-product. The technique for predicting future responses of a subject given past measurements is also investigated. Results are illustrated with real data from a multiple sclerosis clinical trial.

(8) Report content

Multiple sclerosis (MS), one of the most common chronic diseases of the central nervous system in young adults, occurs when the myelin around the nerve fibres in the brain becomes damaged. As yet, the precise causes of MS remain unknown, though abundant research suggests MS may be an autoimmune disease in which the immune system attacks its own myelin, causing disruptions to the nerve transmissions. There are no drugs to cure MS, but some treatments are available to ease the symptoms. For example, interferon beta-1b (IFNB) was approved by the US Food and Drug Administration in mid-1993 for use in early stage relapsing-remitting MS (RRMS) patients. For diagnosis, cranial magnetic resonance imaging (MRI) is the preferred tool for monitoring MS evolution in both natural history studies and treatment trials.

Gill [1] presents a robust approach based on Huber's ρ function to a linear mixed model for the analysis of a data set, called the MS data throughout this paper, from a cohort study of 52 patients with RRMS. The study was a placebo-controlled trial of interferon beta-1b (IFNB) in which patients were randomized to either a placebo (PL), a low-dose (LD), or a high-dose (HD) treatment. The LD and HD treatments correspond to doses of 1.6 and 8 million international units (MIU) of IFNB every other day, respectively. Each patient had a baseline cranial MRI and subsequent MRIs once every 6 weeks over two years. The 6-weekly serial MRI data were collected from June 1988 to May 1990 at the University of British Columbia site.

The use of the t distribution in place of the normal for robust regression has been investigated by a number of authors, including West [2], Lange et al. [3] and James et al. [4]. The linear mixed model with multivariate t distributed responses, called the t linear mixed model hereafter, was considered by Welsh and Richardson [5]; however, they do not explicitly discuss or derive the distributions of the random effects or the error terms. More recently, Pinheiro et al. [6] incorporated multivariate t distributed random effects and error terms to formulate a normal–normal–gamma hierarchy for the t linear mixed model. They provide several efficient EM-type algorithms for maximum likelihood (ML) estimation and illustrate the robustness with respect to outlying observations using a real example and some simulation results.

In this paper, we develop additional tools for a simplified version of the Pinheiro et al. [6] model and use these tools to analyse the MS data. The model considered here is

[equation missing in the source]

where $i$ is the subject index, $Y_i$ is a $p_i$-dimensional observed response vector, $N$ is the number of subjects, $X_i$ and $Z_i$ are, respectively, known $p_i \times m_1$ and $p_i \times m_2$ design matrices, $\beta$ is an $m_1 \times 1$ vector of fixed effects, $b_i$ is an $m_2 \times 1$ vector of unobservable random effects, $\tau_i$ is an unknown scale assumed to be distributed as gamma with mean 1 and variance $2/\nu$, and $b_i \mid \tau_i$ and $\varepsilon_i \mid \tau_i$ are assumed to be independent. Furthermore, $\Gamma$ is an $m_2 \times m_2$ matrix, which may be unstructured or structured, and $C_i$ is a $p_i \times p_i$ correlation matrix.
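The display equation for the model did not survive in the text above. One form consistent with the surrounding definitions and with the normal–normal–gamma hierarchy attributed to Pinheiro et al. [6] is sketched here. The common scale $\sigma^2$ and the exact parameterization are assumptions, not recovered from the source.

```latex
% Assumed reconstruction; sigma^2 and the gamma parameterization are not in the source.
\begin{aligned}
Y_i &= X_i \beta + Z_i b_i + \varepsilon_i, \qquad i = 1, \ldots, N, \\
b_i \mid \tau_i &\sim N_{m_2}\!\left(0, \ \sigma^2 \Gamma / \tau_i\right), \\
\varepsilon_i \mid \tau_i &\sim N_{p_i}\!\left(0, \ \sigma^2 C_i / \tau_i\right), \\
\tau_i &\sim \mathrm{Gamma}\!\left(\nu/2, \ \nu/2\right).
\end{aligned}
```

A gamma distribution with shape $\nu/2$ and rate $\nu/2$ has mean 1 and variance $2/\nu$, which matches the description of $\tau_i$. Under this form, integrating out $\tau_i$ gives $Y_i$ a multivariate t distribution with $\nu$ degrees of freedom.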

Pinheiro et al. [6] consider a general model where $C_i$ is allowed to depend upon a vector of parameters and the parameter $\nu$ is allowed to vary across subgroups of subjects. In this paper, we exploit the widely used autoregressive structure to model the dependence for the within-subject errors. As an illustration, we concentrate on the simple case where $C_i$ has an AR(1) dependence structure that is common to all subjects, i.e.

[equation missing in the source]

The dependence structure of $C_i$ can be extended to a higher-order autoregressive moving average (ARMA) dependence as provided by Rochon [7], Lin and Lee [8] and Lee et al. [9].
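As an illustration of the AR(1) structure and the gamma scale mixture, the following Python sketch builds the standard AR(1) correlation matrix, whose $(r, s)$ entry is $\rho^{|r-s|}$, and simulates one subject's response. The symbol $\rho$, the scale `sigma2`, the function names, and all numerical values are assumptions made for illustration. The hierarchy used is the usual normal–normal–gamma form, not necessarily the exact parameterization of the paper.

```python
import numpy as np


def ar1_corr(p, rho):
    """Standard AR(1) correlation matrix: entry (r, s) equals rho**|r - s|."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def simulate_subject(x, z, beta, gamma, rho, sigma2, nu, rng):
    """Draw one response vector from an assumed t linear mixed model."""
    p = x.shape[0]
    # tau ~ Gamma(shape nu/2, rate nu/2): mean 1 and variance 2/nu.
    tau = rng.gamma(shape=nu / 2, scale=2 / nu)
    # Given tau, random effects and errors are independent normals.
    b = rng.multivariate_normal(np.zeros(z.shape[1]), sigma2 * gamma / tau)
    eps = rng.multivariate_normal(np.zeros(p), sigma2 * ar1_corr(p, rho) / tau)
    return x @ beta + z @ b + eps


# Hypothetical subject with 5 visits, a linear time trend, and a random intercept.
rng = np.random.default_rng(1)
times = np.arange(5.0)
x = np.column_stack([np.ones(5), times])   # fixed-effects design (p_i x m_1)
z = np.ones((5, 1))                        # random-intercept design (p_i x m_2)
y = simulate_subject(x, z, beta=np.array([2.0, 0.5]), gamma=np.array([[1.0]]),
                     rho=0.6, sigma2=1.0, nu=4.0, rng=rng)
corr = ar1_corr(3, 0.5)
```

Small values of `nu` give heavy-tailed responses, which is what makes the t model robust to outlying subjects. As `nu` grows, `tau` concentrates at 1 and the model approaches the normal linear mixed model.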

(9) References

1. Gill PS. A robust mixed linear model analysis for longitudinal data. Statistics in Medicine 2000; 19:975–987.

2. West M. Outlier models and prior distributions in Bayesian linear regression. Journal of the Royal Statistical Society, Series B 1984; 46:431–438.

3. Lange KL, Little RJA, Taylor JMG. Robust statistical modelling using the t distribution. Journal of the American Statistical Association 1989; 84:881–896.

4. James AT, Wiskich JT, Conyers RAJ. t-REML for robust heteroscedastic regression analysis of mitochondrial power. Biometrics 1993; 49:339–356.

5. Welsh AH, Richardson AM. Approaches to the robust estimation of mixed models. In Handbook of Statistics, Maddala GS, Rao CR (eds). vol. 15. Elsevier Science: Amsterdam, 1997; 343–383.

6. Pinheiro JC, Liu CH, Wu YN. Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate t distribution. Journal of Computational and Graphical Statistics 2001; 10:249–276.

7. Rochon J. ARMA covariance structures with time heteroscedasticity for repeated measures experiments. Journal of the American Statistical Association 1992; 87:777–784.

8. Lin TI, Lee JC. On modelling data from degradation sample paths over time. Australian and New Zealand Journal of Statistics 2003; 45:257–270.

9. Lee JC, Lin TI, Lee KJ, Hsu YL. Bayesian analysis of Box-Cox transformed linear mixed models with ARMA(p, q) dependence. Journal of Statistical Planning and Inference 2005; 133:435–451.

10. Harville DA. Bayesian inference for variance components using only error contrasts. Biometrika 1974; 61:383–385.

11. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics 1982; 38:963–974.

12. D’yachkova YD, Petkau J, White R. Longitudinal analysis for magnetic resonance imaging outcomes in multiple sclerosis clinical trials. Journal of Biopharmaceutical Statistics 1997; 7:501–531.

13. Rubin DB. Inference and missing data. Biometrika 1976; 63:581–592.

14. Geisser S. The predictive sample reuse method with applications. Journal of the American Statistical Association 1975; 70:320–328.

15. Dawid AP. Present position and potential developments: some personal views. Journal of the Royal Statistical Society, Series A 1984; 147:278–292.

16. Fellner WH. Robust estimation of variance components. Technometrics 1986; 28:51–60.

17. Huggins RM. A robust approach to the analysis of repeated measures. Biometrics 1993; 49:715–720.

18. Richardson AM. Bounded influence estimation in the mixed linear model. Journal of the American Statistical Association 1997; 92:154–161.

19. Richardson AM, Welsh AH. Robust restricted maximum likelihood in mixed linear models. Biometrics 1995; 51:1429–1439.

20. Lee JC. Prediction and estimation of growth curve with special covariance structures. Journal of the American Statistical Association 1988; 83:432–440.

21. Chi EM, Reinsel GC. Models for longitudinal data with random effects and AR(1) errors. Journal of the American Statistical Association 1989; 84:452–459.

(10) Self-evaluation of project results

This paper was published in Statistics in Medicine, a very good journal with a high Impact Factor, and merits recognition.
