
Combining Hall's Transformation of Statistics and Trimmed Means for Two-Way Analysis of Variance (I)


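National Science Council, Executive Yuan: Research Project Final Report

Combining Hall's Transformation of Statistics and Trimmed Means for Two-Way Analysis of Variance (I)

Project type: Individual project

Project number: NSC91-2413-H-006-002-

Project period: August 1, 2002 to July 31, 2003

Institution: Institute of Education, National Cheng Kung University

Principal investigator: Wei-Ming Luh (陸偉明)

Report type: Condensed report

Report attachments: Report on attending an international conference, and the paper presented

Disposition: This project is open to public inquiry

October 31, 2003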


Heteroscedastic Test Statistics for One-way ANOVA:

The Trimmed Means and Hall's Transformation Conjunction

Wei-Ming Luh

National Cheng Kung University Taiwan, R. O. C.

Abstract

To deal with skewed and heteroscedastic data, the present study combines trimmed means with Hall's invertible transformation in a heteroscedastic test statistic for the one-way fixed-effect ANOVA model. The results of a simulation experiment show that the proposed technique controls Type I error rates at the nominal level quite well and yields a small gain in statistical power compared with competing methods. Confidence intervals for the trimmed means and for group differences are also derived.

Keywords: Computer simulation, Confidence interval, Robustness, Welch test

1. Introduction

The comparison of the means of several samples from populations with unequal variances has been the subject of much discussion (Kim & Cohen, 1998; Scheffé, 1970; Wang, 1971). Approximate tests, such as the Welch test (1951) and the Alexander-Govern test (1994), are well-known alternatives for dealing with heterogeneous variances, but they cannot handle nonnormality (Clinch & Keselman, 1982; Hsiung & Olejnik, 1996; Keselman, Carriere, & Lix, 1995; Oshima & Algina, 1992; Schneider & Penfield, 1997). The combination of heterogeneity and nonnormality makes the situation more complicated for applied researchers. It is known that even a slight departure from normality can have a great effect on power (see Sawilowsky & Blair, 1992; Scheffé, 1959, Ch. 10; Wilcox, 1995a). Since violations of the normality assumption are fairly common in practice (Micceri, 1989; Stigler, 1977), it is worthwhile to develop robust and efficient alternatives for dealing with this problem.

Choosing a valid and powerful method for nonnormal and heteroscedastic data is crucial because, in practice, we do not know whether the populations differ. In applied work, many nonsignificant results would have been significant if a more modern method had been used (Wilcox, 1998). Robust methods are therefore designed to lessen the impact of extreme data points. They include population parameters, estimators, and hypothesis-testing methods that are not drastically affected by small changes in a distribution (Wilcox, 1998). We should learn and adopt robust methods whenever they are useful or needed, and we should do this in a flexible and creative manner (Hampel, 1991, p. 104).

Replacing ordinary means with some resistant measure of location might make a big difference in terms of power (Wilcox, 1996, p. 205). The trimmed mean method was introduced to deal with nonnormality, especially heavy tails (Tukey & McLaughlin, 1963; Wilcox, 1995b). It represents a compromise between the two extremes of the sample mean and the sample median. In essence, the trimmed mean is a linear combination of order statistics (an L-estimate) that focuses on the middle portion of a distribution. The trimmed mean is easy to compute and its power efficiency is quite good (Oosterhoff, 1994; Staudte & Sheather, 1990, p. 105; Wiens, Wu, & Zhou, 1998). Rosenberger and Gasko (1983) recommend trimming 20% from each side. Moreover, results in Bickel and Lehmann (1975) indicate that trimmed means are preferable to one-step M-estimators when dealing with asymmetric distributions.
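As a minimal illustration of the trimmed mean described above, the following Python sketch drops the g = floor(proportion × n) smallest and largest observations and averages the rest. The function name `trimmed_mean` and the sample values are hypothetical and chosen only for illustration; they do not come from the report, whose own program was written in SAS.

```python
import math

def trimmed_mean(data, proportion=0.2):
    """Average the observations that remain after removing the g smallest
    and g largest values, where g = floor(proportion * n)."""
    xs = sorted(data)
    n = len(xs)
    g = math.floor(proportion * n)   # number trimmed from each tail
    kept = xs[g:n - g]               # middle portion of the ordered sample
    return sum(kept) / len(kept)

# Hypothetical sample with one extreme value in the upper tail.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
plain_mean = sum(sample) / len(sample)   # pulled upward by the outlier
robust_mean = trimmed_mean(sample)       # 20% trimming on each side
```

With 20% trimming and n = 10, two observations are removed from each tail, so the extreme value no longer influences the estimate.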

The trimmed mean method has been investigated for the heterogeneous variance condition (Wilcox, 1994a; Yuen, 1974). Previous findings indicate that one can adopt a heteroscedastic statistic with robust estimators of trimmed means (Luh, 1999).

Algina, Oshima, and Lin (1994) reported that the Yuen test on trimmed means had greater power than the Welch test when data were sampled from a long-tailed distribution. Moreover, for a one-way completely randomized design, Lix and Keselman (1998) also recommended using trimmed means with a heteroscedastic statistic to control the Type I error. Other researchers have investigated the bootstrap and trimmed means (Wilcox, Keselman & Kowalchuk, 1998).
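The two-sample Yuen test mentioned above can be sketched as follows. This is a sketch of the standard textbook form of the statistic (trimmed means in the numerator, Winsorized variances in the standard error, Welch-type degrees of freedom), not the report's SAS program; the function names are hypothetical. Computing a p-value would additionally require a Student t distribution function, which is omitted here.

```python
import math

def trim_and_winsorize(data, proportion=0.2):
    """Return the trimmed mean, Winsorized variance, n, and the
    effective sample size h = n - 2g for one group."""
    xs = sorted(data)
    n = len(xs)
    g = math.floor(proportion * n)
    h = n - 2 * g
    t_mean = sum(xs[g:n - g]) / h
    # Winsorize: replace the g lowest/highest values by the nearest kept value.
    wins = [xs[g]] * g + xs[g:n - g] + [xs[n - g - 1]] * g
    w_mean = sum(wins) / n
    w_var = sum((x - w_mean) ** 2 for x in wins) / (n - 1)
    return t_mean, w_var, n, h

def yuen_statistic(sample1, sample2, proportion=0.2):
    """Yuen's heteroscedastic t statistic and its approximate degrees of freedom."""
    m1, v1, n1, h1 = trim_and_winsorize(sample1, proportion)
    m2, v2, n2, h2 = trim_and_winsorize(sample2, proportion)
    d1 = (n1 - 1) * v1 / (h1 * (h1 - 1))   # squared standard error, group 1
    d2 = (n2 - 1) * v2 / (h2 * (h2 - 1))   # squared standard error, group 2
    t = (m1 - m2) / math.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t, df

# With no trimming the statistic reduces to the ordinary Welch t test.
t_welch, df_welch = yuen_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], proportion=0)
```

Setting `proportion=0` is a convenient sanity check, because the Winsorized variance then equals the usual sample variance and the result should match the Welch test.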

In addition to the trimmed mean as a robust estimator for heavy-tailed distributions, Hall (1992a) suggested an empirical and general transformation that removes most of the skewness by modifying Student's t variable through the inverse Cornish-Fisher expansion. The advantage of Hall's transformation over other transformations, such as Johnson's (Johnson, 1978; Luh & Guo, 1999), is that it is a monotone function and therefore yields confidence intervals. In the two-sample case, the confidence interval obtained by using Hall's transformation with trimmed means was derived by Guo & Luh (2000). The technique can also be applied to pairwise comparisons. In this way, unequal variances will not affect the accuracy of the confidence interval calculation.
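To make the monotonicity argument concrete, the sketch below uses the form of Hall's transformation commonly cited in the literature for a standardized mean w with estimated skewness and sample size n: w + skew·w²/3 + skew²·w³/27 + skew/(6n). This is an assumption for illustration; the trimmed-mean version used in this study is not reproduced in this excerpt and may differ in its estimators. The function names are hypothetical.

```python
import math

def hall_transform(w, skew, n):
    """Commonly cited form of Hall's transformation (assumed here):
    w + skew*w^2/3 + skew^2*w^3/27 + skew/(6n)."""
    return w + skew * w ** 2 / 3 + skew ** 2 * w ** 3 / 27 + skew / (6 * n)

def hall_inverse(t, skew, n):
    """Closed-form inverse. The first three terms equal (u^3 - 1)/skew
    with u = 1 + skew*w/3, so the map is a shifted cube and can be undone."""
    if skew == 0:
        return t
    u_cubed = 1 + skew * (t - skew / (6 * n))
    u = math.copysign(abs(u_cubed) ** (1 / 3), u_cubed)   # real cube root
    return 3 * (u - 1) / skew

# Round trip on a hypothetical value: the inverse recovers the original w.
w0 = 0.7
recovered = hall_inverse(hall_transform(w0, skew=0.8, n=20), skew=0.8, n=20)
```

Because the transformation is a cubic in u = 1 + skew·w/3 with derivative u², it never decreases, which is why critical values on the transformed scale can be mapped back to bounds on the original scale. A quadratic-only correction such as Johnson's lacks this property.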

Therefore, the present study proposes heteroscedastic statistics with trimmed means and Hall's transformation for situations in which the assumptions are jointly violated. A computer simulation was performed to evaluate the performance of the proposed techniques (Hall's transformation with trimmed means and a heteroscedastic test statistic: HAGt and HWt). The un-trimmed approximate Alexander-Govern (AG) and Welch (W) methods and their corresponding trimmed-mean versions without transformation (AGt, Wt) are included for comparison. The definitions of the proposed test statistics are as follows. For the computational formulas of the other four statistics, see Luh & Guo (1999).

Conclusion

The present study proposes combining trimmed means with Hall's transformation in a heteroscedastic test statistic for the one-way fixed-effect ANOVA model when population variances differ and population distributions are non-normal. The study examines the sampling distributions of the test statistics under null and non-null conditions in terms of empirical Type I error rates and statistical power in simulation experiments. The additional step of applying Hall's procedure to transform the test statistics appears to improve statistical power somewhat for skewed distributions. The magnitude of the improvement is between 10% and 21% in the present simulation experiments. Beyond the power improvement, unequal variances and nonnormality affect the accuracy of computed confidence intervals, and a further advantage of Hall's transformation is that it provides a confidence interval. The proposed technique is recommended especially for skewed distributions. The computer program is written in SAS and is available on request from the first author.


Percentage Type I error rates of the six methods for N2 = (20, 40)

                 Un-trimmed         Trimmed            Trimmed + Transformation
(g, h)           W        AG        Wt       AGt       HWt      HAGt
(g=0, h=0)       4.96     4.97      5.27     5.28      5.23     5.22
(g=1, h=0.5)     41.95a   41.90     6.04     6.02      6.00     6.03
(g=1, h=0)       12.73    12.72     7.25     7.24      6.77     6.77
(g=0.5, h=0)     7.07     7.07      5.73     5.73      5.45     5.45
(g=0, h=0.5)     2.93     2.91      4.05     4.05      4.31     4.30

Note: a. Bold-faced entries indicate values exceeding the criterion 7.5.
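The (g, h) labels in the table are not defined in this excerpt. Assuming they are the parameters of the g-and-h distribution widely used in robustness simulations (g controlling skewness, h controlling tail heaviness, and g = h = 0 giving the standard normal), nonnormal data of this kind could be generated as in the following sketch. The function name `g_and_h` and the seed are hypothetical.

```python
import math
import random

def g_and_h(z, g, h):
    """Transform a standard normal value z into a g-and-h variate
    (assumed parameterization; see the lead-in above)."""
    if g == 0:
        core = z                               # limit of (exp(g*z) - 1)/g as g -> 0
    else:
        core = (math.exp(g * z) - 1) / g       # g > 0 skews to the right
    return core * math.exp(h * z * z / 2)      # h > 0 stretches both tails

# Draw a few skewed, normal-tailed observations (g=1, h=0).
rng = random.Random(0)
draws = [g_and_h(rng.gauss(0, 1), g=1, h=0) for _ in range(5)]
```

Under this assumption, the first table row (g=0, h=0) is the normal baseline and the remaining rows are increasingly skewed or heavy-tailed conditions.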
