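Conclusions and Recommendations

This chapter draws final conclusions from the results reported in Chapter 4 and offers recommendations for future research. It is divided into two sections: research conclusions and research recommendations.

Section 1: Research Conclusions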

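This study is a simulation study of the test of equal factor loadings, one step in assessing measurement invariance with multi-group confirmatory factor analysis, when the observed variables are categorical. It compares two methods, "simultaneous checking" and "item-by-item checking", across conditions that vary sample size and the size of the factor loading difference. The question is whether the robust chi-square difference test correctly detects that the two groups differ, and how the power and Type I error of the test compare across conditions.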
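As an illustration of how the two criteria above are usually estimated in a simulation, the sketch below computes an empirical rejection rate from the p-values of the replications in one condition. When the true loading difference is 0 the rate estimates the Type I error; otherwise it estimates power. The function names, the 0.05 significance level, and the p-values are hypothetical and are not taken from this study.

```python
def rejection_rate(p_values, alpha=0.05):
    """Proportion of replications whose p-value falls below alpha."""
    if not p_values:
        raise ValueError("need at least one replication")
    return sum(p < alpha for p in p_values) / len(p_values)


def summarize_condition(loading_diff, p_values, alpha=0.05):
    """Label the rejection rate as Type I error (no true difference) or power."""
    label = "type_I_error" if loading_diff == 0 else "power"
    return label, rejection_rate(p_values, alpha)


# Hypothetical p-values for illustration only (10 replications per condition).
null_p = [0.40, 0.03, 0.77, 0.52, 0.91, 0.18, 0.66, 0.25, 0.08, 0.59]
alt_p = [0.01, 0.20, 0.03, 0.004, 0.02, 0.31, 0.001, 0.04, 0.06, 0.012]

print(summarize_condition(0.0, null_p))  # ('type_I_error', 0.1)
print(summarize_condition(0.3, alt_p))   # ('power', 0.7)
```

A real simulation would use far more replications per condition than the ten shown here, since the precision of the estimated rate depends on the number of replications.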

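Meade and Lautenschlager (2004) reported that, for continuous data with six items, estimation accuracy is better with sample sizes above 500 than with sample sizes below 150, with detection accuracy above 90%. Chen (陳冠志, 2006) further showed that the power of the chi-square difference test rises as sample size increases and as the factor loading difference grows. Both findings agree with the conclusion of this study that power increases with sample size and with the factor loading difference.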

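This study extends Chen (陳冠志, 2006) from continuous to categorical observed variables and finds some results that differ from the earlier work. With item-by-item checking, continuous observed variables require a sample of 200 or more (Chen, 2006), whereas the categorical observed variables in this study require 300 or more. With simultaneous checking, continuous observed variables require a sample of 700 or more (Chen, 2006), whereas the categorical observed variables in this study need only 600 or more to reach good power.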

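The main purpose of this study, however, is to examine whether sample size and the size of the factor loading difference affect the stability of the robust chi-square difference test when the observed variables are categorical, and whether the test results differ between simultaneous checking and item-by-item checking.

Drawing on the results above, this study offers the following conclusions for the reference of empirical researchers: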

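1. Under the design of this study, whether item-by-item or simultaneous checking is used for the robust chi-square difference test, power increases as sample size increases. When the sample is large enough, power is good even if the factor loading difference is small.

2. Under the design of this study, whether item-by-item or simultaneous checking is used for the robust chi-square difference test, the larger the factor loading difference, the higher the power.

3. Under the design of this study, item-by-item checking consistently gave more accurate detection results than simultaneous checking in the robust chi-square difference test.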

Section 2: Research Recommendations

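This study used a conventional confirmatory factor analysis model (one latent variable and six observed variables) as the analysis model and examined the power of the robust chi-square difference test for sample sizes of 100, 200, up to 1000 and factor loading differences of 0, 0.1, up to 0.9. With simultaneous checking, the test did not reach good power at any of the sample sizes studied when the factor loading difference was 0.1. Increasing the sample sizes in the design might yield better power in this condition as well.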
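The crossed design described above can be enumerated as follows. This is a minimal sketch: it assumes the sample sizes run from 100 to 1000 in steps of 100 and the loading differences from 0 to 0.9 in steps of 0.1, as the text suggests, and the dictionary keys and method labels are hypothetical names. Whether each sample size refers to one group or to both groups combined is not specified here, so the sketch leaves that open.

```python
from itertools import product

# Assumed levels: 100, 200, ..., 1000 and 0.0, 0.1, ..., 0.9.
SAMPLE_SIZES = list(range(100, 1001, 100))
LOADING_DIFFS = [k / 10 for k in range(10)]
METHODS = ("simultaneous", "item_by_item")  # hypothetical labels


def design_cells():
    """Return every combination of sample size, loading difference, and method."""
    return [
        {"sample_size": n, "loading_diff": d, "method": m}
        for n, d, m in product(SAMPLE_SIZES, LOADING_DIFFS, METHODS)
    ]


cells = design_cells()
print(len(cells))  # 200

# Cells with no true loading difference are the ones that yield Type I error rates.
null_cells = [c for c in cells if c["loading_diff"] == 0.0]
print(len(null_cells))  # 20
```

Extending the design, for example with larger sample sizes for the 0.1 condition, would only require appending values to `SAMPLE_SIZES`.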

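As for the research design, the conventional confirmatory factor analysis model used here is a relatively simple one. Future studies could extend the analysis model to more complex models, for example by adding latent variables or observed variables (test length). The two groups in this study were given equal sample sizes, but in empirical data the two groups are not necessarily the same size. Future studies could therefore vary the sample size ratio between the groups to examine its effect on the robust chi-square difference test. In addition, only one factor loading differed between the two groups in this design. Whether the proportion of items with differing factor loadings affects the power of the chi-square difference test under item-by-item or simultaneous checking is also worth examining.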

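References

Chinese-language references

陳冠志 [Chen] (2006). A simulation study of testing the measurement invariance of factor loadings [因素負荷量之測量恆等性檢測模擬研究]. Master of Science thesis, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education.

蔡良庭, 楊志堅, & 施慶麟 (2006). Applying the robust chi-square difference test to detect item differences with the MIMIC model [應用強韌性卡方差異檢定於 MIMIC 模式檢定試題差異性之研究]. Paper presented at the 7th Cross-Strait Conference on Psychological and Educational Testing, National Chengchi University.

English-language references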

Asparouhov, T., & Muthén, B. (2006). Robust chi square difference testing with mean and variance adjusted test statistics. Mplus Web Notes: No. 10.

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105(4), 456-466.

Drasgow, F. (1984). Scrutinizing psychological tests: Measurement equivalence and equivalent relations with external variables are the central issues. Psychological Bulletin, 95, 134-135.

Drasgow, F., & Kanfer, R. (1985). Equivalence of psychological measurement in heterogeneous populations. Journal of Applied Psychology, 70(1), 19-29.

Golembiewski, R. T., Billingsley, K., & Yeager, S. (1976). Measuring change and persistence in human affairs: Types of change generated by OD designs. Journal of Applied Behavioral Science, 12, 133-157.

Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Erlbaum.

Holman, R., Glas, C. A. W., & de Haan, R. J. (2003). Power analysis in randomized clinical trials based on item response theory. Controlled Clinical Trials, 24, 390-410.

Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.

Meade, A. W., & Lautenschlager, G. J. (2004). A Monte-Carlo study of confirmatory factor analytic tests of measurement equivalence/invariance. Structural Equation Modeling, 11(1), 60-72.

Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Psychometrika, in press.

Muthén, L. K., & Muthén, B. O. (1998-2004). Mplus user's guide. Los Angeles: Muthén & Muthén.

Shealy, R. T., & Stout, W. F. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159-194.

Thissen, D., Steinberg, L., & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 147-169). Hillsdale, NJ: Erlbaum.

Taris, T. W., Bok, I. A., & Meijer, Z. Y. (1998). Assessing stability and change of psychometric properties of multi-item concepts across different situations: A general approach. Journal of Psychology, 132, 301-316.

Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4-69.
