Suggestions for Future Research

Chapter 5: Conclusions and Suggestions

Section 2: Suggestions for Future Research

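This study examined how well three anchor-selection methods identify DIF-free items for DIF detection with the likelihood ratio test. Items judged least likely to exhibit DIF were used as anchor items in the anchor-then-detect strategy (先定錨後檢核策略), which kept the Type I error rate within a reasonable range even when the percentage of DIF items was high. The strategy was therefore well supported in this study. However, when the iterative constant-item method was used to select DIF-free items under a high percentage of DIF items, its selection accuracy was somewhat lower than that of the scale purification method and the rank-based selection method. This result is inconsistent with earlier research, and it also differs from what Sun (孫國瑋, 2010) found when detecting DIF in dichotomous items with the likelihood ratio test. Because the conditions examined here were limited, they appear insufficient to clarify the cause. Future research could refine the item-selection steps of the iterative constant-item method to improve its selection performance and, in turn, its detection performance.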
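The two outcome measures discussed above, anchor-selection accuracy and the Type I error rate, can be illustrated with a minimal Python sketch. The definitions used here are assumptions for illustration (accuracy as the share of selected anchors that are truly DIF-free, and Type I error as the share of truly DIF-free items flagged as DIF); the function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Set


def anchor_selection_accuracy(selected: Sequence[int], dif_items: Set[int]) -> float:
    """Share of selected anchor items that are truly DIF-free (assumed definition)."""
    if not selected:
        raise ValueError("no anchor items were selected")
    clean = [i for i in selected if i not in dif_items]
    return len(clean) / len(selected)


def type_one_error_rate(flagged: Set[int], dif_items: Set[int], n_items: int) -> float:
    """Share of truly DIF-free items that the test wrongly flags as DIF."""
    dif_free = set(range(n_items)) - dif_items
    if not dif_free:
        raise ValueError("every item has DIF; Type I error is undefined")
    return len(flagged & dif_free) / len(dif_free)


# Hypothetical single replication: 10 items, items 0-3 simulated with DIF (40%).
true_dif = {0, 1, 2, 3}
anchors = [4, 5, 6, 2]   # one DIF item slipped into the anchor set
flagged = {0, 1, 3, 7}   # item 7 is a false alarm; item 2 is missed

accuracy = anchor_selection_accuracy(anchors, true_dif)
alpha_hat = type_one_error_rate(flagged, true_dif, n_items=10)
print(accuracy, alpha_hat)
```

In a simulation study these quantities would be averaged over many replications per condition. The sketch also simplifies by computing the Type I error over all DIF-free items, whereas an actual design might exclude the anchor items from testing.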

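This study used the three most recently proposed selection methods to identify DIF-free items. How to select DIF-free items more conveniently and more reliably remains an open question, and later studies can continue to refine the selection methods. The methods can also be examined under a wider range of research conditions, to better understand how the anchor-then-detect strategy performs in each. Later researchers could also combine the strategy with other DIF detection methods, such as SIBTEST. We further suggest applying the strategy to data from other polytomous scoring models, which would give later researchers more detection strategies to choose from.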

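References

Sun [孫國瑋] (2010). Effects of the anchor-then-detect strategy on differential item functioning detection with the likelihood ratio test [先定錨後檢核策略運用在概似比檢定法之差異試題功能檢核效果]. Unpublished master's thesis, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education, Taichung, Taiwan.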

Ankenmann, R. D., Witt, E. A., & Dunbar, S. B. (1999). An investigation of the power of the likelihood ratio goodness-of-fit statistic in detecting differential item functioning. Journal of Educational Measurement, 36, 277-300.

Bolt, D. M. (2002). A Monte Carlo comparison of parametric and nonparametric polytomous DIF detection methods. Applied Measurement in Education, 15, 113-141.

Candell, G. L., & Drasgow, F. (1988). An iterative procedure for linking metrics and assessing item bias in item response theory. Applied Psychological Measurement, 12, 253-260.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Cohen, A. S., Kim, S.-H., & Baker, F. B. (1993). Detection of differential item functioning in the graded response model. Applied Psychological Measurement, 17, 335-350.

Embretson, S. E., & Reise, S. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum Publishers.

Finch, H. (2005). The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement, 29, 278-295.

French, B. F., & Maller, S. J. (2007). Iterative purification and effect size use with logistic regression for differential item functioning detection. Educational and Psychological Measurement, 67, 373-393.

Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Lawrence Erlbaum.

Kim, S.-H., & Cohen, A. S. (1998). Detection of differential item functioning under the graded response model with the likelihood ratio test. Applied Psychological Measurement, 22, 345-355.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.

Mantel, N. (1963). Chi-square tests with one degree of freedom: Extensions of the Mantel-Haenszel procedure. Journal of the American Statistical Association, 58, 690-700.

Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.

Mellenbergh, G. J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105-118.

Miller, T. R., & Spray, J. A. (1993). Logistic discriminant function analysis for DIF identification of polytomously scored items. Journal of Educational Measurement, 30(2), 107-122.

Park, D. G., & Lautenschlager, G. J. (1990). Improving IRT item bias detection with iterative linking and ability scale purification. Applied Psychological Measurement, 14, 163-173.

Raju, N. S. (1988). The area between two item characteristic curves. Psychometrika, 53, 495-502.

Raju, N. S., van der Linden, W., & Fleer, P. (1995). An IRT-based internal measure of test bias with applications for differential item functioning. Applied Psychological Measurement, 19, 353-368.

Samejima, F. (1969). Estimation of a latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 17, 1-100.

Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159-194.

Shih, C.-L., & Wang, W.-C. (2009). Differential item functioning detection using the multiple indicators, multiple causes method with a pure short anchor. Applied Psychological Measurement, 33, 184-199.

Stark, S., Chernyshenko, O. S., & Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91, 1292-1306.

Thissen, D. (1991). MULTILOG user’s guide (Version 6) [Computer program]. Mooresville, IN: Scientific Software.

Thissen, D. (2001). IRTLRDIF v.2.0b: Software for the computation of the statistics involved in item response theory likelihood-ratio tests for differential item functioning. University of North Carolina at Chapel Hill.

Thissen, D., Steinberg, L., & Gerrard, M. (1986). Beyond group mean differences: The concept of item bias. Psychological Bulletin, 99, 118-128.

Thissen, D., Steinberg, L., & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 147-169). Hillsdale, NJ: Lawrence Erlbaum.

Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67-113). Hillsdale, NJ: Erlbaum.

Wang, W.-C. (2001, September). Effects of anchor item methods on the detection of differential item functioning within the family of Rasch models. Paper presented at the annual meeting of the Chinese Psychological Association, Chia-Yi, Taiwan. Manuscript submitted for publication.

Wang, W.-C. (2004). Effect of anchor item methods on the detection of differential item functioning within the family of Rasch models. Journal of Experimental Education, 72, 221-261.

Wang, W.-C. (2008). Assessment of differential item functioning. Journal of Applied Measurement, 9, 387-408.

Wang, W.-C., Shih, C.-L., & Yang, C.-C. (2009). The MIMIC method with scale purification for detecting differential item functioning. Educational and Psychological Measurement, 69, 713-731.

Wang, W.-C., & Su, Y.-H. (2004). Effects of average signed area between two item characteristic curves and test purification procedures on the DIF detection via the Mantel-Haenszel method. Applied Measurement in Education, 17, 113-144.

Wang, W.-C., & Yeh, Y.-L. (2003). Effects of anchor item methods on differential item functioning detection with the likelihood ratio test. Applied Psychological Measurement, 27, 479-498.

Woods, C. M. (2009). Empirical selection of anchors for tests of differential item functioning. Applied Psychological Measurement, 33, 42-57.

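Source document: A Comparison of Three Anchor-Item Selection Methods in the Anchor-Then-Detect Strategy: Detecting Differential Item Functioning in Polytomous Items with the Likelihood Ratio Test (pp. 49-54)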
