IV. Effect of the Estimation Model
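The simulated data show that the DINA model outperformed the G-DINA model in estimating both the item parameters and the attribute classification rates. The likely reason is that G-DINA estimation also models the interactions among attributes, so it recovers less precisely data that were generated purely under the DINA model. With small samples, however, because G-DINA takes more of these relationships into account, it was still able to complete the estimation in cases where DINA estimation failed to converge.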
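For reference, the relationship between the two models can be written out explicitly. The notation below follows the standard formulation of the DINA and G-DINA models (identity link) and is not taken from this study's own equations; $K_j^*$ denotes the number of attributes required by item $j$.

```latex
% DINA: probability that examinee i answers item j correctly
P(X_{ij}=1 \mid \boldsymbol{\alpha}_i) = (1-s_j)^{\eta_{ij}} \, g_j^{\,1-\eta_{ij}},
\qquad \eta_{ij} = \prod_{k=1}^{K} \alpha_{ik}^{q_{jk}}

% G-DINA (identity link), with reduced attribute vector \boldsymbol{\alpha}_{lj}^{*}
P(\boldsymbol{\alpha}_{lj}^{*}) = \delta_{j0}
  + \sum_{k=1}^{K_j^*} \delta_{jk}\,\alpha_{lk}
  + \sum_{k=1}^{K_j^*-1} \sum_{k'=k+1}^{K_j^*} \delta_{jkk'}\,\alpha_{lk}\alpha_{lk'}
  + \cdots
  + \delta_{j12\cdots K_j^*} \prod_{k=1}^{K_j^*} \alpha_{lk}
```

The DINA model is the special case in which every term except $\delta_{j0}$ and the highest-order interaction is zero, so that $g_j = \delta_{j0}$ and $1 - s_j = \delta_{j0} + \delta_{j12\cdots K_j^*}$. G-DINA therefore estimates up to $2^{K_j^*}$ parameters per item where DINA estimates two, which is consistent with the less precise G-DINA estimates reported above for DINA-generated data.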
Section 2 Suggestions for Future Research
Based on the limitations of this study, the following specific suggestions are offered for reference in subsequent related research.
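1. Different sample sizes
This study used three sample sizes: 100, 500, and 1,000 examinees. The estimates of the item parameters and the attribute classification rates improved markedly as the sample size increased. Future studies could raise the sample size to 1,500 or 2,000 to examine the estimation performance.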
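2. Different attribute distributions
This study examined four attribute distributions: the HO-DINA model, the MVN(0) distribution, the MVN(0.5) distribution, and the uniform distribution. Under the HO-DINA model, the item parameters and the attribute classification rates differed noticeably; under the MVN(0) and MVN(0.5) distributions, the results were more consistent with the item parameter invariance that this study expected. Future studies could examine the estimation performance under a wider range of multivariate normal specifications.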
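As an illustration of how such attribute distributions might be generated in a follow-up simulation, the sketch below draws binary attribute profiles by thresholding a multivariate normal vector with a common correlation, and draws uniform profiles directly. The function names, the number of attributes (5), and the cut-off values are assumptions made for this example; they are not the settings used in this study. Note that with all cut-offs at zero and a correlation of 0, the thresholded MVN draw coincides with the uniform distribution, so the cut-offs matter.

```python
import numpy as np


def mvn_attributes(n, thresholds, rho, rng):
    """Draw n binary attribute profiles from a thresholded MVN(0, R).

    R has ones on the diagonal and rho everywhere else; attribute k is
    mastered when the k-th latent normal exceeds thresholds[k].
    """
    k = len(thresholds)
    corr = np.full((k, k), float(rho))
    np.fill_diagonal(corr, 1.0)
    z = rng.multivariate_normal(np.zeros(k), corr, size=n)
    return (z > np.asarray(thresholds)).astype(int)


def uniform_attributes(n, k, rng):
    """Draw n profiles uniformly from all 2**k attribute patterns."""
    return rng.integers(0, 2, size=(n, k))


def mean_offdiag_corr(alpha):
    """Average pairwise correlation between attribute columns."""
    r = np.corrcoef(alpha.T)
    k = r.shape[0]
    return (r.sum() - k) / (k * (k - 1))


rng = np.random.default_rng(0)
cuts = np.linspace(-0.5, 0.5, 5)  # hypothetical mastery cut-offs
alpha_mvn0 = mvn_attributes(1000, cuts, 0.0, rng)  # MVN(0)
alpha_mvn5 = mvn_attributes(1000, cuts, 0.5, rng)  # MVN(0.5)
alpha_unif = uniform_attributes(1000, 5, rng)
```

Calling `mean_offdiag_corr` on each sample gives a quick check that the MVN(0.5) profiles are more strongly associated than the MVN(0) profiles. Other correlation values can be explored by changing `rho`.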
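3. Different ability distributions
This study divided the examinees under each attribute distribution into low-, medium-, and high-ability groups. In the low-ability group, the item parameters and attribute classification rates were estimated relatively poorly; in the medium-ability group, the estimates improved slightly; in the high-ability group, the estimates were better still but did not reach the results reported by de la Torre & Lee (2010). Future studies could examine the estimation performance under more varied ability distributions, for example positively skewed, negatively skewed, or bimodal distributions.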
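The ability distributions suggested above could be set up along the following lines. This is a minimal sketch, assuming a higher-order structure in which each attribute is mastered with probability logistic(lam0_k + lam1 * theta), in the spirit of de la Torre & Douglas (2004). The distribution shapes (a gamma with shape 2 for skew, an equal-weight normal mixture for bimodality) and the values of `lam0` and `lam1` are hypothetical choices for illustration only.

```python
import numpy as np


def draw_theta(n, shape, rng):
    """Draw n standardized higher-order abilities with the requested shape."""
    if shape == "positive_skew":
        theta = rng.gamma(2.0, 1.0, size=n)        # long right tail
    elif shape == "negative_skew":
        theta = -rng.gamma(2.0, 1.0, size=n)       # mirror image, long left tail
    elif shape == "bimodal":
        centers = rng.choice([-1.5, 1.5], size=n)  # equal-weight two-component mixture
        theta = rng.normal(centers, 0.5)
    else:
        raise ValueError(f"unknown shape: {shape}")
    # Standardize so that the shapes differ only in form, not in location or scale
    return (theta - theta.mean()) / theta.std()


def ho_attributes(theta, lam0, lam1, rng):
    """Draw attributes with P(alpha_k = 1 | theta) = logistic(lam0_k + lam1 * theta)."""
    logits = lam0[None, :] + lam1 * theta[:, None]
    p = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(p.shape) < p).astype(int)


rng = np.random.default_rng(0)
theta_pos = draw_theta(2000, "positive_skew", rng)
theta_neg = draw_theta(2000, "negative_skew", rng)
theta_bi = draw_theta(2000, "bimodal", rng)

lam0 = np.array([-1.0, -0.5, 0.5, 1.0])  # hypothetical attribute intercepts
lam1 = 1.5                               # hypothetical common slope
alpha = ho_attributes(theta_bi, lam0, lam1, rng)
```

Standardizing each sample keeps the mean and variance of ability fixed, so any change in estimation quality can be attributed to the shape of the distribution rather than to its location or spread.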
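4. Estimation of item parameters
The results of this study show that, whether the simulated data were generated under the HO-DINA model or by the multivariate normal method, the estimates of the slip probability (the s parameter) differed noticeably, owing to differences in the examinees' ability distributions; this difference merits closer investigation. In addition, this study fixed the item parameters at g = 0.1 and s = 0.1. Future studies could examine the estimation performance under a broader set of item parameter combinations, such as g = 0.2, s = 0.1 or g = 0.1, s = 0.2.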
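The parameter combinations mentioned above can be tried out with a small DINA data generator. The sketch below is illustrative only: the Q-matrix, the attribute profiles, and the function name `simulate_dina` are hypothetical. The "recovered" values are simple pooled proportions computed with the true mastery indicators known, not the EM or MCMC estimates a real study would use.

```python
import numpy as np


def simulate_dina(alpha, Q, g, s, rng):
    """Simulate 0/1 responses under the DINA model.

    alpha: (N, K) attribute profiles; Q: (J, K) Q-matrix;
    g, s: guessing and slip probabilities (scalars or length-J arrays).
    Returns the (N, J) response matrix and the (N, J) indicator eta.
    """
    # eta[i, j] = 1 iff examinee i has every attribute that item j requires
    eta = (alpha @ Q.T == Q.sum(axis=1)).astype(int)
    p_correct = np.where(eta == 1, 1.0 - np.asarray(s), np.asarray(g))
    responses = (rng.random(p_correct.shape) < p_correct).astype(int)
    return responses, eta


rng = np.random.default_rng(1)
Q = np.array([[1, 0], [0, 1], [1, 1]])      # hypothetical 3-item, 2-attribute Q-matrix
alpha = rng.integers(0, 2, size=(1000, 2))  # hypothetical uniform attribute profiles

recovered = {}
for g, s in [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2)]:
    X, eta = simulate_dina(alpha, Q, g, s, rng)
    g_hat = X[eta == 0].mean()        # correct-response rate when an attribute is missing
    s_hat = 1.0 - X[eta == 1].mean()  # error rate when all required attributes are mastered
    recovered[(g, s)] = (g_hat, s_hat)
```

Passing length-J arrays for `g` and `s` allows item-specific parameters, which would be needed if later designs vary guessing and slip across items.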
References
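甘媛源、余嘉元 (2009). New developments in psychological measurement theory: Latent class models [in Chinese]. 中國考試, 2009(3), 3-8.
余民寧 (2009). Item response theory (IRT) and its applications [in Chinese]. Taipei: 心理出版社.
洪祥堯 (2009). An action research study on integrating information technology into the teaching and assessment of the pie chart unit for sixth-grade elementary students [in Chinese]. Master's thesis, Department of Information Engineering, Asia University, Taichung County.
涂金堂 (2003). An exploration of cognitive diagnostic assessment [in Chinese]. 臺南師範學院學報, 37(2), 67-97.
劉芝毓 (2006). Extension and application of the many-facet item response theory model [in Chinese]. Master's thesis, Graduate Institute of Psychology, National Chung Cheng University, Chiayi County.
Introduction to item response theory (5): The assessment of model-data fit [in Chinese]. http://www.edutest.com.tw/e-irt/irt5.htm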
Bloom, B., Engelhart, M., Furst, E., Hill, W. & Krathwohl, D. (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York, Toronto: Longmans, Green.
Bock, R. D. & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.
Cheng, Y. & Chang, H. (2009). When cognitive diagnosis meets computerized adaptive testing: CD-CAT. Psychometrika, 74(4), 619-632.
Cowles, M. K. (2004). Review of WinBUGS 1.4. The American Statistician, 58, 330-336.
de la Torre, J. (2006). Attribute vector profile comparisons at the state level: An application and extension of cognitive diagnosis modeling in NAEP. Paper presented at the international meeting of the Psychometric Society, Montreal, Canada.
de la Torre, J. (2008a). An empirically-based method of Q-matrix validation for the DINA model: Development and applications. Journal of Educational Measurement, 45, 343-362.
de la Torre, J. (2008b). Multidimensional scoring of abilities: The ordered polytomous response case. Applied Psychological Measurement, 32, 355-370.
de la Torre, J. (2009a). A cognitive diagnosis model for cognitively based multiple-choice options. Applied Psychological Measurement, 33(3), 163-183.
de la Torre, J. (2009b). DINA model and parameter estimation: A Didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130.
de la Torre, J. & Douglas, J. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353.
de la Torre, J. & Douglas, J. (2008). Model evaluation and multiple strategies in cognitive diagnosis: An analysis of fraction subtraction data. Psychometrika, 73(4), 595-624.
de la Torre, J. & Lee, Y.-S. (2008). Cognitive diagnosticity of IRT-constructed assessment: An empirical investigation. Paper presented at the meeting of the National Council on Measurement in Education, New York, NY.
de la Torre, J. & Lee, Y.-S. (2010). A note on the invariance of the DINA model parameters. Journal of Educational Measurement, 47(1), 115-127.
Doignon, J.P. & Falmagne, J.C. (1999). Knowledge spaces. New York: Springer.
Doornik, J. A. (2003). Object-oriented matrix programming using Ox (Version 3.1) [Computer software]. London, England: Timberlake Consultants Press.
Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374.
Finkelman, M., Kim, W. & Roussos, L. (2009). Automated test assembly for cognitive diagnostic models using a genetic algorithm. Journal of Educational Measurement, 46(3), 273-292.
Gierl, M., Cui, Y. & Zhou, J. (2009). Reliability and attribute-based scoring in cognitive diagnostic assessment. Journal of Educational Measurement, 46(3), 293-313.
Hambleton, R. K. & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer.
Hartz, S. (2002). A Bayesian framework for the Unified Model for assessing cognitive abilities: Blending theory with practice. Unpublished doctoral thesis, University of Illinois at Urbana-Champaign.
Hartz, S., Roussos, L. & Stout, W. (2002). Skills diagnosis: Theory and practice [User manual for Arpeggio software]. Princeton, NJ: Educational Testing Service.
Henson, R. A. & Douglas, J. (2005). Test construction for cognitive diagnosis. Applied Psychological Measurement, 29(4), 262-277.
Henson, R., Templin, J. & Willse, J. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191-210.
Huebner, A. (2010). Cognitive diagnostic computer adaptive assessments. Journal of Educational Measurement, 46(3), 293-313.
Huebner, A. (2010). An overview of recent developments in cognitive diagnostic computer adaptive assessments. Practical Assessment, Research & Evaluation, 15(3). Retrieved January 2010, from http://pareonline.net/getvn.asp?v=15&n=3
Junker, B. & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258-272.
Leighton, J. P., Gierl, M. J. & Hunka, S. M. (2004). The attribute hierarchy method for cognitive assessment: A variation on Tatsuoka's rule space approach. Journal of Educational Measurement, 41(3), 205-237.
Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64(2), 187-212.
McGlohen, M. & Chang, H. (2008). Combining computer adaptive testing technology with cognitively diagnostic assessment. Behavior Research Methods, 40(3), 808-821.
Qiu, Z., Song, P. X.-K. & Tan, M. (2002). Bayesian hierarchical models for multi-level repeated ordinal data using WinBUGS. Journal of Biopharmaceutical Statistics, 12, 121-135.
Rupp, A. & Templin, J. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68(1), 78-96.
Sturtz, S., Ligges, U. & Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software, 12, 1-16.
Tatsuoka, K. K. (1985). A probabilistic model for diagnosing misconceptions by the pattern classification approach. Journal of Educational Statistics, 10, 55-73.
Templin, J. L. & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287-305.
Templin, J. L., Henson, R. A., Templin, S. E. & Roussos, L. (2008). Robustness of hierarchical modeling of skill association in cognitive diagnosis models. Applied Psychological Measurement, 32(7), 559-574.
von Davier, M. (2005). A general diagnostic model applied to language testing data. ETS Research Report. Princeton, New Jersey: ETS.
Xu, X., Chang, H. & Douglas, J. (2003). A simulation study to compare CAT strategies for cognitive diagnosis. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
Xu, X. & von Davier, M. (2008). Linking for the general diagnostic model. ETS Research Report. Princeton, New Jersey: ETS.