A Rasch Model Distractor Analysis on Ordered Multiple-Choice Items of a Fractional Test (Rasch模式應用於次序選擇題之誘項分析:以數學分數測驗為例)

Academic year: 2021

Journal of Education & Psychology (教育與心理研究), National Chengchi University, June 2015, Vol. 38, No. 2, pp. 87-119
DOI: 10.3966/102498852015063802004

Lily Chang (張麗麗)* and Su-Jen Lo (羅素貞)**

Abstract

Assessments are not only tools for evaluating students' performance; they should also provide information that is interpretive, diagnostic, highly informative, and potentially prescriptive. To achieve this goal, assessments must link cognition, observation, and interpretation. The purposes of this study were therefore (1) to explore how to conduct a distractor analysis on ordered multiple-choice (OMC) items developed in accordance with a cognitive developmental theory of fractions, under Andrich's "ordered thresholds" perspective within the framework of Rasch modeling; and (2) to compare the psychometric properties of OMC items that were given partial credit with those of OMC items scored dichotomously. Although the results in general support the proposed developmental levels of the distractors, only a few items conformed to the principles of ordered thresholds, appropriate distance between thresholds, and distractor response curves that show a peak along the latent continuum. Only these items can reliably distinguish the developmental levels of fractions and hence are suitable for awarding partial credit. However, when OMC items were scored (or rescored) properly as polytomous items, test precision increased, and so did the chances of identifying students who are progressing between developmental levels. Discussion and suggestions for developing OMC items and for conducting further research are also provided.

Keywords: cognitive development of fractions, ordered multiple-choice item, distractor analysis, Rasch model

* Lily Chang (corresponding author): Associate Professor, Department of Educational Psychology and Counseling, National Pingtung University
** Su-Jen Lo: Associate Professor, Department of Educational Psychology and Counseling, National Pingtung University
E-mail: llychang93@gmail.com
Manuscript received: 2014.08.19; Revised: 2014.11.11; Accepted: 2014.12.05.
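The three criteria named in the abstract (ordered thresholds, adequate spacing between thresholds, and a response curve that peaks somewhere on the latent continuum) can be illustrated with a small numerical sketch. The Python below assumes the standard Rasch partial credit parameterization, in which the probability of scoring in category k is proportional to exp of the sum of (theta - delta_j) over the first k thresholds. The function names, the `min_gap` parameter, the theta grid, and the threshold values are hypothetical illustrations; they are not values, cut-offs, or software taken from the study.

```python
import math


def pcm_probs(theta, thresholds):
    """Category probabilities for one item under the Rasch partial credit model.

    theta      -- person location on the latent continuum (logits)
    thresholds -- list of threshold locations delta_1..delta_m (logits)
    Returns a list of m + 1 probabilities for score categories 0..m.
    """
    # Category k has exponent sum_{j<=k} (theta - delta_j); category 0 has 0.
    exponents = [0.0]
    for delta in thresholds:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the largest exponent before exponentiating, for numerical safety.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds, min_gap=0.0):
    """True if each threshold exceeds the previous one by more than min_gap.

    min_gap is a hypothetical spacing requirement chosen by the analyst.
    """
    return all(b - a > min_gap for a, b in zip(thresholds, thresholds[1:]))


def modal_categories(thresholds, lo=-6.0, hi=6.0, steps=241):
    """Set of categories that are the most probable response somewhere on a grid."""
    modal = set()
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        probs = pcm_probs(theta, thresholds)
        modal.add(probs.index(max(probs)))
    return modal


if __name__ == "__main__":
    # Illustrative items with one partial-credit distractor (scores 0, 1, 2).
    for name, deltas in [("ordered", [-1.0, 1.0]), ("disordered", [1.0, -1.0])]:
        print(name, thresholds_ordered(deltas), sorted(modal_categories(deltas)))
```

With the ordered thresholds in this example, every score category is the most probable response over some interval of theta. With the reversed thresholds, the middle category (the partial-credit distractor) is never the most probable response, which is the kind of pattern that would argue against awarding it partial credit.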

[...] results show that: (1) the more items are scored polytomously, the lower the mean item difficulty, but children's mean ability does not differ by scoring method; (2) the items generally fit the model, although overfit is somewhat more apparent under polytomous scoring and misfit is somewhat more apparent under dichotomous scoring; (3) DIF results under polytomous and dichotomous scoring are similar, with slight differences, and polytomous scoring is especially advantageous for diagnosing distractor fit (that is, whether DIF is caused by a particular distractor); (4) because polytomous scoring adds option thresholds, particularly thresholds between levels, overall test reliability is higher and the test information curve covers a wider range than under dichotomous scoring, and polytomous scoring is especially suited to diagnosing children who are still between developmental levels, who seem able to cross into the next level but have not yet mastered it; (5) although the ability estimates of high- and low-ability children decrease and increase slightly, respectively, under polytomous scoring, the rank order of children's abilities remains quite stable across scoring methods. Overall, scoring OMC distractors polytomously or dichotomously produces differences in several psychometric properties, but the differences between scoring seven items and scoring four items polytomously are small.

When multiple-choice distractors are suitable for partial credit, the conclusions of this study are broadly similar to those of other studies: examinee ability is relatively stable across polytomous and dichotomous scoring (Grunert, Raker, Murphy, & Holme, 2013; Jiao, Liu, Haynie, Woo, & Gorham, 2012); polytomous scoring has the potential to increase measurement precision and test information (Andrich & Styles, 2011; Jiao et al., 2012; Lin et al., 2010; Ma, 2004), with the size of the gain in precision depending on the number of polytomously scored items (note that most studies find that the proportion of multiple-choice items suitable for partial credit is still limited), while the location of maximum information depends on the difficulty of the partial-credit options. Because few studies in the literature have examined the differences between polytomous and dichotomous scoring of developmentally oriented OMC items (especially for the development of fraction concepts in mathematics), the findings of this study await verification by further research.

In sum, the results of this study support using Rasch model distractor analysis to construct developmentally oriented OMC items. Distractor analysis of OMC items not only lets us examine whether distractors are developmentally ordered and whether they fit the model, but also gives us the opportunity to award partial credit to distractors on the basis of the results and to revise options and items. When distractors are suitable for partial credit, including polytomously scored OMC items (even if only some items are suitable for polytomous scoring) not only allows us to shorten the test and reduce testing time while maintaining (or even improving) measurement precision, but also gives us the opportunity to diagnose children whose conceptual development lies between levels. More importantly, when distractors that reflect different developmental levels can be placed on the same continuum as the examinees, we can interpret and diagnose students' errors, their likely developmental stage, and the trajectory of their developmental progress from the perspective of conceptual development, and design suitable instruction or intervention plans accordingly.

References

Wang, S. F. (2004). Children's fraction concept: A case study of a third-grade student (Unpublished master's thesis). National Taichung University of Education, Taichung, Taiwan. [in Chinese]
Li, T. M. (1997). Wei Wei's meanings of fractional number words: A case study on fourth graders (Unpublished master's thesis). National Chiayi University of Education, Chiayi, Taiwan. [in Chinese]
Ho, C. L. (2010). A study on middle-graders' development of concepts in whole number and fraction (Unpublished master's thesis). National Pingtung University, Pingtung, Taiwan. [in Chinese]
Chen, C. T. (2000). A study on children's comprehension of fractional number words. Research and Development in Science Education Quarterly, 18, 56-68. [in Chinese]
Chang, L. L., & Lo, S. J. (2011). Using the multidimensional Rasch model to examine the psychological properties of Mathematics Problem-Solving Attitude Scale (MPSAS). Journal of Education & Psychology, 34(3), 153-185. [in Chinese]
Chung, C. F. (2006). Children's fraction concept: Case of a fifth-grade student (Unpublished master's thesis). National Taichung University of Education, Taichung, Taiwan. [in Chinese]
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561-574.
Andrich, D. (1988). Rasch model for measurement. Newbury Park, CA: Sage.
Andrich, D. (2004). Understanding resistance to the data-model relationship in Rasch's paradigm: A reflection for the next generation. In E. V. Smith & R. M. Smith (Eds.), Introduction to Rasch measurement (pp. 167-200). Maple Grove, MN: JAM Press.
Andrich, D., Sheridan, B., & Luo, G. (2007). RUMM 2020. Perth, Australia: RUMM Laboratory.
Andrich, D., & Styles, I. (2011). Distractors with information in multiple-choice items: A rationale based on the Rasch model. Journal of Applied Measurement, 12(1), 67-95.
Asril, A., & Marais, I. (2011). Applying a Rasch model distractor analysis. In R. F. Cavanagh & R. F. Waugh (Eds.), Application of Rasch measurement in learning environments research (pp. 77-100). Rotterdam, The Netherlands: Sense.
Bock, R. D. (1972). Estimating item parameters and latent proficiency when the responses are scored in two or more nominal categories. Psychometrika, 37(1), 29-51.
Briggs, D. C., & Alonzo, A. C. (2012). The psychometric modeling of ordered multiple-choice item responses for diagnostic assessment with a learning progression. In A. C. Alonzo & A. W. Gotwals (Eds.), Learning progressions in science: Current challenges and future directions (pp. 293-316). Rotterdam, The Netherlands: Sense.
Briggs, D. C., Alonzo, A. C., Schwab, C., & Wilson, M. (2006). Diagnostic assessment with ordered multiple-choice items. Educational Assessment, 11(1), 33-63.
de la Torre, J. (2009). A cognitive diagnosis model for cognitively based multiple-choice options. Applied Psychological Measurement, 33(3), 163-183.
Engelhard, G. (2013). Invariant measurement: Using Rasch models in the social, behavioral, and health sciences. New York, NY: Routledge.
Frary, R. B. (1989). Partial-credit scoring methods for multiple-choice tests. Applied Measurement in Education, 2(1), 79-96.
Green, B. F., Crone, C. R., & Folk, V. G. (1989). A method for studying differential distractor functioning. Journal of Educational Measurement, 26(2), 147-160.
Grunert, J. L., Raker, J. R., Murphy, K. L., & Holme, T. A. (2013). Journal of Chemical Education, 90(10), 1310-1315. doi: 10.1021/ed400247d
Haladyna, T. M. (2004). Developing and validating multiple-choice test items. Mahwah, NJ: Lawrence Erlbaum.
Herrmann-Abell, C. F., & DeBoer, G. E. (2011). Using distractor-driven standards-based multiple-choice assessments and Rasch modeling to investigate hierarchies of chemistry misconceptions and detect structural problems with individual items. Chemistry Education Research and Practice, 12, 184-192.
Hestenes, D., Wells, W., & Swackhamer, G. (1992). Force concept inventory. The Physics Teacher, 30(3), 141-158.
Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Erlbaum.
Huynh, H. (1996). Decomposition of a Rasch partial credit model item into independent binary and indecomposable trinary items. Psychometrika, 61(1), 31-39.
Jiao, H., Liu, J., Haynie, K., Woo, A., & Gorham, J. (2012). Comparison between dichotomous and polytomous scoring of innovative items in a large-scale computerized adaptive test. Educational and Psychological Measurement, 77(3), 493-509.
King, J. V., Gardner, D. A., Zucker, S., & Jorgensen, M. A. (2004). The distractor rationale taxonomy: Enhancing multiple-choice items in reading and mathematics. Pearson Assessment Report.
Lin, J., Chu, K. L., & Meng, Y. (2010). Distractor rationale taxonomy: Diagnostic assessment of reading with ordered multiple-choice items. Retrieved from http://www.pearsonassessments.com/
Linacre, J. M. (2004). Optimizing rating scale category effectiveness. In E. V. Smith & R. M. Smith (Eds.), Introduction to Rasch measurement: Theory, models and applications (pp. 258-278). Maple Grove, MN: JAM Press.
Linacre, J. M. (2005). A user's guide to WINSTEPS: Rasch-model computer programs. Chicago, IL: Winsteps.com.
Ma, X. (2004). An investigation of alternative approaches to scoring multiple response items on a certification exam (Unpublished doctoral dissertation). University of Massachusetts, Amherst, MA.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174.
Ning, T. C. (1992). Children's meaning of fractional number words (Unpublished doctoral dissertation). The University of Georgia, Athens, GA.
Olive, J. (1999). From fractions to rational numbers of arithmetic: A reorganization hypothesis. Mathematical Thinking and Learning, 1(4), 279-314.
Olive, J., & Steffe, L. P. (2002). The construction of an iterative fractional scheme: The case of Joe. Journal of Mathematical Behavior, 20(4), 413-437.
Pellegrino, J. W., Baxter, G. P., & Glaser, R. (1999). Addressing the "two disciplines" problem: Linking theories of cognition and learning with assessment and instructional practices. In A. Iran-Nejad & P. D. Pearson (Eds.), Review of research in education (pp. 307-353). Washington, DC: American Educational Research Association.
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.
Penfield, R. D. (2008). An odds ratio approach for assessing differential distractor functioning effects under the nominal response model. Journal of Educational Measurement, 45(3), 247-269.
Penfield, R. D. (2010). Modeling DIF effects using distractor-level invariance effects: Implications for understanding the causes of DIF. Applied Psychological Measurement, 34(3), 151-165.
Penfield, R. D. (2011). How are the form and magnitudes of DIF effects in multiple-choice items determined by distractor-level invariance effects? Educational and Psychological Measurement, 71(1), 54-67.
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests (Rev. and expanded ed.). Chicago, IL: University of Chicago Press. (Original work published 1960)
Sadler, P. M. (1998). Psychometric models of student conceptions in science: Reconciling qualitative studies and distractor-driven assessment instruments. Journal of Research in Science Teaching, 35(3), 265-296.
Smith, E. V. (2001). Evidence for the reliability of measurement and the validity of measure interpretation: A Rasch measurement perspective. Journal of Applied Measurement, 3(2), 205-231.
Smith, R. M. (1987). Assessing partial knowledge in vocabulary. Journal of Educational Measurement, 24(3), 217-231.
Steffe, L. P. (2002). A new hypothesis concerning children's fractional knowledge. Journal of Mathematical Behavior, 20(3), 267-307.
Steffe, L. P., & Olive, J. (2010). Children's fractional knowledge. New York, NY: Springer.
Suh, Y., & Bolt, D. M. (2011). A nested logit approach for investigating distractors as causes of differential item functioning. Journal of Educational Measurement, 48(2), 188-205.
Thissen, D., Steinberg, L., & Fitzpatrick, A. R. (1989). Multiple-choice models: The distractors are also part of the item. Journal of Educational Measurement, 26(2), 161-176.
Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67-113). Hillsdale, NJ: Lawrence Erlbaum.
Wang, W. C. (1998). Rasch analysis of distractors in multiple-choice items. Journal of Outcome Measurement, 2(1), 43-65.
Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. San Francisco, CA: Jossey-Bass.
Wilson, M. (1992). The ordered partition model: An extension of the partial credit model. Applied Psychological Measurement, 16(4), 309-325.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago, IL: MESA.
Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago, IL: MESA.
