
Standard Setting for Learning Achievement Assessment (學習成就評量標準設定)  謝進昌、謝名娟、林世華、林陳浦、陳清溪、謝佩蓉

American College Testing. (1994). Setting achievement levels on the 1994 National Assessment of Educational Progress in geography and in U.S. history and the 1996 National Assessment of Educational Progress in science (Final version) (Design document). Washington, DC: National Assessment Governing Board.

American College Testing. (2005). Developing achievement levels on the 2005 National Assessment of Educational Progress in grade twelve mathematics: Process report. Washington, DC: National Assessment Governing Board.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (pp. 508-600). Washington, DC: American Council on Education.

Berk, R. A. (1986). A consumer's guide to setting performance standards on criterion-referenced tests. Review of Educational Research, 56(1), 137-172.

Berk, R. A. (1996). Standard setting: The next generation (where few psychometricians have gone before!). Applied Measurement in Education, 9(3), 215-235.

Bourque, M. L. (2009, March). A history of NAEP achievement levels: Issues, implementation, and impact 1989-2009. Paper commissioned for the 20th anniversary of the National Assessment Governing Board 1988-2008. Retrieved August 22, 2010, from http://www.nagb.org/publications/reports-papers.htm

Buckendahl, C. W., Smith, R. W., Impara, J. C., & Plake, B. S. (2002). A comparison of Angoff and bookmark standard setting methods. Journal of Educational Measurement, 39(3), 253-263.

Cizek, G. J., & Bunch, M. B. (2007). Standard setting: A guide to establishing and evaluating performance standards on tests. Thousand Oaks, CA: Sage.

Cizek, G. J., Bunch, M. B., & Koons, H. (2004). Setting performance standards: Contemporary methods. Educational Measurement: Issues and Practice, 23(4), 31-50.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Council of Chief State School Officers. (2001). State student assessment programs annual survey (Data vol. 2). Washington, DC: Author.

Ebel, R. L. (1972). Essentials of educational measurement (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.

Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7(1), 1-26.

Hambleton, R. K. (2001). Setting performance standards on educational assessments and criteria for evaluating the process. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 89-116). Mahwah, NJ: Erlbaum.

Hambleton, R. K., & Pitoniak, M. J. (2006). Setting performance standards. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 433-470). Westport, CT: American Council on Education/Praeger.

Huynh, H. (2006). A clarification on the response probability criterion RP67 for standard settings based on bookmark and item mapping. Educational Measurement: Issues and Practice, 25(2), 19-20.

Jaeger, R. M. (1991). Selection of judges for standard setting. Educational Measurement: Issues and Practice, 10(2), 3-14.

Kane, M. (1994). Validating the performance standards associated with passing scores. Review of Educational Research, 64(3), 425-461.

Kane, M. (1998). Choosing between examinee-centered and test-centered standard-setting methods. Educational Assessment, 5(3), 129-145.

Kane, M. (2001). So much remains the same: Conception and status of validation in setting standards. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 53-88). Mahwah, NJ: Erlbaum.

Karantonis, A., & Sireci, S. G. (2006). The bookmark standard-setting method: A literature review. Educational Measurement: Issues and Practice, 25(1), 4-12.

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174.

Lewis, D. M., Mitzel, H. C., & Green, D. R. (1996, June). Standard setting: A bookmark approach. Paper presented at the Council of Chief State School Officers National Conference on Large Scale Assessment, Boulder, CO.

Lin, J. (2006). The bookmark procedure for setting cut-scores and finalizing performance standards: Strengths and weaknesses. The Alberta Journal of Educational Research, 52(1), 36-52.

Loomis, S. C. (2000, April). Feedback in the NAEP achievement levels setting process. Paper presented at the meeting of the National Council on Measurement in Education, New Orleans.

Loomis, S. C., & Bourque, M. L. (2001). From tradition to innovation: Standard setting on the National Assessment of Educational Progress. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 175-217). Mahwah, NJ: Erlbaum.

Ma, L., & Ma, X. (2005). Estimating correlates of growth between mathematics and science achievement via a multivariate multilevel design with latent v[…] Evaluation, 31(1), 79-98.

Mitzel, H. C., Lewis, D. M., Patz, R. J., & Green, D. R. (2001). The Bookmark method: Psychological perspectives. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 249-281). Mahwah, NJ: Erlbaum.

Mullis, I. V. S., Erberber, E., & Preuschoff, C. (2008). The TIMSS 2007 international benchmarks of student achievement in mathematics and science. In J. F. Olson, M. O. Martin, & I. V. S. Mullis (Eds.), TIMSS 2007 technical report (pp. 339-347). Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.

National Assessment Governing Board. (1990). Setting appropriate achievement levels for the National Assessment of Educational Progress: Policy framework and technical procedures. Washington, DC: Author.

Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14(1), 3-19.

Organization for Economic Cooperation and Development. (2009). PISA 2006 technical report. Paris: Author.

Reckase, M. D. (2000). The evolution of the NAEP achievement level setting process: A summary of the research and development efforts conducted by ACT. Iowa City, IA: American College Testing.

Reckase, M. D. (2001). Innovative methods for helping standard-setting participants to perform their task: The role of feedback regarding consistency, accuracy, and impact. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 159-174). Mahwah, NJ: Erlbaum.

Sireci, S. G., Hauger, J. B., Wells, C. S., Shea, C., & Zenisky, A. L. (2009). Evaluation of the standard setting on the 2005 Grade 12 National Assessment of Educational Progress mathematics test. Applied Measurement in Education, 22(4), 339-358.

Appendix

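Bootstrapping (rendered in Chinese as 試靴法, 拔靴法, or 自助法) is a resampling method developed by Efron (1979). It has two basic types: nonparametric bootstrapping and parametric bootstrapping. In parametric bootstrapping, the researcher specifies a theoretical distribution and bases the estimates on random samples that a computer generates from that distribution. It is often used to evaluate model fit in structural equation modeling.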
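The parametric procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the theoretical distribution is taken to be normal, the data are made up, and the function name `parametric_bootstrap_se` is hypothetical rather than anything used in the study.

```python
import random
import statistics

def parametric_bootstrap_se(data, statistic, n_resamples=1000, seed=0):
    """Standard error of `statistic` under an assumed normal model (hypothetical helper)."""
    rng = random.Random(seed)
    # Fit the assumed theoretical distribution (normal) to the data.
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Generate a synthetic sample from the fitted model, not from the data.
        synthetic = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates.append(statistic(synthetic))
    # The spread of the replicated statistic estimates its standard error.
    return statistics.stdev(estimates)

scores = [52, 61, 58, 70, 66, 49, 73, 64, 57, 68]  # made-up illustrative data
se_mean = parametric_bootstrap_se(scores, statistics.mean)
print(f"parametric bootstrap SE of the mean: {se_mean:.2f}")
```

Any other statistic, such as `statistics.median`, can be passed in place of `statistics.mean`; only the assumed distribution would need to change for a different model.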

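Nonparametric bootstrapping, by contrast, treats the empirical sample as a pseudo-population. A fixed number of observations is drawn from it at random with replacement, so that each draw forms one data set, and the computer repeats this sampling-with-replacement step more than a thousand times to produce more than a thousand random samples. The standard deviation of the resulting distribution of the sample statistic then serves as an estimate of the standard error of the sampling distribution. Overall, the main difference between the two types is that nonparametric bootstrapping simulates the population distribution from the empirical sample, whereas parametric bootstrapping assumes a theoretical distribution in advance and then samples from it with replacement.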
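A minimal Python sketch of the nonparametric procedure follows. The data values are made up and the helper name `bootstrap_replicates` is hypothetical; the sketch uses only the standard library.

```python
import random
import statistics

def bootstrap_replicates(sample, statistic, n_resamples=1000, seed=0):
    """Return `statistic` computed on each with-replacement resample (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the observed sample.
        resample = rng.choices(sample, k=n)
        replicates.append(statistic(resample))
    return replicates

observed = [52, 61, 58, 70, 66, 49, 73, 64, 57, 68]  # made-up illustrative data
reps = bootstrap_replicates(observed, statistics.mean)
# The standard deviation of the replicates estimates the standard error.
se = statistics.stdev(reps)
print(f"nonparametric bootstrap SE of the mean: {se:.2f}")
```

Because every resample is drawn from the observed values, no distributional assumption enters the calculation; the observed sample itself stands in for the population.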

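Many statistical packages and programs can carry out bootstrapping, and researchers can choose a tool that suits their purpose. For example, a researcher who wants to test whether a mediation effect exists can use the macros and calculators written by Preacher and colleagues (see http://people.ku.edu/~preacher/med[…]) to obtain bootstrap standard errors and confidence intervals for the mediation effect. The SPSS Bootstrapping add-on module is another simple, easy-to-learn tool. It estimates standard errors and confidence intervals for the median, quartiles, and other statistics. For example, to estimate the standard error of the sampling distribution of the sample median for the empirical data collected in this study, open SPSS and, from the menus, choose Analyze → Descriptive Statistics → Frequencies, then move the variable to be analyzed into the Variable(s) box on the right side of the main dialog. Click Statistics, check Median, and click Continue. Click Bootstrap and check Perform bootstrapping. The default number of samples is 1,000 (that is, 1,000 resamples) and the default confidence level is 95%; if nothing else is required, click Continue. Finally, click OK. The SPSS output then reports the standard error estimated from the 1,000 resampled data sets.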
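For readers without SPSS, the same kind of estimate can be approximated in Python. The sketch below is not the SPSS algorithm: it assumes a simple percentile interval, uses made-up scores, and fixes an arbitrary random seed so that the run is repeatable.

```python
import random
import statistics

rng = random.Random(2011)  # arbitrary fixed seed so the run is repeatable
scores = [52, 61, 58, 70, 66, 49, 73, 64, 57, 68, 60, 75]  # made-up data

# 1,000 resamples of the same size as the data, each yielding one median.
medians = sorted(
    statistics.median(rng.choices(scores, k=len(scores)))
    for _ in range(1000)
)

# Standard error: standard deviation of the bootstrap medians.
se_median = statistics.stdev(medians)

# 95% percentile interval: cut 2.5% off each tail of the sorted medians.
k = len(medians) * 25 // 1000
lower, upper = medians[k], medians[-k - 1]

print(f"median = {statistics.median(scores)}, SE = {se_median:.2f}, "
      f"95% CI = [{lower}, {upper}]")
```

The 1,000 resamples and the 95% level mirror the defaults described for the SPSS dialog; both are easy to change by editing the loop count and the tail fraction.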


Journal of Research in Education Sciences 2011, 56(1), 1-32

謝進昌、謝名娟、林世華、林陳浦、陳清溪、謝佩蓉

Validation of the Standard Setting Procedure
