Research Limitations and Future Directions

Chapter 5: Conclusions and Recommendations


National Chengchi University (國立政治大學)

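... compared with SCORIGHT, it makes no prior assumptions about the data and is available in many software packages, so it is more convenient to use.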

Section 2: Research Limitations and Future Directions

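The data used in this study were generated by computer simulation, and the simulation did not assign different testlet correlation coefficients to examinees at different ability levels. This may explain why grouped GEE improved estimation efficiency only slightly. Real data are still needed to verify how much grouped GEE improves the estimates. Other grouping schemes could also be tried, for example grouping by examinees' response patterns. In addition, the test designed for this study provides more information for examinees of middle ability, which favors SCORIGHT in the comparison. Future work could adjust the item parameters to change the test information across ability ranges and then compare the two methods further.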
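The simulation design suggested above, in which the within-testlet correlation depends on ability, could be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in this study: it thresholds correlated latent normal variables, and the function names `latent_rho_for` and `simulate_testlet`, the cut-off at |theta| < 1, and the correlation values 0.2 and 0.5 are all hypothetical. Note also that the correlation is set on the latent normal scale, so the correlation among the resulting binary responses will be smaller.

```python
import numpy as np
from statistics import NormalDist


def latent_rho_for(theta):
    """Hypothetical rule: weaker dependence for middle-ability examinees."""
    return 0.2 if abs(theta) < 1.0 else 0.5


def simulate_testlet(theta, b, n_draws, rng):
    """Draw correlated 0/1 responses to one testlet for ability theta."""
    b = np.asarray(b, dtype=float)
    # Marginal success probabilities under a logistic model with a = 1, c = 0.
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    k = len(b)
    rho = latent_rho_for(theta)
    # Exchangeable correlation among the latent normal variables.
    cov = (1.0 - rho) * np.eye(k) + rho * np.ones((k, k))
    z = rng.multivariate_normal(np.zeros(k), cov, size=n_draws)
    # Thresholding at the normal quantile of p keeps each marginal equal to p.
    cut = np.array([NormalDist().inv_cdf(pj) for pj in p])
    return (z < cut).astype(int), p


rng = np.random.default_rng(0)
responses, p = simulate_testlet(
    theta=0.5, b=[-1.0, -0.5, 0.0, 0.5, 1.0], n_draws=20000, rng=rng
)
print(responses.mean(axis=0).round(3), p.round(3))   # empirical vs. target marginals
print(np.corrcoef(responses, rowvar=False).round(2))  # induced binary correlations
```

Repeating the draw for several ability values would give a data set in which grouped GEE has different correlations to recover in each group.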

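In real testing, the item parameters are sometimes not available in advance, so item parameters and ability parameters must be estimated simultaneously. This study focused on estimation results with the item parameters given, so comparing GEE with the Bayesian testlet model when the item parameters are unknown is another direction for future research.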

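Besides serving as an alternative for parameter estimation, GEE's variance assumptions can also inform item selection in testlet-based CAT. Unlike ordinary CAT, testlet-based CAT must select all items in a testlet together, yet only some of the items in that testlet may be highly informative for the examinee (陳柏熹 et al., 2008). In that situation GEE can be used to measure how much the item information is reduced. The following examples use the GEE variance formula to compute testlet information and compare how the structure of the items within a testlet affects it: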


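Figure 5-1. Change in information when item difficulties within a testlet differ (a = 1 and c = 0 fixed)

Figure 5-2. Change in information when item difficulties within a testlet are equal (a = 1 and c = 0 fixed)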

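Figure 5-1 shows that when the item difficulties within a testlet cover a wide range, the information function is flatter, that is, the testlet provides similar information to examinees across ability ranges. Figure 5-2 shows that when the item difficulties are all close together, the testlet is most informative for examinees whose ability is closest to that difficulty. In addition, total information falls as the correlation among items increases, and the drop is larger in the ability ranges where information was originally higher, as Figure 5-2 shows clearly. Given these results, if a CAT uses target information as its stopping rule, GEE can help us evaluate testlet information and avoid overstating it, and thereby underestimating the standard error of the ability estimate, although this requires further study.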
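The comparison described above can be reproduced roughly with a short calculation. The sketch below is not the code used for the figures. It assumes the usual GEE sandwich form, in which information is D'V⁻¹D with V = A^(1/2) R A^(1/2), an exchangeable working correlation R with a single parameter `rho`, and a logistic item response function with a = 1, c = 0 and no 1.7 scaling constant. The function name `gee_testlet_information` and the difficulty values are hypothetical.

```python
import numpy as np


def gee_testlet_information(theta, b, rho, a=1.0):
    """Information D' V^{-1} D for one testlet at ability theta (c = 0)."""
    b = np.asarray(b, dtype=float)
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    d = a * p * (1.0 - p)              # dP/dtheta for each item
    sd = np.sqrt(p * (1.0 - p))        # binomial standard deviations
    k = len(b)
    # Exchangeable working correlation (an assumption of this sketch).
    r = (1.0 - rho) * np.eye(k) + rho * np.ones((k, k))
    v = sd[:, None] * r * sd[None, :]  # V = A^{1/2} R A^{1/2}
    return float(d @ np.linalg.solve(v, d))


spread = [-2.0, -1.0, 0.0, 1.0, 2.0]   # difficulties covering a wide range
equal = [0.0] * 5                      # identical difficulties
for rho in (0.0, 0.3):
    for name, b in (("spread", spread), ("equal", equal)):
        curve = [round(gee_testlet_information(t, b, rho), 3) for t in (-2.0, 0.0, 2.0)]
        print(rho, name, curve)
```

With `rho = 0` the function reduces to the ordinary sum of item informations. For the equal-difficulty testlet, information at theta = 0 is 1.25 when `rho = 0` and 1.25 / (1 + 4·rho) otherwise, so the loss is largest where the testlet was most informative. Under this exchangeable form the decrease is not guaranteed for every item configuration when `rho` is close to 1, so the sketch is best read for moderate correlations.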


Dobson, A. J., & Barnett, A. (2008). An introduction to generalized linear models. CRC Press.

Leisch, F., Weingessel, A., & Hornik, K. (1998). On the generation of correlated artificial binary data.

Liang, K.-Y., & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73(1), 13-22.

Lord, F. M., Novick, M. R., & Birnbaum, A. (1968). Statistical theories of mental test scores.

Park, C. G., Park, T., & Shin, D. W. (1996). A simple method for generating correlated binary variates. The American Statistician, 50(4), 306-310.

Sireci, S. G., Thissen, D., & Wainer, H. (1991). On the reliability of testlet-based tests. Journal of Educational Measurement, 28(3), 237-247.

Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. Cambridge University Press.

Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185-201.

Wainer, H., & Thissen, D. (1996). How is reliability related to the quality of test scores? What is the effect of local dependence on reliability? Educational Measurement: Issues and Practice, 15(1), 22-29.

Wang, X., Bradlow, E. T., & Wainer, H. (2004). User's guide for SCORIGHT (version 3.0): A computer program for scoring tests built of testlets including a module for covariate analysis. ETS Research Report Series, 2004(2).

Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187-213.


Appendix 2: SCORIGHT 3.0 Parameter Settings and Operating Procedure

Step 1: Start SCORIGHT

Step 2: Enter the Number of Examinees and Items: 10000 60

Step 3: Enter the Number of Dichotomous Items Among All the Items: 60

Step 4 (Optional): Enter the Number of 2PL Binary Items: 60

Step 5: Enter the Number of Testlets: 12

Step 6: Enter the Name/Path of the Data File: c:\subdirectory\data

Step 7: Enter the Beginning and Ending Column of the Test Data: 1 60

Step 8: Enter the Beginning and Ending Columns for the Testlet Items:
Enter the starting and ending columns of Testlet #1: 1 5
Enter the starting and ending columns of Testlet #2: 6 10
Enter the starting and ending columns of Testlet #3: 11 15
Enter the starting and ending columns of Testlet #4: 16 20
Enter the starting and ending columns of Testlet #5: 21 25
Enter the starting and ending columns of Testlet #6: 26 30
Enter the starting and ending columns of Testlet #7: 31 35
Enter the starting and ending columns of Testlet #8: 36 40
Enter the starting and ending columns of Testlet #9: 41 45
Enter the starting and ending columns of Testlet #10: 46 50
Enter the starting and ending columns of Testlet #11: 51 55
Enter the starting and ending columns of Testlet #12: 56 60

Step 9: Enter the Beginning and Ending Rows of the Dataset: 1 10000

Step 10 (Optional): Create an Information File About the Items:
D 2
D 2
…

Step 11 (Optional): Enter the Name/Path of the Item Information File: c:\subdirectory\result

Step 12: Enter the Name/Path Where the Output Files Should Be Stored: c:\result\

Step 13: Enter the Number of Iterations for the Gibbs Sampler: 4000

Step 14: Enter the Number of Initial Draws To Be Discarded: 3000

Step 15: Enter the Number of Times the Posterior Draws Will Be Recorded: 10

Step 16: Enter the Number of Markov Chains You Want To Run: 1

Step 17: Enter Initial Values for the Parameters:
For CHAIN 1: Do you want to input the initial values for item parameters a, b, and c? If yes, enter 1, otherwise, enter 0: 1

Step 18: Enter Initial Values for the θs
For CHAIN 1: Do you want to input the initial values for proficiency parameters theta? If yes, enter 1, otherwise, enter 0: 0

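Source document: Application of Generalized Estimating Equations to Testlet-Based Tests (廣義估計方程式在題組式測驗的應用), 政大學術集成 (NCCU Academic Hub), pp. 42-48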
