
3.1 Phase 1-Simulated Experiment

3.1.3 Procedure

The 13,000 simulated examinees each took a CAT, a BR-CAT, and a BRR-CAT under the stopping condition (test length = 30 items) pre-assigned by the simulation system.

The system, designed by the researcher, automatically generated a response pattern for each simulated examinee based on the BRR-CAT, BR-CAT, and CAT algorithms, each comprising the Starting, Continuing, and Stopping steps. Figure 3.2 shows the simulated procedure of traditional CAT, which forbade reviewing and changing answers in the Continuing step when the simulated item bank was adopted. In contrast, both the BRR-CAT and BR-CAT algorithms allowed examinees to review and change one item in a block once they had completed that block (10 items). Figure 3.3 shows the procedure of BRR-CAT and BR-CAT.

Figure 3.2 The simulated procedure of CAT

Figure 3.3 The simulated procedure of BRR-CAT and BR-CAT

The procedure for generating examinees’ response patterns under the simulated CAT, BR-CAT, and BRR-CAT algorithms is described as follows.

1. Starting: In this system, the test started with an item of middle difficulty, that is, an item whose b parameter was between -.5 and .5.

2. Continuing:

(1) Response patterns: When the jth examinee took the ith item in the simulated BRR-CAT, the ability value θj and the parameters of the administered item were entered into Formula 3.3, where pij denotes the expected probability of answering the ith item correctly.

Next, a response coefficient p’ was drawn at random from the uniform distribution on [0, 1]. If p’ was less than pij, the expected probability of answering correctly, the jth examinee answered the item correctly. Conversely, if p’ was greater than pij, the jth examinee answered the item incorrectly (Lazarte, 1999). The response of the jth examinee to the ith item, denoted uij, is defined in Formula 3.4. The vector of these responses denotes the response pattern of the jth examinee.

$$u_{ij} = \begin{cases} 1, & p' < p_{ij} \\ 0, & p' > p_{ij} \end{cases} \qquad (3.4)$$

(2) Ability estimation: In this step, a Bayesian procedure was adopted to estimate ability from the examinee’s response pattern. Formula 3.5 shows the Bayesian procedure for ability estimation (Baker, 1992, p. 210).


(3) Item selection: Before a simulated examinee entered the next block, the maximum information strategy was applied to the remaining items to find the most suitable one. Formula 3.6 shows the maximum information strategy for item selection (Wainer et al., 1990, p. 111).


(4) Review, change and ability re-estimation in BR-CAT and BRR-CAT:

After all test items in the nth block had been answered, simulated examinees could review the block and change their answer, once, on any test item in it (block size = 10). The block size for BRR-CAT was set to 10 test items to restrain the inflation of ability estimates under the Wainer strategy, because the block-review method can mitigate the dangers that arise when Wainer’s strategy is adopted (Stocking, 1997; Vispoel, 2000). After simulated examinees had completed the review and change of the nth block in BR-CAT and BRR-CAT, they moved to the (n+1)th block, and their abilities in the nth block were re-estimated from the new response patterns, using the Bayesian procedure in BR-CAT and the Bayesian and Rearrangement procedure in BRR-CAT.

According to Waddell and Blankenship’s (1995) meta-analysis of 75 studies, reviewed items made up about 5.1% of all answered items. Of the answer changes, 57% were from incorrect to correct, 21% from correct to incorrect, and 22% from one incorrect answer to another incorrect answer. Therefore, to simulate the effects of answer changing on ability re-estimation in reviewable CAT, and to meet the changing conditions based on Waddell and Blankenship’s (1995) results and Papanastasiou’s (2005) simulation study, the simulated examinees were classified into high, middle, and low ability groups. The three ability groups are detailed as follows:

(a) High ability group: This group comprised examinees whose probability of answering an item correctly was greater than .80 (Papanastasiou, 2005). On reviewing an incorrectly answered item, they were likely to change the answer from incorrect to correct. This type of change rests on the assumption that high ability examinees might make a careless mistake on their first attempt.

(b) Middle ability group: This group comprised examinees whose probability of answering an item correctly was between .47 and .53 (Papanastasiou, 2005). On reviewing an incorrectly answered item, they had a 72% probability of changing the answer from incorrect to correct. This type of change rests on the assumption that middle ability examinees were less likely than high ability examinees to have made a merely careless mistake on their first attempt.

(c) Low ability group: This group comprised examinees whose probability of answering an item correctly was less than .33 (Papanastasiou, 2005). On reviewing a correctly answered item, they were likely to change the answer from correct to incorrect. This type of change rests on the assumption that low ability examinees might have guessed correctly on their first attempt.

3. Stopping: In the Stopping step, the simulation system checked whether the stopping criterion had been reached. A fixed test length of thirty different items was used as the stopping condition (Wainer et al., 1990, p. 114).
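As an illustration of steps 2(1) and 2(4), the following Python sketch draws a response according to Formula 3.4 and applies the three answer-changing rules. It is a sketch under stated assumptions, not the simulation system itself. The function names and the `rng` argument are hypothetical, pij is passed in directly rather than computed from Formula 3.3, "likely to change" is treated as a certain change for the high and low ability groups, and a draw exactly equal to pij is scored as incorrect.

```python
import random


def simulate_response(p_ij, rng):
    """Formula 3.4: return 1 (correct) if the uniform draw p' is below p_ij, else 0.

    Assumption: the boundary case p' == p_ij is scored as incorrect.
    """
    p_prime = rng.random()  # p' drawn from the uniform distribution on [0, 1)
    return 1 if p_prime < p_ij else 0


def ability_group(p_ij):
    """Classify an examinee-item pair with the cut-offs quoted in the text."""
    if p_ij > 0.80:
        return "high"
    if 0.47 <= p_ij <= 0.53:
        return "middle"
    if p_ij < 0.33:
        return "low"
    return None  # outside the three groups defined in the text


def review_change(u_ij, p_ij, rng, middle_change_prob=0.72):
    """Return the response after review; unchanged if no rule applies.

    Assumption: high and low ability groups always make their change.
    """
    group = ability_group(p_ij)
    if group == "high" and u_ij == 0:
        return 1  # careless mistake corrected (incorrect -> correct)
    if group == "middle" and u_ij == 0:
        # incorrect -> correct with 72% probability
        return 1 if rng.random() < middle_change_prob else 0
    if group == "low" and u_ij == 1:
        return 0  # lucky guess withdrawn (correct -> incorrect)
    return u_ij


rng = random.Random(2024)  # seeded only to make the sketch reproducible
first_answer = simulate_response(0.9, rng)
after_review = review_change(first_answer, 0.9, rng)
```

Passing the random generator explicitly keeps each simulated examinee reproducible, which is convenient when the same examinee must take all three test types.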

The three steps above were repeated until all of the simulated examinees had completed a BRR-CAT, a BR-CAT, and a CAT. Figure 3.4 shows the pseudo code of the simulated procedure of BRR-CAT. After a simulated examinee changed the kth item, ability was re-estimated from the rearranged order of used items, in which the (k+1)th item might be excluded from the original response pattern. For instance, if the kth item underwent a Type 2 change (incorrect to correct) and the (k+1)th item had been answered correctly, the (k+1)th item might be excluded because its difficulty level was lower than the examinee’s ability. Similarly, if the kth item underwent a Type 3 change (correct to incorrect) and the (k+1)th item had been answered incorrectly, the (k+1)th item might be excluded because its difficulty level was higher than the examinee’s ability. Further analyses would compare the estimators, standard errors, and test lengths of ability estimation among the BRR-CAT, BR-CAT, and CAT groups in order to evaluate the precision and efficiency of the BRR-CAT algorithm in ability estimation.
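The exclusion rule described in this paragraph can be sketched in Python as follows. This is an illustrative reading, not the researcher’s implementation. The function names and the list-of-pairs representation of the response pattern are hypothetical, and the sketch always excludes the (k+1)th item when the stated condition holds, whereas the text says only that it might be excluded.

```python
def should_exclude_next(change_type, next_response):
    """Condition under which the (k+1)th item is a candidate for exclusion.

    Type 2: kth item changed incorrect -> correct and (k+1)th item correct.
    Type 3: kth item changed correct -> incorrect and (k+1)th item incorrect.
    """
    if change_type == 2 and next_response == 1:
        return True
    if change_type == 3 and next_response == 0:
        return True
    return False


def rearrange(pattern, k, change_type):
    """Return a new response pattern after a change to the item at index k.

    pattern is a list of (item_id, response) pairs in administration order,
    with k a 0-based index. Assumption: exclusion is always applied when
    the condition holds.
    """
    if change_type not in (2, 3):
        raise ValueError("only Type 2 and Type 3 changes trigger rearrangement")
    new_response = 1 if change_type == 2 else 0
    updated = list(pattern)  # leave the original pattern untouched
    item_id, _ = updated[k]
    updated[k] = (item_id, new_response)
    has_next = k + 1 < len(updated)
    if has_next and should_exclude_next(change_type, updated[k + 1][1]):
        del updated[k + 1]
    return updated


pattern = [("i1", 1), ("i2", 0), ("i3", 1), ("i4", 0)]
rearranged = rearrange(pattern, 1, 2)  # i2 changed incorrect -> correct
```

In this example the item after i2 was answered correctly, so it is dropped from the pattern used for re-estimation, while the original pattern is kept for reference.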

While (number_of_used_items ≤ TestLength) {
    // block i is done: (number_of_used_items mod m) = 0, m = block size
    if (finish block = true) {
        // Simulated procedure of review item & change answer: BRR-CAT
        Procedure BRR-CAT (Simulated Item bank, Simulated Examinee) {
            // review item & change answer is triggered
            if (review-change = true) {
                // type II of changing: incorrect to correct;
                // type III of changing: correct to incorrect;
                if ((type_of_change = 2) or (type_of_change = 3)) {
                    // execute BRR ability estimation
                    Procedure BRR (selected item k) {
                        // re-estimation (from k to m items in the nth block)
                        // Rearrangement
                        Store item k+1;
                        Rearrange the order of used items {
                            // type II of changing: incorrect to correct;
                            2: { the kth item was changed (incorrect -> correct) and the (k+1)th item was correct;
                                 the ability estimation excluded the (k+1)th item from the response pattern; }
                            // type III of changing: correct to incorrect;
                            3: { the kth item was changed (correct -> incorrect) and the (k+1)th item was incorrect;
                                 the ability estimation excluded the (k+1)th item from the response pattern; }
                        }
                        // Bayesian procedure
                        for (i = k to m) { θ^ = Bayesian(i, θ^(i-1)); }
                    } // end of Procedure BRR
                } // end of if ((type_of_change = 2) or (type_of_change = 3))
            } // end of if (review-change = true)
        } // end of Procedure BRR-CAT
    } // end of if (finish block = true)
    // block i is not done and number_of_used_items < TestLength
    Else Select_item_id = ItemSelection_Max_infor;
} // end of While (number_of_used_items ≤ TestLength)

Function Bayesian (int item_number, double θ(previous)) {
    // set up the number of iterations t
    for (t = 1 to Max_number_of_iteration) {
        // BigTA: default value for accelerating convergence in ability estimation
        // ability update by the Bayesian procedure (Formula 3.5)
        if (Delta < 0.05) then break;
    }
}

Function ItemSelection_Max_infor (Item bank) {
    // item selection by the maximum information strategy (Formula 3.6)
}

Figure 3.4 The pseudo code of the simulated procedure of BRR-CAT
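The outer control flow of the pseudo code, administering items until the fixed test length is reached and triggering a review each time a block is completed, can be sketched in Python as follows. The function name and the event labels are hypothetical, and item selection, response generation, and ability estimation are deliberately left out, so the sketch shows only when a block review would occur.

```python
def fixed_length_schedule(test_length=30, block_size=10):
    """List the order of administration and block-review events.

    A review event is added whenever the number of used items is a
    multiple of the block size, mirroring the test
    (number_of_used_items mod m) = 0 in the pseudo code.
    """
    events = []
    used = 0
    while used < test_length:
        used += 1
        events.append(("administer", used))
        if used % block_size == 0:
            events.append(("review", used // block_size))
    return events


events = fixed_length_schedule()
review_events = [event for event in events if event[0] == "review"]
```

With the settings used in this study (30 items, blocks of 10), the schedule contains three review events, one after the 10th, 20th, and 30th items.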
