CHAPTER 4 RESULTS
4.1.1 Recovery of Higher-Level Parameters
The estimated higher-level parameters of the RHO-RDINA and RHO-RDINO models
include attribute discrimination, γ, and attribute difficulty, β. Recovery of
these parameters was assessed with RMSE and bias. The recovery results for these
parameters under all testing conditions are presented in Tables 4.1 and 4.2. Accuracy of
the examinee attribute mastery parameter, α, was computed as the correct classification rate.
The correct classification rate for α is presented in Tables 4.3 and 4.4.
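As an illustration of the two recovery indices, the sketch below computes bias as the mean signed estimation error and RMSE as the square root of the mean squared error across replications. The function name and the example estimates are hypothetical and are not taken from the simulation itself.

```python
import numpy as np

def bias_and_rmse(estimates, true_value):
    """Return (bias, RMSE) of one parameter across replications."""
    estimates = np.asarray(estimates, dtype=float)
    errors = estimates - true_value          # signed error per replication
    bias = errors.mean()                     # mean signed error
    rmse = np.sqrt((errors ** 2).mean())     # root mean squared error
    return bias, rmse

# Hypothetical estimates of an attribute difficulty whose generating value is -1.5
example_estimates = [-1.6, -1.4, -1.7, -1.5]
bias, rmse = bias_and_rmse(example_estimates, -1.5)
print(f"bias = {bias:.2f}, RMSE = {rmse:.2f}")
```

A negative bias means that the parameter is underestimated on average, whereas RMSE also reflects the variability of the estimates across replications.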
Table 4.1 shows that recovery of the discrimination parameter with the RHO-RDINA
model was generally good when the ability distributions of the focal and reference
groups were equal, and almost as good in the unequal ability condition, under the
various DIF patterns. The RMSEs for attribute difficulty and attribute discrimination
ranged from .04 to .16, indicating generally good recovery. Recovery of β and γ did
not appear to be affected much by any of the scenarios (i.e., combinations of DIF
pattern and ability distribution difference). Recovery was less accurate, however,
for the short test length. This may have occurred because the Q-matrices in the
40-item and 60-item conditions have the same structure as in the short test length
condition, and the attribute difficulty parameters were set to the same values
across test lengths. Because the number of items measuring each attribute increases
with test length, the estimation of attribute difficulty becomes more accurate as
the test gets longer.
A similar pattern was found in the recovery of each parameter with the RHO-RDINO
model, as shown in Table 4.2. The RMSEs for attribute difficulty and attribute
discrimination ranged from .04 to .20, indicating generally good recovery. Recovery
of β and γ did not appear to be affected much by any of the scenarios (i.e.,
combinations of DIF pattern and ability distribution difference).
Table 4.1 Bias and RMSEs of Attribute Difficulty, A, and Discrimination, γ, over 25 Replications with RHO-RDINA Model
                         Equal                               Unequal
                         Balanced    One-sided   No-DIF      Balanced    One-sided   No-DIF
Test Length  par    gen   Bias RMSE   Bias RMSE   Bias RMSE   Bias RMSE   Bias RMSE   Bias RMSE
20 items     A[1]  -1.5  -0.08 0.14  -0.06 0.16  -0.04 0.11  -0.06 0.15  -0.08 0.14  -0.06 0.12
             A[2]  -1.0  -0.06 0.11  -0.07 0.13  -0.04 0.09  -0.04 0.11  -0.04 0.11  -0.05 0.11
             A[3]  -1.0  -0.04 0.10  -0.04 0.11  -0.03 0.09  -0.03 0.08  -0.04 0.12  -0.04 0.10
             A[4]  -0.5  -0.01 0.08  -0.03 0.08   0.00 0.07  -0.02 0.08  -0.01 0.08   0.00 0.07
             A[5]  -0.5  -0.01 0.07  -0.03 0.08  -0.01 0.07  -0.02 0.06  -0.01 0.08   0.00 0.08
             γ      1.5  -0.05 0.12  -0.04 0.11  -0.04 0.11  -0.02 0.13  -0.05 0.11  -0.05 0.10
40 items     A[1]  -1.5  -0.02 0.09  -0.08 0.14  -0.05 0.11  -0.05 0.09  -0.05 0.12  -0.04 0.11
             A[2]  -1.0   0.00 0.07  -0.04 0.09  -0.03 0.10  -0.03 0.07  -0.02 0.08  -0.02 0.09
             A[3]  -1.0  -0.02 0.08  -0.03 0.08  -0.03 0.10  -0.04 0.07  -0.03 0.07  -0.03 0.09
             A[4]  -0.5   0.00 0.06  -0.02 0.06   0.00 0.05  -0.02 0.05  -0.01 0.06   0.00 0.06
             A[5]  -0.5   0.00 0.04  -0.01 0.07  -0.01 0.05  -0.02 0.05  -0.01 0.05   0.00 0.06
             γ      1.5  -0.01 0.09  -0.06 0.09  -0.04 0.11  -0.04 0.10  -0.02 0.10  -0.03 0.10
60 items     A[1]  -1.5  -0.05 0.09  -0.02 0.09  -0.08 0.11  -0.05 0.10  -0.03 0.08  -0.03 0.06
             A[2]  -1.0  -0.02 0.07   0.00 0.08  -0.03 0.07   0.00 0.07  -0.02 0.07  -0.01 0.07
             A[3]  -1.0  -0.02 0.07  -0.01 0.07  -0.03 0.08  -0.01 0.07  -0.02 0.07  -0.01 0.05
             A[4]  -0.5  -0.01 0.06   0.00 0.05  -0.01 0.06   0.01 0.05   0.00 0.07   0.01 0.06
             A[5]  -0.5  -0.02 0.06   0.00 0.06  -0.03 0.06   0.00 0.06  -0.02 0.06  -0.01 0.04
             γ      1.5  -0.02 0.08  -0.01 0.06  -0.05 0.09  -0.04 0.07  -0.01 0.08  -0.01 0.06
Table 4.2 Bias and RMSEs of Attribute Difficulty, A, and Discrimination, γ, over 25 Replications with RHO-RDINO Model
                         Equal                               Unequal
                         Balanced    One-sided   No-DIF      Balanced    One-sided   No-DIF
Test Length  par    gen   Bias RMSE   Bias RMSE   Bias RMSE   Bias RMSE   Bias RMSE   Bias RMSE
20 items     A[1]  -1.5  -0.09 0.14  -0.12 0.19  -0.12 0.19  -0.12 0.17  -0.07 0.12  -0.11 0.16
             A[2]  -1.0  -0.04 0.13  -0.07 0.14  -0.06 0.14  -0.06 0.10  -0.04 0.09  -0.09 0.13
             A[3]  -1.0  -0.08 0.12  -0.06 0.14  -0.06 0.11  -0.06 0.12  -0.05 0.12  -0.05 0.11
             A[4]  -0.5  -0.02 0.15  -0.02 0.14  -0.01 0.14   0.03 0.08  -0.01 0.11   0.02 0.10
             A[5]  -0.5   0.01 0.17   0.03 0.13   0.08 0.17   0.03 0.13  -0.01 0.12   0.01 0.12
             γ      1.5  -0.12 0.17  -0.12 0.16  -0.10 0.18  -0.12 0.19  -0.07 0.18  -0.11 0.20
40 items     A[1]  -1.5  -0.05 0.11  -0.02 0.11  -0.04 0.08  -0.05 0.09  -0.04 0.10  -0.07 0.11
             A[2]  -1.0  -0.01 0.08   0.00 0.07  -0.03 0.08  -0.02 0.07  -0.01 0.08  -0.04 0.07
             A[3]  -1.0  -0.03 0.07  -0.03 0.09  -0.05 0.09  -0.01 0.05  -0.02 0.08  -0.04 0.08
             A[4]  -0.5   0.00 0.08  -0.01 0.08   0.00 0.07   0.00 0.07  -0.01 0.06  -0.02 0.07
             A[5]  -0.5   0.00 0.08   0.00 0.07  -0.01 0.07  -0.01 0.06  -0.01 0.08  -0.01 0.07
             γ      1.5  -0.05 0.10  -0.04 0.10  -0.06 0.11  -0.03 0.10  -0.02 0.08  -0.07 0.11
60 items     A[1]  -1.5  -0.03 0.09  -0.06 0.12  -0.05 0.12  -0.07 0.12  -0.04 0.10  -0.04 0.11
             A[2]  -1.0  -0.02 0.07  -0.02 0.08  -0.03 0.09  -0.04 0.09  -0.02 0.07  -0.04 0.08
             A[3]  -1.0  -0.02 0.07  -0.03 0.10  -0.02 0.07  -0.04 0.07  -0.01 0.07  -0.01 0.07
             A[4]  -0.5   0.01 0.07  -0.02 0.05   0.01 0.06  -0.02 0.06   0.02 0.06  -0.01 0.05
             A[5]  -0.5  -0.02 0.04  -0.02 0.06  -0.01 0.06  -0.04 0.06   0.00 0.05  -0.02 0.05
             γ      1.5  -0.02 0.07  -0.03 0.11  -0.03 0.12  -0.04 0.10  -0.04 0.11   0.00 0.09
Table 4.3 shows the correct classification rates of attribute mastery based on the
RHO-RDINA model. The correct classification rate was computed by comparing the
estimated classification against the deterministic classification obtained using the
true abilities. The table shows that the correct classification rates of attribute
mastery were relatively high, ranging from .89 to .99 (.92 to .99 in the equal
ability condition). The last row of Table 4.3 shows the overall consistency, that
is, the proportion of examinees whose entire attribute vector was correctly
estimated. Both the correct classification rates of attribute mastery and the
overall consistency were lower in the unequal ability condition than in the equal
ability condition.
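The two kinds of accuracy reported here can be sketched as follows: the attribute-wise rate is the proportion of examinees whose mastery status on a given attribute is classified correctly, and the overall consistency is the proportion whose whole attribute vector matches. The function name and the small mastery matrices are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

def classification_rates(true_alpha, est_alpha):
    """Return (attribute-wise correct rates, vector-wise correct rate).

    Both inputs are examinee-by-attribute arrays of 0/1 mastery indicators.
    """
    true_alpha = np.asarray(true_alpha)
    est_alpha = np.asarray(est_alpha)
    matches = true_alpha == est_alpha        # True where a classification is correct
    per_attribute = matches.mean(axis=0)     # correct rate for each attribute
    overall = matches.all(axis=1).mean()     # whole vector correct for an examinee
    return per_attribute, overall

# Hypothetical true and estimated mastery patterns: 4 examinees, 3 attributes
true_alpha = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]]
est_alpha = [[1, 0, 1], [0, 1, 1], [1, 1, 1], [0, 1, 1]]
per_attribute, overall = classification_rates(true_alpha, est_alpha)
print(per_attribute, overall)
```

Because a vector counts as correct only when every attribute in it is classified correctly, the overall consistency can never exceed the lowest attribute-wise rate, which is why the last row of each classification table is lower than the rows above it.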
Table 4.4 shows the correct classification rates of attribute mastery based on the
RHO-RDINO model. The ability distribution difference had an effect on the accuracy
of attribute classification and on the overall consistency with the RHO-RDINO model:
both were higher in the unequal ability condition than in the equal ability
condition. The opposite directions of this effect in the two models may be related
to how the unequal condition was generated. In that condition, abilities were drawn
from N(-1, 1) for the focal group and from N(0, 1) for the reference group. This may
have led to more non-masters than in the equal ability condition and thus to a large
discrepancy between the numbers of masters and non-masters. With the RHO-RDINA
model, the accuracy of attribute classification and the overall consistency
therefore decreased in the unequal ability condition. With the RHO-RDINO model, in
contrast, both increased in the unequal ability condition because of the nature of
the RHO-RDINO model.
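The argument that a focal group drawn from N(-1, 1) contains more non-masters can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it assumes a logistic higher-order structure of the form P(α_k = 1 | θ) = logistic(γθ − β_k), which is not spelled out in this section, and it uses a hypothetical group size of 10,000. The generating values for γ and the attribute difficulties are those listed in Tables 4.1 and 4.2.

```python
import numpy as np

def mastery_probability(theta, gamma, beta):
    """Assumed form: P(alpha_k = 1 | theta) = logistic(gamma * theta - beta_k)."""
    theta = np.asarray(theta, dtype=float)[:, None]   # examinees in rows
    beta = np.asarray(beta, dtype=float)[None, :]     # attributes in columns
    return 1.0 / (1.0 + np.exp(-(gamma * theta - beta)))

rng = np.random.default_rng(0)
gamma = 1.5                                 # generating discrimination
beta = [-1.5, -1.0, -1.0, -0.5, -0.5]       # generating attribute difficulties

# Hypothetical group size; only the two ability distributions come from the text
theta_reference = rng.normal(0.0, 1.0, size=10_000)   # N(0, 1)
theta_focal = rng.normal(-1.0, 1.0, size=10_000)      # N(-1, 1)

# Expected proportion of masters per attribute in each group
p_reference = mastery_probability(theta_reference, gamma, beta).mean(axis=0)
p_focal = mastery_probability(theta_focal, gamma, beta).mean(axis=0)
print(np.round(p_reference, 2))
print(np.round(p_focal, 2))
```

Under this assumed form, the expected proportion of masters is lower in the focal group for every attribute, which is consistent with the explanation that the unequal ability condition contains more non-masters.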
In sum, the recovery of the higher-order parameters was within a reasonable range.
The RMSEs did not exceed .20 and the absolute bias did not exceed .12 for the two
proposed models. The easier the attribute, the larger the bias and RMSE tended to
be. For both proposed models, the recovery of the attribute discrimination and
attribute difficulty parameters was independent of DIF pattern and ability
distribution difference, but test length had a slight impact on the recovery of the
discrimination parameter, the attribute difficulty parameters, and the correct
classification rate of attribute mastery. As test length increased, the correct
classification rates increased. The overall consistency, that is, the proportion of
examinees whose attribute vector was correctly estimated, also increased with test
length, especially from the short to the medium test length condition. This may have
occurred because the longer tests offered more information and thus improved the
estimation of attribute mastery. Moreover, the correct classification rates of
attribute mastery were .89 or higher across all conditions for the RHO-RDINA model
and .80 or higher for the RHO-RDINO model, which indicates that the two proposed
models estimated the examinee attribute profiles accurately under these conditions.
Table 4.3 Percent of RHO-RDINA Correct Classification by Attribute and Vector
Test length          20                              40                              60
                     Equal           Unequal         Equal           Unequal         Equal           Unequal
Classification       BA   ON   NO    BA   ON   NO    BA   ON   NO    BA   ON   NO    BA   ON   NO    BA   ON   NO
Attribute 1          0.93 0.93 0.93  0.89 0.89 0.89  0.97 0.97 0.97  0.95 0.95 0.95  0.98 0.98 0.96  0.97 0.97 0.97
Attribute 2          0.94 0.95 0.95  0.92 0.93 0.92  0.98 0.98 0.98  0.97 0.97 0.97  0.99 0.99 0.99  0.97 0.98 0.98
Attribute 3          0.93 0.93 0.93  0.91 0.91 0.90  0.97 0.98 0.98  0.96 0.96 0.96  0.99 0.99 0.99  0.98 0.98 0.98
Attribute 4          0.92 0.92 0.92  0.90 0.90 0.90  0.97 0.97 0.97  0.96 0.96 0.96  0.99 0.99 0.99  0.98 0.98 0.98
Attribute 5          0.95 0.95 0.95  0.94 0.94 0.94  0.98 0.98 0.98  0.98 0.97 0.98  0.99 0.99 0.99  0.99 0.99 0.99
Overall consistency  0.76 0.77 0.76  0.68 0.68 0.68  0.90 0.90 0.90  0.85 0.85 0.85  0.95 0.95 0.95  0.91 0.91 0.91
Note: BA denotes balanced DIF pattern; ON denotes one-sided DIF pattern; NO denotes no DIF pattern
Table 4.4 Percent of RHO-RDINO Correct Classification by Attribute and Vector
Test length          20                              40                              60
                     Equal           Unequal         Equal           Unequal         Equal           Unequal
Classification       BA   ON   NO    BA   ON   NO    BA   ON   NO    BA   ON   NO    BA   ON   NO    BA   ON   NO
Attribute 1          0.93 0.93 0.93  0.93 0.93 0.93  0.95 0.93 0.96  0.97 0.97 0.97  0.98 0.98 0.98  0.98 0.98 0.98
Attribute 2          0.90 0.90 0.90  0.91 0.91 0.91  0.93 0.92 0.95  0.96 0.96 0.96  0.96 0.96 0.96  0.97 0.97 0.97
Attribute 3          0.88 0.88 0.88  0.89 0.89 0.89  0.92 0.91 0.94  0.95 0.95 0.95  0.97 0.97 0.97  0.97 0.97 0.97
Attribute 4          0.80 0.80 0.80  0.82 0.82 0.82  0.85 0.86 0.88  0.90 0.90 0.90  0.93 0.92 0.92  0.94 0.94 0.94
Attribute 5          0.89 0.89 0.89  0.90 0.90 0.90  0.90 0.91 0.94  0.95 0.95 0.94  0.96 0.96 0.96  0.97 0.97 0.96
Overall consistency  0.55 0.56 0.55  0.59 0.59 0.59  0.68 0.70 0.73  0.77 0.77 0.77  0.82 0.82 0.82  0.85 0.85 0.85
Note: BA denotes balanced DIF pattern; ON denotes one-sided DIF pattern; NO denotes no DIF pattern