Evaluation Indices - Effects of Q-Matrix Misspecification on Parameter Estimation and Classification Rate under the G-DINA Model

For parameter estimates, we examine estimation accuracy in a simulation study using the mean absolute deviation (MAD) and the empirical bias (EB). More specifically, we use $\hat{\delta}_{j(m)}^{(n)}$ to denote the estimate of the parameter $\delta_{j(m)}$ obtained at replication $n$. The notations $P_{j(m)}$ and $\hat{P}_{j(m)}^{(n)}$ are similarly defined. For every single parameter of each item, the MAD index computes the mean absolute deviation of its estimates over the $N$ replications. This $\mathrm{MAD}_{\delta_{j(m)}}$ index has been used to evaluate the estimation performance of the DINA model (Rupp & Templin, 2008; de la Torre, 2010). We adopt the same criterion to evaluate estimation performance: parameters with a MAD greater than 0.1 and an EB outside the range of -0.1 to 0.1 are considered relatively poorly estimated.
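The two indices and the cutoff rule can be illustrated with a minimal Python sketch. It assumes the usual definitions, namely that MAD averages the absolute deviations of the estimates from the true value and EB averages the signed deviations; the function names and the numbers are hypothetical and serve only as an illustration.

```python
def mad_and_eb(estimates, true_value):
    """Return (MAD, EB) of one parameter over the N replications.

    Assumed definitions (not spelled out in the text above):
      MAD = mean of |estimate - true value|
      EB  = mean of (estimate - true value)
    """
    n = len(estimates)
    mad = sum(abs(e - true_value) for e in estimates) / n
    eb = sum(e - true_value for e in estimates) / n
    return mad, eb


def poorly_estimated(mad, eb, cutoff=0.1):
    """Criterion from the text: MAD above 0.1 and EB outside [-0.1, 0.1]."""
    return mad > cutoff and abs(eb) > cutoff


# Hypothetical estimates of one parameter from N = 4 replications.
estimates = [0.30, 0.35, 0.28, 0.33]
mad, eb = mad_and_eb(estimates, true_value=0.10)
flag = poorly_estimated(mad, eb)
```

In this made-up case every estimate overshoots the true value, so MAD and EB coincide and the parameter is flagged.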

For classification accuracy, the posterior probability of belonging to the latent attribute group $\alpha_l$ given the response vector $X_i$ can be obtained from

$$p(\alpha_l \mid X_i) = \frac{p(X_i \mid \alpha_l)\,p(\alpha_l)}{p(X_i)} = \frac{p(X_i \mid \alpha_l)\,p(\alpha_l)}{\sum_{l=1}^{2^K} p(X_i \mid \alpha_l)\,p(\alpha_l)}.$$
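As a small illustration of the posterior formula, the sketch below normalizes likelihood-times-prior products over the latent groups. The likelihood values and the uniform prior are hypothetical numbers for K = 2 attributes (four groups), not quantities taken from the study.

```python
def posterior(likelihoods, priors):
    """Posterior p(alpha_l | X_i) for every latent group l via Bayes' rule.

    likelihoods[l] plays the role of p(X_i | alpha_l),
    priors[l] plays the role of p(alpha_l).
    """
    joint = [lik * pr for lik, pr in zip(likelihoods, priors)]
    marginal = sum(joint)  # p(X_i), the sum over all 2^K groups
    return [j / marginal for j in joint]


# Hypothetical values for K = 2 attributes, i.e. 2^K = 4 latent groups.
likelihoods = [0.02, 0.10, 0.08, 0.40]
priors = [0.25, 0.25, 0.25, 0.25]
post = posterior(likelihoods, priors)
```

With a uniform prior the posterior is simply the normalized likelihood, so the last group receives two thirds of the posterior mass in this example.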

Let $A = \{\alpha_l\}$ be the $2^K \times K$ matrix of all possible combinations of the attributes required in a test, and let $P_{(\alpha \mid X_i)}$ be the $2^K \times 1$ vector consisting of all the posterior probabilities $p(\alpha_l \mid X_i)$ for $l = 1, \ldots, 2^K$. Define

$$T_i = A^t P_{(\alpha \mid X_i)}.$$

Consequently, $T_i = (T_{i1}, \ldots, T_{iK})^t$ is the $K \times 1$ vector indicating the posterior probability of mastering each individual attribute by respondent $i$. To determine the mastery status of respondent $i$ on each attribute, we define the following indicator variables for $k = 1, \ldots, K$:

$$I_{ik} = \begin{cases} 1 & \text{if } T_{ik} \geq \gamma, \\ 0 & \text{if } T_{ik} < \gamma, \end{cases}$$

where $\gamma$ is the pre-specified threshold for mastery status. The default value of $\gamma$ in the Ox code written by de la Torre (2011) is 0.5. That is, once the posterior probability of mastering an attribute reaches 0.5 for respondent $i$, he or she is considered to have mastered this particular attribute. Based on this calculation, respondent $i$ is classified into the attribute pattern $I_i = (I_{i1}, \ldots, I_{iK})^t$.
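The classification rule can be sketched in a few lines of Python. The sketch enumerates all attribute patterns as the rows of A, forms the attribute-wise mastery probabilities as the product of the transpose of A with the posterior vector, and applies the threshold. The helper names and the posterior values are hypothetical, and the row ordering of A is an arbitrary choice made for this illustration.

```python
from itertools import product


def all_patterns(K):
    """Rows of A: every 0/1 attribute pattern of length K (2^K rows)."""
    return [list(p) for p in product([0, 1], repeat=K)]


def attribute_marginals(A, post):
    """T_i = A^t P: posterior probability of mastering each attribute."""
    K = len(A[0])
    return [sum(A[l][k] * post[l] for l in range(len(A))) for k in range(K)]


def classify(T, gamma=0.5):
    """I_ik = 1 if T_ik >= gamma, else 0."""
    return [1 if t >= gamma else 0 for t in T]


A = all_patterns(2)            # [[0, 0], [0, 1], [1, 0], [1, 1]]
post = [0.1, 0.2, 0.3, 0.4]    # hypothetical posterior for one respondent
T = attribute_marginals(A, post)
I_i = classify(T)
```

Because the rule works attribute by attribute, the resulting pattern need not be the single pattern with the largest posterior probability, although the two agree in this example.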

In the simulation study, both the attribute pattern $\alpha_{l_i}$ and the test responses $X_i$ of each respondent $i$ are generated and known, so we can use this information to calculate the empirical classification accuracy. Two kinds of classification accuracy are examined in this study. First, to understand the impact of Q-matrix misspecification on the classification accuracy of each attribute, we consider the attribute-specific classification accuracy. Second, we investigate the overall classification accuracy, which captures the rate at which respondents are classified into their true attribute patterns.

The attribute-specific classification accuracy, denoted as $P_{as_k}$, is defined for each dataset as follows:

$$P_{as_k} = \frac{\sum_{i=1}^{I} I_{(I_{ik} = \alpha_{l_i k})}}{I},$$

where $I$ represents the number of respondents in the dataset and $I_{(I_{ik} = \alpha_{l_i k})}$ is an indicator variable for whether $I_{ik} = \alpha_{l_i k}$, that is, whether the classified mastery status on attribute $k$ is the same as the $k$th element of the true attribute pattern $\alpha_{l_i}$ of respondent $i$. In other words, it indicates whether respondent $i$ is correctly identified as mastering attribute $k$ or not.

Moreover, to summarize the attribute-specific classification accuracy over the $N$ replications, we use the average

$$\frac{1}{N}\sum_{n=1}^{N} P_{as_k}^{(n)},$$

where $P_{as_k}^{(n)}$ is the $k$th attribute classification accuracy for the dataset in replication $n$.
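A minimal sketch of the attribute-specific index follows. It assumes that the classified and true patterns are stored as lists of 0/1 lists, one per respondent, and that the summary over replications is a plain average; the data and the per-replication accuracies are hypothetical.

```python
def attribute_accuracy(classified, true, k):
    """Proportion of respondents whose status on attribute k is correct."""
    hits = sum(1 for c, t in zip(classified, true) if c[k] == t[k])
    return hits / len(true)


def mean_over_replications(values):
    """Average of a per-dataset index over the N replications."""
    return sum(values) / len(values)


# Hypothetical dataset with I = 4 respondents and K = 2 attributes.
classified = [[1, 0], [1, 1], [0, 0], [0, 1]]
true = [[1, 0], [0, 1], [0, 0], [0, 0]]
p_as = [attribute_accuracy(classified, true, k) for k in range(2)]

# Hypothetical accuracies of attribute 1 from N = 3 replications.
p_as_1_mean = mean_over_replications([0.75, 0.80, 0.85])
```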

For the overall classification accuracy, we define the index for each dataset as:

$$P_{ca} = \frac{\sum_{i=1}^{I} I_{(I_i = \alpha_{l_i})}}{I},$$

where $I_{(I_i = \alpha_{l_i})}$ is an indicator variable for whether $I_i = \alpha_{l_i}$, that is, whether the classified attribute pattern is in fact the true attribute pattern $\alpha_{l_i}$ of respondent $i$. Similarly, to summarize the overall classification accuracy over the $N$ replications, we use

$$P_c = \frac{1}{N}\sum_{n=1}^{N} P_{ca}^{(n)},$$

where $P_{ca}^{(n)}$ is the overall classification accuracy for the dataset in replication $n$.
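The overall index differs from the attribute-specific one only in that the whole pattern must match. The sketch below uses hypothetical data for four respondents and two attributes, with patterns stored as lists of 0/1 values; the function name is illustrative.

```python
def pattern_accuracy(classified, true):
    """Proportion of respondents whose whole attribute pattern is correct."""
    hits = sum(1 for c, t in zip(classified, true) if c == t)
    return hits / len(true)


# Hypothetical dataset with I = 4 respondents and K = 2 attributes.
classified = [[1, 0], [1, 1], [0, 0], [0, 1]]
true = [[1, 0], [0, 1], [0, 0], [0, 0]]
p_ca = pattern_accuracy(classified, true)

# Hypothetical per-replication values averaged over N = 3 replications.
replication_values = [0.50, 0.60, 0.70]
p_c = sum(replication_values) / len(replication_values)
```

In this example each attribute taken alone is classified correctly for three of the four respondents, yet only two respondents have their full pattern recovered, which shows why the overall accuracy can never exceed the smallest attribute-specific accuracy.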

3 Simulation

In this section, we give details on the simulation studies conducted to investigate the effects of Q-matrix misspecification on the parameter estimates and classification accuracy of the G-DINA model. We first describe the characteristics of the Q-matrix and the parameter values used for data generation. Two types of condition settings for the Q-matrix misspecification were considered. For each condition, three levels of sample size and four different distributions of the respondents' underlying cognitive patterns were manipulated, giving a total of 3 × 4 = 12 combinations for each condition.

For each combination, 100 replications were run. To fit the G-DINA model to the simulated data and obtain parameter estimates, we used the R code written by Hong (2013), which was translated from the Ox code originally written by de la Torre (2011).

3.1 Data generation

3.1.1 The Q-matrix

It is known that both the number of attributes and the length of the test in the Q-matrix have an impact on item parameter estimates (Wang, 2010). Many studies have suggested that the more attributes the Q-matrix contains, the longer the test needs to be in order to supply enough items to provide reliable information. Choi, Templin, Cohen, and Atwood (2010) considered a Q-matrix with 40 items and four attributes in their study. Kunina-Habenicht, Rupp, and Wilhelm (2012) used 25 and 50 items together with three and five attributes in their simulation design. Both studies used no more than five attributes to investigate the effects of Q-matrix misspecification under the log-linear modeling framework.

In our study, we chose a Q-matrix with 30 items and 5 attributes, the same Q-matrix used in the simulation studies for the G-DINA model by de la Torre (2011). The Q-matrix is reported in Table 1. Under this Q-matrix, the test is composed of three types of items, namely one-attribute, two-attribute, and three-attribute items, with ten items of each type.

Table 1: The Q-matrix for data generation

item  A1  A2  A3  A4  A5      item  A1  A2  A3  A4  A5
   1   1   0   0   0   0        16   0   1   0   1   0
   2   0   1   0   0   0        17   0   1   0   0   1
   3   0   0   1   0   0        18   0   0   1   1   0
   4   0   0   0   1   0        19   0   0   1   0   1
   5   0   0   0   0   1        20   0   0   0   1   1
   6   1   0   0   0   0        21   1   1   1   0   0
   7   0   1   0   0   0        22   1   1   0   1   0
   8   0   0   1   0   0        23   1   1   0   0   1
   9   0   0   0   1   0        24   1   0   1   1   0
  10   0   0   0   0   1        25   1   0   1   0   1
  11   1   1   0   0   0        26   1   0   0   1   1
  12   1   0   1   0   0        27   0   1   1   1   0
  13   1   0   0   1   0        28   0   1   1   0   1
  14   1   0   0   0   1        29   0   1   0   1   1
  15   0   1   1   0   0        30   0   0   1   1   1
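The layout of Table 1 follows a regular pattern: the five single attributes listed twice, then all ten pairs, then all ten triples, each in lexicographic order. The sketch below rebuilds the matrix from that pattern as a way to check a transcription of the table; the function name is hypothetical and the construction is an observation about the table, not a description of how the original authors produced it.

```python
from itertools import combinations


def build_q_matrix(K=5):
    """Rebuild the 30 x 5 Q-matrix of Table 1 from its visible pattern."""
    singles = [(k,) for k in range(K)] * 2        # items 1-10
    pairs = list(combinations(range(K), 2))       # items 11-20
    triples = list(combinations(range(K), 3))     # items 21-30
    rows = []
    for attrs in singles + pairs + triples:
        rows.append([1 if k in attrs else 0 for k in range(K)])
    return rows


Q = build_q_matrix()
n_required = [sum(row) for row in Q]  # number of attributes per item
```

Indexing is zero-based, so `Q[20]` corresponds to item 21 in the table.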

3.1.2 Parameter Values

The parameter values used to generate the datasets are presented in Table 2.

Table 2: Parameter values of the G-DINA model

21    0.105   0.034   0.035  −0.023  −0.102   0.001  −0.013   0.861   (11100)
22    0.098  −0.028   0.012   0.001   0.034   0.057   0.018   0.663   (11010)
23    0.125  −0.026  −0.006  −0.033  −0.004   0.012   0.060   0.814   (11001)
24    0.179  −0.074  −0.089  −0.118   0.096   0.161   0.099   0.640   (10110)
25    0.033   0.099   0.086   0.081  −0.117  −0.125  −0.128   0.989   (10101)
26    0.097  −0.037  −0.039   0.002  −0.091   0.030   0.020   0.774   (10011)
27    0.075   0.023   0.008   0.005  −0.035  −0.003   0.028   0.791   (01110)
28    0.148  −0.040  −0.029  −0.077   0.011   0.093   0.060   0.701   (01101)
29    0.042   0.041   0.069   0.070  −0.061  −0.053  −0.112   0.861   (01011)
30    0.158  −0.078  −0.109  −0.061   0.169   0.091   0.121   0.551   (00111)

$\delta_0$: the intercept parameter;
$\delta^*_k$: the main effect parameter of the $k$th attribute of that item;
$\delta^*_{kk'}$: the interaction parameter of the $k$th and $k'$th attributes of that item.
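To show how such parameter values translate into success probabilities, the sketch below evaluates the identity-link G-DINA item response function for a three-attribute item. Two things are assumptions here, since the column headers of Table 2 are not visible in this excerpt: that the eight values in a row are ordered as intercept, three main effects, three two-way interactions (pairs 12, 13, 23), and one three-way interaction, and that the identity link is the one used. The function name is hypothetical.

```python
from itertools import combinations


def gdina_prob(delta, alpha):
    """Identity-link G-DINA success probability for a 3-attribute item.

    delta: [d0, d1, d2, d3, d12, d13, d23, d123]  (ASSUMED column order)
    alpha: mastery (0/1) of the item's three required attributes
    """
    d0, mains, twos, three = delta[0], delta[1:4], delta[4:7], delta[7]
    p = d0 + sum(d * a for d, a in zip(mains, alpha))
    for d, (k, k2) in zip(twos, combinations(range(3), 2)):
        p += d * alpha[k] * alpha[k2]
    p += three * alpha[0] * alpha[1] * alpha[2]
    return p


# Row for item 21 as printed in Table 2.
item21 = [0.105, 0.034, 0.035, -0.023, -0.102, 0.001, -0.013, 0.861]
p_none = gdina_prob(item21, [0, 0, 0])  # masters none of the attributes
p_all = gdina_prob(item21, [1, 1, 1])   # masters all three attributes
```

Under this reading, a respondent who has mastered none of the required attributes answers item 21 correctly with probability 0.105 (the intercept), and one who has mastered all three does so with probability of about 0.898 (the sum of all eight values).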
