
For parameter estimates, we examine estimation accuracy in a simulation study using the mean absolute deviation (MAD) and the empirical bias (EB). More specifically, we use $\hat{\delta}_{j(m)}^{(n)}$ to denote the estimate of the parameter $\delta_{j(m)}$ obtained at replication $n$. The notations $P_{j(m)}$ and $\hat{P}_{j(m)}^{(n)}$ are similarly defined. For every parameter of each item, the MAD index is the mean absolute deviation of its estimates across the $N$ replications. This $\mathrm{MAD}_{\delta_{j(m)}}$ index has been used to evaluate estimation performance of the DINA model (Rupp & Templin, 2008; de la Torre, 2010). We adopt the same criterion to evaluate the estimation performance: parameters with a MAD greater than 0.1 and an EB outside the range of -0.1 to 0.1 are considered relatively poorly estimated.
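The two indices and the cutoff rule can be sketched as follows. This is only an illustration under assumed definitions: MAD is taken as the average absolute difference between the estimates and the generating value over the N replications, and EB as the average signed difference. The function names `mad_and_eb` and `poorly_estimated` are hypothetical.

```python
import numpy as np

def mad_and_eb(estimates, true_value):
    """Return (MAD, EB) for one parameter across N replications.

    Assumed definitions (not taken verbatim from the text):
      MAD = mean over n of |estimate_n - true_value|
      EB  = mean over n of (estimate_n - true_value)
    """
    deviations = np.asarray(estimates, dtype=float) - true_value
    return float(np.mean(np.abs(deviations))), float(np.mean(deviations))

def poorly_estimated(mad, eb, cutoff=0.1):
    # Criterion as stated in the text: MAD above 0.1 and EB outside [-0.1, 0.1].
    return mad > cutoff and abs(eb) > cutoff
```

A parameter whose estimates scatter symmetrically around the true value can have a sizeable MAD but an EB near zero, which is why the two indices are reported together.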

For classification accuracy, the posterior probability of belonging to the latent attribute group $\alpha_l$ given the response vector $X_i$ can be obtained from

$$p(\alpha_l \mid X_i) = \frac{p(X_i \mid \alpha_l)\,p(\alpha_l)}{p(X_i)} = \frac{p(X_i \mid \alpha_l)\,p(\alpha_l)}{\sum_{l'=1}^{2^K} p(X_i \mid \alpha_{l'})\,p(\alpha_{l'})}.$$

Let $A = \{\alpha_l\}$ be the $2^K \times K$ matrix of all possible combinations of the attributes required in a test, and let $P(\alpha \mid X_i)$ be the $2^K \times 1$ vector consisting of the posterior probabilities $p(\alpha_l \mid X_i)$ for $l = 1, \ldots, 2^K$. Define

$$T_i = A^t P(\alpha \mid X_i).$$

Consequently, $T_i = (T_{i1}, \ldots, T_{iK})^t$ is the $K \times 1$ vector giving the posterior probability that respondent $i$ has mastered each individual attribute. To determine the mastery status of respondent $i$ on each attribute, we define, for $k = 1, \ldots, K$, the indicator variables

$$I_{ik} = \begin{cases} 1 & \text{if } T_{ik} \ge \gamma, \\ 0 & \text{if } T_{ik} < \gamma, \end{cases}$$

where $\gamma$ is the pre-specified threshold for mastery status. The default value of $\gamma$ in the Ox code written by de la Torre (2011) is 0.5. That is, once the posterior probability of mastering an attribute is at least 0.5 for respondent $i$, he or she is considered to have mastered that attribute. Based on this calculation, respondent $i$ is classified into the attribute pattern $I_i = (I_{i1}, \ldots, I_{iK})^t$.
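The classification rule above can be sketched in a few lines. The sketch assumes that the likelihoods p(X_i | α_l) have already been evaluated for every respondent and attribute pattern. The function name `classify_attributes` and the ordering of the rows of A are illustrative choices, not taken from the Ox or R code.

```python
import itertools
import numpy as np

def classify_attributes(likelihood, prior, K, gamma=0.5):
    """Classify respondents into attribute patterns.

    likelihood: (I, 2**K) array, entry [i, l] = p(X_i | alpha_l)
    prior:      (2**K,) array, entry [l] = p(alpha_l)
    """
    # A: the 2^K x K matrix of all attribute patterns (row order is an assumption).
    A = np.array(list(itertools.product([0, 1], repeat=K)))
    joint = np.asarray(likelihood, dtype=float) * np.asarray(prior, dtype=float)
    # Bayes' rule: divide by p(X_i), the sum of the joint over all patterns.
    posterior = joint / joint.sum(axis=1, keepdims=True)
    # Row i of T is T_i^t, the marginal mastery probability of each attribute.
    T = posterior @ A
    # I_ik = 1 if T_ik >= gamma, else 0.
    I_hat = (T >= gamma).astype(int)
    return posterior, T, I_hat
```

Note that the classified pattern is assembled attribute by attribute from the marginal probabilities, so it need not coincide with the single pattern that has the highest posterior probability.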

In the simulation study, both the attribute pattern $\alpha_{l_i}$ and the test responses $X_i$ of each respondent $i$ are generated and therefore known, so we can use this information to calculate the empirical classification accuracy. Two kinds of classification accuracy are examined in this study. First, to understand the impact of Q-matrix misspecification on the classification accuracy of each attribute, we consider the attribute-specific classification accuracy. Second, we investigate the overall classification accuracy, which captures the rate at which respondents are correctly classified into their true attribute patterns.

The attribute-specific classification accuracy, denoted as $P_{as_k}$, is defined for each dataset as follows:

$$P_{as_k} = \frac{\sum_{i=1}^{I} I(I_{ik} = \alpha_{l_i k})}{I},$$

where $I$ represents the number of respondents in the dataset and $I(I_{ik} = \alpha_{l_i k})$ is an indicator variable for whether $I_{ik} = \alpha_{l_i k}$, that is, whether the classified mastery status on attribute $k$ is the same as the $k$th element of the true attribute pattern $\alpha_{l_i}$ of respondent $i$. In other words, it indicates whether respondent $i$ is correctly identified as mastering attribute $k$ or not.

Moreover, to summarize the attribute-specific classification accuracy over the $N$ replications, we use the average

$$\frac{1}{N} \sum_{n=1}^{N} P_{as_k}^{(n)},$$

where $P_{as_k}^{(n)}$ is the classification accuracy of the $k$th attribute for the dataset in replication $n$.

For the overall classification accuracy, we define the index for each dataset as:

$$P_{ca} = \frac{\sum_{i=1}^{I} I(I_i = \alpha_{l_i})}{I},$$

where $I(I_i = \alpha_{l_i})$ is an indicator variable for whether $I_i = \alpha_{l_i}$, that is, whether the classified attribute pattern is in fact the true attribute pattern $\alpha_{l_i}$ of respondent $i$. Similarly, to summarize the overall classification accuracy over the $N$ replications, we use

$$P_c = \frac{1}{N} \sum_{n=1}^{N} P_{ca}^{(n)},$$

where $P_{ca}^{(n)}$ is the overall classification accuracy for the dataset in replication $n$.
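Both accuracy indices reduce to comparing the matrix of classified patterns with the matrix of true patterns. The following sketch illustrates this. The function names are hypothetical, and summarizing over replications by a simple mean is an assumption.

```python
import numpy as np

def attribute_accuracy(classified, true_patterns):
    """Attribute-specific accuracy: for each attribute k, the share of
    respondents whose classified status equals the true status."""
    classified = np.asarray(classified)
    true_patterns = np.asarray(true_patterns)
    return (classified == true_patterns).mean(axis=0)

def overall_accuracy(classified, true_patterns):
    """Overall accuracy: the share of respondents whose entire classified
    pattern equals the true pattern."""
    classified = np.asarray(classified)
    true_patterns = np.asarray(true_patterns)
    return float((classified == true_patterns).all(axis=1).mean())

def average_over_replications(values):
    # Assumed summary: simple mean over the N replications (first axis).
    return np.mean(np.asarray(values, dtype=float), axis=0)
```

Because a pattern counts as correct only when every attribute matches, the overall accuracy can never exceed the smallest attribute-specific accuracy in the same dataset.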

3 Simulation

In this section, we give details on the simulation studies conducted to investigate the effects of Q-matrix misspecification on parameter estimates and classification accuracy of the G-DINA model. We first describe the characteristics of the Q-matrix and the parameter values used for data generation. Two types of condition settings for the Q-matrix misspecification were considered. For each condition, three levels of sample size and four different distributions of the respondents' underlying cognitive patterns were manipulated. Consequently, we have a total of 12 combinations for each condition.

For each combination, 100 replications were run. To fit the G-DINA model to the simulated data and obtain parameter estimates, we used the R code written by Hong (2013), which was translated from the Ox code originally written by de la Torre (2011).

3.1 Data generation

3.1.1 The Q-matrix

It is known that both the number of attributes and the length of the test in the Q-matrix have an impact on item parameter estimates (Wang, 2010). Many studies have suggested that the greater the number of attributes in the Q-matrix, the longer the test needs to be in order to include enough items to provide reliable information. Choi, Templin, Cohen, and Atwood (2010) considered a Q-matrix with 40 items and four attributes in their study. Kunina-Habenicht, Rupp, and Wilhelm (2012) used 25 and 50 items together with three and five attributes in their simulation design. Both studies used no more than five attributes to investigate the effects of Q-matrix misspecification under the log-linear modeling framework.

In our study, we chose a Q-matrix with 30 items and 5 attributes, which is the same as the Q-matrix used in the simulation studies for the G-DINA model by de la Torre (2011). The Q-matrix is reported in Table 1. This Q-matrix specifies a test composed of three types of items, namely one-attribute, two-attribute, and three-attribute items, with ten items of each type.

Table 1: The Q-matrix for data generation

Item  A1  A2  A3  A4  A5    Item  A1  A2  A3  A4  A5
   1   1   0   0   0   0      16   0   1   0   1   0
   2   0   1   0   0   0      17   0   1   0   0   1
   3   0   0   1   0   0      18   0   0   1   1   0
   4   0   0   0   1   0      19   0   0   1   0   1
   5   0   0   0   0   1      20   0   0   0   1   1
   6   1   0   0   0   0      21   1   1   1   0   0
   7   0   1   0   0   0      22   1   1   0   1   0
   8   0   0   1   0   0      23   1   1   0   0   1
   9   0   0   0   1   0      24   1   0   1   1   0
  10   0   0   0   0   1      25   1   0   1   0   1
  11   1   1   0   0   0      26   1   0   0   1   1
  12   1   0   1   0   0      27   0   1   1   1   0
  13   1   0   0   1   0      28   0   1   1   0   1
  14   1   0   0   0   1      29   0   1   0   1   1
  15   0   1   1   0   0      30   0   0   1   1   1
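The Q-matrix in Table 1 follows a regular pattern: each single attribute appears in two items, followed by every pair and every triple of the five attributes in lexicographic order. As an illustrative sketch (the function name `build_q_matrix` is hypothetical and not part of the original code), it can be generated programmatically:

```python
from itertools import combinations
import numpy as np

def build_q_matrix(n_attributes=5):
    """Build the 30 x 5 Q-matrix laid out as in Table 1."""
    singles = [(k,) for k in range(n_attributes)]
    # Items 1-10: the five single-attribute items, listed twice.
    subsets = singles + singles
    # Items 11-20: all pairs of attributes; items 21-30: all triples.
    subsets += list(combinations(range(n_attributes), 2))
    subsets += list(combinations(range(n_attributes), 3))
    Q = np.zeros((len(subsets), n_attributes), dtype=int)
    for j, required in enumerate(subsets):
        Q[j, list(required)] = 1
    return Q
```

With five attributes there are exactly ten pairs and ten triples, which is what yields ten items of each type.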

3.1.2 Parameter Values

The parameter values used to generate the datasets are presented in Table 2.

Table 2: Parameter values of the G-DINA model

Item     δ0      δ1      δ2      δ3     δ12     δ13     δ23    δ123  q-vector
  21  0.105   0.034   0.035  −0.023  −0.102   0.001  −0.013   0.861  (11100)
  22  0.098  −0.028   0.012   0.001   0.034   0.057   0.018   0.663  (11010)
  23  0.125  −0.026  −0.006  −0.033  −0.004   0.012   0.060   0.814  (11001)
  24  0.179  −0.074  −0.089  −0.118   0.096   0.161   0.099   0.640  (10110)
  25  0.033   0.099   0.086   0.081  −0.117  −0.125  −0.128   0.989  (10101)
  26  0.097  −0.037  −0.039   0.002  −0.091   0.030   0.020   0.774  (10011)
  27  0.075   0.023   0.008   0.005  −0.035  −0.003   0.028   0.791  (01110)
  28  0.148  −0.040  −0.029  −0.077   0.011   0.093   0.060   0.701  (01101)
  29  0.042   0.041   0.069   0.070  −0.061  −0.053  −0.112   0.861  (01011)
  30  0.158  −0.078  −0.109  −0.061   0.169   0.091   0.121   0.551  (00111)

δ0: The intercept parameter;

δk: The main effect parameter of the kth attribute of that item;

δkk′: The interaction parameter of the kth and k′th attributes of that item.
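To show how these parameters translate into success probabilities, the sketch below evaluates item 21 under the identity-link form of the G-DINA model, in which the probability of a correct response is the sum of the intercept and every main or interaction effect whose attributes are all mastered. Two points are assumptions: the eight values in each row are read as the intercept, three main effects, three two-way interactions (in the order 12, 13, 23), and one three-way interaction, and the function name `gdina_identity_probs` is hypothetical.

```python
from itertools import product

def gdina_identity_probs(delta):
    """Success probability for every reduced attribute pattern of one item.

    delta maps a tuple of attribute positions to its effect;
    the empty tuple () is the intercept.
    """
    n = max((max(key) for key in delta if key), default=-1) + 1
    probs = {}
    for alpha in product([0, 1], repeat=n):
        # An effect contributes only when all of its attributes are mastered;
        # the intercept (empty tuple) always contributes.
        probs[alpha] = sum(
            value for key, value in delta.items()
            if all(alpha[k] for k in key)
        )
    return probs

# Item 21 from Table 2 (assumed column order; see the note above).
item21 = {
    (): 0.105,
    (0,): 0.034, (1,): 0.035, (2,): -0.023,
    (0, 1): -0.102, (0, 2): 0.001, (1, 2): -0.013,
    (0, 1, 2): 0.861,
}
probs = gdina_identity_probs(item21)
```

Under this reading, a respondent who has mastered none of the three required attributes answers item 21 correctly with probability 0.105, while one who has mastered all three does so with probability 0.898, the sum of all eight parameters.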
