4.2 Effect on classification accuracy
4.2.2 Effect on the overall classification accuracy
To examine the effect of Q-matrix misspecification on the overall classification accuracy, we present two interaction plots that show how the other factors interact with this effect.
Figure 5-(a) presents the interaction between the type of misspecification and sample size on the mean overall classification accuracy averaged over 100 replications. The overall classification accuracy increases with sample size under both underspecification and overspecification, and there is almost no difference between the overall classification accuracy based on the true Q-matrix and that based on the overspecified Q-matrix. Again, the trend is similar across the four distributions of cognitive patterns, so the three-way interaction plot is omitted.
Figure 5-(b) presents the interaction between the type of misspecification and the distribution of attribute patterns on the mean overall classification accuracy averaged over 100 replications. The results show that the multivariate normal distribution with ρ = 0.8 leads to the best classification accuracy, followed by the discrete uniform distribution and the multivariate normal distribution with ρ = 0.3. In addition, classification accuracy declines for all distributions when the Q-matrix is underspecified, and the higher-order distribution shows a larger decline in the overall classification accuracy than the other three distributions. However, there is almost no decline in the overall classification accuracy under Q-matrix overspecification for any distribution.
The overall classification accuracy indices for Q-matrix underspecification and overspecification for each distribution of attribute patterns are presented in Tables 13 and 14, respectively. Because sample size has only a slight impact on the overall classification accuracy, we only report the results for a sample size of 1000.
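As a reading aid, the sketch below shows one common way to compute an overall (pattern-level) classification accuracy next to an attribute-specific one. It assumes that the overall index is the proportion of respondents whose entire estimated attribute pattern matches the true pattern; this definition, the function names, and the toy data are illustrative assumptions and are not taken from the study.

```python
def overall_accuracy(true_patterns, estimated_patterns):
    """Proportion of respondents whose whole attribute pattern is recovered.

    Assumed definition: a respondent counts as correctly classified only
    if every attribute in the estimated pattern matches the true pattern.
    """
    matches = [tuple(t) == tuple(e)
               for t, e in zip(true_patterns, estimated_patterns)]
    return sum(matches) / len(matches)


def attribute_accuracy(true_patterns, estimated_patterns):
    """Per-attribute proportion of respondents classified correctly."""
    n = len(true_patterns)
    n_attributes = len(true_patterns[0])
    return [
        sum(t[k] == e[k] for t, e in zip(true_patterns, estimated_patterns)) / n
        for k in range(n_attributes)
    ]


# Hypothetical toy data: 4 respondents, 5 attributes.
true = [(1, 0, 0, 1, 1), (1, 0, 0, 1, 0), (0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
est = [(1, 0, 0, 1, 1), (1, 0, 0, 1, 1), (0, 0, 0, 0, 0), (1, 1, 0, 1, 1)]

print(overall_accuracy(true, est))    # 0.5
print(attribute_accuracy(true, est))  # [1.0, 1.0, 0.75, 1.0, 0.75]
```

Under this assumed definition the overall index can never exceed the lowest attribute-specific index, because a single misclassified attribute makes the whole pattern count as wrong.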
Figure 5: Interaction plots of the type of Q-matrix misspecification with (a) sample size and (b) the underlying distribution of attribute patterns on the arcsin-transformed overall classification accuracy
Table 13: The overall classification accuracy for Q-matrix underspecification with different distributions of attribute patterns and each sample size
sample size    uniform          Mv0.3            Mv0.8            higher-order
               Pc0     Pc1      Pc0     Pc1      Pc0     Pc1      Pc0     Pc1
200            0.893   0.800    0.894   0.827    0.914   0.819    0.902   0.787
500            0.907   0.863    0.906   0.862    0.918   0.889    0.910   0.836
1000           0.915   0.896    0.911   0.889    0.930   0.905    0.916   0.874
Pc0: The classification accuracy for the true Q-matrix
Pc1: The classification accuracy for the underspecified Q-matrix
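The decline caused by underspecification can be read off Table 13 as Pc0 − Pc1. The short sketch below tabulates that difference from the reported values; the dictionary layout and the function name `accuracy_drop` are illustrative choices, and only the numbers come from Table 13.

```python
# Values copied from Table 13: (Pc0, Pc1) per distribution and sample size.
table13 = {
    "uniform":      {200: (0.893, 0.800), 500: (0.907, 0.863), 1000: (0.915, 0.896)},
    "Mv0.3":        {200: (0.894, 0.827), 500: (0.906, 0.862), 1000: (0.911, 0.889)},
    "Mv0.8":        {200: (0.914, 0.819), 500: (0.918, 0.889), 1000: (0.930, 0.905)},
    "higher-order": {200: (0.902, 0.787), 500: (0.910, 0.836), 1000: (0.916, 0.874)},
}


def accuracy_drop(table):
    """Return Pc0 - Pc1 (rounded to 3 decimals) for every cell of the table."""
    return {
        dist: {n: round(pc0 - pc1, 3) for n, (pc0, pc1) in rows.items()}
        for dist, rows in table.items()
    }


drops = accuracy_drop(table13)
for dist, by_n in drops.items():
    print(f"{dist:>12}: {by_n}")
```

In these values the higher-order distribution shows the largest drop at every sample size, and for each distribution the drop shrinks as the sample size grows from 200 to 1000.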
Table 14: The overall classification accuracy for Q-matrix overspecification with different distributions of attribute patterns and each sample size
sample size    uniform          Mv0.3            Mv0.8            higher-order
               Pc0     Pc2      Pc0     Pc2      Pc0     Pc2      Pc0     Pc2
200            0.893   0.890    0.894   0.891    0.847   0.819    0.902   0.895
500            0.907   0.908    0.906   0.902    0.888   0.889    0.910   0.906
1000           0.915   0.916    0.911   0.909    0.899   0.905    0.916   0.914
Pc0: The classification accuracy for the true Q-matrix
Pc2: The classification accuracy for the overspecified Q-matrix
5 Summary and Conclusion
Accurate item parameter estimates and accurate classification of respondents are important for cognitive diagnosis models because both are necessary for valid inferences. This study contributes to a better understanding of the effects of Q-matrix misspecification on these two practical issues for the G-DINA model. For parameter estimates, underspecification of a row vector of the Q-matrix caused an overestimation of the last parameter of the misspecified item as well as of the corresponding probabilities of answering that item correctly. These affected parameters were all related to the excluded attribute. Additionally, higher-order interaction parameters appeared more difficult to recover under underspecification than under overspecification, whereas the recovery of main effects was poorer than that of the two-way interaction parameters. Also, the fewer attributes an item requires, the better its parameter estimates will be. These results are consistent with previous studies (de la Torre, 2008; Rupp & Templin, 2008; Choi, Templin, Cohen, & Atwood, 2010). For classification accuracy, the attribute-specific classification accuracy for the misspecified attributes decreased under underspecification of the Q-matrix. However, overspecification had no significant impact on attribute-specific classification accuracy.
Interestingly, the response probabilities for respondents who have mastered all the measured attributes were underestimated when the Q-matrix was underspecified. For instance, item 26 was misspecified from q26 = (10011) to (10010), and the results showed that δ26,12 was overestimated. The corresponding probability estimate for P26(10010) was also affected by the underspecification, whereas the probability P26(10011) was underestimated. This phenomenon may result from the change in the number of latent cognitive groups of the G-DINA model under Q-matrix misspecification. Taking item 26 as an example, the number of latent cognitive groups is reduced from eight to four by the underspecification. Hence, the probabilities of answering item 26 correctly for the two groups (10010) and (10011), namely P26(10010) and P26(10011) as defined under the correctly specified Q-matrix, are constrained to be equal, because the two groups are treated as falling into the same latent cognitive group (1001) when attribute 5 is omitted from the Q-matrix. However, overspecification does not show any apparent impact on the estimates of either the item parameters or the probabilities of answering correctly.
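The collapse of latent cognitive groups described for item 26 can be illustrated with a small sketch. It assumes the usual G-DINA convention that an item requiring K* attributes separates respondents into 2^K* latent groups according to their reduced attribute pattern; the helper names `latent_group` and `count_latent_groups` are hypothetical and introduced only for this illustration.

```python
from itertools import product


def latent_group(pattern, q_vector):
    """Reduce a full attribute pattern to the attributes the item requires."""
    return tuple(a for a, q in zip(pattern, q_vector) if q == 1)


def count_latent_groups(q_vector):
    """Count distinct reduced patterns over all possible attribute patterns."""
    all_patterns = product((0, 1), repeat=len(q_vector))
    return len({latent_group(p, q_vector) for p in all_patterns})


q_true = (1, 0, 0, 1, 1)   # q26 as specified in the true Q-matrix
q_under = (1, 0, 0, 1, 0)  # q26 with attribute 5 omitted

print(count_latent_groups(q_true))   # 8
print(count_latent_groups(q_under))  # 4

a = (1, 0, 0, 1, 0)
b = (1, 0, 0, 1, 1)
# Under the true q-vector the two patterns belong to different groups.
print(latent_group(a, q_true), latent_group(b, q_true))    # (1, 1, 0) (1, 1, 1)
# Under the underspecified q-vector they fall into the same group.
print(latent_group(a, q_under), latent_group(b, q_under))  # (1, 1) (1, 1)
```

The sketch writes the merged group in reduced form as (1, 1), which corresponds to the group the text labels (1001) on the first four attributes. Because both patterns share one group under the underspecified q-vector, a single success probability has to serve both of them.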
Furthermore, some factors may interact with the impact of Q-matrix misspecification on the parameter estimates and classification accuracy in the G-DINA model. For the distribution of cognitive attribute patterns, the discrete uniform distribution performed best in parameter recovery, and the multivariate normal distribution with a high correlation coefficient gave the highest attribute-specific and overall classification accuracy. These results indicate that the distribution of cognitive patterns in the population interacts with the impact of Q-matrix misspecification on parameter estimates and classification accuracy. For sample size, both parameter estimates and classification accuracy improved with increasing sample size when the Q-matrix was underspecified, but there was no difference when the Q-matrix was overspecified. This indicates that a large sample size helps reduce the impact of Q-matrix underspecification on both parameter estimates and classification accuracy.
In summary, underspecification of the Q-matrix has a large impact on parameter estimates, whereas the impact of overspecification is minor. In addition, the estimation of the parameters of items whose row vectors were not misspecified was unaffected by either type of misspecification. Factors such as sample size and the distribution of cognitive attribute patterns interacted with the impact of Q-matrix misspecification on parameter estimates and classification accuracy in the G-DINA model.
6 References
Chang, H.-H., Cui, Y., & Gierl, M. J. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49, 19-38.
Choi, H.-J., Templin, J., Cohen, A., & Atwood, C. (2010). The impact of model misspecification on estimation accuracy in diagnostic classification models. Paper presented at the annual meeting of the National Council on Measurement in Education in Denver, Colorado.
Corter, J. E., & Im, S. (2011). Statistical consequences of attribute misspecification in the rule space method. Educational and Psychological Measurement, 71(4), 712-731.
de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199.
de la Torre, J., Deng, W., & Hong, Y. (2010). Factors affecting the item parameter estimation and classification accuracy of the DINA model. Journal of Educational Measurement, 47(2), 227-249.
de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130.
de la Torre, J. (2008). An empirically based method of Q-matrix validation for the DINA model: Development and applications. Journal of Educational Measurement, 45(4), 343-362.
de la Torre, J. & Douglas, J. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353.
DeCarlo, L. T. (2012). Recognizing uncertainty in the Q-matrix via a Bayesian extension of the DINA model. Applied Psychological Measurement, 36(6), 447-468.
DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35(1), 8-26.
DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, 26 (pp. 979-1030). Amsterdam: Elsevier.
Hong, C. Y. (2013). Estimation of Generalized DINA Model with Order Restrictions. Master's thesis. Taiwan, Taipei: National Taiwan Normal University.
Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191-210.
Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25, 258-272.
Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2012). The impact of model misspecification on parameter estimation and item-fit assessment in log-linear diagnostic classification models. Journal of Educational Measurement, 49, 59-81.
Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64, 187-212.
Rupp, A. A., & Templin, J. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68, 76-96.
Tatsuoka, K. K. (1983). Rule-space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20, 345-354.
von Davier, M. (2010). Hierarchical mixtures of diagnostic models. Psychological Test and Assessment Modeling, 52(1), 8-28.
von Davier, M. (2005). A general diagnostic model applied to language testing data (ETS Research Report RR-05-16).
Wang, W. C. (2010). Compare the Parameters Estimated by DINA Model with by G-DINA Model. Master's thesis. Taiwan, Taichung: National Taichung University of Education.