
Response variables in the human sciences are often categorical rather than continuous. Traditionally, such categorical data have been treated as continuous because linear models are simple to implement and interpret. In recent years, nonlinear models have been developed to account for the categorical nature of item responses; these can be classified into two major formulations. In the IRT literature, the conditional-probability formulation is often used, whereas in the statistical literature, the latent-response formulation is common. The use of link functions (Equation 9) can integrate these two formulations.
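As a sketch of why the two formulations coincide, consider a single dichotomous item under a logit link. The notation below (θ_j for the latent trait of person j, b_k for the location of item k, ε_jk for the residual) is assumed for illustration and may differ from the notation of Equation 9.

```latex
% Latent-response formulation: a continuous latent response is dichotomized at zero.
y^{*}_{jk} = \theta_j - b_k + \varepsilon_{jk}, \qquad
y_{jk} =
\begin{cases}
  1 & \text{if } y^{*}_{jk} > 0, \\
  0 & \text{otherwise.}
\end{cases}

% Assume \varepsilon_{jk} follows a standard logistic distribution with CDF F,
% which is symmetric about zero, so that 1 - F(-x) = F(x):
P(y_{jk} = 1 \mid \theta_j)
  = P(\varepsilon_{jk} > b_k - \theta_j)
  = 1 - F(b_k - \theta_j)
  = F(\theta_j - b_k)
  = \frac{\exp(\theta_j - b_k)}{1 + \exp(\theta_j - b_k)}.

% Conditional-probability formulation with a logit link states the same model directly:
\operatorname{logit} P(y_{jk} = 1 \mid \theta_j) = \theta_j - b_k .
```

Replacing the standard logistic residual with a standard normal one gives the probit link by the same argument.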

Parameters in NSEM can be estimated with WinBUGS and Mplus. In a series of simulations, it can be concluded that: (a) the parameters of NSEM can be recovered very well by WinBUGS and Mplus; (b) WinBUGS is slightly more efficient than Mplus when data information is weak (e.g., 10 dichotomous items per latent trait) because WinBUGS considers prior information but Mplus does not, and prior information is more important when data information is weak; and (c) Mplus needs less computer time than WinBUGS.

Table 9. Parameter estimates and standard errors (in parentheses) of the structural parameters with different estimation approaches in Example 2

| Parameter | MLE | PV | Mplus | WinBUGS |
| --- | --- | --- | --- | --- |
| γ₁ | 0.23 (0.03) | 0.32 (0.06) | 0.29 (0.07) | 0.30 (0.06) |
| γ₂ | 0.39 (0.04) | 0.48 (0.08) | 0.50 (0.09) | 0.51 (0.10) |
| Correlation of θ₁ and θ₂ | 0.44 (0.04) | 0.54 (0.14) | 0.53 (0.16) | 0.55 (0.14) |
| Var1 | 0.28 (0.02) | 7.43 (0.75) | 8.14 (0.77) | 7.38 (0.86) |
| Var2 | 0.19 (0.02) | 3.31 (0.28) | 3.69 (0.31) | 3.35 (0.32) |
| Var3 | 0.11 (0.01) | 1.48 (0.31) | 1.59 (0.31) | 1.31 (0.29) |

Note. γ₁ is the regression coefficient of θ₃ on θ₁; γ₂ is the regression coefficient of θ₃ on θ₂; the correlation is that between θ₁ and θ₂; Var1 is the variance of θ₁; Var2 is the variance of θ₂; Var3 is the residual variance of θ₃.

When original item responses are not accessible, one may adopt the PV approach to estimate the structural parameters in SEM. The simulation results reveal that (a) the PV approach can recover the structural parameters as satisfactorily as Mplus and WinBUGS do when original item responses are accessible; (b) if measurement error in person measures is ignored by adopting the EAP or MLE approach, the estimates of the structural parameters are seriously biased (the regression parameters and the correlation parameters are underestimated); and (c) the shorter the test (the larger the measurement error), the worse the estimation.
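The underestimation described above can be illustrated with a small simulation. The sketch below is not the simulation design of this study: it makes the simplifying assumption that each point estimate equals the true trait plus independent normal error, and all function names and values (a true correlation of 0.5, error standard deviations of 0.3 and 1.0) are hypothetical. Under that assumption, the correlation between two error-laden estimates is the true correlation multiplied by the reliability, 1 / (1 + error variance).

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_observed_correlation(rho, error_sd, n_persons=20000, seed=1):
    """Correlation between two error-laden point estimates.

    True traits are standard bivariate normal with correlation rho; each
    point estimate is the true trait plus independent N(0, error_sd^2) noise
    (an assumed, simplified stand-in for MLE-type person measures).
    """
    rng = random.Random(seed)
    est1, est2 = [], []
    for _ in range(n_persons):
        t1 = rng.gauss(0.0, 1.0)
        t2 = rho * t1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        est1.append(t1 + rng.gauss(0.0, error_sd))
        est2.append(t2 + rng.gauss(0.0, error_sd))
    return pearson(est1, est2)


def attenuated_correlation(rho, error_sd):
    """Expected correlation when both estimates share the same error variance."""
    return rho / (1.0 + error_sd ** 2)


if __name__ == "__main__":
    # A larger error_sd stands in for a shorter test.
    for label, error_sd in [("long test", 0.3), ("short test", 1.0)]:
        observed = simulate_observed_correlation(0.5, error_sd)
        expected = attenuated_correlation(0.5, error_sd)
        print(label, round(observed, 3), round(expected, 3))
```

Increasing `error_sd` pulls the observed correlation further below the true value, which mirrors the pattern reported for shorter tests.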

The two empirical examples demonstrate how measurement models and NSEM can be fit to self-report item responses. The proposed models fit well in terms of the posterior predictive p-values of the Bayesian chi-squares. The PV approach yields estimates for the structural parameters that are very similar to those from Mplus and WinBUGS, whereas the MLE approach yields estimates that are much smaller than those from the PV approach, Mplus, and WinBUGS. Moreover, the standard errors produced by the MLE approach are much smaller than those produced by the other three approaches, mainly because measurement error is ignored in the MLE approach.

The main idea of the PV approach is to consider measurement error properly. It is possible to apply this idea to the MLE or EAP approach as well. In practice, many IRT computer programs provide point estimates (e.g., MLE or EAP) together with their standard errors. In the EAP and MLE approaches used in this study, we purposely ignored the standard errors in order to mimic the common practice in which the point estimates are treated as true values. To consider measurement error when point estimates and their standard errors are provided, one may draw a random sample of 5 (or more) values from a normal distribution with the mean equal to the point estimate (e.g., MLE or EAP) and the standard deviation equal to the standard error, and then adopt Equations 18 and 19 to yield a final result, as in the PV approach. This revised approach may not perform as well as the PV approach because the MLE is only asymptotically normal and the posterior distribution may not be normally distributed. Even so, the consideration of measurement error will be helpful and necessary. More studies are needed to assess the performance of the revised MLE or EAP approach.
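The mechanics of this revised approach can be sketched as follows. Equations 18 and 19 are not reproduced in this section, so the sketch assumes they are the usual combining rules for repeated analyses (pooled estimate equal to the mean of the M estimates; total variance equal to the mean within-analysis variance plus (1 + 1/M) times the between-analysis variance). The function names, the simple regression used as the analysis step, and the data are all hypothetical. The sketch shows the procedure only; how well it recovers structural parameters is the open question raised above.

```python
import math
import random


def draw_pseudo_values(estimates, std_errors, n_draws=5, seed=0):
    """Draw n_draws data sets; person j's value is N(estimates[j], std_errors[j]^2)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(est, se) for est, se in zip(estimates, std_errors)]
        for _ in range(n_draws)
    ]


def slope_and_variance(x, y):
    """OLS slope of y on x and its estimated sampling variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, rss / (n - 2) / sxx


def combine(estimates, variances):
    """Pool M analyses (assumed form of the combining rules).

    Returns the pooled estimate and its standard error, where
    total variance = mean within variance + (1 + 1/M) * between variance.
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((e - pooled) ** 2 for e in estimates) / (m - 1)
    return pooled, math.sqrt(within + (1.0 + 1.0 / m) * between)


if __name__ == "__main__":
    # Hypothetical point estimates and standard errors for six persons.
    x_hat = [-1.2, -0.5, 0.0, 0.4, 0.9, 1.5]
    y_hat = [-0.8, -0.6, 0.1, 0.2, 0.7, 1.1]
    se = [0.40, 0.30, 0.30, 0.30, 0.35, 0.45]

    xs = draw_pseudo_values(x_hat, se, n_draws=5, seed=1)
    ys = draw_pseudo_values(y_hat, se, n_draws=5, seed=2)

    # Analyze each drawn data set separately, then pool the five results.
    results = [slope_and_variance(x, y) for x, y in zip(xs, ys)]
    pooled_slope, pooled_se = combine(
        [r[0] for r in results], [r[1] for r in results]
    )
    print(pooled_slope, pooled_se)
```

The between-analysis term is what carries the measurement error into the final standard error; treating a single set of point estimates as true values omits it entirely.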

WinBUGS is more flexible than Mplus in allowing users to specify customized models and various item response functions. Many practitioners may find it very difficult to convert Mplus results into IRT parameters. The popular 3-parameter logistic model as well as many other complicated item response functions such as the linear logistic test model (Fischer, 1983), linear partial credit model (Fischer & Ponocny, 1994), faceted model (Linacre, 1989), ordered partition model (Wilson, 1992), unfolding IRT model (Andrich, 1996; Luo, 1998; Roberts, Donoghue, & Laughlin, 2000), testlet response models (Wainer, Bradlow, & Wang, 2007), and multidimensional models (Reckase, 2009) can be fit by WinBUGS, but not by Mplus. Moreover, WinBUGS considers information from all individual persons, whereas Mplus considers only summarized information in the variance-covariance matrix. Thus, Mplus is not statistically optimal. However, the flexibility of WinBUGS comes at a price. Users have to learn its syntax and be familiar with Bayesian statistics.

The structural part of NSEM in this study consists of only three latent traits. In practice, the number of latent traits is often much larger and their relationships are often more complicated. The nonlinear relationship in the NSEM used in this study occurs only between item responses and latent traits. NSEM can be extended to accommodate nonlinear relationships among latent traits, multilevel data structures, and mixture distributions (Lee, 2007; Lee & Zhu, 2002; Mooijaart & Bentler, 2010). Future studies can be conducted to evaluate how WinBUGS performs under complicated NSEM and to develop computer programs that are more efficient and user-friendly than WinBUGS.

References

Adams, R. J., Wilson, M., & Wu, M. (1997). Multilevel item response models: An approach to errors in variables regression. Journal of Educational and Behavioral Statistics, 22, 47-76.

Andersen, E. B. (2004). Latent regression analysis based on the rating scale model. Psychology Science, 46, 209-226.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.

Andrich, D. (1996). A hyperbolic cosine latent trait model for unfolding polytomous responses: Reconciling Thurstone and Likert methodologies. British Journal of Mathematical and Statistical Psychology, 49, 347-365.

Beck, A. T., Steer, R. A., Ball, R., & Ranieri, W. F. (1996). Comparison of Beck Depression Inventories-IA and -II in psychiatric outpatients. Journal of Personality Assessment, 67, 588-597.

Béguin, A. A., & Glas, C. A. W. (2001). MCMC estimation and model-fit analysis of multidimensional IRT models. Psychometrika, 66, 541-562.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397-479). Reading, MA: Addison-Wesley.

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Bolt, D. M., Cohen, A. S., & Wollack, J. A. (2001). A mixture item response model for multiple-choice data. Journal of Educational and Behavioral Statistics, 26, 381-409.

Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153-168.

Christensen, K. B., Bjorner, J. B., Kreiner, S., & Petersen, J. H. (2004). Latent regression in loglinear Rasch models. Communications in Statistics: Theory and Methods, 33, 1341-1356.

Cowles, M. K. (2004). Review of WinBUGS 1.4. The American Statistician, 58, 330-336.

Embretson, S. E. (1996). Item response theory models and spurious interaction effects in factorial ANOVA designs. Applied Psychological Measurement, 20, 201-212.

Fischer, G. H. (1983). Logistic latent trait models with linear constraints. Psychometrika, 48, 3-26.

Fischer, G. H., & Ponocny, I. (1994). An extension of the partial credit model with an application to the measurement of change. Psychometrika, 59, 177-192.

Fryback, D. G., Stout, N. K., & Rosenberg, M. A. (2001). An elementary introduction to Bayesian computing using WinBUGS. International Journal of Technology Assessment in Health Care, 17, 98-113.

Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (1996). Bayesian data analysis. London, UK: Chapman & Hall.

Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.

Glass, G. V., Peckham, P. D., & Sanders, J. R. (1972). Consequences of failure to meet assumptions underlying the analyses of variance and covariance. Review of Educational Research, 42, 237-288.

Hardin, J., & Hilbe, J. (2007). Generalized linear models and extensions (2nd ed.). College Station, TX: Stata Press.

Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.

Jöreskog, K. G., & Sörbom, D. (2006). LISREL 8.80 for Windows [Computer software]. Lincolnwood, IL: Scientific Software International.

Kang, T., & Cohen, A. S. (2007). IRT model selection methods for dichotomous items. Applied Psychological Measurement, 31, 331-358.

Lee, S.-Y. (2007). Structural equation modeling: A Bayesian approach. West Sussex, UK: Wiley.

Lee, S.-Y., & Tang, N.-S. (2006a). Analysis of nonlinear structural equation models with nonignorable missing covariates and ordered categorical data. Statistica Sinica, 16, 1117-1141.

Lee, S.-Y., & Tang, N.-S. (2006b). Bayesian analysis of nonlinear structural equation models with nonignorable missing data. Psychometrika, 71, 541-564.

Lee, S.-Y., & Zhu, H.-T. (2002). Maximum likelihood estimation of nonlinear structural equation models. Psychometrika, 67, 189-210.

Lee, S.-Y., Song, X.-Y., & Tang, N.-S. (2007). Bayesian methods for analyzing structural equation models with covariates, interaction, and quadratic latent variables. Structural Equation Modeling, 14, 404-434.

Lee, S.-Y., Song, X.-Y., Cai, J.-H., So, W.-Y., Ma, C.-W., & Chan, C.-N. (2009). Non-linear structural equation models with correlated continuous and discrete data. British Journal of Mathematical and Statistical Psychology, 62, 327-347.

Li, Y., Bolt, D. M., & Fu, J. (2006). A comparison of alternative models for testlets. Applied Psychological Measurement, 30, 3-21.

Linacre, J. M. (1989). Many-faceted Rasch measurement. Chicago, IL: MESA.

Liu, K.-S., Cheng, Y.-Y., & Wang, W.-C. (2007). Rasch analysis of the Beck Depression Inventory-II with Taiwan university students. Paper presented at 2007 Pacific Rim Objective Measurement Symposium. National College of Physical Education & Sports, Taoyuan, Taiwan.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.

Lubke, G. H., & Muthén, B. (2004). Applying multigroup confirmatory factor models for continuous outcomes to Likert scale data complicates meaningful group comparisons. Structural Equation Modeling, 11, 514-534.

Luo, G. (1998). A general formulation for unidimensional unfolding and pairwise preference models: Making explicit the latitude of acceptance. Journal of Mathematical Psychology, 42, 400-417.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49, 359-381.

Mislevy, R. J., Beaton, A., Kaplan, B. A., & Sheehan, K. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 133-161.

Mooijaart, A., & Bentler, P. M. (2010). An alternative approach for nonlinear latent variable models. Structural Equation Modeling, 17, 357-373.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.

Muthén, B. (1979). A structural probit model with latent variables. Journal of the American Statistical Association, 74, 807-811.

Muthén, B. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22, 48-65.

Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115-132.

Muthén, B. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54, 557-585.

Muthén, B. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 205-243). Newbury Park, CA: Sage.

Muthén, B. (1996). Growth modeling with binary responses. In A. V. Eye & C. Clogg (Eds.), Categorical variables in developmental research: Methods of analysis (pp. 37-54). San Diego, CA: Academic Press.

Muthén, B., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.

Muthén, B., & Kaplan, D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical and Statistical Psychology, 45, 19-30.

Muthén, B., & Speckart, G. (1983). Categorizing skewed, limited dependent variables: Using multivariate probit regression to evaluate the California Civil Addict Program. Evaluation Review, 7, 257-269.

Muthén, L., & Muthén, B. (2007). Mplus user’s guide (4th ed.). Los Angeles, CA: Muthén and Muthén.

Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44, 443-460.

Qiu, Z., Song, P. X.-K., & Tan, M. (2002). Bayesian hierarchical models for multi-level repeated ordinal data using WinBUGS. Journal of Biopharmaceutical Statistics, 12, 121-135.

Raftery, A. E., & Lewis, S. M. (1996). Implementing MCMC. In W. R. Gilks, S. Richardson, & D. J. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 115-130). London, UK: Chapman & Hall.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment test. Copenhagen, Denmark: Institute of Educational Research.

Reckase, M. D. (2009). Multidimensional item response theory. New York, NY: Springer.

Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000). A general item response theory model for unfolding unidimensional polytomous responses. Applied Psychological Measurement, 24, 3-32.

Samejima, F. (1969). Estimation of Latent Ability Using a Response Pattern of Graded Scores (Psychometric Monograph No. 17). Richmond, VA: Psychometric Society. Retrieved from http://www.psychometrika.org/journal/online/MN17.pdf

Satorra, A. (1992). Asymptotic robust inferences in the analysis of mean and covariance structures. In P. V. Marsden (Ed.), Sociological Methodology 1992 (pp. 249-278). Oxford, UK: Blackwell.

Sheu, C.-F., Chen, C.-T., Su, Y.-H., & Wang, W.-C. (2005). Using SAS PROC NLMIXED to fit item response theory models. Behavior Research Methods, 37, 202-218.

Shiau, W.-L. (2007). Multivariate analysis and best introduction of SEM. Taipei: GOTOP Information Inc.

Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. Boca Raton, FL: Chapman & Hall/CRC.

Song, X.-Y., & Lee, S.-Y. (2005). Maximum likelihood analysis of nonlinear structural equation models with dichotomous variables. Multivariate Behavioral Research, 40, 151-177.

Spiegelhalter, D., Thomas, A., & Best, N. (2003). WinBUGS version 1.4 [Computer program]. Cambridge, UK: MRC Biostatistics Unit, Institute of Public Health.

Stark, S., Chernyshenko, O. S., & Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91, 1292-1306.

Storch, E. A., Roberti, J. W., & Roth, D. A. (2001). Factor structure, concurrent validity, and internal consistency of the Beck Depression Inventory-Second Edition in a sample of college students. Depression and Anxiety, 19, 187-189.

Sturtz, S., Ligges, U., & Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software, 12, 1-16.

Tierney, L. (1994). Exploring posterior distributions with Markov Chains. Annals of Statistics, 22, 1701-1762.

Tuerlinckx, F., & Wang, W.-C. (2004). Models for polytomous data. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach (pp. 75-109). New York, NY: Springer-Verlag.

Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. Cambridge, UK: Cambridge University Press.

Wang, W.-C. (2004). Direct estimation of correlation as a measure of association strength using multidimensional item response models. Educational and Psychological Measurement, 64, 937-955.

Wang, W.-C., & Liu, C.-Y. (2007). Formulation and application of the generalized multilevel facets model. Educational and Psychological Measurement, 67, 583-605.

Wang, W.-C., Chen, P.-H., & Cheng, Y.-Y. (2004). Improving measurement precision of test batteries using multidimensional item response models. Psychological Methods, 9, 116-136.

Wilson, M. (1992). The ordered partition model: An extension of the partial credit model. Applied Psychological Measurement, 16, 309-325.

Zwinderman, A. H. (1991). A generalized Rasch model for manifest predictors. Psychometrika, 56, 589-600.

Appendix: WinBUGS commands for Example 2

#NSEM_ERP(GRM)
...
exp(a[k]*(theta3[j]-step[k,2]))/(1+exp(a[k]*(theta3[j]-step[k,2])))-
...
theta3[j] <- lam1*theta[j,1]+lam2*theta[j,2]+e[j]
e[j] ~ dnorm(0,tau3)
}
# Priors
tau3 ~ dgamma(0.001,0.001)
theta3var <- 1/tau3
for( k in 1:T ){
step[k,1] ~ dnorm(0, 0.25)I(, step[k,2])
step[k,2] ~ dnorm(0, 0.25)I(step[k,1], step[k,3])
step[k,3] ~ dnorm(0, 0.25)I(step[k,2], step[k,4])
step[k,4] ~ dnorm(0, 0.25)I(step[k,3],)
...
ph[1:2,1:2] ~ dwish(R[1:2,1:2],4)
phi[1:2,1:2] <- inverse(ph[1:2,1:2])
...
1,2,-2,-1,1,2,-2,-1,1,2), .Dim = c(9, 4)), tau3=1,lam1=0.8,lam2=0.7,a = c(NA,1,1,NA,1,1,NA,1,1),
ph=structure( .Data=c(1,-0.2,-0.2,1), .Dim=c(2,2)))
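
The surviving logistic expression and the four ordered step parameters per item are consistent with a graded response model, in which each category probability is the difference between two adjacent cumulative logistic probabilities. The Python sketch below illustrates that computation under this assumption; it is not a translation of the missing WinBUGS lines, and the function names and input values are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function, equal to exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, steps):
    """Category probabilities under a graded response model.

    theta: latent trait value; a: item discrimination;
    steps: strictly increasing step parameters.
    Returns len(steps) + 1 probabilities, one per ordered category.
    """
    if any(s2 <= s1 for s1, s2 in zip(steps, steps[1:])):
        raise ValueError("steps must be strictly increasing")
    # Cumulative probabilities of responding in or above each category,
    # padded with 1 (lowest category or above) and 0 (above the highest).
    cumulative = [1.0] + [logistic(a * (theta - s)) for s in steps] + [0.0]
    return [cumulative[c] - cumulative[c + 1] for c in range(len(steps) + 1)]


if __name__ == "__main__":
    # Illustrative inputs only: four steps give five response categories.
    probs = grm_category_probs(theta=0.0, a=1.0, steps=[-2.0, -1.0, 1.0, 2.0])
    print([round(p, 3) for p in probs])
```

The ordering requirement enforced here plays the same role as the `I(lower, upper)` bounds on the `step[k, ]` priors above: without ordered steps, a difference of cumulative probabilities could be negative.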
