
For each chain, we run 200000 iterations, discard the first 100000 as burn-in to ensure convergence, and set the thinning to 10 to avoid high correlations between successive draws. Under both the equality and inequality settings, we compute the posterior means and standard deviations of the latent factor means and calculate the Bayes factors from fi and ci for all possible inequality constrained hypotheses on the means of the different groups. The estimation and hypothesis testing results are reported in Tables 2 to 7.
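The burn-in and thinning step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `burn_and_thin` is hypothetical, and a list of iteration indices stands in for the sampled values of a factor mean.

```python
def burn_and_thin(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th remaining draw."""
    if burn_in < 0 or thin < 1:
        raise ValueError("burn_in must be >= 0 and thin must be >= 1")
    return chain[burn_in::thin]


# Placeholder chain: iteration indices stand in for the sampled values.
chain = list(range(200_000))
kept = burn_and_thin(chain, burn_in=100_000, thin=10)
print(len(kept))  # number of retained draws per chain
```

With 200000 iterations, a burn-in of 100000, and a thinning of 10, 10000 draws per chain remain for computing posterior summaries.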

Table 4: Posterior means and standard deviations (SD) of the factor means (ng = 1000).

                                 chain 1          chain 2          chain 3
Setting      Parameter   True    mean     SD      mean     SD      mean     SD      R̂
equality     µ(1)         0      -0.062   0.053   -0.038   0.052   -0.051   0.049   1.06
             µ(2)         0      -0.113   0.049   -0.077   0.066   -0.093   0.059   1.01
             µ(3)         0      -0.012   0.059   -0.033   0.059   -0.025   0.056   1.03
inequality   µ(1)         0       0.017   0.054    0.053   0.074   -0.006   0.060   1.01
             µ(2)        -0.2    -0.145   0.042   -0.144   0.044   -0.152   0.048   1.00
             µ(3)        -0.4    -0.438   0.046   -0.414   0.052   -0.402   0.056   1.01

Table 5: The Bayes factor under ng = 250.

Hypothesis            chain 1    chain 2    chain 3

Setting: equality (µ(1) = µ(2) = µ(3))
µ(1) > µ(2) > µ(3)    0.2127     0.0628     0.0454
µ(1) > µ(3) > µ(2)    0.1240     0.8398     0.1536
µ(2) > µ(1) > µ(3)    0.7465     0.0546     0.0623
µ(2) > µ(3) > µ(1)    3.7458     1.2383     1.0234
µ(3) > µ(1) > µ(2)    0.5261     2.6947     2.4940
µ(3) > µ(2) > µ(1)    1.9599     1.9862     4.0285

Setting: inequality (µ(1) > µ(2) > µ(3))
µ(1) > µ(2) > µ(3)    9.3062     10.9033    4.7886
µ(1) > µ(3) > µ(2)    1.5062     1.1805     2.2823
µ(2) > µ(1) > µ(3)    0.5556     0.6155     0.7307
µ(2) > µ(3) > µ(1)    0.0277     0.0302     0.0953
µ(3) > µ(1) > µ(2)    0.0561     0.0307     0.1986
µ(3) > µ(2) > µ(1)    0.0070     0.0085     0.0659

Table 6: The Bayes factor under ng = 500.

Hypothesis            chain 1     chain 2     chain 3

Setting: equality (µ(1) = µ(2) = µ(3))
µ(1) > µ(2) > µ(3)    1.2274      0.8473      1.4499
µ(1) > µ(3) > µ(2)    1.9444      0.4389      2.3411
µ(2) > µ(1) > µ(3)    1.2360      0.6741      0.8754
µ(2) > µ(3) > µ(1)    0.5611      1.6103      0.4993
µ(3) > µ(1) > µ(2)    0.9123      0.8976      0.8706
µ(3) > µ(2) > µ(1)    0.3735      1.7549      0.3660

Setting: inequality (µ(1) > µ(2) > µ(3))
µ(1) > µ(2) > µ(3)    354.7122    142.0588    95.6036
µ(1) > µ(3) > µ(2)    0.0597      0.1754      0.2615
µ(2) > µ(1) > µ(3)    0.0105      0.0005      0.0000
µ(2) > µ(3) > µ(1)    0.0000      0.0000      0.0000
µ(3) > µ(1) > µ(2)    0.0000      0.0000      0.0000
µ(3) > µ(2) > µ(1)    0.0000      0.0000      0.0000

Table 7: The Bayes factor under ng = 1000.

Hypothesis            chain 1     chain 2     chain 3

Setting: equality (µ(1) = µ(2) = µ(3))
µ(1) > µ(2) > µ(3)    0.3225      0.5175      0.5121
µ(1) > µ(3) > µ(2)    1.3363      2.1674      1.9348
µ(2) > µ(1) > µ(3)    0.0720      0.3996      0.1605
µ(2) > µ(3) > µ(1)    0.0787      0.4819      0.2582
µ(3) > µ(1) > µ(2)    5.1854      1.5181      2.4206
µ(3) > µ(2) > µ(1)    1.1706      1.3211      1.4243

Setting: inequality (µ(1) > µ(2) > µ(3))
µ(1) > µ(2) > µ(3)    956.5385    352.1429    282.3563
µ(1) > µ(3) > µ(2)    0.0000      0.0005      0.0040
µ(2) > µ(1) > µ(3)    0.0261      0.0705      0.0844
µ(2) > µ(3) > µ(1)    0.0000      0.0000      0.0000
µ(3) > µ(1) > µ(2)    0.0000      0.0000      0.0000
µ(3) > µ(2) > µ(1)    0.0000      0.0000      0.0000

Table 8: The seven questions on SITES.

Item   Item description
1      Students are more attentive when computers are used in class
8      ICT can effectively enhance problem solving and critical thinking skills of students
11     ICT-based learning enables students to take more responsibility for their own learning
15     ICT improves the monitoring of students’ learning progress
18     The achievement of students can be increased when using computers for teaching
19     The use of e-mail increases the motivation of students
23     Using computers in class leads to more productivity of students

Table 9: The starting values used for the three Gibbs sampler chains.

Chain      Group 1      Group 2       Group 3
Chain 1    0.9916315    -0.0917919    -0.2593416
Chain 2    0.5163471     1.9677642     0.1250624
Chain 3    0.4907143     0.5842933     1.9249208

6 Real data

In 1998, the International Association for the Evaluation of Educational Achievement (IEA) established an international comparative research project, the Second Information Technology in Education Study (SITES), on the infusion of Information and Communication Technology (ICT) into different countries' educational systems. The goal of this project is to provide policy-makers and educational practitioners with information about the extent to which ICT contributes to bringing about reforms in those systems that will satisfy the needs of the Information Society.

In the SITES project, IEA conducted a survey in 26 countries. The survey collected samples of at least 200 computer-using schools from at least one of the primary, lower secondary, and upper secondary levels. We select three countries from this survey, Taiwan, Lithuania, and France, with sample sizes of 572, 685, and 768, respectively.

Our goal is to test whether some inequality of means exists among the three countries. The survey contains 24 questions, each with five response categories: strongly disagree, slightly disagree, uncertain, slightly agree, and strongly agree. We selected seven questions from the survey to examine whether ICT improves students’ attitude or ability. The seven questions are listed in Table 8.

Here we use the same prior distribution parameters as in the simulation settings. For the starting values of µ(g), we again draw random samples from N(0, 1), as listed in Table 9. Figures 1 to 3 show, one chain per figure, the latent factor means at each iteration for Taiwan, Lithuania, and France. The three chains appear to reach a stable state after the first 50000 iterations, so we discard the first 50000 iterations as burn-in. Again, a thinning of 10 is applied to avoid high correlation between successive draws. After the burn-in of 50000 iterations, the potential scale reduction factor computed from the three chains is shown in Figure 4 to assess convergence; its values for the factor means of Taiwan, Lithuania, and France are 1.02, 1.03, and 1.04, respectively. Since all three values are below the criterion of 1.2, we conclude that the chains have converged, and we can obtain the posterior distributions of the factor means for each chain and each group. Figures 5 to 7 depict the posterior distributions of the three factor means.
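The convergence check above can be illustrated with a short sketch of the potential scale reduction factor. It follows the classic between-chain and within-chain variance form of Gelman and Rubin (1992); whether the authors used exactly this variant is an assumption. The function name and the simulated chains are hypothetical and are not the SITES draws.

```python
import math
import random
import statistics


def potential_scale_reduction(chains):
    """Classic potential scale reduction factor for equal-length chains of one scalar."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(2024)
# Hypothetical draws: three chains that sample the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Hypothetical draws: the third chain sits around a different value.
stuck = [mixed[0], mixed[1], [x + 3.0 for x in mixed[2]]]

r_mixed = potential_scale_reduction(mixed)
r_stuck = potential_scale_reduction(stuck)
print(f"well-mixed chains: {r_mixed:.3f}")
print(f"one shifted chain: {r_stuck:.3f}")
```

Chains that explore the same distribution give a value close to 1, while a chain centred elsewhere pushes the value well above the 1.2 criterion.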

Figure 1: The factor means at each iteration for chain 1

Figure 2: The factor means at each iteration for chain 2

Figure 3: The factor means at each iteration for chain 3

Figure 4: The potential scale reduction factor plot of the factor mean for each group

Figure 5: The posterior distributions of the factor means for chain 1 (after a burn-in of 50000 iterations)

Figure 6: The posterior distributions of the factor means for chain 2 (after a burn-in of 50000 iterations)

Figure 7: The posterior distributions of the factor means for chain 3 (after a burn-in of 50000 iterations)

Table 10: The posterior means and standard deviations (SD) of the factor means for real data.

                         Chain 1          Chain 2          Chain 3
Country     Parameter    mean     SD      mean     SD      mean     SD
Taiwan      µ(1)          0.147   0.06     0.165   0.064    0.142   0.066
Lithuania   µ(2)          0.590   0.063    0.554   0.066    0.528   0.059
France      µ(3)         -0.055   0.048   -0.062   0.048   -0.03    0.052

Table 11: The Bayes factors of the equivalent inequality constrained hypotheses of real data.

Hypothesis            Chain 1      Chain 2      Chain 3
µ(1) > µ(2) > µ(3)    0.0000       0.0000       0.0000
µ(1) > µ(3) > µ(2)    0.0000       0.0000       0.0000
µ(2) > µ(1) > µ(3)    2626.5789    1277.0513    620.0000
µ(2) > µ(3) > µ(1)    0.0095       0.0196       0.0403
µ(3) > µ(1) > µ(2)    0.0000       0.0000       0.0000
µ(3) > µ(2) > µ(1)    0.0000       0.0000       0.0000

We calculate the posterior estimates and the Bayes factors from fi and ci, as previously described, for all possible inequality constrained hypotheses on the means of the different groups. Table 10 reports the posterior mean and standard deviation of the factor mean for each group obtained from the three chains. The posterior means obtained from the three chains are similar once the variability indicated by the associated standard deviations is taken into account. The Bayes factors for testing the inequality constrained hypotheses are shown in Table 11, where the Bayes factor of µ(2) > µ(1) > µ(3) is far larger than those of all the other hypotheses in all three chains. We therefore conclude that the latent factor mean of Lithuania is larger than that of Taiwan, which in turn is larger than that of France.
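A minimal sketch of how order-constrained Bayes factors can be computed from posterior draws is given below. It rests on several assumptions that this section does not restate: fi is taken as the proportion of posterior draws that satisfy a given ordering, ci is taken as 1/6 (each of the six orderings being equally likely under an exchangeable unconstrained prior), and two common conventions are reported, one against the unconstrained model (fi/ci) and one against the complement of the hypothesis. The function name is hypothetical, and the draws are simulated stand-ins, not the authors' posterior sample.

```python
import random
from itertools import permutations


def order_bayes_factors(draws, n_groups=3):
    """Return {hypothesis: (f, BF vs. unconstrained, BF vs. complement)}."""
    orderings = list(permutations(range(n_groups)))
    c = 1.0 / len(orderings)  # assumed prior proportion of each ordering
    counts = dict.fromkeys(orderings, 0)
    for draw in draws:
        # Group indices sorted from the largest to the smallest mean.
        order = tuple(sorted(range(n_groups), key=lambda g: -draw[g]))
        counts[order] += 1
    results = {}
    for order, count in counts.items():
        f = count / len(draws)  # posterior proportion agreeing with the ordering
        bf_unconstrained = f / c
        if count == len(draws):
            bf_complement = float("inf")
        else:
            bf_complement = (f / c) / ((1 - f) / (1 - c))
        label = " > ".join(f"mu({g + 1})" for g in order)
        results[label] = (f, bf_unconstrained, bf_complement)
    return results


rng = random.Random(7)
# Stand-in draws (NOT the authors' posterior sample): independent normals with
# the chain 1 means and SDs from Table 10.
means, sds = (0.147, 0.590, -0.055), (0.060, 0.063, 0.048)
draws = [tuple(rng.gauss(m, s) for m, s in zip(means, sds)) for _ in range(10000)]

bayes_factors = order_bayes_factors(draws)
for label, (f, bf_u, bf_c) in bayes_factors.items():
    print(f"{label}: f = {f:.4f}, BF_u = {bf_u:.4f}, BF_c = {bf_c:.4f}")
best = max(bayes_factors, key=lambda k: bayes_factors[k][0])
```

Because the stand-in draws ignore any posterior correlation among the factor means, the printed values only illustrate the mechanics and are not expected to reproduce Table 11.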

In addition, we sum the scores of the seven questions for each person and use ANOVA and Tukey’s test to test the mean differences among the three countries. The results are shown in Table 12. From Table 12 we reach the same conclusion: the mean of Lithuania is larger than that of Taiwan, and the mean of Taiwan is larger than that of France.
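The sum-score comparison can be sketched as follows, using only the Python standard library. The sketch covers the sum scores and the one-way ANOVA F statistic; the p-value and Tukey's intervals need the F and studentized range distributions and are left to a statistics package (for instance `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`). The response probabilities are hypothetical, and only the group sizes come from the text.

```python
import random
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [statistics.fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


rng = random.Random(11)
# Hypothetical response weights for the five categories (coded 1 to 5).
weights = {
    "Taiwan": [1, 2, 3, 4, 3],
    "Lithuania": [1, 2, 3, 4, 4],
    "France": [2, 3, 3, 3, 1],
}
sizes = {"Taiwan": 572, "Lithuania": 685, "France": 768}  # sample sizes from the text
# Each respondent's score is the sum of seven simulated item responses.
scores = {
    country: [sum(rng.choices([1, 2, 3, 4, 5], weights=weights[country], k=7))
              for _ in range(sizes[country])]
    for country in sizes
}
f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Summing ordinal responses treats the five categories as equally spaced scores, which is the simplification that the latent factor model avoids.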

If some response style underlies the data, the ANOVA method might not be able to account for it and could therefore arrive at a wrong conclusion. Our method, in contrast, would remain useful for data with a possible response style.

Table 12: Tukey multiple comparisons of means (95% family-wise confidence level).

Comparison          Difference   Lower bound   Upper bound   Adjusted p-value
Lithuania-France     3.7056       3.2180        4.1931       0.0000
Taiwan-France        3.0739       2.5616        3.5863       0.0000
Taiwan-Lithuania    -0.6316      -1.1570       -0.1062       0.0135

7 Discussion

When the factor means are equal, we expect the Bayes factors of the inequality constrained hypotheses to be similar, which in turn suggests that the data do not support any particular inequality constrained hypothesis. According to the criterion in Table 1, we tend to support an inequality constrained hypothesis whose Bayes factor is greater than 3. However, for ng = 250 in Table 5, the Bayes factor values of 3.7458 for the µ(2) > µ(3) > µ(1) hypothesis and 4.0285 for the µ(3) > µ(2) > µ(1) hypothesis are greater than 3. In Table 7, the Bayes factor value of 5.1854 for the µ(3) > µ(1) > µ(2) hypothesis with ng = 1000 is also greater than 3. There are several possible causes of such an outcome.

First, although ξ(g)i follows N(µ(g), φ(g)), the sample mean of ξ(g)i for i = 1, . . . , ng may not be exactly equal to µ(g) because of sampling variability. That is, for the data generated with ng = 250, a large Bayes factor value might occur simply by chance. As the sample size increases, the mean of ξ(g)i for i = 1, . . . , ng should move closer to µ(g), and the error due to sampling variability should decrease, which improves the Bayes factor values, as shown in the cases with ng = 500 and ng = 1000 in Table 6 and Table 7. Second, every posterior draw with µ(1) > µ(2), no matter how small the difference µ(1) − µ(2), contributes to the proportion fi for the inequality hypothesis µ(1) > µ(2), even though such a small difference might be negligible or occur simply by chance. In other words, a large Bayes factor value can arise even when the differences among the factor means µ(1), µ(2), and µ(3) are small. Third, the Bayes factor values for the same hypotheses in the other chains are all smaller than 3, so the values greater than 3 are not typical and likely occur by chance.
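The first point can be made concrete with a small sketch: the average of ng independent N(µ(g), φ(g)) variables has standard deviation sqrt(φ(g)/ng), so it shrinks only with the square root of the group size. The value φ = 1 and the function name are assumptions for illustration, and the simulation is a simple check rather than a rerun of the study.

```python
import math
import random
import statistics


def sd_of_group_mean(phi, n_g):
    """Standard deviation of the average of n_g independent N(mu, phi) variables."""
    return math.sqrt(phi / n_g)


phi = 1.0  # assumed factor variance, for illustration only
for n_g in (250, 500, 1000):
    print(n_g, round(sd_of_group_mean(phi, n_g), 4))

# Monte Carlo check for n_g = 250: spread of the group average over replications.
rng = random.Random(5)
averages = [
    statistics.fmean([rng.gauss(0.0, math.sqrt(phi)) for _ in range(250)])
    for _ in range(2000)
]
empirical_sd = statistics.stdev(averages)
print(round(empirical_sd, 4))
```

Quadrupling the group size from 250 to 1000 only halves this standard deviation, so with equal true means the ordering of the group averages remains largely a matter of chance at every sample size considered.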

In terms of processing time, the Gibbs sampling algorithm needs about 60 hours to run the 200000 iterations in the case of ng = 500. Any increase in either the number of iterations or the sample size will require more time. Such long running times are known to be a major drawback of MCMC algorithms in general. However, the flexibility of testing inequality constrained hypotheses does encourage the use of Bayesian estimation methods to obtain the posterior distributions of the relevant parameters. In practice, we can use other estimation methods, such as the limited-information or least-squares approaches available in existing software such as Mplus (Muthén & Muthén, 1998-2015), to obtain parameter estimates and take the estimated values as the starting values for the proposed Gibbs sampling runs. With starting values that are presumably close to draws from the posterior distributions of the parameters, the number of iterations required to reach convergence should be greatly reduced, and much less running time should be needed to obtain the posterior distributions of the parameters of interest.

When inequality relations exist among the factor means, we expect the Bayes factor of the true inequality hypothesis to be substantially larger than those of all the other hypotheses. For the inequality setting in Tables 5 to 7, the Bayes factor of µ(1) > µ(2) > µ(3) is indeed much larger than those of all the other inequality hypotheses for sample sizes of ng = 250, 500, and 1000. In other words, the Bayes factor is shown to be effective and valuable in testing inequality constrained hypotheses of factor means under the MCCFA model.

In this study, we assume a single latent factor, but in reality there may be more than one. The Bayesian estimation method here can be applied to cases with several latent factors by similar arguments, although the identifiability constraints should be reconsidered. In that case, the means of the latent factors form a mean vector, so inequality relations among vectors need a clear definition. This might be of interest for future studies.

8 Conclusion

This study discusses Bayesian estimation and uses the Bayes factor to test inequality constrained hypotheses of factor means for ordered categorical data among three groups. We extend the estimation method of Song and Lee (2001) to allow for estimating the means of the latent factors. Minimal identification constraints are used to ensure identifiability of the parameters in the Gibbs sampling algorithm. Overall, we conclude that the Bayes factor is useful in testing hypotheses involving inequality constraints on factor means for ordered categorical data.

References

[1] Arminger, G., & Muthén, B. O. (1998). A Bayesian approach to nonlinear latent variable models using the Gibbs sampler and the Metropolis-Hastings algorithm. Psychometrika, 63, 271-300.

[2] Chang, Y. W., Hsu, N. J., & Tsai, R. C. (2015). Unifying differential item functioning of categorical CFA and GRM under a discretization of a normal variant. Manuscript submitted for publication.

[3] Chang, Y. W., Huang, W. K., & Tsai, R. C. (2015). DIF detection using multiple-group categorical CFA with minimum free baseline approach. Journal of Educational Measurement, 52, 181-199.

[4] Cheung, G. W., & Rensvold, R. B. (2000). Assessing extreme and acquiescence response sets in cross-cultural research using structural equations modeling. Journal of Cross-Cultural Psychology, 31, 187-212.

[5] Fors, F., & Kulin, J. (2016). Bringing affect back in: Measuring and comparing subjective well-being across countries. Social Indicators Research, 127, 323-339.

[6] Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2003). Bayesian data analysis (2nd ed.). Boca Raton, FL, USA: Chapman & Hall/CRC.

[7] Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457-511.

[8] Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.

[9] Hoijtink, H. (2013). Objective Bayes factors for inequality constrained hypotheses. International Statistical Review, 81, 207-229.

[10] Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.

[11] Klugkist, I., & Hoijtink, H. (2007). The Bayes factor for inequality and about equality constrained models. Computational Statistics & Data Analysis, 51, 6367-6379.

[12] Lee, S. Y., Poon, W. Y., & Bentler, P. M. (1990). Full maximum likelihood analysis of structural equation models with polytomous variables. Statistics and Probability Letters, 9, 91-97.

[13] Lee, S. Y., Poon, W. Y., & Bentler, P. M. (1995). A two-stage estimation of structural equation models with continuous and polytomous variables. British Journal of Mathematical and Statistical Psychology, 48, 339-358.

[14] Millsap, R. E., & Tein, Y. J. (2004). Assessing factorial invariance in ordered-categorical measures. Multivariate Behavioral Research, 39, 479-515.

[15] Muthén, L. K., & Muthén, B. O. (1998-2015). Mplus user's guide (7th ed.). Los Angeles, CA: Muthén & Muthén.

[16] R Core Team. (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

[17] Song, X. Y., & Lee, S. Y. (2001). Bayesian estimation and test for factor analysis model with continuous and polytomous data in several populations. British Journal of Mathematical and Statistical Psychology, 54, 237-263.

[18] Sörbom, D. (1974). A general method for studying differences in factor means and factor structure between groups. British Journal of Mathematical and Statistical Psychology, 27, 229-239.

[19] Stark, S., Chernyshenko, O. S., & Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91, 1292-1306.

[20] Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528-540.

[21] van der Sluis, S., Vinkhuyzen, A. A. E., Boomsma, D. I., & Posthuma, D. (2010). Sex differences in adults’ motivation to achieve. Intelligence, 38, 433-446.

[22] Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14, 779-804.

[23] Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281-324). Washington, DC: American Psychological Association.
