Identification - Mixture SEM Model Incorporating Response Time: Using Test-Taking Behavior to Improve Model Analysis

The Rasch model has an identification problem: different sets of parameters give rise to the same distribution of $U_{ij}$. For example, the two sets $(b_j, \theta_i) = (0, 0.5)$ and $(b_j, \theta_i) = (-0.5, 0)$ result in the same probability of a correct answer under the Rasch model (1). A similar indeterminacy exists in our current model between the item difficulty $b_j$ and the mean ability $\mu_\theta$. That is, the parameter sets $(b_j, \mu_\theta)$ and $(b_j + c, \mu_\theta + c)$ result in the same probability of a correct answer under the Rasch model, such that

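$$p(u_{ij} = 1) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)} = \frac{\exp[(\theta_i + c) - (b_j + c)]}{1 + \exp[(\theta_i + c) - (b_j + c)]}.$$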

In other words, different parameter sets can yield the same distribution of $U_{ij}$. Based on the data alone, we are unable to distinguish between the two sets of parameters, and it is therefore necessary to constrain some parameter values for identification purposes.
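The indeterminacy can be checked numerically. The following is a minimal sketch, assuming the standard Rasch form $p(u_{ij} = 1) = \exp(\theta_i - b_j) / (1 + \exp(\theta_i - b_j))$; the function name `rasch_prob` and the shift value `c = 1.7` are illustrative choices and do not come from the source.

```python
import math


def rasch_prob(theta: float, b: float) -> float:
    """P(correct) under the Rasch model (assumed standard logistic form)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# The two parameter sets from the text give the same probability.
p_first = rasch_prob(theta=0.5, b=0.0)
p_second = rasch_prob(theta=0.0, b=-0.5)

# Shifting ability and difficulty by the same constant c changes nothing.
c = 1.7  # arbitrary illustrative shift
p_shifted = rasch_prob(theta=0.5 + c, b=0.0 + c)

print(p_first, p_second, p_shifted)
```

Only the difference $\theta_i - b_j$ enters the model, which is why one location constraint is needed before the parameters can be estimated.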

In the context of multiple-group factor models for ordered-categorical measures, Millsap and Tein (2004) propose a set of parameter identification constraints for the congeneric and dichotomous cases. Congeneric denotes a factor structure that includes any single-factor model, or any multiple-factor model in which each indicator loads on only one latent variate. In the multiple-group factor models, the ordered-categorical outcome $Y_{ij}$ yields the response $s$, where

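$$Y_{ij}^{*c} = a_j^{c}\,\theta_i^{c} + \varepsilon_{ij}, \qquad (16)$$

$$Y_{ij}^{c} = s \quad \text{if } \tau_{j,s}^{c} < Y_{ij}^{*c} \le \tau_{j,s+1}^{c},$$

where $c$ is the group membership of examinee $i$, $\tau_{j,s}$ is the $s$th threshold of item $j$, $a_j$ is the loading of item $j$, and $E(\varepsilon_{ij}) = 0$, $\forall i = 1, \dots, N$ and $j = 1, \dots, J$.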
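The thresholding step in model (16) can be illustrated with a short sketch. It assumes normally distributed errors with mean 0 and variance 1 (the source only states that the errors have mean 0), and the function names, the loading, and the threshold values are hypothetical.

```python
import bisect
import random
from typing import Sequence


def categorize(y_star: float, thresholds: Sequence[float]) -> int:
    """Return s such that tau_s < y_star <= tau_{s+1}.

    `thresholds` holds the finite interior thresholds in increasing order.
    The outer thresholds are taken to be -inf and +inf, so the categories
    run from 0 to len(thresholds).
    """
    # bisect_left counts the thresholds strictly below y_star.
    return bisect.bisect_left(thresholds, y_star)


def simulate_response(theta: float, loading: float,
                      thresholds: Sequence[float],
                      rng: random.Random) -> int:
    """Draw one ordered-categorical response for one examinee and item."""
    eps = rng.gauss(0.0, 1.0)  # assumption: standard normal error term
    y_star = loading * theta + eps
    return categorize(y_star, thresholds)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
thresholds = [-1.0, 0.0, 1.0]  # hypothetical interior thresholds
responses = [simulate_response(0.3, 1.0, thresholds, rng) for _ in range(10)]
print(responses)
```

Rescaling the latent response, the loading, and the thresholds by a common factor would leave every category unchanged, which is one reason the error variance and some loadings and thresholds have to be constrained.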

The set of identification constraints proposed by Millsap and Tein (2004) is:

For some $c_0 \in \{1, \dots, C\}$, $E(\theta_i^{c_0}) = 0$ and $\mathrm{Var}(\varepsilon_{ij}^{c_0}) = 1$, $\forall j = 1, \dots, J$;

$\forall j = 1, \dots, J$, $\tau_{j,s_1}^{1} = \tau_{j,s_1}^{2} = \cdots = \tau_{j,s_1}^{C}$ for some $s_1 \in \{1, \dots, S\}$;

for some $j_0 \in \{1, \dots, J\}$, $a_{j_0}^{1} = a_{j_0}^{2} = \cdots = a_{j_0}^{C} = 1$, and $\tau_{j_0,s_2}^{1} = \tau_{j_0,s_2}^{2} = \cdots = \tau_{j_0,s_2}^{C}$ for some $s_2 \in \{1, \dots, S\}$ ($s_2 \neq s_1$).

In the two-group case of the focal ($f$) and reference ($r$) groups, Hwang (2012) proposes a set of identification constraints for model (16):

$E(\theta_i^{r}) = 0$;

$\forall j = 1, \dots, J$, $\mathrm{Var}(\varepsilon_{ij}^{r}) = \mathrm{Var}(\varepsilon_{ij}^{f}) = 1$;

$a_1^{r} = a_1^{f} = 1$, and $\tau_{1,1}^{r} = \tau_{1,1}^{f}$.

An alternative, equivalent set of identification constraints based on Millsap and Tein (2004) in this case is

$E(\theta_i^{r}) = 0$ and $\mathrm{Var}(\varepsilon_{ij}^{r}) = 1$, $\forall j = 1, \dots, J$;

$\forall j = 1, \dots, J$, $\tau_{j,1}^{r} = \tau_{j,1}^{f}$; $a_1^{r} = a_1^{f} = 1$, and $\tau_{1,2}^{r} = \tau_{1,2}^{f}$.

Although the above identification constraints are proposed for multiple-group models, similar principles apply to mixture or latent class models. Our present model (8) uses the logit link, and the variance of the standard logistic distribution is $\pi^2/3$. That is, if we formulate our model with the $\varepsilon_{ij}$ terms in (16), the variances of the $\varepsilon_{ij}$ are fixed for all items and all classes, and Hwang's approach is therefore more applicable. Moreover, we consider only Rasch models for dichotomous responses, which greatly reduces the complexity of the identification issue. To be more specific, our study adopts the following identification constraints:

For some $c_0 \in \{1, \dots, C\}$, $E(\theta_i \mid \eta, c_0) = \mu_{\theta_{c_0}} = 0$;

for some $j_0 \in \{1, \dots, J\}$, $b_{1 j_0} = b_{2 j_0} = \cdots = b_{C j_0}$.
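The value $\pi^2/3$ quoted above for the variance of the standard logistic distribution can be verified numerically. The sketch below integrates $x^2 f(x)$ with the trapezoid rule, where $f$ is the standard logistic density; the integration limits and step count are arbitrary choices made for this illustration.

```python
import math


def logistic_pdf(x: float) -> float:
    """Density of the standard logistic distribution."""
    e = math.exp(-abs(x))  # the density is symmetric; abs() avoids overflow
    return e / (1.0 + e) ** 2


def logistic_variance(lower: float = -40.0, upper: float = 40.0,
                      steps: int = 80_000) -> float:
    """Trapezoid-rule approximation of the integral of x^2 f(x).

    The mean of the distribution is 0, so this integral is the variance.
    """
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * x * x * logistic_pdf(x)
    return total * h


variance = logistic_variance()
print(variance, math.pi ** 2 / 3)
```

Because the logit link fixes this variance in every class, no further error-variance constraint is needed.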

3 Simulation Studies

In this section, we examine the estimation performance of the MRM-RT model under the mixture SEM framework. Furthermore, the advantage of using response-time information is evaluated by comparing results from simultaneously analyzing item responses and response times with those from using item responses alone.

3.1 Data Generation

First, we introduce the simulation settings. There are three latent classes and 25 multiple-choice items, each with four options. Parameters of the three classes are designed to characterize examinees who exhibit rapid-guessing (RG, class 1), solution behavior (SB, class 2), or high ability and/or respond with familiarity (HARF, class 3). The proportions of examinees in the RG, SB, and HARF classes are 0.15, 0.55 and 0.3, respectively.
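As a minimal sketch of how latent class membership could be generated with these proportions, the code below draws each examinee's class independently. Whether the original study drew memberships at random or used fixed class counts is not stated, so the random draw, the seed, and the function name `assign_classes` are assumptions made for illustration.

```python
import random
from collections import Counter

# Class proportions stated in the text.
CLASS_PROPORTIONS = {"RG": 0.15, "SB": 0.55, "HARF": 0.30}


def assign_classes(n: int, rng: random.Random) -> list:
    """Draw a latent class label for each of n examinees (assumed i.i.d.)."""
    labels = list(CLASS_PROPORTIONS)
    weights = list(CLASS_PROPORTIONS.values())
    return rng.choices(labels, weights=weights, k=n)


rng = random.Random(7)  # fixed seed so the sketch is reproducible
membership = assign_classes(1000, rng)
print(Counter(membership))
```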

The parameter values are mainly based on the simulation settings used in Meyer (2010). Examinees in the SB class generally spend the most time on each item among the three classes. In contrast, examinees in the HARF class spend less time on items than those in the SB class because of their higher ability or their familiarity with the items from practice. Examinees in the RG class take only a few seconds on each question, as they simply read through the item quickly and guess without much thought. In the RG class, the mean and variance of response log-time are fixed at -0.5 and 0.01, respectively. For the SB and HARF classes, the mean of response log-time increases linearly with item difficulty. Response times on the easier items are likely to be similarly short across examinees, and those on the more difficult items similarly long, so the variances of response log-time for both the easier and the more difficult items are set to be small. In the SB class, the mean and variance of response log-time range from -0.3 to 0.9 and from 0.32 to 0.41, respectively. In the HARF class, they range from -0.47 to 0.25 and from 0.21 to 0.28, respectively. In other words, the variability in response time is considered smaller for examinees in the HARF class than for those in the SB class. The parameters in Table 1 are on the log scale; to better illustrate the differences among the classes, the response time distributions on item 13 for each class are plotted in Figure 2. The response time for examinees in the RG class is generally much shorter than that for the other classes.

Table 1: Mean and Variance Parameters of Response Log-Time for RG, SB and HARF Classes

RG mean   RG variance   SB mean   SB variance   HARF mean   HARF variance

-0.5      0.01          -0.3      0.32          -0.47       0.23

RG = rapid-guessing; SB = solution behavior; HARF = high ability and/or respond with familiarity.
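To give a feel for these log-scale parameters, the following sketch draws response times for each class from the values in the row of Table 1 shown above. It assumes that response log-time is normally distributed within a class (so that response time is lognormal); that distributional form, the seed, and the function name `sample_response_times` are assumptions of this illustration.

```python
import math
import random

# (mean, variance) of response log-time, taken from the row of Table 1.
LOG_TIME_PARAMS = {
    "RG": (-0.5, 0.01),
    "SB": (-0.3, 0.32),
    "HARF": (-0.47, 0.23),
}


def sample_response_times(class_label: str, n: int,
                          rng: random.Random) -> list:
    """Draw n response times, assuming log-time is normally distributed."""
    mean, variance = LOG_TIME_PARAMS[class_label]
    sd = math.sqrt(variance)
    return [math.exp(rng.gauss(mean, sd)) for _ in range(n)]


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
for label in LOG_TIME_PARAMS:
    times = sample_response_times(label, 10_000, rng)
    log_times = [math.log(t) for t in times]
    average = sum(log_times) / len(log_times)
    print(label, round(average, 2), round(max(times), 2))
```

The very small variance in the RG class keeps its simulated response times tightly clustered, whereas the SB class shows a much longer right tail.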

Figure 2: The response time distribution on item 13 for the RG, SB and HARF classes.

As for the ability distribution, we assume that examinees in the RG class simply guess one of the options without thinking, so each option of an item has the same probability of being chosen. In other words, the probability of answering each four-option item correctly by guessing is .25 in the RG class. Therefore, the mean and variance of the ability distribution are both conveniently set to 0, and we simply use the appropriate item difficulty parameter to ensure this probability of a correct answer for each item. The SB class stands for the more general population, and its ability distribution is therefore assumed to follow the standard normal distribution N(0, 1). In contrast, the mean ability is higher for the HARF class, reflecting that the faster responses are partly due to the higher ability of examinees in this class. Moreover, its ability variance is smaller than that of the SB class, indicating that examinees in the HARF class are more homogeneous in ability. The mean and variance of the ability distribution in the HARF class are 0.5 and 0.65, respectively.

The characteristics of the three classes are reflected not only in the response time and ability distributions, but also in the item difficulty parameters. In the RG class, the item difficulty parameters $b_j$ are fixed at 1.099 for all items, so that the probability of a correct answer on each item is .25. In both the SB and HARF classes, the item difficulty parameters $b_j$ range from -2 to 2. Examinees in the HARF class might be more familiar with the test items through more practice, so some items may appear easier to them; to capture this feature, the difficulty parameters of those items are set smaller for the HARF class than for the SB class. Here, we randomly select ten of the 25 items to have smaller difficulty parameters for the HARF class than for the SB class. The chosen items are 3, 7, 9, 11, 15, 16, 19, 20, 23, and 25. These items are considered to exhibit differential item functioning (DIF) with respect to the latent groups of SB and HARF (Maij-de Meij, Kelderman & van der Flier, 2010). All the item difficulty parameter values are summarized in Table 2.
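The RG difficulty value of 1.099 follows from inverting the Rasch curve at ability 0. The sketch below assumes the standard Rasch form $p = \exp(\theta - b) / (1 + \exp(\theta - b))$; the helper names are illustrative and not from the source.

```python
import math


def rasch_prob(theta: float, b: float) -> float:
    """P(correct) under the Rasch model (assumed standard logistic form)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def difficulty_for_probability(p: float, theta: float = 0.0) -> float:
    """Difficulty b at which an examinee with ability theta answers
    correctly with probability p (inverse of the Rasch curve)."""
    return theta - math.log(p / (1.0 - p))


b_rg = difficulty_for_probability(0.25)  # equals ln(3)
print(round(b_rg, 3))                    # 1.099
print(round(rasch_prob(0.0, 1.099), 3))  # 0.25
```

With the ability mean and variance both set to 0 in the RG class, every RG examinee has ability 0, so a common difficulty of $\ln 3 \approx 1.099$ gives the guessing probability of .25 on every item.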

Table 2: Item Difficulty Parameters for RG, SB, and HARF Classes

item RG SB HARF

RG = rapid-guessing; SB = solution behavior; HARF = high ability and/or respond with familiarity.

The estimation performance under various sample sizes, i.e., total numbers of examinees, is also of interest. In addition to the sample sizes of 500 and 2000 used in Meyer (2010), we also consider 250 and 1000 to better represent small, medium and large sample sizes. For each case, 100 independent replications are simulated. To further examine the information contributed by response time, two fitted models, MRM-RT and MRM, are considered for each replication. MLR (maximum likelihood with robust standard errors) is applied to both fitted models, and all estimation is done using Mplus 5. Necessary identification constraints are imposed on both the SB and HARF classes to ensure identification of item parameters in both fits. More specifically, the ability mean is fixed to 0 for the SB class, and item 1 is treated as the anchor item with no differential item functioning between the SB and HARF classes. In addition, for the RG class, the ability mean and variance are fixed to 0, and the item difficulty parameters are fixed to 1.099 rather than estimated.
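The size of the simulation design described above can be laid out as a simple enumeration. This is only an organizational sketch: the actual estimation was carried out in Mplus, and the function name `design_runs` is hypothetical.

```python
from itertools import product

SAMPLE_SIZES = (250, 500, 1000, 2000)
N_REPLICATIONS = 100
FITTED_MODELS = ("MRM-RT", "MRM")


def design_runs():
    """Yield one (sample_size, replication, model) tuple per model fit.

    Both models are fitted to the same replicated data set, so the
    model loop sits inside the replication loop.
    """
    for n, rep in product(SAMPLE_SIZES, range(1, N_REPLICATIONS + 1)):
        for model in FITTED_MODELS:
            yield n, rep, model


runs = list(design_runs())
print(len(runs))  # 4 sample sizes x 100 replications x 2 models = 800
```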
