Edited by: Holmes Finch, Ball State University, United States
Reviewed by: Hsien-Yuan Hsu, University of Texas Health Science Center at Houston, United States; Jung Yeon Park, Columbia University, United States; Ke-Hai Yuan, University of Notre Dame, United States
*Correspondence: Jiun-Yu Wu, jiunyu.rms@gmail.com; jiunyuwu@g2.nctu.edu.tw
Specialty section: This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology
Received: 14 March 2017; Accepted: 15 August 2017; Published: 22 September 2017
Citation: Wu J-Y, Lin JJH, Nian M-W and Hsiao Y-C (2017) A Solution to Modeling Multilevel Confirmatory Factor Analysis with Data Obtained from Complex Survey Sampling to Avoid Conflated Parameter Estimates. Front. Psychol. 8:1464. doi: 10.3389/fpsyg.2017.01464

A Solution to Modeling Multilevel Confirmatory Factor Analysis with Data Obtained from Complex Survey Sampling to Avoid Conflated Parameter Estimates

Jiun-Yu Wu1*, John J. H. Lin2, Mei-Wen Nian1 and Yi-Cheng Hsiao1

1 Institute of Education, National Chiao Tung University, Hsinchu, Taiwan; 2 Office of Institutional Research, National Central University, Taoyuan, Taiwan

The issue of equality in the between- and within-level structures in Multilevel Confirmatory Factor Analysis (MCFA) models is influential for obtaining unbiased parameter estimates and statistical inferences. A commonly seen condition is inequality of factor loadings across levels even when the factor structure is the same at both levels. With mathematical investigation and Monte Carlo simulation, this study compared the robustness of five statistical models, including two model-based models (a true and a mis-specified model), one design-based model, and two maximum models (models in which the full-rank variance-covariance matrix is estimated at the between level or the within level, respectively), in analyzing complex survey measurement data with level-varying factor loadings. Empirical data on the perceived Harter competence scale from 120 third graders in 40 classrooms were modeled with MCFA, and the parameter estimates were used as true parameters in the Monte Carlo simulation study. Results showed that the maximum models were robust to unequal factor loadings, whereas the design-based and the mis-specified model-based approaches produced conflated results and spurious statistical inferences. We recommend the use of maximum models if researchers have limited information about the pattern of factor loadings and measurement structures. Measurement models are key components of Structural Equation Modeling (SEM); therefore, the findings can be generalized to multilevel SEM and CFA models. Mplus codes are provided for the maximum models and the other analytical models.

Keywords: multilevel confirmatory factor analysis, design-based approach, model-based approach, maximum model, level-varying factor loadings, complex survey sampling, measurement

INTRODUCTION

Multilevel Confirmatory Factor Analysis (MCFA) extends the power of Confirmatory Factor Analysis (CFA) to accommodate complex survey data by estimating level-specific variance components and the respective measurement models. Complex survey data are obtained through cluster sampling or multistage sampling, where a few individuals within a class/household or the entire class/family are selected. This type of sampling scheme is likely to result in non-independent observations with within-cluster dependency (Skrondal and Rabe-Hesketh, 2007). If the dependent data are analyzed through traditional approaches which assume independent


observations, “incorrect parameter estimates, standard errors, and inappropriate fit statistics may be obtained” (du Toit and du Toit, 2008, p. 456).

Researchers have devoted attention to the influences of applying different multilevel modeling constructions to complex survey data (e.g., model-fit indices: Hsu et al., 2015; reliability measures: Geldhof et al., 2013; parameter estimates and statistical inferences: Wu and Kwok, 2012; longitudinal designs: Wu et al., 2014). Among the research designs in these studies, the issue of inequality between the between-level (i.e., the higher or cluster level) and within-level (i.e., the lower or individual level) structures in complex survey data has been shown to be influential for obtaining unbiased parameter estimates along with consistent statistical inferences. Compared to inequality of level structures in multilevel models, a less addressed condition is one in which the true model has the same factor structure at both levels while the magnitudes and statistical significance of the factor loadings vary across and within levels, a condition that occurs frequently in empirical research (e.g., Dyer et al., 2005; Klangphahol et al., 2010).

For example, Dyer et al. (2005) applied MCFA to study

organizational leadership at the individual and societal levels and obtained a common factor consisting of five items: being "formal," "habitual," "cautious," "procedural," and "ritualistic." The five items loaded much more strongly onto the single factor at the between level (i.e., the societal level) than at the within level (the individual level), which supported the belief that this leadership scale operates mainly at the societal level. Based on this finding, Dyer et al. (2005) suggested that a three-item factor (discarding two trivial items with small factor loadings), instead of a five-item factor, should be used if the interest of the leadership study is at the individual level. Dyer et al.'s suggestion underscored the importance of specifying an optimal measurement model for complex survey data, in terms of both the model structure and the sizes of the factor loadings, to obtain correct statistical and practical interpretations in scale development.

From a factor-analytic point of view, items with explained variance smaller than 20% or standardized factor loadings less than 0.45 would be considered to have low communality (EFA: MacCallum et al., 1999; CFA: Meade and Bauer, 2007). From a measurement point of view, items with standardized factor loadings larger than 0.6 exhibit better psychometric properties (Bagozzi and Yi, 1988; Kline, 2010). Failing to detect items with small factor loadings may lead to a misunderstanding that all items are equally important, causing researchers to investigate problems that are of little importance or little relevance to the intended measure.

Therefore, in this study, we performed a substantive-methodological synergy (Marsh and Hau, 2007) by applying different modeling strategies to synthetic datasets simulated with population parameters specified from an empirical dataset, to examine the robustness of the model-based, design-based, and maximum models regarding their effectiveness and efficiency in producing unbiased parameter estimates and statistical inferences for measurement data obtained from complex survey sampling. Below we elaborate on the issues with modeling strategies and unequal factor loadings, followed by an introduction to three modeling strategies for complex survey data.

Issues with Modeling Strategies and Unequal Factor Loadings

Traditionally, several multilevel modeling strategies can be applied to address data dependency in complex survey data (Heck and Thomas, 2008; Rabe-Hesketh and Skrondal, 2008; Hox, 2010; Snijders and Bosker, 2011). Specifying different structures for separate levels, namely a model-based approach, allows free estimation of level-specific parameters and enables the detection of possible inequality in parameter estimates. However, in reality, information about the higher-level structure is rarely known without the support of theoretical evidence. If researchers jump into multilevel analysis without theoretical or empirical evidence, the correctness of the multilevel structure is at risk.

Alternatively, researchers can apply the design-based approach by specifying only an overall model for the complex survey data to infer their findings to the lower-level sampling units, using the robust standard error estimator (Huber, 1967; White, 1980) to correct for the bias in the standard errors of the fixed effects (Muthén and Satorra, 1995). The design-based approach has been shown to yield satisfactory analytic results only when the complex survey data meet the assumption of equal structures at both the between and within levels (Wu and Kwok, 2012). In addition to the design-based and model-based approaches, a possible alternative for analyzing multilevel data is the use of maximum models (Hox, 2002, 2010; Wu and Kwok, 2012), where a saturated between-level model is estimated and can be used to focus on a specific level of analysis.

To examine the robustness of reliability measures on complex survey data, Geldhof et al. (2013) used MCFA and single-level CFA (i.e., without taking data dependency into consideration) on simulated multilevel datasets, where the between and within levels had exactly the same factor loadings but different high and low reliabilities across levels, using the average item ICC as a dependency measure. Their findings suggested that single-level CFAs cannot yield the actual scale reliability unless the true reliabilities are identical at each level. Moreover, in their simulation study, they postulated that the true MCFA model had the same factor loadings within and across levels, i.e., the between- and within-level models were identical in terms of the magnitudes of the factor loadings and the factor structures. Few studies have investigated the issue of inequality of factor loadings under an equal factor structure within and across levels. Besides, systematic investigation of the performance of model-fit statistics, indices and information criteria, and the resulting parameter estimates with their statistical inferences was not discussed in Geldhof et al. (2013).

Extending the simulation settings of Geldhof et al. (2013), we examined the performance of different model specifications regarding inequality of factor loadings and different factor structures within and across levels. Distinct cluster numbers (CN), cluster sizes (CS), and ICCs were used in the true model for the simulation settings of this study. The criterion variables included the overall exact model-fit chi-square test and various model-fit indices, the fixed-effect and random-effect parameter estimates with their 95% coverage rates and empirical power, as well as the variance-explained measure (R²) and scale reliability (ρ).

Specifically, this study aims to examine the robustness of the three modeling strategies using five analytic models (i.e., MCFA, mis-specified MCFA, one-level design-based CFA, Max CFA with a saturated between level, and Max CFA with a saturated within level) in testing multilevel measurement data with unequal magnitudes of factor loadings, some of which may be trivial or of little practical relevance at the individual level under equal level structures. In the following section, we provide a review of the three multilevel modeling strategies.

Three Modeling Strategies on Complex Survey Data

Model- and Design-Based Strategies

The rationale for using multilevel models in analyzing complex survey data is to reflect the natural multistage sampling scheme (Muthén, 1994; Heck and Thomas, 2008). Researchers can do so by constructing the analytic model either to simultaneously calculate the lower- and higher-level parameter estimates, which may have different values at each level, or to adjust the standard errors of the fixed effects. The model-based approach (e.g., the MCFA technique) conforms to the actual multistage sampling scheme by specifying a level-specific model for each level of the data. In other words, for two-level clustered sampling data, it specifies a between-level model that conforms to the level-2 structure (i.e., the higher level) and a within-level model that conforms to the level-1 structure (i.e., the lower level). Instead of constructing separate level models for multilevel data, the design-based approach analyzes the data with only one overall model and accounts for the sampling scheme by adjusting the standard errors of the parameter estimates based on the sampling design. The adjustment is implemented using the robust standard error estimator (Huber, 1967; White, 1980), or sandwich-type variance estimator, a general name for this class of alternative variance estimators. The sandwich-type variance estimator functions as an overall adjustment of the standard errors of the parameter estimates, which would otherwise be deviated by the extra data dependency, while retaining the original statistical approach. This class of variance estimators has been proposed to address data non-independence (i.e., data heteroskedasticity) more directly in CFAs (Muthén and Satorra, 1995). The adjustment is a post-hoc process and is said to affect only the standard errors, not the parameter estimates (Hardin and Hilbe, 2007).
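For reference, a schematic form of the cluster-robust sandwich variance estimator is sketched below; this is a generic textbook form rather than the exact Mplus implementation. Here $\hat{A}$ denotes the Hessian (information) matrix of the log-likelihood and $s_g$ is the score contribution summed within cluster $g$:

$$\widehat{\mathrm{Var}}(\hat{\theta}) = \hat{A}^{-1}\left(\sum_{g=1}^{G} s_g(\hat{\theta})\, s_g(\hat{\theta})'\right)\hat{A}^{-1}, \qquad s_g(\hat{\theta}) = \sum_{i=1}^{n_g} \left.\frac{\partial \ell_{ig}(\theta)}{\partial \theta}\right|_{\theta = \hat{\theta}}.$$

The middle term is the scaling matrix placed between two copies of the Hessian described later in the Methods section; the robust standard errors are the square roots of the diagonal elements of this matrix.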

In a simulation study, Muthén and Satorra (1995) showed

that under the same model structure for all data levels, these two approaches performed equally well for complex survey data. Compared to the model-based approach, the design-based approach is used more frequently by researchers in applied areas (Rebollo et al., 2006; Róbert, 2010; Roberts et al., 2010; Rosenthal and Villegas, 2010; Wu et al., 2010; Brook et al., 2011; Martin et al., 2011; Wu, 2015, 2017) because it requires only a single model specification, and researchers are often interested in examining the lower-level (i.e., within-level) model, which has the most sampling units.

Despite the simplicity of its model specification, the design-based approach for complex survey data is built upon the assumption of identical structures across levels (Muthén and Satorra, 1995; Wu and Kwok, 2012). However, this assumption is often violated in empirical research when researchers examine the level-specific structures of their multilevel datasets (e.g., Wilhelm and Schoebi, 2007). Inequality in the between- and within-level structures leads to conflated estimates of the fixed and random effects if the design-based approach is used (Wu and Kwok, 2012). Moreover, in the current study, we posit that if the same magnitudes and significance of factor loadings do not hold at different levels under the same level structures, the inequality of the between- and within-level factor loadings may also cause potential problems for the design-based approach.

In the case of Dyer et al. (2005), if the authors had used the design-based approach for their procedural leadership analysis, they would have obtained design-based estimates contaminated with information from both the between- and within-level models. Thus, they would have had no indication of the larger factor loadings at the societal level and might not have been able to detect the two trivial items at the individual level. From a practical perspective, researchers would falsely conclude that the scale is a valid measure for research questions related to individual participants. In addition, the overall model parameter estimates and the scale reliability measures may be questionable for inferring individual-level characteristics. However, the issue of inequality of factor loadings between and within levels has rarely been systematically examined in previous studies.

Maximum Model

Another feasible modeling strategy for complex survey data is called the "maximum model" (Hox, 2002, 2010; Wu and Kwok, 2012), where a saturated model at a specific level (usually the higher level) is built by estimating the full-rank variance-covariance matrix at that level, consuming all the available degrees of freedom at that level. This maximum model technique was first suggested by researchers (e.g., Hox, 2002; Stapleton, 2006; Yuan and Bentler, 2007) as a baseline model for constructing multilevel analyses with theoretical evidence. Ryu and West (2009), on the other hand, examined the performance of level-specific fit indices using the maximum modeling technique. More recently, Wu and Kwok (2012) found that the maximum model and correctly specified model-based approaches performed equally well for analyzing complex survey data regardless of equality in level structures, whereas the design-based approach produced satisfactory fixed-effect estimates and standard errors only under equal within-/between-level structure scenarios. Compared to inequality of level structures, what is more commonly found in empirical measurement research is unequal magnitudes of factor loadings at different levels with the same number of factors. However, no study to date has systematically examined the consequences of mis-specifying multilevel models for two-level CFA measurement data with respect to the violation of equality of factor loadings. This study therefore focuses on inequality of factor loadings within and across levels of MCFA to explore potential analytical problems.


In the SEM framework, analysts commonly use chi-square difference tests to conduct model comparisons among numerous competing models. However, this kind of test is only appropriate for comparisons between nested models. Besides, the chi-square test statistic is easily influenced by large sample sizes (Yuan et al., 2007; Kline, 2010). Alternatively, information criteria can be used for model comparison between nested and non-nested models (Sclove, 1987). By taking model uncertainty into consideration, the information criteria overcome the above-mentioned difficulties (Bollen et al., 2014). In this study, besides the commonly used model-fit test statistics and indices, we discuss the performance of the Akaike Information Criterion (AIC, Akaike, 1974), the Bayesian/Schwarz Information Criterion (BIC, Schwarz, 1978), and the sample-size adjusted BIC (adj. BIC, Sclove, 1987) in assessing the different model specifications. Models with smaller AIC, BIC, or adjusted BIC would be considered a better fit to the designated dataset. Detailed discussions of these information criteria under the SEM framework can be found in Nylund et al. (2007) for Latent Class Analysis and Growth Mixture Modeling, and in Bollen et al. (2014) for single-level SEM. This study adds to the literature regarding guidelines for interpreting information criteria when constructing measurement models for complex survey data under the SEM framework.
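For reference, commonly cited forms of these criteria are sketched below, where $\ln L$ is the maximized log-likelihood, $k$ is the number of free parameters, and $N$ is the sample size; in the sample-size adjusted BIC, as commonly implemented, $N$ in the penalty is replaced by $(N+2)/24$ following Sclove (1987):

$$\mathrm{AIC} = -2\ln L + 2k, \qquad \mathrm{BIC} = -2\ln L + k\ln N, \qquad \text{adj. BIC} = -2\ln L + k\ln\!\left(\frac{N+2}{24}\right).$$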

METHODS

Mathematical Investigation of Three SEM Techniques on Complex Survey Measurement Data

We provide the model specifications of the model-based, design-based, and maximum modeling approaches (with either a saturated between-level model or a saturated within-level model) and their mathematical derivations to investigate the robustness of these modeling approaches in dealing with inequality of factor loadings between the between- and within-level models under equal factorial structures.

Using multilevel data drawn from a two-level multistage sampling strategy as an example, suppose that $G$ groups are randomly drawn from the target population at the first stage of sampling and that $n_g$ participants are sampled within each group $g$ at the second stage, for a total of $N = \sum_{g=1}^{G} n_g$ participants. For each participant, $P$ item responses ($y_{pig}$, $p = 1, 2, \ldots, P$) are gathered. We then have the random vector of response variables $\mathbf{y}_{ig} = [y_{1ig}, y_{2ig}, \ldots, y_{Pig}]_{1 \times P}$ for participant $i$ (lower-level unit, $i = 1, 2, \ldots, n_g$) within group $g$ (higher-level unit, $g = 1, 2, \ldots, G$).

For the $g$th group, the random matrix of observations may be arranged as follows:

$$\mathbf{y}_g = \begin{bmatrix} \mathbf{y}_{1g} \\ \mathbf{y}_{2g} \\ \vdots \\ \mathbf{y}_{n_g g} \end{bmatrix} = \begin{bmatrix} y_{11g} & y_{21g} & \cdots & y_{P1g} \\ y_{12g} & y_{22g} & \cdots & y_{P2g} \\ \vdots & \vdots & \ddots & \vdots \\ y_{1 n_g g} & y_{2 n_g g} & \cdots & y_{P n_g g} \end{bmatrix}_{n_g \times P} \quad (1)$$

Analogous to the variance decomposition used in ANOVA, the observation $\mathbf{y}_{ig}$ can be decomposed into its between-group component and within-group component, that is,

$$\mathbf{y}_{ig} = \mathbf{y}_{B \cdot g} + \mathbf{y}_{W \cdot ig}, \quad \forall i = 1, 2, \ldots, n_g,\; g = 1, 2, \ldots, G \quad (2)$$

where $\mathbf{y}_{B \cdot g}$ is the between-group component with MVN($\boldsymbol{\mu}$, $\Sigma_B$) (i.e., a multivariate normal distribution with grand mean $\boldsymbol{\mu}$ and variance-covariance matrix $\Sigma_B$) and $\mathbf{y}_{W \cdot ig}$ is the within-group component with MVN($\boldsymbol{\mu}_g$, $\Sigma_W$); typically, $\boldsymbol{\mu}_g$ is set to $\mathbf{0}$. The between-group components of different groups are set to be uncorrelated; that is, $\mathrm{Cov}(\mathbf{y}_{B \cdot g}, \mathbf{y}_{B \cdot g'}) = \mathbf{0}$, $\forall g \neq g'$. Similarly, the correlation between different participants in different groups is also set to zero (i.e., $\mathrm{Cov}(\mathbf{y}_{W \cdot ig}, \mathbf{y}_{W \cdot i'g'}) = \mathbf{0}$, $\forall i \neq i'$ and $\forall g \neq g'$). Furthermore, the cross-level correlation between $\mathbf{y}_{B \cdot g}$ and $\mathbf{y}_{W \cdot ig}$ is defined as zero.

Hence, the variance-covariance matrix of $\mathbf{y}_{ig}$ may be decomposed into the combination of between-group and within-group variation, $\mathrm{Cov}(\mathbf{y}_{ig}) = \Sigma_B + \Sigma_W$. Going a step further, to consider the MCFA model (i.e., the model-based approach), Equation (2) may be written as

$$\mathbf{y}_{ig} = \mathbf{y}_{B \cdot g} + \mathbf{y}_{W \cdot ig} = \boldsymbol{\mu} + \Lambda_B \boldsymbol{\eta}_{B \cdot g} + \boldsymbol{\varepsilon}_{B \cdot g} + \Lambda_W \boldsymbol{\eta}_{W \cdot ig} + \boldsymbol{\varepsilon}_{W \cdot ig} \quad (3)$$

The between-group component $\mathbf{y}_{B \cdot g}$ is the combination of the product of the factor loading matrix $\Lambda_B$ and the latent factor $\boldsymbol{\eta}_{B \cdot g} \sim$ MVN($\mathbf{0}$, $\Psi_B$), and the unique vector $\boldsymbol{\varepsilon}_{B \cdot g} \sim$ MVN($\mathbf{0}$, $\Theta_B$). The within-group component $\mathbf{y}_{W \cdot ig}$ is the combination of the product of the factor loading matrix $\Lambda_W$ and the latent factor $\boldsymbol{\eta}_{W \cdot ig} \sim$ MVN($\mathbf{0}$, $\Psi_W$), and the unique vector $\boldsymbol{\varepsilon}_{W \cdot ig} \sim$ MVN($\mathbf{0}$, $\Theta_W$). The random components were set to be orthogonal (i.e., $\boldsymbol{\eta}_{B \cdot g} \perp \boldsymbol{\varepsilon}_{B \cdot g} \perp \boldsymbol{\eta}_{W \cdot ig} \perp \boldsymbol{\varepsilon}_{W \cdot ig}$).

Equation (3) specifies two sources of random variation for the observed variables, within-group (i.e., within-level) variation and between-group (i.e., between-level) variation, reflecting the nature of complex survey data, rather than just one overall random source.

As a result, the variance-covariance matrix of $\mathbf{y}_{ig}$ may be further rewritten as

$$\begin{aligned}
\mathrm{Cov}(\mathbf{y}_{ig}) &= \mathrm{Cov}\!\left(\boldsymbol{\mu}_B + \Lambda_B \boldsymbol{\eta}_{B \cdot g} + \boldsymbol{\varepsilon}_{B \cdot g} + \boldsymbol{\mu}_W + \Lambda_W \boldsymbol{\eta}_{W \cdot ig} + \boldsymbol{\varepsilon}_{W \cdot ig}\right) \\
&= \mathrm{Cov}\!\left(\Lambda_B \boldsymbol{\eta}_{B \cdot g} + \boldsymbol{\varepsilon}_{B \cdot g}\right) + \mathrm{Cov}\!\left(\Lambda_W \boldsymbol{\eta}_{W \cdot ig} + \boldsymbol{\varepsilon}_{W \cdot ig}\right) \\
&= \Lambda_B \Psi_B \Lambda_B' + \Theta_B + \Lambda_W \Psi_W \Lambda_W' + \Theta_W \quad \text{(MCFA)} \quad (4)
\end{aligned}$$

The variance-covariance matrix of the indicators is a function of the random effects and fixed effects in both the between- and within-level models. Using the multilevel CFA model, the total variance-covariance of the observations may be expressed as a combination of three components in two levels: (a) the factor loadings between indicators and latent factors ($\Lambda_B$ and $\Lambda_W$), (b) the latent factor variances and covariances (the explained portion of the observed variance, $\Psi_B$ and $\Psi_W$), and (c) the residual variances of the indicators ($\Theta_B$ and $\Theta_W$). The single-factor intraclass correlation (ICC) of MCFA is then defined as $\mathrm{ICC} = \Psi_B(\Psi_B + \Psi_W)^{-1}$ (Muthén, 1991, 1994).
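As a worked illustration using the normalization adopted later in the simulation study, where the total factor variance is fixed at one ($\Psi_B + \Psi_W = 1$), the ICC reduces to $\Psi_B$; for example,

$$\Psi_B = 0.3,\; \Psi_W = 0.7 \;\Rightarrow\; \mathrm{ICC} = \frac{0.3}{0.3 + 0.7} = 0.3.$$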

When the maximum modeling technique is applied to analyze these two-level data, Equation (4) becomes

$$\mathrm{Cov}(\mathbf{y}_{ig}) = \Sigma_B^{\mathrm{Saturated}} + \Lambda_W \Psi_W \Lambda_W' + \Theta_W \quad \text{(Max CFA with saturated between-level structure, 5.1)}$$

or

$$\mathrm{Cov}(\mathbf{y}_{ig}) = \Lambda_B \Psi_B \Lambda_B' + \Theta_B + \Sigma_W^{\mathrm{Saturated}} \quad \text{(Max CFA with saturated within-level structure, 5.2)}$$

The full-rank variance-covariance matrix $\Sigma_B^{\mathrm{Saturated}}$ or $\Sigma_W^{\mathrm{Saturated}}$ is unstructured; that is, all the possible between-level or within-level variation of the indicators is estimated and separated from the total variance component. For the multilevel measurement model specification in this study, the unique within-level or between-level variation is then used to construct the respective within-level or between-level model with fixed and random effects, without contamination from the other level. The residual part is the portion of total variation unique to the within-level or between-level variation of the indicators. If $\Lambda_B = \Lambda_W$ (i.e., equality of factor loadings and structures holds for the between-/within-level models), the resulting factor loading estimates of the design-based approach with a one-level model are equal to the between-/within-level factor loadings in the true two-level model (i.e., $\Lambda_B = \Lambda_W = \Lambda_y$).

If we ignore the multilevel structure and construct a one-level model with the design-based approach for the multilevel dataset $\mathbf{y}$, the observed variance-covariance matrix of the indicators may be represented with the model-driven parameters as follows:

$$\mathrm{Cov}(\mathbf{y}) = \Lambda_y \Psi \Lambda_y' + \Theta_{\varepsilon} = \Lambda_y (\Psi_B + \Psi_W) \Lambda_y' + (\Theta_B + \Theta_W) \quad \text{(1-level CFA)} \quad (6)$$

With the inclusion of the ICC, Equation (4) can be further reformatted as:

$$\mathrm{Cov}(\mathbf{y}) = \Lambda_B\, \mathrm{ICC}\, \Psi \Lambda_B' + \Lambda_W (\mathbf{I} - \mathrm{ICC})\, \Psi \Lambda_W' + (\Theta_B + \Theta_W) \quad (7)$$

However, if the magnitudes of the non-zero elements in the between-/within-level factor loading matrices are not the same, the factor loading estimates of the design-based approach are a function of the true between- and within-level factor loadings and the ICC. Snijders and Bosker (2011) show that, in the univariate case, the regression coefficient of an overall model fitted to a multilevel dataset will be $\lambda_y = \mathrm{ICC}\,\lambda_B + (1 - \mathrm{ICC})\,\lambda_W$. In the MCFA case, if there is a uni-factor structure at both levels, we hypothesize that the factor loading estimates of the design-based approach can be simplified as (which is later validated by the simulation results):

$$\Lambda_y = \mathrm{ICC}\,\Lambda_B + (1 - \mathrm{ICC})\,\Lambda_W \quad (8)$$

That is, the design-based approach could yield conflated factor loading estimates ($\Lambda_y$) for complex survey data. If an indicator has more variation at the within level, its factor loading estimate from the design-based approach will be close to its within-level counterpart; if the indicator has more variation at the between level, its conflated factor loading estimate will be close to its between-level counterpart.
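As a worked illustration of Equation (8) using the population values adopted later in the simulation (ICC = 0.3; $\lambda_B = 0.78$, $\lambda_W = 0.45$ for CPCSA and $\lambda_B = 0.60$, $\lambda_W = 0.92$ for CPCAC):

$$\lambda_{y,\mathrm{CPCSA}} = 0.3(0.78) + 0.7(0.45) = 0.549, \qquad \lambda_{y,\mathrm{CPCAC}} = 0.3(0.60) + 0.7(0.92) = 0.824,$$

which are the conflated values later used as the 1MLR* reference in Table 4.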

The composite reliability with congeneric measures based on CFA can then be calculated for the above models (Raykov, 2004; Brown, 2006) using:

$$\rho = \left(\sum_{p=1}^{P} \lambda_p\right)^{2} \Bigg/ \left[\left(\sum_{p=1}^{P} \lambda_p\right)^{2} + \sum_{p=1}^{P} \Theta_p\right], \quad (9)$$

where $\lambda_p$ is the factor loading of item $p$ onto a single common factor and $\Theta_p$ is the unique variance of item $p$. When constructing a one-level model, we can insert Equation (8) into Equation (9) to obtain the reliability for the design-based model in Equation (10), which can be further expressed as a function of the between- and within-level factor loadings and errors:

$$\rho_{\text{Design-Based Approach}} = \frac{\left(\sum_{p=1}^{P} \mathrm{ICC}\,\lambda_{Bp}\right)^{2} + \left(\sum_{p=1}^{P} (1 - \mathrm{ICC})\,\lambda_{Wp}\right)^{2}}{\left(\sum_{p=1}^{P} \mathrm{ICC}\,\lambda_{Bp}\right)^{2} + \left(\sum_{p=1}^{P} (1 - \mathrm{ICC})\,\lambda_{Wp}\right)^{2} + \sum_{p=1}^{P} \left(\Theta_{Bp} + \Theta_{Wp}\right)}, \quad (10)$$

where $\lambda_{Bp}$ and $\lambda_{Wp}$ are the standardized factor loadings of item $p$ at the between and within levels, and $\Theta_{Bp}$ and $\Theta_{Wp}$ are the residual variances of item $p$ at the between and within levels. A detailed discussion of reliability measures in complex survey data with MCFA and CFA can be found in Geldhof et al. (2013).

In the following sections, the simulation study is presented to illustrate the robustness of the three SEM modeling strategies, with five model specifications, in analyzing a measurement dataset obtained from complex survey sampling. The simulation results inform the influences of the different modeling techniques on the overall exact model-fit chi-square test and various model-fit indices, information criteria, parameter and standard error estimates, and the resulting statistical inferences.

Parameter Specification for the Simulation

From a substantive-methodological synergy (Marsh and Hau, 2007) perspective, we specified the population parameters in our simulations based on the parameter estimates obtained from an empirical dataset to examine the performance of the proposed modeling approaches on multilevel measurement data.


Empirical Dataset: Measurement and Sampling

From a sample of 784 academically at-risk children participating in a longitudinal study, we selected a balanced dataset of 120 students nested within 40 classrooms, with 3 students in each class. A total of 120 students (47 females and 73 males; 39 African Americans, 38 Hispanics, 40 Caucasians, and 3 Asian/Pacific Islanders) were drawn. No evidence of selective consent for participation in the larger longitudinal study was found. Details about the recruitment and multilevel sampling procedure of the 784

participants were reported in Hughes and Kwok (2007). Their

Grade 3 Harter competence measures were used in the current study. We generated the balanced-design synthetic datasets based on the parameter estimates from the MCFA of their Harter competence measures, considering different levels of cluster sizes, cluster numbers and intraclass correlations.

The Children Perceived Competence Scale (CPC, Harter,

1982) is composed of three domain-specific competences, namely child-perceived scholastic competence (CPCSC), social acceptance (CPCSA), and athletic competence (CPCAC), as well as a general global self-worth scale (CPCSW). Item-level responses were on an ordered, categorical 4-point scale. Each subscale was measured using 7 items, for a total of 28 items. The reliability of the item-level subscales ranged from 0.75 to 0.86. We used the composite scores of each subscale to form four continuous indicators of children's general competence at both the classroom and individual levels so that the analysis results can be generalized to continuous responses.

Simulation Study: True Model Specification

In order to demonstrate the adequacy and robustness of the five different modeling approaches, we used Monte Carlo simulation to generate synthetic complex survey datasets with a known true multilevel measurement model of the CPC scale. A two-level uni-factor CFA model was first built for the empirical CPC dataset, with an overall factor of child-perceived competence including the three domain-specific subscale indicators and the general self-worth indicator at both the between and within levels (as shown in Figure 1). With Full Information Maximum Likelihood (FIML) estimation, the resulting two-level CFA had adequate model-fit test statistic and index values (χ² = 9.421 with df = 4 and p = 0.051, CFI = 0.990, RMSEA = 0.048, SRMR-Within = 0.023, SRMR-Between = 0.018). The estimates of the level-varying factor loadings were retained in the true models for simulation. The ICCs for the indicators in the empirical analysis ranged from 0.352 to 0.617. The between- and within-level factor variances were then altered to create different ICC settings in the simulation study.

Even though the between- and within-level models had equal structures, their factor loading magnitudes and patterns of significance were distinct for this empirical dataset (see the two dashed lines in Figure 1B). The unstandardized factor loading estimates from the two-level CFA analysis of the empirical dataset were used as the population values for the Monte Carlo simulation. The population values of the within-level factor loadings were 1 for scholastic competence (marker variable with standardized factor loading λ = 0.719), 0.45 for social acceptance (λ = 0.400), 0.92 for athletic competence (λ = 0.694), and 0.36 for global self-worth (λ = 0.331). At the within level, only athletic competence had a statistically significant factor loading (i.e., p ≤ 0.05). On the other hand, all the between-level factor loadings were statistically significant. The between-level factor loadings were 1 for scholastic competence (marker variable with standardized factor loading λ = 0.910), 0.78 for social acceptance (λ = 0.871), 0.60 for athletic competence (λ = 0.807), and 0.62 for global self-worth (λ = 0.816). The intercepts of the indicators were set as 2.896 for scholastic competence, 2.856 for social acceptance, 2.860 for athletic competence, and 3.268 for global self-worth. Finally, the population values of the residual variances for the classroom- and individual-level indicators were set as 0.2 and 0.5, respectively. The total factor variance was set at one ($\Psi_{CPC} = \Psi_{B\_CPC} + \Psi_{W\_CPC} = 1$), with the between- and within-level factor variances denoted $\Psi_{B\_CPC}$ and $\Psi_{W\_CPC}$. The levels of intraclass correlation (ICC) were then manipulated as $\Psi_{B\_CPC}(\Psi_{B\_CPC} + \Psi_{W\_CPC})^{-1}$. For the simulation study, the true two-level model was constructed with these empirical parameter estimates under varying conditions of cluster size (CS = 3, 30, 200), cluster number (CN = 40, 100, 300), and intraclass correlation (ICC = 0.1, 0.3, 0.5, 0.7, 0.9; Muthén, 1994) to generate 1,000 converged copies of balanced-design complex survey datasets. A total of 3(CS) × 3(CN) × 5(ICC) × 1,000(reps) = 45,000 synthetic multilevel datasets were generated.

FIGURE 1 | The multilevel CFA model with parameters from the empirical Harter dataset. (A) The true between-level model. (B) The true within-level model. **p < 0.05.
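To make the data-generation step concrete, a minimal Mplus MONTECARLO sketch for a single cell of the design (CN(CS) = 40(30), ICC = 0.3), built from the population values listed above, might look like the following. This is an illustrative sketch under stated assumptions, not the authors' actual generation script; the variable names and seed are hypothetical.

MONTECARLO:
  NAMES = cpcsc cpcsa cpcac cpcsw;
  NOBSERVATIONS = 1200;          ! CN(CS) = 40(30)
  NCSIZES = 1;
  CSIZES = 40 (30);              ! 40 clusters of size 30
  NREPS = 1000;
  SEED = 20170314;               ! hypothetical seed
ANALYSIS:
  TYPE = TWOLEVEL;
  ESTIMATOR = MLR;
MODEL POPULATION:
  %WITHIN%
  fw BY cpcsc@1 cpcsa@.45 cpcac@.92 cpcsw@.36;   ! within-level loadings
  fw@.7;                         ! within-level factor variance (ICC = .3)
  cpcsc-cpcsw@.5;                ! within-level residual variances
  %BETWEEN%
  fb BY cpcsc@1 cpcsa@.78 cpcac@.60 cpcsw@.62;   ! between-level loadings
  fb@.3;                         ! between-level factor variance
  cpcsc-cpcsw@.2;                ! between-level residual variances
  [cpcsc@2.896 cpcsa@2.856 cpcac@2.860 cpcsw@3.268];   ! intercepts
MODEL:                           ! analysis model: 2MLR (the true model)
  %WITHIN%
  fw BY cpcsc cpcsa cpcac cpcsw;
  %BETWEEN%
  fb BY cpcsc cpcsa cpcac cpcsw;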

Simulation Study: Analytical Models Specification

Five SEM models for multilevel data with robust estimation were used to analyze the synthetic datasets. For ease of differentiation, we used the following naming scheme for the five model specifications:

(1) 2MLR: the two-level model-based model and the true model (Figures 1A,B).

(2) 1MLR: the one-level design-based model (Figure 1B).

(3) 2MaxB: the two-level maximum model with a saturated model at the between level (Figure 2) and the true model at the within level (Figure 1B). Exemplary Mplus syntax for the 2MaxB model is provided in the Appendix for the reader's reference.

(4) 2MaxW: the two-level maximum model with the true model at the between level (Figure 1A) and a saturated model at the within level (Figure 2).

(5) 2Miss: the mis-specified two-level model, constructed as in Figures 1A,B by constraining the factor loading estimates of the between and within levels to be equal. This mis-specified model was used to test whether the model-based approach is robust in detecting trivial items and to examine whether this model performs the same as the design-based approach (i.e., 1MLR).

Two Mplus built-in routines were employed for the statistical modeling (Muthén and Muthén, 2012). First, the TYPE = TWOLEVEL routine, which allows level-specific specifications for complex survey data, was used for 2MLR, 2MaxB, 2MaxW, and 2Miss. Second, TYPE = COMPLEX was used as the design-based approach, where only a single-level model is estimated (i.e., 1MLR) for complex survey data. By default, both routines use the full information maximum likelihood (FIML) parameter estimator and the robust standard error estimator; in Mplus, this procedure is called maximum likelihood estimation with robust standard error correction (MLR), which is useful for non-normal and non-independent observations (Muthén and Satorra, 1995). Instead of using the inverse of the information matrix as the sampling variance estimate under a normality assumption, an asymptotically consistent estimate of the covariance matrix is derived directly from the observations by including a scaling matrix between two copies of the Hessian matrix; this is then used to compute the robust estimate of the sampling variance, which is the square of the standard error (Huber, 1967; White, 1980; Hardin and Hilbe, 2007). The chi-square test statistic reported using MLR is asymptotically equivalent to the Yuan-Bentler T2* test statistic (Muthén and Muthén, 2012). We compared the performance of each model in terms of the simulation convergence rate (CR), model-fit test statistic and fit indices, information criteria (i.e., AIC, BIC, and adjusted BIC), and the estimates of the between-/within-level factor loadings, scale reliabilities, residual variances, and mean structure, as well as their 95% coverage rates and empirical power.

FIGURE 2 | The saturated model.

Level-specific scale reliability was calculated based on Geldhof et al. (2013) using Equation (9), decomposing the variance of an item into its individual component and its cluster component.
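To illustrate what the 2MaxB specification looks like in practice, a minimal Mplus sketch is shown below. The authors' exemplary 2MaxB syntax is provided in their Appendix; the sketch here is an assumed reconstruction, and the data file, variable, and cluster-identifier names are hypothetical.

TITLE:    2MaxB sketch - true within-level factor, saturated between level;
DATA:     FILE = harter_synthetic.dat;      ! hypothetical file name
VARIABLE: NAMES = clusid cpcsc cpcsa cpcac cpcsw;
          USEVARIABLES = cpcsc cpcsa cpcac cpcsw;
          CLUSTER = clusid;
ANALYSIS: TYPE = TWOLEVEL;
          ESTIMATOR = MLR;
MODEL:
  %WITHIN%
  fw BY cpcsc cpcsa cpcac cpcsw;            ! true within-level measurement model
  %BETWEEN%
  cpcsa cpcac cpcsw WITH cpcsc;             ! saturate the between level:
  cpcac cpcsw WITH cpcsa;                   ! estimate all pairwise covariances
  cpcsw WITH cpcac;                         ! (variances are estimated by default)
OUTPUT:   STANDARDIZED;

The 2MaxW counterpart reverses the roles of the two levels: the true factor model is placed under %BETWEEN% and the observed-variable covariances are freed under %WITHIN%.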

RESULTS

Convergence Rate of Simulations, Model Fit Test Statistic, Fit Indices and Information Criteria

For ease of illustration, we selected the results of the simulation conditions with the smallest cluster number (CN = 40 with CS = 3, 30, 200) and the largest sample size combination (CN = 300 with CS = 200) in Figure 3. When CN was larger than 40, the five modeling techniques achieved convergent results across the different ICC conditions. Nevertheless, with a cluster number of 40, the convergence rate varied with the ICC values: 2Miss and 1MLR reached 100% convergence for all ICCs, but 2MLR, 2MaxB, and 2MaxW had 9.5∼38.2% non-convergent simulation results when ICC was smaller than 0.3 or larger than 0.7. For instance, in the smallest case of CN(CS) = 40(3), the CR pattern of the five modeling techniques differed with the ICC values: 1MLR and 2Miss reached perfect convergence in all ICC conditions; the CR for 2MaxW exhibited a quadratic pattern, which increased with ICC, leveled off, and reached 100% when ICC ≥ 0.5, while 2MaxB demonstrated the reversed pattern; 2MLR had a downward-U quadratic pattern of CRs versus ICCs, with the peak at ICC = 0.5. According to the error messages, the non-convergence of the 2-level models was mostly due to a non-positive definite first-order derivative product matrix resulting from an insufficient portion of variance at the within or between level, especially in the smaller sample size conditions.

All models yielded significant chi-square exact test results but adequate CFI, RMSEA, and SRMR-W values (e.g., CFI > 0.90, RMSEA < 0.08, and SRMR < 0.08; Hu and Bentler, 1999) in all simulation conditions. However, for SRMR-B, 2Miss consistently demonstrated badness of fit across most of the simulation conditions. In particular, the SRMR-Bs of 2Miss showed a quadratic, downward-U pattern that peaked between ICC = 0.5 and 0.7 for models with a sample size equal to or greater than CN(CS) = 40(30). The results suggested that 2Miss lacked fit to the multilevel measurement dataset with level-varying parameters.

FIGURE 3 | Plots of selected analytical outputs of ICC against fit statistics across different modeling strategies. CN, Cluster number; CS, Cluster size; CR, Convergence rate of simulations; ICC, Intraclass correlation. 1MLR, the one-level design-based model; 2MLR, the two-level model-based model and the true model; 2MaxB, the two-level maximum model with saturated model in between level and true model in within level; 2MaxW, the two-level maximum model with true model in between level and saturated model in within level; 2Miss, the mis-specified two-level model constructed by constraining the factor loading estimates of the between and within levels to be the same.

Across all simulation conditions, the four 2-level models consistently generated smaller AIC and adj. BIC values than 1MLR. The average difference in AIC and adj. BIC between the 2-level models and 1MLR was larger than 20 in all simulation cases, even for the smallest sample size conditions (e.g., for [CN(CS), ICC] = [40(3), 0.1], AIC_1MLR = 1,373.41 vs. AIC_2MaxB = 1,355.38, and adj. BIC_1MLR = 1,368.92 vs. adj. BIC_2MaxB = 1,347.15). AIC and adj. BIC thus preferred the model-based approaches over the design-based approach across all simulation settings. BIC could distinguish the 2-level models from 1MLR in most simulation conditions, but not in the conditions with the smallest sample, CN(CS) = 40(3), at ICC < 0.3.

Estimation of Fixed Effects

The parameter estimates for [CN(CS), ICC] = [300(200), 0.3], [40(30), 0.3], and [40(3), 0.3] are summarized in Tables 1–3. In addition, the relative and absolute bias values of the estimated factor loadings of CPCSA and CPCAC are tabulated in Table 4. (Because CPCSC is the marker variable, its factor loading was constantly fixed at one for all the analytical models; the bias measures for CPCSW had the same pattern as those for CPCSA, so those statistics are likewise not presented.)

TABLE 1 | Simulated unstandardized results of SEM techniques on the synthetic Harter competence dataset for [CN(CS), ICC] = [300(200), 0.3].

TABLE 2 | Simulated unstandardized results of SEM techniques on the synthetic Harter competence dataset for [CN(CS), ICC] = [40(30), 0.3].

TABLE 3 | Simulated unstandardized results of SEM techniques on the synthetic Harter competence dataset for [CN(CS), ICC] = [40(3), 0.3].

Notes for Tables 1–3: Ψ_B_CPC and Ψ_W_CPC are the between-/within-level factor variances. Normal font indicates fixed-effect and intercept estimates; italics indicate random-effect estimates. Est, estimate; SE, standard error; 95%, 95% confidence interval coverage rate; Sig., empirical power. **p < 0.05. The standardized results can be requested from the authors.


TABLE 4 | The relative bias and absolute bias of factor loading estimates from five SEM modeling techniques for ICC = 0.3.

                     Within level                                      Between level
                     CPCSA                  CPCAC                      CPCSA                  CPCAC
CN(CS)    Model      Bias(%)  Abs(Bias)(%)  Bias(%)  Abs(Bias)(%)      Bias(%)  Abs(Bias)(%)  Bias(%)  Abs(Bias)(%)
40(3)     2MLR         0.72     24.58         3.52     21.88            19.18     53.69        13.41     52.32
          2MaxB        5.21     24.24         7.35     20.57
          2MaxW                                                         19.55     54.57        11.11     49.29
          2Miss       13.13     23.66        −8.84     17.47           −34.74     35.32        39.78     41.25
          1MLR        20.79     28.16       −12.02     19.03
          1MLR*       −0.99     19.13        −1.77     17.03
40(30)    2MLR         0.23      6.17         0.19      5.27            14.63     39.07         3.55     30.88
          2MaxB        0.22      6.17         0.19      5.27
          2MaxW                                                         15.48     39.94         3.56     30.87
          2Miss        1.59      6.23        −1.43      5.37           −41.39     41.39        51.14     51.14
          1MLR        18.49     19.74       −13.55     14.92
          1MLR*        2.48     12.03        −2.03      7.75
300(200)  2MLR        −0.02      0.89         0.02      0.72             0.62      8.76         0.57      9.41
          2MaxB       −0.02      0.89         0.02      0.72
          2MaxW                                                          0.62      8.76         0.57      9.41
          2Miss        0.19      0.90        −0.22      0.75           −42.20     42.20        52.99     52.99
          1MLR        18.67     18.67       −14.01     14.01
          1MLR*       −2.73      3.87        −3.92      4.63

Bias: relative bias = (Estimate − Parameter)/Parameter; Abs(Bias) = |Estimate − Parameter|/Parameter. The within-level parameter values for the 2-level models and 1MLR are λ_CPCSA = 0.45 and λ_CPCAC = 0.92, and the between-level parameter values are λ_CPCSA = 0.78 and λ_CPCAC = 0.60, from the population two-level model. 1MLR* presents the bias measures with respect to the true conflated parameter values from Equation (8): λ_CPCSA = 0.549, λ_CPCAC = 0.824.

CPCSC was the marker variable, so its factor loading was fixed at one in all the analytical models. CPCSW and CPCSA had the same pattern of bias; therefore, we present the results for CPCSA and CPCAC only. Relative bias (RB) is calculated as the parameter estimate minus the population value, divided by the population value; it quantifies the degree to which a parameter estimate deviates from its population value. An RB of zero reflects an unbiased estimate, a negative value indicates underestimation, and a positive value indicates overestimation. According to Flora and Curran (2004), an RB of less than 5% is considered trivial, between 5 and 10% moderate, and greater than 10% substantial. Absolute bias (AB) is the absolute value of RB; it is always positive and can be accumulated to reflect the total amount of bias. Across simulation settings, the 2MLR, 2MaxB, and 2MaxW models tended to generate factor loading estimates consistent with the population values at their respective levels, and the empirical results were consistent with the mathematical derivations [e.g., Equations (4), (5.1), and (5.2)]. Generally, as shown in Table 4, ABs were larger than their RB counterparts at smaller CN and CS, but as CN and CS increased, the discrepancy between RB and AB shrank. The RBs and ABs of the parameter estimates also became smaller as sample size increased for the 2-level models, except that 2Miss consistently generated biased between-level loading estimates across all sample size settings.
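To make these two criteria concrete, the following minimal Python sketch (our own illustration, not code from the article) computes RB and AB for a set of replication estimates. It assumes RB is the signed relative bias averaged over replications and AB averages the per-replication absolute relative bias, which is consistent with AB exceeding |RB| in Table 4; the estimates used below are hypothetical.

```python
import numpy as np

def relative_bias(estimates, parameter):
    """Mean signed relative bias (RB) across replications, in percent."""
    est = np.asarray(estimates, dtype=float)
    return 100.0 * float(np.mean((est - parameter) / parameter))

def absolute_bias(estimates, parameter):
    """Mean absolute relative bias (AB) across replications, in percent."""
    est = np.asarray(estimates, dtype=float)
    return 100.0 * float(np.mean(np.abs(est - parameter) / parameter))

# Hypothetical replication estimates of a loading whose population value is 0.45
reps = [0.41, 0.52, 0.47, 0.39, 0.46]
print(relative_bias(reps, 0.45), absolute_bias(reps, 0.45))
```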

On the other hand, 1MLR and 2Miss tended to generate conflated estimates of the factor loadings, consistent with Equation (7). Taking the smallest sample size condition as an example ([CN(CS), ICC] = [40(3), 0.3]), compared with the within-level fixed effects in the population model, substantial relative bias was found in the factor loading estimates of 1MLR and 2Miss, ranging from −12.02 to 20.79%. In contrast, negligible relative bias was found in the 2MLR and 2MaxB models, ranging from 0.72 to 7.35% (e.g., λ_CPCSA,W_CPC(True model) = 0.450, λ̂_CPCSA,W_CPC(2MLR) = 0.453, and λ̂_CPCSA,W_CPC(2MaxB) = 0.474 vs. λ̂_CPCSA,CPC(1MLR) = 0.607 and λ̂_CPCSA,W_CPC(2Miss) = 0.509). We also compared the factor loading estimates of 1MLR with their theoretical conflated values obtained from Equation (8) with ICC = 0.3 (e.g., λ_CPCSA,CPC(1MLR) = 0.549) and present these biases in the 1MLR* row of Table 4. Relative to these conflated values, 1MLR generated negligible biases, although they grew somewhat as sample size increased (e.g., the relative bias ranged from −0.99 to −3.92%). Compared with the between-level fixed effects in the population model, the 2MLR and 2MaxW models yielded considerable relative and absolute biases at CN = 40 (the relative bias ranged from 3.55 to 19.55%; the absolute bias ranged from 30.87 to 54.57%).
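As an arithmetic check (our own, assuming that Equation (8) reduces to an ICC-weighted average of the between- and within-level population loadings), the 1MLR* reference values reported above can be reproduced at ICC = 0.3:

```python
# Assumed simplification of Equation (8): conflated loading = ICC-weighted average
# of the between- and within-level population loadings (ICC = 0.30 here).
icc = 0.30
lam_cpcsa = icc * 0.78 + (1 - icc) * 0.45   # between = 0.78, within = 0.45 -> 0.549
lam_cpcac = icc * 0.60 + (1 - icc) * 0.92   # between = 0.60, within = 0.92 -> 0.824
print(round(lam_cpcsa, 3), round(lam_cpcac, 3))
```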

To further investigate the relationship between the factor loading estimates and sample size (i.e., CN × CS), we plotted the between- and within-level λ̂ of CPCSA and CPCAC in boxplots for ICC = 0.3 in Figure 4. The dispersion of the parameter estimates of the five models decreased as sample size increased. When sample size was small, the dispersion of 2MLR was larger than that of 2MaxB/2MaxW. Across all combinations of cluster number and cluster size, 2MLR and 2MaxB/2MaxW produced median estimates consistent with their parameters. However, the 1MLR models generated conflated parameter estimates that regressed toward the weighted means of the true factor loadings from the between- and within-level models (the true within-level λ_CPCAC,W_CPC(True model) = 0.920, the true between-level λ_CPCAC,B_CPC(True model) = 0.600, and the conflated parameter λ_CPCAC,CPC(1MLR) = 0.824, vs. the estimate λ̂_CPCAC,CPC(1MLR) = 0.791; λ_CPCSA,W_CPC(True model) = 0.450, λ_CPCSA,B_CPC(True model) = 0.780, and λ_CPCSA,CPC(1MLR) = 0.549 vs. λ̂_CPCSA,CPC(1MLR) = 0.534). Unlike 1MLR, the 2Miss models produced within-level factor loading estimates as consistent and efficient as those of the 2MLR and 2MaxB models once the total sample size reached 1,200 [i.e., CN(CS) = 40(30)]; however, the 2Miss models generated biased parameter estimates in the between-level model across all sample size conditions.

The Conflated Factor Loading Estimates in Design-Based Models as ICC Changes

To probe the consequences of applying the design-based approach to complex survey data, we plotted the factor loading estimates from the simulations (solid lines) and those from the mathematical derivation in Equation (8) (dashed lines) against different ICC values in Figure 5. As expected from the derivation, the factor loading estimates of the design-based model approached the true between-level values as ICC increased. Even though these estimates were supposed to reflect within-level information, they were conflated across all simulated ICC values, except for ICC = 0.
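The same ICC-weighted-average reading of Equation (8) (again an assumption on our part about its reduced form) reproduces this trend: at ICC = 0 the conflated loading equals the within-level value, and it moves toward the between-level value as ICC grows. A minimal sketch for CPCSA:

```python
def conflated_loading(lam_between, lam_within, icc):
    """ICC-weighted average of the level-specific loadings (assumed reduced form of Eq. 8)."""
    return icc * lam_between + (1.0 - icc) * lam_within

# CPCSA: within-level population loading 0.45, between-level population loading 0.78
for icc in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"ICC = {icc:.1f}: conflated loading = {conflated_loading(0.78, 0.45, icc):.3f}")
# Output starts at the within-level value (0.450) and approaches the between-level value (0.780).
```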

Estimation of Random Effects

In terms of the factor variance, the four 2-level models yielded consistent random effect estimates (e.g., for [40(3), 0.3], in the between level: Ψ̂_B_CPC(2MLR) = 0.337, Ψ̂_B_CPC(2MaxW) = 0.342, and Ψ̂_B_CPC(2Miss) = 0.338; in the within level: Ψ̂_W_CPC(2MLR) = 0.723, Ψ̂_W_CPC(2MaxB) = 0.654, and Ψ̂_W_CPC(2Miss) = 0.720). The performance of 1MLR, however, was not as consistent as that of the 2-level models in estimating the random effects. Specifically, the factor variance estimate of 1MLR was 1.061, which was roughly the sum of the population between- and within-level factor variances, as shown in Equation (6); the substantial relative bias reached 51.57%. The 1MLR also yielded a single overall estimate for each residual variance (i.e., the residuals of Equation 7), whereas the three 2-level models that specify a within-level measurement model produced fair within-level residual variance estimates (e.g., θ̂_CPCSW(1MLR) = 0.687 vs. θ_CPCSW,Within(True model) = 0.500, θ̂_CPCSW,Within(2MLR) = 0.487, θ̂_CPCSW,Within(2MaxB) = 0.484, and θ̂_CPCSW,Within(2Miss) = 0.484).
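A brief numerical sketch of this variance conflation (our own illustration; the population factor variances of 0.30 and 0.70 are assumed values chosen to be consistent with ICC = 0.3 and with the 51.57% relative bias reported above, not quantities quoted directly from the article):

```python
# Equation (6)-style variance conflation in a single-level analysis.
# Assumed population factor variances: between = 0.30, within = 0.70 (ICC = 0.30).
psi_between, psi_within = 0.30, 0.70

icc = psi_between / (psi_between + psi_within)       # 0.30
psi_conflated = psi_between + psi_within             # what a 1-level model targets, ~1.00

estimate_1mlr = 1.061                                # reported 1MLR factor variance estimate
relative_bias = 100 * (estimate_1mlr - psi_within) / psi_within
print(icc, psi_conflated, round(relative_bias, 2))   # 0.3, 1.0, 51.57
```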

Mean Structures

As for the mean structure, all examined models yielded consistent mean/intercept estimates and comparable statistical inferences, as shown in Tables 1–3.

The 95% Confidence Interval Coverage Rate and Empirical Power of Estimates

With the conflated estimates of fixed and random effects, the 95% confidence interval coverage rate (95%; the empirical proportion of replications in which the 95% confidence interval of the estimate contained the true population parameter value) of 1MLR and 2Miss tended to be much smaller than its nominal level. In terms of empirical power (Sig.; the average rate of significant estimates over total replications), the empirical power for the three factor loading estimates was equal or close to 1 in 1MLR and 2Miss (e.g., for [40(3), 0.3], λ̂_CPCSA,CPC(1MLR) = 0.659, 95% = 0.891, Sig = 0.988 in Table 3). In contrast, in the 2MLR and 2MaxB models, the empirical power of λ̂_CPCSA,W_CPC and λ̂_CPCSW,W_CPC was close to 0.8 (e.g., λ̂_CPCSW,W_CPC(2MLR) = 0.358, 95% = 0.944, Sig = 0.796; λ̂_CPCSW,W_CPC(2MaxB), 95% = 0.951, Sig = 0.819). In the true model, these two factor loadings were specified as non-zero but small effects that would not reach statistical significance at small sample sizes; with the small sample size settings of the simulation, such small effects were expected to yield an empirical rate of significant estimates below the nominal power level of 0.8 (Eng, 2003). Results of the 2MLR and 2MaxB models were consistent with the population model, in which only the empirical power for the factor loading of individual-level athletic competence (CPCAC) far exceeded 0.8, but not those for social acceptance (CPCSA) and self-worth (CPCSW).
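As a concrete reading of these two criteria, the sketch below (our own illustration, assuming Wald-type significance tests with a critical z of 1.96; the study's exact test may differ) computes the 95% coverage rate and empirical power from per-replication estimates and standard errors. The replication values shown are hypothetical.

```python
import numpy as np

def coverage_95(estimates, std_errors, true_value, z=1.96):
    """Proportion of replications whose 95% CI (est ± z*SE) contains the true parameter."""
    est, se = np.asarray(estimates, float), np.asarray(std_errors, float)
    return float(np.mean((est - z * se <= true_value) & (true_value <= est + z * se)))

def empirical_power(estimates, std_errors, z=1.96):
    """Proportion of replications in which the estimate differs significantly from zero."""
    est, se = np.asarray(estimates, float), np.asarray(std_errors, float)
    return float(np.mean(np.abs(est / se) > z))

# Hypothetical replication output for one factor loading whose true value is 0.45
est = [0.41, 0.52, 0.47, 0.39, 0.46]
se = [0.12, 0.15, 0.11, 0.13, 0.12]
print(coverage_95(est, se, 0.45), empirical_power(est, se))
```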

Variance Explained and Scale Reliability of Indicators

In the two-level models, the variables are partitioned into level-1 and level-2 components, so the R² computed under these approaches should be interpreted as the proportion of variance in each within-group component accounted for by the lower-level model and the proportion of variance in each between-group component accounted for by the higher-level model; in 1-level MLR, R² is the proportion of variance in each indicator accounted for by an overall model in which the variance composition is confounded by components from both levels. Taking [CN(CS), ICC] = [40(3), 0.3] as an example (Table 5), 1MLR tended to generate inflated R² measures, especially for indicators with smaller within-level but larger between-level factor loadings, as did the 2Miss model (e.g., R̂²_CPCSA(1MLR) = 0.467 and R̂²_CPCSA,Within(2Miss) = 0.278 vs. R̂²_CPCSA,Within(2MLR) = 0.175 and R̂²_CPCSA,Within(2MaxB) = 0.176). As for the between level, 2MaxW provided R̂² consistent with that of 2MLR, but 2Miss generated a biased estimate (R̂²_CPCSA,Between(2MaxW) = 0.746 and R̂²_CPCSA,Between(2MLR) = 0.740 vs. R̂²_CPCSA,Between(2Miss) = 0.623).

FIGURE 4 | Boxplots of selected factor loading estimates vs. sample size conditions. The red dots in the boxes indicate the means of the factor loading estimates. The red dashed lines indicate the parameter settings at the respective levels.

As for the scale reliability, the 2MaxB and 2MaxW models yielded reliability measures consistent with those of 2MLR at their respective levels, whereas 1MLR and 2Miss tended to underestimate the score consistency of the indicators (e.g., ρ̂_Within(2MaxB) = 0.830 ≈ ρ̂_Within(2MLR) = 0.825; ρ̂_Between(2MaxW) = 0.926 ≈ ρ̂_Between(2MLR) = 0.930 vs. ρ̂(1MLR) = 0.747, ρ̂_Within(2Miss) = 0.798, and ρ̂_Between(2Miss) = 0.915).

In summary, given their conflated estimates of fixed and random effects, the 1MLR models provided overestimated variance-explained measures and underestimated reliability measures for the indicators. In contrast, the 2MaxB and 2MaxW models generated R² and ρ for the within-level and between-level indicators, respectively, that agreed with those of the 2MLR model across simulation settings.
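For readers who wish to reproduce level-specific R² and reliability from model estimates, the sketch below uses the conventional formulas for a congeneric factor model applied separately at each level (the article's exact reliability formula may differ). The CPCSW loading, the within-level factor variance, and all residual variances except CPCSW's 0.50 are assumed placeholder values; they are chosen only so the CPCSA example roughly reproduces the reported within-level R² of 0.175 and will not reproduce the article's exact reliability values.

```python
import numpy as np

def indicator_r2(lam, psi, theta):
    """Proportion of an indicator's variance explained by the factor at one level."""
    return (lam ** 2 * psi) / (lam ** 2 * psi + theta)

def composite_reliability(lams, psi, thetas):
    """Congeneric composite (omega-type) reliability at one level."""
    lams, thetas = np.asarray(lams, float), np.asarray(thetas, float)
    common = (lams.sum() ** 2) * psi
    return common / (common + thetas.sum())

# Within-level loadings from the true model (marker CPCSC fixed at 1.0; CPCSA = 0.45,
# CPCAC = 0.92); the CPCSW loading (0.36), factor variance (0.70), and most residual
# variances below are assumed for illustration only.
lams_w = [1.00, 0.45, 0.92, 0.36]
psi_w = 0.70
thetas_w = [0.40, 0.67, 0.45, 0.50]

print(round(indicator_r2(0.45, psi_w, 0.67), 3))           # ~0.175, close to the reported 2MLR R2
print(round(composite_reliability(lams_w, psi_w, thetas_w), 3))
```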

DISCUSSION AND CONCLUSION

As researchers call for the need to adequately take into account the multilevel structure of social and behavioral data (Skinner et al., 1997; Lee and Forthofer, 2006), the use of multilevel data modeling techniques is inevitable. However, multilevel models are not an infallible statistical strategy unless the hypothesized model conforms to the real data structure. In this study, we demonstrated that maximum models are robust analytic methods with respect to the inequality of higher- and lower-level factor loadings and to the detection of possibly non-significant trivial items, especially when researchers have limited information about the significance pattern of factor loadings and level-varying measurement structures. The current study focuses on multilevel CFA, which is a generic form of structural equation modeling; therefore, the results can be generalized to more complex models.

Specifically, we examined the performance of five proposed SEM techniques for analyzing complex survey data with unequal factor loadings under equal between- and within-level structures. Across different combinations of cluster numbers, cluster sizes, and ICC values, all models yielded acceptable model-fit information. AIC and adjusted BIC could differentiate 1MLR from the 2-level models but could not select the best 2-level model. Among the 2-level models, 2MLR, 2MaxB, and 2MaxW consistently generated effective and efficient parameter estimates. On the contrary, the design-based model would not be an appropriate approach for analyzing complex survey data with level-varying factor loadings.
