
3.4 Measurement development

3.4.3 Validation

Unidimensional item response theory (UIRT) was applied to obtain proficiency estimates for each test.

For the morphological structure test, morphological production test, phoneme isolation test, reading comprehension test and word recognition test, this study did not use the raw total score of each scale. Instead, UIRT measures based on estimates of each person's latent trait, scaled to a mean of 0 and a standard deviation of 1, were used to represent those abilities.

UIRT models rest on the strong assumption that every test item measures some facet of the same underlying ability, that is, a single latent trait. A test intended to measure one particular trait should therefore not be affected by other traits, especially when only the overall test score is reported and used as an assessment criterion across ability levels.

For unidimensional constructs, the Rasch model (or one-parameter logistic model: 1PLM) and the two-parameter logistic model (2PLM) are commonly used (Embretson & Reise, 2000). Thissen and Steinberg's (1986) taxonomy classifies IRT models designed to analyze such test items as "binary models"; the 1PLM and 2PLM belong to this set. Suppose Xij represents the response of person j to item i, where Xij = 1 means that person j answered item i correctly and Xij = 0 means that person j answered it incorrectly. These two models are expressed as Equations (1) and (2), respectively.

\[
P(X_{ij} = 1 \mid \theta_j, \beta_i) = \frac{\exp(\theta_j - \beta_i)}{1 + \exp(\theta_j - \beta_i)} \qquad (1)
\]

\[
P(X_{ij} = 1 \mid \theta_j, \alpha_i, \beta_i) = \frac{\exp[\alpha_i(\theta_j - \beta_i)]}{1 + \exp[\alpha_i(\theta_j - \beta_i)]} \qquad (2)
\]

where $\theta_j$ represents the ability parameter for examinee j, and $\alpha_i$ (item discrimination) and $\beta_i$ (item difficulty) are the parameters of item i. Assessing measurement invariance across time involves checking that the item parameters $\alpha_i$ and $\beta_i$ have not changed over time.
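The two response functions in Equations (1) and (2) can be sketched directly. This is an illustrative implementation only, not the estimation routine used by ConQuest; the function names are invented for this sketch:

```python
import math

def p_1pl(theta, beta):
    """Rasch (1PL) probability that a person with ability theta
    answers an item of difficulty beta correctly (Equation 1)."""
    return math.exp(theta - beta) / (1.0 + math.exp(theta - beta))

def p_2pl(theta, alpha, beta):
    """2PL probability, adding an item discrimination parameter alpha (Equation 2)."""
    z = alpha * (theta - beta)
    return math.exp(z) / (1.0 + math.exp(z))

# When ability equals difficulty, both models give a 50% chance of success.
print(round(p_1pl(0.0, 0.0), 2))       # 0.5
print(round(p_2pl(1.0, 1.5, 1.0), 2))  # 0.5
```

Note that in the 1PLM all items share the same slope, whereas in the 2PLM a larger $\alpha_i$ makes the curve steeper around the item's difficulty.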

For this study, all items within a test were assumed to be equally discriminating, so the unidimensional Rasch model (one-parameter logistic model: 1PLM) was applied. The ConQuest software (Adams, Wu, & Wilson, 2012) was used to estimate students' abilities in morphological structure, morphological production, phoneme isolation, reading comprehension and word recognition, as measured by the respective scales presented in Table 3.

In Rasch analysis, item fit is reported for individual items by the mean square (MNSQ) statistic. The MNSQ statistic is sensitive to the response patterns of persons whose ability estimates match an item's difficulty estimate. Overfit indicates that the observations contain less variance than the model predicts; underfit indicates more variance in the observations than the model predicts (e.g., the presence of idiosyncratic groups) (Wilson, 2005). The weighted and un-weighted MNSQ differ in that the weighted MNSQ weighs persons performing closer to the item value more heavily; therefore, persons whose ability closely matches an item's difficulty level are weighted more heavily than those whose ability does not (Bond & Fox, 2001). Bond and Fox (2001) recommend that Rasch modellers pay more attention to the weighted MNSQ. According to Adams and Khoo (1996), items with adequate fit have a weighted MNSQ between .75 and 1.33; this range was applied to judge item fit in this study.
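As an illustration of how the un-weighted (outfit) and weighted (infit) MNSQ statistics relate, a minimal sketch follows; the data are invented, the function name is made up for this example, and ConQuest computes these statistics internally:

```python
def mnsq_fit(responses, probs):
    """Un-weighted (outfit) and weighted (infit) mean-square fit for one item,
    given 0/1 responses and model-implied success probabilities.
    z2 is the squared standardized residual (x - p)^2 / [p(1 - p)];
    outfit averages z2 over persons, infit weights each person by the
    information p(1 - p), so persons near the item's difficulty count more."""
    info = [p * (1 - p) for p in probs]
    z2 = [(x - p) ** 2 / w for x, p, w in zip(responses, probs, info)]
    outfit = sum(z2) / len(z2)                                 # un-weighted MNSQ
    infit = sum(w * z for w, z in zip(info, z2)) / sum(info)   # weighted MNSQ
    return outfit, infit

# A broadly well-fitting response pattern for six persons on one item.
outfit, infit = mnsq_fit([1, 0, 1, 1, 0, 0], [0.9, 0.7, 0.6, 0.5, 0.3, 0.1])
print(0.75 <= infit <= 1.33)  # True: within Adams and Khoo's (1996) range
```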

Given that abilities on the morphological structure test, morphological production test, phoneme isolation test, reading comprehension test and word recognition test were estimated with UIRT measures, the Rasch reliability (EAP/PV) was used to determine the reliability of those tests. ConQuest software (Adams, Wu, & Wilson, 2012) was used to generate the EAP/PV reliability of the scales, which indicates the extent to which the observed total variance is accounted for by the model variance; it shows "how well the sample of subjects had spread the items along the measure of the test" (Fisher, 1992). The EAP/PV reliability is analogous to Cronbach's alpha and can be interpreted similarly, where the minimum acceptable cut-off level for Cronbach's alpha is 0.50 (Portney & Watkins, 2000).
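A common textbook approximation to an EAP reliability of this kind takes the variance of the EAP ability estimates over that variance plus the mean posterior (error) variance. The sketch below uses invented numbers and an invented function name; it is not ConQuest's exact routine:

```python
def eap_reliability(eap_estimates, posterior_vars):
    """Approximate EAP reliability: the share of observed variance
    accounted for by the model, i.e. variance of the EAP estimates
    divided by that variance plus the mean posterior (error) variance."""
    n = len(eap_estimates)
    mean = sum(eap_estimates) / n
    var_eap = sum((x - mean) ** 2 for x in eap_estimates) / n
    mean_err = sum(posterior_vars) / n
    return var_eap / (var_eap + mean_err)

rel = eap_reliability([-1.2, -0.4, 0.1, 0.5, 1.0], [0.2, 0.25, 0.22, 0.2, 0.23])
print(rel >= 0.50)  # True: meets the minimum acceptable cut-off
```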

For the rapid word segmentation test, rapid colour naming test and rapid number naming test, which are speed tests, log-transformed scores were used instead of the raw score of each test. Because the scores of these three scales were expected to follow a normal distribution, and consistent with the previous study of Boscardin, Muthén and Francis (2008), the raw scores from the three tests were log-transformed to approximate a normal distribution before the analyses. The log transformation described by Boscardin, Muthén, and Francis (2008) uses the following formula:

\[
LTS = \log_2\left\{ COR\left(\frac{TTS}{TIR} + 0.1\right) \right\}
\]

where:

LTS: log-transformed score
COR: correct responses
TIR: response time
TTS: total score
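Assuming the formula reads LTS = log2{COR(TTS/TIR + 0.1)}, the transformation can be sketched as follows; the function name and sample values are illustrative only:

```python
import math

def log_transformed_score(cor, tir, tts):
    """Log-transformed score (LTS) for a speed test.
    cor: number of correct responses (COR)
    tir: response time (TIR)
    tts: total score of the test (TTS)
    Assumes the reading LTS = log2{COR * (TTS / TIR + 0.1)}."""
    return math.log2(cor * (tts / tir + 0.1))

# Faster completion (smaller TIR) with the same accuracy yields a higher LTS.
fast = log_transformed_score(cor=18, tir=30.0, tts=20)
slow = log_transformed_score(cor=18, tir=60.0, tts=20)
print(fast > slow)  # True
```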

The rapid colour naming and rapid number naming tests were administered repeatedly at two time points to measure each characteristic, so the consistency of these measurements over time was used to determine their reliability; in other words, the test-retest reliabilities of the rapid colour naming and rapid number naming tests are reported in this study. The reliability of the rapid word segmentation test was measured in terms of its internal consistency; that is, Cronbach's alpha was applied. Table 4 presents the methods used to determine the validity and reliability of all tests in this study.
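Both reliability checks can be sketched with standard formulas: a Pearson correlation between scores at the two time points for test-retest reliability, and Cronbach's alpha for internal consistency. Function names and data below are illustrative:

```python
def pearson_r(x, y):
    """Test-retest reliability as the Pearson correlation between
    the same persons' scores at two time points."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cronbach_alpha(items):
    """Internal consistency from item-score columns
    (one list of scores per item, persons in the same order)."""
    k = len(items)
    def var(v):
        m = sum(v) / len(v)
        return sum((a - m) ** 2 for a in v) / len(v)
    totals = [sum(col) for col in zip(*items)]
    return k / (k - 1) * (1 - sum(var(i) for i in items) / var(totals))

# Invented log-transformed scores for five children at two time points.
time1 = [2.1, 2.8, 3.0, 3.5, 4.1]
time2 = [2.0, 2.9, 3.2, 3.4, 4.0]
print(round(pearson_r(time1, time2), 2))  # 0.98
```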

Table 4

Validation of each test.

Tests                              Ability score      Reliability
Cognitive components
  Morphological awareness
    Morphological structure test   UIRT               EAP/PV
    Morphological production test  UIRT               EAP/PV
  Decoding skills
    Phoneme isolation test         UIRT               EAP/PV
    Rapid word segmentation test   Log-transformed    Cronbach's alpha
  RAN
    Rapid colour naming            Log-transformed    Test-retest
    Rapid number naming            Log-transformed    Test-retest
Reading abilities
  Reading comprehension test       UIRT               EAP/PV
  Word recognition test            UIRT               EAP/PV