
CHAPTER 3 METHOD

3.3 Study 3: Real Data Example with DIF Detection in a Framework of

3.3.1 Data Description

Data were taken from Booklets 4 and 5 of the TIMSS 2007 fourth-grade mathematics assessment, which together consist of 25 items: 15 multiple-choice items and 10 constructed-response items (Foy & Olson, 2009). The TIMSS mathematics items were designed to reflect the international mathematics curriculum; surveys were sent to each participating country asking about its assessment objectives and whether they aligned with the design implemented by TIMSS. The TIMSS 2007 mathematics items were developed around two main sets of domains: (1) content domains, comprising Number, Geometric Shapes and Measures, and Data Display; and (2) cognitive domains, comprising Knowing, Applying, and Reasoning. TIMSS releases selected items and groups of examinees in its released dataset. In the present study, Booklets 4 and 5 were chosen for three reasons. First, they contain the greatest number of dichotomously scored items, which the RHO-RDINA and RHO-RDINO models require. Second, the selected dataset conforms to the overall domains created by the test developers. Third, the same dataset was used to fit the DINA model in a previous study (Lee, Park & Taylan, 2011).

In large-scale assessments, examinees typically complete only one or two booklets, so the number of examinees from a single country who completed both Booklets 4 and 5 may be too small to analyze. Thus, a total of 858 fourth-grade examinees from two countries (Taiwan and the United States) who took Booklets 4 and 5 of the TIMSS 2007 fourth-grade mathematics assessment were used (452 girls and 406 boys). The original Q-matrix developed by Lee et al. (2011) for Booklets 4 and 5 included 15 attributes. The Q-matrix was developed based on the TIMSS 2007 mathematics framework (Mullis et al., 2005). The attribute descriptions for the item content Q-matrix are listed in Table 3.3.

However, because only 25 items were analyzed, using too many attributes could leave some attributes measured by only a single item. Thus, the present study adopted the italicized headings in the attributes column of Table 3.3 as attributes, condensing the number of attributes from 15 to 9. The entries of the item content Q-matrix are listed in Table 3.4. In the structure of the Q-matrix, all items measured at least one attribute, and some items measured two or three attributes (a so-called ‘complex structure’). Von Davier (2005) suggested that, for relative model fit, Akaike’s information criterion (AIC) and a corrected Bayesian information criterion (BIC) can be used to compare models that are not nested. The AIC and BIC were used as data-model fit indices to determine which of the two proposed modified models better fit the real dataset.
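As a rough illustration of how the two indices rank non-nested models, the sketch below computes AIC and BIC from a log-likelihood and a parameter count, with the lower value indicating better relative fit. It uses the standard BIC penalty; the text refers to "a corrected" BIC without giving the correction, so that form is an assumption. The model names, log-likelihoods, and parameter counts are hypothetical placeholders, not results from this study; only the sample size of 858 comes from the text.

```python
import math

N_EXAMINEES = 858  # sample size reported in the text


def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_examinees):
    """Standard Bayesian information criterion: -2 log L + k ln N.
    Assumption: the 'corrected' BIC mentioned in the text may differ."""
    return -2.0 * log_likelihood + n_params * math.log(n_examinees)


def best_model(fits, criterion):
    """Return the name of the model with the smallest criterion value.
    `fits` maps a model name to (log_likelihood, n_params)."""
    return min(fits, key=lambda name: criterion(*fits[name]))


# Hypothetical placeholder values, for illustration only.
hypothetical_fits = {
    "model_A": (-12000.0, 60),
    "model_B": (-11990.0, 75),
}

best_by_aic = best_model(hypothetical_fits, aic)
best_by_bic = best_model(
    hypothetical_fits, lambda ll, k: bic(ll, k, N_EXAMINEES)
)
```

Because BIC's penalty grows with ln N, it penalizes additional parameters more heavily than AIC at this sample size, so the two indices can disagree when a more heavily parameterized model improves the likelihood only slightly.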

Gender differences in mathematics performance have attracted broad attention; however, the results of these studies are inconsistent (Ryan & Fan, 1996). Hyde, Fennema, and Lamon (1990) performed a meta-analysis of 100 studies and found that gender differences in mathematics performance are small. Studies investigating patterns of gender differences in mathematics found that factors such as item context and item presentation influence gender differences in mathematics performance (Harris & Carlton, 1993; Ryan & Chiu, 2001). These factors relate more to issues of test equality than to innate ability. Because analyses of large-scale assessment data are often used as guidance for practitioners, it is important to know whether item characteristics favor a particular gender group. Hence, gender DIF was investigated using the dataset.

Table 3.3 Attribute descriptions from the TIMSS 2007 framework for fourth grade mathematics

Content domain: Number

1. Whole Number
(1) Representing, comparing, and ordering whole numbers as well as demonstrating knowledge of place value.
(2) Recognize multiples, computing with whole numbers using the four operations, and estimating computations.
(3) Solve problems, including those set in real life contexts (for example, measurement and money problems).
(4) Solve problems involving proportions.

2. Fractions and Decimals
(1) Recognize, represent, and understand fractions and decimals as parts of a whole and their equivalents.
(2) Solve problems involving simple fractions and decimals including their addition and subtraction.

3. Number Sentence with Whole Numbers
(1) Find the missing number or operation and model simple situations involving unknowns in number sentence or expressions.

4. Patterns and Relationships
(1) Describe relationships in patterns and their extensions; generate pairs of whole numbers by a given rule and identify a rule for every relationship given pairs of whole numbers.

Content domain: Geometric Shapes & Measurement

5. Lines and Angles
(1) Measure, estimate, and understand properties of lines and angles and be able to draw them.

6. Two and Three dimensional Shapes
(1) Classify, compare, and recognize geometric figures and shapes and their relationships and elementary properties.
(2) Calculate and estimate perimeters, area, and volume.

7. Location and Movement
(1) Locate points in an informal coordinate to recognize and draw figures and their movement.

Content domain: Data & Display

8. Reading and Interpreting
(1) Read data from tables, pictographs, bar graphs and pie charts.
(2) Comparing and understanding how to use information from data.

9. Organizing and Representing
(1) Understanding different representations and organizing data using tables, pictographs, and bar graphs.

Note: The italicized headings in the attributes column designate the Topic Areas within the Content Domains as indicated in the TIMSS 2007 framework (Mullis et al., 2005).
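The condensation from 15 fine-grained attributes to the 9 topic areas in Table 3.3 can be sketched as follows. This is a minimal illustration under two assumptions that the text does not state: that the 15 columns of the original Q-matrix follow the order of Table 3.3, and that an item is coded as measuring a topic area whenever it measures at least one of that area's sub-attributes. The example item row is hypothetical and is not taken from Table 3.4.

```python
# Number of fine-grained attributes under each of the 9 topic areas,
# counted from Table 3.3 (4 + 2 + 1 + 1 + 1 + 2 + 1 + 2 + 1 = 15).
SUBATTRIBUTE_COUNTS = [4, 2, 1, 1, 1, 2, 1, 2, 1]


def condense_row(row15):
    """Collapse one 15-entry Q-matrix row to 9 entries.
    Assumed rule: a topic area is required (1) if any of its
    sub-attributes is required in the original row."""
    if len(row15) != sum(SUBATTRIBUTE_COUNTS):
        raise ValueError("expected a row with 15 entries")
    condensed = []
    start = 0
    for count in SUBATTRIBUTE_COUNTS:
        block = row15[start:start + count]
        condensed.append(1 if any(block) else 0)
        start += count
    return condensed


def has_complex_structure(row):
    """An item has complex structure if it measures more than one attribute."""
    return sum(row) > 1


def attribute_coverage(q_matrix):
    """Count how many items measure each attribute (column sums)."""
    return [sum(column) for column in zip(*q_matrix)]


# Hypothetical item measuring Whole Number (2) and Fractions and Decimals (1).
example_row15 = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
example_row9 = condense_row(example_row15)
```

Applying `attribute_coverage` to a condensed Q-matrix gives a quick check on the concern raised in the text, namely whether any attribute is measured by only a single item.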

Table 3.4 TIMSS 2007 Fourth Grade Mathematics Q-matrix

Therefore, the following questions were addressed in this study:

1. What is the data-model fit of the proposed modified RHO-RDINA and modified RHO-RDINO models?

2. Are different DIF detection methods consistent in identifying DIF items across gender groups?

3. Does any group difference exist in the real dataset, conditioning on attribute