Chapter Three
Q: Which best explains the statement, “It would be great to win, but just finishing makes them all winners”?
(A) Finishing the race is an amazing accomplishment.
3 points (correct response)
(B) Winning is the most important accomplishment in the race.
0 points (not text-based)
(C) Everyone who enters the competition wins a prize.
1 point (unrelated to main idea)
(D) Every musher enjoys the competition and does not care if he or she wins.
2 points (only part of main idea)
Research Procedure
As shown in Figure 4, the procedure for the present study includes six phases: preparation, test modification, pilot test, formal test, data analysis, and results.
Figure 4. Research Procedure
In the preparation phase, the researcher reviewed theories and empirical studies of reading comprehension as the basis for forming the purpose of the study and the research questions. The researcher then applied the NAEP 2013 reading framework and the distractor rationale taxonomy (DRT) as guidelines for test modification. Texts for the reading comprehension test were selected from various sources, and their length and vocabulary difficulty were modified to suit the reading level of senior high school students.
Based on the NAEP 2013 reading framework and its text types, a two-way specification table was constructed to validate the reading comprehension items. To check the difficulty of the content and vocabulary, a small-scale pilot test was conducted to measure the readability, item difficulty, and internal consistency of the original English reading test. The modified version of the reading comprehension test was then formulated based on the analysis of the pilot test.
After the reading comprehension test was modified, the researcher selected three classes of high school seniors in Kaohsiung City, 109 participants in total, for the formal test phase. After the formal test was completed, the data were analyzed; the results are presented in Chapter Four.
Data Analysis
Data analysis was conducted in two stages: the pilot stage and the formal stage. The purposes of the pilot study were (1) to ensure that the content of the test (test items and distractors) was in accordance with the reading comprehension taxonomy, (2) to make sure that the level of readability was appropriate for senior high school students, and (3) to check and improve the quality of the items and distractors. To achieve these purposes, members of the intended sample (i.e., senior high school students or students with similar English proficiency) were asked to assess the readability of the reading passages and of the test items and distractors. Item analysis, including indices of item difficulty and item discrimination, the percentage of respondents marking each choice on each item (i.e., distractor analysis), and the item mean and standard deviation, was used to evaluate the quality of the items and distractors.
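The item difficulty index and the distractor analysis described above can be illustrated with a minimal Python sketch. It assumes dichotomous scoring of the pilot items (one keyed option per item); the function names and the response data are hypothetical and are not taken from the study, whose analysis was run in SPSS.

```python
from collections import Counter


def item_difficulty(responses, key):
    """Proportion of respondents who chose the keyed (correct) option."""
    return sum(1 for r in responses if r == key) / len(responses)


def distractor_distribution(responses, options=("A", "B", "C", "D")):
    """Proportion of respondents marking each option, including unchosen ones."""
    counts = Counter(responses)
    n = len(responses)
    return {opt: counts.get(opt, 0) / n for opt in options}


# Hypothetical responses of ten pilot participants to one item keyed "A".
pilot_responses = ["A", "A", "D", "A", "B", "A", "D", "A", "A", "C"]

difficulty = item_difficulty(pilot_responses, key="A")
distribution = distractor_distribution(pilot_responses)

print(f"difficulty (proportion correct): {difficulty:.2f}")
for option, share in distribution.items():
    print(f"option {option}: {share:.0%}")
```

A distractor that attracts almost no respondents, or one that attracts more respondents than the key, would be a candidate for revision under this kind of analysis.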
The item discrimination index was calculated by correlating item scores with total scores; it shows the extent to which each item discriminates among participants in the same way as the total score does. Items with a low or negative correlation with the total score were candidates for elimination, and items that failed to discriminate (i.e., ambiguous or double-barreled items) were revised, since they did not measure in the same direction as the rest of the test. After selecting the most useful items as indicated by the item analysis, the researcher arranged the distractors of these items in hierarchical order to form a test of ordered multiple-choice items, which was used for the formal test.
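As an illustration of the item-total correlation described above, the following Python sketch computes a Pearson correlation between each item's scores and the total scores. It is a sketch under stated assumptions: the total score here includes the item itself (an uncorrected item-total correlation), and the score matrix is hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(score_matrix):
    """score_matrix[i][j] is the score of participant i on item j."""
    totals = [sum(row) for row in score_matrix]
    n_items = len(score_matrix[0])
    return [
        pearson([row[j] for row in score_matrix], totals)
        for j in range(n_items)
    ]


# Hypothetical scores of five participants on three items.
scores = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]

discrimination = item_total_correlations(scores)
for j, r in enumerate(discrimination, start=1):
    flag = "review" if r <= 0 else "keep"
    print(f"item {j}: r = {r:+.2f} ({flag})")
```

In this made-up data set the third item correlates negatively with the total score, which is the pattern that would mark it for elimination or revision.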
During the formal stage, several types of reliability and validity evidence were collected. Reliability concerns the extent to which a measure yields consistent results each time it is used, while validity concerns the extent to which the items measure what they purport to measure. A reliability coefficient, Cronbach’s alpha, was used to determine the reliability of the test results.
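For reference, Cronbach's alpha for a test of k items is k/(k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores. The sketch below applies this standard formula to a hypothetical score matrix using sample variances; it is an illustration only, not the SPSS procedure used in the study.

```python
from statistics import variance


def cronbach_alpha(score_matrix):
    """Cronbach's alpha; score_matrix[i][j] is participant i's score on item j."""
    k = len(score_matrix[0])
    item_variances = [
        variance([row[j] for row in score_matrix]) for j in range(k)
    ]
    total_variance = variance([sum(row) for row in score_matrix])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 0-3 point scores of five participants on four items.
demo_scores = [
    [3, 2, 3, 1],
    [2, 2, 1, 1],
    [3, 3, 3, 2],
    [0, 1, 0, 1],
    [1, 0, 2, 0],
]

alpha = cronbach_alpha(demo_scores)
print(f"Cronbach's alpha = {alpha:.3f}")
```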
As for validity evidence, item difficulty, item discrimination, and distractor analysis were reported. Content-based and construct-based validity evidence was also collected. An independent-samples t-test was employed to determine whether male and female students performed differently on the cognitive targets of the test.
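The independent-samples t-test mentioned above can be sketched as follows. This sketch assumes the equal-variance (pooled) form of the test; the source does not state whether the pooled or the unequal-variance (Welch) form was used, and the group scores shown are hypothetical. The p-value is omitted because it requires the t distribution, which SPSS or `scipy.stats.ttest_ind` would supply.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_variance = (
        (n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical subscale scores on one cognitive target.
male_scores = [10, 12, 14]
female_scores = [13, 15, 17]

t_value, df = pooled_t_statistic(male_scores, female_scores)
print(f"t({df}) = {t_value:.3f}")
```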
According to the research design, zero to three points were awarded for each response, depending on the item’s cognitive target and the option the participant chose; higher scores were interpreted as indicating better understanding of the content.
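The scoring rule can be illustrated with the sample item shown at the beginning of this chapter, in which option A earns 3 points, D earns 2, C earns 1, and B earns 0. The Python sketch below is a minimal illustration; the function names are hypothetical, and treating a blank or invalid response as 0 points is an assumption rather than a rule stated in the study.

```python
# Point values for the sample ordered multiple-choice item.
SAMPLE_ITEM_SCORES = {"A": 3, "B": 0, "C": 1, "D": 2}


def score_response(choice, score_map):
    """Points for one response; blank or invalid choices score 0 (assumption)."""
    return score_map.get(choice, 0)


def total_score(responses, score_maps):
    """Sum of points over all items for one participant."""
    return sum(
        score_response(choice, score_map)
        for choice, score_map in zip(responses, score_maps)
    )


# One hypothetical participant answering the sample item three times over,
# as if the test consisted of three items with the same point values.
answers = ["A", "D", "C"]
participant_total = total_score(answers, [SAMPLE_ITEM_SCORES] * 3)
print(participant_total)
```

Under this scheme, a distractor that reflects partial understanding still contributes to the total, which is what distinguishes ordered multiple-choice scoring from right-or-wrong scoring.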
SPSS was used for the statistical analysis in the present study.
Research Limitation
Several limitations were encountered during the research; they are described below.
First, the research was constrained by time and by the availability of participants. The English reading comprehension test was designed for senior high school students in general. However, the effective sample consisted only of students in the senior year of high school. Because the test was administered during the regular semester and few classes could spare extra time for it, the number of participants was lower than expected. Owing to curriculum progress and tight schedules, the administration was limited, and inferences drawn from the results might be affected.
Second, the instruments used in the study contained informational texts and literary texts. Informational texts include exposition, argumentation, persuasive text, procedural text, and documents, while literary texts include fiction, nonfiction, and poetry. Considering the difficulty of the reading texts and the test time available for EFL learners, only exposition and fiction were selected for data collection. In addition, because of the length and difficulty of the texts, the percentage of items addressing the first cognitive target, locate and recall, was lower than that specified in the NAEP assessment framework. Inferences drawn from the results might therefore be affected as well.
As for the modification of the instrument, a few steps were skipped because of time constraints. Originally, the researcher intended to conduct two pilot tests and to seek expert review before the formal administration. However, one of the pilot tests was skipped because the test could only be administered during the regular semester. It was also difficult to obtain a satisfactory number of effective samples within such a tight school schedule, and the expert review was therefore cancelled. Nevertheless, the researcher still used the NAEP 2013 reading framework and the distractor rationale taxonomy (DRT) as guidelines for test modification. Several items may show unexpected results; these are marked and explained in a later chapter.
The issue of gender differences in reading comprehension performance was explored in the study. However, several other factors may also cause differences in performance, such as the learning environment, students’ personal reading habits, and instructional methods. The present study did not examine these variables further; only gender differences in English reading comprehension performance are reported in a later chapter.