In this chapter, the collected data are analyzed using several statistical procedures run in the Statistical Package for the Social Sciences (SPSS). The results are presented and examined in relation to the three research questions. First, to answer the first research question, descriptive statistics were used to provide an overview of the test results, including the number of subjects, the minimum and maximum values, the mean, and the standard deviation for all variables. Second, to answer the second research question, multiple regression analyses were run to examine how the scores on the vocabulary size test correlated with the reading scores and to check whether the correlation changed across distinct frequency levels of vocabulary size. Moreover, One-Way ANOVA and Post Hoc Multiple Comparisons were employed to explore the differences in reading scores among subjects with distinct levels of vocabulary size. Lastly, to address the third research question, text coverage analysis was applied to determine the vocabulary size needed for comprehending the SAET reading texts.
Vocabulary Size of the Participants
Subjects’ vocabulary sizes were first examined at each of the five frequency levels. As shown in Table 11, subjects’ mean vocabulary sizes from the first to the fifth 1,000-word frequency levels on the vocabulary test were 843, 778, 530, 524 and 430 word families, respectively. On average, the subjects knew around 800 word families at both the first and second frequency levels. At the fifth frequency level, subjects’ average vocabulary size decreased to only around 400 word families, and 167 subjects (54.6%) had vocabulary sizes of 400 word families or fewer at this level (Appendix D).
Table 11 Vocabulary size at the five frequency levels (N=306)
Frequency Levels Mean Accumulative Min. Max. Std. Deviation
1st 1,000 level 843 843 600 1000 94
2nd 1,000 level 778 1621 200 1000 143
3rd 1,000 level 530 2151 100 1000 162
4th 1,000 level 524 2675 200 1000 151
5th 1,000 level 430 3105 0 900 183
Table 12 presents the numbers and percentages of the subjects across the five frequency levels. According to Schmitt, Schmitt, and Clapham (2001), the criterion of mastery in the Vocabulary Levels Test was 86% correct answers. For comparison, this study used 800 word families as the passing criterion at each level. Among the 306 subjects, 260 (85%) had a vocabulary size above 700 word families and 144 (47.1%) had a vocabulary size above 800 word families at the first 1,000 level.
At the second 1,000 level, 106 subjects (34.6%) had vocabulary sizes above 800 word families. At the third 1,000 level, 24 subjects (7.8%) knew more than 700 word families and 7 (2.3%) knew more than 800 word families. At the fourth and fifth 1,000 levels, fewer than 7% and 5% of the subjects, respectively, knew over 700 word families, and even fewer (2%) met the 800 word families criterion at these two levels. In some cases, subjects scored higher at the fourth frequency level than at the third. It is very likely that these subjects knew more words beyond the third 1,000 level than within it. The reason may be that EFL students in Taiwan do not acquire vocabulary in frequency order. Vocabulary items are discrete: knowing one target word at a frequency level does not guarantee knowing another word at the same level.
Table 12 Numbers and percentages of subjects at the five frequency levels (N=306)
Level Vocabulary Size Cumulative No. of subjects above
The Accumulative column of Table 11 shows the subjects’ accumulated vocabulary size at the five levels, calculated by adding the vocabulary sizes at each successive 1,000-word frequency level. Subjects’ vocabulary sizes were thus further measured at the 1,000, 2,000, 3,000, 4,000 and 5,000 words levels (Appendix E).
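The accumulation described above is a running total over the five frequency levels. As a minimal sketch, using the level means reported in Table 11 (the variable names are illustrative and not part of the original analysis):

```python
from itertools import accumulate

# Mean vocabulary size (word families) at each 1,000-word frequency level, from Table 11.
level_means = [843, 778, 530, 524, 430]

# The accumulated size at the 1,000, 2,000, ..., 5,000 words levels is the running total.
accumulated = list(accumulate(level_means))

for k, total in enumerate(accumulated, start=1):
    print(f"{k},000 words level: {total} word families")
```

The same running total would apply to each individual subject's five level scores.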
The average vocabulary sizes of the subjects at the 1,000, 2,000, 3,000, 4,000 and 5,000 words levels were 843, 1,621, 2,151, 2,675 and 3,105 word families, respectively. At the 2,000 words level (Appendix E), one third of the subjects had a vocabulary size equal to or below 1,500 word families, and most subjects (56%) had vocabulary sizes between 1,600 and 1,800 word families. At the 3,000 words level, nearly 40% of the senior high school students in this study had a vocabulary size of around or below 2,000 word families. The largest share of the subjects (45%) had vocabulary sizes between 2,100 and 2,400 word families. Over 90% of the subjects had vocabulary sizes of around or below 2,500 word families, and only ten students (3%) had a vocabulary size above 2,700 word families. Similarly, at the 4,000 words level, the distribution peaked (42%) in the range between 2,500 and 2,800 word families. However, around 60% of the students still had vocabulary sizes under the average of 2,700 word families, and three fourths of the subjects had vocabulary sizes under 3,000 word families at this level. At the 5,000 words level, individual differences among the subjects were large. On the one hand, three subjects (1%) had an accumulated vocabulary size below 2,000 word families, six subjects (2%) below 2,100 word families, nearly 15% below 2,500 word families, and around half close to or below 3,000 word families; more than 60% had a vocabulary size below 3,200 word families. On the other hand, six students had an accumulated vocabulary size as high as 4,300 to 4,500 word families. Still, most subjects (over 50%) had vocabulary sizes between 2,700 and 3,500 word families.
In summary, subjects’ vocabulary sizes were widely dispersed across the five levels. Among the 306 valid subjects in this study, 162 students (53%) had a vocabulary size below or equal to 800 word families at the 1,000 words level; 159 (52%) below or equal to 1,600 word families at the 2,000 words level; 187 (61%) below or equal to 2,200 word families at the 3,000 words level; 183 (60%) below or equal to 2,700 word families at the 4,000 words level; and 169 (55%) below or equal to 3,100 word families at the 5,000 words level.
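The percentages in this summary follow directly from the subject counts and the sample size of 306. A short sketch of that arithmetic, with hypothetical names chosen only for illustration:

```python
N = 306  # valid subjects in this study

# Number of subjects at or below the stated threshold at each accumulated level.
counts = {"1,000": 162, "2,000": 159, "3,000": 187, "4,000": 183, "5,000": 169}

def percent(count, total=N):
    """Share of subjects, rounded to a whole-number percentage."""
    return round(100 * count / total)

shares = {level: percent(count) for level, count in counts.items()}

for level, share in shares.items():
    print(f"{level} words level: {share}% of subjects")
```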
Correlation between Vocabulary Size and Reading
Multiple regression analyses were employed to explore how EFL learners’ vocabulary size was related to their reading comprehension across the five frequency levels. The accumulated vocabulary sizes at the five frequency levels were entered in order as the independent variables, with the reading scores as the dependent variable. The correlation figures between vocabulary size and reading comprehension are shown in Table 13.
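To illustrate what entering predictors in order means, the following sketch computes R Square and R Square Change step by step. It is not the SPSS procedure used in this study. It assumes ordinary least squares with an intercept and obtains each step's gain by orthogonalizing the new predictor against those already entered. The data are synthetic and the function name `hierarchical_r2` is hypothetical.

```python
import random

def center(values):
    """Subtract the mean so that an intercept is implied."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hierarchical_r2(predictors, outcome):
    """Enter predictor columns in order; return (R Square, R Square Change) per step."""
    y = center(outcome)
    ss_total = dot(y, y)
    basis = []  # centered predictors already entered, made mutually orthogonal
    r2 = 0.0
    steps = []
    for column in predictors:
        v = center(column)
        # Remove the part of the new predictor explained by earlier predictors.
        for q in basis:
            coef = dot(v, q) / dot(q, q)
            v = [vi - coef * qi for vi, qi in zip(v, q)]
        norm = dot(v, v)
        change = 0.0
        if norm > 1e-9:  # skip predictors that add no new information
            change = dot(y, v) ** 2 / (norm * ss_total)
            basis.append(v)
        r2 += change
        steps.append((r2, change))
    return steps

# Synthetic data only: 306 hypothetical subjects, five level scores in steps of 100.
random.seed(0)
n = 306
per_level = [[random.randint(0, 10) * 100 for _ in range(n)] for _ in range(5)]
accumulated, running = [], [0] * n
for column in per_level:
    running = [r + c for r, c in zip(running, column)]
    accumulated.append(running)
reading = [0.01 * total + random.gauss(0, 10) for total in accumulated[-1]]

steps = hierarchical_r2(accumulated, reading)
for k, (r2, change) in enumerate(steps, start=1):
    print(f"{k},000 words level: R Square = {r2:.3f}, change = {change:.3f}")
```

Because each model contains all predictors of the previous one, R Square can never decrease from one step to the next; the change only measures how much each added level contributes beyond the levels already entered.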
Table 13 Correlation between vocabulary size at the 5 frequency levels and reading performance
Frequency Levels R R Square R Square Change
1,000 words level .384** .148 .148**
2,000 words level .515** .265 .117**
3,000 words level .563** .317 .052**
4,000 words level .583** .340 .023**
5,000 words level .624** .389 .049**
Note: ** Correlation is significant at the 0.01 level (2-tailed).
R = multiple correlation coefficient.
R Square = multiple determination coefficient.
Overall, the subjects’ vocabulary sizes at the five levels and their scores on the reading test had a positive and significant correlation (R = .624, p < .01). In other words, the larger the students’ vocabulary sizes, the better their performance on reading comprehension. This finding is in accordance with many previous studies, which also indicated relatively high correlations, ranging from .50 to .75, between readers’ vocabulary knowledge and text comprehension (e.g., Laufer, 1992, 1996; Liu & Nation, 1985; Qian, 1999). The result supports the view that readers’ vocabulary knowledge serves as a significant predictor of their proficiency in reading.
However, the correlation between subjects’ vocabulary sizes at the 5,000 words level and reading (R = .624, p < .01) was only slightly higher than that between vocabulary sizes at the 3,000 words level and reading (R = .563, p < .01). This indicates that receptive vocabulary size at the most frequent 3,000 words level had a similarly significant relation with students’ reading comprehension performance. A possible explanation is that the vocabulary size measure used in this study is based on the BNC Word Family List. As discussed in the previous chapter, the most frequent 3,000 word families on the BNC Word Family List provided more than 90% text coverage of the six chosen passages in this study. Compared with the fourth and fifth frequency levels, vocabulary at the 3,000 words level is more closely associated with the reading process and seems more crucial for successfully comprehending the SAET texts.
The differences in R² across levels and the magnitudes of the R² changes were tested for significance and further analyzed to examine the predictive value of each level in explaining the variance in reading comprehension. R² indicates how much of the variance in the criterion variable (the subjects’ SAET reading performance) is accounted for by the predictor variable (the subjects’ vocabulary size). The R² values between each predictor variable (vocabulary size at each frequency level) and the criterion variable (reading scores) turned out to be moderate: 0.148 for the 1,000 words level, 0.265 for the 2,000 words level, 0.317 for the 3,000 words level, 0.340 for the 4,000 words level, and 0.389 for the 5,000 words level. That is to say, vocabulary size at the 1,000 words level accounted for 14.8% of the variance in the reading scores, while vocabulary size at the 2,000 words level accounted for 26.5%, at the 3,000 words level 31.7%, and at the 4,000 words level 34%. Similarly, vocabulary size at the 5,000 words level explained about 38.9% of the variance in reading comprehension.
The results of the multiple regression analyses also indicated that the magnitudes of the correlation coefficients between each predictor variable and the criterion variable differed significantly. All changes in the shared variance (R²) were statistically significant, which suggests that, in predicting reading comprehension performance, any combination of variables among the ascending frequency levels yielded better results than one of them alone. The R² changes in the five regression models across the five levels were all positive and significant, ranging from 2.3% to 14.8%. For example, with vocabulary size at the 1,000 words level entered into the model at the first step and the 2,000 words level at the second step, the R² change was .117 (p < .01), indicating that vocabulary size at the 2,000 words level accounted for an additional 11.7% of the criterion variance over the 1,000 words level. The magnitudes of the R² changes for the five vocabulary levels were 0.148, 0.117, 0.052, 0.023 and 0.049, respectively. R² changes tended to become smaller at the higher frequency levels. Adding the 4,000 words level accounted for only an additional 2.3% of the criterion variance over the 3,000 words level. As knowing the words at the 4,000 words level added little to the prediction of reading comprehension, the predictive value of vocabulary size seemed to diminish beyond the 3,000 words level.
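Each R Square Change value is the difference between the R Square of a model and that of the model at the previous step. The following minimal sketch recomputes the changes from the R Square column of Table 13 (the variable names are illustrative):

```python
# R Square values reported in Table 13 for the 1,000 ... 5,000 words levels.
r_square = [0.148, 0.265, 0.317, 0.340, 0.389]

# The change at each step is the gain over the previous model (0 before step 1).
previous = [0.0] + r_square[:-1]
changes = [round(curr - prev, 3) for prev, curr in zip(previous, r_square)]

for k, change in enumerate(changes, start=1):
    print(f"{k},000 words level: R Square Change = {change:.3f}")
```

The recomputed values match the R Square Change column of Table 13, and they sum to the final R Square of 0.389.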
In summary, the analysis yields the following findings. Vocabulary size was closely associated with reading comprehension. R² changes were smaller after the 3,000 words level, which indicates that the predictive value of vocabulary size in explaining reading comprehension started to lessen after that level. The results suggest that vocabulary size at the 3,000 words level was a particularly important predictor of reading comprehension.
Reading Performance at Distinct Levels of Vocabulary Size
Students’ vocabulary size at the 3,000 words level was shown above to have a relatively strong association with their SAET reading comprehension, and the 3,000 words level was identified as a more crucial predictor of reading performance. Subjects’ vocabulary sizes at the 3,000 words level were therefore further grouped into three distinct vocabulary size levels: low, intermediate and high.
The grouping was based on the percentile rank of subjects’ vocabulary size at the 3,000 words level. The low level was set at a vocabulary size equal to or smaller than the 27th percentile, the high level at a vocabulary size equal to or larger than the 73rd percentile, and the intermediate level in between the two percentiles. Selecting the 73rd percentile as the cutoff for the upper group and the 27th percentile for the lower group is a common practice in social science testing (Wang, 2006, p. 130). The dividing lines between groups at the 3,000 words level were 2,000 and 2,300 word families. Therefore, the vocabulary size of the low level was below or equal to 2,000 word families, that of the intermediate level was between 2,001 and 2,299 word families, and that of the high level was equal to or above 2,300 word families.
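The grouping rule can be expressed as a small function. This is a sketch that hard-codes the two dividing lines reported above rather than recomputing the percentiles from the raw data; the function and constant names are hypothetical.

```python
LOW_CUTOFF = 2000   # at or below this size: Group Low (27th percentile in this sample)
HIGH_CUTOFF = 2300  # at or above this size: Group High (73rd percentile in this sample)

def assign_group(size_at_3000_level):
    """Classify a subject by accumulated vocabulary size at the 3,000 words level."""
    if size_at_3000_level <= LOW_CUTOFF:
        return "Low"
    if size_at_3000_level >= HIGH_CUTOFF:
        return "High"
    return "Intermediate"

for size in (1800, 2000, 2100, 2300, 2600):
    print(size, assign_group(size))
```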
After grouping, statistical measures were employed to compare the reading performances of subjects across the three groups. Descriptive statistics were first used to report the groups’ performances on reading. Next, One-Way ANOVA and Multiple Comparisons were applied to examine whether the differences in reading performance among the three groups were significant and to what degree the reading performances differed. The mean scores on reading are presented in Table 14.
To begin with, the mean score on the reading test of all 306 valid subjects in this study was 54.03%. According to CEEC, SAET examinees obtained average scores of 58.52%, 56.00%, 58.33%, 63.68% and 54.13% in the reading comprehension section in 2005, 2006, 2007, 2008 and 2009, respectively (Lin, 2009). The mean score on the reading test in this study is close to these figures announced by CEEC, which may suggest that the sample in this study is representative of SAET test-takers in general.
The three groups’ mean scores at the 3,000 words level show that reading scores increased progressively from Group Low to Group High. Group Low had the lowest mean score (M = 43.25) and Group High had the highest (M = 65.83); the mean score of Group Intermediate fell in between (M = 51.88).
Table 14 Reading scores across three groups at 3,000 words level
Group N Mean SD Minimum Maximum
Low 116 43.25 16.73 13 83
Intermediate 71 51.88 17.61 17 88
High 119 65.83 17.98 21 100
Total 306 54.03 20.04 13 100
Note: SD = standard deviation.
Group Low: vocabulary size at and below 2,000 words
Group Intermediate: vocabulary size between 2,001-2,299 words
Group High: vocabulary size at and above 2,300 words
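As a consistency check on Table 14, the overall mean should equal the average of the three group means weighted by group size. A brief sketch of that check, with illustrative variable names:

```python
# (number of subjects, mean reading score) for each group, from Table 14.
groups = {"Low": (116, 43.25), "Intermediate": (71, 51.88), "High": (119, 65.83)}

total_n = sum(n for n, _ in groups.values())
grand_mean = sum(n * mean for n, mean in groups.values()) / total_n

print(f"N = {total_n}, weighted mean = {grand_mean:.2f}")
```

The weighted mean agrees with the Total row of Table 14 (N = 306, Mean = 54.03).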
As can be seen from Table 15, the One-Way ANOVA comparing the three groups was significant (F(2, 303) = 49.95, p < .001). Post Hoc Multiple Comparisons were further employed to locate the differences in reading performance among the three groups. Table 16 shows the results of the Multiple Comparisons.
Table 15 One-Way ANOVA for three groups at 3,000 words level
Source Sum of Squares df Mean Square F Sig.
Between Groups 30375.447 2 15187.723 49.95 .000
Within Groups 92118.880 303 304.023
Total 122494.326 305
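The F value in Table 15 is the ratio of the between-groups mean square to the within-groups mean square, where each mean square is a sum of squares divided by its degrees of freedom. A minimal sketch of this arithmetic, using the sums of squares and degrees of freedom reported in the table (the variable names are illustrative):

```python
# Sums of squares and degrees of freedom as reported in Table 15.
ss_between, df_between = 30375.447, 2
ss_within, df_within = 92118.880, 303

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F statistic compares between-group variance to within-group variance.
f_ratio = ms_between / ms_within

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The recomputed ratio agrees with the tabled F to within rounding in the second decimal place.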
Table 16 Post Hoc Multiple Comparisons among three groups at 3,000 words level (Scheffe Method)
Comparisons between Groups Mean Difference Sig.
Low vs. Intermediate -8.63* .005
Low vs. High -22.58* .000
Intermediate vs. Low 8.63* .005
Intermediate vs. High -13.95* .000
High vs. Low 22.58* .000
High vs. Intermediate 13.95* .000
Note: * The mean difference is significant at the .05 level.
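The mean differences in Table 16 can be reproduced from the group means in Table 14, taking each difference as the first group's mean minus the second group's mean. The sketch below covers only this arithmetic; it does not compute the Scheffe significance values, which require the within-groups variance and group sizes.

```python
from itertools import permutations

# Mean reading scores of the three groups, from Table 14.
means = {"Low": 43.25, "Intermediate": 51.88, "High": 65.83}

# Difference for every ordered pair: mean of the first group minus mean of the second.
differences = {
    (first, second): round(means[first] - means[second], 2)
    for first, second in permutations(means, 2)
}

for (first, second), diff in differences.items():
    print(f"{first} vs. {second}: {diff:+.2f}")
```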
As shown, there were significant differences in the reading scores among the three groups (Sig. < .05). The groups with larger vocabulary sizes had higher mean scores on the reading test. The difference between Group Intermediate and Group Low was 8.63, with a significance value of .005. In addition, the mean score of Group High was significantly higher than that of Group Intermediate (DM = 13.95*) and Group Low (DM = 22.58*). The vocabulary size criteria were at or below 2,000 word families for Group Low and at or above 2,300 word families for Group High, yet the reading performance of these two groups differed by 22.58 percentage points. This implies that students with a vocabulary size at or below 2,000 word families at the 3,000 words level (Group Low) performed markedly worse on the SAET reading test than those whose vocabulary size was at or above 2,300 word families (Group High). The results demonstrate that the smaller the vocabulary size, the lower the reading scores, and the larger the vocabulary size, the better the reading performance. With smaller vocabulary sizes, readers may be burdened by the unknown words in the texts and fail to deal with higher-level processes while reading. With a larger vocabulary size, fluent readers are able to handle lower-level processes more automatically and can move on to higher-order cognitive processes. In other words, students with a larger vocabulary size have a better chance of engaging in both lower- and higher-level processes in reading.
Moreover, while performing reading tasks, skilled readers with a larger vocabulary size can tolerate a small proportion of unknown words in a text without disruption of comprehension and can even infer the meanings of the unknown words from sufficiently rich contexts. However, if the proportion of unknown words is too high, as in the case of students in Group Low, comprehension is disrupted (Carver, 1994). It is therefore concluded that a sufficiently large vocabulary makes fluent reading and proper comprehension possible.
To further examine the effect of vocabulary size at the 3,000 words level on students’ reading comprehension, the subjects in Groups Low, Intermediate and High at the 3,000 words level were each split into three sub-groups, Groups 1, 2 and 3, according to their vocabulary size at the fourth and fifth frequency levels. The division was as follows: Group 1 consisted of subjects whose combined vocabulary size at the fourth and fifth frequency levels was less than 600 word families; Group 2 had a combined vocabulary size between 700 and 1,000 word families; and Group 3 had a combined vocabulary size above 1,100 word families. Table 17 shows that significant differences in reading scores occurred in only three comparisons: one in Group Intermediate and two in Group High. For the 116 subjects (38%) in Group Low, the reading score differences were not significant, and their mean reading scores were all below 50. For the 71 subjects (23%) in Group Intermediate, the reading score difference was significant between Group 1 and Group 3 (DM = 18.98*), but the mean reading scores were still below 60. In Group High, two differences reached significance: the mean difference was 16.47 between Groups 3 and 2 and 28.18 between Groups 3 and 1. The results imply that differences in vocabulary size at the fourth and fifth 1,000 words frequency levels did not have a notable effect on reading performance until subjects had achieved a high level of vocabulary size (2,300 word families) at the 3,000 words level. In other words, a larger vocabulary size at the later frequency levels had a greater effect on reading performance, particularly once students’ vocabulary size reached 2,300 word families or more at the 3,000 words level.
Table 17 Comparisons among the three groups at 3,000 words level (with different vocabulary size at the 4th and 5th frequency levels) (n=306)
Group Vocabulary Size at
Note: * The mean difference is significant at the .05 level.
Further evidence of the crucial role of vocabulary size at the first 3,000 word families in students’ reading comprehension was provided by a comparison among subjects who had a vocabulary size over 1,100 word families at the fourth and fifth 1,000 words frequency levels (Group 3). There were 111 such subjects: 18, 21 and 72 from Groups Low, Intermediate and High, respectively (Appendix F). The mean reading scores of these subjects rose from Group Low (M = 49.31) to Group Intermediate (M = 59.13) to Group High (M = 72.63). Subjects in Group High surpassed those in Group Low and Group Intermediate on the reading comprehension test.
Subjects in Group High, whose vocabulary size was at or above 2,300 word families, performed significantly better on reading thanks to their superiority in vocabulary size at the 3,000 words level, achieving an average score of 72.63%. In contrast, subjects in Group Low, with a vocabulary size at or below 2,000 word families, performed poorly on reading, answering only 49.31% of the items correctly in the SAET reading comprehension test. It is worth noting that the gap in reading performance between Group High and Group Low reached 23.32 percentage points. Even though they had a large vocabulary size at the fourth and fifth frequency levels, students with smaller vocabulary sizes at the most frequent 3,000 words level were unlikely to gain appropriate comprehension of the six SAET reading texts in this