
Performance in small classes compared to Reference Schools

Figure 10.2 Attitudes of Cohort 2 (disadvantaged (D) v standard (S) schools)
[Chart not reproduced. Vertical axis: % max score (60 to 90). Measures: Mot, Chi, Eng and Math, each for (D) and (S) schools. Series: start P1, end P1, end P2.]

Figure 11.1 Attitude and motivation scores Cohort 2 v reference schools (P1 through P2)
[Chart not reproduced. Vertical axis: % max score (60 to 90). Measures: Mot, Chi, Eng and Math, each for (sct) and (ref) classes. Series: start P1, end P1, end P2.]

11.3 There were, however, gender differences, albeit mostly with small or very small effect sizes. Girls had higher motivation scores in both the reference school and Cohort 2 classes, and the same is true of Chinese. In the latter case, the sharpest falls took place over the course of the P1 year. In English girls have more positive attitudes to the subject, but here the major change (for both boys and girls) occurred during P2. As with previous comparisons, mathematics had the strongest positive attitudes of all three subjects but, nonetheless, there was a steady decline over the two years. Here girls’ scores drop off more sharply than boys’, particularly during the P2 year.

11.4 The pattern was very similar when the P2 reference classes, first tested in September 2006, were compared with a combined Cohort 1 and 2 small class sample, having first ascertained that the scores in the two cohorts did not differ significantly. These results are shown in Figure 11.2. Across the various measures, the fact that pupils in the reference classes took the P2 pre-test (the end of P1 test) in September and not, as was the case for the small classes, in June means that the initial scores of the reference classes were a little below those in the combined experimental sample. The biggest difference was in English, which reached the 1% significance level (small effect size).

By the end of P2 and again at the end of P3 the differences between the two samples were negligible. Girls again scored higher on the combined self esteem and motivation scale, in Chinese and English but not in mathematics.

In the latter subject, the scores of boys from Cohort 1, who remained in small classes, were significantly higher (1% level) than either their peers in the reference classes or those in Cohort 2 who had returned to normal size classes, although the effect size was very small.

Figure 11.2 Attitude and motivation scores Cohort 1 & 2 v reference schools (P2 through P3)
[Chart not reproduced. Vertical axis: % max score (60 to 90). Measures: Mot, Chi, Eng and Math, each for (sct) and (ref) classes. Series: end P1, end P2, end P3.]

11.5 The third comparison traces the P3 reference classes, who took the end of P2 test in September 2006, as they move through P4. These classes are contrasted with small classes in Cohort 1 who also moved back to normal size classes in the 2007/08 school year. The results are shown in Figure 11.3. Apart from the fact that, due to the later administration, the P3 pre-test scores (end of P2 test) of the reference schools tend to be lower than those of their peers in Cohort 1, the overall pattern is very similar to that in the previous two figures. Girls maintain their superiority over boys in motivation and self esteem. In Chinese, although girls outscored boys, the gap closed in P4 where the dip in attitude over the year was sharper for girls while the boys’ scores underwent little change over the course of the year. Returning to a normal class would appear more unsettling for girls although the effect size is again very small. English is again the least liked of the three subjects. By the time Cohort 1 move back to normal classes in P4 there were no differences between these pupils and the reference group. Girls again scored higher than the boys on each test administration.

11.6 Attitudes to mathematics remain steady over P3 but drop in P4 so that the reference group and the pupils from Cohort 1 have almost identical scores by the end of the fourth year. On returning to normal classes in P4 the Cohort 1 girls’ decline was such that there was a significant difference (1% level) between their scores and those of their male peers. In summary, therefore, these results when taken together suggest that being in a smaller class has very little effect on pupils’ motivation, self esteem or their attitudes towards Chinese, English and mathematics. Not only with year to year comparisons was there little difference between the experimental samples and the reference schools, but the advantage of being in a small class for two or three years had little impact on mathematics attitudes on the return to normal sized classes in P4 since by the end of that year the respective scores of the two groups of schools were practically identical in every case.

Figure 11.3 Attitude and motivation scores Cohort 1 v reference schools (P3 through P4)
[Chart not reproduced. Vertical axis: % max score (50 to 90). Measures: Mot, Chi, Eng and Math, each for (sct) and (ref) classes. Series: end P2, end P3, end P4.]

11.7 The overall trend, including gender effects, can be illustrated through use of the combined learning disposition score (an aggregated score from all four attitude and motivation measures). In several cases more than one score was available (e.g. two end of P2 scores in the reference groups from different samples) and these were also averaged. The gender variations come through clearly with girls having more positive dispositions on entry and in each successive year. Any small differences between the reference classes and those in the experimental classes have disappeared by the time pupils are completing P4. Overall, attitudes have declined from a high starting point by around 20 percentage points over the four years of the study.

Figure 11.4 Boys' and Girls' Learning Disposition (at entry to end of P4)
[Chart not reproduced. Vertical axis: % maximum score (60 to 90). Horizontal axis: Entry, P1, P2, P3, P4. Series: Boy (ref), Boy (sct), Girl (ref), Girl (sct).]

11.8 There are a number of other factors which have the potential to influence attitudes. In the phase covering entry to primary school up to the end of P2, neither the birthplace of the pupils (Mainland born v Hong Kong born), their attendance at kindergarten, nor their identification as pupils with SEN influences attitudes in either the reference school or the experimental classes.

However, in the move from P2 to P3 overall learning disposition does correlate with some of the above variables. Mainland born pupils placed in smaller classes score higher than their peers in the reference schools. Even when some of these pupils move back to normal classes in P3 they maintain this positive disposition (1% significance level; small effect size). Kindergarten exerts little effect but SEN pupils who remain in small classes during the P3 year have a better learning disposition. In the move from P3 to P4 neither place of birth nor attendance at kindergarten influences pupils’ attitudes.

Furthermore, the SEN pupils of Cohort 1, when they move back to normal classes in P4 appear to lose out, since their scores on the learning disposition scale no longer differ from the equivalent group of pupils in the reference group. The analysis suggests, therefore, that smaller classes do result in more positive attitudes among pupils who have been identified with SEN.

11.9 An equivalent analysis can also be conducted using the scores on the attainment tests. From the point of entry to the end of P2 the straightforward comparison is between the reference schools and Cohort 2. The various mean scores for boys and girls are shown in Table 11.1. In both samples the patterns are remarkably similar. For all three subjects, the start of P1 scores, taken in mid September, are higher for the reference schools for both boys and girls. In Chinese the girls outscore the boys in every year in both reference schools and in Cohort 2 (p<0.01). In English the reference schools display the same pattern but in Cohort 2 the initial scores of both boys and girls are equal, although by the end of P1 and again by the end of P2 girls’ scores are significantly higher (1% level). In all cases the effect sizes are relatively small but increase over time.

Table 11.1 Comparison of Attainment: P1 to P2 (Reference v Cohort 2)

Sample     Gender   Start of P1      End of P1        End of P2           N
                      Mean   s.d.     Mean   s.d.     Mean   s.d.

Chinese
Cohort 2   Boys     63.09  25.74    34.46  19.64    49.50  20.00    1560
Reference  Boys     67.65  24.66    37.35  19.75    53.45  19.45    1224
Cohort 2   Girls    68.59  24.50    41.89  19.65    56.31  18.27    1244
Reference  Girls    71.79  24.09    42.87  20.11    59.16  18.15    1068

English
Cohort 2   Boys     52.87  27.50    50.44  23.47    49.55  24.48    1569
Reference  Boys     58.34  27.05    50.73  23.79    54.39  25.00    1233
Cohort 2   Girls    55.09  26.74    55.64  22.01    58.34  22.09    1236
Reference  Girls    62.56  25.08    58.24  22.35    64.23  22.11    1066

Maths
Cohort 2   Boys     56.31  22.05    45.56  25.24    55.46  22.96    1592
Reference  Boys     61.51  21.94    47.75  27.01    59.02  22.67    1234
Cohort 2   Girls    56.94  21.60    45.54  22.72    53.03  20.12    1259
Reference  Girls    61.59  20.85    48.82  24.29    57.16  20.20    1089
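The effect sizes referred to in this chapter can be related to the summary statistics in Table 11.1. The report does not state which effect size measure was used, so the sketch below assumes Cohen's d with a pooled standard deviation; the function name cohens_d is illustrative only. It is applied to the end of P2 Chinese scores of girls and boys in Cohort 2.

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardised mean difference using the pooled standard deviation.

    Assumption: the report does not name its effect size measure;
    Cohen's d is used here purely for illustration.
    """
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

# Cohort 2, Chinese, end of P2 (means, s.d. and N taken from Table 11.1)
d_chinese = cohens_d(56.31, 18.27, 1244,   # girls
                     49.50, 20.00, 1560)   # boys
print(f"Cohen's d (girls minus boys): {d_chinese:.2f}")
```

Under this assumption the value falls between Cohen's conventional benchmarks of 0.2 (small) and 0.5 (medium), which is broadly in keeping with the description of these gender differences as relatively small.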

11.10 Mathematics shows a different pattern. In both the reference schools and Cohort 2 there are no significant differences between the boys’ and the girls’ scores at the start and end of P1. By the end of P2, however, boys are doing better in both samples (1% level; small effect size). Comparing the two samples shows that in all three subjects reference schools tend to maintain their initial advantage, suggesting that it is the difference in the initial samples rather than variables such as the size of the class that accounts for this variation. This interpretation is confirmed when the residual gains of the combined scores are calculated using the start of P1 scores to predict the end of P2 attainment. For boys in Cohort 2 the value of the residual is -1.473 while the figure for the reference group comes to -0.682. The corresponding figures for girls are 0.857 and 1.936 respectively. None of these differences is statistically significant, supporting the view that the move to small classes has little effect in comparison to the initial differences between pupils on entry to primary school.
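A residual gain of the kind described above can be sketched as follows. This is a minimal illustration, assuming a single least-squares line fitted to all pupils with the pre-test as the only predictor; the report does not give the exact specification, and the scores and group labels below are hypothetical rather than study data.

```python
import numpy as np

def residual_gains(pre, post):
    """Residuals from a least-squares line predicting post-test from pre-test.

    A positive residual means a pupil scored higher at post-test than
    predicted from the pre-test score; a negative one means lower.
    """
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    slope, intercept = np.polyfit(pre, post, 1)  # degree-1 fit: [slope, intercept]
    return post - (intercept + slope * pre)

# Hypothetical scores for eight pupils (not data from the study)
pre = [40, 55, 60, 72, 45, 58, 66, 80]
post = [42, 50, 61, 70, 50, 57, 69, 77]
group = np.array(["sct", "sct", "sct", "sct", "ref", "ref", "ref", "ref"])

gains = residual_gains(pre, post)
for name in ("sct", "ref"):
    print(name, round(float(gains[group == name].mean()), 3))
```

Because the line is fitted to the pooled sample, the residuals average zero overall, so a group mean above zero indicates more progress than expected given the starting point and a mean below zero indicates less.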

11.11 The source of these initial differences can be partially identified by exploring the data obtained from the Parents’ Questionnaire. In the Reference schools 15% of the intake were mainland born compared to 20.5% in Cohort 2. Only 3.8% of pupils in the Reference schools had not attended a kindergarten while in Cohort 2 the figure was 6.9%. Again the Reference schools, with only 11.2% of SEN pupils, have an advantage over schools in Cohort 2 where the proportion of SEN pupils was 17%.

11.12 In both samples the Hong Kong born pupils have poorer scores on entry but whereas this discrepancy is also found in P2 in the Reference school sample, mainland pupils in Cohort 2 have caught up by the end of the P2 year. Lack of kindergarten experience again results in lower initial scores. Here, although these pupils catch up by the end of P2 irrespective of whether they belong to the Reference group or Cohort 2, the improvement of the latter pupils is greater because they start from a lower initial base. This can be seen in the residual gain scores from the start of P1 to the end of P2. For the reference schools the value is 2.829 while for Cohort 2 it is 4.620.

11.13 For the comparison of scores from the start of P2 until the end of P3 the timing of the testing was the same for both Cohorts 1 and 2, but for the reference group the end of P1 test was administered in September and not June and in some cases the tests were administered in different years. A further difference was that in P3 Cohort 2 pupils had returned to normal classes. The data is set out in Table 11.2. Mean values and numbers of pupils in Cohort 2 may differ slightly from those in Table 11.1 as not all pupils with scores in P2 also had a score in P3, either because they may have moved away from the neighbourhood during the intervening year or were away at the time of testing.

The differences, however, are relatively slight and have little impact on the overall analysis. For Chinese, girls outperform boys in both P2 and P3 in all three samples. Notwithstanding their higher score on the end of P1 test because it was taken in September and not June, the reference group continues to outperform both Cohort 1 and 2 on subsequent tests (p<0.01, small effect size). The differences are greatest when the comparison is with Cohort 1.

Cohort 2 has a higher mean score at the end of the P1 year than Cohort 1 (p<0.01, small effect size) and this difference is maintained up to the end of P3 although the effect size is negligible. Being in a small class for three rather than two years appears, therefore, to bring little benefit.

Table 11.2 Comparison of Attainment: P2 to P3 (Reference v Cohorts 1 & 2)

Sample     Gender   End of P1        End of P2        End of P3           N
                      Mean   s.d.     Mean   s.d.     Mean   s.d.

Chinese
Cohort 1   Boys     34.23  18.27    48.54  18.15    44.70  18.44    1721
Cohort 2   Boys     37.07  19.54    50.24  19.71    46.98  19.01    1524
Reference  Boys     40.53  19.50    51.85  19.31    48.54  19.11    1194
Cohort 1   Girls    39.58  19.48    55.68  17.62    51.84  17.68    1565
Cohort 2   Girls    41.98  19.57    56.38  18.30    52.67  17.72    1201
Reference  Girls    47.15  20.07    59.02  18.03    55.54  17.72    1151

English
Cohort 1   Boys     49.66  22.46    53.82  24.28    29.31  21.10    1745
Cohort 2   Boys     50.63  23.34    50.07  24.21    30.87  22.05    1537
Reference  Boys     52.10  23.62    50.54  24.91    31.95  23.83    1195
Cohort 1   Girls    55.80  21.39    63.52  21.58    39.41  21.77    1554
Cohort 2   Girls    55.66  22.09    58.51  21.95    39.15  21.63    1189
Reference  Girls    58.42  22.09    59.83  22.09    40.87  23.31    1150

Maths
Cohort 1   Boys     43.52  25.36    54.28  21.76    60.05  22.46    1739
Cohort 2   Boys     46.34  25.19    56.13  22.52    61.05  22.51    1546
Reference  Boys     51.65  25.28    56.82  22.51    60.83  22.28    1188
Cohort 1   Girls    44.76  23.51    52.59  19.65    59.59  20.37    1568
Cohort 2   Girls    45.91  22.57    53.31  19.76    59.43  20.41    1206
Reference  Girls    53.79  23.26    56.57  20.12    61.66  20.54    1143

11.14 In English girls again always obtain higher scores than boys in all three samples (significance level 1% but small effect sizes). Attainment differences at the end of P1 favour the Reference group (because of later testing) but by the end of the P2 year it is Cohort 1 pupils that are ahead. But this advantage is not maintained in P3 so that the additional year in the small class appears to offer no significant advantage. In mathematics, as in previous comparisons, gender effects are less pronounced. There are significant differences (1% level) for boys in P2 (Cohort 1 and Cohort 2) and in P3 (Cohort 2) but the effect size is very small in each case. Both Cohorts 1 and 2 are behind the reference group at the end of P1 and P2 but have caught up at the end of P3. Moving back to a normal class in P3 appears to have no noticeable effect on Cohort 2’s performance.

11.15 When the results are broken down by place of birth and attendance at kindergarten there are few significant results. Smaller classes tend to reduce the differential between Hong Kong born and Mainland born children but have little noticeable effect on improving the scores of pupils who did not have an opportunity to attend kindergarten. This suggests that home background is the more important of these contextual variables.

11.16 The final comparison is between the Reference group and Cohort 1 when pupils move from P3 to P4. Here again the timing of the end of P2 testing differed, that for the Reference school pupils taking place in mid September while Cohort 1 were tested in June. Table 11.3 displays the data. Again, in Chinese, the pattern whereby girls outperform boys is continued into P4 irrespective of whether the pupils belong to the reference group or to Cohort 1.

As expected, the reference group scores at the end of P2 are higher for both genders (due to the later administration of the test) but by the end of P3 girls’ scores do not differ significantly (5% for boys in favour of the reference group but negligible effect size). However, in P4, when all the pupils are in normal classes, the Reference group regains the advantage (1% level; small effect size for both genders).

Table 11.3 Comparison of Attainment: P3 to P4 (Reference v Cohort 1)

Sample     Gender   End of P2        End of P3        End of P4           N
                      Mean   s.d.     Mean   s.d.     Mean   s.d.

Chinese
Cohort 1   Boys     48.65  18.17    44.69  18.40    47.49  17.42    1803
Reference  Boys     51.97  17.79    46.39  18.78    50.42  17.56    1186
Cohort 1   Girls    55.47  17.42    51.62  17.67    55.72  15.85    1645
Reference  Girls    58.92  17.01    52.77  17.54    57.91  16.10    1137

English
Cohort 1   Boys     53.08  24.24    28.73  20.90    34.62  22.11    1819
Reference  Boys     55.78  24.22    31.74  22.91    37.99  23.36    1181
Cohort 1   Girls    62.58  21.75    38.58  21.74    45.84  21.56    1640
Reference  Girls    65.57  21.75    42.00  23.53    48.43  22.20    1138

Maths
Cohort 1   Boys     54.22  21.66    59.72  22.57    50.78  22.29    1818
Reference  Boys     57.86  21.21    62.18  22.17    54.73  21.20    1165
Cohort 1   Girls    53.05  19.29    59.80  20.08    51.16  19.80    1643
Reference  Girls    56.45  19.91    62.49  20.28    54.27  20.33    1128

Aggregated attainment (combined Chinese, English and mathematics scores)
Cohort 1   Boys     52.58  18.69    44.64  17.99    44.62  18.22    1661
Reference  Boys     55.35  18.71    46.99  18.90    47.83  18.45    1085
Cohort 1   Girls    57.28  16.84    50.24  17.39    51.12  16.62    1539
Reference  Girls    60.37  17.09    52.40  18.01    53.63  17.18    1053

11.17 In English the Reference group not only outperforms Cohort 1 (both genders) on the end of P2 (as expected) but continues to do so at the end of P3 and P4.

In all cases the effect sizes are very small and again the move back to normal size classes in P4 has a negligible effect on Cohort 1 scores. In mathematics there is only one gender difference and that is for the end of P2 testing in Cohort 1. Although, as with the languages, the Reference group has higher end of P1 scores because of the delay in administration, Cohort 1 has caught up by the end of P3. However, once the Cohort 1 pupils move to normal classes in P4 the Reference group opens up the gap again (1% significance level; small to very small effect size). Thus the inference is that being in a small class in P3 does have some benefit but that this does not carry over once the pupils return to the normal classes in P4. This can perhaps be seen more easily when the aggregated attainment scores are examined. In P3 Cohort 1 has narrowed the gap for both boys’ and girls’ scores but by P4 the reference group has regained its advantage. While, therefore, being in a small class in P3 appears to bestow some advantage, there exists also the possibility that another reason for the improvement lies in the extended efforts being made to achieve success in the TSA examinations.

11.18 Although being in a small class appeared to benefit pupils when they were in P1 and to a certain extent P2 it is being a Hong Kong born pupil that now reappears in P3 and P4 as a significant variable (1% level; small effect size).

The effect of having attended kindergarten cannot be ascertained since this question was not included on the Parent questionnaire when it was given to the parents of P1 pupils. Being in a small class in P3 appears to give a slight advantage to SEN pupils since they reduce the gap that previously existed with similar pupils in the reference group at the end of the P2 year, but again the effect sizes are extremely small.

11.19 In Table 11.4 residual gains, calculated using the aggregated scores, are used in support of the above findings. In all cases the girls’ residuals are positive, while the boys’ are negative except when end of P3 scores are used to predict end of P4 attainment. This marks the point in time when Cohort 1 returns to normal classes and both boys and girls in the reference groups make greater progress.

Clearly, having been in a small class for three years does not carry an advantage on return to normal classes in P4. As demonstrated previously (para 11.17) when pupils move to P3, experimental classes retain a slight advantage but when the end of P2 scores are used to predict P4 attainment boys and girls in the reference schools both outperform their peers in the experimental sample. This supports the view that any advantage of being in a small class initially declines from year to year. The differences between the experimental and reference classes for boys and the end of year P2 to the end of P3 scores for girls are statistically significant (1% level) in favour of the experimental classes but the effect sizes are extremely small and indicate no practical difference.

Table 11.4 Residual gains in the Experimental and Reference classes (aggregated scores)

Sample        Gender   P2 to P3         P2 to P4         P3 to P4
                         Mean   s.d.      Mean   s.d.      Mean   s.d.

Experimental  Boys     -0.258   8.77    -0.941   7.64    -1.098   8.20
Reference     Boys     -0.896   9.11    -0.160   7.97     0.040   8.81
Experimental  Girls     0.872   8.51     0.597   7.06     0.508   7.75
Reference     Girls     0.145   8.49     0.776   7.20     1.051   7.69

Gender effects significant at 1% level in all cases (very small effect sizes)

11.20 For all three comparisons (P1 to P2, P2 to P3 and P3 to P4) regression analysis was also employed to determine the magnitude of various effects using the combined aggregated scores. Pupil data for each of the participating schools (experimental and reference) was entered into a regression analysis in an attempt to assess the contribution to end of P2, P3 and P4 aggregated attainment of the following variables: gender, school attended, parental support, being in a small or normal class, being classified as SEN, learning orientation, and, successively, aggregated attainment either at the start of P1, P2 or P3 as appropriate. Two regression equations were constructed. The first used a simple linear regression model while the second adopted a multilevel approach with pupil characteristics as the first and schools as the second level variable.
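One way of apportioning explained variation among predictors in a linear regression is to enter them one at a time and record the increase in R-squared at each step. The sketch below illustrates that idea only. It assumes sequential entry with prior attainment first, which the report does not specify, and it uses synthetic data with made-up coefficients; the variable names (prior, sen, small_class) are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n = 500
# Synthetic, purely illustrative pupil data (not from the study)
prior = rng.normal(60, 20, n)            # pre-test aggregated attainment
sen = rng.binomial(1, 0.15, n)           # 1 = classified as SEN
small_class = rng.binomial(1, 0.5, n)    # 1 = taught in a small class
# Post-test built with no small-class effect, by construction
post = 10 + 0.7 * prior - 8 * sen + rng.normal(0, 12, n)

steps = {"prior": prior, "sen": sen, "small_class": small_class}
cols, previous, increments = [], 0.0, {}
for name, values in steps.items():
    cols.append(values)
    r2 = r_squared(np.column_stack(cols), post)
    increments[name] = r2 - previous     # extra variance explained at this step
    previous = r2
    print(f"{name:12s} adds {increments[name]:.3f} to R^2")
```

With sequential entry the increment credited to each variable depends on the order in which variables are entered, so percentages of this kind are best read as indicative of relative importance rather than as fixed quantities.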

11.21 From the start of P1 to the end of P2 the initial attainment accounted for 42.4% of the total explained variance. Being classified as SEN reduced a pupil’s score (unstandardised regression coefficient = -9.59) and accounted for a further 3% in variation. Parental support then contributed a further 2%. 17 of the schools contributed a further 4.2% to the total variation but being in a small class did not feature. Having a positive orientation to learning and being a girl jointly contribute a further 0.5% to the total variation. Of the 4 schools making a significant negative contribution to the end of P2 score 3 were from the reference schools. Thirteen schools made a positive contribution of which 7 came from the experimental sample. When the analysis was repeated, this time using a multi-level regression model to estimate the contribution of schools to the overall equation then the variation attributable to the various pupil contributions was 162.0 (standard error 4.23) and that for schools 12.47 (standard error 3.14). Thus some 8% of the observed pupil performance can be attributed to differences between schools while the major contributions remain that of initial attainment, attendance at kindergarten and parental support followed to a lesser degree by the learning orientation at the end of P2 and being a girl pupil. The multilevel analysis therefore confirms the findings of the simple regression model. In so far that it has been shown that school differences are always larger than those between the small and normal class samples, it would appear that the variation between schools can mainly be attributed to difference in intake.

11.22 A similar analysis was conducted tracing pupil progress from the beginning of P2 to the end of P3. The simple linear regression shows that the strongest variable accounting for the end of P3 attainment is the end of P1 score, which accounts for 64.3% of the total explained variation. Being in a small class or place of birth makes no significant contribution to the P3 score, while being classified as SEN, parental contribution and the end of P3 orientation to learning scores contribute a further 1.4%, 0.9% and 0.4%, respectively, to the explained variation. Twenty-seven schools make a significant contribution to the simple regression equation predicting the end of P3 attainment. Of these, 15 make a positive contribution and 12 of these are experimental schools. Of the 12 negative contributions 6 belong to the experimental sample. If instead of the end of P1 score the end of P2 score is used as the predictor of P3 attainment then this variable alone accounts for 75.9% of the explained variation, leaving little remaining variance to distribute between the other variables. There are small but positive associations with parental support and overall attitudes but being in a small P3 class makes a negative contribution of 1.43% to the end of year attainment score. When the multi-level model is employed, again using end of P1 attainment as the predictor, the pupil contribution to the total variation is 95.77 (standard error 1.84) while schools account for 8.10 (standard error 1.78). Thus around 8% of pupil performance at the end of P3 can be attributed to differences between schools.

11.23 The third analysis examines the variables associated with the end of P4 test scores. Using the linear regression model it appears that the end of P2 attainment accounts for 75% of the total variation. The fact that the reference schools did not take the test until mid September after beginning the P3 year probably contributes to the magnitude of this effect. Other factors such as end of P4 learning orientation and being a girl contribute approximately a further 1% to the explained variance. Twenty of the schools then contribute a further 1% (half positively and half negatively). Four of the 10 positive contributions and 8 of the negative ones come from schools in the experimental sample. Now, however, neither attendance in a small class, parental support nor place of birth contributes. If the analysis is repeated, this time using the end of P3 scores as the predictor of P4 attainment, then the effect of moving back to normal classes in Cohort 1 can be studied. Now, the end of P3 scores contribute 79% of the variance. Moving from the small to the normal class causes the combined score of pupils to fall by about 1.9% so that pupils’ experience of three years of SCT does not provide a sustained advantage. When multilevel regression analysis is performed using end of P2 scores to predict P4 attainment then the pupil contribution to the total variation is 72.70 (standard error 1.69) while the schools effect contributes 3.49 (standard error 0.90). Thus some 4% of pupils’ performance at the end of P4 can be attributed to school differences.
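The school-level percentages quoted in paragraphs 11.21 to 11.23 can be related to the variance estimates from the multilevel models. The sketch below assumes the share was obtained as the school-level variance divided by the sum of the pupil-level and school-level variances; the report does not state the formula, and the function name school_share is illustrative.

```python
# (pupil-level variance, school-level variance) as reported in
# paragraphs 11.21, 11.22 and 11.23 respectively
components = {
    "end of P2": (162.0, 12.47),
    "end of P3": (95.77, 8.10),
    "end of P4": (72.70, 3.49),
}

def school_share(pupil_var, school_var):
    """Proportion of total variance lying between schools.

    Assumption: total variance = pupil-level + school-level variance.
    """
    return school_var / (pupil_var + school_var)

shares = {label: school_share(*v) for label, v in components.items()}
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

Under this assumption the shares come out at roughly 7%, 8% and 5%, close to the rounded figures given in the text, and they show the same fall in the school contribution by the end of P4.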

11.24 In summary, the picture that emerges from these regression analyses is that the pupils’ initial ability at intake is a major determinant of pupils’ progress and this is influenced by various factors such as place of birth, parent support and attendance at kindergarten. Thus differences in post-test attainment between schools are largely explained by differences in their intake of pupils. Being in a small class does not have a significant effect but moving back to a normal class has a negative influence. As pupils move from P1 to P4 the contribution made by schools diminishes and the pre-test measure accounts for an increased proportion of variation in the pupils’ post-test scores. Starting with the start of P1 year scores it was found that these accounted for 42.4% of the variation in the scores at the end of P2. When this end of P2 score was used to predict attainment at the end of P3 the corresponding figure was 75.9% and when this P3 score was used to predict end of P4 scores it accounted for 79% of the explained variance. However, a degree of caution needs to be employed when interpreting these results. First, the data from the reference schools comes from different samples of pupils. Second, the timing of the pre-test varied so that in some cases the scores of the pupils in the experimental sample were collected some three months before those of the reference schools. Nevertheless, the main conclusion to emerge would appear to be that irrespective of whether pupils are in large or small classes the main factor contributing to progress is the quality of the teaching. What matters most in