6.2 ANALYSIS OF LANGUAGE ASSESSMENT DATA
6.2.2.3 Analysis of Coordinated Oral and Listening Scores
Assumption test statistics
The p-value of Box's test of equality of covariance matrices evaluates whether the variances and covariances among the dependent variables are the same for all levels of a factor. The assumption is treated as violated when the p-value is less than 0.05, although in practice it is rarely satisfied.
Spread versus level plots show whether any relationship exists between the group means and the group variances. Where a plot shows a strong pattern, the analysis is not reliable.
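The idea behind a spread versus level plot can be sketched in a few lines of Python. This is only an illustration: the scores and cell names below are hypothetical, and the correlation coefficient is an added summary of the pattern, not a statistic used in the report, which relied on visual inspection of the plots.

```python
from statistics import mean, stdev

def spread_vs_level(groups):
    """Return one (level, spread) pair per group: the group mean and its standard deviation."""
    return [(mean(scores), stdev(scores)) for scores in groups.values()]

def pearson_r(pairs):
    """Pearson correlation between the level (mean) and spread (SD) values."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores for three cells of the design (not data from the report).
groups = {
    "cell_a": [10, 12, 14, 16],
    "cell_b": [20, 21, 22, 23],
    "cell_c": [30, 30, 31, 31],
}

pairs = spread_vs_level(groups)
r = pearson_r(pairs)
print(pairs)
print(round(r, 2))  # a strongly negative value: spread shrinks as the level rises
```

In this made-up example the spread falls steadily as the mean rises, which is the kind of strong pattern that would call the analysis into question.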
Further, in order to satisfy the assumption of normality of the dependent variable values, the sample size in each combination of groups should be at least fifteen.
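The sample size check can be expressed as a simple count of cells that fall below the threshold. The sketch below assumes a hypothetical dictionary of cell sizes keyed by form level, school level and teaching mode; the numbers are illustrative and are not taken from the report.

```python
MIN_CELL_SIZE = 15  # threshold stated in the text

def small_cells(cell_sizes, minimum=MIN_CELL_SIZE):
    """Return the cells whose sample size is below the minimum, in sorted order."""
    return sorted(cell for cell, n in cell_sizes.items() if n < minimum)

# Hypothetical cell sizes (form level, school level, teaching mode) -> number of students.
cell_sizes = {
    ("F1", "High", "NET"): 32,
    ("F1", "High", "Local"): 14,
    ("F3", "Low", "Both"): 15,
    ("F4", "Medium", "NET"): 9,
}

flagged = small_cells(cell_sizes)
print(f"{len(flagged)} out of {len(cell_sizes)} cells have fewer than {MIN_CELL_SIZE} students")
```

A cell with exactly fifteen students passes the check, since the text asks for a size of at least fifteen.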
Analysis test statistics
Wilks' Lambda was used to test whether the independent factors or their interactions have a significant effect on the mean values of the coordinated oral and listening assessment scores.
The multivariate Eta Squared value is the corresponding multivariate effect size index. It ranges from 0 to 1, with a value of 1 indicating the strongest possible relationship between the independent or interaction factor and the dependent variables. Where independent or interaction factors were found to be significant in the MANOVA, a follow-up ANOVA was carried out to test the teaching mode effect. It should be noted that this follow-up ANOVA is different from a univariate ANOVA because the former is performed on dependent variable values that are available in the multivariate analysis.
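The report does not state how the multivariate Eta Squared values were computed. A minimal sketch, assuming the definition commonly used with Wilks' Lambda, eta squared = 1 - Lambda^(1/s), where s is the smaller of the number of dependent variables and the degrees of freedom of the effect. The function name and argument names are hypothetical.

```python
def multivariate_eta_squared(wilks_lambda, n_dependent, effect_df):
    """Effect size from Wilks' Lambda, assuming eta^2 = 1 - Lambda**(1/s).

    n_dependent: number of dependent variables (two here: oral and listening).
    effect_df:   degrees of freedom of the effect, e.g. levels - 1 for a main effect.
    """
    s = min(n_dependent, effect_df)
    return 1.0 - wilks_lambda ** (1.0 / s)

# A factor with three levels (effect_df = 2) and Wilks' Lambda of 0.56:
print(round(multivariate_eta_squared(0.56, n_dependent=2, effect_df=2), 2))  # 0.25

# An intercept (effect_df = 1) with Wilks' Lambda of 0.06:
print(round(multivariate_eta_squared(0.06, n_dependent=2, effect_df=1), 2))  # 0.94
```

Under this assumption the two examples agree, to rounding, with the school level and intercept rows of the first multivariate table in the Findings. Because the tabulated Lambda values are rounded, only approximate agreement should be expected.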
Test statistics for follow-up ANOVA
The p-value of Levene's test of equality of error variances evaluates whether the error variance of each dependent variable is the same for all levels of a factor. A value of less than 0.05 indicates that this assumption is violated. In practice, however, this assumption is rarely satisfied.
Type I error level adjustment (alpha level)
For the 1st cohort 2nd administration analyses, there are four levels of teaching mode. The multivariate tests therefore use an alpha level of 0.05, and the follow-up ANOVA tests use an alpha level of 0.05 divided by the total number of ANOVAs performed. Multiple comparison tests within a follow-up ANOVA use the ANOVA alpha level divided by the number of distinct comparisons.
For the 1st cohort 1st administration and 2nd cohort administration analyses, there are 3 levels of teaching mode. Here, multivariate tests use an alpha level of 0.05, and the follow-up ANOVA tests use an alpha level of 0.05. For multiple comparisons, when the follow-up ANOVA effect is significant, an alpha level of 0.05 divided by the number of ANOVAs is used. When the follow-up ANOVA is not significant, an alpha level of 0.05 is used.
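The two adjustment schemes described above can be written out as a short Python sketch. The function names are hypothetical. The numbers of ANOVAs in the examples are taken from the counts of factor levels and dependent variables stated in the Findings, on the assumption that one ANOVA is run per combination of those levels and each dependent variable.

```python
from math import comb

FAMILY_ALPHA = 0.05  # alpha level used for the multivariate tests

def four_level_alphas(n_anovas, n_modes=4):
    """Scheme for analyses with four teaching modes.

    Returns (alpha for each follow-up ANOVA, alpha for each multiple comparison).
    """
    anova_alpha = FAMILY_ALPHA / n_anovas
    n_pairs = comb(n_modes, 2)  # distinct pairwise comparisons among the modes
    return anova_alpha, anova_alpha / n_pairs

def three_level_comparison_alpha(n_anovas, anova_significant):
    """Scheme for analyses with three teaching modes (ANOVAs themselves use 0.05)."""
    return FAMILY_ALPHA / n_anovas if anova_significant else FAMILY_ALPHA

# Primary, 1st cohort 2nd administration: 2 dependent variables, 4 teaching modes.
print(round(four_level_alphas(n_anovas=2)[1], 4))                        # 0.0042

# Secondary, 2nd cohort 1st administration: 3 school levels x 2 forms x 2 dependent variables.
print(round(three_level_comparison_alpha(3 * 2 * 2, True), 4))           # 0.0042

# Primary, 1st cohort 1st administration: 2 forms x 2 dependent variables.
print(round(three_level_comparison_alpha(2 * 2, True), 4))               # 0.0125
```

These results match the alpha levels quoted for the corresponding analyses in the Findings.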
(B) Testing the Assumptions
Statistics testing the assumptions of 3-way MANOVA analysis for oral and listening scores are listed in the table below together with the results for the secondary and primary data.
Assumption test statistics for the oral and listening assessments

Secondary students
Analysis                        Box's test p-value   Pattern in spread vs level plot   Samples with size < 15   Reliability of the analysis
1st cohort 1st administration   < 0.001              No                                None out of 27           Moderate
1st cohort 2nd administration   < 0.001              No                                13 out of 30             Moderate
2nd cohort 1st administration   < 0.001              No                                None out of 16           Moderate
2nd cohort 2nd administration   < 0.001              Weak negative relationship        One out of 15            Poor

Primary pupils
Analysis                        Box's test p-value   Pattern in spread vs level plot   Samples with size < 15   Reliability of the analysis
1st cohort 1st administration   < 0.001              No                                None out of 6            Moderate
1st cohort 2nd administration   0.082                No                                2 out of 7               Good
2nd cohort 1st administration   0.003                Too few data                      None out of 4            Moderate
2nd cohort 2nd administration   0.01                 Too few data                      2 out of 6               Moderate

Results in the above table suggest that the reliability of the analyses was moderate to good, except for the analysis of the 2nd cohort 2nd administration for secondary students, and therefore the result of this analysis is not reported.
(C) Findings
SPSS outputs of all the analyses can be obtained upon request from the research team (see footnote on page 5 above).
(i) Secondary Schools 1st cohort 1st Administration
Three-way MANOVA on Oral & Listening Assessment
In the table for multivariate tests, there is a significant interaction effect of form level, school level and teaching mode, with a Wilks' Lambda p-value of less than 0.001 and an F-statistic of 6.8 (with df1 = 16 and df2 = 2474). The corresponding Eta Squared value was 0.042.
These results indicate that a follow-up ANOVA of the interaction effect should be conducted.
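As a consistency check on the figures quoted above, the F-statistic can be approximately recovered from the Eta Squared value and the two degrees of freedom. The sketch assumes that the F-statistic and Eta Squared are both derived from the same power of Wilks' Lambda, which is the case when there are two dependent variables; the report itself does not state this relation, and the function name is hypothetical.

```python
def f_from_eta_squared(eta_squared, hypothesis_df, error_df):
    """Approximate F from eta squared, assuming F = eta^2 / (1 - eta^2) * (df2 / df1)."""
    return (eta_squared / (1.0 - eta_squared)) * (error_df / hypothesis_df)

# Three-way interaction: Eta Squared 0.042, df1 = 16, df2 = 2474.
f_value = f_from_eta_squared(0.042, hypothesis_df=16, error_df=2474)
print(round(f_value, 1))  # 6.8
```

The recovered value agrees with the reported F-statistic of 6.8. Small discrepancies are to be expected for other rows because the tabulated Eta Squared values are rounded.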
Multivariate tests
Effect                    Test            Value   F         Hypothesis df   Error df   Sig.    Eta Squared
Intercept                 Wilks' Lambda   0.06    10118.0   2               1237       0       0.942
FORM1                     Wilks' Lambda   0.96    12.8      4               2474       0       0.02
SCH_LV                    Wilks' Lambda   0.56    206.2     4               2474       0       0.25
T98_99                    Wilks' Lambda   0.99    3.1       4               2474       0.015   0.005
FORM1 * SCH_LV            Wilks' Lambda   0.98    3.0       8               2474       0.003   0.01
FORM1 * T98_99            Wilks' Lambda   0.97    5.6       8               2474       0       0.018
SCH_LV * T98_99           Wilks' Lambda   0.88    20.9      8               2474       0       0.063
FORM1 * SCH_LV * T98_99   Wilks' Lambda   0.92    6.8       16              2474       0       0.042
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+FORM1+SCH_LV+T98_99+FORM1 * SCH_LV+FORM1 * T98_99+SCH_LV * T98_99+FORM1 * SCH_LV * T98_99
Follow-up ANOVA on Oral & Listening Assessment
Multiple comparisons of the follow-up ANOVAs were tested at an alpha level of 0.0028. The multiple comparisons of the following combinations of factor levels were tested at an alpha level of 0.05: F4 class in a High ability school; F4 class in a Low ability school; and F3 class in a Low ability school.
In the analysis of the oral scores, there were 12 out of 27 multiple comparisons that showed a significant teaching mode effect (Please refer to the table below).
Form level   School level   ANOVA p-value   Significant comparison   p-value
Form 1       Medium         0.001           Both > Local             0.015
Form 1       Low            0.001           Both > NET               0.002
Form 3       High           0.001           NET > Both               < 0.001
Form 3       High           0.001           Local > Both             < 0.001
Form 3       Medium         < 0.001         Both > Local             < 0.001
Form 3       Medium         < 0.001         NET > Local              < 0.001
Form 3       Low            0.001           Both > Local             0.003
Form 4       High           < 0.001         NET > Both               0.001
Form 4       High           < 0.001         Local > Both             < 0.001
Form 4       Medium         0.001           Both > Local             0.001
Form 4       Low            < 0.001         NET > Both               < 0.001
Form 4       Low            < 0.001         NET > Local              < 0.001
In the analysis of the listening scores, there were 10 out of 27 multiple comparisons that showed a significant teaching mode effect (please refer to the table below).
Form level   School level   ANOVA p-value   Significant comparison   p-value
Form 1       Low            < 0.001         Both > Local             < 0.001
Form 3       Medium         < 0.001         Both > Local             < 0.001
Form 4       Medium         < 0.001         Both > Local             0.001
Form 1       Low            < 0.001         Both > NET               < 0.001
Form 4       Medium         < 0.001         Both > NET               < 0.001
Form 1       High           < 0.001         NET > Both               < 0.001
Form 3       High           < 0.001         NET > Both               < 0.001
Form 1       High           < 0.001         Local > Both             < 0.001
Form 3       High           < 0.001         Local > Both             < 0.001
Form 3       Medium         < 0.001         NET > Local              0.001
Discussion
Students at F1 in medium and low ability schools perform better when they have had the opportunity of being taught by both the local teacher and the NET. In contrast, in high ability schools, F1 students taught by a combination of NET and local teacher perform significantly more poorly than those taught by either a NET or a local teacher. The same pattern is replicated for High level F3 groups, where students taught by Both or Local perform relatively more poorly. For F4, however, the pattern is reversed: students taught by either a NET or a local teacher perform relatively better than those taught by a combination of both. These findings do not indicate the cause of any such relationships, since they are baseline, cross-sectional data.
The first administration of the language assessment instruments was conducted in March 1999. The first cohort NETs had been in place since the previous September. In this period of six months, during which the oral, listening and writing tests were being developed and piloted, the sampled groups would have been influenced by their English teacher, whether he or she was a NET or a local teacher, or whether that class had been taught by both NETs and local teachers, perhaps in some form of team teaching or split class teaching. This influence may be very small, but it is likely to be represented in the data gathered in the first administration of the language assessments to the first cohort groups.
While the influence of the students' English teacher (whether a NET, a local or Both) is likely to account for a certain proportion of the variation in scores between the different groups represented in the findings, the scores of the children assessed at this time will have been influenced by a host of other factors and it is not possible to identify any causal factors for the differences found. These are baseline data, gathered in order to identify and understand the pre-existing patterns or tendencies in the data and provide a basis for making longitudinal comparisons based on the second administration scores obtained from the same students.
1st cohort 2nd Administration
Three-way MANOVA on Oral & Listening Assessment
In the table for multivariate tests, there is a significant interaction effect of school level and teaching mode, with a Wilks' Lambda p-value equal to 0.003 and an Eta Squared value of 0.027. Therefore a 2-way MANOVA was performed.
Multivariate tests
Effect                   Test            Value   F        Hypothesis df   Error df   Sig.    Eta Squared
Intercept                Wilks' Lambda   0.1     1653.8   2               533        0       0.861
FORM2                    Wilks' Lambda   0.9     11.3     4               1066       0       0.041
SCH_LV                   Wilks' Lambda   0.7     48.6     4               1066       0       0.154
GROUP                    Wilks' Lambda   1.0     0.3      6               1066       0.935   0.002
FORM2 * SCH_LV           Wilks' Lambda   1.0     4.1      6               1066       0       0.023
FORM2 * GROUP            Wilks' Lambda   1.0     0.6      8               1066       0.809   0.004
SCH_LV * GROUP           Wilks' Lambda   0.9     2.5      12              1066       0.003   0.027
FORM2 * SCH_LV * GROUP   Wilks' Lambda   1.0     1.2      8               1066       0.305   0.009
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+FORM2+SCH_LV+GROUP+FORM2 * SCH_LV+FORM2 * GROUP+SCH_LV * GROUP+FORM2 * SCH_LV * GROUP
Two-way MANOVA on Oral & Listening Assessment
Assumption: The p-value of Box's test of equality of covariance matrices is less than 0.001. The Spread versus level plots showed no clear pattern. The sample size varied among the group combinations and was less than fifteen for a few of them. The reliability of this analysis was moderate. A follow-up ANOVA of the interaction effect was therefore conducted.
Follow-up ANOVA on Oral & Listening Assessment
There were 3 levels of school, 2 dependent variables and 4 levels of teaching mode. Multiple comparisons were tested at an alpha level of 0.0014.
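The number of comparisons and the alpha level for this analysis follow from the counts just given. The sketch below enumerates the pairwise comparisons explicitly. It assumes that the four teaching modes are the four two-year teaching sequences named in the results tables, and that one follow-up ANOVA is run for each combination of school level and dependent variable; both are assumptions, since the report does not list them in this form.

```python
from itertools import combinations

# Assumed labels for the four teaching modes (taken from the results tables).
TEACHING_MODES = ["NET to NET", "Local to Local", "Local to NET", "NET to Local"]
SCHOOL_LEVELS = ["High", "Medium", "Low"]
DEPENDENT_VARIABLES = ["Oral scores", "Listening scores"]

pairs = list(combinations(TEACHING_MODES, 2))                # distinct pairwise comparisons
n_anovas = len(SCHOOL_LEVELS) * len(DEPENDENT_VARIABLES)     # one ANOVA per school level and score
n_comparisons = n_anovas * len(pairs)
comparison_alpha = 0.05 / n_anovas / len(pairs)

print(len(pairs), n_anovas, n_comparisons, round(comparison_alpha, 4))
# 6 6 36 0.0014
```

Under these assumptions the sketch reproduces both the 36 multiple comparisons and the alpha level of 0.0014 reported for this analysis.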
There were 36 multiple comparisons, and 2 of them were significant. These are shown in the table below. The most pertinent finding is that, on average, students who had been taught by NETs in the two-school-year period from September 1998 to May 2000 had significantly higher listening scores than those taught by local English teachers in the same period.
The other significant comparison shows that students who had been taught by local teachers and NETs in successive school years (1998-1999 & 1999-2000) had significantly different oral assessment scores. Students who had been taught by a local teacher in the first year, and a NET in the second, had significantly higher scores than those taught by a NET in the first year and a local teacher in the succeeding year.
Language assessment   School level        ANOVA p-value   Significant comparison          p-value
Listening scores      Low level school    < 0.001         NET to NET > Local to Local     < 0.001
Oral scores           High level school   < 0.001         Local to NET > NET to Local     < 0.001
2nd cohort 1st Administration
Three-way MANOVA on Oral & Listening Assessment
In the table for multivariate tests, there is a significant interaction effect of form level, school level and teaching mode, with a Wilks' Lambda p-value of less than 0.001, an F-statistic of 12.6 (df1 = 4 and df2 = 1380) and an Eta Squared value of 0.035. These results indicate that a follow-up ANOVA of the interaction effect should be conducted.
Multivariate tests
Effect                  Test            Value   F        Hypothesis df   Error df   Sig.    Eta Squared
Intercept               Wilks' Lambda   0.1     5964.3   2               690        0       0.945
FORM                    Wilks' Lambda   1.0     6.7      2               690        0.001   0.019
SCH_LV                  Wilks' Lambda   0.8     50.6     4               1380       0       0.128
TEACH                   Wilks' Lambda   1.0     4.1      4               1380       0.003   0.012
FORM * SCH_LV           Wilks' Lambda   1.0     4.4      4               1380       0.001   0.013
FORM * TEACH            Wilks' Lambda   1.0     8.4      4               1380       0       0.024
SCH_LV * TEACH          Wilks' Lambda   0.9     9.6      8               1380       0       0.053
FORM * SCH_LV * TEACH   Wilks' Lambda   0.9     12.6     4               1380       0       0.035
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+FORM+SCH_LV+TEACH+FORM * SCH_LV+FORM * TEACH+SCH_LV * TEACH+FORM * SCH_LV * TEACH
Follow-up ANOVA on Oral & Listening Assessment
There were three levels of school, two levels of form, and 2 dependent variables. When the ANOVA was significant, each multiple comparison used an alpha level of 0.0042, and when the ANOVA was not significant, each multiple comparison used an alpha level of 0.05.
Five of the 28 multiple comparisons tested were significant. Referring to the column headed Significant comparison in the table below, we can see that on average, students taught by certain types of teacher had higher scores than those taught by other types. For example, F1 students in High ability schools who were being taught by a NET scored significantly higher in their listening assessment than those taught in split class, oral only, or team teaching mode by both local teachers and NETs.
Form level: Form 1, Form 1, Form 3, Form 1, Form 3
School level: High, High, Low, High
Language scores: Listening scores, Listening scores, Oral scores, Oral scores
ANOVA p-value: < 0.001, < 0.001, 0.019, < 0.001, 0.002
Significant comparison: NET > Both, Local > Both, Local > Both, Both > Local, Local > Both
p-value: < 0.001, < 0.001, 0.019, < 0.001, 0.002
Discussion
The findings in the table above are baseline data gathered at a point in time early in these students' experience with a 'brand new NET' (i.e. one recruited from September 1999 (see Section 53.13 above)).
For the second cohort investigation, the first administration of the language assessment instruments was conducted in November 1999. The second cohort NETs had been in place since the previous September. In this period of three months, the sampled groups would have been influenced by their English teacher, whether he or she was a NET or a local teacher, or whether that class had been taught by both NETs and local teachers. While this influence is likely to be represented in the data gathered in the first administration of the language assessments to the second cohort groups, and therefore represented in the table above, it is not likely to be large given the short time span involved. Thus, the influence of the students' English teacher may account for a certain proportion of the variation in scores between the different groups represented in the findings, but the scores will have been influenced by a host of other factors and it is not possible to identify any causal factors to account for the differences found.
This baseline information was gathered in order to identify and understand the pre-existing patterns or tendencies in the data and provide a basis for making longitudinal comparisons based on the second administration scores obtained from the same students.
2nd cohort 2nd Administration
Three-way MANOVA on Oral & Listening Assessment
In the table for multivariate tests, there are significant interaction effects of form level and teaching mode and of school level and teaching mode. Both have Wilks' Lambda p-values of less than 0.001, with Eta Squared values of 0.030 and 0.033, respectively. Two 2-way MANOVAs were therefore conducted.
Multivariate tests
Effect                  Test            Value   F        Hypothesis df   Error df   Sig.   Eta Squared
Intercept               Wilks' Lambda   0.1     5360.0   2               576        .000   .949
FORM                    Wilks' Lambda   1.0     7.2      2               576        .001   .024
SCH_LV                  Wilks' Lambda   0.8     31.9     4               1152       .000   .100
TEACH                   Wilks' Lambda   0.9     11.4     4               1152       .000   .038
FORM * SCH_LV           Wilks' Lambda   1.0     3.4      4               1152       .010   .012
FORM * TEACH            Wilks' Lambda   0.9     8.9      4               1152       .000   .030
SCH_LV * TEACH          Wilks' Lambda   0.9     4.9      8               1152       .000   .033
FORM * SCH_LV * TEACH   Wilks' Lambda   1.0     2.3      2               576        .100   .008
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+FORM+SCH_LV+TEACH+FORM * SCH_LV+FORM * TEACH+SCH_LV * TEACH+FORM * SCH_LV * TEACH
Two-way MANOVA on Oral & Listening Assessment
Checking Assumptions
The Spread versus level plots for the 2-way MANOVA of school level and teaching mode showed a strong pattern. There was a similar finding for the 2-way MANOVA of form level and teaching mode. These results suggest the statistical results were unreliable and therefore they are not reported here.
(ii) Primary Schools 1st cohort 1st Administration
Two-way MANOVA on Oral & Listening Assessment
In the table of multivariate tests, there is a significant interaction effect of form level and teaching mode, with a Wilks' Lambda p-value of less than 0.001, an F-statistic of 6.9 (df1 = 4 and df2 = 718) and an Eta Squared value of 0.037. These results suggest that a follow-up ANOVA of the interaction effect should be conducted.
Multivariate tests
Effect           Test            Value   F        Hypothesis df   Error df   Sig.   Eta Squared
Intercept        Wilks' Lambda   0.1     1859.1   2               359        .000   .912
FORM1            Wilks' Lambda   0.9     20.6     2               359        .000   .103
T98_99           Wilks' Lambda   1.0     3.3      4               718        .012   .018
FORM1 * T98_99   Wilks' Lambda   0.9     6.9      4               718        .000   .037
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+FORM1+T98_99+FORM1 * T98_99
Follow-up ANOVA on Oral & Listening Assessment
There were two levels of form, two dependent variables, and 3 levels of teaching mode. When the ANOVA was significant, multiple comparisons used an alpha level of 0.0125, and when the ANOVA was not significant, multiple comparisons used an alpha level of 0.05.
Four out of 12 multiple comparisons were significant.
Form level      Language scores    ANOVA p-value   Significant comparison   p-value
Primary three   Oral scores        0.034           Both > Local             0.041
Primary three   Listening scores   0.039           Local > NET              0.047
Primary five    Listening scores   0.002           NET > Local              0.002
Primary five    Listening scores   0.002           NET > Both               0.002
Discussion
Similar considerations to those discussed in section 6.2.2.3 (C) (i) above in relation to the secondary data apply to these primary school results. The findings provide baseline information and were gathered in order to identify and understand the pre-existing patterns or tendencies in the data and provide a basis for making longitudinal comparisons based on the second administration scores obtained from the same students. Nevertheless, the assessments were administered after six months of the different teaching modes in question; hence the influence of the teacher is likely to be present in the data.
The findings show interesting, but predictable, differences between P3 and P5 students. In the lower levels of form, pupils benefit more from the NET when he or she teaches only a proportion of their English lessons. Pupils' oral and listening scores, on average, were significantly higher when they were taught by both the NET and a local teacher. At P5, however, there is evidence to suggest that pupils could benefit from being taught by the NET for all their English lessons. This was reflected in the significantly higher listening scores of pupils in NET classes when compared to those of pupils taught by locals or a combination of NET and local.
1st cohort 2nd Administration
Two-way MANOVA on Oral & Listening Assessment
In the table for multivariate tests, the p-value of the teaching mode effect is less than 0.001, with an Eta Squared value equal to 0.057. These results suggest that a follow-up ANOVA of the teaching mode effect should be conducted.
Multivariate tests
Effect           Test            Value   F        Hypothesis df   Error df   Sig.   Eta Squared
Intercept        Wilks' Lambda   0.1     3594.1   2               402        .000   .947
GROUP1           Wilks' Lambda   0.9     8.1      6               804        .000   .057
FORM2            Wilks' Lambda   1.0     9.4      2               402        .000   .045
GROUP1 * FORM2   Wilks' Lambda   1.0     2.1      4               804        .077   .010
a. Exact statistic
b. The statistic is an upper bound on F that yields a lower bound on the significance level.
c. Design: Intercept+GROUP1+FORM2+GROUP1 * FORM2
Follow-up ANOVA on oral and listening scores
There were two dependent variables and four levels of teaching mode. Multiple comparisons used an alpha level of 0.0042. Five of the 12 multiple comparisons were significant. These are shown in the table below.
Language assessment   ANOVA p-value   Significant comparison            p-value
Oral scores           < 0.001         Local to NET > NET to NET         < 0.001
Oral scores           < 0.001         Local to Local > NET to NET       < 0.001
Listening scores      < 0.001         Local to NET > NET to NET         < 0.001
Listening scores      < 0.001         NET to Local > Local to Local     0.002
Listening scores      < 0.001         Local to NET > Local to Local     < 0.001
On average, pupils taught exclusively by NETs or exclusively by local teachers for the two-year period obtained lower scores on the oral and listening assessments than those taught by local teachers or by a teaching combination (team teaching or split class teaching) of local teachers and NETs.
Discussion
These results do not distinguish between pupils in different class levels or in schools with particular ability levels. They apply to all primary schools in the sample. They present a cross-sectional profile of differences at the end of a two-year period of treatment. The general indication is that, in the primary schools, a concentration of NET teaching over two years does not necessarily lead to measurable differences in oral ability. When pupils taught by NETs for two years are compared with pupils taught by locals for two years, or by a local for the first year and a NET for the second year, their oral and listening scores are significantly lower. On the other hand, those pupils with the opportunity of having one year of NET teaching in the