
National Chengchi University


Step 1: Sampling and Orientation

Before the fall semester, sixty low-achieving freshmen were selected from two classes of forty on the basis of their EBCT scores to participate in the present study (see the Participants section). In the first English writing class, both classes were informed that they would have a writing class every week. At that point, the participants did not know that they would take a picture-description writing test in the following period.

Step 2: Pre-test

In the second English class, the pre-test was administered to collect data on the students' initial writing ability. The students were asked to write 50 English words describing four pictures on the test paper. Before the 30-minute picture description, the whole class spent 3 minutes discussing the story shown in the pictures, so that everyone would have a clear idea of what to write. The students could then focus on their use of English rather than on working out the plot.

With the help of a colleague of the teacher-researcher, the two participating classes took the test concurrently. All test takers were told to read the instructions carefully before they began writing. Dictionaries, translators, and peer help were not allowed. The test papers were not returned to the students until the end of the experiment.

Step 3: Rating and Grouping

The picture description test papers were first rated by the teacher-researcher and one fellow English teacher before the next writing session.

Rating was divided into two stages: holistic scoring and error counting. This procedure was carried out after both the pre-test and the post-test. On the one hand, the results of holistic scoring were used to answer the first and second research questions, which concerned the comparison between the experimental group and the control group. On the other hand, error counting addressed the third research question, which concerned the change in errors after the treatment. The method of error counting is explained below.

Since research question three targeted the change in errors between the pre-test and the post-test, the error counting method was devised with reference to Weltig (2004). The procedure went as follows. First, the number and the types of errors in each piece of writing were recorded and counted. Second, the error total was tallied by adding up the numbers of the 18 types of errors (see Appendix C) in the experimental group. Next, the percentage of each error type was calculated by dividing the number of errors of that type by the number of all errors.

For example, if the error total was 933 and errors of verb formation appeared 54 times, the error percentage was calculated as (54/933)*100%, which was 5.79%.

The error percentages obtained in this way were used later to observe the change in errors before and after the treatment.
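The percentage calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `error_percentages` and the label "all other types" (which lumps the remaining 17 error types into one placeholder count so that the total is 933) are assumptions made for the example, not part of the study's materials.

```python
def error_percentages(counts):
    """Return each error type's share of all errors, in percent (2 decimals)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no errors were counted")
    return {etype: round(n / total * 100, 2) for etype, n in counts.items()}


# Figures from the worked example in the text: 54 verb-formation errors
# out of 933 errors in total. "all other types" is a hypothetical placeholder.
example_counts = {"verb formation": 54, "all other types": 933 - 54}
percentages = error_percentages(example_counts)
print(percentages["verb formation"])
```

Running the same function on the pre-test and post-test counts would give two sets of percentages that can be compared type by type.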

Intra-rater Reliability

To check intra-rater reliability, ten of the writings were rated a second time two weeks after the first rating. Each of the two raters independently scored them against the GEPT scoring criteria, as she had done for the first rating. All scores were recorded on separate sheets of paper, so the ten writings remained free of marks. Because two weeks had passed, neither rater clearly remembered the levels she had assigned in the first rating.

For each rater, the two sets of grades from the first and second ratings were submitted to reliability analysis (Cronbach's Alpha in SPSS 12.0). For the researcher's ratings, there was a strong positive correlation between the two ratings, with a correlation coefficient (r) of .945 (-1 ≦ r ≦ 1). For the other rater, the reliability reached .953 (-1 ≦ r ≦ 1).

Inter-rater Reliability

Following the GEPT scoring guide, the two raters independently scored two clean photocopies of the same ten pre-test writings on the day after the pre-test. The two sets of scores were later analyzed for inter-rater reliability by means of Cronbach's Alpha in SPSS 12.0. The correlation coefficient (r) was .947 (-1 ≦ r ≦ 1). This result indicated a close positive correlation between the two raters' scores. In other words, the raters agreed closely in applying the GEPT scoring guide, which made the rating highly reliable.
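For readers without SPSS, the reliability analysis can be approximated with the standard Cronbach's alpha formula, treating each set of scores (two raters, or one rater's first and second ratings) as an "item". The sketch below is a minimal illustration using only the Python standard library; the function name `cronbach_alpha` and the sample scores are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(*ratings):
    """Cronbach's alpha for k sets of scores given to the same papers.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("at least two sets of scores are needed")
    if len({len(r) for r in ratings}) != 1:
        raise ValueError("every set must score the same papers")
    item_variances = sum(variance(r) for r in ratings)
    totals = [sum(scores) for scores in zip(*ratings)]
    return k / (k - 1) * (1 - item_variances / variance(totals))


# Hypothetical GEPT-style levels for ten papers from two raters.
rater_1 = [3, 4, 2, 5, 3, 4, 3, 2, 4, 5]
rater_2 = [3, 4, 3, 5, 3, 3, 3, 2, 4, 4]
print(round(cronbach_alpha(rater_1, rater_2), 3))
```

The same call works for intra-rater reliability by passing one rater's first and second ratings instead of two raters' scores.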

Next, each participant's pre-test score was obtained by averaging the scores from the two raters. For instance, student A was rated as level 4 by one rater and as level 3 by the other, so his pre-test writing scored 3.5. These scores were then used to select 56 participants out of the 80 students, forming an experimental group of 28 and a control group of 28. Both groups were at a similar level of writing.

Grouping

Before the experiment, two senior high freshman classes of forty had taken the pre-test, in which each student was asked to write a 50-word English paragraph describing four pictures (see Appendix A). However, only the 56 participants' pre-test results were compared, using an independent-samples t-test (SPSS 12.0). The results are listed below:


Table 3.1

T-test Result for Pre-test

Group            Mean   SD    T-value   df
Class A (n=28)   3.83   .77   2.17      54
Class B (n=28)   3.45   .57

** = p < .005

The T-value of 2.17 (df = 54) did not reach the .005 significance level marked in Table 3.1, indicating no significant difference in writing performance between the two groups of students, so no adjustment was necessary to form one experimental and one control group. Class A was then randomly assigned as the experimental group and Class B as the control group.
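As a rough check on Table 3.1, an independent-samples t statistic can be recomputed from the summary statistics alone. The sketch below assumes the equal-variance (pooled) form of the test; the function name `pooled_t` is hypothetical, and because the means and standard deviations in the table are rounded to two decimals, the result need not match the reported T-value exactly.

```python
import math


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Independent-samples t statistic (pooled variance) and its df."""
    df = n_1 + n_2 - 2
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / std_error, df


# Rounded summary statistics taken from Table 3.1.
t_value, df = pooled_t(3.83, 0.77, 28, 3.45, 0.57, 28)
print(round(t_value, 2), df)
```

The resulting statistic can then be compared with the critical value for the chosen significance level at the same degrees of freedom.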
