
Learning Motivation, Classroom Assessment, and Teacher Contingency (II)


National Science Council, Executive Yuan — Research Project Final Report

Learning Motivation, Classroom Assessment, and Teacher Contingency (Year 2)

Project type: Individual
Project number: NSC 100-2410-H-004-186-MY2
Project period: August 1, 2012 to July 31, 2013
Host institution: Foreign Language Center, National Chengchi University
Principal investigator: Shu-Chen Huang (黃淑真)
Co-principal investigator: 崔正芳
Project participants: teaching assistants (part-time assistants) 劉珮宜 and 李承泰; doctoral student (part-time assistant) 徐維里
Report attachments: report on research conducted abroad; report on attending an international conference and the paper presented
Public access: this project involves patents or other intellectual property rights and may be made publicly searchable after two years

October 4, 2013


Chinese Abstract: This project examines in depth the relationship between learning motivation and classroom assessment, and ways of combining classroom assessment with teacher contingency. It builds on two papers produced under the researcher's earlier NSC project (ROC year 96, i.e., 2007): one explored teachers' and students' views on using tests and grades to motivate learning, and the other compared the effects of two assessment formats, tests versus reports, on students' motivation and learning strategies. Both papers concluded that students' motivation shapes how they respond to course requirements, and that the design of classroom assessment in turn affects motivation; when allocating course grades, teachers largely take students' motivation and likely degree of engagement into account. The first half of this project expands the survey of teachers' and students' practices, attitudes, and beliefs in order to test whether the previously observed local examination-culture effects and teacher-student mindsets are phenomena specific to Chinese societies, Asian phenomena, or cross-national, cross-cultural phenomena. Building on the earlier studies, the questionnaire surveys and interviews are scaled up to tease out the subtle relationship between motivation and assessment and its underlying causes, to propose directions for revising motivation and assessment theories, and to offer practices for college English teachers to consult. The second half of the project draws on the most recent classroom assessment theory (Black & Wiliam, 2009) to design college EFL speaking and writing instruction in line with the recent "assessment FOR learning" movement. Following the notion of teacher contingency in that theory, it also brings in contingency theory from business management and examines the feasibility and benefits of teacher contingency. A quasi-experimental design with two instructional sequences each for speaking and writing is planned: the control group follows a conventional, pre-written lesson plan with teacher intervention after student practice, whereas in the experimental group the lesson plan is not preset and is decided only after the teacher has reviewed student work. Student motivation questionnaires are collected before and after the experiment, and content analyses are conducted on student work before and after revision, in the hope that the results will shed light on the possibilities for, and refinement of, teacher contingency. Combining motivation, assessment, and contingency is an innovative line of thinking in English language teaching, and its theoretical and practical value merits further investment of effort.

Chinese Keywords: formative assessment, classroom assessment, learning motivation, teacher contingency, English writing, feedback, revision

English Abstract: Traditional teacher feedback given in response to L2 students' writing has been characterized as laborious, ineffective, and even harmful. Recent studies have recommended various promising methods that help to remedy observed problems. In a required integrated EFL course for a group of 38 college freshmen, the researcher implemented several suggested feedback practices to replace conventional written feedback on individual submissions. Major non-traditional variations included revision lessons between the drafting and revision stages, with teaching points identified from learners' drafts. After three rounds of writing, with each round consisting of a draft-instruction-revision sequence, the three drafts and revisions were evaluated using five items scored by two independent raters. Also, learner experience was clarified based on responses to an open-ended questionnaire. Findings indicate that learners improved from drafts to revisions for all three tasks on all five measures. Among the three drafts and three revisions, significant differences were found between the first and the third drafts. Furthermore, learners reported that they improved more on organization and argument than on local linguistic features. In sum, the proposed assessment-based feedback lesson demonstrates its potential value as a replacement for written feedback provided on individual student papers.

English Keywords: formative assessment, teacher contingency, written feedback, corrective feedback, second/foreign language writing


EFL Writing Instruction Based on Classroom Assessment

to Motivate Learner Revision

Table of Contents

Introduction
Replacing Individual Written Feedback with a Face-to-face Feedback Instruction to All
The Study
  The Context and Participants
  Procedures
  Research Questions
  Data Collection and Analysis
Results
Discussion
Limitations and Conclusion
References


List of Tables

Table 1. Pearson correlation coefficients of scores between two raters
Table 2. Descriptive statistics and t-test results on all three writings
Table 3. Highlights of student reports on what they had and had not learned

List of Figures

Figure 1. Sequence of an assessment-based L2 writing instruction process


Introduction

Revision is part of the writing process for all writers; students engaged in second language writing especially need to cultivate the ability to revise and improve their own work. Numerous studies have discussed various approaches through which teachers help learners revise, such as error feedback (Ferris, 2010), trained peer review (Min, 2006), explicit instruction of revision strategies (Sengupta, 2000), teacher-student conferences (Ewert, 2009), and tutoring in a writing center (Williams, 2004). That said, written feedback is probably the most commonly adopted approach, and as such it has received the majority of research attention. Moreover, there is evidence suggesting a significant positive relationship between teachers' ability to provide quality feedback and gains in their students' writing achievement (Parr & Timperley, 2010). Yet interestingly, research has also shown that even the most carefully worded feedback often holds little meaning for students (Maclellan, 2001). Students either do not pay attention to the feedback or cannot understand it; even when they do, they often do not want to act upon it (Brookhart, 2007/8).

One branch of feedback research investigates what types of information are included in typical instructor feedback. Lee (2007) investigated the nature of teacher feedback in Hong Kong writing classrooms. Based on a sample of 174 actual student texts and the accompanying 5335 teacher feedback points, the author concluded that 94.1% of the points focused on form, 3.8% on content, and only 0.4% on organization. Similarly, in an American university, it was discovered that teacher feedback was oriented more towards local than global issues, despite the fact that teachers reported and perceived that they were doing the contrary (Montgomery & Baker, 2007). Such lopsided emphases on local issues in teacher feedback have been the norm until very recently (Ferris, Brown, Liu, & Stine, 2011), despite continuous advice from composition researchers for practitioners to attend to a wide range of textual issues that include content and organization.

In addition to feedback content, many researchers have focused on the effectiveness of teacher feedback in improving student writing. A number of studies have provided evidence for the efficacy of written corrective feedback, especially in terms of the accuracy of certain well-defined linguistic features (e.g., Ferris, 2006; Bitchener & Knoch, 2010), although skepticism still persists (Truscott, 1996; Truscott & Hsu, 2008). For example, it was found that direct correction was the fastest at producing accurate revisions; however, despite this significant quick effect, actual learning was debatable in comparison with the more time-consuming learner self-correction process (Chandler, 2003). Moreover, there are doubts about the ecological validity of feedback studies focusing exclusively on grammatical accuracy (Ellis, Sheen, Murakami, & Takashima, 2008), in terms of whether studies of this kind are pedagogically relevant to, or credible for, writing teachers (Bruton, 2009).

Some teachers believe that feedback on students' writing should focus on more than linguistic accuracy and include suggestions regarding content and organization, or even take into consideration students' cognitive and affective development. However, empirical findings on feedback effects in these other, non-linguistic aspects tend to be somewhat negative. For example, several studies (Hyland, 2000; Lee, 2009; Lee, 2011; Williams, 2004) have suggested that most feedback is teacher-centered, leading to passive and dependent students. Another report of negative affective impact noted that corrective feedback is often solely focused on informing students of their errors. When learners receive their writing "awash in red ink", there can even be damaging psychological impacts (Lee, 2007). Student interviews in that study showed that students want to learn more about the criteria of good writing and are interested in trying other feedback options such as in-class discussions and conferences with teachers.


Further, issues associated with providing feedback on student submissions are not restricted to writing teachers only. In many subject disciplines at different levels of education, feedback is an indispensable, but mostly peripheral, part of teacher practice. Recently, several studies (Bailey & Garner, 2010; Price, Handley, Millar, & O'Donovan, 2010) have questioned the common practice of written feedback, especially in view of the amount of time and effort it requires of teachers. Bailey and Garner (2010) interviewed 48 lecturers in the British higher education context and found that these lecturers were confused about what feedback should achieve and about what students do with it. The researchers referred to feedback as "having a Cinderella status on the margin of institutional structures and processes" (p. 187). The majority of the "best feedback approach" suggestions in the second language writing literature also demand a great deal of teacher time and effort (Ferris, 2003). In addition, some researchers have attested that undergraduate students pay little attention to teacher feedback or do not understand or act upon it (Sadler, 2010; Wingate, 2010), although the small number of those who did pay attention and act on the suggestions improved in the areas previously criticized. Two reasons for students' non-engagement with feedback are teachers' failure to offer any strategies for using feedback (Silver & Lee, 2007) and students' low motivation or low self-perception as writers, factors which are usually neglected in second language acquisition studies (Ellis, 2010). These authors also suggested that teachers and researchers pay close attention to the ways feedback content is communicated as well as to learners' affective factors, which will increase the likelihood that students utilize feedback.

Several studies have investigated alternatives to written corrective feedback when teaching revision and obtained positive results. First, Sengupta (2000), in addition to providing written feedback on individual submissions, taught Hong Kong secondary students revision strategies following the completion of their first drafts. Writing performance was then holistically measured and compared with that of students who did not learn these revision strategies. The findings demonstrated that explicit teaching of revision strategies had a significant effect on writing scores, which suggests that such explicit instruction may facilitate the development of an awareness of discourse-related features in L2 writing. Another study with similar objectives was situated in a first-language tertiary context in the US: Butler and Britt (2011) found their students underprepared for academic writing, in that they could not write well-structured arguments. In response, they designed two writing tutorials to help students revise their argumentative essays––an argument tutorial and a global revision tutorial. Students could complete the tutorials independently, and both were shown to improve the revised submissions; interestingly, the improvement associated with completing both tutorials did not exceed that associated with completing either of the individual tutorials. In contrast, learners who did not complete either of the tutorials made more local changes, and their revisions were generally not considered much improved over their drafts. The authors also discussed the potential pitfall of building an enduring misconception in students if teachers continue to provide feedback on local errors. In sum, attempts to direct learner attention beyond the local issues of revising a draft have been encouraging.

Recent literature on learning assessment has also provided insights on making feedback more effective (e.g., Black & Wiliam, 2009). First of all, to help ensure students attend to and use feedback, Price et al. (2010) suggested that, instead of leaving learners to deal with feedback on their own and to wait for the somewhat distant next assignment in which they can apply the comments, opportunities for immediate use should be built into the design of tasks. Second, Price et al. also illustrated how communication breakdown is prevalent when instructors send very concise and often obscure notations to students, limiting their professional opinions to the margins of the page, especially when learners are not given a chance to ask for clarification. Opportunities for dialogue are critical if feedback is to be understood by learners. Thirdly, echoing the notion of the possible damaging psychological impacts of feedback (Lee, 2007), Brookhart (2007/8) reminded teachers to address both cognitive and motivational factors in formative feedback. Moreover, Brookhart pointed out that a student can only hear the message when he is listening, when he can understand, and when he feels that it is useful to listen. In addition, in terms of the amount of feedback, research results indicate that "less is more," yet teachers often give too much and overwhelm students. Teachers should therefore not only limit the amount of feedback but also prioritize areas of improvement for learners. Finally, instead of looking at the surface errors and mistakes of second language writers, McGarrell and Verbeem (2007) advocated that teachers adopt "an inquiring stance" in constructing feedback: including informational questions on early drafts may help learners clarify their communicative intentions and negotiate emerging meanings, thereby guiding learners as they revise and refine their drafts.

A similar set of suggestions was offered by Nicol and Macfarlane-Dick (2006), who pointed out that students have the ability to appraise feedback given to them. They also linked assessment with self-regulated learning. While learners' assessment ability often requires significant improvement, instructors can help learners develop this ability, and thereby transform learners from passive recipients of feedback into proactive users who are able to assess and lead their own learning. In this respect, seven feedback principles conducive to learner self-regulation are as follows: 1) help clarify what good performance is (goals, criteria, expected standards); 2) facilitate the development of self-assessment (reflection) in learning; 3) deliver high-quality information to students about their learning; 4) encourage teacher and peer dialogue around learning; 5) encourage positive motivational beliefs and self-esteem; 6) provide opportunities to close the gap between current and desired performance; and 7) provide information to teachers that can be used to help shape teaching. Numbers three through six coincide with the suggestions mentioned above, in that effective communication and learner motivation for improvement are the focal points.

Replacing Individual Written Feedback with a Face-to-face Feedback Instruction to All

It is difficult to imagine applying the above feedback guidelines when feedback is delivered individually to students in the written form as a supplement to homework grades, as is traditionally practiced and reported in the literature. In this study, the author made feedback the center of L2 writing instruction, positioned between the first draft and a mandatory revision. The detailed rationale and instructional steps (Huang, under second review) are beyond the scope of the current study—they are briefly summarized below.

As shown in Figure 1, learners wrote an initial draft, which was followed by group peer reviews. To model this review process, the teacher would use a sample written by a more proficient peer (the teaching assistant for the course) and demonstrate how its quality could be evaluated using customized instructional rubrics. Students subsequently learned how to assess submissions in terms of areas for improvement and then reviewed their peers' work.

The more intensive teacher evaluations began once learner drafts were collected. Rather than spending time circling all possible spotted errors and scribbling a great deal of condensed feedback in the margins, the teacher-researcher read each draft to diagnose where learners were in relation to the desired teaching outcomes, noted strengths and problems, prioritized teaching points, chose representative student text chunks, and designed specific revision exercises as the basis for follow-up revision lessons. As such, feedback and instruction were no longer at two ends of a continuum, but rather intertwined in the middle (Hattie & Timperley, 2007). During the following class meeting, a significant amount of time was allotted to the selected feedback points, and two-way dialogue between the teacher and students was encouraged to check understanding.

It is believed that such feedback instruction has much greater potential to motivate learner revisions. These types of assessment-based instruction and exercises help learners recognize necessary revision considerations and concrete steps to improve an existing piece of work. More importantly, revision becomes a built-in component of the writing process, as immediately following the feedback instruction learners are invited to revise their own work based on what they have just learned.

[Figure 1 depicts the stages draft, peer review workshop, diagnose and plan for instruction, feedback as instruction, revision exercises, and revision, carried out among learner, peers, and teacher, moving from investigating the status quo toward the goal.]

Figure 1. Sequence of an assessment-based L2 writing instruction process

The main thrust of this design lay in replacing the traditional written feedback given on individual assignments with a more purposive revision lesson based on actual student drafts, aimed at clearly communicating to learners the prioritized revision points, ways to improve their drafts, and the thought processes behind the revision.

The Study

The purpose of this study was to examine the effects of the above feedback process, which incorporated recently developed feedback principles into a revision lesson delivered to an entire student group. It should be noted that, following these principles, feedback is no longer treated as the final stage of homework assignments, never to be discussed after it is provided. Instead, it becomes a major part of the instruction between the drafting and revising stages. Instructors are no longer required to laboriously scribble comments on individual submissions—a practice that has been criticized as ineffective and even harmful. Instead, attention is strategically directed towards diagnosing pervasive problems in learner work and designing instructional tools that can assist learners to appropriately revise their work. This type of approach goes beyond the traditional steps associated with writing instruction pedagogy. But does it work? In the absence of specific individual feedback, it is unclear whether learners would feel empowered or less secure than before. Moreover, questions persist regarding whether students are able to read their drafts as critically as a teacher does, find problems similar to those that are traditionally shown to them, and eventually make useful revisions. With these issues in mind, this preliminary investigation was initiated by administering the proposed teaching approach to a group of college EFL learners in a regular course setting and collecting data to examine its effectiveness.

The Context and Participants

The participants consisted of a group of college freshmen in Taiwan. Prior to entering college, students in Taiwan have learned English as a foreign language since the third grade of primary school; many start even earlier or get extra hours of exposure outside of school. Primary and secondary school education is conducted under guidelines and a selection of textbooks approved by the Ministry of Education. In junior and senior high school, educational objectives are in general closely tied to the entrance examination for the subsequent level. For the joint entrance examinations into college, which almost all students must take, there are two written parts in the English section: one is the translation of two compound or complex sentences from Chinese to English (eight out of one hundred total points for the English section), and the other is a composition of one or two paragraphs totaling approximately 120 words; this piece is predominantly narrative, such as describing a series of four comic pictures (twenty points).

Procedures

The 38 participating freshmen were of various majors enrolled in a four-skill required English course, which met two hours a week for two eighteen-week semesters; after passing the course they were awarded four credits. The writing component of this course focused on opinion essays of about 300 words in length. Before students were asked to do any writing, learning goals were communicated by discussing specific criteria and standards as well as by viewing and evaluating multiple writing samples. The writing lesson, following the sequence depicted in Figure 1, was repeated three times––each with a new topic. Using the blended design suggested by Ferris (2010), students wrote a draft, a revision (after the teacher's feedback lesson, where the feedback was neither written nor exclusively corrective), and then a new text. In weeks 4, 7, and 12, learners wrote three drafts based on new topics. Each 30-minute timed draft was followed by the teacher-led discussion of a TA sample and learners' small-group peer reviews. Drafts and peer review results were studied by the instructor between the two weekly meetings of one writing unit. In weeks 5, 8, and 13, feedback was given to the entire class as a 50-minute revision lesson before learners started to revise for another 30 minutes in the second period. It was in this first 50-minute revision lesson that the collective wisdom of the feedback literature was put to the test. The teacher presented diagnostic summaries, identified selected problems, demonstrated revisions, and eventually presented problems using learner excerpts, which students initially solved in small groups; later, the teacher led a class discussion to assess the proposed solutions. In the second 50-minute session, learners were given 20 minutes to reread their own first drafts, review peer evaluation results from the previous week, self-assess using the same rubrics, and set a revision goal based on a revision checklist provided by the teacher. The remaining 30 minutes were reserved for revising the draft. The three writing topics and the content of the revision lessons are presented in Appendix A. Throughout the semester, each student was required to produce a total of six pieces of writing: three drafts and three revisions.

Research Questions

Under the research design presented above, the following specific research questions were asked:

1. Did student writing improve from drafts to revisions?

2. Did student drafts and revisions improve from one task to another?

3. What did learners report regarding their learning?


Data Collection and Analysis

In order to evaluate the writing quality of all drafts and revisions, two outside raters were separately invited to grade the six pieces of learner work. One of them had 18 years of experience as an EFL teacher as well as 10 years as a rater for a national standardized English proficiency test and for the English composition part of the national college entrance examination. The other was a senior research assistant from the university's English department. Both were given the instructional rubrics (see Appendix B) that were used in class for discussion and peer review as a guide for their work. They were paid on a piece-rate basis, and neither was informed of the purpose of this experiment during their work. The rating criteria, as illustrated in Appendix B, were adapted from the published criteria of the TOEIC (Test of English for International Communication) writing component. Components included argument, organization, lexical use, grammar, and a holistic score, each on a scale of 1 to 15.

In addition to the above rating system, the cognitive experience associated with the feedback lessons as reported by learners was also a point of interest. The researcher wanted to know what students felt they had learned and not learned during this entire experience. In addition, the author was curious about whether the lack of traditional written feedback for individual students caused any problems. A survey was therefore devised and conducted on the course Moodle platform at the end of the semester. Students responded anonymously in writing to a list of four open-ended short-answer items: 1) Please list three things you have learned about English writing and revision during the semester; 2) In terms of revising your own drafts, what was it that you did best?; 3) In revising your own drafts, what was it that caused you the most difficulties?; and 4) Any other relevant comments are welcome.


Each set of writing scores from the two raters on the six pieces of student work contained both a holistic score and four sub-scores. First, inter-rater reliability was calculated. To answer the first research question, the two raters' averages were then used for paired-sample t-tests between drafts and revisions. To answer the second research question, analyses of variance as well as multivariate analyses of variance were conducted for scores among drafts as well as among revisions. To answer the third research question, learners' short answers were categorized and tallied. Since no prescribed wording or choices were provided, students were free to use their own words to interpret and describe their individual learning experiences. The data set was a collection of verbal descriptions listed as bullet points, sometimes coupled with explanations, and other times containing multiple idea units. A detailed analysis of the results is offered in the following section.
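For readers who wish to reproduce the score analyses, the sketch below shows the inter-rater correlation and draft-versus-revision t-test steps in Python with pandas and SciPy. It is a minimal sketch under stated assumptions: the scores are supposed to live in a hypothetical long-format table with one row per (student, task, version), and all column, variable, and function names are illustrative rather than taken from the study's materials.

```python
# Minimal sketch of the inter-rater reliability and draft-vs-revision
# comparisons; data layout and names are hypothetical.
import pandas as pd
from scipy import stats

MEASURES = ["holistic", "argument", "organization", "lexical", "grammar"]

def interrater_reliability(rater1: pd.DataFrame, rater2: pd.DataFrame) -> pd.Series:
    """Pearson r between the two raters' scores, one coefficient per measure."""
    return pd.Series({m: stats.pearsonr(rater1[m], rater2[m])[0] for m in MEASURES})

def draft_vs_revision(scores: pd.DataFrame, task: int, alpha: float = 0.05 / 3) -> dict:
    """Paired-sample t-tests between the draft and revision of one task.

    `scores` holds the two raters' averaged scores, one row per
    (student, task, version); alpha mirrors the study's Bonferroni
    adjustment of .05 divided by the three tasks.
    """
    draft = scores[(scores.task == task) & (scores.version == "draft")].set_index("student")
    rev = scores[(scores.task == task) & (scores.version == "revision")].set_index("student")
    both = draft.index.intersection(rev.index)  # keep only students with both pieces
    results = {}
    for m in MEASURES:
        t, p = stats.ttest_rel(draft.loc[both, m], rev.loc[both, m])
        results[m] = {"t": round(float(t), 3), "p": round(float(p), 4), "significant": p < alpha}
    return results
```

Pairing on the intersection of student indices reflects the unequal draft and revision counts reported in Table 2 (e.g., 32 drafts but 28 revisions for the first task).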

Results

Pearson correlation coefficients were computed to check the consistency between the scores assigned by the two raters. Results in Table 1 indicate that the two raters' scores on the three drafts and revisions in overall rating (H, holistic), argument (A), organization (O), lexical use (L), and grammar (G) were all positively and significantly correlated, although inter-rater reliability was lower for the third piece in terms of both argument and organization. After separate discussions with the two raters and a re-review of the learner submissions, it became clear that one rater disapproved of many students' method of reasoning in the third submission, referring to it as circular and as offering support for the other side; she therefore assigned these pieces much lower scores relative to her previous standard and to that of the other rater. Her reasons were somewhat subjective but certainly valid. At this point, the researcher considered the possibility of introducing a third rater. After a discussion with statisticians, it was decided to retain the original scores, since the correlations were all positive and the majority of the figures were at a satisfactory level. Moreover, the validity of the scores might have been compromised by introducing a third rater at that point. That said, caution was exercised when interpreting results related to argument and organization in the third piece.

Table 1. Pearson correlation coefficients of scores between two raters

         Drafts                                           Revisions
         H       A       O       L       G               H       A       O       L       G
1st  r   0.789   0.712   0.590   0.750   0.749           0.882   0.939   0.807   0.645   0.721
     p   (0.000) (0.000) (0.000) (0.000) (0.000)         (0.000) (0.000) (0.000) (0.000) (0.000)
2nd  r   0.943   0.947   0.863   0.828   0.805           0.963   0.923   0.848   0.803   0.699
     p   (0.000) (0.000) (0.000) (0.000) (0.000)         (0.000) (0.000) (0.000) (0.000) (0.000)
3rd  r   0.563   0.330   0.567   0.481   0.367           0.397   0.229   0.348   0.631   0.496
     p   (0.001) (0.080) (0.001) (0.008) (0.050)         (0.044) (0.260) (0.082) (0.001) (0.010)

Note: 1st/2nd/3rd = first/second/third piece of writing; H = holistic, A = argument, O = organization, L = lexical use, G = grammar

Descriptive statistics for the three drafts and revisions, together with results from paired-sample t-tests, are shown in Table 2; they are also depicted in a line graph in Figure 2. It can be seen from Figure 2 that the mean scores improved from the drafts to revisions for all three rounds, and the degree of improvement seemed to gradually level off from the first to the second and from the second to the third task. The second and third drafts received lower scores than their previous revisions, but they both received higher scores than the previous drafts. Comparing the five scores on each piece of work, it was found that holistic and argument scores were generally higher while grammar and lexical use scores were lower.


Table 2. Descriptive statistics and t-test results on all three writings

Writing      Grading        Draft               Revision            Paired-sample t
             Criteria       n   M     SD        n   M     SD        t      df  p
1st Writing  Holistic       32  6.73  2.29      28  8.54  2.09      5.509  27  .000
             Argument       32  6.75  2.21      28  8.61  2.02      5.837  27  .000
             Organization   32  6.08  2.29      28  8.27  2.01      5.954  27  .000
             Lexical Use    32  6.06  2.12      28  7.80  1.84      5.696  27  .000
             Grammar        32  6.03  2.39      28  7.75  2.03      5.426  27  .000
2nd Writing  Holistic       32  7.61  2.30      31  8.98  2.17      4.160  30  .000
             Argument       32  7.61  2.32      31  8.76  2.21      3.473  30  .002
             Organization   32  7.47  2.25      31  8.71  2.11      4.184  30  .000
             Lexical Use    32  7.09  2.11      31  8.34  2.12      3.671  30  .001
             Grammar        32  7.59  2.52      31  8.60  2.34      3.115  30  .004
3rd Writing  Holistic       29  8.43  1.43      26  9.56  1.34      6.080  25  .000
             Argument       29  8.22  1.52      26  9.44  1.40      5.847  25  .000
             Organization   29  8.24  1.43      26  9.31  1.23      4.838  25  .000
             Lexical Use    29  8.29  1.31      26  9.04  1.32      4.888  25  .000
             Grammar        29  8.48  1.54      26  9.12  1.42      3.486  25  .002

[Figure 2 is a line graph of the mean holistic and four sub-scores across the six pieces of writing, from the 1st draft through the 3rd revision; for example, the holistic mean rises from 6.73 on the 1st draft to 9.56 on the 3rd revision.]

Figure 2. Average holistic and sub-scores in all six pieces of writing

Note: H = holistic; A = argument; O = organization; L = lexical use; G = grammar


To answer the first research question, paired-sample t-tests were performed between drafts and revisions for all three tasks. In order to control for Type I errors, the significance level was adjusted and set at .05 divided by 3. The results, as shown in the rightmost columns of Table 2, indicate that all revisions were significantly better than their drafts in terms of both holistic scores and all four sub-scores. The answer to the first research question was therefore positive.

To answer the second research question regarding whether the drafts and revisions improved from one task to another, the author looked first at holistic scores, and then at the four sub-scores across the drafts and revisions. One-way analyses of variance were first conducted on holistic scores. For the drafts, the ANOVA was significant, F(2, 90) = 5.134, p = .008. Follow-up comparisons were conducted using Tukey's tests to evaluate pairwise differences among the means. There were significant differences in the means between the first and the third drafts, but not between the first and the second or the second and the third drafts. For the revisions, the ANOVA was nonsignificant, F(2, 82) = 1.908, p = .155. That is, the holistic scores suggest that the three revised versions did not improve over time. Further examination of the same question was made possible by comparing the four sub-scores among the three drafts and three revisions. One-way multivariate analyses of variance were conducted. On drafts, the MANOVA was significant: Wilks' Lambda = .642, F(8, 174) = 5.392, p = .000. For the follow-up ANOVAs, the significance level was set at .0125 (.05 divided by 4, the total number of dependent variables); differences were found, again, between the first and the third draft on all four sub-scores, with 95% confidence intervals of improvement from 0.2 to 2.7 on argument, from 0.9 to 3.4 on organization, from 1.1 to 3.4 on lexis, and from 1.1 to 3.8 on grammar. Improvements on the other pairs of drafts were less consistent, with notable improvements from the first to the second restricted to organization and grammar, and from the second to the third restricted to lexis. For the revisions, the MANOVA was nonsignificant: Wilks' Lambda = .851, F(8, 156) = 1.660, p = .112. As such, the three revisions were not statistically different from one another.
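A companion sketch of these between-task comparisons is given below, again over the same hypothetical long-format data as in the earlier sketch; statsmodels supplies Tukey's pairwise comparisons and a MANOVA test table that includes Wilks' Lambda. All names remain illustrative.

```python
# Sketch of the one-way ANOVA with Tukey follow-ups and the one-way
# MANOVA on the four sub-scores; `drafts` is a hypothetical DataFrame
# with one row per student-task and the raters' averaged scores.
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.multivariate.manova import MANOVA

def holistic_anova(drafts: pd.DataFrame):
    """One-way ANOVA on holistic scores across the three tasks,
    followed by Tukey's HSD pairwise comparisons."""
    groups = [g["holistic"].to_numpy() for _, g in drafts.groupby("task")]
    f_stat, p_value = stats.f_oneway(*groups)
    tukey = pairwise_tukeyhsd(drafts["holistic"], drafts["task"], alpha=0.05)
    return f_stat, p_value, tukey.summary()

def subscore_manova(drafts: pd.DataFrame):
    """One-way MANOVA on the four sub-scores across tasks; Wilks' Lambda
    appears in the returned test table. Follow-up ANOVAs would use
    alpha = .05 / 4 = .0125, as in the study."""
    model = MANOVA.from_formula(
        "argument + organization + lexical + grammar ~ C(task)", data=drafts
    )
    return model.mv_test()
```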

To answer the third research question, the researcher first calculated the data size (totaling 3743 Chinese characters). Within the data, 321 meaning units were identified and 13 categories were induced. A summary of the most frequently mentioned categories for the three questions is listed in Table 3 by number of counts. Students reported learning most about textual features such as structure and organization, and argumentation and reasoning, followed by various types of meta-linguistic awareness. Regarding what they had done best, students again predominantly mentioned structure and organization, followed by correcting mistakes and refining word choices. Their notes on difficulties centered mostly on limited vocabulary and doubts about grammaticality, followed by other uncertainties and confusions. Argument and organization, while also mentioned, did not seem to stand out as a general problem for the majority of students. For the fourth question on additional comments, the most salient issues learners voiced were about not getting individual feedback from the teacher and not possessing self-confidence regarding what they chose to revise and whether the revisions could be considered successful.
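As a minimal illustration of the categorize-and-tally step described under Data Collection and Analysis, the sketch below counts manually coded meaning units per question. The coding itself remains a manual, interpretive step; the few entries shown are examples echoing Table 3, not the actual data.

```python
# Tally manually coded (question, category) pairs from the open-ended survey.
from collections import Counter

coded_units = [  # one pair per identified meaning unit (321 in total)
    (1, "structure and organization"),
    (1, "argumentation and reasoning"),
    (2, "correcting grammatical mistakes"),
    (3, "limited vocabulary size"),
    # ... remaining hand-coded units
]

tallies = {}
for question, category in coded_units:
    tallies.setdefault(question, Counter())[category] += 1

for question in sorted(tallies):
    print(question, tallies[question].most_common(3))
```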

Table 3. Highlights of student reports on what they had and had not learned

Three things learned: structure and organization (32); argumentation and reasoning (32); various meta-linguistic awareness (20); deletion (10); maintaining topic relevance (9); length of texts (6)

One thing done best: structure and organization (21); argumentation and reasoning (18); correcting grammatical mistakes (11); refining word choice (7); deletion (6); length of texts (5)

The most difficult part: limited vocabulary size (29); doubt on grammaticality (16); other uncertainties and confusions (8); argumentation and reasoning (8); structure and organization (3); deletion (3); time pressure (3)

Discussion

Before the discussion, I would like to highlight the uniqueness of this study. Unlike most written corrective feedback studies, the feedback here was neither written nor corrective. Instead of giving individual feedback, the teacher's effort was directed to preparing an in-depth feedback lesson for all, which became the major body of instruction for revision. This design was mainly a product of incorporating recent findings from feedback studies in L2 writing and from feedback studies in the area of formative assessment (or assessment for learning). While many L2 writing teachers consider not giving individual feedback irresponsible or unethical (Bruton, 2009), this study experimented with abandoning individual feedback altogether, rather than loading the teacher with providing both individual feedback and an additional feedback lesson for all students, as was done in the Sengupta (2000) study. Moreover, two features of the design are worth noting. First, the experiment was not a one-shot design consisting of one draft and one revision only; rather, it followed Ferris' (2010) suggestion of a 'draft-feedback-revision-next text' sequence, which included both revision of the same text and the writing of new texts, over three rounds. This methodology made it possible to examine the effects of such feedback instruction not just on immediate revisions, but also on learners' uptake as demonstrated in the subsequent new tasks. Second, the outcome measures adopted were similar to real-world practices in terms of how learner writings are generally evaluated. Some previous studies focused very narrowly on specific grammatical features like the use of definite and indefinite articles and presented the change in the number of such mistakes as an indicator of improvement (e.g., Bitchener & Knoch, 2010), with few implications for practicing teachers. Others used only impressionistic holistic ratings (Sengupta, 2000) or allowed discussions between raters, leading to concerns about the validity of the writing quality scores. In the current study, the ratings were made more comprehensive by including both a holistic score and four sub-scores generated by two independent raters, allowing discrepancies to be recognized, which more closely resembles real-life practice. Hence, both ecological validity and pedagogical relevance (Bruton, 2009; Ellis et al., 2008) were considered in this study.

The answer to the first research question was an unconditional yes––all three revision scores improved over their corresponding drafts on all five measures. Given the ambiguity and uncertainty involved in the learners' need to exercise discretion by individually applying the instructor's feedback to their own drafts, this finding is significant. First, while it may not be a surprise for many to see that students' revisions improved on their drafts (revision represented a second chance, after all), superior revisions were not a foregone conclusion. For example, Williams (2004) investigated revisions of students who had visited a writing center and found no measurable improvement. In her study, despite the one-on-one attention learners received, which is absent from most writing classrooms including the one depicted in this study, students mainly transferred tutor suggestions verbatim into their revisions. Secondly, while many teachers and students believe that individual feedback is a must, this study demonstrated that students were still able to revise their own drafts to an observable degree without receiving individual feedback. Despite accumulated research findings showing its ineffectiveness for learning (e.g., Truscott, 2010), the practice of giving individual written feedback has continued to be endorsed by teachers and students alike––possibly due to the lack of compatible alternatives. Now that a new feedback approach has been proposed and its effectiveness demonstrated, this may be a time to reconsider appropriate and effective forms of feedback.

The answer to the second research question, although heavily qualified, is encouragingly positive. Comparing the three drafts and three revisions, the only salient difference was found between the first and the third drafts; however, the significance here was robust across all five measures, despite the fact that one of the two raters disapproved of the argument rationale used by many students for the third task, which in turn led to decreased scores. It should also be noted that in between the three rounds of tasks, learners were engaged in the speaking component of the course and received no reinforcement on their previous written submissions, which makes the result even more noteworthy. As seen in Figure 2, an iterative and ascending pattern of performance across the three tasks can be clearly observed. The only significant difference appears precisely where Truscott (2010) considered that real learning could and should be identified in terms of writing performance; evidence of this kind had been missing until now. While it is acknowledged that learning is gradual and slow, a sign of this kind is encouraging and warrants more studies along this line of research.

For research question three, the data reveal that most students reported having learned about structure and organization. The things they claimed to do best were generally consistent with what they reported having learned. Their difficulties, however, were mainly associated with a limited vocabulary and uncertainty about grammaticality; the latter, as attested by some in the open-ended fourth question, pertained to the lack of individual feedback. These findings coincide with the pattern of sub-scores shown in Table 2 and Figure 2, where lexis and grammar scores are comparatively lower while argument, organization, and holistic scores are all higher. As for the degree of improvement from drafts to revisions, students advanced the most in terms of organization. These results imply that, with the feedback lessons used in this study, learners improve more at the global discourse level of writing and less on local lexical usage and grammaticality. Indeed, for these experienced college-level EFL students, who were predominantly used to writing narratives, the organization of argumentative essays represented something new that could be picked up relatively successfully after a few rounds of instruction. On the other hand, lexis issues required a much longer period of time to work through, as they were less amenable to instruction, even though resources such as a thesaurus were introduced to students.

Researchers offer instructors a wide variety of interventions and facilitations that may help language students. However, teachers are often loaded with large classes and too much work, making it difficult to apply these tools. From a cost-effectiveness standpoint, this new feedback method may serve as a viable alternative to the traditional method, which has been questioned for some time. The challenges this proposed approach poses for teachers, however, are no less demanding, since the method involves a great deal of spontaneity and contingency. But its beauty lies in using teachers' time more wisely and strategically and in making the complicated mission of communication more likely to succeed. The results are promising, but more careful examinations are needed in the future, such as a study design that separates the effects of individual written feedback from those of feedback lessons given to the group. If combining the two types yields no additive effect, their relative learning effectiveness and demands on the teacher can then be compared.

One point of caution has to do with the inherent nature of a feedback lesson for all. Instructional lessons targeted at an entire student group certainly do not cater to all learner needs. When the class size is large and learner proficiency levels vary greatly, such lessons may not address the needs of the majority and run the risk of failure. Even in smaller classes where student learning needs are similar, teacher-student conferences allowing learners to discuss specific issues regarding their own writing may be necessary to supplement the whole-class lessons.

In addition, the feedback lessons were in a very primitive form. They served more as general directions than as concrete procedures to follow. Given their contingent nature, there probably will never be concrete steps to follow, because contexts and learner populations vary. But once research of this type accumulates, certain principles and specific guidelines may come to be recognized as central, which in turn will help this approach become more useful to teachers in the classroom.

Limitations and Conclusion

Partly because of its exploratory nature, the study has several limitations. First, while pre-writing lessons and activities, such as brainstorming or providing lexical support, are common in L2 writing classes, they were absent from this study. The participating students started writing the drafts immediately after the prompts were presented. They worked independently until their first drafts were finished in a 30-minute timed writing. It is possible that the significant differences found between drafts and revisions would not have existed if some sort of pre-writing instruction had been offered. That said, giving pre-writing aids may very well have violated the principles this feedback instruction methodology is based on: as shown in Figure 1, learners were empowered and involved from the beginning in investigating their current abilities, which led to greater focus during the follow-up activities and instruction.


Secondly, the kind of revision students were rated on was actually a combination of the revision and editing steps. In Butler and Britt (2011), these two steps are clearly distinguished: experienced writers usually begin by revising globally for main ideas and structure, and then deal with local issues such as grammar and lexis. For this exploratory study on a new approach to providing feedback, such fine distinctions were deemed unnecessary. Future studies along this line may well take the different stages of revision and editing into consideration in their research designs.

Another distinction not made concerns concepts and strategies associated with revision per se versus those associated with the argumentation genre. During the three feedback lessons, the instructor taught students some fundamental steps needed for revising drafts and also informed students about the structure and features of a good opinion essay. These two broad areas of instruction could be treated separately when learner needs differ. As mentioned earlier, Butler and Britt (2011) distinguished between a revision schema and an argumentation schema and prepared a different tutorial for each. Their findings indicated that each had a significant impact on the quality of revision; however, students who did both tutorials showed no additive effect. While learners in the current study needed both kinds of instruction, it would be interesting to know whether either of them could be more easily taught to students.

In conclusion, the current study proposes an alternative method of providing feedback to L2 writing learners. This new approach directs the teacher's effort and time away from giving individual written feedback on all drafts. Instead, the teacher diagnoses problems for the entire group and prepares a feedback lesson that focuses on the what and how of revision. In turn, the messages are more clearly communicated. The results are promising. Students' revised versions of the three tasks all improved significantly over their drafts in terms of the five measures employed. The effect also seemed to carry over from the first draft to the third. In addition, learners reported having learned more about the global features of organization and argument as compared to local linguistic issues pertaining to lexis and grammar. These results warrant further studies regarding alternatives to written corrective feedback.


References

Bailey, R., & Garner, M. (2010). Is the feedback in higher education assessment worth the paper it is written on? Teachers’ reflections on their practices. Teaching in Higher Education, 15(2), 187-198.

Bitchener, J., & Knoch, U. (2010). Raising the linguistic accuracy level of advanced L2 writers with written corrective feedback. Journal of Second Language Writing, 19, 207-217.

Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation, and Accountability, 21, 5-31.

Brookhart, S. M. (2007/8). Feedback that fits. Educational Leadership, December 2007-January 2008, 54-59.

Bruton, A. (2009). Improving accuracy is not the only reason for writing, and even if it were… System, 37, 600-613.

Butler, J. A., & Britt, M. A. (2011). Investigating instruction for improving revision of argumentative essays. Written Communication, 28(1), 70-96.

Chandler, J. (2003). The efficacy of various kinds of error feedback for improvement in the accuracy and fluency of L2 student writing. Journal of Second Language Writing, 12, 267-296.

Ellis, R. (2010). Epilogue: A framework for investigating oral and written corrective feedback. Studies in Second Language Acquisition, 32, 335-349.

Ellis, R., Sheen, Y., Murakami, M., & Takashima, H. (2008). The effects of focused and unfocused written corrective feedback in an English as a foreign language context. System, 36, 353-371.

Ewert, D. E. (2009). L2 writing conferences: Investigating teacher talk. Journal of Second Language Writing, 18, 251-269.

Ferris, D. R. (2003). Response to student writing: Implications for second language students. Mahwah, NJ: Lawrence Erlbaum.

Ferris, D. R. (2006). Does error feedback help student writers? New evidence on the short- and long-term effects of written error correction. In K. Hyland & F. Hyland (Eds.) Feedback in second language writing (pp.81-104). Cambridge: Cambridge University Press.

Ferris, D. R. (2010). Second language writing research and written corrective feedback in SLA: Intersections and practical applications. Studies in Second Language Acquisition, 32, 181-201.

Ferris, D., Brown, J., Liu, H., & Stine, M. E. A. (2011). Responding to L2 students in college writing classes: Teacher perspectives. TESOL Quarterly, 45(2), 207-234.

Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77, 81-112.

Hyland, K. (2000). ESL writers and feedback: Giving more autonomy to students. Language Teaching Research, 4(1), 33-54.

Lee, I. (2007). Feedback in Hong Kong secondary writing classrooms: Assessment for learning or assessment of learning? Assessing Writing, 12, 180-198.

Lee, I. (2009). Ten mismatches between teachers’ beliefs and written feedback practice. ELT Journal, 63, 13-22.

Lee, I. (2011). Feedback revolution: What gets in the way? ELT Journal, 65(1), 1-12.

Maclellan, E. (2001). Assessment for learning: The different perceptions of tutors and students. Assessment and Evaluation in Higher Education, 26, 307-318.

McGarrell, H., & Verbeem, J. (2007). Motivating revision of drafts through formative feedback. ELT Journal, 61(3), 228-236.

Min, H.-T. (2006). The effects of trained peer review on EFL students’ revision types and writing quality. Journal of Second Language Writing, 15, 118-141.

Montgomery, J. L., & Baker, W. (2007). Teacher-written feedback: Student perceptions, teacher self-assessment, and actual teacher performance. Journal of Second Language Writing, 16, 82-99.

Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: a model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199-218.

Parr, J. M., & Timperley, H. S. (2010). Feedback to writing, assessment for teaching and learning and student progress. Assessing Writing, 15, 68-85.

Price, M., Handley, K., Millar, J., & O’Donovan, B. (2010). Feedback: all that effort, but what is the effect? Assessment & Evaluation in Higher Education, 35(3), 277-289.

Sadler, D. R. (2010). Beyond feedback: developing student capability in complex appraisal. Assessment & Evaluation in Higher Education, 35(5), 535-550.

Sengupta, S. (2000). An investigation into the effects of revision strategy instruction on L2 secondary school learners. System, 28, 97-113.

Silver, R., & Lee, S. (2007). What does it take to make a change? Teacher feedback and student revisions. English Teaching: Practice and Critique, 6(1), 25-49.

Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46, 327-369.

Truscott, J. (2010). Some thoughts on Anthony Bruton’s critique of the correction debate. System, 38, 329-335.

Truscott, J., & Hsu, A.Y.-P. (2008). Error correction, revision, and learning. Journal of Second Language Writing, 17, 292-305.

Williams, J. (2004). Tutoring and revision: Second language writers in the writing center. Journal of Second Language Writing, 13, 173-201.

Wingate, U. (2010). The impact of formative feedback on the development of academic writing. Assessment & Evaluation in Higher Education, 35(5), 519-533.


Appendices

Appendix A. Writing prompts and teaching points in the three revision lessons

Task 1 (W4 draft + W5 revision)
Writing prompt: Discuss the advantages and disadvantages of having a job while in college and then state your own opinion on this topic.
Teaching points in the revision lesson:
1. Explain initial procedures for revision, such as rereading one's own draft for an analysis of overall structure and argument points.
2. Check main points and examine the need for additions, deletions, or reorganization.
3. Check if supporting details support the main points.
4. Analyze sentence structure and check grammar.

Task 2 (W7 draft + W8 revision)
Writing prompt: Many college teachers encourage group discussions among students. But some students feel listening to peers is a waste of time as compared to listening to teachers. Do you agree that teachers should encourage more group discussion? Include specific reasons or examples to support your answer.
Teaching points in the revision lesson:
1. Support with clear examples or explanations that are directly relevant.
2. Be careful when making assumptions about the background knowledge of readers; supplement with more details when necessary.
3. Check consistency of referents; be careful when switching among I, we, and you.
4. Discuss using dictionaries to help make decisions on word choice.

Task 3 (W12 draft + W13 revision)
Writing prompt: In some universities, students do not choose their majors until the second or third year, while most students in Taiwan are put into different professional/academic departments before they are admitted. Which do you think may be a better arrangement for college students? Why? Provide detailed reasons for your answer.
Teaching points in the revision lesson:
1. Make sure the discussion adheres to the topic; do not digress from the main topic.
2. Ensure good time management so there is enough time for a proper conclusion.
3. Do not bring up new points in the concluding paragraph.
4. Deletion can sometimes make the whole argument more focused.
5. Eliminate minor errors such as subject-verb agreement, tense, and singular/plural forms.


Appendix B. Rubrics for instruction and grading

Score bands for each criterion: Excellent (90% & up) = 15, 14, 13; Good (80%-89%) = 12, 11, 10; Fair (70%-79%) = 9, 8, 7; OK (60%-69%) = 6, 5, 4; Poor (below 60%) = 3, 2, 1.

Holistic Score (overall quality of the composition): rated from outstanding (Excellent) through very good, good, and acceptable, down to below the general standard (Poor).

Content

Argument (thesis/claim and supporting points):
(+) The topic is clear, the stance is explicit and explained, and the writer's claim is easy to identify; there are clear and specific supporting reasons, well explained, persuasive, and appropriately exemplified.
(-) The topic cannot be found or takes effort to find, leaving the writer's intention unclear; there are no reasons supporting the claim, the reasons are not sufficiently explained, or they fail to persuade.

Organization (overall structure):
(+) A clear opening, progressively and fully developed content, and a good ending; the essay as a whole is unified, coherent, fluent, and logical.
(-) Aimless or disorganized content, or an opening and ending in form only.

Language

Lexical Use (vocabulary/expressions):
(+) Wording is appropriate, natural, varied, vivid, and forceful.
(-) Wording is confusing, highly repetitive, or unnatural.

Grammar (grammar, sentence structure, and writing conventions):
(+) Sentence structure, singular/plural forms, person, tense, verb forms, punctuation, spelling, and other grammar are all correct, with almost no errors.
(-) Many obvious grammatical problems.
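For convenience, the rubric's mapping from the 1-15 scale onto the five percentage bands can be expressed as a small helper function. This is an illustrative sketch added here, not part of the study's original materials.

```python
# Map a 1-15 rubric score to its band, following the table in Appendix B.
def rubric_band(score: int) -> str:
    if score >= 13:
        return "Excellent (90% & up)"
    if score >= 10:
        return "Good (80%-89%)"
    if score >= 7:
        return "Fair (70%-79%)"
    if score >= 4:
        return "OK (60%-69%)"
    return "Poor (below 60%)"
```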


English Teaching: Practice and Critique, December 2012, Volume 11, Number 4, pp. 99-119
http://education.waikato.ac.nz/research/files/etpc/files/2012v11n4art7.pdf

Like a bell responding to a striker: Instruction contingent on assessment

SHU-CHEN HUANG

Foreign Language Center, National Chengchi University, Taipei, Taiwan

ABSTRACT: This article is concerned pragmatically with how recent research findings in assessment for learning (AfL) can bring about higher quality learning in the day-to-day classroom. The first half of this paper reviews recent studies within Black and Wiliam's (2009) framework of formative assessment and looks for insights on how pedagogical procedures could be arranged to benefit from and resonate with research findings. In the second half, based on lessons drawn from the review, the findings were incorporated into an instructional design that is contingent on formative assessment. The concept of teacher contingency is elaborated and demonstrated to be central to AfL pedagogy. Attempts were made to translate updated research findings into English as a foreign language (EFL) writing instruction to illustrate how teachers may live up to the promises offered by recent developments in AfL. This AfL lesson, situated in L2 writing revision, made instruction contingent on and more responsive to learner performance and learning needs. As shown in an end-of-semester survey, learner response to the usefulness of the instruction was generally quite positive.

KEYWORDS: English teaching, EFL pedagogy, assessment for learning, L2 writing, instructional design.

Respond properly to learners’ enquiries, like a bell responding to a bell striker. If the strike is feeble, respond softly. If the strike is hard, respond loudly. Allow some leisure for the sound to linger and go afar. (Record on the Subject of Education Xue Ji, Book of Rites Li Ji, 202 BCE-220 CE)

INTRODUCTION

Modern educators believe that learning does not occur in a vacuum. For any individual learner, knowledge is co-constructed in the social-cultural context with "scaffolds" provided by more experienced others and peers. To facilitate such learning, Alexander (2006) describes "dialogic teaching," in which the quality, dynamics, and content of teacher talk are most important, regardless of institutional settings and classroom structures. Conventional IRF (initiation – response – feedback) turns are not adequate if the teacher's speech remains the core aspect and the learner's speech remains peripheral. In real dialogues, the learner's thinking and its rationale must be deliberately sought and addressed. Mercer (1995) describes this type of talk among teachers and learners as "the guided construction of knowledge." Right or wrong, learners' discussions of the subject matter provide an opportunity for self-reflection and self-assessment of their current knowledge. These discussions also allow the teacher to realize what needs to be taught. This understanding opens a path to deep learning. Mercer, Dawes, and Staarman (2009) provide an enlightening example of how dialogic teaching differs from more authoritative teacher talk and how this dialogue can be facilitated through pedagogical tools. Before explaining why the moon changes shape, these teachers designed "talking points" – a list of factually correct, controversial, or incorrect statements – and allowed pupils to discuss them without fear of judgment. Based on these free and extended discussions, the teachers identified students' prior concepts, both right and wrong. This information improved follow-up teaching for the teacher and the pupils.

Perhaps a metaphor can help us grasp the notion of dialogic teaching: that of treating an enquirer as a brass bell responding to a striker (see Figure 1). The metaphor comes from an ancient Chinese text, the Record on the Subject of Education (Xue Ji), a collection of ideas and conduct compiled by Confucian disciples. The 18th of the 45 volumes of the Book of Rites (Li Ji), Xue Ji contains 20 sections with a total of 1,229 Chinese characters. The classic’s concise and archaic language must be translated into modern-day language and is subject to interpretation. Generally considered the earliest systematic documentation on education, it covers the purposes, systems, principles, and pedagogies of education. Many of its propositions remain true and inspiring after thousands of years, and a number of metaphors make its doctrines approachable. Among them, the bell metaphor illustrates the suggested attitude for teachers responding to students’ questions: teachers must be aware of and consider learners’ capacity. They are advised to assess learners’ proficiency, to provide the appropriate amount of feedback at the right level, and to allow learners time to ponder and fully understand this feedback. These principles resonate remarkably with the convictions of dialogic teaching.

Figure 1. A bell responding to a striker

Studies on dialogic teaching, especially studies on the teaching of science and mathematics, have afforded illuminating examples of how effective dialogue helps learners to understand and, even more importantly, reveals misunderstandings. Hence, dialogue helps teachers to build on learners’ existing knowledge and teach to their critical needs.

The current study on teaching English writing is slightly more complicated. First, it is not sufficient for learners to be able to talk about their understanding of writing; they must perform. What is said well may not be performed correctly. Writing must be practised, and it is not practicable to wait until students have mastered every concept about writing; they may never be perfectly ready. In fact, students learn as they write. Furthermore, some basics of writing, such as coherence and unity, are abstract to learners, and understanding these concepts is usually a matter of degree rather than of absolute knowledge. What, then, might serve as the “talking points” (Mercer et al., 2009) for a writing teacher? What kind of “answers” should a second-language writing teacher elicit to help her teach? The answer is straightforward: student writing. Student writing can disclose valuable information that allows a teacher to plan and structure her instruction. Yet, too often, student writing marks the end of a unit, and the teacher’s written feedback returned with the students’ writings is not used to its full advantage. As scholars have cautioned in other teaching contexts, “…the child’s answer can never be the end of a learning exchange (as in many classrooms it all too readily tends to be) but its true centre of gravity” (Alexander, 2006, p. 25).

As Alexander (2006, p. 33) has noted, the ideas heralded by dialogic teaching are strikingly similar to the ideas of assessment for learning presented by Black and his colleagues (Black, Harrison, Lee, Marshall & Wiliam, 2003). Because the focus of this article is a pragmatic application of dialogic teaching ideals in the second-language writing classroom, centring on teacher assessment and feedback on student writing, the following discussion elaborates on recent studies in the area of learning assessment to justify the proposed approach to second/foreign language writing instruction.

ASSESSMENT FOR LEARNING

Formative assessment, in contrast to summative assessment, which is high-stakes, standardised, evaluative, large-scale, and institutional, did not attract as much research attention as its counterpart in the previous century. What educational roles it could play, and how it was carried out, were largely subject to classroom teachers’ idiosyncratic discretion. The earliest systematic reviews are generally believed to be those of Crooks (1988) and Natriello (1987), which focused on the impact of evaluation practices on students. A decade later, Black and Wiliam (1998) used “the black box” as a metaphor for classroom assessment and began to explore the potential of opening that black box, that is, of using formative assessment for teaching and learning. A few research teams elaborated on the possibilities of assessment for learning (AfL), as opposed to the more conventional role assumed for assessment, namely assessment of learning (AoL). Earlier studies on formative assessment, or AfL, were mostly situated in science and math education at the school level, yet its influence has gradually expanded to other subject areas and institutional contexts.

L2 education has by no means been left out of this formative assessment movement. Although concerns were voiced a few years ago about the scarcity of such research in L2 classrooms (for example, Colby-Kelly & Turner, 2007; Rea-Dickins, 2004), the situation has been changing. Harlen and Winter (2004) depicted the development of formative assessment in science and math education in Britain for readers of the journal Language Testing. Six features of quality teacher assessment were identified, namely: 1) gathering and using information about learning processes and products; 2) using indicators of progress; 3) questioning and listening; 4) feedback; 5) ensuring pupils understand the goals of learning; and 6) self- and peer-assessment. Many of these features have since been explored in other contexts; a simple rendering of the list as a checklist appears after this paragraph. Leung (2004) identified areas of challenge for implementing AfL in L2 classrooms, including conceptual clarification, infrastructural development, and teacher education. Cumming (2009), in a review of language assessment, pinpointed the difficulties of aligning curricula and tests and of describing or promoting optimal pedagogical practices and conditions for learning. More recently, in 2009, TESOL Quarterly devoted a special issue to teacher-based assessment, offering a variety of perspectives on L2 formative assessment in which teachers are empowered to make assessment decisions conducive to learning. In addition to general L2 issues, AfL has also been introduced to and interpreted for specific subfields such as L2 writing (Lee, 2007a).
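Purely as an illustration (not something proposed by Harlen and Winter, 2004), the six features can be kept as a checklist against which a lesson plan is audited. The short Python sketch below assumes a hypothetical dictionary marking which features a given lesson already addresses.

# An illustrative sketch: the six features of quality teacher assessment
# kept as a checklist, with a helper that lists features not yet addressed.

FEATURES = [
    "gathering and using information about learning processes and products",
    "using indicators of progress",
    "questioning and listening",
    "feedback",
    "ensuring pupils understand the goals of learning",
    "self- and peer-assessment",
]

def unchecked(checklist):
    """List the features a lesson plan has not yet addressed."""
    return [feature for feature, done in checklist.items() if not done]

# Example: a hypothetical lesson plan that so far covers only feedback.
lesson = {feature: False for feature in FEATURES}
lesson["feedback"] = True
print(unchecked(lesson))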

As awareness and acceptance of formative assessment gradually spread across different areas of education, the above-mentioned reviews helped scholars synthesise collective wisdom, lay down agendas for more researchers to follow, and articulate principles for practitioners to apply. However, a genuine difficulty has stood in the way: the lack of a unifying theory (Davison & Leung, 2009), one that could consolidate diffuse efforts and establish a future trajectory apart from the long-established, standardised testing paradigm.

An integrated AfL theory

Acting in response, Black, Harrison, Lee, Marshall and Wiliam (2003) summarised five types of activities based on evidence of effectiveness: sharing success criteria with learners, classroom questioning, comment-only marking, peer- and self-assessment, and the formative use of summative tests. Subsequently, Black and Wiliam (2009) developed a two-dimensional framework to organise the various aspects of formative assessment. One dimension is the agent of learning, which could be the teacher, a peer, or the learner him/herself. The other captures the stages of learning: the goal (“where the learner is going”), the current status (“where the learner is right now”), and the bridge between the two (“how to get there”). Forged under these two dimensions, classroom learning based on formative assessment is believed to progress in the following temporal sequence (p. 8).

Where the learner is going

Step One: Clarifying learning intentions and criteria for success (teacher)
Understanding and sharing learning intentions and criteria for success (peer)
Understanding learning intentions and criteria for success (learner)

Where the learner is right now

Step Two: Engineering effective classroom discussions and other learning tasks that elicit evidence of student understanding (teacher)

How to get there

Step Three: Providing feedback that moves learners forward (teacher)

Where the learner is right now / How to get there

Step Four: Activating students as instructional resources for one another (peer)
Step Five: Activating students as the owners of their own learning (learner)
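To make the agent-by-stage organisation of these five steps easier to inspect, the following short sketch (an illustration added here in Python, not code from Black and Wiliam or from the original study) stores the framework as a nested mapping from learning stage to agent to activity; iterating over it reproduces the temporal sequence above.

# An illustrative encoding of Black and Wiliam's (2009) two dimensions:
# the outer keys are stages of learning, the inner keys are agents, and
# the values are the corresponding formative assessment activities.

FRAMEWORK = {
    "Where the learner is going": {
        "teacher": "Step One: clarifying learning intentions and criteria for success",
        "peer": "understanding and sharing learning intentions and criteria for success",
        "learner": "understanding learning intentions and criteria for success",
    },
    "Where the learner is right now": {
        "teacher": ("Step Two: engineering effective classroom discussions and other "
                    "learning tasks that elicit evidence of student understanding"),
    },
    "How to get there": {
        "teacher": "Step Three: providing feedback that moves learners forward",
    },
    "Where the learner is right now / How to get there": {
        "peer": "Step Four: activating students as instructional resources for one another",
        "learner": "Step Five: activating students as the owners of their own learning",
    },
}

# Walk the framework stage by stage, agent by agent.
for stage, agents in FRAMEWORK.items():
    print(stage)
    for agent, activity in agents.items():
        print(f"  ({agent}) {activity}")

The same structure could serve as a planning aid: a lesson design that leaves a stage-agent cell empty makes visible which part of the formative cycle has not yet been provided for.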

