
A Semi-Automated Corpus-Based Method of Comparing Verb-Noun Collocations Between Native and Non-Native English Speakers


Academic year: 2021


(1) Department of English, National Taiwan Normal University. Master's Thesis. A Semi-Automated Corpus-Based Method of Comparing Verb-Noun Collocations Between Native and Non-Native English Speakers. Advisor: Hao-Jan Chen, Ph.D. Graduate student: Yu-Cheng Liu. July 2013.

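(2) ABSTRACT (Chinese)

In research on second language learners' collocations, Verb-Noun combinations are the most common and most important aspect. Yet among the findings and research methods of recent decades, more efficient and systematic techniques of corpus data extraction and analysis have not been fully discussed or put into practice. This study uses the online platform The Sketch Engine (SKE) to examine, in a semi-automated way, the Verb-Noun miscollocations of English learners whose native language is Chinese. Experimental testing of this platform is expected to yield a more efficient and reliable method and more meaningful results.

This study addresses three research questions: (1) to experimentally test Sketch Diff, a semi-automated corpus analysis function of The Sketch Engine, for comparing Verb-Noun collocation use between native English speakers and learners, (2) to explore the error types in the Verb-Noun collocations of Chinese-speaking English learners, and (3) to examine the possible causes behind these errors.

First, with the Corpus Creating function of the SKE, the British National Corpus (BNC), a native speaker corpus, and a large learner corpus combining four corpora of Chinese-speaking English learners (CLEC, SWECCL, the College Entrance Examination Center corpus, and the Taiwanese Learner Corpus) were uploaded to the SKE platform. Next, the 690 most frequently used English nouns were extracted from this large learner corpus. Finally, Sketch Diff, a function originally used on the SKE to compare synonyms, was operated experimentally to compare the common verb collocates of these 690 nouns in the BNC and in the large learner corpus.

(3) The results show 134 types (2841 tokens) of Verb-Noun miscollocations. As for error types, 63 types (832 tokens) were verb errors, 43 types (1502 tokens) were errors in prepositional or phrasal verbs, and 28 types (507 tokens) were noun errors.

As for the causes of errors, the influential source factors rank as a) negative L1 transfer (75 types, 1376 tokens), b) overgeneralization (23 types, 213 tokens), c) erroneous use of synonyms (21 types, 449 tokens), d) false analogy (9 types, 728 tokens), e) approximation (4 types, 64 tokens), f) erroneous delexical verbs (1 type, 6 tokens), and g) erroneous coinage (1 type, 5 tokens).

Several observations follow from these results. First, Chinese, the learners' L1, still exerts an overwhelming negative influence on the learning of Verb-Noun collocations. Second, errors in prepositional or phrasal verbs are the most widespread and indeed call for more attention and coping strategies. Finally, errors possibly caused by overgeneralization and erroneous use of synonyms point to blind spots in learners' collocation learning, namely an urgent need for more English input and contextualized practice.

Keywords: semi-automated, Verb-Noun collocations, corpus analysis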

(4) ABSTRACT

Among the studies of ESL learners' collocations, Verb-Noun combinations have been the most popular and important aspect. Yet, despite the findings and research methods developed in recent decades, a more efficient and systematic manner of corpus data extraction and analysis has not been thoroughly discussed or practiced. This study, adopting the online platform The Sketch Engine (SKE), aims to examine Chinese ESL learners' Verb-Noun miscollocations with a semi-automated method. Through an experimental use of the online interface, a more streamlined and reliable approach is expected to be realized, and more interesting results to be revealed.

Three research questions were targeted in this study. They are (1) to experiment with Sketch Diff, a semi-automated corpus-based function of the online platform The Sketch Engine, to compare Verb-Noun collocations between native and non-native English speakers, (2) to explore the types of Chinese ESL learners' Verb-Noun miscollocations, and (3) to inspect the probable causes of learners' Verb-Noun miscollocations.

First, with the Corpus Creating function on the SKE, a native speaker corpus, the British National Corpus (BNC), and a Chinese ESL learner corpus, merged from CLEC,

(5) SWECCL, JCEE Testees Corpus, and Taiwanese Learner Corpus, were uploaded onto the platform. Then, a list of the 690 most frequent nouns in the Chinese ESL learner corpus was generated. Finally, the Sketch Diff function, originally intended for synonym comparison on the SKE, was repurposed in this study to compare the verb collocates of the 690 nouns in the BNC and in the Chinese ESL learner corpus.

A total of 134 types of Chinese ESL learners' Verb-Noun miscollocations were found, with 2841 tokens overall. As for the general types of learners' V-N miscollocations, 63 types (832 tokens) belonged to the deviant use of verbs, 43 types (1502 tokens) were categorized under the misuse of prepositional and phrasal verbs, and 28 types (507 tokens) were grouped under the misuse of nouns.

In terms of contributing error sources, the influential factors are a) negative L1 transfer (75 types, 1376 tokens), b) overgeneralization (23 types, 213 tokens), c) erroneous use of synonyms (21 types, 449 tokens), d) false analogy (9 types, 728 tokens), e) approximation (4 types, 64 tokens), f) erroneous delexical verbs (1 type, 6 tokens), and g) erroneous coinage (1 type, 5 tokens).

According to the results above, several observations were proposed. First, the overwhelming influence of Chinese, the target ESL learners' L1, still exerts a great

(6) impact on learners' acquisition of V-N collocations. Second, as the most frequently found type of error, the misuse of prepositional and phrasal verbs indeed requires more attention and coping strategies. Finally, miscollocations possibly caused by overgeneralization and erroneous use of synonyms point to the blind spots of learners' collocation learning, which is in serious need of more L2 input and contextualized exercises.

Keywords: semi-automated, Verb-Noun collocations, corpus analysis
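The counts reported in the abstract can be cross-checked with a few lines of Python. The sketch below only copies the (types, tokens) figures stated in the abstract; the dictionary and function names are illustrative assumptions, not part of the study's tooling.

```python
# (types, tokens) per category, copied from the abstract.
error_types = {
    "verbs": (63, 832),
    "prepositional and phrasal verbs": (43, 1502),
    "nouns": (28, 507),
}
error_sources = {
    "negative L1 transfer": (75, 1376),
    "overgeneralization": (23, 213),
    "erroneous use of synonyms": (21, 449),
    "false analogy": (9, 728),
    "approximation": (4, 64),
    "erroneous delexical verbs": (1, 6),
    "erroneous coinage": (1, 5),
}


def totals(breakdown):
    """Return (total types, total tokens) for one breakdown."""
    types = sum(t for t, _ in breakdown.values())
    tokens = sum(k for _, k in breakdown.values())
    return types, tokens


def token_shares(breakdown):
    """Return each category's share of all tokens, in percent (one decimal)."""
    total_tokens = totals(breakdown)[1]
    return {name: round(100 * tok / total_tokens, 1)
            for name, (_, tok) in breakdown.items()}


print(totals(error_types))    # (134, 2841)
print(totals(error_sources))  # (134, 2841)
print(token_shares(error_types))
```

Both breakdowns sum to the same 134 types and 2841 tokens. The token shares also show in what sense prepositional and phrasal verbs are the most frequent error: they lead by tokens (1502), whereas simple verbs lead by number of types (63).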

(7) ACKNOWLEDGMENT

Dear Prof. Howard, I still vividly recollect the times when we chatted in the corridor, in the elevator, or on your busy way to another meeting or class. A teacher's reminder always means a lot to his student. Thank you for your small talks. They were not long; they were not formal, but they really worked. I still recollect your enthusiasm and professionalism when you analyzed your innovative ideas. A teacher's guidance always means a lot to his student. Thank you for your suggestions. They were not just the beginning. They led me firmly toward the end of this study.

Thank you, Prof. Gao and Prof. Tseng, for being my committee members. Your precise and wise questions pointed out my room for improvement. They were clear, thought-provoking, and undoubtedly of great help.

Thank you, Joe and Jacob. You are my crazy smart mentors. You always instructed me, inspired me, and motivated me to pursue the next level of excellence. Thank you, Prof. Jean, for carefully reviewing my results and offering much practical advice. Joe perused my rough draft and corrected many careless mistakes. I felt very sorry about them and especially appreciate his help.

Thank you, my respectable friends and colleagues in different places and schools. You shared with me how to stick to one's goal when struggling between work and

(8) study. Thank you, my powerful bosses, Catherine, Jessie, Chiuchin, Meichiou, Chihkai, and the whole knowledgeable and energetic staff of English teachers as well as other subject teachers at the Green Garden. I felt blessed every time I was asked about the progress of my thesis writing. It was a huge pressure, but pressure that was absolutely needed during the final stages of the work. Thank you all for the encouragement, care, and consideration.

Thank you, my soul mate and partner for life, Lexie. You always put me back on the right track whenever I got tired and felt like giving up. We have supported each other for so many years. Thank you for your tolerance and company. It's time to adjust my lifestyle now.

Thank you, my dear family, my dad, my sister, and my mom in particular. In the last few months, I thought I was strong enough to face all the stress and difficulty alone. Yet I was wrong. Had it not been for your visits and your selfless love in arranging everything at home for me, I wouldn't have finished this task in such a short time. I have learned a lot from this, and you have my highest respect, Mom. You are the superhero.

(9) TABLE OF CONTENTS
ABSTRACT (Chinese)
ABSTRACT (English)
ACKNOWLEDGMENT
CHAPTER I INTRODUCTION
1.1 Background of the Study
1.2 Motivation of the Study
1.3 Purpose and Significance of the Study
1.4 Research Questions
1.5 Definition of Key Terms
CHAPTER II LITERATURE REVIEW
2.1 Collocations
2.1.1 Collocations and Verb-Noun Types
2.1.2 Difficulties ESL Learners Face
2.1.3 Error Types of ESL Learners' Collocations
2.2 Methods of Analyzing ESL Learners' Miscollocations
2.2.1 Manual Extraction and Examination
2.2.2 Learner Corpus and KWIC Concordances
2.2.3 Concordancers and MI Measures
CHAPTER III METHOD
3.1 Instruments: The Sketch Engine
3.1.1 Corpus Creating Function
3.1.2 Sketch Diff Function
3.2 Corpora
3.2.1 The British National Corpus
(10) 3.2.2 CLEC, SWECCL, JCEE, The Taiwanese Learner Corpus
3.3 Data Extraction
3.4 Data Analysis
CHAPTER IV RESULTS AND DISCUSSION
4.1 Statistical Data and Miscollocation Types
4.2 Misuse of Simple Verbs
4.2.1 Synonymous Verb Pairs
4.2.2 Verbs in Common Expressions: An Illustration
4.2.3 Other Verb Pairs
4.3 Misuse of Prepositional Verbs and Phrasal Verbs
4.4 Misuse of Nouns
4.4.1 Incomplete Noun Phrases
4.4.2 Other Noun Pairs
4.5 Discussion of Verb-Noun Miscollocations
4.5.1 Salient Miscollocates by Chinese ESL Learners
4.5.2 Possible Factors Causing the ESL Learners' Verb-Noun Errors
CHAPTER V CONCLUSION
5.1 Summary of the Major Findings
5.2 Pedagogical Implications
5.3 Limitations and Future Research
REFERENCES
APPENDIXES
Appendix A. Top 690 Nouns in the Chinese ESL Learner Corpus
Appendix B. Verb-Noun Miscollocations (Alphabetical Order)
Appendix C. Verb-Noun Miscollocations (Frequency Order)

(11) CHAPTER I INTRODUCTION

1.1 Background of the Study

Based on an abundance of informative research and fruitful results, collocations have been recognized as an influential factor of competence in the field of Second Language Acquisition (Sinclair 1991; Nattinger and DeCarrico 1992; Biber et al. 1999; Wray 2008; Schmitt 2010). As the studies of Brown (1974), Granger (1998), Lewis (2000), and Martinez & Schmitt (2012) have indicated, collocations not only facilitate linguistic production but also enhance overall comprehension. Their significance and the related studies can be illustrated in the following three aspects: pragmatics, language acquisition, and corpus linguistics.

The more diverse collocations one knows, the more skillfully he or she can communicate in the English language (Boers, Eyckmans, Stengers, and Demecheleer 2006; Laufer and Waldman 2011). Hill (1995) offered a realistic and precise observation: "Students with good ideas often lose marks because they do not know the four or five most important collocations of a keyword that is central to what they are writing about" (p. 5). A sentence from TIME.com serves as another practical example

(12) about collocation use:

"Snow began to fall around the Northeast on Friday at the start of what's predicted to be a massive, possibly historic blizzard, and residents scurried to stock up on food and supplies ahead of the storm poised to dump up to 3 feet of snow from New York City to Boston and beyond" (TIME.com Feb. 8, 2013).

Without a sufficient repertoire of collocations, a sentence like the one above would end up verbose and redundant. This, on the other hand, illustrates the next pivotal aspect of collocations: the trouble of learning these "arbitrarily constructed" chunks (Firth 1957).

No matter how long one has learned English and how proficient one is, acquiring collocation use always seems a frustrating and tiring process. In the study of Laufer and Waldman (2011), advanced ESL learners, though not producing as many Verb-Noun miscollocations as beginners, still made a variety of miscollocations. This persistent gap between one's general English proficiency and his or her collocational competence arises, as Laufer and Waldman indicated, because learners' use of collocations lags farther behind their advancing vocabulary knowledge than is actually perceived. Barfield (2007) also reported a similar

(13) case of Japanese ESL learners in which they demonstrated well-informed knowledge of individual vocabulary items but could not fully understand collocations containing the same vocabulary items.

Why is "a powerful computer" preferred to "a strong computer"? In addition to consulting native speakers' intuition, another common way to observe how collocations form lies in the compilation and analysis of corpora. The BBI Dictionary of English Word Combinations (Benson et al. 1997), a corpus-based dictionary, lists different collocations formed with no reasonably clear rules: adjective + noun (stiff breeze instead of rigid breeze), verb + noun (hold an election instead of make an election; kick the bucket [pass away] instead of hit the bucket), noun + noun or noun + of + noun (movie theater instead of film theater; swarm of bees instead of crowd of bees), and adverb + verb (thoroughly amuse instead of completely amuse). Why do certain words show a stronger tendency to co-occur in some combinations but not in others, even though all of the combinations are grammatically correct? That is why learners tend to encounter difficulties handling these conventionalized language usages, and also why Kennedy (1990), Bahns (1997), Nesselhauf (2005), and Leacock et al. (2010) pointed out that combinations like "make a decision" or "a bitter disappointment" should be incorporated and emphasized in second language instruction instead of being left for learners to discover by

(14) themselves.

To examine the discrepancy in collocation use between native and non-native English speakers, miscollocations made by ESL learners and the corresponding analysis mechanisms have long been a core subject in the field of corpus linguistics (Flowerdew 2001; Nation 2003; Kaur and Hegelheimer 2005; Varley 2009; Huang 2010). For instance, Shih (2000) adopted the Taiwan Learner Corpus of English (TLCE), a 415,700-token corpus, for miscollocation analysis. Liu (2002) looked into lexical miscollocations in high school and college students' compositions from the English Taiwan Learner Corpora (ETLC). Nesselhauf (2003; 2005) investigated the Verb-Noun collocation errors in German ESL learners' essays from the ICLE (International Corpus of Learner English). Lin (2010), furthermore, did a thorough examination of Taiwanese and Chinese ESL learners' common Verb-Noun collocation errors with the BNC (British National Corpus) as a native reference. A large amount of effort has been exerted, and a variety of fields of collocations have been explored. As Gledhill (2000) stated, "It is impossible for a writer to be fluent without a thorough knowledge of the phraseology of the particular field he or she is writing in" (p. 1). To achieve an excellent command of a second language, collocations undoubtedly play an indispensable part.

(15) 1.2 Motivation of the Study

From the history of and contemporary work on collocation study in SLA and TESOL, the significance of collocations has been substantially demonstrated. However, a practical method of analyzing learners' miscollocations that is both user-friendly and theoretically sound has still not been established.

In the research of Liu (1999; 2000), Shih (2000), Nesselhauf (2003; 2005), Lin (2010), and others, many interesting results were discussed. Yet, the researchers all relied on manual extraction and sorting of the possibly erroneous V-N collocations produced by L2 learners. This is indeed laborious, and might not guarantee reliable coverage of all the misused V-N entries in the learner corpora.

Moreover, the reference corpus in past research, mostly the BNC, was chosen only for screening out overlapping entries in the learner corpora. That is, although incorrect V-N collocations in the learner corpora were identified in that way, native V-N collocations in the BNC were left out of further discussion. Learners were found to make certain V-N mistakes, but what authentic combinations did native speakers use instead? This point seems to be missing in past studies.

Third, according to the limitations and suggestions in the studies of Lin (2010) and others, the size of the ESL learner corpora used can still be expanded. With a larger data

(16) bank, both quantitatively and qualitatively significant examples would be expected to increase the validity of final results.

In sum, three major needs can be summarized: efficiently pinpointing learners' miscollocations, offering comparable native examples from corpus-based data, and, for raising linguistic awareness, displaying a scale of collocational comparison between native and non-native speakers.

By combining the strengths of current concordancing systems, the field of corpus linguistics could transform the interface of query systems into not only referential but also educational devices. The needs mentioned above can be explained more vividly with a scenario. If a learner wants to know whether the combination change work is used by native speakers, by searching the keyword work on an online platform, he or she could observe a range of collocations with work that are commonly used by natives. In addition, another set of less suitable combinations with work that are often misused by L2 learners can be shown at the same time. This way, the learner understands whether his or her intended usages are right and can also notice which other ones he or she could or could not apply next time.
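The scenario above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SKE's implementation: the (verb, noun) pairs are invented toy data rather than entries from the BNC or the learner corpus, and the function names are hypothetical.

```python
from collections import Counter


def verb_collocates(pairs, noun):
    """Count the verbs that occur with `noun` in a list of (verb, noun) pairs."""
    return Counter(verb for verb, n in pairs if n == noun)


def compare_collocates(native_pairs, learner_pairs, noun):
    """Split the verb collocates of `noun` into shared, native-only, learner-only."""
    native = verb_collocates(native_pairs, noun)
    learner = verb_collocates(learner_pairs, noun)
    return {
        "shared": sorted(native.keys() & learner.keys()),
        "native_only": sorted(native.keys() - learner.keys()),
        "learner_only": sorted(learner.keys() - native.keys()),
    }


# Invented toy data for illustration only (not taken from any real corpus).
native_pairs = [("find", "work"), ("do", "work"), ("start", "work"), ("do", "work")]
learner_pairs = [("do", "work"), ("change", "work"), ("find", "work"), ("change", "work")]

result = compare_collocates(native_pairs, learner_pairs, "work")
print(result)
```

In this sketch, a learner-only collocate is merely a candidate for manual inspection, not an automatically confirmed error, which is why the procedure is best described as semi-automated.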

(17) 1.3 Purpose and Significance of the Study

Planning to tackle L2 learners' miscollocations more effectively and proactively, the author would like to discuss an alternative method on a query system, The Sketch Engine (SKE) (http://www.sketchengine.co.uk). Among the various functions on the website, two distinguish the SKE most from other concordancing systems: the Corpus Creating function and the Sketch Diff function (Kilgarriff et al. 2004).

The Corpus Creating function allows users to upload and merge their corpora and to apply the SKE's functions to their data online. The Sketch Diff function, on the other hand, can be used to compare a keyword's collocates of different parts of speech in two different corpora, e.g., two general corpora, or one general corpus and one learner corpus (Kilgarriff et al. 2004). The author of this thesis expects to conduct a study based on these two main functions in the following aspects.

To begin with, by adopting the Sketch Diff function, a complete check and comparison of certain target words' collocates between the BNC and a designated learner corpus can be achieved, avoiding possible human oversight or misjudgment.

Second, Verb-Noun collocations are the focus of this study. Since this type of

(18) collocation has been identified as the most challenging yet important feature of SLA (Altenberg 1993; Liu 1999; Liu 2002; Li 2005), probing into this field again through a semi-automated method is expected to bring certain previously unheeded results to the surface.

Third, through a careful and full-scale inspection of V-N use by natives and non-natives, the probable causes contributing to learners' miscollocations should be recalibrated. Various past studies have explored this issue (Shih 2000; Liu 2002; Chang and Yang 2009; Lin 2010; Laufer and Waldman 2011). However, with more advanced technological support, an alternative viewpoint might be demonstrated through the corpus-based juxtaposition of native and non-native V-N collocation uses.

Last but not least, several teaching methods centering on lexis and collocations have been proposed (Willis 1990; Nattinger and DeCarrico 1992; Lewis 1994; 1997). Phraseological units, instead of being left as a peripheral part, are now a red-hot topic in SLA and TESOL (Nesselhauf 2005). By applying the popular Sketch Engine platform and the semi-automated comparison of Verb-Noun uses between Chinese ESL learners and the British National Corpus, the author expects to provide some suggestions for future TESOL research as well as material compilation.

(19) 1.4 Research Questions

According to the aforementioned background and motives, the author's research questions are proposed as follows.

1. Can Sketch Diff, a semi-automated corpus-based function of the online platform The Sketch Engine, be used experimentally to compare the Verb-Noun collocations of native and non-native English speakers?

2. What are the types of miscollocations made by Chinese ESL learners?

3. What can be the possible causes of Chinese ESL learners' collocational misuse?

(20) 1.5 Definition of Key Terms

1. Verb-Noun Collocation: Based on past studies (Givon 1993; Fromkin et al. 2003; Lin 2010), Verb-Noun collocations refer to lexical verbs and nouns with preceding modifiers. In this study, the online query system of the SKE scans the corpora and extracts Verb-Noun collocations semi-automatically.

2. Learner Corpus: A compiled and systematically tagged collection of L2 learners' linguistic production, either spoken or written. This type of corpus is distinguished from general corpora like the BNC (British National Corpus), which consists of the linguistic production of L1 speakers, i.e., natives of English-speaking regions.

3. Semi-Automated: This term means that in this study, from corpus data extraction to the preliminary inspection of pragmatic and semantic appropriateness, the online system of The Sketch Engine takes over work otherwise done by manual labor.

4. The Sketch Engine: A web-based program (https://the.sketchengine.co.uk/) that accepts uploaded linguistic data on subscription terms and offers its analysis functions in return. It contains two main functions: the concordancer and the Word Sketch program (Kilgarriff et al. 2004). Word Sketch, as its name suggests, organizes a keyword's collocates of all parts of speech, and provides a

(21) one-page summary of them, with the frequency of each collocate shown neatly in color-coded columns. The picture below (Figure 1.1) is an example of the word fun displayed with Word Sketch on the SKE website.

Figure 1.1 Fun in the BNC demonstrated by the Word Sketch function.
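The kind of summary a word sketch provides, collocates grouped by grammatical relation and ranked by frequency, can be approximated with a short Python sketch. Everything here is an assumption made for illustration: the relation labels, the triples, and the function name are hypothetical, and the counts are not BNC figures.

```python
from collections import Counter, defaultdict


def word_sketch(triples, keyword):
    """Group the collocates of `keyword` by relation, most frequent first."""
    by_relation = defaultdict(Counter)
    for relation, head, collocate in triples:
        if head == keyword:
            by_relation[relation][collocate] += 1
    return {rel: counts.most_common() for rel, counts in by_relation.items()}


# Invented (relation, keyword, collocate) triples; not real corpus data.
triples = [
    ("object_of", "fun", "have"),
    ("object_of", "fun", "have"),
    ("object_of", "fun", "poke"),
    ("modifier", "fun", "great"),
    ("modifier", "fun", "good"),
    ("modifier", "fun", "great"),
    ("object_of", "game", "play"),
]

sketch = word_sketch(triples, "fun")
for relation, collocates in sketch.items():
    print(relation, collocates)
```

Each relation corresponds to one column of the summary page, so the verb column of a noun is what a Verb-Noun comparison between two corpora would draw on.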

(22) CHAPTER II LITERATURE REVIEW

2.1 Collocations

This section is organized in two parts. The first discusses the general notions and studies of collocations. The second focuses on Verb-Noun collocations and their related topics, which are the emphasis of this study.

2.1.1 Collocations and Verb-Noun Types

Since Firth (1957) first coined the term "collocations," the definition of this pivotal concept has spurred much discussion and multifarious explanations. For instance, Murphy (1983) referred to collocations as "word associations," Alexander (1984) as "fixed expressions," and Granger (1998) as "prefabricated patterns." As a quote from Firth aptly explains, "you shall know a word by the company it keeps" (p. 197): the relations among words appear much more complex than their individual components.

Nesselhauf (2005) offered a practical description of "non-substitutability" about

(23) the feature of collocations. It explains that the meaning of a collocation is muddled once one of its components is replaced with a semantic counterpart. For example, "powerful" can be semantically similar to "strong," but in the collocation "a powerful computer," powerful cannot be exchanged with strong. Otherwise, the result sounds odd and unidiomatic.

In this study, the definition the author opts for is the one proposed by Laufer and Waldman (2011). They suggested "restricted co-occurrence" (Sinclair 1991) and "semantic transparency" (Cowie 1994) as their core definition of collocations. A brief section of their examples serves as a straightforward and clear categorization:

"We consider throw a disk and pay money to be free combinations, we consider throw a party and pay attention to be collocations, and we consider throw someone's weight around and pay lip service to be idioms" (p. 649).

With respect to the importance of collocations, Howarth (1996) found that over a third of the word combinations observed in a 240,000-word corpus were collocations. Moreover, Nesselhauf (2005) delineated a precise argument for why collocations count. The past observations about the crucial role collocations play can be summarized in the following four facets.

(24) First, collocations are the foundation of creative language in both L1 and L2 (Peters 1983; Wray 1999). Second, a set of ready-to-use chunks is psycholinguistically the decisive factor for one to achieve fluency in language production (Pawley and Syder 1983; Aitchison 1987). Third, since collocations are the shared platform between conversants, pragmatic comprehension surely receives enhancement through their use (Hunston and Francis 2000). Finally, like register in a certain region, collocations serve a similar function in meeting "the desire to sound [and write] like others" (Wray 2002).

There are always various interpretations of so-called Verb-Noun collocations. For a clear inspection of V-N collocations in this study, two camps of rationale are outlined: one from Cowie (1991; 1992) and the other from Howarth (1996; 1998), Siepmann (2005), and Lin (2010).

First, Cowie (1991; 1992) suggested that verbs could have three major features: figurative, delexical, and technical.

1. Figurative: in "deliver a speech," the keyword deliver goes beyond its basic denotation of sending an object to a more figurative or abstract sense of getting certain ideas across to the listeners.

2. Delexical: in "make recommendations," the verb make does not really convey

(25) a specific meaning but is rather grammaticalized and vague.

3. Technical: in "try a case," the verb try is constrained instead: it takes its meaning only within the collocation and tends to be narrow as well as specific.

Second, based on Howarth (1996; 1998), Siepmann (2005), and Lin (2010), there can be five levels of fixedness of Verb-Noun collocations, from the most fixed to the least.

1. Complete restriction on individual components: they can be (a) pure idioms, i.e., opaque in literal meaning (let the cat out of the bag, spill the beans), or (b) figurative idioms (call the shots).

2. Constraint on one element, while others could be substituted: such as give the appearance/impression, or take/pay heed.

3. Partial regulation on certain distributional places: like make/give a speech/presentation.

4. Freedom of replacement on one component, while others are partially restrained: weak collocations in a sense (accept/agree to/adopt a plan/proposal/suggestion/recommendation/convention, etc.).

5. Open to any substitution on any component: free or non-restricted

(26) compositional sequence (paint the wall; suggest an idea).

2.1.2 Difficulties ESL Learners Face

The previous part briefly summarized the major discussion of collocations in general and of Verb-Noun features. The following two sections review the past research concerning learners' difficulties with collocations and the basic miscollocation types.

Studies have been carried out with a variety of elicitation techniques, such as translation, cloze, fill-in, and multiple-choice tests, or questionnaires. Biskup (1990) reported that in translation tests, Polish ESL learners answered correctly when translating L2 collocations into L1, yet not the other way around. Bahns and Eldaw (1993) designed translation as well as fill-in questions like "He was too proud to _____ his defeat." Their study showed that verbs which were part of a collocation caused much more difficulty than others, whatever level the testees were at. Shei (1999) revealed another interesting case in which Chinese ESL learners encountered more problems with cloze collocation tests than learners with a European L1 background. Furthermore, Wang (2001), through a 50-question fill-in test for her subjects, pointed out that English collocations which

(27) were more idiomatic, had no similar Chinese counterparts, or contained more interchangeable synonyms caused students much trouble in learning.

Chen (2008) designed a multiple-choice test with 50 questions for 440 non-English-major college students. Questionnaires were also distributed to understand their ESL learning background. The results of Chen's study suggested that Verb-Noun miscollocations were the most marked errors. Other than that, students' knowledge of collocations was found to have much to do with their performance in English on their Joint College Entrance Examinations. After analyzing her subjects' collocation errors, Chen concluded that negative transfer from L1, overgeneralization, and confusing usages of synonyms were the main causes of students' deficiency in English collocations.

Gitsaki (1997) went further and combined essay writing, translation, and fill-in questions in her study. She divided her subjects into three groups according to their proficiency levels, i.e., post-beginning, intermediate, and post-intermediate. Her results showed that ESL learners were clearly in need of collocation instruction because of several adverse factors: the discrepancy between learners' L1 and L2, the intrinsic complexity of collocations, and the insufficient amount of L2 input received.

Regarding learners' self-perceptions, Li (2005) recruited 38 college undergraduates for her study, in which assignments and questionnaires were given to examine their ideas about L2 collocation difficulty. The results demonstrated that the collocations students deemed hard were not the ones they actually got wrong in their assignments. Moreover, ignorance of collocational constraints turned out to be the major cause of errors in students' production.

The above studies, though different in design, all pointed in one direction: collocations, Verb-Noun bundles in particular, caused much difficulty for ESL learners in both acquisition and production, and corresponding mechanisms should be given special consideration.

2.1.3 Error Types of ESL Learners' Collocations

Drawing on the discussion of Nesselhauf (2003) and James (1998), Chang and Yang (2009) proposed twelve types of V-N miscollocations and eleven potential causes, according to their findings from the CLEC corpus.

1. Erroneous verb choice: such as *learn knowledge, which should be acquire knowledge instead.

2. Misuse of delexical verbs: like the categorization of Howarth (1966; 1998),

delexical verbs do not convey much meaning unless accompanied by a complement, as do in *do recommendations (should be make recommendations).

3. Erroneous use of idioms: if certain collocations function like phrases, then they should be seen as a whole without much replacement, like *get touch with them (should be keep in touch with them).

4. Erroneous noun choice: such as *tell a speech, which should be tell a story instead.

5. Erroneous preposition after verb: for instance, in *reply letters, the V-Prep is misused and should be reply to letters.

6. Erroneous preposition after noun: similarly, in the collocation *give sympathy to animals, the Noun-Prep is wrong and the right one is give sympathy for animals.

7. Erroneous use of determiner: as the determiner is missing in *play piano, the correct form ought to be play the piano.

8. Erroneous syntactic structure: a distributional misuse, as in *rang the phone, which should be the phone rang.

9. Erroneous choice for intended meaning: in Chinese there is a common phrase *break my armed self, but the correct corresponding English expression should be undermine my self-esteem.

10. Redundant repetition: a semantically similar word is repeated, partially due to L1 interference, like *work one job (do one's job or just work would suffice).

11. Erroneous combination of two collocations: for example, in *enjoy yourself a good time, either enjoy yourself or have a good time would be sufficient.

12. Miscellaneous: those types of miscollocations which cannot be categorized.

In addition, Liu (1999) collected 94 copies of general writings and 127 final exam papers written by college students. In his results, 63 miscollocations were found, and the researcher provided a categorization of their causes.

1. Overgeneralization: if a word has more than one usage, overuse of it is foreseeable. As the word worry can function as a noun or verb, mistakes like *I am worry about you occurred (I am worried about you being the right one).

2. False analogy: an example like *ask you a favor can be an erroneous extension from structures like Verb+Noun+O.C.

(31) 3. Erroneous assumption: commonly found for delexical verbs in the instance of *do plans (make plans as the correct one), false guesses about these vague verbs could be made by ESL learners.. 4. Erroneous use of synonyms: as Farghal and Obiedat (1995) indicated as a "straightforward application of the open choice principle," *broaden your eyesight could be produced by ESL learners instead of broaden your horizons.. 5. Negative transfer: also possibly stemming from the influence of Chinese, *eat medicine is translated directly from Chinese expressions, while take medicine should be the right one in English expressions.. 6. Erroneous coinage: out of vocabulary deficiency or a lack of collocational awareness, students could make *see sun-up, combining sun and up together, instead of see a sunrise.. 7. Approximation: when two words share similar meanings or forms, a muddled sense of them could result in *release my pressure, whereas relieve my pressure ought to be used by ESL learners (release and relieve might appear alike in learners' memory).. 21.

2.2 Methods of Analyzing ESL Learners' Miscollocations

A large number of studies have been conducted in order to understand the hurdles learners face concerning their collocation learning and proficiency. Based on the generalization of Laufer and Waldman (2011) and Kilgarriff et al. (2004), the methods of these studies are categorized into three crucial stages: traditional manual data collection of language samples, concordances obtained from learner corpora with KWIC (keyword in context), and the use of MI (mutual information) measures for a more systematic analysis of a word's collocates.

2.2.1 Manual Extraction and Examination

Before the prevalence of online linguistic data sorting and storage systems, manual data collection and error inspection were the most typical approach. Liu (1999) collected 94 copies of general writings and 127 exam papers of Taiwanese ESL students for analysis. After the researcher's manual check, 63 errors were detected in the writing samples, most of which were Verb-Noun collocation errors.

Chen (2002), in addition, looked into the miscollocations of high school students in Taiwan through 90 English examination papers. 272 miscollocations were found

according to the category system of Benson (1986), with Adjective-Noun and Verb-Noun miscollocations being the two most significant sorts of all. Chen also pointed out that negative transfer from L1 seemed to be the main cause of ESL learners' miscollocations, especially when an L1 equivalent was available for the message to be phrased in English.

2.2.2 Learner Corpus and KWIC Concordances

Later, thanks to the development of modern technology, large-scale corpora and corresponding KWIC (keyword in context) programs became widely available to researchers.

Shih (2000), after manually inspecting the verbs most frequently used by ESL students in the TLCE (Taiwan Learner Corpus of English), a 415,700-word corpus, found several key verbs in the most problematic Verb-Noun combinations. These key verbs were achieve, understand, disturb, ask, and avoid. The possible reasons for learners' misuse of these verbs in collocations were considered to be their high frequency as well as their dominant tendency to be memorized with other noun collocates. That is, the more commonly a verb collocates with other words, the more frequently it is misused in learners' collocation production.

Liu (2002) adopted ETLC (English Taiwan Learner Corpus), a 1-million-plus-word corpus whose data were collected from IWill, an online reading project platform, for her study of Verb-Noun miscollocations made by ESL learners. With the help of error tags already marked online by other English teachers, such as word choice, wrong verb/noun, or problematic usages, the researcher combed these tagged words for potentially erroneous Verb-Noun collocations. She found 233 Verb-Noun miscollocations out of 265 lexical miscollocations in total. The quantitative result was quite significant, and Liu further pointed out that verb-based mistakes outnumbered noun-based ones, which meant that students had more problems learning how to utilize the verbs in the collocations they intended to produce. Apart from that, by consulting WordNet (http://wordnet.princeton.edu/), an online lexical database, Liu claimed that over half of the students' miscollocations stemmed from their confusion over the semantic concepts of inter-related verbs like run and move, or operate and drive, while the rest of the errors were mostly triggered by direct translations from learners' L1.

As for European ESL learners, Nesselhauf (2003) retrieved 32 essays composed by German college undergraduates from ICLE (International Corpus of Learner English). She perused the writings, tagged keywords one by one, and extracted errors mostly on her own. The results pointed out that 56 out of 1072 Verb-Noun

(35) combinations were confirmed to be miscollocations, the most salient type of all.. In an even larger scale of research, Nesselhauf (2005) went on to choose GeLEE, a 318-argumentative-essay, 154,191-word German ESL learner subcorpus in ICLE, for her doctoral dissertation research. She manually searched for miscollocations, with the BNC corpus, dictionaries, and native speakers as reference. Her focus was on Verb-Noun collocations, and the final outcome showed that 744 out of 2078 Verb-Noun collocations were miscollocations.. 2.2.3 Concordancers and MI Measures With the promotion of lexical statistics (Church and Hanks 1989) and better-programmed concordancers, other measures like MI (mutual information) have allowed researchers to broaden the scope of inspecting a word's collocates up to five words (Kilgarriff et al. 2004). Instead of reading concordance after concordance, a list of salient collocates for a keyword could be conveniently summarized.. Lin (2010) conducted an informative study to compare the Verb-Noun miscollocations between Chinese and Taiwanese ESL learners by adopting CLEC (Chinese Learner English Corpus), approximately 3.4 million words, and a Taiwanese ESL learners’ corpus, around 1.8 million words. First, she extracted all the Verb-Noun 25.

combinations from the above two corpora with the software AntConc and MonoConc Pro. Then, comparing these combinations with the BNC corpus by means of a program written in Perl, Lin identified those combinations that did not appear in the BNC. Finally, the researcher performed a manual check of all the potentially erroneous Verb-Noun collocations by consulting dictionaries and online websites. Her results showed that 210 types of miscollocations were detected in the Taiwanese ESL learner corpus and 268 in CLEC, and about 10% of the miscollocations overlapped in the two corpora.

The studies reviewed in sections 2.2.1 to 2.2.3, though all offering insightful results and discussions, seemed to leave room for improvement in the following respects.

As for the data extraction procedures, except for Lin (2010), the past studies all required too much labor during data extraction and might not really be feasible for future academic reproduction. Even in Lin's (2010) study, where a semi-automated method was adopted (a Perl program was applied to filter out the V-N collocations shared by the learner corpora and the BNC, so as to extract the erroneous V-N ones from the learner corpora), most of the procedure was still manual.

Also, as noted before, the potential miscollocations in the previous research

were double-checked manually with many kinds of resources such as The BBI Dictionary of English Word Combination, Oxford Collocations Dictionary, Oxford Advanced Learner’s Dictionary, etc. Nevertheless, the native examples from a well-organized corpus like the BNC were left without further consultation. This could result from labor and time constraints: if the V-N collocations from the BNC were to be manually extracted for comparison, it might take too much time and too many human resources.

Third, due to practical constraints, most of the analyses and decisions in the past research were made by the researchers alone, with suggestions from hard-copy as well as online dictionary references. Although the results were examined rigorously later on, this still raises concerns about human judgment and fatigue.

Finally, with the assistance of technology, the sizes of different corpora around the world are increasing day by day. Once new sources of data are incorporated into current corpora, new results or different analyses can be expected.
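The MI (mutual information) measure mentioned in section 2.2.3 can be made concrete with a small worked sketch. The function below computes the pointwise form commonly attributed to Church and Hanks (1989), i.e. the base-2 logarithm of the observed co-occurrence probability over the probability expected if the two words were independent. The function name and all counts are hypothetical, chosen only for illustration; they are not taken from any of the studies reviewed above, and actual concordancers may apply additional windowing or smoothing.

```python
import math


def mutual_information(pair_count, word_count, collocate_count, corpus_size):
    """Pointwise MI: log2( P(word, collocate) / (P(word) * P(collocate)) )."""
    if min(pair_count, word_count, collocate_count, corpus_size) <= 0:
        raise ValueError("all counts must be positive")
    observed = pair_count / corpus_size
    expected = (word_count / corpus_size) * (collocate_count / corpus_size)
    return math.log2(observed / expected)


# Hypothetical counts, for illustration only: a verb seen 1,000 times, a noun
# seen 2,000 times, and the pair seen together 50 times in a 1,000,000-word corpus.
score = mutual_information(
    pair_count=50, word_count=1000, collocate_count=2000, corpus_size=1_000_000
)
print(f"MI = {score:.2f}")
```

A positive score indicates that the pair co-occurs more often than chance would predict, which is why a ranked list of collocates can replace reading concordance lines one by one.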

CHAPTER III METHOD

This chapter presents the tool, the data sources, and the planned extraction and analysis procedures for this study. First, the online platform, The Sketch Engine, is introduced along with its two major functions used in this research. Then, the adopted corpora, both the general corpus and the ESL learner ones, are discussed. Finally, the semi-automated method and the final judgment process are explained.

3.1 Instruments: The Sketch Engine

First utilized in the compilation of the Macmillan English Dictionary (Rundell 2002), and debuted at Euralex 2002 (Kilgarriff and Rundell 2002), word sketches are one-page summaries of a word's grammatical and collocational features, drawn automatically from corpus data (Kilgarriff et al. 2004: p. 1). The Sketch Engine (also known as the Word Sketch Engine; SKE henceforth) is an innovative corpus query system that provides word sketches, grammatical relations, and a distributional thesaurus (Huang and Hong

2006). With its clear and constantly updated online platform, SKE has been gaining more and more attention these days.

In response to rapid technological advancement, SKE was designed to cope with the ensuing challenges and has developed distinctive functions.

First of all, having witnessed the introduction of Gigaword (a 1000M-word corpus) by The Linguistic Data Consortium (http://www.ldc.upenn.edu/), researchers around the world sensed that the traditional interface of concordancers could no longer handle such an amount of data (Kilgarriff and Grefenstette 2003). Instead of just reading lines of co-occurrences, more systematic arrangements were needed. As a result, Word Sketch, which examines a word in its varied grammatical contexts, was designed with Manatee, a state-of-the-art CQS (Corpus Query System), and is now able to display a set of up to 27 grammatical relations connected to a headword (Kilgarriff et al. 2004) (Figure 3.1).

Currently, The Sketch Engine, with its carefully designed tagging functionality, includes word sketches, thesaurus search, sketch differences, and many other practical uses in its repertoire of services. When keying in a word, users discover that not only are concordance lines available, but all the parts of speech of a target word are also delineated like its DNA combinations. Without a

traditional repetition of searching for certain collocation types of a headword time after time, this convenient one-touch-for-all-results interface undoubtedly saves much time and effort.

Figure 3.1 An Example of the Word Sketch Function on the SKE Website

Now, serving as a commercial product, the SKE website provides researchers, teachers, and students with a platform for learning and academic study. As Kilgarriff et al. (2004) claimed, a multi-word searching function was being tested. Maybe in the near future, a "multi-word sketch" will be announced as a breakthrough. The following are introductions to the two main functions of the SKE which are

needed for this study: Corpus Creating and Sketch Diff.

3.1.1 Corpus Creating Function

Generally, there are three basic functions on the SKE: access to large corpora, from 30 million to 10 billion words in up to 42 languages; a WebBootCat category, which allows members to build their own instant corpus by automatically retrieving all the keywords from a website; and a Corpus Creating function, for personal corpus data compilation.

As its name denotes, the Corpus Creating function allows users to upload their own data onto the platform for further analysis with the tools on the SKE website. Once a corpus is set up on the SKE, several kinds of operations can be executed, such as corpus querying, wordlist compiling, word sketch, thesaurus, and Sketch Diff, the central one, which is elaborated in detail in section 3.1.2. The target Chinese ESL learner corpora for this study were uploaded onto the SKE so that a semi-automated comparison could be carried out with its technical assistance.

3.1.2 Sketch Diff Function

The Sketch Diff function, with Diff standing for difference, was developed by Kilgarriff et al. (2004) to display the collocational discrepancy between two synonyms (Figure 3.2). When learners come across two seemingly similar words like intelligent and clever, they often wonder how to use them correctly in real situations.

Figure 3.2 The Sketch Diff Interface Showing intelligent and clever in the BNC

As observed in Figure 3.2, we can tell that certain distinctive adjectives just accompany intelligent or clever in a straightforwardly

different manner, something that used to be beyond the power of traditional thesaurus dictionaries or even online resources. For example, sensitive/ bright/ charming and intelligent are often co-collocates, while cunning/ brave/ bloody and clever tend to be other frequent combinations.

For Verb-Noun collocations, if speak and tell in the BNC are taken as an example, the Sketch Diff option clearly displays that, among the words accompanying speak and tell as Objects, story occurs 1309 times with tell but never with speak. On the other hand, English appears 414 times with speak but never with tell (Figure 3.3). In other words, tell a story/ lie/ tale are strongly related pairs, while speak English/ words/ languages are closely connected ones. The four numbers next to each collocate respectively indicate its frequency and salience scores with the first and second keywords (in this case speak and tell). A parallel contrast, therefore, can be quickly and scientifically observed.

Figure 3.3 Collocates in the Object position with speak and tell in the BNC

The use of Sketch Diff in this study, however, is not to compare two words in the same corpus. Instead, with the functionality of Word Sketch and Sketch Diff on the SKE, the author retrieves the collocates of a keyword from two different corpora, i.e., a native speaker English corpus vs. an ESL learner corpus, and compares them with each other, all of which is automatically executed with Sketch Diff. In this way, human misjudgment and time-consuming procedures can be reduced, and a more comprehensive overview as well as a more systematic comparison between native and non-native English speakers' collocation uses can be established. The author's purpose in utilizing the Sketch Diff functionality in this alternative manner is elaborated in section 3.3, Data Extraction.

(45) 3.2 Corpora. 3.2.1 The British National Corpus The native reference corpus adopted in this study is the BNC, British National Corpus. Boasting more than 100 million tokens, the BNC is a comprehensively balanced corpus consisting of both written (90%) and oral (10%) input, with a wide variety of sources from newspapers, university essays, to business meetings, and informal interviews, etc.. There are four main features about the BNC. First, it is mostly comprised of modern British English. Second, instead of a chronological documentation of the English language, the BNC only selects historical linguistic records during the late twentieth century. Third, covering various styles and subject matters, the BNC is not specifically restricted to certain type of domain, but rather a comprehensive database. Fourth, to prevent the tendency of collecting texts from repeated idiosyncratic styles, the BNC ensures that its sampling of input can be as multifarious as possible, setting up maximums for different lengths and types of sources like single or multiple authors, shorter or longer texts.. 35.

3.2.2 CLEC, SWECCL, JCEE, The Taiwanese Learner Corpus

The Chinese ESL learner corpora, on the other hand, are composed of four major parts: CLEC (Chinese Learner English Corpus, 1.0), SWECCL (Spoken and Written English Corpus of Chinese Learners, 1.0 and 2.0), the JCEE (Joint College Entrance Examinations) Testees Corpus, and The Taiwanese Learner Corpus. The total is 7.3 million words.

The first two corpora are based on the input of Chinese ESL learners in Mainland China. CLEC (Chinese Learner English Corpus) is a large-scale ESL learner corpus compiled by professors Gui and Yang. Comprising about 1 million words produced by high school and college students, it is frequently adopted for research purposes for its balanced tagging labels of 61 types of errors, including up to 1288 tokens of Verb-Noun miscollocations ready for analysis (Zhou 2005; Li 2005). As to SWECCL (Spoken and Written English Corpus of Chinese Learners), it is a project led by Wen et al., and is so far the largest Chinese ESL learner corpus in Mainland China. With a size of 3.5 million words, the SWECCL corpus possesses both written and spoken data. Since the BNC, the author's native reference corpus for this study, is mainly comprised of written input, only the written data from the SWECCL, a sum of around 2.4 million words, are adopted for further semi-automated extraction and comparison.

(47) The other two learner corpora are made up of the English production from Taiwanese ESL learners. The JCEE (Joint College Entrance Examinations) Testees Corpus, with an approximate total of 2 million words, is constituted with English written data by Taiwanese high school graduates on their college entrance exams. Compiled by the College Entrance Examination Center in Taiwan, the JCEE corpus currently is for research purposes only. With regard to The Taiwanese Learner Corpus, it consists of about 1.8 million words, which were contributed by students from National Taiwan Normal University, National Tsing Hua University, National Taiwan Ocean University, National Taiwan University, National Taichung University, and Soochow University. The students composed online about various topics like technology, politics, education, school life, etc, with the length of three hundred to five hundred words per essay.. After the native and non-native corpora were obtained respectively, they were uploaded onto the SKE. The four Chinese ESL learner corpora were merged into one big corpus first, and then the BNC and the combined Chinese ESL learner corpus were analyzed with the functions on the SKE.. 37.

3.3 Data Extraction

One breakthrough of this study is the alternative use of the Sketch Diff functionality on the SKE to accomplish both data extraction and analysis in a semi-automated fashion. In past research, the discussions and criteria on what Verb-Noun structures to specify and what to filter out often took much of researchers' time and effort. By applying the tagging system and sorting tools on the SKE, the author, based on a list of frequent nouns generated from the Chinese ESL learner corpus, extracts the target Verb collocates from both the BNC and the Chinese ESL learner corpus, and compares them with the Sketch Diff function.

First, if knowledge is selected as an example to compare between the native and non-native corpora, the Sketch Diff provides a summary chart of the collocates of knowledge in distinct part-of-speech positions (Figure 3.4). Then, since our focus is on Verb-Noun miscollocations, the left column with the heading object_of (that is, knowledge used as an Object) is examined. The red area shows the Verb collocates that Chinese ESL learners tend to use with knowledge while native speakers never do, such as enrich (99 times), study (94 times), and master (58 times). The green area shows the Verb collocates that native speakers habitually use with knowledge but non-natives do not.
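The red/green contrast described above can be approximated with a short sketch, assuming the verb collocates of one noun have already been exported from each corpus as verb-to-frequency mappings. The learner frequencies for enrich, study, and master are the ones reported in the text for knowledge; every other entry, as well as the function and variable names, is hypothetical and only illustrates the partition. This is not how the SKE itself is implemented.

```python
def split_collocates(learner_counts, native_counts):
    """Partition verb collocates of one noun into learner-only and native-only sets."""
    learner_only = {
        verb: n for verb, n in learner_counts.items()
        if n > 0 and native_counts.get(verb, 0) == 0
    }
    native_only = {
        verb: n for verb, n in native_counts.items()
        if n > 0 and learner_counts.get(verb, 0) == 0
    }
    return learner_only, native_only


# Verb collocates of "knowledge" in the object_of relation.
# enrich/study/master: learner frequencies reported in the text (absent from the BNC).
# acquire/possess: hypothetical entries added for illustration.
learner = {"enrich": 99, "study": 94, "master": 58, "acquire": 12}
native = {"acquire": 150, "possess": 40}

learner_only, native_only = split_collocates(learner, native)
print("learner only (red area):", learner_only)
print("native only (green area):", native_only)
```

Verbs attested in both corpora, such as the hypothetical acquire here, fall into neither set and would not be flagged for inspection.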

Those extreme examples which native speakers never produce are our target for further inspection.

Figure 3.4 Knowledge Compared between Native & Non-Native Corpora

Next, to probe into what the concordances are and how they are misused, the entry of enrich is chosen, and a list of concordance lines is shown (Figure 3.5). This way, the author could examine the context quickly to decide whether the evidence given by the

Sketch Diff between natives and non-natives provides actual miscollocations or not. The references adopted for this study are introduced in section 3.4, Data Analysis.

Figure 3.5 Concordances of enrich_knowledge in the Chinese ESL Learner Corpus

As for the keywords that were tested with the Sketch Diff interface, they were based on a list of the most frequently used nouns from the Chinese ESL learner corpus, generated online by the SKE. According to the suggestion of Liu (2002), nouns tend to be the main indicators of learners' English Verb-Noun miscollocations. By inspecting the verb collocates of a noun, it is more efficient to capture the V-N misuse

than looking into the noun collocates of a verb. A similar idea is also proposed by Manning and Schütze (1999) with the term "focal word" indicating the crucial role of nouns in V-N collocations.

In this study, the frequency threshold was set at 300. That is, only nouns with no fewer than 300 tokens were incorporated in this study. This requirement eventually narrowed the number down to 690 key nouns to be compared (cf. Appendix A). In addition, only those V-N miscollocations found at least three times in the Chinese ESL learner corpus were counted as significant enough for further comparison and discussion with the native speaker corpora.

In a semi-automated manner, the most frequently used nouns in the Chinese ESL learner corpus and their common verb collocates in the ESL learner corpus and the BNC were checked one by one with the Sketch Diff function. As demonstrated above by the two colored areas (cf. Figure 3.4), certain verbs are used significantly more often by either natives or non-natives. This, ultimately, is the target function on the SKE platform that the author applies in this study, i.e., using the Sketch Diff interface to examine common Verb-Noun collocations in native and non-native corpora in a semi-automated manner.
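The noun selection step described in this section can be sketched as follows, assuming a noun frequency list has been exported from the learner corpus as a mapping from noun to token count. Only the threshold of 300 comes from the text; the function name and the sample frequencies are hypothetical.

```python
def select_key_nouns(noun_frequencies, threshold=300):
    """Return nouns with at least `threshold` tokens, most frequent first."""
    kept = [noun for noun, freq in noun_frequencies.items() if freq >= threshold]
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(kept, key=lambda noun: (-noun_frequencies[noun], noun))


# Hypothetical frequencies, for illustration only.
frequencies = {"knowledge": 5200, "pressure": 1300, "sunrise": 120, "habit": 300}
key_nouns = select_key_nouns(frequencies)
print(key_nouns)
```

Because the criterion is "no fewer than 300", a noun with exactly 300 tokens is kept, which is why the comparison uses `>=` rather than `>`.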

3.4 Data Analysis

The analysis of the results provided by the Sketch Diff described in section 3.3 proceeds in the following steps.

First, the suspicious V-N collocations detected by the Sketch Diff function which native speakers never used (zero tokens found in the BNC) were targeted. Then, only those suspicious V-N collocations found at least three times in the Chinese ESL learner corpus were counted as significant enough for further comparison and discussion with the native speaker corpora.

Second, during the process of examination, based on the red area (Figure 3.4), which indicated those V-N combinations found at least three times in the Chinese ESL learner corpus but none in the BNC, the author double-checked the suspicious examples in the Corpus of Contemporary American English (COCA), another large online corpus, for further confirmation. Since the BNC is basically composed of British English, a parallel check of the suspicious V-N collocations in the COCA, which mostly consists of American English, guards against dismissing combinations that are acceptable in another variety. Once a suspicious V-N collocation had been double-checked in the COCA and no entry was found, the author regarded it as a V-N miscollocation.
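The first two steps above amount to a simple filter, sketched below under the assumption that the three frequency counts for each verb-noun pair are already available in one record. The thresholds (zero tokens in the BNC, at least three in the learner corpus, no entry in the COCA) are taken from the text; the record type, the function name, and all sample counts are hypothetical. The later steps of the analysis, such as consulting dictionaries and native speakers, remain manual and are not modeled here.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    verb: str
    noun: str
    learner_count: int  # tokens in the Chinese ESL learner corpus
    bnc_count: int      # tokens in the BNC
    coca_count: int     # tokens in the COCA


def passes_corpus_checks(candidate, min_learner_count=3):
    """Keep pairs unattested in both native corpora but recurrent among learners."""
    return (
        candidate.bnc_count == 0
        and candidate.learner_count >= min_learner_count
        and candidate.coca_count == 0
    )


# Hypothetical counts, for illustration only.
candidates = [
    Candidate("eat", "medicine", 12, 0, 0),     # kept
    Candidate("learn", "knowledge", 2, 0, 0),   # too rare among learners
    Candidate("gain", "knowledge", 40, 25, 60),  # attested in native corpora
    Candidate("raise", "awareness", 5, 0, 8),   # absent from BNC, found in COCA
]
flagged = [c for c in candidates if passes_corpus_checks(c)]
print([(c.verb, c.noun) for c in flagged])
```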

(53) Third, due to the feature of the Sketch-Diff function, which treats Verb-Noun combinations and Prep-Noun combinations as two separate categories on the SKE platform, the author would not additionally extract possible Verb-Prep-Noun collocations from the Prep-Noun category for this study. All of the results displayed in Chapter IV are originally classified in the Verb-Noun category by the SKE. Even though some Verb-Prep-Nouns would be discussed, they were included because the Sketch-Diff function actually highlighted them as suspiciously wrong V-N collocations (not found in the BNC). After the author looked into them, it was discovered that actually the verbs in the examples were acceptable, but that the prepositions after the verbs were deviant. The author, therefore, still considered them part of the results for general consistency and their original categorization as Verb-Nouns by the SKE online system.. Fourth, in terms of error classification, the possible types would be partially based on Chang and Yang (2009). As reviewed in section 2.1.3, Error Types of ESL Learners' Collocations, there are generally 12 kinds of Verb-Noun miscollocations (cf. Table 3.1).. 43.

Table 3.1 Verb-Noun Types of Chang and Yang (2009)

Error Types                                        Examples
1. Erroneous verb choice                           *learn knowledge
2. Misuse of delexical verbs                       *do recommendations
3. Erroneous use of idioms                         *get touch with them
4. Erroneous noun choice                           *tell a speech
5. Erroneous preposition after verb                *reply letters
6. Erroneous preposition after noun                *give sympathy to animals
7. Erroneous use of determiner                     *play piano
8. Erroneous syntactic structure                   *rang the phone
9. Erroneous choice for intended meaning           *break my armed self
10. Redundant repetition                           *work one job
11. Erroneous combination of two collocations      *enjoy yourself a good time
12. Miscellaneous                                  miscollocations which cannot be categorized

Fifth, if the author could not be sure to which category a V-N error should belong, a native speaker of English as well as other resources would be consulted, such as Just the Word (http://www.just-the-word.com/) and dictionaries like The BBI Dictionary of English Word Combination, Oxford Collocations Dictionary, Oxford Advanced Learner’s Dictionary, and the Collins COBUILD English Dictionary.

Finally, after a basic error categorization is compiled, the author would look into the possible causes of these V-N miscollocations. Here, the study of Liu (1999) would be the basis of discussion (cf. Table 3.2).

Table 3.2 Possible Causes of Verb-Noun Miscollocations by Liu (1999)

Possible Causes                    Examples
1. Overgeneralization              *I am worry about you
2. False analogy                   *ask you a favor
3. Erroneous assumption            *do plans
4. Erroneous use of synonyms       *broaden your eyesight
5. Negative L1 transfer            *eat medicine
6. Erroneous coinage               *see sun-up
7. Approximation                   *release my pressure

Table 3.3 summarizes the general data analysis procedures for this study.

Table 3.3 Data Analysis Procedures for this Study

Types of collocates provided by the Sketch Diff
Inspect the concordances of suspicious V-N miscollocations
    Discard those collocation types found one or more times in the BNC, the native corpus
    Discard those found fewer than 3 times in the Chinese ESL learner corpus
Categorize the V-N miscollocations
    Discard those which could be acceptable V-N collocations
    Double-check with resources
Generate an overview of ESL learners' V-N miscollocations
Discuss the possible causes of ESL learners' V-N miscollocations

CHAPTER IV RESULTS AND DISCUSSION

This chapter presents the general findings, the types of Chinese ESL learners' Verb-Noun miscollocations, the salient miscollocates among these mistakes, and the possible causes leading to the V-N misuse. An overall discussion is provided at the end of each analysis. By applying the Corpus Creating function of The Sketch Engine website, several Chinese ESL learner corpora (cf. Chapter III) were uploaded onto the platform and combined into one large corpus. With a size of 7,376,712 tokens, it is one of the biggest Chinese ESL learner corpora for linguistic research so far. The author then adopted the Sketch Diff function on the SKE for the target data extraction. Originally, Sketch Diff was designed for comparing two synonymous words in the same corpus, or the same word between two separate corpora of different genres (e.g. academic and oral). This function is extended in an alternative way for the author's study. That is, instead of examining only the corpora of native speakers, the Sketch Diff function was utilized to compare the uses of the same word in an English native corpus (the BNC) and in an ESL learner corpus (the Chinese ESL learner corpus) (cf. Figure 4.1 and Figure 4.2).

(57) Figure 4.1 Interface of the Sketch-Diff function

Figure 4.2 Stress as an example of Sketch-Diff Result between the BNC and Chinese ESL Learners

(58) 4.1 Statistical Data and Miscollocation Types

In this study, the number of suspicious types (those occurring at least three times in the Chinese ESL learner corpus) is 1284, with 23385 collocations in total. After validation against the COCA (Corpus of Contemporary American English) and many other corpus-based resources, and examination by native speakers, 134 types of Verb-Noun miscollocations were eventually identified, with 2841 tokens overall (cf. Table 4.1).

Table 4.1 Overall Types and Tokens of Verb-Noun Miscollocations

Chinese ESL Learner Corpus    Total Corpus Size    Verb-Noun Miscollocation Types    Verb-Noun Miscollocation Tokens
1. CLEC                       7,376,712            134                               2841
2. SWECCL
3. JCEE
4. Taiwanese Learners

Following a classification adapted from Chang and Yang (2009) and Nesselhauf (2005), these V-N miscollocations fall into three aspects of misuse: simple verb usages, prepositional and phrasal verb usages, and noun usages (cf. Table 4.2). Prepositional and phrasal verb misuses were grouped together because phrasal verb errors were comparatively rare in the results, accounting for only two distinct types. The major criterion for distinguishing verb misuse from noun misuse lies in

(59) the concept that in deviant-verb-based V-N collocations, the nouns were semantically and grammatically correct for the intended meanings, but their verb collocates were not. Deviant-noun-based V-N collocations, on the other hand, contained semantically and grammatically acceptable verbs, but not acceptable accompanying nouns. Both deviant-verb-based and deviant-noun-based V-N collocations were confirmed with the aforementioned corpus-based resources.

Table 4.2 Aspects and Tokens of Verb-Noun Miscollocation Types

Aspects of Misuse                               Verb-Noun Miscollocation Types    Verb-Noun Miscollocation Tokens
Deviant Verb Usages                             63 (47%)                          832 (29%)
Deviant Prepositional or Phrasal Verb Usages    43 (32%)                          1502 (53%)
Deviant Noun Usages                             28 (21%)                          507 (18%)
Total                                           134                               2841

In terms of tokens, the most common Verb-Noun misuse was found in the prepositional and phrasal verb category, with 1502 tokens (53%). Deviant simple verb use, on the other hand, showed the greatest variety, with 63 different types (47%) in total. Erroneous noun use was the least common in both variety and quantity, with 28 types of Verb-Noun miscollocations (21%) and 507 tokens (18%) overall. The incorrect verb usages of the Verb-Noun miscollocations, simple verbs

(60) combined together with prepositional and phrasal verbs, account for 106 discrete types (79%) and 2334 tokens (82%) of all the results obtained through the Sketch-Diff interface on the Sketch Engine platform.

In the following presentations, the column of "suggested verbs" or "suggested nouns" is based on the verb/noun collocates frequently found on the Sketch-Diff interface and the COCA platform, and on the suggestions of a native speaker consultant. These suggestions also form part of the results and discussion in this study.
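The percentages in Table 4.2 and the combined verb figures above can be re-derived from the raw counts. The following Python sketch does this as a simple arithmetic check. The helper name `shares` and the dictionary keys are hypothetical labels introduced here for illustration; only the counts come from Table 4.2.

```python
def shares(counts):
    """Return each category's share of the total, as a rounded percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Counts taken from Table 4.2.
types = {"verb": 63, "prep_or_phrasal_verb": 43, "noun": 28}
tokens = {"verb": 832, "prep_or_phrasal_verb": 1502, "noun": 507}

type_shares = shares(types)
token_shares = shares(tokens)

# Simple verbs combined with prepositional and phrasal verbs.
verb_types = types["verb"] + types["prep_or_phrasal_verb"]
verb_tokens = tokens["verb"] + tokens["prep_or_phrasal_verb"]
verb_type_share = round(100 * verb_types / sum(types.values()))
verb_token_share = round(100 * verb_tokens / sum(tokens.values()))

print(type_shares, token_shares)
print(verb_types, verb_type_share, verb_tokens, verb_token_share)
```

Rounding to whole percentages reproduces the figures reported in the table and in the combined totals for verb-based errors.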

(61) 4.2 Misuse of Simple Verbs

Three major types of Simple-V-N miscollocations can be classified from the results. They are synonymous verb pairs (22 types with 346 tokens), verbs in common expressions (10 types with 109 tokens), and other verb pairs (31 types with 377 tokens) (cf. Table 4.3).

Table 4.3 Types of Simple-Verb-Based V-N Miscollocations

Aspects of Misuse              Verb-Noun Miscollocation Types    Verb-Noun Miscollocation Tokens
Synonymous Verbs               22 (35%)                          346 (42%)
Verbs in Common Expressions    10 (16%)                          109 (13%)
Others                         31 (49%)                          377 (45%)
Total                          63                                832

Synonymous verb pairs are cases in which the deviant verb collocates that Chinese ESL learners used are semantically related to their accurate verb counterparts. Verbs in common expressions, on the other hand, are cases in which the correct verbs and their noun collocates form common or almost fixed expressions. The verbs that ESL learners originally chose for their intended collocations therefore turned out to be grammatically but not pragmatically acceptable. Other verb pairs are those that cannot be directly categorized because of their varied background factors; they will be further

(62) examined in the discussion section.

4.2.1 Synonymous Verb Pairs

Twenty-two different kinds of Simple-V-N miscollocations, with 346 tokens, were found in the synonymous verb pair category. The erroneous verbs in these learner-produced collocations might at first sight appear semantically similar to the suggested correct verbs. Yet evidence from the BNC, COCA, and other resources based on large corpora indicates that they simply do not go with certain nouns (no tokens were found in either the BNC or the COCA), as shown in Table 4.4.

Table 4.4 Simple-Verb-Noun Miscollocation Types: Synonymous Verb Pairs

No.  Incorrect Verb  Suggested Verb(s)                   V-N Miscollocations                   Frequency
1    accept          receive / enter / have              accept higher education               130
2    keep            maintain                            sports are good ways to keep health   60
3    catch           grab / seize                        catch that chance immediately         25
4    train           enhance / increase / develop        train our ability through practice    24
5    take            get / have / earn                   take good grades on exams             18
6    enlarge         broaden / expand / widen            enlarge our horizons                  11
7    relax           relieve / reduce                    relax our stress                      10
8    increase        enhance / cultivate / foster        increase our friendship               7
9    look            watch / see                         looked this TV advertisement          7
10   talk            tell / crack                        talked many jokes                     7
11   finish          fulfill / satisfy / meet / achieve  finish their wish                     6
12   invent          make / develop / work on            invent an invention                   6
13   devote          donate                              devote two million dollars            5
14   forget          ignore                              forget the stress                     5
15   say             speak / talk in                     say good English                      4
16   appreciate      enjoy                               appreciate the comfortable wind       3
17   content         fulfill / satisfy                   content my desire                     3

(63)
18   gain            earn / win                          gained five thousand dollars          3
19   promise         grant                               parents promised their wish           3
20   realize         understand / comprehend             how much they realize the lessons     3
21   realize         understand                          realize the custom of many countries  3
22   talk            tell / reveal                       talk their secrets                    3
     Total                                                                                     346

The suggested verbs listed alongside the incorrect verbs are the verb collocates of the key nouns that are frequently found in the BNC or COCA data. Drawing on the Sketch-Diff function on the SKE, the author first extracted all the verb collocates of a key noun from the BNC and the Chinese ESL learner corpus. Then, examining the verb collocates used by native speakers, the author looked for appropriate verb collocates that would plausibly reflect the intended meanings of the ESL learners' V-N miscollocations. The online platform of the COCA can also generate all the frequent verb collocates of a noun. The author therefore checked the suggestions on the COCA as well, so that a reasonable set of advised verbs could be arranged for this study.

Table 4.4 shows that *accept higher education is the most often misused synonymous-verb-pair collocation type, with 130 tokens overall. The verb collocates that native speakers adopt for the noun education are instead receive, enter, and have. According to the OALD, Oxford Advanced Learner's Dictionary (8th Edition), a definition of accept is "to take willingly something that is offered; to say ‘yes’ to an

(64) offer, invitation, etc." One possible explanation for this misuse is that education usually entails compulsory duties or decisions already made. As a result, one either decides to receive education or decides not to. There is no need for one to express his or her will to "accept education," which sounds like an ideological stance on a certain issue, not the actual classes to be taken at school.

*Keep health, with 60 entries, is the second most frequent V-N miscollocation in the synonymous-verb-pair category. In the OALD, keep basically means "to stay in a particular condition or position; to make somebody/something do this." Though this denotation may seem to ESL learners to fit the word health, the actual usage patterns of keep in the OALD are "to keep somebody/something + adjective," as in "She kept the children amused for hours" (OALD), or "to keep somebody/something (+ adverb/preposition)," as in "He kept his coat on" (OALD). On the other hand, maintain, which is the native suggestion in the BNC, means "to make something continue at the same level, standard, etc." (OALD), and its example is "She maintained a dignified silence." Maintain thus already indicates the continuation of a certain status, without additional words needed to complete its meaning. For the word health, maintain is therefore a better choice to signify one's continued effort to "keep his or her health at an ideal level."

The third most misused synonymous-verb-pair collocation is *catch that chance,
