Section 2.1 introduces two main types of corpora. The first is the general English corpus, which contains English text from a variety of genres. The BNC and COCA are examples of general English corpora. The second is the corpus for specific purposes. A corpus of this type is usually built to meet a specific need, such as pedagogy or research, and contains text from a specialized variety of English, such as academic English or engineering English.

2.1.1 Corpus for General English

A corpus may contain a wide variety of texts from different kinds of sources, such as novels, newspapers, and research articles. The British National Corpus (BNC), for example, is a 100-million-word text corpus composed of written and spoken English from a wide range of sources. More specifically, the corpus contains British English of the late 20th century from many genres. The main merit of the BNC is that it can be viewed as a representative sample of the spoken and written British English of that time. Furthermore, the BNC is a corpus of modern, naturally occurring language in the form of speech and written text (see Wikipedia, 2016). Because it is machine-readable, it can be searched and analyzed by computer, which has helped pave the way for automatic search and processing in corpus linguistics. A main reason why the BNC stands out from other corpora is that its data are open not only to academic research but also to commercial and educational uses.

In addition to the BNC, the Corpus of Contemporary American English (COCA) is also a very large, freely available corpus. Its content, however, is mainly American English. According to Wikipedia (2016), COCA was created by Mark Davies, a professor of corpus linguistics at Brigham Young University. It is probably the most widely used general English corpus. The main strength of COCA is that it contains more than 520 million words of text, divided equally among spoken texts, fiction, popular magazines, newspapers, and academic texts. The spoken section includes transcripts of unscripted conversation from nearly 150 different TV and radio programs. The fiction section contains short stories, plays, and movie scripts. The popular magazine section covers about 100 different magazines in areas such as news, health, home and gardening, women's, financial, religion, and sports. The newspaper section includes ten newspapers from across the US. The academic section draws on nearly 100 different peer-reviewed journals.

Rich as the BNC and COCA are, they still have limitations. Both cover general English usage, and their content ranges so widely that neither can focus on a particular type of text, such as engineering, nursing, or sports English. For example, the phrase point guard, an important and very frequently used phrase in basketball English, does not appear in the BNC. Likewise, the word layup, which is very common in basketball English, appears only once in the BNC. A study of a specific area will therefore have difficulty finding enough information in a general English corpus. A general English corpus certainly provides some coverage of basketball English, but that coverage is not rich enough. For instance, a large amount of text now exists on web pages, and not all of this valuable material can be included in the BNC and COCA. People who want to learn more English vocabulary for such a specific genre may not be able to find all the essential words and phrases of the field in a large general English corpus like the BNC or COCA. Therefore, building a user-oriented corpus for specific purposes is still necessary when there are special needs in English learning and educational research.
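The kind of frequency check described above (how often point guard or layup occurs in a corpus) can be illustrated with a minimal sketch. The two sample texts below are made up for illustration and stand in for a general and a specialized corpus; they are not drawn from the BNC or COCA, and the helper names `tokenize` and `phrase_frequency` are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def phrase_frequency(tokens, phrase):
    """Count occurrences of a word or multi-word phrase in a token list."""
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return 0
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)

# Made-up samples (assumption): tiny stand-ins for real corpora.
general_sample = "The guard stood at the gate. She made a good point about the budget."
basketball_sample = ("The point guard drove for a layup. "
                     "Another layup by the point guard tied the game.")

for name, text in [("general", general_sample), ("basketball", basketball_sample)]:
    tokens = tokenize(text)
    print(name,
          "point guard:", phrase_frequency(tokens, "point guard"),
          "layup:", phrase_frequency(tokens, "layup"))
```

In this toy example, the words point and guard both occur in the general sample, but never adjacently, so the phrase count is zero there. This mirrors the observation that a term central to a specialized field can be absent from a general corpus.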

2.1.2 Corpus for Specific Purpose

To meet the special needs of learning a particular variety of English, many corpora of English for specific purposes have been created. West (1953) built a corpus of 5 million words with the needs of ESL/EFL learners in mind and selected the 2,000 most widely used words from it. The resulting General Service List has contributed a great deal to beginning ESL and EFL learners.

The list could benefit beginning English learners because a person who understands all of the 2,000 words on the list and their related families would make sense of approximately 90 to 95 percent of colloquial speech and 80 to 85 percent of common written texts. The list can also benefit writers who want to compile simple readers for beginning English learners.

Coxhead (2000) also created a corpus of 3.5 million words, whose texts came only from academic sources. From this academic corpus, Coxhead selected 570 word families to form the Academic Word List. The 570 word families account for 10% of the total words in academic text but only 1.4% of the total words in a fiction collection of the same size. This difference in coverage indicates that the list contains predominantly academic words. The Academic Word List has mainly been used by teachers as part of the material for learners pursuing college education.
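Coverage in this sense is the share of running words in a text that belong to a given word list. The sketch below shows the calculation on two made-up sentences with a hypothetical four-word list. It is not Coxhead's procedure or data: it matches exact word forms only, whereas the Academic Word List groups inflected and derived forms into word families.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def coverage(tokens, word_list):
    """Return the proportion of tokens that appear in word_list."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in word_list)
    return hits / len(tokens)

# Hypothetical mini word list (assumption), standing in for a real list.
toy_list = {"analyze", "data", "method", "research"}

# Made-up sentences (assumption), standing in for academic and fiction texts.
academic = "The research method was used to analyze the data"
fiction = "She opened the door and walked into the quiet garden"

print(f"academic coverage: {coverage(tokenize(academic), toy_list):.1%}")
print(f"fiction coverage:  {coverage(tokenize(fiction), toy_list):.1%}")
```

A list that covers a much larger share of one text type than another, as in this toy comparison, is characteristic of that text type, which is the reasoning behind comparing the 10% and 1.4% figures.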

Many studies on the advantages of building a user-oriented corpus for English teaching and learning are also worthy of attention. Charles (2012) conducted a study showing that students practicing academic writing could benefit from a corpus they constructed on their own. The students found it easy to build their academic corpora. Each corpus was relatively small, containing only 15 academic research articles. Nevertheless, most of the students reported that even a small corpus like theirs helped their academic writing considerably. The participants also mentioned that they liked analyzing the corpus and found it a valuable tool for their academic writing.

Wu (2014) designed a study to examine the merit of building an ESP corpus to motivate technical college students to learn English for specific purposes. First, the study explored corpus-building skills. Second, it sought ways to simplify ESP courses through corpus building. Third, it evaluated ESP courses.

According to the study, ESP courses have been developed and offered in Taiwan's technical colleges for a long time. Nevertheless, many students and teachers are not satisfied with the academic achievement and outcomes. Although ESP corpus building is usually used with advanced English learners, the English proficiency of the technical college students targeted in this study was generally low. Teaching an ESP course to those students can therefore be full of difficulties. After giving learners tutorials on how to build an ESP corpus, the researcher found that building and making good use of a specialized corpus based on learners' needs can benefit classroom learners considerably. Teachers can build their own corpora to help learners select target words suited to their English proficiency. In this way, learners can overcome difficulties in acquiring ESP vocabulary.

Given that an ESP corpus can be useful in many ways, it is suggested that ESP corpora be used in language teaching, learning, and material design.

Difficult as building a corpus may sound, it is feasible with the aid of modern technology. Even students can build and analyze their own corpora to support their studies. The next section discusses the combination of corpora and ESP lexical item lists.