Models of Reading


CHAPTER TWO LITERATURE REVIEW

This chapter gives a general review of the literature on the rationale of reading, the study of reading comprehension tests, the revision of Bloom’s Taxonomy, and the application of Bloom’s Taxonomy to assessment and testing. The chapter has four sections. Section one discusses models of reading, reading skills, and the relationship between the classification of skills and Bloom’s Taxonomy. Section two reviews studies on reading comprehension tests in EFL/ESL contexts and studies analyzing reading comprehension items on the Scholastic Achievement English Test (SAET) and the Department Required English Test (DRET) in Taiwan. Section three describes the Revised Bloom’s Taxonomy, and the last section discusses studies related to the operationalization of Bloom’s Taxonomy.

The Rationale of Reading

In the literature on the nature of reading, comprehension is often assumed to be the goal of the reading process or the product of reading. Although ideal comprehension, i.e., recapturing the author’s original meaning, can never be fully achieved, a reader reaches a certain degree of understanding in different reading situations, depending on the reader’s background knowledge, interests, goals, language abilities, etc. (Urquhart & Weir, 1998). To comprehend a text, that is, to understand textual information and interpret it appropriately, a reader engages actively in a mental process. From a cognitive perspective, the reading process, as Bernhardt (1991) specified, is “an intrapersonal problem-solving task that takes place within the brain’s knowledge structure” (p. 6). Metaphorically, three models describe the reading process: the bottom-up model, the top-down model, and the interactive model.

Models of Reading

The bottom-up (or data-driven) model is a sequential model in which the reader begins with the printed letters, decodes them to sounds, recognizes them as words and decodes their meanings, and then proceeds in the same manner to the sentence level (Gough, 1972). It emphasizes the recognition and decoding of the text as input and is typically known as lower-level processing, which represents an automatic linguistic process. Criticisms of this model in both L1 and L2 research and pedagogy in the seventies were influenced primarily by models proposed by Goodman (1967) and Smith (1971), which emphasize the contribution of readers (or the knowledge they bring to the text) to their reading comprehension.

Goodman’s (1967) model is a top-down (or reader-driven) model in which the reader samples the text, makes predictions, confirms his or her guesses, forms new hypotheses, and then samples further (i.e., a cyclical hypothesis-verification process). Reading in this model has been called “a psycholinguistic guessing game.” This approach has had a great impact on ESL reading theory and practice. Coady (1979), elaborating on this psycholinguistic model, suggested that reading comprehension involved EFL/ESL readers’ background knowledge, conceptual ability (i.e., intellectual capacity), and process strategies (i.e., various subcomponents of reading ability). He claimed that background knowledge might be able to compensate for certain syntactic deficiencies. Related to this view, a number of studies have been conducted on the effects of schema (according to schema theory, an abstract structure representing concepts stored in the reader’s memory) on both first language comprehension (e.g., Anderson, Reynolds, Schallert & Goetz, 1977; Bransford & Johnson, 1972; Freebody & Anderson, 1983; Mandler & Johnson, 1977) and second/foreign language comprehension (e.g., Carrell, 1983; Hudson, 1982; Johnson, 1981, 1982). Carrell (1983) proposed two essential kinds of schemata for L2 reading: formal schema, which refers to knowledge of language and linguistic conventions, and content schema, which refers to knowledge of the world or of the subject matter of the text.

Currently, the preferred model in both L1 and L2 is the interactive model, which provides a synthesized view of both bottom-up and top-down reading processes. Rumelhart (1977) pointed out that several sources of knowledge (orthographic, lexical, syntactic, semantic, etc.) provided input simultaneously and interactively in the reading process and ultimately influenced our interpretation of the text. Stanovich (1980), in proposing the interactive-compensatory model, stated that the reader may try to compensate for deficiencies at one level (e.g., word recognition) by depending more on another level, either higher or lower (e.g., contextual knowledge). This interactive view recognizes both the reader’s role, that is, the knowledge he or she brings to the text, and the role of the written text in reading comprehension.

Grabe (1991) stated that reading was a complex cognitive process and that “a description of reading has to account for the notions that fluent reading is rapid, purposeful, interactive, comprehending, flexible, and gradually developing” (p. 378). By interactive process, he explained that reading involved (1) the interaction between the reader and the text, suggesting that readers use their background knowledge as well as the information from the text to aid comprehension; and (2) the simultaneous interaction of various skills (e.g., decoding skills and comprehending skills work at the same time). Carrell and Eisterhold (1983) suggested that bottom-up processing ensured that readers noticed information that was new or that did not fit their predictions about the content or structure of the text, whereas top-down processing helped readers to resolve ambiguities or to select among possible interpretations of the incoming information.

Carrell (1988) noted two types of skill deficiencies that might influence the efficiency of the interaction between text-based and knowledge-based processing in ESL reading: (1) linguistic deficiencies and (2) reading skill deficiencies. On the one hand, linguistic competence, including skills in decoding the vocabulary and syntactic structure in text-based processing, has been extensively discussed and viewed as the fundamental aspect of successful English reading (Eskey, 1988). On the other hand, Carrell (1988), in reviewing Spiro’s (1978) research, explained that reading skill deficiencies, i.e., lacking either text-based skills (e.g., decoding) or knowledge-based skills (e.g., pragmatic inference), may affect reading comprehension. For example, readers who prefer one particular reading skill such as decoding might use only this skill to solve all problems encountered in reading, or might escape the problem by shifting to another skill such as making inferences. Problems occur when the decoder receives too much discrete information without higher-level understanding, or when the reader relies too heavily on his or her assumptions about the text, which causes misinterpretation. Thus, unlike less skilled readers, who are likely to over-rely on one mode of processing most of the time, efficient readers shift from one process to another frequently in order to accommodate a particular text or reading situation.

In simple terms, reading is an interactive process in which the reader dynamically constructs the meaning of the text, drawing on various kinds of knowledge and skills, such as word decoding skills (through bottom-up processing) and higher-level mental operations (through top-down processing). Efficient reading requires the interaction of these two processes.

Reading Skills

Much of what has been described as the components of models can be translated into reading skills, for example, decoding, accessing the lexicon, and making inferences. “A reading skill can be described roughly as a cognitive ability which a person is able to use when interacting with written text” (Urquhart & Weir, 1998, p. 88). Researchers, teachers, and test writers who believe in the multi-dimensional nature of reading comprehension assume that reading skills can be identified, researched, taught and tested (e.g., Chapelle et al., 1997; Dubin et al., 1986; Grabe, 1991; Weir, 1997; Williams & Moran, 1989). Many different lists, taxonomies and hierarchies of skills have been developed, yet little consensus on the content or terminology of those taxonomies can be found (Williams & Moran, 1989). Skills can be identified as the linguistic elements of the text (Munby, 1978), different knowledge areas (Grabe, 1991), different levels of textual understanding (Gray, 1960), or even a hierarchy of skills and sub-skills, specifically related to Benjamin Bloom’s “Taxonomy of Educational Objectives: Cognitive Domain” (Adams-Smith, 1981).

This view has been criticized by Alderson (2000), who views reading as a unitary entity. He claims, first, that there is a lack of empirical evidence for identifiable skills in reading; second, that skills are more or less ill defined and overlapping, and that the concept of reading skills misleads audiences into thinking that reading can be divided into discrete components; third, that judges are unable to agree on which item tests which skill; and finally, that analyses of test results do not show each skill separately. Nonetheless, the case for the unitary view of reading comprehension is far from conclusive proof that reading skills do not exist.

Since reading can be considered as a cognitive activity that takes place in the reader’s mind, it can also represent a problem-solving process in which the reader applies skills or strategies to resolve difficulties in reading. The present study, taking a cognitive-processing-oriented viewpoint (i.e., reading involves different levels of cognitive abilities), attempts to employ the Revised Bloom’s Taxonomy of Educational Objectives (Anderson & Krathwohl, 2001) as the coding scheme to explore what cognitive skills were measured in the reading comprehension test items on the Scholastic Achievement English Test (SAET) and the Department Required English Test (DRET) in Taiwan.

Reading Skills and Bloom’s Taxonomy

Studies related to the use of Bloom’s Taxonomy in categorizing reading skills reveal that the taxonomy is adequate for evaluating reading comprehension objectives and students’ reading abilities, and for designing classroom questions, materials, and test items (Adams-Smith, 1981; Beatty, 1975; Costin, 1986). Beatty (1975) believed that Bloom’s Taxonomy (Bloom, 1956) solved the problems of selecting reading comprehension objectives, sequencing those objectives, and reconciling the different definitions of a reading skill. He argued that a skill such as finding the main idea, as defined by different authors, could involve either lower-level processing (when the main idea is in the topic sentence, which can be easily identified) or higher-level processing (when the main idea is only implied), and that Bloom’s Taxonomy could describe these differences while classifying skills. Beatty adapted the categories of the Bloom’s Taxonomy published in 1956 (which comprises six categories: Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation). He first renamed the Knowledge category Recall, with the subcategories recall of specifics, recall of conventions, and recall of trends, which he thought was more representative of the process of comprehending than of what was comprehended. Beatty also excluded some categories that he considered irrelevant to reading and added three categories, Translation, Apprehension, and Extrapolation, in place of the Comprehension category. Beatty further demonstrated how specific comprehension skills fit into Bloom’s classification. Skills such as finding facts (recall of specifics), finding the main idea stated explicitly in the topic sentence (recall of conventions), and retaining concepts (recall of trends) could be categorized as Recall. The Translation category included skills like interpreting figurative language; the Apprehension category involved identifying the theme, finding a main idea not stated in the topic sentence, or writing a summary; and skills such as inferring, drawing conclusions, and predicting outcomes went into the Extrapolation category. The Application level comprised skills that required learners to apply an idea from a selection to a different situation. The Analysis category included skills like recognizing assumptions and distinguishing relevant from irrelevant information (analysis of elements), showing how details relate to the main idea (analysis of relationships), and identifying the author’s purpose or point of view (analysis of organizational principles). Reading skills such as making comparisons and contrasts, making analogies, or writing a synthesis of a passage belonged to the Synthesis category, whereas the skills of judging relevancy or significance, forming one’s own opinion, or using evidence to support an opinion were at the Evaluation level. Beatty suggested that reading teachers refer to the scheme and use it to teach comprehension skills.

In more recent research, Surjosuseno and Watts (1999) discussed how EFL reading teachers could use the cognitive process domain of Bloom’s Taxonomy to promote learners’ critical reading abilities. By comparing the categories in the taxonomy with various critical reading skills (proposed by Paul, 1993; Flynn, 1989; Cheek, Flippo and Lindsey, 1989; Hickey, 1988; and Rubin, 1982) and instructional strategies for critical reading (proposed by Singh, Chirgwin and Elliott, 1997; and Karlin, 1980), they found that even though a variety of names and definitions were used to describe critical reading abilities, abilities such as analysis, synthesis, and evaluation, which required higher-order thinking, recurred across studies and fit into Bloom’s classification. Surjosuseno and Watts maintained that all six levels of Bloom’s Taxonomy, with slight modification, could be useful as a planning tool for teaching critical reading in EFL classes.

Lately, interest in the development of critical reading skills or abilities has drawn attention to the study of reading and thinking. Advocates of critical reading suggest that reading critically involves readers actively interacting with the written text. Abdulah (1994, cited in Alderson, 2000), for example, indicated that critical reading sub-skills included evaluating deductive and inductive inferences, recognizing hidden assumptions or the author’s motives, and evaluating the strength of arguments. These critical reading abilities are major goals of the SAET and DRET test designs and are essential abilities for a college student (objectives extracted from the CEEC website).

Adams-Smith (1981), who adapted Bloom’s Taxonomy in designing questions at each level for ESP (English for Specific Purposes) classes, stated that college students needed skills like problem solving, deductive thinking, and evaluation, and that Bloom’s Taxonomy can be a useful scheme for English language teachers to develop materials to help students learn to think. Therefore, Bloom’s Taxonomy is selected as the framework for test item analysis in the present study.

Testing Reading in the EFL/ESL Context

Testing reading in EFL/ESL contexts focused on test methods in the early 1980s, and the concern then shifted from how to test to what to test in the late 1980s (Weir, 1997). Among the various test formats, multiple-choice questions (henceforth MCQs) have been widely used in testing reading comprehension, although some researchers have commented unfavorably on their use. Using MCQs as a test technique allows more items to be tested in a given period of time, makes scoring rapid and reliable, and tests receptive skills without asking examinees to produce written language. A major problem is that providing possible answers might affect the test results. It is also hard to know whether failure on a given question is due to a lack of comprehension of the text or of the question (Weir, 1990). The strongest criticism of multiple-choice reading comprehension tests concerns passage independence (Bernhardt, 1991; Teale & Rowley, 1984). Teale and Rowley (1984) questioned the validity of test items that examinees could answer without referring to the reading text. Additionally, training in test-taking techniques can improve students’ scores rather than their language ability, and test-taking strategies may direct students to the surface features of the text rather than to its deeper meaning (Nevo, 1989; Weir, 1990). Irrespective of these comments, the questions of what MCQs actually measure and of whether they are valid measures have become the major focus of debate on MCQ reading tests (Cummings, 1982; Farr, Pritchard & Smitten, 1990; Hughes, 2003; Urquhart & Weir, 1998; Weir, 1997). Farr et al. (1990) stated that “the types of questions that follow a reading selection will determine if the reading selection focuses on only the surface meaning of the text or on other—perhaps deeper—comprehension,” and that developing questions tapping important elements of the text “enhance the validity of the test” (p. 224).

Validity is one of the major considerations in language test design. It is “the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure” (Henning, 1987, p. 89). McNamara (2000) explained that “the purpose of validation in language testing is to ensure the defensibility and fairness of interpretations based on test performance” (p. 48). A test is valid when it measures what it is supposed to measure. Among the several kinds of validity, construct validity and content validity play crucial roles in determining test validity (Alderson et al., 1995; Bachman, 1990; Bachman & Palmer, 1996; Davies, 1990; Henning, 1987; Hughes, 1989; McNamara, 2000; Weir, 1990).

Construct validity refers to the degree to which an instrument measures an intended hypothetical construct, that is, how well ideas or theories have been translated into actual measurement. The word construct, as defined by Ebel and Frisbie (1991), is “a psychological construct, a theoretical conceptualization about an aspect of human behavior that cannot be measured or observed directly.” Hughes (1989) defined a construct in language testing as “any underlying ability (or trait) that is hypothesized in a theory of language ability” (p. 31). Hence, the construct of reading is based on models of reading, which may translate into various skills and factors that affect reading.


In Weir’s (1997) four-level version of reading comprehension for testing purposes (i.e., reading expeditiously for global comprehension, reading expeditiously for local comprehension, reading carefully for global comprehension, and reading carefully for local comprehension), skills such as locating or identifying a specific phrase are micro-skills operating at the local level, whilst finding the main theme and making inferences are seen as macro-skills operating at the global level. Both micro-skills like word recognition and macro-skills like inferring or evaluating need to be tested.

As mentioned in the previous section, researchers who hold that reading is multi-dimensional have attempted to design questions that test the underlying skills and sub-skills required at different levels of understanding of the text. Nonetheless, researchers who reject the divisibility of reading have challenged the accountability of test items intended to assess different reading abilities, claiming that it is not possible to determine which component of reading ability is tested by which test item, either through empirical demonstration or through the judgment of experts (Alderson, 1990; Alderson & Lukmani, 1989; Carver, 1992; Lunzer et al., 1979; Rosenshine, 1980; Rost, 1993). Although previous studies claim that it is hard to demonstrate that various reading skills exist, it is equally hard to demonstrate that they do not. And if there were no distinguishable components in reading, it would not really matter how we tested it or what we tried to test.

Reading Comprehension Test Item Analysis

Content validity is one of the forms of evidence for construct validity. It concerns “whether or not the content of the test is sufficiently representative and comprehensive for the test to be a valid measure of what it is supposed to measure” (Henning, 1987, p. 94). If we are going to assess a student’s reading ability, a test with content validity will contain items that measure, for example, a variety of reading (sub-)skills, which are parts of the construct of reading ability. Thus, whether a test of reading comprehension has content validity has been a main concern for researchers and teachers who want to ascertain what is actually tested by each item.

Research on English reading comprehension tests, specifically on standardized tests, has received great interest in the EFL setting in Taiwan. Testing plays a vital role in English learning and teaching, especially in secondary education, because test results usually have consequences for students’ futures: they determine whether students can continue into higher education and which school they can enter. Research reports on test content analysis and statistical analysis (e.g., the number and distribution of items, length of text, vocabulary, topics, discriminatory power, and examinees’ test performances) of both the SAET and the DRET subject tests have been published annually by the College Entrance Examination Center (CEEC Web site; Huang, 1994; Jeng, 1992; Jiang & Lin, 1999; Xu & Lu, 1998). Huang (1994) presented a qualitative analysis of the Joint College Entrance Examination (henceforth JCEE, renamed as DRT in 2002) English test items from 1985 to 1994. The results of the reading comprehension item analysis showed that over 90% of the items were well written, yet a few were not: some items were found to test examinees’ vocabulary and grammatical knowledge rather than their reading abilities. He indicated that items on vocabulary and grammar that could be answered without referring back to the text should be excluded and that items involving arithmetic should be designed with caution. Xu and Lu (1998) examined the topics, text length, syntactic complexity, vocabulary, distractors, and question types of the 1998 JCEE English test. They reported that reading comprehension test items could usually be classified into four types: vocabulary, main idea, detail, and inference questions. Yet they did not identify those elements item by item or examine the frequency and distribution of the different item types. These studies give an overview of overall test construction or of statistical results rather than a thorough report on the reading skills tested by each comprehension item.

Recently, two studies (Lu, 2002; Hsu, 2005) used Mo’s (1987) classification to analyze the reading comprehension test items on the SAET and the DRET in Taiwan. Mo (1987) proposed that a reading test should include questions requiring test takers to clarify the organization of the text and questions of textual comprehension. He excluded skills such as reading speed, habit, and pleasure, which were unrelated to the text structure, and then classified reading skills into six main categories: (1) identifying the main idea, (2) finding specific details mentioned in the passage, (3) finding implications and drawing inferences and conclusions from the text, (4) recognizing style and tone, (5) determining the special techniques used by the author to achieve his effect¹, and (6) determining the meaning of strange words or phrases as used in the test.

¹ This question type refers to items that test examinees’ ability to recognize the organization of the passage or the writing techniques used by the author. Techniques are those used to develop paragraphs, such as the use of details, examples, cause and effect, comparison, definition and so on. Example questions are: in what way does the writer present the passage, the author supports his argument by…, etc.

Lu (2002) conducted both qualitative and quantitative analyses of the reading comprehension test items on the SAETs from 1995 to 2002. Qualitatively, she classified items into Mo’s six question types and examined textual materials, examinees’ passing rates on each question type, test variables that affected those passing rates, the discrimination index, etc.; quantitatively, she computed the frequency distribution of question types and the correlation between question types and passing rates. Results indicated that the most common question type was items on details, followed by items on inference, main idea, style/tone, organization, and word meaning. Generally, the examinees performed best on word meaning items, followed by main idea, detail and inference items, whilst they performed worst on the style/tone and organization items. High achievers performed better on detail items, followed by inference items, while low achievers failed on these two types of questions, probably owing to their lack of linguistic knowledge (for they failed to answer detail items even when the information was clearly stated) and of summarizing and synthesizing abilities. Comprehension difficulties occurred when questions required readers to use higher-order reading processes, such as synthesizing the numerous details needed to reach the correct answer. Also, lengthy articles dealing with unfamiliar topics inhibited understanding.
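The quantitative step described above, tallying question types and relating them to passing rates, can be sketched in Python. The following is a minimal illustration only: the item codings and passing rates are hypothetical values invented for the example, not data from Lu (2002), and the function names are assumptions. It reports a mean passing rate per question type, which is a simpler descriptive summary than the correlation analysis reported in that study.

```python
from collections import Counter
from statistics import mean

# Hypothetical coded items: (question type, passing rate).
# Illustrative values only; they are NOT taken from Lu (2002).
items = [
    ("detail", 0.62),
    ("detail", 0.58),
    ("inference", 0.47),
    ("main idea", 0.66),
    ("word meaning", 0.71),
    ("detail", 0.60),
    ("inference", 0.43),
]


def frequency_distribution(coded_items):
    """Return {question type: (count, proportion)}, most frequent type first."""
    counts = Counter(qtype for qtype, _ in coded_items)
    total = sum(counts.values())
    return {qtype: (n, n / total) for qtype, n in counts.most_common()}


def mean_passing_rate(coded_items):
    """Return {question type: mean passing rate of the items coded as that type}."""
    rates = {}
    for qtype, rate in coded_items:
        rates.setdefault(qtype, []).append(rate)
    return {qtype: mean(values) for qtype, values in rates.items()}


distribution = frequency_distribution(items)
rates = mean_passing_rate(items)
for qtype, (count, share) in distribution.items():
    print(f"{qtype}: {count} items ({share:.0%}), mean passing rate {rates[qtype]:.2f}")
```

With real data, each tuple would hold one test item's assigned question type and the proportion of examinees who answered it correctly.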

Hsu (2005), applying the same coding scheme, analyzed reading comprehension test items taken from the 2001 JCEE (renamed as DRT in 2002) English test and the 2002 to 2004 DRETs. The themes of the texts and the text variables that accounted for item difficulty were also investigated. Unlike Lu, she examined the vocabulary of the chosen texts by comparing it with the Word List published by the DRT, and, instead of directly computing the passing rates of the examinees who took those tests, she examined the performance on each question type of 76 Grade II students (divided into high-, middle-, and low-proficiency groups) from two different classes in a high school in Kaohsiung City. Eighteen passages from the 2000-2004 DRET tests were given to those students over a ten-week period. As in Lu’s (2002) study, items on details were found to be the most frequent. Likewise, examinees performed well on items that tested micro-skills like determining the meaning of words and finding specific details, whereas they performed worst on questions of drawing inferences, which required higher-level processing.

The present study, unlike previous studies of test item analysis that used Mo’s classification of reading skills or other methods, attempts to employ the Revised Bloom’s Taxonomy, which presents a hierarchy of cognitive processes (skills) (i.e., skills are arranged from simple to complex, and from lower levels to higher levels), to analyze, qualitatively and quantitatively, the reading comprehension multiple-choice questions on both the SAETs and the DRETs in Taiwan from 2002 to 2006.

Revision of Bloom’s Taxonomy

The Taxonomy of Educational Objectives: Cognitive Domain (Bloom et al., 1956), typically referred to as Bloom’s Taxonomy, was published by Bloom and his associates in 1956 and has been used in various ways in education. In 2001, Anderson and Krathwohl proposed a revision of the original Taxonomy. The revised Taxonomy attempts to help teachers and educators in at least four ways: (a) to analyze the objectives of a unit, syllabus, or curriculum, (b) to improve instruction (Anderson, 2002; Hoff, 2001; James, 2002), (c) to construct or validate assessment tools, and (d) to align curriculum (Anderson, 2002). The following sections discuss the original framework, the revised Taxonomy, the differences between the two, and studies related to the application of the Taxonomy.

The Original Bloom’s Taxonomy

The original classification includes six major categories in the Cognitive Domain: Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation (see Table 1 for the structure of the original Taxonomy). Except for the Application category, each of these categories contains subcategories.

The categories above the Knowledge level were labeled as abilities and skills. The structure of the original taxonomy was assumed to be a cumulative hierarchy; that is to say, the categories were arranged from simple to complex and from concrete to abstract, and mastery of each simpler category was prerequisite to mastery of the next more complex one (Krathwohl, 2002; Krietzer et al., 1994). One of the most frequent uses of the taxonomy has been to examine the adequacy of curricular objectives and test items.

Table 1. Structure of the Original Taxonomy

1.0 Knowledge
    1.10 Knowledge of specifics
        1.11 Knowledge of terminology
        1.12 Knowledge of specific facts
    1.20 Knowledge of ways and means of dealing with specifics
        1.21 Knowledge of conventions
        1.22 Knowledge of trends and sequences
        1.23 Knowledge of classifications and categories
        1.24 Knowledge of criteria
        1.25 Knowledge of methodology
    1.30 Knowledge of universals and abstractions in a field
        1.31 Knowledge of principles and generalization
        1.32 Knowledge of theories and structures
2.0 Comprehension
    2.1 Translation
    2.2 Interpretation
    2.3 Extrapolation
3.0 Application
4.0 Analysis
    4.1 Analysis of elements
    4.2 Analysis of relationships
    4.3 Analysis of organizational principles
5.0 Synthesis
    5.1 Production of a unique communication
    5.2 Production of a plan, or proposed set of operations
    5.3 Derivation of a set of abstract relations
6.0 Evaluation
    6.1 Evaluation in terms of internal evidence
    6.2 Judgments in terms of external criteria

Note. Adopted from A revision of Bloom’s taxonomy: An overview, by D.R. Krathwohl, 2002, Theory Into Practice, 41(4), 213.

Krathwohl (2002) indicates that Bloom saw the original Taxonomy as more than a measurement tool, as it could serve as a

‧ common language about learning goals to facilitate communication across persons, subject matter, and grade levels;
‧ basis for determining for a particular course or curriculum the specific meaning of broad educational goals, such as those found in the currently prevalent national, state, and local standards;
‧ means for determining the congruence of educational objectives, activities, and assessments in a unit, course, or curriculum; and
‧ panorama of the range of educational possibilities against which the limited breadth and depth of any particular educational course or curriculum could be contrasted.

(Krathwohl, 2002, p. 212)

However, several weaknesses and practical limitations can be found in the original Taxonomy. One of the major problems is its cumulative hierarchical structure (Furst, 1994). For example, some demands of the Knowledge categories are more complex than certain demands of the Analysis or Evaluation levels, and Evaluation is not more complex than Synthesis (which involves evaluation) (Krietzer et al., 1994).

The Revised Bloom’s Taxonomy

A focus on meaningful learning, a constructivist viewpoint, is recognized as a crucial educational goal these days. In constructivist learning, students engage in active cognitive processing, mentally constructing the meaning of the information they select by integrating it with their existing knowledge (Mayer, 2002). Educators thus need to emphasize what learners know (knowledge) and how they think (cognitive process). The former (i.e., the knowledge learners have acquired) helps teachers know what to teach, while the latter (i.e., learners’ cognitive processes) tells teachers how to help learners retain and then transfer the knowledge they have learned. Additionally, instructional objectives are usually formulated in a verb-noun form, which typically consists of a noun or noun phrase (the subject matter content) and a verb or verb phrase (the cognitive process). In the original Taxonomy, by contrast, the Knowledge category embodied both noun and verb aspects; this made the Knowledge category dual-dimensional, unlike the other five categories, which were unidimensional, and thus brought confusion to the structure. Consequently, based upon the beliefs above, in the revised version of Bloom’s Taxonomy, Anderson and Krathwohl (2001) divided the framework into two dimensions: the knowledge dimension and the cognitive process dimension.

The Knowledge Dimension

The knowledge categories of the revised Taxonomy, like those of the original Taxonomy, cut across subject matter lines. The knowledge dimension contains four major types of knowledge, each with its own subtypes: factual knowledge, conceptual knowledge, procedural knowledge, and metacognitive knowledge (a category that did not exist in the original).

Factual Knowledge is knowledge of discrete, isolated content elements, the basic elements of a discipline. It contains (1) knowledge of terminology, including knowledge of specific verbal and nonverbal labels and symbols (e.g., knowledge of the alphabet, vocabulary, and terms), and (2) knowledge of specific details and elements (e.g., knowledge of events, locations, people, or sources of information). In contrast, Conceptual Knowledge refers to knowing how basic elements are interrelated within a larger structure and how these parts function together (e.g., schemas, mental models, and theories). It includes (1) knowledge of classifications and categories (e.g., knowledge of the parts of sentences or the types of literature), (2) knowledge of principles and generalizations (e.g., knowledge of major generalizations about particular cultures or the principles involved in learning), and (3) knowledge of theories, models, and structures.

The third category is Procedural Knowledge, which refers to knowledge of how to do something, whether a routine task or a new one. Procedural knowledge includes knowledge of skills, algorithms, techniques, and methods (which are more subject specific), collectively known as procedures, that is, series of steps to follow. It also includes knowledge of the criteria used to decide when to use a certain procedure. The subtypes in this category are (1) knowledge of subject-specific skills and algorithms (e.g., knowledge of skills for spelling words in English); (2) knowledge of subject-specific techniques and methods (e.g., knowledge of research methods relevant to the social sciences); and (3) knowledge of criteria for determining when to use appropriate procedures (e.g., knowledge of the criteria for determining which method to use in solving algebraic equations).

The fourth category, Metacognitive Knowledge, is new to the revised Taxonomy. It involves knowledge about cognition in general as well as awareness and knowledge of one’s own cognition. Metacognitive Knowledge contains three subcategories: (1) strategic knowledge, which covers both what strategies to use and how to use them (i.e., knowledge of general strategies for learning, thinking, and problem solving, such as knowing that rehearsing information is a learning strategy for retaining it); (2) knowledge about cognitive tasks, including contextual and conditional knowledge (i.e., knowing when and why to use these strategies under different circumstances, for example, knowing that strategies like summarizing and paraphrasing can result in deeper levels of comprehension); and (3) self-knowledge (Flavell, 1979), which includes knowledge of one’s strengths and weaknesses in relation to cognition and learning, awareness of the different types of strategies to use in different situations, and motivational beliefs (Pintrich & Schunk, 1996). This knowledge is unique to each person; hence it is difficult to measure metacognitive knowledge with paper-and-pencil tests. It may be best assessed in the context of classroom activity or strategy instruction. Table 2 presents the structure of the knowledge dimension.

Table 2. Structure of the Knowledge Dimension of the Revised Taxonomy

A. Factual Knowledge: The basic elements that students must know to be acquainted with a discipline or solve problems in it.
   Aa. Knowledge of terminology
   Ab. Knowledge of specific details and elements

B. Conceptual Knowledge: The interrelationships among the basic elements within a larger structure that enable them to function together.
   Ba. Knowledge of classifications and categories
   Bb. Knowledge of principles and generalizations
   Bc. Knowledge of theories, models, and structures

C. Procedural Knowledge: How to do something; methods of inquiry, and criteria for using skills, algorithms, techniques, and methods.
   Ca. Knowledge of subject-specific skills and algorithms
   Cb. Knowledge of subject-specific techniques and methods
   Cc. Knowledge of criteria for determining when to use appropriate procedures

D. Metacognitive Knowledge: Knowledge of cognition in general as well as awareness and knowledge of one’s own cognition.
   Da. Strategic knowledge
   Db. Knowledge about cognitive tasks, including appropriate contextual and conditional knowledge
   Dc. Self-knowledge

Note. Adopted from A revision of Bloom’s taxonomy: An overview, by D. R. Krathwohl, 2002, Theory Into Practice, 41(4), 214.

The Cognitive Process Dimension

Two of the most important educational goals are to promote retention and transfer. The revised Taxonomy accordingly encompasses six cognitive process categories: one relates closely to retention (i.e., Remember), and the other five relate increasingly to transfer (i.e., Understand, Apply, Analyze, Evaluate, and Create). Each of these main categories includes subcategories. The structure of the cognitive process dimension is presented in Table 3.

Table 3. Structure of the Cognitive Process Dimension of the Revised Taxonomy

1.0 Remember: Retrieving relevant knowledge from long-term memory.
    1.1 Recognizing
    1.2 Recalling

2.0 Understand: Determining the meaning of instructional messages, including oral, written, and graphic communication.
    2.1 Interpreting
    2.2 Exemplifying
    2.3 Classifying
    2.4 Summarizing
    2.5 Inferring
    2.6 Comparing
    2.7 Explaining

3.0 Apply: Carrying out or using a procedure in a given situation.
    3.1 Executing
    3.2 Implementing

4.0 Analyze: Breaking material into its constituent parts and detecting how the parts relate to one another and to an overall structure or purpose.
    4.1 Differentiating
    4.2 Organizing
    4.3 Attributing

5.0 Evaluate: Making judgments based on criteria and standards.
    5.1 Checking
    5.2 Critiquing

6.0 Create: Putting elements together to form a novel, coherent whole or make an original product.
    6.1 Generating
    6.2 Planning
    6.3 Producing

Note. Adopted from A revision of Bloom’s taxonomy: An overview, by D. R. Krathwohl, 2002, Theory Into Practice, 41(4), 215.
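For readers who wish to work with Table 3 computationally (for instance, when coding test items), the structure can be written down as an ordered mapping from each main category to its subcategories. The following is only a minimal sketch: the names COGNITIVE_PROCESSES, educational_goal, and code_for are hypothetical labels chosen for this illustration and do not come from Krathwohl (2002).

```python
# Illustrative sketch of Table 3. All identifiers are assumptions made for
# this example; only the category and subcategory labels come from the table.
COGNITIVE_PROCESSES = {
    "Remember": ["Recognizing", "Recalling"],
    "Understand": ["Interpreting", "Exemplifying", "Classifying",
                   "Summarizing", "Inferring", "Comparing", "Explaining"],
    "Apply": ["Executing", "Implementing"],
    "Analyze": ["Differentiating", "Organizing", "Attributing"],
    "Evaluate": ["Checking", "Critiquing"],
    "Create": ["Generating", "Planning", "Producing"],
}


def educational_goal(category: str) -> str:
    """Return the goal the text links to a category: Remember relates to
    retention, the other five categories relate to transfer."""
    if category not in COGNITIVE_PROCESSES:
        raise ValueError(f"unknown category: {category}")
    return "retention" if category == "Remember" else "transfer"


def code_for(category: str, subcategory: str) -> str:
    """Rebuild the numbering used in Table 3, e.g. '2.5' for Inferring."""
    major = list(COGNITIVE_PROCESSES).index(category) + 1
    minor = COGNITIVE_PROCESSES[category].index(subcategory) + 1
    return f"{major}.{minor}"


if __name__ == "__main__":
    print(educational_goal("Analyze"))
    print(code_for("Understand", "Inferring"))
```

The mapping relies on Python dictionaries preserving insertion order, so the position of a category doubles as its level number.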

Remember. Remember involves retrieving relevant knowledge from long-term memory and comprises two subcategories, recognizing and recalling. Remembering knowledge is essential for meaningful learning and for problem solving in larger and more complex tasks. Recognizing (also called identifying), the first subcategory, involves locating knowledge in long-term memory in order to compare it with presented information. In an assessment, learners would be asked to identify or recall the knowledge they have learned. A corresponding test format would be true-false or multiple-choice questions. The second subcategory, Recalling (also called retrieving), involves retrieving relevant knowledge from long-term memory.

Understand. Understand refers to constructing meaning from oral, written, or graphic messages by integrating the new information with prior knowledge. Cognitive processes in the category of Understand include interpreting, exemplifying, classifying, summarizing, inferring, comparing, and explaining. (1) Interpreting (also called clarifying, paraphrasing, representing, or translating) takes place when a student is able to convert information from one form of representation to another. Interpreting may involve converting words to words (e.g., paraphrasing), pictures to words, numbers to words, or vice versa. (2) Exemplifying (also called illustrating or instantiating) occurs when a student can find or give a specific example or instance of a general concept or principle. (3) Classifying (also called categorizing or subsuming) occurs when a student determines that something belongs to a certain category. (4) Summarizing (also called abstracting or generalizing) occurs when a student expresses the general idea of presented information in a short form. (5) Inferring (also called concluding, extrapolating, interpolating, or predicting) involves drawing a logical conclusion from presented information. (6) Comparing (also called contrasting, mapping, or matching) involves examining similarities or differences between two or more objects, events, ideas, problems, or situations. (7) Explaining (also called constructing models) occurs when a student mentally constructs and uses a cause-and-effect model of a system.

Apply. Apply (closely related to procedural knowledge) involves carrying out or using a procedure to do an exercise or solve a problem, through executing or implementing. Executing (also called carrying out) occurs when a student applies a procedure to a familiar task (i.e., doing an exercise). Implementing (also called using) occurs when a student applies one or more procedures to an unfamiliar task (i.e., solving a problem). Unlike executing, implementing requires students not only to apply a procedure but also to rely on a conceptual understanding of the problem and the procedure.

Analyze. Analyze involves breaking material into its constituent parts and determining how the parts are related to each other and to an overall structure. This category includes the cognitive processes of differentiating, organizing, and attributing. Differentiating (also called discriminating, selecting, distinguishing, or focusing) occurs when a student determines the relevant or important parts of a message. Organizing (also called finding coherence, integrating, outlining, parsing, or structuring) involves determining how elements fit or function within a structure. Attributing (also called deconstructing) occurs when a student is able to determine the underlying purpose of the presented material, such as the author’s point of view, biases, values, or intent.

Evaluate. Evaluate refers to making judgments according to criteria and standards through checking (i.e., judgments about internal consistency) and critiquing (i.e., judgments based on external criteria). The criteria most frequently used are quality, effectiveness, efficiency, and consistency. Checking (also called coordinating, detecting, monitoring, or testing) involves detecting inconsistencies or fallacies within a process or product by examining its internal consistency or the effectiveness of a procedure. Critiquing (also called judging), the core of critical thinking, involves detecting inconsistencies between a product or operation and some external criteria, by examining the external consistency of a product or by judging the adequacy of a procedure for a given problem.

Create. Create is the process of putting elements together to form a coherent or functional whole, or to form a new pattern or structure (which involves originality and creativity). Objectives classified as Create involve having students produce an original product. The subcategories of Create are generating, planning, and producing. Generating (also called hypothesizing) refers to inventing alternative hypotheses based on criteria; a student is required to produce an alternative solution to a problem. Planning (also called designing) involves devising a method or plan for accomplishing a task; for example, a student can break a task into smaller subtasks to be performed when solving the problem. Producing (also called constructing) involves carrying out a plan for solving a given problem that meets the description of a goal, or creating or inventing a product.

The Taxonomy Table

In the revised Taxonomy, the two dimensions are combined into a two-dimensional table, termed the Taxonomy Table. The knowledge dimension forms the vertical axis of the table, while the cognitive process dimension forms the horizontal axis. The intersections of the knowledge and cognitive process categories form the cells. Objectives, activities, or assessments can thus be analyzed and placed in one or more of these cells. Anderson & Krathwohl (2001) demonstrated the use of the Taxonomy Table by teachers of different subjects. One of the examples was provided by Ms. Airasian, a fifth-grade teacher who described a classroom unit in which she integrated pre-Revolutionary War history with a persuasive writing assignment. Table 4 presents the placement of the four objectives of her history lesson.

Table 4. An Example of Objectives Classification into the Taxonomy Table

                             The Cognitive Process Dimension
The Knowledge Dimension    | 1. Remember | 2. Understand | 3. Apply | 4. Analyze | 5. Evaluate | 6. Create
A. Factual Knowledge       | Objective 1 |               |          |            |             | Objective 3
B. Conceptual Knowledge    |             | Objective 2   |          |            | Objective 4 | Objective 3
C. Procedural Knowledge    |             |               |          |            |             |
D. Metacognitive Knowledge |             |               |          |            |             |

Note. From A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives (p. 174), by L. W. Anderson & D. R. Krathwohl (Eds.), 2001, New York: Longman.
Objective 1 = Remember the specific parts of the Parliamentary Acts
Objective 2 = Explain the consequences of the Parliamentary Acts for different colonial groups
Objective 3 = Choose a colonial character or group and write a persuasive editorial stating his/her/its position on the Acts
Objective 4 = Self and peer edit the editorial

The Taxonomy Table reinforces the idea of the original Taxonomy that different types of objectives require different types of assessment (regardless of subject area), whereas similar types of objectives call for similar types of assessment. The two-dimensional structure of the Taxonomy draws attention to the assessment of higher-level processing, the importance of assessing metacognitive knowledge, and the need for new assessment techniques to tap both. Airasian & Miranda (2002, p. 253) concluded that “using the Taxonomy Table to increase the alignment of school-wide or district-wide curriculum and instruction with state standards and state-mandated assessments will enable teachers to focus on the standards without ‘teaching to the test.’”
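The way objectives are placed into the Taxonomy Table can be illustrated with a short sketch that treats each cell as a (knowledge type, cognitive process) pair. This is an illustration rather than a procedure given by Anderson & Krathwohl (2001): the function names are hypothetical, and the cell assignments in PLACEMENTS are an assumed reading of Table 4 (Objective 1 in Factual/Remember, Objective 2 in Conceptual/Understand, Objective 3 in both Factual/Create and Conceptual/Create, and Objective 4 in Conceptual/Evaluate).

```python
from collections import defaultdict

KNOWLEDGE = ["Factual", "Conceptual", "Procedural", "Metacognitive"]
PROCESSES = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]

# Assumed reading of Table 4; one objective may occupy more than one cell.
PLACEMENTS = {
    "Objective 1": [("Factual", "Remember")],
    "Objective 2": [("Conceptual", "Understand")],
    "Objective 3": [("Factual", "Create"), ("Conceptual", "Create")],
    "Objective 4": [("Conceptual", "Evaluate")],
}


def build_taxonomy_table(placements):
    """Map each occupied (knowledge, process) cell to its objectives."""
    table = defaultdict(list)
    for objective, cells in placements.items():
        for knowledge, process in cells:
            if knowledge not in KNOWLEDGE or process not in PROCESSES:
                raise ValueError(f"unknown cell: {(knowledge, process)}")
            table[(knowledge, process)].append(objective)
    return dict(table)


def empty_rows(table):
    """List the knowledge types that no objective addresses."""
    used = {knowledge for knowledge, _ in table}
    return [k for k in KNOWLEDGE if k not in used]


table = build_taxonomy_table(PLACEMENTS)

if __name__ == "__main__":
    print(empty_rows(table))
```

Under the assumed placements, the empty rows are the procedural and metacognitive ones, which is the kind of gap the Taxonomy Table is meant to make visible.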

Differences Between the Original and the Revised Taxonomy

The revision of the Taxonomy contains twelve changes in total: four in emphasis, four in terminology, and four in structure (Anderson & Krathwohl, 2001). (1) Changes in emphasis include focusing on the Taxonomy in use (i.e., the application of the Taxonomy in planning curriculum, instruction, and assessment, and in aligning the three); aiming at a broader audience, particularly teachers at all grade levels; including more sample assessment tasks (rather than model test items only) to clarify the meaning of the categories; and emphasizing the subcategories. (2) Changes in terminology consist of changing the major category titles into verbs to be consistent with how objectives are formed (i.e., a verb-noun relationship); renaming and reorganizing the Knowledge subcategories into a new dimension, the knowledge dimension, which includes factual, conceptual, procedural, and metacognitive knowledge; replacing the nouns in the subcategories of the cognitive process dimension with verbs; and retitling Comprehension as Understand and Synthesis as Create. (3) Changes in structure involve separating the noun and verb components of objectives into two dimensions; constructing the Taxonomy Table from these two dimensions as an analytical tool; ordering the cognitive process categories from simple to complex while dropping the idea of a cumulative hierarchy; and exchanging the order of Create (originally Synthesis) and Evaluate (originally Evaluation).

Application of Bloom’s Taxonomy²

Bloom’s Taxonomy has been widely used in setting and evaluating objectives, in designing and evaluating instruction and assessment, and in curriculum alignment across subject areas such as computer science, economics, mathematics, and language learning (Adams-Smith, 1981; Aviles, 1999, 2000, 2001; Bissell & Lemons, 2006; Chen, 2004; Costin, 1986; Granello, 2001; Hoeppel, 1980; Karns, Burton, & Martin, 2001; Lee, 2004). Within this extensive literature across disciplines, however, comparatively little research has focused on the application of the Taxonomy to EFL reading instruction or assessment (Costin, 1986; Surjosuseno & Watts, 1999), and almost none has addressed test item analysis.

² Studies that applied the original Taxonomy are Adams-Smith, 1981; Aviles, 1999; Bissell & Lemons, 2006; Buckles & Siegfried, 2006; Costin, 1986; Granello, 2001; Karns, Burton, & Martin, 2001; David, 2002a; David, 2002b; Frisbie, Miranda, & Baker, 1993; Hoeppel, 1980; Squire, 2001; those that used the revised Taxonomy are Chen, 2004; Lee, 2004; Liu, 2004.

In the area of teaching English for Specific Purposes, Adams-Smith (1981) employed the original Bloom’s Taxonomy, with the Knowledge category renamed as Memory by Norris M., to demonstrate the teaching and testing of reading. She illustrated how reading comprehension questions could be designed at different levels. Her example questions at each level are given below; the passage on heat loss from the human body on which they are based is presented in Appendix A:

1. Memory: Without referring back to the passage, answer the following question. What are the factors that affect the rate of evaporation?

2. Translation: Draw a simple graph to show the relation between the amount of air movement, and the rate of sweat evaporation.

3. Interpretation: (a) What is the relationship between heat loss and environmental temperature? (b) Compare Figure 1, and the graph you drew in answer to question 2.

4. Application: During which of the following weather conditions would you expect (i) the highest incidence, (ii) the lowest incidence of heat stroke? Give reasons.

   Max. temperature | Humidity | Wind velocity
   105              | 92%      | Zero
   95               | 10%      | 15 mph
   100              | 50%      | 5 mph
   110              | 0%       | 20 mph

5. Analysis: (a) Explain how heat transfer occurs when the environmental temperature exceeds 99°F. (b) Analyze the reasoning in the following statements: “You do not need to drink so much when it is dry, as the moisture in the air helps to keep you cool”.

6. Synthesis: Design a heat stroke trauma unit for a Kuwait hospital.

7. Evaluation: (a) Decide which climate is more dangerous to health: that of Kuwait in summer, or of Alaska in winter. (b) Evaluate the care received by heat-stroke victims in Kuwait, and suggest measures of improving it.

These examples show how questions can be posed at different levels. The Memory level is equivalent to the Remember category in the revised Bloom’s Taxonomy; Translation and Interpretation correspond to the Understand category; Application corresponds to Apply, and Analysis to Analyze. Synthesis and Evaluation correspond to the Create and Evaluate categories, although, as mentioned above, their order was reversed in the revised Taxonomy. Of these examples, the questions at the Memory, Translation, and Interpretation levels related more to the micro-level of understanding, and their answers were explicitly stated in the text. The questions at the Application and Analysis levels required readers to determine whether the situations given could be explained by the concepts they had read about in the article. For the questions at the Synthesis and Evaluation levels, readers needed not only to relate their schemata (e.g., the weather in Kuwait and Alaska) to what they had read, but also to use their creativity in order to evaluate the given situation or to create something.

Hoeppel (1980) classified, by Taxonomy category, the questions in the reading skills development books used in Maryland’s community colleges. A survey of the textbooks used in developmental/remedial reading programs yielded 185 different skill development books, and a total of 555 randomly selected questions (three from each book) were analyzed. The results showed that 145 of the questions fell into the Knowledge category, 400 into the Comprehension category, two into the Application category, and none into the categories of Analysis, Synthesis, and Evaluation. This overemphasis on the two lowest levels of thinking in textbook questions might inhibit students’ development of higher-level thinking skills, which are essential for college-level reading.

The concentration of questions at lower levels was also found in studies analyzing the tests provided with textbooks in different disciplines. Moreover, some of these studies showed that the alignment between the accompanying tests and the textbooks’ objectives was quite low (David, 2002a; Frisbie, Miranda, & Baker, 1993; Hampton & Krentler, 1993; Karns et al., 2001; Liu, 2004; Masters et al., 2001). Among them, Frisbie et al. (1993) evaluated the learning objectives and end-of-chapter tests in elementary and middle school textbooks for social studies and science. They found that only half of the test items matched the chapter objectives, and around 90 percent of the test items were categorized at the Knowledge level (the Remember category in the revised Taxonomy).

Karns, Burton, and Martin (2001) investigated the degree to which the accompanying exam questions in the teacher’s manuals of six economics textbooks measured the textbooks’ learning objectives. Findings revealed that the course objectives did include higher-level statements (on average, 16% at the application level, 4% at the analysis level, and 1% at the synthesis and evaluation levels); however, the accompanying exam questions were mostly at lower levels. Around 11% of the questions were at the Application level, and almost none were at the Analysis, Synthesis, or Evaluation levels.

Unlike previous studies, Liu (2004) used the revised Taxonomy as the classification scheme to analyze the cognitive levels of the objectives, exercises, and content of three high school computer textbooks. Nevertheless, similar results were found: the cognitive process levels of the learning objectives and of the accompanying test items did not match, and lower-level test items dominated. This large discrepancy between the content of the accompanying exercises and tests and that of the learning objectives is likely to hinder the accurate evaluation of students’ expected learning outcomes and may mislead students into believing that mastering lower-level cognitive processing skills means mastering the course.

As regards the analysis of test items with Bloom’s Taxonomy, Alderson and Lukmani (1989) investigated whether test items intended to measure certain skills indeed tested those skills. Nine teachers in the Institute for English Language Education at Lancaster University were given four tasks: (1) reading through 41 items as if they were taking the test; (2) writing down in their own words what they thought each item was supposed to test; (3) classifying each question or sub-question as lower order, middle order, or higher order; and (4) identifying which skills were being tested by those questions. In the last task, questions were to be classified into eight skills described by the Bombay University Communication Skills Group: recognition of words, identification, discrimination, analysis, interpretation, inference, synthesis, and evaluation. These skills closely followed the original Bloom’s Taxonomy. Results indicated that the raters agreed on the classification of only 14 of the 41 items. A possible reason for this low consensus is that it is hard to know how individuals actually arrive at the answer to a question, and different people may reach an answer in different ways, which is also one limitation of the present study.

Gierl (1997) also questioned the adequacy of using Bloom’s Taxonomy as a model to guide item writers in constructing items that measure the cognitive processes they expect students to apply on a large-scale achievement test in mathematics. In his study, 30 Grade 7 students (divided evenly into high and low achievers) were asked to think aloud as they solved problems on a mathematics achievement test. Both the multiple-choice test items and the students’ think-aloud protocols were then classified according to Bloom’s Taxonomy (only at the three lowest levels: Knowledge, Comprehension, and Application). The overall match between the responses expected by the item writers and the responses observed from the students was only 53.7% (56% in the high achievers’ group and 50% in the low achievers’ group). Gierl stated that the cognitive process levels in Bloom’s Taxonomy did describe students’ thinking processes while answering questions, yet maintained that it was difficult to judge which question tested which specific skill because test takers could arrive at an answer in different ways (e.g., some items expected to be solved by knowledge processes were solved via comprehension processes instead). Alderson (1995), showing similar concerns, argued that despite this problem, defining what to test and then trying to test it would likely lead to better test items.

Although the results of Gierl’s (1997) study did not demonstrate a complete match between test takers’ thinking processes and what the test items were intended to measure, they should not be taken as conclusive because the sample was restricted to thirty Grade 7 students taking a mathematics test. Gierl’s study nevertheless suggests that a coding scheme like Bloom’s Taxonomy can show test writers that their items, at least half of the time, do test what they are intended to measure.

Bloom’s Taxonomy has been widely applied in testing and evaluation across different subject matters and various kinds of tests, such as state-wide, nationwide, and classroom assessments (Airasian, 1994; Aviles, 1999; Chen, 2004; David, 2002b; Kastberg, 2003; Masters, Hulsmeyer, Pike, Leichty, Miller, & Verst, 2001; Squire, 2001). Similarly, research analyzing test items has indicated that test items mostly assess lower cognitive processing levels, and has thus called for a revision of test content (Chen, 2004; David, 2002b; Masters et al., 2001; Squire, 2001).

Masters et al. (2001) analyzed 2913 multiple-choice questions randomly selected from 17 test banks accompanying selected nursing textbooks. The questions were evaluated against thirty generally accepted guidelines for writing multiple-choice questions, the cognitive levels defined by the original Taxonomy, and the distribution of correct answers. Results showed that most of the questions were written at lower cognitive levels: 47.3% were at the Knowledge level, 24.8% at the Comprehension level, 21% at the Application level, and only 6.5% at the Analysis level. The authors reported that the result was somewhat surprising and concerning because most of the textbooks reviewed were intended for upper-division courses. In addition, a large number of NCLEX-RN (National Council Licensure Examination-Registered Nurse, a computer-adaptive test of entry-level nursing competence) questions were written at the Application and Analysis levels, not at the two lowest levels emphasized in those textbooks. This large discrepancy among the learning goals, the assessment tools, and the state-wide examinations might harm students’ performance on standardized examinations.

Squire (2001) analyzed the cognitive levels tested in agricultural science in senior secondary schools in Botswana. The materials analyzed were 628 questions taken from the Senior Cambridge Overseas School Certificate (COSC) Agriculture Paper 1 from 1989 to 1998 (250 objective-type items from Section A, 211 short-answer items from Section B, and 167 essay-type items from Section C). The questions in each section of the examination papers were compared with sample questions (e.g., what is the capital of France?) and with characteristic key words (e.g., questions at the Knowledge level usually contained key words such as what, who, and when). Results indicated that a large number of the questions in Sections A and C were at the Knowledge level of Bloom’s original Taxonomy; most of the questions in Section B were at the Knowledge level as well, with about 39% of the questions at the Comprehension level in 1993, 1997, and 1998. Very few or even no questions were found at the higher cognitive levels of Application, Analysis, Synthesis, and Evaluation. Surprisingly, the essay-type items in those tests were at the two lowest levels as well, which contradicted the belief that constructed-response test items (which involve constructing one’s own answer) can be used to assess higher-level thinking (Buckles & Siegfried, 2006; Simkin & Kuechler, 2005).
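The key-word comparison described above can be pictured as a simple screening step. The sketch below is hypothetical and is not Squire’s (2001) actual procedure: it uses only the three key words named in the text (what, who, when), and the function name has_knowledge_key_word is an assumption made for illustration. A match merely flags an item as a candidate for the Knowledge level; a rater would still have to judge the item.

```python
import re

# Only these three key words are named in the text; a fuller coding scheme
# would need the word lists actually used in the study (not reproduced here).
KNOWLEDGE_KEY_WORDS = ("what", "who", "when")


def has_knowledge_key_word(question: str) -> bool:
    """Return True if the question contains one of the key words as a
    whole word, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", question.lower())
    return any(word in KNOWLEDGE_KEY_WORDS for word in words)


if __name__ == "__main__":
    for item in ("What is the capital of France?",
                 "Design a heat stroke trauma unit for a Kuwait hospital."):
        print(item, has_knowledge_key_word(item))
```

Matching on whole words avoids false hits on longer words such as "whatever", but key words alone cannot establish the cognitive level of an item, which is one reason rater judgment remains necessary.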

Chen (2004) applied the revised Taxonomy to examine the knowledge types and cognitive levels of the computer science tests in the technical college entrance examination in Taiwan from 2001 to 2004. Three raters participated in the analysis to ensure the inter-rater reliability of the classification of test items. As principles for classifying test items into the two dimensions, they jointly developed a subject matter table (covering the knowledge types that appeared in the 1999 and 2000 tests) and a table of example questions at all six cognitive process levels gathered from previous computer science literature. A preliminary analysis was performed to check inter-rater consistency. When inconsistencies occurred, discussions were held to reach consensus on the classification. Results revealed, first, that certain types of knowledge were associated with certain cognitive processes; for instance, factual knowledge was related to remembering, whereas procedural knowledge was related to applying. Second, as in previous studies, most of the test items (44% to 77%) measured only lower-level thinking that required students to remember factual information. No item was found at the Evaluate or Create level, possibly because of the constraint of the multiple-choice format, in which test takers are always forced to choose one correct answer. This view is consistent with Buckles and Siegfried (2006), who found that multiple-choice questions can measure elements of in-depth understanding when carefully designed, but maintained that the Synthesis and Evaluation levels could not be accurately measured, since creativity and originality cannot be tested simply by multiple-choice questions. Tasks such as constructed-response questions could be considered as an alternative for tapping those two highest levels. Lastly, Chen (2004) found no implicational relationship between question types and item difficulty (i.e., a pattern in which item difficulty increases as the cognitive level measured rises) in the statistical results.

Addressing the debate over the implicational scale, Alderson & Lukmani (1989) found no implicational relationship between test takers’ performance and the level of the questions under study, and thus questioned the existence of an implicational scale among skills. Matthews (1990), on the other hand, believed that an implicational scale was involved in the hierarchy of reading skills, yet indicated that it would probably be easier for readers to reach global understanding (a higher-order skill) than local understanding because there was more redundant information from which readers could get the gist. Therefore, whether an implicational relationship between question type and item difficulty can be found becomes one of the research questions explored in the current study.

Within the context of ESL reading, Costin (1986) surveyed the kinds and purposes of reading assignments, the levels of cognitive processes related to reading assignments, and the cognitive abilities of weak students in the ESL remedial program, as judged by their first-year subject teachers at Hong Kong Baptist College. Results showed that around 21% of students were regarded as weak, lacking the four most needed cognitive processing skills: knowledge, comprehension, application, and analysis. Moreover, when the syllabi of past ESL remedial reading programs were compared with those required cognitive skills, it was found that only the skills of knowledge and comprehension were emphasized. It was suggested that English language teachers could reinforce the essential cognitive skills in reading programs by means of cognitively oriented approaches, such as using schema theory with an interactive model or training cognitive skills through questioning.

Currently, in the educational setting in Taiwan, the Revised Bloom’s Taxonomy has been used in evaluating the Grade 1-9 curriculum (Lee, 2004; Wu et al., 2005). Lee (2004) demonstrated the use of the revised Taxonomy by identifying the knowledge and the cognitive processes operationalized in the objectives of the Grade 1-9 curriculum across subjects, and illustrated how accompanying exercises in textbooks could be used to assess learners’ cognitive processes at all six levels. The revised Taxonomy was also applied to BCT (Basic Competence Test for Junior High School Students) test analysis. However, the report was not available for circulation.

Wu, Fu, and Lin (2005) examined the English subject objectives in the Grade 1-9 curriculum. Results showed that, on the knowledge dimension, the emphasis ranged from factual knowledge and procedural knowledge to conceptual knowledge, whereas metacognitive knowledge could hardly be found; on the cognitive process dimension, most of the objectives lay at the Apply, Understand, and Remember levels, whereas objectives targeting the ability to analyze, evaluate, and create were relatively few. These analyses provide references for setting or revising teaching or curriculum objectives in the future. Correspondingly, Bloom’s Taxonomy can be used in test analysis to provide information on the adequacy or effectiveness of a test, which in turn provides direction for test design and English reading instruction.

In the area of testing, the Taxonomy has been widely used to analyze test items in mathematics, nursing, computer science, and social work education, as shown in the reviews above, and numerous studies have demonstrated the validity and practicality of its application in assessment. Nevertheless, in the ESL reading context, only limited research can be found, and almost none on SAET and DRET reading comprehension item analysis in Taiwan.