Academic year: 2021


CHAPTER THREE RESEARCH METHODOLOGY

The purposes of the present study were to establish a set of principles for developing an English talented curriculum for senior high schools in the context of Taiwan and a set of criteria for evaluating English talented programs under the CIPP model. Given that the component of "evaluation" and the other components of a curriculum are interrelated (Brown, 1995; Pratt, 1994; Lynch, 1996), the present study also aimed to transform the curriculum development principles into program evaluation criteria. The study is therefore composed of two main parts: the establishment of principles for English talented curriculum development, and the transformation of those principles into criteria for English talented program evaluation.

The Design of the Study

In this section, the design of the study is presented, including the framework of the study, the stages the whole study went through, the derivation and arrangement of principles for English talented curriculum development, the construction of the content validity of the questionnaire, the transformation of the principles for curriculum development into criteria for program evaluation under the CIPP model, and the main research method, the Delphi technique.

The Research Framework of the Study

Figure 3.1 shows the framework of the present study. As the framework demonstrates, the study is composed of two main parts: the literature review and the surveys. To answer the research questions and achieve the purposes of the study, the researcher started with a comprehensive literature review, through which principles for English talented curriculum development were delineated. In addition, models for program evaluation were compared, and the model most appropriate for school program evaluation was specified. The survey part consisted of three subparts. The first was to construct the content validity of the preliminary questionnaire, which was compiled based on the literature review. The second was the Delphi surveys to finalize the principles for English talented curriculum development. The final result of these surveys would then be transformed into preliminary criteria for English talented program evaluation under the CIPP model. Such preliminary criteria were later consolidated through another application of the Delphi technique, which constituted the third subpart of the surveys.

Figure 3.1 The Research Framework

Stages of the Study

Based on the research framework, the researcher developed a set of research procedures to explore the research questions that the present study aimed to clarify. The procedures are demonstrated in Figure 3.2. The whole research was divided into five stages. The first stage was the planning stage, in which important related literature was reviewed and discussed with a view to specifying principles for gifted and talented curriculum development and comparing evaluation models, from which the most suitable model was chosen for evaluating gifted and talented programs. Then, data collection methods and data analysis procedures were decided upon. With that, research instruments for data collection were developed and the participants of the study were specified. Once all these were accomplished, the procedure moved on to the second stage, the data collection stage. This stage was preceded by the construction of the content validity of the research instrument, a questionnaire for the Delphi survey on principles for English talented curriculum development. The Delphi survey aimed at collecting experts' opinions about English talented curriculum development. The data from the Delphi survey, once analyzed, would help to establish the principles for English talented curriculum development. Given that curriculum development and program evaluation are closely related (Brown, 1995), the set of principles for curriculum development would be further transformed into a set of criteria for program evaluation under the CIPP evaluation model. This was the criteria establishment and transformation stage. Next was the criteria validation stage, the main concern of which was to validate this set of evaluation criteria. The Delphi technique was again employed to seek experts' opinions on the criteria transformed to fit the CIPP evaluation model. In the last stage, the finalization stage, the final sets of principles for curriculum development and criteria for program evaluation were finalized. For those criteria on which the experts failed to reach a consensus, further discussion would be rendered in order to gain a sounder understanding of gifted curriculum development and evaluation both in theory and in practice.

Figure 3.2 The Research Procedure of the Study
(1) Planning Stage: literature review; specify principles for talented curriculum development; specify program evaluation models; specify audiences, goals, research questions; determine data collection methods; develop instruments.
(2) Data Collection Stage (for Curriculum Development): constructing content validity (professors in the fields of TESOL and special education construct the content validity of the questionnaire); the Delphi surveys (expert panels of senior high school teachers and college experts consolidate principles for English talented curriculum development).
(3) Criteria Establishment & Transformation Stage: establish criteria for English talented curriculum development in the context of Taiwan; transform program development principles into program evaluation criteria (preliminary version).
(4) Criteria Validation Stage (for Program Evaluation): validation of criteria for evaluation through the Delphi surveys (expert panels of senior high school teachers and college experts consolidate criteria for English talented program evaluation).
(5) Finalization Stage: finalize principles for English talented curriculum development; finalize criteria for English talented program evaluation.

Principles for English Talented Curriculum Development

As stated earlier, developing an English talented curriculum is a formidable task. It requires professional expertise in TESOL, curriculum development, gifted education, and program evaluation. If high school teachers and school administrators have criteria to depend on in developing an English talented curriculum, much time and effort can be saved, and the final result will be more appropriate and of better quality. Based on Brown's (1995) and Pratt's (1994) ideas of curriculum development, the researcher developed a revised model for curriculum development with six components, as Figure 2.5 shows (see p. 19). To establish principles for English talented curriculum development, we may start by crystallizing each of the six components.

To establish principles covering the aspects that deserve attention from program developers, and to arrange all the principles in an organized way, the researcher further divided each component into several domains. Within each domain, there are principles addressing what needs to be taken into consideration to derive a full-fledged curriculum.

In the component of needs analysis, there are four domains addressing the needs and expectations of the four main groups of people involved: students, parents, teachers, and the school administration. The component of goals and objectives comprises three domains: goals of the program, objectives of the program, and conveyance of goals and objectives. The component of materials and resources involves five domains: materials include teacher-guided teaching materials and student self-study materials, while resources include human, financial, and facility resources. The component of courses and teaching is the largest component, including the largest number of domains and principles to pay heed to, since this is the component that highlights the difference between a gifted program and a regular one (Borland, 2003b). Its domains include the application of differentiated teaching methods and skills, the common core curriculum, special courses on English, second foreign language learning, other curricular variations, teacher preparation for the program, and administrative preparation for the program. In the component of tests and assessment, there are two domains to consider: specification and identification of the gifted, and the application of differentiated ways of assessment. The final component is evaluation, which contains only one domain to inspect: the implementation of evaluation.

In sum, there are 22 domains in total, from which the researcher further specified 110 principles that should be given attention and satisfied in developing an English talented curriculum. Table 3.1 summarizes how the principles were derived. Among the 110 principles, 104 are derived from the literature review; they are therefore theory-based principles that help to substantiate a well-rounded English talented curriculum. The other 6 principles, though not theory-based, are actual practices in many schools with English talented programs. The inclusion of these principles is meant to examine whether or not these commonly practiced arrangements are really important in implementing a well-rounded English talented curriculum.


Table 3.1 Sources of Principles for English Talented Curriculum Development

Component I: Needs Analysis
  Principles 1.1-1.2, 1.4-1.6, 2.1-2.4, 3.1-3.2, 4.1: Maker (1982, 1986); Brown (1995); Pratt (1994)
  Principle 1.3: Actual practice

Component II: Goals & Objectives
  Principles 1.1-1.8, 2.1-2.6, 3.1-3.4: Maker (1982, 1986); Brown (1995); Pratt (1994)

Component III: Materials & Resources
  Principles 1.1-1.8, 2.1-2.5, 3.1, 3.3, 4.1-4.3, 5.1-5.2: Maker (1982, 1986); Brown (1995); Pratt (1994); VanTassel-Baska (1994, 1996)
  Principle 5.3: VanTassel-Baska (1994, 1996)
  Principle 3.2: Actual practice

Component IV: Courses & Teaching
  Principles 1.1-1.10, 2.1-2.3, 3.1-3.8, 5.1-5.3, 5.5-5.6, 6.1-6.5: Maker (1982, 1986); Maker & Nielson (1995); VanTassel-Baska (1994, 1996, 2007)
  Principles 7.1-7.6: Enforcement Rules of the Special Education Act; Maker (1982)
  Principles 4.1-4.2, 5.4: Actual practice

Component V: Tests & Assessment
  Principles 1.1, 1.3-1.6, 2.1-2.5: Maker (1982, 1986); Feldhusen & Jarwan (2000); Davis & Rimm (2004); VanTassel-Baska (2008)
  Principle 1.2: Actual practice

Component VI: Program Evaluation
  Principles 1.1-1.9: Stufflebeam & Shinkfield (1985); Worthen et al. (1997); VanTassel-Baska & Feng (2004)

Construction of Content Validity

To ensure the validity of the questionnaire for the Delphi surveys, the tentative questionnaire for establishing principles for English talented curriculum development underwent a process of content validity construction, in which five experts examined and evaluated each principle, deciding whether it was a valid consideration in developing an English talented curriculum, what kind of modification needed to be made, and whether elimination of certain principles was necessary. The professional judgment of the experts and the modifications thus made would help to confirm the validity of the questionnaire, from which subsequent instruments would be derived.


Transformation of the Curriculum Development Principles into Criteria for Program Evaluation under the CIPP Model

As suggested in the previous chapter, the CIPP evaluation model would be applied as the model for evaluating English talented programs. Just as curriculum development is a formidable task, program evaluation is by no means less demanding. In fact, it requires yet another field of professional expertise, that of program evaluation. Accordingly, if a set of criteria is available for evaluating English talented programs, more teachers and schools may come to evaluate their own programs by themselves without resorting to experts from outside the school.

Brown describes the component of evaluation as being like "the glue that connects and holds all the elements together" (Brown, 1989: 217). Following this metaphor, the evaluator needs to go back and forth between "evaluation" and the other components of the curriculum to ensure that all the components are well established and sustain the most appropriate curriculum. Such being the case, we may assume that the principles for curriculum development and the criteria for program evaluation are interrelated in such a way that both are aimed at the development and maintenance of a sound and effective program, and that the principles and the criteria may be connected by a certain mechanism. We may assume that such a mechanism contains rules of transformation that turn the principles into the criteria, and filters through which curriculum development principles pass to be classified as criteria for particular kinds of evaluation under the CIPP evaluation model. We may call the rules "the Transformational Rules," and the filters the "Context Filter," "Input Filter," "Process Filter," and "Product Filter." These filters are meant to put curriculum development principles, once transformed into program evaluation criteria, into the proper category of evaluation. It stands to reason to assume that all the principles for developing an English talented curriculum shall be able to be transformed into criteria for English talented program evaluation, because whatever is considered an important factor in building up an appropriate curriculum should serve as a criterion for program evaluation too.

In the meantime, interrelated as curriculum development and program evaluation are, they embrace different concepts and thus contain different contents. It is very likely that the whole set of principles for curriculum development may not fully satisfy the requirements for serving as criteria for program evaluation. It is also likely that there are certain criteria applicable in particular to the implementation of evaluation that are not derived from any of the curriculum development principles. If so, to what extent do the principles for curriculum development fit to be transformed into criteria for program evaluation? What needs to be added to make a complete set of criteria for program evaluation? These are the questions the present study aims to explore. Once this is accomplished, we may derive one set of principles for English gifted program development and one set of criteria for English gifted program evaluation. Based on the above hypothesis, the researcher developed a model to depict the whole process of transformation of criteria, as Figure 3.3 shows.

To the left of the model are the components of curriculum development, following a combination of Brown's (1995) model and Pratt's (1994) curriculum planning. In the middle is a transformational mechanism, which contains a transformational rule to turn curriculum development principles into program evaluation criteria, evaluation filters through which principles pass in order to be placed under appropriate categories of evaluation, and a cluster of criteria that apply particularly to program evaluation. These criteria are to be added to the newly developed set of criteria to make a sound set of program evaluation criteria. The dotted arrows in the model indicate that the step in question does not undergo application of the Transformational Rule.

Figure 3.3 The Principle Transformation Model
(The figure depicts the curriculum components of Needs Analysis, Goals & Objectives, Materials & Resources, Courses & Teaching, and Testing & Assessment passing through the Transformational Rule and then through the Context, Input, Process, and Product Filters into the corresponding categories of evaluation, joined by criteria pertaining to evaluation in particular but not employed in curriculum development.)

To transform curriculum development principles into criteria for program evaluation under the CIPP evaluation model, the researcher, by consulting the objectives of the CIPP evaluations specified by Stufflebeam and Shinkfield (1985), derived filters through which principles pass to fit into the appropriate category of evaluation. The filters are as follows.

Context Filter: Statements that serve to define the institutional context, to identify the target population and assess their needs, to identify opportunities for addressing the needs, to diagnose problems underlying the needs, and to characterize the program's environment pass this filter to be transformed into Context Evaluation Criteria.

Input Filter: Statements that serve to identify and assess system capabilities, and to describe and analyze available human and material resources, solution strategies, and procedural designs for relevance, feasibility, and economy pass this filter to be transformed into Input Evaluation Criteria.

Process Filter: Statements that serve to describe the actual process of the program, to observe the activities of project staff, and to identify process defects in implementation pass this filter to be transformed into Process Evaluation Criteria.

Product Filter: Statements that serve to describe and judge program outcomes, and to relate outcomes to objectives and to context, input, and process information in order to interpret their worth and merit, pass this filter to be transformed into Product Evaluation Criteria.

The above filters provide a norm based on which principles for curriculum development can be transformed into criteria under the appropriate category of evaluation under the CIPP model. In the meantime, it is likely that some criteria for program evaluation are not considered at the stage of curriculum development but pertain to program evaluation in particular. For example, when developing a program, program developers can at best predict the desirable outcomes; they are unable to describe what the results really are unless the program has been accomplished or implemented for some time. Accordingly, some criteria for Product Evaluation will not be derived from the transformation of principles for curriculum development. Instead, they are independent of the curriculum development principles, yet related to them in that product evaluation criteria are meant to relate the outcomes to the context, input, and process information (Stufflebeam & Shinkfield, 1985). What Product Evaluation aims for will not be the concern of program developers at the stage of developing a curriculum. Therefore, we may infer that besides the set of principles for curriculum development, we still need another set of criteria for product evaluation, which, together with the criteria for context evaluation, input evaluation, and process evaluation, composes the whole set of evaluation criteria.

Once transformed, the content of each criterion remains the same, but the way it is presented is altered. This is because the principles and the criteria have different roles to play in the two educational undertakings. Principles for curriculum development serve as reminders for program developers when constructing a curriculum for English talented students from scratch. When evaluating a program, on the other hand, evaluators are trying to find answers to whatever is presented to them concerning the implementation of the program. Therefore, it seems reasonable to propose that curriculum development principles should take the form of affirmatives, while program evaluation criteria should take the form of interrogatives. The change from affirmatives into interrogatives reflects the different purposes of the principles and the criteria and the thought given to the transformation; without such a change, the significance of the transformation process would hardly manifest itself. Accordingly, the transformational rule is a rule that turns an affirmative into an interrogative, which may be stated as follows:

Transformational Rule: Transform an affirmative statement of a principle for curriculum development into an interrogative to form a criterion for program evaluation before it is checked against the evaluation filters.

Such a transformational rule is to be applied before the evaluation filters are applied, as Figure 3.3 shows. Figure 3.4 shows an example of principle transformation. The ten curriculum development principles (V1.1-V1.5, V2.1-V2.5) serve to gauge the arrangement of testing and assessment related to the specification of the gifted and the application of differentiated ways of assessment. With the application of the Transformational Rule, they are transformed into interrogatives. Then, after further examination, four of them fulfill the Context Filter and go to the category of Context Evaluation, and the other six fulfill the Process Filter and go to the category of Process Evaluation.

Component V: Testing & Assessment

Domain V1: Specification & identification of the gifted
  V1.1 Specification of requirements to meet for students who want to enroll in the program.
  V1.2 Identification tools are both valid and reliable.
  V1.3 The process of identification includes activities that assess students' four language skills in English.
  V1.4 Multiple procedures of identification are applied to select the gifted.
  V1.5 Identification is viewed as an ongoing process throughout the program.

Domain V2: Application of differentiated ways of assessment
  V2.1 Differentiated ways of assessment from general classes are applied.
  V2.2 Diverse ways of assessment are applied.
  V2.3 Assessment activities are able to measure students' four language skills in English.
  V2.4 Different types of tests are applied to assess students' learning.
  V2.5 Assessments reflect objectives and goals of each course.

Transformed (Transformational Rule + Evaluation Filters)

Context Evaluation, Domain Con1: Specification of population, Layer Con1.1: Specification and identification of gifted students
  Is specification of requirements to meet for students who want to enroll in the program clearly made? (From V1.1)
  Are identification tools both valid and reliable? (From V1.2)
  Does the process of identification include activities that assess students' four language skills in English? (From V1.3)
  Are multiple procedures of identification applied to select the gifted? (From V1.4)

Process Evaluation, Domain Proc1: Teaching & assessment, Layer Proc1.2: Application of differentiated ways of assessment
  Are differentiated ways of assessment from general classes applied? (From V2.1)
  Are diverse ways of assessment applied? (From V2.2)
  Are assessment activities able to measure students' four language skills in English? (From V2.3)
  Are different types of tests applied to assess students' learning? (From V2.4)
  Do assessments reflect objectives and goals of each course? (From V2.5)

Process Evaluation, Domain Proc2: Administrative management, Layer Proc2.1: Administrative assistance
  Is identification of the gifted viewed as an ongoing process throughout the program? (From V1.5)

Figure 3.4 Transformation of Curriculum Development Principles into Program Evaluation Criteria
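The transformation mechanism described above can be sketched programmatically. The following Python sketch is illustrative only, not part of the study: the interrogative conversion handles the simple copular statements shown in Figure 3.4 by fronting "is"/"are", and the keyword lists standing in for the CIPP filters are hypothetical stand-ins for the expert judgment actually used to route criteria.

```python
# Illustrative sketch of the Transformational Rule and evaluation filters.
# The cue-word lists below are hypothetical; in the study, routing a criterion
# to a CIPP category followed Stufflebeam and Shinkfield's (1985) objectives
# and expert judgment, not keyword matching.

def transformational_rule(principle: str) -> str:
    """Turn an affirmative principle into an interrogative criterion.

    Handles the copular statements of Figure 3.4 by fronting "is"/"are";
    other sentence shapes would need more grammar than this sketch has.
    """
    s = principle.rstrip(".")
    for verb in (" are ", " is "):
        if verb in s:
            subject, _, rest = s.partition(verb)
            return f"{verb.strip().capitalize()} {subject[0].lower()}{subject[1:]} {rest}?"
    return f"Is {s[0].lower()}{s[1:]}?"  # fallback for verbless noun phrases

# Hypothetical filters: each maps a CIPP category to cue words.
FILTERS = {
    "Context Evaluation": ("identification", "enroll", "population", "needs"),
    "Process Evaluation": ("assessment", "tests", "teaching"),
}

def apply_filters(criterion: str) -> str:
    """Route a transformed criterion to the first filter whose cues it matches."""
    text = criterion.lower()
    for category, cues in FILTERS.items():
        if any(cue in text for cue in cues):
            return category
    return "Unclassified"

criterion = transformational_rule("Identification tools are both valid and reliable.")
# criterion == "Are identification tools both valid and reliable?"  (cf. V1.2)
category = apply_filters(criterion)  # "Context Evaluation"
```

The two-step structure mirrors Figure 3.3: the Transformational Rule is applied first, and only then is the resulting interrogative checked against the evaluation filters.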


The Delphi Surveys

The Delphi technique is a quantitative technique used to obtain the opinions of groups, often for needs assessment studies (Worthen et al., 1997; McKillip, 1987). In Brown's (1995) and others' classifications (Chen, 2005; Gordon, 1994; Weir and Roberts, 1994), however, it belongs to the qualitative category of information gathering. Such divergence arises from how the results of the method are used and interpreted. The method makes use of a series of mailings of specially styled questionnaires to respondents, aiming to reach consensus among a group of respondents without necessarily bringing its members together for a meeting. It offers individuals feedback on what others think about the issue in question without putting pressure on them to express conforming views, and it enables a record of divergent opinions to be preserved (Worthen et al., 1997; Weir and Roberts, 1994; Linstone, 1978; Whitecotton, 1992; Thomas, 1990; Killian, 1993). The process has been successfully adopted in educational situations (Judd, 1972; Killian, 1993). Cyphert and Gant (1971) comment that the Delphi procedure is regarded as "useful in educational planning at all levels" (p. 272). Besides, as Linstone (1978) points out, the process has also been applied to "exposing priorities of personal values and social goals, explicating budget allocations, examining the significance of historical events, and distinguishing or clarifying perceived and real human motivations" (p. 275). Generally speaking, the implementation of the Delphi technique follows the steps below:

1. Respondents are asked to write down their opinions on an issue.

2. All the responses that have been received are listed, and a summary is circulated to the respondents.

3. Each respondent rates the items on that list according to their personal priorities and returns the list to the organizer. (It is helpful to set a clear deadline for this, as for other stages in the Delphi process.)

4. These ratings are again summarized, and a new list is circulated. This time the lists are individualized, showing both the group rating and that of the individual respondent.

5. Respondents are asked to reconsider their ratings. If they decide to diverge from the group consensus, they are invited to explain their reasons. (A form giving more space for comments is required for this.)

6. A third report is circulated. Normally this will be sufficient for the issue in question, but in principle one can continue the process by repeating stages 4 and 5 until a broad consensus on all important points is apparent.

The first mailing solicits opinions on the issue in question. Subsequent mailings show statistical summaries of the responses to previous mailings (reporting medians and interquartile ranges) and remind the respondents of their earlier responses. The respondents are allowed to change their responses, or to justify them if their responses fall outside the interquartile range. The process may go on until consensus is reached among the respondents.
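The statistical feedback between rounds can be made concrete with a short sketch. This is an assumed illustration, not code from the study: the 1-5 rating scale, the respondent identifiers, and the use of Python's statistics.quantiles are hypothetical choices for showing how a panel's median and interquartile range could be reported and out-of-range respondents flagged.

```python
# Assumed sketch: compute the between-round feedback for one questionnaire item.
# Respondent ids (E01...) and the 1-5 importance scale are hypothetical.
from statistics import median, quantiles

def round_summary(ratings):
    """ratings: anonymous respondent id -> importance rating (1-5).

    Returns the panel median, the interquartile range (Q1, Q3), and the
    respondents whose ratings fall outside the IQR and who would be asked
    to revise their rating or justify keeping it.
    """
    values = sorted(ratings.values())
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the panel's ratings
    outside = [rid for rid, r in ratings.items() if r < q1 or r > q3]
    return median(values), (q1, q3), outside

ratings = {"E01": 5, "E02": 4, "E03": 4, "E04": 5, "E05": 2,
           "E06": 4, "E07": 5, "E08": 4, "E09": 3, "E10": 4}
med, iqr, outside = round_summary(ratings)
# med == 4; respondents E05 and E09 fall below Q1 and are asked to reconsider.
```

In an individualized report (step 4 above), each respondent would see this group summary alongside their own earlier rating.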

The above steps are the procedures taken in a classic Delphi survey. The Delphi survey applied in the present study is a modified one, with the first step altered. In the first step, instead of writing down their opinions on an issue presented in several essay questions, the respondents were to rate the importance of criteria for curriculum development and program evaluation. Meanwhile, they could write down their opinions, if any, on the criteria they were rating. This modification makes the questionnaire easier to fill out than one with open-ended questions. As the questionnaire is easier to finish, respondents are more willing to complete it, which in turn enhances the chances of maintaining the same number of respondents in the panels (Tsao, 2007; Jiang, 2007).

Respondents taking part in the Delphi technique are experts in the field under consideration who are geographically dispersed, so that it would be costly, if not impossible, to bring them together. Linstone (1978) also mentions another circumstance where Delphi proves particularly useful: when "the problem does not lend itself to precise analytical techniques but can benefit from subjective judgments on a collective basis" (p. 275). Delphi can also be useful when respondents desire anonymity (Worthen et al., 1997: 357). The method distinguishes general trends from individual standpoints (Weir and Roberts, 1994: 29). Weir and Roberts point out that a key element in the Delphi method is the influence brought to bear on individuals to modify their views in the light of what they perceive the group consensus to be (p. 333). On the other hand, a study by Woudenberg (1991) cautions that while the Delphi method achieves consensus among group members, this consensus is due more to group pressure than to finding any "true" expert opinions (cited in Worthen et al., 1997: 357). However, when anonymity is sustained in the process, most respondents should not feel too much pressure when their individual opinions differ from others' (Goodwin, 1987; Killian, 1993), and an undesirable bandwagon effect can thus be avoided (Linstone, 1978: 275). Besides, participants in Delphi surveys are not randomly chosen respondents; they must be experts in the area under consideration (Goodwin, 1987; Thomas, 1990). They have their own professional judgments and are not easily swayed by others' opinions. If they finally decide to maintain opinions that do not conform to those held by most of the experts, they are welcome to do so, though it is preferable that they provide explanations to justify their positions. The influence of group pressure may thus be lessened. In fact, the beauty of the Delphi technique lies not only in reaching a consensus among the experts: the divergence of some experts' opinions can be revealing with regard to the issues in question, and if such divergences are further explored, insights may be derived from them.

As for the number of participants, published Delphi studies have covered a very wide range, from fewer than 20 to more than 2,000 (Roberts-Davis and Read, 2001; Duffield, 1993; Butterworth and Bishop, 1995). Gordon (1994) points out that most studies use panels of 15 to 35 people, while Dalkey (1969) found that a suitable minimum panel size is 7 (cited in Linstone, 1978: 296). Reid (1988) suggests that what really matters is whether the number of Delphi respondents can be justified as a genuine population.


Participants

The participants in the construction of content validity included five college professors. Two of them specialize in TESOL, and three in special education, with a focus on gifted education. The names and expertise of the participants are shown in Appendix B. The five experts evaluated the appropriateness and validity of the tentative principles for English talented curriculum development and provided suggestions for modification. The final version of the principles would serve as the questionnaire for the first round of the Delphi survey on the establishment of principles for English talented curriculum development.

The participants in the Delphi surveys included 10 college professors in the fields of TESOL, special education, and program evaluation, and 10 high school English teachers who are either coordinators of the English talented programs at their respective schools or teachers of English gifted classes. The names and expertise of the experts are shown in Appendix C. Together, the participants constituted a panel of experts who were to provide personal opinions and professional judgment concerning English talented curriculum development and evaluation. Opinions from these experts represent two different perspectives: the theoretical and the practical. The college professors may approach English talented curricula from a theoretical point of view, which caters more to theoretical concerns, while the high school teachers may approach them from a more practical point of view. The consensus reached among the experts thus serves to establish the principles for English talented curriculum development and the criteria for English talented program evaluation. The divergent opinions between the two groups of experts, on the other hand, indicate discrepancies between theoretical and practical concerns.

Instruments

The instruments applied in the study include six questionnaires: the questionnaire for the construction of content validity (see Appendix D), the questionnaires for the three rounds of the Delphi survey on establishing principles for English gifted/talented curriculum development (see Appendices E, F, and G), and the questionnaires for the two rounds of the Delphi survey on establishing criteria for English talented program evaluation.

Questionnaire for Construction of Content Validity

The tentative questionnaire for establishing principles for English talented curricula was developed based on the results of the literature review on curriculum development and gifted education and on actual practice in many schools. The first draft of the questionnaire for the experts took the form shown in Appendix D, with 110 principles pertaining to 22 domains. After examination and evaluation by the experts, some of the principles in the questionnaire were modified; the modifications are shown in Table 3.2. Other modifications mostly involved wording and the corresponding Chinese translation and are not discussed here given their triviality. As a consequence, the total number of principles for the Delphi survey remained 110, and these comprise the questionnaire for the first round of the Delphi survey on establishing the principles for English gifted/talented curriculum development (see Appendix E).

Table 3.2 Modifications of the Tentative Principles for English Talented Curriculum Development

Tentative principle:
  II1.5 Goals respond to world trends.
Modification:
  II1.5 Goals respond to world trends in gifted education.
  II1.6 Goals respond to societal needs.

Tentative principle:
  V1.2 The total number of students admitted to the program
Modification:
  Deleted as suggested

Questionnaires for the Delphi Surveys on Establishing Principles for English Talented Curriculum Development

The first-round Delphi survey to establish principles for English talented curriculum development encompasses three main parts: the explication part, the rating part, and the personal data part. In the explication part, the purposes of the survey, the construction of the survey items, and the tasks required of the respondents are clearly described. The rating part is composed of six components, each corresponding to one of the six components of the revised model for curriculum development (see Figure 2.5 on p. 19) proposed by the researcher. Within each component, there are layers of focus. Table 3.3 shows the six components together with the domains pertaining to them.

In the component of “Needs Analysis”, there are four domains of focus with 13 principles. The four domains include assessment of students’, teachers’, parents’, and

Table 3.3 Components and Domains in English Talented Curriculum Development

Component I: Needs Analysis
  I1: Assessment of learners' needs and perception of the program (I1.1-I1.6, 6 entries)
  I2: Assessment of teachers' needs and perception of the program (I2.1-I2.4, 4 entries)
  I3: Assessment of parents' needs and perception of the program (I3.1-I3.2, 2 entries)
  I4: Assessment of administration's needs and perception of the program (I4.1, 1 entry)

Component II: Goals & Objectives
  II1: Goals of the program (II1.1-II1.8, 8 entries)
  II2: Objectives of the program (II2.1-II2.6, 6 entries)
  II3: Conveyance of goals and objectives (II3.1-II3.4, 4 entries)

Component III: Materials & Resources
  III1: Teacher-guided teaching materials for English (III1.1-III1.8, 8 entries)
  III2: Self-study materials for English (III2.1-III2.4, 4 entries)
  III3: Human resources (III3.1-III3.3, 3 entries)
  III4: Financial resources (III4.1-III4.2, 2 entries)
  III5: Facility resources (III5.1-III5.3, 3 entries)

Component IV: Courses & Teaching
  IV1: Application of differentiated teaching methods and skills (IV1.1-IV1.10, 10 entries)
  IV2: Common core curriculum (IV2.1-IV2.3, 3 entries)
  IV3: Advanced courses on English (IV3.1-IV3.8, 8 entries)
  IV4: Second foreign language learning (IV4.1-IV4.2, 2 entries)
  IV5: Other curricular variations (IV5.1-IV5.6, 6 entries)
  IV6: Teacher preparation for the program (IV6.1-IV6.5, 5 entries)
  IV7: Administration preparation for the program (IV7.1-IV7.6, 6 entries)

Component V: Testing & Assessment
  V1: Specification and identification of the gifted (V1.1-V1.5, 5 entries)
  V2: Application of differentiated ways of assessment (V2.1-V2.5, 5 entries)

Component VI: Program Evaluation
  VI1: Implementation of evaluation (VI1.1-VI1.9, 9 entries)

Total: 22 domains, 110 principles


administration's needs and perception of the program. The second component, "Goals and Objectives", includes three domains of focus, the more general goals of the program, the specific objectives of the program, and the conveyance of the two, which altogether encompass 18 principles. The third component is "Materials and Resources", which includes what is used both in class and outside class for teaching and learning, and whatever resources may be employed in implementing the program; there are five domains of focus with 20 principles in this part. The fourth component is "Courses and Teaching", covering the content of the curriculum and the instruction provided by the teachers. In total, there are seven domains in this component, making "Courses and Teaching" the largest component in the gifted curriculum; the seven domains deal with courses, instruction, teacher preparation, and administration preparation, with altogether 40 principles. The next component is "Testing and Assessment", in which there are two domains of focus with a total of 10 principles. The last component is "Evaluation", with one domain of focus containing 9 principles. Following the rating part are blanks for respondents' personal data, including name, workplace, title, curriculum vitae, and working experiences.

After the first round of the Delphi survey, suggestions were proposed by the expert panel and modifications were made to the first-round questionnaire, yielding the questionnaire for the second round of the Delphi survey (see Appendix F). This time, nine additional principles were added to the principle pool, rendering 119 principles for the second round of the Delphi survey. Besides, the second-round questionnaire also differs from the first round in that it contains statistical information from the first round, including the mean, the median, the quartiles (Q1 and Q3), the interquartile range (Q3 - Q1), and the responses from every individual expert. Then, based on the results of the second-round Delphi survey, a third-round questionnaire was produced with the mean, the median, the quartiles (Q1 and Q3), the interquartile range (Q3 - Q1), and the responses from each expert. Experts


were required to check their responses and explain if they decided to maintain the responses outside of the interquartile range.

Questionnaires for the Delphi Surveys on Establishing Criteria for English Talented Program Evaluation

The criteria in the questionnaire for the first-round Delphi survey on program evaluation (see Appendix J) were transformed from the principles for English talented curriculum development finalized through the three rounds of the Delphi surveys. Through the application of the Transformational Rule and the four Evaluation Filters, each principle for curriculum development was transformed into an evaluation criterion and further categorized into the appropriate kind of evaluation: Context, Input, Process, or Product (see Appendix H). Table 3.4 summarizes the numbers of criteria transformed under the CIPP model: 44 criteria for Context Evaluation, 39 for Input Evaluation, 25 for Process Evaluation, and 11 for Product Evaluation. However, when taking into account the purposes of Product Evaluation as Stufflebeam and Shinkfield (1985) suggest, the criteria for Product Evaluation so far seem insufficient to fulfill the task of conducting product evaluation. When conducting Product Evaluation, evaluators need to describe and judge the program outcomes, to relate the outcomes to the objectives and to the context, input, and process information, and to further interpret the worth and merit of the program, but much of this is not catered to when developing a curriculum. This is predictable, since what Product Evaluation aims for may not be the concern of program developers at the stage of curriculum development, such as the specification of stakeholders and their judgment and perception of the program.

Therefore, besides the 119 criteria derived from the curriculum development principles, criteria pertaining to Product Evaluation in particular still need to be added to compose a complete set of evaluation criteria that fulfills the task of CIPP evaluations. Meanwhile, there are certain concerns for Context Evaluation that


Table 3.4 Summary of Numbers of Criteria Transformed under the CIPP Model

I: Needs Analysis
  Principles: I1.1-I1.7 (7 entries), I2.1-I2.5 (5 entries), I3.1-I3.3 (3 entries), I4.1 (1 entry)
  Context: I1.1-I1.7; I2.1-I2.5; I3.1-I3.3; I4.1 (16 entries)

II: Goals & Objectives
  Principles: II1.1-II1.8 (8 entries), II2.1-II2.6 (6 entries), II3.1-II3.4 (4 entries)
  Context: II1.1-II1.8; II2.1-II2.6; II3.1-II3.4 (18 entries)

III: Materials & Resources
  Principles: III1.1-III1.9 (9 entries), III2.1-III2.5 (5 entries), III3.1-III3.4 (4 entries), III4.1-III4.3 (3 entries), III5.1-III5.3 (3 entries)
  Input: III1.1-III1.7; III1.9; III2.1-III2.5; III3.1-III3.4; III4.1-III4.3; III5.1-III5.3 (23 entries)
  Process: III1.8 (1 entry)

IV: Courses & Teaching
  Principles: IV1.1-IV1.10 (10 entries), IV2.1-IV2.3 (3 entries), IV3.1-IV3.8 (8 entries), IV4.1-IV4.2 (2 entries), IV5.1-IV5.6 (6 entries), IV6.1-IV6.5 (5 entries), IV7.1-IV7.6 (6 entries)
  Context: IV6.1-IV6.5 (5 entries)
  Input: IV2.1-IV2.3; IV3.1-IV3.7; IV4.1-IV4.2; IV5.1-IV5.4 (16 entries)
  Process: IV1.1-IV1.9; IV7.1-IV7.6 (15 entries)
  Product: IV1.10; IV3.8; IV5.5-IV5.6 (4 entries)

V: Testing & Assessment
  Principles: V1.1-V1.5 (5 entries), V2.1-V2.7 (7 entries)
  Context: V1.1-V1.4 (4 entries)
  Process: V1.5; V2.1-V2.7 (8 entries)

VI: Program Evaluation
  Principles: VI1.1-VI1.9 (9 entries)
  Context: VI1.1 (1 entry)
  Process: VI1.2 (1 entry)
  Product: VI1.3-VI1.9 (7 entries)

Total: 119 principles transformed into 119 criteria

are not the main concerns at the stage of curriculum development, for example, different stakeholders' needs for program evaluation and the specification of audiences. These context criteria also have to be added to the criteria pool.

Therefore, based on the objectives and methods for CIPP evaluations suggested by Stufflebeam and Shinkfield (1985) and the program evaluation standards (Joint Committee, 1994), the researcher added 21 criteria to the set of 119 criteria transformed from curriculum development, making a set of 140 evaluation criteria. The newly added criteria include:

Context criteria:
1. Specification of audiences interested in evaluation of the program;
2. Learners' needs of evaluation of the program;
3. Teachers' needs of evaluation of the program;
4. Parents' needs of evaluation of the program;
5. Administrators' needs of evaluation of the program;
6. Other audiences' needs of evaluation of the program;
7. Analyses of differences of perceptions and expectations among stakeholders.

Product criteria:
1. To describe the outcomes faithfully and elaborately;
2. Overall outcomes include intended and unanticipated ones;
3. Students' judgment and perception of the program;
4. Parents' judgment and perception of the program;
5. Teachers' judgment and perception of the program;
6. Administrators' judgment and perception of the program;
7. Other stakeholders' judgment and perception of the program;
8. Quantitative analyses of stakeholders' judgment of outcomes;
9. Qualitative analyses of stakeholders' judgment of outcomes;
10. Overall outcomes reach program objectives;
11. Outcomes reflect cost-effectiveness;
12. To interpret worth and merit of the program;
13. To provide suggestions in terms of continuation, termination, modification, or refocus of the program;
14. To render reports of evaluation results to stakeholders.

The above Context criteria conform to the utility standards proposed by the Joint Committee (1994), while the Product criteria fulfill all four sets of standards: the utility, feasibility, propriety, and accuracy standards.

Appendix J shows this preliminary set of program evaluation criteria, which is further classified into 11 domains composed of 27 layers as Table 3.5 shows. For a better understanding of how each principle is transformed into a new criterion under the CIPP model, Appendix H shows the categories of evaluation each curriculum development principle is categorized into, while Appendix I shows the source of each program evaluation criterion.

Data Collection Procedures

A modified Delphi technique was employed to gather experts' opinions so as to validate the principles for English talented curriculum development and the criteria for English talented program evaluation. In this study, two Delphi surveys were conducted, one for the establishment of principles for English talented curriculum


Table 3.5 Evaluation Criteria for English Talented Programs under the CIPP Model

Context Evaluation
  1. Specification of population
     - Specification and identification of gifted students
     - Identification of evaluation audiences
  2. Needs assessment
     - Assessment of learners' needs
     - Assessment of teachers' needs
     - Assessment of other audiences' needs
  3. Staff perception of the program
     - Teacher perception and preparation for the program
     - Administration perception and preparation for the program
     - Student perception of the program
     - Parent perception of the program
     - Analysis of perception and expectation differences
  4. Goals and objectives
     - Goals of the program
     - Objectives of the program

Input Evaluation
  1. Differentiated curriculum design
     - Common core curriculum
     - Advanced English courses for gifted students
     - Second foreign language learning
     - Other curricular variations
  2. Teaching materials
     - Teacher-guided teaching materials
     - Self-study materials
  3. Resources and facility
     - Human resources
     - Financial resources
     - Facility resources

Process Evaluation
  1. Teaching and assessing
     - Application of differentiated teaching skills
     - Application of differentiated ways of assessment
  2. Administrative management
     - Administrative assistance

Product Evaluation
  1. Examination of program performance
     - Outcome description
     - Audiences' judgment of the program
  2. Report of the evaluation
     - Overall judgment and suggestions

Total: 11 domains, 27 layers

development, and the other for the establishment of criteria for English talented program evaluation. The whole procedure is shown in Figure 3.5. The researcher finalized the list of curriculum development principles and mailed it to the panel of experts to gather their responses to the principles and collect further information. The experts responded to the questionnaire, provided more ideas where necessary, and mailed it back to the researcher. The researcher then summarized the experts' opinions, calculated and tabulated their responses, and sent the results to the panel. In each


circulation, the experts were to rate the importance of each criterion and to make comments where necessary. They could also change their previous responses if they found it necessary, or choose to maintain their original responses. When they chose to maintain original responses that fell outside the interquartile range, they were required to give further explanation as to why they held a point of view different from the broad consensus. The whole procedure went on until a final consensus was reached.

Figure 3.5 The Delphi Technique Procedures (flowchart summarized below)

1. The researcher compiles a preliminary questionnaire of principles/criteria based on the literature review.
2. First circulation: the preliminary questionnaire is sent to the expert panel to rate the importance of each principle/criterion. The researcher summarizes and tabulates the responses; some principles are modified or revised.
3. Second circulation: the experts reconsider their own ratings; if they decide to diverge from the group consensus, they are to explain their reasons. The researcher again summarizes and tabulates the responses.
4. Third circulation: the experts recheck their own ratings, again explaining any divergence from the group consensus. The ratings are summarized once more, and consensus is either reached or not.
5. Final circulation: if consensus is reached, the final principles/criteria are consolidated and sent to each individual expert; if not, further rounds continue.


Data Analysis Procedures

Both quantitative and qualitative analyses were performed on the data from the Delphi surveys. In processing the results, the responses were tabulated and statistically analyzed. The first and third quartiles¹ (i.e., Q1 and Q3), the interquartile range (i.e., Q3 - Q1), the median, the mean, and the variance were calculated for each principle and criterion.
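These per-item statistics can be illustrated with a short sketch (not part of the original study; the quartile positions follow the study's footnote convention for a 20-member panel, i.e., the 5th and 15th sorted ratings, and the consensus flag applies the study's rule that an interquartile range smaller than 1 indicates consensus):

```python
# Summary statistics for one Delphi item rated by a 20-member panel.
# Quartile convention follows the study's footnote: with the ratings sorted,
# Q1 is the 5th rating and Q3 is the 15th rating.
from statistics import mean, median, pvariance

def delphi_item_stats(ratings):
    """Return the statistics reported back to the panel for one item."""
    r = sorted(ratings)
    n = len(r)
    q1 = r[n // 4 - 1]        # 5th rating when n == 20
    q3 = r[3 * n // 4 - 1]    # 15th rating when n == 20
    iqr = q3 - q1
    return {
        "mean": mean(r),
        "median": median(r),
        "Q1": q1,
        "Q3": q3,
        "IQR": iqr,
        "variance": pvariance(r),
        "consensus": iqr < 1,  # study's rule: IQR smaller than 1
    }

# Invented example: 20 Likert ratings (1 = not important, 5 = very important)
ratings = [4, 5, 4, 4, 3, 4, 4, 5, 4, 4, 4, 4, 5, 4, 4, 4, 4, 4, 4, 4]
stats = delphi_item_stats(ratings)
```

The function name and the rating vector are hypothetical; they merely show how each item's feedback table (mean, median, Q1, Q3, IQR, variance) would be assembled between rounds.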

The scale of importance for each criterion takes the form of a five-point Likert scale: the larger the number, the more important the principle or criterion, with 5 indicating very important and 1 indicating not important. The median for each criterion indicates the importance assigned to that principle or criterion by the respondents; the higher the median, the more important the principle or criterion. Some studies have applied the mean score as the indicator of importance (Tsao, 2007; Jiang, 2006). However, when the number of respondents is small, the mean score is more susceptible to the influence of extreme responses. Since the Delphi survey is designed to seek broad consensus among group members, that is, the opinions of most respondents if not everyone, it is advisable to set aside the influence of sporadic extreme responses. This is not to say that extreme responses should be totally ignored; in fact, they deserve further exploration and discussion if a better understanding of each principle and criterion is desired. Therefore, in this study, the median was chosen as the indicator of importance of each principle or criterion. The mean score also serves as an indicator: a principle or criterion with a mean score of 3 or above is kept on the list for further exploration; otherwise, it is eliminated from the list, given that not enough experts attach great importance to it. All the respondents were provided with the results of the survey

1 The ratings given by the 20 experts for each criterion were arranged in order of the rating chosen, from 1 to 5. The rating given by the 5th expert is Q1, while the rating given by the 15th expert is Q3.


and then went on to the next round of the survey. Any suggestions made about the principles and criteria by the expert respondents were collected and shown to the experts in the following round of the survey.

Meanwhile, when the median is chosen to function as a measure of central location, the quartile deviation is applied to indicate the degree of dispersion (Lin, 1992). The interquartile range (Q3 - Q1) reflects the degree of consensus concerning a principle or criterion on the instrument (Binning et al., 1972; Tersine & Riggs, 1976; Linstone, 1978; Goodwin, 1987; Whitecotton, 1992; Thomas, 1990). The smaller the interquartile range, the higher the degree of consensus. Respondents whose responses in the previous round fell outside the interquartile range, and who decided to diverge from the group consensus and maintain their original minority responses, were required to explain their reasons. Items with interquartile ranges larger than 1 are considered to show a lower degree of consensus and go through another round of the survey until consensus is reached, that is, until their interquartile ranges are smaller than 1. If a final consensus on a certain principle or criterion fails to be reached after rounds of surveys, that principle or criterion is reserved for further discussion.

Another way to examine the consistency of responses among the experts is to conduct the Kolmogorov-Smirnov one-sample test, a nonparametric statistical test, through SPSS for Windows. If the calculated Z reaches the required .05 level of significance, that is, p < .05, the responses among the experts are considered consistent. What is worth noting is that the results of the Kolmogorov-Smirnov one-sample test may not be exactly the same as those of the interquartile range test. This is because the test as run in SPSS relies on the standard deviation to derive the Kolmogorov-Smirnov Z value, which in turn determines the level of significance p. In other words, such a test is inevitably more influenced by extreme values given by the respondents. Principles or criteria with responses at the two extremes may pass the interquartile range test as reaching consensus, in that they have small interquartile ranges, yet be identified as not reaching consensus by the Kolmogorov-Smirnov one-sample test because of the extreme responses assigned to them. Likewise, two principles or criteria with the same median and interquartile range may receive different levels of consistency of responses under the Kolmogorov-Smirnov one-sample test when one elicits more responses at the two extremes than the other. In order to examine the influence of extreme responses, both tests were applied to double-check the consistency of responses among the experts. Through such a procedure, more attention can also be directed to principles or criteria that elicit extreme responses, which are by all means worthy of further discussion.
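For illustration, this test can be sketched by hand as follows. This is a minimal sketch, not the study's actual SPSS procedure: it assumes a uniform reference distribution over the five Likert points (a common choice in Delphi studies) and uses the asymptotic Kolmogorov p-value; the rating vector is invented.

```python
# One-sample Kolmogorov-Smirnov test for a Delphi item, hand-rolled.
# Reference distribution assumed uniform over the five Likert points.
from math import exp, sqrt

def ks_one_sample_uniform(ratings, points=5):
    n = len(ratings)
    d = 0.0
    for k in range(1, points + 1):
        f_emp = sum(1 for r in ratings if r <= k) / n  # empirical CDF
        f_ref = k / points                             # uniform CDF
        d = max(d, abs(f_emp - f_ref))
    z = sqrt(n) * d                                    # SPSS-style K-S Z
    # Asymptotic two-sided p-value (Kolmogorov distribution)
    p = 2 * sum((-1) ** (j - 1) * exp(-2 * j * j * z * z)
                for j in range(1, 101))
    return z, min(max(p, 0.0), 1.0)

ratings = [4] * 16 + [5] * 3 + [3]   # heavily concentrated ratings
z, p = ks_one_sample_uniform(ratings)
consistent = p < 0.05                # significance -> responses are consistent
```

Because the concentrated ratings depart sharply from a uniform spread, the test flags them as consistent; a nearly flat distribution of ratings would yield a small Z and a nonsignificant p.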

As to how to decide whether further iterations of the Delphi procedure are warranted, we rely on a test of consistency between rounds. To this end, a nonparametric statistical test, the Wilcoxon matched-pairs signed-ranks test through SPSS for Windows, was applied to examine whether the difference between two rounds reaches the level of significance. Items that fail to reach consensus go on to further rounds of the survey, and the responses in two successive rounds undergo the Wilcoxon matched-pairs signed-ranks test. If the calculated p is larger than the required level of significance .05 (i.e., p > .05), the change in the responses is not significant, which means that no further iteration of the Delphi procedure is needed and the survey may stop. On the other hand, if p < .05, the change in the responses is significant. In this situation, more rounds of the survey are required if the latest round does not reach the expected level of consistency; if the latest round reveals that consensus among the experts has been reached, no further iteration is required.
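A minimal sketch of this between-rounds comparison is given below, using the large-sample normal approximation rather than SPSS's exact routine; the two rating vectors are invented for illustration.

```python
# Wilcoxon matched-pairs signed-ranks test between two Delphi rounds,
# using the normal approximation (illustrative sketch only).
from math import erfc, sqrt

def wilcoxon_signed_rank(round1, round2):
    # Differences between rounds, dropping pairs with no change
    diffs = [b - a for a, b in zip(round1, round2) if b != a]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0                    # no change at all between rounds
    abs_sorted = sorted(abs(d) for d in diffs)
    def rank(v):                           # average (mid) rank for ties
        lo = abs_sorted.index(v)
        hi = lo + abs_sorted.count(v) - 1
        return (lo + hi) / 2 + 1
    w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    w = min(w_plus, w_minus)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma
    p = erfc(abs(z) / sqrt(2))             # two-sided p-value
    return z, p

round1 = [4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 4, 5, 4, 4, 3, 4]
round2 = [4, 5, 5, 4, 4, 4, 5, 4, 4, 4, 4, 5, 4, 4, 4, 5, 4, 4, 4, 4]
z, p = wilcoxon_signed_rank(round1, round2)
stable = p > 0.05   # insignificant change -> no further iteration needed
```

With only a handful of changed ratings the normal approximation is rough, which is why the study's SPSS procedure remains the authoritative computation; the sketch only shows the shape of the decision rule.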

Finally, to better understand whether there is a significant difference between the college professors' and the senior high school teachers' responses, the Kruskal-Wallis one-way ANOVA through SPSS for Windows was applied. If the calculated Chi-Square value corresponds to a p larger than .05 (i.e., p > .05), there is no significant difference between the responses of the two groups of experts. On the other hand, if p is smaller than .05 (i.e., p < .05), a significant difference exists between the two groups, which deserves further discussion to explore what causes such divergence of responses.
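This between-group comparison can be sketched by hand as follows. It is an illustrative implementation with invented ratings, not the study's SPSS output; with two groups, the H statistic is referred to a chi-square distribution with one degree of freedom.

```python
# Kruskal-Wallis test comparing professors' and teachers' ratings on one item
# (hand-rolled sketch; with two groups, df = 1).
from math import erfc, sqrt

def kruskal_wallis_two_groups(group_a, group_b):
    pooled = sorted(group_a + group_b)
    n = len(pooled)
    def avg_rank(v):                    # mid-rank for tied values
        lo = pooled.index(v)
        hi = lo + pooled.count(v) - 1
        return (lo + hi) / 2 + 1
    ranks = {v: avg_rank(v) for v in set(pooled)}
    r_a = sum(ranks[v] for v in group_a)
    r_b = sum(ranks[v] for v in group_b)
    h = (12 / (n * (n + 1))) * (r_a ** 2 / len(group_a)
                                + r_b ** 2 / len(group_b)) - 3 * (n + 1)
    # Correction for ties in the Likert ratings
    ties = sum(pooled.count(v) ** 3 - pooled.count(v) for v in set(pooled))
    h /= 1 - ties / (n ** 3 - n)
    p = erfc(sqrt(h / 2))               # chi-square survival function, df = 1
    return h, p

professors = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]
teachers   = [4, 3, 4, 4, 3, 4, 4, 4, 3, 4]
h, p = kruskal_wallis_two_groups(professors, teachers)
differs = p < 0.05   # significant -> the two groups diverge on this item
```

In the invented example the professors rate the item markedly higher than the teachers, so the test reports a significant difference, the situation the study flags for further discussion of theoretical versus practical concerns.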

The qualitative analysis covers detailed exploration and discussion of the responses that diverge from the group consensus, of the principles and criteria that in the end fail to receive convergent responses from the experts, and of the different opinions held by the two groups of experts concerning certain principles and criteria.

Figure 3.2 The Research Procedure of the Study
Table 3.1 Sources of Principles for English Talented Curriculum Development
Figure 3.3 The Principle Transformation Model
Figure 3.4 Transformation of Curriculum Development Principles into Program Evaluation Criteria
