1.1 Background

For many years, educational assessment has played an important role in teaching

and learning (Gronlund, 1993). It can evaluate the effectiveness of teaching, diagnose

the state of learning, and help the development of students’ learning (Chen, Lee, &

Chen, 2005; Chen & Chung, 2008; Johns, Hsingchin, & Lixun, 2008; Barla et al.,

2010). With the development of computers and the Internet, Computer Adaptive

Testing (CAT) is now a developing way to administer tests adapting to learners’ knowledge

or competence in language learning (Troubley, Heireman, & Walle, 1996). Based on

adaptive tests, examinees’ abilities can be more accurately measured by fewer suitable

questions (Weiss & Kingsbury, 1984; Van der Linden & Glas, 2000); moreover, student

performance has also been shown to improve (Barla et al., 2010). CAT can not

only provide questions but also be combined with scaffolding hints and instructional

feedback (Feng, Heffernan, & Koedinger, 2010). This facilitates students’ learning and

helps them acquire knowledge with external help. However, when a great number of

questions are required, there is a shortage in assessment resources because it is time–consuming and cost–intensive for human

experts to manually produce questions.

In recent years, there has been increasing attention to computer-aided question

generation (also called automatic question generation or automatic quiz generation) in

the field of e-learning and Natural Language Processing (NLP). It is useful in multiple

subareas and has been proposed for use in generating instructions in tutoring systems

(Mostow & Chen, 2009), assessing domain knowledge (Mitkov & Ha, 2003),

evaluating language proficiency (Brown, Frishkoff, & Eskenazi, 2005), assisting academic

writing (Liu, Calvo, Aditomo, & Pizzato, 2012) and question answering (Pasca, 2011).

In order to make learning environments more effective and efficient, many

researchers have been exploring the possibility of automatic question generation in

various contexts. For example, applications in a wide variety of domains, such as Linguistics

(Mitkov, Ha, & Karamanis, 2006) and Biology (Agarwal & Mannem, 2011), identified

the important concepts in textbooks and generated multiple-choice questions and

gap-fill questions. In the domain of language learning, a growing number of studies

(Turney, 2001; Turney, Littman, Bigham, & Shnayder, 2003; Liu, Wang, Gao, &

Huang, 2005; Sumita et al., 2005; Lee & Seneff, 2007; Lin, Sung, & Chen, 2007; Pino,

Heilman, & Eskenazi, 2008; Smith, Avinesh, & Kilgarriff, 2010) now cover not

only drills and exercises, including vocabulary, grammar, and reading questions, but also

formal exams, including SAT (Scholastic Aptitude Test) analogy questions and TOEFL

(Test of English as a Foreign Language) synonym tasks. To support academic writing,

Liu et al. (2012) used Wikipedia and the conceptual graph structures of research papers

and generated specific trigger questions for supporting literature review writing.

Several studies have addressed the benefits of facilitating learning and teaching

with automatic question generation. The use of computer-aided question generation for

educational purposes was motivated by research on reading comprehension, which consistently

found that assessment is helpful in learning and enhances learners’ retention of

material (Anderson & Biddle, 1975). Mitkov et al. (2006) demonstrated that computer–aided

question generation was more time–efficient than manual labor. Turney et al. (2003)

showed that the generated SAT and TOEFL questions are comparable to those generated

by experts. Liu et al. (2012) found that the generated trigger questions were more

useful than manual generic questions and that the questions could prompt students to

reflect on key concepts, because the questions were generated based on what students

read. With the advantage of automatic question generation, students can practice

without waiting for a teacher to compose a quiz, and teachers can spend more time on

teaching; moreover, besides evaluating students’ understanding, automatic question

generation can be designed with additional functions.

1.2 Research problem

Recent theories on learning have focused increasing attention on understanding

and measuring student ability. There is now general consensus on Vygotsky’s (1978)

observation that a learner’s ability in the Zone of Proximal Development (ZPD)—the

difference between a learner’s actual ability and his or her potential development—can

progress well with external help. Instructional scaffolding (Wood, Bruner, & Ross,

1976), closely related to the concept of ZPD, suggests that appropriate support during

the learning process helps learners achieve their learning goals. However, effective

instructional support requires identifying students’ prior knowledge, tailoring assistance

to meet their initial needs, and then removing this aid when they acquire sufficient

knowledge.

Even though previous studies in the field of computer-aided question generation

automatically generate all possible questions based on their proposed approaches in an

attempt to reduce the time and monetary cost of manual question generation, such an

exhaustive list of questions is inappropriate for language learning, because it can lead to

redundant, over–simplistic test questions that are unsuitable for evaluating student

progress. Moreover, it is hard to achieve a meaningful test purpose and maximize

examinees’ learning outcomes because personalized design (Fehr et al., 2012; Hsiao,

Chang, Chen, Wu, & Lin, 2013; Wu, Su, & Liu, 2013) is still critically lacking.

1.3 Research purpose

This work is intended to provide personalized computer-aided question

generation for formative assessment to assess students’ receptive skills in English as a foreign

or second language. It generates three question types, including vocabulary, grammar

and reading comprehension, and differs from previous studies in the way learners’

language proficiency levels are considered in the generating process and questions are

generated at appropriate difficulty levels. Here, “personalization” refers to the adjustment

to learner needs by matching the difficulty of questions to their knowledge level. In

other words, questions are generated based on an individual’s ability even though

students read the same learning material.

This work, the personalized computer-aided question generation, is based on a

concept related to the age of acquisition (AoA). The basic idea of age of acquisition is

the age at which a word, a concept, or even specific knowledge is acquired. For instance,

people learn some words such as “dog” and “cat” before others such as “calculus” and

“statistics”. Numerous studies in psychology and cognitive science have shown its

positive influence on brain processes such as object recognition (Urooj et al.,

2013), object naming (Carroll & White, 1973; Morrison, Ellis, & Quinlan, 1992;

Alario, Ferrand, Laganaro, New, Frauenfelder, & Segui, 2005; Davies, Barbón, &

Cuetos, 2013), and language learning (Brysbaert, Wijnendaele, & Deyne, 2000;

McDonald, 2000; Izura & Ellis, 2002; Zevin & Seidenberg, 2002). Today, with the

vast amount of content available from the web and other digital resources, this

concept can be realized with advanced technologies, Information Retrieval (Baeza-Yates

& Ribeiro-Neto, 1999; Manning, Raghavan, & Schütze, 2008) and Natural Language

Processing (Manning & Schütze, 1999), which count word frequencies and calculate

the probability that a word is acquired at a certain school grade when given a

group of documents. With a large enough resource, such as an extensive collection of

all learning materials that people read and learn from, the acquisition grade distributions

can be computed and implemented. For example, based on textbooks authored

specifically for students at grade level six, questions can be generated from concepts in

these textbooks; depending on whether a student answers them correctly, the student

can be said to either have or lack the skills at grade level six. This implies that

learning materials, such as textbooks, are written with the intent to represent what learners

at a certain grade level learn and acquire. Two works related to this concept are

readability prediction (Kidwell, Lebanon, & Collins-Thompson, 2011), which mapped a

document to a numerical value corresponding to a grade level based on the distribution

of acquisition age, and word difficulty estimation (Kireyev & Landauer, 2011), which

modeled language acquisition with Latent Semantic Analysis to compute the degree of

knowledge of words at different learning stages.
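
To make the computation concrete, the following is a minimal sketch in Python, assuming a small hypothetical graded corpus; the corpus, function names, and the simple averaging used to map a document to a grade are illustrative choices, not the implementation developed in this thesis.

# A minimal sketch, assuming a hypothetical graded corpus: estimate
# P(grade | word) from relative word frequencies across grade-level
# textbooks, then map a document to a grade level in the spirit of
# the distribution-based readability prediction cited above.

# Hypothetical corpus: school grade -> texts written for that grade.
graded_texts = {
    1: ["the dog ran", "a cat sat"],
    6: ["the dog studied statistics"],
}

def grade_distribution(word, corpus):
    """Normalize the word's frequency across grades into P(grade | word)."""
    counts = {}
    for grade, texts in corpus.items():
        tokens = [t for text in texts for t in text.lower().split()]
        counts[grade] = tokens.count(word.lower())
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}

def predict_reading_grade(document, corpus):
    """Average the expected acquisition grade of the document's words."""
    expected = []
    for word in document.lower().split():
        dist = grade_distribution(word, corpus)
        if dist:
            expected.append(sum(g * p for g, p in dist.items()))
    return sum(expected) / len(expected) if expected else None

print(grade_distribution("dog", graded_texts))              # {1: 0.5, 6: 0.5}
print(predict_reading_grade("the dog sat", graded_texts))   # about 2.7

In practice, the graded corpus would be an extensive collection of learning materials per grade level, and smoothing would be needed for rare words; the sketch only shows the counting-and-normalizing idea.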

In response to the personalized design based on the acquisition grade distributions,

we propose a personalized automatic quiz generation to generate multiple–choice

questions with varying difficulty, a reading difficulty estimation to predict the

difficulty level of an article for learners of English as a foreign language, as well as an

interpretable and statistical ability estimation to estimate a student’s ability with inherent

randomness in the acquisition process, specifically in a Web-based learning environment,

as shown in Figure 1.
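
To illustrate how these components are intended to interact, here is a rough Python sketch of a personalized selection loop: it asks the question whose difficulty best matches the current ability estimate and then updates the ability with a one-parameter (Rasch) gradient step. The question pool, difficulty values, and learning rate are hypothetical, and this simplified loop stands in for, rather than reproduces, the estimator based on acquisition grade quantiles and item response theory developed in later chapters.

# A minimal sketch, under simplifying assumptions, of a personalized
# selection loop: pick the question whose difficulty best matches the
# current ability estimate, then update the ability with a Rasch step.
import math

def p_correct(ability, difficulty):
    """Rasch model: probability that the student answers correctly."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def next_question(pool, ability):
    """Pick the remaining question closest in difficulty to the ability."""
    return min(pool, key=lambda q: abs(q["difficulty"] - ability))

def update_ability(ability, difficulty, correct, lr=0.5):
    """One gradient step on the Rasch log-likelihood for one response."""
    residual = (1.0 if correct else 0.0) - p_correct(ability, difficulty)
    return ability + lr * residual

pool = [{"id": i, "difficulty": d} for i, d in enumerate([-1.0, 0.0, 0.5, 1.5])]
ability = 0.0
for _ in range(3):
    q = next_question(pool, ability)
    pool.remove(q)
    correct = True  # stand-in for the student's actual response
    ability = update_ability(ability, q["difficulty"], correct)
    print(f"asked q{q['id']} (difficulty {q['difficulty']:+.1f}), ability -> {ability:.2f}")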

The purpose of personalized testing is to not only measure the achievement

performance of students, but also help them improve their own learning process and

correct their mistakes by understanding what they have learned and have not learned yet.

Through this approach, students can read any materials online and then do more

exercises to understand their strengths and to improve their weaknesses, as a strategy to

guide them to language acquisition.

Figure 1 The architecture of the personalized computer-aided question generation.

The main research questions addressed in this study are:

(1) Does the proposed personalized design with the appropriate instructional

scaffolding help students advance their learning progress?

(2) Does the proposed personalized question selection help students correct their

unclear concepts?

(3) What are students’ perceptions of and experiences with the proposed personalized

computer-aided question generation?

We also conduct simulation and empirical evaluations to investigate the following properties:

(4) What are the representative features of the proposed reading difficulty

estimation in English as a foreign or second language?

(5) How is the performance of the proposed reading difficulty estimation

compared with other reading difficulty estimations?

(6) What are the characteristics of the proposed ability estimation based on the

quantiles of acquisition grade distributions and item response theory?

(7) How is the performance of the proposed ability estimation compared with

other ability estimations?

(8) How is the performance of the proposed ability estimation with the empirical

data in a Web-based learning environment?

The rest of this article is organized as follows. Chapter 2 describes related work.

In Chapter 3, we present the design of automatic quiz generation and the mechanism

for assigning question difficulty. Chapter 4 outlines the personalization framework,

consisting of reading difficulty estimation, ability estimation and quiz selection. In

Chapter 5 and Chapter 6, we present simulation evaluations of reading difficulty

estimation and ability estimation, respectively. Chapter 7 evaluates the effectiveness of

personalized computer-aided question generation in the empirical study. Finally,

Chapter 8 summarizes the contributions, limitations, and potential applications.








