
Subproject 2: Feasibility Study of a Networked Creative Learning Environment (III)


Academic year: 2021






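A Study of a Networked Learning and Assessment System for Information Technology for Secondary School Students

Project type: □ Individual project   Integrated project

Project number: NSC 89-2520-S-009-011

Project period: August 1, ROC 89 (2000) to July 31, ROC 90 (2001)

Principal investigator: Professor 林盈達 (Ying-Dar Lin)

Institution: Department of Computer and Information Science, National Chiao Tung University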








Abstract

This paper presents a novel networked computer-assisted testing (CAT) system, DIYexamer (Do-It-Yourself Examer). Three features differentiate it from existing CAT systems: student DIY items, item-bank sharing, and automatic assessment of item discriminability. DIYexamer accepts test items contributed by teachers as well as students, and allows limited item sharing between item-banks that may be maintained by different organizations. An algorithm is applied dynamically to assess the discriminability of items in the item-banks and to filter out less qualified contributions, thereby assuring the quality of stored items while scaling up the size of the item-banks.

Keywords: Computer Assisted Testing, Test Evaluation, Test Item Acquisition, Discriminability, Distance Learning



Computer-assisted Testing (CAT) or Computer-based Testing (CBT), the use of computers for testing purposes, has a history spanning more than twenty years. The documented advantages of computer-administered testing include reduced testing time, increased test security, instant scoring, and an individualized adaptive testing environment [1][2][3][4]. Three categories of CAT are currently employed: standalone packages, test centers, and networked systems.

Regardless of which CAT system is employed, a critical issue in developing CAT is the construction of a test item-bank. Traditionally, the item-bank is built by asking teachers and content experts to submit items. This traditional method has three major drawbacks:

1) Limited number of items: Teachers and content experts tend to hold similar views of the test subject, so the subject matter they regard as vital in a given field is confined. Therefore, even when more teachers and content experts are invited to contribute test items, the total number of distinct items remains low.

2) Passive learning attitude: Students are conventionally excluded from the [...] computer-assisted testing system, teachers generate tests, the system presents test sheets and students then complete the tests. That is, they play a passive role within the testing system, and are not afforded the opportunity to conduct “meta-learning” or [...]

3) No guarantee of item quality: Permitting students to generate test items may solve the two problems above. However, it raises a new problem: assuring quality, that is, ensuring that the items are worth storing and using in further tests. Even when the whole item-bank is contributed by teachers and content experts, a way to dynamically assess and filter test items is needed.

1 The DIYexamer Solution

DIYexamer [5] provides a web interface through which users remotely control and operate the system. Three kinds of users are supported: administrators, teachers, and students. The system allows students to contribute test items and provides an effective means of verifying the discriminability of these items. Its three main ideas are introduced below.

1) Item DIY by students: DIYexamer allows students to submit test items to the item-banks online, as shown in Fig 1, while teachers can query the items that students have generated. In addition to rapidly increasing the total number of items in an item-bank, this feature encourages students to develop meta-learning, i.e. creative learning. In order to submit test items, students must thoroughly study the learning materials, develop higher-level overviews of the materials, and practice cognitive and creative thinking.

2) Assessment of item discriminability: DIYexamer provides an item-discriminability assessment method to ensure the quality of the stored items. In addition to ensuring the internal consistency of existing test items, this method continuously and dynamically screens new items added to the item-bank.

Fig 1: Students generate items into the item-bank

3) Item-bank sharing: DIYexamer is a scalable multi-server system that connects item-banks stored on different servers, so more items can be accessed and shared via the Internet. The sharing is limited and controlled in the sense that one server sends another server a request describing the criteria of the test items it wants.
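The request-based sharing in item 3 can be pictured with a small sketch. Everything here is an assumption made for illustration: the source does not describe the request format, so the `Item` and `ItemRequest` types, their fields, and the `answer_request` function are hypothetical names, and real servers would exchange these requests over the network rather than through a local function call.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Item:
    """A stored test item (hypothetical fields)."""
    item_id: str
    subject: str
    discriminability: float  # assumed to lie between 0 and 1


@dataclass(frozen=True)
class ItemRequest:
    """Criteria one server sends to another (hypothetical fields)."""
    subject: str
    min_discriminability: float
    max_items: int


def answer_request(bank: List[Item], request: ItemRequest) -> List[Item]:
    """Return at most request.max_items items that meet the criteria."""
    matches = [
        item for item in bank
        if item.subject == request.subject
        and item.discriminability >= request.min_discriminability
    ]
    # Offer the most discriminating items first.
    matches.sort(key=lambda item: item.discriminability, reverse=True)
    return matches[:request.max_items]


# Illustrative data only.
bank = [
    Item("q1", "networks", 0.82),
    Item("q2", "networks", 0.35),
    Item("q3", "databases", 0.90),
    Item("q4", "networks", 0.71),
]
shared = answer_request(bank, ItemRequest("networks", 0.5, 1))
print([item.item_id for item in shared])
```

Capping the number of returned items and filtering by criteria is one way to keep sharing "limited and controlled": the answering server never exposes its whole item-bank.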

Two additional advantages have been identified. First, because DIYexamer generates test sheets on demand in real time, cheating is avoided. Second, DIYexamer provides an item cross-analysis function with which the degree of difficulty of each test, as well as of the entire test base, can be accurately measured.

2 Discriminability Assessment Of DIYexamer

When selecting sample students, only those whose scores differ greatly from the average score should be considered. Accordingly, students whose scores fall in the top 30% of the score range are defined as the “high-score group (H’)”, while those whose scores fall in the bottom 30% of the score range are defined as the “low-score group (L’)”.
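The range-based selection can be sketched as follows. This is a minimal sketch, assuming that "top 30% in terms of range" means scores within 30% of the span between the lowest and highest score, measured down from the highest score (and up from the lowest score for the low group). For contrast, a count-based selection that takes the top and bottom 30% of students by rank is included. The function names and the student scores are hypothetical.

```python
def select_by_range(scores, fraction=0.3):
    """Return (high_group, low_group) chosen by score range."""
    lowest, highest = min(scores.values()), max(scores.values())
    width = (highest - lowest) * fraction
    high_cut = highest - width  # scores at or above this are in H'
    low_cut = lowest + width    # scores at or below this are in L'
    high = {s for s, v in scores.items() if v >= high_cut}
    low = {s for s, v in scores.items() if v <= low_cut}
    return high, low


def select_by_rank(scores, fraction=0.3):
    """Return (high_group, low_group) chosen by count of students."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return set(ranked[:k]), set(ranked[-k:])


# Illustrative scores only; student "X" sits in the middle of the range.
scores = {"A": 92, "B": 90, "X": 66, "C": 65, "D": 64,
          "E": 63, "F": 62, "G": 60, "H": 50, "I": 34}

range_high, range_low = select_by_range(scores)
rank_high, rank_low = select_by_rank(scores)
print("range-based:", sorted(range_high), sorted(range_low))
print("rank-based: ", sorted(rank_high), sorted(rank_low))
```

With these scores the range-based cut-offs are 74.6 and 51.4, so student X is not sampled, whereas the count-based selection places X in the high group simply because X is the third-ranked of ten students.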

To show how the traditional method and the DIYexamer method choose samples, Fig 2 depicts the score distribution of a test. In this example, the highest score is 92, the lowest score is 34, and the average score is 69. The “high rank score group” and the “low rank score group” are chosen according to each method. Take student X as an example: X scored 66, only 3 points from the average. The information associated with X should therefore have little, if any, referential value in computing item discriminability. However, the traditional method chooses X as a sample in the high rank group. This fallacy results from using rank groups, in terms of count, as the criterion for choosing samples. In DIYexamer, X is not chosen, since score groups, in terms of range, rather than rank groups are used. Only students whose scores differ greatly from the average are chosen as samples.

[Figure: score axis from 20 to 100 marking MIN 34, X 66, AVG 69 and MAX 92, with the groups L, H, L’ and H’]

Fig 2: Comparison of samples taken in the traditional method and DIYexamer method

Because different samples have different impacts on discriminability, a referential value with respect to an item is generated for each student selected as a sample. We first define the item discriminability as the average of all associated referential values:

$$\text{Discriminability} = \frac{\text{Sum of the referential values of sampled students}}{\text{Number of sampled students}}$$

Since the referential values depend on students’ scores, they are computed from the ratios of correct and incorrect answers of the sampled students. These ratios are defined as

$$\text{Ratio of correct answer} = \frac{\text{Number of items answered correctly}}{\text{Number of items on the test}}$$

$$\text{Ratio of incorrect answer} = \frac{\text{Number of items answered incorrectly}}{\text{Number of items on the test}}$$

According to Table 1, the referential value of a student who answered an item correctly is that student's ratio of correct answer. Conversely, the referential value of a student who answered an item incorrectly is that student's ratio of incorrect answer. This policy follows from the fact that an item should gain discriminability if it is answered correctly by a competent student and lose discriminability if it is answered correctly by a less competent student. In this way, a competent student contributes a large referential value to a correctly answered item and a small referential value to an incorrectly answered item, and vice versa for a less competent student.
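The definitions above translate directly into a short computation. The following is a minimal sketch, assuming that every item on the test is counted as either correct or incorrect, so that a student's ratio of incorrect answer equals one minus the ratio of correct answer. The function names and the sample data are hypothetical.

```python
def referential_value(answered_correctly, ratio_correct):
    """Referential value of one sampled student for one item.

    A correct answer contributes the student's ratio of correct answer;
    an incorrect answer contributes the ratio of incorrect answer,
    taken here as 1 - ratio_correct (assumption: no third outcome).
    """
    if not 0.0 <= ratio_correct <= 1.0:
        raise ValueError("ratio_correct must lie between 0 and 1")
    return ratio_correct if answered_correctly else 1.0 - ratio_correct


def item_discriminability(samples):
    """Average referential value over the sampled students.

    samples: list of (answered_correctly, ratio_correct) pairs, one per
    sampled student.
    """
    if not samples:
        raise ValueError("at least one sampled student is required")
    values = [referential_value(ok, ratio) for ok, ratio in samples]
    return sum(values) / len(values)


# Illustrative data: strong students answer correctly, weak ones do not.
well_behaved = [(True, 0.9), (True, 0.8), (False, 0.2), (False, 0.1)]
# Illustrative data: the reverse pattern.
badly_behaved = [(False, 0.9), (False, 0.8), (True, 0.2), (True, 0.1)]

print(round(item_discriminability(well_behaved), 2))
print(round(item_discriminability(badly_behaved), 2))
```

In the first sample every student contributes a large referential value, so the item scores about 0.85. In the second sample the competent students miss the item and the less competent ones get it right, so the item scores about 0.15.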


3 Evaluation Of The Discriminability Assessment In DIYexamer

The fairness and performance of DIYexamer were evaluated. We conducted an experiment in which 10 students took a 10-item test online using DIYexamer. The discriminability of each item was computed using both the traditional method and the DIYexamer method. However, discriminability falls between -1 and 1 under the traditional method and between 0 and 1 under the DIYexamer method. To compare the two methods, both ranges are normalized to the range 0 to 10, as shown in Fig 3.

Item discriminability (normalized) by item:

| Item | DIYexamer | Traditional |
| --- | --- | --- |
| 1 | 3.6 | 5 |
| 2 | 6.4 | 5 |
| 3 | 6.4 | 5 |
| 4 | 6 | 6.7 |
| 5 | 8 | 8.33 |
| 6 | 8 | 10 |
| 7 | 8 | 10 |
| 8 | 7.2 | 6.7 |
| 9 | 8 | 10 |
| 10 | 8 | 10 |

Fig 3: Comparison of item discriminability
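The normalization used for this comparison can be sketched as a linear rescaling. The source states only that both ranges are normalized to 0 to 10; the linear mapping, and the function names below, are assumptions.

```python
def normalize(value, low, high, scale=10.0):
    """Linearly map value from [low, high] onto [0, scale]."""
    if not low <= value <= high:
        raise ValueError("value lies outside the stated range")
    return scale * (value - low) / (high - low)


def normalize_traditional(discriminability):
    """Traditional method: original values fall between -1 and 1."""
    return normalize(discriminability, -1.0, 1.0)


def normalize_diyexamer(discriminability):
    """DIYexamer method: original values fall between 0 and 1."""
    return normalize(discriminability, 0.0, 1.0)


print(normalize_traditional(0.0))
print(round(normalize_diyexamer(0.36), 2))
```

Under this assumption a traditional discriminability of 0 lands at the midpoint, 5, while a DIYexamer discriminability of 0.36 becomes 3.6.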


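Table 1: Referential values used to compute item discriminability

| Student | Answer | Item discriminability | Referential value to compute discriminability |
| --- | --- | --- | --- |
| Competent (with high ratio of correct answer) | Correct | High | Ratio of correct answer |
| Competent (with high ratio of correct answer) | Incorrect | Low | Ratio of incorrect answer |
| Less competent (with low ratio of correct answer) | Correct | Low | Ratio of correct answer |
| Less competent (with low ratio of correct answer) | Incorrect | High | Ratio of incorrect answer |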

4 Conclusion

This paper has presented a novel architecture for a networked CAT system, DIYexamer. It supports item DIY by students, item-bank sharing, and item discriminability assessment.

For discriminability assessment, new calculation formulas were proposed. The main difference from the traditional assessment scheme is that the top and bottom 30% score groups, in terms of range of scores, are selected rather than rank groups, in terms of count of students. Thus, item discriminability is reflected more accurately, particularly when the tested students have close scores.

Item-bank sharing and item DIY by students have increased both the number and the variety of questions in item-banks. Item DIY by students promotes creative learning among students, while automatic discriminability assessment assures better item quality than traditional CAT systems.


A questionnaire surveyed students' subjective attitudes toward DIYexamer, and the outcome revealed that most students were interested in item DIY. The technique proposed herein is useful in general teaching, not only for improving the quality and fairness of test items but also for saving the time spent generating questions and computing scores. We recommend that DIYexamer be adopted widely in schools.


[1] C. V. Bunderson, D. K. Inouye, and J. B. Olsen, “The four generations of computerized educational measurement,” in Educational Measurement (3rd ed.), R. L. Linn, Ed. New York: American Council on Education/Macmillan, pp. 367-407, (1989).

[2] S. L. Wise and B. S. Plake, “Research on the effects of administering tests via computers,” Educational Measurement: Issues and Practice, vol. 8, no. 3, pp. 5-10, (1989).

[3] A. C. Bugbee, Examination on Demand: Findings in Ten Years of Testing by Computers 1982-1991. Edina, MN: TRO Learning, (1992).

[4] F. M. Lord, Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Erlbaum, (1980).

[5] “DIYexamer system”,


