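Learning How Students Learn - Applications of Classification Techniques and Bayesian Networks: Semantic Tagging of Legal Documents and User Modeling for Human-Computer Interaction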

Chao-Lin Liu

National Chengchi University, Taipei 11605, Taiwan, [email protected]

Abstract

This extended abstract summarizes an exploration of how computational techniques may help educational experts identify fine-grained student models. In particular, we look for methods that help us learn how students learn composite concepts. We employ Bayesian networks for the representation of student models, and cast the problem as an instance of learning the hidden substructures of Bayesian networks. The problem is challenging because we do not have direct access to students’ competence in concepts, though we can observe students’ responses to test items that have only indirect and probabilistic relationships with the competence levels. We apply mutual information and backpropagation neural networks to this learning problem, and experimental results indicate that computational techniques can be helpful in guessing the hidden knowledge structures under some circumstances.

Summary

Behavior models of activity participants are crucial to the success of computer systems that interact with human users. When using Bayesian networks (BNs) as the language for model construction, Mislevy et al. asked where we could obtain the numbers for the conditional probability tables (CPTs) [1]. We could equally ask where we obtain the structures of the BNs in the first place. For educational practitioners, an obvious and practical answer to this question is that we should consult experts in the targeted domains, who can provide the knowledge structures, such as the prerequisite relationships between concepts, for building student and instructor models. Indeed, this is an effective approach, and the de facto one, to building computer-assisted educational software in general.

Can computers do more for student modeling than find the detailed numbers in the CPTs? More specifically, can computers assist in finding the structures of student models? Given a composite concept, say dABC, that requires knowledge about three basic concepts, say cA, cB, and cC, how can we tell how students learn dABC from cA, cB, and cC? Do students combine cA and cB into an intermediate product, dAB, and then combine dAB and cC into dABC? Or do students integrate the basic concepts directly to learn dABC?

In this exploration, we assume that students learn the composite concept from ingredient constructs that do not include overlapping basic concepts. For instance, we subjectively exclude the possibility of learning dABC from two intermediate composite concepts dAB and dBC, because they both include cB. This assumption simplifies the search space. However, the size of the search space still grows explosively with the number of basic concepts included in the target composite concept, and is related to the Stirling number of the second kind.
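The non-overlap assumption can be illustrated with a small sketch. It is only an illustration under stated assumptions: it enumerates the top-level ways of splitting the basic concepts into non-overlapping ingredients and ignores how each intermediate composite is itself learned, so it is a lower bound on the full search space. The function names are hypothetical and are not taken from the original study.

```python
def stirling2(n, k):
    """Stirling number of the second kind: ways to split n items into k non-empty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def partitions(items):
    """Yield every partition of `items` into non-empty, non-overlapping blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Place `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + smaller


def candidate_patterns(basic_concepts):
    """Top-level splits with at least two ingredients.

    The single-block partition is dropped because it is the target
    composite concept itself (an assumption of this sketch).
    """
    return [p for p in partitions(list(basic_concepts)) if len(p) >= 2]


if __name__ == "__main__":
    for pattern in candidate_patterns("ABC"):
        print(pattern)
```

For three basic concepts this yields four patterns (direct integration, plus one pattern for each of dAB, dAC, and dBC as the intermediate product), which equals S(3,2) + S(3,3) and agrees with the four candidate answers used in the experiment reported here.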

We assume that educational experts provide a set of possible ways that students may, implicitly or explicitly, employ to learn the composite concept, and our job is to help experts identify which of these learning patterns is the most likely answer. Hence, the process of learning how students learn begins with the acquisition of a set of candidate answers. We use the set of candidate learning patterns to build BNs for simulating possible student behaviors, and employ the simulated data to train backpropagation neural networks (BPNs). The learned BPNs can then be used to classify the unobservable learning pattern, based on students’ item responses, into one of the candidate answers.

Following the steps of many researchers who explored methodologies for building computer-assisted tutoring systems, we employ simulated students in this study. Simulated students were generated from Liu’s simulation system, which considers the probabilistic relationships between students’ responses to test items and students’ competence levels in concepts [2]. The degree of uncertainty in the relationship between these two factors was controlled by a parameter called fuzziness. We set fuzziness to a larger value when we simulated a more uncertain relationship between responses to items and competence in concepts. The other parameter, named groupInfluence, affected the uncertain relationship between the students’ actual behaviors and students’ stereotypical behaviors. We set groupInfluence to a larger value to make students more likely to deviate from their typical behaviors. In short, it became harder to guess the real mental states of a student when either fuzziness or groupInfluence was set to a larger value in the simulation.
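The role of the two parameters can be made concrete with a toy generator. This is a hypothetical sketch, not Liu's simulation system [2]: it assumes binary competence, that a student deviates from the group stereotype with probability groupInfluence, and that a response contradicts the student's competence with probability fuzziness. All names and the noise model itself are assumptions made for illustration.

```python
import random


def simulate_student(stereotype, fuzziness, group_influence,
                     items_per_concept=3, rng=random):
    """Return (hidden competence, observable item responses) for one student.

    stereotype: dict mapping concept name -> bool, the typical competence
    of the student's group (hypothetical representation).
    """
    competence = {}
    responses = {}
    for concept, typical in stereotype.items():
        # Assumed model: the student deviates from the stereotype
        # with probability group_influence.
        deviates = rng.random() < group_influence
        competent = (not typical) if deviates else typical
        competence[concept] = competent
        # Assumed model: a response contradicts competence
        # with probability fuzziness.
        p_correct = 1.0 - fuzziness if competent else fuzziness
        responses[concept] = [rng.random() < p_correct
                              for _ in range(items_per_concept)]
    return competence, responses


if __name__ == "__main__":
    rng = random.Random(0)
    group = {"cA": True, "cB": True, "cC": False}
    hidden, observed = simulate_student(group, 0.10, 0.05, rng=rng)
    print(hidden)    # not available to the learner
    print(observed)  # the only data the learner sees
```

Under this toy model, setting both parameters to zero makes the responses a deterministic function of the stereotype, and raising either one blurs the link between what is observed and the hidden competence, which is the qualitative effect described above.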

Students’ responses to test items and students’ competence levels were represented with different, though directly connected, nodes in the BNs that were used to generate simulated students. States of nodes that represented competence levels in concepts were not observable, and only states of nodes that represented correctness of item responses were accessible. Hence, our job was to guess the substructure of the unobservable nodes based on data that had only indirect and probabilistic relationships with the true answers. For this reason, known algorithms for learning structures of Bayesian networks, such as the PC algorithm implemented in Hugin, were not directly viable for this learning problem.

We employed estimated mutual information (EMI) for comparing the candidate solutions. If students learn dABC from dAB and cC rather than from cA and dBC, the EMI between the nodes for both dAB and cC and the node for dABC may be larger than the EMI between the nodes for both cA and dBC and the node for dABC. (In this case, EMI(dAB, cC|dABC) is expected to be larger than EMI(cA, dBC|dABC).) Namely, we used the EMI to represent the merits of a competing substructure. We had to estimate the mutual information between two sets of nodes, since we did not have direct access to the states of the nodes that represented concepts. We estimated the state for the node that represented a concept with the percentage of correct responses to test items designed for the concept, and used the estimated states of nodes to calculate the EMIs. In addition to the EMIs for all competing substructures, we introduced ratios between the EMIs for training the BPNs. Experience indicated that ratios between the EMIs, e.g., the ratios between the EMIs and the largest EMI, were useful for improving the prediction quality of the trained BPNs.
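The EMI computation described above might be sketched as follows. This is a minimal illustration, assuming a plug-in (frequency-count) estimator of mutual information over the discretized fraction-correct states; the original estimator may differ in detail, and the function names and data layout are hypothetical.

```python
from collections import Counter
from math import log2


def estimated_state(item_responses):
    """Estimate a concept node's state as the fraction of correct item responses."""
    return sum(item_responses) / len(item_responses)


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two aligned discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def emi(students, parents, child):
    """EMI between the joint estimated states of `parents` and the state of `child`.

    students: list of dicts mapping concept name -> list of bool responses.
    """
    xs = [tuple(estimated_state(s[p]) for p in parents) for s in students]
    ys = [estimated_state(s[child]) for s in students]
    return mutual_information(xs, ys)


def emi_features(emis):
    """Feature vector: the EMIs followed by their ratios to the largest EMI."""
    largest = max(emis)
    ratios = [e / largest if largest > 0 else 0.0 for e in emis]
    return list(emis) + ratios
```

A feature vector for one group of students would then be built by calling `emi` once per candidate pattern, for example `emi(students, ["dAB", "cC"], "dABC")` and `emi(students, ["cA", "dBC"], "dABC")`, and passing the resulting list to `emi_features` before feeding it to the classifier.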

We tested the proposed procedure for guessing how students learn dABC. There were four possible answers. We randomly sampled 500 network instances that had different underlying joint probability distributions for each of these four answers, and simulated item responses of 10000 students that were generated from these 2000 (= 4 × 500) networks. Each simulated student responded to three items for each of seven concepts, i.e., cA, cB, cC, dAB, dAC, dBC, and dABC, and each response was either correct or incorrect. We calculated the EMIs and their ratios for each network instance for training BPNs, so we trained the BPNs with 2000 training instances. We then applied the trained BPNs to predict the learning patterns of 400 groups of students, 100 groups generated for each of the four answers. We repeated the above procedure for 36 combinations of fuzziness and groupInfluence, each ranging from 0.05 to 0.30. The figure on this page shows the results. The horizontal axis shows the decimal part of fuzziness, the legend shows the values of groupInfluence, and the vertical axis shows the percentage of correct identification of hidden structures in 400 test cases. The results suggest that it is possible to identify the hidden structure better than 80 percent of the time, if fuzziness and groupInfluence are not large and if the educational experts’ guess list does include the correct structure.

[Figure: accuracy (vertical axis, 0.75 to 1) against fuzziness (horizontal axis), with one curve per groupInfluence value from .05 to .30.]
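The experimental grid and the accuracy measure can be written down in a few lines. This sketch assumes that each parameter took the six values 0.05, 0.10, ..., 0.30 shown in the figure legend, which gives the 36 combinations; the helper names are hypothetical.

```python
from itertools import product

# Assumed levels for both fuzziness and groupInfluence: 0.05, 0.10, ..., 0.30.
LEVELS = [i / 20 for i in range(1, 7)]


def parameter_grid(levels=LEVELS):
    """All (fuzziness, groupInfluence) pairs evaluated in one sweep."""
    return list(product(levels, repeat=2))


def accuracy(predicted, actual):
    """Fraction of test groups whose hidden learning pattern was identified correctly."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    print(len(parameter_grid()))  # number of parameter combinations
```

Each grid point would correspond to one training run on 2000 instances and one accuracy value computed over the 400 test groups.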

Do we really need student models of better quality? Experimental results reported by Carmona et al. suggested that student models of higher quality could help us improve the effectiveness of computerized adaptive tests [3]. Hence, we hope results outlined in this extended abstract can be useful. We have expanded our experiments to cases where we learned how students learn composite concepts that included four basic concepts [4]. The accuracy remained above 75% in unfavorable conditions. We thank reviewers for their invaluable comments on the original manuscript. This work was partially supported by the research contract 94-2213-E-004-008 of National Science Council of Taiwan.

References

1. Mislevy, R.J., Almond, R.G., Yan, D., Steinberg, L.S.: Bayes nets in educational assessment: Where do the numbers come from? 15th UAI (1999) 437–446

2. Liu, C.L.: Using mutual information for adaptive item comparison and student assessment. J. of Educational Technology & Society 8(4) (2005) 100–119

3. Carmona, C., Millán, E., Pérez-de-la-Cruz, J.L., Trella, M., Conejo, R.: Introducing prerequisite relations in a multi-layered Bayesian student model. Lecture Notes in Computer Science 3538 (2005) 347–356

4. Liu, C.L., Wang, Y.T.: An experience in learning about learning composite concepts. 6th IEEE ICALT (2006) to appear
