In this chapter, we review the SCORM standard and some related work.
2.1 SCORM (Sharable Content Object Reference Model)
Among the existing standards for learning content, SCORM, proposed by the U.S. Department of Defense’s Advanced Distributed Learning (ADL) organization in 1997, is currently the most popular. The SCORM specifications are a composite of several specifications developed by international standards organizations, including the IEEE [LTSC], IMS [IMS], AICC [AICC], and ARIADNE [ARIADNE]. In a nutshell, SCORM is a set of specifications for developing, packaging, and delivering high-quality education and training materials whenever and wherever they are needed. SCORM-compliant courses leverage course development investments by ensuring that the courses are "RAID": Reusable (easily modified and used by different development tools), Accessible (can be searched and made available as needed by both learners and content developers), Interoperable (operate across a wide variety of hardware, operating systems, and web browsers), and Durable (do not require significant modifications with new versions of system software) [Jonse04].
In SCORM, a content packaging scheme is proposed to package learning objects into standard learning materials, as shown in Figure 2.1. The scheme defines a learning material package consisting of four parts: 1) Metadata, which describes the characteristics or attributes of the learning content; 2) Organizations, which describes the structure of the learning material; 3) Resources, which denotes the physical files linked by each learning object within the learning material; and 4) (Sub) Manifest, which describes a learning material that is composed of itself and other learning materials. In Figure 2.1, the organizations part defines the structure of the whole learning material. It consists of organizations, each containing an arbitrary number of tags, called items, that denote the corresponding chapters, sections, or subsections within the physical learning material. Each item, as a learning activity, can also be tagged with activity metadata, which provides descriptive information about the activity and makes it easier to discover and reuse within a content repository or similar system. Hence, based upon the concept of the learning object and the SCORM content packaging scheme, learning materials can be constructed dynamically by organizing learning objects according to learning strategies, students' learning aptitudes, and evaluation results. Individualized learning materials can thus be offered to each student, and the learning materials can be reused, shared, and recombined.
Figure 2.1: SCORM Content Packaging Scope and Corresponding Structure of Learning Materials
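The relationship between organizations, items, and resources described above can be illustrated with a small sketch. The manifest below is a hypothetical, heavily simplified example: it omits the XML namespaces and most of the elements that a real SCORM manifest declares, and the identifiers, titles, and file paths are invented for illustration. The helper outline is likewise an assumed name, not part of SCORM.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free manifest; real SCORM manifests declare XML namespaces.
MANIFEST = """\
<manifest identifier="course-1">
  <metadata><schema>ADL SCORM</schema></metadata>
  <organizations default="org-1">
    <organization identifier="org-1">
      <title>Sample Course</title>
      <item identifier="ch1">
        <title>Chapter 1</title>
        <item identifier="ch1-s1" identifierref="res-1">
          <title>Section 1.1</title>
        </item>
      </item>
      <item identifier="ch2" identifierref="res-2">
        <title>Chapter 2</title>
      </item>
    </organization>
  </organizations>
  <resources>
    <resource identifier="res-1" href="ch1/s1.html"/>
    <resource identifier="res-2" href="ch2/index.html"/>
  </resources>
</manifest>
"""


def outline(manifest_xml):
    """Return (depth, title, href) tuples for every item, in document order."""
    root = ET.fromstring(manifest_xml)
    # Map each resource identifier to the physical file it points at.
    hrefs = {r.get("identifier"): r.get("href") for r in root.iter("resource")}
    rows = []

    def walk(node, depth):
        # Visit only direct child items so that nesting depth is preserved.
        for item in node.findall("item"):
            href = hrefs.get(item.get("identifierref"))  # None if no resource is linked
            rows.append((depth, item.findtext("title"), href))
            walk(item, depth + 1)

    for org in root.iter("organization"):
        walk(org, 0)
    return rows


for depth, title, href in outline(MANIFEST):
    print("  " * depth + title, "->", href)
```

In this sketch, an item without an identifierref acts purely as a structural node (a chapter that only groups sections), while an item with one points at a physical file in the resources part.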
2.2 Document Clustering/Management
To retrieve information quickly from structured documents, Ko et al. [KC02] proposed a new index structure that integrates element-based and attribute-based structure information to represent a document. Based upon this index structure, three retrieval methods, 1) top-down, 2) bottom-up, and 3) hybrid, are proposed to retrieve information quickly from structured documents. However, although the index structure takes element and attribute information into account, it is too complex to manage for a huge number of documents.
How to efficiently manage and transfer documents over a wireless environment has become an important issue in recent years. The articles [LM+00][YL+99] point out that retransmitting the whole document after a faulty transmission is expensive. Therefore, to stream generalized XML documents efficiently over a wireless environment, Wong et al. [WC+04] proposed a fragmenting strategy, called Xstream, for flexibly managing XML documents. In the Xstream approach, the structural characteristics of XML documents are taken into account to fragment XML content into autonomous units, called Xstream Data Units (XDUs). The XML document can then be transferred incrementally over a wireless environment based upon the XDUs. However, how to create relationships between different documents and how to provide the desired content of a document have not been discussed. Moreover, the above articles do not take the SCORM standard into account.
To create and utilize relationships between different documents and to provide useful search functions, document clustering methods have been extensively investigated in several areas of text mining and information retrieval. Initially, document clustering was investigated for improving precision or recall in information retrieval systems [KK02] and as an efficient way of finding the nearest neighbors of a document [BL85]. More recently, it has been proposed for efficiently searching and browsing a collection of documents [VV+04][KK04].
To discover the relationships between documents, each document should be represented by its features, but what counts as a feature depends on the point of view. Common approaches from information retrieval focus on keywords. The assumption is that similarity in word usage indicates similarity in content. The selected words, treated as descriptive features, are then represented by a vector in which each feature is assigned one distinct dimension. This way of representing each document by a vector is called the Vector Space Model (VSM) [CK+92]. In this thesis, we also employ the VSM to encode the keywords/phrases of learning objects into vectors that represent the features of the learning objects.
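As a minimal sketch of the vector encoding described above, the following Python fragment assigns one dimension to each distinct keyword and compares the resulting vectors. The learning objects, their keywords, and the function names are hypothetical, and cosine similarity is an assumption here: it is a common way to compare VSM vectors, not necessarily the measure used in this thesis.

```python
import math
from collections import Counter


def build_vocabulary(keyword_lists):
    """Assign one vector dimension to each distinct keyword (sorted for stability)."""
    return sorted({kw for kws in keyword_lists for kw in kws})


def to_vector(keywords, vocabulary):
    """Encode a keyword list as a term-count vector over the vocabulary."""
    counts = Counter(keywords)
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical learning objects and their keywords.
objects = {
    "lo1": ["scorm", "metadata", "packaging"],
    "lo2": ["scorm", "packaging", "manifest"],
    "lo3": ["clustering", "retrieval"],
}
vocab = build_vocabulary(objects.values())
vectors = {name: to_vector(kws, vocab) for name, kws in objects.items()}

print(cosine_similarity(vectors["lo1"], vectors["lo2"]))  # shares two of three keywords
print(cosine_similarity(vectors["lo1"], vectors["lo3"]))  # shares no keywords
```

Objects that share keywords obtain a similarity closer to 1, while objects with no keyword in common obtain 0, which is the property a clustering step can build on.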
2.3 Keyword/phrase Extraction
As mentioned above, the common approach to representing documents is to give them a set of keywords/phrases, but where do those keywords/phrases come from? The most popular approach is to use the TF-IDF weighting scheme to mine keywords from the context of documents. The TF-IDF weighting scheme is based on the term frequency (TF) alone or on the term frequency combined with the inverse document frequency (TF-IDF). The formula for IDF is log(n / df), where n is the total number of documents and df is the number of documents that contain the term. By applying statistical analysis, TF-IDF can extract representative words from documents, but it requires both a sufficiently long context and a sufficient number of documents.
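The weighting can be made concrete with a short sketch that uses the IDF definition given above, log(n / df). The choice of the raw term count as TF, the function name tf_idf, and the toy documents are assumptions made for illustration; other TF normalizations are also common.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Weight = TF * IDF, with TF taken as the raw term count (an assumption)
    and IDF = log(n / df) as defined in the text.
    """
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights


# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "scorm content packaging scorm".split(),
    "document clustering content".split(),
    "keyword extraction document".split(),
]
for w in tf_idf(docs):
    # Print the highest-weighted term of each document with all weights.
    print(max(w, key=w.get), w)
```

A term that occurs in every document receives a weight of zero, since log(n / n) = 0. With only a few short documents the df statistics are unreliable, which is why the scheme needs both long contexts and many documents.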
In addition, a rule-based approach combined with fuzzy inductive learning was proposed by Shigeaki and Akihiro [SA04]. The method decomposes textual data into word sets by using lexical analysis and then discovers key phrases using key phrase relation rules learned from training data. Khor and Khan [KK01] proposed a key phrase identification scheme that employs a tagging technique to indicate the positions of potential noun phrases and uses statistical results to confirm them. With this kind of identification scheme, the number of documents does not matter. However, a sufficiently long context is still needed to extract key phrases from documents.