Related Works - Design and Implementation of a New Object-Oriented Rule-Based Knowledge Base Platform

2.1 Object-Oriented concept

Object-oriented technology provides an effective way to analyze problems.

Although this technology is independent of any programming language, various languages that adopt the idea have been designed, e.g., C++ and Smalltalk. With these language tools, users can more easily focus on the problem itself without paying too much attention to language syntax. In addition, some properties of object-oriented technology, e.g., encapsulation, inheritance, and dynamic binding, may improve the maintainability, reusability, and adaptability of software.

Most knowledge systems exploit object-oriented technology. Based on object-oriented concepts, knowledge can be divided into classes, and only the classes required for a given inference are loaded. Thus, the demand for system resources can be reduced and performance can be improved.

Knowledge representation schemes with object-oriented properties improve the maintainability of a knowledge-based system (KBS). Encapsulation means that the functions or data within a class can be accessed only through its interface. Similarly, the rules or data encapsulated in a class of knowledge are accessed through an interface. Because the details of the knowledge are hidden, this feature helps in managing a large knowledge base. Inheritance gives a knowledge base system reusability. Moreover, dynamic binding makes knowledge representation more flexible.
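The following is a minimal sketch of these three properties applied to a class of knowledge, not the platform proposed in this work. The names KnowledgeClass, NetworkKnowledge, and IntrusionKnowledge, as well as the facts and rules, are hypothetical and chosen only for illustration.

```python
class KnowledgeClass:
    """A unit of knowledge whose rules and facts sit behind a small interface."""

    def __init__(self, name):
        self.name = name
        self._facts = set()   # hidden: callers use assert_fact() instead
        self._rules = []      # hidden: list of (conditions, conclusion)

    def assert_fact(self, fact):
        self._facts.add(fact)

    def add_rule(self, conditions, conclusion):
        self._rules.append((frozenset(conditions), conclusion))

    def rules(self):
        # Subclasses may override this; infer() always calls the
        # version bound to the actual object (dynamic binding).
        return list(self._rules)

    def infer(self):
        """Fire rules until nothing new is derived; return the derived facts."""
        derived = set()
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in self.rules():
                if conditions <= self._facts and conclusion not in self._facts:
                    self._facts.add(conclusion)
                    derived.add(conclusion)
                    changed = True
        return derived


class NetworkKnowledge(KnowledgeClass):
    """General knowledge, reusable by more specific classes."""

    def __init__(self, name="network"):
        super().__init__(name)
        self.add_rule({"many_failed_logins"}, "suspicious_host")


class IntrusionKnowledge(NetworkKnowledge):
    """Inherits the parent's rule and adds a more specific one."""

    def __init__(self):
        super().__init__("intrusion")
        self.add_rule({"suspicious_host", "port_scan"}, "intrusion_alert")


kb = IntrusionKnowledge()
kb.assert_fact("many_failed_logins")
kb.assert_fact("port_scan")
result = kb.infer()
```

Callers only touch assert_fact, add_rule, and infer, so the internal storage of rules can change without affecting them, and the subclass reuses the parent's rule without restating it.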

2.2 Knowledge base maintenance

For most knowledge systems, maintaining knowledge is an important task in keeping the system working properly. For example, when new knowledge enters a knowledge system, how to combine it with existing knowledge, how to resolve conflicts and redundancies, and how to maintain modularity are problems to be considered as the system grows. Some research [LT03][TSA02] focuses on solving these issues so that the knowledge base can be maintained over time. When a knowledge system grows, the following issues should be considered:

1. Modularity: Group knowledge into proper units (classes) according to the corresponding knowledge concept; highly modularized knowledge can be managed properly.

2. Conflict: Avoid conflicts inside the knowledge; conflicting knowledge may make the result produced by a knowledge base uncertain.

3. Redundancy: Reduce redundant knowledge contained in the knowledge base; redundant knowledge can lower the performance of the knowledge base.

4. Incompleteness: Ensure that the knowledge is complete, meaning that for any given facts and problem, some result can always be obtained.

5. Complexity: Simplify the inter-relations between pieces of knowledge; complicated knowledge relationships make inference and explanation harder.

To the best of our knowledge, efficient algorithms have been proposed to deal with the conflict, redundancy, and incompleteness issues. However, a systematic approach to the modularity and complexity issues is still lacking. Analyzing the knowledge and partitioning it into a less complex, more modular structure would therefore be very helpful in knowledge system maintenance.
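To make the conflict and redundancy issues concrete, the sketch below checks a small rule set pairwise. It is a simplified illustration and not one of the algorithms from the cited work. It assumes a rule is a set of conditions plus an attribute-value conclusion, treats a rule as redundant when a more general rule reaches the same conclusion, and treats two rules as conflicting when identical conditions assign different values to the same attribute. The Rule type and the example rules are hypothetical.

```python
from itertools import combinations
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    conditions: frozenset
    attribute: str
    value: str


def find_redundant(rules):
    """Return (specific, general) name pairs where the first rule is subsumed."""
    pairs = []
    for a, b in combinations(rules, 2):
        if (a.attribute, a.value) != (b.attribute, b.value):
            continue
        if b.conditions <= a.conditions:
            pairs.append((a.name, b.name))   # b is at least as general as a
        elif a.conditions <= b.conditions:
            pairs.append((b.name, a.name))   # a is more general than b
    return pairs


def find_conflicts(rules):
    """Return name pairs with the same conditions but different values."""
    pairs = []
    for a, b in combinations(rules, 2):
        if (a.conditions == b.conditions
                and a.attribute == b.attribute
                and a.value != b.value):
            pairs.append((a.name, b.name))
    return pairs


rules = [
    Rule("r1", frozenset({"port_scan"}), "action", "alert"),
    Rule("r2", frozenset({"port_scan", "night"}), "action", "alert"),
    Rule("r3", frozenset({"port_scan"}), "action", "ignore"),
]

redundant = find_redundant(rules)
conflicts = find_conflicts(rules)
```

Here r2 is reported as redundant with respect to r1, and r1 and r3 are reported as conflicting. The pairwise check is quadratic in the number of rules, which is one reason grouping rules into smaller modules can make such maintenance checks cheaper.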

2.3 Knowledge Engineering

Knowledge Engineering is the process of structuring, preparing, formalizing, and optimizing information and knowledge. Many topics related to processing knowledge fall under Knowledge Engineering. In this work, several specific types of knowledge engineering process are involved, including Knowledge Representation, Knowledge Acquisition, Data Mining, and Knowledge Fusion.

Knowledge Representation is the way of representing and structuring knowledge in a computer-compatible data structure, together with the corresponding mechanisms for processing that data structure. There are several general types of knowledge representation, including Rules, Cases, and other special models (Decision Trees, Neural Networks, etc.). Many different approaches have been proposed to deal with these kinds of knowledge representation, and finding a good knowledge representation that takes the performance and maintenance of the knowledge into account remains a major research area in this domain.

Knowledge Acquisition (KA) is the process of extracting knowledge from experts or other knowledge sources and transferring the expertise into a well-structured form for use in knowledge-based systems. Many KA approaches have been proposed [RH03][HW03][HY02][NF02][TL99][WW99], including interviews with experts, Repertory Grids, and machine learning. The Knowledge Engineer (KE), who is responsible for executing the KA process, plays a major role in eliciting knowledge from experts and transferring it into a structured format, and the preparation done by the KE can strongly influence the KA result.

Data Mining is also a research area of knowledge engineering. Mining knowledge from huge amounts of data has become much more important in recent years, since computer systems are widely used in many different areas and therefore generate large volumes of transactions and log information. Much data mining research focuses on retrieving deep knowledge contained in massive raw data, and so the use of data mining in knowledge engineering is becoming more and more important.

Expert systems have become increasingly popular in recent years, and the knowledge for the same domain may be implemented in different expert systems. For example, many Network Intrusion Systems are implemented with knowledge base technologies, and many of them may contain knowledge for detecting the same intrusion behavior.

Research on Knowledge Fusion has been proposed to help knowledge engineers combine such knowledge. There are two kinds of approaches to the knowledge fusion problem: the hierarchical approaches and the non-hierarchical approaches. Hierarchical approaches include EPAM [FS84], COBWEB [FIS87], CLUSTER/2 [MSD81], CLUSTER/S [SM86], RESEARCHER [LEB86], CLASSIT [GLF90], LABYRINTH [TL89], AutoClass [TL91], SUBDUE [JHC00], and so on. Non-hierarchical approaches include the common subgraph approach [MG95] and the concept lattice approach [GMA95]. The common subgraph approach, based on Sowa’s conceptual graph and knowledge space [SOW84][SOW00], is efficient and accurate. The concept lattice approach provides an efficient way to perform knowledge fusion based on formal concept analysis.

2.4 Ontology

The term ontology is borrowed from philosophy, where an Ontology is a systematic account of Existence [GRU03]. In computer science, an ontology is a conceptualized data structure used in knowledge systems or artificial intelligence systems. Based on the same ontology, different systems can communicate with each other, and the knowledge inside computer systems can be structured and presented more accurately.

In recent years, due to the increasing need to incorporate domain knowledge into computer systems [HY02][NS01][KM03], much research [AS03][MS01][FF99][ERI03][VAR01][SBA04][CTL03] has been proposed to discover, represent, and use ontologies. Especially in knowledge-based systems, ontology has become a key to building a successful knowledge base; with an ontology, more meaningful and accurate knowledge content can be presented to and used by users. Thus, building the ontology for a knowledge system before developing the knowledge content helps greatly in the knowledge acquisition process.

2.5 Rule Base System

A rule is a natural knowledge representation in the form of an “IF … THEN …” structure, and the Rule Base System (RBS) is popular in real applications among expert systems. An RBS consists of two components: an inference engine and assertions. The assertions can be divided into a set of facts and a set of rules that can be fired by patterns in the facts. The inference engine, the interpreter of an RBS, uses an iterative match-select-act cycle. In the act phase of the cycle, a fired rule may modify or generate facts.
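The match-select-act cycle can be sketched as a small forward-chaining loop. This is a simplified illustration under stated assumptions, not the engine of any particular shell: facts are plain strings, a rule only adds facts, the select phase picks the highest value of a hypothetical priority field, and a rule is matched only while it would still add something new. The rule names and facts are invented for the example.

```python
def run(rules, initial_facts, max_cycles=100):
    """Run match-select-act until no rule is applicable; return facts and firing order."""
    facts = set(initial_facts)
    fired = []
    for _ in range(max_cycles):
        # Match: collect rules whose conditions hold and that add a new fact.
        conflict_set = [
            r for r in rules
            if r["if"] <= facts and not r["then"] <= facts
        ]
        if not conflict_set:
            break
        # Select: take the highest priority (the first one wins on ties).
        chosen = max(conflict_set, key=lambda r: r["priority"])
        # Act: the fired rule generates new facts.
        facts |= chosen["then"]
        fired.append(chosen["name"])
    return facts, fired


rules = [
    {"name": "r_scan", "if": {"port_scan"}, "then": {"suspicious"}, "priority": 1},
    {"name": "r_login", "if": {"failed_logins"}, "then": {"suspicious"}, "priority": 2},
    {"name": "r_alert", "if": {"suspicious"}, "then": {"alert"}, "priority": 5},
]

facts, fired = run(rules, {"port_scan", "failed_logins"})
```

In the first cycle both r_scan and r_login match and the select phase picks r_login. After it fires, r_scan no longer contributes anything new, so only r_alert fires in the second cycle and the loop then stops.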

CLIPS [CLI98], one of the most successful expert system shells, provides a feature called defmodule, which allows a knowledge base to be partitioned into modules and gives a more explicit method for controlling the execution of a system. The inference engine can run inference on each module sequentially and independently.

Knowledge from different domains can be placed in different modules created with the defmodule construct. Logically related rules and facts can be collected into one module, which improves maintenance and performance.
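The idea of partitioning rules into modules that are run one after another can be illustrated with the Python analogue below. This is not CLIPS syntax and does not reproduce how CLIPS schedules modules; the module names DETECT and RESPOND, the rules, and the helper functions are hypothetical.

```python
def run_module(rules, facts):
    """Fire one module's rules to a fixed point and return the resulting facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


def run_modules(modules, order, facts):
    """Run the named modules sequentially; only the active module's rules can fire."""
    for name in order:
        facts = run_module(modules[name], facts)
    return facts


# Each module groups logically related rules.
modules = {
    "DETECT": [(frozenset({"port_scan"}), "suspicious_host")],
    "RESPOND": [(frozenset({"suspicious_host"}), "block_host")],
}

final = run_modules(modules, ["DETECT", "RESPOND"], {"port_scan"})
only_respond = run_modules(modules, ["RESPOND"], {"port_scan"})
```

Running only RESPOND derives nothing, because the rule that would produce suspicious_host lives in DETECT. At any moment the matcher considers just one module's rules, which is the source of both the control and the performance benefit described above.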

RBS has many advantages [REI91]. The first is naturalness of expression, since experts rely on rules rather than on textbook knowledge. The second is modularity, which makes an RBS easy to construct, debug, and maintain. Restricted syntax and the ability to explain its reasoning are further advantages. Although RBS is powerful enough for many applications, it has several disadvantages in maintenance and construction, e.g., weak support for incremental construction of knowledge [LO96].

Accordingly, much research aims to integrate the object-oriented and rule-based programming paradigms to take advantage of OO technology. There are two paradigms for integrating objects and rules: incorporating rules into objects and embedding objects into rules. Knowledge objects are an integration of the object-oriented paradigm with logic rules [WU00]. Furthermore, many rule-based tools that incorporate OO technology have been developed, e.g., COOL (CLIPS Object-Oriented Language) [CLI98].
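The two integration paradigms can be contrasted in a short sketch. This is only an illustration of the distinction and does not reproduce the design of knowledge objects [WU00] or of COOL. The Host and ObjectRule classes, the threshold of five failed logins, and the rule names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Host:
    address: str
    failed_logins: int = 0
    flags: set = field(default_factory=set)

    # Paradigm 1: the rule is incorporated into the object as a method.
    def apply_own_rules(self):
        if self.failed_logins > 5:
            self.flags.add("suspicious")


# Paradigm 2: the object is embedded into a rule, whose condition
# and action are written over the object's attributes.
@dataclass
class ObjectRule:
    name: str
    condition: Callable[[Host], bool]
    action: Callable[[Host], None]


def fire(rules, objects):
    """Apply every rule to every object; return (rule name, address) for each firing."""
    fired = []
    for obj in objects:
        for rule in rules:
            if rule.condition(obj):
                rule.action(obj)
                fired.append((rule.name, obj.address))
    return fired


hosts = [Host("10.0.0.1", failed_logins=9), Host("10.0.0.2", failed_logins=1)]
for host in hosts:
    host.apply_own_rules()

block = ObjectRule(
    name="block",
    condition=lambda h: "suspicious" in h.flags,
    action=lambda h: h.flags.add("blocked"),
)
fired = fire([block], hosts)
```

In the first paradigm the rule travels with the class and is inherited with it. In the second the rule base stays separate and treats objects as structured facts to match against.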

