
Decision-Logic-Style Formalisms and Their Applications in Knowledge Representation (決策邏輯型機制及其在知識表徵中之應用)


Academic year: 2021


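National Chiao Tung University

Institute of Information Management

Doctoral Dissertation

July 2008 (year 97 of the Republic of China calendar)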


A Dissertation

Submitted to Institute of Information Management College of Management

National Chiao Tung University in Partial Fulfillment of the Requirements

for the Degree of

Doctor of Philosophy in Information Management July 2008

Hsinchu, Taiwan, the Republic of China


Abstract

In recent years, knowledge discovery in databases (KDD) and its kernel, data mining, have received more and more attention for practical applications. While the mainstream research of data mining concentrates on the design of efficient algorithms for extracting knowledge from databases, the question of how to close the semantic gap between structured data and human-comprehensible concepts has been a lasting challenge for the research community. Since the discovered knowledge is useful for a human user only when the user can understand its meaning, the representation formalism plays an important role during the knowledge management life cycle.

In this dissertation, we investigate several extensions of decision logic (DL) from the perspective of rough set theory. Traditionally, DL has been considered as a standard way of knowledge representation for rough set-based data mining, whereas our extensions show that DL-styled logics are also useful in more complicated knowledge management tasks.

On the one hand, we propose some decision logic languages for rule representation in rough set-based multicriteria decision analysis. The semantic models of these logics are data tables representing multicriteria decision records. Each decision record is described by a finite set of criteria/attributes. The domains of the criteria may have ordinal properties expressing preference scales, while the domains of the attributes may not.

On the other hand, we propose an arrow decision logic (ADL) to represent and reason about knowledge discovered from relational information systems (RIS). The logic combines the main features of decision logic (DL) and arrow logic (AL). AL is the basic modal logic of arrows. ADL formulas are interpreted in RIS which not only specifies the properties of objects, but also the relationships between objects. We present a complete axiomatization of ADL and discuss its application to knowledge representation in multicriteria decision analysis and social network analysis.

Our work is particularly useful for the knowledge representation phase in the knowledge management life cycle. A realistic scenario about human resource management is used to show how the proposed logics can serve as representational formalisms in different stages of the recruitment process and team formation process of a company.


Contents

1 Introduction
1.1 Outline of the Thesis
1.2 The Main Theme and Contributions

2 Information Technology and Knowledge Management
2.1 Rough Set Theory
2.1.1 Approximation space
2.1.2 Data tables and decision logic
2.1.3 The connection
2.2 Logical Preliminary
2.2.1 Propositional logic
2.2.2 Modal logic
2.2.3 Arrow logic (AL)
2.3 Knowledge Management
2.3.1 Knowledge management process
2.3.2 Logic-based knowledge management
2.3.3 A running scenario
2.4 Related Work

3 Decision Logics for Multicriteria Decision Analysis
3.1 Preference-ordered Data Tables
3.1.1 The running scenario
3.2 Preference-ordered Uncertain Data Tables
3.2.1 The running scenario
3.3 Preference-ordered Fuzzy Data Tables
3.3.1 The running scenario
3.4 Pairwise Comparison Decision Logic
3.4.1 The running scenario
3.5 Summary and Remarks

4 Arrow Decision Logic for Relational Information Systems
4.1 Information Systems
4.1.1 Functional and relational information systems
4.1.2 Relationship between information systems
4.3 Arrow Decision Logic
4.3.1 Syntax and semantics of ADL
4.3.2 Axiomatization
4.4 Discussion and Applications
4.4.1 Discussion
4.4.2 An application of ADL to MCDM
4.4.3 An application of ADL to the representation of attribute dependency
4.4.4 The running scenario
4.5 Summary and Remarks

5 Conclusion
5.1 The Contribution of the Work
5.2 Remarks
5.2.1 A philosophical consideration
5.2.2 Hypothesis selection


List of Figures

1.1 An interface is needed between the KDD and AI systems
1.2 The organization of the paper
2.1 An arrow model
2.2 The knowledge management life cycle
2.3 The preliminary classification phase of the recruitment process
3.1 The short-term performance prediction phase of the recruitment process
3.2 The long-term performance prediction phase of the recruitment process
3.3 The membership functions of fuzzy sets
3.4 The project proposal test phase of the recruitment process
3.5 The comparison of possible scores s(x) and s(y)
3.6 The final decision phase of the recruitment process
4.1 An ADL model
4.2 The axiomatic system for ADL
4.3 The MCDA and MCDM phases
4.4 The process of team formation


List of Tables

2.1 A summary of reviewers' reports for 10 papers
3.1 A POUDT for route selection
3.2 A data table for project evaluation


Chapter 1

Introduction

Knowledge management (KM) is a discipline concerned with theories, tools and methodologies that can help individuals, groups and organizations manage the knowledge they use on a daily basis. The theory of knowledge has long been an important topic in many academic disciplines, such as philosophy, psychology, economics, and artificial intelligence, whereas the storage and retrieval of data is the main concern of information science. From a KM perspective, knowledge is usually acquired from observed data, which is a valuable resource for researchers and decision-makers. However, when the amount of data is large, it is difficult to analyze the data and extract knowledge from it. With the aid of computers, the vast amount of data stored in relational data tables can be transformed into symbolic knowledge automatically. Thus, intelligent data analysis has received a great deal of attention of KM researchers in recent years.

In particular, knowledge discovery in databases (KDD) and its kernel, data mining, have received more and more attention for practical applications. While the mainstream research of data mining concentrates on the design of efficient algorithms for extracting knowledge from databases, the question of how to close the semantic gap between structured data and human-comprehensible concepts has been a lasting challenge for the research community [50]. This is called the interpretability problem of intelligent data analysis in [50]. Since the discovered knowledge is useful for a human user only when the user can understand its meaning, the representation formalism plays an important role during the KM life cycle. As mentioned in [72], the key open problem of knowledge organization in the KM life cycle is “knowledge representation using a universal language that may allow multiple channel and experience support”.

Many different forms of knowledge have been considered by KDD researchers, notably association rules and sequential patterns [1, 2]. However, it is in general difficult to integrate the discovered patterns with traditional AI systems. The main reason is that the inference engine of an AI system usually employs a logic-based knowledge representation, which is quite different from the specialized patterns discovered by a fixed data mining algorithm. Therefore, a uniform interface between the discovery and the utilization of knowledge is urgently needed. The interface transforms the discovered patterns into knowledge expressed in the logical formalism employed by the AI system (Figure 1.1). From the KM viewpoint, the KDD process is concerned with knowledge creation, whereas the AI system is designed for knowledge application. Thus, a smooth interface between the two will improve the coherence of the whole KM process.

The advantages of the logic-based representation for data mining have also been observed in the past [26].

[Figure 1.1 (diagram): the KDD system (knowledge creation) turns raw data into mined patterns through data mining; the AI system (knowledge application) consists of a logic-based KB, an inference engine, and a problem solver; an interface connects the two systems.]

Figure 1.1: An interface is needed between the KDD and AI systems

. . . a coherent formalism, capable of dealing uniformly with induced knowledge and background, or domain, knowledge, would represent a breakthrough in the design and development of decision support systems, in diverse application domains. The advantages of such an integrated formalism are, in principle:

• a high degree of expressiveness in specifying expert rules, or business rules;

• the ability to formalize the overall KDD process, thus tailoring a methodology to a specific class of applications;

• the separation of concerns between the specification level and the mapping to the underlying databases and data mining tools.

The rough set theory proposed by Pawlak provides an effective tool for extracting knowledge from data tables [67]. In fact, many powerful data mining algorithms have been proposed based on the rough set theory (for example, see papers in [75, 76, 68] for some recent progress). To represent and reason about the extracted knowledge, a decision logic (DL) is also proposed in [67]. The semantics of the logic is defined in a Tarskian style through the notions of models and satisfaction.

DL is a good candidate to serve as the bridge between the KDD and AI systems for two reasons. On the one hand, the data mining algorithms based on rough set theory usually extract rules that can be easily represented in the syntactic form of the DL language. On the other hand, the semantic similarity between DL and classical logic makes it easier to integrate the mined results into knowledge-based systems.

Although DL can faithfully represent knowledge discovered from classical data tables, it is inadequate for more advanced data mining tasks, such as uncertainty management or multi-criteria decision analysis. To address these issues, this work investigates different extensions of DL that can represent more complicated forms of knowledge. Though we can envision a powerful universal logic that integrates all features of the extended DL languages, it is impractical to devise a logic that covers all possible forms of knowledge. Thus, our strategy is to develop a particular extended DL for each individual aspect of the knowledge representation requirements of different data mining tasks, while keeping it easy to combine these extended logics modularly.

1.1 Outline of the Thesis

One key step in knowledge management is the transformation of data and information into knowledge. Data mining can play an important role in such a transformation process. To set up the foundations of the dissertation, we review rough set theory, which is an effective method of data mining; propositional and modal logics, which are basic knowledge representation formalisms; and the theory of knowledge management, which provides an appropriate context for our work.

In classical data tables, no relationship is assumed between the values in the domain of an attribute. However, in many cases, in particular when the attribute is a criterion for decision making, there exist preference relations on the domain. When rough set theory is applied to multi-criteria decision analysis (MCDA), it is crucial that preference-ordered attribute domains and decision classes be dealt with [29, 30, 31, 33, 34, 35, 36, 82]. The original rough set theory cannot handle inconsistencies arising from violations of the dominance principle due to its use of the indiscernibility relation. Therefore, in the above-mentioned work, the indiscernibility relation is replaced by a dominance relation to solve the multi-criteria sorting problem, and the data table is replaced by a pairwise comparison table to solve multi-criteria choice and ranking problems. The approach is called the dominance-based rough set approach (DRSA). For MCDA problems, DRSA can induce a set of decision rules from sample decisions provided by decision-makers. The induced rules form a comprehensive preference model and can provide recommendations about a new decision-making environment. The objective of Chapter 3 is to investigate DL for DRSA.

In addition to preference relations, general relations may exist on the domains of attribute values. These relations may induce complicated relationships between the objects possessing the attributes. For example, a social network is a set of objects with complicated interaction relations. To model this kind of data, we must consider relational information systems (RIS). In data tables, each attribute is actually a function from the set of objects to the domain of attribute values, so a data table is also called a functional information system (FIS). In contrast with an FIS, an attribute in an RIS is considered as a function from pairs of objects to the domain of relational indicators. The objective of Chapter 4 is to develop a logic for relational information systems.

The materials in this thesis are mainly drawn from some of our previously published papers. Chapter 3 is a slightly adapted version of [21]. Chapter 4 is a fusion of [20] and [18]. Finally, Chapter 5 contains partially the contents of [17] and [19]. The organization of the thesis is shown in Figure 1.2.

[Figure 1.2 (diagram): three layers. Foundation: decision logic, arrow logic, and dominance-based rough set analysis. Contribution: logic for MCDA and arrow decision logic. Theme: knowledge representation.]

Figure 1.2: The organization of the paper

1.2 The Main Theme and Contributions

From Figure 1.2, we can see that the main theme of our work is to design knowledge representation formalisms in a uniform style. The foundations of our knowledge representation formalisms include decision logic, arrow logic, and (dominance-based) rough set analysis. As a consequence, we have proposed different extensions of decision logics that can accommodate knowledge representation requirements in different knowledge management tasks. Thus, the main contributions of the thesis are the development of the following extensions of decision logics for representing different forms of knowledge:

• PODL: for knowledge from the data of preference-ordered domains (e.g., evaluation data),

• POUDL: for knowledge discovered from data tables where data is uncertain or information is incomplete,

• POFDL: for knowledge discovered from data tables where data is qualitative and imprecise,

• PCDL: for knowledge discovered from pairwise comparisons of objects, and

• ADL: for knowledge discovered from relational information systems (e.g., social network data).


Chapter 2

Information Technology and Knowledge Management

In this chapter, we introduce the necessary background knowledge for our work. We also discuss related work in the last section.

2.1 Rough Set Theory

2.1.1 Approximation space

The basic construct of rough set theory is an approximation space. An approximation space is defined as a pair $(U, R)$, where $U$ is the universe and $R \subseteq U \times U$ is an equivalence relation on $U$. Recall that a binary relation $R$ is an equivalence relation if it is reflexive (i.e., $(x, x) \in R$ for all $x \in U$), symmetric (i.e., for all $x, y \in U$, if $(x, y) \in R$, then $(y, x) \in R$), and transitive (i.e., for all $x, y, z \in U$, if $(x, y) \in R$ and $(y, z) \in R$, then $(x, z) \in R$). An equivalence relation partitions the universe $U$ into a family of equivalence classes so that each element of $U$ belongs to exactly one of these equivalence classes. In other words, there exist $U_1, U_2, \ldots, U_k \subseteq U$ such that $U = \bigcup_{i=1}^{k} U_i$, $U_i \cap U_j = \emptyset$ for $i \neq j$, and for $x, y \in U$, $(x, y) \in R$ if and only if (iff) there exists $i$ such that both $x$ and $y$ belong to $U_i$. Thus, we can write an equivalence class of $R$ as $[x]_R$ if it contains the element $x$. Note that $[x]_R = [y]_R$ iff $(x, y) \in R$.

According to Pawlak's insight, knowledge is deep-seated in the classification capabilities of human beings. A classification is simply a partition of the universe, so an approximation space constitutes the basic knowledge about the objects in the universe. In philosophy, the extension of a concept is defined as the set of objects that are instances of the concept. For example, the extension of the concept “bird” is simply the set of all birds in the universe. Pawlak identified a concept with its extension. Thus, a subset of the universe is called a concept or a category in rough set theory.

Given an approximation space $(U, R)$, each equivalence class of $R$ is called an $R$-basic category or $R$-basic concept, and any union of $R$-basic categories is called an $R$-category. Now, for an arbitrary concept $X \subseteq U$, we are interested in the definability of $X$ by using $R$-basic categories. We say that $X$ is $R$-definable if $X$ is an $R$-category; otherwise $X$ is $R$-undefinable. The $R$-definable concepts are also called $R$-exact sets, whereas $R$-undefinable concepts are said to be $R$-inexact or $R$-rough. When the approximation space is clear from the context, we simply omit the qualifier $R$ and call a set an exact set or a rough set.

A rough set can be approximated from below and from above by two exact sets. The lower approximation and the upper approximation of $X$ are denoted by $\underline{R}X$ and $\overline{R}X$ respectively, and are defined as follows:

$\underline{R}X = \{x \in U \mid [x]_R \subseteq X\}$,

$\overline{R}X = \{x \in U \mid [x]_R \cap X \neq \emptyset\}$.
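The two approximations can be computed directly from the partition induced by the equivalence relation. The following is a minimal Python sketch, not taken from the thesis: the function names and the toy universe (integers 0 to 9, equivalent when they agree modulo 3) are assumptions chosen purely for illustration.

```python
def equivalence_classes(universe, key):
    """Partition `universe` into classes of elements that share the same key."""
    classes = {}
    for x in universe:
        classes.setdefault(key(x), set()).add(x)
    return list(classes.values())


def lower_approximation(classes, concept):
    """Union of the equivalence classes entirely contained in `concept`."""
    return set().union(*(c for c in classes if c <= concept))


def upper_approximation(classes, concept):
    """Union of the equivalence classes that intersect `concept`."""
    return set().union(*(c for c in classes if c & concept))


# Hypothetical example: x and y are equivalent when x % 3 == y % 3.
U = set(range(10))
classes = equivalence_classes(U, key=lambda x: x % 3)

X = {0, 1, 3, 6, 9}                      # a rough (inexact) concept
lower = lower_approximation(classes, X)  # only the class {0, 3, 6, 9} fits inside X
upper = upper_approximation(classes, X)  # the class {1, 4, 7} also meets X
```

A concept is exact in this sketch precisely when `lower == upper`; here the two differ, so `X` is rough.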

2.1.2 Data tables and decision logic

In data mining problems, data is usually provided in the form of data tables (DT). A formal definition of a data table is given in [67].

Definition 2.1 A data table¹ is a tuple

$T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$,

where $U$ is a nonempty finite set, called the universe; $A$ is a nonempty finite set of primitive attributes; for each $i \in A$, $V_i$ is the domain of values for $i$; and for each $i \in A$, $f_i : U \to V_i$ is a total function.

Given a data table $T$, we denote its universe $U$ and attribute set $A$ by $\mathit{Uni}(T)$ and $\mathit{Att}(T)$ respectively. An attribute in $A$ is usually denoted by the lower-case letters $i$ or $a$.

In [67], a decision logic (DL) is proposed for the representation of knowledge discovered from data tables. It is called decision logic because it is particularly useful in a special kind of data table, called a decision table.² A decision table is a data table $T = (U, C \cup D, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$, where $\mathit{Att}(T)$ can be partitioned into two sets, $C$ and $D$, called condition attributes and decision attributes respectively. Decision rules relating the condition and the decision attributes can be derived from the table by data analysis. A rule is then represented as an implication between the formulas of the logic.

The basic alphabet of a DL consists of a finite set of attribute symbols A, and a finite set of value symbols Vi for i ∈ A. The syntax of DL is then defined as follows.

Definition 2.2

1. An atomic formula of DL is a descriptor (i, v), where i ∈ A and v ∈ Vi.

2. The set of DL well-formed formulas (wff ) is the smallest set containing the atomic formulas and closed under the Boolean connectives ¬, ∧, and ∨.

3. If ϕ and ψ are wffs of DL, then ϕ −→ ψ is a rule in DL, where ϕ is called the antecedent of the rule and ψ the consequent.

¹ Also called knowledge representation systems, information systems, or attribute-value systems.

² Note that for a general data table, the abbreviation DL can also be used to denote data logic.


A data table $T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$ relates to a given DL if there is a bijection $\tau$ from the attribute symbols of the DL to the attributes $A$ of $T$ such that, for every attribute symbol $a$, $V_{\tau(a)} = V_a$. Thus, by somewhat abusing the notation, we usually denote an atomic formula as $(i, v)$, where $i \in A$ and $v \in V_i$, if the data tables are clear from the context. Intuitively, each element in the universe of a data table corresponds to a data record, and an atomic formula (which is in fact an attribute-value pair) describes the value of some attribute in the data record. Thus, the atomic formulas (and therefore the wffs) can be satisfied or not with respect to each data record. This generates a satisfaction relation between the universe and the set of wffs.

Definition 2.3 Given a DL and a data table $T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$ relating to it, the satisfaction relation $\models_T$ between $U$ and the wffs of the DL is defined inductively as follows (the subscript $T$ is omitted for brevity).

1. $x \models (i, v)$ iff $f_i(x) = v$,

2. $x \models \neg\varphi$ iff $x \not\models \varphi$,

3. $x \models \varphi \wedge \psi$ iff $x \models \varphi$ and $x \models \psi$,

4. $x \models \varphi \vee \psi$ iff $x \models \varphi$ or $x \models \psi$.

If $\varphi$ is a DL wff, the set $m_T(\varphi)$ defined by

$m_T(\varphi) = \{x \in U \mid x \models \varphi\}$ (2.1)

is called the meaning set of the formula $\varphi$ in $T$. If $T$ is understood, we simply write $m(\varphi)$. The notations $T, x \models \varphi$ and $x \models_T \varphi$ are used interchangeably when the data table $T$ must be made explicit.

A formula $\varphi$ is said to be valid in a data table $T$ (written as $\models_T \varphi$, or $\models \varphi$ for short when $T$ is clear from the context) if and only if $m(\varphi) = U$. That is, $\varphi$ is satisfied by all individuals in the universe. Also, $\varphi$ is said to be satisfiable in a data table $T$ if $m(\varphi) \neq \emptyset$.

A DL wff states the properties of individuals in the universe; therefore, it is satisfied by some individuals, but not by the others. However, the mined knowledge usually relates to the aggregated or statistical information of all individuals. Obviously, wffs that are valid in a data table represent a kind of knowledge that can be induced from the table, since they hold for all individuals. However, not all kinds of useful information are in the form of valid wffs. Sometimes, even probabilistic rules are very useful from the viewpoint of knowledge discovery. To quantify the usefulness of the mined rules, some measures have been proposed [95, 93]. The most common measures are support and confidence.

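Definition 2.4 Let $\Phi_1$ be the set of all DL rules and $T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$ be a data table. Then:

1. the rule $\varphi \longrightarrow \psi$ is valid in $T$ iff $m_T(\varphi) \subseteq m_T(\psi)$;

2. the absolute support function $\mathrm{asp}_T : \Phi_1 \to \mathbb{N}$ is

$\mathrm{asp}_T(\varphi \longrightarrow \psi) = |m_T(\varphi \wedge \psi)|$;

3. the relative support function $\mathrm{rsp}_T : \Phi_1 \to [0, 1]$ is

$\mathrm{rsp}_T(\varphi \longrightarrow \psi) = \dfrac{|m_T(\varphi \wedge \psi)|}{|U|}$; and

4. the confidence function $\mathrm{cfd}_T : \Phi_1 \to [0, 1]$ is

$\mathrm{cfd}_T(\varphi \longrightarrow \psi) = \dfrac{|m_T(\varphi \wedge \psi)|}{|m_T(\varphi)|}$.

Table 2.1: A summary of reviewers' reports for 10 papers

U \ A    o    p    t    d
1        4    4    3    4
2        3    2    3    3
3        4    3    2    3
4        2    2    2    2
5        2    1    2    1
6        3    1    2    1
7        3    2    2    2
8        4    1    2    2
9        3    3    2    3
10       4    3    3    3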

Example 2.1 Let us use an example to illustrate the concepts introduced in this section. Assume that Table 2.1 is a summary of reviewers' reports for ten papers submitted to a journal. The table details ten papers evaluated by means of four attributes:

• o: originality,

• p: presentation,

• t: technical soundness, and

• d: overall evaluation (the decision attribute).

By Definition 2.1, the components of the data table are:

$U = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$, $A = \{o, p, t, d\}$,

$V_o = V_p = V_t = \{$1 (poor), 2 (fair), 3 (good), 4 (excellent)$\}$,

$V_d = \{$1 (reject), 2 (major revision), 3 (minor revision), 4 (accept)$\}$,

and the functions $f_i$ are given by the entries of Table 2.1.

Thus, we have atomic formulas like $(o, 4)$, $(p, 1)$, and $(t, 2)$; and formulas like $(o, 4) \wedge (p, 3)$ and $\neg(p, 1) \vee \neg(t, 1)$. The rule $r = (o, 3) \wedge ((p, 3) \vee (t, 3)) \longrightarrow (d, 3)$ is valid, since $m((o, 3) \wedge ((p, 3) \vee (t, 3))) = \{2, 9\} \subseteq m((d, 3)) = \{2, 3, 9, 10\}$. Furthermore, we have $\mathrm{asp}(r) = 2$, $\mathrm{rsp}(r) = 1/5$, and $\mathrm{cfd}(r) = 1$.
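The computations in Example 2.1 can be reproduced mechanically. The sketch below is an illustration only: the nested-tuple encoding of formulas and all function names are assumptions, and absolute support is taken to be $|m_T(\varphi \wedge \psi)|$, which agrees with the value asp(r) = 2 reported in the example. The data are the entries of Table 2.1.

```python
from fractions import Fraction

# Table 2.1: paper id -> values of the attributes o, p, t, d.
TABLE = {
    1: dict(o=4, p=4, t=3, d=4),
    2: dict(o=3, p=2, t=3, d=3),
    3: dict(o=4, p=3, t=2, d=3),
    4: dict(o=2, p=2, t=2, d=2),
    5: dict(o=2, p=1, t=2, d=1),
    6: dict(o=3, p=1, t=2, d=1),
    7: dict(o=3, p=2, t=2, d=2),
    8: dict(o=4, p=1, t=2, d=2),
    9: dict(o=3, p=3, t=2, d=3),
    10: dict(o=4, p=3, t=3, d=3),
}


def satisfies(x, formula, table=TABLE):
    """Satisfaction relation of Definition 2.3 on nested-tuple formulas."""
    op = formula[0]
    if op == "atom":                      # descriptor (i, v)
        _, attribute, value = formula
        return table[x][attribute] == value
    if op == "not":
        return not satisfies(x, formula[1], table)
    if op == "and":
        return satisfies(x, formula[1], table) and satisfies(x, formula[2], table)
    if op == "or":
        return satisfies(x, formula[1], table) or satisfies(x, formula[2], table)
    raise ValueError(f"unknown connective: {op!r}")


def meaning(formula, table=TABLE):
    """Meaning set m_T(formula): all objects that satisfy the formula."""
    return {x for x in table if satisfies(x, formula, table)}


def absolute_support(antecedent, consequent, table=TABLE):
    return len(meaning(("and", antecedent, consequent), table))


def relative_support(antecedent, consequent, table=TABLE):
    return Fraction(absolute_support(antecedent, consequent, table), len(table))


def confidence(antecedent, consequent, table=TABLE):
    # Raises ZeroDivisionError when no object satisfies the antecedent.
    return Fraction(absolute_support(antecedent, consequent, table),
                    len(meaning(antecedent, table)))


# The rule r = (o,3) AND ((p,3) OR (t,3)) --> (d,3) from Example 2.1.
antecedent = ("and", ("atom", "o", 3), ("or", ("atom", "p", 3), ("atom", "t", 3)))
consequent = ("atom", "d", 3)
rule_is_valid = meaning(antecedent) <= meaning(consequent)
```

Exact fractions are used so that the relative support comes out as 1/5 rather than a rounded float.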

2.1.3 The connection

While an approximation space is an abstract framework to represent classification knowledge, it can be easily derived from a concrete data table. Let $T = (U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$ be a data table and $B \subseteq A$ be a subset of attributes. Then we can define an equivalence relation, called the indiscernibility relation based on $B$, as

$\mathrm{ind}(B) = \{(x, y) \mid x, y \in U,\ f_i(x) = f_i(y)\ \forall i \in B\}$.

In other words, $x$ and $y$ are $B$-indiscernible if they have the same values with respect to all attributes in $B$. Consequently, for each $B \subseteq A$, $(U, \mathrm{ind}(B))$ is an approximation space.

In terms of DL, each equivalence class of $\mathrm{ind}(B)$ is characterized by a DL formula $\bigwedge_{i \in B}(i, v_i)$, and any formula $\varphi$ of DL can be considered as a concept $m_T(\varphi)$. Then, the equivalence class is a subset of the lower (resp. upper) approximation of the concept if the rule $\bigwedge_{i \in B}(i, v_i) \longrightarrow \varphi$ is valid (resp. the formula $\bigwedge_{i \in B}(i, v_i) \wedge \varphi$ is satisfiable).
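As a hedged illustration of this connection, the sketch below groups the papers of Table 2.1 by their values on an attribute subset B and then approximates the concept m((d, 3)). The choice B = {o, t}, the tuple layout of the rows, and the function name `ind_classes` are assumptions made for this example only.

```python
# Rows of Table 2.1 as (o, p, t, d), keyed by paper id.
ROWS = {
    1: (4, 4, 3, 4), 2: (3, 2, 3, 3), 3: (4, 3, 2, 3), 4: (2, 2, 2, 2),
    5: (2, 1, 2, 1), 6: (3, 1, 2, 1), 7: (3, 2, 2, 2), 8: (4, 1, 2, 2),
    9: (3, 3, 2, 3), 10: (4, 3, 3, 3),
}
COLUMN = {"o": 0, "p": 1, "t": 2, "d": 3}


def ind_classes(rows, attributes):
    """Equivalence classes of ind(B), keyed by the shared values on B.

    Each key (v_i for i in B) plays the role of the conjunction of
    descriptors (i, v_i) that characterizes the class.
    """
    classes = {}
    for x, row in rows.items():
        signature = tuple(row[COLUMN[i]] for i in attributes)
        classes.setdefault(signature, set()).add(x)
    return classes


B = ("o", "t")
classes = ind_classes(ROWS, B)

# The concept m((d, 3)): papers whose overall evaluation is "minor revision".
concept = {x for x, row in ROWS.items() if row[COLUMN["d"]] == 3}

# A class lies in the lower approximation when the rule "signature --> (d, 3)"
# is valid (class is a subset of the concept), and in the upper approximation
# when "signature AND (d, 3)" is satisfiable (class meets the concept).
lower = set().union(*(c for c in classes.values() if c <= concept))
upper = set().union(*(c for c in classes.values() if c & concept))
```

With B = {o, t}, only the class with o = 3 and t = 3 (paper 2 alone) is certain to receive a minor revision, which is why the lower approximation is so much smaller than the concept itself.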

2.2 Logical Preliminary

Logic is commonly defined as the analysis of methods of reasoning [63]. In the presentation of modern symbolic logic, the form of a statement and its content are usually separated, and are defined by the syntax and the semantics of the logic respectively. The syntax of a logic defines a formal language by some grammatical rules and its semantics stipulates the truth conditions of the formulas in the language.

2.2.1 Propositional logic

The syntax of propositional logic (PL) is based on the combinations of simple sentences in various ways to form more complicated sentences. The combinations are truth-functional in the sense that the truth value of the new sentence is determined by those of its component sentences. The alphabet of PL consists of a set of primitive propositions, $\Phi_0$, and the logical symbols $\neg$ (negation), $\wedge$ (and), $\vee$ (or), and $\supset$ (material implication). The logical symbols of PL are also called Boolean connectives. The set of well-formed formulas (wffs) of propositional logic is defined as the smallest set $\Phi$ such that $\Phi_0 \subseteq \Phi$ and

• if ϕ ∈ Φ, then ¬ϕ ∈ Φ;

• if ϕ and ψ ∈ Φ, then ϕ ∧ ψ, ϕ ∨ ψ, and ϕ ⊃ ψ ∈ Φ.

The auxiliary parentheses symbols are usually employed to disambiguate the reading of a sentence. The equivalence connective ≡ is defined as an abbreviation, i.e., ϕ ≡ ψ is an abbreviation of (ϕ ⊃ ψ) ∧ (ψ ⊃ ϕ).


The semantic models of PL are simply truth assignments to the primitive propositions. Mathematically, an interpretation (or a model) of a PL is a function $\pi : \Phi_0 \to \{0, 1\}$. The interpretation $\pi$ can be extended to the whole domain $\Phi$ by the following recursion rules:

1. $\pi(\neg\varphi) = 1 - \pi(\varphi)$,

2. $\pi(\varphi \wedge \psi) = \min(\pi(\varphi), \pi(\psi))$,

3. $\pi(\varphi \vee \psi) = \max(\pi(\varphi), \pi(\psi))$,

4. $\pi(\varphi \supset \psi) = \max(1 - \pi(\varphi), \pi(\psi))$.

We say that a wff $\varphi$ is true (resp. false) under the interpretation $\pi$ if $\pi(\varphi) = 1$ (resp. $\pi(\varphi) = 0$). A wff $\varphi$ is valid if it is true under all interpretations and satisfiable if it is true under some interpretation. A wff that is not satisfiable is said to be unsatisfiable. Note that $\varphi$ is valid iff $\neg\varphi$ is unsatisfiable.
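The four recursion rules translate almost literally into code. The following sketch is illustrative rather than part of the thesis: formulas are encoded as nested tuples (an assumed encoding), primitive propositions are plain strings, and validity is checked by brute force over the interpretations of the propositions that occur in the formula.

```python
from itertools import product


def evaluate(formula, pi):
    """Extend the truth assignment `pi` (dict: proposition -> 0 or 1) to a wff."""
    if isinstance(formula, str):              # primitive proposition
        return pi[formula]
    op = formula[0]
    if op == "not":
        return 1 - evaluate(formula[1], pi)
    left, right = evaluate(formula[1], pi), evaluate(formula[2], pi)
    if op == "and":
        return min(left, right)
    if op == "or":
        return max(left, right)
    if op == "implies":
        return max(1 - left, right)
    raise ValueError(f"unknown connective: {op!r}")


def propositions(formula):
    """Primitive propositions occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(propositions(sub) for sub in formula[1:]))


def interpretations(formula):
    """All truth assignments to the propositions occurring in `formula`."""
    names = sorted(propositions(formula))
    for values in product((0, 1), repeat=len(names)):
        yield dict(zip(names, values))


def is_valid(formula):
    return all(evaluate(formula, pi) == 1 for pi in interpretations(formula))


def is_satisfiable(formula):
    return any(evaluate(formula, pi) == 1 for pi in interpretations(formula))


excluded_middle = ("or", "p", ("not", "p"))
contradiction = ("and", "p", ("not", "p"))
contingent = ("implies", "p", "q")
```

Restricting attention to the propositions that occur in the formula is enough here, because the value of a wff depends only on those propositions.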

2.2.2 Modal logic

As shown by the well-known Stone representation theorem, classical set theory is intimately connected with Boolean logic [83]. Analogously, rough set theory is closely related to modal logic [7]. The best-known relationship is the connection between approximation spaces and the possible-world semantics of the modal epistemic logic S5.

The alphabet of S5 consists of a set of primitive propositions, $\Phi_0$, and the logical symbols $\neg$ (negation), $\wedge$ (and), $\vee$ (or), $\supset$ (material implication), $\Box$ (necessity modal operator), and $\Diamond$ (possibility modal operator). The set of well-formed formulas (wffs) of S5 is defined as the smallest set $\Phi$ such that $\Phi_0 \subseteq \Phi$ and

• if $\varphi \in \Phi$, then $\neg\varphi$, $\Box\varphi$, and $\Diamond\varphi \in \Phi$;

• if $\varphi$ and $\psi \in \Phi$, then $\varphi \wedge \psi$, $\varphi \vee \psi$, and $\varphi \supset \psi \in \Phi$.

A Kripke model for S5 is a triple $M = (W, R, \pi)$, where $W$ is a set of possible worlds, $R$ is an equivalence relation on $W$, called an accessibility relation, and $\pi : \Phi_0 \to 2^W$ is a truth assignment that maps each primitive proposition to the set of worlds in which it is true. The function $\pi$ can be extended to all wffs recursively in the following way:

1. $\pi(\neg\varphi) = W - \pi(\varphi)$

2. $\pi(\varphi \wedge \psi) = \pi(\varphi) \cap \pi(\psi)$

3. $\pi(\varphi \vee \psi) = \pi(\varphi) \cup \pi(\psi)$

4. $\pi(\varphi \supset \psi) = \pi(\neg\varphi) \cup \pi(\psi)$

5. $\pi(\Box\varphi) = \{w \mid \forall u\,((w, u) \in R \Rightarrow u \in \pi(\varphi))\}$

6. $\pi(\Diamond\varphi) = \{w \mid \exists u\,((w, u) \in R \wedge u \in \pi(\varphi))\}$


For each model M and wff ϕ, π(ϕ) is called the truth set of ϕ (in M ).

Obviously, if $M = (W, R, \pi)$ is a Kripke model for S5, then $(W, R)$ is an approximation space, and for each wff $\varphi$, $\pi(\varphi)$ is a subset of $W$ and denotes some concept in the approximation space, so we can consider its lower and upper approximations. A direct but interesting relationship between S5 and rough set theory is then established as follows:

$\underline{R}\pi(\varphi) = \pi(\Box\varphi)$, $\overline{R}\pi(\varphi) = \pi(\Diamond\varphi)$.
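This correspondence can be checked on a small model. The sketch below is an assumption-laden illustration: the six worlds, the partition that generates the accessibility relation, and the truth set of the proposition are all hypothetical, and the function names `box` and `diamond` are chosen here for the necessity and possibility operators.

```python
def box(worlds, R, truth_set):
    """Truth set of the necessity of a formula whose truth set is `truth_set`."""
    return {w for w in worlds
            if all(u in truth_set for (v, u) in R if v == w)}


def diamond(worlds, R, truth_set):
    """Truth set of the possibility of a formula whose truth set is `truth_set`."""
    return {w for w in worlds
            if any(u in truth_set for (v, u) in R if v == w)}


# Hypothetical S5 model: the accessibility relation is the equivalence
# relation generated by a partition of the worlds into three blocks.
blocks = [{1, 2}, {3}, {4, 5, 6}]
W = set().union(*blocks)
R = {(x, y) for block in blocks for x in block for y in block}

truth_p = {1, 2, 4}                       # assumed truth set of a proposition p

# Rough-set approximations of the same set in the approximation space (W, R).
lower = set().union(*(b for b in blocks if b <= truth_p))
upper = set().union(*(b for b in blocks if b & truth_p))

necessarily_p = box(W, R, truth_p)
possibly_p = diamond(W, R, truth_p)
```

On this model the necessity operator picks out exactly the block {1, 2}, while the possibility operator also brings in the whole block {4, 5, 6}, matching the lower and upper approximations.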

2.2.3 Arrow logic (AL)

In this section, we review the basic syntax and semantics of AL in order to lay the foundation for the development of arrow decision logic. AL is the basic modal logic of arrows [61, 87]. An arrow can represent a state transition in a program’s execution, a morphism in category theory, an edge in a directed graph, etc. In AL, an arrow is an abstract entity; however, we can usually interpret it as a concrete relationship between two objects, which results in a pair-frame model [61, 87]. We now present the syntax and semantics of AL.

The basic alphabet of AL consists of a countable set of propositional symbols, the Boolean connectives ¬ and ∨, the modal constant δ, the unary modal operator ⊗, and the binary modal operator ◦. The set of AL wffs is the smallest set containing the propositional symbols and δ, closed under the Boolean connectives ¬ and ∨, and satisfying

• if ϕ is a wff, then ⊗ϕ is a wff too;

• if ϕ and ψ are wffs, then ϕ ◦ ψ is also a wff.

In addition to the standard Boolean connectives, we also abbreviate ¬ ⊗ ¬ϕ and ¬(¬ϕ ◦ ¬ψ) as ⊗ϕ and ϕ◦ψ respectively.

Semantically, these wffs are interpreted in arrow models.

Definition 2.5

1. An arrow frame is a quadruple $F = (W, C, R, I)$ such that $C \subseteq W \times W \times W$, $R \subseteq W \times W$ and $I \subseteq W$.

2. An arrow model is a pair $M = (F, \pi)$, where $F = (W, C, R, I)$ is an arrow frame and $\pi$ is a valuation that maps propositional symbols to subsets of $W$. An element in $W$ is called an arrow in the model $M$.

3. The satisfaction of a wff $\varphi$ on an arrow $w$ of $M$, denoted by $w \models_M \varphi$ (as usual, the subscript $M$ can be omitted), is inductively defined as follows:

(a) $w \models p$ iff $w \in \pi(p)$ for any propositional symbol $p$,

(b) $w \models \delta$ iff $w \in I$,

(c) $w \models \neg\varphi$ iff $w \not\models \varphi$,

(d) $w \models \varphi \vee \psi$ iff $w \models \varphi$ or $w \models \psi$,

(e) $w \models \varphi \circ \psi$ iff there exist $s, t$ such that $(w, s, t) \in C$, $s \models \varphi$, and $t \models \psi$,

(f) $w \models \otimes\varphi$ iff there is a $t$ with $(w, t) \in R$ and $t \models \varphi$.

Example 2.2 Let us use the (multi-)graph shown in Figure 2.1 to explain the basic concept of arrow models. As shown in the figure, the arrow frame is characterized as F = (W, C, R, I), where

• W = {a1, a2, . . . , a10},

• C = {(a1, a4, a5), (a1, a4, a7), . . . , (a8, a8, a10)}

• R = {(a5, a6), . . . , (a10, a10)}

• I = {a9, a10}

If x1, x2, x3, and x4 denote four cities and ai (1 ≤ i ≤ 10) denote routes between them, then I denotes the set of intra-city routes, whereas the others are inter-city routes. A route ai is a reverse route of another route aj if (ai, aj) ∈ R. For example, a6 is a reverse route of a5. Also, (ai, aj, ak) ∈ C if aj followed by ak is an alternative route of ai. For example, a3 is a direct route connecting the cities x1 and x3. Alternatively, we can go from x1 to x2 through route a4 and then from x2 to x3 through route a5. Thus, a4 followed by a5 is an alternative route to a3. Now, let us consider an arrow logic language with two propositional symbols p and q meaning “the route is congested” and “the route is in bad condition” respectively. Assume that the valuation π of the arrow model is given as follows:

π(p) = {a1, a2, a3, a9, a10},

π(q) = {a1, a4, a6, a8, a10}.

Then, in the model (F, π), we have a9 |= δ since a9 is an intra-city route. We also have a7 |= ¬q ∧ (⊗q), which means that a7 is not in bad condition, but one of its reverse routes is. Furthermore, we have a3 |= p ∧ (q ◦ (¬p ∧ ¬q)), which means that a3 is congested and there is an alternative route with a section (a4) in bad condition followed by a section (a5) that is neither in bad condition nor congested.
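The three claims of Example 2.2 can be checked mechanically. The following is a minimal sketch, not part of the original formalism: it encodes the sets W, C, R, and I as they are listed in Figure 2.1, represents formulas as nested tuples (an encoding chosen here purely for illustration), and assumes the usual Boolean clause for conjunction.

```python
# Arrow frame of Example 2.2, with the sets as listed in Figure 2.1.
W = {f"a{i}" for i in range(1, 11)}
C = {("a1", "a4", "a5"), ("a1", "a4", "a7"), ("a2", "a4", "a5"), ("a2", "a4", "a7"),
     ("a3", "a4", "a5"), ("a3", "a4", "a7"), ("a8", "a8", "a9"), ("a8", "a8", "a10")}
R = {("a5", "a6"), ("a6", "a5"), ("a7", "a6"), ("a6", "a7"),
     ("a9", "a9"), ("a10", "a10")}
I = {"a9", "a10"}
PI = {"p": {"a1", "a2", "a3", "a9", "a10"},   # valuation of p
      "q": {"a1", "a4", "a6", "a8", "a10"}}   # valuation of q


def sat(w, phi):
    """Return True iff arrow w satisfies formula phi (a nested tuple)."""
    op = phi[0]
    if op == "atom":                      # propositional symbol
        return w in PI[phi[1]]
    if op == "id":                        # the identity constant (delta)
        return w in I
    if op == "not":
        return not sat(w, phi[1])
    if op == "and":                       # assumed standard Boolean clause
        return sat(w, phi[1]) and sat(w, phi[2])
    if op == "comp":                      # composition
        return any(x == w and sat(s, phi[1]) and sat(t, phi[2])
                   for (x, s, t) in C)
    if op == "conv":                      # converse
        return any(x == w and sat(t, phi[1]) for (x, t) in R)
    raise ValueError(f"unknown operator: {op}")


p, q = ("atom", "p"), ("atom", "q")
claim_1 = sat("a9", ("id",))
claim_2 = sat("a7", ("and", ("not", q), ("conv", q)))
claim_3 = sat("a3", ("and", p, ("comp", q, ("and", ("not", p), ("not", q)))))
print(claim_1, claim_2, claim_3)
```

Under this encoding all three checks succeed, with a4 and a5 acting as the witnesses for the composition in the third formula.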

Intuitively, in the arrow frame (W, C, R, I), W can be regarded as the set of edges of a directed graph; I denotes the set of identity arrows³; (w, s) ∈ R if s is a reversed arrow of w; and (w, s, t) ∈ C if w is an arrow composed of s and t. This intuition is reflected in the following definition of pair frames.

Definition 2.6 An arrow frame F = (W, C, R, I) is a pair frame if there exists a set U such that W ⊆ U × U and

1. for x, y ∈ U , if (x, y) ∈ I then x = y,

2. for x1, x2, y1, y2 ∈ U , if ((x1, y1), (x2, y2)) ∈ R, then x1 = y2 and y1 = x2,

3. for x1, x2, x3, y1, y2, y3 ∈ U , if ((x1, y1), (x2, y2), (x3, y3)) ∈ C, then x1 = x2, y2 = x3, and y1 = y3.

An arrow model M = (F, π) is called a pair model if F is a pair frame. A pair model is called a (full) square model if the set of arrows W = U × U .
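The conditions of Definition 2.6 are easy to test when arrows are represented directly as (source, target) pairs. The sketch below is illustrative only: `is_pair_frame` checks conditions 1 to 3, and `square_frame` is a hypothetical helper that builds a frame on W = U × U by taking the largest I, R, and C those conditions allow (the definition itself only constrains these relations, so this maximal choice is an assumption).

```python
from itertools import product


def is_pair_frame(W, C, R, I):
    """Check conditions 1-3 of Definition 2.6; arrows are (source, target) pairs."""
    cond1 = all(x == y for (x, y) in I)
    cond2 = all(x1 == y2 and y1 == x2 for ((x1, y1), (x2, y2)) in R)
    cond3 = all(x1 == x2 and y2 == x3 and y1 == y3
                for ((x1, y1), (x2, y2), (x3, y3)) in C)
    return cond1 and cond2 and cond3


def square_frame(U):
    """Build a frame on W = U x U with the largest I, R, C allowed (assumed choice)."""
    W = set(product(U, repeat=2))
    I = {(x, x) for x in U}
    R = {((x, y), (y, x)) for (x, y) in W}
    C = {((x, z), (x, y), (y, z)) for x, y, z in product(U, repeat=3)}
    return W, C, R, I


W, C, R, I = square_frame({"x1", "x2", "x3"})
print(len(W), is_pair_frame(W, C, R, I))

# A relation R that pairs an arrow with itself violates condition 2
# unless the arrow is a loop.
print(is_pair_frame(W, C, {(("x1", "x2"), ("x1", "x2"))}, I))
```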

[Figure: a multigraph over the cities x1, x2, x3, and x4 with arrows a1, . . . , a10, annotated with the following sets.

W = {a1, a2, . . . , a10}

I = {a9, a10}

R = {(a5, a6), (a6, a5), (a7, a6), (a6, a7), (a9, a9), (a10, a10)}

C = {(a1, a4, a5), (a1, a4, a7), (a2, a4, a5), (a2, a4, a7), (a3, a4, a5), (a3, a4, a7), (a8, a8, a9), (a8, a8, a10)}]

Figure 2.1: An arrow model

2.3 Knowledge Management

2.3.1 Knowledge management process

Although there is no commonly accepted definition of knowledge management, the following one from [72] is a reasonable starting point:

Knowledge management (KM) is a discipline that provides strategy, process, and technology to share and leverage information and expertise that will increase our level of understanding to more effectively solve problems and make decisions.

According to this definition, it is believed that the objective of knowledge management systems is to support creation, transfer, and application of knowledge in organizations [3].

To uncover some assumptions about knowledge that underlie organizational KM process and KMS, several perspectives on knowledge are summarized in [3]. These include

1. the contrast of knowledge with data and information,

2. knowledge as a state of mind,

3. knowledge as an object,

4. knowledge as a process,

5. knowledge as a condition of having access to information, and

6. knowledge as a capability.

For the contrast of knowledge with data and information, it is reiterated that data is raw numbers and facts, information is processed data, and knowledge is authenticated information. For knowledge as a state of mind, knowledge is described as “a state or fact of knowing” with knowing being a condition of “understanding gained through experience or study; the sum or range of what has been perceived, discovered, or learned”. For knowledge as an object, it means that knowledge can be viewed as a thing to be stored and manipulated (i.e., an object). For knowledge as a process, it is emphasized that knowledge can be viewed as a process of simultaneously knowing and acting. For knowledge as a condition of having access to information, organizational knowledge must be organized to facilitate access to and retrieval of content. It is thought of as an extension of the view of knowledge as an object, with a special emphasis on the accessibility of the knowledge objects. For knowledge as a capability, knowledge can be viewed as a capability with the potential for influencing future action. However, it is also suggested that knowledge is not so much a capability for specific action, but the capacity to use information; learning and experience result in an ability to interpret information and to ascertain what information is necessary in decision making. (See [3] for further references).

These different perspectives on knowledge lead to different emphases on how knowledge should be managed. Among them, the process view focuses on knowledge flow and the processes of creation, sharing, and distribution of knowledge. This is closely related to the KM life cycle: creation, capture, organization, and dissemination/sharing [72].

Any KM process starts from the creation of knowledge. The theory of organizational knowledge creation proposed in [66] is concerned with developing new content or replacing existing content within the organization’s tacit and explicit knowledge. According to [3],

rooted in action, experience, and involvement in a specific context, the tacit dimension of knowledge (henceforth referred to as tacit knowledge) is comprised of both cognitive and technical elements. The cognitive element refers to an individual’s mental models consisting of mental maps, beliefs, paradigms, and viewpoints. The technical component consists of concrete know-how, crafts, and skills that apply to a specific context.

On the other hand, the explicit knowledge is defined as follows [3]:

the explicit dimension of knowledge (henceforth referred to as explicit knowledge) is articulated, codified, and communicated in symbolic form and/or natural language.

Based on this tacit-explicit classification of knowledge, four modes of knowledge creation have been identified: socialization, externalization, internalization, and combination [66]. These modes of knowledge creation are explicated in [3] as follows:

The socialization mode refers to conversion of tacit knowledge to new tacit knowledge through social interactions and shared experience among organizational members (e.g., apprenticeship). The combination mode refers to the creation of new explicit knowledge by merging, categorizing, reclassifying, and synthesizing existing explicit knowledge (e.g., literature survey reports). The other two modes involve interactions and conversion between tacit and explicit knowledge. Externalization refers to converting tacit knowledge to new explicit knowledge (e.g., articulation of best practices or lessons learned). Internalization refers to creation of new tacit knowledge from explicit knowledge (e.g., the learning and understanding that results from reading or discussion).

However, as pointed out in [91], the term “tacit knowledge” should be replaced by the more appropriate “implicit knowledge”:

Nonaka and Takeuchi put forward the proposition, embodied in the diagram, that “tacit knowledge” is somehow derived from explicit knowledge and, by other means, is made explicit. However, it is clear, from the analysis above, that implicit knowledge, which is not normally expressed, but may be expressed, is actually intended here. Implicit knowledge is that which we take for granted in our actions, and which may be shared by others through common experience or culture. For example, in establishing a production facility in a foreign country, a company knows it needs to acquire local knowledge of “how things are done here”. Such knowledge may not be written down, but is known by people living and working in the culture and is capable of being written down, or otherwise conveyed to those who need to know. The knowledge is implicit in the way people behave towards one another, and towards authority, in that foreign culture, and the appropriate norms of behavior can be taught to the newcomers. Implicit knowledge, in other words, is expressible: tacit knowledge is not, and Nonaka would have saved a great deal of confusion had he chosen the more appropriate term.

Apparently, the decision cases that occurred in the past and are stored in a data table may embed this kind of implicit knowledge, and data mining is simply the IT tool that makes implicit knowledge explicit. Thus, KDD is a process of knowledge creation from raw data. Once knowledge is created, its storage, organization, and retrieval, which are also referred to as organizational memory, become crucial for effective organizational knowledge management. Organizational memory includes knowledge residing in various component forms, including written documentation, structured information stored in electronic databases, codified human knowledge stored in expert systems, and so on [3].

Several technologies to support the KM process are identified in [72]. These technologies include hybrid expert systems; personalization–profiling and customization; taxonomies, search, or knowledge discovery; knowledge metrics; and knowledge visualization. Among them, the core requirement of hybrid expert systems is “to capture the knowledge of experts and translate them into rules and reasoning processes to aid in decision support. Rules may range from simple and rigid to complex and vague” [72]. The objective of our work is to provide a class of logics that can represent such rules and reasoning processes.

The phase following knowledge organization is knowledge transfer. Knowledge transfer depends on a learning process, and it is driven by the communication processes and information flows in organizations. To ease the communication process and reduce the cost of conversion between different forms of knowledge, a uniform style of knowledge representation formalism is desirable. Since the logics proposed in this thesis all share the same style (they are all based on extensions of decision logic), they support easy communication between different units or organizations.

The last phase of knowledge management is the application of knowledge. It is emphasized in [3] that the source of competitive advantage of KM resides in the application of the knowledge rather than in the knowledge itself. There are three primary mechanisms for the integration of knowledge to create organizational capability: directives, organizational routines, and self-contained task teams [3]. An important problem of knowledge application may be “deciding upon the rules and routines to apply to a problem, given that over time, the organization has learned and codified a large number of rules and routines, so that choosing which rules to activate for a specific choice making scenario is itself problematic” [3]. The technology that can help in this phase is the reasoning power of decision-support systems. As mentioned in Chapter 1, our logic-based representation facilitates the integration of knowledge bases with the inference engines of decision-support systems.

The knowledge management process is summarized in Figure 2.2, where knowledge representation is highlighted in red to reiterate the main theme of our work.

[Figure 2.2: diagram of the knowledge management process in four phases.

• Knowledge creation: data mining; text mining; brainstorming, etc.

• Knowledge representation: organizational memory; document; codified knowledge (logic, rules, expert systems, etc.)

• Knowledge transfer: communication; formal and informal; personal and impersonal

• Knowledge application: rule activation; easy access and maintenance; expert systems]

2.3.2 Logic-based knowledge management

In [22], a logic-based approach to KM is advocated for meeting the requirement of system integration. The requirement arises when an organization has a large asset of heterogeneous computer programs, databases, and applications. Ideally, all these systems would be integrated into a single homogeneous system. In real life, however, this is an unrealistic task because it requires a lot of effort and a long development time. Therefore, a more realistic approach to system integration is to support the use of the existing systems in the way most comfortable for the end-user, hide the technical difficulties caused by the heterogeneous databases and/or applications, and provide tools for combining information coming from various sources [22]. In order to find an intelligent solution for the KM-oriented support of system integration, the most important thing is to choose a synthetic and uniform technology basis.

The logic-based paradigm is chosen in [22] because of its important advantages in the following aspects:

• declarative description of meta-information on information sources,

• symbolic manipulation for generating database queries,

• openness, knowledge-based customizability,

• deduction in intelligent answering,

• learning capacity wherever it is possible,

• natural language interface.

The paradigm resulted in the development of the SILK project (System Integration via Logic in Knowledge Management). The objective of SILK is to show the applicability of the logic-based knowledge management approach to system integration. Although the logic employed in the SILK project is different from the DL-styled logics used here, our approach also inherits the advantages of logic-based KM. In particular, we can have a uniform style of knowledge representation formalism. This will make the organization and application of knowledge more accessible for the enterprise. We will use a running scenario to illustrate how effective human resource management (HRM) can be achieved via such a representation.

2.3.3 A running scenario

The recruitment and selection of high-potential talent are important for the enterprise, since human capital is the key to success in a knowledge-based economy. However, it is difficult in the selection stage to predict the work performance of the applicants. Conventional selection approaches, including basic competency tests, professional skill tests, project proposal tests, and interviews, have been widely used for many years. A data mining methodology to assist decision makers in identifying the most suitable talent has been previously proposed in [10]. The approach explores human resource data and derives decision rules relating personnel characteristics to work behaviors. Let us now consider a high-tech company (say, the Knowledge Technology Corporation, abbreviated as KTC), which is recruiting talent to fill different positions including program designers, system analysts, project managers, and researchers.


Example 2.3 To warm up, let us consider how DL can be used to represent decision rules for the preliminary classification of the applicants in the KTC recruitment. To achieve this task, we retrieve the basic data of well-performing employees from the personnel database of the company. The definition of a well-performing employee depends on the requirements of the position. For example, an employee may be considered well-performing if the evaluation of his/her performance in the last five years is above some level. It is believed that these employees fit their current positions quite well. We assume the basic data of employees contains the following attributes: age (when hired), gender, educational background (degree), major subject, and current position. The last attribute is the decision attribute and the others are condition attributes. For the sake of simplicity, we will denote attributes by lower case italic letters. Thus, the attributes and their domains are coded as follows:

1. i1: the age of an employee when he/she is hired

• 1: 25 years old and below

• 2: 26 to 30 years old

• 3: 31 to 35 years old

• 4: 36 years old and above

2. i2: the gender of an employee

• F: female

• M: male

3. i3: education background (the highest degree that an employee possesses)

• JC: junior college degree and below

• BA: bachelor degree

• MS: master degree and above

4. i4: the major of an employee

• EE: electrical engineering

• CS: computer science

• MH: mathematics

• IM: information management

5. d: the current position of an employee

• PD: program designers

• SA: system analysts

• PM: project managers

• RD: researcher


The retrieved employee data is then taken as our input decision table and an appropriate data mining algorithm is applied to derive decision rules. Since the main focus of this work is on the representation of the rules, and the decision table resembles the one in Example 2.1, we will not write down the decision table again. Also, it is assumed that pre-determined thresholds for confidence and support are used to select the decision rules, so we end up with a set of decision rules in the form of DL rules. For example, we may have a rule like this:

(i3, MS) ∧ ((i4, MH) ∨ (i4, CS)) −→ (d, RD).

The set of all derived decision rules (perhaps with the background knowledge) constitutes the first knowledge base KB1 for the recruitment process. Now, given the basic data of the applicants, each applicant can be classified as appropriate for one (or more) particular position(s). An applicant classified as appropriate for a particular position is considered a candidate for that position. For example, an applicant whose major is mathematics and whose highest degree is a master degree is considered a candidate for the researcher position according to the above rule. Thus, for each position, we have a set of candidates for further processing. The process of the preliminary classification phase is shown in Figure 2.3, where the knowledge base is colored red to emphasize the role of DL-styled logics.
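The rule application step of Example 2.3 can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple encoding of DL formulas, the function names `holds` and `candidate_positions`, and the two applicant records are hypothetical, while the single rule in `KB1` is the example rule given above.

```python
def holds(applicant, phi):
    """Return True iff the applicant's record satisfies the DL formula phi."""
    op = phi[0]
    if op == "desc":                       # descriptor (attribute, value)
        return applicant.get(phi[1]) == phi[2]
    if op == "and":
        return holds(applicant, phi[1]) and holds(applicant, phi[2])
    if op == "or":
        return holds(applicant, phi[1]) or holds(applicant, phi[2])
    if op == "not":
        return not holds(applicant, phi[1])
    raise ValueError(f"unknown operator: {op}")


# The example rule: (i3, MS) and ((i4, MH) or (i4, CS)) --> (d, RD)
KB1 = [
    (("and", ("desc", "i3", "MS"),
             ("or", ("desc", "i4", "MH"), ("desc", "i4", "CS"))),
     ("d", "RD")),
]


def candidate_positions(applicant, rules):
    """Collect the positions concluded by every rule whose antecedent holds."""
    return {value for (antecedent, (_, value)) in rules
            if holds(applicant, antecedent)}


# Hypothetical applicants, coded with the attribute scheme of Example 2.3.
alice = {"i1": 2, "i2": "F", "i3": "MS", "i4": "MH"}
bob = {"i1": 1, "i2": "M", "i3": "BA", "i4": "CS"}
print(candidate_positions(alice, KB1))   # {'RD'}
print(candidate_positions(bob, KB1))     # set()
```

Collecting the conclusions in a set reflects the remark that an applicant may be classified as appropriate for more than one position.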

2.4 Related Work

When rough set theory is applied to multi-criteria decision analysis (MCDA), it is crucial that preference-ordered attribute domains and decision classes be dealt with [29, 30, 31, 33, 34, 35, 36, 82]. Thus, the indiscernibility relation in rough set theory is replaced by a dominance relation to solve the multi-criteria sorting problem, and the data table is replaced by a pairwise comparison table to solve multi-criteria choice and ranking problems. The approach is called the dominance-based rough set approach (DRSA).

A strong assumption about data tables is that each object takes exactly one value with respect to an attribute. However, in practice, we may only have incomplete information about the values of an object’s attributes. Thus, more general data tables and decision logics are needed to represent and reason about incomplete information. For example, set-valued and interval set-valued data tables have been introduced to represent incomplete information [51, 52, 53, 60, 94]. DRSA has also been extended to deal with missing values in MCDA problems [34, 82].

The notion of IS-morphism introduced in Chapter 4 is related to the work in [39]. The algebraic properties of IS-morphisms between functional information systems (FIS) were previously studied there under the name of O-A-D homomorphism⁴. Our notion of IS-morphism between relational information systems (RIS) is a straightforward generalization of that between FIS. In fact, if FIS and RIS are considered as many-sorted algebras [6], both IS-morphism and O-A-D homomorphism can be seen as homomorphisms in universal algebra [8, 11].

The investigation of RIS also facilitates a further generalization of rough set theory. In classical rough set theory, lower and upper approximations are defined in terms of indiscernibility relations based on functional information associated with the objects. However, it has been noted that many applications, such as social network analysis [74], need to represent both functional and relational information. Based on this observation, a concept of relational granulation was recently proposed in [57].

[Figure: flow diagram. Employee database → SQL query (performance evaluation, good) → Data table → Data mining algorithm → Knowledge base of DL rules KB1. Input data for applicants → Rule application (inference engine) → Preliminary classification of applicants → Candidate sets: PD, SA, PM, and RD.]

Figure 2.3: The preliminary classification phase of the recruitment process

Chapter 3

Decision Logics for Multicriteria Decision Analysis

When rough set theory is applied to multi-criteria decision analysis (MCDA), it is crucial that preference-ordered attribute domains and decision classes be dealt with [29, 30, 31, 33, 34, 35, 36, 82]. The original rough set theory cannot handle inconsistencies arising from violations of the dominance principle due to its use of the indiscernibility relation. Therefore, in the above-mentioned work, the indiscernibility relation is replaced by a dominance relation to solve the multi-criteria sorting problem, and the data table is replaced by a pairwise comparison table to solve multi-criteria choice and ranking problems. The approach is called the dominance-based rough set approach (DRSA). For MCDA problems, DRSA can induce a set of decision rules from sample decisions provided by decision-makers. The induced rules form a comprehensive preference model and can provide recommendations about a new decision-making environment.

A strong assumption about data tables is that each object takes exactly one value with respect to an attribute. However, in practice, we may only have incomplete information about the values of an object’s attributes. Thus, more general data tables and decision logics are needed to represent and reason about incomplete information. For example, set-valued and interval set-valued data tables have been introduced to represent incomplete information [51, 52, 53, 60, 94]. A generalized decision logic based on interval set-valued data tables is also proposed in [94]. In these formalisms, the attribute values of an object may be a subset or an interval set in the domain. Since crisp subsets and interval sets are both special cases of fuzzy sets, further generalization of data tables is desirable to represent uncertain information. In data tables containing such information, an object can take a fuzzy subset of values for each attribute. To represent knowledge induced from uncertain data tables, the decision logic also needs to be generalized.

DRSA has also been extended to deal with missing values in MCDA problems [34, 82]. A data table with missing values is a special case of uncertain data tables. Therefore, we propose further extending DRSA to uncertain data tables and fuzzy data tables. In this chapter, we present a logical treatment of DRSA in precise data tables, as well as uncertain and fuzzy data tables. Our approach is concerned with variants of DL for data tables.

The remainder of the chapter is organized as follows. In Sections 3.1 to 3.4, we respectively present generalized DL for preference-ordered data tables, preference-ordered uncertain data tables, preference-ordered fuzzy data tables, and pairwise comparison tables. For each logic, the syntax and semantics are described, and some quantitative measures for the rules of the logics are defined. Finally, in Section 3.5, we discuss the main contribution of this chapter and indicate the direction of future research.

3.1 Preference-ordered Data Tables

For MCDA problems, each object in a data table or decision table can be seen as a sample decision, and each condition attribute is a criterion for the decision. Since the domain of values of a criterion is usually ordered according to the decision-maker’s preferences, we define a preference-ordered data table (PODT) as a tuple

$T = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f_i \mid i \in A\})$,

where $(U, A, \{V_i \mid i \in A\}, \{f_i \mid i \in A\})$ is a classical data table; and for each $i \in A$, $\succeq_i \subseteq V_i \times V_i$ is a binary relation over $V_i$. The relation $\succeq_i$ is called a weak preference relation or outranking on $V_i$, and represents a preference over the set of objects with respect to the criterion $i$ [82]. For $x, y \in U$, $f_i(x) \succeq_i f_i(y)$ means “$x$ is at least as good as $y$ with respect to criterion $i$”.

To represent the rules induced from a PODT, we introduce preference-ordered decision logic (PODL). The syntax of PODL is the same as that of DL, except for the form of the atomic formulas. An atomic formula in PODL is a descriptor of the form $(\geq_i, v)$ or $(\leq_i, v)$, where $i \in A$ and $v \in V_i$. The satisfaction relation between $U$ and the set of PODL wffs is defined in the same way as for DL wffs, except that the satisfaction of an atomic formula is defined by $x \models (\geq_i, v)$ iff $f_i(x) \succeq_i v$, and by $x \models (\leq_i, v)$ iff $v \succeq_i f_i(x)$. Other semantic notions in DL, such as validity, support, and confidence, can all be used in the case of PODL without any modifications. The confidence function for PODL rules has also been defined in [37].

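In [34], three types of rules are explicitly identified. We translate these rules into PODL rules as follows.

1. $\bigwedge_{i \in C} (\geq_i, v_i) \longrightarrow (\geq_d, v_d)$, where $C \subseteq A$ is a subset of condition attributes, $d \in A \setminus C$ is a decision attribute, $v_i \in V_i$ for all $i \in C$, and $v_d \in V_d$.

2. $\bigwedge_{i \in C} (\leq_i, v_i) \longrightarrow (\leq_d, v_d)$, where $C \subseteq A$ is a subset of condition attributes, $d \in A \setminus C$ is a decision attribute, $v_i \in V_i$ for all $i \in C$, and $v_d \in V_d$.

3. $\left(\bigwedge_{i \in C_1} (\geq_i, v_i) \wedge \bigwedge_{i \in C_2} (\leq_i, v_i)\right) \longrightarrow ((\geq_d, v_d) \wedge (\leq_d, v'_d))$, where $C_1 \cup C_2 \subseteq A$ is a subset of condition attributes, $d \in A \setminus (C_1 \cup C_2)$ is a decision attribute, $v_i \in V_i$ for all $i \in C_1 \cup C_2$, and $v_d, v'_d \in V_d$.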
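To make the satisfaction of PODL descriptors and the rule measures concrete, here is a minimal sketch. Several things in it are assumptions rather than part of the text: the four-object table is invented for illustration, the preference relation is taken to be the usual numeric order (larger is better), and the measures follow the usual DL reading, namely absolute support as the number of objects satisfying both sides of the rule, relative support as that number divided by |U|, and confidence as that number divided by the number of objects satisfying the antecedent.

```python
from fractions import Fraction

# Hypothetical PODT: criterion o and decision d take scores 1..4, larger is better.
TABLE = {
    "x1": {"o": 4, "d": 4},
    "x2": {"o": 3, "d": 2},
    "x3": {"o": 2, "d": 3},
    "x4": {"o": 1, "d": 1},
}


def sat(row, descriptor):
    """x |= (>=_i, v) iff f_i(x) >= v;  x |= (<=_i, v) iff f_i(x) <= v."""
    direction, attr, v = descriptor
    return row[attr] >= v if direction == ">=" else row[attr] <= v


def measures(table, antecedent, consequent):
    """Return (asp, rsp, cf) for a rule whose two sides are conjunctions of descriptors."""
    matching = [r for r in table.values() if all(sat(r, d) for d in antecedent)]
    both = [r for r in matching if all(sat(r, d) for d in consequent)]
    asp = len(both)
    rsp = Fraction(asp, len(table))
    cf = Fraction(asp, len(matching)) if matching else None  # undefined if nothing matches
    return asp, rsp, cf


# A rule of the first type: (>=_o, 3) --> (>=_d, 3)
asp, rsp, cf = measures(TABLE, [(">=", "o", 3)], [(">=", "d", 3)])
print(asp, rsp, cf)   # 1 1/4 1/2
```

In this toy table the rule is matched by x1 and x2 but confirmed only by x1, so its confidence is 1/2 and it is not valid; tightening the antecedent to (≥o, 4) gives confidence 1.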

Example 3.1 Continuing with Example 2.1, let us assume that each $V_i$ ($i = o, p, t, d$) is now endowed with a weak preference relation $\succeq_i$ such that $4 \succeq_i 3 \succeq_i 2 \succeq_i 1$. Thus, we have atomic formulas like $(\geq_o, 4)$, $(\geq_p, 1)$, and $(\geq_t, 2)$. Let us now consider the following rules:

$r_1 = (\geq_o, 3) \longrightarrow (\geq_d, 3)$,

$r_2 = (\leq_p, 2) \longrightarrow (\leq_d, 2)$,


Then, we have

      asp   rsp   cf
r1    5     1/2   5/8
r2    5     1/2   5/6
r3    2     1/5   1

Among these rules, only r3 is valid.

3.1.1 The running scenario

Let us now continue the recruitment process of the KTC and conduct the professional skills test for the applicants. It is assumed that the professional skills of an employee are related to his/her short-term performance, since the skills required in the high-tech industry change rapidly. Note that in the preliminary classification stage, we have classified each applicant into the appropriate position class(es). The professional skills required by different positions may be different. However, for simplification, we assume that all applicants are tested on the same set of professional skills, which includes programming languages, engineering mathematics, and software engineering. The test result of each applicant is given as a score from 0 to 10. To utilize past experience, we retrieve the professional skill test results of the employees hired in the last three years together with their performance evaluations. The performance evaluation data of the employees may be stored in a separate database. We assume the company has established a performance management system to evaluate employees’ performance. Based on their performance, the employees are ranked into four categories: outstanding (A), good (B), fair (C), and poor (D). By joining the data retrieved from the databases of professional skill test results and employees’ performance evaluations, we obtain a PODT with three condition attributes and one decision attribute. The attributes and their domains are coded as follows:

1. condition attributes:

• p: programming languages, Vp = {0, 1, 2, . . . , 10};

• e: engineering mathematics, Ve = {0, 1, 2, . . . , 10};

• s: software engineering, Vs= {0, 1, 2, . . . , 10};

2. decision attribute:

• d: performance evaluation, Vd= {A, B, C, D}

It is assumed that $V_p$, $V_e$, and $V_s$ are all endowed with the weak preference ordering $10 \succeq_i 9 \succeq_i \cdots \succeq_i 0$ and $V_d$ is endowed with the ordering $A \succeq_d B \succeq_d C \succeq_d D$. As in the case of Example 2.3, we assume an appropriate data mining algorithm (e.g., the DRSA approach) is applied to derive PODL rules. Also, it is still assumed that pre-determined thresholds for confidence and support are used to select the decision rules, so finally, we have a set of PODL rules that constitute the second knowledge base KB2 of the recruitment process. A typical rule in KB2 looks like this:

$(\geq_p, 6) \wedge (\geq_e, 8) \wedge (\geq_s, 6) \longrightarrow (\geq_d, B)$.

Now, given the professional skills test result of an applicant, we can apply the PODL rules to predict his/her short-term performance in the following way. Because more than one rule in KB2 may have an antecedent that matches the applicant’s test result, we apply all applicable rules to the applicant. Let $x$ be an applicant and $\varphi \longrightarrow \psi$ be a PODL rule in KB2; then we write $x \models \psi$ if the test result of $x$ matches $\varphi$. For example, if the test result of $x$ is 8 for programming languages, 9 for engineering mathematics, and 7 for software engineering, then the typical rule above applies and we can write $x \models (\geq_d, B)$. Considering all rules applicable to $x$, we denote $sp_l(x) = \max\{v \mid x \models (\geq_d, v)\}$ and $sp_u(x) = \min\{v \mid x \models (\leq_d, v)\}$. Intuitively, $sp_l(x)$ and $sp_u(x)$ denote the lower and upper bounds of $x$’s short-term performance prediction respectively. We can further define the set of grades $sp(x) = \{v \in V_d \mid sp_l(x) \leq v \leq sp_u(x)\}$ for each applicant $x$. Thus, after the short-term performance prediction phase, each applicant is associated with a set of grades denoting the prediction of his/her short-term performance. Note that, in reality, inconsistency may exist in the original PODT (i.e., the dominance principle may be violated). In the presence of inconsistency, the derived rules may also be mutually inconsistent and $sp(x)$ may be empty for some $x$. In that case, we set $sp(x)$ to be $V_d$. The process of this phase is shown in Figure 3.1.
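The prediction procedure just described can be sketched as follows. The first rule in `KB2` is the typical rule quoted above; the second rule, the applicants, and the function names are hypothetical. The sketch also assumes a convention the text does not state: when no applicable rule provides a lower (resp. upper) bound, the bound defaults to the worst grade D (resp. the best grade A).

```python
GRADES = ["D", "C", "B", "A"]                 # ascending preference: A is best
RANK = {g: k for k, g in enumerate(GRADES)}


def matches(scores, antecedent):
    """True iff the test scores satisfy every descriptor of the antecedent."""
    return all(scores[a] >= v if op == ">=" else scores[a] <= v
               for (op, a, v) in antecedent)


def predict(scores, kb):
    """Return sp(x): the grades between the lower and upper bounds from applicable rules."""
    lower, upper = RANK["D"], RANK["A"]       # assumed defaults when no rule gives a bound
    for antecedent, (op, grade) in kb:
        if matches(scores, antecedent):
            if op == ">=":
                lower = max(lower, RANK[grade])   # sp_l(x)
            else:
                upper = min(upper, RANK[grade])   # sp_u(x)
    sp = {g for g in GRADES if lower <= RANK[g] <= upper}
    return sp if sp else set(GRADES)          # inconsistent rules: fall back to V_d


KB2 = [
    ([(">=", "p", 6), (">=", "e", 8), (">=", "s", 6)], (">=", "B")),  # rule from the text
    ([("<=", "p", 5)], ("<=", "C")),                                  # hypothetical rule
]

x = {"p": 8, "e": 9, "s": 7}                  # the applicant from the example
y = {"p": 4, "e": 9, "s": 7}                  # hypothetical applicant
print(sorted(predict(x, KB2)))                # ['A', 'B']
print(sorted(predict(y, KB2)))                # ['C', 'D']
```

Returning a set rather than a single grade keeps the prediction honest about how much the applicable rules actually determine.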

[Figure 3.1: flow diagram of the short-term performance prediction phase. Professional skills test results and employees’ performance evaluation → SELECT employees hired in the last three years, JOIN with employees’ id → Data table (PODT) → Data mining algorithm → Knowledge base of PODL rules KB2. Professional skills test results for applicants → Rule application (inference engine) → Short-term performance prediction of applicants: (x, sp(x)) for each applicant x.]

3.2 Preference-ordered Uncertain Data Tables

PODL is suitable for the representation of rules induced from a PODT. However, the latter inherits the restriction of the classical DT that uncertain information cannot be represented. An uncertain data table is a generalization of DT in which the values of some or all attributes are imprecise [16, 13]. An analogous generalization can be applied to PODT to define preference-ordered uncertain data tables (POUDT). Formally, a POUDT is a tuple

$T = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f_i \mid i \in A\})$,

where $U$, $A$, and $\{(V_i, \succeq_i) \mid i \in A\}$ are defined as above, and for each $i \in A$, $f_i : U \to 2^{V_i} \setminus \{\emptyset\}$. The intuition about POUDT is that the value of attribute $i$ of an object $x$ belongs to $f_i(x)$, though the value is not known exactly. When $f_i(x)$ is a singleton, we say that the value is precise. If all attribute values of $T$ are precise, then $T$ is said to be single-valued.

PODL is also generalized to preference-ordered uncertain decision logic (POUDL). The syntax of POUDL is the same as that of PODL, except that its atomic formulas are of the form $(i, s_i)$, where $i \in A$ and $s_i \subseteq V_i$. When $s_i = \{v \in V_i \mid v \succeq_i v_i\}$ (resp. $s_i = \{v \in V_i \mid v_i \succeq_i v\}$), we abbreviate $(i, s_i)$ as $(\geq_i, v_i)$ (resp. $(\leq_i, v_i)$). To define the semantics of POUDL, we must first rewrite each wff into its normal form.

A wff is in conjunctive normal form (CNF) if it is a conjunction of formulas of the form $\bigvee_{i \in B} (i, s_i)$, where $B \subseteq A$ is a subset of mutually distinct attributes. A wff is in disjunctive normal form (DNF) if it is a disjunction of formulas of the form $\bigwedge_{i \in B} (i, s_i)$, where $B \subseteq A$ is a subset of mutually distinct attributes. Given a POUDL wff $\varphi$, its CNF and DNF are denoted by $\varphi^c$ and $\varphi^d$ respectively. Any POUDL wff can be rewritten in both CNF and DNF by using Boolean algebra and the following rewriting rules:

$\neg(i, s) = (i, V_i \setminus s)$

$(i, s_1) \vee (i, s_2) = (i, s_1 \cup s_2)$

$(i, s_1) \wedge (i, s_2) = (i, s_1 \cap s_2)$.

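For the semantics of POUDL, we define the positive satisfaction relation $\models^+$ for CNF formulas and the negative satisfaction relation $\models^-$ for DNF formulas. The definition is as follows:

1. $x \models^+ (i, s)$ iff $f_i(x) \subseteq s$,

2. $x \models^+ \varphi \vee \psi$ iff $x \models^+ \varphi$ or $x \models^+ \psi$,

3. $x \models^+ \varphi \wedge \psi$ iff $x \models^+ \varphi$ and $x \models^+ \psi$,

4. $x \models^- (i, s)$ iff $f_i(x) \cap s = \emptyset$,

5. $x \models^- \varphi \wedge \psi$ iff $x \models^- \varphi$ or $x \models^- \psi$,

6. $x \models^- \varphi \vee \psi$ iff $x \models^- \varphi$ and $x \models^- \psi$.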

(36)

Then, for any POUDL wff $\varphi$, we define $x \models^+ \varphi$ iff $x \models^+ \varphi^c$, and $x \models^- \varphi$ iff $x \models^- \varphi^d$. According to the semantics of POUDL, $x \models^+ (\geq_i, v_i)$ if every $v \in f_i(x)$ is at least as preferred as $v_i$ with respect to criterion $i$. Therefore, if $x \models^+ (\geq_i, v_i)$ holds, we can be sure that the value of criterion $i$ of $x$ will at least reach the level of $v_i$ no matter what the actual value is. Analogously, if $x \models^- (\geq_i, v_i)$ holds, then no value in $f_i(x)$ is at least as preferred as $v_i$, so we can be sure that the value of criterion $i$ of $x$ will not reach the level of $v_i$ no matter what the actual value is.
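The two satisfaction relations can be sketched in a few lines of Python. The encoding below is an assumption made for illustration only: a CNF formula is a list of clauses, a DNF formula is a list of terms, each clause or term is a dictionary with one atom per attribute, and `row[i]` stands for the set of candidate values of attribute i for the object. The sample object and the 1 to 4 scale are hypothetical.

```python
# Illustrative sketch of positive and negative satisfaction (assumed encoding).
# A CNF formula is a list of clauses; a DNF formula is a list of terms.
# Each clause or term is a dict {attribute: set of values}, one atom per attribute.

def pos_sat(row, cnf):
    """Positive satisfaction: every clause has an atom (i, s) with f_i(x) inside s."""
    return all(any(row[i] <= s for i, s in clause.items()) for clause in cnf)

def neg_sat(row, dnf):
    """Negative satisfaction: every term has an atom (i, s) with f_i(x) disjoint from s."""
    return all(any(row[i].isdisjoint(s) for i, s in term.items()) for term in dnf)

# Hypothetical object: criterion "o" is 3 or 4, criterion "d" is 2 or 3.
row = {"o": {3, 4}, "d": {2, 3}}

# On a 1..4 scale with numeric preference, "at least 3" is the value set {3, 4}.
reaches_3_on_o = pos_sat(row, [{"o": {3, 4}}])   # every candidate value is >= 3
reaches_3_on_d = pos_sat(row, [{"d": {3, 4}}])   # 2 is still possible
excluded_3_on_d = neg_sat(row, [{"d": {3, 4}}])  # 3 is still possible
excluded_4_on_d = neg_sat(row, [{"d": {4}}])     # no candidate value is >= 4
```

Here the object is known to reach level 3 on `o`, is known not to reach level 4 on `d`, and is neither positively nor negatively settled for level 3 on `d`.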

For each POUDL wff $\varphi$ and a given POUDT $T$, we define two meaning sets:

\[ m_T^+(\varphi) = \{x \in U \mid x \models^+ \varphi\}, \]

\[ m_T^-(\varphi) = \{x \in U \mid x \models^- \varphi\}; \]

$m_T^+(\varphi)$ is the set of objects that are known to satisfy $\varphi$, and $m_T^-(\varphi)$ is the set of objects that are known not to satisfy $\varphi$. The indeterminate region of $\varphi$ with respect to $T$ is defined as

\[ m_T^*(\varphi) = U \setminus (m_T^+(\varphi) \cup m_T^-(\varphi)). \]

As usual, the subscript $T$ can be omitted if it is clear from the context. Using the notation of rough set theory, we also define

\[ \underline{m}(\varphi) = m^+(\varphi) \quad \text{and} \quad \overline{m}(\varphi) = U \setminus m^-(\varphi). \]
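For a single atom (i, s), whose CNF and DNF coincide, the meaning sets can be computed directly from the table. The Python sketch below is a minimal illustration under that restriction; the function name `meaning_sets`, the nested-dictionary table, and the sample data are assumptions introduced here.

```python
# Illustrative sketch: meaning sets of a single atom (i, s) in a small table.
# table[x][i] plays the role of f_i(x); the data are hypothetical.

def meaning_sets(table, i, s):
    universe = set(table)
    m_plus = {x for x in table if table[x][i] <= s}             # known to satisfy
    m_minus = {x for x in table if table[x][i].isdisjoint(s)}   # known not to satisfy
    m_star = universe - (m_plus | m_minus)                      # indeterminate region
    lower = m_plus                                              # rough-set style lower set
    upper = universe - m_minus                                  # rough-set style upper set
    return m_plus, m_minus, m_star, lower, upper

table = {
    "x1": {"o": {3, 4}},
    "x2": {"o": {2, 3}},
    "x3": {"o": {1}},
}
m_plus, m_minus, m_star, lower, upper = meaning_sets(table, "o", {3, 4})
```

With the value set `{3, 4}`, object `x1` is known to satisfy the atom, `x3` is known not to, and `x2` falls in the indeterminate region, so the upper set is the union of the positive set and the indeterminate region.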

Note that the three types of rules mentioned in Section 3.1 can also be represented in POUDL, though the semantics is quite different.

The quantitative measures of a rule's usefulness can be defined by means of the notion of a completion of a POUDT. Let $T = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f_i \mid i \in A\})$ be a POUDT. Then, a PODT $S = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f'_i \mid i \in A\})$ is a completion of $T$ if $f'_i(x) \in f_i(x)$ for all $i \in A$ and $x \in U$. The number of completions of $T$ is equal to $\prod_{i \in A, x \in U} |f_i(x)|$. Let $\mathit{CL}(T)$ denote the set of all completions of $T$. If we identify a singleton set with its element by slightly abusing the notation, then a completion of a POUDT (i.e., a PODT) can be considered as a special case of a POUDT. This yields the following definition.
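Because a completion picks one candidate value per object and attribute, the set of completions is a Cartesian product, which also explains the counting formula. The following Python sketch enumerates the completions of a small hypothetical table; the function names and the nested-dictionary encoding are assumptions made for illustration.

```python
# Illustrative sketch: enumerating the completions of a small POUDT.
from itertools import product
from math import prod

def completions(table):
    """Yield every completion: one chosen value per (object, attribute) cell."""
    cells = [(x, i) for x in table for i in table[x]]
    for choice in product(*(sorted(table[x][i]) for x, i in cells)):
        completed = {x: {} for x in table}
        for (x, i), value in zip(cells, choice):
            completed[x][i] = value
        yield completed

def count_completions(table):
    """Product of the sizes of all candidate value sets."""
    return prod(len(table[x][i]) for x in table for i in table[x])

# Hypothetical table: two cells are imprecise, each with two candidates.
table = {
    "x1": {"o": {3, 4}, "d": {4}},
    "x2": {"o": {2}, "d": {2, 3}},
}
all_completions = list(completions(table))
```

The enumeration grows exponentially in the number of imprecise cells, so it is suitable only for checking small examples.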

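Definition 3.1 Let $T = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f_i \mid i \in A\})$ be a POUDT and $\varphi \longrightarrow \psi$ be a POUDL rule. Then

1. $\varphi \longrightarrow \psi$ is strongly valid in $T$ if $\overline{m}(\varphi) \subseteq m^+(\psi)$ and weakly valid in $T$ if $m^+(\varphi) \subseteq m^+(\psi)$;

2. the absolute support interval of $\varphi \longrightarrow \psi$ is $\mathit{asi}_T(\varphi \longrightarrow \psi) = [\min_{S \in \mathit{CL}(T)} \mathit{asp}_S(\varphi \longrightarrow \psi), \max_{S \in \mathit{CL}(T)} \mathit{asp}_S(\varphi \longrightarrow \psi)]$;

3. the relative support interval of $\varphi \longrightarrow \psi$ is $\mathit{rsi}_T(\varphi \longrightarrow \psi) = [\min_{S \in \mathit{CL}(T)} \mathit{rsp}_S(\varphi \longrightarrow \psi), \max_{S \in \mathit{CL}(T)} \mathit{rsp}_S(\varphi \longrightarrow \psi)]$; and

4. the confidence interval of $\varphi \longrightarrow \psi$ is $\mathit{cfi}_T(\varphi \longrightarrow \psi) = [\min_{S \in \mathit{CL}(T)} \mathit{cfd}_S(\varphi \longrightarrow \psi), \max_{S \in \mathit{CL}(T)} \mathit{cfd}_S(\varphi \longrightarrow \psi)]$.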

The next two propositions show how these measures are calculated.

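Proposition 3.1 Let $\varphi$ be a POUDL wff and $T$ be a POUDT. Then, for all $x \in \mathit{Uni}(T)$, we have

1. $x \models_T^+ \varphi$ iff $x \models_S \varphi$ for all $S \in \mathit{CL}(T)$,

2. $x \models_T^- \varphi$ iff $x \not\models_S \varphi$ for all $S \in \mathit{CL}(T)$, and

3. $x \in m_T^*(\varphi)$ iff there exist $S_1, S_2 \in \mathit{CL}(T)$ such that $x \models_{S_1} \varphi$ and $x \not\models_{S_2} \varphi$.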

Proof: We first note that if $S$ is a PODT, then for any POUDL wff $\varphi$, $x \models_S \varphi$ iff $x \models_S \varphi^c$ iff $x \models_S \varphi^d$. Thus, without loss of generality, we only need to consider wffs in CNF or DNF. Let us now prove the first equivalence. The second equivalence can be proved analogously, and the third follows from the first two.

($\Rightarrow$): If $\varphi$ is in CNF, then we have $x \models_T^+ \varphi$ iff $x \models_T^+ \bigvee_{i \in B}(i, s_i)$ for each conjunct $\bigvee_{i \in B}(i, s_i)$ of $\varphi$. Now, $x \models_T^+ \bigvee_{i \in B}(i, s_i)$ implies that there exists $i \in B$ such that $f_i(x) \subseteq s_i$. This, in turn, implies that $x \models_S \bigvee_{i \in B}(i, s_i)$ for any $S \in \mathit{CL}(T)$. Thus, $x \models_T^+ \varphi$ implies that $x \models_S \varphi$ for all $S \in \mathit{CL}(T)$.

($\Leftarrow$): If $\varphi$ is in CNF and $x \models_S \varphi$ for all $S \in \mathit{CL}(T)$, then for any conjunct $\bigvee_{i \in B}(i, s_i)$ of $\varphi$, we have $x \models_S \bigvee_{i \in B}(i, s_i)$ for any $S \in \mathit{CL}(T)$. Assume $x \not\models_T^+ \bigvee_{i \in B}(i, s_i)$ for some conjunct $\bigvee_{i \in B}(i, s_i)$ of $\varphi$; then $f_i(x) \not\subseteq s_i$ holds for all $i \in B$. Thus, since the attributes in $B$ are mutually distinct, we can have an $S = (U, A, \{(V_i, \succeq_i) \mid i \in A\}, \{f'_i \mid i \in A\}) \in \mathit{CL}(T)$ such that $f'_i(x) \in f_i(x) \setminus s_i$ for all $i \in B$. Obviously, this implies that $x \not\models_S \bigvee_{i \in B}(i, s_i)$ and contradicts the fact that $x \models_S \bigvee_{i \in B}(i, s_i)$ for any $S \in \mathit{CL}(T)$. Therefore, we can derive $x \models_T^+ \bigvee_{i \in B}(i, s_i)$ for any conjunct $\bigvee_{i \in B}(i, s_i)$ of $\varphi$, and consequently, $x \models_T^+ \varphi$.
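The equivalences of Proposition 3.1 can be checked by brute force on small examples. The Python sketch below does so for a single clause, that is, a disjunction of atoms over distinct attributes; since the satisfaction of an object depends only on its own row, it enumerates the completions of one row at a time. The encoding, the helper names, and the sample rows are assumptions made for illustration, and the check is a sanity test rather than a substitute for the proof.

```python
# Illustrative brute-force check of Proposition 3.1 for one clause.
from itertools import product

def row_completions(row):
    """Yield every way of fixing one candidate value per attribute of a row."""
    attrs = sorted(row)
    for choice in product(*(sorted(row[i]) for i in attrs)):
        yield dict(zip(attrs, choice))

def classical_sat(values, clause):
    """Ordinary satisfaction of a disjunction of atoms in a completed row."""
    return any(values[i] in s for i, s in clause.items())

def pos_sat(row, clause):
    """Positive satisfaction: some atom (i, s) has f_i(x) inside s."""
    return any(row[i] <= s for i, s in clause.items())

def neg_sat(row, clause):
    """Negative satisfaction: the DNF of a clause is a disjunction of
    one-atom terms, so every atom must be excluded."""
    return all(row[i].isdisjoint(s) for i, s in clause.items())

def agrees_with_proposition(row, clause):
    outcomes = [classical_sat(c, clause) for c in row_completions(row)]
    indeterminate = not pos_sat(row, clause) and not neg_sat(row, clause)
    return (pos_sat(row, clause) == all(outcomes)
            and neg_sat(row, clause) == (not any(outcomes))
            and indeterminate == (any(outcomes) and not all(outcomes)))

clause = {"o": {3, 4}, "d": {4}}          # hypothetical clause on a 1..4 scale
rows = [
    {"o": {3, 4}, "d": {1, 2}},           # satisfied in every completion
    {"o": {1, 2}, "d": {2, 3}},           # satisfied in no completion
    {"o": {2, 3}, "d": {3, 4}},           # satisfied in some completions only
]
results = [agrees_with_proposition(row, clause) for row in rows]
```

The third row shows why the distinctness of the attributes matters in the proof: a falsifying completion can choose the value of each attribute independently.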

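Proposition 3.2 Let $\varphi \longrightarrow \psi$ be a POUDL rule and $T$ be a POUDT. Then we have

1. $|\underline{m}(\varphi \wedge \psi)| = \min_{S \in \mathit{CL}(T)} \mathit{asp}_S(\varphi \longrightarrow \psi)$ and $|\overline{m}(\varphi \wedge \psi)| = \max_{S \in \mathit{CL}(T)} \mathit{asp}_S(\varphi \longrightarrow \psi)$;

2. $\dfrac{|\underline{m}(\varphi \wedge \psi)|}{|U|} = \min_{S \in \mathit{CL}(T)} \mathit{rsp}_S(\varphi \longrightarrow \psi)$ and $\dfrac{|\overline{m}(\varphi \wedge \psi)|}{|U|} = \max_{S \in \mathit{CL}(T)} \mathit{rsp}_S(\varphi \longrightarrow \psi)$;

3. $\dfrac{|\underline{m}(\varphi \wedge \psi)|}{|\overline{m}(\varphi) \setminus (m^*(\varphi) \cap m^-(\varphi \wedge \neg\psi))|} = \min_{S \in \mathit{CL}(T)} \mathit{cfd}_S(\varphi \longrightarrow \psi)$ and $\dfrac{|\overline{m}(\varphi \wedge \psi)|}{|\overline{m}(\varphi) \setminus (m^*(\varphi) \cap m^-(\varphi \wedge \psi))|} = \max_{S \in \mathit{CL}(T)} \mathit{cfd}_S(\varphi \longrightarrow \psi)$.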
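The support part of Proposition 3.2 can be checked numerically on a small table. The Python sketch below assumes that the absolute support of a rule in a completion is the number of objects satisfying both its antecedent and its consequent, and that the two bounds are the sizes of the positive meaning set of the conjunction and of the complement of its negative meaning set. The rule, the table, and all function names are hypothetical, and the antecedent and consequent are single atoms on different attributes.

```python
# Illustrative brute-force check of the absolute support bounds.
from itertools import product

def completions(table):
    """Yield every completion: one chosen value per (object, attribute) cell."""
    cells = [(x, i) for x in table for i in table[x]]
    for choice in product(*(sorted(table[x][i]) for x, i in cells)):
        completed = {x: {} for x in table}
        for (x, i), value in zip(cells, choice):
            completed[x][i] = value
        yield completed

def absolute_support(completed, phi, psi):
    """Assumed definition: number of objects satisfying both atoms."""
    (i, s), (j, t) = phi, psi
    return sum(1 for row in completed.values() if row[i] in s and row[j] in t)

def lower_and_upper(table, phi, psi):
    """Objects known to satisfy the conjunction, and objects not known to fail it."""
    (i, s), (j, t) = phi, psi
    lower = {x for x, r in table.items() if r[i] <= s and r[j] <= t}
    excluded = {x for x, r in table.items()
                if r[i].isdisjoint(s) or r[j].isdisjoint(t)}
    return lower, set(table) - excluded

table = {
    "x1": {"o": {3, 4}, "d": {4}},        # satisfies both atoms in every completion
    "x2": {"o": {2, 3}, "d": {3, 4}},     # depends on the completion
    "x3": {"o": {1}, "d": {4}},           # fails the antecedent in every completion
}
phi, psi = ("o", {3, 4}), ("d", {3, 4})
supports = [absolute_support(c, phi, psi) for c in completions(table)]
lower, upper = lower_and_upper(table, phi, psi)
interval = (min(supports), max(supports))
```

In this example the absolute support interval is [1, 2], matching the sizes of the two sets, and dividing both ends by the number of objects gives the corresponding relative support interval.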

