知識擷取

(1)

Chapter 3 Knowledge Acquisition

(2)

3.1 INTRODUCTION

• The goal of knowledge acquisition (知識擷取) is to elicit expertise (專業知識) from domain experts (領域專家).

Knowledge base

Computerized Representation

Expert

(3)

Advantages of Employing Knowledge

Acquisition （知識擷取） Systems ：

1. They do not depend solely on training cases (訓練範例).

2. Real-time analysis is possible.

3. Real-time consistency checking is possible.

4. They can be integrated with KE tools.

5. Knowledge bases (知識庫) can be generated automatically.

(4)

REVIEWS OF PREVIOUS WORKS

Substantive Knowledge: to identify the current state, e.g. "Am I in danger of being attacked?"

Strategic Knowledge: to determine what to do next, e.g. "Climb to 30000 feet."

(5)

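[Figure: classification of knowledge acquisition (知識擷取) systems by approach (repertory grid approach vs. other approaches) and by the kind of knowledge acquired (substantive knowledge: classification; strategic knowledge: decision making, planning, control). Systems shown: MORE, SALT, MOLE, ASK, AQUINAS, KITTEN, KNACK, RuleCon, KRITON, TEIRESIAS.]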

(6)

 The Acquisition of Substantive Knowledge

• Repertory Grid (知識表格)-Oriented Methods:

Step 1. Elicit the elements to be classified.

Step 2. Elicit constructs from experts. Each time, three elements are selected, and the expert is asked to give a construct that distinguishes one element from the other two.

Step 3. Rate the grid by filling in a rating (1-5) for each entry.

Step 4. Generate the implication graph.

(7)

Step 1: Elicit elements from experts.

Elements: Measles, German Measles, Dengue Fever, Chickenpox, Smallpox

Step 2: Elicit constructs from experts.

Constructs (with opposite poles): high fever / no high fever; red / not red; purple / no purple; headache / no headache

[Figure: the grid with elements as columns and constructs as rows, only partially rated at this stage.]

(8)

Step 3: Rate each entry of the grid.

              Measles  German Measles  Dengue Fever  Chickenpox  Smallpox
high fever       4           1              5            4           2       no high fever
red              5           1              5            5           4       not red
purple           4           1              4            1           1       no purple
headache         1           2              4            4           2       no headache

Step 4: Generate the implication graph.

[Figure: implication graph linking the constructs headache, purple, high fever, and red.]

(9)

Rules generated from the grid:

First column:
IF   high_fever and red and purple and (not headache)
THEN Disease = Measles
     CF = MIN(0.8, 1.0, 0.8, 0.8) = 0.8

Second column:
IF   (not high_fever) and (not red) and (not purple) and (not headache)
THEN Disease = German Measles
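The rule generation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `column_rule` and the rating-to-certainty mapping (5 or 1 gives 1.0, 4 or 2 gives 0.8, 3 gives no predicate) are hypothetical and not given in the slides; the ratings are those of the example grid.

```python
# ASSUMPTION: ratings 5 and 1 map to certainty 1.0, ratings 4 and 2 to 0.8,
# and a rating of 3 contributes no predicate. The slides do not state this mapping.
CERTAINTY = {5: 1.0, 4: 0.8, 2: 0.8, 1: 1.0}

def column_rule(grid, column):
    """Build the IF-part and the CF for one element (column) of a rated grid."""
    predicates, certainties = [], []
    for construct, ratings in grid.items():
        rating = ratings[column]
        if rating == 3:
            continue  # neutral rating: neither pole applies
        predicates.append(construct if rating > 3 else f"(not {construct})")
        certainties.append(CERTAINTY[rating])
    if not predicates:
        return None  # no usable predicate for this element
    # The rule's CF is the minimum certainty over its predicates.
    return " and ".join(predicates), min(certainties)

# Ratings as read from the example grid
# (columns: Measles, German Measles, Dengue Fever, Chickenpox, Smallpox).
grid = {
    "high_fever": [4, 1, 5, 4, 2],
    "red":        [5, 1, 5, 5, 4],
    "purple":     [4, 1, 4, 1, 1],
    "headache":   [1, 2, 4, 4, 2],
}

measles_condition, measles_cf = column_rule(grid, 0)
print(f"IF {measles_condition} THEN Disease = Measles, CF = {measles_cf}")
```

Taking the minimum means a rule is only as certain as its weakest predicate, which is why a single rating of 4 is enough to bring the CF down to 0.8.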

(10)

Advantages of applying repertory grids (知識表格)

The elicited knowledge is easy to analyze:

1. Similarity analysis of constructs.

2. Similarity analysis of elements.

3. Analysis of the relationships among constructs.

4. Detection of missed elements.

(11)

3.2 ELICITATION OF SUBSTANTIVE KNOWLEDGE

Knowledge Representation (知識表示法)

Conventional repertory grid:

            dog   bird   fish
4-legs       5     1      1      not 4-legs
2-legs       1     5      1      not 2-legs
no-legs      1     1      5      has-legs

Acquisition table, with each entry given as (value, rating):

            dog   bird   fish
# of legs   4,2   2,2    0,2

(12)

An acquisition table is a repertory grid (知識表格) of multiple data types:

Boolean: true or false.

Single value: an integer, a real number, or a symbol.

Set of values: a set of integers, real numbers, or symbols.

Range of values: a range of integers or real numbers.

'X': no relation.

'U': unknown or undecidable.

Ratings:

2: very likely to be.

1: maybe.

(13)

3.3 Some Problems of Repertory Grids (知識表格)

Problem of Element Selection

        E1   E2   E3   E4   E5
C1       1    5    5    4    2     C'1
C2       1    5    1    1    5     C'2
C3       1    5    1    2    2     C'3
C4       1    5    1    1    4     C'4

(14)

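Problem of Multi-Level Knowledge and Acquirability

[Figure: multi-level decision structure in which input data determine subgoals, and subgoals together with further input data determine higher-level subgoals.]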

(15)

• The Concept of Acquirability:

The value of a terminal attribute of a decision tree must either be a constant or be acquirable from users. For example:

IF   (leaf-shape = scale) and (class = Gymnosperm)
THEN family = Cypress.

(16)

[Figure: decision tree in which Leaf Shape and Class determine Family.]

(17)

Domain basis and classification knowledge:

[Figure: the domain basis separates Acute Exanthemas from other diseases; the classification knowledge then distinguishes among Measles, German measles, Dengue fever, …]

(18)

Problem of Missing Embedded Meanings (隱含知識)

• When a diagnostician says that the features of catching a cold are headache, feeling tired, cough, sneezing, …, he means "if a person catches a cold, he may have those features."

• We usually represent this expertise as the following rule:

(Headache = yes) and (Feel_tired = yes) and (Cough = yes) and …,

(19)

• The embedded meaning (隱含知識) of the diagnostician:

"If one or some of the features do not appear, it is still possible that the patient has caught a cold."

(20)

3.4 EMCUD: A New Model for Eliciting

Knowledge Representation (知識表示法):

Conventional Repertory Grid (知識表格) or Acquisition Table

+

(21)

Eliciting embedded meanings (隱含知識) by constructing the Attribute Ordering Table (屬性序列表格)

• A value in an AOT may be:

'D': The attribute dominates (主導權) the object.

'X': The attribute has no relation to the object.

an integer: The attribute has some degree of importance to the object. (A smaller integer means less important.)

        Obj1   Obj2   Obj3   Obj4   Obj5
A1       D      D      2      1      D
A2       1      1      1      D      D
A3       X      X      D      1      D

(22)

The rule generated from the third column (Obj3):

RULE3: (13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3

where

F(confidence) = 1.0 if confidence = 2
              = 0.8 if confidence = 1

and

Certainty Factor CF = MIN(F(2), F(1), F(2)) = 0.8

An example of a Repertory Grid (知識表格), with each entry given as (value, confidence):

       Obj1           Obj2    Obj3         Obj4    Obj5
A1     {9,10,12},2    20,2    (13-16],2    17,2    3,2
A2     YES,1          NO,2    YES,1        YES,2   NO,2
A3     X              X       4.3,2        2.1,2   6.0,2

(23)

An example of constructing AOT.

EMCUD: If A1 ∉ {9,10,12}, is it possible that GOAL = Obj1?

EXPERT: No. /* This implies that A1 dominates Obj1 and AOT<Obj1,A1> = 'D' */

EMCUD: If A2 ≠ YES, is it possible that GOAL = Obj1?

EXPERT: Yes. /* A2 does not dominate Obj1 */

EMCUD: If A1 > 16 or A1 ≤ 13, is it possible that GOAL = Obj3?

EXPERT: Yes. /* A1 does not dominate Obj3 */

EMCUD: If A2 ≠ YES, is it possible that GOAL = Obj3?

EXPERT: Yes. /* A2 does not dominate Obj3 */

EMCUD: If A3 ≠ 4.3, is it possible that GOAL = Obj3?

(24)

EMCUD: Please rank A1 and A2 in order of importance to Obj3 by choosing one of the following expressions:

    1) A1 is more important than A2
    2) A1 is less important than A2
    3) A1 is as important as A2

EXPERT: 1 /* A1 is more important to Obj3 than A2, hence AOT<Obj3,A1> = 2 and AOT<Obj3,A2> = 1 */

        Obj1   Obj2   Obj3   Obj4   Obj5
A1       D      D      2      1      D
A2       1      1      1      D      D
A3       X      X      D      1      D

(25)

Elicit Embedded Meanings (隱含知識)

From RULE3, the following embedded rules (隱含規則) are generated by negating the predicates of A1 and A2:

RULE3,1: NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3

RULE3,2: (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3

RULE3,3: NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3

(26)

Certainty Sequence (CS):

Represents the degree of certainty degradation.

CS(RULEi,j) = SUM(AOT<Obji,Ak>) for each Ak in the negated predicates of RULEi,j

For example:

CS(RULE3,3) = AOT<Obj3,A1> + AOT<Obj3,A2> = 2 + 1 = 3

The embedded rules (隱含規則) generated from RULE3:

RULE3,1: NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3    CS = 2

RULE3,2: (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3    CS = 1

RULE3,3: NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3    CS = 3
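As a minimal sketch of how the embedded rules and their CS values could be enumerated from one AOT column, assuming that only attributes with an integer AOT value may be negated ('D' and 'X' entries are left alone). The function name `embedded_rules` and the dictionary layout are hypothetical, not part of EMCUD as described in the slides.

```python
from itertools import combinations

def embedded_rules(aot_column):
    """Enumerate embedded rules for one object.

    aot_column maps attribute name -> 'D', 'X', or an integer importance.
    Returns (negated attributes, CS) pairs sorted by increasing CS.
    """
    # Only attributes with an integer importance can have their predicate negated.
    negatable = [attr for attr, value in aot_column.items() if isinstance(value, int)]
    rules = []
    for size in range(1, len(negatable) + 1):
        for negated in combinations(negatable, size):
            # CS is the sum of the AOT values of the negated attributes.
            cs = sum(aot_column[attr] for attr in negated)
            rules.append((negated, cs))
    return sorted(rules, key=lambda rule: rule[1])

# AOT column for Obj3 from the example: A1 = 2, A2 = 1, A3 = 'D'.
aot_obj3 = {"A1": 2, "A2": 1, "A3": "D"}
rules_obj3 = embedded_rules(aot_obj3)
for negated, cs in rules_obj3:
    print("negate", ", ".join(negated), "-> CS =", cs)
```

The sorted output lists the rule negating A2 (CS = 1), then A1 (CS = 2), then both (CS = 3), which is the order needed when the constraint list is built.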

(27)

Construct Constraint List

1. Sort the embedded rules according to their CS values:

    RULE3,2    CS = 1
    RULE3,1    CS = 2
    RULE3,3    CS = 3

2. A prune-and-search algorithm:

    EMCUD: Do you think RULE3,1 is acceptable?
    Expert: Yes. /* then RULE3,2, which has a smaller CS, is also accepted */
    EMCUD: Do you think RULE3,3 is acceptable?
    Expert: No. /* then CS = 3 is recorded in the constraint list */

(28)

Calculate Certainty Factors （確定因子）

Confirm: 1.0

Strongly support: 0.8

Support: 0.6

May support: 0.4

CFi,j = Upper-Boundi − (CSi,j / MAX(CSi)) × (Upper-Boundi − Lower-Boundi)

MAX(CSi): maximum CS value of the embedded rules generated from RULEi.

Upper-Boundi： certainty factor of embedded

Lower-Boundi： certainty factor of embedded

(29)

An example of calculating certainty factors (確定因子)

For the embedded rules (隱含規則) from RULE3:

1. Upper-Bound = CF(RULE3) = 0.8

2. Since RULE3,3 is not accepted, the accepted embedded rule with MAX(CS) is RULE3,1:

    EMCUD: If RULE3 strongly supports GOAL = Obj3, what about RULE3,1?
    Expert: 1. /* The Lower-Bound = 0.6 */

    CF3,1 = 0.8 − (2/2) × (0.8 − 0.6) = 0.6

(30)

• The process of eliciting embedded meanings (隱含知識):

[Figure: flow of the elicitation process. The repertory grid yields the original rules; with the Attribute-Ordering Table, eliciting embedded rules yields the possible embedded rules; thresholding against the Constraint List leaves the accepted embedded rules; a mapping function then gives the certainty factors of the embedded rules.]

(31)

ACQUISITION TABLE

            Pneumonia
Cough       YES,2
Fatigue     YES,2
Headache    YES,1

AOT

            Pneumonia
Cough       D
Fatigue     2
Headache    1

(32)

Conventional Repertory Grids (知識表格):

IF (Cough = YES) & (Fatigue = YES) & (Headache = YES) THEN DISEASE = Pneumonia    CF = 0.8

EMCUD:

IF (Cough = YES) & (Fatigue <> YES) & (Headache = YES) THEN DISEASE = Pneumonia    CF = 0.67

IF (Cough = YES) & (Fatigue = YES) & (Headache <> YES) THEN DISEASE = Pneumonia    CF = 0.73

IF (Cough = YES) & (Fatigue <> YES) & (Headache <> YES) THEN DISEASE = Pneumonia    CF = 0.6
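The certainty factors of the three EMCUD rules above can be reproduced from the CF formula. The sketch below assumes an Upper-Bound of 0.8 (the CF of the original rule), a Lower-Bound of 0.6, and that all three embedded rules were accepted, so that MAX(CS) = 3; these values are inferred from the numbers in the example, and the function name `embedded_cf` is hypothetical.

```python
def embedded_cf(cs, max_cs, upper_bound, lower_bound):
    """CF = Upper-Bound - (CS / MAX(CS)) * (Upper-Bound - Lower-Bound)."""
    return upper_bound - (cs / max_cs) * (upper_bound - lower_bound)

# AOT for pneumonia: Cough = 'D', Fatigue = 2, Headache = 1.
# Negating Headache gives CS = 1, Fatigue gives CS = 2, both give CS = 3.
# ASSUMPTION: Lower-Bound = 0.6 and all three embedded rules are accepted.
pneumonia_cfs = {
    cs: round(embedded_cf(cs, max_cs=3, upper_bound=0.8, lower_bound=0.6), 2)
    for cs in (1, 2, 3)
}
print(pneumonia_cfs)
```

The formula interpolates linearly between the two bounds, so the embedded rule with the largest accepted CS always receives exactly the Lower-Bound.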

(33)

OBJECT CHAIN: A Method for Question Selection

• For a grid with 50 elements (or objects), there are C(50, 3) = 19600 possible choices of questions to elicit constructs (or attributes).

• Initial repertory grid (知識表格) and the object chains:

OBJECT CHAIN
Obj1 --> 2,3,4,5
Obj2 --> 1,3,4,5
Obj3 --> 1,2,4,5
Obj4 --> 1,2,3,5
Obj5 --> 1,2,3,4

        Obj1   Obj2   Obj3   Obj4   Obj5

(34)

• The expert gives attribute P1 to distinguish Obj1 and Obj2 from Obj3.

OBJECT CHAIN
Obj1 --> 2,5
Obj2 --> 1,5
Obj3 --> 4
Obj4 --> 3
Obj5 --> 1,2

        Obj1   Obj2   Obj3   Obj4   Obj5
P1       T      T      F      F      T

(35)

• The expert gives attribute P2 to distinguish Obj2 and Obj5 from Obj1.

OBJECT CHAIN
Obj1 --> NULL
Obj2 --> 5
Obj3 --> NULL
Obj4 --> NULL
Obj5 --> 2

        Obj1   Obj2   Obj3   Obj4   Obj5
P1       T      T      F      F      T
P2       T      F      T      F      F

(36)

• The expert gives attribute P3 to distinguish Obj2 from Obj5.

OBJECT CHAIN
Obj1 --> NULL
Obj2 --> NULL
Obj3 --> NULL
Obj4 --> NULL
Obj5 --> NULL

        Obj1   Obj2   Obj3   Obj4   Obj5
P1       T      T      F      F      T
P2       T      F      T      F      F
P3       F      T      T      F      F
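The bookkeeping behind the object chains can be sketched as follows: each object's chain lists the objects not yet distinguished from it, and every new attribute removes the objects whose value differs. The function name `update_chains` and the dictionary representation are hypothetical; the attribute values are P1 = T T F F T, P2 = T F T F F, and P3 = F T T F F.

```python
def update_chains(chains, values):
    """Drop from each object's chain the objects that the new attribute separates from it."""
    return {
        obj: [other for other in chain if values[other] == values[obj]]
        for obj, chain in chains.items()
    }

objects = [1, 2, 3, 4, 5]
# Initially no object is distinguished from any other.
initial = {obj: [other for other in objects if other != obj] for obj in objects}

p1 = {1: True, 2: True, 3: False, 4: False, 5: True}
p2 = {1: True, 2: False, 3: True, 4: False, 5: False}
p3 = {1: False, 2: True, 3: True, 4: False, 5: False}

after_p1 = update_chains(initial, p1)
after_p2 = update_chains(after_p1, p2)
after_p3 = update_chains(after_p2, p3)

# Elicitation can stop once every chain is empty (NULL).
finished = all(not chain for chain in after_p3.values())
print(after_p1, after_p2, after_p3, finished, sep="\n")
```

A non-empty chain tells the system which objects to ask about next, which is how each question stays tied to the objects that still need to be separated.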

(37)

• Advantages:

1. Fewer questions are asked (between log2 n and n−1 questions).

2. All of the objects are classified.

3. Every question matches the current requirement of classifying objects.

• Disadvantages:

1. It may force the expert to think in a specific direction.

2. Some important attributes may be ignored.

(38)

Eliciting a hierarchy of grids:

• For the expert system (專家系統) that classifies families of plants:

               Cypress      Pine                 Bald Cypress   Magnolia
Leaf shape     scale        needle               needle         scale
Needle pat.    X            {random, evenline}   evenline       X
Class (綱)     Gymnosperm   Gymnosperm           Gymnosperm     Magnolia
Silver band    X            T                    F              X

(39)

• Since class is not acquirable, it becomes the goal of a new grid.

             Gymnosperm   Magnolia   Angiosperm
Type (種)    Tree         Herb       Tree
Flate        F            T          T

(40)

• Since type is not acquirable, it becomes the goal of a new grid.

             Herb    Vine       Tree      Shrub
stem         green   woody      woody     woody
position     X       creeping   upright   upright
one trunk    F       T          T         F

(41)

Decision tree of the hierarchy of grids:

[Figure: FAMILY OF PLANT is decided from LEAF SHAPE, NEEDLE PATTERN, and CLASS; CLASS is in turn decided from TYPE and FLATE.]

(42)

3.5 An Application and Performance Evaluation of EMCUD

 Application Domain _：

Diagnosis of Acute Exanthema

 Hardware ：

Personal Computer

 Software _：

(43)

The codes of diseases and their translations:

1 - Measles               8 - Meningococcemia
2 - German measles        9 - Rocky Mt. spotted fever
3 - Chickenpox           10 - Typhus fevers
4 - Smallpox             11 - Infectious mononucleosis
5 - Scarlet              12 - Enterovirus infections
6 - Exanthem subitum     13 - Drug eruptions
7 - Fifth disease        14 - Eczema herpeticum

Diagnoses by case (X = no diagnosis produced):

Case number      1   2   3   4   5   6   7   8   9  10  11  12  13
Physician       12   3   3   1   2   1  14   2   6   5   5   3   1
Old prototype   12   X   X   X   X   X  14   X   6   X   X   3   1
New prototype   12   3   3   1   2   1  14   2   6   5   5   3   1

Case number     14  15  16  17  18  19  20  21  22  23  24  25
Physician        6   6  12   5   8   9  14  13   4   1   2  14
Old prototype    X   X  12   5   X   9  14  13   4   1   2  14
New prototype    6   6  12   5   8   9  14  13   4   1   2  14

(44)

3.6 Knowledge Integration (知識整合) from Multiple Experts

To build a reliable expert system, the cooperation of several experts is usually required.

Difficulties:

• Synonyms of elements (possible solutions)

• Synonyms of traits (attributes to classify the solutions)

• Conflicts of ratings

(45)

Integrated Knowledge

Use more attributes to make choices from more possible decisions

Habitual domain of Expert 1

Each expert has his own way to

(46)

[Figure: Expert 1, Expert 2, …, Expert N and the Knowledge Engineer; the experts are busy or far away.]

It is difficult to have all of the experts work together.

(47)

[Figure: Phase 1 interview: Expert 1, Expert 2, …, Expert N each produce Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N. The unions of the element sets and construct sets form the Common Repertory Grid. Phase 2 interview: Expert 1, Expert 2, …, Expert N eliminate redundant vocabulary.]

(48)

[Figure: Phase 3 interview: Expert 1, Expert 2, …, Expert N rate the common grid, giving Rated Common Repertory Grid 1, 2, …, N. Knowledge integration combines them into the Integrated Repertory Grid.]

(49)

[Figure: overview of the first stages. Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N → unions of element sets and construct sets → Phase 2 interview (Expert 1, Expert 2, …, Expert N eliminate redundant vocabulary) → Phase 3 interview.]

(50)

[Figure: overview of the remaining stages. Rated Common Repertory Grid 1, 2, …, N → Knowledge Integration → Integrated Repertory Grid (flat repertory grid) → Generate AOT → Filled AOT 1, Filled AOT 2, …, Filled AOT N → Integration of AOTs → Integrated AOT → Rule Generation.]

(51)

Knowledge Integration

Expert 1
                             E1   E2   E3   E4   E5
Eye pain                      5    4    1    4    5
Pupil size                    1    1    5    1    1
Headache                      4    4    5    3    1
Cornea                        5    5    5    4    3
Inflame of Eye                4    1    1    5    4
Tears                         4    1    1    5    5
Redness                       5    1    1    5    4
Vision                        1    4    5    1    1
Pupillary light response      5    2    2    5    5
Both Side                     5    1    4    1    1

Expert 2
                             E1   E2   E3   E4   E5
Eye pain                      5    3    1    5    4
Pupil size                    1    2    4    1    1
Headache                      3    4    5    2    1
Cornea                        5    5    5    3    2
Inflame of Eye                5    1    1    5    4
Tears                         4    1    1    4    5
Redness                       5    1    1    5    5
Vision                        1    3    4    1    1
Pupillary light response      5    2    1    5    5
Both Side                     5    1    3    1    1

(52)

Expert 3
                             E1   E2   E3   E4   E5
Eye pain                      5    4    1    5    5
Pupil size                    1    1    5    1    1
Headache                      4    4    5    2    1
Cornea                        5    5    5    4    2
Inflame of Eye                5    1    1    5    4
Tears                         4    1    1    5    5
Redness                       5    1    1    5    5
Vision                        1    4    5    1    1
Pupillary light response      5    2    1    5    5
Both Side                     5    1    4    1    1

(53)

Results of the first experiment

Differential Diagnosis for Common Causes of Inflamed Eyes. 60 test cases are used to evaluate the knowledge base from Expert 1, the knowledge base from Expert 2, and the integrated knowledge base.

Knowledge base    Ratio of Correct Diagnosis
Expert 1          0.67
Expert 2          0.64
Integrated        0.8

(54)

Results of the second experiment

Differential Diagnosis for Common Causes of Inflamed Eyes. 336 test cases are used to evaluate the knowledge base from Expert 1, the knowledge base from Expert 2, and the integrated knowledge base.

Knowledge base    Number of Correct Diagnosis    Ratio of Correct Diagnosis
Expert 1          255                            0.759
Expert 2          243                            0.723
Integrated        306                            0.911

(55)

3.7 Machine Learning (機器學習)

Building computer programs that are able to construct new knowledge or to improve knowledge they already possess.

Applications: Expert Systems, Cognitive Simulation, Problem Solving, Control, …

Examples:

Perceptron [Rosenblatt, 1961]
Meta-Dendral [Buchanan, Feigenbaum, Sridharan, 1972]
AM [Lenat, 1976]

(56)

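[Figure: architecture of a traditional expert system: knowledge source, knowledge editing interface, knowledge base, inference engine, user interface, user.]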

(57)

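[Figure: architecture of an expert system with inductive machine learning capability: knowledge source, example editing interface, example base, learning mechanism, knowledge base, inference engine, user interface, user.]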

(58)

Machine Learning

• Machine learning is central to A.I.

• Learning from training cases.

(59)

Taxonomy

[Michalski, 1983]

[Figure: taxonomy of learning: Rote Learning; Learning by Analogy; Learning from Examples; Learning from Observation and Discovery; Learning by ...]

(60)

Classification

[Figure: learning strategies classified as Symbolic Learning vs. Neural Learning and as Batch Learning vs. Incremental Learning. Examples shown: Version Space, ID3, PRISM, Perceptron.]

(61)

Symbolic Learning

Learning Unit

1. Attributes

2. Matching

3. Hypothesis Space

4. Training

(62)

Review of some data-driven learning

strategies

[T.M. Mitchell 1979]

1. Depth-first search

2. Specific-to-general breadth-first search

3. ID3

(63)

Description of instances:

an unordered pair of simple objects, each characterized by three attributes (size, color, shape)

Three instances:

1. {(Large,Red,Triangle) (Small,Blue,Circle)}        +
2. {(Large,Blue,Circle) (Small,Red,Triangle)}        +
3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}     −

(64)

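Depth-First Search

Training instances:
1. {(Large,Red,Triangle) (Small,Blue,Circle)}
2. {(Large,Blue,Circle) (Small,Red,Triangle)}
3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}

After instance 1: the current hypothesis is {(Large,Red,Triangle) (Small,Blue,Circle)}.

After instance 2: {(Large,Red,Triangle) (Small,Blue,Circle)} is generalized to {(Large,?,?) (Small,?,?)}.

After instance 3: {(Large,?,?) (Small,?,?)} is no longer consistent, so the search backtracks from {(Large,Red,Triangle) (Small,Blue,Circle)} to the alternative generalization {(?,Red,Triangle) (?,Blue,Circle)}.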

(65)

Disadvantages of Depth-First Search:

1. Needs backtracking.

2. Incurs the additional cost of maintaining consistency with past instances.

(66)

Specific-to-general breadth-first search

Training instances:
1. {(Large,Red,Triangle) (Small,Blue,Circle)}
2. {(Large,Blue,Circle) (Small,Red,Triangle)}
3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}

After instance 1: {(Large,Red,Triangle) (Small,Blue,Circle)}

After instance 2: {(Large,?,?) (Small,?,?)} and {(?,Red,Triangle) (?,Blue,Circle)}

After instance 3: {(?,Red,Triangle) (?,Blue,Circle)}

(67)

Disadvantages of breadth-first search:

　 Needs to check past negative instances to assure

(68)

Problem Description

1. A set of attributes:
   A: the age of the patient (年齡)
   B: spectacle prescription (視力)
   C: astigmatic (亂視)
   D: tear production rate (淚量)

2. Matching predicates:
   A = { A1: young (青年), A2: pre-presbyopic (中年), A3: presbyopic (老年) }
   B = { B1: myope (近視), B2: hypermetrope (遠視) }
   C = { C1: no (無), C2: yes (有) }
   D = { D1: reduced (較少), D2: normal (正常) }

3. A set of classes (Hypothesis Space):
   DEC1: hard contact lenses (硬式隱形眼鏡)
   DEC2: soft contact lenses (軟式隱形眼鏡)

(69)

4. Training Instances (訓練範例)

No.  A   B   C   D   Dec
1    A1  B1  C1  D1  Dec3
2    A1  B1  C1  D2  Dec2
3    A1  B1  C2  D1  Dec3
4    A1  B1  C2  D2  Dec1
5    A1  B2  C1  D1  Dec3
6    A1  B2  C1  D2  Dec2
7    A1  B2  C2  D1  Dec3
8    A1  B2  C2  D2  Dec1
9    A2  B1  C1  D1  Dec3
10   A2  B1  C1  D2  Dec2
11   A2  B1  C2  D1  Dec3
12   A2  B1  C2  D2  Dec1
13   A2  B2  C1  D1  Dec3
14   A2  B2  C1  D2  Dec2
15   A2  B2  C2  D1  Dec3
16   A2  B2  C2  D2  Dec1
17   A3  B1  C1  D1  Dec3
18   A3  B1  C1  D2  Dec2
19   A3  B1  C2  D1  Dec3
20   A3  B1  C2  D2  Dec1
21   A3  B2  C1  D1  Dec3
22   A3  B2  C1  D2  Dec2
23   A3  B2  C2  D1  Dec3
24   A3  B2  C2  D2  Dec1

(70)

Decision tree derived from the training instances by the basic inductive decision-tree learning algorithm

A=A1
  B=B1
    C=C1
      D=D1 [Dec3]
      D=D2 [Dec2]
    C=C2
      D=D1 [Dec3]
      D=D2 [Dec1]
  B=B2
    C=C1
      D=D1 [Dec3]
      D=D2 [Dec2]
    C=C2
      D=D1 [Dec3]
      D=D2 [Dec1]
A=A3
  B=B1
    C=C1 [Dec3]
    C=C2
      D=D1 [Dec3]
      D=D2 [Dec1]
  B=B2
    C=C1
      D=D1 [Dec3]
      D=D2 [Dec2]
A=A2
  B=B1
    C=C1
      D=D1 [Dec3]
      D=D2 [Dec2]
    C=C2
      D=D1 [Dec3]
      D=D2 [Dec1]
  B=B2
    C=C1
      D=D1 [Dec3]
      D=D2 [Dec2]
    C=C2 [Dec3]

(71)

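ID3

Assumption: a set C contains p positive examples and n negative examples.

1. The numbers of positive and negative examples in C reflect the general proportion of positive to negative examples: positive : negative = p/(p+n) : n/(p+n).

2. The expected minimum number of bits needed to express this information (the information content) is

$$I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$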

(72)

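From the data-compression point of view: information that occurs less often is represented with more bits, and information that occurs more often with fewer bits, so that the most information can be expressed with the least memory.

If the probability of occurrence is $\frac{p}{p+n}$, it is represented with $K \cdot \frac{1}{p/(p+n)}$ memory; if it is $\frac{n}{p+n}$, it is represented with $K \cdot \frac{1}{n/(p+n)}$ memory.

So, for p positive examples and n negative examples, the expected number of bits is

$$\frac{p}{p+n}\log_2\frac{K}{p/(p+n)} + \frac{n}{p+n}\log_2\frac{K}{n/(p+n)} = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n} + K'$$

where $K'$ is a constant. The number of bits is proportional to the amount of information.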

(73)

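[Figure: the set C, with p positive and n negative examples, is split by an attribute into subsets C1, C2, C3, ..., Cr, where Ci contains p_i positive and n_i negative examples. Candidate attributes: A1, A2, A3, ...]

$$E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)$$

$$\mathrm{gain}(A) = I(p, n) - E(A)$$

Here gain(A) is the information carried by A, I(p, n) is the original amount of information, and E(A) is the amount of information remaining after classifying with A, summed over the v subsets produced by A. Split on the attribute with the largest gain(A).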

(74)

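An alternative point of view: entropy (亂度)

[Figure: a set C in which positive and negative examples are mixed has high entropy; subsets produced by a split that contain mostly one class have low entropy.]

$$I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$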
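The information and gain computations can be sketched in Python on the 24 training instances. Two points are assumptions made for this illustration: the two-class formula I(p, n) is generalized to three decision classes by summing over all classes, and the instances are regenerated from the pattern visible in the training table (D1 gives Dec3; otherwise C1 gives Dec2 and C2 gives Dec1). The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2

def information(labels):
    """Entropy of a list of class labels (I(p, n) extended to several classes)."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def expected_information(rows, attr):
    """E(A): weighted information remaining after splitting on attr."""
    total = len(rows)
    remaining = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row["Dec"] for row in rows if row[attr] == value]
        remaining += len(subset) / total * information(subset)
    return remaining

def gain(rows, attr):
    """gain(A) = I(C) - E(A)."""
    return information([row["Dec"] for row in rows]) - expected_information(rows, attr)

# ASSUMPTION: reproduce the 24 training instances from the pattern in the table.
rows = []
for a, b, c, d in product(("A1", "A2", "A3"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2")):
    dec = "Dec3" if d == "D1" else ("Dec2" if c == "C1" else "Dec1")
    rows.append({"A": a, "B": b, "C": c, "D": d, "Dec": dec})

gains = {attr: gain(rows, attr) for attr in "ABCD"}
best = max(gains, key=gains.get)
print(gains, "-> split on", best)
```

On this data the largest gain belongs to D, so D becomes the root test; ID3 then repeats the same computation inside each branch.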

(75)

Decision tree derived from the training instances by the ID3 inductive decision-tree learning algorithm

D=D1 [Dec3]
D=D2
  C=C1
    A=A1 [Dec2]
    A=A2 [Dec2]
    A=A3
      B=B1 [Dec3]
      B=B2 [Dec2]
  C=C2
    B=B1 [Dec1]
    B=B2
      A=A1 [Dec1]
      A=A2 [Dec3]
      A=A3 [Dec3]

(76)

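An inductive decision tree can be converted into decision rules. For example, the second branch of the tree above can be expressed as the following rule:

IF   D = D2       (the patient's tear production rate is normal)
AND  C = C1       (the patient is not astigmatic)
AND  A = A1       (the patient is young)
THEN Dec = Dec2   (the contact-lens decision is soft contact lenses)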

(77)

Assume only one attribute exists.

Instance space: terminal nodes
Hypothesis space: all nodes
Predicates: predecessor-successor relations

Positive training instances: sin and cos
Negative training instance: ln

→ Concept: trig

[Figure: concept hierarchy. transc has children trig and explog; trig has children sin, cos, and tan; explog has children ln and exp.]

(78)

Terminology

• An Instance Space:

a set of instances which can be legally described by a given instance language
    - Attribute-based Instance Space
    - Structured Instance Space

• A Hypothesis Space:

a set of hypotheses which can be legally described by a generalization language
    - Conjunctive Form (most prevalent form), e.g. Color = red and shape = convex
    - Disjunctive Form, e.g. C1 or C2 or C3 …, where each Ci is a conjunctive form

(79)

Terminology

Predicates:

required for testing whether a given instance is contained in the instance set corresponding to a given hypothesis

• A powerful basis for organizing a search

• Two partial ordering relations exist: A is more specific (特殊) than B, and equivalently B is more general (泛化) than A, if each instance contained in A is also contained in B.

(80)

Incremental Learning (逐漸式學習演算法) for Conjunctive Hypotheses

Idea:

The version space is represented by two sets of hypotheses:

S: the most specific set (最特殊規則集) consistent with the training instances

G: the most general set (最泛化規則集) consistent with the training instances

[Figure: the version space lies between G (more general) and S (more specific).]

(81)

Example of Version Space

Instance      S       G
(sin, +)      sin     transc
(ln, −)       sin     trig
(cos, +)      trig    trig

Concept: trig

Lemma: ∀a ∈ S, ∀b ∈ G, a is more specific than b.

[Figure: concept hierarchy with root transc and children trig and explog.]
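The sin / ln / cos example can be traced with a small sketch of the S and G updates on a tree-shaped hierarchy. It assumes a single attribute whose values form the tree transc → {trig, explog}, trig → {sin, cos, tan}, explog → {ln, exp}, so that S and G are each a single node, and it assumes the first example is positive. The names `PARENT`, `ancestors`, `covers`, and `learn` are hypothetical.

```python
# Child -> parent links of the concept hierarchy.
PARENT = {
    "trig": "transc", "explog": "transc",
    "sin": "trig", "cos": "trig", "tan": "trig",
    "ln": "explog", "exp": "explog",
}

def ancestors(node):
    """Return the node followed by its ancestors, ending at the root."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def covers(hypothesis, instance):
    """A hypothesis node covers an instance if it is the instance or one of its ancestors."""
    return hypothesis in ancestors(instance)

def learn(examples, root="transc"):
    """Return the (S, G) pair after each example."""
    s, g, history = None, root, []
    for instance, positive in examples:
        if positive:
            # Minimal generalization of S: lowest node covering S and the instance.
            s = instance if s is None else next(
                n for n in ancestors(s) if covers(n, instance))
        elif covers(g, instance):
            if s is None:
                raise ValueError("this sketch expects a positive example first")
            # Minimal specialization of G: the most general node on the path
            # from the root down to S that excludes the negative instance.
            candidates = [n for n in reversed(ancestors(s)) if not covers(n, instance)]
            if not candidates:
                raise ValueError("no hypothesis is consistent with the examples")
            g = candidates[0]
        history.append((s, g))
    return history

history = learn([("sin", True), ("ln", False), ("cos", True)])
print(history)
```

When S and G meet at the same node, as they do here at trig, the concept has been identified without revisiting earlier instances.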

(82)

Training instances:
1. {(Large,Red,Triangle) (Small,Blue,Circle)}
2. {(Large,Blue,Circle) (Small,Red,Triangle)}
3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}

After instance 1:
S: {(Large,Red,Triangle) (Small,Blue,Circle)}
G: {(?,?,?) (?,?,?)}

After instance 2:
S: {(Large,?,?) (Small,?,?)}, {(?,Red,Triangle) (?,Blue,Circle)}
G: {(?,?,?) (?,?,?)}

After instance 3:
S: {(?,Red,Triangle) (?,Blue,Circle)}
G: {(?,Red,?) (?,?,?)}, {(?,?,Circle) (?,?,?)}

(83)

Check contradiction between S and G

• Step 1: Take a generalization s in S and a generalization g in G. If g is not more general than s, mark both s and g.

• Step 2: Repeat Step 1 until every pair of generalizations from S and G has been processed.

• Step 3: Discard the generalizations in S that have |G| marks and those in G that have |S| marks.
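The three steps can be sketched as a marking procedure. For brevity this illustration assumes each generalization is a flat tuple of attribute values with '?' as a wildcard, rather than the unordered pairs of the earlier example; the function names and the sample S and G sets are hypothetical.

```python
def more_general_or_equal(g, s):
    """g is at least as general as s if every position of g is '?' or equals s there."""
    return all(gv == "?" or gv == sv for gv, sv in zip(g, s))

def prune(S, G):
    """Mark every (s, g) pair where g is not more general than s, then discard
    members of S with |G| marks and members of G with |S| marks."""
    s_marks = {s: 0 for s in S}
    g_marks = {g: 0 for g in G}
    for s in S:                      # Steps 1 and 2: process every pair
        for g in G:
            if not more_general_or_equal(g, s):
                s_marks[s] += 1
                g_marks[g] += 1
    # Step 3: a member marked against every member of the other set is dropped.
    kept_S = [s for s in S if s_marks[s] < len(G)]
    kept_G = [g for g in G if g_marks[g] < len(S)]
    return kept_S, kept_G

# Hypothetical sets over the attributes (size, color, shape).
S = [("Large", "Red", "?"), ("Small", "?", "Circle")]
G = [("?", "Red", "?"), ("Large", "?", "?")]
kept_S, kept_G = prune(S, G)
print(kept_S, kept_G)
```

Here the second member of S is covered by no member of G, so it collects |G| = 2 marks and is discarded, while both members of G still cover the first member of S and are kept.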

(84)

Advantage of Version Space:

　 Needs not check past instances---the reason to

(85)

Exercise

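1. Using animal classification as an example, build a Repertory Grid (知識表格) and generate the corresponding inference rules.

2. Analyze whether the generated animal-classification inference rules are missing any Embedded Meanings (隱含知識).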

(86)

3. Use Depth-First Search to learn from the following training cases.

COLOR   black   brown   brown    black   brown    black   brown   brown   brown   black    black    black
SIZE    large   large   medium   small   medium   large   small   small   large   medium   medium   small
COAT    shaggy  smooth  shaggy   shaggy  smooth   smooth  shaggy  smooth  shaggy  shaggy   smooth   smooth
CLASS   +       +       -        +       +        +       -       +

(87)

4. Consider an analysis-type domain problem with the following attributes and decisions:

Attribute A = {A1, A2, A3}
Attribute B = {B1, B2}
Attribute C = {C1, C2}
Attribute D = {D1, D2, D3}
Decision Dec = {Dec1, Dec2, Dec3}

Training instances:

No.  A   B   C   D   Dec
1    A1  B1  C1  D1  Dec1
2    A1  B1  C1  D2  Dec1
3    A1  B1  C1  D3  Dec1
4    A1  B1  C2  D1  Dec1
5    A1  B1  C2  D2  Dec1
6    A1  B1  C2  D3  Dec1
7    A1  B2  C1  D1  Dec2
8    A1  B2  C1  D2  Dec2
9    A1  B2  C1  D3  Dec2
10   A1  B2  C2  D1  Dec2
11   A1  B2  C2  D2  Dec3
12   A1  B2  C2  D3  Dec2
13   A2  B1  C1  D1  Dec1
14   A2  B1  C1  D2  Dec1
15   A2  B1  C1  D3  Dec1
16   A2  B1  C2  D1  Dec1
17   A2  B1  C2  D2  Dec1
18   A2  B1  C2  D3  Dec1
19   A2  B2  C1  D1  Dec2
20   A2  B2  C1  D2  Dec2
21   A2  B2  C1  D3  Dec2
22   A2  B2  C2  D1  Dec2
23   A2  B2  C2  D2  Dec3
24   A2  B2  C2  D3  Dec2
25   A3  B1  C1  D1  Dec1
26   A3  B1  C1  D2  Dec1
27   A3  B1  C1  D3  Dec1
28   A3  B1  C2  D1  Dec1
29   A3  B1  C2  D2  Dec1
30   A3  B1  C2  D3  Dec1
31   A3  B2  C1  D1  Dec2
32   A3  B2  C1  D2  Dec2
33   A3  B2  C1  D3  Dec3
34   A3  B2  C2  D1  Dec2
35   A3  B2  C2  D2  Dec3
36   A3  B2  C2  D3  Dec3

Construct:

(1) the decision tree produced when the attributes are used in the order A, B, C, D;

(2) the decision tree produced by the ID3 algorithm.