Chapter 3
Knowledge Acquisition
3.1 INTRODUCTION
• The goal of knowledge acquisition (知識擷取) is to elicit expertise (專業知識) from domain experts (領域專家).
[Figure: Expert → Computerized Representation → Knowledge base]
Advantages of Employing Knowledge Acquisition (知識擷取) Systems :
1. They do not depend only on the training cases (訓練範例).
2. Real-time analysis is possible.
3. Real-time consistency checking is possible.
4. They can be integrated with KE tools.
5. Knowledge bases (知識庫) can be automatically generated.
REVIEWS OF PREVIOUS WORKS
Substantive Knowledge : To identify the current state
    "Am I in danger of being attacked?"
Strategic Knowledge : To determine what to do next
    "Climb to 30000 feet"
[Figure: taxonomy of knowledge acquisition (知識擷取) systems. Knowledge types: Substantive Knowledge, Strategic Knowledge. Task types: Classification, Decision making, Planning, Control. Approaches: Repertory Grid Approach, Other Approach. Systems shown: MORE, SALT, MOLE, ASK, AQUINAS, KITTEN, KNACK, RuleCon, KRITON, TEIRESIAS.]
The Acquisition of Substantive Knowledge
• Repertory Grid (知識表格) -Oriented Methods :
Step 1. Elicit elements to be classified.
Step 2. Elicit constructs from experts.
        Each time, three elements are selected. The expert is asked to give a construct
        that distinguishes one element from the other two.
Step 3. Rate the grid by filling in a rating (1-5) for each entry.
Step 4. Generate the implication graph.
Step 1 : Elicit elements from experts.
    Elements: Measles, German Measles, Dengue Fever, Chickenpox, Smallpox
Step 2 : Elicit constructs from experts.
    Constructs (pole rated 5 / opposite pole rated 1):
    high fever / no high fever, red / not red, purple / no purple, headache / no headache
Step 3 : Rate each entry of the grid.

                Measles  German    Dengue  Chickenpox  Smallpox
                         Measles   Fever
    high fever     4        1        5         4          2      no high fever
    red            5        1        5         5          4      not red
    purple         4        1        4         1          1      no purple
    headache       1        2        4         4          2      no headache

Step 4 : Generate the implication graph.
    [Figure: implication graph over the constructs headache, purple, high fever, and red]
Rules generated from the grid :
First column :
    IF high_fever and red and purple and (not headache)
    Then Disease = Measles
    CF = MIN (0.8,1.0,0.8,0.8) = 0.8
Second column :
    IF (not high_fever) and (not red) and (not purple) and (not headache)
    Then …
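The column-to-rule step can be sketched in a few lines. This is a minimal illustration, not the slides' own procedure: the grid values are the measles example read row by row, and the rating-to-certainty mapping `RATING_TO_CF` (5 or 1 gives 1.0, 4 or 2 gives 0.8, 3 is neutral) is an assumption, so individual condition values may differ from those printed on the slide even though the resulting CF for Measles is the same.

```python
# Ratings of the measles example grid (rows: constructs, columns: elements).
ELEMENTS = ["Measles", "German Measles", "Dengue Fever", "Chickenpox", "Smallpox"]
GRID = {
    "high_fever": [4, 1, 5, 4, 2],
    "red":        [5, 1, 5, 5, 4],
    "purple":     [4, 1, 4, 1, 1],
    "headache":   [1, 2, 4, 4, 2],
}

# ASSUMPTION: symmetric mapping from rating to certainty; a rating of 3 is neutral.
RATING_TO_CF = {5: 1.0, 4: 0.8, 2: 0.8, 1: 1.0}


def rule_for(column):
    """Build (conditions, conclusion, certainty factor) for one grid column."""
    conditions, factors = [], []
    for construct, ratings in GRID.items():
        rating = ratings[column]
        if rating == 3:          # neutral rating: the construct is left out
            continue
        # Ratings above 3 select the construct, ratings below 3 its opposite pole.
        conditions.append(construct if rating > 3 else f"(not {construct})")
        factors.append(RATING_TO_CF[rating])
    return conditions, ELEMENTS[column], min(factors)


if __name__ == "__main__":
    for col in range(len(ELEMENTS)):
        conds, disease, cf = rule_for(col)
        print(f"IF {' and '.join(conds)} THEN Disease = {disease} (CF = {cf})")
```

Taking the minimum over the conditions mirrors the MIN used in the first-column rule: a conjunction is only as certain as its weakest condition.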
Advantages of applying repertory grids (知識表格)
Easy to analyze the elicited knowledge :
1. Similarity analysis of constructs.
2. Similarity analysis of elements.
3. Analysis of the relationships among constructs.
4. Detection of missed elements.
3.2 ELICITATION OF SUBSTANTIVE KNOWLEDGE
Knowledge Representation (知識表示法)

Repertory grid:
                dog  bird  fish
    4-legs       5    1     1    not 4-legs
    2-legs       1    5     1    not 2-legs
    no-legs      1    1     5    has-legs

Acquisition table:
                dog   bird  fish
    # of legs   4,2   2,2   0,2
An acquisition table is a repertory grid (知識表格) of multiple data types :
Boolean : true or false
Single value : an integer, a real, or a symbol
Set of values : a set of integers, real numbers, or symbols.
Range of values : a set of integers or real numbers.
'X' : no relation.
'U' : unknown or undecidable.
Ratings :
2 : very likely to be.
1 : maybe.
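One possible in-memory form of an acquisition table entry is a `(value, rating)` pair, where the value is a Boolean, a single value, a set, a range, or one of the markers 'X' and 'U'. The sketch below is an assumption about representation only: the `Range` class, the half-open interval convention, and the function names are hypothetical and not part of the slides.

```python
from dataclasses import dataclass

NO_RELATION, UNKNOWN = "X", "U"


@dataclass(frozen=True)
class Range:
    """ASSUMPTION: a half-open range (low, high], i.e. low < x <= high."""
    low: float
    high: float


def matches(spec, observed):
    """True/False if the entry constrains the attribute, None for 'X' or 'U'."""
    if spec in (NO_RELATION, UNKNOWN):
        return None
    if isinstance(spec, (set, frozenset)):
        return observed in spec
    if isinstance(spec, Range):
        return spec.low < observed <= spec.high
    return spec == observed          # Boolean or single value


# The "# of legs" row of the dog / bird / fish table: (value, rating) per element.
LEGS = {"dog": (4, 2), "bird": (2, 2), "fish": (0, 2)}


def candidates(row, observed):
    """Elements whose entry in this row agrees with the observed value."""
    return [name for name, (spec, _rating) in row.items() if matches(spec, observed)]


if __name__ == "__main__":
    print(candidates(LEGS, 4))            # elements with four legs
    print(matches({9, 10, 12}, 10))       # set-valued entry
    print(matches(Range(13, 16), 16))     # range-valued entry
```

Returning `None` for 'X' and 'U' keeps "no constraint" distinct from "does not match", which matters when a rule is later built from only the constrained attributes.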
3.3 Some Problems of Repertory Grids (知識表格)
Problem of Element Selection

          E1  E2  E3  E4  E5
    C1     1   5   5   4   2    C'1
    C2     1   5   1   1   5    C'2
    C3     1   5   1   2   2    C'3
    C4     1   5   1   1   4    C'4
Problem of Multi-Level Knowledge and
Acquirability
[Figure: multi-level knowledge. INPUT DATA feed SUBGOALs, which are combined with further INPUT DATA to reach higher-level SUBGOALs.]
• The Concept of Acquirability :
The value of a terminal attribute of a decision tree must either be a constant or be acquirable from users. For example :
    IF (leaf-shape = scale) and (class = Gymnosperm)
    THEN family = Cypress.
[Figure: decision tree in which Leaf Shape and Class determine Family]
Domain basis and classification knowledge :
    Domain basis : Acute Exanthemas, other diseases
    Classification knowledge : Measles, German measles, Dengue fever, …
Problem of Missing Embedded Meanings (隱含知識)
• When a diagnostician says that the features of catching a cold are
    headache, feeling tired, cough, sneeze, …,
  he means "if a person catches a cold, he may have those features".
• We usually represent the expertise as rules of the following form:
    (Headache = yes) and (Feel_tired = yes) and (cough = yes) and …
• The embedded meaning (隱含知識) of the diagnostician:
    "If one or some of the features do not appear, it is still possible that the patient has caught a cold."
3.4 EMCUD : A New Model for Eliciting
Knowledge Representation (知識表示法):
    Conventional Repertory Grid (知識表格) or Acquisition Table
    +
    Eliciting embedded meanings (隱含知識) by constructing the Attribute Ordering Table (屬性序列表格)
• A value in an AOT may be :
'D' : The attribute dominates (主導權) the object.
'X' : The attribute has no relation with the object.
an integer : The attribute is of some degree of importance to the object. (A smaller integer means less important.)

          Obj1  Obj2  Obj3  Obj4  Obj5
    A1     D     D     2     1     D
    A2     1     1     1     D     D
    A3     X     X     D     1     D
The rule generated from the third column (Obj3) of the repertory grid :
RULE3: (13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
where
    F(confidence) = 1.0 if confidence = 2
                  = 0.8 if confidence = 1
and
    Certainty Factor CF = MIN(F(2), F(1), F(2)) = 0.8
An example of Repertory Grid (知識表格):

          Obj1          Obj2   Obj3        Obj4    Obj5
    A1    {9,10,12},2   20,2   (13-16],2   17,2    3,2
    A2    YES,1         NO,2   YES,1       YES,2   NO,2
    A3    X             X      4.3,2       2.1,2   6.0,2
An example of constructing AOT.
EMCUD : If A1 ∉ {9,10,12}, is it possible that GOAL = Obj1 ?
EXPERT : No. /* This implies that A1 dominates Obj1 and AOT<Obj1,A1> = 'D' */
EMCUD : If A2 ≠ YES, is it possible that GOAL = Obj1 ?
EXPERT : Yes. /* A2 does not dominate Obj1 */
EMCUD : If A1 > 16 or A1 ≤ 13, is it possible that GOAL = Obj3 ?
EXPERT : Yes. /* A1 does not dominate Obj3 */
EMCUD : If A2 ≠ YES, is it possible that GOAL = Obj3 ?
EXPERT : Yes. /* A2 does not dominate Obj3 */
EMCUD : If A3 ≠ 4.3, is it possible that GOAL = Obj3 ?
EXPERT : No. /* A3 dominates Obj3 and AOT<Obj3,A3> = 'D' */
EMCUD : Please rank A1 and A2 in the order of importance to Obj3 by choosing one of the following expressions :
    1) A1 is more important than A2
    2) A1 is less important than A2
    3) A1 is as important as A2
EXPERT : 1 /* A1 is more important to Obj3 than A2, hence AOT<Obj3,A1> = 2 and AOT<Obj3,A2> = 1 */

The resulting AOT:
          Obj1  Obj2  Obj3  Obj4  Obj5
    A1     D     D     2     1     D
    A2     1     1     1     D     D
    A3     X     X     D     1     D
Elicit Embedded Meanings (隱含知識)
From RULE3, the following embedded rules (隱含規則) will be generated by negating the predicates of A1 and A2 :
RULE3,1 : NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
RULE3,2 : (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
RULE3,3 : NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
Certainty Sequence (CS) :
Represents the degree of certainty degradation.
    CS(RULEij) = SUM(AOT<Obji,Ak>) for each Ak in the negated predicates of RULEij
For example :
    CS(RULE3,3) = AOT<Obj3,A1> + AOT<Obj3,A2> = 2 + 1 = 3
The embedded rules (隱含規則) generated from RULE3 :
RULE3,1 : NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3    CS = 2
RULE3,2 : (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3    CS = 1
RULE3,3 : NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3    CS = 3
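The generation of embedded rules and their CS values can be sketched as follows. The sketch assumes that only attributes with an integer AOT entry may be negated ('D' and 'X' entries are never negated) and enumerates every non-empty subset of them; the function name and the data layout are illustrative choices, not part of EMCUD's published interface.

```python
from itertools import combinations

# AOT column for Obj3: 'D' = dominant, 'X' = no relation, integer = importance.
AOT_OBJ3 = {"A1": 2, "A2": 1, "A3": "D"}


def embedded_rules(aot_column):
    """Return (negated attributes, CS) pairs, sorted by ascending CS."""
    negatable = [attr for attr, value in aot_column.items() if isinstance(value, int)]
    rules = []
    for size in range(1, len(negatable) + 1):
        for negated in combinations(negatable, size):
            # CS is the sum of the AOT entries of the negated predicates.
            cs = sum(aot_column[attr] for attr in negated)
            rules.append((negated, cs))
    return sorted(rules, key=lambda rule: rule[1])


if __name__ == "__main__":
    for negated, cs in embedded_rules(AOT_OBJ3):
        print(f"negate {', '.join(negated)}: CS = {cs}")
```

For Obj3 this yields the three embedded rules of RULE3 with CS values 1 (negating A2), 2 (negating A1), and 3 (negating both).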
Construct Constraint List
1. Sort the embedded rules according to the CS values :
    RULE3,2    CS = 1
    RULE3,1    CS = 2
    RULE3,3    CS = 3
2. A prune-and-search algorithm :
EMCUD : Do you think RULE3,1 is acceptable?
Expert : Yes. /* then RULE3,2 is also accepted*/
EMCUD : Do you think RULE3,3 is acceptable?
Expert : No. /* then CS=3 is recorded in the constraint list */
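One way to read the prune-and-search step is as a binary search over the CS-sorted rules. The sketch below rests on two assumptions that go beyond the dialogue: acceptability is monotone in CS (accepting a rule implies accepting every rule with a smaller CS, as the comment on RULE3,2 suggests), and the first rule queried is the middle one. The `accepts` callback stands in for the expert and is hypothetical.

```python
def build_constraint(sorted_rules, accepts):
    """sorted_rules: (name, cs) pairs in ascending CS order.
    accepts: callback name -> bool, standing in for the expert's answer.
    Returns (accepted rules, CS threshold recorded in the constraint list)."""
    lo, hi = 0, len(sorted_rules)    # rules[:lo] accepted, rules[hi:] rejected
    while lo < hi:
        mid = (lo + hi) // 2
        if accepts(sorted_rules[mid][0]):
            lo = mid + 1             # everything up to mid is accepted as well
        else:
            hi = mid                 # everything from mid on is pruned
    threshold = sorted_rules[lo][1] if lo < len(sorted_rules) else None
    return sorted_rules[:lo], threshold


RULES = [("RULE3,2", 1), ("RULE3,1", 2), ("RULE3,3", 3)]

if __name__ == "__main__":
    asked = []

    def expert(name):
        asked.append(name)
        return name in {"RULE3,2", "RULE3,1"}

    accepted, threshold = build_constraint(RULES, expert)
    print("questions asked:", asked)
    print("accepted:", [name for name, _ in accepted], "constraint CS =", threshold)
```

With the three embedded rules of RULE3 the search asks about RULE3,1 and then RULE3,3, the same two questions as in the dialogue, and records CS = 3 as the constraint.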
Calculate Certainty Factors (確定因子)
Confirm : 1.0
Strongly support : 0.8
Support : 0.6
May support : 0.4
CFij = Upper-Boundi - (CSij / MAX(CSi)) × (Upper-Boundi - Lower-Boundi)
MAX(CSi) : maximum CS value of the embedded rules generated from RULEi.
Upper-Boundi: certainty factor of embedded
Lower-Boundi: certainty factor of embedded
An example of calculating certainty factors (確定因子)
For the embedded rules (隱含規則) from RULE3:
1. Upper-Bound = CF(RULE3) = 0.8
2. Since RULE3,3 is not accepted, the accepted embedded rule with MAX(CS) is RULE3,1:
EMCUD : If RULE3 strongly supports GOAL = Obj3, what about RULE3,1 ?
Expert : 1. /* The Lower-Bound = 0.6 */
CF3,1 = 0.8 - (2/2) × (0.8 - 0.6) = 0.6
• The process of eliciting embedded meanings (隱含知識):
[Figure: repertory grid → original rules; Attribute-Ordering Table → eliciting embedded rules → possible embedded rules; Constraint List → thresholding → accepted embedded rules; mapping function → certainty factors of the embedded rules]
ACQUISITION TABLE
                 Pneumonia
    Cough        YES,2
    Fatigue      YES,2
    Headache     YES,1

AOT
                 Pneumonia
    Cough        D
    Fatigue      2
    Headache     1

Conventional Repertory Grids (知識表格):
IF (Cough = YES) & (Fatigue = YES) & (Headache = YES) THEN DISEASE = Pneumonia CF = 0.8
EMCUD :
IF (Cough = YES) & (Fatigue <> YES) & (Headache = YES) THEN DISEASE = Pneumonia CF = 0.67
IF (Cough = YES) & (Fatigue = YES) & (Headache <> YES) THEN DISEASE = Pneumonia CF = 0.73
IF (Cough = YES) & (Fatigue <> YES) & (Headache <> YES) THEN DISEASE = Pneumonia CF = 0.6
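The certainty factors above follow from the mapping function CFij = Upper-Boundi - (CSij / MAX(CSi)) × (Upper-Boundi - Lower-Boundi). The short sketch below recomputes them; the function name is illustrative, and the bounds 0.8 and 0.6 and the CS values 2, 1, and 3 (the AOT entries of the negated attributes) are taken from the example.

```python
def embedded_cf(cs, max_cs, upper, lower):
    """Certainty factor of an embedded rule, interpolated between the bounds."""
    return upper - (cs / max_cs) * (upper - lower)


if __name__ == "__main__":
    upper, lower, max_cs = 0.8, 0.6, 3
    # CS of each embedded rule: the sum of AOT entries of the negated attributes.
    for negated, cs in [("Fatigue", 2), ("Headache", 1), ("Fatigue and Headache", 3)]:
        cf = embedded_cf(cs, max_cs, upper, lower)
        print(f"negate {negated}: CS = {cs}, CF = {cf:.2f}")
```

The same function reproduces CF3,1 from the RULE3 example: with CS = 2, MAX(CS) = 2, and bounds 0.8 and 0.6, the result is 0.6.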
OBJECT CHAIN : A METHOD FOR QUESTION SELECTION
• For a grid with 50 elements (or objects), there are 19600 possible choices of questions to elicit constructs (or attributes).
• Initial repertory grid (知識表格) and the object chains :

OBJECT CHAIN
    Obj1 --> 2,3,4,5
    Obj2 --> 1,3,4,5
    Obj3 --> 1,2,4,5
    Obj4 --> 1,2,3,5
    Obj5 --> 1,2,3,4

Grid (no attributes yet):
    Obj1  Obj2  Obj3  Obj4  Obj5
• The expert gives attribute P1 to distinguish Obj1 and Obj2 from Obj3.

OBJECT CHAIN
    Obj1 --> 2,5
    Obj2 --> 1,5
    Obj3 --> 4
    Obj4 --> 3
    Obj5 --> 1,2

          Obj1  Obj2  Obj3  Obj4  Obj5
    P1     T     T     F     F     T
• The expert gives attribute P2 to distinguish Obj2 and Obj5 from Obj1.

OBJECT CHAIN
    Obj1 --> NULL
    Obj2 --> 5
    Obj3 --> NULL
    Obj4 --> NULL
    Obj5 --> 2

          Obj1  Obj2  Obj3  Obj4  Obj5
    P1     T     T     F     F     T
    P2     T     F     T     F     F
• The expert gives attribute P3 to distinguish Obj2 from Obj5.

OBJECT CHAIN
    Obj1 --> NULL
    Obj2 --> NULL
    Obj3 --> NULL
    Obj4 --> NULL
    Obj5 --> NULL

Obj1 Obj2 Obj3 Obj4 Obj5 P1 P2 P3 T T T F T T F T F F F T T F F
• Advantages :
1. Fewer questions are asked (log2 n to n-1 questions).
2. All of the objects are classified.
3. Every question matches the current requirement of classifying objects.
• Disadvantages :
1. It may force the expert to think in a specific direction.
2. Some important attributes may be ignored.
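The object chain bookkeeping can be sketched briefly. The sketch assumes Boolean attributes and keeps, for every object, the set of objects it has not yet been distinguished from; the function names are illustrative. The attribute values for P1 and P2 are those of the example, and the first line checks the count of 19600 triads for 50 elements.

```python
from math import comb


def initial_chains(n):
    """Every object starts out undistinguished from all the others."""
    return {i: {j for j in range(1, n + 1) if j != i} for i in range(1, n + 1)}


def apply_attribute(chains, values):
    """values: object -> bool. Keep in each chain only the objects that
    share the object's value, since the new attribute cannot separate them."""
    return {i: {j for j in chain if values[j] == values[i]}
            for i, chain in chains.items()}


P1 = {1: True, 2: True, 3: False, 4: False, 5: True}
P2 = {1: True, 2: False, 3: True, 4: False, 5: False}

if __name__ == "__main__":
    print("triads for 50 elements:", comb(50, 3))
    chains = initial_chains(5)
    for name, values in [("P1", P1), ("P2", P2)]:
        chains = apply_attribute(chains, values)
        print(name, {f"Obj{i}": sorted(c) or "NULL" for i, c in chains.items()})
```

After P1 and P2 only Obj2 and Obj5 remain in each other's chains, so the next question only needs an attribute that separates those two objects.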
Eliciting hierarchy of grids :
• For the expert system (專家系統) of classifying families of plants

                   Cypress      Pine                Bald Cypress   Magnolia
    Leaf shape     scale        needle              needle         scale
    Needle pat.    X            {random,evenline}   evenline       X
    Class (綱)     Gymnosperm   Gymnosperm          Gymnosperm     Magnolia
    Silver band    X            T                   F              X
• Since class is not acquirable, it becomes the goal of a new grid.

                   Gymnosperm   Magnolia   Angiosperm
    Type (種)      Tree         Herb       Tree
    Flate          F            T          T
• Since type is not acquirable, it becomes the goal of a new grid.

                   Herb    Vine       Tree      Shrub
    stem           green   woody      woody     woody
    position       X       creeping   upright   upright
    one trunk      F       T          T         F
Decision tree of the hierarchy of grids :
[Figure: FAMILY OF PLANT is decided from LEAF SHAPE, NEEDLE PATTERN, and CLASS; CLASS is decided from TYPE and FLATE.]
3.5 An Application and Performance Evaluation
of EMCUD
Application Domain :
Diagnosis of Acute Exanthema
Hardware :
Personal Computer
Software :
The codes of diseases and their translations:
    1 - Measles               8 - Meningococcemia
    2 - German measles        9 - Rocky Mt. Spotted fever
    3 - Chickenpox           10 - Typhus fevers
    4 - Smallpox             11 - Infectious mononucleosis
    5 - Scarlet              12 - Enterovirus infections
    6 - Exanthem subitum     13 - Drug eruptions
    7 - Fifth disease        14 - Eczema herpeticum

    case number      1   2   3   4   5   6   7   8   9  10  11  12  13
    physician       12   3   3   1   2   1  14   2   6   5   5   3   1
    old prototype   12   X   X   X   X   X  14   X   6   X   X   3   1
    new prototype   12   3   3   1   2   1  14   2   6   5   5   3   1

    case number     14  15  16  17  18  19  20  21  22  23  24  25
    physician        6   6  12   5   8   9  14  13   4   1   2  14
    old prototype    X   X  12   5   X   9  14  13   4   1   2  14
    new prototype    6   6  12   5   8   9  14  13   4   1   2  14
3.6 Knowledge integration (知識整合) from multiple experts
To build a reliable expert system, the cooperation of several experts is usually required.
Difficulties :
• Synonyms of elements (possible solutions)
• Synonyms of traits (attributes to classify the solutions)
• Conflicts of ratings
Integrated Knowledge
Use more attributes to make choices from more possible decisions
Habitual domain of Expert 1
Each expert has his own way to
[Figure: the Knowledge Engineer has to reach Expert 1, Expert 2, ..., Expert N; every expert is busy and some are far away.]
It is difficult to have all of the experts work together.
[Figure: knowledge integration from multiple experts]
Phase 1 interview : Expert 1, Expert 2, …, Expert N each produce a repertory grid (Repertory Grid 1, Repertory Grid 2, …, Repertory Grid N). The unions of the element sets and construct sets form the Common Repertory Grid.
Phase 2 interview : Expert 1, Expert 2, …, Expert N eliminate some redundant vocabularies from the Common Repertory Grid.
Phase 3 interview : each expert rates the Common Repertory Grid (Rated Common Repertory Grid 1, Rated Common Repertory Grid 2, …, Rated Common Repertory Grid N). Knowledge Integration combines the rated grids into the Integrated Repertory Grid.
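Integration of AOTs
[Figure: Generate AOT: an AOT is generated from the Flat Repertory Grid; each expert fills it in (Filled AOT 1, Filled AOT 2, …, Filled AOT N); the filled AOTs are combined into the Integrated AOT, which is used for Rule Generation.]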
Rated grids of the three experts, the input to Knowledge Integration:

                                 Expert 1          Expert 2          Expert 3
                              E1 E2 E3 E4 E5    E1 E2 E3 E4 E5    E1 E2 E3 E4 E5
    Eye pain                   5  4  1  4  5     5  3  1  5  4     5  4  1  5  5
    Pupil size                 1  1  5  1  1     1  2  4  1  1     1  1  5  1  1
    Headache                   4  4  5  3  1     3  4  5  2  1     4  4  5  2  1
    Cornea                     5  5  5  4  3     5  5  5  3  2     5  5  5  4  2
    Inflame of Eye             4  1  1  5  4     5  1  1  5  4     5  1  1  5  4
    Tears                      4  1  1  5  5     4  1  1  4  5     4  1  1  5  5
    Redness                    5  1  1  5  4     5  1  1  5  5     5  1  1  5  5
    Vision                     1  4  5  1  1     1  3  4  1  1     1  4  5  1  1
    Pupillary light response   5  2  2  5  5     5  2  1  5  5     5  2  1  5  5
    Both Side                  5  1  4  1  1     5  1  3  1  1     5  1  4  1  1
Results of the first experiment
Differential Diagnosis for Common Causes of Inflamed Eyes. 60 test cases are used to evaluate the knowledge base from Expert 1, the knowledge base from Expert 2, and the integrated knowledge base.

    Knowledge base    Ratio of Correct Diagnosis
    Expert 1          0.67
    Expert 2          0.64
    Integrated        0.8
Results of the second experiment
Differential Diagnosis for Common Causes of Inflamed Eyes. 336 test cases are used to evaluate the knowledge base from Expert 1, the knowledge base from Expert 2, and the integrated knowledge base.

    Knowledge base    Number of Correct Diagnosis    Ratio of Correct Diagnosis
    Expert 1          255                            0.759
    Expert 2          243                            0.723
    Integrated        306                            0.911
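3.7 Machine Learning (機器學習)
Building computer programs able to construct new knowledge or to improve already possessed knowledge
Application : Expert Systems, Cognitive Simulation, Problem Solving, Control, …
Example :
    Perceptron [Rosenblatt, 1961]
    Meta-Dendral [Buchanan, Feigenbaum, Sridharan, 1972]
    AM [Lenat, 1976]
[Figure: architecture of a traditional expert system. Components: knowledge source, knowledge editing interface, knowledge base, inference engine, user interface, user.]
[Figure: architecture of an expert system with inductive machine learning capability. Components: knowledge source, example editing interface, example base, learning mechanism, knowledge base, inference engine, user interface, user.]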
Machine Learning
• Central to A.I.
• Learning from training cases.
Taxonomy [Michalski, 1983]
[Figure: taxonomy of learning. Branches shown: Rote Learning, Learning by Analogy, Learning from Examples, Learning from Observation and Discovery; the figure also carries the label Classification.]
Learning Strategies
[Figure: learning strategies are divided into Symbolic Learning and Neural Learning, and into Batch Learning and Incremental Learning. Examples shown: Version Space, ID3, PRISM, Perceptron.]
Symbolic Learning
Learning Unit
1. Attributes
2. Matching predicates
3. Hypothesis space
4. Training instances
Review of some data-driven learning strategies [T.M. Mitchell 1979]
1. Depth-first search
2. Specific-to-general breadth-first search
3. ID3
Description of instances:
    an unordered pair of simple objects, each characterized by three attributes (size, color, shape)
Three instances:
    1. {(Large,Red,Triangle) (Small,Blue,Circle)}       +
    2. {(Large,Blue,Circle) (Small,Red,Triangle)}       +
    3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}    -
Depth-First Search
Training instances:
    1. {(Large,Red,Triangle) (Small,Blue,Circle)}
    2. {(Large,Blue,Circle) (Small,Red,Triangle)}
    3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}
Current hypothesis:
    after instance 1: {(Large,Red,Triangle) (Small,Blue,Circle)}
    after instance 2: {(Large,?,?) (Small,?,?)}
    after instance 3: backtrack from {(Large,?,?) (Small,?,?)} to {(?,Red,Triangle) (?,Blue,Circle)}
Disadvantages of Depth-First Search:
1. Needs backtracking
2. Needs the additional cost of maintaining consistency with past instances
Specific-to-general breadth-first search
Training instances:
    1. {(Large,Red,Triangle) (Small,Blue,Circle)}
    2. {(Large,Blue,Circle) (Small,Red,Triangle)}
    3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}
Set of candidate hypotheses:
    after instance 1: {(Large,Red,Triangle) (Small,Blue,Circle)}
    after instance 2: {(Large,?,?) (Small,?,?)}, {(?,Red,Triangle) (?,Blue,Circle)}
    after instance 3: {(?,Red,Triangle) (?,Blue,Circle)}
Disadvantages of breadth-first search:
Needs to check past negative instances to assure
Problem Description
1. A set of attributes = {
       A : the age of the patient (年齡),
       B : spectacle prescription (視力),
       C : astigmatic (亂視),
       D : tear production rate (淚量) }
2. Matching Predicates:
       A = { A1 : young (青年), A2 : pre-presbyopic (中年), A3 : presbyopic (老年) }
       B = { B1 : myope (近視), B2 : hypermetrope (遠視) }
       C = { C1 : no (無), C2 : yes (有) }
       D = { D1 : reduced (較少), D2 : normal (正常) }
3. A set of classes (Hypothesis Space) = {
       DEC1 : hard contact lenses (硬式隱形眼鏡),
       DEC2 : soft contact lenses (軟式隱形眼鏡),
       …
4. Training Instances (訓練範例)

    No.  A   B   C   D   Dec     No.  A   B   C   D   Dec     No.  A   B   C   D   Dec
     1   A1  B1  C1  D1  Dec3     2   A1  B1  C1  D2  Dec2     3   A1  B1  C2  D1  Dec3
     4   A1  B1  C2  D2  Dec1     5   A1  B2  C1  D1  Dec3     6   A1  B2  C1  D2  Dec2
     7   A1  B2  C2  D1  Dec3     8   A1  B2  C2  D2  Dec1     9   A2  B1  C1  D1  Dec3
    10   A2  B1  C1  D2  Dec2    11   A2  B1  C2  D1  Dec3    12   A2  B1  C2  D2  Dec1
    13   A2  B2  C1  D1  Dec3    14   A2  B2  C1  D2  Dec2    15   A2  B2  C2  D1  Dec3
    16   A2  B2  C2  D2  Dec1    17   A3  B1  C1  D1  Dec3    18   A3  B1  C1  D2  Dec2
    19   A3  B1  C2  D1  Dec3    20   A3  B1  C2  D2  Dec1    21   A3  B2  C1  D1  Dec3
    22   A3  B2  C1  D2  Dec2    23   A3  B2  C2  D1  Dec3    24   A3  B2  C2  D2  Dec1
Decision tree derived by the basic inductive decision-tree learning algorithm (from the training instances)

    A=A1
        B=B1
            C=C1
                D=D1 [Dec3]
                D=D2 [Dec2]
            C=C2
                D=D1 [Dec3]
                D=D2 [Dec1]
        B=B2
            C=C1
                D=D1 [Dec3]
                D=D2 [Dec2]
            C=C2
                D=D1 [Dec3]
                D=D2 [Dec1]
    A=A3
        B=B1
            C=C1 [Dec3]
            C=C2
                D=D1 [Dec3]
                D=D2 [Dec1]
        B=B2
            C=C1
                D=D1 [Dec3]
                D=D2 [Dec2]
    A=A2
        B=B1
            C=C1
                D=D1 [Dec3]
                D=D2 [Dec2]
            C=C2
                D=D1 [Dec3]
                D=D2 [Dec1]
        B=B2
            C=C1
                D=D1 [Dec3]
                D=D2 [Dec2]
            C=C2 [Dec3]

Assumption: p positive examples, n negative examples
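C : the set of training examples.
1. The numbers of positive and negative examples contained in C reflect the general proportion of positive to negative examples:
   positive : negative = $\frac{p}{p+n} : \frac{n}{p+n}$
2. The expected minimum number of bits needed to express this information (the amount of information it contains) is
   $I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$

ID3
From the viewpoint of data compression:
• Information that occurs less often is represented with more bits.
• Information that occurs more often is represented with fewer bits.
• In this way the most information can be expressed with the least memory.
If the probability of occurrence is $\frac{p}{p+n}$, it is represented with $K\log_2\frac{p+n}{p}$ units of memory; if it is $\frac{n}{p+n}$, with $K\log_2\frac{p+n}{n}$ units of memory.
So the expected number of bits for p positive and n negative examples is
   $-\frac{p}{p+n}\,K\log_2\frac{p}{p+n} - \frac{n}{p+n}\,K\log_2\frac{n}{p+n} = K \cdot I(p, n)$
The number of bits is proportional to the amount of information.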
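[Figure: attribute A partitions the set C (p positive and n negative examples) into subsets C1, C2, C3, ..., Cr; subset Ci contains pi positive and ni negative examples.]
gain(A) = I(p, n) - E(A)
    gain(A) : the information carried by A
    I(p, n) : the original amount of information
    E(A) : the amount of information remaining after classifying by A
Choose the attribute with the largest gain(A) for the split.
   $E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)$
An Alternative point of view : Entropy (亂度)
[Figure: a set C that mixes + and - examples has high entropy; after splitting on an attribute, subsets that are mostly + or mostly - have low entropy.]
   $I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$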
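The gain computation can be tried on the contact-lens training instances. This is a sketch under two stated assumptions: the two-class measure I(p, n) is generalized to several classes in the usual entropy form, because the table has three decisions, and the labels are generated by a small rule (`printed_label`, a hypothetical helper) that reproduces the 24 rows of the training table as printed.

```python
from collections import Counter
from itertools import product
from math import log2

ATTRS = ["A", "B", "C", "D"]


def printed_label(a, b, c, d):
    """Reproduces the Dec column of the 24 training instances as printed."""
    if d == "D1":
        return "Dec3"
    return "Dec2" if c == "C1" else "Dec1"


DATA = [((a, b, c, d), printed_label(a, b, c, d))
        for a, b, c, d in product(["A1", "A2", "A3"], ["B1", "B2"],
                                  ["C1", "C2"], ["D1", "D2"])]


def info(labels):
    """Entropy in bits; equals I(p, n) when there are two classes."""
    total = len(labels)
    return -sum((k / total) * log2(k / total) for k in Counter(labels).values())


def gain(data, index):
    """gain(A) = I(whole set) - E(A) for the attribute at position index."""
    groups = {}
    for values, label in data:
        groups.setdefault(values[index], []).append(label)
    remainder = sum(len(g) / len(data) * info(g) for g in groups.values())
    return info([label for _, label in data]) - remainder


if __name__ == "__main__":
    for i, name in enumerate(ATTRS):
        print(f"gain({name}) = {gain(DATA, i):.3f}")
```

On these rows D has the largest gain, so an ID3-style learner would split on D first, then on C within the D2 branch.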
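Decision tree derived by the ID3 inductive decision-tree learning algorithm (from the training instances)

    D=D1 [Dec3]
    D=D2
        C=C1
            A=A1 [Dec2]
            A=A2 [Dec2]
            A=A3
                B=B1 [Dec3]
                B=B2 [Dec2]
        C=C2
            B=B1 [Dec1]
            B=B2
                A=A1 [Dec1]
                A=A2 [Dec3]
                A=A3 [Dec3]

An induced decision tree can be converted into decision rules. For example, the second branch of the tree above can be expressed as the following rule :
    IF   D=D2       (if the patient's tear production rate = normal)
    AND  C=C1       (and the patient's astigmatism = no)
    AND  A=A1       (and the patient's age = young)
    THEN Dec=Dec2   (then the contact lens decision = soft contact lenses)

Assume only one attribute exists.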
Instance space : terminal nodes
Hypothesis space : all nodes
Predicates : predecessor-successor relations
Positive Training Instances : sin and cos
Negative Training Instance : ln
→ Concept : trig

    transc
        trig
            sin
            cos
            tan
        explog
            ln
            exp
Terminology
• An Instance Space :
  a set of instances which can be legally described by a given instance language
    . Attribute-based Instance Space
    . Structured Instance Space
• A Hypothesis Space :
  a set of hypotheses which can be legally described by a generalization language
    . Conjunctive Form (most prevalent form), e.g. Color=red and shape=convex
    . Disjunctive Form, e.g. C1 or C2 or C3 …, where each Ci is in conjunctive form
• Predicates :
  required for testing whether a given instance is contained in the instance set corresponding to a given hypothesis
    . Powerful basis for organizing a search
    . Two partial ordering relations exist :
        A is more specific (特殊) than B
        B is more general (泛化) than A
      if each instance contained in A is also contained in B.
Incremental Learning (逐漸式學習演算法) For Conjunctive Hypothesis
Idea :
The version space is represented by two sets of hypotheses :
    S : the most specific set (最特殊規則集) consistent with the training instances
    G : the most general set (最泛化規則集) consistent with the training instances
Version Space
[Figure: the version space lies between G (more general) and S (more specific); positive instances (+) fall inside S and negative instances (-) fall outside G.]
Example of Version Space
    ( sin + )   S : sin    G : transc
    ( ln - )    S : sin    G : trig
    ( cos + )   S : trig   G : trig
    Concept : trig
Lemma : ∀ a ∈ S, ∀ b ∈ G, a is more specific than b
[Figure: hierarchy with root transc and children trig and explog]
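The sin / ln / cos example can be replayed with a small sketch. It makes simplifying assumptions that the slides do not state: hypotheses are nodes of a tree hierarchy, S and G are each a single node (sufficient for a tree when a positive instance arrives first), and the function names are illustrative.

```python
# Child -> parent links of the hierarchy: transc -> {trig, explog} -> leaves.
PARENT = {"trig": "transc", "explog": "transc",
          "sin": "trig", "cos": "trig", "tan": "trig",
          "ln": "explog", "exp": "explog"}


def ancestors(node):
    """Path from the node up to the root, the node itself included."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def covers(hypothesis, instance):
    return hypothesis in ancestors(instance)


def update(s, g, instance, positive):
    """One incremental step; ASSUMES the first instance is positive."""
    if positive:
        # Generalize S minimally: the lowest node covering S and the instance.
        s = instance if s is None else next(
            n for n in ancestors(s) if covers(n, instance))
    else:
        # Specialize G minimally: step down toward S until the instance is excluded.
        path = ancestors(s)
        while covers(g, instance):
            i = path.index(g)
            if i == 0:
                raise ValueError("negative instance is covered by S")
            g = path[i - 1]
    return s, g


if __name__ == "__main__":
    s, g = None, "transc"
    for instance, positive in [("sin", True), ("ln", False), ("cos", True)]:
        s, g = update(s, g, instance, positive)
        print(f"({instance} {'+' if positive else '-'})  S : {s}  G : {g}")
```

The run ends with S = G = trig, the point at which the version space has converged to a single concept.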
Training instances:
    1. {(Large,Red,Triangle) (Small,Blue,Circle)}
    2. {(Large,Blue,Circle) (Small,Red,Triangle)}
    3. {(Large,Blue,Triangle) (Small,Blue,Triangle)}
After instance 1:
    S : {(Large,Red,Triangle) (Small,Blue,Circle)}
    G : {(?,?,?) (?,?,?)}
After instance 2:
    S : {(Large,?,?) (Small,?,?)}, {(?,Red,Triangle) (?,Blue,Circle)}
    G : {(?,?,?) (?,?,?)}
After instance 3:
    S : {(?,Red,Triangle) (?,Blue,Circle)}
    G : {(?,?,Circle) (?,?,?)}, {(?,Red,?) (?,?,?)}
Check contradiction between S and G
• Step 1: Take a generalization s in S and a generalization g in G. Check s against g; if g is not more general than s, mark both s and g.
• Step 2: Repeat Step 1 until every pair of s in S and g in G has been processed.
• Step 3: Discard those generalizations in S with |G| marks and those in G with |S| marks.
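The three steps can be sketched for the paired-object hypotheses used in this section. The representation is an assumption: each hypothesis is a pair of (size, color, shape) tuples with "?" as a wildcard, and because the pair is unordered, the generality test tries both orderings. The S and G sets below are the candidates that arise while processing the third training instance.

```python
def object_more_general(g_obj, s_obj):
    """Every fixed value of g_obj must appear unchanged in s_obj."""
    return all(gv == "?" or gv == sv for gv, sv in zip(g_obj, s_obj))


def more_general(g, s):
    """g is more general than (or equal to) s, for unordered pairs of objects."""
    (g1, g2), (s1, s2) = g, s
    return ((object_more_general(g1, s1) and object_more_general(g2, s2)) or
            (object_more_general(g1, s2) and object_more_general(g2, s1)))


def check_contradiction(S, G):
    s_marks = {s: 0 for s in S}
    g_marks = {g: 0 for g in G}
    for s in S:                       # Steps 1 and 2: compare every pair
        for g in G:
            if not more_general(g, s):
                s_marks[s] += 1
                g_marks[g] += 1
    # Step 3: drop members that conflict with every member of the other set.
    return ([s for s in S if s_marks[s] < len(G)],
            [g for g in G if g_marks[g] < len(S)])


S = [(("Large", "?", "?"), ("Small", "?", "?")),
     (("?", "Red", "Triangle"), ("?", "Blue", "Circle"))]
G = [(("?", "?", "Circle"), ("?", "?", "?")),
     (("?", "Red", "?"), ("?", "?", "?"))]

if __name__ == "__main__":
    new_S, new_G = check_contradiction(S, G)
    print("S :", new_S)
    print("G :", new_G)
```

Only {(Large,?,?) (Small,?,?)} collects |G| marks and is discarded; both members of G stay because each is more general than the remaining member of S.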
Advantage of Version Space:
Does not need to check past instances---the reason to
Exercise
1. Taking animal classification as an example, build a Repertory Grid (知識表格) and generate the corresponding inference rules.
2. Analyze whether the generated animal-classification inference rules have any missing Embedded Meanings (隱含知識).
3. Use Depth-First-Search to learn from the following training cases.
    COLOR : black, brown, brown, black, brown, black, brown, brown, brown, black, black, black
    SIZE  : large, large, medium, small, medium, large, small, small, large, medium, medium, small
    COAT  : shaggy, smooth, shaggy, shaggy, smooth, smooth, shaggy, smooth, shaggy, shaggy, smooth, smooth
    CLASS : +, +, -, +, +, +, -, +, -, …
4. The attributes and decisions of an analysis-type domain problem are given as follows:
    Attribute A = {A1, A2, A3}
    Attribute B = {B1, B2}
    Attribute C = {C1, C2}
    Attribute D = {D1, D2, D3}
    Decision Dec = {Dec1, Dec2, Dec3}

    No.  A   B   C   D   Dec     No.  A   B   C   D   Dec     No.  A   B   C   D   Dec
     1   A1  B1  C1  D1  Dec1    13   A2  B1  C1  D1  Dec1    25   A3  B1  C1  D1  Dec1
     2   A1  B1  C1  D2  Dec1    14   A2  B1  C1  D2  Dec1    26   A3  B1  C1  D2  Dec1
     3   A1  B1  C1  D3  Dec1    15   A2  B1  C1  D3  Dec1    27   A3  B1  C1  D3  Dec1
     4   A1  B1  C2  D1  Dec1    16   A2  B1  C2  D1  Dec1    28   A3  B1  C2  D1  Dec1
     5   A1  B1  C2  D2  Dec1    17   A2  B1  C2  D2  Dec1    29   A3  B1  C2  D2  Dec1
     6   A1  B1  C2  D3  Dec1    18   A2  B1  C2  D3  Dec1    30   A3  B1  C2  D3  Dec1
     7   A1  B2  C1  D1  Dec2    19   A2  B2  C1  D1  Dec2    31   A3  B2  C1  D1  Dec2
     8   A1  B2  C1  D2  Dec2    20   A2  B2  C1  D2  Dec2    32   A3  B2  C1  D2  Dec2
     9   A1  B2  C1  D3  Dec2    21   A2  B2  C1  D3  Dec2    33   A3  B2  C1  D3  Dec3
    10   A1  B2  C2  D1  Dec2    22   A2  B2  C2  D1  Dec2    34   A3  B2  C2  D1  Dec2
    11   A1  B2  C2  D2  Dec3    23   A2  B2  C2  D2  Dec3    35   A3  B2  C2  D2  Dec3
    12   A1  B2  C2  D3  Dec2    24   A2  B2  C2  D3  Dec2    36   A3  B2  C2  D3  Dec3

    Construct:
    (1) the decision tree produced when the attributes are used in the order A, B, C, D;
    (2) the decision tree produced by the ID3 algorithm.