KNOWLEDGE LEARNING ON FUZZY EXPERT NEURAL NETWORKS*

H.C. Fu, J.J. Shann

Department of Computer Science and Information Engineering

Hsiao-Tien Pao

Department of Management Science

National Chiao-Tung University,

Hsinchu, Taiwan 300,

R.O.C.

Abstract

The proposed fuzzy expert network is an event-driven, acyclic neural network designed for knowledge learning on a fuzzy expert system. Initially, the network is constructed according to primitive (rough) expert rules, including the input and output linguistic variables and values of the system. Each inference rule corresponds to an inference network, which contains five types of nodes: Input, Membership-Function, AND, OR, and Defuzzification Nodes. We propose a two-phase learning procedure for the inference network. The first phase is the competitive backpropagation (CBP) training phase, and the second phase is the rule-pruning phase. The CBP learning algorithm in the training phase enables the network to learn the fuzzy rules as precisely as backpropagation-type learning algorithms and yet as quickly as competitive-type learning algorithms. After the CBP training, the rule-pruning process is performed to delete redundant weight connections, yielding a simpler network structure with compatible retrieving performance.

1 Introduction

Recently, there has been more and more research on implementing fuzzy expert systems in neural networks, in order to gain the advantage of learning fuzzy expert rules from examples [7, 3, 2, 5, 8]. Most neural network methodologies for learning knowledge can be divided into two categories: backpropagation-type and competitive-type. Roughly speaking, backpropagation-type learning algorithms learn more precisely than competitive-type algorithms because they are based on gradient descent search, but they take a long time and numerous training epochs to converge.

*This research is supported in part by the National Science Council under Grant NSC82-0408-E009-427.

In contrast, competitive-type learning algorithms learn much more rapidly than backpropagation-type algorithms because they are based on unsupervised clustering, but the knowledge they learn may not be precise enough. Therefore, an important issue that remains to be resolved in this field is how to learn knowledge both precisely and rapidly.

In this paper, we propose an event-driven, acyclic neural network for knowledge learning on a fuzzy expert system. The fuzzy rules considered are in the linguistic IF-THEN form. The IF-part, i.e., the antecedent, of a rule is the conjunction of several input linguistic variables [12], each associated with a linguistic value. The THEN-part, i.e., the consequence, of a rule contains only one output or intermediate linguistic variable associated with a linguistic value. The proposed learning procedure of this network consists of two phases. The first phase is a competitive backpropagation (CBP) training phase, and the second phase is a rule-pruning phase. The CBP learning algorithm in the training phase is designed to combine the advantages of the backpropagation-type and competitive-type learning methodologies. After the CBP training phase, a rule-pruning process is executed to delete redundant weight connections and to better represent the inference rules.

This paper is organized as follows. The structure of the network and the functions of the nodes in the network are described in Section 2. The CBP learning algorithm is stated in Section 3. A pruning method for deleting redundant weight connections is described in Section 4. In Section 5, a simple exemplar problem is simulated and analyzed. Finally, conclusions and future research plans are presented in Section 6.

2 Network Structure and Node Functions

In general, the inference structure of a fuzzy expert system can be classified into two categories: (1) single-level inference, and (2) multi-level inference. A single-level inference expert system can be constructed as a five-layer neural network, and the inference knowledge can be learned from examples. The network structure for single-level inference is shown in Figure 1. Knowledge learning for single-level inference systems has been presented in [9, 11]. In a multi-level inference expert system, the output of a defuzzification function can be the input to a membership function of another inference rule. Implementing a multi-level inference system on neural networks requires training examples and primitive knowledge of the inference relations (rules). For instance, suppose an expert system is composed of the following rules:

Rule 1: if A and B then F,

Rule 2: if D and E then G,

Rule 3: if C and F then H,

Rule 4: if C and G then I.
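The primitive inference relations in the four rules above already determine which variables are inputs, which are intermediate, and at which inference level each rule fires. The following is a minimal sketch of that bookkeeping, not part of the original paper; the names `RULES`, `classify_variables`, and `inference_levels` are hypothetical and chosen only for illustration.

```python
# Rules from the text: (variables in the IF-part, variable in the THEN-part).
RULES = {
    "Rule 1": (("A", "B"), "F"),
    "Rule 2": (("D", "E"), "G"),
    "Rule 3": (("C", "F"), "H"),
    "Rule 4": (("C", "G"), "I"),
}


def classify_variables(rules):
    """Split the variables into inputs, intermediates, and outputs."""
    used = {v for if_part, _ in rules.values() for v in if_part}
    produced = {then_var for _, then_var in rules.values()}
    return used - produced, used & produced, produced - used


def inference_levels(rules):
    """Assign each rule the inference level at which it can be evaluated."""
    inputs, _, _ = classify_variables(rules)
    var_level = {v: 0 for v in inputs}  # input variables are available at level 0
    rule_level = {}
    pending = dict(rules)
    while pending:
        ready = [name for name, (if_part, _) in pending.items()
                 if all(v in var_level for v in if_part)]
        if not ready:
            raise ValueError("cyclic rules: the network must be acyclic")
        for name in ready:
            if_part, _ = pending.pop(name)
            rule_level[name] = 1 + max(var_level[v] for v in if_part)
        # A variable becomes available once every rule that infers it has a level.
        for _, then_var in rules.values():
            producers = [n for n, (_, t) in rules.items() if t == then_var]
            if then_var not in var_level and all(n in rule_level for n in producers):
                var_level[then_var] = max(rule_level[n] for n in producers)
    return rule_level


inputs, intermediates, outputs = classify_variables(RULES)
levels = inference_levels(RULES)
```

For the four rules above, A to E come out as inputs, F and G as intermediate variables, and H and I as outputs, with Rules 1 and 2 on the first level and Rules 3 and 4 on the second.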

Without knowing the primitive inference relations, a neural expert system can also be constructed with a single-level inference structure and trained with many examples. Obviously, the resulting network will contain more connections and require more training iterations for the learning process to converge. This paper focuses on the construction and training of multi-level inference expert networks. First, according to the primitive inference relations, an inference flow diagram can be constructed (see Figure 2). Based on the inference diagram, we can apply the single-level inference structure to implement each individual inference rule. As proposed in [9, 11], the network contains five types of nodes: Input, Membership-Function, Fuzzy-AND, Fuzzy-OR, and Defuzzification Nodes.

Figure 1: The structure of a fuzzy expert network for single-level inference. The layers, from input to output, are: Input Nodes (input linguistic variables), MF Nodes (linguistic values of the input linguistic variables), AND Nodes (IF-parts), OR Nodes (THEN-parts), and Defuzzification Nodes (output or intermediate linguistic variables).

SPIE Vol. 2243

Figure 2: The structure of a fuzzy expert network for multi-level inference. SLINW: Single Level Inference Network.

Let $P_i$ be the set of the indices of the nodes each of which has its output link connected to node $i$. The semantic meaning, connectivity, and function of each type of node in the proposed network are described as follows:

• Input Nodes:

Each input node of the fuzzy expert network represents an input linguistic variable of the fuzzy expert system, and is used as a buffer to broadcast the input to the membership-function nodes of its linguistic values.

• MF (Membership-Function) Nodes:

Each MF node represents the membership function of a linguistic value associated with an input or an intermediate linguistic variable. An MF node has one input link emitted from an input node or from a defuzzification node. The output links of the MF node are connected to the AND nodes which represent the IF-parts containing this linguistic value of the linguistic variable. For each input or intermediate linguistic variable, there are $n$ MF nodes, where $n$ is the number of linguistic values associated with the linguistic variable. The output of an MF node is in the range [0, 1] and represents the membership grade of the input or intermediate linguistic variable with respect to the membership function of a linguistic value of the variable.

In general, the most commonly used fuzzy-set membership functions are in the shape of a trapezoid, a triangle, or a bell. For bell-shaped membership functions, the operation of an MF node $i$ with an input link emitted from an input or defuzzification node $h$ is defined as follows:

$$z_i = \exp\left(-\frac{(x_{ih} - c_i)^2}{2\sigma_i^2}\right), \tag{1}$$

where $c_i$ is the centroid, $\sigma_i^2$ is the variance, $x_{ih} = z_h$, and $z_h$ is the output value of node $h$. The weight of the input link of an MF node is unity.

• AND Nodes:

Each AND node represents an IF-part for the possible rules of the fuzzy expert system. An AND node has several input links, each of which is emitted from an MF node for the linguistic value of a linguistic variable contained in the IF-part. The output link of the AND node is connected to all the OR nodes which represent the THEN-parts of the fuzzy rules with this IF-part.

In fuzzy set theory, the most commonly used operator for fuzzy intersection is the min-operator suggested by Zadeh [13]. Therefore, the operation performed by an AND node $j$ is defined as follows:

$$z_j = \min_{i \in P_j}(x_{ji}) = \mathrm{MIN}_j, \tag{2}$$

where $x_{ji} = z_i$. The weights of the input links of the AND node, the $w_{ji}$'s, are unity.

• OR Nodes:

Each OR node represents a THEN-part for the possible rules of the fuzzy expert system. The operation performed by an OR node is to integrate fuzzy rules with the same consequence. An OR node may have several input links, each of which is emitted from an AND node, and its output link is connected to exactly one defuzzification node, which represents an output or an intermediate linguistic variable. The weight $w_{kj}$ of the link connected from an AND node $j$ to an OR node $k$ represents the weight of a fuzzy rule which has node $j$ as the IF-part and node $k$ as the THEN-part.

In fuzzy set theory, the most commonly used operator for fuzzy union is the max-operator suggested by Zadeh [13]. Therefore, the operation performed by an OR node $k$ is defined as follows:

$$z_k = \max_{j \in P_k}(x_{kj} w_{kj}) = \mathrm{MAX}_k, \tag{3}$$

where $x_{kj} = z_j$. The weights of the input links of an OR node, the $w_{kj}$'s, are learnable positive real numbers.

• Defuzzification Nodes:

Each defuzzification node represents either an intermediate linguistic variable or an output linguistic variable, and performs the defuzzification concerning all the linguistic values of the linguistic variable. A defuzzification node has several input links, each of which is emitted from an OR node. The output link of a defuzzification node may be connected to the MF nodes of its linguistic values when the defuzzification node represents an intermediate linguistic variable, or may be one of the outputs of the network when the defuzzification node represents an output linguistic variable.

Suppose that the correlation-product inference and the fuzzy centroid defuzzification scheme [4] are used. The function of a defuzzification node $l$ is defined as follows:

$$z_l = \frac{\sum_{k \in P_l} (x_{lk} a_k c_k)}{\sum_{k \in P_l} (x_{lk} a_k)}, \tag{4}$$

where $a_k$ and $c_k$ are the area and centroid of the membership function for a linguistic value of a linguistic variable in the THEN-part represented by OR node $k$, respectively, and $x_{lk} = z_k$. For a bell-shaped membership function of the form of Eq. (1), $a_k = \sqrt{2\pi}\,\sigma_k$, where $\sigma_k^2$ is the variance of the membership function. The weights of the input links of a defuzzification node are unity.
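The node functions in Eqs. (1) to (4) chain into a forward pass through one single-level inference network. The sketch below is only an illustration under stated assumptions: the function names `bell_mf` and `forward`, the dictionary and list layouts, and the toy network at the end are hypothetical, and the area of an output membership function is taken as $\sqrt{2\pi}\,\sigma_k$, which follows from the bell shape of Eq. (1).

```python
import math


def bell_mf(x, c, sigma):
    """Eq. (1): membership grade of x for a bell-shaped membership function."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def forward(inputs, in_mfs, and_nodes, or_weights, out_mfs):
    """Forward pass of one single-level inference network (illustrative layout).

    inputs     : dict, variable name -> crisp input value
    in_mfs     : dict, (variable, linguistic value) -> (centroid, sigma)
    and_nodes  : list of IF-parts, each a tuple of (variable, linguistic value) keys
    or_weights : list over OR nodes k of dicts {AND index j: weight w_kj}
    out_mfs    : list over OR nodes k of (centroid c_k, sigma_k)
    """
    # MF nodes, Eq. (1)
    z_mf = {key: bell_mf(inputs[key[0]], c, s) for key, (c, s) in in_mfs.items()}
    # AND nodes, Eq. (2): minimum over the membership grades of the IF-part
    z_and = [min(z_mf[key] for key in if_part) for if_part in and_nodes]
    # OR nodes, Eq. (3): maximum over the weighted AND outputs
    z_or = [max(w * z_and[j] for j, w in links.items()) for links in or_weights]
    # Defuzzification node, Eq. (4); areas follow from the bell shape of Eq. (1).
    # The denominator is zero only if every OR output is zero.
    areas = [math.sqrt(2.0 * math.pi) * s for _, s in out_mfs]
    num = sum(z * a * c for z, a, (c, _) in zip(z_or, areas, out_mfs))
    den = sum(z * a for z, a in zip(z_or, areas))
    return num / den


# Toy network: one input x with values LOW and HIGH, one rule per value.
in_mfs = {("x", "LOW"): (0.0, 1.0), ("x", "HIGH"): (1.0, 1.0)}
and_nodes = [(("x", "LOW"),), (("x", "HIGH"),)]
or_weights = [{0: 1.0}, {1: 1.0}]
out_mfs = [(0.0, 1.0), (10.0, 1.0)]
y_mid = forward({"x": 0.5}, in_mfs, and_nodes, or_weights, out_mfs)
y_low = forward({"x": 0.0}, in_mfs, and_nodes, or_weights, out_mfs)
```

In the toy network, an input halfway between the two centroids activates both rules equally, so the defuzzified output lies halfway between the two output centroids; moving the input toward LOW pulls the output toward the LOW consequence.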

3 The Competitive Backpropagation Learning Algorithm

We propose a two-phase learning procedure for our fuzzy expert network. The first phase is a competitive backpropagation (CBP) training phase, and the second phase is a rule-pruning phase. The basic concept of the CBP learning algorithm [9, 11] is to choose competitive operations, e.g., Eqs. (2) and (3), for the nodes in a neural network and to train the network by the process of the backpropagation learning algorithm.

In the first phase of the learning procedure, the backpropagation learning method is performed to minimize the error function $E = \frac{1}{2}\sum_{l=1}^{M}(T_l - z_l)^2$, where $M$ is the number of output linguistic variables, and $T_l$ and $z_l$ are the target and the actual output values of a defuzzification node $l$ which represents an output linguistic variable. Since the proposed fuzzy expert network is acyclic, it is guaranteed that the network has stable forward activation and backward error propagation [6].

Let $N_i$ be the set of the indices of the nodes each of which has an input link emitted from node $i$. The definitions of the Delta values for the defuzzification, OR, AND, and MF nodes, the gradients of $E$ with respect to the learnable weights, and the adjustments of the learnable weights are described as follows:

• The definition of Delta values:

For a defuzzification node $l$, the Delta value $\delta_l$ is defined as follows:

$$\delta_l = \frac{\partial E}{\partial z_l} = \begin{cases} -(T_l - z_l), & \text{if node } l \text{ represents an output linguistic variable}, \\ -\sum_{i \in N_l} \left[ \delta_i \, \dfrac{z_i (z_l - c_i)}{\sigma_i^2} \right], & \text{if node } l \text{ represents an intermediate linguistic variable}. \end{cases} \tag{5}$$

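For an OR node $k$, the Delta value is defined as follows:

$$\delta_k = \frac{\partial E}{\partial z_k} = \frac{\partial E}{\partial z_l}\,\frac{\partial z_l}{\partial z_k} = \delta_l \, \frac{a_k (c_k - z_l)}{\sum_{k' \in P_l} (x_{lk'} a_{k'})}, \tag{6}$$

where the output link of node $k$ is connected to defuzzification node $l$ only.

For an AND node $j$, the Delta value is defined as follows:

$$\delta_j = \frac{\partial E}{\partial z_j} = \sum_{k \in N_j} \frac{\partial E}{\partial z_k}\,\frac{\partial z_k}{\partial z_j} = \sum_{k \in N_j} \begin{cases} \delta_k w_{kj}, & \text{if } x_{kj} w_{kj} = \mathrm{MAX}_k, \\ 0, & \text{otherwise}. \end{cases} \tag{7}$$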

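For an MF node $i$, the Delta value is defined as follows:

$$\delta_i = \frac{\partial E}{\partial z_i} = \sum_{j \in N_i} \frac{\partial E}{\partial z_j}\,\frac{\partial z_j}{\partial z_i} = \sum_{j \in N_i} \begin{cases} \delta_j, & \text{if } x_{ji} = \mathrm{MIN}_j, \\ 0, & \text{otherwise}. \end{cases} \tag{8}$$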

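• The gradient of $E$ with respect to the weight of a fuzzy rule:

For the weight $w_{kj}$ of a link connected from AND node $j$ to OR node $k$, the gradient of $E$ with respect to $w_{kj}$ is evaluated as follows:

$$\nabla E_{w_{kj}} = \frac{\partial E}{\partial w_{kj}} = \frac{\partial E}{\partial z_k}\,\frac{\partial z_k}{\partial w_{kj}} = \begin{cases} \delta_k x_{kj}, & \text{if } x_{kj} w_{kj} = \mathrm{MAX}_k, \\ 0, & \text{otherwise}. \end{cases} \tag{9}$$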

• The adjustment of a learnable weight:

The adjustment of a learnable weight $w_{kj}$, which is based on the gradient descent search, can be described as follows:

$$w_{kj}(t + 1) = w_{kj}(t) - \beta \, \nabla E_{w_{kj}},$$

where $\beta$ is the learning rate.

The knowledge of the fuzzy rules learned by the proposed CBP algorithm is distributed over the weights of the links between AND and OR nodes.
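One CBP update of the rule weights for a single output linguistic variable can be sketched as follows. This is a minimal illustration of Eqs. (3), (4), (6), and (9) and of the weight adjustment, not the authors' simulator: the names `or_defuzz_forward` and `cbp_step`, the dense weight matrix `w[k][j]`, and the numeric values are assumptions, and the AND-node outputs are taken as given.

```python
def or_defuzz_forward(z_and, w, out_mfs):
    """OR nodes (Eq. 3) followed by one defuzzification node (Eq. 4).

    z_and   : outputs z_j of the AND nodes
    w       : w[k][j] is the weight from AND node j to OR node k
    out_mfs : list over OR nodes k of (centroid c_k, area a_k)
    """
    z_or = [max(w_kj * z_j for w_kj, z_j in zip(w_k, z_and)) for w_k in w]
    den = sum(z_k * a_k for z_k, (_, a_k) in zip(z_or, out_mfs))
    num = sum(z_k * a_k * c_k for z_k, (c_k, a_k) in zip(z_or, out_mfs))
    return z_or, num / den, den


def cbp_step(z_and, w, out_mfs, target, beta):
    """One gradient-descent update of the rule weights for one training example."""
    z_or, z_l, den = or_defuzz_forward(z_and, w, out_mfs)
    delta_l = -(target - z_l)                         # Delta of the output node
    new_w = []
    for w_k, z_k, (c_k, a_k) in zip(w, z_or, out_mfs):
        delta_k = delta_l * a_k * (c_k - z_l) / den   # Eq. (6)
        row = []
        for w_kj, x_kj in zip(w_k, z_and):
            # Eq. (9): only the link that wins the max competition is adjusted.
            grad = delta_k * x_kj if w_kj * x_kj == z_k else 0.0
            row.append(w_kj - beta * grad)
        new_w.append(row)
    return z_l, new_w


z_and = [0.8, 0.3]                      # outputs of two AND nodes
w0 = [[0.5, 0.5], [0.5, 0.5]]           # initial rule weights
out_mfs = [(0.0, 1.0), (10.0, 1.0)]     # (centroid, area) of two output values
z_before, w1 = cbp_step(z_and, w0, out_mfs, target=8.0, beta=0.01)
_, z_after, _ = or_defuzz_forward(z_and, w1, out_mfs)
```

Because only the winning link of each OR node receives a nonzero gradient, each example adjusts at most one weight per OR node, which is the competitive aspect of the algorithm. The sketch does not enforce positivity of the weights; how the original simulator keeps them positive is not stated in the text.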


4 The Rule-Pruning Method

In general, after ideal fuzzy rule learning, a sound rule base is expected to be obtained. A sound rule base is one in which, for an output linguistic variable, at most one consequence can be implied from each possible antecedent. However, since the knowledge of fuzzy rules learned by the CBP training is distributed over the weights on the input links of OR nodes, there may be rules with identical antecedents and output linguistic variables but with different output linguistic values. Therefore, a pruning method is required to determine which rule in such a group of rules should remain while the others are pruned to form a sound rule base.

The physical meaning of the weights on the input links of OR nodes is explained in the following. After the CBP training, the learned weights $w_{kj}$ on a set of links from an AND node $j$ to the OR nodes of an output linguistic variable $F_l$ (see Figure 3) are interpreted as the weights (or certainty factors) of a set of fuzzy rules which contain the same IF-part and are related to the same output linguistic variable with different linguistic values.

Figure 3: The diagram of the possible fuzzy rules with identical antecedent $AND_j$ for an output linguistic variable $F_l$. The defuzzification node represents the output linguistic variable $F_l$, the OR nodes represent the linguistic values $F_{l,1}, \ldots, F_{l,f_l}$ of $F_l$, and the AND node represents the antecedent $AND_j$ of the fuzzy rules.

As an example, Figure 3 can be interpreted as the following fuzzy rules:

$R_{j,l,1}$: If $AND_j$, then $F_l$ is $F_{l,1}$. ($w_{k_1,j}$)

$R_{j,l,2}$: If $AND_j$, then $F_l$ is $F_{l,2}$. ($w_{k_2,j}$)

...

$R_{j,l,f_l}$: If $AND_j$, then $F_l$ is $F_{l,f_l}$. ($w_{k_{f_l},j}$)

where $AND_j$ is the antecedent represented by AND node $j$. The effect of these rules is that when the antecedent $AND_j$ holds, each of the rules is activated to a certain degree represented by the weight value (the certainty factor) associated with that rule.

For the output linguistic variable F1, at most one of the rules listed above could exist in a sound rule base. In the rule pruning phase, the fuzzy rules with identical antecedents and output linguistic variables are combined into one rule. Then, the redundant rules are deleted.

An evaluation equation is proposed here based on the concept of the center of gravity, which is also the basis of the defuzzification scheme described in Section 2. The centroid of these rules is evaluated according to the areas and centroids of the membership functions (of the linguistic values for the output linguistic variable) and the weights of these rules.

The equation for determining the remaining rule in the group of fuzzy rules represented by the links from an AND node $j$ to the OR nodes of an output linguistic variable $F_l$ is defined as follows [11]:

$$c_{lj} = \frac{\sum_{k} (w_{kj} a_k c_k)}{\sum_{k} (w_{kj} a_k)}, \tag{10}$$

where $k \in N_j$ and $k \in P_l$.

We divide the space of the output linguistic variable $F_l$ into several nonoverlapping intervals corresponding to the membership functions of the linguistic values of $F_l$. The computed value of $c_{lj}$ will fall in one of the intervals. The rule whose linguistic value corresponds to that interval is retained, and all the other rules in the group (represented by weight connections) are pruned by setting their weights equal to zero.

For each AND node, the pruning process is performed to delete its redundant output links connected to the OR nodes associated with an output or intermediate linguistic variable. After the pruning phase, a sound rule base is obtained.
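The pruning of the links leaving one AND node can be sketched as follows. The centroid computation follows Eq. (10); the choice of interval boundaries is an assumption made only for this illustration, since the text does not state how the nonoverlapping intervals are drawn: here the boundaries are the midpoints between adjacent output centroids. The function name `prune_rules` and the numeric values are hypothetical.

```python
import bisect


def prune_rules(w_j, out_mfs):
    """Keep one rule among those sharing the antecedent of AND node j.

    w_j     : weights w_kj from AND node j to the OR nodes of one output variable
    out_mfs : list of (centroid c_k, area a_k), sorted by increasing centroid
    Returns (c_lj, index of the retained OR node, pruned weights).
    """
    den = sum(w * a for w, (_, a) in zip(w_j, out_mfs))
    if den == 0.0:
        raise ValueError("all weights are zero: there is no rule to keep")
    c_lj = sum(w * a * c for w, (c, a) in zip(w_j, out_mfs)) / den   # Eq. (10)
    # Assumed interval boundaries: midpoints between adjacent centroids.
    centroids = [c for c, _ in out_mfs]
    bounds = [(lo + hi) / 2.0 for lo, hi in zip(centroids, centroids[1:])]
    keep = bisect.bisect_right(bounds, c_lj)
    pruned = [w if k == keep else 0.0 for k, w in enumerate(w_j)]
    return c_lj, keep, pruned


# Three output linguistic values with centroids 0, 5, 10 and equal areas.
c_lj, keep, pruned = prune_rules([0.1, 0.7, 0.3],
                                 [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)])
```

In this example the weighted centroid is about 5.9, which falls in the interval of the middle linguistic value, so only the middle rule keeps its weight.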

5 Simulation and Comparison

A general-purpose simulator of the proposed fuzzy expert network has been implemented. Several simple fuzzy expert systems were simulated to observe the learning ability of the fuzzy expert network. For an exemplar fuzzy expert system, there are five linguistic variables A, B, C, D, and E. Their linguistic values are defined as:

1. A has three linguistic values A1, A2, and A3.

2. B has three linguistic values B1, B2, and B3.

3. C has three linguistic values C1, C2, and C3.

4. D has four linguistic values D1, D2, D3, and D4.

5. E has four linguistic values E1, E2, E3, and E4.

The inference relationship between these variables is:

1. C is inferred from A and B, and

2. E is inferred from C and D.

Therefore, A, B, and D are input linguistic variables, C is an intermediate linguistic variable, and E is an output linguistic variable. The target fuzzy rule bases are shown in Table 1.

Table 1: The target rule base for the exemplar problem.

(a) C inferred from A and B

        A1    A2    A3
  B1    C3    C3    C3
  B2    C2    C3    C3
  B3    C1    C2    C3

(b) E inferred from C and D

        C1    C2    C3
  D1    E3    E4    E4
  D2    E2    E3    E4
  D3    E1    E2    E3
  D4    E1    E1    E2
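The two-level structure of the exemplar system can be illustrated with a crisp, symbolic lookup that chains the two target rule bases, using the intermediate variable C as the link. This is only a sketch of the inference flow, not of the fuzzy network itself: the table entries are transcribed from Table 1 as read here, and the names `C_TABLE`, `E_TABLE`, and `infer` are hypothetical.

```python
A_VALUES = ("A1", "A2", "A3")
C_VALUES = ("C1", "C2", "C3")

# Target rule bases as read from Table 1 (rows: B or D, columns: A or C).
C_TABLE = {
    "B1": ("C3", "C3", "C3"),
    "B2": ("C2", "C3", "C3"),
    "B3": ("C1", "C2", "C3"),
}
E_TABLE = {
    "D1": ("E3", "E4", "E4"),
    "D2": ("E2", "E3", "E4"),
    "D3": ("E1", "E2", "E3"),
    "D4": ("E1", "E1", "E2"),
}


def infer_c(a, b):
    """First level: C is inferred from A and B."""
    return C_TABLE[b][A_VALUES.index(a)]


def infer_e(c, d):
    """Second level: E is inferred from C and D."""
    return E_TABLE[d][C_VALUES.index(c)]


def infer(a, b, d):
    """Chain the two levels through the intermediate variable C."""
    return infer_e(infer_c(a, b), d)


result = infer("A1", "B3", "D4")
```

For example, A1 and B3 imply C1 on the first level, and C1 together with D4 implies E1 on the second level.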

From the simulation results of this fuzzy expert system, after 220 CBP training epochs, the error rate was about 6%. The pruning process was then performed to delete the redundant links, which slightly increased the error rate to 7%. Table 2 shows the learned rule base for the exemplar fuzzy expert system.

Table 2: The retrieved rule base after 220 CBP learning epochs and the rule-pruning process. The * symbol indicates a difference between the target rule base and the learned rule base.

(a) C inferred from A and B

        A1    A2    A3
  B1    C3    ?     ?
  B2    *C3   *C2   C3
  B3    C1    C2    *C2

(b) E inferred from C and D

        C1    C2    C3
  D1    E4    E4    ?
  D2    E2    E3    E4
  D3    E1    E2    *E2
  D4    E1    E1    ?

6 Concluding Remarks

A fuzzy expert network for rule learning of a fuzzy expert system is proposed and presented. The learning procedure of the network is divided into two phases. The first one is a competitive backpropagation (CBP) training phase, and the second one is a rule-pruning phase. In the first phase, the CBP learning algorithm enables the network to acquire the knowledge of fuzzy rules quickly and precisely. The main reason for these advantages is that the gradient descent search approach in the CBP algorithm enables the network to learn more precisely than typical competitive learning algorithms, while the dedicated structure of the network and the competitive characteristics of the CBP algorithm enable the network to converge much more rapidly than conventional backpropagation learning algorithms. The knowledge of fuzzy rules learned in the CBP training phase is distributed over the learnable weights of the network. Therefore, in the second phase of the learning procedure, a pruning process is performed to convert the distributed knowledge of fuzzy rules learned by the CBP training into itemizable rule bases.

In the near future, we plan to further study the learnability of the parameters of the membership functions and the choice of the competitive operations performed by the nodes in a fuzzy expert network.

References

[1] H.C. Fu and J.J. Shann, "Fuzzy Expert Networks," in the Proc. of ICANN'93, Sep. 13-16, 1993, Amsterdam, Netherlands.

[2] S.I. Horikawa, T. Furuhashi, and Y. Uchikawa, "On Fuzzy Modeling Using Fuzzy Neural Networks with the Back-Propagation Algorithm," IEEE Trans. on Neural Networks, Vol. 3, No. 5, pp. 801-806, Sep. 1992.

[3] S.G. Kong and B. Kosko, "Adaptive Fuzzy Systems for Backing up a Truck-and-Trailer," IEEE Trans. on Neural Networks, Vol. 3, No. 2, pp. 211-223, March 1992.

[4] B. Kosko, Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence, Prentice-Hall, Inc., New Jersey, 1992.

[5] R. Krishnapuram and J. Lee, "Fuzzy-Set-Based Hierarchical Networks for Information Fusion in Computer Vision," Neural Networks, Vol. 5, pp. 335-350, 1992.

[6] R.C. Lacher, S.I. Hruska, and D.C. Kuncicky, "Back-Propagation Learning in Expert Networks," IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp. 62-72, Jan. 1992.

[7] C.T. Lin and C.S.G. Lee, "Neural-Network-Based Fuzzy Logic Control and Decision System," IEEE Trans. on Computers, Vol. C-40, No. 12, pp. 1320-1336, 1991.

[8] D. Nauck and R. Kruse, "A Fuzzy Neural Network Learning Fuzzy Control Rules and Membership Functions by Fuzzy Error Backpropagation," in Proceedings of the International Conference on Neural Networks (ICNN'93), pp. 1022-1027, 1993.

[9] J.J. Shann and H.C. Fu, "A Fuzzy Neural [...]," in [...] Fifth IFSA World Congress, July 4-9, 1993, [...].

[10] J.J. Shann and H.C. Fu, "Backpropagation Learning for Acquiring Knowledge of Fuzzy Networks," in the Proc. of WCNN'93, July 11-15, 1993, Portland, Oregon, U.S.A.

[11] J.J. Shann and H.C. Fu, "A Fuzzy Neural Network for Acquiring Fuzzy [...]," in the Proc. of International Symposium on Artificial Neural Networks (ISANN'93), [...] 1993, Hsinchu, Taiwan, R.O.C.

[12] L.A. Zadeh, "The Concept of a Linguistic Variable and its Application to Approximate Reasoning, Parts I, II, III," Information Sciences, Vol. 8, pp. 199-249; Vol. 8, pp. 301-357; Vol. 9, pp. 43-80, 1975.

[13] L.A. Zadeh, "Fuzzy Sets," Information and Control, Vol. 8, pp. 338-353, 1965.