An Architecture of Neural Network for Fuzzy Teaching Inputs

Proc. of the 1993 IEEE Int'l Conf. on Tools with AI, Boston, Massachusetts, Nov. 1993

Hahn-Ming Lee and Weng-Tang Wang
Department of Electronic Engineering
National Taiwan Institute of Technology, Taipei, Taiwan
E-mail: hmlee@et.ntit.edu.tw

Abstract

A neural network for classification problems with linguistic terms is proposed. A fuzzy input is represented as an LR-type fuzzy set. A generalized pocket algorithm, called the fuzzy pocket algorithm, which utilizes LR-type fuzzy set operations and a defuzzification method, is first applied to train a linear threshold unit (LTU). This LTU node will classify as many fuzzy input instances as possible. Afterwards, FV nodes that represent fuzzy vectors are generated and expanded by the FVGE learning algorithm to classify those input instances that cannot be classified by the LTU node. The network structure is generated automatically. In addition, on-line learning is supported, and learning is fast. One sample problem, called the knowledge-based evaluator, is considered to illustrate the working of the proposed method, and the experimental results are encouraging.

1. Introduction

Pattern classification is an important realm in AI. Most AI techniques adopt symbolic computations and provide a spectrum of very powerful representation schemes for this problem. Neural networks adopt numerical computations with fault tolerance, massively parallel computing, and trainability, which make them suitable for classification. However, numerical quantities evidently suffer from a lack of representational power.

Therefore, it is useful for a neural network classifier to be capable of symbolic processing. Fuzzy set theory is a helpful method for processing linguistic and ambiguous information [1] using numerical computations, and it can link the advantages of symbolic and numerical processing [2]. Accordingly, this paper aims at incorporating fuzzy set theory into neural networks to improve their ability to handle linguistic terms.

In this paper, we extend the capability of our previously proposed model [3] to handle linguistic terms. LR-type fuzzy set representation and operations [4][5] and a defuzzification method are used. Few nodes are needed for a linguistic variable, and the computation load is light. The network structure of the classifier is generated automatically. In addition, on-line learning is supported and learning is fast.

2. The classifier structure

Figure 1. The proposed classifier.

The complete structure of the proposed classifier is illustrated in Figure 1. It is a two-layered neural network with a single node at the output. The first layer has a linear threshold unit (LTU) [6] and zero or more FV nodes that represent fuzzy vectors (described in Section 2.2).

The LTU node is trained by the proposed fuzzy pocket algorithm [7]. FV nodes are trained by the FVGE learning algorithm to classify those training instances that cannot be classified by the LTU node. When an input fuzzy vector X is presented to the classifier, the classifier checks whether X is a fuzzy subset [8] of some FV nodes. If it is, the winner FV node sends the correct classification output to the OUTPUT node. Otherwise, input X is classified by the LTU node.


2.1 The LTU node

The LTU node is first trained by the proposed fuzzy pocket algorithm [7]. In the fuzzy pocket algorithm, LR-type fuzzy set operations and defuzzification are used. It will classify as many training instances as possible. The learning equation of the fuzzy pocket algorithm is:

If (Output of LTU = 1 and desired-output = -1) or (Output of LTU = 0 and desired-output = 1)
then $W_{new} = W_{old} + \mathrm{COA}(X_i) \cdot \text{desired-output}$,

where COA is one of the defuzzification methods [9].

In our model, Pocket-run-len is used as the stopping criterion for the fuzzy pocket algorithm. The FVGE learning algorithm is then applied to recognize those instances that cannot be classified by the LTU node.
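To make the update rule concrete, the following is a minimal Python sketch and not the authors' implementation. It rests on several assumptions that are not stated in the text: the L and R reference functions are linear (so each LR fuzzy interval (m1, m2, a, b) is a trapezoid), COA is the centre of area of that trapezoid, inputs are defuzzified before the weighted sum is formed, a bias weight is kept in `w[0]`, and training instances are visited cyclically rather than at random. The names `coa`, `ltu_output` and `fuzzy_pocket` are illustrative only.

```python
def coa(fi):
    """Centre of area of a trapezoidal LR fuzzy interval (m1, m2, a, b).

    Assumes linear L and R functions: support [m1 - a, m2 + b], core [m1, m2].
    """
    m1, m2, a, b = fi
    area = a / 2 + (m2 - m1) + b / 2
    if area == 0:  # crisp number: no spread and a single-point core
        return m1
    # Moments of the left triangle, the rectangular core and the right triangle.
    moment = ((a / 2) * (m1 - a / 3)
              + (m2 - m1) * (m1 + m2) / 2
              + (b / 2) * (m2 + b / 3))
    return moment / area


def ltu_output(w, x):
    """LTU output (0 or 1) for a fuzzy vector x; w[0] is an assumed bias weight."""
    net = w[0] + sum(wi * coa(xi) for wi, xi in zip(w[1:], x))
    return 1 if net > 0 else 0


def fuzzy_pocket(samples, pocket_run_len, max_iters=10_000):
    """samples: list of (fuzzy_vector, desired) pairs with desired in {-1, 1}."""
    n = len(samples[0][0])
    w = [0.0] * (n + 1)
    pocket, best_run, run = list(w), 0, 0
    for step in range(max_iters):
        x, desired = samples[step % len(samples)]
        out = ltu_output(w, x)
        if (out == 1 and desired == -1) or (out == 0 and desired == 1):
            # Misclassified: W_new = W_old + COA(X_i) * desired-output.
            w[0] += desired
            for i, xi in enumerate(x):
                w[i + 1] += coa(xi) * desired
            run = 0
        else:
            run += 1
            if run > best_run:  # keep the weights with the longest correct run
                pocket, best_run = list(w), run
            if best_run >= pocket_run_len:  # stopping criterion
                break
    return pocket
```

Keeping the best-so-far weights "in the pocket" means that the returned weights are the ones with the longest observed run of correct classifications, even when the data are not linearly separable.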

2.2 The FV node

Each FV node represents a fuzzy vector. That is, FV = (FI_1, FI_2, ..., FI_N), where the connection weights FI_1, FI_2, ..., FI_N are LR-type fuzzy intervals. Figure 2 illustrates the FV node's structure and its functions. Similarity-degree(X, FV) measures the degree of similarity between X and FV, and it takes values in [0, 1]. It is given by the equation below:

$$\text{similarity-degree}(X, FV) = \frac{\sum_{i=1}^{N} \text{similarity-degree}(X_i, FI_i)}{N}$$

and

$$\text{similarity-degree}(X_i, FI_i) = \max\big(\text{fuzzy-subsethood}(X_i, FI_i),\ \text{close-degree}(X_i, FI_i)\big)$$

Figure 2. The FV node.

Fuzzy-subsethood(X_i, FI_i) denotes whether X_i is a fuzzy subset of FI_i. For fuzzy sets, $A \subset B$ iff $\forall x \in X,\ \mu_A(x) \le \mu_B(x)$, where $\mu_A(x)$ and $\mu_B(x)$ are the membership values of element x in fuzzy sets A and B, respectively. The fuzzy subsethood theorem [8] is a good measure of this, except that its computation load is high.

Since we use the LR-type representation of a fuzzy set, fuzzy-subsethood(X_i, FI_i) can be decided in an easier way. Let X_i = (m_1, m_2, a, b)_LR and FI_i = (w_1, w_2, c, d)_LR. If

$$(m_1 \ge w_1) \text{ and } (m_2 \le w_2) \text{ and } (m_1 - a) \ge (w_1 - c) \text{ and } (m_2 + b) \le (w_2 + d),$$

then fuzzy-subsethood(X_i, FI_i) = 1; otherwise, fuzzy-subsethood(X_i, FI_i) = 0.

If input X_i is a fuzzy subset of FI_i, fuzzy-subsethood(X_i, FI_i) is equal to one. Otherwise, the distance between X_i and FI_i must be considered in measuring similarity-degree(X_i, FI_i). Close-degree(X_i, FI_i) takes the distance between X_i and FI_i as a factor: the longer the distance between X_i and FI_i, the lower close-degree(X_i, FI_i) will be. Both fuzzy-subsethood(X_i, FI_i) and close-degree(X_i, FI_i) take values in [0, 1]. Similarity-degree(X_i, FI_i) is the maximum of fuzzy-subsethood(X_i, FI_i) and close-degree(X_i, FI_i). Therefore, as long as input X_i is a fuzzy subset of FI_i, similarity-degree(X_i, FI_i) will be one.
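The subsethood test and the similarity measures can be sketched in a few lines of Python. The four-inequality test follows the text directly. The text does not give a formula for close-degree, so the `close_degree` function below is purely a hypothetical stand-in (one over one plus the distance between core midpoints), chosen only because it lies in [0, 1] and decreases with distance.

```python
def fuzzy_subsethood(x, fi):
    """Return 1.0 if X_i = (m1, m2, a, b) is a fuzzy subset of FI_i = (w1, w2, c, d)."""
    m1, m2, a, b = x
    w1, w2, c, d = fi
    inside = (m1 >= w1 and m2 <= w2
              and (m1 - a) >= (w1 - c)
              and (m2 + b) <= (w2 + d))
    return 1.0 if inside else 0.0


def close_degree(x, fi):
    """Hypothetical closeness measure: not specified in the paper.

    Uses 1 / (1 + distance between the midpoints of the two cores).
    """
    m1, m2, _, _ = x
    w1, w2, _, _ = fi
    distance = abs((m1 + m2) / 2 - (w1 + w2) / 2)
    return 1.0 / (1.0 + distance)


def interval_similarity(x, fi):
    """similarity-degree(X_i, FI_i) = max(fuzzy-subsethood, close-degree)."""
    return max(fuzzy_subsethood(x, fi), close_degree(x, fi))


def similarity_degree(xs, fv):
    """similarity-degree(X, FV): mean of the per-interval similarities."""
    return sum(interval_similarity(x, fi) for x, fi in zip(xs, fv)) / len(fv)
```

For example, with FI = (1, 4, 1, 1) the input (2, 3, 0.5, 0.5) passes all four inequalities, while (5, 6, 0.5, 0.5) fails the second one and falls back on the closeness term.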

Fuzzy vectors are allowed to be nested. Inner fuzzy vectors represent the exceptions to the outer ones. It may happen that input instance X is a fuzzy subset of more than one fuzzy vector. Taking a one-dimensional fuzzy vector as an example, in Figure 3 both similarity-degree(X, A) and similarity-degree(X, B) are one. The class of input fuzzy vector X will be the class of fuzzy vector A. This is decided by comparing the sizes of fuzzy vectors A and B: the FV with the smallest size is the winner. The size of an FV, Size(FV), is calculated by the equation below:

[Equation for Size(FV) not recoverable from the source.]

Figure 3. Both similarity-degree(X, A) and fuzzy-subsethood(X, B) are one, but input X will be classified by fuzzy vector A.


2.3 The OUTPUT node

Each connection weight between an FV node and the OUTPUT node is either +1 (i.e., (1, 1, 0, 0)_LR) or -1 (i.e., (-1, -1, 0, 0)_LR). The connection weight between the LTU node and the OUTPUT node is 1. The activation function of the OUTPUT node is a weighted summation.

If the result of the weighted summation is larger than zero, then the classification result will be 1; otherwise it will be -1. The classification result is decided by the output of the LTU node and the winner FV node (if any). That is, if input X is a fuzzy subset of the winner FV node, the classification result is equal to the class of the winner FV node (i.e., the connection weight between the winner FV node and the OUTPUT node), no matter what the output value of the LTU is. If no winner FV node can be found, the classification result is decided by the output of the LTU node. That is,

If output of the LTU node = 0 then classification-result = -1 else classification-result = 1

3. The training procedure

The training procedure is a two-phase learning procedure. In phase 1, the proposed fuzzy pocket algorithm is applied to train the LTU node. One-epoch FV generation/expansion (FVGE) learning takes place in phase 2.

When an input X is presented to the classifier, similarity-degree(X, FV) is calculated. The classification result is decided by the outputs of the winner node, FV_win, and the LTU node. FV_win satisfies:

(1) similarity-degree(X, FV_win) is 1;
(2) FV_win is the innermost among those fuzzy vectors whose similarity-degree(X, FV) is one.
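The decision rule of Section 2.3, combined with the two winner conditions above, could be sketched as follows. This is an illustration under assumptions: a fuzzy vector is treated as matching when the input is a fuzzy subset in every dimension, and because the equation for Size(FV) is not available here, the `size` function (sum of support widths) is a hypothetical stand-in used only to pick the innermost match.

```python
def is_subset(x, fi):
    """LR-type subsethood test for X_i = (m1, m2, a, b) and FI_i = (w1, w2, c, d)."""
    m1, m2, a, b = x
    w1, w2, c, d = fi
    return (m1 >= w1 and m2 <= w2
            and (m1 - a) >= (w1 - c)
            and (m2 + b) <= (w2 + d))


def size(fv):
    """Hypothetical size of a fuzzy vector: total width of the supports."""
    return sum((w2 + d) - (w1 - c) for w1, w2, c, d in fv)


def classify(xs, fv_nodes, ltu_out):
    """xs: input fuzzy vector; fv_nodes: list of (fv, cls) with cls in {-1, 1};
    ltu_out: output of the LTU node (0 or 1)."""
    matches = [(fv, cls) for fv, cls in fv_nodes
               if all(is_subset(x, fi) for x, fi in zip(xs, fv))]
    if matches:
        # The smallest (innermost) matching FV wins, whatever the LTU says.
        _, cls = min(matches, key=lambda node: size(node[0]))
        return cls
    # No winner FV node: fall back on the LTU output.
    return -1 if ltu_out == 0 else 1
```

With a wide outer node of class 1 and a narrow inner node of class -1, an input lying inside both is assigned class -1, which is how nested fuzzy vectors encode exceptions.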

If the current classification result is not correct, FVGE learning is applied. It is described below:

[Step 1] If input X is a fuzzy subset of some FV nodes, the second matched node, FV_sec-match, that is a fuzzy subset of FV_win will be searched for. That is, FV_sec-match satisfies the three conditions listed below:

condition 1: similarity-degree(FV_sec-match, FV_win) = 1 (i.e., FV_win and FV_sec-match are nested fuzzy vectors. The inner fuzzy vector, FV_sec-match, is an exception to the outer fuzzy vector, FV_win.)

condition 2: class of FV_sec-match = class of input X (i.e., W_sec-match,out = desired-out).

condition 3: similarity-degree(X, FV_sec-match) is the largest among those fuzzy vectors that satisfy condition 1.

[Step 1a] If FV_sec-match is found, each fuzzy interval of the FV_sec-match node will then be modified, and the fuzzy information X will be incorporated in the modified fuzzy vector FV_sec-match. One example is illustrated in Figure 4. The expansion criterion is: the maximum allowed size of the expanded FV_sec-match is ρ * Size(FV_win), where ρ takes values in [0, 1] and controls the allowed size of inner fuzzy vectors.

[Step 1b] If FV_sec-match cannot be found or the expansion criterion listed in Step 1a cannot be satisfied, a new FV node, FV_{K+1}, will be generated to represent input instance X. That is, FI_i of FV_{K+1} = X_i, where i = 1, ..., N, and the connection weight W_{FV_{K+1},OUT} is set to the class of input instance X.

[Step 2] If input X is not a fuzzy subset of any existing FV node, the best matched node, FV_best, will be selected.

That is, FV_best satisfies the two conditions listed below:

condition 1: class of FV_best = class of X (i.e., W_best,out = desired-out).

condition 2: similarity-degree(X, FV_best) is the largest among the existing FV nodes.

[Step 2a] Each fuzzy interval of the FV_best node will then be modified, and the fuzzy information X will be incorporated in the modified fuzzy vector FV_best. The expansion criterion is: the modified FV_best must never overlap with other fuzzy vectors that represent a different class.

[Step 2b] If this criterion cannot be satisfied, a new FV node, FV_{K+1}, will be generated to represent input instance X.

The function of FVGE learning is to generate or expand FV nodes to classify those instances that cannot be classified by the LTU node. Clearly, on-line learning is supported in this model if FVGE learning is applied during operation.
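Steps 2a and 2b can be illustrated with a short Python sketch. The text does not spell out how a fuzzy interval is modified or how overlap is tested, so both are assumptions here: `expand_interval` returns the smallest trapezoidal LR interval whose core and support cover both the old interval and the input, and `supports_overlap` treats two fuzzy vectors as overlapping when their supports intersect in every dimension. All function names are hypothetical.

```python
def expand_interval(fi, x):
    """Smallest LR interval covering both FI_i = (w1, w2, c, d) and X_i = (m1, m2, a, b).

    Assumed expansion rule; the paper only shows it graphically in Figure 4.
    """
    w1, w2, c, d = fi
    m1, m2, a, b = x
    core_lo, core_hi = min(w1, m1), max(w2, m2)
    sup_lo, sup_hi = min(w1 - c, m1 - a), max(w2 + d, m2 + b)
    return (core_lo, core_hi, core_lo - sup_lo, sup_hi - core_hi)


def supports_overlap(fv_a, fv_b):
    """Assumed overlap test: the supports intersect in every dimension."""
    return all((a1 - ac) <= (b2 + bd) and (b1 - bc) <= (a2 + ad)
               for (a1, a2, ac, ad), (b1, b2, bc, bd) in zip(fv_a, fv_b))


def expand_or_generate(nodes, best_idx, xs, cls):
    """Step 2a/2b. nodes: list of (fv, class); best_idx: index of FV_best."""
    candidate = [expand_interval(fi, x) for fi, x in zip(nodes[best_idx][0], xs)]
    clash = any(other_cls != cls and supports_overlap(candidate, fv)
                for fv, other_cls in nodes)
    if clash:
        # Step 2b: expansion would overlap another class, so add a new FV node.
        nodes.append((list(xs), cls))
    else:
        # Step 2a: accept the expanded fuzzy vector.
        nodes[best_idx] = (candidate, cls)
    return nodes
```

By construction, the input passes the four-inequality subsethood test against the expanded interval, so an accepted expansion always absorbs the instance that triggered it.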

Figure 4. Fuzzy interval expansion.


4. Experimental results

The 18 instances of the knowledge-based evaluator (KBE) [10] were used as training instances. The inputs of each feature are linguistic terms. The output of the KBE is Suitability, indicating whether the application of an expert system to a domain is poor or good. The membership functions used for each attribute are illustrated in Figure 5. For instances 2 and 3, we generate each possible term for the don't-care condition. Thus, a total of 340 training instances are generated from the 18 training instances.
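Expanding a don't-care entry into every possible term amounts to a Cartesian product over the term sets of the affected features. The sketch below shows the idea; the feature values and term names in the usage note are invented for illustration and are not the actual KBE terms, and `None` is an assumed marker for a don't-care entry.

```python
from itertools import product


def expand_dont_care(instance, terms_per_feature):
    """Replace each don't-care entry (None) by every term of that feature.

    instance: tuple of linguistic terms, with None marking a don't-care entry.
    terms_per_feature: one list of admissible terms per feature.
    """
    options = [terms if value is None else [value]
               for value, terms in zip(instance, terms_per_feature)]
    return [tuple(combo) for combo in product(*options)]
```

For instance, an instance ("high", None, None) with term sets of sizes 2, 3 and 2 yields 3 × 2 = 6 fully specified instances, all sharing the same class label.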

Figure 5. The membership functions of the KBE attributes (Worth, Employee Acceptance, Solution Available, Easier Solution, Teachability, Risk).

The experimental results of three trials are shown in Table 1. Pocket-run-len is 150, and the maximum allowed size of a fuzzy vector is set to 40. The proposed model correctly classified these 340 training instances using few nodes.

Table 1. The classification results on KBE instances. Pocket-run-len is 150 and the size of an FV is 40.

5. Conclusion

The contribution of fuzzy set theory lies in its methods for modeling and processing uncertain or ambiguous data, so often encountered in real life. Therefore, to enable a neural network classifier to handle real-life situations and to link the advantages of numerical and symbolic processing, one may incorporate fuzzy set theory into neural networks.

Fuzzy set theory, including LR-type fuzzy set operations and a defuzzification method, is utilized in the proposed neural network model. As illustrated in the experiments, the proposed model can handle both crisp and fuzzy inputs well. The network structure does not have to be specified before training. In addition, on-line learning is supported and learning is fast. Our proposed model can also be used as a method to generate the knowledge base of a connectionist expert system capable of handling fuzzy inputs. The issues of inference and explanation methods will be explored in the future.

References

[1] W. Pedrycz, “Selected Issues of Frame of Knowledge Representation Realized by Means of Linguistic Labels,” International Journal of Intelligent Systems, vol. 4,

[2] W. Pedrycz, “Fuzzy Logic in Development of Fundamentals of Pattern Recognition,” International Journal of Approximate Reasoning, vol. 5, pp. 251-264, 1991.

[3] H. M. Lee and W. T. Wang, “Training of a Neural Network Classifier by Combining Hyperplane with Exemplar Approach,” IEEE International Conference on Neural Networks, 1993, pp. 494-499.

[4] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, Kluwer-Nijhoff Publishing, 1991.

[5] Didier Dubois and Henri Prade, “Fuzzy Real Algebra: Some Results,” Fuzzy Sets and Systems, vol. 2, pp.

[6] Richard P. Lippmann, “An Introduction to Computing with Neural Nets,” IEEE ASSP Magazine, 1987, pp. 4-22.

[7] H. M. Lee and W. T. Wang, “Fuzzy Pocket Algorithm: A Generalized Pocket Algorithm for Classification of Fuzzy Inputs,” to appear in IJCNN '93, Nagoya, 1993.

[8] Bart Kosko, Neural Networks and Fuzzy Systems, Prentice-Hall, 1992.

[9] NeuralWorks Professional II/Plus, NeuralWare Inc., 1991.

[10] R. Keller, Expert System-Development &
