Proc. of the 1993 IEEE Int'l Conf. on Tools with AI, Boston, Massachusetts, Nov. 1993
An Architecture of Neural Network for Fuzzy Teaching Inputs
Hahn-Ming Lee and Weng-Tang Wang
Department of Electronic Engineering
National Taiwan Institute of Technology, Taipei, Taiwan
E-mail: hmlee@et.ntit.edu.tw
Abstract
A neural network for classification problems with linguistic terms is proposed. A fuzzy input is represented as an LR-type fuzzy set. A generalized pocket algorithm, called the fuzzy pocket algorithm, which uses LR-type fuzzy set operations and a defuzzification method, is first applied to train a linear threshold unit (LTU). This LTU node classifies as many fuzzy input instances as possible. Afterwards, FV nodes, which represent fuzzy vectors, are generated and expanded by the FVGE learning algorithm to classify those input instances that cannot be classified by the LTU node. The network structure is generated automatically. In addition, on-line learning is supported, and learning is fast. One sample problem, the knowledge-based evaluator, is considered to illustrate the working of the proposed method. The experimental results on this problem are encouraging.
1. Introduction
Pattern classification is an important area of AI. Most AI techniques adopt symbolic computation and provide a spectrum of very powerful representation schemes for this problem. Neural networks adopt numerical computation and offer fault tolerance, massively parallel computing, and trainability, which make them suitable for classification. However, numerical quantities evidently suffer from a lack of representational power. Therefore, it is useful for a neural network classifier to be capable of symbolic processing. Fuzzy set theory is a helpful method for processing linguistic and ambiguous information [1] using numerical computation, and it can link the advantages of symbolic and numerical processing [2]. Accordingly, this paper aims to incorporate fuzzy set theory into neural networks to improve their ability to handle linguistic terms.
In this paper, we extend the capability of our previously proposed model [3] to handle linguistic terms. LR-type fuzzy set representation and operations [4] [5] and a defuzzification method are used. Few nodes are needed for a linguistic variable, and the computation load is light. The network structure of the classifier is generated automatically. In addition, on-line learning is supported, and learning is fast.
2. The classifier structure
Figure 1. The proposed classifier. [Diagram labels: Input X; Classification.]
The complete structure of the proposed classifier is illustrated in Figure 1. It is a two-layered neural network with a single node at the output. The first layer has a linear threshold unit (LTU) [6] and zero or more FV nodes that represent fuzzy vectors (described in Section 2.2). The LTU node is trained by the proposed fuzzy pocket algorithm [7]. FV nodes are trained by the FVGE learning algorithm to classify those training instances that cannot be classified by the LTU node. When an input fuzzy vector X is presented to the classifier, the classifier checks whether X is a fuzzy subset [8] of some FV node. If it is, the winner FV node sends its classification output to the OUTPUT node. Otherwise, input X is classified by the LTU node.
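The decision flow just described can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the encoding of "no matching FV node" as `None` are assumptions made here, not part of the original model description.

```python
from typing import Optional


def classify(ltu_output: int, winner_fv_class: Optional[int]) -> int:
    """Combine the LTU output with the winner FV node, if one exists.

    ltu_output      -- 0 or 1, the thresholded output of the LTU node
    winner_fv_class -- +1 or -1 when X is a fuzzy subset of some FV node,
                       None when no FV node matches (hypothetical encoding)
    """
    if winner_fv_class is not None:
        # A matching FV node decides the class, whatever the LTU says.
        return winner_fv_class
    # No FV node matched: fall back to the LTU node (0 -> -1, 1 -> +1).
    return 1 if ltu_output == 1 else -1
```

For example, `classify(1, -1)` yields -1 because the winner FV node takes precedence, while `classify(1, None)` yields 1.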
2.1 The LTU node
The LTU node is first trained by the proposed fuzzy pocket algorithm [7], which uses LR-type fuzzy set operations and defuzzification. It classifies as many training instances as possible. The learning equation of the fuzzy pocket algorithm is:
If (output of LTU = 1 and desired-output = -1) or (output of LTU = 0 and desired-output = 1) then
$W_{LTU,i} = W_{LTU,i} + COA(X_i) \cdot \text{desired-output}$
where COA is one of the defuzzification methods [9].
In our model, Pocket-run-len is used as the stopping criterion for the fuzzy pocket algorithm. The FVGE learning algorithm is then applied to recognize those instances that cannot be classified by the LTU node.
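A minimal Python sketch of this training loop follows. Several points are assumptions rather than details taken from the paper: COA is taken to be center-of-area defuzzification over trapezoidal (linear) reference functions, each input is defuzzified once up front instead of carrying LR-type arithmetic through the weighted sum, instances are drawn at random, and training stops once the pocketed weights have survived `pocket_run_len` consecutive correct classifications. The names `coa`, `ltu_output`, and `fuzzy_pocket` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

# LR-type fuzzy interval (m1, m2, a, b): core [m1, m2], left spread a, right spread b.
FuzzyInterval = Tuple[float, float, float, float]


def coa(x: FuzzyInterval) -> float:
    """Center of area, ASSUMING linear (trapezoidal) reference functions."""
    m1, m2, a, b = x
    # (area, centroid) of the left triangle, the core rectangle, the right triangle.
    parts = [(a / 2, m1 - a / 3), (m2 - m1, (m1 + m2) / 2), (b / 2, m2 + b / 3)]
    area = sum(p[0] for p in parts)
    if area == 0:  # crisp number: no area to average over
        return m1
    return sum(p[0] * p[1] for p in parts) / area


def ltu_output(w: Sequence[float], x: Sequence[float]) -> int:
    """Linear threshold unit with outputs 1 or 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def fuzzy_pocket(samples: List[Tuple[List[FuzzyInterval], int]],
                 pocket_run_len: int, max_iters: int = 10000,
                 seed: int = 0) -> List[float]:
    """Return the pocketed weights; desired outputs are +1 or -1."""
    rng = random.Random(seed)
    # SIMPLIFICATION: defuzzify every input component once, up front.
    crisp = [([coa(fi) for fi in x], d) for x, d in samples]
    w = [0.0] * len(crisp[0][0])
    pocket, run, pocket_run = list(w), 0, 0
    for _ in range(max_iters):
        x, desired = rng.choice(crisp)
        out = ltu_output(w, x)
        if (out == 1 and desired == -1) or (out == 0 and desired == 1):
            # Misclassified: W_i = W_i + COA(X_i) * desired-output
            w = [wi + xi * desired for wi, xi in zip(w, x)]
            run = 0
        else:
            run += 1
            if run > pocket_run:  # longest run so far: put weights in the pocket
                pocket, pocket_run = list(w), run
                if pocket_run >= pocket_run_len:
                    break
    return pocket
```

A bias term can be supplied as the crisp interval `(1.0, 1.0, 0.0, 0.0)` appended to every input vector.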
2.2 The FV node
Each FV node represents a fuzzy vector. That is, FV = (FI1, FI2, ..., FIN), where the connection weights FI1, FI2, ..., FIN are LR-type fuzzy intervals. Figure 2 illustrates the FV node's structure and its functions. The similarity-degree(X, FV) measures the degree of similarity between X and FV, and it takes values in [0, 1]. It is measured by the equations below:
$\text{similarity-degree}(X, FV) = \frac{1}{N} \sum_{i=1}^{N} \text{similarity-degree}(X_i, FI_i)$
and
$\text{similarity-degree}(X_i, FI_i) = \max(\text{fuzzy-subsethood}(X_i, FI_i),\ \text{close-degree}(X_i, FI_i))$
Figure 2. The FV node. [Diagram labels: transfer function (1 if SM = 1); output to the OUTPUT node.]
Fuzzy-subsethood(Xi, FIi) denotes whether Xi is a fuzzy subset of FIi. For fuzzy sets, $A \subset B$ iff $\forall x \in X,\ \mu_A(x) \le \mu_B(x)$, where $\mu_A(x)$ and $\mu_B(x)$ are the membership values of element x in fuzzy sets A and B, respectively. The fuzzy subsethood theorem [8] is a good measure of this, except that its computation load is high.
Since we use the LR-type representation of a fuzzy set, fuzzy-subsethood(Xi, FIi) can be decided in an easier way. Let $X_i = (m_1, m_2, a, b)_{LR}$ and $FI_i = (w_1, w_2, c, d)_{LR}$. If $(m_1 \ge w_1)$ and $(m_2 \le w_2)$ and $(m_1 - a) \ge (w_1 - c)$ and $(m_2 + b) \le (w_2 + d)$, then fuzzy-subsethood(Xi, FIi) = 1; else fuzzy-subsethood(Xi, FIi) = 0.
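The four-inequality test translates directly into code. The sketch below assumes each LR-type fuzzy interval is stored as a plain tuple `(m1, m2, a, b)`; the tuple layout and the function name `fuzzy_subsethood` are choices made for illustration.

```python
from typing import Tuple

# LR-type fuzzy interval (m1, m2, a, b): core [m1, m2], left spread a, right spread b.
FuzzyInterval = Tuple[float, float, float, float]


def fuzzy_subsethood(x: FuzzyInterval, fi: FuzzyInterval) -> int:
    """Return 1 if x is a fuzzy subset of fi by the LR-type test, else 0."""
    m1, m2, a, b = x
    w1, w2, c, d = fi
    inside = (
        m1 >= w1                    # core of x starts inside the core of fi
        and m2 <= w2                # core of x ends inside the core of fi
        and (m1 - a) >= (w1 - c)    # left end of the support is covered
        and (m2 + b) <= (w2 + d)    # right end of the support is covered
    )
    return 1 if inside else 0
```

Only four comparisons are needed per input dimension, which is what keeps the computation load light compared with evaluating membership values pointwise.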
If input Xi is a fuzzy subset of FIi, fuzzy-subsethood(Xi, FIi) is equal to one. Otherwise, the distance between Xi and FIi must be considered in measuring similarity-degree(Xi, FIi). Close-degree(Xi, FIi) takes the distance between Xi and FIi as a factor: the longer the distance between Xi and FIi, the lower close-degree(Xi, FIi) will be. Both fuzzy-subsethood(Xi, FIi) and close-degree(Xi, FIi) take values in [0, 1]. Similarity-degree(Xi, FIi) is the maximum of fuzzy-subsethood(Xi, FIi) and close-degree(Xi, FIi). Therefore, as long as input Xi is a fuzzy subset of FIi, similarity-degree(Xi, FIi) is one.
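The following Python sketch shows how the two measures could be combined. The text gives no formula for close-degree, so the one used here, 1 / (1 + distance) with the distance taken as the summed absolute differences of the core and support endpoints, is purely a hypothetical stand-in that decreases with distance and stays in [0, 1]. Averaging the component similarities over the N dimensions is likewise an assumption about the vector-level measure.

```python
from typing import Sequence, Tuple

FuzzyInterval = Tuple[float, float, float, float]  # (m1, m2, a, b)_LR


def fuzzy_subsethood(x: FuzzyInterval, fi: FuzzyInterval) -> int:
    m1, m2, a, b = x
    w1, w2, c, d = fi
    ok = m1 >= w1 and m2 <= w2 and (m1 - a) >= (w1 - c) and (m2 + b) <= (w2 + d)
    return 1 if ok else 0


def close_degree(x: FuzzyInterval, fi: FuzzyInterval) -> float:
    """HYPOTHETICAL close-degree: decreases as x moves away from fi."""
    m1, m2, a, b = x
    w1, w2, c, d = fi
    distance = (abs(m1 - w1) + abs(m2 - w2)
                + abs((m1 - a) - (w1 - c)) + abs((m2 + b) - (w2 + d)))
    return 1.0 / (1.0 + distance)


def similarity_interval(x: FuzzyInterval, fi: FuzzyInterval) -> float:
    """Maximum of fuzzy-subsethood and close-degree for one dimension."""
    return max(float(fuzzy_subsethood(x, fi)), close_degree(x, fi))


def similarity_degree(x: Sequence[FuzzyInterval],
                      fv: Sequence[FuzzyInterval]) -> float:
    """ASSUMED vector-level measure: mean of the per-dimension similarities."""
    return sum(similarity_interval(xi, fi) for xi, fi in zip(x, fv)) / len(fv)
```

With this stand-in, a component similarity of exactly one occurs only when the input interval is a fuzzy subset of the weight interval, which matches the winner condition used by the classifier.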
Fuzzy vectors are allowed to be nested. Inner fuzzy vectors represent exceptions to the outer ones. It may happen that an input instance X is a fuzzy subset of more than one fuzzy vector. Taking one-dimensional fuzzy vectors as an example, in Figure 3 both similarity-degree(X, A) and similarity-degree(X, B) are one. The class of input fuzzy vector X will be the class of fuzzy vector A. This is decided by comparing the sizes of fuzzy vectors A and B: the FV with the smallest size is the winner. The size of an FV, Size(FV), is calculated by the equation below:
[Equation for Size(FV) not recoverable from the source.]
Figure 3. Both similarity-degree(X, A) and fuzzy-subsethood(X, B) are one, but input X will be classified by fuzzy vector A.
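Winner selection among nested fuzzy vectors can be sketched as below. Because the Size(FV) equation is not available here, the `size` function is a hypothetical stand-in (the summed support widths of the fuzzy intervals); the `FVNode` class and `select_winner` are likewise illustrative names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

FuzzyInterval = Tuple[float, float, float, float]  # (m1, m2, a, b)_LR


@dataclass
class FVNode:
    intervals: Sequence[FuzzyInterval]  # one LR-type fuzzy interval per dimension
    cls: int                            # +1 or -1: weight to the OUTPUT node


def is_fuzzy_subset(x: FuzzyInterval, fi: FuzzyInterval) -> bool:
    m1, m2, a, b = x
    w1, w2, c, d = fi
    return m1 >= w1 and m2 <= w2 and (m1 - a) >= (w1 - c) and (m2 + b) <= (w2 + d)


def size(fv: FVNode) -> float:
    # ASSUMPTION: size = total support width; the paper's own formula may differ.
    return sum((w2 + d) - (w1 - c) for w1, w2, c, d in fv.intervals)


def select_winner(x: Sequence[FuzzyInterval],
                  nodes: Sequence[FVNode]) -> Optional[FVNode]:
    """Return the smallest FV node that contains x in every dimension, or None."""
    matches = [n for n in nodes
               if all(is_fuzzy_subset(xi, fi) for xi, fi in zip(x, n.intervals))]
    return min(matches, key=size) if matches else None
```

In a one-dimensional setting like Figure 3, if A is nested inside B and X is a fuzzy subset of both, `select_winner` returns A because it is the smaller of the two.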
2.3 The OUTPUT node.
Each connection weight between an FV node and the OUTPUT node is either +1 (i.e., (1, 1, 0, 0)LR) or -1 (i.e., (-1, -1, 0, 0)LR). The connection weight between the LTU node and the OUTPUT node is 1. The activation function of the OUTPUT node is a weighted summation. If the result of the weighted summation is larger than zero, the classification result is 1; otherwise it is -1. The classification result is decided by the output of the LTU node and the winner FV node (if any). That is, if input X is a fuzzy subset of the winner FV node, the classification result is equal to the class of the winner FV node (i.e., the connection weight between the winner FV node and the OUTPUT node), no matter what the output value of the LTU is. If no winner FV node can be found, the classification result is decided by the output of the LTU node. That is,
If output of the LTU node = 0 then classification-result = -1 else classification-result = 1
3. The training procedure
The training procedure is a two-phase learning procedure. In phase 1, the proposed fuzzy pocket algorithm is applied to train the LTU node. In phase 2, one-epoch FV generation/expansion (FVGE) learning takes place.
When an input X is presented to the classifier, similarity-degree(X, FV) is calculated. The classification result is decided by the output of the winner node, FVwin, and the LTU node. FVwin satisfies:
(1) similarity-degree(X, FVwin) is 1;
(2) FVwin is the innermost among those fuzzy vectors whose similarity-degree(X, FV) is one.
If the current classification result is not correct, FVGE learning is applied. It is described below:
[Step 1] If input X is a fuzzy subset of some FV nodes, the second matched node, FVsec-match, which is a fuzzy subset of FVwin, is searched for. That is, FVsec-match satisfies the three conditions listed below:
condition 1: Similarity-degree(FVsec-match, FVwin) = 1 (i.e., FVwin and FVsec-match are nested fuzzy vectors. The inner fuzzy vector, FVsec-match, is an exception to the outer fuzzy vector, FVwin.)
condition 2: Class of FVsec-match = class of input X (i.e., Wsec-match, out = desired-out).
condition 3: Similarity-degree(X, FVsec-match) is the largest among those fuzzy vectors that satisfy condition 1.
[Step 1a] If FVsec-match is found, each fuzzy interval of the FVsec-match node is modified, and the fuzzy information X is incorporated into the modified fuzzy vector FVsec-match. One example is illustrated in Figure 4. The expansion criterion is: the maximum allowed size of the expanded FVsec-match is p * size(FVwin), where p takes a value in [0, 1] and controls the allowed size of inner fuzzy vectors.
[Step 1b] If FVsec-match cannot be found or the expansion criterion listed in step 1a cannot be satisfied, a new FV node, FVK+1, is generated to represent input instance X. That is, FIi of FVK+1 = Xi, where i = 1, ..., N, and the connection weight WFVK+1, OUT is set to the class of input instance X.
[Step 2] If input X is not a fuzzy subset of any existing FV node, the best matched node, FVbest, is selected. That is, FVbest satisfies the two conditions listed below:
condition 1: Class of FVbest = class of X (i.e., Wbest, out = desired-out).
condition 2: Similarity-degree(X, FVbest) is the largest among the existing FV nodes.
[Step 2a] Each fuzzy interval of the FVbest node is then modified, and the fuzzy information X is incorporated into the modified fuzzy vector FVbest. The expansion criterion is: the modified FVbest must never overlap with other fuzzy vectors that represent a different class.
[Step 2b] If this criterion cannot be satisfied, a new FV node, FVK+1, is generated to represent input instance X.
The function of FVGE learning is to generate or expand FV nodes to classify those instances that cannot be classified by the LTU node. On-line learning is therefore available in this model if FVGE learning is applied during operation.
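The expand-or-generate choice of steps 1a and 1b can be sketched as follows. The text does not spell out how a fuzzy interval is modified, so the sketch assumes the expanded interval is the smallest LR-type interval whose core and support cover both the old interval and the input. The `size` function is again a hypothetical stand-in (total support width), and all function names are illustrative.

```python
from typing import List, Optional, Sequence, Tuple

FuzzyInterval = Tuple[float, float, float, float]  # (m1, m2, a, b)_LR
FuzzyVector = List[FuzzyInterval]


def expand_interval(fi: FuzzyInterval, x: FuzzyInterval) -> FuzzyInterval:
    """ASSUMED expansion rule: cover the cores and supports of both intervals."""
    w1, w2, c, d = fi
    m1, m2, a, b = x
    new_w1, new_w2 = min(w1, m1), max(w2, m2)
    new_left = min(w1 - c, m1 - a)    # left end of the combined support
    new_right = max(w2 + d, m2 + b)   # right end of the combined support
    return (new_w1, new_w2, new_w1 - new_left, new_right - new_w2)


def size(fv: Sequence[FuzzyInterval]) -> float:
    # ASSUMPTION: size = total support width; the paper's own formula may differ.
    return sum((w2 + d) - (w1 - c) for w1, w2, c, d in fv)


def expand_or_generate(sec_match: Optional[FuzzyVector], win: FuzzyVector,
                       x: FuzzyVector, p: float) -> Tuple[FuzzyVector, bool]:
    """Steps 1a/1b: return (fuzzy vector, True if a new FV node was generated)."""
    if sec_match is not None:
        expanded = [expand_interval(fi, xi) for fi, xi in zip(sec_match, x)]
        if size(expanded) <= p * size(win):   # expansion criterion of step 1a
            return expanded, False
    # Step 1b: the new node copies the input, FI_i = X_i for i = 1..N.
    return list(x), True
```

A smaller `p` keeps inner fuzzy vectors tight and generates more nodes; a larger `p` allows more expansion before a new node is created.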
Figure 4. Fuzzy interval expansion.
4. Experimental results
The 18 instances of the knowledge-based evaluator (KBE) [10] were used as training instances. The inputs for each feature are linguistic terms. The output of the KBE is Suitability, indicating whether the application of an expert system to a domain is poor or good. The membership functions used for each attribute are illustrated in Figure 5. For instances 2 and 3, we generate each possible term for the don't-care conditions. In total, 340 training instances are generated from the 18 training instances.
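The don't-care expansion amounts to taking a Cartesian product over the unspecified features. In the sketch below, the feature names, the linguistic terms, and the use of `None` to mark a don't-care value are all hypothetical and are not the actual KBE data.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence


def expand_dont_care(instance: Dict[str, Optional[str]],
                     terms: Dict[str, Sequence[str]]) -> List[Dict[str, str]]:
    """Replace each don't-care (None) feature with every possible term."""
    names = list(instance)
    choices = [
        [instance[n]] if instance[n] is not None else list(terms[n])
        for n in names
    ]
    # One concrete instance per combination of the don't-care features.
    return [dict(zip(names, combo)) for combo in product(*choices)]


# Hypothetical example: two don't-care features with 3 and 2 terms.
TERMS = {
    "worth": ["low", "moderate", "high"],
    "risk": ["low", "moderate", "high"],
    "teachability": ["difficult", "frequent"],
}
expanded = expand_dont_care(
    {"worth": "high", "risk": None, "teachability": None}, TERMS
)
```

Here `expanded` holds 3 x 2 = 6 concrete instances, each of which keeps the original class label when added to the training set.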
Figure 5. The membership functions of KBE. [Labels recovered from the figure: Worth Value, Employee Acceptance, Solution Available, Easier Solution, Teachability, Risk.]
The experimental results of three trials are given in Table 1. Pocket-run-len is 150, and the maximum allowed size of a fuzzy vector is set to 40. The proposed model correctly classified these 340 training instances using few nodes.
Table 1. The classification results on KBE instances. Pocket-run-len is 150 and the size of an FV is 40.
5. Conclusion
The contribution of fuzzy set theory lies in its methods for modeling and processing the uncertain or ambiguous data so often encountered in real life. Therefore, to enable a neural network classifier to handle real-life situations and to link the advantages of numerical and symbolic processing, one may incorporate fuzzy set theory into neural networks.
Fuzzy set theory, including LR-type fuzzy set operations and a defuzzification method, is used in the proposed neural network model. As illustrated in the experiments, the proposed model can handle both crisp and fuzzy inputs well. The network structure does not have to be specified before training. In addition, on-line learning is supported, and learning is fast. The proposed model can also be used as a method for generating the knowledge base of a connectionist expert system capable of handling fuzzy inputs. The issues of inference and explanation methods will be explored in the future.
References
[1] W. Pedrycz, “Selected Issues of Frame of Knowledge Representation Realized by Means of Linguistic Labels,” International Journal of Intelligent Systems, vol. 4,
[2] W. Pedrycz, “Fuzzy Logic in Development of Fundamentals of Pattern Recognition,” International Journal of Approximate Reasoning, vol. 5, pp. 251-264, 1991.
[3] H. M. Lee and W. T. Wang, “Training of a Neural Network Classifier by Combining Hyperplane with Exemplar Approach,” IEEE International Conference on Neural Networks, 1993, pp. 494-499.
[4] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, Kluwer-Nijhoff Publishing, 1991.
[5] Didier Dubois and Henri Prade, “Fuzzy Real Algebra: Some Results,” Fuzzy Sets and Systems, vol. 2, pp.
[6]