A neural fuzzy network for word information processing

Chin-Teng Lin, Fun-Bin Duh, Der-Jenq Liu

Department of Electrical and Control Engineering, National Chiao-Tung University, Hsinchu, Taiwan, ROC

Abstract

A neural fuzzy system that learns with fuzzy training data is proposed in this study. The system is able to process and learn numerical information as well as word information. First, we propose a basic structure of a five-layered neural network for the connectionist realization of a fuzzy inference system. The connectionist structure can house fuzzy logic rules and membership functions for fuzzy inference. The inputs, outputs, and weights of the proposed network can be fuzzy numbers of any shape. They can also be a hybrid of fuzzy numbers and numerical values through the use of fuzzy singletons. Based on interval arithmetic, a fuzzy supervised learning algorithm is developed for the proposed system. It extends the normal supervised learning techniques to learning problems where only word teaching signals are available. The fuzzy supervised learning scheme can train the proposed system with desired fuzzy input–output pairs. An experimental system is constructed to illustrate the performance and applicability of the proposed scheme. © 2002 Elsevier Science B.V. All rights reserved.

1. Introduction

Some observations obtained from a system are precise, while others cannot be measured at all. Namely, two kinds of information are available: numerical information from measuring instruments and word information from human experts. Moreover, some data obtained in this manner are hybrid; that is, their components are not homogeneous but a blend of precise and fuzzy information.

Neural networks adopt numerical computations with fault tolerance, massively parallel computing, and trainable properties. However, numerical quantities evidently suffer from a lack of representation power, so it is useful for neural networks to be capable of symbolic processing. Most learning methods in neural networks are designed for real vectors, yet in many applications the information cannot be represented meaningfully or measured directly as real vectors; that is, we have to deal with fuzzy information in the learning process of neural networks. Fuzzy sets are a good representation form for linguistic data. Therefore, combining neural networks with fuzzy sets could combine the advantages of symbolic and numerical processing. In this study, we propose a new model of a neural fuzzy system that can process a hybrid of numerical and fuzzy information. The main task is to develop a fuzzy supervised learning algorithm for the proposed neural fuzzy system.

Corresponding author. Tel.: +886-3-5712121, ext. 54315; fax: +886-3-5739497. E-mail address: ctlin@fnn.cn.nctu.edu.tw (C.-T. Lin).

Most of the supervised learning methods of neural networks, for example the perceptron [1] and the backpropagation (BP) algorithm [2,3], process only numerical data. Some approaches have been proposed to process linguistic information with fuzzy inputs, fuzzy outputs, or fuzzy weights [4–7]. The common points of these approaches are summarized as follows: (1) the α-level sets of fuzzy numbers represent linguistic inputs, linguistic outputs, fuzzy weights, or fuzzy biases; (2) the operations in the neural network are performed by using interval arithmetic operations on α-level sets; (3) fuzzy numbers are propagated through the neural networks; (4) fuzzy weights are usually triangular or trapezoidal fuzzy numbers. Because the real-number arithmetic operations in traditional neural networks are extended to interval arithmetic operations on α-level sets in the above fuzzified networks, the computations become complex (e.g., multiplication of intervals) and time-consuming. Moreover, since the fuzzy numbers are propagated through the whole network, the computation time and the required memory capacity are 2h times those of traditional neural networks, where h represents the number of α-level sets. In this study, we attack this problem by allowing numerical signals to flow internally in the proposed network while achieving the same purpose of processing fuzzy numbers.

The objective of this study is to explore an approach to the supervised learning of neural fuzzy systems that receive only word teaching signals. First, we propose a basic structure of a five-layered feedforward network for the network realization of a fuzzy inference system. This connectionist structure can house fuzzy logic rules and membership functions, and perform fuzzy inference. We use α-level sets of fuzzy numbers to represent word information. The inputs, outputs, and weights of the proposed network can be fuzzy numbers of any shape. Since numerical values can be represented by fuzzy singletons, the proposed system can in fact process and learn a hybrid of fuzzy numbers and numerical values. Based on interval arithmetic, a fuzzy supervised learning scheme is developed for the proposed system. It generalizes the normal supervised learning techniques to learning problems where only word teaching signals are available. The fuzzy supervised learning scheme can train the proposed network with desired fuzzy input–output pairs (or, equivalently, desired fuzzy if-then rules) represented by fuzzy numbers instead of numerical values.

This study is organized as follows. Section 2 describes the fundamental properties and operations of fuzzy numbers and their α-level sets; these operations and properties will be used in later derivations. In Section 3, the basic structure of our neural fuzzy system is proposed. A fuzzy supervised learning algorithm for the proposed system is presented in Section 4; the learning algorithm contains structure and parameter learning phases. In Section 5, simulations are performed to illustrate the performance of the proposed techniques. Finally, conclusions are summarized in the last section.

2. Representation of word information

In our model, we use α-level sets of fuzzy numbers (i.e., convex and normal fuzzy sets), as shown in Fig. 1, to represent word information, because of several good properties such as closedness and a good representation form. In using α-level sets, we consider a fuzzy number to be an extension of the concept of the interval of confidence [9]. Instead of considering the interval of confidence at one unique level, it is considered at several levels and, more generally, at all levels from 0 to 1. Namely, the level of presumption $\alpha$, $\alpha \in [0,1]$, gives an interval of confidence $A_\alpha = [a_1^{(\alpha)}, a_2^{(\alpha)}]$, which is a monotonically decreasing function of $\alpha$; that is,

$$(\alpha' > \alpha) \;\Rightarrow\; (A_{\alpha'} \subset A_\alpha), \qquad (1)$$

or

$$(\alpha' > \alpha) \;\Rightarrow\; \bigl([a_1^{(\alpha')}, a_2^{(\alpha')}] \subset [a_1^{(\alpha)}, a_2^{(\alpha)}]\bigr), \qquad (2)$$

for every $\alpha, \alpha' \in [0,1]$.

Fig. 1. Representations of a fuzzy number: (a) α-level sets of a fuzzy number; (b) discretized (pointwise) membership function.


2.1. Basic definitions of fuzzy numbers

Some notations and basic definitions are given in this subsection. We use uppercase letters to represent fuzzy sets and lowercase letters to represent real numbers.

Let $x$ be an element in a universe of discourse $X$. A fuzzy set $P$ is defined by a membership function $P(x)$ as

$$P: x \to [0,1]. \qquad (3)$$

When $X$ is a continuum rather than a countable or finite set, the fuzzy set $P$ is represented as

$$P = \int_X P(x)/x, \qquad (4)$$

where $x \in X$. When $X$ is a countable or finite set,

$$P = \sum_i P(x_i)/x_i, \qquad (5)$$

where $x_i \in X$. We call the form in the above equation a discretized or pointwise membership function.

A fuzzy set $P$ is normal when its membership function $P(x)$ satisfies

$$\max_x P(x) = 1. \qquad (6)$$

A fuzzy set is convex if and only if each of its α-level sets is a convex set. Equivalently, we may say that a fuzzy set $P$ is convex if and only if

$$P(\lambda x_1 + (1-\lambda)x_2) \geq \min[P(x_1), P(x_2)], \qquad (7)$$

where $0 \leq \lambda \leq 1$, $x_1 \in X$, $x_2 \in X$.

The α-level set of a fuzzy set $P$, denoted $P_\alpha$, is defined by

$$P_\alpha = \{x \mid P(x) \geq \alpha\}, \qquad (8)$$

where $0 \leq \alpha \leq 1$, $x \in X$.

A fuzzy set $P$ is convex if and only if every $P_\alpha$ is convex; that is, $P_\alpha$ is a closed interval of $\mathbb{R}$. It can be represented by

$$P_\alpha = [p_1^{(\alpha)}, p_2^{(\alpha)}], \qquad (9)$$

where $\alpha \in [0,1]$. A convex and normalized fuzzy set whose membership function is piecewise continuous is called a fuzzy number. Thus, a fuzzy number can be considered as containing the real numbers within some interval to varying degrees. Namely, a fuzzy number $P$ may be decomposed into its α-level sets, $P_\alpha$, according to the resolution identity theorem [11] as follows:

$$P = \bigcup_\alpha \alpha P_\alpha = \bigcup_\alpha \alpha\,[p_1^{(\alpha)}, p_2^{(\alpha)}] = \int_x \sup_\alpha \bigl[\alpha\,P_\alpha(x)\bigr]/x, \qquad (10)$$

where $P_\alpha(\cdot)$ denotes the characteristic function of the set $P_\alpha$.

2.2. Basic operations of fuzzy numbers

In this subsection, we introduce some basic operations on α-level sets of fuzzy numbers. These operations will be used in the derivation of our model in the following section. More detailed operations on fuzzy numbers can be found in [9].

Addition. Let $A$ and $B$ be two fuzzy numbers and $A_\alpha$ and $B_\alpha$ their α-level sets, $A_\alpha = [a_1^{(\alpha)}, a_2^{(\alpha)}]$ and $B_\alpha = [b_1^{(\alpha)}, b_2^{(\alpha)}]$. Then we can write

$$A_\alpha (+) B_\alpha = [a_1^{(\alpha)}, a_2^{(\alpha)}] (+) [b_1^{(\alpha)}, b_2^{(\alpha)}] = [a_1^{(\alpha)} + b_1^{(\alpha)},\; a_2^{(\alpha)} + b_2^{(\alpha)}], \qquad (11)$$

where $\alpha \in [0,1]$.

Subtraction. The definition of addition can be extended to the definition of subtraction as follows:

$$A_\alpha (-) B_\alpha = [a_1^{(\alpha)}, a_2^{(\alpha)}] (-) [b_1^{(\alpha)}, b_2^{(\alpha)}] = [a_1^{(\alpha)} - b_2^{(\alpha)},\; a_2^{(\alpha)} - b_1^{(\alpha)}], \qquad (12)$$

where $\alpha \in [0,1]$.

Multiplication by an ordinary number. Let $A$ be a fuzzy number in $\mathbb{R}$ and $k$ an ordinary number, $k \in \mathbb{R}$. We have

$$k \cdot A_\alpha = \begin{cases} [k a_1^{(\alpha)},\; k a_2^{(\alpha)}] & \text{if } k \geq 0, \\ [k a_2^{(\alpha)},\; k a_1^{(\alpha)}] & \text{if } k < 0. \end{cases} \qquad (13)$$

Multiplication. Here we consider multiplication of fuzzy numbers in $\mathbb{R}^+$. Consider two fuzzy numbers $A$ and $B$ in $\mathbb{R}^+$. For the level of presumption $\alpha$, we have

$$A_\alpha (\cdot) B_\alpha = [a_1^{(\alpha)}, a_2^{(\alpha)}] (\cdot) [b_1^{(\alpha)}, b_2^{(\alpha)}] = [a_1^{(\alpha)} b_1^{(\alpha)},\; a_2^{(\alpha)} b_2^{(\alpha)}]. \qquad (14)$$

The reader is referred to [9] for the general case in which $A$ and $B$ are fuzzy numbers in $\mathbb{R}$.
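Since each α-level operation above is plain interval arithmetic applied level by level, the representation is easy to prototype. The following minimal Python sketch (our own illustration; the class name, array layout, and sampled levels are assumptions, not the paper's) stores a fuzzy number as an array of intervals indexed by α and implements Eqs. (11)–(14):

```python
import numpy as np

class AlphaFuzzyNumber:
    """A fuzzy number stored as its alpha-level sets.

    levels[k] = [a1, a2] is the interval of confidence at alpha = k/(h-1),
    so row 0 is the support (alpha = 0) and the last row is the core (alpha = 1).
    """

    def __init__(self, levels):
        self.levels = np.asarray(levels, dtype=float)   # shape (h, 2)

    def __add__(self, other):                 # Eq. (11): endpointwise sum
        return AlphaFuzzyNumber(self.levels + other.levels)

    def __sub__(self, other):                 # Eq. (12): [a1 - b2, a2 - b1]
        lo = self.levels[:, 0] - other.levels[:, 1]
        hi = self.levels[:, 1] - other.levels[:, 0]
        return AlphaFuzzyNumber(np.stack([lo, hi], axis=1))

    def scale(self, k):                       # Eq. (13): sorting flips endpoints if k < 0
        return AlphaFuzzyNumber(np.sort(k * self.levels, axis=1))

    def mul_pos(self, other):                 # Eq. (14): both numbers in R+
        return AlphaFuzzyNumber(self.levels * other.levels)

# Two triangular fuzzy numbers sampled at h = 3 levels (alpha = 0, 0.5, 1).
A = AlphaFuzzyNumber([[1.0, 3.0], [1.5, 2.5], [2.0, 2.0]])
B = AlphaFuzzyNumber([[0.0, 2.0], [0.5, 1.5], [1.0, 1.0]])
print((A + B).levels)   # [[1. 5.] [2. 4.] [3. 3.]]
```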

The operations on fuzzy numbers can be performed on the basis of the extension principle. Let $G(A,B)$ be an operation on fuzzy numbers $A$ and $B$ induced by a function $g(x,y)$. The membership function of $G(A,B)$, written $G(A,B)(z)$, is obtained by the extension principle as follows:

$$G(A,B)(z) = \begin{cases} \sup_{(x,y)\in g^{-1}(z)} \bigl(A(x) \wedge B(y)\bigr) & \text{if } g^{-1}(z) \neq \emptyset, \\ 0 & \text{if } g^{-1}(z) = \emptyset, \end{cases} \qquad (15)$$

where $x \in A$, $y \in B$, $z \in Z$.

If $G(\cdot)$ is addition, subtraction, or multiplication as above, the result can easily be obtained by using the α-level sets of $A$ and $B$ as in the above operations. It can easily be proved that these operations based on α-level sets and the extension principle are equivalent [9]. When $G(\cdot)$ is one of the operations given above, if $A$ and $B$ are fuzzy numbers in $\mathbb{R}$, then $G(A,B)$ is also a fuzzy number.

Difference. We can compute the difference between fuzzy numbers $A$ and $B$ by

$$\mathit{diff}(A,B) = \tfrac{1}{2} \sum_\alpha \bigl[ (a_1^{(\alpha)} - b_1^{(\alpha)})^2 + (a_2^{(\alpha)} - b_2^{(\alpha)})^2 \bigr]. \qquad (16)$$

Fuzzification. Fuzzification is a mapping from an observed input space to fuzzy sets. Namely, fuzzification is an operation that obtains the membership grade $P(x)$ of a value $x$ in a fuzzy number $P$. The value $P(x)$ can easily be obtained by the following procedure, in which the notation $h$ represents the number of quantized membership grades.

Procedure: Fuzzification search

Inputs: the ordered set $S = \{p_1^{(0)}, p_1^{(1/h)}, p_1^{(2/h)}, \ldots, p_1^{(1)}, p_2^{(1)}, \ldots, p_2^{(0)}\}$ and the value $x$.

Output: $\hat\alpha$.

Step 1: Use binary search or sequential search to find the correct position of $x$ in $S$; that is, find $p^{(a)}$ and $p^{(b)}$ such that $p^{(a)} \leq x \leq p^{(b)}$, where $p^{(a)}, p^{(b)} \in S$.

Step 2: Compute $\hat\alpha = a + (x - p^{(a)})(b - a)/(p^{(b)} - p^{(a)})$.

Step 3: Output $\hat\alpha$ and stop.

After this calculation, the value $\hat\alpha$ is equal to the value of $P(x)$, as shown in Fig. 2. If we use the binary search method, the processing time of this procedure is proportional to $\log_2 h$; if we use the sequential search method, the processing time is $O(h)$. Therefore, this procedure is easy to perform and not time-consuming.

Fig. 2. Illustration of the fuzzification search procedure (the value of $c$ is equal to $\hat\alpha$).
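A compact way to realize this procedure on the α-level representation above is to interpolate α along the rising and falling edges of the fuzzy number. The sketch below is ours, with assumed array shapes; np.interp performs the position search and the linear interpolation of Step 2 in one call:

```python
import numpy as np

def fuzzify(levels, x):
    """Membership grade P(x) of crisp x in a fuzzy number given as alpha-level sets.

    levels: array of shape (h, 2); levels[k] = [p1, p2] at alpha = k/(h-1).
    """
    levels = np.asarray(levels, dtype=float)
    alphas = np.linspace(0.0, 1.0, len(levels))
    p1, p2 = levels[:, 0], levels[:, 1]      # left edge rises, right edge falls
    if x < p1[0] or x > p2[0]:
        return 0.0                           # outside the support (alpha = 0 set)
    if p1[-1] <= x <= p2[-1]:
        return 1.0                           # inside the core (alpha = 1 set)
    if x < p1[-1]:                           # rising edge: interpolate along p1
        return float(np.interp(x, p1, alphas))
    # Falling edge: p2 decreases with alpha, so reverse it for np.interp.
    return float(np.interp(x, p2[::-1], alphas[::-1]))

A = np.array([[1.0, 3.0], [1.5, 2.5], [2.0, 2.0]])   # triangular, h = 3
print(fuzzify(A, 1.25))   # 0.25
```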

Defuzzification. In many practical applications, such as control and classification, numerical (crisp) data are required; that is, it is essential to transform a fuzzy number into a numerical value. The process of mapping a fuzzy number into a nonfuzzy number is called "defuzzification". Various defuzzification strategies have been suggested in [12,13]. Here we describe two methods, the mean of maximum (MOM) and the center of area (COA), which transform a fuzzy number in the form of α-level sets into a crisp value.

• Mean of maximum method (MOM)

The mean of maximum method generates a crisp value by averaging the support values whose membership values reach the maximum. For a discrete universe of discourse, this is calculated from the membership function by

$$z_0 = \sum_{j=1}^{l} \frac{z_j}{l}, \qquad (17)$$

where $l$ is the number of quantized $z$ values that reach the maximum membership value.


For a fuzzy number $Z$ in the form of α-level sets, this defuzzification method can be expressed, according to Eq. (17), as

$$\mathit{defuzzifier}(Z) = z_0 = \frac{z_1^{(1)} + z_2^{(1)}}{2}, \qquad (18)$$

where $\mathit{defuzzifier}$ represents a defuzzification operation.

• Center of area method (COA)

Assuming that a fuzzy number with a pointwise membership function $Z(x)$ has been produced, the center of area method calculates the center of gravity of the distribution as the nonfuzzy value. Assuming a discrete universe of discourse, we have

$$z_0 = \frac{\sum_{j=1}^{n} x_j\, Z(x_j)}{\sum_{j=1}^{n} Z(x_j)}. \qquad (19)$$

For a fuzzy number $Z$ with the representation form of α-level sets, it can be expressed, according to Eq. (19), as

$$\mathit{defuzzifier}(Z) = z_0 = \frac{1}{h} \sum_\alpha \frac{z_1^{(\alpha)} + z_2^{(\alpha)}}{2}, \qquad (20)$$

where the sum runs over the $h$ quantized α-levels.
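Both defuzzifiers are one-liners on the α-level representation. A small sketch (ours; Eq. (20) is read here as the mean of the level-set midpoints, matching the reconstruction above):

```python
import numpy as np

def mom_defuzzify(levels):
    """Mean of maximum, Eq. (18): midpoint of the alpha = 1 level set."""
    z1, z2 = levels[-1]                 # last row is the core interval
    return 0.5 * (z1 + z2)

def coa_defuzzify(levels):
    """Center of area over alpha-level sets, Eq. (20): mean of interval midpoints."""
    levels = np.asarray(levels, dtype=float)
    return float(np.mean(0.5 * (levels[:, 0] + levels[:, 1])))

Z = np.array([[0.0, 4.0], [1.0, 3.0], [2.0, 2.0]])   # triangular fuzzy number
print(mom_defuzzify(Z), coa_defuzzify(Z))            # 2.0 2.0
```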

3. Basic structure of the neural fuzzy system

In this section, we construct the architecture of a neural fuzzy system that can process fuzzy and crisp information. Fig. 3 shows the proposed network structure, which has a total of five layers. This five-layered connectionist structure performs fuzzy inference effectively. We shall describe the signal propagation in the proposed network layer by layer, following the arrow directions shown in Fig. 3; this is done by defining the transfer function of a node in each layer. Signals may flow in the reverse direction during the learning process, as we shall discuss in the following sections. In the following description, we consider the case of a single output node for clarity; it can easily be extended to the case of multiple output nodes. A typical neural network consists of nodes, each of which has some finite fan-in of connections, represented by weight values, from other nodes and a fan-out of connections to other nodes (see Fig. 4). The notations u and U represent the crisp and fuzzy inputs of a node, respectively, and the notations o and O represent the crisp and fuzzy outputs, respectively. The superscript in the following formulas indicates the layer number.

Fig. 3. The five-layered architecture of the proposed neural fuzzy system.

Fig. 4. Basic structure of a node in the proposed neural fuzzy system.

Layer 1 (Input): If the input is a fuzzy number, each node in this layer simply transmits the input fuzzy number $X_i$ to the next layer directly; no computation is done in this layer. That is,

$$O_i^1 = \bigcup_\alpha [o_{i1}^{1(\alpha)}, o_{i2}^{1(\alpha)}] = X_i = \bigcup_\alpha [x_{i1}^{(\alpha)}, x_{i2}^{(\alpha)}]. \qquad (21)$$

If the input is a crisp number $x_i$, it can be viewed as a fuzzy singleton, i.e.,

$$O_i^1 = \bigcup_\alpha [o_{i1}^{1(\alpha)}, o_{i2}^{1(\alpha)}] = \bigcup_\alpha [x_i, x_i]. \qquad (22)$$

Note that there is no weight to be adjusted in this layer.

Layer 2 (Matching/Fuzzification): Each node in this layer has exactly one input from some input linguistic node and feeds its output to rule node(s). For each layer-2 node, the input is a fuzzy number and the output is a numerical value. The weight in this layer is a fuzzy number $WX_{ij}$; the index $ij$ denotes the $j$th term of the $i$th input linguistic variable $x_i$. The transfer function of each layer-2 node is

$$f_{ij}^2 = \mathit{diff}(WX_{ij}, U_i) = \tfrac{1}{2} \sum_\alpha \bigl[ (wx_{ij1}^{(\alpha)} - u_{i1}^{2(\alpha)})^2 + (wx_{ij2}^{(\alpha)} - u_{i2}^{2(\alpha)})^2 \bigr], \qquad (23)$$

$$o_{ij}^2 = a(f_{ij}^2) = e^{-(f_{ij}^2)^2 / 2\sigma^2}, \qquad (24)$$

where $\sigma$ is the width (variance parameter) of the activation function $a(\cdot)$; it is a constant given in advance. The activation function $a(\cdot)$ is a nonnegative, monotonically decreasing function of $f_{ij}^2 \in [0, \infty)$ with $a(0) = 1$. For example, $a(\cdot)$ can alternatively be given as

$$o_{ij}^2 = a(f_{ij}^2) = r^{f_{ij}^2}, \qquad (25)$$

where $0 < r < 1$, or

$$o_{ij}^2 = a(f_{ij}^2) = \frac{2}{1 + e^{\beta f_{ij}^2}}, \qquad (26)$$

where $\beta$ is a nonnegative constant.

Layer 3 (MIN): The input and output of each node in this layer are both numerical. The links in this layer perform precondition matching of the fuzzy logic rules; hence, the rule nodes perform the fuzzy AND operation. The most commonly used fuzzy AND operations are intersection and algebraic product [13]. If intersection is used, we have

$$o_i^3 = \min(u_1^3, u_2^3, \ldots, u_k^3). \qquad (27)$$

On the other hand, if the algebraic product is used, we have

$$o_i^3 = u_1^3 u_2^3 \cdots u_k^3. \qquad (28)$$

Similar to layer one, there is no weight to be adjusted in this layer.

Layer 4 (MAX): The nodes in this layer perform the fuzzy OR operation to integrate the fired rules that have the same consequent. The most commonly used fuzzy OR operations are union and bounded sum [13]. If the union operation is used in this model, we have

$$o_i^4 = \max(u_1^4, u_2^4, \ldots, u_k^4). \qquad (29)$$

If the bounded sum is used, we have

$$o_i^4 = \min(1,\; u_1^4 + u_2^4 + \cdots + u_k^4). \qquad (30)$$

The input and output of each layer-4 node are both numerical values.

Layer 5 (Merging/Defuzzification): In this layer, each node has a fuzzy weight $WY_i$. There are two kinds of operations in this layer. When we need a fuzzy output $Y$, the following formula is executed to perform a "merging" action:

$$O^5 = \bigcup_\alpha [o_1^{5(\alpha)}, o_2^{5(\alpha)}] = Y = \frac{\sum_i u_i^5\, WY_i}{\sum_i u_i^5}. \qquad (31)$$

Namely,

$$Y = \bigcup_\alpha [y_1^{(\alpha)}, y_2^{(\alpha)}], \qquad WY_i = \bigcup_\alpha [wy_{i1}^{(\alpha)}, wy_{i2}^{(\alpha)}], \qquad (32)$$

where

$$y_1^{(\alpha)} = \frac{\sum_i u_i^5\, wy_{i1}^{(\alpha)}}{\sum_i u_i^5}, \qquad (33)$$

$$y_2^{(\alpha)} = \frac{\sum_i u_i^5\, wy_{i2}^{(\alpha)}}{\sum_i u_i^5}. \qquad (34)$$

From the above description, we observe that only the layer-1 inputs and layer-5 outputs of the proposed network are fuzzy numbers (in the form of α-level sets); real numbers are propagated internally from layer two to layer four. This makes the operations in our proposed network less time-consuming than those of neural networks that also process fuzzy input/output data but require fuzzy signals to flow through the whole network.
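To make the layer-by-layer description concrete, the following Python sketch wires Eqs. (21)–(34) together for a single input linguistic variable (our own illustration; the array shapes, rule encoding, and the Gaussian activation choice are assumptions). With one input variable, each rule has a single precondition, so the layer-3 MIN reduces to copying the matched term's output:

```python
import numpy as np

def forward(x_levels, wx_levels, rules, wy_levels, sigma=1.0):
    """One forward pass through the five-layer network (Eqs. (21)-(34)).

    x_levels : (h, 2)     alpha-level sets of the fuzzy input X (layer 1)
    wx_levels: (T, h, 2)  layer-2 fuzzy weights WX_j, one per input term
    rules    : list of (antecedent_term, consequent_term) index pairs
    wy_levels: (C, h, 2)  layer-5 fuzzy weights WY_i, one per output term
    Assumes at least one rule fires (sum of firing strengths > 0).
    """
    # Layer 2 (matching): crisp similarity between the input and each weight.
    f2 = 0.5 * np.sum((wx_levels - x_levels) ** 2, axis=(1, 2))   # Eq. (23)
    o2 = np.exp(-f2 ** 2 / (2.0 * sigma ** 2))                    # Eq. (24)

    # Layer 3 (MIN): one precondition per rule, so just copy the term output.
    o3 = np.array([o2[a] for a, _ in rules])                      # Eq. (27)

    # Layer 4 (MAX): merge firing strengths that share a consequent.
    o4 = np.zeros(len(wy_levels))
    for (_, c), s in zip(rules, o3):
        o4[c] = max(o4[c], s)                                     # Eq. (29)

    # Layer 5 (merging): weighted combination of the fuzzy weights.
    y_levels = np.tensordot(o4, wy_levels, axes=1) / np.sum(o4)   # Eqs. (31)-(34)
    return y_levels                                               # shape (h, 2)
```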


4. Supervised learning of the basic neural fuzzy system

In this section, we derive a supervised learning algorithm for the structure of the proposed neural fuzzy system. This algorithm is applicable to situations in which pairs of input–output training data are available.

In general, the fuzzy rules for training are

$R_p$: IF $x_1$ is $X_{p1}$ and $\ldots$ and $x_n$ is $X_{pn}$, THEN $y$ is $Y_p$,

where $p = 1, 2, \ldots, m$, and $m$ is the total number of training rules. These fuzzy if-then rules can be viewed as the fuzzy input–output pairs $(X_{p1}, X_{p2}, \ldots, X_{pn}; Y_p)$, $p = 1, 2, \ldots, m$. If the inputs or outputs are crisp data, the corresponding fuzzy elements in the training pairs become numerical elements.

Before the learning of the neural fuzzy system, an initial network structure is first constructed. During the learning process, some nodes and links in the initial network are then deleted or combined to form the final structure of the network. First, the number of input (output) nodes is set equal to the number of input (output) linguistic variables. The number of nodes in the second layer is decided by the number of fuzzy partitions of each input linguistic variable $x_i$, denoted $|T(x_i)|$, which must be assigned by the user. The fuzzy weights $WX_{ij}$ in layer two are initialized randomly as fuzzy numbers; a better way is to distribute the initial fuzzy weights evenly over the domain of interest of the corresponding input linguistic variable. As for layer three of the initial network, there are $\prod_i |T(x_i)|$ rule nodes, with the inputs of each rule node coming from one possible combination of the terms of the input linguistic variables, under the constraint that only one term in a term set can be a rule node's input. This gives the preconditions of the initial fuzzy rules.

Finally, let us consider the structure of layer four in the initial network. This is equivalent to determining the consequents of the initial fuzzy rules. To do the initialization, the number of fuzzy partitions of the output linguistic variable $y$, $|T(y)|$, must be given in advance. The initial fuzzy weights $WY_i$ in layer four are distributed evenly over the output space. We let the layer-5 nodes perform up-down transmission: the desired outputs are pumped into the network from its output side, and the operations in layer five are then the same as those in layer two. Signals from both external sides of the network can thus reach the output points of the term nodes at layer two and layer four. Since the outputs of the term nodes at layer two can be transmitted to the rule nodes through the initial architecture of the layer-3 links, we can obtain the output (firing strength) of each rule node. Based on the firing strengths of the rule nodes ($o_i^3$) and the outputs of the term nodes at layer four ($o_j^4$), we can decide the correct consequent link of each rule node by an unsupervised (self-organized) learning method [21]. The links at layer four are fully connected initially. We denote the weight on the link from the $i$th rule node to the $j$th output term node as $w_{ij}$. The Hebbian learning law is used to update these weights for each training data set; the learning rule is

$$\Delta w_{ij} = c\, o_j^4\, o_i^3, \qquad (35)$$

where $c$ is a positive constant. After this learning, only the link with the largest weight in the fan-out of each layer-3 (rule) node is retained as its consequent.
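A sketch of this consequent-selection step (ours; the array shapes and the constant c are assumptions) accumulates the Hebbian products of Eq. (35) over the training set and keeps, for each rule, the output term with the largest accumulated weight:

```python
import numpy as np

def choose_consequents(o3_batch, o4_batch, c=0.1):
    """Hebbian selection of rule consequents (Eq. (35)).

    o3_batch: (P, R) firing strengths of the R rule nodes over P training pairs
    o4_batch: (P, C) layer-4 term-node outputs under up-down transmission
    Returns, for each rule, the index of its winning output term node.
    """
    w = np.zeros((o3_batch.shape[1], o4_batch.shape[1]))
    for o3, o4 in zip(o3_batch, o4_batch):
        w += c * np.outer(o3, o4)        # delta_w_ij = c * o4_j * o3_i
    return np.argmax(w, axis=1)          # keep only the strongest link per rule
```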

With the above initialization process, the network is ready for learning. We next propose a two-phase supervised learning algorithm for our five-layered neural fuzzy system.

4.1. Parameter learning phase

A gradient-descent-based backpropagation algorithm [21,22] is employed to adjust the fuzzy weights in layer two and layer five of the proposed network. If the FCLO is used, the error function to be minimized is

$$e = \mathit{diff}(Y, D) = \tfrac{1}{2} \sum_\alpha \bigl[ (y_1^{(\alpha)} - d_1^{(\alpha)})^2 + (y_2^{(\alpha)} - d_2^{(\alpha)})^2 \bigr], \qquad (36)$$

where $Y = \bigcup_\alpha [y_1^{(\alpha)}, y_2^{(\alpha)}]$ is the current fuzzy output and $D = \bigcup_\alpha [d_1^{(\alpha)}, d_2^{(\alpha)}]$ is the desired fuzzy output. If the FCNO is used, the error function to be minimized is

$$e = \tfrac{1}{2}(d - y)^2, \qquad (37)$$

where $y$ is the current output and $d$ is the desired output. We assume that $W_\alpha = [w_1^{(\alpha)}, w_2^{(\alpha)}]$ is an adjustable fuzzy parameter in layer two or layer five. To update a fuzzy weight then means to update the parameters $w_1^{(\alpha)}$ and $w_2^{(\alpha)}$. We next derive the update rules for these parameters layer by layer, based on the general learning rule

$$w(t+1) = w(t) + \eta \left( -\frac{\partial e}{\partial w} \right), \qquad (38)$$

where $w$ represents $w_1^{(\alpha)}$ or $w_2^{(\alpha)}$, and $\eta$ is the learning rate.

Layer 5. The update rules for $wy_{i1}^{(\alpha)}$ and $wy_{i2}^{(\alpha)}$ are derived from Eqs. (33) and (34) as follows:

$$\frac{\partial e}{\partial wy_{i1}^{(\alpha)}} = \frac{\partial e}{\partial y_1^{(\alpha)}}\, \frac{\partial y_1^{(\alpha)}}{\partial wy_{i1}^{(\alpha)}} = (y_1^{(\alpha)} - d_1^{(\alpha)})\, \frac{u_i^5}{\sum_i u_i^5}, \qquad (39)$$

$$\frac{\partial e}{\partial wy_{i2}^{(\alpha)}} = \frac{\partial e}{\partial y_2^{(\alpha)}}\, \frac{\partial y_2^{(\alpha)}}{\partial wy_{i2}^{(\alpha)}} = (y_2^{(\alpha)} - d_2^{(\alpha)})\, \frac{u_i^5}{\sum_i u_i^5}. \qquad (40)$$

The error signals to be propagated to the preceding layer are

$$\delta_1^{5(\alpha)} = \frac{\partial e}{\partial o_1^{5(\alpha)}} = \frac{\partial e_1^{(\alpha)}}{\partial y_1^{(\alpha)}} = y_1^{(\alpha)} - d_1^{(\alpha)}, \qquad (41)$$

$$\delta_2^{5(\alpha)} = \frac{\partial e}{\partial o_2^{5(\alpha)}} = \frac{\partial e_2^{(\alpha)}}{\partial y_2^{(\alpha)}} = y_2^{(\alpha)} - d_2^{(\alpha)}, \qquad (42)$$

where

$$e_1^{(\alpha)} = \tfrac{1}{2}\bigl(y_1^{(\alpha)} - d_1^{(\alpha)}\bigr)^2, \qquad (43)$$

$$e_2^{(\alpha)} = \tfrac{1}{2}\bigl(y_2^{(\alpha)} - d_2^{(\alpha)}\bigr)^2. \qquad (44)$$

Layer 4. In this layer there are no weights to be adjusted; only the error signals need to be computed and propagated. The error signal $\delta_i^4$ is derived as follows:

$$\delta_i^4 = \frac{\partial e}{\partial o_i^4} = \sum_\alpha \frac{\partial (e_1^{(\alpha)} + e_2^{(\alpha)})}{\partial o_i^4} = \sum_\alpha \bigl( \delta_{i1}^{4(\alpha)} + \delta_{i2}^{4(\alpha)} \bigr), \qquad (45)$$

where

$$\delta_{i1}^{4(\alpha)} = \frac{\partial e_1^{(\alpha)}}{\partial o_i^4} = \frac{\partial e_1^{(\alpha)}}{\partial o_1^{5(\alpha)}}\, \frac{\partial o_1^{5(\alpha)}}{\partial o_i^4} = \delta_1^{5(\alpha)}\, \frac{\partial y_1^{(\alpha)}}{\partial o_i^4} = \delta_1^{5(\alpha)}\, \frac{wy_{i1}^{(\alpha)} \sum_j u_j^5 - \sum_j u_j^5\, wy_{j1}^{(\alpha)}}{\bigl(\sum_j u_j^5\bigr)^2}, \qquad (46)$$

$$\delta_{i2}^{4(\alpha)} = \frac{\partial e_2^{(\alpha)}}{\partial o_i^4} = \frac{\partial e_2^{(\alpha)}}{\partial o_2^{5(\alpha)}}\, \frac{\partial o_2^{5(\alpha)}}{\partial o_i^4} = \delta_2^{5(\alpha)}\, \frac{\partial y_2^{(\alpha)}}{\partial o_i^4} = \delta_2^{5(\alpha)}\, \frac{wy_{i2}^{(\alpha)} \sum_j u_j^5 - \sum_j u_j^5\, wy_{j2}^{(\alpha)}}{\bigl(\sum_j u_j^5\bigr)^2}. \qquad (47)$$

Layer 3. As in layer four, only the error signals need to be computed. According to Eq. (29), this error signal $\delta_i^3$ can be derived as

$$\delta_i^3 = \frac{\partial e}{\partial o_i^3} = \frac{\partial e}{\partial o_i^4}\, \frac{\partial o_i^4}{\partial o_i^3} = \begin{cases} \delta_i^4 & \text{if } o_i^3 = \max(u_1^4, \ldots, u_k^4), \\ 0 & \text{otherwise}. \end{cases} \qquad (48)$$

Layer 2. In this layer, the fuzzy weights $WX_{ij}$ are to be adjusted. The update rules can be derived from Eqs. (23) and (24) as follows:

$$\frac{\partial e}{\partial wx_{ij1}^{(\alpha)}} = \frac{\partial e}{\partial o_i^3}\, \frac{\partial o_i^3}{\partial o_{ij}^2}\, \frac{\partial o_{ij}^2}{\partial wx_{ij1}^{(\alpha)}} = \delta_i^3\, \frac{\partial o_i^3}{\partial o_{ij}^2}\, \frac{\partial o_{ij}^2}{\partial wx_{ij1}^{(\alpha)}}, \qquad (49)$$

$$\frac{\partial e}{\partial wx_{ij2}^{(\alpha)}} = \frac{\partial e}{\partial o_i^3}\, \frac{\partial o_i^3}{\partial o_{ij}^2}\, \frac{\partial o_{ij}^2}{\partial wx_{ij2}^{(\alpha)}} = \delta_i^3\, \frac{\partial o_i^3}{\partial o_{ij}^2}\, \frac{\partial o_{ij}^2}{\partial wx_{ij2}^{(\alpha)}}, \qquad (50)$$

where

$$\frac{\partial o_i^3}{\partial o_{ij}^2} = \begin{cases} 1 & \text{if } o_{ij}^2 = \min(u_1^3, \ldots, u_k^3), \\ 0 & \text{otherwise}, \end{cases} \qquad (51)$$

and

$$\frac{\partial o_{ij}^2}{\partial wx_{ij1}^{(\alpha)}} = -\frac{o_{ij}^2\, f_{ij}^2}{\sigma^2} \bigl( wx_{ij1}^{(\alpha)} - u_{i1}^{2(\alpha)} \bigr), \qquad (52)$$

$$\frac{\partial o_{ij}^2}{\partial wx_{ij2}^{(\alpha)}} = -\frac{o_{ij}^2\, f_{ij}^2}{\sigma^2} \bigl( wx_{ij2}^{(\alpha)} - u_{i2}^{2(\alpha)} \bigr). \qquad (53)$$

Fig. 5. Illustration of consequent combination.
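As a concrete instance of Eq. (38), the sketch below (ours; array shapes as in the earlier snippets, learning rate assumed) applies one gradient step of Eqs. (39)–(40) to the layer-5 fuzzy weights:

```python
import numpy as np

def update_wy(wy_levels, u5, y_levels, d_levels, eta=0.05):
    """One gradient-descent step for the layer-5 fuzzy weights WY_i.

    wy_levels: (C, h, 2) alpha-level sets [wy_i1, wy_i2] of each WY_i
    u5       : (C,)      crisp layer-5 inputs (the layer-4 outputs)
    y_levels : (h, 2)    current fuzzy output; d_levels is the desired output
    Implements Eq. (38) with the gradients of Eqs. (39)-(40).
    """
    err = y_levels - d_levels                       # (y - d) per alpha and endpoint
    coeff = u5 / np.sum(u5)                         # u_i^5 / sum_j u_j^5
    grad = coeff[:, None, None] * err[None, :, :]   # Eqs. (39)-(40)
    return wy_levels - eta * grad                   # Eq. (38)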

4.2. Structure learning phase

In this subsection, we propose a structure learning algorithm for the proposed neural fuzzy system to reduce its node and link counts. The structure learning algorithm is divided into two parts: one merges similar fuzzy terms of the input and output linguistic variables (term-node combination); the other performs rule combination to reduce the number of rules. We discuss these two parts separately in the following.

A. Term-node combination scheme

Term-node combination combines similar terms in the term sets of the input and output linguistic variables. We present this technique for the term set of the output linguistic variable. The whole learning procedure is described as follows:

Step 1: Perform parameter learning until the output error is smaller than a given value; i.e., $e \leq \mathit{error\_limit}$, where error_limit is a small positive constant.

Step 2: If $\mathit{diff}(WY_i, WY_j) \leq \mathit{similar\_limit}$, where similar_limit is a given positive constant, remove term node $j$ with fuzzy weight $WY_j$ and its fan-out links, and connect rule node $j$ in layer three to term node $i$ in layer four (see Fig. 5); a sketch of this merge test follows the steps.

Step 3: Perform the parameter learning again to optimally adjust the network weights.
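The Step-2 similarity test can be sketched as follows (our illustration; the greedy pairing, array shapes, and the similar_limit value are assumptions, with the distance computed as in Eq. (16)):

```python
import numpy as np

def merge_similar_terms(wy_levels, similar_limit=0.05):
    """Greedy term-node combination: map each output term to a representative.

    wy_levels: (C, h, 2) fuzzy weights WY_i. Returns an index map where
    merged[j] = i means term node j is merged into term node i (Step 2).
    """
    C = len(wy_levels)
    merged = np.arange(C)
    for j in range(C):
        for i in range(j):
            d = 0.5 * np.sum((wy_levels[i] - wy_levels[j]) ** 2)   # Eq. (16)
            if merged[i] == i and d <= similar_limit:
                merged[j] = i          # remove node j; reconnect its rules to i
                break
    return merged
```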

B. Rule combination scheme

After the fuzzy parameters and the consequents of the rule nodes are determined, the rule combination scheme is performed to reduce the number of rules. The conditions for applying rule combination have been explored in [17] and are given as follows:

(1) The rule nodes have exactly the same consequent.

(2) Some preconditions are common to all the rule nodes; that is, the rule nodes are associated with the same term nodes.

(3) The union of the other preconditions of these rule nodes composes the whole term set of some input linguistic variable.

If some rule nodes satisfy these three conditions, they can be combined into a single rule. An illustration is shown in Fig. 6.

5. Illustrative examples

In this section, we shall use two examples to illustrate the performance of the proposed neural fuzzy system.


Fig. 6. Illustration of rule combination.

Fig. 7. The membership functions of the input linguistic values "very small" (X1), "small" (X2), and "large" (X3) in Example 1.

Example 1. Fuzzy input and fuzzy output.

Consider the following three fuzzy if-then rules for training:

R1: IF x is very small (X1), THEN y is very large (D1),
R2: IF x is small (X2), THEN y is large (D2),
R3: IF x is large (X3), THEN y is small (D3),

where the fuzzy numbers "very small", "small", and "large" are given in Fig. 7. Fig. 8 shows the learning curve; the error tolerance is 0.0001 and the number of α-cuts is 6. After supervised learning, the fuzzy outputs of the learned network and the corresponding desired outputs are shown in Fig. 9, which shows that they match closely. The two learned (representative) fuzzy rules after learning (condensing) are:

IF x is WX1, THEN y is WY1, and
IF x is WX2, THEN y is WY2,

where the fuzzy weights after learning are shown in Fig. 10. For illustration, Figs. 11 and 12 show the change of the fuzzy weights during the learning process. Hence the original three fuzzy rules have been condensed into two rules, and these two rule sets represent equivalent knowledge.

Fig. 8. The learning curve in Example 1.

Fig. 9. The actual fuzzy outputs Y1, Y2, Y3 of the learned neural fuzzy system and the corresponding desired fuzzy outputs D1, D2, D3 in Example 1.


Fig. 10. The learned fuzzy weights of the network in Example 1.

Fig. 11. Time evolution of the fuzzy weights WX1, WX2 during the learning process in Example 1.

Example 2. There are five training data in Fig. 13, and Fig. 15 shows the fuzzy weights after training. To examine the generalization ability of the trained network, we presented three fuzzy inputs for testing, shown in Fig. 14. In these figures, five level sets corresponding to α = 0, 0.25, 0.5, 0.75, 1 are depicted.

Fig. 12. Time evolution of the fuzzy weights WY1, WY2 during the learning process in Example 1.

Fig. 13. The training data in Example 2.

Fig. 14. The actual outputs and testing results in Example 2.

Fig. 15. The fuzzy weights in Example 2.

6. Conclusions

In this study, we proposed learning techniques for neural fuzzy systems that process both numerical and word information. The developed systems have the following characteristics and advantages: (1) The inputs and outputs can be fuzzy numbers or numerical values. (2) The network weights are fuzzy weights. (3) Owing to the representation form of the α-level sets, the fuzzy weights, fuzzy inputs, and fuzzy outputs can be fuzzy numbers of any shape. (4) Except for the input and output layers, numerical values are propagated through the whole network; thus the operations in the proposed neural fuzzy systems are not time-consuming and the required memory capacity is small. The developed systems have fuzzy supervised learning capability. With fuzzy supervised learning, these systems can be used for fuzzy expert systems, fuzzy system modeling, and rule-base condensation. When learning with numerical values (real vectors), the proposed systems can be used for adaptive fuzzy control. Computer simulations and experimental studies satisfactorily verified the performance of the proposed neural fuzzy learning schemes.

References

[1] S.K. Pal, S. Mitra, Multilayer perceptron, fuzzy sets and classification, IEEE Trans. Neural Networks 3 (5) (1992) 683–696.

[2] J.M. Keller, H. Tahani, Backpropagation neural networks for fuzzy logic, Inform. Sci. 62 (1992) 205–221.

[3] S. Horikawa, T. Furuhashi, Y. Uchikawa, On fuzzy modeling using fuzzy neural networks with the backpropagation algorithm, IEEE Trans. Neural Networks 3 (5) (1992) 801–806.

[4] H. Ishibuchi, R. Fujioka, H. Tanaka, Neural networks that learn from fuzzy if-then rules, IEEE Trans. Fuzzy Systems 1 (2) (1993) 85–97.

[5] H. Ishibuchi, H. Tanaka, Fuzzy regression analysis using neural networks, Fuzzy Sets and Systems 50 (1992) 257–265.

[6] H. Ishibuchi, H. Tanaka, H. Okada, Fuzzy neural networks with fuzzy weights and fuzzy biases, Proc. Int'l Joint Conf. on Neural Networks, San Francisco, 1993, pp. 1650–1655.

[7] Y. Hayashi, J.J. Buckley, E. Czogala, Fuzzy neural network, Internat. J. Intelligent Syst. 8 (1993) 527–537.

[8] Y. Hayashi, J.J. Buckley, E. Czogala, Systems engineering application of fuzzy neural networks, Proc. Int'l Joint Conf. on Neural Networks, Baltimore, 1992, pp. 413–418.

[9] A. Kaufmann, M.M. Gupta, Introduction to Fuzzy Arithmetic, Van Nostrand Reinhold, New York, 1985.

[10] K. Uehara, M. Fujise, Fuzzy inference based on families of α-level sets, IEEE Trans. Fuzzy Systems 1 (2) (1993) 111–124.

[11] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning—I, II, Inform. Sci. 8 (1975) 199–249, 301–357.

[12] M. Braae, D.A. Rutherford, Fuzzy relations in a control setting, Kybernetes 7 (3) (1978) 185–188.

[13] C.C. Lee, Fuzzy logic in control systems: Fuzzy logic controller—Part I & II, IEEE Trans. Syst. Man Cybern. SMC-20 (2) (1990) 404–435.

[14] T. Yamakawa, The current mode fuzzy logic integrated circuits fabricated by the standard CMOS process, IEEE Trans. Comput. 35 (2) (1992) 122–130.

[15] K. Uehara, Computational efficiency of fuzzy inference based on level sets, Proc. Spring Nat. Conv. Rec., IEICE, Japan, 1989, pp. D-400.

[16] K. Uehara, Fast operation of fuzzy inference based on level sets, Proc. 38th Ann. Conv. Rec., IPS Japan, 1989, pp. 3G-3.

[17] C.T. Lin, C.S.G. Lee, Neural-network-based fuzzy logic control and decision system, IEEE Trans. Comput. 40 (12) (1991) 1320–1336.

[18] C.T. Lin, C.S.G. Lee, Reinforcement structure/parameter learning for neural-network-based fuzzy logic control systems, IEEE Trans. Fuzzy Systems 2 (1) (1995) 46–63.

[19] J.S. Jang, Self-learning fuzzy controllers based on temporal backpropagation, IEEE Trans. Neural Networks 3 (5) (1992) 714–723.

[20] T. Tsukamoto, An approach to fuzzy reasoning method, in: M.M. Gupta, R.K. Ragade, R.R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland, Amsterdam, 1979.

[21] J.M. Zurada, Introduction to Artificial Neural Systems, West Publishing, New York, 1992.

[22] G.E. Hinton, Connectionist learning procedures, Artificial Intelligence 40 (1) (1989) 143–150.

[23] R. Keller, Expert System Technology—Development and Application, Prentice-Hall, NJ, 1987.
