Fuzzy BP: A Neural Network Model with Fuzzy Inference


Hahn-Ming Lee and Bing-Hui Lu

Department of Electronic Engineering, National Taiwan Institute of Technology, Taipei, Taiwan

E-mail: hmlee@et.ntit.edu.tw

Abstract

In this paper, a neural network model with fuzzy inference, named Fuzzy BP, is proposed. It performs a nonlinear mapping between fuzzy input vectors and crisp outputs, and can therefore process fuzzy numbers directly. The fuzzy numbers are represented in LR-type form to reduce network complexity. The connection weights and biases are also represented as fuzzy numbers to increase the fuzzy inference ability. In addition, a fuzzy neuron that performs fuzzy weighted summation, defuzzification, and nonlinear mapping is proposed, together with a simple defuzzification formula. One sample problem, the Knowledge-Based Evaluator, is used to illustrate the working of the proposed model; in the experiments, every testing instance was classified correctly.

1. Introduction

In expert systems research, knowledge acquisition is an important problem [1]. Extracting knowledge from experts and building a knowledge base is difficult. Much research on artificial neural networks shows that neural networks can ease the knowledge acquisition bottleneck [2]. Many conventional neural network models have been applied successfully in many fields [2]. They can realize highly nonlinear mappings between numerical data. In practical applications, however, linguistic and ambiguous information must be converted into numerical form before it is applied to a neural network. This conversion distorts the fuzziness of the information, which leads to inexact outputs. Unfortunately, in most applications, such as control systems, pattern recognition, and decision making, the inputs are fuzzy terms and the outputs are crisp values. Thus, we intend to build a neural network model that maps fuzzy input vectors to crisp outputs.

In this paper, a fuzzy version of the neural network model, named Fuzzy BP, is proposed. It carries out the mapping between fuzzy input vectors and crisp outputs. Its basic element is a fuzzy neuron that can process fuzzy numbers. A backpropagation-like learning algorithm [5] is derived for training Fuzzy BP. The triangular LR-type fuzzy number [3] is used to simplify the architecture and reduce the computational load.

2. LR-type fuzzy number and its operations

The LR-type fuzzy number representation was proposed by Dubois and Prade [3]. Its definition can be described as follows:

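A fuzzy number M is said to be an LR-type fuzzy number iff

$$\mu_M(x) = \begin{cases} L\left(\dfrac{m-x}{\alpha}\right), & x \le m,\ \alpha > 0, \\[2ex] R\left(\dfrac{x-m}{\beta}\right), & x \ge m,\ \beta > 0, \end{cases}$$

where $\mu_M$ is the membership function of fuzzy number M, L and R are the reference functions for the left and right sides, respectively, m denotes the mean value of M, and $\alpha$ and $\beta$ are called the left and right spreads, respectively.

Therefore, an LR-type fuzzy number M can be expressed as $(m, \alpha, \beta)_{LR}$. If $\alpha$ and $\beta$ are both zero, the LR-type fuzzy number indicates a crisp value.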

The basic operations of LR-type fuzzy number are shown below:
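As a minimal sketch, assuming the standard Dubois and Prade [3] formulas, the operations a fuzzy weighted summation needs can be written as follows. The class name `LRNumber` and its method names are hypothetical, and the product rule shown is the usual approximation for two positive LR numbers only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LRNumber:
    """Hypothetical container for an LR-type fuzzy number (m, alpha, beta)."""
    m: float      # mean value
    alpha: float  # left spread
    beta: float   # right spread

    def __add__(self, other: "LRNumber") -> "LRNumber":
        # Addition: means and spreads add componentwise.
        return LRNumber(self.m + other.m,
                        self.alpha + other.alpha,
                        self.beta + other.beta)

    def scale(self, c: float) -> "LRNumber":
        # Multiplication by a crisp scalar; a negative scalar swaps the spreads.
        if c >= 0:
            return LRNumber(c * self.m, c * self.alpha, c * self.beta)
        return LRNumber(c * self.m, -c * self.beta, -c * self.alpha)

    def times_positive(self, other: "LRNumber") -> "LRNumber":
        # ASSUMPTION: approximate product for two positive LR numbers whose
        # spreads are small relative to their means.
        return LRNumber(self.m * other.m,
                        self.m * other.alpha + other.m * self.alpha,
                        self.m * other.beta + other.m * self.beta)


a = LRNumber(2.0, 1.0, 0.5)
b = LRNumber(3.0, 0.5, 0.5)
total = a + b                  # (5.0, 1.5, 1.0)
product = a.times_positive(b)  # (6.0, 4.0, 2.5)
flipped = a.scale(-2.0)        # (-4.0, 1.0, 2.0)
```

Products involving negative fuzzy numbers follow different sign rules and are left out of this sketch.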


3. The fuzzy neuron

The fuzzy neuron is the basic element of Fuzzy BP. It performs a nonlinear mapping between the weighted summation of fuzzy input vectors and crisp outputs. Fig.1 shows the structure of a fuzzy neuron. The output value O can be expressed as:

$$O = f(NET) = f\big(CE(\widetilde{net})\big) = f\Big(CE\Big(\sum_{i=0}^{n} \tilde{W}_i \tilde{I}_i\Big)\Big)$$

The fuzzy vector $\tilde{I} = (\tilde{I}_0, \tilde{I}_1, \ldots, \tilde{I}_n)$ is the input vector and $\tilde{W} = (\tilde{W}_0, \tilde{W}_1, \ldots, \tilde{W}_n)$ is the fuzzy weight vector, where $\tilde{W}_0$ is the bias. The fuzzy weighted summation $\widetilde{net} = \sum_{i=0}^{n} \tilde{W}_i \tilde{I}_i$ and the inference result $NET = CE(\widetilde{net})$ are also shown in Fig.1. The function CE is a centroid [4] operation on a triangular fuzzy number. It can be viewed as a defuzzification operation which maps the fuzzy weighted summation to a crisp value. The centroid function CE is derived as follows:

First, we describe the definition of the centroid. A median of a triangle is a line segment whose endpoints are a vertex of the triangle and the midpoint of the side opposite this vertex. The point of intersection of the medians is called the centroid. This is the balance point of the triangle, and it is also the defuzzification result of the triangle. Fig.2 illustrates the centroid of a triangle. The vertices of the triangle are denoted as a, b, and c. The points a', b', and c' are the midpoints of line segments bc, ca, and ab, respectively.

According to the above descriptions, the medians aa', bb', and cc' intersect at point g, whose coordinates are denoted as (x, y). Thus, g is the centroid of this triangle. The coordinates of vertices a, b, and c are $(m-\alpha, 0)$, $(m+\beta, 0)$, and $(m, 1)$, respectively. Also, the coordinates of midpoints a', b', and c' are $(m+\frac{\beta}{2}, \frac{1}{2})$, $(m-\frac{\alpha}{2}, \frac{1}{2})$, and $(m+\frac{\beta-\alpha}{2}, 0)$, respectively. The slopes of lines a'a and ga are equal. That is,

$$\frac{\frac{1}{2} - 0}{\left(m+\frac{\beta}{2}\right) - (m-\alpha)} = \frac{y - 0}{x - (m-\alpha)}.$$

The line equation of aa' is

$$y = \frac{x - m + \alpha}{\beta + 2\alpha}. \qquad (2)$$

Similarly, the slopes of lines cc' and gc' are the same. That is,

$$\frac{1 - 0}{m - \left(m+\frac{\beta-\alpha}{2}\right)} = \frac{y - 0}{x - \left(m+\frac{\beta-\alpha}{2}\right)}.$$

The line equation of cc' is

$$y = \frac{2}{\alpha-\beta}(x - m) + 1. \qquad (3)$$

From (2) and (3), we have

$$x = m + \frac{\beta-\alpha}{3}, \qquad y = \frac{1}{3}.$$

(When $\alpha = \beta$, the median cc' is the vertical line $x = m$, and the same result follows.)

Finally, assume the fuzzy weighted summation is $\widetilde{net} = (net_m, net_\alpha, net_\beta)$. The function CE can be written as

$$CE(\widetilde{net}) = net_m + \frac{net_\beta - net_\alpha}{3} = NET. \qquad (4)$$
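The centroid formula (4) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the cross-check relies on the fact that the centroid of a triangle is the average of its three vertices.

```python
def centroid(m: float, alpha: float, beta: float) -> float:
    """CE of a triangular fuzzy number (m, alpha, beta), following eq. (4)."""
    return m + (beta - alpha) / 3.0


def centroid_from_vertices(m: float, alpha: float, beta: float) -> float:
    # Cross-check: average the x-coordinates of the vertices
    # a = (m - alpha, 0), b = (m + beta, 0), c = (m, 1).
    xs = (m - alpha, m + beta, m)
    return sum(xs) / 3.0


symmetric = centroid(5.0, 2.0, 2.0)  # equal spreads leave the mean unchanged
skewed = centroid(2.0, 1.0, 4.0)     # a wider right spread pulls the result right
```

With equal spreads the defuzzified value is the mean itself; only the difference between the right and left spreads shifts it.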

Here NET is the defuzzification result of the triangular fuzzy number $\widetilde{net}$. Triangular fuzzy numbers are used for both the fuzzy weight vectors and the fuzzy input vectors. Fig.3 shows a triangular fuzzy number. For a triangular shape, the reference functions L and R are both linear:

$$L(x) = R(x) = \max(0,\ 1 - x).$$
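To make the triangular shape concrete, here is a minimal sketch of the membership function, assuming the linear reference functions L(x) = R(x) = max(0, 1 - x) that a triangular fuzzy number implies. The function names are hypothetical.

```python
def reference(x: float) -> float:
    """Linear reference function, assumed here for both L and R."""
    return max(0.0, 1.0 - x)


def membership(x: float, m: float, alpha: float, beta: float) -> float:
    """Membership degree of x in the triangular fuzzy number (m, alpha, beta)."""
    if x == m:
        return 1.0
    if x < m:
        # Left side; a zero spread means no support to the left of m.
        return reference((m - x) / alpha) if alpha > 0 else 0.0
    # Right side; a zero spread means no support to the right of m.
    return reference((x - m) / beta) if beta > 0 else 0.0


peak = membership(2.0, 2.0, 1.0, 4.0)       # at the mean value
left_half = membership(1.5, 2.0, 1.0, 4.0)  # halfway down the left side
right_half = membership(4.0, 2.0, 1.0, 4.0) # halfway down the right side
```

A number with both spreads equal to zero behaves as a crisp value: its membership is 1 at m and 0 elsewhere.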

The function f is a sigmoid function which performs a nonlinear mapping between input and output. It is defined as

$$f(NET) = \frac{1}{1+\exp(-NET)}. \qquad (7)$$
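Putting the three steps of this section together, a single fuzzy neuron can be sketched as below. This is an illustration under assumptions rather than the authors' implementation: the function name is hypothetical, and the product of a fuzzy weight and a fuzzy input uses the Dubois and Prade positive-number approximation, which may differ from the product rule used in the original model.

```python
import math
from typing import Sequence, Tuple

Fuzzy = Tuple[float, float, float]  # (mean, left spread, right spread)


def fuzzy_neuron(weights: Sequence[Fuzzy],
                 inputs: Sequence[Fuzzy]) -> Tuple[float, float]:
    """Return (NET, output) for one fuzzy neuron.

    Index 0 carries the bias: weights[0] is the bias weight and inputs[0]
    is expected to be the crisp value (1, 0, 0).
    """
    net_m = net_a = net_b = 0.0
    for (wm, wa, wb), (xm, xa, xb) in zip(weights, inputs):
        # ASSUMPTION: positive-number product approximation.
        net_m += wm * xm
        net_a += wm * xa + xm * wa
        net_b += wm * xb + xm * wb
    net_crisp = net_m + (net_b - net_a) / 3.0    # centroid defuzzification, eq. (4)
    output = 1.0 / (1.0 + math.exp(-net_crisp))  # sigmoid, eq. (7)
    return net_crisp, output


weights = [(0.5, 0.1, 0.1), (1.0, 0.2, 0.5)]
inputs = [(1.0, 0.0, 0.0), (2.0, 0.3, 0.3)]
net_value, out_value = fuzzy_neuron(weights, inputs)
```

In this example the fuzzy sum is (2.5, 0.8, 1.4), so the centroid moves the crisp value from 2.5 to 2.7 before the sigmoid is applied.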

4. The architecture of Fuzzy BP

The architecture of Fuzzy BP is shown in Fig.4. It is a three-layered feedforward network. When an input vector $\tilde{I}_p = (\tilde{I}_{p1}, \tilde{I}_{p2}, \ldots, \tilde{I}_{pl})$ is presented to the input layer, the processing of each individual layer is expressed as below.

Input units:

$$\tilde{O}_{pi} = \tilde{I}_{pi}, \quad i = 1, 2, \ldots, l; \qquad \tilde{O}_{p0} = (1, 0, 0). \qquad (8)$$

Hidden units:

$$O'_{pj} = f(NET_{pj}), \quad j = 1, 2, \ldots, m; \qquad O'_{p0} = 1, \qquad (9)$$

$$NET_{pj} = CE\Big(\sum_{i=0}^{l} \tilde{W}_{ji} \tilde{O}_{pi}\Big). \qquad (10)$$

Output units:

$$O''_{pk} = f(NET_{pk}), \quad k = 1, 2, \ldots, n, \qquad (11)$$

$$NET_{pk} = CE\Big(\sum_{j=0}^{m} \tilde{W}'_{kj} O'_{pj}\Big). \qquad (12)$$

Here $\tilde{I}_{pi}$ is the ith input element of input pattern p, and $\tilde{O}_{pi}$ is the output value of the ith input node. $O'_{pj}$ and $O''_{pk}$ are the jth and kth crisp defuzzification outputs of the hidden and output nodes, respectively. $\tilde{W}_{ji}$ is the fuzzy connection weight between the ith input node and the jth hidden node. $\tilde{W}'_{kj}$ is the fuzzy connection weight between the jth hidden node and the kth output node. In addition, f and CE are the sigmoid and centroid functions described in Section 3.
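The layer-by-layer processing described above can be sketched as a short forward pass. All names here are hypothetical, the hidden-layer bias unit is assumed to output the crisp value 1, and the fuzzy-times-fuzzy product in the first layer uses the Dubois and Prade positive-number approximation as a stand-in for whatever product rule the model actually uses.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Fuzzy = Tuple[float, float, float]  # (mean, left spread, right spread)


def ce(net: Fuzzy) -> float:
    m, a, b = net
    return m + (b - a) / 3.0  # centroid defuzzification


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fuzzy_sum(terms: Iterable[Fuzzy]) -> Fuzzy:
    terms = list(terms)
    return (sum(t[0] for t in terms),
            sum(t[1] for t in terms),
            sum(t[2] for t in terms))


def fuzzy_times_fuzzy(w: Fuzzy, x: Fuzzy) -> Fuzzy:
    # ASSUMPTION: positive-number product approximation.
    return (w[0] * x[0], w[0] * x[1] + x[0] * w[1], w[0] * x[2] + x[0] * w[2])


def forward(w_hidden: Sequence[Sequence[Fuzzy]],
            w_output: Sequence[Sequence[Fuzzy]],
            pattern: Sequence[Fuzzy]) -> List[float]:
    # Input layer: identity, plus a crisp bias unit (1, 0, 0) at index 0.
    o_in = [(1.0, 0.0, 0.0)] + list(pattern)
    # Hidden layer: fuzzy weighted sum -> centroid -> sigmoid.
    hidden = [1.0]  # assumed crisp bias unit of the hidden layer
    for row in w_hidden:
        net = fuzzy_sum(fuzzy_times_fuzzy(w, o) for w, o in zip(row, o_in))
        hidden.append(sigmoid(ce(net)))
    # Output layer: hidden outputs are crisp and positive, so each fuzzy
    # weight is simply scaled componentwise.
    outputs = []
    for row in w_output:
        net = fuzzy_sum((w[0] * o, w[1] * o, w[2] * o) for w, o in zip(row, hidden))
        outputs.append(sigmoid(ce(net)))
    return outputs


pattern = [(0.7, 0.2, 0.2), (0.3, 0.1, 0.1)]
w_hidden = [[(0.0, 0.0, 0.0)] * 3, [(0.0, 0.0, 0.0)] * 3]
w_output = [[(1.0, 0.0, 0.0), (2.0, 0.0, 3.0), (0.0, 0.0, 0.0)]]
result = forward(w_hidden, w_output, pattern)
```

With all-zero hidden weights both hidden units output 0.5, so the output unit sees the fuzzy sum (2.0, 0.0, 1.5) and a defuzzified value of 2.5.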

5. Learning

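In this section, we derive the learning procedure. First, the mean square error function for pattern p is defined as:

$$E_p = \frac{1}{2} \sum_i \left(D_{pi} - O''_{pi}\right)^2 \qquad (13)$$

where $D_{pi}$ is the desired output value of the ith output unit, and $O''_{pi}$ is the actual output value of the ith output unit. The overall error over the training patterns is $E = \sum_p E_p$. In the learning phase, the weights are adjusted to minimize E. At time t, the weight change is defined as:

$$\Delta\tilde{W}(t) = -\eta\, \nabla E_p(t) + \alpha\, \Delta\tilde{W}(t-1)$$

where $\eta$ is the learning rate and $\alpha$ is a constant (the momentum constant, not the left spread). The term $\alpha\, \Delta\tilde{W}(t-1)$ is a momentum term which is added to improve the convergence speed. The first term $\nabla E_p(t)$ can be rewritten as:

$$\nabla E_p(t) = \left( \frac{\partial E_p}{\partial W_m(t)},\ \frac{\partial E_p}{\partial W_\alpha(t)},\ \frac{\partial E_p}{\partial W_\beta(t)} \right)$$

where $\tilde{W}(t) = (W_m(t), W_\alpha(t), W_\beta(t))$. In what follows, we derive $\frac{\partial E_p}{\partial W_m}$, $\frac{\partial E_p}{\partial W_\alpha}$, and $\frac{\partial E_p}{\partial W_\beta}$ in detail. As shown in Fig.4, assume $\tilde{W}_{ji} = (W_{mji}, W_{\alpha ji}, W_{\beta ji})$ is the connection weight between the ith input node and the jth hidden node, and $\tilde{W}'_{kj} = (W'_{mkj}, W'_{\alpha kj}, W'_{\beta kj})$ is the connection weight between the jth hidden node and the kth output node.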
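The weight change rule with momentum applies to each of the three components of a fuzzy weight. A minimal sketch follows; the function names are hypothetical, the default values of 0.9 and 0.1 are the learning rate and momentum constant reported in the experiment section, and the gradient values in the example are made up for illustration.

```python
from typing import Tuple

Triple = Tuple[float, float, float]  # components for (W_m, W_alpha, W_beta)


def momentum_step(grad: Triple, prev_delta: Triple,
                  eta: float = 0.9, momentum: float = 0.1) -> Triple:
    """Weight change: -eta * gradient + momentum * previous weight change."""
    return tuple(-eta * g + momentum * d for g, d in zip(grad, prev_delta))


def apply_delta(weight: Triple, delta: Triple) -> Triple:
    """Add the weight change to the current fuzzy weight, componentwise."""
    return tuple(w + d for w, d in zip(weight, delta))


weight = (1.0, 0.3, 0.3)
grad = (0.5, -0.2, 0.2)        # illustrative gradient, not taken from the paper
prev_delta = (0.1, 0.0, 0.0)   # weight change from the previous step
delta = momentum_step(grad, prev_delta)
new_weight = apply_delta(weight, delta)
```

This sketch does not constrain the spreads to stay non-negative after an update; whether the original model does so is not stated.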

As in the previous derivation, the update formulas for the connection weights between the input and hidden layers can be derived in the same way.

Among the derived results, (19)-(21) and (25)-(27) are applied as the weight change formulas. Because $W_m$ has more influence on the centroid than $W_\alpha$ and $W_\beta$, the adjustment range of $W_m$ is larger than those of $W_\alpha$ and $W_\beta$. Besides, $W_\alpha$ and $W_\beta$ change in opposite directions. According to our derived results, the slight difference between (19)-(21) and (25)-(27) means that the computational load of Fuzzy BP is not heavy.
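The two observations above can be illustrated for a single output unit. The sketch below is an independent chain-rule derivation under the centroid formula (4) and the sigmoid (7), not a reproduction of the paper's numbered equations; it assumes crisp, positive hidden outputs that scale each fuzzy weight componentwise, and all names and numeric values are hypothetical. A finite-difference check is included.

```python
import math
from typing import List, Sequence, Tuple

Fuzzy = Tuple[float, float, float]  # (W_m, W_alpha, W_beta)


def output(weights: Sequence[Fuzzy], hidden: Sequence[float]) -> float:
    # NET = sum_j [W_m + (W_beta - W_alpha) / 3] * O'_j, then the sigmoid.
    net = sum((wm + (wb - wa) / 3.0) * o for (wm, wa, wb), o in zip(weights, hidden))
    return 1.0 / (1.0 + math.exp(-net))


def error(weights: Sequence[Fuzzy], hidden: Sequence[float], desired: float) -> float:
    return 0.5 * (desired - output(weights, hidden)) ** 2


def analytic_gradient(weights, hidden, desired) -> List[Fuzzy]:
    out = output(weights, hidden)
    delta = -(desired - out) * out * (1.0 - out)  # dE/dNET
    # dNET/dW_m = O'_j, dNET/dW_alpha = -O'_j / 3, dNET/dW_beta = +O'_j / 3
    return [(delta * o, -delta * o / 3.0, delta * o / 3.0) for o in hidden]


def numeric_gradient(weights, hidden, desired, j: int, k: int, h: float = 1e-6) -> float:
    # Central difference on component k of weight j.
    plus = [list(w) for w in weights]
    minus = [list(w) for w in weights]
    plus[j][k] += h
    minus[j][k] -= h
    return (error(plus, hidden, desired) - error(minus, hidden, desired)) / (2.0 * h)


hidden = [1.0, 0.3, 0.8]  # index 0 is the bias unit
weights = [(0.2, 0.1, 0.4), (-0.5, 0.3, 0.3), (1.0, 0.2, 0.6)]
grads = analytic_gradient(weights, hidden, desired=1.0)
```

In this setting the gradient with respect to the mean component is three times the gradient with respect to the right spread, and the two spread gradients have equal magnitude and opposite sign.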

6. An experiment

To illustrate the working of Fuzzy BP, we simulate the Knowledge-Based Evaluator (KBE) on a SUN/SPARC II workstation. The KBE is an expert system that evaluates the suitability of applying an expert system to a domain. The KBE consists of 18 instances. Each input attribute is a linguistic term, and the output indicates the suitability of the expert system under evaluation. Fig.5 and Tab.1 show the membership functions used for each attribute and the training instances, respectively.

For the don't care terms of instances 2 and 3, each possible term is generated. Thus, we have a total of 340 instances from the 18 instances in Tab.1. Of these 340 instances, 290 are randomly selected as training instances, and the remaining 50 are used as testing instances. We use a three-layered Fuzzy BP with six input nodes, six hidden nodes, and a single output node. The value of $\eta$ is 0.9, and $\alpha$ is 0.1. The average learning time is 483 iterations over 10 trials.
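The data preparation described above, expanding don't care terms and splitting the instances at random, can be sketched as follows. The function names, the toy attribute terms, and the fixed random seed are assumptions for illustration; the actual KBE instances are those of Tab.1 and are not reproduced here.

```python
import itertools
import random
from typing import List, Sequence, Tuple


def expand_dont_care(instance: Sequence[str],
                     terms: Sequence[Sequence[str]]) -> List[Tuple[str, ...]]:
    """Replace every '*' with each possible term of that attribute."""
    choices = [list(terms[i]) if value == "*" else [value]
               for i, value in enumerate(instance)]
    return [tuple(combo) for combo in itertools.product(*choices)]


def random_split(instances: Sequence, n_train: int, seed: int = 0):
    """Shuffle a copy of the instances and cut it into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Toy attributes (hypothetical): two, three, and two possible terms.
toy_terms = [["Low", "High"], ["Poor", "Adequate", "Complete"], ["No", "Yes"]]
expanded = expand_dont_care(("High", "*", "*"), toy_terms)

# Same proportions as the experiment: 340 instances, 290 for training.
train, test = random_split(list(range(340)), n_train=290)
```

An instance with two don't care attributes that have three and two terms expands into six concrete instances.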

After each learning run, we use the testing instances to verify the generalization ability of Fuzzy BP. We interpret the output value according to the following conditions:

If output ≥ 0.5: suitability is good.

If output < 0.5: suitability is poor.

Under this interpretation, every testing instance is classified correctly.
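The interpretation rule and the resulting accuracy check can be written compactly. This is a small sketch with hypothetical function names; the output values in the example are made up and are not experimental results.

```python
from typing import Sequence


def interpret(output: float) -> str:
    """Map a crisp network output to a class label using the 0.5 threshold."""
    return "good" if output >= 0.5 else "poor"


def accuracy(outputs: Sequence[float], labels: Sequence[str]) -> float:
    """Fraction of outputs whose interpreted label matches the desired label."""
    hits = sum(interpret(o) == label for o, label in zip(outputs, labels))
    return hits / len(labels)


# Illustrative values only.
example_outputs = [0.91, 0.12, 0.50, 0.49]
example_labels = ["good", "poor", "good", "poor"]
example_accuracy = accuracy(example_outputs, example_labels)
```

An output of exactly 0.5 falls on the "good" side of the threshold.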

7. Conclusion

In this paper, a neural network model with fuzzy inference is proposed. It carries out the mapping between fuzzy input vectors and crisp outputs. The fuzzy numbers are represented in LR-type form to reduce the network complexity. Besides, a fuzzy neuron is proposed, which performs fuzzy weighted summation, defuzzification, and nonlinear mapping. Moreover, a simple defuzzification formula is derived. This formula provides an easy way to perform defuzzification and reduces the computational load. In the learning phase, the fuzzy weights and biases are adjusted to minimize the network output error. The learning procedure is also shown in detail.

In the experiments, we use Fuzzy BP to simulate the Knowledge-Based Evaluator (KBE). In the simulation, Fuzzy BP classifies every testing instance correctly.

In future work, we intend to apply Fuzzy BP to more practical problems to show its usefulness. Moreover, the use of interval fuzzy numbers instead of triangular fuzzy numbers to extend the ability of Fuzzy BP is also under investigation.

References:

[1] Robert Keller, Expert System Technology: Development & Application, Yourdon Press, 1987.

[2] Jacek M. Zurada, Introduction to Artificial Neural Systems, Info Access Distribution Pte Ltd., 1992.

[3] Didier Dubois and Henri Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, 1980.

[4] Lathrop and Stevens, Geometry: A Contemporary Approach, Wadsworth Publishing Company, Inc., Belmont, California, pp. 229-230, 1967.

[5] D.E. Rumelhart, J.L. McClelland and the PDP Research Group, Parallel Distributed Processing, vol. 1, Cambridge, MA: MIT Press, 1986.

Fig.2: A centroid of a triangle.

Fig.3: A triangular fuzzy number $(m, \alpha, \beta)$, plotted as membership degree against x.

Fig.1: The architecture of a fuzzy neuron. Σ, CE, and f perform fuzzy weighted summation, defuzzification, and the nonlinear operation, respectively. The fuzzy neuron applies a nonlinear mapping between the fuzzy input vector and the crisp output.

Fig.5: The membership functions used for instances of KBE (panels include Worth Value, Employee Acceptance, Solution Available, and Easier Solution). Each linguistic variable, such as Worth Value, consists of three or four fuzzy sets. These fuzzy sets are represented in LR-type form.

Fig.4: An architecture of the three-layered Fuzzy BP. Each unit in the hidden and output layers is a fuzzy neuron. The input units perform the identity operation.

Tab.1: The instances of KBE. * denotes don't care, and each possible term is generated. Hence, a total of 340 instances are generated from these 18 instances.