Dynamical optimal training for interval type-2 fuzzy neural network (T2FNN)


Chi-Hsu Wang, Senior Member, IEEE, Chun-Sheng Cheng, and Tsu-Tian Lee, Fellow, IEEE

Abstract—Type-2 fuzzy logic system (FLS) cascaded with neural

network, type-2 fuzzy neural network (T2FNN), is presented in this paper to handle uncertainty with dynamical optimal learning. A T2FNN consists of a type-2 fuzzy linguistic process as the antecedent part, and a two-layer interval neural network as the consequent part. A general T2FNN is computationally intensive due to the complexity of type-2 to type-1 reduction. Therefore, the interval T2FNN is adopted in this paper to simplify the computational process. The dynamical optimal training algorithm for the two-layer consequent part of the interval T2FNN is first developed. The stable and optimal left and right learning rates for the interval neural network, in the sense of maximum error reduction, can be derived for each iteration in the training process (back propagation). It can also be shown that the two learning rates cannot both be negative. Further, due to variation of the initial MF parameters, i.e., the spread level of uncertain means or deviations of interval Gaussian MFs, the performance of the back propagation training process may be affected. To achieve better total performance, a genetic algorithm (GA) is designed to search for the optimal spread rate for uncertain means and the optimal learning rate for the antecedent part. Several examples are fully illustrated. Excellent results are obtained for the truck backing-up control and the identification of a nonlinear system, which yield better performance than those using type-1 FNN.

Index Terms—Back propagation, dynamic optimal learning rate,

genetic algorithm, interval type-2 FNN.

I. INTRODUCTION

During the past decade, intelligent methodologies have been found to possess the best potential to solve many engineering problems that could not be solved before. In particular, the fuzzy neural network (FNN) has been explored during the past few years by many researchers to equip the intelligent methodologies with better learning capabilities. For instance, the FNN has been applied successfully to control nonlinear, ill-defined systems [1]. In particular, the back propagation (BP) of FNN has been developed to tune the parameters of fuzzy sets and the weighting factors of the neural network in [1]. The BP algorithm is applied to minimize the difference (error) between the desired and actual outputs through iterations. For each iteration, the parameters and weighting factors are adjusted by the BP algorithm in order to reduce the error along a descent direction. A reasonable learning rate should be assigned during

Manuscript received March 3, 2003; revised August 27, 2003. This work was supported by the Ministry of Education, Taiwan, R.O.C., under the Project “Intelligent Transportation System,” Grant 91X104EX-91-E-FA06-4-4 (April 2003–2004). This paper was recommended by Associate Editor D. S. Yeung.

The authors are with the Department of Electrical and Control Engineering, National Chiao-Tung University, Hsinchu, Taiwan 300, R.O.C. (e-mail: [email protected]).

Digital Object Identifier 10.1109/TSMCB.2004.825927

the BP process. Therefore, the dynamic optimization of the learning rate for type-1 FNN has been proposed to accelerate the convergence of the BP algorithm [2], [3]. Moreover, the analysis of stable and optimal learning rates for type-1 FNN was also discussed rigorously in [3]. However, all of these discussions and analyses focus on type-1 FNN. To date, type-2 fuzzy sets and fuzzy logic controllers have been used in decision making [4], survey processing [5]–[7], series forecasting [8], time-varying channel equalization [9], [10], control of mobile robots [11], and preprocessing of data [12]. Further, genetic algorithms (GAs) were adopted in [3] to fine-tune the Gaussian MFs in the antecedent part of type-1 FNN. The authors in [13] also applied GAs to search for the optimal uncertain means and their extent for interval type-2 Gaussian MFs in chaotic time-series prediction. Although many reasonable results have been obtained by using the BP process or GAs, the discussion of stable and optimal learning rates has not been established for type-2 FNN (T2FNN). Due to the learning capability of type-1 FNN [14], T2FNN can be similarly defined. We propose an interval T2FNN that consists of the interval type-2 fuzzy linguistic process as the antecedent part and the two-layer interval NN as the consequent part. The two-layer interval neural network consists of left and right weighting factors, which require left and right learning rates during the learning process. The T2FNN is computationally intensive due to the complexity of type-2 to type-1 reduction. Therefore, the interval T2FNN is adopted in this paper to simplify the computational process. The result of the type-reduction process, called the type-reduced set, carries more information than the crisp output of a type-1 FNN. The stability analysis of the left and right learning rates for this two-layer interval NN will be discussed.

A new theorem will be proposed to yield the dynamic optimal learning rates for this two-layer interval NN, which guarantees the maximum error reduction during the BP process. It can also be shown that the left and right learning rates for the interval neural network cannot both be negative: it is not necessary that both learning rates be positive, but they cannot both be negative. For comparison, the dynamical optimal learning rate for type-1 FNN should be positive [3]. The variations of the parameter settings in the type-2 MFs, i.e., the spread of uncertain means or deviations, will affect the total performance during the BP training process. In order to find the optimal settings of uncertain means or deviations in the interval T2FNN, a genetic algorithm is also proposed together with the dynamical optimal BP training process to search for the optimal spread rate of the MFs and the optimal learning rate in the antecedent part simultaneously. In the meantime, the dynamic optimal learning rate of the two-layer neural network of the consequent part can also be obtained for each iteration. The well-known examples of truck backing-up and nonlinear system identification will be illustrated via our new optimally trained interval T2FNN with GAs to yield better performance than those using type-1 FNN.

1083-4419/04$20.00 © 2004 IEEE

Fig. 1. (a) Interval type-2 fuzzy set with uncertain mean. (b) Three-dimensional membership function for interval type-2 fuzzy set.

This paper is organized as follows. In Section II, a type-2 fuzzy neural network model will be defined. In Section III, the dynamic optimal learning theorem with the BP process will be developed to tune the interval T2FNN. Section IV describes how to find the optimal spread rate and learning rate via a genetic algorithm. Section V shows two applications via the dynamic optimal learning theorem with GA. The conclusions and topics for future research are drawn in Section VI.

II. INTERVAL TYPE-2 FUZZY NEURAL NETWORK (T2FNN)

In this section, the interval type-2 fuzzy set and the inference of the type-2 fuzzy logic system will be described first. This will lead to the interval type-2 fuzzy neural network (T2FNN).

A. Type-2 Fuzzy Logic System (T2FLS)

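A type-2 fuzzy set in the universal set $X$ is denoted as $\tilde{A}$, which is characterized by a type-2 membership function $\mu_{\tilde{A}}(x)$ in (1). The $\mu_{\tilde{A}}(x)$ can be referred to as a secondary membership function, or as a secondary set, which is a type-1 fuzzy set in [0, 1]. In (1), $f_x(u)$ is a secondary grade, which is the amplitude of a secondary membership function; i.e., $0 \le f_x(u) \le 1$. The domain of a secondary membership function is called the primary membership of $x$. In (1), $J_x$ is the primary membership of $x$, where $u \in J_x \subseteq [0, 1]$ for $\forall x \in X$; $u$ is a fuzzy set in [0, 1], rather than a crisp point in [0, 1].

$$\tilde{A} = \int_{x \in X} \mu_{\tilde{A}}(x)/x = \int_{x \in X} \left[ \int_{u \in J_x} f_x(u)/u \right] \Big/ x, \qquad J_x \subseteq [0, 1] \qquad (1)$$

When $f_x(u) = 1$, $\forall u \in J_x \subseteq [0, 1]$, the secondary MFs are interval sets such that $\mu_{\tilde{A}}(x)$ in (1) can be called an interval type-2 MF [6]. Therefore the type-2 fuzzy set $\tilde{A}$ can be re-expressed as

$$\tilde{A} = \int_{x \in X} \mu_{\tilde{A}}(x)/x = \int_{x \in X} \left[ \int_{u \in J_x} 1/u \right] \Big/ x, \qquad J_x \subseteq [0, 1] \qquad (2)$$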

Fig. 2. Type-2 fuzzy logic system.

Also, a Gaussian primary MF with uncertain mean and fixed standard deviation having an interval type-2 secondary MF can be called an interval type-2 Gaussian MF (3). Fig. 1(a) shows a 2-D interval type-2 Gaussian MF with an uncertain mean in $[m_1, m_2]$ and a fixed deviation $\sigma$. It can be stated as

$$\mu_{\tilde{A}}(x) = \exp\left[ -\frac{1}{2} \left( \frac{x - m}{\sigma} \right)^2 \right], \qquad m \in [m_1, m_2] \qquad (3)$$

It is obvious that the type-2 fuzzy set lies in a region, called the footprint of uncertainty (FOU), bounded by an upper MF and a lower MF [6], which are denoted as $\bar{\mu}_{\tilde{A}}(x)$ and $\underline{\mu}_{\tilde{A}}(x)$, respectively. Both of them are type-1 MFs. Hence, (2) can be restated as

$$\tilde{A} = \int_{x \in X} \left[ \int_{u \in [\underline{\mu}_{\tilde{A}}(x),\, \bar{\mu}_{\tilde{A}}(x)]} 1/u \right] \Big/ x \qquad (4)$$

We will make great use of the upper and lower MFs for type reduction in this section and develop the dynamic optimal learning rate algorithm in the next section. Also, the interval type-2 Gaussian MF with uniform uncertainty at the primary memberships of $x$ in Fig. 1(a)–(b) will be adopted in this paper.
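The upper and lower MFs of a Gaussian primary MF with uncertain mean can be evaluated directly from the two bounding Gaussians. The following is a minimal sketch, assuming the commonly used piecewise construction (upper MF equal to 1 between the two means, lower MF switching at the midpoint of the mean interval); the function names are illustrative and do not come from the paper.

```python
import math


def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership grade."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def upper_mf(x, m1, m2, sigma):
    """Upper MF of an interval type-2 Gaussian MF with mean in [m1, m2]."""
    if x < m1:
        return gaussian(x, m1, sigma)
    if x > m2:
        return gaussian(x, m2, sigma)
    return 1.0  # plateau between the two extreme means


def lower_mf(x, m1, m2, sigma):
    """Lower MF: the smaller of the two extreme Gaussians at x."""
    if x <= (m1 + m2) / 2.0:
        return gaussian(x, m2, sigma)
    return gaussian(x, m1, sigma)


if __name__ == "__main__":
    # Membership interval [lower, upper] at a few sample points.
    for x in (-1.0, 0.25, 0.5, 2.0):
        print(x, lower_mf(x, 0.0, 1.0, 1.0), upper_mf(x, 0.0, 1.0, 1.0))
```

For any input, the pair (`lower_mf`, `upper_mf`) bounds the footprint of uncertainty at that point, which is all that the interval type-reduction step needs from the antecedent part.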

A type-2 FLS in Fig. 2 is constructed with the same structure of type-1 IF-THEN rules, which still depends on the knowledge of experts. Expert knowledge, however, is always represented by linguistic terms and implied uncertainty, which leads to the rules of type-2 FLSs having uncertain antecedent and/or consequent parts; these then translate into uncertain antecedent or consequent MFs. The structure of the rules in the type-2 FLS and its inference engine is similar to that in the type-1 FLS. The inference engine combines rules and provides a mapping from input type-2 fuzzy sets to output type-2 fuzzy sets. To achieve this process, we must find unions and intersections of type-2 sets, as well as compositions of type-2 relations. The output of the type-2 inference engine is a type-2 set. Type-1 defuzzification derives a crisp output from a type-1 fuzzy set; similarly, using Zadeh's extension principle [15], the corresponding operation on a type-2 set reduces it to a type-1 set. This process is called "type reduction." The complete type-2 fuzzy logic theory with the handling of uncertainties, such as the operations on type-2 fuzzy sets, the centroid of a type-2 fuzzy set, type reduction, etc., can be found in [16]–[21].

Fig. 3. Interval T2FNN with antecedent part and consequent part.

B. Type-2 Fuzzy Neural Network

Due to the complexity of type reduction, the general type-2 FLS becomes computationally intensive. An interval type-2 FLS, whose secondary MFs are all unity, makes the meet and join operations simpler to compute, which in turn simplifies type reduction. An interval T2FNN system is shown in Fig. 3. It is an implementation of the interval type-2 fuzzy logic system, and some of its parameters and components are represented by fuzzy logic terms. Like type-1 FNN, Fig. 3 is a typical FNN with a four-layer structure [1]. Input nodes and type-2 fuzzification nodes are drawn on layer I and layer II, respectively. They form the antecedent part of this T2FNN. The consequent part is drawn on layers III and IV, which are constructed from a classical two-layer NN with fuzzy rule nodes and output nodes. The fuzzifier nodes in layer II yield type-2 membership grades. Each node at layer III is a fuzzy rule. Layer III nodes consist of the preconditions of the rule, i.e., the firing strength from (6) as shown below. Layer IV nodes define the consequences of the rule nodes. The links between layer III and layer IV consist of interval weighting factors, which decide the actual outputs of this system.

The IF-THEN rule for interval T2FNN can be expressed as

IF $x_1$ is $\tilde{F}_1^i$ and $\cdots$ and $x_n$ is $\tilde{F}_n^i$
THEN $y_1$ is $w_1^i$ and $\cdots$ and $y_m$ is $w_m^i$ (5)

where $i = 1, \ldots, M$ is the rule number, the $\tilde{F}_k^i$ are the interval type-2 fuzzy sets of the antecedent part, and $w_j^i = [w_{jl}^i, w_{jr}^i]$ is a centroid set with unity membership grade (interval type-1 fuzzy set), which can be called the weighting interval set, derived from the interval type-2 fuzzy set in the consequent part [6], [17]. Both $w_{jl}^i$ and $w_{jr}^i$ are treated as weighting factors that fully connect layer III and layer IV in our interval T2FNN structure. In practical use, both $w_{jl}^i$ and $w_{jr}^i$ can also be set at random initially within a reasonable interval. This structure combines type-2 fuzzification in the antecedent part with a random weighting interval set in the consequent part. It not only fully represents a type-2 fuzzy logic relation, but also supports type reduction and leads to the development of a dynamic optimal training in the consequent part, which will be shown in Section III.

C. Type Reduction

In Fig. 3, we only consider singleton input fuzzification throughout this paper. Similar to type-1 FNN, the firing strength in (6) can be obtained by the following inference process:

(6)

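where $\sqcap$ is the meet operation and $\sqcup$ is the join operation [6]. For a Gaussian interval type-2 fuzzy set as shown in Fig. 1, the upper MF is a subset that has the maximum membership grade and the lower MF is a subset that has the minimum membership grade. The join operation in (6) joins the results of the above meet operations using the supremum (i.e., maximum value); the result is an interval type-1 set [20]

$$F^i = [\underline{f}^i, \bar{f}^i] \qquad (7)$$

where

$$\underline{f}^i = \underline{\mu}_{\tilde{F}_1^i}(x_1) * \cdots * \underline{\mu}_{\tilde{F}_n^i}(x_n) \quad \text{and} \quad \bar{f}^i = \bar{\mu}_{\tilde{F}_1^i}(x_1) * \cdots * \bar{\mu}_{\tilde{F}_n^i}(x_n) \qquad (8)$$

The center-of-sets type reduction [6], [20] will be used in this paper. In order to simplify the notation, we consider a single output here. Then we have the center-of-sets type-reduction method

$$Y_{\cos}(\mathbf{x}) = [y_l, y_r] = \int_{w^1 \in [w_l^1, w_r^1]} \cdots \int_{w^M \in [w_l^M, w_r^M]} \int_{f^1 \in [\underline{f}^1, \bar{f}^1]} \cdots \int_{f^M \in [\underline{f}^M, \bar{f}^M]} 1 \Big/ \frac{\sum_{i=1}^{M} f^i w^i}{\sum_{i=1}^{M} f^i} \qquad (9)$$

where $Y_{\cos}$ is also an interval type-1 set determined by its left and right end points ($y_l$ and $y_r$), which can be derived from the consequent centroid set $[w_l^i, w_r^i]$ and the firing strengths $f^i \in F^i = [\underline{f}^i, \bar{f}^i]$. The interval set $[w_l^i, w_r^i]$ ($i = 1, \ldots, M$) should be computed or set first before the computation of $Y_{\cos}$. Any value $y \in Y_{\cos}$ can be expressed as

$$y = \frac{\sum_{i=1}^{M} f^i w^i}{\sum_{i=1}^{M} f^i} \qquad (10)$$

where $y$ is a monotonic increasing function with respect to $w^i$. Also, $y_l$ in (9) is the minimum associated only with $w_l^i$, and $y_r$ in (9) is the maximum associated only with $w_r^i$. Note that $y_l$ and $y_r$ depend only on a mixture of $\underline{f}^i$ or $\bar{f}^i$ values. Hence, the left-most point $y_l$ and the right-most point $y_r$ can be expressed as [9]

$$y_l = \frac{\sum_{i=1}^{M} f_l^i w_l^i}{\sum_{i=1}^{M} f_l^i} \qquad (11)$$

and

$$y_r = \frac{\sum_{i=1}^{M} f_r^i w_r^i}{\sum_{i=1}^{M} f_r^i} \qquad (12)$$

For illustrative purposes, the type-reduction algorithm for computing $y_r$ from ([6], pp. 310–311) is listed below as Algorithm 1.

Algorithm 1. Type Reduction for Interval T2FNN: Without loss of generality, assume the $w_r^i$'s are arranged in ascending order, i.e., $w_r^1 \le w_r^2 \le \cdots \le w_r^M$.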

[Step 1]: Compute $y_r$ in (12) by initially using $f_r^i = (\underline{f}^i + \bar{f}^i)/2$ for $i = 1, \ldots, M$, where $\underline{f}^i$ and $\bar{f}^i$ are pre-computed by (8); and let $y_r' = y_r$.
[Step 2]: Find $R$ ($1 \le R \le M - 1$) such that $w_r^R \le y_r' \le w_r^{R+1}$.
[Step 3]: Compute $y_r$ in (12) with $f_r^i = \underline{f}^i$ for $i \le R$ and $f_r^i = \bar{f}^i$ for $i > R$, then set $y_r'' = y_r$.
[Step 4]: If $y_r'' \ne y_r'$, then go to Step 5. If $y_r'' = y_r'$, then set $y_r = y_r''$ and go to Step 6.
[Step 5]: Let $y_r' = y_r''$ and return to Step 2.
[Step 6]: End.

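This algorithm decides the point that separates the two sides by the number $R$, one side using the lower firing strengths $\underline{f}^i$'s and the other side using the upper firing strengths $\bar{f}^i$'s. Hence, $y_r$ in (12) can be re-expressed as

$$y_r = \frac{\sum_{i=1}^{R} \underline{f}^i w_r^i + \sum_{i=R+1}^{M} \bar{f}^i w_r^i}{\sum_{i=1}^{R} \underline{f}^i + \sum_{i=R+1}^{M} \bar{f}^i} = \sum_{i=1}^{R} \underline{q}_r^i w_r^i + \sum_{i=R+1}^{M} \bar{q}_r^i w_r^i \qquad (13)$$

where $\underline{q}_r^i = \underline{f}^i / D_r$ and $\bar{q}_r^i = \bar{f}^i / D_r$, with $D_r$ denoting the denominator in (13).

The procedure to compute $y_l$ is similar to that for $y_r$. In Step 2, it only needs to find $L$ ($1 \le L \le M - 1$) such that $w_l^L \le y_l' \le w_l^{L+1}$. In Step 3, let $f_l^i = \bar{f}^i$ for $i \le L$ and $f_l^i = \underline{f}^i$ for $i > L$. Then $y_l$ in (11) can also be re-expressed as

$$y_l = \frac{\sum_{i=1}^{L} \bar{f}^i w_l^i + \sum_{i=L+1}^{M} \underline{f}^i w_l^i}{\sum_{i=1}^{L} \bar{f}^i + \sum_{i=L+1}^{M} \underline{f}^i} = \sum_{i=1}^{L} \bar{q}_l^i w_l^i + \sum_{i=L+1}^{M} \underline{q}_l^i w_l^i \qquad (14)$$

where $\bar{q}_l^i = \bar{f}^i / D_l$ and $\underline{q}_l^i = \underline{f}^i / D_l$, with $D_l$ denoting the denominator in (14).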
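The iterative procedure of Algorithm 1 and its mirrored version for the left end point can be sketched in a few lines. The code below is a hedged illustration rather than the authors' implementation: the function name `km_endpoint`, the convergence tolerance, and the iteration cap are assumptions, and the firing strengths are assumed to be strictly positive so that the denominators never vanish.

```python
def km_endpoint(weights, f_lower, f_upper, right=True, max_iter=100):
    """Iteratively compute y_r (right=True) or y_l (right=False).

    weights: consequent end points (w_r^i for y_r, w_l^i for y_l)
    f_lower, f_upper: lower and upper firing strengths of each rule
    """
    # Sort rules so that the weights are in ascending order.
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    w = [weights[i] for i in order]
    lo = [f_lower[i] for i in order]
    hi = [f_upper[i] for i in order]
    m = len(w)

    # Step 1: start from the midpoint firing strengths.
    f = [(a + b) / 2.0 for a, b in zip(lo, hi)]
    y = sum(fi * wi for fi, wi in zip(f, w)) / sum(f)

    for _ in range(max_iter):
        # Step 2: switch point k (1-based) with w[k-1] <= y <= w[k].
        k = 1
        while k < m - 1 and w[k] <= y:
            k += 1
        # Step 3: lower strengths left of k for y_r, upper strengths for y_l.
        f = lo[:k] + hi[k:] if right else hi[:k] + lo[k:]
        y_new = sum(fi * wi for fi, wi in zip(f, w)) / sum(f)
        # Steps 4 and 5: stop when the end point no longer changes.
        if abs(y_new - y) < 1e-12:
            return y_new
        y = y_new
    return y


if __name__ == "__main__":
    w = [1.0, 2.0, 3.0]
    f_lo = [0.2, 0.3, 0.1]
    f_hi = [0.6, 0.8, 0.5]
    y_l = km_endpoint(w, f_lo, f_hi, right=False)
    y_r = km_endpoint(w, f_lo, f_hi, right=True)
    print(y_l, y_r)
```

In this toy call the same weights are used for both end points; in the T2FNN the left end point would use the $w_l^i$ values and the right end point the $w_r^i$ values.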

The defuzzified crisp output from an interval type-2 FLS is the average of $y_l$ and $y_r$, i.e.,

$$y(\mathbf{x}) = \frac{y_l + y_r}{2} \qquad (15)$$

According to the above analysis, the defuzzified output $y$, i.e., the actual output, is determined only by the upper and lower antecedent MFs and the weighting interval set. Like type-1 FNN, we can also use the BP method to tune all the parameters of the type-2 fuzzy MFs in T2FNN. However, the process of tuning the parameters of interval T2FNN is more complicated than that in type-1 FNN. We must first determine the parameters associated with $y_l$ and $y_r$. This requires comparing the input to some points associated with the parameters of the upper and lower antecedent MFs [6]. When the input is located in one segment of the domain, its corresponding MF branch is called the active branch. For instance, in Fig. 1(a), we have two respective active upper and lower MF branches. Once these parameters are changed due to tuning, the dependency of $y_l$ and $y_r$ on the parameters may also change, i.e., the active branches may change to other branches. The tuning of the parameters of these active branches is the same as tuning the parameters in type-1 FNN. For instance, to tune the parameters of the active branches located in any upper or lower MF, we have the following details.

By using the back propagation method, for $P$ input-output training data pairs the following error function should be minimized:

(16)

To tune the mean of the Gaussian MF in the $i$th rule [6], we use

(17)

Similarly, to tune the standard deviation and the weighting factor, we have

(18)

(19)

where $\eta$ is the learning rate for tuning the parameters of the MFs. Here, the mean and the standard deviation can be those of either end of the uncertain interval, and the weighting factor can be $w_l$ or $w_r$. The weighting factor $w_l$ or $w_r$ depends on which branch is active in the process of calculating the left-most point or the right-most point (13)–(14). Based on both the type-reduction and BP processes, a dynamic optimal learning algorithm for tuning the weighting matrices of the consequents will be developed in the next section to speed up the convergence of the back propagation process. An example of tuning the parameters by using the back propagation process in (17)–(19) is illustrated as follows.

Example 1: The following interval T2FNN has three rules in which each rule has two antecedent parts and two consequent parts (i.e., MIMO—multiple inputs and multiple outputs):

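IF $x_1$ is $\tilde{F}_1^1$ and $x_2$ is $\tilde{F}_2^1$ THEN $y_1$ is $w_1^1$ and $y_2$ is $w_2^1$ (20)
IF $x_1$ is $\tilde{F}_1^2$ and $x_2$ is $\tilde{F}_2^2$ THEN $y_1$ is $w_1^2$ and $y_2$ is $w_2^2$ (21)
and
IF $x_1$ is $\tilde{F}_1^3$ and $x_2$ is $\tilde{F}_2^3$ THEN $y_1$ is $w_1^3$ and $y_2$ is $w_2^3$ (22)

where the two antecedents are Gaussian primary MFs with uncertain mean. We extend the type-1 Gaussian MFs by a deviation ratio of 0.5 to form interval type-2 Gaussian MFs. The original type-1 Gaussian MFs are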

The extended type-2 MFs are: (with same fixed deviations)

The weighting matrices of the consequent part (20)–(22) are initially assumed at random as

and given four pairs of training data as

For example, the interval type-2 Gaussian MFs of the antecedent part of one rule are shown in Fig. 4(a)–(b). The interval type-1 sets of weighting factors in the consequent part for the first output are shown in Fig. 4(c). We examine the first training pair. Feeding its input into this MIMO interval T2FNN and using the product t-norm, we obtain three interval type-1 sets of firing strength as

After firing the consequent part, we have upper and lower interval type-1 sets, which form interval type-2 fuzzy sets. Fig. 4(d) shows the three-dimensional (3-D) view of these interval type-2 fuzzy sets. The type-reduction procedure in Algorithm 1 can be applied to find the right-most point from these fired type-2 interval sets in Fig. 4(d). Similarly, the left-most point can also be found, which gives the type-reduced set for this training pair. As a result, we have $y = 3.4921$ as shown in Fig. 4(e).

Fig. 4. Two interval type-2 Gaussian MFs of the antecedent part for rule $R$ are shown in (a) and (b). The weighting interval sets and corresponding fired interval type-2 sets for the first output are shown in (c). The 3-D view of the interval type-2 sets is shown in (d). The result $y = 3.4921$ is shown in (e).

By using the back propagation process in (17)–(19) with a fixed learning rate of 0.2 for 15 iterations, we have the final means, standard deviations, weighting matrices and actual output as

Then the trajectory of total squared errors is plotted in Fig. 5.

We only apply the back-propagation algorithm in Example 1 with a fixed learning rate. Better training results using dynamic optimal learning rates can be seen in the next section.

Fig. 5. Total squared error $J$ versus iteration $t$ for a fixed learning rate $\eta = 0.2$.

III. DYNAMIC OPTIMAL LEARNING RATE OF INTERVAL TYPE-2 FUZZY NEURAL NETWORK

In [3], the authors developed the dynamic optimal learning rate theorem to speed up the convergence of tuning the weighting factors of the consequent part in type-1 FNN. From the type-reduction process in Algorithm 1 with (13)–(14), we have

Fig. 6. Detailed look of the consequent part in Fig. 3.

Then it is obvious that we can interpret the above equations as an interval NN, which is shown in layers III and IV of Fig. 3 and presented in more detail in Fig. 6. The goal is then to find the dynamic optimal training for tuning the weighting interval sets in Fig. 6. The $i$th node input of layer III is the firing strength from layer II. Once all firing strengths enter the type-reduction process, the quantities in (13)–(14) can be determined via Algorithm 1 to find $y_l$ and $y_r$. A dynamic optimal learning algorithm for T2FNN will be developed in this section to guarantee maximum error reduction during the back propagation process of the previous section, where

the firing strength matrix (23)
the weighting matrix (24)
the $j$th left weighting vector (25)
the $j$th right weighting vector (26)
the left firing strength vector (27)
the right firing strength vector (28)
the actual output vector (29)
the desired output vector (30)

and the superscript "T" denotes matrix transpose, “ ” denotes vector.

To derive the actual output, we first need to compute its left-most (14) and right-most (13) outputs as

(31)

and

(32)

Therefore, the actual output can be obtained as

(33)

Then we have the actual output vector in (29) by taking the union of all outputs.

Given $P$ training vectors, the actual output $Y$, and the desired output $D$, we define

$Q_l$: the left firing strength matrix (34)
$Q_r$: the right firing strength matrix (35)
$W_l$: the left weighting factor matrix (36)
$W_r$: the right weighting factor matrix (37)
$Y$: the actual output matrix (38)
$D$: the desired output matrix (39)

From (33), the actual output matrix $Y$ can be expressed as

(40)

The total squared error $J$ (16) can be expressed as

(41)

Using matrix notation to reorganize $J$, we first define the error function $E$ as

(42)

then we have

(43)

To tune the weighting factors using the chain rule, we have

(44)

After training, assuming zero error, the actual output should equal the desired output. The learning rate for each iteration during the back propagation process is different, i.e., the learning rates are not fixed [3]. To find the optimal learning rates $\beta_l$ and $\beta_r$, we have the following theorem.

Theorem 1: The optimal learning rates $\beta_l$ and $\beta_r$ defined in (44) and (45) can be found from the minimum of a quadratic polynomial in $\beta_l$ and $\beta_r$, whose coefficients $A$, $B$, $C$, $F$, and $G$ can be obtained from the left firing strength $Q_l$ and right firing strength $Q_r$, the desired output $D$, and the weighting factors $W_l$ and $W_r$.

Proof: First, we must find the stable range for $\beta_l$ and $\beta_r$. To do so, we define the Lyapunov function as

(46)

where $J$ is defined in (41). The change of the Lyapunov function is $\Delta V = J_{t+1} - J_t$. It is well known that if $\Delta V < 0$, the response of the system is guaranteed to be stable. For $\Delta V < 0$, we have

(47)

Consider all the $P$ firing strengths: they remain the same, but their order may change according to the order of the weighting factors during the training process. Then, from (43), we obtain the updated total squared error, and hence

(48)

where

(49)
(50)
(51)
(52)
(53)

It is obvious that $A$ and $B$ in (49) and (50) and $F$ and $G$ in (52) and (53) contain quadratic matrices. Therefore $A$ and $B$ should be positive, and $F$ and $G$ should be negative. However, $C$ in (51) can be either positive or negative.

Therefore we have

In type-1 FNN [3], the authors similarly required the change of the Lyapunov function to be negative to guarantee that the NN is stable, and derived a one-variable quadratic polynomial in the learning rate. Therefore, the optimal learning rate can be derived to make this polynomial reach its minimum in type-1 FNN. Similarly, the minimum of the two-variable quadratic function can be determined as follows.

Let $H(\beta_l, \beta_r) = J_{t+1} - J_t$. The first-order partial derivatives of $H$ are

(54)
(55)

The second partial derivatives of $H$ are

(56)

From the partial derivatives theorem [22] and (56), if $\partial^2 H / \partial \beta_l^2 > 0$ and $(\partial^2 H / \partial \beta_l^2)(\partial^2 H / \partial \beta_r^2) - (\partial^2 H / \partial \beta_l \partial \beta_r)^2 > 0$, then $H$ is a quadratic parabolic function and has a local minimum value at a critical point. Therefore, by solving (54) and (55), we can find the left-end and right-end optimal learning rates as

(57)
(58)
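As an illustration of the last step, the critical point of a two-variable quadratic can be computed in closed form. The sketch below assumes the polynomial has the form $H = A\beta_l^2 + B\beta_r^2 + C\beta_l\beta_r + F\beta_l + G\beta_r$, which is consistent with the sign statements above but is not copied from the missing equations; the function name `optimal_rates` is hypothetical.

```python
def optimal_rates(A, B, C, F, G):
    """Minimizer of H = A*bl^2 + B*br^2 + C*bl*br + F*bl + G*br.

    Setting both first-order partial derivatives to zero gives
        2*A*bl + C*br + F = 0
        C*bl + 2*B*br + G = 0
    which is solved here by Cramer's rule.
    """
    det = 4.0 * A * B - C * C
    if A <= 0.0 or det <= 0.0:
        # The second-derivative test fails: no strict local minimum.
        raise ValueError("H has no strict minimum for these coefficients")
    beta_l = (C * G - 2.0 * B * F) / det
    beta_r = (C * F - 2.0 * A * G) / det
    return beta_l, beta_r


if __name__ == "__main__":
    print(optimal_rates(1.0, 1.0, 0.0, -2.0, -4.0))
    print(optimal_rates(1.0, 1.0, 1.0, -1.0, -3.0))
```

With a positive cross term, as in the second call, one of the two rates can come out negative while the other stays positive.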

To prove that the required condition holds, from (49)–(51) we have

(59)

(60)

(61)

According to the Cauchy inequality [22], we have

(62)

Therefore, by using (59)–(62), we can obtain

(63)
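For readers who want to follow the structure of the argument, the derivation below works through one self-consistent formulation. It is a sketch under stated assumptions, not a reproduction of (40)–(63): the matrix shapes, the factor $\tfrac12$ in the output and in $J$, and the symbols $U$ and $V$ are all assumptions introduced here for illustration.

```latex
% Assumed setup (not taken from the source): P samples, M rules, N outputs,
% Q_l, Q_r \in R^{P \times M}, W_l, W_r \in R^{M \times N}, D \in R^{P \times N},
% with gradient steps W_l <- W_l - \beta_l dJ/dW_l and W_r <- W_r - \beta_r dJ/dW_r.
\begin{align*}
Y &= \tfrac12\,(Q_l W_l + Q_r W_r), \qquad E = Y - D, \qquad J = \tfrac12\,\mathrm{tr}(E E^{T}),\\
\frac{\partial J}{\partial W_l} &= \tfrac12\,Q_l^{T} E, \qquad
\frac{\partial J}{\partial W_r} = \tfrac12\,Q_r^{T} E,\\
E_{t+1} &= E_t - \beta_l U - \beta_r V, \qquad
U = \tfrac14\,Q_l Q_l^{T} E_t, \quad V = \tfrac14\,Q_r Q_r^{T} E_t,\\
J_{t+1} - J_t &= A\beta_l^2 + B\beta_r^2 + C\beta_l\beta_r + F\beta_l + G\beta_r,\\
A &= \tfrac12\,\mathrm{tr}(U U^{T}), \quad B = \tfrac12\,\mathrm{tr}(V V^{T}), \quad C = \mathrm{tr}(U V^{T}),\\
F &= -\mathrm{tr}(E_t U^{T}) = -\tfrac14\,\|Q_l^{T} E_t\|_F^2 \le 0, \quad
G = -\mathrm{tr}(E_t V^{T}) = -\tfrac14\,\|Q_r^{T} E_t\|_F^2 \le 0,\\
4AB - C^2 &= \|U\|_F^2\,\|V\|_F^2 - \langle U, V \rangle_F^2 \;\ge\; 0 .
\end{align*}
```

Under these assumptions the last line is the Cauchy inequality applied to $U$ and $V$, with equality only when the two are collinear, and the signs of $A$, $B$, $F$, and $G$ agree with those stated in the proof.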

By inspecting (42) and (48)–(53), it is obvious that the stable range of the learning rates depends on the quantities appearing in those equations. In comparison with the positive optimal learning rate in type-1 FNN, the following Theorem 2 shows that the stable optimal learning rates in interval T2FNN can be positive or negative, but cannot both be negative simultaneously.

Theorem 2: For the two-layer NN of the consequent part, the stable optimal learning rates $\beta_l$ and $\beta_r$ cannot both be negative simultaneously.

Proof: Suppose the learning rates in (57)–(58) are both negative, i.e., $\beta_l < 0$ and $\beta_r < 0$. Then, by (63), we have

(64)
(65)

In Theorem 1, we know that $A > 0$, $B > 0$ and $F < 0$, $G < 0$, but $C$ can be positive or negative. If $C$ is negative, then the term involving $C$ in (64) will be positive, so (64) cannot be true given the signs of the other coefficients. Similarly, (65) cannot be true if $C$ is negative. As a result, both learning rates are positive if $C < 0$.

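Fig. 7. Quadratic parabolic trajectories of $J_{t+1} - J_t$ versus $\beta_l$ and $\beta_r$ in (a). Three cases of $\beta_l$ and $\beta_r$ are shown in (b), (c), and (d) while $C > 0$. (b) shows both $\beta_l$ and $\beta_r$ are $> 0$. (c) shows $\beta_l < 0$ but $\beta_r > 0$. (d) shows the case when $\beta_l > 0$ and $\beta_r < 0$. (e) shows both $\beta_l$ and $\beta_r$ are $> 0$ when $C < 0$.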

If $C$ is positive, manipulating (64) and (65) gives

(66)

and

(67)

Combining (66) and (67), we have

(68)

Simplifying (68) yields an inequality that violates (63). Therefore, the two optimal learning rates cannot both be negative simultaneously.
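The claim of Theorem 2 can also be checked numerically. The sketch below samples random coefficients that satisfy the sign conditions of Theorem 1 and counts the sign pattern of the resulting optimal rates. It assumes the quadratic form $H = A\beta_l^2 + B\beta_r^2 + C\beta_l\beta_r + F\beta_l + G\beta_r$ with $4AB - C^2 > 0$; the sampling ranges and function names are arbitrary choices made for illustration.

```python
import math
import random


def optimal_rates(A, B, C, F, G):
    """Critical point of H = A*bl^2 + B*br^2 + C*bl*br + F*bl + G*br."""
    det = 4.0 * A * B - C * C
    return (C * G - 2.0 * B * F) / det, (C * F - 2.0 * A * G) / det


def count_sign_cases(trials=10000, seed=1):
    """Count sign patterns of (beta_l, beta_r) over random coefficients."""
    rng = random.Random(seed)
    counts = {"++": 0, "-+": 0, "+-": 0, "--": 0}
    for _ in range(trials):
        A = rng.uniform(0.1, 5.0)            # A > 0
        B = rng.uniform(0.1, 5.0)            # B > 0
        c_max = 2.0 * math.sqrt(A * B)       # keeps 4AB - C^2 > 0
        C = rng.uniform(-0.99 * c_max, 0.99 * c_max)
        F = rng.uniform(-5.0, -0.1)          # F < 0
        G = rng.uniform(-5.0, -0.1)          # G < 0
        bl, br = optimal_rates(A, B, C, F, G)
        key = ("+" if bl >= 0 else "-") + ("+" if br >= 0 else "-")
        counts[key] += 1
    return counts


if __name__ == "__main__":
    print(count_sign_cases())
```

The count for the pattern with both rates negative stays at zero, while the other three patterns can all occur, matching the quadrants discussed for Fig. 7.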

According to Theorem 2, it is obvious that the two lines (on the plane of $\beta_l$ versus $\beta_r$) defined in (54) and (55) have an intersection point located in the first, second, or fourth quadrant. Fig. 7(a) shows a 3-D view of the intersections. Fig. 7(b) and (e) show the cases when both optimal learning rates are positive, i.e., $\beta_l > 0$ and $\beta_r > 0$. Fig. 7(c) shows the case when $\beta_l < 0$ and $\beta_r > 0$. Fig. 7(d) is for $\beta_l > 0$ and $\beta_r < 0$.

Consequently, the tuning process in this two-layer interval NN is stated as the following Algorithm 2.

Algorithm 2. Dynamic Optimal Learning Rates for Consequent Part of T2FNN:

[Step 1]: Given the initial weighting matrices $W_{l,0}$ and $W_{r,0}$, firing matrices $Q_l$ and $Q_r$, and desired output $D$, find the initial actual output $Y_0$ (40) and the optimal learning rates $\beta_{l,\mathrm{opt},0}$ and $\beta_{r,\mathrm{opt},0}$. Then, set the initial iteration $t = 0$ and start the back propagation training process.
[Step 2]: Check whether the desired output $D$ and the actual output $Y_t$ are close enough (i.e., within a threshold limit) or whether the maximum number of iterations has been reached. If yes, go to Step 6.
[Step 3]: Update the weighting matrices to obtain $[W_{l,t+1}\ W_{r,t+1}]$ by (44), (45).
[Step 4]: Find the optimal learning rates $\beta_{l,\mathrm{opt},t+1}$ and $\beta_{r,\mathrm{opt},t+1}$ for the next iteration.
[Step 5]: Set $t = t + 1$. Go to Step 2.
[Step 6]: End.
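The loop of Algorithm 2 can be sketched for a single output in plain Python. This is a simplified illustration under several assumptions that are not taken from the paper: the output is modeled as $y = \tfrac12(Q_l w_l + Q_r w_r)$ with $J = \tfrac12\sum e^2$, the firing matrices are held fixed during training (in the full T2FNN they would be recomputed whenever the ordering of the weights changes), and a common step size is used as a fallback when the two update directions are nearly collinear. All function names are hypothetical.

```python
def matvec(Q, w):
    """Q (P x M) times w (length M)."""
    return [sum(q * x for q, x in zip(row, w)) for row in Q]


def mat_t_vec(Q, e):
    """Q^T (M x P) times e (length P)."""
    return [sum(Q[p][i] * e[p] for p in range(len(Q))) for i in range(len(Q[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_consequent(Ql, Qr, d, wl, wr, max_iter=50, tol=1e-12):
    """Gradient descent on wl, wr with per-iteration optimal step sizes."""
    history = []
    for _ in range(max_iter):
        y = [0.5 * (a + b) for a, b in zip(matvec(Ql, wl), matvec(Qr, wr))]
        e = [yi - di for yi, di in zip(y, d)]
        J = 0.5 * dot(e, e)
        history.append(J)
        if J < tol:
            break
        gl = [0.5 * g for g in mat_t_vec(Ql, e)]   # dJ/dwl
        gr = [0.5 * g for g in mat_t_vec(Qr, e)]   # dJ/dwr
        u = [0.5 * x for x in matvec(Ql, gl)]      # error change per unit beta_l
        v = [0.5 * x for x in matvec(Qr, gr)]      # error change per unit beta_r
        A, B, C = 0.5 * dot(u, u), 0.5 * dot(v, v), dot(u, v)
        F, G = -dot(e, u), -dot(e, v)
        det = 4.0 * A * B - C * C
        if A * B > 0.0 and det > 1e-12 * 4.0 * A * B:
            bl = (C * G - 2.0 * B * F) / det
            br = (C * F - 2.0 * A * G) / det
        else:
            # Nearly collinear directions: use one common optimal rate.
            s = [a + b for a, b in zip(u, v)]
            ss = dot(s, s)
            if ss == 0.0:
                break
            bl = br = dot(e, s) / ss
        wl = [w - bl * g for w, g in zip(wl, gl)]
        wr = [w - br * g for w, g in zip(wr, gr)]
    return wl, wr, history


if __name__ == "__main__":
    Ql = [[0.6, 0.4], [0.3, 0.7], [0.5, 0.5]]
    Qr = [[0.2, 0.8], [0.1, 0.9], [0.4, 0.6]]
    d = [1.0, 2.0, 1.5]
    wl, wr, history = train_consequent(Ql, Qr, d, [0.0, 1.0], [1.0, 2.0])
    print(history[0], history[-1])
```

Because each step minimizes the error exactly over the two step sizes, the recorded error sequence cannot increase from one iteration to the next in this simplified setting.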

Given the same case of Example 1, the following example illustrates the major concept in this section.

Example 2: The weighting factors of the consequent part in Example 1 will be tuned by using the dynamic optimal learning algorithm stated in Algorithm 2. We also allow 15 iterations for this dynamical optimal tuning so that we can compare the tuning results with those in Example 1. The firing strength matrices from this MIMO T2FNN (4 inputs and two outputs) can be found from Example 1 as

During the dynamical optimal training process, the weighting intervals will be tuned to achieve the maximum error reduction after each iteration. The initial $J$ in (41) is 0.3919. The optimal learning rates are used to update the weighting matrices. The trajectory of the total squared error is listed in Table I and shown in Fig. 8(a). Fig. 8(a) also shows the trajectory for the fixed learning rate from Fig. 5. It is obvious that the total squared error decreases as expected in the descent direction and converges faster than that in Fig. 5. Fig. 8(b) shows that $\beta_l$ and $\beta_r$ are not negative simultaneously, which is in accordance with Theorem 2. Note that the tunings in Example 1 include the mean, standard deviation, and weighting factors, all using a fixed learning rate.

TABLE I
OPTIMAL LEARNING RATES AND TOTAL SQUARED ERROR

After 15 iterations, we have the final weighting matrices and output as

IV. TUNING INTERVAL T2FNN USING A GENETIC ALGORITHM

The authors in [20], [23] used a reasonable spread rate, expressed as a ratio of the standard deviation, to set the uncertain means as

(69)

where both the mean $m_k^i$ and the standard deviation $\sigma_k^i$ are parameters of the $k$th primary MF in the $i$th fuzzy rule in (5), and $\alpha$ is a spread rate. By doing so, all interval type-2 fuzzy sets can be constructed and yield better performance than that of the type-1 FLS. However, the optimal selection of the spread rate for a type-2 FLS can be done via the GA-based approach in [13]. For the overall tuning of T2FNN, we need to find the optimal learning rate $\eta$ in (17) and (18) (for means and standard deviations), the optimal spread rate $\alpha$, and the optimal weighting matrices for the consequent part. However, due to the dynamical optimal training algorithm derived in the previous section, it is obvious that we do not have to rely on the genetic search algorithm to find the optimal weighting matrices. We only have to design a GA-based algorithm to search for the optimal spread rate and the optimal learning rate for means and standard deviations. The fitness function ($J$ is the total squared error) in the GA-based search algorithm can be defined as [24]

(70)

The fitness in (70) is larger when the total squared error is smaller. The spread rate $\alpha$ and the learning rate $\eta$ are encoded to form a chromosome in a population for each iteration. The chromosome with a larger fitness value has a larger probability of selection. Then, a new population is formed by selecting the better-fit chromosomes. Some members of the new population undergo transformation by means of genetic operators to form new solutions. The crossover operation combines the features of two parent chromosomes to form two similar children by swapping corresponding segments of their parents. The parameters defining the crossover operation are the probability of crossover and the crossover position. Mutation is the process of occasional alteration of some gene values in a chromosome by a random change with a probability less than the mutation rate.

Fig. 8. (a) Case a: Optimal learning for consequent part only. Case b: Example 1, $\eta = 0.2$. (b) $\beta_l$ and $\beta_r$ cannot be negative simultaneously.

TABLE II
PARAMETERS FOR GA AND THE RESULTS FOR $\alpha$ AND $\eta$

Therefore, based on this GA design, we can combine the dynamic optimal learning algorithm (Algorithm 2) to perform some desired iterations under the back propagation training process. Because every new population consists of better-fit chromosomes, the search then can be continued to obtain the optimal spread rate and the optimal learning rate such that the total squared error is a minimum. The overall search algorithm, which summarizes the whole concept, is listed as follows.
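The selection, crossover, and mutation steps described above can be sketched as a small real-coded GA over two-gene chromosomes (spread rate, learning rate). This is a minimal illustration under stated assumptions: the fitness $1/(1 + J)$ is an assumed stand-in for (70), the objective passed in is a hypothetical placeholder for the total squared error returned by the training loop, and the population size is assumed to be even with at least two genes per chromosome.

```python
import random


def run_ga(objective, bounds, pop_size=20, generations=30,
           p_cross=0.8, p_mut=0.1, seed=0):
    """Minimize a non-negative objective over box-bounded real genes."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=objective)[:]
    history = []
    for _ in range(generations):
        # Assumed fitness: larger when the objective (error) is smaller.
        fitness = [1.0 / (1.0 + objective(c)) for c in pop]
        # Roulette-wheel selection proportional to fitness.
        parents = rng.choices(pop, weights=fitness, k=pop_size)
        nxt = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < p_cross:
                # Single-point crossover: swap the tail segments.
                cut = rng.randrange(1, len(bounds))
                a[cut:], b[cut:] = b[cut:], a[cut:]
            nxt += [a, b]
        for c in nxt:
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mut:
                    c[g] = rng.uniform(lo, hi)   # random-reset mutation
        pop = nxt
        cand = min(pop, key=objective)
        if objective(cand) < objective(best):
            best = cand[:]
        history.append(objective(best))
    return best, history


if __name__ == "__main__":
    # Placeholder objective with a known minimum at (0.3, 0.7).
    def toy_error(c):
        return (c[0] - 0.3) ** 2 + (c[1] - 0.7) ** 2

    best, history = run_ga(toy_error, [(0.0, 1.0), (0.0, 1.0)])
    print(best, history[-1])
```

In the setting of this section, the objective would run the back propagation and dynamic optimal training for a candidate pair of rates and return the resulting total squared error.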

Algorithm 3. Tuning FNN Via Genetic Algorithm With Op-timal Learning: Given input-output training sample pairs , we wish to tune the all parameters in antecedent and consequent part so that (46) can be minimized for iterations.

Step 1: Initialize the weighting matrices W_{l,0} and W_{r,0} randomly. Define the range intervals (lower and upper bounds) for the spread rate and for the learning rate. Set Pop_size, Max_gen, Iteration, and Threshold.

Step 2: Initialize the population Pop of Pop_size chromosomes. The jth chromosome, j = 1, ..., Pop_size, holds a spread rate and a learning rate, each taken from its range interval.

For generation = 1 : Max_gen
    For j = 1 : Pop_size
        Get the jth spread rate and learning rate. Use the spread rate to initialize the uncertain means.
        For t = 1 : Iteration
            For p = 1 : P
                Apply training sample x_p, and compute the total firing strength for each rule in (8).
                Compute y_l, y_r and the defuzzified output (y_l + y_r)/2 in (15).
                Use back propagation to tune the parameters of the active branches.
            End
            Compute the new firing strength matrices Q_l and Q_r.
            While |J_{t,min} - J_{t-2,min}| / J_{t-2,min} > Threshold
                Compute matrix E in (42), and find the optimal left and right learning rates (subscripts l,opt,t and r,opt,t) by Theorem 1.
                Compute matrices W_{l,t+1} and W_{r,t+1} in (44)–(45), and J_{t+1,min} in (43).
            End
        End
        Put the jth J_{t+1,min} (and/or any other CPI factors) into the fitness vector.
    End
    Perform selection, crossover, and mutation for the next generation.
End

The optimal spread rate and the optimal learning rate are found.
For the antecedent part: the uncertain means [m^i_{k1}, m^i_{k2}] and the deviation (superscript i, subscript k) are found.
For the consequent part: W_{l,t+1}, W_{r,t+1}, the optimal left and right learning rates (subscripts l,opt and r,opt), and J_{t+1,min} are found.
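The inner While loop of Algorithm 3 repeatedly picks the learning rate that gives the largest error reduction and stops once the error no longer changes appreciably. The sketch below illustrates that idea on a deliberately simplified case, and every detail of it is an assumption: a single weight vector `w` with output `Q w` instead of the left and right interval weights, an exact line search instead of Theorem 1, and a stopping test on successive error values. It is not the derivation in (42)–(45).

```python
def matvec(Q, w):
    """Return Q w for a matrix given as a list of rows."""
    return [sum(q * x for q, x in zip(row, w)) for row in Q]

def rmatvec(Q, e):
    """Return Q^T e."""
    return [sum(Q[p][m] * e[p] for p in range(len(Q)))
            for m in range(len(Q[0]))]

def train_consequent(Q, d, w, threshold=1e-6, max_iter=1000):
    """Steepest descent on J = 0.5*||d - Q w||^2 with an exact line search.

    Q: firing-strength matrix (P samples x M rules), d: desired outputs.
    Returns the tuned weights and the history of J values.
    """
    e = [di - yi for di, yi in zip(d, matvec(Q, w))]
    J = 0.5 * sum(x * x for x in e)
    history = [J]
    for _ in range(max_iter):
        g = rmatvec(Q, e)                 # descent direction Q^T e
        Qg = matvec(Q, g)
        denom = sum(x * x for x in Qg)
        if denom == 0.0:
            break                         # gradient vanished, nothing to gain
        # Rate that minimizes J along g: beta = (g.g) / (Qg.Qg)
        beta = sum(x * x for x in g) / denom
        w = [wi + beta * gi for wi, gi in zip(w, g)]
        e = [ei - beta * qi for ei, qi in zip(e, Qg)]
        J_new = 0.5 * sum(x * x for x in e)
        history.append(J_new)
        # Stop when the relative change of J falls to the threshold.
        if J == 0.0 or abs(J - J_new) / J <= threshold:
            break
        J = J_new
    return w, history
```

Because the output is linear in the weights, J is a quadratic function of the learning rate along the descent direction, which is why a closed-form best rate exists at every iteration in this simplified setting.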

Example 3: Starting from Example 1, we extend the same primary type-1 Gaussian MFs by using the spread rate in (69). Then we use Algorithm 3 to tune this MIMO interval T2FNN. Table II shows all parameters in the GA process and the final results for the optimal spread rate and the optimal learning rate. To increase the efficiency, we define the mutation rate and crossover rate [3] as


TABLE III

OPTIMAL LEARNING RATE & TOTAL SQUARED ERROR

Fig. 9. Performance comparisons. Case a: optimal spread rate and learning rate with dynamical optimal learning rates. Case b: fixed antecedent rates (0.5 and 0.2) with dynamical optimal learning rates.

(72) where denotes the th generation. After 5 iterations, we have the final tuned weighting factors

In comparison with fixed spread rate and optimal learning rate, we also combine Example 1 with Algorithm 2 to tune the overall T2FNN. Therefore we have the output results as

and

The total squared errors versus iterations are listed in Table III, and the results are plotted in Fig. 9. Fig. 9 also shows the trajectory of the tuning results obtained with fixed rates in the antecedent part (i.e., and ) and optimal learning rates and in the consequent part. Inspecting the results in this example, the actual output of the GA-based approach is closer to the desired output, and convergence is much faster than with the fixed rates, or even than in Examples 1 and 2.

Fig. 10. Diagram of simulated truck and loading zone.

V. EXAMPLES

Based on the above GA design for the interval T2FNN, two popular examples are fully illustrated in this section. Example 4 is the truck back up control problem. Example 5 is nonlinear system identification.

Example 4: Truck Back Up Control Problem: The well-known nonlinear problem of backing up a truck into a loading dock via an FNN controller [25], [26] is solved here by using the interval T2FNN. Fig. 10 shows the truck and loading zone. The truck position is located by three state variables , and , where is the angle of the truck with the horizontal axis . Steering angle is used to control the truck. The truck moves backward by a fixed unit distance every step. We assume enough clearance between the truck and the loading zone; therefore, does not have to be considered. Hence the system has two inputs, i.e., & , and one output within . The final states will be equal or close to . In this simulation, the steering angle will be normalized into [0, 1].

Fig. 11. Two antecedents initial MFs of T1FNN in (a) and (b), and the two corresponding antecedents initial MFs of interval T2FNN in (c) and (d).

A reasonable number of training data pairs must first be generated as desired input-output pairs so that they cover the whole range of situations. The following 14 initial states are used to generate such pairs: , and . The following approximate kinematics is used to simulate the truck path:

(73)

(74)

(75)

where is the length of the truck, we assume . Equations (73)–(75) are used to derive the next state when the present state and control are given. Since is not considered, (74) is discarded.
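The kinematic equations themselves are not legible in this text. As a hedged illustration only, the sketch below uses the approximate truck kinematics commonly quoted for this benchmark in the fuzzy control literature. Whether (73)–(75) match it term by term is an assumption, as are the names `x`, `y`, `phi_deg`, `theta_deg` and the truck length `b = 4`.

```python
import math

def truck_step(x, y, phi_deg, theta_deg, b=4.0):
    """Advance the truck one unit step backward.

    x, y      : position of the rear of the truck
    phi_deg   : truck angle with the horizontal axis, in degrees
    theta_deg : steering angle, in degrees
    b         : truck length (assumed value)
    """
    phi = math.radians(phi_deg)
    theta = math.radians(theta_deg)
    x_next = x + math.cos(phi + theta) + math.sin(theta) * math.sin(phi)
    y_next = y + math.sin(phi + theta) - math.sin(theta) * math.cos(phi)
    phi_next = phi - math.asin(2.0 * math.sin(theta) / b)
    return x_next, y_next, math.degrees(phi_next)
```

With zero steering angle the truck moves one unit along its current heading and keeps its angle, which is a quick sanity check on the update. When the vertical coordinate is ignored, as stated above, only the first and third updates are needed.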

In this application, we compare the performances by using two different FNNs, i.e., singleton type-1 FNN (T1FNN) and interval T2FNN. We use a single rate to identify the spread of uncertain means of interval Gaussian MF by its deviation ratio in (69), i.e., and ; then all parameters of interval T2FNN can be obtained. The weighting factors and can also be set randomly in [0, 1].

Figs. 11(a) and (b) show the two antecedents' initial type-1 MFs of the T1FNN, where means small level, CE means center, and means big level. By using the spread rate to extend the type-1 MFs, we obtain the two corresponding antecedents' initial type-2 MFs of the interval T2FNN shown in Figs. 11(c) and (d). For steering angle , we let the T's be its fuzzy MFs. The centers of T1, T2, T3, T4, T5, T6, and T7 are , 0°, 7°, 20°, and 40°, respectively.

Each FNN system has 35 rules, which come from seven MFs in the first antecedent by five MFs in the second antecedent. We use 32 bits to form the chromosome: 16 bits for the spread rate and 16 bits for the optimal learning rate. The chromosome will be mapped into real values in the ranges of and , respectively. The mutation rate and crossover rate are defined as in Example 3. To guarantee the performance of the control process, the settling time (or settling steps) is also taken as a control performance criterion (CPI) for deriving the optimal spread and learning rates in the GA searching process. The simulation results show that a smaller settling time (or fewer settling steps) and a smaller squared error lead to better performance in this interval type-2 FNN case. The best settling steps from the two initial states (0, 0°) and (20, 180°) are 19 and 28 steps, respectively.
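The mapping from a 32-bit chromosome to two real values can be sketched as below. This is an illustrative decoding under stated assumptions: the first 16 bits are taken as the spread rate and the last 16 as the learning rate, each gene is read as an unsigned integer and scaled linearly, and the range arguments are placeholders because the actual ranges are not legible in this text.

```python
def decode_chromosome(bits, spread_range, rate_range):
    """Map a 32-bit chromosome to (spread rate, learning rate).

    bits         : list of 32 zeros and ones
    spread_range : (low, high) for the spread rate (hypothetical values)
    rate_range   : (low, high) for the learning rate (hypothetical values)
    """
    if len(bits) != 32:
        raise ValueError("expected a 32-bit chromosome")

    def to_real(gene, low, high):
        # Read the gene as an unsigned integer, then scale into [low, high].
        value = int("".join(str(b) for b in gene), 2)
        return low + (high - low) * value / (2 ** len(gene) - 1)

    spread = to_real(bits[:16], *spread_range)
    rate = to_real(bits[16:], *rate_range)
    return spread, rate
```

With 16 bits per gene, each parameter is resolved to one part in 65535 of its range, which sets the granularity of the GA search.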

Given the same training pairs, we also train the type-1 FNN with dynamic optimal learning for the consequent part. For comparison purposes, the following TSK model in [27] and [28], which has more trainable parameters in the consequent part, is also tuned so that its performance can be compared with T1FNN and T2FNN:

IF is and is THEN (76)

where . The TSK model can, in fact, be treated as a T1FNN (T1FNN-TSK) and trained with the optimal learning algorithm [3].

Each case in this application is run for five iterations. Fig. 12(a) shows the performance comparison among T1FNN, T1FNN-TSK, and interval T2FNN. Fig. 12(b) shows the total steps for the truck to arrive at the target position from ten different initial conditions, and the first five trajectories are plotted in Fig. 13. Table IV shows the number of design parameters for these three models. The total squared errors of all cases are shown in Table V. From Figs. 12 and 13, the performance of the interval T2FNN is better than that of the T1FNN: it not only takes fewer steps to reach the target position, but also produces smoother trajectories. However, because it includes more parameters, the T1FNN-TSK still performs better than the T1FNN in Fig. 12.

The TSK model was also similarly defined as Type-3 fuzzy reasoning in the adaptive-network-based fuzzy inference system (ANFIS) in [29]. The T2FNN with dynamic optimal training and more trainable parameters can achieve the desired performance in fewer iterations, whereas ANFIS with least squares estimation needs more iterations in its proposed model [29]. It is evident that the dynamic optimal training of Theorem 1 for the T2FNN, and that for the T1FNN in [3], yields faster convergence.

Fig. 12. (a) Total squared errors in T1FNN and T2FNN. (b) Total steps to arrive at the target position in ten different cases.

Fig. 13. (a) Truck trajectories via T1FNN. (b) Truck trajectories via interval T2FNN, all from different initial conditions 1–5.

TABLE IV

NUMBER OF DESIGN PARAMETERS FOR DIFFERENT MODELS, WHERE n = 2, M = 35

TABLE V

TOTAL SQUARED ERROR J FOR FIVE ITERATIONS

Example 5: Nonlinear System Identification (Second-Order System): The plant to be identified is described by the following second-order difference equation:

(77)

where

A series-parallel FNN identifier [25] described by the following equation

(78)

will be adopted, where is the form of (10) with two fuzzy variables and . Training data of 500 pairs are generated from the plant model, assuming a random input signal uniformly distributed in .

The data are used to build the fuzzy model for . We follow (69) to allocate type-1 MFs for these two variables and as , and . Then we extend the above MFs to type-2 interval MFs by using the spread rate, whose optimal value is found through Algorithm 3. The mutation rate and crossover rate are defined the same as in Example 3. The initial value of is 1.8803. Fig. 14(a) shows the performance comparison between T1FNN and interval T2FNN. Faster convergence is again obtained via the interval T2FNN. After the training process is finished, the FNN model is tested by applying a sinusoidal input signal . Fig. 14(b) shows the outputs of both the FNN model and the actual model. The total squared error J using 120 testing data items is 0.0020. This example shows that excellent results are obtained via the interval T2FNN. Table VI shows the total squared error for ten iterations.

Fig. 14. (a) Total squared error versus iterations in T1FNN (dashed line) and T2FNN (solid line). (b) Outputs of the plant y (solid line) and the identification model ŷ (dashed line).

TABLE VI

TOTAL SQUARED ERROR J FOR TEN ITERATIONS
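The series-parallel arrangement used in this example feeds the measured plant outputs, rather than the model's own past outputs, into the identifier. The sketch below shows how training pairs could be generated under that arrangement. Everything specific in it is an assumption: the additive-input structure, the stand-in nonlinearity `plant_g` (a benchmark form often used with this identifier, not taken from (77)), the zero initial conditions, and the input range passed by the caller.

```python
import random

def plant_g(y_k, y_km1):
    """Stand-in plant nonlinearity (assumed, not taken from (77))."""
    return y_k * y_km1 * (y_k + 2.5) / (1.0 + y_k ** 2 + y_km1 ** 2)

def make_training_pairs(n_pairs, u_low, u_high, seed=0):
    """Simulate y(k+1) = g(y(k), y(k-1)) + u(k) with random input u(k).

    Returns a list of ((y(k), y(k-1)), target) pairs, where the target
    y(k+1) - u(k) is what the fuzzy model is asked to reproduce.
    """
    rng = random.Random(seed)
    y_km1, y_k = 0.0, 0.0
    pairs = []
    for _ in range(n_pairs):
        u = rng.uniform(u_low, u_high)
        y_next = plant_g(y_k, y_km1) + u
        pairs.append(((y_k, y_km1), y_next - u))
        y_km1, y_k = y_k, y_next
    return pairs

def identifier_step(f_hat, y_k, y_km1, u_k):
    """Series-parallel prediction: model output plus the known input."""
    return f_hat(y_k, y_km1) + u_k
```

A call such as `make_training_pairs(500, -2.0, 2.0)` would produce 500 pairs; the bounds shown here are placeholders for the input range stated in the example.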

VI. CONCLUSION

The interval type-2 FLS with type reduction was extended with an interval neural network to construct an interval T2FNN in this paper. The consequent part of this interval T2FNN is also an interval neural network. The dynamical optimal training for this interval neural network is developed to guarantee maximum error reduction during the training process. A multi-input, multi-output FNN model is adopted to illustrate all the properties of this T2FNN with dynamical optimal training in its consequent part. This dynamical optimal training algorithm can be combined with a GA-based approach to find the optimal spread rate and learning rate for the antecedent part. The optimal weighting factors in the consequent part of this T2FNN can be found directly from the dynamical optimal training algorithm with global searching. This interval T2FNN with the dynamic optimal learning algorithm is applied to the truck backing-up control system and to nonlinear system identification. All the simulation results obtained with the interval T2FNN show better performance than those obtained with the T1FNN.

REFERENCES

[1] C. H. Wang, W. Y. Wang, T. T. Lee, and P. S. Tseng, “Fuzzy B-spline membership function (BMF) and its applications in fuzzy-neural control,” IEEE Trans. Syst., Man, Cybern., vol. 25, pp. 841–851, May 1995.
[2] X. H. Yu et al., “Dynamic learning rate optimization of the back propagation algorithm,” IEEE Trans. Neural Networks, vol. 6, pp. 669–677, May 1995.
[3] C. H. Wang, H. L. Liu, and C. T. Lin, “Dynamic optimal learning rate of a certain class of fuzzy neural networks and its applications with genetic algorithm,” IEEE Trans. Syst., Man, Cybern. B, vol. 31, pp. 467–475, June 2001.
[4] R. R. Yager, “Fuzzy subsets of type II in decisions,” J. Cybern., vol. 10, pp. 137–159, 1980.
[5] N. N. Karnik and J. M. Mendel, “Applications of type-2 fuzzy logic systems: handling the uncertainty associated with surveys,” in Proc. IEEE FUZZ Conf., Seoul, Korea, Aug. 1999, pp. 1546–1551.
[6] J. M. Mendel, Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Englewood Cliffs, NJ: Prentice-Hall, 2001.
[7] S. Auephanwiriyakul, A. Adrian, and J. M. Keller, “Type 2 fuzzy set analysis in management surveys,” in Proc. FUZZ-IEEE Conf., May 2002, pp. 1321–1325.
[8] N. N. Karnik and J. M. Mendel, “Applications of type-2 fuzzy logic systems to forecasting of time-series,” Inform. Sci., vol. 120, pp. 89–111, 1999.
[9] N. N. Karnik, J. M. Mendel, and Q. Liang, “Type-2 fuzzy logic systems,” IEEE Trans. Fuzzy Syst., vol. 7, pp. 643–658, Dec. 1999.
[10] Q. Liang and J. M. Mendel, “Equalization of nonlinear time-varying channels using type-2 fuzzy adaptive filters,” IEEE Trans. Fuzzy Syst., vol. 8, pp. 551–563, Oct. 2000.
[11] K. C. Wu, “Fuzzy interval control of mobile robots,” Comput. Elect. Eng., vol. 22, no. 3, pp. 211–229, 1996.
[12] R. I. John, P. R. Innocent, and M. R. Barnes, “Neuro-fuzzy clustering of radiographic tibia images using type 2 fuzzy sets,” Inform. Sci., vol. 125, pp. 65–82, 2000.
[13] S. Park and H. Lee-Kwang, “A designing method for type-2 fuzzy logic systems using genetic algorithms,” in Proc. Joint 9th IFSA World Congress 20th NAFIPS Int. Conf., Vancouver, BC, Canada, July 2001, pp. 2567–2572.
[14] C. T. Lin and C. S. G. Lee, Neural Fuzzy System. Englewood Cliffs, NJ: Prentice-Hall, 1996.
[15] L. A. Zadeh, “The concept of a linguistic variable and its application to approximate reasoning—I,” Inform. Sci., vol. 8, pp. 199–249, 1975.
[16] N. N. Karnik and J. M. Mendel, “Operations on type-2 fuzzy sets,” Fuzzy Sets Syst., vol. 122, pp. 327–348, 2001.
[17] N. N. Karnik and J. M. Mendel, “Centroid of a type-2 fuzzy set,” Inform. Sci., vol. 132, pp. 195–220, 2001.
[18] N. N. Karnik and J. M. Mendel, “Introduction to type-2 fuzzy logic system,” in Proc. IEEE FUZZ Conf., Anchorage, AK, May 1998, pp. 915–920.
[19] J. M. Mendel and R. I. B. John, “Type-2 fuzzy sets made simple,” IEEE Trans. Fuzzy Syst., vol. 10, pp. 117–127, Apr. 2002.
[20] Q. Liang and J. M. Mendel, “Interval type-2 fuzzy logic systems: Theory and design,” IEEE Trans. Fuzzy Syst., vol. 8, pp. 535–550, Oct. 2000.
[21] N. N. Karnik and J. M. Mendel, “Type-2 fuzzy logic systems: Type-reduction,” in Proc. IEEE Syst., Man, Cybern. Conf., San Diego, CA, Oct. 1998, pp. 2046–2051.
[22] S. I. Grossman, Multivariable Calculus, Linear Algebra, and Differential Equations. Orlando, FL: Academic, 1986.
[23] J. M. Mendel, “Uncertainty, fuzzy logic, and signal processing,” Signal Process., vol. 80, pp. 913–933, 2000.
[24] C.-C. Hsu et al., “Digital redesign of continuous systems which improved suitability using genetic algorithms,” Electron. Lett., vol. 33, no. 15, pp. 1345–1347, July 1997.
[25] L. X. Wang, Adaptive Fuzzy Systems and Control: Design and Stability Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1994.
[26] B. Kosko, Neural Network and Fuzzy System. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[27] T. Takagi and M. Sugeno, “Fuzzy identification of systems and its application to modeling and control,” IEEE Trans. Syst., Man, Cybern., vol. SMC-15, pp. 116–132, 1985.
[28] M. Sugeno and G. T. Kang, “Structure identification of fuzzy model,” Fuzzy Sets Syst., vol. 28, pp. 15–33, 1988.
[29] J.-S. R. Jang, “ANFIS: Adaptive-network-based fuzzy inference system,” IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 665–684, May/June 1993.

Chi-Hsu Wang (M’92–SM’93) was born in Tainan, Taiwan, R.O.C., in 1954. He received the B.S. degree in control engineering from National Chiao-Tung University (NCTU), Hsinchu, the M.S. degree in computer science from the National Tsing-Hua University, Hsinchu, and the Ph.D. degree in electrical and computer engineering from the University of Wisconsin, Madison, in 1976, 1978, and 1986, respectively.

He was appointed Associate Professor in 1986, and Professor in 1990, in the Department of Electrical Engineering, National Taiwan University of Science and Technology, Taiwan. He is currently Professor in the Department of Electrical and Control Engineering, NCTU. His current research interests and publications are in the areas of digital control, fuzzy-neural-network, intelligent control, adaptive control, and robotics.

Dr. Wang is currently Associate Editor of IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART B and Webmaster for the IEEE Systems, Man, and Cybernetics Society.

Chun-Sheng Cheng was born in Tainan, Taiwan, R.O.C., in 1957. He received the B.S. degree in communication engineering from National Chiao-Tung University, Hsinchu, Taiwan, and the M.S. degree in microelectronic engineering from Griffith University, Brisbane, Australia, in 1980 and 2003, respectively.

He was a Process Control System Engineer at China Steel Corporation, Kaohsiung, Taiwan, from 1983 to 1992. He is currently a Developer for the e-commerce data base, point of sales system, and duty free system in Data Control Pty. Ltd., Brisbane. His current research interests and publications are in the areas of type-2 fuzzy-neural-network, and adaptive control.

Tsu-Tian Lee (M’87–SM’89–F’97) was born in Taipei, Taiwan, R.O.C., in 1949. He received the B.S. degree in control engineering from the National Chiao-Tung University (NCTU), Hsinchu, Taiwan, in 1970, and the M.S., and Ph.D. degrees in electrical engineering from the University of Oklahoma, Norman, in 1972 and 1975, respectively.

In 1975, he was appointed Associate Professor and in 1978 Professor and Chairman of the Department of Control Engineering at NCTU. In 1981, he became Professor and Director of the Institute of Control Engineering, NCTU. In 1986, he was a Visiting Professor and in 1987, a Full Professor of electrical engineering at the University of Kentucky, Lexington. In 1990, he was a Professor and Chairman of the Department of Electrical Engineering, National Taiwan University of Science and Technology (NTUST). In 1998, he became the Professor and Dean of the Office of Research and Development, NTUST. Since 2000, he has been with the Department of Electrical and Control Engineering, NCTU, where he is now a Chair Professor. Since 2004, he has been with the National Taipei University of Technology (NTUT), where he is now President. He has published more than 200 refereed journal and conference papers in the areas of automatic control, robotics, fuzzy systems, and neural networks. His current research involves motion planning, fuzzy and neural control, optimal control theory and application, and walking machines.

Prof. Lee received the Distinguished Research Award from the National Science Council, R.O.C., in 1991–1998, the Academic Achievement Award in Engineering and Applied Science from the Ministry of Education, R.O.C., in 1997, the National Endow Chair from the Ministry of Education, R.O.C., in 2003, and the TECO Science and Technology Award from TECO Technology Foundation in 2003. He was elected to the grade of IEE Fellow in 2000. He became a Fellow of New York Academy of Sciences (NYAS) in 2002. His professional activities include serving on the Advisory Board of Division of Engineering and Applied Science, National Science Council, serving as the Program Director, Automatic Control Research Program, National Science Council, and serving as an Advisor of Ministry of Education, Taiwan, and numerous consulting positions. He has been actively involved in many IEEE activities. He has served as Member of Technical Program Committee and Member of Advisory Committee for many IEEE sponsored international conferences. He is now the Vice President of Membership, a member of the Board of Governors, and the Newsletter Editor for the IEEE Systems, Man, and Cybernetics Society.