A Recurrent Self-Evolving Interval Type-2 Fuzzy Neural Network for Dynamic System Processing


Chia-Feng Juang, Senior Member, IEEE, Ren-Bo Huang, and Yang-Yin Lin

Abstract—This paper proposes a recurrent self-evolving interval type-2 fuzzy neural network (RSEIT2FNN) for dynamic system processing. An RSEIT2FNN incorporates type-2 fuzzy sets in a recurrent neural fuzzy system in order to increase the noise resistance of a system. The antecedent parts in each recurrent fuzzy rule in the RSEIT2FNN are interval type-2 fuzzy sets, and the consequent part is of the Takagi–Sugeno–Kang (TSK) type with interval weights. The antecedent part of RSEIT2FNN forms a local internal feedback loop by feeding the rule firing strength of each rule back to itself. The TSK-type consequent part is a linear model of exogenous inputs. The RSEIT2FNN initially contains no rules; all rules are learned online via structure and parameter learning. The structure learning uses online type-2 fuzzy clustering. For the parameter learning, the consequent part parameters are tuned by a rule-ordered Kalman filter algorithm to improve learning performance. The antecedent type-2 fuzzy sets and internal feedback loop weights are learned by a gradient descent algorithm. The RSEIT2FNN is applied to simulations of dynamic system identifications and chaotic signal prediction under both noise-free and noisy conditions. Comparisons with type-1 recurrent fuzzy neural networks validate the performance of the RSEIT2FNN.

Index Terms—Dynamic system identification, online fuzzy clustering, recurrent fuzzy neural networks (RFNNs), recurrent fuzzy systems, type-2 fuzzy systems.

Manuscript received June 30, 2008; revised October 20, 2008, January 12, 2009, and March 20, 2009; accepted March 25, 2009. First published May 2, 2009; current version published October 8, 2009. This work was supported by the Ministry of Education, Taiwan, under the Aiming for Top University plan.

C.-F. Juang and R.-B. Huang are with the Department of Electrical Engineering, National Chung-Hsing University, Taichung 402, Taiwan (e-mail: [email protected]; [email protected]).

Y.-Y. Lin is with the Department of Electrical Engineering, National Chiao-Tung University, Hsinchu 300, Taiwan (e-mail: daviddavid715288@yahoo.com.tw).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TFUZZ.2009.2021953

I. INTRODUCTION

The topologies of recurrent networks include feedback loops, which are used to memorize past information. In contrast with pure feedforward architectures, which exhibit static input–output behavior, recurrent networks are able to store information from the past (e.g., prior system states) and are, thus, more appropriate for the analysis of dynamic systems. Some recurrent fuzzy neural networks (RFNNs) have already been proposed [1]–[9] to deal with temporal characteristic problems, and have been shown to outperform feedforward FNNs and recurrent NNs. One category of RFNNs uses feedback loops from the network output(s) as a recurrence structure [1]–[3]. The authors in [1] and [3] proposed an output RFNN where the output values are fed back as input values. A recurrent neural fuzzy network is proposed in [2], where the consequent of a rule is a reduced linear model in autoregressive form with exogenous inputs. Another category of RFNNs uses feedback loops from internal state variables as its recurrence structure [4]–[9]. The recurrence property in studies [4] and [5] is achieved by feeding the output of each membership function (MF) back to itself, and therefore, each membership value is influenced by and only by its past values. Recurrent self-organizing neural fuzzy inference networks (RSONFINs) [6] and Takagi–Sugeno–Kang (TSK) type recurrent fuzzy networks (TRFN) [7], [9] use a global feedback structure, where the firing strengths of each rule are summed and fed back as internal network inputs.

All of the aforementioned RFNNs use type-1 fuzzy sets. In recent years, studies on type-2 fuzzy logic systems (FLSs) have drawn much attention [10]–[14]. Type-2 FLSs are extensions of type-1 FLSs, where the membership value of a type-2 fuzzy set is a type-1 fuzzy number. Type-2 FLSs appear to be a more promising method than their type-1 counterparts in handling problems with uncertainties, and have already been successfully applied in several situations [15]–[18]. Some interval type-2 FNNs have been proposed for the automatic design of interval type-2 FLSs [19]–[23]. A gradient descent algorithm was proposed for interval type-2 FNN learning in [19]–[22]. In the center-of-sets type reduction process, the consequent values in the interval type-2 FLS are rearranged in ascending order to compute the interval outputs using the Karnik–Mendel iterative procedure [19]–[23]. During the parameter learning process, the consequent values change, and their ascending orders and corresponding fuzzy rule orders should change accordingly. The parameter learning equations in [19], [20], [22], and [23] did not explicitly address this fuzzy rule reordering problem. Parameter learning equations that considered the fuzzy rule reordering problem using a gradient descent algorithm were proposed in [21]. This paper proposes a rule-ordered Kalman filter algorithm for recurrent type-2 FNN consequent part parameter learning. The algorithm derives detailed learning equations, taking into account the rule-reordering problem in network output computation.

This paper proposes a recurrent type-2 FNN, i.e., the recurrent self-evolving interval type-2 FNN (RSEIT2FNN), for dynamic system processing. The self-evolving property means that the RSEIT2FNN can automatically evolve its network structure and parameters according to training data, i.e., the task of preassigning network structure (rule numbers and initial fuzzy set shapes) is no longer necessary. The major contributions of the RSEIT2FNN are twofold. The first contribution is the proposal of a novel RFNN structure with the introduction of type-2 fuzzy sets. Current studies on type-2 FNNs focus only on feedforward network structures to handle static input–output mapping problems. In the RSEIT2FNN structure, local feedback loops in the antecedent part are formed by feeding the rule firing strength of each rule back to itself, and the consequent part is a combination of current and lagged network inputs. The second contribution is the proposal of novel structure and parameter learning algorithms to improve the network learning performance. The aforementioned type-2 FNNs learn only parameters, where the structures are all fixed and must be assigned in advance. The RSEIT2FNN learns both the structure and parameters concurrently and online. All of the rules in an RSEIT2FNN are generated online. The consequent parameters in the RSEIT2FNN are learned by a rule-ordered Kalman filter algorithm to improve learning performance. The antecedent part parameters and rule feedback weights are learned by a gradient descent learning algorithm. Several simulations are conducted to verify RSEIT2FNN performance. These simulations also compare the RSEIT2FNN with recurrent type-1 FNNs, feedforward type-1 FNNs, and other type-2 FNNs.

1063-6706/$26.00 © 2009 IEEE

The rest of this paper is organized as follows. Section II introduces the RSEIT2FNN structure. Section III introduces the structure and parameter learning methods in an RSEIT2FNN. Section IV simulates three examples on dynamic systems identification and chaotic series prediction. Finally, Section V draws conclusions.

II. RSEIT2FNN STRUCTURE

This section introduces the structure of an RSEIT2FNN. Suppose that the dynamic system to be processed is a multi-input–multioutput (MIMO) system that consists of $n_u$ control inputs and $n_o$ outputs and that the control input and dynamic system output vectors are denoted by $u = (u_1, \ldots, u_{n_u})$ and $y_p = (y_{p1}, \ldots, y_{pn_o})$, respectively, where $n_u$ and $n_o$ denote the input and output dimensions, respectively. Fig. 1 shows the proposed MIMO six-layered RSEIT2FNN structure. The consequent of each recurrent fuzzy rule is of first-order Takagi–Sugeno–Kang (TSK) type and executes a linear function. The detailed mathematical functions of each layer are introduced next.

1) Layer 1 (Input Layer): The inputs are crisp values. Only the current states $x(t) = (u(t), y_p(t))$ are fed as inputs to this layer, in contrast to feedforward FNNs, where both current and past states are fed as inputs to the input layer. To unify the input range, each node in this layer scales its input to lie within the range $[-1, 1]$. Note that there are no weights to be adjusted in this layer.

2) Layer 2 (MF Layer): Each node in this layer defines an interval type-2 MF. For the $i$th interval type-2 fuzzy set $\tilde{A}^i_j$ in input variable $x_j$, $j = 1, \ldots, n_u + n_o$, two types of MFs are studied. For the first type, we use a Gaussian primary MF having a fixed standard deviation (STD) $\sigma^i_j$ and an uncertain mean that takes on values in $[m^i_{j1}, m^i_{j2}]$ [see Fig. 2(a)], i.e.,

$$\mu_{\tilde{A}^i_j} = \exp\left\{-\frac{1}{2}\left(\frac{x_j - m^i_j}{\sigma^i_j}\right)^2\right\} \equiv N(m^i_j, \sigma^i_j; x_j), \quad m^i_j \in [m^i_{j1}, m^i_{j2}]. \tag{1}$$

Fig. 1. Structure of the RSEIT2FNN, where each node in layer 4 forms an internal feedback loop and each node in layer 5 functions as a linear combination of current and lagged network inputs (denoted as xz).

Fig. 2. Interval type-2 fuzzy set. (a) Uncertain mean. (b) Uncertain STD.

An RSEIT2FNN using this type of MF is called RSEIT2FNN-UM. The footprint of uncertainty (FOU) of this MF can be represented as a bounded interval in terms of its upper MF $\bar{\mu}^i_j$ and lower MF $\underline{\mu}^i_j$, where

$$\bar{\mu}^i_j(x_j) = \begin{cases} N(m^i_{j1}, \sigma^i_j; x_j), & x_j < m^i_{j1} \\ 1, & m^i_{j1} \le x_j \le m^i_{j2} \\ N(m^i_{j2}, \sigma^i_j; x_j), & x_j > m^i_{j2} \end{cases} \tag{2}$$

and

$$\underline{\mu}^i_j(x_j) = \begin{cases} N(m^i_{j2}, \sigma^i_j; x_j), & x_j \le \dfrac{m^i_{j1} + m^i_{j2}}{2} \\ N(m^i_{j1}, \sigma^i_j; x_j), & x_j > \dfrac{m^i_{j1} + m^i_{j2}}{2} \end{cases} \tag{3}$$

i.e., the output $\mu_{\tilde{A}^i_j}$ of each node can be represented as an interval $[\underline{\mu}^i_j, \bar{\mu}^i_j]$. For the second type, we use a Gaussian primary MF having a fixed mean $m^i_j$ and an uncertain STD that takes on values in $[\sigma^i_{j1}, \sigma^i_{j2}]$. An RSEIT2FNN using this type of MF is called RSEIT2FNN-UD. The membership value is $\mu_{\tilde{A}^i_j} = [\underline{\mu}^i_j, \bar{\mu}^i_j]$, where

$$\bar{\mu}^i_j(x_j) = N(m^i_j, \sigma^i_{j2}; x_j) \tag{4}$$

and

$$\underline{\mu}^i_j(x_j) = N(m^i_j, \sigma^i_{j1}; x_j). \tag{5}$$

The two types of interval type-2 MFs mentioned before are also widely used in many previous studies [19]–[25].
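As an illustration of the two MF types, the following minimal Python sketch evaluates the lower and upper membership values in (2)–(5) for a single input. The function names (`gaussian`, `mf_uncertain_mean`, `mf_uncertain_std`) are hypothetical and chosen only for this example; the sketch assumes $m_{j1} \le m_{j2}$ and $\sigma_{j1} \le \sigma_{j2}$.

```python
import math

def gaussian(m, sigma, x):
    """Gaussian primary MF N(m, sigma; x) as in (1)."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)

def mf_uncertain_mean(x, m1, m2, sigma):
    """(lower, upper) membership for an uncertain mean in [m1, m2], per (2)-(3)."""
    if x < m1:
        upper = gaussian(m1, sigma, x)
    elif x <= m2:
        upper = 1.0
    else:
        upper = gaussian(m2, sigma, x)
    # The lower MF follows the Gaussian centered on the farther of the two means.
    if x <= 0.5 * (m1 + m2):
        lower = gaussian(m2, sigma, x)
    else:
        lower = gaussian(m1, sigma, x)
    return lower, upper

def mf_uncertain_std(x, m, sigma1, sigma2):
    """(lower, upper) membership for an uncertain STD in [sigma1, sigma2], per (4)-(5)."""
    return gaussian(m, sigma1, x), gaussian(m, sigma2, x)

if __name__ == "__main__":
    print(mf_uncertain_mean(0.3, -0.1, 0.1, 0.3))
    print(mf_uncertain_std(0.3, 0.0, 0.3, 0.4))
```

In both cases the returned pair bounds the FOU at the given input, so the lower value never exceeds the upper value.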

3) Layer 3 (Spatial Firing Layer): Each node in this layer corresponds to one fuzzy rule and functions as a spatial rule node. Each node performs a fuzzy meet operation on inputs from layer 2 using an algebraic product operation to obtain a spatial firing strength $F^i$. The spatial firing strength is an interval type-1 fuzzy set and is computed as follows [26]:

$$F^i = [\underline{f}^i, \bar{f}^i], \quad i = 1, \ldots, M \tag{6}$$

where

$$\bar{f}^i = \prod_{j=1}^{n_u + n_o} \bar{\mu}^i_j, \qquad \underline{f}^i = \prod_{j=1}^{n_u + n_o} \underline{\mu}^i_j \tag{7}$$

and $M$ is the total number of rules.

4) Layer 4 (Temporal Firing Layer): Each node in this layer is a recurrent rule node that forms an internal feedback loop. The output of a recurrent rule node is a temporal firing strength that depends not only on the current spatial firing strength but also on the previous temporal firing strength. The temporal firing strength $\Psi^i_q(t) = [\underline{\psi}^i_q(t), \bar{\psi}^i_q(t)]$, $i = 1, \ldots, M$ and $q = 1, \ldots, n_o$, is an interval given as a linear combination of the spatial firing strength $F^i(t)$ and the last temporal firing strength $\Psi^i_q(t-1)$ by the equation

$$\Psi^i_q(t) = \lambda^i_q \cdot F^i(t) + (1 - \lambda^i_q)\,\Psi^i_q(t-1) \tag{8}$$

where $\lambda^i_q$ is a feedback weight and $0 \le \lambda^i_q \le 1$. Equation (8) may be written as

$$[\underline{\psi}^i_q(t), \bar{\psi}^i_q(t)] = \lambda^i_q\,[\underline{f}^i(t), \bar{f}^i(t)] + (1 - \lambda^i_q)\,[\underline{\psi}^i_q(t-1), \bar{\psi}^i_q(t-1)] \tag{9}$$

where

$$\bar{\psi}^i_q(t) = \lambda^i_q \bar{f}^i(t) + (1 - \lambda^i_q)\,\bar{\psi}^i_q(t-1) \tag{10}$$

and

$$\underline{\psi}^i_q(t) = \lambda^i_q \underline{f}^i(t) + (1 - \lambda^i_q)\,\underline{\psi}^i_q(t-1). \tag{11}$$

5) Layer 5 (Consequent Layer): Each node in this layer is called a consequent node and functions as a linear model with exogenous inputs and time-delay synapses. Each recurrent rule node in layer 4 has a corresponding consequent node in layer 5. The linear model is a linear combination of the current input states $x(t) = (u(t), y_p(t)) = (u_1(t), \ldots, u_{n_u}(t), y_{p1}(t), \ldots, y_{pn_o}(t))$, together with their lagged values. The output $\tilde{y}^i_q(t+1)$, $i = 1, \ldots, M$ and $q = 1, \ldots, n_o$, of the $i$th consequent node connecting to the $q$th network output variable is computed as follows:

$$\tilde{y}^i_q(t+1) = \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} \tilde{a}^i_{jkq}\, u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} \tilde{a}^i_{(j+n_u)kq}\, y_{pj}(t-k) \tag{12}$$

where $u_0(t) \triangleq 1$ and $N_0 \triangleq 0$, $N_j$ and $O_j$ are the maximum lag numbers of the control input $u_j(t)$ and the system output $y_{pj}(t)$, respectively, and $\tilde{a}^i_{jkq}$ are interval sets denoted by

$$\tilde{a}^i_{jkq} = [c^i_{jkq} - s^i_{jkq},\; c^i_{jkq} + s^i_{jkq}] \tag{13}$$

where $c^i_{jkq}$ and $s^i_{jkq}$ denote the center and spread, respectively, of the interval. The inclusion of the lagged values of $u(t)$ and $y_p(t)$ in the linear consequent part instead of the antecedent part simplifies the network computation process for dynamic system processing, especially when interval type-2 fuzzy sets are used. The output $\tilde{y}^i_q(t+1)$ is an interval type-1 set, which is denoted by $[\tilde{y}^i_{lq}, \tilde{y}^i_{rq}]$, where the indices $l$ and $r$ denote left and right limits, respectively. According to (12) and (13), the node output is given as

$$\tilde{y}^i_q = [\tilde{y}^i_{lq}, \tilde{y}^i_{rq}] = \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} [c^i_{jkq} - s^i_{jkq},\; c^i_{jkq} + s^i_{jkq}]\, u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} [c^i_{(j+n_u)kq} - s^i_{(j+n_u)kq},\; c^i_{(j+n_u)kq} + s^i_{(j+n_u)kq}]\, y_{pj}(t-k) \tag{14}$$

i.e.,

$$\tilde{y}^i_{lq} = \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} c^i_{jkq}\, u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} c^i_{(j+n_u)kq}\, y_{pj}(t-k) - \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} s^i_{jkq}\, |u_j(t-k)| - \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} s^i_{(j+n_u)kq}\, |y_{pj}(t-k)| \tag{15}$$

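and

$$\tilde{y}^i_{rq} = \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} c^i_{jkq}\, u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} c^i_{(j+n_u)kq}\, y_{pj}(t-k) + \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} s^i_{jkq}\, |u_j(t-k)| + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} s^i_{(j+n_u)kq}\, |y_{pj}(t-k)|. \tag{16}$$

6) Layer 6 (Output Layer): Each node in this layer corresponds to one output variable. The $q$th output layer node computes the network output variable $y_q$ using type reduction followed by defuzzification operations. In the type reduction, the type-reduced set is an interval type-1 fuzzy set $[y_{lq}, y_{rq}]$. The outputs $y_{lq}$ and $y_{rq}$ can be computed using the Karnik–Mendel iterative procedure [11]. In this procedure, the consequent parameters are reordered in ascending order. Let $\tilde{y}_{lq} = (\tilde{y}^1_{lq}, \ldots, \tilde{y}^M_{lq})$ and $\tilde{y}_{rq} = (\tilde{y}^1_{rq}, \ldots, \tilde{y}^M_{rq})$ denote the original rule-ordered consequent values, and let $\hat{y}_{lq} = (\hat{y}^1_{lq}, \ldots, \hat{y}^M_{lq})$ and $\hat{y}_{rq} = (\hat{y}^1_{rq}, \ldots, \hat{y}^M_{rq})$ denote the reordered sequences, where $\hat{y}^1_{lq} \le \hat{y}^2_{lq} \le \cdots \le \hat{y}^M_{lq}$ and $\hat{y}^1_{rq} \le \hat{y}^2_{rq} \le \cdots \le \hat{y}^M_{rq}$. The relationship between $\tilde{y}_{lq}$, $\tilde{y}_{rq}$, $\hat{y}_{lq}$, and $\hat{y}_{rq}$ is

$$\hat{y}_{lq} = Q_l \tilde{y}_{lq} \quad \text{and} \quad \hat{y}_{rq} = Q_r \tilde{y}_{rq} \tag{17}$$

where $Q_l$ and $Q_r$ are $M \times M$ permutation matrices with elementary vectors (i.e., vectors all of whose elements are zero except one element that is equal to one) as columns, and these vectors are arranged (permuted) to move elements in $\tilde{y}_{lq}$ and $\tilde{y}_{rq}$ to new locations in ascending order in the transformed vectors $\hat{y}_{lq}$ and $\hat{y}_{rq}$, respectively. The original rule firing strength orders $\underline{\psi} = (\underline{\psi}^1_q, \underline{\psi}^2_q, \ldots, \underline{\psi}^M_q)^T$ and $\bar{\psi} = (\bar{\psi}^1_q, \bar{\psi}^2_q, \ldots, \bar{\psi}^M_q)^T$ are reordered accordingly. To compute the output $y_{lq}$, the new rule orders for $\underline{\psi}$ and $\bar{\psi}$ are $Q_l \underline{\psi}$ and $Q_l \bar{\psi}$, respectively. To compute the output $y_{rq}$, the new rule orders for $\underline{\psi}$ and $\bar{\psi}$ are $Q_r \underline{\psi}$ and $Q_r \bar{\psi}$, respectively. According to [21], the output $y_{lq}$ can be computed as

$$y_{lq} = \frac{\sum_{i=1}^{L} (Q_l \bar{\psi})_i\, \hat{y}^i_{lq} + \sum_{i=L+1}^{M} (Q_l \underline{\psi})_i\, \hat{y}^i_{lq}}{\sum_{i=1}^{L} (Q_l \bar{\psi})_i + \sum_{i=L+1}^{M} (Q_l \underline{\psi})_i} = \frac{\bar{\psi}^T Q_l^T E_1^T E_1 Q_l \tilde{y}_{lq} + \underline{\psi}^T Q_l^T E_2^T E_2 Q_l \tilde{y}_{lq}}{p_l^T Q_l \bar{\psi} + g_l^T Q_l \underline{\psi}} \tag{18}$$

where $L$ and $R$ denote the left and right crossover points, respectively,

$$p_l = (\underbrace{1, 1, \ldots, 1}_{L}, 0, \ldots, 0)^T \in \mathbb{R}^{M \times 1}, \qquad g_l = (0, \ldots, 0, \underbrace{1, \ldots, 1}_{M-L})^T \in \mathbb{R}^{M \times 1} \tag{19}$$

$$E_1 = (e_1, e_2, \ldots, e_L, 0, \ldots, 0) \in \mathbb{R}^{L \times M}, \qquad E_2 = (0, \ldots, 0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{M-L}) \in \mathbb{R}^{(M-L) \times M} \tag{20}$$

and where $e_i \in \mathbb{R}^{L \times 1}$ and $\varepsilon_i \in \mathbb{R}^{(M-L) \times 1}$ are elementary vectors. Similarly, the output $y_{rq}$ can be computed as

$$y_{rq} = \frac{\sum_{i=1}^{R} (Q_r \underline{\psi})_i\, \hat{y}^i_{rq} + \sum_{i=R+1}^{M} (Q_r \bar{\psi})_i\, \hat{y}^i_{rq}}{\sum_{i=1}^{R} (Q_r \underline{\psi})_i + \sum_{i=R+1}^{M} (Q_r \bar{\psi})_i} = \frac{\underline{\psi}^T Q_r^T E_3^T E_3 Q_r \tilde{y}_{rq} + \bar{\psi}^T Q_r^T E_4^T E_4 Q_r \tilde{y}_{rq}}{p_r^T Q_r \underline{\psi} + g_r^T Q_r \bar{\psi}} \tag{21}$$

where

$$p_r = (\underbrace{1, 1, \ldots, 1}_{R}, 0, \ldots, 0)^T \in \mathbb{R}^{M \times 1}, \qquad g_r = (0, \ldots, 0, \underbrace{1, \ldots, 1}_{M-R})^T \in \mathbb{R}^{M \times 1} \tag{22}$$

$$E_3 = (e_1, e_2, \ldots, e_R, 0, \ldots, 0) \in \mathbb{R}^{R \times M}, \qquad E_4 = (0, \ldots, 0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{M-R}) \in \mathbb{R}^{(M-R) \times M} \tag{23}$$

and where $e_i \in \mathbb{R}^{R \times 1}$ and $\varepsilon_i \in \mathbb{R}^{(M-R) \times 1}$ are elementary vectors. In contrast to the prior studies [19], [20], [22], [23], the outputs $y_{lq}$ and $y_{rq}$ in (18) and (21) are expressed in the original rule-ordered format. This expression is helpful in deriving the proposed parameter learning algorithm discussed in Section III-B. Finally, the defuzzification operation defuzzifies the interval set $[y_{lq}, y_{rq}]$ by computing the average of $y_{lq}$ and $y_{rq}$. Hence, the defuzzified output for network output variable $y_q$ is

$$y_q = \frac{y_{lq} + y_{rq}}{2}. \tag{24}$$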
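The computations in layers 4 to 6 can be sketched for a single output as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the index sort stands in for the permutation matrices $Q_l$ and $Q_r$, the Karnik–Mendel loop is the standard iteration started from the midpoint firing strengths, and all firing strengths are assumed to be strictly positive so that the denominators do not vanish.

```python
def temporal_firing(lam, f, psi_prev):
    """Layer 4, (10)-(11): f and psi_prev are (lower, upper) pairs."""
    return (lam * f[0] + (1.0 - lam) * psi_prev[0],
            lam * f[1] + (1.0 - lam) * psi_prev[1])

def consequent_interval(c, s, z):
    """Layer 5, (15)-(16): z lists the regressors 1, u_j(t-k), y_pj(t-k)."""
    center = sum(ci * zi for ci, zi in zip(c, z))
    spread = sum(si * abs(zi) for si, zi in zip(s, z))
    return center - spread, center + spread

def km_endpoint(y, psi_lo, psi_up, left):
    """Karnik-Mendel iteration for y_lq (left=True) or y_rq (left=False)."""
    order = sorted(range(len(y)), key=lambda i: y[i])   # role of Q_l or Q_r in (17)
    ys = [y[i] for i in order]
    lo = [psi_lo[i] for i in order]
    up = [psi_up[i] for i in order]
    if len(ys) == 1:
        return ys[0]
    w = [0.5 * (a + b) for a, b in zip(lo, up)]
    point = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    for _ in range(len(ys) + 2):
        # Crossover index: rules 0..k lie at or below the current estimate.
        k = sum(1 for v in ys if v <= point) - 1
        k = min(max(k, 0), len(ys) - 2)
        if left:
            w = up[:k + 1] + lo[k + 1:]    # upper strengths first, as in (18)
        else:
            w = lo[:k + 1] + up[k + 1:]    # lower strengths first, as in (21)
        new_point = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
        if new_point == point:
            break
        point = new_point
    return point

def network_output(y_left, y_right, psi_lo, psi_up):
    """Layer 6: type reduction followed by the average in (24)."""
    y_l = km_endpoint(y_left, psi_lo, psi_up, left=True)
    y_r = km_endpoint(y_right, psi_lo, psi_up, left=False)
    return y_l, y_r, 0.5 * (y_l + y_r)
```

The inputs to `km_endpoint` stay in the original rule order; only the local sorted copies are permuted, which mirrors the rule-ordered expressions in (18) and (21).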

III. RSEIT2FNN LEARNING

The RSEIT2FNN evolves all of its composing recurrent type-2 fuzzy rules by simultaneous structure and parameter learning. The following sections introduce more details on the structure and parameter learning algorithms.

A. Structure Learning

The structure learning algorithm is responsible for online rule generation. A previous study [28] used the rule firing strength as a criterion for type-1 fuzzy rule generation. This idea is extended to type-2 fuzzy rule generation criteria in an RSEIT2FNN. The spatial firing strength $F^i$ in (6) is used as the criterion to determine whether a rule should be generated. Since this firing strength is an interval, its center is

$$f^i_c = \frac{1}{2}\left(\bar{f}^i + \underline{f}^i\right). \tag{25}$$

The spatial firing strength center then serves as a rule generation criterion.

Structure learning of an RSEIT2FNN-UM is introduced as follows. For the first incoming piece of data $x$, a new rule is generated, with the uncertain mean and width of each new type-2 fuzzy set assigned by

$$[m^1_{j1}, m^1_{j2}] = [x_j - 0.1,\; x_j + 0.1] \quad \text{and} \quad \sigma^1_j = \sigma_{\text{fixed}} \tag{26}$$

where $\sigma_{\text{fixed}}$ is a predefined constant ($\sigma_{\text{fixed}} = 0.3$ in this paper) that determines the fuzzy set width. For each subsequent piece of incoming data $x(t)$, find

$$I = \arg\max_{1 \le i \le M(t)} f^i_c(x) \tag{27}$$

where $M(t)$ is the number of existing rules at time $t$. If $f^I_c(x) \le f_{th}$, for some prespecified threshold $f_{th} \in (0, 1)$, then a new rule is generated. Once a new rule is generated, the initial uncertain means and widths of the corresponding new type-2 fuzzy sets are assigned as

$$[m^{M(t)+1}_{j1}, m^{M(t)+1}_{j2}] = [x_j(t) - 0.1,\; x_j(t) + 0.1] \tag{28}$$

and

$$\sigma^{M(t)+1}_j = \beta \left[\sum_{j=1}^{n_u + n_o} \left(x_j - \frac{m^I_{j1} + m^I_{j2}}{2}\right)^2\right]^{0.5}. \tag{29}$$

The network inputs are scaled to lie in the range $[-1, 1]$ in layer 1. Equations (26) and (28) set an initial uncertain mean range of 0.2 according to this input range. If the uncertain mean range is too small, then the initial type-2 fuzzy set becomes too close to a type-1 fuzzy set. In contrast, if this range is too large, then the uncertain mean covers most of the input range. Equation (29) indicates that the initial width is equal to the Euclidean distance between the current input data point $x$ and the averaged mean of its nearest cluster, multiplied by an overlapping parameter $\beta$. In this paper, we set $\beta = 0.5$ so that the width of the new fuzzy set is half the Euclidean distance, and a suitable overlapping between clusters is generated.
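The rule generation test for the RSEIT2FNN-UM can be sketched as below. This is a minimal illustration, assuming that each rule is stored as a list of `(m1, m2, sigma)` triples, one per input variable; the names `firing_center` and `learn_structure` are hypothetical, and the default values of `f_th`, `beta`, and `sigma_fixed` are the ones quoted in the text.

```python
import math

def gaussian(m, sigma, x):
    """Gaussian primary MF N(m, sigma; x)."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)

def firing_center(x, rule):
    """Center (25) of the spatial firing interval (6)-(7) for one rule."""
    lower = upper = 1.0
    for xj, (m1, m2, sigma) in zip(x, rule):
        if xj < m1:
            up = gaussian(m1, sigma, xj)
        elif xj <= m2:
            up = 1.0
        else:
            up = gaussian(m2, sigma, xj)
        lo = gaussian(m2 if xj <= 0.5 * (m1 + m2) else m1, sigma, xj)
        lower *= lo
        upper *= up
    return 0.5 * (lower + upper)

def learn_structure(x, rules, f_th=0.05, beta=0.5, sigma_fixed=0.3):
    """Append a rule to `rules` when coverage is poor; return True if one was added."""
    if not rules:
        rules.append([(xj - 0.1, xj + 0.1, sigma_fixed) for xj in x])      # (26)
        return True
    centers = [firing_center(x, rule) for rule in rules]
    best = max(range(len(rules)), key=lambda i: centers[i])                # I in (27)
    if centers[best] > f_th:
        return False
    dist = math.sqrt(sum((xj - 0.5 * (m1 + m2)) ** 2
                         for xj, (m1, m2, _) in zip(x, rules[best])))
    width = beta * dist                                                    # (29)
    rules.append([(xj - 0.1, xj + 0.1, width) for xj in x])                # (28)
    return True
```

Calling `learn_structure` once per incoming sample, starting from an empty list, grows the rule base only when no existing rule fires with a center above `f_th`.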

For structure learning of an RSEIT2FNN-UD, the learning process is similar to that of the RSEIT2FNN-UM, except that (26), (28), and (29) are slightly modified. Equation (26) is changed to be

$$m^1_j = x_j \quad \text{and} \quad [\sigma^1_{j1}, \sigma^1_{j2}] = [\sigma_{\text{fixed}},\; \sigma_{\text{fixed}} + 0.1], \quad j = 1, \ldots, n_u + n_o. \tag{30}$$

Equations (28) and (29) are changed to be

$$m^{M(t)+1}_j = x_j(t) \tag{31}$$

and

$$[\sigma^{M(t)+1}_{j1}, \sigma^{M(t)+1}_{j2}] = \left[\beta \left(\sum_{j=1}^{n_u + n_o} (x_j - m^I_j)^2\right)^{0.5},\; \sigma^{M(t)+1}_{j1} + 0.1\right] \tag{32}$$

where the setting of $\sigma^{M(t)+1}_{j1}$ is similar to (29), and $\beta$ is also set to 0.5. Like (28), the initial value of $\sigma^{M(t)+1}_{j2}$ is assigned to generate a suitable FOU.

Previous studies [24], [25] have shown that different uncertain mean (STD) ranges generate different FOU areas and may influence the final results. In these studies, a constant uncertain range is manually selected in advance, and all MFs share the same range. In RSEIT2FNN, different MFs use different uncertain mean (STD) ranges, which are all automatically tuned using the following MF parameter learning algorithm.

B. Parameter Learning

The parameter learning phase occurs concurrently with the structure learning phase. For each piece of incoming data, all the free RSEIT2FNN parameters are tuned, whether the rules are newly generated or originally existent. For clarity, consider only the $q$th network output. The objective of the parameter learning process is to minimize the error function

$$E = \frac{1}{2}\left[y_q(t+1) - y_d(t+1)\right]^2. \tag{33}$$

Here, $y_q(t+1)$ and $y_d(t+1)$ denote the RSEIT2FNN and desired outputs, respectively. The Karnik–Mendel iterative procedure for computing $y_{lq}$ and $y_{rq}$ has the premise that $\tilde{y}^i_{lq}$ and $\tilde{y}^i_{rq}$ have been rearranged in ascending order. During the parameter learning process, the consequent values $\tilde{y}^i_{lq}$ and $\tilde{y}^i_{rq}$ change, and their corresponding rule orders change accordingly. To update the parameters, it is necessary to know the precise locations of specific antecedent and consequent parameters, and this is very difficult to ascertain when the rule orders are different at each learning time step. The proposed rule-ordered Kalman filtering algorithm addresses this problem by keeping the original rule order during the parameter learning process. This is achieved by mapping the consequent values in ascending order with respect to the original rule order [see (17)]. According to this mapping, (18) and (21) are expressed by $\tilde{y}_{lq}$ and $\tilde{y}_{rq}$, i.e., with the consequent values arranged in the original rule order, despite their changes during the parameter learning process. Based on this original rule-ordered expression, the rule-ordered Kalman filtering algorithm is derived as follows. Equations (18) and (21) can be reexpressed as

$$y_{lq} = \phi_{lq}^T \tilde{y}_{lq}, \qquad \phi_{lq}^T = \frac{\bar{\psi}^T Q_l^T E_1^T E_1 Q_l + \underline{\psi}^T Q_l^T E_2^T E_2 Q_l}{p_l^T Q_l \bar{\psi} + g_l^T Q_l \underline{\psi}}, \qquad \phi_{lq} \in \mathbb{R}^{M \times 1} \tag{34}$$

and

$$y_{rq} = \phi_{rq}^T \tilde{y}_{rq}, \qquad \phi_{rq}^T = \frac{\underline{\psi}^T Q_r^T E_3^T E_3 Q_r + \bar{\psi}^T Q_r^T E_4^T E_4 Q_r}{p_r^T Q_r \underline{\psi} + g_r^T Q_r \bar{\psi}}, \qquad \phi_{rq} \in \mathbb{R}^{M \times 1} \tag{35}$$

respectively. Thus, the output $y_q$ in (24) can be reexpressed as

$$y_q = \frac{1}{2}(y_{lq} + y_{rq}) = \frac{1}{2}\left(\phi_{lq}^T \tilde{y}_{lq} + \phi_{rq}^T \tilde{y}_{rq}\right) = [\bar{\phi}_{lq}^T \;\; \bar{\phi}_{rq}^T] \begin{bmatrix} \tilde{y}_{lq} \\ \tilde{y}_{rq} \end{bmatrix} = [\bar{\phi}^1_{lq} \cdots \bar{\phi}^M_{lq} \;\; \bar{\phi}^1_{rq} \cdots \bar{\phi}^M_{rq}] \begin{bmatrix} \tilde{y}^1_{lq} \\ \vdots \\ \tilde{y}^M_{lq} \\ \tilde{y}^1_{rq} \\ \vdots \\ \tilde{y}^M_{rq} \end{bmatrix} \tag{36}$$

where $\bar{\phi}_{lq}^T = 0.5\,\phi_{lq}^T$ and $\bar{\phi}_{rq}^T = 0.5\,\phi_{rq}^T$. According to (15) and (16), (36) can be further expressed as in (37).

Since RSEIT2FNN rules are generated online, the dimension of $\tilde{y}_{lq}$ in (37) increases with time, and the positions of $c^i_{jkq}$ and $s^i_{jkq}$ in the same vector change accordingly. To keep the positions of $c^i_{jkq}$ and $s^i_{jkq}$ constant in the vector, the vector components in (37) are rearranged in rule order in the proposed rule-ordered Kalman filtering algorithm. Let $\tilde{y}_{TSK} \in \mathbb{R}^{2M\left(\sum_{j=0}^{n_u}(N_j+1) + \sum_{j=1}^{n_o}(O_j+1)\right) \times 1}$ denote all of the consequent parameters $c^i_{jkq}$ and $s^i_{jkq}$, i.e.,

$$\tilde{y}_{TSK} = [c^1_{00q} \cdots c^1_{(n_o+n_u)O_{n_o}q} \;\; s^1_{00q} \cdots s^1_{(n_o+n_u)O_{n_o}q} \;\cdots\; c^M_{00q} \cdots c^M_{(n_o+n_u)O_{n_o}q} \;\; s^M_{00q} \cdots s^M_{(n_o+n_u)O_{n_o}q}]^T \tag{38}$$

where the parameters are positioned according to the rule order so that their positions remain constant as the rule numbers increase during the structure learning process. Equation (37) can then be expressed as

$$y_q = [\bar{\phi}^1_{cq} u_0 \cdots \bar{\phi}^1_{cq}\, y_{pn_o}(t - O_{n_o}) \;\; -\bar{\phi}^1_{sq} |u_0| \cdots -\bar{\phi}^1_{sq} |y_{pn_o}(t - O_{n_o})| \;\cdots\; \bar{\phi}^M_{cq} u_0 \cdots \bar{\phi}^M_{cq}\, y_{pn_o}(t - O_{n_o}) \;\; -\bar{\phi}^M_{sq} |u_0| \cdots -\bar{\phi}^M_{sq} |y_{pn_o}(t - O_{n_o})|]\, \tilde{y}_{TSK} = \phi_{TSK}^T \tilde{y}_{TSK} \tag{39}$$

where $\bar{\phi}^j_{cq} = \bar{\phi}^j_{lq} + \bar{\phi}^j_{rq}$ and $\bar{\phi}^j_{sq} = \bar{\phi}^j_{lq} - \bar{\phi}^j_{rq}$, $j = 1, \ldots, M$, with $\bar{\phi}^j_{lq} = 0.5\,\phi^j_{lq}$ and $\bar{\phi}^j_{rq} = 0.5\,\phi^j_{rq}$, so that the spread terms in (39) carry the coefficient $\bar{\phi}^j_{rq} - \bar{\phi}^j_{lq}$ implied by (15) and (16). The consequent parameter vector $\tilde{y}_{TSK}$ is updated by executing the following rule-ordered Kalman filtering algorithm:

$$\begin{aligned} \tilde{y}_{TSK}(t+1) &= \tilde{y}_{TSK}(t) + S(t+1)\,\phi_{TSK}(t+1)\left(y_d(t+1) - \phi_{TSK}^T(t+1)\,\tilde{y}_{TSK}(t)\right) \\ S(t+1) &= \frac{1}{\lambda}\left[S(t) - \frac{S(t)\,\phi_{TSK}(t+1)\,\phi_{TSK}^T(t+1)\,S(t)}{\lambda + \phi_{TSK}^T(t+1)\,S(t)\,\phi_{TSK}(t+1)}\right] \end{aligned} \tag{40}$$

where $0 < \lambda \le 1$ is a forgetting factor ($\lambda = 0.9995$ in this paper). The dimensions of the vectors $\tilde{y}_{TSK}$ and $\phi_{TSK}$ and the matrix $S$ increase when a new rule evolves. When a new rule evolves, RSEIT2FNN augments $S(t)$ as in (41), where $C$ is a large positive constant, and the size of the identity matrix $I$ is $2\left(\sum_{j=0}^{n_u}(N_j+1) + \sum_{j=1}^{n_o}(O_j+1)\right) \times 2\left(\sum_{j=0}^{n_u}(N_j+1) + \sum_{j=1}^{n_o}(O_j+1)\right)$.
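The update in (40) has the form of recursive least squares with a forgetting factor, and the augmentation of $S(t)$ appends a scaled identity block for the parameters of a new rule. The following pure-Python sketch illustrates both steps under stated assumptions: the function names are hypothetical, matrices are stored as nested lists, the default value of `C` is only an example of a large constant, and initializing the new parameters to zero is an assumption of this sketch rather than something stated in the text.

```python
def rls_update(theta, S, phi, y_d, lam=0.9995):
    """One step of (40): returns the updated parameter vector and matrix S."""
    n = len(theta)
    S_phi = [sum(S[i][j] * phi[j] for j in range(n)) for i in range(n)]   # S(t) phi
    phi_S = [sum(phi[i] * S[i][j] for i in range(n)) for j in range(n)]   # phi^T S(t)
    denom = lam + sum(phi[i] * S_phi[i] for i in range(n))
    S_next = [[(S[i][j] - S_phi[i] * phi_S[j] / denom) / lam
               for j in range(n)] for i in range(n)]
    error = y_d - sum(p * th for p, th in zip(phi, theta))                # uses theta(t)
    gain = [sum(S_next[i][j] * phi[j] for j in range(n)) for i in range(n)]
    theta_next = [th + g * error for th, g in zip(theta, gain)]
    return theta_next, S_next

def augment_for_new_rule(theta, S, n_new, C=1.0e4):
    """Grow theta and S by n_new entries, using block diag[S, C*I] for S."""
    n = len(theta)
    theta_new = list(theta) + [0.0] * n_new      # zero start values: an assumption
    S_new = [list(row) + [0.0] * n_new for row in S]
    for i in range(n_new):
        S_new.append([0.0] * n + [C if j == i else 0.0 for j in range(n_new)])
    return theta_new, S_new

if __name__ == "__main__":
    theta, S = augment_for_new_rule([], [], 2)
    theta, S = rls_update(theta, S, [1.0, 0.5], 1.5)
    print(theta)
```

Because existing entries keep their positions when a rule is added, the rows and columns of `S` that belong to earlier rules are left untouched by the augmentation.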

The RSEIT2FNN antecedent parameters are tuned by a gradient descent algorithm. Detailed learning equations can be found in the Appendix. Finally, it should be emphasized that after the parameter update at each time step, the rule consequent values $\tilde{y}_{lq}$ and $\tilde{y}_{rq}$ change. Therefore, the two permutation matrices $Q_l$ and $Q_r$ in (17) should change accordingly at each time step.

$$y_q = [\bar{\phi}_{lq}^T \;\; \bar{\phi}_{rq}^T] \begin{bmatrix} \tilde{y}_{lq} \\ \tilde{y}_{rq} \end{bmatrix} = [\bar{\phi}^1_{lq} \cdots \bar{\phi}^M_{lq} \;\; \bar{\phi}^1_{rq} \cdots \bar{\phi}^M_{rq}] \begin{bmatrix} \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} c^1_{jkq} u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} c^1_{(j+n_u)kq} y_{pj}(t-k) - \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} s^1_{jkq} |u_j(t-k)| - \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} s^1_{(j+n_u)kq} |y_{pj}(t-k)| \\ \vdots \\ \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} c^M_{jkq} u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} c^M_{(j+n_u)kq} y_{pj}(t-k) - \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} s^M_{jkq} |u_j(t-k)| - \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} s^M_{(j+n_u)kq} |y_{pj}(t-k)| \\ \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} c^1_{jkq} u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} c^1_{(j+n_u)kq} y_{pj}(t-k) + \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} s^1_{jkq} |u_j(t-k)| + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} s^1_{(j+n_u)kq} |y_{pj}(t-k)| \\ \vdots \\ \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} c^M_{jkq} u_j(t-k) + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} c^M_{(j+n_u)kq} y_{pj}(t-k) + \sum_{j=0}^{n_u} \sum_{k=0}^{N_j} s^M_{jkq} |u_j(t-k)| + \sum_{j=1}^{n_o} \sum_{k=0}^{O_j} s^M_{(j+n_u)kq} |y_{pj}(t-k)| \end{bmatrix}. \tag{37}$$

$$S(t) = \text{block diag}[S(t) \;\; CI] \in \mathbb{R}^{2(M+1)\left(\sum_{j=0}^{n_u}(N_j+1) + \sum_{j=1}^{n_o}(O_j+1)\right) \times 2(M+1)\left(\sum_{j=0}^{n_u}(N_j+1) + \sum_{j=1}^{n_o}(O_j+1)\right)} \tag{41}$$

IV. SIMULATIONS

This section describes four examples of RSEIT2FNN simulations. These examples include single-input–single-output (SISO) dynamic system identification (Example 1), chaotic series prediction (Example 2), MIMO dynamic system identification (Example 3), and real-time series prediction (Example 4). The performance of an RSEIT2FNN is compared with recurrent type-1 FNNs and feedforward type-1 and type-2 FNNs in these examples.

TABLE I

PERFORMANCE OF RSEIT2FNN AND OTHER FEEDFORWARD AND RECURRENT MODELS FOR SISO PLANT IDENTIFICATION IN EXAMPLE 1

A. Example 1 (SISO Dynamic System Identification)

This example uses the RSEIT2FNN to identify an SISO linear time-varying system, which is a problem that was introduced in [7]. The dynamic system with input delays is guided by the difference equation

$$y_{p1}(t+1) = 0.72\, y_{p1}(t) + 0.025\, y_{p1}(t-1)\, u_1(t-1) + 0.01\, u_1^2(t-2) + 0.2\, u_1(t-3). \tag{42}$$

This system has a single input ($n_u = 1$) and a single output ($n_o = 1$), and therefore, the two current variables $u_1(t)$ and $y_{p1}(t)$ are fed as inputs to the RSEIT2FNN input layer. The current output of the plant depends on three previous inputs and one previous output. Therefore, the lag numbers $N_1$ and $O_1$ in the RSEIT2FNN consequent part are set to 3 and 1, respectively. The training procedure for the RSEIT2FNN uses the plant output $y_{p1}(t+1)$ as the desired output $y_d(t+1)$. In training the RSEIT2FNN, we use only ten epochs, each of which contains 900 time steps. As in [7], the input is an independently and identically distributed (i.i.d.) uniformly random sequence over $[-2, 2]$ for about half of the 900 time steps and a sinusoid given by $1.05 \sin(\pi k/45)$ for the remaining time. There is no repetition in these 900 training data, i.e., we have different training sets for each epoch. This type of training is similar to an online training process, where there are a total number of 9000 online training time steps. There is no repeated training for the training dataset obtained in each time step. The structure learning threshold $f_{th}$ determines the number of fuzzy rules to be generated.

For RSEIT2FNN-UM, two rules are generated when $f_{th}$ is set to 0.05. Table I shows the root-mean-squared error (RMSE) of the training data. To see the identification result, the following input used in [7] is also adopted for the test:

$$u_1(t) = \begin{cases} \sin\left(\dfrac{\pi t}{25}\right), & t < 250 \\ 1.0, & 250 \le t < 500 \\ -1.0, & 500 \le t < 750 \\ 0.3 \sin\left(\dfrac{\pi t}{25}\right) + 0.1 \sin\left(\dfrac{\pi t}{32}\right) + 0.6 \sin\left(\dfrac{\pi t}{10}\right), & 750 \le t < 1000. \end{cases} \tag{43}$$

Fig. 3 shows the outputs of the plant and the RSEIT2FNN-UM for these test inputs. Fig. 4 shows the test error $y_1(t+1) - y_{p1}(t+1)$ between the outputs of the RSEIT2FNN-UM and the actual plant. Table I shows the network size, and training and test RMSEs of RSEIT2FNN-UM. Table I also shows the performance of an RSEIT2FNN-UD with the same network size as the RSEIT2FNN-UM. The results show that these two networks have similar performance.

Fig. 3. Outputs of the dynamic plant (solid line) and RSEIT2FNN-UM (dotted line) in Example 1.

Fig. 4. Test errors between the RSEIT2FNN-UM and actual plant outputs.

The performance of RSEIT2FNN is compared with that of TSK-type feedforward type-1 and type-2 FNNs, a recurrent NN, and recurrent type-1 FNNs. The compared feedforward type-1 FNN is a self-constructing neural fuzzy inference network (SONFIN) [29], which is a powerful network with both structure and parameter learning. As in an RSEIT2FNN, the consequent part of a SONFIN is also trained using the Kalman filter algorithm. The feedforward type-2 FNN for comparison is the interval type-2 FNN [21], where all the network parameters are learned using the steepest descent algorithm. In the original interval type-2 FNN, the network structure is fixed and assigned in advance. Since an RSEIT2FNN uses structure learning for network design, the proposed structure learning in Section III-A is also incorporated in the interval type-2 FNN in this and the following examples. Comparisons are based on a similar network size, i.e., the total number of parameters in a feedforward type-1 FNN is similar to that in a feedforward interval type-2 FNN. The number of rule parameters in an interval type-2 FNN is larger than that in a type-1 FNN due to the use of additional free parameters in type-2 fuzzy sets and rule consequent part. Therefore, the total number of rules in a feedforward type-1 FNN is set to be larger than that in an interval type-2 FNN, as shown in Table I. The recurrent NN used for comparison is Elman's recurrent NN (ERNN) [30], which is applied to the same problem in [7]. The recurrent type-1 FNNs include the RFNN [4], the wavelet-based RFNN (WRFNN) [5], and the TSK-type recurrent fuzzy network with supervised learning (TRFN-S) [7]. All these networks use the same training data, test data, and number of training epochs as the RSEIT2FNN. Table I shows the number of rules, the total number of network parameters, and the training and test errors of these compared networks. Among all the recurrent type-1 networks being compared, the TRFN-S achieves the minimum test error. The results show that the RSEIT2FNN achieves smaller training and test errors than the other feedforward and recurrent networks.

TABLE II

INFLUENCE OF $f_{th}$ ON THE PERFORMANCE OF AN RSEIT2FNN-UM WITH $\beta = 0.5$

We now analyze the practical computational cost of constructing an RSEIT2FNN. Since the functions in an RSEIT2FNN-UM are similar to those in an RSEIT2FNN-UD, only the former network is studied. All simulations are performed on an Intel 3.0 GHz dual CPU, and the programs are written in Visual C++. The total learning time of the RSEIT2FNN-UM mentioned before is 0.344 s. We compare the computational costs for training the two representative networks SONFIN and TRFN-S for feedforward and recurrent type-1 FNNs. The SONFIN and TRFN-S take 0.891 and 4.407 s, respectively. The results show that the RSEIT2FNN-UM takes less training time than these two networks.

We consider the influence of the values of fth and β on the RSEIT2FNN-UM performance. The threshold fth decides the number of rules in the RSEIT2FNN-UM. Table II shows the RSEIT2FNN-UM performance for different values of fth when β = 0.5. Larger values of fth generate larger numbers of rules and improve the learning performance of the network in general. However, when the value of fth is too large, the performance saturates. One reason is that it is easier to get stuck in a local optimum when training a larger network. Another reason is that a larger network requires a larger number of training epochs for convergence. For a constant value of fth, smaller values of β generate larger numbers of rules because of the smaller initial type-2 fuzzy set width. Table III shows the RSEIT2FNN-UM performance for different values of β when fth = 0.05. The influence of rule number on network performance is similar to that discussed before. Table III shows that for a constant rule number of 2, the network performance is insensitive to variations in β. The general rule for selecting fth and β is first to set a constant value of β (e.g., β = 0.5 in this paper) for the given problem and then to select fth based on a compromise between network size and performance.

TABLE III

INFLUENCE OF β ON THE PERFORMANCE OF AN RSEIT2FNN-UM WITH fth = 0.05

This example also studies the RSEIT2FNN test performance when the measured plant output yp1 contains noise. The test also uses the control input sequence in (43). The added noise is artificially generated white Gaussian noise with three different STDs of 0.1, 0.5, and 0.7. There are 30 Monte Carlo realizations for statistical analysis. Table IV shows the statistical mean and STD (denoted as mean ± STD) of the test RMSEs of RSEIT2FNN-UM and RSEIT2FNN-UD for different noise levels. The results show that these two networks have similar performance. The primary MFs in both RSEIT2FNN-UM and RSEIT2FNN-UD are of Gaussian type. The numbers of additional flexible parameters provided by the FOU in each MF of both networks are identical and are tuned by the same parameter-learning algorithm. It is therefore reasonable that these two networks have similar performance in Tables I and IV.
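The noisy-test protocol described above can be sketched as follows. This is only a minimal illustration, not the authors' Visual C++ implementation. The callable `predict` is a hypothetical stand-in for any trained one-step-ahead model, and two points are assumptions rather than statements from the paper: the noise is added to the measured output that is fed to the model, and the error is scored against the noise-free next output.

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error between two equally sized arrays."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def monte_carlo_rmse(predict, y_clean, noise_std, n_runs=30, seed=0):
    """Mean and STD of the test RMSE over n_runs noisy realizations.

    predict -- hypothetical callable mapping measured outputs y(t)
               to predictions of y(t + 1)
    y_clean -- noise-free plant output sequence
    """
    rng = np.random.default_rng(seed)
    y_clean = np.asarray(y_clean, float)
    errors = []
    for _ in range(n_runs):
        # White Gaussian measurement noise with the requested STD.
        noise = rng.normal(0.0, noise_std, size=y_clean.shape)
        y_measured = y_clean + noise
        pred = predict(y_measured[:-1])
        # Assumption: the target is the noise-free next output.
        errors.append(rmse(pred, y_clean[1:]))
    return float(np.mean(errors)), float(np.std(errors))
```

Calling `monte_carlo_rmse(model, y, 0.5)` for each of the three noise levels would give one mean ± STD entry per level, in the style of Table IV.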

The performance of feedforward type-1 and type-2 FNNs and TRFN-S for the same noisy test patterns is compared with that of the RSEIT2FNN. Though the performance of different recurrent models is studied in Table I, this example only compares RSEIT2FNN with TRFN-S. The reason is that TRFN-S achieves the minimum test RMSE among the compared recurrent models in Table I. Table IV shows the average RMSEs of the feedforward and recurrent models for different noise levels. For the two feedforward FNNs, the test errors of the type-2 FNN are smaller than those of the type-1 FNN for different noise levels. The results show that the test errors of RSEIT2FNN are smaller than those of the other models tested for all of the test noise levels.


TABLE IV

PERFORMANCE OF RSEIT2FNN AND OTHER FEEDFORWARD AND RECURRENT MODELS WITH DIFFERENT NOISE LEVELS IN EXAMPLE 1

Fig. 5. Results of the phase plot for the chaotic system () and RSEIT2FNN-UM (×).

B. Example 2 (Chaotic Series Prediction)

The chaotic system is described by

$$y_{p1}(t+1) = -P\,y_{p1}^{2}(t) + Q\,y_{p1}(t-1) + 1.0. \tag{44}$$

The study [31] shows that the system produces a chaotic strange attractor when the parameters P and Q are set to 1.4 and 0.3, respectively. The system has no control input (nu = 0) and a single output (no = 1); therefore, only the state yp1(t) is fed as input to the RSEIT2FNN input layer. The system is of second order with one delay, so the lag number O1 in the RSEIT2FNN consequent part is set to one. The desired output is yd(t + 1) = yp1(t + 1). Two thousand patterns are generated from the initial state [yp1(1), yp1(0)] = [0.4, 0.4], where the first 1000 patterns are used for training, and the remaining 1000 patterns are used for testing. In RSEIT2FNN-UM training, the structure learning threshold fth is set to 0.3. After 90 epochs of training, six rules are generated. Fig. 5 shows the phase plane of the actual and RSEIT2FNN-UM prediction results for the test patterns. Table V shows the structure and training and test RMSEs of RSEIT2FNN-UM. The performance of an RSEIT2FNN-UD with the same network size is also shown in Table V. Like the results in Example 1, Table V shows that RSEIT2FNN-UM and RSEIT2FNN-UD have similar performance. As in Example 1, the performance of the RSEIT2FNN is compared with that of feedforward type-1 and type-2 FNNs [21], [29] and recurrent type-1 FNNs, including RFNN [4], WRFNN [5], and TRFN-S [7]. These compared networks use the same number of training epochs and training and test data as in the RSEIT2FNN. Table V shows the numbers of rules and network parameters and training and test RMSEs of these compared networks. The results show that the RSEIT2FNN achieves better performance than other networks.
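The data generation for this example can be sketched as follows: a minimal illustration that iterates (44) with P = 1.4 and Q = 0.3 from [yp1(1), yp1(0)] = [0.4, 0.4] and splits the resulting 2000 patterns into training and test halves. The function name `henon_series` and the array layout are choices made here for illustration and are not taken from the paper.

```python
import numpy as np

def henon_series(n_patterns=2000, P=1.4, Q=0.3, y1=0.4, y0=0.4):
    """Iterate (44); y[0] = y_p1(0), y[1] = y_p1(1), ..., y[n_patterns + 1]."""
    y = np.empty(n_patterns + 2)
    y[0], y[1] = y0, y1
    for t in range(1, n_patterns + 1):
        y[t + 1] = -P * y[t] ** 2 + Q * y[t - 1] + 1.0
    return y

y = henon_series()
# Pattern t pairs the network input y_p1(t) with the desired output y_p1(t + 1).
inputs, targets = y[1:-1], y[2:]
train_x, train_y = inputs[:1000], targets[:1000]   # first 1000 patterns
test_x, test_y = inputs[1000:], targets[1000:]     # remaining 1000 patterns
```

Plotting `test_x` against `test_y` gives the phase plane of the actual system, i.e., the quantity that Fig. 5 compares with the network prediction.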

This example also studies the RSEIT2FNN test performance when the measured plant output yp1 contains noise. The added noise is artificially generated white Gaussian noise with STD of 0.3, 0.5, and 0.7. Table VI shows the test RMSEs over 30 Monte Carlo realizations. The results in Table VI show that RSEIT2FNN-UM and RSEIT2FNN-UD have similar performance. For comparison, Table VI also shows the test RMSEs of feedforward type-1 and type-2 FNNs, and TRFN-S. The results show that the RMSE of the RSEIT2FNN is smaller than that of the other networks for all of the test noise levels.

C. Example 3 (MIMO Dynamic System Identification)

The identified MIMO plant is the same as that used in [32]. The plant is described by

$$
\begin{aligned}
y_{p1}(t+1) &= 0.5\left[\frac{y_{p1}(t)}{1+y_{p2}^{2}(t)} + u_{1}(t-1)\right] \\
y_{p2}(t+1) &= 0.5\left[\frac{y_{p1}(t)\,y_{p2}(t)}{1+y_{p2}^{2}(t)} + u_{2}(t-1)\right].
\end{aligned} \tag{45}
$$

This plant has two inputs (nu = 2) and two outputs (no = 2) so that the four current input and output variables u1(t), u2(t), yp1(t), and yp2(t) are fed as inputs to the RSEIT2FNN input layer. The current output of the plant depends on the control inputs with one time-step delay and current plant states. Therefore, the lag numbers N1, N2, O1, and O2 in the RSEIT2FNN consequent part are set to be 1, 1, 0, and 0, respectively. The two desired outputs for the RSEIT2FNN training are yp1(t + 1) and yp2(t + 1). During the training phase, the RSEIT2FNN is trained online from time step t = 1 to t = 11 000. The two control inputs u1(t) and u2(t) are i.i.d. uniformly random sequences over [−1.4, 1.4] from t = 1 to t = 4000 and sinusoid signals given by sin(πt/45) from t = 4001 to t = 11 000. In the RSEIT2FNN-UM training, the structure learning threshold fth is set to 0.07. A larger network is generated when fth is larger than 0.1. The threshold value (0.07) is determined based on a compromise between network size and performance, as discussed in Example 1. After the training, three rules are generated. Table VII shows the structure and RMSE of RSEIT2FNN-UM. To test the identification result, the two control input sequences are as follows:

$$
u_{1}(t) = u_{2}(t) =
\begin{cases}
\sin\left(\dfrac{\pi t}{25}\right), & 1001 \le t < 1250 \\
1.0, & 1250 \le t < 1500 \\
-1.0, & 1500 \le t < 1750 \\
0.3\sin\left(\dfrac{\pi t}{25}\right) + 0.1\sin\left(\dfrac{\pi t}{32}\right) + 0.6\sin\left(\dfrac{\pi t}{10}\right), & 1750 \le t < 2000.
\end{cases} \tag{46}
$$

Fig. 6 shows the test results. Table VII shows the test RMSEs for outputs yp1 and yp2.

TABLE V

PERFORMANCE OF RSEIT2FNN AND OTHER FEEDFORWARD AND RECURRENT MODELS IN EXAMPLE 2

TABLE VI

PERFORMANCE OF RSEIT2FNN AND OTHER FEEDFORWARD AND RECURRENT MODELS WITH DIFFERENT NOISE LEVELS IN EXAMPLE 2

TABLE VII

PERFORMANCE OF RSEIT2FNN AND OTHER RECURRENT MODELS FOR MIMO PLANT IDENTIFICATION IN EXAMPLE 3
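The plant (45) and the training and test input sequences described in this example can be simulated with the short sketch below. Several points are assumptions made here for illustration: the plant is read as 0.5 times the sum of the rational term and the delayed input, the initial conditions and the input before the first step are taken as zero, and the function names (`simulate_plant`, `control_input_46`, `training_inputs`) are hypothetical.

```python
import math
import numpy as np

def simulate_plant(u1, u2):
    """Simulate (45) from zero initial conditions (an assumption).

    u1[k], u2[k] are the control inputs at the k-th simulated step; the
    returned arrays have one extra sample, so y1[k + 1] follows step k.
    """
    n = len(u1)
    y1, y2 = np.zeros(n + 1), np.zeros(n + 1)
    for k in range(n):
        # One-step-delayed inputs; taken as zero before the first step.
        u1_prev = u1[k - 1] if k > 0 else 0.0
        u2_prev = u2[k - 1] if k > 0 else 0.0
        denom = 1.0 + y2[k] ** 2
        y1[k + 1] = 0.5 * (y1[k] / denom + u1_prev)
        y2[k + 1] = 0.5 * (y1[k] * y2[k] / denom + u2_prev)
    return y1, y2

def control_input_46(t):
    """Test control input of (46), defined for 1001 <= t < 2000."""
    if 1001 <= t < 1250:
        return math.sin(math.pi * t / 25)
    if 1250 <= t < 1500:
        return 1.0
    if 1500 <= t < 1750:
        return -1.0
    if 1750 <= t < 2000:
        return (0.3 * math.sin(math.pi * t / 25)
                + 0.1 * math.sin(math.pi * t / 32)
                + 0.6 * math.sin(math.pi * t / 10))
    raise ValueError("test input is only defined for 1001 <= t < 2000")

def training_inputs(n_steps=11000, seed=0):
    """i.i.d. uniform on [-1.4, 1.4] for t <= 4000, then sin(pi t / 45)."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n_steps + 1)
    sinusoid = np.sin(np.pi * t / 45)
    u1 = np.where(t <= 4000, rng.uniform(-1.4, 1.4, n_steps), sinusoid)
    u2 = np.where(t <= 4000, rng.uniform(-1.4, 1.4, n_steps), sinusoid)
    return u1, u2
```

Under this reading of (45), inputs bounded by 1.4 keep |yp1| within 1.4 and |yp2| within 1.05, which is a convenient sanity check on a simulation before it is used to generate training pairs.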

For comparison, Table VII shows the performance of the memory NN (MNN) [32], feedforward type-1 and type-2 FNNs, and recurrent type-1 FNNs studied in Example 2. The MNN is a kind of recurrent NN, and has been applied to the same problem in [32]. These networks use a total number of 11 000 training time steps as in the RSEIT2FNN-UM, except in the case of the MNN, where a total number of 77 000 time steps is used for training in [32]. The results in Table VII show that the RSEIT2FNN-UM achieves smaller RMSEs for both outputs than these feedforward and recurrent networks.

Fig. 6. Outputs of the MIMO plant (solid curve) and RSEIT2FNN-UM (dotted curve) in Example 3. (a) Output yp1. (b) Output yp2.

This example also studies the RSEIT2FNN-UM test performance when the measured plant outputs yp1 and yp2 contain noise. The added noise is artificially generated white Gaussian noise with STD of 0.3, 0.5, and 0.7. Table VIII shows the average test RMSEs of the RSEIT2FNN-UM, feedforward type-1 and type-2 FNNs, and TRFN-S over 30 Monte Carlo realizations. The results show that the RMSEs of RSEIT2FNN-UM are smaller than those of the comparable networks when there is noise.

TABLE VIII

PERFORMANCE OF RSEIT2FNN AND TRFN-S WITH DIFFERENT NOISE LEVELS IN EXAMPLE 3

TABLE IX

PERFORMANCE OF RSEIT2FNN AND DIFFERENT MODELS FOR THE SERIES-E PREDICTION PROBLEM IN EXAMPLE 4

D. Example 4 (Practical Time Series Prediction)

This example studies the performance of an RSEIT2FNN-UM for a real-world series database. The Series-E from the Santa Fe Time Series [33] is used (database Web site: http://www-psych.standford.edu/∼andress/Time-Series/). This series is a set of astrophysical data (variation in light intensity of a star). The objective is to predict the intensity of the star at time t + 1, yp1(t + 1), according to its past intensities. This benchmark series is selected because it is very noisy and discontinuous. According to [34], 2048 observations were collected, of which 90% were used for training and the remaining 10% for testing.

Since the appropriate number of lagged intensities for prediction is unknown in advance for this practical series, the lag number O1 in the RSEIT2FNN-UM is simply set to zero. Only the intensity yp1(t) is fed as input to the RSEIT2FNN-UM input layer, because the system has no external input (nu = 0), i.e., only yp1(t) is fed as input to the RSEIT2FNN-UM for predicting yp1(t + 1), and the past values yp1(t − j) are automatically memorized in the feedback loops. If the appropriate lag number O1 is known in advance, then more past values other than yp1(t) can be included in the RSEIT2FNN-UM consequent part for a better prediction performance. The threshold fth is set to 0.03, and the number of rules is 6 (48 parameters in total) after 100 epochs of training. Table IX shows the test RMSE of the RSEIT2FNN-UM. Fig. 7 shows the actual and RSEIT2FNN-UM-predicted intensities. For comparison, Table IX also shows the test RMSEs of the feedforward type-1 and type-2 FNNs using the same input–output pairs. The numbers of rules (parameters) in the feedforward type-1 and type-2 FNNs are 12 (48) and 7 (49), respectively. The test error of the RSEIT2FNN-UM is smaller than those of these two networks.

A model called pattern modeling and recognition system (PMRS) was proposed in [34] to predict the same series. The prediction results using feedforward NN and statistical exponential smoothing (ES) method are also reported in that study. In these methods, an appropriate number of past intensities should be determined for each model input, which burdens model design effort. For example, the NN uses the past five intensities as network inputs. Table IX shows the test RMSEs of these three models, all of which have larger errors than the RSEIT2FNN-UM and the two feedforward FNNs.

Fig. 7. Prediction results of the RSEIT2FNN-UM for the Series-E problem in Example 4.

V. CONCLUSION

This paper proposes a new recurrent type-2 FNN, i.e., the RSEIT2FNN. In contrast to existing feedforward type-2 FNNs, this network is especially useful for handling problems with temporal properties. For RSEIT2FNN learning, there is no need to determine the RSEIT2FNN structure in advance because the proposed structure learning ability enables the RSEIT2FNN to evolve its structure online. Moreover, the proposed rule-ordered Kalman filter algorithm helps tune the consequent parameters online and improves learning accuracy. Simulation results show that the RSEIT2FNN achieves a better performance than existing recurrent type-1 FNNs in both noise-free and noisy environments. The FOU in the RSEIT2FNN helps handle the numerical uncertainty associated with system inputs and outputs. Therefore, the RSEIT2FNN has the potential to achieve better performance than type-1 fuzzy systems when dealing with noisy data, as demonstrated in the examples given in Section IV. Future studies will theoretically analyze the learning convergence of the RSEIT2FNN and examine possible practical applications of the RSEIT2FNN to temporal problems with noise or uncertainty.

APPENDIX

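This Appendix derives the antecedent parameter learning equations using a gradient descent algorithm. For convenience in notation of the gradient descent results, (18) can be reexpressed according to [21] as

$$y_{lq} = \frac{\overline{\psi}^{T} a_{lq} + \underline{\psi}^{T} b_{lq}}{\overline{\psi}^{T} c_{lq} + \underline{\psi}^{T} d_{lq}} \tag{A1}$$

where

$$a_{lq} = Q_{l}^{T} E_{1}^{T} E_{1} Q_{l}\,\tilde{y}_{lq} \in \mathbb{R}^{M \times 1}, \qquad b_{lq} = Q_{l}^{T} E_{2}^{T} E_{2} Q_{l}\,\tilde{y}_{lq} \in \mathbb{R}^{M \times 1} \tag{A2}$$

$$c_{lq} = Q_{l}^{T} p_{l} \in \mathbb{R}^{M \times 1}, \qquad d_{lq} = Q_{l}^{T} g_{l} \in \mathbb{R}^{M \times 1}. \tag{A3}$$

Similarly, (21) can be reexpressed as

$$y_{rq} = \frac{\underline{\psi}^{T} a_{rq} + \overline{\psi}^{T} b_{rq}}{\underline{\psi}^{T} c_{rq} + \overline{\psi}^{T} d_{rq}} \tag{A4}$$

where

$$a_{rq} = Q_{r}^{T} E_{3}^{T} E_{3} Q_{r}\,\tilde{y}_{rq} \in \mathbb{R}^{M \times 1}, \qquad b_{rq} = Q_{r}^{T} E_{4}^{T} E_{4} Q_{r}\,\tilde{y}_{rq} \in \mathbb{R}^{M \times 1} \tag{A5}$$

$$c_{rq} = Q_{r}^{T} p_{r} \in \mathbb{R}^{M \times 1}, \qquad d_{rq} = Q_{r}^{T} g_{r} \in \mathbb{R}^{M \times 1}. \tag{A6}$$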
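The ratio form of the left output can be checked numerically with the sketch below. It builds the vectors of (A2) and (A3), evaluates (A1), and compares the quotient-rule derivative with respect to each upper firing strength, (a_i − y_l c_i) divided by the denominator of (A1), against a finite difference. Several points are assumptions made for illustration: the switch point L is taken as already known, E1 and E2 are taken as the row selectors of the first L and the remaining M − L reordered rules, p_l and g_l as the matching 0/1 indicator vectors, and the upper firing strengths are paired with a and c. All numerical values are made up.

```python
import numpy as np

def left_terms(y_tilde, L):
    """Vectors a, b, c, d of (A2)-(A3) for a given switch point L."""
    y_tilde = np.asarray(y_tilde, float)
    M = len(y_tilde)
    Q = np.eye(M)[np.argsort(y_tilde)]        # Q @ y_tilde is ascending
    E1, E2 = np.eye(M)[:L], np.eye(M)[L:]     # first L / remaining M - L rows
    p = np.r_[np.ones(L), np.zeros(M - L)]
    g = 1.0 - p
    a = Q.T @ E1.T @ E1 @ Q @ y_tilde
    b = Q.T @ E2.T @ E2 @ Q @ y_tilde
    return a, b, Q.T @ p, Q.T @ g

def y_left(psi_up, psi_lo, terms):
    """Ratio form (A1)."""
    a, b, c, d = terms
    return (psi_up @ a + psi_lo @ b) / (psi_up @ c + psi_lo @ d)

def dy_left_dpsi_up(psi_up, psi_lo, terms):
    """Quotient-rule derivative of (A1) w.r.t. each upper firing strength."""
    a, b, c, d = terms
    denom = psi_up @ c + psi_lo @ d
    return (a - y_left(psi_up, psi_lo, terms) * c) / denom

# Illustrative values only (three rules, switch point L = 1).
y_tilde = np.array([0.5, -1.0, 2.0])
psi_up = np.array([0.8, 0.6, 0.9])
psi_lo = np.array([0.3, 0.2, 0.4])
terms = left_terms(y_tilde, L=1)
yl = y_left(psi_up, psi_lo, terms)
grad = dy_left_dpsi_up(psi_up, psi_lo, terms)
```

Only the rule with the smallest consequent value uses its upper firing strength here, so `grad` is nonzero in that single position.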

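Using the gradient descent algorithm, we have

$$\lambda_{q}^{i}(t+1) = \lambda_{q}^{i}(t) - \eta\,\frac{\partial E}{\partial \lambda_{q}^{i}(t)} \tag{A7}$$

where η is a learning constant (η = 0.08 in this paper), and

$$
\begin{aligned}
\frac{\partial E}{\partial \lambda_{q}^{i}}
&= \frac{\partial E}{\partial y_{q}}\left(\frac{\partial y_{q}}{\partial y_{lq}}\frac{\partial y_{lq}}{\partial \lambda_{q}^{i}} + \frac{\partial y_{q}}{\partial y_{rq}}\frac{\partial y_{rq}}{\partial \lambda_{q}^{i}}\right) \\
&= \frac{1}{2}\,(y_{q} - y_{d})\left[\left(\frac{\partial y_{lq}}{\partial \overline{\psi}_{q}^{i}} + \frac{\partial y_{rq}}{\partial \overline{\psi}_{q}^{i}}\right)\frac{\partial \overline{\psi}_{q}^{i}}{\partial \lambda_{q}^{i}} + \left(\frac{\partial y_{lq}}{\partial \underline{\psi}_{q}^{i}} + \frac{\partial y_{rq}}{\partial \underline{\psi}_{q}^{i}}\right)\frac{\partial \underline{\psi}_{q}^{i}}{\partial \lambda_{q}^{i}}\right]
\end{aligned} \tag{A8}
$$

where

$$\frac{\partial y_{lq}}{\partial \overline{\psi}_{q}^{i}} = \frac{a_{lqi} - y_{lq}\,c_{lqi}}{\overline{\psi}^{T} c_{lq} + \underline{\psi}^{T} d_{lq}}, \qquad \frac{\partial y_{rq}}{\partial \overline{\psi}_{q}^{i}} = \frac{b_{rqi} - y_{rq}\,d_{rqi}}{\underline{\psi}^{T} c_{rq} + \overline{\psi}^{T} d_{rq}} \tag{A9}$$

$$\frac{\partial y_{lq}}{\partial \underline{\psi}_{q}^{i}} = \frac{b_{lqi} - y_{lq}\,d_{lqi}}{\overline{\psi}^{T} c_{lq} + \underline{\psi}^{T} d_{lq}}, \qquad \frac{\partial y_{rq}}{\partial \underline{\psi}_{q}^{i}} = \frac{a_{rqi} - y_{rq}\,c_{rqi}}{\underline{\psi}^{T} c_{rq} + \overline{\psi}^{T} d_{rq}} \tag{A10}$$

$$\frac{\partial \overline{\psi}_{q}^{i}}{\partial \lambda_{q}^{i}} = \overline{f}^{\,i}, \qquad \frac{\partial \underline{\psi}_{q}^{i}}{\partial \lambda_{q}^{i}} = \underline{f}^{\,i}. \tag{A11}$$

Let $w_{j}^{i}$ denote a parameter in the ith interval type-2 fuzzy set $\tilde{A}_{j}^{i}$ in input variable $x_{j}$. This parameter is updated as follows:

$$w_{j}^{i}(t+1) = w_{j}^{i}(t) - \eta\,\frac{\partial E}{\partial w_{j}^{i}(t)} \tag{A12}$$

where

$$\frac{\partial E}{\partial w_{j}^{i}(t)} = \frac{1}{2}\,(y_{q} - y_{d})\left[\left(\frac{\partial y_{lq}}{\partial \overline{\psi}_{q}^{i}} + \frac{\partial y_{rq}}{\partial \overline{\psi}_{q}^{i}}\right)\frac{\partial \overline{\psi}_{q}^{i}}{\partial w_{j}^{i}} + \left(\frac{\partial y_{lq}}{\partial \underline{\psi}_{q}^{i}} + \frac{\partial y_{rq}}{\partial \underline{\psi}_{q}^{i}}\right)\frac{\partial \underline{\psi}_{q}^{i}}{\partial w_{j}^{i}}\right]. \tag{A13}$$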

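1) RSEIT2FNN with uncertain mean (RSEIT2FNN-UM): The parameters $m_{j1}^{i}$, $m_{j2}^{i}$, and $\sigma_{j}^{i}$ in (2) and (3) are updated according to (A12) and (A13). If $w_{j}^{i} = m_{j1}^{i}$, then we have

$$
\frac{\partial \overline{\psi}_{q}^{i}}{\partial w_{j}^{i}} = \frac{\partial \overline{\psi}_{q}^{i}}{\partial m_{j1}^{i}} = \frac{\partial \overline{\psi}_{q}^{i}}{\partial \overline{f}^{\,i}}\frac{\partial \overline{f}^{\,i}}{\partial \overline{\mu}_{j}^{i}}\frac{\partial \overline{\mu}_{j}^{i}}{\partial m_{j1}^{i}} =
\begin{cases}
\lambda_{q}^{i}\,\overline{f}^{\,i} \times \dfrac{x_{j} - m_{j1}^{i}}{(\sigma_{j}^{i})^{2}}, & x_{j} \le m_{j1}^{i} \\
0, & \text{otherwise}
\end{cases} \tag{A14}
$$

$$
\frac{\partial \underline{\psi}_{q}^{i}}{\partial w_{j}^{i}} = \frac{\partial \underline{\psi}_{q}^{i}}{\partial m_{j1}^{i}} = \frac{\partial \underline{\psi}_{q}^{i}}{\partial \underline{f}^{\,i}}\frac{\partial \underline{f}^{\,i}}{\partial \underline{\mu}_{j}^{i}}\frac{\partial \underline{\mu}_{j}^{i}}{\partial m_{j1}^{i}} =
\begin{cases}
\lambda_{q}^{i}\,\underline{f}^{\,i} \times \dfrac{x_{j} - m_{j1}^{i}}{(\sigma_{j}^{i})^{2}}, & x_{j} > \dfrac{m_{j1}^{i} + m_{j2}^{i}}{2} \\
0, & \text{otherwise.}
\end{cases} \tag{A15}
$$

Similarly, if $w_{j}^{i} = m_{j2}^{i}$, then we have

$$
\frac{\partial \overline{\psi}_{q}^{i}}{\partial w_{j}^{i}} = \frac{\partial \overline{\psi}_{q}^{i}}{\partial m_{j2}^{i}} =
\begin{cases}
\lambda_{q}^{i}\,\overline{f}^{\,i} \times \dfrac{x_{j} - m_{j2}^{i}}{(\sigma_{j}^{i})^{2}}, & x_{j} > m_{j2}^{i} \\
0, & \text{otherwise}
\end{cases} \tag{A16}
$$

$$
\frac{\partial \underline{\psi}_{q}^{i}}{\partial m_{j2}^{i}} =
\begin{cases}
\lambda_{q}^{i}\,\underline{f}^{\,i} \times \dfrac{x_{j} - m_{j2}^{i}}{(\sigma_{j}^{i})^{2}}, & x_{j} \le \dfrac{m_{j1}^{i} + m_{j2}^{i}}{2} \\
0, & \text{otherwise.}
\end{cases} \tag{A17}
$$

If $w_{j}^{i} = \sigma_{j}^{i}$, then we have

$$
\frac{\partial \overline{\psi}_{q}^{i}}{\partial w_{j}^{i}} = \frac{\partial \overline{\psi}_{q}^{i}}{\partial \sigma_{j}^{i}} =
\begin{cases}
\lambda_{q}^{i}\,\overline{f}^{\,i} \times \dfrac{(x_{j} - m_{j1}^{i})^{2}}{(\sigma_{j}^{i})^{3}}, & x_{j} < m_{j1}^{i} \\
\lambda_{q}^{i}\,\overline{f}^{\,i} \times \dfrac{(x_{j} - m_{j2}^{i})^{2}}{(\sigma_{j}^{i})^{3}}, & x_{j} > m_{j2}^{i} \\
0, & \text{otherwise}
\end{cases} \tag{A18}
$$

and

$$
\frac{\partial \underline{\psi}_{q}^{i}}{\partial w_{j}^{i}} = \frac{\partial \underline{\psi}_{q}^{i}}{\partial \sigma_{j}^{i}} =
\begin{cases}
\lambda_{q}^{i}\,\underline{f}^{\,i} \times \dfrac{(x_{j} - m_{j2}^{i})^{2}}{(\sigma_{j}^{i})^{3}}, & x_{j} \le \dfrac{m_{j1}^{i} + m_{j2}^{i}}{2} \\
\lambda_{q}^{i}\,\underline{f}^{\,i} \times \dfrac{(x_{j} - m_{j1}^{i})^{2}}{(\sigma_{j}^{i})^{3}}, & x_{j} > \dfrac{m_{j1}^{i} + m_{j2}^{i}}{2}.
\end{cases} \tag{A19}
$$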
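The piecewise derivatives for the uncertain-mean case can be sanity-checked against finite differences, as in the sketch below. It is a single-input illustration in which the firing strength of a rule equals its membership value. Two things are assumptions not restated in this Appendix: the upper and lower MFs follow the usual Gaussian primary MF with uncertain mean in [m1, m2] and fixed STD, and the factor λ is taken as the derivative of the temporal firing strength with respect to the spatial firing strength, as it appears in (A14). All function names are hypothetical.

```python
import math

def gauss(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2)

def upper_mf(x, m1, m2, s):
    """Upper MF of a Gaussian primary MF with uncertain mean in [m1, m2]."""
    if x < m1:
        return gauss(x, m1, s)
    if x > m2:
        return gauss(x, m2, s)
    return 1.0

def lower_mf(x, m1, m2, s):
    """Lower MF of the same interval type-2 fuzzy set."""
    return gauss(x, m2, s) if x <= 0.5 * (m1 + m2) else gauss(x, m1, s)

def dpsi_up_dm1(x, m1, m2, s, lam):
    """(A14) for one input: nonzero only when x <= m1."""
    if x <= m1:
        return lam * upper_mf(x, m1, m2, s) * (x - m1) / s ** 2
    return 0.0

def dpsi_lo_dm1(x, m1, m2, s, lam):
    """(A15) for one input: nonzero only when x > (m1 + m2) / 2."""
    if x > 0.5 * (m1 + m2):
        return lam * lower_mf(x, m1, m2, s) * (x - m1) / s ** 2
    return 0.0

def dpsi_up_dsigma(x, m1, m2, s, lam):
    """(A18) for one input."""
    if x < m1:
        return lam * upper_mf(x, m1, m2, s) * (x - m1) ** 2 / s ** 3
    if x > m2:
        return lam * upper_mf(x, m1, m2, s) * (x - m2) ** 2 / s ** 3
    return 0.0

def central_diff(fn, v, eps=1e-6):
    """Central finite-difference approximation of d fn / d v."""
    return (fn(v + eps) - fn(v - eps)) / (2 * eps)

# Illustrative values only.
x, m1, m2, s, lam = 0.2, 0.5, 1.0, 0.4, 0.7
analytic = dpsi_up_dm1(x, m1, m2, s, lam)
numeric = lam * central_diff(lambda m: upper_mf(x, m, m2, s), m1)
```

With several inputs, each firing strength is a product over the inputs, so the same expressions apply with the full upper or lower firing strength in place of the single membership value.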


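2) RSEIT2FNN with uncertain STD (RSEIT2FNN-UD): The parameters $m_{j}^{i}$, $\sigma_{j1}^{i}$, and $\sigma_{j2}^{i}$ in (4) and (5) are updated according to (A12) and (A13). If $w_{j}^{i} = m_{j}^{i}$, then we have

$$\frac{\partial \overline{\psi}_{q}^{i}}{\partial m_{j}^{i}} = \lambda_{q}^{i}\,\overline{f}^{\,i} \times \frac{x_{j} - m_{j}^{i}}{(\sigma_{j2}^{i})^{2}} \quad \text{and} \quad \frac{\partial \underline{\psi}_{q}^{i}}{\partial m_{j}^{i}} = \lambda_{q}^{i}\,\underline{f}^{\,i} \times \frac{x_{j} - m_{j}^{i}}{(\sigma_{j1}^{i})^{2}}. \tag{A20}$$

If $w_{j}^{i} = \sigma_{j1}^{i}$, then we have

$$\frac{\partial \overline{\psi}_{q}^{i}}{\partial \sigma_{j1}^{i}} = 0 \quad \text{and} \quad \frac{\partial \underline{\psi}_{q}^{i}}{\partial \sigma_{j1}^{i}} = \lambda_{q}^{i}\,\underline{f}^{\,i} \times \frac{(x_{j} - m_{j}^{i})^{2}}{(\sigma_{j1}^{i})^{3}}. \tag{A21}$$

If $w_{j}^{i} = \sigma_{j2}^{i}$, then we have

$$\frac{\partial \overline{\psi}_{q}^{i}}{\partial \sigma_{j2}^{i}} = \lambda_{q}^{i}\,\overline{f}^{\,i} \times \frac{(x_{j} - m_{j}^{i})^{2}}{(\sigma_{j2}^{i})^{3}} \quad \text{and} \quad \frac{\partial \underline{\psi}_{q}^{i}}{\partial \sigma_{j2}^{i}} = 0. \tag{A22}$$

REFERENCES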

[1] G. Mouzouris and J. M. Mendel, “Dynamic non-singleton fuzzy logic systems for nonlinear modeling,” IEEE Trans. Fuzzy Syst., vol. 5, no. 2, pp. 199–208, May 1997.

[2] J. Zhang and A. J. Morris, “Recurrent neuro-fuzzy networks for nonlinear process modeling,” IEEE Trans. Neural Netw., vol. 10, no. 2, pp. 313–326, Feb. 1999.

[3] Y. C. Wang, C. J. Chien, and C. C. Teng, “Direct adaptive iterative learning control of nonlinear systems using an output-recurrent fuzzy neural network,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 34, no. 3, pp. 1348–1359, Jun. 2004.

[4] C. H. Lee and C. C. Teng, “Identification and control of dynamic systems using recurrent fuzzy neural networks,” IEEE Trans. Fuzzy Syst., vol. 8, no. 4, pp. 349–366, Aug. 2000.

[5] C. J. Lin and C. C. Chin, “Prediction and identification using wavelet-based recurrent fuzzy neural networks,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 34, no. 5, pp. 2144–2154, Oct. 2004.

[6] C. F. Juang and C. T. Lin, “A recurrent self-organizing neural fuzzy inference network,” IEEE Trans. Neural Netw., vol. 10, no. 4, pp. 828–845, Jul. 1999.

[7] C. F. Juang, “A TSK-type recurrent fuzzy network for dynamic systems processing by neural network and genetic algorithm,” IEEE Trans. Fuzzy Syst., vol. 10, no. 2, pp. 155–170, Apr. 2002.

[8] J. B. Theocharis, “A high-order recurrent neuro-fuzzy system with internal dynamics: Application to the adaptive noise cancellation,” Fuzzy Sets Syst., vol. 157, no. 4, pp. 471–500, 2006.

[9] C. F. Juang and J. S. Chen, “Water bath temperature control by a recurrent fuzzy controller and its FPGA implementation,” IEEE Trans. Ind. Electron., vol. 53, no. 3, pp. 941–949, Jun. 2006.

[10] N. N. Karnik, J. M. Mendel, and Q. Liang, “Type-2 fuzzy logic systems,” IEEE Trans. Fuzzy Syst., vol. 7, no. 6, pp. 643–658, Dec. 1999.

[11] J. M. Mendel, Uncertain Rule-Based Fuzzy Logic System: Introduction and New Directions. Upper Saddle River, NJ: Prentice–Hall, 2001.

[12] J. M. Mendel and R. I. John, “Type-2 fuzzy sets made simple,” IEEE Trans. Fuzzy Syst., vol. 10, no. 2, pp. 117–127, Apr. 2002.

[13] J. M. Mendel, “Type-2 fuzzy sets and systems: An overview,” IEEE Comput. Intell. Mag., vol. 2, no. 1, pp. 20–29, Feb. 2007.

[14] R. John and S. Coupland, “Type-2 fuzzy logic: A historical view,” IEEE Comput. Intell. Mag., vol. 2, no. 1, pp. 57–62, Feb. 2007.

[15] Q. Liang and J. M. Mendel, “Equalization of nonlinear time-varying channels using type-2 fuzzy adaptive filters,” IEEE Trans. Fuzzy Syst., vol. 8, no. 5, pp. 551–563, Oct. 2000.

[16] H. Hagras, “A hierarchical type-2 fuzzy logic control architecture for autonomous mobile robots,” IEEE Trans. Fuzzy Syst., vol. 12, no. 4, pp. 524–539, Aug. 2004.

[17] J. Zeng and Z. Q. Liu, “Type-2 fuzzy hidden Markov models and their application to speech recognition,” IEEE Trans. Fuzzy Syst., vol. 14, no. 3, pp. 454–467, Jun. 2006.

[18] C. Hwang and F. C.-H. Rhee, “Uncertain fuzzy clustering: Interval type-2 fuzzy approach to C-means,” IEEE Trans. Fuzzy Syst., vol. 15, no. 1, pp. 107–120, Feb. 2007.

[19] C. H. Lee, Y. C. Lin, and W. Y. Lai, “Systems identification using type-2 fuzzy neural network (type-2 FNN) systems,” in Proc. IEEE Int. Symp. Comput. Intell. Robot. Autom., 2003, vol. 3, pp. 1264–1269.

[20] C. H. Wang, C. S. Cheng, and T. T. Lee, “Dynamical optimal training for interval type-2 fuzzy neural network (T2FNN),” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 34, no. 3, pp. 1462–1477, Jun. 2004.

[21] J. M. Mendel, “Computing derivatives in interval type-2 fuzzy logic system,” IEEE Trans. Fuzzy Syst., vol. 12, no. 1, pp. 84–98, Feb. 2004.

[22] H. Hagras, “Comments on dynamical optimal training for interval type-2 fuzzy neural network (T2FNN),” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 36, no. 5, pp. 1206–1209, Oct. 2006.

[23] G. M. Mendez and O. Castillo, “Interval type-2 TSK fuzzy logic systems using hybrid learning algorithm,” in Proc. IEEE Int. Conf. Fuzzy Syst., May 22–25, 2005, pp. 230–235.

[24] J. Zeng and Z. Q. Liu, “Type-2 fuzzy Markov random fields and their application to handwritten Chinese character recognition,” IEEE Trans. Fuzzy Syst., vol. 16, no. 3, pp. 747–760, Jun. 2008.

[25] J. Zeng, L. Xie, and Z. Q. Liu, “Type-2 fuzzy Gaussian mixture models,” Pattern Recognit., vol. 41, no. 12, pp. 3636–3643, 2008.

[26] Q. Liang and J. M. Mendel, “Interval type-2 fuzzy logic systems: Theory and design,” IEEE Trans. Fuzzy Syst., vol. 8, no. 5, pp. 535–550, Oct. 2000.

[27] J. M. Mendel, R. I. John, and F. Liu, “Interval type-2 fuzzy logic systems made simple,” IEEE Trans. Fuzzy Syst., vol. 14, no. 6, pp. 808–821, Dec. 2006.

[28] C. F. Juang, S. H. Chiu, and S. W. Chang, “A self-organizing TS-type fuzzy network with support vector learning and its application to classification problems,” IEEE Trans. Fuzzy Syst., vol. 15, no. 5, pp. 998–1008, Oct. 2007.

[29] C. F. Juang and C. T. Lin, “An on-line self-constructing neural fuzzy inference network and its applications,” IEEE Trans. Fuzzy Syst., vol. 6, no. 1, pp. 12–32, Feb. 1998.

[30] J. L. Elman, “Finding structure in time,” Cognit. Sci., vol. 14, pp. 179–211, 1990.

[31] G. Chen, Y. Chen, and H. Ogmen, “Identifying chaotic systems via a Wiener-type cascade model,” IEEE Control Syst. Mag., vol. 17, no. 5, pp. 29–36, Oct. 1997.

[32] P. S. Sastry, G. Santharam, and K. P. Unnikrishnan, “Memory neural networks for identification and control of dynamic systems,” IEEE Trans. Neural Netw., vol. 5, no. 2, pp. 306–319, Mar. 1994.

[33] A. S. Weigend and N. A. Gershenfeld, Time Series Prediction: Forecasting the Future and Understanding the Past. Reading, MA: Addison-Wesley, 1994.

[34] S. Singh, “Noise impact on time-series forecasting using an intelligent pattern matching technique,” Pattern Recognit., vol. 32, no. 8, pp. 1389–1398, Aug. 1999.

Chia-Feng Juang (M’00–SM’08) received the B.S. and Ph.D. degrees in control engineering from the National Chiao-Tung University, Hsinchu, Taiwan, in 1993 and 1997, respectively.

Since 2001, he has been with the Department of Electrical Engineering, National Chung-Hsing University (NCHU), Taichung, Taiwan, where he has been a Professor since 2007. He has authored or coauthored three book chapters, more than 55 refereed journal papers, and 55 conference papers. He has been a Referee for more than 45 international journals. He is currently a member of the Editorial Advisory Boards of the Open Cybernetics and Systemics Journal and the Open Automation and Control Journal. He is also a member of the Editorial Boards of the International Journal of Computational Intelligence in Control and the Journal of Advanced Research in Evolutionary Algorithms. He is an Area Editor of the International Journal of Intelligent Systems Science and Technology. His current research interests include computational intelligence (CI), intelligent control, computer vision, speech signal processing, and implementation of CI techniques using field-programmable gate arrays chips.

Dr. Juang was the recipient of the Youth Automatic Control Engineering Award from the Chinese Automatic Control Society, Taiwan, in 2006 and the Outstanding Youth Teacher Award from the NCHU in 2007. Six of his published journal papers were recognized as highly cited papers according to the Information Sciences Institute database in 2008.


Ren-Bo Huang received the B.S. degree in electrical engineering from the National United University, Miaoli, Taiwan, in 2007. He is currently working toward the M.S. degree in electrical engineering with the National Chung-Hsing University, Taichung, Taiwan.

His current research interests include type-2 neural fuzzy systems and field-programmable gate array chip design.

Yang-Yin Lin received the M.S. degree in electrical engineering from the National Chung-Hsing University, Taichung, Taiwan, in 2008. He is currently working toward the Ph.D. degree with the Department of Electrical Engineering, National Chiao-Tung University, Hsinchu, Taiwan.

His current research interests include neural networks and fuzzy systems.