
1 Introduction

1.3 Organization of Dissertation

The overall objective of this dissertation is to develop a novel neuro-fuzzy network and its related learning algorithms. The organization and objectives of each chapter in this dissertation are as follows.

In Chapter 2, we propose a functional-link-based neuro-fuzzy network (FLNFN) structure for nonlinear system control. The proposed FLNFN model applies a functional link neural network (FLNN) to the consequent part of the fuzzy rules. This dissertation uses orthogonal polynomials and linearly independent functions in the functional expansion of the FLNN. Thus, the consequent part of the proposed FLNFN model is a nonlinear combination of input variables. An online learning algorithm, which consists of structure learning and parameter learning, is also presented. The structure learning depends on an entropy measure to determine the number of fuzzy rules. The parameter learning, based on the gradient descent method, adjusts the shape of the membership functions and the corresponding weights of the FLNN.

In Chapter 3, we present a modified differential evolution (MODE) algorithm for the proposed FLNFN model. The proposed MODE learning algorithm adopts an evolutionary learning method to optimize the FLNFN parameters. The MODE algorithm uses a search strategy that moves effectively toward the current best individual. Furthermore, the MODE algorithm provides a cluster-based mutation scheme, which maintains useful diversity in the population to increase the search capability.

In Chapter 4, we propose a rule-based symbiotic modified differential evolution (RSMODE) for the proposed FLNFN model. The proposed RSMODE learning algorithm consists of an initialization phase and a parameter learning phase. The initialization phase determines the number of subpopulations required to satisfy the fuzzy partition of the input variables using the entropy measure. The parameter learning phase combines two strategies: subpopulation symbiotic evolution and modified differential evolution. The RSMODE can automatically generate the initial subpopulations, and each individual in each subpopulation evolves separately using modified differential evolution. We also compare our method with other methods reported in the literature. Finally, conclusions and future work are summarized in the last chapter.

Chapter 2

A Functional-Link-Based Neuro-Fuzzy Network

In this chapter, a functional-link-based neuro-fuzzy network (FLNFN) model is presented for nonlinear system control. The FLNFN model, which combines a neuro-fuzzy network with a functional link neural network (FLNN), is designed to improve the accuracy of functional approximation. Each fuzzy rule corresponds to a sub-FLNN comprising a functional expansion of the input variables. Orthogonal polynomials and linearly independent functions are adopted as functional link neural network bases. An online learning algorithm, consisting of structure learning and parameter learning, is proposed to construct the FLNFN model automatically. The structure learning algorithm determines whether to add a new node to satisfy the fuzzy partition of the input variables. Initially, the FLNFN model has no rules.

The rules are automatically generated from training data by the entropy measure. The parameter learning algorithm is based on back-propagation and tunes all free parameters in the FLNFN model simultaneously to minimize an output error function.

2.1 Structure of Functional-Link-Based Neuro-Fuzzy Network

This section describes the structure of functional link neural networks and the structure of the FLNFN model. In a functional link neural network, the input data usually incorporate high-order effects, and the dimensionality of the input space is thus artificially increased by a functional expansion. Accordingly, the input representation is enhanced and linear separability is achieved in the extended space. The FLNFN model adopts the functional link neural network to generate a complex nonlinear combination of input variables as the consequent part of the fuzzy rules. The rest of this section details these structures.

2.1.1 Functional Link Neural Networks

The functional link neural network is a single-layer network in which the need for hidden layers is removed. Whereas the linear links of a conventional neural network weight the input variables linearly, the functional link acts on an element of the input variables by generating a set of linearly independent functions (i.e., suitable orthogonal polynomials for a functional expansion) and then evaluating these functions with the variables as their arguments.

Therefore, the FLNN structure considers trigonometric functions. For example, for a two-dimensional input $X=[x_1, x_2]^T$, the enhanced input obtained by the trigonometric expansion is $\Phi=[x_1, \sin(\pi x_1), \cos(\pi x_1), \ldots, x_2, \sin(\pi x_2), \cos(\pi x_2), \ldots]^T$. Thus, the input variables can be separated in the enhanced space [23]. In the FLNN structure, with reference to Fig. 2.1, a set of basis functions $\Phi$ and a fixed number of weight parameters $W$ represent $f_W(x)$. The theory behind the FLNN for multidimensional function approximation has been discussed elsewhere [24] and is analyzed below.

Figure 2.1: Structure of the FLNN (inputs $x_1, \ldots, x_N$ pass through a functional expansion into basis functions $\varphi_1, \ldots, \varphi_M$, which are weighted by $W$ to produce the outputs $\hat{y}_1, \ldots, \hat{y}_m$).

Consider the set of basis functions $\mathcal{B}=\{\varphi_k \in \Phi(A)\}_{k \in K}$, $K=\{1,2,\ldots\}$, with the following properties: 1) $\varphi_1 = 1$; 2) the subset $\mathcal{B}_j=\{\varphi_k \in \mathcal{B}\}_{k=1}^{M}$ is a linearly independent set, meaning that if $\sum_{k=1}^{M} w_k \varphi_k = 0$, then $w_k = 0$ for all $k=1,2,\ldots,M$; and 3) $\sup_j \left[\sum_{k=1}^{j} \|\varphi_k\|_A^2\right]^{1/2} < \infty$.

Let $\mathcal{B}=\{\varphi_k\}_{k=1}^{M}$ be the set of basis functions to be considered, as shown in Fig. 2.1. The FLNN comprises $M$ basis functions $\{\varphi_1, \varphi_2, \ldots, \varphi_M\} \subset \mathcal{B}$. The linear sum of the $j$th node is given by

$$\hat{y}_j = \sum_{k=1}^{M} w_{kj}\,\varphi_k(X) \qquad (2.1)$$

where $X \in A \subset \mathbb{R}^N$, $X=[x_1, x_2, \ldots, x_N]^T$ is the input vector and $W_j=[w_{j1}, w_{j2}, \ldots, w_{jM}]^T$ is the weight vector associated with the $j$th output of the FLNN. $\hat{y}_j$ denotes the local output of the FLNN structure and the consequent part of the $j$th fuzzy rule in the FLNFN model. Thus, Eq. (2.1) can be expressed in matrix form as $\hat{y}_j = W_j \Phi$, where $\Phi=[\varphi_1(x), \varphi_2(x), \ldots, \varphi_M(x)]^T$ is the basis function vector, which is the output of the functional expansion block. The $m$-dimensional linear output may be given by $\hat{Y} = W\Phi$, where $\hat{Y}=[\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_m]^T$, $m$ denotes the number of local outputs, which equals the number of fuzzy rules in the FLNFN model, and $W$ is an $(m \times M)$-dimensional weight matrix of the FLNN given by $W=[W_1, W_2, \ldots, W_m]^T$. In the FLNFN model, the corresponding weights of the functional link bases do not exist in the initial state; the number of weight sets generated by the online learning algorithm is consistent with the number of fuzzy rules. Section 2.2 details the online learning algorithm.
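As a concrete illustration (not part of the original formulation), the following minimal Python sketch implements the trigonometric functional expansion and the linear sum of Eq. (2.1); the function names, the NumPy dependency, and the example dimensions are assumptions of this sketch.

```python
import numpy as np

def functional_expansion(x):
    """Trigonometric functional expansion: each input x_i is expanded into
    [x_i, sin(pi*x_i), cos(pi*x_i)], giving M = 3*N basis functions."""
    phi = []
    for xi in x:
        phi.extend([xi, np.sin(np.pi * xi), np.cos(np.pi * xi)])
    return np.array(phi)                      # shape (3*N,)

def flnn_local_outputs(x, W):
    """Local outputs y_hat_j = W_j . Phi for all j, per Eq. (2.1);
    W has shape (m, M), one weight row per fuzzy rule."""
    return W @ functional_expansion(x)        # shape (m,)

# Usage: two inputs (N=2, M=6) and three rules (m=3).
rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, size=(3, 6))      # weights initialized in [-1, 1]
print(flnn_local_outputs(np.array([0.3, -0.7]), W))
```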

2.1.2 Structure of the FLNFN Model

This subsection describes the FLNFN model, which uses the nonlinear combination of input variables provided by the FLNN. Each fuzzy rule corresponds to a sub-FLNN comprising a functional link expansion.

Figure 2.2 presents the structure of the proposed FLNFN model.

The FLNFN model realizes a fuzzy if-then rule in the following form:

$$\text{Rule-}j:\ \text{IF } x_1 \text{ is } A_{1j} \text{ and } x_2 \text{ is } A_{2j} \ldots \text{ and } x_N \text{ is } A_{Nj}$$
$$\text{THEN } \hat{y}_j = \sum_{k=1}^{M} w_{kj}\varphi_k = w_{1j}\varphi_1 + w_{2j}\varphi_2 + \cdots + w_{Mj}\varphi_M \qquad (2.2)$$

where $x_i$ and $\hat{y}_j$ are the input and local output variables, respectively; $A_{ij}$ is the linguistic term of the precondition part with a Gaussian membership function; $N$ is the number of input variables; $w_{kj}$ is the link weight of the local output; $\varphi_k$ is the basis trigonometric function of the input variables; $M$ is the number of basis functions; and Rule-$j$ is the $j$th fuzzy rule.

Figure 2.2: Structure of the proposed FLNFN model (Layer 1: inputs; Layer 2: Gaussian membership functions; Layer 3: rule nodes; Layer 4: consequent nodes fed by the functional link neural network; Layer 5: output node with normalization).

The operation functions of the nodes in each layer of the FLNFN model are now described. In the following description, $u^{(l)}$ denotes the output of a node in the $l$th layer.

No computation is performed in layer 1. Each node in this layer only transmits input values to the next layer directly:

$$u_i^{(1)} = x_i. \qquad (2.3)$$

Each fuzzy set $A_{ij}$ is described here by a Gaussian membership function. Therefore, the calculated membership value in layer 2 is

$$u_{ij}^{(2)} = \exp\!\left(-\frac{\left(u_i^{(1)} - m_{ij}\right)^2}{\sigma_{ij}^2}\right) \qquad (2.4)$$

where $m_{ij}$ and $\sigma_{ij}$ are the mean and variance of the Gaussian membership function, respectively, of the $j$th term of the $i$th input variable $x_i$.

Nodes in layer 3 receive one-dimensional membership degrees of the associated rule from the nodes of a set in layer 2. Here, the product operator described above is adopted to perform the precondition part of the fuzzy rules. As a result, the output function of each inference node is

$$u_j^{(3)} = \prod_i u_{ij}^{(2)} \qquad (2.5)$$

where the product of the $u_{ij}^{(2)}$ of a rule node represents the firing strength of its corresponding rule.

Nodes in layer 4 are called consequent nodes. The input to a node in layer 4 is the output from layer 3, and the other inputs are calculated from the functional link neural network, as shown in Fig. 2.2. For such a node,

$$u_j^{(4)} = u_j^{(3)} \cdot \sum_{k=1}^{M} w_{kj}\,\varphi_k \qquad (2.6)$$

where $w_{kj}$ is the corresponding link weight of the functional link neural network and $\varphi_k$ is the functional expansion of the input variables. The functional expansion uses a trigonometric polynomial basis function, given by

$$[\varphi_1\ \ \varphi_2\ \ \varphi_3\ \ \varphi_4\ \ \varphi_5\ \ \varphi_6] = [x_1\ \ \sin(\pi x_1)\ \ \cos(\pi x_1)\ \ x_2\ \ \sin(\pi x_2)\ \ \cos(\pi x_2)]$$

for two-dimensional input variables. Therefore, $M$ is the number of basis functions, $M = 3 \times N$, where $N$ is the number of input variables.

Moreover, the number of output nodes of the functional link neural network depends on the number of fuzzy rules of the FLNFN model.

The output node in layer 5 integrates all of the actions recommended by layers 3 and 4 and acts as a defuzzifier with

$$y = u^{(5)} = \frac{\sum_{j=1}^{R} u_j^{(4)}}{\sum_{j=1}^{R} u_j^{(3)}} = \frac{\sum_{j=1}^{R} u_j^{(3)}\left(\sum_{k=1}^{M} w_{kj}\varphi_k\right)}{\sum_{j=1}^{R} u_j^{(3)}} \qquad (2.7)$$

where $R$ is the number of fuzzy rules, and $y$ is the output of the FLNFN model.
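As an illustration of Eqs. (2.3)-(2.7), the following Python sketch computes the FLNFN output for a single input vector; the interface and the parameter initialization are assumptions of this sketch, not part of the dissertation.

```python
import numpy as np

def flnfn_forward(x, m, sigma, W):
    """One forward pass through the five FLNFN layers.
    x: inputs, shape (N,); m, sigma: Gaussian means/widths, shape (N, R);
    W: FLNN link weights, shape (R, 3*N). Returns the crisp output y."""
    u1 = x                                               # layer 1: identity, Eq. (2.3)
    u2 = np.exp(-((u1[:, None] - m) ** 2) / sigma ** 2)  # layer 2: memberships, Eq. (2.4)
    u3 = np.prod(u2, axis=0)                             # layer 3: firing strengths, Eq. (2.5)
    phi = np.concatenate([[xi, np.sin(np.pi * xi), np.cos(np.pi * xi)] for xi in x])
    u4 = u3 * (W @ phi)                                  # layer 4: consequent nodes, Eq. (2.6)
    return u4.sum() / u3.sum()                           # layer 5: defuzzification, Eq. (2.7)

# Usage with N=2 inputs and R=3 rules (3*N = 6 basis functions).
rng = np.random.default_rng(1)
N, R = 2, 3
y = flnfn_forward(np.array([0.4, 0.9]),
                  m=rng.uniform(0, 1, (N, R)),
                  sigma=np.full((N, R), 0.5),
                  W=rng.uniform(-1, 1, (R, 3 * N)))
print(y)
```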

As described above, the number of tuning parameters of the FLNFN model is $(2+3 \times P) \times N \times R$, where $N$, $R$, and $P$ denote the number of inputs, existing rules, and outputs, respectively. The proposed FLNFN model can be shown to be a universal uniform approximator for continuous functions over compact sets by the Stone-Weierstrass theorem [56].

The detailed proof is given in the Appendix.
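As a quick check of this parameter count, the following snippet evaluates the formula; the helper name is hypothetical.

```python
def flnfn_param_count(N, R, P=1):
    """Tuning parameters of the FLNFN model: each of the N*R Gaussian
    membership functions has a mean and a variance (2*N*R parameters),
    and each rule carries M = 3*N link weights per output (3*P*N*R),
    giving (2 + 3*P) * N * R in total."""
    return (2 + 3 * P) * N * R

# A single-output controller with N=2 inputs and R=4 rules has
# (2 + 3) * 2 * 4 = 40 adjustable parameters.
print(flnfn_param_count(N=2, R=4))  # -> 40
```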

2.2 Learning Algorithms of the FLNFN Model

This section presents an online learning algorithm for constructing the FLNFN model. The proposed learning algorithm comprises a structure learning phase and a parameter learning phase. Figure 2.3 presents a flow diagram of the learning scheme for the FLNFN model.

Structure learning is based on the entropy measure, which is used to determine whether a new rule should be added to satisfy the fuzzy partitioning of the input variables. Parameter learning is based on supervised learning: the back-propagation algorithm minimizes a given cost function by adjusting the link weights in the consequent part and the parameters of the membership functions. Initially, there are no nodes in the network except the input and output nodes, i.e., the FLNFN model contains no rules. The nodes are created automatically as learning proceeds, upon the reception of online incoming training data, in the structure and parameter learning processes. The rest of this section details the structure learning phase and the parameter learning phase. Finally, the stability of the FLNFN model is analyzed using the Lyapunov approach to establish its convergence property.

Figure 2.3: Flow diagram of the structure/parameter learning for the FLNFN model (the first incoming datum creates the first rule with $m_{i1}=x_i$, $\sigma_{i1}=\sigma_{init}$, and $w_{k1}=\text{random}[-1,1]$, where $\sigma_{init}$ is a prespecified constant; each subsequent datum triggers structure learning, which may generate a new rule with $m_{ij}=x_i$, $\sigma_{ij}=\sigma_{init}$, and $w_{kj}=\text{random}[-1,1]$, followed by back-propagation-based parameter learning until training is done).

2.2.1 Structure Learning Phase

The first step in structure learning is to determine whether a new rule should be extracted from the training data and to determine the number of fuzzy sets in the universe of discourse of each input variable, since one cluster in the input space corresponds to one potential fuzzy logic rule, in which $m_{ij}$ and $\sigma_{ij}$ represent the mean and variance of that cluster, respectively. For each incoming pattern $x_i$, the rule firing strength can be regarded as the degree to which the incoming pattern belongs to the corresponding cluster. The entropy measure between each data point and each membership function is calculated based on a similarity measure: a data point close to the mean of a membership function yields a lower entropy value. Therefore, the entropy values between data points and current membership functions are calculated to determine whether to add a new rule. For computational efficiency, the entropy measure can be calculated using the firing strength from $u_{ij}^{(2)}$ as follows:

$$EM_j = \sum_{i=1}^{N} -D_{ij}\,\log_2 D_{ij} \qquad (2.8)$$

where $D_{ij} = \exp\!\left(-\left(u_{ij}^{(2)}\right)^{-1}\right)$ and $EM_j \in [0,1]$. According to Eq. (2.8), the measure used to generate a new fuzzy rule and new functional link bases for new incoming data is described as follows. The maximum entropy measure

$$EM_{max} = \max_{1 \le j \le R(t)} EM_j \qquad (2.9)$$

is determined, where $R(t)$ is the number of existing rules at time $t$. If $EM_{max} \le \overline{EM}$, then a new rule is generated, where $\overline{EM} \in [0,1]$ is a prespecified threshold that decays during the learning process.

In the structure learning phase, the threshold parameter $\overline{EM}$ is an important parameter. The threshold is set between zero and one. A low threshold leads to the learning of coarse clusters (i.e., fewer rules are generated), whereas a high threshold leads to the learning of fine clusters (i.e., more rules are generated). If the threshold value equals zero, then all the training data belong to the same cluster in the input space. Therefore, the selection of the threshold value $\overline{EM}$ critically affects the simulation results. As a result of extensive experiments carefully examining threshold values over the range [0, 1], we concluded that the threshold value $\overline{EM}$ should be related to the number of input variables. Accordingly, $\overline{EM}$ is defined as 0.26-0.3 times the number of input variables.

Once a new rule has been generated, the next step is to assign the initial mean and variance to the new membership function and the corresponding link weight for the consequent part. Since the goal is to minimize an objective function, the mean, variance, and weight are all adjustable later in the parameter learning phase. Hence, the mean, variance, and weight for the new rule are set as follows:

$$m_{iR(t+1)} = x_i \qquad (2.10)$$
$$\sigma_{iR(t+1)} = \sigma_{init} \qquad (2.11)$$
$$w_{kR(t+1)} = \text{random}[-1, 1] \qquad (2.12)$$

where $x_i$ is the new input and $\sigma_{init}$ is a prespecified constant. The whole algorithm for the generation of new fuzzy rules and fuzzy sets in each input variable is as follows. No rule is assumed to exist initially:

Step 1: IF $x_i$ is the first incoming pattern, THEN do
{ Generate a new rule with mean $m_{i1} = x_i$, variance $\sigma_{i1} = \sigma_{init}$, and weight $w_{k1} = \text{random}[-1, 1]$, where $\sigma_{init}$ is a prespecified constant. }

Step 2: ELSE, for each newly incoming $x_i$, do
{ Find $EM_{max} = \max_{1 \le j \le R(t)} EM_j$.
IF $EM_{max} \ge \overline{EM}$, do nothing;
ELSE
{ $R(t+1) = R(t) + 1$; generate a new rule with mean $m_{iR(t+1)} = x_i$, variance $\sigma_{iR(t+1)} = \sigma_{init}$, and weight $w_{kR(t+1)} = \text{random}[-1, 1]$, where $\sigma_{init}$ is a prespecified constant. }
}
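The following Python sketch illustrates Steps 1-2 under the assumptions that patterns arrive one at a time and that rules are stored as simple dictionaries; the container layout, the numeric guards, and the threshold value in the usage example are illustrative, not part of the dissertation.

```python
import numpy as np

def entropy_measure(u2_j):
    """EM_j of Eq. (2.8) for one rule, given its layer-2 memberships u2_j (shape (N,))."""
    D = np.exp(-1.0 / np.maximum(u2_j, 1e-12))            # D_ij, guarded against u -> 0
    return float(np.sum(-D * np.log2(np.maximum(D, 1e-12))))

def maybe_add_rule(x, rules, em_bar, sigma_init=0.5):
    """Steps 1-2 above: create the first rule, or add one when EM_max <= em_bar (Eq. (2.9))."""
    new_rule = {"m": x.copy(), "sigma": np.full_like(x, sigma_init),
                "w": np.random.uniform(-1.0, 1.0, 3 * len(x))}
    if not rules:                                         # Step 1: first incoming pattern
        rules.append(new_rule)
        return True
    em_max = max(entropy_measure(np.exp(-((x - r["m"]) ** 2) / r["sigma"] ** 2))
                 for r in rules)                          # Eq. (2.9)
    if em_max <= em_bar:                                  # entropy criterion met: new rule
        rules.append(new_rule)
        return True
    return False                                          # otherwise do nothing

# Usage: stream five 2-D patterns; em_bar chosen as ~0.28 times the number of inputs.
rules = []
for x in np.random.default_rng(2).uniform(0.0, 1.0, (5, 2)):
    maybe_add_rule(x, rules, em_bar=0.28 * 2)
print(len(rules), "rules generated")
```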

2.2.2 Parameter Learning Phase

After the network structure has been adjusted according to the current training data, the network enters the parameter learning phase to adjust the parameters of the network optimally based on the same training data. The learning process involves determining the minimum of a given cost function. The gradient of the cost function is computed and the parameters are adjusted along the negative gradient. The back-propagation algorithm is adopted for this supervised learning task. When the single-output case is considered for clarity, the cost function $E$ to be minimized is defined as

$$E(t) = \frac{1}{2}\left[y_d(t) - y(t)\right]^2 \qquad (2.13)$$

where $y_d(t)$ is the desired output and $y(t)$ is the model output for each discrete time $t$. In each training cycle, starting at the input variables, a forward pass is adopted to calculate the activity of the model output $y(t)$.

When the back-propagation learning algorithm is adopted, the weighting vector of the FLNFN model is adjusted such that the error defined in Eq. (2.13) is less than the desired threshold value after a given number of training cycles. The well-known back-propagation learning algorithm may be written briefly as

$$W(t+1) = W(t) + \Delta W(t) = W(t) + \eta\left(-\frac{\partial E(t)}{\partial W}\right) \qquad (2.14)$$

where $\eta$ and $W$ represent the learning rate and the tuning parameters of the FLNFN model, respectively. Let $W=[m, \sigma, w]^T$ be the weighting vector of the FLNFN model. Then the gradient of the error $E(\cdot)$ in Eq. (2.13) with respect to an arbitrary weighting vector $W$ is

$$\frac{\partial E(t)}{\partial W} = -e(t)\,\frac{\partial y(t)}{\partial W}, \qquad e(t) = y_d(t) - y(t). \qquad (2.15)$$

Recursive applications of the chain rule yield the error term for each layer, and the parameters in the corresponding layers are then adjusted. With the FLNFN model and the cost function as defined in Eq. (2.13), the update rules for the link weight, mean, and variance can be derived as follows:

$$w_{kj}(t+1) = w_{kj}(t) + \eta_w\, e(t)\,\frac{u_j^{(3)}\varphi_k}{\sum_{j=1}^{R} u_j^{(3)}} \qquad (2.16)$$

$$m_{ij}(t+1) = m_{ij}(t) + \eta_m\, e(t)\,\frac{\left(\sum_{k=1}^{M} w_{kj}\varphi_k - y\right)u_j^{(3)}}{\sum_{j=1}^{R} u_j^{(3)}}\cdot\frac{2\left(u_i^{(1)} - m_{ij}\right)}{\sigma_{ij}^2} \qquad (2.17)$$

$$\sigma_{ij}(t+1) = \sigma_{ij}(t) + \eta_\sigma\, e(t)\,\frac{\left(\sum_{k=1}^{M} w_{kj}\varphi_k - y\right)u_j^{(3)}}{\sum_{j=1}^{R} u_j^{(3)}}\cdot\frac{2\left(u_i^{(1)} - m_{ij}\right)^2}{\sigma_{ij}^3} \qquad (2.18)$$

where $\eta_w$, $\eta_m$, and $\eta_\sigma$ are the learning rates of the link weight, mean, and variance, respectively. In this dissertation, both the link weights in the consequent part and the parameters of the membership functions in the precondition part are adjusted using the back-propagation algorithm. Recently, many researchers [10], [57] have tuned the consequent parameters using either the least mean squares (LMS) or the recursive least squares (RLS) algorithm to obtain optimal parameters. However, they still used the back-propagation algorithm to adjust the precondition parameters.
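A minimal sketch of these update rules in Python, reusing the array layout of the earlier forward-pass sketch; the learning-rate value, interface, and usage example are illustrative assumptions.

```python
import numpy as np

def flnfn_backprop_step(x, y_d, m, sigma, W, eta=0.05):
    """One gradient-descent update per Eqs. (2.14)-(2.18).
    x: input, shape (N,); m, sigma: Gaussian parameters, shape (N, R);
    W: link weights, shape (R, 3*N). Arrays are updated in place."""
    # Forward pass, Eqs. (2.3)-(2.7).
    u2 = np.exp(-((x[:, None] - m) ** 2) / sigma ** 2)    # layer 2
    u3 = np.prod(u2, axis=0)                              # layer 3 firing strengths
    phi = np.concatenate([[xi, np.sin(np.pi * xi), np.cos(np.pi * xi)] for xi in x])
    local = W @ phi                                       # consequent local outputs
    y = (u3 * local).sum() / u3.sum()                     # layer 5 output
    e = y_d - y                                           # e(t) = y_d(t) - y(t)
    dW = eta * e * np.outer(u3, phi) / u3.sum()           # Eq. (2.16)
    common = eta * e * (local - y) / u3.sum() * u3        # shared chain-rule factor, shape (R,)
    dm = common * 2 * (x[:, None] - m) / sigma ** 2       # Eq. (2.17)
    ds = common * 2 * (x[:, None] - m) ** 2 / sigma ** 3  # Eq. (2.18)
    W += dW; m += dm; sigma += ds
    return y, e

# Usage: drive the output toward y_d = 0.5 on a fixed pattern (N=2, R=3).
rng = np.random.default_rng(3)
m = rng.uniform(0, 1, (2, 3)); sigma = np.full((2, 3), 0.5)
W = rng.uniform(-1, 1, (3, 6))
for _ in range(200):
    y, e = flnfn_backprop_step(np.array([0.2, 0.8]), 0.5, m, sigma, W)
print(f"final error: {e:.4f}")
```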

2.2.3 Convergence Analysis

The selection of suitable learning rates is very important. If the learning rate is small, convergence is guaranteed, but the speed of convergence may be slow; if the learning rate is large, the system may become unstable. To train the FLNFN model effectively, the Appendix derives varied learning rates, based on the analysis of a discrete Lyapunov function, that guarantee convergence of the output error. The convergence analyses in this dissertation derive specific learning rate parameters for specific network parameters to ensure convergence of the output error [58]-[59]. Note that guaranteed convergence of the output error does not imply convergence of the network parameters to their optimal values. The following simulation results demonstrate the effectiveness of the online learning FLNFN model based on the proposed delta adaptation law and varied learning rates.

2.3 Experimental Results

This section demonstrates the performance of the FLNFN model for nonlinear system control. Various control examples are simulated, and the performance of the FLNFN model is compared with that of other models. The FLNFN model is adopted to design controllers in four simulations of nonlinear control problems: a water bath temperature control system [60], control of a bounded-input bounded-output (BIBO) nonlinear plant [58], control of a ball and beam system [61], and multi-input multi-output (MIMO) plant control [62].

Example 1: Control of Water Bath Temperature System

The goal of this section is to control the temperature of a water bath system given by

$$\frac{dy(t)}{dt} = \frac{u(t)}{C} + \frac{Y_0 - y(t)}{RC} \qquad (2.19)$$

where $y(t)$ is the output temperature of the system in °C; $u(t)$ is the heat flowing into the system; $Y_0$ is the room temperature; $C$ is the equivalent thermal capacity of the system; and $R$ is the equivalent thermal resistance between the borders of the system and the surroundings.

Assuming that $R$ and $C$ are essentially constant, the system in Eq. (2.19) is rewritten in discrete-time form, to some reasonable approximation, as

$$y(k+1) = e^{-\alpha T_s}\,y(k) + \frac{(\beta/\alpha)\left(1 - e^{-\alpha T_s}\right)}{1 + e^{0.5\,y(k) - 40}}\,u(k) + \left(1 - e^{-\alpha T_s}\right)Y_0 \qquad (2.20)$$

where the parameters $\alpha = 1.0015 \times 10^{-4}$ and $\beta = 8.67973 \times 10^{-3}$ were obtained from a real water bath plant considered elsewhere [60]. The plant input $u(k)$ is limited to between 0 and 5 V, and the sampling period is $T_s = 30$ s.
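For readers who wish to reproduce the simulated plant, the following Python sketch implements Eq. (2.20) with the parameter values quoted above; the random excitation loop mirrors the training-data collection described below and is an illustrative assumption.

```python
import numpy as np

# Discrete-time water bath plant of Eq. (2.20); alpha, beta, Y0, Ts as quoted above.
ALPHA, BETA, Y0, TS = 1.0015e-4, 8.67973e-3, 25.0, 30.0

def water_bath_step(y_k, u_k):
    """One sampling step of Eq. (2.20): returns y(k+1) from y(k) and u(k)."""
    a = np.exp(-ALPHA * TS)
    return (a * y_k
            + (BETA / ALPHA) * (1 - a) * u_k / (1 + np.exp(0.5 * y_k - 40.0))
            + (1 - a) * Y0)

# Excite the plant with random inputs in [0, 5] V, starting from the
# initial temperature of 25 degrees C.
rng = np.random.default_rng(4)
y = 25.0
for k in range(120):
    y = water_bath_step(y, rng.uniform(0.0, 5.0))
print(f"temperature after 120 steps: {y:.2f} C")
```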

The conventional online training scheme is adopted for online training. Figure 2.4 presents a block diagram of the conventional online training scheme, which has two phases: the training phase and the control phase. In the training phase, the switches S1 and S2 are connected to nodes 1 and 2, respectively, to form a training loop. In this loop, a training pattern is defined with input vector $I(k)=[y_p(k+1)\ \ y_p(k)]$ and desired output $u(k)$, where the input vector of the FLNFN controller is the same as that used in the general inverse modeling [63] training scheme. In the control phase, the switches S1 and S2 are connected to nodes 3 and 4, respectively, forming a control loop. In this loop, the control signal $\hat{u}(k)$ is generated according to the input vector $I'(k)=[y_{ref}(k+1)\ \ y_p(k)]$, where $y_p$ is the plant output and $y_{ref}$ is the reference model output.

A sequence of random input signals $u_{rd}(k)$ limited to between 0 and 5 V is injected directly into the simulated system described by Eq. (2.20), using the online training scheme for the FLNFN controller. The 120 training patterns are selected based on the input-output characteristics to cover the entire reference output range. The temperature of the water is initially 25 °C and rises progressively as the random input signals are injected. After 10000 training iterations, four fuzzy rules are generated. The obtained fuzzy rules are as follows.

Rule-1: IF $x_1$ is $\mu(32.416, 11.615)$ and $x_2$ is $\mu(27.234, 7.249)$

Figure 2.4: Conventional online training scheme.

This dissertation compares the FLNFN controller with the proportional-integral-derivative (PID) controller [64], the manually designed fuzzy controller [1], the functional link neural network [25], and the TSK-type neuro-fuzzy network (TSK-type NFN) [9]. Each of these controllers is applied to the water bath temperature control system. The performance measures include set-point regulation, the influence of impulse noise, the effect of a large parameter variation in the system, and the tracking capability of the controllers.

The first task is to control the simulated system to follow three set-points.

Figure 2.5(a) presents the regulation performance of the FLNFN controller. The regulation performance was also tested using the FLNN controller and the TSK-type NFN controller.

Figure 2.5(b) plots the error curves of the FLNFN controller, the FLNN controller and the TSK-type NFN controller between k=81 and k=100. In this figure, the FLNFN controller
