Neural Network for Control - Intelligent Control

Chapter 2 Intelligent Control

2.2 Neural Network for Control

Traditional control theory relies on a mathematical model of the plant, so it fails when that model is unknown or inaccurate. Intelligent control theory, which is based on the human abilities of thinking and learning, is not restricted to a mathematical model and can therefore solve control problems that conventional control theory cannot.

Investigators have proposed many control methods based on neural networks, such as model reference control. In addition, some neural network structures, such as the recurrent neural network, have been proposed for system identification. This section introduces model reference control based on neural networks and the recurrent neural network.

Figure 2.4 the feedback system

Model reference control based on neural networks is introduced first. For a given plant, the feedback system is shown in Fig. 2.4, and the controller is the system to be learned [16]. The reference model is designed according to the specification of the problem.

Basically, the learning algorithm is based on the gradient method, so the learning process needs two neural networks: one acting as the controller and the other as the plant. The learning can use the backpropagation algorithm introduced in the last section. To backpropagate the error to the neural network controller, the inverse of the plant should be known. Unfortunately, the inverse dynamics of a plant do not always exist, and may be unknown even when they exist. Hence, a neural network plant is trained first so that the system error can be backpropagated through it to the neural network controller, as shown in Fig. 2.5. The drawback is that the neural network plant must be retrained whenever the condition of the plant changes, even slightly.

Figure 2.5 to train neural network plant

The learning algorithm alters the connecting weights according to the error between the plant output and the neural network plant output. Once the neural network plant is trained well, it replaces the original plant. It is worth noting that the connecting weights of the neural network plant do not change while the neural network controller is trained. Fig. 2.6 shows the learning system for training the neural network controller.

Compared with the reference model, the neural network controller does not learn to be the real controller. If the plant in the s-domain is defined as H(s) and the controller is defined as G(s), the feedback system is described as

$$\frac{Y(s)}{U(s)} = \frac{G(s)H(s)}{1 + G(s)H(s)} \tag{2.9}$$

where Y(s) is the system output and U(s) is the system input. However, the neural network system is described as

$$\frac{Y(s)}{U(s)} = G_{NN}(s)\,H_{NN}(s) \tag{2.10}$$

where $G_{NN}(s)$ denotes the neural network controller and $H_{NN}(s)$ denotes the neural network plant. Because the neural network plant is trained to behave as the real plant, $H_{NN}(s)$ is assumed identical to H(s). Equating (2.9) and (2.10) then shows that the neural network controller does not learn the real controller, but is instead expressed as

$$G_{NN}(s) = \frac{G(s)}{1 + G(s)H(s)} \tag{2.11}$$

which is not equal to G(s).
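The relation between (2.9), (2.10), and (2.11) can be checked numerically at a single complex frequency. The following is a minimal sketch; the plant `plant(s) = 1/(s+1)` and the proportional controller with gain 2 are hypothetical choices made only for illustration and do not come from the thesis.

```python
def plant(s):
    """Hypothetical first order plant H(s) = 1 / (s + 1)."""
    return 1.0 / (s + 1.0)


def controller(s):
    """Hypothetical proportional controller G(s) = 2 (independent of s)."""
    return 2.0 + 0.0 * s


def closed_loop(s):
    """Feedback system of (2.9): G(s)H(s) / (1 + G(s)H(s))."""
    gh = controller(s) * plant(s)
    return gh / (1.0 + gh)


def nn_controller(s):
    """What the network controller effectively learns, from (2.11)."""
    return controller(s) / (1.0 + controller(s) * plant(s))


s = 0.5j  # evaluate on the imaginary axis at 0.5 rad/s

# Series connection of (2.10), with H_NN assumed identical to H.
series = nn_controller(s) * plant(s)

print(abs(series - closed_loop(s)))        # agreement up to rounding
print(abs(nn_controller(s) - controller(s)))  # clearly nonzero
```

The first printed value confirms that the series connection reproduces the closed-loop behaviour, while the second shows that the learned controller differs from G(s) itself.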

Figure 2.6 to train the neural network controller

Model reference control based on neural networks is a simple way to control a system according to the specification of the problem. However, the neural network plant must be trained first, which leads to errors when the plant changes, for example in its input-output relationship or its sampling time. No matter how the plant changes, the neural network plant must be trained again so that the error is backpropagated accurately. Besides, the neural network controller is not the original controller. These drawbacks restrict the use of model reference control based on neural networks.

Because of the restriction of the inverse dynamics, some investigators have proposed other control methods that do not require learning a neural network plant [17-19]. For simplicity, the neural network controller could be learned by system identification once the controller has been designed, but the control ability of the resulting neural network controller cannot be guaranteed. Thus, a fully recurrent neural network, shown in Fig. 2.7, is proposed; it can represent dynamics and store information for later use. A learning algorithm for the fully recurrent neural network, real-time recurrent learning (RTRL), is introduced next.

Figure 2.7 the fully recurrent neural network

RTRL has been proposed independently by several investigators; the most widely used version, by Williams and Zipser [20], is introduced here. The learning algorithm is described in detail below.

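Let the network have n neurons and m external inputs. The outputs of the neurons are denoted as $y_{k_n}(t)$, $k_n = 1,2,\dots,n$, and the external inputs are denoted as $x_{k_i}(t)$, $k_i = 1,2,\dots,m$, where t is the time. These values are concatenated to form the (m+n)-tuple z(t), which is

$$z(t) = \left[\, y_1(t), \dots, y_n(t),\; x_1(t), \dots, x_m(t) \,\right]^T \tag{2.12}$$

that is, the outputs of all neurons followed by the external inputs. Since the network is fully connected, the weight matrix W is an n×(m+n) matrix that produces the n neuron outputs from the m external inputs and the n neuron outputs fed back as inputs. Thus, the net input of the $k_n$-th neuron at time t can be calculated as

$$s_{k_n}(t) = \sum_{l=1}^{m+n} w_{k_n l}\, z_l(t) \tag{2.13}$$

where l = 1,2,…,m+n. The output of the neuron at the next time step is determined as

$$y_{k_n}(t+1) = f_{k_n}\!\left( s_{k_n}(t) \right) \tag{2.14}$$

where $f_{k_n}(\cdot)$ is the activation function of the $k_n$-th neuron. Let the first o neurons have specified target values $d_{k_n}(t)$, so the n-tuple error e(t) is described as

$$e_{k_n}(t) = \begin{cases} d_{k_n}(t) - y_{k_n}(t), & k_n \le o \\ 0, & \text{otherwise.} \end{cases}$$

Define the overall network error at time t as

$$E(t) = \frac{1}{2} \sum_{k_n=1}^{n} \left[ e_{k_n}(t) \right]^2$$

so that

$$E_{total}(t_0, t_1) = \sum_{t=t_0+1}^{t_1} E(t)$$

denotes the network error running from time $t_0$ to the time $t_1$. W is adjusted along the negative gradient of this error, which requires the sensitivity of every neuron output to every weight $w_{ij}$. Writing k for $k_n$ for brevity, differentiating (2.13) and (2.14) gives the recursion

$$\frac{\partial y_k(t+1)}{\partial w_{ij}} = f_k'\!\left( s_k(t) \right) \left[ \sum_{l=1}^{n} w_{kl} \frac{\partial y_l(t)}{\partial w_{ij}} + \delta_{ik}\, z_j(t) \right]$$

where $\delta_{ik}$ denotes the Kronecker delta. The initial state of the network is independent of the weights, so it can be given as

$$\frac{\partial y_k(t_0)}{\partial w_{ij}} = 0 .$$

Finally, the weight change at time t can be determined by

$$\Delta w_{ij}(t) = -\alpha \frac{\partial E(t)}{\partial w_{ij}} = \alpha \sum_{k=1}^{n} e_k(t)\, \frac{\partial y_k(t)}{\partial w_{ij}}$$

where α is the learning rate, and the overall weight change can also be determined by summing $\Delta w_{ij}(t)$ over the interval from $t_0$ to $t_1$.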
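The RTRL recursion can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the thesis or in [20]: the tanh activation, the ordering of z(t) as neuron outputs followed by external inputs, the learning rate, the sine input, and the toy target are all choices made here for the example.

```python
import math
import random


def rtrl_step(W, y, p, x, d_next, alpha):
    """One RTRL step for a fully recurrent network with tanh neurons.

    W       n x (n+m) weight matrix (list of rows)
    y       n neuron outputs at time t
    p       sensitivities, p[k][i][j] = d y_k(t) / d w_ij
    x       m external inputs at time t
    d_next  targets for y(t+1); None marks a neuron without a target
    alpha   learning rate
    Returns the updated weights, y(t+1), the new sensitivities, and e(t+1).
    """
    n, m = len(y), len(x)
    z = y + x  # assumed ordering: neuron outputs first, then inputs
    s = [sum(W[k][l] * z[l] for l in range(n + m)) for k in range(n)]
    y_next = [math.tanh(v) for v in s]
    # Sensitivity recursion; the derivative of tanh(s) is 1 - tanh(s)^2.
    p_next = [[[(1.0 - y_next[k] ** 2)
                * (sum(W[k][l] * p[l][i][j] for l in range(n))
                   + (z[j] if i == k else 0.0))
                for j in range(n + m)]
               for i in range(n)]
              for k in range(n)]
    e = [0.0 if d_next[k] is None else d_next[k] - y_next[k]
         for k in range(n)]
    # Gradient descent on E(t+1) = 0.5 * sum of squared errors.
    W_new = [[W[i][j] + alpha * sum(e[k] * p_next[k][i][j] for k in range(n))
              for j in range(n + m)]
             for i in range(n)]
    return W_new, y_next, p_next, e


# Toy usage: two neurons, one input, only neuron 0 has a target.
random.seed(0)
n, m = 2, 1
W = [[random.uniform(-0.1, 0.1) for _ in range(n + m)] for _ in range(n)]
y = [0.0] * n
p = [[[0.0] * (n + m) for _ in range(n)] for _ in range(n)]  # zero at t0
for t in range(200):
    x = [math.sin(0.3 * t)]
    d_next = [0.5 * x[0], None]
    W, y, p, e = rtrl_step(W, y, p, x, d_next, alpha=0.1)
print("error of neuron 0 at the last step:", e[0])
```

The nested list `p` holds n × n × (n+m) sensitivities, all of which are refreshed at every time step; this is where the memory and computation cost of RTRL comes from.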

RTRL is broadly used, for example in classification and in learning finite state systems, and the authors show many simulation results to demonstrate its ability. Unfortunately, the algorithm must keep the sensitivity of every neuron output with respect to every weight, so it needs a lot of memory. Besides, it is computationally expensive, because the weight matrix is large and all of this stored state information must be updated at every time step.

There are still many other methods for learning dynamic systems or controllers, but they are not explained in detail in this thesis. In the next chapter, evolution strategies for neural networks are introduced, using simple neural network structures to learn first order LTI systems.

Chapter 3 Evolution Strategies

Recently, concepts from biological evolution have been used in intelligent systems, for example in genetic algorithms, where a population reproduces generation by generation and then learns and survives according to natural selection. Several investigators have proposed evolution strategies to solve problems in diverse fields.

3.1 Modeling of First Order LTI System

In general, a linear time-invariant (LTI) system is described by an ordinary differential equation, expressed as

$$y^{(n)}(t) + a_1 y^{(n-1)}(t) + a_2 y^{(n-2)}(t) + \cdots + a_n y(t) = b\,u(t) \tag{3.1}$$

where y(t) is the system output at time t, u(t) is the system input at time t, and $a_i$, i = 1,2,…,n, and b are the constant parameters of the system. When the initial conditions $y^{(n-1)}(0)$, …, $\dot{y}(0)$, and $y(0)$ are all zero, the LTI system can be rewritten as the transfer function

$$\frac{Y(s)}{U(s)} = \frac{b}{s^n + a_1 s^{n-1} + \cdots + a_{n-1} s + a_n} \tag{3.2}$$

which is commonly used to characterize the LTI system. This thesis focuses on the modeling of the simplest case, the first order LTI system, expressed as

$$\dot{y}(t) + a\,y(t) = b\,u(t), \tag{3.3}$$

by intelligent structures and algorithms. For simplicity, the first order LTI system is commonly rewritten as the transfer function

$$\frac{Y(s)}{U(s)} = \frac{b}{s + a} \tag{3.4}$$

which has been widely used in controller design.

This thesis focuses on implementing first order LTI systems using neural networks. However, neural networks are discrete-time (DT) systems, not continuous-time systems, so an error exists between the NN system and the objective system, the first order LTI system. A simple DT system has been proposed to approximate the first order LTI system [21], and is introduced next.

The discrete-time system obtained from (3.3) under the sampling time ∆T is described as

$$y[n+1] = (1 - a\Delta T)\,y[n] + (b\Delta T)\,u[n] \tag{3.5}$$

where y[n] = y(n∆T) and u[n] = u(n∆T). This is the so-called first order difference equation.
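To see how closely (3.5) follows the continuous-time system, the difference equation can be iterated and compared with the exact unit-step response of (3.3), which for a step of amplitude A and zero initial condition is $y(t) = (bA/a)(1 - e^{-at})$. The sketch below does this; the parameter values a = 2, b = 1, ∆T = 0.01 and the function names are illustrative assumptions, not values from the thesis.

```python
import math


def simulate_difference_equation(a, b, dt, u, steps):
    """Iterate y[n+1] = (1 - a*dt) * y[n] + (b*dt) * u[n], starting at y[0] = 0.

    u is a function of time; it is sampled at t = n * dt.
    """
    y = [0.0]
    for n in range(steps):
        y.append((1.0 - a * dt) * y[n] + b * dt * u(n * dt))
    return y


def exact_step_response(a, b, amplitude, t):
    """Solution of dy/dt + a*y = b*u for a step input and y(0) = 0."""
    return (b * amplitude / a) * (1.0 - math.exp(-a * t))


# Illustrative values only.
a, b, dt, A = 2.0, 1.0, 0.01, 1.0
steps = 300
y = simulate_difference_equation(a, b, dt, lambda t: A, steps)
worst = max(abs(y[n] - exact_step_response(a, b, A, n * dt))
            for n in range(steps + 1))
print(f"largest deviation over {steps * dt:.1f} s: {worst:.5f}")
```

Rerunning the sketch with a smaller `dt` gives a smaller deviation at the same time instants, at the price of more iterations to cover the same time span.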

To find the error between the first order LTI system (3.3) and its corresponding difference equation (3.5) when the system input is a step function, the solution of (3.3) is determined first as

$$y(t) = c + d\,e^{-at} \quad \text{and} \quad \dot{y}(t) = -a\,d\,e^{-at} \tag{3.6}$$

where c and d are related to the input amplitude and the system initial conditions $y(0)$ and $\dot{y}(0)$. If a step input with amplitude A is applied and the system is initially at rest, the solution is obtained as

$$y(t) = \frac{bA}{a}\left( 1 - e^{-at} \right).$$

As for (3.5) with the same input and initial condition as (3.3), its solution for the sampling time ∆T can be found as

$$y[n] = \frac{bA}{a}\left[ 1 - (1 - a\Delta T)^n \right]$$

where ∆T is assumed to be small and the sum of error is then proportional to ∆T². It implies that the sum of error is reduced while the sampling time decreases.

A close look at (3.11) reveals that the fitness is large when the sampling time is small.
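The dependence of the fitness on the sampling time can be illustrated numerically. The sketch below is hedged in one important respect: the fitness used here, the negative sum of absolute errors over a fixed number of samples, is an assumed stand-in for the thesis's fitness function, and the parameter values are illustrative.

```python
import math


def fitness(a, b, amplitude, dt, steps):
    """Negative sum of |exact - difference equation| at the sampling instants.

    Assumption: absolute error and a fixed number of samples are used here
    only for illustration; the thesis's own definition may differ.
    """
    y = 0.0      # difference-equation output, y[0] = 0
    total = 0.0
    for n in range(steps + 1):
        exact = (b * amplitude / a) * (1.0 - math.exp(-a * n * dt))
        total += abs(exact - y)
        # First order difference equation with a step input.
        y = (1.0 - a * dt) * y + b * dt * amplitude
    return -total


for dt in (0.02, 0.01, 0.005):
    print(dt, fitness(a=1.0, b=1.0, amplitude=1.0, dt=dt, steps=50))
```

With these settings the printed fitness values move toward zero as `dt` shrinks, in line with the observation that a smaller sampling time gives a larger fitness.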

The problem is how to increase the fitness other than by decreasing the sampling time. Here, neural network theory is introduced to learn the first order LTI system and solve this problem. Next, two neural network structures are introduced for learning the first order LTI system.

3.2 Neural Network Structure

In this thesis, the learning result of the evolution strategies is judged by the fitness function, which is defined as the negative sum of the errors between the outputs of the first order LTI system and the NN system. The thesis investigates whether a neural network structure can learn the first order LTI system as a controller. In the last section, the first order difference equation was presented as a simple way to approximate the first order LTI system. Before our neural network structures are introduced, the structure for the first order difference equation is described first. It shows that the first order LTI system can be implemented with neural networks.

3.2.1 Structure for The First Order Difference Equation

It can be seen from (3.5) that the first order difference equation needs two parameters, (1-a∆T) and (b∆T), which weight the output and the input at the last time step, respectively, to approximate the first order LTI system. In a neural network, a general neuron with n inputs contains n connecting weights, (n-1) operators, and one activation function to produce an output. Here, the system input and output at the last time step are both viewed as inputs of a neural network. The first order difference equation can therefore be implemented by just one neuron whose activation function has a slope of one, as shown in Fig. 3.1. The question is then whether a neural network with more neurons can achieve a larger fitness than the first order difference equation. Intuitively, it can; this will be shown in Chapter 4.

Figure 3.1 the first order difference equation using one neuron

3.2.2 General Structure

Following the first order difference equation, a general neural network structure contains an input layer with two neurons, which represent the system input and output at the last time step, an output layer with one neuron, which represents the system output at the current time step, and some hidden layers. It is known that a multilayer perceptron (MLP) can handle more problems than a single-layer network, so one hidden layer is used in the general structure for simplicity. Although more neurons in the hidden layer increase the possibility of good performance, they also increase the computation time. As a compromise between the two, five neurons are placed in the hidden layer, as shown in Fig. 3.2.

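One may notice that the activation functions of this structure are all described as

$$f'(\cdot) = 1, \tag{3.12}$$

that is, they are linear with unit slope, and there is no threshold term in the structure. In this two-layered neural network structure, called ‘GS’ for short, the synaptic weight connecting neuron i to neuron j is symbolized as $w^{(l)}_{ji}$, where l means that the synaptic weight is between the (l-1)-th layer and the l-th layer, l = 1,2. Namely, there are two weight matrices, which are described as

$$W^{(1)} = \begin{bmatrix} w^{(1)}_{11} & w^{(1)}_{12} \\ w^{(1)}_{21} & w^{(1)}_{22} \\ w^{(1)}_{31} & w^{(1)}_{32} \\ w^{(1)}_{41} & w^{(1)}_{42} \\ w^{(1)}_{51} & w^{(1)}_{52} \end{bmatrix}, \qquad W^{(2)} = \begin{bmatrix} w^{(2)}_{11} & w^{(2)}_{12} & w^{(2)}_{13} & w^{(2)}_{14} & w^{(2)}_{15} \end{bmatrix} \tag{3.13}$$

in the GS, and the output of the GS can be simply determined as

$$y[n+1] = W^{(2)}\, W^{(1)} \begin{bmatrix} u[n] \\ y[n] \end{bmatrix} \tag{3.14}$$

that is a discrete time equation.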

Figure 3.2 the GS for first order LTI system

Compared with the first order difference equation in the last section, the output of the GS is the same if the weight matrices are given as in (3.15), where a and b are the parameters of the first order LTI system $b/(s+a)$ and ∆T is the sampling time. Since the GS with the weight matrices of (3.15) is equivalent to the first order difference equation, its fitness is the same.
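The equivalence between the GS and the first order difference equation can be illustrated with a small forward pass. This is a sketch under assumptions: the input ordering (u[n] first, then y[n]) and the particular weight values are hypothetical choices that make the product of the two weight matrices equal to [b∆T, 1-a∆T]; they are not necessarily the matrices the thesis gives in (3.15).

```python
def gs_output(W1, W2, u_n, y_n):
    """Output y[n+1] of the GS: five linear hidden neurons, one linear output.

    W1 is a 5x2 list of rows (hidden weights), W2 a list of 5 output weights.
    There are no threshold terms and every activation has unit slope.
    """
    hidden = [row[0] * u_n + row[1] * y_n for row in W1]
    return sum(w * h for w, h in zip(W2, hidden))


a, b, dt = 2.0, 1.0, 0.01  # illustrative parameters

# Hypothetical weights: hidden neuron 1 carries (b*dt)*u, hidden neuron 2
# carries (1 - a*dt)*y, and the other three hidden neurons are switched off.
W1 = [[b * dt, 0.0],
      [0.0, 1.0 - a * dt],
      [0.0, 0.0],
      [0.0, 0.0],
      [0.0, 0.0]]
W2 = [1.0, 1.0, 0.0, 0.0, 0.0]

u_n, y_n = 1.0, 0.3
print(gs_output(W1, W2, u_n, y_n))            # GS output
print((1.0 - a * dt) * y_n + b * dt * u_n)    # difference equation output
```

Because the network is linear, any pair of matrices with the same product gives the same output, so many different individuals share the fitness of the difference equation.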

Whether other weight matrices lead to a larger fitness is one question. The GS can be said to be better than the first order difference equation if its fitness is larger at the same sampling time. A second question remains: how much does the sampling time affect the fitness of the GS? For the first order difference equation, the fitness increases as the sampling time decreases, so reducing the sampling time is necessary there. Whether the sampling time has the same influence on the GS as on the first order difference equation is taken up in the next chapter.

3.2.3 Structure with Sampling Time

Since the GS is a discrete time system whose coefficients are not related to the sampling time, it learns under the fixed sampling time of the training data. As a result, the GS is only suitable for that sampling time, which restricts its use. This raises a question: does a neural network structure exist that, after learning, works for arbitrary sampling times, just as the first order difference equation does for a first order LTI system? To answer it, a structure with sampling time, called ‘SST’ for short, is designed to adapt to a wide range of sampling times, as shown in Fig. 3.3. It adopts the parameters of the first order difference equation and is described by (3.16), where the gains of the inputs are the main difference from the GS. By setting the weight matrices as in (3.17), the SST produces the same output as the first order difference equation. In this case, the SST is usable over a wide range of sampling times, but it is subject to an error that depends on the sampling time. To further increase the fitness, different weight matrices must be found by the evolution strategies, which are introduced in the next section.

Figure 3.3 the SST for first order LTI system

In this section, apart from the structure for the first order difference equation, two types of neural network structure are designed to learn the first order LTI system; the main difference between them is how the sampling time enters the structure. Before the simulations, the learning algorithm for these structures is introduced in the next section.

3.3 Evolution Strategies

In neural network theory, an important issue is to find the weights that lead to the global minimum of an error function. Unfortunately, it is difficult to know whether a minimum is global, even for systems without uncertainties. Therefore, instead of the global minimum, investigators often develop evolution strategies that search for a local optimum, that is, the weights with the largest fitness locally rather than globally.

The backpropagation learning algorithm adjusts the synaptic weights using the chain rule of the gradient descent method, as in (2.6) and (2.7), so the synaptic weights cannot be updated by backpropagation when the inverse dynamic system is unknown. However, the evolution strategies we propose adjust the weights by evolution according to the fitness function rather than by the chain rule, so they avoid this disadvantage of the backpropagation learning algorithm. The neural network can therefore learn with the evolution strategies even if the inverse dynamic system is unknown.

According to Darwin's On the Origin of Species [22], individuals less suited to the environment are less likely to survive and reproduce. Under the limits of the environment, many species reproduce sexually, so in general no two individuals are identical, and the individuals more suited to the environment are more likely to pass their heritable characteristics to future generations. This is so-called natural selection, the model most widely used by biologists to explain how species evolve.

Here, evolution strategies based on natural selection are proposed as the learning algorithm of the neural network. In the evolution strategies, the given problem is viewed as the environment and every set of weight matrices is viewed as an individual [23]. Basically, biological reproduction is divided into two kinds: sexual and asexual. An individual produced by sexual reproduction differs from its parents, whereas one produced by asexual reproduction is an identical copy of its parent except for mutation. Unlike the natural world, where mutation happens unpredictably during reproduction, mutation does not happen in the evolution strategies. In general, a species does not reproduce sexually and asexually at the same time, even in the case of hydras and earthworms, which can reproduce either way. In the evolution strategies, however, reproduction happens both sexually and asexually at the same time.

Four points are helpful in sketching out the evolution strategies: the creation of the initial individuals, the reproduction process, the learning process, and the elite process. Since each generation inherits from the last one, the initial individuals affect the future offspring; this means that not every initial individual will behave as the objective system after learning. Through the various reproduction and learning processes, the individuals of every generation will be different even if the initial individuals are the same. Further, the elite process chooses the individuals that are more suitable for the problem. In this thesis, the evolution leads to many different results depending on its initial population, reproduction process, learning process, and elite process, and any one of these can cause the learning to fail. These four influences are discussed next.

3.3.1 Initial Individuals Creation

From (3.13), both structures contain two weight matrices, $W^{(1)}$ and $W^{(2)}$. In the evolution strategies, the matrices are combined into an individual $W_k^g$, defined as

$$W_k^g = \left[\, w^{(1)}_{11}\; w^{(1)}_{12}\; w^{(1)}_{21}\; w^{(1)}_{22}\; w^{(1)}_{31}\; w^{(1)}_{32}\; w^{(1)}_{41}\; w^{(1)}_{42}\; w^{(1)}_{51}\; w^{(1)}_{52}\; w^{(2)}_{11}\; w^{(2)}_{12}\; w^{(2)}_{13}\; w^{(2)}_{14}\; w^{(2)}_{15} \,\right] \tag{3.18}$$

which is the k-th individual of the g-th generation. Let g = 0, 1, 2, …, and k = 1, 2, 3, …, $n^g$, where $n^g$ is the population size of the g-th generation. First, the initial individuals $W_k^0$ affect the offspring $W_k^g$, g ≠ 0, and not every $W_k^0$ leads to successful learning. There is no exact way to decide what the initial individuals should be before the evolution. The learning is expected to succeed even if the initial individuals are given randomly. Unfortunately, this is difficult because of the problem of local optima. Note that (3.15) and (3.17) can be regarded as good individuals of the GS and the SST, respectively. If these parameters are used as an initial individual, the probability of success increases. Chapter 4 will show the influence of the initial individuals.
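The encoding of (3.18) amounts to flattening the two weight matrices into one 15-element vector and back. The sketch below shows one way to do this and to draw a random initial population; the helper names, the uniform range [-1, 1], and the population size of 10 are assumptions made for illustration and are not taken from the thesis.

```python
import random


def to_individual(W1, W2):
    """Flatten W1 (5x2, row by row) followed by W2 (5 weights) as in (3.18)."""
    return [w for row in W1 for w in row] + list(W2)


def from_individual(individual, hidden=5, inputs=2):
    """Recover (W1, W2) from a flat individual."""
    W1 = [individual[r * inputs:(r + 1) * inputs] for r in range(hidden)]
    W2 = individual[hidden * inputs:]
    return W1, W2


def random_population(size, length=15, low=-1.0, high=1.0, seed=None):
    """Initial generation W_k^0 with uniformly random weights (assumed range)."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for _ in range(length)]
            for _ in range(size)]


population = random_population(size=10, seed=0)
W1, W2 = from_individual(population[0])
print(len(population), len(W1), len(W2))
```

A known good individual, such as the weights that reproduce the first order difference equation, could be appended to `population` with `to_individual` so that the evolution does not start from random weights only.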

3.3.2 Reproduction Process

The reproduction process is used to enlarge the search space of the individuals. For human beings, the offspring combines half of the chromosomes of each parent when sexual reproduction happens: the child carries half of its genes from the mother and the other half from the father. Not all living things reproduce in the same way as human beings; the offspring of hydras, for example, can be produced asexually. Therefore, how to create the offspring is a concern.

In the evolution strategies, the individuals W_k^g are called the parents and Ω_k^g are called the children after the reproduction process. In the beginning of the reproduction process,
