
CHAPTER 1 INTRODUCTION

1.4 Outlines

The dissertation is organized as follows. Chapter 2 introduces the formulations for various optimal control problems and the general methods for solving such problems. Also briefly discussed are problem-solving procedures and the difficulties with direct and indirect methods.

Chapter 3 specifically addresses the computational methods for solving optimal control problems and presents the theoretical basis and numerical preliminaries for developing a general optimal control problem solver. The architecture of the OCP solver and the systematic procedure for solving the OCP are described in Chapter 4, which also presents the details of the implementation and user interface of the proposed solver. Here, the van der Pol oscillator problem with various types of terminal conditions and the time-optimal control problem of overhead crane control are used to demonstrate and verify the capability and accuracy of the proposed OCP solver. Chapter 5 introduces a two-phase scheme that integrates the admissible optimal control problem method and the enhanced branch-and-bound algorithm to efficiently solve the bang-bang control problems in the field of engineering. In Chapter 6, the proposed solver is applied to two practical engineering applications: the flight level control problem and the vehicle suspension design problem. Finally, Chapter 7 draws some conclusions and makes suggestions for further research.

CHAPTER 2

METHODS FOR SOLVING OPTIMAL CONTROL PROBLEMS

2.1 Introduction

Optimal control theory has been of considerable importance in a wide variety of disciplines. Over the years, the theory has been developed for various applications in many different fields, e.g., mechanical systems (Kim and Ha, 2001), automotive vehicle design (Panagiotis, 2000; Jalili and Esmailzadeh, 2001), and manufacturing processes (Samaras and Simaan, 2001). However, because most real-world problems are becoming too complex to be solved analytically (Kirk, 1970), using computational algorithms to solve them is becoming inevitable. As a result, several successful families of algorithms are now available in the literature.

Techniques for the numerical solution of optimal control problems can be broadly divided into direct and indirect methods (Bock, 1978; Stryk and Bulirsch, 1992). In the direct method, the state and/or control variables are parameterized using a piecewise polynomial approximation. Inserting these approximations into the cost functional, dynamic equations, and constraints and boundary conditions leads to a static parameter optimization problem. On the other hand, the indirect method is based on the solution of the first-order necessary conditions for optimality obtained from Pontryagin’s maximum principle (Pontryagin et al., 1962) or derived from the Hamilton-Jacobi-Bellman equation (Bellman, 1957).

Two early methods commonly used to solve optimal control problems are Bellman’s dynamic programming (Bellman, 1957) and Pontryagin’s maximum principle (PMP) (Pontryagin et al., 1962). Dynamic programming, developed by Bellman in the late 1950s (Bellman, 1957; Bellman and Dreyfus, 1962; Bellman and Kalaba, 1965), is a computational technique that extends the decision-making concept to sequences of decisions, which together define an optimal policy and trajectory. Subsequently, the Soviet mathematician Pontryagin and his colleagues (Pontryagin et al., 1962) developed the calculus of variations approach using a maximum principle. Although both the dynamic programming method and PMP have been used to solve optimal control problems, many practical problems described by strongly nonlinear differential equations cannot be easily solved by either technique. As a result, many approximation methods based on nonlinear programming (NLP) are used to solve these practical problems (see, e.g., Lin, 1992; Pytlak, 1999; Jaddu and Shimemura, 1999; Hu et al., 2002).

A nonlinear programming problem consists of a multivariable objective function subject to multiple inequality and equality constraints. Its solution is found by solving for the Kuhn-Tucker points, i.e., the points that satisfy the equalities given by the first-order necessary conditions. Conceptually, this procedure is analogous to solving optimal control problems using Pontryagin’s maximum principle. Depending on the discretization technique applied, methods that apply NLP solvers can be classified into two groups: simultaneous and sequential strategies. In the simultaneous methods, the state and control variables are fully discretized, which usually leads to large-scale NLP problems that require special solution strategies (Cervantes and Biegler, 1998; Betts and Huffman, 1992). In sequential methods, also known as control variable parameterization methods, only the control variables are discretized. Based on initial conditions and a set of control parameters, the system equations are integrated with an ordinary differential equation (ODE) solver at each iteration to produce the cost functional (performance index) and constraint values, which a nonlinear programming solver uses to find the optimal coefficient values of the control parameterization. The sequential approach is a feasible path method, i.e., the system equations are solved in each iteration.
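The sequential strategy described above can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration rather than part of the proposed solver: the scalar system x' = -x + u with x(0) = 1, the cost integral of x^2 + u^2 over [0, 1], five piecewise-constant control parameters, a hand-written Runge-Kutta integrator, and a plain finite-difference gradient descent standing in for a real NLP solver such as SQP.

```python
def rk4_step(rhs, t, y, h, u):
    """One classical Runge-Kutta step for y' = rhs(t, y, u), with u held constant."""
    k1 = rhs(t, y, u)
    k2 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)], u)
    k3 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)], u)
    k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)], u)
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def rhs(t, y, u):
    """Hypothetical test system: state x plus the running cost as an extra state."""
    x, _running_cost = y
    return [-x + u, x * x + u * u]


def cost(params, t_f=1.0, steps_per_interval=10):
    """Integrate the system for given control parameters and return the cost."""
    h = t_f / (len(params) * steps_per_interval)
    t, y = 0.0, [1.0, 0.0]          # x(0) = 1, accumulated cost = 0
    for u in params:                # one constant control value per interval
        for _ in range(steps_per_interval):
            y = rk4_step(rhs, t, y, h, u)
            t += h
    return y[1]


def optimize(params, step=0.5, iterations=200, eps=1e-6):
    """Finite-difference gradient descent (a stand-in for an NLP solver)."""
    p = list(params)
    for _ in range(iterations):
        base = cost(p)              # the system is re-integrated at every iterate
        grad = []
        for i in range(len(p)):
            q = list(p)
            q[i] += eps
            grad.append((cost(q) - base) / eps)
        p = [pi - step * gi for pi, gi in zip(p, grad)]
    return p


if __name__ == "__main__":
    u_opt = optimize([0.0] * 5)
    print("controls:", [round(u, 3) for u in u_opt])
    print("cost:", round(cost(u_opt), 4))
```

Each call to `cost` solves the initial value problem from scratch, which is what makes this a feasible path method: every iterate satisfies the system equations, and only the control parameters are unknowns of the optimization.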

However, this procedure is robust only when the system contains stable modes. Otherwise, finding a feasible solution for a given set of control parameters may be difficult. In this dissertation, a different discretization technique, the shooting method, is implemented and used in conjunction with sequential quadratic programming (SQP) to solve various types of optimal control problems.

The shooting method serves as a bridge between sequential and simultaneous approaches by partitioning the time domain into smaller time intervals and integrating the system equations separately in each interval. Control variables are treated in the same manner as in the sequential approach. Moreover, to obtain gradient information, sensitivities are obtained for both the control variables and the initial state conditions in each time interval. Finally, equality constraints are added to the nonlinear program in order to link the time intervals and ensure that the states are continuous across each time interval. This method allows inequality constraints for both the state and the controls to be imposed directly at the grid points. Thus, the admissible optimal control problem (AOCP) formulation based on the shooting method is adopted as the core of the proposed method.
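The linking constraints mentioned above can be sketched as follows. This is an illustrative example only: the scalar system x' = -x + u, the equal-length intervals, the piecewise-constant controls, and the function names are all assumptions, not the AOCP implementation. Each interval is integrated from its own node state, and the defect between the end of one interval and the start of the next is what the nonlinear program would drive to zero.

```python
import math


def rk4(f, x, u, t0, t1, steps=20):
    """Integrate the scalar ODE x' = f(t, x, u) from t0 to t1 with u constant."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, x, u)
        k2 = f(t + h / 2, x + h / 2 * k1, u)
        k3 = f(t + h / 2, x + h / 2 * k2, u)
        k4 = f(t + h, x + h * k3, u)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x


def f(t, x, u):
    """Hypothetical test dynamics."""
    return -x + u


def continuity_defects(node_states, controls, t_f=1.0):
    """Return x(t_{k+1}; s_k, u_k) - s_{k+1} for every shooting interval k."""
    n = len(controls)
    if len(node_states) != n + 1:
        raise ValueError("need one node state per grid point")
    dt = t_f / n
    defects = []
    for k in range(n):
        # Each interval starts from its own node state, independent of the others.
        x_end = rk4(f, node_states[k], controls[k], k * dt, (k + 1) * dt)
        defects.append(x_end - node_states[k + 1])
    return defects


if __name__ == "__main__":
    # Node states taken from the exact solution exp(-t) for u = 0: defects vanish.
    nodes = [math.exp(-0.25 * k) for k in range(5)]
    print(continuity_defects(nodes, [0.0] * 4))
```

Because the node states are optimization variables, bounds on the state can be imposed on them directly at the grid points, and a poor initial guess produces nonzero defects rather than a failed integration over the whole time domain.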

2.2 Canonical Formulation of Optimal Control Problems

Consider a dynamical system described by the following nonlinear differential equations on [0, tf]:

$$\dot{x}(t) = f\big(t, b, x(t), u(t)\big), \quad t \in [0, t_f] \tag{2.1}$$

with the initial condition

$$x(0) = x_0 \tag{2.2}$$

where $x(t) \in \mathbb{R}^n$ is a vector of the state variables. The function $f: \mathbb{R} \times \mathbb{R}^{\pi} \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is assumed to be continuously differentiable with respect to all its arguments, and $x_0$ is a given vector in $\mathbb{R}^n$. It is assumed that the process starts from $t_0 = 0$ and ends at the fixed terminal time $t_f > 0$. A process that starts from $t_0 \neq 0$ may be transformed to satisfy this assumption by suitable shifting on the time axis. Let U be the class of all such admissible controls. Then an optimal control problem may be stated formally as follows:

Given the dynamical system expressed in Eqs. (2.1) and (2.2), find u ∈ U such that the cost functional (performance index)

$$J_0 = \Phi_0\big(b, x(t_f), t_f\big) + \int_0^{t_f} L_0\big(b, u(t), x(t), t\big)\,dt \tag{2.3}$$

is minimized subject to the functional constraints of Eq. (2.4), which are formed from the terminal terms Φi and the integrands Li,

and the following continuous inequality constraint on the function of the state and control:

$$\psi_j\big(b, u(t), x(t), t\big) \le 0; \quad j = 1, \ldots, q, \quad \forall t \in [0, t_f] \tag{2.5}$$

where Φ0, L0, Φi, Li, and ψj are continuously differentiable with respect to their respective arguments. This problem is referred to as problem (PU). A control u ∈ U is said to be a feasible control if it satisfies constraints (2.4) and (2.5).

The preceding definition extends the original Bolza problem to account for inequality constraints because the original Bolza formulation, containing only equality constraints, is not general for the OCP. It also fails to treat the design variables b, which may serve a variety of useful purposes apart from the obvious design parameters, e.g., weight and velocity of a vehicle. Moreover, when the terminal time tf is unconstrained (for optimization), a free-time problem occurs. Otherwise, a fixed-time problem is given. In addition, the initial conditions are separated from the functional constraints in Eq. (2.4) for practical considerations, and the terminal conditions are treated as equality constraints in the first term of Eq. (2.4). The differential equations for the system in Eq. (2.1) are written in general first-order form.

Equation (2.5) represents the mixed state and control inequality dynamic constraints.

According to the constraints encountered in practical applications, most constraints can be classified under one of the following categories (Teo et al., 1991):

Type 1. Control bounds:

$$u_{\min} \le u(t) \le u_{\max}, \quad \forall t \in [0, t_f] \tag{2.6}$$

Type 2. Terminal state constraint with fixed terminal time:

( ) 0; 1,...,

Type 3. Terminal state constraint with free terminal time:

$$\Phi_i\big(b, x(t_f), t_f\big) = 0, \quad t_f \text{ is unspecified} \tag{2.8}$$

Type 4. Interior point state constraint:

$$\Phi_i\big(b, x(t_l), t_l\big) = 0, \quad 0 < t_l < t_f \tag{2.9}$$

Type 5. Integral constraint:

( )

Type 6. Continuous equality constraint on the function of the state and control:

$$\Phi_i\big(b, x(t), u(t), t\big) = 0, \quad \forall t \in [0, t_f] \tag{2.11}$$

Type 7. Continuous inequality constraint on the function of the state and control:

$$\Phi_i\big(b, x(t), u(t), t\big) \le 0, \quad \forall t \in [0, t_f] \tag{2.12}$$

To develop a general optimal control solver, any constraint of type 2 to type 7 can be regarded as a special case of Eqs. (2.4) and (2.5).

2.3 First-Order Necessary Condition – Euler Lagrangian Equation

The first-order necessary condition for optimality, known as the Euler-Lagrangian equation, can be found in many research studies (e.g., Teo et al., 1991; Kirk, 1970). Consider an optimal control problem in which the control u ∈ U is chosen to minimize the cost functional defined in Eq. (2.3), written here as

$$J_0 = \Phi_0\big(x(t_f), t_f\big) + \int_0^{t_f} L_0\big(u(t), x(t), t\big)\,dt \tag{2.13}$$

where Φ0 and L0 are continuously differentiable with respect to their respective arguments.

It should be noted that the cost functional may be regarded as depending explicitly only on u, as x is implicitly determined by u from Eqs. (2.1) and (2.2). In addition, the design variables vector, b, is treated as a constant and is not involved. The system equations (2.1) and (2.2) can be appended to the cost functional by introducing the appropriate Lagrange multiplier $\lambda \in \mathbb{R}^n$, which gives the appended cost functional of Eq. (2.14).
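For reference, the standard way of appending the system equations with a multiplier is sketched below. The bar notation for the appended functional and the shorthand argument order (t, x, u), borrowed from Eq. (2.15), are assumptions; the form is the textbook construction and is offered as a likely reading of Eq. (2.14), not as a verbatim reproduction.

```latex
% Appended cost functional: the constraint f - \dot{x} = 0 is adjoined with \lambda(t).
\bar{J}_0 = \Phi_0\big(x(t_f), t_f\big)
          + \int_0^{t_f} \Big[ L_0(t, x, u) + \lambda^T \big( f(t, x, u) - \dot{x} \big) \Big]\,dt
```

Whenever the state equation holds, the bracketed multiplier term vanishes, so the appended functional takes the same value as the original cost functional for any choice of the multiplier.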

The Hamiltonian function is defined as follows:

$$H(t, x, u, \lambda) = L_0(t, x, u) + \lambda^T f(t, x, u) \tag{2.15}$$

It should again be noted that, if the system equation is satisfied, the appended cost functional $\bar{J}_0$ is identical to the original $J_0$. The time-dependent Lagrange multiplier is referred to as the costate vector, also known as the adjoint vector.

Substituting Eq. (2.15) into Eq. (2.14) and integrating the last term by parts yields the cost functional of Eq. (2.16). For a small variation $\delta u$ in u, the corresponding first-order variations in x and $\bar{J}_0$ are $\delta x$ and $\delta \bar{J}_0$, respectively, where $\delta \bar{J}_0$ is obtained by the chain rule, as given in Eq. (2.17).
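The steps summarized in this passage follow the standard variational argument, sketched below for a fixed terminal time. The expressions are the textbook forms under the Hamiltonian of Eq. (2.15); their correspondence to the individual equation numbers (2.16) to (2.21) is an assumption, so the lines are left untagged.

```latex
% Integrating \lambda^T \dot{x} by parts in the appended functional:
\bar{J}_0 = \Phi_0\big(x(t_f), t_f\big) - \lambda^T(t_f)\,x(t_f) + \lambda^T(0)\,x(0)
          + \int_0^{t_f} \Big[ H(t, x, u, \lambda) + \dot{\lambda}^T x \Big]\,dt

% First-order variation (x(0) = x_0 is fixed, so \delta x(0) = 0):
\delta \bar{J}_0 = \Big[ \Big( \frac{\partial \Phi_0}{\partial x} - \lambda^T \Big)\,\delta x \Big]_{t = t_f}
                 + \int_0^{t_f} \Big[ \Big( \frac{\partial H}{\partial x} + \dot{\lambda}^T \Big)\,\delta x
                 + \frac{\partial H}{\partial u}\,\delta u \Big]\,dt

% Choosing the costate so that the \delta x terms drop out:
\dot{\lambda}^T = -\frac{\partial H}{\partial x}, \qquad
\lambda^T(t_f) = \frac{\partial \Phi_0}{\partial x}\Big|_{t = t_f}

% What remains, and the stationarity condition for arbitrary \delta u:
\delta \bar{J}_0 = \int_0^{t_f} \frac{\partial H}{\partial u}\,\delta u\,dt, \qquad
\frac{\partial H}{\partial u} = 0
```

The costate is chosen precisely so that the variation of the state, which is hard to compute, never has to be evaluated; only the sensitivity of the Hamiltonian to the control is left.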

on u exist; otherwise, Pontryagin’s maximum principle, to be discussed later, will be applied. Equations (2.1), (2.2), (2.18), (2.19), and (2.21) are the well-known Euler-Lagrangian equations, whose results can be summarized in the following theorem.

Theorem 2.1 If u*(t) is a control that yields a local minimum for the cost functional (2.13), and x*(t) and λ*(t) are the corresponding state and costate, then it is necessary that Eqs. (2.22a)-(2.22e) hold.
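In their standard form, the necessary conditions referred to by the theorem can be written as below. The assignment of the labels (2.22a) to (2.22e) is an assumption, chosen to agree with the surrounding text, which describes (2.22a)-(2.22d) as 2n differential equations with their boundary conditions and (2.22e) as m algebraic equations.

```latex
\dot{x}^*(t) = \Big( \frac{\partial H}{\partial \lambda} \Big)^T = f\big(t, x^*, u^*\big)           % (2.22a) state equation
x^*(0) = x_0                                                                                      % (2.22b) initial condition
\dot{\lambda}^*(t) = -\Big( \frac{\partial H}{\partial x} \Big)^T                                 % (2.22c) costate equation
\lambda^*(t_f) = \Big( \frac{\partial \Phi_0}{\partial x} \Big)^T \Big|_{t = t_f}                 % (2.22d) terminal condition
\frac{\partial H}{\partial u}\big(t, x^*, u^*, \lambda^*\big) = 0, \quad \forall t \in [0, t_f]   % (2.22e) stationarity
```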

It should be noted that Eqs. (2.22a)-(2.22d) constitute 2n differential equations with n boundary conditions for x* specified at t = 0 and n boundary conditions for λ* specified at t = tf. This is referred to as a two-point boundary value problem. In principle, the dependence on u* can be removed by solving for u* as a function of x* and λ* from the m algebraic equations in Eq. (2.22e) via the implicit function theorem, provided that the Hessian $H_{uu} = \partial^2 H / \partial u^2$ is nonsingular.

2.4 Methods for Solving Optimal Control Problems

2.4.1 Indirect Methods

As mentioned in Section 1.2.1, the indirect method is based on the solution of the first-order necessary conditions for optimality obtained from Pontryagin’s maximum principle (Pontryagin et al., 1962), which has been modified and applied in various applications (see, e.g., Xu and Antsaklis, 2004; Chyba et al., 2003; Steindl and Troger, 2003). For problems without inequality constraints, the optimality conditions can be formulated as a set of differential-algebraic equations (DAEs). Obtaining a solution to the DAEs requires careful attention to the boundary conditions: the state variables usually have specified initial conditions, whereas the costate (adjoint) variables have specified final conditions. The result is a two-point boundary value problem (TPBVP) that is notoriously difficult to solve analytically and requires the use of iterative numerical techniques (Kirk, 1970). On the other hand, if the problem requires that active inequality constraints be handled, finding the correct switching structure together with suitable initial guesses for the state and costate variables is often very difficult, because the costate variables lack physical significance and prior knowledge of the control’s switching structure is needed. Many numerical techniques, including single shooting, invariant embedding, and multiple shooting, can be used to solve the TPBVP, but PMP does not deal well with nonlinear optimal control problems. Figure 2.1 shows a solution process based on indirect methods.
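As an illustration of single shooting on such a TPBVP, the sketch below uses a hypothetical linear-quadratic example that does not come from the dissertation: minimize one half of the integral of x^2 + u^2 over [0, 1] with x' = u and x(0) = 1. Under these assumptions the necessary conditions give u = -λ, x' = -λ, λ' = -x, and λ(1) = 0, so the unknown is the initial costate λ(0), which a secant iteration adjusts until the terminal residual vanishes.

```python
import math


def integrate(lam0, t_f=1.0, steps=200):
    """Integrate x' = -lam, lam' = -x forward from x(0) = 1, lam(0) = lam0 (RK4)."""
    def rhs(x, lam):
        return -lam, -x

    h = t_f / steps
    x, lam = 1.0, lam0
    for _ in range(steps):
        k1x, k1l = rhs(x, lam)
        k2x, k2l = rhs(x + h / 2 * k1x, lam + h / 2 * k1l)
        k3x, k3l = rhs(x + h / 2 * k2x, lam + h / 2 * k2l)
        k4x, k4l = rhs(x + h * k3x, lam + h * k3l)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        lam += h / 6 * (k1l + 2 * k2l + 2 * k3l + k4l)
    return x, lam


def shoot(guess_a=0.0, guess_b=1.0, tol=1e-10, max_iter=20):
    """Secant iteration on lam(0) so that the terminal residual lam(t_f) is zero."""
    a, b = guess_a, guess_b
    res_a, res_b = integrate(a)[1], integrate(b)[1]
    for _ in range(max_iter):
        if abs(res_b) < tol or res_b == res_a:
            break
        # Re-estimate the initial costate from the residuals at the end point.
        a, b, res_a = b, b - res_b * (b - a) / (res_b - res_a), res_b
        res_b = integrate(b)[1]
    return b


if __name__ == "__main__":
    # For this example the exact initial costate is tanh(1).
    print(shoot(), math.tanh(1.0))
```

This example is linear, so the secant step lands on the root almost immediately. For nonlinear problems the terminal residual can be extremely sensitive to the guessed initial costate, which is the difficulty described above.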

Pontryagin’s Maximum Principle

The Euler-Lagrangian equation for the unconstrained optimal control problem of Section 2.3 shows that the Hamiltonian function must necessarily be stationary with respect to the control, i.e., $\partial H / \partial u = 0$, at optimality. However, this optimality condition does not have to be satisfied if the control is constrained to a subset Us and the optimal control lies on its boundary. Here, Us is a compact subset of $\mathbb{R}^r$. Pontryagin’s maximum principle can then be described by the following theorem:

Theorem 2.2 Given the problem in which the cost functional (2.13) is to be minimized over U subject to the system equations (2.1) and (2.2), if u*(t) ∈ U is an optimal control, and x*(t) and λ*(t) are the corresponding state and costate, then it is necessary that x*(t) and λ*(t) satisfy the state and costate equations with their boundary conditions and that u*(t) minimize the Hamiltonian over Us for all t ∈ [0, tf].

Dynamic Programming

Dynamic programming (DP), based on Bellman’s principle of optimality (Bryson and Ho, 1975; Bellman and Dreyfus, 1962; Bellman, 1957), requires solution of the Hamilton-Jacobi-Bellman partial differential equation in a domain of the state space that contains the optimal solution. In dynamic programming, the optimal control is expressed as a state-variable feedback law in graphical or tabular form. Optimal control strategies must be determined by working backward from the final stages. In other words, this method operates in sweeps through the state set, performing a full backup operation on each state. Each backup updates the value of one state based on the values of all possible successor states.

The computational procedure for dynamic programming can be described briefly by the following steps.

Step 1: Approximating the continuous-time system using a discrete-time system.

In this step, the time interval [t0, tf] is divided into N equally spaced intervals of length Δt, and the performance index and state equations are converted into discrete form. Then, by applying the principle of optimality, the performance index can be written in recurrence form.

Step 2: Quantizing the admissible state and control values into a finite number of levels.

Step 3: Calculating and storing the minimum values of the performance index of each stage from final state to initial state. In each stage, every quantized control value is tried at each quantized state value to discover the corresponding state values of the next stage. Additionally, the value of the performance index from current stage to final stage is calculated and compared. The minimum performance index is then chosen and stored. If the corresponding state values of the next stage are not in the quantized grid points, interpolation is required.

Step 4: Showing the results.
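The four steps above can be sketched on a small example. The problem and every numerical choice here are assumptions made for illustration: the system x' = u, the cost integral of x^2 + u^2 over [0, 1], an Euler discretization with N = 10 stages, a 21-point state grid on [-1, 1], and nine control levels on [-1, 1]. The backward sweep uses the recurrence J_k(x) = min over u of [(x^2 + u^2) Δt + J_{k+1}(x + u Δt)], with linear interpolation when the next state falls between grid points.

```python
def solve_dp(n_stages=10, t_f=1.0, x_min=-1.0, x_max=1.0, n_x=21, u_levels=None):
    """Backward dynamic programming on a quantized state and control grid."""
    if u_levels is None:
        u_levels = [-1.0 + 0.25 * j for j in range(9)]   # Step 2: control levels
    dt = t_f / n_stages                                   # Step 1: time discretization
    dx = (x_max - x_min) / (n_x - 1)
    xs = [x_min + i * dx for i in range(n_x)]             # Step 2: state levels

    def interpolate(values, x):
        """Linear interpolation of a grid function at an off-grid state."""
        pos = min(max((x - x_min) / dx, 0.0), n_x - 1.0)
        i = min(int(pos), n_x - 2)
        w = pos - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    cost_to_go = [0.0] * n_x          # no terminal cost in this example
    policy = []
    for _ in range(n_stages):         # Step 3: sweep backward from the final stage
        new_cost, stage_policy = [], []
        for x in xs:
            best_cost, best_u = float("inf"), None
            for u in u_levels:
                x_next = x + u * dt   # Euler form of the state equation
                if x_next < x_min - 1e-12 or x_next > x_max + 1e-12:
                    continue          # control would leave the admissible grid
                total = (x * x + u * u) * dt + interpolate(cost_to_go, x_next)
                if total < best_cost:
                    best_cost, best_u = total, u
            new_cost.append(best_cost)
            stage_policy.append(best_u)
        cost_to_go = new_cost
        policy.insert(0, stage_policy)
    return xs, cost_to_go, policy     # Step 4: tabulated cost and feedback law


if __name__ == "__main__":
    xs, cost0, policy = solve_dp()
    print("cost from x = 1:", round(cost0[-1], 4))
    print("first control at x = 1:", policy[0][-1])
```

The returned `policy` table is the tabular state feedback mentioned earlier: for each stage and each quantized state it stores the best quantized control. The nested loops over states and controls also show where the storage and computation grow as state variables are added.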

2.4.2 Direct Methods

Direct methods try to solve the dynamic optimization problem directly without explicitly solving the necessary conditions. Usually, these methods are based on an iterative procedure that generates approximations to the optimal solution of the dynamic optimization problem within each iteration step. For instance, the SQP method uses quadratic subproblems to approximate a general nonlinear programming problem locally.
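To make the role of the quadratic subproblem concrete, a generic form is sketched below. The symbols are assumptions introduced for illustration and are not taken from the dissertation: z collects the NLP variables, F is the objective, g and h are the equality and inequality constraints, and B_k is an approximation of the Hessian of the Lagrangian at the iterate z_k.

```latex
% Quadratic subproblem solved at iterate z_k for the search direction d:
\min_{d} \;\; \nabla F(z_k)^T d + \tfrac{1}{2}\, d^T B_k\, d
\quad \text{subject to} \quad
g(z_k) + \nabla g(z_k)^T d = 0, \qquad
h(z_k) + \nabla h(z_k)^T d \le 0

% The iterate is then updated along d_k with a step length \alpha_k:
z_{k+1} = z_k + \alpha_k\, d_k
```

The objective is approximated to second order and the constraints to first order, so each subproblem is cheap to solve, and the sequence of subproblem solutions supplies the iterates whose convergence is discussed next.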

As mentioned in Section 2.1, most direct methods that apply NLP solvers can be classified into simultaneous and sequential strategies. The important question for these numerical direct methods is whether these iterative approximate algorithms converge to a solution of the original problem or not. A solution process based on such methods is shown in Figure 2.2, and their details will be introduced in Chapter 3.

2.5 Summary

The primary objective of this chapter has been to survey methods for solving optimal control problems and to provide formulations of various types of optimal control problems. The first-order necessary condition (Euler-Lagrangian equation) has also been briefly introduced to provide the theoretical foundation for Pontryagin’s maximum principle. In addition, the chapter has described two typical approaches to solving optimal control problems, indirect and direct, whose advantages and drawbacks are listed in Table 2.1. Understanding the advantages of and difficulties with these methods will help engineers apply them to problem solving.

As regards applicability, dynamic programming (DP) is sometimes thought to be limited because of “the curse of dimensionality” (Bellman, 1957), i.e., the fact that the number of states often grows exponentially with the number of state variables. In reality, even though large state sets do create difficulties, these are inherent difficulties of the problem, not of DP as a solution method. In fact, the DP method can be used with today’s computers to solve optimal control problems with millions of states. In particular, dynamic programming can deal with multistage optimal control problems that are difficult to solve using other methods.

Nevertheless, even though dynamic programming can be used to solve optimal control problems in nonlinear time-variant systems, using it to deal with time-optimal trajectory planning is difficult in practice because it relies on the exact dynamic models of the system. Yet, unfortunately, the time-optimal control problem is a very common application of the optimal control problem.

In contrast, Pontryagin’s maximum principle, which provides the analytical foundation for this study, can deal with various types of optimal control problem. However, in any such control problem, PMP unfortunately leads to a nonlinear two-point boundary value problem that, as mentioned earlier, is notoriously difficult to solve analytically and requires the use of iterative numerical techniques (Kirk, 1970).

Furthermore, neither DP nor PMP can serve as a convenient and complete method for reformulating different control problems. Rather, engineers either have to derive the state equations, costate equations, and boundary conditions from PMP or have to reformulate the discrete form of the system equations and performance index by applying the DP algorithm. Engineers must then also implement numerical programs to solve the TPBVP using PMP or execute recurrence equations using DP. For engineers inexperienced in optimal control theory or numerical techniques, carrying out these theoretical derivations and program implementations is difficult. Thus, a general-purpose solver is needed for various types of optimal control problems.

From a practical viewpoint, of the two types of NLP methods compared in Section 2.1 (simultaneous and sequential strategies), the sequential NLP methods are the better suited to developing a general-purpose problem solver.

Table 2.1 Comparison of the methods for solving optimal control problems.

Method: Dynamic programming method
Advantages:
1. Can obtain global optimal solutions.
2. Can deal with nonlinear constrained time-variant systems.
3. Is straightforward to program.
4. Suits multistage optimal control problems.
Disadvantages / Difficulties:
1. Hard to apply the algorithms for time-optimal trajectory planning in practice.
2. Inconvenient to reuse.

Method: Pontryagin’s maximum principle
Advantages:
1. Provides the analytical foundation.
2. Can deal with various types of optimal control problem.
Disadvantages / Difficulties:
1. Leads to a nonlinear TPBVP that is difficult to solve.
2. Inconvenient to reuse.

Method: Simultaneous NLP methods
Advantages:
1. Can deal with path constraint problems.
2. Can be implemented as a general OCP solver.
Disadvantages / Difficulties:
1. Computational efficiency is reduced for large-scale problems.
2. Needs extra effort to deal with the inconsistency problem between state equations and controls.
3. Needs a proper initial guess to obtain the optimal solution.

Method: Sequential NLP methods
Advantages:
1. Can deal with various types of nonlinear optimal control problem.
2. Easy to implement as a general OCP solver.
3. Many well-developed numerical schemes can be applied to solve initial value problems.
4. Higher computational efficiency for solving large-scale problems.
Disadvantages / Difficulties:
1. Needs a proper initial guess to obtain the optimal solution.
2. Path constraints for the states may not be satisfied between grid points.

[Flowchart: Dynamic Optimization Problem → Necessary Conditions → Complementarity Problem → Algorithm → Candidate → Check for optimality (sufficient conditions)]

Figure 2.1 Solution process based on indirect methods.

[Flowchart: Dynamic Optimization Problem → Iterative / approximative Algorithm (SQP) → Solution → Check for convergence or optimality (sufficient / necessary conditions)]

Figure 2.2 Solution process based on direct methods.

CHAPTER 3

COMPUTATIONAL METHODS AND NUMERICAL PRELIMINARIES FOR SOLVING OCP

3.1 Introduction

The rapid advancements in modern computers have brought about a revolution in the solutions to many physical and engineering problems, including optimal control problems. However, most real-world problems are becoming too complex to allow analytical solution; thus, computational methods must inevitably be used in solving them. As a result, computational methodology has attracted the interest of many engineers and mathematicians, and over the last two decades, many state-of-the-art computational methods for optimal control theory, including collocation transcription and the AOCP method, have been developed (see, e.g., Betts, 1998 and 2001; Hu et al., 2002; Jaddu and Shimemura, 1999; Lin, 1992; Pytlak, 1999).

Some earlier computational methods for solving optimal control problems were based on the indirect approach that assumes the direct solution of a set of necessary optimality conditions resulting from Pontryagin’s maximum principle. The adjoint (co-state) equations are combined with the original state equations to form a TPBVP. This problem may be efficiently solved using the shooting method discussed earlier, which guesses the unknown initial values of the adjoint variables, integrates both system and adjoint equations forward, and then reestimates the initial guesses from residuals at the end point (Bulirsch, 1971; Lastman, 1978). Nevertheless, because of difficulties arising from the sensitivity and instability of the solutions to the initial guesses, Bulirsch and his coworkers (1971, 1980) introduced multiple shooting algorithms to improve convergence and stability. Multiple shooting refers to the breaking up of a trajectory into subintervals, on each of which an initial-value problem is defined. The solutions are then adjusted in successive iterations until
