CHAPTER 2 METHODS FOR SOLVING OPTIMAL CONTROL PROBLEMS
2.5 Summary
The primary objective of this chapter has been to survey methods of the optimal control problems and provide formulations of various types of optimal control problems. The first-order necessary condition (Euler-Lagrangian equation) has also been briefly introduced to provide the theoretical foundation for Pontryagin’s maximum principle. In addition, the chapter has described two typical methods for solving optimal control problems – indirect and direct approaches – whose advantages and drawbacks are listed in Table 2.1. Understanding the advantages of and difficulties with these methods will help engineers apply them to problem solving.
As regards applicability, dynamic programming (DP) is sometimes thought to be limited because of “the curse of dimensionality” (Bellman, 1957), i.e., the fact that the number of states often grows exponentially with the number of state variables. In reality, even though large state sets do create difficulties, these are the inherent difficulties of the problem not of DP as a solution method. In fact, the DP method can be used with today’s computers to solve optimal control problems with millions of states. In particular, dynamic programming can deal with multistage optimal control problems that are difficult to solve using other methods.
Nevertheless, even though dynamic programming can be used to solve optimal control problems in nonlinear time-variant systems, using it to deal with time-optimal trajectory planning is difficult in practice because it relies on the exact dynamic models of the system.
Yet, unfortunately, the time-optimal control problem is a very common application of the optimal control problem.
In contrast, Pontryagin’s maximum principle, which provides the analytical foundation for this study, can deal with various types of optimal control problem. However, in any such control problem, PMP unfortunately leads to a nonlinear two-point boundary value problem
that, as earlier mentioned, is notoriously difficult to solve analytically and requires the use of iterative numerical techniques (Kirk, 1970).
Furthermore, neither DP nor PMP can serve as a convenient and complete method for reformulating different control problems. Rather, engineers either have to derive the state equations, costate equations, and boundary conditions from PMP or have to reformulate the discrete form of the system equations and performance index by applying the DP algorithm.
Engineers must then also implement numerical programs to solve the TPBVP using PMP or execute recurrence equations using DP. For engineers inexperienced in optimal control theory or numerical techniques, carrying out these theoretical derivations and program implementations is difficult. Thus, a general-purpose solver is needed for various types of optimal control problems.
From a practical viewpoint, of the two types of NLP methods compared in Section 2.1 (simultaneous and sequential strategies), the sequential NLP methods are the best for developing a general-purpose problem solver.
Table 2.1 Comparison of the methods for solving optimal control problems.
Method Advantages Disadvantages / Difficulties
Dynamic programming
method
1. Can obtain global optimal solutions.
2. Can deal with nonlinear constrained time-variant systems.
4. Suits multistage optimal control problems.
3. Is straightforward to program.
1. Hard to apply the algorithms for time-optimal trajectory planning in practice.
2. Inconvenient to reuse.
Pontryagin’s minimum
principle
1. Provides the analytical foundation.
2. Can deal with various types of optimal control problem.
1. Leads to a nonlinear TPBVP that is difficult to solve.
2. Inconvenient to reuse.
Simultaneous NLP methods
1. Can deal with path constraint problems.
2. Can be implemented as a general OCP solver.
1. The computational efficiency is slowed for large-scale problems.
2. Needs extra efforts to deal with inconsistency problem between state equations and controls.
3. Needs a proper initial guess to obtain the optimal solution.
Sequential NLP methods
1. Can deal with various types of nonlinear optimal control problem.
2. Easy to implement as a general OCP solver.
3. Many well-developed numerical schemes can be applied to solve initial value problems.
4. Higher computational efficiency for solving large-scale problems.
1. Needs a proper initial guess to obtain the optimal solution.
2. Path constraints for the states may not be satisfied between grid points.
Dynamic Optimization Problem
Necessary Conditions
Complementarity Problem
Algorithm
Candidate
Check for optimality (sufficient conditions)
Figure 2.1 Solution process based on indirect methods.
Dynamic Optimization Problem
Iterative / approximative Algorithm (SQP)
Solution
Check for convergence or optimality (sufficient / necessary conditions)
Figure 2.2 Solution process based on direct methods.
CHAPTER 3
COMPUTATIONAL METHODS AND NUMERICAL PRELIMINARIES FOR SOLVING OCP
3.1 Introduction
The rapid advancements in modern computers have brought about a revolution in the solutions to many physical and engineering problems, including optimal control problems.
However, most real-world problems are becoming too complex to allow analytical solution;
thus, computational methods must inevitably be used in solving them. As a result, computational methodology has attracted the interest of many engineers and mathematicians, and over the last two decades, many state-of-the-art computational methods for optimal control theory – including collocation transcription and the AOCP method – have been developed (see, e.g., Betts, 1998 and 2001; Hu et al., 2002; Jaddu and Shimemura, 1999; Lin, 1992; Pytlak, 1999).
Some earlier computational methods for solving optimal control problems were based on the indirect approach that assumes the direct solution of a set of necessary optimality conditions resulting from Pontryagin’s maximum principle. The adjoint (co-state) equations are combined with the original state equations to form a TPBVP. This problem may be efficiently solved using the shooting method discussed earlier, which guesses the unknown initial values of the adjoint variables, integrates both system and adjoint equations forward, and then reestimates the initial guesses from residuals at the end point (Bulirsch, 1971;
Lastman, 1978). Nevertheless, because of difficulties arising from the sensitivity and instability of the solutions to the initial guesses, Bulirsch and his coworkers (1971, 1980) introduced multiple shooting algorithms to improve convergence and stability. Multiple
shooting refers to the breaking up of a trajectory into subintervals, on each of which an
initial-value problem is defined. The solutions are then adjusted in successive iterations until the boundary conditions and continuity properties at the ends of the subintervals are satisfied.Multiple shooting is much more successful than its ancestor, the simple shooting method, in which a single initial-value problem is defined. However, even though especially good convergence properties are attributed to multiple-shooting algorithms, the necessity to define the proper control structure and initialize the adjoint variables within a sufficient vicinity of the optimal values still remains a serious limitation.
To avoid the drawbacks of shooting techniques, the direct methods have been studied extensively during the last two decades (Betts, 1993; Barclay, 1997; Gill et al., 2002). One of the most widely used methods for solving optimal control problems is the direct method whose basis is the transformation of the optimal control problem into a NLP problem using either the discretization or parameterization technique (see, e.g., Goh and Teo, 1988; Xu and Antsaklis, 2004; Jaddu, 2002; Lee et al., 1999).
When the discretization technique is applied, the optimal control problem is converted into a nonlinear programming problem with a large number of unknown parameters and constraints (Betts, 1998). On the other hand, parameterizing the control variables (Goh and Teo, 1988; Teo et al., 1991) requires integration of the system equations. Moreover, the simultaneous parameterization of both the state variables and the control variables also results in a nonlinear programming problem with a large number of parameters and equality constraints.
As a prelude to discussing computational methods for solving optimal control problems, the following sections introduce some fundamental NLP concepts. Also introduced is one of the best and most frequently applied NLP methods for solving optimal control problems, sequential quadratic programming (see Barclay, et al., 1998; Betts, 2000; Gill et al., 2002;
Kraft, 1994; Stryk, 1993). Subsequently, the AOCP method, which uses the discretization technique to convert an OCP into a NLP problem, is proposed, and then a standard SQP algorithm is applied to solve it. Also discussed are the dynamic constraint treatments and
design sensitivity analysis used in AOCP.
3.2 Nonlinear Programming Problem
Mathematically, the general form of a constrained NLP problem can be expressed as follows:
minimize f(x)
subject to
g(x) ≤ 0 , x
T = (x1, …, xn)h(x) = 0
(3.1)
where f(x) is the objective function, and h(x) and g(x) are the equality and inequality constraint functions, respectively. It should be noted that in the inequality constraint functions
g(x), the simple bounds of the design variables (x
L ≤ x ≤ xU) are considered and classified.Because maximization problems can be converted to minimization ones by negating their objectives, only minimization problems are considered here, without loss of generality.
A general continuous constrained NLP problem is defined in Eq. (3.1) in which x is a vector of continuous variables. Over the past three decades, a variety of methods has been produced in a wide body of research to solve the general constrained continuous optimization problem (Betts, 2001; Michalewicz et al., 1996; Horst and Tuy, 1993; Floudas and Pardalos, 1992; Hansen, 1992). Based on different problem formulations, existing methods can be classified into three categories: penalty formulations, direct solutions, and Lagrangian methods. Figure 3.1 classifies these methods according to their formulations, and the details of these methods and their comparisons can be found in Wu (2000). Here, the SQP method based on the Lagrangian method and adopted as an NLP solver in the AOCP algorithm is introduced briefly.
In general, because Lagrangian methods work on equality constraints, inequality
constraints are first transformed into their equal equivalents before Lagrangian methods are applied. For example, an inequality constraint can be transformed into an equality constraint by adding a slack variable (Luenberger, 1984). Thus, a general continuous equality constrained optimization problem can be formulated as follows:
minimize f(x)
subject to
h(x) = [h
1(x), …, hm] T = 0(3.2)
where xT = (x1, …, xn) is a vector of the continuous variables. Both f(x) and h(x) are assumed to be continuous functions that are at least first-order differentiable. The augmented Lagrangian function in continuous space in Eq. (3.2) is then defined as
( ) ( ) ( )
1( )
2L
cx λ ,
≡ ∇xf x
+ ∇λ
T xh x
+2h x
(3.3)where
λ
is a vector of the Lagrange multipliers. Compared to the conventional Lagrangian function in continuous space defined as Lc(
x λ,)
≡ f( )
x +λ h xT( )
, the augmented Lagrangian function reduces the possibility of ill conditioning and is, therefore, more stable.Various continuous Lagrangian methods have been developed to find the (local) optimum, all based on first-order necessary conditions. To state these conditions, the concept of regular points must first be introduced.
Definition 3.1. A point x, which satisfies constraints h(x) = 0, is said to be a regular point
(Luenberger, 1984) if the gradient vectors ∇h1( )
x ,∇h2( )
x ,…,∇hm( )
x at point x are linearly independent.First-order necessary conditions for continuous constrained NLP problems.
Letting x be a (local) optimal solution of f(x) subject to constraints h(x) = 0, and assuming that x is a regular point, then there exists
λ \
∈ msuch that( )
T( )
0xf x
∇ x + ∇λ h x = (3.4)
Based on the definition of a Lagrangian function, the necessary conditions for x to be a constrained (local) optimal solution can be written as follows:
( )
To ensure that the equilibrium point is an optimal solution, second-order sufficient conditions are used to check that the solution is a strictly relative minimum subject to constraints (Luenberger, 1984). These second-order sufficient conditions require second-order derivatives, and the Hessian matrix of the Lagrangian function is needed to satisfy certain conditions (Luenberger, 1984) if the solution to Eq. (3.5) is to be a strictly (local) optimal solution. Here, The Hessian matrix of the Lagrangian can be defined as
( ) ( )
L x x xx
H
≡ ∇ ∇⎡⎣L x, λ
c ⎤⎦T = ∇L x, λ
c. Many searching methods based on first-order necessary conditions in continuous space have been developed for solving constrained optimization problems (Bertsekas, 1982; Luenberger, 1984), including the first-order method, Newton’s method, modified Newton’s methods, quasi-Newtonian methods, and sequential quadratic programming (Hribar, 1996; Boggs and Tolle, 1995). A major advantage of these methods is that solving the first-order conditions exactly matches the goal of locating a (local) optimal solution. Therefore, these algorithms are usually efficient for solving continuous constrained NLPs. One of most popular methods for solving constrained optimization is sequential quadratic programming, discussed briefly in the next section.
3.3 Sequential Quadratic Programming Method
SQP, one of most popular Lagrangian methods for solving constrained NLPs, is also widely applied to develop computational methods for solving optimal control problems (Betts, 2000; Buskens and Maurer, 2000; Volkwein, 2000; Barclay et al., 1997). SQP methods have
proven reliable and efficient for many practical constrained optimization problems. The method described here is implemented in MOST (Tseng et al., 1993) and is similar to the algorithm employed by SNOPT (Gill et al., 2002) software.
The SQP method is actually a generalization of Newton’s method (Luenberger, 1984) for unconstrained optimization in the sense that it obtains search directions from a sequence of quadratic programming (QP) subproblems. Each QP subproblem minimizes a quadratic model of a certain Lagrangian function subject to linearized constraints. In its simplest form, an SQP algorithm replaces f(x) in the Lagrangian function with a quadratic approximation and the weighted constraint functions
λ
Th(x) with their linear approximations:
( ) ( ) ( )
where d refers to the descent direction (or search direction) and HL to the Hessian matrix of the Lagrangian function. The descent direction d(k) of the kth iteration of SQP can be found by solving the following quadratic problem, assuming equality constraints only:
minimize
where xk is used to represented the values of the design variables of the kth iteration of SQP.
The local convergence property of SQP is well defined when (x,
λ
) satisfies the second-order sufficient conditions (Luenberger, 1984). That is, if point (x,λ
) is sufficiently close to an optimal solution (x*,λ
*), then the sequence generated using the descent direction d and the appropriate step size α will converge to x* at a second-order rate. The search step size α can be obtained by applying a line search method.The SQP method described here requires a more precise computation of the Hessian matrix, ∇2xx
L
c( x λ
k,
k)
, at each step. However, it is usually replaced with a BFGS approximation (Arora, 1989) BB)
k updated at each iteration. Using a BFGS formula allows the following simple update strategy to be defined:
( ) (
Once the descent direction has been determined, the step size must be calculated based on simultaneously decreasing the objective, as well as improving constraint satisfaction. To accomplish this goal, a suitable unconstrained function must be developed upon which to base the step size determination. Many unconstrained optimization methods, such as the golden section search method, can be found in the literature (see Arora, 1989) and applied to calculate the search step size,
α
.Nevertheless, although SQP methods are generally efficient, they often require that functions be differentiable and therefore cannot be applied directly to solving NLPs containing discrete variables. Thus, a modified SQP algorithm in cooperation with an enhanced branch-and-bound method is proposed here to solve discrete-valued NLP problems.
The details of this modified algorithm are introduced in Chapter 5 of this dissertation. The details of SQP method implementation can be found in the literature (e.g., Arora, 1989; Boggs and Tolle, 1995; Gill et al., 2002). Figure 3.2 presents a conceptual flowchart of the SQP method, whose algorithm is briefly described below.
Algorithm: Sequential Quadratic Programming
Step 1. Choose x0, Ns (maximum number of iteration)ε
i(for convergence and stopping), k = 1 (iteration counter).
Step 2. Find the descent direction d by solving the QP subproblem defined in Eqs. (3.6)
and Eq. (3.7).
Step 3. Check feasible and convergence criteria.
(a) Convergence for SQP:
IF (h
j = 0, j = 1, …, m) THENIF (KT condition is satisfied) THEN
Algorithm converged, Stop.(b) Stopping criteria:
Δx = xk+1 – xk
IF (Δx
TΔx ≤ε
i) THEN Stop. (design variable not changing)IF (k = N
s) THEN Stop. (maximum iteration reached)Continue
Step 4. Calculate the step size α.
Step 5. Update Hessian matrix HL by applying BFGS approximation BBk (Eq. 3.8).
Step 6. k = k +1; Go to Step 2.
3.4 Admissible Optimal Control Problem Method
The admissible optimal control problem method is a direct method that transcribes an optimal control problem into a NLP problem, a process shown in Figure 3.3. Whereas an NLP problem consists of a finite set of variables and constraints, an optimal control problem can involve continuous functions and be treated as an infinite-dimensional extension of an NLP problem. However, most practical methods for solving optimal control problems require Newton-based iterations with a finite set of variables and constraints. Therefore, a discretization technique is needed to convert the infinite-dimensional problem into a finite-dimensional approximation. On the other hand, a general optimal control problem may include some dynamic constraints that make the problem complex and difficult to solve. Thus, an efficient dynamic constraint treatment becomes more important for developing a general
optimal control solver. Presented below are two common design sensitivity analysis (DSA) methods used to determine the effect of a change in the current design on the cost functional and the constraint functions.
3.4.1 Discretization and Parameterization Techniques
Various discretization and parameterization techniques for state and control variables allow for an optimal solution for the OCP via nonlinear programming. Jaddu and Shimemura (1999) used quasi-linearization and state parameterization using Chebyshev polynomials to solve constrained nonlinear optimal control problems. Hu et al. (2002) applied an enhanced scheme based on the direct collocation and nonlinear programming problem (DCNLP) to transform the system dynamics into constraints for nonlinear programming. Nevertheless, although these simultaneous discretization methods are applied to many numerical examples and solve them successfully, using a full discretization strategy sharply increases the number of design variables. Therefore, a different discretization technique, in conjunction with SQP, is implemented here, one used to solve various types of optimal control problems (Betts, 2000;
Barclay et al., 1997). This technique is based on the sequential method in which only the control variables u are approximated by some interpolation function in each time interval. The approximate trajectories x are generated by solving the initial-value problem defined in Eqs.
(2.1) and (2.2). This method, first proposed by Sage and White (1977), is termed the AOCP method.
Control function parameterization. Parameterization of the control functions can be carried
out using the following process. First, the entire time intervalt
∈ ⎣⎡t t
0, f ⎤⎦ is subdivided intoN general unequal time intervals and the time grids are designated as
t
0 = 0, t1, t2,…, tN-1, tN = tf (3.9)The time intervals between the grid points are defined in a vector form as
T = [ T
1, T2,…, TN ]T (3.10)If at each time grid, control u(l) is treated as a set of m unknown parameters, then interval
0, f
⎡⎣
t t
⎤⎦ will have an Nm unknown parameter and can be represented asS
D =[ u(1), u(2),…, u(N) ]T= [ u1(t0),…, um(t0), u1(t1),…, um(t1),…, u1(tN-1),…, um(tN-1)]T
= [ S1,…, Sm, Sm+1,…, S2m, S2m+1,…,S(N-1)m+1,…, SmN]T
(3.11)
where u( )l ∈\mis the vector of the control variables for the lth time interval [tl, tl+1]. This formulation can be treated as a subset of the design variable vector, resulting in a total number of k+N+Nm design variables:
P = [ b
1,…,b
π, T1,…, TN,…,S1,…, SN+1, SN+2,…,SmN]T (3.12)Finding an accurate solution for any practical application requires one set of fine time grid intervals. However, discretizing control functions with a fine time interval increases the number of design variables considerably, especially for a practical optimal control problem with a large number of control variables. Hence, certain parameterization techniques have been developed to overcome this problem. If parameterization techniques are applied, control function u(t) may be represented by an interpolation function, and the coefficients of the interpolation function may be considered design variables instead of Ti and Si in Eq. (3.12).
For example, if the time grid is not considered a design variable, the interpolation function based on a third-order polynomial u (1) =
ζ
1( )l +ζ
2( )l × t +ζ
3( )l × t2 +ζ
4( )l × t3 can be used to represent the first component of the control forces in u(t) andζ
1( )l ,ζ
2( )l ,ζ
3( )l , andζ
4( )l can be treated as a subset of the design variables. Therefore, the control functions can be approximated by interpolation functions I(t), whereI
( ) :t
⎡⎣t t
0, f⎤ ∈⎦ \m. The continuous timeoptimal control problem with interpolation functions I(t) is thus reformulated as an NLP problem without using approximate discretization. As noted earlier, the control functions, u(t), are treated as a subset of the design variable vector P. Similarly, the terminal time tf can be treated as one of the design variables in time interval vector T, e.g., . The admissible control functions are represented in the form u(t) = I(S, T, t), and the state variables are written in the form x(b, S, T, t) to emphasize that they are functions of the design variable vector P. Here, S represents the parameter vector of interpolation function. As a result, the admissible optimal control problem in an NLP formulation can be rewritten as follows. and the system equation is represented as
( )
( )l =
t
, , ( , , , ), ( , , ) ,t t
x f b x b S T I S T t ∈ [ t t
l,
l+1]
(3.16) with the initial condition( ) (0)
where l is used to indicate the index of the time grid, and the original optimal control problem is divided into N subproblems. These NLP subproblems are solved sequentially from time
grid t0 to terminal time tf, and the solutions of the state variables obtained in each time interval are then applied as initial values in the next time interval.
With AOCP, the system equation in Eq. (3.16), together with the initial conditions in Eq.
(3.17), forms an initial-value problem (IVP), and the corresponding values for the state variables can be calculated by solving the problem using the design variable values in each iteration. For integrating the state equations in Eqs. (3.16) and (3.17), some good first-order differential equation methods are available that have a variable step size and error control, e.g., Adam’s method, the Runge-Kutta-Fehlberg method, and the backward difference formulae (BDF) (Press et al., 1992). Those solvers can give accurate results with user-defined error control. The state trajectories are internally approximated using interpolation functions in the differential equation solvers. Values of the state and control variables between the grid points can be also obtained with different types of interpolation schemes.
3.4.2 Dynamic Constraint Treatments
The continuum dynamic constraints in Eq. (2.5) must be satisfied over the entire time
The continuum dynamic constraints in Eq. (2.5) must be satisfied over the entire time