Solving the Multi-objective Dynamic Optimization Problems with Fuzzy Satisfying Method

Cheng-Liang Chen*,†, Chia-Yuan Chang and Daim-Yuang Sun

Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan, R.O.C.

SUMMARY

This article proposes a novel algorithm integrating iterative dynamic programming and fuzzy aggregation to solve multi-objective optimal control problems. First, the optimal control policy for each individual objective is determined in turn. A payoff table is then established by applying each optimal policy in series to evaluate all of the objectives. Considering the imprecise nature of the decision-maker's judgment, these multiple objectives are viewed as fuzzy variables. Simple monotonically increasing or decreasing membership functions are then defined to express the degrees of satisfaction for these linguistic objective functions. The optimal control policy is finally found by maximizing the aggregated fuzzy decision value. The proposed method is rather easy to implement. Two chemical processes, Nylon 6 batch polymerization and Penicillin G fed-batch fermentation, are used to demonstrate that the method has significant potential for solving real industrial problems. Copyright © 2003 John Wiley & Sons, Ltd.

KEY WORDS: optimal control; multi-objective optimization; fuzzy set; iterative dynamic programming

1. INTRODUCTION

In recent years, a considerable number of studies have been made on the optimization of dynamic systems with a single objective [1–7]. However, multiple aims are usually desired in practice. For instance, when operating a batch reactor, maintaining undesirable byproducts at the lowest possible levels and attaining the desired fractional conversion in the shortest amount of time are both important. Furthermore, the operators may simultaneously consider other important factors such as economic efficiency, safety, reliability, or the impact on the environment. All these objectives are usually non-commensurable. Operators thus need a multi-objective decision-making technique to help them look for a satisfying solution among those conflicting objectives. Recently, the application of multi-objective approaches to dynamic optimization problems has been addressed by many researchers [8].

Optimization for a multi-objective problem is a procedure of looking for a compromise policy. The result, called a Pareto optimal or non-inferior solution, consists of an infinite number of alternatives. There is a large variety of methods for treating the multi-objective optimization problem. Those methods can be classified in many ways according to different criteria [9, 10]. For example, Cohon [11] categorized methods into two relatively distinct subsets: generating methods and preference-based methods. The generating methods produce a set of Pareto optima, and the decision maker (DM) then selects one of them on the basis of subjective value judgment. Among them, the weighting-sum method and the ε-constraint method are well known. If the algorithm cannot converge to a suitable solution, or the DM does not agree with the result, the DM can adjust the related parameters used in the algorithm, such as the weighting factors in the weighting-sum method. The computation can be repeated until a satisfactory solution is obtained. The preference-based methods, on the other hand, incorporate the DM's preference as the solution process goes on, and the solution that best fulfills the DM's preference is selected. Thus, all these multi-objective optimization methods for finding a Pareto optimal solution are filled with subjective and fuzzy properties [10].

Received 22 January 2001

* Correspondence to: Professor Cheng-Liang Chen, Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan, R.O.C.

† E-mail: [email protected]

In order to overcome the difficulty of describing a fuzzy attribute, Zadeh [12] proposed the fuzzy set concept. By using multi-valued logic to replace the traditional Boolean logic, people can quantitatively elucidate unclear information or knowledge. Afterward, Bellman and Zadeh [13] further extended the fuzzy concept to decision making under a fuzzy environment. Tanaka et al. [14] brought in the concept of fuzzy mathematical programming and proved that a fuzzy mathematical programming problem can be reduced to a conventional non-linear programming problem. Zimmermann [15] introduced fuzzy set theory into conventional linear programming problems, considering linear programming problems with a fuzzy goal and some fuzzy constraints. Following the fuzzy decision, together with linear membership functions, he proved that there exists an equivalent linear programming problem. Recently, Sakawa et al. [16] proposed the fuzzy satisfying method to find solutions for multi-objective linear problems by applying the payoff table. Wang et al. [17] proposed a fuzzy decision-making procedure to determine the feed profile of a fermentation process for fuel ethanol production, using the fuzzy min–max method. In this paper, we attempt to extend fuzzy inference to the solution of dynamic optimization problems with multiple objectives. The augmented min–max approach, proposed by Sakawa et al. [16], is applied instead of the conventional min–max approach used by Wang and Shieh [18], since the uniqueness of the optimal solution is not guaranteed in non-linear systems. By mapping each objective value into a normalized domain referenced from the payoff table and further aggregating these normalized values, the vector objective problem can be grouped into a single objective problem. Then, the iterative dynamic programming (IDP), developed by Luus and his co-workers [4], is utilized as the platform to determine the solution for such problems. By using appropriate mapping functions and aggregates, the solution found by the proposed algorithm can be proved to be (local) Pareto optimal. Notably, by applying the fuzzy aggregation and payoff table approach, one can obtain a single Pareto optimal solution that best satisfies the decision maker with the least subjective knowledge. Furthermore, the grouped single objective dynamic optimization problem can also be solved by using other searching methods, such as the integrated controlled random search (ICRS) [7, 19].

In the rest of this article, the formulation of the problem is set out in Section 2. The procedure for grouping the vector objectives into a scalar one using the fuzzy set concept is given in Section 3. Some related mathematical properties required to guarantee the optimality of the solution are given in Section 4. In Section 5, a review and modification of IDP is presented; the limitation of the proposed method to local Pareto optima is also discussed there. Two numerical examples are supplied in Section 6, demonstrating the usefulness of the proposed method. Some conclusions and discussions are made in Section 7.

2. PROBLEM STATEMENT

Consider the following multi-objective dynamic optimization problem (MODOP) with a specified final time $t_f$:

$$\min_{u(t)\in\Omega} J_1(x(t_f)) \quad\text{and}\quad \min_{u(t)\in\Omega} J_2(x(t_f)) \quad\cdots\quad\text{and}\quad \min_{u(t)\in\Omega} J_I(x(t_f)) \tag{1}$$

or, in a more compact form,

$$\min_{u(t)\in\Omega} J(x(t_f)) = [J_1(x(t_f)), \ldots, J_I(x(t_f))]^T \tag{2}$$

Here, $x(t)$ denotes the $n \times 1$ state vector with initial condition $x(0) = x_0$; $u(t)$ is the $m \times 1$ control vector; $\Omega$ is the feasible region in the control action space $u(t)$ that satisfies some constraints, $\Omega = \{u(t) \mid h(x(t),u(t)) = \dot{x}(t) - f(x(t),u(t)) = 0,\ x(0) = x_0,\ g(x(t),u(t)) \le 0,\ \underline{u} \le u(t) \le \bar{u}\}$; $h(\cdot) = [h_1(\cdot), \ldots, h_L(\cdot)]^T$ and $g(\cdot) = [g_1(\cdot), \ldots, g_K(\cdot)]^T$ are equality and inequality constraints, respectively; $\underline{u}$ and $\bar{u}$ are the lower and upper bounds for the control vector; and $J(x(t_f))$ denotes an $I \times 1$ objective function vector.

The multi-objective dynamic optimization problem is to determine the optimal control policy $u^*(t)$ over $t \in [0, t_f]$ which brings the state vector $x(t)$ from the initial condition $x_0$ to the final state $x(t_f)$ so that the objectives in $J(x(t_f))$, regardless of their commensurability, are minimized under the given constraints. These objective functions, however, usually conflict with one another in practice. It is thus impossible to attain their individual optima, the $J_i^*(x(t_f))$'s, simultaneously. The optimization of one objective implies the sacrifice of other targets. Therefore, the decision maker must make some compromise among these goals. In contrast to the optimality used in single objective optimization problems, Pareto optimality characterizes the solutions in a multi-objective optimization problem [9–11].

Definition 1

$u^*(t) \in \Omega$ is said to be Pareto optimal for Equation (2) if and only if there exists no $u(t)$ such that $J_i(u(t), x(t_f)) \le J_i(u^*(t), x^*(t_f))$ for all $i \in \{1, \ldots, I\}$ and $J_j(u(t), x(t_f)) < J_j(u^*(t), x^*(t_f))$ for some $j \in \{1, \ldots, I\}$.
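The dominance test in Definition 1 can be illustrated on a finite set of candidate objective vectors. The following is only a sketch: the function names and the sample vectors are hypothetical, and a finite list stands in for the infinite feasible set $\Omega$.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a dominates b when every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (J1, J2), for illustration only.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(candidates)  # (3.0, 4.0) is dominated by (2.0, 3.0)
```

Three of the four candidates remain non-dominated, which reflects why a further criterion is needed to pick a single policy.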

From the above definition, the number of solutions satisfying Pareto optimality in a multi-objective optimization problem can be infinite. It is difficult for the DM to weigh a set of incompatible objectives, such as economic efficiency, safety, reliability, or environmental impact, without knowledge of the possible level of attainment for those objectives. Finding the Pareto optimal solution that best satisfies the decision maker is thus a fuzzy problem.

3. FUZZY DECISION MAKING IN MODOP

In this work, we extend the fuzzy set theory of Zadeh [12] to deal with the multi-objective dynamic optimization problem. Considering the uncertain nature of human thinking, it is quite natural to assume that the DM has multiple fuzzy goals, $J_i$, $i = 1, \ldots, I$, where an interval $[J_i^\ell, J_i^u]$ exists for each fuzzy objective $J_i$. The $i$th minimized objective is thoroughly satisfied when the objective value $J_i(x(t_f))$ is less than $J_i^\ell$, and it is unacceptable when $J_i(x(t_f)) > J_i^u$. For a $J_i(x(t_f))$ value between $J_i^\ell$ and $J_i^u$, the extent of satisfaction of the DM decreases with an increase in its value. A strictly monotonic decreasing membership function, $\mu_{J_i}(J_i(x(t_f))) \in [0, 1]$, can be used to characterize such a transition from the objective value, $J_i(x(t_f))$, to the degree-of-satisfaction, $\mu_{J_i}$ [7]:

$$\mu_{J_i}(J_i(x(t_f))) = \begin{cases} 1 & \text{for } J_i(x(t_f)) < J_i^\ell \\ F_{J_i}(J_i(x(t_f)); J_i^\ell, J_i^u) & \text{for } J_i^\ell \le J_i(x(t_f)) \le J_i^u \\ 0 & \text{for } J_i(x(t_f)) > J_i^u \end{cases} \tag{3}$$

Here, a membership value of 1 denotes absolute satisfaction and 0 means unacceptable. The original MODOP is now equivalent to looking for a suitable control policy that provides the maximal degree-of-satisfaction for the multiple fuzzy objectives:

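$$\max_{u(t)\in\Omega} \mu_{J_1}(J_1(x(t_f))) \quad\text{and}\quad \max_{u(t)\in\Omega} \mu_{J_2}(J_2(x(t_f))) \quad\cdots\quad\text{and}\quad \max_{u(t)\in\Omega} \mu_{J_I}(J_I(x(t_f))) \tag{4}$$

or, in a more compact form,

$$\max_{u(t)\in\Omega} \mu(x(t_f)) = [\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))]^T \tag{5}$$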
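The mapping of Equation (3) from an objective value to a degree of satisfaction can be sketched as below. The linear transition used for $F_{J_i}$ is an assumption made here for concreteness; Equation (3) only requires a strictly decreasing function between $J_i^\ell$ and $J_i^u$. The function and argument names are hypothetical.

```python
def membership(J: float, J_lo: float, J_hi: float) -> float:
    """Degree of satisfaction for a minimized objective, following Equation (3).

    J_lo and J_hi play the roles of J_i^l and J_i^u. The linear transition
    between them is an assumed choice of F, not the only admissible one.
    """
    if J < J_lo:
        return 1.0  # thoroughly satisfied
    if J > J_hi:
        return 0.0  # unacceptable
    return (J_hi - J) / (J_hi - J_lo)


# Hypothetical bounds: fully satisfied below 1.0, unacceptable above 5.0.
mu_example = membership(2.0, 1.0, 5.0)
```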

Under incompatible objective circumstances, a DM must make a compromise decision that provides a maximum degree-of-satisfaction for all of these conflicting objectives. The new optimization problem, Equation (5), can be interpreted as the synthetic notation of a conjunction statement (maximize jointly all objectives). The result of this aggregation, $D$, can be viewed as a fuzzy intersection of all fuzzy objectives, $J_i$, $i = 1, \ldots, I$, and is still a fuzzy set.

The final degree-of-satisfaction resulting from a given control action, $\mu_D(x(t_f))$, over $t \in [0, t_f]$ can be determined by aggregating the degrees of satisfaction for all objectives, $\mu_{J_i}(J_i(x(t_f)))$, via a specific t-norm, $T$:

$$\mu_D(x(t_f)) = T\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{7}$$

The fundamental properties of a fuzzy set and the related operators can be found in Reference [20]. Once the firing level for each control policy is determined by the above procedure, the best control policy $u^*(t)$ with the maximal firing level, $\mu_D^*(u(t))$, over $t \in [0, t_f]$ can be selected:

$$\mu_D^*(x(t_f), u^*(t)) = \max_{u(t)\in\Omega} \mu_D(x(t_f)) = \max_{u(t)\in\Omega} T\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{8}$$

Two famous t-norms are discussed here.

1. $T$ = algebraic product

$$\mu_D^*(x(t_f), u^*(t)) = \max_{u(t)\in\Omega} \prod_{i=1}^{I} \mu_{J_i}(J_i(x(t_f))) \tag{9}$$

2. $T$ = Zadeh-minimum

$$\mu_D^*(x(t_f), u^*(t)) = \max_{u(t)\in\Omega} \min\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{10}$$

Notably, an equivalent expression when the Zadeh-minimum is used as the t-norm is

$$\max_{u(t)\in\Omega^+} \alpha \tag{11}$$

where $\Omega^+ = \Omega \cap \{\mu_{J_i}(J_i(x(t_f))) \ge \alpha,\ i = 1, \ldots, I\}$.
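To make the difference between Equations (9) and (10) concrete, the sketch below aggregates membership vectors with both t-norms and selects the policy with the highest firing level. The three policies and their membership values are hypothetical, and a small finite set of candidates stands in for the search over $\Omega$.

```python
import math
from typing import Callable, Dict, Sequence


def t_product(mu: Sequence[float]) -> float:
    """Algebraic-product t-norm, as in Equation (9)."""
    return math.prod(mu)


def t_min(mu: Sequence[float]) -> float:
    """Zadeh-minimum t-norm, as in Equation (10)."""
    return min(mu)


def best_policy(policies: Dict[str, Sequence[float]],
                t_norm: Callable[[Sequence[float]], float]) -> str:
    """Return the name of the policy with the largest aggregated firing level."""
    return max(policies, key=lambda name: t_norm(policies[name]))


# Hypothetical membership vectors (mu_J1, mu_J2, mu_J3) for three policies.
policies = {
    "A": (0.61, 0.61, 0.61),
    "B": (0.82, 0.46, 0.80),
    "C": (0.90, 0.20, 0.95),
}
choice_min = best_policy(policies, t_min)
choice_product = best_policy(policies, t_product)
```

With these numbers the minimum favours the evenly balanced policy, while the product accepts a lower membership on one objective in exchange for higher values on the others.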

4. RELATED MATHEMATICAL THEOREMS

The original problem, Equation (2), has been recast as another multi-objective problem, Equation (5), which is in turn converted by the t-norm into a single objective problem, Equation (8). It is necessary to ensure that the solution from this procedure is Pareto optimal for Equation (2). Since the original problem may be non-convex, only local Pareto optimality is guaranteed if a local optimizer is applied [9]. Should the global optimum be required for non-convex non-linear optimal control problems, one can refer to recent work by Esposito and Floudas [21]. However, the calculations involved can be difficult to carry out. Since Equation (5) is also a multi-objective problem, it makes sense to give the following definition [10]:

Definition 2

$u^* \in \Omega$ is said to be (local) M-Pareto optimal for Equation (2) if and only if it is (local) Pareto for Equation (5), that is, there exists no $u$ such that $\mu_{J_i}(u) \ge \mu_{J_i}(u^*)\ \forall i$, with strict inequality for some $i$, where M refers to membership.

Note that the set of (local) Pareto optimal solutions is a subset of the set of (local) M-Pareto optimal solutions. Using the above definitions, it is clear that only (local) M-Pareto solutions can be looked for when Equation (2) is solved through Equation (5) and a local optimizer is used. Thus, it is quite natural to ask:

1. How can we guarantee a (local) M-Pareto solution for Equation (2) when the t-norm is used?

2. Under what conditions is a (local) M-Pareto solution a (local) Pareto solution?

To answer these queries, the following propositions [22] can be applied:

Proposition 1

Let $T$ be an arbitrary t-norm. If $u^*$ is the unique (local) optimal solution for Equation (8), then $u^*$ is (local) M-Pareto.

Proposition 2

If $u^*$ is a (local) optimal solution for Equation (8) such that $\mu_{J_i} \ne 0\ \forall i$, and $T$ is strictly monotonous, then $u^*$ is (local) M-Pareto for Equation (2).

Based on Propositions 1 and 2, the following corollaries arise:

Corollary 1

A (local) optimal solution for Equation (9), say $u^*$, is (local) M-Pareto for Equation (2) if either it is the only (local) solution or $\mu_{J_i}[u^*] \ne 0\ \forall i$, since the product is a strictly monotonous t-norm.

Corollary 2

A (local) optimal solution from Equation (10) or Equation (11) will be (local) M-Pareto when it is unique, because the Zadeh-min is not strictly monotonous.
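The uniqueness condition in Corollary 2 can be motivated by a two-objective illustration. The membership vectors below are hypothetical: the second is at least as good in every component and strictly better in one, yet the minimum cannot tell them apart, whereas the product can.

```python
import math

# Hypothetical membership vectors for two policies.
a = (0.5, 0.6)
b = (0.5, 0.9)  # no worse than a in any component, strictly better in the second

# Zadeh-min gives both the same firing level, so an optimizer could return
# the dominated policy a unless the optimum is unique.
tie_under_min = min(a) == min(b)

# The product is strictly monotonous for non-zero memberships and ranks b higher.
product_separates = math.prod(a) < math.prod(b)
```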

Based on these corollaries and the property of the product operator, as the control policy causes some objectives to deviate from their own optimum to an unacceptable extent, membership values for these objectives will become zero. At that time, this control policy will be rejected. Thus, as the product is used for solving Equation (8), any meaningful solution will be (local) M-Pareto. On the other hand, when Zadeh-min is used as the aggregate, a test for the uniqueness of the (local) optimal solution for Equation (10) or Equation (11) is unavoidable to ensure the (local) M-Pareto solution. Furthermore, as strictly decreasing functions are used to calculate membership value for each objective, the solution with (local) Pareto optimality can be found from the following propositions:

Proposition 3

Let $u^*$ be (local) M-Pareto for Equation (2) with $\mu_{J_i} \in (0, 1)\ \forall i$. If all of the $\mu_{J_i}$ functions are strictly decreasing, then $u^*$ is (local) Pareto for Equation (2).

Proposition 4

Let $u^*$ be a (local) optimal solution for Equation (8) with all of the $\mu_{J_i}$ being strictly decreasing. $u^*$ is (local) Pareto for Equation (2) if either

1. $u^*$ is the unique (local) solution, or

2. $T$ is strictly monotonous and $\mu_{J_i} \in (0, 1)\ \forall i$.

From the above discussions, if Zadeh-min is selected as the t-norm, the key to guaranteeing a (local) Pareto solution is to determine the unique optimal solution. To achieve this, two strategies can be utilized. The first is to employ a global optimization technique to determine the solution [21]. However, it is difficult to verify whether the solution is the only global optimum or not. The second strategy is to employ the augmented minimax algorithm [10]. The merit of this method is that it circumvents the necessity of testing the uniqueness of the solution by modifying the objective function. Based on this algorithm, Equation (10) can be reformulated as follows:

$$\min_{u(t)\in\Omega} \left\{ \max_{\forall i} [\bar{\mu}_i - \mu_{J_i}(J_i)] + \rho \sum_{i=1}^{I} [\bar{\mu}_i - \mu_{J_i}(J_i)] \right\} \tag{12}$$

or equivalently

$$\min_{u(t)\in\Omega} \omega \quad \text{s.t.} \quad \bar{\mu}_i - \mu_{J_i}(J_i) \le \omega - \rho \sum_{i=1}^{I} [\bar{\mu}_i - \mu_{J_i}(J_i)], \quad i = 1, \ldots, I \tag{13}$$

where $\bar{\mu}_i$ is the reference membership level determined by the DM, and $\rho$ is a sufficiently small positive constant. For such a modification, we can apply the following theorem, which is similar to the work of Sakawa et al., to ensure the properties of the solution:

Theorem 1

Let $u^*$ be a (local) optimal solution for Equation (12) or Equation (13) for some $\bar{\mu}_i$, $i = 1, \ldots, I$. Then $u^*$ is a (local) Pareto optimal solution of Equation (2).
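The augmented objective of Equation (12) can be sketched for a single membership vector as follows. The function name is hypothetical; the defaults $\bar{\mu}_i = 1.0$ and $\rho = 10^{-3}$ follow the default values suggested in Section 5, and the two membership vectors are invented to show how the small sum term breaks a tie that the plain max term leaves unresolved.

```python
from typing import Optional, Sequence


def augmented_minimax(mu: Sequence[float],
                      mu_ref: Optional[Sequence[float]] = None,
                      rho: float = 1e-3) -> float:
    """Objective of Equation (12) for one membership vector (to be minimized)."""
    if mu_ref is None:
        mu_ref = [1.0] * len(mu)  # reference membership levels
    gaps = [r - m for r, m in zip(mu_ref, mu)]
    return max(gaps) + rho * sum(gaps)


# Hypothetical policies with the same worst-case membership (0.5).
a = (0.5, 0.6)
b = (0.5, 0.9)
value_a = augmented_minimax(a)  # 0.5 + 0.001 * (0.5 + 0.4)
value_b = augmented_minimax(b)  # 0.5 + 0.001 * (0.5 + 0.1)
```

Because `value_b` is smaller, the minimization prefers the policy that is better on the second objective, even though both share the same largest gap.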

5. THE ITERATIVE DYNAMIC PROGRAMMING (IDP) INCORPORATING FUZZY DECISION

Methods for solving dynamic optimization problems can be classified into three major categories: (1) the variation-based approach, (2) the non-linear programming (NLP) approach and (3) dynamic programming. The variation-based methods, a direct application of Pontryagin's minimum principle, transform the original problem into a two-point boundary-value problem (TPBVP). Solving the resulting TPBVP is, however, usually a complex and extremely difficult task. The NLP approaches, including complete parameterization [1, 3] and control parameterization [2, 5, 6, 19], transform the dynamic optimization problem into a general non-linear optimization problem. Any standard NLP technique can then be used to determine the solution. All of these parameterization methods exhibit convergence difficulties, and it is difficult to obtain a global optimum because of the highly non-linear, multi-modal, and/or discontinuous nature of these systems. The direct use of dynamic programming in solving dynamic optimization problems is usually difficult. Two major barriers discourage its use. The first is the problem of setting up the grid values for the state and control. To produce a meaningful result, the state grids must be sufficiently fine. At each time stage, therefore, a large number of integrations must be performed for each state grid point and each allowable control value. A greater problem arises when the trajectory calculated for a particular grid point does not meet a grid point at the next time step. Under such conditions, interpolation can be used, but the resulting approximation is usually unreliable.

Recently, Luus [4] proposed the so-called iterative dynamic programming (IDP) to alleviate the computational burden of the original dynamic programming. By using accessible grid points and a region-reduction strategy, IDP can successfully overcome the curse of dimensionality, and its computational effectiveness has been elucidated in many reports, such as Reference [23]. Although IDP is not a deterministic global optimizer that can guarantee convergence to the global solution, it has many advantages. First, it is easy to implement and no gradient information is required. Second, the probability of obtaining the global solution is high and can be made even higher by increasing the parameter resolution and by repeating the optimization process with several different initial trials. These properties are very practical in real industrial environments. One of the best inherent properties of dynamic programming is that the algorithm is easy to extend to parallel or distributed computation. As the computational effort is heavily related to the number of state grid points, it can be reduced significantly if multiple searching loops work simultaneously. Therefore, the barrier of numerous integrations is overcome by the parallel version of IDP proposed by Hartig et al. [24]. In this article, only the regular version is demonstrated because the major purpose of this paper is to illustrate the fuzzy decision making.

In order to apply the IDP to solve our problem, Equation (8), we first divide the entire time horizon into $P$ time stages, $P = t_f / T$. Further assume that the control action within each duration $T$ is kept constant. The original infinite-dimensional problem can then be put into the following finite-dimensional form:

$$\max_{u_i \in \tilde{\Omega}_d} T\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{14}$$

where $\tilde{\Omega}_d$ is the feasible searching space defined as

$$\tilde{\Omega}_d = \{u_0, \ldots, u_{P-1} \mid h(x(t),u(t)) = \dot{x}(t) - f(x(t),u(t)) = 0,\ x(0) = x_0,\ g(x(t),u(t)) \le 0,\ \underline{u} \le u(t) = u_i \le \bar{u}\ \ \forall t \in [iT, iT+T],\ i = 0, \ldots, P-1\}$$

The multi-objective dynamic optimization problem can now be regarded as finding a series of piecewise constant control actions $u_i$, $i = 0, \ldots, P-1$, with the highest firing level, so as to attain the best compromise between the objectives while the state equations and related constraints are all satisfied.
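The piecewise constant parameterization can be sketched as a lookup from time to stage value. The helper name and the five-stage temperature values are hypothetical; the only assumptions are equal stage lengths $T = t_f / P$ and that the final time is assigned to the last stage.

```python
from typing import Callable, Sequence


def piecewise_constant(u_stages: Sequence[float], t_f: float) -> Callable[[float], float]:
    """Build u(t) that holds u_stages[i] on the i-th of P equal stages of [0, t_f]."""
    P = len(u_stages)
    T = t_f / P  # stage length

    def u(t: float) -> float:
        if not 0.0 <= t <= t_f:
            raise ValueError("t must lie in [0, t_f]")
        i = min(int(t // T), P - 1)  # t = t_f belongs to the last stage
        return u_stages[i]

    return u


# Hypothetical five-stage control policy over a 10 h horizon.
u = piecewise_constant([270.0, 220.0, 220.0, 250.0, 270.0], t_f=10.0)
```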

The IDP-based algorithm for the MODOP can be stated as follows:

* Preliminary phase (I): generation of state grid points

1. Divide the time horizon $t_f$ into $P$ time stages, each of length $T = t_f / P$.

2. Choose the number of state grid points, $N$; the number of allowable control values, $M$; and the testing region $r$ for the control action.

3. By choosing $M$ sets of control actions at the time stages, integrate the dynamic equation $M$ times to generate the state grids at the time stages.

* Preliminary phase (II): generation of payoff table

1. Determine the optimal value for each single objective, $J_i^*$, and record the corresponding control policy, $u_i^*$. Meanwhile, record the other objective values obtained when applying $u_i^*$, i.e. $J_s$, $s \ne i$, $s = 1, \ldots, I$.

2. Construct the payoff table (Table I).

3. Let $J_i^\ell = J_i^*(u_i^*)$ and $J_i^u = \max_{\forall s}\{J_i(u_s^*)\}$. Determine the membership function $F_{J_i}$ for each objective.

4. Choose a suitable t-norm to perform the aggregation. If the Zadeh-min is used, the objective function must be modified and the $\bar{\mu}_i$ and $\rho$ values assigned. The default values for $\bar{\mu}_i$ and $\rho$ can be set to 1.0 and $10^{-3}$, respectively.

* Search by the IDP:

1. Begin at stage $(P-1)T$, corresponding to time $t_f - T$. Integrate the dynamic equation from time $t_f - T$ to $t_f$, starting from each grid point and using each of the $M$ allowable control values.

2. Calculate the membership values for all objectives at $t_f$ attained by the $M$ control actions at each grid point.

3. Aggregate the membership values of all objectives via the selected t-norm operator to calculate the firing levels for the control actions.

4. Select the control action with the highest firing level as the best choice for the state grid point and record it.

5. Step back one stage, corresponding to time $t_f - 2T$. Integrate the dynamic equation from time $t_f - 2T$ to $t_f$ for each state grid point with the $M$ allowable control actions. To continue the integration from $t_f - T$ to $t_f$, use the best control action for the closest state grid point at $t_f - T$.

6. Re-calculate the membership values for all objectives at $t_f$, since the state trajectory at $t_f$ does not exactly meet the target value obtained from the previous computation. Aggregate over the objectives to calculate the firing level for each control value used at this stage.

7. Select the control action with the highest firing level as the best choice.

8. Continue the above procedure until $t = 0$. Store the control actions with maximal fuzzy decisions and the corresponding state trajectory.

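Table I. The payoff table for a standard multi-objective optimization problem.

| | $J_1(u_i^*)$ | $\cdots$ | $J_s(u_i^*)$ | $\cdots$ | $J_I(u_i^*)$ |
|---|---|---|---|---|---|
| $J_i(u_1^*)$ | $J_1^*(u_1^*)$ | $\cdots$ | $J_s(u_1^*)$ | $\cdots$ | $J_I(u_1^*)$ |
| $\vdots$ | $\vdots$ | | $\vdots$ | | $\vdots$ |
| $J_i(u_s^*)$ | $J_1(u_s^*)$ | $\cdots$ | $J_s^*(u_s^*)$ | $\cdots$ | $J_I(u_s^*)$ |
| $\vdots$ | $\vdots$ | | $\vdots$ | | $\vdots$ |
| $J_i(u_I^*)$ | $J_1(u_I^*)$ | $\cdots$ | $J_s(u_I^*)$ | $\cdots$ | $J_I^*(u_I^*)$ |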

9. Reduce the region for the state grid and the region for the allowable control values using a contracting factor $\epsilon$, i.e.

$$r^{(j+1)} = (1 - \epsilon)\, r^{(j)} \tag{15}$$

where $j$ is the iteration index. Use the optimal state trajectory from the previous search as the midpoint for the state grid at each time stage. Use the optimal control policy from the previous steps as the midpoint for the allowable values for control.

10. Increment the iteration index $j$ by 1 and go back to step 1. Continue the iteration for a specific number of iterations, such as 20, and examine the results.
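The region-reduction step of Equation (15) can be sketched as below. This is not a full IDP implementation: the even spacing of the $M$ allowable control values around the midpoint, the clipping to the control bounds, and the value $\epsilon = 0.2$ are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import List


def candidate_controls(center: float, r: float, M: int,
                       lower: float, upper: float) -> List[float]:
    """M allowable control values spread over [center - r, center + r], clipped to bounds.

    Even spacing is an assumed choice; the text only states that M allowable
    values are taken within the testing region r.
    """
    if M == 1:
        return [min(max(center, lower), upper)]
    step = 2.0 * r / (M - 1)
    return [min(max(center - r + k * step, lower), upper) for k in range(M)]


def contract(r: float, eps: float) -> float:
    """Region reduction of Equation (15): r_(j+1) = (1 - eps) * r_(j)."""
    return (1.0 - eps) * r


# Hypothetical setting: control bounded to [220, 270], midpoint 245, region 25.
values = candidate_controls(245.0, 25.0, 5, 220.0, 270.0)

r = 25.0
for _ in range(3):          # three iterations with an assumed eps = 0.2
    r = contract(r, 0.2)
```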

Notably, the number of required integrations ($N_T$) for each iteration can be worked out analytically as follows:

$$N_T = M P \left[ 1 + \frac{N(P-1)}{2} \right] \tag{16}$$

One can estimate the overall computation time if $M$, $N$, $P$ and the number of iterations are given. Though the proposed method requires considerable computer processing time, it has significant potential for solving real industrial problems because it is easy to implement. The feasibility and superiority of the IDP for solving the MODOP are demonstrated in the next section.
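As a worked instance, the count can be evaluated for the settings used later in Example 1 ($N = 49$, $M = 5$, $P = 10$, 30 iterations). The sketch reads Equation (16) as $N_T = M P [1 + N(P-1)/2]$, which is a reconstruction of the typeset formula and should be treated as an assumption; the function name is hypothetical.

```python
def integrations_per_iteration(M: int, N: int, P: int) -> float:
    """Stage integrations per IDP iteration, reading Eq. (16) as M*P*[1 + N*(P-1)/2].

    The first stage starts from the single initial state (M*P stage integrations);
    each of the remaining P-1 stages starts from N grid points.
    """
    return M * P * (1 + N * (P - 1) / 2)


per_iteration = integrations_per_iteration(M=5, N=49, P=10)
total = 30 * per_iteration  # 30 iterations per optimization run
```

Under this reading, one iteration needs 11 075 stage integrations and a 30-iteration run needs 332 250.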

6. NUMERICAL EXAMPLES

Two chemical processes are used to demonstrate the proposed algorithm. The first example concerns the multi-objective dynamic optimization of non-vaporized Nylon 6 polymerization in a batch process. The second is the production of penicillin G via fed-batch fermentation. A Pentium-100 personal computer and the DVERK and LSODI integration packages were used for computation.

6.1. Example 1: Nylon 6 polymerization problem

The optimization for Nylon 6 polymerization has drawn considerable attention in the past decade. Ray and Gupta [25] explored the optimum temperature profiles using the minimum principle for respective single objective. Recently, Wajge and Gupta [26] applied the so-called surrogate worth trade-off (SWT) method to study the operation under two objective functions. Based on the mass and moment equations proposed by Ramagopal et al. [27], we attempt to find the temperature profiles T ðtÞ which simultaneously optimize the following three objectives:

$\min J_1$ = concentration of unreacted monomer in product, $[C_1]_{t_f}$

$\min J_2$ = concentration of undesirable cyclic compounds (primarily cyclic dimer) in product, $[C_2]_{t_f}$ (mol/kg)

$\min J_3$ = reaction time, $t_f$

Meanwhile, the following constraints are considered:

1. To ensure single-phase polymerization, the control variable (temperature) should be limited to the range between 220 and 270 °C:

$$220\,^\circ\mathrm{C} \le T(t) \le 270\,^\circ\mathrm{C} \tag{17}$$

2. To guarantee processing properties (number average molecular weight $M_N(t_f)$ within $1.4 \times 10^4$ and $2 \times 10^4$), the stopping criterion is set as the number average chain length of the polymer, $CL_N(t_f)$, reaching 140:

$$CL_N(t_f) = 140 \tag{18}$$

Under this condition, the number average molecular weight $M_N(t_f)$ is about $1.6 \times 10^4$.

3. To give a smooth control profile that is easy to implement, the change in temperature between successive time intervals should not exceed 20 °C:

$$|T(t_k) - T(t_{k-1})| \le 20\,^\circ\mathrm{C} \tag{19}$$

Furthermore, each time interval must be longer than 15 min:

$$t_k - t_{k-1} > 15\ \mathrm{min} \tag{20}$$
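The path constraints of Equations (17), (19) and (20) can be checked for a candidate temperature profile as sketched below. The function name and the sample profiles are hypothetical; the end-point condition of Equation (18) is not covered because it requires integrating the polymerization model.

```python
from typing import List, Sequence


def profile_violations(times_min: Sequence[float], temps_c: Sequence[float]) -> List[str]:
    """List the violated constraints for temperatures temps_c held from times_min[k]."""
    issues = []
    for k, temp in enumerate(temps_c):
        if not 220.0 <= temp <= 270.0:
            issues.append(f"Eq. (17) at k={k}")
    for k in range(1, len(temps_c)):
        if abs(temps_c[k] - temps_c[k - 1]) > 20.0:
            issues.append(f"Eq. (19) at k={k}")
        if times_min[k] - times_min[k - 1] <= 15.0:
            issues.append(f"Eq. (20) at k={k}")
    return issues


# Hypothetical profiles: the first is admissible, the second drops 50 °C in one step.
ok = profile_violations([0, 60, 120, 180], [270.0, 250.0, 230.0, 220.0])
bad = profile_violations([0, 60, 120, 180], [270.0, 220.0, 220.0, 240.0])
```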

In this example, we set N ¼ 49; M ¼ 5; P ¼ 10 and 30 iterations for each run of optimization. With the same physical data, initial conditions and tolerance as that of Wajge and Gupta [26], we ﬁrst determine optimal values for considered objectives to establish the payoﬀ table, as shown in Table II. The results in Table II show that:

* When the polymerization temperature is maintained at its highest value, 270 °C, the concentration of unreacted monomer can be kept at the lowest level and the reaction time is the shortest. The cyclic compound concentration is increased, however.

* The reaction temperature should be kept at the lowest value, 220 °C, to suppress the cyclic compound concentration.

From the above results, the control profiles can be qualitatively divided into three stages:

Stage 1: Initializing the reaction at the highest temperature to promote monomer conversion.

Stage 2: Keeping the temperature at its lowest level to keep the concentration of undesired product as low as possible.

Stage 3: Raising the temperature again to increase the product concentration.

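Table II. Payoff table for Nylon 6 batch polymerization.

| $T(t)$ | $J_1(u_i^*) = [C_1]_{t_f}$ | $J_2(u_i^*) = [C_2]_{t_f}$ | $J_3(u_i^*) = t_f$ |
|---|---|---|---|
| $J_i(u_1^* = 270\,^\circ\mathrm{C})$ | 1.9005 | 0.01202 | 6.071 |
| $J_i(u_2^* = 220\,^\circ\mathrm{C})$ | 2.2430 | 0.00735 | 35.33 |
| $J_i(u_3^* = 270\,^\circ\mathrm{C})$ | 1.9005 | 0.01202 | 6.071 |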

Based on Table II, we set up the following linear type membership functions, designated as Condition-1, to express our preference on each objective.

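$$\mu_{J_1} = \begin{cases} 1 & \text{if } J_1 < 1.9005 \\ \dfrac{2.2430 - J_1}{2.2430 - 1.9005} & \text{if } 1.9005 \le J_1 \le 2.2430 \\ 0 & \text{if } J_1 > 2.2430 \end{cases} \tag{21}$$

$$\mu_{J_2} = \begin{cases} 1 & \text{if } J_2 < 0.00735 \\ \dfrac{0.01202 - J_2}{0.01202 - 0.00735} & \text{if } 0.00735 \le J_2 \le 0.01202 \\ 0 & \text{if } J_2 > 0.01202 \end{cases} \tag{22}$$

$$\mu_{J_3} = \begin{cases} 1 & \text{if } J_3 < 6.071 \\ \dfrac{35.330 - J_3}{35.330 - 6.071} & \text{if } 6.071 \le J_3 \le 35.330 \\ 0 & \text{if } J_3 \ge 35.330 \end{cases} \tag{23}$$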
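The bounds in Equations (21) to (23) follow from the payoff table by taking the diagonal entry as $J_i^\ell$ and the column maximum as $J_i^u$. The sketch below rebuilds them from the Table II values and evaluates the linear memberships for two of the objective values reported in Table III; the variable and function names are hypothetical.

```python
# Rows: objectives (J1, J2, J3) obtained when applying u_1*, u_2*, u_3* (Table II).
payoff = [
    [1.9005, 0.01202, 6.071],   # u_1* (270 °C)
    [2.2430, 0.00735, 35.33],   # u_2* (220 °C)
    [1.9005, 0.01202, 6.071],   # u_3* (270 °C)
]
n_obj = len(payoff)

# J_i^l is the diagonal entry, J_i^u the largest value in column i.
J_lo = [payoff[i][i] for i in range(n_obj)]
J_hi = [max(row[i] for row in payoff) for i in range(n_obj)]


def mu(i: int, J: float) -> float:
    """Linear membership of objective i (0-based), as in Equations (21)-(23)."""
    if J < J_lo[i]:
        return 1.0
    if J > J_hi[i]:
        return 0.0
    return (J_hi[i] - J) / (J_hi[i] - J_lo[i])


mu_J1_product = mu(0, 1.9634)   # [C1]_tf from the product column of Table III
mu_J2_min = mu(1, 0.00917)      # [C2]_tf from the min column of Table III
```

The two values come out close to the 0.8164 and 0.61 listed in Table III.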

Here, both the min and the product operators are used for fuzzy aggregation in order to explore their effect on the final results, as shown in Table III. Table III illustrates that the min operator tends to distribute the preference evenly: the third column of the table shows that the membership values are all equal to 0.61. The product, however, is more inclined to concentrate the preference on some objectives, as shown in the fifth column, where the membership values for $J_1$ and $J_3$ are larger than that for $J_2$. Control profiles for Condition-1 are shown in Figure 1. Notably, applying the min operator in aggregation does not guarantee that the final result is (local) Pareto optimal. We therefore modify the max–min type objective function into the augmented min–max type objective function. The simulation results for this modification are listed in Table IV. Comparing the results in Tables III and IV, we find that although the max–min type objective function has been replaced with the augmented min–max type function, the final results do not change significantly. A possible reason is that IDP had already found the unique solution for the former objective function. Nevertheless, to ensure (local) Pareto optimality, we still recommend using the augmented min–max type objective function.

Table III. Comparisons between min and product operator under Condition-1 (Example 1).

| | min | $\mu_J$ (min) | product | $\mu_J$ (product) |
|---|---|---|---|---|
| $[C_1]_{t_f}$ | 2.0339 | 0.61 | 1.9634 | 0.8164 |
| $[C_2]_{t_f}$ | 0.00917 | 0.61 | 0.00989 | 0.4563 |

Now, considering a practical operation (8 h per day), the maximal allowable operating time is changed from 35.330 to 15 h (Condition-2). The membership function for the operating time becomes

$$\mu_{J_3} = \begin{cases} 1 & \text{if } J_3 < 6.071 \\ \dfrac{15 - J_3}{15 - 6.071} & \text{if } 6.071 \le J_3 \le 15 \\ 0 & \text{if } J_3 > 15 \end{cases} \tag{24}$$

For this modification, we again begin by using the min and the product operators to perform the aggregation. The results, listed in Table V, are similar to those from Condition-1. The control profiles are shown in Figure 2. We then replaced the max–min type objective function with the augmented min–max type objective function. The simulation results are shown in Table VI.

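Table IV. Results for augmented min–max type function under Condition-1 (Example 1).

| | Aug. min–max ($\rho = 10^{-3}$) | $\mu_J$ | Aug. min–max ($\rho = 10^{-5}$) | $\mu_J$ |
|---|---|---|---|---|
| $[C_1]_{t_f}$ | 2.0342 | 0.61 | 2.0341 | 0.61 |
| $[C_2]_{t_f}$ | 0.00917 | 0.61 | 0.00917 | 0.61 |
| $t_f$ | 17.441 | 0.61 | 17.478 | 0.61 |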

Figure 1. Temperature proﬁles for Condition-1 (Example 1).

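Table V. Comparisons when using min and product under Condition-2 (Example 1).

| | min | $\mu_J$ (min) | product | $\mu_J$ (product) |
|---|---|---|---|---|
| $[C_1]_{t_f}$ | 2.0751 | 0.49 | 1.9917 | 0.7338 |
| $[C_2]_{t_f}$ | 0.00973 | 0.49 | 0.01056 | 0.3126 |
| $t_f$ | 10.624 | 0.49 | 8.747 | 0.7003 |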

6.2. Example 2: Penicillin G fed-batch fermentation

The simplified Heijnen's model [28] was considered:

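$$\begin{aligned} \frac{dS}{dt} &= -\sigma X + u(t) \\ \frac{dX}{dt} &= \nu X \\ \frac{dP}{dt} &= \pi X - k_h P \\ \frac{dG}{dt} &= \frac{1}{C_{s,in}}\, u(t) - 0.0008\, G \\ \frac{dC_s}{dt} &= \frac{1}{G}\frac{dS}{dt} - \frac{C_s}{G}\frac{dG}{dt} \end{aligned} \tag{25}$$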

In these equations, $S$, $X$ and $P$ denote the amounts of substrate (glucose), cell mass and product (penicillin) in the broth, respectively. $G$ is the total broth weight. $u$ is the substrate feed rate (mol/h). $C_{s,in}$ and $C_s$ designate the glucose concentration in the feed and in the broth, respectively. $k_h$ is the penicillin hydrolysis constant. The specific rates $\sigma$, $\nu$ and $\pi$ are modelled as follows.

1. The specific substrate rate $\sigma$ is modelled by a Monod-type relationship,

$$\sigma = Q_{s,\max} \frac{C_s}{K_s + C_s} \tag{26}$$

where $Q_{s,\max}$ denotes the maximum specific sugar uptake rate and $K_s$ is the Monod constant for the sugar uptake.

Figure 2. Temperature proﬁles for Condition-2 (Example 1).

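Table VI. Results for augmented min–max type function under Condition-2 (Example 1).

| Objective | Aug. min–max ($r = 10^3$): value | $\mu_J$ | Aug. min–max ($r = 10^5$): value | $\mu_J$ |
|---|---|---|---|---|
| $[C_1]_{t_f}$ | 2.0759 | 0.49 | 2.0752 | 0.49 |
| $[C_2]_{t_f}$ | 0.00974 | 0.49 | 0.00973 | 0.49 |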


2. The (overall) specific growth rate $\nu$ is given by

$$
\nu = Y_{X/S}\left(\sigma - m - \frac{\pi}{Y_{P/S}}\right)
\tag{27}
$$

where $m$ is the overall specific maintenance demand, $Y_{X/S}$ is the biomass-on-substrate yield coefficient and $Y_{P/S}$ is the product-on-substrate yield coefficient.

3. The specific production rate $\pi$ is assumed to be directly coupled with the specific growth rate $\nu$, following a Blackman-type relation,

$$
\pi(\nu) = Q_{p,\max}
\begin{cases}
\dfrac{\nu}{\nu_{\mathrm{crit}}} & \text{for } \nu \le \nu_{\mathrm{crit}} \\[6pt]
1 & \text{for } \nu \ge \nu_{\mathrm{crit}}
\end{cases}
\tag{28}
$$

where $Q_{p,\max}$ denotes the maximal specific production rate and $\nu_{\mathrm{crit}}$ is the critical specific growth rate.
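Because the growth rate in (27) depends on the production rate and the production rate in (28) depends on the growth rate, the two relations have to be solved together. A minimal Python sketch of one way to do this is given below. It solves the linear branch of (28) in closed form and switches to the saturated branch when the result exceeds the critical growth rate. The constants are read from Table VII (with the maximal specific production rate read as 3.3e-4); the Monod constant `K_S` and the critical growth rate `NU_CRIT` are not listed in Table VII, so the values used for them here are hypothetical placeholders, and the function name is ours.

```python
# Constants read from Table VII (units: mol, kg, h).
Q_S_MAX = 0.0245   # maximum specific sugar uptake rate
Q_P_MAX = 3.3e-4   # maximal specific production rate (read as 3.3e-4)
M_MAINT = 0.0034   # overall specific maintenance demand m
Y_XS = 3.67        # biomass-on-substrate yield coefficient
Y_PS = 0.46        # product-on-substrate yield coefficient

# Hypothetical placeholders: these two values are NOT given in Table VII.
K_S = 1.0e-4       # Monod constant for sugar uptake (assumed)
NU_CRIT = 0.01     # critical specific growth rate (assumed)


def specific_rates(c_s):
    """Return (sigma, nu, pi) for a broth glucose concentration c_s (mol/kg)."""
    sigma = Q_S_MAX * c_s / (K_S + c_s)  # Monod relation (26)
    # Linear branch of (28): substitute pi = Q_P_MAX * nu / NU_CRIT into (27)
    # and solve the resulting linear equation for nu.
    nu = Y_XS * (sigma - M_MAINT) / (1.0 + Y_XS * Q_P_MAX / (NU_CRIT * Y_PS))
    if nu > NU_CRIT:
        # Saturated branch of (28): pi = Q_P_MAX, so (27) is explicit in nu.
        nu = Y_XS * (sigma - M_MAINT - Q_P_MAX / Y_PS)
    pi = Q_P_MAX * min(nu / NU_CRIT, 1.0)  # Blackman relation (28)
    return sigma, nu, pi


sigma0, nu0, pi0 = specific_rates(0.055)  # initial broth concentration, Table VII
print(sigma0, nu0, pi0)
```

With the placeholder values above, the initial broth concentration falls on the saturated branch, so the production rate equals its maximum; a different choice of `K_S` or `NU_CRIT` could change which branch is active.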

The initial conditions and the related physical data are shown in Table VII. The glucose feed rate is limited to at most 2000 mol/h, and the total amount of glucose fed during the operation is fixed at $2 \times 10^5$ mol:

$$
0 \le u(t) \le 2000\ \mathrm{mol/h}
\tag{29}
$$

$$
\int_0^{t_f} u(t)\,dt = 2 \times 10^5\ \mathrm{mol}
\tag{30}
$$
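Constraints (29) and (30) together imply a lower bound on the operating time: even when feeding at the upper bound throughout, the required amount of glucose cannot be delivered in less than 2 × 10^5 / 2000 = 100 h. The Python sketch below computes this bound and checks whether a piecewise-constant feeding policy satisfies both constraints. The equal stage lengths and the helper names are assumptions made for illustration only.

```python
U_MAX = 2000.0       # upper bound on the feed rate, mol/h, from (29)
TOTAL_FEED = 2.0e5   # total glucose to be fed, mol, from (30)


def min_final_time():
    """Shortest operating time (h) that can still deliver TOTAL_FEED."""
    return TOTAL_FEED / U_MAX


def is_feasible(u_stages, t_f, tol=1e-6):
    """Check (29) and (30) for a piecewise-constant policy on equal stages."""
    dt = t_f / len(u_stages)
    within_bounds = all(0.0 <= u <= U_MAX for u in u_stages)
    fed = sum(u * dt for u in u_stages)
    return within_bounds and abs(fed - TOTAL_FEED) <= tol * TOTAL_FEED


print(min_final_time())                       # 100.0
print(is_feasible([U_MAX] * 49, 100.0))       # feeding at the bound for 100 h
print(is_feasible([U_MAX / 2] * 49, 100.0))   # half the rate cannot deliver the total
```

The bound of 100 h coincides with the shortest operating time used for this example, which corresponds to feeding at the maximum rate for the whole run.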

Such a fermentation process can be divided into two phases [28]: a first phase of rapid growth with almost no product formation, and a second phase of limited growth during product formation. The optimization problem can thus be formulated as finding the glucose feeding policy $u(t)$ that simultaneously maximizes the final product amount $J_1 = P(t_f)$ and finishes the operation in the shortest time $J_2 = t_f$. That is,

$$
\max_{u(t)} J_1 = \max_{u(t)} P(t_f) \quad\Longleftrightarrow\quad \min_{u(t)} \left[-P(t_f)\right]
\tag{31}
$$

$$
\min_{u(t)} J_2 = \min_{u(t)} t_f
\tag{32}
$$

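Table VII. Initial conditions and constants used in Penicillin G fermentation model.

| Quantity | Value | Quantity | Value |
|---|---|---|---|
| $S$ | 5500 mol | $k_h$ | 0.002 h$^{-1}$ |
| $X$ | 4000 mol dry weight | $Q_{s,\max}$ | 0.0245 mol/(mol dry weight · h) |
| $P$ | 0 mol | $Q_{p,\max}$ | $3.3 \times 10^{-4}$ mol/(mol dry weight · h) |
| $G$ | $1 \times 10^5$ kg | $m$ | 0.0034 mol/(mol dry weight · h) |
| $C_{s,\mathrm{in}}$ | 1/0.36 mol/kg | $Y_{X/S}$ | 3.67 mol dry weight/mol |
| $C_s$ | 0.055 mol/kg | $Y_{P/S}$ | 0.46 mol/mol |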


As stated previously, we must first search for the optimal value of each single objective to build the payoff table. Using IDP with N = 49, M = 5, P = 10 and 30 iterations, we can produce 7878 mol of penicillin G in 232.3 h of operation if only the maximum production is considered. However, only 3463 mol of product is obtained when the shortest operating time, $t_f = 100$ h, is required. The payoff table is given in Table VIII. Linear membership functions can then be constructed to describe the change in preference for each individual objective:

$$
\mu_{J_1} =
\begin{cases}
1 & \text{if } J_1 > 7878 \\[4pt]
\dfrac{J_1 - 3463}{7878 - 3463} & \text{if } 3463 \le J_1 \le 7878 \\[4pt]
0 & \text{if } J_1 < 3463
\end{cases}
\tag{33}
$$

$$
\mu_{J_2} =
\begin{cases}
1 & \text{if } J_2 < 100 \\[4pt]
\dfrac{232.3 - J_2}{232.3 - 100} & \text{if } 100 \le J_2 \le 232.3 \\[4pt]
0 & \text{if } J_2 > 232.3
\end{cases}
\tag{34}
$$
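The step from the payoff table to an aggregated decision value can be sketched in Python as follows. Each membership function is built from the best and worst entries of one payoff-table column, which covers both the increasing case (33) and the decreasing case (34). The helper names and the trial outcome used at the end are hypothetical and serve only to show how the two t-norms are evaluated.

```python
def linear_membership(worst, best):
    """Linear membership: 0 at the payoff-table worst value, 1 at the best value."""
    def mu(value):
        degree = (value - worst) / (best - worst)
        return max(0.0, min(1.0, degree))  # clip to [0, 1] outside the payoff range
    return mu


# Best and worst values taken from the payoff table (Table VIII).
mu_product_amount = linear_membership(worst=3463.0, best=7878.0)   # Eq. (33)
mu_operating_time = linear_membership(worst=232.3, best=100.0)     # Eq. (34)


def aggregate_min(degrees):
    """Fuzzy aggregation with the min t-norm."""
    return min(degrees)


def aggregate_product(degrees):
    """Fuzzy aggregation with the product t-norm."""
    result = 1.0
    for degree in degrees:
        result *= degree
    return result


# Hypothetical trial outcome of one candidate feeding policy (not a reported result).
trial_p_tf, trial_t_f = 6000.0, 150.0
degrees = [mu_product_amount(trial_p_tf), mu_operating_time(trial_t_f)]
print(aggregate_min(degrees), aggregate_product(degrees))
```

In the proposed method, the optimizer searches for the control policy that maximizes one of these aggregated values; the sketch only evaluates them for a single given outcome.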

When the min operator is used to perform the fuzzy aggregation, the resulting profiles of the control policy and the specific growth rate are shown in Figure 3. When the product is used as the aggregating operator, the resulting control policy and specific growth rate are shown in Figure 4. The numerical results obtained with the two operators differ only slightly, as listed in Table IX.

Table VIII. Payoff table for Penicillin G fed-batch fermentation.

| | $J_1(u_i^*)$ | $J_2(u_i^*)$ |
|---|---|---|
| $J_i(u_1^*)$ | 7878 | 232.3 |
| $J_i(u_2^*)$ | 3463 | 100 |

7. CONCLUSIONS

An algorithm applying iterative dynamic programming and fuzzy inference to solve multi-objective optimal control problems is proposed. The optimal control policy for each individual objective is first determined sequentially. The payoff table is then established by applying each of these optimal control policies to every objective. A simple monotonically increasing or decreasing membership function is used to define the degree of satisfaction for each objective function. The final optimal control policy is found by maximizing the aggregated membership value. Herein, two popular t-norms, the min and the product, have been applied as the fuzzy aggregation operators. By applying the fuzzy aggregation and payoff table approach, one can obtain a single Pareto optimal solution that best satisfies the decision maker with the least subjective knowledge. This hybrid algorithm is not only easy to implement, but also powerful and computationally efficient. Two chemical processes, the Nylon 6 batch polymerization and the Penicillin G fed-batch fermentation, are used to demonstrate the feasibility and superiority of the proposed algorithm.

REFERENCES

1. Biegler LT. Solution of dynamic optimization problems by successive quadratic programming and orthogonal collocation. Computers and Chemical Engineering 1984; 8:243–248.

2. Goh CK, Teo KL. Control parameterization: a unified approach to optimal control problems with general constraints. Automatica 1988; 24:3–18.

3. Cuthrell JJ, Biegler LT. Simultaneous optimization and solution methods for batch reactor profiles. Computers and Chemical Engineering 1989; 13:49–62.

4. Luus R. Optimal control by dynamic programming using grids points and region reduction. Hungarian Journal of Industrial Chemistry 1989; 17:523–543.

5. Vassiliadis VS, Sargent RWH, Pantelides CC. Solution of a class of multistage dynamic optimization problems. 1. Problems without path constraints. Industrial and Engineering Chemistry Research 1994; 33:2111–2122.

6. Vassiliadis VS, Sargent RWH, Pantelides CC. Solution of a class of multistage dynamic optimization problems. 2. Problems with path constraints. Industrial and Engineering Chemistry Research 1994; 33:2123–2133.

7. Chen CL, Sun DY. Solution of fuzzy dynamic optimization problems by adaptive stochastic algorithm. International Journal of Artificial Intelligence Tools 2000; 9(4):527–535.

8. Choi KY, Butala DN. An experimental study of multi-objective dynamic optimization of a semi-batch copolymerization process. Polymer Engineering and Science 1991; 31(5):353.

9. Miettinen K. Nonlinear Multi-objective Optimization. Kluwer Academic: New York, 1999.

10. Sakawa M. Fuzzy Sets and Interactive Multi-objective Optimization. Plenum Press: New York, 1993.

11. Cohon JL. Multi-objective Programming and Planning. Academic Press: New York, 1985.

Figure 4. Control policy and specific growth rate (T = product).

Table IX. Comparison of different t-norms for Penicillin G fed-batch fermentation.

| Obj. | T = min | T = product |
|---|---|---|
| $P(t_f)$ (mol) | 6649.6 | 6716.7 |


12. Zadeh LA. Fuzzy sets. Information and Control 1965; 8:338–353.

13. Bellman RE, Zadeh LA. Decision making in a fuzzy environment. Management Science 1970; 17:141–164.

14. Tanaka H, Okuda T, Asai K. On the fuzzy mathematical programming. Journal of Cybernetics 1974; 3:37–46.

15. Zimmermann HJ. Description and optimization of fuzzy systems. International Journal of General Systems 1976; 2:209–215.

16. Sakawa M, Inuiguchi M, Kato K, Ikeda T. A fuzzy satisfying method for multi-objective linear optimal control problems. Fuzzy Sets and Systems 1996; 78:223–229.

17. Wang FS, Jing CH, Tsao GT. Fuzzy-decision-making problems of fuel ethanol production using a genetically engineered yeast. Industrial and Engineering Chemistry Research 1998; 37:3434–3443.

18. Wang FS, Shieh TL. Extension of iterative dynamic programming to multi-objective optimal control problems. Industrial and Engineering Chemistry Research 1997; 36:2279–2286.

19. Carrasco EF, Banga JR. Dynamic optimization of batch reactors using adaptive stochastic algorithms. Industrial and Engineering Chemistry Research 1997; 36:2252–2261.

20. Klir GJ, Yuan B. Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall: New York, 1995.

21. Esposito WR, Floudas CA. Deterministic global optimization in nonlinear optimal control problems. Journal of Global Optimization 2000; 17:97–126.

22. Delgado M, Verdegay JL, Vila MA. A possibility approach for multi-objective programming problems: efficiency of solutions. In Stochastic Versus Fuzzy Approaches to Multi-objective Mathematical Programming under Uncertainty, Slowinski R, Teghem J (eds). Kluwer Academic: New York, 1991.

23. Luus R. Optimal control of batch reactor by iterative dynamic programming. Journal of Process Control 1994; 4:216–218.

24. Hartig F, Mandel K, Keil FJ. Parallelization of iterative dynamic programming. Periodica Polytechnica Series in Chemical Engineering 1999; 43(1):3–16.

25. Ray AK, Gupta SK. Optimization of non-vaporizing nylon 6 reactors with stopping conditions and end-point constraints. Polymer Engineering and Science 1986; 26:1033–1044.

26. Wajge RM, Gupta SK. Multi-objective dynamic optimization of a nonvaporizing nylon 6 batch reactor. Polymer Engineering and Science 1994; 34:1161–1172.

27. Ramagopal AK, Kumar A, Gupta SK. Optimal temperature profiles for Nylon 6 polymerization in plug-flow reactor. Journal of Applied Polymer Science 1983; 28:2261–2279.

28. van Impe JF. Optimal control of the penicillin G fed-batch fermentation: an analysis of the model of Heijnen et al. Optimal Control Applications and Methods 1994; 15:13–34.