Solving multi-objective dynamic optimization problems
with fuzzy satisfying method
Cheng-Liang Chen*,†, Chia-Yuan Chang and Daim-Yuang Sun
Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan, R.O.C.
SUMMARY
This article proposes a novel algorithm integrating iterative dynamic programming and fuzzy aggregation to solve multi-objective optimal control problems. First, the optimal control policies involving these objectives are sequentially determined. A payoff table is then established by applying each optimal policy in series to evaluate these multiple objectives. Considering the imprecise nature of the decision-maker’s judgment, these multiple objectives are viewed as fuzzy variables. Simple monotonic increasing or decreasing membership functions are then defined for degrees of satisfaction for these linguistic objective functions. The optimal control policy is finally searched by maximizing the aggregated fuzzy decision values. The proposed method is rather easy to implement. Two chemical processes, Nylon 6 batch polymerization and Penicillin G fed-batch fermentation, are used to demonstrate that the method has a significant potential to solve real industrial problems. Copyright © 2003 John Wiley & Sons, Ltd.
KEY WORDS: optimal control; multi-objective optimization; fuzzy set; iterative dynamic programming
1. INTRODUCTION
In recent years, a considerable number of studies have addressed the optimization of dynamic systems with a single objective [1–7]. However, multiple aims are usually desired in practice. For instance, when operating a batch reactor, maintaining undesirable byproducts at the lowest possible levels and attaining the desired fractional conversion in the shortest amount of time are both important. Furthermore, the operators may simultaneously consider other important factors such as economic efficiency, safety, reliability, or the impact on the environment. All these objectives are usually non-commensurable. Operators thus need a multi-objective decision-making technique to help them look for a satisfying solution among those conflicting objectives. Recently, the application of multi-objective approaches to dynamic optimization problems has been addressed by many researchers [8].
Optimization for a multi-objective problem is a procedure looking for a compromise policy. The result, called a Pareto optimal or non-inferior solution, consists of an infinite number of alternatives. There is a large variety of methods for treating the multi-objective optimization problem. Those methods can be classified in many ways according to different criteria [9, 10]. For example, Cohon [11] categorized methods into two relatively distinct subsets: generating methods and preference-based methods. The generating methods produce a set of Pareto optima and then the decision maker (DM) selects one of them on the basis of subjective value judgment. Among them the weighting-sum method and the ε-constraint method are well known. If the algorithm cannot converge to a suitable solution, or the DM does not agree with the result, the DM can adjust the related parameters used in the algorithm, such as the weighting factors in the weighting-sum method. The computation can be repeated until a satisfactory solution is obtained. The preference-based methods, on the other hand, incorporate the DM’s preference as the solution process goes on, and the solution that best fulfills the DM’s preference is selected. Thus, all these multi-objective optimization methods for finding a Pareto optimal solution are filled with subjective and fuzzy properties [10].
*Correspondence to: Professor Cheng-Liang Chen, Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan, R.O.C.
†E-mail: ccl@ccms.ntu.edu.tw
Received 22 January 2001
In order to overcome the difficulty of describing a fuzzy attribute, Zadeh [12] proposed the fuzzy set concept. By using multi-valued logic to replace the traditional Boolean logic, one can quantitatively express unclear information or knowledge. Afterward, Bellman and Zadeh [13] extended the fuzzy concept to decision making in a fuzzy environment. Tanaka et al. [14] introduced the concept of fuzzy mathematical programming and proved that a fuzzy mathematical programming problem can be reduced to a conventional non-linear programming problem. Zimmermann [15] introduced fuzzy set theory into conventional linear programming, considering linear programming problems with a fuzzy goal and some fuzzy constraints. Following the fuzzy decision, together with linear membership functions, he proved that there exists an equivalent linear programming problem. Recently, Sakawa et al. [16] proposed the fuzzy satisfying method to find a solution for multi-objective linear problems by applying the payoff table. Wang et al. [17] proposed a fuzzy decision-making procedure to determine the feed profile of a fermentation process for fuel ethanol production, using the fuzzy min–max method. In this paper, we attempt to extend fuzzy inference to the solution of dynamic optimization problems with multiple objectives. The augmented min–max approach, proposed by Sakawa et al. [16], is applied instead of the conventional min–max approach used by Wang and Shieh [18], since the uniqueness of the optimal solution is not guaranteed in non-linear systems. By mapping each objective value into a normalized domain referenced from the payoff table and then aggregating these normalized values, the vector objective problem can be grouped into a single objective problem. Then, the iterative dynamic programming (IDP), developed by Luus and his co-workers [4], is utilized as the platform to determine the solution for such problems.
By using appropriate mapping functions and aggregates, the solution found by the proposed algorithm can be proved to be (local) Pareto optimal. Notably, by applying the fuzzy aggregation and payoff table approach, one can obtain a single Pareto optimal solution that best satisfies the decision maker with the least subjective knowledge. Furthermore, the grouped single objective dynamic optimization problem can also be solved by using other searching methods, such as the integrated controlled random search (ICRS) [7, 19].
In the rest of this article, the formulation of the problem is set out in Section 2. The procedure for grouping the vector objectives into a scalar one using the fuzzy set concept is given in Section 3. Some related mathematical properties required to guarantee the optimality of the solution are given in Section 4. In Section 5, a review and modification of IDP is presented; the limitations of the proposed method with respect to local Pareto optima are also discussed there. Two numerical examples are supplied in Section 6, demonstrating the usefulness of the proposed method. Some conclusions and discussions are given in Section 7.
2. PROBLEM STATEMENT
Consider the following multi-objective dynamic optimization problem (MODOP) with a specified final time $t_f$:
\[
\min_{u(t)\in\Omega} J_1(x(t_f)) \quad\text{and}\quad \min_{u(t)\in\Omega} J_2(x(t_f)) \quad\cdots\quad\text{and}\quad \min_{u(t)\in\Omega} J_I(x(t_f)) \tag{1}
\]
or in a more compact form
\[
\min_{u(t)\in\Omega} J(x(t_f)) = [J_1(x(t_f)), \ldots, J_I(x(t_f))]^T \tag{2}
\]
Here, $x(t)$ denotes the $n \times 1$ state vector with initial condition $x(0) = x_0$; $u(t)$ is the $m \times 1$ control vector; $\Omega$ is the feasible region in the control action space $u(t)$ that satisfies some constraints,
\[
\Omega = \{u(t) \mid h(x(t),u(t)) = \dot{x}(t) - f(x(t),u(t)) = 0,\ x(0) = x_0,\ g(x(t),u(t)) \le 0,\ \underline{u} \le u(t) \le \bar{u}\};
\]
$h(\cdot) = [h_1(\cdot), \ldots, h_L(\cdot)]^T$ and $g(\cdot) = [g_1(\cdot), \ldots, g_K(\cdot)]^T$ are equality and inequality constraints, respectively; $\underline{u}$ and $\bar{u}$ are the lower and upper bounds for the control vector; and $J(x(t_f))$ denotes an $I \times 1$ dimensional objective function vector.
The multi-objective dynamic optimization problem is to determine the optimal control policy $u^*(t)$ over $t \in [0, t_f]$ which brings the state vector $x(t)$ from the initial condition $x_0$ to the final state $x(t_f)$ so that the objectives in $J(x(t_f))$, regardless of their commensurability, are minimized under the given constraints. These objective functions, however, usually conflict with one another in practice. It is thus impossible to attain their individual optima, $J_i^*(x(t_f))$, simultaneously. The optimization of one objective implies the sacrifice of other targets. Therefore, the decision maker must make some compromise among these goals. In contrast to the optimality used in single objective optimization problems, Pareto optimality characterizes the solutions in a multi-objective optimization problem [9–11].
Definition 1
$u^*(t) \in \Omega$ is said to be Pareto optimal for Equation (2) if and only if there exists no $u(t)$ such that $J_i(u(t), x(t_f)) \le J_i(u^*(t), x^*(t_f))$ for all $i \in \{1, \ldots, I\}$ and $J_j(u(t), x(t_f)) < J_j(u^*(t), x^*(t_f))$ for some $j \in \{1, \ldots, I\}$.
From the above definition, the number of solutions satisfying Pareto optimality in a multi-objective optimization problem can be infinite. It is difficult for the DM to attribute a set of incompatible objectives, such as economic efficiency, safety, reliability, or environmental
impact, without knowledge of the possible level of attainment for those objectives. It is thus a fuzzy problem for finding a Pareto optimal solution that best satisfies the decision maker.
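Definition 1 can be read as a pairwise dominance test. The Python sketch below is illustrative only: the function names `dominates` and `pareto_front` and the sample objective vectors are assumptions introduced here, and the filter works on a finite list of candidate objective vectors rather than on the infinite set of admissible policies.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a dominates b when every objective is minimized."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Made-up objective vectors (J1, J2), used only to exercise the test.
    candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 3.0)]
    print(pareto_front(candidates))
```

In this toy list the last vector is no better than the second in either objective, so it is filtered out, while the first two are mutually non-dominated.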
3. FUZZY DECISION MAKING IN MODOP
In this work, we extend the fuzzy set theory of Zadeh [12] to deal with the multi-objective dynamic optimization problem. Considering the uncertain nature of human thinking, it is quite natural to assume that the DM has multiple fuzzy goals, $J_i$, $i = 1, \ldots, I$, where an interval $[J_i^\ell, J_i^u]$ exists for each fuzzy objective $J_i$. The $i$th minimization objective is thoroughly satisfied when the objective value $J_i(x(t_f))$ is less than $J_i^\ell$, and it is unacceptable when $J_i(x(t_f)) > J_i^u$. For a $J_i(x(t_f))$ value between $J_i^\ell$ and $J_i^u$, the extent of satisfaction of the DM decreases with an increase in its value. A strictly monotonic decreasing membership function, $\mu_{J_i}(J_i(x(t_f))) \in [0, 1]$, can be used to characterize such a transition from the objective value, $J_i(x(t_f))$, to the degree-of-satisfaction, $\mu_{J_i}$ [7]:
\[
\mu_{J_i}(J_i(x(t_f))) =
\begin{cases}
1 & \text{for } J_i(x(t_f)) < J_i^\ell \\
F_{J_i}(J_i(x(t_f)), J_i^\ell, J_i^u) & \text{for } J_i^\ell \le J_i(x(t_f)) \le J_i^u \\
0 & \text{for } J_i(x(t_f)) > J_i^u
\end{cases} \tag{3}
\]
Here, a membership value of 1 denotes absolute satisfaction and 0 means unacceptable. The original MODOP is now equivalent to looking for a suitable control policy that provides the maximal degree-of-satisfaction for the multiple fuzzy objectives:
\[
\max_{u(t)\in\Omega} \mu_{J_1}(J_1(x(t_f))) \quad\text{and}\quad \max_{u(t)\in\Omega} \mu_{J_2}(J_2(x(t_f))) \quad\cdots\quad\text{and}\quad \max_{u(t)\in\Omega} \mu_{J_I}(J_I(x(t_f))) \tag{4}
\]
or in a more compact form,
\[
\max_{u(t)\in\Omega} \mu(x(t_f)) = [\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))]^T \tag{5}
\]
Under incompatible objective circumstances, a DM must make a compromise decision that provides a maximum degree-of-satisfaction for all of these conflicting objectives. The new optimization problem, Equation (5), can be interpreted as the synthetic notation of a conjunction statement (maximize jointly all objectives). The result of this aggregation, $D$, can be viewed as a fuzzy intersection of all fuzzy objectives, $J_i$, $i = 1, \ldots, I$, and is still a fuzzy set.
The final degree-of-satisfaction resulting from a given control action, $\mu_D(x(t_f))$, over $t \in [0, t_f]$ can be determined by aggregating the degrees-of-satisfaction for all objectives, $\mu_{J_i}(J_i(x(t_f)))$, via a specific t-norm, $T$:
\[
\mu_D(x(t_f)) = T\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{7}
\]
The fundamental properties for a fuzzy set and the related operators can be found in Reference [20]. Once the firing level for each control policy is determined by the above procedure, the best control policy $u^*(t)$ with the maximal firing level, $\mu_D^*(u(t))$, over $t \in [0, t_f]$ can be selected:
\[
\mu_D^*(x(t_f), u^*(t)) = \max_{u(t)\in\Omega} \mu_D(x(t_f)) = \max_{u(t)\in\Omega} T\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{8}
\]
Two well-known t-norms are discussed here.
1. $T$ = algebraic product
\[
\mu_D^*(x(t_f), u^*(t)) = \max_{u(t)\in\Omega} \prod_{i=1}^{I} \mu_{J_i}(J_i(x(t_f))) \tag{9}
\]
2. $T$ = Zadeh-minimum
\[
\mu_D^*(x(t_f), u^*(t)) = \max_{u(t)\in\Omega} \min\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{10}
\]
Notably, an equivalent expression when the Zadeh-minimum is used as the t-norm is
\[
\max_{u(t)\in\Omega^+} \alpha \tag{11}
\]
where $\Omega^+ = \Omega \cap \{\mu_{J_i}(J_i(x(t_f))) \ge \alpha,\ i = 1, \ldots, I\}$.
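The difference between the two t-norms in Equations (9) and (10) can be seen on a small example. The following Python sketch is only an illustration: the helper names and the two candidate membership vectors are made up here, and the search over $\Omega$ is replaced by a choice between two fixed candidates.

```python
import math
from typing import Callable, Dict, Sequence


def aggregate_product(memberships: Sequence[float]) -> float:
    """Algebraic-product t-norm used in Equation (9)."""
    return math.prod(memberships)


def aggregate_min(memberships: Sequence[float]) -> float:
    """Zadeh-minimum t-norm used in Equation (10)."""
    return min(memberships)


def best_policy(candidates: Dict[str, Sequence[float]],
                t_norm: Callable[[Sequence[float]], float]) -> str:
    """Return the candidate whose aggregated firing level is largest."""
    return max(candidates, key=lambda name: t_norm(candidates[name]))


if __name__ == "__main__":
    # Hypothetical membership vectors for two candidate control policies.
    candidates = {"A": [0.6, 0.6, 0.6], "B": [0.9, 0.4, 0.9]}
    print("min picks", best_policy(candidates, aggregate_min))
    print("product picks", best_policy(candidates, aggregate_product))
```

With these made-up numbers the minimum favors the balanced candidate A, while the product favors candidate B, which trades one low membership for two high ones.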
4. RELATED MATHEMATICAL THEOREMS
Since the original problem, Equation (2), has been recast as another multi-objective problem, Equation (5), which is in turn converted by the t-norm into a new single objective problem, Equation (8), it is necessary to ensure that the solution from this procedure is Pareto optimal for Equation (2). Since the original problem may be non-convex, only local Pareto optimality is guaranteed if a local optimizer is applied [9]. Should the global optimum be emphasized for non-convex non-linear optimal control problems, one can refer to recent work by Esposito and Floudas [21]. However, the calculations involved can be hard to unravel. Since Equation (5) is also a multi-objective problem, it makes sense to give the following definition [10]:
Definition 2
$u^* \in \Omega$ is said to be (local) M-Pareto optimal for Equation (2) if and only if it is (local) Pareto optimal for Equation (5), that is, there exists no $u$ such that $\mu_{J_i}(u) \ge \mu_{J_i}(u^*)$ $\forall i$, with strict inequality for some $i$, where M refers to membership.
Note that the set of (local) Pareto optimal solutions is a subset of the set of (local) M-Pareto optimal solutions. Using the above definitions, it is clear that only (local) M-Pareto solutions can be looked for as Equation (2) is solved through Equation (5) and a local optimizer is used. Thus, it is quite natural to question
1. How can we guarantee a (local) M-Pareto solution for Equation (2) when the t-norm is used?
2. Under what conditions is a (local) M-Pareto solution a (local) Pareto solution?
To answer these queries, the following propositions [22] can be applied:
Proposition 1
Let $T$ be an arbitrary t-norm. If $u^*$ is the unique (local) optimal solution for Equation (8), then $u^*$ is (local) M-Pareto.
Proposition 2
If $u^*$ is a (local) optimal solution for Equation (8) such that $\mu_{J_i} \ne 0$ $\forall i$, and $T$ is strictly monotonous, then $u^*$ is (local) M-Pareto for Equation (2).
Based on Propositions 1 and 2, the following corollaries arise:
Corollary 1
A (local) optimal solution for Equation (9), say $u^*$, is (local) M-Pareto for Equation (2) if either it is the only (local) solution or $\mu_{J_i}[u^*] \ne 0$ $\forall i$, since the product is a strictly monotonous t-norm.
Corollary 2
A (local) optimal solution from Equation (10) or Equation (11) will be (local) M-Pareto when it is unique, because the Zadeh-min is not strictly monotonous.
Based on these corollaries and the property of the product operator, as the control policy causes some objectives to deviate from their own optimum to an unacceptable extent, membership values for these objectives will become zero. At that time, this control policy will be rejected. Thus, as the product is used for solving Equation (8), any meaningful solution will be (local) M-Pareto. On the other hand, when Zadeh-min is used as the aggregate, a test for the uniqueness of the (local) optimal solution for Equation (10) or Equation (11) is unavoidable to ensure the (local) M-Pareto solution. Furthermore, as strictly decreasing functions are used to calculate membership value for each objective, the solution with (local) Pareto optimality can be found from the following propositions:
Proposition 3
Let $u^*$ be (local) M-Pareto for Equation (2) with $\mu_{J_i} \in (0, 1)$ $\forall i$. If all of the $\mu_{J_i}$ functions are strictly decreasing, then $u^*$ is (local) Pareto optimal for Equation (2).
Proposition 4
Let $u^*$ be a (local) optimal solution for Equation (8) with all of the $\mu_{J_i}$ being strictly decreasing. Then $u^*$ is (local) Pareto optimal for Equation (2) if either
1. $u^*$ is the unique (local) solution or
2. $T$ is strictly monotonous and $\mu_{J_i} \in (0, 1)$ $\forall i$.
From the above discussions, if Zadeh-min is selected as the t-norm, the key point that guarantees a (local) Pareto solution is to determine the unique optimal solution. To achieve this, two strategies can be utilized. The first is to employ a global optimization technique to determine the solution [21]. However, it is difficult to verify whether the solution is the only global optimum or not. The second strategy is to employ the augmented minimax algorithm [10]. The merit of this method is that it circumvents the necessity for testing the uniqueness of the solution by modifying the objective function. Based on this algorithm, Equation (10) can be reformulated as follows:
\[
\min_{u(t)\in\Omega} \left\{ \max_{\forall i} [\bar{\mu}_i - \mu_{J_i}(J_i)] + \rho \sum_{i=1}^{I} [\bar{\mu}_i - \mu_{J_i}(J_i)] \right\} \tag{12}
\]
or equivalently
\[
\min_{u(t)\in\Omega} \omega \quad \text{s.t.} \quad \bar{\mu}_i - \mu_{J_i}(J_i) \le \omega - \rho \sum_{i=1}^{I} [\bar{\mu}_i - \mu_{J_i}(J_i)], \quad i = 1, \ldots, I \tag{13}
\]
where $\bar{\mu}_i$ is the reference membership level determined by the DM, and $\rho$ is a sufficiently small positive constant. For such a modification, we can apply the following theorem, which is similar to the work of Sakawa et al., to ensure the properties of the solution:
Theorem 1
Let $u^*$ be a (local) optimal solution for Equation (12) or Equation (13) for some $\bar{\mu}_i$, $i = 1, \ldots, I$. Then $u^*$ is a (local) Pareto optimal solution of Equation (2).
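The effect of the augmentation term in Equation (12) can be illustrated numerically. The sketch below is an assumption-laden illustration, not the authors' implementation: the function name `augmented_minmax` and the two membership vectors are hypothetical, while the defaults $\bar{\mu}_i = 1.0$ and $\rho = 10^{-3}$ follow the default values mentioned in Section 5.

```python
from typing import Optional, Sequence


def augmented_minmax(memberships: Sequence[float],
                     reference: Optional[Sequence[float]] = None,
                     rho: float = 1e-3) -> float:
    """Value of the bracketed objective in Equation (12); smaller is better."""
    if reference is None:
        reference = [1.0] * len(memberships)  # reference levels of 1.0
    gaps = [ref - mu for ref, mu in zip(reference, memberships)]
    return max(gaps) + rho * sum(gaps)


if __name__ == "__main__":
    # Two hypothetical candidates with the same worst membership (0.6).
    balanced = [0.6, 0.6, 0.6]
    better_elsewhere = [0.6, 0.8, 0.9]
    # Plain max-min cannot separate them ...
    print(min(balanced), min(better_elsewhere))
    # ... but the augmented objective prefers the second candidate.
    print(augmented_minmax(balanced), augmented_minmax(better_elsewhere))
```

Because the sum term rewards improvements in the objectives that are not the worst one, ties of the plain max–min criterion are broken in favor of the non-dominated candidate.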
5. THE ITERATIVE DYNAMIC PROGRAMMING (IDP) INCORPORATING FUZZY DECISION
Methods for solving dynamic optimization problems can be classified into three major categories: (1) the variation-based approach, (2) the non-linear programming (NLP) approach and (3) dynamic programming. The variation-based methods, a direct application of Pontryagin’s minimum principle, transform the original problem into a two-point boundary-value problem (TPBVP). Solving the resulting TPBVP is usually a complex and extremely difficult task, however. The NLP approaches, including complete parameterization [1, 3] and control parameterization [2, 5, 6, 19], transform the dynamic optimization problem into a general non-linear optimization problem. Any standard NLP technique can then be used to determine the solution. All of these parameterization methods exhibit convergence difficulties, and it is difficult to obtain a global optimum because of the highly non-linear, multi-modal, and/or discontinuous nature of these systems. The direct use of dynamic programming in solving dynamic optimization problems is usually difficult. Two major barriers discourage the use of dynamic programming. The first is the problem of setting up the grid values for the state and control. To produce a meaningful result, the state grids must be sufficiently fine. At each time stage, therefore, a large number of integrations must be performed for each state grid and each allowable control value. A greater problem arises when the trajectory calculated for a particular grid point does not meet a grid point at the next time step. Under such conditions, interpolation can be used, but the resulting approximation is usually unreliable.
Recently, Luus [4] proposed the so-called iterative dynamic programming (IDP) to alleviate the computational burdens of the original dynamic programming. By using the accessible grid points and a region-reduction strategy, the IDP can successfully overcome the curse of dimensionality, and its computational effectiveness has been elucidated in many reports, such as Reference [23]. Although IDP is not a deterministic global optimizer that can guarantee convergence to the global solution, it has many advantages. First, it is easy to implement and no gradient information is required. Second, the probability of obtaining the global solution is high and can be made even higher by increasing the parameter resolution and by repeating the optimization process with several different initial trials. These properties are very practical in real industrial environments. One of the best inherent properties of dynamic programming is that it is easy to extend the algorithm to parallel or distributed computation. As the computation effort is heavily related to the number of state grids, it can be reduced significantly if multiple searching loops work simultaneously. Therefore, the barrier of numerous integrations is overcome by the parallel version of IDP, proposed by Hartig et al. [24]. In this article, only the regular version is demonstrated because the major purpose of this paper is to illustrate the fuzzy decision making.
In order to apply the IDP to solve our problem, Equation (8), we first divide the entire time horizon into $P$ time stages of length $T$, $P = t_f/T$, and further assume that the control action within each duration $T$ is kept constant. The original infinite-dimensional problem can then be put into the following finite-dimensional form:
\[
\max_{u_i\in\tilde{\Omega}_d} T\{\mu_{J_1}(J_1(x(t_f))), \ldots, \mu_{J_I}(J_I(x(t_f)))\} \tag{14}
\]
where $\tilde{\Omega}_d$ is the feasible searching space defined as
\[
\tilde{\Omega}_d = \{u_0, \ldots, u_{P-1} \mid h(x(t),u(t)) = \dot{x}(t) - f(x(t),u(t)) = 0,\ x(0) = x_0,\ g(x(t),u(t)) \le 0,\ \underline{u} \le u(t) = u_i \le \bar{u}\ \ \forall t \in [iT, iT + T],\ i = 0, \ldots, P-1\}
\]
The multi-objective dynamic optimization problem can now be regarded as finding a series of piecewise constant control actions $u_i$, $i = 0, \ldots, P-1$, with the highest firing level to attain the best compromise between the objectives, while the state equations and related constraints are all satisfied.
The IDP-based algorithm for the MODOP can be stated as follows:
* Preliminary phase (I): generation of state grid points
1. Divide the time horizon $t_f$ into $P$ time stages, each of length $T = t_f/P$.
2. Choose the number of state grid points, $N$, the number of allowable control values, $M$, and the testing region $r$ for the control action.
3. By choosing $M$ sets of control actions at the time stages, integrate the dynamic equation $M$ times to generate the state grids at the time stages.
* Preliminary phase (II): generation of payoff table
1. Determine the optimal value for each single objective, $J_i^*$, and record the corresponding control policy, $u_i^*$. Meanwhile record the other objective values when applying $u_i^*$, i.e. $J_s$, $s \ne i$, $s = 1, \ldots, I$.
2. Construct the payoff table (Table I).
3. Let $J_i^\ell = J_i^*(u_i^*)$ and $J_i^u = \max_{\forall s}\{J_i(u_s^*)\}$. Determine the membership function $F_{J_i}$ for each objective.
4. Choose a suitable t-norm to perform the aggregation. If the Zadeh-min is used, the objective function must be modified and the $\bar{\mu}_i$ and $\rho$ values assigned. The default values for $\bar{\mu}_i$ and $\rho$ can be set to 1.0 and $10^{-3}$, respectively.
* Search by the IDP:
1. Begin at $(P-1)T$, corresponding to time $t_f - T$. Integrate the dynamic equation from time $t_f - T$ to $t_f$, starting from each grid point and using each of the $M$ allowable control values.
2. Calculate membership values for all objectives at $t_f$ attained by the $M$ control actions at each grid point.
3. Aggregate the membership values of the objectives via the selected t-norm operator to calculate firing levels for the control actions.
4. Select the control action with the highest firing level as the best choice for the state grid and record it.
5. Step back one stage, corresponding to time $t_f - 2T$. Integrate the dynamic equation from time $t_f - 2T$ to $t_f$ for each state grid with the $M$ allowable control actions. To continue the integration from $t_f - T$ to $t_f$, use the best control action for the closest state grid point at $t_f - T$.
6. Re-calculate membership values for all objectives at $t_f$, since the state trajectory at $t_f$ does not exactly meet the target value obtained from the previous computation. Aggregate over the objectives to calculate the firing level for each control value used at this stage.
7. Select the control action with the highest firing level as the best choice.
8. Continue the above procedure until $t = 0$. Store the control actions with maximal fuzzy decisions and store the corresponding state trajectory.
9. Reduce the region for the state grid and the region for allowable control values using a contracting factor $\varepsilon$, i.e.
\[
r^{(j+1)} = (1 - \varepsilon)\, r^{(j)} \tag{15}
\]
where $j$ is the iteration index. Use the optimal state trajectory from the previous search as the midpoint for the state grid at each time stage. Use the optimal control policy from the previous steps as the midpoint for the allowable values for control.
10. Increment the iteration index $j$ by 1 and go back to step 1. Continue the iteration for a specific number of iterations, such as 20, and examine the results.
Table I. The payoff table for a standard multi-objective optimization problem.
\[
\begin{array}{c|ccccc}
 & J_1(u_i^*) & \cdots & J_s(u_i^*) & \cdots & J_I(u_i^*) \\ \hline
J_i(u_1^*) & J_1^*(u_1^*) & \cdots & J_s(u_1^*) & \cdots & J_I(u_1^*) \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
J_i(u_s^*) & J_1(u_s^*) & \cdots & J_s^*(u_s^*) & \cdots & J_I(u_s^*) \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
J_i(u_I^*) & J_1(u_I^*) & \cdots & J_s(u_I^*) & \cdots & J_I^*(u_I^*)
\end{array}
\]
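The region-reduction step of Equation (15) and the choice of allowable control values around the previous best policy can be sketched as follows. This is a minimal illustration under stated assumptions: the contraction factor 0.1, the initial region of 25, the evenly spaced candidate values and both function names are hypothetical choices made here, since the text does not specify them.

```python
from typing import List


def contract_region(r0: float, eps: float, iterations: int) -> List[float]:
    """Region sizes r(0), r(1), ... obtained from r(j+1) = (1 - eps) * r(j)."""
    sizes = [r0]
    for _ in range(iterations):
        sizes.append((1.0 - eps) * sizes[-1])
    return sizes


def candidate_controls(center: float, r: float, m: int,
                       lower: float, upper: float) -> List[float]:
    """m evenly spaced values in [center - r, center + r], clipped to the bounds.

    Even spacing is an assumption of this sketch, not a statement of the paper.
    """
    if m == 1:
        values = [center]
    else:
        step = 2.0 * r / (m - 1)
        values = [center - r + k * step for k in range(m)]
    return [min(max(v, lower), upper) for v in values]


if __name__ == "__main__":
    sizes = contract_region(r0=25.0, eps=0.1, iterations=20)  # eps is assumed
    print(sizes[0], sizes[-1])
    # Five candidate temperatures centred on 245 within the 220-270 bounds.
    print(candidate_controls(245.0, sizes[0], 5, 220.0, 270.0))
```

With the assumed factor, 20 iterations shrink the region to roughly one eighth of its initial size, which indicates how quickly the search concentrates around the current best policy.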
Notably, the number of required integrations ($N_T$) for each iteration can be worked out analytically as follows:
\[
N_T = M P \left[ 1 + \frac{N(P-1)}{2} \right] \tag{16}
\]
One can estimate the overall computation time if M; N; P and the number of iterations are given. Though the proposed method requires huge resources of computer processing time, it has a significant potential to solve real industrial problems due to its easy implementation. The feasibility and superiority of the IDP for solving the MODOP is demonstrated in next section.
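As a quick check of the computational load, the integration count can be evaluated for the settings used later in the examples ($N = 49$, $M = 5$, $P = 10$, 30 iterations). The sketch assumes the count takes the form $N_T = MP[1 + N(P-1)/2]$, which corresponds to $MP$ stage integrations from the single initial state plus $NM$ trajectories of 1 to $P-1$ stages from the grid points; the function name is hypothetical.

```python
def integrations_per_iteration(n_grid: int, m_controls: int, p_stages: int) -> int:
    """N_T = M*P*[1 + N*(P-1)/2], written in exact integer arithmetic."""
    from_initial_state = m_controls * p_stages
    # P*(P-1) is always even, so the integer division below is exact.
    from_grid_points = n_grid * m_controls * p_stages * (p_stages - 1) // 2
    return from_initial_state + from_grid_points


if __name__ == "__main__":
    per_iteration = integrations_per_iteration(n_grid=49, m_controls=5, p_stages=10)
    print(per_iteration)        # stage integrations in one IDP iteration
    print(30 * per_iteration)   # total for 30 iterations
```

Multiplying this count by the measured cost of one stage integration gives a rough estimate of the run time for a given choice of $M$, $N$ and $P$.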
6. NUMERICAL EXAMPLES
Two chemical processes are used to demonstrate the proposed algorithm. The first example concerns the multi-objective dynamic optimization of the non-vaporized Nylon-6 polymerization in a batch process. The second is the production of penicillin G via fed-batch fermentation. We use a Pentium-100 personal computer, and the DVERK and LSODI integration packages, for computation.
6.1. Example 1: Nylon 6 polymerization problem
The optimization of Nylon 6 polymerization has drawn considerable attention in the past decade. Ray and Gupta [25] explored the optimum temperature profiles using the minimum principle for the respective single objectives. Recently, Wajge and Gupta [26] applied the so-called surrogate worth trade-off (SWT) method to study the operation under two objective functions. Based on the mass and moment equations proposed by Ramagopal et al. [27], we attempt to find the temperature profiles $T(t)$ which simultaneously optimize the following three objectives:
min $J_1$ = concentration of unreacted monomer in product, $[C_1]_{t_f}$
min $J_2$ = concentration of undesirable cyclic compounds (primarily, cyclic dimer) in product, $[C_2]_{t_f}$ (mol/kg)
min $J_3$ = reaction time, $t_f$ (h)
Meanwhile, the following constraints are considered:
1. To ensure a single-phase polymerization, the control variable (temperature) should be limited to the range between 220 and 270 °C:
\[
220\,^\circ\mathrm{C} \le T(t) \le 270\,^\circ\mathrm{C} \tag{17}
\]
2. To guarantee processing properties (number average molecular weight $M_N(t_f)$ between $1.4 \times 10^4$ and $2 \times 10^4$), the stopping criterion sets the number average chain length of the polymer, $CL_N(t_f)$, to 140:
\[
CL_N(t_f) = 140 \tag{18}
\]
Under this condition, the number average molecular weight $M_N(t_f)$ is about $1.6 \times 10^4$.
3. To smooth the control profile and facilitate implementation, the change in temperature between successive time intervals should not exceed 20 °C:
\[
|T(t_k) - T(t_{k-1})| \le 20\,^\circ\mathrm{C} \tag{19}
\]
Furthermore, each time interval must be longer than 15 min:
\[
t_k - t_{k-1} > 15\ \text{min} \tag{20}
\]
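The path constraints (17), (19) and (20) are easy to verify for a candidate piecewise-constant temperature profile. The following sketch is illustrative only: the function name, the argument layout (switching times in hours, one temperature per switching time) and the sample profiles are assumptions, and the chain-length stopping criterion (18) is not checked because it requires integrating the polymerization model.

```python
from typing import List, Sequence


def check_profile(times_h: Sequence[float], temps_c: Sequence[float],
                  t_min: float = 220.0, t_max: float = 270.0,
                  max_step: float = 20.0, min_interval_min: float = 15.0) -> List[str]:
    """Return a list of violated constraints (empty if the profile is admissible)."""
    violations = []
    for k, temp in enumerate(temps_c):
        if not t_min <= temp <= t_max:                      # Equation (17)
            violations.append(f"stage {k}: temperature {temp} out of bounds")
    for k in range(1, len(temps_c)):
        if abs(temps_c[k] - temps_c[k - 1]) > max_step:     # Equation (19)
            violations.append(f"stage {k}: temperature change exceeds {max_step}")
        interval_min = (times_h[k] - times_h[k - 1]) * 60.0
        if interval_min <= min_interval_min:                # Equation (20)
            violations.append(f"stage {k}: interval of {interval_min:.1f} min too short")
    return violations


if __name__ == "__main__":
    # Hypothetical profiles: the first is admissible, the second is not.
    print(check_profile([0.0, 1.0, 2.0], [270.0, 250.0, 230.0]))
    print(check_profile([0.0, 0.2], [270.0, 240.0]))
```

A check of this kind can be used to discard candidate control values before any integration is performed.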
In this example, we set $N = 49$, $M = 5$, $P = 10$ and use 30 iterations for each optimization run. With the same physical data, initial conditions and tolerance as those of Wajge and Gupta [26], we first determine the optimal values of the considered objectives to establish the payoff table, as shown in Table II. The results in Table II show that:
* When the polymerization temperature is maintained at its highest value, 270 °C, the concentration of unreacted monomer is kept at the lowest level and the reaction time is the shortest. The cyclic compound concentration is increased, however.
* The reaction temperature should be kept at the lowest value, 220 °C, to suppress the cyclic compound concentration.
From the above results, the control profiles can be qualitatively divided into three stages:
Stage 1: Initializing the reaction with the highest temperature to promote the monomer conversion.
Stage 2: Keeping the temperature at its lowest level to keep the concentration of undesired product as low as possible.
Stage 3: Re-increasing the temperature to raise the product concentration.
Table II. Payoff table for Nylon 6 batch polymerization.
T(t)                    J1(ui*) = [C1]tf    J2(ui*) = [C2]tf    J3(ui*) = tf
Ji(u1* = 270 °C)        1.9005              0.01202             6.071
Ji(u2* = 220 °C)        2.2430              0.00735             35.33
Ji(u3* = 270 °C)        1.9005              0.01202             6.071
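The rule of the preliminary phase, $J_i^\ell = J_i^*(u_i^*)$ and $J_i^u = \max_s J_i(u_s^*)$, can be applied directly to the entries of Table II. The short Python sketch below does so; the function name `bounds_from_payoff` and the list-of-rows layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Rows: policies u1*, u2*, u3*; columns: J1 = [C1]_tf, J2 = [C2]_tf, J3 = t_f.
PAYOFF_TABLE_II = [
    [1.9005, 0.01202, 6.071],   # u1* (270 C)
    [2.2430, 0.00735, 35.33],   # u2* (220 C)
    [1.9005, 0.01202, 6.071],   # u3* (270 C)
]


def bounds_from_payoff(table: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Lower bounds from the diagonal, upper bounds from the column maxima."""
    n = len(table)
    lower = [table[i][i] for i in range(n)]
    upper = [max(row[i] for row in table) for i in range(n)]
    return lower, upper


if __name__ == "__main__":
    lower, upper = bounds_from_payoff(PAYOFF_TABLE_II)
    print("J_low :", lower)
    print("J_high:", upper)
```

The resulting pairs are the interval end points that appear in the linear membership functions of Condition-1.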
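Based on Table II, we set up the following linear membership functions, designated as Condition-1, to express our preference on each objective.
\[
\mu_{J_1} =
\begin{cases}
1 & \text{if } J_1 < 1.9005 \\
\dfrac{2.2430 - J_1}{2.2430 - 1.9005} & \text{if } 1.9005 \le J_1 \le 2.2430 \\
0 & \text{if } J_1 > 2.2430
\end{cases} \tag{21}
\]
\[
\mu_{J_2} =
\begin{cases}
1 & \text{if } J_2 < 0.00735 \\
\dfrac{0.01202 - J_2}{0.01202 - 0.00735} & \text{if } 0.00735 \le J_2 \le 0.01202 \\
0 & \text{if } J_2 > 0.01202
\end{cases} \tag{22}
\]
\[
\mu_{J_3} =
\begin{cases}
1 & \text{if } J_3 < 6.071 \\
\dfrac{35.330 - J_3}{35.330 - 6.071} & \text{if } 6.071 \le J_3 \le 35.330 \\
0 & \text{if } J_3 > 35.330
\end{cases} \tag{23}
\]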
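The linear membership functions of Condition-1 share one form, which can be written as a single helper. The sketch below is an illustration with a hypothetical function name; the sample objective values are the concentrations reported for the min operator under Condition-1.

```python
def linear_membership(j: float, j_low: float, j_high: float) -> float:
    """Linearly decreasing membership: 1 at or below j_low, 0 at or above j_high."""
    if j <= j_low:
        return 1.0
    if j >= j_high:
        return 0.0
    return (j_high - j) / (j_high - j_low)


if __name__ == "__main__":
    # Concentrations reported for the min operator under Condition-1.
    mu_1 = linear_membership(2.0339, 1.9005, 2.2430)     # Equation (21)
    mu_2 = linear_membership(0.00917, 0.00735, 0.01202)  # Equation (22)
    print(round(mu_1, 2), round(mu_2, 2))
```

Both values round to the membership level of 0.61 quoted for the min operator, which is a useful consistency check when transcribing the bounds.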
Here, both the min and product operators are used for fuzzy aggregation to explore their effect on the final results, as shown in Table III. Table III illustrates that the min operator tends to distribute the preference evenly: the third column of the table shows that the membership values are all equal to 0.61. The product, however, is more inclined to concentrate the preference on some objectives: the fifth column shows that the membership values for $J_1$ and $J_3$ are larger than that of $J_2$. Control profiles for Condition-1 are shown in Figure 1. Notably, applying the min operator in aggregation does not guarantee that the final result is (local) Pareto optimal. We therefore modify the max–min type objective function into the augmented min–max type objective function. The simulation results for this modification are listed in Table IV. Comparing the results in Tables III and IV, we find that although we have replaced the max–min type objective function with the augmented min–max type function, the final results do not change significantly. A possible reason is that IDP determines the unique solution for the former objective function. Nevertheless, to ensure (local) Pareto optimality, we still recommend using the augmented min–max type objective function.
Table III. Comparisons between min and product operator under Condition-1 (Example 1).
              min                     product
              Value      μ_J          Value      μ_J
[C1]tf        2.0339     0.61         1.9634     0.8164
[C2]tf        0.00917    0.61         0.00989    0.4563
Now considering a practical operation (8 h per day), the maximal allowable operating time is changed from 35.330 to 15 h (Condition-2). The membership function for operating time becomes
\[
\mu_{J_3} =
\begin{cases}
1 & \text{if } J_3 < 6.071 \\
\dfrac{15 - J_3}{15 - 6.071} & \text{if } 6.071 \le J_3 \le 15 \\
0 & \text{if } J_3 > 15
\end{cases} \tag{24}
\]
For this modification, we again use the min and the product operators to perform the aggregation at the beginning. The results are listed in Table V and are similar to those for Condition-1. The control profiles are shown in Figure 2. We then replaced the max–min type objective function by the augmented min–max objective function. The simulation results are shown in Table VI.
Table IV. Results for augmented min–max type function under Condition-1 (Example 1).
              Aug. min–max (ρ = 10^-3)     Aug. min–max (ρ = 10^-5)
              Value      μ_J               Value      μ_J
[C1]tf        2.0342     0.61              2.0341     0.61
[C2]tf        0.00917    0.61              0.00917    0.61
tf            17.441     0.61              17.478     0.61
Figure 1. Temperature profiles for Condition-1 (Example 1).
Table V. Comparisons when using min and product under Condition-2 (Example 1).
              min                     product
              Value      μ_J          Value      μ_J
[C1]tf        2.0751     0.49         1.9917     0.7338
[C2]tf        0.00973    0.49         0.01056    0.3126
tf            10.624     0.49         8.747      0.7003
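The entries of Table V can be cross-checked against the Condition-2 membership functions, and the two solutions can be compared under both aggregation operators. The Python sketch below is only a verification aid; the names `BOUNDS`, `SOLUTIONS` and `memberships` are hypothetical, while the numbers are copied from Equations (21), (22) and (24) and from Table V.

```python
import math
from typing import List, Sequence


def linear_membership(j: float, j_low: float, j_high: float) -> float:
    """Linearly decreasing membership between j_low (value 1) and j_high (value 0)."""
    if j <= j_low:
        return 1.0
    if j >= j_high:
        return 0.0
    return (j_high - j) / (j_high - j_low)


# (J_low, J_high) for [C1]_tf, [C2]_tf and t_f under Condition-2.
BOUNDS = [(1.9005, 2.2430), (0.00735, 0.01202), (6.071, 15.0)]

# Objective values ([C1]_tf, [C2]_tf, t_f) copied from Table V.
SOLUTIONS = {
    "min": (2.0751, 0.00973, 10.624),
    "product": (1.9917, 0.01056, 8.747),
}


def memberships(objectives: Sequence[float]) -> List[float]:
    return [linear_membership(j, lo, hi) for j, (lo, hi) in zip(objectives, BOUNDS)]


if __name__ == "__main__":
    for name, objectives in SOLUTIONS.items():
        mu = memberships(objectives)
        print(name, [round(m, 4) for m in mu],
              "min =", round(min(mu), 4), "product =", round(math.prod(mu), 4))
```

Each solution scores higher than the other under the operator that produced it, which is consistent with the observation that the min operator balances the memberships while the product concentrates the preference on some objectives.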
6.2. Example 2: Penicillin G fed-batch fermentation The simplified Heijnen’s model [28] was considered:
dS dt ¼ sX þ uðtÞ dX dt ¼ nX dP dt ¼ pX khP dG dt ¼ 1 Cs;in uðtÞ 0:0008G dCs dt ¼ 1 G dS dt Cs G dG dt ð25Þ
In these equations, S; X and P denote the amount of substrate (glucose), cell mass and product (penicillin) in broth, respectively. G is the total broth weight. u is substrate feed rate (mole/hr). Cs;in and Cs designate the glucose concentration in feed and broth, respectively. kh is the
penicillin hydrolysis constant. The specific rates s; n; and p are modelled as follows. 1. The specific substrate rate s is modelled by a Monod-type relationship.
s ¼ Qs;max
Cs
Ksþ Cs
ð26Þ where Qs;maxdenotes the maximum specific sugar uptake rate and Ksis the Monod constant for
the sugar uptake.
Figure 2. Temperature profiles for Condition-2 (Example 1).
Table VI. Results for augmented min–max type function under Condition-2 (Example 1).
            Aug. min–max (r = 10^3)    Aug. min–max (r = 10^5)
            Value      μ_J             Value      μ_J
[C1]tf      2.0759     0.49            2.0752     0.49
[C2]tf      0.00974    0.49            0.00973    0.49
2. The (overall) specific growth rate $\nu$ is given by

$$
\nu = Y_{X/S}\left(\sigma - m - \frac{\pi}{Y_{P/S}}\right)
\tag{27}
$$

where $m$ is the overall specific maintenance demand, $Y_{X/S}$ is the biomass-on-substrate yield coefficient and $Y_{P/S}$ is the product-on-substrate yield coefficient.

3. The specific production rate $\pi$ is assumed to be directly coupled with the specific growth rate $\nu$, following a Blackman-type relation,

$$
\pi(\nu) = Q_{p,max}
\begin{cases}
\nu / \nu_{crit} & \text{for } \nu \le \nu_{crit} \\
1 & \text{for } \nu \ge \nu_{crit}
\end{cases}
\tag{28}
$$

where $Q_{p,max}$ denotes the maximal specific production rate and $\nu_{crit}$ is the critical specific growth rate.
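Because Eq. (27) defines the growth rate in terms of the production rate while Eq. (28) defines the production rate in terms of the growth rate, the two relations have to be solved together before the model of Eq. (25) can be evaluated. The Python sketch below shows one way to do this in closed form, branch by branch; it is an illustration under stated assumptions rather than the authors' code. The constants are those of Table VII, except `K_S` and `NU_CRIT`, which are not given in this section and are set to hypothetical placeholder values. Negative growth rates are passed through the linear branch unchanged, which is also an assumption.

```python
# Constants from Table VII. K_S and NU_CRIT are NOT given in this section;
# the values below are hypothetical placeholders for illustration only.
Q_S_MAX = 0.0245   # mol/(mol dry weight h)
Q_P_MAX = 3.3e-4   # mol/(mol dry weight h)
M_MAINT = 0.0034   # mol/(mol dry weight h)
Y_XS = 3.67        # mol dry weight/mol
Y_PS = 0.46        # mol/mol
K_H = 0.002        # 1/h
CS_IN = 1 / 0.36   # mol/kg
K_S = 1.0e-4       # mol/kg, hypothetical
NU_CRIT = 0.01     # 1/h, hypothetical

def sigma(cs):
    """Eq. (26): Monod-type specific substrate uptake rate."""
    return Q_S_MAX * cs / (K_S + cs)

def pi_of_nu(nu):
    """Eq. (28): Blackman-type specific production rate."""
    return Q_P_MAX * (nu / NU_CRIT if nu <= NU_CRIT else 1.0)

def nu_of_sigma(sig):
    """Solve Eqs. (27) and (28) together for the specific growth rate."""
    # Linear branch: substitute pi = Q_P_MAX * nu / NU_CRIT into Eq. (27).
    nu_lin = Y_XS * (sig - M_MAINT) / (1.0 + Y_XS * Q_P_MAX / (Y_PS * NU_CRIT))
    if nu_lin <= NU_CRIT:
        return nu_lin
    # Saturated branch: pi = Q_P_MAX.
    return Y_XS * (sig - M_MAINT - Q_P_MAX / Y_PS)

def rhs(state, u):
    """Right-hand side of Eq. (25); state = (S, X, P, G, Cs), u in mol/h."""
    S, X, P, G, cs = state
    sig = sigma(cs)
    nu = nu_of_sigma(sig)
    dS = -sig * X + u
    dX = nu * X
    dP = pi_of_nu(nu) * X - K_H * P
    dG = u / CS_IN - 0.0008 * G
    dcs = dS / G - cs / G * dG
    return (dS, dX, dP, dG, dcs)

# Derivatives at the initial state of Table VII with the maximal feed rate.
x0 = (5500.0, 4000.0, 0.0, 1.0e5, 0.055)
print(rhs(x0, u=2000.0))
```

The two branches are consistent at the switching point: the linear solution equals `NU_CRIT` exactly when the saturated solution does, so the growth rate is continuous in the substrate uptake rate.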
The initial conditions and the related physical data are shown in Table VII. The glucose feed rate is limited to at most 2000 mol/h, and the total amount of glucose fed during the operation is fixed at $2 \times 10^5$ mol:

$$
0 \le u(t) \le 2000 \ \text{mol/h}
\tag{29}
$$

$$
\int_0^{t_f} u(t)\, dt = 2 \times 10^5 \ \text{mol}
\tag{30}
$$
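Constraints (29) and (30) together tie the final time to the feed profile. As a minimal Python sketch, assume the feed is piecewise constant over stages of equal length (an assumption made here for illustration; the stage count below is hypothetical and the function name is not from the paper). The integral constraint then fixes the final time once the stage feed rates are chosen.

```python
U_MAX = 2000.0         # mol/h, upper bound of Eq. (29)
TOTAL_GLUCOSE = 2.0e5  # mol, right-hand side of Eq. (30)

def final_time(stage_feeds):
    """Final time t_f (h) at which equal-length stages with the given
    piecewise-constant feed rates deliver exactly TOTAL_GLUCOSE."""
    if any(u < 0.0 or u > U_MAX for u in stage_feeds):
        raise ValueError("feed rate violates 0 <= u <= U_MAX")
    total_rate = sum(stage_feeds)
    if total_rate <= 0.0:
        raise ValueError("no glucose is fed, so Eq. (30) cannot be met")
    # (t_f / N) * sum(u_k) = TOTAL_GLUCOSE  =>  t_f = N * TOTAL_GLUCOSE / sum(u_k)
    return len(stage_feeds) * TOTAL_GLUCOSE / total_rate

n_stages = 10  # hypothetical stage count, for illustration only
print(final_time([U_MAX] * n_stages))        # feeding at the upper bound throughout
print(final_time([U_MAX / 2.0] * n_stages))  # feeding at half the upper bound
```

Feeding at the upper bound throughout gives the smallest final time the two constraints allow, $2 \times 10^5 / 2000 = 100$ h, which is the shortest operating time quoted for this example.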
Such a fermentation process can be divided into two phases [28]: a first phase of rapid growth with almost no product formation, and a second phase of limited growth during which the product is formed. The optimization problem can thus be formulated as finding the glucose feeding policy $u(t)$ that simultaneously maximizes the final product amount, $J_1 = P(t_f)$, and finishes the operation in the shortest time, $J_2 = t_f$. That is,

$$
J_1 = \max_{u(t)} P(t_f) = -\min_{u(t)} \left[ -P(t_f) \right]
\tag{31}
$$

$$
J_2 = \min_{u(t)} t_f
\tag{32}
$$
Table VII. Initial conditions and constants used in Penicillin G fermentation model.
Symbol    Value                    Symbol     Value
S         5500 mol                 k_h        0.002 1/h
X         4000 mol dry weight      Q_s,max    0.0245 mol/(mol dry weight · h)
P         0 mol                    Q_p,max    3.3 × 10^-4 mol/(mol dry weight · h)
G         1 × 10^5 kg              m          0.0034 mol/(mol dry weight · h)
C_s,in    1/0.36 mol/kg            Y_X/S      3.67 mol dry weight/mol
C_s       0.055 mol/kg             Y_P/S      0.46 mol/mol
As stated previously, the optimum of each single objective must be found first in order to build the payoff table. Using IDP with N = 49, M = 5, P = 10 and 30 iterations, 7878 mol of penicillin G can be produced in 232.3 h of operation if only the maximum production is considered. However, only 3463 mol of product is obtained when the shortest operating time, $t_f = 100$ h, is sought. The payoff table is given in Table VIII. Linear membership functions can then be constructed to depict the change in preference for each individual objective:

$$
\mu_{J_1} =
\begin{cases}
1 & \text{if } J_1 > 7878 \\
\dfrac{J_1 - 3463}{7878 - 3463} & \text{if } 7878 \ge J_1 \ge 3463 \\
0 & \text{if } J_1 < 3463
\end{cases}
\tag{33}
$$

$$
\mu_{J_2} =
\begin{cases}
1 & \text{if } J_2 < 100 \\
\dfrac{232.3 - J_2}{232.3 - 100} & \text{if } 100 \le J_2 \le 232.3 \\
0 & \text{if } J_2 > 232.3
\end{cases}
\tag{34}
$$
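The step from the payoff table to Eqs. (33) and (34) can be written generically: for each objective, the best and worst entries of its payoff column become the end points of a linear membership function. The Python sketch below illustrates this under that reading; the helper names are illustrative, and the candidate outcome evaluated at the end is hypothetical rather than a result from the paper.

```python
# Payoff table (Table VIII): row i holds (J1, J2) under the policy that is
# optimal for objective i alone.
payoff = [
    (7878.0, 232.3),  # policy maximizing J1 = P(tf)
    (3463.0, 100.0),  # policy minimizing J2 = tf
]
maximize = (True, False)  # J1 is maximized, J2 is minimized

def make_membership(column, is_max):
    """Linear membership between the worst and best entries of one payoff column."""
    best = max(column) if is_max else min(column)
    worst = min(column) if is_max else max(column)

    def mu(j):
        level = (j - worst) / (best - worst)
        return min(1.0, max(0.0, level))  # clip to [0, 1] outside the range

    return mu

columns = list(zip(*payoff))
mu_J1, mu_J2 = (make_membership(c, m) for c, m in zip(columns, maximize))

def decision(j1, j2, operator="min"):
    """Aggregated fuzzy decision value for one candidate policy outcome."""
    a, b = mu_J1(j1), mu_J2(j2)
    return min(a, b) if operator == "min" else a * b

# Hypothetical candidate outcome (not a result from the paper).
print(decision(6000.0, 140.0, "min"), decision(6000.0, 140.0, "product"))
```

In the proposed method, the outer search (here IDP) would vary the control policy so as to maximize this aggregated value; the sketch only evaluates it for a given pair of objective values.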
When the min operator is used to perform the fuzzy aggregation, the resulting profiles of the control policy and the specific growth rate are those shown in Figure 3. When the product is used as the aggregating operator, the resulting control policy and specific growth rate are shown in Figure 4. The numerical results obtained with the two operators differ only slightly, as listed in Table IX.
Table VIII. Payoff table for Penicillin G fed-batch fermentation.
            J1(u_i*)    J2(u_i*)
Ji(u_1*)    7878        232.3
Ji(u_2*)    3463        100

7. CONCLUSIONS
An algorithm applying iterative dynamic programming and fuzzy inference to solve multi-objective optimal control problems is proposed. The optimal control policy for each individual objective is first determined sequentially. The payoff table is then established by applying these optimal control policies to each objective. A simple monotonic increasing or decreasing membership function is used to define the degree of satisfaction for each objective function. The final optimal control policy is searched by maximizing the aggregated membership values. Herein, two popular t-norms, the min and the product, have been applied as the fuzzy aggregation operators. By applying the fuzzy aggregation and payoff table approach, one can obtain a single Pareto-optimal solution that best satisfies the decision maker while requiring the least subjective knowledge. This hybrid algorithm is not only easy to implement but also computationally efficient. Two chemical processes, the Nylon 6 batch polymerization and the Penicillin G fed-batch fermentation, are used to demonstrate the feasibility and superiority of the proposed algorithm.
Figure 4. Control policy and specific growth rate (T = product).

Table IX. Comparisons for different t-norms for Penicillin G fed-batch fermentation.
Obj.           T = min    T = product
P(tf) (mol)    6649.6     6716.7

REFERENCES
1. Biegler LT. Solution of dynamic optimization problems by successive quadratic programming and orthogonal collocation. Computers and Chemical Engineering 1984; 8:243–248.
2. Goh CK, Teo KL. Control parameterization: a unified approach to optimal problems with general constraints. Automatica 1988; 24:3–18.
3. Cuthrell JJ, Biegler LT. Simultaneous optimization and solution methods for batch reactor profiles. Computers and Chemical Engineering 1989; 13:49–62.
4. Luus R. Optimal control by dynamic programming using grid points and region reduction. Hungarian Journal of Industrial Chemistry 1989; 17:523–543.
5. Vassiliadis VS, Sargent RWH, Pantelides CC. Solution of a class of multistage dynamic optimization problems. 1. Problems without path constraints. Industrial and Engineering Chemistry Research 1994; 33:2111–2122.
6. Vassiliadis VS, Sargent RWH, Pantelides CC. Solution of a class of multistage dynamic optimization problems. 2. Problems with path constraints. Industrial and Engineering Chemistry Research 1994; 33:2123–2133.
7. Chen CL, Sun DY. Solution of fuzzy dynamic optimization problems by adaptive stochastic algorithm. International Journal of Artificial Intelligence Tools 2000; 9(4):527–535.
8. Choi KY, Butala DN. An experimental study of multi-objective dynamic optimization of a semi-batch copolymerization process. Polymer Engineering and Science 1991; 31(5):353.
9. Miettinen K. Nonlinear Multi-objective Optimization. Kluwer Academic: New York, 1999.
10. Sakawa M. Fuzzy Sets and Interactive Multi-objective Optimization. Plenum Press: New York, 1993.
11. Cohon JL. Multi-objective Programming and Planning. Academic Press: New York, 1985.
12. Zadeh LA. Fuzzy sets. Information and Control 1965; 8:338–353.
13. Bellman RE, Zadeh LA. Decision making in a fuzzy environment. Management Science 1970; 17:141–164.
14. Tanaka H, Okuda T, Asai K. On the fuzzy mathematical programming. Journal of Cybernetics 1974; 3:37–46.
15. Zimmermann HJ. Description and optimization of fuzzy systems. International Journal of General Systems 1976; 2:209–215.
16. Sakawa M, Inuiguchi M, Kato K, Ikeda T. A fuzzy satisfying method for multi-objective linear optimal control problems. Fuzzy Sets and Systems 1996; 78:223–229.
17. Wang FS, Jing CH, Tsao GT. Fuzzy-decision-making problems of fuel ethanol production using a genetically engineered yeast. Industrial and Engineering Chemistry Research 1998; 37:3434–3443.
18. Wang FS, Shieh TL. Extension of iterative dynamic programming to multi-objective optimal control problems. Industrial and Engineering Chemistry Research 1997; 36:2279–2286.
19. Carrasco EF, Banga JR. Dynamic optimization of batch reactors using adaptive stochastic algorithms. Industrial and Engineering Chemistry Research 1997; 36:2252–2261.
20. Klir GJ, Yuan B. Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall: New York, 1995.
21. Esposito WR, Floudas CA. Deterministic global optimization in nonlinear optimal control problems. Journal of Global Optimization 2000; 17:97–126.
22. Delgado M, Verdegay JL, Vila MA. A possibility approach for multi-objective programming problems: efficiency of solutions. In Stochastic Versus Fuzzy Approaches to Multi-objective Mathematical Programming under Uncertainty, Slowinski R, Teghem J (eds). Kluwer Academic: New York, 1991.
23. Luus R. Optimal control of batch reactor by iterative dynamic programming. Journal of Process Control 1994; 4:216–218.
24. Hartig F, Mandel K, Keil FJ. Parallelization of iterative dynamic programming. Periodica Polytechnica Series in Chemical Engineering 1999; 43(1):3–16.
25. Ray AK, Gupta SK. Optimization of non-vaporizing nylon 6 reactors with stopping conditions and end-point constraints. Polymer Engineering and Science 1986; 26:1033–1044.
26. Wajge RM, Gupta SK. Multi-objective dynamic optimization of a nonvaporizing nylon 6 batch reactor. Polymer Engineering and Science 1994; 34:1161–1172.
27. Ramagopal AK, Kumar A, Gupta SK. Optimal temperature profiles for Nylon 6 polymerization in plug-flow reactor. Journal of Applied Polymer Science 1983; 28:2261–2279.
28. van Impe JF. Optimal control of the penicillin G fed-batch fermentation: an analysis of the model of Heijnen et al. Optimal Control Applications and Methods 1994; 15:13–34.