Output Feedback Control of Bilinear Systems via a Bilinear LTR Observer


VII. CONCLUSION

We have derived a density function for an NF-based system. The function is derived for a transformed, smooth vector field that enjoys the navigation properties of the original NF vector field. Under some assumptions, the convergence results derived on the transformed vector field are propagated to the original. This result will enable exploitation of several features of dual Lyapunov techniques to robotic navigation. Initial results from applying this approach to robotic navigation are reported in [13]. Further research includes finding density functions that are directly applicable to the primary navigation system.

REFERENCES

[1] A. Rantzer, “A dual to Lyapunov’s stability theorem,” Syst. Control Lett., vol. 42, no. 3, pp. 161–168, 2001.

[2] A. Rantzer, “A converse theorem for density functions,” in Proc. IEEE Conf. Decis. Control, 2002, pp. 1890–1891.

[3] A. Rantzer and F. Ceragioli, “Smooth blending of nonlinear controllers using density functions,” presented at the Eur., Porto, Portugal, 2001.

[4] P. Monzon, “On necessary conditions for almost global stability,” IEEE Trans. Autom. Control, vol. 48, no. 4, pp. 631–634, Apr. 2003.

[5] D. Angeli, “Some remarks on density functions for dual Lyapunov methods,” in Proc. 42nd IEEE Conf. Decis. Control, 2003, pp. 5080–5082.

[6] D. E. Koditschek and E. Rimon, “Robot navigation functions on manifolds with boundary,” Adv. Appl. Math., vol. 11, pp. 412–442, 1990.

[7] J. Milnor, Morse Theory (Annals of Mathematics Studies Series). Princeton, NJ: Princeton University Press, 1963.

[8] E. Rimon and D. E. Koditschek, “Exact robot navigation using artificial potential functions,” IEEE Trans. Robot. Autom., vol. 8, no. 5, pp. 501–518, Oct. 1992.

[9] E. Rimon and D. E. Koditschek, “The construction of analytic diffeomorphisms for exact robot navigation on star worlds,” Trans. Amer. Math. Soc., vol. 327, no. 1, pp. 71–115, 1991.

[10] H. G. Tanner, S. G. Loizou, and K. J. Kyriakopoulos, “Nonholonomic navigation and control of cooperating mobile manipulators,” IEEE Trans. Robot. Autom., vol. 19, no. 1, pp. 53–64, Feb. 2003.

[11] S. Loizou and K. Kyriakopoulos, “A feedback-based multiagent navigation framework,” Int. J. Syst. Sci., vol. 37, no. 6, pp. 377–384, 2006.

[12] C. Karagöz, H. Bozma, and D. Koditschek, “Coordinated navigation of multiple independent disk-shaped robots,” EECS Dept., Univ. Michigan, Ann Arbor, MI, Tech. Rep. CSE-TR-486-04, 2004.

[13] S. Loizou and V. Kumar, “Weak input-to-state stability properties for navigation function based controllers,” in Proc. IEEE Int. Conf. Decis. Control, 2006, pp. 1800–1805.

Output Feedback Control of Bilinear Systems via a Bilinear LTR Observer

Min-Shin Chen and Chi-Che Chen

Abstract—In the literature, most observer-based output feedback controls for bilinear systems are only applicable to open-loop (neutrally) stable systems. This paper proposes a new observer-based output feedback control that can be applied to open-loop unstable systems. The key component of the new control is an exponentially stable bilinear loop transfer recovery (LTR) observer that derives from the linear LTR observer.

Index Terms—Bilinear observer, bilinear system, dyadic bilinear system, loop transfer recovery (LTR) observer, output feedback control.

I. INTRODUCTION

Bilinear systems exist in many physical phenomena that are of considerable interest to human activities [1], [2]. Recent applications of bilinear system control include heating, air conditioning control [3], power converter control [4], electromagnetic actuator control [5], and quantum system control using finite-dimensional bilinear models [6] or infinite-dimensional bilinear models [7], [8]. Even though a variety of control designs have been developed for bilinear systems, most of them are based on state feedback [9]–[15]. If only some of the state variables are accessible for measurement, one has to resort to output feedback control. Unfortunately, most output feedback controls in the literature require that the open-loop bilinear system be stable [16], neutrally stable [17], or dissipative [18]. The reason for requiring this open-loop stability condition is that they all assume that the stabilizing control signal is of small magnitude so that their bilinear observer designs can be successful. There are a few bilinear observer designs proposed in the literature for the open-loop unstable bilinear system without imposing the small control condition. For example, an open-loop dead-beat observer for state estimation of open-loop unstable bilinear systems is suggested in [19], but the system must satisfy the existence condition of a control Lyapunov function [20]. In [21] and [22], bilinear observers can be constructed with the state estimation error converging independently of the control input under a set of system matrix equalities.

This paper proposes a new output feedback control for unstable bilinear systems. The key element is a bilinear loop transfer recovery (LTR) observer that derives from the linear LTR observer [23]. The new bilinear LTR observer is exponentially stable without imposing the small control condition, the existence of a control Lyapunov function, or extra matrix equalities on the system matrices. Hence, it relaxes the stringent conditions imposed by previous bilinear observer designs. Then, by combining this new bilinear LTR observer with the state feedback division control in [15], one obtains a stabilizing output feedback control for bilinear systems that may be open-loop unstable.

The remainder of this paper is arranged as follows. Section II introduces the new bilinear LTR observer. Section III presents the observer-based output feedback control and its stability analysis. Section IV concludes the paper.

Manuscript received March 7, 2006; revised November 27, 2006, May 9, 2007, and October 12, 2007. Recommended by Associate Editor D. Angeli.

The authors are with the Department of Mechanical Engineering, National Taiwan University, Taipei 106, Taiwan, R.O.C. (e-mail: [email protected]; [email protected]).


II. BILINEAR LTR OBSERVER

Consider a dyadic bilinear system with a multiplicative control input

$$\dot{x} = Ax + bzu, \quad z = hx, \quad y = cx \tag{1}$$

where the system state $x \in \mathbb{R}^n$ and $z \in \mathbb{R}^1$ are not accessible for measurement, and the only accessible signals are the control input $u \in \mathbb{R}^1$ and the system output $y \in \mathbb{R}^1$. All system matrices $A \in \mathbb{R}^{n \times n}$, $b \in \mathbb{R}^n$, $h^T \in \mathbb{R}^n$, $c^T \in \mathbb{R}^n$ are known, and the open-loop system matrix $A$ is allowed to be unstable. Assume that the bilinear system (1) satisfies the following conditions:

A1: the bilinear system is observable in the sense that $(A, c)$ is an observable pair;

A2: the bilinear system is controllable in the sense [24] that $(A, b)$ is controllable and $(A, h)$ observable;

A3: the system $(A + I, b, c)$ is minimum-phase (has only stable zeros).

One now proposes an observer design for the bilinear system (1) to estimate its state x from the input u and output y. If the control u of the system is uniformly bounded

|u| ≤ U (2)

where U may be arbitrarily large, the following bilinear LTR observer is suggested

$$\dot{\hat{x}} = A\hat{x} + bh\hat{x}u + L(y - c\hat{x}) \tag{3}$$

in which the output injection gain $L$ is designed as in the linear LTR observer [23]

$$L = \frac{1}{\mu} Q c^T, \quad \mu = 1, \qquad Q(A + I)^T + (A + I)Q - Q\,\frac{c^T c}{\mu}\,Q + \pi b b^T = 0, \quad \pi\ (> 0) \text{ sufficiently large.} \tag{4}$$

From Assumptions A1 and A2, $(A, c)$ is observable and $(A, b)$ is controllable. Hence, the solution $Q \in \mathbb{R}^{n \times n}$ of the aforesaid Riccati equation is positive definite [25]. Furthermore, the solution $Q(\pi)$ depends on the design parameter $\pi$ in the following way, which is a well-known result in the study of the linear LTR observer.

Theorem 1 [25]: Under Assumption A3, the solution $Q(\pi)$ of the observer Riccati equation (4) satisfies

$$\lim_{\pi \to \infty} \frac{Q(\pi)}{\pi} = 0.$$
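As a rough illustration of Theorem 1, the following sketch solves the Riccati equation (4) in closed form for the scalar case $n = 1$, where it reduces to the quadratic $2(a+1)Q - (c^2/\mu)Q^2 + \pi b^2 = 0$. The function names and the numerical plant values are hypothetical and chosen only for illustration; they do not come from the paper. In this special case $Q(\pi)$ grows like $\sqrt{\pi}$, so the ratio $Q(\pi)/\pi$ shrinks as $\pi$ increases.

```python
import math


def scalar_riccati_solution(a, b, c, pi_gain, mu=1.0):
    """Positive root Q of 2(a+1)Q - (c^2/mu)Q^2 + pi*b^2 = 0, i.e. (4) with n = 1."""
    a_shift = a + 1.0
    disc = a_shift * a_shift + pi_gain * b * b * c * c / mu
    return mu * (a_shift + math.sqrt(disc)) / (c * c)


def observer_gain(q, c, mu=1.0):
    """Output injection gain L = Q c^T / mu from (4), scalar case."""
    return q * c / mu


if __name__ == "__main__":
    a, b, c = 1.0, 1.0, 1.0  # illustrative unstable scalar plant (assumption)
    for pi_gain in (1.0, 1e2, 1e4, 1e6):
        q = scalar_riccati_solution(a, b, c, pi_gain)
        # Columns: pi, Q(pi), L, Q(pi)/pi
        print(pi_gain, q, observer_gain(q, c), q / pi_gain)
```

For matrix-valued problems a numerical Riccati solver would be needed instead of the closed-form root used here.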

With Theorem 1, one can now prove the stability of the proposed bilinear LTR observer.

Theorem 2: Consider the bilinear system (1), whose control satisfies the upper bound (2). However large the control bound $U$ in (2) may be, there always exists a sufficiently large design parameter $\pi > 0$ in the Riccati equation (4) such that the state estimate $\hat{x}$ of the bilinear LTR observer (3) approaches the true state $x$ exponentially.

Proof: The state estimation error $\tilde{x} = x - \hat{x}$ resulting from the bilinear LTR observer satisfies

$$\dot{\tilde{x}} = (A - Lc)\tilde{x} + bh\tilde{x}u. \tag{5}$$

The goal is to prove that even for large $U$ in (2), the error dynamics (5) is globally exponentially stable if the Riccati design parameter $\pi$ in (4) is sufficiently large.

Define a Lyapunov function $W = \tilde{x}^T Q^{-1} \tilde{x}$, where $Q > 0$ is from the observer Riccati equation (4), and $\tilde{x}$ from (5). The rate of change of $W$ along the trajectory (5) satisfies

$$\dot{W} \le -2W - \frac{1}{\mu}\|\tilde{y}\|^2 - \pi\|b^T Q^{-1}\tilde{x}\|^2 + 2\|\tilde{x}\| \cdot \|h\| \cdot |u| \cdot \|b^T Q^{-1}\tilde{x}\| \le -2W - \pi\|b^T Q^{-1}\tilde{x}\|^2 + 2U\|h\| \cdot \|\tilde{x}\| \cdot \|b^T Q^{-1}\tilde{x}\|$$

where $\tilde{y} = c\tilde{x}$. Note that the maximum of the last two terms in the aforementioned inequality occurs when $\|b^T Q^{-1}\tilde{x}\| = U\|h\| \cdot \|\tilde{x}\|/\pi$, with the maximum value being $(U\|h\| \cdot \|\tilde{x}\|)^2/\pi$. Hence

$$\dot{W} \le -2W + \frac{(U\|h\| \cdot \|\tilde{x}\|)^2}{\pi} \le -\left[2 - \frac{(U\|h\|)^2\,\bar{\sigma}(Q)}{\pi}\right] W \tag{6}$$

where the second inequality is derived using $W \ge \underline{\sigma}(Q^{-1})\|\tilde{x}\|^2 = (1/\bar{\sigma}(Q))\|\tilde{x}\|^2$. According to Theorem 1, $\bar{\sigma}(Q)/\pi$ approaches zero as the observer design parameter $\pi$ approaches infinity. Hence, however large the constant $U\|h\|$ may be, there always exists a sufficiently large design parameter $\pi$ such that the number in the square bracket in (6) is positive. This implies that $W(t)$, and hence $\tilde{x}(t)$, decay to zero exponentially.
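The argument above can be sketched numerically for a scalar plant ($n = 1$, $\mu = 1$). The sketch below evaluates the bracket in (6), doubles $\pi$ until the bracket becomes positive, and then integrates the error dynamics (5) with a forward-Euler step under a bounded test input $u(t) = U\sin t$. All function names, the doubling search, the test input, and the plant values are assumptions made for illustration only; they are not part of the paper's design procedure.

```python
import math


def scalar_riccati_solution(a, b, c, pi_gain):
    """Positive root Q of 2(a+1)Q - c^2 Q^2 + pi*b^2 = 0 (equation (4), n = 1, mu = 1)."""
    a_shift = a + 1.0
    return (a_shift + math.sqrt(a_shift ** 2 + pi_gain * b * b * c * c)) / (c * c)


def decay_bracket(a, b, h, c, u_bound, pi_gain):
    """Bracket of (6): 2 - (U*|h|)^2 * Q / pi. A positive value certifies decay of W."""
    q = scalar_riccati_solution(a, b, c, pi_gain)
    return 2.0 - (u_bound * abs(h)) ** 2 * q / pi_gain


def smallest_sufficient_pi(a, b, h, c, u_bound):
    """Double pi, starting at 1, until the bracket of (6) is positive."""
    pi_gain = 1.0
    while decay_bracket(a, b, h, c, u_bound, pi_gain) <= 0.0:
        pi_gain *= 2.0
    return pi_gain


def simulate_error(a, b, h, c, u_bound, pi_gain, x_tilde0=1.0, dt=1e-4, t_end=2.0):
    """Forward-Euler integration of the scalar error dynamics (5)."""
    gain = scalar_riccati_solution(a, b, c, pi_gain) * c  # L = Q c^T with mu = 1
    x_tilde = x_tilde0
    for step in range(int(round(t_end / dt))):
        u = u_bound * math.sin(step * dt)  # any input with |u| <= U would do
        x_tilde += dt * ((a - gain * c) + b * h * u) * x_tilde
    return x_tilde


if __name__ == "__main__":
    a, b, h, c, u_bound = 1.0, 1.0, 1.0, 1.0, 5.0  # illustrative values (assumption)
    pi_gain = smallest_sufficient_pi(a, b, h, c, u_bound)
    print(pi_gain, decay_bracket(a, b, h, c, u_bound, pi_gain))
    print(simulate_error(a, b, h, c, u_bound, pi_gain))
```

The bracket condition is only sufficient, so smaller values of $\pi$ may also yield a decaying error in a given simulation.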

Remark 1: Most previous bilinear observers require that the open-loop system be stable or the control upper bound $U$ be small, while the proposed bilinear LTR observer has no such constraints. However, if the observer design parameter $\pi$ is chosen too large, the proposed observer becomes high-gain. A disadvantage of high-gain observers is the peaking phenomenon [26], in which the estimated state $\hat{x}(t)$ peaks to extremely large values during the initial period of the observation process. One way to relieve the peaking phenomenon is to schedule the design parameter according to $\pi(t) = \pi_i$, $t \in [t_k, t_{k+1})$, where $\pi_i$ jumps stepwise from 1 to the designed large value.

III. OBSERVER-BASED STATE FEEDBACK CONTROL

In the literature, a state feedback “division control” [15] has been proposed to exponentially stabilize the bilinear system (1) when the system state $x$ is accessible for measurement. This division control is stabilizing even for an open-loop unstable bilinear system, and is given by

$$u_x = -\frac{z}{z^2 + \epsilon^2\|x\|^2}\,kx \tag{7}$$

where the state feedback gain $k$ is chosen to stabilize the system matrix $A - bk$, and $\epsilon$ is a sufficiently small parameter. The resultant exponentially stable closed-loop system is

$$\dot{x} = f(x), \qquad f(x) = Ax + bhx\,u_x = Ax - bkx\,\frac{(hx)^2}{(hx)^2 + \epsilon^2\|x\|^2}. \tag{8}$$
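To make the role of the small parameter in (7) concrete (written $\epsilon$ here), the following sketch evaluates the closed-loop vector field (8) at a fixed state with $hx \neq 0$ and compares it with the linear field $(A - bk)x$. The matrices, the gain $k$, the test state, and all function names are hypothetical illustration values, not taken from the paper or from [15]. The sketch only shows that the two fields get closer as $\epsilon$ shrinks; it does not by itself establish the exponential stability proved in [15].

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(matrix, v):
    return [dot(row, v) for row in matrix]


def closed_loop_field(A, b, h, k, eps, x):
    """f(x) = Ax - b*(kx)*(hx)^2 / ((hx)^2 + eps^2*||x||^2), as in (8)."""
    z = dot(h, x)
    weight = z * z / (z * z + eps * eps * dot(x, x))
    kx = dot(k, x)
    Ax = mat_vec(A, x)
    return [Ax[i] - b[i] * kx * weight for i in range(len(x))]


def linear_field(A, b, k, x):
    """(A - bk)x, the field obtained from (8) when the weight equals one."""
    kx = dot(k, x)
    Ax = mat_vec(A, x)
    return [Ax[i] - b[i] * kx for i in range(len(x))]


def field_gap(A, b, h, k, eps, x):
    """Euclidean distance between the division-control field and the linear field."""
    f = closed_loop_field(A, b, h, k, eps, x)
    g = linear_field(A, b, k, x)
    return math.sqrt(sum((fi - gi) ** 2 for fi, gi in zip(f, g)))


if __name__ == "__main__":
    A = [[0.0, 1.0], [1.0, 0.0]]  # open-loop unstable (eigenvalues +1 and -1)
    b = [0.0, 1.0]
    h = [1.0, 0.0]
    k = [2.0, 2.0]                # A - bk then has both eigenvalues at -1
    x = [1.0, -0.5]
    for eps in (1.0, 0.1, 0.01):
        print(eps, field_gap(A, b, h, k, eps, x))
```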

Since the aforementioned state feedback control system (8) is exponentially stable, one can invoke the converse Lyapunov stability theorem to claim the following.

Theorem 3 [27]: Regarding the exponentially stable system $\dot{x} = f(x)$ in (8), there exist a Lyapunov function $V(x)$ and positive constants $\alpha_i$ such that for all $x \in \mathbb{R}^n$

$$\alpha_2\|x\|^2 \le V(x) \le \alpha_1\|x\|^2 \tag{9}$$

$$\frac{d}{dt}V(x) = \frac{\partial V(x)}{\partial x} f(x) \le -\alpha_3\|x\|^2 \tag{10}$$

$$\left\|\frac{\partial V(x)}{\partial x}\right\| \le \alpha_4\|x\|. \tag{11}$$

When the system state $x$ and $z = hx$ are not accessible for measurement, one can combine the bilinear LTR observer (3) in the previous section with the aforementioned division control (7). The resultant observer-based state feedback control is given by

$$u_{\hat{x}} = -\frac{\hat{z}}{\hat{z}^2 + \epsilon^2\|\hat{x}\|^2}\,k\hat{x}, \qquad \hat{z} = h\hat{x} \tag{12}$$

where $\hat{x}$ is the estimate of $x$ from the bilinear LTR observer (3).

Lemma 4: Both the state feedback control $u_x$ in (7) and the observer-based state feedback control $u_{\hat{x}}$ in (12) are uniformly bounded independently of the boundedness of $x$ and $\hat{x}$.

Proof: First, one will show that the observer-based control $u_{\hat{x}}$ is uniformly bounded independently of the boundedness of $\hat{x}$. Using the inequality $|k\hat{x}| \le \|k\| \cdot \|\hat{x}\|$ and dividing the numerator and denominator of (12) by $\|\hat{x}\|^2$, one obtains the following inequality

$$|u_{\hat{x}}| \le \frac{s}{s^2 + \epsilon^2}\,\|k\| \overset{\mathrm{def}}{=} g(s), \qquad s = \frac{|\hat{z}|}{\|\hat{x}\|} \in [0, \infty).$$

A simple calculation shows that the maximum value of $g(s)$ for $s$ ranging from zero to infinity is $\|k\|/(2\epsilon)$, attained at $s = \epsilon$. Hence, the observer-based control $u_{\hat{x}}$ is uniformly bounded

$$|u_{\hat{x}}| \le \frac{\|k\|}{2\epsilon} \quad \forall \hat{x} \in \mathbb{R}^n. \tag{13}$$

In the same way as (13) is derived, one can also show that the state feedback control $u_x$ in (7) is uniformly bounded independently of the boundedness of $x$

$$|u_x| \le \frac{\|k\|}{2\epsilon} \quad \forall x \in \mathbb{R}^n. \tag{14}$$
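The bound in Lemma 4 can be spot-checked numerically. The sketch below samples states of widely varying magnitude, evaluates the division control, and compares the largest observed $|u|$ with $\|k\|/(2\epsilon)$, writing $\epsilon$ for the small parameter of the division control. The vectors $h$ and $k$, the value of $\epsilon$, the sampling scheme, and the function names are hypothetical choices for illustration; a random check of this kind is not a substitute for the proof.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def division_control(x, h, k, eps):
    """u = -z*(kx) / (z^2 + eps^2*||x||^2) with z = hx, for a nonzero state x."""
    z = dot(h, x)
    return -z * dot(k, x) / (z * z + eps * eps * dot(x, x))


def g(s, k_norm, eps):
    """Scalar majorant g(s) = s*||k|| / (s^2 + eps^2) used in the proof of Lemma 4."""
    return s * k_norm / (s * s + eps * eps)


def max_abs_control(h, k, eps, samples=2000, seed=0):
    """Largest |u| over random states whose norms span several orders of magnitude."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        scale = 10.0 ** rng.uniform(-3.0, 3.0)
        x = [scale * rng.gauss(0.0, 1.0) for _ in h]
        worst = max(worst, abs(division_control(x, h, k, eps)))
    return worst


if __name__ == "__main__":
    h, k, eps = [1.0, 0.0], [2.0, 2.0], 0.1  # illustrative values (assumption)
    bound = math.sqrt(dot(k, k)) / (2.0 * eps)
    print(max_abs_control(h, k, eps), bound, g(eps, math.sqrt(dot(k, k)), eps))
```

Because the control depends only on the direction of the state, scaling the state by a positive factor leaves the control value unchanged, which is why the bound holds regardless of the state magnitude.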

According to Lemma 4, the proposed observer-based control (12) satisfies the uniform upper bound condition (2) required in Theorem 2. One can, therefore, legitimately invoke Theorem 2 to conclude that $\hat{x}$ approaches $x$ exponentially fast. In other words, exponential stability of the proposed bilinear LTR observer can be concluded before the stability analysis of the controlled closed-loop system.

Corollary 5: If the observer design parameter $\pi$ in (4) is sufficiently large, the bilinear LTR observer (3) for the bilinear system (1) under the proposed observer-based control (12) is exponentially stable; hence, there exist positive constants $K$ and $\gamma$ such that

$$\|\tilde{x}(t)\| \le K e^{-\gamma t}\|\tilde{x}(0)\| \to 0, \quad \text{as } t \to \infty. \tag{15}$$

The stability analysis of the proposed controlled system will proceed as follows.

Theorem 6: Consider the bilinear system (1) under the observer-based state feedback control (12). The controlled system state will not escape to infinity in finite time, nor will it decay to zero in finite time.

Proof: One can check, using (1) and (13), that the system state under the observer-based state feedback control (12) satisfies $\|\dot{x}\| \le M\|x\|$ for some bounded constant $M > 0$. In other words, the closed-loop system equation is Lipschitz. By [27, Proposition 1.4.1], one concludes that

$$0 < \|x(0)\|e^{-Mt} \le \|x(t)\| \le \|x(0)\|e^{Mt} < \infty.$$

The inequality on the left of $\|x(t)\|$ shows that $x(t)$ cannot decay to zero in finite time, and the inequality on the right shows that $x(t)$ cannot escape to infinity in finite time.

Theorem 7: Consider the bilinear system (1) under the observer-based state feedback control (12). If the observer design parameter $\pi$ in (4) is sufficiently large, the proposed control stabilizes the system (1) in the sense that the system state $x$ will converge asymptotically to the origin.

Proof: Define an arbitrarily small neighborhood around the origin

$$V_\eta = \{x \mid V(x) \le \eta\}$$

where $V(x)$ is the Lyapunov function in Theorem 3, and $\eta > 0$ an arbitrarily small number. Also, define a critical time constant $T^*$

$$T^* = \frac{1}{\gamma}\ln\frac{2\alpha_4\beta m K\|\tilde{x}(0)\|}{\alpha_3\sqrt{\eta/\alpha_1}} \tag{16}$$

where $\alpha_i$ are as in Theorem 3; $K$, $\gamma$, and $\tilde{x}(0)$ are as in Corollary 5; $\beta = \|b\| \cdot \|h\|$; and $m$ is the upper bound in (18), as shown next. According to Theorem 6, the controlled system state $x$ can grow at most exponentially fast; therefore, given any bounded initial condition $x(0)$, $x(T^*)$ is also bounded. It will now be shown that starting from the bounded $x(T^*)$, $x(t)$ will enter $V_\eta$ within a finite time, say at $t = T^* + T_1$, and $x(t)$ will never exit $V_\eta$ again. Since $\eta$ is arbitrarily small, this result is equivalent to asymptotic stability of the controlled system.

Define the normalized state $e_x = x/\|x\|$, and the normalized estimated state $e_{\hat{x}} = \hat{x}/\|\hat{x}\|$. It is shown in the Appendix that they satisfy

$$\|e_x - e_{\hat{x}}\| \le \frac{2}{\|x\|}\|x - \hat{x}\|. \tag{17}$$
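Inequality (17) is easy to sanity-check numerically before reading its proof in the Appendix. The sketch below draws random pairs of nonzero vectors, computes the distance between their normalized versions, and compares it with the right-hand side of (17). The helper names, the dimension, and the sampling distribution are arbitrary illustration choices and carry no significance beyond this check.

```python
import math
import random


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def normalize(v):
    n = norm(v)
    return [vi / n for vi in v]


def gap_and_bound(x, x_hat):
    """Return (||e_x - e_xhat||, 2*||x - x_hat|| / ||x||) for nonzero x and x_hat."""
    e_x, e_hat = normalize(x), normalize(x_hat)
    gap = norm([p - q for p, q in zip(e_x, e_hat)])
    bound = 2.0 * norm([p - q for p, q in zip(x, x_hat)]) / norm(x)
    return gap, bound


def worst_margin(samples=5000, dim=3, seed=0):
    """Largest value of gap - bound over random pairs; (17) predicts it is never positive."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_hat = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        gap, bound = gap_and_bound(x, x_hat)
        worst = max(worst, gap - bound)
    return worst


if __name__ == "__main__":
    print(worst_margin())
```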

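Also, notice that the observer-based control (12) and the state feedback control (7) can be expressed as the same function of the (normalized) state

$$u_{\hat{x}} = q(e_{\hat{x}}), \quad \text{and} \quad u_x = q(e_x), \quad \text{where } q(e) = -\frac{he \cdot ke}{|he|^2 + \epsilon^2}.$$

It is easy to see that $q(e)$ has a bounded derivative for all bounded $e$

$$\left\|\frac{\partial q(e)}{\partial e}\right\| \le m < \infty \quad \forall \|e\| = 1. \tag{18}$$

Using the mean value theorem [28] and the inequalities (17), (18) yields

$$|\Delta(t)| \overset{\mathrm{def}}{=} |u_{\hat{x}} - u_x| = |q(e_{\hat{x}}) - q(e_x)| \le \left\|\frac{\partial q(e)}{\partial e}\right\| \cdot \|e_{\hat{x}} - e_x\| \le \frac{2m}{\|x\|}\|\hat{x} - x\|. \tag{19}$$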

One will now use a contradiction argument to show that starting from the bounded $x(T^*)$, $x(t)$ will enter the small neighborhood $V_\eta$ within a finite time at $T^* + T_1$. Assume the contrary; that is, $x$ is always outside of $V_\eta$ for all $t \ge T^*$. As a result of this assumption and (9) in Theorem 3, one has $\|x(t)\| > \sqrt{\eta/\alpha_1}$ for all $t \ge T^*$. Then, using (15) in Corollary 5 and (19), one concludes that

$$|\Delta(t)| \le \frac{2mK}{\sqrt{\eta/\alpha_1}}\|\tilde{x}(0)\|e^{-\gamma t} \quad \forall t \ge T^*. \tag{20}$$

Now check the time derivative of $V(x)$ along the trajectory under the proposed observer-based control $u_{\hat{x}}$

$$\frac{d}{dt}V(x) = \frac{\partial V(x)}{\partial x}\bigl(Ax + bhx\,u_x + bhx\,(u_{\hat{x}} - u_x)\bigr) = \frac{\partial V(x)}{\partial x} f(x) + \frac{\partial V(x)}{\partial x}\,bhx\,\Delta(t) \le \bigl(-\alpha_3 + \alpha_4\beta|\Delta(t)|\bigr) \cdot \|x\|^2 \le \left[-\frac{\alpha_3}{\alpha_1} + \frac{\alpha_4\beta}{\alpha_1}|\Delta(t)|\right] V(x) \tag{21}$$

where $\Delta(t) = u_{\hat{x}} - u_x$, $f(x)$ is as defined in (8), $\beta = \|b\| \cdot \|h\|$, and the inequalities in Theorem 3 have been used to derive the last two inequalities. The last inequality in (21), together with (20), implies that $V$ will eventually decay exponentially to values smaller than $\eta$, contradicting the earlier assumption that $x$ will always stay outside of $V_\eta$ ($V(x) > \eta$). Hence, one concludes that if initially $x(T^*)$ is outside of $V_\eta$, $x(t)$ must enter $V_\eta$ within a finite time at $T^* + T_1$ for some finite $T_1$.

Now, it remains to show that once $x(t)$ enters $V_\eta$ at $T^* + T_1$, it will never exit $V_\eta$ again. To show this, notice that whenever $x$ is about to exit $V_\eta$, it must cross the boundary of $V_\eta$, where $V(x) = \eta$. One will now check the rate of change of $V(x)$ at the boundary of $V_\eta$. At the boundary, one has $V(x) = \eta$ and hence $\|x\| \ge \sqrt{\eta/\alpha_1}$, according to (9) in Theorem 3. Substituting $\|x\| \ge \sqrt{\eta/\alpha_1}$, (15), and (19) into (21) leads to

$$\frac{d}{dt}V(x) \le \left[-\frac{\alpha_3}{\alpha_1} + \frac{\alpha_4\beta}{\alpha_1}\,\frac{2mK}{\sqrt{\eta/\alpha_1}}\|\tilde{x}(0)\|e^{-\gamma t}\right] V(x). \tag{22}$$

From this inequality, it is not difficult to see that the rate of change of $V(x)$ at the boundary of $V_\eta$ is always negative if $t \ge T^*$, meaning that $x$ can never exit $V_\eta$ at $t \ge T^*$. Since we have already shown that $x$ enters $V_\eta$ at $T^* + T_1$, we conclude that $x$ will stay within $V_\eta$ thereafter.

Remark 2: The analysis of Theorem 7 does not say where the control signal converges asymptotically. However, one has proved in (13) that the proposed control signal $u_{\hat{x}} = q(e_{\hat{x}})$ is uniformly bounded. One can further show that the time derivative of $u_{\hat{x}} = q(e_{\hat{x}})$

$$\frac{d}{dt}u_{\hat{x}} = \frac{\partial q(e_{\hat{x}})}{\partial e_{\hat{x}}}\,\frac{I - e_{\hat{x}}e_{\hat{x}}^T}{\|\hat{x}\|}\,\bigl(A\hat{x} + bh\hat{x}u_{\hat{x}} + L(y - c\hat{x})\bigr)$$

is also uniformly bounded due to (18) and boundedness of $x$, $\hat{x}$, $u_{\hat{x}}$. In other words, the control signal is smooth to some extent so that it can be implemented without difficulty on physical actuators.

Remark 3: In proving Theorem 7, one implicitly assumes in the definition of $e_x = x/\|x\|$ that $x \neq 0$ for any finite time. This assumption is supported by Theorem 6, which states that $x$ will not decay to zero in finite time. Also, in Corollary 5, it has been shown that $\hat{x}$ converges to $x$ with a specific exponential rate independently of the stability of $x$. In the case when the system state $x$ converges to zero faster than $\hat{x}$ converges to $x$, the stability proof of $x$ is trivially done without referring to the arguments in Theorem 7. In the other case, when $x$ has not converged to zero before $\hat{x}$ converges to $x$, one applies the argument in Theorem 7 to conclude the stability of $x$. In this case, after $\hat{x}$ has converged to $x\ (\neq 0)$, one can say that $\hat{x} \neq 0$ in the definition of $e_{\hat{x}} = \hat{x}/\|\hat{x}\|$.

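APPENDIX

Define the state estimation error $\tilde{x} = x - \hat{x}$. It is easy to show that

$$e_x - e_{\hat{x}} = \frac{\hat{x}\bigl(\|\hat{x}\| - \|\hat{x} + \tilde{x}\|\bigr) + \tilde{x}\|\hat{x}\|}{\|\hat{x}\| \cdot \|\hat{x} + \tilde{x}\|}.$$

Using the inequality $\bigl|\|a\| - \|b\|\bigr| \le \|a - b\| = \|b - a\|$ for any two vectors $a$ and $b$, one can show that

$$\bigl|\|\hat{x}\| - \|\hat{x} + \tilde{x}\|\bigr| \le \|\tilde{x}\|.$$

Combining the aforementioned two relations, one obtains

$$\|e_x - e_{\hat{x}}\| \le \frac{2\|\hat{x}\| \cdot \|\tilde{x}\|}{\|\hat{x}\| \cdot \|\hat{x} + \tilde{x}\|} = \frac{2\|\tilde{x}\|}{\|\hat{x} + \tilde{x}\|} = \frac{2}{\|x\|}\|x - \hat{x}\|.$$

ACKNOWLEDGMENT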

The authors wish to thank one of the reviewers who pointed out a discrepancy in the original proof of Theorem 7, and thank especially the Associate Editor whose suggestion has helped to shape the final proof of Theorem 7.

REFERENCES

[1] R. R. Mohler, Bilinear Systems, Volume II: Applications to Bilinear Control. Englewood Cliffs, NJ: Prentice-Hall, 1991.

[2] C. Bruni, G. Dipillo, and G. Koch, “Bilinear systems: An appealing class of nearly linear systems in theory and applications,” IEEE Trans. Autom. Control, vol. AC-19, no. 4, pp. 334–348, Aug. 1974.

[3] B. Argüello-Sérrano and M. Velez-Reyes, “Nonlinear control of a heating, ventilating, and air conditioning system with thermal load estimation,” IEEE Trans. Control Syst. Technol., vol. 7, no. 1, pp. 56–63, Jan. 1999.

[4] V. Rajasekaran, J. Sun, and B. S. Heck, “Bilinear discrete-time modeling for enhanced stability prediction and digital control design,” IEEE Trans. Power Electron., vol. 18, no. 1, pp. 381–389, Jan. 2003.

[5] R. F. Fung, Y. T. Liu, and C. C. Wang, “Dynamic model of an electromagnetic actuator for vibration control of a cantilever beam with a tip mass,” J. Sound Vib., vol. 288, pp. 957–980, 2005.

[6] F. Albertini and D. D’Alessandro, “Notions of controllability for bilinear multilevel quantum systems,” IEEE Trans. Autom. Control, vol. AC-48, no. 8, pp. 1399–1403, Aug. 2003.

[7] G. M. Huang, T. J. Tarn, and J. W. Clark, “On the controllability of quantum mechanical systems,” J. Math. Phys., vol. 24, no. 11, pp. 2608–2618, 1983.

[8] C. Lan, T. J. Tarn, Q. S. Chi, and J. W. Clark, “Analytic controllability of time-dependent quantum control systems,” J. Math. Phys., vol. 46, p. 052102, 2005.

[9] M. Slemrod, “Stabilization of bilinear control systems with applications to nonconservative problems in elasticity,” SIAM J. Control Optim., vol. 16, pp. 131–141, 1978.

[10] J. P. Quinn, “Stabilization of bilinear systems by quadratic feedback controls,” J. Math. Anal. Appl., vol. 75, pp. 66–80, 1980.

[11] E. P. Ryan and N. J. Buckingham, “On asymptotically stabilizing feedback control of bilinear systems,” IEEE Trans. Autom. Control, vol. 28, no. 8, pp. 863–864, Aug. 1983.

[12] P. O. Gutman, “Stabilizing controllers for bilinear systems,” IEEE Trans. Autom. Control, vol. 26, no. 4, pp. 917–921, Aug. 1981.

[13] M. S. Chen, “Exponential stabilization of a constrained bilinear system,” Automatica, vol. 34, no. 8, pp. 989–992, 1998.

[14] Y. R. Hwang, M. S. Chen, and T. Wu, “Division controllers for homogeneous dyadic bilinear systems,” IEEE Trans. Autom. Control, vol. 48, no. 4, pp. 701–705, Apr. 2003.

[15] M. S. Chen, Y. R. Hwang, and Y. J. Kuo, “Exponentially stabilizing division controller for dyadic bilinear systems,” IEEE Trans. Autom. Control, vol. 48, no. 1, pp. 106–110, Jan. 2003.

[16] R. Genesio and A. Tesi, “The output stabilization of SISO bilinear systems,” IEEE Trans. Autom. Control, vol. 33, no. 10, pp. 950–952, Oct. 1988.

[17] G. Lu, Y. Zheng, and C. Zhang, “Dynamic output feedback stabilization of MIMO bilinear systems with undamped natural response,” Asian J. Control, vol. 5, pp. 251–260, 2003.

[18] J. P. Gauthier and I. Kupka, “A separation principle for bilinear systems with dissipative drift,” IEEE Trans. Autom. Control, vol. 37, no. 12, pp. 1970–1974, Dec. 1992.


[19] S. Hanba and Y. Miyasato, “Output feedback stabilization of bilinear systems using dead-beat observers,” Automatica, vol. 37, pp. 915–920, 2001.

[20] E. D. Sontag, “A ‘universal’ construction of Artstein’s theorem on nonlinear stabilization,” Syst. Control Lett., vol. 13, pp. 117–123, 1989.

[21] S. Hara and K. Furuta, “Minimal order state observers for bilinear systems,” Int. J. Control, vol. 24, pp. 705–718, 1976.

[22] Y. Funahashi, “Stable state estimator of bilinear systems,” Int. J. Control, vol. 29, pp. 181–188, 1979.

[23] J. C. Doyle and G. Stein, “Robustness with observers,” IEEE Trans. Autom. Control, vol. 24, no. 4, pp. 607–611, Aug. 1979.

[24] M. E. Evans and D. N. P. Murthy, “Controllability of a class of discrete-time bilinear systems,” IEEE Trans. Autom. Control, vol. 22, no. 1, pp. 78–83, Feb. 1977.

[25] H. Kwakernaak and R. Sivan, Linear Optimal Control Systems. New York: Wiley, 1972.

[26] H. K. Khalil, Nonlinear Systems. Upper Saddle River, NJ: Prentice-Hall, 2000.

[27] S. Sastry, Nonlinear Systems, Analysis, Stability, and Control. New York: Springer-Verlag, 1999.

[28] J. E. Marsden and M. J. Hoffman, Elementary Classical Analysis, 2nd ed. New York: Freeman, 1993.

On Linear-Quadratic Stackelberg Games With Time Preference Rates

Marc Jungers

Abstract—This note deals with linear-quadratic Stackelberg differential games including time preference rates with an open-loop information structure. The properties of the characteristic matrix associated with the necessary conditions for a Stackelberg strategy are pointed out. It is shown that such a matrix exhibits a special symmetry property of its eigenvalues. Sufficient conditions to guarantee a predefined degree of stability are given based on the distribution of the eigenvalues in the complex plane.

Index Terms—Game theory, Hamiltonian matrix, Riccati equation, α-stability, Stackelberg strategy, time preference rate.

I. INTRODUCTION

THE STACKELBERG strategies are an elegant concept for dealing with hierarchical differential games [1], [2]. In the framework of an open-loop information structure [3], the necessary conditions are well known and can be obtained explicitly within the context of linear-quadratic problems [1], [2]. Nevertheless, it seems that an explicit solution for differential games whose criteria include time preference rates does not exist.

It was proved in [4] that the linear-quadratic optimal control problem with a single criterion including a constant time preference rate α could

Manuscript received July 18, 2006; revised February 15, 2007, June 22, 2007, and October 4, 2007. Recommended by Associate Editor M. Fujita.

The author is with the Centre de Recherche en Automatique de Nancy (CRAN), Nancy-Université, Centre National de la Recherche Scientifique (CNRS), 54516 Vandoeuvre-les-Nancy, France (e-mail: marc. [email protected]).

Digital Object Identifier 10.1109/TAC.2008.917649

be restated as a standard one with a shift of the eigenvalues of the drift matrix by α. The reformulation uses a change of variable, which is closely connected with asymptotic stability of degree α.

Besides, criteria with a time preference rate are quite frequent, especially in economic applications of game theory (see [5], [6] for more details), where the rate is recognized as the discount rate associated with the cost functionals. In order to emphasize the fact that each player has its own objective, the time preference rates are not necessarily identical [8].

When there is no time preference rate, the necessary conditions for obtaining an open-loop Stackelberg equilibrium are characterized by a Hamiltonian matrix (see [9]–[11] for an overview). This leads to a symmetry of the eigenvalues with respect to the origin of the complex plane. However, for the general case where the time preference rates are different and nonzero, this property does not hold. The main contribution of this note is to consider such a general case. Two points are examined. First, the eigenvalue distribution of the characteristic matrix associated with an open-loop Stackelberg strategy applied to the differential game is studied. Second, it is shown that a predefined degree of stability can be imposed on the controlled system.

The note is organized as follows. In Section II, the Stackelberg strategy with an open-loop information structure is recalled and the associated necessary conditions are derived. The cases of finite and infinite time horizon are considered. The characteristic matrix and the corresponding coupled Riccati equations are presented. A nontrivial symmetry for the eigenvalues is described in Section III. Sufficient conditions for a strict α-stability are provided in the same section, followed by an interpretation in terms of game theory. An example illustrates the main result. Some concluding remarks make up Section IV.

II. STACKELBERG STRATEGY

A. Problem Statement

Consider a two-player linear-quadratic differential game on a finite time horizon, defined by

$$\dot{x}(t) = Ax(t) + B_1u_1(t) + B_2u_2(t), \quad x(t_0) = x_0 \tag{1}$$

where $x \in \mathbb{R}^n$, $u_i \in \mathcal{U}_{ad,i} \subset \mathbb{R}^{r_i}$ ($i \in \{1, 2\}$ and $n, r_i \in \mathbb{N}$; $\mathcal{U}_{ad,i}$ is the admissible set of the controls $u_i$), and with the cost functionals $J_i$ ($i \in \{1, 2\}$) including a time preference rate $\alpha_i$

$$J_i = \frac{1}{2}x_f^T e^{2\alpha_i t_f}K_{if}x_f + \frac{1}{2}\int_{t_0}^{t_f} e^{2\alpha_i t}\left(x^T Q_i x + u_1^T R_{i1}u_1 + u_2^T R_{i2}u_2\right) dt \tag{2}$$

where $x_f = x(t_f)$. All weighting matrices are constant and symmetric with $Q_i = C_i^T C_i \ge 0$, $K_{if} \ge 0$, $R_{ij} \ge 0$ ($i \neq j$), and $R_{ii} > 0$. The matrices $C_i \in \mathbb{R}^{m_i \times n}$ are of full rank.

A Stackelberg strategy with an open-loop information structure is applied to the differential game (1)–(2). Player 2 is assumed to be the leader while player 1 is the follower. The hierarchy in the game comes from the fact that the leader knows the rational reaction of the follower and reveals his/her strategy first. The follower does not know the rational reaction of the leader and must optimize his/her criterion $J_1$ for a given control $u_2^*(t)$ of the leader. Define the rational reaction set of the follower

$$\mathcal{R}_1(u) = \left\{\tilde{u}_1 \mid J_1(\tilde{u}_1, u) \le J_1(u_1, u),\ \forall u_1 \in \mathcal{U}_{ad,1}\right\}. \tag{3}$$

For a differential game with an open-loop information structure [12], i.e., the players are committed to follow a predetermined strategy or no