Convergence Analysis of the Doubling Algorithm for Several Nonlinear Matrix Equations in the Critical Case

(1)

FOR SEVERAL NONLINEAR MATRIX EQUATIONS IN THE CRITICAL CASE ∗

CHUN-YUEH CHIANG†, ERIC KING-WAH CHU‡, CHUN-HUA GUO§, TSUNG-MING HUANG¶, WEN-WEI LINk, AND SHU-FANG XU∗∗

Abstract. In this paper, we review two types of doubling algorithm and some techniques for analyzing them. We then use the techniques to study the doubling algorithm for three different nonlinear matrix equtions in the critical case. We show that the convergence of the doubling algorithm is at least linear with rate 1/2. As compared to earlier work on this topic, the results we present here are more general, and the analysis here is much simpler.

Key words. nonlinear matrix equation, minimal nonnegative solution, maximal positive definite solution, critical case, doubling algorithm, cyclic reduction, convergence rate

AMS subject classifications. 15A24, 15A48, 65F30, 65H10

1. Introduction. The doubling algorithm has been studied for various nonlinear matrix equations in [1, 6, 7, 19, 21, 24, 26, 27, 33]. Its convergence behaviour in the critical case, however, has not been fully investigated. The doubling algorithm is said to be structure-preserving (and denoted by SDA) because it preserves certain block structures for matrix pairs (or pencils) related to matrix equations.

In section 2, we review two types of doubling algorithm and some techniques for analyzing them. The presentation here is more general than in previous papers, to allow direct application to various matrix equations. In sections 3–5, the techniques reviewed in section 2 are used to study the convergence behaviour of the doubling algorithm for three different nonlinear matrix equations in the critical case. As com-pared to previous papers, the results here are obtained with only basic assumptions. In particular, the results we obtain about a quadratic matrix equation arising from quasi-birth-death processes are more general than previous results, and the analysis here is much simpler. A connection between the doubling algorithm and the cyclic reduction algorithm is also pointed out for that quadratic matrix equation. Some concluding remarks are made in section 6.

2. The doubling algorithm. The first three subsections are based on [33], [24], and [26], but the presentation here is more general. The last subsection is directly from [26].

∗_{Version of February 27, 2008.}

†_Department _of _Mathematics, _National _Tsinghua _University, _Hsinchu _300, _Taiwan

([email protected]).

‡_{School of Mathematical Sciences, Building 28, Monash University, VIC 3800, Australia}

§_{Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada}

([email protected]). The work of this author was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada.

¶_{Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan}

([email protected]). The work of this author was partially supported by the National Sci-ence Council and the National Center for Theoretical SciSci-ences in Taiwan.

k_Department _of _Mathematics, _National _Tsinghua _University, _Hsinchu _300, _Taiwan

([email protected]). The work of this author was partially supported by the National Science Council and the National Center for Theoretical Sciences in Taiwan.

∗∗_LMAM, _School _of _Mathematical _Sciences, _Peking _University, _Beijing _100871, _China

(2)

2.1. SDA-1. For a given matrix pair L0= I −G0 0 F0 , M0= E0 0 −H0 I , (2.1)

where E0, F0, G0, H0 are n × n, m × m, n × m, m × n, respectively, we are going to

define Lk= I −Gk 0 Fk , Mk= Ek 0 −Hk I (2.2) for all k ≥ 0. Assume that Lk and Mk have been defined and I − GkHk (and thus

I − HkGk) is nonsingular for k ≥ 0. Then we can define the matrices

e Lk = I −Ek(I − GkHk)−1Gk 0 Fk(I − HkGk)−1 , Mfk = Ek(I − GkHk)−1 0 −Fk(I − HkGk)−1Hk I .

It is easily verified that eLkMk = fMkLk. We then define

Lk+1= eLkLk = I −(Gk+ Ek(I − GkHk)−1GkFk) 0 Fk(I − HkGk)−1Fk , Mk+1= fMkMk = Ek(I − GkHk)−1Ek 0 −(Hk+ Fk(I − HkGk)−1HkEk) I .

Therefore, the sequence {Lk, Mk} can be defined by the following doubling algorithm

if no breakdown occurs.

Algorithm 2.1. (SDA-1) Given E0, F0, G0, H0.

For k = 0, 1, . . . compute

Ek+1= Ek(I − GkHk)−1Ek, (2.3)

Fk+1= Fk(I − HkGk)−1Fk, (2.4)

Gk+1= Gk+ Ek(I − GkHk)−1GkFk, (2.5)

Hk+1= Hk+ Fk(I − HkGk)−1HkEk. (2.6)

The algorithm requires about 14₃m3_{+ 6m}2_{n + 6mn}2₊14 3n

3 _{flops each iteration.}

Note that the flop count is 64₃n3 _{when m = n.}

2.2. SDA-2. For a given matrix pair L0= −P0 I T0 0 , M0= V0 0 Q0 −I , where all matrix blocks are n × n, we are going to define

Lk = −Pk I Tk 0 , Mk = Vk 0 Qk −I (2.7) for all k ≥ 0. Assume that Lk and Mk have been defined and Qk− Pk is nonsingular

for k ≥ 0. Then we can define the matrices

e Lk= I −Vk(Qk− Pk)−1 0 Tk(Qk− Pk)−1 , Mfk= Vk(Qk− Pk)−1 0 −Tk(Qk− Pk)−1 I .

(3)

It is easily verified that eLkMk = fMkLk. We then define Lk+1= eLkLk= −(Pk+ Vk(Qk− Pk)−1Tk) I Tk(Qk− Pk)−1Tk 0 , Mk+1= fMkMk = Vk(Qk− Pk)−1Vk 0 Qk− Tk(Qk− Pk)−1Vk −I .

Therefore, the sequence {Lk, Mk} can be defined by the following doubling algorithm

if no breakdown occurs.

Algorithm 2.2. (SDA-2) Given V0, T0, Q0, P0.

For k = 0, 1, . . ., compute

Vk+1= Vk(Qk− Pk)−1Vk,

Tk+1= Tk(Qk− Pk)−1Tk,

Qk+1= Qk− Tk(Qk− Pk)−1Vk,

Pk+1= Pk+ Vk(Qk− Pk)−1Tk.

This algorithm requires about 38₃n3_{flops each iteration.}

2.3. Relation between Lk and Mk. Suppose we have

L0U = M0U E, (2.8)

where the matrix pair (L0, M0) is the initialization for either SDA-1 or SDA-2, and

E is a square matrix.

Pre-multiplying (2.8) with eL0 and using eL0M0= fM0L0, we get L1U = M1U E2.

In general, we have for each k ≥ 0

LkU = MkU E2

k

. (2.9)

Suppose that there are nonsingular matrices V and Z such that

V L0Z = JL, V M0Z = JM, (2.10)

and JLJM = JMJL. Then it follows that

M0ZJL= V−1JMJL= V−1JLJM = L0ZJM,

and

M1ZJL2 = fM0M0ZJL2 = fM0L0ZJMJL= eL0M0ZJLJM = eL0L0ZJM2 = L1ZJM2 .

In general, we have for each k ≥ 0 MkZJ2

k

L = LkZJ2

k

(4)

2.4. Result on special Jordan blocks. Let Jω,p be the p × p Jordan block

with a unimodular eigenvalue ω = eiθ_:

Jω,p≡          ω 1 0 · · · 0 0 . .. . .. . .. ... .. . . .. . .. . .. 0 .. . . .. . .. 1 0 · · · 0 ω          . (2.12)

When p = 2m, let Γk,mbe determined through the partition

J_ω,2m2k = " J2k ω,m Γk,m 0 J2k ω,m # . (2.13)

The following useful Lemma is proved in [26].

Lemma 2.1. The matrix Γk,m is invertible and satisfies

kΓ−1_k,mJω,m2k k = O(2−k), kJ2

k

ω,mΓ−1k,mJ 2k

ω,mk = O(2−k) as k → ∞. (2.14)

In the next three sections, we will apply the techniques reviewed in this section to three different nonlinear matrix equations. Although the general approach will be the same, we will need to fully exploit the special properties of each equation. Among other things, the following two issues deserve special attention: (1) Given a nonlinear matrix equation, how should we rewrite it in its equivalent form (2.8)? If possible, we should try to get a form (2.8) that would lead to SDA-2 rather than SDA-1, since SDA-2 is less expensive. (2) How should we choose the matrices JL and JM in (2.10)?

The matrices must satisfy JLJM = JMJL, and the resulting equation (2.11) and its

analogue should be easy to handle together. We will keep these issues in mind when we carry out the convergence analysis for the three equations.

3. A special nonlinear matrix equation. In this section we consider the nonlinear matrix equation (NME)

X + ATX−1A = Q, (3.1)

where A, Q ∈ Rn×n _{with Q being symmetric positive definite. Various aspects of the}

NME, like solvability, numerical solution, perturbation and applications, can be found in [8, 9, 13, 17, 22, 34, 37, 38, 39, 40] and the references therein.

For symmetric matrices X and Y , we write X ≥ Y (X > Y ) if X − Y is positive semidefinite (definite). We assume that (3.1) has a symmetric positive definite solu-tion. Then [9] it has a maximal symmetric positive definite solution X+ (X+≥ X for

any symmetric positive definite solution X of (3.1)), and ρ(X+−1A) ≤ 1, where ρ(·) is

the spectral radius. Let L0= 0 I AT ₀ , M0= A 0 Q −I . (3.2)

It is easy to verify that the pencil M0− λL0(also denoted by (M0, L0)) is symplectic,

i.e., M0J M0T = L0J LT0 for J = 0 I −I 0 .

(5)

Using Algorithm 2.2 with V0= A, T0= AT, Q0= Q, P0= 0, we have Tk = VkT, QTk =

Qk, PkT = Pk. So Algorithm 2.2 is simplified to the following, where we have used Ak

for Vk. Algorithm 3.1. Let A0= A, Q0= Q, P0= 0. For k = 0, 1, . . ., compute Ak+1= Ak(Qk− Pk)−1Ak, Qk+1= Qk− ATk(Qk− Pk)−1Ak, Pk+1= Pk+ Ak(Qk− Pk)−1ATk.

The matrices Lk, Mk in (2.7) are now given by

Lk= −Pk I AT k 0 , Mk= Ak 0 Qk −I . (3.3)

It is noted in [33] that the cyclic reduction algorithm in [34] is recovered from Algorithm 3.1 when Qk− Pk and Qk are replaced by Qk and Xk, respectively. So we

know from [34] that Qk− Pk> 0 in Algorithm 3.1. Thus the algorithm is well defined

and 0 ≤ Pk < Qk ≤ Q.

It is easy to verify that M0 I X+ = L0 I X+ X₊−1A. (3.4)

We are interested in the case with ρ(X₊−1A) = 1. It follows from [13, Theorem 2.4] that the eigenvalues of X₊−1A have the following characterization.

Theorem 3.1. For (3.1), the eigenvalues of the matrix X+−1A are precisely the

eigenvalues of the matrix pencil M0− λL0 inside or on the unit circle, with half of the

(necessarily even) partial multiplicities for each unimodular eigenvalue of the pencil. In view of the connection between Algorithm 3.1 and the cyclic reduction algo-rithm in [34], we know from [13] that the sequence Qk in Algorithm 3.1 converges to

X+at least linearly with rate 1/2, as long as all eigenvalues of X+−1A on the unit circle

are semisimple. With the tools in section 2, we are going to prove more convergence results for Algorithm 3.1, without any assumption on the unimodular eigenvalues of X+−1A.

Suppose there are r Jordan blocks associated with unimodular eigenvalues of (M0, L0). Then they have the form

Jωj,2mj = Jωj,mj Γ0,mj 0 Jωj,mj , Γ0,mj ≡ emje T 1, (3.5)

where ωj= eiθj for j = 1, . . . , r.

By the results on Kronecker canonical form for a symplectic pencil (see [11] and [32]), there exist nonsingular matrices V and Z such that

V L0Z = In 0n 0n JsH⊕ Im ≡ JL, (3.6) V M0Z = Js⊕ J1 0l⊕ Γ0 0n Il⊕ J1 ≡ JM, (3.7)

(6)

where Js∈ Cl×l consists of stable Jordan blocks (so ρ(Js) < 1), J1= Jω1,m1⊕ · · · ⊕

Jωr,mr, Γ0≡ Γ0,m1⊕ · · · ⊕ Γ0,mr, m = m1+ · · · + mr, l = n − m, ⊕ denotes the direct

sum of matrices and (·)H _{the conjugate transpose. Moreover, the nonsingular matrix}

Z can be taken to be of the form Z = ZaZb with Za symplectic and Zb = In⊕ Zc.

It follows that span{Z(:, 1 : n)} forms the unique weakly stable Lagrangian deflating subspace of (M0, L0) corresponding to Js⊕ J1.

Let Γk,mj be given by (2.13) with ω = ωj and m = mj. Since JLJM = JMJL, we

have by (2.11) MkZ _I ₀ 0 (JH s )2 k ⊕ I = LkZ " J2k s ⊕ J2 k 1 0 ⊕ Γk 0 I ⊕ J12k # , (3.8) where Γk= Γk,m1⊕ · · · ⊕ Γk,mr.

Similarly, there exist nonsingular matrices T and W such that

T M0W = JL, T L0W = JM, (3.9) and LkW _I ₀ 0 (JH s )2 k ⊕ I = MkW " J2k s ⊕ J2 k 1 0 ⊕ Γk 0 I ⊕ J12k # . (3.10) By Lemma 2.1 we have kΓ−1_k J12kk = O(2−k), kJ2 k 1 Γ−1k J 2k 1 k = O(2−k) as k → ∞. (3.11)

We now prove some convergence results for Algorithm 3.1. Partition Z and W as Z = Z1 Z3 Z2 Z4 , W = W1 W3 W2 W4 , (3.12) where Zi, Wi∈ Cn×n (i = 1, . . . , 4).

Theorem 3.2. When ρ(X+−1A) = 1, the sequences {Ak, Qk, Pk} generated by

Algorithm 3.1 satisfy (a) kAkk = O(2−k);

(b) kQk− X+k = O(2−k) and X+= Z2Z1−1;

(c) kPk − X−k = O(2−k) for X− = W2W1−1 if W1 is invertible; if A is also

invertible, then X− is a solution of (3.1) and the eigenvalues of X−−1A are

the reciprocals of the eigenvalues of X₊−1A;

(d) Qk− Pk converges to a singular matrix as k → ∞.

Proof. (a) Substituting Lk and Mk of (3.3) and Z of (3.12) into (3.8), we obtain

AkZ1= (−PkZ1+ Z2)(J2 k s ⊕ J 2k 1 ), (3.13) AkZ3((JsH) 2k_{⊕ I) = (−P} kZ1+ Z2)(0 ⊕ Γk) + (−PkZ3+ Z4)(I ⊕ J2 k 1 ), (3.14) QkZ1− Z2= ATkZ1(J2 k s ⊕ J 2k 1 ), (3.15) (QkZ3− Z4)((JsH) 2k ⊕ I) = AT kZ1(0 ⊕ Γk) + ATkZ3(I ⊕ J2 k 1 ). (3.16)

From (3.6) and (3.7) we have M0 Z1 Z2 = L0 Z1 Z2 (Js⊕ J1).

(7)

By Theorem 3.1, X₊−1A is similar to Js⊕ J1. Then from (3.4) and the uniqueness of

weakly stable Lagrangian deflating subspaces of (M0, L0) corresponding to Js⊕ J1,

we have Z1 Z2 = I X+ R

for a nonsingular matrix R. It follows that Z₁−1 exists and X+ = Z2Z1−1.

Post-multiplying (3.14) by (0 ⊕ Γ−1_k J₁2k)Z₁−1 and using (3.13), we have Ak h I − Z3(0 ⊕ Γ−1_k J2 k 1 )Z −1 1 i = (−PkZ1+ Z2)(J2 k s ⊕ 0)Z1−1− (−PkZ3+ Z4)(0 ⊕ J2 k 1 Γ−1k J 2k 1 )Z1−1.

It follows from (3.11) and the boundedness of {Pk} that

kAkk = O(2−k). (3.17)

(b) Post-multiplying (3.16) by (0 ⊕ Γ−1_k J2k 1 )Z

−1

1 and using (3.15), we get

Qk h I − Z3(0 ⊕ Γ−1k J 2k 1 )Z1−1 i − X+ =hAT_kZ1(J2 k s ⊕ 0) − A T kZ3(0 ⊕ J2 k 1 Γ −1 k J 2k 1 ) − Z4(0 ⊕ Γ−1k J 2k 1 ) i Z₁−1. (3.18) By (3.11) and (3.17), we have kQk− X+k = O(2−k).

(c) Substituting Lk and Mk of (3.3) and W of (3.12) into (3.10), we have

W2− PkW1= AkW1 J_s2k⊕ J12k , (3.19) (W4− PkW3) (J_sH)2k⊕ I= AkW1(0 ⊕ Γk) + AkW3 I ⊕ J₁2k. (3.20) Let X− = W2W1−1. As before, post-multiplying (3.20) by

0 ⊕ Γ−1_k J₁2kW₁−1 and using (3.19), we get X−− Pk h I − W3 0 ⊕ Γ−1_k J₁2kW₁−1i =hW4 0 ⊕ Γ−1_k J12k + AkW1 Js2k⊕ 0 − AkW3 0 ⊕ J12kΓ−1k J 2k 1 i W₁−1. (3.21) By (3.11) and the result of (a), we have

kX−− Pkk = O(2−k). From (3.9) we get 0 I AT ₀ I X− = A 0 Q −I I X− R−,

where R−= W1(Js⊕ J1)W1−1. It follows that

(8)

When A is invertible, the matrices X₊−1A, R−, X− are all invertible and we obtain

X−+ ATX−−1A = Q.

Moreover, the eigenvalues of X₋−1A are the reciprocals of the eigenvalues of R− (and

thus X₊−1A).

(d) From (3.13) and (3.15), we get −PkZ1(J2 k s ⊕ J 2k 1 ) = AkZ1− Z2(J2 k s ⊕ J 2k 1 ), QkZ1(J2 k s ⊕ J 2k 1 ) = Z2(J2 k s ⊕ J 2k 1 ) + A T kZ1(J2·2 k s ⊕ J 2·2k 1 ).

This implies that

(Qk− Pk)Z1 0 Im = AkZ1 ₀ J₁−2k + AT_kZ1 ₀ J2k 1 . (3.22)

Since 0 ≤ Pk ≤ Pk+1 ≤ Q, the sequence Pk converges even if W1 is singular. Let

lim(Qk− Pk) = R∗. It follows from (3.22) and the result of (a) that

R∗Z1 0 Im = 0. Thus R∗ is singular.

The most important conclusion in Theorem 3.2 is that the sequence Qk from the

doubling algorithm converges to X+ at least linearly with rate 1/2, regardless of the

values of mj(j = 1, 2, . . . , r). This is in sharp contrast with the behaviour of Newton’s

method. The NME (3.1) is a special case of the discrete algebraic Riccati equation studied in [12]. It is conjectured in [12] that the convergence of Newton’s method is linear with rate 1/√q

2, where q = max1≤j≤rmj.

4. A quadratic matrix equation from quasi-birth-death problems. A discrete-time quasi-birth-death (QBD) process is a Markov chain with state space {(i, j) | i ≥ 0, 1 ≤ j ≤ n}, and with a transition probability matrix of the form

P =        B0 B1 0 0 · · · A0 A1 A2 0 · · · 0 A0 A1 A2 · · · 0 0 A0 A1 · · · .. . ... ... ... . ..        ,

where B0, B1, A0, A1, and A2are n×n nonnegative matrices such that P is stochastic.

In particular, (A0+ A1+ A2)e = e, where e = (1, 1, . . . , 1)T.

We make the standard assumption that the matrix P and the matrix A = A0+

A1+ A2 are both irreducible. Thus, A0 6= 0 and A2 6= 0. Moreover, there exists

a unique positive vector α with αT_{e = 1 and α}T_{A = α}T_{. The QBD is positive}

recurrent if αT_A

0e > αTA2e, transient if αTA0e < αTA2e, and null recurrent if

αT_A

0e = αTA2e.

The minimal nonnegative solution G of the matrix equation

(9)

plays an important role in the study of the QBD process (see [31]). We will also need the dual equation

F = A2+ A1F + A0F2, (4.2)

and we let F be its minimal nonnegative solution. It is well known (see [31], for exam-ple) that if the QBD is positive recurrent, then G is stochastic and F is substochastic with spectral radius ρ(F ) < 1; if the QBD is transient, then F is stochastic and G is substochastic with ρ(G) < 1; if the QBD is null recurrent, then G and F are both stochastic.

The Latouche–Ramaswami (LR) algorithm [30] and the cyclic reduction (CR) algorithm [5] are both efficient iterative methods for finding the minimal solution G. The convergence of these two algorithms is quadratic for positive recurrent and tran-sient QBDs. A convergence analysis has been performed in [15] for the LR algorithm in the null recurrent case under two additional assumptions. The first assumption is that λ = 1 is a simple eigenvalue of G and F and there are no other eigenvalues of G or F on the unit circle; the second assumption is made under the first assumption and is more technical. The convergence rate for the LR algorithm is the same in view of the relationship between CR and LR, given in [3].

We can also use the doubling algorithm (SDA-1 or SDA-2) to find the minimal solution G. We will choose to use SDA-2 since it is less expensive. Moreover, there is a close connection between the CR algorithm and SDA-2. In this section we determine the convergence rate of SDA-2 in the null recurrent case, without the two additional assumptions in [15]. The convergence rate for the CR (or LR) algorithm in the null recurrent case is the same in view of their connections to SDA-2. As compared to [15], the result here is more general and the analysis here is much simpler.

The CR algorithm for (4.1), or for −A0+ (I − A1)G − A2G2= 0, is the following:

Algorithm 4.1. Set T0= A0, U0= I − A1, V0= A2, S0= I − A1. For k = 0, 1, . . ., compute Tk+1= TkUk−1Tk, Uk+1= Uk− TkUk−1Vk− VkUk−1Tk, Vk+1= VkU_k−1Vk, Sk+1= Sk− VkUk−1Tk.

The above CR algorithm is as presented in [3], but with one minor change: if we follow [3] exactly, Tk and Vk here would have to be replaced by −Tk and −Vk for

k ≥ 0.

The following result is known from the discussions in [4] and [31].

Theorem 4.1. The sequences {Tk}, {Uk}, {Vk}, {Sk} in Algorithm 4.1 are well

defined. For each k ≥ 0, Tk and Vk are nonnegative, and Ukand Sk are nonsingular

M -matrices. When the QBD is positive recurrent or transient, the sequence {Sk}

converges quadratically to a nonsingular M -matrix S∗ and S∗−1A0= G.

We note that Algorithm 4.1 may break down if we do not assume the irreducibility of the transition matrix P . As an example, we consider

A0= 0 0 1 0 , A1= 0, A2= 0 1 0 0 .

(10)

It is easy to see that P is not irreducible, although A0+ A1+ A2is. For this example,

U1= 0 in Algorithm 4.1, so the algorithm breaks down. The LR algorithm also breaks

down for this example.

To use the doubling algorithm to find G, we may rewrite (4.1) as 0 I A0 A1− I I G = I 0 0 −A2 I G G.

Multiplying the second block row by −(I − A1)−1and eliminating the I in the (1, 2)

block of the leftmost matrix, we get (I − A1)−1A0 0 −(I − A1)−1A0 I I G = I −(I − A1)−1A2 0 (I − A1)−1A2 I G G.

We can then use SDA-1 to find the matrix G. However, the less expensive SDA-2 can also be used if we rewrite (4.1) as

L0 I A2G = M0 I A2G G, (4.3) where L0= 0 I A0 0 , M0= A2 0 I − A1 −I .

It is easily seen that L0− λM0is a linearization of −A0+ λ(I − A1) − λ2A2.

If we use SDA-1, the matrix G can be approximated directly by a sequence gen-erated by SDA-1. One may have some concern about the SDA-2 approach: how can one get G if A2G is obtained and A2 is singular? This concern will turn out to be

unnecessary.

In this section SDA-2 is Algorithm 2.2 with the initialization

T0= A0, Q0= I − A1, P0= 0, V0= A2. (4.4)

The algorithm generates the sequence {Lk, Mk} (see (2.7)) if no breakdown occurs.

It is readily seen that Algorithm 4.1 is recovered from SDA-2 by letting Uk =

Qk− Pk and Sk = S0− Pk. By Theorem 4.1, Qk− Pk = Uk are nonsingular M

-matrices for all k ≥ 0. So SDA-2 is also well defined. In view of (2.9) we have for each k ≥ 0

Lk I A2G = Mk I A2G G2k. So −Pk+ A2G = VkG2 k , Tk= QkG2 k − A2G2 k₊₁ . (4.5) Similarly we have b L0 I A0F = cM0 I A0F F, where b L0= 0 I V0 0 , Mc0= T0 0 Q0 −I .

(11)

It is easily seen that cM0− λbL0is also a linearization of −A0+ λ(I − A1) − λ2A2.

For each k ≥ 0 we now have

b Lk I A0F = cMk I A0F F2k, where b Lk = − bPk I Vk 0 , Mck = _T k 0 b Qk −I with b Pk = I − A1− Qk, Qbk = I − A1− Pk. (4.6) So − bPk+ A0F = TkF2 k , Vk= bQkF2 k − A0F2 k₊₁ . (4.7)

We mentioned before that the Sk in Algorithm 4.1 satisfies Sk = S0− Pk =

I − A1− Pk. So we have bQk= Sk.

When the QBD is positive recurrent or transient, we know by Theorem 4.1 that b

Qk converges quadratically to a nonsingular M -matrix bQ∗and bQ−1∗ A0= G. Here we

give a quick proof using the doubling algorithm. By the first equation in (4.5) and the second equation in (4.6), we have

b

Qk− I + A1+ A2G = VkG2

k

. Eliminating Vk using the second equation in (4.7) gives

b Qk(I − F2 k G2k) = I − A1− A2G − A0F2 k₊₁ G2k. It follows that lim sup k→∞ 2kq k bQk− (I − A1− A2G)k ≤ ρ(F )ρ(G) < 1.

Since bQ∗ = I − A1 − A2G is a nonsingular M -matrix and A0 = bQ∗G, we have

G = bQ−1_∗ A0. Similarly, Qk converges quadratically to the nonsingular M -matrix

Q∗= I − A1− A0F and F = Q−1∗ A2.

Our main purpose of this section, however, is to determine the convergence rate of SDA-2 for the null recurrent case.

We start with a review of an important result about the spectral properties of the quadratic pencil −A0+ λ(I − A1) − λ2A2and of the matrices G and F when the

QBD is null recurrent. See Proposition 14 and Theorem 4 of [10] and Theorem 4.10 of [4].

Theorem 4.2. Let the QBD be null recurrent. Then

(a) For some integer r ≥ 1 the quadratic pencil −A0+ λ(I − A1) − λ2A2 has

n − r eigenvalues inside the unit circle, n − r eigenvalues outside the unit circle (which include eigenvalues at infinity), and 2r eigenvalues on the unit circle, which are the rth roots of unity, each with multiplicity two.

(12)

(c) The eigenvalues of G are the n − r eigenvalues of the pencil inside the unit circle plus the r simple eigenvalues at the rth roots of unity, the eigenvalues of F are the reciprocals of the n − r eigenvalues of the pencil outside the unit circle plus the r simple eigenvalues at the rth roots of unity.

Using the Kronecker form for matrix pairs, we have nonsingular matrices V and Z such that V M0Z = In 0 0 J2⊕ Ir ≡ JM, (4.8) V L0Z = J1⊕ Dr 0 ⊕ Ir 0 In−r⊕ Dr ≡ JL, (4.9)

where J1 and J2 are (n − r) × (n − r) matrices consisting of the Jordan blocks with

diagonal elements inside the unit circle, Dr is a r × r diagonal matrix with the rth

roots of unity on the diagonal.

Similarly, we have nonsingular matrices T and W such that

T bL0W = In 0 0 J2⊕ Ir = JM, (4.10) T cM0W = J1⊕ Dr 0 0 ⊕ Ir In−r⊕ Dr ≡ bJL. (4.11)

We have for each k ≥ 0 MkZJ2 k L = LkZJ2 k M, LbkW bJ2 k L = cMkW J2 k M. (4.12)

Let Z and W be partitioned as in (3.12). From (4.8) and (4.9) we have

L0 Z1 Z2 = M0 Z1 Z2 (J1⊕ Dr).

Comparing this with (4.3) and using Theorem 4.2, we know that Z1 is nonsingular

and Z2Z1−1= A2G. Similarly, W3 is nonsingular and W4W3−1= A0F .

Using block matrix multiplication for (4.12), we have VkZ1(J2 k 1 ⊕ D 2k r ) = −PkZ1+ Z2, (4.13) (QkZ1− Z2)(J2 k 1 ⊕ D 2k r ) = TkZ1, (4.14) VkZ1(0 ⊕ 2kD2 k₋₁ r ) + VkZ3(I ⊕ D2 k r ) = (−PkZ3+ Z4)(J2 k 2 ⊕ I), (4.15) (QkZ1− Z2)(0 ⊕ 2kD2 k₋₁ r ) + (QkZ3− Z4)(I ⊕ D2 k r ) = TkZ3(J2 k 2 ⊕ I), (4.16) (− bPkW1+ W2)(J2 k 1 ⊕ D 2k r ) + (− bPkW3+ W4)(0 ⊕ 2kD2 k₋₁ r ) = TkW1, (4.17) VkW1(J2 k 1 ⊕ D 2k r ) + VkW3(0 ⊕ 2kD2 k₋₁ r ) = bQkW1− W2, (4.18) (− bPkW3+ W4)(I ⊕ D2 k r ) = TkW3(J2 k 2 ⊕ I), (4.19) VkW3(I ⊕ D2 k r ) = ( bQkW3− W4)(J2 k 2 ⊕ I). (4.20)

Post-multiplying (4.16) by 0 ⊕ 2−kDr and subtracting the result from (4.14), we get

Tk(Z1−Z3(0⊕2−kDr)) = (QkZ1−Z2)(J2

k

1 ⊕0)−(QkZ3−Z4)(0⊕2−kD2

k₊₁

(13)

By (4.19) we have − bPk= −W4W3−1+ TkW3(J2 k 2 ⊕ D −2k r )W −1 3 . (4.22) Thus, in view of (4.6), Qk= I − A1− W4W3−1+ TkW3(J2 k 2 ⊕ D −2k r )W −1 3 . (4.23)

Inserting (4.23) into (4.21) and letting Q∗= I − A1− W4W3−1, we get

Tk h Z1− Z3(0 ⊕ 2−kDr) − W3(J2 k 2 ⊕ D −2k r )W −1 3 (Z1(J2 k 1 ⊕ 0) − Z3(0 ⊕ 2−kD2 k₊₁ r )) i = (Q∗Z1− Z2)(J2 k 1 ⊕ 0) − (Q∗Z3− Z4)(0 ⊕ 2−kD2 k₊₁ r ),

from which it follows that

kTkk = O(2−k).

It then follows from (4.23) that

kQk− (I − A1− W4W3−1)k = O(2−k).

Post-multiplying (4.15) by 0 ⊕ 2−kDr and subtracting the result from (4.13), we get

−PkZ1+Z2−(−PkZ3+Z4)(0⊕2−kDr) = Vk(Z1(J2 k 1 ⊕0)−Z3(0⊕2−kD2 k₊₁ r )). (4.24) By (4.20), Vk= ( bQkW3− W4)(J2 k 2 ⊕ D −2k r )W −1 3 . (4.25)

Inserting (4.25) into (4.24) and using bQk= I − A1− Pk, we get

−PkZ1+ Z2− (−PkZ3+ Z4)(0 ⊕ 2−kDr) = ((I − A1− Pk)W3− W4)Ck

for some Ck with kCkk = O(2−k). Thus,

Pk(Z1− Z3(0 ⊕ 2−kDr) − W3Ck) = Z2− Z4(0 ⊕ 2−kDr) − ((I − A1)W3− W4)Ck. It follows that kPk− Z2Z1−1k = O(2 −k_). Post-multiplying (4.18) by 0 ⊕ 2−kD1−2k r , we get VkW1(0 ⊕ 2−kDr) + VkW3(0 ⊕ I) = ( bQkW1− W2)(0 ⊕ 2−kD1−2 k r ). (4.26) Post-multiplying (4.20) by I ⊕ 0, we get VkW3(I ⊕ 0) = ( bQkW3− W4)(J2 k 2 ⊕ 0). (4.27)

Adding (4.26) and (4.27) gives

Vk(W3+ W1(0 ⊕ 2−kDr)) = ( bQkW1− W2)(0 ⊕ 2−kD1−2

k

r ) + ( bQkW3− W4)(J2

k

(14)

It follows that

kVkk = O(2−k),

since W3is nonsingular and { bQk} has been shown to be bounded.

In summary, we have proved the following result.

Theorem 4.3. Let the QBD be null-recurrent. Then for SDA-2 we have kVkk = O(2−k), kTkk = O(2−k),

kQk− (I − A1− A0F )k = O(2−k), kPk− A2Gk = O(2−k).

Corollary 4.4. Let lim Qk = Q∗ and lim Pk = P∗. Then Q∗ is nonsingular

and Q−1

∗ A2= F , I − A1− P∗ is nonsingular and (I − A1− P∗)−1A0= G. The matrix

Q∗− P∗ is a singular M -matrix.

Proof. By Theorem 4.3, Q∗= I −A1−A0F and I −A1−P∗ = I −A1−A2G. These

two matrices are known to be nonsingular [31]. Since Q∗F = (I − A1− A0F )F = A2,

Q−1_∗ A2= F . Since (I − A1− P∗)G = (I − A1− A2G)G = A0, (I − A1− P∗)−1A0= G.

Q∗− P∗is a singular M -matrix since

(Q∗− P∗)e = (I − A1− A0F − A2G)e = e − (A1+ A0+ A2)e = 0.

This completes the proof.

When the QBD is null recurrent, the interpretation of the CR algorithm as a doubling algorithm has allowed us to show that the minimal solutions G and F can be found by the CR algorithm (or the closely related LR algorithm) simultaneously and with at least linear convergence with rate 1/2. It is important to note that we no longer need the assumption that the matrices G and F have no eigenvalues on the unit circle other than the simple eigenvalue 1. With that assumption, one would use the shift technique as studied in [25], [16] and [4], and apply the CR algorithm or the LR algorithm to the shifted equation. When G and F have more than one eigenvalues on the unit circle, the shift technique is not helpful and the CR algorithm or the LR algorithm will be applied directly to the equation (4.1).

5. A nonsymmetric algebraic Riccati equation. In this section we consider the nonsymmetric algebraic Riccati equation (NARE)

XCX − XD − AX + B = 0, (5.1)

where A, B, C, D are real matrices of sizes m × m, m × n, n × m, n × n, respectively, and the matrix

K = D −C −B A (5.2) is a nonsingular M -matrix or an irreducible singular M -matrix. The NARE arises in the study of Wiener–Hopf factorization of Markov chains [36], and it includes the NARE arising from transport theory [28, 29]. We will also need the dual equation of (5.1)

Y BY − Y A − DY + C = 0, (5.3)

(15)

We will use the elementwise order for matrices: for any matrices A = [aij], B =

[bij] ∈ Rm×n, we write A ≥ B(A > B) if aij ≥ bij(aij > bij) for all i, j.

A basic result about (5.1) and (5.3) is the following [14].

Theorem 5.1. If the matrix K in (5.2) is a nonsingular M -matrix or an irre-ducible singular M -matrix, then the NARE (5.1) and the NARE (5.3) have minimal nonnegative solutions X and Y , respectively. Moreover, D − CX and A − BY are M -matrices.

The minimal nonnegative solution of the NARE is the solution of practical inter-est. There have been a number of methods for finding this solution. The methods and their analyses can be found in [2, 14, 18, 20, 21, 23, 24, 35]. Among the iterative methods, the doubling algorithm proposed in [24] stands out for its overall efficiency. The algorithm is analyzed in [24] for the case when K is a nonsingular M -matrix, and is analyzed in [21] for the case when K is an irreducible singular M -matrix. When K is an irreducible singular M -matrix, we let [vT

1, v2T]T > 0 and [uT1, uT2]T > 0 be

the right and the left null vectors of K in (5.2), respectively. If uT

1v1 6= uT2v2, then

the convergence of the doubling algorithm is still quadratic; if uT

1v1= uT2v2, then the

convergence is observed to be linear with rate 1/2 (see [21]). The later case will be referred to as the critical case for the NARE. For this critical case, the convergence of Newton’s method has been shown to at least linear with rate 1/2 [14, 20, 23]. We will reach the same conclusion for the doubling algorithm.

We start with a brief review of the doubling algorithm in [24]. Let H =D −C B −A , (5.4) and R = D − CX, S = A − BY, (5.5)

where X and Y are given in Theorem 5.1. Then the NAREs (5.1) and (5.3) can be rewritten as H In X = In X R (5.6) and H Y Im = Y Im (−S). (5.7)

Applying the Cayley transform to equation (5.6) with a scalar γ > 0 we have (H − γI) In X = (H + γI) In X Rγ,

where Rγ = (R + γIn)−1(R − γIn). Premultiplying the above equation by a proper

nonsingular matrix gives M0 In X = L0 In X Rγ. (5.8)

Here L0 and M0are given by (2.1) with

E0= In− 2γVγ−1, F0= Im− 2γWγ−1, G0= 2γDγ−1CW −1 γ , H0= 2γWγ−1BD −1 γ , (5.9)

(16)

where Aγ = A + γIm, Dγ = D + γIn, Wγ = Aγ− BDγ−1C, Vγ = Dγ− CA−1γ B. (5.10) Similarly, M0 Y Im Sγ= L0 Y Im , (5.11)

where Sγ = (S + γIm)−1(S − γIm).

In this section SDA-1 denotes Algorithm 2.1 with E0, F0, G0, H0given by (5.9).

The following result from [21] improves the original results given in [24].

Theorem 5.2. Let the matrix K in (5.2) be a nonsingular M -matrix or an irreducible singular M -matrix, and X, Y ≥ 0 be the minimal nonnegative solutions of the NAREs (5.1) and (5.3), respectively. If γ satisfies

γ ≥ γ0≡ max

max

1≤i≤maii, max1≤i≤ndii

, (5.12)

where aii and dii are the diagonal entries of A and D, respectively, then the sequence

{Ek, Fk, Hk, Gk} in SDA-1 is well defined. Moreover, we have

(a) E0, F0< 0 and Ek, Fk > 0 for k ≥ 1;

(b) For k ≥ 0, 0 ≤ Hk < Hk+1< X, 0 ≤ Gk < Gk+1< Y ;

(c) For k ≥ 0, Im− HkGk and In− GkHk are nonsingular M-matrices.

From now on we assume that K in (5.2) is an irreducible singular M -matrix, and consider the critical case of the NARE (5.1). We always assume that γ satisfies (5.12). The Kronecker form for the pencil (M0, L0) can be determined with the help of

the following result [14], where C− and C+ denote the open left and the open right

half planes, respectively.

Theorem 5.3. For the critical case of the NARE (5.1), the matrix H has n − 1 eigenvalues in C+, m−1 eigenvalues in C−, and two zero eigenvalues with a quadratic

divisor. Moreover, R and S in (5.5) are irreducible singular M-matrices (so each of them has a simple eigenvalue 0 and the remaining eigenvalues are in C+).

In view of Theorem 5.3, the property of Cayley transform, and the process leading to (5.8) and (5.11), we know that there are nonsingular matrices V and Z such that

V L0Z = In 0n,m 0m,n J2,s⊕ [1] ≡ JL, (5.13) V M0Z = J1 Γ 0m,n Im−1⊕ [−1] ≡ JM, (5.14) in which J1= J1,s⊕ [−1] s ∼ Rγ, J2≡ J2,s⊕ [−1] s ∼ Sγ, Γ = 0n−1,m−1⊕ [1] ≡ eneTm, (5.15) where ρ(J1,s) < 1, ρ(J2,s) < 1, and “ s

∼” denotes the similarity transformation. Since JLJM = JMJL, for the matrices Lk and Mk given by (2.2) we have by (2.11)

MkZJ2

k

L = LkZJ2

k

(17)

On the other hand, there are nonsingular matrices T and W such that T L0W = J2 bΓ 0n,m In−1⊕ [−1] ≡ bJL, (5.17) T M0W = Im 0m,n 0n,m J1,s⊕ [1] ≡ bJM, (5.18)

where bΓ = emeTn. We now have

LkW bJ2

k

M = MkW bJ2

k

L . (5.19)

The following result determines the convergence rate of SDA-1 in the critical case. Theorem 5.4. Let X, Y ≥ 0 be the minimal nonnegative solutions of the NAREs (5.1) and (5.3), respectively, and let {Ek, Fk, Gk, Hk} be generated by SDA-1. Then

for the critical case

kEkk = O(2−k), kFkk = O(2−k), kHk− Xk = O(2−k), kGk− Y k = O(2−k).

Proof. Partition the matrices Z and W as Z = Z1 Z3 Z2 Z4 , W = W1 W3 W2 W4 , (5.20)

where Z1, W3 ∈ Rn×n and Z4, W2 ∈ Rm×m. Then from (5.13) and (5.14), and from

(5.17) and (5.18), we have M0 Z1 Z2 = L0 Z1 Z2 J1, M0 W1 W2 J2= L0 W1 W2 . (5.21) Comparing (5.21) with (5.8) and (5.11), and using (5.15), we know that Z1 and W2

are invertible and X = Z2Z1−1, Y = W1W2−1.

Note that for k ≥ 1 we have J_L2k= _I n 0 0 J2k 2 , J_M2k= J2k 1 Γk 0 Im , bJ_M2k= _I m 0 0 J2k 1 , bJ_L2k= J2k 2 Γbk 0 In ,

where Γk = −2kΓ = −2keneTm, bΓk = −2kbΓ = −2kemeTn. It follows from (5.16) and

(5.19) that for k ≥ 1 EkZ1= (Z1− GkZ2)J2 k 1 , (5.22) EkZ3J2 k 2 = (Z1− GkZ2)Γk+ (Z3− GkZ4), (5.23) −HkZ1+ Z2= FkZ2J2 k 1 , (5.24) (−HkZ3+ Z4)J2 k 2 = FkZ2Γk+ FkZ4, (5.25) W1− GkW2= EkW1J2 k 2 , (5.26) (W3− GkW4)J2 k 1 = EkW1Γbk+ EkW3, (5.27) FkW2= (W2− HkW1)J2 k 2 , (5.28) FkW4J2 k 1 = (W2− HkW1)bΓk+ (W4− HkW3). (5.29)

(18)

Post-multiplying (5.29) by bΓ†_k = −2−kΓ, the Moore-Penrose pseudo inverse of bΓk,

subtracting the result from (5.28), and noting that bΓkbΓ

† k= 0m−1⊕ [1], we get Fk(W2+ 2−kW4J2 k 1 Γ) = (W2− HkW1)(J2 k 2,s⊕ [0]) + 2 −k_(W 4− HkW3)Γ. (5.30)

Since W2is invertible and {Hk} is bounded by Theorem 5.2(b), it follows from (5.30)

that kFkk = O(2−k). It then follows from (5.24) that kHk− Xk = O(2−k).

Similarly, post-multiplying (5.23) by Γ†_k = −2−kbΓ, subtracting the result from (5.22), and noting that ΓkΓ†k = 0n−1⊕ [1], we get

Ek(Z1+ 2−kZ3J2

k

2 Γ) = (Zb 1− GkZ2)(J2

k

1,s⊕ [0]) + 2−k(Z3− GkZ4)bΓ. (5.31)

Since Z1 is invertible and {Gk} is bounded by Theorem 5.2(b), it follows from (5.31)

that kEkk = O(2−k). It then follows form (5.26) that kGk− Y k = O(2−k).

We note that lim(I − GkHk) = I − Y X and lim(I − HkGk) = I − XY are both

singular M -matrices (see [21]).

The critical case we have considered is a singular case, and the singularity can be removed by applying a proper shift technique. Indeed, a shift technique has been introduced in [21] and SDA-1 applied to the shifted NARE has quadratic convergence if no breakdown happens. However, whether breakdown is possible remains an open problem in general, although some partial results have been obtained in [21].

Since K is an irreducible singular M -matrix, we may assume without loss of generality that Ke = 0. In this case, one can transform the NARE to a quadratic matrix equation of the type in section 4, but with (m + n) × (m + n) matrices in the equation (see [35]). One can then apply CR and LR to the transformed equation (see [2, 18]). A specific shift technique (following [25]) is introduced in [18] to the transformed equation, and quadratic convergence is recovered for the LR algorithm (thus also for the CR algorithm) if no breakdown happens. It has been shown in [20] that the LR algorithm is indeed well-defined when the shift technique is used. However, when m = n, the computational work required in each iteration is nearly twice that for SDA-1, due to the dimension expansion from n to 2n.

Although it is preferable to use a shift technique for the critical case of the NARE (with an irreducible singular M -matrix K), our convergence results in Theorem 5.4 still provide some insights about the convergence behaviour of SDA-1 for nearby NAREs with a nonsingular M -matrix K (where the shift technique is no longer app-plicable). The exact solution of a singular NARE is quite sensitive to the input data in the NARE (see [20]). For the singular NARE and nearby NAREs, it would be reasonable to stop the iteration when kHk − Hk−1k < 1/2, where is the machine

epsilon, and take Hkas an approximation to the exact solution X. Further iterations

for SDA-1 may not be able to improve the accuracy significantly in view of the pertur-bation behaviour of X and the fact that I − GkHk and I − HkGk are nearly singular

for large k. So we are mainly interested in the behaviour of SDA-1 for iterations up to the point where kHk− Hk−1k < 1/2(assuming this is achievable). And up to that

point, the behaviour of SDA-1 for those nearby NAREs would be very much similar to that of SDA-1 for the singular NARE. We use one example to illustrte this point. Example 5.1. Let T be a 16 × 16 doubly stochastic matrix given by T =

1

2056magic(16), where magic is the Matlab function that generates magic squares.

Let K = I − T , and let the 8 × 8 matrices A, B, C, D be determined through (5.2). The matrix K is an irreducible singular M -matrix and we have the critical case for the NARE (5.1). We take γ to be the largest diagonal entry of K (which is the last

(19)

diagonal entry of K) and apply SDA-1. We find that kHk− Hk−1k < 10−7 is satisfied

for k = 24. The convergence rate of Hk− X is determined through that of Fk (see

the proof of Theorem 5.4). We find that the values of pkFk

kk∞ are between 0.4924

and 0.5001 for k = 4 : 24.

We then increase the (1,1) entry of K by 10−12. So K is now a nonsingular M -matrix. The matrix D is changed accordingly. The change in K does not change the largest diagonal entry of K. So we apply SDA-1 to the new NARE with the same γ. We find that kHk− Hk−1k < 10−7 is satisfied for k = 23, and that the values

of _pkFk

kk∞ are between 0.4924 and 0.5000 for k = 4 : 21 (the values are 0.4855

and 0.4570 for k = 22 and k = 23, respectively). Thus, the (non-terminal and more important) convergence behaviour of SDA-1 for this nearby NARE is largely dictated by our theoretical results in Theorem 5.4.

6. Conclusion. We have determined the convergence rate of the doubling algo-rithm in the critical (or singular) case for three different nonlinear matrix equations. It is possible to apply the techniques we reviewed in section 2 to other nonlinear matrix equations. Through this study, we have also gained more insights for the convergence behaviour for the doubling algorithm for nearly singular cases.

REFERENCES

[1] B. D. O. Anderson, Second-order convergent algorithms for the steady-state Riccati equation, Internat. J. Control, 28 (1978), pp. 295–306.

[2] D. A. Bini, B. Iannazzo, G. Latouche, and B. Meini, On the solution of algebraic Riccati equations arising in fluid queues, Linear Algebra Appl., 413 (2006), pp. 474–494. [3] D. A. Bini, G. Latouche, and B. Meini, Solving matrix polynomial equations arising in

queueing problems, Linear Algebra Appl., 340 (2002), pp. 225–244.

[4] D. A. Bini, G. Latouche, and B. Meini, Numerical Methods for Structured Markov Chains, Oxford University Press, 2005.

[5] D. Bini and B. Meini, On the solution of a nonlinear matrix equation arising in queueing problems, SIAM J. Matrix Anal. Appl., 17 (1996), pp. 906–926.

[6] E. K.-W. Chu, H.-Y. Fan, and W.-W. Lin, A structure-preserving doubling algorithm for continuous-time algebraic Riccati equations, Linear Algebra Appl., 396 (2005), pp. 55–80. [7] E. K.-W. Chu, H.-Y. Fan, W.-W. Lin, and C. S. Wang, Structure-preserving algorithms for periodic descrete-time algebraic Riccati equations, Internat. J. Control, 77 (2004), pp. 767– 788.

[8] J. C. Engwerda, On the existence of a positive definite solution of the matrix equation X + AT_X−1_{A = I, Linear Algebra Appl., 194 (1993), pp. 91–108.}

[9] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, Necessary and sufficient conditions for the existence of a positive definite solution of the matrix equation X + A∗X−1A = Q, Linear Algebra Appl., 186 (1993), pp. 255–275.

[10] H. R. Gail, S. L. Hantler, and B. A. Taylor, Spectral analysis of M/G/1 and G/M/1 type Markov chains, Adv. Appl. Probab., 28 (1996), pp. 114–165.

[11] F. R. Gantmacher, The Theory of Matrices, Vol. II, Chelsea Publishing Company, New York, 1987.

[12] C.-H. Guo, Newton’s method for discrete algebraic Riccati equations when the closed-loop matrix has eigenvalues on the unit circle, SIAM J. Matrix Anal. Appl., 20 (1998), pp. 279–294.

[13] C.-H. Guo, Convergence rate of an iterative method for a nonlinear matrix equation, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 295–302.

[14] C.-H. Guo, Nonsymmetric algebraic Riccati equations and Wiener–Hopf factorization for M-matrices, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 225–242.

[15] C.-H. Guo, Convergence analysis of the Latouche–Ramaswami algorithm for null recurrent quasi-birth-death processes, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 744–760. [16] C.-H. Guo, Comments on a shifted cyclic reduction algorithm for quasi-birth-death problems,

(20)

[17] C.-H. Guo, Numerical solution of a quadratic eigenvalue problem, Linear Algebra Appl., 385 (2004), pp. 391–406.

[18] C.-H. Guo, Efficient methods for solving a nonsymmetric algebraic Riccati equation arising in stochastic fluid models, J. Comput. Appl. Math., 192 (2006), pp. 353–373.

[19] C.-H. Guo, A new class of nonsymmetric algebraic Riccati equations, Linear Algebra Appl., 426 (2007), pp. 636–649.

[20] C.-H. Guo and N. J. Higham, Iterative solution of a nonsymmetric algebraic Riccati equation, SIAM J. Matrix Anal. Appl., 29 (2007), pp. 396–412.

[21] C.-H. Guo, B. Iannazzo, and B. Meini, On the doubling algorithm for a (shifted) nonsym-metric algebraic Riccati equation, SIAM J. Matrix Anal. Appl., 29 (2007), pp. 1083–1100. [22] C.-H. Guo and P. Lancaster, Iterative solution of two matrix equations, Math. Comp., 68

(1999), pp. 1589–1603.

[23] C.-H. Guo and A. J. Laub, On the iterative solution of a class of nonsymmetric algebraic Riccati equations, SIAM J. Matrix Anal. Appl., 22 (2000), pp. 376–391.

[24] X.-X. Guo, W.-W. Lin, and S.-F. Xu, A structure-preserving doubling algorithm for nonsym-metric algebraic Riccati equation, Numer. math., 103 (2006), pp. 393–412.

[25] C. He, B. Meini, and N. H. Rhee, A shifted cyclic reduction algorithm for quasi-birth-death problems, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 673–691.

[26] T.-M. Huang and W.-W. Lin, Structured doubling algorithms for weakly stabilizing Hermitian solutions of algebraic Riccati equations, Linear Algebra Appl., to appear.

[27] T.-M. Hwang, E. K.-W. Chu, and W.-W. Lin, A generalized structure-preserving doubling algorithm for generalized discrete-time algebraic Riccati equations, Internat. J. Control, 78 (2005), pp. 1063–1075.

[28] J. Juang, Existence of algebraic matrix Riccati equations arising in transport theory, Linear Algebra Appl., 230 (1995), pp. 89–100.

[29] J. Juang and W.-W. Lin, Nonsymmetric algebraic Riccati equations and Hamiltonian-like matrices, SIAM J. Matrix Anal. Appl., 20 (1998), pp. 228–243.

[30] G. Latouche and V. Ramaswami, A logarithmic reduction algorithm for quasi-death-birth processes, J. Appl. Probab., 30 (1993), pp. 650–674.

[31] G. Latouche and V. Ramaswami, Introduction to Matrix Analytic Methods in Stochastic Modeling, SIAM, Philadelphia, PA, 1999.

[32] W.-W. Lin, V. Mehrmann, and H. Xu, Canonical forms for Hamiltonian and symplectic matrices and pencils, Linear Algbera Appl., 302/303 (1999), pp. 469–533.

[33] W.-W. Lin and S.-F. Xu, Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 26–39. [34] B. Meini, Efficient computation of the extreme solutions of X + A∗X−1A = Q and X −

A∗_X−1_{A = Q, Math. Comp., 71 (2002), pp. 1189–1204.}

[35] V. Ramaswami, Matrix analytic methods for stochastic fluid flows, in Proceedings of the 16th International Teletraffic Congress, Elsevier Science B. V., Edinburgh, 1999, pp. 1019–1030. [36] L. C. G. Rogers, Fluid models in queueing theory and Wiener–Hopf factorization of Markov

chains, Ann. Appl. Probab., 4 (1994), pp. 390–413.

[37] J.-G. Sun and S.-F. Xu, Perturbation analysis of the maximal solution of the matrix equation X + AT_X−1_{A = P , II, Linear Algebra Appl., 362 (2003), pp. 211—228.}

[38] S.-F. Xu, Numerical methods for the maximal solution of the matrix equation X + AT_X−1_{A =}

I, Acta Scientiarum Naturalium Universitatis Pekinensis, 36 (2000), pp. 29–38.

[39] X. Zhan, Computing the extremal positive definite soluions of a matrix equation, SIAM J. Sci. Comput., 17 (1996), pp. 1167–1174.

[40] X. Zhan and J. Xie, On the matrix equation X + AT_X−1_{A = I, Linear Algebra Appl., 247}