A structured doubling algorithm for nonsymmetric algebraic Riccati equations and quasi-birth-death problems (a singular case)

(1)

A structured doubling algorithm for

nonsymmetric algebraic Riccati equations (a

singular case)

Chun-Yueh Chiang

a

, Wen-Wei Lin

b,

∗

a_{Department of Mathematics, National Tsing Hua University, Hsinchu 300,}

Taiwan.

b_{Department of Mathematics, National Tsing Hua University, Hsinchu 300,}

Taiwan.

Abstract

In this paper we propose a structured doubling algorithm (SDA) for the compu-tation of the minimal nonnegative solutions to the nonsymmetric algebraic Riccati equation (NARE) and its dual equation simultaneously, for a singular case. Simi-lar to the Newton’s method we establish a global and linear convergence for SDA under the singular condition, using only elementary matrix theory. Numerical exper-iments show that the SDA algorithm is feasible and effective, outperforms Newton’s method for NARE. Furthermore, SDA algorithm can easily be applied to solving the quadratic matrix equation, arising form quasi-birth-death (QBD) processes, which is different from the existing Latouchu-Ramaswami (LR) algorithm. The convergence of SDA is shown to be linear at least with rate 1₂ when QBD is null recurrent.

Key words: nonsymmetric algebraic Riccati equation, minimal negative solution, doubling algorithm, quasi-birth-death (QBD) Markov chains,

Latouche-Ramaswami algorithm

1 Introduction

In this paper, we investigate the convergence behavior of a structured doubling algorithm for minimal nonnegative solutions to the nonsymmetric algebraic

∗ Corresponding author; Phone: +886-3-5720862, FAX: +886-3-5718949 Email addresses: [email protected] (Chun-Yueh Chiang), [email protected] (Wen-Wei Lin).

(2)

Riccati equation (NARE)

XCX − XD − AX + B = 0, (1.1)

and its dual equation

Y BY − Y A − DY + C = 0, (1.2)

where A, B, C, D are real matrices of sizes m × m, m × n, n × m, n × n, respec-tively. Throughout the paper we assume the matrix

(H) K =    D −C −B A  

is an irreducible singular M-matrix. (1.3)

The relevant definitions are given as follows.

Definition 1.1(i) For any matrices A = [aij], B = [bij] ∈ Rm×n, we write

A ≥ B(A > B) if aij ≥ bij(aij > bij) for all i, j.

(ii) A matrix A ∈ Rn×n is said to be a Z-matrix if all its off-diagonal elements are non-positive. A Z-matrix A is called a nonsingular M-matrix if A = sI − B with B ≥ 0 and s > ρ(B), where ρ(B) is the spectral radius of B. In addition, if s = ρ(B), then A is called a singular M-matrix.

Definition 1.2 A matrix A ∈ Rn×n _{is irreducible if there does not exists a}

permutation matrix P such that

P APT =    A1 A3 0 A2   ,

where A1, A2 are square matrices.

A special form of the NARE (1.1) arising from transport theory with rank one matrices B > 0 and C > 0 has been extensively studied (see [1,2] and the references therein). A Schur-type method for solving NARE has been de-veloped in [2,3]. It is worthwhile to mention that a Schur-type method in [3] works for the general form of the NARE in the singular case. The general form of NARE (1.1) that originated from the Wiener-Hopf factorization of Markov Chain; see [4–8] for details. A Newton method (NM) and a class of fixed-point iterations (FPIs) have been proposed in [9,10] for computing the minimal nonnegative solution of NARE. Newton’s method has been shown to be globally quadratically convergent under the condition that K in (1.3) is a nonsingular M-matrix. It requires approximately 60n3 _{flops for solving}

a Sylvester equation at each iteration step when m = n. Fixed-point itera-tions are less expensive than Newton’s method per iteration. However, the slow linear convergence rate of FPIs increases the total computational cost of

(3)

FPIs making them more expensive than NM (see examples in [9]). Recently, a structured doubling algorithm (SDA) has been developed in [11] for solving minimal nonnegative solutions X and Y of NARE (1.1) and its dual equation (1.2), simultaneously, under the condition that K in (1.3) is a nonsingular M-matrix. The SDA algorithm has been proven to be globally quadratically convergent. However, at each iteration step, the SDA needs only two LU-factorizations and some matrix multiplications, requiring approximately 64₃n3 flops when m = n, which is about 1₃ flop counts of NM.

Under the hypothesis (H) that K in (1.3) is an irreducible singular M-matrix and in some additional singular conditions (see Case(1) of Theorem2.2 below), the convergence of FPIs has been shown to be sub-linear, i.e., lim sup

k→∞

kXk+1−Xkk

kXk−Xk =

1, and the convergence of NM has been proven to be linear with contraction factor 1₂, i.e., lim sup

k→∞

kXk+1−Xkk

kXk−Xk =

1

2 [9]. In this paper, we mainly prove that

under hypothesis (H) the SDA algorithm developed in [11] has either global quadratically convergent or at least linear convergence with a contraction rate

1

2 as in NM. Numerical experiments show that the SDA is feasible and

ef-fective, and outperforms NM because of the low computational cost at each iteration step.

The SDA algorithm [11] can also be applied to solving the special quadratic matrix equation

A0+ A1X + A2X2 = X (1.4)

A2 + A1Y + A0Y2 = Y, (1.5)

where A0, A1 and A2 are nonnegative, and (A0 + A1 + A2)e = e with e =

(1, . . . , 1)>. The minimal nonnegative solutions X and Y of (1.4) and (1.5), respectively, play an import role in the study of quasi-birth-death process (QBDs). (See e.g., [12]). A highly efficient Latouche-Ramaswami (LR) algo-rithm has been proposed in [13], and is related to the cyclic reduction (CR) method [14]. A shift technique to the CR method has been introduced by [15], for finding the minimal nonnegative solutions X and Y , respectively. The con-vergence of LR algorithm has been shown to be quadratic [16,13] for positive recurrent/transient QBDs, and to be linear with rate 1₂ [17] for null recurrent QBDs under mild conditions, respectively. In this paper, under mild conditions which are different from those of [17] and SDA algorithm, we shall show that SDA algorithm can be used to solve (1.4) and (1.5) simultaneously, and con-verges quadratically for positive recurrent/transient QBDs and linearly with rate 1₂ for null recurrent QBD, respectively.

This paper is organized as follows. In Section 2, we describe some useful pre-liminaries. In Section 3, we establish the convergence theory of SDA under assumption (H). Numerical results of SDA and NM for solving NARE are shown in Section 4. Applications of SDA algorithm to solving QBD problems

(4)

are given in section 5.

Throughout this paper, for A ∈ Rm×n_{, we use kAk}

1 to denote the matrix

1-norm, and kAk∞ := kA>k1. We denote ej the j-th standard basis vector and

e the vector with all components being one. We denote 0m,n the m × n zero

matrix, 0nand Inthe n × n zero and identity matrices, respectively. The open

left /the open right half planes, the closed left/the closed right half planes are denoted by C−, C+, C−∪ C0 and C+∪ C0, respectively.

2 Preliminaries

We will make use of the following results.

Lemma 2.1 [18,19] Let A be an n × n Z-matrix, and σ(A) denote the spec-trum of A. The statements in (1) are equivalent and the statements in (2) are equivalent:

(1)

(a) A is a nonsingular M-matrix; (b) A−1 ≥ 0;

(c) Av > 0 holds for some v > 0; (d) σ(A) ⊆ C+.

(2)

(a) A is a singular M-matrix;

(b) σ(A) ⊆ C+∪ C0, σ(A) ∩ C0 6= φ.

Theorem 2.1 [9,10] If (H) in (1.3) holds, then the NARE (1.1) and the dual NARE (1.2) have minimal nonnegative solutions X and Y , respectively, such that D − CX and A − BY are M-matrices.

We define the Hamiltonian-like matrix

H =    D −C B −A    (2.1) and let R := D − CX, S := A − BY, (2.2)

where X and Y are given in Theorem 2.1. Then the NARE (1.1) and its dual equation (1.2) can be, respectively, rewritten by

H    In X   =    In X   R (2.3)

(5)

and H    Y Im   =    Y Im   (−S). (2.4)

Let [v₁>, v₂>]> > 0 and [u>₁, u>₂]> > 0 be the right and the left null vectors of K in (1.3), respectively. The following essential result determining the signs of real parts for all eigenvalues of H in (2.1) can be found in [9].

Theorem 2.2 [9] Assume that (H) holds. Then

(1) If u>₁v1 = u>2v2, then H has n − 1 eigenvalues in C+, m − 1 eigenvalues

in C−, and two zero eigenvalues with quadratic divisor. Moreover, R and

S are singular M-matrices.

(2) If u>₁v1 > u>2v2, then H has n − 1 eigenvalues in C+, m eigenvalues in

C−, and one zero eigenvalue. Moreover, R is a singular M-matrix and S

is a nonsingular M-matrix.

(3) If u>₁v1 < u>2v2, then H has n eigenvalues in C+, m − 1 eigenvalues in

C−, and one zero eigenvalue. Moreover, R is a nonsingular M-matrix and

S is a singular M-matrix.

Using the technique developed in [11] by applying the Cayley transformation to the equation (2.3) with a suitable γ > 0 we have

M    In X   = L    In X   Rγ (2.5) and M    Y Im   Sγ = L    Y Im    (2.6) with

Rγ = (R + γIn)−1(R − γIn), Sγ = (S + γIm)−1(S − γIm). (2.7)

The matrices M and L in (2.5) are of the form

M =    Eγ 0 −Hγ Im   , L =    In −Gγ 0 Fγ   , (2.8) where Eγ = In− 2γVγ−1, Gγ = 2γDγ−1CW −1 γ , Fγ = Im− 2γWγ−1, Hγ = 2γWγ−1BD −1 γ , (2.9a) with Aγ = A + γIm, Dγ = D + γIn, Wγ = Aγ− BD−1γ C, Vγ = Dγ− CA−1γ B. (2.9b)

(6)

The matrix pair (M, L) being of the form in (2.8) is said to be of the standard symplectic-like form (SSF). Since R, S in (2.2), and K in (1.3) are M-matrices, R + γIn, S + γIm and K + γIm+n are nonsingular M-matrices. Therefore, the

matrices in (2.9a)-(2.9b) are well-defined. The detail derivation of (2.8) can be found in [11]. We now state the SDA algorithm developed in [11] for finding minimal nonnegative solutions X of the NARE (1.1), and Y of the dual NARE (1.2), simultaneously. SDA Algorithm Set E0 = Eγ, F0 = Fγ, G0 = Gγ, H0 = Hγ, (2.10) For k = 0, 1, . . . compute Ek+1 = Ek(In− GkHk)−1Ek, (2.11a) Fk+1 = Fk(Im− HkGk)−1Fk, (2.11b) Gk+1 = Gk+ Ek(In− GkHk)−1GkFk, (2.11c) Hk+1 = Hk+ Fk(Im− HkGk)−1HkEk. (2.11d)

It requires about 64₃ n3 flops for each step when m = n. The following results guarantee that the sequence Ek, Fk, Gk, Hk generated by SDA algorithm are

well-defined and bounded above. The theorem follows from Theorems 3.1 and 4.1 of [11] straightforward because K + γI is a nonsingular M-matrix for any γ > 0.

Theorem 2.3 Assume that (H) holds. Let X, Y ≥ 0 be the minimal nonneg-ative solutions of the NARE (1.1) and its dual equation (1.2), respectively. If the parameter γ satisfies

γ > γ0 ≡ max

max

1≤i≤maii, max1≤i≤ndii

, (2.12)

where {aii}mi=1= diag(A) and {dii}i=1m = diag(D), then the sequence {Ek, Fk, Hk, Gk}

generated by the SDA algorithm is well defined, and (a) Ek = (In− GkX)R2 k γ ≥ 0, with Eke > 0; (b) Fk = (Im− HkY )S2 k γ ≥ 0, with Fke > 0;

(c) Im− HkGk and In− GkHk are nonsingular M-matrices;

(d) 0 ≤ Hk ≤ Hk+1 ≤ X and 0 ≤ X − Hk= (Im− HkY )S2 k γ XR 2k γ ≤ S 2k γ XR 2k γ ; (e) 0 ≤ Gk ≤ Gk+1 ≤ Y and 0 ≤ Y − Gk= (In− GkX)R2 k γ Y S 2k γ ≤ R 2k γ Y S 2k γ .

(7)

3 Convergence of SDA algorithm

Assume that (H) in (1.3) holds. Let (M, L) be defined in (2.8)-(2.9). By Cayley transformation it follows that (M, L) is equivalent to (H−γI, H+γI). From results of Theorem 2.2 and (2.5)–(2.7) we have that (M, L) is equivalent to the following three different Kronecker forms. That is, there are nonsingular Q and Z such that

QMZ =    J1 Γ 0m,n Iˆm   ≡ JM, (3.1a) QLZ =    In 0n,m 0m,n J2   ≡ JL, (3.1b)

in which for Case (1) of Theorem 2.2:

J1 = J1,s⊕[−1], Γ = 0n−1,m−1⊕[1] ≡ ene>m, J2 = J2,s⊕[1], Iˆm = Im−1⊕[−1]; (3.2) for Case (2): J1 = J1,s⊕ [−1], Γ = 0n,m, J2 = J2,s, Iˆm = Im; (3.3) for Case (3) J1 = J1,s, Γ = 0n,m, J2 = J2,s⊕ [−1], Iˆm = Im, (3.4) where ρ(J1,s) < 1, ρ(J2,s) < 1, J1 s ∼ Rγ and J2 s ∼ Sγ. (3.5)

Here “⊕” denotes the direct sum of two matrices and “∼” denotes the sim-s ilarity transformation. The following theorem show that when uT

1v1 = uT2v2,

SDA algorithm converges quadratically.

Theorem 3.1 Assume that (H) holds and satisfies Case (2) or Case (3) of Theorem 2.2. Let X, Y ≥ 0 be minimal nonnegative solutions of NARE (1.1) and its dual equation (1.2), respectively. Then the sequence {Hk, Gk}∞k=1

gen-erated by SDA algorithm satisfies kX − Hkk1 ≤ kXk1kS2 k γ k1kR2 k γ k1 → 0, quadratically, (3.6) kY − Gkk1 ≤ kY k1kS2 k γ k1kR2 k γ k1 → 0, quadratically. (3.7)

Proof. Suppose that Case (2) or (3) of Theorem 2.2 holds. From (3.3)–(3.5) it follows that either ρ(Rγ) ≤ 1, and ρ(Sγ) < 1, or ρ(Rγ) < 1, and ρ(Sγ) ≤ 1.

(8)

Note that 0 ≤ A ≤ B implies kAk1 ≤ kBk1. Then (3.6) and (3.7) follow from

(d) and (e) in Theorem 2.3 follows immediately.

As for Case (1), that is, uT

1v1 = uT2v2. We are to show that Hkand Gkconverge

linearly to X and Y . From (3.1a)-(3.1b) one can derive

MZJL= Q−1JLJM= LZJM, (3.8)

since JM and JL commute with each other. Let {(Mk, Lk)}∞k=1 be the

se-quence of symplectic pairs in SSF with

Mk=    Ek 0n,m −Hk Im   , Lk =    In −Gk 0m,n Fk   , (3.9)

which is generated by SDA algorithm with M0 = M and L0 = L. Define

M∗ k=    Ek(I − GkHk)−1 0 −Fk(I − HkGk)−1Hk Im   , (3.10a) L∗_k=    In −Ek(I − GkHk)−1Gk 0 Fk(I − HkGk)−1   . (3.10b)

Then we have (see [11] for details)

M∗_kLk = L∗kMk, (3.11)

and

Mk+1 = M∗kMk, Lk+1 = L∗kLk. (3.12)

From (3.8), (3.9) and (3.10)–(3.12), for k = 1 it follows that

M1ZJL2= M ∗ 0M0ZJL2 = M ∗ 0L0ZJMJL = L∗₀M0ZJLJM= L∗0L0ZJM2 = L1ZJM2 . (3.13)

By inductive process we have

MkZJ2

k

L = LkZJ2

k

M. (3.14)

On the other hand, if we interchange the role of M and L in (3.1a)-(3.1b) and consider the symplectic pair (L, M), there are nonsingular P and Y such that

(9)

PLY =    J2 Γˆ 0n,m In−1⊕ [−1]   ≡ ˆJL, (3.15a) PMY =    Im 0m,n 0n,m J1,s⊕ [1]   ≡ ˆJM, (3.15b) where Γ = eb _me>

n. Similar arguments as (3.13)-(3.14) produce

LY ˆJM = MY ˆJL. (3.16)

Consequently, it holds that

LkY ˆJ2

k

M = MkY ˆJ2

k

L . (3.17)

We now prove the main result of this paper.

Theorem 3.2 Assume that (H) holds and satisfies Case (1) of Theorem 2.2. Let X, Y ≥ 0 be minimal nonnegative solutions of NARE (1.1) and its dual equation (1.2), respectively. Let {Ek, Fk, Gk, Hk}∞k=1 be the sequence generated

by SDA algorithm and {Mk, Lk}∞k=1 be defined in (3.9). Suppose Z and Y

satisfy (3.14) and (3.17), respectively. Denote

Z =    Z1 Z3 Z2 Z4   , Y =    Y1 Y3 Y2 Y4   , (3.18)

where Z1, Y3 ∈ Rn×n and Z4, Y2 ∈ Rm×m. Then Z1 and Y2 are invertible, and

kX − Hkk1 ≤ O(kJ2 k 2,sk1) + O(2−k) → 0, as k → ∞, kY − Gkk1 ≤ O(kJ2 k 1,sk1) + O(2−k) → 0, as k → ∞. where X = Z2Z1−1 and Y = Y1Y2−1.

Proof. From (2.5)-(2.6), (3.8) and (3.16) it is easily seen that [I, X>]> and [Z₁>, Z₂>]> as well as [Y>, I>]> and [Y₁>, Y₂>]> span the unique stable and the unique unstable subspaces of (M, L) corresponding to σ(J1) and σ(J2),

respectively. Therefore, Z1 and Z2 are invertible, furthermore, from (1.1)-(1.2)

we have X = Z2Z1−1 and Y = Y1Y2−1 [9]. Form (3.1) and (3.2) it follows that

J2k L =    In 0 0 J₂2k   , J 2k M =    J2k 1 Γk 0 Im   , (3.19)

where Γk= −2kΓ = −2kene>m. Substituting (Mk, Lk) of (3.9), and Z of (3.18)

(10)

EkZ1= (Z1− GkZ2)J2 k 1 , (3.20a) EkZ3J2 k 2 = (Z1− GkZ2)Γk+ (Z3− GkZ4), (3.20b) −HkZ1+ Z2= FkZ2J2 k 1 , (3.20c) (−HkZ3+ Z4)J2 k 2 = FkZ2Γk+ FkZ4. (3.20d)

Similarly, substituting (3.9), Y of (3.18) and (3.15) into (3.17), and comparing both side we have

FkY2= (Y2− HkY1)J2 k 2 , (3.21a) FkY4J2 k 1 = (Y2− HkY1)ˆΓk+ (Y4− HkY3), (3.21b) Y1− GkY2= EkY1J2 k 2 , (3.21c) (Y3− GkY4)J2 k 1 = EkY1Γˆk+ EkY3, (3.21d) where ˆΓk = −2kΓ = −2ˆ keme>n. Postmultiplying (3.21b) by ˆΓ † kY −1 2 , where ˆΓ † k

denoted the pseudo-inverse of ˆΓk, we get

(Y2− HkY1)ˆΓkΓˆ † kY −1 2 = FkY4J2 k 1 Γˆ + k † Y −1 2 − (Y4− HkY3)ˆΓ+k † Y −1 2 . (3.22)

Substituting (3.22) into (3.21a) we have

Fk(Im− Y4J2 k 1 Γˆ † kY −1 2 ) = Fk(Im− Y4(0n−1,m−1⊕ [−2−k])Y2−1) = (Y2− HkY1)(J2 k 2,s⊕ [0])Y −1 2 − (Y4− HkY3)ˆΓ † kY −1 2 . (3.23)

Since kHkk1 ≤ kXk1, by Theorem 2.3(d), it follows from (3.23), (3.2) and

(3.5) that

kFkk1 ≤ O(kJ2

k

2,sk1) + O(2−k) → 0, k → ∞. (3.24)

By (3.24) and the boundness of kJ2k

1 k1, the matrix in (3.20c) can be estimated

by

kX − Hkk1 ≤ O(kJ2

k

2,sk1) + O(2−k) → 0, (3.25)

linearly as k → ∞, where X = Z2Z1−1. Similarly, postmultiplying (3.20b) by

Γ†_kZ₁−1 and substituting it into (3.20a), we get

Ek(In− Z3J2 k 2 Γ † kZ −1 1 ) = Ek(In− Z3(0m−1,n−1⊕ [−2−k])Z1−1) = (Z1− GkZ2)(J2 k 1,s⊕ [0])Z −1 1 − (Z3− GkZ4)ˆΓ † kZ −1 1 . (3.26)

Since kGkk1 ≤ kY k1, by Theorem 2.3(e), from (3.26), (3.2) and (3.5) follows

that

kEkk1 ≤ O(kJ2

k

(11)

By (3.27) and the boundness of kJ₁2kk1, the matrix in (3.21c) can be estimated by kY − Gkk1 ≤ O(kJ2 k 1,sk1) + O(2−k) → 0, (3.28) linearly as k → ∞, where Y = Y1Y2−1.

Remark 3.1 For Case (1), since ρ(J1,s) and ρ(J2,s) are both less than 1,

O(kJ2k

1,sk1) and O(kJ2

k

2,sk1) will be smaller than O(2−k), when k is sufficiently

large. Thus, we have lim sup k→∞ k q kX − Hkk1 ≤ 1 2, lim supk→∞ k q kY − Gkk1 ≤ 1 2. i.e. , the convergence rate of SDA is proved to be at least with rate 1₂.

4 Numerical Examples

In this section, we compare the numerical behavior of the SDA algorithm with Newton’s method with respect to the numbers of iterations (ITs), the CPU times in seconds, and the “normalized” residual (NRes):

NRes = kXCf X −f XD − Af X + Bkf ∞ kXkf _∞ kXkf _∞kCk_∞+ kDk_∞+ kAke _∞ + kBk∞ ,

where X is the approximate solution to the minimal nonnegative solution off

(1.1). In test examples, the IT counts for the SDA algorithm are increased by one, accounting for the additional work to compute the initial matrices E0, F0, G0 and H0. Furthermore, we take γ = [γ0] + 1, where γ0 is defined

in (2.12), and [γ0] denotes the integer part of γ0.

Our implementations were run in MATLAB (version 7.0) on a personal com-puter Pentium III (850MHZ) with the machine precision 2.2 × 10−16.

Newton’s method (NM). Given an initial X0 = 0. For k = 0, 1, 2 . . . until

Xk converges, compute Xk+1 from Xk by solving the Sylvester equation

(A − XkC)Xk+1+ Xk+1(D − CXk) = B − XkCXk.

Note that we use the Bartels-Stewart method [20] to solve Sylvester equations, and the computational cost for each Newton step is approximately 60n3 _flops

when m = n. Recall that the flops count for each iteration in SDA algorithm is 64₃n3.

(12)

SDA and NM for NARE and its dual NARE:

As developed in Section 3, the SDA algorithm can be used for solving NARE (1.1) and its dual equation (1.2), simultaneously. On the other hand, New-ton’s method developed in [9,10] can also be used for solving (1.1) and (1.2), respectively. Example 4.1 [2] Let A = diag(δ1, . . . , δn) − eq>, B = ee>, C = qq>, D = diag(d1, . . . , dn) − qe>, in which δi = 1 cωi(1 + α) , di = 1 cωi(1 − α) , q = (q1, . . . , qn)> with qi = ci 2ωi , where 0 < c ≤ 1, 0 ≤ α < 1 and 0 < ωn < · · · < ω1 < 1, n X i=1 ci = 1, ci > 0.

It has shown in [9] that for c = 1 and α = 0 the matrix H defined in (2.1) has two zero eigenvalues with quadratic divisor. It satisfies Case (1) of The-orem 2.2. Thus, from TheThe-orem 3.2 the SDA algorithm converges linearly at least with rate 1₂ to minimal nonnegative solutions of (1.1) and (1.2), simulta-neously.

We take n = 50, 100, 200, 300, 400, 500, c = 1 and α = 0. The IT counts, CPU times and NRes for SDA and NM are listed in Table 1.

Table 1

Numerical results for Example 4.1

Methods NM SDA Methods NM SDA

IT 26 27 IT 24 26

n = 50 CPU 0.36 0.06 n=100 CPU 2.8 0.64 NRes 3.7E-16 7.2E-16 NRes 1.1E-15 1.1-15

IT 23 26 IT 25 28

n = 200 CPU 33 4 n=300 CPU 150 14 NRes 3.8E-15 1.5E-15 NRes 1.5E-15 1.3E-15

IT 26 28 IT 25 31

(13)

Example 4.2 [9] Let R =    R11 R12 R21 R22   ∈ R

2n×2n _{be a doubly stochastic matrix}

(i.e., R ≥ 0, Re = R>e = e), where R21 and R12 are nonzero n × n. Here the

matrix R is generated by the Matlab code

R = 1

n(4n2_{+ 1)}magic(2n).

Let K = a(I2n − R), where a is a given positive number (a=rand). K is a

singular M-matrix. Let

K =    D −C −B A    and H =    D −C B −A   , (4.1) where A, B, C, D ∈ Rn×n_.

From the fact that e>K = 0, and Ke = 0, the condition of Case (1) in The-orem 2.2 holds. Therefore, H has two zero eigenvalue with quadratic divisor. Theorem 3.2 shows that the SDA algorithm converges linearly at least with rate 1₂ to minimal nonnegative solution of (1.1) and (1.2) simultaneously. The numerical results are listed in Table 2.

Table 2

Numerical results for Example 4.2

Methods NM SDA Methods NM SDA

IT 24 31 IT 24 33

n = 50 CPU 0.74 0.36 n=100 CPU 6.3 1.9 NRes 9.3E-15 8.7E-15 NRes 4.1E-14 2.6-14

IT 25 33 IT 25 34

IT 27 34 IT 28 34

From Tables 1 and 2 we observe that IT(SDA) & IT(NM), numerical ex-periments shows that both SDA and NM converge linear with rate 1₂. But CPU(SDA) is much less than CPU(NM). This indicated that the SDA could be particularly effective for solving NAREs which satisfies the singular Case (1) of Theorem 2.2.

(14)

5 Application to quasi-birth-death processes

We now consider the matrix equation

A0+ A1X + A2X2 = X (5.1)

A2 + A1Y + A0Y2 = Y (5.2)

arising from time quasi-birth-death processes (QBDs) [12]. A discrete-time QBD is a Markov chain with state space {(i, j); i ≥ 0, 1 ≤ j ≤ m}, which has a transition probability matrix of the form

P =           B0 B1 0 0 · · · 0 A0 A1 A2 · · · 0 0 A0 A1 · · · .. . ... ... ... . ..          

where B0, B1, A0, A1, and A2 are m × m nonnegative matrices such that P is

stochastic and irreducible. Therefore, A0, A1 and A2 satisfy

(A0+ A1 + A2)e = e. (5.3)

and A0, A2 are two nonzero matrices. We rewrite (5.1) and (5.2), respectively,

into    0 I A0 A1− I       I X   =    I 0 0 −A2       I X   Rq, (5.4a)    0 I A0 A1− I       Y I   Sq=    I 0 0 −A2       Y I   , (5.4b)

where X = Rq ≥ 0 and Y = Sq ≥ 0. It is known that [12] (5.1) and (5.2)

have minimal nonnegative solutions in the set {P ≥ 0|P e ≤ e}. By Perron-Frobenius (P-F) Theorem it holds that ρ(X) ≤ 1 and ρ(Y ) ≤ 1. We now assume that

A = A0+ A1+ A2 is irreducible (5.5a)

and

(I−A1)−1A0e = A0e+A1A0e+· · · > 0, (I−A1)−1A2e = A2e+A1A2e+· · · > 0.

(5.5b) Note that by (5.3) and (5.5a), we have ρ(A) = 1 by applying P-F Theorem. Since A0, A2 are nonzero matrices, we have A1 ≤ A and A1 6= A. Again,

(15)

From assumption (5.5a), by P-F Theorem there is a unique vector α > 0 such that α>e = 1 and α>A = α>. By [12], the QBD is null recurrent if α>A0e =

α>A2e; positive recurrent if α>A0e > α>A2e; and transient if α>A0e < α>A2e.

Let M =    0 I A0 A1− I   , L =    I 0 0 −A2   . (5.6)

We have the fundamental results.

Lemma 5.1 [12] If the QBD is positive recurrent, then X is stochastic and Y is substochastic with ρ(Y ) < 1. If the QBD is transient, then Y is stochastic and X is substochastic with ρ(X) < 1, If the QBD is null recurrent, then X and Y are both stochastic.

Lemma 5.2 Assume (5.5) holds, and (M, L) is given by (5.6). We have the following simple observation (see also [12]).

(i) (null recurrent): If α>A0e = α>A2e, then M − λL has two eigenvalues

one with quadratic divisor.

(ii) (positive recurrent): If α>A0e > α>A2e, then M − λL has an eigenvalue

one.

(iii) (transient): If α>A0e < α>A2e, then M − λL has an eigenvalue one.

Proof. From (5.3) it holds that M[e>, e>]> = L[e>, e>]> for three cases. That is, one is an eigenvalue of M − λL. It remains to show Case (i). Suppose

M    u v   = L    u v   + L    e e   , (5.7)

where u, v ∈ Rn _{are undetermined vectors. From (5.7) follows that}

−u + v = e, (5.8a)

A0u + (A1− I + A2)v = −A2e. (5.8b)

Substituting v = u + e into (5.8b), and using (5.3) we have ˆ

Au ≡ (A0+ (A1 − I) + A2)u = (A0− A2)e. (5.9)

Since α>A = α> with α > 0 the vector α is the unique left null vector of ˆA. By assumption of Case (i), it holds that α>(A0− A2)e = 0, that is, the right

hand side of (5.9) lies in the range of ˆA. It follows that the linear system of (5.9) has a nonzero solution u. This shows that M − λL has two eigenvalues one with quadratic divisor.

(16)

We, furthermore, assume that

(QBD-H) M − λL has no other unimodular eigenvalues except one. Under this hypothesis (QBD-H), the following property is well known [21]. (1) If the QBD is positive recurrent, then (M, L) has m − 1 eigenvalues in

|z| < 1, at most m eigenvalues in |z| > 1, the number of eigenvalues at infinity exactly equals to nullity(Y ), and one eigenvalue 1. Moreover, 1 is a simple eigenvalue of X and ρ(Y ) < 1.

(2) If the QBD is null recurrent, then (M, L) has m − 1 eigenvalues in |z| < 1, at most m − 1 eigenvalues in |z| > 1, the number of eigenvalues at infinity exactly equals to nullity(Y ), and two eigenvalues 1 with quadratic divisor. Moreover, 1 is a simple eigenvalue of X and Y , respectively.

We are ready to prove the main theorem of this section.

Theorem 5.1 Assume that (5.5) and (QBD-H) holds. Let X, Y ≥ 0 be min-imal nonnegative solutions of (5.1) and (5.2), respectively. Then the sequence {Ek, Fk, Gk, Hk} generated by SDA algorithm is well-defined with E0 = E,

F0 = F , G0 = F and H0 = E. Furthermore, for Case (i) (null recurrent) it

holds kX − Hkk1 ≤ O(kYsk2 k 1 ) + O(2 −k ) → 0, (5.10a) kY − Gkk1 ≤ O(kXsk2 k 1 ) + O(2 −k_{) → 0,} _(5.10b)

at least linearly with rate 1₂ as k → ∞, where σ(X) = σ(Xs) ∪ {1} with

ρ(Xs) < 1 and σ(Y ) = σ(Ys) ∪ {1} with ρ(Ys) < 1; for Case (ii) (positive

recurrent) and Case (iii) (transient) it holds kX − Hkk1 ≤ kXk1kY2 k k1kX2 k k1 → 0, (5.11a) kY − Gkk1 ≤ kXk1kY2 k k1kX2 k k1 → 0, (5.11b)

as k → ∞, where ρ(X) ≤ 1, ρ(Y ) < 1 for Case (ii), and ρ(X) < 1, ρ(Y ) ≤ 1 for Case (3).

Proof. Multiplying the second block-row of (5.4) by (I − A1)−1 and eliminating

I in the (1,2)-block of (5.4) we get

   E 0 −E I       I X   =    I −F 0 F       I X   Rq, (5.12a)    E 0 −E I       Y I   Sq=    I −F 0 F       Y I   , (5.12b)

(17)

where

E = (I − A1)−1A0, F = (I − A1)−1A2. (5.13)

By assumptions (5.5b) and (5.3) we have

Ee = (I − A1)−1A0e > 0, F e = (I − A1)−1A2e > 0, (5.14)

and

(I − EF )e = (I − A1)−1(A2+ A0(I − A1)−1A0)e > 0, (5.15a)

(I − F E)e = (I − A1)−1(A0+ A2(I − A1)−1A0)e > 0. (5.15b)

Inequalities (5.15) show that I − EF and I − F E are nonsingular M-matrices. The result of Theorem 2.3 shows that the sequence {Ek, Fk, Gk, Hk}

gener-ated by SDA algorithm is well-defined. The quadratic convergence of (5.11) for Cases (ii) and (iii) follows from Theorem 3.1 immediately. The linear con-vergence of (5.10) for Case (i) follows from Theorem 3.2.

Remark 5.1 Latouche and Ramaswami [21,13] proposed the Latouche Ra-maswami (LR) algorithm which solves the QBD problem (positive recurrent and transient) with high efficiency, low cost and quadratic convergence rate. It requires about 62

3 n

3 _and 64 3n

3 _{flops for each step of LR algorithm and SDA}

algorithm, respectively, for solving X and its dual solution Y . Although SDA is slightly more expensive than LR algorithm, however, both methods have the same accuracy. Moreover, we give another sufficient conditions (5.5) different from that of [17] to solve the QBD problem in null recurrent case. The con-ditions of (5.5b) and (3.11) of [17] do not cover each other. We can simply note that the condition (3.11) of [17] is satisfied by Examples 5.1 and 5.2 of [17], but not satisfied by Example 5.3 of [17]. On the other hand, the condition (5.5b) is satisfied by Examples 5.2 and 5.3 of [17], but not satisfied by Example 5.1 of [17].

SDA Algorithm for QBDs

Set E0 = (I − A1)−1A0, F0 = (I − A1)−1A2, G0 = E0, H0 = F0, k = 0, 1, . . . compute Ek+1= Ek(I − GkHk)−1Ek, Fk+1= Fk(I − HkGk)−1Fk, Gk+1= Gk+ Ek(I − GkHk)−1GkFk, Hk+1= Hk+ Fk(I − HkGk)−1HkEk.

(18)

Example 5.1 [17] Consider the equation (1.4) with A0 =        0.6 0.4 0 0.1 0 0.1 0 0 0        , A1 =        0 0 0 0.2 0 0.1 0 0 0        , A2 =        0 0 0 0.26 0 0.24 0.2 0 0.8        .

The exact solution of (1.4) and (1.5) are given by

X = 1 2300        1380 920 0 1320 680 300 1357 828 115        , Y = 1 25        5 0 20 9 0 16 5 0 20        .

Example 5.2 [17] Consider the equation (1.4) with

A0 =    0 0.5 0 0   , A1 =    0 0 1 0   , A2 =    0 0.5 0 0   .

The exact solutions for (1.4) and (1.5) are given by X = Y =

   0 1 0 1   .

Example 5.3 [17] Consider the equation (1.4) with

A0 =    0.25 0 0.25 0   , A1 =    0.25 0.25 0.25 0.25   , A2 =    0 0.25 0 0.25   .

The exact solutions for (1.4) and (1.5) are given by X =

   1 0 1 0   , Y =    0 1 0 1   .

We definite the “normalized” residual (NRes) by

NRes = kA0+ (A1− I)G + A2G 2_k ∞ kGk∞(kA1− Ik∞+ kA2k∞kGk∞|) + kA0k∞ and Flops=ITs × 64 3 n 3 .

(19)

The relative error of Hk and Gk in SDA to the exact solutions X and Y are denoted, respectively, by Errk(X) = kHk− Xk∞ kXk∞ , Errk(Y ) = kGk− Y k∞ kY k∞ .

The numerical results of Example 5.1–5.3 computed by SDA algorithm are reported in Tab 3

Table 3

Numerical results for Example 5.1–5.3

SDA ITs Flops NRes Errk(X) Errk(Y )

EX 5.1 26 554n3 2.5E-16 6.0E-9 6.0E-9 EX 5.2 25 533n3 1.5E-16 7.8E-9 7.8E-9 EX 5.3 26 554n3 1.1E-16 7.3E-9 7.3E-9

6 Conclusions

In this paper, we prove a global and linear convergence of the structured doubling algorithm for solving the minimal nonnegative solutions to the non-symmetric algebraic Riccati equation and its dual equation under a singular case. The convergence is shown to be at least linear with rate 1₂ using only elementary matrix theory. The new presentation of the proof technique in The-orem 3.2 makes the analysis of convergence much easier than the approach by fixed point iteration. It could be used to study the convergence of the gener-alized Riccati equation.

Acknowledgments

We would like to thank Dr. C. S. Wang from National Cheng Kung University for many helpful discussion.

References

[1] J. Juang, Existence of algebraic matrix Riccati equations arising in transport theory., Lin. Alg. Appl., 230 (1995), pp. 89–100.

[2] J. Juang and W.-W. Lin, Nonsymmetric algebraic Riccati equations and Hamilitonian-like matrices., SIAM J. Matrix Analy. Appl., 20 (1999), pp. 228– 243.

(20)

[3] C.-H. Guo, Efficient methods for solving a nonsymmertic algebra Riccati equation arising in stochastic fluid models., J. Comput. Appl. Math., 192 (2006), pp. 353–373.

[4] M. Barlow, L. Rogers, and D. Williams, Wiener-Hopf factorization for matrices, in S´eminaire de probabilit´es XIV, no. 784 in Lecture Notes in Math., 1980, pp. 324–331.

[5] R. London, H. Mckean, L. Rogers, and D. Williams, A martingale approach to some Wiener-Hopf problems II., in S´eminaire de probabilit´es XIV, no. 920 in Lecture Notes in Math., Springer-Verlag, 1982, pp. 68–90.

[6] L. Rogers, Fluid models in queueing theory and Wiener-Hopf factorization of Markov chains., Ann. Appl. Probab., 4 (1994), pp. 390–413.

[7] L. Rogers and Z. Shi, Computing the invariant law of a fluid model., J. Appl. Probab., 31 (1994), pp. 885–896.

[8] D. Williams, A “potential-theoretic” note on the quadratic Wiener-Hopf equation for Q-matrices., in S´eminaire de probabilit´es XIV, no. 920 in Lecture Notes in Math., 1982, pp. 91–94.

[9] C.-H. Guo, Nonsymmetric algebraic Riccati equations and Winener-Hopf factorization for M-matrices., SIAM J. Matrix Analy. Appl., 23 (2001), pp. 225– 242.

[10] C.-H. Guo and A. Laub, On the iterative solution of a class of nonsymmetric algebraic Riccati equations., SIAM J. Matrix Analy. Appl., 22 (2000), pp. 376– 391.

[11] X.-X. Guo, W.-W. Lin, and S.-F. Xu, A structure-preserving doubling algorithm for nonsymmetric algebraic Riccati equation., Numer. math., 103 (2006), pp. 393–412.

[12] G. LATOUCHE and V. RAMASWAMI, Introduction to Matrix Analytic Methods in Stochastic Modeling., SIAM, Philadelphia, 1999.

[13] G. Latouche and V. Ramaswami, A logarithmic reduction algorithm for quasi-birth-death processes., J. Appl. Probab., 30 (1993), pp. 650–674.

[14] D. Bini and B. Meini, On the solution of a nonlinear matrix equation arising in queueing problems., SIAM J. Matrix Analy. Appl., 17 (1996), pp. 906–926. [15] C. He, B. Meini, and N. H. Rhee, A shifted cyclic reduction algorithm for

quasi-birth-death problems., SIAM J. Matrix Analy. Appl., 23 (2001), pp. 673– 691.

[16] D. A. Bini, G. Latouche, and B. Meini, Solving matrix polynomial equations arising in queueing problems., Linear Algebra Appl., 340 (2002), pp. 225–244.

[17] C.-H. Guo, Convergence analysis of the Latouche-Ramaswami algorithm for null recurrent quasi-birth-death processes., SIAM J. Matrix Analy. Appl., 23 (2002), pp. 744–760.

(21)

[18] A. Berman and R. Plemmons, Nonnegative matrices in the Mathematical Sciences., Academic Press, 1979.

[19] R. Varga, Matrix iterative analysis., Prentice-Hall, 1962.

[20] R. Bartels and G. Stewart, Solution of the matrix equation AX+XB=C., Comm. ACM., 15 (1972), pp. 820–826.

[21] D. A. Bini, G. Latouche, and B. Meini , Numerical Methods for Structured Markov Chains., Oxford University Press, New York, 2005.