Convergence Rate of the Cyclic Recuction Algorithm for Null Recurrent QUASI-BIRTH-DEATH Problems

(1)

ALGORITHM FOR NULL RECURRENT QUASI-BIRTH-DEATH PROBLEMS

CHUN-HUA GUO∗ _AND _{WEN-WEI LIN}†

Abstract. The minimal nonnegative solution G of the matrix equation G = A0+ A1G + A2G2, where the matrices A0, A1and A2 are nonnegative and A0+ A1+ A2 is stochastic, plays an important role in the study of quasi-birth-death processes (QBDs). The cyclic reduction algorithm is a very efficient iteration for finding the matrix G, under the standard assumption that the transition probability matrix of the QBD and the matrix A0+ A1+ A2are both irreducible. The convergence is known to be quadratic for positive recurrent QBDs and for transient QBDs. For the null recurrent case, the convergence of a closely related algorithm, the Latouche–Ramaswami algorithm, has been shown to be linear with rate 1/2 under two additional assumptions. In this paper, we show that the convergence of the cyclic reduction algorithm and hence of the Latouche–Ramaswami algorithm is at least linear with rate 1/2 in the null recurrent case, without those two additional assumptions, and the proof here is much simpler.

Key words. matrix equation, minimal nonnegative solution, quasi-birth-death process, null recurrent, cyclic reduction, Latouche–Ramaswami algorithm, doubling algorithm, convergence rate

AMS subject classifications. 15A24, 15A51, 60J10, 60K25, 65H10

1. Introduction. A discrete-time quasi-birth-death (QBD) process is a Markov chain with state space {(i, j) | i ≥ 0, 1 ≤ j ≤ m}, and with a transition probability matrix of the form

P =        B0 B1 0 0 · · · A0 A1 A2 0 · · · 0 A0 A1 A2 · · · 0 0 A0 A1 · · · .. . ... ... ... . ..        ,

where B0, B1, A0, A1, and A2are m×m nonnegative matrices such that P is stochastic.

In particular, (A0+ A1+ A2)e = e, where e = (1, 1, . . . , 1)T.

We make the standard assumption that the matrix P and the matrix A = A0+

A1+ A2 are both irreducible. Thus, A0 6= 0 and A2 6= 0. Moreover, there exists a

unique vector α > 0 with αT_{e = 1 and α}T_{A = α}T_{. The QBD is positive recurrent if}

αT_A

0e > αTA2e, transient if αTA0e < αTA2e, and null recurrent if αTA0e = αTA2e.

The minimal nonnegative solution G of the matrix equation

G = A0+ A1G + A2G2 (1.1)

plays an important role in the study of the QBD process (see [11]). We will also need the dual equation

F = A2+ A1F + A0F2, (1.2)

∗_{Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada} ([email protected]). The work of this author was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada.

†_{Department of Mathematics, National Tsing Hua University, Hsinchu 300, Taiwan (wwlin@math.} nthu.edu.tw).

(2)

and we let F be its minimal nonnegative solution. It is well known (see [11], for exam-ple) that if the QBD is positive recurrent, then G is stochastic and F is substochastic with spectral radius ρ(F ) < 1; if the QBD is transient, then F is stochastic and G is substochastic with ρ(G) < 1; if the QBD is null recurrent, then G and F are both stochastic.

The Latouche–Ramaswami (LR) algorithm [10] and the cyclic reduction (CR) algorithm [3] are both efficient iterative methods for finding the minimal solution G. The convergence of these two algorithms are quadratic for positive recurrent and transient QBDs. A convergence analysis has been performed in [6] for the LR algorithm in the null recurrent case under two additional assumptions. The first assumption is that λ = 1 is a simple eigenvalue of G and F and there are no other eigenvalues of G or F on the unit circle; the second assumption is made under the first assumption and is more technical. In this paper, we are going to determine the convergence rate for the CR algorithm in the null recurrent case, without those two additional assumptions. Moreover, the analysis here is much simpler than that in [6]. The convergence rate for the LR algorithm is the same in view of the relationship between CR and LR, given in [1].

2. Connection between the CR algorithm and the doubling algorithm. Our convergence analysis will be based on a connection between the CR algorithm and the doubling algorithm. The application of the CR algorithm and the doubling algorithm is not limited to the matrix equations arising in QBD problems. In [13] and [5], the CR algorithm is studied for the matrix equation X + A∗X−1A = Q, where Q is a Hermitian positive definite matrix. In [12], the doubling algorithm is studied for this equation and a close relationship between the CR algorithm and the doubling algorithm is observed there. We expect that a similar relationship would exist between the CR algorithm and the doubling algorithm when they are applied to the matrix equation (1.1).

The CR algorithm for (1.1), or for −A0+ (I − A1)G − A2G2= 0, is the following:

Algorithm 2.1. Set T0= A0, U0= I − A1, V0= A2, W0= I − A1. for k = 0, 1, . . ., compute Tk+1= TkUk−1Tk, Uk+1= Uk− TkUk−1Vk− VkUk−1Tk, Vk+1= VkUk−1Vk, Wk+1= Wk− VkU_k−1Tk.

The above CR algorithm is as presented in [1], but with one minor change: if we follow [1] exactly, Tk and Vk here would have to be replaced by −Tk and −Vk for

k ≥ 0.

The following result is known from the discussions in [2] and [11].

Theorem 2.2. The sequences {Tk}, {Uk}, {Vk}, {Wk} in Algorithm 2.1 are well

defined. For each k ≥ 0, Tk, Vk≥ 0, and Uk, Wk are nonsingular M -matrices. When

the QBD is positive recurrent or transient, the sequence {Wk} converges quadratically

to a nonsingular M -matrix W∗ and W∗−1A0= G.

We note that Algorithm 2.1 may break down if we do not assume the irreducibility of the transition matrix P . As an example, we consider

A0= 0 0 1 0 , A1= 0, A2= 0 1 0 0 .

(3)

It is easy to see that P is not irreducible, although A0+ A1+ A2is. For this example,

U1= 0 in Algorithm 2.1, so the algorithm breaks down. The LR algorithm also breaks

down for this example.

We now set out to establish a connection between Algorithm 2.1 and a doubling algorithm.

It is easily verified that L0 I A2G = M0 I A2G G, (2.1) where L0= 0 I T0 0 , M0= V0 0 P0 −I

with T0= A0, P0= I − A1, V0= A2. It is easily seen that L0− λM0is a linearization

of −A0+ λ(I − A1) − λ2A2.

We are going to define Lk = −Qk I Tk 0 , Mk = Vk 0 Pk −I

for all k ≥ 0, where Q0 = 0 and P0− Q0 = I − A1 is nonsingular. We now assume

that Lk and Mk have been defined and Pk− Qk is nonsingular for k ≥ 0. Then we

can define the matrices e Lk= I −Vk(Pk− Qk)−1 0 Tk(Pk− Qk)−1 , Mfk= Vk(Pk− Qk)−1 0 −Tk(Pk− Qk)−1 I .

It is easily verified that eLkMk = fMkLk. We then define

Lk+1= eLkLk= −(Qk+ Vk(Pk− Qk)−1Tk) I Tk(Pk− Qk)−1Tk 0 , Mk+1= fMkMk = Vk(Pk− Qk)−1Vk 0 Pk− Tk(Pk− Qk)−1Vk −I .

Therefore, the sequence {Lk, Mk} can be defined by the following doubling algorithm

if no breakdown occurs. Algorithm 2.3. Set T0= A0, P0= I − A1, Q0= 0, V0= A2. For k = 0, 1, . . ., compute Tk+1= Tk(Pk− Qk)−1Tk, Pk+1= Pk− Tk(Pk− Qk)−1Vk, Qk+1= Qk+ Vk(Pk− Qk)−1Tk, Vk+1= Vk(Pk− Qk)−1Vk.

Note that the complexity of this algorithm is 38/3n3 _{flops each iteration.}

It is readily seen that Algorithm 2.1 is recovered from Algorithm 2.3 by letting Uk = Pk− Qk and Wk = W0− Qk. By Theorem 2.2, Pk− Qk = Uk are nonsingular

(4)

Pre-multiplying (2.1) by eL0 and using eL0M0= fM0L0, we get L1 I A2G = M1 I A2G G2. In general, we have for each k ≥ 0

Lk I A2G = Mk I A2G G2k. So −Qk+ A2G = VkG2 k , Tk= PkG2 k − A2G2 k₊₁ . (2.2) Similarly we have b L0 I A0F = cM0 I A0F F, where b L0= 0 I V0 0 , Mc0= T0 0 P0 −I .

It is easily seen that cM0− λbL0is also a linearization of −A0+ λ(I − A1) − λ2A2.

For each k ≥ 0 we now have

b Lk I A0F = cMk I A0F F2k, where b Lk = − bQk I Vk 0 , Mck = _T k 0 b Pk −I with b Qk = I − A1− Pk, Pbk = I − A1− Qk. (2.3) So − bQk+ A0F = TkF2 k , Vk= bPkF2 k − A0F2 k₊₁ . (2.4)

We mentioned before that the Wk in Algorithm 2.1 satisfies Wk = W0− Qk =

I − A1− Qk. So we have bPk= Wk.

When the QBD is positive recurrent or transient, we know by Theorem 2.2 that b

Pk converges quadratically to a nonsingular M -matrix bP∗ and bP∗−1A0= G. Here we

give a quick proof using the doubling algorithm. By the first equation in (2.2) and the second equation in (2.3), we have

b

Pk− I + A1+ A2G = VkG2 k

. Eliminating Vk using the second equation in (2.4) gives

b Pk(I − F2 k G2k) = I − A1− A2G − A0F2 k₊₁ G2k.

(5)

It follows that

lim sup

k→∞ 2kq

k bPk− (I − A1− A2G)k ≤ ρ(F )ρ(G) < 1.

Since bP∗ = I − A1− A2G is a nonsingular M -matrix and A0 = bP∗G, we have G =

b

P_∗−1A0. Similarly, Pk converges quadratically to the nonsingular M -matrix P∗ =

I − A1− A0F and F = P∗−1A2.

The usefulness of the interpretation of the CR algorithm through the doubling algorithm will be more clear when we study the null recurrent case in the next section. 3. Convergence rate of the CR algorithm in the null recurrent case. We start with a review of an important result about the spectral properties of the quadratic pencil −A0+ λ(I − A1) − λ2A2and of the matrices G and F when the QBD

is null recurrent. See Proposition 14 and Theorem 4 of [4] and Theorem 4.10 of [2]. Theorem 3.1. Let the QBD be null recurrent. Then for some integer r ≥ 1 the quadratic pencil −A0+ λ(I − A1) − λ2A2 has m − r eigenvalues inside the unit circle,

m−r eigenvalues outside the unit circle (which include eigenvalues at infinity), and 2r eigenvalues on the unit circle, which are the rth roots of unity, each with multiplicity two. The partial multiplicity of each eigenvalue on the unit circle is excatly two. Moreover, the eigenvalues of G are the m − r eigenvalues of the pencil inside the unit circle plus the r simple eigenvalues at the rth roots of unity, the eigenvalues of F are the reciprocals of the m − r eigenvalues of the pencil outside the unit circle plus the r simple eigenvalues at the rth roots of unity.

Using the Kronecker form for matrix pairs, we have nonsingular matrices Q and Z such that QM0Z = Im 0 0 J2⊕ Ir ≡ JM, (3.1) QL0Z = J1⊕ Dr 0 ⊕ Ir 0 Im−r⊕ Dr ≡ JL, (3.2)

where In(or simply I) is the identity matrix of order n, J1and J2are (m−r)×(m−r)

matrices consisting of the Jordan blocks with diagonal elements inside the unit circle, Dr is a r × r diagonal matrix with the rth roots of unity on the diagonal, and X ⊕ Y

denotes [X 0 0 Y ].

Similarly, we have nonsingular matrices P and Y such that P bL0Y = Im 0 0 J2⊕ Ir = JM, (3.3) P cM0Y = J1⊕ Dr 0 0 ⊕ Ir Im−r⊕ Dr ≡ bJL. (3.4)

As in [9], it follows from (3.1) and (3.2) that

M0ZJL= Q−1JMJL= Q−1JLJM = L0ZJM, and M1ZJL2 = fM0M0ZJL2 = fM0L0ZJMJL= eL0M0ZJLJM = eL0L0ZJM2 = L1ZJM2 . Similarly, b L1Y bJL2 = cM1Y JM2 .

(6)

In general, we have for each k ≥ 0 MkZJ2 k L = LkZJ2 k M, LbkY bJ2 k L = cMkY J2 k M. (3.5) Let Z = Z1 Z3 Z2 Z4 , Y = Y1 Y3 Y2 Y4 . From (3.1) and (3.2) we have

L0 Z1 Z2 = M0 Z1 Z2 (J1⊕ Dr).

Comparing this with (2.1) and using Theorem 3.1, we get Z1 Z2 = I A2G W

for a nonsingular matrix W . It follows that Z1 is nonsingular and Z2Z1−1 = A2G.

Similarly, Y3 is nonsingular and Y4Y3−1= A0F .

Using block matrix multiplication for (3.5), we have VkZ1(J2 k 1 ⊕ D 2k r ) = −QkZ1+ Z2, (3.6) (PkZ1− Z2)(J2 k 1 ⊕ D 2k r ) = TkZ1, (3.7) VkZ1(0 ⊕ 2kD2 k₋₁ r ) + VkZ3(I ⊕ D2 k r ) = (−QkZ3+ Z4)(J2 k 2 ⊕ I), (3.8) (PkZ1− Z2)(0 ⊕ 2kD2 k₋₁ r ) + (PkZ3− Z4)(I ⊕ D2 k r ) = TkZ3(J2 k 2 ⊕ I), (3.9) (− bQkY1+ Y2)(J2 k 1 ⊕ D 2k r ) + (− bQkY3+ Y4)(0 ⊕ 2kD2 k₋₁ r ) = TkY1, (3.10) VkY1(J2 k 1 ⊕ D 2k r ) + VkY3(0 ⊕ 2kD2 k₋₁ r ) = bPkY1− Y2, (3.11) (− bQkY3+ Y4)(I ⊕ D2 k r ) = TkY3(J2 k 2 ⊕ I), (3.12) VkY3(I ⊕ D2 k r ) = ( bPkY3− Y4)(J2 k 2 ⊕ I). (3.13)

Post-multiplying (3.9) by 0 ⊕ 2−kDrand subtracting the result from (3.7), we get

Tk(Z1− Z3(0 ⊕ 2−kDr)) = (PkZ1− Z2)(J2 k 1 ⊕ 0) − (PkZ3− Z4)(0 ⊕ 2−kD2 k₊₁ r ). (3.14) By (3.12) we have − bQk = −Y4Y3−1+ TkY3(J2 k 2 ⊕ D−2 k r )Y3−1. (3.15) Thus, in view of (2.3), Pk = I − A1− Y4Y3−1+ TkY3(J2 k 2 ⊕ D−2 k r )Y −1 3 . (3.16)

Inserting (3.16) into (3.14) and letting R = I − A1− Y4Y3−1, we get

Tk h Z1− Z3(0 ⊕ 2−kDr) − Y3(J2 k 2 ⊕ D−2 k r )Y3−1(Z1(J2 k 1 ⊕ 0) − Z3(0 ⊕ 2−kD2 k₊₁ r )) i = (RZ1− Z2)(J2 k 1 ⊕ 0) − (RZ3− Z4)(0 ⊕ 2−kD2 k₊₁ r ),

(7)

from which it follows that

Tk= O(2−k).

It then follows from (3.16) that

Pk− (I − A1− Y4Y3−1) = O(2 −k_).

Post-multiplying (3.8) by 0 ⊕ 2−k_D

rand subtracting the result from (3.6), we get

−QkZ1+Z2−(−QkZ3+Z4)(0⊕2−kDr) = Vk(Z1(J2 k 1 ⊕0)−Z3(0⊕2−kD2 k₊₁ r )). (3.17) By (3.13), Vk = ( bPkY3− Y4)(J2 k 2 ⊕ D−2 k r )Y −1 3 . (3.18)

Inserting (3.18) into (3.17) and using bPk = I − A1− Qk, we get

−QkZ1+ Z2− (−QkZ3+ Z4)(0 ⊕ 2−kDr) = ((I − A1− Qk)Y3− Y4)Wk

with Wk = O(2−k). Thus,

Qk(Z1− Z3(0 ⊕ 2−kDr) − Y3Wk) = Z2− Z4(0 ⊕ 2−kDr) − ((I − A1)Y3− Y4)Wk. It follows that Qk− Z2Z1−1= O(2−k). Post-multiplying (3.11) by 0 ⊕ 2−kD1−2k r , we get VkY1(0 ⊕ 2−kDr) + VkY3(0 ⊕ I) = ( bPkY1− Y2)(0 ⊕ 2−kD1−2 k r ). (3.19) Post-multiplying (3.13) by I ⊕ 0, we get VkY3(I ⊕ 0) = ( bPkY3− Y4)(J2 k 2 ⊕ 0). (3.20)

Adding (3.19) and (3.20) gives

Vk(Y3+ Y1(0 ⊕ 2−kDr)) = ( bPkY1− Y2)(0 ⊕ 2−kD1−2 k r ) + ( bPkY3− Y4)(J2 k 2 ⊕ 0). It follows that Vk= O(2−k),

since Y3 is nonsingular and { bPk} has been shown to be bounded.

In summary, we have proved the following result.

Theorem 3.2. Let the QBD be null-recurrent. Then for Algorithm 2.3 we have Vk = O(2−k), Tk= O(2−k), Pk− (I − A1− A0F ) = O(2−k), Qk− A2G = O(2−k).

Corollary 3.3. Let lim Pk = P∗ and lim Qk = Q∗. Then P∗is nonsingular and

P_∗−1A2 = F , I − A1− Q∗ is nonsingular and (I − A1− Q∗)−1A0 = G. The matrix

(8)

Proof. By Theorem 3.2, P∗= I −A1−A0F and I −A1−Q∗ = I −A1−A2G. These

two matrices are known to be nonsingular [11]. Since P∗F = (I − A1− A0F )F = A2,

P_∗−1A2= F . Since (I − A1− Q∗)G = (I − A1− A2G)G = A0, (I − A1− Q∗)−1A0= G.

P∗− Q∗is a singular M -matrix since

(P∗− Q∗)e = (I − A1− A0F − A2G)e = e − (A1+ A0+ A2)e = 0.

This completes the proof.

Thus, the minimal solutions G and F can be found by the CR algorithm (or the closely related LR algorithm) simultaneously and the convergence is at least linear with rate 1/2.

4. Conclusion. We have studied the convergence rate of the CR algorithm for null recurrent QBDs under standard assumptions. It is important to note that we no longer need the assumption that the matrices G and F have no eigenvalues on the unit circle other than the simple eigenvalue 1. With that assumption, one would use the shift technique as studied in [8], [7] and [2], and apply the CR algorithm or the LR algorithm to the shifted equation. When G and F have more than one eigenvalues on the unit circle, the shift technique is not helpful and the CR algorithm or the LR algorithm will be applied directly to the equation (1.1). Through this study, we have gained more insights for the convergence behaviour for the CR algorithm and the LR algorithm for nearly null recurrent cases, regardless of how many eigenvalues of G or F are on or near the unit circle.

REFERENCES

[1] D. A. Bini, G. Latouche, and B. Meini, Solving matrix polynomial equations arising in queueing problems, Linear Algebra Appl., 340 (2002), pp. 225–244.

[2] D. A. Bini, G. Latouche, and B. Meini, Numerical Methods for Structured Markov Chains, Oxford University Press, 2005.

[3] D. Bini and B. Meini, On the solution of a nonlinear matrix equation arising in queueing problems, SIAM J. Matrix Anal. Appl., 17 (1996), pp. 906–926.

[4] H. R. Gail, S. L. Hantler, and B. A. Taylor, Spectral analysis of M/G/1 and G/M/1 type Markov chains, Adv. Appl. Probab., 28 (1996), pp. 114–165.

[5] C.-H. Guo, Convergence rate of an iterative method for a nonlinear matrix equation, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 295–302.

[6] C.-H. Guo, Convergence analysis of the Latouche–Ramaswami algorithm for null recurrent quasi-birth-death processes, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 744–760. [7] C.-H. Guo, Comments on a shifted cyclic reduction algorithm for quasi-birth-death problems,

SIAM J. Matrix Anal. Appl., 24 (2003), pp. 1161–1166.

[8] C. He, B. Meini, and N. H. Rhee, A shifted cyclic reduction algorithm for quasi-birth-death problems, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 673–691.

[9] T.-M. Hwang and W.-W. Lin, Structured doubling algorithms for weakly stabilizing Hermitian solutions of algebraic Riccati equations, Linear Algebra Appl., to appear.

[10] G. Latouche and V. Ramaswami, A logarithmic reduction algorithm for quasi-death-birth processes, J. Appl. Probab., 30 (1993), pp. 650–674.

[11] G. Latouche and V. Ramaswami, Introduction to Matrix Analytic Methods in Stochastic Modeling, SIAM, Philadelphia, PA, 1999.

[12] W.-W. Lin and S.-F. Xu, Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 26–39. [13] B. Meini, Efficient computation of the extreme solutions of X + A∗_X−1_{A = Q and X −}