Contents lists available atSciVerse ScienceDirect
Linear Algebra and its Applications
journal homepage:w w w . e l s e v i e r . c o m / l o c a t e / l a a
Solving large-scale nonlinear matrix equations by doubling
Peter Chang-Yi Weng
a, Eric King-Wah Chu
a,∗
, Yueh-Cheng Kuo
b,
Wen-Wei Lin
caSchool of Mathematical Sciences, Building 28, Monash University, Vic 3800, Australia bDepartment of Applied Mathematics, National University of Kaohsiung, Kaohsiung 811, Taiwan cDepartment of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
A R T I C L E I N F O A B S T R A C T
Article history:
Received 22 November 2011 Accepted 7 August 2012
Available online 19 September 2012 Submitted by P.Semrlˇ AMS classification: 15A24 65F50 Keywords: Doubling algorithm Green’s function Krylov subspace Leaky surface wave Nano research
Nonlinear matrix equation Surface acoustic wave
We consider the solution of the large-scale nonlinear matrix equa-tion X+BX−1A−Q = 0, with A,B,Q,X ∈ Cn×n, and in some applications B = A. (. = or H). The matrix Q is assumed to be nonsingular and sparse with its structure allowing the solu-tion of the corresponding linear system Qv = r in O(n) computa-tional complexity. Furthermore, B and A are respectively of ranks ra,rbn. The type 2 structure-preserving doubling algorithm by Lin and Xu (2006) [24] is adapted, with the appropriate applica-tions of the Sherman–Morrison–Woodbury formula and the low-rank updates of various iterates. Two resulting large-scale doubling algorithms have an O((ra+rb)3)computational complexity per it-eration, after some pre-processing of data in O(n)computational complexity and memory requirement, and converge quadratically. These are illustrated by the numerical examples.
© 2012 Elsevier Inc. All rights reserved.
1. Introduction
Consider the nonlinear matrix equation (NME)
R
(
X) ≡
X+
BX−1A−
Q=
0 (1)with A
,
B,
Q,
X∈ C
n×n. We assume that Q is nonsingular with structures, like being banded or sparse, allowing the solution of the corresponding linear system Qv=
r in O(
n)
computational complexity.∗Corresponding author.
E-mail addresses:[email protected](P.C.-Y. Weng),[email protected](E.K.-w. Chu),[email protected](Y.-C. Kuo), [email protected](W.-W. Lin).
0024-3795/$ - see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.laa.2012.08.008
We further assume that A
,
B are respectively of ranks ra,
rb n. These NMEs arise in the solution of palindromic eigenvalue problems, with applications in the computation of Green’s function in nano research [12–14,16] and surface acoustic simulations [18,19]; for the individual models, structures of the particular NMEs and their solvability conditions, please consult these references. For the surface acoustic wave application in [18,19], we have Q=
Q∈ C
n×n and B=
A∈ C
n×n. In some applications as in [18,19], selected eigenvalues from the pencilsλ
X−
A orλ
B−
X are required, afterthe solution X to (1) is found.
We shall adapt the structure-preserving doubling algorithm (SDA) of type 2 [13,16,24] for the NME in (1), resulting in an efficient large-scale doubling algorithm (SDA_ls). The original SDA is usually attributed to Anderson [1],1 as an accelerated variant of the direct functional iteration method. It has recently been revitalized and further developed in [5–7], for a great variety of applications [9]. Recently, we have extended the SDA (of type 1) for large-scale algebraic Riccati equations (AREs) [21,22,28] and the associated linear equations [23], with the resulting algorithms possessing an efficient O
(
n)
computational complexity and memory requirement per iteration. We shall extend these methods to NMEs, based on similar philosophy. Notice the important difference in the large-scale NME, from large-scale AREs, that the solution X is nonsingular and not numerically low-ranked. (We shall see later that X is a numerically low-ranked update of the nonsingular Q .) Interestingly, the essential steps of compression and truncation of Krylov bases for large-scale AREs are not required for NMEs. Also, the SDA_ls for large-scale NMEs is of a more efficient O
((
ra+
rb)
3)
computational complexity per iteration, as compared to O(
n)
for AREs. The overall algorithm shares the O(
n)
computational complexity and memory requirement because of the pre-processing of data.Similar techniques in this paper are applicable to the cyclic reduction method in [25] for X
±
A.X−1A
−
Q=
0 (.= ,
H; denoting the transpose and the Hermitian).2. Preliminaries
In this section, we shall introduce some notations, briefly describe the solvability condition for NME (1) and give some preliminary results. Throughout this paper, we denote the unit circle in the complex plane by
T
. For a matrix A∈ C
n×n,σ (
A)
andρ(
A)
denote respectively the spectrum and spectral radius of A, andσ
max(
A)
andσ
min(
A)
are respectively the maximum and minimum singular values ofA. The conjugate transpose and transpose of A are denoted by AHand Arespectively. We can write
A
=
AR+
iAI, where the Hermitian matricesAR
=
1 2(
A+
A H),
A I=
1 2i(
A−
A H)
are called the real part and the imaginary part of A, respectively. For Hermitian matrices A1, A2
∈ C
n×n, we use A1>
A2(A1≥
A2) to denote the fact that A1−
A2is positive definitive (positive semi-definite).The NME in (1) can be reformulated as
R
(
X) =
X+ (
CH+
iDH)
X−1(
C+
iD) − (
QR+
iQI) =
0,
where C
≡
12(
A+
BH)
, D≡
2i1(
A−
BH)
and QR, QIare the real part and the imaginary part of Q , respectively. Letψ(
z) =
zDH+
QI+
z−1D (2)be a rational matrix-valued function. The following is a consequence of the solvability results from [13] and the proof can be found in [14], after superficial modifications.
1 In [2, p. 149], we have the quotation “Doubling algorithms have been part of the folklore associated with Riccati equations in
linear systems problems for some time. We are unable to give any original reference containing material close to that presented here.” This quotation from Anderson, to whom the SDA is widely attributed, indicates the uncertain origin of the method.
Theorem 2.1. Let A
=
C+
iD, B=
CH+
iDHand Q=
QR+
iQIbe n×
n complex matrices. Supposethat
ψ(
z)
defined in (2) is positive definite for each z∈ T
.(i) The matrix polynomial P
(
z) =
z2B−
zQ+
A has exactly n eigenvalues each inside and outsideT
. (ii) The NME (1) has a solution X=
XR+
iXIwithρ(
X−1A) <
1 and XI>
0.Note that if X is solution of NME, then P
(
z) = (
zBX−1−
I)
X(
zI−
X−1A)
. So the eigenvalues ofX−1A are the n eigenvalues of P
(
z)
, and the eigenvalues of BX−1are the reciprocals of remaining n eigenvalues of P(
z)
. A solution X of NME is said to be stabilizing ifρ(
X−1A) <
1. From Theorem2.1, ifψ(
z) >
0 for each z∈ T
, then the NME and its dualX
+
AX−1B=
Q (3)have stabilizing solutions.
Remark 2.1. Theorem2.1gives a sufficient condition for the existence of stabilizing solutions of NMEs. For applications in the computation of the surface Greens function in nano research [13,16] , we have
QI
=
I, D=
0 and C, QRbeing real, and it is easy to check that the sufficient condition holds. For the surface wave application in [18,19], we have C, D, QRand QIbeing real such thatψ(
z) >
0 for eachz
∈ T
. For the special case where C,
QR=
0, ifψ(
z) >
0 for each z∈ T
, thenX
+ (
iDH)
X−1(
iD) =
iQI (4)has a stabilizing solution XSwith XS,I
>
0,ρ(
XS−1(
iD)) <
1 andρ((
iDH)
XS−1) <
1. We shall show that the real part of XSis zero.Consider the nonlinear matrix equationX
+ (
iDH)
X−1(
iD) = −
iQI.
(5)Itis obvious that
−
XSand XSHare solutions withρ(−
X−S1(
iD)) <
1 andρ(
XS−H(
iD)) <
1, i.e.,−
XS and XSHare stabilizing solutions. Since the stabilizing solution of (5) is unique, XSH= −
XS. Hence, XS has only imaginary part, i.e., XS=
iXS,I. Since XS,I>
0, substituting XS=
iXS,Iinto (4) implies thatX
+
DHX−1D=
QIhas a Hermitian positive definite solution XS,I. This coincides with the result in [10]. Under the assumption thatψ(
z) >
0 for each z∈ T
, we know that the NME and its dual have stabilizing solutions X and X, respectively. The corresponding SDA of type 2 [13,16,24] has the formA0
=
A,
B0=
B,
Q0=
Q,
P0=
0;
Ak+1
=
Ak(
Qk−
Pk)
−1Ak,
Bk+1=
Bk(
Qk−
Pk)
−1Bk;
Qk+1
=
Qk−
Bk(
Qk−
Pk)
−1Ak,
Pk+1=
Pk+
Ak(
Qk−
Pk)
−1Bk.
(6) From [13, Theorem 3.1], all the iterates are well-defined (i.e., Mk≡
Qk−
Pkare always invertible),Qk
→
X, Q−
Pk→
X and A k,
Bk→
0, all quadratically as k→ ∞
. In the following, we shall give an upper bound of{
M−k12
:
k=
0,
1, . . .}
For the case where B=
AHand Q=
QH, an upper bound has been given in [3, Theorem 15].The following lemma is useful to estimate the upper bound.
Lemma 2.2. Let
=
R+
iI∈ C
n×n, whereRandIare the real and imaginary parts of. SupposethatIis positive (negative) definite.
(i) is invertible and −1
=
−1with−I 1R
(
I+
R−I1R)
−1being Hermitian and−(
I+
R−I 1R)
−1being Hermitiannegative (positive) definite.
(ii) IfI
I (I
−
I) for some>
0, thenσ
min(
)
, i.e.,
−1
2
−1.
Proof.
(
i)
For eachunitvector x∈ C
n,sinceR,Iare Hermitian andIis positive (negative) definite,we have
x
2
= (
R+
iI)
x2
xH
(
R+
iI)
x2
xHIx
2
σ
min(
I).
(8) Hence,is invertible. From(
R+
iI)[
I−1R(
I+
RI−1R)
−1−
i(
I+
R−I 1R)
−1]
= (
R+
iI)(
−I 1R−
iI)(
I+
R−I 1R)
−1= (
I+
R−I 1R)(
I+
R−I 1R)
−1=
I,
we obtain (7). SinceIis positive (negative) definite andR
=
HR, it is easy to see that−(
I+
R−I 1R
)
−1 is Hermitian negative (positive) definite. Finally, we show that −I 1R(
I+
R−I 1R)
−1is Hermitian. From (7), we have[
−1I R
(
I+
RI−1R)
−1−
i(
I+
R−I 1R)
−1](
R+
iI) =
I.
It follows that−I 1R
(
I+
R−I 1R)
−1I− (
I+
R−I 1R)
−1R=
0. SinceRandIare Hermitian, we deduce that−I 1R(
I+
R−I 1R)
−1is Hermitian.(
ii)
IfII (I
−
I) for some>
0 then from (8), we haveσ
min(
) =
minx =1
x
2
.
Hence,−1
2
= σ
min(
)
−1−1.
Suppose that
ψ(
z) >
0 for each z∈ T
. LetMk
=
Qk−
Pk, φ
k(
z) = −
Bkz+
Mk−
Akz−1,
(9) where Ak, Bk, Qk, Pkare generated by SDA (6). From [13], we know that Mkis invertible. Thenφ
k(
z)
Mk−1φ
k(−
z) = (−
Bkz+
Mk−
Akz−1)(
Mk−1Bkz+
I+
M−k1Akz−1)
= −
BkM−k1Bkz2+ (
Mk−
BkMk−1Ak−
AkMk−1Bk) −
AkMk−1Akz−2= −
Bk+1z2+
Mk+1−
Ak+1z−2= φ
k+1(
z2).
(10) Letφ
0,R(
z)
andφ
0,I(
z)
be the real and imaginary parts ofφ
0(
z)
, respectively. Since, for z∈ T
,φ
0(
z) = φ
0,R(
z) +
iφ
0,I(
z), φ
0,I(
z) = ψ(−
z) >
0.
(11) Lemma2.2implies thatφ
0(
z)
is invertible for z∈ T
. It follows from (10) that for k=
0,
1, . . .
,φ
k(
z)
is invertible for z∈ T
. Letϕ
k(
z) = φ
k(
z)
−1for z∈ T
andwhere
ϕ
k,R(
z)
andϕ
k,I(
z)
are the real and imaginary parts ofϕ
k(
z)
. Taking the inverse of (10) yieldsϕ
k+1(
z2) = φ
k+1(
z2)
−1= φ
k(−
z)
−1Mkφ
k(
z)
−1= φ
k(−
z)
−1φ
k(
z) + φ
k(−
z)
2φ
k(
z)
−1=
1 2[φ
k(
z)
−1+ φ
k(−
z)
−1] =
1 2[ϕ
k(
z) + ϕ
k(−
z)].
(13) From (11), (12) and Lemma2.2, we haveϕ
0,I(
z) = −(φ
0,I(
z) + φ
0,R(
z)φ
0,I(
z)
−1φ
0,R(
z))
−1<
0 forz
∈ T
. It follows from (13) that for each k,ϕ
k,I(
z) <
0 for z∈ T
and maxz∈T
σ
max(ϕ
k,R(
z))
maxz∈Tσ
max(ϕ
0,R(
z)) ≡ σ
max,R,
maxz∈T
σ
max(ϕ
k,I(
z))
maxz∈Tσ
max(ϕ
0,I(
z)) ≡ σ
max,I,
minz∈T
σ
min(ϕ
k,I(
z))
minz∈Tσ
min(ϕ
0,I(
z)) ≡ σ
min,I>
0.
(14) For z∈ T
, we then haveϕ
k,R(
z)
2σ
max,R, ϕ
k,I(
z)
2σ
max,I, ϕ
k,I(
z)
−12
σ
min−1,I.
(15) The following theorem gives upper bounds ofMk
2and
Mk−1
2for k
=
0,
1, . . .
Theorem 2.3. Let A
=
C+
iD, B=
CH+
iDHand Q=
QR+
iQIbe given such thatψ(
z)
is positive foreach z
∈ T
. Let Mk=
Qk−
Pk, where Qkand Pkare generated by the SDA in (6). ThenMk
2
σ
2 max,R+ σ
min2 ,Iσ
2 min,I,
M−k12
σ
max,I+
σ
2 max,Rσ
min,I,
where
σ
max,R,σ
max,Iandσ
min,Iare given in (14), which are only dependent onφ
0(
z)
for z∈ T
. Proof. For each k=
0,
1, . . .
, from (15), we have that for each z∈ T
,−
σ
max,Rσ
min,Iϕ
k,I(
z)
ϕ
k,R(
z)
σ
max,Rσ
min,Iϕ
k,I(
z),
0< σ
min,II−ϕ
k,I(
z) − ϕ
k,R(
z)ϕ
k,I(
z)
−1ϕ
k,R(
z)
σ
max,I+
σ
2 max,Rσ
min,I I.
Let
φ
k,R(
z)
andφ
k,I(
z)
be the real and imaginary parts ofφ
k(
z)
, respectively. From Lemma2.2(
i)
and using the fact thatφ
k(
z) = ϕ
k(
z)
−1, we have for z∈ T
,
σ
max,Rσ
2 min,I Iφ
k,R(
z)
−
σ
max,Rσ
2 min,I I,
σ
−1 min,IIφ
k,I(
z)
σ
max,I+
σ
2 max,Rσ
min,I −1 I.
Since Mk
= [φ
k(
z) + φ
k(−
z)]/
2= [φ
k,R(
z) + φ
k,R(−
z)]/
2+
i[φ
k,I(
z) + φ
k,I(−
z)]/
2, we haveMk
2
σ
max,Rσ
2 min,I 2+ σ
−2 min,I=
σ
2 max,R+ σ
min2 ,Iσ
2 min,I and Mk,I=
1 2i(
Mk−
M H k) =
φ
k,I(
z) + φ
k,I(−
z)
2σ
max,I+
σ
2 max,Rσ
min,I −1 I.
It follows from Lemma2.2
(
ii)
thatM−k1
2
σ
max,I+ σ
max2 ,R/σ
min,I.
For the special case C
,
QR=
0, it follows from (14) and Theorem2.3thatMk
2
σ
min−1,I andMk−1
2
σ
max,I. This coincides with the results in [3, Theorem 15].3. Large-scale doubling algorithm
3.1. Main ideas
From [21–23,28], the main ideas behind the algorithm for large-scale problems are:
(a) The appropriate application of the Sherman–Morrison–Woodbury formula (SMWF) in order to avoid the inversion of large or unstructured matrices.
(b) The use of low-rank updates for various iterates.
(c) The computation of matrix operators (Ak) recursively, to preserve the corresponding sparsity or low-rank structures, instead of forming them explicitly.
(d) The careful organization of convergence control in the algorithm, so as to preserve the low computational complexity and memory requirement per iteration.
For the SDA for large-scale NMEs, we shall see that (c) is not relevant.
Let A, B, Q
∈ C
n×nbe given such thatψ(
z)
defined in (2) is positive definite for each z∈ T
and A,B be respectively of ranks ra, rb
n. Assume the full-rank decompositionsA
=
FaRaGHa,
B=
FbRbGbH (16)with Ra
∈ C
ra×ra and Rb∈ C
rb×rb. Without loss of generality, we shall assume that Fa, Fb, GaandGbare unitary. In this paper, we shall call some matrices “kernels”, mostly denoted by R with various subscripts (Y is also used in Section3.3). Most of our computation will be done in terms of kernels.
From Theorem2.1, we know that NME (1) and its dual (3) have stabilizing solutions X and X,
respectively. The SDA for NME and its dual has the form in (6) which requires the inverse of Mk
=
Qk
−
Pk. It is shown in [13] that Mkis invertible for each k=
0,
1, . . .
Furthermore, an upper bound of{
M−k12
|
k=
0,
1, . . .}
is given in Theorem2.3. Similar to the approach in [21,23], we shall apply the SMWF:(
A±
U CV)
−1=
A−1∓
A−1U(
I±
CV A−1U)
−1 CV A−1to various inverses of matrices in sparse-plus-low-rank (splr) form, enabling the computation of large inverses of size n in terms of much smaller matrices. In the following lemma, we show that the small size matrix
(
I±
CV A−1U)
is invertible provided that A and(
A±
U CV)
are invertible.Lemma 3.1. If A and
(
A±
U CV)
are invertible then(
I±
CV A−1U)
is invertible.Proof. Suppose that A and
(
A±
U CV)
are invertible. Then ⎡ ⎣ A±
U CV 0 0∓
I ⎤ ⎦ is invertible. Since A is invertible and ⎡ ⎣I∓
U 0 I ⎤ ⎦ ⎡ ⎣ A±
U CV 0 0∓
I ⎤ ⎦ ⎡ ⎣ I 0∓
CV I ⎤ ⎦=
⎡ ⎣ A U CV∓
I ⎤ ⎦=
⎡ ⎣ I 0 CV A−1∓
I ⎤ ⎦ ⎡ ⎣ A 0 0 I±
CV A−1U ⎤ ⎦ ⎡ ⎣I A−1U 0 I ⎤ ⎦,
we have shown that
(
I±
CV A−1U)
is invertible.3.2. Algorithm 1
For k
=
0,
1, . . .
, we can organize the SDA so that the iterates have the recursive formsAk
=
FaRakGHa,
Bk=
FbRbkGHb;
Qk
=
Q−
FbRqkGHa,
Pk=
FaRpkGHb;
(17)with Rak
∈ C
ra×ra, Rbk∈ C
rb×rband Rpk,
RHqk∈ C
ra×rb. The general forms in (17) can be verified easily from (6), when identifying the updating formulae in (23) and the initail values in (25). Note that we can equivalently formulate the SDA in terms of Ak, Bk, Pkand the new variable Qk≡
Q−
Qk, with the symmetry in the low-ranked Pkand Qk. Note also that the row and column spaces of all these matrices remain constant, with only the various kernels Rs varying with k. Also, Qkare low rank updates of Q . (For the nano research application in [13,16], this corresponds to the behaviour that only upper right corner in Akand the lower right corner of Qkare changing for different k.)We require the inverse of Mk
=
Qk−
Pkin the SDA in (6). From (17), we haveMk
=
Q− [
Fa,
Fb]
Rmk[
Ga,
Gb]
H,
(18) where Rmk≡
⎡ ⎣ 0 Rpk Rqk 0 ⎤ ⎦∈ C
(ra+rb)×(ra+rb). Applying the SMWF to M−1 k yields M−k1=
Q−1+
Q−1[
Fa,
Fb]
Nk[
Ga,
Gb]
HQ−1 (19) with Nk≡ (
Ira+rb−
RmkT)
− 1R mk (20) and T=
⎡ ⎣Taa Tab Tba Tbb ⎤ ⎦≡ [
Ga,
Gb]
HQ−1[
Fa,
Fb].
(21)Remark 3.1. Since Q and Mkare invertible, it follows from Lemma3.1that Ira+rb
−
RmkT(
k=
0,
1, . . .)
are invertible. In the following, we shall give upper bounds of
I
−
RmkT2and
(
I−
RmkT)
−12for k
=
0,
1, . . .
when[
Ga,
Gb]
and[
Fa,
Fb]
are of full column rank. Assumeσ
min([
Ga,
Gb]) = σ
min,G>
0and
σ
min([
Fa,
Fb]) = σ
min,F>
0. Since Ga, Gb, Faand Fbare unitary, we obtain[
Ga,
Gb]
2√
2,
[
Fa,
Fb]
2√
2,
[
Ga,
Gb]
†2
=
1/σ
min,G, [
Fa,
Fb]
†2
=
1/σ
min,F,
(22)where
(·)
†denotes the pseudoinverse of a matrix. From (18), (21) and (22), we haveRm,k
2
(
Mk2
+
Q
2
)/(σ
min,Fσ
min,G)
andT
2 2
Q−1
2. Hence, we obtain
I
+
RmkT2 1
+
2
Q−1
2
σ
min,Fσ
min,G(
Mk
2
+
Q2
).
From (19) and (22), we have
Nk
2
(
Q2
Mk−1
2
+
1)
Q2
/(σ
min,Fσ
min,G)
. Using the fact that(
I−
RmkT)
−1(
I−
RmkT) =
I, it follows that(
I−
RmkT)
−1=
NkT+
I and hence(
I−
RmkT)
−12 1
+
2
Q
2
Q−1
2
σ
min,Fσ
min,G(
Q
2
Mk−1
2
+
1).
With (17) and (19), the SDA in (6) now becomes the updating formulae:Ra,k+1
=
RakWkaaRak,
Rb,k+1=
RbkWkbbRbk;
Rq,k+1
=
Rqk+
RbkWkbaRak,
Rp,k+1=
Rpk+
RakWkabRbk;
(23)where Wkuv
≡
GHuMk−1Fv(
u,
v=
a,
b)
. From (19), we obtain ⎡⎣Wkaa Wkab
Wkba Wkbb
⎤
⎦
=
T+
TNkT.
(24)The computation requires about263
(
r3a+
rb3)+
22rarb(
ra+
rb)
flops for each iteration (the detailed count is given in Table1), with the help of (19), after the pre-processing in O(
n)
complexity for quantities like Q−1U, Q−HV(
U=
Fa,
Fband V=
Ga,
Gb)
in (20) and T in (21).For initial values, we have the obvious
Ra0
=
Ra,
Rb0=
Rb;
Rp0,
Rq0=
0.
(25)In [21–23,28], the SDA of type 1 has been extended for large-scalarStein/Lyapunov and AREs equa-tions. The iterates Akin the SDA of type 1 are computed recursively, without being forming explicitly. As the SDA of type 2 in (6) now translates to the updating formulae (23) in
C
ru×rv(u,
v=
a,
b), this previously important aspect (c) in Section3.1of the SDA_ls is now irrelevant.The SDA_ls for an NME (and its dual), realizes the iteration in (6) with the help of (17), (19) and (23), the initial values in (25), and the convergence control in Section4.1. We summarize the SDA_ls in Algorithm 1 below.
Algorithm 1 (SDA_ls)
Input
: A=
FaRaGHa,
B=
FbRbGHb,
Q∈ C
n×n, positive tolerance;
Output
: Rp,
RHq∈ C
ra×rb, with Q−
FbRqGHa and Q−
FaRpGHb,approximating, respectively, the solutions X and X to the
large-scale NME (1) and its dual (3);
Orthogonalize
Fa,
Fb,
Ga,
Gb, modify Ra,
Rb; (if required) Setk=
0, r0=
2; Ra0
=
Ra, Rb0=
Rb; Rp0,
Rq0=
0;(as in (25))
Do
until convergence:If
the relative residual˜
rk= |
rk/(
qk+
mk)| <
,Set
Rq=
Rqkand Rp=
Rpk;Exit
End If
Compute
Wkuv=
GuHM−k1Fv(
u,
v=
a,
b)
, (as in (20) and (24))Ra,k+1
=
RakWkaaRak,Rb,k+1
=
RbkWkbbRbk,Rq,k+1
=
Rqk+
RbkWkbaRak, andRp,k+1
=
Rpk+
RakWkabRbk; (as in (23))Compute
k←
k+
1, rk, qkand mk; (as in (40) in Section4.1)End Do
3.3. A non-symmetric discrete-time algebraic Riccati equation
From Section3.2, Q
−
Qkand PkHare low-ranked with small rb×
rakernels. It suggests that the solution X, or its kernel Y , can be characterized by a matrix equation of lower dimensions, as stated in the following lemma. We shall not elaborate on the similar result for the solution X of dual equation.Lemma 3.2. Let X be a solution of the large-scale NME (1). Then there exists Y
∈ C
rb×rasuch thatX
=
Q−
FbYGHa.
(26)Proof. Suppose that X is a solution of NME (1). Substituting (16) into (1) and setting Y
=
RbGHbX−1FaRa, we obtain (26).We shall now show that Y in (26) satisfies an uncommon non-symmetric discrete-time algebraic Riccati equation (NARE_D). Note that the standard non-symmetric algebraic Riccati equations (NARE) [28] are of continuous-time type in most literature.
Substituting (16) and (26) into (1), we have
Multiplying FbHand Gafrom the left and the right of (27), respectively, and applying the SMWF, we obtain
−
Y+
RbGbH Q−1+
Q−1FbY(
I−
GaHQ−1FbY)
−1GaHQ−1 FaRa=
0.
(28)With the notation in (21), (28) is equivalent to
−
Y+
RbTbaRa+
RbTbbY(
I−
TabY)
−1TaaRa=
0,
or, the NARE_DD
(
Y) ≡ −
Y+
TbbY(
I−
TabY)
−1 Taa+
Tba=
0,
(29) whereTaa
≡
TaaRa,
Tbb≡
RbTbb,
Tba≡
RbTbaRa,
Tab≡
Tab.
(30) In [15], an NARE is first processed by Cayley transform to the form (29) and then doubling is applied. We shall refer to this approach (without the Cayley transform) as Algorithm 2.Algorithm 2 (SDA_ls)
Input
: A=
FaRaGHa,
B=
FbRbGHb,
Q∈ C
n×n, positive tolerance;
Output
: Y,
YH∈ C
rb×ra, with Q−
FbYGHa and Q
−
FaYGbH, approximating, respectively, the solutions X and X to thelarge-scale NME (1) and its dual (3);
Orthogonalize
Fa,
Fb,
Ga,
Gb, modify Ra,
Rb; (if required) Compute Taa, Tbb, Tab, Tba; (as in (30)) tba=
Tba;
Setk
=
0,r˜
0=
2; Taa,0
=
Taa, Tbb,0=
Tbb, Tba,0=
Tba, Tab,0=
Tab;Do
until convergence:If
the relative residual˜
rk= |
rk/(
qk+
mk+
tba)| <
,Set
Y=
Tba,kand Y=
Tab,k;Exit
End If
Compute
Taa,k+1=
Taa,k(
Ir a−
Tab,k Tba,k)
−1 Taa,k, Tbb,k+1=
Tbb,k(
Irb−
Tba,k Tab,k)
− 1 T bb,k, Tab,k+1=
Tab,k+
Taa,k Tab,k(
Ir b−
Tba,k Tab,k)
− 1 T bb,k, and Tba,k+1=
Tba,k+
Tbb,k Tba,k(
Ira−
Tab,k Tba,k)
− 1 T aa,k;Compute
k←
k+
1, rk=
D(
Tba,k)
, qk=
Tba,kand mk
=
Tbb Tba,k(
I−
Tab Tba,k)
−1 Taa;
End Do
Notice that the computation of Algorithm 2 can be realized as: computing Tu,v(u
,
v=
a,
b) in (21) requires O(
n)
computational complexity; computing Tu,v(u,
v=
a,
b) in (30) requires 2ra3+
2rb3+
2rarb(
ra+
rb)
flops; for each iteration, it requires223ra3+
10ra2rb+
8rarb2+
143rb3flops. The detailed countis given in Table2. For the case ra
=
rb, Algorithm 2 requires 8ra3+(
30ra3)
k flops, where k is the number of iterations, after the pre-processing in O(
n)
computational complexity. Algorithms 1 and 2 all need to compute Tu,v(u,
v=
a,
b) in (21) which required O(
n)
computational complexity. From Table1, we obtain that Algorithm 1 requires 74ra3flops per iteration when ra=
rb. When ra≈
n1/3 n, Algorithm 2 will be more efficient than Algorithm 1. However, we guarantee that all iterations in Algorithm 1 are well-defined, but the same for Algorithm 2 cannot be guaranteed. After each iterative step, Rqkin Algorithm 1 and Tba,kin Algorithm 2 are different but they converge to the same Y . From our numerical experience, for a given example, the two algorithms converge in the same number of iterations, possibly the result of Theorem3.3below.Suppose that X
=
Q−
FbYGHa and X=
Q−
FaY GHb are the stabilizing solutions of NME and its dual, respectively. We shall show below, when all iterations of Algorithm 2 are well-defined then Tba,k→
Y and Tab,k→
Y as k→ ∞
. Let M=
⎡ ⎣ TaaRa 0−
RbTbaRa Irb ⎤ ⎦,
L=
⎡ ⎣Ira−
Tab 0 RbTbb ⎤ ⎦∈ C
(ra+rb)×(ra+rb).
(31) Suppose that (29) has a solution Y∈ C
rb×ra. Then Y satisfies thatM ⎡ ⎣Ira Y ⎤ ⎦
=
L ⎡ ⎣Ira Y ⎤ ⎦ S,
(32) where S= (
Ira−
TabY)
−1TaaRa∈ C
ra×ra.Theorem 3.3. Let X
=
Q−
FbYGHa and X=
Q−
FaY GHb be the stabilizing solutions of, respectively, NME (1) and its dual (3). Then(i) X−1A and S
= (
Ira−
TabY)
− 1TaaRahave the same nonzero eigenvalues.
(ii) X−1B and S
= (
Irb−
TbaY)
− 1TbbRbhave the same nonzero eigenvalues.
Proof.
(
i)
From (16), we have X−1A= (
Q−
FbYGHa)
−1FaRaGHa. Applying the SMWF to(
Q−
FbYGHa)
−1 yieldsX−1A
= (
Q−1Fa+
Q−1FbY(
Ira−
G HaQ−1FbY
)
−1GHaQ−1Fa)
RaGHa.
(33) Let G= [
Ga,
Ga]
be a unitary matrix. Multiply GHand G from the left and right of (33), respectively, we obtain GHX−1AG=
⎡ ⎣Z 0 . 0 ⎤ ⎦,
where Z=
Taa+
TabY(
Ira−
TabY)
− 1T aa Ra= (
Ira−
TabY)
− 1T aaRa=
S.
Hence, X−1A and S= (
Ira−
TabY)
− 1TaaRahave the same nonzero eigenvalues. The proof of
(
ii)
is similar to the proof of(
i)
.As mentioned before, we may require the eigenvalues of X−1A or X−1B in some applications, and
Theorem3.3provides an efficient route to the nonzero parts of these spectra via the much smaller matrices S and S.
Suppose that X
=
Q−
FaY GHb is a stabilizing solution of (3). Then the kernel Y∈ C
ra×rb and S= (
Irb−
TbaY)
− 1T bbRbsatisfy ⎡ ⎣ TbbRb 0−
RaTabRb Ira ⎤ ⎦ ⎡ ⎣Irb Y ⎤ ⎦=
⎡ ⎣Irb−
Tba 0 RaTaa ⎤ ⎦ ⎡ ⎣Irb Y ⎤ ⎦S.
Let Y=
R−a1Y Rb−1,
S=
RbSR−b1.
(34) It is easily seen thatM ⎡ ⎣ Y Irb ⎤ ⎦ S
=
L ⎡ ⎣ Y Irb ⎤ ⎦,
(35)where M and L are defined in (31). It follows from (32), (34), (35) and Theorem3.3that the matrix pencil
λ
L−
M has raeigenvalues inside the unit circle and rbeigenvalues outside the unit circle. Similar to the theory in [5,15], if the matrix sequences{
Taa,k}
,{
Tbb,k}
,{
Tab,k}
and{
Tba,k}
generated by Algorithm 2 are well-defined, then we haveTba,k
−
YO
(ρ(
S)
2 kρ(
S)
2k),
T ab,k−
YO
(ρ(
S)
2 kρ(
S)
2k),
(36) where S= (
Ira−
TabY)
−1TaaRaand S is defined in (34) withρ(
S) <
1 andρ(
S) <
1. For Algorithm 1, it has been shown in [13] thatRq,k
−
Y= (
Q−
FbRq,kGaH) − (
Q−
FbYGHa)
O(ρ(
X−1B)
2 kρ(
X−1A)
2k),
Rp,k
−
Y= (
Q−
FaRp,kGbH) − (
Q−
FaY GHb)
O(ρ(
X−1B)
2 kρ(
X−1A)
2k),
(37)where
{
Rq,k}
and{
Rp,k}
are generated by Algorithm 1. From (36), (37) and Theorem3.3, it is easily seen that Algorithms 1 and 2 converge in the same number of iterations.It is intriguing, that we started from an NME associated with the SDA of type 2 and ended up with an equivalent NARE_D associated with the SDA of type 1. Similar links between NMEs and AREs have been considered before. In [10], an NME has been transformed to a discrete-time ARE when B
=
A∗. The transformation of an NARE into a unilateral quadratic matrix polynomial (UQME), which was then solved by the SDA of type 2, has recently been studied in [4].3.4. Errors in SDA_ls
It is easy to see from (6) that errors in the iterates will propagate through the SDA. Let
δ
Ak,δ
Bk,δ
Pk andδ
Qkbe the errors in Ak, Bk, Pkand Qk, respectively. From (6), withk≡ δ
Qk− δ
Pk, and ignoring higher order terms, we haveδ
Ak+1≈ δ
AkMk−1Ak−
AkMk−1kM−k1Ak+
AkMk−1δ
Ak,
δ
Bk+1≈ δ
BkMk−1Bk−
BkMk−1kM−k1Bk+
BkMk−1δ
Bk,
δ
Pk+1≈ δ
Pk+ δ
AkMk−1Bk−
AkMk−1kMk−1Bk+
AkMk−1δ
Bk,
δ
Qk+1≈ δ
Qk− δ
BkMk−1Ak+
BkMk−1kMk−1Ak−
BkM−k1δ
Ak.
With
δ
k≡
max{ δ
Ak, δ
Bk, δ
Pk, δ
Qk}
, cak≡
M−k1Ak
and cbk
≡
Mk−1Bk
, we have
δ
k+1δ
k·
max{
2cak(
1+
cak),
2cbk(
1+
cbk),
1+ (
cak+
cbk)
2} +
O(δ
k2).
From Theorem2.3, we have the upper bounds of
{
M−k12
|
k=
0,
1, . . .}
dependent only onφ
0(
z)
, or from (9), only on A, B and Q . Using the fact thatAk
and
Bk
are also bounded in [13], the errors
δ
Ak,δ
Bk,δ
Pkandδ
Qkthen pass intoδ
Ak+1,δ
Bk+1,δ
Pk+1andδ
Qk+1, creating errors of the same order. Note that ignoring the higher terms simplifies the error equations, without altering the conclusions of the discussion. The fact that Ak,
Bk→
0, or cak,
cbk→
0, will contribute towards diminishing the errors.4. Computational issues
4.1. Residual and convergence control
For the convergence control in Algorithm 1, we should compute residuals and differences of iterates carefully. Note that it is much easier to compute the smaller analogous quantities in Algorithm 2.
Consider the differences of successive iterates:
dQk
≡
Qk+1−
Qk=
Fb(
Rqk−
Rq,k+1)
GaH,
dPk
≡
Pk+1−
Pk=
Fa(
Rp,k+1−
Rpk)
GHb,
implying thatdQk
=
Rqk−
Rq,k+1,
dPk=
Rpk−
Rp,k+1in 2- or F-norm, because the Fs and Gs are unitary by choice. The computations of
dQk
and
dPk
can be achieved in about 8rarb
ˆ
rab(see [11, p. 254]) and 4rarbflops for 2-norm and F-norm, respectively, whereˆ
rab=
max{
ra,
rb}.
Similarly, we have the residual rk
≡
R(
Qk)
of the NME, the corresponding relative residual equals rk≡
rk qk+
mk (38) with qk≡
Q−
Qk=
Qk,
mk≡
BQk−1A.
(39)With the low-rank forms in (17), we then have
rk
= −
FbRqkGaH+
FbRbGHb(
Q−
FbRqkGHa)
−1FaRaGHa,
qk
=
FbRqkGaH,
mk
=
FbRbGbH(
Q−
FbRqkGHa)
− 1FaRaGHa
.
After applying the SMWF, using the notation in (21), we have the efficient formulae
rk
=
−
Rqk+
Rb Tba+
Tbb(
Irb−
RqkTab)
− 1R qkTaa Ra,
qk=
Rqk,
mk=
Rb Tba+
Tbb(
Irb−
RqkTab)
− 1R qkTaa Ra.
(40)The computation of relative residual requires about 6ra2rb
+
4rarb2+
8 3r3
bflops, assuming the Ts are available, and using F-norm in (40). If we use 2-norm, then an additional 12rarb
ˆ
rabflops are required. Notice that the computation of relative residual in Algorithm 2 requires83ra3+
4ra2rb+
2rarb2flops.On the relation between residuals and actual errors in the computed solutions, please consult [27,29].
4.2. Operation and memory counts
In Algorithms 1 and 2, the dominant calculations of O
(
n)
computational complexity and memory requirement are in the pre-processing, with help from the structure of Q . We shall assume that cqn flops are required in the solution of Qv=
r or QHv=
r, with v,
r∈ C
n.For the pre-processing, the cost of 2
(
cq+
ra+
rb)(
ra+
rb)
n flops is made up of the following: (1) compute Q−1U and VHQ−1(
U=
Fa,
Fband V=
Ga,
Gb)
, requiring 2cq(
ra+
rb)
n flops; and (2) compute VHQ−1U (U=
Fa,
Fband V=
Ga,
Gb), requiring 2(
ra+
rb)
2n flops.There is also a memory requirement for
(
ra+
rb)(
ra+
rb+
2)
n variables. In addition, there may be up to 8(
ra2+
rb2)
n flops required for the orthogonalization of Fu,
Gv(u,
v=
a,
b) [11, p. 250], if required and dependent on the exact structures in A and B.After the pre-processing of data, it is easily seen that the memory requirement is about O
((
ra+
rb)
2)
variables, per iteration in both Algorithms 1 and 2. The memory requirement of each iteration is much smaller than(
ra+
rb)(
ra+
rb+
2)
n (the memory required in the pre-processing). Hence, we do not need to count the memory requirement for each iteration in Algorithms 1 and 2. In Algorithm 2, we need to compute Tu,v(u,
v=
a,
b) as in (30), which requires 2ra3+
2rb3+
2rarb(
ra+
rb)
flops, before starting the iteration.The detailed operation counts for the kth iteration in the SDA for large-scale NMEs of Algorithms 1 and 2 are summarized in Tables1and2below, respectively, with all the kernels formed explicitly. Only the dominant counts are recorded and the F-norm is applied. When B
=
A.and Q is Hermitian, the workload and memory requirement will be halved.Table 1
Operation counts for the kth iteration in Algorithm 1 (SDA_ls).
Computation Flops RmkT 4rarb(ra+rb) (I−RmkT)−1RmkT 83(ra+rb)3 Wuv k (u,v=a,b) 2(ra+rb)3 Ra,k+1, Rb,k+1 4(ra3+r3b) Rq,k+1, Rp,k+1 4rarb(ra+rb) rk, qk, mk 6ra2rb+4rar2b+ 8 3r 3 b Total 263r3 a+28r2arb+26rarb2+ 34 3r 3 b Table 2
Operation counts for the kth iteration in Algorithm 2 (SDA_ls).
Computation Flops Tab,k Tba,k 2r2arb Tba,k Tab,k 2rarb2 Taa,k+1, Tba,k+1 143r 3 a+2rarb(ra+rb) Tbb,k+1, Tab,k+1 143r 3 b+2rarb(ra+rb) rk, qk, mk 83r 3 a+4r2arb+2rar2b Total 223r3 a+10r2arb+8rar2b+ 14 3r 3 b
5. Numerical examples
All the numerical experiments were conducted using MATLAB 2010a on an iMac with a 2.97 GHz Intel i7 processor and 8 Gigabyte RAM. To measure the accuracy of a computed stabilizing solution X of NME (1), we use the relative and absolute residuals:
RRes
≡
X
+
BX−1A
−
QX
−
Q+
BX−1A,
ARes≡
X+
BX−1A
−
Q.
(41)Suppose that Rq
=
Rqk∗ and Y=
Tba,k∗ are the computed kernels by Algorithms 1 and 2, re-spectively, and k∗is the required number of iterations for convergence. From (38)–(40), we obtain that the relative and absolute residuals of the computed stabilizing solution X=
Q−
FbRq,GHa are RRes=
rk∗/(
qk∗+
mk∗)
and ARes=
rk∗, respectively, where rk∗, qk∗ and mk∗are defined in (40). From (27)–(30), it is easily seen that the relative and absolute residuals of the computed stabilizing solution X=
Q−
FbYGHa are RRes=
rk∗/(
qk∗+
mk∗+
tab)
and ARes=
rk∗, respectively, where rk∗,qk∗, mk∗and tabare given in Algorithm 2. Furthermore, we use“Time 1” and “Time 2”to represent the
executiontimes for the pre-processing data and SDA, respectively.
Example 1. In order to report the actual errors in the computed solution, we consider an example with exact solution. Suppose that A
=
iD and B=
iDHwhere D∈ C
n×n with ra=
rb=
3. We randomly generate RD∈ C
ra×ra,Fa,
Ga∈ C
n×ra and setFa:=
Fa(
FaHFa)
−1
2, Ga
:=
Ga(
GH aGa)
−1 2 and RD
=
RD/(
4RD
2
)
. Assume thatD=
FaRDGHa, then A and B have the full-rank decompositions A=
FaRaGaHand B=
FbRbGHb whereGb
=
Fa,
Fb=
Ga,
Ra=
iRD,
Rb=
iRHD.
Let H∈ C
n×raand set H=
H(
HHH)
−12. Assume thatXe
=
i(
In−
0.
5HHH)
(42)is the exact solution of NME. Then Xe−1
= −
i(
In+
HHH)
and Q=
iQI=
i(
In−
0.
5HHH+
GaRDHFaH(
In+
HHH)
FaRDGHa)
=
i(
In−
0.
5HHH+
GaRHDRDGaH+
GaRHDFaHHHHFaRDGHa).
Since
D
2
=
RD2
=
1/
4 andσ
min(
QI)
σ
min(
In−
0.
5HHH) =
1/
2, it is easily seen thatψ(
z) =
zDH
+
QI+
z−1D is positive definite for each z∈ T
andρ(
Xe−1A) <
1. That is, Xein (42) is the stabilizing solution of X+
BX−1A=
Q .We compute the stabilizing solution X with n
=
102, 5×
102, 103and 5×
103, respectively, by using Algorithm 1 with positive tolerance=
10−10. The numerical results are shown in Table3. Example 2. We first consider an example associated with the computation of Green function in nano research [13,16].Table 3
(Example 1) Numbers of iterations, absolute residuals, relative residuals and X−Xe 2.
n 102 5×102 103 5×103
# iterations (k∗) 5 5 5 5
ARes 1.46×10−17 1.75×10−17 1.82×10−17 1.39×10−17 RRes 6.48×10−17 7.86×10−17 8.28×10−17 6.35×10−17 X−Xe 2 4.01×10−17 1.11×10−16 1.11×10−16 1.11×10−16
Table 4
(Example 2) Numbers of iterations, absolute residuals, relative residuals andexecutiontimes in seconds.
n 102 103 104 105 106 107 # iterations (k∗)6 6 6 6 7 6 ARes 1.85×10−16 2.36×10−16 2.71×10−16 2.07×10−16 2.52×10−16 1.83×10−16 RRes 7.16×10−17 8.14×10−17 8.92×10−17 7.16×10−17 6.96×10−17 6.60×10−17 Time 1 1.65×10−3 3.65×10−3 1.57×10−2 1.24×10−1 9.94×10−1 133.16 Time 2 5.06×10−3 7.05×10−3 7.16×10−3 7.76×10−3 8.62×10−4 7.41×10−3 Table 5
(Example 2) O(n)Computational Complexity.
n 1×106 2×106 3×106 4×106 5×106 6×106 # iterations (k∗)7 6 6 6 6 6 ARes 2.52×10−16 1.98×10−16 2.46×10−16 1.85×10−16 2.10×10−16 1.89×10−16 RRes 6.96×10−17 7.19×10−17 9.86×10−17 6.76×10−17 5.95×10−17 7.26×10−17 Time 1 0.994 1.89 2.81 3.75 4.70 5.69 Time 2 8.62×10−4 5.08×10−3 6.27×10−3 8.17×10−3 5.16×10−4 8.47×10−4
With ra
=
3, and rb=
5 and we randomly generate Ra∈ C
ra×ra, Rb∈ C
rb×rb, Fa,
Ga∈ C
n×raandFb
,
Gb∈ C
n×rband set Fu:=
Fu(
FuHFu)
− 1 2, Gu:=
Gu(
GH uGu)
− 1 2 (u=
a,
b). Then Fu, Gu(u=
a,
b) areunitary. Recall that A and B have the forms
A
=
FaRaGHa,
B=
FbRbGbH.
Let Q be the tridiagonal matrix of dimension n with 2 on the main diagonal and
−
1 on the two adjacent diagonals. Choose a suitable∈ R
and set Q:=
Q+
iI such that A
,
B and Q satisfy the solvabilitycondition which is given in Theorem2.1, i.e.,
ψ(λ) =
I+ λ
D+ λ
−1DH>
0,
for allλ ∈ T,
(43) where D= (
A−
BH)/(
2i)
. In the following numerical experiments, we choose=
5. We compute the stabilizing solution X with n=
102, 103, 104, 105, 106and 107, by using Algorithm 1 with the positive tolerance=
10−10. The numerical results are shown in Table4. Somehow, when the size of the problem jumped from n=
106to 107, the memory requirement crossed some critical boundary concerning virtual memory on the computer. The amount ofexecution timejumped 134-folds, instead of the previous 1.9- to 8-folds when n was increased 10-folds. Anyway, Algorithm 1 was successful in solving the associated NMEs to machine accuracy (in terms of residuals), for n=
102 to 106, in reasonably quickexecution times. This is consistent with the predicted O(
n)
computational complexity and memory requirement for the SDA.Tosee the O
(
n)
computational complexity more clearly, we construct Table5for n=
j×
106(
j=
1:
6)
, with the corresponding execution time equals approximately to j seconds.The numerical results from Algorithm 2 is similar. Example 3.
Next we consider a small example associated with the surface acoustic wave simulation [18,19]. Here we have