THE MATRIX EQUATION X + A^T X^{-1} A = Q AND ITS APPLICATION IN NANO RESEARCH∗
CHUN-HUA GUO†, YUEH-CHENG KUO‡, AND WEN-WEI LIN§
Abstract. The matrix equation X + A^T X^{-1} A = Q arises in Green's function calculations in nano research, where A is a real square matrix and Q is a real symmetric matrix dependent on a parameter and is usually indefinite. In practice one is mainly interested in those values of the parameter for which the matrix equation has no stabilizing solutions. The solution of interest in this case is a special weakly stabilizing complex symmetric solution X_*, which is the limit of the unique stabilizing solution X_η of the perturbed equation X + A^T X^{-1} A = Q + iηI, as η → 0^+. It has been shown that a doubling algorithm can be used to compute X_η efficiently even for very small values of η, thus providing good approximations to X_*. It has been observed by nano scientists that a modified fixed-point method can sometimes be quite useful, particularly for computing X_η for many different values of the parameter. We provide a rigorous analysis of this modified fixed-point method and its variant, and of their generalizations. We also show that the imaginary part X_I of the matrix X_* is positive semi-definite, and we determine the rank of X_I in terms of the number of unimodular eigenvalues of the quadratic pencil λ²A^T − λQ + A. Finally we present a new structure-preserving algorithm that is applied directly to the equation X + A^T X^{-1} A = Q. In doing so, we work with real arithmetic most of the time.
Key words. nonlinear matrix equation, complex symmetric solution, weakly stabilizing solution, fixed-point iteration, structure-preserving algorithm, Green’s function
AMS subject classifications. 15A24, 65F30
1. Introduction. The nonlinear matrix equation X + A^T X^{-1} A = Q, where A is real and Q is real symmetric positive definite, arises in several applications and has been studied in [3, 7, 11, 21, 22, 27], for example.

In this paper we further study the nonlinear matrix equation

X + A^T X^{-1} A = Q + iηI,

where A ∈ R^{n×n}, Q = Q^T ∈ R^{n×n} and η ≥ 0, but Q is usually not positive definite.
The equation arises from the non-equilibrium Green’s function approach for treating quantum transport in nanodevices, where the system Hamiltonian is a semi-infinite or bi-infinite real symmetric matrix with special structures [1, 5, 17, 18, 25]. A first systematic mathematical study of the equation has already been undertaken in [12].
For the bi-infinite case, the Green's function corresponding to the scattering region G_S ∈ C^{n_s×n_s}, in which the nano scientists are interested, satisfies the relation [4, 17]

G_S = [ (E + i0^+)I − H_S − C_{L,S}^T G_{L,S} C_{L,S} − D_{S,R} G_{S,R} D_{S,R}^T ]^{-1},

where E is energy, a real number that may be negative, H_S ∈ R^{n_s×n_s} is the Hamiltonian for the scattering region, C_{L,S} ∈ R^{n_ℓ×n_s} and D_{S,R} ∈ R^{n_s×n_r} represent the

∗Version of July 24, 2011.
†Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada
(chguo@math.uregina.ca). The work of this author was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada.
‡Department of Applied Mathematics, National University of Kaohsiung, Kaohsiung 811, Taiwan
(yckuo@nuk.edu.tw). The work of this author was partially supported by the National Science Council in Taiwan.
§Department of Mathematics, National Taiwan University, Taipei 106, Taiwan & Center of Mathematical Modelling and Scientific Computing, National Chiao Tung University, Hsinchu 300, Taiwan (wwlin@math.nctu.edu.tw). The work of this author was partially supported by the National Science Council and the National Center for Theoretical Sciences in Taiwan.
coupling with the scattering region for the left lead and the right lead, respectively, and G_{L,S} ∈ C^{n_ℓ×n_ℓ} and G_{S,R} ∈ C^{n_r×n_r} are special solutions of the matrix equations

G_{L,S} = [ (E + i0^+)I − B_L − A_L^T G_{L,S} A_L ]^{-1}   (1.1)

and

G_{S,R} = [ (E + i0^+)I − B_R − A_R G_{S,R} A_R^T ]^{-1}   (1.2)

with A_L, B_L = B_L^T ∈ R^{n_ℓ×n_ℓ} and A_R, B_R = B_R^T ∈ R^{n_r×n_r}. Since (1.1) and (1.2) have the same type, we only need to study (1.1). So we simplify the notation n_ℓ to n.
In nano research, one is mainly interested in the values of E for which G_{L,S} in (1.1) has a nonzero imaginary part [18].

For each fixed E, we replace "0^+" in (1.1) by a sufficiently small positive number η and consider the matrix equation

X = [ (E + iη)I − B_L − A_L^T X A_L ]^{-1}.   (1.3)

It is shown in [12] that the required special solution G_{L,S} of (1.1) is given by G_{L,S} = lim_{η→0^+} G_{L,S}(η), with X = G_{L,S}(η) being the unique complex symmetric solution of (1.3) such that ρ(G_{L,S}(η) A_L) < 1, where ρ(·) denotes the spectral radius. Thus G_{L,S} is a special complex symmetric solution of X = [ E I − B_L − A_L^T X A_L ]^{-1} with ρ(G_{L,S} A_L) ≤ 1.
The question as to when G_{L,S} has a nonzero imaginary part is answered in the following result from [12], where T denotes the unit circle.

Theorem 1.1. For λ ∈ T, let the eigenvalues of ψ_L(λ) = B_L + λA_L + λ^{-1}A_L^T be μ_{L,1}(λ) ≤ · · · ≤ μ_{L,n}(λ). Let

∆_{L,i} = [ min_{|λ|=1} μ_{L,i}(λ), max_{|λ|=1} μ_{L,i}(λ) ],

and ∆_L = ∪_{i=1}^n ∆_{L,i}. Then G_{L,S} is a real symmetric matrix if E ∉ ∆_L. When E ∈ ∆_L, the quadratic pencil λ²A_L^T − λ(EI − B_L) + A_L has eigenvalues on T. If all these eigenvalues on T are simple (they must then be nonreal, as explained in the proof of Theorem 3.3), then G_{L,S} has a nonzero imaginary part.
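Theorem 1.1 suggests a simple numerical way to locate the band ∆_L: sample the Hermitian matrix ψ_L(λ) on the unit circle and take the range of each ordered eigenvalue curve. The sketch below (assuming NumPy; the 2×2 matrices A_L, B_L are illustrative choices, not data from the paper) does exactly this.

```python
import numpy as np

def delta_L(AL, BL, num=400):
    """Sample the eigenvalue curves mu_{L,i}(lambda) of
    psi_L(lambda) = BL + lambda*AL + lambda^{-1}*AL^T on the unit circle
    and return the intervals Delta_{L,i} = [min mu_i, max mu_i]."""
    n = AL.shape[0]
    mus = np.empty((num, n))
    for k, t in enumerate(np.linspace(0.0, 2*np.pi, num, endpoint=False)):
        lam = np.exp(1j*t)
        psi = BL + lam*AL + AL.T/lam       # Hermitian for |lambda| = 1
        mus[k] = np.linalg.eigvalsh(psi)   # sorted ascending
    return [(mus[:, i].min(), mus[:, i].max()) for i in range(n)]

# illustrative 2x2 data (not from the paper)
AL = np.array([[0.5, 0.2], [0.0, 0.3]])
BL = np.array([[1.0, 0.4], [0.4, -1.0]])
for lo, hi in delta_L(AL, BL):
    print(f"[{lo:.4f}, {hi:.4f}]")
```

For the scalar case A_L = 1, B_L = 0 one has ψ_L(e^{it}) = 2 cos t, so ∆_L = [−2, 2], which serves as a quick check of the routine.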
By replacing X in (1.3) with X^{-1}, we get the equation

X + A^T X^{-1} A = Q_η,   (1.4)

where A = A_L and Q_η = Q + iηI with Q = E I − B_L. So Q is a real symmetric matrix dependent on the parameter E and is usually indefinite. For η > 0, we need the stabilizing solution X of (1.4), which is the solution with ρ(X^{-1}A) < 1, and then G_{L,S}(η) = X^{-1}. The existence of the stabilizing solution was proved in [12] using advanced tools; an elementary proof has been given recently in [10]. When η = 0 and E ∈ ∆_L, it follows from Theorem 1.1 that the required solution X = G_{L,S}^{-1} of (1.4) is only weakly stabilizing, in the sense that ρ(X^{-1}A) = 1.

The size of the matrices in (1.4) can be very small or very large, depending on how the system Hamiltonian is obtained. If the Hamiltonian is obtained from layer-based models, as in [18] and [25], then the size of the matrices is just the number of principal layers. In [18] considerable attention is paid to single-layer models and the more realistic double-layer models, which correspond to n = 1 and n = 2 in (1.4). So we can say that (1.4) with n ≤ 10 is already of significant practical interest. On the other hand, if the Hamiltonian is obtained from the discretization of a differential operator, as in [1], then the size of the matrices in (1.4) can be very large if a fine mesh grid is used.
One way to approximate G_{L,S} is to take a very small η > 0 and compute G_{L,S}(η). It is proved in [12] that the sequence {X_k} from the basic fixed-point iteration (FPI)

X_{k+1} = Q_η − A^T X_k^{-1} A,   (1.5)

with X_0 = Q_η, converges to G_{L,S}(η)^{-1}. It follows that the sequence {Y_k} from the basic FPI

Y_{k+1} = (Q_η − A^T Y_k A)^{-1},   (1.6)

with Y_0 = Q_η^{-1}, converges to G_{L,S}(η). However, the convergence is very slow for E ∈ ∆_L since ρ(G_{L,S}(η)A) ≈ 1 for η close to 0. It is also shown in [12] that a doubling algorithm (DA) can be used to compute the desired solution X = G_{L,S}(η)^{-1} of the equation (1.4) efficiently for each fixed value of E. However, in practice the desired solution needs to be computed for many different E values. Since the DA is not a correction method, it cannot use the solution obtained for one E value as an initial approximation for the exact solution at a nearby E value. To compute the solutions corresponding to many E values, it may be more efficient to use a modified FPI together with the DA. Indeed, it is suggested in [25] that the following modified FPI be used to approximate G_{L,S}(η):

Y_{k+1} = (1/2) Y_k + (1/2)(Q_η − A^T Y_k A)^{-1}.   (1.7)

A variant of this FPI is given in [12] to approximate G_{L,S}(η)^{-1}:

X_{k+1} = (1/2) X_k + (1/2)(Q_η − A^T X_k^{-1} A),   (1.8)

which requires less computational work per iteration. However, the convergence analysis of these two modified FPIs has been an open problem, even for the special initial matrices Y_0 = Q_η^{-1} and X_0 = Q_η, respectively.
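The basic and modified FPIs can be sketched in a few lines of NumPy. Below is a hedged sketch (illustrative random data, not from the paper); the relaxation weight c gives the basic iteration (1.5) for c = 1 and the modified iteration (1.8) for c = 1/2.

```python
import numpy as np

def fpi(A, Q, eta, c=1.0, tol=1e-12, maxit=50000):
    """Run X_{k+1} = (1-c) X_k + c (Q_eta - A^T X_k^{-1} A) with X_0 = Q_eta.
    c = 1 is the basic FPI (1.5); c = 1/2 is the modified FPI (1.8)."""
    Qeta = Q + 1j*eta*np.eye(Q.shape[0])
    X = Qeta.copy()
    for k in range(maxit):
        Xnew = (1 - c)*X + c*(Qeta - A.T @ np.linalg.solve(X, A))
        if np.linalg.norm(Xnew - X) <= tol*np.linalg.norm(Xnew):
            return Xnew, k + 1
        X = Xnew
    return X, maxit

# illustrative data (not from the paper): Q symmetric indefinite
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Q = rng.standard_normal((4, 4)); Q = Q + Q.T
X, iters = fpi(A, Q, eta=0.1, c=0.5)
res = np.linalg.norm(X + A.T @ np.linalg.solve(X, A) - (Q + 0.1j*np.eye(4)))
print(iters, res)
```

For energies inside the band and small η the basic iteration (c = 1) typically needs far more iterations than c = 1/2, which is the point of the modification analyzed in Section 2.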
Our first contribution in this paper is a proof of convergence (to the desired solutions) of these two modified FPIs and their generalizations, for many choices of initial matrices. So these methods can be used as correction methods. In this process we will show that the unique stabilizing solution X = G_{L,S}(η)^{-1} of (1.4) is also the unique solution of (1.4) with a positive definite imaginary part. It follows that the imaginary part X_I of the matrix G_{L,S}^{-1} is positive semi-definite. Our second contribution in this paper is a determination of the rank of X_I in terms of the number of eigenvalues on T of the quadratic pencil λ²A^T − λQ + A. Our third contribution is a structure-preserving algorithm that is applied directly to (1.4) with η = 0. In doing so, we work with real arithmetic most of the time.
2. Convergence analysis of fixed-point iterations. In this section we perform convergence analysis for some FPIs, including the modified FPIs (1.7) and (1.8). The main tool we need is the following important result due to Earle and Hamilton [6]. The presentation here follows [14, Theorem 3.1] and its proof.
Theorem 2.1 (Earle–Hamilton theorem). Let D be a nonempty domain in a complex Banach space Z and let h : D → D be a bounded holomorphic function. If h(D) lies strictly inside D, then h has a unique fixed point in D. Moreover, the sequence {z_k} defined by the fixed-point iteration z_{k+1} = h(z_k) converges to this fixed point for any z_0 ∈ D.
Now let Z be the complex Banach space C^{n×n} equipped with the spectral norm. For any K ∈ C^{n×n}, its imaginary part is the Hermitian matrix

Im K = (1/2i)(K − K^*).

For any Hermitian matrices X and Y, X > Y (X ≥ Y) means that X − Y is positive definite (semi-definite). Let D_+ = {X ∈ C^{n×n} : Im X > 0} and D_− = {X ∈ C^{n×n} : Im X < 0}.
We start with a proof of convergence for the basic FPI (1.5) for many different choices of X_0, not just for X_0 = Q_η.

Theorem 2.2. For any X_0 ∈ D_+, the sequence {X_k} produced by the FPI (1.5) converges to the unique fixed point X_η in D_+.
Proof. Let D = {X ∈ C^{n×n} : Im X > (η/2)I}. For each X ∈ D, X is invertible by Bendixson's theorem (see [26] for example) and we also have ‖X^{-1}‖ < 2/η (see [2, Corollary 4] or [13, Lemma 3.1]). Now let

f(X) = Q_η − A^T X^{-1} A.

Then for X ∈ D,

Im f(X) = Im Q_η − (1/2i)[ A^T X^{-1} A − (A^T X^{-1} A)^* ] = ηI + A^T X^{-1} (Im X)(A^T X^{-1})^* ≥ ηI.

It follows that f : D → D is a bounded holomorphic function and f(D) lies strictly inside D. By the Earle–Hamilton theorem, f has a unique fixed point X_η in D and X_k converges to X_η for any X_0 ∈ D. The theorem is proved by noting that X_1 ∈ D for any X_0 ∈ D_+ and that any fixed point X_* in D_+ must be in D, by X_* = Q_η − A^T X_*^{-1} A.
Remark 2.1. Since {X_k} converges to G_{L,S}(η)^{-1} for X_0 = Q_η ∈ D_+, we know that X_η = G_{L,S}(η)^{-1} in Theorem 2.2. Thus X_η is the unique solution of the equation (1.4) such that ρ(X_η^{-1}A) < 1, and it is also the unique solution of (1.4) in D_+. If we have obtained a particular solution of (1.4) by some method and would like to know whether it is the required solution, the latter condition is easier to check.
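The check suggested in Remark 2.1 is easy to carry out numerically. The sketch below (NumPy, illustrative random data) produces a particular solution from the linearization M − λL of the quadratic pencil λ²A^T − λQ_η + A used later in Section 3, and then verifies both characterizations: ρ(X^{-1}A) < 1 and Im X > 0.

```python
import numpy as np

# Build the stabilizing solution of X + A^T X^{-1} A = Q + i*eta*I from the
# invariant subspace of the linearization, then verify both characterizations
# from Remark 2.1.  Illustrative random data, not from the paper.
rng = np.random.default_rng(1)
n, eta = 3, 0.1
A = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n)); Q = Q + Q.T
Qeta = Q + 1j*eta*np.eye(n)

M = np.block([[A, np.zeros((n, n))], [Qeta, -np.eye(n)]])
L = np.block([[np.zeros((n, n)), np.eye(n)], [A.T, np.zeros((n, n))]])
w, V = np.linalg.eig(np.linalg.solve(L, M))
inside = np.argsort(np.abs(w))[:n]          # n eigenvalues inside the unit circle
X1, X2 = V[:n, inside], V[n:, inside]
X = X2 @ np.linalg.inv(X1)                  # candidate solution X = X2 X1^{-1}

resid = np.linalg.norm(X + A.T @ np.linalg.solve(X, A) - Qeta)
rho = np.abs(np.linalg.eigvals(np.linalg.solve(X, A))).max()
im_min = np.linalg.eigvalsh((X - X.conj().T)/2j).min()
print(resid, rho, im_min)
```

Both conditions hold simultaneously, as Remark 2.1 predicts; in code, checking im_min > 0 is the cheaper test.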
The matrix G_{L,S}(η) can also be computed directly by using (1.6) for many different choices of Y_0. Note that the FPI (1.6) is Y_{k+1} = f̂(Y_k) with

f̂(Y) = (Q_η − A^T Y A)^{-1}.

Corollary 2.3. For any Y_0 ∈ D_−, the sequence {Y_k} produced by the FPI (1.6) converges to the unique fixed point Y_η = G_{L,S}(η) in D_−.

Proof. For any Y_0 ∈ D_−, Y_0 is invertible by Bendixson's theorem. We now take X_0 = Y_0^{-1} in (1.5). Then X_0 ∈ D_+ since Im X_0 = −Y_0^{-1}(Im Y_0)Y_0^{-*}. It follows that the sequence {Y_k} is well-defined and related to the sequence {X_k} from (1.5) by Y_k = X_k^{-1}. Thus {Y_k} converges to Y_η = X_η^{-1} ∈ D_−. Since X_η is the unique fixed point of f in D_+, Y_η is the unique fixed point of f̂ in D_−.
For faster convergence, we consider the modified FPI for (1.4)

X_{k+1} = (1 − c)X_k + c(Q_η − A^T X_k^{-1} A),  0 < c < 1,   (2.1)

or X_{k+1} = g(X_k) with the function g defined by

g(X) = (1 − c)X + c f(X).   (2.2)

Note that f(X) = X if and only if g(X) = X. So f and g have the same fixed points. Note also that the FPI (1.8) is a special case of the FPI (2.1) with c = 1/2. We can now prove the following general result.
Theorem 2.4. For any X_0 ∈ D_+, the FPI X_{k+1} = g(X_k) converges to the unique fixed point X_η in D_+.

Proof. For any X_0 ∈ D_+, X_1 is well defined and Im X_1 > cηI. Let b be any number such that b > ‖X_1‖ and b > 2(‖Q_η‖ + (1/(cη))‖A‖²). Let D = {X ∈ C^{n×n} : Im X > cηI, ‖X‖ < b}. So X_1 ∈ D. For each X ∈ D, X is invertible and ‖X^{-1}‖ < 1/(cη), as before. Then for X ∈ D,

Im g(X) = (1 − c) Im X + c Im f(X) > (1 − c)cηI + cηI = (2 − c)cηI,

and

‖g(X)‖ ≤ (1 − c)‖X‖ + c( ‖Q_η‖ + (1/(cη))‖A‖² ) < (1 − c)b + cb/2 = (1 − c/2)b.

It follows that g : D → D is a bounded holomorphic function and g(D) lies strictly inside D. By the Earle–Hamilton theorem, X_k converges to the unique fixed point of g in D, which must be X_η.
Similarly, we consider the modified FPI

Y_{k+1} = (1 − c)Y_k + c(Q_η − A^T Y_k A)^{-1},  0 < c < 1,   (2.3)

or Y_{k+1} = ĝ(Y_k) with the function ĝ defined by

ĝ(Y) = (1 − c)Y + c f̂(Y).   (2.4)

The FPI (2.3) includes (1.7) as a special case. Note that f̂(Y) = Y if and only if ĝ(Y) = Y. So f̂ and ĝ have the same fixed points. However, there are no simple relations between X_k from (2.1) and Y_k from (2.3).
Theorem 2.5. For any Y_0 ∈ D_−, the FPI Y_{k+1} = ĝ(Y_k) converges to the unique fixed point Y_η in D_−.

Proof. Take b > 2/η, and let D = {Y ∈ C^{n×n} : Im Y < 0, ‖Y‖ < b}. For any Y ∈ D, Im(Q_η − A^T Y A) ≥ ηI. So ĝ(Y) is well defined and ‖(Q_η − A^T Y A)^{-1}‖ ≤ 1/η. Thus

‖ĝ(Y)‖ < (1 − c)b + c/η < (1 − c/2)b.

Moreover,

Im(ĝ(Y)) < c Im (Q_η − A^T Y A)^{-1}
= −c (Q_η − A^T Y A)^{-1} Im(Q_η − A^T Y A) (Q_η − A^T Y A)^{-*}
≤ −cη [ (Q_η − A^T Y A)^*(Q_η − A^T Y A) ]^{-1}
≤ −cη ‖Q_η − A^T Y A‖^{-2} I
≤ −( cη / (‖Q_η‖ + b‖A‖²)² ) I.

It follows that ĝ : D → D is a bounded holomorphic function and ĝ(D) lies strictly inside D. By the Earle–Hamilton theorem, Y_k converges to Y_η for any Y_0 ∈ D, and hence for any Y_0 ∈ D_− since we can take b > ‖Y_0‖.
We remark that the modified FPI (2.1) is slightly less expensive than the modified FPI (2.3) for each iteration. These two methods make improvements over the basic FPIs (1.5) and (1.6) in the same way, as explained below.
The rate of convergence of each FPI can be determined by computing the Fréchet derivative of the fixed-point mapping, as in [11]. For (1.5) and (1.6), we have

lim sup_{k→∞} ‖X_k − X_η‖^{1/k} ≤ (ρ(X_η^{-1}A))²,  lim sup_{k→∞} ‖Y_k − Y_η‖^{1/k} ≤ (ρ(Y_η A))²,

where equality typically holds. Recall that Y_η = X_η^{-1}. Note also that if Y_0 = X_0^{-1} (with X_0 ∈ D_+) then

lim sup_{k→∞} ‖Y_k − Y_η‖^{1/k} = lim sup_{k→∞} ‖X_k − X_η‖^{1/k}.

The Fréchet derivative at X_η of the function g in (2.2) is given by

g′(X_η)E = (1 − c)E + c A^T X_η^{-1} E X_η^{-1} A.

It follows that for FPI (2.1)

lim sup_{k→∞} ‖X_k − X_η‖^{1/k} ≤ ρ( (1 − c)I + c (A^T X_η^{-1}) ⊗ (A^T X_η^{-1}) ).

Similarly, the Fréchet derivative at Y_η of the function ĝ in (2.4) is given by

ĝ′(Y_η)E = (1 − c)E + c Y_η A^T E A Y_η.

It follows that for FPI (2.3)

lim sup_{k→∞} ‖Y_k − Y_η‖^{1/k} ≤ ρ( (1 − c)I + c (Y_η A^T) ⊗ (Y_η A^T) ).
The rate of convergence of both modified FPIs is then determined by

r_η = max_{i,j} | 1 − c + c λ_i(X_η^{-1}A) λ_j(X_η^{-1}A) |.

The convergence of the modified FPIs is often much faster because r_η may be much smaller than 1 for a proper choice of c. An extreme example is the following.

Example 2.1. Consider the scalar equation (1.4) with A = 1 and Q_η = ηi. It is easy to find that

X_η = (1/2)(η + √(4 + η²)) i.

Thus ρ(X_η^{-1}A) → 1 as η → 0^+, while for c = 1/2 we have r_η → 0 as η → 0^+.
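Example 2.1 can be verified directly; the following NumPy sketch checks that the stated X_η solves the scalar equation exactly and tracks ρ(X_η^{-1}A) and r_η as η decreases.

```python
import numpy as np

# Scalar Example 2.1: A = 1, Q_eta = i*eta.  The stabilizing solution is
# X_eta = (eta + sqrt(4 + eta^2)) * i / 2; with c = 1/2 the modified-FPI
# rate r_eta = |1 - c + c*lam^2|, lam = 1/X_eta, tends to 0 as eta -> 0+.
for eta in (1.0, 0.1, 0.001):
    X = 0.5*(eta + np.sqrt(4 + eta**2))*1j
    assert abs(X + 1/X - 1j*eta) < 1e-12    # solves x + 1/x = i*eta
    lam = 1/X                                # eigenvalue of X_eta^{-1} A
    rho = abs(lam)                           # spectral radius -> 1
    r = abs(1 - 0.5 + 0.5*lam*lam)           # r_eta with c = 1/2 -> 0
    print(eta, rho, r)
```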
Note that for i, j = 1, . . . , n, the n² numbers λ_i(X_η^{-1}A) λ_j(X_η^{-1}A) are inside T for each η > 0. In the limit η → 0^+, at least one of them is on T if E ∈ ∆_L. So a proper choice of the parameter c is important. Note that c = 1 is allowed in (2.1) and (2.3). In this case, the basic FPIs (1.5) and (1.6) are recovered.

To get some insight, we first consider choosing c ∈ (0, 1] such that for fixed (r, θ)

p(c) = |1 − c + c r e^{iθ}|

is minimized. If (r, θ) = (1, 0), then p(c) = 1 for all c ∈ (0, 1]. So assume (r, θ) ≠ (1, 0). In this case,

p(c) = |1 − r e^{iθ}| · | 1/(1 − r e^{iθ}) − c |

is minimized on (0, 1] when

c = min{ 1, Re( 1/(1 − r e^{iθ}) ) } = min{ 1, (1 − r cos θ)/(1 + r² − 2r cos θ) } ≥ 1/2,

where we have used 1 − r cos θ − (1/2)(1 + r² − 2r cos θ) = (1/2)(1 − r²) ≥ 0. It follows that c = 1 is the best choice when (1 − r cos θ)/(1 + r² − 2r cos θ) ≥ 1 or, in other words, when z = r e^{iθ} is in the disk {z ∈ C : |z − 1/2| ≤ 1/2}. Note that p(1) = r. It also follows that c = 1/2 is the best choice when r = 1. Note that p(1/2) = (1/2)√(1 + r² + 2r cos θ) = (1/2)√(2(1 + cos θ)) for r = 1.
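The closed form for the minimizer of p(c) can be checked against a brute-force grid search (NumPy sketch; the (r, θ) pairs are arbitrary test values):

```python
import numpy as np

# Check the closed form for the minimizer of p(c) = |1 - c + c*r*e^{i*theta}|
# on (0, 1] against a brute-force search over a fine grid of c values.
cs = np.linspace(1e-4, 1.0, 20001)
for r, theta in [(0.9, 2.0), (1.0, 2.5), (0.3, 0.4), (0.8, 3.0)]:
    z = r*np.exp(1j*theta)
    p = np.abs(1 - cs + cs*z)
    c_grid = cs[np.argmin(p)]
    c_formula = min(1.0, (1 - r*np.cos(theta))/(1 + r*r - 2*r*np.cos(theta)))
    assert abs(c_grid - c_formula) < 1e-3
    assert c_formula >= 0.5 - 1e-12   # the minimizer never falls below 1/2 (r <= 1)
    print(r, theta, c_formula)
```

The third pair lies in the disk |z − 1/2| ≤ 1/2, and the formula returns c = 1 there, recovering the basic iteration as the best choice.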
We know from [12] that the eigenvalues of X_η^{-1}A are precisely the n eigenvalues inside T of the quadratic pencil λ²A^T − λQ_η + A. We also know from Theorem 1.1 that the quadratic pencil P(λ) = λ²A^T − λQ + A has some eigenvalues on T when E ∈ ∆_L. We can then make the following conclusions.

Remark 2.2. If P(λ) has some eigenvalues near 1 or −1, the convergence of the FPI (2.1) is expected to be very slow for any choice of the parameter c. The DA is recommended for this case. If all eigenvalues of P(λ) are clustered around ±i, then the FPI (2.1) with c = 1/2 is expected to be very efficient. In the general case, the optimal c is somewhere between 1/2 and 1. If all eigenvalues of X_η^{-1}A are available, we can determine the optimal c to minimize r_η using the bisection procedure in [9, Section 6], with [1/2, 1] as the initial interval. In practice we would not compute these eigenvalues for every E value. But we may use the DA to compute X_η for one E value, determine the optimal c value for this E, and use it as a suboptimal c for many nearby E values. If one does not want to compute any eigenvalues when using the FPI (2.1), then c = 1/2 is recommended, since this c value is best at handling the eigenvalues of X_η^{-1}A that are extremely close to T, which are the eigenvalues responsible for the extremely slow convergence of the basic FPIs (1.5) and (1.6).

We note that the approximate solution from the DA or the FPI for any E value is in D_+, and can be used as an initial guess for the exact solution when using the FPI for other E values, with guaranteed convergence. However, even for small-size problems the convergence of the FPI (2.1), with c = 1/2 for example, can still be very slow when P(λ) has some eigenvalues near 1 or −1. Moreover, the latter situation will happen for some energy values since P(λ) is palindromic and thus, as E varies, eigenvalues of P(λ) leave or enter the unit circle typically through the points ±1. When n is large, there may be some eigenvalues of P(λ) near ±1 for almost all energy values of interest, and thus the convergence of the FPI may be very slow and other methods should be used.
3. Rank of Im(G_{L,S}). The equation (1.4) has a unique stabilizing solution X_η = G_{L,S}(η)^{-1} for any η > 0. So

X_η + A^T X_η^{-1} A = Q_η   (3.1)

with ρ(X_η^{-1}A) < 1. We also know that X_η is complex symmetric. Write X_η = X_{η,R} + iX_{η,I} with X_{η,R}^T = X_{η,R}, X_{η,I}^T = X_{η,I} ∈ R^{n×n}. We know from the previous section that Im(X_η) = X_{η,I} > 0. Let

ϕ_η(λ) = λA^T + λ^{-1}A − Q_η.   (3.2)

By (3.1) the rational matrix-valued function ϕ_η(λ) has the factorization

ϕ_η(λ) = (λ^{-1}I − S_η^T) X_η (−λI + S_η),

where S_η = X_η^{-1}A. Let X = lim_{η→0^+} X_η = G_{L,S}^{-1}. Then

X + A^T X^{-1} A = Q   (3.3)

with ρ(X^{-1}A) ≤ 1 and Im(X) ≥ 0. Note that ϕ_0(λ) = λA^T + λ^{-1}A − Q has the factorization

ϕ_0(λ) = (λ^{-1}I − S^T) X (−λI + S),   (3.4)

where S = X^{-1}A. In particular, ϕ_0(λ) is regular, i.e., its determinant is not identically zero. In this section we will determine the rank of Im(X), which is the same as the rank of Im(G_{L,S}) since Im(G_{L,S}) = Im(X^{-1}) = −X^{-1} Im(X) X^{-*}.
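The identity Im(X^{-1}) = −X^{-1} Im(X) X^{-*}, which transfers the rank statement from X to G_{L,S}, holds for any invertible X and can be confirmed numerically (NumPy sketch, random complex data):

```python
import numpy as np

# Numerical check of Im(X^{-1}) = -X^{-1} Im(X) X^{-*} for a generic
# invertible complex matrix X; random illustrative data.
rng = np.random.default_rng(2)
X = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
Xinv = np.linalg.inv(X)
im = lambda W: (W - W.conj().T)/2j            # Hermitian imaginary part
lhs = im(Xinv)
rhs = -Xinv @ im(X) @ Xinv.conj().T
print(np.linalg.norm(lhs - rhs))
```

Since the two sides are congruent up to sign, the identity also shows that rank and (negated) inertia of the imaginary part are preserved under inversion.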
Let

M = [ A 0 ; Q −I ],  L = [ 0 I ; A^T 0 ].   (3.5)

Then the pencil M − λL, also denoted by (M, L), is a linearization of the quadratic matrix polynomial

P(λ) = λϕ_0(λ) = λ²A^T − λQ + A.   (3.6)

It is easy to check that y and z are the right and left eigenvectors, respectively, corresponding to an eigenvalue λ of P(λ) if and only if

[ y ; Qy − λA^T y ],  [ z ; −λ̄ z ]   (3.7)

are the right and left eigenvectors of (M, L), respectively.
Theorem 3.1. Suppose that λ_0 is a semi-simple eigenvalue of ϕ_0(λ) on T with multiplicity m_0 and Y ∈ C^{n×m_0} forms an orthonormal basis of right eigenvectors corresponding to λ_0. Then iY^*(2λ_0A^T − Q)Y is a nonsingular Hermitian matrix. Let d_j, j = 1, . . . , ℓ, be the distinct eigenvalues of iY^*(2λ_0A^T − Q)Y with multiplicities m_0^j, and let ξ_j ∈ C^{m_0×m_0^j} form an orthonormal basis of the eigenspace corresponding to d_j. Then for η > 0 sufficiently small and j = 1, . . . , ℓ,

λ_{j,η}^{(k)} = λ_0 − (λ_0/d_j) η + O(η²), k = 1, . . . , m_0^j, and y_{j,η} = Y ξ_j + O(η)   (3.8)

are perturbed eigenvalues and a basis of the corresponding invariant subspace of ϕ_η(λ), respectively.
Proof. Since P (λ0)Y = λ0ϕ0(λ0)Y = 0 with Y∗Y = Im0 and |λ0| = 1, we have 0∗= (P (λ0)Y )∗= 1 λ2 0 Y∗(λ20A>− λ0Q + A).
It follows that Y forms an orthonormal basis for left eigenvectors of P (λ) correspond-ing to λ0. From (3.7), we obtain that the column vectors of
YR= Y QY − λ0A>Y and YL= Y −λ0Y
form a basis of left and right eigenspaces of M − λL corresponding to λ0, respectively.
Since λ0is semi-simple, the matrix
[Y∗, −λ0Y∗]L
Y QY − λ0A>Y
= −Y∗(2λ0A>− Q)Y = −Y∗P0(λ0)Y
is nonsingular. Let e YR= −YR(Y∗P0(λ0)Y )−1, YeL= YL. Then we have e YL∗L eYR= Im0 and eY ∗ LM eYR= λ0Im0. (3.9)
For η > 0 sufficiently small, we consider the perturbed equation of P(λ):

P(λ) − λiηI = λ²A^T − λ(Q + iηI) + A = λϕ_η(λ).

Let M_η = [ A 0 ; Q + iηI −I ]. Then M_η − λL is a linearization of λϕ_η(λ). By (3.9) and [24, Chapter VI, Theorem 2.12] there are Ŷ_R and Ŷ_L such that [Ỹ_R, Ŷ_R] and [Ỹ_L, Ŷ_L] are nonsingular and

[ Ỹ_L^* ; Ŷ_L^* ] M [Ỹ_R, Ŷ_R] = [ λ_0I_{m_0} 0 ; 0 M̂ ],  [ Ỹ_L^* ; Ŷ_L^* ] L [Ỹ_R, Ŷ_R] = [ I_{m_0} 0 ; 0 L̂ ].

Then, by [24, Chapter VI, Theorem 2.15] there exist matrices ∆_1(η) = O(η) and ∆_2(η) = O(η²) such that the column vectors of Ỹ_R + ∆_1(η) span the right eigenspace of (M_η, L) corresponding to (λ_0I_{m_0} + ηE_{11} + ∆_2(η), I_{m_0}), where

E_{11} = Ỹ_L^* [ 0 0 ; iI 0 ] Ỹ_R = λ_0 Y^*(iI)Y (Y^* P′(λ_0)Y)^{-1} = −λ_0 ( iY^*(2λ_0A^T − Q)Y )^{-1}.   (3.10)
The matrix iY^*(2λ_0A^T − Q)Y in (3.10) is Hermitian since

iY^*(2λ_0A^T − Q)Y = iY^*ϕ_0(λ_0)Y + iλ_0Y^*A^TY − iλ̄_0Y^*AY = iλ_0Y^*A^TY − iλ̄_0Y^*AY,

where we used ϕ_0(λ_0)Y = 0, and the last expression equals its own conjugate transpose.
Let d_j for j = 1, . . . , ℓ be the distinct eigenvalues of iY^*(2λ_0A^T − Q)Y with multiplicities m_0^j, and let ξ_j ∈ C^{m_0×m_0^j} form an orthonormal basis of the eigenspace corresponding to d_j. Then we have

Φ^* E_{11} Φ = diag( −(λ_0/d_1) I_{m_0^1}, . . . , −(λ_0/d_ℓ) I_{m_0^ℓ} ),

where Φ = [ξ_1, . . . , ξ_ℓ] ∈ C^{m_0×m_0}. It follows that λ_0I_{m_0} + ηE_{11} + ∆_2(η) is similar to

λ_0I_{m_0} + diag( −(λ_0/d_1) ηI_{m_0^1}, . . . , −(λ_0/d_ℓ) ηI_{m_0^ℓ} ) + ∆_3(η)

for some ∆_3(η) = O(η²). Then for each j ∈ {1, 2, . . . , ℓ}, the perturbed eigenvalues λ_{j,η}^{(k)}, k = 1, . . . , m_0^j, and a basis of the corresponding invariant subspace of M_η − λL with λ_{j,η}^{(k)}|_{η=0} = λ_0 can be expressed as

λ_{j,η}^{(k)} = λ_0 − (λ_0/d_j) η + O(η²), k = 1, . . . , m_0^j,   (3.12a)

and

ζ_{j,η} = Y_R ξ_j + O(η).   (3.12b)
The second equation in (3.8) follows from (3.12b).

Lemma 3.2. Let Z_η be the solution of the equation

Z_η − R_η^* Z_η R_η = ηW_η, for η > 0,   (3.13)

where W_η ∈ C^{m×m} is positive definite, R_η = e^{iθ}I_m + ηE_η with θ ∈ [0, 2π] fixed, and E_η ∈ C^{m×m} is uniformly bounded such that ρ(R_η) < 1. Then Z_η is positive definite. Furthermore, if Z_η converges to Z_0 and W_η converges to a positive definite matrix W_0 as η → 0^+, then Z_0 is also positive definite.

Proof. Since ρ(R_η) < 1 and ηW_η is positive definite, it is well known that Z_η is uniquely determined by (3.13) and is positive definite. Since E_η is bounded, we have from (3.13) that

ηW_η = Z_η − (e^{−iθ}I_m + ηE_η^*) Z_η (e^{iθ}I_m + ηE_η)
= −ηe^{iθ}E_η^* Z_η − ηe^{−iθ}Z_η E_η + O(η²).

This implies that

W_η = −e^{iθ}E_η^* Z_η − e^{−iθ}Z_η E_η + O(η).   (3.14)

If Z_η converges to Z_0 as η → 0^+, then Z_0 is positive semi-definite. To prove that Z_0 is positive definite, it suffices to show that Z_0 is nonsingular. Suppose that x ∈ C^m is such that Z_0x = 0. Then we have Z_ηx → 0 and x^*Z_η → 0 as η → 0^+. Multiplying (3.14) by x^* and x from the left and right, respectively, we have

x^*W_ηx = −e^{iθ}x^*E_η^*Z_ηx − e^{−iθ}x^*Z_ηE_ηx + O(η) → 0, as η → 0^+.

Thus x = 0 because W_η converges to W_0 and W_0 is positive definite. It follows that Z_0 is positive definite.
Theorem 3.3. The number of eigenvalues (counting multiplicities) of ϕ_0(λ) on T must be even, say 2m. Let X = lim_{η→0^+} X_η be invertible and write X = X_R + iX_I with X_R, X_I ∈ R^{n×n}. Then
(i) rank(X_I) ≤ m;
(ii) rank(X_I) = m, if all eigenvalues of ϕ_0(λ) on T are semi-simple and ‖S_η − S‖₂ = O(η) for η > 0 sufficiently small, where S_η = X_η^{-1}A and S = X^{-1}A;
(iii) rank(X_I) = m, if all eigenvalues of ϕ_0(λ) on T are semi-simple and each unimodular eigenvalue of multiplicity m_j is perturbed to m_j eigenvalues (of ϕ_η(λ)) inside the unit circle or to m_j eigenvalues outside the unit circle.
Proof. Consider the real quadratic pencil P(λ) = λϕ_0(λ) = λ²A^T − λQ + A. So P(λ) and ϕ_0(λ) have the same eigenvalues on T. If λ_0 ≠ ±1 is an eigenvalue of P(λ) on T with multiplicity m_0, then so is λ̄_0. Thus the total number of nonreal eigenvalues of P(λ) on T must be even. Now the quadratic pencil P_η(λ) = λ²A^T − λ(Q + iηI) + A is T-palindromic, and it has no eigenvalues on T for any η ≠ 0 [12]. If 1 (or −1) is an eigenvalue of P(λ) with multiplicity r and Q in P(λ) is perturbed to Q + iηI, then half of these r eigenvalues are perturbed to the inside of T and the other half are perturbed to the outside of T. This means that r must be even. Thus the total number of eigenvalues of ϕ_0(λ) on T is also even and is denoted by 2m.

(i) By X_η + A^T X_η^{-1} A = Q_η we have

i(Q_η^* − Q_η) = i(X_η^* − X_η) − iA^T(X_η^{-1} − X_η^{−*})A
= i(X_η^* − X_η) − (X_η^{-1}A)^* i(X_η^* − X_η)(X_η^{-1}A).

Thus

K_η − S_η^* K_η S_η = 2ηI,   (3.15)

where K_η = i(X_η^* − X_η) = 2X_{η,I}. Note that the eigenvalues of S_η = X_η^{-1}A are the eigenvalues of P_η(λ) inside T. Since X = lim_{η→0^+} X_η is invertible, we have
S = X^{-1}A = lim_{η→0^+} S_η. Let

S = V_0 [ R_{0,1} 0 ; 0 R_{0,2} ] V_0^{-1}   (3.16)

be a spectral resolution of S, where R_{0,1} ∈ C^{m×m} and R_{0,2} ∈ C^{(n−m)×(n−m)} are upper triangular with σ(R_{0,1}) ⊆ T and σ(R_{0,2}) ⊆ D ≡ {λ ∈ C : |λ| < 1}, and V_0 = [V_{0,1}, V_{0,2}] with V_{0,1} ∈ C^{n×m} and V_{0,2} ∈ C^{n×(n−m)} having unit column vectors. It follows from [24, Chapter V, Theorem 2.8] that there is a nonsingular matrix V_η = [V_{η,1}, V_{η,2}] with V_{η,1} ∈ C^{n×m} and V_{η,2} ∈ C^{n×(n−m)} such that

S_η = V_η [ R_{η,1} 0 ; 0 R_{η,2} ] V_η^{-1},   (3.17)

and R_{η,1} → R_{0,1}, R_{η,2} → R_{0,2}, and V_η → V_0, as η → 0^+.

From (3.15) and (3.17) we have

V_η^* K_η V_η − [ R_{η,1}^* 0 ; 0 R_{η,2}^* ] V_η^* K_η V_η [ R_{η,1} 0 ; 0 R_{η,2} ] = 2η V_η^* V_η.   (3.18)

Let

H_η = V_η^* K_η V_η = [ H_{η,1} H_{η,3} ; H_{η,3}^* H_{η,2} ],  V_η^* V_η = [ W_{η,1} W_{η,3} ; W_{η,3}^* W_{η,2} ].   (3.19)

Then (3.18) becomes

H_{η,1} − R_{η,1}^* H_{η,1} R_{η,1} = 2η W_{η,1},   (3.20a)
H_{η,2} − R_{η,2}^* H_{η,2} R_{η,2} = 2η W_{η,2},   (3.20b)
H_{η,3} − R_{η,1}^* H_{η,3} R_{η,2} = 2η W_{η,3}.   (3.20c)

As η → 0^+, R_{η,1} → R_{0,1} with ρ(R_{0,1}) = 1, R_{η,2} → R_{0,2} with ρ(R_{0,2}) < 1, and W_{η,2} and W_{η,3} are bounded. So we have H_{η,2} → 0 from (3.20b) and H_{η,3} → 0 from (3.20c). It follows from (3.19) that K_η = 2X_{η,I} converges to K_0 = 2X_I with rank(X_I) ≤ m.
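The Stein equation (3.15) pins down the imaginary part of X_η once S_η is known. The sketch below (NumPy, illustrative random data) solves (3.15) by vectorization, using the identity vec(S^* K S) = (S^T ⊗ S^*) vec(K) in column-major vec convention, and compares the result with K_η = 2X_{η,I} obtained from the basic FPI.

```python
import numpy as np

# Solve the Stein equation (3.15), K - S^* K S = 2*eta*I, by vectorization,
# then compare with K_eta = 2*X_{eta,I} from the FPI solution X_eta.
rng = np.random.default_rng(3)
n, eta = 3, 0.1
A = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n)); Q = Q + Q.T
Qeta = Q + 1j*eta*np.eye(n)
X = Qeta.copy()
for _ in range(20000):                        # basic FPI (1.5)
    X = Qeta - A.T @ np.linalg.solve(X, A)
S = np.linalg.solve(X, A)                     # S_eta = X_eta^{-1} A, rho(S) < 1
T = np.eye(n*n) - np.kron(S.T, S.conj().T)    # I - S^T kron S^*
K = np.linalg.solve(T, (2*eta*np.eye(n)).reshape(-1, order='F'))
K = K.reshape((n, n), order='F')
print(np.linalg.norm(K - (X - X.conj().T)/1j))
```

As the theory predicts, the Stein solution is Hermitian positive definite and coincides with twice the imaginary part of X_η.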
(ii) Suppose that the eigenvalues of ϕ_0(λ) on T are semi-simple and ‖S_η − S‖₂ = O(η) for η > 0 sufficiently small. We will show that H_{η,1} in (3.20a) converges to H_{0,1} with rank(H_{0,1}) = m. Let λ_1, . . . , λ_r ∈ T be the distinct semi-simple eigenvalues of S with multiplicities m_1, . . . , m_r, respectively. Then (3.16) can be written as

S = V_0 [ D_{0,1} 0 ; 0 R_{0,2} ] V_0^{-1},

where D_{0,1} = diag{λ_1I_{m_1}, · · · , λ_rI_{m_r}}, V_0 = [V_{0,λ_1}, · · · , V_{0,λ_r}, V_{0,2}] and ∑_{i=1}^r m_i = m. Now S_η = S + (S_η − S) with ‖S_η − S‖₂ = O(η). By repeated application of [24, Chapter V, Theorem 2.8] there is a nonsingular matrix V_η = [V_{η,λ_1}, · · · , V_{η,λ_r}, V_{η,2}] ∈ C^{n×n} such that

S_η = V_η [ D_{0,1} + ηE_{η,1} 0 ; 0 R_{0,2} + ηE_{η,2} ] V_η^{-1}

and V_η → V_0 as η → 0^+, where E_{η,1} = diag{E^1_{m_1,η}, · · · , E^1_{m_r,η}} with E^1_{m_j,η} ∈ C^{m_j×m_j} and E_{η,2} ∈ C^{(n−m)×(n−m)} are such that ‖E^1_{m_j,η}‖₂ = O(1) for j = 1, . . . , r and ‖E_{η,2}‖₂ = O(1).

The equation (3.20a) can then be written as

H_{η,1} − (D_{0,1} + ηE_{η,1})^* H_{η,1} (D_{0,1} + ηE_{η,1}) = 2η W_{η,1}.   (3.21)

Since D_{0,1} + ηE_{η,1} is a block diagonal matrix and all eigenvalues of its jth diagonal block converge to λ_j, with the λ_j's distinct numbers on T, we have

H_{η,1} = diag{H^1_{m_1,η}, · · · , H^1_{m_r,η}} + O(η),

where diag{H^1_{m_1,η}, · · · , H^1_{m_r,η}} is the block diagonal of H_{η,1}. Then (3.21) gives

H^1_{m_j,η} − (λ_jI_{m_j} + ηE^1_{m_j,η})^* H^1_{m_j,η} (λ_jI_{m_j} + ηE^1_{m_j,η}) = 2η W^1_{m_j,η}, for j = 1, . . . , r,

where W^1_{m_j,η} is the jth diagonal block of W_{η,1}. Since W_{η,1} is positive definite and converges to a positive definite matrix, the W^1_{m_j,η}, j = 1, . . . , r, are also positive definite and converge to positive definite matrices. For η > 0, we have ρ(λ_jI_{m_j} + ηE^1_{m_j,η}) < 1 for j = 1, . . . , r since ρ(S_η) < 1. By the assumption that X_η converges to X, we have that H^1_{m_j,η} converges to H^1_{m_j,0} for j = 1, . . . , r. From Lemma 3.2, we obtain that H^1_{m_j,0} is positive definite for j = 1, . . . , r. Hence, H_{η,1} converges to H_{0,1} with rank(H_{0,1}) = m. It follows from (3.19) that K_η = 2X_{η,I} converges to K_0 = 2X_I with rank(X_I) = m.
(iii) It suffices to show that ‖S_η − S‖₂ = O(η) for η > 0 sufficiently small. Since X is a solution of X + A^T X^{-1} A = Q, we have

M [ I ; X ] = L [ I ; X ] S,

where the pencil (M, L) is defined in (3.5). Under the condition in (iii), the column space of [ I ; X ] is a simple eigenspace of (M, L), in the terminology of [24]. It follows from [24, Chapter VI, Theorems 2.12 and 2.15] that

[ A 0 ; Q + iηI −I ] [ I + ηF_{η,1} ; X + ηF_{η,2} ] = [ 0 I ; A^T 0 ] [ I + ηF_{η,1} ; X + ηF_{η,2} ] (S + ηE_η),

where F_{η,1}, F_{η,2}, E_η ∈ C^{n×n} with max{‖F_{η,1}‖₂, ‖F_{η,2}‖₂, ‖E_η‖₂} ≤ c for η > 0 sufficiently small and some constant c > 0. It is easily seen that

X_η = (X + ηF_{η,2})(I + ηF_{η,1})^{-1},
S_η = X_η^{-1}A = (I + ηF_{η,1})(S + ηE_η)(I + ηF_{η,1})^{-1}.

It follows that ‖S_η − S‖₂ = O(η) for η > 0 sufficiently small.
Remark 3.1. Without the additional conditions in Theorem 3.3 (ii) or (iii), rank(X_I) could be much smaller than m. Consider the example with A = I_n and Q = 2I_n. Then ϕ_0(λ) has all 2n eigenvalues at 1, with partial multiplicities 2. So m = n, but it is easy to see that rank(X_I) = 0. For this example, we have ‖S_η − S‖₂ = O(η^{1/2}) for η > 0 sufficiently small. We also know that the 2n eigenvalues of ϕ_0(λ) at 1 are perturbed to n eigenvalues inside the unit circle and n eigenvalues outside the unit circle.
Corollary 3.4. If ϕ_0(λ) has no eigenvalues on T, then X is real symmetric. Furthermore, In(X) = In(−ϕ_0(1)). Here In(W) denotes the inertia of a matrix W.

Proof. From Theorem 3.3, it is easy to see that X is a real symmetric matrix. Since X is real, S = X^{-1}A is a real matrix. By setting λ = 1 in (3.4) we get ϕ_0(1) = −(I − S^T)X(I − S). Hence, In(X) = In(−ϕ_0(1)).
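Corollary 3.4 and the factorization (3.4) behind it can be illustrated numerically. In the sketch below (NumPy; the diagonal data are an illustrative choice with ‖A‖ small against Q, so that ϕ_0 has no unimodular eigenvalues), the real basic FPI converges quickly and the inertia identity can be read off.

```python
import numpy as np

# Numerical check of Corollary 3.4 on a problem whose pencil phi_0 has no
# unimodular eigenvalues (illustrative diagonal data, |A| small against Q).
A = 0.2*np.eye(3)
Q = np.diag([2.0, 3.0, -2.0])
X = Q.copy()
for _ in range(200):                 # real basic FPI for X + A^T X^{-1} A = Q
    X = Q - A.T @ np.linalg.solve(X, A)
S = np.linalg.solve(X, A)
assert np.abs(np.linalg.eigvals(S)).max() < 1            # stabilizing, X real
phi1 = A.T + A - Q                                        # phi_0(1)
# factorization (3.4) at lambda = 1:  phi_0(1) = -(I - S^T) X (I - S)
assert np.linalg.norm(phi1 + (np.eye(3) - S.T) @ X @ (np.eye(3) - S)) < 1e-10
inertia = lambda W: tuple(np.sign(np.linalg.eigvalsh(W)).astype(int))
print(inertia(X), inertia(-phi1))
```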
Corollary 3.5. If all eigenvalues of ϕ_0(λ) are on T and are simple, then X_I is positive definite.

Proof. This follows immediately from Theorem 3.3 (iii).
4. A structure-preserving algorithm. As explained in [12] and also in this paper, the required solution X = G_{L,S}^{-1} is a particular weakly stabilizing solution of (3.3) and is given by X = lim_{η→0^+} X_η, where X_η is the unique stabilizing solution of (1.4). We will call this particular solution the weakly stabilizing solution of (3.3). It can be approximated by X_η for a small η. For a fixed η > 0, X_η can be computed efficiently by the doubling algorithm studied in [12], for all energy values.

In this section we will develop a structure-preserving algorithm that can in most cases find the weakly stabilizing solution of (3.3) more efficiently and more accurately than the doubling algorithm, by working on the equation (3.3) directly.
Consider the pencil (M, L) given by (3.5). The simple relation

M [ I ; X ] = L [ I ; X ] X^{-1}A

shows that the weakly stabilizing solution of (3.3) is obtained by X = X_2X_1^{-1}, where [ X_1 ; X_2 ] forms (or more precisely, the columns of [ X_1 ; X_2 ] form) a basis for the invariant subspace of (M, L) corresponding to its eigenvalues inside T and its eigenvalues on T that would be perturbed to the inside of T when Q is replaced by Q_η with η > 0.

We now assume that all unimodular eigenvalues λ ≠ ±1 of (M, L) are semi-simple and the eigenvalues ±1 (if they exist) have partial multiplicities 2. This assumption seems to hold generically. Under this assumption, for computing the weakly stabilizing solution we need to include all linearly independent eigenvectors associated with the eigenvalues ±1, and use Theorem 3.1 to determine which half of the unimodular eigenvalues λ ≠ ±1 should be included to compute the required invariant subspace.

We may use the QZ algorithm to determine this invariant subspace, but it would be better to exploit the structure of the pencil (M, L). We will use the same approach as in [15] to develop a structure-preserving algorithm (SA) to find a basis for the desired invariant subspace of (M, L) and then compute the weakly stabilizing solution of (3.3). The algorithm is still based on the (S + S^{-1})-transform in [19] and Patel's algorithm in [23], but some new issues need to be addressed here.
It is well known that (M, L) is a symplectic pair, i.e., (M, L) satisfies M J M^⊤ = L J L^⊤, where
\[
J = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix}.
\]
Furthermore, the eigenvalues of (M, L) form reciprocal pairs (ν, 1/ν), where we allow ν = 0, ∞. We define the (S + S^{-1})-transform [19] of (M, L) by
\[
K := M J L^\top J + L J M^\top J = \begin{bmatrix} Q & A - A^\top \\ A^\top - A & Q \end{bmatrix}, \tag{4.1a}
\]
\[
N := L J L^\top J = \begin{bmatrix} A & 0 \\ 0 & A^\top \end{bmatrix}. \tag{4.1b}
\]
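The block forms in (4.1) and the structural identities around them are easy to confirm numerically. The following Python/NumPy sketch (standing in for the paper's MATLAB) assumes the linearization M = [A, 0; Q, −I], L = [0, I; A^⊤, 0] used in [12] for (3.5), which is not reproduced in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Q0 = rng.standard_normal((n, n))
Q = (Q0 + Q0.T) / 2                      # real symmetric (possibly indefinite)

Zn = np.zeros((n, n)); In = np.eye(n)
M = np.block([[A, Zn], [Q, -In]])        # assumed linearization from [12]
L = np.block([[Zn, In], [A.T, Zn]])
J = np.block([[Zn, In], [-In, Zn]])

# symplectic pair: M J M^T = L J L^T
assert np.allclose(M @ J @ M.T, L @ J @ L.T)

# (S + S^{-1})-transform (4.1)
K = M @ J @ L.T @ J + L @ J @ M.T @ J
N = L @ J @ L.T @ J
assert np.allclose(K, np.block([[Q, A - A.T], [A.T - A, Q]]))
assert np.allclose(N, np.block([[A, Zn], [Zn, A.T]]))

# K and N are skew-Hamiltonian: K J = J K^T, N J = J N^T
assert np.allclose(K @ J, J @ K.T)
assert np.allclose(N @ J, J @ N.T)
```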
Then K and N are both skew-Hamiltonian, i.e., K J = J K^⊤ and N J = J N^⊤. The relationship between the eigenvalues of (M, L) and (K, N), and their Kronecker structures, has been studied in [19, Theorem 3.2]. We will first extend that result to allow unimodular eigenvalues for (M, L). The following preliminary result is needed.
Lemma 4.1. Let N_r(λ) := λI_r + N_r, where N_r is the nilpotent matrix with N_r(i, i+1) = 1, i = 1, ..., r−1, and zeros elsewhere. Let \overset{\mathrm{eq}}{\sim} denote equivalence between two matrix pairs (two matrix pairs (Y_1, Y_2) and (Z_1, Z_2) are called equivalent if there are nonsingular matrices U and V such that U Y_1 V = Z_1 and U Y_2 V = Z_2). Then
(i) for λ ≠ 0, ±1, (N_r(λ) + N_r(λ)^{-1}, I_r) \overset{\mathrm{eq}}{\sim} (N_r(λ + 1/λ), I_r);
(ii) (N_r^2 + I_r, N_r) \overset{\mathrm{eq}}{\sim} (I_r, N_r).
Proof. (i) Since λ ≠ 0, one can show that N_r(λ)^{-1} ≡ [t_{j−i}] and N_r(λ) + N_r(λ)^{-1} ≡ [s_{j−i}] are Toeplitz upper triangular with t_k = (−1)^k λ^{−(k+1)} for k = 0, 1, ..., r−1, and s_0 = λ + 1/λ, s_1 = 1 − λ^{−2}, s_k = t_k for k = 2, ..., r−1. Since λ ≠ ±1, s_1 = 1 − λ^{−2} is nonzero. It follows that (N_r(λ) + N_r(λ)^{-1}, I_r) \overset{\mathrm{eq}}{\sim} (N_r(λ + 1/λ), I_r).
(ii) (I_r + N_r^2, N_r) \overset{\mathrm{eq}}{\sim} (I_r, N_r(I + N_r^2)^{-1}) \overset{\mathrm{eq}}{\sim} (I_r, N_r − N_r^3 + N_r^5 − \cdots) \overset{\mathrm{eq}}{\sim} (I_r, N_r).
Theorem 4.2. Suppose that (M, L) has eigenvalues ±1 with partial multiplicities 2. Let γ = λ + 1/λ (λ = 0, ∞ permitted). Then λ and 1/λ are eigenvalues of (M, L) if and only if γ is a double eigenvalue of (K, N). Furthermore, for λ ≠ ±1 (i.e., γ ≠ ±2), γ, λ and 1/λ have the same sizes of Jordan blocks, i.e., they have the same partial multiplicities; for λ = ±1, γ = ±2 are semi-simple eigenvalues of (K, N).
Proof. By the results on Kronecker canonical form for a symplectic pencil (see [20]) and by our assumption, there are nonsingular matrices X and Y such that
\[
Y M X = \begin{bmatrix} J & D \\ 0 & I_n \end{bmatrix}, \qquad Y L X = \begin{bmatrix} I_n & 0 \\ 0 & J \end{bmatrix}, \tag{4.2}
\]
where J = J_1 ⊕ J_s ⊕ J_0, J_1 = I_p ⊕ (−I_q), J_s is the direct sum of Jordan blocks corresponding to nonzero eigenvalues λ_j of (M, L) with |λ_j| < 1 or λ_j = e^{iθ_j} with Im(λ_j) > 0, J_0 is the direct sum of nilpotent blocks corresponding to zero eigenvalues, and D = I_p ⊕ I_q ⊕ 0_{n−r} with r = p + q.
Let
\[
X^{-1} J X^{-\top} = \begin{bmatrix} X_1 & X_2 \\ -X_2^\top & X_3 \end{bmatrix},
\]
where X_i ∈ C^{n×n}, i = 1, 2, 3, so X_1^⊤ = −X_1 and X_3^⊤ = −X_3. Using (4.2) in M J M^⊤ = L J L^⊤ we get
\[
J X_1 J^\top - D X_2^\top J^\top + J X_2 D + D X_3 D = X_1, \tag{4.3a}
\]
\[
J X_2 + D X_3 = X_2 J^\top, \tag{4.3b}
\]
\[
J X_3 J^\top = X_3. \tag{4.3c}
\]
Let J_{s,0} = J_s ⊕ J_0. We partition X_3 and X_2 as
\[
X_3 = \begin{bmatrix} X_{3,1} & X_{3,2} \\ -X_{3,2}^\top & X_{3,3} \end{bmatrix}, \qquad X_2 = \begin{bmatrix} X_{2,1} & X_{2,2} \\ X_{2,3} & X_{2,4} \end{bmatrix},
\]
where X_{2,1}, X_{3,1} ∈ C^{r×r}. Comparing the diagonal blocks in (4.3b) we get
\[
J_1 X_{2,1} + X_{3,1} = X_{2,1} J_1^\top, \tag{4.4a}
\]
\[
J_{s,0} X_{2,4} = X_{2,4} J_{s,0}^\top. \tag{4.4b}
\]
From (4.4a) we see that X_{3,1} has the form
\[
X_{3,1} = \begin{bmatrix} 0_p & \omega \\ -\omega^\top & 0_q \end{bmatrix}.
\]
From (4.3c) we have [I_p ⊕ (−I_q)] X_{3,1} [I_p ⊕ (−I_q)] = X_{3,1}. It follows that ω = 0 and thus X_{3,1} = 0. From (4.3c) we also have J_1 X_{3,2} J_{s,0}^⊤ = X_{3,2} and J_{s,0} X_{3,3} J_{s,0}^⊤ = X_{3,3}, from which we get X_{3,2} = 0 and X_{3,3} = 0. So we have X_3 = 0. Then (4.3b) becomes J X_2 = X_2 J^⊤, from which we get
\[
X_{2,1} = \eta_p \oplus \eta_q, \qquad X_{2,2} = X_{2,3}^\top = 0, \tag{4.5}
\]
where η_p ∈ C^{p×p} and η_q ∈ C^{q×q}. Moreover, X_{2,1} and X_{2,4} are nonsingular by the nonsingularity of X^{-1} J X^{-⊤}. Substituting (4.3b) into (4.3a) we get
\[
X_1 = J X_1 J^\top - D J X_2^\top + X_2 J^\top D \equiv J X_1 J^\top + V, \tag{4.6}
\]
where
\[
V = X_2 J^\top D - D J X_2^\top = \begin{bmatrix} V_1 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{with } V_1 = (\eta_p - \eta_p^\top) \oplus (\eta_q^\top - \eta_q)
\]
by (4.5). Partition
\[
X_1 = \begin{bmatrix} X_{1,1} & X_{1,2} \\ -X_{1,2}^\top & X_{1,3} \end{bmatrix}
\]
with X_{1,1} ∈ C^{r×r}. From the equations for the (1, 2) and (2, 2) blocks of (4.6) we get X_{1,2} = 0 and X_{1,3} = 0, respectively. Furthermore, from the equation for the (1, 1) block in (4.6) we get X_{1,1} = ξ_p ⊕ ξ_q with ξ_p ∈ C^{p×p} and ξ_q ∈ C^{q×q}.
From (4.2), (4.4) and Lemma 4.1 (ii) we now have
\[
(K, N) \overset{\mathrm{eq}}{\sim} \left( M J L^\top + L J M^\top,\; L J L^\top \right)
\]
\[
\overset{\mathrm{eq}}{\sim} \left( \begin{bmatrix} (J_1 X_{1,1} + X_{1,1} J_1) \oplus 0_{n-r} & (J_1^2 + I_r) X_{2,1} \oplus (J_{s,0}^2 + I_{n-r}) X_{2,4} \\ X_{2,1}(J_1^2 + I_r) \oplus X_{2,4}^\top \big( (J_{s,0}^2)^\top + I_{n-r} \big) & 0_n \end{bmatrix},\;
\begin{bmatrix} X_{1,1} \oplus 0_{n-r} & J_1 X_{2,1} \oplus J_{s,0} X_{2,4} \\ X_{2,1} J_1 \oplus X_{2,4}^\top J_{s,0}^\top & 0_n \end{bmatrix} \right)
\]
\[
\overset{\mathrm{eq}}{\sim} \left( \begin{bmatrix} 2\xi_p \oplus (-2\xi_q) & 2I_r \\ 2I_r & 0_r \end{bmatrix} \oplus (J_s + J_s^{-1}) \oplus I \oplus (J_s + J_s^{-1}) \oplus I,\;
\begin{bmatrix} \xi_p \oplus \xi_q & I_p \oplus (-I_q) \\ I_p \oplus (-I_q) & 0_r \end{bmatrix} \oplus I \oplus J_0 \oplus I \oplus J_0 \right)
\]
\[
\overset{\mathrm{eq}}{\sim} \left( 2I_{2p} \oplus (-2I_{2q}) \oplus (J_s + J_s^{-1}) \oplus I \oplus (J_s + J_s^{-1}) \oplus I,\;
I_{2p} \oplus I_{2q} \oplus I \oplus J_0 \oplus I \oplus J_0 \right).
\]
The proof is completed by using Lemma 4.1 (i).
4.1. Development of SA. It is helpful to keep in mind that the transform λ → γ = λ + 1/λ achieves the following:
\[
\{0, \infty\} \to \infty, \qquad \mathbb{T} \to [-2, 2], \qquad \mathbb{R} \setminus \{\pm 1, 0\} \to \mathbb{R} \setminus [-2, 2], \qquad \mathbb{C} \setminus (\mathbb{R} \cup \mathbb{T}) \to \mathbb{C} \setminus \mathbb{R}.
\]
By Theorem 4.2 and our assumption on the unimodular eigenvalues of (M, L), all eigenvalues of (K, N) in [−2, 2] are semi-simple.
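These mapping properties of γ = λ + 1/λ are easy to confirm numerically; a quick, purely illustrative sanity check:

```python
import numpy as np

gamma = lambda lam: lam + 1.0 / lam

# T -> [-2, 2]: for |lam| = 1, gamma = 2*cos(theta) is real and lies in [-2, 2]
th = np.linspace(0.1, np.pi - 0.1, 50)
g = gamma(np.exp(1j * th))
assert np.allclose(g.imag, 0) and np.all(np.abs(g.real) <= 2)

# R \ {0, ±1} -> R \ [-2, 2]
for lam in (0.5, -3.0, 7.0):
    assert abs(gamma(lam)) > 2

# C \ (R ∪ T) -> C \ R
lam = 0.3 + 0.4j                 # |lam| = 0.5, neither real nor unimodular
assert abs(gamma(lam).imag) > 0
```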
Based on Patel's approach [23], we first reduce (K, N) to a block triangular matrix pair of the form
\[
U^\top K Z = \begin{bmatrix} K_1 & K_2 \\ 0 & K_1^\top \end{bmatrix}, \qquad U^\top N Z = \begin{bmatrix} N_1 & N_2 \\ 0 & N_1^\top \end{bmatrix}, \tag{4.7}
\]
where K_1 and N_1 ∈ R^{n×n} are upper Hessenberg and upper triangular, respectively, K_2 and N_2 are skew-symmetric, and U and Z ∈ R^{2n×2n} are orthogonal satisfying U^⊤ J Z = J.
By the QZ algorithm, we have orthogonal matrices Q_1 and Z_1 such that
\[
Q_1 K_1 Z_1 = K_{11}, \qquad Q_1 N_1 Z_1 = N_{11}, \tag{4.8}
\]
where K_{11} and N_{11} are quasi-upper and upper triangular, respectively.
From (4.7) and (4.8) we see that the pair (K_{11}, N_{11}) contains half of the eigenvalues of (K, N). We now reduce (K_{11}, N_{11}) to the quasi-upper and upper block triangular matrix pair
\[
\widetilde U^\top K_{11} \widetilde Z = \begin{bmatrix} I_{m_0} & \widehat K_{01} & \widehat K_{02} & \cdots & \widehat K_{0r} \\ & \Gamma_1 & \widehat K_{12} & \cdots & \widehat K_{1r} \\ & & \Gamma_2 & \ddots & \vdots \\ & & & \ddots & \widehat K_{r-1,r} \\ & & & & \Gamma_r \end{bmatrix}, \qquad
\widetilde U^\top N_{11} \widetilde Z = \mathrm{diag}\{\Gamma_0, I_{m_1}, I_{m_2}, \ldots, I_{m_r}\},
\]
where m_0 + m_1 + \cdots + m_r = n, Γ_0 is strictly upper triangular, Γ_1 = diag{g_1, ..., g_{m_1}} with g_k ∈ [−2, 2], and σ(Γ_j) ∩ σ(Γ_i) = ∅ for i ≠ j, i, j = 2, ..., r. Let
\[
\widehat K_i = [\widehat K_{i,i+1}, \cdots, \widehat K_{ir}], \quad i = 0, \ldots, r-1, \qquad
\widehat \Gamma_j = \begin{bmatrix} \Gamma_j & \widehat K_j \\ 0 & \widehat \Gamma_{j+1} \end{bmatrix}, \quad \widehat \Gamma_r = \Gamma_r, \quad j = 1, \ldots, r-1.
\]
By solving the Sylvester equations
\[
\Gamma_0 Z \widehat\Gamma_1 - Z = \widehat K_0, \qquad Z \widehat\Gamma_{i+1} - \Gamma_i Z = \widehat K_i, \quad i = 1, \ldots, r-1, \tag{4.9}
\]
we can reduce (K_{11}, N_{11}) to the quasi-upper and upper block diagonal matrix pair
\[
\widehat U^\top K_{11} \widehat Z = \mathrm{diag}\{I_{m_0}, \Gamma_1, \Gamma_2, \ldots, \Gamma_r\}, \qquad
\widehat U^\top N_{11} \widehat Z = \mathrm{diag}\{\Gamma_0, I_{m_1}, I_{m_2}, \ldots, I_{m_r}\}. \tag{4.10}
\]
The procedure here is very much like the block diagonalization described in section 7.6.3 of [8]. Partition \widehat Z = [\widehat Z_0, \widehat Z_1, \ldots, \widehat Z_r] with \widehat Z_i ∈ R^{n×m_i} according to the block sizes in (4.10). It holds that
\[
K_{11} \widehat Z_0 \Gamma_0 = N_{11} \widehat Z_0, \qquad K_{11} \widehat Z_j = N_{11} \widehat Z_j \Gamma_j, \quad j = 1, \ldots, r. \tag{4.11}
\]
It follows that, for Z_1 from (4.8), Z(:, 1:n)(Z_1 \widehat Z_j) forms a basis for an invariant subspace of (K, N) for j = 0, 1, ..., r. In particular, the columns of Z(:, 1:n)(Z_1 \widehat Z_1) are real eigenvectors of (K, N) corresponding to real eigenvalues in [−2, 2]. We then need to get a suitable invariant subspace of (M, L) from each of these invariant subspaces for (K, N).
We start with two lemmas about solving the quadratic equation γ = λ + 1/λ in the matrix form.
Lemma 4.3. Given a real quasi-upper triangular matrix
\[
\Gamma_s = \begin{bmatrix} \gamma_{11} & \cdots & \gamma_{1m} \\ & \ddots & \vdots \\ 0 & & \gamma_{mm} \end{bmatrix}, \tag{4.12}
\]
where each γ_{ii} is a 1 × 1 or 2 × 2 block with σ(γ_{ii}) ⊆ C \ [−2, 2], i = 1, ..., m, the quadratic matrix equation
\[
\Lambda_s^2 - \Gamma_s \Lambda_s + I = 0 \tag{4.13}
\]
in Λ_s is uniquely solvable with Λ_s real quasi-upper triangular with the same block form as Γ_s in (4.12) and σ(Λ_s) ⊆ D ≡ {λ ∈ C | |λ| < 1}.
Proof. Let
\[
\Lambda_s = \begin{bmatrix} \lambda_{11} & \cdots & \lambda_{1m} \\ & \ddots & \vdots \\ 0 & & \lambda_{mm} \end{bmatrix}
\]
have the same block form as Γ_s. We first solve the diagonal blocks {λ_{ii}}_{i=1}^m of Λ_s from the quadratic equations λ_{ii}^2 − γ_{ii}λ_{ii} + I_{[i]} = 0, where [i] denotes the size of γ_{ii}. Note that the scalar equation
\[
\lambda^2 - \gamma\lambda + 1 = 0 \tag{4.14}
\]
has no solutions on T for γ ∈ C \ [−2, 2]. So it always has one solution inside T and the other outside T.
For i = 1, ..., m, if γ_{ii} ∈ R \ [−2, 2], then λ_{ii} ∈ (−1, 1) is uniquely solved from (4.14) with γ = γ_{ii}. If γ_{ii} ∈ R^{2×2} with γ_{ii}z = γz for z ≠ 0 and γ ∈ C \ R, then γ_{ii} = [z, \bar z] diag{γ, \bar γ} [z, \bar z]^{-1} and the required solution is λ_{ii} = [z, \bar z] diag{λ, \bar λ} [z, \bar z]^{-1} ∈ R^{2×2}, where λ ∈ D is uniquely solved from (4.14).
For j > i, comparing the (i, j) blocks on both sides of (4.13) and using λ_{ii} − γ_{ii} = −λ_{ii}^{-1}, we get
\[
\lambda_{ij}\lambda_{jj} - \lambda_{ii}^{-1}\lambda_{ij} = \gamma_{ij}\lambda_{jj} + \sum_{\ell=i+1}^{j-1} (\gamma_{i\ell} - \lambda_{i\ell})\lambda_{\ell j}.
\]
Since σ(λ_{ii}^{-1}) ∩ σ(λ_{jj}) = ∅, i, j = 1, ..., m, the strictly upper triangular part of Λ_s can be determined by the following recursive formula:
For d = 1, ..., m−1,
  For i = 1, ..., m−d, j = i+d,
    A := λ_{jj}^⊤ ⊗ I_{[i]} − I_{[j]} ⊗ λ_{ii}^{-1},
    b := γ_{ij}λ_{jj} + Σ_{ℓ=i+1}^{j−1} (γ_{iℓ} − λ_{iℓ})λ_{ℓj},
    λ_{ij} = vec^{-1}(A^{-1} vec(b)),
  end i,
end d.
Here ⊗ denotes the Kronecker product, vec is the operation of stacking the columns of a matrix into a vector, and vec^{-1} is its inverse operation.
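For the case in which all diagonal blocks of Γ_s are 1 × 1 (real eigenvalues off [−2, 2]) the recursion reduces to scalar divisions, since A and b become scalars. A sketch under that restriction (the 2 × 2 blocks for complex conjugate pairs, handled as in the proof, are omitted here):

```python
import numpy as np

def solve_lambda(Gamma):
    """Solve L^2 - Gamma*L + I = 0 for upper triangular L with |L_ii| < 1.
    Gamma: real upper triangular with diagonal entries in R \\ [-2, 2]
    (1x1 diagonal blocks only)."""
    m = Gamma.shape[0]
    Lam = np.zeros_like(Gamma)
    for i in range(m):                       # diagonal: roots of (4.14)
        g = Gamma[i, i]
        r1 = (g + np.sqrt(g * g - 4)) / 2    # real roots since |g| > 2
        r2 = (g - np.sqrt(g * g - 4)) / 2
        Lam[i, i] = r1 if abs(r1) < 1 else r2   # keep the root inside the unit disk
    for d in range(1, m):                    # superdiagonals, as in the recursion
        for i in range(m - d):
            j = i + d
            b = Gamma[i, j] * Lam[j, j] + sum(
                (Gamma[i, l] - Lam[i, l]) * Lam[l, j] for l in range(i + 1, j))
            Lam[i, j] = b / (Lam[j, j] - 1.0 / Lam[i, i])
    return Lam

rng = np.random.default_rng(2)
m = 5
G = np.triu(rng.standard_normal((m, m)))
G[np.arange(m), np.arange(m)] = [3.0, -2.5, 4.0, 2.2, -3.7]  # all outside [-2, 2]
Lam = solve_lambda(G)
assert np.linalg.norm(Lam @ Lam - G @ Lam + np.eye(m)) < 1e-12
assert np.all(np.abs(np.diag(Lam)) < 1)
```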
We remark that equation (4.13) is a special case of the palindromic matrix equation studied recently in [16]. The desired solution Λ_s could also be obtained by applying the general formula in [16, Theorem 5], which involves the computation of (Γ_s^{-1})^2 and a matrix square root. However, for our special equation, the procedure given in the proof of Lemma 4.3 is more direct and numerically advantageous.
Lemma 4.4. Given a strictly upper triangular matrix Γ_0 = [γ_{ij}] ∈ R^{e×e}, the quadratic matrix equation
\[
\Gamma_0 \Lambda_0^2 - \Lambda_0 + \Gamma_0 = 0 \tag{4.15}
\]
in Λ_0 = [λ_{ij}] ∈ R^{e×e} with Λ_0 strictly upper triangular is uniquely solvable.
Proof. From (4.15) the matrix Λ_0 is uniquely determined by λ_{i,i+j} = γ_{i,i+j} for i = 1, ..., e−2, j = 1, 2, λ_{e−1,e} = γ_{e−1,e}, and
For j = 3, ..., e,
  For i = 1, ..., e−j+1,
    \widehat λ_{i,i+j−1} = Σ_{ℓ=i+1}^{i+j−2} λ_{i,ℓ} λ_{ℓ,i+j−1},
  end i,
  For i = 1, ..., e−j,
    λ_{i,i+j} = γ_{i,i+j} + Σ_{ℓ=i+1}^{i+j−2} γ_{i,ℓ} \widehat λ_{ℓ,i+j},
  end i,
end j.
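The recursion can equivalently be run in the fixed-point form Λ_0 = Γ_0Λ_0^2 + Γ_0, which determines two further superdiagonals per sweep and terminates because Γ_0 is nilpotent. A sketch (this fixed-point reformulation is ours, not the authors'):

```python
import numpy as np

def solve_lambda0(G0):
    """Unique strictly upper triangular solution of G0 L^2 - L + G0 = 0,
    via the fixed-point form L = G0 L^2 + G0 (G0 strictly upper triangular)."""
    e = G0.shape[0]
    Lam = np.zeros_like(G0)
    for _ in range(e):            # each sweep fixes (at least) two more superdiagonals
        Lam = G0 @ Lam @ Lam + G0
    return Lam

rng = np.random.default_rng(3)
e = 6
G0 = np.triu(rng.standard_normal((e, e)), k=1)   # strictly upper triangular
Lam0 = solve_lambda0(G0)
assert np.allclose(np.tril(Lam0), 0)             # strictly upper triangular
assert np.linalg.norm(G0 @ Lam0 @ Lam0 - Lam0 + G0) < 1e-12
```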
Theorem 4.5. Let Z_s form a basis for an invariant subspace of (K, N) corresponding to Γ_s with σ(Γ_s) ⊆ C \ [−2, 2], i.e., K Z_s = N Z_s Γ_s. Suppose that Λ_s solves Γ_s = Λ_s + Λ_s^{-1} as in Lemma 4.3 with σ(Λ_s) ⊆ D \ {0}. If the columns of J(L^⊤ J Z_s Λ_s − M^⊤ J Z_s) are linearly independent, then they form a basis for a stable invariant subspace of (M, L) corresponding to Λ_s.
Proof. Since
\[
K Z_s - N Z_s \Gamma_s = (M J L^\top + L J M^\top) J Z_s - L J L^\top J Z_s (\Lambda_s + \Lambda_s^{-1}) = 0,
\]
we have
\[
M J (L^\top J Z_s \Lambda_s - M^\top J Z_s) = L J (L^\top J Z_s \Lambda_s - M^\top J Z_s) \Lambda_s.
\]
Remark 4.1. Let
\[
Z_s = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}, \qquad \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = J (L^\top J Z_s \Lambda_s - M^\top J Z_s),
\]
where each of X_1, X_2, Z_1, and Z_2 has n rows. Then direct computation gives X_1 = Z_2 Λ_s − Z_1. But it is more convenient to get X_2 = Q X_1 − A^⊤ X_1 Λ_s from
\[
M \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = L \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \Lambda_s.
\]
We now explain how we can get eigenvectors of (M, L) from eigenvectors of (K, N) corresponding to eigenvalues in [−2, 2].
Theorem 4.6. Let v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} be a real eigenvector of (K, N) corresponding to an eigenvalue γ ∈ [−2, 2]. Let λ be a solution of (4.14) and let u_1 = λv_2 − v_1, u_2 = Qv_1 − λA^⊤v_1. Then u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} is an eigenvector of (M, L) corresponding to the eigenvalue λ if u ≠ 0. Moreover, we indeed have u ≠ 0 for each γ ∈ (−2, 2).
Proof. The first part is proved by direct computation as in the proof of Theorem 4.5. For the second part, we simply note that λ is not real when γ ∈ (−2, 2), and then u_1 ≠ 0 since v is a nonzero real vector.
Remark 4.2. When γ ∈ (−2, 2) is an eigenvalue of (K, N) with multiplicity 2k (or an eigenvalue of (K_{11}, N_{11}) with multiplicity k), we can use Theorem 4.6 to get k eigenvectors of (M, L) corresponding to the eigenvalue λ, from the k linearly independent eigenvectors of (K, N) we have already obtained. However, there is no guarantee that the k eigenvectors of (M, L) so obtained are also linearly independent.
When A is singular, (M, L) has eigenvalues at 0 and ∞. The following result will then be needed.
Theorem 4.7. Let Z_∞ ∈ R^{2n×m} span an infinite invariant subspace of (K, N) corresponding to (I, Γ_0), where Γ_0 is strictly upper triangular, i.e., N Z_∞ = K Z_∞ Γ_0. Suppose that Λ_0 solves Γ_0 Λ_0^2 − Λ_0 + Γ_0 = 0 as in Lemma 4.4 with Λ_0 strictly upper triangular. If the columns of J(L^⊤ J Z_∞ Λ_0 − M^⊤ J Z_∞) are linearly independent, then they form a basis for a zero invariant subspace of (M, L) corresponding to (Λ_0, I).
Proof. Since Γ_0 = Λ_0(I + Λ_0^2)^{-1}, we have
\[
N Z_\infty (I + \Lambda_0^2) = M J M^\top J Z_\infty + L J L^\top J Z_\infty \Lambda_0^2 = K Z_\infty \Lambda_0 = L J M^\top J Z_\infty \Lambda_0 + M J L^\top J Z_\infty \Lambda_0,
\]
and then
\[
M J (L^\top J Z_\infty \Lambda_0 - M^\top J Z_\infty) = L J (L^\top J Z_\infty \Lambda_0 - M^\top J Z_\infty) \Lambda_0.
\]
We can now present a structure-preserving algorithm (SA) for the computation of the weakly stabilizing solution of (3.3).
Structure-preserving Algorithm (SA)
Input: A ∈ R^{n×n}, Q = Q^⊤ ∈ R^{n×n}.
Output: The weakly stabilizing solution X of X + A^⊤X^{-1}A = Q.
Step 1: Form the matrix pair (K, N) as in (4.1).
Step 2: Reduce (K, N) as in (4.7):
\[
K \leftarrow U^\top K Z = \begin{bmatrix} K_1 & K_2 \\ 0 & K_1^\top \end{bmatrix}, \qquad N \leftarrow U^\top N Z = \begin{bmatrix} N_1 & N_2 \\ 0 & N_1^\top \end{bmatrix},
\]
where K_1 and N_1 ∈ R^{n×n} are upper Hessenberg and upper triangular, respectively, and U and Z are orthogonal satisfying U^⊤ J Z = J (see the pseudocode in the appendix of [15]); apply the QZ algorithm to get (4.8).
Step 3: Compute the eigenmatrix pairs K_{11} \widehat Z_0 Γ_0 = N_{11} \widehat Z_0, K_{11} \widehat Z_j = N_{11} \widehat Z_j Γ_j of (K_{11}, N_{11}), j = 1, ..., r, as in (4.11), by solving the Sylvester equations in (4.9) to get (4.10); set [\widehat Z_0, \widehat Z_1, ..., \widehat Z_r] ← Z_1[\widehat Z_0, \widehat Z_1, ..., \widehat Z_r], where Z_1 is from (4.8).
Step 4: Z_∞ = Z(:, 1:n)\widehat Z_0 ≡ \begin{bmatrix} Z_{∞,1} \\ Z_{∞,2} \end{bmatrix}, Z_j = Z(:, 1:n)\widehat Z_j ≡ \begin{bmatrix} Z_{j,1} \\ Z_{j,2} \end{bmatrix}, j = 1, ..., r.
Step 5-1: Use Lemma 4.4 to solve for the strictly upper triangular Λ_0 in Γ_0Λ_0^2 − Λ_0 + Γ_0 = 0;
  compute X_{∞,1} = Z_{∞,2}Λ_0 − Z_{∞,1}, X_{∞,2} = QX_{∞,1} − A^⊤X_{∞,1}Λ_0;
  set X_1 ← X_{∞,1}, X_2 ← X_{∞,2} (by Theorem 4.7).
Step 5-2: For k = 1, ..., m_1,
  solve λ = e^{iθ} with Im(λ) ≥ 0 from (4.14) with γ = g_k;
  compute x_1 = z_2 e^{iθ} − z_1, x_2 = Qx_1 − e^{iθ}A^⊤x_1, where z_1 = Z_{1,1}e_k, z_2 = Z_{1,2}e_k (e_k is the kth column of the identity matrix);
  if Im(e^{iθ} x_1^* A^⊤ x_1) > 0, then x_1 ← \bar{x}_1, x_2 ← \bar{x}_2;
  set X_1 ← [X_1 | x_1], X_2 ← [X_2 | x_2] (by Theorem 4.6 and Theorem 3.1);
end k.
Step 5-3: For j = 2, ..., r,
  use Lemma 4.3 to solve Λ_j = Λ_s from (4.13) with Γ_s = Γ_j;
  compute X_{j,1} = Z_{j,2}Λ_j − Z_{j,1}, X_{j,2} = QX_{j,1} − A^⊤X_{j,1}Λ_j;
  set X_1 ← [X_1 | X_{j,1}], X_2 ← [X_2 | X_{j,2}] (by Theorem 4.5);
end j.
Step 6: Compute X = X_2 X_1^{-1}.
Table 4.1
Total number of flops for SA

Step 2: (170/3)n^3 (Patel's algorithm) + 39n^3 (QZ algorithm)
Step 3: 2n^3 (assuming m_0 = 0 and m_i = 2, i = 1, ..., r)
Step 4: 4n^3
Step 5: between 4n^3 (no eigenvalues on T) and 8n^3 (all eigenvalues on T)
Step 6: (32/3)n^3
Total: ≈ 120n^3
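The stated total can be recovered from the table by taking Step 5 mid-range (an assumption for this check only):

```python
# coefficients of n^3 from Table 4.1; Step 5 taken mid-range (6 n^3)
steps = {"Step 2": 170 / 3 + 39, "Step 3": 2, "Step 4": 4, "Step 5": 6, "Step 6": 32 / 3}
total = sum(steps.values())          # about 118.3, i.e. ~120 n^3 overall
assert abs(total - 120) < 3
```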
In Step 5-2 of SA we have assumed that the nonsingular Hermitian matrix iY^*(2λ_0 A^⊤ − Q)Y in Theorem 3.1 (which is the matrix in (3.11)) is definite. This assumption is the same as the assumption in Theorem 3.3 (iii). Under this assumption we do not need to form the matrix Y, but only need to check the sign of its diagonal element determined by any (normalized) eigenvector. We could have used Theorem 3.1 to choose the right eigenvectors whether the Hermitian matrix is definite or not, but this would increase the computational work. We have thus chosen to use the simpler
Step 5-2. We can always check in the end whether the imaginary part of the computed X is (nearly) positive semidefinite. If not, we can use a more complicated Step 5-2 according to Theorem 3.1 to re-compute X.
To find the weakly stabilizing solution of (3.3), the total number of flops for SA is roughly 120n^3, where a flop denotes a multiplication or an addition in real arithmetic.
A more detailed counting is given in Table 4.1.
4.2. Comparison of SA with other methods. The required solution in the nano application is X = lim_{η→0^+} X_η. By now, we have five methods available, in two categories. In the first category, we have three methods that compute X_η for a
small η as an approximation to X. They are fixed-point iteration (FPI), Newton’s method (NM) and doubling algorithm (DA). These methods are already discussed in [12]. In the second category, we have two methods that compute X directly. They are QZ and SA. In this paper we have further studied FPI. We now know when and why it works well. It has been shown by numerical experiments in [12] that NM often converges to an undesired solution or even diverges. We now also have a better understanding of this. Basically, Xη is the unique solution of (3.1) with a
positive definite imaginary part and, starting with an initial guess with a positive definite imaginary part, the iterates produced by NM often do not have a positive definite imaginary part (while the iterates from FPI always do). So NM is really not a contender for solving our specific problem, and could be dropped from the first category. We have also mentioned earlier that the convergence of FPI (say (2.1) with c = 1/2) is usually very slow for at least some of the energy values of interest. Therefore we only need to compare SA with DA and QZ, as general-purpose methods.
Strictly speaking, SA and QZ are applicable only under some generic assumptions since they rely on our Theorem 3.1 to pick the right nonreal unimodular eigenvalues of ϕ0(λ) to get the desired weakly stabilizing solution. The difficulty associated with
non-generic eigenstructure of ϕ0(λ) is avoided in DA by the introduction of an η > 0.
More precisely, this difficulty is only concealed, since the relation between ‖X_η − X‖ and η is not clear in non-generic cases. In the generic case, we often have ‖X_η − X‖ = O(η), but the constant hidden in the big-O notation could still be large. The smallest η that we have seen in the nano literature is 10^{-6}, although we have also experimented with η = 10^{-10} for DA in [12]. In the experiments there, DA typically requires 26 iterations for η = 10^{-6}, and nearly 40 iterations for η = 10^{-10}. Note that DA requires (104/3)n^3 real flops each iteration. So in terms of flop counts, four DA iterations (about (416/3)n^3 ≈ 139n^3 flops) are already more expensive than SA. Moreover, since we take η = 0 directly in SA, the accuracy achieved by SA is usually much better than that achieved by DA with η = 10^{-6}. But we also note that DA is much easier to use. If we take η to be very small in DA, then DA could also have stability problems, since the matrices to be inverted in the DA iterations may be very ill-conditioned in that case.
One potential problem with SA is that we cannot rule out the possibility that the column vectors generated in Steps 5-1, 5-2, and 5-3 of SA are linearly dependent (in exact arithmetic). In that case, SA fails. Note that QZ does not have this problem, but it requires about 440n^3 flops, much more than the 120n^3 flops required by SA. The accuracy achieved by QZ is not necessarily better than that from SA, since QZ does not exploit the structure of the problem. Moreover, for QZ we may encounter the uncomfortable situation where the number of computed eigenvalues inside T does not match the number of computed eigenvalues outside T.
Finally, we explain why X_1 is unlikely to be singular (in exact arithmetic) in Step 6 of SA, i.e., why the column vectors generated in Steps 5-1, 5-2, and 5-3 of SA are unlikely to be linearly dependent. Take Step 5-3 for example, which is based on Theorem 4.5. Taking Z_s = Z_j and Γ_s = Γ_j (j = 2, ..., r), we need to examine whether the matrix L^⊤ J Z_s Λ_s − M^⊤ J Z_s in Theorem 4.5 has full column rank. We start with the following result.
Theorem 4.8. Let X_s, X_u ∈ C^{2n×m} (m ≤ n) form bases for stable and unstable invariant subspaces of (L^⊤, M^⊤), respectively, corresponding to Λ_s and Λ_s^{-1}, where σ(Λ_s) ⊆ D \ {0}, i.e.,
\[
L^\top X_s = M^\top X_s \Lambda_s, \qquad L^\top X_u = M^\top X_u \Lambda_s^{-1}. \tag{4.16}
\]
Then J^⊤X_s and J^⊤X_u span two linearly independent eigenspaces of (K, N) corresponding to Λ_s + Λ_s^{-1}.
Proof. From L J L^⊤ = M J M^⊤ and (4.16) we have
\[
K J^\top X_s = M J L^\top X_s + L J M^\top X_s = M J M^\top X_s \Lambda_s + L J L^\top X_s \Lambda_s^{-1} = L J L^\top X_s (\Lambda_s + \Lambda_s^{-1}) = N J^\top X_s (\Lambda_s + \Lambda_s^{-1}).
\]
Similarly, K J^⊤ X_u = N J^⊤ X_u (Λ_s + Λ_s^{-1}).
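The eigenvalue relation in the proof can be checked one eigenvector at a time: if L^⊤x = λM^⊤x with λ ≠ 0, ∞, then K J^⊤x = (λ + 1/λ) N J^⊤x. A numerical sketch, again assuming the linearization from [12]:

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(4)
n = 4
A = rng.standard_normal((n, n))
Qs = rng.standard_normal((n, n)); Qs = (Qs + Qs.T) / 2

Zn = np.zeros((n, n)); In = np.eye(n)
M = np.block([[A, Zn], [Qs, -In]])       # assumed linearization from [12]
L = np.block([[Zn, In], [A.T, Zn]])
J = np.block([[Zn, In], [-In, Zn]])
K = M @ J @ L.T @ J + L @ J @ M.T @ J
N = L @ J @ L.T @ J

w, V = eig(L.T, M.T)                     # eigenpairs: L^T x = lam M^T x
for k in range(2 * n):
    lam, x = w[k], V[:, k]
    if np.isfinite(lam) and abs(lam) > 1e-8:
        y = J.T @ x                      # J^T x: eigenvector of (K, N)
        g = lam + 1.0 / lam              # corresponding eigenvalue of (K, N)
        assert np.linalg.norm(K @ y - g * (N @ y)) < 1e-6 * (1 + abs(g))
```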
Now we can write Z_s = J^⊤X_s C_s + J^⊤X_u C_u, where C_s and C_u are coefficient matrices. From (4.16) we have
\[
L^\top J Z_s \Lambda_s - M^\top J Z_s = L^\top (X_s C_s + X_u C_u)\Lambda_s - M^\top (X_s C_s + X_u C_u)
= \begin{bmatrix} L^\top X_s, & L^\top X_u \end{bmatrix} \begin{bmatrix} C_s \Lambda_s - \Lambda_s^{-1} C_s \\ C_u \Lambda_s - \Lambda_s C_u \end{bmatrix}. \tag{4.17}
\]
Since σ(Λ_s) ⊆ D \ {0}, the linear map T defined by T(W) = WΛ_s − Λ_s^{-1}W is a bijection on C^{m×m}. Thus, the matrix C_sΛ_s − Λ_s^{-1}C_s = T(C_s) is generically nonsingular. When T(C_s) is nonsingular, L^⊤ J Z_s Λ_s − M^⊤ J Z_s is also of full column rank, since the columns of [L^⊤X_s, L^⊤X_u] are linearly independent. The latter assertion is clear when A is nonsingular in (3.5), but can also be shown by using the Kronecker form of (L^⊤, M^⊤) when A is singular.
From the above discussions, we have the following practical strategy for computing the required solution X with reasonable accuracy. We first use SA. In the unlikely event that SA fails, we use QZ. If QZ also fails, then we use DA with a small η > 0, and might have to accept an approximation with lower accuracy.
5. Numerical results. All numerical experiments are carried out using MATLAB R2008b with IEEE double-precision floating-point arithmetic (eps ≈ 2.22 × 10^{-16}) on a Linux system. We first illustrate the positivity of X_{η,I} for η > 0, where X_{η,I} is the imaginary part of the stabilizing solution X_η of (3.1), as well as the rank of X_I = lim_{η→0^+} X_{η,I}. To measure the accuracy of the computed X_η we use the relative residual
\[
\mathrm{RRes}_\eta = \frac{\|X_\eta + A^\top X_\eta^{-1} A - Q_\eta\|}{\|X_\eta\| + \|A\|^2 \|X_\eta^{-1}\| + \|Q_\eta\|}, \tag{5.1}
\]
where ‖·‖ is the spectral norm.
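In code, (5.1) is a one-liner; a sketch with a manufactured example (choose X and A, then define Q so that X solves the equation exactly):

```python
import numpy as np

def rres(X, A, Q):
    """Relative residual (5.1), using the spectral norm."""
    s = lambda T: np.linalg.norm(T, 2)
    Xinv = np.linalg.inv(X)
    return s(X + A.T @ Xinv @ A - Q) / (s(X) + s(A) ** 2 * s(Xinv) + s(Q))

rng = np.random.default_rng(5)
n = 6
A = rng.standard_normal((n, n))
X = np.eye(n) + 0.1 * rng.standard_normal((n, n)); X = (X + X.T) / 2
Q = X + A.T @ np.linalg.inv(X) @ A       # X solves the equation by construction
assert rres(X, A, Q) < 1e-14
```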
Example 5.1. We randomly generate a real matrix A and a real symmetric matrix Q of dimension 6. Then we use the invariant subspace method (the QZ algorithm) to compute the complex symmetric stabilizing solution X_η of (3.1) with η = 10^{-4}, 10^{-8}, 10^{-12}, respectively. When η = 0, ϕ_0(λ) has 2m = 6 eigenvalues on T, occurring in complex conjugate pairs. By Theorem 3.1 we determine that
\[
\Lambda_s = \{-0.80913 + 0.58763i,\; 0.64993 + 0.76000i,\; -0.13000 - 0.99151i\}
\]
is such that the perturbed eigenvalues of ϕ_η(λ) (η > 0) associated with each λ_s ∈ Λ_s are inside the unit circle. Then we compute the weakly stabilizing solution X of (3.3) by using the invariant subspace corresponding to stable eigenvalues and eigenvalues in Λ_s. The numerical results are shown in Table 5.1, where X_0 = X.

Table 5.1
Relative residuals and ‖X_η − X_η^⊤‖/‖X_η‖

η                    | 10^{-4}        | 10^{-8}        | 10^{-12}       | 0
RRes_η               | 1.17 × 10^{-15} | 1.51 × 10^{-15} | 1.69 × 10^{-15} | 1.59 × 10^{-15}
‖X_η − X_η^⊤‖/‖X_η‖  | 9.88 × 10^{-15} | 7.47 × 10^{-15} | 1.12 × 10^{-14} | 1.14 × 10^{-14}
We know from section 2 that X_{η,I} is positive definite, and we know from Theorem 3.3 (iii) that rank(X_I) = m = 3. These are confirmed by the numerical results shown in Table 5.2, where X_{0,I} = X_I.

Table 5.2
The eigenvalues of X_{η,I}

η        | eigenvalues of X_{η,I}
10^{-4}  | 33.938, 6.4240, 9.5171, 6.98 × 10^{-4}, 1.00 × 10^{-4}, 1.34 × 10^{-4}
10^{-8}  | 33.937, 6.4232, 9.5134, 6.98 × 10^{-8}, 1.00 × 10^{-8}, 1.34 × 10^{-8}
10^{-12} | 33.937, 6.4232, 9.5134, 7.01 × 10^{-12}, 1.01 × 10^{-12}, 1.36 × 10^{-12}
0        | 33.937, 6.4232, 9.5134, 2.54 × 10^{-16}, −1.94 × 10^{-15}, −3.71 × 10^{-17}
We now present some numerical comparisons of SA and QZ for the computation of the weakly stabilizing solution of (3.3). To measure the accuracy of a computed solution X of (3.3) we use the relative residual RRes defined as in (5.1), with X_η and Q_η replaced by X and Q, respectively.
Example 5.2. As in [12], we consider a semi-infinite Hamiltonian operator for a heterostructured semiconductor of the form
\[
H(\psi, \vec{x}) = -\nabla \cdot \frac{\hbar^2}{2\varepsilon(\vec{x})} \nabla \psi + V(\vec{x})\psi, \qquad \vec{x} \equiv \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \Omega, \tag{5.2}
\]
where Ω ≡ Ω_1 ∪ Ω_2 with Ω_1 = ([−9, −1] ∪ [1, 9]) × (−∞, 0], Ω_2 = [−1, 1] × (−∞, 0], ℏ is the reduced Planck constant, ψ is the associated wave function, ε(\vec{x}) is the electron effective mass with
\[
\varepsilon(\vec{x}) = \begin{cases} \varepsilon_1, & \vec{x} \in \Omega_1, \\ \varepsilon_2, & \vec{x} \in \Omega_2, \end{cases}
\]
and V(\vec{x}) = ωx_1^2.
Fig. 5.1. Relative residuals of SA, condition numbers of X_1, and numbers of eigenvalues of (M, L) on T that are used in the computation of X (so they are halves of the total numbers).
Let T_r = Trid(−1, 4, −1) be the tridiagonal matrix of dimension r. We use the classical five-point central finite difference method to discretize the Hamiltonian operator (5.2) on uniform grid points in Ω with mesh size h. Then the corresponding matrices A and Q in (3.3) have the forms
\[
A = -\left( \delta_1 I_\ell \oplus \frac{\delta_1 + \delta_2}{2} \oplus \delta_2 I_m \oplus \frac{\delta_1 + \delta_2}{2} \oplus \delta_1 I_\ell \right)
\]
and Q = E I − B with
\[
B = \delta_1 T_\ell \oplus (2(\delta_1 + \delta_2)) \oplus \delta_2 T_m \oplus (2(\delta_1 + \delta_2)) \oplus \delta_1 T_\ell
- \delta_1 (e_{\ell+1} e_\ell^\top + e_\ell e_{\ell+1}^\top) - \delta_2 (e_{\ell+2} e_{\ell+1}^\top + e_{\ell+1} e_{\ell+2}^\top)
- \delta_2 (e_{t+1} e_t^\top + e_t e_{t+1}^\top) - \delta_1 (e_{t+2} e_{t+1}^\top + e_{t+1} e_{t+2}^\top)
+ \omega h^2 \, \mathrm{diag}\big( (1-c)^2, (2-c)^2, \ldots, (n-c)^2 \big),
\]
where e_i is the ith column of the identity matrix, ℓ and m are the numbers of grid points on the x_1-axis in (−9, −1) and (−1, 1), respectively, t = ℓ + m + 1, n = 2ℓ + m + 2, c = (n + 1)/2, and ⊕ denotes the direct sum of two matrices.

Fig. 5.2. Relative residuals of QZ and condition numbers of X_1.
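The construction of A and Q is mechanical; a sketch follows (the mesh size h = 0.1, consistent with the stated grid, and the energy value E passed below are assumptions here, and the 0-based index shifts are ours):

```python
import numpy as np
from scipy.linalg import block_diag

def example_matrices(ell, m, d1, d2, omega, h, E):
    """A and Q of Example 5.2, as reconstructed above (sketch, not the authors' code)."""
    n = 2 * ell + m + 2
    t = ell + m + 1                       # the 1-based index t of the text
    c = (n + 1) / 2
    Tr = lambda r: 4.0 * np.eye(r) - np.eye(r, k=1) - np.eye(r, k=-1)  # Trid(-1,4,-1)
    A = -block_diag(d1 * np.eye(ell), [[(d1 + d2) / 2]], d2 * np.eye(m),
                    [[(d1 + d2) / 2]], d1 * np.eye(ell))
    B = block_diag(d1 * Tr(ell), [[2 * (d1 + d2)]], d2 * Tr(m),
                   [[2 * (d1 + d2)]], d1 * Tr(ell))
    # couplings -d (e_{i+1} e_i^T + e_i e_{i+1}^T), converted to 0-based positions
    for i, d in [(ell, d1), (ell + 1, d2), (t, d2), (t + 1, d1)]:
        B[i, i - 1] -= d
        B[i - 1, i] -= d
    B += omega * h ** 2 * np.diag((np.arange(1, n + 1) - c) ** 2)  # potential term
    return A, E * np.eye(n) - B

A, Q = example_matrices(ell=79, m=19, d1=1.0, d2=0.1, omega=5e-4, h=0.1, E=1.0)
assert A.shape == (179, 179) and np.allclose(Q, Q.T)
```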
In our test we take ℓ = 79, m = 19, δ_1 = 1, δ_2 = 0.1, and ω = 5 × 10^{-4}; then the matrix size of A is n = 2ℓ + m + 2 = 179. Let Δ = [−0.5, 8.5]. We divide Δ into p subintervals using p + 1 equally spaced nodes E_i, i = 0, 1, ..., p. We now choose p = 1000 and run SA and QZ for each E_i. Recall that for both algorithms X is computed from X = X_2 X_1^{-1} in the end. In Figure 5.1, we plot the relative residual and the condition number of X_1 (with each column of X_1 normalized, here and elsewhere) for SA. We also plot the number of eigenvalues of (M, L) on T that are used in the computation of X (so it is half of the total number). In Figure 5.2, we plot the relative residual and the condition number of X_1 for QZ. From the figures we can see that the condition numbers are not very large and the accuracy of the computed X is high for both methods, with that from SA slightly better.
In Example 5.2 the matrix A is nonsingular and so Step 5-1 in SA is never used. We now construct an example with a singular A.
Example 5.3. We construct 10 × 10 matrices A and Q for (3.3) as follows: A = rand(10,10) * diag(a_1, ..., a_5, 0, 0, 0, 0, 0) * rand(10,10), where a_i = 10 * rand(1) for i = 1, ..., 5, and Q = E I_{10} − B with B = (B_0 + B_0^⊤)/2 and B_0 = 10 * (rand(10,10) − 0.5 * ones(10,10)). We run SA and QZ for E = 0.01i, i = 0, ..., 1000. The results are shown in Figures 5.3 and 5.4. We see that SA and QZ have roughly the same accuracy for this example.
6. Conclusions. In this paper we have further studied a nonlinear matrix equation arising in nano research. We have proved general convergence results on fixed-point iterations for (1.4). It is also shown that the required solution of (1.4) is the
Fig. 5.3. Relative residuals of SA, condition numbers of X_1, and numbers of eigenvalues of (M, L) on T.
unique solution with a positive definite imaginary part. Our analysis has also shown that the convergence of these methods is usually very slow when the size of the matrices in (1.4) is large. So the use of these simple methods is recommended only when the matrix size is small, which is the case when the equation is obtained from layer-based models. We have also studied the equation (3.3) directly, which is the equation we obtain by letting η = 0 in (1.4). We have shown which half of the unimodular eigenvalues of P(λ) in (3.6) should be included for computing the required weakly stabilizing solution of (3.3) using subspace methods, and we have also determined the rank of the imaginary part of this solution in terms of the number of unimodular eigenvalues of P(λ). We have presented a structure-preserving algorithm (SA) for (3.3) that is nearly 4 times more efficient than the QZ algorithm. The SA and the QZ algorithm often provide very good accuracy, but the accuracy would suffer if the matrix X_1 used at the end of the algorithms happens to be very ill-conditioned. Newton's method cannot be used as a correction method, since the Fréchet derivative at the solution is always singular when P(λ) has unimodular eigenvalues. When SA
Fig. 5.4. Relative residuals of QZ and condition numbers of X_1.
and QZ fail we may use DA on (1.4) with a small η > 0, but may need to increase η if DA also runs into numerical difficulties. It is possible to improve the accuracy of an approximation from DA by using the FPI (2.1) with c = 1/2, although at a high computational cost when the matrix size is large.
Acknowledgment. The authors thank the referees for their helpful comments.
REFERENCES
[1] I. Appelbaum, T. Wang, J. D. Joannopoulos, and V. Narayanamurti, Ballistic hot-electron transport in nanoscale semiconductor heterostructures: Exact self-energy of a three-dimensional periodic tight-binding Hamiltonian, Physical Review B, 69 (2004), Article Number 165301, 6 pp.
[2] A. G. Baskakov, Dichotomy of the spectrum of non-self-adjoint operators, (Russian) Sibirsk. Mat. Zh., 32 (3) (1991), pp. 24–30; translation in Siberian Math. J., 32 (3) (1992), pp. 370–375.
[3] C.-Y. Chiang, E. K.-W. Chu, C.-H. Guo, T.-M. Huang, W.-W. Lin, and S.-F. Xu, Convergence analysis of the doubling algorithm for several nonlinear matrix equations in the critical case, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 227–247.
[4] S. Datta, Electronic Transport in Mesoscopic Systems, Cambridge University Press, 1995.
[5] S. Datta, Nanoscale device modeling: the Green's function method, Superlattices and Microstructures, 28 (2000), pp. 253–278.
[6] C. J. Earle and R. S. Hamilton, A fixed point theorem for holomorphic mappings, Global Analysis (Proc. Sympos. Pure Math., Vol. XVI, Berkeley, Calif., 1968), American Mathematical Society, Rhode Island, 1970, pp. 61–65.
[7] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, Necessary and sufficient conditions for the existence of a positive definite solution of the matrix equation X + A^*X^{-1}A = Q, Linear Algebra Appl., 186 (1993), pp. 255–275.
[8] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, The Johns Hopkins University Press, 1996.
[9] C.-H. Guo, On Newton’s method and Halley’s method for the principal pth root of a matrix, Linear Algebra Appl., 432 (2010), pp. 1905–1922.
[10] C.-H. Guo, Y.-C. Kuo, and W.-W. Lin, Complex symmetric stabilizing solution of the matrix equation X + A^⊤X^{-1}A = Q, Linear Algebra Appl., 435 (2011), pp. 1187–1192.
[11] C.-H. Guo and P. Lancaster, Iterative solution of two matrix equations, Math. Comp., 68 (1999), pp. 1589–1603.
[12] C.-H. Guo and W.-W. Lin, The matrix equation X + A^⊤X^{-1}A = Q and its application in nano research, SIAM J. Sci. Comput., 32 (2010), pp. 3020–3038.
[13] U. Haagerup and S. Thorbjørnsen, A new application of random matrices: Ext(C*_{red}(F_2)) is not a group, Ann. of Math., 162 (2005), pp. 711–775.
[14] L. A. Harris, Fixed points of holomorphic mappings for domains in Banach spaces, Abstr. Appl. Anal., 2003, no. 5, pp. 261–274.
[15] T.-M. Huang, W.-W. Lin, and J. Qian, Structure-preserving algorithms for palindromic quadratic eigenvalue problems arising from vibration of fast trains, SIAM J. Matrix Anal. Appl., 30 (2009), pp. 1566–1592.
[16] B. Iannazzo and B. Meini, Palindromic matrix polynomials, matrix functions and integral representations, Linear Algebra Appl., 434 (2011), pp. 174–184.
[17] D. L. John and D. L. Pulfrey, Green's function calculations for semi-infinite carbon nanotubes, Physica Status Solidi B – Basic Solid State Physics, 243 (2006), pp. 442–448.
[18] A. Kletsov, Y. Dahnovsky, and J. V. Ortiz, Surface Green's function calculations: A nonrecursive scheme with an infinite number of principal layers, Journal of Chemical Physics, 126 (2007), Article Number 134105, 5 pp.
[19] W.-W. Lin, A new method for computing the closed-loop eigenvalues of a discrete-time alge-braic Riccati equation, Linear Algebra Appl., 96 (1987), pp. 157–180.
[20] W.-W. Lin, V. Mehrmann, and H. Xu, Canonical forms for Hamiltonian and symplectic matrices and pencils, Linear Algebra Appl., 302/303 (1999), pp. 469–533.
[21] W.-W. Lin and S.-F. Xu, Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 26–39.
[22] B. Meini, Efficient computation of the extreme solutions of X + A^*X^{-1}A = Q and X − A^*X^{-1}A = Q, Math. Comp., 71 (2002), pp. 1189–1204.
[23] R. V. Patel, On computing the eigenvalues of a symplectic pencil, Linear Algebra Appl., 188/189 (1993), pp. 591–611.
[24] G. W. Stewart and J.-G. Sun, Matrix Perturbation Theory, Academic Press, 1990.
[25] J. Tomfohr and O. F. Sankey, Theoretical analysis of electron transport through organic molecules, Journal of Chemical Physics, 120 (2004), pp. 1542–1554.
[26] H. Wielandt, On eigenvalues of sums of normal matrices, Pacific J. Math., 5 (1955), pp. 633–638.
[27] X. Zhan and J. Xie, On the matrix equation X + A^⊤X^{-1}A = I, Linear Algebra Appl., 247 (1996), pp. 337–345.