On a nonlinear matrix equation arising in nano research


CHUN-HUA GUO†, YUEH-CHENG KUO‡, AND WEN-WEI LIN§

Abstract. The matrix equation $X + A^\top X^{-1}A = Q$ arises in Green's function calculations in nano research, where $A$ is a real square matrix and $Q$ is a real symmetric matrix dependent on a parameter and is usually indefinite. In practice one is mainly interested in those values of the parameter for which the matrix equation has no stabilizing solutions. The solution of interest in this case is a special weakly stabilizing complex symmetric solution $X_*$, which is the limit of the unique stabilizing solution $X_\eta$ of the perturbed equation $X + A^\top X^{-1}A = Q + i\eta I$ as $\eta \to 0^+$. It has been shown that a doubling algorithm can be used to compute $X_\eta$ efficiently even for very small values of $\eta$, thus providing good approximations to $X_*$. It has been observed by nano scientists that a modified fixed-point method can sometimes be quite useful, particularly for computing $X_\eta$ for many different values of the parameter. We provide a rigorous analysis of this modified fixed-point method and its variant, and of their generalizations. We also show that the imaginary part $X_I$ of the matrix $X_*$ is positive semi-definite, and we determine the rank of $X_I$ in terms of the number of unimodular eigenvalues of the quadratic pencil $\lambda^2 A^\top - \lambda Q + A$. Finally, we present a new structure-preserving algorithm that is applied directly to the equation $X + A^\top X^{-1}A = Q$. In doing so, we work with real arithmetic most of the time.

Key words. nonlinear matrix equation, complex symmetric solution, weakly stabilizing solution, fixed-point iteration, structure-preserving algorithm, Green’s function

AMS subject classifications. 15A24, 65F30

1. Introduction. The nonlinear matrix equation $X + A^\top X^{-1}A = Q$, where $A$ is real and $Q$ is real symmetric positive definite, arises in several applications and has been studied in [3, 7, 11, 21, 22, 27], for example.

In this paper we further study the nonlinear matrix equation
$$X + A^\top X^{-1}A = Q + i\eta I,$$
where $A \in \mathbb{R}^{n\times n}$, $Q = Q^\top \in \mathbb{R}^{n\times n}$, and $\eta \ge 0$, but $Q$ is usually not positive definite.

The equation arises from the non-equilibrium Green's function approach for treating quantum transport in nanodevices, where the system Hamiltonian is a semi-infinite or bi-infinite real symmetric matrix with special structures [1, 5, 17, 18, 25]. A first systematic mathematical study of the equation has already been undertaken in [12].

For the bi-infinite case, the Green's function corresponding to the scattering region, $G_S \in \mathbb{C}^{n_s\times n_s}$, in which the nano scientists are interested, satisfies the relation [4, 17]
$$G_S = \left[(E + i0^+)I - H_S - C_{L,S}^\top G_{L,S} C_{L,S} - D_{S,R} G_{S,R} D_{S,R}^\top\right]^{-1},$$
where $E$ is the energy, a real number that may be negative, $H_S \in \mathbb{R}^{n_s\times n_s}$ is the Hamiltonian for the scattering region, and $C_{L,S} \in \mathbb{R}^{n_\ell\times n_s}$ and $D_{S,R} \in \mathbb{R}^{n_s\times n_r}$ represent the coupling with the scattering region for the left lead and the right lead, respectively.

∗Version of July 24, 2011.
†Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada (chguo@math.uregina.ca). The work of this author was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada.
‡Department of Applied Mathematics, National University of Kaohsiung, Kaohsiung 811, Taiwan (yckuo@nuk.edu.tw). The work of this author was partially supported by the National Science Council in Taiwan.
§Department of Mathematics, National Taiwan University, Taipei 106, Taiwan & Center of Mathematical Modelling and Scientific Computing, National Chiao Tung University, Hsinchu 300, Taiwan (wwlin@math.nctu.edu.tw). The work of this author was partially supported by the National Science Council and the National Center for Theoretical Sciences in Taiwan.


The matrices $G_{L,S} \in \mathbb{C}^{n_\ell\times n_\ell}$ and $G_{S,R} \in \mathbb{C}^{n_r\times n_r}$ are special solutions of the matrix equations
$$G_{L,S} = \left[(E + i0^+)I - B_L - A_L^\top G_{L,S} A_L\right]^{-1} \qquad (1.1)$$
and
$$G_{S,R} = \left[(E + i0^+)I - B_R - A_R G_{S,R} A_R^\top\right]^{-1} \qquad (1.2)$$
with $A_L, B_L = B_L^\top \in \mathbb{R}^{n_\ell\times n_\ell}$ and $A_R, B_R = B_R^\top \in \mathbb{R}^{n_r\times n_r}$. Since (1.1) and (1.2) are of the same type, we only need to study (1.1), and so we simplify the notation $n_\ell$ to $n$. In nano research, one is mainly interested in the values of $E$ for which $G_{L,S}$ in (1.1) has a nonzero imaginary part [18].

For each fixed $E$, we replace "$0^+$" in (1.1) by a sufficiently small positive number $\eta$ and consider the matrix equation
$$X = \left[(E + i\eta)I - B_L - A_L^\top X A_L\right]^{-1}. \qquad (1.3)$$
It is shown in [12] that the required special solution $G_{L,S}$ of (1.1) is given by $G_{L,S} = \lim_{\eta\to 0^+} G_{L,S}(\eta)$, with $X = G_{L,S}(\eta)$ being the unique complex symmetric solution of (1.3) such that $\rho(G_{L,S}(\eta)A_L) < 1$, where $\rho(\cdot)$ denotes the spectral radius. Thus $G_{L,S}$ is a special complex symmetric solution of $X = \left[EI - B_L - A_L^\top X A_L\right]^{-1}$ with $\rho(G_{L,S}A_L) \le 1$.

The question as to when $G_{L,S}$ has a nonzero imaginary part is answered by the following result from [12], where $\mathbb{T}$ denotes the unit circle.

Theorem 1.1. For $\lambda \in \mathbb{T}$, let the eigenvalues of $\psi_L(\lambda) = B_L + \lambda A_L + \lambda^{-1}A_L^\top$ be $\mu_{L,1}(\lambda) \le \cdots \le \mu_{L,n}(\lambda)$. Let
$$\Delta_{L,i} = \left[\min_{|\lambda|=1}\mu_{L,i}(\lambda),\ \max_{|\lambda|=1}\mu_{L,i}(\lambda)\right],$$
and $\Delta_L = \bigcup_{i=1}^n \Delta_{L,i}$. Then $G_{L,S}$ is a real symmetric matrix if $E \notin \Delta_L$. When $E \in \Delta_L$, the quadratic pencil $\lambda^2 A_L^\top - \lambda(EI - B_L) + A_L$ has eigenvalues on $\mathbb{T}$. If all these eigenvalues on $\mathbb{T}$ are simple (they must then be nonreal, as explained in the proof of Theorem 3.3), then $G_{L,S}$ has a nonzero imaginary part.

By replacing $X$ in (1.3) with $X^{-1}$, we get the equation
$$X + A^\top X^{-1}A = Q_\eta, \qquad (1.4)$$
where $A = A_L$ and $Q_\eta = Q + i\eta I$ with $Q = EI - B_L$. So $Q$ is a real symmetric matrix dependent on the parameter $E$ and is usually indefinite. For $\eta > 0$, we need the stabilizing solution $X$ of (1.4), which is the solution with $\rho(X^{-1}A) < 1$; then $G_{L,S}(\eta) = X^{-1}$. The existence of the stabilizing solution was proved in [12] using advanced tools; an elementary proof has been given recently in [10]. When $\eta = 0$ and $E \in \Delta_L$, it follows from Theorem 1.1 that the required solution $X = G_{L,S}^{-1}$ of (1.4) is only weakly stabilizing, in the sense that $\rho(X^{-1}A) = 1$.

The size of the matrices in (1.4) can be very small or very large, depending on how the system Hamiltonian is obtained. If the Hamiltonian is obtained from layer-based models, as in [18] and [25], then the size of the matrices is just the number of principal layers. In [18] considerable attention is paid to single layer models and the more realistic double layer models, which correspond to $n = 1$ and $n = 2$ in (1.4).


So we can say that (1.4) with $n \le 10$ is already of significant practical interest. On the other hand, if the Hamiltonian is obtained from the discretization of a differential operator, as in [1], then the size of the matrices in (1.4) can be very large if a fine mesh is used.

One way to approximate $G_{L,S}$ is to take a very small $\eta > 0$ and compute $G_{L,S}(\eta)$. It is proved in [12] that the sequence $\{X_k\}$ from the basic fixed-point iteration (FPI)
$$X_{k+1} = Q_\eta - A^\top X_k^{-1}A, \qquad (1.5)$$
with $X_0 = Q_\eta$, converges to $G_{L,S}(\eta)^{-1}$. It follows that the sequence $\{Y_k\}$ from the basic FPI
$$Y_{k+1} = (Q_\eta - A^\top Y_k A)^{-1}, \qquad (1.6)$$
with $Y_0 = Q_\eta^{-1}$, converges to $G_{L,S}(\eta)$. However, the convergence is very slow for $E \in \Delta_L$, since $\rho(G_{L,S}(\eta)A) \approx 1$ for $\eta$ close to 0. It is also shown in [12] that a doubling algorithm (DA) can be used to compute the desired solution $X = G_{L,S}(\eta)^{-1}$ of the equation (1.4) efficiently for each fixed value of $E$. However, in practice the desired solution needs to be computed for many different $E$ values. Since the DA is not a correction method, it cannot use the solution obtained for one $E$ value as an initial approximation to the exact solution at a nearby $E$ value. To compute the solutions corresponding to many $E$ values, it may be more efficient to use a modified FPI together with the DA. Indeed, it is suggested in [25] that the following modified FPI be used to approximate $G_{L,S}(\eta)$:
$$Y_{k+1} = \tfrac{1}{2}Y_k + \tfrac{1}{2}(Q_\eta - A^\top Y_k A)^{-1}. \qquad (1.7)$$
A variant of this FPI is given in [12] to approximate $G_{L,S}(\eta)^{-1}$:
$$X_{k+1} = \tfrac{1}{2}X_k + \tfrac{1}{2}(Q_\eta - A^\top X_k^{-1}A), \qquad (1.8)$$
which requires less computational work per iteration. However, the convergence analysis of these two modified FPIs has been an open problem, even for the special initial matrices $Y_0 = Q_\eta^{-1}$ and $X_0 = Q_\eta$, respectively.
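For readers who wish to experiment, here is a minimal NumPy sketch of this iteration family (our own illustration, not part of the paper): with `c = 1` it reduces to the basic FPI (1.5), and with `c = 0.5` it gives the modified FPI (1.8). The function name, tolerance, and stopping rule are our choices.

```python
import numpy as np

def modified_fpi(A, Q, eta, c=0.5, tol=1e-13, max_iter=20000):
    """Fixed-point iterations for X + A^T X^{-1} A = Q + i*eta*I.

    c = 1 gives the basic FPI (1.5); c = 0.5 gives the modified FPI (1.8).
    Starts from X_0 = Q_eta, which has positive definite imaginary part."""
    n = A.shape[0]
    Q_eta = Q + 1j * eta * np.eye(n)
    X = Q_eta.copy()
    for _ in range(max_iter):
        F = Q_eta - A.T @ np.linalg.solve(X, A)   # f(X) = Q_eta - A^T X^{-1} A
        X_new = (1.0 - c) * X + c * F             # averaged update
        if np.linalg.norm(X_new - X, 2) <= tol * np.linalg.norm(X_new, 2):
            return X_new
        X = X_new
    return X
```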

Our first contribution in this paper is a proof of convergence (to the desired solutions) of these two modified FPIs and their generalizations, for many choices of initial matrices. These methods can therefore be used as correction methods. In the process we show that the unique stabilizing solution $X = G_{L,S}(\eta)^{-1}$ of (1.4) is also the unique solution of (1.4) with a positive definite imaginary part. It follows that the imaginary part $X_I$ of the matrix $G_{L,S}^{-1}$ is positive semi-definite. Our second contribution in this paper is a determination of the rank of $X_I$ in terms of the number of eigenvalues on $\mathbb{T}$ of the quadratic pencil $\lambda^2 A^\top - \lambda Q + A$. Our third contribution is a structure-preserving algorithm that is applied directly to (1.4) with $\eta = 0$. In doing so, we work with real arithmetic most of the time.

2. Convergence analysis of fixed-point iterations. In this section we perform a convergence analysis for some FPIs, including the modified FPIs (1.7) and (1.8). The main tool we need is the following important result due to Earle and Hamilton [6]. The presentation here follows [14, Theorem 3.1] and its proof.

Theorem 2.1 (Earle–Hamilton theorem). Let $D$ be a nonempty domain in a complex Banach space $Z$ and let $h : D \to D$ be a bounded holomorphic function. If $h(D)$ lies strictly inside $D$, then $h$ has a unique fixed point in $D$. Moreover, the sequence $\{z_k\}$ defined by the fixed-point iteration $z_{k+1} = h(z_k)$ converges to this fixed point for any $z_0 \in D$.

Now let $Z$ be the complex Banach space $\mathbb{C}^{n\times n}$ equipped with the spectral norm. For any $K \in \mathbb{C}^{n\times n}$, its imaginary part is the Hermitian matrix
$$\mathrm{Im}\,K = \frac{1}{2i}(K - K^*).$$
For Hermitian matrices $X$ and $Y$, $X > Y$ ($X \ge Y$) means that $X - Y$ is positive definite (semi-definite). Let $D_+ = \{X \in \mathbb{C}^{n\times n} : \mathrm{Im}\,X > 0\}$ and $D_- = \{X \in \mathbb{C}^{n\times n} : \mathrm{Im}\,X < 0\}$.

We start with a proof of convergence of the basic FPI (1.5) for many different choices of $X_0$, not just for $X_0 = Q_\eta$.

Theorem 2.2. For any $X_0 \in D_+$, the sequence $\{X_k\}$ produced by the FPI (1.5) converges to the unique fixed point $X_\eta$ in $D_+$.

Proof. Let $D = \{X \in \mathbb{C}^{n\times n} : \mathrm{Im}\,X > \frac{\eta}{2}I\}$. Each $X \in D$ is invertible by Bendixson's theorem (see [26], for example), and we also have $\|X^{-1}\| < \frac{2}{\eta}$ (see [2, Corollary 4] or [13, Lemma 3.1]). Now let
$$f(X) = Q_\eta - A^\top X^{-1}A.$$
Then for $X \in D$,
$$\mathrm{Im}\,f(X) = \mathrm{Im}\,Q_\eta - \frac{1}{2i}\left(A^\top X^{-1}A - (A^\top X^{-1}A)^*\right) = \eta I + A^\top X^{-1}(\mathrm{Im}\,X)(A^\top X^{-1})^* \ge \eta I.$$
It follows that $f : D \to D$ is a bounded holomorphic function and $f(D)$ lies strictly inside $D$. By the Earle–Hamilton theorem, $f$ has a unique fixed point $X_\eta$ in $D$ and $X_k$ converges to $X_\eta$ for any $X_0 \in D$. The theorem is proved by noting that $X_1 \in D$ for any $X_0 \in D_+$ and that any fixed point $X_*$ in $D_+$ must be in $D$ by $X_* = Q_\eta - A^\top X_*^{-1}A$.

Remark 2.1. Since $\{X_k\}$ converges to $G_{L,S}(\eta)^{-1}$ for $X_0 = Q_\eta \in D_+$, we know that $X_\eta = G_{L,S}(\eta)^{-1}$ in Theorem 2.2. Thus $X_\eta$ is the unique solution of the equation (1.4) such that $\rho(X_\eta^{-1}A) < 1$, and it is also the unique solution of (1.4) in $D_+$. If we have obtained a particular solution of (1.4) by some method and would like to know whether it is the required solution, the latter condition is easier to check.

The matrix $G_{L,S}(\eta)$ can also be computed directly by using (1.6) for many different choices of $Y_0$. Note that the FPI (1.6) is $Y_{k+1} = \hat f(Y_k)$ with
$$\hat f(Y) = (Q_\eta - A^\top YA)^{-1}.$$

Corollary 2.3. For any $Y_0 \in D_-$, the sequence $\{Y_k\}$ produced by the FPI (1.6) converges to the unique fixed point $Y_\eta = G_{L,S}(\eta)$ in $D_-$.

Proof. For any $Y_0 \in D_-$, $Y_0$ is invertible by Bendixson's theorem. We now take $X_0 = Y_0^{-1}$ in (1.5). Then $X_0 \in D_+$, since $\mathrm{Im}\,X_0 = -Y_0^{-1}(\mathrm{Im}\,Y_0)Y_0^{-*}$. It follows that the sequence $\{Y_k\}$ is well defined and related to the sequence $\{X_k\}$ from (1.5) by $Y_k = X_k^{-1}$. Thus $\{Y_k\}$ converges to $Y_\eta = X_\eta^{-1} \in D_-$. Since $X_\eta$ is the unique fixed point of $f$ in $D_+$, $Y_\eta$ is the unique fixed point of $\hat f$ in $D_-$.

For faster convergence, we consider the modified FPI for (1.4)
$$X_{k+1} = (1 - c)X_k + c(Q_\eta - A^\top X_k^{-1}A), \quad 0 < c \le 1, \qquad (2.1)$$
or $X_{k+1} = g(X_k)$ with the function $g$ defined by
$$g(X) = (1 - c)X + cf(X). \qquad (2.2)$$
Note that $f(X) = X$ if and only if $g(X) = X$, so $f$ and $g$ have the same fixed points. Note also that the FPI (1.8) is a special case of the FPI (2.1) with $c = \frac12$. We can now prove the following general result.

Theorem 2.4. For any $X_0 \in D_+$, the FPI $X_{k+1} = g(X_k)$ converges to the unique fixed point $X_\eta$ in $D_+$.

Proof. For any $X_0 \in D_+$, $X_1$ is well defined and $\mathrm{Im}\,X_1 > c\eta I$. Let $b$ be any number such that $b > \|X_1\|$ and $b > 2\left(\|Q_\eta\| + \frac{1}{c\eta}\|A\|^2\right)$. Let $D = \{X \in \mathbb{C}^{n\times n} : \mathrm{Im}\,X > c\eta I,\ \|X\| < b\}$. So $X_1 \in D$. Each $X \in D$ is invertible with $\|X^{-1}\| < \frac{1}{c\eta}$, as before. Then for $X \in D$,
$$\mathrm{Im}\,g(X) = (1 - c)\,\mathrm{Im}\,X + c\,\mathrm{Im}\,f(X) > (1 - c)c\eta I + c\eta I = (2 - c)c\eta I,$$
and
$$\|g(X)\| \le (1 - c)\|X\| + c\left(\|Q_\eta\| + \frac{1}{c\eta}\|A\|^2\right) < (1 - c)b + \frac{cb}{2} = \left(1 - \frac{c}{2}\right)b.$$
It follows that $g : D \to D$ is a bounded holomorphic function and $g(D)$ lies strictly inside $D$. By the Earle–Hamilton theorem, $X_k$ converges to the unique fixed point of $g$ in $D$, which must be $X_\eta$.

Similarly, we consider the modified FPI
$$Y_{k+1} = (1 - c)Y_k + c(Q_\eta - A^\top Y_kA)^{-1}, \quad 0 < c < 1, \qquad (2.3)$$
or $Y_{k+1} = \hat g(Y_k)$ with the function $\hat g$ defined by
$$\hat g(Y) = (1 - c)Y + c\hat f(Y). \qquad (2.4)$$
The FPI (2.3) includes (1.7) as a special case. Note that $\hat f(Y) = Y$ if and only if $\hat g(Y) = Y$, so $\hat f$ and $\hat g$ have the same fixed points. However, there is no simple relation between $X_k$ from (2.1) and $Y_k$ from (2.3).

Theorem 2.5. For any $Y_0 \in D_-$, the FPI $Y_{k+1} = \hat g(Y_k)$ converges to the unique fixed point $Y_\eta$ in $D_-$.

Proof. Take $b > 2/\eta$, and let $D = \{Y \in \mathbb{C}^{n\times n} : \mathrm{Im}\,Y < 0,\ \|Y\| < b\}$. For any $Y \in D$, $\mathrm{Im}(Q_\eta - A^\top YA) \ge \eta I$. So $\hat g(Y)$ is well defined and $\|(Q_\eta - A^\top YA)^{-1}\| \le \frac{1}{\eta}$. Thus
$$\|\hat g(Y)\| < (1 - c)b + c\,\frac{1}{\eta} < \left(1 - \frac{1}{2}c\right)b.$$
Moreover,
$$\mathrm{Im}\,\hat g(Y) < c\,\mathrm{Im}(Q_\eta - A^\top YA)^{-1} = -c(Q_\eta - A^\top YA)^{-1}\,\mathrm{Im}(Q_\eta - A^\top YA)\,(Q_\eta - A^\top YA)^{-*} \le -c\eta\left[(Q_\eta - A^\top YA)^*(Q_\eta - A^\top YA)\right]^{-1} \le -c\eta\,\|Q_\eta - A^\top YA\|^{-2}I \le -\frac{c\eta}{(\|Q_\eta\| + b\|A\|^2)^2}\,I.$$
It follows that $\hat g : D \to D$ is a bounded holomorphic function and $\hat g(D)$ lies strictly inside $D$. By the Earle–Hamilton theorem, $Y_k$ converges to $Y_\eta$ for any $Y_0 \in D$, and hence for any $Y_0 \in D_-$, since we can take $b > \|Y_0\|$.

We remark that the modified FPI (2.1) is slightly less expensive than the modified FPI (2.3) per iteration. These two methods improve on the basic FPIs (1.5) and (1.6) in the same way, as explained below.

The rate of convergence of each FPI can be determined by computing the Fréchet derivative of the fixed-point mapping, as in [11]. For (1.5) and (1.6), we have
$$\limsup_{k\to\infty}\sqrt[k]{\|X_k - X_\eta\|} \le \left(\rho(X_\eta^{-1}A)\right)^2, \qquad \limsup_{k\to\infty}\sqrt[k]{\|Y_k - Y_\eta\|} \le \left(\rho(Y_\eta A)\right)^2,$$
where equality typically holds. Recall that $Y_\eta = X_\eta^{-1}$. Note also that if $Y_0 = X_0^{-1}$ (with $X_0 \in D_+$), then
$$\limsup_{k\to\infty}\sqrt[k]{\|Y_k - Y_\eta\|} = \limsup_{k\to\infty}\sqrt[k]{\|X_k - X_\eta\|}.$$
The Fréchet derivative at $X_\eta$ of the function $g$ in (2.2) is given by
$$g'(X_\eta)E = (1 - c)E + cA^\top X_\eta^{-1}EX_\eta^{-1}A.$$
It follows that for the FPI (2.1),
$$\limsup_{k\to\infty}\sqrt[k]{\|X_k - X_\eta\|} \le \rho\left((1 - c)I + c(A^\top X_\eta^{-1})\otimes(A^\top X_\eta^{-1})\right).$$

Similarly, the Fréchet derivative at $Y_\eta$ of the function $\hat g$ in (2.4) is given by
$$\hat g'(Y_\eta)E = (1 - c)E + cY_\eta A^\top EAY_\eta.$$
It follows that for the FPI (2.3),
$$\limsup_{k\to\infty}\sqrt[k]{\|Y_k - Y_\eta\|} \le \rho\left((1 - c)I + c(Y_\eta A^\top)\otimes(Y_\eta A^\top)\right).$$
The rate of convergence of both modified FPIs is then determined by
$$r_\eta = \max_{i,j}\left|1 - c + c\,\lambda_i(X_\eta^{-1}A)\,\lambda_j(X_\eta^{-1}A)\right|.$$
The convergence of the modified FPIs is often much faster because $r_\eta$ may be much smaller than 1 for a proper choice of $c$. An extreme example is the following.

Example 2.1. Consider the scalar equation (1.4) with $A = 1$ and $Q_\eta = \eta i$. It is easy to find that
$$X_\eta = \frac{1}{2}\left(\eta + \sqrt{4 + \eta^2}\right)i.$$
Thus $\rho(X_\eta^{-1}A) \to 1$ as $\eta \to 0^+$, while for $c = \frac12$ we have $r_\eta \to 0$ as $\eta \to 0^+$.

Note that for $i, j = 1, \ldots, n$, the $n^2$ numbers $\lambda_i(X_\eta^{-1}A)\lambda_j(X_\eta^{-1}A)$ are inside $\mathbb{T}$ for each $\eta > 0$. In the limit $\eta \to 0^+$, at least one of them is on $\mathbb{T}$ if $E \in \Delta_L$. So each of the basic FPIs (1.5) and (1.6) is expected to converge extremely slowly for $\eta$ close to 0, and a proper choice of the parameter $c$ is needed in (2.1) and (2.3). In this case, with $c = 1$, the basic FPIs (1.5) and (1.6) are recovered.

To get some insight, we first consider choosing $c \in (0, 1]$ such that, for fixed $(r, \theta)$,
$$p(c) = \left|1 - c + cre^{i\theta}\right|$$
is minimized. If $(r, \theta) = (1, 0)$, then $p(c) = 1$ for all $c \in (0, 1]$. So assume $(r, \theta) \ne (1, 0)$. In this case,
$$p(c) = \left|1 - re^{i\theta}\right|\left|\frac{1}{1 - re^{i\theta}} - c\right|$$
is minimized on $(0, 1]$ when
$$c = \min\left\{1,\ \mathrm{Re}\,\frac{1}{1 - re^{i\theta}}\right\} = \min\left\{1,\ \frac{1 - r\cos\theta}{1 + r^2 - 2r\cos\theta}\right\} \ge \frac12,$$
where we have used $1 - r\cos\theta - \frac12(1 + r^2 - 2r\cos\theta) = \frac12(1 - r^2) \ge 0$. It follows that $c = 1$ is the best choice when $\frac{1 - r\cos\theta}{1 + r^2 - 2r\cos\theta} \ge 1$ or, in other words, when $z = re^{i\theta}$ is in the disk $\{z \in \mathbb{C} : |z - \frac12| \le \frac12\}$. Note that $p(1) = r$. It also follows that $c = \frac12$ is the best choice when $r = 1$. Note that $p(\frac12) = \frac12\sqrt{1 + r^2 + 2r\cos\theta} = \frac12\sqrt{2(1 + \cos\theta)}$ for $r = 1$.

We know from [12] that the eigenvalues of $X_\eta^{-1}A$ are precisely the $n$ eigenvalues inside $\mathbb{T}$ of the quadratic pencil $\lambda^2 A^\top - \lambda Q_\eta + A$. We also know from Theorem 1.1 that the quadratic pencil $P(\lambda) = \lambda^2 A^\top - \lambda Q + A$ has some eigenvalues on $\mathbb{T}$ when $E \in \Delta_L$. We can then draw the following conclusions.

Remark 2.2. If $P(\lambda)$ has some eigenvalues near 1 or $-1$, the convergence of the FPI (2.1) is expected to be very slow for any choice of the parameter $c$; the DA is recommended in this case. If all eigenvalues of $P(\lambda)$ are clustered around $\pm i$, then the FPI (2.1) with $c = \frac12$ is expected to be very efficient. In the general case, the optimal $c$ is somewhere between $\frac12$ and 1. If all eigenvalues of $X_\eta^{-1}A$ are available, we can determine the optimal $c$ minimizing $r_\eta$ using the bisection procedure in [9, Section 6], with $[\frac12, 1]$ as the initial interval. In practice we would not compute these eigenvalues for every $E$ value. But we may use the DA to compute $X_\eta$ for one $E$ value, determine the optimal $c$ for this $E$, and use it as a suboptimal $c$ for many nearby $E$ values. If one does not want to compute any eigenvalues when using the FPI (2.1), then $c = \frac12$ is recommended, since this value is best at handling the eigenvalues of $X_\eta^{-1}A$ that are extremely close to $\mathbb{T}$, which are the eigenvalues responsible for the extremely slow convergence of the basic FPIs (1.5) and (1.6).
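As an illustration of this strategy, the following sketch (ours, and a plain grid search rather than the bisection procedure of [9]) evaluates $r_\eta$ directly from the eigenvalues of $X_\eta^{-1}A$ and picks a near-optimal $c$ in $[\frac12, 1]$.

```python
import numpy as np

def r_eta(c, lam):
    """r_eta(c) = max_{i,j} |1 - c + c*lam_i*lam_j|,
    where lam holds the eigenvalues of X_eta^{-1} A."""
    return np.max(np.abs(1.0 - c + c * np.multiply.outer(lam, lam)))

def suboptimal_c(lam, grid=2001):
    """Near-optimal c in [1/2, 1] by a dense grid search."""
    cs = np.linspace(0.5, 1.0, grid)
    vals = np.array([r_eta(c, lam) for c in cs])
    k = int(vals.argmin())
    return cs[k], vals[k]

# lam can be obtained as np.linalg.eigvals(np.linalg.solve(X_eta, A)).
```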

We note that the approximate solution from the DA or the FPI for any $E$ value is in $D_+$ and can be used as an initial guess for the exact solution when using the FPI for other $E$ values, with guaranteed convergence. However, even for small problems, the convergence of the FPI (2.1), with $c = \frac12$ for example, can still be very slow when $P(\lambda)$ has some eigenvalues near 1 or $-1$. Moreover, the latter situation will occur for some energy values, since $P(\lambda)$ is palindromic and thus, as $E$ varies, eigenvalues of $P(\lambda)$ typically leave or enter the unit circle through the points $\pm 1$. When $n$ is large, there may be some eigenvalues of $P(\lambda)$ near $\pm 1$ for almost all energy values of interest, in which case the convergence of the FPI may be very slow and other methods should be used.

3. Rank of $\mathrm{Im}(G_{L,S})$. The equation (1.4) has a unique stabilizing solution $X_\eta = G_{L,S}(\eta)^{-1}$ for any $\eta > 0$. So
$$X_\eta + A^\top X_\eta^{-1}A = Q_\eta \qquad (3.1)$$
with $\rho(X_\eta^{-1}A) < 1$. We also know that $X_\eta$ is complex symmetric. Write $X_\eta = X_{\eta,R} + iX_{\eta,I}$ with $X_{\eta,R}^\top = X_{\eta,R}$, $X_{\eta,I}^\top = X_{\eta,I} \in \mathbb{R}^{n\times n}$. We know from the previous section that $\mathrm{Im}(X_\eta) = X_{\eta,I} > 0$. Let
$$\varphi_\eta(\lambda) = \lambda A^\top + \lambda^{-1}A - Q_\eta. \qquad (3.2)$$
By (3.1), the rational matrix-valued function $\varphi_\eta(\lambda)$ has the factorization
$$\varphi_\eta(\lambda) = \left(\lambda^{-1}I - S_\eta^\top\right)X_\eta\left(-\lambda I + S_\eta\right),$$
where $S_\eta = X_\eta^{-1}A$. Let $X = \lim_{\eta\to 0^+}X_\eta = G_{L,S}^{-1}$. Then
$$X + A^\top X^{-1}A = Q \qquad (3.3)$$
with $\rho(X^{-1}A) \le 1$ and $\mathrm{Im}(X) \ge 0$. Note that $\varphi_0(\lambda) = \lambda A^\top + \lambda^{-1}A - Q$ has the factorization
$$\varphi_0(\lambda) = \left(\lambda^{-1}I - S^\top\right)X\left(-\lambda I + S\right), \qquad (3.4)$$
where $S = X^{-1}A$. In particular, $\varphi_0(\lambda)$ is regular, i.e., its determinant is not identically zero. In this section we will determine the rank of $\mathrm{Im}(X)$, which is the same as the rank of $\mathrm{Im}(G_{L,S})$ since $\mathrm{Im}(G_{L,S}) = \mathrm{Im}(X^{-1}) = -X^{-1}\,\mathrm{Im}(X)\,X^{-*}$.
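As a quick numerical sanity check (a sketch of ours, assuming a complex symmetric solution $X$ of (3.3) is available, e.g., from the methods above), one can verify both the residual of (3.3) and the factorization (3.4) at a test point:

```python
import numpy as np

def check_solution(X, A, Q, lam=0.3 + 0.4j):
    """Residual of (3.3) and of the factorization (3.4) of
    phi_0(lambda) = lam*A^T + (1/lam)*A - Q at one test point lam;
    assumes X is complex symmetric (X^T = X)."""
    n = A.shape[0]
    S = np.linalg.solve(X, A)                   # S = X^{-1} A
    res = np.linalg.norm(X + A.T @ S - Q, 2)    # residual of (3.3)
    phi = lam * A.T + A / lam - Q
    fact = (np.eye(n) / lam - S.T) @ X @ (S - lam * np.eye(n))
    return res, np.linalg.norm(phi - fact, 2)   # both ~ 0 for a true solution
```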

Let
$$M = \begin{bmatrix} A & 0 \\ Q & -I \end{bmatrix}, \qquad L = \begin{bmatrix} 0 & I \\ A^\top & 0 \end{bmatrix}. \qquad (3.5)$$
Then the pencil $M - \lambda L$, also denoted by $(M, L)$, is a linearization of the quadratic matrix polynomial
$$P(\lambda) = \lambda\varphi_0(\lambda) = \lambda^2A^\top - \lambda Q + A. \qquad (3.6)$$
It is easy to check that $y$ and $z$ are right and left eigenvectors, respectively, corresponding to an eigenvalue $\lambda$ of $P(\lambda)$ if and only if
$$\begin{bmatrix} y \\ Qy - \lambda A^\top y \end{bmatrix}, \qquad \begin{bmatrix} z \\ -\bar\lambda z \end{bmatrix} \qquad (3.7)$$
are right and left eigenvectors of $(M, L)$, respectively.
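The linearization is straightforward to form; the following sketch (helper name ours) builds $(M, L)$ so that a generalized eigensolver applied to the pair returns the $2n$ eigenvalues of $P(\lambda)$.

```python
import numpy as np
from scipy.linalg import eig

def linearization(A, Q):
    """The pencil (M, L) of (3.5), a linearization of
    P(lambda) = lambda^2 A^T - lambda Q + A."""
    n = A.shape[0]
    O, I = np.zeros((n, n)), np.eye(n)
    M = np.block([[A, O], [Q, -I]])
    L = np.block([[O, I], [A.T, O]])
    return M, L

# The generalized eigenvalues of M - lambda*L are the 2n eigenvalues of P:
# M, L = linearization(A, Q); lams = eig(M, L, right=False)
```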

Theorem 3.1. Suppose that $\lambda_0$ is a semi-simple eigenvalue of $\varphi_0(\lambda)$ on $\mathbb{T}$ with multiplicity $m_0$ and that the columns of $Y \in \mathbb{C}^{n\times m_0}$ form an orthonormal basis of right eigenvectors corresponding to $\lambda_0$. Then $iY^*(2\lambda_0A^\top - Q)Y$ is a nonsingular Hermitian matrix. Let $d_j$, $j = 1, \ldots, \ell$, be the distinct eigenvalues of $iY^*(2\lambda_0A^\top - Q)Y$ with multiplicities $m_0^j$, and let the columns of $\xi_j \in \mathbb{C}^{m_0\times m_0^j}$ form an orthonormal basis of the eigenspace corresponding to $d_j$. Then for $\eta > 0$ sufficiently small and $j = 1, \ldots, \ell$,
$$\lambda_{j,\eta}^{(k)} = \lambda_0 - \frac{\lambda_0}{d_j}\eta + O(\eta^2), \quad k = 1, \ldots, m_0^j, \qquad \text{and} \qquad y_{j,\eta} = Y\xi_j + O(\eta) \qquad (3.8)$$
are the perturbed eigenvalues and a basis of the corresponding invariant subspace of $\varphi_\eta(\lambda)$, respectively.


Proof. Since $P(\lambda_0)Y = \lambda_0\varphi_0(\lambda_0)Y = 0$ with $Y^*Y = I_{m_0}$ and $|\lambda_0| = 1$, we have
$$0 = (P(\lambda_0)Y)^* = \frac{1}{\lambda_0^2}\,Y^*\left(\lambda_0^2A^\top - \lambda_0Q + A\right).$$
It follows that the columns of $Y$ also form an orthonormal basis of left eigenvectors of $P(\lambda)$ corresponding to $\lambda_0$. From (3.7), we obtain that the columns of
$$Y_R = \begin{bmatrix} Y \\ QY - \lambda_0A^\top Y \end{bmatrix} \qquad \text{and} \qquad Y_L = \begin{bmatrix} Y \\ -\bar\lambda_0Y \end{bmatrix}$$
form bases of the right and left eigenspaces of $M - \lambda L$ corresponding to $\lambda_0$, respectively. Since $\lambda_0$ is semi-simple, the matrix
$$\left[Y^*,\ -\lambda_0Y^*\right]L\begin{bmatrix} Y \\ QY - \lambda_0A^\top Y \end{bmatrix} = -Y^*(2\lambda_0A^\top - Q)Y = -Y^*P'(\lambda_0)Y$$
is nonsingular. Let
$$\widetilde Y_R = -Y_R\left(Y^*P'(\lambda_0)Y\right)^{-1}, \qquad \widetilde Y_L = Y_L.$$
Then we have
$$\widetilde Y_L^*L\widetilde Y_R = I_{m_0} \qquad \text{and} \qquad \widetilde Y_L^*M\widetilde Y_R = \lambda_0I_{m_0}. \qquad (3.9)$$

For $\eta > 0$ sufficiently small, we consider the perturbation of $P(\lambda)$ given by $P(\lambda) - \lambda i\eta I = \lambda^2A^\top - \lambda(Q + i\eta I) + A = \lambda\varphi_\eta(\lambda)$. Let
$$M_\eta = \begin{bmatrix} A & 0 \\ Q + i\eta I & -I \end{bmatrix}.$$
Then $M_\eta - \lambda L$ is a linearization of $\lambda\varphi_\eta(\lambda)$. By (3.9) and [24, Chapter VI, Theorem 2.12] there are $\widehat Y_R$ and $\widehat Y_L$ such that $[\widetilde Y_R\ \widehat Y_R]$ and $[\widetilde Y_L\ \widehat Y_L]$ are nonsingular and
$$\begin{bmatrix} \widetilde Y_L^* \\ \widehat Y_L^* \end{bmatrix}M\begin{bmatrix} \widetilde Y_R & \widehat Y_R \end{bmatrix} = \begin{bmatrix} \lambda_0I_{m_0} & 0 \\ 0 & \widehat M \end{bmatrix}, \qquad \begin{bmatrix} \widetilde Y_L^* \\ \widehat Y_L^* \end{bmatrix}L\begin{bmatrix} \widetilde Y_R & \widehat Y_R \end{bmatrix} = \begin{bmatrix} I_{m_0} & 0 \\ 0 & \widehat L \end{bmatrix}.$$
Then, by [24, Chapter VI, Theorem 2.15] there exist matrices $\Delta_1(\eta) = O(\eta)$ and $\Delta_2(\eta) = O(\eta^2)$ such that the columns of $\widetilde Y_R + \Delta_1(\eta)$ span the right eigenspace of $(M_\eta, L)$ corresponding to $(\lambda_0I_{m_0} + \eta E_{11} + \Delta_2(\eta),\ I_{m_0})$, where
$$E_{11} = \widetilde Y_L^*\begin{bmatrix} 0 & 0 \\ iI & 0 \end{bmatrix}\widetilde Y_R = \lambda_0Y^*(iI)Y\left(Y^*P'(\lambda_0)Y\right)^{-1} = -\lambda_0\left(iY^*(2\lambda_0A^\top - Q)Y\right)^{-1}. \qquad (3.10)$$
The matrix $iY^*(2\lambda_0A^\top - Q)Y$ in (3.10) is Hermitian, since
$$iY^*(2\lambda_0A^\top - Q)Y = iY^*\varphi_0(\lambda_0)Y + i\lambda_0Y^*A^\top Y - i\bar\lambda_0Y^*AY = i\lambda_0Y^*A^\top Y + \left(i\lambda_0Y^*A^\top Y\right)^*. \qquad (3.11)$$


Let $d_j$, $j = 1, \ldots, \ell$, be the distinct eigenvalues of $iY^*(2\lambda_0A^\top - Q)Y$ with multiplicities $m_0^j$, and let the columns of $\xi_j \in \mathbb{C}^{m_0\times m_0^j}$ form an orthonormal basis of the eigenspace corresponding to $d_j$. Then we have
$$\Phi^*E_{11}\Phi = \mathrm{diag}\left(-\frac{\lambda_0}{d_1}I_{m_0^1}, \ldots, -\frac{\lambda_0}{d_\ell}I_{m_0^\ell}\right),$$
where $\Phi = [\xi_1, \ldots, \xi_\ell] \in \mathbb{C}^{m_0\times m_0}$. It follows that $\lambda_0I_{m_0} + \eta E_{11} + \Delta_2(\eta)$ is similar to
$$\lambda_0I_{m_0} + \mathrm{diag}\left(-\frac{\lambda_0}{d_1}\eta I_{m_0^1}, \ldots, -\frac{\lambda_0}{d_\ell}\eta I_{m_0^\ell}\right) + \Delta_3(\eta)$$
for some $\Delta_3(\eta) = O(\eta^2)$. Then for each $j \in \{1, 2, \ldots, \ell\}$, the perturbed eigenvalues $\lambda_{j,\eta}^{(k)}$, $k = 1, \ldots, m_0^j$, and a basis of the corresponding invariant subspace of $M_\eta - \lambda L$, with $\lambda_{j,\eta}^{(k)}\big|_{\eta=0} = \lambda_0$, can be expressed as
$$\lambda_{j,\eta}^{(k)} = \lambda_0 - \frac{\lambda_0}{d_j}\eta + O(\eta^2), \quad k = 1, \ldots, m_0^j, \qquad (3.12a)$$
and
$$\zeta_{j,\eta} = Y_R\xi_j + O(\eta). \qquad (3.12b)$$
The second equation in (3.8) follows from (3.12b).

Lemma 3.2. Let $Z_\eta$ be the solution of the equation
$$Z_\eta - R_\eta^*Z_\eta R_\eta = \eta W_\eta, \qquad \eta > 0, \qquad (3.13)$$
where $W_\eta \in \mathbb{C}^{m\times m}$ is positive definite, $R_\eta = e^{i\theta}I_m + \eta E_\eta$ with $\theta \in [0, 2\pi]$ fixed, and $E_\eta \in \mathbb{C}^{m\times m}$ is uniformly bounded and such that $\rho(R_\eta) < 1$. Then $Z_\eta$ is positive definite. Furthermore, if $Z_\eta$ converges to $Z_0$ and $W_\eta$ converges to a positive definite matrix $W_0$ as $\eta \to 0^+$, then $Z_0$ is also positive definite.

Proof. Since $\rho(R_\eta) < 1$ and $\eta W_\eta$ is positive definite, it is well known that $Z_\eta$ is uniquely determined by (3.13) and is positive definite. Since $E_\eta$ is bounded, we have from (3.13) that
$$\eta W_\eta = Z_\eta - \left(e^{-i\theta}I_m + \eta E_\eta^*\right)Z_\eta\left(e^{i\theta}I_m + \eta E_\eta\right) = -\eta e^{i\theta}E_\eta^*Z_\eta - \eta e^{-i\theta}Z_\eta E_\eta + O(\eta^2).$$
This implies that
$$W_\eta = -e^{i\theta}E_\eta^*Z_\eta - e^{-i\theta}Z_\eta E_\eta + O(\eta). \qquad (3.14)$$
If $Z_\eta$ converges to $Z_0$ as $\eta \to 0^+$, then $Z_0$ is positive semi-definite. To prove that $Z_0$ is positive definite, it suffices to show that $Z_0$ is nonsingular. Suppose that $x \in \mathbb{C}^m$ is such that $Z_0x = 0$. Then we have $Z_\eta x \to 0$ and $x^*Z_\eta \to 0$ as $\eta \to 0^+$. Multiplying (3.14) by $x^*$ and $x$ from the left and right, respectively, we have
$$x^*W_\eta x = -e^{i\theta}x^*E_\eta^*Z_\eta x - e^{-i\theta}x^*Z_\eta E_\eta x + O(\eta) \to 0, \quad \text{as } \eta \to 0^+.$$
Thus $x = 0$, because $W_\eta$ converges to $W_0$ and $W_0$ is positive definite. It follows that $Z_0$ is positive definite.
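In floating point, the Stein equation (3.13) can be solved with a standard discrete Lyapunov solver; a minimal sketch (the wrapper is ours):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def stein_solution(R, W, eta):
    """Solve Z - R^* Z R = eta * W, i.e., equation (3.13); requires rho(R) < 1.

    solve_discrete_lyapunov(a, q) solves a X a^H - X + q = 0, i.e.
    X - a X a^H = q, so taking a = R^* yields X - R^* X R = q."""
    return solve_discrete_lyapunov(R.conj().T, eta * W)
```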

Theorem 3.3. The number of eigenvalues (counting multiplicities) of $\varphi_0(\lambda)$ on $\mathbb{T}$ must be even, say $2m$. Suppose that $X = \lim_{\eta\to 0^+}X_\eta$ is invertible, and write $X = X_R + iX_I$ with $X_R^\top = X_R$, $X_I^\top = X_I \in \mathbb{R}^{n\times n}$. Then:
(i) $\mathrm{rank}(X_I) \le m$;
(ii) $\mathrm{rank}(X_I) = m$, if all eigenvalues of $\varphi_0(\lambda)$ on $\mathbb{T}$ are semi-simple and $\|S_\eta - S\|_2 = O(\eta)$ for $\eta > 0$ sufficiently small, where $S_\eta = X_\eta^{-1}A$ and $S = X^{-1}A$;
(iii) $\mathrm{rank}(X_I) = m$, if all eigenvalues of $\varphi_0(\lambda)$ on $\mathbb{T}$ are semi-simple and each unimodular eigenvalue of multiplicity $m_j$ is perturbed to $m_j$ eigenvalues (of $\varphi_\eta(\lambda)$) inside the unit circle or to $m_j$ eigenvalues outside the unit circle.

Proof. Consider the real quadratic pencil $P(\lambda) = \lambda\varphi_0(\lambda) = \lambda^2A^\top - \lambda Q + A$, so $P(\lambda)$ and $\varphi_0(\lambda)$ have the same eigenvalues on $\mathbb{T}$. If $\lambda_0 \ne \pm 1$ is an eigenvalue of $P(\lambda)$ on $\mathbb{T}$ with multiplicity $m_0$, then so is $\bar\lambda_0$. Thus the total number of nonreal eigenvalues of $P(\lambda)$ on $\mathbb{T}$ must be even. Now the quadratic pencil $P_\eta(\lambda) = \lambda^2A^\top - \lambda(Q + i\eta I) + A$ is $\top$-palindromic, and it has no eigenvalues on $\mathbb{T}$ for any $\eta \ne 0$ [12]. If 1 (or $-1$) is an eigenvalue of $P(\lambda)$ with multiplicity $r$ and $Q$ in $P(\lambda)$ is perturbed to $Q + i\eta I$, then half of these $r$ eigenvalues are perturbed to the inside of $\mathbb{T}$ and the other half to the outside of $\mathbb{T}$. This means that $r$ must be even. Thus the total number of eigenvalues of $\varphi_0(\lambda)$ on $\mathbb{T}$ is also even; we denote it by $2m$.

(i) By $X_\eta + A^\top X_\eta^{-1}A = Q_\eta$ we have
$$i(Q_\eta^* - Q_\eta) = i(X_\eta^* - X_\eta) - iA^\top(X_\eta^{-1} - X_\eta^{-*})A = i(X_\eta^* - X_\eta) - (X_\eta^{-1}A)^*\,i(X_\eta^* - X_\eta)\,(X_\eta^{-1}A).$$
Thus
$$K_\eta - S_\eta^*K_\eta S_\eta = 2\eta I, \qquad (3.15)$$
where $K_\eta = i(X_\eta^* - X_\eta) = 2X_{\eta,I}$. Note that the eigenvalues of $S_\eta = X_\eta^{-1}A$ are the eigenvalues of $P_\eta(\lambda)$ inside $\mathbb{T}$. Since $X = \lim_{\eta\to 0^+}X_\eta$ is invertible, we have $S = X^{-1}A = \lim_{\eta\to 0^+}S_\eta$. Let
$$S = V_0\begin{bmatrix} R_{0,1} & 0 \\ 0 & R_{0,2} \end{bmatrix}V_0^{-1} \qquad (3.16)$$
be a spectral resolution of $S$, where $R_{0,1} \in \mathbb{C}^{m\times m}$ and $R_{0,2} \in \mathbb{C}^{(n-m)\times(n-m)}$ are upper triangular with $\sigma(R_{0,1}) \subseteq \mathbb{T}$ and $\sigma(R_{0,2}) \subseteq \mathbb{D} \equiv \{\lambda \in \mathbb{C} : |\lambda| < 1\}$, and $V_0 = [V_{0,1}, V_{0,2}]$ with $V_{0,1} \in \mathbb{C}^{n\times m}$ and $V_{0,2} \in \mathbb{C}^{n\times(n-m)}$ having unit column vectors. It follows from [24, Chapter V, Theorem 2.8] that there is a nonsingular matrix $V_\eta = [V_{\eta,1}, V_{\eta,2}]$ with $V_{\eta,1} \in \mathbb{C}^{n\times m}$ and $V_{\eta,2} \in \mathbb{C}^{n\times(n-m)}$ such that
$$S_\eta = V_\eta\begin{bmatrix} R_{\eta,1} & 0 \\ 0 & R_{\eta,2} \end{bmatrix}V_\eta^{-1}, \qquad (3.17)$$
and $R_{\eta,1} \to R_{0,1}$, $R_{\eta,2} \to R_{0,2}$, and $V_\eta \to V_0$ as $\eta \to 0^+$. From (3.15) and (3.17) we have
$$V_\eta^*K_\eta V_\eta - \begin{bmatrix} R_{\eta,1}^* & 0 \\ 0 & R_{\eta,2}^* \end{bmatrix}V_\eta^*K_\eta V_\eta\begin{bmatrix} R_{\eta,1} & 0 \\ 0 & R_{\eta,2} \end{bmatrix} = 2\eta V_\eta^*V_\eta. \qquad (3.18)$$
Let
$$H_\eta = V_\eta^*K_\eta V_\eta = \begin{bmatrix} H_{\eta,1} & H_{\eta,3} \\ H_{\eta,3}^* & H_{\eta,2} \end{bmatrix}, \qquad V_\eta^*V_\eta = \begin{bmatrix} W_{\eta,1} & W_{\eta,3} \\ W_{\eta,3}^* & W_{\eta,2} \end{bmatrix}. \qquad (3.19)$$


Then (3.18) becomes
$$H_{\eta,1} - R_{\eta,1}^*H_{\eta,1}R_{\eta,1} = 2\eta W_{\eta,1}, \qquad (3.20a)$$
$$H_{\eta,2} - R_{\eta,2}^*H_{\eta,2}R_{\eta,2} = 2\eta W_{\eta,2}, \qquad (3.20b)$$
$$H_{\eta,3} - R_{\eta,1}^*H_{\eta,3}R_{\eta,2} = 2\eta W_{\eta,3}. \qquad (3.20c)$$
As $\eta \to 0^+$, $R_{\eta,1} \to R_{0,1}$ with $\rho(R_{0,1}) = 1$, $R_{\eta,2} \to R_{0,2}$ with $\rho(R_{0,2}) < 1$, and $W_{\eta,2}$ and $W_{\eta,3}$ are bounded. So we have $H_{\eta,2} \to 0$ from (3.20b) and $H_{\eta,3} \to 0$ from (3.20c). It follows from (3.19) that $K_\eta = 2X_{\eta,I}$ converges to $K_0 = 2X_I$ with $\mathrm{rank}(X_I) \le m$.

(ii) Suppose that all eigenvalues of $\varphi_0(\lambda)$ on $\mathbb{T}$ are semi-simple and that $\|S_\eta - S\|_2 = O(\eta)$ for $\eta > 0$ sufficiently small. We will show that $H_{\eta,1}$ in (3.20a) converges to $H_{0,1}$ with $\mathrm{rank}(H_{0,1}) = m$. Let $\lambda_1, \ldots, \lambda_r \in \mathbb{T}$ be the distinct semi-simple eigenvalues of $S$ with multiplicities $m_1, \ldots, m_r$, respectively. Then (3.16) can be written as
$$S = V_0\begin{bmatrix} D_{0,1} & 0 \\ 0 & R_{0,2} \end{bmatrix}V_0^{-1},$$
where $D_{0,1} = \mathrm{diag}\{\lambda_1I_{m_1}, \ldots, \lambda_rI_{m_r}\}$, $V_0 = [V_{0,\lambda_1}, \ldots, V_{0,\lambda_r}, V_{0,2}]$, and $\sum_{i=1}^rm_i = m$. Now $S_\eta = S + (S_\eta - S)$ with $\|S_\eta - S\|_2 = O(\eta)$. By repeated application of [24, Chapter V, Theorem 2.8] there is a nonsingular matrix $V_\eta = [V_{\eta,\lambda_1}, \ldots, V_{\eta,\lambda_r}, V_{\eta,2}] \in \mathbb{C}^{n\times n}$ such that
$$S_\eta = V_\eta\begin{bmatrix} D_{0,1} + \eta E_{\eta,1} & 0 \\ 0 & R_{0,2} + \eta E_{\eta,2} \end{bmatrix}V_\eta^{-1}$$
and $V_\eta \to V_0$ as $\eta \to 0^+$, where $E_{\eta,1} = \mathrm{diag}\{E_{m_1,\eta}^1, \ldots, E_{m_r,\eta}^1\}$ with $E_{m_j,\eta}^1 \in \mathbb{C}^{m_j\times m_j}$ and $E_{\eta,2} \in \mathbb{C}^{(n-m)\times(n-m)}$ are such that $\|E_{m_j,\eta}^1\|_2 = O(1)$ for $j = 1, \ldots, r$ and $\|E_{\eta,2}\|_2 = O(1)$.

The equation (3.20a) can then be written as
$$H_{\eta,1} - (D_{0,1} + \eta E_{\eta,1})^*H_{\eta,1}(D_{0,1} + \eta E_{\eta,1}) = 2\eta W_{\eta,1}. \qquad (3.21)$$
Since $D_{0,1} + \eta E_{\eta,1}$ is a block diagonal matrix and all eigenvalues of its $j$th diagonal block converge to $\lambda_j$, with the $\lambda_j$ distinct numbers on $\mathbb{T}$, we have
$$H_{\eta,1} = \mathrm{diag}\{H_{m_1,\eta}^1, \ldots, H_{m_r,\eta}^1\} + O(\eta),$$
where $\mathrm{diag}\{H_{m_1,\eta}^1, \ldots, H_{m_r,\eta}^1\}$ is the block diagonal part of $H_{\eta,1}$. Then (3.21) gives
$$H_{m_j,\eta}^1 - (\lambda_jI_{m_j} + \eta E_{m_j,\eta}^1)^*H_{m_j,\eta}^1(\lambda_jI_{m_j} + \eta E_{m_j,\eta}^1) = 2\eta W_{m_j,\eta}^1, \quad j = 1, \ldots, r,$$
where $W_{m_j,\eta}^1$ is the $j$th diagonal block of $W_{\eta,1}$. Since $W_{\eta,1}$ is positive definite and converges to a positive definite matrix, the $W_{m_j,\eta}^1$, $j = 1, \ldots, r$, are also positive definite and converge to positive definite matrices. For $\eta > 0$, we have $\rho(\lambda_jI_{m_j} + \eta E_{m_j,\eta}^1) < 1$ for $j = 1, \ldots, r$, since $\rho(S_\eta) < 1$. By the assumption that $X_\eta$ converges to $X$, we have that $H_{m_j,\eta}^1$ converges to $H_{m_j,0}^1$ for $j = 1, \ldots, r$. From Lemma 3.2, we obtain that $H_{m_j,0}^1$ is positive definite for $j = 1, \ldots, r$. Hence, $H_{\eta,1}$ converges to $H_{0,1}$ with $\mathrm{rank}(H_{0,1}) = m$. It follows from (3.19) that $K_\eta = 2X_{\eta,I}$ converges to $K_0 = 2X_I$ with $\mathrm{rank}(X_I) = m$.


(iii) It suffices to show that $\|S_\eta - S\|_2 = O(\eta)$ for $\eta > 0$ sufficiently small. Since $X$ is a solution of $X + A^\top X^{-1}A = Q$, we have
$$M\begin{bmatrix} I \\ X \end{bmatrix} = L\begin{bmatrix} I \\ X \end{bmatrix}S,$$
where the pencil $(M, L)$ is defined in (3.5). Under the condition in (iii), the column space of $\begin{bmatrix} I \\ X \end{bmatrix}$ is a simple eigenspace of $(M, L)$, in the terminology of [24]. It follows from [24, Chapter VI, Theorems 2.12 and 2.15] that
$$\begin{bmatrix} A & 0 \\ Q + i\eta I & -I \end{bmatrix}\begin{bmatrix} I + \eta F_{\eta,1} \\ X + \eta F_{\eta,2} \end{bmatrix} = \begin{bmatrix} 0 & I \\ A^\top & 0 \end{bmatrix}\begin{bmatrix} I + \eta F_{\eta,1} \\ X + \eta F_{\eta,2} \end{bmatrix}(S + \eta E_\eta),$$
where $F_{\eta,1}, F_{\eta,2}, E_\eta \in \mathbb{C}^{n\times n}$ with $\max\{\|F_{\eta,1}\|_2, \|F_{\eta,2}\|_2, \|E_\eta\|_2\} \le c$ for some $c > 0$ and all $\eta > 0$ sufficiently small. It is easily seen that
$$X_\eta = (X + \eta F_{\eta,2})(I + \eta F_{\eta,1})^{-1}, \qquad S_\eta = X_\eta^{-1}A = (I + \eta F_{\eta,1})(S + \eta E_\eta)(I + \eta F_{\eta,1})^{-1}.$$
It follows that $\|S_\eta - S\|_2 = O(\eta)$ for $\eta > 0$ sufficiently small.

Remark 3.1. Without the additional conditions in Theorem 3.3 (ii) or (iii), $\mathrm{rank}(X_I)$ could be much smaller than $m$. Consider the example with $A = I_n$ and $Q = 2I_n$. Then $\varphi_0(\lambda)$ has all $2n$ eigenvalues at 1, with partial multiplicities 2. So $m = n$, but it is easy to see that $\mathrm{rank}(X_I) = 0$. (Indeed, here $X_\eta \to I$, so $X = I$ is real and $X_I = 0$.) For this example, we have $\|S_\eta - S\|_2 = O(\eta^{1/2})$ for $\eta > 0$ sufficiently small. We also know that the $2n$ eigenvalues of $\varphi_0(\lambda)$ at 1 are perturbed to $n$ eigenvalues inside the unit circle and $n$ eigenvalues outside the unit circle.

Corollary 3.4. If $\varphi_0(\lambda)$ has no eigenvalues on $\mathbb{T}$, then $X$ is real symmetric. Furthermore, $\mathrm{In}(X) = \mathrm{In}(-\varphi_0(1))$, where $\mathrm{In}(W)$ denotes the inertia of a matrix $W$.

Proof. From Theorem 3.3, it is easy to see that $X$ is a real symmetric matrix. Since $X$ is real, $S = X^{-1}A$ is a real matrix. Setting $\lambda = 1$ in (3.4), we get $\varphi_0(1) = -(I - S^\top)X(I - S)$. Hence, $\mathrm{In}(X) = \mathrm{In}(-\varphi_0(1))$.

Corollary 3.5. If all eigenvalues of $\varphi_0(\lambda)$ are on $\mathbb{T}$ and are simple, then $X_I$ is positive definite.

Proof. This follows immediately from Theorem 3.3 (iii).

4. A structure-preserving algorithm. As explained in [12] and also in this paper, the required solution $X = G_{L,S}^{-1}$ is a particular weakly stabilizing solution of (3.3) and is given by $X = \lim_{\eta\to 0^+}X_\eta$, where $X_\eta$ is the unique stabilizing solution of (1.4). We will call this particular solution the weakly stabilizing solution of (3.3). It can be approximated by $X_\eta$ for a small $\eta$. For a fixed $\eta > 0$, $X_\eta$ can be computed efficiently by the doubling algorithm studied in [12], for all energy values.

In this section we will develop a structure-preserving algorithm that can, in most cases, find the weakly stabilizing solution of (3.3) more efficiently and more accurately than the doubling algorithm, by working on the equation (3.3) directly.

Consider the pencil $(M, L)$ given by (3.5). The simple relation
$$M\begin{bmatrix} I \\ X \end{bmatrix} = L\begin{bmatrix} I \\ X \end{bmatrix}X^{-1}A$$
shows that the weakly stabilizing solution of (3.3) is obtained by $X = X_2X_1^{-1}$, where $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ forms (or, more precisely, the columns of $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ form) a basis for the invariant subspace of $(M, L)$ corresponding to its eigenvalues inside $\mathbb{T}$ and those of its eigenvalues on $\mathbb{T}$ that would be perturbed to the inside of $\mathbb{T}$ when $Q$ is replaced by $Q_\eta$ with $\eta > 0$.

We now assume that all unimodular eigenvalues $\lambda \ne \pm 1$ of $(M, L)$ are semi-simple and that the eigenvalues $\pm 1$ (if they exist) have partial multiplicities 2. This assumption seems to hold generically. Under this assumption, to compute the weakly stabilizing solution we need to include all linearly independent eigenvectors associated with the eigenvalues $\pm 1$, and use Theorem 3.1 to determine which half of the unimodular eigenvalues $\lambda \ne \pm 1$ should be included to compute the required invariant subspace.

We may use the QZ algorithm to determine this invariant subspace, but it is better to exploit the structure of the pencil $(M, L)$. We will use the same approach as in [15] to develop a structure-preserving algorithm (SA) to find a basis for the desired invariant subspace of $(M, L)$ and then compute the weakly stabilizing solution of (3.3). The algorithm is still based on the $(S + S^{-1})$-transform in [19] and Patel's algorithm in [23], but some new issues need to be addressed here.

It is well known that $(M, L)$ is a symplectic pair, i.e., $(M, L)$ satisfies $MJM^\top = LJL^\top$, where $J = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix}$. Furthermore, the eigenvalues of $(M, L)$ form reciprocal pairs $(\nu, 1/\nu)$, where we allow $\nu = 0, \infty$. We define the $(S + S^{-1})$-transform [19] of $(M, L)$ by
$$K := MJL^\top J + LJM^\top J = \begin{bmatrix} Q & A - A^\top \\ A^\top - A & Q \end{bmatrix}, \qquad (4.1a)$$
$$N := LJL^\top J = \begin{bmatrix} A & 0 \\ 0 & A^\top \end{bmatrix}. \qquad (4.1b)$$

Then $K$ and $N$ are both skew-Hamiltonian, i.e., $KJ = JK^\top$ and $NJ = JN^\top$. The relationship between the eigenvalues of $(M, L)$ and $(K, N)$, and their Kronecker structures, has been studied in [19, Theorem 3.2]. We will first extend that result to allow unimodular eigenvalues for $(M, L)$. The following preliminary result is needed.

Lemma 4.1. Let $N_r(\lambda) := \lambda I_r + N_r$, where $N_r$ is the nilpotent matrix with $N_r(i, i+1) = 1$, $i = 1, \ldots, r-1$, and zeros elsewhere. Let $\overset{\mathrm{eq.}}{\sim}$ denote equivalence between two matrix pairs (two matrix pairs $(Y_1, Y_2)$ and $(Z_1, Z_2)$ are called equivalent if there are nonsingular matrices $U$ and $V$ such that $UY_1V = Z_1$ and $UY_2V = Z_2$). Then:
(i) For $\lambda \ne 0, \pm 1$, $(N_r(\lambda) + N_r(\lambda)^{-1},\ I_r) \overset{\mathrm{eq.}}{\sim} (N_r(\lambda + 1/\lambda),\ I_r)$.
(ii) $(N_r^2 + I_r,\ N_r) \overset{\mathrm{eq.}}{\sim} (I_r,\ N_r)$.

Proof. (i) Since $\lambda \ne 0$, one can show that $N_r(\lambda)^{-1} \equiv [t_{j-i}]$ and $N_r(\lambda) + N_r(\lambda)^{-1} \equiv [s_{j-i}]$ are Toeplitz upper triangular with $t_k = (-1)^k\lambda^{-(k+1)}$ for $k = 0, 1, \ldots, r-1$, as well as $s_0 = \lambda + 1/\lambda$, $s_1 = 1 - \lambda^{-2}$, and $s_k = t_k$ for $k = 2, \ldots, r-1$. Since $\lambda \ne \pm 1$, $s_1 = 1 - \lambda^{-2}$ is nonzero. It follows that $(N_r(\lambda) + N_r(\lambda)^{-1},\ I_r) \overset{\mathrm{eq.}}{\sim} (N_r(\lambda + 1/\lambda),\ I_r)$.
(ii) $(I_r + N_r^2,\ N_r) \overset{\mathrm{eq.}}{\sim} (I_r,\ N_r(I_r + N_r^2)^{-1}) \overset{\mathrm{eq.}}{\sim} (I_r,\ N_r - N_r^3 + N_r^5 - \cdots) \overset{\mathrm{eq.}}{\sim} (I_r,\ N_r)$.

Theorem 4.2. Suppose that the eigenvalues $\pm 1$ of $(M, L)$, if any, have partial multiplicities 2. Let $\gamma = \lambda + 1/\lambda$ ($\lambda = 0, \infty$ permitted). Then $\lambda$ and $1/\lambda$ are eigenvalues of $(M, L)$ if and only if $\gamma$ is a double eigenvalue of $(K, N)$. Furthermore, for $\lambda \ne \pm 1$ (i.e., $\gamma \ne \pm 2$), $\gamma$, $\lambda$, and $1/\lambda$ have the same sizes of Jordan blocks, i.e., they have the same partial multiplicities; for $\lambda = \pm 1$, $\gamma = \pm 2$ are semi-simple eigenvalues of $(K, N)$.


Proof. By the results on the Kronecker canonical form for a symplectic pencil (see [20]) and by our assumption, there are nonsingular matrices $\mathcal{X}$ and $\mathcal{Y}$ such that
$$\mathcal{Y}M\mathcal{X} = \begin{bmatrix} \mathcal{J} & D \\ 0 & I_n \end{bmatrix}, \qquad \mathcal{Y}L\mathcal{X} = \begin{bmatrix} I_n & 0 \\ 0 & \mathcal{J} \end{bmatrix}, \qquad (4.2)$$
where $\mathcal{J} = \mathcal{J}_1 \oplus \mathcal{J}_s \oplus \mathcal{J}_0$, $\mathcal{J}_1 = I_p \oplus (-I_q)$, $\mathcal{J}_s$ is the direct sum of the Jordan blocks corresponding to the nonzero eigenvalues $\lambda_j$ of $(M, L)$ with $|\lambda_j| < 1$ or $\lambda_j = e^{i\theta_j}$ with $\mathrm{Im}(\lambda_j) > 0$, $\mathcal{J}_0$ is the direct sum of the nilpotent blocks corresponding to the zero eigenvalues, and $D = I_p \oplus I_q \oplus 0_{n-r}$ with $r = p + q$. (We write $\mathcal{J}$ for the Kronecker-form matrix to distinguish it from the symplectic matrix $J$.)

Let
$$\mathcal{X}^{-1}J\mathcal{X}^{-\top} = \begin{bmatrix} X_1 & X_2 \\ -X_2^\top & X_3 \end{bmatrix},$$
where $X_i \in \mathbb{C}^{n\times n}$, $i = 1, 2, 3$; thus $X_1^\top = -X_1$ and $X_3^\top = -X_3$. Using (4.2) in $MJM^\top = LJL^\top$, we get
$$\mathcal{J}X_1\mathcal{J}^\top - DX_2^\top\mathcal{J}^\top + \mathcal{J}X_2D + DX_3D = X_1, \qquad (4.3a)$$
$$\mathcal{J}X_2 + DX_3 = X_2\mathcal{J}^\top, \qquad (4.3b)$$
$$\mathcal{J}X_3\mathcal{J}^\top = X_3. \qquad (4.3c)$$
Let $\mathcal{J}_{s,0} = \mathcal{J}_s \oplus \mathcal{J}_0$. We partition
$$X_3 = \begin{bmatrix} X_{3,1} & X_{3,2} \\ -X_{3,2}^\top & X_{3,3} \end{bmatrix}, \qquad X_2 = \begin{bmatrix} X_{2,1} & X_{2,2} \\ X_{2,3} & X_{2,4} \end{bmatrix},$$
where $X_{2,1}, X_{3,1} \in \mathbb{C}^{r\times r}$. Comparing the diagonal blocks in (4.3b), we get
$$\mathcal{J}_1X_{2,1} + X_{3,1} = X_{2,1}\mathcal{J}_1^\top, \qquad (4.4a)$$
$$\mathcal{J}_{s,0}X_{2,4} = X_{2,4}\mathcal{J}_{s,0}^\top. \qquad (4.4b)$$
From (4.4a) we see that $X_{3,1}$ has the form $X_{3,1} = \begin{bmatrix} 0_p & \omega \\ -\omega^\top & 0_q \end{bmatrix}$. From (4.3c) we have $[I_p \oplus (-I_q)]X_{3,1}[I_p \oplus (-I_q)] = X_{3,1}$. It follows that $\omega = 0$ and thus $X_{3,1} = 0$. From (4.3c) we also have $\mathcal{J}_1X_{3,2}\mathcal{J}_{s,0}^\top = X_{3,2}$ and $\mathcal{J}_{s,0}X_{3,3}\mathcal{J}_{s,0}^\top = X_{3,3}$, from which we get $X_{3,2} = 0$ and $X_{3,3} = 0$. So we have $X_3 = 0$. Then (4.3b) becomes $\mathcal{J}X_2 = X_2\mathcal{J}^\top$, from which we get
$$X_{2,1} = \eta_p \oplus \eta_q, \qquad X_{2,2} = X_{2,3}^\top = 0, \qquad (4.5)$$
where $\eta_p \in \mathbb{C}^{p\times p}$ and $\eta_q \in \mathbb{C}^{q\times q}$. Moreover, $X_{2,1}$ and $X_{2,4}$ are nonsingular by the nonsingularity of $\mathcal{X}^{-1}J\mathcal{X}^{-\top}$. Substituting (4.3b) into (4.3a), we get
$$X_1 = \mathcal{J}X_1\mathcal{J}^\top - D\mathcal{J}X_2^\top + X_2\mathcal{J}^\top D \equiv \mathcal{J}X_1\mathcal{J}^\top + V, \qquad (4.6)$$
where
$$V = X_2\mathcal{J}^\top D - D\mathcal{J}X_2^\top = \begin{bmatrix} V_1 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{with} \quad V_1 = (\eta_p - \eta_p^\top) \oplus (\eta_q^\top - \eta_q)$$
by (4.5). Partition $X_1 = \begin{bmatrix} X_{1,1} & X_{1,2} \\ -X_{1,2}^\top & X_{1,3} \end{bmatrix}$ with $X_{1,1} \in \mathbb{C}^{r\times r}$. From the equations for the (1,2) and (2,2) blocks of (4.6), we get $X_{1,2} = 0$ and $X_{1,3} = 0$, respectively. Furthermore, from the equation for the (1,1) block in (4.6), we get $X_{1,1} = \xi_p \oplus \xi_q$ with $\xi_p \in \mathbb{C}^{p\times p}$ and $\xi_q \in \mathbb{C}^{q\times q}$.

From (4.2), (4.4), and Lemma 4.1 (ii) we now have
$$\begin{aligned}
(K, N) &\overset{\mathrm{eq.}}{\sim} \left(MJL^\top + LJM^\top,\ LJL^\top\right)\\
&\overset{\mathrm{eq.}}{\sim} \left(\begin{bmatrix} (\mathcal{J}_1X_{1,1} + X_{1,1}\mathcal{J}_1) \oplus 0_{n-r} & (\mathcal{J}_1^2 + I_r)X_{2,1} \oplus (\mathcal{J}_{s,0}^2 + I_{n-r})X_{2,4} \\ X_{2,1}(\mathcal{J}_1^2 + I_r) \oplus X_{2,4}^\top\big((\mathcal{J}_{s,0}^2)^\top + I_{n-r}\big) & 0_n \end{bmatrix},\right.\\
&\qquad\quad \left.\begin{bmatrix} X_{1,1} \oplus 0_{n-r} & \mathcal{J}_1X_{2,1} \oplus \mathcal{J}_{s,0}X_{2,4} \\ X_{2,1}\mathcal{J}_1 \oplus X_{2,4}^\top\mathcal{J}_{s,0}^\top & 0_n \end{bmatrix}\right)\\
&\overset{\mathrm{eq.}}{\sim} \left(\begin{bmatrix} 2\xi_p \oplus (-2\xi_q) & 2I_r \\ 2I_r & 0_r \end{bmatrix} \oplus (\mathcal{J}_s + \mathcal{J}_s^{-1}) \oplus I \oplus (\mathcal{J}_s + \mathcal{J}_s^{-1}) \oplus I,\right.\\
&\qquad\quad \left.\begin{bmatrix} \xi_p \oplus \xi_q & I_p \oplus (-I_q) \\ I_p \oplus (-I_q) & 0_r \end{bmatrix} \oplus I \oplus \mathcal{J}_0 \oplus I \oplus \mathcal{J}_0\right)\\
&\overset{\mathrm{eq.}}{\sim} \left(2I_{2p} \oplus (-2I_{2q}) \oplus (\mathcal{J}_s + \mathcal{J}_s^{-1}) \oplus I \oplus (\mathcal{J}_s + \mathcal{J}_s^{-1}) \oplus I,\ I_{2p} \oplus I_{2q} \oplus I \oplus \mathcal{J}_0 \oplus I \oplus \mathcal{J}_0\right).
\end{aligned}$$
The proof is completed by using Lemma 4.1 (i).

4.1. Development of SA. It is helpful to keep in mind that the transform $\lambda \to \gamma$ achieves the following:
$$\{0, \infty\} \to \infty, \qquad \mathbb{T} \to [-2, 2], \qquad \mathbb{R}\setminus\{\pm 1, 0\} \to \mathbb{R}\setminus[-2, 2], \qquad \mathbb{C}\setminus(\mathbb{R}\cup\mathbb{T}) \to \mathbb{C}\setminus\mathbb{R}.$$
By Theorem 4.2 and our assumption on the unimodular eigenvalues of $(M, L)$, all eigenvalues of $(K, N)$ in $[-2, 2]$ are semi-simple.

Based on Patel's approach [23], we first reduce $(K, N)$ to a block triangular matrix pair of the form
$$U^\top KZ = \begin{bmatrix} K_1 & K_2 \\ 0 & K_1^\top \end{bmatrix}, \qquad U^\top NZ = \begin{bmatrix} N_1 & N_2 \\ 0 & N_1^\top \end{bmatrix}, \qquad (4.7)$$
where $K_1$ and $N_1 \in \mathbb{R}^{n\times n}$ are upper Hessenberg and upper triangular, respectively, $K_2$ and $N_2$ are skew-symmetric, and $U, Z \in \mathbb{R}^{2n\times 2n}$ are orthogonal and satisfy $U^\top JZ = J$. By the QZ algorithm, we have orthogonal matrices $Q_1$ and $Z_1$ such that
$$Q_1K_1Z_1 = K_{11}, \qquad Q_1N_1Z_1 = N_{11}, \qquad (4.8)$$
where $K_{11}$ and $N_{11}$ are quasi-upper triangular and upper triangular, respectively.

From (4.7) and (4.8) we see that the pair $(K_{11}, N_{11})$ contains half of the eigenvalues of $(K, N)$. We now reduce $(K_{11}, N_{11})$ to the quasi-upper and upper block triangular matrix pair
$$\widetilde U^\top K_{11}\widetilde Z = \begin{bmatrix} I_{m_0} & \widehat K_{01} & \widehat K_{02} & \cdots & \widehat K_{0r} \\ & \Gamma_1 & \widehat K_{12} & \cdots & \widehat K_{1r} \\ & & \Gamma_2 & \ddots & \vdots \\ & & & \ddots & \widehat K_{r-1\,r} \\ & & & & \Gamma_r \end{bmatrix}, \qquad \widetilde U^\top N_{11}\widetilde Z = \mathrm{diag}\{\Gamma_0, I_{m_1}, I_{m_2}, \ldots, I_{m_r}\},$$
where $m_0 + m_1 + \cdots + m_r = n$, $\Gamma_0$ is strictly upper triangular, $\Gamma_1 = \mathrm{diag}\{g_1, \ldots, g_{m_1}\}$ with $g_k \in [-2, 2]$, and $\Gamma_2, \ldots, \Gamma_r$ are quasi-upper triangular with $\sigma(\Gamma_j) \subseteq \mathbb{C}\setminus[-2, 2]$ and $\sigma(\Gamma_j) \cap \sigma(\Gamma_i) = \emptyset$, $i \ne j$, $i, j = 2, \ldots, r$. Let
$$\widehat K_i = [\widehat K_{i\,i+1}, \ldots, \widehat K_{i\,r}], \quad i = 0, \ldots, r-1, \qquad \widehat\Gamma_j = \begin{bmatrix} \Gamma_j & \widehat K_j \\ 0 & \widehat\Gamma_{j+1} \end{bmatrix}, \quad \widehat\Gamma_r = \Gamma_r, \quad j = 1, \ldots, r-1.$$

By solving the Sylvester equations
$$\Gamma_0Z\widehat\Gamma_1 - Z = \widehat K_0, \qquad Z\widehat\Gamma_{i+1} - \Gamma_iZ = \widehat K_i, \quad i = 1, \ldots, r-1, \qquad (4.9)$$
we can reduce $(K_{11}, N_{11})$ to the quasi-upper and upper block diagonal matrix pair
$$\widehat U^\top K_{11}\widehat Z = \mathrm{diag}\{I_{m_0}, \Gamma_1, \Gamma_2, \ldots, \Gamma_r\}, \qquad \widehat U^\top N_{11}\widehat Z = \mathrm{diag}\{\Gamma_0, I_{m_1}, I_{m_2}, \ldots, I_{m_r}\}. \qquad (4.10)$$
The procedure here is very much like the block diagonalization described in Section 7.6.3 of [8]. Partition $\widehat Z = [\widehat Z_0, \widehat Z_1, \ldots, \widehat Z_r]$ with $\widehat Z_i \in \mathbb{R}^{n\times m_i}$ according to the block sizes in (4.10). It holds that
$$K_{11}\widehat Z_0\Gamma_0 = N_{11}\widehat Z_0, \qquad K_{11}\widehat Z_j = N_{11}\widehat Z_j\Gamma_j, \quad j = 1, \ldots, r. \qquad (4.11)$$
It follows that, for $Z_1$ from (4.8), $Z(:, 1\!:\!n)(Z_1\widehat Z_j)$ forms a basis for an invariant subspace of $(K, N)$ for $j = 0, 1, \ldots, r$. In particular, the columns of $Z(:, 1\!:\!n)(Z_1\widehat Z_1)$ are real eigenvectors of $(K, N)$ corresponding to the real eigenvalues in $[-2, 2]$. We then need to get a suitable invariant subspace of $(M, L)$ from each of these invariant subspaces of $(K, N)$.

We start with two lemmas about solving the quadratic equation $\gamma = \lambda + 1/\lambda$ in matrix form.

Lemma 4.3. Let
$$\Gamma_s = \begin{bmatrix} \gamma_{11} & \cdots & \gamma_{1m} \\ & \ddots & \vdots \\ 0 & & \gamma_{mm} \end{bmatrix} \qquad (4.12)$$
be a real quasi-upper triangular matrix, where each $\gamma_{ii}$ is a $1\times 1$ or $2\times 2$ block with $\sigma(\gamma_{ii}) \subseteq \mathbb{C}\setminus[-2, 2]$, $i = 1, \ldots, m$. Then the quadratic matrix equation
$$\Lambda_s^2 - \Gamma_s\Lambda_s + I = 0 \qquad (4.13)$$
in $\Lambda_s$ is uniquely solvable with $\Lambda_s$ real quasi-upper triangular, with the same block form as $\Gamma_s$ in (4.12) and $\sigma(\Lambda_s) \subseteq \mathbb{D} \equiv \{\lambda \in \mathbb{C} : |\lambda| < 1\}$.

Proof. Let
$$\Lambda_s = \begin{bmatrix} \lambda_{11} & \cdots & \lambda_{1m} \\ & \ddots & \vdots \\ 0 & & \lambda_{mm} \end{bmatrix}$$
have the same block form as $\Gamma_s$. We first solve for the diagonal blocks $\{\lambda_{ii}\}_{i=1}^m$ of $\Lambda_s$ from the quadratic equation $\lambda^2 - \gamma_{ii}\lambda + I_{[i]} = 0$, where $[i]$ denotes the size of $\gamma_{ii}$. Note that the scalar equation
$$\lambda^2 - \gamma\lambda + 1 = 0 \qquad (4.14)$$
has no solutions on $\mathbb{T}$ for $\gamma \in \mathbb{C}\setminus[-2, 2]$; so it always has one solution inside $\mathbb{T}$ and the other outside $\mathbb{T}$.

For $i = 1, \ldots, m$: if $\gamma_{ii} \in \mathbb{R}\setminus[-2, 2]$, then $\lambda_{ii} \in (-1, 1)$ is uniquely solved from (4.14) with $\gamma = \gamma_{ii}$. If $\gamma_{ii} \in \mathbb{R}^{2\times 2}$ with $\gamma_{ii}z = \gamma z$ for $z \ne 0$ and $\gamma \in \mathbb{C}\setminus\mathbb{R}$, then $\gamma_{ii} = [z, \bar z]\,\mathrm{diag}\{\gamma, \bar\gamma\}\,[z, \bar z]^{-1}$ and the required solution is $\lambda_{ii} = [z, \bar z]\,\mathrm{diag}\{\lambda, \bar\lambda\}\,[z, \bar z]^{-1} \in \mathbb{R}^{2\times 2}$, where $\lambda \in \mathbb{D}$ is uniquely solved from (4.14).

For $j > i$, comparing the $(i, j)$ blocks on both sides of (4.13) and using $\lambda_{ii} - \gamma_{ii} = -\lambda_{ii}^{-1}$, we get
$$\lambda_{ij}\lambda_{jj} - \lambda_{ii}^{-1}\lambda_{ij} = \gamma_{ij}\lambda_{jj} + \sum_{\ell=i+1}^{j-1}(\gamma_{i\ell} - \lambda_{i\ell})\lambda_{\ell j}.$$
Since $\sigma(\lambda_{ii}^{-1}) \cap \sigma(\lambda_{jj}) = \emptyset$ for $i, j = 1, \ldots, m$, the strictly upper triangular part of $\Lambda_s$ can be determined by the following recursive formula:

For $d = 1, \ldots, m-1$,
  For $i = 1, \ldots, m-d$, $j = i + d$,
    $A := \lambda_{jj}^\top \otimes I_{[i]} - I_{[j]} \otimes \lambda_{ii}^{-1}$,
    $b := \gamma_{ij}\lambda_{jj} + \sum_{\ell=i+1}^{j-1}(\gamma_{i\ell} - \lambda_{i\ell})\lambda_{\ell j}$,
    $\lambda_{ij} = \mathrm{vec}^{-1}(A^{-1}\mathrm{vec}(b))$,
  end i,
end d.

Here $\otimes$ denotes the Kronecker product, $\mathrm{vec}$ is the operation of stacking the columns of a matrix into a vector, and $\mathrm{vec}^{-1}$ is its inverse operation.
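For the scalar building block (4.14), the root selection can be coded directly; a small sketch (our own helper) that returns the root inside the unit disk for $\gamma \in \mathbb{C}\setminus[-2, 2]$, and the root in the closed upper half plane in the unimodular case $\gamma \in [-2, 2]$ used later in Step 5-2 of SA:

```python
import numpy as np

def lambda_from_gamma(gamma):
    """Solve lambda^2 - gamma*lambda + 1 = 0 (equation (4.14)).

    gamma in C \\ [-2, 2]: the two roots lam, 1/lam are off the unit circle;
    return the one with |lambda| < 1.
    gamma in [-2, 2]: both roots lie on T; return the one with Im >= 0."""
    r1 = (gamma + np.lib.scimath.sqrt(gamma**2 - 4.0)) / 2.0
    r2 = 1.0 / r1                          # the two roots multiply to 1
    if abs(abs(r1) - 1.0) < 1e-12:         # unimodular case
        return r1 if np.imag(r1) >= 0 else r2
    return r1 if abs(r1) < 1.0 else r2
```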

We remark that the equation (4.13) is a special case of the palindromic matrix equation studied recently in [16]. The desired solution $\Lambda_s$ could also be obtained by applying the general formula in [16, Theorem 5], which involves the computation of $(\Gamma_s^{-1})^2$ and a matrix square root. However, for our special equation, the procedure given in the proof of Lemma 4.3 is more direct and numerically advantageous.

Lemma 4.4. Let $\Gamma_0 = [\gamma_{ij}] \in \mathbb{R}^{e\times e}$ be a strictly upper triangular matrix. The quadratic matrix equation
$$\Gamma_0\Lambda_0^2 - \Lambda_0 + \Gamma_0 = 0 \quad \text{in } \Lambda_0 = [\lambda_{ij}] \in \mathbb{R}^{e\times e}, \qquad (4.15)$$
with $\Lambda_0$ strictly upper triangular, is uniquely solvable.

Proof. From (4.15), the matrix $\Lambda_0$ is uniquely determined by $\lambda_{i,i+j} = \gamma_{i,i+j}$, $i = 1, \ldots, e-2$, $j = 1, 2$, $\lambda_{e-1,e} = \gamma_{e-1,e}$, and

For $j = 3, \ldots, e$,
  For $i = 1, \ldots, e-j+1$,
    $\widehat\lambda_{i,i+j-1} = \sum_{\ell=i+1}^{i+j-2}\lambda_{i,\ell}\lambda_{\ell,i+j-1}$,
  end i,
  For $i = 1, \ldots, e-j$,
    $\lambda_{i,i+j} = \gamma_{i,i+j} + \sum_{\ell=i+1}^{i+j-2}\gamma_{i,\ell}\widehat\lambda_{\ell,i+j}$,
  end i,
end j.

Theorem 4.5. Let the columns of $Z_s$ form a basis for an invariant subspace of $(K, N)$ corresponding to $\Gamma_s$ with $\sigma(\Gamma_s) \subseteq \mathbb{C}\setminus[-2, 2]$, i.e., $KZ_s = NZ_s\Gamma_s$. Suppose that $\Lambda_s$ solves $\Gamma_s = \Lambda_s + \Lambda_s^{-1}$ as in Lemma 4.3, with $\sigma(\Lambda_s) \subseteq \mathbb{D}\setminus\{0\}$. If the columns of $J(L^\top JZ_s\Lambda_s - M^\top JZ_s)$ are linearly independent, then they form a basis for a stable invariant subspace of $(M, L)$ corresponding to $\Lambda_s$.

Proof. Since $MJM^\top = LJL^\top$ and
$$KZ_s - NZ_s\Gamma_s = (MJL^\top + LJM^\top)JZ_s - LJL^\top JZ_s(\Lambda_s + \Lambda_s^{-1}) = 0,$$
a direct computation gives
$$MJ\left(L^\top JZ_s\Lambda_s - M^\top JZ_s\right) = LJ\left(L^\top JZ_s\Lambda_s - M^\top JZ_s\right)\Lambda_s.$$

Remark 4.1. Let
$$Z_s = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}, \qquad \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = J\left(L^\top JZ_s\Lambda_s - M^\top JZ_s\right),$$
where each of $X_1$, $X_2$, $Z_1$, and $Z_2$ has $n$ rows. Then direct computation gives $X_1 = Z_2\Lambda_s - Z_1$. But it is more convenient to get $X_2 = QX_1 - A^\top X_1\Lambda_s$ from
$$M\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = L\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\Lambda_s.$$

We now explain how we can get eigenvectors of $(M, L)$ from eigenvectors of $(K, N)$ corresponding to eigenvalues in $[-2, 2]$.

Theorem 4.6. Let $v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ be a real eigenvector of $(K, N)$ corresponding to an eigenvalue $\gamma \in [-2, 2]$. Let $\lambda$ be a solution of (4.14), and let $u_1 = \lambda v_2 - v_1$ and $u_2 = Qu_1 - \lambda A^\top u_1$. Then $u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ is an eigenvector of $(M, L)$ corresponding to the eigenvalue $\lambda$ if $u \ne 0$. Moreover, we indeed have $u \ne 0$ for each $\gamma \in (-2, 2)$.

Proof. The first part is proved by direct computation, as in the proof of Theorem 4.5. For the second part, we simply note that $\lambda$ is not real when $\gamma \in (-2, 2)$, and then $u_1 \ne 0$ since $v$ is a nonzero real vector.

Remark 4.2. When $\gamma \in (-2, 2)$ is an eigenvalue of $(K, N)$ with multiplicity $2k$ (or an eigenvalue of $(K_{11}, N_{11})$ with multiplicity $k$), we can use Theorem 4.6 to get $k$ eigenvectors of $(M, L)$ corresponding to the eigenvalue $\lambda$ from the $k$ linearly independent eigenvectors of $(K, N)$ we have already obtained. However, there is no guarantee that the $k$ eigenvectors of $(M, L)$ so obtained are also linearly independent.

When $A$ is singular, $(M, L)$ has eigenvalues at 0 and $\infty$. The following result will then be needed.

Theorem 4.7. Let $Z_\infty \in \mathbb{R}^{2n\times m}$ span the infinite invariant subspace of $(K, N)$ corresponding to $(I, \Gamma_0)$, where $\Gamma_0$ is strictly upper triangular, i.e., $NZ_\infty = KZ_\infty\Gamma_0$. Suppose that $\Lambda_0$ solves $\Gamma_0\Lambda_0^2 - \Lambda_0 + \Gamma_0 = 0$ as in Lemma 4.4, with $\Lambda_0$ strictly upper triangular. If the columns of $J(L^\top JZ_\infty\Lambda_0 - M^\top JZ_\infty)$ are linearly independent, then they form a basis for a zero invariant subspace of $(M, L)$ corresponding to $(\Lambda_0, I)$.

Proof. Since $\Gamma_0 = \Lambda_0(I + \Lambda_0^2)^{-1}$, we have
$$NZ_\infty(I + \Lambda_0^2) = MJM^\top JZ_\infty + LJL^\top JZ_\infty\Lambda_0^2 = KZ_\infty\Lambda_0 = LJM^\top JZ_\infty\Lambda_0 + MJL^\top JZ_\infty\Lambda_0,$$
and then $MJ(L^\top JZ_\infty\Lambda_0 - M^\top JZ_\infty) = LJ(L^\top JZ_\infty\Lambda_0 - M^\top JZ_\infty)\Lambda_0$.

We can now present a structure-preserving algorithm (SA) for the computation of the weakly stabilizing solution of (3.3).

Structure-preserving Algorithm (SA)
Input: $A \in \mathbb{R}^{n\times n}$, $Q = Q^\top \in \mathbb{R}^{n\times n}$.
Output: The weakly stabilizing solution $X$ of $X + A^\top X^{-1}A = Q$.
Step 1: Form the matrix pair $(K, N)$ as in (4.1);


Step 2: Reduce $(K, N)$ as in (4.7): $K \leftarrow U^\top KZ = \begin{bmatrix} K_1 & K_2 \\ 0 & K_1^\top \end{bmatrix}$, $N \leftarrow U^\top NZ = \begin{bmatrix} N_1 & N_2 \\ 0 & N_1^\top \end{bmatrix}$, where $K_1, N_1 \in \mathbb{R}^{n\times n}$ are upper Hessenberg and upper triangular, respectively, and $U$, $Z$ are orthogonal with $U^\top JZ = J$ (see the pseudocode in the Appendix of [15]); apply the QZ algorithm to get (4.8);
Step 3: Compute the eigenmatrix pairs $K_{11}\widehat Z_0\Gamma_0 = N_{11}\widehat Z_0$ and $K_{11}\widehat Z_j = N_{11}\widehat Z_j\Gamma_j$, $j = 1, \ldots, r$, of $(K_{11}, N_{11})$ as in (4.11), by solving the Sylvester equations in (4.9) to get (4.10); set $[\widehat Z_0, \widehat Z_1, \ldots, \widehat Z_r] \leftarrow Z_1[\widehat Z_0, \widehat Z_1, \ldots, \widehat Z_r]$, where $Z_1$ is from (4.8);
Step 4: $Z_\infty = Z(:, 1\!:\!n)\widehat Z_0 \equiv \begin{bmatrix} Z_{\infty,1} \\ Z_{\infty,2} \end{bmatrix}$, $Z_j = Z(:, 1\!:\!n)\widehat Z_j \equiv \begin{bmatrix} Z_{j,1} \\ Z_{j,2} \end{bmatrix}$, $j = 1, \ldots, r$;
Step 5-1: Use Lemma 4.4 to solve for the strictly upper triangular $\Lambda_0$ in $\Gamma_0\Lambda_0^2 - \Lambda_0 + \Gamma_0 = 0$; compute $X_{\infty,1} = Z_{\infty,2}\Lambda_0 - Z_{\infty,1}$ and $X_{\infty,2} = QX_{\infty,1} - A^\top X_{\infty,1}\Lambda_0$; set $X_1 \leftarrow X_{\infty,1}$, $X_2 \leftarrow X_{\infty,2}$ (by Theorem 4.7);
Step 5-2: For $k = 1, \ldots, m_1$,
  solve $\lambda = e^{i\theta}$ with $\mathrm{Im}(\lambda) \ge 0$ from (4.14) with $\gamma = g_k$;
  compute $x_1 = z_2e^{i\theta} - z_1$ and $x_2 = Qx_1 - e^{i\theta}A^\top x_1$, where $z_1 = Z_{1,1}e_k$ and $z_2 = Z_{1,2}e_k$ ($e_k$ is the $k$th column of the identity matrix);
  if $\mathrm{Im}(e^{i\theta}x_1^*A^\top x_1) > 0$, then $x_1 \leftarrow \bar x_1$, $x_2 \leftarrow \bar x_2$;
  set $X_1 \leftarrow [X_1 \mid x_1]$, $X_2 \leftarrow [X_2 \mid x_2]$ (by Theorem 4.6 and Theorem 3.1);
end k;
Step 5-3: For $j = 2, \ldots, r$,
  use Lemma 4.3 to solve for $\Lambda_j = \Lambda_s$ in (4.13) with $\Gamma_s = \Gamma_j$;
  compute $X_{j,1} = Z_{j,2}\Lambda_j - Z_{j,1}$ and $X_{j,2} = QX_{j,1} - A^\top X_{j,1}\Lambda_j$;
  set $X_1 \leftarrow [X_1 \mid X_{j,1}]$, $X_2 \leftarrow [X_2 \mid X_{j,2}]$ (by Theorem 4.5);
end j;
Step 6: Compute $X = X_2X_1^{-1}$.

Table 4.1. Total number of flops for SA

Step 2: $\frac{170}{3}n^3$ (Patel's algorithm) $+\ 39n^3$ (QZ algorithm)
Step 3: $2n^3$ (assuming $m_0 = 0$ and $m_i = 2$, $i = 1, \ldots, r$)
Step 4: $4n^3$
Step 5: between $4n^3$ (no eigenvalues on $\mathbb{T}$) and $8n^3$ (all eigenvalues on $\mathbb{T}$)
Step 6: $\frac{32}{3}n^3$
Total: $\approx 120n^3$

In Step 5-2 of SA we have assumed that the nonsingular Hermitian matrix $iY^*(2\lambda_0A^\top - Q)Y$ in Theorem 3.1 (which is the matrix in (3.11)) is definite. This assumption is the same as the assumption in Theorem 3.3 (iii). Under this assumption we do not need to form the matrix $Y$, but only need to check the sign of its diagonal element determined by any (normalized) eigenvector. We could have used Theorem 3.1 to choose the right eigenvectors whether the Hermitian matrix is definite or not, but this would increase the computational work. We have thus chosen to use the simpler Step 5-2. We can always check at the end whether the imaginary part of the computed $X$ is (nearly) positive semi-definite. If not, we can use a more complicated Step 5-2, based on Theorem 3.1, to recompute $X$.

To find the weakly stabilizing solution of (3.3), the total number of flops for SA is roughly $120n^3$, where a flop denotes a multiplication or an addition in real arithmetic. A more detailed count is given in Table 4.1.

4.2. Comparison of SA with other methods. The required solution in the nano application is $X = \lim_{\eta\to 0^+}X_\eta$. By now, we have five methods available, in two categories. In the first category, we have three methods that compute $X_\eta$ for a small $\eta$ as an approximation to $X$: the fixed-point iteration (FPI), Newton's method (NM), and the doubling algorithm (DA). These methods are already discussed in [12]. In the second category, we have two methods that compute $X$ directly: QZ and SA. In this paper we have further studied the FPI; we now know when and why it works well. It has been shown by numerical experiments in [12] that NM often converges to an undesired solution or even diverges. We now have a better understanding of this as well. Basically, $X_\eta$ is the unique solution of (3.1) with a positive definite imaginary part and, starting with an initial guess with a positive definite imaginary part, the iterates produced by NM often do not have a positive definite imaginary part (while the iterates from the FPI always do). So NM is really not a contender for solving our specific problem and could be dropped from the first category. We have also mentioned earlier that the convergence of the FPI (say (2.1) with $c = \frac12$) is usually very slow for at least some of the energy values of interest. Therefore we only need to compare SA with DA and QZ, as general-purpose methods.

Strictly speaking, SA and QZ are applicable only under some generic assumptions, since they rely on our Theorem 3.1 to pick the right nonreal unimodular eigenvalues of $\varphi_0(\lambda)$ to get the desired weakly stabilizing solution. The difficulty associated with a non-generic eigenstructure of $\varphi_0(\lambda)$ is avoided in DA by the introduction of an $\eta > 0$. More precisely, this difficulty is only concealed, since the relation between $\|X_\eta - X\|$ and $\eta$ is not clear in non-generic cases. In the generic case, we often have $\|X_\eta - X\| = O(\eta)$, but the constant hidden in the big-$O$ notation could still be large. The smallest $\eta$ that we have seen in the nano literature is $10^{-6}$, although we have also experimented with $\eta = 10^{-10}$ for DA in [12]. In the experiments there, DA typically requires 26 iterations for $\eta = 10^{-6}$, and nearly 40 iterations for $\eta = 10^{-10}$. Note that DA requires $\frac{104}{3}n^3$ real flops per iteration. So in terms of flop counts, four DA iterations are already more expensive than SA. Moreover, since we take $\eta = 0$ directly in SA, the accuracy achieved by SA is usually much better than that achieved by DA with $\eta = 10^{-6}$. But we also note that DA is much easier to use. If we take $\eta$ to be very small in DA, then DA could also have stability problems, since the matrices to be inverted in the DA iterations may be very ill-conditioned in that case.

One potential problem with SA is that we cannot rule out the possibility that the column vectors generated in Steps 5-1, 5-2, and 5-3 of SA are linearly dependent (in exact arithmetic). In that case, SA fails. Note that QZ does not have this problem, but it requires about $440n^3$ flops, much more than the $120n^3$ flops required for SA. The accuracy achieved by QZ is not necessarily better than that from SA, since QZ does not exploit the structure of the problem. Moreover, for QZ we may encounter the uncomfortable situation where the number of computed eigenvalues inside $\mathbb{T}$ does not match the number of computed eigenvalues outside $\mathbb{T}$.
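For reference, here is a minimal sketch (ours) of the plain QZ approach for the perturbed equation (1.4) with $\eta > 0$, where exactly $n$ eigenvalues lie inside $\mathbb{T}$; it uses an ordered generalized Schur form and is the straightforward dense method, not SA:

```python
import numpy as np
from scipy.linalg import ordqz

def stabilizing_solution_qz(A, Q, eta):
    """X_eta for X + A^T X^{-1} A = Q + i*eta*I (eta > 0) via the deflating
    subspace of (M_eta, L) for the n eigenvalues inside the unit circle."""
    n = A.shape[0]
    O, I = np.zeros((n, n)), np.eye(n)
    M = np.block([[A, O], [Q + 1j * eta * I, -I]]).astype(complex)
    L = np.block([[O, I], [A.T, O]]).astype(complex)
    # Ordered generalized Schur form: eigenvalues with |lambda| < 1 first.
    _, _, _, _, _, Zq = ordqz(M, L, sort='iuc', output='complex')
    X1, X2 = Zq[:n, :n], Zq[n:, :n]        # first n Schur vectors span the subspace
    return np.linalg.solve(X1.T, X2.T).T   # X = X2 X1^{-1}
```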

Finally, we explain why $X_1$ is unlikely to be singular (in exact arithmetic) in Step 6, and why the column vectors generated in Steps 5-1, 5-2, and 5-3 of SA are unlikely to be linearly dependent. Take Step 5-3 for example, which is based on Theorem 4.5. With $Z_s = Z_j$ and $\Gamma_s = \Gamma_j$ ($j = 2, \ldots, r$), we need to examine whether the matrix $L^\top JZ_s\Lambda_s - M^\top JZ_s$ in Theorem 4.5 has full column rank. We start with the following result.

Theorem 4.8. Let $X_s, X_u \in \mathbb{C}^{2n\times m}$ ($m \le n$) form bases for the stable and unstable invariant subspaces of $(L^\top, M^\top)$, respectively, corresponding to $\Lambda_s$ and $\Lambda_s^{-1}$, where $\sigma(\Lambda_s) \subseteq \mathbb{D}\setminus\{0\}$, i.e.,
$$L^\top X_s = M^\top X_s\Lambda_s, \qquad L^\top X_u = M^\top X_u\Lambda_s^{-1}. \qquad (4.16)$$
Then $J^\top X_s$ and $J^\top X_u$ span two linearly independent eigenspaces of $(K, N)$ corresponding to $\Lambda_s + \Lambda_s^{-1}$.

Proof. From $LJL^\top = MJM^\top$ and (4.16) we have
$$KJ^\top X_s = MJL^\top X_s + LJM^\top X_s = MJM^\top X_s\Lambda_s + LJL^\top X_s\Lambda_s^{-1} = LJL^\top X_s(\Lambda_s + \Lambda_s^{-1}) = NJ^\top X_s(\Lambda_s + \Lambda_s^{-1}).$$
Similarly, $KJ^\top X_u = NJ^\top X_u(\Lambda_s + \Lambda_s^{-1})$.

Now we can write $Z_s = J^\top X_sC_s + J^\top X_uC_u$, where $C_s$ and $C_u$ are coefficient matrices. From (4.16) we have
$$L^\top JZ_s\Lambda_s - M^\top JZ_s = L^\top(X_sC_s + X_uC_u)\Lambda_s - M^\top(X_sC_s + X_uC_u) = \left[L^\top X_s,\ L^\top X_u\right]\begin{bmatrix} C_s\Lambda_s - \Lambda_s^{-1}C_s \\ C_u\Lambda_s - \Lambda_sC_u \end{bmatrix}. \qquad (4.17)$$
Since $\sigma(\Lambda_s) \subseteq \mathbb{D}\setminus\{0\}$, the linear map $T$ defined by $T(W) = W\Lambda_s - \Lambda_s^{-1}W$ is a bijection on $\mathbb{C}^{m\times m}$. Thus, the matrix $C_s\Lambda_s - \Lambda_s^{-1}C_s = T(C_s)$ is generically nonsingular. When $T(C_s)$ is nonsingular, $L^\top JZ_s\Lambda_s - M^\top JZ_s$ also has full column rank, since the columns of $[L^\top X_s,\ L^\top X_u]$ are linearly independent. The latter assertion is clear when $A$ is nonsingular in (3.5), but it can also be shown by using the Kronecker form of $(L^\top, M^\top)$ when $A$ is singular.

From the above discussion, we have the following practical strategy for computing the required solution $X$ with reasonable accuracy. We first use SA. In the unlikely event that SA fails, we use QZ. If QZ also fails, then we use DA with a small $\eta > 0$, and we might have to accept an approximation of lower accuracy.

5. Numerical results. All numerical experiments are carried out using MATLAB R2008b with IEEE double-precision floating-point arithmetic ($\mathrm{eps} \approx 2.22\times 10^{-16}$) on a Linux system. We first illustrate the positivity of $X_{\eta,I}$ for $\eta > 0$, where $X_{\eta,I}$ is the imaginary part of the stabilizing solution $X_\eta$ of (3.1), as well as the rank of $X_I = \lim_{\eta\to 0^+}X_{\eta,I}$. To measure the accuracy of the computed $X_\eta$ we use the relative residual
$$\mathrm{RRes}_\eta = \frac{\|X_\eta + A^\top X_\eta^{-1}A - Q_\eta\|}{\|X_\eta\| + \|A\|^2\|X_\eta^{-1}\| + \|Q_\eta\|}, \qquad (5.1)$$
where $\|\cdot\|$ is the spectral norm.
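In code, (5.1) is a direct transcription (the helper name is ours):

```python
import numpy as np

def rres(X, A, Q_eta):
    """Relative residual (5.1) in the spectral norm."""
    Xinv = np.linalg.inv(X)
    num = np.linalg.norm(X + A.T @ Xinv @ A - Q_eta, 2)
    den = (np.linalg.norm(X, 2)
           + np.linalg.norm(A, 2)**2 * np.linalg.norm(Xinv, 2)
           + np.linalg.norm(Q_eta, 2))
    return num / den
```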

Example 5.1. We randomly generate a real matrix $A$ and a real symmetric matrix $Q$ of dimension 6. Then we use the invariant subspace method (the QZ algorithm) to compute the complex symmetric stabilizing solution $X_\eta$ of (3.1) with $\eta = 10^{-4}$, $10^{-8}$, $10^{-12}$, respectively. When $\eta = 0$, $\varphi_0(\lambda)$ has $2m = 6$ eigenvalues on $\mathbb{T}$, namely the three numbers in the set $\Lambda_s$ below together with their complex conjugates. By Theorem 3.1 we determine that
$$\Lambda_s = \{-0.80913 + 0.58763i,\ 0.64993 + 0.76000i,\ -0.13000 - 0.99151i\}$$
is such that the perturbed eigenvalues of $\varphi_\eta(\lambda)$ ($\eta > 0$) associated with each $\lambda_s \in \Lambda_s$ are inside the unit circle. Then we compute the weakly stabilizing solution $X$ of (3.3) by using the invariant subspace corresponding to the stable eigenvalues and the eigenvalues in $\Lambda_s$. The numerical results are shown in Table 5.1, where $X_0 = X$.

Table 5.1. Relative residuals and $\|X_\eta - X_\eta^\top\|/\|X_\eta\|$

$\eta$: $10^{-4}$ | $10^{-8}$ | $10^{-12}$ | $0$
$\mathrm{RRes}_\eta$: $1.17\times 10^{-15}$ | $1.51\times 10^{-15}$ | $1.69\times 10^{-15}$ | $1.59\times 10^{-15}$
$\|X_\eta - X_\eta^\top\|/\|X_\eta\|$: $9.88\times 10^{-15}$ | $7.47\times 10^{-15}$ | $1.12\times 10^{-14}$ | $1.14\times 10^{-14}$

We know from Section 2 that $X_{\eta,I}$ is positive definite, and we know from Theorem 3.3 (iii) that $\mathrm{rank}(X_I) = m = 3$. These facts are confirmed by the numerical results shown in Table 5.2, where $X_{0,I} = X_I$.

Table 5.2. The eigenvalues of $X_{\eta,I}$

$\eta = 10^{-4}$: 33.938, 6.4240, 9.5171, $6.98\times 10^{-4}$, $1.00\times 10^{-4}$, $1.34\times 10^{-4}$
$\eta = 10^{-8}$: 33.937, 6.4232, 9.5134, $6.98\times 10^{-8}$, $1.00\times 10^{-8}$, $1.34\times 10^{-8}$
$\eta = 10^{-12}$: 33.937, 6.4232, 9.5134, $7.01\times 10^{-12}$, $1.01\times 10^{-12}$, $1.36\times 10^{-12}$
$\eta = 0$: 33.937, 6.4232, 9.5134, $2.54\times 10^{-16}$, $-1.94\times 10^{-15}$, $-3.71\times 10^{-17}$

We now present a numerical comparison of SA and QZ for the computation of the weakly stabilizing solution of (3.3). To measure the accuracy of a computed solution $X$ of (3.3), we use the relative residual RRes defined as in (5.1), with $X_\eta$ and $Q_\eta$ replaced by $X$ and $Q$, respectively.

Example 5.2. As in [12], we consider a semi-infinite Hamiltonian operator for a heterostructured semiconductor of the form
$$H(\psi, \vec x) = -\nabla\cdot\frac{\hbar^2}{2\varepsilon(\vec x)}\nabla\psi + V(\vec x)\psi, \qquad \vec x \equiv \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \Omega, \qquad (5.2)$$
where $\Omega \equiv \Omega_1\cup\Omega_2$ with
$$\Omega_1 = ([-9, -1]\cup[1, 9])\times(-\infty, 0], \qquad \Omega_2 = [-1, 1]\times(-\infty, 0],$$
$\hbar$ is the reduced Planck constant, $\psi$ is the associated wave function, and $\varepsilon(\vec x)$ is the electron effective mass with
$$\varepsilon(\vec x) = \begin{cases} \varepsilon_1, & \vec x \in \Omega_1, \\ \varepsilon_2, & \vec x \in \Omega_2, \end{cases}$$
and $V(\vec x) = \omega x_1^2$.


[Figure] Fig. 5.1. Relative residuals of SA, condition numbers of $X_1$, and numbers of eigenvalues of $(M, L)$ on $\mathbb{T}$ that are used in the computation of $X$ (so they are halves of the total numbers).

Let $T_r = \mathrm{Trid}(-1, 4, -1)$ be the tridiagonal matrix of dimension $r$. We use the classical five-point central finite difference method to discretize the Hamiltonian operator (5.2) on uniform grid points in $\Omega$ with mesh size $h$. Then the corresponding matrices $A$ and $Q$ in (3.3) have the forms
$$A = -\left[\delta_1I_\ell \oplus \left(\frac{\delta_1 + \delta_2}{2}\right) \oplus \delta_2I_m \oplus \left(\frac{\delta_1 + \delta_2}{2}\right) \oplus \delta_1I_\ell\right]$$
and $Q = EI - B$ with
$$B = \left[\delta_1T_\ell \oplus (2(\delta_1 + \delta_2)) \oplus \delta_2T_m \oplus (2(\delta_1 + \delta_2)) \oplus \delta_1T_\ell\right] - \delta_1\left(e_{\ell+1}e_\ell^\top + e_\ell e_{\ell+1}^\top\right) - \delta_2\left(e_{\ell+2}e_{\ell+1}^\top + e_{\ell+1}e_{\ell+2}^\top\right) - \delta_2\left(e_{t+1}e_t^\top + e_te_{t+1}^\top\right) - \delta_1\left(e_{t+2}e_{t+1}^\top + e_{t+1}e_{t+2}^\top\right) + \omega h^2\,\mathrm{diag}\left((1-c)^2, (2-c)^2, \ldots, (n-c)^2\right),$$
where $e_k$ is the $k$th column of the identity matrix, $\ell$ and $m$ are the numbers of grid points on the $x_1$-axis in $(-9, -1)$ and $(-1, 1)$, respectively, $t = \ell + m + 1$, $n = 2\ell + m + 2$, $c = (n+1)/2$, and $\oplus$ denotes the direct sum of two matrices.

[Figure] Fig. 5.2. Relative residuals of QZ and condition numbers of $X_1$.

In our test we take ℓ = 79, m = 19, δ_1 = 1, δ_2 = 0.1, and ω = 5 × 10^{-4}; the matrix size of A is then n = 2ℓ + m + 2 = 179. Let Δ = [−0.5, 8.5]. We divide Δ into p subintervals using p + 1 equally spaced nodes E_i, i = 0, 1, . . . , p. We now choose p = 1000 and run SA and QZ for each E_i. Recall that for both algorithms X is computed from X = X_2 X_1^{-1} in the end. In Figure 5.1 we plot, for SA, the relative residual and the condition number of X_1 (with each column of X_1 normalized, here and elsewhere). We also plot the number of eigenvalues of (M, L) on T that are used in the computation of X (so it is half of the total number). In Figure 5.2 we plot the relative residual and the condition number of X_1 for QZ. From the figures we can see that the condition numbers are not very large and the accuracy of the computed X is high for both methods, with that from SA slightly better.
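A MATLAB sketch of this construction follows. The function name example52 is ours, and the mesh size h = 0.1 is inferred from ℓ = 79 grid points on (−9, −1) and m = 19 on (−1, 1) rather than stated explicitly in the text.

```matlab
function [A, B] = example52()
% Hedged sketch of the matrices in Example 5.2; h = 0.1 is an inferred
% mesh size, consistent with l = 79 and m = 19 on the given intervals.
l = 79; m = 19; d1 = 1; d2 = 0.1; w = 5e-4; h = 0.1;
n = 2*l + m + 2; t = l + m + 1; c = (n + 1)/2;
T = @(r) full(gallery('tridiag', r, -1, 4, -1));   % Trid(-1, 4, -1)
e = @(j) full(sparse(j, 1, 1, n, 1));              % jth unit coordinate vector
A = -blkdiag(d1*eye(l), (d1 + d2)/2, d2*eye(m), (d1 + d2)/2, d1*eye(l));
B = blkdiag(d1*T(l), 2*(d1 + d2), d2*T(m), 2*(d1 + d2), d1*T(l)) ...
    - d1*(e(l+1)*e(l)' + e(l)*e(l+1)') - d2*(e(l+2)*e(l+1)' + e(l+1)*e(l+2)') ...
    - d2*(e(t+1)*e(t)' + e(t)*e(t+1)') - d1*(e(t+2)*e(t+1)' + e(t+1)*e(t+2)') ...
    + w*h^2*diag(((1:n) - c).^2);
end
```

For each node E_i of Δ one then sets Q = E_i*eye(n) − B and solves (3.3).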

In Example 5.2 the matrix A is nonsingular and so Step 5-1 in SA is never used. We now construct an example with a singular A.

Example 5.3. We construct 10 × 10 matrices A and Q for (3.3) as follows: A = rand(10, 10) ∗ diag(a_1, . . . , a_5, 0, 0, 0, 0, 0) ∗ rand(10, 10), where a_i = 10 ∗ rand(1) for i = 1, . . . , 5, and Q = E I_{10} − B with B = (B_0 + B_0^⊤)/2 and B_0 = 10 ∗ (rand(10, 10) − 0.5 ∗ ones(10, 10)). We run SA and QZ for the energies E = 0.01 i, i = 0, 1, . . . , 1000 (here i is an index, not the imaginary unit). The results are shown in Figures 5.3 and 5.4. We see that SA and QZ have roughly the same accuracy for this example.
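In runnable form the experiment reads roughly as follows; the random seed is arbitrary, so these draws do not reproduce the paper's figures exactly, and the solver call is left abstract.

```matlab
% Hedged sketch of Example 5.3; rng(1) is an arbitrary seed.
rng(1);
a  = 10*rand(1, 5);
A  = rand(10)*diag([a, zeros(1, 5)])*rand(10);   % rank(A) <= 5, so A is singular
B0 = 10*(rand(10) - 0.5*ones(10));
B  = (B0 + B0')/2;
for i = 0:1000
    E = 0.01*i;                 % energies sweep [0, 10]
    Q = E*eye(10) - B;
    % ... solve (3.3) with this A and Q by SA or QZ ...
end
```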

Fig. 5.3. Relative residuals of SA, condition numbers of X_1 and numbers of eigenvalues of (M, L) on T.

Fig. 5.4. Relative residuals of QZ and condition numbers of X_1.

6. Conclusions. In this paper we have further studied a nonlinear matrix equation arising in nano research. We have proved general convergence results on fixed-point iterations for (1.4). It is also shown that the required solution of (1.4) is the unique solution with a positive definite imaginary part. Our analysis has also shown that the convergence of these methods is usually very slow when the size of the matrices in (1.4) is large. So the use of these simple methods is recommended only when the matrix size is small, which is the case when the equation is obtained from layer-based models. We have also studied the equation (3.3) directly, which is the equation we obtain by letting η = 0 in (1.4). We have shown which half of the unimodular eigenvalues of P(λ) in (3.6) should be included for computing the required weakly stabilizing solution of (3.3) using subspace methods, and we have also determined the rank of the imaginary part of this solution in terms of the number of unimodular eigenvalues of P(λ). We have presented a structure-preserving algorithm (SA) for (3.3) that is nearly 4 times more efficient than the QZ algorithm. The SA and the QZ algorithm often provide very good accuracy, but the accuracy would suffer if the matrix X_1 used at the end of the algorithms happens to be very ill-conditioned.

Newton's method cannot be used as a correction method, since the Fréchet derivative at the solution is always singular when P(λ) has unimodular eigenvalues. When SA and QZ fail we may use DA on (1.4) with a small η > 0, but we may need to increase η if DA also runs into numerical difficulties. It is possible to improve the accuracy of an approximation from DA by using the FPI (2.1) with c = 1/2, although at a high computational cost when the matrix size is large.
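As an illustration, such a refinement step might look as follows in MATLAB. Since (2.1) is defined earlier in the paper, the convex-combination update used below is an assumption about its form, and fpi_refine is a hypothetical helper name.

```matlab
function X = fpi_refine(A, Qeta, X0, tol, maxit)
% Hedged sketch of the FPI refinement with c = 1/2.  The update
% X <- (1-c)*X + c*(Qeta - A.'*inv(X)*A) is an ASSUMED form of (2.1);
% X0 is an approximation obtained from DA.
c = 1/2;  X = X0;
for k = 1:maxit
    Xnew = (1 - c)*X + c*(Qeta - A.'*(X\A));
    if norm(Xnew - X) <= tol*norm(Xnew), X = Xnew; return; end
    X = Xnew;
end
end
```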

Acknowledgment. The authors thank the referees for their helpful comments.

REFERENCES

[1] I. Appelbaum, T. Wang, J. D. Joannopoulos, and V. Narayanamurti, Ballistic hot-electron transport in nanoscale semiconductor heterostructures: Exact self-energy of a three-dimensional periodic tight-binding Hamiltonian, Physical Review B, 69 (2004), Article Number 165301, 6 pp.

[2] A. G. Baskakov, Dichotomy of the spectrum of non-self-adjoint operators, (Russian) Sibirsk. Mat. Zh., 32 (3) (1991), pp. 24–30; translation in Siberian Math. J., 32 (3) (1992), pp. 370–375.

[3] C.-Y. Chiang, E. K.-W. Chu, C.-H. Guo, T.-M. Huang, W.-W. Lin, and S.-F. Xu, Convergence analysis of the doubling algorithm for several nonlinear matrix equations in the critical case, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 227–247.

[4] S. Datta, Electronic Transport in Mesoscopic Systems, Cambridge University Press, 1995.
[5] S. Datta, Nanoscale device modeling: the Green's function method, Superlattices and Microstructures, 28 (2000), pp. 253–278.

[6] C. J. Earle and R. S. Hamilton, A fixed point theorem for holomorphic mappings, Global Analysis (Proc. Sympos. Pure Math., Vol. XVI, Berkeley, Calif., 1968), American Mathematical Society, Rhode Island, 1970, pp. 61–65.

[7] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, Necessary and sufficient conditions for the existence of a positive definite solution of the matrix equation $X + A^*X^{-1}A = Q$, Linear Algebra Appl., 186 (1993), pp. 255–275.

[8] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, The Johns Hopkins University Press, 1996.

(28)

[9] C.-H. Guo, On Newton’s method and Halley’s method for the principal pth root of a matrix, Linear Algebra Appl., 432 (2010), pp. 1905–1922.

[10] C.-H. Guo, Y.-C. Kuo, and W.-W. Lin, Complex symmetric stabilizing solution of the matrix equation $X + A^\top X^{-1}A = Q$, Linear Algebra Appl., 435 (2011), pp. 1187–1192.

[11] C.-H. Guo and P. Lancaster, Iterative solution of two matrix equations, Math. Comp., 68 (1999), pp. 1589–1603.

[12] C.-H. Guo and W.-W. Lin, The matrix equation $X + A^\top X^{-1}A = Q$ and its application in nano research, SIAM J. Sci. Comput., 32 (2010), pp. 3020–3038.

[13] U. Haagerup and S. Thorbjørnsen, A new application of random matrices: $\mathrm{Ext}(C^*_{\mathrm{red}}(F_2))$ is not a group, Ann. of Math., 162 (2005), pp. 711–775.

[14] L. A. Harris, Fixed points of holomorphic mappings for domains in Banach spaces, Abstr. Appl. Anal., 2003, no. 5, pp. 261–274.

[15] T.-M. Huang, W.-W. Lin, and J. Qian, Structure-preserving algorithms for palindromic quadratic eigenvalue problems arising from vibration of fast trains, SIAM J. Matrix Anal. Appl., 30 (2009), pp. 1566–1592.

[16] B. Iannazzo and B. Meini, Palindromic matrix polynomials, matrix functions and integral representations, Linear Algebra Appl., 434 (2011), pp. 174–184.

[17] D. L. John and D. L. Pulfrey, Green's function calculations for semi-infinite carbon nanotubes, Physica Status Solidi B – Basic Solid State Physics, 243 (2006), pp. 442–448.
[18] A. Kletsov, Y. Dahnovsky, and J. V. Ortiz, Surface Green's function calculations: A nonrecursive scheme with an infinite number of principal layers, Journal of Chemical Physics, 126 (2007), Article Number 134105, 5 pp.

[19] W.-W. Lin, A new method for computing the closed-loop eigenvalues of a discrete-time algebraic Riccati equation, Linear Algebra Appl., 96 (1987), pp. 157–180.

[20] W.-W. Lin, V. Mehrmann, and H. Xu, Canonical forms for Hamiltonian and symplectic matrices and pencils, Linear Algebra Appl., 302/303 (1999), pp. 469–533.

[21] W.-W. Lin and S.-F. Xu, Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 26–39.
[22] B. Meini, Efficient computation of the extreme solutions of $X + A^*X^{-1}A = Q$ and $X - A^*X^{-1}A = Q$, Math. Comp., 71 (2002), pp. 1189–1204.

[23] R. V. Patel, On computing the eigenvalues of a symplectic pencil, Linear Algebra Appl., 188/189 (1993), pp. 591–611.

[24] G. W. Stewart and J.-G. Sun, Matrix Perturbation Theory, Academic Press, 1990.
[25] J. Tomfohr and O. F. Sankey, Theoretical analysis of electron transport through organic molecules, Journal of Chemical Physics, 120 (2004), pp. 1542–1554.

[26] H. Wielandt, On eigenvalues of sums of normal matrices, Pacific J. Math., 5 (1955), pp. 633–638.

[27] X. Zhan and J. Xie, On the matrix equation $X + A^\top X^{-1}A = I$, Linear Algebra Appl., 247 (1996), pp. 337–345.
