THE MATRIX EQUATION X + ATX−1A = Q AND ITS
APPLICATION IN NANO RESEARCH∗
CHUN-HUA GUO† AND WEN-WEI LIN‡
Abstract. The matrix equation X + ATX−1A = Q has been studied extensively when A and
Q are real square matrices and Q is symmetric positive definite. The equation has positive definite solutions under suitable conditions, and in that case the solution of interest is the maximal positive definite solution. The same matrix equation plays an important role in Green’s function calculations in nano research, but the matrixQ there is usually indefinite (so the matrix equation has no positive definite solutions), and one is interested in the case where the matrix equation has no positive definite solutions even whenQ is positive definite. The solution of interest in this nano application is a special weakly stabilizing complex symmetric solution. In this paper we show how a doubling algorithm can be used to find good approximations to the desired solution efficiently and reliably.
Key words. nonlinear matrix equation, complex symmetric solution, stable solution, fixed-point
iteration, doubling algorithm, Newton’s method, Green’s function
AMS subject classifications. 15A24, 65F30, 65H10 DOI. 10.1137/090758209
1. Introduction. The nonequilibrium Green’s function formalism provides a powerful conceptual and computational framework for treating quantum transport in nanodevices [6]. Typically, the system Hamiltonian H is a bi-infinite or semi-infinite block tridiagonal real symmetric matrix [1, 6, 15, 16, 21]. The semi-infinite case is slightly easier to handle. So we assume H is bi-infinite, and in that case H is usually of the form H = ⎡ ⎣ HHTL HL,S L,S HS HS,R HS,RT HR ⎤ ⎦ ,
where HS is the Hamiltonian for the scattering region, which is an ns× nssymmetric
matrix; HL is the Hamiltonian for the left lead, given by
HL= ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ . .. . .. . .. B L AL ATL BL AL ATL BL ⎤ ⎥ ⎥ ⎥ ⎥ ⎦,
∗Received by the editors May 6, 2009; accepted for publication (in revised form) August 30, 2010;
published electronically October 5, 2010.
http://www.siam.org/journals/sisc/32-5/75820.html
†Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada
([email protected]). The work of this author was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada and by the National Center for Theoretical Sciences in Taiwan.
‡Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
([email protected]). The work of this author was partially supported by the National Science Council and the National Center for Theoretical Sciences in Taiwan.
3020
where all matrix blocks are nl× nl, AL = 0, and BL is symmetric; and HR is the
Hamiltonian for the right lead, given by
HR= ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ BR AR ATR BR AR ATR BR . .. . .. . .. ⎤ ⎥ ⎥ ⎥ ⎥ ⎦,
where all matrix blocks are nr× nr, AR= 0, and BRis symmetric. The matrix HL,S
represents the coupling between the scattering region and the left lead and is given by HL,S = ⎡ ⎢ ⎢ ⎢ ⎣ .. . 0 0 CL,S ⎤ ⎥ ⎥ ⎥ ⎦,
where CL,S is nl × ns, and the matrix HS,R represents the coupling between the
scattering region and the right lead and is given by
HS,R= [DS,R0 0 . . .] ,
where DS,Ris ns× nr.
The Green’s function G is defined [5, 7] by G = ((E + i0+)I − H)−1= lim
η→0+((E + iη)I − H)
−1,
whereE is energy, a real number that may be negative, and I is the identity operator.
We will also use Ir (or simply I) to denote the identity matrix of dimension r. Since
H is real symmetric, it is easily seen [5, 7] that G = G1+ iG2, with G1 and G2 real
symmetric (so G is complex symmetric). In practice, one is only interested in GS, the
Green’s function corresponding to the scattering region. It is easily found [5, 15] that GS =(E + i0+)I − HS− HL,ST GLHL,S− HS,RGRHS,RT −1
=(E + i0+)I − HS− CL,ST GL,SCL,S− DS,RGS,RDTS,R −1,
where GL = ((E + i0+)I − HL)−1, GR = ((E + i0+)I − HR)−1, GL,S is the nl× nl
matrix in the lower-right corner of GL, and GS,Ris the nr×nrmatrix in the upper-left
corner of GR. Moreover, one is only interested in the values ofE for which GL,S and
GS,R, respectively, have nonzero imaginary parts [16]. Computing GL,S and GS,Rhas
been a challenging problem for nano scientists.
It is easily seen [15] that GL,S satisfies the matrix equation
GL,S = ((E + i0+)I − BL− ATLGL,SAL)−1,
and GS,Rsatisfies the matrix equation
GS,R= ((E + i0+)I − BR− ARGS,RATR)−1.
For each fixedE, we take η > 0 sufficiently small and compute GL,S(η), the nl× nl
matrix in the lower-right corner of GL(η) = ((E + iη)I − HL)−1. In [1, 15, 21], the
values of E are between −5 and 5, and the smallest η used is 10−6. Now GL,S(η) satisfies the matrix equation
(1.1) X = ((E + iη)I − BL− ATLXAL)−1,
and GL,S(η) is taken to be an approximation to GL,S. Similarly, GS,Ris approximated
by GS,R(η), which is the nr× nr matrix in the upper-left corner of GR(η) = ((E +
iη)I − HR)−1 and is a particular solution of
(1.2) X = ((E + iη)I − BR− ARXATR)−1.
Since (1.1) and (1.2) are the same type, we only need to study (1.1).
A basic numerical method for finding GL,S(η) is the fixed-point iteration [21]
(1.3) X(k+1)= ((E + iη)I − BL− ATLX(k)AL)−1
with X(0)= ((E + iη)I − BL)−1. It has been observed that the iteration converges in
practice and the limit is the required matrix GL,S(η) (rather than a different solution
of (1.1)). The convergence of the iteration has been observed to be very slow for η close to 0. Note that we cannot take η = 0 for this iteration, since, otherwise, the
sequence X(k)(even if well-defined) would be real and would not approximate GL,S,
which is to have a nonzero imaginary part. To speed up the convergence of (1.3), the
following strategy is suggested in [21]: once X(k+1) has been computed from X(k),
replace X(k+1) by (X(k+1)+ X(k))/2 and proceed with the iteration. Since GL,S(η)
needs to be computed for many different values of E, it is also suggested in [1, 21]
that the GL,S(η) computed at a given energy be used as the initial guess for GL,S(η)
at the next nearby energy point.
Other methods have also been considered. In [15] (1.1) is rewritten as
(1.4) ATLXALX − ((E + iη)I − BL)X + I = 0,
and Newton’s method with exact line searches (as in [14]) is used, possibly preceded
by the fixed-point iteration (1.3). In [15] η = 0+in theory, but η is taken to be 2×10−6
in actual computations. Whether this Newton’s method will always converge to the
solution GL,S(η) is left as an open problem in [15]. Another approach is suggested in
[16]. In that approach one would obtain the equation
(1.5) ATLY2− ((E + iη)I − BL)Y + AL= 0
by postmultiplying (1.4) by AL, where Y = XAL. Equations (1.4) and (1.5) are
equivalent when AL is nonsingular. In [16] η = 0 is assumed. If the right solution Y
of (1.5) is found, then GL,S= Y A−1L is obtained. To find Y the auxiliary matrix
(1.6) 0 I −(AT L)−1AL (ATL)−1(EI − BL)
is used. Assuming this matrix is diagonalizable, the required solution Y can be found
(see [13]) by selecting nl linearly independent eigenvectors of (1.6). However, there
is no explanation in [16] how these nl eigenvectors can be selected from the 2nl
eigenvectors. Moreover, this approach does not work when ALis singular. Indeed, for
a simple example given in [16], AL is singular but the solution GL,S can be found
analytically.
In this paper we show how a doubling algorithm can find the desired solution
GL,S(η) of (1.1) and the desired solution GS,R(η) of (1.2). Our numerical experiments
indicate the efficiency and reliability of the doubling algorithm. The algorithm works very well even when η is taken to be very close to 0.
Our starting point is to rewrite (1.1) as
(1.7) X + ATX−1A = Q,
where
(1.8) A = AL, Q = QL+ iηI, QL=EI − BL,
and the required solution is X = (GL,S(η))−1. Similarly, we rewrite (1.2) as
(1.9) X + AX−1AT = Q,
where
(1.10) A = AR, Q = QR+ iηI, QR=EI − BR,
and the required solution is X = (GS,R(η))−1.
We often have AL = AR and BL = BR (in [15], for example). In this case, the
doubling algorithm is able to compute GL,S(η) and GS,R(η) simultaneously.
2. Characterization of GL,S(η) and GS,R(η). The matrices GL,S(η),
GS,R(η) have been uniquely defined through the inverses of semi-infinite block Toeplitz
matrices. They are also particular solutions of the matrix equations (1.1) and (1.2), re-spectively. Those matrix equations may have many other solutions. So we need to give
a characterization for GL,S(η) and GS,R(η), in terms of the matrix equations (rather
than the infinite matrices). We only need to derive results for GL,S(η); analogous
results for GS,R(η) follow readily.
We have rewritten (1.1) as (1.7) by replacing X in (1.1) with X−1. When Q is a
real symmetric positive definite matrix (which is possible here only when η = 0), the matrix equation (1.7) has been studied extensively (see [3, 8, 10, 11, 12, 17, 20, 22]). In all those papers the desired solution is the maximal positive definite solution (when it has at least one positive definite solution). As we will explain later, in the nano
application we are only interested in values of energy E for which (1.7) with Q =
EI − BLhas no positive definite solutions.
When A is nonsingular, with Y = X−1A, (1.7) is equivalent to the quadratic
matrix equation
ATY2− QY + A = 0
for which we have the associated quadratic eigenvalue problem P (λ)x = 0, x = 0,
where P (λ) = λ2AT− λQ + A.
We will not use this connection to solve (1.7) since the matrix A may be singular in the application we have in mind. However, we will use the quadratic matrix polynomial P (λ) in our discussions even when A is singular.
Theorem 2.1. Let A and Q be as in (1.8) and T be the unit circle. Then the
quadratic P (λ) = λ2AT − λQ + A has no eigenvalues on T for any η = 0. In this
case it has exactly nl eigenvalues inside T (counting multiplicities) and exactly nl
eigenvalues outsideT (counting multiplicities and including eigenvalues at ∞). Proof. The quadratic P (λ) has eigenvalues on T if and only if det P (λ) = 0 for
some λ ∈ T or, equivalently, det(λAT + λ−1A − QL− iηI) = 0 for some λ ∈ T.
However, for any λ ∈ T the matrix λAT + λ−1A − QL is Hermitian and thus has real
eigenvalues ν1, ν2, . . . , νnl. The eigenvalues of λAT + λ−1A − QL− iηI are then the
nonzero numbers ν1− ηi, ν2− ηi, . . ., νnl− ηi. So det(λAT + λ−1A − QL− iηI) = 0
for any λ ∈ T. Thus the quadratic P (λ) has no eigenvalues on T. The location of the eigenvalues must be as stated since P (λ) is a T -palindromic matrix polynomial (see [18]). More precisely, the number ξ is an eigenvalue of the quadratic P (λ) if and only
if ξ−1is so and they have the same partial multiplicities (see [18, Theorem 2.2]).
Let X be any complex symmetric solution of (1.7) and ρ(·) denote the
spec-tral radius. Then ρ(X−1A) = ρ((X−1A)T) = ρ(ATX−1). We adopt the following
definition.
Definition 2.2.A complex symmetric solution X of (1.7) is said to be stabilizing
if ρ(X−1A) < 1 and weakly stabilizing if ρ(X−1A) = 1. Let (2.1) M0= A 0 Q −I , L0= 0 I AT 0 .
Then the pencil M0 − λL0 (also denoted by (M0, L0)) is a linearization of the
T -palindromic polynomial λ2AT − λQ + A, and X is a solution of (1.7) if and only if
(2.2) M0 I X = L0 I X X−1A.
It follows that each complex symmetric solution of (1.7) is determined by a suitable
de-flating subspace of the pencil M0−λL0. The required matrix GL,S= limη→0+GL,S(η)
is the inverse of a special complex symmetric solution of (1.7) with η = 0, and we
are only interested in the values ofE for which GL,S has a nonzero imaginary part.
We will see later in this section that for those values ofE the pencil M0− λL0 (or,
equivalently, the quadratic λ2AT−λQ+A) must have eigenvalues on T and that GL,S
is the inverse of one of the weakly stable solutions of (1.7) with η = 0. For η > 0,
the pencil M0− λL0 has no eigenvalues on T by Theorem 2.1. This is a necessary
condition for the existence of a (necessarily unique) stabilizing solution of (1.7). By using a deep result on linear operators, we will show that (1.7) with η > 0 always
has a stabilizing solution and that its inverse is precisely the matrix GL,S(η) we are
looking for.
Recall that GL,S(η) is the nl× nlmatrix in the lower-right corner of ((E + iη)I −
HL)−1. Note that (E + iη)I − HL= ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ . .. . .. . .. Q −A −AT Q −A −AT Q ⎤ ⎥ ⎥ ⎥ ⎥ ⎦.
It is easy to see that GL,S(η) is also the nl× nlmatrix in the upper-left corner of T−1 with (2.3) T = ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ Q −AT −A Q −AT −A Q . .. . .. . .. ⎤ ⎥ ⎥ ⎥ ⎥ ⎦.
This change is largely notational and is not needed if we discuss GS,R(η). The symbol
for the block Toeplitz matrix T is φ(λ) = −λA + Q − λ−1AT.
Let 2 be the usual Hilbert space of all square summable sequences of complex
numbers, and let m2 be the Cartesian product of m copies of 2. The infinite matrix
(2.3) is then seen to be inB(nl
2 ), the set of all bounded linear operators on n2l. It is
clear that T = (ηi)I − W with W a self-adjoint operator in B(nl
2 ). It is well known
that the spectrum of a self-adjoint operator is real. So, for each η > 0, T has an
inverse inB(nl
2 ) since ηi is not in the spectrum of W . Thus, by appealing to a deep
result on linear operators (see [9, Chapter XXIV, Theorem 4.1] and [19]), we know that φ(λ) has a factorization
(2.4) φ(λ) = (I − λ−1L)D(I − λU )
with D invertible, ρ(L) < 1, and ρ(U ) < 1. From (2.4) we see that A = DU, AT = LD, Q = D + LDU.
Thus D +ATD−1A = Q and ρ(D−1A) < 1. In other words, D is the unique stabilizing
solution of (1.7), which must be complex symmetric. By [9, Chapter XXIV, Theorem
4.1], the nl× nlmatrix in the upper-left corner of T−1 is precisely D−1. We thus have
the following characterization of GL,S(η) and GS,R(η).
Theorem 2.3.The matrix GL,S(η) is the inverse of the unique stabilizing solution
of (1.7), and the matrix GS,R(η) is the inverse of the unique stabilizing solution of
(1.9).
We will show in the next section that a doubling algorithm can compute GL,S(η)
and GS,R(η) efficiently, even if η is very close to 0. In the remainder of this section
we answer the following question, For what values ofE will GL,S = limη→0+GL,S(η)
have a nonzero imaginary part?
In the limiting case η = 0, the matrix Q in (1.7) is real symmetric. For real symmetric (and more generally Hermitian) matrices X and Y , we write X ≥ Y (X > Y ) if X − Y is positive semidefinite (definite). When Q is real symmetric positive definite, necessary and sufficient conditions for the existence of a positive definite solution of (1.7) have been given in [8].
Theorem 2.4. Let Q be positive definite. Then (1.7) has a positive definite
so-lution if and only if the rational matrix function ψ(λ) = Q + λA + λ−1AT is regular (i.e., the determinant of ψ(λ) is not identically zero) and ψ(λ) ≥ 0 for all λ on T. In this case (1.7) has a maximal positive definite solution X+ (i.e., X+≥ X for any
other positive definite solution).
The next result follows quite quickly from Theorem 2.4.
Theorem 2.5.Let Q be real symmetric, and for each λ on T, let the eigenvalues
of the Hermitian matrix Q + λA + λ−1AT be μ1(λ) ≤ μ2(λ) ≤ · · · ≤ μn(λ). For
i = 1, 2, . . . , n, let ai = min|λ|=1μi(λ) and bi = max|λ|=1μi(λ), and let Δi = [ai, bi].
Then the matrix equation
(2.5) X + ATX−1A = sI − Q
has a positive definite solution for s > bn, has a negative definite solution for s < a1,
and has no positive or negative definite solutions for a1 < s < bn. The quadratic
Ps(λ) = λ2AT − λ(sI − Q) + A has eigenvalues on T if and only if s ∈ ∪1≤i≤nΔi.
Proof. For each i = 1, . . . , n, μi(λ) is a continuous function on T. So μi(T) = Δi.
Since μ1(λ) ≤ μ2(λ) ≤ · · · ≤ μn(λ), we have a1 ≤ a2 ≤ · · · ≤ an and b1 ≤ b2 ≤
· · · ≤ bn. Let ψs(λ) = sI − Q − λA − λ−1AT. If s > bn, then ψs(λ) ≥ sI − bnI > 0 for all λ ∈ T. Note that in this case we must have sI − Q > 0. In fact, we have sI − Q − (A + AT) > 0 for λ = 1 and sI − Q + (A + AT) > 0 for λ = −1. Adding these two we get 2(sI − Q) > 0. Now by Theorem 2.4, (2.5) has a positive definite solution.
If s < a1, then ψs(λ) = Q − sI + λA + λ−1AT ≥ a1I − sI > 0. So Q − sI > 0, and by
Theorem 2.4, the equation X + ATX−1A = Q − sI has a positive definite solution X∗.
So−X∗is a negative definite solution of (2.5). We now show that (2.5) has no positive
definite solutions for s < bn. In fact, the existence would imply ψs(λ) ≥ 0 for all λ ∈ T
and then ψbn(λ) > 0 for all λ ∈ T. This is impossible since ψbn(λ) is singular for some
λ ∈ T. Similarly, (2.5) has no negative definite solutions for s > a1. The quadratic
Ps(λ) = λ2AT− λ(sI − Q) + A has eigenvalues on T if and only if det Ps(λ) = 0 for
some λ ∈ T or, equivalently, det ψs(λ) = (s − μ1(λ))(s − μ2(λ)) · · · (s − μn(λ)) = 0 for
some λ ∈ T; the latter is equivalent to s ∈ ∪1≤i≤nΔi.
We use one simple example to illustrate the results in Theorem 2.5. Example 2.1. Let n = 2, and
A = 0 0 1 0 , Q = t + 1 t t t + 1 , t ≥ 0.
This example is a special case of an example in [16]. For λ = eiθ, we find
μ1(λ) = t + 1 − t2+ 1 + 2t cos θ, μ2(λ) = t + 1 + t2+ 1 + 2t cos θ. We then find Δ1= [0, 2t] , 0 ≤ t ≤ 1, [0, 2] , t ≥ 1, Δ2= [2, 2(t + 1)] , 0≤ t ≤ 1, [2t, 2(t + 1)] , t ≥ 1.
By Theorem 2.5 or direct verification, (2.5) has a positive definite solution for s > 2(t + 1), has a negative definite solution for s < 0, and has no positive or negative
definite solutions for 0 < s < 2(t + 1). The quadratic Ps(λ) = λ2AT − λ(sI − Q) + A
has eigenvalues on the unit circle if and only if s ∈ Δ1∪ Δ2. In particular, when
t = 0, the quadratic Ps(λ) has eigenvalues on the unit circle only for s = 0, 2; when t = 1, the quadratic Ps(λ) has eigenvalues on the unit circle for all s ∈ [0, 4]. For this
example, the value t = 1 is the only one for which Δ1∪ Δ2is connected.
We can now give the following result about when GL,S has a nonzero imaginary
part. A similar result can be given for GS,R.
Theorem 2.6.For λ ∈ T, let the eigenvalues of ψL(λ) = BL+ λAL+ λ−1AT
L be
μL,1(λ) ≤ · · · ≤ μL,n(λ), where n = nl. Let
ΔL,i=
min
|λ|=1μL,i(λ), max|λ|=1μL,i(λ)
and ΔL = ∪ni=1ΔL,i. Then GL,S is a real symmetric matrix if E /∈ ΔL, and it is a complex symmetric matrix if E ∈ ΔL. When E ∈ ΔL, the quadratic PL(λ) = λ2ATL− λ(EI − BL) + AL has eigenvalues on T. In the generic case that all these
eigenvalues on T are simple and nonreal, GL,S has a nonzero imaginary part. Proof. By taking Q = BL, A = AL, and s = E in Theorem 2.5, we know that
PL(λ) has eigenvalues on T if and only if E ∈ ΔL. The matrix GL,S is known to be
complex symmetric. IfE /∈ ΔL, then we know by Theorem 2.3 that GL,S is determined
by the deflating subspace of M0− λL0 corresponding to the n eigenvalues inside T,
where M0 and L0 are given in (2.1) with η = 0. So GL,S is a real matrix since AL
and BL are real. We now assume that E ∈ ΔL and that the eigenvalues on T are
simple eigenvalues αk± βki (k = 1, . . . , m) with βk> 0. In this case, for each η > 0,
PL,η(λ) = λ2ATL− λ((E + iη)I − BL) + AL has no eigenvalues on T (see Theorem
2.1). After the introduction of iη, one of the pair αk± βki is moved to the inside of
T (the choice is independent of the size of η > 0 by the continuity of eigenvalues),
and the other is moved to the outside ofT (by the property of T -palindromic matrix
polynomials). Therefore, GL,S is determined by the deflating subspace of M0− λL0
(with η = 0) corresponding to the n − m eigenvalues inside T together with the m
eigenvalues onT that would be moved to the inside of T with the introduction of iη.
These n eigenvalues of M0− λL0 must be those of GL,SA by (2.2). In this case GL,S
must have a nonzero imaginary part since, otherwise, the eigenvalues of GL,SA would
appear in conjugate pairs.
We remark that the numerical determination of ΔL in the above theorem may
require the computation of the eigenvalues of ψL(λ) for many fixed values of λ on T.
However, for each fixed λ on T, ψL(λ) is just a Hermitian matrix. The computation
of all its eigenvalues is a relatively easy task. We don’t have to do the computation
for too many points of λ on T. A rough numerical approximation of ΔL would be
sufficient. In this way, we can avoid the much more complicated computation of GL,S
for many energy valuesE of no practical interest. Moreover, in many cases, ΔL,iand
ΔL,i+1 overlap for i = 1, . . . , n − 1, so an accurate approximation of ΔL requires only
the computation of the extreme eigenvalues μL,1(λ) and μL,n(λ) for many λ values
onT.
3. Computation of GL,S(η) and GS,R(η). We know that GL,S(η) = Xs−1,
where η > 0 and Xs is the unique stabilizing solution of (1.7). A doubling algorithm
has been studied in [17] for (1.7) with a real symmetric positive definite Q. In our case, Q is complex symmetric. However, the more general presentation in [3, 4] can be used directly.
Let M0 and L0be as given in (2.1). It is easy to verify that the pencil M0− λL0
is T -symplectic; i.e., M0JM0T = L0JLT0 for J = 0 I −I 0 .
We can define the sequences{Mk} and {Lk}, where
(3.1) Mk = Ak 0 Qk −I , Lk = −Pk I ATk 0 , by the following doubling algorithm [3, 4] if no breakdown occurs.
Algorithm 3.1. Let A0= A, Q0= Q, P0= 0. For k = 0, 1, . . ., compute Ak+1= Ak(Qk− Pk)−1Ak,
Qk+1= Qk− ATk(Qk− Pk)−1Ak,
Pk+1= Pk+ Ak(Qk− Pk)−1ATk.
We will show shortly that this algorithm will not break down, and Qk converges
to Xs much more quickly than the sequence {Xk} generated by the following basic
fixed-point iteration. Algorithm 3.2.
X0= Q,
Xk+1= Q − ATXk−1A, k = 0, 1, . . . .
We will also show that the sequence{Xk} is well-defined and converges to Xs.
Note that the sequence {X(k)} from (1.3) is then well-defined and given by X(k) =
Xk−1. So we indeed have lim X(k)= GL,S(η), as observed in numerical experiments.
Theorem 3.1. Let A and Q be as in (1.8) with η > 0. Let Xs be the stabilizing
solution of (1.7) and Xs be the stabilizing solution of the dual equation X + AX−1AT = Q.
(The existence of Xs is also guaranteed by the argument leading to Theorem 2.3.)
Then the following are true.
(a) The sequences {Ak}, {Qk}, {Pk} in Algorithm 3.1 are well-defined, and Qk
and Pk are complex symmetric.
(b) The sequence{Xk} in Algorithm 3.2 is well-defined, and Xk is complex
sym-metric.
(c) Qk= X2k−1 for each k ≥ 0.
(d) Qk converges to Xsquadratically, Ak converges to 0 quadratically, and Q−Pk
converges to Xs quadratically with lim sup
k→∞
2k Q
k− Xs ≤ (ρ(Xs−1A))2, lim sup k→∞ 2k A k ≤ ρ(Xs−1A), lim sup k→∞ 2kQ − P k− Xs ≤ (ρ(Xs−1A))2,
where · is any matrix norm.
(e) Xk converges to Xs linearly with
lim sup
k→∞
k
Xk− Xs ≤ (ρ(Xs−1A))2.
Proof. Let Tk be the block k × k matrix given by
Tk = ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ Q −AT −A Q . .. . .. . .. −AT −A Q ⎤ ⎥ ⎥ ⎥ ⎥ ⎦.
For each k ≥ 1, we can write Tk= Vk+ iηI with Vk real symmetric. So Tk has no zero
eigenvalues and is thus invertible. The sequence{Xk} is obtained by block Gaussian
elimination performed on the matrix (2.3). In fact, X0= Q is the (1, 1) block in (2.3);
when the (1, 1) block is used to eliminate the (2, 1) block, the new (2, 2) block is X1;
when the new (2, 2) block is used to eliminate the (3, 2) block, the new (3, 3) block is
X2; and so on. Since Tk is invertible for each k ≥ 1, Xk is well-defined and invertible
for each k ≥ 0. It is easily seen by induction that Xk is complex symmetric since Q
is complex symmetric. So (b) is proved.
Let Wk= Qk− Pk in Algorithm 3.1. Then the sequence {Wk} satisfies
Wk+1= Wk− ATkWk−1Ak− AkWk−1ATk, W0= Q.
It follows from [2, Theorem 13] that Wk is nonsingular for each k ≥ 0. The sequences
{Ak}, {Qk}, {Pk} are then well-defined. Again, Qk and Pk are complex symmetric
since Q is complex symmetric. This proves (a).
The proof of (c) is the same as in the proof of [11, Proposition 5], although Q is complex symmetric here.
To prove (d), we start with M0 I Xs = L0 I Xs Xs−1A,
a special case of (2.2). It follows from [3] that for each k ≥ 0,
(3.2) Mk I Xs = Lk I Xs (Xs−1A)2 k . Substituting (3.1) into (3.2) yields
(3.3) Ak = (Xs− Pk)(Xs−1A)2k, Qk− Xs= ATk(Xs−1A)2k. Similarly, we have M0 I Xs = L0 I Xs Xs−1AT, where M0= AT 0 Q −I , L0= 0 I A 0 .
The pencil M0−λL0is a linearization of λ2A−λQ+AT, which has the same eigenvalues
as λ2AT− λQ + A. It follows that Xs−1AT and Xs−1A have the same eigenvalues, and
in particular, ρ( Xs−1AT) = ρ(Xs−1A). For each k ≥ 0, we now have
Mk I Xs = Lk I Xs ( Xs−1AT)2k,
where Mk and Lk are given by (3.1) and Algorithm 3.1 when A0 = A in Algorithm
3.1 is replaced by A0= AT. It is easy to see that
Mk= ATk 0 Qk −I , Lk= − Pk I Ak 0
with (3.4) Pk = Q − Qk, Qk = Q − Pk. So we now have (3.5) ATk = ( Xs− Pk)( Xs−1AT)2k, Qk− Xs= Ak( Xs−1AT)2k. By (3.3), (3.5), and (3.4), we have Qk− Xs= ATk(Xs−1A)2 k = ( Xs− Pk)( Xs−1AT)2k(Xs−1A)2k = (Qk− Xs+ (Xs+ Xs− Q))( Xs−1AT)2 k (Xs−1A)2 k . Thus (3.6) (Qk− Xs)(I − ( Xs−1AT)2 k (Xs−1A)2 k ) = (Xs+ Xs− Q)( Xs−1AT)2 k (Xs−1A)2 k . It follows that lim sup k→∞ 2k Q k− Xs ≤ ρ( Xs−1AT)ρ(Xs−1A) = (ρ(Xs−1A))2< 1.
So Qk converges to Xs quadratically. Then we know { Pk} is bounded, and we have
by the first equation in (3.5) that lim sup
k→∞
2k A
k ≤ ρ(Xs−1A) < 1.
So Ak converges to 0 quadratically. By the second equations in (3.5) and (3.4), we
have
lim sup
k→∞
2k
(Q − Pk)− Xs ≤ (ρ(Xs−1A))2< 1.
So Q − Pk converges to Xs quadratically. This completes the proof of (d).
Since Qk converges to Xsand X2k−1 = Qk, we know that a subsequence of{Xk}
converges to Xs. Since ρ(Xs−1A) < 1, Xsis an attractive fixed point of the mapping
f defined by f (X) = Q − ATX−1A. It follows that the sequence {Xk} also converges
to Xs, and we have lim sup k→∞ k Xk− Xs ≤ (ρ(Xs−1A))2 as in [12]. So (e) is proved.
Algorithm 3.1 is said to be structure-preserving since for each k ≥ 0, Mk and Lk
have the structures given in (3.1) and the pencil Mk− λLk is T -symplectic.
When AL = AR and BL = BR, the dual equation of (1.7) with (1.8) is
pre-cisely (1.9) with (1.10). In this case, Algorithm 3.1 can find GL,S(η) and GS,R(η)
simultaneously.
To get good approximations to GL,S, we need to take η > 0 to be sufficiently
small. However, we cannot apply Algorithm 3.1 with η = 0 directly. If we take η = 0,
then the sequence{Qk} is real and will not approximate Xs, which is to be complex
symmetric.
We are only interested in the values ofE for which limη→0+ρ(Xs−1A) = 1. Since we
need to take η sufficiently small, we always have ρ(Xs−1A) ≈ 1, and the convergence of
Algorithm 3.2 will be very slow in general. The strategy proposed in [21] for improving
the convergence of iteration (1.3) generates a new sequence X(k) as follows.
Algorithm 3.3. Take X(0)= Q−1. For k = 0, 1, . . . , compute
X(k+1)= (Q − ATX(k)A)−1, X(k+1)← (X(k)+ X(k+1))/2.
We now adapt this strategy for Algorithm 3.2 and get the following modified fixed-point method.
Algorithm 3.4. Take X0= Q. For k = 0, 1, . . . , compute
Xk+1= Q − ATXk−1A,
Xk+1← (Xk+ Xk+1)/2.
We remark that for {X(k)} from Algorithm 3.3 and {Xk} from Algorithm 3.4,
we no longer have X(k) = Xk−1 in general. To keep this relation, one would have to
replace the last line of Algorithm 3.4 by
Xk+1←(Xk−1+ Xk+1−1 )/2 −1.
This would make the algorithm slightly more expensive. Numerical experiments show that this change does not affect the performance of the algorithm in any significant
way. So we prefer to use the update of Xk+1 as given in Algorithm 3.4. Numerical
experiments also show that the convergence of Algorithm 3.4 is often much faster than that of Algorithm 3.2. But a rigorous convergence analysis remains an open problem. In [16], η = 0 is assumed, and a diagonalization procedure is used on the
auxil-iary matrix (1.6) when A = AL is nonsingular. However, that approach will run into
difficulties even when A is perfectly conditioned and the auxiliary matrix is diagonal-izable. Example 3.1. Let AL=−I3, BL= ⎡ ⎣ −14 −14 −10 0 −1 4 ⎤ ⎦ . It is easy to determine by Theorem 2.6 that
ΔL,1= [0.5858, 4.5858], ΔL,2= [2.0000, 6.0000], ΔL,3= [3.4142, 7.4142].
So ΔL= [0.5858, 7.4142]. For η = 0 and E = 4, the matrix in (1.6) has all 6 eigenvalues
onT as follows: − √ 2 2 ± √ 2 2 i, √ 2 2 ± √ 2 2 i, ±i.
The difficulty with the approach in [16] is that we do not know which 3 eigenvalues
should be used to get GL,S. There are 20 different choices. By the proof of Theorem
2.6, we now know that we can pick only 1 eigenvalue from each conjugate pair. This reduces the number of choices to 6. Only by changing η from 0 to a small positive number do we find that the 3 eigenvalues with negative imaginary parts are perturbed
to the inside ofT. So these 3 eigenvalues are to be used to find GL,S. We could have
taken a small η > 0, say η = 10−10, right in the beginning and used Algorithm 3.1 to
compute GL,S(η) as a good approximation to GL,S.
In general, the palindromic matrix polynomial P (λ) = λ2AT − λQ + A, where
A and Q are given in (1.8) with η = 0, can have 2k eigenvalues on T with k being
any integer from 0 to nl. For the above example, we find that P (λ) has 0 eigenvalues
onT for E ∈ (−∞, 0.5857] ∪ [7.4143, ∞), has 2 eigenvalues on T for E ∈ [0.5858, 2) ∪
(6, 7.4142], has 4 eigenvalues on T for E ∈ [2, 3.4142]∪[4.5858, 6], and has 6 eigenvalues
onT for E ∈ [3.4143, 4.5857]. For a large problem, the computed eigenvalues of P (λ)
(or the matrix (1.6)) may contain a number of conjugate pairs nearT, and the number
of eigenvalues insideT may be different from the number of eigenvalues outside T. In
that case, it is hard to tell which eigenvalues should be used to compute GL,S. For
this reason, the eigenvalue approach will not be considered further in this paper. Newton’s method has been studied in [12] for (1.7) with a Hermitian positive definite Q. Since Q is non-Hermitian in our case, there is no guarantee of convergence
for Newton’s method with X0= Q. With a given X0, Newton’s iteration for (1.7) is
easily found to be
(3.7) Xk− ATXk−1−1 XkXk−1−1 A = Q − 2ATXk−1−1 A.
If Xk−1 is a nonsingular complex symmetric matrix with ρ(Xk−1−1 A) < 1, then (3.7)
has a unique solution Xk, which must be complex symmetric. Since ρ(Xs−1A) < 1 for
η > 0, the convergence of Newton’s method is guaranteed if X0is complex symmetric
and sufficiently close to Xs.
Algorithm 3.5 (Newton’s method for (1.7)). Take X0 to be complex symmetric
and sufficiently close to Xs. For k = 1, 2, . . . , compute Lk= Xk−1−1 A and solve
(3.8) Xk− LTkXkLk= Q − 2LTkA.
Note that the Stein equation (3.8) is uniquely solvable when ρ(Lk) < 1.
If we are to find Xs (and then GL,S(η) = Xs−1) only for one fixed value of E,
there is little hope for Algorithm 3.5 to beat Algorithm 3.1. In practice, we need to
determine GL,S (through Xs and GL,S(η) for a small η > 0) for a range of energy
values [21]. So for Algorithm 3.5, the Xscomputed at a given energy can be used as
an initial guess for Xsat the next nearby energy point. However, there is some danger
associated with this practice. If η is very small, then the convergence of Algorithm
3.5 is guaranteed only when Xs for the previous energy is very close to the Xs for
the current energy. This means that the stepsize for the energyE must be very small.
But it is hard to tell how small the stepsize should be for a given η > 0. On the other hand, Algorithm 3.1 is not a correction method, and thus we cannot use the
Xs computed at a given energy to compute Xsat the next nearby energy point. But
the convergence of Qk to Xs is fast (see Theorem 3.1 and (3.6)) even though we use
Q0= Q for each energy E.
4. Numerical results. In this section we present some numerical results to illustrate the convergence behavior of the algorithms for computing the stabilizing
so-lution Xsof (1.7). We use DA, M FPM, and NM to denote the doubling algorithm
(Al-gorithm 3.1), the modified fixed-point method (Al(Al-gorithm 3.4), and Newton’s method (Algorithm 3.5), respectively. All computations are performed in MATLAB R2008b
using IEEE double-precision floating-point arithmetic (eps ≈ 2.2 × 10−16) on the
Linux system.
Suppose one complex flop is equivalent to four real flops and n is the dimension
of the matrices in (1.7). Then for each iteration, DA requires about (104/3)n3 real
flops, M FPM requires about (38/3)n3 real flops, and NM requires about (398/3)n3
real flops. We also recall that DA can compute the stabilizing solutions of (1.7) and (1.9) simultaneously if the matrices A and Q in these two equations are the same two matrices. This makes the comparison more favorable for DA.
To measure the accuracy of a computed stabilizing solution X to (1.7), we use the relative residual (r res)
X + ATX−1A − Q
X + A2X−1 + Q,
where · is the spectral norm.
Example 4.1. We randomly generate a symmetric matrix BL and an arbitrary
matrix AL of dimension 50. By Theorem 2.6, we find
ΔL= [−23.03, −19.94] ∪ [−19.84, 15.83] ∪ [16.12, 17.78] ∪ [18.26, 21.28].
We divide the interval [−23.03, 21.28] (the smallest interval containing ΔL) into P
subintervals using P + 1 equally spaced nodes Ei=−23.03 + 44.31(i/P ), i = 0, . . . , P .
We now choose P = 1000 and take η = 10−10in (1.8). Let dibe the distance between
T and the set of eigenvalues of the pencil (M0, L0) in (2.1) withE = Ei. We find that
di< 10−8 wheneverEi∈ ΔL.
We run DA for eachEi as well as for a fewE values outside [−23.03, 21.28]. The
algorithm is stopped when Qk+1− Qk < 10−8, and X = Qk+1 is taken to be a
good approximation to the stabilizing solution of (1.7). We also compute the relative residual for X as another check of accuracy. In Figure 4.1, we plot the relative
resid-uals, the number of iterations, and the distance betweenT and the set of eigenvalues
of the pencil (M0, L0). We see that DA converges to the stabilizing solution of (1.7)
in about 40 iterations for E ∈ ΔL, and that the convergence is very fast for otherE
values. We also see that r res < 10−10 for almost all E values. Generally speaking,
there is no need to look for higher accuracy since we have taken η = 10−10 rather
than η = 0+.
Note that DA computes the stabilizing solution for each E value without using
the stabilizing solution already computed for a nearby E value. For this example,
we have already found by DA the (approximate) stabilizing solution XDA(i) for each
E = Ei, i = 0, . . . , P . We now examine whether it is possible to use NM to find all
those stabilizing solutions more efficiently, with some help from DA. Let q ≥ 2 be a factor of P and P = mq. Suppose we use DA to find the stabilizing solutions only for E = Erq, r = 0, . . . , m − 1. Then we may find the stabilizing solutions for E = Erq+i
(i = 1, . . . , qr, with qr = q − 1 for r < m − 1 and qr = q for r = m − 1) as follows.
Take X0= XDA(rq), apply NM untilXk+1− Xk < 10−8 or k + 1 = 30, and take Xk+1
to be XNM(rq+1). To get XNM(rq+i) for i = 2, . . . , qr, we apply NM in the same way with
X0 = XNM(rq+i−1). For r = 0, . . . , m − 1 and i = 1, . . . , qr, we say NM is successful if
XNM(rq+i)−XDA(rq+i) < 10−6. Suppose NM is successful S times; we define the efficiency
(or success rate) of NM by
(4.1) Eff = S
m(q − 1) + 1.
For P = 1000, we find that NM converges (within 30 iterations) most of the time, with average number of iterations at about 10. However, it often converges to a wrong
−20 −15 −10 −5 0 5 10 15 20 0 0.5 E Distance −20 −15 −10 −5 0 5 10 15 20 10−15 10−10 Relative Residuals −20 −15 −10 −5 0 5 10 15 20 0 10 20 30 40 Iterations of DA
Fig. 4.1. Relative residuals, the number of iterations for convergence of DA, and the distance betweenT and the spectrum of (M0, L0);P = 1000.
solution. In practice, XDA(rq+i) is not computed if we choose to use NM to get XNM(rq+i).
But we could still check whether XNM(rq+i)is approximating the stabilizing solution by
computing the eigenvalues of (XNM(rq+i))−1A. The efficiency of NM is shown in Table 4.1
for different values of q. Note that the success rate of NM is only 58% even for q = 2. Besides, the computational work for 10 NM iterations is not much less than that for 40 DA iterations. Therefore, we would prefer to use DA to compute the stabilizing
solution for eachE value. However, NM may be used to improve the accuracy of the
solution obtained by DA if there is a need to do so. For example, whenE = 5.64, we
find that r res = 2× 10−9 after 40 DA iterations, and starting with this approximate
solution, we get r res = 3× 10−13 after 1 NM iteration.
Table 4.1
Efficiency of NM forP = 1000.
q 40 20 10 5 2
Eff 15% 27% 33% 48% 58%
Example 4.2. We consider a semi-infinite Hamiltonian operator for a heterostruc-tured semiconductor of the form
(4.2) H(ψ, x) = −∇ 2ε( x)∇ψ + V ( x)ψ, x ≡ x1 x2 ∈ Ω,
where Ω≡ Ω1∪ Ω2 with
Ω1= ([−9, −1] × (−∞, 0]) ∪ ([1, 9] × (−∞, 0]) , Ω2= [−1, 1] × (−∞, 0],
is the reduced Planck constant, ψ is the associated wave function, ε( x) is the electron effective mass with
ε( x) =
ε1, x ∈ Ω1,
ε2, x ∈ Ω2,
and V ( x) = ω1x21is the potential energy.
Let Trbe the tridiagonal matrix of dimension r with 4 on the main diagonal and
−1 on the two adjacent diagonals. We use the classical 5-point central finite difference method to discretize the Hamiltonian operator (4.2) on uniform grid points in Ω with
mesh size h. Then the corresponding matrices BLand ALin (1.8) are of the following
forms: BL= δ1Tl⊕ (2(δ1+ δ2))⊕ δ2Tm⊕ (2(δ1+ δ2))⊕ δ1Tl − δ1el+1eTl + eleTl+1 − δ2el+2eTl+1+ el+1eTl+2 − δ2en+1eTn+ eneTn+1 − δ1en+2eTn+1+ en+1eTn+2 + ω1h2diag(1− c)2, (2 − c)2, . . . , (N − c)2 and AL =− δ1Il⊕ δ1+ δ2 2 ⊕ δ2Im⊕ δ1+ δ2 2 ⊕ δ1Il ,
where δi=/2h2εi (i = 1, 2), ej denotes the jth vector of the identity matrix, l and
m are the numbers of grid points on the x1-axis in (−9, −1) and (−1, 1), respectively,
n = l + m + 1, N = 2l + m + 2, c = (N + 1)/2, and ⊕ denotes the direct sum of two matrices.
In our tests we take l = 39, m = 9, δ1 = 1, δ2 = 0.1, and ω1 = 5× 10−4.
By Theorem 2.6, we find ΔL = [0.00386, 8.0103]. We divide ΔL into P subintervals
using P + 1 equally spaced nodes Ei, i = 0, . . . , P . We now choose P = 1000 and take
η = 10−6 in (1.8). Let di be the distance betweenT and the set of eigenvalues of the
pencil (M0, L0) in (2.1) withE = Ei. We find that di< 10−5 wheneverEi ∈ ΔL.
We run DA for eachEias well as for a fewE values outside ΔL. The algorithm is
stopped whenQk+1− Qk < 10−8, and X = Qk+1is taken to be a good
approxima-tion to the stabilizing soluapproxima-tion of (1.7). In Figure 4.2, we plot the relative residuals and the number of iterations. We see that DA converges to the stabilizing solution of
(1.7) in about 26 iterations forE ∈ ΔL. We also see that r res < 10−9 for almost all
E values.
In this example, we have increased η from 10−10to 10−6, and as a result the usual
number of DA iterations is reduced from 40 to 26. We have also tried η = 10−10 for
this example and found that DA still works very well with the number of iterations
around 40, as in Example 4.1. In the nano literature, the smallest η used is 10−6since
no powerful methods were available at that time. With DA studied in this paper, we
can also take η to be smaller than 10−10 if there is a need to do so.
0 1 2 3 4 5 6 7 8 0 10 20 30 40 50 E = energy (η = 10−6) Iterations of DA 0 1 2 3 4 5 6 7 8 10−16 10−14 10−12 10−10 10−8 Relative Residuals
Fig. 4.2. Relative residuals and the number of iterations for convergence of DA;P = 1000.
Suppose we use DA to find the stabilizing solution only forE = E0 and use NM
(in the same way as in Example 4.1) to find the stabilizing solutions for E = Ei,
i = 1, . . . , P . In the notation of Example 4.1, we have q = P , m = 1, and Eff = S/P . We find that Eff = 46.3% and 46.5% for P = 1000 and 2000, respectively. To have Eff = 100%, one would have to take an extremely large P , which is not practical.
Example 4.3. We consider Example 2.1 with t = 1. The matrices A and Q in
(1.8) are now given by A = [01 00] and Q = (E + iη)I2− [21 12]. We take η = 10−10. For
0≤ E ≤ 4, the distance between T and the set of eigenvalues of the pencil (M0, L0)
in (2.1) is less than 10−8.
We run DA as before. In Figure 4.3 we plot the relative residuals and the number of iterations for convergence of DA. In most cases, 37 iterations are needed. We know from Theorem 3.1(c) that in exact arithmetic the approximate solution obtained after
37 DA iterations is the same as the approximate solution obtained after 237−1 (which
is over 137 billion) iterations of the basic fixed-point method (Algorithm 3.2). So we would not try to use Algorithm 3.2 even for this small problem. But M FPM turns
out to be quite useful. We run M FPM for E = Ei = 0.004i, i = 0, 1, . . . , 1000.
The algorithm is stopped when r res < 10−10 or the number of iterations exceeds
10000. Figure 4.4 plots the relative residuals and the number of iterations. More
specifically, we find that M FPM converges in 37–100 iterations for 536 values of Ei
with i ∈ {37–301, 303–305, 695–697, 699–963}, converges in 0 iterations for E500= 2,
and does not converge in 100 iterations for the other 464 values ofEi. Note that for
E500 = 2, the number of M FPM iterations is 0 since X0 = Q happens to be the
exact solution when η = 0 and since we take η = 10−10, which is very close to 0.
DA takes 1 iteration for E = 2 only because a different stopping criterion is used.
ForE = Ei, i = 302, 698, the number of M FPM iterations is 102, exceeding 100 only
−0.50 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 0.2 0.4 0.6 0.8 1x 10 −9 Relative Residuals −0.50 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 10 20 30 40 50 E Iterations of DA
Fig. 4.3. Relative residuals and the number of iterations for convergence of DA;η = 10−10.
0 0.5 1 1.5 2 2.5 3 3.5 4 0 0.5 1 1.5x 10 −5 Relative Residuals 0 0.5 1 1.5 2 2.5 3 3.5 4 0 5000 10000 E Iterations of M_FPM
Fig. 4.4. Relative residuals and the number of iterations for convergence of M FPM;η = 10−10.
slightly. We may say that for this example, M FPM is more efficient than DA most of the time. However, there is currently no theory about the convergence of M FPM even for this simple example.
5. Conclusion. We have studied a doubling algorithm for Green’s function cal-culations in nano research. Its convergence to the desired solution is guaranteed. The algorithm has been shown to be efficient and reliable. Newton’s method cannot be used for this purpose by itself, but it may be used as a correction method if very high accuracy is required. A modified fixed-point method may be more efficient than the doubling algorithm in some situations, but its convergence analysis remains an open problem.
REFERENCES
[1] I. Appelbaum, T. Wang, J. D. Joannopoulos, and V. Narayanamurti, Ballistic hot-electron transport in nanoscale semiconductor heterostructures: Exact self-energy of a three-dimensional periodic tight-binding Hamiltonian, Phys. Rev. B, 69 (2004), Article Number 165301, 6 pp.
[2] D. A. Bini, L. Gemignani, and B. Meini, Computations with infinite Toeplitz matrices and polynomials, Linear Algebra Appl., 343–344 (2002), pp. 21–61.
[3] C.-Y. Chiang, E. K.-W. Chu, C.-H. Guo, T.-M. Huang, W.-W. Lin, and S.-F. Xu, Con-vergence analysis of the doubling algorithm for several nonlinear matrix equations in the critical case, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 227–247.
[4] E. K.-W. Chu, T.-M. Hwang, W.-W. Lin, and C.-T. Wu, Vibration of fast trains, palin-dromic eigenvalue problems and structure-preserving doubling algorithms, J. Comput. Appl. Math., 219 (2008), pp. 237–252.
[5] S. Datta, Electronic Transport in Mesoscopic Systems, Cambridge University Press, Cam-bridge, UK, 1995.
[6] S. Datta, Nanoscale device modeling: The Green’s function method, Superlattices and Mi-crostructures, 28 (2000), pp. 253–278.
[7] E. N. Economou, Green’s Functions in Quantum Physics, 3rd ed., Springer, Berlin, 2006. [8] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, Necessary and sufficient conditions
for the existence of a positive definite solution of the matrix equationX + A∗X−1A = Q, Linear Algebra Appl., 186 (1993), pp. 255–275.
[9] I. Gohberg, S. Goldberg, and M. A. Kaashoek, Classes of Linear Operators, Vol. II, Oper. Theory Adv. Appl. 63, Birkh¨auser, Basel, 1993.
[10] C.-H. Guo, Convergence rate of an iterative method for a nonlinear matrix equation, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 295–302.
[11] C.-H. Guo, Numerical solution of a quadratic eigenvalue problem, Linear Algebra Appl., 385 (2004), pp. 391–406.
[12] C.-H. Guo and P. Lancaster, Iterative solution of two matrix equations, Math. Comp., 68 (1999), pp. 1589–1603.
[13] N. J. Higham and H.-M. Kim, Numerical analysis of a quadratic matrix equation, IMA J. Numer. Anal., 20 (2000), pp. 499–519.
[14] N. J. Higham and H.-M. Kim, Solving a quadratic matrix equation by Newton’s method with exact line searches, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 303–316.
[15] D. L. John and D. L. Pulfrey, Green’s function calculations for semi-infinite carbon nan-otubes, Physica Status Solidi B, 243 (2006), pp. 442–448.
[16] A. Kletsov, Y. Dahnovsky, and J. V. Ortiz, Surface Green’s function calculations: A non-recursive scheme with an infinite number of principal layers, J. Chem. Phys., 126 (2007), Article Number 134105, 5 pp.
[17] W.-W. Lin and S.-F. Xu, Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 26–39. [18] D. S. Mackey, N. Mackey, C. Mehl, and V. Mehrmann, Structured polynomial eigenvalue
problems: Good vibrations from good linearizations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 1029–1051.
[19] C. van der Mee, G. Rodriguez, and S. Seatzu,LDU factorization results for bi-infinite and semi-infinite scalar and block Toeplitz matrices, Calcolo, 33 (1996), pp. 307–335. [20] B. Meini, Efficient computation of the extreme solutions of X + A∗X−1A = Q and X −
A∗X−1A = Q, Math. Comp., 71 (2002), pp. 1189–1204.
[21] J. Tomfohr and O. F. Sankey, Theoretical analysis of electron transport through organic molecules, J. Chem. Phys., 120 (2004), pp. 1542–1554.
[22] X. Zhan, Computing the extremal positive definite solutions of a matrix equation, SIAM J. Sci. Comput., 17 (1996), pp. 1167–1174.