Contents lists available atSciVerse ScienceDirect
Journal of Computational and Applied
Mathematics
journal homepage:www.elsevier.com/locate/cam
Numerical solution of nonlinear matrix equations arising from Green’s
function calculations in nano research
Chun-Hua Guo
a,∗, Yueh-Cheng Kuo
b, Wen-Wei Lin
caDepartment of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada bDepartment of Applied Mathematics, National University of Kaohsiung, Kaohsiung 811, Taiwan cDepartment of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
a r t i c l e i n f o Article history: Received 14 December 2011 MSC: 15A24 65F30 Keywords:
Nonlinear matrix equation Weakly stabilizing solution Structure-preserving algorithm Green’s function
a b s t r a c t
The Green’s function approach for treating quantum transport in nano devices requires the solution of nonlinear matrix equations of the form X+(C∗+iηD∗)X−1(C+iηD) =R+iηP, where R and P are Hermitian, P+λD∗+λ−1D is positive definite for allλon the unit
circle, andη → 0+. For each fixedη > 0, we show that the required solution is the unique stabilizing solution Xη. Then X∗ = limη→0+Xηis a particular weakly stabilizing
solution of the matrix equation X+C∗X−1C=R. In nano applications, the matrices R and
C are dependent on a parameter, which is the system energyE. In practice one is mainly interested in those values ofEfor which the equation X+C∗X−1C =R has no stabilizing solutions or, equivalently, the quadratic matrix polynomial P(λ) = λ2C∗−λR+C has eigenvalues on the unit circle. We point out that a doubling algorithm can be used to compute Xηefficiently even for very small values ofη, thus providing good approximations to X∗. We also explain how the solution X∗can be computed directly using subspace methods such as the QZ algorithm by determining which unimodular eigenvalues of P(λ) should be included in the computation. In some applications the matrices C,D,R,P have very special sparsity structures. We show how these special structures can be exploited to drastically reduce the complexity of the doubling algorithm for computing Xη.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction
In this paper we study nonlinear matrix equations of the form
X
+
(
C∗+
iη
D∗)
X−1(
C+
iη
D) =
R+
iη
P,
(1)where R and P are Hermitian, P
+
λ
D∗+
λ
−1D is positive definite for allλ
on the unit circle T, andη ≥
0. The special case where P=
I,
D=
0 and C,
R are real arises in nano research [1–4] and has been studied in [5,6]. We now briefly explain how the general equation(1)also arises in nano applications.A main goal of basic research in molecular electronics is to advance the understanding of electron transport through molecules. In [7], a method for calculating the current is described for a system that consists of a molecule connected between two semi-infinite metallic electrodes, and is implemented in a program that assumes a local-orbital picture and requires as input the Hamiltonian and overlap matrix elements between orbitals.
∗Corresponding author.
E-mail addresses:[email protected](C.-H. Guo),[email protected](Y.-C. Kuo),[email protected](W.-W. Lin). 0377-0427/$ – see front matter©2012 Elsevier B.V. All rights reserved.
The system Hamiltonian is a bi-infinite Hermitian matrix of the form H
=
H L HLM 0 HLM∗ HM HMR 0 HMR∗ HR
,
(2)where HM
,
HL,
HRare the Hamiltonians for the molecule, the left electrode, and the right electrode, respectively, and theoverlap matrix is a Hermitian positive definite matrix partitioned in the same way and is given by S
=
S L SLM 0 S∗LM SM SMR 0 SMR∗ SR
.
In [7] the blocks HLand HLMin(2)are shifted by sLSLand sLSLM, respectively, where sLis a proper energy shift, and the
blocks HRand HMRare shifted similarly. These shifts do not change the structure of the matrix H in(2). So we can simply
assume that the matrix H in(2)has already gone through the shifting procedure. The Green’s function (of the full interacting system) is defined by
G
=
(
E+
i0+)
S−
H
−1=
limη→0+
((
E+
iη)
S−
H)
−1,
whereEis energy. We note that for each
η >
0 the infinite matrix(
E+
iη)
S−
H=
ES−
H+
iη
S is known to be invertible by Bendixson’s theorem (see [8, Lemma 3.3]), but the existence of the above one-sided limit is something assumed. The molecule Green’s function GMis that part of G corresponding to the block for the molecule and is obtained fromGM
=
(
E+
i0+)
SM−
HM−
ΣL−
ΣR
−1,
where ΣL=
(E
SLM−
HLM)
∗GL(E
SLM−
HLM),
ΣR=
(E
SMR−
HMR)
GR(E
SMR−
HMR)
∗,
with GL=
(
E+
i0+)
SL−
HL
−1,
GR=
(
E+
i0+)
SR−
HR
−1.
Then [7] the net current is determined through a definite integral of the transmission function given by T
(
E) =
tr(Γ
LGMΓRG∗M),
where
ΓL
=
i(Σ
L−
ΣL∗),
ΓR=
i(Σ
R−
ΣR∗),
and tr
(·)
denotes the trace of a matrix. Note that T(
E)
is a real function ofEsinceΓLandΓRare Hermitian.We now explain how the matrixΣRis computed. The computation ofΣLis similar. The matrices HRand SRcan be written
as [7] HR
=
Hs Hsb Hsb∗ Hb Hbb Hbb∗ Hb Hbb Hbb∗ Hb...
... ...
,
SR=
Ss Ssb Ssb∗ Sb Sbb Sbb∗ Sb Sbb Sbb∗ Sb...
... ...
,
where Hs
,
Ss∈
Cq×qand Hb,
Sb∈
Cn×n, and we suppose that HM,
SM∈
Cp×p. The size of Hsand Sshas been taken sufficientlylarge so that all nonzero elements of the matrices HMRand SMRare in the p
×
q block on the left. This means that we onlyneed Gs, the q
×
q block in the upper-left corner of GR, for the computation ofΣR. It is easy to see [7] that Gsis determinedthrough Gs
=
(
Us−
UsbGbU ′ sb)
−1,
where Us=
zSs−
Hs,
Usb=
zSsb−
Hsb,
Usb′=
zS ∗ sb−
H ∗ sbwith z=
E+
iη
andη →
0 +, and Gbis the n
×
n block in theupper-left corner of the inverse of
zSb−
Hb zSbb−
Hbb zSbb∗−
Hbb∗ zSb−
Hb zSbb−
Hbb zS∗bb−
Hbb∗ zSb−
Hb...
...
...
.
(3)Note that the above matrix is invertible by Bendixson’s theorem since the matrix TR
=
Sb Sbb Sbb∗ Sb Sbb Sbb∗ Sb...
... ...
(4)is positive definite when S is positive definite. Note also thatTRis positive definite if and only if Sb
+
λ
Sbb+
λ
−1S∗bbis positivedefinite for all
λ
on T.The block Toeplitz structure of the matrix(3)implies that Gbsatisfies the matrix equation
Gb
=
(
Ub−
UbbGbUbb′)
−1,
(5) where Ub=
zSb−
Hb,
Ubb=
zSbb−
Hbb,
Ubb′=
zS ∗ bb−
H ∗ bb.For any W
∈
Cn×n, we can write W=
WR+
iWI, where the Hermitian matricesWR
=
1 2(
W+
W ∗),
WI=
1 2i(
W−
W ∗)
are called the real part and the imaginary part of W , respectively. We are only interested inEvalues for which the required solution Gbof(5)has a nonzero imaginary part (in the limit
η →
0+) since otherwise Gband then Gswould be Hermitian,which would imply thatΣRis Hermitian and then T
(E
) =
0 for the transmission function.Now we let
X
=
G−b1,
C=
ESbb∗−
Hbb∗,
D=
Sbb∗,
R=
ESb−
Hb,
P=
Sb.
Then Eq.(5)becomes(1).
2. Characterization of the solution Gb
The matrix equation(1)may have many different solutions. So what solution X do we need so that X−1
=
Gbis therequired solution of(5)?
Let A
=
C+
iη
D,
B=
C∗+
iη
D∗,
Q=
R+
iη
P. Then(1)becomesX
+
BX−1A=
Q.
(6)As before, R and P are Hermitian and P
+
λ
D∗+
λ
−1D is positive definite for allλ
on T. LetM
=
A 0 Q−
I
,
L=
0 I B 0
.
(7)Then X is a solution of(6)if and only if M
I X
=
L
I X
X−1A.
(8)Therefore, every solution of(6)can be obtained from a suitable invariant subspace for the pencil M
−
λ
L.Lemma 1. For any
η ̸=
0, the matrix pencil M−
λ
L has no eigenvalues on T.Proof. Suppose that
λ ∈
T and(
M−
λ
L)
x=
0 for a vector x=
(
x⊤1,
x⊤ 2
)
⊤with x 1,
x2∈
Cn. Then Ax1=
λ
x2,
Qx1−
x2=
λ
Bx1.
(9) By eliminating x2in(9)we have Wx1≡
λ
B−
Q+
λ
−1A
x1=
0.
(10)The imaginary part of x∗
1Wx1is
−
η
x∗1(
P−
λ
D∗
−
λ
−1D)
x1. Since P
−
λ
D∗−
λ
−1D is positive definite, it follows from(10)that x1=
0. By(9)we have x2=
0. Thus, M−
λ
L has no eigenvalues on T.Proof. We consider the matrix pencils H
(
t, λ) =
tA 0 tR+
iη
P−
I
−
λ
0 I tB 0
obtained from the pencil M
−
λ
L by replacing C,
D,
R with tC,
tD,
tR. For each t∈ [
0,
1]
andλ ∈
T, P+
λ(
tD∗) + λ
−1(
tD) =
(
1−
t)
P+
t(
P+
λ
D∗+
λ
−1D)
is positive definite. FromLemma 1we know that H(
t, λ)
has no eigenvalues on T for all t∈ [
0,
1]
. Hence, H(
1, λ) =
M−
λ
L and H(
0, λ)
have the same numbers of eigenvalues inside T. But it is clear that H(
0, λ)
has n eigenvalues at 0 and n eigenvalues at∞
.Note that the pencil M
−
λ
L is a linearization of the quadratic polynomial P(λ) = λ
2B−
λ
Q+
A.The basic fixed-point iteration for finding a solution of(6)is Xk+1
=
F(
Xk)
, whereF(
X) =
Q−
BX−1A. The Fréchetderivative ofF at X is the linear mapF′
X
:
Cn×n
→
Cn×ngiven byFX′
(
Z) = (
BX−1
)
Z(
X−1A)
. A solution X of(6)is said to be stabilizing ifρ(
FX′) <
1 or, equivalently,ρ(
BX−1)ρ(
X−1A) <
1, whereρ(·)
denotes the spectral radius. Note that the basic fixed-point iteration is locally convergent at a stabilizing solution.Let X be any solution of(6). Then we have P
(λ) = (λ
BX−1−
I)
X(λ
I−
X−1A).
So the eigenvalues of X−1A are n eigenvalues of P
(λ)
, and the eigenvalues of BX−1are the reciprocals of the remaining n eigenvalues of P(λ)
. It then follows fromTheorem 2that a solution X of(6)is stabilizing if and only ifρ(
X−1A) <
1 and that there is at most one stabilizing solution.Let the invariant subspace of M
−
λ
L corresponding to its n eigenvalues inside T be spanned by the columns of thematrix
U V
, where U
,
V∈
Cn×n. Then the existence of a stabilizing solution can be established by showing that U and V areboth invertible. The stabilizing solution is then X
=
VU−1. For the case where the matrices C,
D,
R,
P in(6)are all real, an elementary proof for the invertibility of U has already been given in [9] and we note that the invertibility of V can be proved in the same way. It is also shown in [9] that the imaginary part of the stabilizing solution is positive definite forη >
0. The proofs can be carried over to the complex case here with only very minor changes.Here, however, we are going to use an advanced result on linear operators to show the existence of a stabilizing solution since this approach will also explain that the stabilizing solution is precisely the solution we need for the nano application. The treatment is very similar to the one in [6] for a special case of Eq.(6). So our presentation here will be very brief.
Recall that Gb
=
limη→0+Gb(η)
, where Gb(η)
is the n×
n matrix in the upper-left corner of T−1with T given by(3)foreach
η >
0. Using the current notation, we haveT
=
Q B A Q B A Q...
... ...
.
(11)Associated with T is the rational matrix function
φ(λ) = λ
A+
Q+
λ
−1B. We already know from Bendixson’s theorem (see [8, Lemma 3.3]) that T is invertible for eachη >
0. Thus, by a result on linear operators (see [10, Chapter XXIV, Theorem 4.1] and [11]) we know thatφ(λ)
has a factorizationφ(λ) = (
I−
λ
−1L)
X(
I−
λ
U)
(12)with X invertible,
ρ(
L) <
1 andρ(
U) <
1. From(12)we see that A= −
XU,
B= −
LX,
Q=
X+
LXU.
Thus X
+
BX−1A=
Q andρ(
X−1A) <
1. In other words, X is the unique stabilizing solution of(6).By [10, Chapter XXIV, Theorem 4.1] the n
×
n matrix in the upper-left corner of T−1is precisely X−1. We thus have the following characterization of Gb(η)
.Theorem 3. For any
η >
0, the matrix Gb(η)
is the inverse of the unique stabilizing solution of(6).3. Computation of the stabilizing solution
Let M and L be as in(7). Then the stabilizing solution X of(6)satisfies(8)with
ρ(
X−1A) <
1.We remark that Eq.(6)with real matrices C
,
D,
R,
P also arises in the study of a quadratic eigenvalue problem from the vibration analysis of fast trains, where the required solution is also the stabilizing solution and a doubling algorithm is used to find the solution [8]. We can get similar results for our more general equation(6). The situation here is slightly more complicated since we no longer have B=
A⊤and the stabilizing solution is no longer complex symmetric.Starting with the matrices M and L in(7), we define the sequences
{
Mk}
and{
Lk}
, where Mk=
Ak 0 Qk−
I
,
Lk=
−
Pk I Bk 0
,
by the following structure-preserving doubling algorithm if no breakdown occurs.
Algorithm 1. Let A0
=
A,
B0=
B,
Q0=
Q,
P0=
0. For k=
0,
1, . . . ,
compute Ak+1=
Ak(
Qk−
Pk)
−1 Ak,
Bk+1=
Bk(
Qk−
Pk)
−1Bk,
Qk+1=
Qk−
Bk(
Qk−
Pk)
−1Ak,
Pk+1=
Pk+
Ak(
Qk−
Pk)
−1Bk.
The above algorithm is the SDA-2 as presented in [12]. The next result shows that the doubling algorithm has some nice properties. In particular, it can compute the stabilizing solution X of(6)efficiently.
Theorem 4. Let X be the stabilizing solution of (6)and
X be the stabilizing solution of the dual equation
X+
A
X−1B
=
Q.
Then
(a) The sequences
{
Ak}
, {
Bk}
, {
Qk}
, {
Pk}
in Algorithm 1 are well-defined.(b) Qkconverges to X quadratically, Akand Bkconverge to 0 quadratically, Q
−
Pkconverges to
X quadratically, withlim sup k→∞ 2k
∥
Qk−
X∥ ≤
ρ(
X −1B)ρ(
X−1A),
lim sup k→∞ 2k
∥
Ak∥ ≤
ρ(
X−1A),
lim sup k→∞ 2k
∥
Bk∥ ≤
ρ(
X −1 B),
lim sup k→∞ 2k
∥
Q−
Pk−
X∥ ≤
ρ(
X −1 B)ρ(
X−1A),
where∥ · ∥
is any matrix norm.Proof. The proof is very similar to that of [8, Theorem 4.1]. Although the statement of that theorem and the beginning of its proof refer to the specific problem under consideration in [8], the proof there is valid for this theorem after some minor changes. Here we only mention the following differences. In [8], Q⊤
=
Q and B=
A⊤(this would be true here if the matrices C,
D,
R,
P were all real), and in that case we can conclude that Bk=
A⊤k,
Q⊤
k
=
Qk,
Pk⊤=
Pk, andρ(
X−1B) = ρ(
X−1A)
.Remark 1. Although we no longer have
ρ(
X−1B) = ρ(
X−1A)
in general, we always haveρ(
X−1B) = ρ(
X−1B)
. In fact, theeigenvalues of
X−1B are those of
P(λ) = λ
2A−
λ
Q+
B inside T. We have mentioned earlier that the eigenvalues of BX−1are the reciprocals of those eigenvalues of P
(λ) = λ
2B−
λ
Q+
A outside T. However, the reciprocals of the eigenvalues of P(λ)
outside T are precisely the eigenvalues of
P(λ)
inside T. It is also well known that BX−1and X−1B have the sameeigenvalues.
Remark 2. As in [6], we can show that the basic fixed-point iteration (FPI) Xk+1
=
Q−
BXk−1A,
X0=
Qis also convergent and X2k−1
=
Qk. So the convergence of the FPI is much slower than the doubling algorithm. However, wecan use an averaging procedure for the FPI to speed up its convergence and for the nano application we can use the computed solution for one energy value as an initial guess for the solution for the next nearby energy value [7]. For the special case where P
=
I,
D=
0 and C,
R are real, convergence results for methods based on these ideas have been proved in [5] using the Earle–Hamilton theorem [13,14]. The proofs there can be carried over to Eq.(6), with some minor changes, as long as D=
0 still holds (so C is any complex matrix, R is any Hermitian matrix, and P is any Hermitian positive definite matrix). However, when D̸=
0 we are unable to prove any non-local convergence results for those methods. The Earle–Hamilton theorem is not applicable since we no longer have B=
A∗.To emphasize its dependence on
η
, the stabilizing solution of(6) will be denoted by Xη. For the nano application, Gb=
limη→0+Gb(η)
and Gb(η) =
Xη−1. So Gb=
X∗−1with X∗=
limη→0+Xη. It is easy to see that X∗is a particular weaklystabilizing solution of the matrix equation
X
+
C∗X−1C=
R,
(13)with
ρ(
X∗−1C) ≤
1 andρ(
C∗X∗−1) ≤
1. The solution X∗can be approximated by computing Xηby the doubling algorithm fora sufficiently small
η
, but can also be computed directly by subspace methods, as we shall see in Section5. For now, we write X∗=
X∗,R+
iX∗,I, where the Hermitian matrices X∗,Rand X∗,Iare the real part and the imaginary part of X∗, respectively, andwe will examine the rank of X∗,I. Since the imaginary part of Xηis positive definite for
η >
0, we know that X∗,Iis positive4. Rank of X∗,I
We now denote the matrices A
,
B,
Q in(6)by Aη,
Bη,
Qη, respectively. SoAη
=
C+
iη
D,
Bη=
C∗+
iη
D∗,
Qη=
R+
iη
P,
(14)with C
,
D,
R,
P as before. Let Pη(λ) = λ
2Bη−
λ
Qη+
Aη.
We already know that Pηhas no eigenvalues on T for
η ̸=
0. Forη =
0 we get P0(λ) = λ
2C∗−
λ
R+
C.
It is quite possible that P0
(λ)
has some eigenvalues on T. As we will see later, this is the case of primary interest for the nano application.Theorem 5. The number of eigenvalues (counting multiplicities) of P0
(λ)
on T must be even, say 2m. Moreover, we have rank(
X∗,I) ≤
m.Proof. The matrix polynomial P0
(λ)
is∗
-palindromic. Thusµ
is an eigenvalue of P0(λ)
if and only if 1/µ
is so, and they have the same algebraic, geometric, and partial multiplicities [15]. It follows that the total number of eigenvalues of P0(λ)
on T must be even.By Xη
+
(
C∗+
iη
D∗)
Xη−1(
C+
iη
D) =
R+
iη
P we haveXη
+
C∗Xη−1C=
R+
η
Wη,
(15)where
Wη
=
iP−
iD∗Xη−1C−
iC∗Xη−1D+
η
D∗Xη−1D.
Taking imaginary parts on(15), we getKη
−
Fη∗KηFη=
η
Tη,
(16)where Kη
=
Im(
Xη),
Tη=
Im(
Wη),
Fη=
X−1η C . Let F
=
limη→0+Fη=
X∗−1C . Then the eigenvalues of F consist of all n−
meigenvalues of P0
(λ)
inside T plus m eigenvalues of P0(λ)
on T. Let F=
V0
R0,1 0 0 R0,2
V0−1 (17)be a spectral resolution of F , where R0,1
∈
Cm×mand R0,2∈
C(n−m)×(n−m)are upper triangular withσ (
R0,1) ⊆
T andσ (
R0,2) ⊆
D≡ {
λ ∈
C||
λ| <
1}
. It follows from [16, Chapter V, Theorem 2.8] that there is a nonsingular matrix Vηsuch that Fη=
Vη
Rη,1 0 0 Rη,2
Vη−1,
(18) and Rη,1→
R0,1, Rη,2→
R0,2, and Vη→
V0, asη →
0+. From(16)and(18)we haveVη∗KηVη
−
R∗η,1 0 0 R∗η,2
Vη∗KηVη
Rη,1 0 0 Rη,2
=
η
Vη∗TηVη.
(19) Let Vη∗KηVη=
Hη,1 Hη,3 Hη,∗3 Hη,2
,
Vη∗TηVη=
Zη,1 Zη,3 Zη,∗3 Zη,2
.
(20) Then(19)becomes Hη,1−
R∗η,1Hη,1Rη,1=
η
Zη,1,
(21a) Hη,2−
R∗η,2Hη,2Rη,2=
η
Zη,2,
(21b) Hη,3−
R∗η,1Hη,3Rη,2=
η
Zη,3.
(21c)As
η →
0+, Rη,1→
R0,1withρ(
R0,1) =
1, Rη,2→
R0,2withρ(
R0,2) <
1, and Zη,2and Zη,3are bounded by the convergence of Tη. So we have Hη,2→
0 from(21b)and Hη,3→
0 from(21c). Since X∗,I=
limη→0+Kη, it follows from(20)that rank(
X∗,I) ≤
m.We conjecture that equality holds inTheorem 5when all eigenvalues of P0
(λ)
on T are simple. For the nano application, the matrices C and R in P0(λ)
are given byC
=
ES∗bb−
Hbb∗,
R=
ESb−
Hb.
If P0
(λ)
has no eigenvalues on T for an energy valueE, then X∗is Hermitian byTheorem 5and Gb=
X∗−1is also Hermitian.We then know that the transmission function T
(E
)
takes zero value, without solving any nonlinear matrix equations. So we are only interested in thoseEvalues for which P0(λ)
has some eigenvalues on T. The next simple result is thus relevant, where Sb−
λ
Sbb−
λ
−1Sbb∗ is positive definite for allλ
on T.Theorem 6. For
λ ∈
T, let the eigenvalues of(
Sb−
λ
Sbb−
λ
−1S∗ bb)
−1(
H b−
λ
Hbb−
λ
−1H∗ bb)
beµ
1(λ) ≤ · · · ≤ µ
n(λ)
. Let ∆i=
min |λ|=1µ
i(λ),
max|λ|=1µ
i(λ)
,
and∆=
ni=1∆i. Then the quadratic pencil P0
(λ) = λ
2(
ESbb−
Hbb) − λ(
ESb−
Hb) + (
ESbb∗−
H∗
bb
)
has some eigenvalues onT if and only if E
∈
∆.Proof. The quadratic P0
(λ)
has some eigenvalues on T if and only if det(
P0(λ)) =
0 for someλ ∈
T or, equivalently, det(−λ
−1P0(λ)) =
det
E
(
Sb−
λ
Sbb−
λ
−1S∗bb) − (
Hb−
λ
Hbb−
λ
−1Hbb∗) =
0for some
λ ∈
T, the latter is equivalent toE∈
∆.5. Direct computation of X∗
The solution X∗can be computed directly by subspace methods. We will need to include all eigenvalues of
P
(λ) = λ
2C∗−
λ
R+
C (22)inside T and half of its eigenvalues on T—the half that would be perturbed to the inside of T when P
(λ)
is perturbed toPη
(λ) = λ
2(
C∗+
iη
D∗) − λ(
R+
iη
P) + (
C+
iη
D).
(23) Let M=
C 0 R−
I
,
L=
0 I C∗ 0
.
(24)Then the pencilM
−
λ
L, also denoted by(
M,
L)
, is a linearization of the quadratic matrix polynomial P(λ)
. It is easy to check that y and z are the right and left eigenvectors, respectively, corresponding to an eigenvalueλ
of P(λ)
if and only if
y Ry−
λ
C∗y
,
z−
λ
z
(25) are the right and left eigenvectors of(
M,
L)
, respectively.The following result is a generalization of [5, Theorem 3.1] for the special case where P
=
I,
D=
0 and C,
R are real. It shows which invariant subspace corresponding to unimodular eigenvalues of P(λ)
should be used in the computation of X∗,assuming they are all semi-simple.
Theorem 7. Suppose that
λ
0is a semi-simple eigenvalue of P(λ)
on T with multiplicity m0and Y∈
Cn×m0forms an orthonormal basis of right eigenvectors corresponding toλ
0. Then iY∗(
2λ
0C∗−
R)
Y is a nonsingular Hermitian matrix. Let dj, j=
1, . . . , ℓ
,be the distinct eigenvalues of
Y∗
(
P−
λ
0D∗−
λ
−01D)
Y
iY∗(
2λ
0C∗−
R)
Y
−1with multiplicities mj0, and let
ξ
j∈
Cm0×mj
0 form an orthonormal basis of the eigenspace corresponding to dj. Then for
η >
0 sufficiently small and j=
1, . . . ℓ
λ
(k) j,η=
λ
0−
λ
0djη +
O(η
2),
k=
1, . . . ,
m j 0,
and yj,η=
Y
Y∗(
P−
λ
0D∗−
λ
0−1D)
Y
−1ξ
j+
O(η)
(26)Proof. Since P
(λ
0)
Y=
0 with Y∗Y=
Im0and|
λ
0| =
1, we have 0∗=
(
P(λ
0)
Y)
∗=
λ
2 0Y ∗(λ
2 0C ∗−
λ
0R+
C).
It follows that Y forms an orthonormal basis for left eigenvectors of P
(λ)
corresponding toλ
0. From(25), we obtain that the column vectors of YR=
Y RY−
λ
0C∗Y
and YL=
Y−
λ
0Y
form a basis of left and right eigenspaces ofM
−
λ
Lcorresponding toλ
0, respectively. Sinceλ
0is semi-simple, the matrix[
Y∗, −λ
0Y∗]
L
Y RY−
λ
0C∗Y
= −
Y∗(
2λ
0C∗−
R)
Y= −
Y∗P′(λ
0)
Y is nonsingular. Let
YR= −
YR
Y∗P′(λ
0)
Y
−1,
Y
L=
YL.
Then we have
YL∗LYR=
Im0 and Y
∗ LMYR=
λ
0Im0.
(27)For
η >
0 sufficiently small, we letMη
=
C+
iη
D 0 R+
iη
P−
I
,
Lη=
0 I C∗+
iη
D∗ 0
.
ThenMη
−
λ
Lηis a linearization of Pη(λ)
. By(27)and [16, Chapter VI, Theorem 2.12] there are
YRandY
Lsuch that
YR
YR
and
YLY
L
are nonsingular and
Y∗L
Y∗L
M
YRY
R =
λ
0Im0 0 0 M
,
Y∗L
Y∗L
L
Y
R
YR =
Im0 0 0 L
.
Then, by [16, Chapter VI, Theorem 2.15] the column vectors ofY
R+
O(η)
span the right eigenspace of(
Mη,
Lη)
correspondingto
(Λ,
Im0)
, where Λ=
λ
0Im0+
η
E11+
O(η
2)
I m0+
η
F11+
O(η
2)
−1 with E11=
Y
∗ L
iD 0 iP 0
YR=
Y∗(λ
0iP−
iD)
Y
Y∗P′(λ
0)
Y
−1,
F11=
Y ∗ L
0 0 iD∗ 0
YR=
Y∗(λ
0iD∗)
Y
Y∗P′(λ
0)
Y
−1.
Thus Λ=
λ
0Im0+
η(
E11−
λ
0F11) +
O(η
2) = λ
0Im0−
ηλ
0W+
O(η
2),
where W=
Y∗(
P−
λ
0D∗−
λ
0−1D)
Y
iY∗P′(λ
0)
Y
−1.
(28) The matrix Z=
iY∗P′(λ
0)
Y=
iY∗(
2λ
0C∗−
R)
Y in(28)is Hermitian since Z−
Z∗=
iY∗(
2λ
0C∗+
2λ
0C−
2R)
Y=
2iλ
0Y∗P(λ
0)
Y=
0.
Since Y∗(
P−
λ
0D∗
−
λ
−01D)
Y is positive definite, all eigenvalues of W in(28)are real and there are m0linearly independent eigenvectors. Let djfor j=
1, . . . , ℓ
be the distinct eigenvalues of W with multiplicities mj
0, and let
ξ
j∈
Cm0×mj
0 form an orthonormal basis of the eigenspace corresponding to dj. Then we have
Φ−1ΛΦ
=
λ
0Im0−
ηλ
0diag
d1Im10, . . . ,
dℓImℓ0
+
O(η
2)
whereΦ
= [
ξ
1, . . . , ξ
ℓ] ∈
Cm0×m0. Then for each j∈ {
1,
2, . . . , ℓ}
, the perturbed eigenvaluesλ
(k)j,η, k
=
1, . . . ,
m j0, and a basis of the corresponding invariant subspace ofMη
−
λ
Lηwithλ
j(,ηk)|
η=0=
λ
0can be expressed byλ
(k)and
ζ
j,η=
YR
Y∗
(
P−
λ
0D∗−
λ
0−1D)
Y
−1ξ
j+
O(η).
(29b)The equation in(26)follows from(29b).
For the pencil
(
M,
L)
given by(24), the relationM
I X
=
L
I X
X−1A shows that the weakly stabilizing solution X∗of (13)is obtained by X∗
=
X2X1−1, where the columns of
X 1 X2
form a basis for the invariant subspace of
(
M,
L)
corresponding to its eigenvalues inside T and its eigenvalues on T that would be perturbed to the inside of T when(
M,
L)
is perturbed to(
Mη,
Lη)
withη >
0.We can use the QZ algorithm to determine this invariant subspace, with the aid ofTheorem 7when all unimodular eigenvalues of
(M,
L)are semi-simple. In practice, these unimodular eigenvalues are likely to be simple and the statements in ourTheorem 7can be simplified significantly. However, if 1 (or−
1) happens to be an eigenvalue of(
M,
L)
, then it must have even multiplicity because (counting multiplicity) half of eigenvalues at 1 (or−
1) will be perturbed to the inside of Tand the other half to the outside. Typically 1 (or
−
1) will be a double eigenvalue of partial multiplicity 2, and the eigenvector corresponding to it should be used in the computation of X∗.6. Exploiting sparsity
Subspace methods for finding X∗may be more efficient than the doubling algorithm that finds Xηfor a sufficiently small
η
. However, it is possible for the doubling algorithm to exploit certain sparsity structures in the matrices A,
B,
Q in(6)while subspace methods could not.For the nano application here and for other applications, the matrices A
,
B,
Q are from a semi-infinite block tridiagonal and block Toeplitz matrix, as given in the matrix T in(11). In some situations, the matrix T is block tridiagonal with the matrices on the three diagonals having some periodicity, but is not block Toeplitz when the submatrices in T are of the given sizes. To make T a block tridiagonal and block Toeplitz matrix, we would have to partition the matrix T into larger submatrices. To be more precise, the matrix T is given as in(11), and the matrices A,
B,
Q∈
Cn×nhave the following structures: Q=
K1,1 K1,2 0 K2,1 K2,2...
...
...
Kp−1,p 0 Kp,p−1 Kp,p
,
A=
0 Kp+1,p 0 0
,
B=
0 0 Kp,p+1 0
,
(30) where Kj,j∈
Cnj×nj,
Kp+1,p∈
Cn1×np,
Kp,p+1∈
Cnp×n1.We now useAlgorithm 1to compute the stabilizing solution Xsof(6). We remark that Eq.(6)here is more general than
the one studied in [8]. That equation arises in the vibration analysis of fast trains.
As in [8], the complexity ofAlgorithm 1can be reduced drastically by using the special structures of the matrices Q
,
A,
B given by(30). Write Qk=
Q−
Pk. Then it is easily seen fromAlgorithm 1that the matrices Ak, Bk,
Pkand Pkhave the specialforms Ak
=
0 Ek 0 0
,
Bk=
0 0 Fk 0
,
Pk=
0 0 0
Gk
,
Pk=
Gk 0 0 0
,
where Ek, Fk,
Gkand Gkare n1×
np, np×
n1, np×
npand n1×
n1matrices, respectively.Algorithm 1can be rewritten as thefollowing simplified algorithm.
Algorithm 2. Let E0
=
Kp+1,p,
F0=
Kp,p+1,
G0=
0,
G0=
0. For k=
0,
1, . . . ,
compute
Sk,1 Sk,2...
Sk,p
=
Q−
Gk 0...
0
Gk
−1
Ek 0...
0
,
(31)
Tk,1 Tk,2...
Tk,p
=
Q−
Gk 0...
0
Gk
−1
0...
0 Fk
,
(32)where Sk,i
∈
Cni×npand Tk,i∈
Cni×n1, and then computeEk+1
=
EkSk,p,
Fk+1=
FkTk,1,
Gk+1=
Gk+
FkSk,1,
Gk+1=
Gk+
EkTk,p.
(33)The main task ofAlgorithm 2is to solve the large sparse linear systems in(31)and(32). This could be done by using the Sherman–Morrison–Woodbury formula, as in [8]. But here we present a new approach that is both simpler and less expensive. Let P
=
In1 0 0 0 0 Inp 0 In−n1−np 0
be a permutation matrix and note that
P
Q−
Gk 0...
0
Gk
P⊤=
K1,1−
Gk 0 0 Kp,p−
Gk V U C
,
where V=
K1,2 0· · ·
0 0· · ·
0 Kp,p−1
,
(34) U=
K2,1 0 0...
...
0 0 Kp−1,p
,
C=
K2,2 K2,3 0 K3,2 K3,3...
...
...
Kp−2,p−1 0 Kp−1,p−2 Kp−1,p−1
.
(35)Then the matrices Sk,1, Sk,p, Tk,1and Tk,pof the solutions of(31)and(32)satisfy
K1,1−
Gk 0 0 Kp,p−
Gk
−
VC−1U
Sk,1 Tk,1 Sk,p Tk,p
=
Ek 0 0 Fk
.
(36)Note that the matrixVC−1Uis independent of k. Since Q
k
=
Q−
Pkand limk→∞Qk=
Xs, we know that Xsis obtained fromQ by replacing Kp,pin the lower-right corner with Kp,p
−
G∗, where
G∗=
limk→∞
Gk.The following algorithm gives a more detailed implementation ofAlgorithm 2and computes the stabilizing solution Xs
of(6).
Algorithm 3. Computation of Xs.
Input: Kj,j
∈
Cnj×nj, Kj,j+1∈
Cnj×nj+1, Kj+1,j∈
Cnj+1×nj, j=
1, . . . ,
n, where np+1=
n1; toleranceτ
. Output: The stabilizing solution Xsof(6), where A,
B,
Q are given by(30).TakeV
,
UandCin(34)and(35); Compute W=
K 1,1 0 0 Kp,p
−
VC−1U; E0=
Kp+1,p, F0=
Kp,p+1,
G0=
0, G0=
0; For k=
0,
1, . . .
S k,1 Tk,1 Sk,p Tk,p
=
W−
G k 0 0 Gk
−1
E k 0 0 Fk
; Ek+1=
EkSk,p,
Fk+1=
FkTk,1,
Gk+1=
Gk+
FkSk,1, Gk+1=
Gk+
EkTk,p; If∥
FkSk,1∥ ≤
τ∥
Gk∥
and∥
EkTk,p∥ ≤
τ∥
Gk∥
, then Xs←
Q , Xs(
n−
np+
1:
n,
n−
np+
1:
n) ←
Kp,p−
Gk+1, and stop.In nano applications, we typically need the n1
×
n1matrix in the upper-left corner of T−1, where T is given by(11). We also know that X−1s is the n
×
n matrix in the upper-left corner of T−1. So we are mainly interested in the n
1
×
n1matrix Y1in the upper-left corner of Xs−1. Note that Y1is the same as the matrix Sk,1in(31)when Gk,
Gk,
Ekin(31)are replaced by0
,
G∗,
In1, respectively. Thus Y1=
In1,
0
W−
0 0 0
G∗
−1
In1 0
,
where the matrix W has already been computed inAlgorithm 3.
7. Numerical results
In this section we present some numerical results. We use the doubling algorithm to compute the stabilizing solution Xη of the equation
X
+
BηX−1Aη=
Qη,
(37)where Aη
,
Bη,
Qηare given in(14). If these matrices have the special sparsity structures in(30), thenAlgorithm 3is used. To measure the accuracy of a computed stabilizing solution Xηto(37), we use the relative residualRResη
=
∥
Xη+
BηX−1η Aη
−
Qη∥
∥
Xη∥ + ∥
Aη∥∥
Bη∥∥
X−1η
∥ + ∥
Qη∥
,
where
∥ · ∥
is the spectral norm. To see whether Xηis a good approximation to the weakly stabilizing solution X∗of theequation X
+
C∗X−1C=
R, we compute RRes=
∥
Xη+
C ∗ Xη−1C−
R∥
∥
Xη∥ + ∥
C∥
2∥
X−1 η∥ + ∥
R∥
.
(38) We also use the QZ algorithm to compute X∗directly, and the relative residual RRes0is defined as in(38), with the computed Xηreplaced by the computed X∗.Example 1. We randomly generate two complex matrices C , D and two complex Hermitian matrices R, P of dimension 6.
Let
ϱ
be the minimal eigenvalue of P and set P:=
P+
(
2∥
D∥ −
ϱ)
I6.
We verify that P
+
λ
D∗+
λ
−1D is positive definite for allλ ∈
T. We then compute the stabilizing solution Xηof(37)with
η =
10−4, 10−8, 10−12, respectively, by usingAlgorithm 1. In each case,Algorithm 1is stopped when max{∥
Ak+1∥
, ∥
Bk+1∥}
<
10−10and Qk+1is taken to be the computed Xη. When
η =
0, P0(λ) = λ
2C∗−
λ
R+
C has 2m=
4 eigenvalues on T, given byΛ
= {−
0.
9026+
0.
4304i,
0.
5687−
0.
8226i,
0.
9891+
0.
1472i,
0.
1960+
0.
9806i}
.
ByTheorem 7we determine that
Λs
= {
0.
5687−
0.
8226i,
0.
9891+
0.
1472i}
is such that the perturbed eigenvalues of Pη
(λ)
(η >
0) associated with eachλ
s∈
Λsare inside T. Then we compute theweakly stabilizing solution X∗of(37)by using the invariant subspace corresponding to stable eigenvalues and eigenvalues
inΛs(the QZ algorithm). The numerical results are shown inTable 1.
Table 1 Relative residuals.
η 10−4 10−8 10−12 0
RResη 3.36×10−16 4.03×10−15 4.24×10−15 3.09×10−16
We know that Xη,I
=
2i1(
Xη−
Xη∗)
is positive definite forη >
0 and we know fromTheorem 5that rank(
X∗,I) ≤
m=
2.Table 2 The eigenvalues of Xη,I. η The eigenvalues of Xη,I 10−4 2.3555,1.2676,5.87×10−3,1.89×10−3,1.21×10−3,1.10×10−3 10−8 2.3510,1.2639,5.91×10−7,1.89×10−7,1.21×10−7,1.10×10−7 10−12 2.3510,1.2639,5.91×10−11,1.89×10−11,1.21×10−11,1.10×10−11 0 2.3510,1.2639,5.41×10−15,1.61×10−15, −5.12×10−16, −4.18×10−15
Example 2. We consider a semi-infinite Hamiltonian operator of the transverse magnetic (TM) mode for the
two-dimensional photonic crystal of the form [17] H
(
u, ⃗
k, ⃗
x) = −
1ε(⃗
x)
∇ +
i⃗
k
·
∇ +
i⃗
k
u(⃗
x)
= −
1ε(⃗
x)
∆+
2i⃗
k· ∇ − ∥⃗
k∥
2
u(⃗
x),
(39)where
⃗
k=
(
k1,
k2)
is a wave number in the first Brillouin zoneΩ∗=
(−π, π]
2,⃗
x∈
Ω= [−
0.
5, ∞) × [−
0.
5,
0.
5] =
Ω1
∪
Ω2with
Ω1=
∞
j=0 Bρ(
j),
Ω2=
∞
j=0([−
0.
5+
j,
0.
5+
j] × [−
0.
5,
0.
5]
) \
Bρ(
j)
and Bρ
(
j) = {(
x1,
x2)|(
x1−
j)
2+
x22≤
ρ
2}
, 0< ρ <
0.
5, andε(⃗
x)
is the dielectric function withε(⃗
x) =
ε
1⃗
x∈
Ω1,
ε
2⃗
x∈
Ω2.
SeeFig. 1for an illustration of the domainΩ. By Bloch’s theorem, we assume that the boundary conditions are given by
u
(⃗
x) =
0, ⃗
x∈ {
(−
0.
5,
x2)|
x2∈ [−
0.
5,
0.
5]}
,
u(
x1,
0.
5) =
eik2u(
x1, −
0.
5),
x1∈ [−
0.
5, ∞].
Fig. 1. The domainΩ = Ω1∪Ω2.
We use the classical five-point central finite difference method to discretize the operator(39)on the uniform grid points inΩwith mesh size h
=
1/
n. So n is the number of grid points on the x2axis in[−
0.
5,
0.
5)
. Let Tnbe the tridiagonal matrixof dimension n with 4 on the main diagonal and
−
1 on the two adjacent diagonals and let Dnbe the tridiagonal matrix ofdimension n with 0 on the main diagonal and
−
1 and 1 on the super-diagonal and the sub-diagonal, respectively. LetΦ
=
1 h2(
Tn−
δ
e1e ⊤ n− ¯
δ
ene⊤1) −
ik2 h(
Dn+
δ
e1e ⊤ n− ¯
δ
ene⊤1) + (
k 2 1+
k 2 2)
In and Ψ=
−
1 h2−
ik1 h
In,
whereδ =
eik2and ejdenotes the jth column vector of the identity matrix. Then the system Hamiltonian H from the operator
Fig. 2. RResη, RRes and the number of iterations ofAlgorithm 3. sub-diagonal, respectively. The block matrices Hband Hbbare of the forms
Hb
=
H1,1 H1,2 H1∗,2 H2,2...
...
...
Hn−1,n Hn∗−1,n Hn,n
,
Hbb=
0 0 Hn,n+1 0
∈
Cn2×n2,
where Hj,j=
ΓjΦΓj,
Hj,j+1=
ΓjΨ Γj+1,
j=
1, . . . ,
n,
andΓj=
diag(Υ
(:,
j))
,Υ= [
Υij] ∈
Rn×nwith
Υij=
1ε
1, (−
0.
5+
jh,
0.
5−
ih) ∈
Bρ(
0),
Υij=
1ε
2, (−
0.
5+
jh,
0.
5−
ih) ̸∈
Bρ(
0).
We now apply the Green’s function approach to the system Hamiltonian H with the overlap matrix being the identity. So we need to determine the n2
×
n2block (particularly the n×
n block) in the upper-left corner of the inverse of the matrix(3)(now with Sb
=
I and Sbb=
0). This is done by solving the matrix equation(37). The matrices Aη,
Bη,
Qη∈
Cn 2×n2in
(37)now have the structures in(30), with nj
=
p=
n andKj,j
=
zIn−
Hj,j,
Kj,j+1= −
Hj,j+1,
Kj+1,j= −
Hj∗,j+1,
j=
1, . . . ,
n,
where z
=
E+
iη
withE∈
R and 0≤
η ≪
1. We remark that the matrix equation here is a special case of(1)with P=
I, D=
0, C=
Aη=
B∗ηand R
=
EI−
Hb.We now useAlgorithm 3to compute the solution Xηof(37). In our test we take n