Numerical Solution of Nonlinear Matrix Equations Arising from Green's Function Calculations in Nano Research*

(1)

ARISING FROM GREEN’S FUNCTION CALCULATIONS IN NANO

RESEARCH

∗

CHUN-HUA GUO†, YUEH-CHENG KUO‡,AND WEN-WEI LIN§

Abstract. The Green’s function approach for treating quantum transport in nano devices requires the solution of nonlinear matrix equations of the form X + (C∗+ iηD∗)X−1(C + iηD) = R + iηP , where R and P are Hermitian, P + λD∗+ λ−1D is positive definite for all λ on the unit circle, and η → 0+_{. For each fixed η > 0, we show that the required solution is the unique} stabilizing solution Xη. Then X∗= lim_η→0+Xη is a particular weakly stabilizing solution of the matrix equation X + C∗X−1C = R. In nano applications, the matrices R and C are dependent on a parameter, which is the system energy E. In practice one is mainly interested in those values of E for which the equation X + C∗_X−1_{C = R has no stabilizing solutions or, equivalently, the quadratic} matrix polynomial P (λ) = λ2C∗− λR + C has eigenvalues on the unit circle. We point out that a doubling algorithm can be used to compute Xη efficiently even for very small values of η, thus providing good approximations to X∗. We also explain how the solution X∗can be computed directly using subspace methods such as the QZ algorithm by determining which unimodular eigenvalues of P (λ) should be included in the computation. In some applications the matrices C, D, R, P have very special sparsity structures. We show how these special structures can be expoited to drastically reduce the complexity of the doubling algorithm for computing Xη.

Key words. nonlinear matrix equation, weakly stabilizing solution, structure-preserving algo-rithm, Green’s function

AMS subject classifications. 15A24, 65F30

1. Introduction. In this paper we study nonlinear matrix equations of the form

X + (C

∗

+ iηD

∗

)X

−1

(C + iηD) = R + iηP,

(1.1)

where R and P are Hermitian, P + λD

∗

+ λ

−1

D is positive definite for all λ on the

unit circle T, and η ≥ 0. The special case where P = I, D = 0 and C, R are real arises

in nano research [1, 3, 12, 13] and has been studied in [7, 9]. We now briefly explain

how the general equation (1.1) also arises in nano applications.

A main goal of basic research in molecular electronics is to advance the

under-standing of electron transport through molecules. In [17], a method for calculating

the current is described for a system that consists of a molecule connected between

two semi-infinite metallic electrodes, and is implemented in a program that assumes

a local-orbital picture and requires as input the Hamiltonian and overlap matrix

ele-ments between orbitals.

∗_{Version of April 7, 2011.}

†_{Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada} ([email protected]). The work of this author was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada.

‡_{Department of Applied Mathematics, National University of Kaohsiung, Kaohsiung 811, Taiwan} ([email protected]). The work of this author was partially supported by the National Science Council in Taiwan.

§_{Department of Mathematics, National Taiwan University, Taipei 106, Taiwan & Center of} Math-ematical Modelling and Scientific Computing, National Chiao Tung University, Hsinchu 300, Taiwan ([email protected]). The work of this author was partially supported by the National Science Council and the National Center for Theoretical Sciences in Taiwan.

(2)

The system Hamiltonian is a bi-infinite Hermitian matrix of the form

H =





H

L

H

LM

0 H

_LM∗

H

M

H

M R

0 H

_{M R}∗

H

R





,

(1.2)

where H

M

, H

L

, H

R

are the Hamiltonians for the molecule, the left electrode, and the

right electrode, respectively, and the overlap matrix is a Hermitian positive definite

matrix partitioned in the same way and is given by

S =





S

L

S

LM

0 S

_LM∗

S

M

S

M R

0 S

_{M R}∗

S

R





.

In [17] the blocks H

L

and H

LM

in (1.2) are shifted by s

L

S

L

and s

L

S

LM

,

respec-tively, where s

L

is a proper energy shift, and the blocks H

R

and H

M R

are shifted

similarly. These shifts do not change the structure of the matrix H in (1.2). So we

can simply assume that the matrix H in (1.2) has already gone through the shifting

procedure.

The Green’s function (of the full interacting system) is defined by

G = (E + i0

+

)S − H

−1

= lim

η→0+

((E + iη)S − H)

−1

,

where E is energy. We note that for each η > 0 the infinite matrix (E + iη)S − H =

ES − H + iηS is known to be invertible by Bendixson’s theorem (see [10, Lemma 3.3]),

but the existence of the above one-sided limit is something assumed. The molecule

Green’s function G

M

is that part of G corresponding to the block for the molecule

and is obtained from

G

M

= (E + i0

+

)S

M

− H

M

− Σ

L

− Σ

R

−1

,

where

Σ

L

= (E S

LM

−H

LM

)

∗

G

L

(E S

LM

−H

LM

),

Σ

R

= (E S

M R

−H

M R

)G

R

(E S

M R

−H

M R

)

∗

,

with

G

L

= (E + i0

+

)S

L

− H

L

−1

,

G

R

= (E + i0

+

)S

R

− H

R

−1

.

Then [17] the net current is detemined through a definite integral of the

trans-mission function given by

T (E ) = tr(Γ

L

G

M

Γ

R

G

∗M

),

where

Γ

L

= i(Σ

L

− Σ

∗L

),

Γ

R

= i(Σ

R

− Σ

∗R

),

and tr(·) denotes the trace of a matrix. Note that T (E ) is a real function of E since

Γ

L

and Γ

R

are Hermitian.

(3)

We now explain how the matrix Σ

R

is computed. The computation of Σ

L

is

similar. The matrices H

R

and S

R

can be written as [17]

H

R

=







H

s

H

sb

H

_sb∗

H

b

H

bb

H

_bb∗

H

b

H

bb

H

_bb∗

H

b

. .

.

. .

_.

. .

_.







,

S

R

=







S

s

S

sb

S

_sb∗

S

b

S

bb

S

_bb∗

S

b

S

bb

S

∗_bb

S

b

. .

.

. .

_.

. .

_.







,

where H

s

, S

s

∈ C

q×q

and H

b

, S

b

∈ C

n×n

, and we suppose that H

M

, S

M

∈ C

p×p

. The

size of H

s

and S

s

has been taken sufficiently large so that all nonzero elements of the

matrices H

M R

and S

M R

are in the p × q block on the left. This means that we only

need G

s

, the q × q block in the upper-left corner of G

R

, for the computation of Σ

R

.

It is easy to see [17] that G

s

is determined through

G

s

= (U

s

− U

sb

G

b

U

sb0

)

−1

_,

where U

s

= zS

s

− H

s

, U

sb

= zS

sb

− H

sb

, U

sb0

= zS

∗ sb

− H

∗

sb

with z = E + iη and η → 0

+

_,

and G

b

is the n × n block in the upper-left corner of the inverse of







zS

b

− H

b

zS

bb

− H

bb

zS

_bb∗

− H

∗ bb

zS

b

− H

b

zS

bb

− H

bb

zS

_bb∗

− H

∗ bb

zS

b

− H

b

. .

.

. .

_.

. .

_.







.

(1.3)

Note that the above matrix is invertible by Bendixson’s theorem since the matrix

T

R

=







S

b

S

bb

S

_bb∗

S

b

S

bb

S

_bb∗

S

b

. .

.

. .

_.

. .

_.







(1.4)

is positive definite when S is positive definite. Note also that T

R

is positive definite

if and only if S

b

+ λS

bb

+ λ

−1

S

bb∗

is positive definite for all λ on T.

The block Toeplitz structure of the matrix (1.3) implies that G

b

satisfies the

matrix equation

G

b

= (U

b

− U

bb

G

b

U

bb0

)

−1

_,

_(1.5)

where U

b

= zS

b

− H

b

, U

bb

= zS

bb

− H

bb

, U

bb0

= zS

∗bb

− H

bb∗

.

For any W ∈ C

n×n

_{, we can write W = W}

R

+ iW

I

, where the Hermitian matrices

W

R

=

1

2 (W + W

∗

_),

_W

I

=

1 2i

(W − W

∗

₎

are called the real part and the imaginary part of W , respectively. We are only

inter-ested in E values for which the required solution G

b

of (1.5) has a nonzero imaginary

part (in the limit η → 0

+

_{) since otherwise G}

b

and then G

s

would be Hermitian, which

would imply that Σ

R

is Hermitian and then T (E ) = 0 for the transmission function.

Now we let

X = G

−1_b

, C = E S

_bb∗

− H

∗

bb

, D = S

∗bb

, R = E S

b

− H

b

, P = S

b

.

(4)

2. Characterization of the solution G

b

. The matrix equation (1.1) may have

many different solutions. So what solution X do we need so that X

−1

= G

b

is the

required solution of (1.5)?

Let A = C + iηD, B = C

∗

+ iηD

∗

, Q = R + iηP . Then (1.1) becomes

X + BX

−1

A = Q.

(2.1)

As before, R and P are Hermitian and P + λD

∗

+ λ

−1

D is positive definite for all λ

on T. Let

M =

A

0 Q

−I

, L =

0 I

B

0 .

(2.2)

Then X is a solution of (2.1) if and only if

M

I

X

= L

I

X

−1

A. (2.3)

Therefore, every solution of (2.1) can be obtained from a suitable invariant subspace

for the pencil M − λL.

Lemma 2.1. For any η 6= 0, the matrix pencil M − λL has no eigenvalues on T.

Proof. Suppose that λ ∈ T and (M − λL)x = 0 for a vector x = (x

>1

, x

>2

)

>

with

x

1

, x

2

∈ C

n

. Then

Ax

1

= λx

2

, Qx

1

− x

2

= λBx

1

.

(2.4)

By eliminating x

2

in (2.4) we have

W x

1

≡ λB − Q + λ

−1

A x

1

= 0.

(2.5)

The imaginary part of x

∗1

W x

1

is −ηx

∗1

(P − λD

∗

− λ

−1

D)x

1

. Since P − λD

∗

− λ

−1

D

is positive definite, it follows from (2.5) that x

1

= 0. By (2.4) we have x

2

= 0. Thus,

M − λL has no eigenvalues on T.

Theorem 2.2.

For any η 6= 0, the matrix pencil M − λL ∈ C

2n×2n

has n

eigenvalues inside T and n eigenvalues outside T.

Proof. We consider the matrix pencils

H(t, λ) =

tA

0 tR + iηP

−I

− λ

0 I

tB

0 obtained from the pencil M − λL by replacing C, D, R with tC, tD, tR. For each

t ∈ [0, 1] and λ ∈ T, P + λ(tD

∗

) + λ

−1

(tD) = (1 − t)P + t(P + λD

∗

+ λ

−1

D) is

positive definite. From Lemma 2.1 we know that H(t, λ) has no eigenvalues on T

for all t ∈ [0, 1]. Hence, H(1, λ) = M − λL and H(0, λ) have the same numbers

of eigenvalues inside T. But it is clear that H(0, λ) has n eigenvalues at 0 and n

eigenvalues at ∞.

Note that the pencil M −λL is a linearization of the quadratic polynomial P (λ) =

λ

2

_{B − λQ + A.}

The basic fixed-point iteration for finding a solution of (2.1) is X

k+1

= F (X

k

),

where F (X) = Q − BX

−1

A. The Fr´

echet derivative of F at X is the linear map

F

0

X

: C

n×n

→ C

n×n

given by F

X0

(Z) = (BX

−1

)Z(X

−1

A). A solution X of (2.1) is

(5)

ρ(·) denotes the spectral radius. Note that the basic fixed-point iteration is locally

convergent at a stabilizing solution.

Let X be any solution of (2.1). Then we have

P (λ) = (λBX

−1

− I)X(λI − X

−1

A).

So the eigenvalues of X

−1

A are n eigenvalues of P (λ), and the eigenvalues of BX

−1

are the reciprocals of the remaining n eigenvalues of P (λ).

It then follows from

Theorem 2.2 that a solution X of (2.1) is stabilizing if and only if ρ(X

−1

A) < 1 and

that there is at most one stabilizing solution.

Let the invariant subspace of M − λL corresponding to its n eigenvalues inside

T be spanned by the columns of the matrix

U

V

, where U, V ∈ C

n×n

. Then the

existence of a stabilizing solution can be established by showing that U and V are

both invertible. The stabilizing solution is then X = V U

−1

. For the case where the

matrices C, D, R, P in (2.1) are all real, an elementary proof for the invertibility of U

has already been given in [8] and we note that the invertibility of V can be proved

in the same way. It is also shown in [8] that the imaginary part of the stabilizing

solution is positive definite for η > 0. The proofs can be carried over to the complex

case here with only very minor changes.

Here, however, we are going to use an advanced result on linear operators to show

the existence of a stabilizing solution since this approach will also explain that the

stabilizing solution is precisely the solution we need for the nano application. The

treatment is very similar to the one in [9] for a special case of the equation (2.1). So

our presentation here will be very brief.

Recall that G

b

= lim

η→0+

G

b

(η), where G

b

(η) is the n×n matrix in the upper-left

corner of T

−1

with T given by (1.3) for each η > 0. Using the current notation, we

have

T =







Q

B

A

Q

B

A

Q

. .

.

. .

_.

. .

_.







.

(2.6)

Associated with T is the rational matrix function φ(λ) = λA + Q + λ

−1

B. We already

know from Bendixson’s theorem (see [10, Lemma 3.3]) that T is invertible for each

η > 0. Thus, by a result on linear operators (see [6, Chapter XXIV, Theorem 4.1]

and [15]) we know that φ(λ) has a factorization

φ(λ) = (I − λ

−1

L)X(I − λU )

(2.7)

with X invertible, ρ(L) < 1 and ρ(U ) < 1. From (2.7) we see that

A = −XU,

B = −LX,

Q = X + LXU.

Thus X + BX

−1

A = Q and ρ(X

−1

A) < 1. In other words, X is the unique stabilizing

solution of (2.1).

By [6, Chapter XXIV, Theorem 4.1] the n × n matrix in the upper-left corner of

T

−1

is precisely X

−1

. We thus have the following characterization of G

b

(η).

Theorem 2.3. For any η > 0, the matrix G

b

(η) is the inverse of the unique

(6)

3. Computation of the stabilizing solution. Let M and L be as in (2.2).

Then the stabilizing solution X of (2.1) satisfies (2.3) with ρ(X

−1

A) < 1.

We remark that the equation (2.1) with real matrices C, D, R, P also arises in the

study of a quadratic eigenvalue problem from the vibration analysis of fast trains,

where the required solution is also the stabilizing solution and a doubling algorithm is

used to find the solution [10]. We can get similar results for our more general equation

(2.1). The situation here is slightly more complicated since we no longer have B = A

>

and the stabilizing solution is no longer complex symmetric.

Starting with the matrices M and L in (2.2), we define the sequences {M

k

} and

{L

k

}, where

M

k

=

A

k

0 Q

k

−I

, L

k

=

−P

k

I

B

k

0 ,

by the following structure-preserving doubling algorithm if no breakdown occurs.

Algorithm 3.1. Let A

0

= A, B

0

= B, Q

0

= Q, P

0

= 0.

For k = 0, 1, . . ., compute

A

k+1

= A

k

(Q

k

− P

k

)

−1

A

k

,

B

k+1

= B

k

(Q

k

− P

k

)

−1

B

k

,

Q

k+1

= Q

k

− B

k

(Q

k

− P

k

)

−1

A

k

,

P

k+1

= P

k

+ A

k

(Q

k

− P

k

)

−1

B

k

.

The above algorithm is the SDA-2 as presented in [2]. The next result shows that

the doubling algorithm has some nice properties. In particular, it can compute the

stabilizing solution X of (2.1) efficiently.

Theorem 3.1. Let X be the stabilizing solution of (2.1) and b

X be the stabilizing

solution of the dual equation

b

X + A b

X

−1

B = Q.

Then

(a) The sequences {A

k

}, {B

k

}, {Q

k

}, {P

k

} in Algorithm 3.1 are well-defined.

(b) Q

k

converges to X quadratically, A

k

and B

k

converge to 0 quadratically,

Q − P

k

converges to b

X quadratically, with

lim sup

k→∞

2k

_pkQ

k

− Xk ≤ ρ( b

X

−1

B)ρ(X

−1

A), lim sup

k→∞ 2k

_pkA

k

k ≤ ρ(X

−1

A),

lim sup

k→∞ 2k

pkB

k

k ≤ ρ( b

X

−1

B), lim sup

k→∞ 2k

q

kQ − P

k

− b

Xk ≤ ρ( b

X

−1

B)ρ(X

−1

A),

where k · k is any matrix norm.

Proof. The proof is very similar to that of [10, Theorem 4.1]. Although the

statement of that theorem and the beginning of its proof refer to the specific problem

under consideration in [10], the proof there is valid for this theorem after some minor

changes. Here we only mention the following differences. In [10], Q

>

_{= Q and B = A}

>

(this would be true here if the matrices C, D, R, P were all real), and in that case we

can conclude that B

k

= A

>k

, Q

>

(7)

Remark 3.1. Although we no longer have ρ( b

X

−1

B) = ρ(X

−1

A) in general, we

always have ρ( b

X

−1

B) = ρ(X

−1

B). In fact, the eigenvalues of b

X

−1

B are those of

b

P (λ) = λ

2

_{A − λQ + B inside T. We have mentioned earlier that the eigenvalues}

of BX

−1

are the reciprocals of those eigenvalues of P (λ) = λ

2

B − λQ + A outside

T. However, the reciprocals of the eigenvalues of P (λ) outside T are precisely the

eigenvalues of b

_{P (λ) inside T. It is also well known that BX}

−1

and X

−1

B have the

same eigenvalues.

Remark 3.2. As in [9], we can show that the basic fixed-point iteration (FPI)

X

k+1

= Q − BX

_k−1

A,

X

0

= Q

is also convergent and X

2k₋₁

= Q

k

. So the convergence of the FPI is much slower

than the doubling algorithm. However, we can use an averaging procedure for the FPI

to speed up its convergence and for the nano application we can use the computed

solution for one energy value as an initial guess for the solution for the next nearby

energy value [17].

For the special case where P = I, D = 0 and C, R are real,

convergence results for methods based on these ideas have been proved in [7] using

the Earle–Hamilton theorem [4, 11]. The proofs there can be carried over to equation

(2.1), with some minor changes, as long as D = 0 still holds (so C is any complex

matrix, R is any Hermitian matrix, and P is any Hermitian positive definite matrix).

However, when D 6= 0 we are unable to prove any non-local convergence results for

those methods. The Earle–Hamilton theorem is not applicable since we no longer

have B = A

∗

.

To emphasize its dependence on η, the stabilizing solution of (2.1) will be denoted

by X

η

. For the nano application, G

b

= lim

η→0+

G

b

(η) and G

b

(η) = X

η−1

. So G

b

=

X

_∗−1

with X

∗

= lim

η→0+

X

η

. It is easy to see that X

∗

is a particular weakly stabilizing

solution of the matrix equation

X + C

∗

X

−1

C = R,

(3.1)

with ρ(X

_∗−1

C) ≤ 1 and ρ(C

∗

X

_∗−1

) ≤ 1. The solution X

∗

can be approximated by

computing X

η

by the doubling algorithm for a sufficiently small η, but can also be

computed directly by subspace methods, as we shall see in section 5. For now, we

write X

∗

= X

∗,R

+ iX

∗,I

, where the Hermitian matrices X

∗,R

and X

∗,I

are the real

part and the imaginary part of X

∗

, respectively, and we will examine the rank of X

∗,I

.

Since the imaginary part of X

η

is positive definite for η > 0, we know that X

∗,I

is

positive semi-definite.

4. Rank of X

∗,I

. We now denote the matrices A, B, Q in (2.1) by A

η

, B

η

, Q

η

,

respectively. So

A

η

= C + iηD, B

η

= C

∗

+ iηD

∗

, Q

η

= R + iηP,

(4.1)

with C, D, R, P as before. Let

P

η

(λ) = λ

2

B

η

− λQ

η

+ A

η

.

We already know that P

η

has no eigenvalues on T for η 6= 0. For η = 0 we get

P

0

(λ) = λ

2

C

∗

− λR + C.

It is quite possible that P

0

(λ) has some eigenvalues on T. As we will see later, this is

(8)

Theorem 4.1. The number of eigenvalues (counting multiplicities) of P

0

(λ) on

T must be even, say 2m. Moreover, we have rank (X

∗,I

) ≤ m.

Proof. The matrix polynomial P

0

(λ) is ∗-palindromic. Thus µ is an eigenvalue

of P

0

(λ) if and only if 1/µ is so, and they have the same algebraic, geometric, and

partial multiplicities [14]. It follows that the total number of eigenvalues of P

0

(λ) on

T must be even.

By X

η

+ (C

∗

+ iηD

∗

)X

η−1

(C + iηD) = R + iηP we have

X

η

+ C

∗

X

η−1

C = R + ηW

η

,

(4.2)

where

W

η

= iP − iD

∗

X

η−1

C − iC

∗

X

η−1

D + ηD

∗

X

η−1

D. Taking imaginary parts on (4.2), we get

K

η

− F

η∗

K

η

F

η

= ηT

η

,

(4.3)

where K

η

= Im(X

η

), T

η

= Im(W

η

), F

η

= X

η−1

C. Let F = lim

η→0+

F

η

= X

∗−1

C. Then the eigenvalues of F consist of all n − m eigenvalues of P

0

(λ) inside T plus m

eigenvalues of P

0

(λ) on T. Let

F = V

0

R

0,1

0

0 R

0,2

V

₀−1

(4.4)

be a spectral resolution of F , where R

0,1

∈ C

m×m

and R

0,2

∈ C

(n−m)×(n−m)

are

upper triangular with σ(R

0,1

) ⊆ T and σ(R

0,2

) ⊆ D ≡ {λ ∈ C| |λ| < 1}. It follows

from [16, Chapter V, Theorem 2.8] that there is a nonsingular matrix V

η

such that

F

η

= V

η

R

η,1

0

0 R

η,2

V

_η−1

,

(4.5)

and R

η,1

→ R

0,1

, R

η,2

→ R

0,2

, and V

η

→ V

0

, as η → 0

+

.

From (4.3) and (4.5) we have

V

_η∗

K

η

V

η

−

_R

∗ η,1

0

0 R

∗η,2

V

_η∗

K

η

V

η

R

η,1

0

0 R

η,2

= ηV

_η∗

T

η

V

η

.

(4.6)

Let

V

_η∗

K

η

V

η

=

H

η,1

H

η,3

H

_η,3∗

H

η,2

, V

_η∗

T

η

V

η

=

Z

η,1

Z

η,3

Z

_η,3∗

Z

η,2

.

(4.7)

Then (4.6) becomes

H

η,1

− R

∗η,1

H

η,1

R

η,1

= ηZ

η,1

,

(4.8a)

H

η,2

− R

∗η,2

H

η,2

R

η,2

= ηZ

η,2

,

(4.8b)

H

η,3

− R

∗η,1

H

η,3

R

η,2

= ηZ

η,3

.

(4.8c)

As η → 0

+

_{, R}

η,1

→ R

0,1

with ρ(R

0,1

) = 1, R

η,2

→ R

0,2

with ρ(R

0,2

) < 1, and

Z

η,2

and Z

η,3

are bounded by the convergence of T

η

. So we have H

η,2

→ 0 from

(4.8b) and H

η,3

→ 0 from (4.8c). Since X

∗,I

= lim

η→0+

K

_η

, it follows from (4.7) that

(9)

We conjecture that equality holds in Theorem 4.1 when all eigenvalues of P

0

(λ)

on T are simple.

For the nano application, the matrices C and R in P

0

(λ) are given by

C = E S

∗_bb

− H

_bb∗

, R = E S

b

− H

b

.

If P

0

(λ) has no eigenvalues on T for an energy value E, then X

∗

is Hermitian by

Theorem 4.1 and G

b

= X

∗−1

is also Hermitian. We then know that the transmission

function T (E ) takes zero value, without solving any nonlinear matrix equations. So

we are only interested in those E values for which P

0

(λ) has some eigenvalues on T.

The next simple result is thus relevant, where S

b

− λS

bb

− λ

−1

S

bb∗

is positive definite

for all λ on T.

Theorem 4.2. For λ ∈ T, let the eigenvalues of

(S

b

− λS

bb

− λ

−1

S

bb∗

)

−1

_(H

b

− λH

bb

− λ

−1

H

bb∗

)

be µ

1

(λ) ≤ · · · ≤ µ

n

(λ). Let

∆

i

=

min

|λ|=1

µ

i

(λ), max

|λ|=1

µ

i

(λ)

,

and ∆ =

S

n

i=1

∆

i

. Then the quadratic pencil P

0

(λ) = λ

2

_{(E S}

bb

− H

bb

) − λ(E S

b

− H

b

) +

(E S

_bb∗

− H

∗

bb

) has some eigenvalues on T if and only if E ∈ ∆.

Proof. The quadratic P

0

(λ) has some eigenvalues on T if and only if det(P

0

(λ)) =

0 for some λ ∈ T or, equivalently,

det(−λ

−1

P

0

(λ)) = det E (S

b

− λS

bb

− λ

−1

S

bb∗

) − (H

b

− λH

bb

− λ

−1

H

bb∗

) = 0

for some λ ∈ T, the latter is equivalent to E ∈ ∆.

5. Direct computation of X

∗

. The solution X

∗

can be computed directly by

subspace methods. We will need to include all eigenvalues of

P (λ) = λ

2

C

∗

− λR + C

(5.1)

inside T and half of its eigenvalues on T — the half that would be perturbed to the

inside of T when P (λ) is perturbed to

P

η

(λ) = λ

2

(C

∗

+ iηD

∗

) − λ(R + iηP ) + (C + iηD).

(5.2)

Let

M =

C

0 R

−I

, L =

0 I

C

∗

0 .

(5.3)

Then the pencil M − λL, also denoted by (M, L), is a linearization of the quadratic

matrix polynomial P (λ). It is easy to check that y and z are the right and left

eigenvectors, respectively, corresponding to an eigenvalue λ of P (λ) if and only if

y

Ry − λC

∗

y

,

z

−λz

(5.4)

are the right and left eigenvectors of (M, L), respectively.

(10)

The following result is a generalization of [7, Theorem 3.1] for the special case

where P = I, D = 0 and C, R are real. It shows which invariant subspace

corre-sponding to unimodular eigenvalues of P (λ) should be used in the computation of

X

∗

, assuming they are all semi-simple.

Theorem 5.1. Suppose that λ

0

is a semi-simple eigenvalue of P (λ) on T with

multiplicity m

0

and Y ∈ C

n×m0

forms an orthonormal basis of right eigenvectors

corresponding to λ

0

. Then iY

∗

(2λ

0

C

∗

− R)Y is a nonsingular Hermitian matrix. Let

d

j

, j = 1, . . . , `, be the distinct eigenvalues of

Y

∗

(P − λ

0

D

∗

− λ

−10

D)Y (iY

∗

_(2λ

0

C

∗

− R)Y )

−1

with multiplicities m

j₀

, and let ξ

j

∈ C

m0×m

j

0

form an orthonormal basis of the eigenspace

corresponding to d

j

. Then for η > 0 sufficiently small and j = 1, . . . `

λ

(k)_j,η

= λ

0

− λ

0

d

j

η + O(η

2

), k = 1, . . . , m

j0

,

and

y

j,η

= Y (Y

∗

(P − λ

0

D

∗

− λ

−10

D)Y

−1

ξ

j

+ O(η)

(5.5)

are perturbed eigenvalues and a basis of the corresponding invariant subspace of P

η

(λ),

respectively.

Proof. Since P (λ

0

)Y = 0 with Y

∗

Y = I

m0

and |λ

0

| = 1, we have

0

∗

= (P (λ

0

)Y )

∗

= λ

2

0

Y

∗

(λ

20

C

∗

− λ

0

R + C).

It follows that Y forms an orthonormal basis for left eigenvectors of P (λ)

correspond-ing to λ

0

. From (5.4), we obtain that the column vectors of

Y

R

=

Y

RY − λ

0

C

∗

Y

and Y

L

=

Y

−λ

0

Y

form a basis of left and right eigenspaces of M − λL corresponding to λ

0

, respectively.

Since λ

0

is semi-simple, the matrix

[Y

∗

, −λ

0

Y

∗

]L

Y

RY − λ

0

C

∗

Y

= −Y

∗

(2λ

0

C

∗

− R)Y = −Y

∗

P

0

(λ

0

)Y

is nonsingular. Let

e

Y

R

= −Y

R

(Y

∗

P

0

(λ

0

)Y )

−1

,

Y

e

L

= Y

L

.

Then we have

e

Y

_L∗

L e

Y

R

= I

m0

and e

Y

∗ L

M e

Y

R

= λ

0

I

m0

.

(5.6)

For η > 0 sufficiently small, we let

M

η

=

C + iηD

0 R + iηP

−I

, L

η

=

0 I

C

∗

+ iηD

∗

0 .

(11)

Then M

η

− λL

η

is a linearization of P

η

(λ). By (5.6) and [16, Chapter VI, Theorem

2.12] there are b

Y

R

and b

Y

L

such that

h

e

Y

R

Y

b

R

i

and

h

Y

e

L

Y

b

L

i

are nonsingular and

"

e

Y

∗ L

b

Y

∗ L

#

M

h

Y

e

R

Y

b

R

i

=

_λ

0

I

m0

0

0 M

c

,

"

e

Y

∗ L

b

Y

∗ L

#

L

h

Y

e

R

Y

b

R

i

=

_I

m0

0

0 L

b

.

Then, by [16, Chapter VI, Theorem 2.15] the column vectors of e

Y

R

+ O(η) span the

right eigenspace of (M

η

, L

η

) corresponding to (Λ, I

m0

), where

Λ = (λ

0

I

m0

+ ηE

11

+ O(η

2

₎

I

m0

+ ηF

11

+ O(η

2

₎

−1

with

E

11

= e

Y

L∗

iD

0 iP

0 e

Y

R

= Y

∗

(λ

0

iP − iD)Y (Y

∗

P

0

(λ

0

)Y )

−1

,

F

11

= e

Y

L∗

0

0 iD

∗

0 e

Y

R

= Y

∗

(λ

0

iD

∗

)Y (Y

∗

P

0

(λ

0

)Y )

−1

.

Thus

Λ = λ

0

I

m0

+ η(E

11

− λ

0

F

11

) + O(η

2

_{) = λ}

0

I

m0

− ηλ

0

W + O(η

2

_),

where

W = Y

∗

(P − λ

0

D

∗

− λ

−10

D)Y (iY

∗

_P

0

_(λ

0

)Y )

−1

.

(5.7)

The matrix Z = iY

∗

P

0

(λ

0

)Y = iY

∗

(2λ

0

C

∗

− R)Y in (5.7) is Hermitian since

Z − Z

∗

= iY

∗

(2λ

0

C

∗

+ 2λ

0

C − 2R)Y = 2iλ

0

Y

∗

P (λ

0

)Y = 0.

Since Y

∗

(P − λ

0

D

∗

− λ

−10

D)Y is positive definite, all eigenvalues of W in (5.7) are

real and there are m

0

linearly independent eigenvectors. Let d

j

for j = 1, . . . , ` be

the distinct eigenvalues of W with multiplicities m

j₀

, and let ξ

j

∈ C

m0×m

j

0

form an

orthonormal basis of the eigenspace corresponding to d

j

. Then we have

Φ

−1

ΛΦ = λ

0

I

m0

− ηλ

0

diag

d

1

I

m1 0

, . . . , d

`

I

m`0

+ O(η

2

).

where Φ = [ξ

1

, . . . , ξ

`

] ∈ C

m0×m0

. Then for each j ∈ {1, 2, . . . , `}, the perturbed

eigenvalues λ

(k)_j,η

, k = 1, . . . , m

j₀

, and a basis of the corresponding invariant subspace

of M

η

− λL

η

with λ

(k) j,η

|

η=0

= λ

0

can be expressed by

λ

(k)_j,η

= λ

0

− λ

0

d

j

η + O(η

2

), k = 1, . . . , m

j 0

,

(5.8a)

and

ζ

j,η

= Y

R

Y

∗

(P − λ

0

D

∗

− λ

−10

D)Y

−1

ξ

j

+ O(η).

(5.8b)

The equation in (5.5) follows from (5.8b).

For the pencil (M, L) given by (5.3), the relation M

I

X

= L

I

X

−1

A

shows that the weakly stabilizing solution X

∗

of (3.1) is obtained by X

∗

= X

2

X

1−1

,

(12)

where the colums of

X

1

X

2

form a basis for the invariant subspace of (M, L)

corre-sponding to its eigenvalues inside T and its eigenvalues on T that would be perturbed

to the inside of T when (M, L) is perturbed to (M

η

, L

η

) with η > 0.

We can use the QZ algorithm to determine this invariant subspace, with the

aid of Theorem 5.1 when all unimodular eigenvalues of (M, L) are semi-simple. In

practice, these unimodular eigenvalues are likely to be simple and the statements in

our Theorem 5.1 can be simplified significantly. However, if 1 (or −1) happens to

be an eigenvalue of (M, L), then it must have even multiplicity because (counting

multiplicity) half of eigenvalues at 1 (or −1) will be perturbed to the inside of T

and the other half to the outside. Typically 1 (or −1) will be a double eigenvalue of

partial multiplicity 2, and the eigenvector corresponding to it should be used in the

computation of X

∗

.

6. Exploiting sparsity. Subspace methods for finding X

∗

may be more efficient

than the doubling algorithm that finds X

η

for a sufficiently small η. However, it

is possible for the doubling algorithm to exploit certain sparsity structures in the

matrices A, B, Q in (2.1) while subspace methods couldn’t.

For the nano application here and other applications, the matrices A, B, Q are

from a semi-infinite block tridiagonal and block Toeplitz matrix, as given in the matrix

T in (2.6). In some situations, the matrix T is block tridiagonal with the matrices

on the three diagonals having some periodicity, but is not block Toeplitz when the

submatrices in T are of the given sizes. To make T a block tridiagonal and block

Toeplitz matrix, we would have to partition the matrix T into larger submatrices. To

be more precise, the matrix T is given as in (2.6), and the matrices A, B, Q ∈ C

n×n

have the following structures.

Q =







K

1,1

K

1,2

0 K

2,1

K

2,2

. .

.

. .

_.

. .

_.

_K

p−1,p

0 K

p,p−1

K

p,p







, A =

0 K

p+1,p

0

0 , B =

0

0 K

p,p+1

0 ,

(6.1)

where K

j,j

∈ C

nj×nj

, K

p+1,p

∈ C

n1×np

, K

p,p+1

∈ C

np×n1

.

We now use Algorithm 3.1 to compute the stabilizing solution X

s

of (2.1). We

remark that the equation (2.1) here is more general than the one studied in [10]. That

equation arises in the vibration analysis of fast trains.

As in [10], the complexity of Algorithm 3.1 can be reduced drastically by using

the special structures of the matrices Q, A, B given by (6.1). Write Q

k

= Q − b

P

k

.

Then it is easily seen from Algorithm 3.1 that the matrices A

k

, B

k

, b

P

k

and P

k

have

the special forms

A

k

=

0 E

k

0

0 , B

k

=

0

0 F

k

0 ,

P

b

k

=

₀

0 G

b

k

, P

k

=

G

k

0

0 ,

where E

k

, F

k

, b

G

k

and G

k

are n

1

× n

p

, n

p

× n

1

, n

p

× n

p

and n

1

× n

1

matrices,

respectively. Algorithm 3.1 can be rewritten as the following simplified algorithm.

Algorithm 6.1. Let E

0

= K

p+1,p

, F

0

= K

p,p+1

, b

G

0

= 0, G

0

= 0.

(13)

For k = 0, 1, . . ., compute







S

k,1

S

k,2

..

.

S

k,p







=







Q −







G

k

0 . .

_.

0 b

G

k













−1







E

k

0 ..

.

0 





,

(6.2)







T

k,1

T

k,2

..

.

T

k,p







=







Q −







G

k

0 . .

_.

0 b

G

k













−1







0 ..

.

0 F

k







,

(6.3)

where S

k,i

∈ C

ni×np

and T

k,i

∈ C

ni×n1

, and then compute

E

k+1

= E

k

S

k,p

, F

k+1

= F

k

T

k,1

, b

G

k+1

= b

G

k

+ F

k

S

k,1

, G

k+1

= G

k

+ E

k

T

k,p

. (6.4)

The main task of Algorithm 6.1 is to solve the large sparse linear systems in (6.2)

and (6.3). This could be done by using the Sherman–Morrison–Woodbury formula, as

in [10]. But here we present a new approach that is both simpler and less expensive.

Let

P =





I

n1

0

0 I

np

0 I

n−n1−np

0 



be a permutation matrix and note that

P







Q −







G

k

0 . .

_.

0 b

G

k













P

>

=





K

1,1

− G

k

0

0 K

p,p

− b

G

k

V

U

C





,

where

V =

K

1,2

0 · · ·

0

0 · · ·

0 K

p,p−1

,

(6.5)

U =







K

2,1

0

0 ..

.

..

.

0

0 K

p−1,p







, C =







K

2,2

K

2,3

0 K

3,2

K

3,3

. .

.

. .

_.

. .

_.

_K

p−2,p−1

0 K

p−1,p−2

K

p−1,p−1







.

(6.6)

Then the matrices S

k,1

, S

k,p

, T

k,1

and T

k,p

of the solutions of (6.2) and (6.3) satisfy

_K

1,1

− G

k

0

0 K

p,p

− b

G

k

− VC

−1

U

S

k,1

T

k,1

S

k,p

T

k,p

=

E

k

0

0 F

k

.

(6.7)

(14)

Note that the matrix VC

−1

U is independent of k. Since Q

k

= Q− b

P

k

and lim

k→∞

Q

k

=

X

s

, we know that X

s

is obtained from Q by replacing K

p,p

in the lower-right corner

with K

p,p

− b

G

∗

, where b

G

∗

= lim

k→∞

G

b

k

.

The following algorithm gives a more detailed implementation of Algorithm 6.1

and computes the stabilizing solution X

s

of (2.1).

Algorithm 6.2. Computation of X

s

.

Input: K

j,j

∈ C

nj×nj

, K

j,j+1

∈ C

nj×nj+1

, K

j+1,j

∈ C

nj+1×nj

, j = 1, . . . , n, where

n

p+1

= n

1

; tolerance τ .

Output: The stabilizing solution X

s

of (2.1), where A, B, Q are given by (6.1).

Take V, U and C in (6.5) and (6.6);

Compute W =

K

1,1

0

0 K

p,p

− VC

−1

_{U ;}

E

0

= K

p+1,p

, F

0

= K

p,p+1

, b

G

0

= 0, G

0

= 0;

For k = 0, 1, . . .

S

k,1

T

k,1

S

k,p

T

k,p

=

W −

_G

k

0

0 G

b

k

−1

E

k

0

0 F

k

;

E

k+1

= E

k

S

k,p

, F

k+1

= F

k

T

k,1

, b

G

k+1

= b

G

k

+ F

k

S

k,1

,

G

k+1

= G

k

+ E

k

T

k,p

;

If kF

k

S

k,1

k ≤ τ k b

G

k

k and kE

k

T

k,p

k ≤ τ kG

k

k, then

X

s

← Q,

X

s

(n − n

p

+ 1 : n, n − n

p

+ 1 : n) ← K

p,p

− b

G

k+1

,

and stop.

In nano applications, we typically need the n

1

×n

1

matrix in the upper-left corner

of T

−1

, where T is given by (2.6). We also know that X

_s−1

is the n × n matrix in the

upper-left corner of T

−1

. So we are mainly interested in the n

1

× n

1

matrix Y

1

in the

upper-left corner of X

s−1

. Note that Y

1

is the same as the matrix S

k,1

in (6.2) when

G

k

, b

G

k

, E

k

in (6.2) are replaced by 0, b

G

∗

, I

n1

, respectively. Thus

Y

1

= [I

n1

, 0]

W −

₀

0 G

b

∗

−1

I

n1

0 ,

where the matrix W has already been computed in Algorithm 6.2.

7. Numerical results. In this section we present some numerical results. We

use the doubling algorithm to compute the stabilizing solution X

η

of the equation

X + B

η

X

−1

A

η

= Q

η

,

(7.1)

where A

η

, B

η

, Q

η

are given in (4.1). If these matrices have the special sparsity

stuc-tures in (6.1), then Algorithm 6.2 is used. To measure the accuracy of a computed

stabilizing solution X

η

to (7.1), we use the relative residual

RRes

η

=

kX

η

+ B

η

X

η−1

A

η

− Q

η

k

kX

η

k + kA

η

kkB

η

kkX

η−1

k + kQ

η

k

,

where k · k is the spectral norm. To see whether X

η

is a good approximation to the

weakly stabilizing solution X

∗

of the equation X + C

∗

X

−1

C = R, we compute

RRes =

kX

η

+ C

∗

_X

−1 η

C − Rk

kX

η

k + kCk

2

kX

η−1

k + kRk

(15)

We also use the QZ algorithm to compute X

∗

directly, and the relative residual RRes

0

is defined as in (7.2), with the computed X

η

replaced by the computed X

∗

.

Example 7.1.

We randomly generate two complex matrices C, D and two

complex Hermitian matrices R, P of dimension 6. Let % be the minimal eigenvalue of

P and set

P := P + (2kDk − %)I

6

.

We verify that P + λD

∗

+ λ

−1

_{D is positive definite for all λ ∈ T. We then compute}

the stabilizing solution X

η

of (7.1) with η = 10

−4

, 10

−8

, 10

−12

, respectively, by using

Algorithm 3.1. In each case, Algorithm 3.1 is stopped when max{kA

k+1

k, kB

k+1

k} <

10

−10

and Q

k+1

is taken to be the computed X

η

. When η = 0, P

0

(λ) = λ

2

C

∗

−λR+C

has 2m = 4 eigenvalues on T, given by

Λ = {−0.9026 + 0.4304i, 0.5687 − 0.8226i, 0.9891 + 0.1472i, 0.1960 + 0.9806i}.

By Theorem 5.1 we determine that

Λ

s

= {0.5687 − 0.8226i, 0.9891 + 0.1472i}

is such that the perturbed eigenvalues of P

η

(λ) (η > 0) associated with each λ

s

∈ Λ

s

are inside T. Then we compute the weakly stabilizing solution X

∗

of (7.1) by using

the invariant subspace corresponding to stable eigenvalues and eigenvalues in Λ

s

_(the

QZ algorithm). The numerical results are shown in Table 7.1.

Table 7.1 Relative residuals

η 10−4 10−8 10−12 0

RResη 3.36 × 10−16 4.03 × 10−15 4.24 × 10−15 3.09 × 10−16

We know that X

η,I

=

_2i1

(X

η

− X

η∗

) is positive definite for η > 0 and we know

from Theorem 4.1 that rank(X

∗,I

) ≤ m = 2. These are confirmed by the numerical

results shown in Table 7.2, where X

0,I

= X

∗,I

.

Table 7.2 The eigenvalues of Xη,I η The eigenvalues of Xη,I

10−4 2.3555, 1.2676, 5.87 × 10−3, 1.89 × 10−3, 1.21 × 10−3, 1.10 × 10−3 10−8 2.3510, 1.2639, 5.91 × 10−7, 1.89 × 10−7, 1.21 × 10−7, 1.10 × 10−7 10−12 _{2.3510, 1.2639, 5.91 × 10}−11_{, 1.89 × 10}−11_{, 1.21 × 10}−11_{, 1.10 × 10}−11

0 2.3510, 1.2639, 5.41 × 10−15, 1.61 × 10−15, −5.12 × 10−16, −4.18 × 10−15

Example 7.2. We consider a semi-infinite Hamiltonian operator of the transverse

magnetic (TM) mode for the two-dimensional photonic crystal of the form [5]

H(u, ~

k, ~

x) = −

1 ε(~

x)

∇ + i~k

· ∇ + i~k

u(~

x)

= −

1 ε(~

x)

(16)

where ~

k = (k

1

, k

2

) is a wave number in the first Brillouin zone Ω

∗

= (−π, π]

2

, ~

x ∈

Ω = [−0.5, ∞) × [−0.5, 0.5] = Ω

1

∪ Ω

2

with

_Ω

1

=

S

∞_j=0

B

ρ

(j),

Ω

2

=

S

∞_j=0

([−0.5 + j, 0.5 + j] × [−0.5, 0.5]) \B

ρ

(j)

and B

ρ

(j) = {(x

1

, x

2

)|(x

1

− j)

2

+ x

22

≤ ρ

2

}, 0 < ρ < 0.5, and ε(~

x) is the dielectric

function with

ε(~

x) =

ε

1

~

x ∈ Ω

1

,

ε

2

~

x ∈ Ω

2

.

See Figure 7.1 for an illustration of the domain Ω. By Bloch’s theorem, we assume

ρ

(1,0) (2,0) 1 Ω Ω1 Ω1 2 Ω ( 0.5,0)− (0,0.5) x1 x2

Fig. 7.1. The domain Ω = Ω1∪ Ω2.

that the boundary conditions are given by

u(~

x) = 0, ~

x ∈ {(−0.5, x

2

)|x

2

∈ [−0.5, 0.5]},

u(x

1

, 0.5) = e

ik2

u(x

1

, −0.5), x

1

∈ [−0.5, ∞].

We use the classical five-point central finite difference method to discretize the

operator (7.3) on the uniform grid points in Ω with mesh size h = 1/n. So n is the

number of grid points on the x

2

axis in [−0.5, 0.5). Let T

n

be the tridiagonal matrix

of dimension n with 4 on the main diagonal and −1 on the two adjacent diagonals

and let D

n

be the tridiagonal matrix of dimension n with 0 on the main diagonal and

−1 and 1 on the supper-diagonal and the sub-diagonal, respectively. Let

Φ =

1 h

2

(T

n

− δe

1

e

> n

− ¯

δe

n

e

>1

) −

ik

2

h

(D

n

+ δe

1

e

> n

− ¯

δe

n

e

>1

) + (k

2 1

+ k

2 2

)I

n

and

Ψ =

−

1 h

2

−

ik

1

h

I

n

,

where δ = e

ik2

_{and e}

j

denotes the jth column vector of the identity matrix. Then

the system Hamiltonian H from the operator (7.3) is a semi-infinite block tridiagonal

matrix with H

b

on the main diagonal and H

bb

and H

bb∗

on the supper-diagonal and

the sub-diagonal, respectively. The block matrices H

b

and H

bb

are of the forms

H

b

=







H

1,1

H

1,2

H

_1,2∗

H

2,2

. .

.

. .

_.

. .

_.

_H

n−1,n

H

∗ n−1,n

H

n,n







, H

bb

=

0

0 H

n,n+1

0 ∈ C

n2×n2

,

(17)

where

H

j,j

= Γ

j

ΦΓ

j

, H

j,j+1

= Γ

j

ΨΓ

j+1

, j = 1, . . . , n,

and Γ

j

= diag(Υ(:, j)), Υ = [Υ

ij

] ∈ R

n×n

with







Υ

ij

=

q

1 ε1

,

(−0.5 + jh, 0.5 − ih) ∈ B

ρ

(0),

Υ

ij

=

q

1 ε2

,

(−0.5 + jh, 0.5 − ih) /

∈ B

ρ

(0).

We now apply the Green’s function approach to the system Hamiltonian H with

the overlap matrix being the identity. So we need to determine the n

2

_{× n}

2

_block

(particularly the n × n block) in the upper-left corner of the inverse of the matrix

(1.3) (now with S

b

= I and S

bb

= 0). This is done by solving the matrix equation

(7.1). The matrices A

η

, B

η

, Q

η

∈ C

n

2_×n2

in (7.1) now have the structures in (6.1),

with n

j

= p = n and

K

j,j

= zI

n

− H

j,j

, K

j,j+1

= −H

j,j+1

, K

j+1,j

= −H

j,j+1∗

, j = 1, . . . , n,

where z = E + iη with E ∈ R and 0 ≤ η 1. We remark that the matrix equation

here is a special case of (1.1) with P = I, D = 0, C = A

η

= B

η∗

and R = E I − H

b

.

We now use Algorithm 6.2 to compute the solution X

η

of (7.1). In our test we

take n = 50, ρ = 0.3, ε

1

= 1, ε

2

= 10 and (k

1

, k

2

) = (0.5, 0.7). We divide [0, 15] into

κ subintervals using κ + 1 equally spaced nodes E

i

, i = 0, 1, . . . , κ. We now choose

κ = 500 and run Algorithm 6.2 with η = 10

−8

and τ = 10

−8

for each E

i

. In Figure 7.2,

we plot the relative residuals (RRes

η

, RRes) and the number of iterations of Algorithm

6.2. We see that very good approximations to X

η

and X

∗

are obtained in no more

than 33 iterations. We also determine the interested energy interval ∆ =

S

n

i=1

∆

i

,

where ∆

i

are given in Theorem 4.2. The energy values in ∆ are precisely those for

which the pencil (M, L), where M and L are given in (5.3), has eigenvalues on T. In

Figure 7.3, we plot the number of eigenvalues of (M, L) on T and ∆

T[0, 15]. For this

example, the number of such eigenvalues for E ∈ [0, 15] is 0, 2, or 4. For some larger

values of E , we find the number of such eigenvalues to be 6, which turns out to be

the maximal number for any energy value. As expected from our convergence results,

Algorithm 6.2 requires more iterations when (M, L) has unimodular eigenvalues, but

it does not matter too much whether the actual number of unimodular eigenvalues is

2, 4, or any other positive even integer.

8. Conclusion. We have introduced a class of nonlinear matrix equations that

is wider than those studied earlier in the literature. The main motivation for studying

this wider class is from the Green’s function approach for treating quantum transport

in nano devices. We have characterized the special solution of practical interest. We

have shown how the doubling algorithm and subspace methods like the QZ algorithm

can be used to find good approximations to the required solution. We have also shown

how some special sparsity structures in the coefficent matrices of the equation can be

expoited to drastically reduce the complexity of the doubling algorithm for

comput-ing the desired solution. The matrix equation from the nano application involves a

parameter. At present it is not clear whether the solution computed for one value of

the parameter can be used to reduce the computational work of some iterative

meth-ods in computing the solution for a nearby value of the parameter, with guaranteed

convergence. This could be a topic for further research.

(18)

10-17 10-16 10-15 10-14 10-13 10-12 10 15 20 25 30 35 10-13 10-12 10-11 5 0 0 5 10 15 0 5 10 15 0 5 10 15

Iterations of Alg. 6.2

RRes

η

RRes



Fig. 7.2. RResη, RRes and the number of iterations of Algorithm 6.2.

REFERENCES

[1] I. Appelbaum, T. Wang, J. D. Joannopoulos, and V. Narayanamurti, Ballistic hot-electron transport in nanoscale semiconductor heterostructures: Exact self-energy of a three-dimensional periodic tight-binding Hamiltonian, Physical Review B, 69 (2004), Arti-cle Number 165301, 6 pp.

[2] C.-Y. Chiang, E. K.-W. Chu, C.-H. Guo, T.-M. Huang, W.-W. Lin, and S.-F. Xu, Con-vergence analysis of the doubling algorithm for several nonlinear matrix equations in the critical case, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 227–247.

[3] S. Datta, Nanoscale device modeling: the Green’s function method, Superlattices and Mi-crostructures, 28 (2000), pp. 253–278.

[4] C. J. Earle and R. S. Hamilton, A fixed point theorem for holomorphic mappings, Global Analysis (Proc. Sympos. Pure Math., Vol. XVI, Berkeley, Calif., 1968), American Mathe-matical Society, Rhode Island, 1970, pp. 61–65.

[5] C. Engstr¨om and M. Wang, Complex dispersion relation calculations with the symmetric interior penalty method, Internat. J. Numer. Methods Engrg., 84 (2010), pp. 849–863. [6] I. Gohberg, S. Goldberg, M. A. Kaashoek, Classes of Linear Operators, Vol. II, Operator

Theory: Advances and Applications, Vol. 63, Birkh¨auser, 1993.

[7] C.-H. Guo, Y.-C. Kuo, and W.-W. Lin, On a nonlinear matrix equation arising in nano research, preprint, November 2010.

[8] C.-H. Guo, Y.-C. Kuo, and W.-W. Lin, Complex symmetric stabilizing solution of the matrix equation X + A>X−1A = Q, Linear Algebra Appl., to appear.

(19)

0 5 10 15 0 2 4



Numbers of eigenvalues on



0 5 10 15 [0 ,1 5] ∆ 

Fig. 7.3. The number of eigenvalues on T and interested energy interval between 0 to 15.

nano research, SIAM J. Sci. Comput., 32 (2010), pp. 3020–3038.

[10] C.-H. Guo and W.-W. Lin, Solving a structured quadratic eigenvalue problem by a structure-preserving doubling algorithm, SIAM J. Matrix Anal. Appl., 31 (2010), pp. 2784–2801. [11] L. A. Harris, Fixed points of holomorphic mappings for domains in Banach spaces, Abstr.

Appl. Anal., 2003, no. 5, pp. 261–274.

[12] D. L. John and D. L. Pulfrey, Green’s function calculations for semi-infinite carbon nan-otubes, Physica Status Solidi B – Basic Solid State Physics, 243 (2006), pp. 442–448. [13] A. Kletsov, Y. Dahnovsky, and J. V. Ortiz, Surface Green’s function calculations: A

nonrecursive scheme with an infinite number of principal layers, Journal of Chemical Physics, 126 (2007), Article Number 134105, 5 pp.

[14] D. S. Mackey, N. Mackey, C. Mehl, and V. Mehrmann, Structured polynomial eigenvalue problems: good vibrations from good linearizations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 1029–1051.

[15] C. van der Mee, G. Rodriguez, and S. Seatzu, LDU Factorization results for bi-infinite and semi-infinite scalar and block Toeplitz matrices, Calcolo, 33 (1996), pp. 307–335. [16] G. W. Stewart and J.-G. Sun, Matrix Perturbation Theory, Academic Press, 1990. [17] J. Tomfohr and O. F. Sankey, Theoretical analysis of electron transport through organic