• 沒有找到結果。

Numerical solution of nonlinear matrix equations arising from Green's function calculations in nano research

N/A
N/A
Protected

Academic year: 2021

Share "Numerical solution of nonlinear matrix equations arising from Green's function calculations in nano research"

Copied!
15
0
0

加載中.... (立即查看全文)

全文

(1)

Contents lists available atSciVerse ScienceDirect

Journal of Computational and Applied

Mathematics

journal homepage:www.elsevier.com/locate/cam

Numerical solution of nonlinear matrix equations arising from Green’s

function calculations in nano research

Chun-Hua Guo

a,∗

, Yueh-Cheng Kuo

b

, Wen-Wei Lin

c

aDepartment of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada bDepartment of Applied Mathematics, National University of Kaohsiung, Kaohsiung 811, Taiwan cDepartment of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan

a r t i c l e i n f o Article history: Received 14 December 2011 MSC: 15A24 65F30 Keywords:

Nonlinear matrix equation Weakly stabilizing solution Structure-preserving algorithm Green’s function

a b s t r a c t

The Green’s function approach for treating quantum transport in nano devices requires the solution of nonlinear matrix equations of the form X+(C∗+iηD∗)X−1(C+iηD) =R+iηP, where R and P are Hermitian, PD+λ−1D is positive definite for allλon the unit

circle, andη → 0+. For each fixedη > 0, we show that the required solution is the unique stabilizing solution Xη. Then X∗ = limη→0+Xηis a particular weakly stabilizing

solution of the matrix equation X+CX−1C=R. In nano applications, the matrices R and

C are dependent on a parameter, which is the system energyE. In practice one is mainly interested in those values ofEfor which the equation X+CX−1C =R has no stabilizing solutions or, equivalently, the quadratic matrix polynomial P(λ) = λ2CλR+C has eigenvalues on the unit circle. We point out that a doubling algorithm can be used to compute Xηefficiently even for very small values ofη, thus providing good approximations to X∗. We also explain how the solution X∗can be computed directly using subspace methods such as the QZ algorithm by determining which unimodular eigenvalues of P(λ) should be included in the computation. In some applications the matrices C,D,R,P have very special sparsity structures. We show how these special structures can be exploited to drastically reduce the complexity of the doubling algorithm for computing Xη.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

In this paper we study nonlinear matrix equations of the form

X

+

(

C

+

i

η

D

)

X−1

(

C

+

i

η

D

) =

R

+

i

η

P

,

(1)

where R and P are Hermitian, P

+

λ

D

+

λ

−1D is positive definite for all

λ

on the unit circle T, and

η ≥

0. The special case where P

=

I

,

D

=

0 and C

,

R are real arises in nano research [1–4] and has been studied in [5,6]. We now briefly explain how the general equation(1)also arises in nano applications.

A main goal of basic research in molecular electronics is to advance the understanding of electron transport through molecules. In [7], a method for calculating the current is described for a system that consists of a molecule connected between two semi-infinite metallic electrodes, and is implemented in a program that assumes a local-orbital picture and requires as input the Hamiltonian and overlap matrix elements between orbitals.

Corresponding author.

E-mail addresses:[email protected](C.-H. Guo),[email protected](Y.-C. Kuo),[email protected](W.-W. Lin). 0377-0427/$ – see front matter©2012 Elsevier B.V. All rights reserved.

(2)

The system Hamiltonian is a bi-infinite Hermitian matrix of the form H

=

H L HLM 0 HLMHM HMR 0 HMRHR

,

(2)

where HM

,

HL

,

HRare the Hamiltonians for the molecule, the left electrode, and the right electrode, respectively, and the

overlap matrix is a Hermitian positive definite matrix partitioned in the same way and is given by S

=

S L SLM 0 SLM SM SMR 0 SMRSR

.

In [7] the blocks HLand HLMin(2)are shifted by sLSLand sLSLM, respectively, where sLis a proper energy shift, and the

blocks HRand HMRare shifted similarly. These shifts do not change the structure of the matrix H in(2). So we can simply

assume that the matrix H in(2)has already gone through the shifting procedure. The Green’s function (of the full interacting system) is defined by

G

=

(

E

+

i0+

)

S

H

−1

=

lim

η→0+

((

E

+

i

η)

S

H

)

−1

,

whereEis energy. We note that for each

η >

0 the infinite matrix

(

E

+

i

η)

S

H

=

ES

H

+

i

η

S is known to be invertible by Bendixson’s theorem (see [8, Lemma 3.3]), but the existence of the above one-sided limit is something assumed. The molecule Green’s function GMis that part of G corresponding to the block for the molecule and is obtained from

GM

=

(

E

+

i0+

)

SM

HM

ΣL

ΣR

−1

,

where ΣL

=

(E

SLM

HLM

)

GL

(E

SLM

HLM

),

ΣR

=

(E

SMR

HMR

)

GR

(E

SMR

HMR

)

,

with GL

=

(

E

+

i0+

)

SL

HL

−1

,

GR

=

(

E

+

i0+

)

SR

HR

−1

.

Then [7] the net current is determined through a definite integral of the transmission function given by T

(

E

) =

tr

LGMΓRGM

),

where

ΓL

=

i

L

ΣL

),

ΓR

=

i

R

ΣR

),

and tr

(·)

denotes the trace of a matrix. Note that T

(

E

)

is a real function ofEsinceΓLandΓRare Hermitian.

We now explain how the matrixΣRis computed. The computation ofΣLis similar. The matrices HRand SRcan be written

as [7] HR

=

Hs Hsb HsbHb Hbb HbbHb Hbb HbbHb

...

... ...

,

SR

=

Ss Ssb SsbSb Sbb SbbSb Sbb SbbSb

...

... ...

,

where Hs

,

Ss

Cq×qand Hb

,

Sb

Cn×n, and we suppose that HM

,

SM

Cp×p. The size of Hsand Sshas been taken sufficiently

large so that all nonzero elements of the matrices HMRand SMRare in the p

×

q block on the left. This means that we only

need Gs, the q

×

q block in the upper-left corner of GR, for the computation ofΣR. It is easy to see [7] that Gsis determined

through Gs

=

(

Us

UsbGbUsb

)

−1

,

where Us

=

zSs

Hs

,

Usb

=

zSsb

Hsb

,

Usb

=

zSsb

Hsbwith z

=

E

+

i

η

and

η →

0 +, and G

bis the n

×

n block in the

upper-left corner of the inverse of

zSb

Hb zSbb

Hbb zSbb

HbbzSb

Hb zSbb

Hbb zSbb

HbbzSb

Hb

...

...

...

.

(3)

(3)

Note that the above matrix is invertible by Bendixson’s theorem since the matrix TR

=

Sb Sbb SbbSb Sbb SbbSb

...

... ...

(4)

is positive definite when S is positive definite. Note also thatTRis positive definite if and only if Sb

+

λ

Sbb

+

λ

−1Sbbis positive

definite for all

λ

on T.

The block Toeplitz structure of the matrix(3)implies that Gbsatisfies the matrix equation

Gb

=

(

Ub

UbbGbUbb

)

−1

,

(5) where Ub

=

zSb

Hb

,

Ubb

=

zSbb

Hbb

,

Ubb

=

zSbb

Hbb.

For any W

Cn×n, we can write W

=

WR

+

iWI, where the Hermitian matrices

WR

=

1 2

(

W

+

W

),

WI

=

1 2i

(

W

W

)

are called the real part and the imaginary part of W , respectively. We are only interested inEvalues for which the required solution Gbof(5)has a nonzero imaginary part (in the limit

η →

0+) since otherwise Gband then Gswould be Hermitian,

which would imply thatΣRis Hermitian and then T

(E

) =

0 for the transmission function.

Now we let

X

=

Gb1

,

C

=

ESbb

Hbb

,

D

=

Sbb

,

R

=

ESb

Hb

,

P

=

Sb

.

Then Eq.(5)becomes(1).

2. Characterization of the solution Gb

The matrix equation(1)may have many different solutions. So what solution X do we need so that X−1

=

Gbis the

required solution of(5)?

Let A

=

C

+

i

η

D

,

B

=

C

+

i

η

D

,

Q

=

R

+

i

η

P. Then(1)becomes

X

+

BX−1A

=

Q

.

(6)

As before, R and P are Hermitian and P

+

λ

D

+

λ

−1D is positive definite for all

λ

on T. Let

M

=

A 0 Q

I

,

L

=

0 I B 0

.

(7)

Then X is a solution of(6)if and only if M

I X

=

L

I X

X−1A

.

(8)

Therefore, every solution of(6)can be obtained from a suitable invariant subspace for the pencil M

λ

L.

Lemma 1. For any

η ̸=

0, the matrix pencil M

λ

L has no eigenvalues on T.

Proof. Suppose that

λ ∈

T and

(

M

λ

L

)

x

=

0 for a vector x

=

(

x⊤1

,

x

⊤ 2

)

with x 1

,

x2

Cn. Then Ax1

=

λ

x2

,

Qx1

x2

=

λ

Bx1

.

(9) By eliminating x2in(9)we have Wx1

λ

B

Q

+

λ

−1A

x1

=

0

.

(10)

The imaginary part of x

1Wx1is

η

x∗1

(

P

λ

D

λ

−1D

)

x

1. Since P

λ

D

λ

−1D is positive definite, it follows from(10)that x1

=

0. By(9)we have x2

=

0. Thus, M

λ

L has no eigenvalues on T. 

(4)

Proof. We consider the matrix pencils H

(

t

, λ) =

tA 0 tR

+

i

η

P

I

λ

0 I tB 0

obtained from the pencil M

λ

L by replacing C

,

D

,

R with tC

,

tD

,

tR. For each t

∈ [

0

,

1

]

and

λ ∈

T, P

+

λ(

tD

) + λ

−1

(

tD

) =

(

1

t

)

P

+

t

(

P

+

λ

D

+

λ

−1D

)

is positive definite. FromLemma 1we know that H

(

t

, λ)

has no eigenvalues on T for all t

∈ [

0

,

1

]

. Hence, H

(

1

, λ) =

M

λ

L and H

(

0

, λ)

have the same numbers of eigenvalues inside T. But it is clear that H

(

0

, λ)

has n eigenvalues at 0 and n eigenvalues at

. 

Note that the pencil M

λ

L is a linearization of the quadratic polynomial P

(λ) = λ

2B

λ

Q

+

A.

The basic fixed-point iteration for finding a solution of(6)is Xk+1

=

F

(

Xk

)

, whereF

(

X

) =

Q

BX−1A. The Fréchet

derivative ofF at X is the linear mapF′

X

:

Cn

×n

Cn×ngiven byFX

(

Z

) = (

BX

−1

)

Z

(

X−1A

)

. A solution X of(6)is said to be stabilizing if

ρ(

FX

) <

1 or, equivalently,

ρ(

BX−1

)ρ(

X−1A

) <

1, where

ρ(·)

denotes the spectral radius. Note that the basic fixed-point iteration is locally convergent at a stabilizing solution.

Let X be any solution of(6). Then we have P

(λ) = (λ

BX−1

I

)

X

I

X−1A

).

So the eigenvalues of X−1A are n eigenvalues of P

(λ)

, and the eigenvalues of BX−1are the reciprocals of the remaining n eigenvalues of P

(λ)

. It then follows fromTheorem 2that a solution X of(6)is stabilizing if and only if

ρ(

X−1A

) <

1 and that there is at most one stabilizing solution.

Let the invariant subspace of M

λ

L corresponding to its n eigenvalues inside T be spanned by the columns of the

matrix

U V

, where U

,

V

Cn×n. Then the existence of a stabilizing solution can be established by showing that U and V are

both invertible. The stabilizing solution is then X

=

VU−1. For the case where the matrices C

,

D

,

R

,

P in(6)are all real, an elementary proof for the invertibility of U has already been given in [9] and we note that the invertibility of V can be proved in the same way. It is also shown in [9] that the imaginary part of the stabilizing solution is positive definite for

η >

0. The proofs can be carried over to the complex case here with only very minor changes.

Here, however, we are going to use an advanced result on linear operators to show the existence of a stabilizing solution since this approach will also explain that the stabilizing solution is precisely the solution we need for the nano application. The treatment is very similar to the one in [6] for a special case of Eq.(6). So our presentation here will be very brief.

Recall that Gb

=

limη→0+Gb

(η)

, where Gb

(η)

is the n

×

n matrix in the upper-left corner of T−1with T given by(3)for

each

η >

0. Using the current notation, we have

T

=

Q B A Q B A Q

...

... ...

.

(11)

Associated with T is the rational matrix function

φ(λ) = λ

A

+

Q

+

λ

−1B. We already know from Bendixson’s theorem (see [8, Lemma 3.3]) that T is invertible for each

η >

0. Thus, by a result on linear operators (see [10, Chapter XXIV, Theorem 4.1] and [11]) we know that

φ(λ)

has a factorization

φ(λ) = (

I

λ

−1L

)

X

(

I

λ

U

)

(12)

with X invertible,

ρ(

L

) <

1 and

ρ(

U

) <

1. From(12)we see that A

= −

XU

,

B

= −

LX

,

Q

=

X

+

LXU

.

Thus X

+

BX−1A

=

Q and

ρ(

X−1A

) <

1. In other words, X is the unique stabilizing solution of(6).

By [10, Chapter XXIV, Theorem 4.1] the n

×

n matrix in the upper-left corner of T−1is precisely X−1. We thus have the following characterization of Gb

(η)

.

Theorem 3. For any

η >

0, the matrix Gb

(η)

is the inverse of the unique stabilizing solution of(6).

3. Computation of the stabilizing solution

Let M and L be as in(7). Then the stabilizing solution X of(6)satisfies(8)with

ρ(

X−1A

) <

1.

We remark that Eq.(6)with real matrices C

,

D

,

R

,

P also arises in the study of a quadratic eigenvalue problem from the vibration analysis of fast trains, where the required solution is also the stabilizing solution and a doubling algorithm is used to find the solution [8]. We can get similar results for our more general equation(6). The situation here is slightly more complicated since we no longer have B

=

A⊤and the stabilizing solution is no longer complex symmetric.

(5)

Starting with the matrices M and L in(7), we define the sequences

{

Mk

}

and

{

Lk

}

, where Mk

=

Ak 0 Qk

I

,

Lk

=

−

Pk I Bk 0

,

by the following structure-preserving doubling algorithm if no breakdown occurs.

Algorithm 1. Let A0

=

A

,

B0

=

B

,

Q0

=

Q

,

P0

=

0. For k

=

0

,

1

, . . . ,

compute Ak+1

=

Ak

(

Qk

Pk

)

−1 Ak

,

Bk+1

=

Bk

(

Qk

Pk

)

−1Bk

,

Qk+1

=

Qk

Bk

(

Qk

Pk

)

−1Ak

,

Pk+1

=

Pk

+

Ak

(

Qk

Pk

)

−1Bk

.

The above algorithm is the SDA-2 as presented in [12]. The next result shows that the doubling algorithm has some nice properties. In particular, it can compute the stabilizing solution X of(6)efficiently.

Theorem 4. Let X be the stabilizing solution of (6)and

X be the stabilizing solution of the dual equation

X

+

A

X

−1B

=

Q

.

Then

(a) The sequences

{

Ak

}

, {

Bk

}

, {

Qk

}

, {

Pk

}

in Algorithm 1 are well-defined.

(b) Qkconverges to X quadratically, Akand Bkconverge to 0 quadratically, Q

Pkconverges to

X quadratically, with

lim sup k→∞ 2k

Qk

X

∥ ≤

ρ(

X −1B

)ρ(

X−1A

),

lim sup k→∞ 2k

Ak

∥ ≤

ρ(

X−1A

),

lim sup k→∞ 2k

Bk

∥ ≤

ρ(

X −1 B

),

lim sup k→∞ 2k

Q

Pk

X

∥ ≤

ρ(

X −1 B

)ρ(

X−1A

),

where

∥ · ∥

is any matrix norm.

Proof. The proof is very similar to that of [8, Theorem 4.1]. Although the statement of that theorem and the beginning of its proof refer to the specific problem under consideration in [8], the proof there is valid for this theorem after some minor changes. Here we only mention the following differences. In [8], Q

=

Q and B

=

A(this would be true here if the matrices C

,

D

,

R

,

P were all real), and in that case we can conclude that Bk

=

Ak

,

Q

k

=

Qk

,

Pk

=

Pk, and

ρ(

X−1B

) = ρ(

X−1A

)

. 

Remark 1. Although we no longer have

ρ(

X−1B

) = ρ(

X−1A

)

in general, we always have

ρ(

X−1B

) = ρ(

X−1B

)

. In fact, the

eigenvalues of

X−1B are those of

P

(λ) = λ

2A

λ

Q

+

B inside T. We have mentioned earlier that the eigenvalues of BX−1

are the reciprocals of those eigenvalues of P

(λ) = λ

2B

λ

Q

+

A outside T. However, the reciprocals of the eigenvalues of P

(λ)

outside T are precisely the eigenvalues of

P

(λ)

inside T. It is also well known that BX−1and X−1B have the same

eigenvalues.

Remark 2. As in [6], we can show that the basic fixed-point iteration (FPI) Xk+1

=

Q

BXk−1A

,

X0

=

Q

is also convergent and X2k1

=

Qk. So the convergence of the FPI is much slower than the doubling algorithm. However, we

can use an averaging procedure for the FPI to speed up its convergence and for the nano application we can use the computed solution for one energy value as an initial guess for the solution for the next nearby energy value [7]. For the special case where P

=

I

,

D

=

0 and C

,

R are real, convergence results for methods based on these ideas have been proved in [5] using the Earle–Hamilton theorem [13,14]. The proofs there can be carried over to Eq.(6), with some minor changes, as long as D

=

0 still holds (so C is any complex matrix, R is any Hermitian matrix, and P is any Hermitian positive definite matrix). However, when D

̸=

0 we are unable to prove any non-local convergence results for those methods. The Earle–Hamilton theorem is not applicable since we no longer have B

=

A∗.

To emphasize its dependence on

η

, the stabilizing solution of(6) will be denoted by Xη. For the nano application, Gb

=

limη→0+Gb

(η)

and Gb

(η) =

Xη−1. So Gb

=

X∗−1with X

=

limη→0+Xη. It is easy to see that X∗is a particular weakly

stabilizing solution of the matrix equation

X

+

CX−1C

=

R

,

(13)

with

ρ(

X−1C

) ≤

1 and

ρ(

CX−1

) ≤

1. The solution Xcan be approximated by computing Xηby the doubling algorithm for

a sufficiently small

η

, but can also be computed directly by subspace methods, as we shall see in Section5. For now, we write X

=

X∗,R

+

iX∗,I, where the Hermitian matrices X∗,Rand X∗,Iare the real part and the imaginary part of X∗, respectively, and

we will examine the rank of X∗,I. Since the imaginary part of Xηis positive definite for

η >

0, we know that X∗,Iis positive

(6)

4. Rank of X∗,I

We now denote the matrices A

,

B

,

Q in(6)by Aη

,

Bη

,

Qη, respectively. So

Aη

=

C

+

i

η

D

,

Bη

=

C

+

i

η

D

,

Qη

=

R

+

i

η

P

,

(14)

with C

,

D

,

R

,

P as before. Let Pη

(λ) = λ

2Bη

λ

Qη

+

Aη

.

We already know that Pηhas no eigenvalues on T for

η ̸=

0. For

η =

0 we get P0

(λ) = λ

2C

λ

R

+

C

.

It is quite possible that P0

(λ)

has some eigenvalues on T. As we will see later, this is the case of primary interest for the nano application.

Theorem 5. The number of eigenvalues (counting multiplicities) of P0

(λ)

on T must be even, say 2m. Moreover, we have rank

(

X,I

) ≤

m.

Proof. The matrix polynomial P0

(λ)

is

-palindromic. Thus

µ

is an eigenvalue of P0

(λ)

if and only if 1

is so, and they have the same algebraic, geometric, and partial multiplicities [15]. It follows that the total number of eigenvalues of P0

(λ)

on T must be even.

By Xη

+

(

C

+

i

η

D

)

Xη−1

(

C

+

i

η

D

) =

R

+

i

η

P we have

Xη

+

CXη−1C

=

R

+

η

Wη

,

(15)

where

Wη

=

iP

iDXη−1C

iCXη−1D

+

η

DXη−1D

.

Taking imaginary parts on(15), we get

Kη

FηKηFη

=

η

Tη

,

(16)

where Kη

=

Im

(

Xη

),

Tη

=

Im

(

Wη

),

Fη

=

X−1

η C . Let F

=

limη→0+Fη

=

X∗−1C . Then the eigenvalues of F consist of all n

m

eigenvalues of P0

(λ)

inside T plus m eigenvalues of P0

(λ)

on T. Let F

=

V0

R0,1 0 0 R0,2

V0−1 (17)

be a spectral resolution of F , where R0,1

Cm×mand R0,2

C(nm)×(nm)are upper triangular with

σ (

R0,1

) ⊆

T and

σ (

R0,2

) ⊆

D

≡ {

λ ∈

C

||

λ| <

1

}

. It follows from [16, Chapter V, Theorem 2.8] that there is a nonsingular matrix Vηsuch that Fη

=

Vη

Rη,1 0 0 Rη,2

Vη−1

,

(18) and Rη,1

R0,1, Rη,2

R0,2, and Vη

V0, as

η →

0+. From(16)and(18)we have

VηKηVη

Rη,1 0 0 Rη,2

VηKηVη

Rη,1 0 0 Rη,2

=

η

VηTηVη

.

(19) Let VηKηVη

=

Hη,1 Hη,3 Hη,3 Hη,2

,

VηTηVη

=

Zη,1 Zη,3 Zη,3 Zη,2

.

(20) Then(19)becomes Hη,1

Rη,1Hη,1Rη,1

=

η

Zη,1

,

(21a) Hη,2

Rη,2Hη,2Rη,2

=

η

Zη,2

,

(21b) Hη,3

Rη,1Hη,3Rη,2

=

η

Zη,3

.

(21c)

As

η →

0+, Rη,1

R0,1with

ρ(

R0,1

) =

1, Rη,2

R0,2with

ρ(

R0,2

) <

1, and Zη,2and Zη,3are bounded by the convergence of Tη. So we have Hη,2

0 from(21b)and Hη,3

0 from(21c). Since X∗,I

=

limη→0+Kη, it follows from(20)that rank

(

X∗,I

) ≤

m. 

(7)

We conjecture that equality holds inTheorem 5when all eigenvalues of P0

(λ)

on T are simple. For the nano application, the matrices C and R in P0

(λ)

are given by

C

=

ESbb

Hbb

,

R

=

ESb

Hb

.

If P0

(λ)

has no eigenvalues on T for an energy valueE, then X∗is Hermitian byTheorem 5and Gb

=

X∗−1is also Hermitian.

We then know that the transmission function T

(E

)

takes zero value, without solving any nonlinear matrix equations. So we are only interested in thoseEvalues for which P0

(λ)

has some eigenvalues on T. The next simple result is thus relevant, where Sb

λ

Sbb

λ

−1Sbb∗ is positive definite for all

λ

on T.

Theorem 6. For

λ ∈

T, let the eigenvalues of

(

Sb

λ

Sbb

λ

−1Sbb

)

−1

(

H b

λ

Hbb

λ

−1Hbb

)

be

µ

1

(λ) ≤ · · · ≤ µ

n

(λ)

. Leti

=

min |λ|=1

µ

i

(λ),

max|λ|=1

µ

i

(λ)

,

and

=

n

i=1∆i. Then the quadratic pencil P0

(λ) = λ

2

(

ESbb

Hbb

) − λ(

ESb

Hb

) + (

ESbb

H

bb

)

has some eigenvalues on

T if and only if E

.

Proof. The quadratic P0

(λ)

has some eigenvalues on T if and only if det

(

P0

(λ)) =

0 for some

λ ∈

T or, equivalently, det

(−λ

−1P0

(λ)) =

det

E

(

Sb

λ

Sbb

λ

−1Sbb

) − (

Hb

λ

Hbb

λ

−1Hbb

) =

0

for some

λ ∈

T, the latter is equivalent toE

∆. 

5. Direct computation of X

The solution X∗can be computed directly by subspace methods. We will need to include all eigenvalues of

P

(λ) = λ

2C

λ

R

+

C (22)

inside T and half of its eigenvalues on T—the half that would be perturbed to the inside of T when P

(λ)

is perturbed to

Pη

(λ) = λ

2

(

C

+

i

η

D

) − λ(

R

+

i

η

P

) + (

C

+

i

η

D

).

(23) Let M

=

C 0 R

I

,

L

=

0 I C∗ 0

.

(24)

Then the pencilM

λ

L, also denoted by

(

M

,

L

)

, is a linearization of the quadratic matrix polynomial P

(λ)

. It is easy to check that y and z are the right and left eigenvectors, respectively, corresponding to an eigenvalue

λ

of P

(λ)

if and only if

y Ry

λ

Cy

,

z

λ

z

(25) are the right and left eigenvectors of

(

M

,

L

)

, respectively.

The following result is a generalization of [5, Theorem 3.1] for the special case where P

=

I

,

D

=

0 and C

,

R are real. It shows which invariant subspace corresponding to unimodular eigenvalues of P

(λ)

should be used in the computation of X∗,

assuming they are all semi-simple.

Theorem 7. Suppose that

λ

0is a semi-simple eigenvalue of P

(λ)

on T with multiplicity m0and Y

Cn×m0forms an orthonormal basis of right eigenvectors corresponding to

λ

0. Then iY

(

2

λ

0C

R

)

Y is a nonsingular Hermitian matrix. Let dj, j

=

1

, . . . , ℓ

,

be the distinct eigenvalues of

Y

(

P

λ

0D

λ

01D

)

Y

iY

(

2

λ

0C

R

)

Y

−1

with multiplicities mj0, and let

ξ

j

Cmm

j

0 form an orthonormal basis of the eigenspace corresponding to dj. Then for

η >

0 sufficiently small and j

=

1

, . . . ℓ

λ

(k) j

=

λ

0

λ

0dj

η +

O

2

),

k

=

1

, . . . ,

m j 0

,

and yj

=

Y

Y

(

P

λ

0D

λ

0−1D

)

Y

−1

ξ

j

+

O

(η)

(26)

(8)

Proof. Since P

0

)

Y

=

0 with YY

=

Im0and

|

λ

0

| =

1, we have 0∗

=

(

P

0

)

Y

)

=

λ

2 0Y

2 0C

λ

0R

+

C

).

It follows that Y forms an orthonormal basis for left eigenvectors of P

(λ)

corresponding to

λ

0. From(25), we obtain that the column vectors of YR

=

Y RY

λ

0CY

and YL

=

Y

λ

0Y

form a basis of left and right eigenspaces ofM

λ

Lcorresponding to

λ

0, respectively. Since

λ

0is semi-simple, the matrix

[

Y

, −λ

0Y

]

L

Y RY

λ

0CY

= −

Y

(

2

λ

0C

R

)

Y

= −

YP

0

)

Y is nonsingular. Let

YR

= −

YR

YP

0

)

Y

−1

,

Y

L

=

YL

.

Then we have

YL∗LYR

=

Im0 and Y

LMYR

=

λ

0Im0

.

(27)

For

η >

0 sufficiently small, we let

=

C

+

i

η

D 0 R

+

i

η

P

I

,

=

0 I C

+

i

η

D∗ 0

.

ThenMη

λ

Lηis a linearization of Pη

(λ)

. By(27)and [16, Chapter VI, Theorem 2.12] there are

YRandY

Lsuch that

YR

YR

and

YLY

L

are nonsingular and

Y∗L

Y∗L

M

YRY

R

 =

λ

0Im0 0 0 M

,

Y∗L

Y∗L

L

Y

R

YR

 =

Im0 0 0 L

.

Then, by [16, Chapter VI, Theorem 2.15] the column vectors ofY

R

+

O

(η)

span the right eigenspace of

(

,

)

corresponding

to

(Λ,

Im0

)

, where Λ

=

λ

0Im0

+

η

E11

+

O

2

) 

I m0

+

η

F11

+

O

2

)

−1 with E11

=

Y

L

iD 0 iP 0

YR

=

Y

0iP

iD

)

Y

YP

0

)

Y

−1

,

F11

=

Y ∗ L

0 0 iD∗ 0

YR

=

Y

0iD

)

Y

YP

0

)

Y

−1

.

Thus Λ

=

λ

0Im0

+

η(

E11

λ

0F11

) +

O

2

) = λ

0Im0

ηλ

0W

+

O

2

),

where W

=

Y

(

P

λ

0D

λ

0−1D

)

Y

iYP

0

)

Y

−1

.

(28) The matrix Z

=

iYP

0

)

Y

=

iY

(

2

λ

0C

R

)

Y in(28)is Hermitian since Z

Z

=

iY

(

2

λ

0C

+

2

λ

0C

2R

)

Y

=

2i

λ

0YP

0

)

Y

=

0

.

Since Y

(

P

λ

0D

λ

−01D

)

Y is positive definite, all eigenvalues of W in(28)are real and there are m0linearly independent eigenvectors. Let djfor j

=

1

, . . . , ℓ

be the distinct eigenvalues of W with multiplicities m

j

0, and let

ξ

j

Cmm

j

0 form an orthonormal basis of the eigenspace corresponding to dj. Then we have

Φ−1ΛΦ

=

λ

0Im0

ηλ

0diag

d1Im10

, . . . ,

dImℓ0

+

O

2

)

whereΦ

= [

ξ

1

, . . . , ξ

] ∈

Cmm0. Then for each j

∈ {

1

,

2

, . . . , ℓ}

, the perturbed eigenvalues

λ

(k)

j, k

=

1

, . . . ,

m j

0, and a basis of the corresponding invariant subspace ofMη

λ

Lηwith

λ

j(k)

|

η=0

=

λ

0can be expressed by

λ

(k)

(9)

and

ζ

j

=

YR

Y

(

P

λ

0D

λ

0−1D

)

Y

−1

ξ

j

+

O

(η).

(29b)

The equation in(26)follows from(29b). 

For the pencil

(

M

,

L

)

given by(24), the relationM

I X

=

L

I X

X−1A shows that the weakly stabilizing solution X∗of (13)is obtained by X

=

X2X1−1, where the columns of

X 1 X2

form a basis for the invariant subspace of

(

M

,

L

)

corresponding to its eigenvalues inside T and its eigenvalues on T that would be perturbed to the inside of T when

(

M

,

L

)

is perturbed to

(

Mη

,

Lη

)

with

η >

0.

We can use the QZ algorithm to determine this invariant subspace, with the aid ofTheorem 7when all unimodular eigenvalues of

(M,

L)are semi-simple. In practice, these unimodular eigenvalues are likely to be simple and the statements in ourTheorem 7can be simplified significantly. However, if 1 (or

1) happens to be an eigenvalue of

(

M

,

L

)

, then it must have even multiplicity because (counting multiplicity) half of eigenvalues at 1 (or

1) will be perturbed to the inside of T

and the other half to the outside. Typically 1 (or

1) will be a double eigenvalue of partial multiplicity 2, and the eigenvector corresponding to it should be used in the computation of X∗.

6. Exploiting sparsity

Subspace methods for finding Xmay be more efficient than the doubling algorithm that finds Xηfor a sufficiently small

η

. However, it is possible for the doubling algorithm to exploit certain sparsity structures in the matrices A

,

B

,

Q in(6)while subspace methods could not.

For the nano application here and for other applications, the matrices A

,

B

,

Q are from a semi-infinite block tridiagonal and block Toeplitz matrix, as given in the matrix T in(11). In some situations, the matrix T is block tridiagonal with the matrices on the three diagonals having some periodicity, but is not block Toeplitz when the submatrices in T are of the given sizes. To make T a block tridiagonal and block Toeplitz matrix, we would have to partition the matrix T into larger submatrices. To be more precise, the matrix T is given as in(11), and the matrices A

,

B

,

Q

Cn×nhave the following structures: Q

=

K1,1 K1,2 0 K2,1 K2,2

...

...

...

Kp−1,p 0 Kp,p−1 Kp,p

,

A

=

0 Kp+1,p 0 0

,

B

=

0 0 Kp,p+1 0

,

(30) where Kj,j

Cnj×nj

,

Kp+1,p

Cnnp

,

Kp,p+1

Cnp×n1.

We now useAlgorithm 1to compute the stabilizing solution Xsof(6). We remark that Eq.(6)here is more general than

the one studied in [8]. That equation arises in the vibration analysis of fast trains.

As in [8], the complexity ofAlgorithm 1can be reduced drastically by using the special structures of the matrices Q

,

A

,

B given by(30). Write Qk

=

Q

Pk. Then it is easily seen fromAlgorithm 1that the matrices Ak, Bk,

Pkand Pkhave the special

forms Ak

=

0 Ek 0 0

,

Bk

=

0 0 Fk 0

,

Pk

=

0 0 0

Gk

,

Pk

=

Gk 0 0 0

,

where Ek, Fk,

Gkand Gkare n1

×

np, np

×

n1, np

×

npand n1

×

n1matrices, respectively.Algorithm 1can be rewritten as the

following simplified algorithm.

Algorithm 2. Let E0

=

Kp+1,p

,

F0

=

Kp,p+1

,

G0

=

0

,

G0

=

0. For k

=

0

,

1

, . . . ,

compute

Sk,1 Sk,2

...

Sk,p

=

Q

Gk 0

...

0

Gk

−1

Ek 0

...

0

,

(31)

Tk,1 Tk,2

...

Tk,p

=

Q

Gk 0

...

0

Gk

−1

0

...

0 Fk

,

(32)

(10)

where Sk,i

Cni×npand Tk,i

Cni×n1, and then compute

Ek+1

=

EkSk,p

,

Fk+1

=

FkTk,1

,

Gk+1

=

Gk

+

FkSk,1

,

Gk+1

=

Gk

+

EkTk,p

.

(33)

The main task ofAlgorithm 2is to solve the large sparse linear systems in(31)and(32). This could be done by using the Sherman–Morrison–Woodbury formula, as in [8]. But here we present a new approach that is both simpler and less expensive. Let P

=

In1 0 0 0 0 Inp 0 Inn1−np 0

be a permutation matrix and note that

P

Q

Gk 0

...

0

Gk

P

=

K1,1

Gk 0 0 Kp,p

Gk V U C

,

where V

=

K1,2 0

· · ·

0 0

· · ·

0 Kp,p−1

,

(34) U

=

K2,1 0 0

...

...

0 0 Kp−1,p

,

C

=

K2,2 K2,3 0 K3,2 K3,3

...

...

...

Kp−2,p−1 0 Kp−1,p−2 Kp−1,p−1

.

(35)

Then the matrices Sk,1, Sk,p, Tk,1and Tk,pof the solutions of(31)and(32)satisfy



K1,1

Gk 0 0 Kp,p

Gk

VC−1U

 

Sk,1 Tk,1 Sk,p Tk,p

=

Ek 0 0 Fk

.

(36)

Note that the matrixVC−1Uis independent of k. Since Q

k

=

Q

−

Pkand limk→∞Qk

=

Xs, we know that Xsis obtained from

Q by replacing Kp,pin the lower-right corner with Kp,p

G∗, where

G

=

limk→∞

Gk.

The following algorithm gives a more detailed implementation ofAlgorithm 2and computes the stabilizing solution Xs

of(6).

Algorithm 3. Computation of Xs.

Input: Kj,j

Cnj×nj, Kj,j+1

Cnj×nj+1, Kj+1,j

Cnj+1×nj, j

=

1

, . . . ,

n, where np+1

=

n1; tolerance

τ

. Output: The stabilizing solution Xsof(6), where A

,

B

,

Q are given by(30).

TakeV

,

UandCin(34)and(35); Compute W

=

K 1,1 0 0 Kp,p

VC−1U; E0

=

Kp+1,p, F0

=

Kp,p+1,

G0

=

0, G0

=

0; For k

=

0

,

1

, . . .

S k,1 Tk,1 Sk,p Tk,p

=

W

G k 0 0 Gk



−1

E k 0 0 Fk

; Ek+1

=

EkSk,p

,

Fk+1

=

FkTk,1

,

Gk+1

=

Gk

+

FkSk,1, Gk+1

=

Gk

+

EkTk,p; If

FkSk,1

∥ ≤

τ∥

Gk

and

EkTk,p

∥ ≤

τ∥

Gk

, then Xs

Q , Xs

(

n

np

+

1

:

n

,

n

np

+

1

:

n

) ←

Kp,p

Gk+1, and stop.

(11)

In nano applications, we typically need the n1

×

n1matrix in the upper-left corner of T−1, where T is given by(11). We also know that X−1

s is the n

×

n matrix in the upper-left corner of T

−1. So we are mainly interested in the n

1

×

n1matrix Y1in the upper-left corner of Xs−1. Note that Y1is the same as the matrix Sk,1in(31)when Gk

,

Gk

,

Ekin(31)are replaced by

0

,

G

,

In1, respectively. Thus Y1

=

In1

,

0

W

0 0 0

G



−1

In1 0

,

where the matrix W has already been computed inAlgorithm 3.

7. Numerical results

In this section we present some numerical results. We use the doubling algorithm to compute the stabilizing solution Xη of the equation

X

+

BηX−1Aη

=

Qη

,

(37)

where Aη

,

Bη

,

Qηare given in(14). If these matrices have the special sparsity structures in(30), thenAlgorithm 3is used. To measure the accuracy of a computed stabilizing solution Xηto(37), we use the relative residual

RResη

=

Xη

+

BηX−1

η Aη

Qη

Xη

∥ + ∥

Aη

∥∥

Bη

∥∥

X−1

η

∥ + ∥

Qη

,

where

∥ · ∥

is the spectral norm. To see whether Xηis a good approximation to the weakly stabilizing solution X∗of the

equation X

+

CX−1C

=

R, we compute RRes

=

Xη

+

CXη−1C

R

Xη

∥ + ∥

C

2

X−1 η

∥ + ∥

R

.

(38) We also use the QZ algorithm to compute X∗directly, and the relative residual RRes0is defined as in(38), with the computed Xηreplaced by the computed X∗.

Example 1. We randomly generate two complex matrices C , D and two complex Hermitian matrices R, P of dimension 6.

Let

ϱ

be the minimal eigenvalue of P and set P

:=

P

+

(

2

D

∥ −

ϱ)

I6

.

We verify that P

+

λ

D

+

λ

−1D is positive definite for all

λ ∈

T. We then compute the stabilizing solution Xηof(37)with

η =

10−4, 10−8, 10−12, respectively, by usingAlgorithm 1. In each case,Algorithm 1is stopped when max

{∥

Ak+1

, ∥

Bk+1

∥}

<

10−10and Q

k+1is taken to be the computed Xη. When

η =

0, P0

(λ) = λ

2C

λ

R

+

C has 2m

=

4 eigenvalues on T, given by

Λ

= {−

0

.

9026

+

0

.

4304i

,

0

.

5687

0

.

8226i

,

0

.

9891

+

0

.

1472i

,

0

.

1960

+

0

.

9806i

}

.

ByTheorem 7we determine that

Λs

= {

0

.

5687

0

.

8226i

,

0

.

9891

+

0

.

1472i

}

is such that the perturbed eigenvalues of Pη

(λ)

(

η >

0) associated with each

λ

s

Λsare inside T. Then we compute the

weakly stabilizing solution X∗of(37)by using the invariant subspace corresponding to stable eigenvalues and eigenvalues

inΛs(the QZ algorithm). The numerical results are shown inTable 1.

Table 1 Relative residuals.

η 10−4 10−8 10−12 0

RResη 3.36×10−16 4.03×10−15 4.24×10−15 3.09×10−16

We know that Xη,I

=

2i1

(

Xη

Xη∗

)

is positive definite for

η >

0 and we know fromTheorem 5that rank

(

X∗,I

) ≤

m

=

2.

(12)

Table 2 The eigenvalues of Xη,I. η The eigenvalues of Xη,I 10−4 2.3555,1.2676,5.87×10−3,1.89×10−3,1.21×10−3,1.10×10−3 10−8 2.3510,1.2639,5.91×10−7,1.89×10−7,1.21×10−7,1.10×10−7 10−12 2.3510,1.2639,5.91×10−11,1.89×10−11,1.21×10−11,1.10×10−11 0 2.3510,1.2639,5.41×10−15,1.61×10−15, −5.12×10−16, −4.18×10−15

Example 2. We consider a semi-infinite Hamiltonian operator of the transverse magnetic (TM) mode for the

two-dimensional photonic crystal of the form [17] H

(

u

, ⃗

k

, ⃗

x

) = −

1

ε(⃗

x

)

∇ +

i

k

·

∇ +

i

k

u

(⃗

x

)

= −

1

ε(⃗

x

)

+

2i

k

· ∇ − ∥⃗

k

2

u

(⃗

x

),

(39)

where

k

=

(

k1

,

k2

)

is a wave number in the first Brillouin zoneΩ∗

=

(−π, π]

2,

x

= [−

0

.

5

, ∞) × [−

0

.

5

,

0

.

5

] =

Ω1

Ω2with

Ω1

=

j=0 Bρ

(

j

),

Ω2

=

j=0

([−

0

.

5

+

j

,

0

.

5

+

j

] × [−

0

.

5

,

0

.

5

]

) \

Bρ

(

j

)

and Bρ

(

j

) = {(

x1

,

x2

)|(

x1

j

)

2

+

x22

ρ

2

}

, 0

< ρ <

0

.

5, and

ε(⃗

x

)

is the dielectric function with

ε(⃗

x

) =

ε

1

x

Ω1

,

ε

2

x

Ω2

.

SeeFig. 1for an illustration of the domainΩ. By Bloch’s theorem, we assume that the boundary conditions are given by

u

(⃗

x

) =

0

, ⃗

x

∈ {

(−

0

.

5

,

x2

)|

x2

∈ [−

0

.

5

,

0

.

5

]}

,

u

(

x1

,

0

.

5

) =

eik2u

(

x1

, −

0

.

5

),

x1

∈ [−

0

.

5

, ∞].

Fig. 1. The domainΩ = Ω1∪Ω2.

We use the classical five-point central finite difference method to discretize the operator(39)on the uniform grid points inΩwith mesh size h

=

1

/

n. So n is the number of grid points on the x2axis in

[−

0

.

5

,

0

.

5

)

. Let Tnbe the tridiagonal matrix

of dimension n with 4 on the main diagonal and

1 on the two adjacent diagonals and let Dnbe the tridiagonal matrix of

dimension n with 0 on the main diagonal and

1 and 1 on the super-diagonal and the sub-diagonal, respectively. Let

Φ

=

1 h2

(

Tn

δ

e1en

− ¯

δ

ene⊤1

) −

ik2 h

(

Dn

+

δ

e1en

− ¯

δ

ene⊤1

) + (

k 2 1

+

k 2 2

)

In and Ψ

=

1 h2

ik1 h

In

,

where

δ =

eik2and e

jdenotes the jth column vector of the identity matrix. Then the system Hamiltonian H from the operator

(13)

Fig. 2. RResη, RRes and the number of iterations ofAlgorithm 3. sub-diagonal, respectively. The block matrices Hband Hbbare of the forms

Hb

=

H1,1 H1,2 H1,2 H2,2

...

...

...

Hn−1,n Hn1,n Hn,n

,

Hbb

=

0 0 Hn,n+1 0

Cnn2

,

where Hj,j

=

ΓjΦΓj

,

Hj,j+1

=

ΓjΨ Γj+1

,

j

=

1

, . . . ,

n

,

andΓj

=

diag

(:,

j

))

= [

Υij

] ∈

Rn×nwith

Υij

=

1

ε

1

, (−

0

.

5

+

jh

,

0

.

5

ih

) ∈

Bρ

(

0

),

Υij

=

1

ε

2

, (−

0

.

5

+

jh

,

0

.

5

ih

) ̸∈

Bρ

(

0

).

We now apply the Green’s function approach to the system Hamiltonian H with the overlap matrix being the identity. So we need to determine the n2

×

n2block (particularly the n

×

n block) in the upper-left corner of the inverse of the matrix

(3)(now with Sb

=

I and Sbb

=

0). This is done by solving the matrix equation(37). The matrices Aη

,

Bη

,

Qη

Cn 2×n2

in

(37)now have the structures in(30), with nj

=

p

=

n and

Kj,j

=

zIn

Hj,j

,

Kj,j+1

= −

Hj,j+1

,

Kj+1,j

= −

Hj∗,j+1

,

j

=

1

, . . . ,

n

,

where z

=

E

+

i

η

withE

R and 0

η ≪

1. We remark that the matrix equation here is a special case of(1)with P

=

I, D

=

0, C

=

Aη

=

B

ηand R

=

EI

Hb.

We now useAlgorithm 3to compute the solution Xηof(37). In our test we take n

=

50,

ρ =

0

.

3,

ε

1

=

1,

ε

2

=

10 and

(

k1

,

k2

) = (

0

.

5

,

0

.

7

)

. We divide

[

0

,

15

]

into

κ

subintervals using

κ +

1 equally spaced nodesEi, i

=

0

,

1

, . . . , κ

.

數據

Table 1 Relative residuals.
Table 2 The eigenvalues of X η, I. η The eigenvalues of X η, I 10 − 4 2 . 3555 , 1 . 2676 , 5
Fig. 2. RRes η , RRes and the number of iterations of Algorithm 3. sub-diagonal, respectively
Fig. 3. The number of eigenvalues on T and interested energy interval between 0 and 15.

參考文獻

相關文件

You are given the wavelength and total energy of a light pulse and asked to find the number of photons it

“Find sufficiently accurate starting approximate solution by using Steepest Descent method” + ”Compute convergent solution by using Newton-based methods”. The method of

Now, nearly all of the current flows through wire S since it has a much lower resistance than the light bulb. The light bulb does not glow because the current flowing through it

利用 determinant 我 們可以判斷一個 square matrix 是否為 invertible, 也可幫助我們找到一個 invertible matrix 的 inverse, 甚至將聯立方成組的解寫下.

Then, we tested the influence of θ for the rate of convergence of Algorithm 4.1, by using this algorithm with α = 15 and four different θ to solve a test ex- ample generated as

Numerical results are reported for some convex second-order cone programs (SOCPs) by solving the unconstrained minimization reformulation of the KKT optimality conditions,

Particularly, combining the numerical results of the two papers, we may obtain such a conclusion that the merit function method based on ϕ p has a better a global convergence and

Then, it is easy to see that there are 9 problems for which the iterative numbers of the algorithm using ψ α,θ,p in the case of θ = 1 and p = 3 are less than the one of the