• 沒有找到結果。

Solution of a nonsymmetric algebraic Riccati equation from a two-dimensional transport model

N/A
N/A
Protected

Academic year: 2021

Share "Solution of a nonsymmetric algebraic Riccati equation from a two-dimensional transport model"

Copied!
14
0
0

加載中.... (立即查看全文)

全文

(1)

Contents lists available atScienceDirect

Linear Algebra and its Applications

j o u r n a l h o m e p a g e : w w w . e l s e v i e r . c o m / l o c a t e / l a a

Solution of a nonsymmetric algebraic Riccati equation from

a two-dimensional transport model

Tiexiang Li

a

, Eric King-wah Chu

b,

, Jong Juang

c

, Wen-Wei Lin

d

aDepartment of Mathematics, Southeast University, Nanjing 211189, People’s Republic of China bSchool of Mathematical Sciences, Building 28, Monash University, VIC 3800, Australia cDepartment of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan

dCMMSC and NCTS, Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan

A R T I C L E I N F O A B S T R A C T Article history: Received 18 January 2010 Accepted 25 August 2010 Submitted by V. Mehrmann AMS classification: 15A24 45G10 45J05 65H10 65R20 65Z05 82D75 Keywords:

Algebraic Riccati equation Doubling algorithm Fixed-point iteration Newton’s method Reflection kernel Transport theory

For the steady-state solution of an integral–differential equation from a two-dimensional model in transport theory, we shall derive and study a nonsymmetric algebraic Riccati equation B−−XF−− F+X+XB+X=0, where F±≡I− ˆsPD±, B≡ (ˆbI+ ˆsP)D−and

B+≡ ˆbI+ ˆsPD+with a nonnegative matrix P, positive diagonal matrices D±, and nonnegative parameters f,bˆ≡b/(1−f)and

ˆ

ss/(1−f). We prove the existence of the minimal nonnegative solution Xunder the physically reasonable assumption f+b+ sP(D++D)<1, and study its numerical computation by fixed-point iteration, Newton’s method and doubling. We shall also study several special cases; e.g. whenbˆ=0 and P is low-ranked, then X∗= 2ˆsUV is low-ranked and can be computed using more

efficient iterative processes in U and V. Numerical examples will be given to illustrate our theoretical results.

© 2010 Elsevier Inc. All rights reserved.

∗Corresponding author. Tel.: +61 3 99054480; fax: +61 3 99054403.

E-mail addresses: [email protected] (T. Li), [email protected] (E.K.-w. Chu), [email protected] (J. Juang), [email protected](W.-W. Lin).

0024-3795/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.laa.2010.09.006

(2)

Fig. 1. The four-port-system.

1. Introduction

Transport theory has been an active area of research, associated with masters like Bellman and Chandrasekhar (see [2,13] and the references therein). A one-dimensional model was studied first in [13], stimulating a series of numerical studies, e.g., in [1,14,15,17], in the past 15 years. We shall study a more general two-dimensional model in this paper.

In [18,19], the transport of particles stemming from a rectangular beam bounded in the rectangle

[

0, x

] × [

0, y

]

and unbounded in the orthogonal z direction, incident upon a similar rectangular region, was considered. Assuming that the incident particle beam is from the East, it is important to determine the East reflection kernel, from which the corresponding transmission, left-turn and right-turn kernels can be deduced [18].

Consider the domainΘ

= [

0,

α] × [

0,

β]

in Fig.1, with particle input Udand output Vdfrom various

directions d

=

N, S, E, W . From the input UE from the East, we are interested in the corresponding

transmission, left-turn, right-turn and reflection operatorsTE,GE,DEandRE, respectively, producing

outputsTEUE,GEUE,DEUEandREUE.

For example, as on the left of Fig.2forTE, the incident flux UEcomes in at y

=

yiand the resulting

emerging flux emerges at y

=

ye, produces an output in the form

[

TEUE](ye

) ≡

β

0

TE

,

β

, ye, yi

)

UE

(

yi

)

dyi,

where TE

(

x, y, ye, yi

)

is the corresponding transmission kernel. For the output flux VEfrom the East, we

have contributions from all four directions, summing to

VE

=

TEUE

+

DNUN

+

RWUW

+

GSUS

.

To understand the system, we need to determine all the kernels. Because of symmetry, we shall consider only the kernels corresponding to the Eastern direction.

From [18,19], the integral–differential equations for the kernels forTE,GE,DE, andREhave been

derived. For the kernel REof the Eastern reflection operatorRE, we have

1

σ

RE

x

(

x, y, ye, yi

) =

b

δ(

ye

yi

) +

s p

(

ye, yi

) +

2

(

f

1

)

RE

(

x, y, ye, yi

)

+

s y 0  p

(

y, yi

)

RE

(

x, y, ye, y

) +

p

(

ye, y

)

RE

(

x, y, y, yi

)

 dy

+

y 0  bRE

(

x, y, ye, y

) +

s  y 0 p

(

y, y

)

RE

(

x, y, ye, y

)

dy 

×

RE

(

x, y, y, yi

)

dy, (1)

(3)

Fig. 2. The incremental layer and the transmission operatorTE.

with initial/boundary conditions RE

(

0,

β

, ye, yi

) =

0, where

δ(·)

is the Kronecker function with

δ(

0

) =

1 but vanishing at other arguments. For the parameters in (1),

σ

is the positive scattering cross-section, and f , b and s are nonnegative expected numbers of particles that emerged from a collision moving in, respectively, the same direction as the particle that engendered the collision, the opposite, or either of the two orthogonal directions. The nonnegative intensity kernel p

,

·)

defines the expected intensity operatorP:

(

PV

)

 y(k) 

=

y 0 p  y(k), y  V

(

y

)

dy, (2) on some function V .

One important problem in transport theory is to understand the behaviour of the four-port model in Figs.1and2, through the operatorsTE,GE,DEandRE, whose kernels can be obtained by solving their

corresponding integral–differential equations like (1). One important feature is that the equations for the kernels TE, GEand DEare decoupled from each other, but dependent on the reflection kernel RE;

details can be found in [18,19]. In this paper, we shall concentrate on the steady-state solution to (1) for the reflection kernel RE. The global solution of the integral–differential equation (1) can be computed as

a fixed point of a positive operator, with the steady-state solution as its natural upper bound. Moreover, these global solutions converge to the steady-state solution under favourable conditions. Note that RE

is differentiated with respect to the spatial variable x in (1) and the steady-state solution has to be interpreted as the scattering behaviour stabilizing further away from the source, rather than against time.

Quoting from [18,19], we shall attempt to pass on the essence of the approach of “invariant imbed-ding”, which produces (1).

To the subregion

[

0, x

] × [

0, y

]

(on the right in Fig.2) ofΘ(in Fig.1) with x

∈ (

0,

α)

and y

∈ (

0,

β)

, imbed an additional strip

[

x, x

+

x

] × [

0, y

]

on the right. Particles enter the subregion from the right are reflected back from several possibilities. First, the probability of a particle having a collision inside the strip is

σ

x, of which f

σ

x accounts for it continuing forward, b

σ

x backward, and s

σ

x left

or right. When a particle enters the strip at y

=

yi, collides and transverses up (or down) the strip, it

produces a source and an emerging flux out of the strip to the left or right at y

=

yewith probability s

σ

p

(

yi, ye

)

x. Also, the probability of any particle going through the strip is

[

1

+ σ

x

(

f

1

)]

, being

the sum of 1

− σ

x (no collision) and f

σ

x (going forward after collision). Let RE

(

x, y, ye, yi

)

and RE

(

x

+

x, y, ye, yi

)

represent the amounts of reflection, with incident particles at y

=

yiand emerging

particles at y

=

ye, in the subregion and the augmented region, respectively. Showing only the effects

of at most two collisions in the strip, we have

RE

(

x

+

x, y, ye, yi

) = [

1

+ σ

x

(

f

1

)]

RE

(

x, y, ye, yi

)[

1

+ σ

x

(

f

1

)]

+

b

σ δ(

ye

yi

)

x

+

s

σ

p

(

ye, yi

)

x

+ · · ·

=

RE

(

x, y, ye, yi

) +

2

(

f

1

RE

(

x, y, ye, yi

)

x

+

b

σδ(

ye

yi

)

x

+

s

σ

p

(

ye, yi

)

x

+ · · · .

(3)

The first term in between the equality signs in (3) accounts for the particle going through the strip into the subregion, turning after reflection and then going through the strip again and emerge. The second

(4)

term accounts for the particle bouncing back in the strip, and the third term for the particle to collide in the strip, transverse and exit. Rearrange and letx

0, (3) implies

1

σ

RE

x

(

x, y, ye, yi

) =

b

δ(

ye

yi

) +

s p

(

ye, yi

) +

2

(

f

1

)

RE

(

x, y, ye, yi

) + · · ·

,

a truncated version of (1). We now abbreviate the discussion and “explain” the remaining terms in (1). For the first integral, the particle collides and transverses in the strip, enters the subregion (at y

=

y) and reflects, or reflects first before colliding (at y

=

y) and transversing in the strip before emerging. The second integral accounts for the particle going through the strip, reflecting in the subregion, bouncing back to the strip (at y

=

y), reflecting again and emerging to the right. For the last (double) integral, the particle goes through the strip, reflects, collides in the strip (at y

=

y) and transverses in the strip, re-enters into the subregion (at y

=

y), reflects and emerges to the right. The integrals sum all the possibilities in yand y. Many other paths for the particle are obviously possible but they contribute towards higher order terms inx and disappear whenx

0.

From obvious probabilistic reasons in the above transport model, we assume the inequalities

b

+

f

+

2s 1, (4)

ψ(

P

) ≡

max

y∈[0,1],V=1

01p

(

y, y

)

V

(

y

)

dy  1, (5)

with b, f , s and p

(

y, y

)

being nonnegative. When (4) is satisfied with equality, our system and the resulting nonsymmetric algebraic Riccati equation (6) are described as critical.

For the steady-state solution, we have the right-hand-side of the integral–differential equation (1) equals to zero, yielding an integral equation in RE. Assume without loss of generality that y

=

1 and

apply numerical quadrature with n positive weights

{

d±k

}

and nodes

{

y(k)

}

, we have the approximated equation at yi

=

y(m)and ye

=

y(l): b

δ

lm

+

splm

+

2

(

f

1

)

rlm

+

s k  rlkdkpkm

+

plkd+krkm 

+

b k rlkdkrkm

+

s k,t rlkdkpktd+t rtm

=

0

with plm

p

(

y(l), y(m)

)

, rlm

RE

(

x, y, y(l), y(m)

)

and D±

diag

{

dk±

}

. Note that d±k are of O

(

n−1

)

for

many standard numerical quadratures, and we allow the flexibility of different weights d+k and dk, thus different accuracies, for the numerical integration} respect to yand yin the double integral in (1). In matrix form withb

ˆ

b

/(

1

f

)

,

ˆ

s

s

/(

1

f

)

and using the convention M

= [

mij

]

(with capital

letters denoting matrices and the corresponding lower-case letters with indices for their elements), we have the nonsymmetric algebraic Riccati equation (NARE):



ˆ

bI

+ ˆ

sP

2R

+ ˆ

s

(

RDP

+

PD+R

) + ˆ

bRDR

+ ˆ

sRDPD+R

=

0

.

Post-multiply the above equation by D−, the NARE now reads:

B

XF

F+X

+

XB+X

=

0 (6)

with the n

×

n matrices X

RD−and

F±

I

− ˆ

sPD±, B

bI

ˆ

+ ˆ

sPD, B+

≡ ˆ

bI

+ ˆ

sPD+

.

(7)

Remark 1.1. The more useful variable is X

=

RD, not R or RE, because ultimately we are interested

in integrals like

(

REV

)

 y

=

 y 0 RE  x, y, y, yVydy

≈ (

RDv

)

k

= (

Xv

)

k (8)

for some function V with the corresponding function values in v

= [

V

(

y(1)

)

,

. . .

, V

(

y(n)

)]

and y

(5)

The “convergence” of the solution X of (6) toREfor the original integral equation is an interesting

problem on its own and will be pursued elsewhere. At the moment, we shall assume that the solution

X is an accurate approximation toREfor large enough values of n, in the sense that the difference

between the left- and right-hand-sides in (8) diminishes to zero as n

→ ∞

. In addition toRE

(

V

)

in (8), we may also be interested in

 REV   y

 y 0 VyRE  x, y, y, ydy

vD+R k

=

 vY k

with Y

=

D+R

=

D+X

(

D

)

−1. Analogously, we can derive an NARE for Y (whose existence requires an assumption on D±P in 1-norm, similar to that on PD±in (11)) but it may be simpler to solve (6) for

X and then retrieve Y afterward. 2. Existence of solution

Some relevant definitions are as follows. For any matrices A, B

R

m×n, we write A B

(

A

>

B

)

if their elements satisfya

ˆ

ijb

ˆ

ij

aij

> ˆ

bij

)

for all i, j. A real square matrix A is called a Z-matrix if all

its off-diagonal elements are nonpositive. It is clear that any Z-matrix A can be written as sI

B with

B 0. A Z-matrix A is called an M-matrix if s

ρ(

B

)

, where

ρ(·)

is the spectral radius; it is a singular M-matrix if s

= ρ(

B

)

and a nonsingular M-matrix if s

> ρ(

B

)

. We have the following useful results from [3] and [8, Theorem 1.1]:

Lemma 2.1. For a Z-matrix A, the following three statements are equivalent

:

(

a

)

A is a nonsingular M-matrix

.

(

b

)

A−1 0

.

(

c

)

Av

>

0 for some vector v

>

0

.

Theorem 2.2. For the NARE

X CX

X D

AX

+

B

=

0 (9)

where A, B, C and D are real matrices of sizes m

×

m, m

×

n, n

×

m, n

×

n, respectively

.

Assume that M

=

 D

C

B A



(10)

is a nonsingular M-matrix or an irreducible singular M-matrix

.

Then the NARE has a minimal nonnegative solution S

.

If M is irreducible, then S

>

0 and A

S C and D

CS are irreducible M-matrices

.

If M is a nonsingular M-matrix, then A

S C and D

CS are nonsingular M-matrices

.

If M is an irreducible singular M-matrix with positive left and right null vectors

[

u1, u2

]

and

[

v1, v2

]

(

where u1, v1

R

n and u2, v2

R

m

)

satisfying

u1v1

/=

u2v2, then

MS

=

In

⊗ (

A

S C

) + (

D

CS

)

Im

is a nonsingular M-matrix. If M is an irreducible singular M-matrix with u1v1

=

u2v2, then MSis an irreducible singular M-matrix

.

Applying Lemma 2.1 and Theorem 2.2, we have the following existence result:

Theorem 2.3. Under the assumption that

b

+

f

+

s



P

(

D+

+

D

)

<

1, (11)

(6)

Proof. Applying Theorem 2.2, we need to show that the Z-matrix M

=

⎡ ⎣ I

− ˆ

sPD



ˆ

bI

+ ˆ

sPD+



ˆ

bI

+ ˆ

sPDI

− ˆ

sPD+ ⎤ ⎦

=

I

− ˆ

b  I D− 

− ˆ

s  P P   D, D+ (12) is a nonsingular M-matrix or irreducible singular M-matrix. For the former, applying Lemma 2.1 to M requires a vector v

>

0 such that Mv

>

0. Let v

=

e (the vector of all ones) and we need

ˆ

b

+ ˆ

s



P

(

D+

+

D

)

<

1, (13)

which is equivalent to our assumption (11).



For the rest of the paper, any matrix norm will be the

-norm unless otherwise stated. Many other useful results on more general NAREs can be found in [8].

Remark 2.1. For the numerical quadrature chosen in deriving (6) from (1), we shall assume that it is exact for some interpolating function of appropriate smoothness. Note that most numerical quadra-tures can be derived through exact integration of such interpolating functions V , and different V s yield different quadratures or weights. In other words, for some v

= [

V

(

y(1)

)

,

. . .

, V

(

y(n)

)]

and its interpolating function V , we have

(

PV

)(

y(l)

) =

1 0 p

(

y(l), y

)

V

(

y

)

dy

=

k plkd±kvk

= (

PD±v

)

l

.

Together with (5) and for some v with



v

 =

1 and its interpolating function V , it implies that



PD±

 = 

PD±v

 =

max

l

(

PV

)(

y(l)

)



ψ(

P

)

 1

.

(14)

With (4) and (14) not both satisfied with equality, the sufficient condition in (13) or assumption (11) in Theorem 2.3 are satisfied. Consequently, the critical case (with equality in (4)) does not satisfy (14) only if (5) is also satisfied with equality.

We shall consider the super-critical case, when both (4) and (5) are satisfied with equality, later in the next sub-section. In some applications, P is of low rank. We shall consider this special case in Section 3.

2.1. NARE as an eigenvalue problem

The NARE (6) can be reformulated as the following eigenvalue problem

H  I X 

=

XIS, H



FB+

BF+ 

=



I bI

ˆ

−ˆ

bDI 

+ ˆ

s  I

I  PD, D+

.

(15) From (11), it is easy to see that the eigenvalues of H are shifted from

±

1, splitting equally on opposite sides of the imaginary axis. Using the Gerschgorin Theorem and denoteD

(

a, r

) ≡ {

x

C : |

x

a

|

 r

}

, the eigenvalues are in the regionsD

(−

1,

α) ∪

D

(

1,

α)

on opposite sides of the imaginary axis, with

α ≡ ˆ

b

+ ˆ

s



P

(

D+

+

D

) <

1.

Remark 2.2. With

α =

1 in the super-critical case, a simple application of the Gerschgorin Theorem implies that H in (15) and M in (12) may be singular. However, this potential singularity may be detected or excluded, by applying the extensions of the Gerschgorin Theorem in [12, Section 6.2]. Consider all the Gerschgorin disks of H containing the origin, at least one of the corresponding inequalities should not be satisfied with equality. In other words, we may detect or exclude this super-critical case that all the first n rows have their row sums equal to zero.

Note that even if H or M are singular, the existence result in Theorem 2.2 still holds provided that

M is irreducible. With the additional requirement for the null vectors as in Theorem 2.2, the Newton’s

(7)

3. Special cases

In this section, we shall consider several special cases. When b

=

0, the NARE simplifies to (17). If, in addition, P is low-ranked, the NARE simplifies further to (18) and (19), yielding a solution of low-rank which can be solved efficiently via the nonlinear equations in (20) in Section 3.2. Other special cases are less interesting – when s

= ˆ

s

=

0 and b

/=

0, the NARE degenerates to a simple quadratic (see Section 3.3); and when b

=

s

=

0, the problem becomes trivial, with the NARE degenerates into the 0

=

0 situation. Finally, when f

=

0, the NARE remains qualitatively the same as (6).

3.1. The b

=

0 case

When b

=

0, the NARE (6) then reads

ˆ

sPD

X

(

I

− ˆ

sPD

) − (

I

− ˆ

sPD+

)

X

+ ˆ

sXPD+X

=

0

.

(16) Equivalently, we have X

= φ(

X

) ≡

ˆ

s 2  PD

+

XPD

+

PD+X

+

XPD+X

=

ˆ

s 2

(

I

+

X

)

P

(

D

+

D+X

)

, (17)

implying that X is low-ranked when P is.

We have the following special case of [6, Theorem 2.3] for Xand the iteration X(k+1)

= φ(

X(k)

)

:

Theorem 3.1. Let X(0)

=

0 and X(k+1)

= φ(

X(k)

);

i

.

e

.

, the fixed-point iteration for the NARE in

(

17

)

when b

=

0

.

Then under assumption (11

)

in Theorem 2.3, we have

(

i

)

the iterates satisfy X X(k+1) X(k) ˆs

2PD 0, and

(

ii

)

X(k)

Xas k

→ ∞.

We can apply Newton’s method and doubling [4,11] to solve (16), as in the general case in Sections 4.2 and 4.3.

3.2. Low-ranked P when b

=

0

When P

=

P1P2is of rank r (with P1, P2being n

×

r, r

<

n) and b

=

0, (17) implies X

=

ˆ

s

2UV (18)

with the auxiliary variables

U

= (

I

+

X

)

P1, V

=

P2

(

D

+

D+X

)

, (19)

where U, Vare n

×

r. Substituting X in (18) into (19), we have 2rn nonlinear equations for the 2rn unknowns in U and V : U

=

 I

+

ˆ

s 2UV  P1, V

=

P2  D

+

ˆ

s 2D +UV

.

(20)

Convergence of various iterative schemes (e.g., Newton’s method, generalized nonlinear Jacobi and Gauss–Seidel methods, as in [1,6–11,14,15] ) for the above set of nonlinear equations (20) can be shown, similar to the proof in Theorem 2.3 (or techniques in the respective references).

Consider the following iterative schemes, all starting from UI(0), VI(0)

=

0 forI

=

S,M,JandG: (I) Simple Iteration (SI):

US(k+1)

=

 I

+

s

ˆ

2U (k) S VS(k)  P1, VS(k+1)

=

P2  D

+

ˆ

s 2D +U(k) S VS(k) 

.

(8)

(II) Modified Simple Iteration (MSI): UM(k+1)

=

 I

+

ˆ

s 2U (k) MV( k) M  P1, V( k+1) M

=

P2  D

+

ˆ

s 2D +U(k+1) M V( k) M 

.

(III) Nonlinear Block Jacobi Method (NBJ):

UJ(k+1)

=

 I

+

ˆ

s 2U (k+1) J VJ(k)  P1, V( k+1) J

=

P2  D

+

ˆ

s 2D +U(k) J VJ(k+1) 

.

(IV) Nonlinear Block Gauss–Seidel Method (NBGS):

UG(k+1)

=

 I

+

ˆ

s 2U (k+1) G V( k) G  P1, V( k+1) G

=

P2  D

+

ˆ

s 2D +U(k+1) G V( k+1) G 

.

We have the following results for various iterates, similar to those in [10]:

Theorem 3.2. Assume that

(

4

)

holds with b

=

0 and the splitting P

=

P1P2, with P1, P2 0, satisfies



P1

 =

1, 0

< 

P2D±



 1

.

(21)

Ignoring the subscripts when the result holds for all the four methods, we have

(

i

)

the iterates satisfy U U(k+1) U(k) U(1) 0, V V(k+1) V(k) V(1) 0, for k

=

0, 1,

. . . ;

with U

≡ (

I

+

X

)

P1and V

P2

(

D

+

D+X

);

(

ii

)

U(k)

U, V(k)

Vas k

→ ∞;

(

iii

)

for each k, 0 U(Sk) UM(k) UG(k)and 0 VS(k) VM(k) VG(k)

;

and

(

iv

)

for each k, 0 U(Sk) UJ(k) UG(k)and 0 VS(k) VJ(k) VG(k)

.

Proof. First we prove the iterates are well-defined. For SI and MSI, the issue is trivial. For NBJ and

NBGS, the formulae imply

U(Jk+1)

=

P1  I

ˆ

s 2V (k) J P1 1 , VJ(k+1)

=

 I

ˆ

s 2P2D +U(k) J 1 P2D− (22) and U(Gk+1)

=

P1  I

ˆ

s 2V (k) G P1 1 , VG(k+1)

=

 I

ˆ

s 2P2D +U(k+1) G 1 P2D

.

(23)

The matrices inside the square brackets in (22) and (23) can be proved to be nonsingular M-matrices in the form I

K by induction, with nonnegative inverses. In particular, we need to show that



K

 <

12. Note that (4) implies that

ˆ

s1

2when b

=

0.

For k

=

0, UJ(1), VJ(1)and UG(1)are obviously well-defined, as K

=

0 in their respective formulae. For NBGS with K

2ˆsP2D+U( 1) G , (21) and (23) imply



UG(1)

 = 

P1 =1,



K





ˆ

s 2 1 4

<

1 2

so VG(1)

= (

I

K

)

−1P2Dis well-defined as I

K is a nonsingular M-matrix. In addition, from the

second formula in (23), we have  VG(1)

=



(

I

+

K

+

K2

+ · · ·)

P2D− P2D−  1

+

1 4

+

1 42

+ · · ·

 4 3

<

2

.

(24)

For the induction step, assume that UI(k)and VI(k)are well-defined, with



UI(k)



,



VI(k)

 <

2

(

I

=

(9)

VG(k+1), applying a similar argument as in (24) with K

=

2ˆsP2D+U( k+1)

G , we complete the induction

step with  VG(k+1)

=



(

I

+

K

+

K2

+ · · ·)

P2D−

<

P2D−  1

+

1 2

+

1 22

+ · · ·

  2

.

Consequently, (i) and (ii) can be proved similarly as in Theorem 3.1.

For (iii) and (iv), again by induction, we have

U(I1)

=

P1

(∀

I

)

, VI(1)

=

P2D

(

I

=

S,M,J

).

For VG(1)with K

=

ˆs2P2D+U( 1) G , we have VG(1)

= (

I

K

)

−1P2D

= (

I

+

K

+

K2

+ · · ·)

P2D P2D

=

V( 1) I

(

I

=

S,M,J

).

For the induction step, assume that (iii) and (iv) hold for some arbitrary value of k. We then have

U(Sk+1)

=

 I

+

ˆ

s 2U (k) S VS(k)  P1  I

+

ˆ

s 2U (k) MVM(k)  P1

=

UM(k+1), VS(k+1)

=

P2  D

+

ˆ

s 2D +U(k) S VS(k)   P2  D

+

ˆ

s 2D +U(k) MVM(k)   P2  D

+

ˆ

s 2D +U(k+1) M V( k) M 

=

VM(k+1)

;

U(Sk+1)

=

 I

+

ˆ

s 2U (k) S VS(k)  P1  I

+

ˆ

s 2U (k) J VJ(k)  P1   I

+

ˆ

s 2U (k+1) J VJ(k)  P1

=

U( k+1) J , VS(k+1)

=

P2  D

+

ˆ

s 2D +U(k) S VS(k)   P2  D

+

ˆ

s 2D +U(k) J VJ(k+1) 

=

VJ(k+1)

;

U(Mk+1)

=

 I

+

ˆ

s 2U (k) MV( k) M  P1  I

+

ˆ

s 2U (k+1) G V( k) G  P1

=

U( k+1) G , VM(k+1)

=

P2  D

+

ˆ

s 2D +U(k+1) M VM(k)   P2  D

+

ˆ

s 2D +U(k+1) G VG(k+1) 

=

VG(k+1)

;

and U(Jk+1)

=

P1  I

s

ˆ

2V (k) J P1 1  P1  I

ˆ

s 2V (k) G P1 1

=

U(Gk+1), VJ(k+1)

=

 I

ˆ

s 2P2D +U(k) J 1 P2D−  I

s

ˆ

2P2D +U(k+1) G 1 P2D

=

V( k+1) G

.

Note that the iterates U(k)and V(k)are increasing towards their respective limits Uand V∗, and the right-most inequalities will be obvious from

(

I

K

)

−1

=

I

+

K

+

K2

+ · · ·

The induction is complete and (iii) and (iv) are proved.



After U and V are obtained, X can be retrieved from (18). As for the simpler equation in [1,10,13,

14,15,17], the speed of convergence for various iterative schemes is reflected by the rates of increase in the iterates. Consequently, NBGS is the fastest method, as proven more elaborately in [10]. Note that these iterative schemes are of O

(

r2n

)

complexity per iteration, with the inexpensive inversion of

(10)

I

K

R

r×rin NBJ and NBGS. (Only when X

=

2ˆsUV is formed, O

(

rn2

)

flops are involved.) However, all these are possible only for the special case b

=

0 and s

/=

0, so we shall not attempt similar analysis as in [10]. Note that the iterative schemes for the simpler equations previously studied in [10] are of

O

(

n2

)

complexity.

3.3. Explicit solution for special case when s

=

0 When s

=

0, the NARE (6) becomes the quadratic

ˆ

bD

2X

+ ˆ

bX2

=

0

.

(25)

With the minimal nonnegative solution guaranteed to exist by Theorem 2.3, we may consider various iterative processes for solving (25). However, the fixed-point iteration X

← ˆ

b

(

D

+

X2

)/

2 implies that X

=

diag

{

xi}is diagonal and (25) degenerates to n scalar quadratics, implying that

X

= ˆ

b−1  I

 I

− ˆ

b2D− 

= ˆ

bD−  I

+

 I

− ˆ

b2D− 1

.

As D±

=

O

(

1

/

n

)

, a good approximation to X∗is given bybD

ˆ

/

2, identical to X(1)from the fixed-point iteration with X(0)

=

0 or solving (25) by ignoring the X2term.

4. The general case

For the general NARE (6):

B

XF

F+X

+

XB+X

=

0

with F±

I

− ˆ

sPD±, B

≡ (ˆ

bI

+ ˆ

sP

)

Dand B+

≡ ˆ

bI

+ ˆ

sPD+in (7) for b

/=

0, we can apply fixed-point iteration, Newton’s method [8,14,15] or doubling [4,11]. The existence of the unique minimal nonnegative solution X∗has been proved in Theorem 2.3. Similar results as in Theorems 3.1 and 3.2 can be proved. For P

=

P1P2of rank r

(<

n

)

, the additional structure can be exploited for lower

operation counts.

4.1. Fixed-point iteration

There are many different versions of fixed-point iterations for (6). One obvious way, extending (17) in Section 3.1, is X

F

(

X

) ≡

ˆ

s 2

(

I

+

X

)

P

(

D

+

D+X

) +

b

ˆ

2

(

D

+

X2

)

, X(0)

=

0

.

(26)

Note that we have written F

(

X

)

as the sum of the right-hand-side of (17) associated with

ˆ

s and the

left-over

ˆ

b term, requiring one less matrix–matrix multiplication than the obvious

F

(

X

) =

1

2 

B

− ˆ

s

(

XPD

+

PD+X

) +

XB+X

.

For P

=

P1P2of rank r

(<

n

)

, only

[(

6r

+

4

)

n2

+

2n3

]

flops are required per iteration (see Table5.1

for other operation counts).

Similar to Theorem 3.1, we have the following special case of [6, Theorem 2.3]:

Theorem 4.1. Under assumption

(

11

)

, for the fixed-point iteration

(

26

)

, we have

(

i

)

the iterates satisfy X X(k+1) X(k) X(1)

=

12B− 0

(

k

=

0, 1,

. . .);

and

(

ii

)

X(k)

Xas k

→ ∞.

(11)

4.2. Newton’s method

From the NARE (6), let R

(

X

)

denote the left-hand-side of the equation, in the computationally efficient form

R

(

X

) = ˆ

s

(

I

+

X

)

P

(

D

+

D+X

) + ˆ

b

(

D

+

X2

) −

2X, with further saving when P

=

P1P2is low-ranked.

At the

(

k

+

1

)

th iteration with X(k) being an approximate solution and X(k+1)

=

X(k)

+ δ

X(k), Newton’s method requires the solution of the Sylvester equation



F+

X(k)B+

δ

X(k)

+ δ

X(k)F

B+X(k)

=

RX(k)

.

(27) Convergence of Newton’s method follows readily.

Theorem 4.2. Let Xbe the minimal nonnegative solution of

(

6

).

Then under assumption

(

11

)

, for the Newton iteration

(

27

)

with X(0)

=

0, the sequence

{

X(k)

}

is well-defined, X(k) X(k+1) Xfor all k 0, and limi→∞X(k)

=

X

.

The proof makes use of selected results from Theorem 2.2. In particular when vectorized, the above Sylvester operator can be written as the matrix operator MX(with m

=

n).

4.3. Doubling

We shall quote the doubling algorithm for the general NARE (9), with the matrix M in (10) being a nonsingular M-matrix, from [11]. Note that the doubling algorithm is at approximately three times faster than Newton’s method, as concluded in [7,11] and Table5.1; please consult the details in the respective references.

For the general NARE:

X CX

X D

AX

+

B

=

0

with the corresponding matrix M in (10) being a nonsingular M-matrix, we first transform A, B, C and

D to

Eγ

=

I

2

γ

Vγ−1, Gγ

=

2

γ

Dγ1 CWγ−1, Fγ

=

I

2

γ

Wγ−1, Hγ

=

2

γ

Wγ−1 BDγ1

with the parameter

γ

 maxi,j{ˆaii,b

ˆ

jj}and

Aγ

=

A

+ γ

I, Dγ

=

D

+ γ

I, Wγ

=

Aγ

BDγ1 C, Vγ

=

Dγ

CAγ1 B

.

The doubling algorithm can then be summarized as:

E0

=

Wγ, F0

=

Fγ, G0

=

Gγ, H0

=

Hγ,

Ek+1

=

Ek

(

I

GkHk

)

−1Ek, Fk+1

=

Fk

(

I

HkGk

)

−1Fk,

Gk+1

=

Gk

+

Ek

(

I

GkHk

)

−1GkFk, Hk+1

=

Hk

+

Fk

(

I

HkGk

)

−1HkEk

.

(28)

The iterates are well-defined with I

HkGkand I

GkHkbeing nonsingular M-matrices for all k, and Hk

X and Gk

Y (respectively, the solutions to (9) and its adjoint) from below quadratically as k

→ ∞

(see [11, Theorem 5.1]).

When D±

=

D, we have F±

=

I

− ˆ

sPD and B±

= ˆ

sPD, halving the operation count of doubling. 5. Numerical examples

For comparison, we shall summarize the operation counts per iteration of various iterative methods in Table5.1. We shall show only the dominant terms, assuming that n



r. The Sylvester equations

(12)

Table 5.1

Operation counts per iteration.

Parameter P Method Flops

b=0 General Fixed-point iteration for (17) 4n3 Low-ranked NBGS (23) 6r2n b/=0 General Fixed-point iteration (26) 6n3

Low-ranked Fixed-point iteration (26) 2n3 General Newton’s method (27) 41n3 Low-ranked Newton’s method (27) 34n3 Doubling (28) 162

3n 3

Table 5.2

CPU-times and iteration numbers for Example 1.

Fixed-point Newton Doubling

n tn rn #It tn rn #It tn rn #It

64 0.125 N/A 42 0.062 N/A 6 0.047 N/A 7

128 0.374 2.99 38 0.421 6.79 5 0.156 3.32 7 256 2.886 7.72 38 2.558 6.08 5 1.435 9.20 7 512 18.80 6.51 40 21.75 8.50 5 10.76 7.50 7 1024 186.9 9.94 43 172.1 7.91 5 97.25 9.04 8

in (27) are assumed to be solved by the Bartels–Stewart algorithm [5]. For b

=

0 with a low-ranked

P, only the fastest method NBGS is considered. The slow fixed-point iteration method is also included

for comparison.

We shall consider two randomly generated examples for n

=

64, 128, 256, 512 and 1024. Example 1 has

ˆ

s

=

0

.

3,

ˆ

b

=

0

.

4 and P being full-ranked, and Example 2 has

ˆ

s

=

0

.

3,b

ˆ

=

0 and rank P

=

10. For the examples, the respective assumptions in Theorems 2.3 and 3.2 are satisfied. The numerical computation has been carried out using MATLAB R2008b on a laptop with eps

=

2

.

2204

×

10−16[16].

(13)

Table 5.3

CPU-times and iteration numbers for Example 2 with NBGS.

n 64 128 256 512 1024

tn 0.0000 0.0000 0.0312 0.0468 0.0780

rn N/A N/A N/A 1.50 1.67

#It 9 9 9 9 9

For Example 1, fixed-point iteration, Newton’s method and the doubling algorithm have been compared for various values of n. The iterations have been run until convergence with tolerance tol

=

10−15. The results are summarized in Table5.2, with tn denoting the CPU-time, rn

tn

/

tn

2 and #It the number of iterations required, for a particular value of n. The iterates are also plotted in Fig.3for n

=

1024.

From Table5.2and Fig.3, it is evident that the doubling algorithm performs better than Newton’s method and the fixed-point iteration method is the slowest, as predicted in Table5.1. The ratios rn

illus-trate the O

(

n3

)

complexity of the methods. The graphs in Fig.3illustrate the quadratic convergence of the doubling algorithm and Newton’s method, with fixed-point iteration obviously converges linearly. Newton’s method is faster than doubling in terms of convergence but the latter has an advantage in operation count per iteration by a factor of three, resulting in its better efficiency in terms of CPU-time. Note that thecputime command in MATLAB [16] is not an exact reflection of CPU-time consumed and should be used as a rough guide only. Also, we have no control over some parts of the algorithms, such as the inversion of the Sylvester operators by the MATLAB commandlyap [16] in Newton’s method.

For Example 2, only the fastest iteration method NBGS has been tested and the results are summa-rized in Table5.3, with tol

=

10−15. The O

(

n

)

complexity of the method is illustrated in the ratios rn,

although cputime in MATLAB fails to register the small amount of CPU-time for smaller values of n.

6. Concluding remarks

For a two-dimensional model in transport theory, we need to solve an integral–differential equation to obtain the Eastern reflection kernel RE, from which other kernels can be derived. For the

steady-state solution, we have derived an NARE from the corresponding integral equation using numerical quadratures. We have proved the existence and uniqueness of the minimal nonnegative solution of the NARE. When b

=

0 and P is low-ranked, the efficient NBGS method of complexity O

(

n

)

solves the NARE efficiently. For the general case when b

/=

0, the doubling algorithm is the most efficient, approximately three times more efficient as Newton’s method. The numerical results support our theoretical findings.

For future work, we need to consider conditions for existence other than (11), efficient algorithms making better use of the structure of the Riccati equations, the convergence of X

=

RD−toREand to

improve the efficiency of the numerical algorithms for large values of n. Finally, there are other similar models and problems in transport theory [2] worthy of investigation.

Acknowledgements

E. Chu was partially supported by the Centre of Mathematical Modeling and Scientific Computing and the National Centre of Theoretical Sciences, National Chiao Tung University, Republic of China. T. Li was supported by the National Science Foundation, Republic of China.

References

[1] Z.-Z. Bai, Y.-H. Gao, L.-Z. Lu, Fast iterative schemes for nonsymmetric algebraic Riccati equations arising from transport theory, SIAM J. Sci. Comput. 30 (2) (2008) 804–818.

(14)

[3] A. Berman, R.J. Plemmons, Nonnegative Matrices in the Mathematical Sciences, SIAM, Philadelphia, 1994.

[4] C.-Y. Chiang, E.K.-W. Chu, C.-H. Guo, T.-M. Huang, W.-W. Lin, S.-F. Xu, Convergence analysis of the doubling algorithm for several nonlinear matrix equations in the critical case, SIAM J. Matrix Anal. Appl. 31 (2009) 227–247.

[5] G.H. Golub, C.F. Van Loan, Matrix Computations, third ed., Johns Hopkins University Press, Baltimore, MD, 1996. [6] C.-H. Guo, Nonsymmetric algebraic Riccati equations and Wiener–Hopf factorization for M-matrices, SIAM J. Matrix Anal.

Appl. 23 (2001) 225–242.

[7] C.-H. Guo, A new class of nonsymmetric algebraic Riccati equations, Linear Algebra Appl. 426 (2007) 636–649. [8] C.-H. Guo, N.J. Higham, Iterative solution of a nonsymmetric algebraic Riccati equation, SIAM J. Matrix Anal. Appl. 29 (2007)

396–412.

[9] C.-H. Guo, P. Lancaster, Analysis and modification of Newton’s method for algebraic Riccati equations, Math. Comp. 67 (1998) 1089–1105.

[10] C.-H. Guo, W.-W. Lin, Convergence rates of some iterative methods for nonsymmetric algebraic Riccati equations arising in transport theory, Linear Algebra Appl. 432 (2010) 283–291.

[11] X.-X. Guo, W.-W. Lin, S.-F. Xu, A structure-preserving doubling algorithm for nonsymmetric algebraic Riccati equations, Numer. Math. 103 (2006) 393–412.

[12] R.A. Horn, C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985.

[13] J. Juang, Existence of algebraic matrix Riccati equations arising in transport theory, Linear Algebra Appl. 230 (1995) 89–100. [14] Y. Lin, L. Bao, Y. Wei, A modified Newton method for solving non-symmetric algebraic Riccati equations arising in transport

theory, SIMA J. Numer. Anal. 29 (2008) 215–224.

[15] L.-Z. Lu, Newton iterations for a non-symmetric algebraic Riccati equation, Numer. Linear Algebra Appl. 12 (2005) 191–200. [16] MathWorks, MATLAB User’s Guide, 2002.

[17] V. Mehrmann, H. Xu, Explicit solutions for a Riccati equations from transport theory, SIAM J. Matrix Anal. 30 (2008) 1339–1357.

[18] P. Nelson, D.L. Seth, Integrodifferential equations for the two-Dimensional transition kernels of invariant imbedding, Appl. Math. Comp. 82 (1997) 67–83.

[19] P. Nelson, D.L. Seth, R. Vasudevan, An integrodifferential equation for the two-dimensional reflection kernel, Appl. Math. Comp. 49 (1992) 1–18.

數據

Fig. 1. The four-port-system.
Fig. 2. The incremental layer and the transmission operator T E .
Fig. 3. Residuals for Example 1 (n = 1024).

參考文獻

相關文件

Assume that the partial multiplicities of purely imaginary and unimodular eigenvalues (if any) of the associated Hamiltonian and symplectic pencil, respectively, are all even and

Arbenz et al.[1] proposed a hybrid preconditioner combining a hierarchical basis preconditioner and an algebraic multigrid preconditioner for the correc- tion equation in the

We can therefore hope that the exact solution of a lower-dimensional string will provide ideas which could be used to make an exact definition of critical string theory and give

Using this formalism we derive an exact differential equation for the partition function of two-dimensional gravity as a function of the string coupling constant that governs the

In this paper, we develop a novel volumetric stretch energy minimization algorithm for volume-preserving parameterizations of simply connected 3-manifolds with a single boundary

To improve the convergence of difference methods, one way is selected difference-equations in such that their local truncation errors are O(h p ) for as large a value of p as

Despite significant increase in the price index of air passenger transport (+16.97%), the index of Transport registered a slow down in year-on-year growth from +12.70% in July to

Despite higher charges of taxi service starting from September, the index of Transport registered a slow down in year-on-year growth from +8.88% in August to +7.59%, on account of