NCTS 2006 School on Modern Numerical Methods in Mathematics and Physics : Methods for Accelerating the Arnoldi Iteration

(1)

Methods for Accelerating the

Arnoldi Iteration

Chao Yang

Computational Research Division

Lawrence Berkeley National Laboratory, Berkeley, California, USA

(2)

Background

Large-scale eigenvalue problems

Ax = λx   or   Ax = λBx

A, B large, sparse, or structured

y ← Ax and y ← Bx can be computed efficiently

Krylov subspace methods
Acceleration techniques

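Krylov methods touch A only through the product y ← Ax, so A never needs to be stored explicitly. A minimal SciPy sketch of this idea (the matrix-free second-difference operator and all names here are my own illustration, not from the slides):

import numpy as np
from scipy.sparse.linalg import LinearOperator

n = 10000

def matvec(x):
    # y <- A x for the 1-D second-difference operator, without storing A
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

A = LinearOperator((n, n), matvec=matvec, dtype=np.float64)

Any A of this form can be handed directly to a Krylov eigensolver.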
(3)

Arnoldi Iteration

Produces an orthonormal basis Vm = {v1, v2, ..., vm}

associated with a Krylov subspace

K(A, v1; m) = span{v1, Av1, ..., A^{m-1} v1}

A single step (Gram-Schmidt):

fj ← (I − Vj Vj^H) A vj;   vj+1 ← fj / ‖fj‖;

After m steps:

A Vm = Vm Hm + fm em^H

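A minimal NumPy sketch of the iteration above (the function name and the use of classical Gram-Schmidt without reorthogonalization are my choices; production codes reorthogonalize and guard against breakdown):

import numpy as np

def arnoldi(A, v1, m):
    # Builds Vm (orthonormal columns) and Hm (upper Hessenberg) with
    # A Vm = Vm Hm + fm em^H; returns Vm, Hm, and the residual fm.
    n = v1.size
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        f = A @ V[:, j]                            # one MATVEC
        H[:j + 1, j] = V[:, :j + 1].conj().T @ f   # Gram-Schmidt coefficients
        f = f - V[:, :j + 1] @ H[:j + 1, j]        # fj <- (I - Vj Vj^H) A vj
        H[j + 1, j] = np.linalg.norm(f)            # breakdown check omitted
        V[:, j + 1] = f / H[j + 1, j]              # vj+1 <- fj / ||fj||
    return V[:, :m], H[:m, :m], V[:, m] * H[m, m - 1]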
(4)

Approximate eigenvalues and eigenvectors from K(A, v1; m)

Find (θi, yi) such that

Vm^H (A Vm yi − θi Vm yi) = 0   (Galerkin condition)

Equivalent to solving Hm yi = θi yi, where Hm = Vm^H A Vm.

Approximation to an eigenvalue: θi (Ritz value);
approximation to an eigenvector: zi = Vm yi (Ritz vector).

(5)

Checking Convergence

Let z = Vm y, where Hm y = θy. Residual norm:

‖Az − θz‖ = ‖A Vm y − θ Vm y‖
          = ‖(Vm Hm + f em^H) y − θ Vm y‖
          = ‖f‖ |em^H y|

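In code the check costs no extra MATVECs: the residual of every Ritz pair comes from the last component of the corresponding eigenvector of Hm. A small sketch under the notation above (the function name is mine):

import numpy as np

def ritz_pairs(V, H, fnorm):
    # Ritz pairs of Hm and the estimate ||A z - theta z|| = ||f|| |em^H y|
    theta, Y = np.linalg.eig(H)        # Hm y = theta y, columns of Y unit norm
    resid = fnorm * np.abs(Y[-1, :])   # |em^H y| is the last entry of each y
    Z = V @ Y                          # Ritz vectors z = Vm y
    return theta, Z, resid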
(6)

Convergence of Arnoldi

If v1 lies in an m-dimensional invariant subspace of A, Arnoldi
converges in m or fewer steps:

A Vm = Vm Hm.

One rarely finds such a good v1. In general, extremal and well-separated
eigenvalues emerge rapidly (Kaniel, Paige, Saad).
(7)

Convergence of Ritz values

[Figure: trajectories of the Ritz values against the Lanczos iteration
number]

(8)

Computational Cost

Storage for Vm, Hm: O(nm + m^2);

Orthogonalization fm ← (I − Vm Vm^H) A vm: O(nm^2);

Eigen-analysis of Hm: O(m^3);

MATVEC y ← Ax: varies with the application;

(9)

Acceleration Methods

Method of implicit restart: polynomial, rational

Method of spectral transformation: polynomial, rational

Preconditioning (tomorrow's lecture): solve an eigenvalue problem as
either a system of nonlinear equations or an optimization problem

(10)

Restarting an Arnoldi iteration

1. Fix the dimension m of K(A, v1; m) at a moderate value;

2. Modify the starting vector v1 ← ψ(A) v1;

3. Repeat the Arnoldi process with the modified v1.

(11)

How to choose ψ(λ)?

Suppose the eigenvalues of A are

λ1, ..., λk (wanted),   λk+1, ..., λn (unwanted),

and the corresponding eigenvectors are x1, x2, ..., xn. Expanding
v1 = γ1 x1 + · · · + γn xn in the eigenvector basis,

ψ(A) v1 = γ1 ψ(λ1) x1 + · · · + γk ψ(λk) xk          (wanted)
        + γk+1 ψ(λk+1) xk+1 + · · · + γn ψ(λn) xn    (unwanted)

(12)

Two types of restarts

1. ψ(λ) is a polynomial;

2. ψ(λ) is a rational function;

ψ(λ) must be large on λ1, ..., λk and small on λk+1, ..., λn.

(13)

Filter Example

[Figure: filter polynomial p(λ) against λ, logarithmic λ-axis]

(14)

Implicit Restart

1. Do not form v1 ← ψ(A) v1 explicitly;

2. Do not repeat the Arnoldi iteration from the first column;

Need to understand the connection between Arnoldi and QR (RQ).

(15)

QR and RQ iteration

                    AV = V H
                   ↙        ↘
      AV = V (QR)              AV = V (RQ)
A(V Q) = (V Q)(RQ)             A(V Q^H) = (V Q^H)(QR)
                   ↘        ↙
                 A V+ = V+ H+

(16)

Difference

        QR                          RQ
AV = (V Q) R                 A (V Q^H) = V R
A v1 = v1+ ρ1,1              A v1+ = v1 ρ1,1
power iteration              inverse power iteration

(17)

Arnoldi = Truncated Hessenberg Reduction

[Figure: block picture of the truncated Hessenberg reduction
A Vm = Vm Hm + fm em^H]

(18)

Implicit Restart Arnoldi = Truncated QR Iteration

A Vm = Vm Hm + f em^H

A Vm = Vm (QR) + f em^H          (Hm = QR)

A (Vm Q) = (Vm Q)(RQ) + f em^H Q

(19)

Shifts & Polynomial filter

Truncated Hessenberg reduction is shift-invariant:

(A − µI) Vm = Vm (Hm − µI) + f em^H

Applying p shifts = running p QR iterations on Hm:

v1+ = β (A − µ1 I)(A − µ2 I) · · · (A − µp I) v1

What to use for shifts? Eigenvalues of Hm:

θ1, ..., θk (wanted),   θk+1, ..., θm (unwanted),   m = k + p.

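A sketch of one restart sweep built from the identities above; I use explicit QR steps on Hm for readability, while ARPACK performs the same similarity transforms implicitly via bulge chasing:

import numpy as np
from scipy.linalg import qr

def implicit_restart(V, H, f, shifts, k):
    # Apply p = len(shifts) exact shifts to A Vm = Vm Hm + f em^H,
    # then truncate to a length-k factorization.
    m = H.shape[0]
    I = np.eye(m)
    Qacc = I.copy()
    for mu in shifts:
        Q, R = qr(H - mu * I)          # one shifted QR step on Hm
        H = R @ Q + mu * I             # H <- Q^H H Q, still Hessenberg
        Qacc = Qacc @ Q
    V = V @ Qacc                       # Vm <- Vm Q
    # In f em^H Qacc only the entry at column k-1 survives truncation,
    # so the new length-k residual is:
    fk = V[:, k] * H[k, k - 1] + f * Qacc[m - 1, k - 1]
    return V[:, :k], H[:k, :k], fk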
(20)

Filtering Polynomial (seek the smallest eigenvalues)

[Figure: filter polynomial p(λ) against λ, logarithmic λ-axis]

(21)
(22)
(23)

ARPACK

http://www.caam.rice.edu/software/ARPACK

Solve a variety of problems (sym, nonsym, real, complex)

Location of the eigenvalues: which = LM, SM, LA, SA, BE, LR, SR, LI, SI

Choose k and p (nev = k, ncv = k + p); p is the degree of the
filtering polynomial

Reverse communication interface

Level-2 BLAS

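Today the easiest route to ARPACK is SciPy's wrapper, where k maps to nev and ncv to k + p. A small usage example (the 2-D Laplacian test matrix is my choice, echoing the example later in the slides):

import scipy.sparse as sp
from scipy.sparse.linalg import eigsh   # wraps ARPACK's dsaupd/dseupd

n = 100
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))   # 2-D discrete Laplacian

# nev = k = 4 wanted eigenvalues, ncv = k + p = 20 basis vectors,
# which = 'SA' requests the smallest algebraic eigenvalues
vals, vecs = eigsh(A, k=4, ncv=20, which='SA', tol=1e-10)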
(24)

Reverse Communication

10    continue
         call dsaupd (ido, ..., workd(in),
     &                workd(out))
         if (ido .eq. 1) then
c           apply y <- A*x to the vector ARPACK provides
            call matvec (..., workd(in),
     &                   workd(out))
c           return the result and ask ARPACK for the next task
            go to 10
         endif
         ...

(25)

Truncating the RQ Iteration

RQ iteration:

AV = V H  =⇒  AV = V (RQ)  =⇒  A (V Q^H) = (V Q^H)(QR)

Let V+ = V Q^H and v1+ = V+ e1; then A v1+ = v1 ρ1,1.

Truncating RQ:

(26)

The TRQ Equation

(A − µI) v+ = Vk h + v α,    Vk^H v+ = 0,    ‖v+‖ = 1

[ A − µI   Vk ] [ v+ ]     [ vα ]
[ Vk^H     0  ] [ −h ]  =  [ 0  ],    ‖v+‖ = 1.

(27)

Solving the TRQ equation

Using A Vk = Vk Hk + fk ek^H to simplify the bordered system:

1. w ← (I − Vk Vk^H) (A − µI)^{-1} (Vk s);

2. v+ ← w / ‖w‖;

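A dense-algebra sketch of these two steps (a direct solve stands in for a sparse factorization of A − µI; all names are mine):

import numpy as np

def trq_step(A, Vk, s, mu):
    # w <- (I - Vk Vk^H)(A - mu I)^{-1}(Vk s);  v+ <- w / ||w||
    n = A.shape[0]
    w = np.linalg.solve(A - mu * np.eye(n), Vk @ s)   # shifted solve
    w -= Vk @ (Vk.conj().T @ w)                       # project out span(Vk)
    return w / np.linalg.norm(w)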
(28)

Example

The matrix A: 2-D discrete Laplacian (100 × 100);

Target: smallest eigenvalues; k = 4; convergence tolerance: 10^{-15};

[ α1  β1              ]
[ β1  α2  β2          ]
[     β2  α3  β3      ]
[         β3  α4  β4  ]
[             β4  α5  ]

(29)

Convergence

iter. β1 β2 β3 β4

1 1.5e-2 4.2e+1 2.3e+1 1.9e+1

2 5.1e-4 3.2e-2 4.3e+1 2.3e+1

3 1.1e-6 9.1e-5 7.1e-4 3.8e-2

4 8.3e-13 7.0e-5 1.0e-4 8.5e-2

5 7.1e-24 2.9e-5 2.9e-4 2.5e-4

6 0.0 9.6e-6 1.1e-5 1.1e-4

7 0.0 3.7e-8 1.0e-6 4.8e-1

8 0.0 4.4e-19 1.5e-11 6.0e-3

(30)

Inexact TRQ

w ← (I − Vk Vk^H) (A − µI)^{-1} (Vk s)    (inner solve done iteratively);

ṽ+ ← w / ‖w‖;    h̃ ← Vk^H ṽ+;    α̃ ← v^H (A − µI) ṽ+;

(31)

Error Damping

(A − µI) ṽ1+ = ρ1,1 v1 + z δ,

where δ is a product of k − 1 sines (error damping).

(32)

Linear Convergence

‖r+‖ ≤ ψ(µ, ν) ‖r‖,    |ψ(µ, ν)| ≤ ν / √(ζ^2 − ν^2) + O(‖r‖),

where r = (A − µI) v1 and r+ = (A − µ+ I) v1+;

µ and µ+ are Rayleigh quotient shifts (converging);

ν is the damped residual error (from the TRQ equation);

(33)

Observation

‖r+‖ ≤ ψ(µ, ν) ‖r‖,    |ψ(µ, ν)| ≤ ν / √(ζ^2 − ν^2) + O(‖r‖).

ν is normally several orders of magnitude smaller than ‖z‖;

Asymptotically, if |ν| < ζ/√2, then ψ < 1: monotonic convergence;

If ν = 0, then ‖r+‖ ≤ γ ‖r‖^2 (quadratic convergence of RQI).

(34)

Convergence of Inexact TRQ

iter.    α1        ‖z‖       β1
  1     3.0e-3      -       7.8e-3
  2     9.7e-4    1.9e-3    7.3e-5
  3     9.7e-4    2.5e-3    3.2e-6
  4     9.7e-4    5.7e-4    1.5e-7
  5     9.7e-4    1.6e-3    6.7e-9
  6     9.7e-4    9.2e-4    3.2e-10
  7     9.7e-4    1.3e-3    1.6e-11

(35)

Example (Reactive Scattering)

n = 256, symmetric; target: smallest k = 6;

Solver: MINRES; tol = 1.0e-8; max iter = 100.

(36)

Residual

[Figure: residual norm against flops (matrix size = 256) for IRAM and
for inexact TRQ with and without preconditioning]

(37)

Spectral Transformation

Compute eigenvalues of ψ(A) instead;

Clustered or interior eigenvalues of A become dominant, well-separated
eigenvalues of ψ(A);

Eigenvectors are preserved; recover eigenvalues by the Rayleigh
quotient. Two types of ψ(λ):

rational function: ψ(A) = (A − µI)^{-1};

polynomial;

(38)

Rational Transformation

Shift-invert:  (A − σI)^{-1} x = 1/(λ − σ) x

Cayley transformation:  (A − σ1 I)^{-1} (A + σ2 I) x = (λ + σ2)/(λ − σ1) x

[Figure: λ against λ^{-1}, logarithmic λ-axis]

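In SciPy's ARPACK wrapper this is the sigma parameter: Lanczos/Arnoldi runs on (A − σI)^{-1} and the Ritz values are mapped back automatically. A brief sketch (the matrix is illustrative):

import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 1000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csc')

# Shift-invert: eigenvalues of A nearest sigma become dominant,
# well-separated eigenvalues 1/(lambda - sigma); eigenvectors unchanged
vals, vecs = eigsh(A, k=4, sigma=0.5, which='LM')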
(39)

Generalized Problems

B symmetric positive definite, B = L L^T:

L^{-1} A L^{-T} (L^T x) = λ (L^T x)

B symmetric positive semidefinite:

(A − σB)^{-1} B x = 1/(λ − σ) x

(symmetric with respect to the B-semi-inner product)

B nonsymmetric:

B^{-1} A x = λ x

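For Ax = λBx with B symmetric positive (semi)definite, SciPy's eigsh accepts B as the M argument and performs the (A − σB)^{-1}B iteration internally, keeping the inner products in the B-(semi-)inner product. A sketch with a toy mass matrix (my choice):

import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 1000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csc')
B = 2.0 * sp.identity(n, format='csc')   # toy SPD "mass" matrix

# generalized shift-invert: ARPACK iterates with (A - sigma B)^{-1} B
vals, vecs = eigsh(A, k=4, M=B, sigma=0.0, which='LM')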
(40)

Practical Issues

Sparse matrix ordering and factorization;

Sparse triangular solves.

[Figure: sparsity pattern of the factor, nz = 95138]

(41)

Polynomial Transformations

Only requires MATVECs;

Fewer iterations imply fewer inner products;

(42)

Polynomial Construction

Chebyshev and kernel polynomials;

Minmax polynomials;

(43)

Polynomials

Chebyshev polynomials and kernel polynomials.

[Figure: Chebyshev polynomial and kernel polynomial against λ,
logarithmic λ-axis]

Kernel polynomial:  K̂m(λ; ξ) = Σ_{j=0}^{m} φj(λ) φj(ξ).

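A matrix-free sketch of applying a Chebyshev filter by the standard three-term recurrence (the interval endpoints and names are my own; only MATVECs are needed):

import numpy as np

def cheb_filter(A, v, deg, a, b):
    # Apply T_deg((A - cI)/h) to v, where the unwanted interval [a, b]
    # is mapped to [-1, 1]: T stays bounded there and grows fast outside.
    c, h = 0.5 * (a + b), 0.5 * (b - a)
    y_prev, y = v, (A @ v - c * v) / h            # T0 v and T1 v
    for _ in range(2, deg + 1):                   # T_{j+1} = 2x T_j - T_{j-1}
        y, y_prev = 2.0 * (A @ y - c * y) / h - y_prev, y
    return y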
(44)

Kernel Polynomial

[Figure: kernel polynomial p(λ) against λ]

(45)

Constructing Polynomials

Minmax polynomial:

min over p ∈ Pm of max over λ ∈ (α, β) of |f(λ) − p(λ)|, i.e. min ‖f − p‖∞

Least squares polynomial:

min over p ∈ Pm of ‖f(λ) − p(λ)‖w on λ ∈ (α, β)

[Figure, left: least squares polynomial approximation of 1/x, with the
eigenvalues marked; right: the least squares error and the error on the
eigenvalues]

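A small sketch of the least squares construction for f(λ) = 1/λ in the spirit of the figure (interval, degree, and the Chebyshev basis are my choices):

import numpy as np

alpha, beta, m = 1e-3, 10.0, 20
lam = np.linspace(alpha, beta, 2000)        # discretize the interval
coef = np.polynomial.chebyshev.chebfit(lam, 1.0 / lam, m)   # LS fit of 1/x
p = np.polynomial.chebyshev.chebval(lam, coef)
print(np.max(np.abs(p - 1.0 / lam)))        # sup-norm error on the grid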
(46)

Comparison (n = 2500, degree = 20)

polynomial MATVECs CPU time (sec)

Minmax 5240 32.4

Chebyshev 5230 35.6

Leja 6790 41.8

Kernel 6760 42.0

Chebyshev least square 6770 42.2

Ritz least square 6100 42.4

Legendre least square 7290 50.8
