GMRES: Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems


Tsung-Ming Huang

Department of Mathematics National Taiwan Normal University

December 4, 2011


Ref: SISC, 1984, Saad

Theorem 1 (Implicit Q theorem)

Let $AV_1 = V_1H_1$ and $AV_2 = V_2H_2$, where $H_1$, $H_2$ are upper Hessenberg with positive subdiagonal entries and $V_1$, $V_2$ are unitary with $V_1e_1 = V_2e_1 = q_1$. Then $V_1 = V_2$ and $H_1 = H_2$.

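Proof: see the Appendix.

Written out column by column, a factorization $AV = VH$ with $V = [v_1\ v_2\ \cdots\ v_n]$ reads

$$
A\,[v_1\ v_2\ \cdots\ v_n] = [v_1\ v_2\ \cdots\ v_n]
\begin{bmatrix}
h_{11} & h_{12} & \cdots & h_{1n} \\
h_{21} & h_{22} & \cdots & h_{2n} \\
 & \ddots & \ddots & \vdots \\
 & & h_{n,n-1} & h_{nn}
\end{bmatrix}
$$

with

$$ v_i^T v_j = \delta_{ij}, \quad i, j = 1, \dots, n. $$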

Arnoldi Algorithm

Input: Given $v_1$ with $\|v_1\|_2 = 1$;

Output: Arnoldi factorization: $AV_k = V_kH_k + h_{k+1,k}v_{k+1}e_k^T$.

1: Set k = 1.

2: repeat

3: Compute $h_{ik} = (Av_k, v_i)$ for i = 1, 2, ..., k;

4: Compute $\tilde v_{k+1} = Av_k - \sum_{i=1}^{k} h_{ik}v_i$;

5: Compute $h_{k+1,k} = \|\tilde v_{k+1}\|_2$;

6: Compute $v_{k+1} = \tilde v_{k+1}/h_{k+1,k}$;

7: Set k = k + 1;

8: until convergent
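The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the matrix is small, dense, and real, stored as a list of rows; the names `matvec`, `dot`, `arnoldi` and the breakdown tolerance `tol` are choices made for this example; indices are zero-based, so `V[j]` holds $v_{j+1}$; and a fixed step count `k` replaces the "until convergent" test.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def arnoldi(A, v1, k, tol=1e-14):
    """Run up to k Arnoldi steps starting from the unit vector v1.

    Returns (V, H): V is the list of basis vectors built so far and
    H is a (k+1) x k array holding the coefficients h_ij.
    """
    V = [list(v1)]
    H = [[0.0] * k for _ in range(k + 1)]
    for j in range(k):
        w = matvec(A, V[j])
        # Step 3: h_ij = (A v_j, v_i), all taken against the same A v_j.
        for i in range(j + 1):
            H[i][j] = dot(w, V[i])
        # Step 4: subtract the components along the existing basis vectors.
        for i in range(j + 1):
            w = [wr - H[i][j] * vr for wr, vr in zip(w, V[i])]
        # Step 5: the norm of the remainder is the subdiagonal entry.
        H[j + 1][j] = math.sqrt(dot(w, w))
        if H[j + 1][j] <= tol:
            break  # breakdown: the Krylov subspace is invariant under A
        # Step 6: normalize to obtain the next basis vector.
        V.append([wr / H[j + 1][j] for wr in w])
    return V, H

A = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 4.0]]
V, H = arnoldi(A, [1.0, 0.0, 0.0], 2)
```

For this small example the basis comes out as $e_1, e_3, e_2$ and the Hessenberg array has rows (2, 0), (1, 4), (0, 1). The inner products are all taken against the same $Av_k$, as in step 3; many practical codes instead update `w` between inner products (modified Gram-Schmidt) for better numerical behavior.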


Remark 1

(a) Let $V_k = [v_1, \cdots, v_k] \in \mathbb{R}^{n\times k}$, where $v_j$, for $j = 1, \dots, k$, is generated by the Arnoldi algorithm. Then $H_k \equiv V_k^TAV_k$ is a $k \times k$ upper Hessenberg matrix.

(b) Arnoldi's original method was a Galerkin method for approximating the eigenvalues of $A$ by those of $H_k$.

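In order to solve $Ax = b$ by the Galerkin method using the subspace $\mathcal{K}_k \equiv \langle V_k \rangle$, we seek an approximate solution $x_k = x_0 + z_k$ with

$$ z_k \in \mathcal{K}_k = \langle r_0, Ar_0, \cdots, A^{k-1}r_0 \rangle $$

and $r_0 = b - Ax_0$.

Definition 2

$\{x_k\}$ is said to satisfy the Galerkin condition if $r_k \equiv b - Ax_k$ is orthogonal to $\mathcal{K}_k$ for each $k$.

The Galerkin method can be stated as follows: find

$$ x_k = x_0 + z_k \quad \text{with } z_k \in \langle V_k \rangle \tag{1} $$

such that

$$ (b - Ax_k, v) = 0, \quad \forall\, v \in \langle V_k \rangle, $$

which is equivalent to finding

$$ z_k \equiv V_ky_k \in \langle V_k \rangle \tag{2} $$

such that

$$ (r_0 - Az_k, v) = 0, \quad \forall\, v \in \langle V_k \rangle. \tag{3} $$

Substituting (2) into (3), we get

$$ V_k^T(r_0 - AV_ky_k) = 0, $$

which, since $V_k^Tr_0 = \|r_0\|e_1$ when $v_1 = r_0/\|r_0\|$, implies that

$$ y_k = (V_k^TAV_k)^{-1}\|r_0\|e_1. \tag{4} $$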

Since $V_k$ is computed by the Arnoldi algorithm with $v_1 = r_0/\|r_0\|$, we have $V_k^TAV_k = H_k$, so $y_k$ in (4) can be represented as

$$ y_k = H_k^{-1}\|r_0\|e_1. $$

Substituting it into (2) and (1), we get

$$ x_k = x_0 + V_kH_k^{-1}\|r_0\|e_1. $$

Using the result that $AV_k = V_kH_k + h_{k+1,k}v_{k+1}e_k^T$, $r_k$ can be reformulated as

$$
\begin{aligned}
r_k = b - Ax_k &= r_0 - AV_ky_k = r_0 - (V_kH_k + h_{k+1,k}v_{k+1}e_k^T)y_k \\
&= r_0 - V_k\|r_0\|e_1 - h_{k+1,k}(e_k^Ty_k)v_{k+1} = -(h_{k+1,k}e_k^Ty_k)v_{k+1}.
\end{aligned}
$$
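As a quick check of these formulas, consider the first step $k = 1$. The derivation below assumes $h_{11} \neq 0$ (an assumption added here so that $H_1$ is invertible) and writes $\beta = \|r_0\|$, so that $r_0 = \beta v_1$ and $Av_1 = h_{11}v_1 + h_{21}v_2$:

$$
\begin{aligned}
y_1 &= \frac{\beta}{h_{11}}, \qquad x_1 = x_0 + \frac{\beta}{h_{11}}v_1 = x_0 + \frac{1}{h_{11}}r_0, \\
r_1 &= r_0 - \frac{1}{h_{11}}Ar_0 = \beta v_1 - \frac{\beta}{h_{11}}\left(h_{11}v_1 + h_{21}v_2\right) = -\frac{h_{21}\beta}{h_{11}}\,v_2.
\end{aligned}
$$

This agrees with $r_k = -(h_{k+1,k}e_k^Ty_k)v_{k+1}$ for $k = 1$ and shows that $r_1$ is orthogonal to $v_1$, as the Galerkin condition requires.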


The generalized minimal residual (GMRES) algorithm

The approximate solution of the form $x_0 + z_k$ that minimizes the residual norm over $z_k \in \mathcal{K}_k$ can in principle be obtained by the following algorithms:

The ORTHODIR algorithm of Jea and Young;

the generalized conjugate residual method (GCR);

GMRES.

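Let

$$
V_k = [v_1, \cdots, v_k], \qquad
\tilde H_k =
\begin{bmatrix}
h_{1,1} & h_{1,2} & \cdots & h_{1,k} \\
h_{2,1} & h_{2,2} & \cdots & h_{2,k} \\
0 & \ddots & \ddots & \vdots \\
\vdots & \ddots & h_{k,k-1} & h_{k,k} \\
0 & \cdots & 0 & h_{k+1,k}
\end{bmatrix}
\in \mathbb{R}^{(k+1)\times k}.
$$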

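By the Arnoldi algorithm, we have

$$ AV_k = V_{k+1}\tilde H_k. \tag{5} $$

We want to solve the least squares problem

$$ \min_{z \in \mathcal{K}_k} \|r_0 - Az\|_2 = \min_{z \in \mathcal{K}_k} \|b - A(x_0 + z)\|_2, \tag{6} $$

where $\mathcal{K}_k = \langle r_0, Ar_0, \cdots, A^{k-1}r_0 \rangle = \langle v_1, \cdots, v_k \rangle$ with $v_1 = r_0/\|r_0\|_2$.

Setting $z = V_ky$, the least squares problem (6) is equivalent to

$$ \min_{y \in \mathbb{R}^k} J(y) = \min_{y \in \mathbb{R}^k} \|\beta v_1 - AV_ky\|_2, \qquad \beta = \|r_0\|_2. \tag{7} $$

Using (5) and $\beta v_1 = V_{k+1}\beta e_1$, we have

$$ J(y) = \left\|V_{k+1}\left(\beta e_1 - \tilde H_ky\right)\right\|_2 = \|\beta e_1 - \tilde H_ky\|_2. \tag{8} $$

Hence, the solution of the least squares problem (6) is

$$ x_k = x_0 + V_ky_k, $$

where $y_k$ minimizes the function $J(y)$ defined by (8) over $y \in \mathbb{R}^k$.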

GMRES Algorithm

Input: Choose $x_0$, compute $r_0 = b - Ax_0$ and $v_1 = r_0/\|r_0\|$;

Output: Approximate solution $x_k$ of the linear system $Ax = b$.

1: for j = 1, 2, ..., k do

2: Compute $h_{ij} = (Av_j, v_i)$ for i = 1, 2, ..., j;

3: Compute $\tilde v_{j+1} = Av_j - \sum_{i=1}^{j} h_{ij}v_i$;

4: Compute $h_{j+1,j} = \|\tilde v_{j+1}\|_2$;

5: Compute $v_{j+1} = \tilde v_{j+1}/h_{j+1,j}$;

6: end for

7: Form the solution:

$x_k = x_0 + V_ky_k$, where $y_k$ minimizes $J(y)$ in (8).

Difficulties: as $k$ increases, the storage for the vectors $v_j$ grows like $k$ ($k$ vectors of length $N$), and the number of multiplications grows like $\frac{1}{2}k^2N$.

GMRES(m) Algorithm

Input: Choose $x_0$, compute $r_0 = b - Ax_0$ and $v_1 = r_0/\|r_0\|$;

Output: Approximate solution of the linear system $Ax = b$.

1: for j = 1, 2, ..., m do

2: Compute $h_{ij} = (Av_j, v_i)$ for i = 1, 2, ..., j;

3: Compute $\tilde v_{j+1} = Av_j - \sum_{i=1}^{j} h_{ij}v_i$;

4: Compute $h_{j+1,j} = \|\tilde v_{j+1}\|_2$;

5: Compute $v_{j+1} = \tilde v_{j+1}/h_{j+1,j}$;

6: end for

7: Form the solution:

$x_m = x_0 + V_my_m$, where $y_m$ minimizes $\|\beta e_1 - \tilde H_my\|$ for $y \in \mathbb{R}^m$.

8: Restart: Compute $r_m = b - Ax_m$;

9: if $\|r_m\|$ is small, then

10: stop,

11: else

12: Set $x_0 := x_m$, $v_1 := r_m/\|r_m\|$ and go to step 1;

13: end if
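The restart loop can be sketched in Python as below. This is an illustrative sketch, not the author's code: it assumes a small dense real matrix stored as a list of rows; the names `gmres_cycle` and `gmres_restarted`, the tolerance `tol`, the cap `max_restarts`, and the relative breakdown threshold `1e-14` are all choices made for this example. The small least squares problem is solved with Givens rotations written as $[c\ \ s;\ -s\ \ c]$ with $s = h/\sqrt{r^2+h^2}$, a sign convention chosen here.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gmres_cycle(A, b, x0, m):
    """Steps 1-7: at most m Arnoldi steps from x0; returns the new iterate."""
    r0 = [bi - ci for bi, ci in zip(b, matvec(A, x0))]
    beta = math.sqrt(dot(r0, r0))
    if beta == 0.0:
        return list(x0)                      # x0 already solves the system
    V = [[ri / beta for ri in r0]]           # v_1 = r_0 / ||r_0||
    cols, rots, g = [], [], [beta]           # columns of R, rotations, Q*(beta e_1)
    for j in range(m):
        w = matvec(A, V[j])
        h = [dot(w, vi) for vi in V]         # h_ij = (A v_j, v_i)
        for hij, vi in zip(h, V):
            w = [wk - hij * vk for wk, vk in zip(w, vi)]
        h_next = math.sqrt(dot(w, w))        # h_{j+1,j}
        for i, (c, s) in enumerate(rots):    # apply the previous rotations
            h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
        rho = math.hypot(h[j], h_next)
        if rho == 0.0:
            break                            # new column vanished; stop this cycle
        c, s = h[j] / rho, h_next / rho      # rotation that zeroes h_{j+1,j}
        h[j] = rho
        cols.append(h)
        rots.append((c, s))
        g.append(-s * g[j])
        g[j] = c * g[j]
        if h_next <= 1e-14 * beta:
            break                            # breakdown: the iterate is exact
        V.append([wk / h_next for wk in w])
    k = len(cols)
    y = [0.0] * k
    for i in reversed(range(k)):             # back substitution on the k x k block
        tail = sum(cols[j][i] * y[j] for j in range(i + 1, k))
        y[i] = (g[i] - tail) / cols[i][i]
    x = list(x0)
    for yj, vj in zip(y, V):                 # x_m = x_0 + V_m y_m
        x = [xi + yj * vi for xi, vi in zip(x, vj)]
    return x

def gmres_restarted(A, b, x0, m, tol=1e-10, max_restarts=100):
    """GMRES(m): repeat cycles until ||b - A x||_2 <= tol or the cap is reached."""
    x = list(x0)
    for _ in range(max_restarts):
        x = gmres_cycle(A, b, x, m)
        r = [bi - ci for bi, ci in zip(b, matvec(A, x))]   # step 8
        if math.sqrt(dot(r, r)) <= tol:                    # step 9
            break
    return x
```

A call such as `gmres_restarted(A, b, [0.0, 0.0, 0.0], m=2)` restarts every two steps, so only three basis vectors are ever stored at once; choosing `m` equal to the order of the matrix reduces the method to unrestarted GMRES.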


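Practical Implementation: Consider QR factorization of $\tilde H_k$

Consider the matrix $\tilde H_k$. We want to solve the least squares problem:

$$ \min_{y \in \mathbb{R}^k} \|\beta e_1 - \tilde H_ky\|_2. $$

Assume Givens rotations $F_i$, $i = 1, \dots, j$, such that (shown for $j = 4$)

$$
F_j \cdots F_1 \tilde H_j = F_j \cdots F_1
\begin{bmatrix}
\times & \times & \times & \times \\
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{bmatrix}
=
\begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times \\
0 & 0 & 0 & 0
\end{bmatrix}
\equiv R_j \in \mathbb{R}^{(j+1)\times j}.
$$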

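In order to obtain $R_{j+1}$ we must start by premultiplying the new column by the previous rotations:

$$
\tilde H_{j+1} =
\begin{bmatrix}
\times & \times & \times & \times & + \\
\times & \times & \times & \times & + \\
0 & \times & \times & \times & + \\
0 & 0 & \times & \times & + \\
0 & 0 & 0 & \times & + \\
0 & 0 & 0 & 0 & +
\end{bmatrix}
\ \Rightarrow\
F_j \cdots F_1 \tilde H_{j+1} =
\begin{bmatrix}
\times & \times & \times & \times & + \\
0 & \times & \times & \times & + \\
0 & 0 & \times & \times & + \\
0 & 0 & 0 & \times & + \\
0 & 0 & 0 & 0 & r \\
0 & 0 & 0 & 0 & h
\end{bmatrix}.
$$

The principal upper $(j+1) \times j$ submatrix is nothing but $R_j$, and $h := h_{j+2,j+1}$ is not affected by the previous rotations. The next rotation $F_{j+1}$ is defined by

$$ c_{j+1} \equiv \frac{r}{(r^2 + h^2)^{1/2}}, \qquad s_{j+1} = -\frac{h}{(r^2 + h^2)^{1/2}}. $$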

Thus, after $k$ steps of the above process, we have achieved

$$ Q_k\tilde H_k = R_k, $$

where $Q_k$ is a $(k+1) \times (k+1)$ unitary matrix, and

$$ J(y) = \|\beta e_1 - \tilde H_ky\| = \left\|Q_k\left(\beta e_1 - \tilde H_ky\right)\right\| = \|g_k - R_ky\|, \tag{9} $$

where $g_k \equiv Q_k\beta e_1$. Since the last row of $R_k$ is a zero row, the minimization of (9) is achieved at $y_k = \tilde R_k^{-1}\tilde g_k$, where $\tilde R_k$ and $\tilde g_k$ are obtained by removing the last row of $R_k$ and the last component of $g_k$, respectively.

Proposition 1

$\|r_k\| = \|b - Ax_k\| = |\text{the } (k+1)\text{-st component of } g_k|$.
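The reduction to triangular form, and the residual formula of Proposition 1, can be illustrated with a short Python sketch. It is written under assumptions made for this example only: the function name `hessenberg_lstsq` is hypothetical, the matrix is real and passed as a list of rows, all rotations are applied in one pass rather than incrementally, and each rotation is written as $[c\ \ s;\ -s\ \ c]$ with $s = h/\sqrt{r^2+h^2}$, which differs in sign convention from $s_{j+1}$ above but zeroes the same entry.

```python
import math

def hessenberg_lstsq(H, beta):
    """Minimize ||beta*e_1 - H y||_2 for a (k+1) x k upper Hessenberg H.

    Returns (y, res), where res is the absolute value of the last
    component of the rotated right-hand side g.
    """
    k = len(H[0])
    R = [row[:] for row in H]                # work on a copy of H
    g = [beta] + [0.0] * k
    for j in range(k):
        r, h = R[j][j], R[j + 1][j]
        rho = math.hypot(r, h)
        if rho == 0.0:
            raise ValueError("rank-deficient Hessenberg matrix")
        c, s = r / rho, h / rho              # rotation that zeroes R[j+1][j]
        for col in range(j, k):              # rotate rows j and j+1
            a, b = R[j][col], R[j + 1][col]
            R[j][col], R[j + 1][col] = c * a + s * b, -s * a + c * b
        g[j], g[j + 1] = c * g[j] + s * g[j + 1], -s * g[j] + c * g[j + 1]
    y = [0.0] * k
    for i in reversed(range(k)):             # back substitution on the top k rows
        tail = sum(R[i][col] * y[col] for col in range(i + 1, k))
        y[i] = (g[i] - tail) / R[i][i]
    return y, abs(g[k])

H = [[2.0, 0.0], [1.0, 4.0], [0.0, 1.0]]
y, res = hessenberg_lstsq(H, 1.0)
```

For this $3 \times 2$ example the minimizer is $y = (34/69, -8/69)$ and the returned residual norm is $1/\sqrt{69} \approx 0.1204$, which matches $\|\beta e_1 - \tilde H y\|_2$ computed directly, without ever forming the residual vector.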


Proposition 2

The solution $x_j$ produced by GMRES at step $j$ is exact if and only if any one of the following equivalent conditions holds:

(i) The algorithm breaks down at step $j$,

(ii) $\tilde v_{j+1} = 0$,

(iii) $h_{j+1,j} = 0$,

(iv) The degree of the minimal polynomial of $r_0$ is $j$.

Corollary 3

For an $n \times n$ problem GMRES terminates in at most $n$ steps.

This uncommon type of breakdown is sometimes referred to as a “lucky” breakdown in the context of the Lanczos algorithm.

Proposition 3

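Suppose that $A$ is diagonalizable so that $A = XDX^{-1}$ and let

$$ \varepsilon^{(m)} = \min_{p \in P_m,\ p(0) = 1}\ \max_{\lambda_i \in \sigma(A)} |p(\lambda_i)|. $$

Then

$$ \|r_{m+1}\| \le \kappa(X)\,\varepsilon^{(m)}\,\|r_0\|, $$

where $\kappa(X) = \|X\|\,\|X^{-1}\|$.

When $A$ is positive real with symmetric part $M$, it holds that

$$ \|r_m\| \le [1 - \alpha/\beta]^{m/2}\,\|r_0\|, $$

where $\alpha = (\lambda_{\min}(M))^2$ and $\beta = \lambda_{\max}(A^TA)$.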

Theorem 4

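Assume that $A$ has $\nu$ eigenvalues $\lambda_1, \dots, \lambda_\nu$ with positive (negative) real parts and that the other eigenvalues are enclosed in a circle centered at $C$, with $C > 0$, and having radius $R$ with $C > R$. Then

$$
\varepsilon^{(m)} \le \left[\frac{R}{C}\right]^{m-\nu} \max_{j = \nu+1, \cdots, N}\ \prod_{i=1}^{\nu} \frac{|\lambda_i - \lambda_j|}{|\lambda_i|}
\le \left[\frac{D}{d}\right]^{\nu} \left[\frac{R}{C}\right]^{m-\nu},
$$

where

$$ D = \max_{\substack{i = 1, \cdots, \nu \\ j = \nu+1, \cdots, N}} |\lambda_i - \lambda_j| \quad \text{and} \quad d = \min_{i = 1, \cdots, \nu} |\lambda_i|. $$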

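Proof.

Consider $p(z) = r(z)q(z)$, where $r(z) = (1 - z/\lambda_1) \cdots (1 - z/\lambda_\nu)$ and $q(z)$ is an arbitrary polynomial of degree $\le m - \nu$ such that $q(0) = 1$. Since $p(0) = 1$ and $p(\lambda_i) = 0$ for $i = 1, \dots, \nu$, we have

$$ \varepsilon^{(m)} \le \max_{j = \nu+1, \cdots, N} |p(\lambda_j)| \le \max_{j = \nu+1, \cdots, N} |r(\lambda_j)|\ \max_{j = \nu+1, \cdots, N} |q(\lambda_j)|. $$

It is easily seen that

$$ \max_{j = \nu+1, \cdots, N} |r(\lambda_j)| = \max_{j = \nu+1, \cdots, N}\ \prod_{i=1}^{\nu} \frac{|\lambda_i - \lambda_j|}{|\lambda_i|} \le \left[\frac{D}{d}\right]^{\nu}. $$

By the maximum principle, the maximum of $|q(z)|$ for $z \in \{\lambda_j\}_{j=\nu+1}^{N}$ is no larger than its maximum on the circle. Taking $q(z) = [(C - z)/C]^{m-\nu}$, whose maximum modulus on the circle is $(R/C)^{m-\nu}$, yields the desired result.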

Corollary 5

Under the assumptions of Proposition 3 and Theorem 4, GMRES(m) converges for any initial $x_0$ if

$$ m > \nu\,\frac{\log\left[\dfrac{DC}{dR}\,\kappa(X)^{1/\nu}\right]}{\log\left[\dfrac{C}{R}\right]}. $$
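To see what this condition asks for numerically, the right-hand side can be evaluated directly. The sketch below is only an illustration: the function names `restart_bound` and `contraction` and the sample values are hypothetical, and `contraction` simply multiplies the bounds of Proposition 3 and Theorem 4 together.

```python
import math

def restart_bound(nu, D, d, C, R, kappa):
    """Evaluate nu * log((D*C)/(d*R) * kappa**(1/nu)) / log(C/R)."""
    if not (C > R > 0 and d > 0 and nu >= 1):
        raise ValueError("need C > R > 0, d > 0 and nu >= 1")
    return nu * math.log((D * C) / (d * R) * kappa ** (1.0 / nu)) / math.log(C / R)

def contraction(m, nu, D, d, C, R, kappa):
    """kappa(X) * (D/d)^nu * (R/C)^(m - nu): bound on the residual reduction."""
    return kappa * (D / d) ** nu * (R / C) ** (m - nu)

# Hypothetical data: one outlying eigenvalue, the rest in a disc of
# radius 1 centered at 2, and a perfectly conditioned eigenvector matrix.
bound = restart_bound(nu=1, D=2.0, d=1.0, C=2.0, R=1.0, kappa=1.0)
```

With these sample values the threshold is 2, so the condition asks for $m \ge 3$; at $m = 3$ the contraction factor is 0.5, while at $m = 2$ it equals 1 and the bound no longer guarantees a reduction.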


Appendix

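Proof of Implicit Q Theorem

Let

$$
A\,[q_1\ q_2\ \cdots\ q_n] = [q_1\ q_2\ \cdots\ q_n]
\begin{bmatrix}
h_{11} & h_{12} & \cdots & \cdots & h_{1n} \\
h_{21} & h_{22} & \ddots & & \vdots \\
0 & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & h_{n-1,n} \\
0 & \cdots & 0 & h_{n,n-1} & h_{nn}
\end{bmatrix}. \tag{10}
$$

Then we have

$$ Aq_1 = h_{11}q_1 + h_{21}q_2. \tag{11} $$

Since $q_1 \perp q_2$, it implies that

$$ h_{11} = q_1^*Aq_1 / q_1^*q_1. $$

From (11), we get that

$$ \tilde q_2 \equiv h_{21}q_2 = Aq_1 - h_{11}q_1. $$

That is,

$$ q_2 = \tilde q_2/\|\tilde q_2\|_2 \quad \text{and} \quad h_{21} = \|\tilde q_2\|_2. $$

Similarly, from (10),

$$ Aq_2 = h_{12}q_1 + h_{22}q_2 + h_{32}q_3, $$

where

$$ h_{12} = q_1^*Aq_2 \quad \text{and} \quad h_{22} = q_2^*Aq_2. $$

Let

$$ \tilde q_3 = Aq_2 - h_{12}q_1 - h_{22}q_2. $$

Then

$$ q_3 = \tilde q_3/\|\tilde q_3\|_2 \quad \text{and} \quad h_{32} = \|\tilde q_3\|_2, $$

and so on.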

Therefore, [q₁, · · · , q_n] are uniquely determined by q₁. Thus, uniqueness holds.

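Suppose that $K_n = [v_1, Av_1, \cdots, A^{n-1}v_1]$ with $\|v_1\|_2 = 1$ is nonsingular, and let $K_n = U_nR_n$ be its QR factorization, so that $U_ne_1 = v_1$. Then

$$
AK_n = K_nC_n = [v_1, Av_1, \cdots, A^{n-1}v_1]
\begin{bmatrix}
0 & \cdots & \cdots & 0 & * \\
1 & \ddots & & \vdots & * \\
0 & \ddots & \ddots & \vdots & \vdots \\
\vdots & \ddots & \ddots & 0 & \vdots \\
0 & \cdots & 0 & 1 & *
\end{bmatrix}. \tag{12}
$$

Since $K_n$ is nonsingular, (12) implies that

$$ A = K_nC_nK_n^{-1} = (U_nR_n)C_n(R_n^{-1}U_n^{-1}). $$

That is,

$$ AU_n = U_n(R_nC_nR_n^{-1}), $$

where $R_nC_nR_n^{-1}$ is Hessenberg and $U_ne_1 = v_1$. Because $\langle U_n \rangle = \langle K_n \rangle$, if $AV_n = V_nH_n$ is found by any method with $V_ne_1 = v_1$, then it holds that $V_n = U_n$, i.e., $v_n^{(i)} = u_n^{(i)}$ for $i = 1, \cdots, n$.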

Definition 6 (Givens rotation)

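A plane rotation (also called a Givens rotation) is a matrix of the form

$$ G = \begin{bmatrix} c & s \\ -\bar s & c \end{bmatrix}, $$

where $|c|^2 + |s|^2 = 1$.

Given $a \neq 0$ and $b$, set

$$ v = \sqrt{|a|^2 + |b|^2}, \qquad c = \frac{|a|}{v} \qquad \text{and} \qquad s = \frac{a}{|a|} \cdot \frac{\bar b}{v}, $$

then

$$ \begin{bmatrix} c & s \\ -\bar s & c \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} v\,\dfrac{a}{|a|} \\ 0 \end{bmatrix}. $$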
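The definition translates almost line by line into Python's built-in complex arithmetic. The function names `givens` and `apply_givens` below are chosen for this illustration; the sketch is a direct transcription of the formulas above, without the scaling safeguards a production routine would add.

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-conj(s), c]] maps (a, b) to (v*a/|a|, 0)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    v = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    c = abs(a) / v                                   # real and nonnegative
    s = (a / abs(a)) * (complex(b).conjugate() / v)
    return c, s

def apply_givens(c, s, a, b):
    """Apply the rotation [[c, s], [-conj(s), c]] to the vector (a, b)."""
    return c * a + s * b, -complex(s).conjugate() * a + c * b

c, s = givens(3 + 4j, 1 - 2j)
top, bottom = apply_givens(c, s, 3 + 4j, 1 - 2j)
```

Here `bottom` vanishes up to rounding and `top` equals $v\,a/|a|$ with $v = \sqrt{30}$, so the rotation preserves the phase of $a$ while zeroing the second component.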