GMRES: Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems
Tsung-Ming Huang
Department of Mathematics National Taiwan Normal University
December 4, 2011
Ref: SISC, 1984, Saad
Theorem 1 (Implicit Q theorem)
Let AV1= V1H1 and AV2 = V2H2, where H1, H2 are Hessenberg and V1, V2 are unitary with V1e1 = V2e1 = q1. Then V1= V2 and H1= H2.
Proof
A
v1 v2 · · · vn = v1 v2 · · · vn
h11 h12 · · · h1n
h21 h22 h2n
. .. . .. ... hn,n−1 hnn
with
viTvj = δij, i, j = 1, ..., n
Arnoldi Algorithm
Input: Given v1 with kv1k2 = 1;
Output: Arnoldi factorization: AVk= VkHk+ hk+1,kvk+1eTk.
1: Set k = 0.
2: repeat
3: Compute hik = (Avk, vi) for i = 1, 2, . . . , k;
4: Compute ˜vk+1= Avk−Pk
i=1hikvi;
5: Compute hk+1,k = k˜vk+1k2;
6: Compute vk+1= ˜vk+1/hk+1,k;
7: Set k = k + 1;
8: until convergent
Remark 1
(a) Let Vk= [v1, · · · , vk] ∈ Rn×k where vj, for j = 1, . . . , k, is generated by Arnoldi algorithm. Then Hk≡ VkTAVk is upper k × k Hessenberg.
(b) Arnoldi’s original method was a Galerkin method for approximate the eigenvalue of A by Hk.
In order to solve Ax = b by the Galerkin method using < Kk>≡< Vk>, we seek an approximate solution xk= x0+ zk with
zk∈ Kk=< r0, Ar0, · · · , Ak−1r0 >
and r0 = b − Ax0. Definition 2
{xk} is said to be satisfied the Galerkin condition if rk≡ b − Axk is orthogonal to Kk for each k.
The Galerkin method can be stated as that find
xk= x0+ zk with zk∈ Vk (1) such that
(b − Axk, v) = 0, ∀ v ∈ Vk,
which is equivalent to find
zk≡ Vkyk∈ Vk (2)
such that
(r0− Azk, v) = 0, ∀ v ∈ Vk. (3) Substituting (2) into (3), we get
VkT(r0− AVkyk) = 0, which implies that
yk= (VkTAVk)−1kr0ke1. (4)
Since Vk is computed by the Arnoldi algorithm with v1 = r0/kr0k, yk in (4) can be represented as
yk = Hk−1kr0ke1. Substituting it into (2) and (1), we get
xk= x0+ VkHk−1kr0ke1.
Using the result that AVk= VkHk+ hk+1,kvk+1eTk, rk can be reformulated as
rk = b − Axk = r0− AVkyk= r0− (VkHk+ hk+1,kvk+1eTk)yk
= r0− Vkkr0ke1− hk+1,keTkykvk+1 = −(hk+1,keTkyk)vk+1.
The generalized minimal residual (GMRES) algorithm
The approximate solution of the form x0+ zk, which minimizes the residual norm over zk∈ Kk, can in principle be obtained by following algorithms:
The ORTHODIR algorithm of Jea and Young;
the generalized conjugate residual method (GCR);
GMRES.
Let
Vk= [v1, · · · , vk] , H˜k=
h1,1 h1,2 · · · h1,k
h2,1 h2,2 · · · h2,k 0 . .. . .. ...
... . .. hk,k−1 hk,k
0 · · · 0 hk+1,k
∈ R(k+1)×k.
By Arnoldi algorithm, we have
AVk= Vk+1H˜k. (5)
To solve the least square problem:
z∈Kmink
kro− Azk2= min
z∈Kk
kb − A(xo+ z)k2, (6) where Kk =< ro, Aro, · · · , Ak−1ro >=< v1, · · · , vk> with v1 = krro
ok2.
Set z = Vky, the least square problem (6) is equivalent to min
y∈Rk
J (y) = min
y∈Rk
kβv1− AVkyk2, β = krok2. (7)
Using (5), we have J (y) = kVk+1
βe1− ˜Hky
k2 = kβe1− ˜Hkyk2. (8) Hence, the solution of the least square (6) is
xk= xo+ Vkyk,
where yk minimize the function J (y) defined by (8) over y ∈ Rk.
GMRES Algorithm
Input: Choose x0, compute r0= b − Ax0 and v1 = r0/kr0k;
Output: Solution of linear system Ax = b.
1: for j = 1, 2, . . . , k do
2: Compute hij = (Avj, vi) for i = 1, 2, . . . , j;
3: Compute ˜vj+1= Avj −Pj
i=1hijvi;
4: Compute hj+1,j = k˜vj+1k2;
5: Compute vj+1= ˜vj+1/hj+1,j;
6: end for
7: Form the solution:
xk= x0+ Vkyk, where yk minimizes J (y) in (8).
Difficulties: when k is increasing, storage for vj, like k, the number of multiplications is like 12k2N .
GMRES(m) Algorithm
Input: Choose x0, compute r0= b − Ax0 and v1 = r0/kr0k;
Output: Solution of linear system Ax = b.
1: for j = 1, 2, . . . , m do
2: Compute hij = (Avj, vi) for i = 1, 2, . . . , j;
3: Compute ˜vj+1= Avj −Pj
i=1hijvi;
4: Compute hj+1,j = k˜vj+1k2;
5: Compute vj+1= ˜vj+1/hj+1,j;
6: end for
7: Form the solution:
xm= x0+ Vmym, where ym minimizes k βe1− eHmy k for y ∈ Rm.
8: Restart: Compute rm = b − Axm;
9: if krmk is small, then
10: stop,
Practical Implementation: Consider QR factorization of eHk
Consider the matrix eHk. We want to solve the least squares problem:
min
y∈Rk
k βe1− eHky k2. Assume Givens rotations Fi , i = 1, . . . , j such that
Fj· · · F1Hej =Fj· · · F1
× × × ×
× × × ×
0 × × ×
0 0 × ×
0 0 0 ×
=
× × × ×
× × ×
× ×
×
≡ Rj ∈ R(j+1)×j.
In order to obtain Rj+1 we must start by premultiptying the new column by the previous rotations.
Hej+1=
× × × × +
× × × × +
0 × × × +
0 0 × × +
0 0 0 × +
0 0 0 0 +
⇒ Fj· · · F1Hej+1=
× × × × +
× × × +
× × +
× + 0 r
0 h
The principal upper (j + 1) × j submatrix is nothing but Rj, and
h := hj+2,j+1is not affected by the previous rotations. The next rotation Fj+1 defined by
cj+1 ≡ r/(r2+ h2)1/2, sj+1 = −h/(r2+ h2)1/2.
Thus, after k steps of the above process, we have achieved QkHek = Rk
where Qk is a (k + 1) × (k + 1) unitary matrix and J (y) =k βe1− eHky k=k Qk
βe1− eHky
k=k gk− Rky k, (9) where gk ≡ Qkβe1. Since the last row of Rk is a zero row, the
minimization of (9) is achieved at yk= eR−1k egk , where eRk andegk are removed the last row of Rk and the last component of gk, respectively.
Proposition 1
k rk k=k b − Axkk=| The (k+1)-st component of gk|.
Proposition 2
The solution xj produced by GMRES at step j is exact which is equivalent to
(i) The algorithm breaks down at step j, (ii) v˜j+1= 0,
(iii) hj+1,j= 0,
(iv) The degree of the minimal polynomial of r0 is j.
Corollary 3
For an n × n problem GMRES terminates at most n steps.
This uncommon type of breakdown is sometimes referred to as a “Lucky”
breakdown is the context of the Lanczos algorithm.
Proposition 3
Suppose that A is diagonalizable so that A = XDX−1 and let ε(m)= min
p∈Pm,p(0)=1 max
λi∈σ(A) |p(λi)| . Then
krm+1k ≤ κ(X)ε(m)kr0k , where κ(X) = kXkkX−1k.
When A is positive real with symmetric part M , it holds that krmk ≤ [1 − α/β]m/2kr0k ,
where α = (λmin(M ))2 and β = λmax(ATA).
Theorem 4
Assume λ1, . . . , λν of A with positive(negative) real parts and the other eigenvalues enclosed in a circle centered at C with C > 0 and have radius R with C > R. Then
ε(m)≤ R C
m−ν
j=ν+1,··· ,Nmax
ν
Y
i=1
|λi− λj|
|λi| ≤ D d
2 R C
m−ν
where
D = max
i=1,··· ,ν j=ν+1,··· ,N
|λi− λj| and d = min
i=1,··· ,ν|λi| .
Proof.
Consider p(z) = r(z)q(z) where r(z) = (1 − z/λ1) · · · (1 − z/λν) and q(z) arbitrary polynomial of deg ≤ m − ν such that q(0) = 1. Since p(0) = 1 and p(λi) = 0, for i = 1, . . . , ν, we have
ε(m) ≤ max
j=ν+1,··· ,N|p(λj)| ≤ max
j=ν+1,··· ,N|r(λj)| max
j=ν+1,··· ,N|q(λj)| . It is easily seen that
j=ν+1,··· ,Nmax |r(λj)| = max
j=ν+1,··· ,N ν
Y
i=1
|λi− λj|
|λi| ≤ D d
ν
.
By maximum principle, the maximum of |q(z)| for z ∈ {λj}Nj=ν+1 is on the circle. Taking σ(z) = [(C − z)/C]m−ν whose maximum on the circle is (R/C)m−ν yields the desired result.
Corollary 5
Under the assumptions of Proposition 3 and Theorem 4, GMRES(m) converges for any initial x0 if
m > νLog DC
dRκ(X)1/ν
Log C R .
Appendix
Proof of Implicit Q Theorem Let
A[q1 q2 · · · qn] = [q1 q2 · · · qn]
h11 h12 · · · · · · h1n
h21 h22 . .. ... 0 . .. . .. . .. ...
..
. . .. . .. . .. hn−1,n
0 · · · 0 hn,n−1 hnn
. (10)
Then we have
Aq1 = h11q1+ h21q2. (11) Since q1⊥q2, it implies that
h11= q1∗Aq1/q∗1q1. From (11), we get that
˜
q2≡ h21q2= Aq1− h11q1. That is
q2 = ˜q2/k ˜q2k2 and h21= k ˜q2k2.
Similarly, from (10),
Aq2 = h12q1+ h22q2+ h32q3, where
h12= q1∗Aq2 and h22= q2∗Aq2. Let
˜
q3 = Aq2− h12q1+ h22q2. Then
q3 = ˜q3/k ˜q3k2 and h32= k ˜q3k, and so on.
Therefore, [q1, · · · , qn] are uniquely determined by q1. Thus, uniqueness holds.
Let Kn= [v1, Av1, · · · , An−1v1] with kv1k2= 1 is nonsingular.
Kn= UnRnand Une1 = v1. Then
AKn= KnCn= [v1, Av1, · · · , An−1v1]
0 · · · 0 ∗ 1 . .. ... ∗ 0 . .. ... ... ...
... . .. ... 0 ...
0 · · · 0 1 ∗
. (12)
Since Kn is nonsingular, (12) implies that
A = KnCnKn−1= (UnRn)Cn(R−1n Un−1).
That is
AUn= Un(RnCnR−1n ),
where (RnCnRn−1) is Hessenberg and Une1 = v1. Because
< Un>=< Kn>, find AVn= VnHn by any method with Vne1 = v1, then it holds that Vn= Un, i.e., vn(i)= u(i)n for i = 1, · · · , n.
Back to Theorem
Definition 6 (Givens rotation)
A plane rotation (also called a Givens rotation) is a matrix of the form G =
c s
−¯s c
where |c|2+ |s|2= 1.
Given a 6= 0 and b, set
v =p|a|2+ |b|2, c = |a|/v and s = a
|a| ·¯b v, then
c s
−¯s c
a b
=
"
v a
|a|
0
# .