師大
The QR algorithm
Tsung-Ming Huang
Department of Mathematics National Taiwan Normal University, Taiwan
December 3, 2008
Outline
1 The power and inverse power methods
    The inverse power method
2 The explicitly shifted QR algorithm
    The QR algorithm and the inverse power method
    The unshifted QR algorithm
    Hessenberg form
3 The implicitly shifted QR algorithm
    The implicit double shift
4 The generalized eigenvalue problem
    Real Schur and Hessenberg-triangular forms
    The doubly shifted QZ algorithm
The power and inverse power methods

Let $A$ be a nondefective matrix and $(\lambda_i, x_i)$ for $i = 1, \dots, n$ be a complete set of eigenpairs of $A$; that is, $\{x_1, \dots, x_n\}$ is linearly independent. Hence, for any $u_0 \neq 0$ there exist $\alpha_1, \dots, \alpha_n$ such that
$$u_0 = \alpha_1 x_1 + \cdots + \alpha_n x_n.$$
Now $A^k x_i = \lambda_i^k x_i$, so that
$$A^k u_0 = \alpha_1 \lambda_1^k x_1 + \cdots + \alpha_n \lambda_n^k x_n. \qquad (1)$$
If $|\lambda_1| > |\lambda_i|$ for $i \geq 2$ and $\alpha_1 \neq 0$, then
$$\frac{1}{\lambda_1^k} A^k u_0 = \alpha_1 x_1 + \alpha_2 \Big(\frac{\lambda_2}{\lambda_1}\Big)^k x_2 + \cdots + \alpha_n \Big(\frac{\lambda_n}{\lambda_1}\Big)^k x_n \to \alpha_1 x_1 \quad \text{as } k \to \infty.$$
Theorem
Let $A$ have a unique dominant eigenpair $(\lambda_1, x_1)$ with $x_1^* x_1 = 1$, and let $X = [x_1 \; X_2]$ be a nonsingular matrix with $X_2^* X_2 = I$ such that
$$X^{-1} A X = \begin{bmatrix} \lambda_1 & 0 \\ 0 & M \end{bmatrix}.$$
Let $u_0 \neq 0$ be decomposed as $u_0 = \alpha_1 x_1 + X_2 c_2$. Then
$$\sin \angle(x_1, A^k u_0) \leq \frac{|\lambda_1|^{-k} \|M^k\|_2 \|c_2/\alpha_1\|_2}{1 - |\lambda_1|^{-k} \|M^k\|_2 \|c_2/\alpha_1\|_2}.$$
In particular, for every $\varepsilon > 0$ there exists $\sigma$ such that
$$\sin \angle(x_1, A^k u_0) \leq \frac{\sigma\,[\rho(M)/|\lambda_1| + \varepsilon]^k}{1 - \sigma\,[\rho(M)/|\lambda_1| + \varepsilon]^k},$$
where $\rho(M)$ is the spectral radius of $M$.
Proof: Since
$$u_0 = \alpha_1 x_1 + X_2 c_2 = [x_1 \; X_2] \begin{bmatrix} \alpha_1 \\ c_2 \end{bmatrix} = X \begin{bmatrix} \alpha_1 \\ c_2 \end{bmatrix},$$
it follows that
$$X^{-1} A^k u_0 = X^{-1} A^k X \begin{bmatrix} \alpha_1 \\ c_2 \end{bmatrix} = (X^{-1} A X)(X^{-1} A X) \cdots (X^{-1} A X) \begin{bmatrix} \alpha_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} \lambda_1^k & 0 \\ 0 & M^k \end{bmatrix} \begin{bmatrix} \alpha_1 \\ c_2 \end{bmatrix}.$$
Hence,
$$A^k u_0 = X \begin{bmatrix} \lambda_1^k \alpha_1 \\ M^k c_2 \end{bmatrix} = \alpha_1 \lambda_1^k x_1 + X_2 M^k c_2.$$
Let the columns of $Y$ form an orthonormal basis for the subspace orthogonal to $x_1$. By Lemma 3.12 in Chapter 1, we have
$$\sin \angle(x_1, A^k u_0) = \frac{\|Y^* A^k u_0\|_2}{\|A^k u_0\|_2} = \frac{\|Y^* X_2 M^k c_2\|_2}{\|\alpha_1 \lambda_1^k x_1 + X_2 M^k c_2\|_2}.$$
From
$$\|Y^* X_2 M^k c_2\|_2 \leq \|M^k\|_2 \|c_2\|_2$$
and
$$\|\alpha_1 \lambda_1^k x_1 + X_2 M^k c_2\|_2 \geq |\alpha_1| |\lambda_1|^k - \|M^k\|_2 \|c_2\|_2,$$
we get
$$\sin \angle(x_1, A^k u_0) \leq \frac{|\lambda_1|^{-k} \|M^k\|_2 \|c_2/\alpha_1\|_2}{1 - |\lambda_1|^{-k} \|M^k\|_2 \|c_2/\alpha_1\|_2}.$$
By Theorem 2.9 in Chapter 1, for every $\varepsilon > 0$ there exists $\hat\sigma$ such that
$$\|M^k\|_2 \leq \hat\sigma\,(\rho(M) + \varepsilon)^k.$$
Take $\sigma = \hat\sigma \|c_2/\alpha_1\|_2$. Then for every $\varepsilon > 0$,
$$\sin \angle(x_1, A^k u_0) \leq \frac{\sigma\,[\rho(M)/|\lambda_1| + \varepsilon]^k}{1 - \sigma\,[\rho(M)/|\lambda_1| + \varepsilon]^k}.$$
The error in the eigenvector approximation converges to zero at an asymptotic rate of $[\rho(M)/|\lambda_1|]^k$.
If $A$ has a complete system of eigenvectors with
$$|\lambda_1| > |\lambda_2| \geq \cdots \geq |\lambda_n|,$$
then the convergence behaves as $|\lambda_2/\lambda_1|^k$.
Algorithm 1 (Power method with 2-norm)
Choose an initial $u \neq 0$ with $\|u\|_2 = 1$.
Iterate until convergence:
    Compute $v = Au$; $k = \|v\|_2$; $u := v/k$.

Theorem
The sequences defined by Algorithm 1 satisfy
$$\lim_{i \to \infty} k_i = |\lambda_1|, \qquad \lim_{i \to \infty} \varepsilon^i u_i = \frac{x_1}{\|x_1\|} \frac{\alpha_1}{|\alpha_1|}, \quad \text{where } \varepsilon = \frac{|\lambda_1|}{\lambda_1}.$$
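As a concrete illustration (not part of the original notes), Algorithm 1 can be sketched in a few lines of NumPy; the matrix $A$ and the iteration count are arbitrary choices for the example:

```python
import numpy as np

def power_method(A, u0, iters=200):
    """Power method with 2-norm normalization (a sketch of Algorithm 1)."""
    u = u0 / np.linalg.norm(u0)
    k = 0.0
    for _ in range(iters):
        v = A @ u
        k = np.linalg.norm(v)   # k_i -> |lambda_1|
        u = v / k               # u_i -> (up to phase) x_1 / ||x_1||
    return k, u

# Example: the dominant eigenvalue of this symmetric matrix is (7 + sqrt(5))/2.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
k, u = power_method(A, np.array([1.0, 1.0]))
```

Here $k_i \to |\lambda_1|$ and, since $\lambda_1 > 0$ for this example, the iterates $u_i$ converge to a unit eigenvector without the phase factor $\varepsilon^i$.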
Proof: It is obvious that
$$u_s = \frac{A^s u_0}{\|A^s u_0\|}, \qquad k_s = \frac{\|A^s u_0\|}{\|A^{s-1} u_0\|}. \qquad (2)$$
It follows from $\lambda_1^{-s} A^s u_0 \to \alpha_1 x_1$ that
$$|\lambda_1|^{-s} \|A^s u_0\| \to |\alpha_1| \|x_1\|, \qquad |\lambda_1|^{-s+1} \|A^{s-1} u_0\| \to |\alpha_1| \|x_1\|,$$
and then
$$|\lambda_1|^{-1} k_s = |\lambda_1|^{-1}\,\|A^s u_0\| / \|A^{s-1} u_0\| \to 1.$$
From (1) it now follows for $s \to \infty$ that
$$\varepsilon^s u_s = \varepsilon^s \frac{A^s u_0}{\|A^s u_0\|} = \frac{\alpha_1 x_1 + \sum_{i=2}^n \alpha_i (\lambda_i/\lambda_1)^s x_i}{\big\|\alpha_1 x_1 + \sum_{i=2}^n \alpha_i (\lambda_i/\lambda_1)^s x_i\big\|} \to \frac{\alpha_1 x_1}{\|\alpha_1 x_1\|} = \frac{x_1}{\|x_1\|} \frac{\alpha_1}{|\alpha_1|}.$$
Algorithm 2 (Power method with linear functional)
Choose an initial $u \neq 0$.
Iterate until convergence:
    Compute $v = Au$; $k = \ell(v)$; $u := v/k$,
where $\ell(v)$, e.g. $e_1^T v$ or $e_n^T v$, is a linear functional.

Theorem
Suppose $\ell(x_1) \neq 0$ and $\ell(v_i) \neq 0$ for $i = 1, 2, \dots$. Then
$$\lim_{i \to \infty} k_i = \lambda_1, \qquad \lim_{i \to \infty} u_i = \frac{x_1}{\ell(x_1)}.$$
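A minimal sketch of Algorithm 2 with the linear functional $\ell(v) = e_n^T v$; the matrix is the same illustrative example as before, not taken from the notes:

```python
import numpy as np

def power_method_lf(A, u0, ell, iters=200):
    """Power method normalized by a linear functional (a sketch of Algorithm 2).

    Unlike the 2-norm variant, k_i converges to lambda_1 itself,
    provided ell(x_1) != 0 and ell does not vanish on any iterate."""
    u = np.asarray(u0, dtype=float)
    k = 0.0
    for _ in range(iters):
        v = A @ u
        k = ell(v)    # k_i -> lambda_1
        u = v / k     # u_i -> x_1 / ell(x_1)
    return k, u

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
ell = lambda v: v[-1]   # ell(v) = e_n^T v
k, u = power_method_lf(A, np.array([1.0, 1.0]), ell)
```

By construction $\ell(u_i) = 1$ at every step, so the limit is the eigenvector scaled to $x_1/\ell(x_1)$.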
Proof: As above we show that
$$u_i = \frac{A^i u_0}{\ell(A^i u_0)}, \qquad k_i = \frac{\ell(A^i u_0)}{\ell(A^{i-1} u_0)}.$$
From (1) we get for $s \to \infty$
$$\lambda_1^{-s}\,\ell(A^s u_0) \to \alpha_1 \ell(x_1), \qquad \lambda_1^{-s+1}\,\ell(A^{s-1} u_0) \to \alpha_1 \ell(x_1),$$
thus $\lambda_1^{-1} k_s \to 1$. Similarly, for $i \to \infty$,
$$u_i = \frac{A^i u_0}{\ell(A^i u_0)} = \frac{\alpha_1 x_1 + \sum_{j=2}^n \alpha_j (\lambda_j/\lambda_1)^i x_j}{\ell\big(\alpha_1 x_1 + \sum_{j=2}^n \alpha_j (\lambda_j/\lambda_1)^i x_j\big)} \to \frac{\alpha_1 x_1}{\alpha_1 \ell(x_1)} = \frac{x_1}{\ell(x_1)}.$$
Note that
$$k_s = \frac{\ell(A^s u_0)}{\ell(A^{s-1} u_0)} = \lambda_1\, \frac{\alpha_1 \ell(x_1) + \sum_{j=2}^n \alpha_j (\lambda_j/\lambda_1)^s \ell(x_j)}{\alpha_1 \ell(x_1) + \sum_{j=2}^n \alpha_j (\lambda_j/\lambda_1)^{s-1} \ell(x_j)} = \lambda_1 + O\big(|\lambda_2/\lambda_1|^{s-1}\big).$$
That is, the convergence rate is $|\lambda_2/\lambda_1|$.
Theorem
Let $u \neq 0$ and for any $\mu$ set $r_\mu = Au - \mu u$. Then $\|r_\mu\|_2$ is minimized when
$$\mu = u^* A u / u^* u.$$
In this case $r_\mu \perp u$.

Proof: W.l.o.g. assume $\|u\|_2 = 1$. Let $[u \; U]$ be unitary and set
$$\begin{bmatrix} u^* \\ U^* \end{bmatrix} A\, [u \; U] \equiv \begin{bmatrix} \nu & h^* \\ g & B \end{bmatrix} = \begin{bmatrix} u^* A u & u^* A U \\ U^* A u & U^* A U \end{bmatrix}.$$
Then
$$\begin{bmatrix} u^* \\ U^* \end{bmatrix} r_\mu = \begin{bmatrix} u^* \\ U^* \end{bmatrix} A u - \mu \begin{bmatrix} u^* \\ U^* \end{bmatrix} u = \begin{bmatrix} u^* \\ U^* \end{bmatrix} A\, [u \; U] \begin{bmatrix} u^* \\ U^* \end{bmatrix} u - \mu \begin{bmatrix} u^* \\ U^* \end{bmatrix} u = \begin{bmatrix} \nu & h^* \\ g & B \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} - \mu \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \nu - \mu \\ g \end{bmatrix}.$$
It follows that
$$\|r_\mu\|_2^2 = \left\| \begin{bmatrix} u^* \\ U^* \end{bmatrix} r_\mu \right\|_2^2 = \left\| \begin{bmatrix} \nu - \mu \\ g \end{bmatrix} \right\|_2^2 = |\nu - \mu|^2 + \|g\|_2^2.$$
Hence
$$\min_\mu \|r_\mu\|_2 = \|g\|_2 = \|r_\nu\|_2,$$
that is, the minimizer is $\mu = \nu = u^* A u$. On the other hand, since
$$u^* r_\mu = u^*(Au - \mu u) = u^* A u - \mu = 0,$$
it implies that $r_\mu \perp u$.

Definition (Rayleigh quotient)
Let $u$ and $v$ be vectors with $v^* u \neq 0$. Then $v^* A u / v^* u$ is called a Rayleigh quotient.
If $u$ or $v$ is an eigenvector corresponding to an eigenvalue $\lambda$ of $A$, then
$$\frac{v^* A u}{v^* u} = \frac{\lambda v^* u}{v^* u} = \lambda.$$
Therefore, $u_k^* A u_k / u_k^* u_k$ provides a sequence of approximations to $\lambda$ in the power method.
The inverse power method

Goal
Find the eigenvalue of $A$ that lies in a given region, or closest to a given scalar $\sigma$, together with the corresponding eigenvector.

Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of $A$. Suppose $\lambda_1$ is simple and $\sigma \approx \lambda_1$. Then
$$\mu_1 = \frac{1}{\lambda_1 - \sigma}, \quad \mu_2 = \frac{1}{\lambda_2 - \sigma}, \quad \dots, \quad \mu_n = \frac{1}{\lambda_n - \sigma}$$
are eigenvalues of $(A - \sigma I)^{-1}$, and $\mu_1 \to \infty$ as $\sigma \to \lambda_1$. Thus we transform $\lambda_1$ into a dominant eigenvalue $\mu_1$.
The inverse power method is simply the power method applied to $(A - \sigma I)^{-1}$.
Let
$$y = (A - \sigma I)^{-1} x \quad \text{and} \quad \hat x = y/\|y\|_2.$$
It holds that
$$(A - \sigma I)\hat x = \frac{x}{\|y\|_2} \equiv w.$$
Set
$$\rho = \hat x^*(A - \sigma I)\hat x = \hat x^* w.$$
Then
$$r = [A - (\sigma + \rho) I]\hat x = (A - \sigma I)\hat x - \rho \hat x = w - \rho \hat x.$$

Algorithm 3 (Inverse power method with a fixed shift)
Choose an initial $u_0 \neq 0$.
For $i = 0, 1, 2, \dots$
    Compute $v_{i+1} = (A - \sigma I)^{-1} u_i$ and $k_{i+1} = \ell(v_{i+1})$;
    Set $u_{i+1} = v_{i+1}/k_{i+1}$.
The convergence rate of Algorithm 3 is $|\lambda_1 - \sigma| / |\lambda_2 - \sigma|$, where $\lambda_1$ and $\lambda_2$ are the closest and the second closest eigenvalues to $\sigma$. Algorithm 3 is linearly convergent.

Algorithm 4 (Inverse power method with variable shifts)
Choose an initial $u_0 \neq 0$ and set $\sigma_0 = \sigma$.
For $i = 0, 1, 2, \dots$
    Compute $v_{i+1} = (A - \sigma_i I)^{-1} u_i$ and $k_{i+1} = \ell(v_{i+1})$;
    Set $u_{i+1} = v_{i+1}/k_{i+1}$ and $\sigma_{i+1} = \sigma_i + 1/k_{i+1}$.

This algorithm is locally quadratically convergent.
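A sketch of Algorithm 4 in NumPy (illustrative names and data; here $\ell(v) = e_1^T v$). Starting from $\sigma_0 = 4$, the shift converges to $(7+\sqrt 5)/2 \approx 4.618$, the eigenvalue of the example matrix closest to $4$:

```python
import numpy as np

def inverse_power_variable_shift(A, sigma, u0, ell, iters=6):
    """Inverse power method with variable shifts (a sketch of Algorithm 4).

    The shift update sigma_{i+1} = sigma_i + 1/k_{i+1} is what makes
    the iteration locally quadratically convergent."""
    n = A.shape[0]
    u = np.asarray(u0, dtype=float)
    for _ in range(iters):
        v = np.linalg.solve(A - sigma * np.eye(n), u)  # v_{i+1} = (A - sigma_i I)^{-1} u_i
        k = ell(v)                                     # k_{i+1} = ell(v_{i+1})
        u = v / k
        sigma = sigma + 1.0 / k                        # sigma_{i+1} = sigma_i + 1/k_{i+1}
    return sigma, u

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
sigma, u = inverse_power_variable_shift(A, 4.0, np.array([1.0, 1.0]), lambda v: v[0])
```

A handful of iterations suffices: the shift error roughly squares at every step.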
Connection with Newton's method
Consider the nonlinear equations
$$F\begin{pmatrix} u \\ \lambda \end{pmatrix} \equiv \begin{bmatrix} Au - \lambda u \\ \ell^T u - 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \qquad (3)$$
Newton's method for (3): for $i = 0, 1, 2, \dots$
$$\begin{bmatrix} u_{i+1} \\ \lambda_{i+1} \end{bmatrix} = \begin{bmatrix} u_i \\ \lambda_i \end{bmatrix} - F'\begin{pmatrix} u_i \\ \lambda_i \end{pmatrix}^{-1} F\begin{pmatrix} u_i \\ \lambda_i \end{pmatrix}.$$
Since
$$F'\begin{pmatrix} u \\ \lambda \end{pmatrix} = \begin{bmatrix} A - \lambda I & -u \\ \ell^T & 0 \end{bmatrix},$$
Newton's method can be rewritten componentwise as
$$(A - \lambda_i I) u_{i+1} = (\lambda_{i+1} - \lambda_i) u_i, \qquad (4)$$
$$\ell^T u_{i+1} = 1. \qquad (5)$$
Let
$$v_{i+1} = \frac{u_{i+1}}{\lambda_{i+1} - \lambda_i}.$$
Substituting $v_{i+1}$ into (4), we get
$$(A - \lambda_i I) v_{i+1} = u_i.$$
By equation (5), we have
$$k_{i+1} = \ell(v_{i+1}) = \frac{\ell(u_{i+1})}{\lambda_{i+1} - \lambda_i} = \frac{1}{\lambda_{i+1} - \lambda_i}.$$
It follows that
$$\lambda_{i+1} = \lambda_i + \frac{1}{k_{i+1}}.$$
Hence Newton's iterations (4) and (5) are identical with Algorithm 4.
Algorithm 5 (Inverse power method with Rayleigh quotient shift)
Choose an initial $u_0 \neq 0$ with $\|u_0\|_2 = 1$ and compute $\sigma_0 = u_0^T A u_0$.
For $i = 0, 1, 2, \dots$
    Compute $v_{i+1} = (A - \sigma_i I)^{-1} u_i$;
    Set $u_{i+1} = v_{i+1}/\|v_{i+1}\|_2$ and $\sigma_{i+1} = u_{i+1}^T A u_{i+1}$.

For symmetric $A$, Algorithm 5 is cubically convergent.
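A sketch of Algorithm 5 for a symmetric example; the matrix and starting vector are illustrative choices, not data from the notes. The eigenvalues of this tridiagonal matrix are $3$ and $3 \pm \sqrt 3$, and the iteration converges to $3 + \sqrt 3$:

```python
import numpy as np

def rayleigh_quotient_iteration(A, u0, iters=5):
    """Inverse power method with Rayleigh quotient shifts (a sketch of
    Algorithm 5). For symmetric A the iteration is cubically convergent,
    so a handful of iterations reaches machine precision."""
    n = A.shape[0]
    u = np.asarray(u0, dtype=float)
    u = u / np.linalg.norm(u)
    sigma = u @ A @ u                 # sigma_0 = u_0^T A u_0
    for _ in range(iters):
        v = np.linalg.solve(A - sigma * np.eye(n), u)
        u = v / np.linalg.norm(v)
        sigma = u @ A @ u             # Rayleigh quotient shift
    return sigma, u

# Eigenvalues of this matrix are 3 and 3 +- sqrt(3).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
sigma, u = rayleigh_quotient_iteration(A, np.array([0.0, 0.0, 1.0]))
```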
The explicitly shifted QR algorithm

The QR algorithm is an iterative method for reducing a matrix $A$ to triangular form by unitary similarity transformations.

Algorithm (explicitly shifted QR algorithm)
Set $A_0 = A$.
For $k = 0, 1, 2, \dots$
    Choose a shift $\sigma_k$;
    Factor $A_k - \sigma_k I = Q_k R_k$, where $Q_k$ is orthogonal and $R_k$ is upper triangular;
    $A_{k+1} = R_k Q_k + \sigma_k I$;
end for
Since
$$A_k - \sigma_k I = Q_k R_k \;\Longrightarrow\; R_k = Q_k^*(A_k - \sigma_k I),$$
it holds that
$$A_{k+1} = R_k Q_k + \sigma_k I = Q_k^*(A_k - \sigma_k I) Q_k + \sigma_k I = Q_k^* A_k Q_k.$$
The algorithm is a variant of the power method.
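One step of the explicitly shifted QR algorithm is easy to express with a library QR factorization. The following sketch (illustrative example matrix, not from the notes) uses the Rayleigh quotient shift $\sigma_k = (A_k)_{nn}$ and shows the trailing subdiagonal entry being driven to zero while the eigenvalues are preserved:

```python
import numpy as np

def qr_step(A, sigma):
    """One explicitly shifted QR step: factor A - sigma*I = QR,
    then recombine A_next = RQ + sigma*I = Q^* A Q."""
    n = A.shape[0]
    Q, R = np.linalg.qr(A - sigma * np.eye(n))
    return R @ Q + sigma * np.eye(n)

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
Ak = A.copy()
for _ in range(30):
    Ak = qr_step(Ak, Ak[-1, -1])   # Rayleigh quotient shift
```

After the sweep, `Ak[-1, -1]` approximates an eigenvalue of $A$ and `Ak[-1, -2]` is negligible, so the problem deflates.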
The QR algorithm and the inverse power method

Let $Q = [\hat Q \; q]$ be unitary and write
$$Q^* A Q = \begin{bmatrix} \hat Q^* A \hat Q & \hat Q^* A q \\ q^* A \hat Q & q^* A q \end{bmatrix} \equiv \begin{bmatrix} \hat B & \hat h \\ \hat g^* & \hat\mu \end{bmatrix}.$$
If $(\lambda, q)$ is a left eigenpair of $A$, then
$$\hat g^* = q^* A \hat Q = \lambda q^* \hat Q = 0 \quad \text{and} \quad \hat\mu = q^* A q = \lambda q^* q = \lambda.$$
That is,
$$Q^* A Q = \begin{bmatrix} \hat B & \hat h \\ 0 & \lambda \end{bmatrix}.$$
But this is not an effective computational procedure because it requires $q$ to be an eigenvector of $A$.
Let $q$ be an approximate left eigenvector of $A$ with $q^* q = 1$, and set $\hat\mu = q^* A q$ and $r^* = q^* A - \hat\mu q^*$. Then
$$r^* [\hat Q \; q] = (q^* A - \hat\mu q^*) [\hat Q \; q] = \big[\, q^* A \hat Q - \hat\mu q^* \hat Q \;\;\; q^* A q - \hat\mu q^* q \,\big] = \big[\, q^* A \hat Q \;\;\; 0 \,\big] = \big[\, \hat g^* \;\;\; 0 \,\big].$$
Therefore,
$$\|\hat g\|_2 = \big\| r^* [\hat Q \; q] \big\|_2 = \|r\|_2.$$
The QR algorithm implicitly chooses $q$ to be a vector produced by the inverse power method with shift $\sigma$.
Write the QR factorization of $A - \sigma I$ as
$$\begin{bmatrix} \hat Q^* \\ q^* \end{bmatrix} (A - \sigma I) = R \equiv \begin{bmatrix} \hat R \\ r^* \end{bmatrix}.$$
It holds that
$$q^*(A - \sigma I) = r^* = r_{nn} e_n^T \;\Longrightarrow\; q^* = r_{nn} e_n^T (A - \sigma I)^{-1}. \qquad (6)$$
Hence the last column of $Q$ generated by the QR algorithm is the result of the inverse power method with shift $\sigma$ applied to $e_n$.

Question
How should the shift $\sigma$ be chosen?
Let
$$A = \begin{bmatrix} B & h \\ g^* & \mu \end{bmatrix}.$$
Then
$$e_n^T A e_n = \mu \quad \text{and} \quad e_n^T A - \mu e_n^T = \big[\, g^* \;\;\; \mu \,\big] - \mu e_n^T = \big[\, g^* \;\;\; 0 \,\big].$$
If we take $(\mu, e_n)$ to be an approximate left eigenpair of $A$, then the corresponding residual norm is $\|g\|_2$.
If $g$ is small, then $\mu$ should approximate an eigenvalue of $A$, and we choose $\sigma = \mu = e_n^T A e_n$ (the Rayleigh quotient shift).

Question
Why does the QR algorithm converge?
Let
$$A - \sigma I \equiv \begin{bmatrix} B - \sigma I & h \\ g^* & \mu - \sigma \end{bmatrix} = QR \equiv \begin{bmatrix} P & f \\ e^* & \pi \end{bmatrix} \begin{bmatrix} S & r \\ 0 & \rho \end{bmatrix} \qquad (7)$$
be the QR factorization of $A - \sigma I$. Take
$$\hat A \equiv \begin{bmatrix} \hat B & \hat h \\ \hat g^* & \hat\mu \end{bmatrix} = RQ + \sigma I. \qquad (8)$$
Since $Q$ is unitary, we have
$$\|e\|_2^2 + |\pi|^2 = \|f\|_2^2 + |\pi|^2 = 1,$$
which implies that
$$\|e\|_2 = \|f\|_2 \quad \text{and} \quad |\pi| \leq 1.$$
From (7), we have $g^* = e^* S$. Assume $S$ is nonsingular and set $\kappa = \|S^{-1}\|_2$; then $\|e\|_2 \leq \kappa \|g\|_2$. Since
$$R \equiv \begin{bmatrix} S & r \\ 0 & \rho \end{bmatrix} = Q^*(A - \sigma I) \equiv \begin{bmatrix} P^* & e \\ f^* & \bar\pi \end{bmatrix} \begin{bmatrix} B - \sigma I & h \\ g^* & \mu - \sigma \end{bmatrix},$$
it implies that
$$\rho = f^* h + \bar\pi(\mu - \sigma),$$
and then
$$|\rho| \leq \|f\|_2 \|h\|_2 + |\pi|\,|\mu - \sigma| = \|e\|_2 \|h\|_2 + |\pi|\,|\mu - \sigma| \leq \kappa \|g\|_2 \|h\|_2 + |\mu - \sigma|.$$
From (8), we have $\hat g^* = \rho e^*$, which implies
$$\|\hat g\|_2 \leq |\rho|\,\|e\|_2 \leq |\rho|\,\kappa \|g\|_2 \leq \kappa^2 \|h\|_2 \|g\|_2^2 + \kappa\,|\mu - \sigma|\,\|g\|_2.$$
Consequently,
$$\|g_{j+1}\|_2 \leq \kappa_j^2 \|h_j\|_2 \|g_j\|_2^2 + \kappa_j\,|\mu_j - \sigma_j|\,\|g_j\|_2.$$
If $g_0$ is sufficiently small and $\mu_0$ is sufficiently near a simple eigenvalue $\lambda$, then $g_j \to 0$ and $\mu_j \to \lambda$. Assume there exist $\eta$ and $\kappa$ such that
$$\|h_j\|_2 \leq \eta \quad \text{and} \quad \kappa_j = \|S_j^{-1}\|_2 \leq \kappa.$$
Take the Rayleigh quotient shift $\sigma_j = \mu_j$. Then
$$\|g_{j+1}\|_2 \leq \kappa^2 \eta \|g_j\|_2^2,$$
which means that $\|g_j\|_2$ converges at least quadratically to zero.
If $A_0$ is Hermitian, then every $A_k$ is also Hermitian. It holds that $h_j = g_j$, and then
$$\|g_{j+1}\|_2 \leq \kappa^2 \|g_j\|_2^3.$$
Therefore, the convergence rate is cubic.
The unshifted QR algorithm

The QR algorithm computes
$$A_{k+1} = Q_k^* A_k Q_k,$$
or
$$A_{k+1} = Q_k^* Q_{k-1}^* \cdots Q_0^* A_0 Q_0 \cdots Q_{k-1} Q_k$$
for $k = 0, 1, 2, \dots$. Let
$$\hat Q_k = Q_0 \cdots Q_{k-1} Q_k.$$
Then
$$A_{k+1} = \hat Q_k^* A_0 \hat Q_k.$$
Theorem
Let $Q_0, \dots, Q_k$ and $R_0, \dots, R_k$ be the orthogonal and triangular matrices generated by the QR algorithm with shifts $\sigma_0, \dots, \sigma_k$ starting with $A$. Let
$$\hat Q_k = Q_0 \cdots Q_k \quad \text{and} \quad \hat R_k = R_k \cdots R_0.$$
Then
$$\hat Q_k \hat R_k = (A - \sigma_k I) \cdots (A - \sigma_0 I),$$
and since the shifted factors commute, this product also equals $(A - \sigma_0 I) \cdots (A - \sigma_k I)$.

Proof: Since
$$R_k = (A_{k+1} - \sigma_k I) Q_k^* = \hat Q_k^*(A - \sigma_k I)\hat Q_k Q_k^* = \hat Q_k^*(A - \sigma_k I)\hat Q_{k-1},$$
it follows that
$$\hat R_k = R_k \hat R_{k-1} = \hat Q_k^*(A - \sigma_k I)\hat Q_{k-1} \hat R_{k-1}$$
and
$$\hat Q_k \hat R_k = (A - \sigma_k I)\hat Q_{k-1}\hat R_{k-1}.$$
By induction on $\hat Q_{k-1}\hat R_{k-1}$, we have
$$\hat Q_k \hat R_k = (A - \sigma_k I) \cdots (A - \sigma_0 I).$$
If $\sigma_k = 0$ for $k = 0, 1, 2, \dots$, then $\hat Q_k \hat R_k = A^{k+1}$ and
$$\hat r_{11}^{(k)} \hat q_1^{(k)} = \hat Q_k \hat R_k e_1 = A^{k+1} e_1.$$
This implies that the first column of $\hat Q_k$ is the normalized result of applying $k+1$ iterations of the power method to $e_1$.
Hence $\hat q_1^{(k)}$ approaches the dominant eigenvector of $A$; i.e., if
$$A_k = \hat Q_k^* A \hat Q_k = \begin{bmatrix} \mu_k & h_k^* \\ g_k & B_k \end{bmatrix},$$
then $g_k \to 0$ and $\mu_k \to \lambda_1$, where $\lambda_1$ is the dominant eigenvalue of $A$.

Theorem
Let
$$X^{-1} A X = \Lambda \equiv \operatorname{diag}(\lambda_1, \dots, \lambda_n),$$
where $|\lambda_1| > |\lambda_2| > \cdots > |\lambda_n| > 0$. Suppose $X^{-1}$ has an LU factorization $X^{-1} = LU$, where $L$ is unit lower triangular, and let $X = QR$ be the QR factorization of $X$. If $A^k$ has the QR factorization $A^k = \hat Q_k \hat R_k$, then there exist diagonal matrices $D_k$ with $|D_k| = I$ such that $\hat Q_k D_k \to Q$.
Proof: By the assumptions, we get
$$A^k = X \Lambda^k X^{-1} = QR\Lambda^k LU = QR(\Lambda^k L \Lambda^{-k})(\Lambda^k U).$$
Since
$$(\Lambda^k L \Lambda^{-k})_{ij} = \ell_{ij} (\lambda_i/\lambda_j)^k \to 0 \quad \text{for } i > j,$$
it holds that
$$\Lambda^k L \Lambda^{-k} \to I \quad \text{as } k \to \infty.$$
Let
$$\Lambda^k L \Lambda^{-k} = I + E_k, \quad \text{where } E_k \to 0 \text{ as } k \to \infty.$$
Then
$$A^k = QR(I + E_k)(\Lambda^k U) = Q(I + R E_k R^{-1})(R \Lambda^k U).$$
Let
$$I + R E_k R^{-1} = \bar Q_k \bar R_k$$
be the QR factorization of $I + R E_k R^{-1}$. Then
$$A^k = (Q \bar Q_k)(\bar R_k R \Lambda^k U).$$
Since $I + R E_k R^{-1} \to I$ as $k \to \infty$, we have $\bar Q_k \to I$ as $k \to \infty$. Let the diagonals of $\bar R_k R \Lambda^k U$ be $\delta_1, \dots, \delta_n$ and set
$$D_k = \operatorname{diag}(\bar\delta_1/|\delta_1|, \dots, \bar\delta_n/|\delta_n|).$$
Then
$$A^k = (Q \bar Q_k D_k^{-1})(D_k \bar R_k R \Lambda^k U) = \hat Q_k \hat R_k.$$
Since the diagonals of $D_k \bar R_k R \Lambda^k U$ and $\hat R_k$ are positive, by the uniqueness of the QR factorization
$$\hat Q_k = Q \bar Q_k D_k^{-1},$$
which implies that
$$\hat Q_k D_k = Q \bar Q_k \to Q \quad \text{as } k \to \infty.$$

Remark:
(i) Since $X^{-1} A X = \Lambda \equiv \operatorname{diag}(\lambda_1, \dots, \lambda_n)$, we have
$$A = X \Lambda X^{-1} = (QR)\Lambda(QR)^{-1} = Q(R \Lambda R^{-1})Q^* \equiv Q T Q^*,$$
which is a Schur decomposition of $A$. Therefore, the columns of $\hat Q_k D_k$ converge to the Schur vectors of $A$, and $A_k = \hat Q_k^* A \hat Q_k$ converges to the triangular factor of the Schur decomposition of $A$.
(ii) Write
$$R(\Lambda^k L \Lambda^{-k}) = \begin{bmatrix} R_{11} & r_{1,i} & R_{1,i+1} \\ 0 & r_{ii} & r_{i,i+1}^* \\ 0 & 0 & R_{i+1,i+1} \end{bmatrix} \begin{bmatrix} L_{11}^{(k)} & 0 & 0 \\ \ell_{i,1}^{(k)*} & 1 & 0 \\ L_{i+1,1}^{(k)} & \ell_{i+1,i}^{(k)} & L_{i+1,i+1}^{(k)} \end{bmatrix}.$$
If $\ell_{i,1}^{(k)*}$, $L_{i+1,1}^{(k)}$ and $\ell_{i+1,i}^{(k)}$ are zero, then
$$R(\Lambda^k L \Lambda^{-k}) = \begin{bmatrix} R_{11} L_{11}^{(k)} & r_{1,i} & R_{1,i+1} L_{i+1,i+1}^{(k)} \\ 0 & r_{ii} & r_{i,i+1}^* L_{i+1,i+1}^{(k)} \\ 0 & 0 & R_{i+1,i+1} L_{i+1,i+1}^{(k)} \end{bmatrix}$$
and
$$I + R E_k R^{-1} = R(I + E_k)R^{-1} = R(\Lambda^k L \Lambda^{-k})R^{-1} = \begin{bmatrix} G_{11} & g_{1,i} & G_{1,i+1} \\ 0 & g_{ii} & g_{i,i+1}^* \\ 0 & 0 & G_{i+1,i+1} \end{bmatrix} = \bar Q_k \bar R_k \quad (\text{QR factorization}),$$
which implies that
$$\bar Q_k = \operatorname{diag}(\bar Q_{11}^{(k)}, \omega, \bar Q_{i+1,i+1}^{(k)})$$
and
$$A_k = \hat Q_k^* A \hat Q_k = \bar Q_k^* Q^* A Q \bar Q_k = \bar Q_k^* T \bar Q_k = \begin{bmatrix} A_{11}^{(k)} & a_{1,i}^{(k)} & A_{1,i+1}^{(k)} \\ 0 & \lambda_i & A_{i,i+1}^{(k)} \\ 0 & 0 & A_{i+1,i+1}^{(k)} \end{bmatrix}.$$
Therefore, $A_k$ decouples at its $i$th diagonal element.
The rate of convergence is at least as fast as the approach of $\max\{|\lambda_i/\lambda_{i-1}|, |\lambda_{i+1}/\lambda_i|\}^k$ to zero.
Hessenberg form

Definition
A Householder transformation or elementary reflector is a matrix of the form
$$H = I - uu^*, \quad \text{where } \|u\|_2 = \sqrt{2}.$$
Note that $H$ is Hermitian and unitary.

Theorem
Let $x$ be a vector such that $\|x\|_2 = 1$ and $x_1$ is real and nonnegative. Let
$$u = (x + e_1)/\sqrt{1 + x_1}.$$
Then
$$Hx = (I - uu^*)x = -e_1.$$
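The construction in the theorem can be checked directly in NumPy (a sketch for the real case; the vector $x$ below is an arbitrary unit vector with $x_1 \geq 0$):

```python
import numpy as np

def householder_unit(x):
    """Householder vector u = (x + e1)/sqrt(1 + x[0]) for a real unit
    vector x with x[0] >= 0, so that (I - u u^T) x = -e1."""
    e1 = np.zeros_like(x)
    e1[0] = 1.0
    return (x + e1) / np.sqrt(1.0 + x[0])

x = np.array([0.6, 0.8, 0.0])        # ||x||_2 = 1, x[0] >= 0
u = householder_unit(x)
H = np.eye(3) - np.outer(u, u)       # H = I - u u^*
```

As the definition requires, $\|u\|_2 = \sqrt 2$, and $H$ is symmetric and orthogonal, hence its own inverse.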
Proof:
$$(I - uu^*)x = x - (u^* x)u = x - \frac{x^* x + x_1}{\sqrt{1 + x_1}} \cdot \frac{x + e_1}{\sqrt{1 + x_1}} = x - (x + e_1) = -e_1.$$

Theorem
Let $x$ be a vector with $x_1 \neq 0$. Let
$$u = \frac{\rho\, x/\|x\|_2 + e_1}{\sqrt{1 + \rho\, x_1/\|x\|_2}}, \quad \text{where } \rho = \bar x_1/|x_1|.$$
Then
$$Hx = -\bar\rho \|x\|_2\, e_1.$$
Proof: Since
$$\big[\bar\rho\, x^*/\|x\|_2 + e_1^T\big]\big[\rho\, x/\|x\|_2 + e_1\big] = \bar\rho\rho + \rho\, x_1/\|x\|_2 + \bar\rho\, \bar x_1/\|x\|_2 + 1 = 2\big[1 + \rho\, x_1/\|x\|_2\big],$$
it follows that
$$u^* u = 2 \;\Rightarrow\; \|u\|_2 = \sqrt{2}$$
and
$$u^* x = \frac{\bar\rho \|x\|_2 + x_1}{\sqrt{1 + \rho\, x_1/\|x\|_2}}.$$
Hence,
$$Hx = x - (u^* x)u = x - \frac{\bar\rho\|x\|_2 + x_1}{\sqrt{1 + \rho\, x_1/\|x\|_2}} \cdot \frac{\rho\, x/\|x\|_2 + e_1}{\sqrt{1 + \rho\, x_1/\|x\|_2}}$$
$$= \left[1 - \frac{(\bar\rho\|x\|_2 + x_1)\,\rho/\|x\|_2}{1 + \rho\, x_1/\|x\|_2}\right] x \;-\; \frac{\bar\rho\|x\|_2 + x_1}{1 + \rho\, x_1/\|x\|_2}\, e_1 = -\frac{\bar\rho\|x\|_2 + x_1}{1 + \rho\, x_1/\|x\|_2}\, e_1 = -\bar\rho\|x\|_2\, e_1.$$
• Reduction to Hessenberg form
Take
$$A = \begin{bmatrix} \alpha_{11} & a_{12}^* \\ a_{21} & A_{22} \end{bmatrix}.$$
Let $\hat H_1$ be a Householder transformation such that $\hat H_1 a_{21} = \nu_1 e_1$. Set $H_1 = \operatorname{diag}(1, \hat H_1)$. Then
$$H_1 A H_1 = \begin{bmatrix} \alpha_{11} & a_{12}^* \hat H_1 \\ \hat H_1 a_{21} & \hat H_1 A_{22} \hat H_1 \end{bmatrix} = \begin{bmatrix} \alpha_{11} & a_{12}^* \hat H_1 \\ \nu_1 e_1 & \hat H_1 A_{22} \hat H_1 \end{bmatrix}.$$
For the general step, suppose $H_1, \dots, H_{k-1}$ are Householder transformations such that
$$H_{k-1} \cdots H_1 A H_1 \cdots H_{k-1} = \begin{bmatrix} A_{11} & a_{1,k} & A_{1,k+1} \\ 0 & \alpha_{kk} & a_{k,k+1}^* \\ 0 & a_{k+1,k} & A_{k+1,k+1} \end{bmatrix},$$
where $A_{11}$ is a Hessenberg matrix of order $k-1$. Let $\hat H_k$ be a Householder transformation such that $\hat H_k a_{k+1,k} = \nu_k e_1$. Set $H_k = \operatorname{diag}(I_k, \hat H_k)$; then
$$H_k H_{k-1} \cdots H_1 A H_1 \cdots H_{k-1} H_k = \begin{bmatrix} A_{11} & a_{1,k} & A_{1,k+1}\hat H_k \\ 0 & \alpha_{kk} & a_{k,k+1}^*\hat H_k \\ 0 & \nu_k e_1 & \hat H_k A_{k+1,k+1}\hat H_k \end{bmatrix}.$$
× × × × ×        × × × × ×        × × × × ×        × × × × ×
× × × × ×   H1   × × × × ×   H2   × × × × ×   H3   × × × × ×
× × × × ×   -->  0 × × × ×   -->  0 × × × ×   -->  0 × × × ×
× × × × ×        0 × × × ×        0 0 × × ×        0 0 × × ×
× × × × ×        0 × × × ×        0 0 × × ×        0 0 0 × ×
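The reduction sketched above can be implemented in a few lines for real matrices. This is an illustrative sketch, not the notes' own code; the `copysign` sign choice is the usual numerically safe variant of the reflector construction:

```python
import numpy as np

def to_hessenberg(A):
    """Reduce a real matrix to upper Hessenberg form by Householder
    similarity transformations H_{n-2} ... H_1 A H_1 ... H_{n-2}."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    for k in range(n - 2):
        x = H[k + 1:, k].copy()
        if np.linalg.norm(x[1:]) == 0.0:   # column already in Hessenberg shape
            continue
        v = x
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        v /= np.linalg.norm(v)
        # two-sided update with I - 2 v v^T (a similarity transformation)
        H[k + 1:, k:] -= 2.0 * np.outer(v, v @ H[k + 1:, k:])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
    return H

A = np.array([[4.0, 1.0, 2.0, 3.0],
              [1.0, 3.0, 0.0, 1.0],
              [2.0, 0.0, 1.0, 2.0],
              [3.0, 1.0, 2.0, 5.0]])
Hm = to_hessenberg(A)
```

Because only orthogonal similarities are applied, the eigenvalues of the result agree with those of $A$.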
Definition (Givens rotation)
A plane rotation (also called a Givens rotation) is a matrix of the form
$$P = \begin{bmatrix} c & s \\ -\bar s & \bar c \end{bmatrix}, \quad \text{where } |c|^2 + |s|^2 = 1.$$
Given $a \neq 0$ and $b$, set
$$v = \sqrt{|a|^2 + |b|^2}, \qquad c = |a|/v \qquad \text{and} \qquad s = \frac{a}{|a|} \cdot \frac{\bar b}{v};$$
then
$$\begin{bmatrix} c & s \\ -\bar s & \bar c \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{pmatrix} v\, a/|a| \\ 0 \end{pmatrix}.$$
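The construction of $c$ and $s$ can be transcribed directly (a sketch; the example data is real, although the formulas cover the complex case):

```python
import numpy as np

def givens(a, b):
    """c, s such that [[c, s], [-conj(s), conj(c)]] @ [a, b] = [v*a/|a|, 0],
    with v = sqrt(|a|^2 + |b|^2); a != 0 is assumed."""
    v = np.hypot(abs(a), abs(b))
    c = abs(a) / v
    s = (a / abs(a)) * (np.conj(b) / v)
    return c, s

a, b = 3.0, 4.0
c, s = givens(a, b)
P = np.array([[c, s], [-np.conj(s), np.conj(c)]])
y = P @ np.array([a, b])
```

For this real example $v = 5$, so the rotation maps $(3, 4)$ to $(5, 0)$.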
Let $P_{ij}$ denote the identity matrix with the rotation embedded in rows and columns $i$ and $j$:
$$P_{ij} = \begin{bmatrix} I_{i-1} & & & & \\ & c & & s & \\ & & I_{j-i-1} & & \\ & -\bar s & & \bar c & \\ & & & & I_{n-j} \end{bmatrix}.$$
Premultiplying by $P_{12}$, $P_{13}$, $P_{14}$ in turn annihilates the subdiagonal entries of the first column ($+$ marks an entry that has just changed):

× × × ×        + + + +        + + + +        + + + +
× × × ×   P12  0 + + +   P13  0 × × ×   P14  0 × × ×
× × × ×   -->  × × × ×   -->  0 + + +   -->  0 × × ×
× × × ×        × × × ×        × × × ×        0 + + +
Similarly, postmultiplying by rotations annihilates the entries of the first row:

× × × ×        + 0 × ×        + 0 0 ×        + 0 0 0
× × × ×   P12  + + × ×   P13  + × + ×   P14  + × × +
× × × ×   -->  + + × ×   -->  + × + ×   -->  + × × +
× × × ×        + + × ×        + × + ×        + × × +
(i) Reduce a matrix to upper Hessenberg form by unitary similarity transformations:

× × × × ×             × × × × ×             × × × × ×             × × × × ×
× × × × ×   Q1AQ1*    × × × × ×   Q2AQ2*    × × × × ×   Q3AQ3*    × × × × ×
× × × × ×   ------>   0 × × × ×   ------>   0 × × × ×   ------>   0 × × × ×
× × × × ×             0 × × × ×             0 0 × × ×             0 0 × × ×
× × × × ×             0 × × × ×             0 0 × × ×             0 0 0 × ×

The final matrix is upper Hessenberg.
(ii) Reduce an upper Hessenberg matrix to upper triangular form by Givens rotations:

× × × × ×            × × × × ×            × × × × ×            × × × × ×            × × × × ×
× × × × ×   P12·A1   0 × × × ×   P23·A2   0 × × × ×   P34·A3   0 × × × ×   P45·A4   0 × × × ×
0 × × × ×   ----->   0 × × × ×   ----->   0 0 × × ×   ----->   0 0 × × ×   ----->   0 0 × × ×
0 0 × × ×            0 0 × × ×            0 0 × × ×            0 0 0 × ×            0 0 0 × ×
0 0 0 × ×            0 0 0 × ×            0 0 0 × ×            0 0 0 × ×            0 0 0 0 ×

The result is $T$ (upper triangular).
Multiplying $T$ on the right by the conjugate transposes of the same rotations restores upper Hessenberg form:

× × × × ×            + + × × ×            × + + × ×            × × + + ×            × × × + +
0 × × × ×   A1·P12*  + + × × ×   A2·P23*  × + + × ×   A3·P34*  × × + + ×   A4·P45*  × × × + +
0 0 × × ×   ----->   0 0 × × ×   ----->   0 + + × ×   ----->   0 × + + ×   ----->   0 × × + +
0 0 0 × ×            0 0 0 × ×            0 0 0 × ×            0 0 + + ×            0 0 × + +
0 0 0 0 ×            0 0 0 0 ×            0 0 0 0 ×            0 0 0 0 ×            0 0 0 + +

The result is $H$ (upper Hessenberg).
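The two sweeps above combine into one shifted QR step that maps a Hessenberg matrix to a Hessenberg matrix. A real-arithmetic sketch with illustrative example data:

```python
import numpy as np

def hessenberg_qr_step(H, mu):
    """One shifted QR step on an upper Hessenberg H:
    P_{n-1,n} ... P_{12} (H - mu I) = T (upper triangular), then
    H_next = T P_{12}^T ... P_{n-1,n}^T + mu I, again upper Hessenberg."""
    n = H.shape[0]
    T = H - mu * np.eye(n)
    rots = []
    for i in range(n - 1):                 # left sweep: eliminate T[i+1, i]
        a, b = T[i, i], T[i + 1, i]
        r = np.hypot(a, b)
        c, s = ((1.0, 0.0) if r == 0.0 else (a / r, b / r))
        G = np.array([[c, s], [-s, c]])
        T[i:i + 2, :] = G @ T[i:i + 2, :]
        rots.append(G)
    for i, G in enumerate(rots):           # right sweep: T := T P_{i,i+1}^T
        T[:, i:i + 2] = T[:, i:i + 2] @ G.T
    return T + mu * np.eye(n)

H = np.array([[3.0, 1.0, 2.0],
              [1.0, 2.0, 0.5],
              [0.0, 0.5, 1.0]])
H1 = hessenberg_qr_step(H, H[-1, -1])
```

Since the step is a similarity transformation, invariants such as the trace and determinant of $H$ are preserved.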
A practical algorithm for reducing an upper Hessenberg matrix $H$ to Schur form:
1 If the shifted QR algorithm is applied to $H$, then $h_{n,n-1}$ will tend rapidly to zero, and the other subdiagonal elements may also tend to zero, slowly.
2 If $h_{i+1,i} \approx 0$, then deflate the matrix to save computation.

How do we decide that $h_{i+1,i}$ is negligible? If
$$|h_{i+1,i}| \leq \varepsilon \|A\|_F$$
for a small number $\varepsilon$, then $h_{i+1,i}$ is negligible.
Let $Q$ be an orthogonal matrix such that $H = Q^* A Q \equiv [h_{ij}]$ is upper Hessenberg. Let
$$\tilde H = H - h_{i+1,i}\, e_{i+1} e_i^T \quad (\text{the deflated matrix}).$$
Set
$$E = Q(h_{i+1,i}\, e_{i+1} e_i^T)Q^*.$$
Then
$$\tilde H = Q^*(A - E)Q.$$
If $|h_{i+1,i}| \leq \varepsilon \|A\|_F$, then
$$\|E\|_F = \|Q(h_{i+1,i}\, e_{i+1} e_i^T)Q^*\|_F = |h_{i+1,i}| \leq \varepsilon \|A\|_F,$$
or
$$\frac{\|E\|_F}{\|A\|_F} \leq \varepsilon.$$
When $\varepsilon$ equals the rounding unit $\varepsilon_M$, the perturbation $E$ is of a size comparable with the perturbation due to rounding the elements of $A$.
The Wilkinson shift
1 The Rayleigh-quotient shift $\sigma = h_{n,n}$ gives local quadratic convergence to a simple eigenvalue.
2 If $H$ is real, the Rayleigh-quotient shift is also real, and hence cannot approximate a complex eigenvalue.
3 The Wilkinson shift $\mu$: if $\lambda_1, \lambda_2$ are the eigenvalues of
$$\begin{bmatrix} h_{n-1,n-1} & h_{n-1,n} \\ h_{n,n-1} & h_{n,n} \end{bmatrix}$$
with $|\lambda_1 - h_{n,n}| \leq |\lambda_2 - h_{n,n}|$, then $\mu = \lambda_1$.
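A direct transcription of the definition (a sketch; the $2\times 2$ eigenvalues come from the quadratic formula, and a complex shift falls out naturally for a real $H$ with a complex pair):

```python
import numpy as np

def wilkinson_shift(H):
    """Eigenvalue of the trailing 2x2 block of H closest to h_{nn}."""
    a, b = H[-2, -2], H[-2, -1]
    c, d = H[-1, -2], H[-1, -1]
    tr = a + d
    det = a * d - b * c
    disc = np.sqrt(complex(tr * tr - 4.0 * det))   # may be complex for real H
    lam1 = (tr + disc) / 2.0
    lam2 = (tr - disc) / 2.0
    return lam1 if abs(lam1 - d) <= abs(lam2 - d) else lam2

H = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
mu = wilkinson_shift(H)   # trailing block [[3,1],[1,1]] has eigenvalues 2 +- sqrt(2)
```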
Algorithm
do $k = 1, 2, \dots$
    compute the Wilkinson shift $\mu_k$;
    reduce the upper Hessenberg $H_k - \mu_k I$ to upper triangular $T_k$:
    $$P_{n-1,n}^{(k)} \cdots P_{12}^{(k)} (H_k - \mu_k I) = T_k;$$
    compute
    $$H_{k+1} = T_k P_{12}^{(k)*} \cdots P_{n-1,n}^{(k)*} + \mu_k I;$$
end do
$\Rightarrow$ Schur form of $A$ $\Rightarrow$ eigenvalues of $A$.

Question
How do we get the eigenvectors of $A$?
If $A = QTQ^*$ is the Schur decomposition of $A$ and $X$ is the matrix of right eigenvectors of $T$, then $QX$ is the matrix of right eigenvectors of $A$.
If
$$T = \begin{bmatrix} T_{11} & t_{1,k} & T_{1,k+1} \\ 0 & \tau_{kk} & t_{k,k+1}^* \\ 0 & 0 & T_{k+1,k+1} \end{bmatrix}$$
and $\tau_{kk}$ is a simple eigenvalue of $T$, then
$$\begin{bmatrix} -(T_{11} - \tau_{kk} I)^{-1} t_{1,k} \\ 1 \\ 0 \end{bmatrix}$$
is a right eigenvector of $T$ and
$$\big[\; 0 \quad 1 \quad -t_{k,k+1}^*(T_{k+1,k+1} - \tau_{kk} I)^{-1} \;\big]$$
is a left eigenvector of $T$ corresponding to $\tau_{kk}$.
The implicitly shifted QR algorithm

Theorem (Real Schur form)
Let $A$ be real of order $n$. Then there exists an orthogonal matrix $U$ such that
$$U^T A U = \begin{bmatrix} T_{11} & T_{12} & \cdots & T_{1k} \\ 0 & T_{22} & \cdots & T_{2k} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & T_{kk} \end{bmatrix} \quad (\text{quasi-triangular}).$$
The diagonal blocks are of order one or two. The blocks of order one contain the real eigenvalues of $A$; the blocks of order two contain the pairs of complex conjugate eigenvalues of $A$. The blocks can be made to appear in any order.
Proof: Let $(\lambda, x)$ be a complex eigenpair with $\lambda = \mu + i\nu$ and $x = y + iz$. Then
$$2y = x + \bar x, \qquad 2iz = x - \bar x,$$
and
$$Ay = \tfrac{1}{2}[\lambda x + \bar\lambda \bar x] = \tfrac{1}{2}\big[(\mu y - \nu z) + i(\mu z + \nu y) + (\mu y - \nu z) - i(\nu y + \mu z)\big] = \mu y - \nu z. \qquad (9)$$
Similarly,
$$Az = \tfrac{1}{2i}[\lambda x - \bar\lambda \bar x] = \nu y + \mu z. \qquad (10)$$
From (9) and (10), we have
$$A\, [y \; z] = [\mu y - \nu z \;\;\; \nu y + \mu z] = [y \; z] \begin{bmatrix} \mu & \nu \\ -\nu & \mu \end{bmatrix} \equiv [y \; z]\, L.$$
Let
$$[y \; z] = [X_1 \; X_2] \begin{bmatrix} R \\ 0 \end{bmatrix} = X_1 R$$
be a QR factorization of $[y \; z]$. Since $y$ and $z$ are linearly independent, $R$ is nonsingular and $X_1 = [y \; z] R^{-1}$. Consequently,
$$A X_1 = A\, [y \; z] R^{-1} = [y \; z]\, L R^{-1} = X_1 R L R^{-1}.$$
Using this result and the fact that $[X_1 \; X_2]$ is unitary, we have
$$\begin{bmatrix} X_1^T \\ X_2^T \end{bmatrix} A\, [X_1 \; X_2] = \begin{bmatrix} X_1^T A X_1 & X_1^T A X_2 \\ X_2^T A X_1 & X_2^T A X_2 \end{bmatrix} = \begin{bmatrix} R L R^{-1} & X_1^T A X_2 \\ 0 & X_2^T A X_2 \end{bmatrix}. \qquad (11)$$
Since $\lambda$ and $\bar\lambda$ are eigenvalues of $L$ and $R L R^{-1}$ is similar to $L$, (11) completes the deflation of the complex conjugate pair $\lambda$ and $\bar\lambda$.

Remark
$$A X_1 = X_1 (R L R^{-1})$$
$\Rightarrow$ $A$ maps the column space of $X_1$ into itself
$\Rightarrow$ $\operatorname{span}(X_1)$ is called an eigenspace or invariant subspace.

• Francis double shift
1 If the Wilkinson shift $\sigma$ is complex, then $\bar\sigma$ is also a candidate for a shift.
2 Apply two steps of the QR algorithm, one with shift $\sigma$ and the other with shift $\bar\sigma$, to yield a matrix $\hat H$.
Let
$$\hat Q \hat R = (H - \sigma I)(H - \bar\sigma I)$$
be the QR factorization of $(H - \sigma I)(H - \bar\sigma I)$; then
$$\hat H = \hat Q^* H \hat Q.$$
Since
$$(H - \sigma I)(H - \bar\sigma I) = H^2 - 2\operatorname{Re}(\sigma) H + |\sigma|^2 I \in \mathbb{R}^{n \times n},$$
we have $\hat Q \in \mathbb{R}^{n \times n}$ and $\hat H \in \mathbb{R}^{n \times n}$. Therefore, the QR algorithm with two complex conjugate shifts preserves reality.

Francis double shift strategy:
1 Compute the Wilkinson shift $\sigma$;
2 Form the matrix $H^2 - 2\operatorname{Re}(\sigma) H + |\sigma|^2 I =: \tilde H$ ($O(n^3)$ operations);
3 Compute the QR factorization of $\tilde H$: $\tilde H = \hat Q \hat R$;
4 Compute $\hat H = \hat Q^* H \hat Q$.
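Step 2 is the $O(n^3)$ bottleneck that the implicit double shift avoids: for an upper Hessenberg $H$, only the first column of $\tilde H$ is needed to start the step, and it has just three nonzero entries, computable in $O(1)$ from the leading entries of $H$. A sketch with illustrative data:

```python
import numpy as np

def double_shift_first_column(H, sigma):
    """First column of H^2 - 2 Re(sigma) H + |sigma|^2 I for an upper
    Hessenberg H, using only the leading 2x2 block and h_{32}."""
    s = 2.0 * sigma.real            # sigma + conj(sigma)
    p = abs(sigma) ** 2             # sigma * conj(sigma)
    x = np.zeros(H.shape[0])
    x[0] = H[0, 0] * H[0, 0] + H[0, 1] * H[1, 0] - s * H[0, 0] + p
    x[1] = H[1, 0] * (H[0, 0] + H[1, 1] - s)
    x[2] = H[1, 0] * H[2, 1]
    return x

H = np.array([[3.0, 1.0, 2.0, 0.5],
              [1.0, 2.0, 0.5, 1.0],
              [0.0, 0.5, 1.0, 2.0],
              [0.0, 0.0, 1.0, 4.0]])
sigma = 1.0 + 2.0j
x = double_shift_first_column(H, sigma)
```

This column agrees with the first column of the explicitly formed polynomial, which is what the implicit construction exploits.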
• The uniqueness of Hessenberg reduction

Definition
Let $H$ be upper Hessenberg of order $n$. Then $H$ is unreduced if $h_{i+1,i} \neq 0$ for $i = 1, \dots, n-1$.

Theorem (Implicit Q theorem)
Suppose $Q = [q_1 \cdots q_n]$ and $V = [v_1 \cdots v_n]$ are unitary matrices with
$$Q^* A Q = H \quad \text{and} \quad V^* A V = G$$
both upper Hessenberg. Let $k$ denote the smallest positive integer for which $h_{k+1,k} = 0$, with the convention that $k = n$ if $H$ is unreduced. If $v_1 = q_1$, then $v_i = \pm q_i$ and $|h_{i,i-1}| = |g_{i,i-1}|$ for $i = 2, \dots, k$. Moreover, if $k < n$, then $g_{k+1,k} = 0$.
Proof: Define $W \equiv [w_1 \cdots w_n] = V^* Q$. Then
$$GW = G V^* Q = V^* A Q = V^* Q H = W H,$$
which implies that
$$h_{i,i-1} w_i = G w_{i-1} - \sum_{j=1}^{i-1} h_{j,i-1} w_j \quad \text{for } i = 2, \dots, k.$$
Since $v_1 = q_1$, it holds that $w_1 = e_1$ and
$$h_{21} w_2 = G w_1 - h_{11} w_1 = \alpha_{21} e_1 + \alpha_{22} e_2.$$
Assume
$$w_{i-1} = \alpha_{i-1,1} e_1 + \cdots + \alpha_{i-1,i-1} e_{i-1}.$$
Then
$$h_{i,i-1} w_i = G[\alpha_{i-1,1} e_1 + \cdots + \alpha_{i-1,i-1} e_{i-1}] - \sum_{j=1}^{i-1} \beta_{i,j} e_j = \bar\alpha_{i,1} e_1 + \cdots + \bar\alpha_{i,i} e_i.$$
By induction, $[w_1 \cdots w_k]$ is upper triangular. Since $V$ and $Q$ are unitary, $W = V^* Q$ is also unitary, and then
$$w_1^* w_j = 0 \quad \text{for } j = 2, \dots, k.$$
That is,
$$w_{1j} = 0 \quad \text{for } j = 2, \dots, k,$$
which implies that $w_2 = \pm e_2$. Similarly, by
$$w_2^* w_j = 0 \quad \text{for } j = 3, \dots, k,$$
i.e.,
$$w_{2j} = 0 \quad \text{for } j = 3, \dots, k,$$
we get $w_3 = \pm e_3$. By induction,
$$w_i = \pm e_i \quad \text{for } i = 2, \dots, k.$$
Since $w_i = V^* q_i$ and $h_{i,i-1} = w_i^* G w_{i-1}$, we have $v_i = V e_i = \pm V w_i = \pm q_i$ and
$$|h_{i,i-1}| = |g_{i,i-1}| \quad \text{for } i = 2, \dots, k.$$
If $h_{k+1,k} = 0$, then
$$g_{k+1,k} = e_{k+1}^T G e_k = \pm e_{k+1}^T G W e_k = \pm e_{k+1}^T W H e_k = \pm e_{k+1}^T \sum_{i=1}^k h_{ik} w_i = \pm \sum_{i=1}^k h_{ik}\, e_{k+1}^T e_i = 0.$$