# Part II On the Numerical Solutions of Eigenvalue Problems

(1)

(2)
(3)

## The Unsymmetric Eigenvalue Problem

Generalized eigenvalue problem (GEVP):

Given A, B ∈ Cn×n. Determine λ ∈ C and 0 6= x ∈ Cn with Ax = λBx. λ is called an eigenvalue of the pencil A − λB (or pair(A, B)) and x is called an eigen- vector corresponding to λ. λ is an eigenvalue of A − λB ⇐⇒ det(A − λB) = 0.

(σ(A, B) ≡ {λ ∈ C | det(A − λB) = 0}.)

Definition 5.0.2 A pencil A − λB (A, B ∈ Rm×n) or a pair(A, B) is called regular if that

(i) A and B are square matrices of order n, and (ii) det(A − λB) 6≡ 0.

In all other case (m 6= n or m = n but det(A − λB) ≡ 0), the pencil is called singular.

Detailed algebraic structure of a pencil A − λB see Matrix theory II, chapter XII (Gant- macher 1959).

Eigenvalue Problem (EVP):

Special case in GEVP when B = I, we have λ ∈ C and 0 6= x ∈ Cn with Ax = λx. λ is an eigenvalue of A and x is an eigenvector corresponding to λ.

Definition 5.0.3 (a) σ(A) = {λ ∈ C | det(A − λI) = 0} is called the spectrum of A.

(b) ρ(A) = max{| λ |: λ ∈ σ(A)} is called the radius of σ(A).

(c) P (λ) = det(λI − A) is called the characteristic polynomial of A.

Let P (λ) = Ys i=1

(λ − λi)m(λi), λi 6= λj(i 6= j) and Xs

i=1

m(λi) = n.

Example 5.0.2 A =

· 2 2 0 3

¸ , B =

· 1 0 0 0

¸

=⇒ det(A − λB) = 2 − λ and σ(A, B) = {2}.

(4)

Example 5.0.3 A =

· 1 2 0 3

¸ , B =

· 0 1 0 0

¸

=⇒ det(A − λB) = 3 and σ(A, B) = ∅.

Example 5.0.4 A =

· 1 2 0 0

¸ , B =

· 1 0 0 0

¸

=⇒ det(A − λB) = 0 and σ(A, B) = C.

Example 5.0.5 det(µA − λB) = (2µ − λ)µ µ = 1 : Ax = λBx =⇒ λ = 2.

λ = 1 : Bx = µAx =⇒ µ = 0, µ = 12 =⇒ λ = ∞, λ = 2.

σ(A, B) = {2, ∞}.

Example 5.0.6 det(µA − λB) = µ · 3µ µ = 1 : no solution for λ.

λ = 1 : Bx = µAx =⇒ µ = 0, 0.(multiple) σ(A, B) = {∞, ∞}.

Let

m(λi) := algebraic multiplicity of λi.

n(λi) := n − rank(A − λiI) = geometric multiplicity.

1 ≤ n(λi) ≤ m(λi).

If for some i, n(λi) < m(λi), then A is degenerated (defective). The following statements are equivalent:

(a) A is diagonalizable: There exists a nonsingular matrix T such that T−1AT = diag(λ1, · · · , λn).

(b) There are n linearly independent eigenvectors.

(c) A is nondefective, i.e. ∀ λ ∈ σ(A) =⇒ m(λ) = n(λ).

If A is defective then eigenvector + principle vector =⇒ Jordan form.

Theorem 5.0.3 (Jordan decomposition) If A ∈ Cn×n, then there exists a nonsingu- lar X ∈ Cn×n, such that X−1AX = diag(J1, · · · , Jt),where

Ji =





λi 1 0

. .. ...

. .. 1

0 λi





is mi× mi and m1+ · · · + mt= n.

Theorem 5.0.4 (Schur decomposition) If A ∈ Cn×n then there exists a unitary ma- trix U ∈ Cn×n such that UAU(= U−1AU) a upper triangular.

- A normal(i.e. AA = AA) ⇐⇒ ∃ unitary U such that UAU = diag(λ1, · · · , λn), i.e.

Aui = λiui, uiuj = δij.

- A hermitian(i.e. A = A) ⇐⇒ A is normal and σ(A) ⊆ R.

- A symmetric(i.e. AT = A, A ∈ Rn×n) ⇐⇒ ∃ orthogonal U such that UTAU = diag(λ1, · · · , λn) and σ(A) ⊆ R.

(5)

### 5.1 Orthogonal Projections and C-S Decomposition

Definition 5.1.1 Let S ⊆ Rn be a subspace, P ∈ Rn×n is the orthogonal projection onto S if



Range(P ) = S, P2 = P,

PT = P,

(5.1.1)

where Range(P ) = R(P ) = {y ∈ Rn| y = P x, for some x ∈ Rn}.

Remark 5.1.1 If x ∈ Rn, then P x ∈ S and (I − P )x ∈ S.

Example 5.1.1 P = vvT/vTv is the orthogonal projection onto S = span{v}, v ∈ Rn. x

Px

S=span{v}

Figure 5.1: Orthogonal projection

Remark 5.1.2 (i) If P1 and P2 are orthogonal projections, then for any z ∈ Rn we have k (P1− P2)z k22= (P1z)T(I − P2)z + (P2z)T(I − P1)z. (5.1.2) If R(P1) = R(P2) = S then the right-hand side of (5.1.2) is zero. Thus the orthog- onal projection for a subspace is unique.

(ii) If V = [v1, · · · , vk] is an orthogonal basis for S, then P = V VT is unique orthogonal projection onto S.

Definition 5.1.2 Suppose S1 and S2 are subspaces of Rn and dim(S1) = dim(S2). We define the distance between S1 and S2 by

dist(S1, S2) =k P1− P2 k2, (5.1.3) where Pi is the orthogonal projection onto Si, i = 1, 2.

Remark 5.1.3 By considering the case of one-dimensional subspaces in R2, we obtain a geometrical interpretation of dist(·, ·). Suppose S1 = span{x} and S2 = span{y} and

(6)

S

S

=span{y}

=span{x}

2

θ

1

k x k2=k y k2= 1. Assume that xTy = cos θ, θ ∈ [0,π2]. It follows that the difference between the projections onto these spaces satisfies

P1− P2 = xxT − yyT = x[x − (yTx)y]T − [y − (xTy)x]yT. If θ = 0(⇒ x = y), then dist(S1, S2) =k P1− P2 k2= sin θ = 0.

If θ 6= 0, then

Ux = [u1, u2] = [x, −[y − (yTx)x]/ sin θ]

and

Vx = [v1, v2] = [[x − (xTy)y]/ sin θ, y]

are defined and orthogonal. It follows that

P1− P2 = Ux diag[sin θ, sin θ] VxT

is the SVD of P1 − P2. Consequently, dist(S1, S2) = sin θ, the sine of the angle between the two subspaces.

Theorem 5.1.1 (C-S Decomposition, Davis / Kahan(1970) or Stewart(1977)) If Q =

· Q11 Q12 Q21 Q22

¸

is orthogonal with Q11 ∈ Rk×k and Q22 ∈ Rj×j(k ≥ j), then there exists orthogonal matrices U1, V1 ∈ Rk×k and orthogonal matrices U2, V2 ∈ Rj×j such that

· U1 0 0 U2

¸T ·

Q11 Q12 Q21 Q22

¸ · V1 0 0 V2

¸

=

I 0 0

0 C S

0 −S C

 , (5.1.4)

where

C = diag(c1, · · · , cj), ci = cos θi, S = diag(s1, · · · , sj), si = sin θi

and 0 ≤ θ1 ≤ θ2 ≤ · · · ≤ θj π2. Lemma 5.1.1 Let Q =

· Q1 Q2

¸

be orthogonal with Q1 ∈ Rn×n. Then there are unitary matrices U1, U2 and W such that

· U1T 0 0 U2T

¸ · Q1 Q2

¸ W =

· C S

¸

where C = diag(c1, · · · , cj) ≥ 0, and S = diag(s1, · · · , sn) ≥ 0 with c2i + s2i = 1, i = 1, · · · , n.

(7)

Proof: Let U1TQ1W = C be the SVD of Q1. Consider

· U1T 0 0 I

¸ · Q1 Q2

¸ W =

· C

Q2W

¸

has orthogonal columns. Define ˜Q2 ≡ Q2W . Then C2 + ˜QT2Q˜2 = I or ˜QT2Q˜2 = I − C2 diagonal, thus ˜QT2Q˜2 is diagonal. Which means that the nonzero column of ˜Q2 are orthogonal to one another.If all the columns of ˜Q2 are nonzero, set S2 = ˜QT2Q˜2 and U2 = ˜Q2S−1, then we have U2TU2 = I and U2TQ˜2 = S. It follows the decomposition.

If ˜Q2 has zero columns, normalize the nonzero columns and replace the zero columns with an orthogonal basis for the orthogonal complement of the column space of ˜Q2. It is easily verified that U2 so defined is orthogonal and S ≡ U2TQ˜2 is diagonal. It also follows that decomposition.

Theorem 5.1.2 (C-S Decomposition) Let the unitary matrix W ∈ Cn×n be parti- tioned in the form W =

· W11 W12 W21 W22

¸

, where W11∈ Cr×r with r ≤ n2. Then there exist

unitary matrices U = diag(

z}|{r

U1 , z}|{n−r

U2 ) and V = diag(

z}|{r

V1 , z}|{n−r

V2 ) such that

UW V =

Γ −Σ 0

Σ Γ 0

0 0 I

}r }r }n − 2r

, (5.1.5)

where Γ = diag(γ1, · · · , γr) ≥ 0 and Σ = diag(σ1, · · · , σr) ≥ 0 with γi2 + σ2i = 1, i = 1, · · · , r.

Proof: Let Γ = U1W11V1 be the SVD of W11 with the diagonal elements of Γ : γ1 γ2 ≤ · · · ≤ γk < 1 = γk+1 = · · · = γr, i.e.

Γ = diag(Γ0, Ir−k).

The matrix

· W11 W21

¸

V1 has orthogonal columns. Hence

I =

·µ W11 W21

V1

¸ ·µ W11 W21

V1

¸

= Γ2+ (W21V1)(W21V1).

Since I and Γ2 are diagonal, (W21V1)(W21V1) is diagonal. So the columns of W21V1 are orthogonal. Since the ith diagonal of I − Γ2 is the norm of the ith column of W21V1, only the first k(k ≤ r ≤ n − r) columns of W21V1 are nonzero. Let ˆU2 be unitary whose first k columns are the normalized columns of W21V1. Then

Uˆ2W21V1 =

· Σ 0

¸ ,

where Σ = diag(σ1, · · · , σk, 0, · · · , 0) ≡ diag(Σ0, 0), ˆU2 ∈ C(n−r)×(n−r). Since

diag(U1, ˆU2)

µ W11 W21

V1 =

 Γ Σ 0

(8)

has orthogonal (orthonormal) columns, we have γi2+ σi2 = 1, i = 1, · · · , r. (Σ0 is nonsin- gular).

By the same argument as above : there is a unitary V2 ∈ C(n−r)×(n−r) such that U1W12V2 = (T, 0),

where T = diag(τ1, · · · , τr) and τi ≤ 0. Since γi2 + τi2 = 1, it follows from γi2 + σi2 = 1 that T = −Σ. Set ˆU = diag(U1, ˆU2) and V = diag(V1, V2). Then X = ˆUW V can be partitioned in the form

X =





Γ0 0 −Σ0 0 0

0 I 0 0 0

Σ0 0 X33 X34 X35

0 0 X43 X44 X45 0 0 X53 X54 X55





}k }r − k }k }r − k }n − 2r

.

Since columns 1 and 4 are orthogonal, it follows Σ0X34 = 0. Thus X34 = 0 (since Σ0 nonsigular). Likewise X35, X43, X53 = 0. From the orthogonality of columns 1 and 3, it follows that −Γ0Σ0+ Σ0X33= 0, so X33= Γ0. The matrix ˆU3 =

· X44 X45 X54 X55

¸

is unitary.

Set U2 = diag(Ik, ˆU3) ˆU2 and U = diag(U1, U2). Then UHW V = diag(Ir+k, ˆU3)X with

X =





Γ0 0 −Σ0 0 0

0 I 0 0 0

Σ0 0 Γ0 0 0

0 0 0 I 0

0 0 0 0 I





.

The theorem is proved.

Theorem 5.1.3 Let W = [W1, W2] and Z = [Z1, Z2] be orthogonal, where W1, Z1 ∈ Rn×k and W2, Z2 ∈ Rn×(n−k). If S1 = R(W1) and S2 = R(Z1) then

dist(S1, S2) = q

1 − σmin2 (W1TZ1) (5.1.6) Proof: Let Q = WTZ and assume that k ≥ j = n − k. Let the C-S decomposition of Q be given by (5.1.2), (Qij = WiTZj, i, j = 1, 2). It follows that

k W1TZ2 k2=k W2TZ1 k2= sj = q

1 − c2j =p

1 − σmin2 (W1TZ1).

Since W1W1T and Z1Z1T are the orthogonal projections onto S1 and S2, respectively. We have

dist(S1, S2) = k W1W1T − Z1Z1T k2

= k WT(W1W1T − Z1Z1T)Z k2

= k

· 0 W1TZ2 W2TZ1 0

¸ k2

= sj.

If k < j, the above argument by setting Q = [W2, W1]T[Z2, Z1] and noting that σmin(W2TZ1) = σmin(W1TZ2) = sj.

(9)

### 5.2 Perturbation Theory

Theorem 5.2.1 (Gerschgorin Circle Theorem) If X−1AX = D+F , D ≡ diag(d1, · · · , dn) and F has zero diagonal entries, then σ(A) ⊂Sn

i=1Di, where Di = {z ∈ C | |z − di| ≤

Xn j=1,j6=i

|fij|}.

Proof: Suppose λ ∈ σ(A) and assume without loss of generality that λ 6= di for i = 1, · · · , n. Since (D − λI) + F is singular, it follows that

1 ≤k (D − λI)−1F k= Xn

j=1

|fkj| / |dk− λ|

for some k(1 ≤ k ≤ n). But this implies that λ ∈ Dk. Corollary 5.2.1 If the union M1 =Sk

j=1Dij of k discs Dij, j = 1, · · · , k, and the union M2 of the remaining discs are disjoint, then M1 contains exactly k eigenvalues of A and M2 exactly n − k eigenvalues.

Proof: Let B = X−1AX = D + F , for t ∈ [0, 1]. Let Bt := D + tF , then B0 = D, B1 = B. The eigenvalues of Bt are continuous functions of t. Applying Theorem 5.2.1 of Gerschgorin to Bt, one finds that for t = 0, there are exactly k eigenvalues of B0 in M1 and n − k in M2. (Counting multiple eigenvalues) Since for 0 ≤ t ≤ 1 all eigenvalues of Bt likewise must lie in these discs, it follows for reasons of continuity that also k eigenvalues of A lie in M1 and the remaining n − k in M2.

Remark 5.2.1 Take X = I, A = diag(A) + offdiag(A). Consider the transformation A −→ 4−1A4 with 4 = diag(δ1, · · · , δn). The Gerschgorin discs:

Di = {z ∈ C | |z − aii| ≤ Xn

k=1 k6=i

¯¯

¯¯aikδk δi

¯¯

¯¯ =: ρi}.

Example 5.2.1 Let A =

1 ² ²

² 2 ²

² ² 2

, D1 = {z | |z − 1| ≤ 2²}, D2 = D3 = {z |

|z − 2| ≤ 2²}, 0 < ² ¿ 1. Transformation with 4 = diag(1, k², k²), k > 0 yields A = 4˜ −1A4 =

1 k²2 2

1

k 2 ²

1

k ² 2

.

For ˜A we have ρ1 = 2k²2, ρ2 = ρ3 = 1k+ ². The discs D1 and D2 = D3 for ˜A are disjoint if ρ1+ ρ2 = 2k²2 +1k + ² < 1.

For this to be true we must clearly have k > 1. The optimal value ˜k, for which D1 and D2(for ˜A) touch one another, is obtained from ρ1+ ρ2 = 1. One finds

(10)

˜k = 2 1 − ² +p

(1 − ²)2− 8²2 = 1 + ² + O(²2)

and thus ρ1 = 2˜k²2 = 2²2+ O(²3). Through the transformation A −→ ˜A the radius ρ1 of D1 can thus be reduced from the initial 2² to about 2²2.

Theorem 5.2.2 (Bauer-Fike) If µ is an eigenvalue of A + E ∈ Cn×n and X−1AX = D = diag(λ1, · · · , λn), then

λ∈σ(A)min |λ − µ| ≤ κp(X) k E kp,

where k · kp is p-norm and κp(X) =k X kpk X−1 kp .

Proof: We need only consider the case µ 6∈ σ(A). If X−1(A + E − µI)X is singular, then so is I + (D − µI)−1(X−1EX). Thus,

1 ≤k (D − µI)−1(X−1EX) kp 1

λ∈σ(A)min |λ − µ| k X kpk E kpk X−1 kp .

Theorem 5.2.3 Let QAQ = D+N be a Schur decomposition of A with D = diag(λ1, · · · , λn) and N strictly upper triangular, Nn= 0. If µ ∈ σ(A + E), then

λ∈σ(A)min |λ − µ| ≤ max{θ, θn1}, where θ =k E k2 Pn−1

k=0 k N kk2.

Proof: Define δ = minλ∈σ(A)|λ − µ|. The theorem is true if δ = 0. If δ > 0, then I − (µI − A)−1E is singular and we have

1 ≤ k (µI − A)−1E k2

≤ k (µI − A)−1 k2k E k2

= k [(µI − D) − N]−1 k2k E k2 .

Since (µI − D) is diagonal it follows that [(µI − D)−1N]n = 0 and therefore

[(µI − D) − N]−1 = Xn−1

k=0

[(µI − D)−1N]k(µI − D)−1.

Hence we have

1 ≤ k E k2

δ max{1, 1 δn−1}

Xn−1 k=0

k N kk2,

from which the theorem readily follows.

(11)

Example 5.2.2 If A =

 1 2 3

0 4 5

0 0 4.001

 and E =

 0 0 0

0 0 0

0.001 0 0

. Then σ(A + E) ∼= {1.0001, 4.0582, 3.9427} and A’s matrix of eigenvectors satisfies κ2(X) ∼= 107. The Bauer- Fike bound in Theorem 5.2.2 has order 104, but the Schur bound in Theorem 5.2.3 has order 100.

Theorems 5.2.2 and 5.2.3 each indicate potential eigenvalue sensitively if A is non- normal. Specifically, if κ2(X) and k N kn−12 is large, then small changes in A can induce large change in the eigenvalues.

Example 5.2.3 If A =

· 0 I9

0 0

¸

and E =

· 0 0

10−10 0

¸

, then for all λ ∈ σ(A) and µ ∈ σ(A + E), |λ − µ| = 10−1010 . So a change of order 10−10 in A results in a change of order 10−1 in its eigenvalues.

Let λ be a simple eigenvalue of A ∈ Cn×n and x and y satisfy Ax = λx and yA = λy with k x k2=k y k2= 1. Using classical results from Function Theory, it can be shown that there exists differentiable x(ε) and λ(ε) such that

(A + εF )x(ε) = λ(ε)x(ε)

with k x(ε) k2= 1 and k F k2≤ 1, and such that λ(0) = λ and x(0) = x. By differentiating and set ε = 0:

A ˙x(0) + F x = ˙λ(0)x + λ ˙x(0).

Applying y to both sides and dividing by yx =⇒

f (x, y) = yn+ pn−1(x)yn−1+ pn−2(x)yn−2+ · · · + p1(x)y + p0(x).

Fix x, then f (x, y) = 0 has n roots y1(x), · · · , yn(x). f (0, y) = 0 has n roots y1(0), · · · , yn(0).

Theorem 5.2.4 Suppose yi(0) is a simple root of f (0, y) = 0, then there is δi > 0 such that there is a simple root yi(x) of f (x, y) = 0 defined by

yi(x) = yi(0) + pi1x + pi2x2+ · · · , (may terminate!) where the series is convergent for |x| < δi. (yi(x) −→ yi(0) as x −→ 0).

Theorem 5.2.5 If y1(0) = · · · = ym(0) is a root of multiplicity m of f (0, y) = 0, then there exists δ > 0 such that there are exactly m zeros of f (x, y) = 0 when |x| < δ having the following properties:

(a) Pr

i=1mi = m, mi ≥ 0. The m roots fall into r groups.

(b) Those roots in the group of mi are mi values of a series y1(0) + pi1z + pi2z2 + · · ·

corresponding to the mi different values of z defined by z = xmi1 .

(12)

Let λ1 be a simple root of A and x1 be the corresponding eigenvector. Since λ1 is simple, (A − λ1I) has at least one nonzero minor of order n − 1. Suppose this lies in the first (n − 1) rows of (A − λ1I). Take x1 = (An1, An2, · · · , Ann). Then

(A − λ1I)



 An1 An2 ...

Ann



=



 0 0...

0



,

since Pn

j=1anjAnj = det(A − λ1I) = 0. Here Ani is the cofactor of ani, hence it is a polynomial in λ1 of degree not greater than (n − 1).

Let λ1(ε) be the simple eigenvalue of A + εF and x1(ε) be the corresponding eigen- vector. Then the elements of x1(ε) are the polynomial in λ1(ε) and ε. Since the power series for λ1(ε) is convergent for small ε, so x1(ε) = x1+ εz1+ ε2z2+ · · · is a convergent power series

¯¯

¯ ˙λ(0)

¯¯

¯ = |yF x|

|yx| 1

|yx|. The upper bound is attained if F = yx. We refer to the reciprocal of s(λ) ≡ |yx| as the condition number of the eigenvalue λ.

λ(ε) = λ(0) + ˙λ(0)ε + O(ε2), an eigenvalue λ may be perturbed by an amount ε s(λ), if s(λ) is small then λ is appropriately regarded as ill-conditioned. Note that s(λ) is the cosine of the angle between the left and right eigenvectors associated with λ and is unique only if λ is simple. A small s(λ) implies that A is near a matrix having a multiple eigenvalue. In particular, if λ is distinct and s(λ) < 1, then there exists an E such that λ is a repeated eigenvalue of A + E and

k E k2 s(λ) p1 − s2(λ), this is proved in Wilkinson(1972).

Example 5.2.4 If A =

 1 2 3

0 4 5

0 0 4.001

 and E =

 0 0 0

0 0 0

0.001 0 0

. Then σ(A + E) ∼= {1.0001, 4.0582, 3.9427} and s(1) ∼= 0.79 × 100, s(4) = 0.16 × 10−3, s(4.001) ∼= 0.16 × 10−3. Observe that k E k2 /s(λ) is a good estimate of the perturbation that each eigenvalue undergoes.

If λ is a repeated eigenvalue, then the eigenvalue sensitivity question is more compli- cated. For example A =

· 1 a 0 1

¸

and F =

· 0 0 1 0

¸

then σ(A + εF ) = {1 ±√

εa}. Note that if a 6= 0 then the eigenvalues of A + εF are not differentiable at zero, their rate of change at the origin is infinite. In general, if λ is a detective eigenvalue of A, then O(ε) perturbations in A result in O(εp1) perturbations in λ where p ≥ 2 (see Wilkinson AEP pp.77 for a more detailed discussion).

We now consider the perturbations of invariant subspaces. Assume A ∈ Cn×n has distinct eigenvalues λ1, · · · , λn and k F k2= 1. We have

(A + εF )xk(ε) = λk(ε)xk(ε), k xk(ε) k2= 1,

(13)

and

yk(ε)(A + εF ) = λk(ε)yk(ε), k yk(ε) k2= 1,

for k = 1, · · · , n, where each λk(ε),xk(ε) and yk(ε) are differentiable. Set ε = 0 : A ˙xk(0) + F xk = ˙λk(0)xk+ λk˙xk(0),

where λk = λk(0) and xk = xk(0). Since {xi}ni=1 linearly independent, write ˙xk(0) = Pn

i=1aixi, so we have Xn

i=1 i6=k

aii− λk)xi+ F xk= ˙λk(0)xk.

But yi(0)xk= yixk = 0, for i 6= k and thus

ai = yiF xk/[(λk− λi)yixi], i 6= k.

Hence the Taylor expansion for xk(ε) is

xk(ε) = xk+ ε Xn

i=1 i6=k

½ yiF xk k− λi)yixi

¾

xi+ O(ε2).

Thus the sensitivity of xk depends upon eigenvalue sensitivity and the separation of λk

from the other eigenvalues.

Example 5.2.5 If A =

· 1.01 0.01 0.00 0.99

¸

, then λ = 0.99 has Condition 1

s(0.99) ∼= 1.118 and associated eigenvector x = (0.4472, −8.944)T. On the other hand, ˜λ = 1.00 of the

”nearby” matrix A + E =

· 1.01 0.01 0.00 1.00

¸

has an eigenvector ˜x = (0.7071, −0.7071)T. Suppose

QAQ =

· T11 T12 0 T22

¸ }p

}q = n − p (5.2.1)

is a Schur decomposition of A with

Q = [ Q|{z}1

p

, Q|{z}2

n−p

]. (5.2.2)

Definition 5.2.1 We define the separation between T11 and T22 by sepF(T11, T22) = min

Z6=0

k T11Z − ZT22 kF k Z kF .

Definition 5.2.2 Let X be a subspace of Cn, X is called an invariant subspace of A ∈ Cn×n, if AX ⊂ X (i.e. x ∈ X =⇒ Ax ∈ X).

Theorem 5.2.6 A ∈ Cn×n, V ∈ Cn×r and rank(V ) = r, then there are equivalent:

(14)

(a) there exists S ∈ Cr×r such that AV = V S.

(b) R(V ) is an invariant subspace of A.

Proof: Trivial!

Remark 5.2.2 (a) If Sz = µz, z 6= 0 then µ is eigenvalue of A with eigenvector V z.

(b) If V is a basis of X, then ˜V = V (VV )12 is an orthogonal basis of X.

Theorem 5.2.7 A ∈ Cn×n, Q = (Q1, Q2) orthogonal, then there are equivalent:

(a) If QAQ = B =

· B11 B12 B21 B22

¸

, then B21= 0.

(b) R(Q1) is an invariant subspace of A.

Proof: QAQ = B ⇐⇒ AQ = QB = (Q1, Q2)

· B11 B12 B21 B22

¸

. Thus AQ1 = Q1B11+ Q2B21.

(a) B21= 0, then AQ1 = Q1B11, so R(Q1) is an invariant subspace of A (from Theorem 5.2.6).

(b) R(Q1) is invariant subspace. There exists S such that AQ1 = Q1S = Q1B11+Q2B21. Multiply with Q1, then

S = Q1Q1S = Q1Q1B11+ Q1Q2B21. So S = B11 =⇒ Q2B21 = 0 =⇒ Q2Q2B21 = 0 =⇒ B21= 0.

Theorem 5.2.8 Suppose (5.2.1) and (5.2.2) hold and for E ∈ Cn×n we partition QEQ as follows:

QEQ =

· E11 E12

E21 E22

¸

with E11 ∈ Rp×p and E22 ∈ R(n−p)×(n−p). If

δ = sep2(T11, T22)− k E11 k2 − k E22k2> 0 and

k E21k2 (k T12k2 + k E12 k2) ≤ δ2/4.

Then there exists P ∈ C(n−k)×k such that

k P k2≤ 2 k E21 k2

and such that the column of ˜Q1 = (Q1+ Q2P )(I + PP )12 form an orthonormal basis for a invariant subspace of A + E.(See Stewart 1973).

(15)

Lemma 5.2.1 Let {sm} and {pm} be two sequence defined by

sm+1 = sm/(1 − 2ηpmsm), pm+1 = ηp2msm+1, m = 0, 1, 2, · · · (5.2.3) and

s0 = σ, p0 = σγ (5.2.4)

satisfying

4ησ2γ < 1. (Here σ, η, γ > 0) (5.2.5) Then {sm} is monotonic increasing and bounded above; {pm} is monotonic decreasing, converges quadratically to zero.

Proof: Let

xm = smpm, m = 0, 1, 2, · · · . (5.2.6) From (5.2.3) we have

xm+1 = sm+1pm+1 = ηp2ms2m/(1 − 2ηpmsm)2 = ηx2m/(1 − 2ηxm)2, (5.2.7) (5.2.5) can be written as

0 < x0 < 1

4η. (since x0 = s0p0 = σ2γ < 1

) (5.2.8)

Consider

x = f (x), f (x) = ηx2/(1 − 2ηx)2, x ≥ 0. (5.2.9) By

df (x)

dx = 2ηx

(1 − 2ηx)3,

we know that f (x) is differentiable and monotonic increasing in [0, 1/2η), and df (x)

dx |x=0= 0 : The equation (5.2.9) has zeros 0 and 1/4η in [0, 1/2η). Under Condition (5.2.8) the iteration xm as in (5.2.7) must be monotone decreasing converges quadratically to zero.

(Issacson &Keller ”Analysis of Num. Method 1996, Chapter 3 §1.) Thus sm+1

sm = 1

1 − 2ηxm = 1 + 2ηxm

1 − 2ηxm = 1 + tm,

where tm is monotone decreasing, converges quadratically to zero, hence

sm+1 = s0 Ym j=0

sj+1

sj = s0 Ym j=0

(1 + tj)

monotone increasing, and converges to s0 Y j=0

(1 + tj) < ∞, so pm = xm

sm monotone de- creasing, and quadratically convergent to zero.

(16)

Theorem 5.2.9 Let

P A12P + P A11− A22P − A21 = 0 (5.2.10) be the quadratic matrix equation in P ∈ C(n−l)×l (1 ≤ l ≤ n), where

· A11 A12 A21 A22

¸

= A, σ(A11)\

σ(A22) = ∅.

Define operator T by:

T Q ≡ QA11− A22Q, Q ∈ C(n−l)×l. (5.2.11) Let

η =k A12k, γ =k A21k (5.2.12)

and

σ =k T−1 k= sup

kP k=1

k T−1P k . (5.2.13)

If

4ησ2γ < 1, (5.2.14)

then according to the following iteration, we can get a solution P of (5.2.10) satisfying

k P k≤ 2σγ, (5.2.15)

and this iteration is quadratic convergence.

Iteration: Let Am =

"

A(m)11 A(m)12 A(m)21 A(m)22

#

, A0 = A.

(i) Solve

TmPm ≡ PmA(m)11 − A(m)22 Pm = A(m)21 (5.2.16) and get Pm ∈ C(n−l)×l;

(ii) Compute

A(m+1)11 = A(m)11 + A12Pm, A(m+1)22 = A(m)22 − PmA12, A(m+1)21 = −PmA12Pm. Goto (i), solve Pm+1.

Then

P = lim

m→∞

Xm i=0

Pi (5.2.17)

is a solution of (5.2.10) and satisfies (5.2.15).

(17)

Proof: (a) Prove that for m = 0, 1, 2, · · · , Tm−1 exist: denote

k Tm−1 k= σm, (T = T0, σ = σ0), (5.2.18) then

4 k A12 kk Pm k σm < 1. (5.2.19) By induction, m = 0, from σ(A11)T

σ(A22) = ∅ we have T0 = T is nonsingular. From (5.2.12)-(5.2.14) it holds

4 k A12 kk P0 k σ0 = 4η k T−1A21 k σ ≤ 4ησ2γ < 1.

Suppose Tm−1 exists, and (5.2.19) holds, prove that Tm+1−1 exists and 4 k A12 kk Pm+1 k σm+1 < 1.

From the definition

sep(A11, A22) = inf

kQk=1k QA11− A22Q k

and the existence of T−1 follows sep(A11, A22) =k T−1 k−1= σ−1, and by the perturbation property of ”sep” follows

sep(A(m+1)11 , A(m+1)22 ) = sep(A(m)11 + A12Pm, A(m)22 − PmA12)

≥ sep(A(m)11 , A(m)22 )− k A12Pm k − k PmA12k

1 − 2 k A12kk Pm k σm σm

> 0. (5.2.20)

From

sep(A11, A22) ≤ min{|λ1− λ2| : λ1 ∈ σ(A11), λ2 ∈ σ(A22)}.

We have σ(A(m+1)11 )T

σ(A(m+1)22 ) = ∅, hence Tm+1−1 exists and sep(A(m+1)11 , A(m+1)22 ) =k Tm+1−1 k−1= σ−1m+1. From (5.2.20) it follows

σm+1 σm

1 − 2 k A12 kk Pm k σm. (5.2.21) Substitute (5.2.19) into (5.2.21), we get σm+1 ≤ 2σm, and

k Pm+1 k≤k Tm+1−1 kk Am+121 k≤ σm+1 k Pm k2k A12k< 1

2 k Pm k . Hence

2 k A12 kk Pm+1 k σm+1 ≤ 2 k A12 kk Pm k σm < 1/2.

This proved that Tm−1 exists for all m = 0, 1, 2, · · · and (5.2.19) holds.

(18)

(b) Prove k Pm k is quadratic convergence to zero. Construct sequences {qm}, {sm}, {pm} satisfying

k A(m)21 k≤ qm, σm ≤ sm, k Pm k≤ pm. (5.2.22) From

A(m+1)21 = −PmA12Pm (5.2.23)

follows

k A(m+1)21 k≤k A12kk Pm k2≤ ηp2m. (5.2.24) Define {qm} by

qm+1 = ηp2m, q0 = γ; (5.2.25) From (5.2.21) we have

σm+1 sm

1 − 2ηpmsm. (5.2.26)

Define {sm} by

sm+1 = sm

1 − 2ηpmsm, s0 = σ; (5.2.27) From (5.2.16) we have

k Pm k≤k Tm−1 kk A(m)21 k= σm k A(m)21 k≤ smqm. Define {pm} by

pm+1 = sm+1qm+1 = ηp2msm+1, p0 = σγ. (5.2.28) By Lemma 5.2.1 follows that {pm} & 0 monotone and form (5.2.22) follows that k Pm k−→ 0 quadratically.

(c) Prove P(m) −→ P and (5.2.15) holds. According to the method as in Lemma 5.2.1. Construct {xm} (see (5.2.6),(5.2.7) ), that is

xm+1 = ηx2m

(1 − 2ηxm)2, sm+1 = sm

1 − 2ηxm (5.2.29)

and then

pm+1 = xm+1

sm+1 = ηxm

1 − 2ηxmpm. (5.2.30)

By induction! For all m = 1, 2, · · · we have pm < 1

2pm−1, xm < 1

4η. (5.2.31)

In fact, substitute

ηx0

1 − 2ηx0 = ησ2γ

1 − 2ησ2γ < 1

2 (5.2.32)

into (5.2.30) and get p1 < 1

2p0; From (5.2.29) and (5.2.32) it follows that x1 = 1

η

µ ηx0

1 − 2ηx0

2

< 1 4η.

Updating...

## References

Related subjects :