Academic year: 2022

Part II

On the Numerical Solutions of

Eigenvalue Problems


Chapter 5

The Unsymmetric Eigenvalue Problem

Generalized eigenvalue problem (GEVP):

Given A, B ∈ C^{n×n}, determine λ ∈ C and 0 ≠ x ∈ C^n with Ax = λBx. λ is called an eigenvalue of the pencil A − λB (or of the pair (A, B)) and x is called an eigenvector corresponding to λ. λ is an eigenvalue of A − λB ⇐⇒ det(A − λB) = 0.

(σ(A, B) ≡ {λ ∈ C | det(A − λB) = 0}.)

Definition 5.0.2 A pencil A − λB (A, B ∈ R^{m×n}) or a pair (A, B) is called regular if

(i) A and B are square matrices of order n, and (ii) det(A − λB) ≢ 0.

In all other cases (m ≠ n, or m = n but det(A − λB) ≡ 0), the pencil is called singular.

For the detailed algebraic structure of a pencil A − λB, see Matrix Theory II, Chapter XII (Gantmacher 1959).

Eigenvalue Problem (EVP):

In the special case of the GEVP with B = I, we seek λ ∈ C and 0 ≠ x ∈ C^n with Ax = λx. λ is an eigenvalue of A and x is an eigenvector corresponding to λ.

Definition 5.0.3 (a) σ(A) = {λ ∈ C | det(A − λI) = 0} is called the spectrum of A.

(b) ρ(A) = max{|λ| : λ ∈ σ(A)} is called the spectral radius of A.

(c) P (λ) = det(λI − A) is called the characteristic polynomial of A.

Let P(λ) = ∏_{i=1}^{s} (λ − λ_i)^{m(λ_i)}, where λ_i ≠ λ_j (i ≠ j) and ∑_{i=1}^{s} m(λ_i) = n.

Example 5.0.2 A = [2 2; 0 3], B = [1 0; 0 0] =⇒ det(A − λB) = 3(2 − λ) and σ(A, B) = {2}.


Example 5.0.3 A = [1 2; 0 3], B = [0 1; 0 0] =⇒ det(A − λB) = 3 and σ(A, B) = ∅.

Example 5.0.4 A = [1 2; 0 0], B = [1 0; 0 0] =⇒ det(A − λB) ≡ 0 and σ(A, B) = C.
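These three pencils can be checked numerically. The following sketch (NumPy assumed; the helper name pencil_det is ours) evaluates det(A − λB) directly:

```python
import numpy as np

# Evaluate det(A - lam*B) numerically to check the pencils above.
def pencil_det(A, B, lam):
    return np.linalg.det(A - lam * B)

# Example 5.0.2: regular pencil, det(A - lam*B) = 3(2 - lam), sigma(A,B) = {2}
A = np.array([[2.0, 2.0], [0.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])
print(pencil_det(A, B, 2.0))   # ≈ 0: lam = 2 is the only finite eigenvalue
print(pencil_det(A, B, 5.0))   # 3*(2-5) = -9

# Example 5.0.4: singular pencil, det(A - lam*B) vanishes identically
A2 = np.array([[1.0, 2.0], [0.0, 0.0]])
B2 = np.array([[1.0, 0.0], [0.0, 0.0]])
print([pencil_det(A2, B2, t) for t in (0.0, 1.0, 7.0)])  # all ≈ 0
```

A regular pencil has a determinant that vanishes only at its eigenvalues; for the singular pencil the determinant is zero for every trial λ.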

Example 5.0.5 (A, B as in Example 5.0.2) Here det(µA − λB) = 3µ(2µ − λ).
µ = 1 : Ax = λBx =⇒ λ = 2.
λ = 1 : Bx = µAx =⇒ µ = 0 or µ = 1/2, i.e. λ = ∞ or λ = 2.
Hence σ(A, B) = {2, ∞}.

Example 5.0.6 (A, B as in Example 5.0.3) Here det(µA − λB) = 3µ².
µ = 1 : no solution for λ.
λ = 1 : Bx = µAx =⇒ µ = 0, 0 (a double root).
Hence σ(A, B) = {∞, ∞}.

Let

m(λ_i) := algebraic multiplicity of λ_i,
n(λ_i) := n − rank(A − λ_iI) = geometric multiplicity of λ_i,

so that 1 ≤ n(λ_i) ≤ m(λ_i).

If n(λ_i) < m(λ_i) for some i, then A is degenerate (defective). The following statements are equivalent:

(a) A is diagonalizable: there exists a nonsingular matrix T such that T^{-1}AT = diag(λ_1, · · · , λ_n).

(b) There are n linearly independent eigenvectors.

(c) A is nondefective, i.e. ∀ λ ∈ σ(A) =⇒ m(λ) = n(λ).

If A is defective, then eigenvectors together with principal vectors yield the Jordan form.

Theorem 5.0.3 (Jordan decomposition) If A ∈ C^{n×n}, then there exists a nonsingular X ∈ C^{n×n} such that X^{-1}AX = diag(J_1, · · · , J_t), where

      [ λ_i  1           ]
      [      λ_i  ⋱      ]
J_i = [           ⋱   1  ]
      [               λ_i ]

is the m_i × m_i Jordan block of λ_i and m_1 + · · · + m_t = n.

Theorem 5.0.4 (Schur decomposition) If A ∈ C^{n×n}, then there exists a unitary matrix U ∈ C^{n×n} such that U^*AU (= U^{-1}AU) is upper triangular.

- A is normal (i.e. A^*A = AA^*) ⇐⇒ there exists a unitary U such that U^*AU = diag(λ_1, · · · , λ_n), i.e. Au_i = λ_iu_i, u_i^*u_j = δ_{ij}.

- A is hermitian (i.e. A^* = A) ⇐⇒ A is normal and σ(A) ⊆ R.

- A is symmetric (i.e. A^T = A, A ∈ R^{n×n}) ⇐⇒ there exists an orthogonal U such that U^TAU = diag(λ_1, · · · , λ_n) and σ(A) ⊆ R.
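The real symmetric case can be seen directly with NumPy's symmetric eigensolver (a quick illustrative check, not part of the notes):

```python
import numpy as np

# For a real symmetric A, eigh returns an orthogonal U and a real spectrum
# with U^T A U = diag(lambda_1, ..., lambda_n).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                              # symmetrize

lam, U = np.linalg.eigh(A)
print(np.allclose(U.T @ U, np.eye(5)))         # U is orthogonal
print(np.allclose(U.T @ A @ U, np.diag(lam)))  # diagonalized, real spectrum
```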


5.1 Orthogonal Projections and C-S Decomposition

Definition 5.1.1 Let S ⊆ R^n be a subspace. P ∈ R^{n×n} is the orthogonal projection onto S if

Range(P) = S,  P² = P,  P^T = P,   (5.1.1)

where Range(P) = R(P) = {y ∈ R^n | y = Px for some x ∈ R^n}.

Remark 5.1.1 If x ∈ R^n, then Px ∈ S and (I − P)x ∈ S^⊥.

Example 5.1.1 P = vv^T/v^Tv is the orthogonal projection onto S = span{v}, v ∈ R^n.

[Figure 5.1: Orthogonal projection of x onto S = span{v}.]

Remark 5.1.2 (i) If P_1 and P_2 are orthogonal projections, then for any z ∈ R^n we have

‖(P_1 − P_2)z‖_2² = (P_1z)^T(I − P_2)z + (P_2z)^T(I − P_1)z.   (5.1.2)

If R(P_1) = R(P_2) = S, then the right-hand side of (5.1.2) is zero. Thus the orthogonal projection onto a subspace is unique.

(ii) If V = [v_1, · · · , v_k] is an orthonormal basis for S, then P = VV^T is the unique orthogonal projection onto S.
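The defining properties of Example 5.1.1 and Remark 5.1.2(ii) are easy to check numerically (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(4)
P = np.outer(v, v) / (v @ v)      # orthogonal projection onto span{v}

x = rng.standard_normal(4)
print(np.allclose(P @ P, P))      # idempotent: P^2 = P
print(np.allclose(P.T, P))        # symmetric: P^T = P
print(abs((P @ x) @ ((np.eye(4) - P) @ x)) < 1e-12)  # Px orthogonal to (I-P)x

# Remark 5.1.2(ii): with an orthonormal basis V of S, P = V V^T;
# here S = span{v}, so V is just v normalized.
V = (v / np.linalg.norm(v)).reshape(-1, 1)
print(np.allclose(V @ V.T, P))
```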

Definition 5.1.2 Suppose S_1 and S_2 are subspaces of R^n with dim(S_1) = dim(S_2). We define the distance between S_1 and S_2 by

dist(S_1, S_2) = ‖P_1 − P_2‖_2,   (5.1.3)

where P_i is the orthogonal projection onto S_i, i = 1, 2.

Remark 5.1.3 By considering the case of one-dimensional subspaces in R², we obtain a geometrical interpretation of dist(·, ·). Suppose S_1 = span{x} and S_2 = span{y} with ‖x‖_2 = ‖y‖_2 = 1, and assume that x^Ty = cos θ, θ ∈ [0, π/2].

[Figure: the angle θ between S_1 = span{x} and S_2 = span{y}.]

It follows that the difference between the projections onto these spaces satisfies

P_1 − P_2 = xx^T − yy^T = x[x − (y^Tx)y]^T − [y − (x^Ty)x]y^T.

If θ = 0 (so x = y), then dist(S_1, S_2) = ‖P_1 − P_2‖_2 = sin θ = 0. If θ ≠ 0, then

U_x = [u_1, u_2] = [x, −[y − (y^Tx)x]/sin θ]

and

V_x = [v_1, v_2] = [[x − (x^Ty)y]/sin θ, y]

are defined and orthogonal. It follows that

P_1 − P_2 = U_x diag(sin θ, sin θ) V_x^T

is the SVD of P_1 − P_2. Consequently, dist(S_1, S_2) = sin θ, the sine of the angle between the two subspaces.
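The identity dist(S_1, S_2) = sin θ can be confirmed directly (illustrative NumPy sketch):

```python
import numpy as np

theta = 0.3
x = np.array([1.0, 0.0])
y = np.array([np.cos(theta), np.sin(theta)])   # x^T y = cos(theta)

P1 = np.outer(x, x)          # projection onto span{x}
P2 = np.outer(y, y)          # projection onto span{y}

dist = np.linalg.norm(P1 - P2, 2)              # spectral norm
print(dist, np.sin(theta))   # equal: dist(S1, S2) = sin(theta)
```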

Theorem 5.1.1 (C-S Decomposition, Davis/Kahan (1970) or Stewart (1977)) If

Q = [ Q_11  Q_12 ]
    [ Q_21  Q_22 ]

is orthogonal with Q_11 ∈ R^{k×k} and Q_22 ∈ R^{j×j} (k ≥ j), then there exist orthogonal matrices U_1, V_1 ∈ R^{k×k} and orthogonal matrices U_2, V_2 ∈ R^{j×j} such that

[ U_1  0   ]^T [ Q_11  Q_12 ] [ V_1  0   ]   [ I  0   0 ]
[ 0    U_2 ]   [ Q_21  Q_22 ] [ 0    V_2 ] = [ 0  C   S ],   (5.1.4)
                                             [ 0  −S  C ]

where

C = diag(c_1, · · · , c_j), c_i = cos θ_i,  S = diag(s_1, · · · , s_j), s_i = sin θ_i,

and 0 ≤ θ_1 ≤ θ_2 ≤ · · · ≤ θ_j ≤ π/2.

Lemma 5.1.1 Let Q = [ Q_1 ; Q_2 ] have orthonormal columns with Q_1 ∈ R^{n×n}. Then there are unitary matrices U_1, U_2 and W such that

[ U_1^T  0     ] [ Q_1 ]       [ C ]
[ 0      U_2^T ] [ Q_2 ] W  =  [ S ],

where C = diag(c_1, · · · , c_n) ≥ 0 and S = diag(s_1, · · · , s_n) ≥ 0 with c_i² + s_i² = 1, i = 1, · · · , n.


Proof: Let U_1^TQ_1W = C be the SVD of Q_1. Then

[ U_1^T  0 ] [ Q_1 ]       [ C    ]
[ 0      I ] [ Q_2 ] W  =  [ Q_2W ]

has orthonormal columns. Define Q̃_2 ≡ Q_2W. Then C² + Q̃_2^TQ̃_2 = I, i.e. Q̃_2^TQ̃_2 = I − C² is diagonal, which means that the nonzero columns of Q̃_2 are orthogonal to one another. If all the columns of Q̃_2 are nonzero, set S² = Q̃_2^TQ̃_2 and U_2 = Q̃_2S^{-1}; then U_2^TU_2 = I and U_2^TQ̃_2 = S, and the decomposition follows.

If Q̃_2 has zero columns, normalize the nonzero columns and replace the zero columns with an orthonormal basis for the orthogonal complement of the column space of Q̃_2. It is easily verified that U_2 so defined is orthogonal and S ≡ U_2^TQ̃_2 is diagonal. The decomposition again follows.
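The construction in this proof can be traced numerically: an SVD of Q_1 simultaneously makes Q̃_2^TQ̃_2 diagonal (illustrative NumPy sketch; the generic random case has no zero columns):

```python
import numpy as np

# Build a random Q = [Q1; Q2] with orthonormal columns, Q1 square.
rng = np.random.default_rng(2)
n, m = 4, 3                                   # Q1 is n x n, Q2 is m x n
Q, _ = np.linalg.qr(rng.standard_normal((n + m, n)))
Q1, Q2 = Q[:n, :], Q[n:, :]

U1, c, Wt = np.linalg.svd(Q1)                 # Q1 = U1 diag(c) W^T
Q2t = Q2 @ Wt.T                               # Q2~ = Q2 W
G = Q2t.T @ Q2t
print(np.allclose(G, np.diag(1 - c**2)))      # C^2 + Q2~^T Q2~ = I
```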

Theorem 5.1.2 (C-S Decomposition) Let the unitary matrix W ∈ C^{n×n} be partitioned in the form

W = [ W_11  W_12 ]
    [ W_21  W_22 ],

where W_11 ∈ C^{r×r} with r ≤ n/2. Then there exist unitary matrices U = diag(U_1, U_2) and V = diag(V_1, V_2), with U_1, V_1 ∈ C^{r×r} and U_2, V_2 ∈ C^{(n−r)×(n−r)}, such that

          [ Γ  −Σ  0 ]  }r
U^*WV  =  [ Σ   Γ  0 ]  }r           (5.1.5)
          [ 0   0  I ]  }n − 2r

where Γ = diag(γ_1, · · · , γ_r) ≥ 0 and Σ = diag(σ_1, · · · , σ_r) ≥ 0 with γ_i² + σ_i² = 1, i = 1, · · · , r.

Proof: Let U_1^*W_11V_1 = Γ be the SVD of W_11 with the diagonal elements of Γ ordered as γ_1 ≤ γ_2 ≤ · · · ≤ γ_k < 1 = γ_{k+1} = · · · = γ_r, i.e.

Γ = diag(Γ_0, I_{r−k}).

The matrix [ W_11 ; W_21 ] V_1 has orthonormal columns. Hence

I = ([ W_11 ; W_21 ] V_1)^* ([ W_11 ; W_21 ] V_1) = Γ² + (W_21V_1)^*(W_21V_1).

Since I and Γ² are diagonal, (W_21V_1)^*(W_21V_1) is diagonal, so the columns of W_21V_1 are orthogonal. Since the ith diagonal entry of I − Γ² is the squared norm of the ith column of W_21V_1, only the first k (k ≤ r ≤ n − r) columns of W_21V_1 are nonzero. Let Û_2 be unitary whose first k columns are the normalized columns of W_21V_1. Then

Û_2^*W_21V_1 = [ Σ ; 0 ],

where Σ = diag(σ_1, · · · , σ_k, 0, · · · , 0) ≡ diag(Σ_0, 0) and Û_2 ∈ C^{(n−r)×(n−r)}. Since

diag(U_1^*, Û_2^*) [ W_11 ; W_21 ] V_1 = [ Γ ; Σ ; 0 ]

has orthonormal columns, we have γ_i² + σ_i² = 1, i = 1, · · · , r (Σ_0 is nonsingular).

By the same argument as above, there is a unitary V_2 ∈ C^{(n−r)×(n−r)} such that

U_1^*W_12V_2 = (T, 0),

where T = diag(τ_1, · · · , τ_r) and τ_i ≤ 0. Since γ_i² + τ_i² = 1, it follows from γ_i² + σ_i² = 1 that T = −Σ. Set Û = diag(U_1, Û_2) and V = diag(V_1, V_2). Then X = Û^*WV can be partitioned in the form

    [ Γ_0  0  −Σ_0  0     0    ]  }k
    [ 0    I   0    0     0    ]  }r − k
X = [ Σ_0  0   X_33 X_34  X_35 ]  }k
    [ 0    0   X_43 X_44  X_45 ]  }r − k
    [ 0    0   X_53 X_54  X_55 ]  }n − 2r

Since columns 1 and 4 are orthogonal, it follows that Σ_0X_34 = 0, thus X_34 = 0 (since Σ_0 is nonsingular). Likewise X_35 = X_43 = X_53 = 0. From the orthogonality of columns 1 and 3 it follows that −Γ_0Σ_0 + Σ_0X_33 = 0, so X_33 = Γ_0. The matrix

Û_3 = [ X_44  X_45 ]
      [ X_54  X_55 ]

is unitary. Set U_2 = diag(I_k, Û_3)Û_2 and U = diag(U_1, U_2). Then U^*WV = diag(I_{r+k}, Û_3^*)X with

    [ Γ_0  0  −Σ_0  0  0 ]
    [ 0    I   0    0  0 ]
X = [ Σ_0  0   Γ_0  0  0 ]
    [ 0    0   0    I  0 ]
    [ 0    0   0    0  I ].

The theorem is proved.

Theorem 5.1.3 Let W = [W_1, W_2] and Z = [Z_1, Z_2] be orthogonal, where W_1, Z_1 ∈ R^{n×k} and W_2, Z_2 ∈ R^{n×(n−k)}. If S_1 = R(W_1) and S_2 = R(Z_1), then

dist(S_1, S_2) = √(1 − σ_min²(W_1^TZ_1)).   (5.1.6)

Proof: Let Q = W^TZ and assume that k ≥ j = n − k. Let the C-S decomposition of Q be given by (5.1.4) (with Q_ij = W_i^TZ_j, i, j = 1, 2). It follows that

‖W_1^TZ_2‖_2 = ‖W_2^TZ_1‖_2 = s_j = √(1 − c_j²) = √(1 − σ_min²(W_1^TZ_1)).

Since W_1W_1^T and Z_1Z_1^T are the orthogonal projections onto S_1 and S_2, respectively, we have

dist(S_1, S_2) = ‖W_1W_1^T − Z_1Z_1^T‖_2
             = ‖W^T(W_1W_1^T − Z_1Z_1^T)Z‖_2
             = ‖ [ 0         W_1^TZ_2 ]
                 [ W_2^TZ_1  0        ] ‖_2
             = s_j.

If k < j, apply the above argument to Q = [W_2, W_1]^T[Z_2, Z_1], noting that σ_min(W_2^TZ_1) = σ_min(W_1^TZ_2) = s_j.
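Formula (5.1.6) is convenient in practice because it needs only a k × k SVD rather than the n × n projector difference. A quick NumPy check on random subspaces:

```python
import numpy as np

# Verify dist(S1,S2) = sqrt(1 - sigma_min(W1^T Z1)^2) for random subspaces.
rng = np.random.default_rng(3)
n, k = 6, 2
W1, _ = np.linalg.qr(rng.standard_normal((n, k)))   # orthonormal basis of S1
Z1, _ = np.linalg.qr(rng.standard_normal((n, k)))   # orthonormal basis of S2

dist = np.linalg.norm(W1 @ W1.T - Z1 @ Z1.T, 2)
smin = np.linalg.svd(W1.T @ Z1, compute_uv=False).min()
print(dist, np.sqrt(1 - smin**2))                   # the two sides agree
```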


5.2 Perturbation Theory

Theorem 5.2.1 (Gerschgorin Circle Theorem) If X^{-1}AX = D + F, D ≡ diag(d_1, · · · , d_n), and F has zero diagonal entries, then σ(A) ⊂ ⋃_{i=1}^{n} D_i, where

D_i = {z ∈ C : |z − d_i| ≤ ∑_{j=1, j≠i}^{n} |f_ij|}.

Proof: Suppose λ ∈ σ(A) and assume without loss of generality that λ ≠ d_i for i = 1, · · · , n. Since (D − λI) + F is singular, it follows that

1 ≤ ‖(D − λI)^{-1}F‖_∞ = ∑_{j=1}^{n} |f_kj| / |d_k − λ|

for some k (1 ≤ k ≤ n). But this implies that λ ∈ D_k.

Corollary 5.2.1 If the union M_1 = ⋃_{j=1}^{k} D_{i_j} of k discs D_{i_j}, j = 1, · · · , k, and the union M_2 of the remaining discs are disjoint, then M_1 contains exactly k eigenvalues of A and M_2 exactly n − k eigenvalues.

Proof: Let B = X^{-1}AX = D + F and, for t ∈ [0, 1], let B_t := D + tF, so that B_0 = D and B_1 = B. The eigenvalues of B_t are continuous functions of t. Applying Theorem 5.2.1 to B_t, one finds that for t = 0 there are exactly k eigenvalues of B_0 in M_1 and n − k in M_2 (counting multiple eigenvalues). Since for 0 ≤ t ≤ 1 all eigenvalues of B_t likewise must lie in these discs, it follows for reasons of continuity that k eigenvalues of A lie in M_1 and the remaining n − k in M_2.
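With X = I the theorem says every eigenvalue lies in a disc centered at a diagonal entry with radius equal to the off-diagonal row sum; this is cheap to check (illustrative NumPy sketch):

```python
import numpy as np

# Gerschgorin discs for A itself (X = I, D = diag(A), F = offdiag(A)).
A = np.array([[4.0, 1.0, 0.5],
              [0.2, -1.0, 0.3],
              [0.1, 0.4, 2.0]])

centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)  # row sums minus diagonal

for lam in np.linalg.eigvals(A):
    in_some_disc = np.any(np.abs(lam - centers) <= radii)
    print(lam, in_some_disc)    # True for every eigenvalue
```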

Remark 5.2.1 Take X = I, A = diag(A) + offdiag(A). Consider the transformation A → Δ^{-1}AΔ with Δ = diag(δ_1, · · · , δ_n). The Gerschgorin discs become

D_i = {z ∈ C : |z − a_ii| ≤ ∑_{k=1, k≠i}^{n} |a_ik| δ_k/δ_i =: ρ_i}.

Example 5.2.1 Let

    [ 1  ε  ε ]
A = [ ε  2  ε ],
    [ ε  ε  2 ]

D_1 = {z : |z − 1| ≤ 2ε}, D_2 = D_3 = {z : |z − 2| ≤ 2ε}, 0 < ε ≪ 1. Transformation with Δ = diag(1, kε, kε), k > 0, yields

               [ 1    kε²  kε² ]
Ã = Δ^{-1}AΔ = [ 1/k  2    ε   ].
               [ 1/k  ε    2   ]

For Ã we have ρ_1 = 2kε² and ρ_2 = ρ_3 = 1/k + ε. The discs D_1 and D_2 = D_3 for Ã are disjoint if

ρ_1 + ρ_2 = 2kε² + 1/k + ε < 1.

For this to be true we must clearly have k > 1. The optimal value k̃, for which D_1 and D_2 (for Ã) touch one another, is obtained from ρ_1 + ρ_2 = 1. One finds


k̃ = 2 / (1 − ε + √((1 − ε)² − 8ε²)) = 1 + ε + O(ε²),

and thus ρ_1 = 2k̃ε² = 2ε² + O(ε³). Through the transformation A → Ã the radius ρ_1 of D_1 can thus be reduced from the initial 2ε to about 2ε².
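This shrinking of the first disc is easy to reproduce numerically (illustrative sketch, using k = 1 + ε as a near-optimal scaling):

```python
import numpy as np

# Example 5.2.1: the similarity Delta^{-1} A Delta shrinks the disc around 1
# from radius 2*eps to about 2*eps^2.
eps = 1e-3
A = np.array([[1.0, eps, eps],
              [eps, 2.0, eps],
              [eps, eps, 2.0]])
k = 1.0 + eps                          # near-optimal k_tilde = 1 + eps + O(eps^2)
D = np.diag([1.0, k * eps, k * eps])
At = np.linalg.inv(D) @ A @ D

rho1 = np.sum(np.abs(At[0])) - np.abs(At[0, 0])
print(rho1)    # about 2*eps^2 = 2e-6 instead of 2*eps = 2e-3
```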

Theorem 5.2.2 (Bauer-Fike) If µ is an eigenvalue of A + E ∈ C^{n×n} and X^{-1}AX = D = diag(λ_1, · · · , λ_n), then

min_{λ∈σ(A)} |λ − µ| ≤ κ_p(X) ‖E‖_p,

where ‖·‖_p is the p-norm and κ_p(X) = ‖X‖_p ‖X^{-1}‖_p.

Proof: We need only consider the case µ ∉ σ(A). Since X^{-1}(A + E − µI)X is singular, so is I + (D − µI)^{-1}(X^{-1}EX). Thus,

1 ≤ ‖(D − µI)^{-1}(X^{-1}EX)‖_p ≤ (1 / min_{λ∈σ(A)} |λ − µ|) ‖X‖_p ‖E‖_p ‖X^{-1}‖_p.
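The bound with p = 2 can be checked on a random diagonalizable matrix (illustrative NumPy sketch):

```python
import numpy as np

# Check the Bauer-Fike bound (p = 2) on a random diagonalizable A.
rng = np.random.default_rng(4)
X = rng.standard_normal((5, 5))
D = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
A = X @ D @ np.linalg.inv(X)           # sigma(A) = {1,...,5}, eigenvectors X

E = 1e-6 * rng.standard_normal((5, 5))
bound = np.linalg.cond(X) * np.linalg.norm(E, 2)   # kappa_2(X) * ||E||_2

for mu in np.linalg.eigvals(A + E):
    print(np.min(np.abs(np.diag(D) - mu)) <= bound)   # True for each mu
```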

Theorem 5.2.3 Let Q^*AQ = D + N be a Schur decomposition of A, with D = diag(λ_1, · · · , λ_n) and N strictly upper triangular (N^n = 0). If µ ∈ σ(A + E), then

min_{λ∈σ(A)} |λ − µ| ≤ max{θ, θ^{1/n}},  where θ = ‖E‖_2 ∑_{k=0}^{n−1} ‖N‖_2^k.

Proof: Define δ = min_{λ∈σ(A)} |λ − µ|. The theorem is true if δ = 0. If δ > 0, then I − (µI − A)^{-1}E is singular and we have

1 ≤ ‖(µI − A)^{-1}E‖_2 ≤ ‖(µI − A)^{-1}‖_2 ‖E‖_2 = ‖[(µI − D) − N]^{-1}‖_2 ‖E‖_2.

Since (µI − D) is diagonal, it follows that [(µI − D)^{-1}N]^n = 0 and therefore

[(µI − D) − N]^{-1} = ∑_{k=0}^{n−1} [(µI − D)^{-1}N]^k (µI − D)^{-1}.

Hence we have

1 ≤ (‖E‖_2 / δ) max{1, 1/δ^{n−1}} ∑_{k=0}^{n−1} ‖N‖_2^k,

from which the theorem readily follows.


Example 5.2.2 If

    [ 1  2  3     ]       [ 0      0  0 ]
A = [ 0  4  5     ],  E = [ 0      0  0 ],
    [ 0  0  4.001 ]       [ 0.001  0  0 ]

then σ(A + E) ≅ {1.0001, 4.0582, 3.9427}, and A's matrix of eigenvectors satisfies κ_2(X) ≅ 10⁷. The Bauer-Fike bound in Theorem 5.2.2 has order 10⁴, but the Schur bound in Theorem 5.2.3 has order 10⁰.

Theorems 5.2.2 and 5.2.3 each indicate potential eigenvalue sensitivity if A is non-normal. Specifically, if κ_2(X) or ‖N‖_2^{n−1} is large, then small changes in A can induce large changes in the eigenvalues.

Example 5.2.3 If A = [0 I_9; 0 0] and E = [0 0; 10^{-10} 0], then for all λ ∈ σ(A) and µ ∈ σ(A + E), |λ − µ| = 10^{-1}. So a change of order 10^{-10} in A results in a change of order 10^{-1} in its eigenvalues.
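This dramatic amplification is visible numerically: all ten eigenvalues of A + E sit on the circle of radius (10^{-10})^{1/10} = 10^{-1} (illustrative NumPy sketch):

```python
import numpy as np

# A 10x10 nilpotent Jordan-type matrix: a perturbation of size 1e-10 in the
# (10,1) entry moves all eigenvalues from 0 to the circle |mu| = 1e-1.
n = 10
A = np.diag(np.ones(n - 1), k=1)      # ones on the superdiagonal
E = np.zeros((n, n))
E[n - 1, 0] = 1e-10

mu = np.linalg.eigvals(A + E)
print(np.abs(mu))                     # all ~0.1 = (1e-10)**(1/10)
```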

Let λ be a simple eigenvalue of A ∈ C^{n×n} and let x and y satisfy Ax = λx and y^*A = λy^* with ‖x‖_2 = ‖y‖_2 = 1. Using classical results from function theory, it can be shown that there exist differentiable x(ε) and λ(ε) such that

(A + εF)x(ε) = λ(ε)x(ε)

with ‖x(ε)‖_2 = 1 and ‖F‖_2 ≤ 1, and such that λ(0) = λ and x(0) = x. Differentiating and setting ε = 0:

Aẋ(0) + Fx = λ̇(0)x + λẋ(0).

Applying y^* to both sides and dividing by y^*x gives λ̇(0) = y^*Fx / (y^*x).

To study how roots depend on a parameter, consider a polynomial in two variables,

f(x, y) = y^n + p_{n−1}(x)y^{n−1} + p_{n−2}(x)y^{n−2} + · · · + p_1(x)y + p_0(x).

Fix x, then f (x, y) = 0 has n roots y1(x), · · · , yn(x). f (0, y) = 0 has n roots y1(0), · · · , yn(0).

Theorem 5.2.4 Suppose y_i(0) is a simple root of f(0, y) = 0. Then there is δ_i > 0 such that there is a simple root y_i(x) of f(x, y) = 0 given by

y_i(x) = y_i(0) + p_{i1}x + p_{i2}x² + · · ·   (the series may terminate!),

where the series is convergent for |x| < δ_i. (So y_i(x) → y_i(0) as x → 0.)

Theorem 5.2.5 If y_1(0) = · · · = y_m(0) is a root of multiplicity m of f(0, y) = 0, then there exists δ > 0 such that there are exactly m zeros of f(x, y) = 0 when |x| < δ, having the following properties:

(a) ∑_{i=1}^{r} m_i = m, m_i ≥ 1; the m roots fall into r groups.

(b) The roots in the group of size m_i are the m_i values of a series y_1(0) + p_{i1}z + p_{i2}z² + · · · corresponding to the m_i different values of z defined by z = x^{1/m_i}.


Let λ_1 be a simple eigenvalue of A and x_1 the corresponding eigenvector. Since λ_1 is simple, (A − λ_1I) has at least one nonzero minor of order n − 1. Suppose this lies in the first n − 1 rows of (A − λ_1I), and take x_1 = (A_{n1}, A_{n2}, · · · , A_{nn})^T, where A_{ni} is the cofactor of a_{ni} in (A − λ_1I). Then

(A − λ_1I) (A_{n1}, A_{n2}, · · · , A_{nn})^T = 0,

since ∑_{j=1}^{n} a_{nj}A_{nj} = det(A − λ_1I) = 0, while each earlier row gives the expansion of a determinant with a repeated row, which also vanishes. Each A_{ni} is a polynomial in λ_1 of degree not greater than n − 1.

Let λ_1(ε) be the simple eigenvalue of A + εF and x_1(ε) the corresponding eigenvector. Then the elements of x_1(ε) are polynomials in λ_1(ε) and ε. Since the power series for λ_1(ε) is convergent for small ε, x_1(ε) = x_1 + εz_1 + ε²z_2 + · · · is a convergent power series as well. From the formula above,

|λ̇(0)| = |y^*Fx| / |y^*x| ≤ 1 / |y^*x|.

The upper bound is attained if F = yx^*. We refer to the reciprocal of s(λ) ≡ |y^*x| as the condition number of the eigenvalue λ.

Since λ(ε) = λ(0) + λ̇(0)ε + O(ε²), an eigenvalue λ may be perturbed by an amount ε/s(λ); if s(λ) is small, then λ is appropriately regarded as ill-conditioned. Note that s(λ) is the cosine of the angle between the left and right eigenvectors associated with λ and is unique only if λ is simple. A small s(λ) implies that A is near a matrix having a multiple eigenvalue. In particular, if λ is distinct and s(λ) < 1, then there exists an E with

‖E‖_2 ≤ s(λ) / √(1 − s²(λ))

such that λ is a repeated eigenvalue of A + E; this is proved in Wilkinson (1972).

Example 5.2.4 If

    [ 1  2  3     ]       [ 0      0  0 ]
A = [ 0  4  5     ],  E = [ 0      0  0 ],
    [ 0  0  4.001 ]       [ 0.001  0  0 ]

then σ(A + E) ≅ {1.0001, 4.0582, 3.9427} and

s(1) ≅ 0.79 × 10⁰,  s(4) ≅ 0.16 × 10^{-3},  s(4.001) ≅ 0.16 × 10^{-3}.

Observe that ‖E‖_2 / s(λ) is a good estimate of the perturbation that each eigenvalue undergoes.
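The condition numbers s(λ) can be computed from left and right eigenvectors; left eigenvectors of a real A are right eigenvectors of A^T (illustrative NumPy sketch, pairing the two spectra by sorting):

```python
import numpy as np

# Compute s(lambda) = |y* x| for each eigenvalue of the matrix above.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 4.001]])

lam, X = np.linalg.eig(A)        # right eigenvectors (unit columns)
lamL, Y = np.linalg.eig(A.T)     # right eigenvectors of A^T = left ones of A

# pair the eigenvalues of A and A^T by sorting (all are real and distinct)
ir, il = np.argsort(lam.real), np.argsort(lamL.real)
for r, l in zip(ir, il):
    s = abs(Y[:, l].conj() @ X[:, r])
    print(lam[r].real, s)        # s(1) = O(1); s(4), s(4.001) of order 1e-4
```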

If λ is a repeated eigenvalue, then the eigenvalue sensitivity question is more complicated. For example, if

A = [ 1  a ]        [ 0  0 ]
    [ 0  1 ],  F =  [ 1  0 ],

then σ(A + εF) = {1 ± √(εa)}. Note that if a ≠ 0, then the eigenvalues of A + εF are not differentiable at zero; their rate of change at the origin is infinite. In general, if λ is a defective eigenvalue of A, then O(ε) perturbations in A result in O(ε^{1/p}) perturbations in λ, where p ≥ 2 (see Wilkinson, AEP, p. 77 for a more detailed discussion).

We now consider the perturbation of invariant subspaces. Assume A ∈ C^{n×n} has distinct eigenvalues λ_1, · · · , λ_n and ‖F‖_2 = 1. We have

(A + εF)x_k(ε) = λ_k(ε)x_k(ε),  ‖x_k(ε)‖_2 = 1,


and

y_k(ε)^*(A + εF) = λ_k(ε)y_k(ε)^*,  ‖y_k(ε)‖_2 = 1,

for k = 1, · · · , n, where each λ_k(ε), x_k(ε), and y_k(ε) is differentiable. Setting ε = 0:

Aẋ_k(0) + Fx_k = λ̇_k(0)x_k + λ_kẋ_k(0),

where λ_k = λ_k(0) and x_k = x_k(0). Since {x_i}_{i=1}^{n} are linearly independent, write ẋ_k(0) = ∑_{i=1}^{n} a_ix_i; then

∑_{i=1, i≠k}^{n} a_i(λ_i − λ_k)x_i + Fx_k = λ̇_k(0)x_k.

But y_i^*x_k = 0 for i ≠ k, and thus

a_i = y_i^*Fx_k / [(λ_k − λ_i)y_i^*x_i],  i ≠ k.

Hence the Taylor expansion for x_k(ε) is

x_k(ε) = x_k + ε ∑_{i=1, i≠k}^{n} [ y_i^*Fx_k / ((λ_k − λ_i)y_i^*x_i) ] x_i + O(ε²).

Thus the sensitivity of x_k depends upon both the eigenvalue sensitivity and the separation of λ_k from the other eigenvalues.

Example 5.2.5 If

A = [ 1.01  0.01 ]
    [ 0.00  0.99 ],

then λ = 0.99 has condition 1/s(0.99) ≅ 1.118 and associated eigenvector x = (0.4472, −0.8944)^T. On the other hand, the eigenvalue λ̃ = 1.00 of the "nearby" matrix

A + E = [ 1.01  0.01 ]
        [ 0.00  1.00 ]

has eigenvector x̃ = (0.7071, −0.7071)^T.

Suppose

Q^*AQ = [ T_11  T_12 ]  }p
        [ 0     T_22 ]  }q = n − p        (5.2.1)

is a Schur decomposition of A with

Q = [Q_1, Q_2],  Q_1 ∈ C^{n×p},  Q_2 ∈ C^{n×(n−p)}.   (5.2.2)

Definition 5.2.1 We define the separation between T_11 and T_22 by

sep_F(T_11, T_22) = min_{Z≠0} ‖T_11Z − ZT_22‖_F / ‖Z‖_F.

Definition 5.2.2 Let X be a subspace of C^n. X is called an invariant subspace of A ∈ C^{n×n} if AX ⊂ X (i.e. x ∈ X =⇒ Ax ∈ X).

Theorem 5.2.6 Let A ∈ C^{n×n} and V ∈ C^{n×r} with rank(V) = r. Then the following are equivalent:

(a) there exists S ∈ C^{r×r} such that AV = VS;

(b) R(V) is an invariant subspace of A.

Proof: Trivial!

Remark 5.2.2 (a) If Sz = µz, z ≠ 0, then µ is an eigenvalue of A with eigenvector Vz.

(b) If V is a basis of X, then Ṽ = V(V^*V)^{-1/2} is an orthonormal basis of X.

Theorem 5.2.7 Let A ∈ C^{n×n} and Q = (Q_1, Q_2) be unitary. Then the following are equivalent:

(a) If Q^*AQ = B = [ B_11  B_12 ; B_21  B_22 ], then B_21 = 0.

(b) R(Q_1) is an invariant subspace of A.

Proof: Q^*AQ = B ⇐⇒ AQ = QB = (Q_1, Q_2) [ B_11  B_12 ; B_21  B_22 ]. Thus AQ_1 = Q_1B_11 + Q_2B_21.

(a) If B_21 = 0, then AQ_1 = Q_1B_11, so R(Q_1) is an invariant subspace of A (by Theorem 5.2.6).

(b) If R(Q_1) is an invariant subspace, there exists S such that AQ_1 = Q_1S = Q_1B_11 + Q_2B_21. Multiplying by Q_1^*, we get

S = Q_1^*Q_1S = Q_1^*Q_1B_11 + Q_1^*Q_2B_21 = B_11.

So S = B_11 =⇒ Q_2B_21 = 0 =⇒ Q_2^*Q_2B_21 = 0 =⇒ B_21 = 0.

Theorem 5.2.8 Suppose (5.2.1) and (5.2.2) hold, and for E ∈ C^{n×n} partition Q^*EQ as

Q^*EQ = [ E_11  E_12 ]
        [ E_21  E_22 ]

with E_11 ∈ C^{p×p} and E_22 ∈ C^{(n−p)×(n−p)}. If

δ = sep_2(T_11, T_22) − ‖E_11‖_2 − ‖E_22‖_2 > 0

and

‖E_21‖_2 (‖T_12‖_2 + ‖E_12‖_2) ≤ δ²/4,

then there exists P ∈ C^{(n−p)×p} such that

‖P‖_2 ≤ 2‖E_21‖_2 / δ

and such that the columns of Q̃_1 = (Q_1 + Q_2P)(I + P^*P)^{-1/2} form an orthonormal basis for an invariant subspace of A + E. (See Stewart 1973.)


Lemma 5.2.1 Let {s_m} and {p_m} be two sequences defined by

s_{m+1} = s_m/(1 − 2ηp_ms_m),  p_{m+1} = ηp_m²s_{m+1},  m = 0, 1, 2, · · ·   (5.2.3)

with

s_0 = σ,  p_0 = σγ   (5.2.4)

satisfying

4ησ²γ < 1  (here σ, η, γ > 0).   (5.2.5)

Then {s_m} is monotonically increasing and bounded above, and {p_m} is monotonically decreasing and converges quadratically to zero.

Proof: Let

x_m = s_mp_m,  m = 0, 1, 2, · · · .   (5.2.6)

From (5.2.3) we have

x_{m+1} = s_{m+1}p_{m+1} = ηp_m²s_m²/(1 − 2ηp_ms_m)² = ηx_m²/(1 − 2ηx_m)².   (5.2.7)

Condition (5.2.5) can be written as

0 < x_0 < 1/(4η)  (since x_0 = s_0p_0 = σ²γ < 1/(4η)).   (5.2.8)

Consider the fixed-point equation

x = f(x),  f(x) = ηx²/(1 − 2ηx)²,  x ≥ 0.   (5.2.9)

From

df(x)/dx = 2ηx/(1 − 2ηx)³,

we know that f(x) is differentiable and monotonically increasing in [0, 1/(2η)), with df(x)/dx |_{x=0} = 0. Equation (5.2.9) has the zeros 0 and 1/(4η) in [0, 1/(2η)). Under condition (5.2.8) the iterates x_m in (5.2.7) decrease monotonically and converge quadratically to zero (Isaacson & Keller, Analysis of Numerical Methods, 1966, Chapter 3, §1). Thus

s_{m+1}/s_m = 1/(1 − 2ηx_m) = 1 + 2ηx_m/(1 − 2ηx_m) = 1 + t_m,

where t_m is monotonically decreasing and converges quadratically to zero. Hence

s_{m+1} = s_0 ∏_{j=0}^{m} s_{j+1}/s_j = s_0 ∏_{j=0}^{m} (1 + t_j)

is monotonically increasing and converges to s_0 ∏_{j=0}^{∞} (1 + t_j) < ∞, so p_m = x_m/s_m is monotonically decreasing and quadratically convergent to zero.


Theorem 5.2.9 Let

PA_12P + PA_11 − A_22P − A_21 = 0   (5.2.10)

be the quadratic matrix equation in P ∈ C^{(n−l)×l} (1 ≤ l ≤ n), where

A = [ A_11  A_12 ]
    [ A_21  A_22 ],   σ(A_11) ∩ σ(A_22) = ∅.

Define the operator T by

TQ ≡ QA_11 − A_22Q,  Q ∈ C^{(n−l)×l}.   (5.2.11)

Let

η = ‖A_12‖,  γ = ‖A_21‖   (5.2.12)

and

σ = ‖T^{-1}‖ = sup_{‖P‖=1} ‖T^{-1}P‖.   (5.2.13)

If

4ησ²γ < 1,   (5.2.14)

then by the following iteration we can get a solution P of (5.2.10) satisfying

‖P‖ ≤ 2σγ,   (5.2.15)

and the iteration converges quadratically.

Iteration: Let

A_m = [ A_11^{(m)}  A_12^{(m)} ]
      [ A_21^{(m)}  A_22^{(m)} ],   A_0 = A.

(i) Solve

T_mP_m ≡ P_mA_11^{(m)} − A_22^{(m)}P_m = A_21^{(m)}   (5.2.16)

and get P_m ∈ C^{(n−l)×l};

(ii) Compute

A_11^{(m+1)} = A_11^{(m)} + A_12P_m,  A_22^{(m+1)} = A_22^{(m)} − P_mA_12,  A_21^{(m+1)} = −P_mA_12P_m.

Go to (i) and solve for P_{m+1}. Then

P = lim_{m→∞} ∑_{i=0}^{m} P_i   (5.2.17)

is a solution of (5.2.10) and satisfies (5.2.15).
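The iteration can be sketched in a few lines once the Sylvester equation (5.2.16) is solved, e.g. via its Kronecker-product form (illustrative NumPy sketch; the test matrix, the helper name solve_sylvester, and the fixed iteration count are our choices, with the coupling blocks taken small so that (5.2.14) plausibly holds):

```python
import numpy as np

def solve_sylvester(A11, A22, C):
    """Solve P A11 - A22 P = C for P via the vectorized Kronecker system."""
    l, q = A11.shape[0], A22.shape[0]
    K = np.kron(A11.T, np.eye(q)) - np.kron(np.eye(l), A22)
    return np.linalg.solve(K, C.reshape(-1, order="F")).reshape((q, l), order="F")

A11 = np.array([[1.0, 0.2], [0.0, 2.0]])
A22 = np.array([[5.0, 0.1], [0.0, 6.0]])
A12 = 0.1 * np.ones((2, 2))
A21 = 0.05 * np.ones((2, 2))

P = np.zeros((2, 2))
B11, B22, B21 = A11.copy(), A22.copy(), A21.copy()
for m in range(6):
    Pm = solve_sylvester(B11, B22, B21)     # step (i), equation (5.2.16)
    P += Pm                                 # partial sum (5.2.17)
    B11 = B11 + A12 @ Pm                    # updates of step (ii)
    B22 = B22 - Pm @ A12
    B21 = -Pm @ A12 @ Pm

# The accumulated P solves the quadratic matrix equation (5.2.10).
res = P @ A12 @ P + P @ A11 - A22 @ P - A21
print(np.linalg.norm(res))    # essentially zero after a few iterations
```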


Proof: (a) We first prove that T_m^{-1} exists for m = 0, 1, 2, · · · . Denote

‖T_m^{-1}‖ = σ_m  (T = T_0, σ = σ_0);   (5.2.18)

we claim

4‖A_12‖ ‖P_m‖ σ_m < 1.   (5.2.19)

By induction: for m = 0, σ(A_11) ∩ σ(A_22) = ∅ implies that T_0 = T is nonsingular, and from (5.2.12)-(5.2.14) it holds that

4‖A_12‖ ‖P_0‖ σ_0 = 4η‖T^{-1}A_21‖σ ≤ 4ησ²γ < 1.

Suppose T_m^{-1} exists and (5.2.19) holds; we prove that T_{m+1}^{-1} exists and 4‖A_12‖ ‖P_{m+1}‖ σ_{m+1} < 1. From the definition

sep(A_11, A_22) = inf_{‖Q‖=1} ‖QA_11 − A_22Q‖

and the existence of T^{-1} it follows that sep(A_11, A_22) = ‖T^{-1}‖^{-1} = σ^{-1}, and by the perturbation property of "sep",

sep(A_11^{(m+1)}, A_22^{(m+1)}) = sep(A_11^{(m)} + A_12P_m, A_22^{(m)} − P_mA_12)
  ≥ sep(A_11^{(m)}, A_22^{(m)}) − ‖A_12P_m‖ − ‖P_mA_12‖
  ≥ (1 − 2‖A_12‖ ‖P_m‖ σ_m)/σ_m > 0.   (5.2.20)

From

sep(A_11, A_22) ≤ min{|λ_1 − λ_2| : λ_1 ∈ σ(A_11), λ_2 ∈ σ(A_22)}

we have σ(A_11^{(m+1)}) ∩ σ(A_22^{(m+1)}) = ∅; hence T_{m+1}^{-1} exists and sep(A_11^{(m+1)}, A_22^{(m+1)}) = ‖T_{m+1}^{-1}‖^{-1} = σ_{m+1}^{-1}. From (5.2.20) it follows that

σ_{m+1} ≤ σ_m/(1 − 2‖A_12‖ ‖P_m‖ σ_m).   (5.2.21)

Substituting (5.2.19) into (5.2.21), we get σ_{m+1} ≤ 2σ_m, and

‖P_{m+1}‖ ≤ ‖T_{m+1}^{-1}‖ ‖A_21^{(m+1)}‖ ≤ σ_{m+1} ‖P_m‖² ‖A_12‖ < (1/2)‖P_m‖.

Hence

2‖A_12‖ ‖P_{m+1}‖ σ_{m+1} ≤ 2‖A_12‖ ‖P_m‖ σ_m < 1/2.

This proves that T_m^{-1} exists for all m = 0, 1, 2, · · · and that (5.2.19) holds.


(b) We prove that ‖P_m‖ converges quadratically to zero. Construct sequences {q_m}, {s_m}, {p_m} satisfying

‖A_21^{(m)}‖ ≤ q_m,  σ_m ≤ s_m,  ‖P_m‖ ≤ p_m.   (5.2.22)

From

A_21^{(m+1)} = −P_mA_12P_m   (5.2.23)

follows

‖A_21^{(m+1)}‖ ≤ ‖A_12‖ ‖P_m‖² ≤ ηp_m².   (5.2.24)

Define {q_m} by

q_{m+1} = ηp_m²,  q_0 = γ.   (5.2.25)

From (5.2.21) we have

σ_{m+1} ≤ s_m/(1 − 2ηp_ms_m).   (5.2.26)

Define {s_m} by

s_{m+1} = s_m/(1 − 2ηp_ms_m),  s_0 = σ.   (5.2.27)

From (5.2.16) we have

‖P_m‖ ≤ ‖T_m^{-1}‖ ‖A_21^{(m)}‖ = σ_m ‖A_21^{(m)}‖ ≤ s_mq_m.

Define {p_m} by

p_{m+1} = s_{m+1}q_{m+1} = ηp_m²s_{m+1},  p_0 = σγ.   (5.2.28)

By Lemma 5.2.1, {p_m} decreases monotonically to 0, and from (5.2.22) it follows that ‖P_m‖ → 0 quadratically.

(c) We prove that P^{(m)} = ∑_{i=0}^{m} P_i → P and that (5.2.15) holds. Following the method of Lemma 5.2.1, construct {x_m} (see (5.2.6), (5.2.7)), that is,

x_{m+1} = ηx_m²/(1 − 2ηx_m)²,  s_{m+1} = s_m/(1 − 2ηx_m),   (5.2.29)

and then

p_{m+1} = x_{m+1}/s_{m+1} = [ηx_m/(1 − 2ηx_m)] p_m.   (5.2.30)

By induction, for all m = 1, 2, · · · we have

p_m < (1/2)p_{m−1},  x_m < 1/(4η).   (5.2.31)

In fact, substituting

ηx_0/(1 − 2ηx_0) = ησ²γ/(1 − 2ησ²γ) < 1/2   (5.2.32)

into (5.2.30) gives p_1 < (1/2)p_0; from (5.2.29) and (5.2.32) it follows that

x_1 = (1/η)[ηx_0/(1 − 2ηx_0)]² < 1/(4η).
