Part I
On the Numerical Solutions of
Linear Systems
Chapter 1 Introduction
1.1 Mathematical auxiliary, definitions and relations
1.1.1 Vectors and matrices
A ∈ K^{m×n}, where K = R or C, ⇔
\[
A = [a_{ij}] = \begin{bmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \cdots & a_{mn} \end{bmatrix}.
\]
• Product of matrices (K^{m×n} × K^{n×p} → K^{m×p}): C = AB, where c_{ij} = Σ_{k=1}^n a_{ik}b_{kj}, i = 1, ..., m, j = 1, ..., p.
• Transpose (R^{m×n} → R^{n×m}): C = A^T, where c_{ij} = a_{ji} ∈ R.
• Conjugate transpose (C^{m×n} → C^{n×m}): C = A^* or C = A^H, where c_{ij} = ā_{ji} ∈ C.
• Differentiation (R^{m×n} → R^{m×n}): Let C(t) = [c_{ij}(t)]. Then Ċ(t) = [ċ_{ij}(t)].
• If A, B ∈ K^{n×n} satisfy AB = I, then B is the inverse of A and is denoted by A^{-1}. If A^{-1} exists, then A is said to be nonsingular; otherwise, A is singular. A is nonsingular if and only if det(A) ≠ 0.
• If A ∈ K^{m×n}, x ∈ K^n and y = Ax, then y_i = Σ_{j=1}^n a_{ij}x_j, i = 1, ..., m.
• Outer product of x ∈ K^m and y ∈ K^n:
\[
xy^* = \begin{bmatrix} x_1\bar{y}_1 & \cdots & x_1\bar{y}_n\\ \vdots & \ddots & \vdots\\ x_m\bar{y}_1 & \cdots & x_m\bar{y}_n \end{bmatrix} \in \mathbb{K}^{m\times n}.
\]
• Inner product of x and y ∈ K^n:
\[
(x, y) := x^Ty = \sum_{i=1}^n x_iy_i = y^Tx \in \mathbb{R} \quad (\mathbb{K} = \mathbb{R}),
\]
\[
(x, y) := x^*y = \sum_{i=1}^n \bar{x}_iy_i = y^*x \in \mathbb{C} \quad (\mathbb{K} = \mathbb{C}).
\]
• Sherman–Morrison formula:
Let A ∈ R^{n×n} be nonsingular and u, v ∈ R^n. If v^TA^{-1}u ≠ −1, then
\[
(A + uv^T)^{-1} = A^{-1} - (1 + v^TA^{-1}u)^{-1}A^{-1}uv^TA^{-1}. \tag{1.1.1}
\]
• Sherman–Morrison–Woodbury formula:
Let A ∈ R^{n×n} be nonsingular and U, V ∈ R^{n×k}. If I + V^TA^{-1}U is invertible, then
\[
(A + UV^T)^{-1} = A^{-1} - A^{-1}U(I + V^TA^{-1}U)^{-1}V^TA^{-1}.
\]
Proof of (1.1.1):
\[
\begin{aligned}
(A + uv^T)\big[A^{-1} - A^{-1}uv^TA^{-1}/(1 + v^TA^{-1}u)\big]
&= I + \frac{1}{1 + v^TA^{-1}u}\big[uv^TA^{-1}(1 + v^TA^{-1}u) - uv^TA^{-1} - uv^TA^{-1}uv^TA^{-1}\big]\\
&= I + \frac{1}{1 + v^TA^{-1}u}\big[u(v^TA^{-1}u)v^TA^{-1} - uv^TA^{-1}uv^TA^{-1}\big] = I,
\end{aligned}
\]
since v^TA^{-1}u is a scalar, so the bracketed term vanishes.
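The identity is easy to sanity-check numerically. The following is a minimal sketch in Python, assuming NumPy is available; the matrix A and vectors u, v are arbitrary test data, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # diagonal shift keeps A nonsingular
u = rng.standard_normal(n)
v = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
denom = 1.0 + v @ Ainv @ u                        # nonzero by the hypothesis of (1.1.1)
lhs = np.linalg.inv(A + np.outer(u, v))           # direct inverse of the rank-one update
rhs = Ainv - np.outer(Ainv @ u, v @ Ainv) / denom # Sherman-Morrison
print(np.allclose(lhs, rhs))                      # True
```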
Example 1.1.1
\[
A = \begin{bmatrix} 3 & -1 & 1 & 1 & 1\\ 0 & 1 & 2 & 2 & 2\\ 0 & -1 & 4 & 1 & 1\\ 0 & 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}
= B + \begin{bmatrix} 0\\ 0\\ -1\\ 0\\ 0 \end{bmatrix}\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \end{bmatrix},
\]
where B (obtained from A by deleting its (3,2) entry) is upper triangular, so (1.1.1) reduces the inversion of A to the inversion of the triangular matrix B.
1.1.2 Rank and orthogonality
Let A ∈ R^{m×n}. Then
• R(A) = {y ∈ R^m | y = Ax for some x ∈ R^n} ⊆ R^m is the range space of A.
• N(A) = {x ∈ R^n | Ax = 0} ⊆ R^n is the null space of A.
• rank(A) = dim[R(A)] = the number of maximal linearly independent columns of A.
• rank(A) = rank(A^T).
• dim(N(A)) + rank(A) = n.
• If m = n, then A is nonsingular ⇔ N(A) = {0} ⇔ rank(A) = n.
• Let {x_1, ..., x_p} ⊆ R^n. Then {x_1, ..., x_p} is said to be orthogonal if x_i^T x_j = 0 for i ≠ j, and orthonormal if x_i^T x_j = δ_{ij}, where δ_{ij} = 0 if i ≠ j and δ_{ij} = 1 if i = j.
• S^⊥ = {y ∈ R^m | y^T x = 0 for all x ∈ S} is the orthogonal complement of S ⊆ R^m.
• R^n = R(A^T) ⊕ N(A), R^m = R(A) ⊕ N(A^T).
• R(A^T) ⊥ N(A), R(A)^⊥ = N(A^T).
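As a quick numerical illustration of the rank–nullity relation dim(N(A)) + rank(A) = n above, here is a sketch assuming NumPy and SciPy are available; the rank-one matrix is arbitrary test data.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])            # rank 1: the second row is twice the first
r = int(np.linalg.matrix_rank(A))
N = null_space(A)                       # orthonormal basis of N(A)
print(r, N.shape[1], r + N.shape[1])    # 1 2 3: rank + dim N(A) = n = 3
print(np.allclose(A @ N, 0.0))          # True: columns of N lie in N(A)
```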
A ∈ R^{n×n}                                        A ∈ C^{n×n}
symmetric: A^T = A                                 Hermitian: A^* = A (A^H = A)
skew-symmetric: A^T = −A                           skew-Hermitian: A^* = −A
positive definite: x^TAx > 0, x ≠ 0                positive definite: x^*Ax > 0, x ≠ 0
non-negative definite: x^TAx ≥ 0                   non-negative definite: x^*Ax ≥ 0
indefinite: (x^TAx)(y^TAy) < 0 for some x, y       indefinite: (x^*Ax)(y^*Ay) < 0 for some x, y
orthogonal: A^TA = I_n                             unitary: A^*A = I_n
normal: A^TA = AA^T                                normal: A^*A = AA^*
positive: a_{ij} > 0
non-negative: a_{ij} ≥ 0

Table 1.1: Some definitions for matrices.
1.1.3 Special matrices
Let A ∈ K^{n×n}. Then the matrix A is
• diagonal if a_{ij} = 0 for i ≠ j. We write D = diag(d_1, ..., d_n) ∈ D_n, where D_n denotes the set of diagonal matrices;
• tridiagonal if a_{ij} = 0 for |i − j| > 1;
• upper bi-diagonal if a_{ij} = 0 for i > j or j > i + 1;
• (strictly) upper triangular if a_{ij} = 0 for i > j (i ≥ j);
• upper Hessenberg if a_{ij} = 0 for i > j + 1.
(The lower bi-diagonal, (strictly) lower triangular, and lower Hessenberg cases are defined analogously.)
Sparse matrix: the number of nonzero entries is about n^{1+r} with r < 1 (usually between 0.2 ∼ 0.5). For example, if n = 1000 and r = 0.9, then n^{1+r} ≈ 501187.
Example 1.1.2 If S is skew-symmetric, then I − S is nonsingular and (I − S)^{-1}(I + S) is orthogonal (the Cayley transformation of S).
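A short numerical check of Example 1.1.2 (a sketch, assuming NumPy; S is a random skew-symmetric test matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
S = (M - M.T) / 2                      # skew-symmetric: S^T = -S
I = np.eye(n)
Q = np.linalg.solve(I - S, I + S)      # Cayley transformation (I - S)^{-1}(I + S)
print(np.allclose(Q.T @ Q, I))         # True: Q is orthogonal
```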
1.1.4 Eigenvalues and eigenvectors
Definition 1.1.1 Let A ∈ C^{n×n}. Then λ ∈ C is called an eigenvalue of A if there exists x ≠ 0, x ∈ C^n, with Ax = λx; x is called an eigenvector corresponding to λ.
Notations:
σ(A) := spectrum of A = the set of eigenvalues of A.
ρ(A) := spectral radius of A = max{|λ| : λ ∈ σ(A)}.
• λ ∈ σ(A) ⇔ det(A − λI) = 0.
• p(λ) = det(λI − A) = the characteristic polynomial of A.
• p(λ) = Π_{i=1}^s (λ − λ_i)^{m(λ_i)}, where λ_i ≠ λ_j for i ≠ j and Σ_{i=1}^s m(λ_i) = n.
• m(λ_i) = the algebraic multiplicity of λ_i.
• n(λ_i) = n − rank(A − λ_iI) = the geometric multiplicity of λ_i.
• 1 ≤ n(λ_i) ≤ m(λ_i).
Definition 1.1.2 If there is some i such that n(λ_i) < m(λ_i), then A is called degenerate (defective).
The following statements are equivalent:
(1) There are n linearly independent eigenvectors;
(2) A is diagonalizable, i.e., there is a nonsingular matrix T such that T^{-1}AT ∈ D_n;
(3) For each λ ∈ σ(A), it holds that m(λ) = n(λ).
If A is degenerate, then its eigenvectors together with its principal vectors yield the Jordan form of A. (See Gantmacher: Matrix Theory I, II.)
Theorem 1.1.1 (Schur) (1) Let A ∈ C^{n×n}. There is a unitary matrix U such that U^*AU (= U^{-1}AU) is upper triangular.
(2) Let A ∈ R^{n×n}. There is an orthogonal matrix Q such that Q^TAQ (= Q^{-1}AQ) is quasi-upper triangular, i.e., an upper triangular matrix possibly with nonzero subdiagonal elements in non-consecutive positions.
(3) A is normal if and only if there is a unitary U such that U^*AU = D is diagonal.
(4) A is Hermitian if and only if A is normal and σ(A) ⊆ R.
(5) A is symmetric if and only if there is an orthogonal U such that U^TAU = D is diagonal and σ(A) ⊆ R.
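Part (1) can be explored with SciPy's Schur factorization. A minimal sketch, assuming SciPy is installed; A is random test data:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
T, U = schur(A, output='complex')                # A = U T U^*, T upper triangular
print(np.allclose(U.conj().T @ U, np.eye(4)))    # True: U is unitary
print(np.allclose(np.tril(T, -1), 0.0))          # True: T is upper triangular
print(np.allclose(U @ T @ U.conj().T, A))        # True: A is recovered
```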
1.2 Norms and eigenvalues
Let X be a vector space over K = R or C.
Definition 1.2.1 (Vector norms) Let N be a real-valued function defined on X (N : X → R_+). Then N is a (vector) norm if
N1: N(αx) = |α|N(x), α ∈ K, for x ∈ X;
N2: N(x + y) ≤ N(x) + N(y), for x, y ∈ X;
N3: N(x) = 0 if and only if x = 0.
The usual notation is ‖x‖ = N(x).
Example 1.2.1 Let X = C^n, p ≥ 1. Then ‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p} is a p-norm. Especially,
\[
\|x\|_1 = \sum_{i=1}^n |x_i| \quad\text{(1-norm)},
\]
\[
\|x\|_2 = \Big(\sum_{i=1}^n |x_i|^2\Big)^{1/2} \quad\text{(2-norm = Euclidean norm)},
\]
\[
\|x\|_\infty = \max_{1\le i\le n}|x_i| \quad\text{(∞-norm = maximum norm)}.
\]
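These norms are easy to compute; a minimal sketch assuming NumPy, with x arbitrary test data:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])
print(np.abs(x).sum(),               np.linalg.norm(x, 1))       # 1-norm
print(np.sqrt((np.abs(x)**2).sum()), np.linalg.norm(x, 2))       # 2-norm
print(np.abs(x).max(),               np.linalg.norm(x, np.inf))  # infinity-norm
```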
Lemma 1.2.1 N(x) is a continuous function of the components x_1, ..., x_n of x.
Proof:
\[
|N(x) - N(y)| \le N(x - y) \le \sum_{j=1}^n |x_j - y_j|\,N(e_j) \le \|x - y\|_\infty\sum_{j=1}^n N(e_j).
\]
Theorem 1.2.1 (Equivalence of norms) Let N and M be two norms on C^n. Then there are constants c_1, c_2 > 0 such that
\[
c_1M(x) \le N(x) \le c_2M(x), \quad \text{for all } x \in \mathbb{C}^n.
\]
Proof: Without loss of generality (W.L.O.G.) we can assume that M(x) = ‖x‖_∞ and N is arbitrary. We claim that
\[
c_1\|x\|_\infty \le N(x) \le c_2\|x\|_\infty, \quad\text{equivalently,}\quad c_1 \le N(z) \le c_2 \ \text{ for all } z \in S = \{z \in \mathbb{C}^n \mid \|z\|_\infty = 1\}.
\]
From Lemma 1.2.1, N is continuous on S (closed and bounded). By the maximum and minimum principle, there are c_1, c_2 ≥ 0 and z_1, z_2 ∈ S such that
\[
c_1 = N(z_1) \le N(z) \le N(z_2) = c_2.
\]
If c_1 = 0, then N(z_1) = 0, and thus z_1 = 0. This contradicts z_1 ∈ S.
Remark 1.2.1 Theorem 1.2.1 does not hold in infinite-dimensional spaces.
Definition 1.2.2 (Matrix norms) Let A ∈ C^{m×n}. A real-valued function ‖·‖ : C^{m×n} → R_+ satisfying
N1: ‖αA‖ = |α|‖A‖;
N2: ‖A + B‖ ≤ ‖A‖ + ‖B‖;
N3: ‖A‖ = 0 if and only if A = 0;
N4: ‖AB‖ ≤ ‖A‖‖B‖;
N5: ‖Ax‖_v ≤ ‖A‖‖x‖_v (matrix and vector norms are compatible for some vector norm ‖·‖_v)
is called a matrix norm. If ‖·‖ satisfies N1 to N4, then it is called a multiplicative or algebra norm.
Example 1.2.2 (Frobenius norm) Let ‖A‖_F = [Σ_{i,j=1}^n |a_{ij}|^2]^{1/2}. Then
\[
\begin{aligned}
\|AB\|_F &= \Big(\sum_{i,j}\Big|\sum_k a_{ik}b_{kj}\Big|^2\Big)^{1/2}\\
&\le \Big(\sum_{i,j}\Big\{\sum_k |a_{ik}|^2\Big\}\Big\{\sum_k |b_{kj}|^2\Big\}\Big)^{1/2} \quad\text{(Cauchy–Schwarz inequality)}\\
&= \Big(\sum_i\sum_k |a_{ik}|^2\Big)^{1/2}\Big(\sum_j\sum_k |b_{kj}|^2\Big)^{1/2} = \|A\|_F\|B\|_F. \tag{1.2.1}
\end{aligned}
\]
This implies that N4 holds. Furthermore, by the Cauchy–Schwarz inequality we have
\[
\|Ax\|_2 = \Big(\sum_i\Big|\sum_j a_{ij}x_j\Big|^2\Big)^{1/2} \le \Big(\sum_i\Big(\sum_j |a_{ij}|^2\Big)\Big(\sum_j |x_j|^2\Big)\Big)^{1/2} = \|A\|_F\|x\|_2. \tag{1.2.2}
\]
This implies that N5 holds. Also, N1, N2 and N3 hold obviously. (Here, ‖I‖_F = √n.)
Example 1.2.3 (Operator norm) Given a vector norm ‖·‖, an associated (induced) matrix norm is defined by
\[
\|A\| = \sup_{x\ne 0}\frac{\|Ax\|}{\|x\|} = \max_{x\ne 0}\frac{\|Ax\|}{\|x\|}. \tag{1.2.3}
\]
Then N5 holds immediately. On the other hand,
\[
\|(AB)x\| = \|A(Bx)\| \le \|A\|\|Bx\| \le \|A\|\|B\|\|x\| \tag{1.2.4}
\]
for all x ≠ 0. This implies that
\[
\|AB\| \le \|A\|\|B\|, \tag{1.2.5}
\]
so N4 holds. (Here ‖I‖ = 1.)
In the following, we present and verify three useful matrix norms:
\[
\|A\|_1 = \sup_{x\ne 0}\frac{\|Ax\|_1}{\|x\|_1} = \max_{1\le j\le n}\sum_{i=1}^n |a_{ij}|, \tag{1.2.6}
\]
\[
\|A\|_\infty = \sup_{x\ne 0}\frac{\|Ax\|_\infty}{\|x\|_\infty} = \max_{1\le i\le n}\sum_{j=1}^n |a_{ij}|, \tag{1.2.7}
\]
\[
\|A\|_2 = \sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2} = \sqrt{\rho(A^*A)}. \tag{1.2.8}
\]
Proof of (1.2.6):
\[
\|Ax\|_1 = \sum_i\Big|\sum_j a_{ij}x_j\Big| \le \sum_i\sum_j |a_{ij}||x_j| = \sum_j |x_j|\sum_i |a_{ij}|.
\]
Let C_1 := Σ_i |a_{ik}| = max_j Σ_i |a_{ij}|. Then ‖Ax‖_1 ≤ C_1‖x‖_1, thus ‖A‖_1 ≤ C_1. On the other hand, ‖e_k‖_1 = 1 and ‖Ae_k‖_1 = Σ_{i=1}^n |a_{ik}| = C_1.
Proof of (1.2.7):
\[
\|Ax\|_\infty = \max_i\Big|\sum_j a_{ij}x_j\Big| \le \max_i\sum_j |a_{ij}x_j| \le \Big(\max_i\sum_j |a_{ij}|\Big)\|x\|_\infty \equiv \sum_j |a_{kj}|\,\|x\|_\infty \equiv C_\infty\|x\|_\infty.
\]
This implies that ‖A‖_∞ ≤ C_∞. If A = 0, there is nothing to prove. Assume that A ≠ 0 and that the k-th row of A is nonzero. Define z = [z_j] ∈ C^n by
\[
z_j = \begin{cases} \bar{a}_{kj}/|a_{kj}|, & \text{if } a_{kj} \ne 0,\\ 1, & \text{if } a_{kj} = 0.\end{cases}
\]
Then ‖z‖_∞ = 1 and a_{kj}z_j = |a_{kj}| for j = 1, ..., n. It follows that
\[
\|A\|_\infty \ge \|Az\|_\infty = \max_i\Big|\sum_j a_{ij}z_j\Big| \ge \Big|\sum_j a_{kj}z_j\Big| = \sum_{j=1}^n |a_{kj}| \equiv C_\infty.
\]
Thus, ‖A‖_∞ ≥ max_{1≤i≤n} Σ_{j=1}^n |a_{ij}| ≡ C_∞.
Proof of (1.2.8): Let λ_1 ≥ λ_2 ≥ ... ≥ λ_n ≥ 0 be the eigenvalues of A^*A. There are mutually orthonormal vectors v_j, j = 1, ..., n, such that (A^*A)v_j = λ_jv_j. Let x = Σ_j α_jv_j. Since ‖Ax‖_2^2 = (Ax, Ax) = (x, A^*Ax),
\[
\|Ax\|_2^2 = \Big(\sum_j \alpha_jv_j, \sum_j \alpha_j\lambda_jv_j\Big) = \sum_j \lambda_j|\alpha_j|^2 \le \lambda_1\|x\|_2^2.
\]
Therefore ‖A‖_2^2 ≤ λ_1. Equality follows by choosing x = v_1, for which ‖Av_1‖_2^2 = (v_1, λ_1v_1) = λ_1. So we have ‖A‖_2 = √(ρ(A^*A)).
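The three formulas can be checked against library routines. A sketch assuming NumPy; A is random test data:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 4))
print(np.isclose(np.linalg.norm(A, 1),      np.abs(A).sum(axis=0).max()))  # (1.2.6)
print(np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max()))  # (1.2.7)
rho = np.max(np.linalg.eigvalsh(A.T @ A))   # spectral radius of A^T A (A real here)
print(np.isclose(np.linalg.norm(A, 2), np.sqrt(rho)))                      # (1.2.8)
```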
Example 1.2.4 (Dual norm) Let 1/p + 1/q = 1. Then ‖·‖_p^* = ‖·‖_q (p = ∞, q = 1). (This follows from the Hölder inequality |y^*x| ≤ ‖x‖_p‖y‖_q.)
Theorem 1.2.2 Let A ∈ C^{n×n}. Then for any operator norm ‖·‖, it holds that ρ(A) ≤ ‖A‖. Moreover, for any ε > 0, there exists an operator norm ‖·‖_ε such that ‖A‖_ε ≤ ρ(A) + ε.
Proof: Let |λ| = ρ(A) ≡ ρ and let x be an associated eigenvector with ‖x‖ = 1. Then
\[
\rho(A) = |\lambda| = \|\lambda x\| = \|Ax\| \le \|A\|\|x\| = \|A\|.
\]
On the other hand, there is a unitary matrix U such that A = U^*RU, where R is upper triangular. Let D_t = diag(t, t^2, ..., t^n). Compute
\[
D_tRD_t^{-1} = \begin{bmatrix}
\lambda_1 & t^{-1}r_{12} & t^{-2}r_{13} & \cdots & t^{-n+1}r_{1n}\\
 & \lambda_2 & t^{-1}r_{23} & \cdots & t^{-n+2}r_{2n}\\
 & & \lambda_3 & \ddots & \vdots\\
 & & & \ddots & t^{-1}r_{n-1,n}\\
 & & & & \lambda_n
\end{bmatrix}.
\]
For t > 0 sufficiently large, the sum of all absolute values of the off-diagonal elements of D_tRD_t^{-1} is less than ε. So it holds that ‖D_tRD_t^{-1}‖_1 ≤ ρ(A) + ε for sufficiently large t(ε) > 0. Define ‖·‖_ε for any B by
\[
\|B\|_\varepsilon = \|D_tUBU^*D_t^{-1}\|_1 = \|(U^*D_t^{-1})^{-1}B(U^*D_t^{-1})\|_1.
\]
This implies that
\[
\|A\|_\varepsilon = \|D_tRD_t^{-1}\|_1 \le \rho(A) + \varepsilon.
\]
Remark 1.2.2 For unitary U and V,
\[
\|UAV\|_F = \|A\|_F \quad \big(\text{by } \|UA\|_F = \sqrt{\|Ua_1\|_2^2 + \cdots + \|Ua_n\|_2^2}\,\big), \tag{1.2.9}
\]
\[
\|UAV\|_2 = \|A\|_2 \quad \big(\text{by } \rho(A^*A) = \rho(AA^*)\big). \tag{1.2.10}
\]
Theorem 1.2.3 (Singular Value Decomposition (SVD)) Let A ∈ C^{m×n}. Then there exist unitary matrices U = [u_1, ..., u_m] ∈ C^{m×m} and V = [v_1, ..., v_n] ∈ C^{n×n} such that
\[
U^*AV = \mathrm{diag}(\sigma_1, \dots, \sigma_p) = \Sigma,
\]
where p = min{m, n} and σ_1 ≥ σ_2 ≥ ... ≥ σ_p ≥ 0. (Here, σ_i denotes the i-th largest singular value of A.)
Proof: There are x ∈ C^n, y ∈ C^m with ‖x‖_2 = ‖y‖_2 = 1 such that Ax = σy, where σ = ‖A‖_2 (‖A‖_2 = sup_{‖x‖_2=1}‖Ax‖_2). Let V = [x, V_1] ∈ C^{n×n} and U = [y, U_1] ∈ C^{m×m} be unitary. Then
\[
A_1 \equiv U^*AV = \begin{bmatrix} \sigma & w^*\\ 0 & B \end{bmatrix}.
\]
Since
\[
\left\|A_1\begin{bmatrix}\sigma\\ w\end{bmatrix}\right\|_2^2 \ge (\sigma^2 + w^*w)^2,
\]
it follows from
\[
\|A_1\|_2^2 \ge \left\|A_1\begin{bmatrix}\sigma\\ w\end{bmatrix}\right\|_2^2 \Big/ \left\|\begin{bmatrix}\sigma\\ w\end{bmatrix}\right\|_2^2 \ge \sigma^2 + w^*w
\]
that ‖A_1‖_2^2 ≥ σ^2 + w^*w. But σ^2 = ‖A‖_2^2 = ‖A_1‖_2^2, which implies w = 0. Hence the theorem holds by induction.
Remark 1.2.3 ‖A‖_2 = √(ρ(A^*A)) = σ_1 = the maximal singular value of A.
Let A = UΣV^*. Then we have
\[
\|ABC\|_F = \|U\Sigma V^*BC\|_F = \|\Sigma V^*BC\|_F \le \sigma_1\|BC\|_F = \|A\|_2\|BC\|_F.
\]
This implies
\[
\|ABC\|_F \le \|A\|_2\|B\|_F\|C\|_2. \tag{1.2.11}
\]
In addition, by (1.2.2) and (1.2.11), we get
\[
\|A\|_2 \le \|A\|_F \le \sqrt{n}\,\|A\|_2. \tag{1.2.12}
\]
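Remark 1.2.3 and (1.2.12) in a short numerical sketch (assuming NumPy; A is random test data):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 6, 4
A = rng.standard_normal((m, n))
sigma = np.linalg.svd(A, compute_uv=False)  # singular values, in descending order
n2 = np.linalg.norm(A, 2)
nF = np.linalg.norm(A, 'fro')
print(np.isclose(n2, sigma[0]))             # ||A||_2 = sigma_1
print(n2 <= nF <= np.sqrt(n) * n2)          # (1.2.12), with n = number of columns
```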
Theorem 1.2.4 Let A ∈ C^{n×n}. The following statements are equivalent:
(1) lim_{m→∞} A^m = 0;
(2) lim_{m→∞} A^m x = 0 for all x;
(3) ρ(A) < 1.
Proof: (1) ⇒ (2): Trivial. (2) ⇒ (3): Let λ ∈ σ(A), i.e., Ax = λx with x ≠ 0. Then A^m x = λ^m x → 0 forces λ^m → 0. Thus |λ| < 1, i.e., ρ(A) < 1. (3) ⇒ (1): There is an operator norm ‖·‖ with ‖A‖ < 1 (by Theorem 1.2.2). Therefore, ‖A^m‖ ≤ ‖A‖^m → 0, i.e., A^m → 0.
Theorem 1.2.5 It holds that
\[
\rho(A) = \lim_{k\to\infty}\|A^k\|^{1/k},
\]
where ‖·‖ is an operator norm.
Proof: Since ρ(A)^k = ρ(A^k) ≤ ‖A^k‖, we have ρ(A) ≤ ‖A^k‖^{1/k} for k = 1, 2, .... If ε > 0, then Ã = [ρ(A) + ε]^{-1}A has spectral radius less than 1, and by Theorem 1.2.4 we have ‖Ã^k‖ → 0 as k → ∞. There is an N = N(ε, A) such that ‖Ã^k‖ < 1 for all k ≥ N. Thus ‖A^k‖ ≤ [ρ(A) + ε]^k for all k ≥ N, or ‖A^k‖^{1/k} ≤ ρ(A) + ε for all k ≥ N. Since ρ(A) ≤ ‖A^k‖^{1/k} and ε is arbitrary, lim_{k→∞}‖A^k‖^{1/k} exists and equals ρ(A).
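The limit in Theorem 1.2.5 can be watched numerically. A sketch assuming NumPy; A is random test data scaled so that ρ(A) = 1/2:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
A /= 2.0 * np.max(np.abs(np.linalg.eigvals(A)))   # scale so that rho(A) = 1/2
for k in (1, 5, 25, 125):
    nk = np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1.0 / k)
    print(k, nk)                                  # decreases toward rho(A) = 0.5
```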
Theorem 1.2.6 Let A ∈ C^{n×n} with ρ(A) < 1. Then (I − A)^{-1} exists and
\[
(I - A)^{-1} = I + A + A^2 + \cdots.
\]
Proof: Since ρ(A) < 1, the eigenvalues 1 − λ, λ ∈ σ(A), of I − A are nonzero, so (I − A)^{-1} exists. From
\[
(I - A)(I + A + A^2 + \cdots + A^m) = I - A^{m+1} \to I \quad (\text{by Theorem 1.2.4}),
\]
the series converges to (I − A)^{-1}.
Corollary 1.2.1 If ‖A‖ < 1, then (I − A)^{-1} exists and
\[
\|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}.
\]
Proof: Since ρ(A) ≤ ‖A‖ < 1 (by Theorem 1.2.2),
\[
\|(I - A)^{-1}\| = \Big\|\sum_{i=0}^\infty A^i\Big\| \le \sum_{i=0}^\infty \|A\|^i = (1 - \|A\|)^{-1}.
\]
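Theorem 1.2.6 and Corollary 1.2.1, checked on a small example (a sketch assuming NumPy; A is random test data scaled so that ‖A‖_2 = 1/2):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
A *= 0.5 / np.linalg.norm(A, 2)          # enforce ||A||_2 = 1/2 < 1
I = np.eye(4)
S = I.copy()
term = I.copy()
for _ in range(60):                      # partial sum I + A + ... + A^60
    term = term @ A
    S += term
print(np.allclose(S, np.linalg.inv(I - A)))                        # True
print(np.linalg.norm(np.linalg.inv(I - A), 2) <= 1.0 / (1 - 0.5))  # Corollary 1.2.1
```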
Theorem 1.2.7 (Without proof) For A ∈ K^{n×n} the following statements are equivalent:
(1) There is a multiplicative norm p with p(A^k) ≤ 1, k = 1, 2, ....
(2) For each multiplicative norm p, the powers p(A^k) are uniformly bounded, i.e., there exists an M(p) < ∞ such that p(A^k) ≤ M(p), k = 0, 1, 2, ....
(3) ρ(A) ≤ 1 and all eigenvalues λ with |λ| = 1 are not degenerate (i.e., m(λ) = n(λ)).
(See Householder's book: The Theory of Matrices in Numerical Analysis, pp. 45–47.)
In the following we prove some important inequalities for vector norms and matrix norms.
(a) It holds that
\[
1 \le \frac{\|x\|_p}{\|x\|_q} \le n^{(q-p)/pq}, \quad (p \le q). \tag{1.2.13}
\]
Proof: Claim ‖x‖_q ≤ ‖x‖_p (p ≤ q): It holds that
\[
\|x\|_q = \|x\|_p\left\|\frac{x}{\|x\|_p}\right\|_q \le C_{p,q}\|x\|_p,
\]
where
\[
C_{p,q} = \max_{\|e\|_p = 1}\|e\|_q, \quad e = (e_1, \dots, e_n)^T.
\]
We now show that C_{p,q} ≤ 1. From p ≤ q, we have
\[
\|e\|_q^q = \sum_{i=1}^n |e_i|^q \le \sum_{i=1}^n |e_i|^p = 1 \quad (\text{by } |e_i| \le 1).
\]
Hence C_{p,q} ≤ 1, thus ‖x‖_q ≤ ‖x‖_p.
To prove the second inequality, let α = q/p > 1. Then the Jensen inequality holds for the convex function ϕ:
\[
\varphi\Big(\int_\Omega f\,d\mu\Big) \le \int_\Omega (\varphi\circ f)\,d\mu, \quad \mu(\Omega) = 1.
\]
If we take ϕ(x) = x^α, then we have
\[
\int_\Omega |f|^q\,dx = \int_\Omega (|f|^p)^{q/p}\,dx \ge \Big(\int_\Omega |f|^p\,dx\Big)^{q/p}
\]
with |Ω| = 1. Consider the discrete measure Σ_{i=1}^n 1/n = 1 and f(i) = |x_i|. It follows that
\[
\sum_{i=1}^n |x_i|^q\,\frac{1}{n} \ge \Big(\sum_{i=1}^n |x_i|^p\,\frac{1}{n}\Big)^{q/p}.
\]
Hence we have n^{-1/q}‖x‖_q ≥ n^{-1/p}‖x‖_p. Thus,
\[
n^{(q-p)/pq}\|x\|_q \ge \|x\|_p.
\]
(b) It holds that
\[
1 \le \frac{\|x\|_p}{\|x\|_\infty} \le n^{1/p}. \tag{1.2.14}
\]
Proof: Let q → ∞ in (1.2.13) and note that lim_{q→∞}‖x‖_q = ‖x‖_∞. Indeed, with |x_k| = max_i |x_i|,
\[
\|x\|_\infty = |x_k| = (|x_k|^q)^{1/q} \le \Big(\sum_{i=1}^n |x_i|^q\Big)^{1/q} = \|x\|_q.
\]
On the other hand, we have
\[
\|x\|_q = \Big(\sum_{i=1}^n |x_i|^q\Big)^{1/q} \le (n\|x\|_\infty^q)^{1/q} = n^{1/q}\|x\|_\infty,
\]
which implies that lim_{q→∞}‖x‖_q = ‖x‖_∞.
(c) It holds that
\[
\max_{1\le j\le n}\|a_j\|_p \le \|A\|_p \le n^{(p-1)/p}\max_{1\le j\le n}\|a_j\|_p, \tag{1.2.15}
\]
where A = [a_1, ..., a_n] ∈ R^{m×n}.
Proof: The first inequality holds obviously. To show the second inequality, for ‖y‖_p = 1 we have
\[
\|Ay\|_p \le \sum_{j=1}^n |y_j|\,\|a_j\|_p \le \Big(\sum_{j=1}^n |y_j|\Big)\max_j\|a_j\|_p = \|y\|_1\max_j\|a_j\|_p \le n^{(p-1)/p}\max_j\|a_j\|_p \quad (\text{by } (1.2.13)).
\]
(d) It holds that
\[
\max_{i,j}|a_{ij}| \le \|A\|_p \le n^{(p-1)/p}m^{1/p}\max_{i,j}|a_{ij}|, \tag{1.2.16}
\]
where A ∈ R^{m×n}.
Proof: By (1.2.14) and (1.2.15) immediately.
(e) It holds that
\[
m^{(1-p)/p}\|A\|_1 \le \|A\|_p \le n^{(p-1)/p}\|A\|_1. \tag{1.2.17}
\]
Proof: By (1.2.15) and (1.2.13) immediately.
(f) By the Hölder inequality (see the Appendix below), we have
\[
|y^*x| \le \|x\|_p\|y\|_q, \quad \frac{1}{p} + \frac{1}{q} = 1,
\]
or
\[
\max\{|x^*y| : \|y\|_q = 1\} = \|x\|_p. \tag{1.2.18}
\]
Then it holds that
\[
\|A\|_p = \|A^T\|_q. \tag{1.2.19}
\]
Proof: By (1.2.18) we have
\[
\max_{\|x\|_p=1}\|Ax\|_p = \max_{\|x\|_p=1}\max_{\|y\|_q=1}|(Ax)^Ty| = \max_{\|y\|_q=1}\max_{\|x\|_p=1}|x^T(A^Ty)| = \max_{\|y\|_q=1}\|A^Ty\|_q = \|A^T\|_q.
\]
(g) It holds that
\[
n^{-1/p}\|A\|_\infty \le \|A\|_p \le m^{1/p}\|A\|_\infty. \tag{1.2.20}
\]
Proof: By (1.2.17) and (1.2.19), we get
\[
m^{1/p}\|A\|_\infty = m^{1/p}\|A^T\|_1 = m^{1 - 1/q}\|A^T\|_1 = m^{(q-1)/q}\|A^T\|_1 \ge \|A^T\|_q = \|A\|_p.
\]
(h) It holds that
\[
\|A\|_2 \le \sqrt{\|A\|_p\|A\|_q}, \quad \Big(\frac{1}{p} + \frac{1}{q} = 1\Big). \tag{1.2.21}
\]
Proof: By (1.2.19) we have
\[
\|A\|_p\|A\|_q = \|A^T\|_q\|A\|_q \ge \|A^TA\|_q \ge \|A^TA\|_2 = \|A\|_2^2.
\]
Here ‖A^TA‖_q ≥ ‖A^TA‖_2 holds by the following statement: let S be a symmetric matrix. Then ‖S‖_2 ≤ ‖S‖ for any matrix operator norm ‖·‖. Since |λ| ≤ ‖S‖ for every λ ∈ σ(S) and
\[
\|S\|_2 = \sqrt{\rho(S^*S)} = \sqrt{\rho(S^2)} = \max_{\lambda\in\sigma(S)}|\lambda| = |\lambda_{\max}|,
\]
this implies ‖S‖_2 ≤ ‖S‖.
(i) For A ∈ R^{m×n} and q ≥ p ≥ 1, it holds that
\[
n^{(p-q)/pq}\|A\|_q \le \|A\|_p \le m^{(q-p)/pq}\|A\|_q. \tag{1.2.22}
\]
Proof: By (1.2.13), we get
\[
\|A\|_p = \max_{\|x\|_p=1}\|Ax\|_p \le \max_{\|x\|_q\le 1}m^{(q-p)/pq}\|Ax\|_q = m^{(q-p)/pq}\|A\|_q.
\]
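Two of these inequalities, spot-checked numerically: (1.2.13) with p = 1, q = 2, and (1.2.20) with p = 1 (a sketch assuming NumPy; x and A are random test data):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.standard_normal(10)
n = x.size
ratio = np.linalg.norm(x, 1) / np.linalg.norm(x, 2)
print(1.0 <= ratio <= np.sqrt(n))                 # (1.2.13): n^{(q-p)/pq} = sqrt(n)

m, n = 7, 5
A = rng.standard_normal((m, n))
a1   = np.linalg.norm(A, 1)
ainf = np.linalg.norm(A, np.inf)
print(ainf / n <= a1 <= m * ainf)                 # (1.2.20) with p = 1
```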
Appendix: Proof of the Hölder inequality and (1.2.18)
Taking ϕ(x) = e^x in Jensen's inequality, we have
\[
\exp\Big(\int_\Omega f\,d\mu\Big) \le \int_\Omega e^f\,d\mu.
\]
Let Ω be the finite set {p_1, ..., p_n} with μ({p_i}) = 1/n and f(p_i) = x_i. Then
\[
\exp\Big(\frac{1}{n}(x_1 + \cdots + x_n)\Big) \le \frac{1}{n}\big(e^{x_1} + \cdots + e^{x_n}\big).
\]
Taking y_i = e^{x_i}, we have
\[
(y_1\cdots y_n)^{1/n} \le \frac{1}{n}(y_1 + \cdots + y_n).
\]
Taking μ({p_i}) = q_i > 0 with Σ_{i=1}^n q_i = 1, we have
\[
y_1^{q_1}\cdots y_n^{q_n} \le q_1y_1 + \cdots + q_ny_n. \tag{1.2.23}
\]
Let α_i = x_i/‖x‖_p and β_i = y_i/‖y‖_q, where x = [x_1, ..., x_n]^T, y = [y_1, ..., y_n]^T, α = [α_1, ..., α_n]^T and β = [β_1, ..., β_n]^T. By (1.2.23) we have
\[
\alpha_i\beta_i \le \frac{1}{p}\alpha_i^p + \frac{1}{q}\beta_i^q.
\]
Since ‖α‖_p = 1 and ‖β‖_q = 1, it holds that
\[
\sum_{i=1}^n \alpha_i\beta_i \le \frac{1}{p} + \frac{1}{q} = 1.
\]
Thus,
\[
|x^Ty| \le \|x\|_p\|y\|_q.
\]
To show max{|x^Ty| : ‖x‖_p = 1} = ‖y‖_q, take x_i = y_i^{q-1}/‖y‖_q^{q/p}; then
\[
\|x\|_p^p = \frac{\sum_{i=1}^n |y_i|^{(q-1)p}}{\|y\|_q^q} = 1.
\]
Note that (q − 1)p = q. Then
\[
\sum_{i=1}^n x_iy_i = \frac{\sum_{i=1}^n |y_i|^q}{\|y\|_q^{q/p}} = \frac{\|y\|_q^q}{\|y\|_q^{q/p}} = \|y\|_q.
\]
The following two properties are useful in the following sections.
(i) There exists ẑ with ‖ẑ‖_p = 1 such that ‖y‖_q = ẑ^Ty. Let z = ẑ/‖y‖_q. Then we have z^Ty = 1 and ‖z‖_p = 1/‖y‖_q.
(ii) From duality, we have ‖y‖ = (‖y‖^*)^* = max_{‖u‖^*=1}|y^Tu| = y^Tẑ with ‖ẑ‖^* = 1. Let z = ẑ/‖y‖. Then we have z^Ty = 1 and ‖z‖^* = 1/‖y‖.
1.3 The Sensitivity of the Linear System Ax = b
1.3.1 Backward Error and Forward Error
Let x = F(a). We define backward and forward errors in Figure 1.1. In Figure 1.1, x̂ + Δx = F(a + Δa) is called a mixed forward-backward error, where |Δx| ≤ ε|x| and |Δa| ≤ η|a|.
Definition 1.3.1 (i) An algorithm is backward stable if, for all a, it produces a computed x̂ with a small backward error, i.e., x̂ = F(a + Δa) with Δa small.
(ii) An algorithm is numerically stable if it is stable in the mixed forward-backward error sense, i.e., x̂ + Δx = F(a + Δa) with both Δa and Δx small.
Figure 1.1: Relationship between backward and forward errors.
(iii) A method that produces answers with forward errors of similar magnitude to those produced by a backward stable method is called forward stable.
Remark 1.3.1 (i) Backward stable ⇒ forward stable, but not vice versa!
(ii) Forward error ≤ condition number × backward error. Consider
\[
\hat{x} - x = F(a + \Delta a) - F(a) = F'(a)\Delta a + \frac{F''(a + \theta\Delta a)}{2}(\Delta a)^2, \quad \theta \in (0, 1).
\]
Then we have
\[
\frac{\hat{x} - x}{x} = \left(\frac{aF'(a)}{F(a)}\right)\frac{\Delta a}{a} + O\big((\Delta a)^2\big).
\]
The quantity C(a) = |aF'(a)/F(a)| is called the condition number of F. If x or F is a vector, then the condition number is defined in a similar way using norms, and it measures the maximum relative change, which is attained for some, but not all, Δa.
A priori error estimate! A posteriori error estimate!
1.3.2 An SVD Analysis
Let A = Σ_{i=1}^n σ_iu_iv_i^T = UΣV^T be a singular value decomposition (SVD) of A. Then
\[
x = A^{-1}b = (U\Sigma V^T)^{-1}b = \sum_{i=1}^n \frac{u_i^Tb}{\sigma_i}\,v_i.
\]
If cos(θ) = |u_n^Tb|/‖b‖_2 and
\[
(A - \varepsilon u_nv_n^T)\,y = b + \varepsilon(u_n^Tb)u_n, \quad \sigma_n > \varepsilon \ge 0,
\]
then we have
\[
\|y - x\|_2 \ge \frac{\varepsilon}{\sigma_n}\,\|x\|_2\cos(\theta).
\]
Let E = diag{0, ..., 0, ε}. Then it holds that
\[
(\Sigma - E)V^Ty = U^Tb + \varepsilon(u_n^Tb)e_n.
\]
Therefore,
\[
\begin{aligned}
y - x &= V(\Sigma - E)^{-1}U^Tb + \varepsilon(u_n^Tb)(\sigma_n - \varepsilon)^{-1}v_n - V\Sigma^{-1}U^Tb\\
&= V\big((\Sigma - E)^{-1} - \Sigma^{-1}\big)U^Tb + \varepsilon(u_n^Tb)(\sigma_n - \varepsilon)^{-1}v_n\\
&= V\big(\Sigma^{-1}E(\Sigma - E)^{-1}\big)U^Tb + \varepsilon(u_n^Tb)(\sigma_n - \varepsilon)^{-1}v_n\\
&= V\,\mathrm{diag}\Big(0, \dots, 0, \frac{\varepsilon}{\sigma_n(\sigma_n - \varepsilon)}\Big)U^Tb + \varepsilon(u_n^Tb)(\sigma_n - \varepsilon)^{-1}v_n\\
&= \frac{\varepsilon}{\sigma_n(\sigma_n - \varepsilon)}(u_n^Tb)v_n + \varepsilon(u_n^Tb)(\sigma_n - \varepsilon)^{-1}v_n\\
&= (u_n^Tb)v_n\Big(\frac{\varepsilon}{\sigma_n(\sigma_n - \varepsilon)} + \frac{\varepsilon}{\sigma_n - \varepsilon}\Big)\\
&= \frac{\varepsilon(1 + \sigma_n)}{\sigma_n(\sigma_n - \varepsilon)}(u_n^Tb)v_n.
\end{aligned}
\]
From the inequality ‖x‖_2 ≤ ‖A^{-1}‖_2‖b‖_2 we have
\[
\frac{\|y - x\|_2}{\|x\|_2} \ge \frac{|u_n^Tb|}{\|b\|_2}\cdot\frac{\varepsilon(1 + \sigma_n)}{\sigma_n - \varepsilon} \ge \frac{|u_n^Tb|}{\|b\|_2}\cdot\frac{\varepsilon}{\sigma_n}.
\]
Theorem 1.3.1 Let A be nonsingular and ‖A^{-1}E‖ = r < 1. Then A + E is nonsingular and
\[
\|(A + E)^{-1} - A^{-1}\| \le \frac{\|E\|\,\|A^{-1}\|^2}{1 - r}.
\]
Proof: Since A is nonsingular, A + E = A(I − F), where F = −A^{-1}E. Since ‖F‖ = r < 1, it follows that I − F is nonsingular (by Corollary 1.2.1) and ‖(I − F)^{-1}‖ ≤ 1/(1 − r). Then
\[
(A + E)^{-1} = (I - F)^{-1}A^{-1} \;\Longrightarrow\; \|(A + E)^{-1}\| \le \frac{\|A^{-1}\|}{1 - r},
\]
and
\[
(A + E)^{-1} - A^{-1} = -A^{-1}E(A + E)^{-1}.
\]
It follows that
\[
\|(A + E)^{-1} - A^{-1}\| \le \|A^{-1}\|\,\|E\|\,\|(A + E)^{-1}\| \le \frac{\|A^{-1}\|^2\,\|E\|}{1 - r}.
\]
Lemma 1.3.1 Let
\[
Ax = b, \qquad (A + \Delta A)y = b + \Delta b,
\]
where ‖ΔA‖ ≤ δ‖A‖ and ‖Δb‖ ≤ δ‖b‖. If δκ(A) = r < 1, where κ(A) = ‖A‖‖A^{-1}‖, then A + ΔA is nonsingular and
\[
\frac{\|y\|}{\|x\|} \le \frac{1 + r}{1 - r}.
\]
Proof: Since ‖A^{-1}ΔA‖ ≤ δ‖A^{-1}‖‖A‖ = r < 1, it follows that A + ΔA is nonsingular. From the equality (I + A^{-1}ΔA)y = x + A^{-1}Δb it follows that
\[
\|y\| \le \|(I + A^{-1}\Delta A)^{-1}\|\,\big(\|x\| + \delta\|A^{-1}\|\|b\|\big) \le \frac{1}{1 - r}\big(\|x\| + \delta\|A^{-1}\|\|b\|\big) = \frac{1}{1 - r}\Big(\|x\| + r\,\frac{\|b\|}{\|A\|}\Big).
\]
From ‖b‖ = ‖Ax‖ ≤ ‖A‖‖x‖ the lemma follows.
1.3.3 Normwise Forward Error Bound
Theorem 1.3.2 If the assumptions of Lemma 1.3.1 hold, then
\[
\frac{\|x - y\|}{\|x\|} \le \frac{2\delta}{1 - r}\,\kappa(A).
\]
Proof: Since y − x = A^{-1}Δb − A^{-1}ΔAy, we have
\[
\|y - x\| \le \delta\|A^{-1}\|\|b\| + \delta\|A^{-1}\|\|A\|\|y\|.
\]
So by Lemma 1.3.1 it holds that
\[
\frac{\|y - x\|}{\|x\|} \le \delta\kappa(A)\,\frac{\|b\|}{\|A\|\|x\|} + \delta\kappa(A)\,\frac{\|y\|}{\|x\|} \le \delta\kappa(A)\Big(1 + \frac{1 + r}{1 - r}\Big) = \frac{2\delta}{1 - r}\,\kappa(A).
\]
1.3.4 Componentwise Forward Error Bound
Theorem 1.3.3 Let Ax = b and (A + ΔA)y = b + Δb, where |ΔA| ≤ δ|A| and |Δb| ≤ δ|b|. If δκ_∞(A) = r < 1, then A + ΔA is nonsingular and
\[
\frac{\|y - x\|_\infty}{\|x\|_\infty} \le \frac{2\delta}{1 - r}\,\big\|\,|A^{-1}|\,|A|\,\big\|_\infty.
\]
Here ‖|A^{-1}||A|‖_∞ is called the Skeel condition number.
Proof: Since ‖ΔA‖_∞ ≤ δ‖A‖_∞ and ‖Δb‖_∞ ≤ δ‖b‖_∞, the assumptions of Lemma 1.3.1 are satisfied in the ∞-norm. So A + ΔA is nonsingular and ‖y‖_∞/‖x‖_∞ ≤ (1 + r)/(1 − r). Since y − x = A^{-1}Δb − A^{-1}ΔAy, we have
\[
|y - x| \le |A^{-1}||\Delta b| + |A^{-1}||\Delta A||y| \le \delta|A^{-1}||b| + \delta|A^{-1}||A||y| \le \delta|A^{-1}||A|\,(|x| + |y|).
\]
Taking the ∞-norm, we have
\[
\|y - x\|_\infty \le \delta\,\big\|\,|A^{-1}|\,|A|\,\big\|_\infty\Big(\|x\|_\infty + \frac{1 + r}{1 - r}\|x\|_\infty\Big) = \frac{2\delta}{1 - r}\,\big\|\,|A^{-1}|\,|A|\,\big\|_\infty\,\|x\|_\infty.
\]
1.3.5 Derivation of Condition Number of Ax = b
Let
\[
(A + \varepsilon F)\,x(\varepsilon) = b + \varepsilon f, \quad\text{with } x(0) = x.
\]
Then we have ẋ(0) = A^{-1}(f − Fx) and x(ε) = x + εẋ(0) + O(ε²). Therefore,
\[
\frac{\|x(\varepsilon) - x\|}{\|x\|} \le \varepsilon\|A^{-1}\|\Big\{\frac{\|f\|}{\|x\|} + \|F\|\Big\} + O(\varepsilon^2).
\]
Define the condition number κ(A) := ‖A‖‖A^{-1}‖. Then we have
\[
\frac{\|x(\varepsilon) - x\|}{\|x\|} \le \kappa(A)(\rho_A + \rho_b) + O(\varepsilon^2),
\]
where ρ_A = ε‖F‖/‖A‖ and ρ_b = ε‖f‖/‖b‖.
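A numerical sketch of this estimate (assuming NumPy; the Hilbert matrix and the perturbation f are arbitrary test data, with F = 0 so that only b is perturbed):

```python
import numpy as np

n = 8
i = np.arange(n)
A = 1.0 / (i[:, None] + i[None, :] + 1.0)    # Hilbert matrix: ill-conditioned
x = np.ones(n)
b = A @ x
f = 1e-10 * np.random.default_rng(8).standard_normal(n)
y = np.linalg.solve(A, b + f)                # solution of the perturbed system

kappa = np.linalg.cond(A)                    # kappa(A) = ||A|| ||A^{-1}||
rel_err  = np.linalg.norm(y - x) / np.linalg.norm(x)
rel_pert = np.linalg.norm(f) / np.linalg.norm(b)
print(rel_err, kappa * rel_pert)             # forward error is below kappa * perturbation
```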
1.3.6 Normwise Backward Error
Theorem 1.3.4 Let y be the computed solution of Ax = b. Then the normwise backward error
\[
\eta(y) := \min\{\varepsilon \mid (A + \Delta A)y = b + \Delta b,\ \|\Delta A\| \le \varepsilon\|A\|,\ \|\Delta b\| \le \varepsilon\|b\|\}
\]
is given by
\[
\eta(y) = \frac{\|r\|}{\|A\|\|y\| + \|b\|}, \tag{1.3.24}
\]
where r = b − Ay is the residual.
Proof: The right-hand side of (1.3.24) is an upper bound for η(y). This upper bound is attained for the perturbations (by construction!)
\[
\Delta A_{\min} = \frac{\|A\|\|y\|}{\|A\|\|y\| + \|b\|}\,rz^T, \qquad \Delta b_{\min} = -\frac{\|b\|}{\|A\|\|y\| + \|b\|}\,r,
\]
where z is the dual vector of y, i.e., z^Ty = 1 and ‖z‖_* = 1/‖y‖.
Check that ‖ΔA_min‖ = η(y)‖A‖:
\[
\|\Delta A_{\min}\| = \frac{\|A\|\|y\|\,\|rz^T\|}{\|A\|\|y\| + \|b\|} = \frac{\|r\|}{\|A\|\|y\| + \|b\|}\,\|A\|,
\]
where the second equality requires ‖rz^T‖ = ‖r‖/‖y‖. Since
\[
\|rz^T\| = \max_{\|u\|=1}\|(rz^T)u\| = \|r\|\max_{\|u\|=1}|z^Tu| = \|r\|\,\|z\|_* = \|r\|\,\frac{1}{\|y\|},
\]
we are done. Similarly, ‖Δb_min‖ = η(y)‖b‖.
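Evaluating (1.3.24) for a solution computed by a library solver (a sketch assuming NumPy and the 2-norm; A and b are random test data):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 6
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
y = np.linalg.solve(A, b)                     # computed solution
r = b - A @ y                                 # residual
eta = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(y)
                           + np.linalg.norm(b))
print(eta)    # of the order of machine epsilon: the solver is backward stable
```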
1.3.7 Componentwise Backward Error
Theorem 1.3.5 The componentwise backward error
\[
\omega(y) := \min\{\varepsilon \mid (A + \Delta A)y = b + \Delta b,\ |\Delta A| \le \varepsilon|A|,\ |\Delta b| \le \varepsilon|b|\}
\]
is given by
\[
\omega(y) = \max_i \frac{|r|_i}{(|A||y| + |b|)_i}, \tag{1.3.25}
\]
where r = b − Ay. (Note: ξ/0 = 0 if ξ = 0; ξ/0 = ∞ if ξ ≠ 0.)
Proof: The right-hand side of (1.3.25) is an upper bound for ω(y). This bound is attained for the perturbations ΔA = D_1|A|D_2 and Δb = −D_1|b|, where D_1 = diag(r_i/(|A||y| + |b|)_i) and D_2 = diag(sign(y_i)).
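Evaluating (1.3.25), with the ξ/0 conventions handled explicitly (a sketch assuming NumPy; A and b are random test data):

```python
import numpy as np

rng = np.random.default_rng(10)
n = 6
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
y = np.linalg.solve(A, b)
r = b - A @ y
denom = np.abs(A) @ np.abs(y) + np.abs(b)          # (|A||y| + |b|)_i
with np.errstate(divide='ignore', invalid='ignore'):
    ratios = np.abs(r) / denom                     # xi/0 -> inf, 0/0 -> nan
ratios = np.where(np.isnan(ratios), 0.0, ratios)   # the note's convention: 0/0 = 0
print(ratios.max())   # omega(y): near machine epsilon for a backward stable solver
```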
Remark 1.3.2 Theorems 1.3.4 and 1.3.5 are a posteriori error estimates.
1.3.8 Determinants and Nearness to Singularity
\[
B_n = \begin{bmatrix}
1 & -1 & \cdots & -1\\
 & 1 & \ddots & \vdots\\
 & & \ddots & -1\\
0 & & & 1
\end{bmatrix}, \qquad
B_n^{-1} = \begin{bmatrix}
1 & 1 & \cdots & 2^{n-2}\\
 & \ddots & \ddots & \vdots\\
 & & \ddots & 1\\
0 & & & 1
\end{bmatrix}.
\]
Then det(B_n) = 1, κ_∞(B_n) = n2^{n−1}, and σ_30(B_30) ≈ 10^{−8}.
\[
D_n = \begin{bmatrix}
10^{-1} & & 0\\
 & \ddots & \\
0 & & 10^{-1}
\end{bmatrix}.
\]
Then det(D_n) = 10^{−n}, κ_p(D_n) = 1 and σ_n(D_n) = 10^{−1}.
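The two examples, verified numerically (a sketch assuming NumPy, with n = 30): the determinant of B_n looks harmless although B_n is nearly singular, while D_n has a tiny determinant yet is perfectly conditioned.

```python
import numpy as np

n = 30
B = np.eye(n) - np.triu(np.ones((n, n)), 1)  # 1 on the diagonal, -1 above it
print(np.linalg.det(B))                      # 1.0
print(np.linalg.cond(B, np.inf))             # n * 2^(n-1), about 1.6e10

D = 0.1 * np.eye(n)
print(np.linalg.det(D))                      # 1e-30: tiny determinant
print(np.linalg.cond(D, np.inf))             # 1.0: perfectly conditioned
```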