Introduction


Tsung-Ming Huang

Department of Mathematics, National Taiwan Normal University

September 8, 2011


Outline

1 Vectors and matrices

2 Rank and orthogonality

3 Eigenvalues and Eigenvectors

4 Norms and eigenvalues

5 Backward and Forward errors

Vectors and matrices

$A \in \mathbb{F}^{m\times n}$ with
$$A = [a_{ij}] = \begin{bmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn} \end{bmatrix}, \qquad \mathbb{F} = \mathbb{R} \text{ or } \mathbb{C}.$$

Product of matrices: $C = AB$, where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$, $i = 1, \dots, m$, $j = 1, \dots, p$.

Transpose: $C = A^T$, where $c_{ij} = a_{ji} \in \mathbb{R}$.

Conjugate transpose: $C = A^*$ or $C = A^H$, where $c_{ij} = \bar{a}_{ji} \in \mathbb{C}$.

Differentiation: Let $C = (c_{ij}(t))$. Then $\dot{C} = \frac{d}{dt} C = [\dot{c}_{ij}(t)]$.

Outer product of $x \in \mathbb{F}^m$ and $y \in \mathbb{F}^n$:
$$xy^* = \begin{bmatrix} x_1\bar{y}_1 & \cdots & x_1\bar{y}_n\\ \vdots & \ddots & \vdots\\ x_m\bar{y}_1 & \cdots & x_m\bar{y}_n \end{bmatrix} \in \mathbb{F}^{m\times n}.$$

Inner product of $x \in \mathbb{F}^n$ and $y \in \mathbb{F}^n$:
$$\langle y, x \rangle := x^T y = \sum_{i=1}^{n} x_i y_i = y^T x \in \mathbb{R} \qquad (\mathbb{F} = \mathbb{R}),$$
$$\langle y, x \rangle := x^* y = \sum_{i=1}^{n} \bar{x}_i y_i = \overline{y^* x} \in \mathbb{C} \qquad (\mathbb{F} = \mathbb{C}).$$
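A quick numerical illustration (a minimal NumPy sketch; the example vectors are arbitrary, not from the slides). Note that `np.outer` does not conjugate, while `np.vdot` conjugates its first argument, matching $x^* y$:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0])

# Outer product x y^T (3x2, rank one); for complex y, x y^* is np.outer(x, y.conj())
print(np.outer(x, y))

# Real inner product <y, x> = x^T y = y^T x
u = np.array([1.0, -1.0, 2.0])
print(np.dot(x, u))

# Complex inner product x^* y: np.vdot conjugates its first argument
z = np.array([1 + 1j, 2 - 1j])
w = np.array([3j, 1 + 0j])
print(np.vdot(z, w))
```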

Sherman-Morrison formula:

Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and $u, v \in \mathbb{R}^n$. If $v^T A^{-1} u \neq -1$, then
$$(A + uv^T)^{-1} = A^{-1} - A^{-1} u v^T A^{-1} / (1 + v^T A^{-1} u). \tag{1}$$

Sherman-Morrison-Woodbury formula:

Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and $U, V \in \mathbb{R}^{n\times k}$. If $I + V^T A^{-1} U$ is invertible, then
$$(A + UV^T)^{-1} = A^{-1} - A^{-1} U (I + V^T A^{-1} U)^{-1} V^T A^{-1}.$$

Proof of (1):
$$\begin{aligned}
&(A + uv^T)\left[A^{-1} - A^{-1} u v^T A^{-1}/(1 + v^T A^{-1} u)\right]\\
&= I + \frac{1}{1 + v^T A^{-1} u}\left[u v^T A^{-1}(1 + v^T A^{-1} u) - u v^T A^{-1} - u v^T A^{-1} u v^T A^{-1}\right]\\
&= I + \frac{1}{1 + v^T A^{-1} u}\left[u (v^T A^{-1} u) v^T A^{-1} - u (v^T A^{-1} u) v^T A^{-1}\right]\\
&= I.
\end{aligned}$$
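The identity (1) is easy to verify numerically. Below is a minimal NumPy sketch (the random data, seed, and diagonal shift are arbitrary choices made only to keep $A$ safely nonsingular):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # shifted to be safely nonsingular
u = rng.standard_normal(n)
v = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
denom = 1.0 + v @ Ainv @ u                        # hypothesis: v^T A^{-1} u != -1

lhs = np.linalg.inv(A + np.outer(u, v))           # inverse of the rank-one update
rhs = Ainv - (Ainv @ np.outer(u, v) @ Ainv) / denom
print(np.allclose(lhs, rhs))                      # True up to rounding
```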

Example 1

$$\tilde{A} = \begin{bmatrix} 3 & -1 & 1 & 1 & 1\\ 0 & 1 & 2 & 2 & 2\\ 0 & 0 & 4 & 1 & 1\\ 0 & 0 & 0 & 3 & 0\\ 0 & -1 & 0 & 0 & 3 \end{bmatrix} = A + \begin{bmatrix} 0\\ 0\\ 0\\ 0\\ -1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \end{bmatrix},$$

where

$$A = \begin{bmatrix} 3 & -1 & 1 & 1 & 1\\ 0 & 1 & 2 & 2 & 2\\ 0 & 0 & 4 & 1 & 1\\ 0 & 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}.$$

Rank and orthogonality

Let $A \in \mathbb{R}^{m\times n}$. Then

$\mathcal{R}(A) = \{y \in \mathbb{R}^m \mid y = Ax \text{ for some } x \in \mathbb{R}^n\} \subseteq \mathbb{R}^m$ is the range space of $A$.

$\mathcal{N}(A) = \{x \in \mathbb{R}^n \mid Ax = 0\} \subseteq \mathbb{R}^n$ is the null space of $A$.

$\operatorname{rank}(A) = \dim[\mathcal{R}(A)]$ = the maximal number of linearly independent columns of $A$.

$\operatorname{rank}(A) = \operatorname{rank}(A^T)$.

$\dim(\mathcal{N}(A)) + \operatorname{rank}(A) = n$.

If $m = n$, then $A$ is nonsingular $\Leftrightarrow \mathcal{N}(A) = \{0\} \Leftrightarrow \operatorname{rank}(A) = n$.
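These facts can be checked numerically; the sketch below is illustrative (`scipy.linalg.null_space` computes an orthonormal null-space basis via the SVD, and the test matrix is an arbitrary rank-one example):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])               # rank 1, m = 2, n = 3

r = np.linalg.matrix_rank(A)
N = null_space(A)                             # orthonormal basis of N(A)
print(r + N.shape[1] == A.shape[1])           # dim N(A) + rank(A) = n
print(np.linalg.matrix_rank(A.T) == r)        # rank(A) = rank(A^T)
```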

Let $\{x_1, \dots, x_p\} \subset \mathbb{R}^n$. Then $\{x_1, \dots, x_p\}$ is said to be orthogonal if $x_i^T x_j = 0$ for $i \neq j$, and orthonormal if $x_i^T x_j = \delta_{ij}$, where $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ij} = 1$ if $i = j$.

$S^{\perp} = \{y \in \mathbb{R}^m \mid y^T x = 0 \text{ for } x \in S\}$ = orthogonal complement of $S$.

$\mathbb{R}^m = \mathcal{R}(A) \oplus \mathcal{N}(A^T)$.

$\mathbb{R}^n = \mathcal{R}(A^T) \oplus \mathcal{N}(A)$.

$\mathcal{R}(A)^{\perp} = \mathcal{N}(A^T)$.

$\mathcal{R}(A^T)^{\perp} = \mathcal{N}(A)$.

Special matrices

$A \in \mathbb{R}^{n\times n}$:
symmetric: $A^T = A$;
skew-symmetric: $A^T = -A$;
positive definite: $x^T A x > 0$ for $x \neq 0$;
non-negative definite: $x^T A x \geq 0$;
indefinite: $(x^T A x)(y^T A y) < 0$ for some $x, y$;
orthogonal: $A^T A = I_n$;
normal: $A^T A = A A^T$;
positive: $a_{ij} > 0$;
non-negative: $a_{ij} \geq 0$.

$A \in \mathbb{C}^{n\times n}$:
Hermitian: $A^* = A$ ($A^H = A$);
skew-Hermitian: $A^* = -A$;
positive definite: $x^* A x > 0$ for $x \neq 0$;
non-negative definite: $x^* A x \geq 0$;
indefinite: $(x^* A x)(y^* A y) < 0$ for some $x, y$;
unitary: $A^* A = I_n$.

Let $A \in \mathbb{F}^{n\times n}$. Then the matrix $A$ is

diagonal if $a_{ij} = 0$ for $i \neq j$; denote $D = \operatorname{diag}(d_1, \dots, d_n) \in \mathcal{D}_n$;
tridiagonal if $a_{ij} = 0$ for $|i - j| > 1$;
upper bi-diagonal if $a_{ij} = 0$ for $i > j$ or $j > i + 1$;
(strictly) upper triangular if $a_{ij} = 0$ for $i > j$ ($i \geq j$);
upper Hessenberg if $a_{ij} = 0$ for $i > j + 1$.

(The lower-triangular cases are defined analogously.)

Sparse matrix: the number of nonzero elements is about $n^{1+r}$ with $r < 1$ (usually between $0.2 \sim 0.5$). If $n = 1000$ and $r = 0.9$, then $n^{1+r} \approx 501187$.

Eigenvalues and Eigenvectors

Definition 2

Let $A \in \mathbb{C}^{n\times n}$. Then $\lambda \in \mathbb{C}$ is called an eigenvalue of $A$ if there exists $x \neq 0$, $x \in \mathbb{C}^n$, with $Ax = \lambda x$; $x$ is called an eigenvector corresponding to $\lambda$.

Notations:

$\sigma(A)$ := spectrum of $A$ = the set of eigenvalues of $A$.

$\rho(A)$ := spectral radius of $A$ = $\max\{|\lambda| : \lambda \in \sigma(A)\}$.

$\lambda \in \sigma(A) \Leftrightarrow \det(A - \lambda I) = 0$.

$p(\lambda) = \det(\lambda I - A)$ = characteristic polynomial of $A$.

$p(\lambda) = \prod_{i=1}^{s} (\lambda - \lambda_i)^{m(\lambda_i)}$, where $\lambda_i \neq \lambda_j$ (for $i \neq j$) and $\sum_{i=1}^{s} m(\lambda_i) = n$.

$m(\lambda_i)$ = algebraic multiplicity of $\lambda_i$.

$n(\lambda_i) = n - \operatorname{rank}(A - \lambda_i I)$ = geometric multiplicity of $\lambda_i$.

If there is some $i$ such that $n(\lambda_i) < m(\lambda_i)$, then $A$ is called degenerate.

The following statements are equivalent:

(1) There are $n$ linearly independent eigenvectors;

(2) $A$ is diagonalizable, i.e., there is a nonsingular matrix $T$ such that $T^{-1} A T \in \mathcal{D}_n$;

(3) For each $\lambda \in \sigma(A)$, it holds that $m(\lambda) = n(\lambda)$.

If $A$ is degenerate, then the eigenvectors together with the principal vectors yield the Jordan form.
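A $2\times 2$ Jordan block is the simplest degenerate example. The sketch below is illustrative, using the rank characterization $n(\lambda) = n - \operatorname{rank}(A - \lambda I)$ from the notation above:

```python
import numpy as np

# Jordan block with eigenvalue 2: algebraic multiplicity m(2) = 2
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

lam = 2.0
n = A.shape[0]
geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n))   # n(lambda)
print(geometric)   # 1 < m(2) = 2, so A is degenerate (not diagonalizable)
```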

Theorem 3 (Schur decomposition)

(1) Let $A \in \mathbb{C}^{n\times n}$. There is a unitary matrix $U$ such that $U^* A U$ is upper triangular.

(2) Let $A \in \mathbb{R}^{n\times n}$. There is an orthogonal matrix $Q$ such that $Q^T A Q$ is quasi-upper triangular, i.e., an upper triangular matrix possibly with nonzero subdiagonal elements in non-consecutive positions.

(3) $A$ is normal if and only if there is a unitary $U$ such that $U^* A U = D$ is diagonal.

(4) $A$ is Hermitian if and only if $A$ is normal and $\sigma(A) \subseteq \mathbb{R}$.

(5) $A$ is symmetric if and only if there is an orthogonal $U$ such that $U^T A U = D$ is diagonal and $\sigma(A) \subseteq \mathbb{R}$.

Norms and eigenvalues

Let $X$ be a vector space over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$.

Definition 4 (Vector norms)

Let $N$ be a real-valued function defined on $X$ ($N: X \to \mathbb{R}_+$). Then $N$ is a (vector) norm if

N1: $N(\alpha x) = |\alpha| N(x)$ for $\alpha \in \mathbb{F}$, $x \in X$;

N2: $N(x + y) \leq N(x) + N(y)$ for $x, y \in X$;

N3: $N(x) = 0$ if and only if $x = 0$.

The usual notation is $\|x\| = N(x)$.

Example 5

Let $X = \mathbb{C}^n$, $p \geq 1$. Then $\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$ is an $l_p$-norm. Especially,

$$\|x\|_1 = \sum_{i=1}^{n} |x_i| \quad (l_1\text{-norm}),$$
$$\|x\|_2 = \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2} \quad (\text{Euclidean norm}),$$
$$\|x\|_\infty = \max_{1\leq i\leq n} |x_i| \quad (\text{maximum norm}).$$
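Numerically (a minimal sketch with an arbitrary example vector), these norms are all available through `np.linalg.norm`:

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])

print(np.linalg.norm(x, 1))        # l1-norm: 3 + 4 + 12 = 19
print(np.linalg.norm(x, 2))        # Euclidean norm: sqrt(9 + 16 + 144) = 13
print(np.linalg.norm(x, np.inf))   # maximum norm: 12
```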

Lemma 6

$N(x)$ is a continuous function of the components $x_1, \dots, x_n$ of $x$.

Proof:
$$|N(x) - N(y)| \leq N(x - y) \leq \sum_{j=1}^{n} |x_j - y_j|\, N(e_j) \leq \|x - y\|_\infty \sum_{j=1}^{n} N(e_j).$$

Theorem 7 (Equivalence of norms)

Let $N$ and $M$ be two norms on $\mathbb{C}^n$. Then there exist constants $c_1, c_2 > 0$ such that
$$c_1 M(x) \leq N(x) \leq c_2 M(x) \quad \text{for all } x \in \mathbb{C}^n.$$

(Proof of Theorem 7 in the appendix.)

Remark: Theorem 7 does not hold in infinite-dimensional spaces.

Definition 8 (Matrix norms)

Let $A \in \mathbb{C}^{m\times n}$. A real-valued function $\|\cdot\|: \mathbb{C}^{m\times n} \to \mathbb{R}_+$ satisfying

N1: $\|\alpha A\| = |\alpha| \|A\|$;

N2: $\|A + B\| \leq \|A\| + \|B\|$;

N3: $\|A\| = 0$ if and only if $A = 0$;

N4: $\|AB\| \leq \|A\| \|B\|$;

N5: $\|Ax\|_v \leq \|A\| \|x\|_v$.

If $\|\cdot\|$ satisfies N1 to N4, then it is called a matrix norm. If in addition N5 holds for some vector norm $\|\cdot\|_v$, the matrix norm and the vector norm are called compatible.

Example 9 (Frobenius norm)

Let $\|A\|_F = \left\{\sum_{i,j} |a_{ij}|^2\right\}^{1/2}$. Then
$$\|AB\|_F = \left\{\sum_{i,j}\Big|\sum_{k} a_{ik} b_{kj}\Big|^2\right\}^{1/2} \leq \left\{\sum_{i,j}\Big(\sum_{k} |a_{ik}|^2\Big)\Big(\sum_{k} |b_{kj}|^2\Big)\right\}^{1/2} \quad \text{(Cauchy-Schwarz inequality)}$$
$$= \left(\sum_{i}\sum_{k} |a_{ik}|^2\right)^{1/2} \left(\sum_{j}\sum_{k} |b_{kj}|^2\right)^{1/2} = \|A\|_F \|B\|_F.$$
This implies that N4 holds. Similarly,
$$\|Ax\|_2 = \left\{\sum_{i}\Big|\sum_{j} a_{ij} x_j\Big|^2\right\}^{1/2} \leq \left\{\sum_{i}\Big(\sum_{j} |a_{ij}|^2\Big)\Big(\sum_{j} |x_j|^2\Big)\right\}^{1/2} = \|A\|_F \|x\|_2. \tag{2}$$
This implies that N5 holds. N1, N2 and N3 hold obviously. (Note $\|I\|_F = \sqrt{n}$.)

Example 10 (Operator norm)

Given a vector norm $\|\cdot\|$, an associated matrix norm is defined by
$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\|=1} \|Ax\|.$$
N5 holds immediately. On the other hand,
$$\|(AB)x\| = \|A(Bx)\| \leq \|A\| \|Bx\| \leq \|A\| \|B\| \|x\| \quad \text{for all } x \neq 0.$$
This implies $\|AB\| \leq \|A\| \|B\|$; thus N4 holds. (Note $\|I\| = 1$.)

Three useful matrix norms:
$$\|A\|_1 = \sup_{x\neq 0} \frac{\|Ax\|_1}{\|x\|_1} = \max_{1\leq j\leq n} \sum_{i=1}^{n} |a_{ij}|, \tag{3}$$
$$\|A\|_\infty = \sup_{x\neq 0} \frac{\|Ax\|_\infty}{\|x\|_\infty} = \max_{1\leq i\leq n} \sum_{j=1}^{n} |a_{ij}|, \tag{4}$$
$$\|A\|_2 = \sup_{x\neq 0} \frac{\|Ax\|_2}{\|x\|_2} = \sqrt{\rho(A^* A)}. \tag{5}$$

(Proofs of (3)-(5) in the appendix.)
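A numerical cross-check of (3)-(5) (a minimal sketch; the random test matrix and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

norm1 = np.abs(A).sum(axis=0).max()                     # max column sum, (3)
norminf = np.abs(A).sum(axis=1).max()                   # max row sum, (4)
norm2 = np.sqrt(max(abs(np.linalg.eigvals(A.T @ A))))   # sqrt(rho(A^T A)), (5)

print(np.isclose(norm1, np.linalg.norm(A, 1)))
print(np.isclose(norminf, np.linalg.norm(A, np.inf)))
print(np.isclose(norm2, np.linalg.norm(A, 2)))
```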

Example 11 (Dual norm)

Let $\frac{1}{p} + \frac{1}{q} = 1$. Then the dual norm of $\|\cdot\|_p$ is $\|\cdot\|_q$ (e.g., $p = \infty$, $q = 1$). (This follows from the Hölder inequality, i.e., $|y^* x| \leq \|x\|_p \|y\|_q$.)

Theorem 12

Let $A \in \mathbb{C}^{n\times n}$. Then for any operator norm $\|\cdot\|$ it holds that
$$\rho(A) \leq \|A\|.$$
Moreover, for any $\varepsilon > 0$ there exists an operator norm $\|\cdot\|_\varepsilon$ such that
$$\|A\|_\varepsilon \leq \rho(A) + \varepsilon.$$

(Proof of Theorem 12 in the appendix.)

Lemma 13

Let $U$ and $V$ be unitary. Then
$$\|UAV\|_F = \|A\|_F, \qquad \|UAV\|_2 = \|A\|_2.$$

Theorem 14 (Singular Value Decomposition (SVD))

Let $A \in \mathbb{C}^{m\times n}$. Then there exist unitary matrices $U = [u_1, \dots, u_m] \in \mathbb{C}^{m\times m}$ and $V = [v_1, \dots, v_n] \in \mathbb{C}^{n\times n}$ such that
$$U^* A V = \operatorname{diag}(\sigma_1, \dots, \sigma_p) = \Sigma, \tag{6}$$
where $p = \min\{m, n\}$ and $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p \geq 0$. (Here, $\sigma_i$ denotes the $i$-th largest singular value of $A$.)

(Proof of Theorem 14 in the appendix.)

Remark: From (6) we have $\|A\|_2 = \sqrt{\rho(A^* A)} = \sigma_1$, the maximal singular value of $A$, and
$$\|ABC\|_F = \|U \Sigma V^* BC\|_F = \|\Sigma V^* BC\|_F \leq \sigma_1 \|BC\|_F = \|A\|_2 \|BC\|_F.$$
This implies
$$\|ABC\|_F \leq \|A\|_2 \|B\|_F \|C\|_2. \tag{7}$$
In addition, by (2) and (7), we get
$$\|A\|_2 \leq \|A\|_F \leq \sqrt{n}\, \|A\|_2.$$
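The identities in the remark are easy to confirm numerically (an illustrative sketch with a random rectangular matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))
n = A.shape[1]

s = np.linalg.svd(A, compute_uv=False)          # singular values, descending

print(np.isclose(np.linalg.norm(A, 2), s[0]))   # ||A||_2 = sigma_1
fro = np.linalg.norm(A, 'fro')
print(s[0] <= fro <= np.sqrt(n) * s[0])         # ||A||_2 <= ||A||_F <= sqrt(n) ||A||_2
```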

Theorem 15

Let $A \in \mathbb{C}^{n\times n}$. The following statements are equivalent:

(1) $\lim_{m\to\infty} A^m = 0$;

(2) $\lim_{m\to\infty} A^m x = 0$ for all $x$;

(3) $\rho(A) < 1$.

Proof:

(1) $\Rightarrow$ (2): Trivial.

(2) $\Rightarrow$ (3): Let $\lambda \in \sigma(A)$, i.e., $Ax = \lambda x$ with $x \neq 0$. Then $A^m x = \lambda^m x \to 0$, so $\lambda^m \to 0$. Thus $|\lambda| < 1$, i.e., $\rho(A) < 1$.

(3) $\Rightarrow$ (1): There is an operator norm $\|\cdot\|$ with $\|A\| < 1$ (by Theorem 12). Therefore, $\|A^m\| \leq \|A\|^m \to 0$, i.e., $A^m \to 0$.

Theorem 16

$$\rho(A) = \lim_{k\to\infty} \|A^k\|^{1/k}.$$

Proof: Since
$$\rho(A)^k = \rho(A^k) \leq \|A^k\| \ \Rightarrow\ \rho(A) \leq \|A^k\|^{1/k}, \quad k = 1, 2, \dots$$
If $\varepsilon > 0$, then $\tilde{A} = [\rho(A) + \varepsilon]^{-1} A$ has spectral radius $< 1$, and $\|\tilde{A}^k\| \to 0$ as $k \to \infty$. There is an $N = N(\varepsilon, A)$ such that $\|\tilde{A}^k\| < 1$ for all $k \geq N$. Thus
$$\|A^k\| \leq [\rho(A) + \varepsilon]^k \quad \text{for all } k \geq N,$$
or $\|A^k\|^{1/k} \leq \rho(A) + \varepsilon$ for all $k \geq N$. Since $\rho(A) \leq \|A^k\|^{1/k}$, and $k$, $\varepsilon$ are arbitrary, $\lim_{k\to\infty} \|A^k\|^{1/k}$ exists and equals $\rho(A)$.
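Gelfand's formula in Theorem 16 can be watched converging numerically (a minimal sketch; the matrix is rescaled so that its powers neither overflow nor underflow over the chosen range of $k$):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
A /= 1.2 * max(abs(np.linalg.eigvals(A)))    # rescale so that rho(A) = 1/1.2

rho = max(abs(np.linalg.eigvals(A)))
for k in (1, 5, 20, 100, 200):
    print(k, np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1.0 / k))
print("rho(A) =", rho)                       # ||A^k||^(1/k) approaches this from above
```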

Theorem 17

Let $A \in \mathbb{C}^{n\times n}$ with $\rho(A) < 1$. Then $(I - A)^{-1}$ exists and
$$(I - A)^{-1} = I + A + A^2 + \cdots.$$

Proof: Since $\rho(A) < 1$, the eigenvalues of $I - A$ are nonzero. Therefore $(I - A)^{-1}$ exists, and by Theorem 15,
$$(I - A)(I + A + A^2 + \cdots + A^m) = I - A^{m+1} \to I.$$

Corollary 18

If $\|A\| < 1$, then $(I - A)^{-1}$ exists and
$$\|(I - A)^{-1}\| \leq \frac{1}{1 - \|A\|}.$$

Proof: Since $\rho(A) \leq \|A\| < 1$ (by Theorem 12), $(I - A)^{-1} = \sum_{k=0}^{\infty} A^k$ exists by Theorem 17, and
$$\|(I - A)^{-1}\| \leq \sum_{k=0}^{\infty} \|A\|^k = \frac{1}{1 - \|A\|}.$$

Theorem 19 (without proof)

For $A \in \mathbb{F}^{n\times n}$ the following statements are equivalent:

(1) There is a multiplicative norm $p$ with $p(A^k) \leq 1$, $k = 1, 2, \dots$.

(2) For each multiplicative norm $p$, the powers $p(A^k)$ are uniformly bounded, i.e., there exists an $M(p) < \infty$ such that $p(A^k) \leq M(p)$, $k = 0, 1, 2, \dots$.

(3) $\rho(A) \leq 1$, and all eigenvalues $\lambda$ with $|\lambda| = 1$ are not degenerate (i.e., $m(\lambda) = n(\lambda)$).

(See Householder: The Theory of Matrices in Numerical Analysis, pp. 45-47.)

In the following, we prove some important inequalities between vector norms and matrix norms.

$$1 \leq \frac{\|x\|_p}{\|x\|_q} \leq n^{(q-p)/pq} \quad (p \leq q). \tag{8}$$

(Proof of (8) in the appendix.)

$$1 \leq \frac{\|x\|_p}{\|x\|_\infty} \leq n^{1/p}. \tag{9}$$

(Proof of (9) in the appendix.)

$$\max_{1\leq j\leq n} \|a_j\|_p \leq \|A\|_p \leq n^{(p-1)/p} \max_{1\leq j\leq n} \|a_j\|_p, \tag{10}$$
where $A = [a_1, \dots, a_n] \in \mathbb{R}^{m\times n}$.

(Proof of (10) in the appendix.)

$$\max_{i,j} |a_{ij}| \leq \|A\|_p \leq n^{(p-1)/p}\, m^{1/p} \max_{i,j} |a_{ij}|, \tag{11}$$
where $A \in \mathbb{R}^{m\times n}$. Proof of (11): by (9) and (10) immediately.

$$m^{(1-p)/p} \|A\|_1 \leq \|A\|_p \leq n^{(p-1)/p} \|A\|_1. \tag{12}$$
Proof of (12): by (10) and (8) immediately.

Hölder inequality:
$$|x^T y| \leq \|x\|_p \|y\|_q, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1. \tag{13}$$

Proof of (13): Let $\alpha_i = \frac{|x_i|}{\|x\|_p}$ and $\beta_i = \frac{|y_i|}{\|y\|_q}$. Then
$$\alpha_i \beta_i = (\alpha_i^p)^{1/p} (\beta_i^q)^{1/q} \leq \frac{\alpha_i^p}{p} + \frac{\beta_i^q}{q} \quad \text{(Jensen inequality)}.$$
Since $\|\alpha\|_p = 1$ and $\|\beta\|_q = 1$, it follows that
$$\sum_{i=1}^{n} \alpha_i \beta_i \leq \frac{1}{p} + \frac{1}{q} = 1.$$
Then we have $|x^T y| \leq \|x\|_p \|y\|_q$.

$$\max\{|x^T y| : \|x\|_p = 1\} = \|y\|_q. \tag{14}$$

Proof of (14): Take $x_i = y_i^{q-1} / \|y\|_q^{q/p}$. Then
$$\|x\|_p^p = \frac{\sum_i |y_i|^{(q-1)p}}{\|y\|_q^{q}} = \frac{\|y\|_q^q}{\|y\|_q^{q}} = 1 \quad (\because (q-1)p = q).$$
It follows that
$$x^T y = \frac{\sum_i |y_i|^q}{\|y\|_q^{q/p}} = \|y\|_q^{\,q - q/p} = \|y\|_q \quad (\because q - q/p = 1).$$

Remark: There exists $\hat{z}$ with $\|\hat{z}\|_p = 1$ such that $\|y\|_q = \hat{z}^T y$. Let $z = \hat{z}/\|y\|_q$. Then there exists $z$ such that $z^T y = 1$ with $\|z\|_p = \frac{1}{\|y\|_q}$.

$$\|A\|_p = \|A^T\|_q. \tag{15}$$

(Proof of (15) in the appendix.)

$$n^{-1/p} \|A\|_\infty \leq \|A\|_p \leq m^{1/p} \|A\|_\infty. \tag{16}$$

(Proof of (16) in the appendix.)

$$\|A\|_2 \leq \sqrt{\|A\|_p \|A\|_q}, \quad \left(\frac{1}{p} + \frac{1}{q} = 1\right). \tag{17}$$

(Proof of (17) in the appendix.)

$$n^{(p-q)/pq} \|A\|_q \leq \|A\|_p \leq m^{(q-p)/pq} \|A\|_q, \tag{18}$$
where $A \in \mathbb{R}^{m\times n}$ and $q \geq p \geq 1$. (Proof of (18) in the appendix.)

Backward error and Forward error

Let $x = F(a)$. Backward and forward errors are defined as in Figure 1: $\hat{x} + \Delta x = F(a + \Delta a)$ is called a mixed forward-backward error, where $|\Delta x| \leq \varepsilon |x|$ and $|\Delta a| \leq \eta |a|$.

Definition 20

(i) An algorithm is backward stable if, for all $a$, it produces a computed $\hat{x}$ with a small backward error, i.e., $\hat{x} = F(a + \Delta a)$ with $\Delta a$ small.

(ii) An algorithm is numerically stable if it is stable in the mixed forward-backward error sense, i.e., $\hat{x} + \Delta x = F(a + \Delta a)$ with both $\Delta a$ and $\Delta x$ small.

(iii) A method that produces answers with forward errors of similar magnitude to those produced by a backward stable method is called forward stable.

[Figure 1: Relationship between backward and forward errors.]

Remark:

(i) Backward stable $\Rightarrow$ forward stable, but not vice versa!

(ii) Forward error $\leq$ condition number $\times$ backward error.

Consider
$$\hat{x} - x = F(a + \Delta a) - F(a) = F'(a)\,\Delta a + \frac{F''(a + \theta \Delta a)}{2}\,(\Delta a)^2, \qquad \theta \in (0, 1).$$
Then we have
$$\frac{\hat{x} - x}{x} = \left(\frac{a F'(a)}{F(a)}\right)\frac{\Delta a}{a} + O\big((\Delta a)^2\big).$$
The quantity $C(a) = \left|\frac{a F'(a)}{F(a)}\right|$ is called the condition number of $F$. If $x$ or $F$ is a vector, then the condition number is defined in a similar way using norms; it measures the maximum relative change, which is attained for some, but not all, $\Delta a$.

Backward error: a priori error estimate → a posteriori error estimate.

Lemma 21

Let
$$Ax = b, \qquad (A + \Delta A)\hat{x} = b + \Delta b,$$
with $\|\Delta A\| \leq \delta \|A\|$ and $\|\Delta b\| \leq \delta \|b\|$. If $\delta\,\kappa(A) = r < 1$, then $A + \Delta A$ is nonsingular and
$$\frac{\|\hat{x}\|}{\|x\|} \leq \frac{1 + r}{1 - r}.$$

Proof: Since $\|A^{-1} \Delta A\| \leq \delta \|A^{-1}\| \|A\| = r < 1$, $A + \Delta A = A(I + A^{-1}\Delta A)$ is nonsingular. From $(I + A^{-1} \Delta A)\hat{x} = x + A^{-1} \Delta b$, we have
$$\|\hat{x}\| \leq \|(I + A^{-1} \Delta A)^{-1}\| \left(\|x\| + \delta \|A^{-1}\| \|b\|\right) \leq \frac{1}{1 - r}\left(\|x\| + \delta \|A^{-1}\| \|b\|\right) = \frac{1}{1 - r}\left(\|x\| + r\,\frac{\|b\|}{\|A\|}\right).$$
Since $\|b\| \leq \|A\| \|x\|$, it follows that $\|\hat{x}\| \leq \frac{1 + r}{1 - r}\, \|x\|$.

Normwise Forward Error Bound

Theorem 22

If the conditions of Lemma 21 hold, then
$$\frac{\|x - \hat{x}\|}{\|x\|} \leq \frac{2\delta}{1 - r}\,\kappa(A).$$

Proof: Since $\hat{x} - x = A^{-1} \Delta b - A^{-1} \Delta A\, \hat{x}$, we have
$$\|\hat{x} - x\| \leq \delta \|A^{-1}\| \|b\| + \delta \|A^{-1}\| \|A\| \|\hat{x}\|.$$
So, by Lemma 21, we have
$$\frac{\|\hat{x} - x\|}{\|x\|} \leq \delta\,\kappa(A)\,\frac{\|b\|}{\|A\| \|x\|} + \delta\,\kappa(A)\,\frac{\|\hat{x}\|}{\|x\|} \leq \delta\,\kappa(A)\left(1 + \frac{1 + r}{1 - r}\right) = \frac{2\delta}{1 - r}\,\kappa(A).$$
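A numerical check of Theorem 22 (a minimal sketch; the perturbations are scaled so that $\|\Delta A\| = \delta\|A\|$ and $\|\Delta b\| = \delta\|b\|$, and the example assumes $r = \delta\kappa(A) < 1$, which holds for this seed):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
b = A @ x

delta = 1e-8
E = rng.standard_normal((n, n))
E *= delta * np.linalg.norm(A, 2) / np.linalg.norm(E, 2)   # ||dA|| = delta ||A||
f = rng.standard_normal(n)
f *= delta * np.linalg.norm(b) / np.linalg.norm(f)         # ||db|| = delta ||b||

xhat = np.linalg.solve(A + E, b + f)
kappa = np.linalg.cond(A, 2)
r = delta * kappa                                          # must stay below 1
err = np.linalg.norm(xhat - x) / np.linalg.norm(x)
print(err <= 2 * delta * kappa / (1 - r), err)
```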

Componentwise Forward Error Bound

Theorem 23

Let $Ax = b$ and $(A + \Delta A)\hat{x} = b + \Delta b$ with $|\Delta A| \leq \delta |A|$ and $|\Delta b| \leq \delta |b|$. If $\delta\,\kappa_\infty(A) = r < 1$, then $A + \Delta A$ is nonsingular and
$$\frac{\|\hat{x} - x\|_\infty}{\|x\|_\infty} \leq \frac{2\delta}{1 - r}\, \big\|\, |A^{-1}|\, |A| \,\big\|_\infty.$$

Proof: Since $\|\Delta A\|_\infty \leq \delta \|A\|_\infty$ and $\|\Delta b\|_\infty \leq \delta \|b\|_\infty$, the conditions of Lemma 21 are satisfied in the $\infty$-norm. Then $A + \Delta A$ is nonsingular and $\frac{\|\hat{x}\|_\infty}{\|x\|_\infty} \leq \frac{1 + r}{1 - r}$.

Since $\hat{x} - x = A^{-1} \Delta b - A^{-1} \Delta A\, \hat{x}$, we have
$$|\hat{x} - x| \leq |A^{-1}|\, |\Delta b| + |A^{-1}|\, |\Delta A|\, |\hat{x}| \leq \delta\, |A^{-1}|\, |b| + \delta\, |A^{-1}|\, |A|\, |\hat{x}| \leq \delta\, |A^{-1}|\, |A|\, (|x| + |\hat{x}|).$$

Taking the $\infty$-norm, we get
$$\|\hat{x} - x\|_\infty \leq \delta\, \big\|\, |A^{-1}|\, |A| \,\big\|_\infty \left(\|x\|_\infty + \frac{1 + r}{1 - r}\, \|x\|_\infty\right) = \frac{2\delta}{1 - r}\, \underbrace{\big\|\, |A^{-1}|\, |A| \,\big\|_\infty}_{\text{Skeel condition number}}\; \|x\|_\infty.$$
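The point of the Skeel condition number is that it ignores bad row scaling that inflates $\kappa_\infty(A)$. A small illustrative sketch (the matrix is a hypothetical example, not from the slides):

```python
import numpy as np

A = np.array([[1.0,   1.0],
              [1e-6, -1e-6]])                 # badly row-scaled, structurally harmless

Ainv = np.linalg.inv(A)
kappa_inf = np.linalg.norm(A, np.inf) * np.linalg.norm(Ainv, np.inf)
skeel = np.linalg.norm(np.abs(Ainv) @ np.abs(A), np.inf)   # || |A^{-1}| |A| ||_inf

print(kappa_inf)   # about 1e6
print(skeel)       # 2.0: invariant under row scaling
```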

Condition Number by First Order Approximation

Consider the perturbed system
$$(A + \varepsilon F)\, x(\varepsilon) = b + \varepsilon f, \qquad x(0) = x.$$
Differentiating at $\varepsilon = 0$ gives
$$\dot{x}(0) = A^{-1}(f - Fx), \qquad x(\varepsilon) = x + \varepsilon\, \dot{x}(0) + O(\varepsilon^2),$$
$$\frac{\|x(\varepsilon) - x\|}{\|x\|} \leq \varepsilon\, \|A^{-1}\| \left(\frac{\|f\|}{\|x\|} + \|F\|\right) + O(\varepsilon^2).$$
With the condition number $\kappa(A) := \|A\| \|A^{-1}\|$ and $\|b\| \leq \|A\| \|x\|$,
$$\frac{\|x(\varepsilon) - x\|}{\|x\|} \leq \kappa(A)\,(\rho_A + \rho_b)\,\varepsilon + O(\varepsilon^2),$$
where
$$\rho_A = \frac{\|F\|}{\|A\|}, \qquad \rho_b = \frac{\|f\|}{\|b\|}, \qquad \kappa_2(A) = \frac{\sigma_1(A)}{\sigma_n(A)}.$$

Normwise Backward Error Bound

Theorem 24

Let $\hat{x}$ be the computed solution of $Ax = b$. Then the normwise backward error
$$\eta(\hat{x}) := \min\{\varepsilon \mid (A + \Delta A)\hat{x} = b + \Delta b,\ \|\Delta A\| \leq \varepsilon \|A\|,\ \|\Delta b\| \leq \varepsilon \|b\|\}$$
is given by
$$\eta(\hat{x}) = \frac{\|r\|}{\|A\| \|\hat{x}\| + \|b\|}, \tag{19}$$
where $r = b - A\hat{x}$ is the residual.

Proof: The right-hand side of (19) is an upper bound for $\eta(\hat{x})$. This upper bound is attained for the perturbation (by construction)
$$\Delta A_{\min} = \frac{\|A\| \|\hat{x}\|\; r z^T}{\|A\| \|\hat{x}\| + \|b\|}, \qquad \Delta b_{\min} = \frac{-\|b\|}{\|A\| \|\hat{x}\| + \|b\|}\, r,$$
where $z$ is the dual vector of $\hat{x}$, i.e., $z^T \hat{x} = 1$ and $\|z\| = \frac{1}{\|\hat{x}\|}$.

Check: $\|\Delta A_{\min}\| = \eta(\hat{x})\, \|A\|$, or
$$\|\Delta A_{\min}\| = \frac{\|A\| \|\hat{x}\|\; \|r z^T\|}{\|A\| \|\hat{x}\| + \|b\|} = \left(\frac{\|r\|}{\|A\| \|\hat{x}\| + \|b\|}\right) \|A\|,$$
i.e., the claim $\|r z^T\| = \frac{\|r\|}{\|\hat{x}\|}$. Indeed,
$$\|r z^T\| = \max_{\|u\|=1} \|(r z^T) u\| = \|r\| \max_{\|u\|=1} |z^T u| = \|r\| \|z\| = \|r\|\, \frac{1}{\|\hat{x}\|}.$$
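Formula (19) makes the backward error of a computed solution directly measurable. A minimal sketch (random data and the 2-norm are arbitrary choices here):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

xhat = np.linalg.solve(A, b)       # computed solution
r = b - A @ xhat                   # residual

eta = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(xhat) + np.linalg.norm(b))
print(eta)   # of the order of the unit roundoff: the solve is backward stable
```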

Componentwise Backward Error Bound

Theorem 25

The componentwise backward error
$$\omega(\hat{x}) := \min\{\varepsilon \mid (A + \Delta A)\hat{x} = b + \Delta b,\ |\Delta A| \leq \varepsilon |A|,\ |\Delta b| \leq \varepsilon |b|\}$$
is given by
$$\omega(\hat{x}) = \max_{i} \frac{|r|_i}{(|A|\, |\hat{x}| + |b|)_i}, \tag{20}$$
where $r = b - A\hat{x}$. (Note: $\xi/0 = 0$ if $\xi = 0$; $\xi/0 = \infty$ if $\xi \neq 0$.)

Proof: The right-hand side of (20) is an upper bound for $\omega(\hat{x})$. This bound is attained for the perturbation
$$\Delta A = D_1 A D_2, \qquad \Delta b = -D_1 b,$$
where $D_1 = \operatorname{diag}\big(r_i/(|A|\, |\hat{x}| + |b|)_i\big)$ and $D_2 = \operatorname{diag}(\operatorname{sign}(\hat{x}_i))$.

Determinants and Nearness to Singularity

$$B_n = \begin{bmatrix} 1 & -1 & \cdots & -1\\ & 1 & \ddots & \vdots\\ & & \ddots & -1\\ 0 & & & 1 \end{bmatrix}, \qquad B_n^{-1} = \begin{bmatrix} 1 & 1 & \cdots & 2^{n-2}\\ & \ddots & \ddots & \vdots\\ & & \ddots & 1\\ 0 & & & 1 \end{bmatrix},$$
$$\det(B_n) = 1, \quad \kappa(B_n) = n\,2^{n-1}, \quad \sigma_n(B_n) \approx 10^{-8} \ (n = 30).$$

$$D_n = \begin{bmatrix} 10^{-1} & & 0\\ & \ddots & \\ 0 & & 10^{-1} \end{bmatrix},$$
$$\det(D_n) = 10^{-n}, \quad \kappa_p(D_n) = 1, \quad \sigma_n(D_n) = 10^{-1}.$$

So a tiny determinant does not imply near-singularity ($D_n$ is perfectly conditioned), while a determinant of order one does not rule it out ($B_n$ is nearly singular for moderate $n$).
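A quick numerical confirmation for $n = 30$ (an illustrative sketch):

```python
import numpy as np

n = 30
B = np.eye(n) + np.triu(-np.ones((n, n)), 1)    # 1 on the diagonal, -1 strictly above

print(np.linalg.det(B))                         # 1.0: the determinant looks harmless
print(np.linalg.cond(B, np.inf))                # n * 2^(n-1), about 1.6e10
print(np.linalg.svd(B, compute_uv=False)[-1])   # sigma_n about 1e-8: nearly singular
```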

Appendix

Proof of Theorem 7: Without loss of generality (W.L.O.G.) we can assume that $M(x) = \|x\|_\infty$ and $N$ is arbitrary. We claim
$$c_1 \|x\|_\infty \leq N(x) \leq c_2 \|x\|_\infty,$$
or equivalently
$$c_1 \leq N(z) \leq c_2 \quad \text{for } z \in S = \{z \in \mathbb{C}^n \mid \|z\|_\infty = 1\}.$$
From Lemma 6, $N$ is continuous on $S$ (closed and bounded). By the maximum and minimum principle, there are $c_1, c_2 \geq 0$ and $z_1, z_2 \in S$ such that
$$c_1 = N(z_1) \leq N(z) \leq N(z_2) = c_2.$$
If $c_1 = 0$, then $N(z_1) = 0$; thus $z_1 = 0$. This contradicts $z_1 \in S$.

Proof of (3):
$$\|Ax\|_1 = \sum_i \Big|\sum_j a_{ij} x_j\Big| \leq \sum_i \sum_j |a_{ij}|\, |x_j| = \sum_j |x_j| \sum_i |a_{ij}|.$$
Let
$$C := \sum_i |a_{ik}| = \max_j \sum_i |a_{ij}|.$$
Then $\|Ax\|_1 \leq C \|x\|_1$, thus $\|A\|_1 \leq C$. On the other hand, $\|e_k\|_1 = 1$ and $\|Ae_k\|_1 = \sum_{i=1}^n |a_{ik}| = C$.

Proof of (4):
$$\|Ax\|_\infty = \max_i \Big|\sum_j a_{ij} x_j\Big| \leq \max_i \sum_j |a_{ij} x_j| \leq \Big(\max_i \sum_j |a_{ij}|\Big) \|x\|_\infty \equiv \Big(\sum_j |a_{kj}|\Big) \|x\|_\infty \equiv \hat{C}\, \|x\|_\infty.$$
This implies $\|A\|_\infty \leq \hat{C}$. If $A = 0$, then there is nothing to prove. Assume $A \neq 0$; thus the $k$-th row of $A$ is nonzero. Define $z = [z_i] \in \mathbb{C}^n$ by
$$z_i = \begin{cases} \dfrac{\bar{a}_{ki}}{|a_{ki}|} & \text{if } a_{ki} \neq 0,\\[1ex] 1 & \text{if } a_{ki} = 0. \end{cases}$$
Then $\|z\|_\infty = 1$ and $a_{kj} z_j = |a_{kj}|$ for $j = 1, \dots, n$. It follows that
$$\|A\|_\infty \geq \|Az\|_\infty = \max_i \Big|\sum_j a_{ij} z_j\Big| \geq \Big|\sum_j a_{kj} z_j\Big| = \sum_{j=1}^n |a_{kj}| \equiv \hat{C}.$$
Hence $\|A\|_\infty \geq \max_{1\leq i\leq n} \sum_{j=1}^n |a_{ij}| \equiv \hat{C}$, and equality holds.

Proof of (5): Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0$ be the eigenvalues of $A^* A$. There are mutually orthonormal vectors $v_j$, $j = 1, \dots, n$, such that $(A^* A) v_j = \lambda_j v_j$. Let $x = \sum_j \alpha_j v_j$. Since $\|Ax\|_2^2 = (Ax, Ax) = (x, A^* A x)$,
$$\|Ax\|_2^2 = \Big(\sum_j \alpha_j v_j,\ \sum_j \alpha_j \lambda_j v_j\Big) = \sum_j \lambda_j |\alpha_j|^2 \leq \lambda_1 \|x\|_2^2.$$
Therefore $\|A\|_2^2 \leq \lambda_1$. Equality follows by choosing $x = v_1$, since $\|A v_1\|_2^2 = (v_1, \lambda_1 v_1) = \lambda_1$. So we have $\|A\|_2 = \sqrt{\rho(A^* A)}$.

Proof of Theorem 12: Let $|\lambda| = \rho(A) \equiv \rho$ and let $x$ be the associated eigenvector with $\|x\| = 1$. Then
$$\rho(A) = |\lambda| = \|\lambda x\| = \|Ax\| \leq \|A\| \|x\| = \|A\|.$$
Claim: $\|A\|_\varepsilon \leq \rho(A) + \varepsilon$. There is a unitary $U$ such that $A = U^* R U$, where $R$ is upper triangular. Let $D_t = \operatorname{diag}(t, t^2, \dots, t^n)$. For $t > 0$ large enough, the sum of all absolute values of the off-diagonal elements of $D_t R D_t^{-1}$ is less than $\varepsilon$. So it holds that
$$\|D_t R D_t^{-1}\|_1 \leq \rho(A) + \varepsilon \quad \text{for large } t(\varepsilon) > 0.$$
Define $\|\cdot\|_\varepsilon$ for any $B$ by
$$\|B\|_\varepsilon = \|D_t U B U^* D_t^{-1}\|_1 = \|(U^* D_t^{-1})^{-1} B (U^* D_t^{-1})\|_1.$$
This implies
$$\|A\|_\varepsilon = \|D_t R D_t^{-1}\|_1 \leq \rho(A) + \varepsilon.$$

Proof of Theorem 14: There are $x \in \mathbb{C}^n$, $y \in \mathbb{C}^m$ with $\|x\|_2 = \|y\|_2 = 1$ such that $Ax = \sigma y$, where $\sigma = \|A\|_2$ $\big(\|A\|_2 = \sup_{\|x\|_2 = 1} \|Ax\|_2\big)$. Let $V = [x, V_1] \in \mathbb{C}^{n\times n}$ and $U = [y, U_1] \in \mathbb{C}^{m\times m}$ be unitary. Then
$$A_1 \equiv U^* A V = \begin{bmatrix} \sigma & w^*\\ 0 & B \end{bmatrix}.$$
Since
$$\left\|A_1 \begin{bmatrix} \sigma\\ w \end{bmatrix}\right\|_2^2 \geq (\sigma^2 + w^* w)^2,$$
it follows that $\|A_1\|_2^2 \geq \sigma^2 + w^* w$ from
$$\left\|A_1 \begin{bmatrix} \sigma\\ w \end{bmatrix}\right\|_2^2 \bigg/ \left\|\begin{bmatrix} \sigma\\ w \end{bmatrix}\right\|_2^2 \geq \sigma^2 + w^* w.$$
But $\sigma^2 = \|A\|_2^2 = \|A_1\|_2^2$, which implies $w = 0$. Hence the theorem holds by induction.

Proof of (8): Claim: $\|x\|_q \leq \|x\|_p$ for $p \leq q$. It holds that
$$\|x\|_q = \left\| \|x\|_p \frac{x}{\|x\|_p} \right\|_q = \|x\|_p \left\| \frac{x}{\|x\|_p} \right\|_q \leq C_{p,q}\, \|x\|_p,$$
where
$$C_{p,q} = \max_{\|e\|_p = 1} \|e\|_q, \qquad e = (e_1, \dots, e_n)^T.$$
We now show that $C_{p,q} \leq 1$. From $p \leq q$ (so that $|e_i| \leq 1$ when $\|e\|_p = 1$), we have
$$\|e\|_q^q = \sum_{i=1}^n |e_i|^q \leq \sum_{i=1}^n |e_i|^p = 1.$$
Hence $C_{p,q} \leq 1$, thus $\|x\|_q \leq \|x\|_p$.

To prove the second inequality, let $\alpha = q/p > 1$. Then the Jensen inequality holds for the convex function $\varphi(x) \equiv x^\alpha$:
$$\int_\Omega |f|^q\, dx = \int_\Omega (|f|^p)^{q/p}\, dx \geq \left(\int_\Omega |f|^p\, dx\right)^{q/p} \quad \text{with } |\Omega| = 1.$$
Consider the discrete measure $\sum_{i=1}^n \frac{1}{n} = 1$ and $f(i) = |x_i|$. It follows that
$$\sum_{i=1}^n |x_i|^q\, \frac{1}{n} \geq \left(\sum_{i=1}^n |x_i|^p\, \frac{1}{n}\right)^{q/p}.$$
Hence we have
$$n^{-1/q}\, \|x\|_q \geq n^{-1/p}\, \|x\|_p,$$
and thus $n^{(q-p)/pq}\, \|x\|_q \geq \|x\|_p$.

Proof of (9): Let $\|x\|_\infty = |x_k|$. Then
$$\|x\|_\infty = |x_k| = (|x_k|^q)^{1/q} \leq \left(\sum_{i=1}^n |x_i|^q\right)^{1/q} = \|x\|_q.$$
On the other hand,
$$\|x\|_q = \left(\sum_{i=1}^n |x_i|^q\right)^{1/q} \leq \left(n\, \|x\|_\infty^q\right)^{1/q} = n^{1/q}\, \|x\|_\infty.$$
This proves (9) (with $q = p$); letting $q \to \infty$ also shows that $\lim_{q\to\infty} \|x\|_q = \|x\|_\infty$.


Proof of (10): The first inequality holds obviously. Now, for the second inequality, for $\|y\|_p = 1$ we have
$$\|Ay\|_p \leq \sum_{j=1}^n |y_j|\, \|a_j\|_p \leq \Big(\sum_{j=1}^n |y_j|\Big) \max_j \|a_j\|_p = \|y\|_1 \max_j \|a_j\|_p \leq n^{(p-1)/p} \max_j \|a_j\|_p \quad \text{(by (8))}.$$

Proof of (15): It holds that
$$\max_{\|x\|_p=1} \|Ax\|_p = \max_{\|x\|_p=1} \max_{\|y\|_q=1} |(Ax)^T y| = \max_{\|y\|_q=1} \max_{\|x\|_p=1} |x^T (A^T y)| = \max_{\|y\|_q=1} \|A^T y\|_q = \|A^T\|_q.$$

Proof of (16): By (12) and (15), we get
$$m^{1/p}\, \|A\|_\infty = m^{1/p}\, \|A^T\|_1 = m^{1 - 1/q}\, \|A^T\|_1 = m^{(q-1)/q}\, \|A^T\|_1 \geq \|A^T\|_q = \|A\|_p.$$

Proof of (17): It holds that
$$\|A\|_p \|A\|_q = \|A^T\|_q \|A\|_q \geq \|A^T A\|_q \geq \|A^T A\|_2 = \|A\|_2^2.$$
The last inequality holds by the following statement: let $S$ be a symmetric matrix; then $\|S\|_2 \leq \|S\|$ for any matrix operator norm $\|\cdot\|$. Indeed, since $|\lambda| \leq \|S\|$ and
$$\|S\|_2 = \sqrt{\rho(S^* S)} = \sqrt{\rho(S^2)} = \max_{\lambda\in\sigma(S)} |\lambda| = |\lambda_{\max}|,$$
this implies $\|S\|_2 \leq \|S\|$.

Proof of (18): By (8), we get
$$\|A\|_p = \max_{\|x\|_p = 1} \|Ax\|_p \leq \max_{\|x\|_q \leq 1} m^{(q-p)/pq}\, \|Ax\|_q = m^{(q-p)/pq}\, \|A\|_q.$$
