Introduction

Tsung-Ming Huang

Department of Mathematics, National Taiwan Normal University

September 8, 2011

Outline

1 Vectors and matrices

2 Rank and orthogonality

3 Eigenvalues and Eigenvectors

4 Norms and eigenvalues

5 Backward and Forward errors

Vectors and matrices

$A \in \mathbb{F}^{m \times n}$ with
$$A = [a_{ij}] = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}, \quad \mathbb{F} = \mathbb{R} \text{ or } \mathbb{C}.$$

Product of matrices: $C = AB$, where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$, $i = 1, \ldots, m$, $j = 1, \ldots, p$.

Transpose: $C = A^{T}$, where $c_{ij} = a_{ji} \in \mathbb{R}$.

Conjugate transpose: $C = A^{*}$ or $C = A^{H}$, where $c_{ij} = \bar{a}_{ji} \in \mathbb{C}$.

Differentiation: let $C = (c_{ij}(t))$. Then $\dot{C} = \frac{d}{dt} C = [\dot{c}_{ij}(t)]$.

Outer product of $x \in \mathbb{F}^{m}$ and $y \in \mathbb{F}^{n}$:
$$xy^{*} = \begin{bmatrix} x_{1}\bar{y}_{1} & \cdots & x_{1}\bar{y}_{n} \\ \vdots & \ddots & \vdots \\ x_{m}\bar{y}_{1} & \cdots & x_{m}\bar{y}_{n} \end{bmatrix} \in \mathbb{F}^{m \times n}.$$

Inner product of $x \in \mathbb{F}^{n}$ and $y \in \mathbb{F}^{n}$:
$$\langle y, x \rangle := x^{T} y = \sum_{i=1}^{n} x_{i} y_{i} = y^{T} x \in \mathbb{R},$$
$$\langle y, x \rangle := x^{*} y = \sum_{i=1}^{n} \bar{x}_{i} y_{i} = \overline{y^{*} x} \in \mathbb{C}.$$

Sherman-Morrison formula:

Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and $u, v \in \mathbb{R}^{n}$. If $v^{T} A^{-1} u \neq -1$, then
$$(A + uv^{T})^{-1} = A^{-1} - A^{-1} u v^{T} A^{-1} / (1 + v^{T} A^{-1} u). \tag{1}$$

Sherman-Morrison-Woodbury formula:

Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and $U, V \in \mathbb{R}^{n \times k}$. If $I + V^{T} A^{-1} U$ is invertible, then
$$(A + U V^{T})^{-1} = A^{-1} - A^{-1} U (I + V^{T} A^{-1} U)^{-1} V^{T} A^{-1}.$$

Proof of (1):
$$\begin{aligned} &(A + uv^{T})\left[A^{-1} - A^{-1} u v^{T} A^{-1} / (1 + v^{T} A^{-1} u)\right] \\ &= I + \frac{1}{1 + v^{T} A^{-1} u}\left[u v^{T} A^{-1}(1 + v^{T} A^{-1} u) - u v^{T} A^{-1} - u v^{T} A^{-1} u v^{T} A^{-1}\right] \\ &= I + \frac{1}{1 + v^{T} A^{-1} u}\left[u (v^{T} A^{-1} u) v^{T} A^{-1} - u (v^{T} A^{-1} u) v^{T} A^{-1}\right] \\ &= I. \end{aligned}$$
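Formula (1) is easy to verify numerically. The sketch below (NumPy; the test matrix and vectors are arbitrary choices, not from the slides) compares the Sherman-Morrison update of $A^{-1}$ with a direct inversion of $A + uv^{T}$:

```python
import numpy as np

# Numerical check of the Sherman-Morrison formula (1) on a random example.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)  # shifted to be well conditioned
u = rng.standard_normal(n)
v = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
denom = 1.0 + v @ Ainv @ u                        # must be nonzero for (1)
update_inv = Ainv - np.outer(Ainv @ u, v @ Ainv) / denom

direct_inv = np.linalg.inv(A + np.outer(u, v))
err = np.linalg.norm(update_inv - direct_inv)
```

In practice the update costs $O(n^{2})$ once $A^{-1}$ (or a factorization) is available, versus $O(n^{3})$ for refactoring $A + uv^{T}$ from scratch.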

Example 1

$$\tilde{A} = \begin{bmatrix} 3 & -1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 2 & 2 \\ 0 & 0 & 4 & 1 & 1 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & -1 & 0 & 0 & 3 \end{bmatrix} = A + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ -1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \end{bmatrix},$$

where

$$A = \begin{bmatrix} 3 & -1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 2 & 2 \\ 0 & 0 & 4 & 1 & 1 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}.$$

Rank and orthogonality

Let $A \in \mathbb{R}^{m \times n}$. Then

$\mathcal{R}(A) = \{y \in \mathbb{R}^{m} \mid y = Ax \text{ for some } x \in \mathbb{R}^{n}\} \subseteq \mathbb{R}^{m}$ is the range space of $A$.

$\mathcal{N}(A) = \{x \in \mathbb{R}^{n} \mid Ax = 0\} \subseteq \mathbb{R}^{n}$ is the null space of $A$.

$\operatorname{rank}(A) = \dim[\mathcal{R}(A)]$ = the maximal number of linearly independent columns of $A$.

$\operatorname{rank}(A) = \operatorname{rank}(A^{T})$.

$\dim(\mathcal{N}(A)) + \operatorname{rank}(A) = n$.

If $m = n$, then $A$ is nonsingular $\Leftrightarrow$ $\mathcal{N}(A) = \{0\}$ $\Leftrightarrow$ $\operatorname{rank}(A) = n$.
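These facts can be checked numerically. A small sketch (NumPy; the rank-deficient matrix is an illustrative choice, not from the slides):

```python
import numpy as np

# rank(A) = rank(A^T) and dim N(A) + rank(A) = n on a rank-deficient matrix.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],      # row 2 = 2 * row 1, so rank(A) = 2
              [1., 0., 1.]])
m, n = A.shape

rank = np.linalg.matrix_rank(A)
rank_T = np.linalg.matrix_rank(A.T)

# dim N(A) counted as the number of (numerically) zero singular values
s = np.linalg.svd(A, compute_uv=False)
dim_null = int(np.sum(s < 1e-10))
```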

Let $\{x_{1}, \ldots, x_{p}\} \subset \mathbb{R}^{n}$. Then $\{x_{1}, \ldots, x_{p}\}$ is said to be orthogonal if $x_{i}^{T} x_{j} = 0$ for $i \neq j$, and orthonormal if $x_{i}^{T} x_{j} = \delta_{ij}$, where $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ij} = 1$ if $i = j$.

$S^{\perp} = \{y \in \mathbb{R}^{m} \mid y^{T} x = 0 \text{ for all } x \in S\}$ = orthogonal complement of $S$.

$\mathbb{R}^{m} = \mathcal{R}(A) \oplus \mathcal{N}(A^{T})$.

$\mathbb{R}^{n} = \mathcal{R}(A^{T}) \oplus \mathcal{N}(A)$.

$\mathcal{R}(A)^{\perp} = \mathcal{N}(A^{T})$.

$\mathcal{R}(A^{T})^{\perp} = \mathcal{N}(A)$.

Special matrices

$A \in \mathbb{R}^{n \times n}$:
symmetric: $A^{T} = A$;
skew-symmetric: $A^{T} = -A$;
positive definite: $x^{T} A x > 0$ for $x \neq 0$;
non-negative definite: $x^{T} A x \geq 0$;
indefinite: $(x^{T} A x)(y^{T} A y) < 0$ for some $x, y$;
orthogonal: $A^{T} A = I_{n}$;
normal: $A^{T} A = A A^{T}$;
positive: $a_{ij} > 0$;
non-negative: $a_{ij} \geq 0$.

$A \in \mathbb{C}^{n \times n}$:
Hermitian: $A^{*} = A$ ($A^{H} = A$);
skew-Hermitian: $A^{*} = -A$;
positive definite: $x^{*} A x > 0$ for $x \neq 0$;
non-negative definite: $x^{*} A x \geq 0$;
indefinite: $(x^{*} A x)(y^{*} A y) < 0$ for some $x, y$;
unitary: $A^{*} A = I_{n}$;
normal: $A^{*} A = A A^{*}$.

Let $A \in \mathbb{F}^{n \times n}$. Then the matrix $A$ is

diagonal if $a_{ij} = 0$ for $i \neq j$; denote $D = \operatorname{diag}(d_{1}, \ldots, d_{n}) \in \mathcal{D}_{n}$;

tridiagonal if $a_{ij} = 0$ for $|i - j| > 1$;

upper bi-diagonal if $a_{ij} = 0$ for $i > j$ or $j > i + 1$;

(strictly) upper triangular if $a_{ij} = 0$ for $i > j$ ($i \geq j$);

upper Hessenberg if $a_{ij} = 0$ for $i > j + 1$.

(Note: the lower counterparts are defined analogously.)

Sparse matrix: the number of nonzero entries is about $n^{1+r}$ with $r < 1$ (usually between $0.2$ and $0.5$). If $n = 1000$ and $r = 0.9$, then $n^{1+r} \approx 501187$.

Eigenvalues and Eigenvectors

Definition 2

Let $A \in \mathbb{C}^{n \times n}$. Then $\lambda \in \mathbb{C}$ is called an eigenvalue of $A$ if there exists $x \neq 0$, $x \in \mathbb{C}^{n}$, with $Ax = \lambda x$; $x$ is called an eigenvector corresponding to $\lambda$.

Notations:

$\sigma(A)$ := spectrum of $A$ = the set of eigenvalues of $A$.

$\rho(A)$ := spectral radius of $A$ = $\max\{|\lambda| : \lambda \in \sigma(A)\}$.

$\lambda \in \sigma(A) \Leftrightarrow \det(A - \lambda I) = 0$.

$p(\lambda) = \det(\lambda I - A)$ = characteristic polynomial of $A$.

$p(\lambda) = \prod_{i=1}^{s} (\lambda - \lambda_{i})^{m(\lambda_{i})}$, where $\lambda_{i} \neq \lambda_{j}$ for $i \neq j$ and $\sum_{i=1}^{s} m(\lambda_{i}) = n$.

$m(\lambda_{i})$ = algebraic multiplicity of $\lambda_{i}$.

$n(\lambda_{i}) = n - \operatorname{rank}(A - \lambda_{i} I)$ = geometric multiplicity of $\lambda_{i}$.
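A quick numerical illustration of these notions (NumPy; the $2 \times 2$ triangular matrix is an illustrative choice, not from the slides):

```python
import numpy as np

# Spectrum and spectral radius of a small triangular matrix.
A = np.array([[2., 1.],
              [0., 3.]])
eigvals = np.linalg.eigvals(A)        # sigma(A) = {2, 3} (diagonal entries)
rho = np.max(np.abs(eigvals))         # spectral radius rho(A) = 3

# det(A - lambda I) vanishes at every eigenvalue
residuals = [abs(np.linalg.det(A - lam * np.eye(2))) for lam in eigvals]
```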

If there is some $i$ such that $n(\lambda_{i}) < m(\lambda_{i})$, then $A$ is called degenerate (defective).

The following statements are equivalent:

(1) $A$ has $n$ linearly independent eigenvectors;

(2) $A$ is diagonalizable, i.e., there is a nonsingular matrix $T$ such that $T^{-1} A T \in \mathcal{D}_{n}$;

(3) for each $\lambda \in \sigma(A)$, it holds that $m(\lambda) = n(\lambda)$.

If $A$ is degenerate, then the eigenvectors together with the principal vectors yield the Jordan form.

Theorem 3 (Schur decomposition)

(1) Let $A \in \mathbb{C}^{n \times n}$. There is a unitary matrix $U$ such that $U^{*} A U$ is upper triangular.

(2) Let $A \in \mathbb{R}^{n \times n}$. There is an orthogonal matrix $Q$ such that $Q^{T} A Q$ is quasi-upper triangular, i.e., an upper triangular matrix possibly with nonzero subdiagonal elements in non-consecutive positions.

(3) $A$ is normal if and only if there is a unitary $U$ such that $U^{*} A U = D$ is diagonal.

(4) $A$ is Hermitian if and only if $A$ is normal and $\sigma(A) \subseteq \mathbb{R}$.

(5) $A$ is symmetric if and only if there is an orthogonal $U$ such that $U^{T} A U = D$ is diagonal and $\sigma(A) \subseteq \mathbb{R}$.

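Items (3) and (5) can be observed numerically for a symmetric matrix: `np.linalg.eigh` returns an orthogonal matrix of eigenvectors that diagonalizes it. A sketch with an illustrative tridiagonal matrix (my choice, not from the slides):

```python
import numpy as np

# Spectral theorem in action: U is orthogonal and U^T A U is diagonal.
A = np.array([[2., 1., 0.],
              [1., 2., 1.],
              [0., 1., 2.]])       # symmetric (hence normal), real spectrum
w, U = np.linalg.eigh(A)           # eigh guarantees orthonormal eigenvectors

unitary_err = np.linalg.norm(U.T @ U - np.eye(3))
diag_err = np.linalg.norm(U.T @ A @ U - np.diag(w))
```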

Norms and eigenvalues

Let $X$ be a vector space over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$.

Definition 4 (Vector norms)

Let $N$ be a real-valued function defined on $X$ ($N : X \to \mathbb{R}_{+}$). Then $N$ is a (vector) norm if

N1: $N(\alpha x) = |\alpha| N(x)$, $\alpha \in \mathbb{F}$, for $x \in X$;

N2: $N(x + y) \leq N(x) + N(y)$, for $x, y \in X$;

N3: $N(x) = 0$ if and only if $x = 0$.

The usual notation is $\|x\| = N(x)$.

Example 5

Let $X = \mathbb{C}^{n}$, $p \geq 1$. Then $\|x\|_{p} = \left(\sum_{i=1}^{n} |x_{i}|^{p}\right)^{1/p}$ is an $l_{p}$-norm. Especially,

$$\|x\|_{1} = \sum_{i=1}^{n} |x_{i}| \quad (l_{1}\text{-norm}),$$

$$\|x\|_{2} = \left(\sum_{i=1}^{n} |x_{i}|^{2}\right)^{1/2} \quad (\text{Euclidean norm}),$$

$$\|x\|_{\infty} = \max_{1 \leq i \leq n} |x_{i}| \quad (\text{maximum norm}).$$
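The three norms of Example 5 for a concrete vector (an illustrative sketch using `np.linalg.norm`):

```python
import numpy as np

# l1, Euclidean and maximum norms of x = (3, -4, 0).
x = np.array([3., -4., 0.])
one_norm = np.linalg.norm(x, 1)       # |3| + |-4| + |0| = 7
two_norm = np.linalg.norm(x, 2)       # sqrt(9 + 16) = 5
inf_norm = np.linalg.norm(x, np.inf)  # max |x_i| = 4
```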

Lemma 6

$N(x)$ is a continuous function of the components $x_{1}, \ldots, x_{n}$ of $x$.

Proof:
$$|N(x) - N(y)| \leq N(x - y) \leq \sum_{j=1}^{n} |x_{j} - y_{j}|\, N(e_{j}) \leq \|x - y\|_{\infty} \sum_{j=1}^{n} N(e_{j}).$$

Theorem 7 (Equivalence of norms)

Let $N$ and $M$ be two norms on $\mathbb{C}^{n}$. Then there exist constants $c_{1}, c_{2} > 0$ such that
$$c_{1} M(x) \leq N(x) \leq c_{2} M(x), \quad \text{for all } x \in \mathbb{C}^{n}.$$

(Proof of Theorem 7: see Appendix.)

Remark: Theorem 7 does not hold in infinite-dimensional spaces.

Definition 8 (Matrix norms)

Let $A \in \mathbb{C}^{m \times n}$. Consider a real-valued function $\|\cdot\| : \mathbb{C}^{m \times n} \to \mathbb{R}_{+}$ and the properties

N1: $\|\alpha A\| = |\alpha| \|A\|$;

N2: $\|A + B\| \leq \|A\| + \|B\|$;

N3: $\|A\| = 0$ if and only if $A = 0$;

N4: $\|AB\| \leq \|A\| \|B\|$;

N5: $\|Ax\|_{v} \leq \|A\| \|x\|_{v}$.

If $\|\cdot\|$ satisfies N1 to N4, then it is called a matrix norm. If in addition N5 holds for some vector norm $\|\cdot\|_{v}$, the matrix norm and vector norm are said to be compatible.

Example 9 (Frobenius norm)

Let $\|A\|_{F} = \left\{\sum_{i,j=1}^{n} |a_{ij}|^{2}\right\}^{1/2}$. Then

$$\begin{aligned} \|AB\|_{F} &= \left\{\sum_{i,j} \Big|\sum_{k} a_{ik} b_{kj}\Big|^{2}\right\}^{1/2} \leq \left\{\sum_{i,j} \Big(\sum_{k} |a_{ik}|^{2}\Big) \Big(\sum_{k} |b_{kj}|^{2}\Big)\right\}^{1/2} \quad \text{(Cauchy-Schwarz inequality)} \\ &= \left(\sum_{i} \sum_{k} |a_{ik}|^{2}\right)^{1/2} \left(\sum_{j} \sum_{k} |b_{kj}|^{2}\right)^{1/2} = \|A\|_{F} \|B\|_{F}. \end{aligned}$$

This implies that N4 holds. Similarly,

$$\|Ax\|_{2} = \left\{\sum_{i} \Big|\sum_{j} a_{ij} x_{j}\Big|^{2}\right\}^{1/2} \leq \left\{\sum_{i} \Big(\sum_{j} |a_{ij}|^{2}\Big) \Big(\sum_{j} |x_{j}|^{2}\Big)\right\}^{1/2} = \|A\|_{F} \|x\|_{2}. \tag{2}$$

This implies that N5 holds. N1, N2 and N3 hold obviously. ($\|I\|_{F} = \sqrt{n}$.)
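The Frobenius-norm properties above can be sampled numerically (NumPy; random test matrices of my choosing):

```python
import numpy as np

# Check ||AB||_F <= ||A||_F ||B||_F (property N4) and ||I||_F = sqrt(n).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

lhs = np.linalg.norm(A @ B, 'fro')
rhs = np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')
identity_fro = np.linalg.norm(np.eye(4), 'fro')   # sqrt(4) = 2
```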

Example 10 (Operator norm)

Given a vector norm $\|\cdot\|$, an associated matrix norm is defined by
$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\|=1} \|Ax\|.$$
N5 holds immediately. On the other hand,
$$\|(AB)x\| = \|A(Bx)\| \leq \|A\| \|Bx\| \leq \|A\| \|B\| \|x\| \quad \text{for all } x \neq 0.$$
This implies that $\|AB\| \leq \|A\| \|B\|$. Thus N4 holds. ($\|I\| = 1$.)

Three useful matrix norms:

$$\|A\|_{1} = \sup_{x \neq 0} \frac{\|Ax\|_{1}}{\|x\|_{1}} = \max_{1 \leq j \leq n} \sum_{i=1}^{n} |a_{ij}|, \tag{3}$$

$$\|A\|_{\infty} = \sup_{x \neq 0} \frac{\|Ax\|_{\infty}}{\|x\|_{\infty}} = \max_{1 \leq i \leq n} \sum_{j=1}^{n} |a_{ij}|, \tag{4}$$

$$\|A\|_{2} = \sup_{x \neq 0} \frac{\|Ax\|_{2}}{\|x\|_{2}} = \sqrt{\rho(A^{*}A)}. \tag{5}$$

(Proofs of (3)-(5): see Appendix.)

Example 11 (Dual norm)

Let $\frac{1}{p} + \frac{1}{q} = 1$. Then the dual norm of $\|\cdot\|_{p}$ is $\|\cdot\|_{q}$ (with the pairing $p = \infty$, $q = 1$). This follows from the Hölder inequality $|y^{*} x| \leq \|x\|_{p} \|y\|_{q}$.
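Formulas (3)-(5) can be confirmed on a small example (NumPy; the matrix is an illustrative choice):

```python
import numpy as np

# ||A||_1 = max column sum, ||A||_inf = max row sum, ||A||_2 = sqrt(rho(A^T A)).
A = np.array([[1., -2.],
              [3.,  4.]])

one_norm = np.linalg.norm(A, 1)        # max(|1|+|3|, |-2|+|4|) = 6
inf_norm = np.linalg.norm(A, np.inf)   # max(|1|+|-2|, |3|+|4|) = 7
two_norm = np.linalg.norm(A, 2)        # largest singular value

col_sums = np.abs(A).sum(axis=0)
row_sums = np.abs(A).sum(axis=1)
rho = np.max(np.abs(np.linalg.eigvals(A.T @ A)))
```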

Theorem 12

Let $A \in \mathbb{C}^{n \times n}$. Then for any operator norm $\|\cdot\|$ it holds that $\rho(A) \leq \|A\|$. Moreover, for any $\varepsilon > 0$ there exists an operator norm $\|\cdot\|_{\varepsilon}$ such that $\|A\|_{\varepsilon} \leq \rho(A) + \varepsilon$.

(Proof of Theorem 12: see Appendix.)

Lemma 13

Let $U$ and $V$ be unitary. Then
$$\|UAV\|_{F} = \|A\|_{F}, \qquad \|UAV\|_{2} = \|A\|_{2}.$$

Theorem 14 (Singular Value Decomposition (SVD))

Let $A \in \mathbb{C}^{m \times n}$. Then there exist unitary matrices $U = [u_{1}, \ldots, u_{m}] \in \mathbb{C}^{m \times m}$ and $V = [v_{1}, \ldots, v_{n}] \in \mathbb{C}^{n \times n}$ such that
$$U^{*} A V = \operatorname{diag}(\sigma_{1}, \ldots, \sigma_{p}) = \Sigma, \tag{6}$$
where $p = \min\{m, n\}$ and $\sigma_{1} \geq \sigma_{2} \geq \cdots \geq \sigma_{p} \geq 0$. (Here $\sigma_{i}$ denotes the $i$-th largest singular value of $A$.)

(Proof of Theorem 14: see Appendix.)

Remark: From (6) we have $\|A\|_{2} = \sqrt{\rho(A^{*}A)} = \sigma_{1}$, the maximal singular value of $A$, and
$$\|ABC\|_{F} = \|U \Sigma V^{*} B C\|_{F} = \|\Sigma V^{*} B C\|_{F} \leq \sigma_{1} \|V^{*} B C\|_{F} = \|A\|_{2} \|BC\|_{F}.$$
Applying the same argument to $C$ gives
$$\|ABC\|_{F} \leq \|A\|_{2} \|B\|_{F} \|C\|_{2}. \tag{7}$$
In addition, by (2) and (7), we get $\|A\|_{2} \leq \|A\|_{F} \leq \sqrt{n} \|A\|_{2}$.

Theorem 15

Let $A \in \mathbb{C}^{n \times n}$. The following statements are equivalent:

(1) $\lim_{m \to \infty} A^{m} = 0$;

(2) $\lim_{m \to \infty} A^{m} x = 0$ for all $x$;

(3) $\rho(A) < 1$.

Proof:

(1) $\Rightarrow$ (2): Trivial.

(2) $\Rightarrow$ (3): Let $\lambda \in \sigma(A)$, i.e., $Ax = \lambda x$ with $x \neq 0$. Then $A^{m} x = \lambda^{m} x \to 0$ forces $\lambda^{m} \to 0$. Thus $|\lambda| < 1$, i.e., $\rho(A) < 1$.

(3) $\Rightarrow$ (1): There is an operator norm $\|\cdot\|$ with $\|A\| < 1$ (by Theorem 12). Therefore $\|A^{m}\| \leq \|A\|^{m} \to 0$, i.e., $A^{m} \to 0$.

Theorem 16

$$\rho(A) = \lim_{k \to \infty} \|A^{k}\|^{1/k}.$$

Proof: Since
$$\rho(A)^{k} = \rho(A^{k}) \leq \|A^{k}\| \quad \Rightarrow \quad \rho(A) \leq \|A^{k}\|^{1/k}$$
for $k = 1, 2, \ldots$. If $\varepsilon > 0$, then $\tilde{A} = [\rho(A) + \varepsilon]^{-1} A$ has spectral radius $< 1$, and $\|\tilde{A}^{k}\| \to 0$ as $k \to \infty$. There is an $N = N(\varepsilon, A)$ such that $\|\tilde{A}^{k}\| < 1$ for all $k \geq N$. Thus
$$\|A^{k}\| \leq [\rho(A) + \varepsilon]^{k}, \quad \text{for all } k \geq N,$$
or $\|A^{k}\|^{1/k} \leq \rho(A) + \varepsilon$ for all $k \geq N$. Since $\rho(A) \leq \|A^{k}\|^{1/k}$ and $k$, $\varepsilon$ are arbitrary, $\lim_{k \to \infty} \|A^{k}\|^{1/k}$ exists and equals $\rho(A)$.
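Theorem 16 (Gelfand's formula) is worth seeing numerically, especially for a non-normal matrix where $\|A\|$ greatly overestimates $\rho(A)$. A sketch (NumPy; the matrix is an illustrative choice):

```python
import numpy as np

# rho(A) = 0.5 although the entry 10 makes every usual norm of A large;
# ||A^k||^(1/k) still converges to the spectral radius.
A = np.array([[0.5, 10.0],
              [0.0,  0.4]])
rho = np.max(np.abs(np.linalg.eigvals(A)))

k = 200
Ak = np.linalg.matrix_power(A, k)
gelfand = np.linalg.norm(Ak, 2) ** (1.0 / k)
```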

Theorem 17

Let $A \in \mathbb{C}^{n \times n}$ with $\rho(A) < 1$. Then $(I - A)^{-1}$ exists and
$$(I - A)^{-1} = I + A + A^{2} + \cdots.$$

Proof: Since $\rho(A) < 1$, the eigenvalues of $I - A$ are nonzero. Therefore $(I - A)^{-1}$ exists, and by Theorem 15,
$$(I - A)(I + A + A^{2} + \cdots + A^{m}) = I - A^{m+1} \to I.$$

Corollary 18

If $\|A\| < 1$, then $(I - A)^{-1}$ exists and
$$\|(I - A)^{-1}\| \leq \frac{1}{1 - \|A\|}.$$

Proof: Since $\rho(A) \leq \|A\| < 1$ (by Theorem 12), Theorem 17 gives $\|(I - A)^{-1}\| \leq \sum_{k=0}^{\infty} \|A\|^{k} = 1/(1 - \|A\|)$.

Theorem 19 (without proof)

For $A \in \mathbb{F}^{n \times n}$ the following statements are equivalent:

(1) There is a multiplicative norm $p$ with $p(A^{k}) \leq 1$, $k = 1, 2, \ldots$.

(2) For each multiplicative norm $p$ the powers $p(A^{k})$ are uniformly bounded, i.e., there exists an $M(p) < \infty$ such that $p(A^{k}) \leq M(p)$, $k = 0, 1, 2, \ldots$.

(3) $\rho(A) \leq 1$ and all eigenvalues $\lambda$ with $|\lambda| = 1$ are not degenerate (i.e., $m(\lambda) = n(\lambda)$).

(See Householder: The Theory of Matrices in Numerical Analysis, pp. 45-47.)

In the following, we prove some important inequalities for vector norms and matrix norms.

$$1 \leq \frac{\|x\|_{p}}{\|x\|_{q}} \leq n^{(q-p)/pq}, \quad (p \leq q). \tag{8}$$

(Proof of (8): see Appendix.)

$$1 \leq \frac{\|x\|_{p}}{\|x\|_{\infty}} \leq n^{1/p}. \tag{9}$$

(Proof of (9): see Appendix.)

$$\max_{1 \leq j \leq n} \|a_{j}\|_{p} \leq \|A\|_{p} \leq n^{(p-1)/p} \max_{1 \leq j \leq n} \|a_{j}\|_{p}, \tag{10}$$
where $A = [a_{1}, \ldots, a_{n}] \in \mathbb{R}^{m \times n}$.

(Proof of (10): see Appendix.)

$$\max_{i,j} |a_{ij}| \leq \|A\|_{p} \leq n^{(p-1)/p} m^{1/p} \max_{i,j} |a_{ij}|, \tag{11}$$
where $A \in \mathbb{R}^{m \times n}$.

Proof of (11): by (9) and (10) immediately.

$$m^{(1-p)/p} \|A\|_{1} \leq \|A\|_{p} \leq n^{(p-1)/p} \|A\|_{1}. \tag{12}$$

Proof of (12): by (10) and (8) immediately.

Hölder inequality:

$$|x^{T} y| \leq \|x\|_{p} \|y\|_{q}, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1. \tag{13}$$

Proof of (13): Let $\alpha_{i} = \frac{x_{i}}{\|x\|_{p}}$, $\beta_{i} = \frac{y_{i}}{\|y\|_{q}}$. Then
$$(\alpha_{i}^{p})^{1/p} (\beta_{i}^{q})^{1/q} \leq \frac{1}{p} \alpha_{i}^{p} + \frac{1}{q} \beta_{i}^{q} \quad \text{(Jensen inequality)}.$$
Since $\|\alpha\|_{p} = 1$ and $\|\beta\|_{q} = 1$, it follows that
$$\sum_{i=1}^{n} \alpha_{i} \beta_{i} \leq \frac{1}{p} + \frac{1}{q} = 1.$$
Then we have $|x^{T} y| \leq \|x\|_{p} \|y\|_{q}$.

$$\max\{|x^{T} y| : \|x\|_{p} = 1\} = \|y\|_{q}. \tag{14}$$

Proof of (14): Take $x_{i} = y_{i}^{q-1} / \|y\|_{q}^{q/p}$. Then we have
$$\|x\|_{p}^{p} = \frac{\sum |y_{i}|^{(q-1)p}}{\|y\|_{q}^{q}} = \frac{\|y\|_{q}^{q}}{\|y\|_{q}^{q}} = 1 \quad (\because (q-1)p = q).$$
It follows that
$$\sum_{i=1}^{n} x_{i} y_{i} = \frac{\sum |y_{i}|^{q}}{\|y\|_{q}^{q/p}} = \frac{\|y\|_{q}^{q}}{\|y\|_{q}^{q/p}} = \|y\|_{q}.$$

Remark: There exists $\hat{z}$ with $\|\hat{z}\|_{p} = 1$ such that $\|y\|_{q} = \hat{z}^{T} y$. Let $z = \hat{z} / \|y\|_{q}$. Then there exists $z$ with $z^{T} y = 1$ and $\|z\|_{p} = \frac{1}{\|y\|_{q}}$.

$$\|A\|_{p} = \|A^{T}\|_{q}. \tag{15}$$

(Proof of (15): see Appendix.)

$$n^{-1/p} \|A\|_{\infty} \leq \|A\|_{p} \leq m^{1/p} \|A\|_{\infty}. \tag{16}$$

(Proof of (16): see Appendix.)

$$\|A\|_{2} \leq \sqrt{\|A\|_{p} \|A\|_{q}}, \quad \left(\frac{1}{p} + \frac{1}{q} = 1\right). \tag{17}$$

(Proof of (17): see Appendix.)

$$n^{(p-q)/pq} \|A\|_{q} \leq \|A\|_{p} \leq m^{(q-p)/pq} \|A\|_{q}, \tag{18}$$
where $A \in \mathbb{R}^{m \times n}$ and $q \geq p \geq 1$.
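Inequality (17) with the dual pair $p = 1$, $q = \infty$ gives the handy bound $\|A\|_{2} \leq \sqrt{\|A\|_{1} \|A\|_{\infty}}$, which is cheap to evaluate. A numerical sketch (NumPy; random test matrix of my choosing):

```python
import numpy as np

# ||A||_2 <= sqrt(||A||_1 ||A||_inf): a cheap a priori bound on the 2-norm.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

two_norm = np.linalg.norm(A, 2)
bound = np.sqrt(np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
```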

Backward and forward errors

Let $x = F(a)$. We define backward and forward errors as in Figure 1: $\hat{x} + \Delta x = F(a + \Delta a)$ is called a mixed forward-backward error, where $|\Delta x| \leq \varepsilon |x|$ and $|\Delta a| \leq \eta |a|$.

Definition 20

(i) An algorithm is backward stable if, for all $a$, it produces a computed $\hat{x}$ with a small backward error, i.e., $\hat{x} = F(a + \Delta a)$ with $\Delta a$ small.

(ii) An algorithm is numerically stable if it is stable in the mixed forward-backward error sense, i.e., $\hat{x} + \Delta x = F(a + \Delta a)$ with both $\Delta a$ and $\Delta x$ small.

(iii) A method that produces answers with forward errors of similar magnitude to those produced by a backward stable method is called forward stable.

Figure: Relationship between backward and forward errors.

Remark:

(i) Backward stable $\Rightarrow$ forward stable, but not vice versa!

(ii) Forward error $\lesssim$ condition number $\times$ backward error.

Consider
$$\hat{x} - x = F(a + \Delta a) - F(a) = F'(a) \Delta a + \frac{F''(a + \theta \Delta a)}{2} (\Delta a)^{2}, \quad \theta \in (0, 1).$$
Then we have
$$\frac{\hat{x} - x}{x} = \left(\frac{a F'(a)}{F(a)}\right) \frac{\Delta a}{a} + O\big((\Delta a)^{2}\big).$$
The quantity $C(a) = \left|\frac{a F'(a)}{F(a)}\right|$ is called the condition number of $F$. If $x$ or $F$ is a vector, then the condition number is defined in a similar way using norms; it measures the maximum relative change, which is attained for some, but not all, $\Delta a$.

Backward error analysis yields both a priori and a posteriori error estimates.
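A hypothetical worked example of $C(a)$ (my choice, not from the slides): for $F(a) = \sqrt{a}$, $C(a) = |a \cdot \tfrac{1}{2\sqrt{a}} / \sqrt{a}| = \tfrac{1}{2}$, so relative errors in the data are roughly halved in the result:

```python
import numpy as np

# Condition number C(a) = |a F'(a) / F(a)| for F(a) = sqrt(a) is 1/2:
# a relative perturbation da/a in the data produces about half that
# relative change in the result.
a = 2.0
da = 1e-6
x = np.sqrt(a)
x_pert = np.sqrt(a + da)

rel_change_ratio = abs((x_pert - x) / x) / abs(da / a)  # ~ C(a)
cond = abs(a * (0.5 / np.sqrt(a)) / np.sqrt(a))         # exactly 1/2
```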

Lemma 21

Let
$$Ax = b, \qquad (A + \Delta A)\hat{x} = b + \Delta b,$$
with $\|\Delta A\| \leq \delta \|A\|$ and $\|\Delta b\| \leq \delta \|b\|$. If $\delta \kappa(A) = r < 1$, then $A + \Delta A$ is nonsingular and
$$\frac{\|\hat{x}\|}{\|x\|} \leq \frac{1 + r}{1 - r}.$$

Proof: Since $\|A^{-1} \Delta A\| \leq \delta \|A^{-1}\| \|A\| = r < 1$, it follows that $A + \Delta A$ is nonsingular. From $(I + A^{-1} \Delta A)\hat{x} = x + A^{-1} \Delta b$, we have
$$\begin{aligned} \|\hat{x}\| &\leq \|(I + A^{-1} \Delta A)^{-1}\| \left(\|x\| + \delta \|A^{-1}\| \|b\|\right) \\ &\leq \frac{1}{1 - r} \left(\|x\| + \delta \|A^{-1}\| \|b\|\right) = \frac{1}{1 - r} \left(\|x\| + r \frac{\|b\|}{\|A\|}\right). \end{aligned}$$
Since $\|b\| = \|Ax\| \leq \|A\| \|x\|$, this gives $\|\hat{x}\| \leq \frac{1 + r}{1 - r} \|x\|$.

Normwise Forward Error Bound

Theorem 22

If the conditions of Lemma 21 hold, then
$$\frac{\|x - \hat{x}\|}{\|x\|} \leq \frac{2\delta}{1 - r} \kappa(A).$$

Proof: Since $\hat{x} - x = A^{-1} \Delta b - A^{-1} \Delta A \hat{x}$, we have
$$\|\hat{x} - x\| \leq \delta \|A^{-1}\| \|b\| + \delta \|A^{-1}\| \|A\| \|\hat{x}\|.$$
So, by Lemma 21, we have
$$\frac{\|\hat{x} - x\|}{\|x\|} \leq \delta \kappa(A) \frac{\|b\|}{\|A\| \|x\|} + \delta \kappa(A) \frac{\|\hat{x}\|}{\|x\|} \leq \delta \kappa(A) \left(1 + \frac{1 + r}{1 - r}\right) = \frac{2\delta}{1 - r} \kappa(A).$$

Componentwise Forward Error Bounds

Theorem 23

Let $Ax = b$ and $(A + \Delta A)\hat{x} = b + \Delta b$ with $|\Delta A| \leq \delta |A|$ and $|\Delta b| \leq \delta |b|$. If $\delta \kappa_{\infty}(A) = r < 1$, then $A + \Delta A$ is nonsingular and
$$\frac{\|\hat{x} - x\|_{\infty}}{\|x\|_{\infty}} \leq \frac{2\delta}{1 - r} \left\| |A^{-1}| |A| \right\|_{\infty}.$$

Proof: Since $\|\Delta A\|_{\infty} \leq \delta \|A\|_{\infty}$ and $\|\Delta b\|_{\infty} \leq \delta \|b\|_{\infty}$, the conditions of Lemma 21 are satisfied in the $\infty$-norm. Then $A + \Delta A$ is nonsingular and $\frac{\|\hat{x}\|_{\infty}}{\|x\|_{\infty}} \leq \frac{1 + r}{1 - r}$.

Since $\hat{x} - x = A^{-1} \Delta b - A^{-1} \Delta A \hat{x}$, we have
$$|\hat{x} - x| \leq |A^{-1}| |\Delta b| + |A^{-1}| |\Delta A| |\hat{x}| \leq \delta |A^{-1}| |b| + \delta |A^{-1}| |A| |\hat{x}| \leq \delta |A^{-1}| |A| (|x| + |\hat{x}|),$$
using $|b| = |Ax| \leq |A| |x|$. Taking $\infty$-norms, we get
$$\|\hat{x} - x\|_{\infty} \leq \delta \left\| |A^{-1}| |A| \right\|_{\infty} \left(\|x\|_{\infty} + \frac{1 + r}{1 - r} \|x\|_{\infty}\right) = \frac{2\delta}{1 - r} \underbrace{\left\| |A^{-1}| |A| \right\|_{\infty}}_{\text{Skeel condition number}} \|x\|_{\infty}.$$

Condition Number by First Order Approximation

Consider the perturbed system
$$(A + \varepsilon F)\, x(\varepsilon) = b + \varepsilon f, \qquad x(0) = x.$$
Then $\dot{x}(0) = A^{-1}(f - Fx)$ and $x(\varepsilon) = x + \varepsilon \dot{x}(0) + O(\varepsilon^{2})$, so
$$\frac{\|x(\varepsilon) - x\|}{\|x\|} \leq \varepsilon \|A^{-1}\| \left(\frac{\|f\|}{\|x\|} + \|F\|\right) + O(\varepsilon^{2}).$$
With the condition number $\kappa(A) := \|A\| \|A^{-1}\|$ and $\|b\| \leq \|A\| \|x\|$, it follows that
$$\frac{\|x(\varepsilon) - x\|}{\|x\|} \leq \kappa(A)(\rho_{A} + \rho_{b}) + O(\varepsilon^{2}), \qquad \rho_{A} = \varepsilon \frac{\|F\|}{\|A\|}, \quad \rho_{b} = \varepsilon \frac{\|f\|}{\|b\|}.$$
(For the 2-norm, $\kappa_{2}(A) = \sigma_{1}(A) / \sigma_{n}(A)$.)

Normwise Backward Error Bound

Theorem 24

Let $\hat{x}$ be the computed solution of $Ax = b$. Then the normwise backward error
$$\eta(\hat{x}) := \min \{\varepsilon \mid (A + \Delta A)\hat{x} = b + \Delta b,\ \|\Delta A\| \leq \varepsilon \|A\|,\ \|\Delta b\| \leq \varepsilon \|b\|\}$$
is given by
$$\eta(\hat{x}) = \frac{\|r\|}{\|A\| \|\hat{x}\| + \|b\|}, \tag{19}$$
where $r = b - A\hat{x}$ is the residual.

Proof: For any feasible perturbation, $r = b - A\hat{x} = \Delta A \hat{x} - \Delta b$, so $\|r\| \leq \varepsilon(\|A\| \|\hat{x}\| + \|b\|)$; hence the right-hand side of (19) is a lower bound for $\eta(\hat{x})$. This bound is attained for the perturbation (by construction)
$$\Delta A_{\min} = \frac{\|A\| \|\hat{x}\|}{\|A\| \|\hat{x}\| + \|b\|}\, r z^{T}, \qquad \Delta b_{\min} = -\frac{\|b\|}{\|A\| \|\hat{x}\| + \|b\|}\, r,$$
where $z$ is the dual vector of $\hat{x}$, i.e., $z^{T} \hat{x} = 1$ and $\|z\|_{D} = \frac{1}{\|\hat{x}\|}$ in the dual norm. Check: $\|\Delta A_{\min}\| = \eta(\hat{x}) \|A\|$, or
$$\|\Delta A_{\min}\| = \frac{\|A\| \|\hat{x}\| \|r z^{T}\|}{\|A\| \|\hat{x}\| + \|b\|} = \left(\frac{\|r\|}{\|A\| \|\hat{x}\| + \|b\|}\right) \|A\|,$$
i.e., we claim $\|r z^{T}\| = \frac{\|r\|}{\|\hat{x}\|}$. Indeed,
$$\|r z^{T}\| = \max_{\|u\|=1} \|(r z^{T}) u\| = \|r\| \max_{\|u\|=1} |z^{T} u| = \|r\| \|z\|_{D} = \|r\| \frac{1}{\|\hat{x}\|},$$
which proves the claim.
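Formula (19) makes the backward error directly computable from the residual. A sketch (NumPy; the system and the artificial perturbation of the solution are illustrative choices):

```python
import numpy as np

# Normwise backward error eta(xhat) = ||r|| / (||A|| ||xhat|| + ||b||),
# evaluated for a solution contaminated by a small perturbation.
A = np.array([[4., 1.],
              [1., 3.]])
b = np.array([1., 2.])
x = np.linalg.solve(A, b)
xhat = x + 1e-8 * np.array([1., -1.])   # pretend this is the computed solution

r = b - A @ xhat
eta = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(xhat)
                           + np.linalg.norm(b))
```

A tiny `eta` certifies that `xhat` solves a nearby system, regardless of how large the forward error happens to be.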

Componentwise Backward Error Bound

Theorem 25

The componentwise backward error
$$\omega(\hat{x}) := \min \{\varepsilon \mid (A + \Delta A)\hat{x} = b + \Delta b,\ |\Delta A| \leq \varepsilon |A|,\ |\Delta b| \leq \varepsilon |b|\}$$
is given by
$$\omega(\hat{x}) = \max_{i} \frac{|r|_{i}}{(|A| |\hat{x}| + |b|)_{i}}, \tag{20}$$
where $r = b - A\hat{x}$. (Note: $\xi/0 = 0$ if $\xi = 0$; $\xi/0 = \infty$ if $\xi \neq 0$.)

Proof: The right-hand side of (20) is a lower bound for $\omega(\hat{x})$. This bound is attained for the perturbation
$$\Delta A = D_{1} |A| D_{2}, \qquad \Delta b = -D_{1} |b|,$$
where $D_{1} = \operatorname{diag}\big(r_{i} / (|A| |\hat{x}| + |b|)_{i}\big)$ and $D_{2} = \operatorname{diag}(\operatorname{sign}(\hat{x}_{i}))$.
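Formula (20) is just as easy to evaluate as (19). A sketch (NumPy; same illustrative system as before, with a different artificial perturbation):

```python
import numpy as np

# Componentwise backward error omega(xhat) = max_i |r_i| / (|A||xhat| + |b|)_i.
A = np.array([[4., 1.],
              [1., 3.]])
b = np.array([1., 2.])
xhat = np.linalg.solve(A, b) + 1e-8     # pretend this is the computed solution

r = b - A @ xhat
denom = np.abs(A) @ np.abs(xhat) + np.abs(b)   # strictly positive here
omega = np.max(np.abs(r) / denom)
```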

Determinants and Nearness to Singularity

$$B_{n} = \begin{bmatrix} 1 & -1 & \cdots & -1 \\ & 1 & \ddots & \vdots \\ & & \ddots & -1 \\ 0 & & & 1 \end{bmatrix}, \qquad B_{n}^{-1} = \begin{bmatrix} 1 & 1 & \cdots & 2^{n-2} \\ & \ddots & \ddots & \vdots \\ & & \ddots & 1 \\ 0 & & & 1 \end{bmatrix},$$

$$\det(B_{n}) = 1, \quad \kappa(B_{n}) = n 2^{n-1}, \quad \sigma_{n}(B_{n}) \approx 10^{-8} \ (n = 30).$$

$$D_{n} = \begin{bmatrix} 10^{-1} & & 0 \\ & \ddots & \\ 0 & & 10^{-1} \end{bmatrix}, \qquad \det(D_{n}) = 10^{-n}, \quad \kappa_{p}(D_{n}) = 1, \quad \sigma_{n}(D_{n}) = 10^{-1}.$$
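The point of the example: the determinant says nothing about nearness to singularity. $B_{n}$ has determinant $1$ for every $n$, yet its smallest singular value collapses as $n$ grows. A numerical sketch (NumPy):

```python
import numpy as np

# B_30: det = 1, but nearly singular (tiny sigma_min, huge condition number).
n = 30
B = np.triu(-np.ones((n, n)), 1) + np.eye(n)   # 1 on diagonal, -1 above

det_B = np.linalg.det(B)
sigma_min = np.linalg.svd(B, compute_uv=False)[-1]
cond_B = np.linalg.cond(B, np.inf)
```

$D_{n}$ is the mirror image: its determinant $10^{-n}$ is tiny, yet it is perfectly conditioned ($\kappa_{p} = 1$).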

Appendix

Proof of Theorem 7: Without loss of generality (w.l.o.g.) we can assume that $M(x) = \|x\|_{\infty}$ and $N$ is arbitrary. We claim
$$c_{1} \|x\|_{\infty} \leq N(x) \leq c_{2} \|x\|_{\infty},$$
or equivalently
$$c_{1} \leq N(z) \leq c_{2}, \quad \text{for } z \in S = \{z \in \mathbb{C}^{n} \mid \|z\|_{\infty} = 1\}.$$
By Lemma 6, $N$ is continuous on $S$ (closed and bounded). By the extreme value theorem, there are $c_{1}, c_{2} \geq 0$ and $z_{1}, z_{2} \in S$ such that
$$c_{1} = N(z_{1}) \leq N(z) \leq N(z_{2}) = c_{2}.$$
If $c_{1} = 0$, then $N(z_{1}) = 0$, so $z_{1} = 0$. This contradicts $z_{1} \in S$.

Proof of (3):
$$\|Ax\|_{1} = \sum_{i} \Big|\sum_{j} a_{ij} x_{j}\Big| \leq \sum_{i} \sum_{j} |a_{ij}| |x_{j}| = \sum_{j} |x_{j}| \sum_{i} |a_{ij}|.$$
Let
$$C := \sum_{i} |a_{ik}| = \max_{j} \sum_{i} |a_{ij}|,$$
i.e., the maximal column sum is attained at column $k$. Then $\|Ax\|_{1} \leq C \|x\|_{1}$, thus $\|A\|_{1} \leq C$. On the other hand, $\|e_{k}\|_{1} = 1$ and $\|Ae_{k}\|_{1} = \sum_{i=1}^{n} |a_{ik}| = C$.

Proof of (4):
$$\|Ax\|_{\infty} = \max_{i} \Big|\sum_{j} a_{ij} x_{j}\Big| \leq \max_{i} \sum_{j} |a_{ij} x_{j}| \leq \Big(\max_{i} \sum_{j} |a_{ij}|\Big) \|x\|_{\infty} \equiv \sum_{j} |a_{kj}|\, \|x\|_{\infty} \equiv \hat{C} \|x\|_{\infty},$$
where the maximal row sum is attained at row $k$. This implies $\|A\|_{\infty} \leq \hat{C}$. If $A = 0$, then there is nothing to prove. Assume $A \neq 0$; then the $k$-th row of $A$ is nonzero. Define $z = [z_{i}] \in \mathbb{C}^{n}$ by
$$z_{i} = \begin{cases} \dfrac{\bar{a}_{ki}}{|a_{ki}|} & \text{if } a_{ki} \neq 0, \\[4pt] 1 & \text{if } a_{ki} = 0. \end{cases}$$
Then $\|z\|_{\infty} = 1$ and $a_{kj} z_{j} = |a_{kj}|$ for $j = 1, \ldots, n$. It follows that
$$\|A\|_{\infty} \geq \|Az\|_{\infty} = \max_{i} \Big|\sum_{j} a_{ij} z_{j}\Big| \geq \Big|\sum_{j} a_{kj} z_{j}\Big| = \sum_{j=1}^{n} |a_{kj}| \equiv \hat{C}.$$
Then $\|A\|_{\infty} \geq \max_{1 \leq i \leq n} \sum_{j=1}^{n} |a_{ij}| \equiv \hat{C}$.

Proof of (5): Let $\lambda_{1} \geq \lambda_{2} \geq \cdots \geq \lambda_{n} \geq 0$ be the eigenvalues of $A^{*}A$. There are mutually orthonormal vectors $v_{j}$, $j = 1, \ldots, n$, such that $(A^{*}A) v_{j} = \lambda_{j} v_{j}$. Let $x = \sum_{j} \alpha_{j} v_{j}$. Since $\|Ax\|_{2}^{2} = (Ax, Ax) = (x, A^{*}Ax)$,
$$\|Ax\|_{2}^{2} = \Big(\sum_{j} \alpha_{j} v_{j}, \sum_{j} \alpha_{j} \lambda_{j} v_{j}\Big) = \sum_{j} \lambda_{j} |\alpha_{j}|^{2} \leq \lambda_{1} \|x\|_{2}^{2}.$$
Therefore $\|A\|_{2}^{2} \leq \lambda_{1}$. Equality follows by choosing $x = v_{1}$, since $\|Av_{1}\|_{2}^{2} = (v_{1}, \lambda_{1} v_{1}) = \lambda_{1}$. So we have $\|A\|_{2} = \sqrt{\rho(A^{*}A)}$.

Proof of Theorem 12: Let $|\lambda| = \rho(A) \equiv \rho$ and let $x$ be an associated eigenvector with $\|x\| = 1$. Then
$$\rho(A) = |\lambda| = \|\lambda x\| = \|Ax\| \leq \|A\| \|x\| = \|A\|.$$
Claim: $\|A\|_{\varepsilon} \leq \rho(A) + \varepsilon$. There is a unitary $U$ such that $A = U R U^{*}$, where $R$ is upper triangular. Let $D_{t} = \operatorname{diag}(t, t^{2}, \ldots, t^{n})$. For $t > 0$ large enough, the sum of all absolute values of the off-diagonal elements of $D_{t} R D_{t}^{-1}$ is less than $\varepsilon$. So it holds that
$$\|D_{t} R D_{t}^{-1}\|_{1} \leq \rho(A) + \varepsilon \quad \text{for large } t(\varepsilon) > 0.$$
Define $\|\cdot\|_{\varepsilon}$ for any $B$ by
$$\|B\|_{\varepsilon} = \|D_{t} U^{*} B U D_{t}^{-1}\|_{1} = \|(U D_{t}^{-1})^{-1} B (U D_{t}^{-1})\|_{1}.$$
This implies
$$\|A\|_{\varepsilon} = \|D_{t} R D_{t}^{-1}\|_{1} \leq \rho(A) + \varepsilon.$$

Proof of Theorem 14: There are $x \in \mathbb{C}^{n}$, $y \in \mathbb{C}^{m}$ with $\|x\|_{2} = \|y\|_{2} = 1$ such that $Ax = \sigma y$, where $\sigma = \|A\|_{2}$ ($\|A\|_{2} = \sup_{\|x\|_{2}=1} \|Ax\|_{2}$). Let $V = [x, V_{1}] \in \mathbb{C}^{n \times n}$ and $U = [y, U_{1}] \in \mathbb{C}^{m \times m}$ be unitary. Then
$$A_{1} \equiv U^{*} A V = \begin{bmatrix} \sigma & w^{*} \\ 0 & B \end{bmatrix}.$$
Since
$$\left\|A_{1} \begin{bmatrix} \sigma \\ w \end{bmatrix}\right\|_{2}^{2} \geq (\sigma^{2} + w^{*}w)^{2} \quad \text{and} \quad \left\|\begin{bmatrix} \sigma \\ w \end{bmatrix}\right\|_{2}^{2} = \sigma^{2} + w^{*}w,$$
it follows that
$$\|A_{1}\|_{2}^{2} \geq \frac{\left\|A_{1} \begin{bmatrix} \sigma \\ w \end{bmatrix}\right\|_{2}^{2}}{\left\|\begin{bmatrix} \sigma \\ w \end{bmatrix}\right\|_{2}^{2}} \geq \sigma^{2} + w^{*}w.$$
But $\sigma^{2} = \|A\|_{2}^{2} = \|A_{1}\|_{2}^{2}$ (by Lemma 13), which implies $w = 0$. Hence the theorem holds by induction.

Proof of (8): Claim $\|x\|_{q} \leq \|x\|_{p}$ for $p \leq q$: it holds that
$$\|x\|_{q} = \left\| \|x\|_{p}\, \frac{x}{\|x\|_{p}} \right\|_{q} = \|x\|_{p} \left\| \frac{x}{\|x\|_{p}} \right\|_{q} \leq C_{p,q} \|x\|_{p},$$
where
$$C_{p,q} = \max_{\|e\|_{p}=1} \|e\|_{q}, \quad e = (e_{1}, \ldots, e_{n})^{T}.$$
We now show that $C_{p,q} \leq 1$. From $p \leq q$, we have
$$\|e\|_{q}^{q} = \sum_{i=1}^{n} |e_{i}|^{q} \leq \sum_{i=1}^{n} |e_{i}|^{p} = 1 \quad (\text{by } |e_{i}| \leq 1).$$
Hence $C_{p,q} \leq 1$, thus $\|x\|_{q} \leq \|x\|_{p}$.

To prove the second inequality, let $\alpha = q/p > 1$. Then the Jensen inequality holds for the convex function $\varphi(x) \equiv x^{\alpha}$:
$$\int_{\Omega} |f|^{q}\, dx = \int_{\Omega} (|f|^{p})^{q/p}\, dx \geq \left(\int_{\Omega} |f|^{p}\, dx\right)^{q/p} \quad \text{with } |\Omega| = 1.$$
Consider the discrete measure $\sum_{i=1}^{n} \frac{1}{n} = 1$ and $f(i) = |x_{i}|$. It follows that
$$\sum_{i=1}^{n} |x_{i}|^{q}\, \frac{1}{n} \geq \left(\sum_{i=1}^{n} |x_{i}|^{p}\, \frac{1}{n}\right)^{q/p}.$$
Hence we have $n^{-1/q} \|x\|_{q} \geq n^{-1/p} \|x\|_{p}$. Thus
$$n^{(q-p)/pq} \|x\|_{q} \geq \|x\|_{p}.$$

Proof of (9): For any $p \geq 1$, let $k$ be an index with $|x_{k}| = \max_{i} |x_{i}|$. Then
$$\|x\|_{\infty} = |x_{k}| = (|x_{k}|^{p})^{1/p} \leq \left(\sum_{i=1}^{n} |x_{i}|^{p}\right)^{1/p} = \|x\|_{p}.$$
On the other hand,
$$\|x\|_{p} = \left(\sum_{i=1}^{n} |x_{i}|^{p}\right)^{1/p} \leq (n \|x\|_{\infty}^{p})^{1/p} = n^{1/p} \|x\|_{\infty}.$$
This proves (9); letting $p \to \infty$, it also follows that $\lim_{p \to \infty} \|x\|_{p} = \|x\|_{\infty}$.


Proof of (10): The first inequality holds obviously, since $Ae_{j} = a_{j}$ and $\|e_{j}\|_{p} = 1$. For the second inequality, we have
$$\begin{aligned} \|Ay\|_{p} &\leq \sum_{j=1}^{n} |y_{j}| \|a_{j}\|_{p} \leq \Big(\sum_{j=1}^{n} |y_{j}|\Big) \max_{j} \|a_{j}\|_{p} \\ &= \|y\|_{1} \max_{j} \|a_{j}\|_{p} \leq n^{(p-1)/p} \|y\|_{p} \max_{j} \|a_{j}\|_{p} \quad \text{(by (8))}. \end{aligned}$$

Proof of (15): It holds that
$$\max_{\|x\|_{p}=1} \|Ax\|_{p} = \max_{\|x\|_{p}=1} \max_{\|y\|_{q}=1} |(Ax)^{T} y| = \max_{\|y\|_{q}=1} \max_{\|x\|_{p}=1} |x^{T}(A^{T} y)| = \max_{\|y\|_{q}=1} \|A^{T} y\|_{q} = \|A^{T}\|_{q}.$$

Proof of (16): By (12) and (15), we get
$$m^{1/p} \|A\|_{\infty} = m^{1/p} \|A^{T}\|_{1} = m^{1 - 1/q} \|A^{T}\|_{1} = m^{(q-1)/q} \|A^{T}\|_{1} \geq \|A^{T}\|_{q} = \|A\|_{p}.$$

Proof of (17): It holds that
$$\|A\|_{p} \|A\|_{q} = \|A^{T}\|_{q} \|A\|_{q} \geq \|A^{T} A\|_{q} \geq \|A^{T} A\|_{2} = \|A\|_{2}^{2},$$
since $\|A^{T}A\|_{2} = \rho(A^{T}A) = \|A\|_{2}^{2}$. The last inequality holds by the following statement: let $S$ be a symmetric matrix; then $\|S\|_{2} \leq \|S\|$ for any matrix operator norm $\|\cdot\|$. Indeed, since $|\lambda| \leq \|S\|$ by Theorem 12,
$$\|S\|_{2} = \sqrt{\rho(S^{*}S)} = \sqrt{\rho(S^{2})} = \max_{\lambda \in \sigma(S)} |\lambda| = |\lambda_{\max}|.$$
This implies $\|S\|_{2} \leq \|S\|$.

Proof of (18): By (8), we get
$$\|A\|_{p} = \max_{\|x\|_{p}=1} \|Ax\|_{p} \leq \max_{\|x\|_{q} \leq 1} m^{(q-p)/pq} \|Ax\|_{q} = m^{(q-p)/pq} \|A\|_{q}.$$