
Tsung-Ming Huang

Department of Mathematics, National Taiwan Normal University

September 8, 2011


### Outline

1 Vectors and matrices

2 Rank and orthogonality

3 Eigenvalues and Eigenvectors

4 Norms and eigenvalues

5 Backward and Forward errors

### Vectors and matrices

A ∈ F^{m×n} with

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}, \qquad F = \mathbb{R} \text{ or } \mathbb{C}.$$

Product of matrices: C = AB, where $c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$, i = 1, …, m, j = 1, …, p.

Transpose: C = A^{T}, where c_{ij} = a_{ji} ∈ R.

Conjugate transpose: C = A^{∗} or C = A^{H}, where c_{ij} = ā_{ji} ∈ C.

Differentiation: Let C = (c_{ij}(t)). Then Ċ = dC/dt = [ċ_{ij}(t)].

Outer product of x ∈ F^{m} and y ∈ F^{n}:

$$xy^{*} = \begin{bmatrix} x_1\bar{y}_1 & \cdots & x_1\bar{y}_n \\ \vdots & \ddots & \vdots \\ x_m\bar{y}_1 & \cdots & x_m\bar{y}_n \end{bmatrix} \in F^{m\times n}.$$

Inner product of x ∈ F^{n} and y ∈ F^{n}:

$$\langle y, x\rangle := x^{T}y = \sum_{i=1}^{n} x_i y_i = y^{T}x \in \mathbb{R},$$

$$\langle y, x\rangle := x^{*}y = \sum_{i=1}^{n} \bar{x}_i y_i = y^{*}x \in \mathbb{C}.$$

Sherman-Morrison Formula:

Let A ∈ R^{n×n} be nonsingular and u, v ∈ R^{n}. If v^{T}A^{−1}u ≠ −1, then

$$(A + uv^{T})^{-1} = A^{-1} - A^{-1}uv^{T}A^{-1}/(1 + v^{T}A^{-1}u). \tag{1}$$

Sherman-Morrison-Woodbury Formula:

Let A ∈ R^{n×n} be nonsingular and U, V ∈ R^{n×k}. If (I + V^{T}A^{−1}U) is invertible, then

$$(A + UV^{T})^{-1} = A^{-1} - A^{-1}U(I + V^{T}A^{-1}U)^{-1}V^{T}A^{-1}.$$

Proof of (1):

$$\begin{aligned}
&(A + uv^{T})\left[A^{-1} - A^{-1}uv^{T}A^{-1}/(1 + v^{T}A^{-1}u)\right] \\
&= I + \frac{1}{1 + v^{T}A^{-1}u}\left[uv^{T}A^{-1}(1 + v^{T}A^{-1}u) - uv^{T}A^{-1} - uv^{T}A^{-1}uv^{T}A^{-1}\right] \\
&= I + \frac{1}{1 + v^{T}A^{-1}u}\left[u(v^{T}A^{-1}u)v^{T}A^{-1} - u(v^{T}A^{-1}u)v^{T}A^{-1}\right] \\
&= I.
\end{aligned}$$
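Formula (1) is easy to check numerically; a minimal NumPy sketch (the matrix and vectors are arbitrary illustrative data, shifted so A is safely nonsingular):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # shift keeps A well-conditioned
u = rng.standard_normal(n)
v = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
denom = 1.0 + v @ Ainv @ u                        # hypothesis: v^T A^{-1} u != -1

# rank-1 update of the inverse via (1): A^{-1} - A^{-1}u v^T A^{-1} / denom
sm = Ainv - np.outer(Ainv @ u, Ainv.T @ v) / denom
direct = np.linalg.inv(A + np.outer(u, v))
print(np.allclose(sm, direct))                    # True
```

The update costs O(n²) once A^{−1} (or a factorization) is available, versus O(n³) for refactoring A + uv^{T} from scratch.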

Example 1

$$\tilde{A} = \begin{bmatrix} 3 & -1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 2 & 2 \\ 0 & 0 & 4 & 1 & 1 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & -1 & 0 & 0 & 3 \end{bmatrix} = A + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ -1 \end{bmatrix}\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \end{bmatrix},$$

where

$$A = \begin{bmatrix} 3 & -1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 2 & 2 \\ 0 & 0 & 4 & 1 & 1 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}.$$

### Rank and orthogonality

Let A ∈ R^{m×n}. Then

R(A) = {y ∈ R^{m} | y = Ax for some x ∈ R^{n}} ⊆ R^{m} is the range space of A.

N(A) = {x ∈ R^{n} | Ax = 0} ⊆ R^{n} is the null space of A.

rank(A) = dim[R(A)] = the maximal number of linearly independent columns of A.

rank(A) = rank(A^{T}).

dim(N(A)) + rank(A) = n.

If m = n, then A is nonsingular ⇔ N(A) = {0} ⇔ rank(A) = n.
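These facts can be verified numerically; a minimal NumPy sketch (the matrix is an illustrative example with one dependent row, so rank(A) = 2 and dim N(A) = 1):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],     # = 2 * first row, so rank drops to 2
              [1., 0., 1.]])

r = np.linalg.matrix_rank(A)    # rank(A) = dim R(A)
print(r)                        # 2

# A basis for N(A) from the SVD: right singular vectors with sigma ~ 0
U, s, Vh = np.linalg.svd(A)
null_basis = Vh[r:].T           # dim N(A) = n - rank(A) = 1
print(np.allclose(A @ null_basis, 0))   # True
```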

Let {x_1, …, x_p} ⊂ R^{n}. Then {x_1, …, x_p} is said to be orthogonal if

x_i^{T}x_j = 0, for i ≠ j,

and orthonormal if

x_i^{T}x_j = δ_{ij},

where δ_{ij} = 0 if i ≠ j and δ_{ij} = 1 if i = j.

S^{⊥} = {y ∈ R^{m} | y^{T}x = 0, for all x ∈ S} = orthogonal complement of S.

R^{m} = R(A) ⊕ N(A^{T}).

R^{n} = R(A^{T}) ⊕ N(A).

R(A)^{⊥} = N(A^{T}).

R(A^{T})^{⊥} = N(A).

### Special matrices

A ∈ R^{n×n}:

symmetric: A^{T} = A

skew-symmetric: A^{T} = −A

positive definite: x^{T}Ax > 0 for x ≠ 0

non-negative definite: x^{T}Ax ≥ 0

indefinite: (x^{T}Ax)(y^{T}Ay) < 0 for some x, y

orthogonal: A^{T}A = I_n

normal: A^{T}A = AA^{T}

positive: a_{ij} > 0

non-negative: a_{ij} ≥ 0.

A ∈ C^{n×n}:

Hermitian: A^{∗} = A (A^{H} = A)

skew-Hermitian: A^{∗} = −A

positive definite: x^{∗}Ax > 0 for x ≠ 0

non-negative definite: x^{∗}Ax ≥ 0

indefinite: (x^{∗}Ax)(y^{∗}Ay) < 0 for some x, y

unitary: A^{∗}A = I_n

Let A ∈ F^{n×n}. Then the matrix A is

diagonal if a_{ij} = 0 for i ≠ j; denote D = diag(d_{1}, …, d_{n}) ∈ D_{n};

tridiagonal if a_{ij} = 0 for |i − j| > 1;

upper bi-diagonal if a_{ij} = 0 for i > j or j > i + 1;

(strictly) upper triangular if a_{ij} = 0 for i > j (i ≥ j);

upper Hessenberg if a_{ij} = 0 for i > j + 1.

(The lower-triangular variants are defined analogously.)

Sparse matrix: only about n^{1+r} nonzero entries, where r < 1 (usually between 0.2 and 0.5). For example, if n = 1000 and r = 0.9, then n^{1+r} ≈ 501187.

### Eigenvalues and Eigenvectors

Definition 2

Let A ∈ C^{n×n}. Then λ ∈ C is called an eigenvalue of A if there exists x ≠ 0, x ∈ C^{n}, with Ax = λx; x is called an eigenvector corresponding to λ.

Notations:

σ(A) := spectrum of A = the set of eigenvalues of A.

ρ(A) := spectral radius of A = max{|λ| : λ ∈ σ(A)}.

λ ∈ σ(A) ⇔ det(A − λI) = 0.

p(λ) = det(λI − A) = characteristic polynomial of A.

$$p(\lambda) = \prod_{i=1}^{s}(\lambda - \lambda_i)^{m(\lambda_i)}, \quad \lambda_i \neq \lambda_j \ (i \neq j), \quad \sum_{i=1}^{s} m(\lambda_i) = n.$$

m(λ_{i}) = algebraic multiplicity of λ_{i}.

n(λ_{i}) = n − rank(A − λ_{i}I) = geometric multiplicity of λ_{i}.
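The two multiplicities can differ; a minimal NumPy sketch using a 2×2 Jordan-type block as an illustrative example:

```python
import numpy as np

# Jordan block: eigenvalue 2 with algebraic multiplicity m(2) = 2
A = np.array([[2., 1.],
              [0., 2.]])

lam = np.linalg.eigvals(A)
rho = max(abs(lam))                 # spectral radius rho(A)
print(rho)                          # 2.0

# geometric multiplicity n(2) = n - rank(A - 2I) = 2 - 1 = 1 < m(2) = 2,
# so A is degenerated (defective): only one independent eigenvector
g = A.shape[0] - np.linalg.matrix_rank(A - 2.0 * np.eye(2))
print(g)                            # 1
```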

If there is some i such that n(λ_{i}) < m(λ_{i}), then A is called degenerated (defective).

The following statements are equivalent:

(1) There are n linearly independent eigenvectors;

(2) A is diagonalizable, i.e., there is a nonsingular matrix T such that T^{−1}AT ∈ D_{n};

(3) For each λ ∈ σ(A), it holds that m(λ) = n(λ).

If A is degenerated, then the eigenvectors together with the principal (generalized) eigenvectors yield the Jordan form.


Theorem 3 (Schur decomposition)

(1) Let A ∈ C^{n×n}. There is a unitary matrix U such that U^{∗}AU is
upper triangular.

(2) Let A ∈ R^{n×n}. There is an orthogonal matrix Q such that Q^{T}AQ is
quasi-upper triangular, i.e., an upper triangular matrix possibly with
nonzero subdiagonal elements in non-consecutive positions.

(3) A is normal if and only if there is a unitary U such that U^{∗}AU = D
is diagonal.

(4) A is Hermitian if and only if A is normal and σ(A) ⊆ R.

(5) A is symmetric if and only if there is an orthogonal U such that
U^{T}AU = D is diagonal and σ(A) ⊆ R.
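Statements (3) and (5) can be checked numerically; a minimal NumPy sketch with a real symmetric (hence normal) illustrative matrix:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
print(np.allclose(A.T @ A, A @ A.T))          # True: A is normal

w, U = np.linalg.eigh(A)                      # orthogonal U, real eigenvalues w
print(np.allclose(U.T @ U, np.eye(3)))        # True: U is orthogonal
print(np.allclose(U.T @ A @ U, np.diag(w)))   # True: U^T A U = D is diagonal
```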

### Norms and eigenvalues

Let X be a vector space over F = R or C.

Definition 4 (Vector norms)

Let N be a real-valued function defined on X (N : X −→ R+). Then N is a (vector) norm, if

N1: N (αx) = |α|N (x), α ∈ F, for x ∈ X;

N2: N (x + y) ≤ N (x) + N (y), for x, y ∈ X;

N3: N (x) = 0 if and only if x = 0.

The usual notation is kxk = N (x).

Example 5

Let X = C^{n}, p ≥ 1. Then $\|x\|_p = \left(\sum_{i=1}^{n}|x_i|^p\right)^{1/p}$ is an $l_p$-norm. Especially,

$$\|x\|_1 = \sum_{i=1}^{n}|x_i| \quad (l_1\text{-norm}),$$

$$\|x\|_2 = \Big(\sum_{i=1}^{n}|x_i|^2\Big)^{1/2} \quad (\text{Euclidean norm}),$$

$$\|x\|_\infty = \max_{1\le i\le n}|x_i| \quad (\text{maximum norm}).$$
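The three norms above map directly to `numpy.linalg.norm`; a minimal sketch with an illustrative vector:

```python
import numpy as np

x = np.array([3., -4., 0.])

print(np.linalg.norm(x, 1))       # 7.0  = |3| + |-4| + |0|
print(np.linalg.norm(x, 2))       # 5.0  = sqrt(9 + 16)
print(np.linalg.norm(x, np.inf))  # 4.0  = max |x_i|
```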

Lemma 6

N(x) is a continuous function of the components x_1, …, x_n of x.

Proof:

$$|N(x) - N(y)| \le N(x - y) \le \sum_{j=1}^{n}|x_j - y_j|\,N(e_j) \le \|x - y\|_\infty \sum_{j=1}^{n} N(e_j).$$

Theorem 7 (Equivalence of norms)

Let N and M be two norms on C^{n}. Then there exist constants c_{1}, c_{2} > 0 such that

$$c_1 M(x) \le N(x) \le c_2 M(x), \quad \text{for all } x \in \mathbb{C}^n.$$

Proof of Theorem 7

Remark: Theorem 7 does not hold in infinite-dimensional spaces.

### Norms and eigenvalues

Definition 8 (Matrix norms)

Let A ∈ C^{m×n}. A real-valued function ‖·‖ : C^{m×n} → R_+ satisfying

N1: ‖αA‖ = |α|‖A‖;

N2: ‖A + B‖ ≤ ‖A‖ + ‖B‖;

N3: ‖A‖ = 0 if and only if A = 0;

N4: ‖AB‖ ≤ ‖A‖‖B‖;

N5: ‖Ax‖_v ≤ ‖A‖‖x‖_v.

If ‖·‖ satisfies N1 to N4, then it is called a matrix norm. If, in addition, N5 holds for some vector norm ‖·‖_v, the matrix norm and the vector norm are said to be compatible.

Example 9 (Frobenius norm)

Let $\|A\|_F = \big\{\sum_{i,j}|a_{ij}|^2\big\}^{1/2}$. Then

$$\|AB\|_F = \Big[\sum_{i,j}\Big|\sum_k a_{ik}b_{kj}\Big|^2\Big]^{1/2} \le \Big[\sum_{i,j}\Big(\sum_k|a_{ik}|^2\Big)\Big(\sum_k|b_{kj}|^2\Big)\Big]^{1/2} \quad \text{(Cauchy--Schwarz inequality)}$$

$$= \Big(\sum_i\sum_k|a_{ik}|^2\Big)^{1/2}\Big(\sum_j\sum_k|b_{kj}|^2\Big)^{1/2} = \|A\|_F\|B\|_F.$$

This implies that N4 holds. Similarly,

$$\|Ax\|_2 = \Big[\sum_i\Big|\sum_j a_{ij}x_j\Big|^2\Big]^{1/2} \le \Big[\sum_i\Big(\sum_j|a_{ij}|^2\Big)\Big(\sum_j|x_j|^2\Big)\Big]^{1/2} = \|A\|_F\|x\|_2. \tag{2}$$

This implies that N5 holds. N1, N2 and N3 hold obviously. (Note that ‖I‖_F = √n.)

Example 10 (Operator norm)

Given a vector norm ‖·‖, an associated matrix norm is defined by

$$\|A\| = \sup_{x\neq 0}\frac{\|Ax\|}{\|x\|} = \max_{x\neq 0}\frac{\|Ax\|}{\|x\|} = \max_{\|x\|=1}\|Ax\|.$$

N5 holds immediately. On the other hand,

$$\|(AB)x\| = \|A(Bx)\| \le \|A\|\,\|Bx\| \le \|A\|\,\|B\|\,\|x\| \quad \text{for all } x \neq 0.$$

This implies that ‖AB‖ ≤ ‖A‖‖B‖. Thus N4 holds. (Note that ‖I‖ = 1.)

Three useful matrix norms:

$$\|A\|_1 = \sup_{x\neq 0}\frac{\|Ax\|_1}{\|x\|_1} = \max_{1\le j\le n}\sum_{i=1}^{n}|a_{ij}| \tag{3}$$

$$\|A\|_\infty = \sup_{x\neq 0}\frac{\|Ax\|_\infty}{\|x\|_\infty} = \max_{1\le i\le n}\sum_{j=1}^{n}|a_{ij}| \tag{4}$$

$$\|A\|_2 = \sup_{x\neq 0}\frac{\|Ax\|_2}{\|x\|_2} = \sqrt{\rho(A^{*}A)} \tag{5}$$

Proof of (3)-(5)

Example 11 (Dual norm)

Let 1/p + 1/q = 1. Then ‖·‖*_p = ‖·‖_q (p = ∞, q = 1). (This follows from the Hölder inequality, i.e., |y^{∗}x| ≤ ‖x‖_p‖y‖_q.)
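Formulas (3)–(5) are easy to verify numerically; a minimal NumPy sketch with an illustrative 2×2 matrix:

```python
import numpy as np

A = np.array([[1., -2.],
              [3.,  4.]])

# (3): maximum absolute column sum; (4): maximum absolute row sum
print(np.linalg.norm(A, 1))        # 6.0 = |-2| + |4|
print(np.linalg.norm(A, np.inf))   # 7.0 = |3| + |4|

# (5): the 2-norm equals sqrt(rho(A* A))
rho = max(abs(np.linalg.eigvals(A.T @ A)))
print(np.isclose(np.linalg.norm(A, 2), np.sqrt(rho)))  # True
```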

Theorem 12

Let A ∈ C^{n×n}. Then for any operator norm ‖·‖, it holds that

$$\rho(A) \le \|A\|.$$

Moreover, for any ε > 0, there exists an operator norm ‖·‖_ε such that

$$\|A\|_\epsilon \le \rho(A) + \epsilon.$$

Proof of Theorem 12

Lemma 13

Let U and V be unitary. Then

$$\|UAV\|_F = \|A\|_F, \qquad \|UAV\|_2 = \|A\|_2.$$

Theorem 14 (Singular Value Decomposition (SVD))

Let A ∈ C^{m×n}. Then there exist unitary matrices U = [u_1, …, u_m] ∈ C^{m×m} and V = [v_1, …, v_n] ∈ C^{n×n} such that

$$U^{*}AV = \operatorname{diag}(\sigma_1, \ldots, \sigma_p) = \Sigma, \tag{6}$$

where p = min{m, n} and σ_1 ≥ σ_2 ≥ ⋯ ≥ σ_p ≥ 0. (Here σ_i denotes the i-th largest singular value of A.)

Proof of Theorem 14

Remark: From (6), we have ‖A‖_2 = √(ρ(A^{∗}A)) = σ_1, the maximal singular value of A, and

$$\|ABC\|_F = \|U\Sigma V^{*}BC\|_F = \|\Sigma V^{*}BC\|_F \le \sigma_1\|V^{*}BC\|_F = \sigma_1\|BC\|_F = \|A\|_2\|BC\|_F.$$

This implies

$$\|ABC\|_F \le \|A\|_2\|B\|_F\|C\|_2. \tag{7}$$

In addition, by (2) and (7), we get

$$\|A\|_2 \le \|A\|_F \le \sqrt{n}\,\|A\|_2.$$
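The identity ‖A‖₂ = σ₁ and the last pair of inequalities can be checked numerically; a minimal NumPy sketch with a random illustrative matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))            # m = 4, n = 3

s = np.linalg.svd(A, compute_uv=False)     # sigma_1 >= sigma_2 >= sigma_3
two = np.linalg.norm(A, 2)
fro = np.linalg.norm(A, 'fro')

print(np.isclose(two, s[0]))               # True: ||A||_2 = sigma_1
print(two <= fro <= np.sqrt(3) * two)      # True: ||A||_2 <= ||A||_F <= sqrt(n) ||A||_2
```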

Theorem 15

Let A ∈ C^{n×n}. The following statements are equivalent:

(1) lim_{m→∞} A^{m} = 0;

(2) lim_{m→∞} A^{m}x = 0 for all x;

(3) ρ(A) < 1.

Proof:

(1) ⇒ (2): Trivial.

(2) ⇒ (3): Let λ ∈ σ(A), i.e., Ax = λx with x ≠ 0. Then A^{m}x = λ^{m}x → 0 forces λ^{m} → 0. Thus |λ| < 1, i.e., ρ(A) < 1.

(3) ⇒ (1): There is an operator norm ‖·‖ with ‖A‖ < 1 (by Theorem 12). Therefore, ‖A^{m}‖ ≤ ‖A‖^{m} → 0, i.e., A^{m} → 0.

Theorem 16

$$\rho(A) = \lim_{k\to\infty}\|A^k\|^{1/k}.$$

Proof: Since

$$\rho(A)^k = \rho(A^k) \le \|A^k\| \;\Rightarrow\; \rho(A) \le \|A^k\|^{1/k}, \quad k = 1, 2, \ldots$$

If ε > 0, then Ã = [ρ(A) + ε]^{−1}A has spectral radius < 1, and ‖Ã^k‖ → 0 as k → ∞. There is an N = N(ε, A) such that ‖Ã^k‖ < 1 for all k ≥ N. Thus,

$$\|A^k\| \le [\rho(A) + \epsilon]^k, \quad \text{for all } k \ge N,$$

or

$$\|A^k\|^{1/k} \le \rho(A) + \epsilon, \quad \text{for all } k \ge N.$$

Since ρ(A) ≤ ‖A^k‖^{1/k}, and k, ε are arbitrary, lim_{k→∞} ‖A^k‖^{1/k} exists and equals ρ(A).
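The limit is visible numerically even for a strongly non-normal matrix, where ‖A‖₂ is far above ρ(A); a minimal NumPy sketch (the matrix is illustrative):

```python
import numpy as np

# Non-normal: ||A||_2 ~ 10, but rho(A) = 0.6
A = np.array([[0.5, 10.0],
              [0.0,  0.6]])
rho = max(abs(np.linalg.eigvals(A)))

for k in [1, 5, 20, 80]:
    est = np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1.0 / k)
    print(k, est)   # the estimates decrease toward rho(A) = 0.6
```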

Theorem 17

Let A ∈ C^{n×n} with ρ(A) < 1. Then (I − A)^{−1} exists and

$$(I - A)^{-1} = I + A + A^{2} + \cdots.$$

Proof: Since ρ(A) < 1, the eigenvalues of I − A are nonzero, so (I − A)^{−1} exists. By Theorem 15,

$$(I - A)(I + A + A^{2} + \cdots + A^{m}) = I - A^{m+1} \to I.$$

Corollary 18

If ‖A‖ < 1, then (I − A)^{−1} exists and

$$\|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}.$$

Proof: Since ρ(A) ≤ ‖A‖ < 1 (by Theorem 12), Theorem 17 gives

$$\|(I - A)^{-1}\| = \Big\|\sum_{k=0}^{\infty} A^k\Big\| \le \sum_{k=0}^{\infty}\|A\|^k = \frac{1}{1 - \|A\|}.$$
Theorem 19 (without proof)

For A ∈ F^{n×n} the following statements are equivalent:

(1) There is a multiplicative norm p with p(A^{k}) ≤ 1, k = 1, 2, ….

(2) For each multiplicative norm p, the powers p(A^{k}) are uniformly bounded, i.e., there exists an M(p) < ∞ such that p(A^{k}) ≤ M(p), k = 0, 1, 2, ….

(3) ρ(A) ≤ 1, and every eigenvalue λ with |λ| = 1 is not degenerated (i.e., m(λ) = n(λ)).

(See Householder: The Theory of Matrices in Numerical Analysis, pp. 45-47.)

In the following, we prove some important inequalities for vector norms and matrix norms.

$$1 \le \frac{\|x\|_p}{\|x\|_q} \le n^{(q-p)/pq}, \quad (p \le q). \tag{8}$$

Proof of (8)

$$1 \le \frac{\|x\|_p}{\|x\|_\infty} \le n^{1/p}. \tag{9}$$

Proof of (9)

$$\max_{1\le j\le n}\|a_j\|_p \le \|A\|_p \le n^{(p-1)/p}\max_{1\le j\le n}\|a_j\|_p, \tag{10}$$

where A = [a_{1}, …, a_{n}] ∈ R^{m×n}.

Proof of (10)

$$\max_{i,j}|a_{ij}| \le \|A\|_p \le n^{(p-1)/p}\,m^{1/p}\max_{i,j}|a_{ij}|, \tag{11}$$

where A ∈ R^{m×n}.

Proof of (11): By (9) and (10) immediately.

$$m^{(1-p)/p}\|A\|_1 \le \|A\|_p \le n^{(p-1)/p}\|A\|_1. \tag{12}$$

Proof of (12): By (10) and (8) immediately.

Hölder inequality:

$$|x^{T}y| \le \|x\|_p\|y\|_q, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1. \tag{13}$$

Proof of (13): Let α_i = x_i/‖x‖_p and β_i = y_i/‖y‖_q. Then

$$(\alpha_i^p)^{1/p}(\beta_i^q)^{1/q} \le \frac{1}{p}\alpha_i^p + \frac{1}{q}\beta_i^q. \quad \text{(Jensen inequality)}$$

Since ‖α‖_p = 1 and ‖β‖_q = 1, it follows that

$$\sum_{i=1}^{n}\alpha_i\beta_i \le \frac{1}{p} + \frac{1}{q} = 1.$$

Then we have |x^{T}y| ≤ ‖x‖_p‖y‖_q.

$$\max\{|x^{T}y| : \|x\|_p = 1\} = \|y\|_q. \tag{14}$$

Proof of (14): Take x_i = y_i^{q-1}/‖y‖_q^{q/p}. Then

$$\|x\|_p^p = \frac{\sum|y_i|^q}{\|y\|_q^{q}} = \frac{\|y\|_q^q}{\|y\|_q^{q}} = 1 \quad (\because (q-1)p = q).$$

It follows that

$$\sum_{i=1}^{n} x_i y_i = \frac{\sum|y_i|^q}{\|y\|_q^{q/p}} = \frac{\|y\|_q^q}{\|y\|_q^{q/p}} = \|y\|_q.$$

Remark: There exists ẑ with ‖ẑ‖_p = 1 such that ‖y‖_q = ẑ^{T}y. Let z = ẑ/‖y‖_q. Then there exists z such that z^{T}y = 1 with ‖z‖_p = 1/‖y‖_q.

$$\|A\|_p = \|A^{T}\|_q. \tag{15}$$

Proof of (15)

$$n^{-1/p}\|A\|_\infty \le \|A\|_p \le m^{1/p}\|A\|_\infty. \tag{16}$$

Proof of (16)

$$\|A\|_2 \le \sqrt{\|A\|_p\|A\|_q}, \quad \Big(\frac{1}{p} + \frac{1}{q} = 1\Big). \tag{17}$$

Proof of (17)

$$n^{(p-q)/pq}\|A\|_q \le \|A\|_p \le m^{(q-p)/pq}\|A\|_q, \tag{18}$$

where A ∈ R^{m×n} and q ≥ p ≥ 1.

### Backward error and Forward error

Let x = F(a). We define backward and forward errors in Figure 1. In Figure 1, x̂ + ∆x = F(a + ∆a) is called a mixed forward-backward error, where |∆x| ≤ ε|x| and |∆a| ≤ η|a|.

Definition 20

(i) An algorithm is backward stable if, for all a, it produces a computed x̂ with a small backward error, i.e., x̂ = F(a + ∆a) with ∆a small.

(ii) An algorithm is numerically stable if it is stable in the mixed forward-backward error sense, i.e., x̂ + ∆x = F(a + ∆a) with both ∆a and ∆x small.

(iii) A method that produces answers with forward errors of similar magnitude to those produced by a backward stable method is called forward stable.


Figure: Relationship between backward and forward errors.

Remark:

(i) Backward stable ⇒ forward stable, but not vice versa!

(ii) Forward error ≤ condition number × backward error

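Remark (ii) can be illustrated numerically for linear systems; a minimal NumPy sketch (the matrix and the perturbed solution are illustrative, and the backward error here measures perturbations of A only):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])          # ill-conditioned: kappa_2(A) ~ 4e4
x = np.array([1.0, 1.0])
b = A @ x

x_hat = x + np.array([1e-3, -1e-3])    # a hypothetical computed solution
forward = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
r = b - A @ x_hat
backward = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat))
kappa = np.linalg.cond(A, 2)

# forward error is bounded by condition number times backward error
print(forward <= kappa * backward)     # True
```

Here the residual is tiny (the backward error is ≈ 3.5e-8) even though the forward error is ≈ 1e-3: the ill-conditioning of A inflates the forward error by roughly κ(A).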

Consider

$$\hat{x} - x = F(a + \Delta a) - F(a) = F'(a)\Delta a + \frac{F''(a + \theta\Delta a)}{2}(\Delta a)^2, \quad \theta \in (0, 1).$$

Then we have

$$\frac{\hat{x} - x}{x} = \frac{aF'(a)}{F(a)}\cdot\frac{\Delta a}{a} + O\big((\Delta a)^2\big).$$

The quantity C(a) = |aF'(a)/F(a)| is called the condition number of F. If x or F is a vector, then the condition number is defined in a similar way using norms; it measures the maximum relative change, which is attained for some, but not all, ∆a.

Backward error: a priori error estimate / a posteriori error estimate.

Lemma 21

Let Ax = b and

$$(A + \Delta A)\hat{x} = b + \Delta b$$

with ‖∆A‖ ≤ δ‖A‖ and ‖∆b‖ ≤ δ‖b‖. If δκ(A) = r < 1, then A + ∆A is nonsingular and

$$\frac{\|\hat{x}\|}{\|x\|} \le \frac{1+r}{1-r}.$$

Proof: Since ‖A^{−1}∆A‖ ≤ δ‖A^{−1}‖‖A‖ = r < 1, it follows that A + ∆A is nonsingular. From (I + A^{−1}∆A)x̂ = x + A^{−1}∆b, we have

$$\|\hat{x}\| \le \big\|(I + A^{-1}\Delta A)^{-1}\big\|\big(\|x\| + \delta\|A^{-1}\|\|b\|\big) \le \frac{1}{1-r}\Big(\|x\| + r\frac{\|b\|}{\|A\|}\Big) \le \frac{1+r}{1-r}\|x\|,$$

using ‖b‖ ≤ ‖A‖‖x‖ in the last step.

### Normwise Forward Error Bound

Theorem 22

If the conditions of Lemma 21 hold, then

$$\frac{\|x - \hat{x}\|}{\|x\|} \le \frac{2\delta}{1-r}\kappa(A).$$

Proof: Since x̂ − x = A^{−1}∆b − A^{−1}∆Ax̂, we have

$$\|\hat{x} - x\| \le \delta\|A^{-1}\|\|b\| + \delta\|A^{-1}\|\|A\|\|\hat{x}\|.$$

So, by Lemma 21, we have

$$\frac{\|\hat{x} - x\|}{\|x\|} \le \delta\kappa(A)\frac{\|b\|}{\|A\|\|x\|} + \delta\kappa(A)\frac{\|\hat{x}\|}{\|x\|} \le \delta\kappa(A)\Big(1 + \frac{1+r}{1-r}\Big) = \frac{2\delta}{1-r}\kappa(A).$$

### Componentwise Forward Error Bounds

Theorem 23

Let Ax = b and (A + ∆A)x̂ = b + ∆b. Let |∆A| ≤ δ|A| and |∆b| ≤ δ|b|. If δκ_∞(A) = r < 1, then A + ∆A is nonsingular and

$$\frac{\|\hat{x} - x\|_\infty}{\|x\|_\infty} \le \frac{2\delta}{1-r}\,\big\|\,|A^{-1}||A|\,\big\|_\infty.$$

Proof: Since ‖∆A‖_∞ ≤ δ‖A‖_∞ and ‖∆b‖_∞ ≤ δ‖b‖_∞, the conditions of Lemma 21 are satisfied in the ∞-norm. Then A + ∆A is nonsingular and ‖x̂‖_∞/‖x‖_∞ ≤ (1+r)/(1−r).

Since x̂ − x = A^{−1}∆b − A^{−1}∆Ax̂, we have

$$|\hat{x} - x| \le |A^{-1}||\Delta b| + |A^{-1}||\Delta A||\hat{x}| \le \delta|A^{-1}||b| + \delta|A^{-1}||A||\hat{x}| \le \delta|A^{-1}||A|\big(|x| + |\hat{x}|\big),$$

using |b| = |Ax| ≤ |A||x|.

Taking the ∞-norm, we get

$$\|\hat{x} - x\|_\infty \le \delta\,\big\||A^{-1}||A|\big\|_\infty\Big(\|x\|_\infty + \frac{1+r}{1-r}\|x\|_\infty\Big) = \frac{2\delta}{1-r}\,\underbrace{\big\||A^{-1}||A|\big\|_\infty}_{\text{Skeel condition number}}\,\|x\|_\infty.$$

### Condition Number by First Order Approximation

Consider the perturbed system

$$(A + \epsilon F)x(\epsilon) = b + \epsilon f, \qquad x(0) = x.$$

Then

$$\dot{x}(0) = A^{-1}(f - Fx), \qquad x(\epsilon) = x + \epsilon\dot{x}(0) + O(\epsilon^2),$$

$$\frac{\|x(\epsilon) - x\|}{\|x\|} \le \epsilon\|A^{-1}\|\Big(\frac{\|f\|}{\|x\|} + \|F\|\Big) + O(\epsilon^2).$$

With the condition number κ(A) := ‖A‖‖A^{−1}‖ and ‖b‖ ≤ ‖A‖‖x‖, this gives

$$\frac{\|x(\epsilon) - x\|}{\|x\|} \le \kappa(A)(\rho_A + \rho_b) + O(\epsilon^2),$$

where

$$\rho_A = \epsilon\frac{\|F\|}{\|A\|}, \qquad \rho_b = \epsilon\frac{\|f\|}{\|b\|}, \qquad \kappa_2(A) = \frac{\sigma_1(A)}{\sigma_n(A)}.$$

### Normwise Backward Error Bound

Theorem 24

Let x̂ be the computed solution of Ax = b. Then the normwise backward error

$$\eta(\hat{x}) := \min\{\epsilon \mid (A + \Delta A)\hat{x} = b + \Delta b,\ \|\Delta A\| \le \epsilon\|A\|,\ \|\Delta b\| \le \epsilon\|b\|\}$$

is given by

$$\eta(\hat{x}) = \frac{\|r\|}{\|A\|\|\hat{x}\| + \|b\|}, \tag{19}$$

where r = b − Ax̂ is the residual.
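Formula (19) is cheap to evaluate once the residual is known; a minimal NumPy sketch (matrix, right-hand side, and perturbation are illustrative):

```python
import numpy as np

A = np.array([[4., 1.],
              [1., 3.]])
b = np.array([1., 2.])
x_hat = np.linalg.solve(A, b) + 1e-6   # a slightly perturbed "computed" solution

r = b - A @ x_hat
eta = np.linalg.norm(r, 2) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat, 2)
                              + np.linalg.norm(b, 2))
print(eta)   # small normwise backward error, on the order of 1e-6
```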

Proof: The right-hand side of (19) is an upper bound for η(x̂). This upper bound is attained for the perturbation (by construction)

$$\Delta A_{\min} = \frac{\|A\|\|\hat{x}\|}{\|A\|\|\hat{x}\| + \|b\|}\,rz^{T}, \qquad \Delta b_{\min} = -\frac{\|b\|}{\|A\|\|\hat{x}\| + \|b\|}\,r,$$

where z is the dual vector of x̂, i.e., z^{T}x̂ = 1 and ‖z‖_{∗} = 1/‖x̂‖.

Check:

$$\|\Delta A_{\min}\| = \eta(\hat{x})\|A\|, \quad \text{i.e.,} \quad \|\Delta A_{\min}\| = \frac{\|A\|\|\hat{x}\|\,\|rz^{T}\|}{\|A\|\|\hat{x}\| + \|b\|} = \frac{\|r\|}{\|A\|\|\hat{x}\| + \|b\|}\,\|A\|,$$

i.e., the claim ‖rz^{T}‖ = ‖r‖/‖x̂‖. Indeed,

$$\|rz^{T}\| = \max_{\|u\|=1}\|(rz^{T})u\| = \|r\|\max_{\|u\|=1}|z^{T}u| = \|r\|\|z\|_{*} = \frac{\|r\|}{\|\hat{x}\|}.$$

### Componentwise Backward Error Bound

Theorem 25

The componentwise backward error

$$\omega(\hat{x}) := \min\{\epsilon \mid (A + \Delta A)\hat{x} = b + \Delta b,\ |\Delta A| \le \epsilon|A|,\ |\Delta b| \le \epsilon|b|\}$$

is given by

$$\omega(\hat{x}) = \max_i \frac{|r|_i}{(|A||\hat{x}| + |b|)_i}, \tag{20}$$

where r = b − Ax̂. (Note: ξ/0 = 0 if ξ = 0; ξ/0 = ∞ if ξ ≠ 0.)

Proof: The right-hand side of (20) is an upper bound for ω(x̂). This bound is attained for the perturbation

$$\Delta A = D_1|A|D_2, \qquad \Delta b = -D_1|b|,$$

where D_{1} = diag(r_{i}/(|A||x̂| + |b|)_{i}) and D_{2} = diag(sign(x̂_{i})).
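Formula (20) is also a one-liner to evaluate; a minimal NumPy sketch (same illustrative system as before):

```python
import numpy as np

A = np.array([[4., 1.],
              [1., 3.]])
b = np.array([1., 2.])
x_hat = np.linalg.solve(A, b) + 1e-6   # slightly perturbed "computed" solution

r = b - A @ x_hat
omega = np.max(np.abs(r) / (np.abs(A) @ np.abs(x_hat) + np.abs(b)))
print(omega)   # componentwise backward error, on the order of 1e-6
```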

### Determinants and Nearness to Singularity

$$B_n = \begin{bmatrix} 1 & -1 & \cdots & -1 \\ & 1 & \ddots & \vdots \\ & & \ddots & -1 \\ 0 & & & 1 \end{bmatrix}, \qquad B_n^{-1} = \begin{bmatrix} 1 & 1 & \cdots & 2^{n-2} \\ & \ddots & \ddots & \vdots \\ & & \ddots & 1 \\ 0 & & & 1 \end{bmatrix},$$

$$\det(B_n) = 1, \qquad \kappa_\infty(B_n) = n2^{n-1}, \qquad \sigma_n(B_n) \approx 10^{-8} \ (n = 30).$$

$$D_n = \begin{bmatrix} 10^{-1} & & 0 \\ & \ddots & \\ 0 & & 10^{-1} \end{bmatrix},$$

$$\det(D_n) = 10^{-n}, \qquad \kappa_p(D_n) = 1, \qquad \sigma_n(D_n) = 10^{-1}.$$
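The point of these two examples is that the determinant says nothing about nearness to singularity; a minimal NumPy sketch reproducing them for n = 30:

```python
import numpy as np

n = 30
# B_n: ones on the diagonal, -1 strictly above it
B = np.triu(-np.ones((n, n)), 1) + np.eye(n)

print(round(np.linalg.det(B)))                  # 1, yet B is almost singular:
sigma_min = np.linalg.svd(B, compute_uv=False)[-1]
print(sigma_min < 1e-7)                         # True (sigma_n ~ 1e-8)

# D_n: tiny determinant but perfectly conditioned
D = 0.1 * np.eye(n)
print(np.linalg.cond(D))                        # 1.0, although det(D) = 1e-30
```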

### Appendix

Proof of Theorem 7: Without loss of generality (W.L.O.G.) we may assume that M(x) = ‖x‖_∞ and N is arbitrary. We claim

$$c_1\|x\|_\infty \le N(x) \le c_2\|x\|_\infty,$$

or equivalently

$$c_1 \le N(z) \le c_2, \quad \text{for } z \in S = \{z \in \mathbb{C}^n \mid \|z\|_\infty = 1\}.$$

From Lemma 6, N is continuous on S (closed and bounded). By the maximum and minimum principle, there are c_1, c_2 ≥ 0 and z_1, z_2 ∈ S such that

$$c_1 = N(z_1) \le N(z) \le N(z_2) = c_2.$$

If c_1 = 0, then N(z_1) = 0, and thus z_1 = 0. This contradicts z_1 ∈ S.

Proof of (3):

$$\|Ax\|_1 = \sum_i\Big|\sum_j a_{ij}x_j\Big| \le \sum_i\sum_j|a_{ij}||x_j| = \sum_j|x_j|\sum_i|a_{ij}|.$$

Let

$$C := \sum_i|a_{ik}| = \max_j\sum_i|a_{ij}|.$$

Then ‖Ax‖_1 ≤ C‖x‖_1, thus ‖A‖_1 ≤ C. On the other hand, ‖e_k‖_1 = 1 and ‖Ae_k‖_1 = Σ_{i=1}^{n}|a_{ik}| = C.

Proof of (4):

$$\|Ax\|_\infty = \max_i\Big|\sum_j a_{ij}x_j\Big| \le \max_i\sum_j|a_{ij}x_j| \le \Big(\max_i\sum_j|a_{ij}|\Big)\|x\|_\infty \equiv \Big(\sum_j|a_{kj}|\Big)\|x\|_\infty \equiv \hat{C}\|x\|_\infty.$$

This implies ‖A‖_∞ ≤ Ĉ. If A = 0, then there is nothing to prove. Assume A ≠ 0; then the k-th row of A is nonzero. Define z = [z_i] ∈ C^{n} by

$$z_i = \begin{cases} \bar{a}_{ki}/|a_{ki}| & \text{if } a_{ki} \neq 0, \\ 1 & \text{if } a_{ki} = 0. \end{cases}$$

Then ‖z‖_∞ = 1 and a_{kj}z_j = |a_{kj}| for j = 1, …, n. It follows that

$$\|A\|_\infty \ge \|Az\|_\infty = \max_i\Big|\sum_j a_{ij}z_j\Big| \ge \Big|\sum_j a_{kj}z_j\Big| = \sum_{j=1}^{n}|a_{kj}| \equiv \hat{C}.$$

Then ‖A‖_∞ ≥ max_{1≤i≤n} Σ_{j=1}^{n}|a_{ij}| ≡ Ĉ.

Proof of (5): Let λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_n ≥ 0 be the eigenvalues of A^{∗}A. There are mutually orthonormal vectors v_j, j = 1, …, n, such that (A^{∗}A)v_j = λ_j v_j. Let x = Σ_j α_j v_j. Since ‖Ax‖_2² = (Ax, Ax) = (x, A^{∗}Ax),

$$\|Ax\|_2^2 = \Big(\sum_j\alpha_jv_j,\ \sum_j\alpha_j\lambda_jv_j\Big) = \sum_j\lambda_j|\alpha_j|^2 \le \lambda_1\|x\|_2^2.$$

Therefore ‖A‖_2² ≤ λ_1. Equality follows by choosing x = v_1: ‖Av_1‖_2² = (v_1, λ_1v_1) = λ_1. So we have ‖A‖_2 = √(ρ(A^{∗}A)).

Proof of Theorem 12: Let |λ| = ρ(A) ≡ ρ and let x be an associated eigenvector with ‖x‖ = 1. Then

$$\rho(A) = |\lambda| = \|\lambda x\| = \|Ax\| \le \|A\|\|x\| = \|A\|.$$

Claim: ‖A‖_ε ≤ ρ(A) + ε. There is a unitary U such that A = U^{∗}RU, where R is upper triangular. Let D_t = diag(t, t², …, t^{n}). For t > 0 large enough, the sum of the absolute values of the off-diagonal elements of D_tRD_t^{−1} is less than ε, so

$$\|D_tRD_t^{-1}\|_1 \le \rho(A) + \epsilon \quad \text{for large } t(\epsilon) > 0.$$

Define ‖·‖_ε for any B by

$$\|B\|_\epsilon = \|D_tUBU^{*}D_t^{-1}\|_1 = \|(U^{*}D_t^{-1})^{-1}B(U^{*}D_t^{-1})\|_1.$$

This implies

$$\|A\|_\epsilon = \|D_tRD_t^{-1}\|_1 \le \rho(A) + \epsilon.$$

Proof of Theorem 14: There are x ∈ C^{n}, y ∈ C^{m} with ‖x‖_2 = ‖y‖_2 = 1 such that Ax = σy, where σ = ‖A‖_2 (recall ‖A‖_2 = sup_{‖x‖_2=1}‖Ax‖_2). Let V = [x, V_1] ∈ C^{n×n} and U = [y, U_1] ∈ C^{m×m} be unitary. Then

$$A_1 \equiv U^{*}AV = \begin{bmatrix} \sigma & w^{*} \\ 0 & B \end{bmatrix}.$$

Since

$$\left\|A_1\begin{bmatrix}\sigma \\ w\end{bmatrix}\right\|_2^2 \ge (\sigma^2 + w^{*}w)^2,$$

it follows that ‖A_1‖_2² ≥ σ² + w^{∗}w from

$$\|A_1\|_2^2 \ge \frac{\left\|A_1\begin{bmatrix}\sigma \\ w\end{bmatrix}\right\|_2^2}{\left\|\begin{bmatrix}\sigma \\ w\end{bmatrix}\right\|_2^2} \ge \sigma^2 + w^{*}w.$$

But σ² = ‖A‖_2² = ‖A_1‖_2², so w = 0. Hence, the theorem holds by induction.

Proof of (8): Claim ‖x‖_q ≤ ‖x‖_p for p ≤ q: It holds that

$$\|x\|_q = \left\|\,\|x\|_p\,\frac{x}{\|x\|_p}\right\|_q = \|x\|_p\left\|\frac{x}{\|x\|_p}\right\|_q \le C_{p,q}\|x\|_p,$$

where

$$C_{p,q} = \max_{\|e\|_p=1}\|e\|_q, \qquad e = (e_1, \ldots, e_n)^{T}.$$

We now show that C_{p,q} ≤ 1. From p ≤ q, we have

$$\|e\|_q^q = \sum_{i=1}^{n}|e_i|^q \le \sum_{i=1}^{n}|e_i|^p = 1 \quad (\text{by } |e_i| \le 1).$$

Hence C_{p,q} ≤ 1, and thus ‖x‖_q ≤ ‖x‖_p.

To prove the second inequality, let α = q/p > 1. Then the Jensen inequality holds for the convex function ϕ(x) ≡ x^{α}:

$$\int_\Omega |f|^q\,dx = \int_\Omega (|f|^p)^{q/p}\,dx \ge \left(\int_\Omega |f|^p\,dx\right)^{q/p}, \quad \text{with } |\Omega| = 1.$$

Consider the discrete measure $\sum_{i=1}^{n}\frac{1}{n} = 1$ and f(i) = |x_i|. It follows that

$$\sum_{i=1}^{n}|x_i|^q\,\frac{1}{n} \ge \left(\sum_{i=1}^{n}|x_i|^p\,\frac{1}{n}\right)^{q/p}.$$

Hence we have

$$n^{-1/q}\|x\|_q \ge n^{-1/p}\|x\|_p,$$

and thus

$$n^{(q-p)/pq}\|x\|_q \ge \|x\|_p.$$

Proof of (9): Let q → ∞; we show lim_{q→∞}‖x‖_q = ‖x‖_∞. With |x_k| = max_i|x_i|,

$$\|x\|_\infty = |x_k| = (|x_k|^q)^{1/q} \le \left(\sum_{i=1}^{n}|x_i|^q\right)^{1/q} = \|x\|_q.$$

On the other hand,

$$\|x\|_q = \left(\sum_{i=1}^{n}|x_i|^q\right)^{1/q} \le (n\|x\|_\infty^q)^{1/q} \le n^{1/q}\|x\|_\infty.$$

It follows that lim_{q→∞}‖x‖_q = ‖x‖_∞.


Proof of (10): The first inequality holds obviously. Now, for the second inequality, with ‖y‖_p = 1 we have

$$\|Ay\|_p \le \sum_{j=1}^{n}|y_j|\,\|a_j\|_p \le \Big(\sum_{j=1}^{n}|y_j|\Big)\max_j\|a_j\|_p = \|y\|_1\max_j\|a_j\|_p \le n^{(p-1)/p}\max_j\|a_j\|_p. \quad (\text{by } (8))$$

Proof of (15): It holds that

$$\max_{\|x\|_p=1}\|Ax\|_p = \max_{\|x\|_p=1}\max_{\|y\|_q=1}|(Ax)^{T}y| = \max_{\|y\|_q=1}\max_{\|x\|_p=1}|x^{T}(A^{T}y)| = \max_{\|y\|_q=1}\|A^{T}y\|_q = \|A^{T}\|_q.$$

Proof of (16): By (12) and (15), we get

$$m^{1/p}\|A\|_\infty = m^{1/p}\|A^{T}\|_1 = m^{1-1/q}\|A^{T}\|_1 = m^{(q-1)/q}\|A^{T}\|_1 \ge \|A^{T}\|_q = \|A\|_p.$$

Proof of (17): It holds that

$$\|A\|_p\|A\|_q = \|A^{T}\|_q\|A\|_q \ge \|A^{T}A\|_q \ge \|A^{T}A\|_2 = \|A\|_2^2.$$

The second-to-last inequality holds by the following statement: let S be a symmetric matrix; then ‖S‖_2 ≤ ‖S‖ for any matrix operator norm ‖·‖. Since |λ| ≤ ‖S‖ for every λ ∈ σ(S) and

$$\|S\|_2 = \sqrt{\rho(S^{*}S)} = \sqrt{\rho(S^2)} = \max_{\lambda\in\sigma(S)}|\lambda| = |\lambda_{\max}|,$$

this implies ‖S‖_2 ≤ ‖S‖.

Proof of (18): By (8), we get

$$\|A\|_p = \max_{\|x\|_p=1}\|Ax\|_p \le \max_{\|x\|_q\le 1} m^{(q-p)/pq}\|Ax\|_q = m^{(q-p)/pq}\|A\|_q.$$