Chapter 7
Iterative Techniques in Matrix Algebra
Hung-Yuan Fan (范洪源)
Department of Mathematics, National Taiwan Normal University, Taiwan
Spring 2016
Section 7.1
Norms of Vectors and Matrices
Vector Norms
Def 7.1
A vector norm on R^n is a function ∥·∥ : R^n → R with the following properties:
(i) ∥x∥ ≥ 0 for all x ∈ R^n;
(ii) ∥x∥ = 0 ⇐⇒ x = 0;
(iii) ∥αx∥ = |α| ∥x∥ for all α ∈ R and x ∈ R^n;
(iv) ∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x, y ∈ R^n.
Note: The n-dimensional column vector x is often denoted by
x = [x_1, x_2, …, x_n]^T = (x_1, x_2, …, x_n)^T.
Useful Vector Norms
Def 7.2
The l_2 and l_∞ norms for the vector x = [x_1, x_2, …, x_n]^T ∈ R^n are defined by
∥x∥_2 = (∑_{i=1}^n x_i^2)^{1/2} and ∥x∥_∞ = max_{1≤i≤n} |x_i|.
The l_1 norm of x ∈ R^n is defined by
∥x∥_1 = ∑_{i=1}^n |x_i|.
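The three norms of Def 7.2 can be sketched directly in plain Python (an illustration, not part of the slides; the vector x below is an arbitrary example):

```python
# Minimal sketch of the l1, l2 and l-infinity vector norms from Def 7.2.
import math

def norm_1(x):
    # l1 norm: sum of absolute components
    return sum(abs(xi) for xi in x)

def norm_2(x):
    # l2 norm: square root of the sum of squares
    return math.sqrt(sum(xi * xi for xi in x))

def norm_inf(x):
    # l-infinity norm: largest absolute component
    return max(abs(xi) for xi in x)

x = [-1.0, 1.0, -2.0]  # arbitrary illustrative vector
print(norm_1(x), norm_2(x), norm_inf(x))  # 4.0, sqrt(6) ≈ 2.449, 2.0
```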
Distance between Vectors in R^n
Def 7.4
Let x = [x_1, …, x_n]^T and y = [y_1, …, y_n]^T be two vectors in R^n. The l_2 and l_∞ distances between x and y are defined by
∥x − y∥_2 = {∑_{i=1}^n (x_i − y_i)^2}^{1/2} and ∥x − y∥_∞ = max_{1≤i≤n} |x_i − y_i|.
The l_1 distance between x and y is given by
∥x − y∥_1 = ∑_{i=1}^n |x_i − y_i|.
Example 2, p. 435
The 3×3 linear system
3.3330x_1 + 15920x_2 − 10.333x_3 = 15913,
2.2220x_1 + 16.710x_2 + 9.6120x_3 = 28.544,
1.5611x_1 + 5.1791x_2 + 1.6852x_3 = 8.4254
has the exact solution x = [1, 1, 1]^T. If the system is solved by GE with partial pivoting using 5-digit rounding arithmetic, we obtain the computed solution
x̃ = [1.2001, 0.99991, 0.92538]^T. So the l_∞ and l_2 distances between x and x̃ are
∥x − x̃∥_∞ = max{0.2001, 0.00009, 0.07462} = 0.2001 and ∥x − x̃∥_2 ≈ 0.21356.
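The two distances in Example 2 can be checked numerically (a plain-Python sketch, not the textbook's code):

```python
# Check the l∞ and l2 distances between the exact solution x and the
# computed solution x̃ of Example 2.
import math

x_exact = [1.0, 1.0, 1.0]
x_tilde = [1.2001, 0.99991, 0.92538]

diff = [a - b for a, b in zip(x_exact, x_tilde)]
dist_inf = max(abs(d) for d in diff)          # l∞ distance
dist_2 = math.sqrt(sum(d * d for d in diff))  # l2 distance
print(dist_inf, dist_2)  # ≈ 0.2001 and ≈ 0.21356
```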
Convergence for Sequences of Vectors
Def 7.5 (Convergence of a Vector Sequence)
A sequence {x^(k)}_{k=1}^∞ of vectors in R^n is said to converge to x ∈ R^n with respect to the norm ∥·∥ if ∀ ϵ > 0, ∃ N(ϵ) ∈ N s.t.
∥x^(k) − x∥ < ϵ  ∀ k ≥ N(ϵ).
Thm 7.6
The sequence of vectors {x^(k)}_{k=1}^∞ converges to x ∈ R^n with respect to the l_∞ norm
⇐⇒ lim_{k→∞} x_i^(k) = x_i for i = 1, 2, …, n.
pf: It is easily seen that
∀ ϵ > 0, ∃ N(ϵ) ∈ N s.t. ∥x^(k) − x∥_∞ < ϵ ∀ k ≥ N(ϵ)
⇐⇒ |x_i^(k) − x_i| < ϵ ∀ k ≥ N(ϵ) and 1 ≤ i ≤ n.
Example 3, p. 436
The sequence of vectors in R^4
x^(k) = [1, 2 + 1/k, 3/k^2, e^(−k) sin k]^T
converges to x = [1, 2, 0, 0]^T ∈ R^4 with respect to the l_∞ norm, since
lim_{k→∞} (2 + 1/k) = 2, lim_{k→∞} 3/k^2 = 0, lim_{k→∞} e^(−k) sin k = 0.
Question
Does the given sequence converge to x with respect to the l_2 norm?
Thm 7.7 (The Equivalence of Vector Norms)
For each x ∈ R^n,
∥x∥_∞ ≤ ∥x∥_2 ≤ √n ∥x∥_∞.
In this case, we say that the l_∞ and l_2 norms are equivalent.
pf: For any x = [x_1, …, x_n]^T ∈ R^n, let |x_{i0}| = max_{1≤i≤n} |x_i| = ∥x∥_∞. Then we see that
1. ∥x∥_∞ = |x_{i0}| = √(x_{i0}^2) ≤ √(x_1^2 + ⋯ + x_n^2) = ∥x∥_2.
2. ∥x∥_2 ≤ (∑_{i=1}^n x_{i0}^2)^{1/2} = (n ∥x∥_∞^2)^{1/2} = √n ∥x∥_∞.
These prove the desired inequalities.
Example 4, p. 437
Show that the sequence of vectors in Example 3,
x^(k) = [1, 2 + 1/k, 3/k^2, e^(−k) sin k]^T ∈ R^4,
converges to x = [1, 2, 0, 0]^T ∈ R^4 with respect to the l_2 norm.
pf: In Example 3, we know that lim_{k→∞} ∥x^(k) − x∥_∞ = 0. So, for any ϵ > 0, ∃ N_0 ∈ N s.t.
∥x^(k) − x∥_∞ < ϵ/2  ∀ k ≥ N_0,
and furthermore, it follows from Thm 7.7 that
∥x^(k) − x∥_2 ≤ √4 ∥x^(k) − x∥_∞ < 2 · (ϵ/2) = ϵ  ∀ k ≥ N_0.
Hence x^(k) converges to x with respect to the l_2 norm.
Remarks
Any two vector norms ∥·∥ and ∥·∥′ on R^n are equivalent, i.e., ∃ c_1 > 0 and c_2 > 0 s.t.
c_1 ∥x∥′ ≤ ∥x∥ ≤ c_2 ∥x∥′  ∀ x ∈ R^n.
A sequence {x^(k)}_{k=1}^∞ converges to the limit x ∈ R^n with respect to the norm ∥·∥ ⇐⇒ it converges to x with respect to the norm ∥·∥′. (Convergence of a vector sequence does not depend on the choice of norm!)
For any x ∈ R^n, the relations between the l_1, l_2 and l_∞ norms are
∥x∥_2 ≤ ∥x∥_1 ≤ √n ∥x∥_2,
∥x∥_∞ ≤ ∥x∥_1 ≤ n ∥x∥_∞.
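The norm relations above are easy to sanity-check numerically (an illustrative sketch, not from the slides; the random vectors are arbitrary):

```python
# Numerically verify Thm 7.7 and the l1/l2/l∞ relations on random vectors.
import math
import random

def norm_1(x): return sum(abs(v) for v in x)
def norm_2(x): return math.sqrt(sum(v * v for v in x))
def norm_inf(x): return max(abs(v) for v in x)

random.seed(1)
for _ in range(100):
    n = random.randint(1, 8)
    x = [random.uniform(-10, 10) for _ in range(n)]
    eps = 1e-12  # slack for floating-point rounding
    # Thm 7.7: ||x||∞ <= ||x||2 <= sqrt(n) ||x||∞
    assert norm_inf(x) <= norm_2(x) <= math.sqrt(n) * norm_inf(x) + eps
    # Remarks: ||x||2 <= ||x||1 <= sqrt(n) ||x||2 and ||x||∞ <= ||x||1 <= n ||x||∞
    assert norm_2(x) <= norm_1(x) <= math.sqrt(n) * norm_2(x) + eps
    assert norm_inf(x) <= norm_1(x) <= n * norm_inf(x) + eps
print("all norm relations hold")
```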
Matrix Norms and Distances
Def 7.8 (Matrix Norms)
A matrix norm on R^{n×n} is a function ∥·∥ : R^{n×n} → R satisfying for all A, B ∈ R^{n×n} and all α ∈ R:
(i) ∥A∥ ≥ 0;
(ii) ∥A∥ = 0 ⇐⇒ A = 0 (zero matrix);
(iii) ∥αA∥ = |α| ∥A∥;
(iv) ∥A + B∥ ≤ ∥A∥ + ∥B∥;
(v) ∥AB∥ ≤ ∥A∥ ∥B∥.
Definition (Distance between Two Matrices)
The distance between A, B ∈ R^{n×n} with respect to a matrix norm ∥·∥ is ∥A − B∥.
Thm 7.9 (Natural Matrix Norms)
If ∥·∥ is a vector norm on R^n, then
∥A∥ = max_{∥x∥=1} ∥Ax∥
is a matrix norm on R^{n×n}. (See Exercise 13 for the proof.)
pf: We only prove that ∥AB∥ ≤ ∥A∥ ∥B∥ for any A, B ∈ R^{n×n} here. For any unit vector x ∈ R^n with Bx ≠ 0, we have
∥A(Bx)∥ = ∥Bx∥ · ∥A(Bx/∥Bx∥)∥ ≤ ∥A∥ · ∥Bx∥,
and this inequality holds trivially when Bx = 0. Thus, we conclude that
∥AB∥ = max_{∥x∥=1} ∥(AB)x∥ = max_{∥x∥=1} ∥A(Bx)∥
≤ max_{∥x∥=1} (∥A∥ ∥Bx∥) = ∥A∥ · max_{∥x∥=1} ∥Bx∥ = ∥A∥ ∥B∥.
Remarks
Matrix norms defined by vector norms are called the natural (or induced) matrix norms associated with the vector norm.
Since x = z/∥z∥ is a unit vector for z ≠ 0, Thm 7.9 can be rewritten as
∥A∥ = max_{∥x∥=1} ∥Ax∥ = max_{z≠0} ∥A(z/∥z∥)∥ = max_{z≠0} ∥Az∥/∥z∥.
Cor 7.10
For any A ∈ R^{n×n}, 0 ≠ z ∈ R^n and any natural norm ∥·∥,
∥Az∥ ≤ ∥A∥ · ∥z∥.
Some Natural Matrix Norms
1. ∥A∥_∞ = max_{∥x∥_∞=1} ∥Ax∥_∞ = max_{z≠0} ∥Az∥_∞/∥z∥_∞. (the l_∞ norm)
2. ∥A∥_2 = max_{∥x∥_2=1} ∥Ax∥_2 = max_{z≠0} ∥Az∥_2/∥z∥_2. (the l_2 norm)
3. ∥A∥_1 = max_{∥x∥_1=1} ∥Ax∥_1 = max_{z≠0} ∥Az∥_1/∥z∥_1. (the l_1 norm)
Thm 7.11 (Formula for the Matrix ∞-Norm)
If A = [a_ij] ∈ R^{n×n}, then ∥A∥_∞ = max_{1≤i≤n} ∑_{j=1}^n |a_ij|. (the maximum row sum of |A| = [|a_ij|])
pf: The proof is separated into two parts.
(1) Assume ∥A∥_∞ = ∥Ax∥_∞ for some x ∈ R^n with ∥x∥_∞ = max_{1≤j≤n} |x_j| = 1. Then we have
∥A∥_∞ = ∥Ax∥_∞ = max_{1≤i≤n} |(Ax)_i| = max_{1≤i≤n} |∑_{j=1}^n a_ij x_j|
≤ max_{1≤i≤n} ∑_{j=1}^n |a_ij| |x_j| ≤ max_{1≤i≤n} ∑_{j=1}^n |a_ij|.
(2) Let p be an index with ∑_{j=1}^n |a_pj| = max_{1≤i≤n} ∑_{j=1}^n |a_ij|, and let y = [y_1, y_2, …, y_n]^T ∈ R^n, where each component
y_j = 1 if a_pj ≥ 0, and y_j = −1 if a_pj < 0.
Then ∥y∥_∞ = 1 and a_pj y_j = |a_pj| for all j. So we get
∥Ay∥_∞ = max_{1≤i≤n} |(Ay)_i| ≥ |(Ay)_p| = |∑_{j=1}^n a_pj y_j| = ∑_{j=1}^n |a_pj| = max_{1≤i≤n} ∑_{j=1}^n |a_ij|.
Furthermore, it follows that
∥A∥_∞ = max_{∥x∥_∞=1} ∥Ax∥_∞ ≥ ∥Ay∥_∞ = max_{1≤i≤n} ∑_{j=1}^n |a_ij|.
Parts (1) and (2) together complete the proof.
Exercise 6, p. 442 (Formula for the Matrix 1-Norm)
If A = [a_ij] ∈ R^{n×n}, then
∥A∥_1 = max_{1≤j≤n} ∑_{i=1}^n |a_ij|. (the maximum column sum of |A| = [|a_ij|])
Example 5, p. 441
For the 3×3 matrix
A = [ 1  2  −1 ; 0  3  −1 ; 5  −1  1 ],
it follows that
∥A∥_∞ = max_{i=1,2,3} ∑_{j=1}^3 |a_ij| = max{4, 4, 7} = 7,
∥A∥_1 = max_{j=1,2,3} ∑_{i=1}^3 |a_ij| = max{6, 6, 3} = 6.
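The row-sum and column-sum formulas (Thm 7.11 and Exercise 6) applied to the matrix of Example 5 can be sketched in plain Python (an illustration, not the textbook's code):

```python
# Matrix ∞-norm (maximum absolute row sum) and 1-norm (maximum absolute
# column sum), per Thm 7.11 and Exercise 6.
def matrix_norm_inf(A):
    # maximum absolute row sum
    return max(sum(abs(a) for a in row) for row in A)

def matrix_norm_1(A):
    # maximum absolute column sum
    cols = len(A[0])
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(cols))

A = [[1, 2, -1],
     [0, 3, -1],
     [5, -1, 1]]
print(matrix_norm_inf(A), matrix_norm_1(A))  # 7 and 6, as in Example 5
```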
Section 7.2
Eigenvalues and Eigenvectors
Def 7.12 (Characteristic Polynomial)
The characteristic polynomial of A ∈ R^{n×n} is defined by p(λ) = det(A − λI),
where I is the n×n identity matrix.
Note: The characteristic polynomial p is an nth-degree polynomial with real coefficients, so it has at most n distinct zeros in C.
Def 7.13 (Eigenvalues and Eigenvectors)
Let p(λ) be the characteristic polynomial of A ∈ R^{n×n}.
A number λ ∈ C is called an eigenvalue (or characteristic value) of A if p(λ) = 0.
The spectrum of A, denoted by σ(A), is the set of all eigenvalues of A.
If ∃ 0 ≠ x ∈ R^n s.t. Ax = λx, or (A − λI)x = 0, for λ ∈ σ(A), then x is called an eigenvector (or characteristic vector) of A corresponding to λ.
Def 7.14
The spectral radius of A ∈ R^{n×n} is defined by ρ(A) = max{|λ| : λ ∈ σ(A)}.
(For complex λ = α + βi, we define |λ| = √(α^2 + β^2).)
Thm 7.15 (Formula for the Matrix 2-Norm)
If A is an n×n matrix, then
(i) ∥A∥_2 = √(ρ(A^T A));
(ii) ρ(A) ≤ ∥A∥ for any natural matrix norm ∥·∥.
Review from Linear Algebra
Let B = A^T A with A ∈ R^{n×n} and v ∈ R^n.
B^T = (A^T A)^T = A^T (A^T)^T = A^T A = B, i.e., B is symmetric.
For any λ ∈ σ(B), λ ≥ 0.
B is orthogonally diagonalizable, i.e., ∃ orthogonal Q ∈ R^{n×n} s.t.
Q^T B Q = Q^T (A^T A) Q = diag(λ_1, λ_2, …, λ_n) ≡ D, where λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_n ≥ 0.
Since ∥v∥_2^2 = v^T v, we have
∥Ax∥_2^2 = (Ax)^T (Ax) = x^T (A^T A) x = x^T B x.
Proof of Thm 7.15 (1/2)
(i) Since A^T A is symmetric, ∃ orthogonal Q ∈ R^{n×n} s.t.
Q^T (A^T A) Q = diag(λ_1, λ_2, …, λ_n) ≡ D, where λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_n ≥ 0. Hence,
∥A∥_2^2 = max_{∥x∥_2=1} ∥Ax∥_2^2 = max_{∥x∥_2=1} x^T (A^T A) x = max_{∥x∥_2=1} x^T Q D Q^T x = max_{∥y∥_2=1} y^T D y,
where we let y = Q^T x. So ∥A∥_2^2 ≤ λ_1. Moreover, the maximum value of y^T D y is attained at the vector y* = [1, 0, …, 0]^T ∈ R^n, and thus ∥A∥_2^2 = λ_1, or
∥A∥_2 = √λ_1 = √(ρ(A^T A)).
Proof of Thm 7.15 (2/2)
(ii) Let A ∈ R^{n×n} and ∥·∥ be any natural norm. For each λ ∈ σ(A), ∃ 0 ≠ x ∈ R^n s.t.
Ax = λx with ∥x∥ = 1.
Hence, we know that
|λ| = |λ| · ∥x∥ = ∥λx∥ = ∥Ax∥ ≤ ∥A∥ ∥x∥ = ∥A∥.
So the spectral radius of A satisfies ρ(A) ≤ ∥A∥.
Remarks
If A^T = A ∈ R^{n×n}, then ∥A∥_2 = ρ(A).
For any A ∈ R^{n×n} and any ϵ > 0, ∃ a natural norm ∥·∥_ϵ s.t.
ρ(A) ≤ ∥A∥_ϵ < ρ(A) + ϵ.
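Thm 7.15(i) can be illustrated numerically. The sketch below (not from the slides; the matrix, the power-iteration routine, and the iteration count are arbitrary choices) estimates the dominant eigenvalue of B = AᵀA by plain-Python power iteration, giving ∥A∥₂ = √ρ(AᵀA); since this A is symmetric, the remark ∥A∥₂ = ρ(A) = 3 also applies:

```python
# Estimate ||A||_2 = sqrt(ρ(AᵀA)) via power iteration on B = AᵀA.
import math

def matvec(B, v):
    return [sum(B[i][j] * v[j] for j in range(len(v))) for i in range(len(B))]

def dominant_eigenvalue(B, iters=200):
    # power iteration: for symmetric B with λ1 > |λ2|, the scaling factor
    # converges to the dominant eigenvalue λ1
    v = [1.0] * len(B)
    lam = 0.0
    for _ in range(iters):
        w = matvec(B, v)
        lam = max(abs(c) for c in w)
        v = [c / lam for c in w]
    return lam

A = [[2.0, 1.0], [1.0, 2.0]]  # symmetric: eigenvalues 1 and 3
# B = AᵀA, computed entrywise: B[i][j] = Σ_k A[k][i] * A[k][j]
B = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
norm_2 = math.sqrt(dominant_eigenvalue(B))
print(norm_2)  # ≈ 3.0 = ρ(A) for this symmetric A
```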
Def 7.16 (Convergent Matrices)
We say that a matrix A ∈ R^{n×n} is convergent if
lim_{k→∞} (A^k)_ij = 0 for i, j = 1, 2, …, n.
Example 4, p. 448
The 2×2 matrix
A = [ 1/2  0 ; 1/4  1/2 ]
is a convergent matrix, since we have
A^k = [ (1/2)^k  0 ; k/2^{k+1}  (1/2)^k ]  ∀ k ≥ 1,
and every entry tends to 0 as k → ∞.
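Example 4 can be checked by computing powers of A directly (a plain-Python sketch, not from the slides; the power 30 is an arbitrary cut-off):

```python
# Powers of the convergent matrix from Example 4 should match the closed
# form A^k = [[(1/2)^k, 0], [k/2^(k+1), (1/2)^k]] and tend entrywise to 0.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0.5, 0.0], [0.25, 0.5]]
P = [[1.0, 0.0], [0.0, 1.0]]  # identity; will hold A^k
for k in range(30):
    P = matmul(P, A)

# P is now A^30; both nonzero entries are tiny, consistent with convergence
print(P[0][0], P[1][0])  # (1/2)^30 and 30/2^31
```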
Thm 7.17 (Equivalent Conditions for a Convergent Matrix)
Let A ∈ R^{n×n}. The following statements are equivalent.
(i) A is a convergent matrix.
(ii) lim_{k→∞} ∥A^k∥ = 0 for some natural norm.
(iii) lim_{k→∞} ∥A^k∥ = 0 for all natural norms.
(iv) ρ(A) < 1.
(v) lim_{k→∞} A^k x = 0 for every x ∈ R^n.
Section 7.3
The Jacobi and Gauss-Seidel Iterative Techniques
Derivation of Jacobi Method
Basic Idea
From the ith equation of a linear system Ax = b,
a_i1 x_1 + a_i2 x_2 + ⋯ + a_ii x_i + ⋯ + a_in x_n = b_i,
solving for the ith component x_i gives
x_i = (1/a_ii) [ ∑_{j=1, j≠i}^n (−a_ij x_j) + b_i ],
provided that a_ii ≠ 0 for i = 1, 2, …, n.
The Jacobi Method
Component Form of the Jacobi Method
For each k ≥ 1, we may consider the Jacobi iterative method:
x_i^(k) = (1/a_ii) [ ∑_{j=1, j≠i}^n (−a_ij x_j^(k−1)) + b_i ]  for i = 1, …, n,  (1)
where an initial approximation x^(0) = [x_1^(0), …, x_n^(0)]^T ∈ R^n is given.
Example 1, p. 451
The linear system
10x_1 − x_2 + 2x_3 = 6,
−x_1 + 11x_2 − x_3 + 3x_4 = 25,
2x_1 − x_2 + 10x_3 − x_4 = −11,
3x_2 − x_3 + 8x_4 = 15
has a unique solution x = [1, 2, −1, 1]^T ∈ R^4. Use Jacobi's iterative technique to find an approximation x^(k) to x starting with x^(0) = [0, 0, 0, 0]^T ∈ R^4 until
∥x^(k) − x^(k−1)∥_∞ / ∥x^(k)∥_∞ < 10^−3.
Solution (1/2)
The given linear system can be rewritten as
x_1 = (1/10)x_2 − (1/5)x_3 + 3/5,
x_2 = (1/11)x_1 + (1/11)x_3 − (3/11)x_4 + 25/11,
x_3 = −(1/5)x_1 + (1/10)x_2 + (1/10)x_4 − 11/10,
x_4 = −(3/8)x_2 + (1/8)x_3 + 15/8.
Solution (2/2)
For each k ≥ 1, we apply the Jacobi method:
x_1^(k) = (1/10)x_2^(k−1) − (1/5)x_3^(k−1) + 3/5,
x_2^(k) = (1/11)x_1^(k−1) + (1/11)x_3^(k−1) − (3/11)x_4^(k−1) + 25/11,
x_3^(k) = −(1/5)x_1^(k−1) + (1/10)x_2^(k−1) + (1/10)x_4^(k−1) − 11/10,
x_4^(k) = −(3/8)x_2^(k−1) + (1/8)x_3^(k−1) + 15/8,
with the initial guess x^(0) = [0, 0, 0, 0]^T ∈ R^4.
Numerical Results
After 10 iterations of the Jacobi method, we have
∥x^(10) − x^(9)∥_∞ / ∥x^(10)∥_∞ = (8.0 × 10^−4) / 1.9998 = 4.0 × 10^−4 < 10^−3.
In fact, the absolute error is ∥x^(10) − x∥_∞ = 2 × 10^−4.
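The component form (1) applied to Example 1 can be sketched in plain Python (an illustration, not the textbook's code; the fixed count of 30 sweeps is an arbitrary choice standing in for the relative-error stopping test):

```python
# Jacobi iteration for the 4×4 system of Example 1, from the zero vector.
A = [[10.0, -1.0, 2.0, 0.0],
     [-1.0, 11.0, -1.0, 3.0],
     [2.0, -1.0, 10.0, -1.0],
     [0.0, 3.0, -1.0, 8.0]]
b = [6.0, 25.0, -11.0, 15.0]
n = 4

x_old = [0.0] * n
for k in range(30):
    # every component of the new iterate uses only the previous iterate
    x_new = [(b[i] - sum(A[i][j] * x_old[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    x_old = x_new

print(x_old)  # ≈ [1, 2, -1, 1], the exact solution
```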
Equivalent Matrix-Vector Forms
As in Chapter 2, every root-finding problem f(x) = 0 is converted into an equivalent fixed-point form
x = g(x), x ∈ I = [a, b],
for some differentiable function g.
Similarly, we also try to convert the original linear system Ax = b into an equivalent matrix-vector form
x = Tx + c, x ∈ R^n,
where T ∈ R^{n×n} and c ∈ R^n are fixed.
For k = 1, 2, …, compute
x^(k) = Tx^(k−1) + c
with an initial approximation x^(0) ∈ R^n to the unique solution x.
A Useful Splitting of A
The iterative techniques for solving Ax = b will be derived by first splitting A into its diagonal and off-diagonal parts, i.e.,
A = D − L − U,  (2)
where D = diag(a_11, a_22, …, a_nn) is the diagonal part of A, −L is the strictly lower-triangular part of A (L has entries −a_ij for i > j and zeros elsewhere), and −U is the strictly upper-triangular part of A (U has entries −a_ij for i < j and zeros elsewhere).
The Jacobi Method Revisited
From the splitting of A as in (2), the linear system Ax = b is transformed to
(D − L − U)x = b ⇐⇒ Dx = (L + U)x + b ⇐⇒ x = T_j x + c_j,
where T_j ≡ D^{−1}(L + U) and c_j ≡ D^{−1} b.
It is easily seen that the component form of the Jacobi method
x_i^(k) = (1/a_ii) [ ∑_{j=1, j≠i}^n (−a_ij x_j^(k−1)) + b_i ]  for i = 1, …, n
is equivalent to the following matrix-vector form:
x^(k) = T_j x^(k−1) + c_j  ∀ k ≥ 1.
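The matrix form x^(k) = T_j x^(k−1) + c_j can be sketched in plain Python for the system of Example 1 (an illustration, not the slides' code; rather than forming D, L, U explicitly, T_j is built entrywise as −a_ij/a_ii off the diagonal, which is the same matrix, and the 30-iteration count is arbitrary):

```python
# Matrix form of the Jacobi method: x(k) = Tj x(k-1) + cj,
# with Tj = D⁻¹(L + U) and cj = D⁻¹ b.
A = [[10.0, -1.0, 2.0, 0.0],
     [-1.0, 11.0, -1.0, 3.0],
     [2.0, -1.0, 10.0, -1.0],
     [0.0, 3.0, -1.0, 8.0]]
b = [6.0, 25.0, -11.0, 15.0]
n = 4

# Tj has entries -a_ij / a_ii off the diagonal and 0 on the diagonal
Tj = [[0.0 if i == j else -A[i][j] / A[i][i] for j in range(n)] for i in range(n)]
cj = [b[i] / A[i][i] for i in range(n)]

x = [0.0] * n
for k in range(30):
    x = [sum(Tj[i][j] * x[j] for j in range(n)) + cj[i] for i in range(n)]

print(x)  # ≈ [1, 2, -1, 1], matching the component form
```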
Example 2, p. 453
The 4×4 linear system in Example 1 can be rewritten in the form
x_1 = (1/10)x_2 − (1/5)x_3 + 3/5,
x_2 = (1/11)x_1 + (1/11)x_3 − (3/11)x_4 + 25/11,
x_3 = −(1/5)x_1 + (1/10)x_2 + (1/10)x_4 − 11/10,
x_4 = −(3/8)x_2 + (1/8)x_3 + 15/8.
So the unique solution x ∈ R^4 satisfies x = T_j x + c_j with
T_j =
[ 0      1/10   −1/5    0    ]
[ 1/11   0       1/11  −3/11 ]
[ −1/5   1/10    0      1/10 ]
[ 0     −3/8     1/8    0    ]
and c_j = [3/5, 25/11, −11/10, 15/8]^T.
Algorithm 7.1: Jacobi Method
INPUT: dimension n; A = [a_ij] ∈ R^{n×n}; b ∈ R^n; X0 = x^(0) ∈ R^n; tolerance TOL; maximum number of iterations N0.
OUTPUT: an approximate solution x_1, x_2, …, x_n to Ax = b.
Step 1 Set k = 1.
Step 2 While (k ≤ N0) do Steps 3–6.
Step 3 For i = 1, …, n set
x_i = (1/a_ii) [ ∑_{j=1, j≠i}^n (−a_ij X0_j) + b_i ].
Step 4 If ∥x − X0∥ < TOL then OUTPUT(x_1, …, x_n); STOP.
Step 5 Set k = k + 1.
Step 6 Set X0 = x.
Step 7 OUTPUT('Maximum number of iterations exceeded'); STOP.
Comments on Algorithm 7.1
If a_ii = 0 for some i and A is nonsingular, choose p ≠ i s.t.
|a_pi| is as large as possible,
and then perform the row interchange (E_p) ↔ (E_i) to ensure that no a_ii = 0 before applying the Jacobi method.
In Step 4, a better stopping criterion is
∥x^(k) − x^(k−1)∥ / ∥x^(k)∥ < TOL,
where the vector norm ∥·∥ is the l_1, l_2 or l_∞ norm.
Jacobi Method vs. Gauss-Seidel Method
In the Jacobi method, the ith component x_i^(k) of x^(k) is determined by x_1^(k−1), …, x_{i−1}^(k−1) and x_{i+1}^(k−1), …, x_n^(k−1). At the kth step of the Gauss-Seidel method, the ith component x_i^(k) is computed from x_1^(k), …, x_{i−1}^(k) and x_{i+1}^(k−1), …, x_n^(k−1). Notice that the most recently computed values x_1^(k), …, x_{i−1}^(k) are better approximations to x than the values x_1^(k−1), …, x_{i−1}^(k−1).
Component Form of the Gauss-Seidel Method
For each k ≥ 1, the ith component of x^(k) is determined by
x_i^(k) = (1/a_ii) [ ∑_{j=1}^{i−1} (−a_ij x_j^(k)) + ∑_{j=i+1}^n (−a_ij x_j^(k−1)) + b_i ]
= (1/a_ii) [ −∑_{j=1}^{i−1} a_ij x_j^(k) − ∑_{j=i+1}^n a_ij x_j^(k−1) + b_i ],  (3)
where an initial vector x^(0) is given and i = 1, 2, …, n.
Algorithm 7.2: Gauss-Seidel Method
INPUT: dimension n; A = [a_ij] ∈ R^{n×n}; b ∈ R^n; X0 = x^(0) ∈ R^n; tolerance TOL; maximum number of iterations N0.
OUTPUT: an approximate solution x_1, x_2, …, x_n to Ax = b.
Step 1 Set k = 1.
Step 2 While (k ≤ N0) do Steps 3–6.
Step 3 For i = 1, …, n set
x_i = (1/a_ii) [ −∑_{j=1}^{i−1} a_ij x_j − ∑_{j=i+1}^n a_ij X0_j + b_i ].
Step 4 If ∥x − X0∥ < TOL then OUTPUT(x_1, …, x_n); STOP.
Step 5 Set k = k + 1.
Step 6 Set X0 = x.
Step 7 OUTPUT('Maximum number of iterations exceeded'); STOP.
Example 3, p. 455
The linear system
10x_1 − x_2 + 2x_3 = 6,
−x_1 + 11x_2 − x_3 + 3x_4 = 25,
2x_1 − x_2 + 10x_3 − x_4 = −11,
3x_2 − x_3 + 8x_4 = 15
has a unique solution x = [1, 2, −1, 1]^T ∈ R^4. Use the Gauss-Seidel method to find an approximation x^(k) to x starting with x^(0) = [0, 0, 0, 0]^T ∈ R^4 until
∥x^(k) − x^(k−1)∥_∞ / ∥x^(k)∥_∞ < 10^−3.
Solution
For each k ≥ 1, we apply the Gauss-Seidel method:
x_1^(k) = (1/10)x_2^(k−1) − (1/5)x_3^(k−1) + 3/5,
x_2^(k) = (1/11)x_1^(k) + (1/11)x_3^(k−1) − (3/11)x_4^(k−1) + 25/11,
x_3^(k) = −(1/5)x_1^(k) + (1/10)x_2^(k) + (1/10)x_4^(k−1) − 11/10,
x_4^(k) = −(3/8)x_2^(k) + (1/8)x_3^(k) + 15/8,
with the initial guess x^(0) = [0, 0, 0, 0]^T ∈ R^4.
Numerical Results of Example 3
After 5 iterations of the Gauss-Seidel method, we have
∥x^(5) − x^(4)∥_∞ / ∥x^(5)∥_∞ = (8.0 × 10^−4) / 2.000 = 4.0 × 10^−4 < 10^−3.
In fact, the absolute error is ∥x^(5) − x∥_∞ = 1.0 × 10^−4. The numerical results are shown in the following table.
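The Gauss-Seidel sweep for Example 3 can be sketched in plain Python (an illustration, not the textbook's code; the fixed count of 20 sweeps is an arbitrary choice standing in for the relative-error stopping test):

```python
# Gauss-Seidel iteration (3) for the 4×4 system of Example 3, overwriting
# components in place so each x_i immediately uses the newest values.
A = [[10.0, -1.0, 2.0, 0.0],
     [-1.0, 11.0, -1.0, 3.0],
     [2.0, -1.0, 10.0, -1.0],
     [0.0, 3.0, -1.0, 8.0]]
b = [6.0, 25.0, -11.0, 15.0]
n = 4

x = [0.0] * n
for k in range(20):
    for i in range(n):
        # x[0..i-1] already hold kth-step values; x[i+1..] hold (k-1)th
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]

print(x)  # ≈ [1, 2, -1, 1]; converges in fewer sweeps than Jacobi here
```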
Matrix-Vector Form of Gauss-Seidel Method (1/2)
From the component form as in (3),
x_i^(k) = (1/a_ii) [ −∑_{j=1}^{i−1} a_ij x_j^(k) − ∑_{j=i+1}^n a_ij x_j^(k−1) + b_i ],
we immediately obtain
a_i1 x_1^(k) + ⋯ + a_ii x_i^(k) = −a_{i,i+1} x_{i+1}^(k−1) − ⋯ − a_in x_n^(k−1) + b_i
for each i = 1, 2, …, n.
Thus, with the splitting A = D − L − U of (2), we have the following matrix form for the Gauss-Seidel method:
(D − L) x^(k) = U x^(k−1) + b,
where the lower-triangular matrix D − L collects the coefficients of the new iterate x^(k), and U collects those of the previous iterate x^(k−1).
Matrix-Vector Form of Gauss-Seidel Method (2/2)
For each k ≥ 1, the above matrix equation can be rewritten as
(D − L) x^(k) = U x^(k−1) + b
⇐⇒ x^(k) = (D − L)^{−1} U x^(k−1) + (D − L)^{−1} b
⇐⇒ x^(k) = T_g x^(k−1) + c_g,
where T_g ≡ (D − L)^{−1} U and c_g ≡ (D − L)^{−1} b.
Recall the Jacobi method given by
x^(k) = T_j x^(k−1) + c_j, k = 1, 2, …, where T_j = D^{−1}(L + U) and c_j = D^{−1} b.
General Iteration Methods
Some Questions
1. When does a general iteration of the form x^(k) = Tx^(k−1) + c, k = 1, 2, …, converge to a solution x ∈ R^n of the matrix equation x = Tx + c?
2. What is the rate of convergence of this iterative method?
3. Does the Gauss-Seidel method always converge faster than the Jacobi method?
Lemma 7.18
If T ∈ R^{n×n} satisfies ρ(T) < 1, then (I − T)^{−1} exists and
(I − T)^{−1} = I + T + T^2 + ⋯ = ∑_{j=0}^∞ T^j,
with T^0 ≡ I defined conventionally.
Proof of Lemma 7.18
If λ ∈ σ(T), then ∃ 0 ≠ x ∈ R^n s.t.
Tx = λx, and hence (I − T)x = (1 − λ)x.
So 1 − λ ∈ σ(I − T).
Because ρ(T) < 1, |λ| ≤ ρ(T) < 1 for every λ ∈ σ(T). This means that I − T does not have any zero eigenvalues, and hence (I − T)^{−1} exists.
Let S_m = ∑_{j=0}^m T^j. Then we have
(I − T)S_m = ∑_{j=0}^m T^j − ∑_{j=0}^m T^{j+1} = I − T^{m+1}.
Since ρ(T) < 1, T is convergent, i.e., lim_{m→∞} T^m = 0 by Thm 7.17. Hence
(I − T)^{−1} = lim_{m→∞} S_m = ∑_{j=0}^∞ T^j.
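Lemma 7.18 can be illustrated numerically: for a T with ρ(T) < 1, the partial sums S_m approach (I − T)⁻¹. The sketch below uses an arbitrary triangular 2×2 example (not from the slides; the cut-off of 60 terms is an arbitrary choice), for which (I − T)⁻¹ works out to [[2, 1], [0, 2]] by hand:

```python
# Neumann series: partial sums S_m = Σ_{j=0}^m T^j converge to (I − T)⁻¹
# when ρ(T) < 1.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(len(X))] for i in range(len(X))]

T = [[0.5, 0.25], [0.0, 0.5]]   # triangular, so ρ(T) = 0.5 < 1
I = [[1.0, 0.0], [0.0, 1.0]]

S = I  # partial sum, starting from T^0 = I
P = I  # current power T^j
for j in range(1, 60):
    P = matmul(P, T)
    S = matadd(S, P)

print(S)  # ≈ [[2.0, 1.0], [0.0, 2.0]], the hand-computed (I − T)⁻¹
```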
Thm 7.19 (Necessary and Sufficient Condition for Convergence of the General Iteration)
For any x^(0) ∈ R^n, the sequence {x^(k)}_{k=0}^∞ defined by
x^(k) = Tx^(k−1) + c  ∀ k ≥ 1
converges to the unique solution of x = Tx + c ⇐⇒ ρ(T) < 1.
pf: The proof is illustrated as follows.
(⇐) Suppose that ρ(T) < 1. By induction,
x^(k) = Tx^(k−1) + c = T(Tx^(k−2) + c) + c = T^2 x^(k−2) + (T + I)c = ⋯ = T^k x^(0) + (T^{k−1} + ⋯ + T + I)c.
Since ρ(T) < 1, lim_{k→∞} T^k x^(0) = 0 by Thm 7.17. Thus, it follows from Lemma 7.18 that
x ≡ lim_{k→∞} x^(k) = 0 + (∑_{j=0}^∞ T^j) c = (I − T)^{−1} c.
Hence the limit x ∈ R^n is the unique solution of the equation
x = (I − T)^{−1} c ⇐⇒ (I − T)x = c ⇐⇒ x = Tx + c.
(⇒) Assume that lim_{k→∞} x^(k) = x for any initial vector x^(0), where x ∈ R^n is the unique solution of x = Tx + c. By Thm 7.17, it suffices to show that
lim_{k→∞} T^k z = 0  ∀ z ∈ R^n.
For any z ∈ R^n, let x^(0) = x − z. Then by induction,
x − x^(k) = (Tx + c) − (Tx^(k−1) + c) = T(x − x^(k−1)) = ⋯ = T^k (x − x^(0)) = T^k z,
since z = x − x^(0). So it follows from the assumption that
lim_{k→∞} T^k z = x − lim_{k→∞} x^(k) = x − x = 0.
Since z ∈ R^n was arbitrary, we have ρ(T) < 1.