
**Chapter 7**

**Iterative Techniques in Matrix** **Algebra**

**Hung-Yuan Fan (范洪源)**

**Department of Mathematics,**
**National Taiwan Normal University, Taiwan**

**Spring 2016**

**Section 7.1**

**Norms of Vectors and Matrices**


## Vector Norms (向量的範數)

**Def 7.1**

A vector norm on R^{n} is a function ∥ · ∥ : R^{n} → R with the following properties:

(i) ∥x∥ ≥ 0 for all x ∈ R^{n},

(ii) ∥x∥ = 0 ⇐⇒ x = 0,

(iii) ∥αx∥ = |α| ∥x∥ for all α ∈ R and x ∈ R^{n},

(iv) ∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x, y ∈ R^{n}.

**Note:** The n-dimensional vector x is often denoted by

x = [x_{1}, x_{2}, · · · , x_{n}]^{T} = (x_{1}, x_{2}, · · · , x_{n})^{T}.

## Useful Vector Norms

**Def 7.2**

The l_{2} and l_{∞} norms for the vector x = [x_{1}, x_{2}, · · · , x_{n}]^{T} ∈ R^{n} are defined by

∥x∥_{2} = ( ∑_{i=1}^{n} x_{i}^{2} )^{1/2}  and  ∥x∥_{∞} = max_{1≤i≤n} |x_{i}|.

The l_{1} norm of x ∈ R^{n} is defined by

∥x∥_{1} = ∑_{i=1}^{n} |x_{i}|.
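As a quick check of Def 7.2, the three norms can be computed directly from their definitions and compared against `numpy.linalg.norm`; the sample vector here is an arbitrary illustration, not from the text.

```python
# l1, l2 and l-infinity norms of a vector, from their definitions in Def 7.2.
import numpy as np

x = np.array([-1.0, 1.0, -2.0])

l1 = np.sum(np.abs(x))        # sum of |x_i|
l2 = np.sqrt(np.sum(x ** 2))  # (sum of x_i^2)^(1/2)
linf = np.max(np.abs(x))      # max of |x_i|

# numpy's built-in norms agree with the definitions:
assert np.isclose(l1, np.linalg.norm(x, 1))
assert np.isclose(l2, np.linalg.norm(x, 2))
assert np.isclose(linf, np.linalg.norm(x, np.inf))
print(l1, l2, linf)  # 4.0, sqrt(6) ≈ 2.449, 2.0
```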


## Distance between Vectors in R^{n}

**Def 7.4**

Let x = [x_{1}, · · · , x_{n}]^{T} and y = [y_{1}, · · · , y_{n}]^{T} be two vectors in R^{n}. The l_{2} and l_{∞} distances between x and y are defined by

∥x − y∥_{2} = { ∑_{i=1}^{n} (x_{i} − y_{i})^{2} }^{1/2}  and  ∥x − y∥_{∞} = max_{1≤i≤n} |x_{i} − y_{i}|.

The l_{1} distance between x and y is given by

∥x − y∥_{1} = ∑_{i=1}^{n} |x_{i} − y_{i}|.

**Example 2, p. 435**

The 3 × 3 linear system

3.3330x_{1} + 15920x_{2} − 10.333x_{3} = 15913,
2.2220x_{1} + 16.710x_{2} + 9.6120x_{3} = 28.544,
1.5611x_{1} + 5.1791x_{2} + 1.6852x_{3} = 8.4254

has the exact sol. x = [1, 1, 1]^{T}. If the system is solved by GE with partial pivoting using 5-digit rounding arithmetic, we obtain the computed sol.

x̃ = [1.2001, 0.99991, 0.92538]^{T}.

So, the l_{∞} and l_{2} distances between x and x̃ are

∥x − x̃∥_{∞} = max{0.2001, 0.00009, 0.07462} = 0.2001  and  ∥x − x̃∥_{2} = (0.2001^{2} + 0.00009^{2} + 0.07462^{2})^{1/2} ≈ 0.21356.
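The two distances in Example 2 are easy to verify numerically; the following is a direct computation from Def 7.4 using the exact and computed solutions given above.

```python
# l-infinity and l2 distances between the exact solution x and the
# computed solution x~ from Example 2 (GE with 5-digit rounding).
import numpy as np

x = np.array([1.0, 1.0, 1.0])
xt = np.array([1.2001, 0.99991, 0.92538])

d_inf = np.max(np.abs(x - xt))        # l-infinity distance
d_2 = np.sqrt(np.sum((x - xt) ** 2))  # l2 distance

print(d_inf)  # 0.2001
print(d_2)    # ≈ 0.21356
```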


## Convergence for Sequences of Vectors

**Def 7.5 (向量序列的收斂性)**

A seq. {x^{(k)}}_{k=1}^{∞} of vectors in R^{n} is said to converge to x ∈ R^{n} with respect to the norm ∥ · ∥ if ∀ ϵ > 0, ∃ N(ϵ) ∈ N s.t.

∥x^{(k)} − x∥ < ϵ  ∀ k ≥ N(ϵ).

**Thm 7.6**

The seq. of vectors {x^{(k)}}_{k=1}^{∞} converges to x ∈ R^{n} with respect to the l_{∞} norm ⇐⇒ lim_{k→∞} x_{i}^{(k)} = x_{i} for i = 1, 2, . . . , n.

pf: It is easily seen that ∀ ϵ > 0, ∃ N(ϵ) ∈ N s.t.

∥x^{(k)} − x∥_{∞} < ϵ  ∀ k ≥ N(ϵ)

⇐⇒ |x_{i}^{(k)} − x_{i}| < ϵ  ∀ k ≥ N(ϵ) and 1 ≤ i ≤ n.

**Example 3, p. 436**

The sequence of vectors in R^{4}

x^{(k)} = [1, 2 + 1/k, 3/k^{2}, e^{−k} sin k]^{T}

converges to x = [1, 2, 0, 0]^{T} ∈ R^{4} with respect to the l_{∞} norm, since

lim_{k→∞} (2 + 1/k) = 2,  lim_{k→∞} 3/k^{2} = 0,  lim_{k→∞} e^{−k} sin k = 0.

**Question**

Does the given sequence converge to x with respect to the l_{2} norm?
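The l_{∞} convergence in Example 3 can be observed numerically: the error ∥x^{(k)} − x∥_{∞} shrinks as k grows. This is a sketch of that check, with the sample indices k chosen arbitrarily.

```python
# Example 3: x^(k) = [1, 2 + 1/k, 3/k^2, e^{-k} sin k]^T tends to
# x = [1, 2, 0, 0]^T in the l-infinity norm.
import numpy as np

def x_k(k):
    return np.array([1.0, 2.0 + 1.0 / k, 3.0 / k**2, np.exp(-k) * np.sin(k)])

x = np.array([1.0, 2.0, 0.0, 0.0])

errors = [np.max(np.abs(x_k(k) - x)) for k in (1, 10, 100, 1000)]
print(errors)  # decreasing toward 0: approximately [3.0, 0.1, 0.01, 0.001]
```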


**Thm 7.7 (The Equivalence of Vector Norms)**

For each x ∈ R^{n},

∥x∥_{∞} ≤ ∥x∥_{2} ≤ √n ∥x∥_{∞}.

In this case, we say that the l_{∞} and l_{2} norms are equivalent.

pf: For any x = [x_{1}, · · · , x_{n}]^{T} ∈ R^{n}, let |x_{i0}| = max_{1≤i≤n} |x_{i}| = ∥x∥_{∞}. Then we see that

**1** ∥x∥_{∞} = |x_{i0}| = √(x_{i0}^{2}) ≤ √(x_{1}^{2} + · · · + x_{n}^{2}) = ∥x∥_{2}.

**2** ∥x∥_{2} ≤ ( ∑_{i=1}^{n} x_{i0}^{2} )^{1/2} = ( n ∥x∥_{∞}^{2} )^{1/2} = √n ∥x∥_{∞}.

So, these prove the desired inequalities.

**Example 4, p. 437**

Show that the sequence of vectors in Example 3

x^{(k)} = [1, 2 + 1/k, 3/k^{2}, e^{−k} sin k]^{T} ∈ R^{4}

converges to x = [1, 2, 0, 0]^{T} ∈ R^{4} with respect to the l_{2} norm.

pf: In Example 3, we know that lim_{k→∞} ∥x^{(k)} − x∥_{∞} = 0. So, for any ϵ > 0, ∃ N_{0} ∈ N s.t.

∥x^{(k)} − x∥_{∞} < ϵ/2  ∀ k ≥ N_{0},

and furthermore, it follows from Thm 7.7 (with n = 4) that

∥x^{(k)} − x∥_{2} ≤ √4 ∥x^{(k)} − x∥_{∞} < 2 · (ϵ/2) = ϵ  ∀ k ≥ N_{0}.


**Remarks**

Any two vector norms ∥ · ∥ and ∥ · ∥′ on R^{n} are equivalent, i.e., ∃ c_{1} > 0 and c_{2} > 0 s.t.

c_{1} ∥x∥′ ≤ ∥x∥ ≤ c_{2} ∥x∥′  ∀ x ∈ R^{n}.

A seq. {x^{(k)}}_{k=1}^{∞} converges to the limit x ∈ R^{n} with respect to the norm ∥ · ∥ ⇐⇒ it converges to the limit x ∈ R^{n} with respect to the norm ∥ · ∥′. (向量序列的收斂性與範數無關!)

For any x ∈ R^{n}, the relations between the l_{1}, l_{2} and l_{∞} norms are

∥x∥_{2} ≤ ∥x∥_{1} ≤ √n ∥x∥_{2},  ∥x∥_{∞} ≤ ∥x∥_{1} ≤ n ∥x∥_{∞}.
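The inequalities above can be spot-checked on random vectors; this sketch draws from a fixed-seed generator so the check is reproducible.

```python
# Spot-checking the norm-equivalence inequalities on random vectors in R^5:
# ||x||_inf <= ||x||_2 <= sqrt(n) ||x||_inf,
# ||x||_2   <= ||x||_1 <= sqrt(n) ||x||_2,
# ||x||_inf <= ||x||_1 <= n ||x||_inf.
import numpy as np

rng = np.random.default_rng(0)
n = 5
for _ in range(1000):
    x = rng.standard_normal(n)
    n1, n2, ninf = (np.linalg.norm(x, p) for p in (1, 2, np.inf))
    assert ninf <= n2 <= np.sqrt(n) * ninf + 1e-12
    assert n2 <= n1 <= np.sqrt(n) * n2 + 1e-12
    assert ninf <= n1 <= n * ninf + 1e-12
print("all inequalities hold")
```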

## Matrix Norms and Distances

**Def 7.8 (矩陣的範數)**

A matrix norm on R^{n×n} is a function ∥ · ∥ : R^{n×n} → R satisfying, for all A, B ∈ R^{n×n} and all α ∈ R:

(i) ∥A∥ ≥ 0;

(ii) ∥A∥ = 0 ⇐⇒ A = 0 (zero matrix);

(iii) ∥αA∥ = |α| ∥A∥;

(iv) ∥A + B∥ ≤ ∥A∥ + ∥B∥;

(v) ∥AB∥ ≤ ∥A∥ ∥B∥.

**Definition (Distances of Two Matrices)**

The distance between two matrices A, B ∈ R^{n×n} with respect to the matrix norm ∥ · ∥ is ∥A − B∥.


**Thm 7.9 (自然矩陣範數)**

If ∥ · ∥ is a vector norm on R^{n}, then

∥A∥ = max_{∥x∥=1} ∥Ax∥

is a matrix norm on R^{n×n}. (See Exercise 13 for the proof.)

pf: Only prove that ∥AB∥ ≤ ∥A∥ ∥B∥ for any A, B ∈ R^{n×n} here. For any unit vector x ∈ R^{n} with Bx ≠ 0, we have

∥A(Bx)∥ = ∥Bx∥ · ∥A( Bx/∥Bx∥ )∥ ≤ ∥A∥ · ∥Bx∥.

Thus, we conclude that

∥AB∥ = max_{∥x∥=1} ∥(AB)x∥ = max_{∥x∥=1} ∥A(Bx)∥ ≤ max_{∥x∥=1} ( ∥A∥ ∥Bx∥ ) = ∥A∥ · max_{∥x∥=1} ∥Bx∥ = ∥A∥ ∥B∥.

**Remarks**

Matrix norms defined by vector norms are called the natural (or induced) matrix norms associated with the vector norm.

Since x = z/∥z∥ is a unit vector for z ≠ 0, Thm 7.9 can be rewritten as

∥A∥ = max_{∥x∥=1} ∥Ax∥ = max_{z≠0} ∥A( z/∥z∥ )∥ = max_{z≠0} ∥Az∥/∥z∥.


**Cor 7.10**

For any A ∈ R^{n×n}, any 0 ≠ z ∈ R^{n} and any natural norm ∥ · ∥,

∥Az∥ ≤ ∥A∥ · ∥z∥.

**Some Natural Matrix Norms**

**1** ∥A∥_{∞} = max_{∥x∥_{∞}=1} ∥Ax∥_{∞} = max_{z≠0} ∥Az∥_{∞}/∥z∥_{∞}. (the l_{∞} norm)

**2** ∥A∥_{2} = max_{∥x∥_{2}=1} ∥Ax∥_{2} = max_{z≠0} ∥Az∥_{2}/∥z∥_{2}. (the l_{2} norm)

**3** ∥A∥_{1} = max_{∥x∥_{1}=1} ∥Ax∥_{1} = max_{z≠0} ∥Az∥_{1}/∥z∥_{1}. (the l_{1} norm)

**Thm 7.11 (矩陣 ∞-範數的計算公式)**

If A = [a_{ij}] ∈ R^{n×n}, then

∥A∥_{∞} = max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}|. (|A| = [ |a_{ij}| ] 的最大列和)

pf: The proof is separated into two parts.

(1) Assume ∥A∥_{∞} = ∥Ax∥_{∞} for some x ∈ R^{n} with ∥x∥_{∞} = max_{1≤j≤n} |x_{j}| = 1. Then we have

∥A∥_{∞} = ∥Ax∥_{∞} = max_{1≤i≤n} |(Ax)_{i}| = max_{1≤i≤n} | ∑_{j=1}^{n} a_{ij}x_{j} | ≤ max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}| |x_{j}| ≤ max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}|.

(2) Let p be an index with ∑_{j=1}^{n} |a_{pj}| = max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}|, and let y = [y_{1}, y_{2}, . . . , y_{n}]^{T} ∈ R^{n}, where each component

y_{j} = 1 if a_{pj} ≥ 0, and y_{j} = −1 if a_{pj} < 0.

Then ∥y∥_{∞} = 1 and a_{pj}y_{j} = |a_{pj}| for all j. So, we get

∥Ay∥_{∞} = max_{1≤i≤n} |(Ay)_{i}| ≥ | ∑_{j=1}^{n} a_{pj}y_{j} | = ∑_{j=1}^{n} |a_{pj}| = max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}|.

Furthermore, it follows that

∥A∥_{∞} = max_{∥x∥_{∞}=1} ∥Ax∥_{∞} ≥ ∥Ay∥_{∞} = max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}|.

Parts (1) and (2) together complete the proof.

**Exercise 6, p. 442 (矩陣 1-範數的計算公式)**

If A = [a_{ij}] ∈ R^{n×n}, then

∥A∥_{1} = max_{1≤j≤n} ∑_{i=1}^{n} |a_{ij}|. (|A| = [ |a_{ij}| ] 的最大行和)


**Example 5, p. 441**

For the 3 × 3 matrix

A = [ 1   2  −1
      0   3  −1
      5  −1   1 ],

it follows that

∥A∥_{∞} = max_{i=1,2,3} ∑_{j=1}^{3} |a_{ij}| = max{4, 4, 7} = 7,

∥A∥_{1} = max_{j=1,2,3} ∑_{i=1}^{3} |a_{ij}| = max{6, 6, 3} = 6.
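The row-sum and column-sum formulas of Thm 7.11 and Exercise 6 are one line each in numpy; this reproduces Example 5's values.

```python
# l-infinity and l1 matrix norms of Example 5's matrix:
# maximum absolute row sum and maximum absolute column sum, respectively.
import numpy as np

A = np.array([[1.0, 2.0, -1.0],
              [0.0, 3.0, -1.0],
              [5.0, -1.0, 1.0]])

norm_inf = np.max(np.sum(np.abs(A), axis=1))  # row sums: 4, 4, 7
norm_1 = np.max(np.sum(np.abs(A), axis=0))    # column sums: 6, 6, 3

assert norm_inf == np.linalg.norm(A, np.inf) == 7.0
assert norm_1 == np.linalg.norm(A, 1) == 6.0
print(norm_inf, norm_1)  # 7.0 6.0
```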

**Section 7.2**

**Eigenvalues and Eigenvectors**


**Def 7.12 (特徵多項式)**

*The characteristic polynomial of A∈ R*^{n}* ^{×n}* is defined by

*p(λ) = det(A− λI),*

*where I is the n× n identity matrix.*

**Note:** The characteristic poly. p is an nth-degree poly. with real coefficients. So, it has at most n distinct zeros in C.

**Def 7.13 (特徵值與特徵向量)**

*Let p(λ) be the characteristic poly. of A∈ R*^{n}* ^{×n}*.

*The number λ∈ C is called an eigenvalue (or characteristic*
*value) of A if p(λ) = 0.*

*The spectrum (譜) of A, denoted by σ(A), is the set of all*
*eigenvalues of A.*

If ∃ 0 ≠ x ∈ R^{n} s.t.

Ax = λx  or  (A − λI)x = 0  for λ ∈ σ(A),

then x is called an eigenvector (or characteristic vector) of A corresponding to λ.


**Def 7.14**

The spectral radius (譜半徑) of A ∈ R^{n×n} is defined by

ρ(A) = max{ |λ| : λ ∈ σ(A) }.

(For complex λ = α + βi, we define |λ| = √(α^{2} + β^{2}).)

**Thm 7.15 (矩陣 2-範數的計算公式)**

If A is an n × n matrix, then

(i) ∥A∥_{2} = √(ρ(A^{T}A)).

(ii) ρ(A) ≤ ∥A∥ for any natural matrix norm ∥ · ∥.

**Review from Linear Algebra**

Let B = A^{T}A with A ∈ R^{n×n} and v ∈ R^{n}.

B^{T} = (A^{T}A)^{T} = A^{T}(A^{T})^{T} = A^{T}A = B, i.e., B is symmetric.

For any λ ∈ σ(B), λ ≥ 0.

B is orthogonally diagonalizable, i.e., ∃ orthog. Q ∈ R^{n×n} s.t.

Q^{T}BQ = Q^{T}(A^{T}A)Q = diag(λ_{1}, λ_{2}, · · · , λ_{n}) ≡ D,

where λ_{1} ≥ λ_{2} ≥ · · · ≥ λ_{n} ≥ 0.

Since ∥v∥_{2}^{2} = v^{T}v, we have

∥Ax∥_{2}^{2} = (Ax)^{T}(Ax) = x^{T}(A^{T}A)x = x^{T}Bx.


**Proof of Thm 7.15 (1/2)**

(i) Since A^{T}A is symmetric, ∃ orthogonal Q ∈ R^{n×n} s.t.

Q^{T}(A^{T}A)Q = diag(λ_{1}, λ_{2}, · · · , λ_{n}) ≡ D,

where λ_{1} ≥ λ_{2} ≥ · · · ≥ λ_{n} ≥ 0. Hence,

∥A∥_{2}^{2} = max_{∥x∥_{2}=1} ∥Ax∥_{2}^{2} = max_{∥x∥_{2}=1} x^{T}(A^{T}A)x = max_{∥x∥_{2}=1} x^{T}QDQ^{T}x = max_{∥y∥_{2}=1} y^{T}Dy,

where we let y = Q^{T}x. So, ∥A∥_{2}^{2} ≤ λ_{1}. Moreover, the maximum value of y^{T}Dy is achieved at the vector y^{∗} = [1, 0, · · · , 0]^{T} ∈ R^{n} and thus

∥A∥_{2}^{2} = λ_{1}  or  ∥A∥_{2} = √λ_{1} = √(ρ(A^{T}A)).

**Proof of Thm 7.15 (2/2)**

(ii) Let A ∈ R^{n×n} and ∥ · ∥ be any natural norm. For each λ ∈ σ(A), ∃ 0 ≠ x ∈ R^{n} s.t.

Ax = λx  with ∥x∥ = 1.

Hence, we know that

|λ| = |λ| · ∥x∥ = ∥λx∥ = ∥Ax∥ ≤ ∥A∥ ∥x∥ = ∥A∥.

So, the spectral radius of A satisfies ρ(A) ≤ ∥A∥.
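Both parts of Thm 7.15 can be verified numerically; this sketch reuses the matrix from Example 5 as an arbitrary test case.

```python
# Thm 7.15(i): ||A||_2 = sqrt(rho(A^T A)); (ii): rho(A) <= ||A||.
import numpy as np

A = np.array([[1.0, 2.0, -1.0],
              [0.0, 3.0, -1.0],
              [5.0, -1.0, 1.0]])

eigvals = np.linalg.eigvalsh(A.T @ A)  # A^T A is symmetric, eigenvalues >= 0
norm2 = np.sqrt(np.max(eigvals))       # sqrt(rho(A^T A))

# numpy's 2-norm is the largest singular value, which is the same quantity:
assert np.isclose(norm2, np.linalg.norm(A, 2))

# Part (ii) with the natural inf-norm:
rho_A = np.max(np.abs(np.linalg.eigvals(A)))
assert rho_A <= np.linalg.norm(A, np.inf) + 1e-12
```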


**Remarks**

If A^{T} = A ∈ R^{n×n}, then ∥A∥_{2} = ρ(A).

For any A ∈ R^{n×n} and any ϵ > 0, ∃ a natural norm ∥ · ∥_{ϵ} s.t.

ρ(A) ≤ ∥A∥_{ϵ} < ρ(A) + ϵ.

**Def 7.16 (收斂矩陣的定義)**

We say that a matrix A ∈ R^{n×n} is convergent if

lim_{k→∞} (A^{k})_{ij} = 0  for i, j = 1, 2, . . . , n.

**Example 4, p. 448**

The 2 × 2 matrix

A = [ 1/2    0
      1/4   1/2 ]

is a convergent matrix, since we have

A^{k} = [ (1/2)^{k}       0
          k/2^{k+1}   (1/2)^{k} ]  ∀ k ≥ 1,

and every entry of A^{k} tends to 0 as k → ∞.
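Example 4 can be confirmed by computing powers of A directly and comparing with the closed form for A^{k} given above.

```python
# Example 4: A = [[1/2, 0], [1/4, 1/2]] is convergent, with
# A^k = [[(1/2)^k, 0], [k/2^(k+1), (1/2)^k]].
import numpy as np

A = np.array([[0.5, 0.0],
              [0.25, 0.5]])

Ak = np.linalg.matrix_power(A, 10)
expected = np.array([[0.5 ** 10, 0.0],
                     [10 / 2 ** 11, 0.5 ** 10]])
assert np.allclose(Ak, expected)

# Consistent with Thm 7.17(iv) below: rho(A) = 1/2 < 1.
rho = np.max(np.abs(np.linalg.eigvals(A)))
assert abs(rho - 0.5) < 1e-6
```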


**Thm 7.17 (收斂矩陣的等價條件)**

*Let A∈ R*^{n}* ^{×n}*. The following statements are equivalent.

(i) *A is a convergent matrix.*

(ii) lim_{n→∞} ∥A^{n}∥ = 0 for some natural norm.

(iii) lim_{n→∞} ∥A^{n}∥ = 0 for all natural norms.

(iv) ρ(A) < 1.

(v) lim_{n→∞} A^{n}x = 0 for every x ∈ R^{n}.

**Section 7.3**

**The Jacobi and Gauss-Seidel Iterative Techniques**


## Derivation of Jacobi Method

**Basic Idea**

From the ith eq. of a linear system Ax = b,

a_{i1}x_{1} + a_{i2}x_{2} + · · · + a_{ii}x_{i} + · · · + a_{in}x_{n} = b_{i},

solving for the ith component x_{i}, we get

x_{i} = (1/a_{ii}) [ ∑_{j=1, j≠i}^{n} (−a_{ij}x_{j}) + b_{i} ],

provided that a_{ii} ≠ 0 for i = 1, 2, . . . , n.

## The Jacobi Method

**Component Form (Jacobi 法的分量形式)**

For each k ≥ 1, we may consider the Jacobi iterative method:

x_{i}^{(k)} = (1/a_{ii}) [ ∑_{j=1, j≠i}^{n} (−a_{ij}x_{j}^{(k−1)}) + b_{i} ]  for i = 1, . . . , n,  (1)

where an initial approx. x^{(0)} = [x_{1}^{(0)}, · · · , x_{n}^{(0)}]^{T} ∈ R^{n} is given.
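The component form (1) translates almost verbatim into code. This is a minimal sketch, not the book's Algorithm 7.1: the function name `jacobi` and the relative l_{∞} stopping rule (the one used in Example 1 below) are illustrative choices.

```python
# Jacobi iteration, component form (1), with a relative l-infinity stopping test.
import numpy as np

def jacobi(A, b, x0, tol=1e-3, max_iter=100):
    n = len(b)
    x_old = x0.astype(float).copy()
    for _ in range(max_iter):
        x = np.empty(n)
        for i in range(n):
            s = sum(-A[i, j] * x_old[j] for j in range(n) if j != i)
            x[i] = (s + b[i]) / A[i, i]  # assumes a_ii != 0
        if np.max(np.abs(x - x_old)) / np.max(np.abs(x)) < tol:
            return x
        x_old = x
    return x_old

# The 4x4 system of Example 1:
A = np.array([[10.0, -1.0, 2.0, 0.0],
              [-1.0, 11.0, -1.0, 3.0],
              [2.0, -1.0, 10.0, -1.0],
              [0.0, 3.0, -1.0, 8.0]])
b = np.array([6.0, 25.0, -11.0, 15.0])

x = jacobi(A, b, np.zeros(4))
print(np.round(x, 4))  # close to the exact solution [1, 2, -1, 1]
```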


**Example 1, p. 451**

The following linear system

10x_{1} − x_{2} + 2x_{3} = 6
−x_{1} + 11x_{2} − x_{3} + 3x_{4} = 25
2x_{1} − x_{2} + 10x_{3} − x_{4} = −11
3x_{2} − x_{3} + 8x_{4} = 15

has a unique solution x = [1, 2, −1, 1]^{T} ∈ R^{4}. Use Jacobi's iterative technique to find an approx. x^{(k)} to x starting with x^{(0)} = [0, 0, 0, 0]^{T} ∈ R^{4} until

∥x^{(k)} − x^{(k−1)}∥_{∞} / ∥x^{(k)}∥_{∞} < 10^{−3}.

**Solution (1/2)**

The given linear system can be rewritten as

x_{1} = (1/10)x_{2} − (1/5)x_{3} + 3/5
x_{2} = (1/11)x_{1} + (1/11)x_{3} − (3/11)x_{4} + 25/11
x_{3} = (−1/5)x_{1} + (1/10)x_{2} + (1/10)x_{4} − 11/10
x_{4} = (−3/8)x_{2} + (1/8)x_{3} + 15/8.


**Solution (2/2)**

For each k ≥ 1, we apply the Jacobi method:

x_{1}^{(k)} = (1/10)x_{2}^{(k−1)} − (1/5)x_{3}^{(k−1)} + 3/5
x_{2}^{(k)} = (1/11)x_{1}^{(k−1)} + (1/11)x_{3}^{(k−1)} − (3/11)x_{4}^{(k−1)} + 25/11
x_{3}^{(k)} = (−1/5)x_{1}^{(k−1)} + (1/10)x_{2}^{(k−1)} + (1/10)x_{4}^{(k−1)} − 11/10
x_{4}^{(k)} = (−3/8)x_{2}^{(k−1)} + (1/8)x_{3}^{(k−1)} + 15/8

with the initial guess x^{(0)} = [0, 0, 0, 0]^{T} ∈ R^{4}.

## Numerical Results

After 10 iterations of Jacobi method, we have

∥x^{(10)} − x^{(9)}∥_{∞} / ∥x^{(10)}∥_{∞} = (8.0 × 10^{−4}) / 1.9998 = 4.0 × 10^{−4} < 10^{−3}.

In fact, the absolute error is ∥x^{(10)} − x∥_{∞} = 2 × 10^{−4}.


**Equivalent Matrix-Vector Forms**

As in Chapter 2, every root-finding problem f(x) = 0 is converted into its equivalent fixed-point form

x = g(x),  x ∈ I = [a, b],

for some differentiable function g.

Similarly, we also try to convert the original linear system Ax = b into its equivalent matrix-vector form

x = Tx + c,  x ∈ R^{n},

where T ∈ R^{n×n} and c ∈ R^{n} are fixed.

For k = 1, 2, . . ., compute

x^{(k)} = Tx^{(k−1)} + c

with an initial approx. x^{(0)} ∈ R^{n} to the unique sol. x.

**A Useful Split of A**

The iterative techniques for solving Ax = b will be derived by first splitting A into its diagonal and off-diagonal parts, i.e.,

A ≡ D − L − U,  (2)

where D = diag(a_{11}, a_{22}, · · · , a_{nn}) is the diagonal part of A, −L is the strictly lower-triangular part of A (L has entries −a_{ij} for i > j and zeros elsewhere), and −U is the strictly upper-triangular part of A (U has entries −a_{ij} for i < j and zeros elsewhere).


## The Jacobi Method Revisited

From the splitting of A as in (2), the linear system Ax = b is transformed to

(D − L − U)x = b ⇔ Dx = (L + U)x + b ⇔ x = T_{j}x + c_{j},

where T_{j} ≡ D^{−1}(L + U) and c_{j} ≡ D^{−1}b.

It is easily seen that the component form of Jacobi method

x_{i}^{(k)} = (1/a_{ii}) [ ∑_{j=1, j≠i}^{n} (−a_{ij}x_{j}^{(k−1)}) + b_{i} ]  for i = 1, . . . , n

is equivalent to the following matrix-vector form

x^{(k)} = T_{j}x^{(k−1)} + c_{j}  ∀ k ≥ 1.

**Example 2, p. 453**

The 4 × 4 linear system in Example 1 can be rewritten in the form

x_{1} = (1/10)x_{2} − (1/5)x_{3} + 3/5
x_{2} = (1/11)x_{1} + (1/11)x_{3} − (3/11)x_{4} + 25/11
x_{3} = (−1/5)x_{1} + (1/10)x_{2} + (1/10)x_{4} − 11/10
x_{4} = (−3/8)x_{2} + (1/8)x_{3} + 15/8.

So, the unique sol. x ∈ R^{4} satisfies x = T_{j}x + c_{j} with

T_{j} = [ 0      1/10   −1/5    0
          1/11   0       1/11  −3/11
          −1/5   1/10    0      1/10
          0     −3/8     1/8    0 ],   c_{j} = [ 3/5, 25/11, −11/10, 15/8 ]^{T}.


**Algorithm 7.1: Jacobi Method**

INPUT dim. n; A = [a_{ij}] ∈ R^{n×n}; b ∈ R^{n}; X0 = x^{(0)} ∈ R^{n}; tol. TOL; max. no. of iter. N_{0}.

OUTPUT an approx. sol. x_{1}, x_{2}, . . . , x_{n} to Ax = b.

Step 1 Set k = 1.

Step 2 While (k ≤ N_{0}) do Steps 3–6

Step 3 For i = 1, . . . , n set

x_{i} = (1/a_{ii}) [ ∑_{j=1, j≠i}^{n} (−a_{ij}X0_{j}) + b_{i} ].

Step 4 If ∥x − X0∥ < TOL then OUTPUT(x_{1}, · · · , x_{n}); STOP.

Step 5 Set k = k + 1.

Step 6 Set X0 = x.

Step 7 OUTPUT('Maximum number of iterations exceeded'); STOP.

**Comments on Algorithm 7.1**

If some a_{ii} = 0 and A is nonsingular, choose p ≠ i s.t. |a_{pi}| is as large as possible, and then perform (E_{p}) ↔ (E_{i}) to ensure that no a_{ii} = 0 before applying the Jacobi method.

In Step 4, a better stopping criterion should be

∥x^{(k)} − x^{(k−1)}∥ / ∥x^{(k)}∥ < TOL,

where the vector norm ∥ · ∥ is the l_{1}, l_{2} or l_{∞} norm.


**Jacobi Method vs. Gauss-Seidel Method**

For the Jacobi method, the ith component x_{i}^{(k)} of x^{(k)} is determined by x_{1}^{(k−1)}, · · · , x_{i−1}^{(k−1)} and x_{i+1}^{(k−1)}, · · · , x_{n}^{(k−1)}.

At the kth step of the Gauss-Seidel method, the ith component x_{i}^{(k)} is computed by x_{1}^{(k)}, · · · , x_{i−1}^{(k)} and x_{i+1}^{(k−1)}, · · · , x_{n}^{(k−1)}. Notice that the recently computed values of x_{1}^{(k)}, · · · , x_{i−1}^{(k)} are better approxs. to x than the values of x_{1}^{(k−1)}, · · · , x_{i−1}^{(k−1)}.

**Component Form of Gauss-Seidel Method**

For each k ≥ 1, the ith component of x^{(k)} is determined by

x_{i}^{(k)} = (1/a_{ii}) [ ∑_{j=1}^{i−1} (−a_{ij}x_{j}^{(k)}) + ∑_{j=i+1}^{n} (−a_{ij}x_{j}^{(k−1)}) + b_{i} ]
           = (1/a_{ii}) [ − ∑_{j=1}^{i−1} a_{ij}x_{j}^{(k)} − ∑_{j=i+1}^{n} a_{ij}x_{j}^{(k−1)} + b_{i} ],  (3)

where an initial vector x^{(0)} is given and i = 1, 2, . . . , n.


**Algorithm 7.2: Gauss-Seidel Method**

INPUT dim. n; A = [a_{ij}] ∈ R^{n×n}; b ∈ R^{n}; X0 = x^{(0)} ∈ R^{n}; tol. TOL; max. no. of iter. N_{0}.

OUTPUT an approx. sol. x_{1}, x_{2}, . . . , x_{n} to Ax = b.

Step 1 Set k = 1.

Step 2 While (k ≤ N_{0}) do Steps 3–6

Step 3 For i = 1, . . . , n set

x_{i} = (1/a_{ii}) [ − ∑_{j=1}^{i−1} a_{ij}x_{j} − ∑_{j=i+1}^{n} a_{ij}X0_{j} + b_{i} ].

Step 4 If ∥x − X0∥ < TOL then OUTPUT(x_{1}, · · · , x_{n}); STOP.

Step 5 Set k = k + 1.

Step 6 Set X0 = x.

Step 7 OUTPUT('Maximum number of iterations exceeded'); STOP.

**Example 3, p. 455**

The following linear system

10x_{1} − x_{2} + 2x_{3} = 6
−x_{1} + 11x_{2} − x_{3} + 3x_{4} = 25
2x_{1} − x_{2} + 10x_{3} − x_{4} = −11
3x_{2} − x_{3} + 8x_{4} = 15

has a unique solution x = [1, 2, −1, 1]^{T} ∈ R^{4}. Use Gauss-Seidel method to find an approx. x^{(k)} to x starting with x^{(0)} = [0, 0, 0, 0]^{T} ∈ R^{4} until

∥x^{(k)} − x^{(k−1)}∥_{∞} / ∥x^{(k)}∥_{∞} < 10^{−3}.


**Solution**

For each k ≥ 1, we apply the Gauss-Seidel method:

x_{1}^{(k)} = (1/10)x_{2}^{(k−1)} − (1/5)x_{3}^{(k−1)} + 3/5
x_{2}^{(k)} = (1/11)x_{1}^{(k)} + (1/11)x_{3}^{(k−1)} − (3/11)x_{4}^{(k−1)} + 25/11
x_{3}^{(k)} = (−1/5)x_{1}^{(k)} + (1/10)x_{2}^{(k)} + (1/10)x_{4}^{(k−1)} − 11/10
x_{4}^{(k)} = (−3/8)x_{2}^{(k)} + (1/8)x_{3}^{(k)} + 15/8

with the initial guess x^{(0)} = [0, 0, 0, 0]^{T} ∈ R^{4}.

## Numerical Results of Example 3

After 5 iterations of Gauss-Seidel method, we have

∥x^{(5)} − x^{(4)}∥_{∞} / ∥x^{(5)}∥_{∞} = (8.0 × 10^{−4}) / 2.000 = 4.0 × 10^{−4} < 10^{−3}.

In fact, the absolute error is ∥x^{(5)} − x∥_{∞} = 1.0 × 10^{−4}. The numerical results are shown in the following table.
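The in-place update of the Gauss-Seidel component form (3) can be sketched as follows; the function name and the relative l_{∞} stopping rule are illustrative choices, mirroring Example 3's criterion.

```python
# Gauss-Seidel iteration (component form (3)): each x_i^(k) immediately
# uses the already-updated values x_1^(k), ..., x_{i-1}^(k).
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-3, max_iter=100):
    n = len(b)
    x = x0.astype(float).copy()
    for k in range(1, max_iter + 1):
        x_old = x.copy()
        for i in range(n):
            s1 = A[i, :i] @ x[:i]              # new values x_j^(k), j < i
            s2 = A[i, i + 1:] @ x_old[i + 1:]  # old values x_j^(k-1), j > i
            x[i] = (b[i] - s1 - s2) / A[i, i]
        if np.max(np.abs(x - x_old)) / np.max(np.abs(x)) < tol:
            return x, k
    return x, max_iter

# The same 4x4 system as in Examples 1 and 3:
A = np.array([[10.0, -1.0, 2.0, 0.0],
              [-1.0, 11.0, -1.0, 3.0],
              [2.0, -1.0, 10.0, -1.0],
              [0.0, 3.0, -1.0, 8.0]])
b = np.array([6.0, 25.0, -11.0, 15.0])

x, iters = gauss_seidel(A, b, np.zeros(4))
print(iters)  # 5 iterations, vs 10 for Jacobi on the same system
```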


**Matrix-Vector Form of Gauss-Seidel Method (1/2)**

From the component form as in (3)

x_{i}^{(k)} = (1/a_{ii}) [ − ∑_{j=1}^{i−1} a_{ij}x_{j}^{(k)} − ∑_{j=i+1}^{n} a_{ij}x_{j}^{(k−1)} + b_{i} ],

we immediately obtain

a_{i1}x_{1}^{(k)} + · · · + a_{ii}x_{i}^{(k)} = −a_{i,i+1}x_{i+1}^{(k−1)} − · · · − a_{in}x_{n}^{(k−1)} + b_{i}

for each i = 1, 2, . . . , n. Stacking these n equations gives the matrix form for the Gauss-Seidel method: the lower-triangular part of A (including its diagonal), i.e. D − L, multiplies x^{(k)}, and the strictly upper-triangular matrix U (with entries −a_{ij} for i < j) multiplies x^{(k−1)}:

(D − L)x^{(k)} = Ux^{(k−1)} + b.

**Matrix-Vector Form of Gauss-Seidel Method (2/2)**

For each k ≥ 1, the above matrix equation can be rewritten as

(D − L)x^{(k)} = Ux^{(k−1)} + b
⇐⇒ x^{(k)} = (D − L)^{−1}Ux^{(k−1)} + (D − L)^{−1}b
⇐⇒ x^{(k)} = T_{g}x^{(k−1)} + c_{g},

where T_{g} ≡ (D − L)^{−1}U and c_{g} ≡ (D − L)^{−1}b.

Recall the Jacobi method given by

x^{(k)} = T_{j}x^{(k−1)} + c_{j},  k = 1, 2, . . . ,

where T_{j} = D^{−1}(L + U) and c_{j} = D^{−1}b.
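The two iteration matrices can be built explicitly from the splitting (2). For Example 1's system, both spectral radii turn out to be below 1 (so both methods converge, by Thm 7.19 below), and ρ(T_{g}) is the smaller of the two, which matches the faster Gauss-Seidel convergence observed above; the construction here is a sketch, with names T_j and T_g taken from the text.

```python
# Iteration matrices T_j = D^{-1}(L+U) and T_g = (D-L)^{-1} U for
# Example 1's coefficient matrix, and their spectral radii.
import numpy as np

A = np.array([[10.0, -1.0, 2.0, 0.0],
              [-1.0, 11.0, -1.0, 3.0],
              [2.0, -1.0, 10.0, -1.0],
              [0.0, 3.0, -1.0, 8.0]])

D = np.diag(np.diag(A))
L = -np.tril(A, -1)  # strictly lower part of A, negated (split (2))
U = -np.triu(A, 1)   # strictly upper part of A, negated

T_j = np.linalg.inv(D) @ (L + U)
T_g = np.linalg.inv(D - L) @ U

rho = lambda T: np.max(np.abs(np.linalg.eigvals(T)))
print(rho(T_j), rho(T_g))  # both < 1, and rho(T_g) < rho(T_j) here
```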


## General Iteration Methods

**Some Questions**

**1** When does a general iteration of the form

x^{(k)} = Tx^{(k−1)} + c,  k = 1, 2, . . .

converge to a solution x ∈ R^{n} of the matrix equation x = Tx + c?

**2** What is the rate of convergence for this iterative method?

**3** Does the Gauss-Seidel method always converge faster than the Jacobi method?

**Lemma 7.18**

If T ∈ R^{n×n} satisfies ρ(T) < 1, then (I − T)^{−1} exists and

(I − T)^{−1} = I + T + T^{2} + · · · = ∑_{j=0}^{∞} T^{j},

with T^{0} ≡ I being defined conventionally.


**Proof of Lemma 7.18**

If λ ∈ σ(T), then ∃ 0 ≠ x ∈ R^{n} s.t.

Tx = λx  or  (I − T)x = (1 − λ)x.

So, 1 − λ ∈ σ(I − T).

Because ρ(T) < 1, |λ| ≤ ρ(T) < 1. This means that I − T does not have any zero eigenvalues and hence (I − T)^{−1} exists.

Let S_{m} = ∑_{j=0}^{m} T^{j}. Then we have

(I − T)S_{m} = ∑_{j=0}^{m} T^{j} − ∑_{j=0}^{m} T^{j+1} = I − T^{m+1}.

Since ρ(T) < 1, T is convergent, i.e., lim_{m→∞} T^{m} = 0 by Thm 7.17. Hence (I − T)^{−1} = lim_{m→∞} S_{m} = ∑_{j=0}^{∞} T^{j}.
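Lemma 7.18 is easy to check numerically: for a matrix with ρ(T) < 1, the partial sums S_{m} = I + T + · · · + T^{m} approach (I − T)^{−1}. The 2 × 2 matrix below is an arbitrary test case with ρ(T) = 0.5.

```python
# Neumann-series check of Lemma 7.18: sum of T^j converges to (I - T)^{-1}.
import numpy as np

T = np.array([[0.2, 0.3],
              [0.1, 0.4]])
assert np.max(np.abs(np.linalg.eigvals(T))) < 1  # rho(T) < 1

S = np.zeros_like(T)
P = np.eye(2)  # P holds T^j, starting from T^0 = I
for _ in range(200):
    S += P
    P = P @ T

assert np.allclose(S, np.linalg.inv(np.eye(2) - T))
```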

**Thm 7.19 (廣義迭代法收斂性的充要條件)**

For any x^{(0)} ∈ R^{n}, the sequence {x^{(k)}}_{k=0}^{∞} defined by

x^{(k)} = Tx^{(k−1)} + c  ∀ k ≥ 1

converges to the unique solution of x = Tx + c ⇐⇒ ρ(T) < 1.

pf: The proof is illustrated as follows.

(⇐) Suppose that ρ(T) < 1. By induction =⇒

x^{(k)} = Tx^{(k−1)} + c = T(Tx^{(k−2)} + c) + c = T^{2}x^{(k−2)} + (T + I)c
       = · · · = T^{k}x^{(0)} + (T^{k−1} + · · · + T + I)c.


Since ρ(T) < 1, lim_{k→∞} T^{k}x^{(0)} = 0 by Thm 7.17. Thus, it follows from Lemma 7.18 that

x ≡ lim_{k→∞} x^{(k)} = 0 + ( ∑_{j=0}^{∞} T^{j} )c = (I − T)^{−1}c.

Hence, the limit x ∈ R^{n} is the unique solution of the equation

x = (I − T)^{−1}c ⇐⇒ (I − T)x = c ⇐⇒ x = Tx + c.

(⇒) Assume that lim_{k→∞} x^{(k)} = x for any initial vector x^{(0)}, where x ∈ R^{n} is the unique sol. of x = Tx + c. Now, we want to claim that

ρ(T) < 1 ⇐⇒ lim_{k→∞} T^{k}z = 0  ∀ z ∈ R^{n}

by applying Thm 7.17.

For any z ∈ R^{n}, let x^{(0)} = x − z. Then by induction =⇒

x − x^{(k)} = (Tx + c) − (Tx^{(k−1)} + c) = T(x − x^{(k−1)}) = · · · = T^{k}(x − x^{(0)}) = T^{k}z,

since z = x − x^{(0)}. So, it follows from the assumption that

lim_{k→∞} T^{k}z = x − lim_{k→∞} x^{(k)} = x − x = 0.

Since z ∈ R^{n} is given arbitrarily, we have ρ(T) < 1.