師大

Tsung-Ming Huang

Department of Mathematics, National Taiwan Normal University, Taiwan

August 28, 2011


**Outline**

**1** **Bisection Method**

**2** **Fixed-Point Iteration**

**3** **Newton’s method**

**4** **Error analysis for iterative methods**

**5** **Accelerating convergence**

**6** **Zeros of polynomials and Müller's method**


**Bisection Method**

**Idea**

If f(x) ∈ C[a, b] and f(a)f(b) < 0, then ∃ c ∈ (a, b) such that f(c) = 0.


**Bisection method algorithm**

Given f(x) defined on [a, b], the maximal number of iterations M, and stopping criteria δ and ε, this algorithm tries to locate one root of f(x).

Compute u = f (a), v = f (b), and e = b − a
**If sign(u) = sign(v), then stop**

**For k = 1, 2, . . . , M**

e = e/2, c = a + e, w = f (c)
**If |e| < δ or |w| < ε, then stop**
**If sign(w) ≠ sign(u)**

b = c, v = w
**Else**

a = c, u = w
**End If**

**End For**
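The pseudocode above translates directly into Python. The test function f(x) = x³ + 4x² − 10 (which reappears in Example 2) and the tolerances are illustrative choices:

```python
import math

def bisection(f, a, b, M=100, delta=1e-10, eps=1e-12):
    """Locate a root of f in [a, b] by repeated halving, as in the algorithm above."""
    u, v = f(a), f(b)
    e = b - a
    if math.copysign(1, u) == math.copysign(1, v):
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(M):
        e /= 2
        c = a + e                # midpoint of the current interval
        w = f(c)
        if abs(e) < delta or abs(w) < eps:
            break
        if math.copysign(1, w) != math.copysign(1, u):
            b, v = c, w          # root lies in [a, c]
        else:
            a, u = c, w          # root lies in [c, b]
    return c

# The root of x^3 + 4x^2 - 10 on [1, 2] is about 1.365230013
root = bisection(lambda x: x**3 + 4*x**2 - 10, 1.0, 2.0)
print(round(root, 6))
```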


Let {c_{n}} be the sequence of numbers produced. The algorithm
should stop if one of the following conditions is satisfied.

**1** the iteration number k > M ,

**2** |c_{k}− c_{k−1}| < δ, or

**3** |f (c_{k})| < ε.

Let [a_{0}, b_{0}], [a_{1}, b_{1}], . . . denote the successive intervals produced by the bisection algorithm. Then

a = a_{0} ≤ a_{1} ≤ a_{2} ≤ · · · ≤ b_{0} = b

⇒ {a_{n}} and {b_{n}} are bounded

⇒ lim_{n→∞} a_{n} and lim_{n→∞} b_{n} exist


Since

b_{1} − a_{1} = (1/2)(b_{0} − a_{0}),
b_{2} − a_{2} = (1/2)(b_{1} − a_{1}) = (1/4)(b_{0} − a_{0}),
...
b_{n} − a_{n} = (1/2^{n})(b_{0} − a_{0}),

hence

lim_{n→∞} b_{n} − lim_{n→∞} a_{n} = lim_{n→∞} (b_{n} − a_{n}) = lim_{n→∞} (1/2^{n})(b_{0} − a_{0}) = 0.

Therefore

lim_{n→∞} a_{n} = lim_{n→∞} b_{n} ≡ z.

Since f is a continuous function, we have that

lim_{n→∞} f(a_{n}) = f(lim_{n→∞} a_{n}) = f(z) and lim_{n→∞} f(b_{n}) = f(lim_{n→∞} b_{n}) = f(z).


On the other hand,

f(a_{n})f(b_{n}) ≤ 0

⇒ lim_{n→∞} f(a_{n})f(b_{n}) = f^{2}(z) ≤ 0

⇒ f(z) = 0.

Therefore, the limit of the sequences {a_{n}} and {b_{n}} is a zero of f in [a, b]. Let c_{n} = (1/2)(a_{n} + b_{n}). Then

|z − c_{n}| = |lim_{n→∞} a_{n} − (1/2)(a_{n} + b_{n})|
           = |(1/2)(lim_{n→∞} a_{n} − b_{n}) + (1/2)(lim_{n→∞} a_{n} − a_{n})|
           ≤ max{|lim_{n→∞} a_{n} − b_{n}|, |lim_{n→∞} a_{n} − a_{n}|}
           ≤ |b_{n} − a_{n}| = (1/2^{n})|b_{0} − a_{0}|.

This proves the following theorem.


**Theorem 1**

Let {[a_{n}, b_{n}]} denote the intervals produced by the bisection algorithm. Then lim_{n→∞} a_{n} and lim_{n→∞} b_{n} exist, are equal, and represent a zero of f(x). If

z = lim_{n→∞} a_{n} = lim_{n→∞} b_{n} and c_{n} = (1/2)(a_{n} + b_{n}),

then

|z − c_{n}| ≤ (1/2^{n})(b_{0} − a_{0}).

**Remark**

{c_{n}} converges to z with the rate O(2^{−n}).


**Example 2**

How many steps should be taken to compute a root of
f (x) = x^{3}+ 4x^{2}− 10 = 0 on [1, 2] with relative error 10^{−3}?
Solution: Seek an n such that

|z − c_{n}|/|z| ≤ 10^{−3} ⇒ |z − c_{n}| ≤ |z| × 10^{−3}.

Since z ∈ [1, 2], it is sufficient to show

|z − c_{n}| ≤ 10^{−3}.

That is, we solve

2^{−n}(2 − 1) ≤ 10^{−3} ⇒ −n log_{10} 2 ≤ −3,

which gives n ≥ 10.
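A quick numerical check of this count (the reference value of z below is quoted only for comparison; 10 halvings of [1, 2] are simulated directly):

```python
# After n bisection steps on [1, 2], the error satisfies |z - c_n| <= 2^{-n}(2 - 1).
f = lambda x: x**3 + 4*x**2 - 10
a, b = 1.0, 2.0
z = 1.365230013414097            # true root, for reference only
for n in range(1, 11):
    c = (a + b) / 2              # c_n
    if f(a) * f(c) < 0:
        b = c
    else:
        a = c
print(abs(z - c) <= 2**-10)      # the theoretical bound holds after 10 steps
```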


**Fixed-Point Iteration**

**Definition 3**

x is called a fixed point of a given function f if f(x) = x.

**Root-finding problems and fixed-point problems**

Find x^{∗} such that f(x^{∗}) = 0.
Let g(x) = x − f(x). Then g(x^{∗}) = x^{∗} − f(x^{∗}) = x^{∗}.
⇒ x^{∗} is a fixed point of g(x).

Find x^{∗} such that g(x^{∗}) = x^{∗}.
Define f(x) = x − g(x), so that
f(x^{∗}) = x^{∗} − g(x^{∗}) = x^{∗} − x^{∗} = 0.
⇒ x^{∗} is a zero of f(x).


**Example 4**

The function g(x) = x^{2} − 2, for −2 ≤ x ≤ 3, has fixed points at x = −1 and x = 2, since

g(−1) = (−1)^{2} − 2 = −1 and g(2) = 2^{2} − 2 = 2.


**Theorem 5 (Existence and uniqueness)**

**1** If g ∈ C[a, b] is such that a ≤ g(x) ≤ b for all x ∈ [a, b], then g has a fixed point in [a, b].

**2** If, in addition, g′(x) exists in (a, b) and there exists a positive constant M < 1 such that |g′(x)| ≤ M < 1 for all x ∈ (a, b), then the fixed point is unique.


**Proof**

Existence:

If g(a) = a or g(b) = b, then a or b is a fixed point of g and we are done.

Otherwise, g(a) > a and g(b) < b. The function h(x) = g(x) − x is continuous on [a, b], with

h(a) = g(a) − a > 0 and h(b) = g(b) − b < 0.

By the Intermediate Value Theorem, ∃ x^{∗} ∈ (a, b) such that h(x^{∗}) = 0. That is,

g(x^{∗}) − x^{∗} = 0 ⇒ g(x^{∗}) = x^{∗}.

Hence g has a fixed point x^{∗} in [a, b].


**Proof**

Uniqueness:

Suppose that p ≠ q are both fixed points of g in [a, b]. By the Mean Value Theorem, there exists ξ between p and q such that

g′(ξ) = (g(p) − g(q))/(p − q) = (p − q)/(p − q) = 1.

However, this contradicts the assumption that |g′(x)| ≤ M < 1 for all x in [a, b]. Therefore the fixed point of g is unique.


**Example 6**

Show that the following function has a unique fixed point.

g(x) = (x^{2}− 1)/3, x ∈ [−1, 1].

Solution: The Extreme Value Theorem implies that

min_{x∈[−1,1]} g(x) = g(0) = −1/3, max_{x∈[−1,1]} g(x) = g(±1) = 0.

That is, g(x) ∈ [−1, 1] for all x ∈ [−1, 1].

Moreover, g is continuous and

|g′(x)| = |2x/3| ≤ 2/3, ∀ x ∈ (−1, 1).

By the theorem above, g has a unique fixed point in [−1, 1].


Let p be the unique fixed point of g. Then

p = g(p) = (p^{2} − 1)/3 ⇒ p^{2} − 3p − 1 = 0 ⇒ p = (3 − √13)/2.


**Fixed-point iteration or functional iteration**

Given a continuous function g, choose an initial point x_{0} and generate {x_{k}}^{∞}_{k=0} by

x_{k+1} = g(x_{k}), k ≥ 0.

{x_{k}} may not converge (e.g., take g(x) = 3x with x_{0} ≠ 0). However, when the sequence converges, say

lim_{k→∞} x_{k} = x^{∗},

then, since g is continuous,

g(x^{∗}) = g(lim_{k→∞} x_{k}) = lim_{k→∞} g(x_{k}) = lim_{k→∞} x_{k+1} = x^{∗}.

That is, x^{∗} is a fixed point of g.


**Fixed-point iteration**

Given x_{0}, tolerance TOL, maximum number of iterations M.
Set i = 1 and x = g(x_{0}).

While i ≤ M and |x − x_{0}| ≥ TOL
Set i = i + 1, x_{0} = x and x = g(x_{0}).

End While
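A minimal Python sketch of this loop; g(x) = cos x and the tolerance are illustrative choices (the fixed point of cos is the solution of cos x = x):

```python
import math

def fixed_point(g, x0, tol=1e-10, M=200):
    """Iterate x = g(x) until two successive iterates agree to within tol."""
    for _ in range(M):
        x = g(x0)
        if abs(x - x0) < tol:
            return x
        x0 = x
    raise RuntimeError("fixed-point iteration did not converge in M steps")

p = fixed_point(math.cos, 1.0)
print(round(p, 6))   # the fixed point of cos is about 0.739085
```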


**Example 7**
The equation

x^{3}+ 4x^{2}− 10 = 0

has a unique root in [1, 2]. Change the equation to the fixed-point form x = g(x).

(a) x = g_{1}(x) ≡ x − f(x) = x − x^{3} − 4x^{2} + 10

(b) x = g_{2}(x) = (10/x − 4x)^{1/2}:

x^{3} = 10 − 4x^{2} ⇒ x^{2} = 10/x − 4x ⇒ x = ±(10/x − 4x)^{1/2}

(c) x = g_{3}(x) = (1/2)(10 − x^{3})^{1/2}:

4x^{2} = 10 − x^{3} ⇒ x = ±(1/2)(10 − x^{3})^{1/2}

(d) x = g_{4}(x) = (10/(4 + x))^{1/2}:

x^{2}(x + 4) = 10 ⇒ x = ±(10/(4 + x))^{1/2}

(e) x = g_{5}(x) = x − (x^{3} + 4x^{2} − 10)/(3x^{2} + 8x):

x = g_{5}(x) ≡ x − f(x)/f′(x)


Results of the fixed-point iteration with initial point x_{0} = 1.5


**Theorem 8 (Fixed-point Theorem)**

Let g ∈ C[a, b] be such that g(x) ∈ [a, b] for all x ∈ [a, b]. Suppose that g′ exists on (a, b) and that ∃ k with 0 < k < 1 such that

|g′(x)| ≤ k, ∀ x ∈ (a, b).

Then, for any number x_{0} in [a, b], the sequence defined by

x_{n} = g(x_{n−1}), n ≥ 1,

converges to the unique fixed point x in [a, b].


Proof: By the assumptions, a unique fixed point exists in [a, b].

Since g([a, b]) ⊆ [a, b], {x_{n}}^{∞}_{n=0} is defined and x_{n} ∈ [a, b] for all n ≥ 0. Using the Mean Value Theorem and the fact that |g′(x)| ≤ k, we have

|x − x_{n}| = |g(x_{n−1}) − g(x)| = |g′(ξ_{n})||x − x_{n−1}| ≤ k|x − x_{n−1}|,

where ξ_{n} ∈ (a, b). It follows that

|x_{n} − x| ≤ k|x_{n−1} − x| ≤ k^{2}|x_{n−2} − x| ≤ · · · ≤ k^{n}|x_{0} − x|. (1)

Since 0 < k < 1, we have

lim_{n→∞} k^{n} = 0

and

lim_{n→∞} |x_{n} − x| ≤ lim_{n→∞} k^{n}|x_{0} − x| = 0.

Hence, {x_{n}}^{∞}_{n=0} converges to x.


**Corollary 9**

If g satisfies the hypotheses of the theorem above, then

|x − x_{n}| ≤ k^{n} max{x_{0} − a, b − x_{0}}

and

|x_{n} − x| ≤ (k^{n}/(1 − k))|x_{1} − x_{0}|, ∀ n ≥ 1.

Proof: From (1),

|x_{n} − x| ≤ k^{n}|x_{0} − x| ≤ k^{n} max{x_{0} − a, b − x_{0}}.

For n ≥ 1, using the Mean Value Theorem,

|x_{n+1} − x_{n}| = |g(x_{n}) − g(x_{n−1})| ≤ k|x_{n} − x_{n−1}| ≤ · · · ≤ k^{n}|x_{1} − x_{0}|.


Thus, for m > n ≥ 1,

|x_{m} − x_{n}| = |x_{m} − x_{m−1} + x_{m−1} − · · · + x_{n+1} − x_{n}|
             ≤ |x_{m} − x_{m−1}| + |x_{m−1} − x_{m−2}| + · · · + |x_{n+1} − x_{n}|
             ≤ k^{m−1}|x_{1} − x_{0}| + k^{m−2}|x_{1} − x_{0}| + · · · + k^{n}|x_{1} − x_{0}|
             = k^{n}|x_{1} − x_{0}|(1 + k + k^{2} + · · · + k^{m−n−1}).

It implies that

|x − x_{n}| = lim_{m→∞} |x_{m} − x_{n}| ≤ lim_{m→∞} k^{n}|x_{1} − x_{0}| Σ_{j=0}^{m−n−1} k^{j}
           ≤ k^{n}|x_{1} − x_{0}| Σ_{j=0}^{∞} k^{j} = (k^{n}/(1 − k))|x_{1} − x_{0}|.


**Example 10**

For previous example,

f (x) = x^{3}+ 4x^{2}− 10 = 0.

For g_{1}(x) = x − x^{3} − 4x^{2} + 10, we have

g_{1}(1) = 6 and g_{1}(2) = −12,

so g_{1}([1, 2]) ⊄ [1, 2]. Moreover,

g_{1}′(x) = 1 − 3x^{2} − 8x ⇒ |g_{1}′(x)| ≥ 1, ∀ x ∈ [1, 2].

• Theorem 8 does not apply, so convergence is neither guaranteed nor ruled out.


For g_{3}(x) = (1/2)(10 − x^{3})^{1/2} on [1, 1.5],

g_{3}′(x) = −(3/4)x^{2}(10 − x^{3})^{−1/2} < 0, ∀ x ∈ [1, 1.5],

so g_{3} is strictly decreasing on [1, 1.5] and

1 < 1.28 ≈ g_{3}(1.5) ≤ g_{3}(x) ≤ g_{3}(1) = 1.5, ∀ x ∈ [1, 1.5].

On the other hand,

|g_{3}′(x)| ≤ |g_{3}′(1.5)| ≈ 0.66, ∀ x ∈ [1, 1.5].

Hence, the sequence is convergent to the fixed point.


For g_{4}(x) = (10/(4 + x))^{1/2}, we have

√(10/6) ≤ g_{4}(x) ≤ √(10/5), ∀ x ∈ [1, 2] ⇒ g_{4}([1, 2]) ⊆ [1, 2].

Moreover,

|g_{4}′(x)| = |−5/(√10 (4 + x)^{3/2})| ≤ 5/(√10 · 5^{3/2}) < 0.15, ∀ x ∈ [1, 2].

The bound on |g_{4}′(x)| is much smaller than the bound on |g_{3}′(x)|, which explains the more rapid convergence using g_{4}.
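The effect of the derivative bound on speed can be seen by counting iterations; the tolerance and starting point below are arbitrary illustrative choices:

```python
def count_iterations(g, x0, tol=1e-10, M=500):
    """Return the number of iterations until |x_{k+1} - x_k| < tol."""
    for k in range(1, M + 1):
        x = g(x0)
        if abs(x - x0) < tol:
            return k
        x0 = x
    return M

g3 = lambda x: 0.5 * (10 - x**3) ** 0.5
g4 = lambda x: (10 / (4 + x)) ** 0.5
n3 = count_iterations(g3, 1.5)
n4 = count_iterations(g4, 1.5)
print(n3, n4)    # g4, with the smaller derivative bound, needs fewer iterations
```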


Suppose that f : R → R and f ∈ C^{2}[a, b], i.e., f″ exists and is continuous. If f(x^{∗}) = 0 and x^{∗} = x + h where h is small, then by Taylor's theorem

0 = f(x^{∗}) = f(x + h)
  = f(x) + f′(x)h + (1/2)f″(x)h^{2} + (1/3!)f‴(x)h^{3} + · · ·
  = f(x) + f′(x)h + O(h^{2}).

Since h is small, O(h^{2}) is negligible, and it is reasonable to drop the O(h^{2}) terms. This implies

f(x) + f′(x)h ≈ 0 and h ≈ −f(x)/f′(x), if f′(x) ≠ 0.

Hence

x + h = x − f(x)/f′(x)

is a better approximation to x^{∗}.


This sets the stage for the Newton–Raphson method, which starts with an initial approximation x_{0} and generates the sequence {x_{n}}^{∞}_{n=0} defined by

x_{n+1} = x_{n} − f(x_{n})/f′(x_{n}).

Since the Taylor expansion of f(x) at x_{k} is given by

f(x) = f(x_{k}) + f′(x_{k})(x − x_{k}) + (1/2)f″(x_{k})(x − x_{k})^{2} + · · · ,

at x_{k} one uses the tangent line

y = ℓ(x) = f(x_{k}) + f′(x_{k})(x − x_{k})

to approximate the curve of f(x), and uses the zero of the tangent line to approximate the zero of f(x).


**Newton’s Method**

Given x_{0}, tolerance TOL, maximum number of iterations M.
Set i = 1 and x = x_{0} − f(x_{0})/f′(x_{0}).

While i ≤ M and |x − x_{0}| ≥ TOL

Set i = i + 1, x_{0} = x and x = x_{0} − f(x_{0})/f′(x_{0}).

End While
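A Python sketch of this loop, using the f(x) = x² − 1 example from below with x_{0} = 2 (the tolerance is an illustrative choice):

```python
def newton(f, fprime, x0, tol=1e-12, M=50):
    """Newton's method: follow the tangent line at x0 to its zero at each step."""
    for _ in range(M):
        x = x0 - f(x0) / fprime(x0)
        if abs(x - x0) < tol:
            return x
        x0 = x
    raise RuntimeError("Newton's method did not converge in M steps")

r = newton(lambda x: x**2 - 1, lambda x: 2 * x, 2.0)
print(r)   # close to the root 1.0
```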


**Three stopping-technique inequalities**

(a). |x_{n} − x_{n−1}| < ε,
(b). |x_{n} − x_{n−1}|/|x_{n}| < ε, x_{n} ≠ 0,
(c). |f(x_{n})| < ε.

Note that Newton's method for solving f(x) = 0,

x_{n+1} = x_{n} − f(x_{n})/f′(x_{n}), for n ≥ 0,

is just a special case of functional iteration in which

g(x) = x − f(x)/f′(x).


**Example 11**

The following table shows the convergence behavior of

Newton’s method applied to solving f (x) = x^{2}− 1 = 0. Observe
the quadratic convergence rate.

n   x_{n}             |e_{n}| ≡ |1 − x_{n}|
0   2.0               1
1   1.25              0.25
2   1.025             2.5e-2
3   1.0003048780488   3.048780488e-4
4   1.0000000464611   4.64611e-8
5   1.0               0


**Theorem 12**

Assume f(x^{∗}) = 0, f′(x^{∗}) ≠ 0, and f(x), f′(x), f″(x) are continuous on N_{ε}(x^{∗}). Then if x_{0} is chosen sufficiently close to x^{∗},

x_{n+1} = x_{n} − f(x_{n})/f′(x_{n}) → x^{∗}.

Proof: Define

g(x) = x − f(x)/f′(x).

Find an interval [x^{∗} − δ, x^{∗} + δ] such that

g([x^{∗} − δ, x^{∗} + δ]) ⊆ [x^{∗} − δ, x^{∗} + δ]

and

|g′(x)| ≤ k < 1, ∀ x ∈ (x^{∗} − δ, x^{∗} + δ).


Since f′ is continuous and f′(x^{∗}) ≠ 0, ∃ δ_{1} > 0 such that f′(x) ≠ 0 ∀ x ∈ [x^{∗} − δ_{1}, x^{∗} + δ_{1}] ⊆ [a, b]. Thus, g is defined and continuous on [x^{∗} − δ_{1}, x^{∗} + δ_{1}]. Also

g′(x) = 1 − (f′(x)f′(x) − f(x)f″(x))/[f′(x)]^{2} = f(x)f″(x)/[f′(x)]^{2},

for x ∈ [x^{∗} − δ_{1}, x^{∗} + δ_{1}]. Since f″ is continuous on [a, b], g′ is continuous on [x^{∗} − δ_{1}, x^{∗} + δ_{1}].

By assumption f(x^{∗}) = 0, so

g′(x^{∗}) = f(x^{∗})f″(x^{∗})/[f′(x^{∗})]^{2} = 0.

Since g′ is continuous on [x^{∗} − δ_{1}, x^{∗} + δ_{1}] and g′(x^{∗}) = 0, ∃ δ with 0 < δ < δ_{1} and k ∈ (0, 1) such that

|g′(x)| ≤ k, ∀ x ∈ [x^{∗} − δ, x^{∗} + δ].


Claim: g([x^{∗} − δ, x^{∗} + δ]) ⊆ [x^{∗} − δ, x^{∗} + δ].

If x ∈ [x^{∗} − δ, x^{∗} + δ], then, by the Mean Value Theorem, ∃ ξ between x and x^{∗} such that

|g(x) − g(x^{∗})| = |g′(ξ)||x − x^{∗}|.

It implies that

|g(x) − x^{∗}| = |g(x) − g(x^{∗})| = |g′(ξ)||x − x^{∗}| ≤ k|x − x^{∗}| < |x − x^{∗}| ≤ δ.

Hence, g([x^{∗} − δ, x^{∗} + δ]) ⊆ [x^{∗} − δ, x^{∗} + δ].

By the Fixed-Point Theorem, the sequence {x_{n}}^{∞}_{n=0} defined by

x_{n} = g(x_{n−1}) = x_{n−1} − f(x_{n−1})/f′(x_{n−1}), for n ≥ 1,

converges to x^{∗} for any x_{0} ∈ [x^{∗} − δ, x^{∗} + δ].


**Example 13**

When Newton's method is applied to f(x) = cos x with starting point x_{0} = 3, which is close to the root π/2 of f, it produces x_{1} = −4.01525, x_{2} = −4.8526, . . ., which converges to another root, −3π/2.

(Figure: graph of y = cos(x), showing the starting point x_{0} = 3 and the root x^{∗}.)


**Secant method**

**Disadvantage of Newton’s method**

In many applications, the derivative f′(x) is very expensive to compute, or the function f(x) is not given by an algebraic formula, so f′(x) is not available.

By definition,

f′(x_{n−1}) = lim_{x→x_{n−1}} (f(x) − f(x_{n−1}))/(x − x_{n−1}).

Letting x = x_{n−2}, we have

f′(x_{n−1}) ≈ (f(x_{n−2}) − f(x_{n−1}))/(x_{n−2} − x_{n−1}) = (f(x_{n−1}) − f(x_{n−2}))/(x_{n−1} − x_{n−2}).

Using this approximation for f′(x_{n−1}) in Newton's formula gives

x_{n} = x_{n−1} − f(x_{n−1})(x_{n−1} − x_{n−2})/(f(x_{n−1}) − f(x_{n−2})).



From a geometric point of view, we use a secant line through (x_{n−1}, f(x_{n−1})) and (x_{n−2}, f(x_{n−2})) instead of the tangent line to approximate the function at the point x_{n−1}.

The slope of the secant line is

s_{n−1} = (f(x_{n−1}) − f(x_{n−2}))/(x_{n−1} − x_{n−2})

and its equation is

M(x) = f(x_{n−1}) + s_{n−1}(x − x_{n−1}).

The zero of the secant line,

x = x_{n−1} − f(x_{n−1})/s_{n−1} = x_{n−1} − f(x_{n−1})(x_{n−1} − x_{n−2})/(f(x_{n−1}) − f(x_{n−2})),

is then used as the new approximation x_{n}.


**Secant Method**

Given x_{0}, x_{1}, tolerance TOL, maximum number of iterations M.
Set i = 2; y_{0} = f(x_{0}); y_{1} = f(x_{1});

x = x_{1} − y_{1}(x_{1} − x_{0})/(y_{1} − y_{0}).

While i ≤ M and |x − x_{1}| ≥ TOL

Set i = i + 1; x_{0} = x_{1}; y_{0} = y_{1}; x_{1} = x; y_{1} = f(x);

x = x_{1} − y_{1}(x_{1} − x_{0})/(y_{1} − y_{0}).

End While
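The same loop in Python; the test function (from the earlier examples) and tolerances are illustrative:

```python
def secant(f, x0, x1, tol=1e-12, M=100):
    """Secant method: replace f' in Newton's formula by a difference quotient."""
    y0, y1 = f(x0), f(x1)
    for _ in range(M):
        x = x1 - y1 * (x1 - x0) / (y1 - y0)
        if abs(x - x1) < tol:
            return x
        x0, y0 = x1, y1
        x1, y1 = x, f(x)
    raise RuntimeError("secant method did not converge in M steps")

r = secant(lambda x: x**3 + 4*x**2 - 10, 1.0, 2.0)
print(round(r, 9))   # close to 1.365230013
```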


**Method of False Position**

**1** Choose initial approximations x_{0} and x_{1} with f(x_{0})f(x_{1}) < 0.

**2** x_{2} = x_{1} − f(x_{1})(x_{1} − x_{0})/(f(x_{1}) − f(x_{0})).

**3** Decide which secant line to use to compute x_{3}:
If f(x_{2})f(x_{1}) < 0, then x_{1} and x_{2} bracket a root, i.e.,

x_{3} = x_{2} − f(x_{2})(x_{2} − x_{1})/(f(x_{2}) − f(x_{1}))

Else, x_{0} and x_{2} bracket a root, i.e.,

x_{3} = x_{2} − f(x_{2})(x_{2} − x_{0})/(f(x_{2}) − f(x_{0}))

End if


**Method of False Position**

Given x_{0}, x_{1}, tolerance TOL, maximum number of iterations M.
Set i = 2; y_{0} = f(x_{0}); y_{1} = f(x_{1}); x = x_{1} − y_{1}(x_{1} − x_{0})/(y_{1} − y_{0}).

While i ≤ M and |x − x_{1}| ≥ TOL
Set i = i + 1; y = f(x).

If y · y_{1} < 0, then set x_{0} = x_{1}; y_{0} = y_{1}.

Set x_{1} = x; y_{1} = y; x = x_{1} − y_{1}(x_{1} − x_{0})/(y_{1} − y_{0}).

End While
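The bracketing variant can be sketched the same way (test function from the earlier examples; tolerances illustrative):

```python
def false_position(f, x0, x1, tol=1e-10, M=200):
    """Like the secant method, but always keeps a sign-changing bracket."""
    y0, y1 = f(x0), f(x1)
    assert y0 * y1 < 0, "x0 and x1 must bracket a root"
    x = x1 - y1 * (x1 - x0) / (y1 - y0)
    for _ in range(M):
        y = f(x)
        if abs(x - x1) < tol:
            return x
        if y * y1 < 0:            # x and x1 bracket the root: discard x0
            x0, y0 = x1, y1
        x1, y1 = x, y
        x = x1 - y1 * (x1 - x0) / (y1 - y0)
    return x

r = false_position(lambda x: x**3 + 4*x**2 - 10, 1.0, 2.0)
print(round(r, 6))
```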


**Error analysis for iterative methods**

**Definition 14**

Let {x_{n}} → x^{∗}. If there are positive constants c and α such that

lim_{n→∞} |x_{n+1} − x^{∗}|/|x_{n} − x^{∗}|^{α} = c,

then we say the rate of convergence is of order α.

We say that the rate of convergence is

**1** linear if α = 1 and 0 < c < 1;

**2** superlinear if

lim_{n→∞} |x_{n+1} − x^{∗}|/|x_{n} − x^{∗}| = 0;

**3** quadratic if α = 2.


Suppose that {x_{n}}^{∞}_{n=0} and {x̃_{n}}^{∞}_{n=0} are linearly and quadratically convergent to x^{∗}, respectively, with the same constant c = 0.5. For simplicity, suppose that

|x_{n+1} − x^{∗}|/|x_{n} − x^{∗}| ≈ c and |x̃_{n+1} − x^{∗}|/|x̃_{n} − x^{∗}|^{2} ≈ c.

These imply that

|x_{n} − x^{∗}| ≈ c|x_{n−1} − x^{∗}| ≈ c^{2}|x_{n−2} − x^{∗}| ≈ · · · ≈ c^{n}|x_{0} − x^{∗}|,

and

|x̃_{n} − x^{∗}| ≈ c|x̃_{n−1} − x^{∗}|^{2} ≈ c(c|x̃_{n−2} − x^{∗}|^{2})^{2} = c^{3}|x̃_{n−2} − x^{∗}|^{4}
             ≈ c^{3}(c|x̃_{n−3} − x^{∗}|^{2})^{4} = c^{7}|x̃_{n−3} − x^{∗}|^{8}
             ≈ · · · ≈ c^{2^{n}−1}|x̃_{0} − x^{∗}|^{2^{n}}.
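The difference is easy to see numerically: with c = 0.5 and unit initial errors, the two bounds after n steps are c^n and c^(2^n − 1) (a small illustrative tabulation):

```python
c = 0.5
for n in range(1, 8):
    # linear bound c^n vs quadratic bound c^(2^n - 1)
    print(n, c ** n, c ** (2 ** n - 1))
```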


**Remark**

Quadratically convergent sequences generally converge much more quickly than those that converge only linearly.

**Theorem 15**

Let g ∈ C[a, b] with g([a, b]) ⊆ [a, b]. Suppose that g′ is continuous on (a, b) and ∃ k ∈ (0, 1) such that

|g′(x)| ≤ k, ∀ x ∈ (a, b).

If g′(x^{∗}) ≠ 0, then for any x_{0} ∈ [a, b], the sequence

x_{n} = g(x_{n−1}), for n ≥ 1,

converges only linearly to the unique fixed point x^{∗} in [a, b].


Proof:

By the Fixed-Point Theorem, the sequence {x_{n}}^{∞}_{n=0} converges to x^{∗}.

Since g′ exists on (a, b), by the Mean Value Theorem, ∃ ξ_{n} between x_{n} and x^{∗} such that

x_{n+1} − x^{∗} = g(x_{n}) − g(x^{∗}) = g′(ξ_{n})(x_{n} − x^{∗}).

∵ {x_{n}}^{∞}_{n=0} → x^{∗} ⇒ {ξ_{n}}^{∞}_{n=0} → x^{∗}.

Since g′ is continuous on (a, b), we have

lim_{n→∞} g′(ξ_{n}) = g′(x^{∗}).

Thus,

lim_{n→∞} |x_{n+1} − x^{∗}|/|x_{n} − x^{∗}| = lim_{n→∞} |g′(ξ_{n})| = |g′(x^{∗})|.

Hence, if g′(x^{∗}) ≠ 0, fixed-point iteration exhibits linear convergence.


**Theorem 16**

Let x^{∗} be a fixed point of g and I be an open interval with x^{∗} ∈ I. Suppose that g′(x^{∗}) = 0 and g″ is continuous with

|g″(x)| < M, ∀ x ∈ I.

Then ∃ δ > 0 such that

{x_{n} = g(x_{n−1})}^{∞}_{n=1} → x^{∗} for x_{0} ∈ [x^{∗} − δ, x^{∗} + δ]

at least quadratically. Moreover,

|x_{n+1} − x^{∗}| < (M/2)|x_{n} − x^{∗}|^{2}, for sufficiently large n.


Proof:

Since g′(x^{∗}) = 0 and g′ is continuous on I, ∃ δ such that [x^{∗} − δ, x^{∗} + δ] ⊂ I and

|g′(x)| ≤ k < 1, ∀ x ∈ [x^{∗} − δ, x^{∗} + δ].

As in the proof of convergence for Newton's method, we have

{x_{n}}^{∞}_{n=0} ⊂ [x^{∗} − δ, x^{∗} + δ].

Consider the Taylor expansion of g(x_{n}) at x^{∗}:

x_{n+1} = g(x_{n}) = g(x^{∗}) + g′(x^{∗})(x_{n} − x^{∗}) + (g″(ξ_{n})/2)(x_{n} − x^{∗})^{2}
        = x^{∗} + (g″(ξ_{n})/2)(x_{n} − x^{∗})^{2},

where ξ_{n} lies between x_{n} and x^{∗}.


Since

|g′(x)| ≤ k < 1, ∀ x ∈ [x^{∗} − δ, x^{∗} + δ]

and

g([x^{∗} − δ, x^{∗} + δ]) ⊆ [x^{∗} − δ, x^{∗} + δ],

it follows that {x_{n}}^{∞}_{n=0} converges to x^{∗}. But ξ_{n} is between x_{n} and x^{∗} for each n, so {ξ_{n}}^{∞}_{n=0} also converges to x^{∗}, and

lim_{n→∞} |x_{n+1} − x^{∗}|/|x_{n} − x^{∗}|^{2} = |g″(x^{∗})|/2 < M/2.

It implies that {x_{n}}^{∞}_{n=0} is quadratically convergent to x^{∗} if g″(x^{∗}) ≠ 0, and

|x_{n+1} − x^{∗}| < (M/2)|x_{n} − x^{∗}|^{2}, for sufficiently large n.


For Newton's method, g(x) = x − f(x)/f′(x), so

g′(x) = 1 − (f′(x)f′(x) − f(x)f″(x))/(f′(x))^{2} = f(x)f″(x)/(f′(x))^{2}.

It follows that g′(x^{∗}) = 0. Hence Newton's method is locally quadratically convergent.


**Error Analysis of Secant Method**

Reference: D. Kincaid and W. Cheney, "Numerical Analysis".

Let x^{∗} denote the exact solution of f(x) = 0 and e_{k} = x_{k} − x^{∗} the error at the k-th step. Then

e_{k+1} = x_{k+1} − x^{∗}
        = x_{k} − f(x_{k})(x_{k} − x_{k−1})/(f(x_{k}) − f(x_{k−1})) − x^{∗}
        = [(x_{k−1} − x^{∗})f(x_{k}) − (x_{k} − x^{∗})f(x_{k−1})]/(f(x_{k}) − f(x_{k−1}))
        = (e_{k−1}f(x_{k}) − e_{k}f(x_{k−1}))/(f(x_{k}) − f(x_{k−1}))
        = e_{k}e_{k−1} · [(f(x_{k})/e_{k} − f(x_{k−1})/e_{k−1})/(x_{k} − x_{k−1})] · [(x_{k} − x_{k−1})/(f(x_{k}) − f(x_{k−1}))].


To estimate the numerator (f(x_{k})/e_{k} − f(x_{k−1})/e_{k−1})/(x_{k} − x_{k−1}), we apply Taylor's Theorem,

f(x_{k}) = f(x^{∗} + e_{k}) = f(x^{∗}) + f′(x^{∗})e_{k} + (1/2)f″(x^{∗})e_{k}^{2} + O(e_{k}^{3}),

to get

f(x_{k})/e_{k} = f′(x^{∗}) + (1/2)f″(x^{∗})e_{k} + O(e_{k}^{2}).

Similarly,

f(x_{k−1})/e_{k−1} = f′(x^{∗}) + (1/2)f″(x^{∗})e_{k−1} + O(e_{k−1}^{2}).

Hence

f(x_{k})/e_{k} − f(x_{k−1})/e_{k−1} ≈ (1/2)(e_{k} − e_{k−1})f″(x^{∗}).

Since x_{k} − x_{k−1} = e_{k} − e_{k−1} and

(x_{k} − x_{k−1})/(f(x_{k}) − f(x_{k−1})) → 1/f′(x^{∗}),


we have

e_{k+1} ≈ e_{k}e_{k−1} · [(1/2)(e_{k} − e_{k−1})f″(x^{∗})/(e_{k} − e_{k−1})] · [1/f′(x^{∗})]
        = (1/2)(f″(x^{∗})/f′(x^{∗}))e_{k}e_{k−1} ≡ Ce_{k}e_{k−1}. (2)

To estimate the convergence rate, we assume

|e_{k+1}| ≈ η|e_{k}|^{α},

where η > 0 and α > 0 are constants, i.e.,

|e_{k+1}|/(η|e_{k}|^{α}) → 1 as k → ∞.

Then |e_{k}| ≈ η|e_{k−1}|^{α}, which implies |e_{k−1}| ≈ η^{−1/α}|e_{k}|^{1/α}. Hence (2) gives

η|e_{k}|^{α} ≈ C|e_{k}|η^{−1/α}|e_{k}|^{1/α} ⇒ C^{−1}η^{1+1/α} ≈ |e_{k}|^{1−α+1/α}.


Since |e_{k}| → 0 as k → ∞ and C^{−1}η^{1+1/α} is a nonzero constant,

1 − α + 1/α = 0 ⇒ α = (1 + √5)/2 ≈ 1.62.

This result implies that C^{−1}η^{1+1/α} → 1 and

η → C^{α/(1+α)} = (f″(x^{∗})/(2f′(x^{∗})))^{0.62}.

In summary, we have shown that

|e_{k+1}| ≈ η|e_{k}|^{α}, α ≈ 1.62,

that is, the rate of convergence is superlinear.

**Rate of convergence**

secant method: superlinear
Newton's method: quadratic
bisection method: linear
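The order α can be observed numerically by running the secant method and fitting α ≈ log|e_{k+1}/e_{k}| / log|e_{k}/e_{k−1}| over successive errors. The function f(x) = x² − 2 and the starting points are illustrative choices; the estimates oscillate around 1.62 and degrade once rounding error dominates:

```python
import math

f = lambda x: x**2 - 2
root = math.sqrt(2)

xs = [1.0, 2.0]
for _ in range(5):                      # a few secant steps
    x0, x1 = xs[-2], xs[-1]
    xs.append(x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0)))

errs = [abs(x - root) for x in xs]
orders = [math.log(errs[k + 1] / errs[k]) / math.log(errs[k] / errs[k - 1])
          for k in range(3, 6)]
print(orders)                           # rough estimates of alpha
```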


Each iteration requires:

secant method: one function evaluation;
Newton's method: two function evaluations, namely f(x_{k}) and f′(x_{k}).

⇒ Two steps of the secant method are comparable in cost to one step of Newton's method. Thus

|e_{k+2}| ≈ η|e_{k+1}|^{α} ≈ η^{1+α}|e_{k}|^{α^{2}} = η^{1+α}|e_{k}|^{(3+√5)/2} ≈ η^{1+α}|e_{k}|^{2.62}.

⇒ In this sense, the secant method is more efficient than Newton's method.

**Remark**

Two steps of the secant method require a little more work than one step of Newton's method.


**Aitken's ∆² method**

Used to accelerate the convergence of a sequence that is linearly convergent.

Suppose {y_{n}}^{∞}_{n=0} is a linearly convergent sequence with limit y. We construct a sequence {ŷ_{n}}^{∞}_{n=0} that converges more rapidly to y than {y_{n}}^{∞}_{n=0}.

For n sufficiently large,

(y_{n+1} − y)/(y_{n} − y) ≈ (y_{n+2} − y)/(y_{n+1} − y).

Then

(y_{n+1} − y)^{2} ≈ (y_{n+2} − y)(y_{n} − y),

so

y_{n+1}^{2} − 2y_{n+1}y + y^{2} ≈ y_{n+2}y_{n} − (y_{n+2} + y_{n})y + y^{2}


and

(y_{n+2} + y_{n} − 2y_{n+1})y ≈ y_{n+2}y_{n} − y_{n+1}^{2}.

Solving for y gives

y ≈ (y_{n+2}y_{n} − y_{n+1}^{2})/(y_{n+2} − 2y_{n+1} + y_{n})
  = (y_{n}y_{n+2} − 2y_{n}y_{n+1} + y_{n}^{2} − y_{n}^{2} + 2y_{n}y_{n+1} − y_{n+1}^{2})/(y_{n+2} − 2y_{n+1} + y_{n})
  = (y_{n}(y_{n+2} − 2y_{n+1} + y_{n}) − (y_{n+1} − y_{n})^{2})/((y_{n+2} − y_{n+1}) − (y_{n+1} − y_{n}))
  = y_{n} − (y_{n+1} − y_{n})^{2}/((y_{n+2} − y_{n+1}) − (y_{n+1} − y_{n})).

**Aitken's ∆² method**

ŷ_{n} = y_{n} − (y_{n+1} − y_{n})^{2}/((y_{n+2} − y_{n+1}) − (y_{n+1} − y_{n})). (3)


**Example 17**

The sequence {y_{n} = cos(1/n)}^{∞}_{n=1} converges linearly to y = 1.

n   y_{n}     ŷ_{n}
1   0.54030   0.96178
2   0.87758   0.98213
3   0.94496   0.98979
4   0.96891   0.99342
5   0.98007   0.99541
6   0.98614
7   0.98981

{ŷ_{n}}^{∞}_{n=1} converges more rapidly to y = 1 than {y_{n}}^{∞}_{n=1}.
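The ŷ_{n} column can be reproduced by applying formula (3) directly (rounding to five digits as in the table):

```python
import math

y = [math.cos(1 / n) for n in range(1, 8)]    # y_1, ..., y_7
yhat = [y[n] - (y[n + 1] - y[n]) ** 2
        / ((y[n + 2] - y[n + 1]) - (y[n + 1] - y[n]))
        for n in range(5)]                    # Aitken transforms of y_1, ..., y_5
print([round(v, 5) for v in yhat])            # matches the table's second column
```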


**Definition 18**

For a given sequence {y_{n}}^{∞}_{n=0}, the forward difference ∆y_{n} is defined by

∆y_{n} = y_{n+1} − y_{n}, for n ≥ 0.

Higher powers of ∆ are defined recursively by

∆^{k}y_{n} = ∆(∆^{k−1}y_{n}), for k ≥ 2.

The definition implies that

∆^{2}y_{n} = ∆(y_{n+1} − y_{n}) = ∆y_{n+1} − ∆y_{n} = (y_{n+2} − y_{n+1}) − (y_{n+1} − y_{n}).

So the formula for ŷ_{n} in (3) can be written as

ŷ_{n} = y_{n} − (∆y_{n})^{2}/∆^{2}y_{n}, for n ≥ 0.


**Theorem 19**

Suppose {y_{n}}^{∞}_{n=0} → y linearly and

lim_{n→∞} (y_{n+1} − y)/(y_{n} − y) < 1.

Then {ŷ_{n}}^{∞}_{n=0} → y faster than {y_{n}}^{∞}_{n=0} in the sense that

lim_{n→∞} (ŷ_{n} − y)/(y_{n} − y) = 0.

Aitken's ∆^{2} method constructs the terms in order:

y_{0}, y_{1} = g(y_{0}), y_{2} = g(y_{1}), ŷ_{0} = ∆^{2}(y_{0}), y_{3} = g(y_{2}), ŷ_{1} = ∆^{2}(y_{1}), . . . .

⇒ It is assumed that |ŷ_{0} − y| < |y_{2} − y|.


Steffensen's method constructs the terms in order:

y_{0}^{(0)} ≡ y_{0}, y_{1}^{(0)} = g(y_{0}^{(0)}), y_{2}^{(0)} = g(y_{1}^{(0)}),
y_{0}^{(1)} = ∆^{2}(y_{0}^{(0)}), y_{1}^{(1)} = g(y_{0}^{(1)}), y_{2}^{(1)} = g(y_{1}^{(1)}), . . . .

**Steffensen's method (To find a solution of y = g(y))**

Given y_{0}, tolerance Tol, maximum number of iterations M. Set i = 1.

While i ≤ M

Set y_{1} = g(y_{0}); y_{2} = g(y_{1}); y = y_{0} − (y_{1} − y_{0})^{2}/(y_{2} − 2y_{1} + y_{0}).

If |y − y_{0}| < T ol, then STOP.

Set i = i + 1; y_{0} = y.

End While
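The loop above can be sketched in Python; the iteration function g(x) = cos x, the tolerance, and the zero-denominator guard are illustrative choices:

```python
import math

def steffensen(g, y0, tol=1e-10, M=100):
    """Steffensen's method: one Aitken step per pair of evaluations of g."""
    for _ in range(M):
        y1 = g(y0)
        y2 = g(y1)
        d = y2 - 2 * y1 + y0
        if d == 0:
            return y0            # already (numerically) at the fixed point
        y = y0 - (y1 - y0) ** 2 / d
        if abs(y - y0) < tol:
            return y
        y0 = y
    raise RuntimeError("Steffensen's method did not converge in M steps")

p = steffensen(math.cos, 1.0)
print(round(p, 9))   # fixed point of cos, about 0.739085133
```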
**Theorem 20**

Suppose x = g(x) has solution x^{∗} with g′(x^{∗}) ≠ 1. If ∃ δ > 0 such that g ∈ C^{3}[x^{∗} − δ, x^{∗} + δ], then Steffensen's method gives quadratic convergence for any x_{0} ∈ [x^{∗} − δ, x^{∗} + δ].


**Zeros of polynomials and Müller's method**

• Horner's method:

Let

P(x) = a_{0} + a_{1}x + a_{2}x^{2} + · · · + a_{n−1}x^{n−1} + a_{n}x^{n}
     = a_{0} + x(a_{1} + x(a_{2} + · · · + x(a_{n−1} + a_{n}x) · · · )).

If

b_{n} = a_{n},
b_{k} = a_{k} + b_{k+1}x_{0}, for k = n − 1, n − 2, . . . , 1, 0,

then

b_{0} = a_{0} + b_{1}x_{0} = a_{0} + (a_{1} + b_{2}x_{0})x_{0} = · · · = P(x_{0}).

Consider

Q(x) = b_{1} + b_{2}x + · · · + b_{n}x^{n−1}.


Then

b_{0} + (x − x_{0})Q(x) = b_{0} + (x − x_{0})(b_{1} + b_{2}x + · · · + b_{n}x^{n−1})
                      = (b_{0} − b_{1}x_{0}) + (b_{1} − b_{2}x_{0})x + · · · + (b_{n−1} − b_{n}x_{0})x^{n−1} + b_{n}x^{n}
                      = a_{0} + a_{1}x + · · · + a_{n}x^{n} = P(x).

Differentiating P(x) with respect to x gives

P′(x) = Q(x) + (x − x_{0})Q′(x) and P′(x_{0}) = Q(x_{0}).

Use the Newton–Raphson method to find an approximate zero of P(x):

x_{k+1} = x_{k} − P(x_{k})/Q(x_{k}), ∀ k = 0, 1, 2, . . . .

Similarly, let

c_{n} = b_{n} = a_{n},
c_{k} = b_{k} + c_{k+1}x_{k}, for k = n − 1, n − 2, . . . , 1;

then c_{1} = Q(x_{k}).


**Horner's method (Evaluate y = P(x_{0}) and z = P′(x_{0}))**

Set y = a_{n}; z = a_{n}.

For j = n − 1, n − 2, . . . , 1
Set y = a_{j} + yx_{0}; z = y + zx_{0}.
End for

Set y = a_{0} + yx_{0}.
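The evaluation scheme above, transcribed into Python (the coefficient order and test polynomial are illustrative):

```python
def horner(coeffs, x0):
    """Evaluate P(x0) and P'(x0), where coeffs = [a_0, a_1, ..., a_n]."""
    n = len(coeffs) - 1
    y = coeffs[n]            # y accumulates b_k, ending at b_0 = P(x0)
    z = coeffs[n]            # z accumulates Q(x0) = P'(x0)
    for j in range(n - 1, 0, -1):
        y = coeffs[j] + y * x0
        z = y + z * x0
    y = coeffs[0] + y * x0
    return y, z

# P(x) = x^3 + 4x^2 - 10: P(1) = -5 and P'(1) = 3 + 8 = 11
print(horner([-10, 0, 4, 1], 1.0))   # (-5.0, 11.0)
```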
If x_{N} is an approximate zero of P, then

P(x) = (x − x_{N})Q(x) + b_{0} = (x − x_{N})Q(x) + P(x_{N})
     ≈ (x − x_{N})Q(x) ≡ (x − x̂_{1})Q_{1}(x).

So x − x̂_{1} is an approximate factor of P(x), and we can find a second approximate zero of P by applying Newton's method to Q_{1}(x). This procedure is called deflation.


• Müller's method for complex roots:

**Theorem 21**

If z = a + bi is a complex zero of multiplicity m of P(x) with real coefficients, then z̄ = a − bi is also a zero of multiplicity m of P(x), and (x^{2} − 2ax + a^{2} + b^{2})^{m} is a factor of P(x).

Secant method: given p_{0} and p_{1}, determine p_{2} as the intersection of the x-axis with the line through (p_{0}, f(p_{0})) and (p_{1}, f(p_{1})).

Müller's method: given p_{0}, p_{1} and p_{2}, determine p_{3} by the intersection of the x-axis with the parabola through (p_{0}, f(p_{0})), (p_{1}, f(p_{1})) and (p_{2}, f(p_{2})).


Let

P(x) = a(x − p_{2})^{2} + b(x − p_{2}) + c

be the parabola that passes through (p_{0}, f(p_{0})), (p_{1}, f(p_{1})) and (p_{2}, f(p_{2})). Then

f(p_{0}) = a(p_{0} − p_{2})^{2} + b(p_{0} − p_{2}) + c,
f(p_{1}) = a(p_{1} − p_{2})^{2} + b(p_{1} − p_{2}) + c,
f(p_{2}) = a(p_{2} − p_{2})^{2} + b(p_{2} − p_{2}) + c = c.

It implies that c = f(p_{2}),

b = [(p_{0} − p_{2})^{2}(f(p_{1}) − f(p_{2})) − (p_{1} − p_{2})^{2}(f(p_{0}) − f(p_{2}))]/[(p_{0} − p_{2})(p_{1} − p_{2})(p_{0} − p_{1})],

a = [(p_{1} − p_{2})(f(p_{0}) − f(p_{2})) − (p_{0} − p_{2})(f(p_{1}) − f(p_{2}))]/[(p_{0} − p_{2})(p_{1} − p_{2})(p_{0} − p_{1})].


To determine p_{3}, a zero of P, we apply the quadratic formula to P(x) = 0 and get

p_{3} − p_{2} = −2c/(b ± √(b^{2} − 4ac)).

Choose

p_{3} = p_{2} − 2c/(b + sgn(b)√(b^{2} − 4ac)),

so that the denominator has the largest magnitude; this results in p_{3} being selected as the zero of P closest to p_{2}.


**Müller's method (Find a solution of f(x) = 0)**

Given p_{0}, p_{1}, p_{2}; tolerance TOL; maximum number of iterations M.
Set h_{1} = p_{1} − p_{0}; h_{2} = p_{2} − p_{1};
δ_{1} = (f(p_{1}) − f(p_{0}))/h_{1}; δ_{2} = (f(p_{2}) − f(p_{1}))/h_{2};
d = (δ_{2} − δ_{1})/(h_{2} + h_{1}); i = 3.

While i ≤ M

Set b = δ_{2} + h_{2}d; D = √(b^{2} − 4f(p_{2})d).

If |b − D| < |b + D|, then set E = b + D; else set E = b − D.

Set h = −2f(p_{2})/E; p = p_{2} + h.

If |h| < TOL, then STOP.

Set p_{0} = p_{1}; p_{1} = p_{2}; p_{2} = p; h_{1} = p_{1} − p_{0}; h_{2} = p_{2} − p_{1};
δ_{1} = (f(p_{1}) − f(p_{0}))/h_{1}; δ_{2} = (f(p_{2}) − f(p_{1}))/h_{2};
d = (δ_{2} − δ_{1})/(h_{2} + h_{1}); i = i + 1.

End while
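The algorithm above in Python, using `cmath.sqrt` so that complex roots can be reached from real starting points (the test functions and starting values are illustrative):

```python
import cmath

def muller(f, p0, p1, p2, tol=1e-12, M=100):
    """Muller's method: fit a parabola through three points, step to its nearer zero."""
    h1, h2 = p1 - p0, p2 - p1
    d1, d2 = (f(p1) - f(p0)) / h1, (f(p2) - f(p1)) / h2
    d = (d2 - d1) / (h2 + h1)
    for _ in range(M):
        b = d2 + h2 * d
        D = cmath.sqrt(b * b - 4 * f(p2) * d)   # may be complex
        E = b + D if abs(b - D) < abs(b + D) else b - D
        h = -2 * f(p2) / E
        p = p2 + h
        if abs(h) < tol:
            return p
        p0, p1, p2 = p1, p2, p
        h1, h2 = p1 - p0, p2 - p1
        d1, d2 = (f(p1) - f(p0)) / h1, (f(p2) - f(p1)) / h2
        d = (d2 - d1) / (h2 + h1)
    raise RuntimeError("Muller's method did not converge in M steps")

# Real root of x^3 + 4x^2 - 10 from three real starting points
r = muller(lambda x: x**3 + 4*x**2 - 10, 1.0, 1.5, 2.0)
print(r.real)

# A complex root of x^2 + 1 is reached even though p0, p1, p2 are real
r2 = muller(lambda x: x * x + 1, 0.0, 1.0, 2.0)
print(r2)
```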