Numerical solutions of nonlinear systems of equations


師大


Tsung-Ming Huang

Department of Mathematics National Taiwan Normal University, Taiwan

E-mail: min@ntnu.edu.tw

September 12, 2015


Outline

1 Fixed points for functions of several variables

2 Newton’s method

3 Quasi-Newton methods

4 Steepest Descent Techniques


Fixed points for functions of several variables

Theorem 1

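Let $f : D \subset \mathbb{R}^n \to \mathbb{R}$ be a function and $x_0 \in D$. If all the partial derivatives of $f$ exist and there are constants $\delta > 0$ and $\alpha > 0$ such that, for all $x \in D$ with $\|x - x_0\| < \delta$,

$$\left| \frac{\partial f(x)}{\partial x_j} \right| \le \alpha, \quad j = 1, 2, \ldots, n,$$

then $f$ is continuous at $x_0$.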

Definition 2 (Fixed Point)

A function $G$ from $D \subset \mathbb{R}^n$ into $\mathbb{R}^n$ has a fixed point at $p \in D$ if $G(p) = p$.


Theorem 3 (Contraction Mapping Theorem)

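Let $D = \{(x_1, \ldots, x_n)^T : a_i \le x_i \le b_i,\ i = 1, \ldots, n\} \subset \mathbb{R}^n$. Suppose $G : D \to \mathbb{R}^n$ is a continuous function with $G(x) \in D$ whenever $x \in D$. Then $G$ has a fixed point in $D$.

Suppose, in addition, that the component functions $g_i$ of $G$ have continuous partial derivatives and that a constant $\alpha < 1$ exists with

$$\left| \frac{\partial g_i(x)}{\partial x_j} \right| \le \frac{\alpha}{n}, \quad \text{whenever } x \in D,$$

for $j = 1, \ldots, n$ and $i = 1, \ldots, n$. Then, for any $x^{(0)} \in D$, the sequence

$$x^{(k)} = G(x^{(k-1)}), \quad k \ge 1,$$

converges to the unique fixed point $p \in D$ and

$$\|x^{(k)} - p\|_\infty \le \frac{\alpha^k}{1 - \alpha}\, \|x^{(1)} - x^{(0)}\|_\infty.$$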


Example 4

Consider the nonlinear system

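$$\begin{aligned}
3x_1 - \cos(x_2 x_3) - \tfrac{1}{2} &= 0,\\
x_1^2 - 81(x_2 + 0.1)^2 + \sin x_3 + 1.06 &= 0,\\
e^{-x_1 x_2} + 20 x_3 + \tfrac{10\pi - 3}{3} &= 0.
\end{aligned}$$

Fixed-point problem:

Change the system into the fixed-point problem:

$$\begin{aligned}
x_1 &= \tfrac{1}{3}\cos(x_2 x_3) + \tfrac{1}{6} \equiv g_1(x_1, x_2, x_3),\\
x_2 &= \tfrac{1}{9}\sqrt{x_1^2 + \sin x_3 + 1.06} - 0.1 \equiv g_2(x_1, x_2, x_3),\\
x_3 &= -\tfrac{1}{20} e^{-x_1 x_2} - \tfrac{10\pi - 3}{60} \equiv g_3(x_1, x_2, x_3).
\end{aligned}$$

Let $G : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $G(x) = [g_1(x), g_2(x), g_3(x)]^T$.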


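• $G$ has a unique fixed point in $D \equiv [-1, 1] \times [-1, 1] \times [-1, 1]$:

Existence: for all $x \in D$,

$$\begin{aligned}
|g_1(x)| &\le \tfrac{1}{3}|\cos(x_2 x_3)| + \tfrac{1}{6} \le 0.5,\\
|g_2(x)| &= \left| \tfrac{1}{9}\sqrt{x_1^2 + \sin x_3 + 1.06} - 0.1 \right| \le \tfrac{1}{9}\sqrt{1 + \sin 1 + 1.06} - 0.1 < 0.09,\\
|g_3(x)| &= \tfrac{1}{20} e^{-x_1 x_2} + \tfrac{10\pi - 3}{60} \le \tfrac{e}{20} + \tfrac{10\pi - 3}{60} < 0.61,
\end{aligned}$$

which implies that $G(x) \in D$ whenever $x \in D$.

Uniqueness: we have

$$\frac{\partial g_1}{\partial x_1} = 0, \quad \frac{\partial g_2}{\partial x_2} = 0 \quad \text{and} \quad \frac{\partial g_3}{\partial x_3} = 0,$$

as well as

$$\begin{aligned}
\left| \frac{\partial g_1}{\partial x_2} \right| &\le \tfrac{1}{3}|x_3| \cdot |\sin(x_2 x_3)| \le \tfrac{1}{3}\sin 1 < 0.281,\\
\left| \frac{\partial g_1}{\partial x_3} \right| &\le \tfrac{1}{3}|x_2| \cdot |\sin(x_2 x_3)| \le \tfrac{1}{3}\sin 1 < 0.281,\\
\left| \frac{\partial g_2}{\partial x_1} \right| &= \frac{|x_1|}{9\sqrt{x_1^2 + \sin x_3 + 1.06}} < \frac{1}{9\sqrt{0.218}} < 0.238,\\
\left| \frac{\partial g_2}{\partial x_3} \right| &= \frac{|\cos x_3|}{18\sqrt{x_1^2 + \sin x_3 + 1.06}} < \frac{1}{18\sqrt{0.218}} < 0.119,\\
\left| \frac{\partial g_3}{\partial x_1} \right| &= \frac{|x_2|}{20} e^{-x_1 x_2} \le \frac{e}{20} < 0.14,\\
\left| \frac{\partial g_3}{\partial x_2} \right| &= \frac{|x_1|}{20} e^{-x_1 x_2} \le \frac{e}{20} < 0.14.
\end{aligned}$$

These bounds imply that $g_1$, $g_2$ and $g_3$ are continuous on $D$ (Theorem 1) and that, for all $x \in D$,

$$\left| \frac{\partial g_i}{\partial x_j} \right| \le 0.281, \quad \forall\, i, j,$$

so the condition of Theorem 3 holds with $n = 3$ and $\alpha = 3 \times 0.281 = 0.843 < 1$.

Similarly, $\partial g_i / \partial x_j$ are continuous on $D$ for all $i$ and $j$. Consequently, $G$ has a unique fixed point in $D$.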
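The bounds above can be sanity-checked numerically. The following is a minimal sketch, not a proof: it samples $G$ and the six nonzero partial derivatives on a uniform grid over $D$ and reports the largest values found. The grid resolution `m` and the helper names `G` and `partials` are illustrative assumptions.

```python
import math

def G(x1, x2, x3):
    """Fixed-point map of Example 4."""
    g1 = math.cos(x2 * x3) / 3.0 + 1.0 / 6.0
    g2 = math.sqrt(x1 * x1 + math.sin(x3) + 1.06) / 9.0 - 0.1
    g3 = -math.exp(-x1 * x2) / 20.0 - (10.0 * math.pi - 3.0) / 60.0
    return g1, g2, g3

def partials(x1, x2, x3):
    """Absolute values of the six nonzero partial derivatives of G."""
    s = math.sqrt(x1 * x1 + math.sin(x3) + 1.06)
    e = math.exp(-x1 * x2)
    return [
        abs(x3 * math.sin(x2 * x3)) / 3.0,  # |dg1/dx2|
        abs(x2 * math.sin(x2 * x3)) / 3.0,  # |dg1/dx3|
        abs(x1) / (9.0 * s),                # |dg2/dx1|
        abs(math.cos(x3)) / (18.0 * s),     # |dg2/dx3|
        abs(x2) * e / 20.0,                 # |dg3/dx1|
        abs(x1) * e / 20.0,                 # |dg3/dx2|
    ]

m = 20  # grid intervals per axis (arbitrary choice for this sketch)
grid = [-1.0 + 2.0 * i / m for i in range(m + 1)]
max_g = 0.0
max_dg = 0.0
for a in grid:
    for b in grid:
        for c in grid:
            max_g = max(max_g, *(abs(v) for v in G(a, b, c)))
            max_dg = max(max_dg, *partials(a, b, c))

print(f"max |g_i| on grid        = {max_g:.4f}")
print(f"max |dg_i/dx_j| on grid  = {max_dg:.4f}")
```

A grid check only samples finitely many points, so it supports the analytic bounds rather than replacing them.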


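• Approximate solution:

Fixed-point iteration (I):

Choosing $x^{(0)} = [0.1, 0.1, -0.1]^T$, the sequence $\{x^{(k)}\}$ is generated by

$$\begin{aligned}
x_1^{(k)} &= \tfrac{1}{3}\cos\left(x_2^{(k-1)} x_3^{(k-1)}\right) + \tfrac{1}{6},\\
x_2^{(k)} &= \tfrac{1}{9}\sqrt{\left(x_1^{(k-1)}\right)^2 + \sin x_3^{(k-1)} + 1.06} - 0.1,\\
x_3^{(k)} &= -\tfrac{1}{20} e^{-x_1^{(k-1)} x_2^{(k-1)}} - \tfrac{10\pi - 3}{60}.
\end{aligned}$$

Result:

| $k$ | $x_1^{(k)}$ | $x_2^{(k)}$ | $x_3^{(k)}$ | $\lVert x^{(k)} - x^{(k-1)} \rVert_\infty$ |
|---|---|---|---|---|
| 0 | 0.10000000 | 0.10000000 | -0.10000000 | |
| 1 | 0.49998333 | 0.00944115 | -0.52310127 | 0.423 |
| 2 | 0.49999593 | 0.00002557 | -0.52336331 | $9.4 \times 10^{-3}$ |
| 3 | 0.50000000 | 0.00001234 | -0.52359814 | $2.3 \times 10^{-4}$ |
| 4 | 0.50000000 | 0.00000003 | -0.52359847 | $1.2 \times 10^{-5}$ |
| 5 | 0.50000000 | 0.00000002 | -0.52359877 | $3.1 \times 10^{-7}$ |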
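The iteration $x^{(k)} = G(x^{(k-1)})$ can be sketched in a few lines of Python. This is an illustrative sketch: the function name `fixed_point`, the tolerance `tol` and the cap `max_iter` are assumptions, and the stopping test uses the same quantity $\lVert x^{(k)} - x^{(k-1)} \rVert_\infty$ as the last column of the table.

```python
import math

def G(x):
    """Fixed-point map of Example 4; every component uses the old iterate."""
    x1, x2, x3 = x
    return [
        math.cos(x2 * x3) / 3.0 + 1.0 / 6.0,
        math.sqrt(x1 * x1 + math.sin(x3) + 1.06) / 9.0 - 0.1,
        -math.exp(-x1 * x2) / 20.0 - (10.0 * math.pi - 3.0) / 60.0,
    ]

def fixed_point(G, x0, tol=1e-5, max_iter=50):
    """Iterate x <- G(x) until the infinity-norm of the step drops below tol."""
    x = list(x0)
    for k in range(1, max_iter + 1):
        x_new = G(x)
        step = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if step < tol:
            return x, k
    return x, max_iter

x, k = fixed_point(G, [0.1, 0.1, -0.1])
print(k, x)
```

The limit agrees with the exact solution $(0.5, 0, -\pi/6)^T$ of the system, which can be verified by direct substitution.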


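• Approximate solution (cont.):

Accelerate convergence of the fixed-point iteration by using the newest available components,

$$\begin{aligned}
x_1^{(k)} &= \tfrac{1}{3}\cos\left(x_2^{(k-1)} x_3^{(k-1)}\right) + \tfrac{1}{6},\\
x_2^{(k)} &= \tfrac{1}{9}\sqrt{\left(x_1^{(k)}\right)^2 + \sin x_3^{(k-1)} + 1.06} - 0.1,\\
x_3^{(k)} &= -\tfrac{1}{20} e^{-x_1^{(k)} x_2^{(k)}} - \tfrac{10\pi - 3}{60},
\end{aligned}$$

as in the Gauss-Seidel method for linear systems.

Result:

| $k$ | $x_1^{(k)}$ | $x_2^{(k)}$ | $x_3^{(k)}$ | $\lVert x^{(k)} - x^{(k-1)} \rVert_\infty$ |
|---|---|---|---|---|
| 0 | 0.10000000 | 0.10000000 | -0.10000000 | |
| 1 | 0.49998333 | 0.02222979 | -0.52304613 | 0.423 |
| 2 | 0.49997747 | 0.00002815 | -0.52359807 | $2.2 \times 10^{-2}$ |
| 3 | 0.50000000 | 0.00000004 | -0.52359877 | $2.8 \times 10^{-5}$ |
| 4 | 0.50000000 | 0.00000000 | -0.52359877 | $3.8 \times 10^{-8}$ |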
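The only change relative to the plain fixed-point iteration is that each component is overwritten as soon as it is computed. A minimal sketch follows; the names `gauss_seidel_step` and `iterate` and the tolerance are illustrative assumptions.

```python
import math

def gauss_seidel_step(x):
    """One sweep: x1 is updated first, then reused for x2, and both for x3."""
    x1, x2, x3 = x
    x1 = math.cos(x2 * x3) / 3.0 + 1.0 / 6.0
    x2 = math.sqrt(x1 * x1 + math.sin(x3) + 1.06) / 9.0 - 0.1
    x3 = -math.exp(-x1 * x2) / 20.0 - (10.0 * math.pi - 3.0) / 60.0
    return [x1, x2, x3]

def iterate(step, x0, tol=1e-5, max_iter=50):
    """Repeat the sweep until the infinity-norm of the change is below tol."""
    x = list(x0)
    for k in range(1, max_iter + 1):
        x_new = step(x)
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, k
    return x, max_iter

x, k = iterate(gauss_seidel_step, [0.1, 0.1, -0.1])
print(k, x)
```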


Exercise

Page 636: 5, 7.b, 7.d


Newton’s method

First consider solving the following system of nonlinear equations:

$$\begin{cases} f_1(x_1, x_2) = 0,\\ f_2(x_1, x_2) = 0. \end{cases}$$

Suppose $(x_1^{(k)}, x_2^{(k)})$ is an approximation to the solution of the system above, and we try to compute $h_1^{(k)}$ and $h_2^{(k)}$ such that $(x_1^{(k)} + h_1^{(k)}, x_2^{(k)} + h_2^{(k)})$ satisfies the system. By Taylor's theorem for two variables,

$$\begin{aligned}
0 &= f_1(x_1^{(k)} + h_1^{(k)}, x_2^{(k)} + h_2^{(k)})\\
&\approx f_1(x_1^{(k)}, x_2^{(k)}) + h_1^{(k)} \frac{\partial f_1}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) + h_2^{(k)} \frac{\partial f_1}{\partial x_2}(x_1^{(k)}, x_2^{(k)}),\\
0 &= f_2(x_1^{(k)} + h_1^{(k)}, x_2^{(k)} + h_2^{(k)})\\
&\approx f_2(x_1^{(k)}, x_2^{(k)}) + h_1^{(k)} \frac{\partial f_2}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) + h_2^{(k)} \frac{\partial f_2}{\partial x_2}(x_1^{(k)}, x_2^{(k)}).
\end{aligned}$$


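Put this in matrix form:

$$\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) & \frac{\partial f_1}{\partial x_2}(x_1^{(k)}, x_2^{(k)})\\
\frac{\partial f_2}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) & \frac{\partial f_2}{\partial x_2}(x_1^{(k)}, x_2^{(k)})
\end{bmatrix}
\begin{bmatrix} h_1^{(k)}\\ h_2^{(k)} \end{bmatrix}
+
\begin{bmatrix} f_1(x_1^{(k)}, x_2^{(k)})\\ f_2(x_1^{(k)}, x_2^{(k)}) \end{bmatrix}
\approx
\begin{bmatrix} 0\\ 0 \end{bmatrix}.$$

The matrix

$$J(x_1^{(k)}, x_2^{(k)}) \equiv
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) & \frac{\partial f_1}{\partial x_2}(x_1^{(k)}, x_2^{(k)})\\
\frac{\partial f_2}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) & \frac{\partial f_2}{\partial x_2}(x_1^{(k)}, x_2^{(k)})
\end{bmatrix}$$

is called the Jacobian matrix. Let $h_1^{(k)}$ and $h_2^{(k)}$ be the solution of the linear system

$$J(x_1^{(k)}, x_2^{(k)})
\begin{bmatrix} h_1^{(k)}\\ h_2^{(k)} \end{bmatrix}
= -
\begin{bmatrix} f_1(x_1^{(k)}, x_2^{(k)})\\ f_2(x_1^{(k)}, x_2^{(k)}) \end{bmatrix};$$

then

$$\begin{bmatrix} x_1^{(k+1)}\\ x_2^{(k+1)} \end{bmatrix}
=
\begin{bmatrix} x_1^{(k)}\\ x_2^{(k)} \end{bmatrix}
+
\begin{bmatrix} h_1^{(k)}\\ h_2^{(k)} \end{bmatrix}$$

is expected to be a better approximation.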
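For two equations the $2 \times 2$ linear system can be solved in closed form with Cramer's rule, which gives a compact sketch of the update above. The test system $x_1^2 + x_2^2 = 4$, $x_1 x_2 = 1$, the starting point and the name `newton_2d` are illustrative assumptions and do not come from the slides.

```python
import math

def newton_2d(f1, f2, jac, x1, x2, tol=1e-12, max_iter=20):
    """Newton's iteration for two equations; jac returns (a, b, c, d) = J row by row."""
    for _ in range(max_iter):
        a, b, c, d = jac(x1, x2)          # J = [[a, b], [c, d]]
        r1, r2 = f1(x1, x2), f2(x1, x2)   # current residual F(x)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Cramer's rule for J h = -F
        h1 = (-r1 * d + b * r2) / det
        h2 = (-a * r2 + c * r1) / det
        x1, x2 = x1 + h1, x2 + h2
        if max(abs(h1), abs(h2)) < tol:
            break
    return x1, x2

# Hypothetical example system (not from the slides).
f1 = lambda x1, x2: x1 * x1 + x2 * x2 - 4.0
f2 = lambda x1, x2: x1 * x2 - 1.0
jac = lambda x1, x2: (2.0 * x1, 2.0 * x2, x2, x1)

r1, r2 = newton_2d(f1, f2, jac, 2.0, 0.5)
print(r1, r2)
```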


In general, we solve the system of $n$ nonlinear equations $f_i(x_1, \ldots, x_n) = 0$, $i = 1, \ldots, n$. Let

$$x = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$$

and

$$F(x) = \begin{bmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \end{bmatrix}^T.$$

The problem can be formulated as solving

$$F(x) = 0, \quad F : \mathbb{R}^n \to \mathbb{R}^n.$$

Let $J(x)$, whose $(i, j)$ entry is $\frac{\partial f_i}{\partial x_j}(x)$, be the $n \times n$ Jacobian matrix. Then Newton's iteration is defined as

$$x^{(k+1)} = x^{(k)} + h^{(k)},$$

where $h^{(k)} \in \mathbb{R}^n$ is the solution of the linear system

$$J(x^{(k)})\, h^{(k)} = -F(x^{(k)}).$$


Algorithm 1 (Newton’s Method for Systems)

Given a function $F : \mathbb{R}^n \to \mathbb{R}^n$, an initial guess $x^{(0)}$ for a zero of $F$, and stopping parameters $M$, $\delta$, and $\varepsilon$, this algorithm performs Newton's iteration to approximate one root of $F$.

Set $k = 0$ and $h^{(-1)} = e_1$.

While ($k < M$) and ($\|h^{(k-1)}\| \ge \delta$) and ($\|F(x^{(k)})\| \ge \varepsilon$)

  Calculate $J(x^{(k)}) = [\partial F_i(x^{(k)}) / \partial x_j]$.

  Solve the $n \times n$ linear system $J(x^{(k)})\, h^{(k)} = -F(x^{(k)})$.

  Set $x^{(k+1)} = x^{(k)} + h^{(k)}$ and $k = k + 1$.

End while

Output ("Convergent $x^{(k)}$") or ("Maximum number of iterations exceeded")


Theorem 5

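Let $x^*$ be a solution of $G(x) = x$. Suppose there exists $\delta > 0$ with

(i) $\partial g_i / \partial x_j$ is continuous on $N_\delta = \{x : \|x - x^*\| < \delta\}$ for all $i$ and $j$.

(ii) $\partial^2 g_i(x) / (\partial x_j \partial x_k)$ is continuous and

$$\left| \frac{\partial^2 g_i(x)}{\partial x_j \partial x_k} \right| \le M$$

for some $M$ whenever $x \in N_\delta$, for each $i$, $j$ and $k$.

(iii) $\partial g_i(x^*) / \partial x_k = 0$ for each $i$ and $k$.

Then there exists $\hat{\delta} < \delta$ such that the sequence $\{x^{(k)}\}$ generated by

$$x^{(k)} = G(x^{(k-1)})$$

converges quadratically to $x^*$ for any $x^{(0)}$ satisfying $\|x^{(0)} - x^*\|_\infty < \hat{\delta}$.

Moreover,

$$\|x^{(k)} - x^*\|_\infty \le \frac{n^2 M}{2}\, \|x^{(k-1)} - x^*\|_\infty^2, \quad \forall\, k \ge 1.$$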


Example 6

Consider the nonlinear system

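$$\begin{aligned}
3x_1 - \cos(x_2 x_3) - \tfrac{1}{2} &= 0,\\
x_1^2 - 81(x_2 + 0.1)^2 + \sin x_3 + 1.06 &= 0,\\
e^{-x_1 x_2} + 20 x_3 + \tfrac{10\pi - 3}{3} &= 0.
\end{aligned}$$

Nonlinear functions: Let

$$F(x_1, x_2, x_3) = [f_1(x_1, x_2, x_3), f_2(x_1, x_2, x_3), f_3(x_1, x_2, x_3)]^T,$$

where

$$\begin{aligned}
f_1(x_1, x_2, x_3) &= 3x_1 - \cos(x_2 x_3) - \tfrac{1}{2},\\
f_2(x_1, x_2, x_3) &= x_1^2 - 81(x_2 + 0.1)^2 + \sin x_3 + 1.06,\\
f_3(x_1, x_2, x_3) &= e^{-x_1 x_2} + 20 x_3 + \tfrac{10\pi - 3}{3}.
\end{aligned}$$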


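Nonlinear functions (cont.):

The Jacobian matrix $J(x)$ for this system is

$$J(x_1, x_2, x_3) =
\begin{bmatrix}
3 & x_3 \sin(x_2 x_3) & x_2 \sin(x_2 x_3)\\
2x_1 & -162(x_2 + 0.1) & \cos x_3\\
-x_2 e^{-x_1 x_2} & -x_1 e^{-x_1 x_2} & 20
\end{bmatrix}.$$

Newton's iteration with initial $x^{(0)} = [0.1, 0.1, -0.1]^T$:

$$\begin{bmatrix} x_1^{(k)}\\ x_2^{(k)}\\ x_3^{(k)} \end{bmatrix}
=
\begin{bmatrix} x_1^{(k-1)}\\ x_2^{(k-1)}\\ x_3^{(k-1)} \end{bmatrix}
-
\begin{bmatrix} h_1^{(k-1)}\\ h_2^{(k-1)}\\ h_3^{(k-1)} \end{bmatrix},$$

where

$$\begin{bmatrix} h_1^{(k-1)}\\ h_2^{(k-1)}\\ h_3^{(k-1)} \end{bmatrix}
= \left( J(x_1^{(k-1)}, x_2^{(k-1)}, x_3^{(k-1)}) \right)^{-1} F(x_1^{(k-1)}, x_2^{(k-1)}, x_3^{(k-1)}).$$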


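Result:

| $k$ | $x_1^{(k)}$ | $x_2^{(k)}$ | $x_3^{(k)}$ | $\lVert x^{(k)} - x^{(k-1)} \rVert_\infty$ |
|---|---|---|---|---|
| 0 | 0.10000000 | 0.10000000 | -0.10000000 | |
| 1 | 0.50003702 | 0.01946686 | -0.52152047 | 0.422 |
| 2 | 0.50004593 | 0.00158859 | -0.52355711 | $1.79 \times 10^{-2}$ |
| 3 | 0.50000034 | 0.00001244 | -0.52359845 | $1.58 \times 10^{-3}$ |
| 4 | 0.50000000 | 0.00000000 | -0.52359877 | $1.24 \times 10^{-5}$ |
| 5 | 0.50000000 | 0.00000000 | -0.52359877 | 0 |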
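Newton's iteration for this example can be sketched with the standard library alone. The linear solve uses a small Gaussian elimination with partial pivoting, which is one reasonable choice and not necessarily what produced the table; the names `solve` and `newton` and the tolerance are illustrative assumptions. Intermediate iterates may differ from the table in the last printed digits, depending on arithmetic precision.

```python
import math

def F(x):
    x1, x2, x3 = x
    return [
        3.0 * x1 - math.cos(x2 * x3) - 0.5,
        x1 * x1 - 81.0 * (x2 + 0.1) ** 2 + math.sin(x3) + 1.06,
        math.exp(-x1 * x2) + 20.0 * x3 + (10.0 * math.pi - 3.0) / 3.0,
    ]

def J(x):
    x1, x2, x3 = x
    s, e = math.sin(x2 * x3), math.exp(-x1 * x2)
    return [
        [3.0, x3 * s, x2 * s],
        [2.0 * x1, -162.0 * (x2 + 0.1), math.cos(x3)],
        [-x2 * e, -x1 * e, 20.0],
    ]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= m * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def newton(F, J, x0, tol=1e-10, max_iter=25):
    """x <- x - h with J(x) h = F(x), the sign convention of this example."""
    x = list(x0)
    for k in range(1, max_iter + 1):
        h = solve(J(x), F(x))
        x = [xi - hi for xi, hi in zip(x, h)]
        if max(abs(hi) for hi in h) < tol:
            return x, k
    return x, max_iter

x, k = newton(F, J, [0.1, 0.1, -0.1])
print(k, x)
```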


Exercise Page 644: 2, 8


Quasi-Newton methods

Newton's method

Advantage: quadratic convergence.

Disadvantage: each iteration requires $O(n^3) + O(n^2) + O(n)$ arithmetic operations:

• $n^2$ partial derivatives for the Jacobian matrix (in most situations, the exact evaluation of the partial derivatives is inconvenient);

• $n$ scalar function evaluations of $F$;

• $O(n^3)$ arithmetic operations to solve the linear system.

Quasi-Newton methods

Advantage: each iteration requires only $n$ scalar function evaluations and $O(n^2)$ arithmetic operations.

Disadvantage: the convergence is only superlinear rather than quadratic.

Recall that in the one-dimensional case, one uses the linear model

$$\ell_k(x) = f(x_k) + a_k (x - x_k)$$

to approximate the function $f(x)$ at $x_k$. That is, $\ell_k(x_k) = f(x_k)$ for any $a_k \in \mathbb{R}$. If we further require that $\ell_k'(x_k) = f'(x_k)$, then

$$a_k = f'(x_k).$$


The zero of $\ell_k(x)$ is used to give a new approximation to the zero of $f(x)$, that is,

$$x_{k+1} = x_k - \frac{1}{f'(x_k)} f(x_k),$$

which yields Newton's method.

If $f'(x_k)$ is not available, one instead asks the linear model to satisfy

$$\ell_k(x_k) = f(x_k) \quad \text{and} \quad \ell_k(x_{k-1}) = f(x_{k-1}).$$

In doing this, the identity

$$f(x_{k-1}) = \ell_k(x_{k-1}) = f(x_k) + a_k (x_{k-1} - x_k)$$

gives

$$a_k = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}.$$

Solving $\ell_k(x) = 0$ yields the secant iteration

$$x_{k+1} = x_k - \frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})} f(x_k).$$
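The secant iteration can be sketched directly from the slope $a_k$ of the linear model. The test function $f(x) = \cos x - x$, the two starting values and the name `secant` are illustrative assumptions, not taken from the slides.

```python
import math

def secant(f, x_prev, x_curr, tol=1e-12, max_iter=50):
    """Secant iteration: replace f'(x_k) by the slope a_k through the last two points."""
    f_prev, f_curr = f(x_prev), f(x_curr)
    for _ in range(max_iter):
        if f_curr == f_prev:
            break  # a_k = 0 (or identical points): the linear model has no zero
        a_k = (f_curr - f_prev) / (x_curr - x_prev)
        x_next = x_curr - f_curr / a_k       # zero of l_k(x)
        x_prev, f_prev = x_curr, f_curr
        x_curr, f_curr = x_next, f(x_next)
        if abs(x_curr - x_prev) < tol:
            break
    return x_curr

# Hypothetical test problem: the root of cos(x) - x.
root = secant(lambda x: math.cos(x) - x, 0.5, math.pi / 4)
print(root)
```

Each step costs one new function evaluation, since $f(x_{k-1})$ is reused from the previous step.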


In multiple dimensions, the analogous affine model becomes

$$M_k(x) = F(x^{(k)}) + A_k (x - x^{(k)}),$$

where $x, x^{(k)} \in \mathbb{R}^n$ and $A_k \in \mathbb{R}^{n \times n}$, and satisfies

$$M_k(x^{(k)}) = F(x^{(k)})$$

for any $A_k$. The zero of $M_k(x)$ is then used to give a new approximation to the zero of $F(x)$, that is,

$$x^{(k+1)} = x^{(k)} - A_k^{-1} F(x^{(k)}).$$

Newton's method chooses

$$A_k = F'(x^{(k)}) \equiv J(x^{(k)}) = \text{the Jacobian matrix}$$

and yields the iteration

$$x^{(k+1)} = x^{(k)} - \left( F'(x^{(k)}) \right)^{-1} F(x^{(k)}).$$


When the Jacobian matrix $J(x^{(k)}) \equiv F'(x^{(k)})$ is not available, one can require

$$M_k(x^{(k-1)}) = F(x^{(k-1)}).$$

Then

$$F(x^{(k-1)}) = M_k(x^{(k-1)}) = F(x^{(k)}) + A_k (x^{(k-1)} - x^{(k)}),$$

which gives

$$A_k (x^{(k)} - x^{(k-1)}) = F(x^{(k)}) - F(x^{(k-1)}),$$

and this is the so-called secant equation. Let

$$h^{(k)} = x^{(k)} - x^{(k-1)} \quad \text{and} \quad y^{(k)} = F(x^{(k)}) - F(x^{(k-1)}).$$

The secant equation becomes

$$A_k h^{(k)} = y^{(k)}.$$


However, this secant equation cannot uniquely determine $A_k$. One way of choosing $A_k$ is to minimize the change $M_k - M_{k-1}$ in the affine model subject to the secant equation. Note

$$\begin{aligned}
M_k(x) - M_{k-1}(x)
&= F(x^{(k)}) + A_k (x - x^{(k)}) - F(x^{(k-1)}) - A_{k-1} (x - x^{(k-1)})\\
&= \left( F(x^{(k)}) - F(x^{(k-1)}) \right) + A_k (x - x^{(k)}) - A_{k-1} (x - x^{(k-1)})\\
&= A_k (x^{(k)} - x^{(k-1)}) + A_k (x - x^{(k)}) - A_{k-1} (x - x^{(k-1)})\\
&= A_k (x - x^{(k-1)}) - A_{k-1} (x - x^{(k-1)})\\
&= (A_k - A_{k-1})(x - x^{(k-1)}).
\end{aligned}$$

For any $x \in \mathbb{R}^n$, we express

$$x - x^{(k-1)} = \alpha h^{(k)} + t^{(k)},$$

for some $\alpha \in \mathbb{R}$ and $t^{(k)} \in \mathbb{R}^n$ with $(h^{(k)})^T t^{(k)} = 0$. Then

$$M_k - M_{k-1} = (A_k - A_{k-1})(\alpha h^{(k)} + t^{(k)}) = \alpha (A_k - A_{k-1}) h^{(k)} + (A_k - A_{k-1}) t^{(k)}.$$


Since

$$(A_k - A_{k-1}) h^{(k)} = A_k h^{(k)} - A_{k-1} h^{(k)} = y^{(k)} - A_{k-1} h^{(k)},$$

and both $y^{(k)}$ and $A_{k-1} h^{(k)}$ are already determined, we have no control over the first part $(A_k - A_{k-1}) h^{(k)}$. In order to minimize $M_k(x) - M_{k-1}(x)$, we try to choose $A_k$ so that

$$(A_k - A_{k-1})\, t^{(k)} = 0$$

for all $t^{(k)} \in \mathbb{R}^n$ with $(h^{(k)})^T t^{(k)} = 0$. This requires $A_k - A_{k-1}$ to be a rank-one matrix of the form

$$A_k - A_{k-1} = u^{(k)} (h^{(k)})^T$$

for some $u^{(k)} \in \mathbb{R}^n$. Then

$$u^{(k)} (h^{(k)})^T h^{(k)} = (A_k - A_{k-1}) h^{(k)} = y^{(k)} - A_{k-1} h^{(k)},$$

which gives

$$u^{(k)} = \frac{y^{(k)} - A_{k-1} h^{(k)}}{(h^{(k)})^T h^{(k)}}.$$

Therefore,

$$A_k = A_{k-1} + \frac{(y^{(k)} - A_{k-1} h^{(k)}) (h^{(k)})^T}{(h^{(k)})^T h^{(k)}}. \tag{1}$$

After $A_k$ is determined, the new iterate $x^{(k+1)}$ is obtained by solving $M_k(x) = 0$. This can be done by first noting that

$$h^{(k+1)} = x^{(k+1)} - x^{(k)} \implies x^{(k+1)} = x^{(k)} + h^{(k+1)}$$

and

$$M_k(x^{(k+1)}) = 0 \implies A_k h^{(k+1)} = -F(x^{(k)}).$$

These formulations give Broyden's method.
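The two defining properties of the rank-one update (1) can be checked on a small example: the new matrix satisfies the secant equation $A_k h^{(k)} = y^{(k)}$, and it acts like $A_{k-1}$ on every vector orthogonal to $h^{(k)}$. The numbers below are made up purely for illustration, and the helper names `broyden_update` and `matvec` are assumptions.

```python
def matvec(A, v):
    """Matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def broyden_update(A_prev, h, y):
    """Return A_prev + (y - A_prev h) h^T / (h^T h), i.e. formula (1)."""
    n = len(h)
    Ah = matvec(A_prev, h)
    hh = sum(hj * hj for hj in h)
    u = [(y[i] - Ah[i]) / hh for i in range(n)]
    return [[A_prev[i][j] + u[i] * h[j] for j in range(n)] for i in range(n)]

A_prev = [[2.0, 1.0], [0.0, 3.0]]  # made-up previous approximation A_{k-1}
h = [1.0, 2.0]                     # made-up step h^(k)
y = [5.0, 4.0]                     # made-up change in F, y^(k)
t = [2.0, -1.0]                    # orthogonal to h

A = broyden_update(A_prev, h, y)
print("A_k h      =", matvec(A, h))       # should equal y
print("A_k t      =", matvec(A, t))       # should equal A_{k-1} t
print("A_{k-1} t  =", matvec(A_prev, t))
```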


Algorithm 2 (Broyden’s Method)

Given $F : \mathbb{R}^n \to \mathbb{R}^n$, an initial vector $x^{(0)}$, an initial approximation $A_0 \in \mathbb{R}^{n \times n}$ to the Jacobian matrix (e.g., $A_0 = I$), a tolerance $TOL$, and a maximum number of iterations $M$.

Set $k = 0$.

While $k < M$ and ($k = 0$ or $\|x^{(k)} - x^{(k-1)}\|_2 \ge TOL$)

  Solve $A_k h^{(k+1)} = -F(x^{(k)})$ for $h^{(k+1)}$

  Update $x^{(k+1)} = x^{(k)} + h^{(k+1)}$

  Compute $y^{(k+1)} = F(x^{(k+1)}) - F(x^{(k)})$

  Update $A_{k+1} = A_k + \dfrac{(y^{(k+1)} + F(x^{(k)})) (h^{(k+1)})^T}{(h^{(k+1)})^T h^{(k+1)}}$

  Set $k = k + 1$

End While

End While


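Solve the linear system $A_k h^{(k+1)} = -F(x^{(k)})$ for $h^{(k+1)}$:

• LU factorization: costs $\tfrac{2}{3} n^3 + O(n^2)$ floating-point operations.

• Applying the Sherman-Morrison-Woodbury formula

$$\left( B + U V^T \right)^{-1} = B^{-1} - B^{-1} U \left( I + V^T B^{-1} U \right)^{-1} V^T B^{-1}$$

to (1), we have

$$\begin{aligned}
A_k^{-1}
&= \left[ A_{k-1} + \frac{(y^{(k)} - A_{k-1} h^{(k)}) (h^{(k)})^T}{(h^{(k)})^T h^{(k)}} \right]^{-1}\\
&= A_{k-1}^{-1} - A_{k-1}^{-1} \frac{y^{(k)} - A_{k-1} h^{(k)}}{(h^{(k)})^T h^{(k)}} \left( 1 + (h^{(k)})^T A_{k-1}^{-1} \frac{y^{(k)} - A_{k-1} h^{(k)}}{(h^{(k)})^T h^{(k)}} \right)^{-1} (h^{(k)})^T A_{k-1}^{-1}\\
&= A_{k-1}^{-1} + \frac{(h^{(k)} - A_{k-1}^{-1} y^{(k)}) (h^{(k)})^T A_{k-1}^{-1}}{(h^{(k)})^T A_{k-1}^{-1} y^{(k)}}.
\end{aligned}$$

Hence $A_k^{-1}$ is obtained from $A_{k-1}^{-1}$ using only matrix-vector products, that is, $O(n^2)$ arithmetic operations.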
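Broyden's method with the inverse update above can be sketched as follows, applied to the system of Example 6. Choosing $A_0 = J(x^{(0)})$ (inverted once by Gauss-Jordan elimination) is an assumption made here because it is a common starting choice; the slides only mention $A_0 = I$ as an example. The names `inverse`, `matvec` and `broyden` are likewise illustrative.

```python
import math

def F(x):
    x1, x2, x3 = x
    return [
        3.0 * x1 - math.cos(x2 * x3) - 0.5,
        x1 * x1 - 81.0 * (x2 + 0.1) ** 2 + math.sin(x3) + 1.06,
        math.exp(-x1 * x2) + 20.0 * x3 + (10.0 * math.pi - 3.0) / 3.0,
    ]

def J(x):
    x1, x2, x3 = x
    s, e = math.sin(x2 * x3), math.exp(-x1 * x2)
    return [
        [3.0, x3 * s, x2 * s],
        [2.0 * x1, -162.0 * (x2 + 0.1), math.cos(x3)],
        [-x2 * e, -x1 * e, 20.0],
    ]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (used once, for A_0)."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                m = M[r][c]
                M[r] = [vr - m * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def broyden(F, x0, A_inv, tol=1e-9, max_iter=50):
    """Broyden's method keeping A_k^{-1} and updating it by Sherman-Morrison."""
    n = len(x0)
    x, Fx = list(x0), F(x0)
    for k in range(1, max_iter + 1):
        h = [-v for v in matvec(A_inv, Fx)]            # solves A h = -F(x)
        x = [xi + hi for xi, hi in zip(x, h)]
        if max(abs(hi) for hi in h) < tol:
            return x, k
        F_new = F(x)
        y = [a - b for a, b in zip(F_new, Fx)]
        Ainv_y = matvec(A_inv, y)
        hT_Ainv = [sum(h[i] * A_inv[i][j] for i in range(n)) for j in range(n)]
        denom = sum(hi * vi for hi, vi in zip(h, Ainv_y))  # h^T A^{-1} y
        if denom == 0.0:
            raise ZeroDivisionError("inverse update is undefined")
        u = [(hi - vi) / denom for hi, vi in zip(h, Ainv_y)]
        A_inv = [[A_inv[i][j] + u[i] * hT_Ainv[j] for j in range(n)]
                 for i in range(n)]
        Fx = F_new
    return x, max_iter

x0 = [0.1, 0.1, -0.1]
x, k = broyden(F, x0, inverse(J(x0)))
print(k, x)
```

After the single initial inversion, each iteration needs one evaluation of $F$ and a few matrix-vector products, in line with the $O(n^2)$ cost per step.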


Newton-based methods

• Advantage: high speed of convergence once a sufficiently accurate approximation is known.

• Weakness: an accurate initial approximation to the solution is needed to ensure convergence.

The Steepest Descent method converges only linearly to the solution, but it will usually converge even for poor initial approximations.

Strategy: find a sufficiently accurate starting approximation with the Steepest Descent method, then compute the converged solution with a Newton-based method.

The method of Steepest Descent determines a local minimum of a multivariable function $g : \mathbb{R}^n \to \mathbb{R}$.

A system of the form $f_i(x_1, \ldots, x_n) = 0$, $i = 1, 2, \ldots, n$, has a solution at $x$ if and only if the function $g$ defined by

$$g(x_1, \ldots, x_n) = \sum_{i=1}^{n} [f_i(x_1, \ldots, x_n)]^2$$

attains its minimal value zero at $x$.


Basic idea of steepest descent method:

(i) Evaluate g at an initial approximation x⁽⁰⁾;

(ii) Determine a direction from $x^{(0)}$ that results in a decrease in the value of $g$;

(iii) Move an appropriate distance in this direction and call the new vector $x^{(1)}$;

(iv) Repeat steps (i) through (iii) with $x^{(0)}$ replaced by $x^{(1)}$.

Definition 7 (Gradient)

If $g : \mathbb{R}^n \to \mathbb{R}$, the gradient, $\nabla g(x)$, at $x$ is defined by

$$\nabla g(x) = \left( \frac{\partial g}{\partial x_1}(x), \ldots, \frac{\partial g}{\partial x_n}(x) \right)^T.$$

Definition 8 (Directional Derivative)

The directional derivative of $g$ at $x$ in the direction of $v$ with $\|v\|_2 = 1$ is defined by

$$D_v g(x) = \lim_{h \to 0} \frac{g(x + hv) - g(x)}{h} = v^T \nabla g(x).$$
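The identity $D_v g(x) = v^T \nabla g(x)$ can be illustrated numerically by comparing a difference quotient with the inner product. The function $g(x_1, x_2) = x_1^2 + 3 x_1 x_2$, the point, the direction and the step size are all made up for this sketch.

```python
def g(x):
    """Hypothetical test function g(x1, x2) = x1^2 + 3 x1 x2."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_g(x):
    """Its gradient, computed by hand."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]

x = [1.0, 2.0]
v = [0.6, 0.8]   # unit vector: 0.36 + 0.64 = 1
step = 1e-6      # small h for the difference quotient

quotient = (g([xi + step * vi for xi, vi in zip(x, v)]) - g(x)) / step
inner = sum(vi * gi for vi, gi in zip(v, grad_g(x)))

print("difference quotient:", quotient)
print("v^T grad g(x):      ", inner)
```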


Theorem 9

The direction of greatest decrease in the value of $g$ at $x$ is the direction given by $-\nabla g(x)$.

Objective: reduce $g(x)$ to its minimal value zero.

$\Rightarrow$ For an initial approximation $x^{(0)}$, an appropriate choice for the new vector $x^{(1)}$ is

$$x^{(1)} = x^{(0)} - \alpha \nabla g(x^{(0)}), \quad \text{for some constant } \alpha > 0.$$

Choose $\alpha > 0$ such that $g(x^{(1)}) < g(x^{(0)})$: define

$$h(\alpha) = g(x^{(0)} - \alpha \nabla g(x^{(0)})),$$

then find $\alpha^*$ such that

$$h(\alpha^*) = \min_{\alpha} h(\alpha).$$


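How to find $\alpha^*$?

• Solving the root-finding problem $h'(\alpha) = 0$ is too costly in general.

• Instead, choose three numbers $\alpha_1 < \alpha_2 < \alpha_3$ and construct the quadratic polynomial $P(\alpha)$ that interpolates $h$ at $\alpha_1$, $\alpha_2$ and $\alpha_3$, i.e.,

$$P(\alpha_1) = h(\alpha_1), \quad P(\alpha_2) = h(\alpha_2), \quad P(\alpha_3) = h(\alpha_3),$$

to approximate $h$. Use the minimum value $P(\hat{\alpha})$ of $P$ in $[\alpha_1, \alpha_3]$ to approximate $h(\alpha^*)$. The new iterate is

$$x^{(1)} = x^{(0)} - \hat{\alpha} \nabla g(x^{(0)}).$$

• Set $\alpha_1 = 0$ to minimize the computation. Then find $\alpha_3$ with $h(\alpha_3) < h(\alpha_1)$, and choose $\alpha_2 = \alpha_3 / 2$.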


Example 10

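Use the Steepest Descent method with $x^{(0)} = (0, 0, 0)^T$ to find a reasonable starting approximation to the solution of the nonlinear system

$$\begin{aligned}
f_1(x_1, x_2, x_3) &= 3x_1 - \cos(x_2 x_3) - \tfrac{1}{2} = 0,\\
f_2(x_1, x_2, x_3) &= x_1^2 - 81(x_2 + 0.1)^2 + \sin x_3 + 1.06 = 0,\\
f_3(x_1, x_2, x_3) &= e^{-x_1 x_2} + 20 x_3 + \tfrac{10\pi - 3}{3} = 0.
\end{aligned}$$

Let $g(x_1, x_2, x_3) = [f_1(x_1, x_2, x_3)]^2 + [f_2(x_1, x_2, x_3)]^2 + [f_3(x_1, x_2, x_3)]^2$. Then

$$\nabla g(x_1, x_2, x_3) \equiv \nabla g(x) =
\begin{bmatrix}
2 f_1(x) \frac{\partial f_1}{\partial x_1}(x) + 2 f_2(x) \frac{\partial f_2}{\partial x_1}(x) + 2 f_3(x) \frac{\partial f_3}{\partial x_1}(x)\\
2 f_1(x) \frac{\partial f_1}{\partial x_2}(x) + 2 f_2(x) \frac{\partial f_2}{\partial x_2}(x) + 2 f_3(x) \frac{\partial f_3}{\partial x_2}(x)\\
2 f_1(x) \frac{\partial f_1}{\partial x_3}(x) + 2 f_2(x) \frac{\partial f_2}{\partial x_3}(x) + 2 f_3(x) \frac{\partial f_3}{\partial x_3}(x)
\end{bmatrix}
= 2 J(x)^T F(x).$$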


For $x^{(0)} = [0, 0, 0]^T$, we have

$$g(x^{(0)}) = 111.975 \quad \text{and} \quad z_0 = \|\nabla g(x^{(0)})\|_2 = 419.554.$$

Let

$$z = \frac{1}{z_0} \nabla g(x^{(0)}) = [-0.0214514, -0.0193062, 0.999583]^T.$$

With $\alpha_1 = 0$, we have

$$g_1 = g(x^{(0)} - \alpha_1 z) = g(x^{(0)}) = 111.975.$$

Let $\alpha_3 = 1$ so that

$$g_3 = g(x^{(0)} - \alpha_3 z) = 93.5649 < g_1.$$

Set $\alpha_2 = \alpha_3 / 2 = 0.5$. Thus

$$g_2 = g(x^{(0)} - \alpha_2 z) = 2.53557.$$


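Form the quadratic polynomial

$$P(\alpha) = g_1 + h_1 \alpha + h_3 \alpha (\alpha - \alpha_2)$$

that interpolates $g(x^{(0)} - \alpha z)$ at $\alpha_1 = 0$, $\alpha_2 = 0.5$ and $\alpha_3 = 1$ as follows:

$$\begin{aligned}
g_2 &= P(\alpha_2) = g_1 + h_1 \alpha_2 &&\Rightarrow\ h_1 = \frac{g_2 - g_1}{\alpha_2} = -218.878,\\
g_3 &= P(\alpha_3) = g_1 + h_1 \alpha_3 + h_3 \alpha_3 (\alpha_3 - \alpha_2) &&\Rightarrow\ h_3 = 400.937.
\end{aligned}$$

Thus

$$P(\alpha) = 111.975 - 218.878 \alpha + 400.937 \alpha (\alpha - 0.5),$$

so that

$$0 = P'(\alpha_0) = -419.346 + 801.872 \alpha_0 \ \Rightarrow\ \alpha_0 = 0.522959.$$

Since

$$g_0 = g(x^{(0)} - \alpha_0 z) = 2.32762 < \min\{g_1, g_3\},$$

we set

$$x^{(1)} = x^{(0)} - \alpha_0 z = [0.0112182, 0.0100964, -0.522741]^T.$$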
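The first Steepest Descent step of this example can be reproduced with a short sketch. It follows the recipe above ($\alpha_1 = 0$, $\alpha_3$ found by halving from 1, $\alpha_2 = \alpha_3/2$, quadratic interpolation); the function names, the halving rule for $\alpha_3$ and the fallback to $\alpha_3$ when the interpolated point is not better are assumptions of this sketch. The variable `h2` is the divided difference between $\alpha_2$ and $\alpha_3$, used only to compute $h_3$.

```python
import math

def F(x):
    x1, x2, x3 = x
    return [
        3.0 * x1 - math.cos(x2 * x3) - 0.5,
        x1 * x1 - 81.0 * (x2 + 0.1) ** 2 + math.sin(x3) + 1.06,
        math.exp(-x1 * x2) + 20.0 * x3 + (10.0 * math.pi - 3.0) / 3.0,
    ]

def J(x):
    x1, x2, x3 = x
    s, e = math.sin(x2 * x3), math.exp(-x1 * x2)
    return [
        [3.0, x3 * s, x2 * s],
        [2.0 * x1, -162.0 * (x2 + 0.1), math.cos(x3)],
        [-x2 * e, -x1 * e, 20.0],
    ]

def g(x):
    return sum(f * f for f in F(x))

def grad_g(x):
    """grad g(x) = 2 J(x)^T F(x)."""
    Fx, Jx = F(x), J(x)
    return [2.0 * sum(Jx[i][j] * Fx[i] for i in range(3)) for j in range(3)]

def steepest_descent_step(x):
    """One step with the three-point quadratic interpolation for alpha."""
    grad = grad_g(x)
    z0 = math.sqrt(sum(c * c for c in grad))
    if z0 == 0.0:
        return list(x), 0.0, g(x)             # stationary point: nothing to do
    z = [c / z0 for c in grad]                # unit vector along the gradient
    phi = lambda a: g([xi - a * zi for xi, zi in zip(x, z)])
    a3 = 1.0
    g1, g3 = phi(0.0), phi(a3)                # alpha_1 = 0
    while g3 >= g1:                           # halve alpha_3 until g decreases
        a3 /= 2.0
        g3 = phi(a3)
    a2 = a3 / 2.0
    g2 = phi(a2)
    h1 = (g2 - g1) / a2
    h2 = (g3 - g2) / (a3 - a2)
    h3 = (h2 - h1) / a3
    a0 = 0.5 * (a2 - h1 / h3)                 # zero of P'(alpha)
    g0 = phi(a0)
    a_best = a0 if g0 < g3 else a3            # keep the better of the two
    return [xi - a_best * zi for xi, zi in zip(x, z)], a0, g0

x1, a0, g0 = steepest_descent_step([0.0, 0.0, 0.0])
print("alpha_0 =", a0)
print("g_0     =", g0)
print("x^(1)   =", x1)
```

Repeating the step from $x^{(1)}$ gives further iterates; as noted earlier, the resulting approximation is intended as a starting point for a Newton-based method.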
