
(“⇒”) Rewrite g(x) as

g(x) = < x − x*, A(x − x*) > + < x, Ax* > + < x*, Ax > − < x*, Ax* > − 2 < x, b >

     = < x − x*, A(x − x*) > − < x*, Ax* > + 2 < x, Ax* > − 2 < x, b >

     = < x − x*, A(x − x*) > − < x*, Ax* > + 2 < x, Ax* − b >.

Suppose that x* is the solution of Ax = b, i.e., Ax* = b. Then

g(x) = < x − x*, A(x − x*) > − < x*, Ax* >,

whose minimum occurs at x = x*, since A is positive definite and the first term is nonnegative, vanishing exactly at x = x*.
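As a quick numerical sanity check of this characterization, the sketch below (illustrative only; the 2×2 matrix, the right-hand side, and the function name g are arbitrary choices) verifies that g is never smaller at perturbed points than at the solution of Ax = b.

```python
import numpy as np

def g(A, b, x):
    return x @ (A @ x) - 2 * (x @ b)       # g(x) = <x, Ax> - 2 <x, b>

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                 # a small symmetric positive definite matrix
b = np.array([1.0, 2.0])

x_star = np.linalg.solve(A, b)             # exact solution of Ax = b

rng = np.random.default_rng(0)
for _ in range(5):
    x = x_star + rng.standard_normal(2)    # random perturbation of x*
    assert g(A, b, x) >= g(A, b, x_star)   # g attains its minimum at x*

print("g(x*) =", g(A, b, x_star))
```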


(“⇐”) Fix vectors x and v. For any α ∈ R, define

f(α) ≡ g(x + αv)

     = < x + αv, Ax + αAv > − 2 < x + αv, b >

     = < x, Ax > + α < v, Ax > + α < x, Av > + α² < v, Av > − 2 < x, b > − 2α < v, b >

     = < x, Ax > − 2 < x, b > + 2α < v, Ax > − 2α < v, b > + α² < v, Av >

     = g(x) + 2α < v, Ax − b > + α² < v, Av >.

Because f is a quadratic function of α and < v, Av > is positive, f has a minimum where f′(α) = 0. Since

f′(α) = 2 < v, Ax − b > + 2α < v, Av >,

the minimum occurs at

α̂ = − < v, Ax − b > / < v, Av > = < v, b − Ax > / < v, Av >,


and

g(x + α̂v) = f(α̂)

           = g(x) − 2 (< v, b − Ax > / < v, Av >) < v, b − Ax > + (< v, b − Ax > / < v, Av >)² < v, Av >

           = g(x) − < v, b − Ax >² / < v, Av >.

So, for any nonzero vector v, we have

g(x + α̂v) < g(x)   if < v, b − Ax > ≠ 0,   (6)

and

g(x + α̂v) = g(x)   if < v, b − Ax > = 0.   (7)

Suppose that x* is a vector that minimizes g. Then

g(x* + α̂v) ≥ g(x*)   for any v.   (8)


From (6), (7) and (8), we have

< v, b − Ax* > = 0   for any v,

which implies that Ax* = b.

Let

r = b − Ax.

Then

α̂ = < v, b − Ax > / < v, Av > = < v, r > / < v, Av >.

If r ≠ 0 and if v and r are not orthogonal, then

g(x + α̂v) < g(x),

which implies that x + α̂v is closer to x* than is x.


Let x^{(0)} be an initial approximation to x* and let v^{(1)} ≠ 0 be an initial search direction. For k = 1, 2, 3, . . ., we compute

α_k = < v^{(k)}, b − Ax^{(k−1)} > / < v^{(k)}, Av^{(k)} >,   x^{(k)} = x^{(k−1)} + α_k v^{(k)},

and choose a new search direction v^{(k+1)}.

Question: How to choose {v^{(k)}} such that {x^{(k)}} converges rapidly to x*?
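For concreteness, this directional-search scheme can be written as a short Python sketch. The helper names (exact_line_search, directional_search), the tolerance-based stopping test, and the callable supplying the search directions are assumptions made for illustration, not part of the slides.

```python
import numpy as np

def exact_line_search(A, b, x, v):
    """One step x + alpha v with alpha = <v, b - Ax> / <v, Av>."""
    r = b - A @ x
    alpha = (v @ r) / (v @ (A @ v))
    return x + alpha * v

def directional_search(A, b, x0, next_direction, maxit=100, tol=1e-12):
    """Generic scheme x_k = x_{k-1} + alpha_k v_k for user-chosen directions v_k."""
    x = x0.astype(float).copy()
    for k in range(1, maxit + 1):
        r = b - A @ x
        if np.linalg.norm(r) <= tol:
            break
        v = next_direction(k, x, r)        # choose the next search direction v^(k)
        x = exact_line_search(A, b, x, v)
    return x
```

With next_direction simply returning the residual r, this reduces to the steepest descent method discussed next.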

Let Φ : R^n → R be a differentiable function at x. Then it holds that

(Φ(x + εp) − Φ(x)) / ε = ∇Φ(x)^T p + O(ε).

Neglecting the O(ε) term, the right-hand side is minimized over all p with ‖p‖ = 1 at

p = − ∇Φ(x) / ‖∇Φ(x)‖

(i.e., the direction of steepest descent).


Denote x = [x_1, x_2, . . . , x_n]^T. Then

g(x) = < x, Ax > − 2 < x, b > = Σ_{i=1}^{n} Σ_{j=1}^{n} a_{ij} x_i x_j − 2 Σ_{i=1}^{n} x_i b_i.

It follows that

∂g/∂x_k (x) = 2 Σ_{i=1}^{n} a_{ki} x_i − 2 b_k,   for k = 1, 2, . . . , n.

Therefore, the gradient of g is

∇g(x) = [ ∂g/∂x_1 (x), ∂g/∂x_2 (x), · · · , ∂g/∂x_n (x) ]^T = 2(Ax − b) = −2r.
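The identity ∇g(x) = 2(Ax − b) is easy to confirm numerically. The sketch below (purely illustrative; the small matrix A, the vector b, and the point x are arbitrary choices) compares the formula with a central finite-difference approximation of the gradient.

```python
import numpy as np

def g(A, b, x):
    return x @ (A @ x) - 2 * (x @ b)      # g(x) = <x, Ax> - 2 <x, b>

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.array([0.3, -0.7])

grad = 2 * (A @ x - b)                    # the formula derived above

eps = 1e-6
fd = np.zeros_like(x)
for k in range(len(x)):
    e = np.zeros_like(x)
    e[k] = 1.0
    fd[k] = (g(A, b, x + eps * e) - g(A, b, x - eps * e)) / (2 * eps)

print(np.max(np.abs(grad - fd)))          # tiny difference (finite-difference error only)
```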


Steepest descent method (gradient method)

Given an initial x_0 ≠ 0.
For k = 1, 2, . . .
    r_{k−1} = b − Ax_{k−1};
    If r_{k−1} = 0, then stop;
    else
        α_k = (r_{k−1}^T r_{k−1}) / (r_{k−1}^T A r_{k−1}),
        x_k = x_{k−1} + α_k r_{k−1}.
End for
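A direct Python transcription of this loop might look as follows (a sketch; the function name steepest_descent, the tolerance-based stopping test in place of the exact r_{k−1} = 0 check, and the iteration cap are my own choices):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, maxit=10_000):
    """Gradient method for Ax = b with A symmetric positive definite."""
    x = x0.astype(float).copy()
    for _ in range(maxit):
        r = b - A @ x                     # residual r_{k-1} = b - A x_{k-1}
        if np.linalg.norm(r) <= tol:      # r_{k-1} = 0 (up to tolerance): stop
            break
        alpha = (r @ r) / (r @ (A @ r))   # exact line search along r_{k-1}
        x = x + alpha * r
    return x
```

As the next theorem and the remarks after it indicate, this loop can need a very large number of iterations when A is ill-conditioned.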

Theorem 37

If x_k, x_{k−1} are two approximations of the steepest descent method for solving Ax = b and λ_1 ≥ λ_2 ≥ · · · ≥ λ_n > 0 are the eigenvalues of A, then it holds that

‖x_k − x*‖_A ≤ ((λ_1 − λ_n) / (λ_1 + λ_n)) ‖x_{k−1} − x*‖_A,

where ‖x‖_A = √(x^T Ax). Thus the gradient method is convergent.


If the condition number of A (= λ_1/λ_n) is large, then

(λ_1 − λ_n) / (λ_1 + λ_n) ≈ 1,

so the gradient method converges very slowly. Hence this method is not recommended.

It is favorable to choose the search directions {v^{(i)}} to be mutually A-conjugate, where A is symmetric positive definite.

Definition 38

Two vectors p and q are called A-conjugate (A-orthogonal) if p^T A q = 0.
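A small numerical helper for this property (a sketch; the function name is_A_conjugate, the tolerance, and the example matrix are assumptions for illustration). It also shows one concrete A-orthogonal set: the eigenvectors of a symmetric positive definite A, since Av_j = λ_j v_j gives v_i^T A v_j = λ_j v_i^T v_j = 0 for i ≠ j.

```python
import numpy as np

def is_A_conjugate(p, q, A, tol=1e-10):
    """True if p and q are A-conjugate, i.e. p^T A q = 0 (up to a tolerance)."""
    return abs(p @ (A @ q)) <= tol

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
_, V = np.linalg.eigh(A)                     # columns: orthonormal eigenvectors of A
print(is_A_conjugate(V[:, 0], V[:, 1], A))   # True
```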


Lemma 39

Let v_1, . . . , v_n ≠ 0 be pairwise A-conjugate. Then they are linearly independent.

Proof: From

0 = Σ_{j=1}^{n} c_j v_j

it follows that

0 = (v_k)^T A ( Σ_{j=1}^{n} c_j v_j ) = Σ_{j=1}^{n} c_j (v_k)^T A v_j = c_k (v_k)^T A v_k,

so c_k = 0, for k = 1, . . . , n.


Theorem 40

Let A be symmetric positive definite and let v_1, . . . , v_n ∈ R^n \ {0} be pairwise A-orthogonal. Given x_0, let r_0 = b − Ax_0. For k = 1, . . . , n, let

α_k = < v_k, b − Ax_{k−1} > / < v_k, Av_k >   and   x_k = x_{k−1} + α_k v_k.

Then Ax_n = b and

< b − Ax_k, v_j > = 0,   for each j = 1, 2, . . . , k.

Proof: Since, for each k = 1, 2, . . . , n, x_k = x_{k−1} + α_k v_k, we have

Ax_n = Ax_{n−1} + α_n Av_n = (Ax_{n−2} + α_{n−1} Av_{n−1}) + α_n Av_n = · · ·

     = Ax_0 + α_1 Av_1 + α_2 Av_2 + · · · + α_n Av_n.


It implies that

< Ax_n − b, v_k >

  = < Ax_0 − b, v_k > + α_1 < Av_1, v_k > + · · · + α_n < Av_n, v_k >

  = < Ax_0 − b, v_k > + α_1 < v_1, Av_k > + · · · + α_n < v_n, Av_k >

  = < Ax_0 − b, v_k > + α_k < v_k, Av_k >

  = < Ax_0 − b, v_k > + (< v_k, b − Ax_{k−1} > / < v_k, Av_k >) < v_k, Av_k >

  = < Ax_0 − b, v_k > + < v_k, b − Ax_{k−1} >

  = < Ax_0 − b, v_k > + < v_k, b − Ax_0 + Ax_0 − Ax_1 + · · · − Ax_{k−2} + Ax_{k−2} − Ax_{k−1} >

  = < Ax_0 − b, v_k > + < v_k, b − Ax_0 > + < v_k, Ax_0 − Ax_1 > + · · · + < v_k, Ax_{k−2} − Ax_{k−1} >

  = < v_k, Ax_0 − Ax_1 > + · · · + < v_k, Ax_{k−2} − Ax_{k−1} >.


For any i, since

x_i = x_{i−1} + α_i v_i   and   Ax_i = Ax_{i−1} + α_i Av_i,

we have

Ax_{i−1} − Ax_i = −α_i Av_i.

Thus, for k = 1, . . . , n,

< Ax_n − b, v_k > = −α_1 < v_k, Av_1 > − · · · − α_{k−1} < v_k, Av_{k−1} > = 0,

which implies that Ax_n = b.

Suppose that

< r_{k−1}, v_j > = 0   for j = 1, 2, . . . , k − 1.   (9)

By the result

r_k = b − Ax_k = b − A(x_{k−1} + α_k v_k) = r_{k−1} − α_k Av_k,


it follows that

< r_k, v_k > = < r_{k−1}, v_k > − α_k < Av_k, v_k >

             = < r_{k−1}, v_k > − (< v_k, b − Ax_{k−1} > / < v_k, Av_k >) < Av_k, v_k >

             = 0.

From assumption (9) and the A-orthogonality, for j = 1, . . . , k − 1,

< r_k, v_j > = < r_{k−1}, v_j > − α_k < Av_k, v_j > = 0,

which completes the proof by mathematical induction.

Method of conjugate directions:

Let A be symmetric positive definite and b, x_0 ∈ R^n. Given v_1, . . . , v_n ∈ R^n \ {0} pairwise A-orthogonal:

r_0 = b − Ax_0.
For k = 1, . . . , n,
    α_k = < v_k, r_{k−1} > / < v_k, Av_k >,
    x_k = x_{k−1} + α_k v_k,
    r_k = r_{k−1} − α_k Av_k = b − Ax_k.
End For
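A minimal Python sketch of this method, assuming the A-orthogonal directions are handed in as the columns of a matrix V. For the illustration, the eigenvectors of A are used as directions, since (as noted above) the eigenvectors of a symmetric positive definite matrix form an A-orthogonal set; the function name conjugate_directions is my own.

```python
import numpy as np

def conjugate_directions(A, b, x0, V):
    """Conjugate-direction method: the columns of V are pairwise A-orthogonal."""
    x = x0.astype(float).copy()
    r = b - A @ x                          # r_0
    for k in range(V.shape[1]):
        v = V[:, k]
        alpha = (v @ r) / (v @ (A @ v))    # alpha_k = <v_k, r_{k-1}> / <v_k, A v_k>
        x = x + alpha * v
        r = r - alpha * (A @ v)            # r_k = b - A x_k
    return x

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
_, V = np.linalg.eigh(A)                   # eigenvectors: an A-orthogonal set
x = conjugate_directions(A, b, np.zeros(2), V)
print(np.allclose(A @ x, b))               # True: the exact solution after n steps
```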


Practical Implementation

In the k-th step a direction v_k which is A-orthogonal to v_1, . . . , v_{k−1} must be determined; this can be done by orthogonalizing the residual against the previous directions.

Let r_k ≠ 0. Then g(x) decreases strictly in the direction of −∇g(x_k), i.e., in the direction r_k: for ε > 0 small, we have g(x_k + εr_k) < g(x_k).

If r_{k−1} = b − Ax_{k−1} ≠ 0, then we use r_{k−1} to generate v_k by

v_k = r_{k−1} + β_{k−1} v_{k−1}.   (10)

Choose β_{k−1} such that

0 = < v_{k−1}, Av_k > = < v_{k−1}, Ar_{k−1} + β_{k−1} Av_{k−1} > = < v_{k−1}, Ar_{k−1} > + β_{k−1} < v_{k−1}, Av_{k−1} >.


That is,

β_{k−1} = − < v_{k−1}, Ar_{k−1} > / < v_{k−1}, Av_{k−1} >.   (11)

Theorem 41

Let v_k and β_{k−1} be defined in (10) and (11), respectively. Then r_0, . . . , r_{k−1} are mutually orthogonal and

< v_k, Av_i > = 0,   for i = 1, 2, . . . , k − 1.

That is, {v_1, . . . , v_k} is an A-orthogonal set.
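As a quick illustration of the construction (10)–(11), the sketch below builds v_k from r_{k−1} and v_{k−1} and checks that it is A-orthogonal to v_{k−1}; the specific vectors and the helper name next_direction are arbitrary choices for the example.

```python
import numpy as np

def next_direction(A, r_prev, v_prev):
    """v_k = r_{k-1} + beta_{k-1} v_{k-1}, with beta_{k-1} chosen as in (11)."""
    beta = -(v_prev @ (A @ r_prev)) / (v_prev @ (A @ v_prev))
    return r_prev + beta * v_prev

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
r_prev = np.array([1.0, -2.0])             # plays the role of r_{k-1}
v_prev = np.array([0.5, 1.0])              # plays the role of v_{k-1}
v_new = next_direction(A, r_prev, v_prev)
print(abs(v_prev @ (A @ v_new)))           # ~0: <v_{k-1}, A v_k> = 0
```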

Having chosen v_k, we compute

α_k = < v_k, r_{k−1} > / < v_k, Av_k >

    = < r_{k−1} + β_{k−1} v_{k−1}, r_{k−1} > / < v_k, Av_k >

    = < r_{k−1}, r_{k−1} > / < v_k, Av_k > + β_{k−1} < v_{k−1}, r_{k−1} > / < v_k, Av_k >

    = < r_{k−1}, r_{k−1} > / < v_k, Av_k >,   (12)

since < v_{k−1}, r_{k−1} > = 0 by Theorem 40.


Since

r_k = r_{k−1} − α_k Av_k,

we have, using the orthogonality < r_{k−1}, r_k > = 0,

< r_k, r_k > = < r_{k−1}, r_k > − α_k < Av_k, r_k > = −α_k < r_k, Av_k >.

Further, from (12),

< r_{k−1}, r_{k−1} > = α_k < v_k, Av_k >,

so

β_k = − < v_k, Ar_k > / < v_k, Av_k > = − < r_k, Av_k > / < v_k, Av_k >

    = ((1/α_k) < r_k, r_k >) / ((1/α_k) < r_{k−1}, r_{k−1} >) = < r_k, r_k > / < r_{k−1}, r_{k−1} >.


Algorithm 4 (Conjugate Gradient method (CG-method))

Let A be s.p.d., b ∈ R^n. Choose x_0 ∈ R^n and set r_0 = b − Ax_0 = v_0. If r_0 = 0, then N = 0 and stop; otherwise, for k = 0, 1, . . .

(a). α_k = < r_k, r_k > / < v_k, Av_k >,
(b). x_{k+1} = x_k + α_k v_k,
(c). r_{k+1} = r_k − α_k Av_k,
(d). If r_{k+1} = 0, let N = k + 1, stop.
(e). β_k = < r_{k+1}, r_{k+1} > / < r_k, r_k >,
(f). v_{k+1} = r_{k+1} + β_k v_k.
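A direct Python transcription of Algorithm 4 might look as follows (a sketch; the function name conjugate_gradient, the tolerance-based stopping test in place of the exact r_{k+1} = 0 check, and the iteration cap are my own choices):

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, maxit=None):
    """CG method (Algorithm 4) for Ax = b with A symmetric positive definite."""
    maxit = maxit if maxit is not None else len(b)
    x = x0.astype(float).copy()
    r = b - A @ x                          # r_0
    v = r.copy()                           # v_0 = r_0
    rho = r @ r                            # <r_k, r_k>
    for _ in range(maxit):
        if np.sqrt(rho) <= tol:            # r_k = 0 (up to tolerance): stop
            break
        Av = A @ v
        alpha = rho / (v @ Av)             # (a)
        x = x + alpha * v                  # (b)
        r = r - alpha * Av                 # (c)
        rho_new = r @ r                    # used by the stopping test (d)
        beta = rho_new / rho               # (e)
        v = r + beta * v                   # (f)
        rho = rho_new
    return x
```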

Theoretically, the exact solution is obtained in n steps.

If A is well-conditioned, then an approximate solution is obtained in about √n steps.

If A is ill-conditioned, then the number of iterations may be greater than n.


Select a nonsingular matrix C so that Ã = C^{−1} A C^{−T} is better conditioned.

Consider the linear system Ã x̃ = b̃, where

x̃ = C^T x   and   b̃ = C^{−1} b.

Then

Ã x̃ = (C^{−1} A C^{−T})(C^T x) = C^{−1} Ax.

Thus,

Ax = b ⇔ Ã x̃ = b̃,   with x = C^{−T} x̃.


Since

x̃_k = C^T x_k,

we have

r̃_k = b̃ − Ã x̃_k = C^{−1} b − C^{−1} A C^{−T} C^T x_k = C^{−1}(b − Ax_k) = C^{−1} r_k.

Let

ṽ_k = C^T v_k   and   w_k = C^{−1} r_k.

Then

β̃_k = < r̃_k, r̃_k > / < r̃_{k−1}, r̃_{k−1} >

     = < C^{−1} r_k, C^{−1} r_k > / < C^{−1} r_{k−1}, C^{−1} r_{k−1} >

     = < w_k, w_k > / < w_{k−1}, w_{k−1} >.


Thus,

α̃_k = < r̃_{k−1}, r̃_{k−1} > / < ṽ_k, Ã ṽ_k >

     = < C^{−1} r_{k−1}, C^{−1} r_{k−1} > / < C^T v_k, C^{−1} A C^{−T} C^T v_k >

     = < w_{k−1}, w_{k−1} > / < C^T v_k, C^{−1} Av_k >,

and, since

< C^T v_k, C^{−1} Av_k > = (v_k)^T C C^{−1} Av_k = (v_k)^T Av_k = < v_k, Av_k >,

we have

α̃_k = < w_{k−1}, w_{k−1} > / < v_k, Av_k >.

Further,

x̃_k = x̃_{k−1} + α̃_k ṽ_k,   so   C^T x_k = C^T x_{k−1} + α̃_k C^T v_k   and   x_k = x_{k−1} + α̃_k v_k.


Continuing,

r̃_k = r̃_{k−1} − α̃_k Ã ṽ_k,   so   C^{−1} r_k = C^{−1} r_{k−1} − α̃_k C^{−1} A C^{−T} C^T v_k   and   r_k = r_{k−1} − α̃_k Av_k.

Finally,

ṽ_{k+1} = r̃_k + β̃_k ṽ_k   and   C^T v_{k+1} = C^{−1} r_k + β̃_k C^T v_k,

so

v_{k+1} = C^{−T} C^{−1} r_k + β̃_k v_k = C^{−T} w_k + β̃_k v_k.
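Collecting these recurrences gives a preconditioned CG iteration carried out entirely in the original variables x_k, r_k, v_k. The Python sketch below assumes the preconditioner is supplied as M = C C^T, so that C^{−T} w_k = C^{−T} C^{−1} r_k = M^{−1} r_k and < w_k, w_k > = r_k^T M^{−1} r_k; the function name preconditioned_cg and the Jacobi choice M = diag(A) in the example are assumptions for illustration, not part of the slides.

```python
import numpy as np

def preconditioned_cg(A, b, x0, M, tol=1e-10, maxit=None):
    """Preconditioned CG for Ax = b; M = C C^T is s.p.d. and applied via solves."""
    maxit = maxit if maxit is not None else len(b)
    x = x0.astype(float).copy()
    r = b - A @ x
    z = np.linalg.solve(M, r)              # z = C^{-T} w = M^{-1} r
    v = z.copy()
    rho = r @ z                            # <w, w> = r^T M^{-1} r
    for _ in range(maxit):
        if np.linalg.norm(r) <= tol:
            break
        Av = A @ v
        alpha = rho / (v @ Av)             # alpha_tilde_k
        x = x + alpha * v
        r = r - alpha * Av
        z = np.linalg.solve(M, r)
        rho_new = r @ z
        beta = rho_new / rho               # beta_tilde_k
        v = z + beta * v                   # v_{k+1} = C^{-T} w_k + beta_tilde_k v_k
        rho = rho_new
    return x

# Example with the Jacobi preconditioner M = diag(A):
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = preconditioned_cg(A, b, np.zeros(2), np.diag(np.diag(A)))
print(np.allclose(A @ x, b))               # True
```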
