• 沒有找到結果。

From Evolutionary Biology to Interior-Point Methods

N/A
N/A
Protected

Academic year: 2022

Share "From Evolutionary Biology to Interior-Point Methods"

Copied!
15
0
0

加載中.... (立即查看全文)

全文

(1)

Methods

Paul Tseng

Mathematics, University of Washington Seattle

WCOM, UBC Kelowna October 27, 2007

Joint work with Immanuel Bomze and Werner Schachinger (Univ. Vienna) Abstract

This is a talk given at WCOM Kelowna, October 2007.

(2)

Talk Outline

• Affine-Scaling Method for Linearly Constrained Optimization

• Replicator Dynamics in Evolutionary Biology

• 1st-Order Interior-Point Methods for Linearly Constrained Optimization

? Convergence & Convergence Rate

? Numerical Tests

• Conclusions & Open Questions

(3)

Linearly Constrained Smooth Optimization

(P) max

x f (x) s.t. Ax = b, x ≥ 0

f : <n → < is continuously diff., A ∈ <m×n of rank m, b ∈ <m

Seek a stationary pt: a feasible x (Ax = b, x ≥ 0) with x ⊥ ∇f (x) − A>λ ≤ 0 for some λ ∈ <m.

Primal Nondegeneracy: For any feasible x, the columns of A corresponding to {j | xj 6= 0} have rank m.

(4)

Affine-Scaling Method for (P)

Given Ax0 = b, x0 > 0, generate for t = 0, 1, ...

dt = arg max

d

∇f (xt)>d | Ad = 0, k(Xt)−1dk2 ≤ 1 xt+1 = xt + αtdt > 0

with Xt = diag(xt), αt > 0 suitably chosen Dikin ’67, ’72

(5)

The AS method is simple, fairly efficient in practice, but difficult to analyze.

Barnes, Bonnans, Dikin, Gonzaga, Mascarenhas, Monma, Monteiro, Roos, Saigal, J. Sun, T, Tsuchiya, Vanderbei, Ye, ...

• When f is linear, {xt} and {f (xt)} converge linearly; Luo, T ’92

the limit x¯ attains maximum if αt is not “too large”; Tsuchiya ’91, Tsuchiya & Muramatsu ’95

¯

x may fail to attain maximum if αt is “too large”. Mascarenhas ’97, Terlaky & Tsuchiya ’99

• When f is concave or convex and assuming primal nondeg, every cluster pt of {xt} is a stationary pt of (P). Gonzaga & Carlos ’90, Monteiro & Wang ’98

What about general f? And convergence rate?

(6)

Replicator Dynamics in Evolutionary Biology

xtj = fraction of jth genotype in pop. at time t, j = 1, ..., n Initially, x0j > 0 and P

j x0j = 1. xt = (xtj)nj=1 evolves according to

xt+1 = XtQxt

(xt)>Qxt, t = 0, 1, ...

with adaptation coefficients Qii > 0 and Qij ≥ 0 for all i, j ..., Haldane ’32, ..., Moran ’62, ...

• {xt} converges and its limit x¯ satisfies x¯j((Q¯x)j − ¯λ) = 0 for all j, with λ = max¯ j(Q¯x)j; convergence rate is linear iff (Q¯x)j < ¯λ whenever x¯j = 0;

otherwise k¯x − xtk = O(1/√

t). Lyubich et al. ’80

(7)

Connecting RD with AS

Rewrite RD iteration as

xt+1 − xt = Xt[gt − e (xt)>gt] (xt)>gt

with gt = Qxt, e = (1, ..., 1)>.

Thus

xt+1 = xt + αtdt with dt = Xtr(xt) where αt = 1/(xt)>gt and

r(x) = ∇f (x) − e x>∇f (x)

(8)

Solve for dt in AS iteration when A = e> yields dt ∝ (Xt)2



gt − e e>(Xt)2gt kxtk22



with gt = ∇f (xt), e = (1, ..., 1)>.

Thus

xt+1 = xt + αtdt with dt = (Xt)2r(xt) where αt > 0 and

r(x) = ∇f (x) − e e>X2∇f (x) kxk2

(9)

1st-Order Interior-Point Methods for (P)

Given Ax0 = b, x0 > 0, generate for t = 0, 1, ...

xt+1 = xt + αtdt with dt = (Xt)rγ(xt) where γ > 0,

rγ(x) = ∇f (x) − A>(AXA>)−1AX∇f (x) and αt is the largest α ∈ {αt0(β)k}k=0,1,... satisfying

f (xt + αdt) ≥ f (xt) + σα(gt)>dt, Armijo-type inexact LS

where 0 < β, σ < 1, gt = ∇f (xt), and

0 < αt0 < ( ∞ if dt ≥ 0;

−1

minj dtj/xtj else.

(10)

γ = 1/2 =⇒ RD

γ = 1 =⇒ AS

This method is simple, suited for large problems (n ≥ 10000).

Convergence of {xt}? Convergence rate of {f (xt)}, {xt}?

Choosing γ?

(11)

Convergence Results

: Assume {x feasible | f (x) ≥ f (x0)} is bounded. Then xt > 0, {f (xt)} ↑, and {xt}, {dt} are bounded.

(a) Assume primal nondeg, f is concave or convex, and we choose

inft αt0 > 0, supt αt0 < ∞. Then every cluster pt of {xt} is a stationary pt of (P).

(b) Assume f is quadratic and we choose inft αt0 > 0. Then υ − f (xt) = O 

1/t1/ max{γ,2γ−1} , with υ = lim

t→∞f (xt).

Assume in addition we choose γ < 1. Then {xt} converges and its limit x¯ satisfies

k¯x − xtk = O  1/t

1−γ

 .

Under primal nondeg, x¯ is a stationary pt of (P). Moreover, if γ ≤ 12 and

¯

x − rγ(¯x) > 0, then {f (xt)} converges Q-linearly and {k¯x − xtk} converges R-linearly. If instead supt αt0 < ∞, γ ≥ 12 and x − r¯ γ(¯x) 6> 0, then {k¯x − xtk}

cannot converge linearly.

• Thus, γ < 1 seems preferrable.

(12)

Numerical Tests

:

• Implement 1st-order IP method in Matlab. For Armijo LS, use β = .5, σ = .1, αt0 = min



0.95αtfeas, max



10−5, αt−1 β2



, αtfeas = −1

minj(dtj/xtj), with α−1 = ∞.

• Numerical tests on

maxx f (x) s.t. e>x = 1, x ≥ 0

with −f from Mor ´e-Garbow-Hillstrom set (least square), and n = 1000.

• Initial x0 = e/n. Terminate when resid := k min{xt, −rγ(xt)}k ≤ tol.

(13)

f (x) γ #iter #f-eval cpu (sec) obj resid

BAL .8 8 84 0.02 9.98998·108 1.4·10−6

1 7 8 0.03 9.98998·108 3.6·10−8

1.2 8 9 0.02 9.98998·108 3.7·10−7

BT .8 9146 27409 20.68 999.031 9.9·10−4

1 17559 52665 23.31 999.055 9.9·10−4

1.2 527757 1.58·106 1192.55 999.081 9.9·10−4

DBV .8 99 299 0.42 4.9·10−8 9.8·10−5

1 146 440 0.52 4.5·10−8 9.8·10−5

.2 240 722 1.02 4.0·10−8 9.9·10−5

EPS .8 424 1269 1.89 1.3·10−6 9.9·10−4

1 987 2958 3.52 3.9·10−6 9.9·10−4

1.2 1963 5887 8.76 8.0·10−6 9.4·10−4

ER .8 5 6 0.01 498.002 5.2·10−7

1 7 8 0.03 498.002 2.6·10−7

1.2 10 11 0.03 498.002 3.7·10−7

LR1 .8 20 21 0.04 3.32834·108 9.5·10−7

1 19 20 0.03 3.32839·108 3.5·10−7

1.2 20 21 0.03 3.33481·108 9.9·10−7

VD .8 22 74 0.04 6.22504·1022 2.4·10−9

1 19 46 0.04 6.22504·1022 2.6·10−8

1.2 18 50 0.05 6.22504·1022 5.7·10−9

Table 1: Performance of 1st-order IP method.

Quit due to Armijo ascent condition not met when α < 10−20

(14)

Conclusions & Open questions

1. Theory and practice suggest γ < 1 is preferrable to γ ≥ 1.

2. The method and its analysis readily extends to 0 ≤ x ≤ u by replacing Xt with min{Xt, U − Xt}.

3. Convergence of {xt} when γ ≥ 1 or when f is not quadratic?

4. Linear convergence of {f (xt)} and {xt} when γ > 12 or when f is not quadratic?

5. Convergence of {xt} for 2nd-order AS method?

(15)

Convergence Proof Ideas

(a) f (xt+1) − f (xt) ≥ σαttk2 with ηt = (Xt)γr(xt).

(b) If f is quadratic, use linearity of KT condition to show

t := υ − f (xt) = O 

tkmin{1+γ2 ,1γ} .

(c) If f is quadratic and γ < 1, then

kxt+1 − xtk = αtk(Xt)γηtk = O(kηtk) = O kηtk2tk



= O ∆t − ∆t+1 (∆t)1+γ2

!

= O

Z t

t+1

t1+γ2 dt

!

= O 

(∆t)1−γ2 − (∆t+1)1−γ2 

(d) Under primal nondegeneracy, rγ is continuous on the feasible set.

參考文獻

相關文件

This implies that f is an open mapping which maps interior points (of X) to interior points (of Y ) and f −1 is an open mapping from Y onto X... Is this convergence uniform

A function f is said to be continuous on an interval if it is continuous at each interior point of the interval and one-sidedly continuous at whatever endpoints the interval

• Density is varying in the medium and the volume scattering properties at a point is the product of scattering properties at a point is the product of the density at that point

This shows that q is an interior point

Activity D – Students will partner with local teens to work on a joint project fusing music and dance from African and Chinese cultures. They will showcase their work at a

Since p is an arbitrary chosen point of U and p is an interior point of U, U is open.. We see that p is never an interior point

Interior-point methods not amenable to warm start.. Chen,T

Interior-point methods not amenable to warm start.. Chen,T