# A PATH-FOLLOWING ALGORITHM FOR LINEAR PROGRAMMING USING LOGARITHMIC PENALTY FUNCTIONS*

PAUL TSENG†

Abstract. Motivated by a recent work of Setiono, a path-following algorithm for linear programming using both logarithmic and quadratic penalty functions is proposed. In the algorithm, a logarithmic and a quadratic penalty is placed on, respectively, the nonnegativity constraints and an arbitrary subset of the equality constraints; Newton's method is applied to solve the penalized problem, and after each Newton step the penalty parameters are decreased. This algorithm maintains neither primal nor dual feasibility and does not require a Phase I. It is shown that if the initial iterate is chosen appropriately and the penalty parameters are decreased to zero in a particular way, then the algorithm is linearly convergent. Numerical results are also presented suggesting that the algorithm may be competitive with interior point algorithms in practice, requiring typically between 30 and 45 iterations to accurately solve each Netlib problem tested.

Key words. linear program, path-following, Newton step, penalty function

AMS subject classifications. 49, 90

1. Introduction. Since the pioneering work of Karmarkar [Kar84], much interest has focused on solving linear programs using interior point algorithms. These interior point algorithms may be classified roughly as either (i) projective-scaling (potential reduction), (ii) affine-scaling, or (iii) path-following. We will not attempt to review the literature on this subject, which is vast (see, for example, ... for surveys).

Our interest is in algorithms of the path-following type, of the sort discussed in [GaZ81]. These interior point algorithms typically penalize the nonnegativity constraints by a logarithmic function and use Newton's method to solve the penalized problem, with the penalty parameters decreased after each Newton step (see, for example, [Gon89], [GMSTW86], [KMY89], [MoA87], [Ren88], [Tse89]).

One disadvantage of interior point algorithms is the need for an initial interior feasible solution. A common technique for handling this is to add an artificial column (see [AKRV89], [BDDW89], [GMSTW86], [Lus90], [MMS89], [MSSPB89], [MoM87]), but this itself has disadvantages. For example, the cost of the artificial column must be estimated, and some type of rank-1 updating is needed to solve each least squares problem, which can significantly increase the solution time and degrade the numerical accuracy of the solutions.

Recently, Setiono [Set89] proposed an interesting algorithm that combines features of a path-following algorithm with those of the method of multipliers (see [Roc76], [Ber82]). This algorithm does not require a feasible solution to start and is comparable to interior point algorithms both in terms of work per iteration and, according to the numerical results reported in [Set89], in terms of the total number of iterations. To describe the basic idea in Setiono's algorithm, consider

* Received by the editors April 16, 1990; accepted for publication (in revised form) June 11, 1992. This research was supported by the U.S. Army Research Office, contract DAAL03-86-K-0171, and was conducted while the author was with the Laboratory for Information and Decision Systems and the Center for Intelligent Control Systems, Massachusetts Institute of Technology, Cambridge.

† Department of Mathematics, GN-50, University of Washington, Seattle, Washington 98195 (tseng@math.washington.edu).


a linear program in the standard dual form

minimize −b^T p
subject to t + A^T p = c, t ≥ 0,

where A is some matrix and b and c are vectors of appropriate dimension. Let us attach a Lagrange multiplier vector x to the constraints t + A^T p = c and apply the method of multipliers to the above linear program. This produces the following iterations:

x^k = x^{k−1} + (t^k + A^T p^k − c)/ε^k, k = 1, 2, ...,

where {ε^k} is a sequence of monotonically decreasing positive scalars and (t^k, p^k) is some (inexact) solution of the augmented Lagrangian subproblem

(1.1) minimize −b^T p + ⟨x^{k−1}, t + A^T p − c⟩ + ‖t + A^T p − c‖²/(2ε^k)
      subject to t ≥ 0.

(An advantage of the above multiplier iterations is that they do not need a feasible solution to start.) A key issue associated with the above multiplier iterations concerns the efficient generation of an inexact solution (t^k, p^k) for each k. (Note that as ε^k decreases, the objective function of (1.1) becomes progressively more ill-conditioned.)

Setiono's algorithm may be viewed as the method of multipliers in which (t^k, p^k) is generated according to the following scheme, reminiscent of the path-following idea: Add a logarithmic penalty function −γ^k Σ_{j=1}^m ln(t_j) to the objective of (1.1), where γ^k is some positive scalar monotonically decreasing with k, and apply a single Newton step, starting from (t^{k−1}, p^{k−1}), to the resulting problem. (If the t^k thus obtained lies outside the positive orthant, it is moved back towards t^{k−1} until it becomes positive.¹)

In this paper, inspired by the preceding work of Setiono, we study an algorithm that also adds to the objective a quadratic penalty on the equality constraints and a logarithmic penalty on the nonnegativity constraints, and then solves the penalized problem using Newton's method, with the penalty parameters decreased after each Newton step. Unlike Setiono's algorithm, our algorithm does not use the multiplier vector x (so it may be viewed as a pure penalty method) and allows any subset of the equality constraints to be penalized. We show that if the problem is primal nondegenerate and the iterates start near the optimal solution of the initial penalized problem, then the penalty parameters can be decreased at the rate of a geometric progression and the iterates converge linearly. To the best of our knowledge, this is the first (global) linear convergence result for a noninterior point path-following algorithm. We also present numerical results indicating that the algorithm may potentially be competitive with interior point algorithms in practice.

We remark that penalty methods that use either the quadratic or the logarithmic penalty function have been well studied (see, for example, [Ber82], [FiM68], [Fri57], [JIO78], [Man84], [WBD88]), but very little is known about penalty methods that use both types of penalty functions (called mixed interior point-exterior point algorithms in ...).

¹ More precisely, t^k is given by the formula t^k = t^{k−1} + .98λ^k Δt^k, where Δt^k is the Newton direction (projected onto the space of t) and λ^k is the largest λ ∈ (0, 1] for which t^{k−1} + λΔt^k is nonnegative. (The choice of .98 is arbitrary; any number between 0 and 1 would do.)

Upon completion of this paper, the author learned from Professor O. L. Mangasarian that the algorithm discussed here is essentially the IDLN (... Dual Least Norm) algorithm described in the recent Ph.D. thesis of Setiono (see the report [Set91]), with minor differences in the choice of the penalty parameters. The same reference includes an (asymptotic) linear rate convergence analysis and extensive numerical results showing that the IDLN algorithm outperforms the popular simplex code MINOS by a factor of 2 or more on the 63 Netlib problems tested.

In short, Setiono's independent work provides further evidence of the practical efficiency of the mixed interior point-exterior point solution approach. Finally, we note that while this paper was under review, other noninterior point methods related to that studied here have been proposed. One method, brought to our attention by one of the referees, is a certain augmented Lagrangian algorithm for stochastic programming (see [MuR90]); another method is a primal-dual exterior point algorithm for linear programming (see [KMM91]). However, neither of these methods has been shown to possess the nice theoretical/numerical properties enjoyed by the algorithm studied here. For example, no convergence result is given for the method in [MuR90] and no numerical or rate of convergence result is given for the method in [KMM91].

This paper proceeds as follows: In §2 we describe the basic algorithm; in §3 we analyze its convergence; and in §4 we recount our numerical experience with it. In §5 we discuss extensions of this work.

In our notation, every vector is a column vector in some k-dimensional real space ℜ^k, and superscript T denotes transpose. For any vector x, we denote by x_j the jth coordinate of x, by Diag(x) the diagonal matrix whose jth diagonal entry is x_j, and by ‖x‖₁, ‖x‖, ‖x‖_∞ the L₁-norm, the L₂-norm, and the L_∞-norm of x, respectively. For any matrix A, we denote by A_j the jth column of A. We also denote by e the vector of 1's (whose dimension will be clear from the context) and denote by ln(·) the natural logarithm function.

2. Algorithm description. Let A be an n × m matrix, B an l × m matrix, b an n-vector, c an m-vector, and d an l-vector. Consider the following linear program associated with A, B, b, c, and d:

(D) minimize −b^T p
    subject to t + A^T p = c, Bt = d, t ≥ 0,

which we call the dual problem. The dual problem may be viewed as a standard linear program in t, in which we arbitrarily partition the equality constraints into two subsets and express one subset in the generator form t = c − A^T p. (To see this, consider the problem of minimizing a^T t subject to t ∈ R ∩ S, t ≥ 0, where a is an m-vector and R and S are affine sets in ℜ^m. We can always express R = { t | t = c − A^T p for some p } and S = { t | Bt = d } for some matrices A and B and some vectors c and d. Using t = c − A^T p to eliminate t from the objective function, the problem is now in the form (D).) The constraints t + A^T p = c can be thought of as the complicating constraints which, if removed, would make (D) much easier to solve. The advantages of splitting the constraints in this manner will be explained at the end of §2. Finally, we note that the form in which we write the equality constraints is unimportant, and the above form is adopted for notational convenience only.

By attaching Lagrange multiplier vectors x and y to the constraints t + A^T p = c and Bt = d, respectively, we obtain the following dual of (D):

(P) minimize c^T x + d^T y
    subject to Ax = b, x + B^T y ≥ 0,

which we call the primal problem.

We make the following blanket assumptions, which are standard for interior point algorithms, regarding (P) and (D).

Assumption A.
(a) The set { (x, y) | Ax = b, x + B^T y > 0 } is nonempty.
(b) The set { (t, p) | Bt = d, t + A^T p = c, t > 0 } is nonempty.
(c) A has full row rank.

It is well known that, under parts (a) and (b) of Assumption A, both (D) and (P) have nonempty bounded optimal solution sets.

Consider the dual problem (D). Suppose that we place a quadratic penalty on the constraints t + A^T p = c with a penalty parameter 1/ε (ε > 0) and we place a logarithmic penalty on the constraints t ≥ 0 with a penalty parameter γ > 0. This gives the following approximation to (D):

(D_{ε,γ}) minimize f_{ε,γ}(t, p)
          subject to Bt = d, t > 0,

where f_{ε,γ} : (0, ∞)^m × ℜ^n → ℜ is the penalized objective function, given (after scaling by ε) by

(2.1) f_{ε,γ}(t, p) = ‖t + A^T p − c‖²/2 − ε b^T p − εγ Σ_{j=1}^m ln(t_j), ∀ t > 0, ∀ p.

The penalized problem (D_{ε,γ}) is well behaved: f_{ε,γ} is twice differentiable and the Hessian ∇²f_{ε,γ} is positive definite everywhere (cf. Assumption A(c)). Since the feasible set of (D_{ε,γ}) is nonempty (cf. Assumption A(b)) and its intersection with any level set of f_{ε,γ} is bounded (cf. Assumption A(a)), (D_{ε,γ}) has an optimal solution which, by the strict convexity of f_{ε,γ}, is unique.

Note 1. We can use penalty functions other than the quadratic and the logarithmic. For example, we can use a cubic in place of the quadratic and 1/t_j in place of −ln(t_j). The quadratic and the logarithmic function, however, have nice properties (such as the fact that the second derivative of the logarithmic function equals minus the square of its first derivative) which make global convergence analysis possible.

It is well known that (t, p) is the optimal solution of (D_{ε,γ}) if and only if it satisfies, together with some u ∈ ℜ^l, the Kuhn-Tucker conditions

(2.2) t > 0, Bt = d, ∇f_{ε,γ}(t, p) + (B^T u, 0) = 0.

Straightforward calculation using (2.1) finds that

(2.3) ∇f_{ε,γ}(t, p) = ( t + A^T p − c − εγ T^{−1} e, A(t + A^T p − c) − ε b ),

(2.4) ∇²f_{ε,γ}(t, p) = [ I + εγ T^{−2}  A^T ; A  AA^T ],

where T = Diag(t). The above formulas will be used extensively in the subsequent analysis. Note that ∇²f_{ε,γ} is ill-conditioned at the boundary of its domain.
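Multiplying the t-component of the Kuhn-Tucker stationarity condition by T makes the connection with the usual central path explicit. A short derivation sketch; the scaled form of the penalized objective used here, and the labels x and y, follow our reconstruction above:

```latex
\begin{aligned}
\nabla_t f_{\varepsilon,\gamma}(t,p) + B^T u
  &= (t + A^T p - c) - \varepsilon\gamma\, T^{-1} e + B^T u
   = \varepsilon\bigl(x + B^T y - \gamma\, T^{-1} e\bigr),\\
\nabla_p f_{\varepsilon,\gamma}(t,p)
  &= A(t + A^T p - c) - \varepsilon b = \varepsilon\,(Ax - b),
\qquad x := \tfrac{1}{\varepsilon}(t + A^T p - c),\quad y := \tfrac{1}{\varepsilon}\,u.
\end{aligned}
```

So stationarity says exactly that Ax = b and T(x + B^T y) = γe: the recovered pair (x, y) is primal feasible and γ-centered, which is the form exploited in §3.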

It is not difficult to show that, as ε and γ tend to zero, the optimal solution of (D_{ε,γ}) approaches the optimal solution set of (D) (cf. Lemma 3.1). This suggests the following algorithm for solving (D): At each iteration, we are given ε, γ and a (t, p) which is an approximate solution of (D_{ε,γ}); we apply a Newton step to (D_{ε,γ}) at (t, p) to generate a new (t, p), and then we decrease ε and γ. In other words, we consider a sequence of penalized problems (D_{ε^k,γ^k}), k = 1, 2, ..., with ε^k → 0 and γ^k → 0, and we use a Newton step to follow the optimal solution of one penalized problem to that of the next. We now formally state this algorithm, which we call the QLPPF (quadratic and logarithmic penalty path-following) algorithm.

QLPPF Algorithm

Iteration 0. Choose ε¹ > 0 and γ¹ > 0. Choose (t¹, p¹) ∈ (0, ∞)^m × ℜ^n with Bt¹ = d.

Iteration k. Given ε^k > 0, γ^k > 0, and (t^k, p^k) ∈ (0, ∞)^m × ℜ^n with Bt^k = d, compute (Δt^k, Δp^k, u^k) to be a solution of

(2.5) ∇²f_{ε^k,γ^k}(t^k, p^k)(Δt^k, Δp^k) + ∇f_{ε^k,γ^k}(t^k, p^k) + (B^T u^k, 0) = 0, BΔt^k = 0,

and set

(2.6) t^{k+1} = t^k + Δt^k, p^{k+1} = p^k + Δp^k,
(2.7) γ^{k+1} = α^k γ^k, ε^{k+1} = α^k ε^k,

where α^k is some scalar in (0, 1).
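To make the iteration concrete, here is a minimal NumPy sketch of one QLPPF step for the special case B = 0 (so the constraint BΔt = 0 drops out and (2.5) is a plain Newton system). The scaled objective f_{ε,γ}(t, p) = ‖t + A^T p − c‖²/2 − εb^T p − εγ Σ_j ln t_j and the .98 pullback follow the descriptions in this paper, but the code itself is our illustration, not the authors' implementation:

```python
import numpy as np

def qlppf_step(A, b, c, t, p, eps, gamma):
    """One Newton step on f(t,p) = 0.5*||t + A^T p - c||^2
    - eps*b^T p - eps*gamma*sum(log t), the B = 0 case of (2.5)."""
    m, n = len(t), len(p)
    r = t + A.T @ p - c                      # quadratic-penalty residual
    grad = np.concatenate([r - eps * gamma / t, A @ r - eps * b])
    H = np.block([[np.eye(m) + np.diag(eps * gamma / t**2), A.T],
                  [A, A @ A.T]])
    d = np.linalg.solve(H, -grad)
    dt, dp = d[:m], d[m:]
    lam = 1.0                                # pull back to keep t positive
    if np.any(t + dt <= 0):
        neg = dt < 0
        lam = 0.98 * float(np.min(t[neg] / -dt[neg]))
    return t + lam * dt, p + lam * dp
```

On a toy problem with A = I and b = c = e, the iterates track the minimizers t = γe, p = (1 − γ + ε)e of the penalized problem as ε and γ are shrunk geometrically.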

Note 2. As we noted earlier, the QLPPF Algorithm is closely linked to the algorithms proposed by Setiono. In particular, it can be seen from (2.3), (2.4) that, in the special case where B is the zero matrix, the direction finding problem (2.5) is identical to that in the IDLN algorithm of Setiono and differs from that in the IDPP algorithm of Setiono (see [Set89, Eq. (6)]) by only an order-ε^k term on the right-hand side (a term that tends to zero as k tends to ∞).

A unique feature of the QLPPF Algorithm lies in its handling of the two sets of constraints t + A^T p = c and Bt = d, whereby quadratic penalties are placed only on the first set while the second set is maintained to be satisfied at all iterations. This feature has the advantage that it enables the QLPPF Algorithm to provide a unified framework for interior point methods and exterior point methods. As an example, if we put all the equality constraints into Bt = d (and correspondingly set A = I and c = 0), then it can be seen that the QLPPF Algorithm with γ^k = ε^k for all k reduces to the well-known primal path-following algorithm for maximizing b^T t subject to the constraints Bt = d, t ≥ 0 (see [Gon89], [Tse89], [Ye87]). If we put all the equality constraints into t + A^T p = c (and correspondingly set B = 0 and d = 0), then, as was noted above, the QLPPF Algorithm reduces to the IDLN algorithm for maximizing b^T p subject to the constraints t + A^T p = c, t ≥ 0. For other choices of constraint splitting, we obtain algorithms somewhere in between. We can then envision choosing a constraint splitting so the corresponding QLPPF Algorithm is in some sense most efficient (e.g., fastest convergence) for the given problem. Alternatively, we may choose a constraint splitting so as to ensure that certain "critical" equality constraints are satisfied exactly at all iterations (by putting these constraints into Bt = d).

3. Global convergence. In this section, we show that if (D) is in some sense primal nondegenerate and if (t¹, p¹) is "close" to the optimal solution of (D_{ε¹,γ¹}) in the QLPPF Algorithm, then, by decreasing the ε^k's and γ^k's at an appropriate rate, the iterates generated by the QLPPF Algorithm approach the optimal solution set of (D) (cf. Theorem 3.4). Because the Hessian ∇²f_{ε,γ} is ill-conditioned at the boundary of its domain, the proof of this result is quite involved and relies critically on finding a suitable Lyapunov (i.e., merit) function to monitor the progress of the algorithm.

For any ε > 0 and γ > 0, let ψ_{ε,γ} : (0, ∞)^m × ℜ^n × ℜ^l → [0, ∞) be the function given by

(3.1) ψ_{ε,γ}(t, p, u) = (1/(εγ)) ‖T(∇_t f_{ε,γ}(t, p) + B^T u)‖ + (1/√(εγ)) ‖∇_p f_{ε,γ}(t, p)‖,

where T = Diag(t). From (2.2) and (2.3) we see that (t, p) is the solution of (D_{ε,γ}) if and only if it satisfies Bt = d, t > 0 and ψ_{ε,γ}(t, p, u) = 0 for some u. Hence ψ_{ε,γ} acts as a Lyapunov function which measures how far (t, p) is from solving (D_{ε,γ}).

For any ε > 0, let

(3.2) F_ε = { (t, p) | A(t + A^T p − c) = εb, Bt = d, t > 0 }.

The following lemma shows that any (t, p) ∈ F_ε that satisfies ψ_{ε,γ}(t, p, u) ≤ β, for some u, is within an order of ε + (1 + β)γ of being an optimal solution of (D).

LEMMA 3.1. Fix any ε > 0, any γ > 0, and any β ∈ [0, 1]. For any (t, p) ∈ F_ε that satisfies ψ_{ε,γ}(t, p, u) ≤ β for some u, the vector (x, y) given by

(3.3) x = (t + A^T p − c)/ε, y = u/ε,

is feasible for (P) and, together with (t, p), satisfies

... ≤ 2(ρ*(ε)² + m(1 + β)εγ),

where v* denotes the optimal cost of (P) and ρ* = ‖x*‖²/2 for some optimal solution (x*, y*) of (P).

Proof. Since (t, p) is in F_ε, it follows from the definition of F_ε and (3.3) that Ax = b. Since ψ_{ε,γ}(t, p, u) ≤ β, we have from the definition of ψ_{ε,γ} and (3.3) that

(3.4) ‖T(x + B^T y) − γe‖ ≤ βγ,

where T = Diag(t). Thus (1 − β)γ e ≤ T(x + B^T y) ≤ (1 + β)γ e componentwise. Since t > 0 and β ≤ 1, the first of these inequalities yields x + B^T y ≥ 0. Hence (x, y) is feasible for (P).

Let η = ... + c^T x + d^T y − ... . Straightforward algebra using (3.3) yields

η = ... + t^T(x + B^T y) ... .

Also, it follows from strong duality for a convex quadratic program and its dual that

... .

Thus, by letting (x*, y*) be any optimal solution of (P) with ‖x*‖²/2 = ρ*, we obtain from the above two relations that

c^T x + d^T y ≤ v* + ... + t^T(x + B^T y) ... .

Similarly, by letting (t*, p*) be any optimal solution of (D), we obtain from the same two relations and the facts that (t*, p*) attains the optimal cost and t* + A^T p* = c that

−b^T p ≤ v* + ... + t^T(x + B^T y) ... .

Since 0 ≤ t^T(x + B^T y) ≤ m(1 + β)γ (cf. (3.4)), the above two relations yield c^T x + d^T y ≤ v* + ... and −b^T p ≤ v* + ... . Finally, since (x, y) is feasible for (P), so that c^T x + d^T y ≥ v*, the first of these two relations also yields ... . Canceling v* from both sides and using (3.3) completes the proof. □

Since we are dealing with linear programs, Lemma 3.1 implies that, as ε ↓ 0 and γ ↓ 0, the (x, y) given by (3.3) approaches the optimal solution set of (P) and (t, p) approaches the optimal solution set of (D). In fact, it suffices to decrease ε and γ as far as 2^{−κL}, where κ is some scalar constant and L is the size of the problem encoding in binary (defined as, say, in [Kar84]), at which time an optimal solution of (P) and of (D) can be recovered by using the techniques described in, for example, ... .

For each ε > 0 and γ > 0, let θ_{εγ} : (0, ∞)^m → [0, ∞) be the function given by

θ_{εγ}(t) = ...,

where

D = (I + εγT^{−2})^{−1}, E = ..., F = ...,

and T = Diag(t). (F is well defined because E is a projection and ‖D‖ < 1, so that I − D^{1/2}ED^{1/2} is positive definite. We also use the assumption that A has full row rank.) The quantity θ_{εγ}(t) estimates the norm squared of a certain projection-like operator depending on A and t, and it will be used extensively in our analysis. In general, θ_{εγ}(t) is rather cumbersome to evaluate but, as we shall see, it suffices for our analysis to bound it from above (cf. Lemma 3.3(b)).

3.1. Analyzing a Newton step. In this subsection we prove a key lemma that states that if (t̄, p̄) is "close" to the optimal solution of (D_{ε̄,γ̄}), then the (t, p) generated by applying one Newton step to (D_{ε̄,γ̄}) at (t̄, p̄) is close to the optimal solution of (D_{ε,γ}) for some ε < ε̄ and some γ < γ̄. The notion of "closeness" is measured by the function ψ_{ε,γ}, and the proof of the lemma is based on the ideas used in [Tse89, §2].

LEMMA 3.2. For any ε̄ > 0, any γ̄ > 0, and any (t̄, p̄) ∈ (0, ∞)^m × ℜ^n with Bt̄ = d, let (t, p) be given by

t = t̄ + Δt, p = p̄ + Δp,

where ū and (Δt, Δp) together solve the following system of linear equations

∇²f_{ε̄,γ̄}(t̄, p̄)(Δt, Δp) + ∇f_{ε̄,γ̄}(t̄, p̄) + (B^T ū, 0) = 0, BΔt = 0.

Suppose that ψ_{ε̄,γ̄}(t̄, p̄, ū) ≤ β for some β < min{1, 1/θ_{ε̄γ̄}(t̄)}. Then the following hold:

(a) t > 0, Bt = d, and A(t + A^T p − c) = ε̄b (so that (t, p) ∈ F_{ε̄});

(b) for any α satisfying

(3.12) max{ ... } ≤ α ≤ 1,

we have ψ_{αε̄,αγ̄}(t, p, ū) ≤ β.

Proof. Let g = ∇f_{ε̄,γ̄}(t̄, p̄) and T̄ = Diag(t̄). Then, by using (2.3) and (2.4), we can write the above system equivalently as

..., BΔt = 0,

where for convenience we let D = (I + ε̄γ̄T̄^{−2})^{−1}. Solving for Δt gives

Δt = ...,

where we let E = ... and F = ... . (F is well defined by the same reasoning that the matrix F in the definition of θ_{ε̄γ̄} is well defined.) Then we can bound Δt as follows: since ... is diagonal and each of its diagonal entries is less than 1, we obtain that ..., and hence

(3.16) ‖T̄^{−1}Δt‖ ≤ ...,

where the last inequality follows from (3.15) and the definition of θ_{ε̄γ̄}(t̄). Also, by using (2.3) and (2.4), we can write the same system equivalently as ... = 0 and ... = 0, so from these relations we obtain

(3.19) ... ≤ ... θ_{ε̄γ̄}(t̄)(β)²,

where T = Diag(t), ΔT = Diag(Δt), and the third and the last equality follow from ... .

(a) We have from (3.16) and the hypothesis θ_{ε̄γ̄}(t̄)β < 1 that ‖T̄^{−1}Δt‖ < 1, which together with t̄ > 0 yields t = t̄ + Δt > 0. Also, BΔt = 0 together with Bt̄ = d yields Bt = d, and ... yields A(t + A^T p − c) = ε̄b. Hence (a) holds.

(b) Fix any α satisfying (3.12) and let ε = αε̄, γ = αγ̄. (Note that because θ_{ε̄γ̄}(t̄)β < 1, the left-hand quantity in (3.12) is strictly less than 1, so such an α exists.) Let

r = T(∇_t f_{ε,γ}(t, p) + B^T ū).

Then the triangle inequality and ... imply

‖r‖ ≤ ... + (1/α − 1)... ,

which together with the fact ... yields

(3.20) ‖r‖/(εγ) ≤ ... .

Let s = ∇_p f_{ε,γ}(t, p) = A(t + A^T p − c) − εb. By using part (a) we have s = (1 − α)ε̄b, which together with the fact α ≤ 1 yields

‖s‖/√(εγ) = (1/α − 1)‖b‖√(ε̄/γ̄) ≤ ... .

This together with (3.20) and the definition of ψ_{ε,γ} proves our claim. □

3.2. Bounds. Lemma 3.2 shows that if the rate α at which the penalty parameters ε and γ are decreased is not too small (see (3.12)), then a single Newton step suffices to keep the current iterate close to the optimal solution of the penalized problem (D_{ε,γ}). Thus, in order to establish the (linear) convergence of the QLPPF Algorithm, it suffices to bound α away from 1 which, according to (3.12), amounts to bounding θ_{εγ}(t) by some quantity independent of t and ε, γ. It is not difficult to see that such a bound does not exist for arbitrary t. Fortunately, we need to consider only those t that, together with some p, are close to the optimal solution of (D_{ε,γ}), in which case, as we show below, such a bound does exist (provided that a certain primal nondegeneracy assumption also holds). The proof of this is somewhat intricate: For ε and γ large, we argue by showing that t cannot be too large, i.e., is of the order ε + 1 (cf. Lemma 3.3(a)) and, for ε and γ small, we argue by showing that, under the primal nondegeneracy assumption, the columns of A corresponding to those components of t which are small are of rank n.

LEMMA 3.3. (a) For all ε > 0, all γ > 0, and all (t, p) ∈ F_ε satisfying ψ_{ε,γ}(t, p, u) ≤ 1 for some u, we have

‖(t, p)‖ ≤ M₁(1 + ε + γ),

where M₁ > 0 is some scalar depending on A, B, b, c, and d only.

(b) Suppose that (D) is primal nondegenerate in the sense that, for every optimal solution (x*, y*) of (P), those columns of A corresponding to the positive components of x* + B^T y* have rank n. Then, for all ε > 0, all γ > 0, and all (t, p) ∈ F_ε satisfying ψ_{ε,γ}(t, p, u) ≤ 1 for some u, we have

θ_{εγ}(t) ≤ ω(ε/γ),

where ω : [0, ∞) → [1, ∞) is given by ω(w) = ..., with M₂ ≥ 1 some scalar depending on A, B, b, c, and d only.

Proof. (a) Suppose the contrary, so that there exist sequences {ε^k}, {γ^k}, and {(t^k, p^k, u^k)} such that

..., ψ_{ε^k,γ^k}(t^k, p^k, u^k) ≤ 1 ∀k, ‖(t^k, p^k)‖ → ∞.

By passing into a subsequence if necessary, we will assume that (t^k, p^k)/‖(t^k, p^k)‖ converges to some limit point, say (t^∞, p^∞). We have from (t^k, p^k) ∈ F_{ε^k} (using (3.1) and (3.2)) that Bt^k = d, t^k > 0 ∀k, and from Lemma 3.1 that

c^T x^k + d^T y^k ≤ v* + ..., ... ∀k.

Upon dividing both sides of the above four relations by ‖(t^k, p^k)‖ and letting k → ∞, we obtain from ... that

Bt^∞ = 0, t^∞ ≥ 0, ‖t^∞ + A^T p^∞‖ = 0, ‖(t^∞, p^∞)‖ = 1.

Then any optimal solution to (D) would remain optimal when moved along the direction (t^∞, p^∞), contradicting the boundedness of the optimal solution set of (D) (cf. Assumption A).

(b) Fix any ε > 0, any γ > 0, and any (t, p) ∈ F_ε satisfying ψ_{ε,γ}(t, p, u) ≤ 1 for some u. Let T = Diag(t) and let D, E, F be given as in the definition of θ_{εγ}, so that D = (I + εγT^{−2})^{−1}. From the definition of θ_{εγ} we then obtain

(3.24) θ_{εγ}(t) ≤ (... + ‖E‖ ‖D^{1/2}‖ ‖A‖ ‖F‖)² < (... + ‖A‖ ‖F‖)²,

where the strict inequality follows from the facts ‖E‖ ≤ 1, ‖D‖ < 1. Now we bound ‖F‖. We have, for every z,

z^T F^{−1} z ≥ ... ≥ σ Σ_{j=1}^m (A_j^T z)²/... ,

where the first inequality follows from ‖D‖ < 1 and σ > 0 denotes the smallest eigenvalue of AA^T (σ > 0 because A has full row rank). This yields

(3.26) ‖F‖ ≤ 2‖t‖²/(...) .

For ε and γ near zero, we give below a different bound on ‖F‖. By the primal nondegeneracy assumption, there exists a constant δ > 0 depending on A, B, b, c, and d only such that if (x, y) is any feasible solution of (P) with c^T x + d^T y ≤ v* + ..., then the columns { A_j | x_j ≥ δ } have rank n. Consider first the case in which ε + 2mγ < δ. Since ψ_{ε,γ}(t, p, u) ≤ 1, we have from Lemma 3.1 that the columns { A_j | x_j ≥ δ }, where x is given by (3.3), have rank n. Since ... ≤ 1, we have ..., so that x_j ≥ δ implies t_j ≤ ... . Then we obtain from ... that

z^T F^{−1} z ≥ ... ≥ σ′(δ)²‖z‖²/... ,

where the last inequality follows from the fact that those A_j for which t_j ≤ ... have rank n, and σ′ is some positive scalar depending on A only. Since ..., we then have

‖F‖ ≤ ... ≤ 8/((δ)²σ′).

Now consider the remaining case, where ε + 2mγ ≥ δ. Then part (a) and (3.26) together yield

‖F‖ ≤ ... .

Combining the above two cases, we conclude that ‖F‖ ≤ K if ε + 2mγ < δ and ‖F‖ ≤ ... otherwise, for some positive scalar K depending on A, B, b, c, and d only. Combining these bounds with (3.24), we conclude that θ_{εγ}(t) ≤ (M₂)⁴ ... for some scalar M₂ ≥ 1 depending on A, B, b, c, and d only. □

3.3. Main convergence result. By combining Lemmas 3.1 to 3.3, we obtain the following global convergence result for the QLPPF Algorithm.

THEOREM 3.4. Suppose that (P) is primal nondegenerate in the sense of Lemma 3.3(b) and that, for some scalar β satisfying

(3.30) ...,

the initial iterate satisfies

(3.29) t¹ > 0, Bt¹ = d, ψ_{ε¹,γ¹}(t¹, p¹, u¹) ≤ β.

If, in addition, we choose

(3.31) α^k = max{ ... } ∀k,

then ε^k → 0, γ^k → 0 linearly, and {((t^k + A^T p^k − c)/ε^k, u^k/ε^k)} and {(t^k, p^k)} approach the optimal solution sets of, respectively, (P) and (D).

Proof. First note from ... that ... < 1 for all w > 0, and from (3.29) and (3.30) that β < 1. Also note from (3.31) that ... . We claim that

(3.33) ψ_{ε^k,γ^k}(t^k, p^k, u^k) ≤ β

for all k ≥ 2. It is easily seen, by using (3.29)-(3.31) and Lemma 3.2, that (3.33) holds for k = 2. Suppose that (3.33) holds for all k ≤ h, for some h ≥ 2. Then (t^h, p^h) ∈ F_{ε^h} and ψ_{ε^h,γ^h}(t^h, p^h, u^h) ≤ β < 1. Since we also have from (2.7) that ε^h ≤ ε¹ and γ^h ≤ γ¹, and that ε^h/γ^h = ε¹/γ¹, we can apply Lemma 3.3 to conclude that

(3.34) θ_{ε^hγ^h}(t^h) ≤ ω(ε^{h−1}/γ^{h−1}) = ω(ε¹/γ¹),

where the equality follows from (2.7). Since (3.33) holds for k = h, we also have ψ_{ε^h,γ^h}(t^h, p^h, u^h) ≤ β, so Lemma 3.2 together with (3.31) and (3.34) shows that (3.33) holds for k = h + 1. Since (3.33) holds for all k ≥ 2, we see that (3.34) holds for all h ≥ 2. Then, by (3.30) and (3.34), θ_{ε^hγ^h}(t^h)β is less than 1 and bounded away from 1 for all h ≥ 2, so that, by (3.31), α^k is less than 1 and bounded away from 1 for all k. Hence ε^k → 0, γ^k → 0 at the rate of a geometric progression. The remainder of the proof follows from (3.33) and Lemma 3.1. □

In place of θ_{ε^kγ^k}(t^k), we can use, for example, the upper bound given in Lemma 3.3(b) in the formula (3.31), and linear convergence would be preserved. However, this bound is typically loose and difficult to compute. There is also the issue of finding (t¹, p¹) and u¹ satisfying the hypothesis of Theorem 3.4; this is addressed in §3.4.

3.4. Algorithm initialization. By Theorem 3.4, if the primal nondegeneracy assumption therein holds and if we can find ε, γ, (t, p), and u satisfying

(3.35)-(3.37) t > 0, Bt = d, ψ_{ε,γ}(t, p, u) ≤ β, ...,

then we can start the QLPPF Algorithm with ε, γ, (t, p), and we would obtain linear convergence. But how do we find such ε, γ, (t, p), and u?

One obvious way is to fix any ε > 0, any γ > 0, and then solve the penalized problem (D_{ε,γ}). The solution (t, p) obtained satisfies

t > 0, Bt = d, ∇f_{ε,γ}(t, p) + (B^T u, 0) = 0

for some u, so, by (3.1), we have ψ_{ε,γ}(t, p, u) = 0, and hence (3.35)-(3.37) hold. (Of course (D_{ε,γ}) need not be solved exactly.) To solve the problem (D_{ε,γ}), we can use any method for convex differentiable minimization (e.g., steepest descent), and we would typically want ε small and γ large so that (D_{ε,γ}) is well conditioned.

Suppose that there holds Be = 0 and Ae = b. (This holds, for example, when B is the zero matrix (which corresponds to the case when all equality constraints are penalized) and a change of variable x′ = (X̄)^{−1}x, where x̄ is any interior feasible solution of Ax = b, x > 0, and X̄ = Diag(x̄), has been made in (P).) Then we can find a usable ε, γ, (t, p), u immediately: Fix any ε = ... and let γ = ε, p = ... . Also let w = ... and u = ... . Then Bw = d and ..., so that

Bt = d, A(t + A^T p − c) = εb, t = ... .

Also, from ω(1) ≥ 1 and our choice of ε we have ..., so t > 0. Then, by (3.2), we have (t, p) ∈ F_ε and, by (3.1) and ε = γ, we have

(3.38) ψ_{ε,γ}(t, p, u) ≤ ... ≤ (... + ‖w‖)/(...),

where the last inequality follows from the triangle inequality and the nonexpansive property of projection matrices. From our choice of ε we see that ... ≤ .5/ω(1), so the right-hand side of (3.38) is bounded above by

... ≤ 1/ω(1) = 1/ω(ε/γ),

where the inequality follows from ω(1) ≥ 1 and the equality follows from γ = ε. Hence (3.37) holds; and since (t, p) ∈ F_ε, this together with the above shows that (3.36) holds as well.

4. Numerical results. To study the performance of the QLPPF Algorithm in practice, we have implemented the algorithm to solve the special case of (P) and (D) in which B is the zero matrix, i.e., (P) is of the form

(4.1) minimize c^T x
      subject to Ax = b, x ≥ 0.

(This corresponds to penalizing all equality constraints in the corresponding dual problem.) Below we describe our implementation and present our very preliminary numerical experience.

1. Initialization. In our implementation, we set for all problems ε¹ = γ¹ = ... ‖c‖₁/m, and set (somewhat arbitrarily) p¹ = 0, t¹ = max{e, ...}, where "max" is taken componentwise. (Note that since B is the zero matrix, t¹ can be set to any positive vector.) We have chosen t¹ so as to minimize ‖t¹ − ...‖ subject to t¹ ≥ e. Also, care must be exercised in choosing ε¹ and γ¹: if their values are set too low, then the QLPPF Algorithm may fail to converge; if their values are set too high, then the QLPPF Algorithm may require many iterations to converge. (Notice that we set ε¹ and γ¹ directly proportional to the average cost ‖c‖₁/m so their values scale with c.)

2. Steplength selection. To ensure that the t^k's remain inside the positive orthant, we employ a backtracking scheme similar to that used by Setiono: whenever t^k + Δt^k is outside the positive orthant, we replace the formula for t^{k+1} in (2.6) by

(4.2) t^{k+1} = t^k + .98λ^k Δt^k,

where

(4.3) λ^k = min{1, largest λ ∈ (0, 1] for which t^k + λΔt^k ≥ 0}.

However, this raises a difficulty, namely, for λ^k much smaller than 1, the vector (.98λ^kΔt^k, Δp^k) may be far from the Newton direction (Δt^k, Δp^k) and, as a consequence, the iterates may fail to converge. To remedy this, we in addition replace the formula for p^{k+1} in (2.6) by

p^{k+1} = p^k + .98λ^k Δp^k,

whenever nonconvergence is detected. (The parameter value .98 is chosen somewhat arbitrarily, but it works well in our tests.)
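The backtracking steplength described above can be expressed as a small helper (our illustrative sketch; `t` is the current iterate and `dt` the Newton direction in t):

```python
import numpy as np

def steplength(t, dt):
    """Return the damping factor for the t-update: 1 if the full Newton
    step keeps t positive, otherwise .98 times the largest feasible step."""
    if np.all(t + dt > 0):
        return 1.0
    neg = dt < 0                      # only decreasing components can hit zero
    return 0.98 * float(np.min(t[neg] / -dt[neg]))
```

The same factor is applied to Δp^k whenever nonconvergence is detected, as described above.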

The proper choice of the α^k's is very important for the QLPPF Algorithm: if the α^k's are too near 1 (so the penalty parameters decrease slowly), then the algorithm would converge slowly; if the α^k's are too near 0 (so the penalty parameters decrease rapidly), then the algorithm may fail to converge. In our implementation we adjust the α^k's dynamically according to the following rule:

α^k = { .95 α^{k−1}     if λ^{k−1} = 1;
        (α^{k−1})^{.6}  if λ^{k−1} ≤ .2;
        α^{k−1}         otherwise,

∀k ≥ 2, with α¹ set to .5. The rationale for this adjustment rule is that, if λ^k = 1, then the current iterate is closely following the solution trajectory (so we can decrease the penalty parameters at a faster rate and still retain convergence) and, if λ^k ≤ .2, then the current iterate is unable to follow the solution trajectory (so we must decrease the penalty parameters at a slower rate).

3. Termination. To avoid numerical problems, we stop decreasing the penalty parameters ε^k and γ^k when they reach some prespecified tolerances ε_min and γ_min, respectively. In our tests we set ε_min = γ_min = ... . We terminate the QLPPF Algorithm when the relative duality gap and the violation of primal and dual feasibility are small. More specifically, we terminate whenever the current iterate, denoted by (t, p), satisfies

(4.4)-(4.6) ...,

where x is given in terms of (t, p) as in Lemma 3.1 and [·]⁺ denotes the orthogonal projection onto the nonnegative orthant. Only for three of our test problems could the above termination criterion not be met (due to violation of (4.5) and (4.6)), in which case the algorithm is terminated whenever primal feasibility is met and ... − v* is less than 5·10^{−7}, where v* denotes the optimal cost of (4.1).
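A stopping test of this kind can be sketched as follows; the 10^{-8} tolerance and the exact residual formulas below are our assumptions for illustration, not the paper's displayed criteria (4.4)-(4.6):

```python
import numpy as np

def should_stop(A, b, c, x, p, tol=1e-8):
    """Stop when the relative duality gap and the primal and dual
    feasibility violations are all below tol (tolerance assumed)."""
    gap = abs(c @ x - b @ p) / (1.0 + abs(c @ x))        # relative duality gap
    primal = np.linalg.norm(A @ x - b) / (1.0 + np.linalg.norm(x))
    dual = np.linalg.norm(np.maximum(A.T @ p - c, 0.0)) / (1.0 + np.linalg.norm(p))
    return max(gap, primal, dual) <= tol
```

Here x would be the primal estimate recovered from (t, p) as in Lemma 3.1, and p the current dual iterate.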

4. Solving for the direction. The most expensive computation at each iteration of the QLPPF Algorithm lies in solving the system of linear equations (2.5). This can be seen to entail solving a single linear system of the form

(4.7) AQA^T w = z

for w, where z is some n-vector and Q is some m × m diagonal matrix whose jth diagonal entry is ..., with ε > 0, γ > 0, and t some positive m-vector. (Linear systems of the form (4.7) also arise in interior point algorithms, but (4.7) has the nice property that the condition number of Q can be controlled by adjusting the penalty parameters ε and γ.) In our implementation, (4.7) is solved using YSMP, a sparse matrix package for symmetric positive semidefinite systems developed at Yale University (see [EGSS79], [EGSS82]) and a precursor to the commercial package SMPAK (Scientific Computing Associates, 1985). YSMP comprises a set of Fortran routines implementing the Cholesky decomposition scheme and, as a preprocessor, the minimum-degree ordering algorithm (see, e.g., [GeL81]). In our implementation, the minimum-degree ordering routine ORDER is called first to obtain a permutation of the rows and columns of the matrix AA^T so that fill-in is reduced during factorization. Then, AA^T is symbolically factored using the routine SSF. (SSF is called only once since the nonzero pattern of AQA^T does not change with Q.) At each iteration, the matrix AQA^T is numerically factored by ...
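The per-iteration linear algebra of (4.7) can be sketched densely in NumPy (the paper uses sparse YSMP routines). The diagonal of Q here is our reconstruction from eliminating Δt in the Newton system, which gives Q = I − (I + εγT^{−2})^{−1}, i.e., Q_jj = εγ/(t_j² + εγ) ∈ (0, 1):

```python
import numpy as np

def solve_direction_system(A, t, eps, gamma, z):
    """Solve A Q A^T w = z with Q diagonal, Q_jj = eps*gamma/(t_j^2 + eps*gamma).
    All Q_jj lie in (0, 1), so the conditioning of Q is governed by the
    penalty parameters eps and gamma relative to the spread of t."""
    q = eps * gamma / (t**2 + eps * gamma)
    M = (A * q) @ A.T                 # A Q A^T (Q diagonal, so scale columns of A)
    L = np.linalg.cholesky(M)         # positive definite when A has full row rank
    return np.linalg.solve(L.T, np.linalg.solve(L, z))
```

In a sparse implementation, the ordering and symbolic factorization of AA^T are done once (as with ORDER and SSF above), and only the numeric factorization is redone as Q changes.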
