A PATH-FOLLOWING ALGORITHM FOR LINEAR PROGRAMMING USING QUADRATIC AND LOGARITHMIC PENALTY FUNCTIONS*

PAUL TSENG†

* Received by the editors April 16, 1990; accepted for publication (in revised form) June 11, 1992. This research was supported by the U.S. Army Research Office, contract DAAL03-86-K-0171, and was conducted while the author was with the Laboratory for Information and Decision Systems and the Center for Intelligent Control Systems, Massachusetts Institute of Technology, Cambridge.

† Department of Mathematics, GN-50, University of Washington, Seattle, Washington 98195 (tseng@math.washington.edu).

Abstract. Motivated by a recent work of Setiono, a path-following algorithm for linear programming using both logarithmic and quadratic penalty functions is proposed. In the algorithm, a logarithmic and a quadratic penalty is placed on, respectively, the nonnegativity constraints and an arbitrary subset of the equality constraints; Newton's method is applied to solve the penalized problem, and after each Newton step the penalty parameters are decreased. This algorithm maintains neither primal nor dual feasibility and does not require a Phase I. It is shown that if the initial iterate is chosen appropriately and the penalty parameters are decreased to zero in a particular way, then the algorithm is linearly convergent. Numerical results are also presented suggesting that the algorithm may be competitive with interior point algorithms in practice, requiring typically between 30 and 45 iterations to accurately solve each Netlib problem tested.

Key words. linear program, path-following, Newton step, penalty function

AMS subject classifications. 49, 90
1. Introduction. Since the pioneering work of Karmarkar [Kar84], much interest has focused on solving linear programs using interior point algorithms. These interior point algorithms may be classified roughly as either (i) projective-scaling (or potential reduction), (ii) affine-scaling, or (iii) path-following. We will not attempt to review the literature on this subject, which is vast (see, for example, [Meg89], [Tod89] for surveys). Our interest is in algorithms of the path-following type, of the sort discussed in [GaZ81]. These interior point algorithms typically penalize the nonnegativity constraints by a logarithmic function and use Newton's method to solve the penalized problem, with the penalty parameters decreased after each Newton step (see, for example, [Gon89], [GMSTW86], [KMY89], [MoA87], [Ren88], [Tse89]). One disadvantage of interior point algorithms is the need for an initial interior feasible solution. A common technique for handling this is to add an artificial column (see [AKRV89], [BDDW89], [GMSTW86], [Lus90], [MMS89], [MSSPB89], [MoM87]), but this itself has disadvantages. For example, the cost of the artificial column must be estimated, and some type of rank-1 updating is needed to solve each least squares problem, which can significantly increase the solution time and degrade the numerical accuracy of the solutions.

Recently, Setiono [Set89] proposed an interesting algorithm that combines features of a path-following algorithm with those of the method of multipliers [HaB70], [Hes69], [Pow69] (also see [Roc76], [Ber82]). This algorithm does not require a feasible solution to start and is comparable to interior point algorithms both in terms of work per iteration and, according to the numerical results reported in [Set89], in terms of the total number of iterations. To describe the basic idea in Setiono's algorithm, consider
a linear program in the standard dual form

    minimize  −b^T p
    subject to  t + A^T p = c,  t ≥ 0,

where A is some matrix and b and c are vectors of appropriate dimension. Let us attach a Lagrange multiplier vector x to the constraints t + A^T p = c and apply the method of multipliers to the above linear program. This produces the following iterations

    x^{k+1} = x^k + (1/ε^k)(t^k + A^T p^k − c),   k = 1, 2, ...,

where {ε^k} is a sequence of monotonically decreasing positive scalars and (t^k, p^k) is some (inexact) solution of the augmented Lagrangian subproblem

(1.1)    minimize  −b^T p + (x^k)^T (t + A^T p − c) + (1/(2ε^k)) ‖t + A^T p − c‖²
         subject to  t ≥ 0.

(An advantage of the above multiplier iterations is that they do not need a feasible solution to start.) A key issue associated with the above multiplier iterations concerns the efficient generation of an inexact solution (t^k, p^k) of the convex quadratic program (1.1) for each k. (Note that as ε^k decreases, the objective function of (1.1) becomes progressively more ill-conditioned.) Setiono's algorithm may be viewed as the method of multipliers in which (t^k, p^k) is generated according to the following scheme, reminiscent of the path-following idea: Add a logarithmic penalty function −γ^k Σ_{j=1}^m ln(t_j) to the objective of (1.1), where γ^k is some positive scalar monotonically decreasing with k, and apply a single Newton step, starting from (t^{k−1}, p^{k−1}), to the resulting problem. (If the t^k thus obtained lies outside the positive orthant, it is moved back towards t^{k−1} until it becomes positive.¹)
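To make the scheme concrete, the following Python sketch carries out one such multiplier iteration: a single damped Newton step on the barrier-augmented subproblem (1.1), followed by the multiplier update. It is a minimal sketch under our own choices (variable names, dense linear algebra, and the assumption that A has full row rank); it is not Setiono's implementation.

    import numpy as np

    def multiplier_iteration(A, b, c, t, p, x, eps, gamma):
        """One schematic iteration of the method of multipliers in which the
        subproblem (1.1), augmented with the barrier -gamma*sum(ln t_j), is
        approximately solved by a single damped Newton step (cf. footnote 1)."""
        m = t.size
        r = t + A.T @ p - c                      # residual of t + A'p = c
        # Gradient of -b'p + x'(t + A'p - c) + (1/(2*eps))||r||^2 - gamma*sum(ln t)
        g_t = x + r / eps - gamma / t
        g_p = -b + A @ (x + r / eps)
        # Hessian blocks of the same function (positive definite if A has full row rank)
        H_tt = np.diag(1.0 / eps + gamma / t**2)
        H_tp = A.T / eps
        H_pp = (A @ A.T) / eps
        H = np.block([[H_tt, H_tp], [H_tp.T, H_pp]])
        step = np.linalg.solve(H, -np.concatenate([g_t, g_p]))
        dt, dp = step[:m], step[m:]
        # Damp the step so that t stays strictly positive (the .98 of footnote 1).
        lam = 1.0
        neg = dt < 0
        if neg.any():
            lam = min(1.0, 0.98 * np.min(-t[neg] / dt[neg]))
        t_new, p_new = t + lam * dt, p + lam * dp
        # Multiplier update of the method of multipliers.
        x_new = x + (t_new + A.T @ p_new - c) / eps
        return t_new, p_new, x_new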
In this paper, inspired by the preceding work of Setiono, we study an algorithm that also adds to the objective a quadratic penalty on the equality constraints and a logarithmic penalty on the nonnegativity constraints, and then solves the penalized problem using Newton's method, with the penalty parameters decreased after each Newton step. Unlike Setiono's algorithm, our algorithm does not use the multiplier vector x (so it may be viewed as a pure penalty method) and allows any subset of the equality constraints to be penalized. We show that if the problem is primal nondegenerate and the iterates start near the optimal solution of the initial penalized problem, then the penalty parameters can be decreased at the rate of a geometric progression and the iterates converge linearly. To the best of our knowledge, this is the first (global) linear convergence result for a noninterior point path-following algorithm. We also present numerical results indicating that the algorithm may potentially be competitive with interior point algorithms in practice. We remark that penalty methods that use either the quadratic or the logarithmic penalty function have been well studied (see, for example, [Ber82], [FiM68], [Fri57], [JIO78], [Man84], [WBD88]), but very little is known about penalty methods that use both types of penalty functions (called mixed interior point-exterior point algorithms in [FiM68]).

¹ More precisely, t^k is given by the formula t^k = t^{k−1} + .98 λ^k Δt^k, where Δt^k is the Newton direction (projected onto the space of t) and λ^k is the largest λ ∈ (0, 1] for which t^{k−1} + λΔt^k is nonnegative. (The choice of .98 is arbitrary; any number between 0 and 1 would do.)
Upon completion of this paper, the author learned from Professor O. L. Mangasarian that the algorithm discussed here is essentially the IDLN (Interior Dual Least Norm) algorithm described in the recent Ph.D. thesis of Setiono [Set90] (also see the report [Set91]), with minor differences in the choice of the penalty parameters. The same reference includes an (asymptotic) linear rate convergence analysis and extensive numerical results showing that the IDLN algorithm outperforms the popular simplex code MINOS by a factor of 2 or more on the 63 Netlib problems tested. In short, Setiono's independent work provides further evidence of the practical efficiency of the mixed interior point-exterior point solution approach. Finally, we note that while this paper was under review, other noninterior point methods relating to that studied here have been proposed. One method, brought to our attention by one of the referees, is a certain augmented Lagrangian algorithm for stochastic programming (see [MuR90]); another method is a primal-dual exterior point algorithm for linear programming (see [KMM91]). However, neither of these methods has been shown to possess the nice theoretical/numerical properties enjoyed by the algorithm studied here. For example, no convergence result is given for the method in [MuR90], and no numerical or rate of convergence result is given for the method in [KMM91].
This paper proceeds as follows: In §2 we describe the basic algorithm; in §3 we analyze its convergence; and in §4 we recount our numerical experience with it. In §5 we discuss extensions of this work.

In our notation, every vector is a column vector in some k-dimensional real space ℜ^k, and superscript T denotes transpose. For any vector x, we denote by x_j the jth coordinate of x, by Diag(x) the diagonal matrix whose jth diagonal entry is x_j, and by ‖x‖₁, ‖x‖, ‖x‖_∞ the L1-norm, the L2-norm, and the L∞-norm of x, respectively. For any matrix A, we denote by A_j the jth column of A. We also denote by e the vector of 1's (whose dimension will be clear from the context) and denote by ln(·) the natural logarithm function.
2. Algorithm description. Let A be an n × m matrix, B be an l × m matrix, b be an n-vector, c be an m-vector, and d be an l-vector. Consider the following linear program associated with A, B, b, c, and d:

(D)    minimize  −b^T p
       subject to  t + A^T p = c,  Bt = d,  t ≥ 0,

which we call the dual problem. The dual problem may be viewed as a standard linear program in t, in which we arbitrarily partition the equality constraints into two subsets and express one subset in the generator form t + A^T p = c. (To see this, consider the problem of minimizing a^T t subject to t ∈ R ∩ S, t ≥ 0, where a is an m-vector and R and S are affine sets in ℜ^m. We can always express R = { t | t = c − A^T p for some p } and S = { t | Bt = d } for some matrices A and B and some vectors c and d. Using t = c − A^T p to eliminate t from the objective function, the problem is now in the form (D).) The constraints t + A^T p = c can be thought of as the complicating constraints which, if removed, would make (D) much easier to solve. The advantages of splitting the constraints in this manner will be explained at the end of §2. Finally, we note that the form in which we write the equality constraints is unimportant, and the above form is adopted for notational convenience only.
By attaching Lagrange multiplier vectors x and y to the constraints c − A^T p = t and Bt = d, respectively, we obtain the following dual of (D):

(P)    minimize  c^T x + d^T y
       subject to  Ax = b,  x + B^T y ≥ 0,

which we call the primal problem.

We make the following blanket assumptions, which are standard for interior point algorithms, regarding (D) and (P).

Assumption A.
(a) { (x, y) | Ax = b, x + B^T y > 0 } is nonempty.
(b) { (t, p) | Bt = d, t > 0, t + A^T p = c } is nonempty.
(c) A has full row rank.

It is well known that, under parts (a) and (b) of Assumption A, both (D) and (P) have nonempty bounded optimal solution sets.

Consider the dual problem (D). Suppose that we place a quadratic penalty on the constraints t + A^T p = c with a penalty parameter 1/ε (ε > 0) and we place a logarithmic penalty on the constraints t ≥ 0 with a penalty parameter γ > 0. This gives the following approximation to (D):

(D_{ε,γ})    minimize  f_{ε,γ}(t, p)
             subject to  Bt = d,  t > 0,

where f_{ε,γ} : (0, ∞)^m × ℜ^n → ℜ is the penalized objective function given by

(2.1)    f_{ε,γ}(t, p) = (1/2)‖c − t − A^T p‖² − εγ Σ_{j=1}^m ln(t_j) − εb^T p   ∀ t > 0, ∀ p.

The penalized problem (D_{ε,γ}) has the advantage that its objective function f_{ε,γ} is twice differentiable and the Hessian ∇²f_{ε,γ} is positive definite everywhere (via Assumption A(c)). Since the feasible set of (D_{ε,γ}) is nonempty (by Assumption A(b)) and its intersection with any level set of f_{ε,γ} is bounded (by Assumption A(a)), it is readily seen that (D_{ε,γ}) has an optimal solution which, by the strict convexity of f_{ε,γ}, is unique.
Note 1. We can use penalty functions other than the quadratic and the logarithmic. For example, we can use a cubic in place of the quadratic and t_j ln(t_j) in place of −ln(t_j). The quadratic and the logarithmic function, however, have nice properties (such as the second derivative of the logarithmic function equals minus the square of its first derivative) which make global convergence analysis possible.

It is well known (see [Roc70]) that (t, p) is the optimal solution of (D_{ε,γ}) if and only if it satisfies, together with some u ∈ ℜ^l, the Kuhn-Tucker conditions

(2.2)    t > 0,  Bt = d,  ∇f_{ε,γ}(t, p) + ( B^T u ; 0 ) = 0.

Straightforward calculation using (2.1) finds that

(2.3)    ∇f_{ε,γ}(t, p) = ( t + A^T p − c − εγ T^{-1} e ; A(t + A^T p − c) − εb ),

(2.4)    ∇²f_{ε,γ}(t, p) = ( I + εγ T^{-2}   A^T ; A   A A^T ),

where T = Diag(t). The above formulas will be used extensively in the subsequent analysis. Note that ∇²f_{ε,γ} is ill-conditioned at the boundary of its domain.

It is not difficult to show that, as ε and γ tend to zero, the optimal solution of (D_{ε,γ}) approaches the optimal solution set of (D) (see Lemma 3.1). This suggests the following algorithm for solving (D): At each iteration, we are given ε, γ and a (t, p) which is an approximate solution of (D_{ε,γ}); we apply a Newton step to (D_{ε,γ}) at (t, p) to generate a new (t, p) and then we decrease ε and γ. In other words, we consider a sequence of penalized problems (D_{ε^k,γ^k}), k = 1, 2, ..., with ε^k ↓ 0 and γ^k ↓ 0, and we use a Newton step to follow the optimal solution of one penalized problem to that of the next. We now formally state this algorithm, which we call the QLPPF (short for Quadratic-Logarithmic Penalty Path-Following) algorithm.

QLPPF Algorithm

Iteration 0. Choose ε^1 > 0 and γ^1 > 0. Choose (t^1, p^1) ∈ (0, ∞)^m × ℜ^n with Bt^1 = d.

Iteration k. Given (t^k, p^k) ∈ (0, ∞)^m × ℜ^n with Bt^k = d, compute (Δt^k, Δp^k, u^k) to be a solution of

(2.5)    ∇²f_{ε^k,γ^k}(t^k, p^k) ( Δt^k ; Δp^k ) + ∇f_{ε^k,γ^k}(t^k, p^k) + ( B^T u^k ; 0 ) = 0,   BΔt^k = 0,

and set

(2.6)    t^{k+1} = t^k + Δt^k,   p^{k+1} = p^k + Δp^k,
(2.7)    γ^{k+1} = α^k γ^k,   ε^{k+1} = α^k ε^k,

where α^k is some scalar in (0, 1).
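For concreteness, the following Python sketch assembles the gradient (2.3) and Hessian (2.4) and solves the direction-finding system (2.5) as one KKT system with the constraint BΔt = 0. It is illustrative only (our names, dense linear algebra, a direct solve instead of the factorization of §4), and it assumes A and B have full row rank so that the KKT matrix is nonsingular.

    import numpy as np

    def qlppf_iteration(A, B, b, c, t, p, eps, gamma, alpha):
        """One schematic QLPPF iteration: Newton system (2.5), full step (2.6),
        penalty update (2.7)."""
        n, m = A.shape
        l = B.shape[0]
        r = t + A.T @ p - c                          # residual of t + A'p = c
        # Gradient (2.3) and Hessian blocks (2.4) of f_{eps,gamma}
        g_t = r - eps * gamma / t
        g_p = A @ r - eps * b
        H_tt = np.eye(m) + eps * gamma * np.diag(1.0 / t**2)
        # KKT system: Newton step on (t, p) subject to B*dt = 0, with multiplier u
        K = np.block([
            [H_tt, A.T,              B.T],
            [A,    A @ A.T,          np.zeros((n, l))],
            [B,    np.zeros((l, n)), np.zeros((l, l))],
        ])
        rhs = np.concatenate([-g_t, -g_p, np.zeros(l)])
        sol = np.linalg.solve(K, rhs)
        dt, dp = sol[:m], sol[m:m + n]
        # Full Newton step (2.6) and penalty decrease (2.7); near the solution
        # trajectory the theory guarantees t + dt > 0 (in practice see Section 4.2).
        return t + dt, p + dp, alpha * eps, alpha * gamma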
Note 2. As we noted earlier, the QLPPF Algorithm is closely linked to the algorithms proposed by Setiono. In particular, it can be seen from (2.3), (2.4) that, in the special case where B is the zero matrix, the direction finding problem (2.5) is identical to that in the IDLN algorithm of Setiono (see [Set91, Eq. (16)]) and differs from that in the IDPP algorithm of Setiono (see [Set89, Eq. (6)]) by only an order ε^k term on the right-hand side (which tends to zero as ε^k tends to zero).

Note 3. A unique feature of the QLPPF Algorithm lies in its handling of the two sets of constraints t + A^T p = c and Bt = d, whereby quadratic penalties are placed only on the first set while the second set is maintained to be satisfied at all iterations. This feature has the advantage that it enables the QLPPF Algorithm to provide a unified framework for interior point methods and exterior point methods. As an example, if we put all the equality constraints into Bt = d (and correspondingly set A = I and c = 0), then it can be seen that the QLPPF Algorithm with γ^k = 1 for all k reduces to the well-known primal path-following algorithm for maximizing b^T t subject to the constraints Bt = d, t ≥ 0 (see [Gon89], [Tse89], [Ye87]). If we put all the equality constraints into t + A^T p = c (and correspondingly set B = 0 and d = 0), then, as was noted above, the QLPPF Algorithm reduces to the IDLN algorithm for maximizing b^T p subject to the constraints t + A^T p = c, t ≥ 0. For other choices of constraint splitting, we obtain algorithms somewhere in between. We can then envision choosing a constraint splitting so the corresponding QLPPF Algorithm is in some sense most efficient (e.g., fastest convergence) for the given problem. Alternatively, we may choose a constraint splitting so as to ensure that certain "critical" equality constraints are satisfied exactly at all iterations (by putting these constraints into Bt = d).
3. Global convergence. In this section, we show that if (D) is in some sense primal nondegenerate and if (t^1, p^1) is "close" to the optimal solution of (D_{ε^1,γ^1}) in the QLPPF Algorithm, then, by decreasing the ε^k's and γ^k's at an appropriate rate, the iterates {(t^k, p^k)} generated by the QLPPF Algorithm approach the optimal solution set of (D) (see Theorem 3.4). Because the Hessian ∇²f_{ε,γ} is ill-conditioned at the boundary of its domain, the proof of this result is quite involved and relies critically on finding a suitable Lyapunov (i.e., merit) function to monitor the progress of the algorithm.

For any ε > 0 and γ > 0, let ρ_{ε,γ} : (0, ∞)^m × ℜ^n × ℜ^l → [0, ∞) be the function given by

(3.1)    ρ_{ε,γ}(t, p, u) = max{ (1/(εγ)) ‖T(c − t − A^T p − B^T u) + εγe‖, (1/√(εγ)) ‖A(c − t − A^T p) + εb‖ }   ∀ t > 0, ∀ p, ∀ u,

where T = Diag(t). From (2.2) and (2.3) we see that (t, p) is a solution of (D_{ε,γ}) if and only if (t, p) satisfies Bt = d, t > 0 and ρ_{ε,γ}(t, p, u) = 0 for some u. Hence ρ_{ε,γ} acts as a Lyapunov function which measures how far (t, p) is from solving (D_{ε,γ}). This notion is made precise in the following lemma.
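The Lyapunov function is cheap to evaluate; a minimal Python sketch (our names, dense algebra, and the scaling factors of (3.1) as reconstructed above) is the following.

    import numpy as np

    def rho(A, B, b, c, t, p, u, eps, gamma):
        """Lyapunov function rho_{eps,gamma}(t, p, u) of (3.1): how far (t, p)
        is from solving the penalized problem (D_{eps,gamma})."""
        res1 = t * (c - t - A.T @ p - B.T @ u) + eps * gamma * np.ones(t.size)
        res2 = A @ (c - t - A.T @ p) + eps * b
        return max(np.linalg.norm(res1) / (eps * gamma),
                   np.linalg.norm(res2) / np.sqrt(eps * gamma))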
For any ε > 0, let

(3.2)    Y_ε = { (t, p) | A(c − t − A^T p) = −εb, Bt = d, t > 0 }.

The following lemma shows that any (t, p) ∈ Y_ε that satisfies ρ_{ε,γ}(t, p, u) ≤ β, for some u, is within an order of ε + (1 + β)γ of being an optimal solution of (D).

LEMMA 3.1. Fix any ε > 0, γ > 0 and β ∈ (0, 1]. For any (t, p) ∈ Y_ε that satisfies ρ_{ε,γ}(t, p, u) ≤ β for some u, the vector (x, y) given by

(3.3)    x = (t + A^T p − c)/ε,   y = u/ε

is feasible for (P) and, together with (t, p), satisfies

    c^T x + d^T y ≤ v* + η*ε + m(1 + β)γ,
    b^T p ≥ v* − m(1 + β)γ,
    ‖t + A^T p − c‖² ≤ 2(η*(ε)² + m(1 + β)εγ),

where v* denotes the optimal cost of (P) and η* = min{ ‖x*‖²/2 | (x*, y*) is an optimal solution of (P) }.
Proof. Since (t, p) is in Y_ε, it follows from the definition of Y_ε (see (3.2)) and (3.3) that Ax = b. Since ρ_{ε,γ}(t, p, u) ≤ β, we have from the definition of ρ_{ε,γ} (see (3.1)) and (3.3) that ‖T(−x − B^T y) + γe‖ ≤ γβ, where T = Diag(t). Thus

(3.4)    γ(1 − β)e ≤ T(x + B^T y) ≤ γ(1 + β)e.

Since t > 0 and β ≤ 1, the first inequality in (3.4) yields x + B^T y ≥ γ(1 − β)T^{-1}e ≥ 0. Hence (x, y) is feasible for (P).

Let

    d_ε(τ, π) = b^T π − (1/(2ε))‖τ + A^T π − c‖²   ∀(τ, π),
    p_ε(ξ, ζ) = (ε/2)‖ξ‖² + c^T ξ + d^T ζ   ∀(ξ, ζ).

Straightforward algebra using (3.3) yields p_ε(x, y) = d_ε(t, p) + t^T(x + B^T y). Also, it follows from strong duality for a convex quadratic program and its dual that

    min{ p_ε(ξ, ζ) | Aξ = b, ξ + B^T ζ ≥ 0 } = max{ d_ε(τ, π) | Bτ = d, τ ≥ 0 }.

Thus, by letting (x*, y*) be any optimal solution of (P) with ‖x*‖²/2 = η*, we obtain from the above two relations and c^T x* + d^T y* = v* that

    (ε/2)‖x‖² + c^T x + d^T y = p_ε(x, y) = d_ε(t, p) + t^T(x + B^T y) ≤ p_ε(x*, y*) + t^T(x + B^T y) = εη* + v* + t^T(x + B^T y).

Similarly, by letting (t*, p*) be any optimal solution of (D), we obtain from the same two relations and the facts b^T p* = v* and t* + A^T p* = c that

    b^T p − (1/(2ε))‖t + A^T p − c‖² = d_ε(t, p) = p_ε(x, y) − t^T(x + B^T y) ≥ d_ε(t*, p*) − t^T(x + B^T y) = v* − t^T(x + B^T y).

Since 0 ≤ t^T(x + B^T y) ≤ m(1 + β)γ (cf. (3.4)), the above two relations yield

    c^T x + d^T y ≤ v* + εη* + m(1 + β)γ,   b^T p ≥ v* − m(1 + β)γ.

Finally, since (x, y) is feasible for (P) so that c^T x + d^T y ≥ v*, the first of these two relations also yields

    (ε/2)‖x‖² + v* ≤ v* + εη* + m(1 + β)γ.

Canceling v* from both sides and using (3.3) completes the proof. □

Since we are dealing with linear programs, Lemma 3.1 implies that, as ε ↓ 0 and γ ↓ 0, the
(x, y) given by (3.3) approaches the optimal solution set of (P) and (t, p) approaches the optimal solution set of (D). In fact, it suffices to decrease ε and γ as far as a fixed scalar constant times 2^{−L}, where L is the size of the problem encoding in binary (defined as, say, in [Kar84]), at which time an optimal solution of (P) and of (D) can be recovered by using the techniques described in, for example, [Kar84] and [PaS82].

For each λ > 0, let θ_λ : (0, ∞)^m → [0, ∞) be the function given by

(3.5)    θ_λ(t) = ( ‖E + ED^{1/2}A^T F A D^{1/2}E‖ + ‖ED^{1/2}A^T F‖ )²   ∀ t > 0,

where

(3.6)    D = (I + λT^{-2})^{-1},
(3.7)    E = I − D^{1/2}B^T [BDB^T]^{-1} BD^{1/2},
(3.8)    F = [ A(I − D^{1/2}ED^{1/2})A^T ]^{-1},

and T = Diag(t). (F is well defined because ‖E‖ ≤ 1 (E is a projection matrix) and ‖D‖ < 1, so that I − D^{1/2}ED^{1/2} is positive definite. We also use the assumption that A has full row rank.) The quantity θ_λ(t) estimates the norm squared of a certain projection-like operator depending on A and t, and it will be used extensively in our analysis. In general, θ_λ(t) is rather cumbersome to evaluate, but, as we shall see, it suffices for our analysis to bound θ_λ(t) from above (see Lemma 3.3(b)).
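Although the algorithm itself never needs θ_λ(t), the quantity can be evaluated directly from (3.5)-(3.8) by dense linear algebra, as in the following illustrative sketch (it additionally assumes B has full row rank so that BDB^T is invertible).

    import numpy as np

    def theta(A, B, t, lam):
        """Evaluate theta_lambda(t) of (3.5)-(3.8); purely illustrative."""
        m = t.size
        D = np.diag(1.0 / (1.0 + lam / t**2))                          # (3.6)
        Dh = np.sqrt(D)                                                 # D^{1/2}
        BDh = B @ Dh
        E = np.eye(m) - BDh.T @ np.linalg.solve(B @ D @ B.T, BDh)       # (3.7)
        F = np.linalg.inv(A @ (np.eye(m) - Dh @ E @ Dh) @ A.T)          # (3.8)
        M1 = E + E @ Dh @ A.T @ F @ A @ Dh @ E
        M2 = E @ Dh @ A.T @ F
        return (np.linalg.norm(M1, 2) + np.linalg.norm(M2, 2)) ** 2     # (3.5)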
3.1. Analyzing a Newton step. In this subsection we prove a key lemma that states that if (t̄, p̄) is "close" to the optimal solution of (D_{ε̄,γ̄}), then the (t, p) generated by applying one Newton step to (D_{ε̄,γ̄}) at (t̄, p̄) is close to the optimal solution of (D_{ε,γ}) for some ε < ε̄ and some γ < γ̄. The notion of "closeness" is measured by the Lyapunov function ρ_{ε,γ}, and the proof of the lemma is based on the ideas used in [Tse89, §2].

LEMMA 3.2. For any ε̄ > 0, any γ̄ > 0 and any (t̄, p̄, ū) ∈ (0, ∞)^m × ℜ^n × ℜ^l with Bt̄ = d, let (t, p, u) be given by

(3.9)    t = t̄ + Δt,
(3.10)   p = p̄ + Δp,

where u and (Δt, Δp) together solve the following system of linear equations

(3.11)    ∇²f_{ε̄,γ̄}(t̄, p̄) ( Δt ; Δp ) + ∇f_{ε̄,γ̄}(t̄, p̄) + ( B^T u ; 0 ) = 0,   BΔt = 0.

Suppose that ρ_{ε̄,γ̄}(t̄, p̄, ū) ≤ β for some β < min{1, 1/θ_{ε̄γ̄}(t̄)}. Then the following hold:
(a) (t, p) ∈ Y_ε̄.
(b) For any α satisfying

(3.12)    max{ √( (θ_{ε̄γ̄}(t̄)(β)² + √m)/(β + √m) ), 1/(1 + √(γ̄/ε̄) β/‖b‖) } ≤ α < 1,

we have ρ_{αε̄,αγ̄}(t, p, u) ≤ β.

Proof. Let

(3.13)    ξ = T̄(c − t̄ − A^T p̄ − B^T ū) + ε̄γ̄e,
(3.14)    δ = A(c − t̄ − A^T p̄) + ε̄b,

where T̄ = Diag(t̄).
Then, by (3.1),

(3.15)    ‖ξ‖ ≤ ε̄γ̄β,   ‖δ‖ ≤ √(ε̄γ̄) β.

By using (2.3), (2.4) and (3.13), (3.14), we write (3.11) equivalently as

    D^{-1}Δt + A^TΔp + B^T(u − ū) = T̄^{-1}ξ,   AΔt + AA^TΔp = δ,   BΔt = 0,

where for convenience we let D = (I + ε̄γ̄T̄^{-2})^{-1}. Solving for Δt gives

    Δt = D^{1/2}(E − ED^{1/2}A^T F A D^{1/2}E)D^{1/2}T̄^{-1}ξ + D^{1/2}ED^{1/2}A^T Fδ,

where we let E = I − D^{1/2}B^T[BDB^T]^{-1}BD^{1/2} and F = [A(I − D^{1/2}ED^{1/2})A^T]^{-1}. (F is well defined by the same reasoning that the matrix given by (3.8) is well defined.) Then, we can bound T̄^{-1}Δt as follows:

    ‖T̄^{-1}Δt‖ ≤ ‖T̄^{-1}D^{1/2}‖²‖E − ED^{1/2}A^T F A D^{1/2}E‖‖ξ‖ + ‖T̄^{-1}D^{1/2}‖‖ED^{1/2}A^T F‖‖δ‖.

Since T̄^{-2}D is diagonal and each of its diagonal entries is less than 1/(ε̄γ̄), we obtain that ‖T̄^{-2}D‖ < 1/(ε̄γ̄) and hence

(3.16)    ‖T̄^{-1}Δt‖ ≤ ‖E + ED^{1/2}A^T F A D^{1/2}E‖‖ξ‖/(ε̄γ̄) + ‖ED^{1/2}A^T F‖‖δ‖/√(ε̄γ̄) ≤ √(θ_{ε̄γ̄}(t̄)) β,

where the last inequality follows from (3.15) and the definition of T̄, D, E, F, and θ_{ε̄γ̄}(t̄) (see (3.5)-(3.8)).

Now, by using (2.3) and (2.4), we can write (3.11) equivalently as BΔt = 0 and

(3.17)    (I + ε̄γ̄T̄^{-2})Δt + A^TΔp + B^T u = c − t̄ − A^T p̄ + ε̄γ̄T̄^{-1}e,
(3.18)    0 = A(c − t̄ − Δt − A^T p̄ − A^TΔp) + ε̄b,

so from (3.9) and (3.10) we obtain

    T(c − t − A^T p − B^T u) + ε̄γ̄e = (T̄ + ΔT)(c − t̄ − Δt − A^T p̄ − A^TΔp − B^T u) + ε̄γ̄e
        = (T̄ + ΔT)(ε̄γ̄T̄^{-2}Δt − ε̄γ̄T̄^{-1}e) + ε̄γ̄e
        = ε̄γ̄ ΔT T̄^{-2}Δt,

where T = Diag(t), ΔT = Diag(Δt), and the second equality follows from (3.17). This implies

(3.19)    ‖T(c − t − A^T p − B^T u) + ε̄γ̄e‖ ≤ ε̄γ̄‖T̄^{-1}Δt‖² ≤ ε̄γ̄ θ_{ε̄γ̄}(t̄)(β)²,

where the last inequality follows from (3.16).

(a) We have from (3.16) and the hypothesis β < min{1, 1/θ_{ε̄γ̄}(t̄)} that ‖T̄^{-1}Δt‖ ≤ √(θ_{ε̄γ̄}(t̄)) β < 1, so (3.9) and t̄ > 0 yield t = t̄ + Δt > 0. Also, Bt̄ = d together with BΔt = 0 (see (3.11)) and (3.9) yields Bt = d, and (3.18) together with (3.9), (3.10) yields 0 = A(c − t − A^T p) + ε̄b. Hence (t, p) ∈ Y_ε̄.

(b) Fix any α satisfying (3.12) and let ε = αε̄, γ = αγ̄. (Note that because θ_{ε̄γ̄}(t̄)β < 1, the left-hand quantity in (3.12) is strictly less than 1, so such an α exists.) Let

    r = T(c − t − A^T p − B^T u) + εγe.

Then the triangle inequality and (3.19) imply

    ‖r‖/(εγ) ≤ ‖T(c − t − A^T p − B^T u) + ε̄γ̄e‖/(εγ) + (1 − (α)²)√m/(α)²
             ≤ θ_{ε̄γ̄}(t̄)(β)²/(α)² + (1/(α)² − 1)√m,

which together with the fact (see (3.12)) (θ_{ε̄γ̄}(t̄)(β)² + √m)/(β + √m) ≤ (α)² yields

(3.20)    ‖r‖/(εγ) ≤ β.

Let s = A(c − t − A^T p) + εb. By using (3.9), (3.10), and (3.18), we have

    s = A(c − t̄ − Δt − A^T p̄ − A^TΔp) + αε̄b = (α − 1)ε̄b,

which together with the fact (see (3.12)) 1/(1 + √(γ̄/ε̄)β/‖b‖) ≤ α yields

    ‖s‖/√(εγ) = (1/α − 1)‖b‖√(ε̄/γ̄) ≤ β.

This together with (3.20) and the definition of ρ_{ε,γ}(t, p, u) (see (3.1)) proves our claim. □
3.2. Bounds. Lemma 3.2 shows that if the rate α at which the penalty parameters ε and γ are decreased is not too small (see (3.12)), then a single Newton step suffices to keep the current iterate close to the optimal solution of the penalized problem (D_{ε,γ}). Thus, in order to establish the (linear) convergence of the QLPPF Algorithm, it suffices to bound α away from 1 which, according to (3.12), amounts to bounding θ_{εγ}(t) by some quantity independent of t and ε, γ. It is not difficult to see that such a bound does not exist for arbitrary t. Fortunately, we need to consider only those t that, together with some p, are close to the optimal solution of (D_{ε,γ}), in which case, as we show below, such a bound does exist (provided that a certain primal nondegeneracy assumption also holds). The proof of this is somewhat intricate: For ε and γ large, we argue by showing that t cannot be too large, i.e., is of the order ε + γ + 1 (see Lemma 3.3(a)) and, for ε and γ small, we argue by showing that, under the primal nondegeneracy assumption, the columns of A corresponding to those components of t which are small (i.e., of the order γ) are of rank n.

LEMMA 3.3. (a) For all ε > 0, all γ > 0, and all (t, p) ∈ Y_ε satisfying ρ_{ε,γ}(t, p, u) ≤ 1 for some u, we have ‖t‖ ≤ M₁(ε + γ + 1), where M₁ > 0 is some scalar depending on A, B, b, c, and d only.

(b) Suppose that (P) is primal nondegenerate in the sense that, for every optimal solution (x*, y*) of (P), those columns of A corresponding to the positive components of x* + B^T y* have rank n. Then, for all ε > 0, all γ > 0, all λ ≥ εγ/2, and all (t, p) ∈ Y_ε satisfying ρ_{ε,γ}(t, p, u) ≤ 1 for some u, we have

(3.21)    θ_λ(t) ≤ ψ(ε/γ),   where we define ψ(w) = M₂(√w + 1/√w)⁴   ∀ w > 0,

with M₂ ≥ 1 some scalar depending on A, B, b, c, and d only.
Proof. (a) The proof is by contradiction. Suppose the contrary, so that there exists a sequence {(t^k, p^k, u^k, ε^k, γ^k)} such that ε^k > 0 and γ^k > 0 and

(3.22)    (t^k, p^k) ∈ Y_{ε^k},   ρ_{ε^k,γ^k}(t^k, p^k, u^k) ≤ 1   ∀k,
(3.23)    ‖t^k‖/(ε^k + γ^k + 1) → ∞.

By passing into a subsequence if necessary we will assume that (t^k, p^k)/‖(t^k, p^k)‖ converges to some limit point, say (t^∞, p^∞) (so (t^∞, p^∞) ≠ 0). We have from (3.22) (also using (3.1) and (3.2)) that Bt^k = d, t^k > 0 ∀k, and from Lemma 3.1 that

    b^T p^k ≥ v* − 2mγ^k,   ‖t^k + A^T p^k − c‖ ≤ √( 2(η*(ε^k)² + 2mε^kγ^k) )   ∀k.

Upon dividing both sides of the above four relations by ‖(t^k, p^k)‖ and letting k → ∞, we obtain from (3.23) that

    Bt^∞ = 0,   t^∞ ≥ 0,   b^T p^∞ ≥ 0,   ‖t^∞ + A^T p^∞‖ = 0.

Then, any optimal solution to (D) would remain optimal when moved along the direction (t^∞, p^∞), contradicting the boundedness of the optimal solution set of (D) (via Assumption A(a)).

(b) Fix any ε > 0, γ > 0, λ ≥ εγ/2, and any (t, p) ∈ Y_ε satisfying ρ_{ε,γ}(t, p, u) ≤ 1 for some u. Let T = Diag(t) and let D, E, F be given by, respectively, (3.6), (3.7), and (3.8). Then, F^{-1} = A(I − D^{1/2}ED^{1/2})A^T, ‖E‖ ≤ 1 and D = (I + λT^{-2})^{-1}. From the definition of θ_λ(t) (see (3.5)) we then obtain

(3.24)    θ_λ(t) ≤ ( ‖E‖ + ‖E‖²‖D^{1/2}‖²‖A‖‖F‖ + ‖E‖‖D^{1/2}‖‖A‖‖F‖ )² < ( 1 + 2‖A‖‖F‖ )²,

where the strict inequality follows from the facts ‖D‖ < 1, ‖E‖ ≤ 1. Now we bound ‖F‖. We have

(3.25)    z^T(F^{-1})z = z^T A(I − D^{1/2}ED^{1/2})A^T z ≥ z^T A(I − D)A^T z = Σ_{j=1}^m λ(A_j^T z)²/((t_j)² + λ) ≥ λ Σ_{j=1}^m (A_j^T z)²/((‖t‖_∞)² + λ) ≥ λσ‖z‖²/((‖t‖_∞)² + λ)   ∀z,

where the first inequality follows from ‖E‖ ≤ 1 and σ > 0 denotes the smallest eigenvalue of AA^T (σ > 0 because A has full row rank). Hence, λ ≥ εγ/2 yields

(3.26)    ‖F‖ ≤ ((‖t‖_∞)² + λ)/(λσ) ≤ 2(‖t‖_∞)²/(σεγ) + 1/σ.

For ε and γ near zero, we give below a different bound on ‖F‖. By the primal nondegeneracy assumption, there exists a constant δ > 0 depending on A, B, b, c, and d only such that if (x, y) is any feasible solution of (P) with c^T x + d^T y ≤ v* + δ, then the columns { A_j | j ∈ {1,...,m} with x_j + B_j^T y ≥ δ } have rank n. Consider the case in which η*ε + 2mγ ≤ δ, where η* is defined as in Lemma 3.1. Since (t, p) ∈ Y_ε and ρ_{ε,γ}(t, p, u) ≤ 1, we have from Lemma 3.1 that the columns { A_j | j ∈ {1,...,m} with (t_j + A_j^T p + B_j^T u − c_j)/ε ≥ δ } have rank n. Also, we have T(c − t − A^T p − B^T u) ≥ −2εγe (cf. (3.1) and (3.2)), so that (t_j + A_j^T p + B_j^T u − c_j)/ε ≥ δ implies t_j ≤ 2γ/δ. Hence, we obtain from (3.25) that

    z^T(F^{-1})z ≥ λ Σ_{j: t_j ≤ 2γ/δ} (A_j^T z)²/((t_j)² + λ) ≥ λ(δ)²σ'‖z‖²/((2γ)² + λ(δ)²)   ∀z,

where the last inequality follows from the fact that those A_j for which t_j ≤ 2γ/δ have rank n, and σ' is some positive scalar depending on A only. Since λ ≥ εγ/2, we then have

    ‖F‖ ≤ ((2γ)² + λ(δ)²)/(λ(δ)²σ') ≤ 8γ/(ε(δ)²σ') + 1/σ' = 8/(w(δ)²σ') + 1/σ',

where for convenience we let w = ε/γ. Now, consider the remaining case where η*ε + 2mγ > δ. Then ε + γ ≥ δ/(η* + 2m), so part (a) and (3.26) together yield

    ‖F‖ ≤ 2(M₁)²(ε + γ + 1)²/(σεγ) + 1/σ ≤ 2(M₁)²(1 + (η* + 2m)/δ)²(ε + γ)²/(σεγ) + 1/σ = 2(M₁)²(1 + (η* + 2m)/δ)²(√w + 1/√w)²/σ + 1/σ.

Combining the above two cases, we conclude that

    ‖F‖ ≤ { K(1 + 1/w)        if η*ε + 2mγ ≤ δ;
            K(√w + 1/√w)²     otherwise,

for some positive scalar K depending on A, B, b, c, and d only. Combining the above bound with (3.24), we conclude that θ_λ(t) ≤ M₂(√w + 1/√w)⁴ = ψ(ε/γ) for some scalar M₂ ≥ 1 depending on A, B, b, c, and d only. □
3.3. Main convergence result. By combining Lemmas 3.1 to 3.3, we obtain the following global convergence result for the QLPPF Algorithm.

THEOREM 3.4. Suppose that (P) is primal nondegenerate in the sense of Lemma 3.3(b) and let ψ(·) be given by (3.21). If in the QLPPF Algorithm the starting point (t^1, p^1), together with some u^1 and some scalar β, satisfies

(3.27)    t^1 > 0,   Bt^1 = d,
(3.28)    θ_{ε^1γ^1}(t^1) ≤ ψ(ε^1/γ^1),
(3.29)    ρ_{ε^1,γ^1}(t^1, p^1, u^1) ≤ β,
(3.30)    0 < β < 1/ψ(ε^1/γ^1),

and if we choose

(3.31)    α^k = max{ √( (θ_{ε^kγ^k}(t^k)(β)² + √m)/(β + √m) ), 1/(1 + √(γ^1/ε^1) β/‖b‖) }   ∀k,

then ε^k → 0, γ^k → 0 linearly, and {((t^k + A^T p^k − c)/ε^k, u^k/ε^k)}, {(t^k, p^k)} approach the optimal solution set of, respectively, (P) and (D).
Proof. First note from ψ(w) ≥ 1 for all w > 0 (see (3.21) and M₂ ≥ 1) and (3.30) that β < 1. Also note from (2.7) that

(3.32)    ε^k/γ^k = ε^1/γ^1   ∀k.

We claim that

(3.33)    (t^k, p^k) ∈ Y_{ε^{k−1}},   ρ_{ε^{k−1},γ^{k−1}}(t^k, p^k, u^k) ≤ β,   ρ_{ε^k,γ^k}(t^k, p^k, u^k) ≤ β,

for all k ≥ 2. It is easily seen by using (3.27)-(3.31) and (2.5)-(2.7) and Lemma 3.2 that (3.33) holds for k = 2. Suppose that (3.33) holds for all k ≤ h, for some h ≥ 2. Then, (t^h, p^h) ∈ Y_{ε^{h−1}} and ρ_{ε^{h−1},γ^{h−1}}(t^h, p^h, u^h) ≤ β < 1. Since we also have from (2.7) that ε^h = α^{h−1}ε^{h−1} and γ^h = α^{h−1}γ^{h−1} and from (3.31) that (α^{h−1})² ≥ √m/(β + √m) > 1/2, we can apply Lemma 3.3(b) to conclude that

(3.34)    θ_{ε^hγ^h}(t^h) ≤ ψ(ε^{h−1}/γ^{h−1}) = ψ(ε^1/γ^1),

where the equality follows from (3.32). Then, by (3.30), β < 1/θ_{ε^hγ^h}(t^h). Since (3.33) holds for k = h, we also have ρ_{ε^h,γ^h}(t^h, p^h, u^h) ≤ β, so Lemma 3.2 together with (2.5)-(2.7) and (3.31), (3.32) yields

    (t^{h+1}, p^{h+1}) ∈ Y_{ε^h},   ρ_{ε^h,γ^h}(t^{h+1}, p^{h+1}, u^{h+1}) ≤ β,   ρ_{ε^{h+1},γ^{h+1}}(t^{h+1}, p^{h+1}, u^{h+1}) ≤ β.

Hence (3.33) holds for k = h + 1.

Since (3.33) holds for all k ≥ 2, we see that (3.34) holds for all h ≥ 2. Then, by (3.30), θ_{ε^hγ^h}(t^h)β is less than 1 and bounded away from 1 for all h ≥ 2, so that, by (3.31), α^k is less than 1 and bounded away from 1 for all k. Hence ε^k → 0, γ^k → 0 at the rate of a geometric progression. The remainder of the proof follows from (3.33) and Lemma 3.1. □
Note that instead of θ_{ε^kγ^k}(t^k) we can use, for example, the upper bound ψ(ε^1/γ^1) in the formula (3.31), and linear convergence would be preserved. However, this bound is typically loose and difficult to compute. There is also the issue of finding ε^1, γ^1, β, (t^1, p^1) and u^1 satisfying (3.27)-(3.30), which we will address in §3.4.
3.4. Algorithm initialization. By Theorem 3.4, if the primal nondegeneracy assumption therein holds and if we can find ε, γ, (t, p) and u satisfying

(3.35)    t > 0,   Bt = d,
(3.36)    θ_{εγ}(t) ≤ ψ(ε/γ),
(3.37)    ρ_{ε,γ}(t, p, u) < 1/ψ(ε/γ),

then we can set β = ρ_{ε,γ}(t, p, u) (assuming ρ_{ε,γ}(t, p, u) ≠ 0) and start the QLPPF Algorithm with ε^1 = ε, γ^1 = γ, (t^1, p^1) = (t, p), and we would obtain linear convergence. But how do we find such ε, γ, (t, p), and u?

One obvious way is to fix any ε > 0, any γ > 0, and then solve the penalized problem (D_{ε,γ}). The solution (t, p) obtained satisfies

    t > 0,   Bt = d,   ∇f_{ε,γ}(t, p) + ( B^T u ; 0 ) = 0

for some u (see (2.2)) so, by (2.3) and (3.1), (3.2), we have ρ_{ε,γ}(t, p, u) = 0 and (t, p) ∈ Y_ε. Hence (3.35) and (3.37) hold and, by Lemma 3.3(b), θ_{εγ}(t) ≤ ψ(ε/γ), so (3.36) also holds. (Of course (D_{ε,γ}) need not be solved exactly.) To solve the problem (D_{ε,γ}), we can use any method for convex differentiable minimization (e.g., gradient descent, coordinate descent), and we would typically want ε small and γ large so that (D_{ε,γ}) is well conditioned.

Suppose that there holds Be = 0 and Ae = b. (This holds, for example, when B is the zero matrix (which corresponds to the case when all equality constraints are penalized) and a change of variable x' = (X̄)^{-1}x, where x̄ is any interior feasible solution of (P) (i.e., Ax̄ = b, x̄ > 0) and X̄ = Diag(x̄), has been made in (P).) Then we can find a usable ε, γ, (t, p), u immediately: Fix any ε > 2ψ(1)(‖c‖ + ‖B^T(BB^T)^{-1}d‖) and let γ = ε. Also let w = B^T(BB^T)^{-1}d, p = (AA^T)^{-1}A(c − w), t = εe + w, and u = −(BB^T)^{-1}d. Then Bw = d, A(c − A^T p) = Aw, At = εb + Aw, and B^T u = εe − t, so that

    Bt = Bw = d,
    A(c − t − A^T p) = −εb,
    T(c − t − A^T p − B^T u) + εγe = (εI + W)(c − A^T p − εe) + (ε)²e = (εI + W)(c − A^T p) − εw,

where T = Diag(t) and W = Diag(w). Also, from ψ(1) ≥ 1 and our choice of ε we have ε > ‖w‖_∞, so t > 0. Hence, by (3.2), we have (t, p) ∈ Y_ε and, by (3.1) and ε = γ, we have

(3.38)    ρ_{ε,γ}(t, p, u) = ‖(εI + W)(c − A^T p) − εw‖/(εγ)
        ≤ ‖c − A^T p − w‖/γ + ‖W‖‖c − A^T p‖/(εγ)
        = ‖(I − A^T(AA^T)^{-1}A)(c − w)‖/γ + ‖W‖‖(I − A^T(AA^T)^{-1}A)c + A^T(AA^T)^{-1}Aw‖/(εγ)
        ≤ ‖c − w‖/γ + ‖w‖_∞(‖c‖ + ‖w‖)/(εγ),

where the last inequality follows from the triangle inequality and the nonexpansive property of projection matrices. From our choice of ε we see that (‖c‖ + ‖w‖)/ε < .5/ψ(1), so the right-hand side of (3.38) is bounded above by .5/ψ(1) + (.5/ψ(1))² ≤ 1/ψ(1) = 1/ψ(ε/γ), where the inequality follows from ψ(1) ≥ 1 and the equality follows from γ = ε. Hence (3.35) and (3.37) hold. Also, since (t, p) ∈ Y_ε, then (3.37) together with Lemma 3.3(b) shows that (3.36) holds.
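The closed-form starting point just described is easy to code. The sketch below is illustrative only: it assumes Be = 0 and Ae = b, that B has full row rank whenever it is nonzero, and that ε has already been chosen large enough (ψ(1) involves the constant M₂ of Lemma 3.3 and is not evaluated here).

    import numpy as np

    def warm_start(A, B, c, d, eps):
        """Starting point of Section 3.4: w = B'(BB')^{-1}d, p = (AA')^{-1}A(c-w),
        t = eps*e + w, u = -(BB')^{-1}d, gamma = eps."""
        m = A.shape[1]
        if B.shape[0] > 0:
            u = -np.linalg.solve(B @ B.T, d)     # u = -(BB')^{-1} d
            w = -B.T @ u                         # w = B'(BB')^{-1} d, so Bw = d
        else:
            u = np.zeros(0)
            w = np.zeros(m)
        p = np.linalg.solve(A @ A.T, A @ (c - w))
        t = eps * np.ones(m) + w                 # t > 0 provided eps > ||w||_inf
        gamma = eps
        return t, p, u, gamma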
4. Numerical results. To study the performance of the QLPPF Algorithm in practice, we have implemented the algorithm to solve the special case of (D) in which B is the zero matrix, i.e., (P) is of the form

(4.1)    minimize  c^T x   subject to  Ax = b,  x ≥ 0.

(This corresponds to penalizing all equality constraints in the corresponding dual problem.) Below we describe our implementation and present our very preliminary numerical experience.

1. Initialization. In our implementation, we set for all problems

    ε^1 = γ^1 = 10^4 ‖c‖₁/m,

and set (somewhat arbitrarily)

    p^1 = 0,   t^1 = max{ e, c/2 },

where "max" is taken componentwise. (Note that since B is the zero matrix, t^1 can be set to any positive vector.) We have chosen t^1 so as to minimize ‖(T^1)^{-1}(c − t^1 − A^T p^1)‖ subject to t^1 ≥ e. Also, care must be exercised in choosing ε^1 and γ^1: if their values are set too low, then the QLPPF Algorithm may fail to converge; if their values are set too high, then the QLPPF Algorithm may require many iterations to converge. (Notice that we set ε^1 and γ^1 directly proportional to the average cost ‖c‖₁/m so their values scale with c.)
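In code, the initialization just described amounts to the following (the constants follow the text as recovered; names and dense storage are illustrative only).

    import numpy as np

    def initialize(A, b, c):
        """Starting values used in the reported implementation (B = 0 case)."""
        m = c.size
        eps1 = gamma1 = 1e4 * np.linalg.norm(c, 1) / m   # eps^1 = gamma^1 = 10^4 ||c||_1 / m
        p1 = np.zeros(A.shape[0])                        # p^1 = 0
        t1 = np.maximum(np.ones(m), c / 2.0)             # t^1 = max{e, c/2} componentwise
        return eps1, gamma1, t1, p1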
2. Steplength selection. To ensure that the t^k's remain inside the positive orthant, we employ a backtracking scheme similar to that used by Setiono: whenever t^k + Δt^k is outside the positive orthant, we replace the formula for t^{k+1} in (2.6) by

(4.2)    t^{k+1} = t^k + .98 λ^k Δt^k,

where

(4.3)    λ^k = min{ t_j^k/(−Δt_j^k) | Δt_j^k < 0 }.

However, this raises a difficulty, namely, for λ^k much smaller than 1, the vector (.98λ^kΔt^k, Δp^k) may be far from the Newton direction (Δt^k, Δp^k) and, as a consequence, the iterates may fail to converge. To remedy this, we replace (analogous to (4.2)) the formula for p^{k+1} in (2.6) by p^{k+1} = p^k + .98λ^kΔp^k whenever nonconvergence is detected. (The parameter value .98 is chosen somewhat arbitrarily, but it works well in our tests.)
The proper choice of the α^k's is very important for the QLPPF Algorithm: if the α^k's are too near 1 (so the penalty parameters decrease slowly), then the algorithm would converge slowly; if the α^k's are too near 0 (so the penalty parameters decrease rapidly), then the algorithm may fail to converge. In our implementation we adjust the α^k's dynamically according to the following rule:

    α^k = { max{.3, .95α^{k−1}}   if λ^k = 1;
            .6α^{k−1}             if λ^k ≤ .2;
            α^{k−1}               otherwise,

for all k ≥ 2, with α^1 set to .5. The rationale for this adjustment rule is that, if λ^k = 1, then the current iterate is closely following the solution trajectory (so we can decrease the penalty parameters at a faster rate and still retain convergence) and, if λ^k ≤ .2, then the current iterate is unable to follow the solution trajectory (so we must decrease the penalty parameters at a slower rate).
3. Termination. To avoid numerical problems, we stop decreasing the penalty parameters ε and γ when they reach some prespecified tolerances ε_min and γ_min, respectively. In our tests we set

    ε_min = 10^{-16} ‖c‖₁/m,   γ_min = 10^{-11} ‖c‖₁/m.

We terminate the QLPPF Algorithm when the relative duality gap and the violation of primal and dual feasibility are small. More specifically, we terminate whenever the current iterate, denoted by (t, p), satisfies

(4.4)    ‖X(A^T p − c)‖₁ ≤ 10^{-7},
(4.5)    max{ ‖Ax − b‖_∞, ‖[−x]⁺‖_∞ } ≤ 10^{-7},
(4.6)    ‖[A^T p − c]⁺‖_∞ ≤ 10^{-7},

where x = (t + A^T p − c)/ε (see Lemma 3.1), X = Diag(x), and [·]⁺ denotes the orthogonal projection onto the nonnegative orthant. Only for three of our test problems could the above termination criterion not be met (owing to violation of (4.4) and (4.6)), in which case the algorithm is terminated whenever primal feasibility (4.5) is met and |c^T x − v*|/|v*| is less than 5·10^{-7}, where v* denotes the optimal cost of (4.1).
4. Solving for the direction. The most expensive computation at each iteration of the QLPPF Algorithm lies in solving the system of linear equations (2.5). This can be seen to entail solving a single linear system of the form

(4.7)    AQA^T w = z,

for w, where z is some n-vector and Q is some m × m diagonal matrix whose jth diagonal entry is

(4.8)    εγ/((t_j)² + εγ),

with ε > 0, γ > 0, and t some positive m-vector. (Linear systems of the form (4.7) also arise in interior point algorithms, but (4.7) has the nice property that the condition number of Q can be controlled by adjusting the penalty parameters ε and γ.) In our implementation, (4.7) is solved using YSMP, a sparse matrix package for symmetric positive semidefinite systems developed at Yale University (see [EGSS79], [EGSS82]) and a precursor to the commercial package SMPAK (Scientific Computing Associates, 1985). YSMP comprises a set of Fortran routines implementing the Cholesky decomposition scheme and, as a preprocessor, the minimum-degree ordering algorithm (see, e.g., [GeL81]). In our implementation, the minimum-degree ordering routine ORDER is called first to obtain a permutation of the rows and columns of the matrix AA^T so that fill-in is reduced during factorization. Then, AA^T is symbolically factored using the routine SSF. (SSF is called only once since the nonzero pattern of AQA^T does not change with
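The following sketch shows the linear algebra of (4.7)-(4.8) with dense Cholesky standing in for YSMP's sparse factorization. The diagonal of Q is written as εγ/((t_j)² + εγ), the form obtained here by eliminating Δt from (2.5) when B = 0; the sketch assumes A has full row rank so that AQA^T is positive definite.

    import numpy as np

    def solve_normal_equations(A, t, eps, gamma, z):
        """Solve A Q A' w = z of (4.7), with Q the diagonal matrix (4.8)."""
        q = eps * gamma / (t**2 + eps * gamma)   # diagonal of Q, entries in (0, 1]
        M = (A * q) @ A.T                        # A Q A' (scale columns of A by q)
        L = np.linalg.cholesky(M)                # dense stand-in for sparse Cholesky
        y = np.linalg.solve(L, z)                # forward substitution
        return np.linalg.solve(L.T, y)           # back substitution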