Mathematical Programming manuscript No.

(will be inserted by the editor)

Mihai Anitescu · Paul Tseng · Stephen J. Wright

### Elastic-Mode Algorithms for Mathematical Programs with Equilibrium Constraints: Global Convergence and Stationarity Properties

April 14, 2005

Abstract. The elastic-mode formulation of the problem of minimizing a nonlinear function subject to equilibrium constraints has appealing local properties in that, for a finite value of the penalty parameter, local solutions satisfying first- and second-order necessary optimality conditions for the original problem are also first- and second-order points of the elastic-mode formulation. Here we study global convergence properties of methods based on this formulation, which involve generating an (exact or inexact) first- or second-order point of the formulation, for nondecreasing values of the penalty parameter. Under certain regularity conditions on the active constraints, we establish finite or asymptotic convergence to points having a certain stationarity property (such as strong stationarity, M-stationarity, or C-stationarity). Numerical experience with these approaches is discussed. In particular, our analysis and the numerical evidence show that exact complementarity can be achieved finitely even when the elastic-mode formulation is solved inexactly.

Key words. Nonlinear programming, equilibrium constraints, complementarity constraints, elastic-mode formulation, strong stationarity, C-stationarity, M-stationarity.

AMS subject classifications 49M30, 49M37, 65K05, 90C30, 90C33

1. Introduction

We consider a mathematical program with equilibrium constraints (MPEC), defined as follows:

$$
\begin{aligned}
\min_{x}\quad & f(x) \\
\text{subject to}\quad & g(x) \ge 0, \quad h(x) = 0, \\
& 0 \le G^{T}x \perp H^{T}x \ge 0,
\end{aligned}
\tag{1}
$$

where f : ℝ^{n} → ℝ, g : ℝ^{n} → ℝ^{p}, and h : ℝ^{n} → ℝ^{q} are all twice continuously differentiable functions (at least in a neighborhood of all points generated by our methods), and G and H are n × m column submatrices of the n × n identity matrix (with no columns in common). Hence, the constraints G^{T}x ≥ 0 and H^{T}x ≥ 0 represent nonnegativity bound constraints on certain components of x, and the notation G^{T}x ⊥ H^{T}x signifies that (G^{T}x)^{T}(H^{T}x) = 0. This special form of the complementarity constraints does not sacrifice generality; it can always be attained by introducing artificial variables as needed. We use this form because some of our results require the nonnegativity constraints G^{T}x ≥ 0 and H^{T}x ≥ 0 to be satisfied exactly even when x is only an inexact solution of the subproblem in question. Such conditions are readily satisfied by most interior-point and active-set methods.

M. Anitescu: Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, e-mail: anitescu@mcs.anl.gov

P. Tseng: Department of Mathematics, University of Washington, Seattle, WA 98195, e-mail: tseng@math.washington.edu

S. J. Wright: Computer Sciences Department, University of Wisconsin, 1210 West Dayton Street, Madison, WI 53706, e-mail: swright@cs.wisc.edu

MPEC has been well studied in recent years, with many solution methods proposed; see [2, 3, 5, 13, 15–17, 19, 21, 23] and references therein. Although an MPEC can be formulated as a nonlinear program by rewriting the complementarity constraint as an equality constraint (G^{T}x)^{T}(H^{T}x) = 0 or as an inequality constraint (G^{T}x)^{T}(H^{T}x) ≤ 0, the resulting nonlinear program is highly degenerate; that is, it satisfies neither the linear independence constraint qualification (LICQ) nor the Mangasarian-Fromovitz constraint qualification (MFCQ) at any feasible point. Thus, in order to achieve global convergence, specialized methods have been proposed that exploit the special structure of the complementarity constraint. These methods generate a sequence of points in ℝ^{n} whose accumulation points satisfy, under suitable assumptions, certain necessary optimality conditions for the MPEC (1). Different types of necessary optimality conditions have been developed, the strongest and most desirable of which is strong stationarity [23]; see Definition 1 below. Under MPEC-LICQ (see Definition 2), strong stationarity is equivalent to the notion of B-stationarity [6]. Two weaker conditions, M-stationarity and C-stationarity [18, 23], will also be of interest (see Definition 3).

A regularization method of Scholtes [24] achieves M-stationarity under MPEC-LICQ and achieves strong stationarity under an additional upper-level strict complementarity (ULSC) condition. A relaxation method of Lin and Fukushima [15] and a penalty method of Hu and Ralph [10], penalizing the complementarity constraint, have similar global convergence properties. A smoothing method of Fukushima and Pang [6] achieves strong stationarity under MPEC-LICQ and an additional asymptotically weak nondegeneracy condition. All these methods are conceptual, in that they assume the generation of a sequence of points satisfying exactly certain second-order necessary optimality conditions. Only in the case of linear constraints has a practical method been developed (Fukushima and Tseng [7]). We are led to ask: Can global convergence (to C-stationary, M-stationary, or strongly stationary points) be achieved under weaker assumptions or for more practical methods?

In this paper, we study this question for a nonlinear programming formulation of (1) that uses an explicit penalization of the complementarity constraint, also known as the “elastic mode.” For a given penalty parameter c ≥ 0 and fixed upper bound ζ̄ ∈ [0, ∞), this formulation can be written as follows:

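$$
\begin{aligned}
\mathrm{PF}(c):\quad \min_{x,\zeta}\quad & f(x) + c\zeta + c\,(G^{T}x)^{T}(H^{T}x) \\
\text{subject to}\quad & g(x) \ge -\zeta e_{p}, \quad \zeta e_{q} \ge h(x) \ge -\zeta e_{q}, \quad 0 \le \zeta \le \bar{\zeta}, \\
& G^{T}x \ge 0, \quad H^{T}x \ge 0,
\end{aligned}
\tag{2}
$$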

where e_{l} is the vector (1, 1, . . . , 1)^{T} with l components. A similar formulation was studied by Anitescu [1, 2], while a variant with ζ fixed at zero was investigated by Ralph and Wright [21]. The penalty method in [10] is based on this variant. Our analysis may also be extended to this variant, as well as to a mixed variant whereby ζ is fixed at zero for a subset of the constraints (see Section 5). For ζ̄ sufficiently large, a feasible point of (2) is easily found, and there are appealing correspondences between points x^{∗} that satisfy first-order optimality conditions for (1) and points (x^{∗}, 0) that satisfy first-order optimality conditions for (2) (see Theorem 2).
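
To make formulation (2) concrete, the following minimal sketch evaluates the elastic-mode objective and checks the constraints of PF(c) at a given point. The function names, the representation of G and H by lists of selected coordinate positions of x, and the toy data are illustrative assumptions and do not come from the paper.

```python
def elastic_objective(f, x, zeta, c, G_idx, H_idx):
    """Objective of PF(c): f(x) + c*zeta + c*(G^T x)^T (H^T x)."""
    # G_idx[i] and H_idx[i] are the coordinates of x picked out by columns G_i and H_i.
    compl = sum(x[i] * x[j] for i, j in zip(G_idx, H_idx))
    return f(x) + c * zeta + c * compl


def elastic_feasible(g, h, x, zeta, zeta_bar, G_idx, H_idx):
    """Check the constraints of PF(c) exactly (no tolerance)."""
    if not (0.0 <= zeta <= zeta_bar):
        return False
    if any(gi < -zeta for gi in g(x)):        # g(x) >= -zeta * e_p
        return False
    if any(abs(hi) > zeta for hi in h(x)):    # -zeta * e_q <= h(x) <= zeta * e_q
        return False
    return all(x[i] >= 0.0 for i in G_idx) and all(x[j] >= 0.0 for j in H_idx)


# Toy data (hypothetical): n = 3, G selects x[0], H selects x[1].
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2 + x[2] ** 2
g = lambda x: [x[2] + 1.0]           # one inequality, g(x) >= 0
h = lambda x: [x[0] + x[1] - 1.5]    # one equality, h(x) = 0
x = [1.0, 0.25, 0.0]
```

In this toy instance h(x) = −0.25, so the point is infeasible for (1) but becomes feasible for PF(c) once the elastic variable satisfies ζ ≥ 0.25, which illustrates why a feasible point of (2) is easy to find when ζ̄ is large enough.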

The algorithms we consider in this paper generate a sequence of (exact or inexact) first- or second-order points (x^{k}, ζ_{k}) of PF(c_{k}), where {c_{k}} is a positive nondecreasing sequence. We study the stationarity properties of the accumulation points of {x^{k}}. The upper bound constraint ζ_{k} ≤ ζ̄ helps to ensure the existence and boundedness of (x^{k}, ζ_{k}).

Our analyses draw on global convergence analyses of Scholtes [24] and Anitescu [2]; the latter studied a variant of (1) known as parametric mixed-P variational inequalities. In Section 3, we study stationarity properties of termination points and accumulation points of {(x^{k}, ζ_{k})}. In Subsection 3.1, (x^{k}, ζ_{k}) is an inexact first-order point of PF(c_{k}), and we show that each feasible accumulation point satisfying MPEC-LICQ is C-stationary for (1). In Subsection 3.2, (x^{k}, ζ_{k}) is an exact second-order point of PF(c_{k}), and we show (somewhat surprisingly) termination at a strongly stationary point for c_{k} sufficiently large; otherwise accumulation points either are infeasible or fail to satisfy MPEC-LICQ. In Subsection 3.3, (x^{k}, ζ_{k}) is an inexact second-order point of PF(c_{k}), and we show that each feasible accumulation point satisfying MPEC-LICQ is either M-stationary or strongly stationary (depending on boundedness of {c_{k}}). Moreover, if exact complementarity holds between bound constraints and their multipliers, then x^{k} satisfies exactly the complementarity condition (G^{T}x^{k})^{T}(H^{T}x^{k}) = 0 for all c_{k} sufficiently large. In Subsection 3.4, we introduce a strengthened version of MPEC-LICQ and prove another result concerning exact satisfaction of the complementarity condition for sufficiently large c_{k}, even when the subproblems PF(c_{k}) are solved inexactly. In Subsection 3.5, we present a practical algorithm for generating (x^{k}, ζ_{k}) as an inexact second-order point of PF(c_{k}).

Section 4 discusses a “regularized” nonlinear programming formulation of (1) [24] and presents examples to illustrate and compare the behavior of methods based on elastic-mode and regularized formulations. Section 5 presents some numerical experience, corroborating the aforementioned result of exact complementarity under finite penalty.

In what follows, we use ‖·‖ to denote the Euclidean norm ‖·‖_{2}. The notations O(·) and o(·) are used in the usual sense. We denote by e_{q} a vector of length q whose entries are all 1, that is, e_{q} = (1, 1, . . . , 1)^{T}.

2. Assumptions and Background

In this section, we summarize some known results concerning constraint qualifications and necessary optimality conditions for MPEC and its elastic-mode formulation. We discuss first-order stationarity conditions and constraint qualifications for MPEC (1) in Subsection 2.1 and first- and second-order stationarity conditions for PF(c) (2) in Subsection 2.2. Subsection 2.3 describes the correspondence between certain first-order points of the elastic form (2) and first-order points of the MPEC (1).

2.1. Stationarity Conditions and Constraint Qualifications for MPEC

We start by defining the following active sets at a feasible point x^{∗} of MPEC
(1):

\begin{align}
I_{g} &\stackrel{\mathrm{def}}{=} \{\, i \in \{1, 2, \dots, p\} \mid g_{i}(x^{*}) = 0 \,\}, \tag{3a} \\
I_{G} &\stackrel{\mathrm{def}}{=} \{\, i \in \{1, 2, \dots, m\} \mid G_{i}^{T}x^{*} = 0 \,\}, \tag{3b} \\
I_{H} &\stackrel{\mathrm{def}}{=} \{\, i \in \{1, 2, \dots, m\} \mid H_{i}^{T}x^{*} = 0 \,\}, \tag{3c}
\end{align}

where G_{i} and H_{i} denote the ith column of G and H, respectively (in each case, a column from the identity matrix). Because x^{∗} is feasible for (1), we have I_{G} ∪ I_{H} = {1, 2, . . . , m}.
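
As a small illustration of the active sets (3), the sketch below computes I_g, I_G, and I_H from the values g(x^{∗}), G^{T}x^{∗}, and H^{T}x^{∗}. The function name, the 0-based indexing, and the optional tolerance `delta` (intended only for floating-point data) are assumptions made for this example.

```python
def active_sets(g_val, Gx, Hx, delta=0.0):
    """Return (I_g, I_G, I_H) as sets of 0-based indices.

    g_val, Gx, Hx hold g(x*), G^T x*, and H^T x*; an entry counts as active
    when its absolute value is at most delta (delta = 0 gives the sets in (3)).
    """
    I_g = {i for i, v in enumerate(g_val) if abs(v) <= delta}
    I_G = {i for i, v in enumerate(Gx) if abs(v) <= delta}
    I_H = {i for i, v in enumerate(Hx) if abs(v) <= delta}
    return I_g, I_G, I_H


# Hypothetical feasible point with m = 3 complementarity pairs.
I_g, I_G, I_H = active_sets(g_val=[0.0, 2.0], Gx=[0.0, 0.0, 3.0], Hx=[1.0, 0.0, 0.0])
biactive = I_G & I_H   # indices where both G_i^T x* and H_i^T x* vanish
```

For this data the union of I_G and I_H covers all three pairs, as feasibility requires, and the second pair is the only biactive one.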

Using the active sets, we define our first notion of first-order stationarity for (1) as follows.

Definition 1. A feasible point x^{∗} of (1) is strongly stationary if d = 0 solves
the following linear program:

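$$
\begin{aligned}
\min_{d}\quad & \nabla f(x^{*})^{T} d \\
\text{subject to}\quad & g(x^{*}) + \nabla g(x^{*})^{T} d \ge 0, \quad h(x^{*}) + \nabla h(x^{*})^{T} d = 0, \\
& G_{i}^{T} d = 0, \quad i \in I_{G} \setminus I_{H}, \\
& H_{i}^{T} d = 0, \quad i \in I_{H} \setminus I_{G}, \\
& G_{i}^{T} d \ge 0, \quad H_{i}^{T} d \ge 0, \quad i \in I_{G} \cap I_{H}.
\end{aligned}
\tag{4}
$$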

Let us introduce Lagrange multipliers and define the MPEC Lagrangian as in Scholtes [24, Sec. 4]:

$$
L(x, \lambda, \mu, \tau, \nu) = f(x) - \lambda^{T} g(x) - \mu^{T} h(x) - \tau^{T} G^{T}x - \nu^{T} H^{T}x.
\tag{5}
$$

By combining the (necessary and sufficient) conditions for d = 0 to solve (4) with the feasibility conditions for x^{∗}, we see that x^{∗} is strongly stationary if and only if x^{∗} satisfies, together with some multipliers (λ^{∗}, µ^{∗}, τ^{∗}, ν^{∗}), the following conditions:

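\begin{align}
& \nabla_{x} L(x^{*}, \lambda^{*}, \mu^{*}, \tau^{*}, \nu^{*}) = 0, \tag{6a} \\
& 0 \le \lambda^{*} \perp g(x^{*}) \ge 0, \tag{6b} \\
& h(x^{*}) = 0, \tag{6c} \\
& \tau^{*} \perp G^{T}x^{*} \ge 0, \tag{6d} \\
& \nu^{*} \perp H^{T}x^{*} \ge 0, \tag{6e} \\
& \tau_{i}^{*} \ge 0, \quad i \in I_{G} \cap I_{H}, \tag{6f} \\
& \nu_{i}^{*} \ge 0, \quad i \in I_{G} \cap I_{H}. \tag{6g}
\end{align}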

Under the following constraint qualification at x^{∗}, the multipliers (λ^{∗}, µ^{∗}, τ^{∗}, ν^{∗})
are in fact unique.

Definition 2. The MPEC-LICQ holds at a feasible point x^{∗} of (1) if the following set of vectors is linearly independent:

$$
K \stackrel{\mathrm{def}}{=} \{\nabla g_{i}(x^{*})\}_{i \in I_{g}} \cup \{\nabla h_{i}(x^{*})\}_{i = 1, 2, \dots, q} \cup \{G_{i}\}_{i \in I_{G}} \cup \{H_{i}\}_{i \in I_{H}}.
\tag{7}
$$
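
Definition 2 can be tested numerically by stacking the vectors of K as rows and comparing the rank with the number of rows. The sketch below does this with a small pure-Python Gaussian elimination; the function names, the tolerance, and the convention that active columns of G and H are given by the coordinate of x they select are assumptions made for illustration.

```python
def rank(rows, tol=1e-10):
    """Rank of a list of row vectors (Gaussian elimination, partial pivoting)."""
    A = [list(map(float, r)) for r in rows]
    ncols = len(A[0]) if A else 0
    r = 0
    for col in range(ncols):
        if r == len(A):
            break
        piv = max(range(r, len(A)), key=lambda i: abs(A[i][col]))
        if abs(A[piv][col]) <= tol:
            continue                      # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][col] / A[r][col]
            for j in range(col, ncols):
                A[i][j] -= factor * A[r][j]
        r += 1
    return r


def unit(n, i):
    """Column i of the n x n identity matrix."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def mpec_licq_holds(n, grad_g_active, grad_h, G_active_coords, H_active_coords):
    """True if the vectors in K of (7) are linearly independent."""
    vectors = [list(v) for v in grad_g_active] + [list(v) for v in grad_h]
    vectors += [unit(n, i) for i in G_active_coords]
    vectors += [unit(n, j) for j in H_active_coords]
    return rank(vectors) == len(vectors)
```

A rank test with a fixed tolerance is only a heuristic for nearly dependent gradients; a scaled or SVD-based test may be preferable in practice.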
The following result, dating back to Luo, Pang, and Ralph [17] but stated
here in the form of Scheel and Scholtes [23, Theorem 2], shows that, under
MPEC-LICQ, strong stationarity is a set of (first-order) necessary optimality
conditions for the MPEC.

Theorem 1. Suppose that x^{∗} is a local minimizer of (1). If the MPEC-LICQ holds at x^{∗}, then x^{∗} is strongly stationary, and the multiplier vector (λ^{∗}, µ^{∗}, τ^{∗}, ν^{∗}) that satisfies the conditions (6) is unique.

Our analysis also uses two weaker notions of first-order stationarity for (1) that have been studied in previous works; see, for example, Outrata [18] and Scheel and Scholtes [23].

Definition 3. (a) A point x^{∗} is C-stationary if there exist multipliers (λ^{∗}, µ^{∗}, τ^{∗}, ν^{∗}) satisfying (6) except that the conditions (6f), (6g) are replaced by τ_{i}^{∗}ν_{i}^{∗} ≥ 0, for each i ∈ I_{G} ∩ I_{H}.

(b) A point x^{∗} is M-stationary if it is C-stationary and if either τ_{i}^{∗} ≥ 0 or ν_{i}^{∗} ≥ 0 for each i ∈ I_{G} ∩ I_{H}.

Notice that M-stationarity allows such situations as τ_{i}^{∗} < 0 and ν_{i}^{∗} = 0 for some i ∈ I_{G} ∩ I_{H} but does not allow the situation τ_{i}^{∗} < 0 and ν_{i}^{∗} < 0, which is allowed by C-stationarity. In particular, strongly stationary ⇒ M-stationary ⇒ C-stationary.
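
The three notions differ only in the sign conditions imposed on (τ_{i}^{∗}, ν_{i}^{∗}) for biactive indices i ∈ I_{G} ∩ I_{H}. The sketch below classifies a multiplier pair accordingly, assuming the remaining conditions (6a)–(6e) have already been verified; the function name and return labels are illustrative.

```python
def classify_stationarity(tau, nu, biactive):
    """Classify the sign pattern of (tau_i, nu_i) over the biactive indices.

    Returns "strong", "M", "C", or "none". The remaining conditions
    (6a)-(6e) are assumed to hold and are not checked here.
    """
    pairs = [(tau[i], nu[i]) for i in biactive]
    if all(t >= 0 and v >= 0 for t, v in pairs):        # (6f), (6g)
        return "strong"
    c_ok = all(t * v >= 0 for t, v in pairs)            # Definition 3(a)
    if c_ok and all(t >= 0 or v >= 0 for t, v in pairs):  # Definition 3(b)
        return "M"
    return "C" if c_ok else "none"


# Hypothetical multipliers for four complementarity pairs.
tau = [1.0, -1.0, -2.0, 1.0]
nu = [2.0, 0.0, -3.0, -1.0]
```

The function returns the strongest label that applies, which is consistent with the chain of implications stated above.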

2.2. Necessary Optimality Conditions for PF(c)

In this subsection, we discuss the exact and inexact first- and second-order necessary optimality conditions for PF(c) defined in (2). We start by defining the Lagrangian for this problem as follows:

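$$
\begin{aligned}
L_{c}(x, \zeta, \lambda, \mu^{-}, \mu^{+}, \tau, \nu) ={}& f(x) + c\zeta + c\,(G^{T}x)^{T}H^{T}x - \lambda^{T}(g(x) + \zeta e_{p}) \\
& - (\mu^{+})^{T}(\zeta e_{q} - h(x)) - (\mu^{-})^{T}(\zeta e_{q} + h(x)) - \tau^{T}G^{T}x - \nu^{T}H^{T}x.
\end{aligned}
\tag{8}
$$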

The Karush-Kuhn-Tucker first-order necessary optimality conditions for this problem are as follows:

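\begin{align}
& \nabla_{x} L_{c}(x, \zeta, \lambda, \mu^{-}, \mu^{+}, \tau, \nu) = 0, \tag{9a} \\
& c - e_{p}^{T}\lambda - e_{q}^{T}\mu^{-} - e_{q}^{T}\mu^{+} = \pi^{-} - \pi^{+}, \tag{9b} \\
& 0 \le (\pi^{-}, \pi^{+}) \perp (\zeta, \bar{\zeta} - \zeta) \ge 0, \tag{9c} \\
& 0 \le \lambda \perp g(x) + \zeta e_{p} \ge 0, \tag{9d} \\
& 0 \le \mu^{+} \perp \zeta e_{q} - h(x) \ge 0, \tag{9e} \\
& 0 \le \mu^{-} \perp \zeta e_{q} + h(x) \ge 0, \tag{9f} \\
& 0 \le \tau \perp G^{T}x \ge 0, \tag{9g} \\
& 0 \le \nu \perp H^{T}x \ge 0. \tag{9h}
\end{align}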

We call (x, ζ) satisfying these conditions a first-order point of PF(c). Since these conditions cannot be satisfied exactly in practice, we consider the following inexact first-order conditions.

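Definition 4. We say that (x, ζ) is an ε-first-order point of PF(c) (ε ≥ 0) if there exist multipliers (λ, µ^{−}, µ^{+}, τ, ν, π^{−}, π^{+}) satisfying

$$
\begin{aligned}
& \|\nabla_{x} L_{c}(x, \zeta, \lambda, \mu^{-}, \mu^{+}, \tau, \nu)\|_{\infty} \le \epsilon, \\
& |c - e_{p}^{T}\lambda - e_{q}^{T}\mu^{-} - e_{q}^{T}\mu^{+} - \pi^{-} + \pi^{+}| \le \epsilon, \\
& 0 \le (\pi^{-}, \pi^{+}), \quad (\zeta, \bar{\zeta} - \zeta) \ge 0, \quad \zeta\pi^{-} + (\bar{\zeta} - \zeta)\pi^{+} \le \epsilon, \\
& 0 \le \lambda, \quad g(x) + \zeta e_{p} \ge -\epsilon e_{p}, \quad |(g(x) + \zeta e_{p})^{T}\lambda| \le \epsilon, \\
& 0 \le \mu^{+}, \quad \zeta e_{q} - h(x) \ge -\epsilon e_{q}, \quad |(\zeta e_{q} - h(x))^{T}\mu^{+}| \le \epsilon, \\
& 0 \le \mu^{-}, \quad \zeta e_{q} + h(x) \ge -\epsilon e_{q}, \quad |(\zeta e_{q} + h(x))^{T}\mu^{-}| \le \epsilon, \\
& 0 \le \tau, \quad G^{T}x \ge 0, \quad \tau^{T}G^{T}x \le \epsilon, \\
& 0 \le \nu, \quad H^{T}x \ge 0, \quad \nu^{T}H^{T}x \le \epsilon.
\end{aligned}
\tag{10}
$$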

The conditions (10) are well suited to situations in which PF(c) is solved by
interior-point methods or active-set methods, since such methods can enforce
the bound constraints G^{T}x ≥ 0 and H^{T}x ≥ 0 explicitly (also the nonnegativ-
ity constraints on the multipliers, in the case of interior-point methods), while
allowing the constraints involving nonlinear functions to be satisfied inexactly.
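
One way to read the conditions (10) is as a recipe for the smallest tolerance that a given primal-dual point achieves. The sketch below computes that value from precomputed quantities; the function name, the argument layout, and the convention of returning infinity when a requirement that (10) imposes exactly is violated are assumptions made for this illustration.

```python
import math


def first_order_epsilon(grad_L, c, zeta, zeta_bar, lam, mu_plus, mu_minus,
                        tau, nu, pi_minus, pi_plus, g_val, h_val, Gx, Hx):
    """Smallest eps for which the rows of (10) hold, or math.inf if a
    requirement that (10) imposes exactly (signs, bounds) is violated.

    grad_L is the gradient of L_c with respect to x; g_val, h_val, Gx, Hx
    hold g(x), h(x), G^T x, and H^T x.
    """
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    multipliers = [pi_minus, pi_plus, *lam, *mu_plus, *mu_minus, *tau, *nu]
    exact_ok = (min(multipliers) >= 0.0
                and 0.0 <= zeta <= zeta_bar
                and all(v >= 0.0 for v in [*Gx, *Hx]))
    if not exact_ok:
        return math.inf
    gs = [gi + zeta for gi in g_val]      # g(x) + zeta * e_p
    hp = [zeta - hi for hi in h_val]      # zeta * e_q - h(x)
    hm = [zeta + hi for hi in h_val]      # zeta * e_q + h(x)
    candidates = [
        max((abs(v) for v in grad_L), default=0.0),                               # row 1
        abs(c - sum(lam) - sum(mu_minus) - sum(mu_plus) - pi_minus + pi_plus),    # row 2
        zeta * pi_minus + (zeta_bar - zeta) * pi_plus,                            # row 3
        max((-v for v in gs + hp + hm), default=0.0),      # feasibility part of rows 4-6
        abs(dot(gs, lam)), abs(dot(hp, mu_plus)), abs(dot(hm, mu_minus)),         # rows 4-6
        dot(tau, Gx), dot(nu, Hx),                                                # rows 7-8
    ]
    return max(0.0, *candidates)
```

Note that the bound constraints and the sign conditions on the multipliers are treated as hard requirements, in line with the remark above that interior-point and active-set methods can enforce them explicitly.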

We now introduce the notions of approximately active constraints and of exact and inexact second-order (stationary) points of PF(c).

Definition 5. Given a function r : ℝ^{n} → ℝ, a constraint r(x) ≥ 0 or r(x) = 0 of a nonlinear program is δ-active (δ ≥ 0) at a point x̂ if |r(x̂)| ≤ δ. The constraint is active at x̂ if r(x̂) = 0.

Definition 6. We say that (x, ζ) is a second-order point of PF(c) if there exist multipliers (λ, µ^{−}, µ^{+}, τ, ν, π^{−}, π^{+}) satisfying (9) (so (x, ζ) is a first-order point of PF(c)) and

$$
\tilde{u}^{T} \nabla^{2}_{(x,\zeta)(x,\zeta)} L_{c}(x, \zeta, \lambda, \mu^{-}, \mu^{+}, \tau, \nu)\, \tilde{u} \ge 0,
$$

for all ũ ∈ ℝ^{n+1} in the null space of the gradients of all active constraints of (2) at (x, ζ).

Definition 7. We say that (x, ζ) is an (ε, δ)-second-order point of PF(c) (ε, δ ≥ 0) if there exist multipliers (λ, µ^{−}, µ^{+}, τ, ν, π^{−}, π^{+}) satisfying (10) (so (x, ζ) is an ε-first-order point of PF(c)) and

$$
\tilde{u}^{T} \nabla^{2}_{(x,\zeta)(x,\zeta)} L_{c}(x, \zeta, \lambda, \mu^{-}, \mu^{+}, \tau, \nu)\, \tilde{u} \ge -C \|\tilde{u}\|^{2},
$$

for all ũ ∈ ℝ^{n+1} that are simultaneously in the null space of the gradients of all active bound constraints (G^{T}x ≥ 0, H^{T}x ≥ 0, 0 ≤ ζ ≤ ζ̄) of (2) at (x, ζ) and in the null space of the gradients of δ-active nonbound constraints (g(x) ≥ −ζe_{p}, ζe_{q} ≥ h(x) ≥ −ζe_{q}) at (x, ζ). Here C ≥ 0 is an arbitrary constant independent of (x, ζ).

We shall see in Subsection 5.3 that the bounded indefiniteness condition given in Definition 7 is numerically easier to verify than the more standard positive semidefiniteness condition (corresponding to C = 0). In particular, when we use an off-the-shelf code to solve PF(c), we generally have no knowledge and no control of how the active constraints are computed, if they are explicitly computed at all. Hence, it is difficult to check numerically whether the final point output by the code satisfies the positive semidefiniteness condition because this condition is sensitive to the value of the (unknown) tolerance δ. On the other hand, as our numerical experience in Subsection 5.3 suggests, the bounded indefiniteness condition seems fairly insensitive to δ.
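
The bounded indefiniteness condition of Definition 7 can be checked by projecting the Hessian of the Lagrangian onto the null space of the relevant constraint gradients and inspecting the smallest eigenvalue. The following NumPy sketch is one possible way to do this, not the procedure used in the paper; the function name and tolerance are assumptions, and the caller is assumed to have collected the gradients (with respect to (x, ζ)) of the active bound constraints and the δ-active nonbound constraints.

```python
import numpy as np


def bounded_indefiniteness_holds(hess, active_grads, C, tol=1e-10):
    """Check u^T hess u >= -C * ||u||^2 for all u orthogonal to active_grads.

    hess: (n+1) x (n+1) Hessian of L_c with respect to (x, zeta).
    active_grads: list of constraint gradients (possibly empty).
    """
    hess = np.asarray(hess, dtype=float)
    n = hess.shape[0]
    if len(active_grads) == 0:
        Z = np.eye(n)                       # no constraints: whole space
    else:
        A = np.asarray(active_grads, dtype=float).reshape(-1, n)
        _, s, Vt = np.linalg.svd(A)         # full SVD, Vt is n x n
        r = int(np.sum(s > tol * max(1.0, s[0])))
        Z = Vt[r:].T                        # orthonormal basis of the null space
    if Z.shape[1] == 0:
        return True                         # only u = 0 remains
    reduced = Z.T @ hess @ Z
    return float(np.linalg.eigvalsh(reduced).min()) >= -C - tol


# Hypothetical Hessian: f = (x1-1)^2 + (x2-1)^2 with penalty c = 4, variables (x1, x2, zeta).
H = [[2.0, 4.0, 0.0], [4.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
```

Because Z is orthonormal, the smallest eigenvalue of the reduced matrix is exactly the smallest curvature per unit length over the null space, so setting C = 0 recovers the positive semidefiniteness test of Definition 6.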

2.3. Relating First-Order Points of the MPEC and the Elastic Form

The following result identifies certain first-order points of PF(c) (2) with the strongly stationary points of the MPEC (1).

Theorem 2. If (x, ζ) is a first-order point of PF(c) with c ≥ 0 and x is feasible for (1), then (x, 0) is also a first-order point of PF(c), and x is strongly stationary for (1).

Proof. To prove the first claim, we show that when x is feasible for (1), ζ can be replaced by 0 in the conditions (9) and they will still be satisfied, without changes to the other variables. It is easy to see that the conditions (9d), (9e), and (9f) continue to hold after this substitution, while (9b) is not affected. Also, we have from (9c) that

$$
0 \le \pi^{+} \perp \bar{\zeta} - \zeta \ge 0.
\tag{11}
$$

If ζ̄ = 0, we must have that ζ = 0 already, so that the substitution of 0 for ζ is inconsequential. If ζ̄ − ζ > 0, we must have π^{+} = 0, so (11) still holds after ζ is replaced by 0. The final case is ζ = ζ̄ > 0 with π^{+} > 0. By the conditions 0 ≤ π^{−} ⊥ ζ ≥ 0, we have π^{−} = 0, so the right-hand side of (9b) is negative. On the other hand, since x is feasible in (1) and ζ > 0, we have g(x) + ζe_{p} > 0, ζe_{q} − h(x) > 0, and ζe_{q} + h(x) > 0, so it follows by complementarity in (9d), (9e), and (9f) that λ = 0 and µ^{−} = µ^{+} = 0. Hence, the left-hand side of (9b) equals c and is therefore nonnegative, a contradiction. Thus, we must have π^{+} = 0, so the conditions (11) will continue to hold after we replace ζ by 0. The first statement of the theorem is proved.

For the second statement, we can identify (9a) with (6a) by setting x^{∗} = x,
ζ = 0, and

τ^{∗}= τ − cH^{T}x, ν^{∗}= ν − cG^{T}x, λ^{∗}= λ, µ^{∗}= µ^{−}− µ^{+}. (12)
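
The identification (12) is a componentwise shift of the elastic-mode multipliers. The sketch below applies it to plain Python lists; the function name and argument layout are illustrative assumptions.

```python
def mpec_multipliers(c, lam, mu_minus, mu_plus, tau, nu, Gx, Hx):
    """Map multipliers of PF(c) to MPEC multipliers via (12).

    Gx and Hx hold G^T x and H^T x at the first-order point (x, 0).
    Returns (lam_star, mu_star, tau_star, nu_star).
    """
    tau_star = [t - c * hx for t, hx in zip(tau, Hx)]   # tau* = tau - c H^T x
    nu_star = [v - c * gx for v, gx in zip(nu, Gx)]     # nu*  = nu  - c G^T x
    mu_star = [m - p for m, p in zip(mu_minus, mu_plus)]  # mu* = mu^- - mu^+
    return list(lam), mu_star, tau_star, nu_star


# Hypothetical data: f = (x1-1)^2 + (x2-1)^2, 0 <= x1 ⊥ x2 >= 0, c = 4, x = (1, 0),
# for which (9a) gives tau = 0 and nu = 2.
lam_s, mu_s, tau_s, nu_s = mpec_multipliers(4.0, [], [], [], [0.0], [2.0], [1.0], [0.0])
```

In this example ν^{∗} = −2 is negative, which is compatible with strong stationarity because the index is not biactive (G^{T}x = 1 > 0), so (6g) does not apply to it.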

3. Global Convergence Results

In this section we state and prove results for methods in which PF(c_{k}) is solved for a nondecreasing sequence of positive scalars {c_{k}}. By “solved” we mean that either an exact or inexact first- or second-order point x^{k} of PF(c_{k}) is computed; we analyze various cases in the subsections below. We are interested particularly in techniques that achieve exact complementarity finitely; that is, (G^{T}x^{k})^{T}(H^{T}x^{k}) = 0 for all iterates k with c_{k} exceeding some threshold c^{∗}.

3.1. A Sequence of Inexact First-Order Points

Here we consider the situation in which an inexact first-order point (x^{k}, ζ_{k}) of PF(c_{k}) is generated, for k = 0, 1, . . ., and give conditions under which accumulation points of {x^{k}} are C-stationary. The proof is long and somewhat technical. It borrows some ideas from the proofs of Scholtes [24, Theorem 3.1] and Anitescu [2, Theorem 2.5].

Theorem 3. Let {c_{k}} be a positive sequence, nondecreasing with k, and {ε_{k}} be a nonnegative sequence with {c_{k}ε_{k}} → 0. Suppose that (x^{k}, ζ_{k}) is an ε_{k}-first-order point of PF(c_{k}), k = 0, 1, . . .. Let x^{∗} be any accumulation point of {x^{k}} that is feasible for (1) and satisfies MPEC-LICQ. Then x^{∗} is C-stationary for (1), and for any S ⊂ {0, 1, . . .} with {x^{k}}_{k∈S} → x^{∗}, we have {ζ_{k}}_{k∈S} → 0.

Proof. Suppose without loss of generality that {x^{k}} → x^{∗}. Since c_{k} ≥ c_{0} > 0 and {c_{k}ε_{k}} → 0, we have {ε_{k}} → 0. Let (λ^{k}, µ^{−k}, µ^{+k}, τ^{k}, ν^{k}, π^{−k}, π^{+k}) be multipliers associated with (x^{k}, ζ_{k}) (from (10)).

From the final row of (10), we have that, for all k,

$$
\nu_{i}^{k}(H_{i}^{T}x^{k}) \le (\nu^{k})^{T}(H^{T}x^{k}) \le \epsilon_{k}, \quad i = 1, 2, \dots, m,
\tag{13}
$$

so for i ∉ I_{H}, since H_{i}^{T}x^{k} is bounded away from zero, we have that ν_{i}^{k} = O(ε_{k}). By similar reasoning, we have that τ_{i}^{k} = O(ε_{k}) for i ∉ I_{G}. Using these two facts, we can write the first row of (10) as follows:

$$
\begin{aligned}
0 ={}& \nabla f(x^{k}) - \sum_{i=1}^{p} \lambda_{i}^{k} \nabla g_{i}(x^{k}) - \sum_{i=1}^{q} (\mu_{i}^{-k} - \mu_{i}^{+k}) \nabla h_{i}(x^{k}) \\
& - \sum_{i \in I_{G}} (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) G_{i} - \sum_{i \in I_{H}} (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) H_{i} \\
& + c_{k} \sum_{i \notin I_{G}} (H_{i}^{T}x^{k}) G_{i} + c_{k} \sum_{i \notin I_{H}} (G_{i}^{T}x^{k}) H_{i} + O(\epsilon_{k}).
\end{aligned}
$$

Since x^{∗} is feasible for (1), we have I_{G} ∪ I_{H} = {1, 2, . . . , m}, and the set of indices i ∉ I_{G} is simply I_{H}\I_{G}. Similarly, i ∉ I_{H} ⇔ i ∈ I_{G}\I_{H}. Hence, we can restate the relation above as follows:

$$
\begin{aligned}
0 ={}& \nabla f(x^{k}) - \sum_{i=1}^{p} \lambda_{i}^{k} \nabla g_{i}(x^{k}) - \sum_{i=1}^{q} (\mu_{i}^{-k} - \mu_{i}^{+k}) \nabla h_{i}(x^{k}) \\
& - \sum_{i \in I_{G} \cap I_{H}} (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) G_{i} - \sum_{i \in I_{G} \cap I_{H}} (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) H_{i} \\
& - \sum_{i \in I_{G} \setminus I_{H}} \Bigl[ (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) G_{i} - c_{k}(G_{i}^{T}x^{k}) H_{i} \Bigr] \\
& - \sum_{i \in I_{H} \setminus I_{G}} \Bigl[ (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) H_{i} - c_{k}(H_{i}^{T}x^{k}) G_{i} \Bigr] + O(\epsilon_{k}).
\end{aligned}
\tag{14}
$$

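We examine the final summation in (14) more closely. This term can be written as follows:

$$
\begin{aligned}
& \sum_{i \in I_{H} \setminus I_{G}} \Bigl[ (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) H_{i} - c_{k}(H_{i}^{T}x^{k}) G_{i} \Bigr] \\
&\quad = \sum_{i \in I_{H} \setminus I_{G}} \left[ (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) \left( H_{i} + \frac{H_{i}^{T}x^{k}}{G_{i}^{T}x^{k}} G_{i} \right) - \nu_{i}^{k} \frac{H_{i}^{T}x^{k}}{G_{i}^{T}x^{k}} G_{i} \right] \\
&\quad = \sum_{i \in I_{H} \setminus I_{G}} (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) \left( H_{i} + \frac{H_{i}^{T}x^{k}}{G_{i}^{T}x^{k}} G_{i} \right) + O(\epsilon_{k}),
\end{aligned}
\tag{15}
$$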

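where the final equality is a consequence of {G_{i}^{T}x^{k}} → G_{i}^{T}x^{∗} > 0 for i ∈ I_{H}\I_{G} and 0 ≤ ν_{i}^{k}H_{i}^{T}x^{k} ≤ ε_{k} (see (13)). Hence, by defining

$$
\tilde{H}_{i}^{k} \stackrel{\mathrm{def}}{=}
\begin{cases}
H_{i} + \dfrac{H_{i}^{T}x^{k}}{G_{i}^{T}x^{k}} G_{i}, & \text{for } i \in I_{H} \setminus I_{G}, \\[1ex]
H_{i}, & \text{for } i \in I_{G} \cap I_{H},
\end{cases}
\tag{16}
$$

we deduce from (15) that

$$
\sum_{i \in I_{H} \setminus I_{G}} \Bigl[ (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) H_{i} - c_{k}(H_{i}^{T}x^{k}) G_{i} \Bigr] = \sum_{i \in I_{H} \setminus I_{G}} (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) \tilde{H}_{i}^{k} + O(\epsilon_{k}).
\tag{17}
$$

Since {H_{i}^{T}x^{k}/G_{i}^{T}x^{k}} → 0 for i ∈ I_{H}\I_{G}, we have from (16) that

$$
\{\tilde{H}_{i}^{k}\} \to H_{i}, \quad \text{for } i \in I_{H}.
$$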

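A similar definition of G̃_{i}^{k} for i ∈ I_{G} yields for the second-to-last summation in (14) that

$$
\sum_{i \in I_{G} \setminus I_{H}} \Bigl[ (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) G_{i} - c_{k}(G_{i}^{T}x^{k}) H_{i} \Bigr] = \sum_{i \in I_{G} \setminus I_{H}} (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) \tilde{G}_{i}^{k} + O(\epsilon_{k}).
\tag{18}
$$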

By substituting (17) and (18) into (14) and using the definitions of H̃_{i}^{k} and G̃_{i}^{k}, we have

$$
\begin{aligned}
0 ={}& \nabla f(x^{k}) - \sum_{i=1}^{p} \lambda_{i}^{k} \nabla g_{i}(x^{k}) - \sum_{i=1}^{q} (\mu_{i}^{-k} - \mu_{i}^{+k}) \nabla h_{i}(x^{k}) \\
& - \sum_{i \in I_{G}} (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) \tilde{G}_{i}^{k} - \sum_{i \in I_{H}} (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) \tilde{H}_{i}^{k} + O(\epsilon_{k}).
\end{aligned}
\tag{19}
$$

We turn now to the term in (19) involving λ^{k}. By taking a further subsequence if necessary, we assume that there is a constant ρ > 0 such that g_{i}(x^{k}) ≥ ρ for all i ∉ I_{g} and all k. From the fourth row of (10) we have |(g(x^{k}) + ζ_{k}e_{p})^{T}λ^{k}| ≤ ε_{k} and therefore

$$
\sum_{i \notin I_{g}} (g_{i}(x^{k}) + \zeta_{k}) \lambda_{i}^{k} \le \epsilon_{k} - \sum_{i \in I_{g}} (g_{i}(x^{k}) + \zeta_{k}) \lambda_{i}^{k} \le \epsilon_{k} + \epsilon_{k} \sum_{i \in I_{g}} \lambda_{i}^{k},
$$

where the second inequality follows from the fact that λ_{i}^{k} ≥ 0 and g_{i}(x^{k}) + ζ_{k} ≥ −ε_{k} for all i (due to the fourth row of (10)). Since i ∉ I_{g} ⇒ g_{i}(x^{k}) + ζ_{k} ≥ g_{i}(x^{k}) ≥ ρ > 0, it follows that

$$
\rho \sum_{i \notin I_{g}} \lambda_{i}^{k} \le \epsilon_{k} + \epsilon_{k} \sum_{i \in I_{g}} \lambda_{i}^{k}, \quad \text{for all } k.
\tag{20}
$$

When Σ_{i∈I_{g}} λ_{i}^{k} ≥ 1, we have immediately from (20) that

$$
\frac{\sum_{i \notin I_{g}} \lambda_{i}^{k}}{\sum_{i \in I_{g}} \lambda_{i}^{k}} \le \frac{2\epsilon_{k}}{\rho}.
\tag{21}
$$

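Then

$$
\sum_{i=1}^{p} \lambda_{i}^{k} \nabla g_{i}(x^{k}) = \sum_{i \in I_{g}} \lambda_{i}^{k} \left[ \nabla g_{i}(x^{k}) + \frac{\sum_{j \notin I_{g}} \lambda_{j}^{k} \nabla g_{j}(x^{k})}{\sum_{j \in I_{g}} \lambda_{j}^{k}} \right] = \sum_{i \in I_{g}} \lambda_{i}^{k} \tilde{g}_{i}^{k},
$$

where the vector g̃_{i}^{k} is defined in the obvious way. Because of (21) and {x^{k}} → x^{∗}, we have {g̃_{i}^{k}} → ∇g_{i}(x^{∗}). Otherwise, when Σ_{i∈I_{g}} λ_{i}^{k} < 1, we have from (20) that

$$
\sum_{i \notin I_{g}} \lambda_{i}^{k} \le \frac{2\epsilon_{k}}{\rho} = O(\epsilon_{k}),
\tag{22}
$$

so that

$$
\sum_{i=1}^{p} \lambda_{i}^{k} \nabla g_{i}(x^{k}) = \sum_{i \in I_{g}} \lambda_{i}^{k} \tilde{g}_{i}^{k} + O(\epsilon_{k}),
$$

where we set g̃_{i}^{k} = ∇g_{i}(x^{k}) by definition. Thus, in both cases, we have that

$$
\sum_{i=1}^{p} \lambda_{i}^{k} \nabla g_{i}(x^{k}) = \sum_{i \in I_{g}} \lambda_{i}^{k} \tilde{g}_{i}^{k} + O(\epsilon_{k}) \quad \text{and} \quad \{\tilde{g}_{i}^{k}\} \to \nabla g_{i}(x^{*}), \quad i \in I_{g}.
$$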

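Using the first relation, we can write (19) as follows:

$$
\begin{aligned}
0 ={}& \nabla f(x^{k}) - \sum_{i \in I_{g}} \lambda_{i}^{k} \tilde{g}_{i}^{k} - \sum_{i=1}^{q} (\mu_{i}^{-k} - \mu_{i}^{+k}) \nabla h_{i}(x^{k}) \\
& - \sum_{i \in I_{G}} (\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}) \tilde{G}_{i}^{k} - \sum_{i \in I_{H}} (\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}) \tilde{H}_{i}^{k} + O(\epsilon_{k}).
\end{aligned}
\tag{23}
$$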

Since x^{∗} satisfies MPEC-LICQ, we can invoke Lemma 2 to deduce from (23) the existence of λ_{i}^{∗} for i ∈ I_{g}, τ_{i}^{∗} for i ∈ I_{G}, and ν_{i}^{∗} for i ∈ I_{H} such that

$$
0 = \nabla f(x^{*}) - \sum_{i \in I_{g}} \lambda_{i}^{*} \nabla g_{i}(x^{*}) - \sum_{i=1}^{q} \mu_{i}^{*} \nabla h_{i}(x^{*}) - \sum_{i \in I_{G}} \tau_{i}^{*} G_{i} - \sum_{i \in I_{H}} \nu_{i}^{*} H_{i},
$$

and, moreover,

\begin{align}
& \{\lambda_{i}^{k}\} \to \lambda_{i}^{*}, \quad \text{for } i \in I_{g}, \tag{24a} \\
& \{\mu_{i}^{-k} - \mu_{i}^{+k}\} \to \mu_{i}^{*}, \quad \text{for } i = 1, 2, \dots, q, \tag{24b} \\
& \{\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k}\} \to \tau_{i}^{*}, \quad \text{for } i \in I_{G}, \tag{24c} \\
& \{\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k}\} \to \nu_{i}^{*}, \quad \text{for } i \in I_{H}. \tag{24d}
\end{align}
We now analyze (24c) and (24d) for i ∈ I_{G} ∩ I_{H}. Since τ_{i}^{k}, ν_{i}^{k}, G_{i}^{T}x^{k}, and H_{i}^{T}x^{k} are all nonnegative, we have

$$
\begin{aligned}
(\tau_{i}^{k} - c_{k}H_{i}^{T}x^{k})(\nu_{i}^{k} - c_{k}G_{i}^{T}x^{k})
&= \tau_{i}^{k}\nu_{i}^{k} + c_{k}^{2}(H_{i}^{T}x^{k})(G_{i}^{T}x^{k}) - c_{k}(\tau_{i}^{k}G_{i}^{T}x^{k} + \nu_{i}^{k}H_{i}^{T}x^{k}) \\
&\ge -c_{k}(\tau_{i}^{k}G_{i}^{T}x^{k} + \nu_{i}^{k}H_{i}^{T}x^{k}) \\
&\ge -2c_{k}\epsilon_{k},
\end{aligned}
$$

where the final inequality follows from (10). Taking limits as k → ∞ and using {c_{k}ε_{k}} → 0, we conclude that τ_{i}^{∗}ν_{i}^{∗} ≥ 0 for i ∈ I_{G} ∩ I_{H}, implying C-stationarity.

To complete the proof, we show by contradiction that {ζ_{k}} → 0. If this limit did not hold, we could assume by taking a subsequence if necessary that ζ_{k} ≥ ζ > 0 for all k. Since x^{∗} is feasible, we have that g_{i}(x^{∗}) ≥ 0 for all i, so for all k sufficiently large we have

$$
g_{i}(x^{k}) + \zeta_{k} \ge \zeta/2, \quad \text{for } i = 1, 2, \dots, p.
$$

Hence, we have from the fourth row of (10) that

$$
e_{p}^{T}\lambda^{k} \le 2\epsilon_{k}/\zeta,
$$

for all k sufficiently large. Similarly, since h(x^{∗}) = 0, we have that

$$
\zeta_{k} - h_{i}(x^{k}) \ge \zeta/2, \quad \zeta_{k} + h_{i}(x^{k}) \ge \zeta/2, \quad \text{for } i = 1, 2, \dots, q.
$$

Hence

$$
e_{q}^{T}\mu^{-k} \le 2\epsilon_{k}/\zeta, \quad e_{q}^{T}\mu^{+k} \le 2\epsilon_{k}/\zeta,
$$

for all k sufficiently large. This together with the second row of (10) yields

$$
\pi^{-k} - \pi^{+k} = c_{k} + O(\epsilon_{k}).
$$

Since π^{+k} ≥ 0, this implies π^{−k} ≥ c_{k} + O(ε_{k}). Also, from the third row of (10), we have ζ_{k}π^{−k} ≤ ε_{k}. Thus

$$
\zeta_{k} \le \frac{\epsilon_{k}}{\pi^{-k}} \le \frac{\epsilon_{k}}{c_{k} + O(\epsilon_{k})} \to 0 \quad \text{as } k \to \infty,
$$

contradicting our positive lower bound on ζ_{k}.
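
The C-stationarity step of this proof rests on an elementary identity: for nonnegative τ, ν, a = G_{i}^{T}x, b = H_{i}^{T}x and c ≥ 0, the product (τ − cb)(ν − ca) exceeds −c(τa + νb) by exactly τν + c²ab ≥ 0. The following sketch checks this on random nonnegative data as a sanity check; the function and variable names are illustrative, and the check is not a substitute for the argument above.

```python
import random


def product_gap(tau, nu, a, b, c):
    """Return (tau - c*b)*(nu - c*a) + c*(tau*a + nu*b).

    Algebraically this equals tau*nu + c**2 * a * b, which is nonnegative
    whenever tau, nu, a, b are nonnegative.
    """
    return (tau - c * b) * (nu - c * a) + c * (tau * a + nu * b)


random.seed(0)
samples = [tuple(random.uniform(0.0, 10.0) for _ in range(5)) for _ in range(1000)]
worst = min(product_gap(*s) for s in samples)   # nonnegative up to rounding error
```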

Without loss of generality, we could assume in Theorem 3 that {c_{k}} is increasing (rather than nondecreasing). However, allowing {c_{k}} to be nondecreasing is convenient when, for example, (x^{k}, ζ_{k}) is the point generated at the kth iteration of an iterative method that allows c_{k} to remain unchanged from one iteration to the next; see Algorithm Elastic-Inexact in Section 3.5.

The following corollary gives additional global convergence properties of the
sequence {(x^{k}, ζ_{k})}.

Corollary 1. Suppose that the assumptions of Theorem 3 hold, where x^{∗} is an accumulation point of {x^{k}} that is C-stationary for (1) and satisfies MPEC-LICQ. Then for any S ⊂ {0, 1, . . .} such that {x^{k}}_{k∈S} → x^{∗}, we have that

\begin{align}
& \{c_{k}G_{i}^{T}x^{k}\}_{k \in S} \to 0, \quad \text{for } i \in I_{G} \setminus I_{H}, \tag{25a} \\
& \{c_{k}H_{i}^{T}x^{k}\}_{k \in S} \to 0, \quad \text{for } i \in I_{H} \setminus I_{G}, \tag{25b} \\
& \{c_{k}(G_{i}^{T}x^{k})(H_{i}^{T}x^{k})\}_{k \in S} \to 0, \quad \text{for } i \in I_{G} \cap I_{H}, \tag{25c} \\
& \{c_{k}\zeta_{k}\}_{k \in S} \to 0. \tag{25d}
\end{align}

Proof. We first prove (25b); the proof of (25a) is analogous. If {c_{k}} is bounded (from above by c̄, say), then the result follows from

$$
0 \le c_{k}H_{i}^{T}x^{k} \le \bar{c}H_{i}^{T}x^{k} \to \bar{c}H_{i}^{T}x^{*} = 0, \quad \text{as } k \in S,\ k \to \infty, \quad i \in I_{H} \setminus I_{G}.
$$

Suppose instead that {c_{k}} ↑ ∞. Assume for contradiction that there is some S̄ ⊂ S, some i ∈ I_{H}\I_{G}, and some constant ρ > 0 such that c_{k}H_{i}^{T}x^{k} ≥ ρ for all k ∈ S̄. From the final row of (10), we have that ν_{i}^{k}H_{i}^{T}x^{k} ≤ (ν^{k})^{T}H^{T}x^{k} ≤ ε_{k}, implying

$$
\nu_{i}^{k} c_{k}H_{i}^{T}x^{k} \le c_{k}\epsilon_{k} \to 0, \quad \text{as } k \in \bar{S},\ k \to \infty.
$$

It follows from c_{k}H_{i}^{T}x^{k} ≥ ρ that {ν_{i}^{k}}_{k∈S̄} → 0. From the limit (24d), we then have that

$$
\{c_{k}G_{i}^{T}x^{k}\}_{k \in \bar{S}} \to -\nu_{i}^{*}.
$$

Since {c_{k}} ↑ ∞, this limit implies that {G_{i}^{T}x^{k}}_{k∈S̄} → 0. Since {x^{k}} → x^{∗}, it follows that G_{i}^{T}x^{∗} = 0, implying that i ∈ I_{G}. This contradicts our choice of i ∈ I_{H}\I_{G}, so (25b) must hold in this case too.

If {c_{k}} is bounded, then (25c) follows from the feasibility of x^{∗} for (1). Suppose instead that {c_{k}} ↑ ∞. Assume for contradiction that there is some S̄ ⊂ S, some i ∈ I_{G} ∩ I_{H}, and some constant ρ > 0 such that c_{k}(G_{i}^{T}x^{k})(H_{i}^{T}x^{k}) ≥ ρ for all k ∈ S̄. Thus by (24c), we have

$$
\tau_{i}^{k} = c_{k}H_{i}^{T}x^{k} + O(1) \ge \frac{\rho}{G_{i}^{T}x^{k}} + O(1) \ge \frac{\rho}{2G_{i}^{T}x^{k}},
$$

for all k ∈ S̄ sufficiently large. However, from the second-to-last row of (10), we have τ_{i}^{k}G_{i}^{T}x^{k} ≤ (τ^{k})^{T}(G^{T}x^{k}) ≤ ε_{k}, so that τ_{i}^{k} ≤ ε_{k}/G_{i}^{T}x^{k}, yielding the desired contradiction since {ε_{k}} → 0.

To prove (25d), we see from the second row of (10), after multiplying by ζ_{k} ≥ 0, that, for all k,

$$
c_{k}\zeta_{k} \le \zeta_{k}(e_{p}^{T}\lambda^{k} + e_{q}^{T}\mu^{-k} + e_{q}^{T}\mu^{+k} + \pi^{-k} - \pi^{+k}) + \epsilon_{k}\zeta_{k}.
\tag{26}
$$

Because of (24a) and {ζ_{k}}_{k∈S} → 0 (Theorem 3), we have that {ζ_{k}e_{p}^{T}λ^{k}}_{k∈S} → 0. Similarly, it is immediate that {ε_{k}ζ_{k}}_{k∈S} → 0. From the fifth and sixth rows of (10) and (24b), we also have

$$
\begin{aligned}
\zeta_{k}(e_{q}^{T}\mu^{+k} + e_{q}^{T}\mu^{-k})
&\le h(x^{k})^{T}(\mu^{+k} - \mu^{-k}) + 2\epsilon_{k} \\
&\le \|h(x^{k})\|_{\infty} \|\mu^{+k} - \mu^{-k}\|_{1} + 2\epsilon_{k} \\
&\le (\zeta_{k} + \epsilon_{k}) \|\mu^{+k} - \mu^{-k}\|_{1} + 2\epsilon_{k} \to 0, \quad \text{as } k \in S,\ k \to \infty.
\end{aligned}
$$

Lastly, from the third row of (10), we have ζ_{k}(π^{−k} − π^{+k}) ≤ ε_{k} − ζ̄π^{+k} ≤ ε_{k}. Hence, by taking limits in (26), we have the desired result (25d).

In Theorem 3, we assumed that the accumulation point x^{∗} is feasible for (1). This assumption is fairly mild and, as we show below, is satisfied under the following assumptions on {(x^{k}, ζ_{k})} and {c_{k}}.

Assumption 1 (a) {f(x^{k})} is bounded from below.

(b) {f(x^{k}) + c_{k}ζ_{k} + c_{k}(G^{T}x^{k})^{T}(H^{T}x^{k})} is bounded from above.

(c) There exist positive sequences {ω_{k}} → 0, {η_{k}} → ∞ such that c_{k+1} ≥ η_{k+1} whenever ζ_{k} + (G^{T}x^{k})^{T}(H^{T}x^{k}) ≥ ω_{k}.

Assumption 1(a) holds if f is bounded from below over the feasible set of PF(c_{k}). Assumption 1(b) holds if (i) the method for solving PF(c_{k}) has the property that the final point (x^{k}, ζ_{k}) it generates has objective value no greater than that of the starting point whenever the starting point is feasible for PF(c_{k}); and (ii) this method is started at (x̄, 0), with x̄ a feasible point of (1). Then (x̄, 0) is feasible for PF(c_{k}), with objective value f(x̄), so that

$$
f(x^{k}) + c_{k}\zeta_{k} + c_{k}(G^{T}x^{k})^{T}(H^{T}x^{k}) \le f(\bar{x}), \quad \text{for all } k.
$$

Assumption 1(c) holds if we choose c_{k+1} ≥ max{c_{k}, η_{k+1}} whenever (G^{T}x^{k})^{T}(H^{T}x^{k}) + ζ_{k} ≥ ω_{k}. Assumption 1 contrasts with the infeasible-point MPEC-LICQ assumption used in [10, Lemma 3.2].
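
The update rule that guarantees Assumption 1(c) is simple to state in code. The sketch below is one minimal way to implement it; the function name and argument names are illustrative, and the choice to leave the penalty unchanged when the infeasibility measure is below ω_{k} is an assumption (any nondecreasing choice would do).

```python
def next_penalty(c_k, eta_next, zeta_k, compl_k, omega_k):
    """Return c_{k+1} consistent with Assumption 1(c).

    compl_k is the complementarity residual (G^T x^k)^T (H^T x^k);
    eta_next is eta_{k+1}; omega_k is the current threshold.
    """
    if zeta_k + compl_k >= omega_k:
        return max(c_k, eta_next)   # enforce c_{k+1} >= eta_{k+1}, never decrease
    return c_k                      # infeasibility small enough: keep the penalty
```

For instance, with ω_{k} = 1/(k+1) and η_{k} = 10^{k} (hypothetical choices that satisfy {ω_{k}} → 0 and {η_{k}} → ∞), the penalty grows only on iterations where ζ_{k} plus the complementarity residual fails to decrease quickly enough.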

Lemma 1. Let {c_{k}} be a positive sequence, nondecreasing with k, and {ε_{k}} be a nonnegative sequence with {ε_{k}} → 0. Suppose that (x^{k}, ζ_{k}) is an ε_{k}-first-order point of PF(c_{k}), k = 0, 1, . . ., and that Assumption 1 is satisfied. Then every accumulation point of {x^{k}} is feasible for (1).

Proof. It suffices to show that

{ζk} → 0, {(G^{T}x^{k})^{T}(H^{T}x^{k})} → 0. (27)
Then any accumulation point (x^{∗}, ζ_{∗}) of {(x^{k}, ζk)} satisfies (G^{T}x^{∗})^{T}(H^{T}x^{∗}) = 0
and ζ_{∗}= 0, implying that x^{∗}is feasible for (1). (The other constraints of (1) are
satisfied by x^{∗}, from rows 4 to 8 of (10) and {k} → 0.)

We divide our argument into two cases. First, suppose that

ζ_{k} + (G^{T}x^{k})^{T}(H^{T}x^{k}) < ω_{k} (28)

for all k sufficiently large. Since ζ_{k} ≥ 0, G^{T}x^{k} ≥ 0, H^{T}x^{k} ≥ 0 for all k and
{ω_{k}} → 0, the bound (28) implies (27). Second, suppose that (28) fails to hold for
all k in some infinite subsequence. Then, by Assumption 1(c), c_{k+1} ≥ η_{k+1} for all
k in this subsequence. Since {c_{k}} is nondecreasing and {η_{k}} → ∞, we have that
{c_{k}} ↑ ∞. Assumptions 1(a) and 1(b) imply that {c_{k}ζ_{k} + c_{k}(G^{T}x^{k})^{T}(H^{T}x^{k})} is
bounded from above. Since ζ_{k} ≥ 0, G^{T}x^{k} ≥ 0, H^{T}x^{k} ≥ 0 for all k and {c_{k}} ↑ ∞,
(27) follows.
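The last step of the second case can be written out explicitly. In the following display, L and U are names introduced here (they do not appear in the original) for a lower bound on {f(x^{k})} from Assumption 1(a) and an upper bound on the sequence in Assumption 1(b), respectively.

```latex
\begin{align*}
c_k\bigl(\zeta_k + (G^T x^k)^T (H^T x^k)\bigr)
  &\le U - f(x^k) \;\le\; U - L, \\
0 \;\le\; \zeta_k + (G^T x^k)^T (H^T x^k)
  &\le \frac{U - L}{c_k} \;\longrightarrow\; 0
  \quad \text{as } c_k \uparrow \infty .
\end{align*}
```

Because both ζ_{k} and (G^{T}x^{k})^{T}(H^{T}x^{k}) are nonnegative, each of them is bounded by (U − L)/c_{k} and therefore tends to zero separately.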

3.2. A Sequence of Exact Second-Order Points

In this subsection, we consider the situation in which an exact second-order point
(x^{k}, ζ_{k}) of PF(c_{k}) is generated (Definition 6), for k = 0, 1, . . ., with {c_{k}} ↑ ∞.

Algorithm Elastic-Exact

Choose c_{k} > 0, k = 0, 1, . . ., with {c_{k}} ↑ ∞;

for k = 0, 1, 2, . . .

    Find a second-order point (x^{k}, ζ_{k}) of PF(c_{k}) with Lagrange multipliers
    (λ^{k}, µ^{−k}, µ^{+k}, τ^{k}, ν^{k}, π^{−k}, π^{+k});

    if ζ_{k} = 0 and (G^{T}x^{k})^{T}(H^{T}x^{k}) = 0,
        STOP.

    end (if)

end (for)
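The outer loop of Algorithm Elastic-Exact can be sketched in Python as follows. The sketch assumes a user-supplied oracle `solve_pf(c)` that returns a second-order point of PF(c); this oracle, the finite penalty list, and the two-variable toy problem used to exercise the loop are all hypothetical and introduced only for illustration. For the toy problem (minimize (x1 − 1)^2 + (x2 − 1)^2 subject to 0 ≤ x1 ⊥ x2 ≥ 0, with no g or h constraints) the points returned by the stub were worked out by hand, and ζ is taken to be zero on the assumption that it is only penalized in the absence of g and h.

```python
def elastic_exact(solve_pf, penalties, complementarity):
    """Outer loop of Algorithm Elastic-Exact over a finite list of penalties.

    solve_pf(c)       -> (x, zeta): hypothetical oracle for a second-order point of PF(c).
    complementarity(x) -> value of (G^T x)^T (H^T x).
    Returns (k, c_k, x^k) at termination, or None if the list is exhausted.
    """
    for k, c in enumerate(penalties):
        x, zeta = solve_pf(c)
        if zeta == 0 and complementarity(x) == 0:
            return k, c, x  # STOP: x is feasible for the MPEC
    return None


def toy_solve_pf(c):
    """Hand-derived second-order point of the toy penalized problem.

    Toy problem: minimize (x1 - 1)^2 + (x2 - 1)^2 + c * x1 * x2 over x >= 0.
    For c <= 2 the symmetric stationary point t = 2 / (2 + c) satisfies the
    second-order necessary conditions; for c > 2 it is a saddle point, and
    (1, 0) is a second-order point with multiplier c - 2 on the bound x2 >= 0.
    """
    if c > 2.0:
        return (1.0, 0.0), 0.0
    t = 2.0 / (2.0 + c)
    return (t, t), 0.0


result = elastic_exact(toy_solve_pf, [1.0, 2.0, 4.0, 8.0], lambda x: x[0] * x[1])
print(result)
```

On this toy instance the loop stops at the first penalty value exceeding 2, which illustrates finite termination with exact complementarity; it is not evidence about the behavior on general problems.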

We show below that either the algorithm terminates finitely, in which case
the final iterate x^{k} is strongly stationary by Theorem 2, or each accumulation
point of {x^{k}} either is infeasible or fails to satisfy MPEC-LICQ.

Theorem 4. If Algorithm Elastic-Exact does not terminate finitely, then every
accumulation point x^{∗} of {x^{k}} either is infeasible for (1) or fails to satisfy
MPEC-LICQ.

Proof. Assume for contradiction that the algorithm does not terminate finitely
and that there is an accumulation point x^{∗} that is feasible for (1) and satisfies
MPEC-LICQ. Let S ⊂ {0, 1, . . .} index the subsequence for which {x^{k}}_{k∈S} → x^{∗}.
Since (x^{k}, ζ_{k}) is a first-order point of PF(c_{k}), with multipliers λ^{k}, µ^{−k}, µ^{+k}, τ^{k}, ν^{k},
π^{−k}, π^{+k}, Theorem 3 (with ε_{k} ≡ 0) shows that x^{∗} is C-stationary. Our aim is
to show that in fact x^{k} is feasible for (1) for sufficiently large k ∈ S, and hence
the algorithm terminates finitely.

For any k, if ζ_{k} > 0, then (9c) would imply that π^{−k} = 0, and hence

c_{k} − e^{T}_{p}λ^{k} − e^{T}_{q}(µ^{−k} + µ^{+k}) = −π^{+k} ≤ 0. (29)

Moreover, (9e) and (9f) would imply that, for each i, either µ^{−k}_{i} = 0 or µ^{+k}_{i} = 0,
and hence ‖µ^{−k} + µ^{+k}‖ = ‖µ^{−k} − µ^{+k}‖. Since (24a) and (24b) hold when
restricted to k ∈ S, and since {c_{k}} → ∞, we have that (29) must fail for all
k ∈ S sufficiently large. Thus, we have

ζ_{k} = 0, for all k ∈ S sufficiently large.

We next show that because (x^{k}, ζ_{k}) satisfies the second-order necessary
optimality condition for PF(c_{k}), we must have (G^{T}_{j}x^{k})(H_{j}^{T}x^{k}) = 0 for all j ∈ I_{G} ∩ I_{H}
and all k ∈ S sufficiently large. Suppose not. By passing to a further subsequence
if necessary, there must be an index j ∈ I_{G} ∩ I_{H} such that (G^{T}_{j}x^{k})(H_{j}^{T}x^{k}) ≠ 0
for all k ∈ S. Define a direction d^{k} satisfying the following conditions:

∇g_{i}(x^{k})^{T}d^{k} = 0, for i ∈ I_{g},
∇h(x^{k})^{T}d^{k} = 0,
G^{T}_{i}d^{k} = 0, for i ∈ I_{G} with i ≠ j,
H_{i}^{T}d^{k} = 0, for i ∈ I_{H} with i ≠ j,
G^{T}_{j}d^{k} = 1,
H_{j}^{T}d^{k} = −1. (30)

Since {x^{k}}_{k∈S} → x^{∗} and MPEC-LICQ holds at x^{∗}, the gradients in this
definition are linearly independent for all k ∈ S sufficiently large, in which case d^{k}
satisfying these equations is well defined. In fact, we can choose d^{k} so that

‖d^{k}‖ = O(1).
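One way to see why a bounded choice exists is to take the minimum-norm solution of (30). In the display below, A_k, A_*, b, and σ_min are notation introduced here for illustration: A_k stacks the rows ∇g_{i}(x^{k})^{T} (i ∈ I_{g}), the rows of ∇h(x^{k})^{T}, G_{i}^{T} (i ∈ I_{G}), and H_{i}^{T} (i ∈ I_{H}), ordered so that G_{j}^{T} and H_{j}^{T} come last; A_* is the same matrix evaluated at x^{∗}; and σ_min denotes the smallest singular value.

```latex
\begin{align*}
A_k d^k &= b, \qquad b = (0, \dots, 0, 1, -1)^T, \\
d^k &= A_k^T \bigl(A_k A_k^T\bigr)^{-1} b, \\
\|d^k\|_2 &\le \frac{\|b\|_2}{\sigma_{\min}(A_k)}
          = \frac{\sqrt{2}}{\sigma_{\min}(A_k)}, \qquad
\sigma_{\min}(A_k) \to \sigma_{\min}(A_*) > 0 \quad (k \in S,\ k \to \infty).
\end{align*}
```

The limit uses continuity of ∇g and ∇h, and the positivity of σ_min(A_*) is exactly the full-row-rank property supplied by MPEC-LICQ at x^{∗}.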

Since {(x^{k}, ζ_{k})}_{k∈S} → (x^{∗}, 0), the set of active constraints of (2) at (x^{k}, ζ_{k}) is
a subset of the active constraints at (x^{∗}, 0) for all k ∈ S sufficiently large, in
which case the direction (d^{k}, 0) lies in the direction set described in Definition 6
corresponding to c_{k}, (x^{k}, ζ_{k}, λ^{k}, µ^{−k}, µ^{+k}, τ^{k}, ν^{k}). (Notice that the constraints
G^{T}_{j}x ≥ 0 and H_{j}^{T}x ≥ 0 are not active at (x^{k}, ζ_{k}) because (G^{T}_{j}x^{k})(H_{j}^{T}x^{k}) ≠ 0.)
Also, (9d) implies λ^{k}_{i} = 0, i ∉ I_{g}, for all k ∈ S sufficiently large.

From Definition (8) and using λ^{k}_{i} = 0 for i ∉ I_{g}, we have

∇^{2}_{xx}L_{c_{k}}(x^{k}, ζ_{k}, λ^{k}, µ^{+k}, µ^{−k}, τ^{k}, ν^{k}) = ∇^{2}f(x^{k}) − Σ_{i∈I_{g}} λ^{k}_{i}∇^{2}g_{i}(x^{k}) − Σ_{i=1}^{q} (µ^{−k}_{i} − µ^{+k}_{i})∇^{2}h_{i}(x^{k}) + c_{k} Σ_{i=1}^{m} (G_{i}H_{i}^{T} + H_{i}G^{T}_{i}),