
to appear in Journal of Global Optimization, 2015

On the existence of saddle points for nonlinear second-order cone programming problems

Jinchuan Zhou 1
Department of Mathematics, School of Science
Shandong University of Technology
Zibo 255049, P.R. China
E-mail: jinchuanzhou@163.com

Jein-Shan Chen 2
Department of Mathematics
National Taiwan Normal University
Taipei 11677, Taiwan
E-mail: jschen@math.ntnu.edu.tw

October 8, 2013
(1st revision on May 17, 2014; 2nd revision on August 31, 2014)

Abstract. In this paper, we study the existence of local and global saddle points for nonlinear second-order cone programming problems. The existence of local saddle points is developed by using the second-order sufficient conditions, in which a sigma-term is added to reflect the curvature of the second-order cone. Furthermore, by dealing with a perturbation of the primal problem, we establish the existence of global saddle points, which is applicable to the case of multiple optimal solutions. The close relationship between global saddle points and exact penalty representations is discussed as well.

Keywords. Local and global saddle points, second-order sufficient conditions, augmented Lagrangian, exact penalty representations.

AMS subject classifications. 90C26, 90C46.

1The author’s work is supported by National Natural Science Foundation of China (11101248, 11171247, 11271233), Shandong Province Natural Science Foundation (ZR2010AQ026, ZR2012AM016), and Young Teacher Support Program of Shandong University of Technology.

2Corresponding author. Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is supported by Ministry of Science and Technology, Taiwan.


1 Introduction

Recall that the second-order cone (SOC), also called the Lorentz cone or ice-cream cone, in IR^{m+1} is defined as

K^{m+1} := {(x₁, x₂) ∈ IR × IR^m | ‖x₂‖ ≤ x₁},

where ‖·‖ denotes the Euclidean norm. The order relation induced by this pointed closed convex cone K^{m+1} is given by

x ⪰_{K^{m+1}} 0 ⇐⇒ x ∈ K^{m+1}, i.e., x₁ ≥ ‖x₂‖.

In this paper, we consider the following nonlinear second-order cone programming problem (NSOCP):

min f(x)
s.t. g_j(x) ⪰_{K^{m_j+1}} 0, j = 1, 2, …, J,   (1)
     h(x) = 0,

where f : IR^n → IR, h : IR^n → IR^l, and g_j : IR^n → IR^{m_j+1} are twice continuously differentiable functions, and K^{m_j+1} is the second-order cone in IR^{m_j+1} for j = 1, 2, ⋯, J.
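Membership in K^{m+1} and the Euclidean projection onto it (used throughout this paper via Π and dist) have simple closed forms. The three-case projection formula below is the standard one for the SOC; it is not derived in this paper, so the snippet is an illustrative sketch rather than part of the text's development:

```python
import numpy as np

def in_soc(x, tol=1e-12):
    """Membership test for K^{m+1} = {(x1, x2) : ||x2|| <= x1}."""
    return np.linalg.norm(x[1:]) <= x[0] + tol

def proj_soc(v):
    """Euclidean projection onto K^{m+1} (standard three-case formula)."""
    v1, v2 = v[0], v[1:]
    n2 = np.linalg.norm(v2)
    if n2 <= v1:                      # v already in the cone
        return v.copy()
    if n2 <= -v1:                     # v in the polar cone -K: project to 0
        return np.zeros_like(v)
    a = (v1 + n2) / 2.0               # otherwise: project to the boundary
    return np.concatenate(([a], (a / n2) * v2))

x = np.array([2.0, 1.0, 1.0])   # ||(1,1)|| = sqrt(2) <= 2, so x is in K^3
y = np.array([1.0, 2.0, 0.0])   # ||(2,0)|| = 2  > 1,     so y is not
print(in_soc(x), in_soc(y))     # True False
print(proj_soc(y))              # [1.5 1.5 0. ]
```

Note that the residual y − Π(y) lies in the polar cone and is orthogonal to the projection, as the projection theorem for closed convex cones requires.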

For a given nonlinear programming problem, we can define another programming problem associated with it by using the traditional Lagrangian function. The original problem is called the primal problem, and the latter is called the dual problem. Since the weak duality property always holds, our concern is how to obtain the strong duality property (or zero duality gap property). In other words, we want to know when the primal and dual problems have the same optimal value, which provides the theoretical foundation for many primal-dual type methods. However, if we employ the traditional Lagrangian function, then some convexity is necessary for achieving the strong duality property. To overcome this drawback, we need to resort to augmented Lagrangian functions, whose main advantage is ensuring the strong duality property without requiring convexity. In addition, the zero duality gap property coincides with the existence of global saddle points, provided that the optimal solution sets of the primal and dual problems are both nonempty. Many researchers have studied the properties of augmented Lagrangians and the existence of saddle points. For example, Rockafellar and Wets [13] proposed a class of augmented Lagrangians in which the augmenting function is required to be convex. This was extended by Huang and Yang [6], where the convexity condition is replaced by level-boundedness, and it was further generalized by Zhou and Yang [21], where the level-boundedness condition is replaced by the so-called valley-at-zero property; see also [14] for more details. These important works give a unified framework for the augmented Lagrangian function and its duality theory. Meanwhile, Floudas and Jongen [5] pointed out the crucial role of saddle points for the minimization of smooth functions with a finite number of stationary points. The necessary and/or sufficient conditions


to ensure the existence of local and/or global saddle points were investigated by many researchers. For example, the existence of local and global saddle points of Rockafellar's augmented Lagrangian function was studied in [12]. Local saddle points of the generalized Mangasarian augmented Lagrangian were analyzed in [19]. The existence of local and global saddle points of the p-th power nonlinear Lagrangian was discussed in [7, 8, 18]. For more references, please see [9, 10, 14, 16, 17, 20, 22].

All the results mentioned above focus on either standard nonlinear programming or the generalized minimizing problems of [13]. The main purpose of this paper is to establish the existence of local and global saddle points for NSOCP (1) by fully exploiting the special structure of the SOC. As is known from nonlinear programming, the positive definiteness of ∇²xxL over the critical cone is a sufficient condition for the existence of local saddle points. However, this classical result cannot be extended trivially to NSOCP (1), and the analysis is more complicated because IR^n_+ is polyhedral whereas K^{m+1} is non-polyhedral. Hence, we particularly study the sigma-term [4], which to some extent stands for the curvature of the second-order cone. Our result shows that a local saddle point exists provided that the sum of ∇²xxL and H is positive definite, even if ∇²xxL itself is indefinite (see Theorem 2.3). This undoubtedly clarifies the essential role played by the sigma-term. Moreover, by developing the perturbation of the primal problem, we establish the existence of global saddle points without restricting the optimal solution to be unique, as required in [12, 16]. Furthermore, we study another important concept, exact penalty representation, and develop new necessary and sufficient conditions for it. The close relationship between global saddle points and exact penalty representations is established as well.

To end this section, we introduce some basic concepts which will be needed for our subsequent analysis. Let IR^n be the n-dimensional real vector space. For x, y ∈ IR^n, the inner product is denoted by xᵀy or ⟨x, y⟩. Given a convex subset A ⊆ IR^n and a point x ∈ A, the normal cone of A at x, denoted by N_A(x), is defined as

N_A(x) := {v ∈ IR^n | ⟨v, z − x⟩ ≤ 0, ∀ z ∈ A},

and the tangent cone, denoted by T_A(x), is defined as

T_A(x) := N_A(x)°,

where N_A(x)° means the polar cone of N_A(x). Given d ∈ T_A(x), the outer second-order tangent set is defined as

T²_A(x, d) := {w ∈ IR^n | ∃ t_n ↓ 0 such that dist(x + t_n d + (1/2)t_n² w, A) = o(t_n²)}.

The support function of A is

σ(x | A) := sup{⟨x, z⟩ | z ∈ A}.


We also write cl(A), int(A), and ∂(A) to stand for the closure, interior, and boundary of A, respectively. For simplicity of notation, we write K_j to stand for K^{m_j+1} and let K be the Cartesian product of these second-order cones, i.e., K := K_1 × K_2 × ⋯ × K_J. In addition, we denote g(x) := (g_1(x), g_2(x), ⋯, g_J(x)), p := Σ_{j=1}^J (m_j + 1), and let S be the solution set of NSOCP (1). According to [13, Exercise 11.57], the augmented Lagrangian function for NSOCP (1) is written as

L_c(x, λ, μ) := f(x) + ⟨μ, h(x)⟩ + (c/2)‖h(x)‖² + (c/2) Σ_{j=1}^J [dist²(g_j(x) − λ_j/c, K_j) − ‖λ_j/c‖²].   (2)

Here c ∈ IR_{++} := {ζ ∈ IR | ζ > 0} and (x, λ, μ) ∈ IR^n × IR^p × IR^l with λ = (λ_1, λ_2, ⋯, λ_J) ∈ IR^{m_1+1} × IR^{m_2+1} × ⋯ × IR^{m_J+1}.

Definition 1.1. Let L_c be given as in (2) and (x̄, λ̄, μ̄) ∈ IR^n × IR^p × IR^l.

(a) The triple (x̄, λ̄, μ̄) is said to be a local saddle point of L_c for some c > 0 if there exists δ > 0 such that

L_c(x̄, λ, μ) ≤ L_c(x̄, λ̄, μ̄) ≤ L_c(x, λ̄, μ̄), ∀ x ∈ B(x̄, δ), (λ, μ) ∈ IR^p × IR^l,   (3)

where B(x̄, δ) denotes the δ-neighborhood of x̄, i.e., B(x̄, δ) := {x ∈ IR^n | ‖x − x̄‖ ≤ δ}.

(b) The triple (x̄, λ̄, μ̄) is said to be a global saddle point of L_c for some c > 0 if

L_c(x̄, λ, μ) ≤ L_c(x̄, λ̄, μ̄) ≤ L_c(x, λ̄, μ̄), ∀ x ∈ IR^n, (λ, μ) ∈ IR^p × IR^l.   (4)

2 On local saddle points

In this section, we focus on the necessary and sufficient conditions for the existence of local saddle points. For simplicity, we let Q stand for a second-order cone without emphasizing its dimension, while using the notation Q ⊂ IR^{m+1} to indicate that Q is regarded as a second-order cone in IR^{m+1}. In other words, a result holding for Q is also applicable to K_j for j = 1, …, J in the subsequent analysis. According to [13, Example 6.16], we know that for a ∈ Q,

−b ∈ N_Q(a) ⇐⇒ Π_Q(a − b) = a
⇐⇒ dist(a − b, Q) = ‖b‖   (5)
⇐⇒ a ∈ Q, b ∈ Q, aᵀb = 0,

where the last equivalence comes from the fact that Q is a self-dual cone, i.e., Q° = −Q.
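The three-way equivalence (5) can be sanity-checked numerically. The points a and b below are our own illustrative choices (a on the boundary of Q = K³, b ∈ Q with aᵀb = 0), and proj_soc is the standard closed-form SOC projection (an assumption of this sketch, not stated in the paper):

```python
import numpy as np

def proj_soc(v):
    """Standard closed-form projection onto the SOC."""
    v1, v2 = v[0], v[1:]
    n2 = np.linalg.norm(v2)
    if n2 <= v1:
        return v.copy()
    if n2 <= -v1:
        return np.zeros_like(v)
    a = (v1 + n2) / 2.0
    return np.concatenate(([a], (a / n2) * v2))

def dist_soc(v):
    return np.linalg.norm(v - proj_soc(v))

# a on the boundary of Q = K^3, b in Q, with a^T b = 0, so -b in N_Q(a)
a = np.array([1.0, 1.0, 0.0])
b = np.array([1.0, -1.0, 0.0])
print(np.dot(a, b))                          # 0.0
print(proj_soc(a - b))                       # equals a: [1. 1. 0.]
print(dist_soc(a - b), np.linalg.norm(b))    # both sqrt(2)
```

All three characterizations agree on this instance: the projection of a − b returns a, and dist(a − b, Q) equals ‖b‖.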


Lemma 2.1. Let L_c be given as in (2). Then the augmented Lagrangian function L_c(x, λ, μ) is nondecreasing with respect to c > 0.

Proof. See [13, Exercise 11.56]. □
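Lemma 2.1 is easy to probe numerically. The sketch below evaluates the augmented Lagrangian (2) on a small hypothetical instance (one SOC block in IR², one affine equality; all data are our own choices) at a fixed (x, λ, μ) and increasing penalty values c:

```python
import numpy as np

def proj_soc(v):
    v1, v2 = v[0], v[1:]
    n2 = np.linalg.norm(v2)
    if n2 <= v1:
        return v.copy()
    if n2 <= -v1:
        return np.zeros_like(v)
    a = (v1 + n2) / 2.0
    return np.concatenate(([a], (a / n2) * v2))

def dist2_soc(v):
    return float(np.sum((v - proj_soc(v)) ** 2))

# Hypothetical toy instance (J = 1): min f(x) s.t. g(x) in K^2, h(x) = 0
f = lambda x: x[0] ** 2 + x[1] ** 2
g = lambda x: np.array([x[0], x[1]])
h = lambda x: np.array([x[0] + x[1] - 1.0])

def L_aug(x, lam, mu, c):
    """Augmented Lagrangian (2) specialized to this toy instance."""
    return (f(x) + mu @ h(x) + (c / 2) * (h(x) @ h(x))
            + (c / 2) * (dist2_soc(g(x) - lam / c) - (lam / c) @ (lam / c)))

x   = np.array([0.3, -0.7])
lam = np.array([0.5, 1.2])
mu  = np.array([0.4])
vals = [L_aug(x, lam, mu, c) for c in (0.5, 1.0, 2.0, 4.0, 8.0, 16.0)]
print(all(v1 <= v2 + 1e-12 for v1, v2 in zip(vals, vals[1:])))   # True
```

As the lemma asserts, the evaluated values are nondecreasing in c for this (arbitrarily chosen) triple (x, λ, μ).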

We now discuss the necessary conditions for local saddle points.

Theorem 2.1. Suppose (x̄, λ̄, μ̄) is a local saddle point of L_c. Then,

(a) −λ̄ ∈ N_K(g(x̄));

(b) L_c(x̄, λ̄, μ̄) = f(x̄) for all c > 0;

(c) x̄ is a local optimal solution to NSOCP (1).

Proof. We first show that x̄ is a feasible point of NSOCP (1), for which we need to verify two things: (i) h(x̄) = 0, and (ii) g_j(x̄) ⪰_{K_j} 0 for all j = 1, 2, …, J.

(i) Suppose h(x̄) ≠ 0. Taking μ = γh(x̄) with γ → ∞ and applying the first inequality in (3) yields L_c(x̄, λ̄, μ̄) = ∞, which is a contradiction. Thus, h(x̄) = 0.

(ii) Suppose g_j(x̄) ∉ K_j for some j ∈ {1, ⋯, J}. Then there exists λ̃_j ∈ K_j such that η := ⟨λ̃_j, g_j(x̄)⟩ < 0. Therefore, for β > 0,

dist²(g_j(x̄) − βλ̃_j/c, K_j) − ‖βλ̃_j/c‖²
= ‖g_j(x̄) − βλ̃_j/c − Π_{K_j}(g_j(x̄) − βλ̃_j/c)‖² − ‖βλ̃_j/c‖²
= ‖g_j(x̄) − Π_{K_j}(g_j(x̄) − βλ̃_j/c)‖² − 2⟨βλ̃_j/c, g_j(x̄) − Π_{K_j}(g_j(x̄) − βλ̃_j/c)⟩
≥ dist²(g_j(x̄), K_j) − 2β⟨λ̃_j/c, g_j(x̄)⟩
= dist²(g_j(x̄), K_j) − 2βη/c.   (6)

Here the inequality comes from the facts that

‖g_j(x̄) − Π_{K_j}(g_j(x̄) − βλ̃_j/c)‖ ≥ ‖g_j(x̄) − Π_{K_j}(g_j(x̄))‖ = dist(g_j(x̄), K_j)

and

⟨λ̃_j, Π_{K_j}(g_j(x̄) − βλ̃_j/c)⟩ ≥ 0,

because λ̃_j ∈ K_j and Π_{K_j}(g_j(x̄) − βλ̃_j/c) ∈ K_j. Since η < 0, taking β → ∞, it follows from (3) and (6) that L_c(x̄, λ̄, μ̄) is unbounded above, which is a contradiction.

Plugging λ = 0 into the first inequality of (3) (i.e., L_c(x̄, 0, μ̄) ≤ L_c(x̄, λ̄, μ̄)), we obtain

Σ_{j=1}^J [dist²(g_j(x̄) − λ̄_j/c, K_j) − ‖λ̄_j/c‖²] ≥ 0,   (7)

where we have used the feasibility of x̄ as shown above.

On the other hand, we have

dist(g_j(x̄) − λ̄_j/c, K_j) ≤ ‖g_j(x̄) − λ̄_j/c − g_j(x̄)‖ = ‖λ̄_j/c‖,

where the inequality is due to the fact that g_j(x̄) ∈ K_j as shown above. This together with (7) ensures that

dist(g_j(x̄) − λ̄_j/c, K_j) = ‖λ̄_j/c‖.   (8)

Combining (5) and (8) yields −λ̄_j ∈ N_{K_j}(g_j(x̄)) for all j = 1, ⋯, J, i.e., −λ̄ ∈ N_K(g(x̄)) by [13, Proposition 6.41]. This establishes part (a). Furthermore, it implies

dist(g_j(x̄) − λ̄_j/c, K_j) = ‖λ̄_j/c‖, ∀ c > 0,   (9)

because −λ̄_j/c ∈ N_{K_j}(g_j(x̄)) for all c > 0 (since N_{K_j}(g_j(x̄)) is a cone). Hence L_c(x̄, λ̄, μ̄) = f(x̄) for all c > 0. This establishes part (b).

Now, we turn our attention to part (c). Suppose x ∈ B(x̄, δ) is any feasible point of NSOCP (1). Then, from (3), we know

f(x) ≥ L_c(x, λ̄, μ̄) ≥ L_c(x̄, λ̄, μ̄) = f(x̄),

where the first inequality comes from the fact that x is feasible. This means x̄ is a local optimal solution to NSOCP (1). The proof is complete. □

For NSOCP (1), we say that Robinson's constraint qualification holds at x̄ if ∇h_i(x̄), i = 1, …, l, are linearly independent and there exists d ∈ IR^n such that

∇h(x̄)d = 0 and g(x̄) + ∇g(x̄)d ∈ int(K).

It is known that if x̄ is a local solution to NSOCP (1) and Robinson's constraint qualification holds at x̄, then there exists (λ̄, μ̄) ∈ IR^p × IR^l satisfying the following Karush-Kuhn-Tucker (KKT) conditions

∇xL(x̄, λ̄, μ̄) = 0, h(x̄) = 0, −λ̄ ∈ N_K(g(x̄)),   (10)


or equivalently,

∇xL(x̄, λ̄, μ̄) = 0, h(x̄) = 0, λ̄ ∈ K, g(x̄) ∈ K, λ̄ᵀg(x̄) = 0,

where L(x, λ, μ) is the standard Lagrangian function of NSOCP (1), i.e.,

L(x, λ, μ) := f(x) + ⟨μ, h(x)⟩ − ⟨λ, g(x)⟩.   (11)

For convenience of subsequent analysis, we denote by Λ(x̄) the set of all Lagrange multipliers (λ̄, μ̄) satisfying (10).

It is well known that second-order sufficient conditions are utilized to ensure the existence of local saddle points. In nonlinear programming, this requires the positive definiteness of ∇²xxL over the critical cone. However, due to the non-polyhedrality of the second-order cone, an additional widely known sigma-term (or σ-term), which stands for the curvature of the second-order cone, is required. In particular, it was noted in [4, page 177] that the σ-term vanishes when the cone is polyhedral. Due to the important role played by the σ-term in the analysis of the second-order cone, before developing the sufficient conditions for the existence of local saddle points, we shall study some basic properties of the σ-term which will be used in the subsequent analysis. First, based on the arguments given in [1, Theorem 29], we obtain the following result.

Theorem 2.2. Let x ∈ Q and d ∈ T_Q(x). Then, the support function of the outer second-order tangent set T²_Q(x, d) is

σ(y | T²_Q(x, d)) =
  −(y₁/x₁) dᵀ diag(1, −I_m) d = −(y₁/x₁)(d₁² − ‖d₂‖²), for y ∈ N_Q(x) ∩ {d}^⊥ and x ∈ ∂Q∖{0};
  0, for y ∈ N_Q(x) ∩ {d}^⊥ and x ∉ ∂Q∖{0};
  +∞, for y ∉ N_Q(x) ∩ {d}^⊥.

Proof. We know from [4, Proposition 3.34] that

T²_Q(x, d) + T_{T_Q(x)}(d) ⊂ T²_Q(x, d) ⊂ T_{T_Q(x)}(d).

This implies

σ(y | T²_Q(x, d)) + σ(y | T_{T_Q(x)}(d)) = σ(y | T²_Q(x, d) + T_{T_Q(x)}(d)) ≤ σ(y | T²_Q(x, d)) ≤ σ(y | T_{T_Q(x)}(d)).   (12)

Note that

σ(y | T_{T_Q(x)}(d)) < +∞ ⇐⇒ σ(y | T_{T_Q(x)}(d)) = 0   (13)
⇐⇒ y ∈ N_{T_Q(x)}(d)   (14)
⇐⇒ y ∈ (T_Q(x))° = N_Q(x), yᵀd = 0,   (15)

where the first and third equivalences come from the facts that T_{T_Q(x)}(d) and T_Q(x) are cones, respectively. Thus, we only need to establish the exact formula of σ(y | T²_Q(x, d)) provided that (15) holds. In addition, (12) also indicates that σ(y | T²_Q(x, d)) = +∞ whenever y ∉ N_Q(x) ∩ {d}^⊥, since T²_Q(x, d) is nonempty for x ∈ Q and d ∈ T_Q(x) by [1, Lemma 27].

In fact, under condition (15), it follows from (12) and (13) that

σ(y | T²_Q(x, d)) ≤ σ(y | T_{T_Q(x)}(d)) = 0.   (16)

Furthermore, in light of condition (15), we discuss the following four cases.

(i) If x = 0, then 0 ∈ T²_Q(x, d) = T_Q(d), where the equality is due to [1, Lemma 27]. Thus σ(y | T²_Q(x, d)) = σ(y | T_Q(d)) ≥ 0. This together with (16) implies σ(y | T²_Q(x, d)) = 0.

(ii) If x ∈ int(Q), then it follows from (15) that y = 0. Hence σ(y | T²_Q(x, d)) = 0.

(iii) If x ∈ ∂(Q)∖{0} and d ∈ int(T_Q(x)), then it follows from (14) that y = 0 since d ∈ int(T_Q(x)). Hence σ(y | T²_Q(x, d)) = 0 = −(y₁/x₁)(d₁² − ‖d₂‖²).

(iv) If x ∈ ∂(Q)∖{0} and d ∈ ∂(T_Q(x)), then the desired result can be obtained by following the arguments given in [1, page 222]. We provide the proof for the sake of completeness. Note that computing σ(y | T²_Q(x, d)) amounts to maximizing y₁w₁ + y₂ᵀw₂ over all w satisfying w₂ᵀx₂ − w₁x₁ ≤ d₁² − ‖d₂‖² (see [1, Lemma 27]). From y ∈ N_Q(x), i.e., −y ∈ Q, x ∈ Q, and xᵀy = 0, we know y₁ = −αx₁ and y₂ = αx₂ with α = −y₁/x₁ ≥ 0; see [1, page 208]. Thus,

⟨y, w⟩ = y₁w₁ + y₂ᵀw₂ = α(w₂ᵀx₂ − w₁x₁) ≤ α(d₁² − ‖d₂‖²) = −(y₁/x₁)(d₁² − ‖d₂‖²).

The maximum is attained at (w₁, w₂) = (−d₁²/x₁, −(‖d₂‖²/‖x₂‖²)x₂). This establishes the desired expression. □
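Case (iv) can be checked numerically on a concrete boundary instance (the points below are our own choice: x ∈ ∂Q, d ∈ ∂T_Q(x), and y = (−x₁, x₂) ∈ N_Q(x) with yᵀd = 0); the inequality description of T²_Q(x, d) used here is the one quoted from [1, Lemma 27] in the proof:

```python
import numpy as np

# x on bd(Q), d on bd(T_Q(x)), y = alpha(-x1, x2) in N_Q(x) with alpha = 1
x = np.array([1.0, 1.0, 0.0])
d = np.array([1.0, 1.0, 1.0])
y = np.array([-1.0, 1.0, 0.0])
assert np.dot(x, y) == 0.0 and np.dot(y, d) == 0.0

rhs = d[0] ** 2 - np.dot(d[1:], d[1:])   # d1^2 - ||d2||^2 = -1
sigma = -(y[0] / x[0]) * rhs              # value given by Theorem 2.2: -1

# Maximizer from the proof of case (iv): w* = (-d1^2/x1, -(||d2||^2/||x2||^2) x2)
w_star = np.concatenate(([-d[0] ** 2 / x[0]],
                         -(np.dot(d[1:], d[1:]) / np.dot(x[1:], x[1:])) * x[1:]))
print(sigma, float(np.dot(y, w_star)))    # -1.0 -1.0

# <y, w> never exceeds sigma over {w : w2^T x2 - w1 x1 <= rhs}
rng = np.random.default_rng(0)
for _ in range(1000):
    w = rng.normal(size=3) * 10.0
    if np.dot(w[1:], x[1:]) - w[0] * x[0] <= rhs:
        assert np.dot(y, w) <= sigma + 1e-9
```

The sampled feasible points never beat the claimed maximizer, and the constraint is active at w*, consistent with the formula of Theorem 2.2.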

Remark 2.1. Let A be a convex subset of IR^{m+1}. In the proof of Theorem 2.2, we use the inclusion T²_A(x, d) ⊂ T_{T_A(x)}(d). It is known from [4, page 168] that these two sets coincide if A is polyhedral. But, for the non-polyhedral cone Q, the following example shows that this inclusion may be strict.


Example 2.1. For Q ⊂ IR³, let x̄ = (1, 1, 0) and d̄ = (1, 1, 1). Then

T_Q(x̄) = {d = (d₁, d₂, d₃) ∈ IR³ | (d₂, d₃)ᵀ(x̄₂, x̄₃) − d₁x̄₁ ≤ 0} = {d = (d₁, d₂, d₃) | d₂ − d₁ ≤ 0},

which implies d̄ ∈ ∂T_Q(x̄). Hence,

T²_Q(x̄, d̄) = {w = (w₁, w₂, w₃) | (w₂, w₃)ᵀ(x̄₂, x̄₃) − w₁x̄₁ ≤ d̄₁² − ‖(d̄₂, d̄₃)‖²} = {w = (w₁, w₂, w₃) | w₂ − w₁ ≤ −1}.

On the other hand, since T_{T_Q(x̄)}(d̄) = cl(R_{T_Q(x̄)}(d̄)), where R_{T_Q(x̄)}(d̄) denotes the radial (or feasible) cone of T_Q(x̄) at d̄, for each w ∈ T_{T_Q(x̄)}(d̄) there exist w′ ∈ R_{T_Q(x̄)}(d̄) with w′ → w such that d̄ + tw′ ∈ T_Q(x̄) for some t > 0, i.e.,

((d̄₂, d̄₃) + t(w₂′, w₃′))ᵀ(x̄₂, x̄₃) − (d̄₁ + tw₁′)x̄₁ ≤ 0,

which ensures that (w₂′, w₃′)ᵀ(x̄₂, x̄₃) − w₁′x̄₁ ≤ 0. Taking the limit yields w₂ − w₁ ≤ 0. Thus, we obtain

T_{T_Q(x̄)}(d̄) = {w = (w₁, w₂, w₃) | w₂ − w₁ ≤ 0},

which says T²_Q(x̄, d̄) ⊊ T_{T_Q(x̄)}(d̄). In fact, 0 ∈ T_{T_Q(x̄)}(d̄), but 0 ∉ T²_Q(x̄, d̄).

Corollary 2.1. For x ∈ Q and y ∈ N_Q(x), we define

Θ(x, y) := T_Q(x) ∩ {y}^⊥ = {d | d ∈ T_Q(x) and yᵀd = 0}.

Then σ(y | T²_Q(x, d)) is nonpositive and continuous with respect to d over Θ(x, y).

Proof. We first show that σ(y | T²_Q(x, d)) is nonpositive for d ∈ Θ(x, y). In fact, we know from Theorem 2.2 that σ(y | T²_Q(x, d)) = 0 when x = 0, or x ∈ int(Q), or x ∈ ∂(Q)∖{0} and d ∈ int(T_Q(x)). If x ∈ ∂(Q)∖{0} and d ∈ ∂(T_Q(x)), then we have x₁d₁ = x₂ᵀd₂ by the formula of T_Q(x); see [1, Lemma 25]. Hence x₁|d₁| = |x₂ᵀd₂| ≤ ‖x₂‖‖d₂‖, which implies |d₁| ≤ ‖d₂‖ because x₁ = ‖x₂‖ > 0. Note that −y₁ is nonnegative since −y ∈ Q. Then, applying Theorem 2.2 yields σ(y | T²_Q(x, d)) = −(y₁/x₁)(d₁² − ‖d₂‖²) ≤ 0. Thus, in any case, we have verified the nonpositivity of σ(y | T²_Q(x, d)) over Θ(x, y).

Next, we show the continuity of σ(y | T²_Q(x, d)) with respect to d over Θ(x, y). Indeed, if x = 0 or x ∈ int(Q), then σ(y | T²_Q(x, d)) = 0 for all d ∈ Θ(x, y), which, of course, is continuous. If x ∈ ∂Q∖{0}, then σ(y | T²_Q(x, d)) = −(y₁/x₁)(d₁² − ‖d₂‖²) for d ∈ Θ(x, y), which is continuous with respect to d as well. □

Remark 2.2. For a general closed convex cone Ω, σ(y | T²_Ω(x, d)) can be a discontinuous function of d; see [4, page 178] or [15, page 489]. But, when Ω is the second-order cone Q, our result shows that this function is continuous.


For a convex subset A of IR^{m+1}, it is well known that the function dist²(x, A) is continuously differentiable with ∇dist²(x, A) = 2(x − Π_A(x)). But there are very limited results on second-order differentiability unless some additional structure is imposed on A, for example, second-order regularity; see [2, 3, 15].

Let φ(x) := dist²(x, Q) for Q ⊂ IR^{m+1}. Since Q is second-order regular, according to [15], φ possesses the following nice property: for any x, d ∈ IR^{m+1},

lim_{d′→d, t↓0} [φ(x + td′) − φ(x) − tφ′(x; d′)] / ((1/2)t²) = V(x, d),   (17)

where V(x, d) is the optimal value of the problem

min 2‖d − z‖² − 2σ(x − Π_Q(x) | T²_Q(Π_Q(x), z))
s.t. z ∈ Θ(Π_Q(x), x − Π_Q(x)).   (18)

With these preparations, the sufficient conditions for the existence of local saddle points are given below.

Theorem 2.3. Suppose x̄ is a feasible point of NSOCP (1) satisfying the following:

(i) x̄ is a KKT point and (λ̄, μ̄) ∈ Λ(x̄), i.e.,

∇xL(x̄, λ̄, μ̄) = 0 and −λ̄ ∈ N_K(g(x̄)).

(ii) the following second-order condition holds:

∇²xxL(x̄, λ̄, μ̄)(d, d) + dᵀH(x̄, λ̄)d > 0, ∀ d ∈ C(x̄, λ̄)∖{0},   (19)

where

C(x̄, λ̄) := {d ∈ IR^n | ∇h(x̄)d = 0, ∇g(x̄)d ∈ T_K(g(x̄)), (∇g(x̄)d)ᵀλ̄ = 0},

and H(x̄, λ̄) := Σ_{j=1}^J H_j(x̄, λ̄_j) with

H_j(x̄, λ̄_j) := −((λ̄_j)₁/(g_j(x̄))₁) ∇g_j(x̄)ᵀ diag(1, −I_{m_j}) ∇g_j(x̄), if g_j(x̄) ∈ ∂(K_j)∖{0},
H_j(x̄, λ̄_j) := 0, otherwise.

Then (x̄, λ̄, μ̄) is a local saddle point of L_c for some c > 0.


Proof. The first inequality in (3) follows from the facts that L_c(x̄, λ̄, μ̄) = f(x̄) by (5), since −λ̄ ∈ N_K(g(x̄)), and that L_c(x̄, λ, μ) ≤ f(x̄) for all (λ, μ) ∈ IR^p × IR^l, due to x̄ being feasible.

We prove the second inequality in (3) by contradiction, i.e., suppose we cannot find c > 0 and δ > 0 such that f(x̄) = L_c(x̄, λ̄, μ̄) ≤ L_c(x, λ̄, μ̄) for all x ∈ B(x̄, δ). In other words, there exists a sequence c_n → ∞ as n → ∞, and for each fixed c_n we can find a sequence {x_k^n} (noting that this sequence depends on c_n) such that x_k^n → x̄ as k → ∞ and

f(x̄) > L_{c_n}(x_k^n, λ̄, μ̄).   (20)

To proceed, we denote t_k^n := ‖x_k^n − x̄‖ and d_k^n := (x_k^n − x̄)/‖x_k^n − x̄‖. Assume, without loss of generality, that d_k^n → d̃^n as k → ∞. First, we observe that

φ(g_j(x_k^n) − λ̄_j/c_n)
= φ(g_j(x̄) − λ̄_j/c_n + t_k^n ∇g_j(x̄)d_k^n + (1/2)(t_k^n)² ∇²g_j(x̄)(d_k^n, d_k^n) + o((t_k^n)²))
= φ(g_j(x̄) − λ̄_j/c_n + t_k^n [∇g_j(x̄)d_k^n + (1/2)t_k^n ∇²g_j(x̄)(d_k^n, d_k^n)]) + o((t_k^n)²)
= φ(g_j(x̄) − λ̄_j/c_n) + t_k^n φ′(g_j(x̄) − λ̄_j/c_n; ∇g_j(x̄)d_k^n + (1/2)t_k^n ∇²g_j(x̄)(d_k^n, d_k^n))
  + (1/2)(t_k^n)² V(g_j(x̄) − λ̄_j/c_n, ∇g_j(x̄)d̃^n) + o((t_k^n)²),   (21)

where the second equality follows from the fact of φ being Lipschitz continuous (in fact, φ is continuously differentiable) and the last step is due to (17). From (18), V(g_j(x̄) − λ̄_j/c_n, ∇g_j(x̄)d̃^n) is the optimal value of the following problem:

min 2‖∇g_j(x̄)d̃^n − z‖² − 2σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z))
s.t. z ∈ Θ(g_j(x̄), −λ̄_j),   (22)

where we have used the facts that Θ(g_j(x̄), −λ̄_j/c_n) = Θ(g_j(x̄), −λ̄_j) by definition since c_n ≠ 0, and Π_{K_j}(g_j(x̄) − λ̄_j/c_n) = g_j(x̄) because −λ̄_j ∈ N_{K_j}(g_j(x̄)) by (5).

Note that the optimal value of problem (22) is finite since σ is nonpositive by Corollary 2.1, and that the objective function is strongly convex (because ‖·‖² is strongly convex and −σ is convex [4, Proposition 3.48]). Hence, the optimal solution of problem (22) exists and is unique, say z_j^n, i.e.,

V(g_j(x̄) − λ̄_j/c_n, ∇g_j(x̄)d̃^n) = 2‖∇g_j(x̄)d̃^n − z_j^n‖² − 2σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z_j^n)),   (23)

where z_j^n ∈ Θ(g_j(x̄), −λ̄_j). Then, combining (21) and (23) yields

dist²(g_j(x_k^n) − λ̄_j/c_n, K_j) − ‖λ̄_j/c_n‖²
= −2t_k^n ⟨λ̄_j/c_n, ∇g_j(x̄)d_k^n + (1/2)t_k^n ∇²g_j(x̄)(d_k^n, d_k^n)⟩
  + (t_k^n)² [‖∇g_j(x̄)d̃^n − z_j^n‖² − σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z_j^n))] + o((t_k^n)²),   (24)

where we use the facts that dist(g_j(x̄) − λ̄_j/c_n, K_j) = ‖λ̄_j/c_n‖ and

∇φ(g_j(x̄) − λ̄_j/c_n) = 2(g_j(x̄) − λ̄_j/c_n − Π_{K_j}(g_j(x̄) − λ̄_j/c_n)) = −2λ̄_j/c_n.

Since f(x̄) > L_{c_n}(x_k^n, λ̄, μ̄) by (20), applying the Taylor expansion, we obtain from (24) that

0 > f(x_k^n) − f(x̄) + ⟨μ̄, h(x_k^n)⟩ + (c_n/2)‖h(x_k^n)‖²
    + (c_n/2) Σ_{j=1}^J [dist²(g_j(x_k^n) − λ̄_j/c_n, K_j) − ‖λ̄_j/c_n‖²]
= t_k^n ∇f(x̄)ᵀd_k^n + (1/2)(t_k^n)² (d_k^n)ᵀ∇²f(x̄)d_k^n + o((t_k^n)²)
  + ⟨μ̄, t_k^n ∇h(x̄)d_k^n + (1/2)(t_k^n)² ∇²h(x̄)(d_k^n, d_k^n) + o((t_k^n)²)⟩
  + (c_n/2)‖t_k^n ∇h(x̄)d_k^n + o(t_k^n)‖²
  + (c_n/2) Σ_{j=1}^J [ −2t_k^n ⟨λ̄_j/c_n, ∇g_j(x̄)d_k^n + (1/2)t_k^n ∇²g_j(x̄)(d_k^n, d_k^n)⟩
      + (t_k^n)² (‖∇g_j(x̄)d̃^n − z_j^n‖² − σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z_j^n))) + o((t_k^n)²) ].

Dividing both sides by (t_k^n)²/2 and taking limits as k → ∞ gives

0 ≥ ∇²xxL(x̄, λ̄, μ̄)(d̃^n, d̃^n) + c_n‖∇h(x̄)d̃^n‖²
    + c_n Σ_{j=1}^J [‖∇g_j(x̄)d̃^n − z_j^n‖² − σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z_j^n))],   (25)

where we use the fact that ∇xL(x̄, λ̄, μ̄) = 0, the first equality in the KKT conditions (10).

Since −λ̄_j ∈ N_{K_j}(g_j(x̄)) from (10) and z_j^n ∈ Θ(g_j(x̄), −λ̄_j), applying Corollary 2.1 yields

σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z_j^n)) = (1/c_n) σ(−λ̄_j | T²_{K_j}(g_j(x̄), z_j^n)) ≤ 0,

where the equality is due to the positive homogeneity of the support function; see [11]. Thus, it follows from (25) that

0 ≥ ∇²xxL(x̄, λ̄, μ̄)(d̃^n, d̃^n) + c_n‖∇h(x̄)d̃^n‖² + c_n Σ_{j=1}^J ‖∇g_j(x̄)d̃^n − z_j^n‖².

Since ‖d̃^n‖ = 1 for all n, we may assume, taking a subsequence if necessary, that d̃^n → d̃. Because c_n can be made sufficiently large as n → ∞, we obtain from the above inequality that ∇h(x̄)d̃^n → 0 and ∇g_j(x̄)d̃^n − z_j^n → 0. Therefore, ∇h(x̄)d̃ = lim_{n→∞} ∇h(x̄)d̃^n = 0 and

dist(∇g_j(x̄)d̃, Θ(g_j(x̄), −λ̄_j)) = lim_{n→∞} dist(∇g_j(x̄)d̃^n, Θ(g_j(x̄), −λ̄_j)) ≤ lim_{n→∞} ‖∇g_j(x̄)d̃^n − z_j^n‖ = 0,

which implies ∇g_j(x̄)d̃ ∈ Θ(g_j(x̄), −λ̄_j) for all j = 1, 2, ⋯, J. Thus, we have d̃ ∈ C(x̄, λ̄). In addition, it follows from (25) again that

0 ≥ ∇²xxL(x̄, λ̄, μ̄)(d̃^n, d̃^n) − c_n Σ_{j=1}^J σ(−λ̄_j/c_n | T²_{K_j}(g_j(x̄), z_j^n))
  = ∇²xxL(x̄, λ̄, μ̄)(d̃^n, d̃^n) − Σ_{j=1}^J σ(−λ̄_j | T²_{K_j}(g_j(x̄), z_j^n)).

Note that σ(−λ̄_j | T²_{K_j}(g_j(x̄), ∇g_j(x̄)d̃)) = −d̃ᵀH_j(x̄, λ̄_j)d̃ by Theorem 2.2. Taking limits on both sides as n → ∞, using the continuity of σ by Corollary 2.1 and z_j^n → ∇g_j(x̄)d̃ (since ∇g_j(x̄)d̃^n − z_j^n → 0), we obtain

0 ≥ ∇²xxL(x̄, λ̄, μ̄)(d̃, d̃) − Σ_{j=1}^J σ(−λ̄_j | T²_{K_j}(g_j(x̄), ∇g_j(x̄)d̃))
  = ∇²xxL(x̄, λ̄, μ̄)(d̃, d̃) + Σ_{j=1}^J d̃ᵀH_j(x̄, λ̄_j)d̃
  = ∇²xxL(x̄, λ̄, μ̄)(d̃, d̃) + d̃ᵀH(x̄, λ̄)d̃,

which contradicts (19) since d̃ ∈ C(x̄, λ̄) and d̃ ≠ 0. Thus, the proof is complete. □

For convex nonlinear programming, saddle points have a close relation to KKT points. Their relationship was established in [4] by using the traditional Lagrangian function (11). Here we further discuss their relationship for NSOCP via the augmented Lagrangian function (2).


Definition 2.1. The problem NSOCP (1) is said to be convex if the objective function f is a convex function, h is an affine mapping, and g is a convex mapping with respect to the set −K, i.e., for any x, y ∈ IR^n and t ∈ [0, 1],

g(tx + (1 − t)y) ⪯_{−K} t g(x) + (1 − t) g(y),   (26)

that is, g(tx + (1 − t)y) − [t g(x) + (1 − t) g(y)] ∈ K.

It is easy to see that g is convex with respect to −K if and only if g_j is convex with respect to −K_j for all j = 1, 2, ⋯, J. In general, the square of a convex function need not be convex; for example, (x² − 1)² is not convex although x² − 1 is convex. Nonetheless, the square of the distance function is still convex, i.e., dist²(x, Q) is convex. In fact, dist²(x, Q) = inf{‖x − y‖² + δ_Q(y) | y ∈ IR^{m+1}} = (‖·‖² □ δ_Q)(x), where □ is the infimal convolution and δ_Q is the indicator function [11]. This conclusion can also be obtained by noting that a differentiable function is convex if and only if its gradient is monotone; see [13]. Hence, it only needs to be shown that ∇dist²(x, Q) = 2(x − Π_Q(x)) is monotone, which is ensured by

⟨∇dist²(x, Q) − ∇dist²(y, Q), x − y⟩
= 2⟨x − y − (Π_Q(x) − Π_Q(y)), x − y⟩
≥ 2‖x − y‖² − 2‖Π_Q(x) − Π_Q(y)‖ · ‖x − y‖
= 2‖x − y‖ · [‖x − y‖ − ‖Π_Q(x) − Π_Q(y)‖]
≥ 0,

where in the last step we use the fact that the metric projection is non-expansive, i.e., ‖Π_Q(x) − Π_Q(y)‖ ≤ ‖x − y‖. □
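The convexity of dist²(·, Q) argued above can be spot-checked by sampling: for random u, v and t ∈ [0, 1], the convex-combination inequality should hold. The snippet below does this for Q = K⁴, using the standard closed-form SOC projection (an assumption of this sketch, not stated in the paper):

```python
import numpy as np

def proj_soc(v):
    v1, v2 = v[0], v[1:]
    n2 = np.linalg.norm(v2)
    if n2 <= v1:
        return v.copy()
    if n2 <= -v1:
        return np.zeros_like(v)
    a = (v1 + n2) / 2.0
    return np.concatenate(([a], (a / n2) * v2))

def dist2(v):
    return float(np.sum((v - proj_soc(v)) ** 2))

rng = np.random.default_rng(1)
ok = True
for _ in range(2000):
    u, v = rng.normal(size=4) * 5.0, rng.normal(size=4) * 5.0
    t = rng.uniform()
    # convexity: dist^2(tu + (1-t)v) <= t dist^2(u) + (1-t) dist^2(v)
    ok &= dist2(t * u + (1 - t) * v) <= t * dist2(u) + (1 - t) * dist2(v) + 1e-9
print(ok)   # True
```

Random sampling of course does not prove convexity; it merely fails to refute the claim that the monotone-gradient argument above establishes.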

The following lemma shows that the function −dist(·, Q) behaves like a monotone function.

Lemma 2.2. If x ⪰_Q y, then dist(x, Q) ≤ dist(y, Q).

Proof. Given x, y with x ⪰_Q y, i.e., x − y ∈ Q. Note that Q + Q = Q because Q is a convex cone; see [11]. Hence, we know Q + (x − y) ⊂ Q since x − y ∈ Q. Then, the desired result follows from

dist(x, Q) = inf_{z∈Q} ‖x − z‖ ≤ inf_{z∈Q+(x−y)} ‖x − z‖ = inf_{u∈Q} ‖y − u‖ (substituting u := z − x + y) = dist(y, Q). □

The converse of Lemma 2.2 fails, which is illustrated by the following example.


Example 2.2. Consider K² = {(x₁, x₂) | x₁ ≥ |x₂|}. Then, for x = (1, 2) and y = (−1, −1), we have

dist(x, K²) = √2/2 < √2 = dist(y, K²).

But x ⋡_{K²} y, since x − y = (2, 3) ∉ K².
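The two distances in Example 2.2 can be verified numerically with the standard closed-form SOC projection (our own helper, not part of the text):

```python
import numpy as np

def proj_soc(v):
    v1, v2 = v[0], v[1:]
    n2 = np.linalg.norm(v2)
    if n2 <= v1:
        return v.copy()
    if n2 <= -v1:
        return np.zeros_like(v)
    a = (v1 + n2) / 2.0
    return np.concatenate(([a], (a / n2) * v2))

def dist(v):
    return float(np.linalg.norm(v - proj_soc(v)))

x = np.array([1.0, 2.0])
y = np.array([-1.0, -1.0])
print(dist(x), np.sqrt(2) / 2)         # ~0.7071 each
print(dist(y), np.sqrt(2))             # ~1.4142 each
print(abs((x - y)[1]) <= (x - y)[0])   # False: x - y = (2, 3) is not in K^2
```

Here y = (−1, −1) lies in −K², so its projection is 0 and dist(y, K²) = ‖y‖ = √2, exactly as claimed.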

We next show that if the problem NSOCP (1) is convex, then the augmented Lagrangian is also convex.

Theorem 2.4. If NSOCP (1) is convex, then L_c(x, λ, μ) is convex with respect to x for all (c, λ, μ) ∈ IR_{++} × IR^p × IR^l.

Proof. Since h : IR^n → IR^l is an affine mapping, there exist a matrix M ∈ IR^{l×n} and q ∈ IR^l such that h(x) = Mx + q. Thus, we know that

⟨μ, h(x)⟩ + (c/2)‖h(x)‖²
= ⟨μ, Mx + q⟩ + (c/2)⟨Mx + q, Mx + q⟩
= (c/2)⟨x, MᵀMx⟩ + ⟨Mᵀμ + cMᵀq, x⟩ + ⟨μ + (c/2)q, q⟩

is convex, due to MᵀM being positive semidefinite. In view of the expression of L_c(x, λ, μ) given in (2), it remains to show the convexity of dist²(g_j(x) − λ_j/c, K_j). In fact, since g_j is convex with respect to −K_j, it follows from (26) that

g_j(tx + (1 − t)y) − λ_j/c ⪰_{K_j} t(g_j(x) − λ_j/c) + (1 − t)(g_j(y) − λ_j/c).

This together with Lemma 2.2 implies

dist(g_j(tx + (1 − t)y) − λ_j/c, K_j) ≤ dist(t(g_j(x) − λ_j/c) + (1 − t)(g_j(y) − λ_j/c), K_j),

and hence

dist²(g_j(tx + (1 − t)y) − λ_j/c, K_j)
≤ dist²(t(g_j(x) − λ_j/c) + (1 − t)(g_j(y) − λ_j/c), K_j)
≤ t dist²(g_j(x) − λ_j/c, K_j) + (1 − t) dist²(g_j(y) − λ_j/c, K_j),

where the last step is due to the convexity of dist²(·, K_j), as argued following (26). □

For convex NSOCP (1), the following result states the relationship between global saddle points and KKT points.
