
A SURVEY ON SOC COMPLEMENTARITY FUNCTIONS AND SOLUTION METHODS FOR SOCPS AND SOCCPS

Jein-Shan Chen and Shaohua Pan

This paper is dedicated to the memory of Professor Paul Tseng.

Abstract: This paper surveys SOC complementarity functions and the related solution methods for the second-order cone program (SOCP) and the second-order cone complementarity problem (SOCCP). Specifically, we discuss the properties of four classes of popular merit functions, and study the theoretical results of the associated merit function methods and their numerical behavior in the solution of convex SOCPs. Then, we present suitable nonsingularity conditions for the B-subdifferentials of the natural residual (NR) and Fischer-Burmeister (FB) nonsmooth system reformulations at a (locally) optimal solution, and test the numerical behavior of a globally convergent FB semismooth Newton method. Finally, we survey the properties of smoothing functions of the NR and FB SOC complementarity functions, and provide numerical comparisons of the smoothing Newton methods based on them. The theoretical results and numerical experience of this paper provide a comprehensive view of the development of this field over the past ten years.

Key words: second-order cone, complementarity functions, merit functions, smoothing functions, nonsmooth Newton methods, smoothing Newton methods

Mathematics Subject Classification: 26B05, 26B35, 90C33, 65K05

1 Introduction

The second-order cone (SOC) in IR^n (n ≥ 1), also called the Lorentz cone, is defined as

K^n := { (x_1, x_2) ∈ IR × IR^{n−1} | x_1 ≥ ∥x_2∥ },

where ∥·∥ denotes the Euclidean norm. If n = 1, then K^n is the set of nonnegative reals IR_+. We are interested in optimization and complementarity problems whose constraints involve the direct product of SOCs. In particular, we are interested in the SOC complementarity system, which is to find vectors x, y ∈ IR^n and ζ ∈ IR^l satisfying

x ∈ K,  y ∈ K,  ⟨x, y⟩ = 0,  E(x, y, ζ) = 0,   (1.1)

where ⟨·, ·⟩ denotes the Euclidean inner product, E : IR^n × IR^n × IR^l → IR^n × IR^l is a continuously differentiable mapping, and K is the direct product of SOCs given by

K = K^{n_1} × K^{n_2} × · · · × K^{n_m}   (1.2)

with m, n_1, . . . , n_m ≥ 1 and n_1 + · · · + n_m = n. Throughout this paper, corresponding to the structure of K, we write x = (x_1, . . . , x_m) and y = (y_1, . . . , y_m) with x_i, y_i ∈ IR^{n_i}.

Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan.

The author's work is supported by National Young Natural Science Foundation (No. 10901058), Guangdong Natural Science Foundation (No. 9251802902000001) and the Fundamental Research Funds for the Central Universities (SCUT).

A special case of (1.1) is the generalized second-order cone complementarity problem (SOCCP) which, for two given continuously differentiable mappings F = (F_1, . . . , F_m) and G = (G_1, . . . , G_m) with F_i, G_i : IR^n → IR^{n_i}, is to find a vector ζ ∈ IR^n such that

F(ζ) ∈ K,  G(ζ) ∈ K,  ⟨F(ζ), G(ζ)⟩ = 0.   (1.3)

When G is the identity mapping, (1.3) reduces to finding a vector ζ ∈ IR^n such that

ζ ∈ K,  F(ζ) ∈ K,  ⟨ζ, F(ζ)⟩ = 0,   (1.4)

which is a direct extension of the NCPs well studied in the past 30 years (see [22, 24]).

Another special case of (1.1) is the KKT conditions of the second-order cone program

minimize f(x)
subject to Ax = b,  x ∈ K   (1.5)

where f : IR^n → IR is a twice continuously differentiable function, A is an m × n matrix with full row rank, and b ∈ IR^m. When f is linear, (1.5) becomes the standard linear SOCP, which has wide applications in engineering design, control, finance, management science, and so on; see [1, 35] and the references therein. In addition, system (1.1) arises directly from some engineering and practical problems, for example, the three-dimensional frictional contact problems [34] and the robust Nash equilibria [28].

During the past ten years, there has been active research on SOCPs and SOCCPs, and various methods have been proposed, including the interior-point methods [1, 35, 41, 63, 51], the smoothing Newton methods [18, 25, 27], the semismooth Newton methods [32, 43], and the merit function methods [10, 12]. Among others, the last three kinds of methods are typically developed from an SOC complementarity function. Recall that a mapping ϕ : IR^n × IR^n → IR^n is an SOC complementarity function associated with K^n if

ϕ(x, y) = 0  ⟺  x ∈ K^n,  y ∈ K^n,  ⟨x, y⟩ = 0.   (1.6)

However, comprehensive studies of the properties of SOC complementarity functions and of the numerical behavior of the related solution methods are lacking. In this work, we give a survey of popular SOC complementarity functions and the related merit function methods, semismooth Newton methods and smoothing Newton methods.

The squared norm of an SOC complementarity function gives a merit function associated with K^n, where ψ : IR^n × IR^n → IR_+ is called a merit function associated with K^n if

ψ(x, y) = 0  ⟺  x ∈ K^n,  y ∈ K^n,  ⟨x, y⟩ = 0.   (1.7)

Apart from this, there are other ways to construct merit functions, for example, the LT merit function in Subsection 3.3. Here we are interested in those smooth ψ for which the SOCCP (1.3) can be reformulated as the unconstrained smooth minimization problem

min_{ζ∈IR^n} Ψ(ζ) := Σ_{i=1}^{m} ψ(F_i(ζ), G_i(ζ)),   (1.8)


in the sense that ζ is a solution to (1.3) if and only if it solves (1.8) with zero optimal value. This is the so-called merit function approach. Note that with a smooth merit function ψ, system (1.1) can also be reformulated as the smooth minimization problem

min_{(x,y,ζ)∈IR^{2n+l}} ∥E(x, y, ζ)∥² + ψ(x, y),

but this reformulation is not effective for the solution of (1.1) due to the conflict between the feasibility and the decrease of the complementarity gap involved in the objective. So, in this paper we consider the merit function methods for the SOCCP (1.3). In Section 3, we survey and compare the properties of four classes of popular smooth merit functions associated with K^n. In Section 4, we focus on the theoretical results of the corresponding merit function methods, and on their numerical performance in the solution of linear SOCPs from DIMACS [52] and of randomly generated nonlinear convex SOCPs.

With an SOC complementarity function ϕ associated with K^n, we can rewrite (1.1) as

Φ(z) = Φ(x, y, ζ) := ( E(x, y, ζ), ϕ(x_1, y_1), . . . , ϕ(x_m, y_m) ) = 0.   (1.9)

By [22, Prop. 9.1.1], system (1.9) is effective only for those nondifferentiable but (strongly) semismooth ϕ. Two popular such ϕ are the vector-valued natural residual (NR) function ϕ_NR : IR^n × IR^n → IR^n and the Fischer-Burmeister (FB) function ϕ_FB : IR^n × IR^n → IR^n:

ϕ_NR(x, y) := x − (x − y)_+   (1.10)

and

ϕ_FB(x, y) := (x + y) − (x² + y²)^{1/2},   (1.11)

where (·)_+ denotes the Euclidean projection onto K^n, x² means the Jordan product of x with itself, and x^{1/2} with x ∈ K^n is the unique square root of x such that x^{1/2} ∘ x^{1/2} = x.

The two nondifferentiable functions are strongly semismooth; the proof for ϕ_NR can be found in [18, Prop. 4.3], [9, Prop. 7] or [27, Prop. 4.5], and the proof for ϕ_FB was given by Sun and Sun [58] and by Chen [11] using different techniques. In Section 5, we review the nonsingularity conditions for the B-subdifferentials of Φ at a solution of (1.1) without strict complementarity, and test the behavior of a global FB nonsmooth Newton method.

Let θ : IR^n × IR^n × IR → IR^n be continuously differentiable on IR^n × IR^n × IR_{++} with θ(·, ·, 0) ≡ ϕ(·, ·) for ϕ = ϕ_NR or ϕ_FB. Then (1.1) is also equivalent to the augmented system

Θ(ω) = Θ(ε, x, y, ζ) := ( ε, E(x, y, ζ), θ(x_1, y_1, ε), . . . , θ(x_m, y_m, ε) ) = 0,   (1.12)

which is continuously differentiable on IR_{++} × IR^n × IR^n × IR^l. In the past several years, some smoothing Newton methods have been proposed for (1.1) that solve a sequence of smooth systems or a single augmented system (see, e.g., [25, 18, 27]), but there is no comprehensive study of their numerical performance.


Motivated by the efficiency of the smoothing Newton method of [54], in Section 6 we apply it to the system (1.12) with the CHKS smoothing function and the squared smoothing function of ϕ_NR, and with the FB smoothing function, respectively, and compare their numerical behaviors. As with the NR and FB nonsmooth Newton methods, the locally superlinear (quadratic) convergence of these smoothing methods does not require strict complementarity of solutions. So, these nonsmooth and smoothing methods are superior to interior-point methods in theory, since singular Jacobians occur for the latter when strict complementarity is not satisfied.

Throughout this paper, I denotes an identity matrix of appropriate dimension, IR^n (n ≥ 1) denotes the space of n-dimensional real column vectors, and IR^{n_1} × · · · × IR^{n_m} is identified with IR^{n_1+···+n_m}. For a given set S, we denote by int(S) and bd(S) the interior and boundary of S, respectively. For any x ∈ IR^n, we write x ≽_{K^n} 0 (respectively, x ≻_{K^n} 0) to mean x ∈ K^n (respectively, x ∈ int(K^n)). For any differentiable F : IR^n → IR^l, we denote by F′(x) ∈ IR^{l×n} the Jacobian of F at x, and by ∇F(x) the transposed Jacobian of F at x. A square matrix B ∈ IR^{n×n} is said to be positive definite if ⟨u, Bu⟩ > 0 for all nonzero u ∈ IR^n, and B is said to be positive semidefinite if ⟨u, Bu⟩ ≥ 0 for all u ∈ IR^n.

2 Preliminaries

This section recalls some background materials that are needed in the subsequent sections.

For any x = (x_1, x_2), y = (y_1, y_2) ∈ IR × IR^{n−1}, their Jordan product [23] is defined by

x ∘ y := (⟨x, y⟩, y_1 x_2 + x_1 y_2).

The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of SOCCPs. The identity element under this product is e := (1, 0, . . . , 0)^T ∈ IR^n. For any given x = (x_1, x_2) ∈ IR × IR^{n−1}, the matrix

L_x := [ x_1    x_2^T
         x_2    x_1 I ]

will be used, which can be viewed as a linear mapping from IR^n to IR^n given by L_x y = x ∘ y.

For each x = (x_1, x_2) ∈ IR × IR^{n−1}, let λ_1(x), λ_2(x) and u_x^{(1)}, u_x^{(2)} be the spectral values and the corresponding spectral vectors of x, respectively, given by

λ_i(x) := x_1 + (−1)^i ∥x_2∥  and  u_x^{(i)} := (1/2) (1, (−1)^i x̄_2),  i = 1, 2,

with x̄_2 = x_2/∥x_2∥ if x_2 ≠ 0, and otherwise x̄_2 being any vector in IR^{n−1} satisfying ∥x̄_2∥ = 1. Then x admits a spectral factorization [23] associated with K^n in the form

x = λ_1(x) u_x^{(1)} + λ_2(x) u_x^{(2)}.

When x_2 ≠ 0, the spectral factorization is unique. The following lemma states the relation between the spectral factorization of x and the eigenvalue decomposition of L_x.

Lemma 2.1 ([25]). For any given x ∈ IR^n, let λ_1(x), λ_2(x) be the spectral values of x, and u_x^{(1)}, u_x^{(2)} be the associated spectral vectors. Then L_x has the eigenvalue decomposition

L_x = U(x) diag(λ_2(x), x_1, . . . , x_1, λ_1(x)) U(x)^T,

where

U(x) = ( √2 u_x^{(2)}, u_x^{(3)}, . . . , u_x^{(n)}, √2 u_x^{(1)} ) ∈ IR^{n×n}

is an orthogonal matrix, and u_x^{(i)} for i = 3, . . . , n have the form (0, ū_i) with ū_3, . . . , ū_n being any unit vectors in IR^{n−1} that span the linear subspace orthogonal to x_2.
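The spectral machinery above is easy to exercise numerically. The following minimal NumPy sketch (our illustration, not part of the paper) computes the spectral factorization and checks the eigenvalue structure of L_x asserted in Lemma 2.1 on a sample vector:

```python
import numpy as np

def spectral(x):
    """Spectral values lam and vectors u of x = (x_1, x_2) w.r.t. K^n."""
    x1, x2 = x[0], x[1:]
    nrm = np.linalg.norm(x2)
    # when x_2 = 0 any unit vector works; pick the first coordinate axis
    x2bar = x2 / nrm if nrm > 0 else np.eye(len(x2))[0]
    lam = [x1 - nrm, x1 + nrm]                       # lam_1(x), lam_2(x)
    u = [0.5 * np.concatenate(([1.0], -x2bar)),      # u_x^(1)
         0.5 * np.concatenate(([1.0], x2bar))]       # u_x^(2)
    return lam, u

def Lmat(x):
    """The arrow-shaped matrix L_x, satisfying L_x y = x o y."""
    x1, x2 = x[0], x[1:]
    L = x1 * np.eye(len(x))
    L[0, 1:] = x2
    L[1:, 0] = x2
    return L

x = np.array([2.0, 1.0, -0.5])
lam, u = spectral(x)
assert np.allclose(x, lam[0] * u[0] + lam[1] * u[1])   # spectral factorization
evals = np.linalg.eigvalsh(Lmat(x))                    # Lemma 2.1: spectrum of L_x
assert np.allclose(evals, np.sort([lam[0]] + [x[0]] * (len(x) - 2) + [lam[1]]))
```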

By using Lemma 2.1, it is not hard to calculate the inverse of L_x whenever it exists:

L_x^{−1} = (1/det(x)) [ x_1     −x_2^T
                        −x_2    (det(x)/x_1) I + (1/x_1) x_2 x_2^T ],   (2.1)

where det(x) := x_1² − ∥x_2∥² denotes the determinant of x.
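Under the same toy setup as the sketch above, formula (2.1) can be sanity-checked against a generic matrix inverse (again our own code; it assumes det(x) ≠ 0 and x_1 ≠ 0):

```python
def Linv(x):
    """Inverse of L_x via formula (2.1); assumes det(x) != 0 and x_1 != 0."""
    x1, x2 = x[0], x[1:]
    detx = x1 ** 2 - np.dot(x2, x2)                  # det(x)
    n = len(x)
    M = np.empty((n, n))
    M[0, 0] = x1
    M[0, 1:] = -x2
    M[1:, 0] = -x2
    M[1:, 1:] = (detx / x1) * np.eye(n - 1) + np.outer(x2, x2) / x1
    return M / detx

x = np.array([2.0, 1.0, -0.5])
assert np.allclose(Linv(x), np.linalg.inv(Lmat(x)))
```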

By the spectral factorization above, for any given scalar function g : IR → J ⊆ IR, we may define the associated vector-valued function g^soc : IR^n → S ⊆ IR^n by

g^soc(x) := g(λ_1(x)) u_x^{(1)} + g(λ_2(x)) u_x^{(2)}.   (2.2)

For example, taking g(t) = √t for t ≥ 0, we have g^soc(x) = x^{1/2} for x ∈ K^n. The vector-valued g^soc inherits many desirable properties from g (see [9]).
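Definition (2.2) also gives a direct way to evaluate the two ingredients of (1.10)–(1.11): the projection (·)_+ onto K^n is the standard instance g(t) = max(t, 0), and the SOC square root is g(t) = √t as noted above. A sketch of our own, reusing spectral from the code above:

```python
def gsoc(g, x):
    """The vector-valued function (2.2) induced by a scalar function g."""
    lam, u = spectral(x)
    return g(lam[0]) * u[0] + g(lam[1]) * u[1]

def proj(x):        # Euclidean projection (x)_+ onto K^n: g(t) = max(t, 0)
    return gsoc(lambda t: max(t, 0.0), x)

def sqrt_soc(x):    # SOC square root x^{1/2}; meaningful for x in K^n
    return gsoc(lambda t: np.sqrt(max(t, 0.0)), x)

def jordan(x, y):   # Jordan product x o y
    return np.concatenate(([np.dot(x, y)], y[0] * x[1:] + x[0] * y[1:]))

def phi_NR(x, y):   # natural residual (1.10)
    return x - proj(x - y)

def phi_FB(x, y):   # Fischer-Burmeister (1.11)
    return x + y - sqrt_soc(jordan(x, x) + jordan(y, y))

# both vanish at a complementary pair: x, y in bd(K^3) with <x, y> = 0
x, y = np.array([1.0, 1.0, 0.0]), np.array([1.0, -1.0, 0.0])
assert np.allclose(phi_NR(x, y), 0) and np.allclose(phi_FB(x, y), 0)
```

The following lemma provides the formulas to compute the Jacobian of g^soc and its inverse.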

Lemma 2.2. Let g : IR → J ⊆ IR be a given scalar function, and let g^soc : IR^n → S ⊆ IR^n be defined by (2.2). If g is differentiable on int(J), then g^soc is differentiable on int(S) with ∇g^soc(x) = g′(x_1) I if x_2 = 0, and otherwise

∇g^soc(x) = [ b(x)         c(x) x̄_2^T
              c(x) x̄_2     a(x) I + (b(x) − a(x)) x̄_2 x̄_2^T ]

for any x = (x_1, x_2) ∈ int(S), where x̄_2 = x_2/∥x_2∥ and

a(x) = (g(λ_2(x)) − g(λ_1(x)))/(λ_2(x) − λ_1(x)),
b(x) = (g′(λ_2(x)) + g′(λ_1(x)))/2,
c(x) = (g′(λ_2(x)) − g′(λ_1(x)))/2.

If ∇g^soc(·) is invertible at x ∈ int(S), then letting d(x) = b²(x) − c²(x), we have (∇g^soc(x))^{−1} = (g′(x_1))^{−1} I if x_2 = 0, and otherwise

(∇g^soc(x))^{−1} = [ b(x)/d(x)              −(c(x)/d(x)) x̄_2^T
                     −(c(x)/d(x)) x̄_2       (1/a(x)) I + (b(x)/d(x) − 1/a(x)) x̄_2 x̄_2^T ].

Proof. The first part is direct by Prop. 5.2 of [25] or Prop. 5 of [9]. For the second part, it suffices to calculate the inverse of ∇g^soc(x) when x_2 ≠ 0. By the expression of ∇g^soc, it is easy to verify that b(x) + c(x) and b(x) − c(x) are eigenvalues of ∇g^soc(x) with (1, x̄_2) and (1, −x̄_2) as the corresponding eigenvectors, and that a(x) is an eigenvalue of multiplicity n − 2 with corresponding eigenvectors of the form (0, v̄_i), where v̄_1, . . . , v̄_{n−2} are any unit vectors in IR^{n−1} that span the subspace orthogonal to x_2. From this, an elementary calculation yields the formula for (∇g^soc(x))^{−1}.
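To make the first part of Lemma 2.2 concrete, the sketch below (ours, continuing the helpers above) evaluates ∇g^soc via the lemma and compares it with a central-difference Jacobian for g(t) = t², for which g^soc(x) = x ∘ x:

```python
def grad_gsoc(g, gp, x):
    """Jacobian formula of Lemma 2.2; gp is the derivative of g."""
    x1, x2 = x[0], x[1:]
    n, nrm = len(x), np.linalg.norm(x2)
    if nrm == 0.0:
        return gp(x1) * np.eye(n)
    lam1, lam2 = x1 - nrm, x1 + nrm
    a = (g(lam2) - g(lam1)) / (lam2 - lam1)
    b = (gp(lam2) + gp(lam1)) / 2.0
    c = (gp(lam2) - gp(lam1)) / 2.0
    w = x2 / nrm
    J = np.empty((n, n))
    J[0, 0], J[0, 1:], J[1:, 0] = b, c * w, c * w
    J[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return J

g, gp = (lambda t: t * t), (lambda t: 2.0 * t)
x, h = np.array([1.5, 0.3, -0.8]), 1e-6
J_fd = np.column_stack([(gsoc(g, x + h * e) - gsoc(g, x - h * e)) / (2 * h)
                        for e in np.eye(3)])
assert np.allclose(grad_gsoc(g, gp, x), J_fd, atol=1e-5)   # equals 2 * Lmat(x)
```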


We next recall some joint properties of two mappings, which are direct extensions of the uniform Cartesian P-property [17], the uniform Jordan P-property [60], the weak coerciveness [67], and the R_0-property [6], respectively.

Definition 2.3. The mappings F = (F_1, . . . , F_m) and G = (G_1, . . . , G_m) are said to have

(i) the joint uniform Cartesian P-property if there exists a constant ϱ > 0 such that, for every ζ, ξ ∈ IR^n, there exists an index ν ∈ {1, 2, . . . , m} such that

⟨F_ν(ζ) − F_ν(ξ), G_ν(ζ) − G_ν(ξ)⟩ ≥ ϱ ∥ζ − ξ∥².

(ii) the joint uniform Jordan P-property if there exists a constant ϱ > 0 such that, for every ζ, ξ ∈ IR^n,

λ_2[(F(ζ) − F(ξ)) ∘ (G(ζ) − G(ξ))] ≥ ϱ ∥ζ − ξ∥².

(iii) the joint Cartesian weak coerciveness if there is an element ξ ∈ IR^n such that

lim_{∥ζ∥→∞} max_{1≤i≤m} ⟨G_i(ζ) − G_i(ξ), F_i(ζ)⟩ / ∥ζ − ξ∥ = +∞.

(iv) the joint Cartesian strong coerciveness if the last equation holds for all ξ ∈ IR^n.

(v) the joint Cartesian R_0^w-property if, for any sequence {ζ^k} ⊆ IR^n satisfying

∥ζ^k∥ → +∞,  lim sup_{k→∞} ∥(−F(ζ^k))_+∥ < +∞,  lim sup_{k→∞} ∥(−G(ζ^k))_+∥ < +∞,

there holds

lim sup_{k→∞} max_{1≤i≤m} ⟨F_i(ζ^k), G_i(ζ^k)⟩ = +∞.

It is easy to see that the joint uniform Cartesian P-property implies the joint Cartesian strong coerciveness. From the arguments in [47], it follows that the joint uniform Cartesian P-property implies the joint uniform Jordan P-property, and that the joint Cartesian weak coerciveness with respect to an element ξ with G(ξ) ∈ K implies the joint Cartesian R_0^w-property. It is not yet clear whether the joint uniform Jordan P-property implies the joint Cartesian weak coerciveness. Note that the above properties do not imply the joint monotonicity of F and G, but the joint monotonicity of F and G together with some additional conditions may imply their joint Cartesian R_0^w-property; see the remarks after Prop. 4.2.

The following definition recalls the concept of linear growth of a mapping, which is weaker than global Lipschitz continuity.

Definition 2.4. A mapping F : IR^n → IR^n is said to have linear growth if there exists a constant C > 0 such that ∥F(ζ)∥ ≤ ∥F(0)∥ + C∥ζ∥ for all ζ ∈ IR^n.

We next introduce the Cartesian (strict) column monotonicity of matrices M and N, which is weaker than the (strict) column monotonicity introduced in [22, page 1014] and [37, page 222]. In particular, when N is invertible, this property reduces to the Cartesian P_0 (P)-property of the matrix −N^{−1}M introduced by Chen and Qi [17].

Definition 2.5. The matrices M, N ∈ IR^{n×n} are said to be

(i) Cartesian column monotone if for any u, v ∈ IR^n with u ≠ 0 and v ≠ 0,

Mu + Nv = 0  =⇒  ∃ ν ∈ {1, . . . , m} s.t. u_ν ≠ 0 and ⟨u_ν, v_ν⟩ ≥ 0;

(ii) Cartesian strictly column monotone if for any u, v ∈ IR^n with (u, v) ≠ (0, 0),

Mu + Nv = 0  =⇒  ∃ ν ∈ {1, . . . , m} s.t. ⟨u_ν, v_ν⟩ > 0,

where u_ν, v_ν ∈ IR^{n_ν} are the blocks of u and v corresponding to the Cartesian structure (1.2).

To close this section, we recall the concept of the B-subdifferential of a locally Lipschitz continuous mapping. If H : IR^n → IR^m is locally Lipschitz continuous, then the set

∂_B H(z) := { V ∈ IR^{m×n} | ∃ {z^k} ⊆ D_H : z^k → z, H′(z^k) → V }

is nonempty and is called the B-subdifferential [55] of H at z, where D_H ⊆ IR^n is the set of points at which H is differentiable. The convex hull of ∂_B H(z) is called the generalized Jacobian of Clarke [20], i.e., ∂H(z) = conv ∂_B H(z). We assume that the reader is familiar with the concept of (strong) semismoothness, and refer to [49, 55, 56] for the details.

Unless otherwise stated, in the rest of this paper we assume that F = (F_1, . . . , F_m) and G = (G_1, . . . , G_m) with F_i, G_i : IR^n → IR^{n_i} are continuously differentiable. For a given x ∈ IR^l with l ≥ 2, we write x = (x_1, x_2) ∈ IR × IR^{l−1}, where x_1 is the first component of x and x_2 consists of the remaining l − 1 components of x.

3 Merit Functions Associated with K^n

This section reviews four classes of smooth merit functions associated with K^n and their properties related to the merit function approach. The nondifferentiable NR merit function

ψ_NR(x, y) := ∥x − (x − y)_+∥²   ∀ x, y ∈ IR^n   (3.1)

is needed, since it plays a crucial role in the error bound estimations for the other merit functions.

3.1 Implicit Lagrangian Function

The implicit Lagrangian ψ_MS : IR^n × IR^n → IR_+, parameterized by α > 1, is defined as

ψ_MS(x, y) := max_{u,v∈K^n} { ⟨x, y − v⟩ − ⟨y, u⟩ − (1/(2α)) (∥x − u∥² + ∥y − v∥²) }
            = ⟨x, y⟩ + (1/(2α)) ( ∥(x − αy)_+∥² − ∥x∥² + ∥(y − αx)_+∥² − ∥y∥² ).   (3.2)

This function was introduced by Mangasarian and Solodov [38] for NCPs, and extended to semidefinite complementarity problems (SDCPs) by Tseng [61] and to general symmetric cone complementarity problems (SCCPs) by Kong et al. [33]. By Theorem 3.2(b) of [33], ψ_MS is a merit function induced by the trace of the SOC complementarity function

ϕ_MS(x, y) := x ∘ y + (1/(2α)) [ ((x − αy)_+)² − x² + ((y − αx)_+)² − y² ]   ∀ x, y ∈ IR^n, α > 1.   (3.3)

The following results are extensions of known results, particularly [62, 65, 39], for NCPs.

Lemma 3.1. For any fixed α > 1 and all x, y ∈ IR^n, we have the following results.

(a) ψ_MS(x, y) = 0 ⟺ x ∈ K^n, y ∈ K^n, ⟨x, y⟩ = 0 ⟺ ϕ_MS(x, y) = 0.

(b) ϕ_MS and ψ_MS are continuously differentiable everywhere, with

∇_x ψ_MS(x, y) = y + α^{−1}((x − αy)_+ − x) − (y − αx)_+,
∇_y ψ_MS(x, y) = x + α^{−1}((y − αx)_+ − y) − (x − αy)_+.

(c) The gradient function ∇ψ_MS is globally Lipschitz continuous.

(d) ⟨x, ∇_x ψ_MS(x, y)⟩ + ⟨y, ∇_y ψ_MS(x, y)⟩ = 2ψ_MS(x, y).

(e) ⟨∇_x ψ_MS(x, y), ∇_y ψ_MS(x, y)⟩ ≥ 0.

(f) ψ_MS(x, y) = 0 if and only if ∇_x ψ_MS(x, y) = 0 and ∇_y ψ_MS(x, y) = 0.

(g) (α − 1)∥ϕ_NR(x, y)∥² ≥ ψ_MS(x, y) ≥ (1 − α^{−1})∥ϕ_NR(x, y)∥².

(h) α^{−1}(α − 1)² ψ_MS(x, y) ≤ ∥∇_x ψ_MS(x, y) + ∇_y ψ_MS(x, y)∥² ≤ 2α(α − 1) ψ_MS(x, y).

Proof. The proofs of parts (a)–(b) and (e)–(f) are given in [33]. Parts (c)–(d) are direct from the expressions of ψ_MS and ∇ψ_MS. Part (g) is a direct application of [62, Prop. 2.2] with π̃ = −ψ_MS. Part (h) follows easily from [50, Theorem 4.2] together with (b) and (g).

Analogous to the NCP and SDCP settings, the implicit Lagrangian has the most favorable properties among all projection-type merit functions, so we do not review other members of this class.
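As an illustration (ours, reusing proj and phi_NR from the sketches in Section 2), ψ_MS and the sandwich inequality in Lemma 3.1(g) can be spot-checked at random points:

```python
def psi_MS(x, y, alpha):
    """Implicit Lagrangian (3.2) via its closed projection form."""
    sq = lambda z: np.dot(z, z)
    return (np.dot(x, y) + (sq(proj(x - alpha * y)) - sq(x)
                            + sq(proj(y - alpha * x)) - sq(y)) / (2.0 * alpha))

def psi_NR(x, y):
    """Squared NR residual (3.1)."""
    r = phi_NR(x, y)
    return np.dot(r, r)

rng, alpha = np.random.default_rng(0), 3.0
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    ms, nr = psi_MS(x, y, alpha), psi_NR(x, y)
    assert (alpha - 1.0) * nr + 1e-9 >= ms          # Lemma 3.1(g), upper bound
    assert ms + 1e-9 >= (1.0 - 1.0 / alpha) * nr    # Lemma 3.1(g), lower bound
```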

3.2 Fischer-Burmeister (FB) Merit Function

From [25], ϕ_FB in (1.11) is an SOC complementarity function, and hence its squared norm

ψ_FB(x, y) := (1/2) ∥ϕ_FB(x, y)∥²   (3.4)

is a merit function associated with K^n. The function ψ_FB was shown to be continuously differentiable everywhere with a globally Lipschitz continuous gradient [10, 16], although ϕ_FB itself is not differentiable. Recently, we extended these favorable properties of ψ_FB to the following one-parametric class of merit functions (see [14, 15]):

ψ_τ(x, y) := (1/2) ∥ϕ_τ(x, y)∥²,   (3.5)

where τ ∈ (0, 4) is an arbitrary fixed parameter and ϕ_τ : IR^n × IR^n → IR^n is defined by

ϕ_τ(x, y) := (x + y) − [ (x − y)² + τ(x ∘ y) ]^{1/2}.   (3.6)

Clearly, when τ = 2, ψ_τ becomes the FB merit function ψ_FB. This one-parametric class of functions was originally proposed by Kanzow and Kleinmichel [31] for NCPs, and was proved to share all desirable properties of the FB NCP function.
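A small sketch of our own for the family (3.6), reusing jordan, sqrt_soc and phi_FB from Section 2, confirms that τ = 2 recovers ϕ_FB, since (x − y)² + 2(x ∘ y) = x² + y²:

```python
def phi_tau(x, y, tau):
    """One-parametric family (3.6); tau in (0, 4)."""
    d = x - y
    w = jordan(d, d) + tau * jordan(x, y)   # (x - y)^2 + tau (x o y)
    return x + y - sqrt_soc(w)

x, y = np.array([1.0, 0.2, 0.1]), np.array([0.5, -0.1, 0.3])
assert np.allclose(phi_tau(x, y, 2.0), phi_FB(x, y))   # tau = 2 gives phi_FB
```

The following lemma summarizes those properties of ψ_τ used in the merit function approach.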

Lemma 3.2. For any fixed τ ∈ (0, 4) and all x, y ∈ IR^n, we have the following results.

(a) ψ_τ(x, y) = 0 ⟺ ϕ_τ(x, y) = 0 ⟺ x ∈ K^n, y ∈ K^n, ⟨x, y⟩ = 0.

(b) ψ_τ is continuously differentiable everywhere with ∇_x ψ_τ(0, 0) = ∇_y ψ_τ(0, 0) = 0. Also, if w = (x − y)² + τ(x ∘ y) ∈ int(K^n), then

∇_x ψ_τ(x, y) = ( I − L_{x + ((τ−2)/2) y} L^{−1}_{w^{1/2}} ) ϕ_τ(x, y),
∇_y ψ_τ(x, y) = ( I − L_{y + ((τ−2)/2) x} L^{−1}_{w^{1/2}} ) ϕ_τ(x, y);

and if (x − y)² + τ(x ∘ y) ∈ bd(K^n) and (x, y) ≠ (0, 0), then

∇_x ψ_τ(x, y) = ( 1 − (x_1 + ((τ−2)/2) y_1) / √(x_1² + y_1² + (τ − 2) x_1 y_1) ) ϕ_τ(x, y),
∇_y ψ_τ(x, y) = ( 1 − (y_1 + ((τ−2)/2) x_1) / √(x_1² + y_1² + (τ − 2) x_1 y_1) ) ϕ_τ(x, y).

(c) The gradient function ∇ψ_τ is globally Lipschitz continuous.

(d) ⟨x, ∇_x ψ_τ(x, y)⟩ + ⟨y, ∇_y ψ_τ(x, y)⟩ = 2ψ_τ(x, y).

(e) ⟨∇_x ψ_τ(x, y), ∇_y ψ_τ(x, y)⟩ ≥ 0, with equality holding if and only if ψ_τ(x, y) = 0.

(f) ψ_τ(x, y) = 0 ⟺ ∇_x ψ_τ(x, y) = 0 ⟺ ∇_y ψ_τ(x, y) = 0.

(g) There exist constants c_1 > 0 and c_2 > 0, independent of x and y, such that

c_1 ∥ϕ_NR(x, y)∥ ≤ ∥ϕ_τ(x, y)∥ ≤ c_2 ∥ϕ_NR(x, y)∥.

(h) There exist constants C_1 > 0 and C_2 > 0, depending only on n and τ, such that

C_1 ∥ϕ_τ(x, y)∥ ≤ ∥∇_x ψ_τ(x, y) + ∇_y ψ_τ(x, y)∥ ≤ C_2 ∥ϕ_τ(x, y)∥.

Proof. The proofs of parts (a)–(b) and (d)–(e) can be found in [14]. Part (c) is proved in [15, Theorem 3.1]. Part (f) follows from parts (a), (b) and (e). Parts (g) and (h) are established in [3].

Comparing Lemma 3.2 with Lemma 3.1, we see that ψ_FB and ψ_MS share similar favorable properties, but properties (e)–(f) of ψ_FB are stronger than those of ψ_MS, which allows ψ_FB to get by with a weaker stationary point condition; see Prop. 4.1.

It should be pointed out that the squared norms of the Evtushenko and Purtov [21] SOC complementarity functions ϕ_α : IR^n × IR^n → IR^n and ϕ_β : IR^n × IR^n → IR^n, defined as

ϕ_α(x, y) := −(x ∘ y) + (1/(2α)) [(−(x + y))_+]²,  0 < α ≤ 1,
ϕ_β(x, y) := −(x ∘ y) + (1/(2β)) [((−x)_+)² + ((−y)_+)²],  0 < β < 1,   (3.7)

also provide smooth merit functions ψ_α and ψ_β associated with K^n. But, since they enjoy neither property (e) of ψ_τ nor the weaker property (e) of ψ_MS, it is hard to find conditions guaranteeing that every stationary point of Ψ_α and Ψ_β is a solution of SOCCPs (see the proof of Prop. 4.1). In addition, unlike in the setting of NCPs, the squared norm of the penalized FB SOC complementarity function is not smooth (indeed, not even differentiable). So, this paper does not include these functions.

3.3 Luo and Tseng (LT) Merit Function

The third class of smooth merit functions is an extension of the class of functions introduced by Luo and Tseng [37] for NCPs, and subsequently extended to SDCPs in [61, 66]. In the setting of SOCs, this class of functions is defined as

ψ_LT(x, y) := ψ_0(⟨x, y⟩) + ψ̂(x, y)   ∀ x, y ∈ IR^n,   (3.8)

where ψ_0 : IR → IR_+ is an arbitrary smooth function satisfying

ψ_0(0) = 0,  ψ_0(t) = 0 ∀ t ≤ 0,  and  ψ_0(t) > 0 ∀ t > 0,   (3.9)

and ψ̂ : IR^n × IR^n → IR_+ is an arbitrary smooth function such that

ψ̂(x, y) = 0, ⟨x, y⟩ ≤ 0  ⟺  x ∈ K^n, y ∈ K^n, ⟨x, y⟩ = 0.   (3.10)

The requirement on ψ_0 is slightly different from that in the original LT merit functions [37]. There are many functions satisfying (3.9), such as the polynomial function q^{−1} max(0, t)^q (q ≥ 2), the exponential function exp(max(0, t)²) − 1, and the logarithmic function ln(1 + max(0, t)²).

In addition, there are many choices for ψ̂, such as ψ_MS, ψ_τ and the following two:

ψ̂_1(x, y) := (1/2)( ∥(−x)_+∥² + ∥(−y)_+∥² )  and  ψ̂_2(x, y) := (1/2) ∥(ϕ_FB(x, y))_+∥².   (3.11)

In this paper, we are particularly interested in the three subclasses of ψ_LT with ψ̂ chosen as ψ_FB, ψ̂_1 and ψ̂_2. Among others, ψ_LT with ψ̂ = ψ_FB is an analog of the merit function studied by Yamashita and Fukushima [66] for SDCPs. In view of this, we write ψ_LT with ψ̂ = ψ_FB as ψ_YF, and we write ψ_LT with ψ̂ = ψ̂_1 and ψ̂ = ψ̂_2 as ψ_LT1 and ψ_LT2, respectively.

Lemma 3.3. Let ψ be one of the functions ψ_YF, ψ_LT1 and ψ_LT2. Then, for all x, y ∈ IR^n,

(a) ψ(x, y) = 0 ⟺ x ∈ K^n, y ∈ K^n, ⟨x, y⟩ = 0.

(b) ψ is continuously differentiable everywhere. Furthermore,

∇_x ψ_YF(x, y) = ψ_0′(⟨x, y⟩) y + ∇_x ψ_FB(x, y),
∇_y ψ_YF(x, y) = ψ_0′(⟨x, y⟩) x + ∇_y ψ_FB(x, y),

where ∇_x ψ_FB and ∇_y ψ_FB are given by Lemma 3.2(b) with τ = 2;

∇_x ψ_LT1(x, y) = ψ_0′(⟨x, y⟩) y − (−x)_+,
∇_y ψ_LT1(x, y) = ψ_0′(⟨x, y⟩) x − (−y)_+;

when ψ = ψ_LT2, ∇_x ψ_LT2(0, 0) = ∇_y ψ_LT2(0, 0) = 0, and if x² + y² ∈ int(K^n),

∇_x ψ_LT2(x, y) = ψ_0′(⟨x, y⟩) y + ( I − L_x L^{−1}_{(x²+y²)^{1/2}} ) (ϕ_FB(x, y))_+,
∇_y ψ_LT2(x, y) = ψ_0′(⟨x, y⟩) x + ( I − L_y L^{−1}_{(x²+y²)^{1/2}} ) (ϕ_FB(x, y))_+,

and if x² + y² ∈ bd^+(K^n),

∇_x ψ_LT2(x, y) = ψ_0′(⟨x, y⟩) y + ( 1 − x_1/√(x_1² + y_1²) ) (ϕ_FB(x, y))_+,
∇_y ψ_LT2(x, y) = ψ_0′(⟨x, y⟩) x + ( 1 − y_1/√(x_1² + y_1²) ) (ϕ_FB(x, y))_+.

(c) The gradient ∇ψ is globally Lipschitz continuous on any bounded set of IR^n × IR^n.

(d) ⟨x, ∇_x ψ(x, y)⟩ + ⟨y, ∇_y ψ(x, y)⟩ ≥ 2ψ_0′(⟨x, y⟩)⟨x, y⟩ + 2ψ̂(x, y) ≥ 2ψ̂(x, y).

(e) ⟨∇_x ψ(x, y), ∇_y ψ(x, y)⟩ ≥ 0; moreover, when ψ = ψ_YF or ψ_LT2, ⟨∇_x ψ(x, y), ∇_y ψ(x, y)⟩ = 0 if and only if ψ(x, y) = 0.

(f) When ψ = ψ_YF or ψ_LT2, ψ(x, y) = 0 ⟺ ∇_x ψ(x, y) = 0 ⟺ ∇_y ψ(x, y) = 0; and when ψ = ψ_LT1, ψ(x, y) = 0 ⟺ ∇_x ψ(x, y) = 0 and ∇_y ψ(x, y) = 0.

(g) If ψ_0 is convex and nondecreasing on IR, then ψ_LT1 is a convex function over IR^n × IR^n.

Proof. When ψ = ψ_YF, from the definition of ψ_YF and Lemma 3.2(a) and (c)–(d), we readily get parts (a)–(c); parts (d)–(e) are easily verified by using part (b), Lemma 3.2(e) with τ = 2 and equation (3.9). When ψ = ψ_LT1 or ψ_LT2, parts (a)–(b) and (d)–(e) are established in Prop. 3.1 and Prop. 3.2 of [12], except for the smoothness of ψ_LT2, which is implied by Lemma 1 of the appendix. Part (c) is immediate from the expressions of ∇ψ_LT1 and ∇ψ_LT2, noting that ∇ψ̂ is globally Lipschitz continuous on IR^n × IR^n.

When ψ = ψ_YF or ψ_LT2, part (f) follows from parts (b) and (e), and when ψ = ψ_LT1, part (f) follows from parts (b) and (d). By Prop. 3.1(b) of [12], ψ̂_1 is convex over IR^n × IR^n. Since ψ_0 is convex and nondecreasing on IR, it is easy to verify that ψ_0(⟨x, y⟩) is also convex over IR^n × IR^n. So, we obtain part (g).

Comparing Lemma 3.3 with Lemmas 3.1 and 3.2, we observe that ψ_MS and ψ_τ have two remarkable advantages over the LT class of merit functions: one is the positive homogeneity of ψ_MS and ψ_τ, which helps the corresponding merit functions for SOCCPs overcome bad scaling of problems; the other is that their gradients have the same growth as the merit function itself, which is the key to establishing the convergence rate of some descent algorithms.

It should be pointed out that although the LT class of merit functions does not possess property (g) of ψ_MS and ψ_FB, the corresponding merit functions for SOCCPs may provide a global error bound under a weaker condition (see Prop. 4.3); moreover, Lemma 3.5 below shows that they grow faster than ψ_MS and ψ_FB.

3.4 A Variant of the LT Merit Function

A variant of the LT merit functions is the function ψ̂_LT : IR^n × IR^n → IR_+ defined by

ψ̂_LT(x, y) := ψ_0(∥(x ∘ y)_+∥²) + ψ̂(x, y)   ∀ x, y ∈ IR^n,   (3.12)

where ψ_0 satisfies the first and third properties in (3.9) and ψ̂ satisfies (3.10). This class of merit functions was considered by Chen [12]. In this work we are interested in ψ̂_LT with ψ̂ = ψ_FB, ψ̂_1 and ψ̂_2, and write the resulting functions as ψ̂_YF, ψ̂_LT1 and ψ̂_LT2, respectively.

Lemma 3.4. Let ψ be one of the functions ψ̂_YF, ψ̂_LT1 and ψ̂_LT2. Then, for all x, y ∈ IR^n,

(a) ψ(x, y) = 0 ⟺ x ∈ K^n, y ∈ K^n, ⟨x, y⟩ = 0.

(b) ψ is continuously differentiable everywhere, with

∇_x ψ(x, y) = 2ψ_0′(∥(x ∘ y)_+∥²) L_y (x ∘ y)_+ + ∇_x ψ̂(x, y),
∇_y ψ(x, y) = 2ψ_0′(∥(x ∘ y)_+∥²) L_x (x ∘ y)_+ + ∇_y ψ̂(x, y),

where ∇_x ψ̂(x, y) and ∇_y ψ̂(x, y) are the same as in Lemma 3.3.

(c) The gradient ∇ψ is globally Lipschitz continuous on any bounded set of IR^n × IR^n.

(d) ⟨x, ∇_x ψ(x, y)⟩ + ⟨y, ∇_y ψ(x, y)⟩ = 4ψ_0′(∥(x ∘ y)_+∥²) ∥(x ∘ y)_+∥² + 2ψ̂(x, y).

(e) ψ(x, y) = 0 ⟺ ∇_x ψ(x, y) = 0 and ∇_y ψ(x, y) = 0.

Proof. The proofs are the same as those of Lemma 3.3, and we omit them.

For the class of merit functions ψ̂_LT, it is difficult to establish the inequality

⟨∇_x ψ̂_LT(x, y), ∇_y ψ̂_LT(x, y)⟩ ≥ 0   ∀ x, y ∈ IR^n,

although numerical simulations indicate that these functions possess this property. The main difficulty is to estimate the terms ⟨L_y(x ∘ y)_+, ∇_y ψ̂(x, y)⟩ and ⟨L_x(x ∘ y)_+, ∇_x ψ̂(x, y)⟩.
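The claim can at least be probed numerically. A rough sketch of our own estimates the gradients of the variant ψ̂_YF by central differences at random points and records the inner product; in line with the remark above, one expects the minimum to be nonnegative up to discretization error:

```python
def psi_hat_YF(x, y):        # variant (3.12) with hat-psi = psi_FB
    w = proj(jordan(x, y))   # (x o y)_+
    return psi0(np.dot(w, w)) + psi_FB(x, y)

def num_grads(f, x, y, h=1e-6):
    I = np.eye(len(x))
    gx = np.array([(f(x + h * e, y) - f(x - h * e, y)) / (2 * h) for e in I])
    gy = np.array([(f(x, y + h * e) - f(x, y - h * e)) / (2 * h) for e in I])
    return gx, gy

rng, ips = np.random.default_rng(1), []
for _ in range(2000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    gx, gy = num_grads(psi_hat_YF, x, y)
    ips.append(np.dot(gx, gy))
print(min(ips))   # expected nonnegative, up to finite-difference error
```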

To close this section, we characterize the growth of the above merit functions via a lemma whose proof follows directly from the arguments in [47, Sec. 4] and the remarks after it.

Lemma 3.5. If the sequence {(x^k, y^k)} ⊆ IR^n × IR^n satisfies one of the conditions:

(i) lim inf_{k→∞} λ_1(x^k) = −∞ or lim inf_{k→∞} λ_1(y^k) = −∞;

(ii) {λ_1(x^k)} and {λ_1(y^k)} are bounded below, λ_2(x^k), λ_2(y^k) → +∞, and ⟨x^k/∥x^k∥, y^k/∥y^k∥⟩ ↛ 0;

(iii) {λ_1(x^k)} and {λ_1(y^k)} are bounded below, and lim sup_{k→∞} ⟨x^k, y^k⟩ = +∞;

then lim sup_{k→∞} ψ(x^k, y^k) = ∞ for ψ = ψ_YF, ψ_LT1, ψ_LT2, ψ̂_YF, ψ̂_LT1 and ψ̂_LT2. If {(x^k, y^k)} satisfies (i) or (ii), then lim sup_{k→∞} ψ(x^k, y^k) = ∞ for ψ = ψ_NR, ψ_MS, ψ_τ.

Condition (ii) of Lemma 3.5 implies condition (iii) since, when {λ_1(x^k)} and {λ_1(y^k)} are bounded below and λ_2(x^k), λ_2(y^k) → +∞, there must exist a vector d ∈ IR^n such that x^k − d ∈ K^n and y^k − d ∈ K^n, which along with ⟨x^k/∥x^k∥, y^k/∥y^k∥⟩ ↛ 0 yields ⟨x^k, y^k⟩/(∥x^k∥ ∥y^k∥) → c > 0 (taking a subsequence if necessary), and lim sup_{k→∞} ⟨x^k, y^k⟩ = +∞ then follows. Hence, ψ_LT and its variant ψ̂_LT grow faster than ψ_τ and ψ_MS.

4 Merit Function Approach and Applications

This section is devoted to the merit function methods for the generalized SOCCP (1.3), which yield a solution of (1.3) by solving the unconstrained minimization problem (1.8) with ψ being one of the merit functions introduced in the last section. Throughout this section, we assume that K has the Cartesian structure (1.2), and for any ζ ∈ IR^n we write

∇_x ψ(F(ζ), G(ζ)) = ( ∇_{x_1} ψ(F_1(ζ), G_1(ζ)), . . . , ∇_{x_m} ψ(F_m(ζ), G_m(ζ)) ),
∇_y ψ(F(ζ), G(ζ)) = ( ∇_{y_1} ψ(F_1(ζ), G_1(ζ)), . . . , ∇_{y_m} ψ(F_m(ζ), G_m(ζ)) ).
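Before turning to stationary point and coerciveness conditions, here is a toy end-to-end sketch (ours, not one of the solvers tested in the paper): for the SOCCP (1.4) with a single cone K^3, G the identity, and an affine F(ζ) = Mζ + q with an M of our choosing (positive definite, so the problem is monotone), minimizing Ψ = ψ_FB(ζ, F(ζ)) with a generic quasi-Newton solver drives Ψ to zero:

```python
from scipy.optimize import minimize

M = np.array([[4.0, 1.0, 0.0],     # positive definite => monotone SOCCP
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 2.0]])
q = np.array([-1.0, 0.5, 0.2])

def Psi(zeta):                      # (1.8) with m = 1, psi = psi_FB, G = identity
    return psi_FB(zeta, M @ zeta + q)

zeta = minimize(Psi, np.zeros(3), method="BFGS").x
print(Psi(zeta))                           # ~ 0, so zeta solves (1.4)
print(np.linalg.norm(proj(zeta) - zeta))   # ~ 0, i.e. zeta lies in K^3
```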

When applying effective gradient-type methods to problem (1.8), we can expect to obtain only a stationary point, due to the nonconvexity of the merit functions. Thus, it is necessary to know which conditions guarantee that every stationary point of Ψ is a solution of (1.3). The following proposition provides a suitable condition for the first three classes of functions.

Proposition 4.1. Let Ψ be given by (1.8) with ψ being one of the previous merit functions.


(a) When ψ = ψ_τ, ψ_YF or ψ_LT2, every stationary point of Ψ is a solution of (1.3) if ∇F(ζ) and −∇G(ζ) are Cartesian column monotone for any ζ ∈ IR^n.

(b) When ψ = ψ_MS or ψ_LT1, every stationary point of Ψ is a solution of (1.3) if ∇F(ζ) and −∇G(ζ) are Cartesian strictly column monotone for any ζ ∈ IR^n.

Proof. Since F and G are continuously differentiable, by Lemmas 3.1–3.3(b) the function Ψ is continuously differentiable with

∇Ψ(ζ) = ∇F(ζ) ∇_x ψ(F(ζ), G(ζ)) + ∇G(ζ) ∇_y ψ(F(ζ), G(ζ)).   (4.1)

Let ζ ∈ IR^n be an arbitrary but fixed stationary point of the function Ψ. Then

∇F(ζ) ∇_x ψ(F(ζ), G(ζ)) + ∇G(ζ) ∇_y ψ(F(ζ), G(ζ)) = 0.   (4.2)

Suppose that ζ is not a solution of (1.3). When ψ = ψ_τ, ψ_YF or ψ_LT2, we must have ∇_x ψ(F(ζ), G(ζ)) ≠ 0 and ∇_y ψ(F(ζ), G(ζ)) ≠ 0 by Lemmas 3.2–3.3(f). Since ∇F(ζ) and −∇G(ζ) are Cartesian column monotone, equality (4.2) implies that there exists an index ν ∈ {1, . . . , m} such that ∇_{x_ν} ψ(F_ν(ζ), G_ν(ζ)) ≠ 0 and

⟨∇_{x_ν} ψ(F_ν(ζ), G_ν(ζ)), ∇_{y_ν} ψ(F_ν(ζ), G_ν(ζ))⟩ ≤ 0.

Along with Lemmas 3.2–3.3(e), this gives ψ(F_ν(ζ), G_ν(ζ)) = 0, which by Lemmas 3.2–3.3(f) implies ∇_{x_ν} ψ(F_ν(ζ), G_ν(ζ)) = 0, a contradiction. When ψ = ψ_MS or ψ_LT1, by Lemmas 3.1 and 3.3(f) we have (∇_x ψ(F(ζ), G(ζ)), ∇_y ψ(F(ζ), G(ζ))) ≠ (0, 0). Since ∇F(ζ) and −∇G(ζ) are Cartesian strictly column monotone, (4.2) implies that there exists an index ν ∈ {1, . . . , m} such that ∇_{x_ν} ψ(F_ν(ζ), G_ν(ζ)) ≠ 0 and

⟨∇_{x_ν} ψ(F_ν(ζ), G_ν(ζ)), ∇_{y_ν} ψ(F_ν(ζ), G_ν(ζ))⟩ < 0,

which is impossible by Lemmas 3.1 and 3.3(e). The proof is completed.

When ∇G(ζ) is invertible, since the Cartesian (strict) column monotonicity of ∇F(ζ) and −∇G(ζ) is equivalent to the Cartesian P_0 (P)-property of ∇G(ζ)^{−1}∇F(ζ), Prop. 4.1 extends the results of [10, Prop. 3] and [12, Prop. 3.3] for ψ_YF and ψ_LT2, respectively, and recovers the result of [44, Prop. 5.1]. When G is the identity mapping, in view of Lemmas 3.1–3.3(e), the same arguments as in [33, Theorem 5.3] show that every regular stationary point of Ψ is a solution of (1.3). From [33], we know that regularity is weaker than the Cartesian P-property of ∇F, but it is not clear whether it is weaker than the Cartesian P_0-property of ∇F.

The property that ⟨∇_x ψ(x, y), ∇_y ψ(x, y)⟩ ≥ 0 for all x, y ∈ IR^n plays a crucial role in the proof of Prop. 4.1. For the variant of the LT functions, Ψ̂_LT, since this desirable property has not been established, we cannot provide suitable stationary point conditions for Ψ̂_LT.

When solving the minimization problem (1.8), to guarantee that the generated iterative sequence has a limit point, it is necessary to require that Ψ have bounded level sets, which is implied by the coerciveness of Ψ, i.e., lim sup_{∥ζ∥→∞} Ψ(ζ) = +∞. The following proposition provides the weakest coerciveness conditions for the previous merit functions.

Proposition 4.2. Suppose the mappings F and G satisfy one of the following conditions:

(C.1) F and G have the joint uniform Jordan P-property and linear growth.
