A least-square semismooth Newton method for the second-order cone complementarity problem


Optimization Methods & Software Vol. 26, No. 1, February 2011, 1–22


Shaohua Pan^a and Jein-Shan Chen^b*

^aDepartment of Mathematics, South China University of Technology, Guangzhou 510640, People’s Republic of China; ^bDepartment of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan, Republic of China

(Received 11 March 2008; final version received 30 June 2009)

We present a nonlinear least-square formulation for the second-order cone complementarity problem based on the Fischer–Burmeister (FB) function and the plus function. This formulation has two advantages. First, the operator involved in the over-determined system of equations inherits the favourable properties of the FB function for local convergence, for example, (strong) semi-smoothness; second, the natural merit function of the over-determined system of equations shares all the nice features of the class of merit functions $f_{\rm YF}$ studied in [J.-S. Chen and P. Tseng, An unconstrained smooth minimization reformulation of the second-order cone complementarity problem, Math. Program. 104 (2005), pp. 293–327] for global convergence. We propose a semi-smooth Levenberg–Marquardt method to solve the resulting over-determined system of equations, and establish global and local convergence results. In particular, the superlinear (quadratic) rate of convergence is obtained under strict complementarity of the solution and a local error bound assumption, respectively. Numerical results verify the advantages of the least-square reformulation for difficult problems.

Keywords: second-order cone complementarity problem; Fischer–Burmeister function; semi-smooth; Levenberg–Marquardt method

1. Introduction

We consider the second-order cone complementarity problem (SOCCP), which is to find a vector $\zeta \in \mathbb{R}^n$ such that

\[ F(\zeta) \in K, \quad G(\zeta) \in K, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1} \]

where $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product, $F: \mathbb{R}^n \to \mathbb{R}^n$ and $G: \mathbb{R}^n \to \mathbb{R}^n$ are assumed to be continuously differentiable throughout this paper, and $K$ is the Cartesian product of second-order cones (SOCs), also called Lorentz cones [11], i.e.

\[ K = K^{n_1} \times K^{n_2} \times \cdots \times K^{n_q}, \tag{2} \]

with $n_1 + \cdots + n_q = n$ and $K^{n_i} := \{(x_{i1}, x_{i2}) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_{i1} \ge \|x_{i2}\|\}$. In this paper, corresponding to the Cartesian structure of the cone $K$, we will write $F = (F_1, \ldots, F_q)$ and $G = (G_1, \ldots, G_q)$ with $F_i, G_i: \mathbb{R}^n \to \mathbb{R}^{n_i}$.
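As an illustration of the cone structure in (2), the following minimal sketch tests membership of a vector in a Cartesian product of SOCs. The function names `in_soc` and `in_product_cone` and the tolerance `tol` are assumptions made for this example; they do not come from the paper.

```python
import math

def in_soc(x, tol=1e-12):
    """Return True if x = (x1, x2) satisfies x1 >= ||x2|| (up to tol)."""
    return x[0] >= math.sqrt(sum(t * t for t in x[1:])) - tol

def in_product_cone(z, dims, tol=1e-12):
    """Check z in K^{n_1} x ... x K^{n_q}, where dims = [n_1, ..., n_q]."""
    if sum(dims) != len(z):
        raise ValueError("block sizes must sum to len(z)")
    start = 0
    for n_i in dims:
        # Each block of length n_i must lie in its own second-order cone.
        if not in_soc(z[start:start + n_i], tol):
            return False
        start += n_i
    return True
```

For a block of size $n_i = 1$ the test reduces to $x_{i1} \ge 0$, which is the non-negative orthant case.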

*Corresponding author. Email: [email protected]

ISSN 1055-6788 print/ISSN 1029-4937 online

DOI: 10.1080/10556780903180366 http://www.informaworld.com


An important special case of problem (1) corresponds to $G(\zeta) \equiv \zeta$, i.e.

\[ F(\zeta) \in K, \quad \zeta \in K, \quad \langle F(\zeta), \zeta \rangle = 0. \tag{3} \]

This is a natural extension of the non-linear complementarity problem (NCP) [9,12], where $K = \mathbb{R}^n_+$, the non-negative orthant in $\mathbb{R}^n$, corresponds to $n_1 = \cdots = n_q = 1$ and $q = n$. Another important special case of (1) corresponds to the Karush–Kuhn–Tucker (KKT) conditions of the convex second-order cone program (SOCP):

\[ \begin{array}{ll} \text{minimize} & g(x) \\ \text{subject to} & Ax = b, \quad x \in K, \end{array} \tag{4} \]

where $g: \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable convex function, $A \in \mathbb{R}^{m \times n}$ has full row rank, and $b \in \mathbb{R}^m$. The KKT conditions of (4) can be rewritten as (1) with

\[ F(\zeta) := \hat{x} + (I - A^T(AA^T)^{-1}A)\zeta, \quad G(\zeta) := \nabla g(F(\zeta)) - A^T(AA^T)^{-1}A\zeta, \tag{5} \]

where $\hat{x} \in \mathbb{R}^n$ satisfies $A\hat{x} = b$; see [5] for details. The convex SOCP arises in many applications from engineering design, finance, and robust optimization; see [1,20] and references therein.
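The maps in (5) can be sketched for the simplest case of a single linear constraint, where $A = a^T$ has one row and $AA^T$ is the scalar $a^Ta$. The helper name `make_soccp_maps`, the restriction to $m = 1$, and the particular choice $\hat{x} = a\,b/(a^Ta)$ are assumptions of this illustration; the paper only requires that $\hat{x}$ be feasible for the linear constraint.

```python
def make_soccp_maps(a, b, grad_g):
    """Build F and G of (5) for one constraint a^T x = b (A has a single row)."""
    aa = sum(t * t for t in a)           # A A^T reduces to the scalar a^T a
    if aa == 0.0:
        raise ValueError("a must be non-zero so that A has full row rank")
    x_hat = [b * t / aa for t in a]      # one particular solution of a^T x = b

    def range_part(z):
        # A^T (A A^T)^{-1} A z = a (a^T z) / (a^T a)
        c = sum(s * t for s, t in zip(a, z)) / aa
        return [c * t for t in a]

    def F(z):
        r = range_part(z)
        return [xh + zi - ri for xh, zi, ri in zip(x_hat, z, r)]

    def G(z):
        r = range_part(z)
        return [gi - ri for gi, ri in zip(grad_g(F(z)), r)]

    return F, G

# Hypothetical data: g(x) = 0.5*||x||^2, so grad g(x) = x.
a, b = [1.0, 1.0, 0.0], 2.0
F, G = make_soccp_maps(a, b, lambda x: list(x))
Fz = F([0.5, -2.0, 7.0])
residual = sum(s * t for s, t in zip(a, Fz)) - b   # a^T F(z) - b
```

By construction $F(\zeta)$ satisfies the linear constraint for every $\zeta$, so `residual` vanishes up to rounding.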

Motivated by Kanno et al. [17], where the three-dimensional quasi-static frictional contact was directly reformulated as a linear SOC complementarity problem, we believe that, besides these applications, the SOCCP (1) will find applications in engineering that cannot be reduced to SOCPs.

Various methods have been proposed for solving convex SOCPs and SOCCPs, including the interior point methods [1,2,20,21,28,30], the smoothing Newton methods [6,14,16], the merit function method [5] and the semi-smooth Newton method [19]. Among these, the last three kinds of methods are all based on an SOC complementarity function or a merit function. Specifically, $\phi: \mathbb{R}^{n_i} \times \mathbb{R}^{n_i} \to \mathbb{R}^{n_i}$ is called an SOC complementarity function associated with $K^{n_i}$ if

\[ \phi(x_i, y_i) = 0 \iff x_i \in K^{n_i}, \quad y_i \in K^{n_i}, \quad \langle x_i, y_i \rangle = 0. \tag{6} \]

Clearly, when $n_i = 1$, an SOC complementarity function becomes an NCP function. A popular choice of $\phi$ is the Fischer–Burmeister (FB) function defined by

\[ \phi_{\rm FB}(x_i, y_i) := (x_i^2 + y_i^2)^{1/2} - (x_i + y_i) \quad \forall x_i, y_i \in \mathbb{R}^{n_i}, \tag{7} \]

where $x_i^2 = x_i \circ x_i$ means the Jordan product of $x_i$ with itself (the definition of Jordan product is given in Section 2), and $(x_i)^{1/2}$ means a vector such that $[(x_i)^{1/2}]^2 = x_i$. The function $\phi_{\rm FB}$ is well-defined and satisfies (6); see [14]. Hence, the SOCCP (1) can be reformulated as the following non-smooth system

\[ \Phi_{\rm FB}(\zeta) := \begin{pmatrix} \phi_{\rm FB}(F_1(\zeta), G_1(\zeta)) \\ \vdots \\ \phi_{\rm FB}(F_q(\zeta), G_q(\zeta)) \end{pmatrix} = 0. \tag{8} \]

The system (8) induces a natural merit function $\Psi_{\rm FB}: \mathbb{R}^n \to \mathbb{R}_+$ for (1), given by

\[ \Psi_{\rm FB}(\zeta) := \frac{1}{2}\|\Phi_{\rm FB}(\zeta)\|^2 = \sum_{i=1}^{q} \psi_{\rm FB}(F_i(\zeta), G_i(\zeta)) \tag{9} \]

with

\[ \psi_{\rm FB}(x_i, y_i) := \frac{1}{2}\|\phi_{\rm FB}(x_i, y_i)\|^2. \tag{10} \]

The function $\psi_{\rm FB}$ was studied in [5] and used to develop a merit function method. Recently, we showed in [22] that guaranteeing the boundedness of the level sets of the FB merit function $\Psi_{\rm FB}$ requires the mapping $F$ to have at least the uniform Cartesian $P$-property. This means that $\phi_{\rm FB}$ has some limitations in handling monotone SOCCPs.

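Motivated by Kanzow and Petra [18] for the NCPs, we present a new reformulation for (1) in this paper to overcome the disadvantage of $\phi_{\rm FB}$. Let $\phi_0: \mathbb{R}^{n_i} \times \mathbb{R}^{n_i} \to \mathbb{R}_+$ be given by

\[ \phi_0(x_i, y_i) := \max\{0, \langle x_i, y_i \rangle\}, \tag{11} \]

and define the operator $\Phi: \mathbb{R}^n \to \mathbb{R}^{n+q}$ as

\[ \Phi(\zeta) := \begin{pmatrix} \rho_1\phi_{\rm FB}(F_1(\zeta), G_1(\zeta)) \\ \vdots \\ \rho_1\phi_{\rm FB}(F_q(\zeta), G_q(\zeta)) \\ \rho_2\phi_0(F_1(\zeta), G_1(\zeta)) \\ \vdots \\ \rho_2\phi_0(F_q(\zeta), G_q(\zeta)) \end{pmatrix}, \tag{12} \]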

where $\rho_1, \rho_2$ are arbitrary but fixed constants from $(0,1)$ used as the weights for the first and the second type of terms, respectively. In other words, we define $\Phi$ by appending $q$ components to the mapping $\Phi_{\rm FB}$. These additional components, as will be shown later, play a crucial role in overcoming the disadvantage of $\Phi_{\rm FB}$ mentioned above. Noting that

\[ \zeta^* \ \text{solves} \ \Phi(\zeta) = 0 \iff \zeta^* \ \text{solves (1)}, \tag{13} \]

we have the following nonlinear least-square reformulation for the SOCCP (1):

\[ \min_{\zeta \in \mathbb{R}^n} \Psi(\zeta) := \frac{1}{2}\|\Phi(\zeta)\|^2 = \sum_{i=1}^{q} \psi(F_i(\zeta), G_i(\zeta)), \tag{14} \]

where

\[ \psi(x_i, y_i) := \rho_1^2\,\psi_{\rm FB}(x_i, y_i) + \frac{1}{2}\rho_2^2\,\phi_0(x_i, y_i)^2. \tag{15} \]

The reformulation has the following advantages: on the one hand, $\Psi$ belongs to the class of merit functions $f_{\rm YF}$ introduced in [5], which will be shown to have more desirable properties than $\Psi_{\rm FB}$; on the other hand, $\Phi$ inherits the semi-smoothness of $\Phi_{\rm FB}$, and even strong semi-smoothness under some conditions. Based on this, we propose a semi-smooth Levenberg–Marquardt type method for solving (14), and establish the superlinear (quadratic) rate of convergence under strict complementarity and a local error bound assumption of the solution, respectively.
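To make the construction (12)–(15) concrete, here is a minimal sketch for the scalar case $n_1 = \cdots = n_q = 1$ (the NCP case), where the Jordan square is the ordinary square and $\phi_{\rm FB}(a, b) = \sqrt{a^2 + b^2} - (a + b)$. The function names and the default weights `rho1 = rho2 = 0.5` are assumptions chosen for the example; the inputs `Fz` and `Gz` stand for the already evaluated vectors $F(\zeta)$ and $G(\zeta)$.

```python
import math

def phi_fb(a, b):
    """Scalar Fischer-Burmeister function: sqrt(a^2 + b^2) - (a + b)."""
    return math.hypot(a, b) - (a + b)

def phi_0(a, b):
    """Plus function of the product, as in (11): max{0, a*b}."""
    return max(0.0, a * b)

def Phi(Fz, Gz, rho1=0.5, rho2=0.5):
    """Operator (12): q weighted FB components followed by q weighted plus components."""
    top = [rho1 * phi_fb(a, b) for a, b in zip(Fz, Gz)]
    bottom = [rho2 * phi_0(a, b) for a, b in zip(Fz, Gz)]
    return top + bottom

def Psi(Fz, Gz, rho1=0.5, rho2=0.5):
    """Merit function (14): half the squared Euclidean norm of Phi."""
    return 0.5 * sum(t * t for t in Phi(Fz, Gz, rho1, rho2))

def psi(a, b, rho1=0.5, rho2=0.5):
    """Blockwise term (15): rho1^2 * psi_FB + (1/2) * rho2^2 * phi_0^2."""
    return rho1**2 * 0.5 * phi_fb(a, b)**2 + 0.5 * rho2**2 * phi_0(a, b)**2
```

In this sketch `Psi` agrees with the sum of the blockwise terms `psi`, and it vanishes exactly when every pair is non-negative with zero product, for instance at `Fz = [0.0, 2.0]`, `Gz = [3.0, 0.0]`.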

Throughout this paper, $I$ represents an identity matrix of suitable dimension, $\|\cdot\|$ denotes the Euclidean norm, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors, and $\mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_q}$ is identified with $\mathbb{R}^{n_1 + \cdots + n_q}$. Thus, $(x_1, \ldots, x_q) \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_q}$ is viewed as a column vector in $\mathbb{R}^{n_1 + \cdots + n_q}$. For a differentiable mapping $F: \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x)$ denotes the transpose of the Jacobian $F'(x)$. For a (not necessarily symmetric) square matrix $A \in \mathbb{R}^{n \times n}$, we write $A \succeq 0$ (respectively, $A \succ 0$) to mean $A$ is positive semi-definite (respectively, positive definite). Given a finite number of matrices $Q_1, \ldots, Q_n$, we denote the block diagonal matrix with these matrices as block diagonals by $\mathrm{diag}(Q_1, \ldots, Q_n)$. If $J$ and $B$ are index sets such that $J, B \subseteq \{1, 2, \ldots, q\}$, we denote by $P_{JB}$ the block matrix consisting of the sub-matrices $P_{jk} \in \mathbb{R}^{n_j \times n_k}$ of $P$ with $j \in J$ and $k \in B$. We denote by $\mathrm{int}(K^n)$, $\mathrm{bd}(K^n)$ and $\mathrm{bd}^+(K^n)$ the interior, the boundary of $K^n$, and the boundary of $K^n$ excluding the origin, respectively.


2. Preliminaries

This section recalls some background material that will be used in the sequel. We start with the definition of the Jordan product. For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define their Jordan product [11] associated with $K^n$ as

\[ x \circ y := (\langle x, y \rangle, \ x_1 y_2 + y_1 x_2). \tag{16} \]

The Jordan product '$\circ$', unlike the scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of SOCCPs. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. Given a vector $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, let

\[ L_x := \begin{pmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{pmatrix}, \]

which can be viewed as a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$. It is easy to verify that $L_x y = x \circ y$ and $L_{x+y} = L_x + L_y$ for any $x, y \in \mathbb{R}^n$. Furthermore, $x \in K^n$ if and only if $L_x \succeq 0$, and $x \in \mathrm{int}(K^n)$ if and only if $L_x \succ 0$. When $x \in \mathrm{int}(K^n)$, the inverse of $L_x$ is given by

\[ L_x^{-1} = \frac{1}{\det(x)} \begin{pmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{pmatrix}, \tag{17} \]

where $\det(x)$ denotes the determinant of $x$ defined by $\det(x) := x_1^2 - \|x_2\|^2$.
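The Jordan product (16), the matrix $L_x$ and the inverse formula (17) can be checked numerically with a small sketch. The helper names (`jordan`, `arrow`, `arrow_inverse`, `matvec`, `matmul`) and the list-of-rows matrix representation are assumptions of this illustration.

```python
def jordan(x, y):
    """Jordan product (16): (<x, y>, x1*y2 + y1*x2)."""
    head = sum(s * t for s, t in zip(x, y))
    tail = [x[0] * t + y[0] * s for s, t in zip(x[1:], y[1:])]
    return [head] + tail

def arrow(x):
    """Matrix L_x = [[x1, x2^T], [x2, x1*I]] as a list of rows."""
    n = len(x)
    rows = [list(x)]
    for i in range(1, n):
        row = [0.0] * n
        row[0] = x[i]
        row[i] = x[0]
        rows.append(row)
    return rows

def arrow_inverse(x):
    """Inverse of L_x from (17); assumes x in int(K^n), so det(x) > 0 and x1 > 0."""
    n = len(x)
    det = x[0] ** 2 - sum(t * t for t in x[1:])
    inv = [[0.0] * n for _ in range(n)]
    inv[0][0] = x[0] / det
    for i in range(1, n):
        inv[0][i] = inv[i][0] = -x[i] / det
        for j in range(1, n):
            diag = det / x[0] if i == j else 0.0
            inv[i][j] = (diag + x[i] * x[j] / x[0]) / det
    return inv

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product for list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]
```

With $x = (2, 1, 0)$, $y = (1, 0, 3)$ and $z = (1, 1, 1)$, one finds $(x \circ y) \circ z = (9, 3, 8)$ while $x \circ (y \circ z) = (9, 6, 8)$, which illustrates the lack of associativity.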

From [11,14], we recall that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization associated with $K^n$, of the form

\[ x = \lambda_1(x) \cdot u_x^{(1)} + \lambda_2(x) \cdot u_x^{(2)}, \tag{18} \]

where $\lambda_i(x)$ and $u_x^{(i)}$ for $i = 1, 2$ are the spectral values and the associated spectral vectors of $x$, respectively, defined by

\[ \lambda_i(x) := x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} := \frac{1}{2}\left(1, \ (-1)^i \bar{x}_2\right), \]

with $\bar{x}_2 = x_2/\|x_2\|$ if $x_2 \neq 0$ and otherwise being any vector in $\mathbb{R}^{n-1}$ with $\|\bar{x}_2\| = 1$. If $x_2 \neq 0$, the factorization is unique. The spectral factorizations of $x$, $x^2$ as well as $x^{1/2}$ have various interesting properties [14]; for example, $x \in K^n$ if and only if $0 \le \lambda_1(x) \le \lambda_2(x)$, and $x \in \mathrm{int}(K^n)$ if and only if $0 < \lambda_1(x) \le \lambda_2(x)$.
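The spectral factorization (18) gives a direct way to evaluate the FB function (7). The sketch below relies on the standard fact that, for $w \in K^n$, $w^{1/2} = \sqrt{\lambda_1(w)}\,u_w^{(1)} + \sqrt{\lambda_2(w)}\,u_w^{(2)}$; the function names, the choice of $\bar{x}_2$ when $x_2 = 0$, and the clamping of tiny negative spectral values caused by rounding are assumptions of this illustration.

```python
import math

def jordan_square(x):
    """x o x = (||x||^2, 2*x1*x2)."""
    return [sum(t * t for t in x)] + [2.0 * x[0] * t for t in x[1:]]

def spectral(x):
    """Return (lam1, lam2, u1, u2) with x = lam1*u1 + lam2*u2 as in (18)."""
    r = math.sqrt(sum(t * t for t in x[1:]))
    if r > 0.0:
        xbar = [t / r for t in x[1:]]
    else:
        # x2 = 0: any unit vector is admissible; pick the first coordinate direction.
        xbar = [1.0 if i == 0 else 0.0 for i in range(len(x) - 1)]
    u1 = [0.5] + [-0.5 * t for t in xbar]
    u2 = [0.5] + [0.5 * t for t in xbar]
    return x[0] - r, x[0] + r, u1, u2

def soc_sqrt(w):
    """Square root of w in K^n: sqrt(lam1)*u1 + sqrt(lam2)*u2."""
    lam1, lam2, u1, u2 = spectral(w)
    s1, s2 = math.sqrt(max(lam1, 0.0)), math.sqrt(max(lam2, 0.0))
    return [s1 * a + s2 * b for a, b in zip(u1, u2)]

def phi_fb(x, y):
    """FB function (7): (x^2 + y^2)^{1/2} - (x + y)."""
    w = [a + b for a, b in zip(jordan_square(x), jordan_square(y))]
    return [r - a - b for r, a, b in zip(soc_sqrt(w), x, y)]
```

For the complementary pair $x = (1, 1, 0)$, $y = (1, -1, 0)$, both on the boundary of $K^3$ with $\langle x, y \rangle = 0$, the sketch returns the zero vector, in agreement with (6).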

We next recall from Chen and Qi [4] the definition of the Cartesian $P$-property for a matrix and a nonlinear transformation.

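Definition 2.1 A matrix $M \in \mathbb{R}^{n \times n}$ is said to have

(a) the Cartesian $P$-property if for any non-zero $\zeta = (\zeta_1, \ldots, \zeta_q) \in \mathbb{R}^n$ with $\zeta_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that $\langle \zeta_\nu, (M\zeta)_\nu \rangle > 0$;

(b) the Cartesian $P_0$-property if for any non-zero $\zeta = (\zeta_1, \ldots, \zeta_q) \in \mathbb{R}^n$ with $\zeta_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that

\[ \zeta_\nu \neq 0 \quad \text{and} \quad \langle \zeta_\nu, (M\zeta)_\nu \rangle \ge 0. \]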

Definition 2.2 The mappings $F = (F_1, \ldots, F_q)$ and $G = (G_1, \ldots, G_q)$ are said to have

(a) the jointly uniform Cartesian $P$-property if there exists a constant $\rho > 0$ such that, for any $\zeta, \xi \in \mathbb{R}^n$, there exists $\nu \in \{1, 2, \ldots, q\}$ such that

\[ \langle F_\nu(\zeta) - F_\nu(\xi), \ G_\nu(\zeta) - G_\nu(\xi) \rangle \ge \rho\|\zeta - \xi\|^2; \]

(b) the joint Cartesian $P$-property if for any $\zeta, \xi \in \mathbb{R}^n$ with $G(\zeta) \neq G(\xi)$, there exists $\nu \in \{1, 2, \ldots, q\}$ such that

\[ \langle F_\nu(\zeta) - F_\nu(\xi), \ G_\nu(\zeta) - G_\nu(\xi) \rangle > 0; \]

(c) the joint Cartesian $P_0$-property if for any $\zeta, \xi \in \mathbb{R}^n$ with $G(\zeta) \neq G(\xi)$, there exists $\nu \in \{1, 2, \ldots, q\}$ such that

\[ G_\nu(\zeta) \neq G_\nu(\xi) \quad \text{and} \quad \langle F_\nu(\zeta) - F_\nu(\xi), \ G_\nu(\zeta) - G_\nu(\xi) \rangle \ge 0. \]

When $G(\zeta) \equiv \zeta$, Definition 2.2 gives the Cartesian $P$-properties of $F$. Obviously, the uniform Cartesian $P$-property $\Rightarrow$ the Cartesian $P$-property $\Rightarrow$ the Cartesian $P_0$-property. Also, a continuously differentiable mapping has the Cartesian $P_0$-property if and only if its Jacobian matrix at every point has the Cartesian $P_0$-property, and if the Jacobian matrix of a continuously differentiable mapping has the Cartesian $P$-property at every point, then the mapping has the Cartesian $P$-property. From Definition 2.1, positive semi-definiteness implies the Cartesian $P_0$-property.
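The last implication can be verified in one line; the following short derivation is a sketch added for illustration and uses only Definition 2.1(b). Let $M$ be positive semi-definite and let $\zeta = (\zeta_1, \ldots, \zeta_q) \neq 0$. Blocks with $\zeta_\nu = 0$ contribute nothing to the quadratic form, so

\[ 0 \le \zeta^T M \zeta = \sum_{\nu=1}^{q} \langle \zeta_\nu, (M\zeta)_\nu \rangle = \sum_{\nu:\, \zeta_\nu \neq 0} \langle \zeta_\nu, (M\zeta)_\nu \rangle. \]

The index set of the last sum is non-empty because $\zeta \neq 0$, and a sum of real numbers that is non-negative must contain at least one non-negative term. Hence there is an index $\nu$ with $\zeta_\nu \neq 0$ and $\langle \zeta_\nu, (M\zeta)_\nu \rangle \ge 0$, which is exactly the Cartesian $P_0$-property.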

Given a mapping $H: \mathbb{R}^n \to \mathbb{R}^m$, if $H$ is locally Lipschitz continuous, then

\[ \partial_B H(\zeta) := \{V \in \mathbb{R}^{m \times n} \mid \exists \{\zeta^k\} \subseteq D_H: \ \zeta^k \to \zeta, \ H'(\zeta^k) \to V\} \]

is non-empty and called the B-subdifferential of $H$ at $\zeta$, where $D_H \subseteq \mathbb{R}^n$ denotes the set of points at which $H$ is differentiable. The convex hull $\partial H(\zeta) := \mathrm{conv}\,\partial_B H(\zeta)$ is the generalized Jacobian of $H$ at $\zeta$ in the sense of Clarke [4]. For the concepts of (strongly) semi-smooth functions, we refer to [24,25] for details.

3. Properties of the operator $\Phi$

To study the favourable properties of the operator $\Phi$, we first give two technical lemmas to summarize some properties of $\phi_0$ and $\phi_{\rm FB}$, respectively. The results of the first lemma are direct, and the results of the second lemma can be found in [14, Prop. 4.2], [5, Prop. 2], [27, Cor. 3.3] and [22, Prop. 3.1].

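Lemma 3.1 Let $\phi_0: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ be defined as in Equation (11). Then,

(a) the square of $\phi_0$ is continuously differentiable everywhere;

(b) $\phi_0$ is strongly semi-smooth everywhere on $\mathbb{R}^n \times \mathbb{R}^n$;

(c) the B-subdifferential $\partial_B\phi_0(x, y)$ of $\phi_0$ at any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ is given by

\[ \partial_B\phi_0(x, y) = \left[\partial_B(x^Ty)_+\, y^T \quad \partial_B(x^Ty)_+\, x^T\right], \]

where

\[ \partial_B(x^Ty)_+ = \begin{cases} \{1\} & \text{if } x^Ty > 0, \\ \{1, 0\} & \text{if } x^Ty = 0, \\ \{0\} & \text{if } x^Ty < 0. \end{cases} \]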
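Lemma 3.1 can be illustrated by a small sketch that lists the elements of $\partial_B\phi_0(x, y)$ from part (c) and evaluates the gradient of $\frac{1}{2}\phi_0^2$, which by the chain rule equals $(\phi_0(x, y)\,y, \ \phi_0(x, y)\,x)$ and is continuous across $x^Ty = 0$. The function names are assumptions made for this example.

```python
def dot(x, y):
    """Euclidean inner product."""
    return sum(s * t for s, t in zip(x, y))

def phi_0(x, y):
    """Plus function (11): max{0, <x, y>}."""
    return max(0.0, dot(x, y))

def grad_half_phi0_sq(x, y):
    """Gradient of (1/2)*phi_0(x, y)^2 as the pair (phi_0*y, phi_0*x)."""
    p = phi_0(x, y)
    return [p * t for t in y], [p * t for t in x]

def b_subdiff_phi0(x, y):
    """Elements [s*y^T, s*x^T] of the B-subdifferential in Lemma 3.1(c)."""
    t = dot(x, y)
    if t > 0:
        scalars = [1.0]
    elif t < 0:
        scalars = [0.0]
    else:
        scalars = [1.0, 0.0]   # kink: both one-sided limits of the derivative
    return [[s * v for v in y] + [s * v for v in x] for s in scalars]
```

At a point with $x^Ty = 0$, such as $x = (1, 0)$, $y = (0, 1)$, the B-subdifferential has two elements, whereas the gradient of $\frac{1}{2}\phi_0^2$ is the zero vector there.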

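Lemma 3.2 Let $\phi_{\rm FB}: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be defined as in Equation (7). Then, for any given $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the following results hold.

(a) $\phi_{\rm FB}(x, y) = 0 \iff x \in K^n, \ y \in K^n, \ \langle x, y \rangle = 0$.

(b) $\phi_{\rm FB}$ is strongly semi-smooth at $(x, y)$.

(c) Each element $[U_x - I \quad U_y - I]$ of $\partial_B\phi_{\rm FB}(x, y)$ has the following representation:

(c.1) If $x^2 + y^2 \in \mathrm{int}(K^n)$, then $U_x = L^{-1}_{(x^2+y^2)^{1/2}} L_x$ and $U_y = L^{-1}_{(x^2+y^2)^{1/2}} L_y$.

(c.2) If $x^2 + y^2 \in \mathrm{bd}^+(K^n)$, then $[U_x, U_y]$ belongs to the set

\[ \left\{ \left[ \frac{1}{2\sqrt{2w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3\bar{w}_2\bar{w}_2^T \end{pmatrix} L_x + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \ \ \frac{1}{2\sqrt{2w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3\bar{w}_2\bar{w}_2^T \end{pmatrix} L_y + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \right] \ \middle| \ \begin{array}{l} u = (u_1, u_2), \ v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \\ \text{satisfy } |u_1| \le \|u_2\| \le 1, \ |v_1| \le \|v_2\| \le 1 \end{array} \right\}, \]

where $w = (w_1, w_2) = x^2 + y^2$ and $\bar{w}_2 = w_2/\|w_2\|$.

(c.3) If $(x, y) = (0, 0)$, then $[U_x, U_y]$ belongs to $\{[L_{\hat{u}}, L_{\hat{v}}] \mid \|\hat{u}\|^2 + \|\hat{v}\|^2 = 1\}$ or to

\[ \left\{ \left[ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T + 2 \begin{pmatrix} 0 & 0 \\ 0 & I - \bar{w}_2\bar{w}_2^T \end{pmatrix} L_s, \ \ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T + 2 \begin{pmatrix} 0 & 0 \\ 0 & I - \bar{w}_2\bar{w}_2^T \end{pmatrix} L_\omega \right] \ \middle| \ \begin{array}{l} \bar{w}_2 \in \mathbb{R}^{n-1} \text{ satisfies } \|\bar{w}_2\| = 1 \text{ and} \\ u = (u_1, u_2), \ v = (v_1, v_2), \ \xi = (\xi_1, \xi_2), \ \eta = (\eta_1, \eta_2), \\ s = (s_1, s_2), \ \omega = (\omega_1, \omega_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \text{ satisfy} \\ |\xi_1| \le \|\xi_2\| \le 1, \ |u_1| \le \|u_2\| \le 1, \ |\eta_1| \le \|\eta_2\| \le 1, \\ |v_1| \le \|v_2\| \le 1, \ \|s\|^2 + \|\omega\|^2 \le \frac{1}{2} \end{array} \right\}. \]

(d) The squared norm of $\phi_{\rm FB}$, and hence $\psi_{\rm FB}$, is continuously differentiable at $(x, y)$.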

From Lemma 3.1(b) and Lemma 3.2(b), we obtain the semi-smoothness of $\Phi$.

Proposition 3.3 The operator $\Phi: \mathbb{R}^n \to \mathbb{R}^{n+q}$ defined by (12) is semi-smooth. If, in addition, $F'$ and $G'$ are Lipschitz continuous, then $\Phi$ is strongly semi-smooth.

Proof Let $\Phi_i$ denote the $i$th component function of $\Phi$ for $i = 1, 2, \ldots, 2q$, i.e. $\Phi_i(\zeta) = \rho_1\phi_{\rm FB}(F_i(\zeta), G_i(\zeta))$ and $\Phi_{q+i}(\zeta) = \rho_2\phi_0(F_i(\zeta), G_i(\zeta))$ for $i = 1, 2, \ldots, q$. Then, the mapping $\Phi$ is (strongly) semi-smooth if every $\Phi_i$ is (strongly) semi-smooth. Note that $\Phi_i: \mathbb{R}^n \to \mathbb{R}^{n_i}$ for $i = 1, 2, \ldots, q$ is, up to the constant factor $\rho_1$, the composite of the strongly semi-smooth function $\phi_{\rm FB}$ and the smooth function $\zeta \mapsto (F_i(\zeta), G_i(\zeta))$, whereas $\Phi_{q+i}: \mathbb{R}^n \to \mathbb{R}$ is, up to the constant factor $\rho_2$, the composite of the strongly semi-smooth function $\phi_0$ and the function $\zeta \mapsto (F_i(\zeta), G_i(\zeta))$. Moreover, when $F'$ and $G'$ are Lipschitz continuous, $\zeta \mapsto (F_i(\zeta), G_i(\zeta))$ is strongly semi-smooth. By [13, Theorem 19], we have that every component function of $\Phi$ is semi-smooth, and strongly semi-smooth if, in addition, $F'$ and $G'$ are Lipschitz continuous.

Next, we present an estimate for the B-subdifferential of $\Phi$ at any $\zeta \in \mathbb{R}^n$.


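Proposition 3.4 Let $\Phi: \mathbb{R}^n \to \mathbb{R}^{n+q}$ be given by (12). Then, for any $\zeta \in \mathbb{R}^n$,

\[ \partial_B\Phi(\zeta)^T \subseteq \nabla F(\zeta)\left[\rho_1(A(\zeta) - I) \quad \rho_2 C(\zeta)\right] + \nabla G(\zeta)\left[\rho_1(B(\zeta) - I) \quad \rho_2 D(\zeta)\right], \]

where $C(\zeta) = \mathrm{diag}(C_1(\zeta), \ldots, C_q(\zeta))$ and $D(\zeta) = \mathrm{diag}(D_1(\zeta), \ldots, D_q(\zeta))$ with

\[ C_i(\zeta) \in G_i(\zeta)\,\partial_B(F_i(\zeta)^TG_i(\zeta))_+ \quad \text{and} \quad D_i(\zeta) \in F_i(\zeta)\,\partial_B(F_i(\zeta)^TG_i(\zeta))_+, \]

and $A(\zeta) = \mathrm{diag}(A_1(\zeta), \ldots, A_q(\zeta))$ and $B(\zeta) = \mathrm{diag}(B_1(\zeta), \ldots, B_q(\zeta))$ with the block diagonals $A_i(\zeta), B_i(\zeta) \in \mathbb{R}^{n_i \times n_i}$ having the following representation:

(a) If $F_i(\zeta)^2 + G_i(\zeta)^2 \in \mathrm{int}(K^{n_i})$, then $A_i(\zeta) = L_{F_i(\zeta)} L^{-1}_{[F_i(\zeta)^2 + G_i(\zeta)^2]^{1/2}}$ and $B_i(\zeta) = L_{G_i(\zeta)} L^{-1}_{[F_i(\zeta)^2 + G_i(\zeta)^2]^{1/2}}$.

(b) If $F_i(\zeta)^2 + G_i(\zeta)^2 \in \mathrm{bd}^+(K^{n_i})$, then $[A_i(\zeta), B_i(\zeta)]$ belongs to the set

\[ \left\{ \left[ \frac{1}{2\sqrt{2w_{i1}(\zeta)}} L_{F_i(\zeta)} \begin{pmatrix} 1 & \bar{w}_{i2}(\zeta)^T \\ \bar{w}_{i2}(\zeta) & 4I - 3\bar{w}_{i2}(\zeta)\bar{w}_{i2}(\zeta)^T \end{pmatrix} + \frac{1}{2} u_i\,(1, -\bar{w}_{i2}(\zeta)^T), \ \ \frac{1}{2\sqrt{2w_{i1}(\zeta)}} L_{G_i(\zeta)} \begin{pmatrix} 1 & \bar{w}_{i2}(\zeta)^T \\ \bar{w}_{i2}(\zeta) & 4I - 3\bar{w}_{i2}(\zeta)\bar{w}_{i2}(\zeta)^T \end{pmatrix} + \frac{1}{2} v_i\,(1, -\bar{w}_{i2}(\zeta)^T) \right] \ \middle| \ \begin{array}{l} u_i = (u_{i1}, u_{i2}), \ v_i = (v_{i1}, v_{i2}) \text{ satisfy} \\ |u_{i1}| \le \|u_{i2}\| \le 1, \ |v_{i1}| \le \|v_{i2}\| \le 1 \end{array} \right\}, \]

where $w_i(\zeta) = (w_{i1}(\zeta), w_{i2}(\zeta)) = F_i(\zeta)^2 + G_i(\zeta)^2$ and $\bar{w}_{i2}(\zeta) = w_{i2}(\zeta)/\|w_{i2}(\zeta)\|$.

(c) If $(F_i(\zeta), G_i(\zeta)) = (0, 0)$, then $[A_i(\zeta), B_i(\zeta)] \in \{[L_{\hat{u}_i}, L_{\hat{v}_i}] \mid \|\hat{u}_i\|^2 + \|\hat{v}_i\|^2 = 1\}$ or $[A_i(\zeta), B_i(\zeta)]$ belongs to

\[ \left\{ \left[ \frac{1}{2}\xi_i\,(1, \bar{w}_{i2}^T) - \frac{1}{2}u_i\,(-1, \bar{w}_{i2}^T) + 2L_{s_i} \begin{pmatrix} 0 & 0 \\ 0 & I - \bar{w}_{i2}\bar{w}_{i2}^T \end{pmatrix}, \ \ \frac{1}{2}\eta_i\,(1, \bar{w}_{i2}^T) - \frac{1}{2}v_i\,(-1, \bar{w}_{i2}^T) + 2L_{\omega_i} \begin{pmatrix} 0 & 0 \\ 0 & I - \bar{w}_{i2}\bar{w}_{i2}^T \end{pmatrix} \right] \ \middle| \ \begin{array}{l} \bar{w}_{i2} \in \mathbb{R}^{n_i - 1} \text{ satisfies } \|\bar{w}_{i2}\| = 1 \text{ and} \\ \xi_i = (\xi_{i1}, \xi_{i2}), \ u_i = (u_{i1}, u_{i2}), \ \eta_i = (\eta_{i1}, \eta_{i2}), \\ v_i = (v_{i1}, v_{i2}), \ s_i = (s_{i1}, s_{i2}), \ \omega_i = (\omega_{i1}, \omega_{i2}) \text{ satisfy} \\ |\xi_{i1}| \le \|\xi_{i2}\| \le 1, \ |u_{i1}| \le \|u_{i2}\| \le 1, \ |\eta_{i1}| \le \|\eta_{i2}\| \le 1, \\ |v_{i1}| \le \|v_{i2}\| \le 1, \ \|s_i\|^2 + \|\omega_i\|^2 \le \frac{1}{2} \end{array} \right\}. \]

Proof Let $\Phi_i$ be the $i$th component function of $\Phi$, i.e. $\Phi_i(\zeta) = \rho_1\phi_{\rm FB}(F_i(\zeta), G_i(\zeta))$ and $\Phi_{q+i}(\zeta) = \rho_2\phi_0(F_i(\zeta), G_i(\zeta))$ for $i = 1, \ldots, q$. By the concept of B-subdifferential,

\[ \partial_B\Phi(\zeta)^T \subseteq \partial_B\Phi_1(\zeta)^T \times \partial_B\Phi_2(\zeta)^T \times \cdots \times \partial_B\Phi_{2q}(\zeta)^T, \tag{19} \]

where the latter means the set of all matrices whose $(n_{i-1} + 1)$th to $n_i$th columns belong to $\partial_B\Phi_i(\zeta)^T$ with $n_0 = 0$, and whose $(n + i)$th column belongs to $\partial_B\Phi_{q+i}(\zeta)^T$. Note that

\[ \begin{aligned} \partial_B\Phi_i(\zeta)^T &\subseteq \rho_1\left[\nabla F_i(\zeta) \quad \nabla G_i(\zeta)\right] \partial_B\phi_{\rm FB}(F_i(\zeta), G_i(\zeta))^T, \\ \partial_B\Phi_{q+i}(\zeta)^T &\subseteq \rho_2\left[\nabla F_i(\zeta) \quad \nabla G_i(\zeta)\right] \partial_B\phi_0(F_i(\zeta), G_i(\zeta))^T. \end{aligned} \tag{20} \]

Also, by Lemmas 3.1(c) and 3.2(c), each element in $\partial_B\phi_{\rm FB}(F_i(\zeta), G_i(\zeta))^T$ and $\partial_B\phi_0(F_i(\zeta), G_i(\zeta))^T$ has the form

\[ \begin{pmatrix} A_i(\zeta) - I \\ B_i(\zeta) - I \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} C_i(\zeta) \\ D_i(\zeta) \end{pmatrix}, \]

respectively, with $A_i(\zeta), B_i(\zeta)$ and $C_i(\zeta), D_i(\zeta)$ for $i = 1, \ldots, q$ characterized as in the proposition. Combining with Equations (19) and (20), we obtain the desired result.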

To prove the fast local convergence of non-smooth Levenberg–Marquardt methods, we need to know under what conditions every element $H \in \partial_B\Phi(\zeta^*)$ has full rank $n$, where $\zeta^*$ is a solution of the SOCCP (1). To this end, define the index sets

\[ \begin{aligned} I &:= \{i \in \{1, 2, \ldots, q\} \mid F_i(\zeta^*) = 0, \ G_i(\zeta^*) \in \mathrm{int}(K^{n_i})\}, \\ B &:= \{i \in \{1, 2, \ldots, q\} \mid F_i(\zeta^*) \in \mathrm{bd}^+(K^{n_i}), \ G_i(\zeta^*) \in \mathrm{bd}^+(K^{n_i})\}, \\ J &:= \{i \in \{1, 2, \ldots, q\} \mid F_i(\zeta^*) \in \mathrm{int}(K^{n_i}), \ G_i(\zeta^*) = 0\}. \end{aligned} \tag{21} \]

If $\zeta^*$ satisfies strict complementarity, i.e. $F_i(\zeta^*) + G_i(\zeta^*) \in \mathrm{int}(K^{n_i})$ for all $i$, then $\{1, 2, \ldots, q\}$ can be partitioned as $I \cup B \cup J$. Thus, if $\nabla G(\zeta^*)$ is invertible, then by rearrangement the matrix $P(\zeta^*) = \nabla G(\zeta^*)^{-1}\nabla F(\zeta^*)$ can be rewritten as

\[ P(\zeta^*) = \begin{pmatrix} P(\zeta^*)_{II} & P(\zeta^*)_{IB} & P(\zeta^*)_{IJ} \\ P(\zeta^*)_{BI} & P(\zeta^*)_{BB} & P(\zeta^*)_{BJ} \\ P(\zeta^*)_{JI} & P(\zeta^*)_{JB} & P(\zeta^*)_{JJ} \end{pmatrix}. \]

Now we have the following result on the full rank of every element $H \in \partial_B\Phi(\zeta^*)$.

Theorem 3.5 Let $\zeta^*$ be a strictly complementary solution of (1). Suppose that $\nabla G(\zeta^*)$ is invertible and let $P(\zeta^*) = \nabla G(\zeta^*)^{-1}\nabla F(\zeta^*)$. If $P(\zeta^*)_{II}$ is non-singular and its Schur complement $P(\zeta^*)_{BB} - P(\zeta^*)_{BI}P(\zeta^*)_{II}^{-1}P(\zeta^*)_{IB}$ in the matrix

\[ \begin{pmatrix} P(\zeta^*)_{II} & P(\zeta^*)_{IB} \\ P(\zeta^*)_{BI} & P(\zeta^*)_{BB} \end{pmatrix} \]

has the Cartesian $P$-property, then every element $H$ in the B-subdifferential $\partial_B\Phi(\zeta^*)$ has full column rank $n$.

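Proof Let $H \in \partial_B\Phi(\zeta^*)$. By Proposition 3.4,

\[ H = \begin{pmatrix} \rho_1 H_1 \\ \rho_2 H_2 \end{pmatrix} \]

with $(\rho_1 H_1)^T$ from the set $\partial_B\Phi_1(\zeta^*)^T \times \cdots \times \partial_B\Phi_q(\zeta^*)^T$. From Theorem 4.1 of [22], $H_1^T$ is non-singular under the given assumptions. Since $\rho_1 > 0$, this implies the desired result $\mathrm{rank}(H) = n$.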

The proof of Theorem 3.5 relies only on the non-singularity of the first block of $H$. Nevertheless, when the first block $H_1$ is singular, the second block $H_2$ may still contribute to guaranteeing that $H$ has full column rank $n$.

To close this section, we give a technical lemma that will be used in Section 5.

Lemma 3.6 Let $\zeta^*$ be a solution of (1) such that all elements in $\partial_B\Phi(\zeta^*)$ have full column rank. Then, there exist constants $\varepsilon > 0$ and $c > 0$ such that $\|(H^TH)^{-1}\| \le c$ for all $\|\zeta - \zeta^*\| < \varepsilon$ and all $H \in \partial_B\Phi(\zeta)$. Furthermore, for any given $\bar{\nu} > 0$, the matrices $H^TH + \nu I$ are uniformly positive definite for all $\nu \in [0, \bar{\nu}]$ and $H \in \partial_B\Phi(\zeta)$ with $\|\zeta - \zeta^*\| < \varepsilon$.

Proof The proof is similar to [24, Lemma 2.6]. For completeness, we include it here. Suppose that the claim of the lemma is not true. Then there exists a sequence $\{\zeta^k\}$ converging to $\zeta^*$ and a corresponding sequence of matrices $\{H_k\}$ with $H_k \in \partial_B\Phi(\zeta^k)$ for all $k \in \mathbb{N}$ such that either $H_k^TH_k$ is singular or $\|(H_k^TH_k)^{-1}\| \to +\infty$ on a subsequence. Noting that $H_k^TH_k$ is symmetric positive semi-definite, for the non-singular case we have $\|(H_k^TH_k)^{-1}\| = 1/\lambda_{\min}(H_k^TH_k)$, which implies that the condition $\|(H_k^TH_k)^{-1}\| \to +\infty$ is equivalent to $\lambda_{\min}(H_k^TH_k) \to 0$. Since $\zeta^k \to \zeta^*$ and the mapping $\zeta \mapsto \partial_B\Phi(\zeta)$ is upper semi-continuous, it follows that the sequence $\{H_k\}$ is bounded, and hence it has a convergent subsequence. Let $H_*$ be a limit of such a subsequence. Then $\lambda_{\min}(H_*^TH_*) = 0$ by the continuity of the minimum eigenvalue. This means that $H_*^TH_*$ is singular. However, from the fact that the mapping $\zeta \mapsto \partial_B\Phi(\zeta)$ is closed, we have $H_* \in \partial_B\Phi(\zeta^*)$, which by the given condition implies that $H_*^TH_*$ is non-singular. Thus, we obtain a contradiction, and the first part follows. By the result of the first part and the definition of the matrix norm, together with $\lambda_{\min}(H^TH + \nu I) = \lambda_{\min}(H^TH) + \nu \ge \lambda_{\min}(H^TH)$ for $\nu \ge 0$, there exist constants $\varepsilon > 0$ and $c > 0$ such that

\[ [\lambda_{\min}(H^TH + \nu I)]^{-1} = \|(H^TH + \nu I)^{-1}\| \le c \]

for all $\nu \in [0, \bar{\nu}]$ and $H \in \partial_B\Phi(\zeta)$ with $\|\zeta - \zeta^*\| < \varepsilon$. This implies that

\[ u^T(H^TH + \nu I)u \ge \lambda_{\min}(H^TH + \nu I)\|u\|^2 \ge \frac{1}{c}\|u\|^2 \quad \forall u \in \mathbb{R}^n. \]

Therefore, all the matrices $H^TH + \nu I$ are uniformly positive definite.

4. Properties of the merit function $\Psi$

This section is devoted to the favourable properties of $\Psi$ defined by (14) and (15). To this end, we need the following lemma, which summarizes the properties of $\psi$.

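Lemma 4.1 Let $\psi: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ be defined as in (15). Then, for any $x, y \in \mathbb{R}^n$,

(a) $\psi(x, y) = 0 \iff \psi_{\rm FB}(x, y) = 0 \iff x \in K^n, \ y \in K^n, \ \langle x, y \rangle = 0$;

(b) $\psi(x, y)$ is continuously differentiable;

(c) $\langle x, \nabla_x\psi(x, y) \rangle + \langle y, \nabla_y\psi(x, y) \rangle \ge 2\psi(x, y)$;

(d) $\langle \nabla_x\psi(x, y), \nabla_y\psi(x, y) \rangle \ge 0$, and the equality holds if and only if $\psi(x, y) = 0$;

(e) $\psi(x, y) = 0 \iff \nabla\psi(x, y) = 0 \iff \nabla_x\psi(x, y) = 0 \iff \nabla_y\psi(x, y) = 0$.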

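Proof Part (a) is direct by the definition of $\psi$, and part (b) is from Lemmas 3.1(a) and 3.2(d). We next consider part (c). By the definition of $\psi$,

\[ \begin{aligned} \nabla_x\psi(x, y) &= \rho_1^2\nabla_x\psi_{\rm FB}(x, y) + \rho_2^2\phi_0(x, y)\,y, \\ \nabla_y\psi(x, y) &= \rho_1^2\nabla_y\psi_{\rm FB}(x, y) + \rho_2^2\phi_0(x, y)\,x. \end{aligned} \tag{22} \]

From Lemma 6(a) of [5] and the definition of $\phi_0(x, y)$, it then follows that

\[ \begin{aligned} \langle x, \nabla_x\psi(x, y) \rangle + \langle y, \nabla_y\psi(x, y) \rangle &= \rho_1^2\left[\langle x, \nabla_x\psi_{\rm FB}(x, y) \rangle + \langle y, \nabla_y\psi_{\rm FB}(x, y) \rangle\right] + 2\rho_2^2\phi_0(x, y)\,x^Ty \\ &= \rho_1^2\|\phi_{\rm FB}(x, y)\|^2 + 2\rho_2^2\phi_0(x, y)^2 \\ &= 2\left[\rho_1^2\psi_{\rm FB}(x, y) + \frac{1}{2}\rho_2^2\phi_0(x, y)^2\right] + \rho_2^2\phi_0(x, y)^2 \\ &\ge 2\psi(x, y). \end{aligned} \]

(d) Using the formulas in (22) and [5, Lemma 6(a)], it follows that

\[ \begin{aligned} \langle \nabla_x\psi(x, y), \nabla_y\psi(x, y) \rangle &= \rho_1^4\langle \nabla_x\psi_{\rm FB}(x, y), \nabla_y\psi_{\rm FB}(x, y) \rangle + \rho_2^4\,x^Ty\,\phi_0(x, y)^2 \\ &\quad + \rho_1^2\rho_2^2\phi_0(x, y)\left[\langle x, \nabla_x\psi_{\rm FB}(x, y) \rangle + \langle y, \nabla_y\psi_{\rm FB}(x, y) \rangle\right] \\ &= \rho_1^4\langle \nabla_x\psi_{\rm FB}(x, y), \nabla_y\psi_{\rm FB}(x, y) \rangle + \rho_2^4\phi_0(x, y)^3 + 2\rho_1^2\rho_2^2\phi_0(x, y)\psi_{\rm FB}(x, y). \end{aligned} \tag{23} \]

The first term on the right-hand side of (23) is non-negative by [5, Lemma 6(b)], and the last two terms are also non-negative. Therefore, $\langle \nabla_x\psi(x, y), \nabla_y\psi(x, y) \rangle \ge 0$, and moreover, $\langle \nabla_x\psi(x, y), \nabla_y\psi(x, y) \rangle = 0$ if and only if

\[ \langle \nabla_x\psi_{\rm FB}(x, y), \nabla_y\psi_{\rm FB}(x, y) \rangle = 0 \quad \text{and} \quad \phi_0(x, y) = 0, \]

which, together with Lemma 6(b) of [5], implies the desired result.

(e) If $\psi(x, y) = 0$, then from the definition of $\psi$ it follows that $\phi_{\rm FB}(x, y) = 0$ and $\phi_0(x, y) = 0$. From Proposition 1 of [5], we immediately obtain $\nabla_x\psi_{\rm FB}(x, y) = \nabla_y\psi_{\rm FB}(x, y) = 0$, and consequently $\nabla_x\psi(x, y) = 0$ and $\nabla_y\psi(x, y) = 0$ by (22). If $\nabla\psi(x, y) = 0$, then by part (c) and the non-negativity of $\psi$, we get $\psi(x, y) = 0$. Thus we prove the first equivalence. For the second equivalence, it suffices to prove the sufficiency. Suppose that $\nabla_x\psi(x, y) = 0$. From part (d), we readily get $\psi(x, y) = 0$, which together with part (a) and (22) implies $\nabla\psi(x, y) = 0$. Consequently, $\nabla\psi(x, y) = 0 \iff \nabla_x\psi(x, y) = 0$. Similarly, $\nabla\psi(x, y) = 0 \iff \nabla_y\psi(x, y) = 0$. This implies the last equivalence.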
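The chain of equalities in the proof of part (c) can be sanity-checked numerically. The sketch below evaluates $\psi$ of (15) through the spectral values of $x^2 + y^2$ and estimates $\langle x, \nabla_x\psi \rangle + \langle y, \nabla_y\psi \rangle$ by central differences; it is an illustration only. The finite-difference gradient stands in for the closed-form gradient of [5], and the names `psi`, `euler_sum`, the step `h` and the weights `rho1 = rho2 = 0.5` are assumptions.

```python
import math

def soc_sqrt(w):
    """Square root of w in K^n computed from its two spectral values."""
    r = math.sqrt(sum(t * t for t in w[1:]))
    s1 = math.sqrt(max(w[0] - r, 0.0))   # clamp tiny negatives from rounding
    s2 = math.sqrt(max(w[0] + r, 0.0))
    if r > 0.0:
        return [0.5 * (s1 + s2)] + [0.5 * (s2 - s1) * t / r for t in w[1:]]
    return [0.5 * (s1 + s2)] + [0.0] * (len(w) - 1)

def phi_0(x, y):
    """Plus function (11): max{0, <x, y>}."""
    return max(0.0, sum(s * t for s, t in zip(x, y)))

def psi(x, y, rho1=0.5, rho2=0.5):
    """psi of (15): rho1^2 * (1/2)||phi_FB||^2 + (1/2) * rho2^2 * phi_0^2."""
    # w = x o x + y o y = (||x||^2 + ||y||^2, 2*x1*x2 + 2*y1*y2)
    w = [sum(t * t for t in x) + sum(t * t for t in y)]
    w += [2.0 * (x[0] * s + y[0] * t) for s, t in zip(x[1:], y[1:])]
    fb = [r - s - t for r, s, t in zip(soc_sqrt(w), x, y)]
    return rho1**2 * 0.5 * sum(t * t for t in fb) + 0.5 * rho2**2 * phi_0(x, y)**2

def euler_sum(x, y, h=1e-6):
    """Central-difference estimate of <x, grad_x psi> + <y, grad_y psi>."""
    z, n, total = list(x) + list(y), len(x), 0.0
    for i, zi in enumerate(z):
        zp, zm = z[:], z[:]
        zp[i] += h
        zm[i] -= h
        total += zi * (psi(zp[:n], zp[n:]) - psi(zm[:n], zm[n:])) / (2.0 * h)
    return total
```

For instance, at $x = (1, 0.5, -0.2)$ and $y = (0.3, 0.4, 1)$, where $\langle x, y \rangle = 0.3 > 0$, the estimate agrees with $2\psi(x, y) + \rho_2^2\phi_0(x, y)^2$ up to the finite-difference error, so the inequality in part (c) is strict there.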

Lemma 4.1(b) shows that $\Psi$ is continuously differentiable. By Lemma 4.1(d), we can prove that every stationary point of $\Psi$ is a solution of Equation (1) under mild conditions.

Proposition 4.2 Let $\Psi: \mathbb{R}^n \to \mathbb{R}_+$ be given by (14) and (15). Then every stationary point of $\Psi$ is a solution of (1) under one of the following assumptions:

(a) $\nabla F(\zeta)$ and $-\nabla G(\zeta)$ are column monotone¹ for any $\zeta \in \mathbb{R}^n$.

(b) For any $\zeta \in \mathbb{R}^n$, $\nabla G(\zeta)$ is invertible and $\nabla G(\zeta)^{-1}\nabla F(\zeta)$ has the Cartesian $P_0$-property.

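Proof When assumption (a) is satisfied, using the same arguments as those of [5, Prop. 3] yields the desired result. Now suppose that assumption (b) holds. Let $\bar{\zeta}$ be an arbitrary stationary point of $\Psi$ and write

\[ \begin{aligned} \nabla_x\psi(F(\zeta), G(\zeta)) &= \left(\nabla_{x_1}\psi(F_1(\zeta), G_1(\zeta)), \ldots, \nabla_{x_q}\psi(F_q(\zeta), G_q(\zeta))\right), \\ \nabla_y\psi(F(\zeta), G(\zeta)) &= \left(\nabla_{y_1}\psi(F_1(\zeta), G_1(\zeta)), \ldots, \nabla_{y_q}\psi(F_q(\zeta), G_q(\zeta))\right). \end{aligned} \]

Then,

\[ \nabla\Psi(\bar{\zeta}) = \nabla F(\bar{\zeta})\nabla_x\psi(F(\bar{\zeta}), G(\bar{\zeta})) + \nabla G(\bar{\zeta})\nabla_y\psi(F(\bar{\zeta}), G(\bar{\zeta})) = 0, \]

which, by the invertibility of $\nabla G$, can be rewritten as

\[ \nabla G(\bar{\zeta})^{-1}\nabla F(\bar{\zeta})\nabla_x\psi(F(\bar{\zeta}), G(\bar{\zeta})) + \nabla_y\psi(F(\bar{\zeta}), G(\bar{\zeta})) = 0. \tag{24} \]

Suppose that $\bar{\zeta}$ is not a solution of Equation (1). By Lemma 4.1(e), we necessarily have $\nabla_x\psi(F(\bar{\zeta}), G(\bar{\zeta})) \neq 0$. Using the Cartesian $P_0$-property of $\nabla G(\bar{\zeta})^{-1}\nabla F(\bar{\zeta})$, there must exist an index $\nu \in \{1, 2, \ldots, q\}$ such that $\nabla_{x_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})) \neq 0$ and

\[ \left\langle \nabla_{x_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})), \ \left[\nabla G(\bar{\zeta})^{-1}\nabla F(\bar{\zeta})\nabla_x\psi(F(\bar{\zeta}), G(\bar{\zeta}))\right]_\nu \right\rangle \ge 0. \tag{25} \]

In addition, note that (24) is equivalent to

\[ \left[\nabla G(\bar{\zeta})^{-1}\nabla F(\bar{\zeta})\nabla_x\psi(F(\bar{\zeta}), G(\bar{\zeta}))\right]_i + \nabla_{y_i}\psi(F_i(\bar{\zeta}), G_i(\bar{\zeta})) = 0, \quad i = 1, 2, \ldots, q. \]

Taking the inner product of the $\nu$th equality with $\nabla_{x_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta}))$, we obtain

\[ \left\langle \nabla_{x_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})), \ \left[\nabla G(\bar{\zeta})^{-1}\nabla F(\bar{\zeta})\nabla_x\psi(F(\bar{\zeta}), G(\bar{\zeta}))\right]_\nu \right\rangle + \left\langle \nabla_{x_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})), \ \nabla_{y_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})) \right\rangle = 0. \]

The first term on the left-hand side is non-negative by (25), whereas the second term is positive by Lemma 4.1(d) and (e), since $\nabla_{x_\nu}\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})) \neq 0$ implies $\psi(F_\nu(\bar{\zeta}), G_\nu(\bar{\zeta})) \neq 0$. This leads to a contradiction, and consequently $\bar{\zeta}$ must be a solution of (1).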

When $\nabla G(\zeta)$ is invertible for any $\zeta \in \mathbb{R}^n$, the assumption in (a) is equivalent to the positive semi-definiteness of $\nabla G(\zeta)^{-1}\nabla F(\zeta)$ at any $\zeta \in \mathbb{R}^n$, which implies the Cartesian $P_0$-property of $\nabla G(\zeta)^{-1}\nabla F(\zeta)$. Thus, for the SOCCP (3), assumption (a) is stronger than assumption (b), which is now equivalent to the Cartesian $P_0$-property of $F$.

Next we provide a condition to guarantee the boundedness of the level sets of $\Psi$,

\[ L(\gamma) := \{\zeta \in \mathbb{R}^n \mid \Psi(\zeta) \le \gamma\}, \]

for all $\gamma \ge 0$. This property is important since it guarantees that the descent sequence of $\Psi$ must have a limit point, and that the solution set of (1) is bounded if it is non-empty. It turns out that the following condition for $F$ and $G$ is sufficient.

Condition A. For any sequence $\{\zeta^k\}$ satisfying $\|\zeta^k\| \to +\infty$, whenever

\[ \limsup_{k \to \infty} \|[-F(\zeta^k)]_+\| < +\infty \quad \text{and} \quad \limsup_{k \to \infty} \|[-G(\zeta^k)]_+\| < +\infty, \tag{26} \]

there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that $\limsup_{k \to \infty} \langle F_\nu(\zeta^k), G_\nu(\zeta^k) \rangle = +\infty$.

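Proposition 4.3 If the mappings $F$ and $G$ satisfy Condition A, then the level sets $L(\gamma)$ are bounded for all $\gamma \ge 0$.

Proof Assume that there is an unbounded sequence $\{\zeta^k\} \subseteq L(\gamma)$ for some $\gamma \ge 0$; passing to a subsequence if necessary, we may assume $\|\zeta^k\| \to +\infty$. Since $\Psi(\zeta^k) \le \gamma$ for all $k$, the sequence $\{\Psi_{\rm FB}(\zeta^k)\}$ is bounded. By Lemma 8 of [5],

\[ \limsup_{k \to \infty} \|[-F_i(\zeta^k)]_+\| < +\infty \quad \text{and} \quad \limsup_{k \to \infty} \|[-G_i(\zeta^k)]_+\| < +\infty \]

hold for all $i \in \{1, 2, \ldots, q\}$. Thus (26) holds, and hence by Condition A there exists an index $\nu$ such that $\limsup_{k \to \infty} \langle F_\nu(\zeta^k), G_\nu(\zeta^k) \rangle = +\infty$. From the definition of $\Psi$, it follows that the sequence $\{\Psi(\zeta^k)\}$ is unbounded, which clearly contradicts the fact that $\{\zeta^k\} \subseteq L(\gamma)$. The proof is completed.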

Condition A is a rather weak requirement for $\Psi$ to have bounded level sets since, as will be shown below, it is implied by each of the following: the joint monotonicity of $F$ and $G$ together with the strict feasibility of (1), as used in [5] for $f_{\rm YF}$; the jointly uniform Cartesian $P$-property with a feasible point; and the joint $\tilde{R}_{01}$-property in the following sense.

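Definition 4.4 The mappings $F, G: \mathbb{R}^n \to \mathbb{R}^n$ are said to have the joint $\tilde{R}_{01}$-property if for any sequence $\{\zeta^k\}$ with

\[ \|\zeta^k\| \to +\infty, \quad \frac{\|[-G(\zeta^k)]_+\|}{\|\zeta^k\|} \to 0, \quad \frac{\|[-F(\zeta^k)]_+\|}{\|\zeta^k\|} \to 0, \tag{27} \]

there holds that

\[ \liminf_{k \to +\infty} \frac{\langle F(\zeta^k), G(\zeta^k) \rangle}{\|\zeta^k\|} > 0. \tag{28} \]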