A semismooth Newton method for SOCCPs based on a one-parametric class of SOC complementarity functions


DOI 10.1007/s10589-008-9166-9


Shaohua Pan · Jein-Shan Chen

Received: 29 March 2007 / Revised: 29 October 2007 / Published online: 7 February 2008

Abstract In this paper, we present a detailed investigation of the properties of a one-parametric class of SOC complementarity functions, including their global Lipschitz continuity, strong semismoothness, and a characterization of their B-subdifferential. Moreover, for the merit functions they induce for the second-order cone complementarity problem (SOCCP), we provide a condition under which each stationary point is a solution of the SOCCP, and we establish the boundedness of their level sets by exploiting Cartesian P-properties. We also propose a semismooth Newton-type method based on the reformulation of the SOCCP as a nonsmooth system of equations involving this class of SOC complementarity functions. Global and superlinear convergence results are obtained; in particular, superlinear convergence is established under strict complementarity. Preliminary numerical results are reported for DIMACS second-order cone programs, which confirm the favorable theoretical properties of the method.

Keywords Second-order cone · Complementarity · B-subdifferential · Semismooth · Newton's method

S. Pan's work is partially supported by the Doctoral Starting-up Foundation (B13B6050640) of GuangDong Province.

J.-S. Chen is a member of the Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by the National Science Council of Taiwan.

S. Pan

School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, People’s Republic of China

e-mail: shhpan@scut.edu.cn

J.-S. Chen

Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

e-mail: jschen@math.ntnu.edu.tw


1 Introduction

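We consider the following conic complementarity problem of finding $\zeta \in \mathbb{R}^n$ such that

\[ F(\zeta) \in \mathcal{K}, \quad G(\zeta) \in \mathcal{K}, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1} \]

where $\langle \cdot, \cdot \rangle$ represents the Euclidean inner product, $F$ and $G$ are mappings from $\mathbb{R}^n$ to $\mathbb{R}^n$ which are assumed to be continuously differentiable, and $\mathcal{K}$ is the Cartesian product of second-order cones (SOCs), also called Lorentz cones [10]. In other words,

\[ \mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_m}, \tag{2} \]

where $m, n_1, \ldots, n_m \ge 1$, $n_1 + n_2 + \cdots + n_m = n$, and

\[ \mathcal{K}^{n_i} := \{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_1 \ge \|x_2\| \}, \]

with $\|\cdot\|$ denoting the Euclidean norm and $\mathcal{K}^1$ denoting the set of nonnegative real numbers $\mathbb{R}_+$. We refer to (1)–(2) as the second-order cone complementarity problem (SOCCP). In the sequel, corresponding to the Cartesian structure of $\mathcal{K}$, we write $x = (x_1, \ldots, x_m)$ with $x_i \in \mathbb{R}^{n_i}$ for any $x \in \mathbb{R}^n$, and $F = (F_1, \ldots, F_m)$ and $G = (G_1, \ldots, G_m)$ with $F_i, G_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$.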

An important special case of the SOCCP corresponds to $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$. Then (1) reduces to

\[ F(\zeta) \in \mathcal{K}, \quad \zeta \in \mathcal{K}, \quad \langle F(\zeta), \zeta \rangle = 0, \tag{3} \]

which is a natural extension of the nonlinear complementarity problem (NCP), where $\mathcal{K} = \mathcal{K}^1 \times \cdots \times \mathcal{K}^1$. Another important special case corresponds to the Karush-Kuhn-Tucker (KKT) conditions of the convex second-order cone program (SOCP):

\[ \min \; g(x) \quad \text{s.t.} \quad Ax = b, \; x \in \mathcal{K}, \tag{4} \]

where $A \in \mathbb{R}^{m \times n}$ has full row rank, $b \in \mathbb{R}^m$, and $g : \mathbb{R}^n \to \mathbb{R}$ is a convex, twice continuously differentiable function. From [6], the KKT conditions for (4), which are sufficient but not necessary for optimality, can be written in the form of (1) and (2) with

\[ F(\zeta) := d + (I - A^T (A A^T)^{-1} A)\zeta, \quad G(\zeta) := \nabla g(F(\zeta)) - A^T (A A^T)^{-1} A \zeta, \tag{5} \]

where $d \in \mathbb{R}^n$ is any vector satisfying $Ad = b$. For large problems with a sparse $A$, (5) has the advantage that the main cost of evaluating the Jacobians $\nabla F$ and $\nabla G$ lies in inverting $A A^T$, which can be done efficiently via sparse Cholesky factorization.

There have been various methods proposed for solving SOCPs and SOCCPs, including interior-point methods [1–3, 18, 19, 23, 26], non-interior smoothing Newton methods [7, 13], smoothing-regularization methods [15], merit function methods [6], and semismooth Newton methods [16]. The last four kinds of methods are all based on an SOC complementarity function or a smooth merit function induced by it.

Given a mapping $\phi : \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l$, we call $\phi$ an SOC complementarity function associated with the cone $\mathcal{K}^l$ if for any $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$,

\[ \phi(x, y) = 0 \iff x \in \mathcal{K}^l, \; y \in \mathcal{K}^l, \; \langle x, y \rangle = 0. \tag{6} \]

Clearly, when $l = 1$, an SOC complementarity function reduces to an NCP function, which plays an important role in the solution of NCPs; see [24] and references therein.

A popular choice of $\phi$ is the Fischer-Burmeister (FB) function [11, 12], defined by

\[ \phi_{\rm FB}(x, y) := (x^2 + y^2)^{1/2} - (x + y). \tag{7} \]

More specifically, for any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, we define their Jordan product associated with $\mathcal{K}^l$ as

\[ x \circ y := (\langle x, y \rangle, \; y_1 x_2 + x_1 y_2). \tag{8} \]

The Jordan product "$\circ$", unlike scalar or matrix multiplication, is not associative, which is the main source of complication in the analysis of SOCCPs. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^l$. We write $x^2$ to mean $x \circ x$ and $x + y$ to mean the usual componentwise addition of vectors. It is known that $x^2 \in \mathcal{K}^l$ for all $x \in \mathbb{R}^l$. Moreover, if $x \in \mathcal{K}^l$, then there exists a unique vector in $\mathcal{K}^l$, denoted by $x^{1/2}$, such that $(x^{1/2})^2 = x^{1/2} \circ x^{1/2} = x$. Thus, $\phi_{\rm FB}$ in (7) is well-defined for all $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$ and maps $\mathbb{R}^l \times \mathbb{R}^l$ to $\mathbb{R}^l$. The function $\phi_{\rm FB}$ was proved in [13] to satisfy the equivalence (6), and therefore its squared norm, denoted by

\[ \psi_{\rm FB}(x, y) := \frac{1}{2} \|\phi_{\rm FB}(x, y)\|^2, \]

is a merit function for the SOCCP. This merit function was shown to be continuously differentiable by Chen and Tseng [6], who proposed a merit function approach based on it.
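The non-associativity of the Jordan product (8), and the role of the identity element $e$, can be seen on a small example. The following Python sketch is illustrative only: the helper names `jordan` and `in_soc` and the test vectors are assumptions chosen for this example and do not come from the paper.

```python
import math

def jordan(x, y):
    """Jordan product (8): x∘y = (<x, y>, y1*x2 + x1*y2), vectors as lists."""
    head = sum(a * b for a, b in zip(x, y))
    tail = [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]
    return [head] + tail

def in_soc(x, tol=1e-12):
    """Membership test for K^l: x1 >= ||x2|| (up to a small tolerance)."""
    return x[0] >= math.sqrt(sum(a * a for a in x[1:])) - tol

# Hypothetical test vectors in R^3.
e = [1.0, 0.0, 0.0]
x = [1.0, 2.0, 0.0]
y = [0.0, 1.0, 1.0]
z = [1.0, 0.0, 3.0]

left = jordan(jordan(x, y), z)    # (x∘y)∘z
right = jordan(x, jordan(y, z))   # x∘(y∘z)
square = jordan(x, x)             # x^2 = x∘x
```

For these vectors, $(x \circ y) \circ z = (5, 1, 7)$ while $x \circ (y \circ z) = (5, 7, 1)$, so the two groupings differ, whereas $x \circ e = x$ and $x^2 = (5, 4, 0)$ lies in $\mathcal{K}^3$.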

Another popular choice of $\phi$ is the natural residual function $\phi_{\rm NR} : \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l$ given by

\[ \phi_{\rm NR}(x, y) := x - [x - y]_+, \]

where $[\cdot]_+$ denotes the minimum Euclidean distance projection onto $\mathcal{K}^l$. This function was studied in [13, 15] in connection with smoothing methods for the SOCCP, and it was recently used by Kanzow and Fukushima [16] to develop a semismooth Newton method for nonlinear SOCPs. We note that $\phi_{\rm NR}$ induces a natural residual merit function

\[ \psi_{\rm NR}(x, y) := \frac{1}{2} \|\phi_{\rm NR}(x, y)\|^2, \]

but, compared to $\psi_{\rm FB}$, it has a notable drawback: it is not differentiable.
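To make the natural residual function concrete, the sketch below evaluates $\phi_{\rm NR}$ in Python. It assumes the standard spectral formula for the projection, $[x]_+ = \max\{0, \lambda_1(x)\}\, u_x^{(1)} + \max\{0, \lambda_2(x)\}\, u_x^{(2)}$, with the spectral values and vectors of the factorization recalled in Section 2; this formula is not stated in the text above, and the function names and test points are hypothetical.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^l."""
    x1, x2 = x[0], x[1:]
    n = math.sqrt(sum(a * a for a in x2))
    # If x2 = 0 any unit vector may be used; pick the first coordinate axis.
    if n > 0:
        xbar = [a / n for a in x2]
    else:
        xbar = [1.0 if k == 0 else 0.0 for k in range(len(x2))]
    u1 = [0.5] + [-0.5 * a for a in xbar]
    u2 = [0.5] + [0.5 * a for a in xbar]
    return x1 - n, x1 + n, u1, u2

def project_soc(x):
    """Euclidean projection [x]_+ onto K^l (assumed spectral formula)."""
    lam1, lam2, u1, u2 = spectral(x)
    return [max(lam1, 0.0) * a + max(lam2, 0.0) * b for a, b in zip(u1, u2)]

def phi_nr(x, y):
    """Natural residual phi_NR(x, y) = x - [x - y]_+."""
    p = project_soc([a - b for a, b in zip(x, y)])
    return [a - b for a, b in zip(x, p)]

# A complementary pair on the boundary of K^3: x, y in K^3 and <x, y> = 0.
x, y = [1.0, 1.0, 0.0], [1.0, -1.0, 0.0]
residual_at_solution = phi_nr(x, y)
# A pair in K^3 that is not complementary, since <x, y> = 1.
residual_elsewhere = phi_nr([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

The residual vanishes at the complementary pair and is nonzero at the second pair, in line with the equivalence (6).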

In this paper, we consider a one-parametric class of vector-valued functions

\[ \phi_\tau(x, y) := [(x - y)^2 + \tau (x \circ y)]^{1/2} - (x + y) \tag{9} \]

with $\tau$ being an arbitrary but fixed parameter in $(0, 4)$. This class of functions is a natural extension of the family of NCP functions proposed by Kanzow and Kleinmichel [17], and has been shown in [4] to satisfy the characterization (6). It is not hard to see that when $\tau = 2$, $\phi_\tau$ reduces to the FB function $\phi_{\rm FB}$ in (7), while it becomes a multiple of the natural residual function $\phi_{\rm NR}$ as $\tau \to 0^+$. With this class of SOC complementarity functions, the SOCCP can clearly be reformulated as a nonsmooth system of equations

\[ \Phi_\tau(\zeta) := \begin{pmatrix} \phi_\tau(F_1(\zeta), G_1(\zeta)) \\ \vdots \\ \phi_\tau(F_i(\zeta), G_i(\zeta)) \\ \vdots \\ \phi_\tau(F_m(\zeta), G_m(\zeta)) \end{pmatrix} = 0, \tag{10} \]

which induces a natural merit function $\Psi_\tau : \mathbb{R}^n \to \mathbb{R}_+$ given by

\[ \Psi_\tau(\zeta) = \frac{1}{2} \|\Phi_\tau(\zeta)\|^2 = \sum_{i=1}^m \psi_\tau(F_i(\zeta), G_i(\zeta)), \tag{11} \]

with $\psi_\tau$ being the natural merit function associated with $\phi_\tau$, i.e.,

\[ \psi_\tau(x, y) = \frac{1}{2} \|\phi_\tau(x, y)\|^2. \tag{12} \]
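The definitions (9)–(12) can be evaluated directly once the square root in $\mathcal{K}^l$ is available. The Python sketch below is a minimal illustration under stated assumptions: the square root is computed from the spectral factorization recalled in Section 2, the block values of $F$ and $G$ are supplied as plain lists rather than evaluated from mappings, and all function names and sample data are hypothetical.

```python
import math

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    head = sum(a * b for a, b in zip(x, y))
    return [head] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def soc_sqrt(w):
    """Square root w^{1/2} of w in K^l via its spectral values."""
    w1, w2 = w[0], w[1:]
    n = math.sqrt(sum(a * a for a in w2))
    wbar = [a / n for a in w2] if n > 0 else [0.0] * len(w2)
    # Clamp at zero to absorb tiny negative values caused by rounding.
    r1 = math.sqrt(max(w1 - n, 0.0))
    r2 = math.sqrt(max(w1 + n, 0.0))
    return [(r2 + r1) / 2] + [(r2 - r1) / 2 * a for a in wbar]

def phi_tau(x, y, tau):
    """phi_tau(x, y) = [(x - y)^2 + tau*(x∘y)]^{1/2} - (x + y), see (9)."""
    d = [a - b for a, b in zip(x, y)]
    w = [p + tau * q for p, q in zip(jordan(d, d), jordan(x, y))]
    z = soc_sqrt(w)
    return [zi - a - b for zi, a, b in zip(z, x, y)]

def merit(F_blocks, G_blocks, tau):
    """Psi_tau as in (11): sum over blocks of 0.5*||phi_tau(F_i, G_i)||^2."""
    return sum(0.5 * sum(v * v for v in phi_tau(Fi, Gi, tau))
               for Fi, Gi in zip(F_blocks, G_blocks))

# Block values at a hypothetical solution: K = K^3 x K^1.
F_blocks = [[1.0, 1.0, 0.0], [2.0]]
G_blocks = [[1.0, -1.0, 0.0], [0.0]]
value_at_solution = merit(F_blocks, G_blocks, tau=1.0)
# Block values that violate complementarity.
value_elsewhere = merit([[1.0, 0.0, 0.0]], [[1.0, 0.0, 0.0]], tau=1.0)
```

The merit value is zero when every block pair is complementary and strictly positive otherwise, which is what makes $\Psi_\tau$ usable as a descent measure for the system (10).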

In [4], we studied the continuous differentiability of $\psi_\tau$ and showed that each stationary point of $\Psi_\tau$ is a solution of the SOCCP if $\nabla F$ and $-\nabla G$ are column monotone. In this paper, we concentrate on the properties of $\phi_\tau$, including its global Lipschitz continuity, its strong semismoothness, and the characterization of its B-subdifferential. In particular, we provide a weaker condition than that of [4] for each stationary point of $\Psi_\tau$ to be a solution of the SOCCP, and we establish the boundedness of the level sets of $\Psi_\tau$ by using Cartesian $P$-properties. We also propose a semismooth Newton method based on the system (10) and obtain the corresponding global and superlinear convergence results. In particular, the superlinear convergence is established under strict complementarity.

Throughout this paper, $I$ represents an identity matrix of suitable dimension, and $\mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_m}$ is identified with $\mathbb{R}^{n_1 + \cdots + n_m}$. For a differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x)$ denotes the transpose of the Jacobian $F'(x)$. For a symmetric matrix $A \in \mathbb{R}^{n \times n}$, we write $A \succeq O$ (respectively, $A \succ O$) to mean $A$ is positive semidefinite (respectively, positive definite). Given a finite number of square matrices $Q_1, \ldots, Q_n$, we denote the block diagonal matrix with these matrices as block diagonals by $\mathrm{diag}(Q_1, \ldots, Q_n)$ or by $\mathrm{diag}(Q_i, i = 1, \ldots, n)$. If $J$ and $B$ are index sets such that $J, B \subseteq \{1, 2, \ldots, m\}$, we denote by $P_{JB}$ the block matrix consisting of the sub-matrices $P_{jk} \in \mathbb{R}^{n_j \times n_k}$ of $P$ with $j \in J$, $k \in B$, and by $x_B$ the vector consisting of sub-vectors $x_i \in \mathbb{R}^{n_i}$ with $i \in B$.


2 Preliminaries

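In this section, we recall some background materials and preliminary results that will be used in the subsequent sections. We begin with the interior and the boundary of $\mathcal{K}^l$. It is known that $\mathcal{K}^l$ is a closed convex self-dual cone with nonempty interior given by

\[ \mathrm{int}(\mathcal{K}^l) := \{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 > \|x_2\| \} \]

and the boundary given by

\[ \mathrm{bd}(\mathcal{K}^l) := \{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 = \|x_2\| \}. \]

For each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, the determinant and the trace of $x$ are defined by

\[ \det(x) := x_1^2 - \|x_2\|^2, \quad \mathrm{tr}(x) := 2 x_1. \]

In general, $\det(x \circ y) \ne \det(x) \det(y)$ unless $x_2 = \alpha y_2$ for some $\alpha \in \mathbb{R}$. A vector $x \in \mathbb{R}^l$ is said to be invertible if $\det(x) \ne 0$, and its inverse is denoted by $x^{-1}$. Given a vector $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, we often use the following symmetric matrix

\[ L_x := \begin{pmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{pmatrix}, \tag{13} \]

which can be viewed as a linear mapping from $\mathbb{R}^l$ to $\mathbb{R}^l$. It is easy to verify that $L_x y = x \circ y$ and $L_{x+y} = L_x + L_y$ for any $x, y \in \mathbb{R}^l$. Furthermore, $x \in \mathcal{K}^l$ if and only if $L_x \succeq O$, and $x \in \mathrm{int}(\mathcal{K}^l)$ if and only if $L_x \succ O$. In the latter case, $L_x$ is invertible with

\[ L_x^{-1} = \frac{1}{\det(x)} \begin{pmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{pmatrix}. \tag{14} \]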
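The identities $L_x y = x \circ y$ and the closed-form inverse (14) are easy to check numerically. The following Python sketch does so for one interior point; the helper names (`arrow`, `arrow_inv`, `matmul`) and the sample vectors are assumptions made for this illustration.

```python
def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    head = sum(a * b for a, b in zip(x, y))
    return [head] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def arrow(x):
    """The matrix L_x of (13) as a list of rows."""
    l = len(x)
    L = [[0.0] * l for _ in range(l)]
    for i in range(l):
        L[i][i] = x[0]
    for i in range(1, l):
        L[0][i] = L[i][0] = x[i]
    return L

def arrow_inv(x):
    """Closed-form inverse (14); requires det(x) != 0 and x1 != 0."""
    l = len(x)
    x1 = x[0]
    det = x1 * x1 - sum(a * a for a in x[1:])
    M = [[0.0] * l for _ in range(l)]
    M[0][0] = x1 / det
    for i in range(1, l):
        M[0][i] = M[i][0] = -x[i] / det
        for j in range(1, l):
            # (1/det) * ((det/x1) * I + x2 x2^T / x1), entry (i, j)
            M[i][j] = (1.0 / x1 if i == j else 0.0) + x[i] * x[j] / (x1 * det)
    return M

def matmul(A, B):
    """Plain dense matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = [3.0, 1.0, 2.0]    # det(x) = 9 - 5 = 4 and x1 > ||x2||, so x is in int(K^3)
y = [1.0, -2.0, 0.5]
Lx_y = [sum(r * v for r, v in zip(row, y)) for row in arrow(x)]
product = matmul(arrow(x), arrow_inv(x))   # should be the 3 x 3 identity
```

The closed form avoids a general matrix factorization, at the cost of requiring $x_1 \ne 0$ and $\det(x) \ne 0$, both of which hold for $x \in \mathrm{int}(\mathcal{K}^l)$.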

We recall from [13] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ admits a spectral factorization, associated with $\mathcal{K}^l$, of the form

\[ x = \lambda_1(x) \cdot u_x^{(1)} + \lambda_2(x) \cdot u_x^{(2)}, \]

where $\lambda_i(x)$ and $u_x^{(i)}$ for $i = 1, 2$ are the spectral values and the associated spectral vectors of $x$, respectively, given by

\[ \lambda_i(x) = x_1 + (-1)^i \|x_2\|, \quad u_x^{(i)} = \frac{1}{2} \bigl( 1, \; (-1)^i \bar{x}_2 \bigr) \tag{15} \]

with $\bar{x}_2 = x_2 / \|x_2\|$ if $x_2 \ne 0$, and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{l-1}$ satisfying $\|\bar{x}_2\| = 1$. If $x_2 \ne 0$, then the factorization is unique. The spectral decompositions of $x$, $x^2$ and $x^{1/2}$ have some basic properties as below, whose proofs can be found in [13].

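Property 2.1 For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ with the spectral values $\lambda_1(x), \lambda_2(x)$ and spectral vectors $u_x^{(1)}, u_x^{(2)}$ given as above, we have that

(a) $x \in \mathcal{K}^l$ if and only if $\lambda_1(x) \ge 0$, and $x \in \mathrm{int}(\mathcal{K}^l)$ if and only if $\lambda_1(x) > 0$.

(b) $x^2 = \lambda_1^2(x)\, u_x^{(1)} + \lambda_2^2(x)\, u_x^{(2)} \in \mathcal{K}^l$.

(c) $x^{1/2} = \sqrt{\lambda_1(x)}\, u_x^{(1)} + \sqrt{\lambda_2(x)}\, u_x^{(2)} \in \mathcal{K}^l$ if $x \in \mathcal{K}^l$.

(d) $\det(x) = \lambda_1(x) \lambda_2(x)$, $\mathrm{tr}(x) = \lambda_1(x) + \lambda_2(x)$ and $\|x\|^2 = [\lambda_1^2(x) + \lambda_2^2(x)]/2$.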
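The spectral factorization (15) and the identities of Property 2.1 can be verified on a concrete vector. The Python sketch below is illustrative; the function name `spectral` and the test vector are assumptions, and the vector is deliberately chosen outside $\mathcal{K}^3$ so that $\lambda_1(x) < 0$.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2), following (15)."""
    x1, x2 = x[0], x[1:]
    n = math.sqrt(sum(a * a for a in x2))
    # If x2 = 0 any unit vector may be used; pick the first coordinate axis.
    if n > 0:
        xbar = [a / n for a in x2]
    else:
        xbar = [1.0 if k == 0 else 0.0 for k in range(len(x2))]
    u1 = [0.5] + [-0.5 * a for a in xbar]
    u2 = [0.5] + [0.5 * a for a in xbar]
    return x1 - n, x1 + n, u1, u2

x = [1.0, 3.0, 4.0]                        # ||x2|| = 5 > x1, so x is not in K^3
lam1, lam2, u1, u2 = spectral(x)

rebuilt = [lam1 * a + lam2 * b for a, b in zip(u1, u2)]             # x itself
det_x = x[0] ** 2 - sum(a * a for a in x[1:])                      # det(x)
norm_sq = sum(a * a for a in x)                                    # ||x||^2
x_sq_spectral = [lam1 ** 2 * a + lam2 ** 2 * b for a, b in zip(u1, u2)]  # x^2
x_sq_direct = [norm_sq] + [2 * x[0] * a for a in x[1:]]            # x∘x
```

Here $\lambda_1(x) = -4$ and $\lambda_2(x) = 6$, so $\det(x) = -24$, $\mathrm{tr}(x) = 2$ and $\|x\|^2 = 26$, and the squared vector obtained from the spectral values agrees with $x \circ x$ computed directly.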

For the sake of notation, throughout the rest of this paper, we always let

\[ w = (w_1, w_2) = w(x, y) := (x - y)^2 + \tau (x \circ y), \]
\[ z = (z_1, z_2) = z(x, y) := [(x - y)^2 + \tau (x \circ y)]^{1/2} \tag{16} \]

for any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$. It is easy to compute

\[ w_1 = \|x\|^2 + \|y\|^2 + (\tau - 2) x^T y, \]
\[ w_2 = 2 (x_1 x_2 + y_1 y_2) + (\tau - 2)(x_1 y_2 + y_1 x_2). \]

Moreover, $w \in \mathcal{K}^l$ and $z \in \mathcal{K}^l$ hold by considering that

\[ w = x^2 + y^2 + (\tau - 2)(x \circ y) = \Bigl( x + \frac{\tau - 2}{2} y \Bigr)^2 + \frac{\tau (4 - \tau)}{4} y^2 = \Bigl( y + \frac{\tau - 2}{2} x \Bigr)^2 + \frac{\tau (4 - \tau)}{4} x^2. \tag{17} \]

In what follows, we present several important technical lemmas. Since their proofs can be found in [4], we omit them here for simplicity.
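Before turning to the lemmas, the identity (17) can be checked by expanding the square, using only that the Jordan product is commutative and bilinear. The following worked derivation is added for illustration and is not part of the original text.

```latex
\begin{align*}
\Bigl(x + \tfrac{\tau-2}{2}\,y\Bigr)^2 + \tfrac{\tau(4-\tau)}{4}\,y^2
  &= x^2 + (\tau-2)(x \circ y) + \tfrac{(\tau-2)^2}{4}\,y^2
     + \tfrac{\tau(4-\tau)}{4}\,y^2 \\
  &= x^2 + (\tau-2)(x \circ y)
     + \tfrac{(\tau^2 - 4\tau + 4) + (4\tau - \tau^2)}{4}\,y^2 \\
  &= x^2 + y^2 + (\tau-2)(x \circ y) \\
  &= (x - y)^2 + \tau\,(x \circ y) = w .
\end{align*}
```

Since squares lie in $\mathcal{K}^l$, the coefficient $\tau(4-\tau)/4$ is positive for $\tau \in (0, 4)$, and $\mathcal{K}^l$ is a convex cone, the left-hand side belongs to $\mathcal{K}^l$, which gives $w \in \mathcal{K}^l$. The second expression in (17) follows by interchanging $x$ and $y$.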

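Lemma 2.1 [4, Lemma 3.4] For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ and $\tau \in (0, 4)$, let $w = (w_1, w_2)$ be defined as in (16). If $w_2 \ne 0$, then

\[ \Bigl[ \Bigl( x_1 + \frac{\tau - 2}{2} y_1 \Bigr) + (-1)^i \Bigl( x_2 + \frac{\tau - 2}{2} y_2 \Bigr)^T \frac{w_2}{\|w_2\|} \Bigr]^2 \le \Bigl\| \Bigl( x_2 + \frac{\tau - 2}{2} y_2 \Bigr) + (-1)^i \Bigl( x_1 + \frac{\tau - 2}{2} y_1 \Bigr) \frac{w_2}{\|w_2\|} \Bigr\|^2 \le \lambda_i(w) \quad \text{for } i = 1, 2. \]

Furthermore, these relations also hold when interchanging $x$ and $y$.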

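Lemma 2.2 [4, Lemma 3.2] For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ and $\tau \in (0, 4)$, let $w = (w_1, w_2)$ be given as in (16). If $w \notin \mathrm{int}(\mathcal{K}^l)$, then

\[ x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2; \tag{18} \]

\[ x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 = \| x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2 \| = \|x_2\|^2 + \|y_2\|^2 + (\tau - 2) x_2^T y_2. \tag{19} \]

If, in addition, $(x, y) \ne (0, 0)$, then $w_2 \ne 0$, and moreover,

\[ x_2^T \frac{w_2}{\|w_2\|} = x_1, \quad x_1 \frac{w_2}{\|w_2\|} = x_2, \quad y_2^T \frac{w_2}{\|w_2\|} = y_1, \quad y_1 \frac{w_2}{\|w_2\|} = y_2. \tag{20} \]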


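Lemma 2.3 [4, Proposition 3.2] For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, let $z(x, y)$ be defined by (16). Then $z(x, y)$ is continuously differentiable at a point $(x, y)$ if and only if $(x - y)^2 + \tau (x \circ y) \in \mathrm{int}(\mathcal{K}^l)$, and furthermore,

\[ \nabla_x z(x, y) = L_{x + \frac{\tau - 2}{2} y} L_z^{-1}, \quad \nabla_y z(x, y) = L_{y + \frac{\tau - 2}{2} x} L_z^{-1}, \]

where

\[ L_z^{-1} = \begin{cases} \begin{pmatrix} b & c\, \dfrac{w_2^T}{\|w_2\|} \\ c\, \dfrac{w_2}{\|w_2\|} & a I + (b - a) \dfrac{w_2 w_2^T}{\|w_2\|^2} \end{pmatrix} & \text{if } w_2 \ne 0; \\ (1/\sqrt{w_1})\, I & \text{if } w_2 = 0, \end{cases} \tag{21} \]

with

\[ a = \frac{2}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}, \quad b = \frac{1}{2} \Bigl( \frac{1}{\sqrt{\lambda_2(w)}} + \frac{1}{\sqrt{\lambda_1(w)}} \Bigr), \quad c = \frac{1}{2} \Bigl( \frac{1}{\sqrt{\lambda_2(w)}} - \frac{1}{\sqrt{\lambda_1(w)}} \Bigr). \tag{22} \]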
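The closed form (21)–(22) can be checked against the definition of $L_z$ for a point $w \in \mathrm{int}(\mathcal{K}^l)$. The Python sketch below builds $L_z^{-1}$ from $a$, $b$, $c$ and multiplies it with $L_z$; the function names and the sample $w$ (whose spectral values 4 and 16 give the exact square root $z = (3, 1, 0)$) are assumptions made for this illustration.

```python
import math

def arrow(x):
    """The matrix L_x of (13) as a list of rows."""
    l = len(x)
    L = [[0.0] * l for _ in range(l)]
    for i in range(l):
        L[i][i] = x[0]
    for i in range(1, l):
        L[0][i] = L[i][0] = x[i]
    return L

def matmul(A, B):
    """Plain dense matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lz_inverse(w):
    """L_z^{-1} for z = w^{1/2}, built from (21)-(22); needs w in int(K^l)."""
    l = len(w)
    w1, w2 = w[0], w[1:]
    n = math.sqrt(sum(v * v for v in w2))
    if n == 0.0:
        # Second branch of (21): w2 = 0.
        return [[(1.0 / math.sqrt(w1) if i == j else 0.0) for j in range(l)]
                for i in range(l)]
    r1, r2 = math.sqrt(w1 - n), math.sqrt(w1 + n)  # sqrt of lambda_1, lambda_2
    a = 2.0 / (r2 + r1)
    b = 0.5 * (1.0 / r2 + 1.0 / r1)
    c = 0.5 * (1.0 / r2 - 1.0 / r1)
    wbar = [v / n for v in w2]
    M = [[0.0] * l for _ in range(l)]
    M[0][0] = b
    for i in range(1, l):
        M[0][i] = M[i][0] = c * wbar[i - 1]
        for j in range(1, l):
            M[i][j] = (a if i == j else 0.0) + (b - a) * wbar[i - 1] * wbar[j - 1]
    return M

w = [10.0, 6.0, 0.0]   # spectral values 4 and 16, so w is in int(K^3)
z = [3.0, 1.0, 0.0]    # w^{1/2}, since z∘z = (10, 6, 0)
check = matmul(arrow(z), lz_inverse(w))   # should be the 3 x 3 identity
```

For this $w$ the coefficients are $a = 1/3$, $b = 3/8$ and $c = -1/8$, and the product of $L_z$ with the matrix from (21) is the identity up to rounding.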

To close this section, we recall some definitions that will be used in the subsequent sections. Given a mapping $H : \mathbb{R}^n \to \mathbb{R}^m$, if $H$ is locally Lipschitz continuous, the set

\[ \partial_B H(z) := \{ V \in \mathbb{R}^{m \times n} \mid \exists \{z^k\} \subseteq D_H : z^k \to z, \; H'(z^k) \to V \} \]

is nonempty and is called the B-subdifferential of $H$ at $z$, where $D_H \subseteq \mathbb{R}^n$ denotes the set of points at which $H$ is differentiable. The convex hull $\partial H(z) := \mathrm{conv}\, \partial_B H(z)$ is the generalized Jacobian of $H$ at $z$ in the sense of Clarke [8]. For the concepts of (strongly) semismooth functions, please refer to [21, 22] for details. We next present definitions of Cartesian $P$-properties for a matrix $M \in \mathbb{R}^{n \times n}$, which are in fact special cases of those introduced by Chen and Qi [5] for a linear transformation.

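Definition 2.1 A matrix $M \in \mathbb{R}^{n \times n}$ is said to have

(a) the Cartesian $P$-property if for any $0 \ne x = (x_1, \ldots, x_m) \in \mathbb{R}^n$ with $x_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, m\}$ such that

\[ \langle x_\nu, (Mx)_\nu \rangle > 0; \]

(b) the Cartesian $P_0$-property if for any $0 \ne x = (x_1, \ldots, x_m) \in \mathbb{R}^n$ with $x_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, m\}$ such that

\[ x_\nu \ne 0 \quad \text{and} \quad \langle x_\nu, (Mx)_\nu \rangle \ge 0. \]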
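For a single vector $x$, the conditions in Definition 2.1 amount to a search over the blocks of $x$ and $Mx$. The Python sketch below performs that search; it is only an illustration, since the properties themselves quantify over all nonzero $x$, so a sampled check of this kind can refute a property but cannot establish it. The function names, the block partition and the sample matrices are hypothetical.

```python
def blocks(v, sizes):
    """Split a flat vector into consecutive blocks of the given sizes."""
    out, start = [], 0
    for s in sizes:
        out.append(v[start:start + s])
        start += s
    return out

def cartesian_witness(M, x, sizes, strict=True):
    """Return the first block index nu (0-based) satisfying the condition of
    Definition 2.1(a) (strict=True) or 2.1(b) (strict=False) at x, else None."""
    Mx = [sum(m * v for m, v in zip(row, x)) for row in M]
    for nu, (xb, yb) in enumerate(zip(blocks(x, sizes), blocks(Mx, sizes))):
        inner = sum(a * b for a, b in zip(xb, yb))
        if strict and inner > 0:
            return nu
        if not strict and any(a != 0 for a in xb) and inner >= 0:
            return nu
    return None

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
negated = [[-v for v in row] for row in identity]
zero = [[0.0] * 3 for _ in range(3)]
sizes = [1, 2]                       # R^3 viewed as R^1 x R^2
x = [0.0, 1.0, 2.0]

nu_identity = cartesian_witness(identity, x, sizes)              # second block
nu_negated = cartesian_witness(negated, x, sizes)                # no witness
nu_zero_p0 = cartesian_witness(zero, x, sizes, strict=False)     # second block
```

The zero matrix illustrates the gap between the two properties: at this $x$ it admits a witness for the $P_0$ condition but none for the strict $P$ condition.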

Some nonlinear generalizations of these concepts in the setting of $\mathcal{K}$ are defined as follows.

Definition 2.2 Given a mapping $F = (F_1, \ldots, F_m)$ with $F_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$, $F$ is said to

(a) have the uniform Cartesian $P$-property if for any $x = (x_1, \ldots, x_m)$, $y = (y_1, \ldots, y_m) \in \mathbb{R}^n$, there exists an index $\nu \in \{1, 2, \ldots, m\}$ and a positive constant $\rho > 0$ such that

\[ \langle x_\nu - y_\nu, F_\nu(x) - F_\nu(y) \rangle \ge \rho \|x - y\|^2; \]

(b) have the Cartesian $P_0$-property if for any $x = (x_1, \ldots, x_m)$, $y = (y_1, \ldots, y_m) \in \mathbb{R}^n$ with $x \ne y$, there exists an index $\nu \in \{1, 2, \ldots, m\}$ such that

\[ x_\nu \ne y_\nu \quad \text{and} \quad \langle x_\nu - y_\nu, F_\nu(x) - F_\nu(y) \rangle \ge 0. \]

If a continuously differentiable mapping $F$ has the Cartesian $P$-properties, then the matrix $\nabla F(x)$ at any $x \in \mathbb{R}^n$ enjoys the corresponding Cartesian $P$-properties.

3 Properties of the functions $\phi_\tau$ and $\Phi_\tau$

This section is devoted to investigating the favorable properties of $\phi_\tau$, which include its global Lipschitz continuity, its strong semismoothness, and the characterization of its B-subdifferential at any point. Based on these results, we also present some properties of the operator $\Phi_\tau$ related to the generalized Newton method.

From the definitions of $\phi_\tau$ and $z(x, y)$ given in (9) and (16), respectively, we have

\[ \phi_\tau(x, y) = z(x, y) - (x + y) = z - (x + y) \tag{23} \]

for any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$. Recall that the vectors $w = (w_1, w_2)$ and $z = (z_1, z_2)$ in (16) satisfy $w, z \in \mathcal{K}^l$, and hence, from Property 2.1(b) and (c),

\[ z = \Bigl( \frac{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}{2}, \; \frac{\sqrt{\lambda_2(w)} - \sqrt{\lambda_1(w)}}{2}\, \bar{w}_2 \Bigr), \tag{24} \]

where $\bar{w}_2 = w_2 / \|w_2\|$ if $w_2 \ne 0$, and otherwise $\bar{w}_2$ is any vector in $\mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$. The following proposition states some favorable properties possessed by $\phi_\tau$.

Proposition 3.1 The function $\phi_\tau$ defined as in (9) has the following properties.

(a) $\phi_\tau$ is continuously differentiable at a point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$ if and only if $(x - y)^2 + \tau (x \circ y) \in \mathrm{int}(\mathcal{K}^l)$. Moreover,

\[ \nabla_x \phi_\tau(x, y) = L_{x + \frac{\tau - 2}{2} y} L_z^{-1} - I, \quad \nabla_y \phi_\tau(x, y) = L_{y + \frac{\tau - 2}{2} x} L_z^{-1} - I. \]

(b) $\phi_\tau$ is globally Lipschitz continuous with the Lipschitz constant independent of $\tau$.

(c) $\phi_\tau$ is strongly semismooth at any $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$.

(d) $\psi_\tau$ defined by (12) is continuously differentiable everywhere.


Proof (a) The proof directly follows from Lemma 2.3 and (23).

(b) By (23), it suffices to prove that $z(x, y)$ is globally Lipschitz continuous. Let

\[ \hat{z} = (\hat{z}_1, \hat{z}_2) = \hat{z}(x, y, \epsilon) := [(x - y)^2 + \tau (x \circ y) + \epsilon e]^{1/2} \tag{25} \]

for any $\epsilon > 0$ and $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$. Then, applying Lemma A.1 in the Appendix and the Mean-Value Theorem, we have

\begin{align*}
\|z(x, y) - z(a, b)\| &= \Bigl\| \lim_{\epsilon \to 0^+} \hat{z}(x, y, \epsilon) - \lim_{\epsilon \to 0^+} \hat{z}(a, b, \epsilon) \Bigr\| \\
&\le \lim_{\epsilon \to 0^+} \| \hat{z}(x, y, \epsilon) - \hat{z}(a, y, \epsilon) + \hat{z}(a, y, \epsilon) - \hat{z}(a, b, \epsilon) \| \\
&\le \lim_{\epsilon \to 0^+} \Bigl\| \int_0^1 \nabla_x \hat{z}(a + t(x - a), y, \epsilon)(x - a)\, dt \Bigr\| + \lim_{\epsilon \to 0^+} \Bigl\| \int_0^1 \nabla_y \hat{z}(a, b + t(y - b), \epsilon)(y - b)\, dt \Bigr\| \\
&\le \sqrt{2}\, C\, \|(x, y) - (a, b)\|
\end{align*}

for any $(x, y), (a, b) \in \mathbb{R}^l \times \mathbb{R}^l$, where $C > 0$ is a constant independent of $\tau$.

(c) From the definitions of $\phi_\tau$ and $\phi_{\rm FB}$, it is not hard to check that

\[ \phi_\tau(x, y) = \phi_{\rm FB}\Bigl( x + \frac{\tau - 2}{2} y, \; \frac{\sqrt{\tau (4 - \tau)}}{2} y \Bigr) + \frac{1}{2} \bigl( \tau - 4 + \sqrt{\tau (4 - \tau)} \bigr) y. \]

Note that $\phi_{\rm FB}$ is strongly semismooth everywhere by Corollary 3.3 of [25], and the functions $x + \frac{\tau - 2}{2} y$, $\frac{1}{2} \sqrt{\tau (4 - \tau)}\, y$ and $\frac{1}{2} (\tau - 4 + \sqrt{\tau (4 - \tau)})\, y$ are also strongly semismooth at any $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$. Therefore, $\phi_\tau$ is a strongly semismooth function since, by [12, Theorem 19], the composition of strongly semismooth functions is strongly semismooth.

(d) The proof can be found in Proposition 3.3 of [4].

Proposition 3.1(c) indicates that, when a smoothing or nonsmooth Newton method is employed to solve the system (10), a fast convergence rate (at least superlinear) can be expected. To develop a semismooth Newton method for the SOCCP, we need to characterize the B-subdifferential $\partial_B \phi_\tau(x, y)$ at a general point $(x, y)$. The B-subdifferential of $\phi_{\rm FB}$ was discussed in [20]. Here, we generalize it to $\phi_\tau$ for any $\tau \in (0, 4)$. The detailed derivation is included in the Appendix for completeness.
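The identity used in part (c) of the proof above, which expresses $\phi_\tau$ through $\phi_{\rm FB}$ evaluated at rescaled arguments, can be confirmed numerically. The Python sketch below compares both sides at one sample point; the function names and the sample data are assumptions made for this illustration, and $\phi_{\rm FB}$ is obtained as $\phi_\tau$ with $\tau = 2$.

```python
import math

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    head = sum(a * b for a, b in zip(x, y))
    return [head] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def soc_sqrt(w):
    """Square root of w in K^l via its spectral values, as in (24)."""
    w1, w2 = w[0], w[1:]
    n = math.sqrt(sum(a * a for a in w2))
    wbar = [a / n for a in w2] if n > 0 else [0.0] * len(w2)
    r1 = math.sqrt(max(w1 - n, 0.0))   # clamp guards against rounding
    r2 = math.sqrt(max(w1 + n, 0.0))
    return [(r2 + r1) / 2] + [(r2 - r1) / 2 * a for a in wbar]

def phi_tau(x, y, tau):
    """phi_tau(x, y) = [(x - y)^2 + tau*(x∘y)]^{1/2} - (x + y)."""
    d = [a - b for a, b in zip(x, y)]
    w = [p + tau * q for p, q in zip(jordan(d, d), jordan(x, y))]
    return [zi - a - b for zi, a, b in zip(soc_sqrt(w), x, y)]

def phi_via_fb(x, y, tau):
    """Right-hand side of the identity in the proof of Proposition 3.1(c)."""
    s = math.sqrt(tau * (4.0 - tau))
    a = [xi + 0.5 * (tau - 2.0) * yi for xi, yi in zip(x, y)]
    b = [0.5 * s * yi for yi in y]
    fb = phi_tau(a, b, 2.0)            # tau = 2 gives the FB function
    return [f + 0.5 * (tau - 4.0 + s) * yi for f, yi in zip(fb, y)]

x, y, tau = [1.0, 2.0, -1.0], [0.5, -1.0, 3.0], 1.0
direct = phi_tau(x, y, tau)
via_fb = phi_via_fb(x, y, tau)
```

Both evaluations share the same vector $w$ by (17), so they agree up to rounding; the difference between $\phi_\tau$ and the rescaled $\phi_{\rm FB}$ is the linear term in $y$.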

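Proposition 3.2 Given a general point $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, each element in $\partial_B \phi_\tau(x, y)$ is of the form $V = [V_x - I \;\; V_y - I]$ with $V_x$ and $V_y$ having the following representation:

(a) If $(x - y)^2 + \tau (x \circ y) \in \mathrm{int}(\mathcal{K}^l)$, then $V_x = L_z^{-1} L_{x + \frac{\tau - 2}{2} y}$ and $V_y = L_z^{-1} L_{y + \frac{\tau - 2}{2} x}$.

(b) If $(x - y)^2 + \tau (x \circ y) \in \mathrm{bd}(\mathcal{K}^l)$ and $(x, y) \ne (0, 0)$, then

\begin{align*}
V_x &\in \Bigl\{ \frac{1}{2 \sqrt{2 w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix} \Bigl( L_x + \frac{\tau - 2}{2} L_y \Bigr) + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T \Bigr\}, \\
V_y &\in \Bigl\{ \frac{1}{2 \sqrt{2 w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix} \Bigl( L_y + \frac{\tau - 2}{2} L_x \Bigr) + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \Bigr\}
\end{align*}
\[ \tag{26} \]

for some $u = (u_1, u_2)$, $v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$, where $\bar{w}_2 = w_2 / \|w_2\|$.

(c) If $(x, y) = (0, 0)$, then $V_x \in \{L_{\hat{u}}\}$, $V_y \in \{L_{\hat{v}}\}$ for some $\hat{u} = (\hat{u}_1, \hat{u}_2)$, $\hat{v} = (\hat{v}_1, \hat{v}_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $\|\hat{u}\|, \|\hat{v}\| \le 1$ and $\hat{u}_1 \hat{v}_2 + \hat{v}_1 \hat{u}_2 = 0$, or

\begin{align*}
V_x &\in \Bigl\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} \Bigr\}, \\
V_y &\in \Bigl\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} \Bigr\}
\end{align*}

for some $\bar{w}_2$ with $\|\bar{w}_2\| = 1$, some $u = (u_1, u_2)$, $v = (v_1, v_2)$, $\xi = (\xi_1, \xi_2)$, $\eta = (\eta_1, \eta_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$, $|v_1| \le \|v_2\| \le 1$, $|\xi_1| \le \|\xi_2\| \le 1$ and $|\eta_1| \le \|\eta_2\| \le 1$, and $s = (s_1, s_2)$, $\omega = (\omega_1, \omega_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ such that $\|s\|^2 + \|\omega\|^2 \le 1$.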

In what follows, we focus on the properties of the operator $\Phi_\tau$ defined in (10). We start with the semismoothness of $\Phi_\tau$. Since $\Phi_\tau$ is (strongly) semismooth if and only if all component functions are (strongly) semismooth, and since the composite of (strongly) semismooth functions is (strongly) semismooth by [12, Theorem 19], we obtain the following conclusion as an immediate consequence of Proposition 3.1(c).

Proposition 3.3 The operator $\Phi_\tau : \mathbb{R}^n \to \mathbb{R}^n$ defined as in (10) is semismooth. Moreover, it is strongly semismooth if $F'$ and $G'$ are locally Lipschitz continuous.

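To characterize the B-subdifferential of $\Phi_\tau$, in the rest of this paper, we let $F_i(\zeta) = (F_{i1}(\zeta), F_{i2}(\zeta))$, $G_i(\zeta) = (G_{i1}(\zeta), G_{i2}(\zeta)) \in \mathbb{R} \times \mathbb{R}^{n_i - 1}$, and let $w_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$ and $z_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$ for $i = 1, 2, \ldots, m$ be given as follows:

\[ w_i = (w_{i1}(\zeta), w_{i2}(\zeta)) = w(F_i(\zeta), G_i(\zeta)), \]
\[ z_i = (z_{i1}(\zeta), z_{i2}(\zeta)) = z(F_i(\zeta), G_i(\zeta)). \tag{27} \]

Proposition 3.4 Let $\Phi_\tau : \mathbb{R}^n \to \mathbb{R}^n$ be defined as in (10). Then, for any $\zeta \in \mathbb{R}^n$,

\[ \partial_B \Phi_\tau(\zeta)^T \subseteq \nabla F(\zeta)(A(\zeta) - I) + \nabla G(\zeta)(B(\zeta) - I), \tag{28} \]

where $A(\zeta)$ and $B(\zeta)$ are possibly multivalued $n \times n$ block diagonal matrices whose $i$th blocks $A_i(\zeta)$ and $B_i(\zeta)$ for $i = 1, 2, \ldots, m$ have the following representation:

(a) If $(F_i(\zeta) - G_i(\zeta))^2 + \tau (F_i(\zeta) \circ G_i(\zeta)) \in \mathrm{int}(\mathcal{K}^{n_i})$, then

\[ A_i(\zeta) = L_{F_i + \frac{\tau - 2}{2} G_i} L_{z_i}^{-1} \quad \text{and} \quad B_i(\zeta) = L_{G_i + \frac{\tau - 2}{2} F_i} L_{z_i}^{-1}. \]

(b) If $(F_i(\zeta) - G_i(\zeta))^2 + \tau (F_i(\zeta) \circ G_i(\zeta)) \in \mathrm{bd}(\mathcal{K}^{n_i})$ and $(F_i(\zeta), G_i(\zeta)) \ne (0, 0)$, then

\begin{align*}
A_i(\zeta) &\in \Bigl\{ \frac{1}{2 \sqrt{2 w_{i1}}} \Bigl( L_{F_i} + \frac{\tau - 2}{2} L_{G_i} \Bigr) \begin{pmatrix} 1 & \bar{w}_{i2}^T \\ \bar{w}_{i2} & 4I - 3 \bar{w}_{i2} \bar{w}_{i2}^T \end{pmatrix} + \frac{1}{2} u_i (1, -\bar{w}_{i2}^T) \Bigr\}, \\
B_i(\zeta) &\in \Bigl\{ \frac{1}{2 \sqrt{2 w_{i1}}} \Bigl( L_{G_i} + \frac{\tau - 2}{2} L_{F_i} \Bigr) \begin{pmatrix} 1 & \bar{w}_{i2}^T \\ \bar{w}_{i2} & 4I - 3 \bar{w}_{i2} \bar{w}_{i2}^T \end{pmatrix} + \frac{1}{2} v_i (1, -\bar{w}_{i2}^T) \Bigr\}
\end{align*}

for some $u_i = (u_{i1}, u_{i2})$, $v_i = (v_{i1}, v_{i2}) \in \mathbb{R} \times \mathbb{R}^{n_i - 1}$ satisfying $|u_{i1}| \le \|u_{i2}\| \le 1$ and $|v_{i1}| \le \|v_{i2}\| \le 1$, where $\bar{w}_{i2} = w_{i2} / \|w_{i2}\|$.

(c) If $(F_i(\zeta), G_i(\zeta)) = (0, 0)$, then

\begin{align*}
A_i(\zeta) &\in \{ L_{\hat{u}_i} \} \cup \Bigl\{ \frac{1}{2} \xi_i (1, \bar{w}_{i2}^T) + \frac{1}{2} u_i (1, -\bar{w}_{i2}^T) + \begin{pmatrix} 0 & 2 s_{i2}^T (I - \bar{w}_{i2} \bar{w}_{i2}^T) \\ 0 & 2 s_{i1} (I - \bar{w}_{i2} \bar{w}_{i2}^T) \end{pmatrix} \Bigr\}, \\
B_i(\zeta) &\in \{ L_{\hat{v}_i} \} \cup \Bigl\{ \frac{1}{2} \eta_i (1, \bar{w}_{i2}^T) + \frac{1}{2} v_i (1, -\bar{w}_{i2}^T) + \begin{pmatrix} 0 & 2 \omega_{i2}^T (I - \bar{w}_{i2} \bar{w}_{i2}^T) \\ 0 & 2 \omega_{i1} (I - \bar{w}_{i2} \bar{w}_{i2}^T) \end{pmatrix} \Bigr\}
\end{align*}

for some $\hat{u}_i = (\hat{u}_{i1}, \hat{u}_{i2})$, $\hat{v}_i = (\hat{v}_{i1}, \hat{v}_{i2}) \in \mathbb{R} \times \mathbb{R}^{n_i - 1}$ satisfying $\|\hat{u}_i\|, \|\hat{v}_i\| \le 1$ and $\hat{u}_{i1} \hat{v}_{i2} + \hat{v}_{i1} \hat{u}_{i2} = 0$, some $u_i = (u_{i1}, u_{i2})$, $v_i = (v_{i1}, v_{i2})$, $\xi_i = (\xi_{i1}, \xi_{i2})$, $\eta_i = (\eta_{i1}, \eta_{i2}) \in \mathbb{R} \times \mathbb{R}^{n_i - 1}$ with $|u_{i1}| \le \|u_{i2}\| \le 1$, $|v_{i1}| \le \|v_{i2}\| \le 1$, $|\xi_{i1}| \le \|\xi_{i2}\| \le 1$ and $|\eta_{i1}| \le \|\eta_{i2}\| \le 1$, some $\bar{w}_{i2} \in \mathbb{R}^{n_i - 1}$ satisfying $\|\bar{w}_{i2}\| = 1$, and $s_i = (s_{i1}, s_{i2})$, $\omega_i = (\omega_{i1}, \omega_{i2}) \in \mathbb{R} \times \mathbb{R}^{n_i - 1}$ such that $\|s_i\|^2 + \|\omega_i\|^2 \le 1$.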

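Proof Let $\Phi_{\tau,i}(\zeta)$ denote the $i$th subvector of $\Phi_\tau$, i.e. $\Phi_{\tau,i}(\zeta) = \phi_\tau(F_i(\zeta), G_i(\zeta))$ for all $i = 1, 2, \ldots, m$. From Proposition 2.6.2 of [8], it follows that

\[ \partial_B \Phi_\tau(\zeta)^T \subseteq \partial_B \Phi_{\tau,1}(\zeta)^T \times \partial_B \Phi_{\tau,2}(\zeta)^T \times \cdots \times \partial_B \Phi_{\tau,m}(\zeta)^T, \tag{29} \]

where the latter denotes the set of all matrices whose $(n_{i-1} + 1)$th to $n_i$th columns, with $n_0 = 0$, belong to $\partial_B \Phi_{\tau,i}(\zeta)^T$. Using the definition of the B-subdifferential and the continuous differentiability of $F$ and $G$, it is not difficult to verify that

\[ \partial_B \Phi_{\tau,i}(\zeta)^T = [\nabla F_i(\zeta) \;\; \nabla G_i(\zeta)]\, \partial_B \phi_\tau(F_i(\zeta), G_i(\zeta))^T, \quad i = 1, 2, \ldots, m. \tag{30} \]

Using Proposition 3.2 and the last two equations, we get the desired result.

Proposition 3.5 For any $\zeta \in \mathbb{R}^n$, let $A(\zeta)$ and $B(\zeta)$ be the multivalued block diagonal matrices given as in Proposition 3.4. Then, for any $i \in \{1, 2, \ldots, m\}$,

\[ \langle (A_i(\zeta) - I) \Phi_{\tau,i}(\zeta), \; (B_i(\zeta) - I) \Phi_{\tau,i}(\zeta) \rangle \ge 0, \]

with equality holding if and only if $\Phi_{\tau,i}(\zeta) = 0$. In particular, for any index $i$ such that $(F_i(\zeta) - G_i(\zeta))^2 + \tau (F_i(\zeta) \circ G_i(\zeta)) \in \mathrm{int}(\mathcal{K}^{n_i})$, we have

\[ \langle (A_i(\zeta) - I) \upsilon_i, \; (B_i(\zeta) - I) \upsilon_i \rangle \ge 0 \quad \text{for any } \upsilon_i \in \mathbb{R}^{n_i}. \]

Proof From Theorem 2.6.6 of [8] and Proposition 3.1(d), we have that

\[ \nabla \psi_\tau(x, y) = \partial_B \phi_\tau(x, y)^T \phi_\tau(x, y). \]

Consequently, for any $i = 1, 2, \ldots, m$, it follows that

\[ \nabla \psi_\tau(F_i(\zeta), G_i(\zeta)) = \partial_B \phi_\tau(F_i(\zeta), G_i(\zeta))^T \phi_\tau(F_i(\zeta), G_i(\zeta)). \]

In addition, from Propositions 3.2 and 3.4, it is not hard to see that

\[ [A_i(\zeta)^T - I \;\; B_i(\zeta)^T - I] \in \partial_B \phi_\tau(F_i(\zeta), G_i(\zeta)). \]

Combining the last two equations yields that, for any $i = 1, 2, \ldots, m$,

\[ \nabla_x \psi_\tau(F_i(\zeta), G_i(\zeta)) = (A_i(\zeta) - I) \Phi_{\tau,i}(\zeta), \quad \nabla_y \psi_\tau(F_i(\zeta), G_i(\zeta)) = (B_i(\zeta) - I) \Phi_{\tau,i}(\zeta). \tag{31} \]

Consequently, the first part of the conclusions is a direct consequence of Proposition 4.1 of [4]. Notice that for any $i \in O(\zeta)$ and $\upsilon_i \in \mathbb{R}^{n_i}$,

\begin{align*}
\langle (A_i(\zeta) - I) \upsilon_i, \; (B_i(\zeta) - I) \upsilon_i \rangle
&= \bigl\langle (L_{F_i + \frac{\tau - 2}{2} G_i} - L_{z_i}) L_{z_i}^{-1} \upsilon_i, \; (L_{G_i + \frac{\tau - 2}{2} F_i} - L_{z_i}) L_{z_i}^{-1} \upsilon_i \bigr\rangle \\
&= \bigl\langle (L_{G_i + \frac{\tau - 2}{2} F_i} - L_{z_i})(L_{F_i + \frac{\tau - 2}{2} G_i} - L_{z_i}) L_{z_i}^{-1} \upsilon_i, \; L_{z_i}^{-1} \upsilon_i \bigr\rangle. \tag{32}
\end{align*}

Using the same argument as in Case (2) of [4, Proposition 4.1] then yields the second part.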

4 Nonsingularity conditions

In this section, we show that all elements of the B-subdifferential $\partial_B \Phi_\tau(\zeta)$ at a solution $\zeta^*$ of the SOCCP are nonsingular if $\zeta^*$ satisfies strict complementarity, i.e.,

\[ F_i(\zeta^*) + G_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i}) \quad \text{for all } i = 1, 2, \ldots, m. \tag{33} \]

First, we give a technical lemma which states that the multi-valued matrix $(A_i(\zeta^*) - I) + (B_i(\zeta^*) - I)$ is nonsingular if the $i$th block component satisfies strict complementarity.

Lemma 4.1 Let $\zeta^*$ be a solution of the SOCCP, and let $A(\zeta^*)$ and $B(\zeta^*)$ be the multivalued block diagonal matrices characterized by Proposition 3.4. Then, for any $i \in \{1, 2, \ldots, m\}$ such that $F_i(\zeta^*) + G_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i})$, we have that $\Phi_{\tau,i}(\zeta)$ is continuously differentiable at $\zeta^*$ and $(A_i(\zeta^*) - I) + (B_i(\zeta^*) - I)$ is nonsingular.


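Proof Since $\zeta^*$ is a solution of the SOCCP, we have for all $i = 1, 2, \ldots, m$

\[ F_i(\zeta^*) \in \mathcal{K}^{n_i}, \quad G_i(\zeta^*) \in \mathcal{K}^{n_i}, \quad \langle F_i(\zeta^*), G_i(\zeta^*) \rangle = 0. \]

It is not hard to verify that $F_i(\zeta^*) + G_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i})$ if and only if one of the three cases below holds.

Case (1) $F_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i})$ and $G_i(\zeta^*) = 0$. In this case,

\[ w_i(\zeta^*) = (F_i(\zeta^*) - G_i(\zeta^*))^2 + \tau (F_i(\zeta^*) \circ G_i(\zeta^*)) = F_i(\zeta^*)^2 \in \mathrm{int}(\mathcal{K}^{n_i}). \]

By Proposition 3.1(a), $\Phi_{\tau,i}(\zeta)$ is continuously differentiable at $\zeta^*$. Since $z_i(\zeta^*) = w_i(\zeta^*)^{1/2} = F_i(\zeta^*)$, from Proposition 3.4(a) it follows that

\[ A_i(\zeta^*) = I \quad \text{and} \quad B_i(\zeta^*) = \frac{\tau - 2}{2} I, \]

which implies that $(A_i(\zeta^*) - I) + (B_i(\zeta^*) - I)$ is nonsingular since $0 < \tau < 4$.

Case (2) $F_i(\zeta^*) = 0$ and $G_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i})$. Now, $w_i(\zeta^*) = G_i(\zeta^*)^2 \in \mathrm{int}(\mathcal{K}^{n_i})$. So, $\Phi_{\tau,i}(\zeta)$ is continuously differentiable at $\zeta^*$ by Proposition 3.1(a). Since $z_i(\zeta^*) = w_i(\zeta^*)^{1/2} = G_i(\zeta^*)$, applying Proposition 3.4(a) yields that

\[ A_i(\zeta^*) = \frac{\tau - 2}{2} I \quad \text{and} \quad B_i(\zeta^*) = I, \]

which immediately implies that $(A_i(\zeta^*) - I) + (B_i(\zeta^*) - I)$ is nonsingular.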
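The nonsingularity claims in Cases (1) and (2) can be made explicit by a short computation, added here for illustration. In Case (1), $z_i(\zeta^*) = F_i(\zeta^*)$ and $G_i(\zeta^*) = 0$, so the formulas of Proposition 3.4(a) collapse to multiples of the identity; Case (2) is symmetric.

```latex
\begin{align*}
\text{Case (1):}\quad
  A_i(\zeta^*) &= L_{F_i} L_{F_i}^{-1} = I, \qquad
  B_i(\zeta^*) = L_{\frac{\tau-2}{2} F_i} L_{F_i}^{-1} = \tfrac{\tau-2}{2}\, I, \\
  (A_i(\zeta^*) - I) + (B_i(\zeta^*) - I)
  &= 0 + \Bigl(\tfrac{\tau-2}{2} - 1\Bigr) I = \tfrac{\tau-4}{2}\, I ; \\[4pt]
\text{Case (2):}\quad
  (A_i(\zeta^*) - I) + (B_i(\zeta^*) - I)
  &= \Bigl(\tfrac{\tau-2}{2} - 1\Bigr) I + 0 = \tfrac{\tau-4}{2}\, I .
\end{align*}
```

In both cases the sum is the scalar matrix $\frac{\tau - 4}{2} I$, which is nonsingular because $\tau < 4$.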

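Case (3) $F_i(\zeta^*) \in \mathrm{bd}^+(\mathcal{K}^{n_i})$ and $G_i(\zeta^*) \in \mathrm{bd}^+(\mathcal{K}^{n_i})$, where $\mathrm{bd}^+(\mathcal{K}^{n_i}) := \mathrm{bd}(\mathcal{K}^{n_i}) \setminus \{0\}$.

By Proposition 3.1(a), it suffices to prove $w_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i})$. Suppose that $w_i(\zeta^*) \in \mathrm{bd}(\mathcal{K}^{n_i})$. Then, from (18) in Lemma 2.2, it follows that

\[ F_{i1}(\zeta^*) G_{i1}(\zeta^*) = F_{i2}(\zeta^*)^T G_{i2}(\zeta^*). \]

Since $F_{i1}(\zeta^*) = \|F_{i2}(\zeta^*)\| \ne 0$ and $G_{i1}(\zeta^*) = \|G_{i2}(\zeta^*)\| \ne 0$, we have

\[ \|F_{i2}(\zeta^*)\| \cdot \|G_{i2}(\zeta^*)\| = F_{i2}(\zeta^*)^T G_{i2}(\zeta^*), \]

which implies that $F_{i2}(\zeta^*) = \alpha G_{i2}(\zeta^*)$ for some constant $\alpha > 0$. Consequently, $F_i(\zeta^*) = \alpha G_i(\zeta^*)$. Noting that $\langle F_i(\zeta^*), G_i(\zeta^*) \rangle = 0$, we then get $F_i(\zeta^*) = G_i(\zeta^*) = 0$. This clearly contradicts the assumptions that $F_i(\zeta^*) \ne 0$ and $G_i(\zeta^*) \ne 0$. So, $w_i(\zeta^*) \in \mathrm{int}(\mathcal{K}^{n_i})$.

From the expressions of $A_i(\zeta)$ and $B_i(\zeta)$ given by Proposition 3.4(a),

\[ (A_i(\zeta^*) - I) + (B_i(\zeta^*) - I) = -L_{2 z_i(\zeta^*) - \frac{\tau}{2} (F_i(\zeta^*) + G_i(\zeta^*))}\, L_{z_i(\zeta^*)}^{-1}. \]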