Applied Mathematics and Optimization, vol. 59, pp. 293-318, 2009

### A damped Gauss-Newton method for the second-order cone complementarity problem

Shaohua Pan^{1}

School of Mathematical Sciences
South China University of Technology

Guangzhou 510640, China

Jein-Shan Chen ^{2}
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan

June 5, 2007

(revised January 18, 2008) (final version June 25, 2008)

Abstract. We investigate some properties related to the generalized Newton method
for the Fischer-Burmeister (FB) function over second-order cones, which allows us to
reformulate the second-order cone complementarity problem (SOCCP) as a semismooth
system of equations. Specifically, we characterize the B-subdifferential of the FB function
at a general point and study the condition for every element of the B-subdifferential at a
solution being nonsingular. In addition, for the induced FB merit function, we establish
its coerciveness and provide a weaker condition than [7] for each stationary point to be
a solution, under suitable Cartesian $P$-properties of the involved mapping. By this, a
damped Gauss-Newton method is proposed and the global and superlinear convergence
results are obtained. Numerical results are reported for the second-order cone programs
from the DIMACS library, which verify the good theoretical properties of the method.

Key words: second-order cones; complementarity; Fischer-Burmeister function; B-subdifferential; generalized Newton method.

^{1}The author’s work is partially supported by the Doctoral Starting-up Foundation (B13B6050640) of GuangDong Province. E-mail: shhpan@scut.edu.cn.

^{2}Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342.

### 1 Introduction

Consider the following conic complementarity problem of finding $\zeta \in \mathbb{R}^n$ such that

$$F(\zeta) \in K, \quad G(\zeta) \in K, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ represents the Euclidean inner product, $F, G : \mathbb{R}^n \to \mathbb{R}^m$ are mappings assumed to be continuously differentiable throughout this paper, and $K$ is the Cartesian product of second-order cones (SOCs), also called Lorentz cones. In other words,

$$K = K^{n_1} \times K^{n_2} \times \cdots \times K^{n_q}, \tag{2}$$

where $q, n_1, \ldots, n_q \ge 1$, $n_1 + \cdots + n_q = m$ and

$$K^{n_i} := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_1 \ge \|x_2\| \right\},$$

with $\|\cdot\|$ denoting the Euclidean norm and $K^1$ denoting the set of nonnegative reals $\mathbb{R}_+$. We will refer to (1)–(2) as the second-order cone complementarity problem (SOCCP).
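As a small illustration of the cone structure in (2), the following Python sketch tests membership in a single second-order cone and in a Cartesian product of such cones. The function names `in_soc` and `in_product_cone` and the tolerance `tol` are illustrative assumptions, not part of the paper.

```python
import math


def in_soc(x, tol=1e-12):
    """Return True if x = (x1, x2) satisfies x1 >= ||x2|| (up to tol)."""
    x1, x2 = x[0], x[1:]
    return x1 >= math.sqrt(sum(t * t for t in x2)) - tol


def in_product_cone(x, dims, tol=1e-12):
    """Check membership in K^{n_1} x ... x K^{n_q}, with dims = [n_1, ..., n_q]."""
    if sum(dims) != len(x):
        raise ValueError("block sizes must sum to len(x)")
    start = 0
    for n in dims:
        # Each block of length n is tested against its own cone K^n.
        if not in_soc(x[start:start + n], tol):
            return False
        start += n
    return True
```

For a block of size one, the empty tail has norm zero, so the test reduces to nonnegativity, in line with the convention that the one-dimensional cone is the set of nonnegative reals.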

Corresponding to the Cartesian structure of $K$, in the rest of this paper, we always write $F = (F_1, \ldots, F_q)$ and $G = (G_1, \ldots, G_q)$ with $F_i, G_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$.

An important special case of the SOCCP corresponds to $n = m$ and $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$. Then (1) and (2) reduce to

$$F(\zeta) \in K, \quad \zeta \in K, \quad \langle F(\zeta), \zeta \rangle = 0, \tag{3}$$

which is a natural extension of the nonlinear complementarity problem (NCP) over the nonnegative orthant cone $\mathbb{R}^n_+$. Another special case corresponds to the Karush-Kuhn-Tucker (KKT) conditions for the convex second-order cone program (CSOCP):

$$\min \; g(x) \quad \text{s.t.} \quad Ax = b, \quad x \in K, \tag{4}$$

where $A \in \mathbb{R}^{p \times m}$ has full row rank, $b \in \mathbb{R}^p$ and $g : \mathbb{R}^m \to \mathbb{R}$ is a twice continuously differentiable convex function. From [7], the KKT conditions of (4), which are sufficient but not necessary for optimality, can be rewritten in the form of (1) with $n = m$ and

$$F(\zeta) := \hat{x} + (I - A^T (AA^T)^{-1} A)\zeta, \qquad G(\zeta) := \nabla g(F(\zeta)) - A^T (AA^T)^{-1} A\zeta, \tag{5}$$

where $\hat{x} \in \mathbb{R}^n$ is any vector satisfying $A\hat{x} = b$. When $g$ is a linear function, (4) reduces to the standard second-order cone program which has extensive applications in engineering design, finance, control, and robust optimization; see [1, 14] and references therein.

There have been many methods proposed for solving SOCPs and SOCCPs. They include the interior-point methods [1, 2, 14, 16, 24, 26], the non-interior smoothing Newton methods [6, 11], the smoothing-regularization method [13], and the merit function approach [7]. Among these, the last three kinds of methods are all based on an SOC complementarity function. Specifically, a mapping $\phi : \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l$ is called an SOC complementarity function associated with the cone $K^l$ ($l \ge 1$) if

$$\phi(x, y) = 0 \iff x \in K^l, \; y \in K^l, \; \langle x, y \rangle = 0. \tag{6}$$

A popular choice of $\phi$ is the vector-valued Fischer-Burmeister (FB) function, defined by

$$\phi(x, y) := (x^2 + y^2)^{1/2} - (x + y) \quad \forall x, y \in \mathbb{R}^l, \tag{7}$$

where $x^2 = x \circ x$ denotes the Jordan product of $x$ and itself, $x^{1/2}$ denotes a vector such that $(x^{1/2})^2 = x$, and $x + y$ means the usual componentwise addition of vectors. From the next section, we see that $\phi$ in (7) is well-defined for all $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$. The function was shown in [11] to satisfy the equivalence (6), and therefore its squared norm

$$\psi(x, y) := \frac{1}{2} \|\phi(x, y)\|^2 \tag{8}$$

is a merit function for the SOCCP, i.e., $\psi(x, y) = 0$ if and only if $x \in K^l$, $y \in K^l$ and $\langle x, y \rangle = 0$. The functions $\phi$ and $\psi$ were studied in the literature [7, 21], where $\psi$ was shown to be continuously differentiable everywhere by Chen and Tseng [7] and $\phi$ was proved to be strongly semismooth by D. Sun and J. Sun [21].

In view of the characterization in (6), clearly, the SOCCP can be reformulated as the following nonsmooth system of equations:

$$\Phi(\zeta) := \begin{pmatrix} \phi(F_1(\zeta), G_1(\zeta)) \\ \vdots \\ \phi(F_i(\zeta), G_i(\zeta)) \\ \vdots \\ \phi(F_q(\zeta), G_q(\zeta)) \end{pmatrix} = 0, \tag{9}$$

where $\phi$ is defined as in (7) with a suitable dimension $l$. By Corollary 3.3 of [21], it is not hard to show that the operator $\Phi : \mathbb{R}^n \to \mathbb{R}^m$ in (9) is semismooth. Furthermore, from Proposition 2 of [7], its squared norm induces a smooth merit function, given by

$$\Psi(\zeta) := \frac{1}{2} \|\Phi(\zeta)\|^2 = \sum_{i=1}^{q} \psi(F_i(\zeta), G_i(\zeta)). \tag{10}$$

In this paper, we mainly characterize the B-subdifferential of $\phi$ at a general point and present an estimate for the B-subdifferential of $\Phi$. By this, a condition is given to guarantee every element of the B-subdifferential of $\Phi$ at a solution to be nonsingular, which plays an important role in the local convergence analysis of nonsmooth Newton methods for the SOCCP. In addition, two important results are also presented for the merit function $\Psi(\zeta)$. One of them shows that each stationary point of $\Psi$ is a solution of the SOCCP under a weaker condition than the one used by [7], and the other establishes the coerciveness of $\Psi$ for the SOCCP (3) under the uniform Cartesian $P$-property of $F$. Based on these results, we finally propose a damped Gauss-Newton method by applying the generalized Newton method [19, 20] for the system (9), and analyze its global and superlinear (quadratic) convergence. Numerical results are reported for the SOCPs from the DIMACS library [18], which verify the good theoretical properties of the method.

Throughout this paper, $I$ represents an identity matrix of suitable dimension, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors, and $\mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_q}$ is identified with $\mathbb{R}^{n_1 + \cdots + n_q}$. Thus, $(x_1, \ldots, x_q) \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_q}$ is viewed as a column vector in $\mathbb{R}^{n_1 + \cdots + n_q}$. For any differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^m$, the notation $\nabla F(x) \in \mathbb{R}^{n \times m}$ denotes the transpose of the Jacobian $F'(x)$. For a symmetric matrix $A$, we write $A \succ O$ (respectively, $A \succeq O$) if $A$ is positive definite (respectively, positive semidefinite). Given a finite number of square matrices $Q_1, \ldots, Q_q$, we denote the block diagonal matrix with these matrices as block diagonals by $\mathrm{diag}(Q_1, \ldots, Q_q)$ or by $\mathrm{diag}(Q_i, i = 1, \ldots, q)$. If $J$ and $B$ are index sets such that $J, B \subseteq \{1, 2, \ldots, q\}$, we denote by $P_{JB}$ the block matrix consisting of the sub-matrices $P_{jk} \in \mathbb{R}^{n_j \times n_k}$ of $P$ with $j \in J$, $k \in B$, and denote by $x_B$ a vector consisting of sub-vectors $x_i \in \mathbb{R}^{n_i}$ with $i \in B$.

### 2 Preliminaries

This section recalls some background materials and preliminary results that will be used in the subsequent sections. We start with the interior and the boundary of $K^l$ ($l > 1$). It is known that $K^l$ is a closed convex self-dual cone with nonempty interior given by

$$\mathrm{int}(K^l) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 > \|x_2\| \right\}$$

and the boundary given by

$$\mathrm{bd}(K^l) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 = \|x_2\| \right\}.$$

For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, we define their Jordan product [9] as

$$x \circ y := (\langle x, y \rangle, \; x_1 y_2 + y_1 x_2).$$

The Jordan product “$\circ$”, unlike scalar or matrix multiplication, is not associative, which is the main source of complication in the analysis of SOCCP. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^l$. For each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, define the matrix $L_x$ by

$$L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix},$$

which can be viewed as a linear mapping from $\mathbb{R}^l$ to $\mathbb{R}^l$ with the following properties.

^{l}*Property 2.1 (a) L*_{x}*y = x ◦ y and L*_{x+y}*= L*_{x}*+ L*_{y}*for any y ∈ IR*^{l}*.*
*(b) x ∈ K*^{l}*⇐⇒ L**x* *º O and x ∈ int(K*^{l}*) ⇐⇒ L**x* *Â O.*

*(c) L*_{x}*is invertible whenever x ∈ int(K*^{l}*) with the inverse L*^{−1}_{x}*given by*

*L*^{−1}* _{x}* = 1

*det(x)*

*x*1 *−x*^{T}_{2}

*−x*_{2} *det(x)*

*x*_{1} *I +* *x*_{2}*x*^{T}_{2}
*x*_{1}

* ,* (11)

*where det(x) := x*^{2}_{1}*− kx*_{2}*k*^{2} *denotes the determinant of x.*
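The Jordan product, the matrix $L_x$ and the inverse formula (11) can be checked numerically. The following is a minimal pure-Python sketch; the helper names `jordan`, `arrow`, `arrow_inv` and `matvec` are hypothetical and chosen only for illustration.

```python
def jordan(x, y):
    """Jordan product x o y = (<x, y>, x1*y2 + y1*x2)."""
    head = sum(a * b for a, b in zip(x, y))
    tail = [x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:])]
    return [head] + tail


def arrow(x):
    """Matrix L_x as a list of rows, so that L_x y = x o y."""
    l = len(x)
    rows = [list(x)]  # first row is (x1, x2^T)
    for i in range(1, l):
        row = [0.0] * l
        row[0], row[i] = x[i], x[0]  # remaining rows are (x2, x1*I)
        rows.append(row)
    return rows


def arrow_inv(x):
    """Inverse of L_x from formula (11); requires x in int(K^l)."""
    l = len(x)
    x1, x2 = x[0], x[1:]
    det = x1 * x1 - sum(t * t for t in x2)
    if not (x1 > 0 and det > 0):
        raise ValueError("x must lie in the interior of the cone")
    rows = [[x1 / det] + [-t / det for t in x2]]
    for i in range(l - 1):
        row = [-x2[i] / det]
        for j in range(l - 1):
            entry = x2[i] * x2[j] / x1
            if i == j:
                entry += det / x1
            row.append(entry / det)
        rows.append(row)
    return rows


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]
```

With $a = (0, 1, 0)$ and $c = (0, 0, 1)$ one finds $(a \circ a) \circ c = c$ while $a \circ (a \circ c) = 0$, a concrete instance of the non-associativity mentioned above.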

In the following, we recall from [9, 11] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ admits a spectral factorization, associated with $K^l$, of the form

$$x = \lambda_1(x) \cdot u_x^{(1)} + \lambda_2(x) \cdot u_x^{(2)},$$

where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are the spectral values and the associated spectral vectors of $x$, respectively, defined by

$$\lambda_i(x) = x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} = \frac{1}{2} \left( 1, \; (-1)^i \bar{x}_2 \right), \qquad i = 1, 2,$$

with $\bar{x}_2 = x_2 / \|x_2\|$ if $x_2 \ne 0$ and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{l-1}$ satisfying $\|\bar{x}_2\| = 1$. If $x_2 \ne 0$, the factorization is unique. The spectral factorizations of $x$, $x^2$ and $x^{1/2}$ have various interesting properties, and some of them are summarized as follows.

Property 2.2 For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, let $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ be the spectral values and the associated spectral vectors. Then, the following results hold.

(a) $x \in K^l \iff 0 \le \lambda_1(x) \le \lambda_2(x)$ and $x \in \mathrm{int}(K^l) \iff 0 < \lambda_1(x) \le \lambda_2(x)$.

(b) $x^2 = [\lambda_1(x)]^2 u_x^{(1)} + [\lambda_2(x)]^2 u_x^{(2)} \in K^l$ for any $x \in \mathbb{R}^l$.

(c) If $x \in K^l$, then $x^{1/2} = \sqrt{\lambda_1(x)}\, u_x^{(1)} + \sqrt{\lambda_2(x)}\, u_x^{(2)} \in K^l$.
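The spectral factorization and Property 2.2 (b), (c) translate directly into code. The sketch below is illustrative only: the names `spectral`, `soc_apply`, `soc_square` and `soc_sqrt` are assumptions, and when $x_2 = 0$ the code simply picks the first coordinate axis as the unit vector $\bar{x}_2$, which is one admissible choice among many.

```python
import math


def spectral(x):
    """Return ((lam1, lam2), (u1, u2)) with x = lam1*u1 + lam2*u2."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    if nrm > 0.0:
        xbar = [t / nrm for t in x2]
    else:
        # x2 = 0: any unit vector is admissible; pick the first axis.
        xbar = [1.0] + [0.0] * (len(x2) - 1) if x2 else []
    lam = (x1 - nrm, x1 + nrm)
    u = ([0.5] + [-0.5 * t for t in xbar], [0.5] + [0.5 * t for t in xbar])
    return lam, u


def soc_apply(x, f):
    """Apply a scalar function f to the spectral values of x."""
    (l1, l2), (u1, u2) = spectral(x)
    return [f(l1) * a + f(l2) * b for a, b in zip(u1, u2)]


def soc_square(x):
    """x^2 via Property 2.2(b)."""
    return soc_apply(x, lambda t: t * t)


def soc_sqrt(x):
    """x^{1/2} via Property 2.2(c); math.sqrt raises if lambda_1(x) < 0."""
    return soc_apply(x, math.sqrt)
```

For example, with $x = (3, 1, 2)$ the square computed this way agrees with the Jordan product $x \circ x = (14, 6, 12)$, and taking the square root recovers $x$ because $x \in K^3$.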

Now we recall the concepts of the B-subdifferential and (strong) semismoothness. Given a mapping $H : \mathbb{R}^n \to \mathbb{R}^m$, if $H$ is locally Lipschitz continuous, then the set

$$\partial_B H(z) := \left\{ V \in \mathbb{R}^{m \times n} \mid \exists \{z^k\} \subseteq D_H : z^k \to z, \; H'(z^k) \to V \right\}$$

is nonempty and is called the B-subdifferential of $H$ at $z$, where $D_H \subseteq \mathbb{R}^n$ denotes the set of points at which $H$ is differentiable. The convex hull $\partial H(z) := \mathrm{conv}\, \partial_B H(z)$ is the generalized Jacobian of Clarke [4]. Semismoothness was originally introduced by Mifflin [15] for functionals. Smooth functions, convex functionals, and piecewise linear functions are examples of semismooth functions. Later, Qi and Sun [19] extended the definition of semismooth functions to a mapping $H : \mathbb{R}^n \to \mathbb{R}^m$. $H$ is called semismooth at $x$ if $H$ is directionally differentiable at $x$ and for all $V \in \partial H(x + h)$ and $h \to 0$,

$$V h - H'(x; h) = o(\|h\|);$$

$H$ is called strongly semismooth at $x$ if $H$ is semismooth at $x$ and for all $V \in \partial H(x + h)$ and $h \to 0$,

$$V h - H'(x; h) = O(\|h\|^2);$$

$H$ is called (strongly) semismooth if it is (strongly) semismooth everywhere. Here, $o(\|h\|)$ means a function $\alpha : \mathbb{R}^n \to \mathbb{R}^m$ satisfying $\lim_{h \to 0} \alpha(h)/\|h\| = 0$, while $O(\|h\|^2)$ denotes a function $\alpha : \mathbb{R}^n \to \mathbb{R}^m$ satisfying $\|\alpha(h)\| \le C\|h\|^2$ for all $\|h\| \le \delta$ and some $C > 0$, $\delta > 0$.

Next, we present the definitions of Cartesian $P$-properties for a matrix $M \in \mathbb{R}^{m \times m}$, which are special cases of those introduced by Chen and Qi [5] for a linear transformation.

Definition 2.1 A matrix $M \in \mathbb{R}^{m \times m}$ is said to have

(a) the Cartesian $P$-property if for any $0 \ne x = (x_1, \ldots, x_q) \in \mathbb{R}^m$ with $x_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that $\langle x_\nu, (Mx)_\nu \rangle > 0$;

(b) the Cartesian $P_0$-property if for any $0 \ne x = (x_1, \ldots, x_q) \in \mathbb{R}^m$ with $x_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that $x_\nu \ne 0$ and $\langle x_\nu, (Mx)_\nu \rangle \ge 0$.

Some nonlinear generalizations of these concepts in the setting of $K$ are defined as follows.

Definition 2.2 Given a mapping $F = (F_1, \ldots, F_q)$ with $F_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$, $F$ is said to

(a) have the uniform Cartesian $P$-property if for any $x = (x_1, \ldots, x_q), y = (y_1, \ldots, y_q) \in \mathbb{R}^m$, there is an index $\nu \in \{1, 2, \ldots, q\}$ and a positive constant $\rho > 0$ such that

$$\langle x_\nu - y_\nu, F_\nu(x) - F_\nu(y) \rangle \ge \rho \|x - y\|^2;$$

(b) have the Cartesian $P_0$-property if for any $x = (x_1, \ldots, x_q), y = (y_1, \ldots, y_q) \in \mathbb{R}^m$ and $x \ne y$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that

$$x_\nu \ne y_\nu \quad \text{and} \quad \langle x_\nu - y_\nu, F_\nu(x) - F_\nu(y) \rangle \ge 0.$$

From the above definitions, if a continuously differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^n$ has the uniform Cartesian $P$-property (Cartesian $P_0$-property), then $\nabla F(x)$ at any $x \in \mathbb{R}^n$ enjoys the Cartesian $P$-property (Cartesian $P_0$-property). In addition, we may see that, when $n_1 = \cdots = n_q = 1$, the above concepts reduce to the definitions of $P$-matrices and $P$-functions, respectively, for the NCP.

Finally, we introduce some notations which will be used in the rest of this paper. For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, we define $w, z : \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l$ by

$$\begin{aligned} w = (w_1, w_2) = (w_1(x, y), w_2(x, y)) = w(x, y) &:= x^2 + y^2, \\ z = (z_1, z_2) = (z_1(x, y), z_2(x, y)) = z(x, y) &:= (x^2 + y^2)^{1/2}. \end{aligned} \tag{12}$$

Clearly, $w \in K^l$ with $w_1 = \|x\|^2 + \|y\|^2$ and $w_2 = 2(x_1 x_2 + y_1 y_2)$. Let $\bar{w}_2 = w_2/\|w_2\|$ if $w_2 \ne 0$, and otherwise $\bar{w}_2$ be any vector in $\mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$. Then, using Property 2.2 (b) and (c), it is not hard to compute that

$$z = \left( \frac{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}{2}, \; \frac{\sqrt{\lambda_2(w)} - \sqrt{\lambda_1(w)}}{2}\, \bar{w}_2 \right) \in K^l.$$
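With $z$ available in closed form, the FB function $\phi(x, y) = z - (x + y)$ from (7) and the merit function $\psi$ from (8) can be evaluated without any matrix factorization. The following is a minimal sketch under that reading; the names `fb` and `merit` are hypothetical, and the clipping of $\lambda_1(w)$ at zero is an added safeguard against rounding, not something stated in the paper.

```python
import math


def fb(x, y):
    """Fischer-Burmeister function phi(x, y) = (x^2 + y^2)^{1/2} - (x + y)."""
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * a + y[0] * b) for a, b in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    # w lies in K^l, so lambda_1(w) >= 0 up to rounding; clip tiny negatives.
    r1 = math.sqrt(max(w1 - nw2, 0.0))
    r2 = math.sqrt(w1 + nw2)
    z1 = 0.5 * (r2 + r1)
    if nw2 > 0.0:
        z2 = [0.5 * (r2 - r1) * t / nw2 for t in w2]
    else:
        # r1 == r2 here, so the choice of the unit vector w2-bar is irrelevant.
        z2 = [0.0] * len(w2)
    z = [z1] + z2
    return [zi - xi - yi for zi, xi, yi in zip(z, x, y)]


def merit(x, y):
    """Merit function psi(x, y) = 0.5 * ||phi(x, y)||^2."""
    return 0.5 * sum(t * t for t in fb(x, y))
```

For the complementary pair $x = (1, 1, 0)$, $y = (1, -1, 0)$ the merit value vanishes, whereas for $x = y = (1, 0, 0)$, which violates $\langle x, y \rangle = 0$, it is strictly positive.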

### 3 B-subdifferential of the FB Function

In this section, we characterize the B-subdifferential of the FB function $\phi$ at a general point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$. For this purpose, we need several important technical lemmas.

The first lemma characterizes the set of the points where $z(x, y)$ is differentiable. Since the proof is direct by [3, Proposition 4] and formula (11), we here omit it.

Lemma 3.1 The function $z(x, y)$ in (12) is continuously differentiable at a point $(x, y)$ if and only if $x^2 + y^2 \in \mathrm{int}(K^l)$. Moreover, $\nabla_x z(x, y) = L_x L_z^{-1}$ and $\nabla_y z(x, y) = L_y L_z^{-1}$, where $L_z^{-1} = (1/\sqrt{w_1}) I$ if $w_2 = 0$, and otherwise

$$L_z^{-1} = \begin{pmatrix} b & c\, \bar{w}_2^T \\ c\, \bar{w}_2 & a I + (b - a) \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{13}$$

with

$$a = \frac{2}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}, \quad b = \frac{1}{2} \left( \frac{1}{\sqrt{\lambda_2(w)}} + \frac{1}{\sqrt{\lambda_1(w)}} \right), \quad c = \frac{1}{2} \left( \frac{1}{\sqrt{\lambda_2(w)}} - \frac{1}{\sqrt{\lambda_1(w)}} \right).$$
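Formula (13) can be sanity-checked by building $L_z$ and the claimed inverse and multiplying them. The sketch below does this in pure Python for a point with $w \in \mathrm{int}(K^l)$ and $w_2 \ne 0$; the function names `lz_and_inverse` and `matmul` are illustrative assumptions.

```python
import math


def lz_and_inverse(x, y):
    """Return (L_z, L_z^{-1}) for z = (x^2 + y^2)^{1/2}, using (12) and (13).

    Assumes w = x^2 + y^2 lies in int(K^l) and w_2 != 0.
    """
    l = len(x)
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * a + y[0] * b) for a, b in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    lam1, lam2 = w1 - nw2, w1 + nw2
    if lam1 <= 0.0 or nw2 == 0.0:
        raise ValueError("need w in int(K^l) with w_2 != 0")
    r1, r2 = math.sqrt(lam1), math.sqrt(lam2)
    wbar = [t / nw2 for t in w2]
    z1 = 0.5 * (r2 + r1)
    z2 = [0.5 * (r2 - r1) * t for t in wbar]
    # Scalars a, b, c exactly as in Lemma 3.1.
    a = 2.0 / (r2 + r1)
    b = 0.5 * (1.0 / r2 + 1.0 / r1)
    c = 0.5 * (1.0 / r2 - 1.0 / r1)
    Lz = [[z1] + z2]
    Linv = [[b] + [c * t for t in wbar]]
    for i in range(l - 1):
        Lz.append([z2[i]] + [z1 if j == i else 0.0 for j in range(l - 1)])
        Linv.append([c * wbar[i]] + [
            (a if j == i else 0.0) + (b - a) * wbar[i] * wbar[j]
            for j in range(l - 1)
        ])
    return Lz, Linv


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]
```

For instance, with $x = (1, 2, 0)$ and $y = (3, 0, 1)$ one has $w_1 = 15$ and $w_2 = (4, 6)$, so $\lambda_1(w) > 0$, and the product of the two returned matrices should be the identity up to rounding.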

The following two lemmas extend the results of Lemmas 2 and 3 of [7], respectively. Since the proofs are direct by using the same technique as [7], we here omit them.

Lemma 3.2 For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ with $w = x^2 + y^2 \in \mathrm{bd}(K^l)$, we have

$$x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2.$$

If, in addition, $w_2 \ne 0$, then $\|w\|^2 = 2w_1^2 = 2\|w_2\|^2 = 4(x_1^2 + y_1^2) \ne 0$ and

$$x_1 \bar{w}_2 = x_2, \quad x_2^T \bar{w}_2 = x_1, \quad y_1 \bar{w}_2 = y_2, \quad y_2^T \bar{w}_2 = y_1.$$

Lemma 3.3 For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ with $w_2 = 2(x_1 x_2 + y_1 y_2) \ne 0$, there holds that

$$\left( x_1 + (-1)^i x_2^T \bar{w}_2 \right)^2 \le \left\| x_2 + (-1)^i x_1 \bar{w}_2 \right\|^2 \le \lambda_i(w) \quad \text{for } i = 1, 2.$$
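The chain of inequalities in Lemma 3.3 is easy to probe numerically. The following sketch samples random points and checks both inequalities for $i = 1, 2$; it is only a plausibility check under floating-point tolerance, not a proof, and the names `lemma_3_3_holds` and `sample_check` as well as the tolerance and sampling range are assumptions made for illustration.

```python
import math
import random


def lemma_3_3_holds(x, y, tol=1e-9):
    """Check both inequalities of Lemma 3.3 for i = 1, 2 at the point (x, y)."""
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * a + y[0] * b) for a, b in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    if nw2 == 0.0:
        raise ValueError("Lemma 3.3 assumes w_2 != 0")
    wbar = [t / nw2 for t in w2]
    x1, x2 = x[0], x[1:]
    dot = sum(a * b for a, b in zip(x2, wbar))  # x2^T w2-bar
    for i in (1, 2):
        sign = (-1.0) ** i
        left = (x1 + sign * dot) ** 2
        middle = sum((a + sign * x1 * b) ** 2 for a, b in zip(x2, wbar))
        lam_i = w1 + sign * nw2  # spectral value lambda_i(w)
        if not (left <= middle + tol and middle <= lam_i + tol):
            return False
    return True


def sample_check(trials=1000, l=4, seed=0):
    """Test the lemma on random points drawn uniformly from [-1, 1]^l."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(l)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(l)]
        if not lemma_3_3_holds(x, y):
            return False
    return True
```

The boundary point $x = (1, 1, 0)$, $y = 0$ is an instructive case: there $\lambda_1(w) = 0$ and all three quantities for $i = 1$ vanish, so the bound is attained.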

Based on Lemmas 3.1–3.3, we are now in a position to present the representation for the elements of the B-subdifferential $\partial_B \phi(x, y)$ at a general point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$.

Proposition 3.1 Given a general point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$, each element in $\partial_B \phi(x, y)$ is given by $[V_x - I \;\; V_y - I]$ with $V_x$ and $V_y$ having the following representation:

(a) If $x^2 + y^2 \in \mathrm{int}(K^l)$, then $V_x = L_z^{-1} L_x$ and $V_y = L_z^{-1} L_y$.

(b) If $x^2 + y^2 \in \mathrm{bd}(K^l)$ and $(x, y) \ne (0, 0)$, then

$$\begin{aligned}
V_x &\in \left\{ \frac{1}{2\sqrt{2w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3\bar{w}_2 \bar{w}_2^T \end{pmatrix} L_x + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T \right\}, \\
V_y &\in \left\{ \frac{1}{2\sqrt{2w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3\bar{w}_2 \bar{w}_2^T \end{pmatrix} L_y + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \right\}
\end{aligned} \tag{14}$$

for some $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$, where $\bar{w}_2 = w_2/\|w_2\|$.

(c) If $(x, y) = (0, 0)$, then $V_x \in \{L_{\hat{x}}\}$, $V_y \in \{L_{\hat{y}}\}$ for some $\hat{x}, \hat{y}$ with $\|\hat{x}\|^2 + \|\hat{y}\|^2 = 1$, or

$$\begin{aligned}
V_x &\in \left\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} \right\}, \\
V_y &\in \left\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} \right\}
\end{aligned} \tag{15}$$

for some $u = (u_1, u_2), v = (v_1, v_2), \xi = (\xi_1, \xi_2), \eta = (\eta_1, \eta_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ such that $|u_1| \le \|u_2\| \le 1$, $|v_1| \le \|v_2\| \le 1$, $|\xi_1| \le \|\xi_2\| \le 1$, $|\eta_1| \le \|\eta_2\| \le 1$, $\bar{w}_2 \in \mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$, and $s = (s_1, s_2), \omega = (\omega_1, \omega_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $\|s\|^2 + \|\omega\|^2 \le 1/2$.

Proof. Let $D_\phi$ denote the set of points where $\phi$ is differentiable. Recall that this set is characterized by Lemma 3.1 since $\phi(x, y) = z(x, y) - (x + y)$, and moreover,

$$\phi'_x(x, y) = L_z^{-1} L_x - I, \quad \phi'_y(x, y) = L_z^{-1} L_y - I \quad \forall (x, y) \in D_\phi.$$

(a) In this case, $\phi$ is continuously differentiable at $(x, y)$ by Lemma 3.1. Hence, $\partial_B \phi(x, y)$ consists of a single element, i.e. $\phi'(x, y) = [L_z^{-1} L_x - I \;\; L_z^{-1} L_y - I]$, and the result is clear.

(b) Assume that $(x, y) \ne (0, 0)$ satisfies $x^2 + y^2 \in \mathrm{bd}(K^l)$. Then $w \in \mathrm{bd}(K^l)$ and $w_1 > 0$, which means $\|w_2\| = w_1 > 0$ and $\lambda_2(w) > \lambda_1(w) = 0$. Observe that, when $w_2 \ne 0$, the matrix $L_z^{-1}$ in (13) can be decomposed as the sum of

$$L_1(w) := \frac{1}{2\sqrt{\lambda_1(w)}} \begin{pmatrix} 1 & -\bar{w}_2^T \\ -\bar{w}_2 & \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{16}$$

and

$$L_2(w) := \frac{1}{2\sqrt{\lambda_2(w)}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & \dfrac{4\sqrt{\lambda_2(w)}}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}} (I - \bar{w}_2 \bar{w}_2^T) + \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{17}$$

with $\bar{w}_2 = w_2/\|w_2\|$. Consequently, $\phi'_x$ and $\phi'_y$ can be rewritten as

$$\phi'_x(x, y) = (L_1(w) + L_2(w)) L_x - I, \quad \phi'_y(x, y) = (L_1(w) + L_2(w)) L_y - I. \tag{18}$$

Let $\{(x^k, y^k)\} \subseteq D_\phi$ be an arbitrary sequence converging to $(x, y)$. Let $w^k = (w_1^k, w_2^k) = w(x^k, y^k)$ and $z^k = z(x^k, y^k)$ for each $k$, where $w(x, y)$ and $z(x, y)$ are given as in (12). Since $w_2 \ne 0$, we without loss of generality assume $\|w_2^k\| \ne 0$ for each $k$. Let $\bar{w}_2^k = w_2^k/\|w_2^k\|$ for each $k$. From (18), it follows that

$$\begin{aligned}
\phi'_x(x^k, y^k) &= \left( L_1(w^k) + L_2(w^k) \right) L_{x^k} - I, \\
\phi'_y(x^k, y^k) &= \left( L_1(w^k) + L_2(w^k) \right) L_{y^k} - I.
\end{aligned} \tag{19}$$

Since $\lim_{k \to \infty} \lambda_1(w^k) = 0$, $\lim_{k \to \infty} \lambda_2(w^k) = 2w_1 > 0$ and $\lim_{k \to \infty} \bar{w}_2^k = \bar{w}_2$, we have

$$\lim_{k \to \infty} L_2(w^k) L_{x^k} = C(w) L_x \quad \text{and} \quad \lim_{k \to \infty} L_2(w^k) L_{y^k} = C(w) L_y, \tag{20}$$

where

$$C(w) = \frac{1}{2\sqrt{2w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3\bar{w}_2 \bar{w}_2^T \end{pmatrix}. \tag{21}$$

Next we focus on the limit of $L_1(w^k) L_{x^k}$ and $L_1(w^k) L_{y^k}$ as $k \to \infty$. By computing,

$$L_1(w^k) L_{x^k} = \frac{1}{2} \begin{pmatrix} u_1^k & (u_2^k)^T \\ -u_1^k \bar{w}_2^k & -\bar{w}_2^k (u_2^k)^T \end{pmatrix}, \qquad L_1(w^k) L_{y^k} = \frac{1}{2} \begin{pmatrix} v_1^k & (v_2^k)^T \\ -v_1^k \bar{w}_2^k & -\bar{w}_2^k (v_2^k)^T \end{pmatrix},$$

where

$$u_1^k = \frac{x_1^k - (x_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad u_2^k = \frac{x_2^k - x_1^k \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad v_1^k = \frac{y_1^k - (y_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad v_2^k = \frac{y_2^k - y_1^k \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}.$$

By Lemma 3.3, $|u_1^k| \le \|u_2^k\| \le 1$ and $|v_1^k| \le \|v_2^k\| \le 1$. So, taking the limit (possibly on a subsequence) on $L_1(w^k) L_{x^k}$ and $L_1(w^k) L_{y^k}$, respectively, gives

$$\begin{aligned}
L_1(w^k) L_{x^k} &\to \frac{1}{2} \begin{pmatrix} u_1 & u_2^T \\ -u_1 \bar{w}_2 & -\bar{w}_2 u_2^T \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \\
L_1(w^k) L_{y^k} &\to \frac{1}{2} \begin{pmatrix} v_1 & v_2^T \\ -v_1 \bar{w}_2 & -\bar{w}_2 v_2^T \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T
\end{aligned} \tag{22}$$

for some $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$. In fact, $u$ and $v$ are accumulation points of the sequences $\{u^k\}$ and $\{v^k\}$, respectively. From equations (19)–(22), we immediately obtain

$$\begin{aligned}
\phi'_x(x^k, y^k) &\to C(w) L_x + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T - I, \\
\phi'_y(x^k, y^k) &\to C(w) L_y + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T - I.
\end{aligned}$$

This shows that $\phi'(x^k, y^k) \to [V_x - I \;\; V_y - I]$ as $k \to \infty$ with $V_x, V_y$ satisfying (14).

(c) Assume that $(x, y) = (0, 0)$. Let $\{(x^k, y^k)\} \subseteq D_\phi$ be an arbitrary sequence converging to $(x, y)$. Let $w^k = (w_1^k, w_2^k) = w(x^k, y^k)$ and $z^k = z(x^k, y^k)$ for each $k$. Since $w = 0$, we without any loss of generality assume that $w_2^k = 0$ for all $k$, or $w_2^k \ne 0$ for all $k$.

Case (1): $w_2^k = 0$ for all $k$. From Lemma 3.1, it follows that $L_{z^k}^{-1} = (1/\sqrt{w_1^k}) I$. Therefore,

$$\phi'_x(x^k, y^k) = \frac{1}{\sqrt{w_1^k}} L_{x^k} - I \quad \text{and} \quad \phi'_y(x^k, y^k) = \frac{1}{\sqrt{w_1^k}} L_{y^k} - I.$$

Since $w_1^k = \|x^k\|^2 + \|y^k\|^2$, every element in $\phi'_x(x^k, y^k)$ and $\phi'_y(x^k, y^k)$ is bounded. Taking the limit (possibly on a subsequence) on $\phi'_x(x^k, y^k)$ and $\phi'_y(x^k, y^k)$, we obtain

$$\phi'_x(x^k, y^k) \to L_{\hat{x}} - I \quad \text{and} \quad \phi'_y(x^k, y^k) \to L_{\hat{y}} - I$$

for some vectors $\hat{x}, \hat{y} \in \mathbb{R}^l$ satisfying $\|\hat{x}\|^2 + \|\hat{y}\|^2 = 1$, where $\hat{x}$ and $\hat{y}$ are accumulation points of the sequences $\left\{ x^k/\sqrt{w_1^k} \right\}$ and $\left\{ y^k/\sqrt{w_1^k} \right\}$, respectively. Thus, we prove that $\phi'(x^k, y^k) \to [V_x - I \;\; V_y - I]$ as $k \to \infty$ with $V_x \in \{L_{\hat{x}}\}$ and $V_y \in \{L_{\hat{y}}\}$.

Case (2): $w_2^k \ne 0$ for all $k$. Now $\phi'_x(x^k, y^k)$ and $\phi'_y(x^k, y^k)$ are given as in (19). Using the same arguments as part (b) and noting the boundedness of $\{\bar{w}_2^k\}$, we have

$$L_1(w^k) L_{x^k} \to \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \qquad L_1(w^k) L_{y^k} \to \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \tag{23}$$

for some $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$, and $\bar{w}_2 \in \mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$. We next compute the limit of $L_2(w^k) L_{x^k}$ and $L_2(w^k) L_{y^k}$ as $k \to \infty$. By the definition of $L_2(w)$ in (17),

$$\begin{aligned}
L_2(w^k) L_{x^k} &= \frac{1}{2} \begin{pmatrix} \xi_1^k & (\xi_2^k)^T \\ \xi_1^k \bar{w}_2^k + 4\left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) s_2^k & \bar{w}_2^k (\xi_2^k)^T + 4\left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) s_1^k \end{pmatrix}, \\
L_2(w^k) L_{y^k} &= \frac{1}{2} \begin{pmatrix} \eta_1^k & (\eta_2^k)^T \\ \eta_1^k \bar{w}_2^k + 4\left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) \omega_2^k & \bar{w}_2^k (\eta_2^k)^T + 4\left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) \omega_1^k \end{pmatrix},
\end{aligned}$$
where

$$\xi_1^k = \frac{x_1^k + (x_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad \xi_2^k = \frac{x_2^k + x_1^k \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad \eta_1^k = \frac{y_1^k + (y_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad \eta_2^k = \frac{y_2^k + y_1^k \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \tag{24}$$

and

$$s_1^k = \frac{x_1^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad s_2^k = \frac{x_2^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad \omega_1^k = \frac{y_1^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad \omega_2^k = \frac{y_2^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}. \tag{25}$$

By Lemma 3.3, $|\xi_1^k| \le \|\xi_2^k\| \le 1$ and $|\eta_1^k| \le \|\eta_2^k\| \le 1$. In addition,

$$\|s^k\|^2 + \|\omega^k\|^2 = \frac{\|x^k\|^2 + \|y^k\|^2}{2(\|x^k\|^2 + \|y^k\|^2) + 2\sqrt{\lambda_1(w^k)} \sqrt{\lambda_2(w^k)}} \le \frac{1}{2}.$$

Hence, taking the limit (possibly on a subsequence) on $L_2(w^k) L_{x^k}$ and $L_2(w^k) L_{y^k}$ yields

$$\begin{aligned}
L_2(w^k) L_{x^k} &\to \frac{1}{2} \begin{pmatrix} \xi_1 & \xi_2^T \\ \xi_1 \bar{w}_2 + 4(I - \bar{w}_2 \bar{w}_2^T) s_2 & \bar{w}_2 \xi_2^T + 4(I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix}, \\
L_2(w^k) L_{y^k} &\to \frac{1}{2} \begin{pmatrix} \eta_1 & \eta_2^T \\ \eta_1 \bar{w}_2 + 4(I - \bar{w}_2 \bar{w}_2^T) \omega_2 & \bar{w}_2 \eta_2^T + 4(I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix}
\end{aligned} \tag{26}$$

for some vectors $\xi = (\xi_1, \xi_2), \eta = (\eta_1, \eta_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|\xi_1| \le \|\xi_2\| \le 1$ and $|\eta_1| \le \|\eta_2\| \le 1$, $\bar{w}_2 \in \mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$, and $s = (s_1, s_2), \omega = (\omega_1, \omega_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $\|s\|^2 + \|\omega\|^2 \le 1/2$. Here $\xi$ and $\eta$ are accumulation points of the sequences $\{\xi^k\}$ and $\{\eta^k\}$, respectively, and $s$ and $\omega$ are accumulation points of the sequences $\{s^k\}$ and $\{\omega^k\}$, respectively. From (19), (23) and (26), we obtain

$$\begin{aligned}
\phi'_x(x^k, y^k) &\to \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} - I, \\
\phi'_y(x^k, y^k) &\to \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} - I.
\end{aligned}$$