Mathematical Methods of Operations Research, vol. 64, pp. 495-519, 2006

### Two classes of merit functions for the second-order cone complementarity problem

Jein-Shan Chen ^{1}
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan

June 2, 2005

(revised December 8, 2005) (second revised March 10, 2006)

Abstract. Recently Tseng [*Merit function for semidefinite complementarity*, Mathematical
Programming, 83, pp. 159-185, 1998] extended a class of merit functions, proposed by Z.
Luo and P. Tseng [*A new class of merit functions for the nonlinear complementarity problem*,
in Complementarity and Variational Problems: State of the Art, pp. 204-225, 1997], for the
nonlinear complementarity problem (NCP) to the semidefinite complementarity problem
(SDCP) and showed several related properties. In this paper, we extend this class of
merit functions to the second-order cone complementarity problem (SOCCP) and show
analogous properties as in NCP and SDCP cases. In addition, we study another class of
merit functions which are based on a slight modification of the aforementioned class of
merit functions. Both classes of merit functions provide an error bound for the SOCCP
and have bounded level sets.

Key words. Error bound, Jordan product, level set, merit function, second-order cone, spectral factorization.

AMS subject classifications. 26B05, 90C33

### 1 Introduction

^{1} Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342.

We consider the following conic complementarity problem of finding $x, y \in \mathbb{R}^n$ and $\zeta \in \mathbb{R}^n$ satisfying

$$\langle x, y\rangle = 0, \quad x \in \mathcal{K}, \quad y \in \mathcal{K}, \tag{1}$$

$$x = F(\zeta), \quad y = G(\zeta), \tag{2}$$

where $\langle\cdot,\cdot\rangle$ is the Euclidean inner product, $F : \mathbb{R}^n \to \mathbb{R}^n$ and $G : \mathbb{R}^n \to \mathbb{R}^n$ are smooth (i.e., continuously differentiable) mappings, and $\mathcal{K}$ is the Cartesian product of second-order cones (SOC), also called Lorentz cones [8]. In other words,

$$\mathcal{K} = \mathcal{K}^{n_1} \times \cdots \times \mathcal{K}^{n_N}, \tag{3}$$

where $N, n_1, \ldots, n_N \ge 1$, $n_1 + \cdots + n_N = n$, and

$$\mathcal{K}^{n_i} := \{(x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i-1} \mid \|x_2\| \le x_1\}, \tag{4}$$

with $\|\cdot\|$ denoting the Euclidean norm and $\mathcal{K}^1$ denoting the set of nonnegative reals $\mathbb{R}_+$. A special case of (3) is $\mathcal{K} = \mathbb{R}^n_+$, the nonnegative orthant in $\mathbb{R}^n$, which corresponds to $N = n$ and $n_1 = \cdots = n_N = 1$. We will refer to (1), (2), (3) as the *second-order cone complementarity problem* (SOCCP).

An important special case of SOCCP corresponds to $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$. Then (1) and (2) reduce to

$$\langle F(\zeta), \zeta\rangle = 0, \quad F(\zeta) \in \mathcal{K}, \quad \zeta \in \mathcal{K}, \tag{5}$$

which is a natural extension of the nonlinear complementarity problem (NCP) where $\mathcal{K} = \mathbb{R}^n_+$. Another important special case of SOCCP corresponds to the Karush-Kuhn-Tucker (KKT) optimality conditions for the second-order cone program (SOCP) (see [4] for details):

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = b, \quad x \in \mathcal{K}, \end{array} \tag{6}$$

where $A \in \mathbb{R}^{m\times n}$ has full row rank, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$.

For simplicity, we will focus on $\mathcal{K} = \mathcal{K}^n$ throughout the whole paper. All the analysis can be carried over to the general case where $\mathcal{K}$ has the direct product structure as in (3). It is known that $\mathcal{K}^n$ is a closed convex cone with interior given by

$$\operatorname{int}(\mathcal{K}^n) = \{(x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \mid \|x_2\| < x_1\}.$$

For any $x, y$ in $\mathbb{R}^n$, we write $x \succeq_{\mathcal{K}^n} y$ if $x - y \in \mathcal{K}^n$; and write $x \succ_{\mathcal{K}^n} y$ if $x - y \in \operatorname{int}(\mathcal{K}^n)$. In other words, we have $x \succeq_{\mathcal{K}^n} 0$ if and only if $x \in \mathcal{K}^n$ and $x \succ_{\mathcal{K}^n} 0$ if and only if $x \in \operatorname{int}(\mathcal{K}^n)$. The relation $\succeq_{\mathcal{K}^n}$ is a partial ordering, i.e., it is anti-symmetric, transitive, and reflexive. Nonetheless, it is not a total ordering in $\mathcal{K}^n$.

There have been various methods proposed for solving SOCP and SOCCP. They include interior-point methods [1, 2, 18, 20, 21, 23, 28], non-interior smoothing Newton methods [6, 11, 13], and smoothing-regularization methods [14]. Recently, the author and his co-author studied an alternative approach based on reformulating SOCP and SOCCP as an unconstrained smooth minimization problem [4]. That approach aims to find a smooth function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ such that

$$\psi(x, y) = 0 \iff x \in \mathcal{K}^n, \quad y \in \mathcal{K}^n, \quad \langle x, y\rangle = 0. \tag{7}$$

Then SOCCP can be expressed as an unconstrained smooth (global) minimization problem:

$$\min_{\zeta \in \mathbb{R}^n} f(\zeta) := \psi(F(\zeta), G(\zeta)). \tag{8}$$

We call such an $f$ a merit function for the SOCCP.

A popular choice of $\psi$ is the squared norm of the Fischer-Burmeister function, i.e., $\psi_{\mathrm{FB}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ associated with the second-order cone given by

$$\psi_{\mathrm{FB}}(x, y) = \frac{1}{2}\|\phi_{\mathrm{FB}}(x, y)\|^2, \tag{9}$$

where $\phi_{\mathrm{FB}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the well-known Fischer-Burmeister function [9, 10] defined by

$$\phi_{\mathrm{FB}}(x, y) = (x^2 + y^2)^{1/2} - x - y. \tag{10}$$
More specifically, for any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define their Jordan product associated with $\mathcal{K}^n$ as

$$x \circ y := (\langle x, y\rangle,\ y_1 x_2 + x_1 y_2). \tag{11}$$

The Jordan product $\circ$, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of SOCCP. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. We write $x^2$ to mean $x \circ x$ and write $x + y$ to mean the usual componentwise addition of vectors. It is known that $x^2 \in \mathcal{K}^n$ for all $x \in \mathbb{R}^n$. Moreover, if $x \in \mathcal{K}^n$, then there exists a unique vector in $\mathcal{K}^n$, denoted by $x^{1/2}$, such that $(x^{1/2})^2 = x^{1/2} \circ x^{1/2} = x$. Thus, $\phi_{\mathrm{FB}}$ defined as (10) is well-defined for all $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ and maps $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}^n$. It was shown in [11] that $\phi_{\mathrm{FB}}(x, y) = 0$ if and only if $(x, y)$ satisfies (1). Therefore, $\psi_{\mathrm{FB}}$ defined as (9) induces a merit function for the SOCCP.
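The Jordan product (11) is easy to experiment with numerically. The following is a minimal sketch in plain Python (the function name `jordan` and the sample vectors are illustrative choices, not taken from the paper); it checks that $e$ acts as the identity and exhibits one triple for which associativity fails.

```python
def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2) associated with K^n."""
    inner = sum(a * b for a, b in zip(x, y))
    tail = [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]
    return [inner] + tail

e = [1.0, 0.0, 0.0]          # identity element
x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, 0.0]
z = [0.0, 0.0, 1.0]

x_sq = jordan(x, x)                 # x^2 = (||x||^2, 2*x1*x2)
left = jordan(jordan(x, y), z)      # (x o y) o z
right = jordan(x, jordan(y, z))     # x o (y o z)
```

For these vectors `left` and `right` differ, which illustrates the non-associativity mentioned above, while `x_sq` lies in $\mathcal{K}^3$.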

In this paper, we study two classes of merit functions for the SOCCP. The first class is

$$f_{\mathrm{LT}}(\zeta) := \psi_0(\langle F(\zeta), G(\zeta)\rangle) + \psi(F(\zeta), G(\zeta)), \tag{12}$$

where $\psi_0 : \mathbb{R} \to \mathbb{R}_+$ satisfies

$$\psi_0(t) = 0 \ \ \forall t \le 0 \quad \text{and} \quad \psi_0'(t) > 0 \ \ \forall t > 0, \tag{13}$$

and $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfies

$$\psi(x, y) = 0, \ \langle x, y\rangle \le 0 \iff (x, y) \in \mathcal{K}^n \times \mathcal{K}^n, \ \langle x, y\rangle = 0. \tag{14}$$

The function $f_{\mathrm{LT}}$ was proposed by Z. Luo and P. Tseng for the NCP case in [19] and was extended to the SDCP case by P. Tseng in [27]. We explore the extension to the SOCCP as will be seen in Sec. 3 and Sec. 4. In addition, we make a slight modification of $f_{\mathrm{LT}}$ which forms another class of merit functions as below:

$$\widehat{f}_{\mathrm{LT}}(\zeta) := \psi_0^*(F(\zeta) \circ G(\zeta)) + \psi(F(\zeta), G(\zeta)), \tag{15}$$

where $\psi_0^* : \mathbb{R}^n \to \mathbb{R}_+$ is given as

$$\psi_0^*(w) = \frac{1}{2}\|(w)_+\|^2, \tag{16}$$

and $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfies (14). We notice that $\psi_0^*$ possesses the following property:

$$\psi_0^*(w) = 0 \iff w \preceq_{\mathcal{K}^n} 0, \tag{17}$$

which is a similar feature to (13) in some sense. Examples of $\psi_0$ and $\psi$ will be given in Sec. 3. The second class of merit functions for the SDCP case was recently studied in [12] and a variant of $\widehat{f}_{\mathrm{LT}}$ was also studied by the author in [3].

We will show that both $f_{\mathrm{LT}}$ and $\widehat{f}_{\mathrm{LT}}$ provide a global error bound (Prop. 4.1 and Prop. 4.2), which plays an important role in analyzing the convergence rate of some iterative methods for solving the SOCCP, if $F$ and $G$ are jointly strongly monotone. We will also prove that if $F$ and $G$ are jointly monotone and a strictly feasible solution exists then both $f_{\mathrm{LT}}$ and $\widehat{f}_{\mathrm{LT}}$ have bounded level sets (Prop. 4.3 and Prop. 4.4), which will ensure that the sequence generated by a descent algorithm has at least one accumulation point. All these properties will make it possible to construct a descent algorithm for solving the equivalent unconstrained reformulation of the SOCCP. In contrast, the merit function induced by $\psi_{\mathrm{FB}}$ lacks these properties. In addition, we will show that both $f_{\mathrm{LT}}$ and $\widehat{f}_{\mathrm{LT}}$ are differentiable and their gradients have computable formulas. All the aforementioned features are significant reasons for choosing and studying these new merit functions.

Finally, we point out that SOCCP can be reduced to an SDCP by observing that, for any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we have $x \in \mathcal{K}^n$ if and only if

$$L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix}$$

is positive semidefinite (also see [11, p. 437] and [24]). However, this reduction increases the problem dimension from $n$ to $n(n + 1)/2$ and it is not known whether this increase can be mitigated by exploiting the special "arrow" structure of $L_x$.

Throughout this paper, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors and $^T$ denotes transpose. For any differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For any differentiable mapping $F = (F_1, \ldots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \ \cdots \ \nabla F_m(x)]$ is an $n \times m$ matrix which denotes the transposed Jacobian of $F$ at $x$. For any symmetric matrices $A, B \in \mathbb{R}^{n\times n}$, we write $A \succeq B$ (respectively, $A \succ B$) to mean $A - B$ is positive semidefinite (respectively, positive definite). For nonnegative scalars $\alpha$ and $\beta$, we write $\alpha = O(\beta)$ to mean $\alpha \le C\beta$, with $C$ independent of $\alpha$ and $\beta$. For any $x \in \mathbb{R}^n$, $(x)_+$ is used to denote the orthogonal projection of $x$ onto $\mathcal{K}^n$, whereas $(x)_-$ means the orthogonal projection of $x$ onto $-\mathcal{K}^n$. Also, given any closed convex cone $C$, we denote by $C^* := \{y \mid \langle x, y\rangle \ge 0 \ \forall x \in C\}$ the dual cone of $C$.

### 2 Preliminaries

In this section, we review some background materials and preliminary results obtained by the author and his co-author in [4] that will be used later. We begin with the determinant and trace of $x$. For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, its determinant and trace are defined by

$$\det(x) := x_1^2 - \|x_2\|^2, \qquad \operatorname{tr}(x) := 2x_1.$$

In general, $\det(x \circ y) \ne \det(x)\det(y)$ unless $x_2 = y_2$. Besides, we observe that $\operatorname{tr}(x \circ y) = 2\langle x, y\rangle$. We next recall from [11] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization, associated with $\mathcal{K}^n$, of the form

$$x = \lambda_1 u^{(1)} + \lambda_2 u^{(2)},$$

where $\lambda_1, \lambda_2$ and $u^{(1)}, u^{(2)}$ are the spectral values and the associated spectral vectors of $x$ given by

$$\lambda_i = x_1 + (-1)^i \|x_2\|,$$

$$u^{(i)} = \begin{cases} \dfrac{1}{2}\left(1,\ (-1)^i \dfrac{x_2}{\|x_2\|}\right) & \text{if } x_2 \ne 0; \\[2mm] \dfrac{1}{2}\left(1,\ (-1)^i w_2\right) & \text{if } x_2 = 0, \end{cases}$$

for $i = 1, 2$, with $w_2$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|w_2\| = 1$. If $x_2 \ne 0$, the factorization is unique.
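The spectral factorization above translates directly into a few lines of code. The sketch below is in plain Python and assumes $n \ge 2$; the name `spectral` and the choice of the first coordinate direction for $w_2$ when $x_2 = 0$ are illustrative assumptions, since any unit vector is admissible in that case.

```python
import math

def spectral(x):
    """Return spectral values (lam1, lam2) and vectors (u1, u2) of x w.r.t. K^n."""
    x1, x2 = x[0], x[1:]
    norm_x2 = math.sqrt(sum(t * t for t in x2))
    if norm_x2 > 0.0:
        w = [t / norm_x2 for t in x2]
    else:
        # x2 = 0: any unit vector works; pick the first coordinate direction.
        w = [1.0] + [0.0] * (len(x2) - 1)
    lam = (x1 - norm_x2, x1 + norm_x2)      # lam_i = x1 + (-1)^i ||x2||
    u1 = [0.5] + [-0.5 * t for t in w]      # u^(1) = (1/2)(1, -w)
    u2 = [0.5] + [0.5 * t for t in w]       # u^(2) = (1/2)(1, +w)
    return lam, (u1, u2)

x = [5.0, 3.0, 4.0]                         # boundary point of K^3
(lam1, lam2), (u1, u2) = spectral(x)
rebuilt = [lam1 * a + lam2 * b for a, b in zip(u1, u2)]
trace = lam1 + lam2                         # should equal 2*x1
det = lam1 * lam2                           # should equal x1^2 - ||x2||^2
```

Here `rebuilt` recovers `x` up to rounding, and `trace` and `det` agree with the definitions of $\operatorname{tr}(x)$ and $\det(x)$ given at the start of this section.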

The above spectral factorization of $x$, as well as $x^2$ and $x^{1/2}$ and the matrix $L_x$, have various interesting properties; see [11]. We list four properties that we will use in the subsequent sections.

Property 2.1 For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, with spectral values $\lambda_1, \lambda_2$ and spectral vectors $u^{(1)}, u^{(2)}$, the following results hold.

(a) $\operatorname{tr}(x) = \lambda_1 + \lambda_2$ and $\det(x) = \lambda_1\lambda_2$.

(b) If $x \in \mathcal{K}^n$, then $0 \le \lambda_1 \le \lambda_2$ and $x^{1/2} = \sqrt{\lambda_1}\, u^{(1)} + \sqrt{\lambda_2}\, u^{(2)}$.

(c) If $x \in \operatorname{int}(\mathcal{K}^n)$, then $0 < \lambda_1 \le \lambda_2$, and $L_x$ is invertible with

$$L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}.$$

(d) $x \circ y = L_x y$ for all $y \in \mathbb{R}^n$, and $L_x \succ 0$ if and only if $x \in \operatorname{int}(\mathcal{K}^n)$.
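Parts (c) and (d) of Property 2.1 can be checked on a small example. The following plain-Python sketch builds the arrow matrix $L_x$ and the closed-form inverse from part (c); the helper names (`arrow`, `arrow_inverse`, `matmul`, `matvec`) and the sample vectors are hypothetical and chosen only for illustration.

```python
def arrow(x):
    """Arrow matrix L_x = [[x1, x2^T], [x2, x1*I]]."""
    n = len(x)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = x[0]
    for i in range(1, n):
        L[0][i] = x[i]
        L[i][0] = x[i]
    return L

def arrow_inverse(x):
    """Closed-form inverse of L_x from Property 2.1(c); needs x in int(K^n)."""
    n = len(x)
    det = x[0] ** 2 - sum(t * t for t in x[1:])
    inv = [[0.0] * n for _ in range(n)]
    inv[0][0] = x[0] / det
    for i in range(1, n):
        inv[0][i] = inv[i][0] = -x[i] / det
        for j in range(1, n):
            # (1/det) * ((det/x1) * I + (1/x1) * x2 x2^T), entry (i, j)
            inv[i][j] = x[i] * x[j] / (x[0] * det) + (1.0 / x[0] if i == j else 0.0)
    return inv

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

x = [3.0, 1.0, 2.0]                 # det(x) = 9 - 5 = 4 > 0, so x is in int(K^3)
y = [1.0, -1.0, 0.5]
product = matmul(arrow(x), arrow_inverse(x))   # should be the identity
Lx_y = matvec(arrow(x), y)                     # should equal x o y
```

For this choice of `x` and `y`, the Jordan product (11) gives $x \circ y = (3, -2, 3.5)$, which is what `Lx_y` reproduces.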

In the following, we present some preliminary properties about $\phi_{\mathrm{FB}}$ and $\psi_{\mathrm{FB}}$ given as (10) and (9), respectively, which are crucial to proving the results in Sec. 3 and Sec. 4. We only indicate their sources and omit the proofs since they can be found in [4] and [11].

Lemma 2.1 ([11, Prop. 2.1]) Let $\phi_{\mathrm{FB}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be given by (10). Then

$$\begin{aligned} \phi_{\mathrm{FB}}(x, y) = 0 &\iff x, y \in \mathcal{K}^n, \ x \circ y = 0, \\ &\iff x, y \in \mathcal{K}^n, \ \langle x, y\rangle = 0. \end{aligned}$$
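As a small illustration of Lemma 2.1, the sketch below evaluates $\phi_{\mathrm{FB}}$ in plain Python, computing $(x^2 + y^2)^{1/2}$ through the spectral formula of Property 2.1(b). The function names and test vectors are illustrative assumptions; clipping the smaller spectral value at zero is only a guard against rounding and is not part of the paper's definition.

```python
import math

def jordan(x, y):
    """Jordan product (11)."""
    inner = sum(a * b for a, b in zip(x, y))
    return [inner] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def soc_sqrt(v):
    """v^{1/2} for v in K^n, i.e. sqrt(lam1) u^(1) + sqrt(lam2) u^(2)."""
    v1, v2 = v[0], v[1:]
    nrm = math.sqrt(sum(t * t for t in v2))
    s1 = math.sqrt(max(v1 - nrm, 0.0))   # clip tiny negative rounding errors
    s2 = math.sqrt(max(v1 + nrm, 0.0))
    scale = (s2 - s1) / (2.0 * nrm) if nrm > 0.0 else 0.0
    return [(s1 + s2) / 2.0] + [scale * t for t in v2]

def phi_fb(x, y):
    """Fischer-Burmeister function (10): (x^2 + y^2)^{1/2} - x - y."""
    s = [a + b for a, b in zip(jordan(x, x), jordan(y, y))]
    r = soc_sqrt(s)
    return [c - a - b for c, a, b in zip(r, x, y)]

# x and y lie on the boundary of K^3 and are orthogonal: a complementary pair.
x = [1.0, 1.0, 0.0]
y = [1.0, -1.0, 0.0]
zero_case = phi_fb(x, y)

# e is in K^3 but <e, e> = 1 != 0, so phi_FB(e, e) must be nonzero.
e = [1.0, 0.0, 0.0]
nonzero_case = phi_fb(e, e)
```

The first pair satisfies (1), so `zero_case` vanishes, while `nonzero_case` equals $(\sqrt{2} - 2, 0, 0)$.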

Lemma 2.2 ([4, Lem. 3.2]) For any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ with $x^2 + y^2 \notin \operatorname{int}(\mathcal{K}^n)$, we have

$$\begin{aligned} x_1^2 &= \|x_2\|^2, \\ y_1^2 &= \|y_2\|^2, \\ x_1 y_1 &= x_2^T y_2, \\ x_1 y_2 &= y_1 x_2. \end{aligned}$$

Lemma 2.3 ([4, Prop. 3.1, 3.2]) Let $\phi_{\mathrm{FB}}, \psi_{\mathrm{FB}}$ be given as (10) and (9), respectively. Then, $\psi_{\mathrm{FB}}$ has the following properties.

(a) $\psi_{\mathrm{FB}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfies (7).

(b) $\psi_{\mathrm{FB}}$ is continuously differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. Moreover, $\nabla_x \psi_{\mathrm{FB}}(0, 0) = \nabla_y \psi_{\mathrm{FB}}(0, 0) = 0$. If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \operatorname{int}(\mathcal{K}^n)$, then

$$\begin{aligned} \nabla_x \psi_{\mathrm{FB}}(x, y) &= \left( L_x L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{\mathrm{FB}}(x, y), \\ \nabla_y \psi_{\mathrm{FB}}(x, y) &= \left( L_y L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{\mathrm{FB}}(x, y). \end{aligned} \tag{18}$$

If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \operatorname{int}(\mathcal{K}^n)$, then $x_1^2 + y_1^2 \ne 0$ and

$$\nabla_x \psi_{\mathrm{FB}}(x, y) = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{\mathrm{FB}}(x, y), \tag{19}$$

$$\nabla_y \psi_{\mathrm{FB}}(x, y) = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{\mathrm{FB}}(x, y). \tag{20}$$

Lemma 2.4 ([4, Lem. 5.1]) Let $C$ be any closed convex cone in $\mathbb{R}^n$. For each $x \in \mathbb{R}^n$, let $x_C^+$ and $x_C^-$ denote the nearest-point (in the Euclidean norm) projection of $x$ onto $C$ and $-C^*$, respectively. Then, the following results hold.

(a) For any $x \in \mathbb{R}^n$, we have $x = x_C^+ + x_C^-$ and $\|x\|^2 = \|x_C^+\|^2 + \|x_C^-\|^2$.

(b) For any $x \in \mathbb{R}^n$ and $y \in C$, we have $\langle x, y\rangle \le \langle x_C^+, y\rangle$.

(c) If $C$ is self-dual, then for any $x \in \mathbb{R}^n$ and $y \in C$, we have $\|(x + y)_C^+\| \ge \|x_C^+\|$.

Proof. In fact, parts (a) and (b) are classical results of [16]. $\Box$
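For $C = \mathcal{K}^n$, which is self-dual, the projections $(x)_+$ and $(x)_-$ onto $\mathcal{K}^n$ and $-\mathcal{K}^n$ have a well-known closed form obtained by clipping the spectral values at zero. The plain-Python sketch below uses that closed form (the function name `project` and the sample point are illustrative assumptions, and $n \ge 2$ is assumed) and checks the decomposition in Lemma 2.4(a) on one point.

```python
import math

def project(x):
    """Return ((x)_+, (x)_-): projections of x onto K^n and onto -K^n."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    w = [t / nrm for t in x2] if nrm > 0.0 else [1.0] + [0.0] * (len(x2) - 1)
    lam1, lam2 = x1 - nrm, x1 + nrm

    def combo(c1, c2):
        # c1*u^(1) + c2*u^(2) with u^(i) = (1/2)(1, (-1)^i w)
        return [0.5 * (c1 + c2)] + [0.5 * (c2 - c1) * t for t in w]

    plus = combo(max(lam1, 0.0), max(lam2, 0.0))
    minus = combo(min(lam1, 0.0), min(lam2, 0.0))
    return plus, minus

x = [1.0, 3.0, 4.0]                    # spectral values -4 and 6: outside K^3 and -K^3
x_plus, x_minus = project(x)
recombined = [a + b for a, b in zip(x_plus, x_minus)]
sq = lambda v: sum(t * t for t in v)
norm_gap = sq(x) - (sq(x_plus) + sq(x_minus))                 # should be ~0
cross = sum(a * b for a, b in zip(x_plus, x_minus))           # should be ~0
```

For this point the two pieces are $(x)_+ = (3, 1.8, 2.4)$ and $(x)_- = (-2, 1.2, 1.6)$, which sum to $x$ and are mutually orthogonal.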

Lemma 2.5 ([4, Lem. 5.2]) Let $\phi_{\mathrm{FB}}, \psi_{\mathrm{FB}}$ be given by (10) and (9), respectively. For any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

$$4\psi_{\mathrm{FB}}(x, y) \ \ge\ 2\left\|\phi_{\mathrm{FB}}(x, y)_+\right\|^2 \ \ge\ \left\|(-x)_+\right\|^2 + \left\|(-y)_+\right\|^2.$$

To close this section, we recall some definitions that will be used for analysis in subsequent sections. We say that $F$ and $G$ are jointly monotone if

$$\langle F(\zeta) - F(\xi),\ G(\zeta) - G(\xi)\rangle \ge 0 \quad \forall \zeta, \xi \in \mathbb{R}^n.$$

Similarly, $F$ and $G$ are jointly strongly monotone if there exists $\rho > 0$ such that

$$\langle F(\zeta) - F(\xi),\ G(\zeta) - G(\xi)\rangle \ge \rho\|\zeta - \xi\|^2 \quad \forall \zeta, \xi \in \mathbb{R}^n.$$

In the case where $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$, the above notions are equivalent to the well-known notions of $F$ being, respectively, monotone and strongly monotone [7, Sec. 2.3].
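To make the definition concrete, consider a hypothetical affine example (not taken from the paper): $F(\zeta) = M\zeta + q$ and $G(\zeta) = \zeta$ with $M = \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix}$. The symmetric part of $M$ is $2I$, so $\langle F(\zeta) - F(\xi), \zeta - \xi\rangle = 2\|\zeta - \xi\|^2$ and the pair is jointly strongly monotone with $\rho = 2$. The plain-Python sketch below evaluates the defining inequality at a few sample pairs.

```python
def inner(a, b):
    return sum(s * t for s, t in zip(a, b))

# Hypothetical data: the skew part of M cancels in the quadratic form.
M = [[2.0, 1.0], [-1.0, 2.0]]
q = [1.0, -3.0]
rho = 2.0

def F(z):
    return [inner(row, z) + qi for row, qi in zip(M, q)]

def G(z):
    return list(z)

def gap(zeta, xi):
    """<F(zeta)-F(xi), G(zeta)-G(xi)> - rho*||zeta-xi||^2 (nonnegative if strongly monotone)."""
    dF = [a - b for a, b in zip(F(zeta), F(xi))]
    dG = [a - b for a, b in zip(G(zeta), G(xi))]
    d = [a - b for a, b in zip(zeta, xi)]
    return inner(dF, dG) - rho * inner(d, d)

pairs = [([1.0, 2.0], [0.0, -1.0]),
         ([-3.0, 0.5], [2.0, 2.0]),
         ([0.0, 0.0], [1.0, 1.0])]
gaps = [gap(zeta, xi) for zeta, xi in pairs]
```

In this example the inequality holds with equality, so every entry of `gaps` is zero up to rounding; a sampled check of this kind can only refute monotonicity, not prove it.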

### 3 Two classes of merit functions

In this section, we study two classes of merit functions for the SOCCP. We are motivated by a class of merit functions that was originally proposed by Z. Luo and P. Tseng [19] for the NCP case and was later extended to the SDCP by P. Tseng [27]. We introduce them as below.

Let $f_{\mathrm{LT}}$ be given as (12), i.e.,

$$f_{\mathrm{LT}}(\zeta) := \psi_0(\langle F(\zeta), G(\zeta)\rangle) + \psi(F(\zeta), G(\zeta)),$$

where $\psi_0$ satisfies (13) and $\psi$ satisfies (14). We notice that $\psi_0$ is differentiable and strictly increasing on $[0, \infty)$. An example of $\psi_0$ is $\psi_0(t) = \frac{1}{4}(\max\{0, t\})^4$. Let $\Psi_+$ (we adopt the notation used in [27]) denote the collection of $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfying (14) that are differentiable and satisfy the following conditions:

$$\begin{cases} \langle \nabla_x \psi(x, y), \nabla_y \psi(x, y)\rangle \ge 0, & \forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n, \\ \langle x, \nabla_x \psi(x, y)\rangle + \langle y, \nabla_y \psi(x, y)\rangle \ge 0, & \forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n. \end{cases} \tag{21}$$

We will give an example of $\psi$ belonging to $\Psi_+$ in Prop. 3.1. Before that, we need a couple of technical lemmas which will be used for proving Prop. 3.1 and Prop. 3.2.

Lemma 3.1 (a) For any $x \in \mathbb{R}^n$, $\langle x, (x)_-\rangle = \|(x)_-\|^2$ and $\langle x, (x)_+\rangle = \|(x)_+\|^2$.

(b) For any $x \in \mathbb{R}^n$, we have

$$x \in \mathcal{K}^n \iff \langle x, y\rangle \ge 0 \quad \forall y \in \mathcal{K}^n. \tag{22}$$

Proof. (a) By definition of trace, we know that $\operatorname{tr}(x \circ y) = 2\langle x, y\rangle$. Thus,

$$\langle x, (x)_-\rangle = \frac{1}{2}\operatorname{tr}\big(x \circ (x)_-\big) = \frac{1}{2}\operatorname{tr}\big([(x)_+ + (x)_-] \circ (x)_-\big) = \frac{1}{2}\operatorname{tr}\big((x)_-^2\big) = \|(x)_-\|^2,$$

where the third equality uses $(x)_+ \circ (x)_- = 0$ and the last equality is from the definition of trace again. Similar arguments apply for $\langle x, (x)_+\rangle = \|(x)_+\|^2$.

(b) Since $\mathcal{K}^n$ is self-dual, that is, $\mathcal{K}^n = (\mathcal{K}^n)^*$, the desired result follows. $\Box$

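Lemma 3.2 ([11, Prop. 3.4]) For any $x, y \in \mathbb{R}^n$ and $w \in \mathcal{K}^n$, we have

$$\begin{aligned} w^2 \succeq_{\mathcal{K}^n} x^2 + y^2 &\implies L_w^2 \succeq L_x^2 + L_y^2, \\ w^2 \succeq_{\mathcal{K}^n} x^2 &\implies w \succeq_{\mathcal{K}^n} x. \end{aligned}$$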

Proposition 3.1 Let $\psi_1 : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ be given by

$$\psi_1(x, y) := \frac{1}{2}\left( \|(-x)_+\|^2 + \|(-y)_+\|^2 \right). \tag{23}$$

Then, the following results hold.

(a) $\psi_1$ satisfies (14).

(b) $\psi_1$ is convex and differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ with $\nabla_x \psi_1(x, y) = (x)_-$ and $\nabla_y \psi_1(x, y) = (y)_-$.

(c) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

$$\langle \nabla_x \psi_1(x, y), \nabla_y \psi_1(x, y)\rangle \ge 0.$$

(d) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

$$\langle x, \nabla_x \psi_1(x, y)\rangle + \langle y, \nabla_y \psi_1(x, y)\rangle = \|(x)_-\|^2 + \|(y)_-\|^2.$$

(e) $\psi_1$ belongs to $\Psi_+$.

Proof. (a) Suppose $\psi_1(x, y) = 0$ and $\langle x, y\rangle \le 0$. Then by the definition of $\psi_1$ in (23), we have $(-x)_+ = 0$, $(-y)_+ = 0$, which implies $x \in \mathcal{K}^n$, $y \in \mathcal{K}^n$. Since $\mathcal{K}^n$ is self-dual, $x, y \in \mathcal{K}^n$ leads to $\langle x, y\rangle \ge 0$ by (22). This together with $\langle x, y\rangle \le 0$ yields $\langle x, y\rangle = 0$. The other direction is clear from the above arguments. Hence, we proved that $\psi_1$ satisfies (14).

(b) For any $x \in \mathbb{R}^n$, we have the decomposition $x = (x)_+ + (x)_- = (x)_+ - (-x)_+$. Hence,

$$\frac{1}{2}\|(-x)_+\|^2 = \frac{1}{2}\|(x)_+ - x\|^2 = \min_{w \in \mathcal{K}^n} \frac{1}{2}\|w - x\|^2,$$

which is convex and differentiable in $x$ (see [22, page 255]). Moreover, the chain rule gives

$$\nabla_x \left[ \frac{1}{2}\|(-x)_+\|^2 \right] = -(-x)_+ = (x)_-.$$

A similar formula holds for $y$. Thus, $\psi_1$ is convex and differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ with $\nabla_x \psi_1(x, y) = -(-x)_+ = (x)_-$ and $\nabla_y \psi_1(x, y) = -(-y)_+ = (y)_-$.

(c) From part (b), we have

$$\langle \nabla_x \psi_1(x, y), \nabla_y \psi_1(x, y)\rangle = \langle (x)_-, (y)_-\rangle = \langle (-x)_+, (-y)_+\rangle \ge 0,$$

where the inequality is true by (22).

(d) By applying Lemma 3.1(a), we obtain

$$\langle x, \nabla_x \psi_1(x, y)\rangle = \langle x, (x)_-\rangle = \|(x)_-\|^2.$$

Similarly, $\langle y, \nabla_y \psi_1(x, y)\rangle = \|(y)_-\|^2$ and hence the desired result holds.

(e) This is an immediate consequence of (a) through (d). $\Box$
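The gradient formula in part (b) can be sanity-checked numerically. The sketch below, in plain Python, compares $\nabla_x \psi_1(x, y) = (x)_-$ with a central finite-difference approximation at one sample point; the helper names, the sample point and the step size `h` are illustrative assumptions, and the closed form of $(x)_-$ via clipped spectral values is the standard one rather than something derived in this proof.

```python
import math

def proj_neg_cone(x):
    """(x)_-: projection of x onto -K^n, obtained by keeping negative spectral values."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    w = [t / nrm for t in x2] if nrm > 0.0 else [1.0] + [0.0] * (len(x2) - 1)
    c1, c2 = min(x1 - nrm, 0.0), min(x1 + nrm, 0.0)
    return [0.5 * (c1 + c2)] + [0.5 * (c2 - c1) * t for t in w]

def psi1(x, y):
    # (-x)_+ = -(x)_-, hence ||(-x)_+||^2 = ||(x)_-||^2 (same for y).
    sq = lambda v: sum(t * t for t in v)
    return 0.5 * (sq(proj_neg_cone(x)) + sq(proj_neg_cone(y)))

def fd_grad_x(x, y, h=1e-6):
    """Central finite-difference approximation of the gradient of psi1 in x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((psi1(xp, y) - psi1(xm, y)) / (2.0 * h))
    return g

x = [1.0, 3.0, 4.0]        # not in K^3, so (x)_- is nonzero
y = [2.0, 0.0, 1.0]        # in K^3, so (y)_- = 0
analytic = proj_neg_cone(x)
numeric = fd_grad_x(x, y)
value = psi1(x, y)
```

At this point $(x)_- = (-2, 1.2, 1.6)$ and $\psi_1(x, y) = 4$, and the finite-difference gradient matches the analytic one to within the discretization error.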

Next, we consider a further restriction on $\psi$. Let $\Psi_{++}$ denote the collection of $\psi \in \Psi_+$ satisfying the following condition:

$$\psi(x, y) = 0 \quad \forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n \ \text{ whenever } \ \langle \nabla_x \psi(x, y), \nabla_y \psi(x, y)\rangle = 0. \tag{24}$$

We notice that the $\psi_1$ defined as (23) in Prop. 3.1 does not belong to $\Psi_{++}$. An example of such $\psi$ belonging to $\Psi_{++}$ is given in Prop. 3.2.

Proposition 3.2 Let $\psi_2 : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ be given by

$$\psi_2(x, y) := \frac{1}{2}\|\phi_{\mathrm{FB}}(x, y)_+\|^2, \tag{25}$$

where $\phi_{\mathrm{FB}}$ is defined as (10). Then, the following results hold.

(a) $\psi_2$ satisfies (14).

(b) $\psi_2$ is differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. Moreover, $\nabla_x \psi_2(0, 0) = \nabla_y \psi_2(0, 0) = 0$. If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \operatorname{int}(\mathcal{K}^n)$, then

$$\begin{aligned} \nabla_x \psi_2(x, y) &= \left( L_x L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{\mathrm{FB}}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \left( L_y L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{\mathrm{FB}}(x, y)_+. \end{aligned} \tag{26}$$

If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \operatorname{int}(\mathcal{K}^n)$, then $x_1^2 + y_1^2 \ne 0$ and

$$\begin{aligned} \nabla_x \psi_2(x, y) &= \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{\mathrm{FB}}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{\mathrm{FB}}(x, y)_+. \end{aligned} \tag{27}$$

(c) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

$$\langle \nabla_x \psi_2(x, y), \nabla_y \psi_2(x, y)\rangle \ge 0,$$

and the equality holds whenever $\psi_2(x, y) = 0$.

(d) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

$$\langle x, \nabla_x \psi_2(x, y)\rangle + \langle y, \nabla_y \psi_2(x, y)\rangle = \|\phi_{\mathrm{FB}}(x, y)_+\|^2.$$

(e) $\psi_2$ belongs to $\Psi_{++}$.
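The identity in part (d) lends itself to a quick numerical check. The plain-Python sketch below evaluates $\psi_2$ from (25) and approximates its gradients by central finite differences, which is a numerical stand-in and not the closed-form expressions (26) and (27). All helper names, the sample point and the step size are illustrative assumptions.

```python
import math

def inner(a, b):
    return sum(s * t for s, t in zip(a, b))

def jordan(x, y):
    return [inner(x, y)] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def spectral_map(v, g):
    """Apply a scalar function g to the spectral values of v: g(lam1) u^(1) + g(lam2) u^(2)."""
    v1, v2 = v[0], v[1:]
    nrm = math.sqrt(sum(t * t for t in v2))
    w = [t / nrm for t in v2] if nrm > 0.0 else [1.0] + [0.0] * (len(v2) - 1)
    c1, c2 = g(v1 - nrm), g(v1 + nrm)
    return [0.5 * (c1 + c2)] + [0.5 * (c2 - c1) * t for t in w]

def phi_fb(x, y):
    s = [a + b for a, b in zip(jordan(x, x), jordan(y, y))]
    root = spectral_map(s, lambda lam: math.sqrt(max(lam, 0.0)))  # (x^2 + y^2)^{1/2}
    return [r - a - b for r, a, b in zip(root, x, y)]

def psi2(x, y):
    p = spectral_map(phi_fb(x, y), lambda lam: max(lam, 0.0))     # phi_FB(x, y)_+
    return 0.5 * inner(p, p)

def fd_grad(f, v, h=1e-6):
    """Central finite-difference gradient of f at v."""
    g = []
    for i in range(len(v)):
        vp, vm = list(v), list(v)
        vp[i] += h
        vm[i] -= h
        g.append((f(vp) - f(vm)) / (2.0 * h))
    return g

x = [-1.0, 0.5, 0.0]
y = [0.3, 0.0, 0.2]
gx = fd_grad(lambda v: psi2(v, y), x)
gy = fd_grad(lambda v: psi2(x, v), y)
lhs = inner(x, gx) + inner(y, gy)   # <x, grad_x psi2> + <y, grad_y psi2>
rhs = 2.0 * psi2(x, y)              # ||phi_FB(x, y)_+||^2
```

At this sample point $x \notin \mathcal{K}^3$, so $\psi_2(x, y) > 0$, and `lhs` agrees with `rhs` up to the finite-difference error.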

Proof. (a) Suppose $\psi_2(x, y) = 0$ and $\langle x, y\rangle \le 0$. Let $z := -\phi_{\mathrm{FB}}(x, y)$. Then $(-z)_+ = \phi_{\mathrm{FB}}(x, y)_+ = 0$, which says $z \in \mathcal{K}^n$. Since $x + y = (x^2 + y^2)^{1/2} + z$, squaring both sides and simplifying yield

$$2(x \circ y) = 2\left( (x^2 + y^2)^{1/2} \circ z \right) + z^2.$$

Now, taking the trace of both sides and using the fact $\operatorname{tr}(x \circ y) = 2\langle x, y\rangle$, we obtain

$$4\langle x, y\rangle = 4\langle (x^2 + y^2)^{1/2}, z\rangle + 2\|z\|^2. \tag{28}$$

Since $(x^2 + y^2)^{1/2} \in \mathcal{K}^n$ and $z \in \mathcal{K}^n$, we know $\langle (x^2 + y^2)^{1/2}, z\rangle \ge 0$ by Lemma 3.1(b). Thus, the right-hand side of (28) is nonnegative, which together with $\langle x, y\rangle \le 0$ implies $\langle x, y\rangle = 0$. Therefore, equation (28) says $z = 0$, which is equivalent to $\phi_{\mathrm{FB}}(x, y) = 0$. Then by Lemma 2.1, we have $x, y \in \mathcal{K}^n$. Conversely, if $x, y \in \mathcal{K}^n$ and $\langle x, y\rangle = 0$, then again Lemma 2.1 yields $\phi_{\mathrm{FB}}(x, y) = 0$. Thus, $\psi_2(x, y) = 0$ and $\langle x, y\rangle \le 0$.

(b) For the proof of part (b), we need to discuss three cases.

Case (1): If $(x, y) = (0, 0)$, then for any $h, k \in \mathbb{R}^n$, let $\mu_1 \le \mu_2$ be the spectral values and let $v^{(1)}, v^{(2)}$ be the corresponding spectral vectors of $h^2 + k^2$. Hence, by Property 2.1(b),

$$\begin{aligned} \|(h^2 + k^2)^{1/2} - h - k\| &= \|\sqrt{\mu_1}\, v^{(1)} + \sqrt{\mu_2}\, v^{(2)} - h - k\| \\ &\le \sqrt{\mu_1}\, \|v^{(1)}\| + \sqrt{\mu_2}\, \|v^{(2)}\| + \|h\| + \|k\| \\ &= (\sqrt{\mu_1} + \sqrt{\mu_2})/\sqrt{2} + \|h\| + \|k\|. \end{aligned}$$

Also

$$\begin{aligned} \mu_1 \le \mu_2 &= \|h\|^2 + \|k\|^2 + 2\|h_1 h_2 + k_1 k_2\| \\ &\le \|h\|^2 + \|k\|^2 + 2|h_1|\|h_2\| + 2|k_1|\|k_2\| \\ &\le 2\|h\|^2 + 2\|k\|^2. \end{aligned}$$

Combining the above two inequalities yields

$$\begin{aligned} \psi_2(h, k) - \psi_2(0, 0) &= \frac{1}{2}\|\phi_{\mathrm{FB}}(h, k)_+\|^2 \\ &\le \|\phi_{\mathrm{FB}}(h, k)\|^2 \\ &= \|(h^2 + k^2)^{1/2} - h - k\|^2 \\ &\le \left( (\sqrt{\mu_1} + \sqrt{\mu_2})/\sqrt{2} + \|h\| + \|k\| \right)^2 \\ &\le \left( 2\sqrt{2\|h\|^2 + 2\|k\|^2}/\sqrt{2} + \|h\| + \|k\| \right)^2 \\ &= O(\|h\|^2 + \|k\|^2), \end{aligned}$$

where the first inequality is from Lemma 2.5. This shows that $\psi_2$ is differentiable at $(0, 0)$ with

$$\nabla_x \psi_2(0, 0) = \nabla_y \psi_2(0, 0) = 0.$$

Case (2): If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \operatorname{int}(\mathcal{K}^n)$, let any $z \in \mathbb{R}^n$ be factored as $z = \lambda_1 u^{(1)} + \lambda_2 u^{(2)}$. Now, let $g : \mathbb{R}^n \to \mathbb{R}^n$ be defined as

$$g(z) := \frac{1}{2}\big((z)_+\big)^2 = \hat{g}(\lambda_1) u^{(1)} + \hat{g}(\lambda_2) u^{(2)},$$

where $\hat{g} : \mathbb{R} \to \mathbb{R}$ is given by $\hat{g}(\lambda) := \frac{1}{2}(\max(0, \lambda))^2$. From the continuous differentiability of $\hat{g}$ and Prop. 5.2 of [5], the vector-valued function $g$ is also continuously differentiable. Hence, the first component $g_1(z) = \frac{1}{2}\|(z)_+\|^2$ of $g(z)$ is continuously differentiable as well. By an easy computation, we have $\nabla g_1(z) = (z)_+$. Since $\psi_2(x, y) = g_1(\phi_{\mathrm{FB}}(x, y))$ and $\phi_{\mathrm{FB}}$ is differentiable at $(x, y) \ne (0, 0)$ with $x^2 + y^2 \in \operatorname{int}(\mathcal{K}^n)$ (see [11, Cor. 5.2]), the chain rule yields

$$\begin{aligned} \nabla_x \psi_2(x, y) &= \nabla_x \phi_{\mathrm{FB}}(x, y)\, \nabla g_1(\phi_{\mathrm{FB}}(x, y)) = \left( L_x L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{\mathrm{FB}}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \nabla_y \phi_{\mathrm{FB}}(x, y)\, \nabla g_1(\phi_{\mathrm{FB}}(x, y)) = \left( L_y L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{\mathrm{FB}}(x, y)_+. \end{aligned}$$

Case (3): If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \operatorname{int}(\mathcal{K}^n)$, by direct computation, we know $\|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|$ under this case. Since $(x, y) \ne (0, 0)$, this also implies $x_1 x_2 + y_1 y_2 \ne 0$. We notice that we cannot apply the chain rule as in Case (2) since $\phi_{\mathrm{FB}}$ is no longer differentiable at such $(x, y)$ of Case (3). By the spectral factorization, we observe that

$$\begin{aligned} \phi_{\mathrm{FB}}(x, y)_+ = \phi_{\mathrm{FB}}(x, y) &\iff \phi_{\mathrm{FB}}(x, y) \in \mathcal{K}^n, \\ \phi_{\mathrm{FB}}(x, y)_+ = 0 &\iff \phi_{\mathrm{FB}}(x, y) \in -\mathcal{K}^n, \\ \phi_{\mathrm{FB}}(x, y)_+ = \lambda_2 u^{(2)} &\iff \phi_{\mathrm{FB}}(x, y) \notin \mathcal{K}^n \cup -\mathcal{K}^n, \end{aligned} \tag{29}$$

where $\lambda_2$ is the bigger spectral value of $\phi_{\mathrm{FB}}(x, y)$ and $u^{(2)}$ is the corresponding spectral vector. Indeed, by applying Lemma 2.2, under this case, we have (as in [4, eq. (26)])

$$\phi_{\mathrm{FB}}(x, y) = \left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1),\ \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}} - (x_2 + y_2) \right). \tag{30}$$

Therefore, $\lambda_2$ and $u^{(2)}$ are given as below:

$$\lambda_2 = \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\|, \tag{31}$$

$$u^{(2)} = \frac{1}{2}\left( 1,\ \frac{w_2}{\|w_2\|} \right), \tag{32}$$

where $w_2 = \dfrac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}} - (x_2 + y_2)$. To prove the differentiability of $\psi_2$ under this case, we shall discuss the following three subcases according to the above observation (29).

(i) If $\phi_{\mathrm{FB}}(x, y) \notin \mathcal{K}^n \cup -\mathcal{K}^n$, then $\phi_{\mathrm{FB}}(x, y)_+ = \lambda_2 u^{(2)}$ where $\lambda_2$ and $u^{(2)}$ are given as in (31) and (32). From the fact that $\|u^{(2)}\| = \frac{1}{\sqrt{2}}$, we obtain

$$\begin{aligned} \psi_2(x, y) &= \frac{1}{2}\|\phi_{\mathrm{FB}}(x, y)_+\|^2 = \frac{1}{4}\lambda_2^2 \\ &= \frac{1}{4}\left[ \left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)^2 + 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right) \cdot \|w_2\| + \|w_2\|^2 \right]. \end{aligned}$$

Since $(x, y) \ne (0, 0)$ in this case, $\psi_2$ is clearly differentiable. Moreover, using the product rule and chain rule for differentiation, the derivative of $\psi_2$ with respect to $x_1$ works out to be

$$\begin{aligned} \frac{\partial}{\partial x_1}\psi_2(x, y) &= \frac{1}{4}\Bigg[ 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) + 2\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\|w_2\| \\ &\qquad\quad + 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right) \cdot \frac{w_2^T \nabla_{x_1} w_2}{\|w_2\|} + 2 w_2^T \nabla_{x_1} w_2 \Bigg] \\ &= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right). \end{aligned}$$

The last equality of the above expression is true because of

$$\begin{aligned} \nabla_{x_1} w_2 &= \frac{x_2 \cdot \sqrt{x_1^2 + y_1^2} - (x_1 x_2 + y_1 y_2) \cdot \dfrac{x_1}{\sqrt{x_1^2 + y_1^2}}}{x_1^2 + y_1^2} \\ &= \frac{\dfrac{1}{\sqrt{x_1^2 + y_1^2}}\left[ x_2 (x_1^2 + y_1^2) - (x_1^2 x_2 + x_1 y_1 y_2) \right]}{x_1^2 + y_1^2} \\ &= \frac{x_1^2 x_2 + y_1^2 x_2 - x_1^2 x_2 - x_1 y_1 y_2}{\left( \sqrt{x_1^2 + y_1^2} \right)^3} \\ &= 0, \end{aligned}$$

where the last equality holds by Lemma 2.2. Similarly, the gradient of $\psi_2$ with respect to $x_2$ works out to be

$$\begin{aligned} \nabla_{x_2}\psi_2(x, y) &= \frac{1}{4}\left[ 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)\frac{\nabla_{x_2} w_2 \cdot w_2}{\|w_2\|} + 2\nabla_{x_2} w_2 \cdot w_2 \right] \\ &= \frac{1}{2}\left[ \left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\frac{w_2}{\|w_2\|} + \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) w_2 \right] \\ &= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right)\frac{w_2}{\|w_2\|}. \end{aligned}$$

Then, we can rewrite $\nabla_x \psi_2(x, y)$ as

$$\nabla_x \psi_2(x, y) = \begin{bmatrix} \dfrac{\partial}{\partial x_1}\psi_2(x, y) \\[2mm] \nabla_{x_2}\psi_2(x, y) \end{bmatrix} := \begin{bmatrix} \Xi_1 \\ \Xi_2 \end{bmatrix} = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\lambda_2 u^{(2)} = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y)_+, \tag{33}$$

where

$$\begin{aligned} \Xi_1 &:= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right) \in \mathbb{R}, \\ \Xi_2 &:= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right)\frac{w_2}{\|w_2\|} \in \mathbb{R}^{n-1}. \end{aligned}$$

(ii) If $\phi_{\mathrm{FB}}(x, y) \in \mathcal{K}^n$, then $\phi_{\mathrm{FB}}(x, y)_+ = \phi_{\mathrm{FB}}(x, y)$ and hence $\psi_2(x, y) = \frac{1}{2}\|\phi_{\mathrm{FB}}(x, y)_+\|^2 = \frac{1}{2}\|\phi_{\mathrm{FB}}(x, y)\|^2$. Thus, by [4, Prop. 3.1(b)], we know that the gradient of $\psi_2$ under this subcase is as below:

$$\begin{aligned} \nabla_x \psi_2(x, y) &= \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y) = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y) = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y)_+. \end{aligned} \tag{34}$$

Consider points $(x', y')$ in a neighborhood of $(x, y)$ such that $\phi_{\mathrm{FB}}(x', y') \notin \mathcal{K}^n \cup -\mathcal{K}^n$ and $\phi_{\mathrm{FB}}(x', y') \to \phi_{\mathrm{FB}}(x, y) \in \mathcal{K}^n$. From (33) and (34), it can be seen that

$$\nabla_x \psi_2(x', y') \to \nabla_x \psi_2(x, y), \qquad \nabla_y \psi_2(x', y') \to \nabla_y \psi_2(x, y).$$

Thus, $\psi_2$ is differentiable under this subcase.

(iii) If $\phi_{\mathrm{FB}}(x, y) \in -\mathcal{K}^n$, then $\phi_{\mathrm{FB}}(x, y)_+ = 0$. Thus, $\psi_2(x, y) = \frac{1}{2}\|\phi_{\mathrm{FB}}(x, y)_+\|^2 = 0$ and it is clear that its gradient under this subcase is

$$\begin{aligned} \nabla_x \psi_2(x, y) &= 0 = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= 0 = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\phi_{\mathrm{FB}}(x, y)_+. \end{aligned} \tag{35}$$

Again, consider points $(x', y')$ in a neighborhood of $(x, y)$ such that $\phi_{\mathrm{FB}}(x', y') \notin \mathcal{K}^n \cup -\mathcal{K}^n$ and $\phi_{\mathrm{FB}}(x', y') \to \phi_{\mathrm{FB}}(x, y) \in -\mathcal{K}^n$. From (33) and (35), it can be seen that

$$\nabla_x \psi_2(x', y') \to 0 = \nabla_x \psi_2(x, y), \qquad \nabla_y \psi_2(x', y') \to 0 = \nabla_y \psi_2(x, y).$$

Thus, $\psi_2$ is differentiable under this subcase.

From the above, we complete the proof of this case and therefore the proof for part (b) is done.

(c) We wish to show that $\langle \nabla_x \psi_2(x, y), \nabla_y \psi_2(x, y)\rangle \ge 0$ and the equality holds if and only if $\psi_2(x, y) = 0$. We follow the three cases as above.

Case (1): If $(x, y) = (0, 0)$, by part (b), we know $\nabla_x \psi_2(x, y) = \nabla_y \psi_2(x, y) = 0$. Therefore, the desired equality holds.