Volume 83, Number 287, May 2014, Pages 1143–1171 S 0025-5718(2013)02742-1

Article electronically published on July 15, 2013

**ON THE GENERALIZED FISCHER-BURMEISTER MERIT FUNCTION FOR THE SECOND-ORDER CONE COMPLEMENTARITY PROBLEM**

SHAOHUA PAN, SANGHO KUM, YONGDO LIM, AND JEIN-SHAN CHEN

Abstract. It has been an open question whether the family of merit functions $\psi_p$ $(p > 1)$, the generalized Fischer-Burmeister (FB) merit function, associated to the second-order cone is smooth or not. In this paper we answer it partly, and show that $\psi_p$ is smooth for $p \in (1, 4)$, and we provide the condition for its coerciveness. Numerical results are reported to illustrate the influence of $p$ on the performance of the merit function method based on $\psi_p$.

1. Introduction

Given two continuously differentiable mappings $F, G : \mathbb{R}^n \to \mathbb{R}^n$, we consider the second-order cone complementarity problem (SOCCP): to seek a $\zeta \in \mathbb{R}^n$ such that

$$F(\zeta) \in \mathcal{K}, \quad G(\zeta) \in \mathcal{K}, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product that induces the norm $\|\cdot\|$, and $\mathcal{K}$ is the Cartesian product of a group of second-order cones (SOCs). In other words,

$$\mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_m}, \tag{2}$$

where $n_1, \ldots, n_m \ge 1$, $n_1 + \cdots + n_m = n$, and $\mathcal{K}^{n_i}$ is the SOC in $\mathbb{R}^{n_i}$ defined by

$$\mathcal{K}^{n_i} := \left\{ (x_{i1}, x_{i2}) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_{i1} \ge \|x_{i2}\| \right\}.$$
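The definition of $\mathcal{K}$ in (2) can be illustrated with a small membership test. The following is a minimal sketch in plain Python; the function names `in_soc` and `in_product_cone` and the tolerance parameter `tol` are illustrative choices, not notation from the paper.

```python
import math

def in_soc(x, tol=1e-12):
    """Return True if x = (x1, x2) satisfies x1 >= ||x2|| (up to tol)."""
    x1, x2 = x[0], x[1:]
    return x1 >= math.sqrt(sum(t * t for t in x2)) - tol

def in_product_cone(x, dims, tol=1e-12):
    """Return True if x lies in K^{n_1} x ... x K^{n_m}, dims = [n_1, ..., n_m]."""
    if sum(dims) != len(x):
        raise ValueError("block sizes must add up to len(x)")
    start = 0
    for n_i in dims:
        # Test each block of x against its own second-order cone.
        if not in_soc(x[start:start + n_i], tol):
            return False
        start += n_i
    return True
```

With `dims = [n]` the test reduces to membership in a single cone $\mathcal{K}^n$, and with `dims = [1] * n` it reduces to membership in the nonnegative orthant.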

As an extension of the nonlinear complementarity problem (NCP) over the nonnegative orthant cone $\mathbb{R}^n_+$ (see [13]), the SOCCP has important applications in engineering problems [21] and robust Nash equilibria [19]. In particular, it also

Received by the editor August 15, 2010, and in revised form, April 18, 2011 and August 7, 2012.

*2010 Mathematics Subject Classiﬁcation. Primary 90C33, 90C25.*

*Key words and phrases. Second-order cones, complementarity problem, generalized FB merit*
function.

The first author's work was supported by National Young Natural Science Foundation (No. 10901058) and the Fundamental Research Funds for the Central Universities (SCUT).

The second author’s work was supported by Basic Science Research Program through NRF Grant No. 2012-0001740.

The third author’s work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No.2012-005191).

Corresponding author: The fourth author is a member of the Mathematics Division, National Center for Theoretical Sciences, Taipei Oﬃce. The fourth author’s work was supported by National Science Council of Taiwan.

© 2013 American Mathematical Society

arises from the suitable reformulation for the Karush-Kuhn-Tucker (KKT) optimality conditions of the nonlinear second-order cone programming (SOCP):

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & Ax = b, \quad x \in \mathcal{K}, \end{array} \tag{3}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable function, $A$ is an $m \times n$ real matrix with full row rank, and $b \in \mathbb{R}^m$. It is well known that the SOCP has very wide applications in engineering design, control, management science, and so on; see [1, 25] and the references therein.

In the past several years, various methods have been proposed for SOCPs and SOCCPs. They include the interior-point methods [2, 26, 28, 31, 33], the smoothing Newton methods [11, 15, 18], the semismooth Newton methods [22, 34], and the merit function methods [4, 12]. The merit function method aims to seek a smooth (continuously differentiable) function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfying

$$\psi(x, y) = 0 \iff x \in \mathcal{K}, \ y \in \mathcal{K}, \ \langle x, y \rangle = 0, \tag{4}$$

so that the SOCCP can be reformulated as an unconstrained minimization problem

$$\min_{\zeta \in \mathbb{R}^n} \Psi(\zeta) := \psi(F(\zeta), G(\zeta)) \tag{5}$$

in the sense that $\zeta^*$ is a solution to (1) if and only if it solves (5) with zero optimal value. We call such $\psi$ a merit function associated with $\mathcal{K}$. Note that the smooth merit functions also play a key role in the globalization of semismooth and smoothing Newton methods.

This paper is concerned with the generalized Fischer-Burmeister (FB) merit function

$$\psi_p(x, y) := \frac{1}{2} \|\phi_p(x, y)\|^2, \tag{6}$$

where $p$ is a fixed real number from $(1, +\infty)$, and $\phi_p : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is defined by

$$\phi_p(x, y) := \sqrt[p]{|x|^p + |y|^p} - (x + y) \tag{7}$$

with $|x|^p$ being the vector-valued function (or Löwner function) associated with $|t|^p$ $(t \in \mathbb{R})$ (see Section 2 for the definition). Clearly, when $p = 2$, $\psi_p$ reduces to the FB merit function

$$\psi_{FB}(x, y) := \frac{1}{2} \|\phi_{FB}(x, y)\|^2,$$

where $\phi_{FB} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the FB SOC complementarity function defined by

$$\phi_{FB}(x, y) := \sqrt{x^2 + y^2} - (x + y),$$

with $x^2 = x \circ x$ being the Jordan product of $x$ with itself, and $\sqrt{x}$ with $x \in \mathcal{K}$ being the unique vector such that $\sqrt{x} \circ \sqrt{x} = x$. The function $\psi_{FB}$ is shown to be a smooth merit function with globally Lipschitz continuous derivative [10, 12]. Such a desirable property is also proved for the FB matrix-valued merit function [30, 32].

In this paper, we study the favorable properties of $\psi_p$. The motivations for studying this family of merit functions are as follows. In the setting of NCPs, $\psi_p$ is shown to share all the favorable properties of the FB merit function (see [9, 8]), and the performance profile in [5] indicates that the semismooth Newton method based on $\phi_p$ performs better with a smaller $p$ than with a larger $p$. Thus, it is very natural to ask whether $\psi_p$ has the desirable properties of the FB merit function in the setting of SOCCPs, and how the merit function method and the Newton-type methods based on $\phi_p$ perform with respect to $p$. This work is the first step in resolving these questions. Although some papers [4, 6, 12] study the smoothness of merit functions for SOCCPs, the analysis techniques therein are not applicable to the general function $\psi_p$. We hope that the analysis technique of this paper will be helpful in handling general Löwner operators.

The main contribution of this paper is to show that $\psi_p$ with $p \in (1, 4)$ is a smooth merit function associated with $\mathcal{K}$, and to establish the coerciveness of $\Psi_p(\zeta) := \psi_p(F(\zeta), G(\zeta))$ under the uniform Jordan $P$-property and the linear growth of $F$. Throughout this paper, we will focus on the case of $\mathcal{K} = \mathcal{K}^n$, and all the analysis can be carried over to the general case where $\mathcal{K}$ is the Cartesian product of $\mathcal{K}^{n_i}$. To this end, for any given $x \in \mathbb{R}^n$ with $n > 1$, we write $x = (x_1, x_2)$ where $x_1$ is the first component of $x$, and $x_2$ is the column vector consisting of the remaining components of $x$; and let $\bar{x}_2 = \frac{x_2}{\|x_2\|}$ whenever $x_2 \ne 0$, and otherwise let $\bar{x}_2$ be an arbitrary vector in $\mathbb{R}^{n-1}$ with $\|\bar{x}_2\| = 1$. We denote by $\mathrm{int}\,\mathcal{K}^n$, $\mathrm{bd}\,\mathcal{K}^n$, and $\mathrm{bd}^+\mathcal{K}^n$ the interior, the boundary, and the boundary excluding the origin, respectively, of $\mathcal{K}^n$. For any $x, y \in \mathbb{R}^n$, $x \succeq_{\mathcal{K}^n} y$ means $x - y \in \mathcal{K}^n$; and $x \succ_{\mathcal{K}^n} y$ means $x - y \in \mathrm{int}\,\mathcal{K}^n$. For a real symmetric matrix $A$, we write $A \succeq 0$ (respectively, $A \succ 0$) to mean that $A$ is positive semidefinite (respectively, positive definite). For a differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x)$ denotes the transposed Jacobian of $F$ at $x$. For nonnegative $\alpha$ and $\beta$, $\alpha = O(\beta)$ means $\alpha \le C\beta$ for some $C > 0$ independent of $\alpha$ and $\beta$. The notation $I$ always represents an identity matrix of appropriate dimension.

2. Preliminaries

The Jordan product of any two vectors $x$ and $y$ associated with $\mathcal{K}^n$ (see [14]) is defined as

$$x \circ y := (\langle x, y \rangle, \ y_1 x_2 + x_1 y_2).$$

The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of SOCCP. The identity element under this product is $e = (1, 0, \ldots, 0)^T \in \mathbb{R}^n$.
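The Jordan product and its lack of associativity can be checked on a small example. The sketch below is a plain-Python illustration; the function name `jordan` and the test vectors are choices made here and do not come from the paper.

```python
def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2) associated with K^n."""
    inner = sum(a * b for a, b in zip(x, y))
    tail = [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]
    return [inner] + tail

# Integer vectors keep the arithmetic exact.
x, y, z = [1, 1, 0], [1, 0, 1], [1, 1, 1]
e = [1, 0, 0]  # identity element

left = jordan(jordan(x, y), z)   # (x o y) o z
right = jordan(x, jordan(y, z))  # x o (y o z)
```

For these vectors the two bracketings differ in the second component, while `jordan(e, x)` returns `x` unchanged and `jordan(x, y)` equals `jordan(y, x)`.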

For any given $x \in \mathbb{R}^n$, define $L_x : \mathbb{R}^n \to \mathbb{R}^n$ by

$$L_x y := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix} y = x \circ y \quad \forall y \in \mathbb{R}^n.$$

Recall from [14] that each $x \in \mathbb{R}^n$ has a spectral factorization associated with $\mathcal{K}^n$:

$$x = \lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}, \tag{8}$$

where $\lambda_i(x)$ and $u_x^{(i)}$ for $i = 1, 2$ are the spectral values of $x$ and the corresponding spectral vectors, respectively, defined by

$$\lambda_i(x) := x_1 + (-1)^i \|x_2\| \quad \text{and} \quad u_x^{(i)} := \frac{1}{2} \begin{pmatrix} 1 \\ (-1)^i \bar{x}_2 \end{pmatrix}, \tag{9}$$

with $\bar{x}_2$ as defined in Section 1. The factorization is unique when $x_2 \ne 0$. The following lemma states the relation between the spectral factorization of $x$ and the eigenvalue decomposition of $L_x$.

**Lemma 2.1** ([14, 15]). *For any given $x \in \mathbb{R}^n$, let $\lambda_1(x), \lambda_2(x)$ be the spectral values of $x$, and let $u_x^{(1)}, u_x^{(2)}$ be the corresponding spectral vectors. Then, we have*

$$L_x = U_x \, \mathrm{diag}\left(\lambda_2(x), x_1, \ldots, x_1, \lambda_1(x)\right) U_x^T$$

*with $U_x = [\sqrt{2} u_x^{(2)} \ \ u^{(3)} \ \cdots \ u^{(n)} \ \ \sqrt{2} u_x^{(1)}] \in \mathbb{R}^{n \times n}$ being an orthogonal matrix, where $u^{(i)} = (0, u^i)$ for $i = 3, \ldots, n$ with $u^3, \ldots, u^n$ being any unit vectors that span the linear subspace orthogonal to $x_2$.*
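The spectral factorization (8)-(9) and the eigenvalue relation in Lemma 2.1 can be verified numerically. The following plain-Python sketch assumes $n \ge 2$; the helper names `spectral`, `jordan`, and `recompose` are illustrative, and the choice of the first basis vector when $x_2 = 0$ is one arbitrary admissible option.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, n >= 2."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    # When x2 = 0 any unit vector is admissible; the first basis vector is used here.
    bar = [t / nrm for t in x2] if nrm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in bar]
    u2 = [0.5] + [0.5 * t for t in bar]
    return x1 - nrm, x1 + nrm, u1, u2

def jordan(x, y):
    """Jordan product x o y, which equals L_x y."""
    inner = sum(a * b for a, b in zip(x, y))
    return [inner] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def recompose(lam1, lam2, u1, u2):
    """Rebuild x from its spectral factorization (8)."""
    return [lam1 * a + lam2 * b for a, b in zip(u1, u2)]

x = [1.0, 3.0, 4.0]
lam1, lam2, u1, u2 = spectral(x)
```

Here `lam1` and `lam2` are $x_1 \mp \|x_2\|$, and `jordan(x, u1)` should agree with `lam1 * u1` componentwise, which is the statement that the spectral vectors are eigenvectors of $L_x$.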

By Lemma 2.1, clearly, $L_x \succeq 0$ iff (if and only if) $x \succeq_{\mathcal{K}^n} 0$, $L_x \succ 0$ iff $x \succ_{\mathcal{K}^n} 0$, and $L_x$ is invertible iff $x_1 \ne 0$ and $\det(x) := x_1^2 - \|x_2\|^2 \ne 0$. Also, if $L_x$ is invertible,

$$L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}. \tag{10}$$
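Formula (10) can be checked by multiplying it against $L_x$. The sketch below builds both matrices as nested Python lists; the names `arrow`, `arrow_inverse`, and `matmul` are hypothetical helpers introduced only for this illustration.

```python
def arrow(x):
    """Matrix L_x = [[x1, x2^T], [x2, x1*I]] as a list of rows."""
    n = len(x)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = x[0]
    for j in range(1, n):
        L[0][j] = x[j]
        L[j][0] = x[j]
    return L

def arrow_inverse(x):
    """Inverse of L_x from formula (10); requires x1 != 0 and det(x) != 0."""
    n = len(x)
    det = x[0] ** 2 - sum(t * t for t in x[1:])
    if x[0] == 0 or det == 0:
        raise ValueError("L_x is not invertible")
    inv = [[0.0] * n for _ in range(n)]
    inv[0][0] = x[0] / det
    for j in range(1, n):
        inv[0][j] = inv[j][0] = -x[j] / det
        for k in range(1, n):
            diag = det / x[0] if j == k else 0.0
            inv[j][k] = (diag + x[j] * x[k] / x[0]) / det
    return inv

def matmul(A, B):
    """Plain matrix product of two square lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

x = [2.0, 1.0, 0.5]
product = matmul(arrow(x), arrow_inverse(x))
```

For this `x`, `product` should be the identity matrix up to rounding error.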

Given a scalar function $g : \mathbb{R} \to \mathbb{R}$, define a vector function $g^{soc} : \mathbb{R}^n \to \mathbb{R}^n$ by

$$g^{soc}(x) := g(\lambda_1(x)) u_x^{(1)} + g(\lambda_2(x)) u_x^{(2)}. \tag{11}$$

If $g$ is defined on a subset of $\mathbb{R}$, then $g^{soc}$ is defined on the corresponding subset of $\mathbb{R}^n$. The definition of $g^{soc}$ is unambiguous whether $x_2 \ne 0$ or $x_2 = 0$. In this paper, we often use the vector-valued functions associated with $|t|^p$ $(t \in \mathbb{R})$ and $\sqrt[p]{t}$ $(t \ge 0)$, respectively, written as

$$|x|^p := |\lambda_1(x)|^p u_x^{(1)} + |\lambda_2(x)|^p u_x^{(2)} \quad \forall x \in \mathbb{R}^n,$$

$$\sqrt[p]{x} := \sqrt[p]{\lambda_1(x)} \, u_x^{(1)} + \sqrt[p]{\lambda_2(x)} \, u_x^{(2)} \quad \forall x \in \mathcal{K}^n.$$

The two functions show that $\phi_p$ in (7) is well defined for any $x, y \in \mathbb{R}^n$.
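Definition (11) translates directly into code: apply the scalar function to the two spectral values and recombine. The sketch below is a plain-Python illustration assuming $n \ge 2$; `soc_apply`, `abs_pow`, and `root_p` are names chosen here, and the clamp `max(t, 0.0)` is a numerical safeguard against rounding, not part of the paper's definition.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, n >= 2."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    bar = [t / nrm for t in x2] if nrm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in bar]
    u2 = [0.5] + [0.5 * t for t in bar]
    return x1 - nrm, x1 + nrm, u1, u2

def soc_apply(g, x):
    """g^soc(x) = g(lam1) u1 + g(lam2) u2, as in (11)."""
    lam1, lam2, u1, u2 = spectral(x)
    return [g(lam1) * a + g(lam2) * b for a, b in zip(u1, u2)]

def abs_pow(x, p):
    """Vector-valued |x|^p."""
    return soc_apply(lambda t: abs(t) ** p, x)

def root_p(x, p):
    """Vector-valued p-th root for x in K^n (tiny negative values clamped to 0)."""
    return soc_apply(lambda t: max(t, 0.0) ** (1.0 / p), x)

x = [1.0, 3.0, 4.0]       # spectral values -4 and 6
x_sq = abs_pow(x, 2)      # should coincide with the Jordan square x o x
x_abs = root_p(x_sq, 2)   # should coincide with |x|
```

For $p = 2$ the function `abs_pow` reproduces $x \circ x = (\|x\|^2, 2x_1x_2)$, and taking the square root of that vector returns $|x|$ rather than $x$, because the negative spectral value $-4$ is replaced by its absolute value.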

We next present four lemmas that will often be used in the subsequent analysis.

**Lemma 2.2** ([23, 24]). *For any given $0 \le \rho \le 1$, $\xi^\rho \succeq_{\mathcal{K}^n} \eta^\rho$ when $\xi \succeq_{\mathcal{K}^n} \eta \succeq_{\mathcal{K}^n} 0$.*

**Lemma 2.3.** *For any nonnegative real numbers $a$ and $b$, the following results hold:*

**(a):** *$(a + b)^\rho \ge a^\rho + b^\rho$ if $\rho > 1$, and the equality holds iff $ab = 0$;*

**(b):** *$(a + b)^\rho \le a^\rho + b^\rho$ if $0 < \rho < 1$, and the equality holds iff $ab = 0$.*

*Proof.* Without loss of generality, we assume that $a \le b$ and $b > 0$. Consider the function $h(t) = (t + 1)^\rho - (t^\rho + 1)$ $(t \ge 0)$. It is easy to verify that $h$ is increasing on $[0, +\infty)$ when $\rho > 1$. Hence, $h(a/b) \ge h(0) = 0$, i.e., $(a + b)^\rho \ge a^\rho + b^\rho$. Also, $h(a/b) = h(0)$ if and only if $a/b = 0$. That is, $(a + b)^\rho = a^\rho + b^\rho$ if and only if $ab = 0$. This proves part (a). Note that $h$ is decreasing on $[0, +\infty)$ when $0 < \rho < 1$, and a similar argument leads to part (b).

**Lemma 2.4.** *For any $\xi, \eta \in \mathcal{K}^n$, if $\xi + \eta \in \mathrm{bd}\,\mathcal{K}^n$, then one of the following cases must hold: (i) $\xi = 0$, $\eta \in \mathrm{bd}\,\mathcal{K}^n$; (ii) $\xi \in \mathrm{bd}\,\mathcal{K}^n$, $\eta = 0$; (iii) $\xi = \gamma\eta$ for some $\gamma > 0$ with $\eta \in \mathrm{bd}^+\mathcal{K}^n$.*

*Proof.* From $\xi, \eta \in \mathcal{K}^n$ and $\xi + \eta \in \mathrm{bd}\,\mathcal{K}^n$, we immediately obtain that

$$\|\xi_2\| + \|\eta_2\| \ge \|\xi_2 + \eta_2\| = \xi_1 + \eta_1 \ge \|\xi_2\| + \|\eta_2\|.$$

This shows that $\xi_2 = 0$, or $\eta_2 = 0$, or $\xi_2 = \gamma\eta_2 \ne 0$ for some $\gamma > 0$. Substituting $\xi_2 = 0$, or $\eta_2 = 0$, or $\xi_2 = \gamma\eta_2$ into $\|\xi_2 + \eta_2\| = \xi_1 + \eta_1$ yields the result.
To close this section, we show that $\phi_p$ in (7) is an SOC complementarity function, and then its squared norm $\psi_p$ is a merit function associated with $\mathcal{K}^n$.

**Lemma 2.5.** *Let $\phi_p$ be defined by (7). Then, for any $x, y \in \mathbb{R}^n$, it holds that*

$$\phi_p(x, y) = 0 \iff x \in \mathcal{K}^n, \ y \in \mathcal{K}^n, \ \langle x, y \rangle = 0.$$

*Proof.* "$\Leftarrow$". From [16, Proposition 6], there exists a Jordan frame $\{u^{(1)}, u^{(2)}\}$ such that $x = \lambda_1 u^{(1)} + \lambda_2 u^{(2)}$ and $y = \mu_1 u^{(1)} + \mu_2 u^{(2)}$ with $\lambda_i, \mu_i \ge 0$ for $i = 1, 2$. Then,

$$(x + y)^p = (\lambda_1 + \mu_1)^p u^{(1)} + (\lambda_2 + \mu_2)^p u^{(2)},$$

$$x^p + y^p = (\lambda_1^p + \mu_1^p) u^{(1)} + (\lambda_2^p + \mu_2^p) u^{(2)}.$$

Since $0 = 2\langle x, y \rangle = \lambda_1\mu_1 + \lambda_2\mu_2$ implies $\lambda_1\mu_1 = \lambda_2\mu_2 = 0$, from the last two equalities and Lemma 2.3(a) we obtain $(x + y)^p = x^p + y^p$, and then $\phi_p(x, y) = 0$.

"$\Rightarrow$". Since $\phi_p(x, y) = 0$, we have $x = \sqrt[p]{|x|^p + |y|^p} - y \succeq_{\mathcal{K}^n} |y| - y \in \mathcal{K}^n$, where the inequality is due to Lemma 2.2. Similarly, we have $y = \sqrt[p]{|x|^p + |y|^p} - x \succeq_{\mathcal{K}^n} |x| - x \in \mathcal{K}^n$. Now from $\phi_p(x, y) = 0$, we have $(x + y)^p = x^p + y^p$, and then

$$(\lambda_1(x + y))^p + (\lambda_2(x + y))^p = (\lambda_1(x))^p + (\lambda_2(x))^p + (\lambda_1(y))^p + (\lambda_2(y))^p.$$

Noting that $h(t) = (t_0 + t)^p + (t_0 - t)^p$ for a fixed $t_0 \ge 0$ is increasing on $[0, t_0]$, we also have

$$\begin{aligned} [\lambda_1(x + y)]^p + [\lambda_2(x + y)]^p &\ge (x_1 + y_1 - \|x_2\| + \|y_2\|)^p + (x_1 + y_1 + \|x_2\| - \|y_2\|)^p \\ &= (\lambda_1(x) + \lambda_2(y))^p + (\lambda_2(x) + \lambda_1(y))^p \\ &\ge (\lambda_1(x))^p + (\lambda_2(y))^p + (\lambda_2(x))^p + (\lambda_1(y))^p, \end{aligned} \tag{12}$$

where the last inequality is due to Lemma 2.3(a) and $x, y \in \mathcal{K}^n$. The last two equations imply that all the inequalities on the right-hand side of (12) become equalities. Therefore,

$$\|x_2 + y_2\| = \big|\,\|x_2\| - \|y_2\|\,\big|, \quad \lambda_1(x)\lambda_2(y) = 0, \quad \lambda_2(x)\lambda_1(y) = 0. \tag{13}$$

Assume that $x_2 \ne 0$ and $y_2 \ne 0$. Since $x, y \in \mathcal{K}^n$, from the equalities in (13), we get $x_1 = \|x_2\|$, $y_1 = \|y_2\|$, and $x_2 = \gamma y_2$ for some $\gamma < 0$, which implies $\langle x, y \rangle = 0$. When $x_2 = 0$ or $y_2 = 0$, using the continuity of the inner product yields $\langle x, y \rangle = 0$.
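Lemma 2.5 can be illustrated numerically by evaluating $\phi_p$ at a complementary pair and at a pair that violates complementarity. The plain-Python sketch below assumes $n \ge 2$; the helper names and the sample points are choices made for this illustration only.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, n >= 2."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    bar = [t / nrm for t in x2] if nrm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in bar]
    u2 = [0.5] + [0.5 * t for t in bar]
    return x1 - nrm, x1 + nrm, u1, u2

def soc_apply(g, x):
    """g^soc(x) as in (11)."""
    lam1, lam2, u1, u2 = spectral(x)
    return [g(lam1) * a + g(lam2) * b for a, b in zip(u1, u2)]

def phi_p(x, y, p):
    """phi_p(x, y) = (|x|^p + |y|^p)^(1/p) - (x + y), as in (7)."""
    ax = soc_apply(lambda t: abs(t) ** p, x)
    ay = soc_apply(lambda t: abs(t) ** p, y)
    w = [a + b for a, b in zip(ax, ay)]
    # Clamp guards against tiny negative spectral values caused by rounding.
    z = soc_apply(lambda t: max(t, 0.0) ** (1.0 / p), w)
    return [zi - xi - yi for zi, xi, yi in zip(z, x, y)]

def psi_p(x, y, p):
    """psi_p(x, y) = 0.5 * ||phi_p(x, y)||^2, as in (6)."""
    return 0.5 * sum(t * t for t in phi_p(x, y, p))

# x and y lie on the boundary of K^3 and satisfy <x, y> = 0.
x_c, y_c = [1.0, 1.0, 0.0], [1.0, -1.0, 0.0]
# Both vectors lie in K^3 but <x, y> = 1, so complementarity fails.
x_n, y_n = [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]
```

With $p = 3$, `psi_p(x_c, y_c, 3)` vanishes up to rounding, whereas `psi_p(x_n, y_n, 3)` is strictly positive, in line with the equivalence in the lemma.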

3. Differentiability of $\psi_p$

Unless otherwise stated, in the rest of this paper, we assume that $p > 1$ with $q = (1 - p^{-1})^{-1}$, and $g^{soc}$ is the vector-valued function associated with $|t|^p$ $(t \in \mathbb{R})$, i.e., $g^{soc}(x) = |x|^p$. For any $x, y \in \mathbb{R}^n$, we define

$$w = w(x, y) := |x|^p + |y|^p \quad \text{and} \quad z = z(x, y) := \sqrt[p]{|x|^p + |y|^p}. \tag{14}$$

By definitions of $|x|^p$ and $|y|^p$, clearly,

$$\begin{aligned} w_1 := w_1(x, y) &= \frac{|\lambda_2(x)|^p + |\lambda_1(x)|^p}{2} + \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2}, \\ w_2 := w_2(x, y) &= \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{2} \bar{x}_2 + \frac{|\lambda_2(y)|^p - |\lambda_1(y)|^p}{2} \bar{y}_2, \end{aligned} \tag{15}$$

where $\bar{x}_2 = \frac{x_2}{\|x_2\|}$ if $x_2 \ne 0$, and otherwise $\bar{x}_2$ is an arbitrary vector in $\mathbb{R}^{n-1}$ with $\|\bar{x}_2\| = 1$, and $\bar{y}_2$ has a similar definition.

Noting that $z(x, y) = \sqrt[p]{w(x, y)}$, we have

$$\begin{aligned} z_1 = z_1(x, y) &= \frac{\sqrt[p]{\lambda_2(w)} + \sqrt[p]{\lambda_1(w)}}{2}, \\ z_2 = z_2(x, y) &= \frac{\sqrt[p]{\lambda_2(w)} - \sqrt[p]{\lambda_1(w)}}{2} \bar{w}_2, \end{aligned} \tag{16}$$

where $\bar{w}_2 = \frac{w_2}{\|w_2\|}$ if $w_2 \ne 0$, and otherwise $\bar{w}_2$ is an arbitrary vector in $\mathbb{R}^{n-1}$ with $\|\bar{w}_2\| = 1$.

To study the differentiability of $\psi_p$, we need the following two crucial lemmas. The first one gives the properties of the points $(x, y)$ satisfying $w(x, y) \in \mathrm{bd}\,\mathcal{K}^n$, and the second one provides a sufficient characterization for the continuously differentiable points of $z(x, y)$.

**Lemma 3.1.** *For any $(x, y)$ with $w(x, y) \in \mathrm{bd}\,\mathcal{K}^n$, we have the following equalities:*

$$\begin{aligned} & w_1(x, y) = \|w_2(x, y)\| = 2^{p-1}\left(|x_1|^p + |y_1|^p\right), \\ & x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2. \end{aligned} \tag{17}$$

*If, in addition, $w_2(x, y) \ne 0$, the following equalities hold with $\bar{w}_2(x, y) = \frac{w_2(x, y)}{\|w_2(x, y)\|}$:*

$$x_2^T \bar{w}_2(x, y) = x_1, \quad x_1 \bar{w}_2(x, y) = x_2, \quad y_2^T \bar{w}_2(x, y) = y_1, \quad y_1 \bar{w}_2(x, y) = y_2. \tag{18}$$
*Proof.* Fix any $(x, y)$ with $w(x, y) \in \mathrm{bd}\,\mathcal{K}^n$. Since $|x|^p, |y|^p \in \mathcal{K}^n$, applying Lemma 2.4 with $\xi = |x|^p$ and $\eta = |y|^p$, we have $|x|^p \in \mathrm{bd}\,\mathcal{K}^n$ and $|y|^p \in \mathrm{bd}\,\mathcal{K}^n$. This means that $|\lambda_2(x)|^p \cdot |\lambda_1(x)|^p = 0$ and $|\lambda_2(y)|^p \cdot |\lambda_1(y)|^p = 0$. So, $x_1^2 = \|x_2\|^2$ and $y_1^2 = \|y_2\|^2$. Substituting this into $w_1(x, y)$, we readily obtain $w_1(x, y) = 2^{p-1}(|x_1|^p + |y_1|^p)$.

To prove other equalities in (17) and (18), we first consider the case where $x_1 + \|x_2\| = 0$ and $y_1 - \|y_2\| = 0$ with $x_2 \ne 0$ and $y_2 \ne 0$. Under this case,

$$w_1 = \frac{|\lambda_1(x)|^p + |\lambda_2(y)|^p}{2} = \left\| \frac{|\lambda_1(x)|^p}{2} \frac{x_2}{\|x_2\|} - \frac{|\lambda_2(y)|^p}{2} \frac{y_2}{\|y_2\|} \right\|,$$

which implies that $x_2^T y_2 = -\|x_2\| \|y_2\| = x_1 y_1$. Together with $x_1^2 = \|x_2\|^2$ and $y_1^2 = \|y_2\|^2$, we have that $x_1 y_2 = y_1 x_2$. From the definition of $w_2$, it follows that

$$x_2^T w_2 = -\frac{|\lambda_1(x)|^p}{2} \|x_2\| + \frac{|\lambda_2(y)|^p}{2} \frac{x_1 y_1}{\|y_2\|} = 2^{p-1}\left(|x_1|^p + |y_1|^p\right) x_1 = \|w_2\| x_1,$$

$$x_1 w_2 = -\frac{|\lambda_1(x)|^p}{2} \frac{x_1 x_2}{\|x_2\|} + \frac{|\lambda_2(y)|^p}{2} \frac{y_1 x_2}{\|y_2\|} = 2^{p-1}\left(|x_1|^p + |y_1|^p\right) x_2 = \|w_2\| x_2.$$

Similarly, we also have $y_2^T w_2 = \|w_2\| y_1$ and $y_1 w_2 = \|w_2\| y_2$. The above arguments show that equations (17) and (18) hold under the case where $x_1 = -\|x_2\|$, $y_1 = \|y_2\|$. Using the same arguments, we can prove that (17) and (18) hold under any one of the remaining cases: $x_1 = \|x_2\|$, $y_1 = \|y_2\|$; or $x_1 = \|x_2\|$, $y_1 = -\|y_2\|$; or $x_1 = -\|x_2\|$, $y_1 = -\|y_2\|$.
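The equalities (17) and (18) can be checked at a concrete boundary point. The plain-Python sketch below uses the case treated in the proof, $x_1 = -\|x_2\|$ and $y_1 = \|y_2\|$; the helper names, the sample vectors, and the tolerance are assumptions made for this illustration.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, n >= 2."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    bar = [t / nrm for t in x2] if nrm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in bar]
    u2 = [0.5] + [0.5 * t for t in bar]
    return x1 - nrm, x1 + nrm, u1, u2

def w_vec(x, y, p):
    """w(x, y) = |x|^p + |y|^p, as in (14)."""
    def abs_pow(v):
        lam1, lam2, u1, u2 = spectral(v)
        return [abs(lam1) ** p * a + abs(lam2) ** p * b for a, b in zip(u1, u2)]
    return [a + b for a, b in zip(abs_pow(x), abs_pow(y))]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def check_lemma_3_1(x, y, p, tol=1e-9):
    """Return (eq17_holds, eq18_holds) for the given point."""
    w = w_vec(x, y, p)
    w1, w2 = w[0], w[1:]
    nw2 = math.sqrt(dot(w2, w2))
    x1, x2, y1, y2 = x[0], x[1:], y[0], y[1:]
    eq17 = (abs(w1 - nw2) <= tol
            and abs(w1 - 2 ** (p - 1) * (abs(x1) ** p + abs(y1) ** p)) <= tol
            and abs(x1 * x1 - dot(x2, x2)) <= tol
            and abs(y1 * y1 - dot(y2, y2)) <= tol
            and abs(x1 * y1 - dot(x2, y2)) <= tol
            and all(abs(x1 * b - y1 * a) <= tol for a, b in zip(x2, y2)))
    if nw2 == 0:
        return eq17, False
    wb = [t / nw2 for t in w2]  # normalized w2
    eq18 = (abs(dot(x2, wb) - x1) <= tol
            and all(abs(x1 * s - t) <= tol for s, t in zip(wb, x2))
            and abs(dot(y2, wb) - y1) <= tol
            and all(abs(y1 * s - t) <= tol for s, t in zip(wb, y2)))
    return eq17, eq18

x_b, y_b = [-1.0, 1.0, 0.0], [2.0, -2.0, 0.0]   # w(x, y) = (36, -36, 0) for p = 3
```

For `x_b`, `y_b` and $p = 3$ both checks succeed. At an interior point such as $x = y = (1, 0, 0)$, where $w = (2, 0, 0)$, the first equality in (17) already fails, as expected.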

**Lemma 3.2.** *$z(x, y)$ is continuously differentiable at $(x, y)$ with $w(x, y) \in \mathrm{int}\,\mathcal{K}^n$, and*

$$\nabla_x z(x, y) = \nabla g^{soc}(x) \nabla g^{soc}(z)^{-1} \quad \text{and} \quad \nabla_y z(x, y) = \nabla g^{soc}(y) \nabla g^{soc}(z)^{-1},$$

*where $\nabla g^{soc}(z)^{-1} = (p \sqrt[q]{w_1})^{-1} I$ if $w_2 = 0$, and otherwise*

$$\nabla g^{soc}(z)^{-1} = \frac{1}{2p} \begin{bmatrix} \dfrac{1}{\sqrt[q]{\lambda_2(w)}} + \dfrac{1}{\sqrt[q]{\lambda_1(w)}} & \dfrac{\bar{w}_2^T}{\sqrt[q]{\lambda_2(w)}} - \dfrac{\bar{w}_2^T}{\sqrt[q]{\lambda_1(w)}} \\[2ex] \dfrac{\bar{w}_2}{\sqrt[q]{\lambda_2(w)}} - \dfrac{\bar{w}_2}{\sqrt[q]{\lambda_1(w)}} & \dfrac{2p\,(I - \bar{w}_2 \bar{w}_2^T)}{a(z)} + \dfrac{\bar{w}_2 \bar{w}_2^T}{\sqrt[q]{\lambda_2(w)}} + \dfrac{\bar{w}_2 \bar{w}_2^T}{\sqrt[q]{\lambda_1(w)}} \end{bmatrix},$$

*with $a(z)$ defined as in (20).*

*Proof.* Since $|t|^p$ $(t \in \mathbb{R})$ and $\sqrt[p]{t}$ $(t > 0)$ are continuously differentiable, by [15, Proposition 5.2] or [7, Proposition 5], the functions $g^{soc}(x)$ and $\sqrt[p]{x}$ are continuously differentiable in $\mathbb{R}^n$ and $\mathrm{int}\,\mathcal{K}^n$, respectively. This implies the first part of this lemma. A simple calculation gives the expression of $\nabla z(x, y)$. By the formula in [15, Proposition 5.2],

$$\nabla g^{soc}(x) = \begin{cases} p \, \mathrm{sign}(x_1) |x_1|^{p-1} I & \text{if } x_2 = 0; \\[1ex] \begin{bmatrix} b(x) & c(x) \bar{x}_2^T \\ c(x) \bar{x}_2 & a(x) I + (b(x) - a(x)) \bar{x}_2 \bar{x}_2^T \end{bmatrix} & \text{if } x_2 \ne 0, \end{cases} \tag{19}$$

where

$$\begin{aligned} \bar{x}_2 &= \frac{x_2}{\|x_2\|}, \qquad a(x) = \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{\lambda_2(x) - \lambda_1(x)}, \\ b(x) &= \frac{p}{2} \left[ \mathrm{sign}(\lambda_2(x)) |\lambda_2(x)|^{p-1} + \mathrm{sign}(\lambda_1(x)) |\lambda_1(x)|^{p-1} \right], \\ c(x) &= \frac{p}{2} \left[ \mathrm{sign}(\lambda_2(x)) |\lambda_2(x)|^{p-1} - \mathrm{sign}(\lambda_1(x)) |\lambda_1(x)|^{p-1} \right]. \end{aligned} \tag{20}$$

We next derive the formula of $\nabla g^{soc}(z)^{-1}$. When $w_2 = 0$, we have $\lambda_1(w) = \lambda_2(w) = w_1 > 0$, which by (16) implies $z_1 = \sqrt[p]{w_1}$ and $z_2 = 0$. From formula (19), it then follows that $\nabla g^{soc}(z) = p |z_1|^{p-1} I = p \sqrt[q]{w_1} I$. Consequently, $\nabla g^{soc}(z)^{-1} = \frac{1}{p \sqrt[q]{w_1}} I$.

When $w_2 \ne 0$, since $\sqrt[p]{\lambda_2(w)} > \sqrt[p]{\lambda_1(w)}$, we have $z_2 \ne 0$ and $\bar{z}_2 = \frac{z_2}{\|z_2\|} = \bar{w}_2$ by (16). Using the expression of $\nabla g^{soc}(z)$, it is easy to verify that $b(z) + c(z)$ and $b(z) - c(z)$ are the eigenvalues of $\nabla g^{soc}(z)$ with $(1, \bar{w}_2)$ and $(1, -\bar{w}_2)$ being the corresponding eigenvectors, and $a(z)$ is the eigenvalue of multiplicity $n - 2$ with corresponding eigenvectors of the form $(0, v_i)$, where $v_1, \ldots, v_{n-2}$ are any unit vectors in $\mathbb{R}^{n-1}$ that span the subspace orthogonal to $\bar{w}_2$. Hence,

$$\nabla g^{soc}(z) = U \, \mathrm{diag}\left(b(z) - c(z), a(z), \ldots, a(z), b(z) + c(z)\right) U^T,$$

where $U = [u_1 \ \tilde{v}_1 \ \cdots \ \tilde{v}_{n-2} \ u_2] \in \mathbb{R}^{n \times n}$ is an orthogonal matrix with

$$u_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix}, \quad u_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix}, \quad \tilde{v}_i = \begin{pmatrix} 0 \\ v_i \end{pmatrix} \quad \text{for } i = 1, \ldots, n - 2.$$

Since $b(z) \pm c(z) = p \, (\lambda_{2,1}(z))^{p-1} = p \sqrt[q]{\lambda_{2,1}(w)}$, inverting the eigenvalues in this decomposition shows that $\nabla g^{soc}(z)^{-1}$ has the expression given in the lemma.
Now we are in a position to prove the following main result of this section.

**Proposition 3.1.** *The function $\psi_p$ for $p \in (1, 4)$ is differentiable everywhere. Also, for any given $x, y \in \mathbb{R}^n$, if $w(x, y) = 0$, then $\nabla_x \psi_p(x, y) = \nabla_y \psi_p(x, y) = 0$; if $w(x, y) \in \mathrm{int}\,\mathcal{K}^n$, then*

$$\begin{aligned} \nabla_x \psi_p(x, y) &= \left( \nabla g^{soc}(x) \nabla g^{soc}(z)^{-1} - I \right) \phi_p(x, y), \\ \nabla_y \psi_p(x, y) &= \left( \nabla g^{soc}(y) \nabla g^{soc}(z)^{-1} - I \right) \phi_p(x, y); \end{aligned} \tag{21}$$

*and if $w(x, y) \in \mathrm{bd}^+\mathcal{K}^n$, then*

$$\begin{aligned} \nabla_x \psi_p(x, y) &= \left( \frac{\mathrm{sign}(x_1) |x_1|^{p-1}}{\sqrt[q]{|x_1|^p + |y_1|^p}} - 1 \right) \phi_p(x, y), \\ \nabla_y \psi_p(x, y) &= \left( \frac{\mathrm{sign}(y_1) |y_1|^{p-1}}{\sqrt[q]{|x_1|^p + |y_1|^p}} - 1 \right) \phi_p(x, y). \end{aligned} \tag{22}$$
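Formula (22) can be compared against central finite differences of $\psi_p$ at a point with $w(x, y) \in \mathrm{bd}^+\mathcal{K}^n$. The plain-Python sketch below is only a numerical sanity check under stated assumptions: the helper names, the sample point $x = (1, 1, 0)$, $y = (2, 2, 0)$, the exponent $p = 3$, and the step size `h` are all choices made here, and agreement at one point is not a proof.

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, n >= 2."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    bar = [t / nrm for t in x2] if nrm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in bar]
    u2 = [0.5] + [0.5 * t for t in bar]
    return x1 - nrm, x1 + nrm, u1, u2

def soc_apply(g, x):
    lam1, lam2, u1, u2 = spectral(x)
    return [g(lam1) * a + g(lam2) * b for a, b in zip(u1, u2)]

def phi_p(x, y, p):
    """phi_p(x, y) from (7); the clamp guards against rounding below zero."""
    ax = soc_apply(lambda t: abs(t) ** p, x)
    ay = soc_apply(lambda t: abs(t) ** p, y)
    w = [a + b for a, b in zip(ax, ay)]
    z = soc_apply(lambda t: max(t, 0.0) ** (1.0 / p), w)
    return [zi - xi - yi for zi, xi, yi in zip(z, x, y)]

def psi_p(x, y, p):
    return 0.5 * sum(t * t for t in phi_p(x, y, p))

def grad_x_boundary(x, y, p):
    """Right-hand side of the first line of (22)."""
    q = 1.0 / (1.0 - 1.0 / p)
    s = abs(x[0]) ** p + abs(y[0]) ** p
    coef = math.copysign(abs(x[0]) ** (p - 1), x[0]) / s ** (1.0 / q) - 1.0
    return [coef * t for t in phi_p(x, y, p)]

def fd_grad_x(x, y, p, h=1e-6):
    """Central finite-difference approximation of the gradient in x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((psi_p(xp, y, p) - psi_p(xm, y, p)) / (2.0 * h))
    return grad

x0, y0, p0 = [1.0, 1.0, 0.0], [2.0, 2.0, 0.0], 3.0   # w = (36, 36, 0) on the boundary
formula = grad_x_boundary(x0, y0, p0)
numeric = fd_grad_x(x0, y0, p0)
```

At this point the two gradients should agree to several digits, with the third component equal to zero because $\phi_p(x, y)$ has no component in that direction.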

*Proof.* Fix any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. If $w(x, y) \in \mathrm{int}\,\mathcal{K}^n$, the result is implied by Lemma 3.2 since $\phi_p(x, y) = z(x, y) - (x + y)$. In fact, in this case, $\psi_p$ is continuously differentiable at $(x, y)$. Hence, it suffices to consider the cases $w(x, y) = 0$ and $w(x, y) \in \mathrm{bd}^+\mathcal{K}^n$. In the following arguments, $x'$ and $y'$ are arbitrary vectors in $\mathbb{R}^n$, and $\mu_1(x', y'), \mu_2(x', y')$ are the spectral values of $w(x', y')$ with $\xi^{(1)}, \xi^{(2)} \in \mathbb{R}^n$ being the corresponding spectral vectors.

*Case 1.* $w(x, y) = 0$. Note that $(x, y) = (0, 0)$ in this case. Hence, we only need to prove, for any $x', y' \in \mathbb{R}^n$,

$$\psi_p(x', y') - \psi_p(0, 0) = \frac{1}{2} \|z(x', y') - (x' + y')\|^2 = O\left(\|(x', y')\|^2\right), \tag{23}$$

which shows that $\psi_p$ is differentiable at $(0, 0)$ with $\nabla_x \psi_p(0, 0) = \nabla_y \psi_p(0, 0) = 0$. Indeed,

$$\begin{aligned} \|z(x', y') - (x' + y')\| &= \left\| \sqrt[p]{\mu_1(x', y')} \, \xi^{(1)} + \sqrt[p]{\mu_2(x', y')} \, \xi^{(2)} - (x' + y') \right\| \\ &\le \sqrt{2} \sqrt[p]{\mu_2(x', y')} + \|x' + y'\|. \end{aligned} \tag{24}$$

From the definition of $w_1(x, y)$ and $w_2(x, y)$, it is easy to obtain that

$$\mu_2(x', y') = w_1(x', y') + \|w_2(x', y')\| \le |\lambda_2(x')|^p + |\lambda_1(x')|^p + |\lambda_2(y')|^p + |\lambda_1(y')|^p.$$

Using the nondecreasing property of $\sqrt[p]{t}$ and Lemma 2.3(b), it then follows that

$$\begin{aligned} \sqrt[p]{\mu_2(x', y')} &\le \left( |\lambda_2(x')|^p + |\lambda_1(x')|^p + |\lambda_2(y')|^p + |\lambda_1(y')|^p \right)^{1/p} \\ &\le |\lambda_2(x')| + |\lambda_1(x')| + |\lambda_2(y')| + |\lambda_1(y')| \le 2\left(\|x'\| + \|y'\|\right). \end{aligned}$$

This, together with (24), implies that equation (23) holds.

*Case 2.* $w(x, y) \in \mathrm{bd}^+\mathcal{K}^n$. Now $w_1(x, y) = \|w_2(x, y)\| \ne 0$, and one of $x_2$ and $y_2$ is nonzero by (18). We proceed with the arguments in three steps, as shown below.

*Step 1.* We prove that $w_1(x', y')$ and $w_2(x', y')$ are $\lfloor p \rfloor$ times differentiable at $(x', y') = (x, y)$, where $\lfloor p \rfloor$ denotes the maximum integer not greater than $p$. Since one of $x_2$ and $y_2$ is nonzero, we prove this result by considering three possible cases: (i) $x_2 \ne 0$, $y_2 \ne 0$; (ii) $x_2 = 0$, $y_2 \ne 0$; and (iii) $x_2 \ne 0$, $y_2 = 0$. For case (i), since $\frac{x_2'}{\|x_2'\|}$, $\frac{y_2'}{\|y_2'\|}$, $\lambda_2(x')$, $\lambda_1(x')$, $\lambda_2(y')$, and $\lambda_1(y')$ are infinitely differentiable at $(x, y)$, and $|t|^p$ is $\lfloor p \rfloor$ times continuously differentiable in $\mathbb{R}$, it follows that $w_1(x', y')$ and $w_2(x', y')$ are $\lfloor p \rfloor$ times differentiable at $(x, y)$. Now assume that case (ii) is satisfied. From the arguments in case (i), we know that

$$\frac{|\lambda_2(y')|^p + |\lambda_1(y')|^p}{2} \quad \text{and} \quad \frac{|\lambda_2(y')|^p - |\lambda_1(y')|^p}{2} \frac{y_2'}{\|y_2'\|}$$

are $\lfloor p \rfloor$ times differentiable at $(x, y)$. In addition, since $|\lambda_i(x')|^p \le 2^{p/2} \|x'\|^p$ for $i = 1, 2$, and $x = 0$ in this case, we have that $|\lambda_2(x')|^p + |\lambda_1(x')|^p$ and $\frac{1}{2}\left(|\lambda_2(x')|^p - |\lambda_1(x')|^p\right) \bar{x}_2'$ are $\lfloor p \rfloor$ times differentiable at $x$ with the first $\lfloor p \rfloor - 1$ order derivatives being zero. Thus, $w_1(x', y')$ and $w_2(x', y')$ are $\lfloor p \rfloor$ times differentiable at $(x, y)$. By the symmetry of $x'$, $y'$ in $w(x', y')$ and the arguments in case (ii), the result also holds for case (iii).

*Step 2.* We show that $\psi_p$ is differentiable at $(x, y)$. By the definition of $\psi_p$, we have

$$2\psi_p(x', y') = \|x' + y'\|^2 + \|z(x', y')\|^2 - 2\langle z(x', y'), x' + y' \rangle.$$

Since $\|x' + y'\|^2$ is differentiable, it suffices to argue that the last two terms on the right-hand side are differentiable at $(x, y)$. By formulas (8)-(9), it is not hard to calculate that

$$2\|z(x', y')\|^2 = (\mu_2(x', y'))^{2/p} + (\mu_1(x', y'))^{2/p}, \tag{25}$$

$$\begin{aligned} 2\langle z(x', y'), x' + y' \rangle &= \sqrt[p]{\mu_2(x', y')} \left[ x_1' + y_1' + \frac{(w_2(x', y'))^T (x_2' + y_2')}{\|w_2(x', y')\|} \right] \\ &\quad + \sqrt[p]{\mu_1(x', y')} \left[ x_1' + y_1' - \frac{(w_2(x', y'))^T (x_2' + y_2')}{\|w_2(x', y')\|} \right]. \end{aligned} \tag{26}$$

Since $w_2(x, y) \ne 0$, $\mu_2(x, y) = \lambda_2(w) > 0$, and $w_1(x', y')$ and $w_2(x', y')$ are differentiable at $(x, y)$, by Step 1 we have that $(\mu_2(x', y'))^{2/p}$ and the first term on the right-hand side of (26) are differentiable at $(x, y)$. Thus, it suffices to prove that $(\mu_1(x', y'))^{2/p}$ and the last term on the right-hand side of (26) are differentiable at $(x, y)$.

We first argue that $(\mu_1(x', y'))^{2/p}$ is differentiable at $(x, y)$. Since $w_2(x, y) \ne 0$, and $w_1(x', y')$ and $w_2(x', y')$ are $\lfloor p \rfloor$ times differentiable at $(x, y)$ by Step 1, the function $\mu_1(x', y')$ is $\lfloor p \rfloor$ times differentiable at $(x, y)$. When $p < 2$, by the mean-value theorem and $\mu_1(x, y) = \lambda_1(w) = 0$, it follows that $\mu_1(x', y') = O(\|x' - x\| + \|y' - y\|)$ for any $(x', y')$ sufficiently close to $(x, y)$, and therefore $(\mu_1(x', y'))^{2/p} = O\left[(\|x' - x\| + \|y' - y\|)^{2/p}\right]$. This shows that $(\mu_1(x', y'))^{2/p}$ is differentiable at $(x, y)$ with zero derivative. When $p \ge 2$, $\mu_1(x', y')$ is infinitely differentiable at $(x, y)$, and its first derivative equals zero by the result in the Appendix. From the second-order Taylor expansion of $\mu_1(x', y')$ at $(x, y)$, it follows that $(\mu_1(x', y'))^{2/p} = O\left[(\|x' - x\| + \|y' - y\|)^{4/p}\right]$. This implies that $(\mu_1(x', y'))^{2/p}$ is differentiable at $(x, y)$ with zero gradient when $2 \le p < 4$. Thus, we prove that $(\mu_1(x', y'))^{2/p}$ is differentiable at $(x, y)$ with zero gradient when $p \in (1, 4)$.

We next consider the last term on the right-hand side of (26). Observe that

$$x_1' + y_1' - \frac{(w_2(x', y'))^T (x_2' + y_2')}{\|w_2(x', y')\|}$$

is differentiable at $(x, y)$, and its function value at $(x, y)$ equals zero by (18). Hence, this term is $O(\|x' - x\| + \|y' - y\|)$, which, along with $\mu_1(x', y') = O(\|x' - x\| + \|y' - y\|)$, means that the last term of (26) is $O\left((\|x' - x\| + \|y' - y\|)^{1 + \frac{1}{p}}\right) = o(\|x' - x\| + \|y' - y\|)$. This shows that the last term of (26) is differentiable at $(x, y)$ with zero derivative.

*Step 3.* We derive the formula of $\nabla_x \psi_p(x, y)$. From Step 2, we see that $2\nabla\psi_p(x, y)$ equals the difference between the gradient of $\frac{1}{2}(\mu_2(x', y'))^{2/p} + \|x' + y'\|^2$ and that of the first term on the right-hand side of (26), evaluated at $(x, y)$. By the Appendix, the gradients of $(\mu_2(x', y'))^{1/p}$ and $(\mu_2(x', y'))^{2/p}$ with respect to $x'$, evaluated at $(x', y') = (x, y)$, are

$$\nabla_x (\mu_2(x', y'))^{1/p} \Big|_{(x', y') = (x, y)} = (\lambda_2(w))^{\frac{1}{p} - 1} \, 2^{p-1} \, \mathrm{sign}(x_1) |x_1|^{p-1} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix}, \tag{27}$$

$$\nabla_x (\mu_2(x', y'))^{2/p} \Big|_{(x', y') = (x, y)} = (\lambda_2(w))^{\frac{2}{p} - 1} \, 2^{p} \, \mathrm{sign}(x_1) |x_1|^{p-1} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix}. \tag{28}$$