Computational Optimization and Applications, vol. 45, pp. 581-606, 2010
A one-parametric class of merit functions for the second-order cone complementarity problem
Jein-Shan Chen¹ Department of Mathematics National Taiwan Normal University
Taipei, Taiwan 11677 E-mail: jschen@math.ntnu.edu.tw
Shaohua Pan
School of Mathematical Sciences South China University of Technology
Guangzhou 510640, China E-mail: shhpan@scut.edu.cn
January 10, 2007
(first revised September 3, 2007) (final revised April 21, 2008)
Abstract. We investigate a one-parametric class of merit functions for the second-order cone complementarity problem (SOCCP) which is closely related to the popular Fischer-Burmeister (FB) merit function and the natural residual merit function. In fact, it reduces to the FB merit function if the involved parameter τ equals 2, whereas as τ tends to zero, its limit becomes a multiple of the natural residual merit function. In this paper, we show that this class of merit functions shares several favorable properties of the FB merit function, for example, smoothness. These properties play an important role in reformulating the SOCCP as an unconstrained minimization problem or a nonsmooth system of equations. Numerical results are reported for some convex second-order cone programs (SOCPs) by solving the unconstrained minimization reformulation of the KKT optimality conditions, which indicate that the FB merit function is not the best.
For the sparse linear SOCPs, the merit function corresponding to τ = 2.5 or 3 works better than the FB merit function, whereas for the dense convex SOCPs, the merit function with τ = 0.1, 0.5 or 1.0 seems to have better numerical performance.
Key words. Second-order cone, complementarity, merit function, Jordan product.
¹Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan.
AMS subject classifications. 26B05, 26B35, 90C33, 65K05
1 Introduction
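We consider the conic complementarity problem of finding a vector $\zeta \in \mathbb{R}^n$ such that
\[ F(\zeta) \in \mathcal{K}, \quad G(\zeta) \in \mathcal{K}, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1} \]
where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product, $F : \mathbb{R}^n \to \mathbb{R}^n$ and $G : \mathbb{R}^n \to \mathbb{R}^n$ are mappings assumed to be continuously differentiable throughout this paper, and $\mathcal{K}$ is the Cartesian product of second-order cones (SOCs). In other words,
\[ \mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_N}, \tag{2} \]
where $N, n_1, \ldots, n_N \ge 1$, $n_1 + \cdots + n_N = n$, and
\[ \mathcal{K}^{n_i} := \left\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \;\middle|\; \|x_2\| \le x_1 \right\}, \]
with $\|\cdot\|$ denoting the Euclidean norm and $\mathcal{K}^1$ denoting the set of nonnegative reals $\mathbb{R}_+$. We will refer to (1)–(2) as the second-order cone complementarity problem (SOCCP).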
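The cone structure in (2) can be illustrated with a small membership test. This is only a sketch: the function names `in_soc` and `in_product_cone` and the tolerance `tol` are hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np

def in_soc(x, tol=1e-12):
    """Return True if x = (x1, x2) satisfies ||x2|| <= x1 (up to tol).

    For a length-1 vector the tail x2 is empty, its norm is 0, and the
    test reduces to x1 >= 0, matching K^1 = nonnegative reals.
    """
    x = np.asarray(x, dtype=float)
    return bool(np.linalg.norm(x[1:]) <= x[0] + tol)

def in_product_cone(x, dims, tol=1e-12):
    """Check membership in K^{n_1} x ... x K^{n_N} block by block."""
    x = np.asarray(x, dtype=float)
    if sum(dims) != x.size:
        raise ValueError("block dimensions must sum to len(x)")
    start = 0
    for n_i in dims:
        if not in_soc(x[start:start + n_i], tol):
            return False
        start += n_i
    return True
```

For instance, with `dims = [3, 1]` the first three entries are tested against $\mathcal{K}^3$ and the last entry against $\mathbb{R}_+$.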
An important special case of the SOCCP corresponds to G(ζ) = ζ for all ζ ∈ IRn. Then (1) reduces to
F (ζ) ∈ K, ζ ∈ K, ⟨F (ζ), ζ⟩ = 0, (3)
which is a natural extension of the nonlinear complementarity problem (NCP) [9, 10]
with K = IRn+, the nonnegative orthant cone of IRn. Another important special case corresponds to the KKT optimality conditions of the convex second-order cone program (CSOCP):
\[ \begin{array}{ll} \text{minimize} & g(x) \\ \text{subject to} & Ax = b, \quad x \in \mathcal{K}, \end{array} \tag{4} \]
where $g : \mathbb{R}^n \to \mathbb{R}$ is a convex twice continuously differentiable function, $A \in \mathbb{R}^{m \times n}$ has full row rank and $b \in \mathbb{R}^m$. From [6], we know that the KKT conditions of (4), which are sufficient but not necessary for optimality, can be reformulated as (1) with
\[ F(\zeta) := \bar{x} + \left(I - A^T (A A^T)^{-1} A\right)\zeta, \qquad G(\zeta) := \nabla g(F(\zeta)) - A^T (A A^T)^{-1} A \zeta, \tag{5} \]
where $\bar{x} \in \mathbb{R}^n$ is any point such that $A\bar{x} = b$. When $g$ is linear, the CSOCP reduces to the linear SOCP, which arises in numerous applications in engineering design, finance, and robust optimization, and includes as special cases convex quadratically constrained quadratic programs and linear programs; see [1, 15] and references therein.
Various methods have been proposed for solving SOCPs and SOCCPs. They include the interior-point methods [2, 3, 16, 17, 21], the non-interior smoothing Newton methods [5, 8], and the smoothing-regularization method [12]. Recently, an alternative method [6] was proposed, based on reformulating the SOCCP as an unconstrained minimization problem. That approach seeks a function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfying
\[ \psi(x, y) = 0 \iff x \in \mathcal{K}, \; y \in \mathcal{K}, \; \langle x, y \rangle = 0, \tag{6} \]
so that the SOCCP can be reformulated as an unconstrained minimization problem
\[ \min_{\zeta \in \mathbb{R}^n} f(\zeta) := \psi(F(\zeta), G(\zeta)). \]
We call such ψ a merit function associated with the cone K.
A popular choice of $\psi$ is the Fischer-Burmeister (FB) merit function
\[ \psi_{\mathrm{FB}}(x, y) := \frac{1}{2} \|\phi_{\mathrm{FB}}(x, y)\|^2, \tag{7} \]
where $\phi_{\mathrm{FB}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the vector-valued FB function defined by
\[ \phi_{\mathrm{FB}}(x, y) := (x^2 + y^2)^{1/2} - (x + y), \tag{8} \]
with $x^2 = x \circ x$ denoting the Jordan product between $x$ and itself, $x^{1/2}$ being a vector such that $(x^{1/2})^2 = x$, and $x + y$ meaning the usual componentwise addition of vectors. The function $\psi_{\mathrm{FB}}$ was studied in [6] and, in particular, shown to be continuously differentiable (smooth). Another popular choice of $\psi$ is the natural residual merit function
\[ \psi_{\mathrm{NR}}(x, y) := \frac{1}{2} \|\phi_{\mathrm{NR}}(x, y)\|^2, \]
where $\phi_{\mathrm{NR}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the vector-valued natural residual function given by
\[ \phi_{\mathrm{NR}}(x, y) := x - (x - y)_+ \]
with $(\cdot)_+$ meaning the projection in the Euclidean norm onto $\mathcal{K}$. The function $\phi_{\mathrm{NR}}$ was studied in [8, 12], where it is used in smoothing methods for the SOCCP. Compared with the FB merit function $\psi_{\mathrm{FB}}$, the function $\psi_{\mathrm{NR}}$ has a drawback: it is not differentiable.
In this paper, we will investigate the following one-parametric class of functions
\[ \psi_\tau(x, y) := \frac{1}{2} \|\phi_\tau(x, y)\|^2, \tag{9} \]
where $\tau$ is a fixed parameter from $(0, 4)$ and $\phi_\tau : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is defined by
\[ \phi_\tau(x, y) := \left[(x - y)^2 + \tau (x \circ y)\right]^{1/2} - (x + y). \tag{10} \]
Specifically, we prove that $\psi_\tau$ is a merit function associated with $\mathcal{K}$ which is continuously differentiable everywhere with computable gradient formulas (see Propositions 3.1–3.3), and hence the SOCCP can be reformulated as an unconstrained smooth minimization
\[ \min_{\zeta \in \mathbb{R}^n} f_\tau(\zeta) := \psi_\tau(F(\zeta), G(\zeta)). \tag{11} \]
Also, we show that every stationary point of $f_\tau$ solves the SOCCP under the condition that $\nabla F$ and $-\nabla G$ are column monotone (see Proposition 4.1). Observe that $\phi_\tau$ reduces to $\phi_{\mathrm{FB}}$ when $\tau = 2$, whereas its limit as $\tau \to 0$ becomes a multiple of $\phi_{\mathrm{NR}}$. Thus, this class of merit functions is closely related to two of the most important merit functions, so a closer study of it is worthwhile. In addition, this study is motivated by the work [13], where $\phi_\tau$ was used to develop a nonsmooth Newton method for the NCP. This paper is mainly concerned with the merit function approach based on the unconstrained minimization problem (11). Numerical results are also reported for some convex SOCPs, which indicate that $\psi_\tau$ can be an alternative to $\psi_{\mathrm{FB}}$ if a suitable $\tau$ is selected.
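The two limiting relations mentioned above can be written out explicitly. The sketch below assumes the standard identity $(u)_+ = \frac{1}{2}(u + |u|)$ for the projection onto $\mathcal{K}$, with $|u| := (u^2)^{1/2}$; this identity and the symbol $\phi_0$ are not stated in the text and are introduced here only for illustration.

```latex
\begin{aligned}
\tau = 2:\quad & (x-y)^2 + 2(x\circ y) = x^2 + y^2
  \;\Longrightarrow\; \phi_2(x,y) = (x^2+y^2)^{1/2} - (x+y) = \phi_{\mathrm{FB}}(x,y), \\[4pt]
\tau \to 0:\quad & \phi_0(x,y) := \lim_{\tau\to 0}\phi_\tau(x,y) = |x-y| - (x+y), \\
 & \phi_{\mathrm{NR}}(x,y) = x - \tfrac12\big[(x-y) + |x-y|\big]
   = \tfrac12\big[(x+y) - |x-y|\big] = -\tfrac12\,\phi_0(x,y).
\end{aligned}
```

Under that assumption, $\phi_\tau \to -2\,\phi_{\mathrm{NR}}$ and consequently $\psi_\tau \to 4\,\psi_{\mathrm{NR}}$ as $\tau \to 0$, which is one way to read the phrase "a multiple of" in the text.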
Throughout this paper, IRn denotes the space of n-dimensional real column vectors, and IRn1× · · · × IRnm is identified with IRn1+···+nm. Thus, (x1, . . . , xm)∈ IRn1× · · · × IRnm is viewed as a column vector in IRn1+···+nm. The notation I denotes an identity matrix of suitable dimension, and int(Kn) denotes the interior of Kn. For any differentiable mapping F : IRn → IRm, ∇F (x) ∈ IRn×m denotes the transposed Jacobian of F at x.
For a symmetric matrix A, we write A≽ O (respectively, A ≻ O) to mean A is positive semidefinite (respectively, positive definite). For nonnegative α and β, we write α = O(β) to mean α≤ Cβ, with C > 0 independent of α and β. Without loss of generality, in the rest of this paper we assume that K = Kn (n > 1). All analysis can be carried over to the general case where K has the structure as (2). In addition, we always assume that τ satisfies 0 < τ < 4.
2 Preliminaries
It is known that Kn (n > 1) is a closed convex self-dual cone with nonempty interior int(Kn) :={x = (x1, x2)∈ IR × IRn−1 | x1 >∥x2∥}.
For any x = (x1, x2), y = (y1, y2)∈ IR × IRn−1, the Jordan product of x and y is defined by
\[ x \circ y := (\langle x, y \rangle, \; y_1 x_2 + x_1 y_2). \tag{12} \]
The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of the SOCCP. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the determinant of $x$ is defined by $\det(x) := x_1^2 - \|x_2\|^2$. If $\det(x) \neq 0$, then $x$ is said to be invertible. If $x$ is invertible, there exists a unique $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ satisfying $x \circ y = y \circ x = e$. We call this $y$ the inverse of $x$ and denote it by $x^{-1}$. For each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, let
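\[ L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix}. \tag{13} \]
It is easily verified that $L_x y = x \circ y$ and $L_{x+y} = L_x + L_y$ for any $x, y \in \mathbb{R}^n$, but generally $L_x^2 = L_x L_x \neq L_{x^2}$ and $L_x^{-1} \neq L_{x^{-1}}$. If $L_x$ is invertible, then the inverse of $L_x$ is given by
\[ L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}. \tag{14} \]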
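The relations around (12)–(14) are easy to check numerically. The following NumPy sketch builds $L_x$ and the closed-form inverse from (14); the function names `jordan`, `arrow` and `arrow_inv` are hypothetical, and `arrow_inv` assumes $x_1 \neq 0$ and $\det(x) \neq 0$.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2), as in (12)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def arrow(x):
    """Matrix L_x from (13): first row/column hold x, diagonal is x1."""
    x = np.asarray(x, dtype=float)
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def arrow_inv(x):
    """Closed-form inverse of L_x from (14); needs x1 != 0, det(x) != 0."""
    x = np.asarray(x, dtype=float)
    x1, x2 = x[0], x[1:]
    det = x1 ** 2 - x2 @ x2
    n = x.size
    M = np.empty((n, n))
    M[0, 0] = x1
    M[0, 1:] = -x2
    M[1:, 0] = -x2
    M[1:, 1:] = (det / x1) * np.eye(n - 1) + np.outer(x2, x2) / x1
    return M / det
```

Multiplying `arrow(x)` by a vector `y` reproduces `jordan(x, y)`, while comparing `arrow(x) @ arrow(x)` with `arrow(jordan(x, x))` for $n \ge 3$ shows that $L_x^2$ and $L_{x^2}$ generally differ.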
We next recall from [8] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization, associated with $\mathcal{K}^n$, of the form
\[ x = \lambda_1(x) \cdot u_x^{(1)} + \lambda_2(x) \cdot u_x^{(2)}, \]
where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are the spectral values and the associated spectral vectors of $x$ given by
\[ \lambda_i(x) = x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} = \frac{1}{2}\left(1, \; (-1)^i \bar{x}_2\right) \]
for $i = 1, 2$, with $\bar{x}_2 = \dfrac{x_2}{\|x_2\|}$ if $x_2 \neq 0$, and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{n-1}$ such that $\|\bar{x}_2\| = 1$. If $x_2 \neq 0$, the factorization is unique. The spectral factorization of $x$ has various interesting properties; see [8]. We list three properties that will be used later.
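Property 2.1 (a) $x^2 = \lambda_1^2(x) \cdot u_x^{(1)} + \lambda_2^2(x) \cdot u_x^{(2)} \in \mathcal{K}^n$ for any $x \in \mathbb{R}^n$.
(b) If $x \in \mathcal{K}^n$, then $x^{1/2} = \sqrt{\lambda_1(x)} \cdot u_x^{(1)} + \sqrt{\lambda_2(x)} \cdot u_x^{(2)} \in \mathcal{K}^n$.
(c) $x \in \mathcal{K}^n \iff \lambda_1(x) \ge 0 \iff L_x \succeq O$, and $x \in \operatorname{int}(\mathcal{K}^n) \iff \lambda_1(x) > 0 \iff L_x \succ O$.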
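The spectral factorization and Property 2.1 can be exercised on a concrete vector. The sketch below is illustrative only; the helper names (`jordan`, `arrow`, `spectral`, `soc_sqrt`) are hypothetical, and when $x_2 = 0$ the code simply picks the first coordinate direction as the arbitrary unit vector $\bar{x}_2$.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def arrow(x):
    """Matrix L_x with first row/column x and diagonal x1."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def spectral(x):
    """Return ((lam1, lam2), (u1, u2)) with x = lam1*u1 + lam2*u2."""
    x1, x2 = x[0], x[1:]
    nrm = np.linalg.norm(x2)
    # Any unit vector is admissible when x2 = 0 (requires n > 1).
    xbar = x2 / nrm if nrm > 0 else np.eye(x2.size)[0]
    u1 = 0.5 * np.concatenate(([1.0], -xbar))
    u2 = 0.5 * np.concatenate(([1.0], xbar))
    return (x1 - nrm, x1 + nrm), (u1, u2)

def soc_sqrt(x):
    """Square root from Property 2.1(b); x must lie in K^n."""
    (lam1, lam2), (u1, u2) = spectral(x)
    if lam1 < 0:
        raise ValueError("x is not in the second-order cone")
    return np.sqrt(lam1) * u1 + np.sqrt(lam2) * u2
```

For $x = (3, 1, 2)$ one has $\lambda_1(x) = 3 - \sqrt{5} > 0$, so $x \in \operatorname{int}(\mathcal{K}^3)$, `soc_sqrt(x)` squares back to `x` under the Jordan product, and the smallest eigenvalue of `arrow(x)` coincides with $\lambda_1(x)$, in line with Property 2.1 (c).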
3 Smoothness of the function $\psi_\tau$
In this section we will show that $\psi_\tau$ defined by (9) is a smooth merit function. First, by Property 2.1 (a) and (b), $\phi_\tau$ and $\psi_\tau$ are well-defined since for any $x, y \in \mathbb{R}^n$, we have
\[ (x - y)^2 + \tau (x \circ y) = \left(x + \frac{\tau - 2}{2} y\right)^2 + \frac{\tau(4 - \tau)}{4} y^2 = \left(y + \frac{\tau - 2}{2} x\right)^2 + \frac{\tau(4 - \tau)}{4} x^2 \in \mathcal{K}^n. \tag{15} \]
The following proposition shows that $\psi_\tau$ is indeed a merit function associated with $\mathcal{K}^n$.
Proposition 3.1 Let ψτ and ϕτ be given as in (9) and (10), respectively. Then, ψτ(x, y) = 0 ⇐⇒ ϕτ(x, y) = 0 ⇐⇒ x ∈ Kn, y ∈ Kn, ⟨x, y⟩ = 0.
Proof. The first equivalence is clear by the definition of ψτ. We consider the second one.
“⇐”. Since x ∈ K, y ∈ K and ⟨x, y⟩ = 0, we have x ◦ y = 0. Substituting it into the expression of ϕτ(x, y) then yields that ϕτ(x, y) = (x2+ y2)1/2− (x + y) = ϕFB(x, y). From Proposition 2.1 of [8], we immediately obtain ϕτ(x, y) = 0.
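“⇒”. Suppose that $\phi_\tau(x, y) = 0$. Then, $x + y = [(x - y)^2 + \tau (x \circ y)]^{1/2}$. Squaring both sides gives $x^2 + y^2 + 2(x \circ y) = x^2 + y^2 + (\tau - 2)(x \circ y)$, i.e., $(4 - \tau)(x \circ y) = 0$, and hence $x \circ y = 0$ since $\tau < 4$. This implies that $x + y = (x^2 + y^2)^{1/2}$, i.e., $\phi_{\mathrm{FB}}(x, y) = 0$. From Proposition 2.1 of [8], it then follows that $x \in \mathcal{K}^n$, $y \in \mathcal{K}^n$ and $\langle x, y \rangle = 0$. □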
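Proposition 3.1 can be illustrated numerically on a single cone $\mathcal{K}^3$. The following is a minimal sketch, not the authors' implementation; the names `jordan`, `soc_sqrt`, `phi_tau` and `psi_tau` are hypothetical, and tiny negative round-off in the spectral values is clamped to zero before taking square roots.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def soc_sqrt(w):
    """Square root of w in K^n via its spectral values."""
    w1, w2 = w[0], w[1:]
    nrm = np.linalg.norm(w2)
    # Any unit vector works when w2 = 0; pick the first coordinate direction.
    wbar = w2 / nrm if nrm > 0 else np.eye(w2.size)[0]
    # Clamp round-off so that the square roots stay real.
    r1 = np.sqrt(max(w1 - nrm, 0.0))
    r2 = np.sqrt(max(w1 + nrm, 0.0))
    return np.concatenate(([(r2 + r1) / 2], (r2 - r1) / 2 * wbar))

def phi_tau(x, y, tau):
    """phi_tau(x, y) = [(x-y)^2 + tau*(x∘y)]^{1/2} - (x+y), as in (10)."""
    d = x - y
    w = jordan(d, d) + tau * jordan(x, y)
    return soc_sqrt(w) - (x + y)

def psi_tau(x, y, tau):
    """psi_tau(x, y) = 0.5 * ||phi_tau(x, y)||^2, as in (9)."""
    return 0.5 * np.linalg.norm(phi_tau(x, y, tau)) ** 2
```

With $x = (1, 1, 0)$ and $y = (1, -1, 0)$, both vectors lie in $\mathcal{K}^3$ and $\langle x, y \rangle = 0$, so `psi_tau(x, y, tau)` vanishes for every $\tau \in (0, 4)$. With $x = y = e$ the pair is not complementary and $\phi_\tau(e, e) = (\sqrt{\tau} - 2)\,e \neq 0$.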
In what follows, we focus on the proof of the smoothness of ψτ. We first introduce some notation that will be used in the sequel. For any x = (x1, x2), y = (y1, y2)∈ IR × IRn−1, let
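\[ \begin{aligned} w = (w_1, w_2) = w(x, y) &:= (x - y)^2 + \tau (x \circ y), \\ z = (z_1, z_2) = z(x, y) &:= \left[(x - y)^2 + \tau (x \circ y)\right]^{1/2}. \end{aligned} \tag{16} \]
Then, $w \in \mathcal{K}^n$ and $z \in \mathcal{K}^n$. Moreover, by the definition of the Jordan product,
\[ \begin{aligned} w_1 = w_1(x, y) &= \|x\|^2 + \|y\|^2 + (\tau - 2) x^T y, \\ w_2 = w_2(x, y) &= 2(x_1 x_2 + y_1 y_2) + (\tau - 2)(x_1 y_2 + y_1 x_2). \end{aligned} \tag{17} \]
Let $\lambda_1(w)$ and $\lambda_2(w)$ be the spectral values of $w$. By Property 2.1 (b), we have that
\[ z_1 = z_1(x, y) = \frac{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}{2}, \qquad z_2 = z_2(x, y) = \frac{\sqrt{\lambda_2(w)} - \sqrt{\lambda_1(w)}}{2}\, \bar{w}_2, \tag{18} \]
where $\bar{w}_2 := \dfrac{w_2}{\|w_2\|}$ if $w_2 \neq 0$ and otherwise $\bar{w}_2$ is any vector in $\mathbb{R}^{n-1}$ satisfying $\|\bar{w}_2\| = 1$.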
The following technical lemma describes the behavior of x, y when w = (x− y)2+ τ (x◦y) is on the boundary of Kn. In fact, it may be viewed as an extension of [6, Lemma 3.2].
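Lemma 3.1 For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, if $w \notin \operatorname{int}(\mathcal{K}^n)$, then
\[ x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2; \tag{19} \]
\[ \begin{aligned} x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 &= \|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \\ &= \|x_2\|^2 + \|y_2\|^2 + (\tau - 2) x_2^T y_2. \end{aligned} \tag{20} \]
If, in addition, $(x, y) \neq (0, 0)$, then $w_2 \neq 0$, and furthermore,
\[ x_2^T \frac{w_2}{\|w_2\|} = x_1, \quad x_1 \frac{w_2}{\|w_2\|} = x_2, \quad y_2^T \frac{w_2}{\|w_2\|} = y_1, \quad y_1 \frac{w_2}{\|w_2\|} = y_2. \tag{21} \]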
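Proof. Since $w = (x - y)^2 + \tau (x \circ y) \notin \operatorname{int}(\mathcal{K}^n)$, using (15) and [6, Lemma 3.2] yields
\[ \begin{aligned} \left(x_1 + \tfrac{\tau - 2}{2} y_1\right)^2 &= \left\|x_2 + \tfrac{\tau - 2}{2} y_2\right\|^2, & y_1^2 &= \|y_2\|^2, \\ \left(x_1 + \tfrac{\tau - 2}{2} y_1\right) y_2 &= \left(x_2 + \tfrac{\tau - 2}{2} y_2\right) y_1, & \left(x_1 + \tfrac{\tau - 2}{2} y_1\right) y_1 &= \left(x_2 + \tfrac{\tau - 2}{2} y_2\right)^T y_2; \\ \left(y_1 + \tfrac{\tau - 2}{2} x_1\right)^2 &= \left\|y_2 + \tfrac{\tau - 2}{2} x_2\right\|^2, & x_1^2 &= \|x_2\|^2, \\ \left(y_1 + \tfrac{\tau - 2}{2} x_1\right) x_2 &= \left(y_2 + \tfrac{\tau - 2}{2} x_2\right) x_1, & \left(y_1 + \tfrac{\tau - 2}{2} x_1\right) x_1 &= \left(y_2 + \tfrac{\tau - 2}{2} x_2\right)^T x_2. \end{aligned} \]
From these equalities, we readily get the results in (19). Since $w \in \mathcal{K}^n$ but $w \notin \operatorname{int}(\mathcal{K}^n)$, we have
\[ \|x\|^2 + \|y\|^2 + (\tau - 2) x^T y = \|2 x_1 x_2 + 2 y_1 y_2 + (\tau - 2)(x_1 y_2 + y_1 x_2)\| \]
by $\lambda_1(w) = 0$. Applying the relations in (19) then gives the equalities in (20). If, in addition, $(x, y) \neq (0, 0)$, then it is clear that
\[ \|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| = x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 \neq 0. \]
To prove the equalities in (21), it suffices to verify that $x_2^T \dfrac{w_2}{\|w_2\|} = x_1$ and $x_1 \dfrac{w_2}{\|w_2\|} = x_2$ by the symmetry of $x$ and $y$ in $w$. The verifications are straightforward by (20) and $x_1 y_2 = y_1 x_2$. □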
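A concrete boundary instance may help in reading Lemma 3.1. The sketch below uses a hypothetical family chosen for illustration, $x = (a, a\,\bar{e})$ and $y = (b, b\,\bar{e})$ with a unit vector $\bar{e} \in \mathbb{R}^{n-1}$, for which the relations in (19) hold by construction; the particular numbers are not taken from the paper.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

tau = 1.0
e_bar = np.array([0.6, 0.8])                 # unit vector in R^{n-1}, n = 3
x = np.concatenate(([1.0], 1.0 * e_bar))     # x = (a, a*e_bar) with a = 1
y = np.concatenate(([-2.0], -2.0 * e_bar))   # y = (b, b*e_bar) with b = -2

# w = (x - y)^2 + tau * (x∘y) and its spectral values
w = jordan(x - y, x - y) + tau * jordan(x, y)
w1, w2 = w[0], w[1:]
lam1 = w1 - np.linalg.norm(w2)
lam2 = w1 + np.linalg.norm(w2)

# Scalar appearing in (20): x1^2 + y1^2 + (tau - 2) * x1 * y1
s = x[0] ** 2 + y[0] ** 2 + (tau - 2) * x[0] * y[0]
w2_bar = w2 / np.linalg.norm(w2)
```

Here $w = (14, 8.4, 11.2)$, so $\lambda_1(w) = 0$ and $w$ lies on the boundary of $\mathcal{K}^3$, while $\lambda_2(w) = 28 = 4s$ with $s = 7$. The normalized vector `w2_bar` equals $\bar{e}$, which gives the four identities in (21) directly.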
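By Lemma 3.1, when $w \notin \operatorname{int}(\mathcal{K}^n)$, the spectral values of $w$ are calculated as follows:
\[ \lambda_1(w) = 0, \qquad \lambda_2(w) = 4\left(x_1^2 + y_1^2 + (\tau - 2) x_1 y_1\right). \tag{22} \]
If $(x, y) \neq (0, 0)$ also holds, then using equations (18), (20) and (22) yields that
\[ z_1(x, y) = \sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}, \qquad z_2(x, y) = \frac{x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}}. \]
Thus, if $(x, y) \neq (0, 0)$ and $(x - y)^2 + \tau (x \circ y) \notin \operatorname{int}(\mathcal{K}^n)$, $\phi_\tau(x, y)$ can be rewritten as
\[ \phi_\tau(x, y) = z(x, y) - (x + y) = \begin{pmatrix} \sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1} - (x_1 + y_1) \\[6pt] \dfrac{x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} - (x_2 + y_2) \end{pmatrix}. \tag{23} \]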
This specific expression will be employed in the proof of the following main result.
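Proposition 3.2 The function $\psi_\tau$ given by (9) is differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. Moreover, $\nabla_x \psi_\tau(0, 0) = \nabla_y \psi_\tau(0, 0) = 0$; if $(x - y)^2 + \tau (x \circ y) \in \operatorname{int}(\mathcal{K}^n)$, then
\[ \begin{aligned} \nabla_x \psi_\tau(x, y) &= \left[L_{x + \frac{\tau - 2}{2} y} L_z^{-1} - I\right] \phi_\tau(x, y), \\ \nabla_y \psi_\tau(x, y) &= \left[L_{y + \frac{\tau - 2}{2} x} L_z^{-1} - I\right] \phi_\tau(x, y); \end{aligned} \tag{24} \]
if $(x, y) \neq (0, 0)$ and $(x - y)^2 + \tau (x \circ y) \notin \operatorname{int}(\mathcal{K}^n)$, then $x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 \neq 0$ and
\[ \begin{aligned} \nabla_x \psi_\tau(x, y) &= \left[\frac{x_1 + \frac{\tau - 2}{2} y_1}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} - 1\right] \phi_\tau(x, y), \\ \nabla_y \psi_\tau(x, y) &= \left[\frac{y_1 + \frac{\tau - 2}{2} x_1}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} - 1\right] \phi_\tau(x, y). \end{aligned} \tag{25} \]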
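Proof. Case (1): $(x, y) = (0, 0)$. For any $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, let $\mu_1, \mu_2$ be the spectral values of $(u - v)^2 + \tau (u \circ v)$ and $\xi^{(1)}, \xi^{(2)}$ be the spectral vectors. Then,
\[ \begin{aligned} 2\left[\psi_\tau(u, v) - \psi_\tau(0, 0)\right] &= \left\|\left[u^2 + v^2 + (\tau - 2)(u \circ v)\right]^{1/2} - u - v\right\|^2 \\ &= \left\|\sqrt{\mu_1}\, \xi^{(1)} + \sqrt{\mu_2}\, \xi^{(2)} - u - v\right\|^2 \\ &\le \left(\sqrt{2 \mu_2} + \|u\| + \|v\|\right)^2. \end{aligned} \]
In addition, from the definition of spectral value and (17), it follows that
\[ \begin{aligned} \mu_2 &= \|u\|^2 + \|v\|^2 + (\tau - 2) u^T v + \|2(u_1 u_2 + v_1 v_2) + (\tau - 2)(u_1 v_2 + v_1 u_2)\| \\ &\le 2\|u\|^2 + 2\|v\|^2 + 3|\tau - 2| \|u\| \|v\| \le 5\left(\|u\|^2 + \|v\|^2\right). \end{aligned} \]
Now combining the last two equations, we have $\psi_\tau(u, v) - \psi_\tau(0, 0) = O(\|u\|^2 + \|v\|^2)$. This shows that $\psi_\tau$ is differentiable at $(0, 0)$ with $\nabla_x \psi_\tau(0, 0) = \nabla_y \psi_\tau(0, 0) = 0$.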
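The first inequality in Case (1) rests on two facts that follow from the definitions in Section 2: each spectral vector has norm $1/\sqrt{2}$, and $0 \le \mu_1 \le \mu_2$ because $(u - v)^2 + \tau (u \circ v) \in \mathcal{K}^n$. A short worked version of that step, offered as a reading aid rather than as part of the original proof, is:

```latex
\begin{aligned}
\big\|\sqrt{\mu_1}\,\xi^{(1)} + \sqrt{\mu_2}\,\xi^{(2)}\big\|
  &\le \sqrt{\mu_1}\,\|\xi^{(1)}\| + \sqrt{\mu_2}\,\|\xi^{(2)}\|
   = \frac{\sqrt{\mu_1} + \sqrt{\mu_2}}{\sqrt{2}}
   \le \sqrt{2\mu_2}, \\[4pt]
2\big[\psi_\tau(u,v) - \psi_\tau(0,0)\big]
  &\le \Big(\sqrt{10\,(\|u\|^2 + \|v\|^2)} + \|u\| + \|v\|\Big)^2
   = O\big(\|u\|^2 + \|v\|^2\big).
\end{aligned}
```

The second line simply substitutes the bound $\mu_2 \le 5(\|u\|^2 + \|v\|^2)$ into the first estimate.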
Case (2): $(x - y)^2 + \tau (x \circ y) \in \operatorname{int}(\mathcal{K}^n)$. By [4, Proposition 5], $z(x, y)$ defined by (18) is continuously differentiable at such $(x, y)$, and consequently $\phi_\tau(x, y)$ is also continuously differentiable at such $(x, y)$ since $\phi_\tau(x, y) = z(x, y) - (x + y)$. Notice that
\[ z^2(x, y) = \left(x + \frac{\tau - 2}{2} y\right)^2 + \frac{\tau(4 - \tau)}{4} y^2, \]
which leads to $\nabla_x z(x, y) L_z = L_{x + \frac{\tau - 2}{2} y}$ by differentiating both sides with respect to $x$. Since $L_z \succ O$ by Property 2.1 (c), it follows that $\nabla_x z(x, y) = L_{x + \frac{\tau - 2}{2} y} L_z^{-1}$. Consequently,
\[ \nabla_x \phi_\tau(x, y) = \nabla_x z(x, y) - I = L_{x + \frac{\tau - 2}{2} y} L_z^{-1} - I. \]
This together with $\nabla_x \psi_\tau(x, y) = \nabla_x \phi_\tau(x, y) \phi_\tau(x, y)$ proves the first formula of (24). By the symmetry of $x$ and $y$ in $\psi_\tau$, the second formula also holds.
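The differentiation step in Case (2) can be spelled out as follows. The notation $J_z$ for the Jacobian of $z$ with respect to $x$ (so that $\nabla_x z = J_z^T$) and the direction $h \in \mathbb{R}^n$ are introduced here only for illustration; the computation uses that the derivative of $u \mapsto u \circ u$ in direction $h$ is $2\,u \circ h = 2 L_u h$.

```latex
\begin{aligned}
D_x\big(z \circ z\big)[h] &= 2\, z \circ (J_z h) = 2\, L_z J_z h, \\
D_x\Big[\big(x + \tfrac{\tau-2}{2} y\big)^2 + \tfrac{\tau(4-\tau)}{4} y^2\Big][h]
  &= 2\,\big(x + \tfrac{\tau-2}{2} y\big) \circ h = 2\, L_{x + \frac{\tau-2}{2} y}\, h, \\
\Longrightarrow\quad L_z J_z &= L_{x + \frac{\tau-2}{2} y}
  \quad\Longrightarrow\quad
  \nabla_x z \, L_z = J_z^T L_z = L_{x + \frac{\tau-2}{2} y}.
\end{aligned}
```

The last implication transposes both sides and uses the symmetry of $L_z$ and $L_{x + \frac{\tau-2}{2} y}$.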
Case (3): $(x, y) \neq (0, 0)$ and $(x - y)^2 + \tau (x \circ y) \notin \operatorname{int}(\mathcal{K}^n)$. For any $x' = (x'_1, x'_2), y' = (y'_1, y'_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, it is easy to verify that
\[ \begin{aligned} 2\psi_\tau(x', y') &= \left\|\left[(x')^2 + (y')^2 + (\tau - 2)(x' \circ y')\right]^{1/2}\right\|^2 + \|x' + y'\|^2 - 2\left\langle \left[(x')^2 + (y')^2 + (\tau - 2)(x' \circ y')\right]^{1/2}, x' + y' \right\rangle \\ &= \|x'\|^2 + \|y'\|^2 + (\tau - 2)\langle x', y' \rangle + \|x' + y'\|^2 - 2\left\langle \left[(x')^2 + (y')^2 + (\tau - 2)(x' \circ y')\right]^{1/2}, x' + y' \right\rangle, \end{aligned} \]
where the second equality uses the fact that $\|z\|^2 = \langle z^2, e \rangle$ for any $z \in \mathbb{R}^n$. Since $\|x'\|^2 + \|y'\|^2 + (\tau - 2)\langle x', y' \rangle + \|x' + y'\|^2$ is clearly differentiable in $(x', y')$, it suffices to show that $\left\langle \left[(x')^2 + (y')^2 + (\tau - 2)(x' \circ y')\right]^{1/2}, x' + y' \right\rangle$ is differentiable at $(x', y') = (x, y)$. By Lemma 3.1, $w_2 = w_2(x, y) \neq 0$, which implies $w'_2 := w_2(x', y') = 2 x'_1 x'_2 + 2 y'_1 y'_2 + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \neq 0$ for all $(x', y') \in \mathbb{R}^n \times \mathbb{R}^n$ sufficiently near to $(x, y)$. Let $\mu_1, \mu_2$ be the spectral values of $(x')^2 + (y')^2 + (\tau - 2)(x' \circ y')$. Then we can compute that
\[ \begin{aligned} 2\left\langle \left[(x')^2 + (y')^2 + (\tau - 2)(x' \circ y')\right]^{1/2}, x' + y' \right\rangle &= \sqrt{\mu_2} \left[x'_1 + y'_1 + \frac{(w'_2)^T (x'_2 + y'_2)}{\|w'_2\|}\right] \\ &\quad + \sqrt{\mu_1} \left[x'_1 + y'_1 - \frac{(w'_2)^T (x'_2 + y'_2)}{\|w'_2\|}\right]. \end{aligned} \tag{26} \]
Since $\lambda_2(w) > 0$ and $w_2(x, y) \neq 0$, the first term on the right-hand side of (26) is differentiable at $(x', y') = (x, y)$. Now, we claim that the second term is $o(\|x' - x\| + \|y' - y\|)$, i.e., it is differentiable at $(x, y)$ with zero gradient. To see this, notice that $w_2(x, y) \neq 0$, and hence $\mu_1 = \|x'\|^2 + \|y'\|^2 + (\tau - 2)\langle x', y' \rangle - \|w'_2\|$, viewed as a function of $(x', y')$, is differentiable at $(x', y') = (x, y)$. Moreover, $\mu_1 = \lambda_1(w) = 0$ when $(x', y') = (x, y)$. Thus, the first-order Taylor expansion of $\mu_1$ at $(x, y)$ yields
\[ \mu_1 = O(\|x' - x\| + \|y' - y\|). \]
Also, since $w_2(x, y) \neq 0$, by the product and quotient rules for differentiation, the function
\[ x'_1 + y'_1 - \frac{(w'_2)^T (x'_2 + y'_2)}{\|w'_2\|} \tag{27} \]
is differentiable at $(x', y') = (x, y)$, and it has value 0 at $(x', y') = (x, y)$ due to
\[ x_1 + y_1 - \frac{\left[x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\right]^T (x_2 + y_2)}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\|} = x_1 - x_2^T \frac{w_2}{\|w_2\|} + y_1 - y_2^T \frac{w_2}{\|w_2\|} = 0. \]
Hence, the function in (27) is $O(\|x' - x\| + \|y' - y\|)$ in magnitude, which together with $\mu_1 = O(\|x' - x\| + \|y' - y\|)$ shows that the second term on the right-hand side of (26) is
\[ O\left((\|x' - x\| + \|y' - y\|)^{3/2}\right) = o(\|x' - x\| + \|y' - y\|). \]
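Thus, we have shown that $\psi_\tau$ is differentiable at $(x, y)$. Moreover, we see that $2\nabla \psi_\tau(x, y)$ is the gradient of $\|x'\|^2 + \|y'\|^2 + (\tau - 2)\langle x', y' \rangle + \|x' + y'\|^2$ minus the gradient of the first term on the right-hand side of (26), evaluated at $(x', y') = (x, y)$.
The gradient of $\|x'\|^2 + \|y'\|^2 + (\tau - 2)\langle x', y' \rangle + \|x' + y'\|^2$ with respect to $x'$, evaluated at $(x', y') = (x, y)$, is $2x + (\tau - 2)y + 2(x + y)$. The derivative of the first term on the right-hand side of (26) with respect to $x'_1$, evaluated at $(x', y') = (x, y)$, works out to be
\[ \begin{aligned} &\frac{1}{\sqrt{\lambda_2(w)}} \left[\left(x_1 + \tfrac{\tau - 2}{2} y_1\right) + \left(x_2 + \tfrac{\tau - 2}{2} y_2\right)^T \frac{w_2}{\|w_2\|}\right] \left(x_1 + y_1 + (x_2 + y_2)^T \frac{w_2}{\|w_2\|}\right) \\ &\quad + \sqrt{\lambda_2(w)} \left[1 + \frac{\left(x_2 + \tfrac{\tau - 2}{2} y_2\right)^T (x_2 + y_2)}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\|} - \frac{w_2^T (x_2 + y_2) \cdot w_2^T \left(x_2 + \tfrac{\tau - 2}{2} y_2\right)}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \cdot \|w_2\|^2}\right] \\ &= \frac{2\left(x_1 + \tfrac{\tau - 2}{2} y_1\right)(x_1 + y_1)}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} + 2\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}, \end{aligned} \]
where the equality follows from Lemma 3.1. Similarly, the gradient of the first term on the right of (26) with respect to $x'_2$, evaluated at $(x', y') = (x, y)$, works out to be
\[ \begin{aligned} &\frac{1}{\sqrt{\lambda_2(w)}} \left[\left(x_2 + \tfrac{\tau - 2}{2} y_2\right) + \left(x_1 + \tfrac{\tau - 2}{2} y_1\right) \frac{w_2}{\|w_2\|}\right] \left(x_1 + y_1 + (x_2 + y_2)^T \frac{w_2}{\|w_2\|}\right) \\ &\quad + \sqrt{\lambda_2(w)} \left[\frac{(2 x_1 + (\tau - 2) y_1) x_2 + \tfrac{\tau}{2}(x_1 + y_1) y_2}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\|} - \frac{w_2^T (x_2 + y_2) \cdot \left(x_1 + \tfrac{\tau - 2}{2} y_1\right) w_2}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \cdot \|w_2\|^2}\right] \\ &= \frac{2\left[(2 x_1 + (\tau - 2) y_1) x_2 + \tfrac{\tau}{2}(x_1 + y_1) y_2\right]}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}}. \end{aligned} \]
Then, combining the last two gradient expressions yields that
\[ \begin{aligned} 2\nabla_x \psi_\tau(x, y) &= 2x + (\tau - 2)y + 2(x + y) - \begin{bmatrix} 2\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1} \\ 0 \end{bmatrix} \\ &\quad - \frac{2}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} \begin{bmatrix} \left(x_1 + \tfrac{\tau - 2}{2} y_1\right)(x_1 + y_1) \\ (2 x_1 + (\tau - 2) y_1) x_2 + \tfrac{\tau}{2}(x_1 + y_1) y_2 \end{bmatrix}. \end{aligned} \]
Using the fact that $x_1 y_2 = y_1 x_2$ and noting that $\phi_\tau$ can be simplified as in (23) in this case, we readily rewrite the above expression for $\nabla_x \psi_\tau(x, y)$ in the form of (25). By symmetry, $\nabla_y \psi_\tau(x, y)$ also takes the form in (25). □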
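Formula (24) lends itself to a finite-difference sanity check at a point where $w$ lies in the interior of the cone. The sketch below is an illustration under that assumption only; all function names, the step size `h` and the test point are hypothetical choices, and the boundary formula (25) is not exercised here.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def arrow(x):
    """Matrix L_x with first row/column x and diagonal x1."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def z_of(x, y, tau):
    """z = [(x-y)^2 + tau*(x∘y)]^{1/2}, computed through (18)."""
    w = jordan(x - y, x - y) + tau * jordan(x, y)
    nrm = np.linalg.norm(w[1:])
    wbar = w[1:] / nrm if nrm > 0 else np.eye(w.size - 1)[0]
    r1 = np.sqrt(max(w[0] - nrm, 0.0))
    r2 = np.sqrt(max(w[0] + nrm, 0.0))
    return np.concatenate(([(r2 + r1) / 2], (r2 - r1) / 2 * wbar))

def psi_tau(x, y, tau):
    phi = z_of(x, y, tau) - (x + y)
    return 0.5 * phi @ phi

def grad_x_psi(x, y, tau):
    """First formula of (24); valid when w is in int(K^n)."""
    z = z_of(x, y, tau)
    phi = z - (x + y)
    a = x + 0.5 * (tau - 2.0) * y
    # [L_a L_z^{-1} - I] phi, using a linear solve instead of an inverse
    return arrow(a) @ np.linalg.solve(arrow(z), phi) - phi

def fd_grad_x(x, y, tau, h=1e-6):
    """Central finite differences of psi_tau with respect to x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        g[i] = (psi_tau(x + step, y, tau) - psi_tau(x - step, y, tau)) / (2 * h)
    return g
```

At the test point $x = (1, 0.3, -0.2)$, $y = (0.5, -0.4, 0.7)$, $\tau = 1.5$ one finds $w = (1.91, 0.325, 0)$, which is in the interior of $\mathcal{K}^3$, so `grad_x_psi` and `fd_grad_x` should agree up to discretization error.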
Proposition 3.2 shows that ψτ is differentiable with a computable gradient. To establish the continuity of the gradient of ψτ, i.e., the smoothness of ψτ, we need the following two crucial technical lemmas, whose proofs are provided in the appendix.
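Lemma 3.2 For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, if $w_2 \neq 0$, then
\[ \left[\left(x_1 + \tfrac{\tau - 2}{2} y_1\right) + (-1)^i \left(x_2 + \tfrac{\tau - 2}{2} y_2\right)^T \frac{w_2}{\|w_2\|}\right]^2 \le \left\|\left(x_2 + \tfrac{\tau - 2}{2} y_2\right) + (-1)^i \left(x_1 + \tfrac{\tau - 2}{2} y_1\right) \frac{w_2}{\|w_2\|}\right\|^2 \le \lambda_i(w) \]
for $i = 1, 2$. Furthermore, these relations also hold when interchanging $x$ and $y$.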
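Lemma 3.3 For all $(x, y)$ satisfying $(x - y)^2 + \tau (x \circ y) \in \operatorname{int}(\mathcal{K}^n)$, we have that
\[ \left\|L_{x + \frac{\tau - 2}{2} y} L_z^{-1}\right\|_F \le C, \qquad \left\|L_{y + \frac{\tau - 2}{2} x} L_z^{-1}\right\|_F \le C, \tag{28} \]
where $C > 0$ is a constant independent of $x$, $y$ and $\tau$, and $\|\cdot\|_F$ denotes the Frobenius norm.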
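Proposition 3.3 The function $\psi_\tau$ defined by (9) is smooth everywhere on $\mathbb{R}^n \times \mathbb{R}^n$.
Proof. By Proposition 3.2 and the symmetry of $x$ and $y$ in $\nabla \psi_\tau$, it suffices to show that $\nabla_x \psi_\tau$ is continuous at every $(a, b) \in \mathbb{R}^n \times \mathbb{R}^n$. If $(a - b)^2 + \tau (a \circ b) \in \operatorname{int}(\mathcal{K}^n)$, the conclusion has been shown in Proposition 3.2. We next consider the other two cases.
Case (1): $(a, b) = (0, 0)$. By Proposition 3.2, we need to show that $\nabla_x \psi_\tau(x, y) \to 0$ as $(x, y) \to (0, 0)$. If $(x - y)^2 + \tau (x \circ y) \in \operatorname{int}(\mathcal{K}^n)$, then $\nabla_x \psi_\tau(x, y)$ is given by (24), whereas if $(x, y) \neq (0, 0)$ and $(x - y)^2 + \tau (x \circ y) \notin \operatorname{int}(\mathcal{K}^n)$, then $\nabla_x \psi_\tau(x, y)$ is given by (25). Since $L_{x + \frac{\tau - 2}{2} y} L_z^{-1}$ and $\dfrac{x_1 + \frac{\tau - 2}{2} y_1}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}}$ are bounded with bounds independent of $x$, $y$ and $\tau$, the continuity of $\phi_\tau$ together with $\phi_\tau(0, 0) = 0$ immediately yields the desired result.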
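Case (2): $(a, b) \neq (0, 0)$ and $(a - b)^2 + \tau (a \circ b) \notin \operatorname{int}(\mathcal{K}^n)$. We will show that $\nabla_x \psi_\tau(x, y) \to \nabla_x \psi_\tau(a, b)$ by the two subcases: (2a) $(x, y) \neq (0, 0)$ and $(x - y)^2 + \tau (x \circ y) \notin \operatorname{int}(\mathcal{K}^n)$ and (2b) $(x - y)^2 + \tau (x \circ y) \in \operatorname{int}(\mathcal{K}^n)$. In subcase (2a), $\nabla_x \psi_\tau(x, y)$ is given by (25). Noting that the right-hand side of (25) is continuous at $(a, b)$, the desired result follows.
Next, we prove that $\nabla_x \psi_\tau(x, y) \to \nabla_x \psi_\tau(a, b)$ in subcase (2b). From (24), we have that
\[ \nabla_x \psi_\tau(x, y) = \left(x + \frac{\tau - 2}{2} y\right) - L_{x + \frac{\tau - 2}{2} y} L_z^{-1} (x + y) - \phi_\tau(x, y). \tag{29} \]
On the other hand, since $(a, b) \neq (0, 0)$ and $(a - b)^2 + \tau (a \circ b) \notin \operatorname{int}(\mathcal{K}^n)$,
\[ \|a\|^2 + \|b\|^2 + (\tau - 2) a^T b = \|2(a_1 a_2 + b_1 b_2) + (\tau - 2)(a_1 b_2 + b_1 a_2)\| \neq 0, \tag{30} \]
and moreover from (20) it follows that
\[ \begin{aligned} \|a\|^2 + \|b\|^2 + (\tau - 2) a^T b &= 2\left(a_1^2 + b_1^2 + (\tau - 2) a_1 b_1\right) \\ &= 2\left(\|a_2\|^2 + \|b_2\|^2 + (\tau - 2) a_2^T b_2\right) \\ &= 2\|(a_1 a_2 + b_1 b_2) + (\tau - 2) a_1 b_2\|. \end{aligned} \tag{31} \]
Using the equalities in (31), it is not hard to verify that
\[ \frac{a_1 + \frac{\tau - 2}{2} b_1}{\sqrt{a_1^2 + b_1^2 + (\tau - 2) a_1 b_1}} \left((a - b)^2 + \tau (a \circ b)\right)^{1/2} = a + \frac{\tau - 2}{2} b. \]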