to appear in Journal of Nonlinear and Convex Analysis, 2017
On four discrete-type families of NCP-functions
Chien-Hao Huang1, Kang-Jun Weng 2, Jein-Shan Chen 3 , Hsun-Wei Chu 4, Ming-Yen Li 5
Department of Mathematics National Taiwan Normal University
Taipei 11677, Taiwan
September 1, 2016 (revised on February 1, 2017)
Abstract. In this paper, we look into the detailed properties of four discrete-type families of NCP-functions, which are newly discovered in recent literature. With the discrete-oriented feature, we are motivated to know what differences there are compared to the traditional NCP-functions. The properties obtained in this paper not only explain the difference but also provide background bricks for designing solution methods based on such discrete-type families of NCP-functions.
Keywords. NCP-function; Complementarity; Semismooth.
1 Introduction
The nonlinear complementarity problem (NCP) [18, 27] is to find a point $x \in \mathbb{R}^n$ such that
\[ x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \]
where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product and $F = (F_1, \ldots, F_n)^T$ maps from $\mathbb{R}^n$ to $\mathbb{R}^n$. The NCP has attracted much attention due to its various applications in operations research, economics, and engineering; see [13, 18, 27] and references therein. There have
1E-mail: qqnick0719@ntnu.edu.tw
2E-mail: kanewang316@gmail.com
3Corresponding author. The author’s work is supported by Ministry of Science and Technology, Taiwan.
4E-mail: 80040003s@ntnu.edu.tw
5E-mail: leemy801026@gmail.com
been many methods proposed for solving the NCP. Among them, one of the most popular and powerful approaches, which has been studied intensively in recent years, is to reformulate the NCP as a system of nonlinear equations [23] or as an unconstrained minimization problem [12, 14, 19]. A function that yields an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP. In constructing a merit function, the class of so-called NCP-functions plays an important role.
A function φ : R2 → R is called an NCP-function if it satisfies
φ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0. (1)
Many NCP-functions and merit functions have been explored and proposed in the literature; see [16] for a survey. Among them, the Fischer-Burmeister (FB) function and the Natural-Residual (NR) function are two effective NCP-functions. The FB function $\phi_{\mathrm{FB}} : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[ \phi_{\mathrm{FB}}(a, b) = \sqrt{a^2 + b^2} - (a + b), \tag{2} \]
and the NR function $\phi_{\mathrm{NR}} : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[ \phi_{\mathrm{NR}}(a, b) = a - (a - b)_+ = \min\{a, b\}, \tag{3} \]
where $(t)_+$ means $\max\{0, t\}$ for any $t \in \mathbb{R}$.
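As a quick numerical illustration of definition (1), the following minimal sketch evaluates the FB function (2) and the NR function (3) on a small grid and checks that each vanishes exactly at the complementary points. The helper names (`phi_fb`, `phi_nr`, `is_complementary`, `mismatches`) and the grid are illustrative assumptions, not part of the paper.

```python
import math

def phi_fb(a, b):
    """Fischer-Burmeister function, equation (2)."""
    return math.hypot(a, b) - (a + b)

def phi_nr(a, b):
    """Natural-residual function, equation (3): a - (a - b)_+ = min{a, b}."""
    return a - max(0.0, a - b)

def is_complementary(a, b):
    """Right-hand side of (1): a >= 0, b >= 0, ab = 0."""
    return a >= 0 and b >= 0 and a * b == 0

# Half-integer grid on [-3, 3]^2 (hypothetical test grid; values are exact floats).
grid = [i / 2 for i in range(-6, 7)]

def mismatches(phi, tol=1e-12):
    """Grid points where 'phi vanishes' and 'point is complementary' disagree."""
    return [(a, b) for a in grid for b in grid
            if (abs(phi(a, b)) <= tol) != is_complementary(a, b)]

print(mismatches(phi_fb), mismatches(phi_nr))
```

A grid check of this kind is only a sanity test on finitely many points; it does not replace the proofs cited in the text.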
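Recently, the generalized Fischer-Burmeister function $\phi^p_{\mathrm{FB}}$, which includes the Fischer-Burmeister function as a special case, was considered in [2, 3, 4, 8, 33]. Indeed, the function $\phi^p_{\mathrm{FB}}$ is a natural extension of $\phi_{\mathrm{FB}}$, in which the 2-norm in $\phi_{\mathrm{FB}}$ is replaced by a general $p$-norm. In other words, $\phi^p_{\mathrm{FB}} : \mathbb{R}^2 \to \mathbb{R}$ is defined as
\[ \phi^p_{\mathrm{FB}}(a, b) = \|(a, b)\|_p - (a + b), \tag{4} \]
where $p > 1$ and $\|(a, b)\|_p = \sqrt[p]{|a|^p + |b|^p}$. The detailed geometric view of $\phi^p_{\mathrm{FB}}$ is depicted in [33]. Corresponding to $\phi^p_{\mathrm{FB}}$, there is a merit function $\psi^p_{\mathrm{FB}} : \mathbb{R}^2 \to \mathbb{R}_+$ given by
\[ \psi^p_{\mathrm{FB}}(a, b) = \frac{1}{2} \left| \phi^p_{\mathrm{FB}}(a, b) \right|^2. \tag{5} \]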
For any given $p > 1$, the function $\psi^p_{\mathrm{FB}}$ is a nonnegative NCP-function and smooth on $\mathbb{R}^2$. Note that $\phi^p_{\mathrm{FB}}$ is a natural "continuous" type of generalization of the FB function $\phi_{\mathrm{FB}}$. In contrast, what does a "generalized natural-residual function" look like? In [7], Chen et al. give an answer to this long-standing open question. More specifically, the generalized natural-residual function, denoted by $\phi^p_{\mathrm{NR}}$, is defined by
\[ \phi^p_{\mathrm{NR}}(a, b) = a^p - (a - b)_+^p \tag{6} \]
with $p > 1$ being a positive odd integer. As remarked in [7], the main idea behind its construction relies on "discrete generalization", not "continuous generalization". Note that when $p = 1$, $\phi^p_{\mathrm{NR}}$ reduces to the natural residual function $\phi_{\mathrm{NR}}$.
Unlike the surface of $\phi^p_{\mathrm{FB}}$, the surface of $\phi^p_{\mathrm{NR}}$ is not symmetric, which may cause difficulties in further analysis and in designing solution methods. To this end, Chang et al. [1] try to symmetrize the function $\phi^p_{\mathrm{NR}}$. The first-type symmetrization of $\phi^p_{\mathrm{NR}}$, denoted by $\phi^p_{\mathrm{S-NR}}$, is proposed as
\[ \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} a^p - (a - b)^p & \text{if } a > b, \\ a^p = b^p & \text{if } a = b, \\ b^p - (b - a)^p & \text{if } a < b, \end{cases} \tag{7} \]
where $p > 1$ is a positive odd integer. It is shown in [1] that $\phi^p_{\mathrm{S-NR}}$ is an NCP-function with a symmetric surface, but it is not differentiable. Therefore, it is natural to ask whether there exists another symmetrization that not only has a symmetric surface but is also differentiable. Fortunately, Chang et al. [1] also work out a second symmetrization of $\phi^p_{\mathrm{NR}}$, denoted by $\psi^p_{\mathrm{S-NR}}$, which is proposed as
\[ \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} a^p b^p - (a - b)^p b^p & \text{if } a > b, \\ a^p b^p = a^{2p} & \text{if } a = b, \\ a^p b^p - (b - a)^p a^p & \text{if } a < b, \end{cases} \tag{8} \]
where $p > 1$ is a positive odd integer. As expected, the function $\psi^p_{\mathrm{S-NR}}$ is not only differentiable but also possesses a symmetric surface. To sum up, there exist three discrete-type families of NCP-functions, $\phi^p_{\mathrm{NR}}$, $\phi^p_{\mathrm{S-NR}}$, and $\psi^p_{\mathrm{S-NR}}$, which are based on the NR function $\phi_{\mathrm{NR}}$.
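Next, we elaborate on the above three new NCP-functions.
(i) For $p$ being an even integer, none of the above is an NCP-function. A counterexample is given below: the point $(-1, -2)$ does not satisfy the conditions on the right-hand side of (1), yet for $p = 2$ all three functions vanish there.
\[ \phi^2_{\mathrm{NR}}(-1, -2) = (-1)^2 - (-1 + 2)_+^2 = 0. \]
\[ \phi^2_{\mathrm{S-NR}}(-1, -2) = (-1)^2 - (-1 + 2)^2 = 0. \]
\[ \psi^2_{\mathrm{S-NR}}(-1, -2) = (-1)^2 (-2)^2 - (-1 + 2)^2 (-2)^2 = 0. \]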
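(ii) The above three functions are neither convex nor concave. To see this, take $p = 3$ and observe the following.
\[ 1 = \phi^3_{\mathrm{NR}}(1, 1) < \tfrac{1}{2} \phi^3_{\mathrm{NR}}(0, 1) + \tfrac{1}{2} \phi^3_{\mathrm{NR}}(2, 1) = \tfrac{0}{2} + \tfrac{7}{2} = \tfrac{7}{2}. \]
\[ 1 = \phi^3_{\mathrm{NR}}(1, 1) > \tfrac{1}{2} \phi^3_{\mathrm{NR}}(1, -1) + \tfrac{1}{2} \phi^3_{\mathrm{NR}}(1, 3) = -\tfrac{7}{2} + \tfrac{1}{2} = -3. \]
\[ 1 = \phi^3_{\mathrm{S-NR}}(1, 1) < \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(0, 0) + \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(2, 2) = \tfrac{0}{2} + \tfrac{8}{2} = 4. \]
\[ 1 = \phi^3_{\mathrm{S-NR}}(1, 1) > \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(2, 0) + \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(0, 2) = \tfrac{0}{2} + \tfrac{0}{2} = 0. \]
\[ 1 = \psi^3_{\mathrm{S-NR}}(1, 1) < \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(0, 0) + \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(2, 2) = \tfrac{0}{2} + \tfrac{64}{2} = 32. \]
\[ 1 = \psi^3_{\mathrm{S-NR}}(1, 1) > \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(2, 0) + \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(0, 2) = \tfrac{0}{2} + \tfrac{0}{2} = 0. \]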
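The arithmetic in remarks (i) and (ii) can be reproduced with a short script. This is a minimal sketch that transcribes definitions (6), (7), and (8) directly; the function names are hypothetical, and integer inputs are used so that all values are exact.

```python
def phi_nr_p(a, b, p):
    """Generalized NR function (6): a^p - (a - b)_+^p."""
    return a ** p - max(0, a - b) ** p

def phi_snr_p(a, b, p):
    """First symmetrization (7)."""
    if a > b:
        return a ** p - (a - b) ** p
    if a < b:
        return b ** p - (b - a) ** p
    return a ** p

def psi_snr_p(a, b, p):
    """Second symmetrization (8)."""
    if a > b:
        return a ** p * b ** p - (a - b) ** p * b ** p
    if a < b:
        return a ** p * b ** p - (b - a) ** p * a ** p
    return a ** (2 * p)

def midpoint_average(f, w1, w2, p=3):
    """Average of f at two points whose midpoint is (1, 1) in remark (ii)."""
    return 0.5 * f(*w1, p) + 0.5 * f(*w2, p)

# Remark (i): with the even exponent p = 2, all three vanish at (-1, -2).
print([f(-1, -2, 2) for f in (phi_nr_p, phi_snr_p, psi_snr_p)])

# Remark (ii): compare f(1, 1) with averages over segments through (1, 1).
print(phi_nr_p(1, 1, 3), midpoint_average(phi_nr_p, (0, 1), (2, 1)),
      midpoint_average(phi_nr_p, (1, -1), (1, 3)))
```

Each pair of inequalities in (ii) exhibits one segment on which the midpoint value lies below the average and another on which it lies above, which rules out both convexity and concavity.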
The idea of "discrete generalization" looks simple, but it is novel and important. In fact, the authors also apply this idea to construct more NCP-functions. For example, they apply it to the Fischer-Burmeister function to obtain $\phi^p_{\mathrm{D-FB}} : \mathbb{R}^2 \to \mathbb{R}$ given by
\[ \phi^p_{\mathrm{D-FB}}(a, b) = \left( \sqrt{a^2 + b^2} \right)^p - (a + b)^p, \tag{9} \]
where $p > 1$ is a positive odd integer. This function is proved to be an NCP-function in [22]. In addition, it can also serve as a complementarity function for the second-order cone complementarity problem (SOCCP) [22].
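The role of the odd exponent in (9) can be illustrated numerically. The sketch below, with hypothetical helper names and a hypothetical test grid, checks on finitely many points that $\phi^p_{\mathrm{D-FB}}$ vanishes exactly at complementary points for $p = 3$ and $p = 5$, and evaluates one point where the even exponent $p = 2$ gives a spurious zero (for $p = 2$ the expression reduces to $-2ab$).

```python
import math

def phi_dfb(a, b, p=3):
    """Discrete-type FB function (9): (sqrt(a^2 + b^2))^p - (a + b)^p."""
    return math.hypot(a, b) ** p - (a + b) ** p

def is_complementary(a, b):
    """Right-hand side of (1)."""
    return a >= 0 and b >= 0 and a * b == 0

# Half-integer grid on [-3, 3]^2 (an arbitrary choice for illustration).
grid = [i / 2 for i in range(-6, 7)]

def mismatches(p, tol=1e-12):
    """Grid points where vanishing of phi_dfb and complementarity disagree."""
    return [(a, b) for a in grid for b in grid
            if (abs(phi_dfb(a, b, p)) <= tol) != is_complementary(a, b)]

print(mismatches(3), mismatches(5))
print(phi_dfb(-1.0, 0.0, 2))  # even exponent: zero at a non-complementary point
```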
The aforementioned four discrete-type families of NCP-functions are newly discovered. Unlike the existing NCP-functions, they are discrete-oriented in some sense. However, what other differences are there compared to the traditional continuous-type families of NCP-functions? This is the main motivation of this paper. Even though some of these functions are differentiable, we point out that the Newton method may not be applied directly because the Jacobian at a degenerate solution of the NCP may be singular (see [19, 20]). Nonetheless, differentiability may allow other methods that rely on it (such as quasi-Newton methods and neural network methods) to be employed directly for solving the NCP. In this paper, we look into the detailed properties of these four discrete-type families of NCP-functions. The properties investigated in this paper not only explain the differences but also provide building blocks for designing solution methods based on such discrete-type families of NCP-functions.
The paper is organized as follows. In Section 2, we review some background definitions, including local Lipschitz continuity and semismoothness, together with known results about $\phi^p_{\mathrm{FB}}$ and $\psi^p_{\mathrm{FB}}$ and their related properties. In Sections 3-6, we discuss the properties of $\phi^p_{\mathrm{NR}}$, $\phi^p_{\mathrm{S-NR}}$, $\psi^p_{\mathrm{S-NR}}$, and $\phi^p_{\mathrm{D-FB}}$, respectively. In particular, we also discuss the semismoothness of $\phi^p_{\mathrm{S-NR}}$ in Section 4.
2 Preliminaries
In this section, we recall some background concepts and materials which will play an important role in the subsequent analysis. We begin with the so-called semismooth functions. The semismooth function, as introduced by Mifflin [24] for functionals and further extended by Qi and Sun [30] for vector-valued functions, is of particular interest due to the central role it plays in the superlinear convergence analysis of certain generalized Newton methods; see [30]. First, we say that $F : \mathbb{R}^n \to \mathbb{R}^m$ is strictly continuous (also called locally Lipschitz continuous) at $x \in \mathbb{R}^n$ [31, Chap. 9] if there exist scalars $\kappa > 0$ and $\delta > 0$ such that
\[ \|F(y) - F(z)\| \le \kappa \|y - z\| \quad \forall y, z \in \mathbb{R}^n \text{ with } \|y - x\| \le \delta \text{ and } \|z - x\| \le \delta. \]
The mapping $F$ is locally Lipschitz continuous if $F$ is locally Lipschitz continuous at every $x \in \mathbb{R}^n$. If $\delta$ can be taken to be $\infty$, then $F$ is Lipschitz continuous with Lipschitz constant $\kappa$. We say $F$ is directionally differentiable at $x \in \mathbb{R}^n$ if
\[ F'(x; h) := \lim_{t \to 0^+} \frac{F(x + th) - F(x)}{t} \quad \text{exists } \forall h \in \mathbb{R}^n; \]
and $F$ is directionally differentiable if $F$ is directionally differentiable at every $x \in \mathbb{R}^n$. If $F$ is locally Lipschitz continuous, then $F$ is almost everywhere differentiable by Rademacher's Theorem; see [31, Section 9J]. In this case, the generalized Jacobian $\partial F(x)$ of $F$ at $x$ (in the Clarke sense) can be defined as the convex hull of the B-subdifferential $\partial_B F(x)$, where
\[ \partial_B F(x) := \left\{ \lim_{x^j \to x} \nabla F(x^j) \;\middle|\; F \text{ is differentiable at } x^j \in \mathbb{R}^n \right\}. \]
Assume $F$ is locally Lipschitz continuous. We say $F$ is semismooth at $x \in \mathbb{R}^n$ if $F$ is directionally differentiable at $x$ and, for any $V \in \partial F(x + h)$ and $h \to 0$, we have
\[ F(x + h) - F(x) - V h = o(\|h\|). \tag{10} \]
Moreover, $F$ is called $\rho$-order semismooth at $x \in \mathbb{R}^n$ ($0 < \rho < \infty$) if $F$ is semismooth at $x$ and, for any $V \in \partial F(x + h)$ and $h \to 0$, we have
\[ F(x + h) - F(x) - V h = O(\|h\|^{1+\rho}). \tag{11} \]
The mapping $F$ is semismooth (respectively, $\rho$-order semismooth) if $F$ is semismooth (respectively, $\rho$-order semismooth) at every $x \in \mathbb{R}^n$. We say $F$ is strongly semismooth if it is 1-order semismooth. Convex functions and piecewise continuously differentiable functions are examples of semismooth functions. The composition of two (respectively, $\rho$-order) semismooth functions is also a (respectively, $\rho$-order) semismooth function. The property of semismoothness plays an important role in nonsmooth Newton methods [29, 30] as well as in some smoothing methods.
An important concept related to semismooth function is the SC1 function, which is introduced as below.
Definition 2.1. A function $F : \mathbb{R}^n \to \mathbb{R}^m$ is said to be an $SC^1$ function if $F$ is continuously differentiable and its gradient is semismooth.
We can view $SC^1$ functions as functions lying between $C^1$ and $C^2$ functions. By defining $SC^1$ functions, many results regarding the minimization of $C^2$ functions can be extended to the minimization of $SC^1$ functions; see [28] and references therein. In addition to $SC^1$ functions, we also introduce $LC^1$ functions here.
Definition 2.2. A function $F : \mathbb{R}^n \to \mathbb{R}^m$ is called an $LC^1$ function if $F$ is continuously differentiable and its gradient is locally Lipschitz continuous.
In light of the above definitions, given any $F : \mathbb{R}^n \to \mathbb{R}^m$, we have the following relations.
strongly semismooth
⇑ ⇓
C2 ⇒ SC1 ⇒ LC1 ⇒ C1 ⇒ semismooth ⇒ locally Lipschitz
⇑ convex
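To close this section, we present some well-known properties of $\phi^p_{\mathrm{FB}}$ and $\psi^p_{\mathrm{FB}}$, defined as in (4) and (5), respectively, which are important for designing a derivative-free descent algorithm.
Property 2.1. ([8, Proposition 3.1]) Let $\phi^p_{\mathrm{FB}}$ be defined as in (4). Then, the following hold.
(a) $\phi^p_{\mathrm{FB}}$ is an NCP-function, i.e., it satisfies (1).
(b) $\phi^p_{\mathrm{FB}}$ is sub-additive, i.e., $\phi^p_{\mathrm{FB}}(w + w') \le \phi^p_{\mathrm{FB}}(w) + \phi^p_{\mathrm{FB}}(w')$ for all $w, w' \in \mathbb{R}^2$.
(c) $\phi^p_{\mathrm{FB}}$ is positive homogeneous, i.e., $\phi^p_{\mathrm{FB}}(\alpha w) = \alpha \phi^p_{\mathrm{FB}}(w)$ for all $w \in \mathbb{R}^2$ and $\alpha \ge 0$.
(d) $\phi^p_{\mathrm{FB}}$ is convex, i.e., $\phi^p_{\mathrm{FB}}(\alpha w + (1 - \alpha) w') \le \alpha \phi^p_{\mathrm{FB}}(w) + (1 - \alpha) \phi^p_{\mathrm{FB}}(w')$ for all $w, w' \in \mathbb{R}^2$ and $\alpha \in [0, 1]$.
(e) $\phi^p_{\mathrm{FB}}$ is Lipschitz continuous with $\kappa_1 = \sqrt{2} + 2^{(1/p - 1/2)}$ when $1 < p < 2$, and with $\kappa_2 = 1 + \sqrt{2}$ when $p \ge 2$. In other words, $|\phi^p_{\mathrm{FB}}(w) - \phi^p_{\mathrm{FB}}(w')| \le \kappa_1 \|w - w'\|$ when $1 < p < 2$ and $|\phi^p_{\mathrm{FB}}(w) - \phi^p_{\mathrm{FB}}(w')| \le \kappa_2 \|w - w'\|$ when $p \ge 2$, for all $w, w' \in \mathbb{R}^2$.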
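Property 2.2. Let $\phi^p_{\mathrm{FB}}$ be defined as in (4). Then, for any $\alpha > 0$, the following variants of $\phi^p_{\mathrm{FB}}$ are also NCP-functions.
\[ \begin{aligned} \widetilde{\phi}^p_{\mathrm{FB}-1}(a, b) &= \phi^p_{\mathrm{FB}}(a, b) - \alpha (a)_+ (b)_+, \\ \widetilde{\phi}^p_{\mathrm{FB}-2}(a, b) &= \phi^p_{\mathrm{FB}}(a, b) - \alpha \left( (a)_+ (b)_+ \right)^2, \\ \widetilde{\phi}^p_{\mathrm{FB}-3}(a, b) &= \sqrt{ [\phi^p_{\mathrm{FB}}(a, b)]^2 + \alpha \left( (a)_+ (b)_+ \right)^2 }, \\ \widetilde{\phi}^p_{\mathrm{FB}-4}(a, b) &= \sqrt{ [\phi^p_{\mathrm{FB}}(a, b)]^2 + \alpha \left[ (ab)_+ \right]^2 }. \end{aligned} \]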
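Property 2.3. ([9, Lemma 2.2]) Let $\phi^p_{\mathrm{FB}}$ be defined as in (4). Then, the generalized gradient $\partial \phi^p_{\mathrm{FB}}(a, b)$ of $\phi^p_{\mathrm{FB}}$ at a point $(a, b)$ is equal to the set of all $(v_a, v_b)$ such that
\[ (v_a, v_b) = \begin{cases} \left( \dfrac{\mathrm{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1, \; \dfrac{\mathrm{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) & \text{if } (a, b) \ne (0, 0), \\ (\xi - 1, \zeta - 1) & \text{if } (a, b) = (0, 0), \end{cases} \]
where $(\xi, \zeta)$ is any vector satisfying $|\xi|^{\frac{p}{p-1}} + |\zeta|^{\frac{p}{p-1}} \le 1$.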
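Property 2.4. ([8, Proposition 3.2]) Let $\phi^p_{\mathrm{FB}}$, $\psi^p_{\mathrm{FB}}$ be defined as in (4) and (5), respectively. Then, the following hold.
(a) $\psi^p_{\mathrm{FB}}$ is an NCP-function, i.e., it satisfies (1).
(b) $\psi^p_{\mathrm{FB}}(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$.
(c) $\psi^p_{\mathrm{FB}}$ is continuously differentiable everywhere.
(d) $\nabla_a \psi^p_{\mathrm{FB}}(a, b) \cdot \nabla_b \psi^p_{\mathrm{FB}}(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$. The equality holds if and only if $\phi^p_{\mathrm{FB}}(a, b) = 0$.
(e) $\nabla_a \psi^p_{\mathrm{FB}}(a, b) = 0 \iff \nabla_b \psi^p_{\mathrm{FB}}(a, b) = 0 \iff \phi^p_{\mathrm{FB}}(a, b) = 0$.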
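Property 2.4(d) can be probed numerically. The following is a minimal sketch under the assumption that, away from the origin, $\nabla \psi^p_{\mathrm{FB}} = \phi^p_{\mathrm{FB}} \cdot \nabla \phi^p_{\mathrm{FB}}$ by the chain rule applied to (5), with $\nabla \phi^p_{\mathrm{FB}}$ taken from Property 2.3. All function names, the grid, and the tolerance are illustrative choices rather than anything prescribed in the paper.

```python
import math

def p_norm(a, b, p):
    """The p-norm of (a, b) used in (4)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p)

def phi_fb_p(a, b, p):
    """Generalized FB function (4)."""
    return p_norm(a, b, p) - (a + b)

def grad_psi_fb_p(a, b, p):
    """Gradient of psi = 0.5 * phi^2 at (a, b) != (0, 0) via the chain rule."""
    n = p_norm(a, b, p) ** (p - 1)
    va = math.copysign(abs(a) ** (p - 1), a) / n - 1.0
    vb = math.copysign(abs(b) ** (p - 1), b) / n - 1.0
    phi = phi_fb_p(a, b, p)
    return phi * va, phi * vb

def min_gradient_product(p):
    """Smallest value of grad_a * grad_b over a half-integer grid minus the origin."""
    grid = [i / 2 for i in range(-6, 7)]
    products = []
    for a in grid:
        for b in grid:
            if (a, b) != (0.0, 0.0):
                ga, gb = grad_psi_fb_p(a, b, p)
                products.append(ga * gb)
    return min(products)

for p in (1.5, 2.0, 3.0):
    print(p, min_gradient_product(p))
```

Up to rounding error, the smallest product on the grid should be nonnegative for each tested $p$, in line with part (d).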
3 The function $\phi^p_{\mathrm{NR}}$
In this section, we focus on the generalized NR function $\phi^p_{\mathrm{NR}}$ defined as in (6). Its continuous differentiability is studied in [7]. Here we further study its Lipschitz continuity and some properties that are usually employed in derivative-free algorithms.
Proposition 3.1. ([7, Proposition 2.1]) Let $\phi^p_{\mathrm{NR}}$ be defined as in (6) with $p > 1$ being a positive odd integer. Then, $\phi^p_{\mathrm{NR}}$ is an NCP-function.
Proposition 3.2. ([7, Proposition 2.2]) Let $\phi^p_{\mathrm{NR}}$ be defined as in (6) with $p > 1$ being a positive odd integer, and let $p = 2k + 1$ where $k \in \mathbb{N}$. Then, the following hold.
(a) An alternative expression of $\phi^p_{\mathrm{NR}}$ is
\[ \phi^p_{\mathrm{NR}}(a, b) = a^{2k+1} - \frac{1}{2} \left[ (a - b)^{2k+1} + (a - b)^{2k} |a - b| \right]. \]
(b) The function $\phi^p_{\mathrm{NR}}$ is continuously differentiable with
\[ \nabla \phi^p_{\mathrm{NR}}(a, b) = p \begin{bmatrix} a^{p-1} - (a - b)^{p-2} (a - b)_+ \\ (a - b)^{p-2} (a - b)_+ \end{bmatrix}. \]
(c) The function $\phi^p_{\mathrm{NR}}$ is twice continuously differentiable with
\[ \nabla^2 \phi^p_{\mathrm{NR}}(a, b) = p(p - 1) \begin{bmatrix} a^{p-2} - (a - b)^{p-3} (a - b)_+ & (a - b)^{p-3} (a - b)_+ \\ (a - b)^{p-3} (a - b)_+ & -(a - b)^{p-3} (a - b)_+ \end{bmatrix}. \]
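The gradient formula in Proposition 3.2(b) can be compared against central finite differences. This is a minimal sketch with hypothetical helper names; the step size and the test points are arbitrary choices, and the check is numerical evidence only.

```python
def phi_nr_p(a, b, p=3):
    """Generalized NR function (6)."""
    return a ** p - max(0.0, a - b) ** p

def grad_phi_nr_p(a, b, p=3):
    """Gradient formula of Proposition 3.2(b), for odd p >= 3."""
    s = (a - b) ** (p - 2) * max(0.0, a - b)  # equals (a - b)_+^(p - 1)
    return (p * (a ** (p - 1) - s), p * s)

def fd_grad(f, a, b, h=1e-6):
    """Central finite-difference approximation of the gradient of f at (a, b)."""
    return ((f(a + h, b) - f(a - h, b)) / (2 * h),
            (f(a, b + h) - f(a, b - h)) / (2 * h))

def max_gradient_error(points, p=3):
    """Largest componentwise gap between the formula and finite differences."""
    err = 0.0
    for a, b in points:
        exact = grad_phi_nr_p(a, b, p)
        approx = fd_grad(lambda x, y: phi_nr_p(x, y, p), a, b)
        err = max(err, abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))
    return err

points = [(1.5, 0.5), (0.5, 1.5), (-1.0, -2.0), (1.0, 1.0)]
print(max_gradient_error(points, p=3), max_gradient_error(points, p=5))
```

The point $(1, 1)$ lies on the line $a = b$, where the two branches of $(a - b)_+$ meet, so it is a useful place to see that the gradient remains well defined.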
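Proposition 3.3. ([7, Proposition 2.4]) Let $\phi^p_{\mathrm{NR}}$ be defined as in (6) with $p > 1$ being a positive odd integer. Then, for any $\alpha > 0$, the following variants of $\phi^p_{\mathrm{NR}}$ are also NCP-functions.
\[ \begin{aligned} \widetilde{\phi}^p_{\mathrm{NR}-1}(a, b) &= \phi^p_{\mathrm{NR}}(a, b) + \alpha (a)_+ (b)_+, \\ \widetilde{\phi}^p_{\mathrm{NR}-2}(a, b) &= \phi^p_{\mathrm{NR}}(a, b) + \alpha \left( (a)_+ (b)_+ \right)^2, \\ \widetilde{\phi}^p_{\mathrm{NR}-3}(a, b) &= [\phi^p_{\mathrm{NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^4, \\ \widetilde{\phi}^p_{\mathrm{NR}-4}(a, b) &= [\phi^p_{\mathrm{NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^2. \end{aligned} \]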
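Proposition 3.4. Let $\phi^p_{\mathrm{NR}}$ be defined as in (6) with $p > 1$ being a positive odd integer. Then, the following hold.
(a) $\phi^p_{\mathrm{NR}}(a, b) > 0 \iff a > 0, \ b > 0$.
(b) $\phi^p_{\mathrm{NR}}$ is positive homogeneous of degree $p$, i.e., $\phi^p_{\mathrm{NR}}(\alpha w) = \alpha^p \phi^p_{\mathrm{NR}}(w)$ for all $w \in \mathbb{R}^2$ and $\alpha \ge 0$.
(c) $\phi^p_{\mathrm{NR}}$ is locally Lipschitz continuous, but not (globally) Lipschitz continuous.
(d) $\phi^p_{\mathrm{NR}}$ is not $\alpha$-Hölder continuous for any $\alpha \in (0, 1]$, that is, the Hölder coefficient
\[ [\phi^p_{\mathrm{NR}}]_{\alpha, \mathbb{R}^2} := \sup_{w \ne w'} \frac{|\phi^p_{\mathrm{NR}}(w) - \phi^p_{\mathrm{NR}}(w')|}{\|w - w'\|^{\alpha}} \]
is infinite.
(e)
\[ \nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) \ \begin{cases} > 0 & \text{on } \{(a, b) \mid a > b > 0 \text{ or } a > b > 2a\}, \\ = 0 & \text{on } \{(a, b) \mid a \le b \text{ or } a > b = 2a \text{ or } a > b = 0\}, \\ < 0 & \text{otherwise}. \end{cases} \]
(f) $\nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) = 0$ provided that $\phi^p_{\mathrm{NR}}(a, b) = 0$.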
Proof. (a) This result has been mentioned in [7, Lemma 2.2].
(b) It is clear from the definition of $\phi^p_{\mathrm{NR}}$.
(c) Since continuous differentiability implies local Lipschitz continuity, it remains to show that $\phi^p_{\mathrm{NR}}$ is not Lipschitz continuous. Consider the restriction of $\phi^p_{\mathrm{NR}}$ to the line $L := \{(a, b) \mid a = b > 0\}$. Since $\phi^p_{\mathrm{NR}}(a, a) = a^p$ for any $a > 0$, it suffices to show that $f(t) := t^p$ is not Lipschitz continuous. Indeed, for any $M > 0$, choosing $t = \max\{1, M\}$ and $t' = t + 1$ yields
\[ \frac{|f(t) - f(t')|}{|t - t'|} = (t + 1)^p - t^p = (t + 1)^{p-1} + (t + 1)^{p-2} t + \cdots + t^{p-1} > p \cdot t^{p-1} > M. \]
Hence, it follows that $f$ is not Lipschitz continuous.
(d) As in the proof of part (c), we again restrict $\phi^p_{\mathrm{NR}}$ to $L$ and choose the same $t$ and $t'$. Since $|t - t'| = 1$, we also have
\[ \frac{|f(t) - f(t')|}{|t - t'|^{\alpha}} > M \]
for any positive number $M$, that is, $\phi^p_{\mathrm{NR}}$ is not $\alpha$-Hölder continuous.
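(e) According to Proposition 3.2, we know that
\[ \nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) = p^2 \left( a^{p-1} - (a - b)^{p-2} (a - b)_+ \right) \left( (a - b)^{p-2} (a - b)_+ \right) = \begin{cases} p^2 \left( a^{p-1} - (a - b)^{p-1} \right) (a - b)^{p-1} & \text{if } a > b, \\ 0 & \text{if } a \le b. \end{cases} \]
When $a > b$, it is clear that $p^2 > 0$ and $(a - b)^{p-1} > 0$. Thus, we only need to consider the term $a^{p-1} - (a - b)^{p-1}$. Note that $p - 1$ is even, which implies
\[ a^{p-1} = (a - b)^{p-1} \iff |a| = a - b \iff b = 0 \text{ or } b = 2a. \]
In addition to the case $a \le b$, there are two subcases, $a > b = 0$ and $a > b = 2a$, in which $\nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) = 0$. On the other hand, we have
\[ a^{p-1} > (a - b)^{p-1} \iff |a| > a - b \iff b > 0 \text{ or } b > 2a. \]
All of the above shows that $\nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b)$ is positive only when $a > b > 0$ or $a > b > 2a$. For the remaining case, it is not hard to verify that $\nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) < 0$.
(f) It is clear from part (e). □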
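The argument in parts (c) and (d) can be traced numerically: along the diagonal, the difference quotient with $t = \max\{1, M\}$ and $t' = t + 1$ exceeds any prescribed bound $M$. The sketch below uses exact integer arithmetic; the function names are hypothetical.

```python
def phi_nr_p(a, b, p=3):
    """Generalized NR function (6); integer inputs keep the arithmetic exact."""
    return a ** p - max(0, a - b) ** p

def diagonal_quotient(M, p=3):
    """Difference quotient of t -> phi_nr_p(t, t) between t = max{1, M} and t + 1.

    Since |t - t'| = 1, the same number serves as the Holder quotient
    for every exponent alpha in (0, 1], as in part (d).
    """
    t = max(1, M)
    return phi_nr_p(t + 1, t + 1, p) - phi_nr_p(t, t, p)

for M in (1, 10, 100, 1000):
    print(M, diagonal_quotient(M))
```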
4 The function $\phi^p_{\mathrm{S-NR}}$
In this section, we focus on the function $\phi^p_{\mathrm{S-NR}}$ defined as in (7). As mentioned earlier, it is the symmetrization of $\phi^p_{\mathrm{NR}}$. Chang et al. [1] showed that it is not differentiable on the line $L = \{(a, b) \mid a = b\}$. However, this statement needs a slight amendment, since $\phi^p_{\mathrm{S-NR}}$ is differentiable at $(0, 0)$. Here we further study its Lipschitz continuity, semismoothness, and some properties that are usually employed in derivative-free algorithms.
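Proposition 4.1. ([1, Proposition 2.1]) Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, $\phi^p_{\mathrm{S-NR}}$ is an NCP-function and is positive only on the first quadrant $\mathbb{R}^2_{++} := \{(a, b) \mid a > 0, \ b > 0\}$.
Proposition 4.2. ([1, Proposition 2.2]) Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, the following hold.
(a) An alternative expression of $\phi^p_{\mathrm{S-NR}}$ is
\[ \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} \phi^p_{\mathrm{NR}}(a, b) & \text{if } a > b, \\ a^p = b^p & \text{if } a = b, \\ \phi^p_{\mathrm{NR}}(b, a) & \text{if } a < b. \end{cases} \]
(b) The function $\phi^p_{\mathrm{S-NR}}$ is not differentiable. However, $\phi^p_{\mathrm{S-NR}}$ is continuously differentiable on the set $\Omega := \{(a, b) \mid a \ne b\}$ with
\[ \nabla \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, a^{p-1} - (a - b)^{p-1}, \ (a - b)^{p-1} \,]^T & \text{if } a > b, \\ p \, [\, (b - a)^{p-1}, \ b^{p-1} - (b - a)^{p-1} \,]^T & \text{if } a < b. \end{cases} \]
In a more compact form,
\[ \nabla \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, \phi^{p-1}_{\mathrm{NR}}(a, b), \ (a - b)^{p-1} \,]^T & \text{if } a > b, \\ p \, [\, (b - a)^{p-1}, \ \phi^{p-1}_{\mathrm{NR}}(b, a) \,]^T & \text{if } a < b. \end{cases} \]
(c) The function $\phi^p_{\mathrm{S-NR}}$ is twice continuously differentiable on the set $\Omega = \{(a, b) \mid a \ne b\}$ with
\[ \nabla^2 \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p(p - 1) \begin{bmatrix} a^{p-2} - (a - b)^{p-2} & (a - b)^{p-2} \\ (a - b)^{p-2} & -(a - b)^{p-2} \end{bmatrix} & \text{if } a > b, \\[2ex] p(p - 1) \begin{bmatrix} -(b - a)^{p-2} & (b - a)^{p-2} \\ (b - a)^{p-2} & b^{p-2} - (b - a)^{p-2} \end{bmatrix} & \text{if } a < b. \end{cases} \]
In a more compact form,
\[ \nabla^2 \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p(p - 1) \begin{bmatrix} \phi^{p-2}_{\mathrm{NR}}(a, b) & (a - b)^{p-2} \\ (a - b)^{p-2} & -(a - b)^{p-2} \end{bmatrix} & \text{if } a > b, \\[2ex] p(p - 1) \begin{bmatrix} -(b - a)^{p-2} & (b - a)^{p-2} \\ (b - a)^{p-2} & \phi^{p-2}_{\mathrm{NR}}(b, a) \end{bmatrix} & \text{if } a < b. \end{cases} \]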
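Proposition 4.3. Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, $\phi^p_{\mathrm{S-NR}}$ is differentiable at $(0, 0)$ with $\nabla \phi^p_{\mathrm{S-NR}}(0, 0) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$.
Proof. First, we rewrite $\phi^p_{\mathrm{S-NR}}$ in polar coordinates $(a, b) = (r \cos\theta, r \sin\theta)$, i.e.,
\[ \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} a^p - (a - b)^p & \text{if } a \ge b, \\ b^p - (b - a)^p & \text{if } a < b, \end{cases} = \begin{cases} r^p \left[ \cos^p\theta - (\cos\theta - \sin\theta)^p \right] & \text{if } -\frac{3\pi}{4} \le \theta \le \frac{\pi}{4}, \\ r^p \left[ \sin^p\theta - (\sin\theta - \cos\theta)^p \right] & \text{if } \frac{\pi}{4} < \theta < \frac{5\pi}{4}. \end{cases} \]
We note that $|\cos^p\theta - (\cos\theta - \sin\theta)^p|$ and $|\sin^p\theta - (\sin\theta - \cos\theta)^p|$ are bounded by some constant $M_p$ which depends on $p$; hence we have
\[ \frac{|\phi^p_{\mathrm{S-NR}}(a, b) - \phi^p_{\mathrm{S-NR}}(0, 0)|}{\sqrt{a^2 + b^2}} \le \frac{M_p \cdot r^p}{r} = M_p \cdot r^{p-1} \to 0 \quad \text{as } r \to 0. \]
Since $(a, b) \to (0, 0)$ implies $r \to 0$, we conclude that $\nabla \phi^p_{\mathrm{S-NR}}(0, 0) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. □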
Note that [1, Proposition 2.2] states that $\phi^p_{\mathrm{S-NR}}$ is not differentiable on the line $L = \{(a, b) \mid a = b\}$. Here, we show that it is in fact differentiable at $(0, 0)$, so Proposition 4.3 can be viewed as an addendum to [1, Proposition 2.2].
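The estimate in the proof of Proposition 4.3 can be observed numerically: on a circle of radius $r$, the quotient $|\phi^p_{\mathrm{S-NR}}(a, b)| / r$ scales like $r^{p-1}$. The sketch below samples directions for $p = 3$; the helper names, the number of directions, and the crude bound $1 + 2^{p/2}$ used as a stand-in for $M_p$ are assumptions made for illustration.

```python
import math

def phi_snr_p(a, b, p=3):
    """First symmetrization (7)."""
    if a > b:
        return a ** p - (a - b) ** p
    if a < b:
        return b ** p - (b - a) ** p
    return a ** p

def max_ratio(r, p=3, n_dirs=360):
    """Max over sampled directions of |phi(r cos t, r sin t) - phi(0, 0)| / r."""
    best = 0.0
    for k in range(n_dirs):
        theta = 2 * math.pi * k / n_dirs
        value = phi_snr_p(r * math.cos(theta), r * math.sin(theta), p)
        best = max(best, abs(value) / r)
    return best

def crude_bound(p=3):
    """|cos^p - (cos - sin)^p| <= 1 + 2^(p/2); a crude stand-in for M_p."""
    return 1.0 + 2.0 ** (p / 2.0)

for r in (1e-1, 1e-2, 1e-3):
    print(r, max_ratio(r), crude_bound() * r ** 2)
```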
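Proposition 4.4. ([1, Proposition 2.3]) Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, for any $\alpha > 0$, the following variants of $\phi^p_{\mathrm{S-NR}}$ are also NCP-functions.
\[ \begin{aligned} \widetilde{\phi}_1(a, b) &= \phi^p_{\mathrm{S-NR}}(a, b) + \alpha (a)_+ (b)_+, \\ \widetilde{\phi}_2(a, b) &= \phi^p_{\mathrm{S-NR}}(a, b) + \alpha \left( (a)_+ (b)_+ \right)^2, \\ \widetilde{\phi}_3(a, b) &= [\phi^p_{\mathrm{S-NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^4, \\ \widetilde{\phi}_4(a, b) &= [\phi^p_{\mathrm{S-NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^2. \end{aligned} \]
Proposition 4.5. Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, the following hold.
(a) $\phi^p_{\mathrm{S-NR}}(a, b) > 0 \iff a > 0, \ b > 0$.
(b) $\phi^p_{\mathrm{S-NR}}$ is positive homogeneous of degree $p$.
(c) $\phi^p_{\mathrm{S-NR}}$ is not Lipschitz continuous.
(d) $\phi^p_{\mathrm{S-NR}}$ is not $\alpha$-Hölder continuous for any $\alpha \in (0, 1]$.
(e) $\nabla_a \phi^p_{\mathrm{S-NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{S-NR}}(a, b) > 0$ on $\{(a, b) \mid a > b > 0\} \cup \{(a, b) \mid b > a > 0\}$.
(f) $\nabla_a \phi^p_{\mathrm{S-NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{S-NR}}(a, b) = 0$ provided that $\phi^p_{\mathrm{S-NR}}(a, b) = 0$ and $(a, b) \ne (0, 0)$.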
Proof. (a) It is clear from Proposition 4.1 (or [1, Proposition 2.1]).
(b) It follows from the definition of $\phi^p_{\mathrm{S-NR}}$.
(c)-(d) The proof is similar to that of Proposition 3.4(c)-(d).
(e) It is enough to verify the case $a > b > 0$, because for $b > a > 0$ the inequality then holds automatically due to the symmetric surface of $\phi^p_{\mathrm{S-NR}}$. According to Proposition 4.2(b), we have
\[ \nabla_a \phi^p_{\mathrm{S-NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{S-NR}}(a, b) = p^2 \cdot \left[ a^{p-1} - (a - b)^{p-1} \right] (a - b)^{p-1}, \]
which yields the desired result by Proposition 3.4(e).
(f) This result also follows from the proof of Proposition 3.4(e). □
Next, we show the semismoothness of $\phi^p_{\mathrm{S-NR}}$. In fact, every piecewise continuously differentiable function is semismooth. For the sake of completeness, we verify this result step by step from the definition, and in doing so we obtain not only the local Lipschitz constant and the generalized gradient, but also the strong semismoothness.
First, we need to check that $\phi^p_{\mathrm{S-NR}}$ is strictly continuous (locally Lipschitz continuous). Note that $\phi^p_{\mathrm{S-NR}}$ is not globally Lipschitz continuous, as shown in Proposition 4.5(c).
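Lemma 4.1. Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, $\phi^p_{\mathrm{S-NR}}$ is strictly continuous (locally Lipschitz continuous).
Proof. For any point $x = (a, b)$ with $a \ne b$, the continuous differentiability of $\phi^p_{\mathrm{S-NR}}$ implies its local Lipschitz continuity. It remains to show that $\phi^p_{\mathrm{S-NR}}$ is locally Lipschitz continuous on the line $L = \{(a, b) \mid a = b\}$.
To proceed, we present two inequalities that will be used frequently. Given any $x_0 = (a_0, a_0)$ and $\delta > 0$, let $N_\delta(x_0) := \{x \in \mathbb{R}^2 \mid \|x - x_0\| \le \delta\}$. Then, for any $x = (x_1, x_2) \in N_\delta(x_0)$, we have the following two basic inequalities:
\[ |x_i| \le \|x\| \le \|x - x_0\| + \|x_0\| \le \delta + \|x_0\| \quad \forall i = 1, 2. \tag{12} \]
\[ |x_1 - x_2| \le |x_1 - a_0| + |a_0 - x_2| \le \|x - x_0\| + \|x_0 - x\| \le 2\delta. \tag{13} \]
Now, for any $y, z \in N_\delta(x_0)$, we discuss four cases as below.
(i) For $y \in L$ and $z \in L$, we have
\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &= |y_1^p - z_1^p| \\ &= |y_1 - z_1| \cdot |y_1^{p-1} + y_1^{p-2} z_1 + \cdots + z_1^{p-1}| \\ &\le \|y - z\| \cdot \left( |y_1|^{p-1} + |y_1|^{p-2} \cdot |z_1| + \cdots + |z_1|^{p-1} \right) \\ &\le p (\delta + \|x_0\|)^{p-1} \|y - z\| \\ &= \kappa_1 \|y - z\|, \end{aligned} \]
where $\kappa_1 := p (\delta + \|x_0\|)^{p-1}$ and the second inequality holds by (12).
(ii) For $y \notin L$ and $z \in L$ (or $y \in L$ and $z \notin L$), without loss of generality, we assume $y_1 > y_2$. Then, we have
\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &= |y_1^p - (y_1 - y_2)^p - z_1^p| \\ &\le |y_1^p - z_1^p| + (y_1 - y_2)^p \\ &\le \kappa_1 \|y - z\| + (y_1 - y_2)^{p-1} \left( |y_1 - z_1| + |z_1 - z_2| + |z_2 - y_2| \right) \\ &\le \kappa_1 \|y - z\| + (2\delta)^{p-1} \left( \|y - z\| + \|z - y\| \right) \\ &= \kappa_2 \|y - z\|, \end{aligned} \]
where $\kappa_2 := \kappa_1 + 2 (2\delta)^{p-1}$ and the last inequality holds by (13) together with $z_1 = z_2$.
(iii) For $y \notin L$, $z \notin L$ with $y, z$ lying on opposite sides of $L$, i.e., $(y_1 - y_2)(z_1 - z_2) < 0$, without loss of generality, we assume $y_1 > y_2$ and $z_1 < z_2$. Since $y, z$ lie on opposite sides of $L$, the line $L$ and the segment $[y, z] := \{\lambda y + (1 - \lambda) z \mid \lambda \in [0, 1]\}$ must intersect at a point $w \in [y, z] \cap L$. Thus, we have
\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &\le |\phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(w)| + |\phi^p_{\mathrm{S-NR}}(w) - \phi^p_{\mathrm{S-NR}}(z)| \\ &\le \kappa_2 \|y - w\| + \kappa_2 \|w - z\| \\ &\le \kappa_2 \|y - z\| + \kappa_2 \|y - z\| \\ &= \kappa_3 \|y - z\|, \end{aligned} \]
where $\kappa_3 := 2 \kappa_2$ and the third inequality holds because $w \in [y, z]$.
(iv) For $y \notin L$, $z \notin L$ with $y, z$ lying on the same side of $L$, i.e., $(y_1 - y_2)(z_1 - z_2) > 0$, without loss of generality, we assume $y_1 > y_2$ and $z_1 > z_2$. Then, we have
\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &= \left| (y_1^p - (y_1 - y_2)^p) - (z_1^p - (z_1 - z_2)^p) \right| \\ &\le |y_1^p - z_1^p| + |(y_1 - y_2)^p - (z_1 - z_2)^p| \\ &\le \kappa_1 \|y - z\| + 2p (2\delta)^{p-1} \|y - z\| \\ &= \kappa_4 \|y - z\|, \end{aligned} \]
where $\kappa_4 = \kappa_1 + 2p (2\delta)^{p-1}$ and the second part is estimated as follows:
\[ \begin{aligned} |(y_1 - y_2)^p - (z_1 - z_2)^p| &= |(y_1 - y_2) - (z_1 - z_2)| \cdot |(y_1 - y_2)^{p-1} + \cdots + (z_1 - z_2)^{p-1}| \\ &\le \left( |y_1 - z_1| + |y_2 - z_2| \right) \left( |y_1 - y_2|^{p-1} + \cdots + |z_1 - z_2|^{p-1} \right) \\ &\le \left( \|y - z\| + \|y - z\| \right) p (2\delta)^{p-1} \\ &= 2p (2\delta)^{p-1} \|y - z\|. \end{aligned} \]
From all the above, by choosing $\kappa = \max\{\kappa_1, \kappa_2, \kappa_3, \kappa_4\}$, we conclude that
\[ \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| \le \kappa \|y - z\| \quad \text{for any } y, z \in N_\delta(x_0). \]
This means that $\phi^p_{\mathrm{S-NR}}$ is locally Lipschitz continuous at $x_0$, and the proof is complete. □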
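Proposition 4.6. Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, the generalized gradient of $\phi^p_{\mathrm{S-NR}}$ is given by
\[ \partial \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, a^{p-1} - (a - b)^{p-1}, \ (a - b)^{p-1} \,]^T & \text{if } a > b, \\ \left\{ p \, [\, \alpha a^{p-1}, \ (1 - \alpha) a^{p-1} \,]^T \mid \alpha \in [0, 1] \right\} & \text{if } a = b, \\ p \, [\, (b - a)^{p-1}, \ b^{p-1} - (b - a)^{p-1} \,]^T & \text{if } a < b. \end{cases} \]
Proof. We have already seen $\partial \phi^p_{\mathrm{S-NR}}(a, b)$ for $a \ne b$ in [22]. For $a = b$, according to the definition of Clarke's generalized gradient, we claim that
\[ \partial \phi^p_{\mathrm{S-NR}}(a, a) = \mathrm{conv} \left\{ \lim_{(a_i, b_i) \to (a, a)} \nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i) \;\middle|\; \phi^p_{\mathrm{S-NR}} \text{ is differentiable at } (a_i, b_i) \in \mathbb{R}^2 \right\}. \]
To see this, we discuss three cases as below.
(i) If $a_i > b_i$ for all $i \ge n$ with $n$ sufficiently large, then
\[ \lim_{(a_i, b_i) \to (a, a)} \nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i) = \lim_{(a_i, b_i) \to (a, a)} p \begin{bmatrix} a_i^{p-1} - (a_i - b_i)^{p-1} \\ (a_i - b_i)^{p-1} \end{bmatrix} = p \begin{bmatrix} a^{p-1} \\ 0 \end{bmatrix}. \]
(ii) If $a_i < b_i$ for all $i \ge n$ with $n$ sufficiently large, then
\[ \lim_{(a_i, b_i) \to (a, a)} \nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i) = \lim_{(a_i, b_i) \to (a, a)} p \begin{bmatrix} (b_i - a_i)^{p-1} \\ b_i^{p-1} - (b_i - a_i)^{p-1} \end{bmatrix} = p \begin{bmatrix} 0 \\ a^{p-1} \end{bmatrix}. \]
(iii) For the remaining case, $\nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i)$ has no limit as $(a_i, b_i) \to (a, a)$.
From all the above, we conclude that
\[ \partial \phi^p_{\mathrm{S-NR}}(a, a) = \mathrm{conv} \left\{ p \begin{bmatrix} a^{p-1} \\ 0 \end{bmatrix}, \ p \begin{bmatrix} 0 \\ a^{p-1} \end{bmatrix} \right\} = \left\{ p \begin{bmatrix} \alpha a^{p-1} \\ (1 - \alpha) a^{p-1} \end{bmatrix} \;\middle|\; \alpha \in [0, 1] \right\}. \]
Thus, the desired result follows. □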
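The two limiting gradients in cases (i) and (ii) can be observed by evaluating the gradient formula of Proposition 4.2(b) at points slightly off the diagonal. This is a minimal sketch; the helper names, the base point $a = 2$, and the offset `eps` are illustrative assumptions.

```python
def grad_phi_snr_p(a, b, p=3):
    """Gradient from Proposition 4.2(b); only used off the diagonal a = b."""
    if a > b:
        return (p * (a ** (p - 1) - (a - b) ** (p - 1)), p * (a - b) ** (p - 1))
    if a < b:
        return (p * (b - a) ** (p - 1), p * (b ** (p - 1) - (b - a) ** (p - 1)))
    raise ValueError("formula of Proposition 4.2(b) requires a != b")

def one_sided_gradients(a, p=3, eps=1e-8):
    """Gradients just below (a_i > b_i) and just above (a_i < b_i) the diagonal."""
    return grad_phi_snr_p(a + eps, a, p), grad_phi_snr_p(a, a + eps, p)

def hull_element(a, alpha, p=3):
    """Element p * [alpha a^(p-1), (1 - alpha) a^(p-1)] of the generalized gradient."""
    return (p * alpha * a ** (p - 1), p * (1 - alpha) * a ** (p - 1))

below, above = one_sided_gradients(2.0)
print(below, above, hull_element(2.0, 0.5))
```

For $a \ne 0$ the two one-sided limits differ, which is why the generalized gradient on the diagonal is a segment rather than a single vector; at $a = 0$ both limits collapse to the zero vector, consistent with Proposition 4.3.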
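Lemma 4.2. Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, $\phi^p_{\mathrm{S-NR}}$ is a directionally differentiable function.
Proof. For any point $x = (a, b)$ with $a \ne b$, the continuous differentiability of $\phi^p_{\mathrm{S-NR}}$ implies the directional differentiability. Thus, it remains to show that $\phi^p_{\mathrm{S-NR}}$ is directionally differentiable on the line $L = \{(a, b) \mid a = b\}$.
To proceed, given any $x = (a, a)$, $h = (h_1, h_2)$ and $t > 0$, we discuss three cases as below.
(i) If $h_1 = h_2$, then
\[ \begin{aligned} \lim_{t \to 0^+} \frac{\phi^p_{\mathrm{S-NR}}(x + th) - \phi^p_{\mathrm{S-NR}}(x)}{t} &= \lim_{t \to 0^+} \frac{(a + t h_1)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \frac{a^p + p a^{p-1} t h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^k h_1^k - a^p}{t} \\ &= \lim_{t \to 0^+} \left( p a^{p-1} h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^{k-1} h_1^k \right) \\ &= p a^{p-1} h_1. \end{aligned} \]
(ii) If $h_1 > h_2$, then
\[ \begin{aligned} \lim_{t \to 0^+} \frac{\phi^p_{\mathrm{S-NR}}(x + th) - \phi^p_{\mathrm{S-NR}}(x)}{t} &= \lim_{t \to 0^+} \frac{(a + t h_1)^p - (t h_1 - t h_2)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \frac{a^p + p a^{p-1} t h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^k h_1^k - t^p (h_1 - h_2)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \left( p a^{p-1} h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^{k-1} h_1^k - t^{p-1} (h_1 - h_2)^p \right) \\ &= p a^{p-1} h_1. \end{aligned} \]
(iii) If $h_1 < h_2$, then
\[ \begin{aligned} \lim_{t \to 0^+} \frac{\phi^p_{\mathrm{S-NR}}(x + th) - \phi^p_{\mathrm{S-NR}}(x)}{t} &= \lim_{t \to 0^+} \frac{(a + t h_2)^p - (t h_2 - t h_1)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \frac{a^p + p a^{p-1} t h_2 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^k h_2^k - t^p (h_2 - h_1)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \left( p a^{p-1} h_2 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^{k-1} h_2^k - t^{p-1} (h_2 - h_1)^p \right) \\ &= p a^{p-1} h_2. \end{aligned} \]
To sum up, the definition of directional differentiability is verified in every case, and the proof is complete. □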
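The three cases of the proof say that, at a diagonal point $x = (a, a)$, the directional derivative equals $p a^{p-1} h_1$ when $h_1 \ge h_2$ and $p a^{p-1} h_2$ when $h_1 < h_2$, i.e., $p a^{p-1} \max\{h_1, h_2\}$. The following minimal sketch compares this value with a one-sided difference quotient; the names and the step `t` are assumptions for illustration.

```python
def phi_snr_p(a, b, p=3):
    """First symmetrization (7)."""
    if a > b:
        return a ** p - (a - b) ** p
    if a < b:
        return b ** p - (b - a) ** p
    return a ** p

def dir_quotient(a, h, t=1e-6, p=3):
    """One-sided difference quotient at x = (a, a) in direction h = (h1, h2)."""
    x_plus = (a + t * h[0], a + t * h[1])
    return (phi_snr_p(x_plus[0], x_plus[1], p) - phi_snr_p(a, a, p)) / t

def dir_derivative(a, h, p=3):
    """Closed form from Lemma 4.2: p a^(p-1) h1 if h1 >= h2, else p a^(p-1) h2."""
    return p * a ** (p - 1) * (h[0] if h[0] >= h[1] else h[1])

directions = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (0.5, -2.0)]
for h in directions:
    print(h, dir_quotient(2.0, h), dir_derivative(2.0, h))
```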
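Proposition 4.7. Let $\phi^p_{\mathrm{S-NR}}$ be defined as in (7) with $p > 1$ being a positive odd integer. Then, $\phi^p_{\mathrm{S-NR}}$ is a semismooth function. Moreover, $\phi^p_{\mathrm{S-NR}}$ is strongly semismooth.
Proof. We shall directly show that $\phi^p_{\mathrm{S-NR}}$ is strongly semismooth. Note that $\phi^p_{\mathrm{S-NR}}$ is twice continuously differentiable at any point $x = (a, b)$ with $a \ne b$, which implies the strong semismoothness of $\phi^p_{\mathrm{S-NR}}$ at $x$. It remains to show that $\phi^p_{\mathrm{S-NR}}$ is strongly semismooth on the line $L = \{(a, b) \mid a = b\}$.
For any $x = (a, a)$, $h = (h_1, h_2)$, $V \in \partial \phi^p_{\mathrm{S-NR}}(x + h)$ and $h \to 0$, we have the following inequality whenever $\|h\| \le 1$:
\[ \|h\|^p \le \|h\|^2 \quad \text{for any } p \ge 2. \]
To prove the strong semismoothness of $\phi^p_{\mathrm{S-NR}}$, we apply this inequality and verify (11) by discussing three cases as below.
(i) If $h_1 = h_2$, then for any $\alpha \in [0, 1]$
\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - V h \right| &= \left| (a + h_1)^p - a^p - p \, [\, \alpha a^{p-1}, \ (1 - \alpha) a^{p-1} \,] \begin{bmatrix} h_1 \\ h_1 \end{bmatrix} \right| \\ &= \left| a^p + p a^{p-1} h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} h_1^k - a^p - p a^{p-1} h_1 \right| \\ &\le M_1 \left( |h_1|^2 + \cdots + |h_1|^p \right) \\ &\le M_1 \left( \|h\|^2 + \cdots + \|h\|^p \right) \\ &\le (p - 1) M_1 \|h\|^2, \end{aligned} \]
where $M_1 = \max\left\{ \binom{p}{k} |a|^{p-k} \mid k = 2, 3, \cdots, p \right\}$ and the last inequality holds when $\|h\| \le 1$.
(ii) If $h_1 > h_2$, then
\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - V h \right| &= \left| (a + h_1)^p - (h_1 - h_2)^p - a^p - p \, [\, (a + h_1)^{p-1} - (h_1 - h_2)^{p-1}, \ (h_1 - h_2)^{p-1} \,] \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right| \\ &= \left| (a + h_1)^p - (h_1 - h_2)^p - a^p - p (a + h_1)^{p-1} h_1 + p (h_1 - h_2)^p \right| \\ &= \left| (a + h_1)^p - a^p - p \left( a^{p-1} + \sum_{k=1}^{p-1} \binom{p-1}{k} a^{p-1-k} h_1^k \right) h_1 + (p - 1)(h_1 - h_2)^p \right| \\ &\le \underbrace{\left| (a + h_1)^p - a^p - p a^{p-1} h_1 \right|}_{\Xi_1} + p \underbrace{\left| \sum_{k=1}^{p-1} \binom{p-1}{k} a^{p-1-k} h_1^{k+1} \right|}_{\Xi_2} + (p - 1) \underbrace{\left| (h_1 - h_2)^p \right|}_{\Xi_3}. \end{aligned} \]
As $h \to 0$, we have the following estimates for each $\Xi_i$.
• $\Xi_1 \le (p - 1) M_1 \|h\|^2$ by case (i).
• $\Xi_2 \le \sum_{k=1}^{p-1} \binom{p-1}{k} |a|^{p-1-k} |h_1|^{k+1} \le M_2 \left( |h_1|^2 + \cdots + |h_1|^p \right) \le (p - 1) M_2 \|h\|^2$, where $M_2 = \max\left\{ \binom{p-1}{k} |a|^{p-1-k} \mid k = 1, 2, \cdots, p - 1 \right\}$.
• $\Xi_3 \le \sum_{k=0}^{p} \binom{p}{k} |h_1|^{p-k} |h_2|^k \le M_3 \left( \|h\|^p + \cdots + \|h\|^p \right) \le (p + 1) M_3 \|h\|^2$, where $M_3 = \max\left\{ \binom{p}{k} \mid k = 0, 1, \cdots, p \right\}$.
Hence, we conclude that
\[ \left| \phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - V h \right| \le M \|h\|^2, \]
where $M = (p - 1) M_1 + p(p - 1) M_2 + (p - 1)(p + 1) M_3$.
(iii) If $h_1 < h_2$, the argument is similar to that of case (ii).
All the above, together with Lemmas 4.1-4.2, proves that $\phi^p_{\mathrm{S-NR}}$ is strongly semismooth. □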
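The quadratic bound (11) with $\rho = 1$ can be illustrated at a diagonal point by computing the residual $|\phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - V h|$ for shrinking $h$ and dividing by $\|h\|^2$. In the sketch below, $V$ is taken as the gradient at $x + h$, which requires $x + h$ to lie off the diagonal; the names, the base point, and the directions are assumptions for illustration.

```python
def phi_snr_p(a, b, p=3):
    """First symmetrization (7)."""
    if a > b:
        return a ** p - (a - b) ** p
    if a < b:
        return b ** p - (b - a) ** p
    return a ** p

def grad_phi_snr_p(a, b, p=3):
    """Gradient from Proposition 4.2(b), valid for a != b."""
    if a > b:
        return (p * (a ** (p - 1) - (a - b) ** (p - 1)), p * (a - b) ** (p - 1))
    if a < b:
        return (p * (b - a) ** (p - 1), p * (b ** (p - 1) - (b - a) ** (p - 1)))
    raise ValueError("gradient formula requires a != b")

def residual_ratio(a, h, p=3):
    """|phi(x + h) - phi(x) - V h| / ||h||^2 at x = (a, a) with V = grad phi(x + h)."""
    x1, x2 = a + h[0], a + h[1]
    V = grad_phi_snr_p(x1, x2, p)
    residual = phi_snr_p(x1, x2, p) - phi_snr_p(a, a, p) - (V[0] * h[0] + V[1] * h[1])
    return abs(residual) / (h[0] ** 2 + h[1] ** 2)

for s in (1e-1, 1e-2, 1e-3):
    print(s, residual_ratio(2.0, (s, -0.5 * s)), residual_ratio(2.0, (-0.5 * s, s)))
```

If the ratio stays bounded as the step shrinks, the residual is $O(\|h\|^2)$ along the sampled directions, which is what strong semismoothness requires.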
5 The function $\psi^p_{\mathrm{S-NR}}$
In this section, we focus on the function $\psi^p_{\mathrm{S-NR}}$ defined as in (8). As mentioned earlier, it is the second symmetrization of $\phi^p_{\mathrm{NR}}$. Moreover, it is differentiable and possesses a symmetric surface, as shown in [1]. Here we further study its Lipschitz continuity and some properties that are usually employed in derivative-free algorithms.
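Proposition 5.1. ([1, Proposition 3.1]) Let $\psi^p_{\mathrm{S-NR}}$ be defined as in (8) with $p > 1$ being a positive odd integer. Then, $\psi^p_{\mathrm{S-NR}}$ is an NCP-function and is positive on the set $\Omega' = \{(a, b) \mid ab \ne 0\} \cup \{(a, b) \mid a < b = 0\} \cup \{(a, b) \mid 0 = a > b\}$.
Proposition 5.2. ([1, Proposition 3.2]) Let $\psi^p_{\mathrm{S-NR}}$ be defined as in (8) with $p > 1$ being a positive odd integer. Then, the following hold.
(a) An alternative expression of $\psi^p_{\mathrm{S-NR}}$ is
\[ \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} \phi^p_{\mathrm{NR}}(a, b) \, b^p & \text{if } a > b, \\ a^p b^p = a^{2p} & \text{if } a = b, \\ \phi^p_{\mathrm{NR}}(b, a) \, a^p & \text{if } a < b. \end{cases} \]
(b) The function $\psi^p_{\mathrm{S-NR}}$ is continuously differentiable with
\[ \nabla \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, a^{p-1} b^p - (a - b)^{p-1} b^p, \ a^p b^{p-1} - (a - b)^p b^{p-1} + (a - b)^{p-1} b^p \,]^T & \text{if } a > b, \\ p \, [\, a^{p-1} b^p, \ a^p b^{p-1} \,]^T = p a^{2p-1} [\, 1, \ 1 \,]^T & \text{if } a = b, \\ p \, [\, a^{p-1} b^p - (b - a)^p a^{p-1} + (b - a)^{p-1} a^p, \ a^p b^{p-1} - (b - a)^{p-1} a^p \,]^T & \text{if } a < b. \end{cases} \]
In a more compact form,
\[ \nabla \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, \phi^{p-1}_{\mathrm{NR}}(a, b) \, b^p, \ \phi^p_{\mathrm{NR}}(a, b) \, b^{p-1} + (a - b)^{p-1} b^p \,]^T & \text{if } a > b, \\ p \, [\, a^{2p-1}, \ a^{2p-1} \,]^T & \text{if } a = b, \\ p \, [\, \phi^p_{\mathrm{NR}}(b, a) \, a^{p-1} + (b - a)^{p-1} a^p, \ \phi^{p-1}_{\mathrm{NR}}(b, a) \, a^p \,]^T & \text{if } a < b. \end{cases} \]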
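(c) The function $\psi^p_{\mathrm{S-NR}}$ is twice continuously differentiable with
\[ \nabla^2 \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \begin{bmatrix} (p - 1)[a^{p-2} - (a - b)^{p-2}] b^p & (p - 1)(a - b)^{p-2} b^p + p[a^{p-1} - (a - b)^{p-1}] b^{p-1} \\ (p - 1)(a - b)^{p-2} b^p + p[a^{p-1} - (a - b)^{p-1}] b^{p-1} & (p - 1)[a^p - (a - b)^p] b^{p-2} + 2p(a - b)^{p-1} b^{p-1} - (p - 1)(a - b)^{p-2} b^p \end{bmatrix} & \text{if } a > b, \\[3ex] p \begin{bmatrix} (p - 1) a^{p-2} b^p & p a^{p-1} b^{p-1} \\ p a^{p-1} b^{p-1} & (p - 1) a^p b^{p-2} \end{bmatrix} & \text{if } a = b, \\[3ex] p \begin{bmatrix} (p - 1)[b^p - (b - a)^p] a^{p-2} + 2p(b - a)^{p-1} a^{p-1} - (p - 1)(b - a)^{p-2} a^p & (p - 1)(b - a)^{p-2} a^p + p[b^{p-1} - (b - a)^{p-1}] a^{p-1} \\ (p - 1)(b - a)^{p-2} a^p + p[b^{p-1} - (b - a)^{p-1}] a^{p-1} & (p - 1)[b^{p-2} - (b - a)^{p-2}] a^p \end{bmatrix} & \text{if } a < b. \end{cases} \]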
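The piecewise gradient in Proposition 5.2(b) can be compared with central finite differences, including at points on the diagonal where the branches meet. This is a minimal sketch with hypothetical helper names; the step size and test points are arbitrary, and the comparison is numerical evidence rather than a proof.

```python
def psi_snr_p(a, b, p=3):
    """Second symmetrization (8)."""
    if a > b:
        return a ** p * b ** p - (a - b) ** p * b ** p
    if a < b:
        return a ** p * b ** p - (b - a) ** p * a ** p
    return a ** (2 * p)

def grad_psi_snr_p(a, b, p=3):
    """Gradient formula of Proposition 5.2(b)."""
    if a > b:
        d = a - b
        return (p * (a ** (p - 1) * b ** p - d ** (p - 1) * b ** p),
                p * (a ** p * b ** (p - 1) - d ** p * b ** (p - 1) + d ** (p - 1) * b ** p))
    if a < b:
        d = b - a
        return (p * (a ** (p - 1) * b ** p - d ** p * a ** (p - 1) + d ** (p - 1) * a ** p),
                p * (a ** p * b ** (p - 1) - d ** (p - 1) * a ** p))
    return (p * a ** (2 * p - 1), p * a ** (2 * p - 1))

def fd_grad(a, b, p=3, h=1e-6):
    """Central finite-difference approximation of the gradient of psi_snr_p."""
    return ((psi_snr_p(a + h, b, p) - psi_snr_p(a - h, b, p)) / (2 * h),
            (psi_snr_p(a, b + h, p) - psi_snr_p(a, b - h, p)) / (2 * h))

def max_gradient_error(points, p=3):
    """Largest componentwise gap between the formula and finite differences."""
    err = 0.0
    for a, b in points:
        exact, approx = grad_psi_snr_p(a, b, p), fd_grad(a, b, p)
        err = max(err, abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))
    return err

points = [(1.5, 0.5), (0.5, 1.5), (1.0, 1.0), (-1.0, -2.0), (0.0, 0.0)]
print(max_gradient_error(points))
```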