
to appear in Journal of Nonlinear and Convex Analysis, 2017

On four discrete-type families of NCP-functions

Chien-Hao Huang¹, Kang-Jun Weng ², Jein-Shan Chen ³ , Hsun-Wei Chu ⁴, Ming-Yen Li ⁵

Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

September 1, 2016 (revised on February 1, 2017)

Abstract. In this paper, we look into the detailed properties of four discrete-type families of NCP-functions, which are newly discovered in recent literature. With the discrete-oriented feature, we are motivated to know what differences there are compared to the traditional NCP-functions. The properties obtained in this paper not only explain the difference but also provide background bricks for designing solution methods based on such discrete-type families of NCP-functions.

Keywords. NCP-function; Complementarity; Semismooth.

1 Introduction

The nonlinear complementarity problem (NCP) [18, 27] is to find a point x ∈ Rⁿ such that

\[ x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \]

where ⟨·, ·⟩ is the Euclidean inner product and F = (F₁, ..., F_n)^T maps from Rⁿ to Rⁿ. The NCP has attracted much attention due to its various applications in operations research, economics, and engineering, see [13, 18, 27] and references therein. There have

1E-mail: qqnick0719@ntnu.edu.tw

2E-mail: kanewang316@gmail.com

3Corresponding author. The author’s work is supported by Ministry of Science and Technology, Taiwan.

4E-mail: 80040003s@ntnu.edu.tw

5E-mail: leemy801026@gmail.com


been many methods proposed for solving the NCP. Among them, one of the most popular and powerful approaches, which has been studied intensively in recent years, is to reformulate the NCP as a system of nonlinear equations [23] or as an unconstrained minimization problem [12, 14, 19]. A function that yields an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP. In constructing a merit function, the class of so-called NCP-functions plays an important role.

A function φ : R² → R is called an NCP-function if it satisfies

φ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0. (1)

Many NCP-functions and merit functions have been explored and proposed in the literature, see [16] for a survey. Among them, the Fischer-Burmeister (FB) function and the Natural-Residual (NR) function are two effective NCP-functions. The FB function φ_FB : R² → R is defined by

\[ \phi_{\mathrm{FB}}(a, b) = \sqrt{a^2 + b^2} - (a + b), \tag{2} \]

and the NR function φ_NR : R² → R is defined by

\[ \phi_{\mathrm{NR}}(a, b) = a - (a - b)_+ = \min\{a, b\}, \tag{3} \]

where (t)₊ means max{0, t} for any t ∈ R.
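The defining equivalence (1) can be checked numerically for the two classical functions (2) and (3). The following is a minimal sketch, not part of the original paper: the helper names `phi_fb`, `phi_nr`, `is_complementary` and the small test grid are assumptions made only for illustration.

```python
import math

def phi_fb(a, b):
    """Fischer-Burmeister function, equation (2)."""
    return math.hypot(a, b) - (a + b)

def phi_nr(a, b):
    """Natural-residual function, equation (3): a - (a - b)_+ = min{a, b}."""
    return a - max(0.0, a - b)

def is_complementary(a, b):
    """Right-hand side of (1): a >= 0, b >= 0, ab = 0."""
    return a >= 0 and b >= 0 and a * b == 0

# Illustrative grid (an assumption); collect points where "phi = 0" and
# the complementarity condition disagree.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
mismatches = [
    (phi.__name__, a, b)
    for a in grid
    for b in grid
    for phi in (phi_fb, phi_nr)
    if (abs(phi(a, b)) < 1e-12) != is_complementary(a, b)
]
print(mismatches)
```

On this grid the list of mismatches is expected to be empty, which is consistent with both functions being NCP-functions.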

Recently, the generalized Fischer-Burmeister function φ^p_FB, which includes the Fischer-Burmeister function as a special case, was considered in [2, 3, 4, 8, 33]. Indeed, the function φ^p_FB is a natural extension of φ_FB in which the 2-norm is replaced by a general p-norm. In other words, φ^p_FB : R² → R is defined as

\[ \phi^p_{\mathrm{FB}}(a, b) = \|(a, b)\|_p - (a + b), \tag{4} \]

where p > 1 and ‖(a, b)‖_p = (|a|^p + |b|^p)^{1/p}. A detailed geometric view of φ^p_FB is depicted in [33]. Corresponding to φ^p_FB, there is a merit function ψ^p_FB : R² → R₊ given by

\[ \psi^p_{\mathrm{FB}}(a, b) = \frac{1}{2} \left| \phi^p_{\mathrm{FB}}(a, b) \right|^2. \tag{5} \]

For any given p > 1, the function ψ^p_FB is a nonnegative NCP-function and is smooth on R². Note that φ^p_FB is a natural “continuous” type of generalization of the FB function φ_FB. In contrast, what does a “generalized natural-residual function” look like? In [7], Chen et al. give an answer to this long-standing open question. More specifically, the generalized natural-residual function, denoted by φ^p_NR, is defined by

\[ \phi^p_{\mathrm{NR}}(a, b) = a^p - (a - b)_+^p \tag{6} \]


with p > 1 being a positive odd integer. As remarked in [7], the main idea to create it relies on “discrete generalization”, not the “continuous generalization”. Note that when p = 1, φ^p_NR is reduced to the natural residual function φ_NR.

Unlike the surface of φ^p_FB, the surface of φ^p_NR is not symmetric, which may cause some difficulties in further analysis and in designing solution methods. To this end, Chang et al. [1] try to symmetrize the function φ^p_NR. The first-type symmetrization of φ^p_NR, denoted by φ^p_S−NR, is proposed as

\[ \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} a^p - (a - b)^p & \text{if } a > b, \\ a^p = b^p & \text{if } a = b, \\ b^p - (b - a)^p & \text{if } a < b, \end{cases} \tag{7} \]

where p > 1 is a positive odd integer. It is shown in [1] that φ^p_S−NR is an NCP-function with a symmetric surface, but it is not differentiable. Therefore, it is natural to ask whether there exists another symmetrization that not only has a symmetric surface but is also differentiable. Fortunately, Chang et al. [1] also figure out the second symmetrization of φ^p_NR, denoted by ψ^p_S−NR, which is proposed as

\[ \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} a^p b^p - (a - b)^p b^p & \text{if } a > b, \\ a^p b^p = a^{2p} & \text{if } a = b, \\ a^p b^p - (b - a)^p a^p & \text{if } a < b, \end{cases} \tag{8} \]

where p > 1 is a positive odd integer. As expected, the function ψ^p_S−NR is not only differentiable but also possesses a symmetric surface. To sum up, there exist three discrete-type families of NCP-functions, φ^p_NR, φ^p_S−NR, and ψ^p_S−NR, which are based on the NR function φ_NR.

Next, we elaborate more about the above three new NCP-functions.

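(i) For p being an even integer, none of the above is an NCP-function. A counterexample with p = 2 at the point (−1, −2), which violates the right-hand side of (1), is given below.

\[ \phi^2_{\mathrm{NR}}(-1, -2) = (-1)^2 - (-1 + 2)_+^2 = 0, \]

\[ \phi^2_{\mathrm{S-NR}}(-1, -2) = (-1)^2 - (-1 + 2)^2 = 0, \]

\[ \psi^2_{\mathrm{S-NR}}(-1, -2) = (-1)^2 (-2)^2 - (-1 + 2)^2 (-2)^2 = 0. \]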

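(ii) The above three functions are neither convex nor concave. To see this, take p = 3 and consider the following inequalities.

\[ 1 = \phi^3_{\mathrm{NR}}(1, 1) < \tfrac{1}{2} \phi^3_{\mathrm{NR}}(0, 1) + \tfrac{1}{2} \phi^3_{\mathrm{NR}}(2, 1) = \tfrac{0}{2} + \tfrac{7}{2} = \tfrac{7}{2}, \]

\[ 1 = \phi^3_{\mathrm{NR}}(1, 1) > \tfrac{1}{2} \phi^3_{\mathrm{NR}}(1, -1) + \tfrac{1}{2} \phi^3_{\mathrm{NR}}(1, 3) = -\tfrac{7}{2} + \tfrac{1}{2} = -3, \]

\[ 1 = \phi^3_{\mathrm{S-NR}}(1, 1) < \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(0, 0) + \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(2, 2) = \tfrac{0}{2} + \tfrac{8}{2} = 4, \]

\[ 1 = \phi^3_{\mathrm{S-NR}}(1, 1) > \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(2, 0) + \tfrac{1}{2} \phi^3_{\mathrm{S-NR}}(0, 2) = \tfrac{0}{2} + \tfrac{0}{2} = 0, \]

\[ 1 = \psi^3_{\mathrm{S-NR}}(1, 1) < \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(0, 0) + \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(2, 2) = \tfrac{0}{2} + \tfrac{64}{2} = 32, \]

\[ 1 = \psi^3_{\mathrm{S-NR}}(1, 1) > \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(2, 0) + \tfrac{1}{2} \psi^3_{\mathrm{S-NR}}(0, 2) = \tfrac{0}{2} + \tfrac{0}{2} = 0. \]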
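The counterexamples in remarks (i) and (ii) are easy to reproduce. The sketch below is an illustration only: the function names and the helper `midpoint_gap` are hypothetical, and the three families are coded directly from definitions (6), (7) and (8).

```python
def plus(t):
    """(t)_+ = max{0, t}."""
    return max(0.0, t)

def phi_nr(a, b, p):
    """Generalized natural-residual function, equation (6)."""
    return a**p - plus(a - b)**p

def phi_s_nr(a, b, p):
    """First symmetrization, equation (7)."""
    if a > b:
        return a**p - (a - b)**p
    if a < b:
        return b**p - (b - a)**p
    return a**p

def psi_s_nr(a, b, p):
    """Second symmetrization, equation (8)."""
    if a > b:
        return a**p * b**p - (a - b)**p * b**p
    if a < b:
        return a**p * b**p - (b - a)**p * a**p
    return a**(2 * p)

# Remark (i): with even p = 2 all three vanish at (-1, -2),
# although (-1, -2) does not satisfy the right-hand side of (1).
even_values = [f(-1.0, -2.0, 2) for f in (phi_nr, phi_s_nr, psi_s_nr)]

def midpoint_gap(f, u, v, p=3):
    """Average of f at u and v minus f at their midpoint.
    A positive gap rules out concavity, a negative gap rules out convexity."""
    mid = ((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)
    return 0.5 * f(u[0], u[1], p) + 0.5 * f(v[0], v[1], p) - f(mid[0], mid[1], p)

# Remark (ii): each pair has midpoint (1, 1), where all three functions equal 1.
gaps = {
    "phi_nr": (midpoint_gap(phi_nr, (0.0, 1.0), (2.0, 1.0)),
               midpoint_gap(phi_nr, (1.0, -1.0), (1.0, 3.0))),
    "phi_s_nr": (midpoint_gap(phi_s_nr, (0.0, 0.0), (2.0, 2.0)),
                 midpoint_gap(phi_s_nr, (2.0, 0.0), (0.0, 2.0))),
    "psi_s_nr": (midpoint_gap(psi_s_nr, (0.0, 0.0), (2.0, 2.0)),
                 midpoint_gap(psi_s_nr, (2.0, 0.0), (0.0, 2.0))),
}
print(even_values, gaps)
```

Each function produces one positive and one negative gap, matching the pairs of inequalities displayed in remark (ii).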

The idea of “discrete generalization” looks simple, but it is novel and important. In fact, the authors also apply this idea to construct more NCP-functions. For example, the authors apply it to the Fischer-Burmeister function to obtain φ^p_D−FB : R² → R given by

\[ \phi^p_{\mathrm{D-FB}}(a, b) = \left( \sqrt{a^2 + b^2} \right)^p - (a + b)^p, \tag{9} \]

where p > 1 is a positive odd integer. This function is proved to be an NCP-function in [22]. In addition, it can also serve as a complementarity function for the second-order cone complementarity problem (SOCCP) [22].

The aforementioned four discrete-type families of NCP-functions are newly discovered. Unlike the existing NCP-functions, they are discrete-oriented in some sense. However, what other differences are there compared to the traditional continuous-type families of NCP-functions? This is the main motivation of this paper. Even though these functions enjoy differentiability, we point out that the Newton method may not be applied directly because the Jacobian at a degenerate solution to the NCP may be singular (see [19, 20]). Nonetheless, differentiability may allow other methods that rely on it (like quasi-Newton methods and neural network methods) to be employed directly for solving the NCP. In this paper, we look into the detailed properties of these four discrete-type families of NCP-functions. The properties investigated in this paper not only explain the difference but also provide background bricks for designing solution methods based on such discrete-type families of NCP-functions.

The paper is organized as follows. In Section 2, we review some background definitions including local Lipschitz continuity and semismoothness, together with the known results about φ^p_FB and ψ^p_FB and their related properties. In Sections 3-6, we discuss the properties of φ^p_NR, φ^p_S−NR, ψ^p_S−NR, and φ^p_D−FB, respectively. In particular, we also discuss the semismoothness of φ^p_S−NR in Section 4.


2 Preliminaries

In this section, we recall some background concepts and materials which will play an important role in the subsequent analysis. We begin with the so-called semismooth functions. Semismooth function, as introduced by Mifflin [24] for functionals and further extended by Qi and Sun [30] for vector-valued functions, is of particular interest due to the central role it plays in the superlinear convergence analysis of certain generalized Newton methods, see [30]. First, we say that F : Rⁿ → R^m is strictly continuous (also called locally Lipschitz continuous) at x ∈ Rⁿ [31, Chap. 9] if there exist scalars κ > 0 and δ > 0 such that

\[ \|F(y) - F(z)\| \le \kappa \|y - z\| \quad \forall y, z \in \mathbb{R}^n \text{ with } \|y - x\| \le \delta \text{ and } \|z - x\| \le \delta. \]

The mapping F is locally Lipschitz continuous if F is locally Lipschitz continuous at every x ∈ Rⁿ. If δ can be taken to be ∞, then F is Lipschitz continuous with Lipschitz constant κ. We say F is directionally differentiable at x ∈ Rⁿ if

\[ F'(x; h) := \lim_{t \to 0^+} \frac{F(x + th) - F(x)}{t} \quad \text{exists } \forall h \in \mathbb{R}^n; \]

and F is directionally differentiable if F is directionally differentiable at every x ∈ Rⁿ. If F is locally Lipschitz continuous, then F is almost everywhere differentiable by Rademacher's Theorem, see [31, Section 9J]. In this case, the generalized Jacobian ∂F(x) of F at x (in the Clarke sense) can be defined as the convex hull of the B-subdifferential ∂_B F(x), where

\[ \partial_B F(x) := \left\{ \lim_{x^j \to x} \nabla F(x^j) \;\middle|\; F \text{ is differentiable at } x^j \in \mathbb{R}^n \right\}. \]

Assume F is locally Lipschitz continuous. We say F is semismooth at x ∈ Rⁿ if F is directionally differentiable at x ∈ Rⁿ and, for any V ∈ ∂F (x + h) and h → 0, we have

\[ F(x + h) - F(x) - Vh = o(\|h\|). \tag{10} \]

Moreover, F is called ρ-order semismooth at x ∈ Rⁿ (0 < ρ < ∞) if F is semismooth at x ∈ Rⁿ and, for any V ∈ ∂F (x + h) and h → 0, we have

\[ F(x + h) - F(x) - Vh = O(\|h\|^{1+\rho}). \tag{11} \]

The mapping F is semismooth (respectively, ρ-order semismooth) if F is semismooth (respectively, ρ-order semismooth) at every x ∈ Rⁿ. We say F is strongly semismooth if it is 1-order semismooth. Convex functions and piecewise continuously differentiable functions are examples of semismooth functions. The composition of two (respectively, ρ-order) semismooth functions is also a (respectively, ρ-order) semismooth function. The property of semismoothness plays an important role in nonsmooth Newton methods [29, 30] as well as in some smoothing methods.

An important concept related to semismooth function is the SC¹ function, which is introduced as below.

Definition 2.1. A function F : Rⁿ→ R^m is said to be an SC¹ function if F is continuously differentiable and its gradient is semismooth.

We can view SC¹ functions as functions lying between C¹ and C² functions. By defining SC¹ functions, many results regarding the minimization of C² functions can be extended to the minimization of SC¹ functions, see [28] and references therein. In addition to SC¹ functions, we also introduce LC¹ functions here.

Definition 2.2. A function F : Rⁿ → R^m is called an LC¹ function if F is continuously differentiable and its gradient is locally Lipschitz continuous.

In light of the above definitions, given any F : Rⁿ→ R^m, we have the following relations.

strongly semismooth

⇑ ⇓

C² ⇒ SC¹ ⇒ LC¹ ⇒ C¹ ⇒ semismooth ⇒ locally Lipschitz

⇑ convex

To close this section, we present some well-known properties of φ^p_FB and ψ^p_FB, defined as in (4) and (5), respectively, that are important for designing a derivative-free descent algorithm.

Property 2.1. ([8, Proposition 3.1]) Let φ^p_FB be defined as in (4). Then, the following hold.

(a) φ^p_FB is an NCP-function, i.e., it satisfies (1).

(b) φ^p_FB is sub-additive, i.e., φ^p_FB(w + w′) ≤ φ^p_FB(w) + φ^p_FB(w′) for all w, w′ ∈ R².

(c) φ^p_FB is positive homogeneous, i.e., φ^p_FB(αw) = αφ^p_FB(w) for all w ∈ R² and α ≥ 0.

(d) φ^p_FB is convex, i.e., φ^p_FB(αw + (1 − α)w′) ≤ αφ^p_FB(w) + (1 − α)φ^p_FB(w′) for all w, w′ ∈ R² and α ∈ [0, 1].

(e) φ^p_FB is Lipschitz continuous with κ₁ = √2 + 2^(1/p−1/2) when 1 < p < 2, and with κ₂ = 1 + √2 when p ≥ 2. In other words, |φ^p_FB(w) − φ^p_FB(w′)| ≤ κ₁‖w − w′‖ when 1 < p < 2 and |φ^p_FB(w) − φ^p_FB(w′)| ≤ κ₂‖w − w′‖ when p ≥ 2 for all w, w′ ∈ R².
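Parts (b), (c) and (e) of Property 2.1 lend themselves to a quick randomized sanity check. The following sketch is only an illustration under stated assumptions: the sampling box [−5, 5]², the seed, the tolerance, and the names `phi_fb_p`, `lipschitz_constant`, `count_violations` are all choices made here, not taken from [8].

```python
import math
import random

def phi_fb_p(a, b, p):
    """Generalized FB function (4): p-norm of (a, b) minus (a + b)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def lipschitz_constant(p):
    """Constants kappa_1 and kappa_2 from Property 2.1(e)."""
    if p < 2:
        return math.sqrt(2) + 2 ** (1.0 / p - 0.5)
    return 1 + math.sqrt(2)

def count_violations(p, trials=2000, seed=0, tol=1e-9):
    """Count sampled violations of sub-additivity, positive homogeneity
    and the Lipschitz bound (up to a floating-point tolerance)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        w = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        v = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        alpha = rng.uniform(0, 3)
        fw, fv = phi_fb_p(w[0], w[1], p), phi_fb_p(v[0], v[1], p)
        # (b) sub-additivity
        if phi_fb_p(w[0] + v[0], w[1] + v[1], p) > fw + fv + tol:
            bad += 1
        # (c) positive homogeneity
        if abs(phi_fb_p(alpha * w[0], alpha * w[1], p) - alpha * fw) > tol:
            bad += 1
        # (e) Lipschitz bound with respect to the Euclidean norm
        dist = math.hypot(w[0] - v[0], w[1] - v[1])
        if abs(fw - fv) > lipschitz_constant(p) * dist + tol:
            bad += 1
    return bad

violations = {p: count_violations(p) for p in (1.5, 2.0, 3.0)}
print(violations)
```

A random test of this kind cannot prove the property, but a nonzero count would signal a transcription error in the formula or in the constants.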


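Property 2.2. Let φ^p_FB be defined as in (4). Then, for any α > 0, the following variants of φ^p_FB are also NCP-functions.

\[ \begin{aligned}
\widetilde{\phi}^p_{\mathrm{FB}-1}(a, b) &= \phi^p_{\mathrm{FB}}(a, b) - \alpha (a)_+ (b)_+, \\
\widetilde{\phi}^p_{\mathrm{FB}-2}(a, b) &= \phi^p_{\mathrm{FB}}(a, b) - \alpha \left( (a)_+ (b)_+ \right)^2, \\
\widetilde{\phi}^p_{\mathrm{FB}-3}(a, b) &= \sqrt{ [\phi^p_{\mathrm{FB}}(a, b)]^2 + \alpha \left( (a)_+ (b)_+ \right)^2 }, \\
\widetilde{\phi}^p_{\mathrm{FB}-4}(a, b) &= \sqrt{ [\phi^p_{\mathrm{FB}}(a, b)]^2 + \alpha \left[ (ab)_+ \right]^2 }.
\end{aligned} \]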

Property 2.3. ([9, Lemma 2.2]) Let φ^p_FB be defined as in (4). Then, the generalized gradient ∂φ^p_FB(a, b) of φ^p_FB at a point (a, b) is equal to the set of all (v_a, v_b) such that

\[ (v_a, v_b) = \begin{cases} \left( \dfrac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1, \ \dfrac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) & \text{if } (a, b) \ne (0, 0), \\ (\xi - 1, \zeta - 1) & \text{if } (a, b) = (0, 0), \end{cases} \]

where (ξ, ζ) is any vector satisfying |ξ|^{p/(p−1)} + |ζ|^{p/(p−1)} ≤ 1.

Property 2.4. ([8, Proposition 3.2]) Let φ^p_FB, ψ^p_FB be defined as in (4) and (5), respectively. Then, the following hold.

(a) ψ^p_FB is an NCP-function, i.e., it satisfies (1).

(b) ψ^p_FB(a, b) ≥ 0 for all (a, b) ∈ R².

(c) ψ^p_FB is continuously differentiable everywhere.

(d) ∇_aψ^p_FB(a, b) · ∇_bψ^p_FB(a, b) ≥ 0 for all (a, b) ∈ R². The equality holds if and only if φ^p_FB(a, b) = 0.

(e) ∇_aψ^p_FB(a, b) = 0 ⇐⇒ ∇_bψ^p_FB(a, b) = 0 ⇐⇒ φ^p_FB(a, b) = 0.

3 The function φ^p_NR

In this section, we focus on the generalized NR function φ^p_NR defined as in (6). Its continuous differentiability is studied in [7]. Here we further study its Lipschitz continuity and some properties that are usually employed in derivative-free algorithms.


Proposition 3.1. ([7, Proposition 2.1]) Let φ^p_NR be defined as in (6) with p > 1 being a positive odd integer. Then, φ^p_NR is an NCP-function.

Proposition 3.2. ([7, Proposition 2.2]) Let φ^p_NR be defined as in (6) with p > 1 being a positive odd integer, and let p = 2k + 1 where k ∈ N. Then, the following hold.

(a) An alternative expression of φ^p_NR is

\[ \phi^p_{\mathrm{NR}}(a, b) = a^{2k+1} - \frac{1}{2} \left[ (a - b)^{2k+1} + (a - b)^{2k} |a - b| \right]. \]

(b) The function φ^p_NR is continuously differentiable with

\[ \nabla \phi^p_{\mathrm{NR}}(a, b) = p \begin{bmatrix} a^{p-1} - (a - b)^{p-2} (a - b)_+ \\ (a - b)^{p-2} (a - b)_+ \end{bmatrix}. \]

(c) The function φ^p_NR is twice continuously differentiable with

\[ \nabla^2 \phi^p_{\mathrm{NR}}(a, b) = p(p - 1) \begin{bmatrix} a^{p-2} - (a - b)^{p-3} (a - b)_+ & (a - b)^{p-3} (a - b)_+ \\ (a - b)^{p-3} (a - b)_+ & -(a - b)^{p-3} (a - b)_+ \end{bmatrix}. \]
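The gradient formula in Proposition 3.2(b) can be compared against central finite differences. This is a minimal sketch rather than part of [7]: the helper names, the step size, and the sample points are assumptions chosen for illustration, with one point on the line a = b where the two branches of (a − b)₊ meet.

```python
def plus(t):
    """(t)_+ = max{0, t}."""
    return max(0.0, t)

def phi_nr(a, b, p):
    """Generalized natural-residual function, equation (6)."""
    return a**p - plus(a - b)**p

def grad_phi_nr(a, b, p):
    """Closed-form gradient from Proposition 3.2(b)."""
    s = (a - b) ** (p - 2) * plus(a - b)
    return (p * (a ** (p - 1) - s), p * s)

def fd_grad(f, a, b, p, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    da = (f(a + h, b, p) - f(a - h, b, p)) / (2 * h)
    db = (f(a, b + h, p) - f(a, b - h, p)) / (2 * h)
    return (da, db)

points = [(1.5, 0.5), (-1.0, 2.0), (0.3, 0.3), (-2.0, -1.0)]
max_err = 0.0
for p in (3, 5):
    for a, b in points:
        exact = grad_phi_nr(a, b, p)
        approx = fd_grad(phi_nr, a, b, p)
        max_err = max(max_err, abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))
print(max_err)
```

The discrepancy stays at the level of finite-difference error, including at the point (0.3, 0.3) on the line a = b, in line with the continuous differentiability claimed in part (b).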

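Proposition 3.3. ([7, Proposition 2.4]) Let φ^p_NR be defined as in (6) with p > 1 being a positive odd integer. Then, for any α > 0, the following variants of φ^p_NR are also NCP-functions.

\[ \begin{aligned}
\widetilde{\phi}^p_{\mathrm{NR}-1}(a, b) &= \phi^p_{\mathrm{NR}}(a, b) + \alpha (a)_+ (b)_+, \\
\widetilde{\phi}^p_{\mathrm{NR}-2}(a, b) &= \phi^p_{\mathrm{NR}}(a, b) + \alpha \left( (a)_+ (b)_+ \right)^2, \\
\widetilde{\phi}^p_{\mathrm{NR}-3}(a, b) &= [\phi^p_{\mathrm{NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^4, \\
\widetilde{\phi}^p_{\mathrm{NR}-4}(a, b) &= [\phi^p_{\mathrm{NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^2.
\end{aligned} \]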

Proposition 3.4. Let φ^p_NR be defined as in (6) with p > 1 being a positive odd integer. Then, the following hold.

(a) φ^p_NR(a, b) > 0 ⇐⇒ a > 0, b > 0.

(b) φ^p_NR is positive homogeneous of degree p, i.e., φ^p_NR(αw) = α^p φ^p_NR(w) for all w ∈ R² and α ≥ 0.

(c) φ^p_NR is locally Lipschitz continuous, but not (globally) Lipschitz continuous.

(d) φ^p_NR is not α-Hölder continuous for any α ∈ (0, 1], that is, the Hölder coefficient

\[ [\phi^p_{\mathrm{NR}}]_{\alpha, \mathbb{R}^2} := \sup_{w \ne w'} \frac{|\phi^p_{\mathrm{NR}}(w) - \phi^p_{\mathrm{NR}}(w')|}{\|w - w'\|^{\alpha}} \]

is infinite.

(e) \[ \nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) \begin{cases} > 0 & \text{on } \{(a, b) \mid a > b > 0 \text{ or } a > b > 2a\}, \\ = 0 & \text{on } \{(a, b) \mid a \le b \text{ or } a > b = 2a \text{ or } a > b = 0\}, \\ < 0 & \text{otherwise.} \end{cases} \]

(f) ∇_aφ^p_NR(a, b) = 0 provided that φ^p_NR(a, b) = 0.

Proof. (a) This result has been mentioned in [7, Lemma 2.2].

(b) It is clear by the definition of φ^p_NR.

(c) Since continuous differentiability implies local Lipschitz continuity, it remains to show that φ^p_NR is not Lipschitz continuous. Consider the restriction of φ^p_NR to the line L := {(a, b) | a = b > 0}. Since φ^p_NR(a, a) = a^p for any a > 0, it suffices to show that f(t) := t^p is not Lipschitz continuous. Indeed, for any M > 0, choosing t = max{1, M} and t′ = t + 1 yields

\[ \frac{|f(t) - f(t')|}{|t - t'|} = (t + 1)^p - t^p = (t + 1)^{p-1} + (t + 1)^{p-2} t + \cdots + t^{p-1} > p \cdot t^{p-1} > M. \]

Hence, it follows that f is not Lipschitz continuous.

(d) As in the proof of part (c), we again restrict φ^p_NR to L and choose the same t and t′. Since |t − t′| = 1, we also have

\[ \frac{|f(t) - f(t')|}{|t - t'|^{\alpha}} > M \]

for any positive number M, that is, φ^p_NR is not α-Hölder continuous.

(e) According to Proposition 3.2, we know that

\[ \begin{aligned} \nabla_a \phi^p_{\mathrm{NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{NR}}(a, b) &= p^2 \cdot \left( a^{p-1} - (a - b)^{p-2} (a - b)_+ \right) \left( (a - b)^{p-2} (a - b)_+ \right) \\ &= \begin{cases} p^2 \cdot \left( a^{p-1} - (a - b)^{p-1} \right) (a - b)^{p-1} & \text{if } a > b, \\ 0 & \text{if } a \le b. \end{cases} \end{aligned} \]

When a > b, it is clear that p² > 0 and (a − b)^{p−1} > 0. Thus, we only consider the term a^{p−1} − (a − b)^{p−1}. Note that p − 1 is even, which implies

\[ a^{p-1} = (a - b)^{p-1} \iff |a| = a - b \iff b = 0 \text{ or } b = 2a. \]

In addition to the case a ≤ b, there are two subcases a > b = 0 and a > b = 2a such that ∇_aφ^p_NR(a, b) = 0. On the other hand, we have

\[ a^{p-1} > (a - b)^{p-1} \iff |a| > a - b \iff b > 0 \text{ or } b > 2a. \]

All the above says that ∇_aφ^p_NR(a, b) · ∇_bφ^p_NR(a, b) is positive only when a > b > 0 or a > b > 2a. For the remaining case, it is not hard to verify ∇_aφ^p_NR(a, b) · ∇_bφ^p_NR(a, b) < 0.

(f) It is clear from part (e). □
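The sign pattern in Proposition 3.4(e) can be cross-checked on an integer grid, where all powers are computed exactly. The sketch below is illustrative only: `grad_product_sign` evaluates the product from the proof of part (e) (dropping the positive factor p²), `predicted_sign` encodes the three regions of the statement, and both names as well as the grid are assumptions.

```python
def grad_product_sign(a, b, p):
    """Sign of grad_a * grad_b for phi^p_NR, using the closed form
    (a^(p-1) - (a-b)^(p-1)) * (a-b)^(p-1) for a > b, and 0 for a <= b."""
    if a <= b:
        return 0
    prod = (a ** (p - 1) - (a - b) ** (p - 1)) * (a - b) ** (p - 1)
    return (prod > 0) - (prod < 0)

def predicted_sign(a, b):
    """Regions stated in Proposition 3.4(e)."""
    if a <= b or b == 0 or b == 2 * a:
        return 0
    if b > 0 or b > 2 * a:  # here a > b already holds
        return 1
    return -1

# Integer grid (an assumption), so comparisons such as b == 2a are exact.
grid = range(-4, 5)
mismatches = [
    (a, b, p)
    for p in (3, 5)
    for a in grid
    for b in grid
    if grad_product_sign(a, b, p) != predicted_sign(a, b)
]
print(mismatches)
```

An empty list of mismatches on this grid is consistent with the case analysis in the proof; for instance, (a, b) = (−1, −2) lies on the line b = 2a, where the product vanishes.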

4 The function φ^p_S−NR

In this section, we focus on the function φ^p_S−NR defined as in (7). As mentioned earlier, it is the symmetrization of φ^p_NR. Chang et al. [1] showed that it is not differentiable on the line L = {(a, b) | a = b}. However, this statement needs a mild modification since φ^p_S−NR is differentiable at (0, 0). Here we further study the Lipschitz continuity, semismoothness, and some properties that are usually employed in derivative-free algorithms.

Proposition 4.1. ([1, Proposition 2.1]) Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, φ^p_S−NR is an NCP-function and is positive only on the first quadrant R²₊₊ := {(a, b) | a > 0, b > 0}.

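Proposition 4.2. ([1, Proposition 2.2]) Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, the following hold.

(a) An alternative expression of φ^p_S−NR is

\[ \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} \phi^p_{\mathrm{NR}}(a, b) & \text{if } a > b, \\ a^p = b^p & \text{if } a = b, \\ \phi^p_{\mathrm{NR}}(b, a) & \text{if } a < b. \end{cases} \]

(b) The function φ^p_S−NR is not differentiable. However, φ^p_S−NR is continuously differentiable on the set Ω := {(a, b) | a ≠ b} with

\[ \nabla \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, a^{p-1} - (a - b)^{p-1}, \ (a - b)^{p-1} \,]^T & \text{if } a > b, \\ p \, [\, (b - a)^{p-1}, \ b^{p-1} - (b - a)^{p-1} \,]^T & \text{if } a < b. \end{cases} \]

In a more compact form,

\[ \nabla \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, \phi^{p-1}_{\mathrm{NR}}(a, b), \ (a - b)^{p-1} \,]^T & \text{if } a > b, \\ p \, [\, (b - a)^{p-1}, \ \phi^{p-1}_{\mathrm{NR}}(b, a) \,]^T & \text{if } a < b. \end{cases} \]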


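(c) The function φ^p_S−NR is twice continuously differentiable on the set Ω = {(a, b) | a ≠ b} with

\[ \nabla^2 \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p(p - 1) \begin{bmatrix} a^{p-2} - (a - b)^{p-2} & (a - b)^{p-2} \\ (a - b)^{p-2} & -(a - b)^{p-2} \end{bmatrix} & \text{if } a > b, \\[2ex] p(p - 1) \begin{bmatrix} -(b - a)^{p-2} & (b - a)^{p-2} \\ (b - a)^{p-2} & b^{p-2} - (b - a)^{p-2} \end{bmatrix} & \text{if } a < b. \end{cases} \]

In a more compact form,

\[ \nabla^2 \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p(p - 1) \begin{bmatrix} \phi^{p-2}_{\mathrm{NR}}(a, b) & (a - b)^{p-2} \\ (a - b)^{p-2} & -(a - b)^{p-2} \end{bmatrix} & \text{if } a > b, \\[2ex] p(p - 1) \begin{bmatrix} -(b - a)^{p-2} & (b - a)^{p-2} \\ (b - a)^{p-2} & \phi^{p-2}_{\mathrm{NR}}(b, a) \end{bmatrix} & \text{if } a < b. \end{cases} \]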

Proposition 4.3. Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, φ^p_S−NR is differentiable at (0, 0) with ∇φ^p_S−NR(0, 0) = [0, 0]^T.

Proof. First, we rewrite φ^p_S−NR in polar coordinates (a, b) = (r cos θ, r sin θ), i.e.,

\[ \begin{aligned} \phi^p_{\mathrm{S-NR}}(a, b) &= \begin{cases} a^p - (a - b)^p & \text{if } a \ge b, \\ b^p - (b - a)^p & \text{if } a < b, \end{cases} \\ &= \begin{cases} r^p \left[ \cos^p\theta - (\cos\theta - \sin\theta)^p \right] & \text{if } -\frac{3\pi}{4} \le \theta \le \frac{\pi}{4}, \\ r^p \left[ \sin^p\theta - (\sin\theta - \cos\theta)^p \right] & \text{if } \frac{\pi}{4} < \theta < \frac{5\pi}{4}. \end{cases} \end{aligned} \]

We note that the parts |cos^p θ − (cos θ − sin θ)^p| and |sin^p θ − (sin θ − cos θ)^p| are bounded by some constant M_p which depends on p; hence we have

\[ \frac{|\phi^p_{\mathrm{S-NR}}(a, b) - \phi^p_{\mathrm{S-NR}}(0, 0)|}{\sqrt{a^2 + b^2}} \le \frac{M_p \cdot r^p}{r} = M_p \cdot r^{p-1} \to 0 \quad \text{as } r \to 0. \]

Since (a, b) → (0, 0) implies r → 0, we conclude that ∇φ^p_S−NR(0, 0) = [0, 0]^T. □
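The decay rate M_p · r^{p−1} used in the proof of Proposition 4.3 can be observed numerically. The following sketch is an illustration under assumptions: `phi_s_nr` is coded from definition (7), and the helper `max_ratio`, the number of sampled angles, and the radii are hypothetical choices.

```python
import math

def phi_s_nr(a, b, p):
    """First symmetrization, equation (7)."""
    if a > b:
        return a**p - (a - b)**p
    if a < b:
        return b**p - (b - a)**p
    return a**p

def max_ratio(r, p, n_angles=360):
    """Largest value of |phi(a, b) - phi(0, 0)| / ||(a, b)|| over sampled
    points on the circle of radius r (phi(0, 0) = 0)."""
    worst = 0.0
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        a, b = r * math.cos(theta), r * math.sin(theta)
        worst = max(worst, abs(phi_s_nr(a, b, p)) / r)
    return worst

# For p = 3 the ratio should shrink like r^2 as r -> 0.
ratios = [max_ratio(10.0 ** (-j), 3) for j in range(1, 5)]
print(ratios)
```

For p = 3, each tenfold reduction of r reduces the ratio by a factor of about 100, which matches the bound M_p · r^{p−1} and the conclusion that the gradient at the origin is zero.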

Note that φ^p_S−NR is stated to be not differentiable on the line L = {(a, b) | a = b} in [1, Proposition 2.2]. Here, we show that it is in fact differentiable at (0, 0), so Proposition 4.3 can be viewed as an addendum to [1, Proposition 2.2].


Proposition 4.4. Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, for any α > 0, the following variants of φ^p_S−NR are also NCP-functions.

\[ \begin{aligned}
\widetilde{\phi}_1(a, b) &= \phi^p_{\mathrm{S-NR}}(a, b) + \alpha (a)_+ (b)_+, \\
\widetilde{\phi}_2(a, b) &= \phi^p_{\mathrm{S-NR}}(a, b) + \alpha \left( (a)_+ (b)_+ \right)^2, \\
\widetilde{\phi}_3(a, b) &= [\phi^p_{\mathrm{S-NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^4, \\
\widetilde{\phi}_4(a, b) &= [\phi^p_{\mathrm{S-NR}}(a, b)]^2 + \alpha \left( (ab)_+ \right)^2.
\end{aligned} \]

Proposition 4.5. Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, the following hold.

(a) φ^p_S−NR(a, b) > 0 ⇐⇒ a > 0, b > 0.

(b) φ^p_S−NR is positive homogeneous of degree p.

(c) φ^p_S−NR is not Lipschitz continuous.

(d) φ^p_S−NR is not α-Hölder continuous for any α ∈ (0, 1].

(e) ∇_aφ^p_S−NR(a, b) · ∇_bφ^p_S−NR(a, b) > 0 on {(a, b) | a > b > 0} ∪ {(a, b) | b > a > 0}.

(f) ∇_aφ^p_S−NR(a, b) = 0 provided that φ^p_S−NR(a, b) = 0 and (a, b) ≠ (0, 0).

Proof. (a) It is clear from Proposition 4.1 (or [1, Proposition 2.1]).

(b) It follows from the definition of φ^p_S−NR.

(c)-(d) The proof is similar to that of Proposition 3.4(c)-(d).

(e) It is enough to verify the case a > b > 0 because, for b > a > 0, the inequality holds automatically since φ^p_S−NR has a symmetric surface. To see this, according to Proposition 4.2(b), we have

\[ \nabla_a \phi^p_{\mathrm{S-NR}}(a, b) \cdot \nabla_b \phi^p_{\mathrm{S-NR}}(a, b) = p^2 \cdot \left( a^{p-1} - (a - b)^{p-1} \right) (a - b)^{p-1}, \]

which yields the desired result by Proposition 3.4(e).

(f) This result also follows from the proof of Proposition 3.4(e). □

Next, we show the semismoothness of φ^p_S−NR. In fact, every piecewise continuously differentiable function is semismooth. For the sake of completeness, we show this result step by step from the definition; in doing so, we not only obtain the local Lipschitz constant and the generalized gradient, but also derive the strong semismoothness.

First, we need to check that it is strictly continuous (locally Lipschitz continuous). Note that φ^p_S−NR is not globally Lipschitz continuous, as shown in Proposition 4.5(c).


Lemma 4.1. Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, φ^p_S−NR is strictly continuous (locally Lipschitz continuous).

Proof. For any point x = (a, b) with a ≠ b, the continuous differentiability of φ^p_S−NR implies its local Lipschitz continuity. It remains to show that φ^p_S−NR is locally Lipschitz continuous on the line L = {(a, b) | a = b}.

To proceed, we present two inequalities that will be used frequently. Given any x⁰ = (a₀, a₀) and δ > 0, let N_δ(x⁰) := {x ∈ R² | ‖x − x⁰‖ ≤ δ}. Then, for any x = (x₁, x₂) ∈ N_δ(x⁰), we have two basic inequalities as follows:

\[ |x_i| \le \|x\| \le \|x - x^0\| + \|x^0\| \le \delta + \|x^0\| \quad \forall i = 1, 2. \tag{12} \]

\[ |x_1 - x_2| \le |x_1 - a_0| + |a_0 - x_2| \le \|x - x^0\| + \|x^0 - x\| \le 2\delta. \tag{13} \]

Now, for any y, z ∈ N_δ(x⁰), we discuss four cases as below.

(i) For y ∈ L and z ∈ L, we have

\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &= |y_1^p - z_1^p| \\ &= |y_1 - z_1| \cdot |y_1^{p-1} + y_1^{p-2} z_1 + \cdots + z_1^{p-1}| \\ &\le \|y - z\| \cdot \left( |y_1|^{p-1} + |y_1|^{p-2} \cdot |z_1| + \cdots + |z_1|^{p-1} \right) \\ &\le p (\delta + \|x^0\|)^{p-1} \|y - z\| \\ &= \kappa_1 \|y - z\|, \end{aligned} \]

where κ₁ := p(δ + ‖x⁰‖)^{p−1} and the second inequality holds by (12).

(ii) For y ∉ L and z ∈ L (or y ∈ L and z ∉ L), without loss of generality, we assume y₁ > y₂. Then, we have

\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &= |y_1^p - (y_1 - y_2)^p - z_1^p| \\ &\le |y_1^p - z_1^p| + (y_1 - y_2)^p \\ &\le \kappa_1 \|y - z\| + (y_1 - y_2)^{p-1} \left( |y_1 - z_1| + |z_1 - z_2| + |z_2 - y_2| \right) \\ &\le \kappa_1 \|y - z\| + (2\delta)^{p-1} \left( \|y - z\| + \|z - y\| \right) \\ &= \kappa_2 \|y - z\|, \end{aligned} \]

where κ₂ := κ₁ + 2(2δ)^{p−1} and the last inequality holds by (13).

(iii) For y ∉ L, z ∉ L with y, z lying on opposite sides of L, i.e., (y₁ − y₂)(z₁ − z₂) < 0, without loss of generality, we assume y₁ > y₂ and z₁ < z₂. Since y, z lie on opposite sides of L, the line L and the segment [y, z] := {λy + (1 − λ)z | λ ∈ [0, 1]} must intersect at a point w ∈ [y, z] ∩ L. Thus, we have

\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &\le |\phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(w)| + |\phi^p_{\mathrm{S-NR}}(w) - \phi^p_{\mathrm{S-NR}}(z)| \\ &\le \kappa_2 \|y - w\| + \kappa_2 \|w - z\| \\ &\le \kappa_2 \|y - z\| + \kappa_2 \|y - z\| \\ &= \kappa_3 \|y - z\|, \end{aligned} \]

where κ₃ := 2κ₂ and the third inequality holds because w ∈ [y, z].

(iv) For y ∉ L, z ∉ L with y, z lying on the same side of L, i.e., (y₁ − y₂)(z₁ − z₂) > 0, without loss of generality, we assume y₁ > y₂ and z₁ > z₂. Then, we have

\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| &= |(y_1^p - (y_1 - y_2)^p) - (z_1^p - (z_1 - z_2)^p)| \\ &\le |y_1^p - z_1^p| + |(y_1 - y_2)^p - (z_1 - z_2)^p| \\ &\le \kappa_1 \|y - z\| + 2p (2\delta)^{p-1} \|y - z\| \\ &= \kappa_4 \|y - z\|, \end{aligned} \]

where κ₄ := κ₁ + 2p(2δ)^{p−1} and the second part is estimated as follows:

\[ \begin{aligned} |(y_1 - y_2)^p - (z_1 - z_2)^p| &= |(y_1 - y_2) - (z_1 - z_2)| \cdot |(y_1 - y_2)^{p-1} + \cdots + (z_1 - z_2)^{p-1}| \\ &\le (|y_1 - z_1| + |y_2 - z_2|) \left( |y_1 - y_2|^{p-1} + \cdots + |z_1 - z_2|^{p-1} \right) \\ &\le (\|y - z\| + \|y - z\|) \, p (2\delta)^{p-1} \\ &= 2p (2\delta)^{p-1} \|y - z\|. \end{aligned} \]

From all the above, by choosing κ = max{κ₁, κ₂, κ₃, κ₄}, we conclude that

\[ \left| \phi^p_{\mathrm{S-NR}}(y) - \phi^p_{\mathrm{S-NR}}(z) \right| \le \kappa \|y - z\| \quad \text{for any } y, z \in N_\delta(x^0). \]

This means that φ^p_S−NR is locally Lipschitz continuous at x⁰. Then, the proof is complete. □

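Then, the generalized gradient of φ^p_S−NR is given by

\[ \partial \phi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, a^{p-1} - (a - b)^{p-1}, \ (a - b)^{p-1} \,]^T & \text{if } a > b, \\ \left\{ p \, [\, \alpha a^{p-1}, \ (1 - \alpha) a^{p-1} \,]^T \mid \alpha \in [0, 1] \right\} & \text{if } a = b, \\ p \, [\, (b - a)^{p-1}, \ b^{p-1} - (b - a)^{p-1} \,]^T & \text{if } a < b. \end{cases} \]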

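Proof. We have already seen ∂φ^p_S−NR(a, b) for a ≠ b in [22]. For a = b, according to the definition of Clarke's generalized gradient, we claim that

\[ \partial \phi^p_{\mathrm{S-NR}}(a, a) = \operatorname{conv} \left\{ \lim_{(a_i, b_i) \to (a, a)} \nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i) \;\middle|\; \phi^p_{\mathrm{S-NR}} \text{ is differentiable at } (a_i, b_i) \in \mathbb{R}^2 \right\}. \]

To see this, we discuss three cases as below.

(i) If a_i > b_i for all i ≥ n with n sufficiently large, then

\[ \lim_{(a_i, b_i) \to (a, a)} \nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i) = \lim_{(a_i, b_i) \to (a, a)} p \begin{bmatrix} a_i^{p-1} - (a_i - b_i)^{p-1} \\ (a_i - b_i)^{p-1} \end{bmatrix} = p \begin{bmatrix} a^{p-1} \\ 0 \end{bmatrix}. \]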


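(ii) If a_i < b_i for all i ≥ n with n sufficiently large, then

\[ \lim_{(a_i, b_i) \to (a, a)} \nabla \phi^p_{\mathrm{S-NR}}(a_i, b_i) = \lim_{(a_i, b_i) \to (a, a)} p \begin{bmatrix} (b_i - a_i)^{p-1} \\ b_i^{p-1} - (b_i - a_i)^{p-1} \end{bmatrix} = p \begin{bmatrix} 0 \\ a^{p-1} \end{bmatrix}. \]

(iii) For the remaining case, ∇φ^p_S−NR(a_i, b_i) has no limit as (a_i, b_i) → (a, a).

From all the above, we conclude that

\[ \partial \phi^p_{\mathrm{S-NR}}(a, a) = \operatorname{conv} \left\{ p \begin{bmatrix} a^{p-1} \\ 0 \end{bmatrix}, \ p \begin{bmatrix} 0 \\ a^{p-1} \end{bmatrix} \right\} = \left\{ p \begin{bmatrix} \alpha a^{p-1} \\ (1 - \alpha) a^{p-1} \end{bmatrix} \;\middle|\; \alpha \in [0, 1] \right\}. \]

Thus, the desired result follows. □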

Lemma 4.2. Let φ^p_S−NR be defined as in (7) with p > 1 being a positive odd integer. Then, φ^p_S−NR is a directionally differentiable function.

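Proof. For any point x = (a, b) with a ≠ b, the continuous differentiability of φ^p_S−NR implies the directional differentiability. Thus, it remains to show that φ^p_S−NR is directionally differentiable on the line L = {(a, b) | a = b}.

To proceed, given any x = (a, a), h = (h₁, h₂) and t > 0, we discuss three cases as below.

(i) If h₁ = h₂, then

\[ \begin{aligned} \lim_{t \to 0^+} \frac{\phi^p_{\mathrm{S-NR}}(x + th) - \phi^p_{\mathrm{S-NR}}(x)}{t} &= \lim_{t \to 0^+} \frac{(a + th_1)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \frac{a^p + p a^{p-1} t h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^k h_1^k - a^p}{t} \\ &= \lim_{t \to 0^+} \left( p a^{p-1} h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^{k-1} h_1^k \right) \\ &= p a^{p-1} h_1. \end{aligned} \]

(ii) If h₁ > h₂, then

\[ \begin{aligned} \lim_{t \to 0^+} \frac{\phi^p_{\mathrm{S-NR}}(x + th) - \phi^p_{\mathrm{S-NR}}(x)}{t} &= \lim_{t \to 0^+} \frac{(a + th_1)^p - (th_1 - th_2)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \frac{a^p + p a^{p-1} t h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^k h_1^k - t^p (h_1 - h_2)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \left( p a^{p-1} h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^{k-1} h_1^k - t^{p-1} (h_1 - h_2)^p \right) \\ &= p a^{p-1} h_1. \end{aligned} \]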


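(iii) If h₁ < h₂, then

\[ \begin{aligned} \lim_{t \to 0^+} \frac{\phi^p_{\mathrm{S-NR}}(x + th) - \phi^p_{\mathrm{S-NR}}(x)}{t} &= \lim_{t \to 0^+} \frac{(a + th_2)^p - (th_2 - th_1)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \frac{a^p + p a^{p-1} t h_2 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^k h_2^k - t^p (h_2 - h_1)^p - a^p}{t} \\ &= \lim_{t \to 0^+} \left( p a^{p-1} h_2 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} t^{k-1} h_2^k - t^{p-1} (h_2 - h_1)^p \right) \\ &= p a^{p-1} h_2. \end{aligned} \]

To sum up, the definition of directional differentiability is checked. Then, the proof is complete. □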
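The three cases in the proof of Lemma 4.2 can be summarized as: at x = (a, a), the directional derivative in the direction h = (h₁, h₂) equals p a^{p−1} max{h₁, h₂}. The sketch below compares this closed form with a one-sided difference quotient. It is only an illustration: the step t, the sample points, and the helper names are assumptions.

```python
def phi_s_nr(a, b, p):
    """First symmetrization, equation (7)."""
    if a > b:
        return a**p - (a - b)**p
    if a < b:
        return b**p - (b - a)**p
    return a**p

def dir_quotient(a, h, p, t):
    """One-sided difference quotient at x = (a, a) in direction h."""
    return (phi_s_nr(a + t * h[0], a + t * h[1], p) - phi_s_nr(a, a, p)) / t

def dir_derivative(a, h, p):
    """Limit obtained in the three cases of the proof: p a^(p-1) max{h1, h2}."""
    return p * a ** (p - 1) * max(h)

directions = [(1.0, 1.0), (2.0, -1.0), (-1.0, 2.0), (-1.0, -3.0)]
max_err = 0.0
for a in (1.5, -0.5):
    for h in directions:
        approx = dir_quotient(a, h, 3, 1e-6)
        max_err = max(max_err, abs(approx - dir_derivative(a, h, 3)))
print(max_err)
```

The same value is obtained by maximizing V·h over the generalized gradient at (a, a) stated before Lemma 4.2, since a^{p−1} ≥ 0 when p is odd.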

Then, φ^p_S−NR is a semismooth function. Moreover, φ^p_S−NR is strongly semismooth.

Proof. We shall directly show that φ^p_S−NR is strongly semismooth. Note that φ^p_S−NR is twice continuously differentiable at any point x = (a, b) with a ≠ b, which implies the strong semismoothness of φ^p_S−NR at x. It remains to show that φ^p_S−NR is strongly semismooth on the line L = {(a, b) | a = b}.

For any x = (a, a), h = (h₁, h₂), V ∈ ∂φ^p_S−NR(x + h) and h → 0, we have the following inequality while ‖h‖ ≤ 1:

\[ \|h\|^p \le \|h\|^2 \quad \text{for any } p \ge 2. \]

To prove the strong semismoothness of φ^p_S−NR, we will apply this inequality and verify (11) by discussing three cases as below.

(i) If h₁ = h₂, then for any α ∈ [0, 1]

\[ \begin{aligned} \left| \phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - Vh \right| &= \left| (a + h_1)^p - a^p - p \, [\, \alpha a^{p-1}, \ (1 - \alpha) a^{p-1} \,] \begin{bmatrix} h_1 \\ h_1 \end{bmatrix} \right| \\ &= \left| a^p + p a^{p-1} h_1 + \sum_{k=2}^{p} \binom{p}{k} a^{p-k} h_1^k - a^p - p a^{p-1} h_1 \right| \\ &\le M_1 \left( |h_1|^2 + \cdots + |h_1|^p \right) \\ &\le M_1 \left( \|h\|^2 + \cdots + \|h\|^p \right) \\ &\le (p - 1) M_1 \|h\|^2, \end{aligned} \]


where M₁ = max{ \binom{p}{k} |a|^{p−k} | k = 2, 3, ..., p } and the last inequality holds when ‖h‖ ≤ 1.

(ii) If h₁ > h₂, then

\[ \begin{aligned} & \left| \phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - Vh \right| \\ &= \left| (a + h_1)^p - (h_1 - h_2)^p - a^p - p \, [\, (a + h_1)^{p-1} - (h_1 - h_2)^{p-1}, \ (h_1 - h_2)^{p-1} \,] \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \right| \\ &= \left| (a + h_1)^p - (h_1 - h_2)^p - a^p - p (a + h_1)^{p-1} h_1 + p (h_1 - h_2)^p \right| \\ &= \left| (a + h_1)^p - a^p - p \left( a^{p-1} + \sum_{k=1}^{p-1} \binom{p-1}{k} a^{p-1-k} h_1^k \right) h_1 + (p - 1)(h_1 - h_2)^p \right| \\ &\le \underbrace{\left| (a + h_1)^p - a^p - p a^{p-1} h_1 \right|}_{\Xi_1} + p \underbrace{\left| \sum_{k=1}^{p-1} \binom{p-1}{k} a^{p-1-k} h_1^{k+1} \right|}_{\Xi_2} + (p - 1) \underbrace{\left| (h_1 - h_2)^p \right|}_{\Xi_3}. \end{aligned} \]

As h → 0, we have the following estimations for each Ξ_i.

• Ξ₁ ≤ (p − 1)M₁‖h‖² by case (i).

• Ξ₂ ≤ \sum_{k=1}^{p-1} \binom{p-1}{k} |a|^{p−1−k} |h₁|^{k+1} ≤ M₂(|h₁|² + ⋯ + |h₁|^p) ≤ (p − 1)M₂‖h‖², where M₂ = max{ \binom{p-1}{k} |a|^{p−1−k} | k = 1, 2, ..., p − 1 }.

• Ξ₃ ≤ \sum_{k=0}^{p} \binom{p}{k} |h₁|^{p−k} |h₂|^k ≤ M₃(‖h‖^p + ⋯ + ‖h‖^p) ≤ (p + 1)M₃‖h‖², where M₃ = max{ \binom{p}{k} | k = 0, 1, ..., p }.

Hence, we conclude that

\[ \left| \phi^p_{\mathrm{S-NR}}(x + h) - \phi^p_{\mathrm{S-NR}}(x) - Vh \right| \le M \|h\|^2, \]

where M = (p − 1)M₁ + p(p − 1)M₂ + (p − 1)(p + 1)M₃.

(iii) If h₁ < h₂, the argument is similar to case (ii).

All the above together with Lemmas 4.1-4.2 prove that φ^p_S−NR is strongly semismooth. □

5 The function ψ^p_S−NR

In this section, we focus on the function ψ^p_S−NR defined as in (8). As mentioned earlier, it is the second symmetrization of φ^p_NR. Moreover, it is differentiable and possesses a symmetric surface, as shown in [1]. Here we further study the Lipschitz continuity and some properties that are usually employed in derivative-free algorithms.


Proposition 5.1. ([1, Proposition 3.1]) Let ψ^p_S−NR be defined as in (8) with p > 1 being a positive odd integer. Then, ψ^p_S−NR is an NCP-function and is positive on the set Ω′ = {(a, b) | ab ≠ 0} ∪ {(a, b) | a < b = 0} ∪ {(a, b) | 0 = a > b}.

Proposition 5.2. ([1, Proposition 3.2]) Let ψ^p_S−NR be defined as in (8) with p > 1 being a positive odd integer. Then, the following hold.

(a) An alternative expression of ψ^p_S−NR is

\[ \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} \phi^p_{\mathrm{NR}}(a, b) \, b^p & \text{if } a > b, \\ a^p b^p = a^{2p} & \text{if } a = b, \\ \phi^p_{\mathrm{NR}}(b, a) \, a^p & \text{if } a < b. \end{cases} \]

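(b) The function ψ^p_S−NR is continuously differentiable with

\[ \nabla \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, a^{p-1} b^p - (a - b)^{p-1} b^p, \ a^p b^{p-1} - (a - b)^p b^{p-1} + (a - b)^{p-1} b^p \,]^T & \text{if } a > b, \\ p \, [\, a^{p-1} b^p, \ a^p b^{p-1} \,]^T = p a^{2p-1} [\, 1, \ 1 \,]^T & \text{if } a = b, \\ p \, [\, a^{p-1} b^p - (b - a)^p a^{p-1} + (b - a)^{p-1} a^p, \ a^p b^{p-1} - (b - a)^{p-1} a^p \,]^T & \text{if } a < b. \end{cases} \]

In a more compact form,

\[ \nabla \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \, [\, \phi^{p-1}_{\mathrm{NR}}(a, b) \, b^p, \ \phi^p_{\mathrm{NR}}(a, b) \, b^{p-1} + (a - b)^{p-1} b^p \,]^T & \text{if } a > b, \\ p \, [\, a^{2p-1}, \ a^{2p-1} \,]^T & \text{if } a = b, \\ p \, [\, \phi^p_{\mathrm{NR}}(b, a) \, a^{p-1} + (b - a)^{p-1} a^p, \ \phi^{p-1}_{\mathrm{NR}}(b, a) \, a^p \,]^T & \text{if } a < b. \end{cases} \]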

(c) The function ψ^p_S−NR is twice continuously differentiable with

\[ \nabla^2 \psi^p_{\mathrm{S-NR}}(a, b) = \begin{cases} p \begin{bmatrix} (p-1)[a^{p-2} - (a-b)^{p-2}] b^p & (p-1)(a-b)^{p-2} b^p + p[a^{p-1} - (a-b)^{p-1}] b^{p-1} \\ (p-1)(a-b)^{p-2} b^p + p[a^{p-1} - (a-b)^{p-1}] b^{p-1} & (p-1)[a^p - (a-b)^p] b^{p-2} + 2p(a-b)^{p-1} b^{p-1} - (p-1)(a-b)^{p-2} b^p \end{bmatrix} & \text{if } a > b, \\[3ex] p \begin{bmatrix} (p-1) a^{p-2} b^p & p a^{p-1} b^{p-1} \\ p a^{p-1} b^{p-1} & (p-1) a^p b^{p-2} \end{bmatrix} & \text{if } a = b, \\[3ex] p \begin{bmatrix} (p-1)[b^p - (b-a)^p] a^{p-2} + 2p(b-a)^{p-1} a^{p-1} - (p-1)(b-a)^{p-2} a^p & (p-1)(b-a)^{p-2} a^p + p[b^{p-1} - (b-a)^{p-1}] a^{p-1} \\ (p-1)(b-a)^{p-2} a^p + p[b^{p-1} - (b-a)^{p-1}] a^{p-1} & (p-1)[b^{p-2} - (b-a)^{p-2}] a^p \end{bmatrix} & \text{if } a < b. \end{cases} \]
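The piecewise gradient in Proposition 5.2(b) can be compared with central finite differences, including at points on the line a = b where the pieces are glued together. This is a minimal sketch under assumptions: the helper names, the step size, and the sample points are illustrative choices, and the formulas are coded from (8) and from part (b).

```python
def psi_s_nr(a, b, p):
    """Second symmetrization, equation (8)."""
    if a > b:
        return a**p * b**p - (a - b)**p * b**p
    if a < b:
        return a**p * b**p - (b - a)**p * a**p
    return a ** (2 * p)

def grad_psi_s_nr(a, b, p):
    """Closed-form gradient from Proposition 5.2(b)."""
    if a > b:
        da = a**(p - 1) * b**p - (a - b)**(p - 1) * b**p
        db = a**p * b**(p - 1) - (a - b)**p * b**(p - 1) + (a - b)**(p - 1) * b**p
    elif a < b:
        da = a**(p - 1) * b**p - (b - a)**p * a**(p - 1) + (b - a)**(p - 1) * a**p
        db = a**p * b**(p - 1) - (b - a)**(p - 1) * a**p
    else:
        da = db = a ** (2 * p - 1)
    return (p * da, p * db)

def fd_grad(f, a, b, p, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    da = (f(a + h, b, p) - f(a - h, b, p)) / (2 * h)
    db = (f(a, b + h, p) - f(a, b - h, p)) / (2 * h)
    return (da, db)

points = [(1.5, 0.5), (-1.0, 2.0), (0.7, 0.7), (-2.0, -1.0), (-0.4, -0.4)]
max_err = 0.0
for a, b in points:
    exact = grad_psi_s_nr(a, b, 3)
    approx = fd_grad(psi_s_nr, a, b, 3)
    max_err = max(max_err, abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))
print(max_err)
```

Agreement at (0.7, 0.7) and (−0.4, −0.4) illustrates the contrast with φ^p_S−NR, whose gradient is in general not defined on the line a = b.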