Discovery of new complementarity functions for NCP and SOCCP


Computational and Applied Mathematics, vol. 37, no. 5, pp. 5727-5749, 2018


Peng-Fei Ma ¹

Department of Mathematics

Zhejiang University of Science and Technology Hangzhou, Zhejiang 310023, P.R. China

Jein-Shan Chen ² Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan E-mail: jschen@math.ntnu.edu.tw

Chien-Hao Huang ³ Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

Chun-Hsu Ko ⁴

Department of Electrical Engineering I-Shou University

Kaohsiung 840, Taiwan

February 26, 2017

(1st revised on November 18, 2017) (2nd revised on March 9, 2018)

(3rd revised on May 20, 2018)

¹ E-mail: mathpengfeima@126.com. This research was supported by a grant from the National Natural Science Foundation of China (No. 11626212).

² Corresponding author. The author’s work is supported by Ministry of Science and Technology, Taiwan.

³ E-mail: qqnick0719@ntnu.edu.tw

⁴ E-mail: chko@isu.edu.tw


Abstract. It is well known that complementarity functions play an important role in dealing with complementarity problems. In this paper, we propose a few new classes of complementarity functions for nonlinear complementarity problems and second-order cone complementarity problems. These new complementarity functions are constructed by a discrete generalization, which is a novel idea in contrast to the continuous generalization of the Fischer-Burmeister function. Surprisingly, these new families of complementarity functions are continuously differentiable even though they are discrete-oriented extensions. This feature allows methods such as derivative-free algorithms to be employed directly for solving nonlinear complementarity problems and second-order cone complementarity problems. This is a new discovery in the literature, and we believe that such new complementarity functions can also be used in many other contexts.

Keywords. NCP, SOCCP, natural residual, complementarity function.

1 Introduction

In general, the complementarity problem comes from the KKT conditions of linear and nonlinear programming problems. For different types of optimization problems, there arise various complementarity problems, for example, linear complementarity problem, nonlinear complementarity problem, semidefinite complementarity problem, second-order cone complementarity problem, and symmetric cone complementarity problem. To deal with complementarity problems, the so-called complementarity functions play an important role therein. In this paper, we focus on two classes of complementarity functions, which are used for the nonlinear complementarity problem (NCP) and the second-order cone complementarity problem (SOCCP), respectively.

The first class is the nonlinear complementarity problem (NCP), which has attracted much attention since the 1970s because of its wide applications in economics, engineering, and operations research; see [17, 21, 29] and references therein. In mathematical form, the NCP is to find a point $x \in \mathbb{R}^n$ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0,$$

where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product and $F = (F_1, \ldots, F_n)^T$ is a map from $\mathbb{R}^n$ to $\mathbb{R}^n$. For solving the NCP, the so-called NCP-function $\phi : \mathbb{R}^2 \to \mathbb{R}$, defined by

$$\phi(a, b) = 0 \iff a, b \ge 0, \ ab = 0,$$

plays a crucial role. Generally speaking, with such NCP-functions, the NCP can be reformulated as a system of nonsmooth equations [36, 39, 44] or as an unconstrained minimization problem [22, 23, 27, 31, 32, 40, 43]. Different approaches and algorithms are then designed based on these reformulations and on various NCP-functions. During the past four decades, around thirty NCP-functions have been proposed; see [26] for a survey.

The second class is the second-order cone complementarity problem (SOCCP), which can be viewed as a natural extension of the NCP and is to seek a $\zeta \in \mathbb{R}^n$ such that

$$\zeta \in K, \quad F(\zeta) \in K, \quad \langle \zeta, F(\zeta) \rangle = 0,$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$ is a map and $K$ is the Cartesian product of second-order cones (SOC), also called Lorentz cones [19]. In other words, $K$ is expressed as

$$K = K^{n_1} \times \cdots \times K^{n_m},$$

where $m, n_1, \ldots, n_m \ge 1$, $n_1 + \cdots + n_m = n$, and

$$K^{n_i} := \{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid \|x_2\| \le x_1 \},$$

with $\| \cdot \|$ denoting the Euclidean norm. The SOCCP has important applications in engineering problems [35] and robust Nash equilibria [28]. Another important special case of SOCCP corresponds to the Karush-Kuhn-Tucker (KKT) optimality conditions for the second-order cone program (SOCP) (see [4] for details):

$$\text{minimize } c^T x \quad \text{subject to } Ax = b, \ x \in K,$$

where $A \in \mathbb{R}^{m \times n}$ has full row rank, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$. Many solution methods have been proposed for solving SOCCP; see [12] for a survey. For example, the merit function approach, based on reformulating the SOCCP as an unconstrained smooth minimization problem, is studied in [4, 6, 38]. In this approach, one seeks a smooth function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ such that

$$\psi(x, y) = 0 \iff \langle x, y \rangle = 0, \ x \in K^n, \ y \in K^n. \tag{1}$$

Then, the SOCCP can be expressed as an unconstrained smooth (global) minimization problem:

$$\min_{\zeta \in \mathbb{R}^n} \psi(\zeta, F(\zeta)). \tag{2}$$

In fact, a function $\psi$ satisfying the condition in (1) (not necessarily smooth) is called a complementarity function for SOCCP (or a complementarity function associated with $K^n$).

Various gradient methods such as conjugate gradient methods and quasi-Newton methods [2, 20] can be applied for solving (2). In general, for this approach to be effective, the choice of the complementarity function $\psi$ is also crucial.

Back to the complementarity functions for NCP, two popular choices of NCP-functions are the well-known Fischer-Burmeister function (FB function, in short) $\phi_{FB} : \mathbb{R}^2 \to \mathbb{R}$ defined by (see [23, 24])

$$\phi_{FB}(a, b) = \sqrt{a^2 + b^2} - (a + b),$$

and the squared norm of the Fischer-Burmeister function given by

$$\psi_{FB}(a, b) = \frac{1}{2} \left| \phi_{FB}(a, b) \right|^2.$$

In addition, the generalized Fischer-Burmeister function $\phi_p : \mathbb{R}^2 \to \mathbb{R}$, which includes the Fischer-Burmeister function as a special case, is considered in [5, 7, 8, 11, 30, 42]. In particular, the function $\phi_p$ is a natural “continuous extension” of $\phi_{FB}$, in which the 2-norm in $\phi_{FB}(a, b)$ is replaced by a general $p$-norm. In other words, $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ is defined as

$$\phi_p(a, b) = \|(a, b)\|_p - (a + b), \quad p > 1, \tag{3}$$

and its geometric view is depicted in [42]. The effect of perturbing $p$ for different kinds of algorithms is investigated in [9–11, 14, 15]. We point out that the generalized Fischer-Burmeister function $\phi_p$ given in (3) is not differentiable, whereas its squared norm is smooth, so the latter is usually adopted as a differentiable NCP-function [38]. Moreover, all the aforementioned functions, including the Fischer-Burmeister function, the generalized Fischer-Burmeister function and their squared norms, can be extended to the setting of SOCCP via Jordan algebra.

A different type of popular NCP-function is the natural residual function $\phi_{NR} : \mathbb{R}^2 \to \mathbb{R}$ given by

$$\phi_{NR}(a, b) = a - (a - b)_+ = \min\{a, b\}.$$

Recently, Chen et al. proposed a family of generalized natural residual functions $\phi^p_{NR}$ defined by

$$\phi^p_{NR}(a, b) = a^p - (a - b)^p_+,$$

where $p > 1$ is a positive odd integer, $(a - b)^p_+ = [(a - b)_+]^p$, and $(a - b)_+ = \max\{a - b, 0\}$. When $p = 1$, $\phi^p_{NR}$ reduces to the natural residual function $\phi_{NR}$, i.e.,

$$\phi^1_{NR}(a, b) = a - (a - b)_+ = \min\{a, b\} = \phi_{NR}(a, b).$$

As remarked in [16], this extension is a “discrete generalization”, not a “continuous generalization”. Nonetheless, it is, surprisingly, twice differentiable, so the squared norm of $\phi^p_{NR}$ is not needed. Based on this discrete generalization, two families of NCP-functions with symmetric surfaces are further proposed in [3]. In view of this, it is very natural to ask whether there is a similar “discrete extension” for the Fischer-Burmeister function. We answer this question affirmatively.

In this paper, we apply the idea of “discrete generalization” to the Fischer-Burmeister function, which gives the following function (denoted by $\phi^p_{D-FB}$):

$$\phi^p_{D-FB}(a, b) = \left( \sqrt{a^2 + b^2} \right)^p - (a + b)^p, \tag{4}$$

where $p > 1$ is a positive odd integer and $(a, b) \in \mathbb{R}^2$. Notice that when $p = 1$, $\phi^p_{D-FB}$ reduces to the Fischer-Burmeister function. In Section 3, we will see that $\phi^p_{D-FB}$ is an NCP-function and is twice differentiable directly, without taking its squared norm. Note that if $p$ is even, it is no longer an NCP-function. Even though we have the feature of differentiability, we point out that the Newton method may not be applied directly because the Jacobian at a degenerate solution to NCP is singular (see [32, 33]). Nonetheless, this feature allows many methods, such as derivative-free algorithms, to be employed directly for solving NCP. In addition, we investigate the differentiability properties of $\phi^p_{D-FB}$ and the computable formulas for its gradient and Jacobian. In order to gain more insight into this new family of NCP-functions, we also depict the surfaces of $\phi^p_{D-FB}(a, b)$ with various values of $p$.
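As a quick numerical illustration of definition (4), the following Python sketch evaluates the function on a small integer grid. It is not part of the original paper: the names `phi_fb`, `phi_dfb`, `is_complementary` and the sample grid are hypothetical choices made only for this check, and a finite grid can illustrate but not prove the NCP-function property.

```python
import math

def phi_fb(a: float, b: float) -> float:
    """Fischer-Burmeister function: sqrt(a^2 + b^2) - (a + b)."""
    return math.hypot(a, b) - (a + b)

def phi_dfb(a: float, b: float, p: int) -> float:
    """Discrete-type generalization (4): (sqrt(a^2 + b^2))^p - (a + b)^p."""
    return math.hypot(a, b) ** p - (a + b) ** p

def is_complementary(a: float, b: float) -> bool:
    """True when a >= 0, b >= 0 and ab = 0."""
    return a >= 0 and b >= 0 and a * b == 0

# Hypothetical sample grid: integer points of [-3, 3] x [-3, 3].
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

# p = 1 recovers the Fischer-Burmeister function.
reduces_to_fb = all(phi_dfb(a, b, 1) == phi_fb(a, b) for a, b in grid)

# For odd p, the zero set on the grid coincides with the complementarity set.
odd_ok = all((abs(phi_dfb(a, b, p)) < 1e-12) == is_complementary(a, b)
             for p in (3, 5, 7) for a, b in grid)

# For even p the equivalence fails: the value at (-5, 0) vanishes for p = 2.
even_counterexample = phi_dfb(-5.0, 0.0, 2)
```

Here `odd_ok` compares the zero set with the complementarity set point by point, while `even_counterexample` shows why even exponents are excluded.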

In Section 4, we show that the new function $\phi^p_{D-FB}$ can be further employed in the SOCCP setting as complementarity functions and merit functions. In other words, in terms of Jordan algebra, we define $\phi^p_{D-FB} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ by

$$\phi^p_{D-FB}(x, y) = \left( \sqrt{x^2 + y^2} \right)^p - (x + y)^p, \tag{5}$$

where $p > 1$ is a positive odd integer, $x \in \mathbb{R}^n$, $y \in \mathbb{R}^n$, $x^2 = x \circ x$ is the Jordan product of $x$ with itself, and $\sqrt{x}$ with $x \in K^n$ is the unique vector such that $\sqrt{x} \circ \sqrt{x} = x$.

We prove that each $\phi^p_{D-FB}(x, y)$ is a complementarity function associated with $K^n$ and establish formulas for its gradient and Jacobian. These properties and formulas can be used to design and analyze non-interior continuation methods for solving second-order cone programs and complementarity problems. In addition, several variants of $\phi^p_{D-FB}$ are also shown to be complementarity functions for SOCCP.

Throughout the paper, we assume K = Kⁿ for simplicity and all the analysis can be carried over to the case where K is a product of second-order cones without difficulty.

The following notations will be used. The identity matrix is denoted by $I$ and $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors. For any given $x \in \mathbb{R}^n$ with $n > 1$, we write $x = (x_1, x_2)$ where $x_1$ is the first entry of $x$ and $x_2$ is the subvector that consists of the remaining entries. For every differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For every differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x)$ is an $n \times m$ matrix which denotes the transposed Jacobian of $F$ at $x$. For nonnegative scalar functions $\alpha$ and $\beta$, we write $\alpha = o(\beta)$ to mean $\lim_{\beta \to 0} \frac{\alpha}{\beta} = 0$.


2 Preliminaries

In this section, we review some background materials about the Jordan algebra in [19, 25].

Then, we present some technical lemmas which are needed in subsequent analysis.

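For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define the Jordan product associated with $K^n$ as

$$x \circ y := \left( \langle x, y \rangle, \ y_1 x_2 + x_1 y_2 \right).$$

The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. For any given $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define the symmetric matrix

$$L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix},$$

which can be viewed as a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$. It is easy to verify that $L_x y = x \circ y$ for all $y \in \mathbb{R}^n$.

Moreover, $L_x$ is invertible for $x \succ_{K^n} 0$ and

$$L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix},$$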

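where $\det(x) = x_1^2 - \|x_2\|^2$. We next recall from [12, 25] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization, associated with $K^n$, of the form

$$x = \lambda_1 u^{(1)} + \lambda_2 u^{(2)}, \tag{6}$$

where $\lambda_1, \lambda_2$ and $u^{(1)}, u^{(2)}$ are the spectral values and the associated spectral vectors of $x$ given by

$$\lambda_i = x_1 + (-1)^i \|x_2\|, \qquad u^{(i)} = \begin{cases} \dfrac{1}{2} \left( 1, \ (-1)^i \dfrac{x_2}{\|x_2\|} \right) & \text{if } x_2 \ne 0; \\[2mm] \dfrac{1}{2} \left( 1, \ (-1)^i w_2 \right) & \text{if } x_2 = 0, \end{cases}$$

for $i = 1, 2$, with $w_2$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|w_2\| = 1$. If $x_2 \ne 0$, the factorization is unique.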
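The Jordan product, the matrix $L_x$, its closed-form inverse, and the spectral factorization (6) can be checked numerically. The following NumPy sketch is an illustration only, not code from the paper; the helper names `jordan`, `arrow`, `arrow_inv`, `spectral` and the sample vectors are hypothetical, and the choice of the unit vector used when $x_2 = 0$ is an arbitrary assumption.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2) associated with K^n."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def arrow(x):
    """Symmetric matrix L_x, so that L_x @ y equals x o y."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def arrow_inv(x):
    """Closed-form inverse of L_x; assumes x is in the interior of K^n."""
    x1, x2, n = x[0], x[1:], x.size
    det = x1 ** 2 - x2 @ x2
    inv = np.empty((n, n))
    inv[0, 0] = x1
    inv[0, 1:] = -x2
    inv[1:, 0] = -x2
    inv[1:, 1:] = (det / x1) * np.eye(n - 1) + np.outer(x2, x2) / x1
    return inv / det

def spectral(x):
    """Spectral values (lambda_1, lambda_2) and vectors u1, u2 of x as in (6)."""
    norm = np.linalg.norm(x[1:])
    if norm > 0:
        w = x[1:] / norm
    else:
        w = np.zeros(x.size - 1)
        w[0] = 1.0  # any unit vector works when x2 = 0 (arbitrary choice)
    lam = np.array([x[0] - norm, x[0] + norm])
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0], w))
    return lam, u1, u2

# Hypothetical sample data: x lies in the interior of K^3, y is arbitrary.
x = np.array([3.0, 1.0, -2.0])
y = np.array([0.5, -1.0, 4.0])
lam, u1, u2 = spectral(x)
reconstructed = lam[0] * u1 + lam[1] * u2
```

In this sketch, `arrow(x) @ y` reproduces `jordan(x, y)`, and `reconstructed` recovers `x` from its spectral values and vectors.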

Given a real-valued function g : R → R, we can define a vector-valued SOC-function g^soc : Rⁿ→ Rⁿ by

g^soc(x) := g(λ₁)u⁽¹⁾+ g(λ₂)u⁽²⁾.


If $g$ is defined on a subset of $\mathbb{R}$, then $g^{soc}$ is defined on the corresponding subset of $\mathbb{R}^n$. The definition of $g^{soc}$ is unambiguous whether $x_2 \ne 0$ or $x_2 = 0$. In this paper, we will often use the vector-valued functions corresponding to $t^p$ ($t \in \mathbb{R}$) and $\sqrt{t}$ ($t \ge 0$), respectively, which are expressed as

$$x^p := (\lambda_1(x))^p u^{(1)} + (\lambda_2(x))^p u^{(2)}, \quad \forall x \in \mathbb{R}^n,$$

$$\sqrt{x} := \sqrt{\lambda_1(x)} \, u^{(1)} + \sqrt{\lambda_2(x)} \, u^{(2)}, \quad \forall x \in K^n.$$

We will see that the above two vector-valued functions play a role in showing that $\phi^p_{D-FB}$ given in (5) is well-defined in the SOC setting for any $x, y \in \mathbb{R}^n$. Note that the other way to define $x^p$ and $\sqrt{x}$ is through the Jordan product. In other words, $x^p$ represents $x \circ x \circ \cdots \circ x$ ($p$ times) and $\sqrt{x} \in K^n$ satisfies $\sqrt{x} \circ \sqrt{x} = x$.
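The agreement between the spectral definitions of $x^p$ and $\sqrt{x}$ and their Jordan-product counterparts can be observed on small examples. The sketch below is illustrative only and not taken from the paper; `soc_apply`, `soc_power`, `soc_sqrt` and the sample vectors are hypothetical names and data.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x o y associated with K^n."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def soc_apply(g, x):
    """SOC-function g^soc(x) = g(lambda_1) u1 + g(lambda_2) u2."""
    norm = np.linalg.norm(x[1:])
    if norm > 0:
        w = x[1:] / norm
    else:
        w = np.zeros(x.size - 1)
        w[0] = 1.0  # arbitrary unit vector when x2 = 0
    g1, g2 = g(x[0] - norm), g(x[0] + norm)
    # u1 = (1, -w)/2 and u2 = (1, w)/2, so the sum collapses to two blocks.
    return np.concatenate(([0.5 * (g1 + g2)], 0.5 * (g2 - g1) * w))

def soc_power(x, p):
    """x^p defined through the spectral values of x."""
    return soc_apply(lambda t: t ** p, x)

def soc_sqrt(x):
    """sqrt(x) for x in K^n, defined through the spectral values of x."""
    return soc_apply(np.sqrt, x)

x = np.array([3.0, 1.0, -2.0])   # hypothetical point in the interior of K^3
z = np.array([1.0, 2.0, 0.0])    # hypothetical point outside K^3

cube_spectral = soc_power(z, 3)          # spectral definition of z^3
cube_jordan = jordan(jordan(z, z), z)    # z o z o z
root = soc_sqrt(x)
root_squared = jordan(root, root)        # should recover x
```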

Lemma 2.1. Suppose that p = 2k + 1 where k = 1, 2, 3, · · · . Then, for any u, v ∈ R, we have u^p = v^p if and only if u = v.

Proof. The proof is straightforward and can be found in [1, Theorem 1.12]. Here, we provide an alternative proof.

“⇐” It is trivial.

“⇒” For $v = 0$, since $u^p = v^p$, we have $u = v = 0$. For $v \ne 0$, since $f(t) = t^p - 1$ is strictly increasing on $\mathbb{R}$, we have $\left( \frac{u}{v} \right)^p - 1 = 0$ if and only if $\frac{u}{v} = 1$, which implies $u = v$. Thus, the proof is complete. □

Lemma 2.2. For p = 2m + 1 with m = 1, 2, 3, · · · and x = (x₁, x₂), y = (y₁, y₂) ∈ R × Rⁿ⁻¹, suppose that x^p and y^p represent x ◦ x ◦ · · · ◦ x and y ◦ y ◦ · · · ◦ y for p-times, respectively. Then, x^p = y^p if and only if x = y.

Proof. “⇐” This direction is trivial.

“⇒” Suppose that $x^p = y^p$. By the spectral decomposition (6), we write

$$x = \lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}, \qquad y = \lambda_1(y) u_y^{(1)} + \lambda_2(y) u_y^{(2)}.$$

Then, $x^p = (\lambda_1(x))^p u_x^{(1)} + (\lambda_2(x))^p u_x^{(2)}$ and $y^p = (\lambda_1(y))^p u_y^{(1)} + (\lambda_2(y))^p u_y^{(2)}$. Since $x^p = y^p$ and eigenvalues are unique, we obtain $(\lambda_1(x))^p = (\lambda_1(y))^p$ and $(\lambda_2(x))^p = (\lambda_2(y))^p$. By Lemma 2.1, this implies $\lambda_1(x) = \lambda_1(y)$ and $\lambda_2(x) = \lambda_2(y)$. Moreover, since $\{u_x^{(1)}, u_x^{(2)}\}$ and $\{u_y^{(1)}, u_y^{(2)}\}$ are Jordan frames, we have $u_x^{(1)} + u_x^{(2)} = u_y^{(1)} + u_y^{(2)} = e$, where $e$ is the identity element. From $x^p = y^p$ and $u_x^{(1)} + u_x^{(2)} = u_y^{(1)} + u_y^{(2)}$, we get

$$\left[ (\lambda_1(x))^p - (\lambda_2(x))^p \right] \left( u_x^{(1)} - u_y^{(1)} \right) = 0.$$

If $(\lambda_1(x))^p = (\lambda_2(x))^p$, we have $\lambda_1(x) = \lambda_2(x)$ and $\lambda_1(y) = \lambda_2(y)$, that is, $x = \lambda_1(x) e = y$.

Otherwise, if $(\lambda_1(x))^p \ne (\lambda_2(x))^p$, we must have $u_x^{(1)} = u_y^{(1)}$, which implies $u_x^{(2)} = u_y^{(2)}$, and hence $x = y$. □


3 New generalized Fischer-Burmeister function for NCP

In this section, we show that the function $\phi^p_{D-FB}$ defined as in (4) is an NCP-function and establish its twice differentiability. At the same time, we also depict the surfaces of $\phi^p_{D-FB}$ with various values of $p$ to gain more insight into this new family of NCP-functions.

Proposition 3.1. Let $\phi^p_{D-FB}$ be defined as in (4) where $p$ is a positive odd integer. Then, $\phi^p_{D-FB}$ is an NCP-function.

Proof. Suppose $\phi^p_{D-FB}(a, b) = 0$, which says $\left( \sqrt{a^2 + b^2} \right)^p = (a + b)^p$. Since $p$ is a positive odd integer, applying Lemma 2.1 gives

$$\left( \sqrt{a^2 + b^2} \right)^p = (a + b)^p \iff \sqrt{a^2 + b^2} = a + b.$$

It is well known that $\sqrt{a^2 + b^2} = a + b$ is equivalent to $a, b \ge 0$, $ab = 0$ because $\phi_{FB}$ is an NCP-function. This shows that $\phi^p_{D-FB}(a, b) = 0$ implies $a, b \ge 0$, $ab = 0$. The converse direction is trivial. Thus, we prove that $\phi^p_{D-FB}$ is an NCP-function. □

Remark 3.1: We elaborate more about the new NCP-function $\phi^p_{D-FB}$.

(a) For $p$ being an even integer, $\phi^p_{D-FB}$ is not an NCP-function. A counterexample with $p = 2$ is given below: although $a = -5 < 0$,

$$\phi^2_{D-FB}(-5, 0) = \left( \sqrt{(-5)^2 + 0^2} \right)^2 - (-5 + 0)^2 = (-5)^2 - (-5)^2 = 0.$$

(b) The surface of $\phi^p_{D-FB}$ is symmetric, i.e., $\phi^p_{D-FB}(a, b) = \phi^p_{D-FB}(b, a)$.

(c) The function $\phi^p_{D-FB}(a, b)$ is positively homogeneous of degree $p$, i.e., $\phi^p_{D-FB}(\alpha(a, b)) = \alpha^p \phi^p_{D-FB}(a, b)$ for $\alpha \ge 0$.

(d) The function $\phi^p_{D-FB}$ is neither convex nor concave. To see this, take $p = 3$ and observe that

$$5^3 - 7^3 = \phi^3_{D-FB}(3, 4) > \frac{1}{2} \phi^3_{D-FB}(0, 0) + \frac{1}{2} \phi^3_{D-FB}(6, 8) = \frac{1}{2} \times 0 + \frac{1}{2} \left( 10^3 - 14^3 \right) = 4 \left( 5^3 - 7^3 \right)$$

and

$$0 = \phi^3_{D-FB}(0, 0) < \frac{1}{2} \phi^3_{D-FB}(-2, 0) + \frac{1}{2} \phi^3_{D-FB}(2, 0) = \frac{1}{2} \times 16 + \frac{1}{2} \times 0 = 8.$$
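The items of Remark 3.1 lend themselves to a direct numerical check. The sketch below is an illustration and not part of the paper; `phi_dfb` and the sample point used for the symmetry and homogeneity checks are hypothetical.

```python
import math

def phi_dfb(a: float, b: float, p: int) -> float:
    """(sqrt(a^2 + b^2))^p - (a + b)^p, as in (4)."""
    return math.hypot(a, b) ** p - (a + b) ** p

# (b) symmetry and (c) positive homogeneity at a hypothetical sample point.
a, b, p, alpha = 1.5, -2.0, 3, 2.5
symmetric = math.isclose(phi_dfb(a, b, p), phi_dfb(b, a, p))
homogeneous = math.isclose(phi_dfb(alpha * a, alpha * b, p),
                           alpha ** p * phi_dfb(a, b, p))

# (d) midpoint comparisons for p = 3.
# A convex function would satisfy mid_1 <= avg_1; here mid_1 > avg_1.
mid_1 = phi_dfb(3, 4, 3)                                    # 5^3 - 7^3
avg_1 = 0.5 * phi_dfb(0, 0, 3) + 0.5 * phi_dfb(6, 8, 3)     # 4 * (5^3 - 7^3)
# A concave function would satisfy mid_2 >= avg_2; here mid_2 < avg_2.
mid_2 = phi_dfb(0, 0, 3)
avg_2 = 0.5 * phi_dfb(-2, 0, 3) + 0.5 * phi_dfb(2, 0, 3)
```

The first comparison violates the convexity inequality at the midpoint of $(0, 0)$ and $(6, 8)$, and the second violates the concavity inequality at the midpoint of $(-2, 0)$ and $(2, 0)$.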

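Proposition 3.2. Let $\phi^p_{D-FB}$ be defined as in (4) where $p$ is a positive odd integer. Then, the following hold.

(a) For $p > 1$, $\phi^p_{D-FB}$ is continuously differentiable with

$$\nabla \phi^p_{D-FB}(a, b) = p \begin{bmatrix} a \left( \sqrt{a^2 + b^2} \right)^{p-2} - (a + b)^{p-1} \\[1mm] b \left( \sqrt{a^2 + b^2} \right)^{p-2} - (a + b)^{p-1} \end{bmatrix}.$$

(b) For $p > 3$, $\phi^p_{D-FB}$ is twice continuously differentiable with

$$\nabla^2 \phi^p_{D-FB}(a, b) = \begin{bmatrix} \dfrac{\partial^2 \phi^p_{D-FB}}{\partial a^2} & \dfrac{\partial^2 \phi^p_{D-FB}}{\partial a \partial b} \\[3mm] \dfrac{\partial^2 \phi^p_{D-FB}}{\partial b \partial a} & \dfrac{\partial^2 \phi^p_{D-FB}}{\partial b^2} \end{bmatrix},$$

where

$$\frac{\partial^2 \phi^p_{D-FB}}{\partial a^2} = p \left\{ \left[ (p - 1) a^2 + b^2 \right] \left( \sqrt{a^2 + b^2} \right)^{p-4} - (p - 1)(a + b)^{p-2} \right\},$$

$$\frac{\partial^2 \phi^p_{D-FB}}{\partial a \partial b} = p \left[ (p - 2) a b \left( \sqrt{a^2 + b^2} \right)^{p-4} - (p - 1)(a + b)^{p-2} \right] = \frac{\partial^2 \phi^p_{D-FB}}{\partial b \partial a},$$

$$\frac{\partial^2 \phi^p_{D-FB}}{\partial b^2} = p \left\{ \left[ a^2 + (p - 1) b^2 \right] \left( \sqrt{a^2 + b^2} \right)^{p-4} - (p - 1)(a + b)^{p-2} \right\}.$$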

Proof. The verification of differentiability and the computation of the first and second derivatives are straightforward, so we omit them. □
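Since the proof is omitted, one way to gain confidence in the gradient and Hessian formulas is to compare them with central finite differences at a generic point. The sketch below does this for $p = 5$; it is not from the paper, and the names `phi`, `grad`, `hess`, the sample point, and the step size `h` are hypothetical choices. Agreement at one point is only a sanity check, not a proof.

```python
import math

def phi(a, b, p):
    """phi^p_{D-FB}(a, b) as in (4)."""
    return math.hypot(a, b) ** p - (a + b) ** p

def grad(a, b, p):
    """Gradient formula of part (a); assumes (a, b) != (0, 0)."""
    r = math.hypot(a, b)
    s = (a + b) ** (p - 1)
    return (p * (a * r ** (p - 2) - s), p * (b * r ** (p - 2) - s))

def hess(a, b, p):
    """Hessian formula of part (b); assumes (a, b) != (0, 0)."""
    r = math.hypot(a, b)
    s = (p - 1) * (a + b) ** (p - 2)
    haa = p * (((p - 1) * a * a + b * b) * r ** (p - 4) - s)
    hab = p * ((p - 2) * a * b * r ** (p - 4) - s)
    hbb = p * ((a * a + (p - 1) * b * b) * r ** (p - 4) - s)
    return ((haa, hab), (hab, hbb))

# Hypothetical sample point, exponent, and finite-difference step.
a, b, p, h = 1.2, -0.7, 5, 1e-6

g = grad(a, b, p)
g_fd = ((phi(a + h, b, p) - phi(a - h, b, p)) / (2 * h),
        (phi(a, b + h, p) - phi(a, b - h, p)) / (2 * h))
grad_err = max(abs(g[0] - g_fd[0]), abs(g[1] - g_fd[1]))

H = hess(a, b, p)
# Differentiate the analytic gradient numerically, one variable at a time.
H_fd = (tuple((u - v) / (2 * h) for u, v in zip(grad(a + h, b, p), grad(a - h, b, p))),
        tuple((u - v) / (2 * h) for u, v in zip(grad(a, b + h, p), grad(a, b - h, p))))
hess_err = max(abs(H[i][j] - H_fd[i][j]) for i in range(2) for j in range(2))
```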

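Next, we present some variants of $\phi^p_{D-FB}$. Indeed, analogous to those functions in [41], the following variants of $\phi^p_{D-FB}$ can be verified to be NCP-functions:

$$\phi_1(a, b) = \phi^p_{D-FB}(a, b) - \alpha (a)_+ (b)_+, \quad \alpha > 0.$$

$$\phi_2(a, b) = \phi^p_{D-FB}(a, b) - \alpha \left( (a)_+ (b)_+ \right)^2, \quad \alpha > 0.$$

$$\phi_3(a, b) = \left[ \phi^p_{D-FB}(a, b) \right]^2 + \alpha \left( (ab)_+ \right)^4, \quad \alpha > 0.$$

$$\phi_4(a, b) = \left[ \phi^p_{D-FB}(a, b) \right]^2 + \alpha \left( (ab)_+ \right)^2, \quad \alpha > 0.$$

In the above expressions, for any $t \in \mathbb{R}$, we define $t_+$ as $\max\{0, t\}$.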

Lemma 3.1. Let $\phi^p_{D-FB}$ be defined as in (4) where $p$ is a positive odd integer. Then, the value of $\phi^p_{D-FB}(a, b)$ is negative only in the first quadrant, i.e., $\phi^p_{D-FB}(a, b) < 0$ if and only if $a > 0$, $b > 0$.

Proof. We know that $f(t) = t^p$ is a strictly increasing function when $p$ is odd. Using this fact yields

$$\begin{aligned} a > 0, \ b > 0 &\iff a + b > 0 \ \text{and} \ ab > 0 \\ &\iff \sqrt{a^2 + b^2} < a + b \\ &\iff \left( \sqrt{a^2 + b^2} \right)^p < (a + b)^p \\ &\iff \phi^p_{D-FB}(a, b) < 0, \end{aligned}$$

which proves the desired result. □
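The second equivalence in the chain above is the only step that is not immediate. One way to see it, sketched here as a supplementary worked step rather than as part of the original proof, is to compare squares:

```latex
% Forward direction: assume a + b > 0 and ab > 0.
\begin{align*}
(a+b)^2 - \bigl(\sqrt{a^2+b^2}\bigr)^2 &= 2ab > 0
  \quad\Longrightarrow\quad \sqrt{a^2+b^2} < |a+b| = a+b .
\end{align*}
% Converse: assume \sqrt{a^2+b^2} < a + b.
% The left-hand side is nonnegative, so a + b > 0, and squaring both sides gives
\begin{align*}
a^2 + b^2 < a^2 + 2ab + b^2 \quad\Longrightarrow\quad ab > 0 .
\end{align*}
```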

Proposition 3.3. All the above functions φ_i for i ∈ {1, 2, 3, 4} are NCP-functions.

Proof. Applying Lemma 3.1, the arguments are similar to those in [16, Proposition 2.4] and are omitted here. □

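In fact, in light of Lemma 2.1, we can construct more variants of $\phi^p_{D-FB}$, which are also new NCP-functions. More specifically, let $k$ and $m$ be positive integers, and let $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ with $g(a, b) \ne 0$ for all $a, b \in \mathbb{R}$. Then the following functions are new variants of $\phi^p_{D-FB}$:

$$\phi_5(a, b) = \left[ g(a, b) \left( \sqrt{a^2 + b^2} + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} - \left[ g(a, b) \left( a + b + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}}.$$

$$\phi_6(a, b) = \left[ g(a, b) \left( \sqrt{a^2 + b^2} - a - b \right) \right]^{\frac{k}{m}}.$$

$$\phi_7(a, b) = \left[ g(a, b) \left( \sqrt{a^2 + b^2} - a + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} - \left[ g(a, b) \left( b + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}}.$$

$$\phi_8(a, b) = \left[ g(a, b) \left( \sqrt{a^2 + b^2} - b + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} - \left[ g(a, b) \left( a + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}}.$$

$$\phi_9(a, b) = e^{\phi_i(a, b)} - 1, \quad \text{where } i = 5, 6, 7, 8.$$

$$\phi_{10}(a, b) = \ln \left( |\phi_i(a, b)| + 1 \right), \quad \text{where } i = 5, 6, 7, 8.$$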

Proposition 3.4. All the above functions φ_i for i ∈ {5, 6, 7, 8, 9, 10} are NCP-functions.

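Proof. This is an immediate consequence of Propositions 3.1-3.3. By Lemma 2.1 and $g(a, b) \ne 0$ for $a, b \in \mathbb{R}$, we have

$$\begin{aligned} \phi_5(a, b) = 0 &\iff \left[ g(a, b) \left( \sqrt{a^2 + b^2} + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} = \left[ g(a, b) \left( a + b + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} \\ &\iff \left\{ \left[ g(a, b) \left( \sqrt{a^2 + b^2} + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} \right\}^{2m+1} = \left\{ \left[ g(a, b) \left( a + b + f(a, b) \right) \right]^{\frac{2k+1}{2m+1}} \right\}^{2m+1} \\ &\iff \left[ g(a, b) \left( \sqrt{a^2 + b^2} + f(a, b) \right) \right]^{2k+1} = \left[ g(a, b) \left( a + b + f(a, b) \right) \right]^{2k+1} \\ &\iff g(a, b) \left( \sqrt{a^2 + b^2} + f(a, b) \right) = g(a, b) \left( a + b + f(a, b) \right) \\ &\iff \sqrt{a^2 + b^2} + f(a, b) = a + b + f(a, b) \\ &\iff \sqrt{a^2 + b^2} = a + b. \end{aligned}$$

The arguments for the other functions $\phi_i$, $i \in \{6, 7, 8, 9, 10\}$, are similar to those for $\phi_5$. □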

According to the above results, we immediately obtain the following theorem.

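Theorem 3.1. Suppose that $\phi(a, b) = \varphi_1(a, b) - \varphi_2(a, b)$ is an NCP-function on $\mathbb{R} \times \mathbb{R}$ and $k$ and $m$ are positive integers. Then, $\left[ \phi(a, b) \right]^{\frac{k}{m}}$ and $\left[ \varphi_1(a, b) \right]^{\frac{2k+1}{2m+1}} - \left[ \varphi_2(a, b) \right]^{\frac{2k+1}{2m+1}}$ are NCP-functions.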

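Proof. Since $k$ and $m$ are positive integers, applying Lemma 2.1 gives

$$\begin{aligned} \left[ \phi(a, b) \right]^{\frac{k}{m}} = 0 &\iff \left\{ \left[ \phi(a, b) \right]^{\frac{k}{m}} \right\}^m = 0 \\ &\iff \left[ \phi(a, b) \right]^k = 0 \\ &\iff \phi(a, b) = 0. \end{aligned}$$

Similarly, we have

$$\begin{aligned} \left[ \varphi_1(a, b) \right]^{\frac{2k+1}{2m+1}} - \left[ \varphi_2(a, b) \right]^{\frac{2k+1}{2m+1}} = 0 &\iff \left[ \varphi_1(a, b) \right]^{\frac{2k+1}{2m+1}} = \left[ \varphi_2(a, b) \right]^{\frac{2k+1}{2m+1}} \\ &\iff \left\{ \left[ \varphi_1(a, b) \right]^{\frac{2k+1}{2m+1}} \right\}^{2m+1} = \left\{ \left[ \varphi_2(a, b) \right]^{\frac{2k+1}{2m+1}} \right\}^{2m+1} \\ &\iff \left[ \varphi_1(a, b) \right]^{2k+1} = \left[ \varphi_2(a, b) \right]^{2k+1} \\ &\iff \varphi_1(a, b) = \varphi_2(a, b) \\ &\iff \phi(a, b) = 0. \end{aligned}$$

The above arguments together with the assumption of $\phi(a, b)$ being an NCP-function yield the desired result. □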

Remark 3.2: We elaborate more about Theorem 3.1.

(a) Based on the existing well-known NCP-functions, we can construct new NCP-functions in light of Theorem 3.1. This is a novel way to construct new NCP-functions.

(b) When $k$ is a positive integer, $\left[ \phi(a, b) \right]^k$ is an NCP-function. This means that perturbing the parameter $k$ gives new NCP-functions. In addition, if $\phi(a, b)$ is an NCP-function, then for any positive integer $m$, $\left[ \phi(a, b) \right]^{\frac{k}{m}}$ is also an NCP-function. Thus, we can determine suitable and nice NCP-functions among these functions according to their numerical performance.
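To make the construction of Theorem 3.1 concrete, the sketch below builds the function $[\varphi_1]^{\frac{2k+1}{2m+1}} - [\varphi_2]^{\frac{2k+1}{2m+1}}$ from the Fischer-Burmeister splitting $\varphi_1(a, b) = \sqrt{a^2 + b^2}$, $\varphi_2(a, b) = a + b$. It is an illustration only: the names `odd_power` and `phi_new` are hypothetical, and evaluating the fractional power as $\mathrm{sign}(t)\,|t|^{q}$ is an assumption about how the real odd root is meant to be computed.

```python
import math

def odd_power(t: float, k: int, m: int) -> float:
    """Real value of t^((2k+1)/(2m+1)), taken as sign(t) * |t|^((2k+1)/(2m+1)).

    Assumption: with odd numerator and odd denominator, the real branch
    keeps the sign of t.
    """
    q = (2 * k + 1) / (2 * m + 1)
    return math.copysign(abs(t) ** q, t)

def phi_new(a: float, b: float, k: int, m: int) -> float:
    """phi1^q - phi2^q with phi1 = sqrt(a^2 + b^2) and phi2 = a + b."""
    return odd_power(math.hypot(a, b), k, m) - odd_power(a + b, k, m)

def is_complementary(a: float, b: float) -> bool:
    """True when a >= 0, b >= 0 and ab = 0."""
    return a >= 0 and b >= 0 and a * b == 0

# Hypothetical integer grid and exponent choice k = 2, m = 1 (q = 5/3).
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
zero_set_ok = all((abs(phi_new(a, b, 2, 1)) < 1e-12) == is_complementary(a, b)
                  for a, b in grid)
```

On this grid the zero set of `phi_new` matches the complementarity set, in line with the theorem; other choices of $k$ and $m$ can be tried in the same way.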

To close this section, we depict the surfaces of $\phi^p_{D-FB}$ with different values of $p$ so that we may gain deeper insight into this new family of NCP-functions. Figure 1 shows the surface of $\phi_{D-FB}(a, b)$, from which we see that it is convex. Figure 2 presents the surface of $\phi^3_{D-FB}(a, b)$, which is neither convex nor concave, as mentioned in Remark 3.1(d). In addition, the value of $\phi^p_{D-FB}(a, b)$ is negative only when $a > 0$ and $b > 0$, as mentioned in Lemma 3.1. The surfaces of $\phi^p_{D-FB}$ with various values of $p$ are shown in Figure 3.

Figure 1: The surface of $z = \phi_{D-FB}(a, b)$ with $(a, b) \in [-10, 10] \times [-10, 10]$ (axes: $a$, $b$, $z$).

4 Extending $\phi^p_{D-FB}$ and $\phi^p_{NR}$ to SOCCP

In this section, we extend the new functions $\phi^p_{D-FB}$ and $\phi^p_{NR}$ to the SOC setting. More specifically, we show that the functions $\phi^p_{D-FB}$ and $\phi^p_{NR}$ are complementarity functions associated with $K^n$. In addition, we present the computing formulas for their Jacobians.

Figure 2: The surface of $z = \phi^3_{D-FB}(a, b)$ with $(a, b) \in [-10, 10] \times [-10, 10]$ (axes: $a$, $b$, $z$).

Proposition 4.1. Let $\phi^p_{D-FB}$ be defined by (5). Then, $\phi^p_{D-FB}$ is a complementarity function associated with $K^n$, i.e., it satisfies

$$\phi^p_{D-FB}(x, y) = 0 \iff x \in K^n, \ y \in K^n, \ \langle x, y \rangle = 0.$$

Proof. Suppose $\phi^p_{D-FB}(x, y) = 0$; then $\left( \sqrt{x^2 + y^2} \right)^p = (x + y)^p$. Since $p$ is a positive odd integer, applying Lemma 2.2 yields

$$\left( \sqrt{x^2 + y^2} \right)^p = (x + y)^p \iff \sqrt{x^2 + y^2} = x + y.$$

It is known that $\phi_{FB}(x, y) := \sqrt{x^2 + y^2} - (x + y)$ is a complementarity function associated with $K^n$. This indicates that $\phi^p_{D-FB}$ is a complementarity function associated with $K^n$. □

With a similar technique, we can prove that $\phi^p_{NR}$ can be extended as a complementarity function for SOCCP.

Proposition 4.2. The function $\phi^p_{NR} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ defined by

$$\phi^p_{NR}(x, y) = x^p - \left[ (x - y)_+ \right]^p \tag{7}$$

is a complementarity function associated with $K^n$, where $p > 1$ is a positive odd integer and $(\cdot)_+$ means the projection onto $K^n$.

Proof. From Lemma 2.2, we see that $\phi^p_{NR}(x, y) = 0$ if and only if $x = (x - y)_+$. On the other hand, it is known that $\phi_{NR}(x, y) = x - (x - y)_+$ is a complementarity function for SOCCP, which implies $x - (x - y)_+ = 0$ if and only if $x \in K^n$, $y \in K^n$, and $\langle x, y \rangle = 0$. Hence, $\phi^p_{NR}$ is a complementarity function associated with $K^n$. □

Figure 3: The surface of $z = \phi^p_{D-FB}(a, b)$ with different values of $p$. Panels: (a) $z = \phi^3_{D-FB}(a, b)$; (b) $z = \phi^5_{D-FB}(a, b)$; (c) $z = \phi^7_{D-FB}(a, b)$; (d) $z = \phi^9_{D-FB}(a, b)$ (axes: $a$, $b$, $z$).
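Propositions 4.1 and 4.2 can be illustrated on a small instance in $K^3$. The NumPy sketch below is not from the paper; the helper names (`soc_apply`, `jordan_sq`, `phi_dfb_soc`, `proj_soc`, `phi_nr_soc`) and the test vectors are hypothetical, and the clipping inside the square root is an added safeguard against round-off rather than part of definition (5).

```python
import numpy as np

def soc_apply(g, x):
    """SOC-function g^soc(x) = g(lambda_1) u1 + g(lambda_2) u2."""
    norm = np.linalg.norm(x[1:])
    if norm > 0:
        w = x[1:] / norm
    else:
        w = np.zeros(x.size - 1)
        w[0] = 1.0  # arbitrary unit vector when x2 = 0
    g1, g2 = g(x[0] - norm), g(x[0] + norm)
    return np.concatenate(([0.5 * (g1 + g2)], 0.5 * (g2 - g1) * w))

def jordan_sq(x):
    """x o x = (||x||^2, 2 * x1 * x2)."""
    return np.concatenate(([x @ x], 2.0 * x[0] * x[1:]))

def phi_dfb_soc(x, y, p):
    """(sqrt(x^2 + y^2))^p - (x + y)^p in the sense of (5)."""
    w = jordan_sq(x) + jordan_sq(y)
    root = soc_apply(lambda t: np.sqrt(max(t, 0.0)), w)  # clip guards round-off
    return soc_apply(lambda t: t ** p, root) - soc_apply(lambda t: t ** p, x + y)

def proj_soc(x):
    """Projection (x)_+ onto K^n via the spectral values."""
    return soc_apply(lambda t: max(t, 0.0), x)

def phi_nr_soc(x, y, p):
    """x^p - [(x - y)_+]^p as in (7)."""
    return soc_apply(lambda t: t ** p, x) - soc_apply(lambda t: t ** p, proj_soc(x - y))

# Hypothetical complementary pair: both on the boundary of K^3, <x, y> = 0.
x = np.array([1.0, 1.0, 0.0])
y = np.array([1.0, -1.0, 0.0])
# Hypothetical non-complementary pair: e in K^3 but <e, e> = 1.
e = np.array([1.0, 0.0, 0.0])

res_dfb = phi_dfb_soc(x, y, 3)
res_nr = phi_nr_soc(x, y, 3)
bad_dfb = phi_dfb_soc(e, e, 3)
bad_nr = phi_nr_soc(e, e, 3)
```

Both functions vanish at the complementary pair and are nonzero at the pair that violates $\langle x, y \rangle = 0$.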

In order to compute the Jacobian of φ^p_D−FB, we need to introduce some notations for convenience. For any x = (x₁, x₂) ∈ R × Rⁿ⁻¹ and y = (y₁, y₂) ∈ R × Rⁿ⁻¹, we define

w(x, y) := x²+ y² = (w₁(x, y), w₂(x, y)) ∈ R × Rⁿ⁻¹ and v(x, y) := x + y.

Then, it is clear that w(x, y) ∈ Kⁿ and λ_i(w) ≥ 0, i = 1, 2.

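Proposition 4.3. Let $\phi^p_{D-FB}$ be defined as in (5), and let $g^{soc}(x) = \left( \sqrt{|x|} \right)^p$ and $h^{soc}(x) = x^p$ be the vector-valued functions corresponding to $g(t) = |t|^{\frac{p}{2}}$ and $h(t) = t^p$ for $t \in \mathbb{R}$, respectively. Then, $\phi^p_{D-FB}$ is continuously differentiable at any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. Moreover, we have

$$\nabla_x \phi^p_{D-FB}(x, y) = 2 L_x \nabla g^{soc}(w) - \nabla h^{soc}(v),$$

$$\nabla_y \phi^p_{D-FB}(x, y) = 2 L_y \nabla g^{soc}(w) - \nabla h^{soc}(v),$$

where $w := w(x, y) = x^2 + y^2$, $v := v(x, y) = x + y$, $t \mapsto \mathrm{sign}(t)$ is the sign function, and

$$\nabla g^{soc}(w) = \begin{cases} \dfrac{p}{2} |w_1|^{\frac{p}{2} - 1} \cdot \mathrm{sign}(w_1) I & \text{if } w_2 = 0; \\[3mm] \begin{bmatrix} b_1(w) & c_1(w) \bar{w}_2^T \\ c_1(w) \bar{w}_2 & a_1(w) I + (b_1(w) - a_1(w)) \bar{w}_2 \bar{w}_2^T \end{bmatrix} & \text{if } w_2 \ne 0; \end{cases}$$

$$\bar{w}_2 = \frac{w_2}{\|w_2\|}, \qquad a_1(w) = \frac{|\lambda_2(w)|^{\frac{p}{2}} - |\lambda_1(w)|^{\frac{p}{2}}}{\lambda_2(w) - \lambda_1(w)},$$

$$b_1(w) = \frac{p}{4} \left[ |\lambda_2(w)|^{\frac{p}{2} - 1} + |\lambda_1(w)|^{\frac{p}{2} - 1} \right], \qquad c_1(w) = \frac{p}{4} \left[ |\lambda_2(w)|^{\frac{p}{2} - 1} - |\lambda_1(w)|^{\frac{p}{2} - 1} \right],$$

and

$$\nabla h^{soc}(v) = \begin{cases} p v_1^{p-1} I & \text{if } v_2 = 0; \\[2mm] \begin{bmatrix} b_2(v) & c_2(v) \bar{v}_2^T \\ c_2(v) \bar{v}_2 & a_2(v) I + (b_2(v) - a_2(v)) \bar{v}_2 \bar{v}_2^T \end{bmatrix} & \text{if } v_2 \ne 0; \end{cases} \tag{8}$$

$$\bar{v}_2 = \frac{v_2}{\|v_2\|}, \tag{9}$$

$$a_2(v) = \frac{(\lambda_2(v))^p - (\lambda_1(v))^p}{\lambda_2(v) - \lambda_1(v)}, \tag{10}$$

$$b_2(v) = \frac{p}{2} \left[ (\lambda_2(v))^{p-1} + (\lambda_1(v))^{p-1} \right], \tag{11}$$

$$c_2(v) = \frac{p}{2} \left[ (\lambda_2(v))^{p-1} - (\lambda_1(v))^{p-1} \right]. \tag{12}$$

Proof. From the definition of $\phi^p_{D-FB}$, it is clear that for any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$,

$$\begin{aligned} \phi^p_{D-FB}(x, y) &= \left( \sqrt{x^2 + y^2} \right)^p - (x + y)^p \\ &= \left( \sqrt{|x^2 + y^2|} \right)^p - (x + y)^p \\ &= \left[ |\lambda_1(w)|^{\frac{p}{2}} u^{(1)}(w) + |\lambda_2(w)|^{\frac{p}{2}} u^{(2)}(w) \right] - \left[ (\lambda_1(v))^p u^{(1)}(v) + (\lambda_2(v))^p u^{(2)}(v) \right] \\ &= g^{soc}(w) - h^{soc}(v). \end{aligned} \tag{13}$$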

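For $p \ge 3$, since both $|t|^{\frac{p}{2}}$ and $t^p$ are continuously differentiable on $\mathbb{R}$, by [13, Proposition 5] and [25, Proposition 5.2], we know that the functions $g^{soc}$ and $h^{soc}$ are continuously differentiable on $\mathbb{R}^n$. Moreover, it is clear that $w(x, y) = x^2 + y^2$ is continuously differentiable on $\mathbb{R}^n \times \mathbb{R}^n$, so we conclude that $\phi^p_{D-FB}$ is continuously differentiable. Furthermore, from the formulas in [13, Proposition 4] and [25, Proposition 5.2], we have

$$\nabla g^{soc}(w) = \begin{cases} \dfrac{p}{2} |w_1|^{\frac{p}{2} - 1} \cdot \mathrm{sign}(w_1) I & \text{if } w_2 = 0; \\[3mm] \begin{bmatrix} b_1(w) & c_1(w) \bar{w}_2^T \\ c_1(w) \bar{w}_2 & a_1(w) I + (b_1(w) - a_1(w)) \bar{w}_2 \bar{w}_2^T \end{bmatrix} & \text{if } w_2 \ne 0; \end{cases}$$

$$\nabla h^{soc}(v) = \begin{cases} p v_1^{p-1} I & \text{if } v_2 = 0; \\[2mm] \begin{bmatrix} b_2(v) & c_2(v) \bar{v}_2^T \\ c_2(v) \bar{v}_2 & a_2(v) I + (b_2(v) - a_2(v)) \bar{v}_2 \bar{v}_2^T \end{bmatrix} & \text{if } v_2 \ne 0; \end{cases}$$

where

$$\bar{w}_2 = \frac{w_2}{\|w_2\|}, \qquad \bar{v}_2 = \frac{v_2}{\|v_2\|},$$

$$a_1(w) = \frac{|\lambda_2(w)|^{\frac{p}{2}} - |\lambda_1(w)|^{\frac{p}{2}}}{\lambda_2(w) - \lambda_1(w)}, \qquad a_2(v) = \frac{(\lambda_2(v))^p - (\lambda_1(v))^p}{\lambda_2(v) - \lambda_1(v)},$$

$$b_1(w) = \frac{p}{4} \left[ |\lambda_2(w)|^{\frac{p}{2} - 1} + |\lambda_1(w)|^{\frac{p}{2} - 1} \right], \qquad b_2(v) = \frac{p}{2} \left[ (\lambda_2(v))^{p-1} + (\lambda_1(v))^{p-1} \right],$$

$$c_1(w) = \frac{p}{4} \left[ |\lambda_2(w)|^{\frac{p}{2} - 1} - |\lambda_1(w)|^{\frac{p}{2} - 1} \right], \qquad c_2(v) = \frac{p}{2} \left[ (\lambda_2(v))^{p-1} - (\lambda_1(v))^{p-1} \right].$$

By differentiating both sides of (13) with respect to $x$ and $y$, respectively, and applying the chain rule, it follows that

$$\nabla_x \phi^p_{D-FB}(x, y) = 2 L_x \nabla g^{soc}(w) - \nabla h^{soc}(v),$$

$$\nabla_y \phi^p_{D-FB}(x, y) = 2 L_y \nabla g^{soc}(w) - \nabla h^{soc}(v).$$

Hence, the proof is complete. □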
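The gradient formula $\nabla_x \phi^p_{D-FB}(x, y) = 2 L_x \nabla g^{soc}(w) - \nabla h^{soc}(v)$ can be compared with a finite-difference Jacobian at a generic point. The NumPy sketch below does this for $p = 3$ in $\mathbb{R}^3$. It is an illustration, not the authors' code: all helper names and the sample vectors are hypothetical, and the point is chosen so that $w_2 \ne 0$, $v_2 \ne 0$ and $\lambda_1(w) > 0$. Recall that $\nabla$ denotes the transposed Jacobian, so the comparison is against the transpose of the finite-difference matrix.

```python
import numpy as np

def arrow(x):
    """Matrix L_x with L_x @ y = x o y."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def soc_apply(g, x):
    """g^soc(x) computed from the spectral values of x."""
    n2 = np.linalg.norm(x[1:])
    w = x[1:] / n2 if n2 > 0 else np.eye(x.size - 1)[0]
    g1, g2 = g(x[0] - n2), g(x[0] + n2)
    return np.concatenate(([0.5 * (g1 + g2)], 0.5 * (g2 - g1) * w))

def soc_grad(g, dg, x):
    """Gradient of g^soc at x, with a, b, c as in the proposition above."""
    n, n2 = x.size, np.linalg.norm(x[1:])
    if n2 == 0:
        return dg(x[0]) * np.eye(n)
    xb, l1, l2 = x[1:] / n2, x[0] - n2, x[0] + n2
    a = (g(l2) - g(l1)) / (l2 - l1)
    b = 0.5 * (dg(l2) + dg(l1))
    c = 0.5 * (dg(l2) - dg(l1))
    G = np.empty((n, n))
    G[0, 0] = b
    G[0, 1:] = c * xb
    G[1:, 0] = c * xb
    G[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(xb, xb)
    return G

p = 3
g = lambda t: abs(t) ** (p / 2)
dg = lambda t: (p / 2) * abs(t) ** (p / 2 - 1) * np.sign(t)
h = lambda t: t ** p
dh = lambda t: p * t ** (p - 1)

def phi(x, y):
    """phi^p_{D-FB}(x, y) = g^soc(w) - h^soc(v), cf. (13)."""
    w = arrow(x) @ x + arrow(y) @ y          # w = x^2 + y^2
    return soc_apply(g, w) - soc_apply(h, x + y)

def grad_x(x, y):
    """Formula 2 L_x grad g^soc(w) - grad h^soc(v)."""
    w = arrow(x) @ x + arrow(y) @ y
    return 2.0 * arrow(x) @ soc_grad(g, dg, w) - soc_grad(h, dh, x + y)

def fd_jacobian_x(x, y, eps=1e-6):
    """Central-difference Jacobian of phi with respect to x (column j = d phi / d x_j)."""
    cols = [(phi(x + eps * e, y) - phi(x - eps * e, y)) / (2 * eps) for e in np.eye(x.size)]
    return np.column_stack(cols)

x = np.array([1.0, 0.5, -0.3])   # hypothetical sample point
y = np.array([-0.4, 0.2, 0.9])   # hypothetical sample point
max_err = np.max(np.abs(grad_x(x, y) - fd_jacobian_x(x, y).T))
```

The formula for $\nabla_y$ can be checked the same way by perturbing `y` and replacing `arrow(x)` with `arrow(y)` in `grad_x`.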

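With Lemma 2.2 and Proposition 4.1, we can construct more complementarity functions for SOCCP which are variants of $\phi^p_{D-FB}(x, y)$. More specifically, let $k$ and $m$ be positive integers and let $f^{soc}(x, y) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be the vector-valued function corresponding to a given real-valued function $f$. Then the following functions are new variants of $\phi^p_{D-FB}(x, y)$:

$$\tilde{\phi}_1(x, y) = \left[ \sqrt{x^2 + y^2} + f^{soc}(x, y) \right]^{\frac{2k+1}{2m+1}} - \left[ x + y + f^{soc}(x, y) \right]^{\frac{2k+1}{2m+1}}.$$

$$\tilde{\phi}_2(x, y) = \left[ \sqrt{x^2 + y^2} - x - y \right]^{\frac{k}{m}}.$$

$$\tilde{\phi}_3(x, y) = \left[ \sqrt{x^2 + y^2} - x + f^{soc}(x, y) \right]^{\frac{2k+1}{2m+1}} - \left[ y + f^{soc}(x, y) \right]^{\frac{2k+1}{2m+1}}.$$

$$\tilde{\phi}_4(x, y) = \left[ \sqrt{x^2 + y^2} - y + f^{soc}(x, y) \right]^{\frac{2k+1}{2m+1}} - \left[ x + f^{soc}(x, y) \right]^{\frac{2k+1}{2m+1}}.$$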

Proposition 4.4. All the above functions $\tilde{\phi}_i$ for $i \in \{1, 2, 3, 4\}$ are complementarity functions associated with $K^n$.

Proof. The results follow from applying Lemma 2.2 and Proposition 4.1. □

In general, for complementarity functions associated with Kⁿ, we have the following parallel result to Theorem 3.1.

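Theorem 4.1. Suppose that $\phi(x, y) = \varphi_1(x, y) - \varphi_2(x, y)$ is a complementarity function associated with $K^n$ on $\mathbb{R}^n \times \mathbb{R}^n$, and $k, m$ are positive integers. Then $\left[ \phi(x, y) \right]^{\frac{k}{m}}$ and $\left[ \varphi_1(x, y) \right]^{\frac{2k+1}{2m+1}} - \left[ \varphi_2(x, y) \right]^{\frac{2k+1}{2m+1}}$ are complementarity functions associated with $K^n$.

Proof. Since $k$ and $m$ are positive integers, using Lemma 2.2 gives

$$\begin{aligned} \left[ \phi(x, y) \right]^{\frac{k}{m}} = 0 &\iff \left\{ \left[ \phi(x, y) \right]^{\frac{k}{m}} \right\}^m = 0 \\ &\iff \left[ \phi(x, y) \right]^k = 0 \\ &\iff \phi(x, y) = 0. \end{aligned}$$

Similarly, we have

$$\begin{aligned} \left[ \varphi_1(x, y) \right]^{\frac{2k+1}{2m+1}} - \left[ \varphi_2(x, y) \right]^{\frac{2k+1}{2m+1}} = 0 &\iff \left[ \varphi_1(x, y) \right]^{\frac{2k+1}{2m+1}} = \left[ \varphi_2(x, y) \right]^{\frac{2k+1}{2m+1}} \\ &\iff \left\{ \left[ \varphi_1(x, y) \right]^{\frac{2k+1}{2m+1}} \right\}^{2m+1} = \left\{ \left[ \varphi_2(x, y) \right]^{\frac{2k+1}{2m+1}} \right\}^{2m+1} \\ &\iff \left[ \varphi_1(x, y) \right]^{2k+1} = \left[ \varphi_2(x, y) \right]^{2k+1} \\ &\iff \varphi_1(x, y) = \varphi_2(x, y) \\ &\iff \phi(x, y) = 0. \end{aligned}$$

From the above arguments and the assumption, the proof is complete. □

Remark 4.1: We elaborate more about Theorem 4.1.

(a) Based on existing complementarity functions, we can construct new complementarity functions associated with $K^n$ in light of Theorem 4.1.

(b) When $k$ is a positive odd integer, $\left[ \phi(x, y) \right]^k$ is a complementarity function associated with $K^n$. This means that by perturbing the odd integer parameter $k$, we obtain new complementarity functions associated with $K^n$. In addition, if $\phi(x, y)$ is a complementarity function, then for any positive integer $m$, $\left[ \phi(x, y) \right]^{\frac{k}{m}}$ is also a complementarity function. We can determine nice complementarity functions associated with $K^n$ among these functions by their numerical performance.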

Finally, we establish the formula for the Jacobian of $\phi^p_{NR}$ and the smoothness of $\phi^p_{NR}$. To this end, we need the following technical lemma.

Lemma 4.1. Let $p > 1$. Then, the real-valued function $f(t) = (t_+)^p$ is continuously differentiable with $f'(t) = p (t_+)^{p-1}$, where $t_+ = \max\{0, t\}$.

Proof. By the definition of $t_+$, we have

$$f(t) = (t_+)^p = \begin{cases} t^p & \text{if } t \ge 0, \\ 0 & \text{if } t < 0, \end{cases}$$

which implies

$$f'(t) = \begin{cases} p t^{p-1} & \text{if } t \ge 0, \\ 0 & \text{if } t < 0. \end{cases}$$

Then, it is easy to see that $f'(t) = p (t_+)^{p-1}$ is continuous for $p > 1$. □

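Proposition 4.5. Let $\phi^p_{NR}$ be defined as in (7), and let $h^{soc}(x) = x^p$ and $l^{soc}(x) = (x_+)^p$ be the vector-valued functions corresponding to the real-valued functions $h(t) = t^p$ and $l(t) = (t_+)^p$, respectively. Then, $\phi^p_{NR}$ is continuously differentiable at any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, and its Jacobian is given by

$$\nabla_x \phi^p_{NR}(x, y) = \nabla h^{soc}(x) - \nabla l^{soc}(x - y),$$

$$\nabla_y \phi^p_{NR}(x, y) = \nabla l^{soc}(x - y),$$

where $\nabla h^{soc}$ satisfies (8)-(12) and

$$\nabla l^{soc}(u) = \begin{cases} p ((u_1)_+)^{p-1} I & \text{if } u_2 = 0; \\[2mm] \begin{bmatrix} b_3(u) & c_3(u) \bar{u}_2^T \\ c_3(u) \bar{u}_2 & a_3(u) I + (b_3(u) - a_3(u)) \bar{u}_2 \bar{u}_2^T \end{bmatrix} & \text{if } u_2 \ne 0; \end{cases}$$

$$\bar{u}_2 = \frac{u_2}{\|u_2\|}, \qquad a_3(u) = \frac{(\lambda_2(u)_+)^p - (\lambda_1(u)_+)^p}{\lambda_2(u) - \lambda_1(u)},$$

$$b_3(u) = \frac{p}{2} \left[ (\lambda_2(u)_+)^{p-1} + (\lambda_1(u)_+)^{p-1} \right], \qquad c_3(u) = \frac{p}{2} \left[ (\lambda_2(u)_+)^{p-1} - (\lambda_1(u)_+)^{p-1} \right].$$

Proof. In light of [13, Proposition 5] and [25, Proposition 5.2], the results follow from applying Lemma 4.1 and using the chain rule for differentiation. □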


5 Numerical experiments

As mentioned, the Newton method may not be appropriate for numerical implementation, due to possible singularity of the Jacobian at a degenerate solution. In view of this, in this section we employ the derivative-free descent method studied in [37] to test the numerical performance for various values of $p$. The derivative-free descent method studied in [37] mainly targets the SOCCP (second-order cone complementarity problem). Hence, we consider the following SOCCP:

$$z \in K, \quad Mz + b \in K, \quad z^T (Mz + b) = 0, \quad K = K_1 \times \cdots \times K_r.$$

According to our results, the above SOCCP can be recast as an unconstrained minimization problem:

$$\min_{\zeta \in \mathbb{R}^n} \Psi_p(\zeta) = \frac{1}{2} \left\| \phi^p_{D-FB}(\zeta, F(\zeta)) \right\|^2, \quad \text{where } F(\zeta) = M\zeta + b.$$

All tests are done on a PC with an Intel Core i7-5600U at 2.6 GHz and 8 GB RAM, and the codes are written in Matlab 2010b. The test instances are generated randomly.

In particular, we first generate random sparse square matrices $N_i$ ($i = 1, 2, \ldots, r$) with density 0.01, in which the nonzero elements are chosen randomly from a normal distribution with mean $-1$ and variance 4. Then, we create the positive semidefinite matrices $M_i$ ($i = 1, 2, \ldots, r$) by setting $M_i := N_i N_i^T$ and let $M := \mathrm{diag}(M_1, \ldots, M_r)$. In addition, we take the vector $b := -Mw$ with $w = (w_1, \ldots, w_r)$ and $w_i \in K_i$. With these $M$ and $b$, it is not hard to verify that the corresponding SOCCP has at least one feasible solution. To construct SOCs of various types, we set $n_1 = n_2 = \cdots = n_r$.

We implement a test problem generated as above with $n = 1000$ and $r = 100$. The parameters in the algorithm are set as

$$\beta = 0.9, \quad \gamma = 0.8, \quad \sigma = 10^{-4}, \quad \text{and} \quad \varepsilon = 10^{-8}.$$

We start with the initial point $\zeta^0 = (\zeta_{n_1}, \ldots, \zeta_{n_r})$, where

$$\zeta_{n_i} = \left( 10, \ \frac{w_i}{\|w_i\|} \right)$$

with $w_i \in \mathbb{R}^{n_i - 1}$ being generated randomly. The algorithm stops when $\Psi_p(\zeta^k) \le \varepsilon$, when the number of iterations exceeds $10^5$, or when the step length is less than $10^{-12}$. Figure 4 depicts the detailed iteration process of the algorithm for different values of $p$.


The algorithm fails for this problem when p ≥ 5, mainly because the step-length eventually becomes too small. We also suspect that a larger p makes the computation of the complementarity function in the Jordan algebra more tedious. In any case, this phenomenon indicates that the discrete-type complementarity functions work well only for small values of p.

The convergence behavior in Figure 4 shows that the method with a larger p reduces Ψ_p faster at the beginning, whereas the method with a smaller p reduces Ψ_p faster eventually. Moreover, the larger p is, the fewer iterations the algorithm needs in total.

In order to check the numerical performance of the algorithm for different values of p, we solve test problems of different dimensions. The numerical results are summarized in Tables 1 and 2. “Ψ_p(ζ^∗)” and “Gap” denote the merit function value and the value of ζ^T F(ζ) at the final iteration, respectively. “NF”, “Iter”, and “Time” indicate the number of function evaluations of Ψ_p, the number of iterations required to satisfy the termination condition, and the CPU time in seconds for solving each problem, respectively.

Table 1: Numerical results with different values of p

Problem     p = 1                                      p = 1.4
(n, r)      Ψ_p(ζ^∗)  NF      Iter    Gap      Time    Ψ_p(ζ^∗)  NF      Iter    Gap      Time
(100,10)    9.8e-9    5350    4952    2.75e-4  9.3     1.0e-8    4401    1474    5.92e-5  3.5
(200,20)    9.4e-9    5064    4914    3.74e-5  16.5    1.0e-8    16179   5649    3.84e-5  25.9
(300,30)    1.0e-8    7445    5273    2.26e-4  30.3    9.9e-9    7000    1266    2.40e-5  11.5
(400,40)    9.8e-9    5342    5016    1.62e-4  50.0    9.9e-9    3747    857     4.31e-5  9.5
(500,50)    1.0e-8    23533   13749   6.81e-4  126.4   9.6e-9    29454   6257    3.39e-4  93.9
(600,60)    1.0e-8    18260   11119   16.1e-4  65.1    1.0e-8    24685   8320    8.69e-5  119.7
(700,70)    1.0e-8    8320    5690    6.16e-4  38.3    1.0e-8    13458   4493    1.79e-4  77.7
(800,80)    1.0e-8    29415   10149   4.43e-5  199.2   9.3e-9    2507    1838    1.54e-4  27.4
(900,90)    1.0e-8    14648   10888   1.46e-3  159.8   9.9e-9    5970    1621    8.77e-5  44.9
(1000,100)  1.0e-8    14590   9672    2.78e-4  238.3   1.0e-8    12337   2570    7.58e-5  92.0
(1100,110)  9.9e-9    5994    5406    4.64e-6  109.6   1.0e-8    13767   2948    3.51e-4  126.5
(1200,120)  9.8e-9    6100    5528    6.12e-5  121.7   9.9e-9    20990   5650    1.51e-5  211.4
(1300,130)  9.8e-9    4253    3612    2.42e-4  115.5   9.7e-9    777     316     5.78e-5  10.1
(1400,140)  1.0e-8    9827    7136    1.46e-4  307.5   1.0e-8    6357    2736    2.20e-4  70.6
(1500,150)  9.9e-9    4701    4211    3.04e-4  156.9   9.9e-9    7060    1823    6.56e-6  67.8
(1600,160)  9.9e-9    5744    3843    4.61e-4  172.8   1.0e-8    9434    2583    1.39e-4  82.9
(1700,170)  1.0e-8    11163   5581    2.74e-4  195.1   1.0e-8    12307   2740    9.87e-5  185.7
(1800,180)  1.0e-8    7449    5985    3.77e-4  204.5   1.0e-8    38524   9469    2.43e-4  439.8
(1900,190)  1.0e-8    4205    2102    7.19e-5  83.2    1.0e-8    7413    1636    3.40e-4  125.4
(2000,200)  9.9e-9    5189    4953    2.12e-4  212.9   9.15e-9   10230   480     2.32e-5  294.9

Table 2: Numerical results with different values of p

Problem     p = 2.6                                    p = 3
(n, r)      Ψ_p(ζ^∗)  NF      Iter    Gap      Time    Ψ_p(ζ^∗)  NF      Iter    Gap      Time
(100,10)    9.9e-9    28878   1866    2.40e-6  11.9    9.2e-9    11281   201     3.80e-7  14.7
(200,20)    1.0e-8    57844   3743    1.64e-6  47.9    9.5e-9    21221   422     1.15e-6  52.9
(300,30)    9.9e-9    14452   963     3.14e-6  17.3    9.2e-9    4383    89      5.97e-7  17.5
(400,40)    9.8e-9    20747   1417    2.31e-6  32.7    9.9e-9    7419    133     8.34e-7  34.0
(500,50)    9.8e-9    13929   1084    1.53e-6  30.7    8.4e-9    27229   474     1.04e-6  87.8
(600,60)    9.9e-9    28224   2032    2.48e-7  77.1    9.9e-9    48809   878     4.19e-7  193.8
(700,70)    9.9e-9    16739   1230    1.93e-5  52.8    7.9e-9    7069    140     6.16e-4  58.4
(800,80)    9.9e-9    72745   5342    7.69e-7  270.5   9.8e-9    27620   534     5.95e-7  260.1
(900,90)    9.5e-9    7574    522     6.09e-7  37.5    8.0e-9    10276   187     1.35e-7  129.6
(1000,100)  1.0e-8    145414  8664    4.92e-7  821.6   9.6e-9    17790   325     2.26e-7  258.2
(1100,110)  9.7e-9    16834   1465    3.76e-7  111.0   9.5e-9    31750   528     6.41e-7  507.2
(1200,120)  9.9e-9    45621   3346    1.82e-6  271.5   9.8e-9    20326   370     4.82e-7  437.4
(1300,130)  1.0e-8    25661   1739    3.21e-6  171.8   8.9e-9    10399   185     7.16e-7  115.5
(1400,140)  9.8e-9    57526   4116    2.09e-5  277.6   8.9e-9    12529   205     1.09e-6  348.4
(1500,150)  1.0e-8    355478  321117  1.50e-5  2343.0  4.7e-3    11824   217     1.54e-5  393.5
(1600,160)  9.3e-9    12995   5961    1.70e-6  98.5    9.9e-9    33843   550     5.43e-7  862.2
(1700,170)  1.0e-8    47367   3380    8.64e-7  441.0   1.0e-8    80519   5084    1.73e-7  742.8
(1800,180)  9.8e-9    7697    536     1.67e-6  53.0    7.4e-9    8472    154     4.15e-8  289.6
(1900,190)  1.0e-8    149019  10644   2.59e-6  1577.9  1.0e-8    16128   909     5.84e-7  161.5
(2000,200)  1.0e-8    27876   1991    2.64e-6  238.5   1.0e-8    34310   630     1.37e-7  862.2

We also use the performance profiles introduced by Dolan and Moré [18] to compare the performance of the algorithm with different p. The performance profiles are generated by executing a set of solvers S on a test set P. Let n_{p,s} be the number of iterations (or the computing time) required to solve problem p ∈ P by solver s ∈ S, and define the performance ratio as

r_{p,s} = n_{p,s} / min{n_{p,s} : 1 ≤ s ≤ n_s},

where n_s is the number of solvers. Whenever solver s does not solve problem p successfully, we set r_{p,s} = r_M, where r_M is a very large preset positive constant. Then the performance profile for each solver s is defined by

ρ_s(χ) = (1/n_p) size{p ∈ P : log₂(r_{p,s}) ≤ χ},

where n_p is the number of test problems and size{p ∈ P : log₂(r_{p,s}) ≤ χ} is the number of elements in the set {p ∈ P : log₂(r_{p,s}) ≤ χ}. ρ_s(χ) represents the probability that the performance ratio r_{p,s} is within a factor of 2^χ. It is easy to see that ρ_s(0) is the probability that solver s wins over the rest of the solvers. See [18] for more details about the performance profile.
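The performance ratio and performance profile defined above can be computed as in the following sketch. It is a small illustration, not the code used to produce Figure 5. The data layout (a dictionary mapping each solver to its list of costs, with `None` marking a failure), the default value of r_M, and the function names `performance_ratios` and `profile` are all assumptions made for this example; costs are assumed to be positive.

```python
import math


def performance_ratios(costs, r_max=1e6):
    """costs[s][p] is the cost (iterations or time) of solver s on
    problem p, or None if the solver failed. Returns r_{p,s} per solver,
    with failures mapped to the large constant r_max (playing the role of r_M)."""
    solvers = list(costs)
    n_p = len(costs[solvers[0]])
    ratios = {s: [] for s in solvers}
    for p in range(n_p):
        solved = [costs[s][p] for s in solvers if costs[s][p] is not None]
        best = min(solved) if solved else None
        for s in solvers:
            c = costs[s][p]
            failed = c is None or best is None
            ratios[s].append(r_max if failed else c / best)
    return ratios


def profile(ratios_s, chi):
    """rho_s(chi): fraction of problems with log2(r_{p,s}) <= chi."""
    hits = sum(1 for r in ratios_s if math.log2(r) <= chi)
    return hits / len(ratios_s)


# Hypothetical data for two solvers on three problems (not from the tables).
example = {"A": [2.0, 4.0, None], "B": [4.0, 4.0, 8.0]}
example_ratios = performance_ratios(example)
```

Evaluating `profile(example_ratios[s], chi)` on a grid of χ values for each solver s gives the curves that a performance-profile plot displays.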

Figure 5(a) shows that the algorithm with p = 1 and p = 1.4 performs better than with p = 2.6 and p = 3 in terms of function evaluations. Similarly, from Figure 5(b) and Figure 5(c), we observe that the algorithm with p = 3 performs best on the number of iterations, while the algorithm with p = 1.4 is the best on CPU time. This provides evidence