On construction of new NCP functions Jan Harold Alcantara


On construction of new NCP functions

Jan Harold Alcantara^a, Chen-Han Lee^a, Chieu Thanh Nguyen^a, Yu-Lin Chang^a and Jein-Shan Chen^a,∗

^a Department of Mathematics, National Taiwan Normal University

A R T I C L E I N F O

Keywords: Complementarity functions; Nonlinear complementarity problem

A B S T R A C T

We report a new method to construct complementarity functions for the nonlinear complementarity problem (NCP). Basic properties related to growth behavior, convexity and semismoothness of the newly discovered NCP functions are proved. We also present some variants, generalizations and other transformations of these NCP functions. Finally, we propose some interesting research directions that can be explored in the NCP research.

1. Motivation

Nonlinear complementarity problems (NCPs) are an important class of variational inequalities, often encountered when dealing with the Karush-Kuhn-Tucker conditions of optimization problems [7]. In addition, the NCP provides an important framework for the study of equilibrium problems, which arise in areas such as operations research, engineering and economics [7,9,10].

Given a function 𝐹 ∶ IR^𝑛 → IR^𝑛, the problem of finding a point 𝑥 ∈ IR^𝑛 such that

$$x \ge 0, \quad F(x) \ge 0, \quad \text{and} \quad \langle x, F(x) \rangle = 0 \tag{1}$$

is precisely the nonlinear complementarity problem. Various approaches to solving this problem have been proposed in the past years. One class of methods utilizes a so-called NCP function, that is, a function 𝜙 ∶ IR² → IR such that

𝜙(𝑎, 𝑏) = 0 ⟺ 𝑎≥ 0, 𝑏≥ 0, and 𝑎𝑏= 0.

An NCP function is useful in solving NCP (1) as it naturally exploits the structure of the problem. In particular, defining Φ_F ∶ IR^𝑛 → IR^𝑛 as

$$\Phi_F(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}, \tag{2}$$

it is clear that NCP (1) is equivalent to solving the system of equations Φ_F(𝑥) = 0. Moreover, the NCP function also gives rise to a merit function, namely Ψ_F(𝑥) ∶= ½‖Φ_F(𝑥)‖². That is, the global minimizers of Ψ_F and the solutions of (1) coincide whenever (1) is solvable. Consequently, designing solution methods for solving (1) usually involves these NCP functions.

Due to their usefulness, numerous NCP functions have been proposed and extensively studied in the literature [11].

Among them, the Fischer-Burmeister (FB) function given by

$$\phi_{FB}(a, b) = \sqrt{a^2 + b^2} - (a + b) \tag{3}$$

has gained significant attention and has been widely used in several studies because of its desirable numerical properties.

∗Corresponding author. Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan. [email protected] (J. Chen). ORCID: 0000-0002-4596-9419 (J. Chen).

In addition, as noted in [11], it is remarkable that several NCP functions are akin to the FB function. For instance, the generalized FB function

$$\phi^p_{FB}(a, b) = \|(a, b)\|_p - (a + b), \quad p > 1 \tag{4}$$

is an interesting generalization of 𝜙_FB which can be efficiently used in solving NCPs. Here, ‖⋅‖_𝑝 denotes the 𝑙_𝑝-norm, and tuning the parameter 𝑝 has been shown to possibly improve the numerical performance of some algorithms [4,6].
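As a minimal sketch of how an NCP function leads to the system Φ_F(𝑥) = 0 and the merit function Ψ_F, the following Python snippet uses the generalized FB function (4). The linear map `F_example`, its data `M` and `q`, and the point `x_star` are hypothetical and were chosen only so that the solution can be checked by hand; they do not come from the paper.

```python
def phi_fb_p(a, b, p=2.0):
    """Generalized FB function (4): ||(a, b)||_p - (a + b), with p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def Phi(F, x, phi=phi_fb_p):
    """Componentwise residual Phi_F(x)_i = phi(x_i, F_i(x)), cf. (2)."""
    return [phi(xi, Fi) for xi, Fi in zip(x, F(x))]

def Psi(F, x, phi=phi_fb_p):
    """Merit function Psi_F(x) = 0.5 * ||Phi_F(x)||^2."""
    return 0.5 * sum(r * r for r in Phi(F, x, phi))

def F_example(x):
    """Hypothetical test map F(x) = M x + q (illustration only)."""
    M = [[2.0, 1.0], [1.0, 2.0]]
    q = [-1.0, 1.0]
    return [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]

# x_star solves the NCP: x >= 0, F(x) = (0, 1.5) >= 0, and <x, F(x)> = 0.
x_star = [0.5, 0.0]
merit_at_solution = Psi(F_example, x_star)
merit_elsewhere = Psi(F_example, [1.0, 1.0])
print(merit_at_solution, merit_elsewhere)
```

The merit value vanishes (up to rounding) at `x_star` and is strictly positive at the non-solution point, which is the behavior that solution methods based on Ψ_F rely on.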

A general way to construct NCP functions was first given by Mangasarian in [16], and another method was formulated by Luo and Tseng [15] and Kanzow, Yamashita, and Fukushima [13]. More recently, a rigorous discussion on how to construct NCP functions was presented by Galantai in [11]. The purpose of this paper is to present another general method to construct NCP functions which is new to the literature. The very useful generalized FB function 𝜙^𝑝_FB is one among the functions that our method can generate. We also discuss some analytic properties and geometric views of the proposed functions. We present some variants and generalizations of these NCP functions, and we also suggest some possibly important extensions. Finally, we report some possible research directions that are worth exploring in the future.

2. New NCP Functions

In this section, we present a new method to construct continuous NCP functions. Let 𝜃 ∶ IR → IR be continuous and define 𝜙^𝑝_𝜃 ∶ IR² → IR as

$$\phi^p_\theta(a, b) = \|(a, b)\|_p - (\theta(b)a + \theta(a)b), \quad p \ge 1. \tag{5}$$

Note that 𝜙^𝑝_𝜃 is a continuous symmetric function; that is, 𝜙^𝑝_𝜃(𝑎, 𝑏) = 𝜙^𝑝_𝜃(𝑏, 𝑎). For suitable choices of 𝜃, the above function yields an NCP function. We divide our discussion into two cases, depending on the value of 𝑝.
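The definition (5) translates directly into code. The sketch below is only an illustration: the helper names `p_norm` and `phi_theta` are ours, and the choice 𝜃(𝑡) = 𝑒^𝑡 is one convenient continuous function with 𝜃(0) = 1.

```python
import math

def p_norm(a, b, p):
    """l_p norm of the vector (a, b) for p >= 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p)

def phi_theta(a, b, theta, p=1.0):
    """phi^p_theta(a, b) = ||(a, b)||_p - (theta(b) * a + theta(a) * b), cf. (5)."""
    return p_norm(a, b, p) - (theta(b) * a + theta(a) * b)

# Symmetry check phi(a, b) = phi(b, a) at an arbitrary point.
a, b = -0.7, 1.3
left = phi_theta(a, b, math.exp, p=2.0)
right = phi_theta(b, a, math.exp, p=2.0)

# A complementary pair (a = 0, b >= 0) is mapped to zero because theta(0) = 1.
value_on_axis = phi_theta(0.0, 3.0, math.exp, p=2.0)
print(left, right, value_on_axis)
```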


2.1. The case 𝑝 = 1

We first consider the case of 𝑝 = 1, that is,

$$\phi^1_\theta(a, b) = |a| + |b| - (\theta(b)a + \theta(a)b). \tag{6}$$

Proposition 2.1. Let 𝜃 ∶ IR → IR be such that 𝜃(0) = 1, 𝜃(𝑡) > 1 for all 𝑡 > 0, and −1 < 𝜃(𝑡) < 1 for all 𝑡 < 0. Then, 𝜙¹_𝜃 defined by (6) is an NCP function. Moreover, 𝜙¹_𝜃(𝑎, 𝑏) ≤ 0 if and only if (𝑎, 𝑏) ∈ IR²₊.

Proof. Observe that we may rewrite 𝜙¹_𝜃 as

$$\phi^1_\theta(a, b) = a\,(\operatorname{sgn}(a) - \theta(b)) + b\,(\operatorname{sgn}(b) - \theta(a)),$$

where

$$\operatorname{sgn}(t) := \begin{cases} 1 & \text{if } t > 0, \\ 0 & \text{if } t = 0, \\ -1 & \text{if } t < 0. \end{cases}$$

Then, it is easy to verify that

$$\phi^1_\theta(a, b) = \begin{cases} 0 & \text{if } a, b \ge 0 \ \&\ ab = 0, \\ a(1 - \theta(b)) + b(1 - \theta(a)) & \text{if } a > 0 \ \&\ b > 0, \\ -a(1 + \theta(b)) + b(1 - \theta(a)) & \text{if } a < 0 \ \&\ b \ge 0, \\ -a(1 + \theta(b)) - b(1 + \theta(a)) & \text{if } a < 0 \ \&\ b < 0. \end{cases} \tag{7}$$

By our hypotheses on 𝜃, we see that 𝜙¹_𝜃(𝑎, 𝑏) < 0 for the second case, and 𝜙¹_𝜃(𝑎, 𝑏) > 0 for the third and last cases. Finally, by symmetry of 𝜙¹_𝜃, we have 𝜙¹_𝜃(𝑎, 𝑏) > 0 when 𝑎 ≥ 0 and 𝑏 < 0, as in the third case. In other words, 𝜙¹_𝜃(𝑎, 𝑏) = 0 if and only if 𝑎, 𝑏 ≥ 0 and 𝑎𝑏 = 0. This says that 𝜙¹_𝜃 is an NCP function. □
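The sign pattern established in the proof can be spot-checked numerically. The following sketch assumes the illustrative choice 𝜃(𝑡) = 𝑒^𝑡, which satisfies the hypotheses of Proposition 2.1; the grid and the helper `sign_pattern_ok` are ours. A check on finitely many points is of course no substitute for the proof.

```python
import math

def phi1(a, b, theta):
    """phi^1_theta(a, b) = |a| + |b| - (theta(b) * a + theta(a) * b), cf. (6)."""
    return abs(a) + abs(b) - (theta(b) * a + theta(a) * b)

# Illustrative theta: theta(0) = 1, theta > 1 on (0, inf), 0 < theta < 1 on (-inf, 0).
theta = math.exp
grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

def sign_pattern_ok(a, b, tol=1e-12):
    v = phi1(a, b, theta)
    if a >= 0 and b >= 0 and a * b == 0:
        return abs(v) <= tol   # complementary pairs: value is zero
    if a > 0 and b > 0:
        return v < 0           # interior of the nonnegative quadrant: negative
    return v > 0               # outside the nonnegative quadrant: positive

all_ok = all(sign_pattern_ok(a, b) for a in grid for b in grid)
print(all_ok)
```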

An important consequence of Proposition 2.1 is given by the following result, which describes the growth behavior of the NCP function 𝜙¹_𝜃. This corollary plays an important role in establishing coerciveness of Φ_F given by (2) (see [8]), which in turn is helpful in the convergence analysis of algorithms. We omit the proof of the following corollary since it easily follows from the formula of 𝜙¹_𝜃 given in (7). We do note that the strict inequality assumptions on the limits of 𝜃 at ±∞ are important to avoid indeterminate products.

Corollary 2.1. Let 𝜃 satisfy the hypotheses of Proposition 2.1 and suppose that lim_{𝑡→∞} 𝜃(𝑡) > 1 and −1 < lim_{𝑡→−∞} 𝜃(𝑡) < 1. Then, |𝜙¹_𝜃(𝑎^𝑘, 𝑏^𝑘)| → ∞ as 𝑘 → ∞ for any sequence {(𝑎^𝑘, 𝑏^𝑘)} ⊆ IR² with |𝑎^𝑘| → ∞ and |𝑏^𝑘| → ∞.
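The growth behavior can be illustrated along a few rays. The sketch below uses 𝜃₃(𝑡) = 2/(1 + 𝑒^−𝑡), whose limits at ±∞ are 2 and 0; the sample values of 𝑘 and the four directions are our own choice and only probe sequences of the special form (±𝑘, ±𝑘).

```python
import math

def theta3(t):
    """theta_3(t) = 2 / (1 + exp(-t)); tends to 2 at +inf and to 0 at -inf."""
    return 2.0 / (1.0 + math.exp(-t))

def phi1(a, b, theta):
    """phi^1_theta(a, b) = |a| + |b| - (theta(b) * a + theta(a) * b)."""
    return abs(a) + abs(b) - (theta(b) * a + theta(a) * b)

ks = [1.0, 10.0, 100.0]
directions = [(1, 1), (-1, 1), (1, -1), (-1, -1)]

# |phi| evaluated at (s_a * k, s_b * k) for each sign pattern (s_a, s_b).
growth = {
    d: [abs(phi1(d[0] * k, d[1] * k, theta3)) for k in ks]
    for d in directions
}
for d, values in growth.items():
    print(d, values)
```

In each direction the absolute value grows roughly linearly in 𝑘, which is consistent with the statement of Corollary 2.1.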

In the remaining parts of the paper, we assume that 𝜃 satisfies the conditions in Proposition 2.1 whenever 𝑝 = 1. Note that a simple choice of 𝜃 is any monotonically increasing function whose range is contained in (−1, ∞), whose graph passes through (0, 1), and which is strictly monotonic in some neighborhood of 0.

Example 2.1. The functions 𝜃₁(𝑡) = 𝑒^𝑡, 𝜃₂(𝑡) = (√(𝑡² + 4) + 𝑡)/2, and 𝜃₃(𝑡) = 2/(1 + 𝑒^−𝑡) clearly satisfy the conditions of Proposition 2.1 and Corollary 2.1. The graphs of 𝜙¹_{𝜃_𝑖}(𝑎, 𝑏) for 𝑖 = 1, 2, 3 are shown in Figures 1a, 2a, and 3a. For each 𝑖, it is evident that the function 𝜙¹_{𝜃_𝑖} is non-positive on IR²₊ and has the growth behavior described in Corollary 2.1.

In addition, 𝜙¹_{𝜃_𝑖} is a nonsmooth nonconvex function for all 𝑖. In particular, the function has sharp trace curves corresponding to 𝑎 = 0 and 𝑏 = 0, which are the points of nondifferentiability of 𝜙¹_𝜃.

2.2. The case 𝑝 >1

Now, we consider 𝜙^𝑝_𝜃 with 𝑝 > 1 and provide conditions on 𝜃 which make 𝜙^𝑝_𝜃 an NCP function. The conditions are similar to those given in Proposition 2.1. However, the inequalities comparing 𝜃(𝑡) with 1 need not be strict, while a larger lower bound for 𝜃(𝑡) is needed on (−∞, 0).

Proposition 2.2. Let 𝑝 > 1. Suppose 𝜃 ∶ IR → IR is such that 𝜃(0) = 1, 𝜃(𝑡) ≥ 1 for all 𝑡 > 0, and −2^((1−𝑝)/𝑝) ≤ 𝜃(𝑡) ≤ 1 for all 𝑡 < 0. Then, 𝜙^𝑝_𝜃 defined by (5) is an NCP function. Moreover, 𝜙^𝑝_𝜃(𝑎, 𝑏) ≤ 0 if and only if (𝑎, 𝑏) ∈ IR²₊.

Proof. Since 𝜙^𝑝_𝜃 is symmetric w.r.t. the line 𝑎 = 𝑏, it suffices to check the values of 𝜙^𝑝_𝜃 on the region 𝑎 ≤ 𝑏. We carefully consider four cases.

(i) If 𝑎 = 0 and 𝑏 ≥ 0, then 𝜙^𝑝_𝜃(𝑎, 𝑏) = |𝑏| − 𝜃(0)𝑏 = 0 since 𝜃(0) = 1.

(ii) Suppose 𝑎 > 0 and 𝑏 > 0. Due to 𝑝 > 1, we have ‖(𝑎, 𝑏)‖_𝑝 = (𝑎^𝑝 + 𝑏^𝑝)^(1/𝑝) < 𝑎 + 𝑏, which in turn yields

$$\phi^p_\theta(a, b) < a + b - (\theta(b)a + \theta(a)b) = a(1 - \theta(b)) + b(1 - \theta(a)).$$

Because 𝜃(𝑡) ≥ 1 for any 𝑡 > 0, it follows that 𝜙^𝑝_𝜃(𝑎, 𝑏) < 0.

(iii) Suppose 𝑎 < 0 and 𝑏 ≥ 0. In this case, we have ‖(𝑎, 𝑏)‖_𝑝 > 𝑎 + 𝑏. Thus,

$$\phi^p_\theta(a, b) > a + b - (\theta(b)a + \theta(a)b) = a(1 - \theta(b)) + b(1 - \theta(a)).$$

Since 𝑏 ≥ 0, we have 1 − 𝜃(𝑏) ≤ 0 and so the term 𝑎(1 − 𝜃(𝑏)) is nonnegative. On the other hand, 1 − 𝜃(𝑎) ≥ 0 since 𝑎 < 0, which means that the term 𝑏(1 − 𝜃(𝑎)) is likewise nonnegative. Hence, 𝜙^𝑝_𝜃(𝑎, 𝑏) > 0.

(iv) Finally, suppose that 𝑎 < 0 and 𝑏 < 0. The function 𝑡 ↦ 𝑡^𝑝 is strictly convex on [0, ∞) since 𝑝 > 1. Thus,

$$\|(a, b)\|_p^p = |a|^p + |b|^p > 2^{1-p}(|a| + |b|)^p,$$

which implies that ‖(𝑎, 𝑏)‖_𝑝 > 2^((1−𝑝)/𝑝) (|𝑎| + |𝑏|) = −2^((1−𝑝)/𝑝) (𝑎 + 𝑏). Consequently,

$$\phi^p_\theta(a, b) > -2^{\frac{1-p}{p}}(a + b) - (\theta(b)a + \theta(a)b) = -a\left(2^{\frac{1-p}{p}} + \theta(b)\right) - b\left(2^{\frac{1-p}{p}} + \theta(a)\right) \ge 0,$$

where the last inequality follows from the assumption that 𝜃(𝑡) ≥ −2^((1−𝑝)/𝑝) for all 𝑡 ≤ 0.

From the above four cases, it is clear that 𝜙^𝑝_𝜃(𝑎, 𝑏) ≤ 0 only on IR²₊. This completes the proof. □

Figure 1: Graphs of 𝜙^𝑝_𝜃₁ for different values of 𝑝 where 𝜃₁(𝑡) = 𝑒^𝑡. Panels: (a) graph of 𝜙¹_𝜃₁, (b) graph of 𝜙^1.2_𝜃₁, (c) graph of 𝜙²_𝜃₁, (d) graph of 𝜙¹⁰_𝜃₁.
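To see the role of the lower bound −2^((1−𝑝)/𝑝), the following sketch checks the sign pattern of 𝜙^𝑝_𝜃 on a grid for a 𝜃 that actually takes negative values. The function returned by `make_theta` is a hypothetical example of ours (it equals 1 + 𝑡 until it reaches 90% of the admissible lower bound), not a function used in the paper.

```python
def phi_theta(a, b, theta, p):
    """phi^p_theta(a, b) = ||(a, b)||_p - (theta(b) * a + theta(a) * b)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm - (theta(b) * a + theta(a) * b)

def make_theta(p):
    """Hypothetical theta: 1 + t, clipped from below at 0.9 * (-2^((1-p)/p))."""
    floor = -0.9 * 2.0 ** ((1.0 - p) / p)
    return lambda t: max(1.0 + t, floor)

def sign_pattern_ok(p, grid, tol=1e-9):
    theta = make_theta(p)
    for a in grid:
        for b in grid:
            v = phi_theta(a, b, theta, p)
            if a >= 0 and b >= 0 and a * b == 0:
                ok = abs(v) <= tol   # zero exactly on complementary pairs
            elif a > 0 and b > 0:
                ok = v < 0           # negative inside the nonnegative quadrant
            else:
                ok = v > 0           # positive outside the nonnegative quadrant
            if not ok:
                return False
    return True

grid = [-4.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 4.0]
results = {p: sign_pattern_ok(p, grid) for p in (1.2, 2.0, 10.0)}
print(results)
```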

We now state a consequence of Proposition 2.2, similar to Corollary 2.1.

Corollary 2.2. Let 𝜃 satisfy the hypotheses of Proposition 2.2 and suppose that lim_{𝑡→∞} 𝜃(𝑡) > 1 and −2^((1−𝑝)/𝑝) < lim_{𝑡→−∞} 𝜃(𝑡) < 1. Then, |𝜙^𝑝_𝜃(𝑎^𝑘, 𝑏^𝑘)| → ∞ as 𝑘 → ∞ for any sequence {(𝑎^𝑘, 𝑏^𝑘)} ⊆ IR² with |𝑎^𝑘| → ∞ and |𝑏^𝑘| → ∞.

Proof. The result follows from the inequalities obtained from cases (ii), (iii) and (iv) in the proof of Proposition 2.2. □

Whenever 𝑝 > 1, we always assume that 𝜃 satisfies the conditions of Proposition 2.2 for the remaining parts of the paper. We now show some examples.

Example 2.2. Observe that by taking 𝜃(𝑡) ≡ 1, we obtain the generalized FB function (4). Hence, the family of NCP functions given by (5) subsumes the class of generalized FB functions.
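The reduction in Example 2.2 is easy to confirm numerically. In this sketch the helper names and the sample grid are ours; the constant function `theta_one` plays the role of 𝜃(𝑡) ≡ 1.

```python
def phi_theta(a, b, theta, p):
    """phi^p_theta(a, b) = ||(a, b)||_p - (theta(b) * a + theta(a) * b), cf. (5)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm - (theta(b) * a + theta(a) * b)

def phi_fb_p(a, b, p):
    """Generalized FB function (4): ||(a, b)||_p - (a + b)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def theta_one(t):
    """The constant choice theta(t) = 1."""
    return 1.0

grid = [-3.0, -1.0, -0.25, 0.0, 0.25, 1.0, 3.0]
max_gap = max(
    abs(phi_theta(a, b, theta_one, p) - phi_fb_p(a, b, p))
    for p in (1.5, 2.0, 4.0)
    for a in grid
    for b in grid
)
print(max_gap)
```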

Example 2.3. As in Example 2.1, consider 𝜃_𝑖 for 𝑖 = 1, 2, 3. Then, for any 𝑝 > 1, the function 𝜙^𝑝_{𝜃_𝑖} is an NCP function by Proposition 2.2. Notice from Figures 1-3 (subfigures (b) to (d)) that the graphs of 𝜙^𝑝_{𝜃_𝑖} (𝑝 > 1) look “smoother” than that of 𝜙¹_{𝜃_𝑖}. In particular, 𝜙^𝑝_{𝜃_𝑖} fails to be differentiable only at the origin. Finally, 𝜙^𝑝_{𝜃_𝑖} is also nonconvex, similar to 𝜙¹_{𝜃_𝑖} in Example 2.1.

It is known that differentiability and convexity of a complementarity function cannot hold simultaneously [12, 18]. Nonetheless, a complementarity function could be neither differentiable nor convex. The following two propositions indicate that this is the case for 𝜙^𝑝_𝜃. First, as we have observed from Examples 2.1 and 2.3, 𝜙^𝑝_𝜃 is not convex. We claim that this is indeed the case in general.

Figure 2: Graphs of 𝜙^𝑝_𝜃₂ for different values of 𝑝 where 𝜃₂(𝑡) = (√(𝑡² + 4) + 𝑡)/2.

Proposition 2.3. Suppose that 𝜃 is strictly increasing on some interval 𝐼= [0, 𝑡₀). Then, 𝜙^𝑝_𝜃 is not convex.

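Proof. Suppose that 𝜙^𝑝_𝜃 is convex. Since 𝜙^𝑝_𝜃(0, 0) = 0, it must be the case that 𝜙^𝑝_𝜃(𝜆𝑎, 𝜆𝑏) ≤ 𝜆𝜙^𝑝_𝜃(𝑎, 𝑏) for any 𝜆 ∈ [0, 1] and any 𝑎, 𝑏 ∈ IR. On the other hand, taking any 𝜆 ∈ (0, 1) and any 𝑎, 𝑏 ∈ 𝐼 with 𝑎, 𝑏 > 0 yields

$$\begin{aligned} \phi^p_\theta(\lambda a, \lambda b) - \lambda\phi^p_\theta(a, b) &= \|(\lambda a, \lambda b)\|_p - (\lambda\theta(\lambda b)a + \lambda\theta(\lambda a)b) - \lambda\left(\|(a, b)\|_p - (\theta(b)a + \theta(a)b)\right) \\ &= \lambda a(\theta(b) - \theta(\lambda b)) + \lambda b(\theta(a) - \theta(\lambda a)). \end{aligned}$$

Since 𝜆 ∈ (0, 1), we have 𝜆𝑎, 𝜆𝑏 ∈ 𝐼 with 𝜆𝑎 < 𝑎 and 𝜆𝑏 < 𝑏. By the strict monotonicity of 𝜃 on 𝐼, we obtain 𝜙^𝑝_𝜃(𝜆𝑎, 𝜆𝑏) − 𝜆𝜙^𝑝_𝜃(𝑎, 𝑏) > 0, which is a contradiction. Hence, 𝜙^𝑝_𝜃 is not convex. □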
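A concrete witness of nonconvexity can be computed from the identity used in the proof. The values 𝑎 = 𝑏 = 1, 𝜆 = 1/2, 𝑝 = 2 and the choice 𝜃(𝑡) = 𝑒^𝑡 below are illustrative assumptions of ours.

```python
import math

def phi_theta(a, b, theta, p):
    """phi^p_theta(a, b) = ||(a, b)||_p - (theta(b) * a + theta(a) * b)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm - (theta(b) * a + theta(a) * b)

theta = math.exp  # strictly increasing, so any interval [0, t0) works
a, b, lam, p = 1.0, 1.0, 0.5, 2.0

# Convexity together with phi(0, 0) = 0 would force gap <= 0.
gap = phi_theta(lam * a, lam * b, theta, p) - lam * phi_theta(a, b, theta, p)

# Closed form of the same quantity, taken from the proof.
closed_form = (lam * a * (theta(b) - theta(lam * b))
               + lam * b * (theta(a) - theta(lam * a)))
print(gap, closed_form)
```

Here the gap equals 𝑒 − 𝑒^(1/2), which is strictly positive, so the convexity inequality fails at this point.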

We close this section by showing the semismoothness of 𝜙^𝑝_𝜃. The concept of semismoothness was introduced by Mifflin [17] for functionals, and was later extended by Qi and Sun [19] to vector-valued functions.

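Proposition 2.4. Suppose that 𝜃 is continuously differentiable and satisfies the conditions of Proposition 2.1 if 𝑝 = 1 or Proposition 2.2 if 𝑝 > 1. Then, 𝜙^𝑝_𝜃 is semismooth. Moreover, the generalized gradient of 𝜙¹_𝜃 is described by

$$\partial\phi^1_\theta(a, b) = \begin{cases} \left\{[\operatorname{sgn}(a) - \theta'(a)b - \theta(b),\ \operatorname{sgn}(b) - \theta'(b)a - \theta(a)]^T\right\} & \text{if } a \ne 0 \ \&\ b \ne 0, \\ \left\{[0,\ 2\lambda - 1 - a\theta'(0) - \theta(a)]^T : \lambda \in [0, 1]\right\} & \text{if } a > 0 \ \&\ b = 0, \\ \left\{[2\lambda - 1 - b\theta'(0) - \theta(b),\ 0]^T : \lambda \in [0, 1]\right\} & \text{if } a = 0 \ \&\ b > 0, \\ \left\{[-2,\ 2\lambda - 1 - a\theta'(0) - \theta(a)]^T : \lambda \in [0, 1]\right\} & \text{if } a < 0 \ \&\ b = 0, \\ \left\{[2\lambda - 1 - b\theta'(0) - \theta(b),\ -2]^T : \lambda \in [0, 1]\right\} & \text{if } a = 0 \ \&\ b < 0, \\ \left\{[\xi, \zeta]^T : \xi, \zeta \in [-2, 0]\right\} & \text{if } a = b = 0, \end{cases}$$

and for 𝑝 > 1, we have

$$\partial\phi^p_\theta(a, b) = \begin{cases} \left\{\left[\dfrac{\operatorname{sgn}(a)|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - \theta(b) - b\theta'(a),\ \dfrac{\operatorname{sgn}(b)|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - \theta(a) - a\theta'(b)\right]^T\right\} & \text{if } (a, b) \ne (0, 0), \\ \left\{[\xi - 1, \zeta - 1]^T : |\xi|^{\frac{p}{p-1}} + |\zeta|^{\frac{p}{p-1}} \le 1\right\} & \text{if } a = b = 0. \end{cases}$$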

Figure 3: Graphs of 𝜙^𝑝_𝜃₃ for different values of 𝑝 where 𝜃₃(𝑡) = 2/(1 + 𝑒^−𝑡).

Proof. Note that the mapping 𝑓 ∶ (𝑎, 𝑏) ↦ ‖(𝑎, 𝑏)‖_𝑝 is convex and is therefore semismooth. Because 𝑔 ∶ (𝑎, 𝑏) ↦ −(𝜃(𝑏)𝑎 + 𝜃(𝑎)𝑏) is smooth (and hence semismooth), their sum 𝑓 + 𝑔 = 𝜙^𝑝_𝜃 is semismooth. Now, we compute the generalized gradient of 𝜙¹_𝜃. It is clear that 𝜙¹_𝜃 is differentiable only on 𝐷 ∶= {(𝑎, 𝑏) ∶ 𝑎 ≠ 0 and 𝑏 ≠ 0}. Then, its gradient

$$\nabla\phi^1_\theta(a, b) = \begin{bmatrix} \operatorname{sgn}(a) - \theta'(a)b - \theta(b) \\ \operatorname{sgn}(b) - \theta'(b)a - \theta(a) \end{bmatrix} \quad \forall (a, b) \in D$$

coincides with the generalized gradient on 𝐷. Suppose then that (𝑎, 𝑏) ∉ 𝐷. First, we consider the case when 𝑎 > 0 and 𝑏 = 0. By definition, Clarke’s generalized gradient is 𝜕𝜙¹_𝜃(𝑎, 𝑏) = conv(𝜕_𝐵𝜙¹_𝜃(𝑎, 𝑏)), i.e., the convex hull of the 𝐵-subdifferential

$$\partial_B\phi^1_\theta(a, b) = \left\{ g \in \mathbb{R}^2 \ \middle|\ \exists \{(a_k, b_k)\}_{k=1}^{\infty} \subseteq D \text{ s.t. } (a_k, b_k) \to (a, b) \text{ and } \nabla\phi^1_\theta(a_k, b_k) \to g \right\}.$$

Let {(𝑎_𝑘, 𝑏_𝑘)} ⊆ 𝐷 be such that (𝑎_𝑘, 𝑏_𝑘) → (𝑎, 0). For all sufficiently large 𝑘, we have 𝑎_𝑘 > 0. If 𝑏_𝑘 > 0 for all 𝑘 sufficiently large, then

$$\lim_{k\to\infty} \nabla\phi^1_\theta(a_k, b_k) = \lim_{k\to\infty} \begin{bmatrix} \operatorname{sgn}(a_k) - \theta'(a_k)b_k - \theta(b_k) \\ \operatorname{sgn}(b_k) - \theta'(b_k)a_k - \theta(a_k) \end{bmatrix} = \begin{bmatrix} 1 - \theta'(a)\cdot 0 - \theta(0) \\ 1 - \theta'(0)\cdot a - \theta(a) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 - a\theta'(0) - \theta(a) \end{bmatrix},$$

where we used the fact that 𝜃 is continuously differentiable and that 𝜃(0) = 1. If 𝑏_𝑘 < 0 for all 𝑘 sufficiently large, then

$$\lim_{k\to\infty} \nabla\phi^1_\theta(a_k, b_k) = \begin{bmatrix} 1 - \theta'(a)\cdot 0 - \theta(0) \\ -1 - \theta'(0)\cdot a - \theta(a) \end{bmatrix} = \begin{bmatrix} 0 \\ -1 - a\theta'(0) - \theta(a) \end{bmatrix}.$$

In other cases, ∇𝜙¹_𝜃(𝑎_𝑘, 𝑏_𝑘) has no limit. Hence,

$$\partial_B\phi^1_\theta(a, 0) = \left\{ [0,\ 1 - a\theta'(0) - \theta(a)]^T,\ [0,\ -1 - a\theta'(0) - \theta(a)]^T \right\},$$

and the result for the case 𝑎 > 0 and 𝑏 = 0 follows by taking the convex hull. We omit the proof of the other cases as the arguments are similar. For 𝑝 > 1, 𝜙^𝑝_𝜃 is continuously differentiable on IR² except at (0, 0). The computation of the generalized gradient 𝜕𝜙^𝑝_𝜃(0, 0) is similar to the computation of 𝜕𝜙^𝑝_FB(0, 0) shown in [3]. This completes the proof. □
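Away from the origin, the gradient formula for 𝑝 > 1 in Proposition 2.4 can be compared against central finite differences. The sketch below assumes 𝜃(𝑡) = 𝑒^𝑡 (so that 𝜃′ = 𝜃), 𝑝 = 3, and three arbitrary test points; the step size and tolerance are our own choices.

```python
import math

def phi(a, b, p):
    """phi^p_theta with the illustrative choice theta(t) = exp(t)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm - (math.exp(b) * a + math.exp(a) * b)

def sgn(t):
    return (t > 0) - (t < 0)

def grad_phi(a, b, p):
    """Gradient formula for p > 1 and (a, b) != (0, 0); here theta' = theta = exp."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    ga = sgn(a) * abs(a) ** (p - 1) / norm ** (p - 1) - math.exp(b) - b * math.exp(a)
    gb = sgn(b) * abs(b) ** (p - 1) / norm ** (p - 1) - math.exp(a) - a * math.exp(b)
    return ga, gb

def fd_grad(a, b, p, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    ga = (phi(a + h, b, p) - phi(a - h, b, p)) / (2.0 * h)
    gb = (phi(a, b + h, p) - phi(a, b - h, p)) / (2.0 * h)
    return ga, gb

points = [(1.0, -2.0), (-0.5, 0.8), (1.5, 0.3)]
errors = []
for a, b in points:
    exact = grad_phi(a, b, 3.0)
    approx = fd_grad(a, b, 3.0)
    errors.append(max(abs(exact[0] - approx[0]), abs(exact[1] - approx[1])))
print(errors)
```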

3. Some extensions

In this section, we discuss some variants and generalizations of 𝜙^𝑝_𝜃. We also suggest some specific functions which can be used to derive new NCP functions from old ones. To proceed, we denote by 𝑡⁺ the projection of 𝑡 onto [0, ∞), i.e.,

$$t^+ := \begin{cases} t & \text{if } t \ge 0, \\ 0 & \text{if } t < 0. \end{cases}$$

For convenience, we define 𝜙̂^𝑝,𝑖_𝜃 for 𝑖 = 1, 2, 3 as follows:

$$\begin{aligned} \hat\phi^{p,1}_\theta(a, b) &= \phi^p_\theta(a, b) - \alpha a^+ b^+, \\ \hat\phi^{p,2}_\theta(a, b) &= \phi^p_\theta(a, b) - \alpha (a^+ b^+)^2, \\ \hat\phi^{p,3}_\theta(a, b) &= \phi^p_\theta(a, b) - \alpha (a^+)^2 (b^+)^2, \end{aligned}$$

where 𝛼 > 0. For any 𝑝 ≥ 1 and (𝑎, 𝑏) ∈ IR²₊₊, we know from Proposition 2.1 and Proposition 2.2 that 𝜙̂^𝑝,𝑖_𝜃(𝑎, 𝑏) < 0. Moreover, 𝜙̂^𝑝,𝑖_𝜃(𝑎, 𝑏) = 𝜙^𝑝_𝜃(𝑎, 𝑏) > 0 for all (𝑎, 𝑏) ∉ IR²₊, and 𝜙̂^𝑝,𝑖_𝜃 coincides with 𝜙^𝑝_𝜃 whenever 𝑎𝑏 = 0. Consequently, these three variants are easily seen to be NCP functions as well.

Proposition 3.1. The functions 𝜙̂^𝑝,𝑖_𝜃 are all NCP functions for any 𝛼 > 0 and 𝑖 = 1, 2, 3.
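The three variants can be sketched as follows. The helper `phi_hat`, the value 𝛼 = 0.5, the choice 𝜃(𝑡) = 𝑒^𝑡, 𝑝 = 2, and the grid are illustrative assumptions of ours; the code only spot-checks the sign pattern claimed above.

```python
import math

def plus(t):
    """Projection of t onto [0, inf)."""
    return max(t, 0.0)

def phi_theta(a, b, theta, p):
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm - (theta(b) * a + theta(a) * b)

def phi_hat(a, b, theta, p, alpha, i):
    """Variant i in {1, 2, 3}: phi^p_theta minus alpha times a penalty in a^+, b^+."""
    ap, bp = plus(a), plus(b)
    penalty = {1: ap * bp, 2: (ap * bp) ** 2, 3: ap ** 2 * bp ** 2}[i]
    return phi_theta(a, b, theta, p) - alpha * penalty

grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

def variant_ok(i, p=2.0, alpha=0.5, tol=1e-12):
    for a in grid:
        for b in grid:
            v = phi_hat(a, b, math.exp, p, alpha, i)
            if a >= 0 and b >= 0 and a * b == 0:
                ok = abs(v) <= tol
            elif a > 0 and b > 0:
                ok = v < 0
            else:
                ok = v > 0
            if not ok:
                return False
    return True

variant_results = [variant_ok(i) for i in (1, 2, 3)]
print(variant_results)
```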

Recently, “continuous” and “discrete” generalizations of NCP functions have gained some attention, see [2, 3, 5]. These generalizations involve a tunable parameter 𝑞, which has been shown to play an important role in achieving better numerical performance of some NCP function-based algorithms [1,4,6]. Moreover, such an extension may result in an NCP function with different analytic properties [2, 5].

For instance, the generalized FB function (4) is considered a continuous generalization of the FB function (3) in the sense that 𝑝 takes on values from the interval (1, ∞), and the FB function is obtained by taking 𝑝 = 2. On the other hand, discrete generalizations have also been studied recently. For instance, the natural residual (NR) function

$$\phi_{NR}(a, b) = \min\{a, b\} = a - (a - b)^+$$

is another popular NCP function apart from the FB function. A discrete generalization of this function proposed in [5] is given by

$$\phi^q_{NR}(a, b) = a^q - [(a - b)^+]^q, \tag{8}$$

where 𝑞 is a positive odd integer. For 𝑞 = 1, the above function reduces to the original NR function. The generalization is “discrete” in the sense that 𝑞 can only take on positive odd integer values. A notable feature of the generalized NR function (8) is that it is twice differentiable for 𝑞 > 3, which is not the case for the NR function. This makes 𝜙^𝑞_NR suitable for algorithms needing differentiability.
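A short sketch of the NR function and its discrete generalization (8) is given next. The function names and the grid are ours; the checks confirm that 𝑞 = 1 recovers min{𝑎, 𝑏} and that, for 𝑞 = 3, the zeros are exactly the complementary pairs on the sample grid.

```python
def plus(t):
    """Projection of t onto [0, inf)."""
    return max(t, 0.0)

def phi_nr(a, b):
    """Natural residual function min{a, b}."""
    return min(a, b)

def phi_nr_q(a, b, q):
    """Discrete generalization (8): a^q - [(a - b)^+]^q for a positive odd integer q."""
    return a ** q - plus(a - b) ** q

grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

# q = 1 reproduces the NR function.
max_gap_q1 = max(abs(phi_nr_q(a, b, 1) - phi_nr(a, b)) for a in grid for b in grid)

# For q = 3 the function vanishes exactly on {a >= 0, b >= 0, ab = 0}.
zero_pattern_ok = all(
    (abs(phi_nr_q(a, b, 3)) <= 1e-12) == (a >= 0 and b >= 0 and a * b == 0)
    for a in grid
    for b in grid
)
print(max_gap_q1, zero_pattern_ok)
```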

We wish to point out that the technique employed in the second type of generalization discussed above can always be adopted for NCP functions of the form

$$\phi(a, b) = \bar\phi_1(a, b) - \bar\phi_2(a, b). \tag{9}$$

In other words, the function

$$\phi^q(a, b) := [\bar\phi_1(a, b)]^q - [\bar\phi_2(a, b)]^q$$

is always a discrete generalization of 𝜙 given in (9), where 𝑞 is a positive odd integer. As a matter of fact, we can further extend such a technique by considering any family of injective functions {𝑓_𝑞}. More precisely, the function

$$\phi_{f_q}(a, b) := f_q(\bar\phi_1(a, b)) - f_q(\bar\phi_2(a, b)) \tag{10}$$

is easily seen to be an NCP function whenever 𝑓_𝑞 is injective and 𝜙 is an NCP function as in (9). The transformation (10) has also been noted in [11]. For instance, the discrete generalized NR function (8) can be realized by transforming the NR function as in (10) using the map 𝑓_𝑞(𝑡) = 𝑡^𝑞, where 𝑞 > 0 is an odd integer. Applying the same map to our NCP function 𝜙^𝑝_𝜃, we obtain a discrete generalization as

$$(\phi^p_\theta)^q(a, b) := \|(a, b)\|_p^q - (\theta(b)a + \theta(a)b)^q, \tag{11}$$

where 𝑞 is a positive odd integer. As mentioned above, a generalization can possibly yield NCP functions with different analytic properties. In the case of (11), it is easy to verify that (𝜙^𝑝_𝜃)^𝑞 is continuously differentiable on IR² whenever 𝑞 ≥ 𝑝 > 1, whereas the original function 𝜙^𝑝_𝜃 is not differentiable at the origin.
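The discrete generalization (11) can be sketched in a few lines. The choices 𝜃(𝑡) = 𝑒^𝑡, 𝑝 = 2, the values of 𝑞, and the grid are illustrative assumptions of ours; the check confirms that the zero set is unchanged because 𝑡 ↦ 𝑡^𝑞 is injective for odd 𝑞.

```python
import math

def phi_theta_q(a, b, theta, p, q):
    """(phi^p_theta)^q(a, b) = ||(a, b)||_p^q - (theta(b) * a + theta(a) * b)^q, cf. (11).

    q is expected to be a positive odd integer (passed as a Python int).
    """
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    s = theta(b) * a + theta(a) * b
    return norm ** q - s ** q

grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

def zero_set_ok(q, p=2.0, tol=1e-12):
    """Zeros should be exactly the pairs with a >= 0, b >= 0 and ab = 0."""
    return all(
        (abs(phi_theta_q(a, b, math.exp, p, q)) <= tol)
        == (a >= 0 and b >= 0 and a * b == 0)
        for a in grid
        for b in grid
    )

zero_set_results = {q: zero_set_ok(q) for q in (1, 3, 5)}
print(zero_set_results)
```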

Another discrete generalization of 𝜙^𝑝_𝜃 can be obtained by applying the same map 𝑓_𝑞(𝑡) = 𝑡^𝑞 to the equivalent form of 𝜙^𝑝_𝜃 given by

$$\phi^p_\theta(a, b) = \phi^p_{FB}(a, b) - \left[a(\theta(b) - 1) + b(\theta(a) - 1)\right]. \tag{12}$$

This yields another symmetric generalization

$$(\phi^p_\theta)^q_{FB}(a, b) = [\phi^p_{FB}(a, b)]^q - \left[a(\theta(b) - 1) + b(\theta(a) - 1)\right]^q.$$

For 𝑞 = 1, note that Proposition 2.4 guarantees the semismoothness of 𝜙^𝑝_𝜃. Interestingly, the above generalization yields smooth NCP functions for any 𝑝 > 1 and odd integers 𝑞 ≥ 3. This can be easily verified and we omit the proof. We summarize these results in Proposition 3.2. Note that the above generalizations are all symmetric. In general, the transformation given in (10) yields symmetric NCP functions when applied to our proposed NCP function 𝜙^𝑝_𝜃 and its alternative form (12).


Proposition 3.2. Suppose 𝜃 is continuously differentiable and satisfies the conditions of Proposition 2.1 if 𝑝 = 1, or Proposition 2.2 if 𝑝 > 1. Let 𝑞 ≥ 1 be an odd integer. Then,

$$(\phi^p_\theta)^q(a, b) := \|(a, b)\|_p^q - (\theta(b)a + \theta(a)b)^q$$

is a discrete generalization of 𝜙^𝑝_𝜃, which is smooth if 𝑞 ≥ 𝑝 > 1. Additionally,

$$(\phi^p_\theta)^q_{FB}(a, b) := [\phi^p_{FB}(a, b)]^q - \left[a(\theta(b) - 1) + b(\theta(a) - 1)\right]^q$$

is also a discrete generalization of 𝜙^𝑝_𝜃, which is smooth if 𝑞 ≥ 3 and 𝑝 > 1.

It is interesting to note that 𝑓_𝑞(𝑡) = 𝑡^𝑞, with 𝑞 ≥ 1 an odd integer, is one of the functions usually employed in order to improve the numerical performance of algorithms. Such a map is referred to as an “activation function” in the literature on the neural network approach for optimization. It is often utilized to improve the convergence rate, and other examples are given as follows:

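1. Bipolar Sigmoid Function [20,21]

$$f_q(t) = \frac{1 - e^{-qt}}{1 + e^{-qt}}, \quad q > 0.$$

2. Power-Sigmoid Function [20,21]

$$f_q(t) = \begin{cases} \dfrac{1 + e^{-q_1}}{1 - e^{-q_1}} \cdot \dfrac{1 - e^{-q_1 t}}{1 + e^{-q_1 t}} & \text{if } |t| < 1, \\ t^{q_2} & \text{if } |t| \ge 1, \end{cases}$$

where 𝑞 = (𝑞₁, 𝑞₂), 𝑞₁ > 2 and 𝑞₂ ≥ 3 is an odd integer.

3. Smooth Power-Sigmoid Function [20,21]

$$f_q(t) = \frac{1}{2} \cdot \frac{1 + e^{-q_1}}{1 - e^{-q_1}} \cdot \frac{1 - e^{-q_1 t}}{1 + e^{-q_1 t}} + \frac{1}{2} t^{q_2},$$

where 𝑞 = (𝑞₁, 𝑞₂), 𝑞₁ > 2 and 𝑞₂ ≥ 3.

4. Sign-Bi-Power Function [14]

$$f_q(t) = \begin{cases} |t|^q + |t|^{1/q} & \text{if } t > 0, \\ 0 & \text{if } t = 0, \\ -|t|^q - |t|^{1/q} & \text{if } t < 0, \end{cases} \quad q > 0.$$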
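The four activation functions listed above can be sketched as plain Python functions. The default parameter values (𝑞 = 1, 𝑞₁ = 4, 𝑞₂ = 3, 𝑞 = 0.5) are our own illustrative choices within the stated ranges, and the monotonicity check on a small grid is only a sanity test of injectivity, not a proof.

```python
import math

def bipolar_sigmoid(t, q=1.0):
    """(1 - exp(-q t)) / (1 + exp(-q t)), q > 0."""
    e = math.exp(-q * t)
    return (1.0 - e) / (1.0 + e)

def power_sigmoid(t, q1=4.0, q2=3):
    """Scaled bipolar sigmoid for |t| < 1 and t^q2 for |t| >= 1 (q1 > 2, q2 odd >= 3)."""
    if abs(t) >= 1.0:
        return t ** q2
    scale = (1.0 + math.exp(-q1)) / (1.0 - math.exp(-q1))
    return scale * bipolar_sigmoid(t, q1)

def smooth_power_sigmoid(t, q1=4.0, q2=3):
    """Average of the scaled bipolar sigmoid and the power function t^q2."""
    scale = (1.0 + math.exp(-q1)) / (1.0 - math.exp(-q1))
    return 0.5 * scale * bipolar_sigmoid(t, q1) + 0.5 * t ** q2

def sign_bi_power(t, q=0.5):
    """sgn(t) * (|t|^q + |t|^(1/q)), q > 0."""
    if t == 0:
        return 0.0
    s = 1.0 if t > 0 else -1.0
    return s * (abs(t) ** q + abs(t) ** (1.0 / q))

activations = [bipolar_sigmoid, power_sigmoid, smooth_power_sigmoid, sign_bi_power]
ts = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

def strictly_increasing(f):
    values = [f(t) for t in ts]
    return all(u < v for u, v in zip(values, values[1:]))

monotone = [strictly_increasing(f) for f in activations]
zero_at_zero = [f(0.0) == 0.0 for f in activations]
print(monotone, zero_at_zero)
```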

These functions are all injective maps which can be employed to transform an NCP function of the form (9). However, none of these transformations leads to a generalization in the sense illustrated above. Indeed, a generalized version can only be obtained if there exists 𝑞̄ such that 𝑓_𝑞̄(𝑡) ≡ 𝑡. We do note, however, that 𝑓_𝑞/2 yields a continuous generalization via the transformation (10) if 𝑓_𝑞 is the sign-bi-power function. In any case, an interesting research direction is to explore the applicability of the above injective functions in improving the numerical efficiency of NCP function-based solution methods, just as these functions improve numerical performance in neural network approaches. In the case of the power function 𝑓_𝑞(𝑡) = 𝑡^𝑞 and the generalized NR function, some numerical results are reported in [1]. Finally, we note that it is also worth considering in numerical implementations the composite map 𝑓_𝑞 ◦ 𝜙^𝑝_𝜃. This is also an NCP function provided that 𝑓_𝑞 is injective with 𝑓_𝑞(0) = 0, as is the case for the above four activation functions.
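The composite map 𝑓_𝑞 ◦ 𝜙^𝑝_𝜃 can be sketched for one concrete pair. Below, 𝑓_𝑞 is the bipolar sigmoid with 𝑞 = 1 and 𝜙 is 𝜙¹_𝜃 with 𝜃(𝑡) = 𝑒^𝑡; both choices, the helper names, and the grid are illustrative assumptions of ours.

```python
import math

def phi1(a, b, theta):
    """phi^1_theta(a, b) = |a| + |b| - (theta(b) * a + theta(a) * b)."""
    return abs(a) + abs(b) - (theta(b) * a + theta(a) * b)

def bipolar_sigmoid(t, q=1.0):
    """Injective activation function with f(0) = 0."""
    e = math.exp(-q * t)
    return (1.0 - e) / (1.0 + e)

def composite(a, b, theta=math.exp, q=1.0):
    """Composite map f_q(phi^1_theta(a, b))."""
    return bipolar_sigmoid(phi1(a, b, theta), q)

grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

# The composite vanishes exactly where the original NCP function does.
composite_is_ncp = all(
    (abs(composite(a, b)) <= 1e-12) == (a >= 0 and b >= 0 and a * b == 0)
    for a in grid
    for b in grid
)
print(composite_is_ncp)
```

Since the bipolar sigmoid is bounded, the composite takes values in (−1, 1); whether such a rescaling helps a particular algorithm is exactly the kind of question left open in the text.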

4. Concluding Remarks

In this short paper, we proposed a new way to construct NCP functions. The family of generalized FB functions, in particular, can be generated from the proposed approach. We proved herein some basic properties of the newly discovered NCP function, which include the growth behavior, nonconvexity and semismoothness of 𝜙^𝑝_𝜃. These are prerequisites for designing solution methods based on the new NCP function.

Observe that for a fixed 𝜃, the NCP function 𝜙^𝑝_𝜃 is parametrized by 𝑝 ≥ 1. Future research directions can explore the effects of tuning the parameter 𝑝 on the performance of algorithms. This is worth exploring as it has been shown that for the case of the generalized FB and NR functions, better convergence rates of solution methods can be attained by controlling the values of 𝑝 [1,4,6]. Numerical comparisons of these new NCP functions with popular NCP functions such as the FB and NR functions are recommended. How to best choose the parameter 𝑝 and the function 𝜃 are topics worth investigating, as this could suggest alternative NCP functions that can work well with algorithms. Finally, it seems worthwhile to explore the effects of choosing different activation functions 𝑓_𝑞, such as the bipolar sigmoid function, power-sigmoid function, smooth power-sigmoid function, and the sign-bi-power function, in forming new NCP functions from old ones such as (𝜙^𝑝_𝜃)_{𝑓_𝑞} and 𝑓_𝑞 ◦ 𝜙^𝑝_𝜃. We leave it for future research to study whether or not these transformations can be used to improve the numerical performance of an NCP function-based algorithm. If these functions indeed improve some algorithms, it is recommended to determine which one of these will work best for the complementarity problem.

Acknowledgement

J.-S. Chen is supported by the Ministry of Science and Technology, Taiwan.

References

[1] J.H. ALCANTARA AND J.-S. CHEN, Neural networks based on three classes of NCP-functions for solving nonlinear complementarity problems, Neurocomputing 359 (2019) 102–113.

[2] Y.-L. CHANG, J.-S. CHEN, AND C.-Y. YANG, Symmetrization of generalized natural residual function for NCP, Operations Research Letters 43 (2015) 354–358.

[3] J.-S. CHEN, On some NCP-functions based on the generalized Fischer-Burmeister function, Asia-Pacific Journal of Operations Research 24 (2007) 401–420.


ity problems, Information Sciences 180 (2010) 697–711.

[5] J.-S. CHEN, C.-H. KO, AND X.-R. WU, What is the generalization of natural residual function for NCP?, Pacific Journal of Optimization 12 (2016) 19–27.

[6] J.-S. CHEN AND S.-H. PAN, A family of NCP functions and a descent method for the nonlinear complementarity problem, Computational Optimization and Applications 40 (2008) 389–404.

[7] F. FACCHINEI AND J.-S. PANG, Finite-Dimensional Variational Inequalities and Complementarity Problems, Volumes I and II, Springer-Verlag, New York, 2003.

[8] F. FACCHINEI AND J. SOARES, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization 7 (1997) 225–247.

[9] M.C. FERRIS, O.L. MANGASARIAN, AND J.-S. PANG, editors, Complementarity: Applications, Algorithms and Extensions, Kluwer Academic Publishers, Dordrecht, 2001.

[10] M.C. FERRIS AND J.-S. PANG, Engineering and economic applications of complementarity problems, SIAM Review 39 (1997) 669–713.

[11] A. GALANTAI, Properties and construction of NCP functions, Computational Optimization and Applications 52 (2012) 805–824.

[12] C.-H. HUANG, J.-S. CHEN, AND J.E. MARTINEZ-LEGAZ, Differentiability v.s. convexity of complementarity functions, Optimization Letters 11(1) (2017) 209–216.

[13] C. KANZOW, N. YAMASHITA, AND M. FUKUSHIMA, New NCP-functions and their properties, Journal of Optimization Theory and Applications 94 (1997) 115–135.

[14] S. LI, S. CHEN, AND B. LIU, Accelerating a recurrent neural network to finite-time convergence for solving time-varying Sylvester equation by using a sign-bi-power activation function, Neural Processing Letters 37 (2013) 189–205.

[15] Z.Q. LUO AND P. TSENG, A new class of merit functions for the nonlinear complementarity problem, In: Ferris, M.C., Pang, J.S. (eds.) Complementarity and Variational Problems: State of the Art, pp. 204–225. SIAM, Philadelphia (1997).

[16] O.L. MANGASARIAN, Equivalence of the complementarity problem to a system of nonlinear equations, SIAM Journal on Applied Mathematics 31 (1976) 89–92.

[17] R. MIFFLIN, Semismooth and semiconvex functions in constrained optimization, SIAM Journal on Control and Optimization 15 (1977) 959–972.

[18] S.M. MIRI AND S. EFFATI, On generalized convexity of nonlinear complementarity functions, Journal of Optimization Theory and Applications 164 (2015) 723–730.

[19] L. QI AND J. SUN, A nonsmooth version of Newton’s method, Mathematical Programming 58 (1993) 353–367.

[20] Y. ZHANG, Z. FAN, AND Z. LI, Zhang neural network for online solution of time-varying Sylvester equation, In: Proceedings of the 2nd International Conference on Advances in Computation and Intelligence, ISICA’07. Berlin, Springer, 276–285.

[21] Y. ZHANG AND S.S. GE, Design and analysis of a general recurrent neural network model for time-varying matrix inversion, IEEE Transactions on Neural Networks 16 (2005) 1477–1490.