On construction of new NCP functions
Jan Harold Alcantaraᵃ, Chen-Han Leeᵃ, Chieu Thanh Nguyenᵃ, Yu-Lin Changᵃ and Jein-Shan Chenᵃ,∗
ᵃ Department of Mathematics, National Taiwan Normal University
A R T I C L E I N F O
Keywords:
Complementarity functions
Nonlinear complementarity problem
A B S T R A C T
We report a new method to construct complementarity functions for the nonlinear complementarity problem (NCP). Basic properties related to growth behavior, convexity and semismoothness of the newly discovered NCP functions are proved. We also present some variants, generalizations and other transformations of these NCP functions. Finally, we propose some interesting research directions that can be explored in the NCP research.
1. Motivation
Nonlinear complementarity problems (NCPs) are an important class of variational inequalities often encountered when dealing with the Karush-Kuhn-Tucker conditions of optimization problems [7]. Beyond this, the NCP provides an important framework for the study of equilibrium problems, which arise in areas such as operations research, engineering and economics [7,9,10].
Given a function $F : \mathbb{R}^n \to \mathbb{R}^n$, the problem of finding a point $x \in \mathbb{R}^n$ such that
\[ x \ge 0, \quad F(x) \ge 0, \quad \text{and} \quad \langle x, F(x) \rangle = 0 \tag{1} \]
is precisely the nonlinear complementarity problem. Various approaches to solving this problem have been proposed in the past years. One class of methods utilizes a so-called NCP function, that is, a function $\phi : \mathbb{R}^2 \to \mathbb{R}$ such that
\[ \phi(a, b) = 0 \iff a \ge 0, \ b \ge 0, \ \text{and} \ ab = 0. \]
An NCP function is useful in solving NCP (1) because it naturally exploits the structure of the problem. In particular, defining $\Phi_F : \mathbb{R}^n \to \mathbb{R}^n$ as
\[ \Phi_F(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}, \tag{2} \]
it is clear that NCP (1) is equivalent to the system of equations $\Phi_F(x) = 0$. Moreover, the NCP function also gives rise to a merit function, namely $\Psi_F(x) := \frac{1}{2}\|\Phi_F(x)\|^2$. That is, whenever (1) is solvable, the global minimizers of $\Psi_F$ and the solutions of (1) coincide. Consequently, designing solution methods for (1) usually involves these NCP functions.
Due to their usefulness, numerous NCP functions have been proposed and extensively studied in the literature [11].
Among them, the Fischer-Burmeister (FB) function given by
\[ \phi_{\rm FB}(a, b) = \sqrt{a^2 + b^2} - (a + b) \tag{3} \]
has gained significant attention and has been widely used in several studies because of its desirable numerical properties. In addition, as noted in [11], it is remarkable that several NCP functions are akin to the FB function. For instance, the generalized FB function
\[ \phi^p_{\rm FB}(a, b) = \|(a, b)\|_p - (a + b), \quad p > 1 \tag{4} \]
is an interesting generalization of $\phi_{\rm FB}$ which can be efficiently used in solving NCPs. Here, $\|\cdot\|_p$ denotes the $l_p$-norm, and tuning the parameter $p$ has been shown to possibly improve the numerical performance of some algorithms [4,6].
∗Corresponding author. Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
[email protected] (J. Chen); ORCID(s): 0000-0002-4596-9419 (J. Chen)
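As a minimal sketch of how the map in (2) and the merit function fit together, the following Python snippet evaluates both for the FB function (3) and the generalized FB function (4). The test map `F_example` and all function names are illustrative assumptions and do not come from the paper.

```python
import math

def phi_fb(a, b):
    """Fischer-Burmeister function (3)."""
    return math.hypot(a, b) - (a + b)

def phi_fb_p(a, b, p):
    """Generalized FB function (4); intended for p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def big_phi(F, x, phi=phi_fb):
    """Componentwise map Phi_F(x) from (2)."""
    return [phi(xi, fi) for xi, fi in zip(x, F(x))]

def merit(F, x, phi=phi_fb):
    """Merit function Psi_F(x) = 0.5 * ||Phi_F(x)||^2."""
    return 0.5 * sum(v * v for v in big_phi(F, x, phi))

def F_example(x):
    """Hypothetical affine map; x = (1, 0) solves the associated NCP."""
    return [x[0] - 1.0, x[1] + 1.0]

solution = [1.0, 0.0]
non_solution = [0.5, 0.5]
print(merit(F_example, solution))      # vanishes at a solution of (1)
print(merit(F_example, non_solution))  # strictly positive elsewhere
```

Passing a different `phi` to `merit` swaps the NCP function without touching the rest of the code, which is the sense in which solution methods are built around these functions.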
A general way to construct NCP functions was first given by Mangasarian in [16], and another method was formulated by Luo and Tseng [15] and Kanzow, Yamashita, and Fukushima [13]. More recently, a rigorous discussion on how to construct NCP functions was presented by Galantai in [11]. The purpose of this paper is to present another general method to construct NCP functions which is new to the literature. The very useful generalized FB function $\phi^p_{\rm FB}$ is one of the functions that our method can generate. We also discuss some analytic properties and geometric views of the proposed functions. We present some variants and generalizations of these NCP functions, and we also suggest some possibly important extensions. Finally, we report some possible research directions that are worth exploring in the future.
2. New NCP Functions
In this section, we present a new method to construct continuous NCP functions. Let $\theta : \mathbb{R} \to \mathbb{R}$ be continuous and define $\phi^p_\theta : \mathbb{R}^2 \to \mathbb{R}$ as
\[ \phi^p_\theta(a, b) = \|(a, b)\|_p - (\theta(b)a + \theta(a)b), \quad p \ge 1. \tag{5} \]
Note that $\phi^p_\theta$ is a continuous symmetric function; that is, $\phi^p_\theta(a, b) = \phi^p_\theta(b, a)$. For some suitable choice of $\theta$, the above function yields an NCP function. We divide our discussion into two cases, depending on the value of $p$.
We first consider the case of $p = 1$, that is,
\[ \phi^1_\theta(a, b) = |a| + |b| - (\theta(b)a + \theta(a)b). \tag{6} \]
Proposition 2.1. Let $\theta : \mathbb{R} \to \mathbb{R}$ be such that $\theta(0) = 1$, $\theta(t) > 1$ for all $t > 0$, and $-1 < \theta(t) < 1$ for all $t < 0$. Then, $\phi^1_\theta$ defined by (6) is an NCP function. Moreover, $\phi^1_\theta(a, b) \le 0$ if and only if $(a, b) \in \mathbb{R}^2_+$.
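Proof. Observe that we may rewrite $\phi^1_\theta$ as $\phi^1_\theta(a, b) = a(\operatorname{sgn}(a) - \theta(b)) + b(\operatorname{sgn}(b) - \theta(a))$, where
\[ \operatorname{sgn}(t) := \begin{cases} 1 & \text{if } t > 0, \\ 0 & \text{if } t = 0, \\ -1 & \text{if } t < 0. \end{cases} \]
Then, it is easy to verify that
\[ \phi^1_\theta(a, b) = \begin{cases} 0 & \text{if } a, b \ge 0 \ \& \ ab = 0, \\ a(1 - \theta(b)) + b(1 - \theta(a)) & \text{if } a > 0 \ \& \ b > 0, \\ -a(1 + \theta(b)) + b(1 - \theta(a)) & \text{if } a < 0 \ \& \ b \ge 0, \\ -a(1 + \theta(b)) - b(1 + \theta(a)) & \text{if } a < 0 \ \& \ b < 0. \end{cases} \tag{7} \]
By our hypotheses on $\theta$, we see that $\phi^1_\theta(a, b) < 0$ for the second case, and $\phi^1_\theta(a, b) > 0$ for the third and last cases. Finally, by symmetry of $\phi^1_\theta$, we have $\phi^1_\theta(a, b) > 0$ when $a \ge 0$ and $b < 0$, as in the third case. In other words, $\phi^1_\theta(a, b) = 0$ if and only if $a, b \ge 0$ and $ab = 0$. This says that $\phi^1_\theta$ is an NCP function. □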
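The two claims of Proposition 2.1 can be checked numerically on a grid. The snippet below is a sampled sanity check, not a proof; it assumes $\theta(t) = e^t$ (the choice used in Figure 1), and the helper names and the grid are illustrative.

```python
import math

def phi1(a, b, theta):
    """phi^1_theta from (6)."""
    return abs(a) + abs(b) - (theta(b) * a + theta(a) * b)

def check_prop_2_1(theta, grid, tol=1e-12):
    """Return the grid points that violate either claim of Proposition 2.1."""
    bad = []
    for a in grid:
        for b in grid:
            val = phi1(a, b, theta)
            complementary = a >= 0 and b >= 0 and a * b == 0
            in_orthant = a >= 0 and b >= 0
            zero_ok = (abs(val) <= tol) == complementary   # NCP property
            sign_ok = (val <= tol) == in_orthant           # phi <= 0 iff (a, b) >= 0
            if not (zero_ok and sign_ok):
                bad.append((a, b, val))
    return bad

grid = [k / 4 for k in range(-12, 13)]   # -3.0, -2.75, ..., 3.0
violations = check_prop_2_1(math.exp, grid)
print(len(violations))
```

Replacing `math.exp` with the constant function 1 breaks the hypothesis $\theta(t) > 1$ for $t > 0$, and the check then reports violations throughout the open positive quadrant.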
An important consequence of Proposition 2.1 is given by the following result, which describes the growth behavior of the NCP function $\phi^1_\theta$. This corollary plays an important role in establishing coerciveness of $\Phi_F$ given by (2) (see [8]), which in turn is helpful in the convergence analysis of algorithms. We omit the proof of the following corollary since it easily follows from the formula of $\phi^1_\theta$ given in (7). We do note that the strict inequality assumptions on the limits of $\theta$ at $\pm\infty$ are important to avoid indeterminate products.
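Corollary 2.1. Let $\theta$ satisfy the hypothesis of Proposition 2.1 such that $\lim_{t \to \infty} \theta(t) > 1$ and $-1 < \lim_{t \to -\infty} \theta(t) < 1$. Then, $|\phi^1_\theta(a_k, b_k)| \to \infty$ as $k \to \infty$ for any sequence $\{(a_k, b_k)\} \subseteq \mathbb{R}^2$ with $|a_k| \to \infty$ and $|b_k| \to \infty$.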
In the remaining parts of the paper, we assume that $\theta$ satisfies the conditions in Proposition 2.1 whenever $p = 1$. Note that a simple choice of $\theta$ is any monotonically increasing function whose range is contained in $(-1, \infty)$, whose graph passes through $(0, 1)$, and which is strictly monotonic in some neighborhood of $0$.
Example 2.1. The functions $\theta_1(t) = e^t$, $\theta_2(t) = \frac{\sqrt{t^2+4}+t}{2}$, and $\theta_3(t) = \frac{2}{1+e^{-t}}$ clearly satisfy the conditions of Proposition 2.1 and Corollary 2.1. The graphs of $\phi^1_{\theta_i}(a, b)$ for $i = 1, 2, 3$ are shown in Figures 1a, 2a, and 3a. For each $i$, it is evident that the function $\phi^1_{\theta_i}$ is non-positive on $\mathbb{R}^2_+$ and has the growth behavior described in Corollary 2.1. In addition, $\phi^1_{\theta_i}$ is a nonsmooth nonconvex function for all $i$. In particular, the function has sharp trace curves corresponding to $a = 0$ and $b = 0$, which are the points of nondifferentiability of $\phi^1_\theta$.
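The hypotheses of Proposition 2.1 are easy to test on sample points. The sketch below does so for $\theta_1(t) = e^t$, $\theta_2(t) = (\sqrt{t^2+4}+t)/2$ and $\theta_3(t) = 2/(1+e^{-t})$, as given in the captions of Figures 1 to 3. The helper `satisfies_prop_2_1` and the sample set are assumptions made for illustration, and a sampled check is of course weaker than the analytic verification.

```python
import math

def theta1(t):
    return math.exp(t)

def theta2(t):
    return (math.sqrt(t * t + 4.0) + t) / 2.0

def theta3(t):
    return 2.0 / (1.0 + math.exp(-t))

def satisfies_prop_2_1(theta, samples, tol=1e-12):
    """Sampled check of: theta(0) = 1, theta > 1 on t > 0, -1 < theta < 1 on t < 0."""
    if abs(theta(0.0) - 1.0) > tol:
        return False
    for t in samples:
        if t > 0 and not theta(t) > 1.0:
            return False
        if t < 0 and not (-1.0 < theta(t) < 1.0):
            return False
    return True

samples = [k / 10 for k in range(-50, 51) if k != 0]   # -5.0, ..., 5.0 without 0
results = {f.__name__: satisfies_prop_2_1(f, samples) for f in (theta1, theta2, theta3)}
print(results)
```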
2.2. The case 𝑝 >1
Now, we consider $\phi^p_\theta$ with $p > 1$ and provide conditions on $\theta$ which make $\phi^p_\theta$ an NCP function. The conditions are similar to those given in Proposition 2.1. However, we do not require the inequalities between $\theta(t)$ and $1$ to be strict, but we need a higher lower bound for $\theta(t)$ on $(-\infty, 0)$.
Proposition 2.2. Let $p > 1$. Suppose $\theta : \mathbb{R} \to \mathbb{R}$ is such that $\theta(0) = 1$, $\theta(t) \ge 1$ for all $t > 0$, and $-2^{\frac{1-p}{p}} \le \theta(t) \le 1$ for all $t < 0$. Then, $\phi^p_\theta$ defined by (5) is an NCP function. Moreover, $\phi^p_\theta(a, b) \le 0$ if and only if $(a, b) \in \mathbb{R}^2_+$.
Proof. Since $\phi^p_\theta$ is symmetric with respect to the line $a = b$, it suffices to check the values of $\phi^p_\theta$ on the region $a \le b$. We consider four cases.
(i) If $a = 0$ and $b > 0$, then $\phi^p_\theta(a, b) = |b| - \theta(0)b = 0$ since $\theta(0) = 1$.
(ii) Suppose $a > 0$ and $b > 0$. Since $p > 1$, we have $\|(a, b)\|_p = (a^p + b^p)^{\frac{1}{p}} < a + b$, which in turn yields
\[ \phi^p_\theta(a, b) < a + b - (\theta(b)a + \theta(a)b) = a(1 - \theta(b)) + b(1 - \theta(a)). \]
Because $\theta(t) \ge 1$ for any $t > 0$, it follows that $\phi^p_\theta(a, b) < 0$.
(iii) Suppose $a < 0$ and $b \ge 0$. In this case, we have that $\|(a, b)\|_p > a + b$. Thus,
\[ \phi^p_\theta(a, b) > a + b - (\theta(b)a + \theta(a)b) = a(1 - \theta(b)) + b(1 - \theta(a)). \]
Since $b \ge 0$, we have $1 - \theta(b) \le 0$ and so the term $a(1 - \theta(b))$ is nonnegative. On the other hand, $1 - \theta(a) \ge 0$ since $a < 0$, which means that the term $b(1 - \theta(a))$ is likewise nonnegative. Hence, $\phi^p_\theta(a, b) > 0$.
(iv) Finally, suppose that $a < 0$ and $b < 0$. The function $t \mapsto t^p$ is strictly convex on $[0, \infty)$ since $p > 1$. Thus,
\[ \|(a, b)\|_p^p = |a|^p + |b|^p > 2^{1-p}(|a| + |b|)^p, \]
which implies that $\|(a, b)\|_p > 2^{\frac{1-p}{p}}(|a| + |b|) = -2^{\frac{1-p}{p}}(a + b)$. Consequently,
\[ \phi^p_\theta(a, b) > -2^{\frac{1-p}{p}}(a + b) - (\theta(b)a + \theta(a)b) = -a\big(2^{\frac{1-p}{p}} + \theta(b)\big) - b\big(2^{\frac{1-p}{p}} + \theta(a)\big) \ge 0, \]
where the last inequality follows from the assumption that $\theta(t) \ge -2^{\frac{1-p}{p}}$ for all $t \le 0$.
From the above four cases, it is clear that $\phi^p_\theta(a, b) \le 0$ only on $\mathbb{R}^2_+$. This completes the proof. □
Figure 1: Graphs of $\phi^p_{\theta_1}$ for different values of $p$, where $\theta_1(t) = e^t$. Panels (a) to (d) show $p = 1$, $1.2$, $2$ and $10$, respectively.
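A grid check analogous to the $p = 1$ case illustrates Proposition 2.2. The sketch assumes $\theta_3(t) = 2/(1+e^{-t})$ from the caption of Figure 3, which stays strictly inside the admissible range; the function names, grid and tolerance are illustrative choices rather than anything prescribed by the paper.

```python
import math

def theta3(t):
    """theta_3(t) = 2 / (1 + e^{-t})."""
    return 2.0 / (1.0 + math.exp(-t))

def phi_p(a, b, p, theta):
    """phi^p_theta from (5)."""
    norm_p = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm_p - (theta(b) * a + theta(a) * b)

def lower_bound(p):
    """Threshold -2^{(1-p)/p} imposed on theta over (-inf, 0) in Proposition 2.2."""
    return -(2.0 ** ((1.0 - p) / p))

def count_violations(p, theta, grid, tol=1e-9):
    """Count grid points where the zero set or the sign pattern is wrong."""
    bad = 0
    for a in grid:
        for b in grid:
            val = phi_p(a, b, p, theta)
            complementary = a >= 0 and b >= 0 and a * b == 0
            in_orthant = a >= 0 and b >= 0
            if (abs(val) <= tol) != complementary or (val <= tol) != in_orthant:
                bad += 1
    return bad

grid = [k / 4 for k in range(-12, 13)]   # -3.0, -2.75, ..., 3.0
for p in (1.2, 2.0, 10.0):
    print(p, lower_bound(p), count_violations(p, theta3, grid))
```

The tolerance is looser than for $p = 1$ because the $p$-th root introduces rounding error at points on the axes.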
We now state a consequence of Proposition 2.2, similar to Corollary 2.1.
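Corollary 2.2. Let $\theta$ satisfy the hypothesis of Proposition 2.2 such that $\lim_{t \to \infty} \theta(t) > 1$ and $-2^{\frac{1-p}{p}} < \lim_{t \to -\infty} \theta(t) < 1$. Then, $|\phi^p_\theta(a_k, b_k)| \to \infty$ as $k \to \infty$ for any sequence $\{(a_k, b_k)\} \subseteq \mathbb{R}^2$ with $|a_k| \to \infty$ and $|b_k| \to \infty$.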
Proof. The result follows from the inequalities obtained from cases (ii), (iii) and (iv) in the proof of Proposition 2.2. □
Whenever $p > 1$, we always assume that $\theta$ satisfies the conditions of Proposition 2.2 for the remaining parts of the paper. We now show some examples.
Example 2.2. Observe that by taking 𝜃(𝑡) ≡ 1, we obtain the generalized FB function (4). Hence, the family of NCP functions given by (5) subsumes the class of generalized FB functions.
Example 2.3. As in Example 2.1, consider $\theta_i$ for $i = 1, 2, 3$. Then, for any $p > 1$, the function $\phi^p_{\theta_i}$ is an NCP function by Proposition 2.2. Notice from Figures 1-3 (subfigures (b) to (d)) that the graphs of $\phi^p_{\theta_i}$ ($p > 1$) look “smoother” than that of $\phi^1_{\theta_i}$. In particular, $\phi^p_\theta$ fails to be differentiable only at the origin. Finally, $\phi^p_{\theta_i}$ is also nonconvex, similar to $\phi^1_{\theta_i}$ in Example 2.1.
It is known that differentiability and convexity cannot hold simultaneously for any complementarity function [12, 18]. Nonetheless, a complementarity function could be neither differentiable nor convex. The two propositions below indicate that this is the case for $\phi^p_\theta$. First, as we have observed from Examples 2.1 and 2.3, $\phi^p_\theta$ is not convex. We claim that this is indeed the case in general.
Figure 2: Graphs of $\phi^p_{\theta_2}$ for different values of $p$, where $\theta_2(t) = \frac{\sqrt{t^2+4}+t}{2}$. Panels (a) to (d) show $p = 1$, $1.2$, $2$ and $10$, respectively.
Proposition 2.3. Suppose that 𝜃 is strictly increasing on some interval 𝐼= [0, 𝑡0). Then, 𝜙𝑝𝜃 is not convex.
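Proof. Suppose that $\phi^p_\theta$ is convex. Since $\phi^p_\theta(0, 0) = 0$, it must be the case that $\phi^p_\theta(\lambda a, \lambda b) \le \lambda \phi^p_\theta(a, b)$ for any $\lambda \in [0, 1]$ and any $a, b \in \mathbb{R}$. On the other hand, taking any $a, b \in I$ with $a, b > 0$ and any $\lambda \in (0, 1)$, and using $\|(\lambda a, \lambda b)\|_p = \lambda \|(a, b)\|_p$, we obtain
\[ \begin{aligned} \phi^p_\theta(\lambda a, \lambda b) - \lambda \phi^p_\theta(a, b) &= \|(\lambda a, \lambda b)\|_p - (\lambda \theta(\lambda b)a + \lambda \theta(\lambda a)b) - \lambda\big(\|(a, b)\|_p - (\theta(b)a + \theta(a)b)\big) \\ &= \lambda a(\theta(b) - \theta(\lambda b)) + \lambda b(\theta(a) - \theta(\lambda a)). \end{aligned} \]
Since $\lambda \in (0, 1)$, we have that $\lambda a, \lambda b \in I$ with $\lambda a < a$ and $\lambda b < b$. By the strict monotonicity assumption on $\theta$ in $I$, we obtain $\phi^p_\theta(\lambda a, \lambda b) - \lambda \phi^p_\theta(a, b) > 0$, which is a contradiction. Hence, $\phi^p_\theta$ is not convex. □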
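The violation of convexity used in this proof can be reproduced numerically. The sketch below assumes $\theta(t) = e^t$, which is strictly increasing on all of $\mathbb{R}$, and the illustrative point $a = b = 1$, $\lambda = 1/2$; it compares the computed gap with the closed-form expression derived in the proof. All names are hypothetical helpers.

```python
import math

def phi_p(a, b, p, theta):
    """phi^p_theta from (5), for p >= 1."""
    norm_p = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm_p - (theta(b) * a + theta(a) * b)

def convexity_gap(a, b, lam, p, theta):
    """phi(lam*a, lam*b) - lam*phi(a, b); a convex phi with phi(0,0)=0 needs this <= 0."""
    return phi_p(lam * a, lam * b, p, theta) - lam * phi_p(a, b, p, theta)

def predicted_gap(a, b, lam, theta):
    """Closed form from the proof: the norm terms cancel by homogeneity."""
    return lam * a * (theta(b) - theta(lam * b)) + lam * b * (theta(a) - theta(lam * a))

a, b, lam = 1.0, 1.0, 0.5
for p in (1.0, 2.0, 10.0):
    print(p, convexity_gap(a, b, lam, p, math.exp), predicted_gap(a, b, lam, math.exp))
```

The gap does not depend on $p$, in line with the cancellation of the norm terms in the displayed identity.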
We close this section by showing the semismoothness of $\phi^p_\theta$. The concept of semismoothness was introduced by Mifflin [17] for functionals, and was later extended by Qi and Sun [19] for vector-valued functions.
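Proposition 2.4. Suppose that $\theta$ is continuously differentiable and satisfies the conditions of Proposition 2.1 if $p = 1$ or Proposition 2.2 if $p > 1$. Then, $\phi^p_\theta$ is semismooth. Moreover, the generalized gradient of $\phi^1_\theta$ is described by
\[ \partial \phi^1_\theta(a, b) = \begin{cases} \{ [\operatorname{sgn}(a) - \theta'(a)b - \theta(b), \ \operatorname{sgn}(b) - \theta'(b)a - \theta(a)]^T \} & \text{if } a \ne 0 \ \& \ b \ne 0, \\ \{ [0, \ 2\lambda - 1 - a\theta'(0) - \theta(a)]^T : \lambda \in [0, 1] \} & \text{if } a > 0 \ \& \ b = 0, \\ \{ [2\lambda - 1 - b\theta'(0) - \theta(b), \ 0]^T : \lambda \in [0, 1] \} & \text{if } a = 0 \ \& \ b > 0, \\ \{ [-2, \ 2\lambda - 1 - a\theta'(0) - \theta(a)]^T : \lambda \in [0, 1] \} & \text{if } a < 0 \ \& \ b = 0, \\ \{ [2\lambda - 1 - b\theta'(0) - \theta(b), \ -2]^T : \lambda \in [0, 1] \} & \text{if } a = 0 \ \& \ b < 0, \\ \{ [\xi, \zeta]^T : \xi, \zeta \in [-2, 0] \} & \text{if } a = b = 0, \end{cases} \]
and for $p > 1$, we have
\[ \partial \phi^p_\theta(a, b) = \begin{cases} \big\{ \big[ \operatorname{sgn}(a)|a|^{p-1} \|(a, b)\|_p^{1-p} - \theta(b) - b\theta'(a), \ \operatorname{sgn}(b)|b|^{p-1} \|(a, b)\|_p^{1-p} - \theta(a) - a\theta'(b) \big]^T \big\} & \text{if } (a, b) \ne (0, 0), \\ \big\{ [\xi - 1, \zeta - 1]^T : |\xi|^{\frac{p}{p-1}} + |\zeta|^{\frac{p}{p-1}} \le 1 \big\} & \text{if } a = b = 0. \end{cases} \]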
Figure 3: Graphs of $\phi^p_{\theta_3}$ for different values of $p$, where $\theta_3(t) = \frac{2}{1+e^{-t}}$. Panels (a) to (d) show $p = 1$, $1.2$, $2$ and $10$, respectively.
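Proof. Note that the mapping $f : (a, b) \mapsto \|(a, b)\|_p$ is a convex map and is therefore semismooth. Because $g : (a, b) \mapsto -(\theta(b)a + \theta(a)b)$ is smooth (and hence semismooth), their sum $f + g = \phi^p_\theta$ is semismooth. Now, we compute the generalized gradient of $\phi^1_\theta$. It is clear that $\phi^1_\theta$ is differentiable only on $D := \{(a, b) : a \ne 0 \text{ and } b \ne 0\}$. Then, its gradient
\[ \nabla \phi^1_\theta(a, b) = \begin{bmatrix} \operatorname{sgn}(a) - \theta'(a)b - \theta(b) \\ \operatorname{sgn}(b) - \theta'(b)a - \theta(a) \end{bmatrix} \quad \forall (a, b) \in D \]
coincides with the generalized gradient on $D$. Suppose then that $(a, b) \notin D$. First, we consider the case when $a > 0$ and $b = 0$. By definition of Clarke’s generalized gradient, $\partial \phi^1_\theta(a, b) = \operatorname{conv}\big(\partial_B \phi^1_\theta(a, b)\big)$, i.e., the convex hull of the $B$-subdifferential
\[ \partial_B \phi^1_\theta(a, b) = \big\{ g \in \mathbb{R}^2 \ \big| \ \exists \{(a_k, b_k)\}_{k=1}^{\infty} \subseteq D \text{ s.t. } (a_k, b_k) \to (a, b) \text{ and } \nabla \phi^1_\theta(a_k, b_k) \to g \big\}. \]
Let $\{(a_k, b_k)\}_{k=1}^{\infty} \subseteq D$ be such that $(a_k, b_k) \to (a, 0)$. For all sufficiently large $k$, we have $a_k > 0$. If $b_k > 0$ for all $k$ sufficiently large, then
\[ \lim_{k \to \infty} \nabla \phi^1_\theta(a_k, b_k) = \lim_{k \to \infty} \begin{bmatrix} \operatorname{sgn}(a_k) - \theta'(a_k)b_k - \theta(b_k) \\ \operatorname{sgn}(b_k) - \theta'(b_k)a_k - \theta(a_k) \end{bmatrix} = \begin{bmatrix} 1 - \theta'(a) \cdot 0 - \theta(0) \\ 1 - \theta'(0) \cdot a - \theta(a) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 - a\theta'(0) - \theta(a) \end{bmatrix}, \]
where we used the fact that $\theta$ is continuously differentiable and that $\theta(0) = 1$. If $b_k < 0$ for all $k$ sufficiently large, then
\[ \lim_{k \to \infty} \nabla \phi^1_\theta(a_k, b_k) = \begin{bmatrix} 1 - \theta'(a) \cdot 0 - \theta(0) \\ -1 - \theta'(0) \cdot a - \theta(a) \end{bmatrix} = \begin{bmatrix} 0 \\ -1 - a\theta'(0) - \theta(a) \end{bmatrix}. \]
In other cases, $\nabla \phi^1_\theta(a_k, b_k)$ has no limit. Hence,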
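\[ \partial_B \phi^1_\theta(a, 0) = \big\{ [0, \ 1 - a\theta'(0) - \theta(a)]^T, \ [0, \ -1 - a\theta'(0) - \theta(a)]^T \big\} \]
and the result for the case $a > 0$ and $b = 0$ follows by taking the convex hull. We omit the proof of the other cases as the arguments are similar. For $p > 1$, $\phi^p_\theta$ is continuously differentiable on $\mathbb{R}^2$ except at $(0, 0)$. The computation of the generalized gradient $\partial \phi^p_\theta(0, 0)$ is similar to the computation of $\partial \phi^p_{\rm FB}(0, 0)$ shown in [3]. This completes the proof. □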
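The two limiting gradients at a point $(a, 0)$ with $a > 0$ can be observed numerically. The sketch below assumes $\theta(t) = e^t$ (so that $\theta' = \theta$) and the illustrative point $a = 1$; it first validates the gradient formula on $D$ against central differences and then evaluates the gradient just above and just below the axis $b = 0$. Helper names and step sizes are assumptions.

```python
import math

theta = math.exp     # theta_1(t) = e^t
dtheta = math.exp    # derivative of theta_1

def phi1(a, b):
    """phi^1_theta from (6) with theta = theta_1."""
    return abs(a) + abs(b) - (theta(b) * a + theta(a) * b)

def sgn(t):
    return (t > 0) - (t < 0)

def grad_phi1(a, b):
    """Gradient of phi^1_theta on D = {a != 0 and b != 0}."""
    return (sgn(a) - dtheta(a) * b - theta(b),
            sgn(b) - dtheta(b) * a - theta(a))

def central_diff(a, b, h=1e-6):
    """Finite-difference gradient, used only to validate grad_phi1 on D."""
    return ((phi1(a + h, b) - phi1(a - h, b)) / (2 * h),
            (phi1(a, b + h) - phi1(a, b - h)) / (2 * h))

a = 1.0
# The two elements of the B-subdifferential at (a, 0), as derived in the proof.
upper = (0.0, 1.0 - a * dtheta(0.0) - theta(a))
lower = (0.0, -1.0 - a * dtheta(0.0) - theta(a))

g_plus = grad_phi1(a, 1e-9)     # approach the axis from b > 0
g_minus = grad_phi1(a, -1e-9)   # approach the axis from b < 0
print(g_plus, upper)
print(g_minus, lower)
```

Every convex combination of `upper` and `lower` has the form $[0, 2\lambda - 1 - a\theta'(0) - \theta(a)]^T$ with $\lambda \in [0, 1]$, which is the set stated in Proposition 2.4 for this case.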
3. Some extensions
In this section, we discuss some variants and generalizations of $\phi^p_\theta$. We also suggest some specific functions which can be used to derive new NCP functions from old ones. To proceed, we denote by $t_+$ the projection onto $[0, \infty)$, i.e.,
\[ t_+ := \begin{cases} t & \text{if } t \ge 0, \\ 0 & \text{if } t < 0. \end{cases} \]
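For convenience, we define $\hat{\phi}^{p,i}_\theta$ for $i = 1, 2, 3$ as follows:
\[ \begin{aligned} \hat{\phi}^{p,1}_\theta(a, b) &= \phi^p_\theta(a, b) - \alpha a_+ b_+, \\ \hat{\phi}^{p,2}_\theta(a, b) &= \phi^p_\theta(a, b) - \alpha (a_+ b_+)^2, \\ \hat{\phi}^{p,3}_\theta(a, b) &= \phi^p_\theta(a, b) - \alpha (a_+)^2 (b_+)^2, \end{aligned} \]
where $\alpha > 0$. For any $p \ge 1$ and $(a, b) \in \mathbb{R}^2_{++}$, we know from Proposition 2.1 and Proposition 2.2 that $\hat{\phi}^{p,i}_\theta(a, b) < 0$. Moreover, $\hat{\phi}^{p,i}_\theta(a, b) = \phi^p_\theta(a, b) > 0$ for all $(a, b) \notin \mathbb{R}^2_+$, and $\hat{\phi}^{p,i}_\theta(a, b) = \phi^p_\theta(a, b) = 0$ on the boundary of $\mathbb{R}^2_+$ since $a_+ b_+ = 0$ there. Consequently, these three variants are easily seen to be NCP functions as well.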
Proposition 3.1. The functions $\hat{\phi}^{p,i}_\theta$ are all NCP functions for any $\alpha > 0$ and $i = 1, 2, 3$.
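A short sketch of the three variants, under the assumptions $\theta = \theta_3$, $p = 2$ and $\alpha = 1.5$ (all illustrative choices), confirms on a grid that the zero set is unchanged and that the penalty term vanishes outside the positive quadrant. The function `phi_hat` and its argument order are hypothetical.

```python
import math

def theta3(t):
    return 2.0 / (1.0 + math.exp(-t))

def phi_p(a, b, p, theta):
    """phi^p_theta from (5)."""
    norm_p = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm_p - (theta(b) * a + theta(a) * b)

def plus(t):
    """Projection t_+ onto [0, inf)."""
    return max(t, 0.0)

def phi_hat(a, b, p, theta, alpha, i):
    """Variant i in {1, 2, 3}: phi^p_theta minus a penalty supported on the positive quadrant."""
    ap, bp = plus(a), plus(b)
    if i == 1:
        penalty = ap * bp
    elif i == 2:
        penalty = (ap * bp) ** 2
    elif i == 3:
        penalty = ap ** 2 * bp ** 2
    else:
        raise ValueError("i must be 1, 2 or 3")
    return phi_p(a, b, p, theta) - alpha * penalty

grid = [k / 4 for k in range(-12, 13)]   # -3.0, -2.75, ..., 3.0
p, alpha = 2.0, 1.5
mismatches = 0
for i in (1, 2, 3):
    for a in grid:
        for b in grid:
            val = phi_hat(a, b, p, theta3, alpha, i)
            complementary = a >= 0 and b >= 0 and a * b == 0
            if (abs(val) <= 1e-9) != complementary:
                mismatches += 1
print(mismatches)
```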
Recently, “continuous” and “discrete” generalizations of NCP functions have gained some attention; see [2, 3, 5]. These generalizations involve a tunable parameter $q$, which has been shown to play an important role in achieving better numerical performance of some NCP function-based algorithms [1,4,6]. Moreover, the extension may result in an NCP function with different analytic properties [2, 5]. For instance, the generalized FB function (4) is considered a continuous generalization of the FB function (3) in the sense that $p$ takes on values from the interval $(1, \infty)$, and the FB function can be obtained by taking $p = 2$. On the other hand, discrete generalizations have also been studied recently. For instance, the natural residual (NR) function
\[ \phi_{\rm NR}(a, b) = \min\{a, b\} = a - (a - b)_+ \]
is another popular NCP function apart from the FB function. A discrete generalization of this function proposed in [5] is given by
\[ \phi^q_{\rm NR}(a, b) = a^q - [(a - b)_+]^q, \tag{8} \]
where $q$ is a positive odd integer. For $q = 1$, the above function reduces to the original NR function. The generalization is “discrete” in the sense that $q$ can only take on positive odd integer values. A notable property of the generalized NR function (8) is that it is twice differentiable for $q > 3$, which is not the case for the NR function. This makes $\phi^q_{\rm NR}$ suitable for algorithms needing differentiability.
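The NR function and its discrete generalization (8) are simple enough to verify exactly on integer points, where no rounding occurs. The sketch below uses hypothetical helper names and checks that the zero set is the complementarity set for several odd $q$.

```python
def plus(t):
    """Projection t_+ onto [0, inf)."""
    return t if t >= 0 else 0

def phi_nr(a, b):
    """Natural residual function: min{a, b} = a - (a - b)_+."""
    return a - plus(a - b)

def phi_nr_q(a, b, q):
    """Discrete generalization (8); q must be a positive odd integer."""
    if q <= 0 or q % 2 == 0:
        raise ValueError("q must be a positive odd integer")
    return a ** q - plus(a - b) ** q

points = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
agree = all(
    (phi_nr_q(a, b, q) == 0) == (a >= 0 and b >= 0 and a * b == 0)
    for q in (1, 3, 5)
    for (a, b) in points
)
print(agree)
```

An even exponent would not work here: $t \mapsto t^q$ is then no longer injective, so $a^q = [(a-b)_+]^q$ would also hold when $a = -(a-b)_+$.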
We wish to point out that the technique employed in the second type of generalization discussed above can always be adopted for NCP functions of the form
\[ \phi(a, b) = \bar{\phi}_1(a, b) - \bar{\phi}_2(a, b). \tag{9} \]
In other words, the function
\[ \phi^q(a, b) := [\bar{\phi}_1(a, b)]^q - [\bar{\phi}_2(a, b)]^q \]
is always a discrete generalization of $\phi$ given in (9), where $q$ is a positive odd integer. As a matter of fact, we can further extend this technique by considering any family of injective functions $\{f_q\}$. More precisely, the function
\[ \phi_{f_q}(a, b) := f_q(\bar{\phi}_1(a, b)) - f_q(\bar{\phi}_2(a, b)) \tag{10} \]
is easily seen to be an NCP function whenever $f_q$ is injective and $\phi$ is an NCP function as in (9). The transformation (10) has also been noted in [11]. For instance, the discrete generalized NR function (8) can be realized by transforming the NR function as in (10) using the map $f_q(t) = t^q$, where $q > 0$ is an odd integer. Applying the same map to our NCP function $\phi^p_\theta$, we obtain a discrete generalization as
\[ (\phi^p_\theta)^q(a, b) := \|(a, b)\|_p^q - (\theta(b)a + \theta(a)b)^q, \tag{11} \]
where $q$ is a positive odd integer. As mentioned above, a generalization can possibly yield NCP functions with different analytic properties. In the case of (11), it is easy to verify that $(\phi^p_\theta)^q$ is continuously differentiable on $\mathbb{R}^2$ whenever $q \ge p > 1$, whereas the original function $\phi^p_\theta$ is not differentiable at the origin.
Another discrete generalization of $\phi^p_\theta$ can be obtained by applying the same map $f_q(t) = t^q$ to the equivalent form of $\phi^p_\theta$ given by
\[ \phi^p_\theta(a, b) = \phi^p_{\rm FB}(a, b) - \big[ a(\theta(b) - 1) + b(\theta(a) - 1) \big]. \tag{12} \]
This yields another symmetric generalization
\[ (\phi^p_\theta)^q_{\rm FB}(a, b) = [\phi^p_{\rm FB}(a, b)]^q - \big[ a(\theta(b) - 1) + b(\theta(a) - 1) \big]^q. \]
For $q = 1$, note that Proposition 2.4 guarantees the semismoothness of $\phi^p_\theta$. Interestingly, the above generalization yields smooth NCP functions for any $p > 1$ and odd integers $q \ge 3$. This can be easily verified and we omit the proof. We summarize these results in Proposition 3.2. Note that the above generalizations are all symmetric. In general, the transformation given in (10) yields symmetric NCP functions when applied to our proposed NCP function $\phi^p_\theta$ and its alternative form (12).
Proposition 3.2. Suppose $\theta$ is continuously differentiable and satisfies the conditions of Proposition 2.1 if $p = 1$, or Proposition 2.2 if $p > 1$. Let $q \ge 1$ be an odd integer. Then,
\[ (\phi^p_\theta)^q(a, b) := \|(a, b)\|_p^q - \big( \theta(b)a + \theta(a)b \big)^q \]
is a discrete generalization of $\phi^p_\theta$, which is smooth if $q \ge p > 1$. Additionally,
\[ (\phi^p_\theta)^q_{\rm FB}(a, b) := [\phi^p_{\rm FB}(a, b)]^q - \big[ a(\theta(b) - 1) + b(\theta(a) - 1) \big]^q \]
is also a discrete generalization of $\phi^p_\theta$, which is smooth if $q \ge 3$ and $p > 1$.
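Both discrete generalizations in Proposition 3.2 can be evaluated directly. The sketch below assumes $\theta = \theta_3$ and $p = 2$ (illustrative choices) and checks on a grid that the two functions vanish exactly on the complementarity set for $q = 1$ and $q = 3$, and that they coincide with each other when $q = 1$. Smoothness is not tested here; the function names are hypothetical.

```python
import math

def theta3(t):
    return 2.0 / (1.0 + math.exp(-t))

def norm_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p)

def phi_q(a, b, p, q, theta):
    """(phi^p_theta)^q from (11); q is a positive odd integer."""
    return norm_p(a, b, p) ** q - (theta(b) * a + theta(a) * b) ** q

def phi_q_fb(a, b, p, q, theta):
    """FB-based generalization obtained from the splitting (12)."""
    fb = norm_p(a, b, p) - (a + b)
    rest = a * (theta(b) - 1.0) + b * (theta(a) - 1.0)
    return fb ** q - rest ** q

grid = [k / 2 for k in range(-4, 5)]   # -2.0, -1.5, ..., 2.0
p = 2.0
mismatches = 0
for q in (1, 3):
    for a in grid:
        for b in grid:
            complementary = a >= 0 and b >= 0 and a * b == 0
            for f in (phi_q, phi_q_fb):
                if (abs(f(a, b, p, q, theta3)) <= 1e-9) != complementary:
                    mismatches += 1
print(mismatches)
```

Keeping `q` an integer matters in this sketch: Python raises a negative float to an integer power without complaint, whereas a non-integer exponent would produce a complex result.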
It is interesting to note that $f_q(t) = t^q$ with $q \ge 1$ an odd integer is one of the functions usually employed in order to improve the numerical performance of algorithms. This is referred to as an “activation function” in the literature on the neural network approach to optimization. Such a function is often utilized to improve convergence rate, and other examples are given as follows:
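1. Bipolar Sigmoid Function [20,21]
\[ f_q(t) = \frac{1 - e^{-qt}}{1 + e^{-qt}}, \quad q > 0. \]
2. Power-Sigmoid Function [20,21]
\[ f_q(t) = \begin{cases} \dfrac{1 + e^{-q_1}}{1 - e^{-q_1}} \cdot \dfrac{1 - e^{-q_1 t}}{1 + e^{-q_1 t}} & \text{if } |t| < 1, \\ t^{q_2} & \text{if } |t| \ge 1, \end{cases} \]
where $q = (q_1, q_2)$, $q_1 > 2$ and $q_2 \ge 3$ is an odd integer.
3. Smooth Power-Sigmoid Function [20,21]
\[ f_q(t) = \frac{1}{2} \cdot \frac{1 + e^{-q_1}}{1 - e^{-q_1}} \cdot \frac{1 - e^{-q_1 t}}{1 + e^{-q_1 t}} + \frac{1}{2} t^{q_2}, \]
where $q = (q_1, q_2)$, $q_1 > 2$ and $q_2 \ge 3$.
4. Sign-Bi-Power Function [14]
\[ f_q(t) = \begin{cases} |t|^q + |t|^{\frac{1}{q}} & \text{if } t > 0, \\ 0 & \text{if } t = 0, \\ -|t|^q - |t|^{\frac{1}{q}} & \text{if } t < 0, \end{cases} \quad q > 0. \]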
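The four activation functions listed above can be sketched as follows. The parameter values ($q = 4$, $q_1 = 4$, $q_2 = 3$, and $q = 3$ for the sign-bi-power function) are illustrative assumptions; the monotonicity test on a grid is only a sampled indication of injectivity, not a proof.

```python
import math

def bipolar_sigmoid(t, q):
    return (1.0 - math.exp(-q * t)) / (1.0 + math.exp(-q * t))

def _scale(q1):
    """Normalizing factor (1 + e^{-q1}) / (1 - e^{-q1})."""
    return (1.0 + math.exp(-q1)) / (1.0 - math.exp(-q1))

def power_sigmoid(t, q1, q2):
    if abs(t) >= 1.0:
        return t ** q2
    return _scale(q1) * bipolar_sigmoid(t, q1)

def smooth_power_sigmoid(t, q1, q2):
    return 0.5 * _scale(q1) * bipolar_sigmoid(t, q1) + 0.5 * t ** q2

def sign_bi_power(t, q):
    if t == 0:
        return 0.0
    s = 1.0 if t > 0 else -1.0
    return s * (abs(t) ** q + abs(t) ** (1.0 / q))

def strictly_increasing(f, grid):
    values = [f(t) for t in grid]
    return all(x < y for x, y in zip(values, values[1:]))

grid = [k / 10 for k in range(-20, 21)]   # -2.0, -1.9, ..., 2.0
activations = {
    "bipolar_sigmoid": lambda t: bipolar_sigmoid(t, 4.0),
    "power_sigmoid": lambda t: power_sigmoid(t, 4.0, 3),
    "smooth_power_sigmoid": lambda t: smooth_power_sigmoid(t, 4.0, 3),
    "sign_bi_power": lambda t: sign_bi_power(t, 3.0),
}
for name, f in activations.items():
    print(name, strictly_increasing(f, grid), f(0.0))
```

Each function is odd and maps $0$ to $0$. For the sign-bi-power function with $q = 1$ one gets $f_1(t) = 2t$, so halving it recovers the identity map.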
These functions are all injective maps which can be employed to transform an NCP function of the form (9). However, none of these transformations leads to a generalization in the sense illustrated above. Indeed, a generalized version can only be obtained if there exists $\bar{q}$ such that $f_{\bar{q}}(t) \equiv t$. We do note, however, that $\frac{f_q}{2}$ yields a continuous generalization via the transformation (10) if $f_q$ is the sign-bi-power function. In any case, an interesting research direction is to explore the applicability of the above injective functions in improving the numerical efficiency of NCP function-based solution methods, just as these functions improve numerical performance in neural network approaches. In the case of the power function $f_q(t) = t^q$ and the generalized NR function, some numerical results are reported in [1]. Finally, we note that the composite map $f_q \circ \phi^p_\theta$ is also worth considering in numerical implementations. This is also an NCP function provided that $f_q$ is injective with $f_q(0) = 0$, as is the case for the above four activation functions.
4. Concluding Remarks
In this short paper, we proposed a new way to construct NCP functions. The family of generalized FB functions, in particular, can be generated from the proposed approach. We proved some basic properties of the newly discovered NCP functions, including the growth behavior, nonconvexity and semismoothness of $\phi^p_\theta$. These are prerequisites for designing solution methods based on the new NCP functions. Observe that for a fixed $\theta$, the NCP function $\phi^p_\theta$ is parametrized by $p \ge 1$. Future research can explore the effects of tuning the parameter $p$ on the performance of algorithms. This is worth exploring, as it has been shown that for the generalized FB and NR functions, better convergence rates of solution methods can be attained by controlling the values of $p$ [1,4,6]. Numerical comparisons of these new NCP functions with popular NCP functions such as the FB and NR functions are recommended. How best to choose the parameter $p$ and the function $\theta$ is also worth investigating, as this could suggest alternative NCP functions that work well with algorithms. Finally, it seems worthwhile to explore the effects of choosing different activation functions $f_q$, such as the bipolar sigmoid function, power-sigmoid function, smooth power-sigmoid function, and the sign-bi-power function, in forming new NCP functions from old ones, such as $(\phi^p_\theta)_{f_q}$ and $f_q \circ \phi^p_\theta$. We leave it for future research to study whether or not these transformations can be used to improve the numerical performance of NCP function-based algorithms. If these functions indeed improve some algorithms, it is recommended to determine which of them works best for the complementarity problem.
Acknowledgement
J.-S. Chen is supported by the Ministry of Science and Technology, Taiwan.
References
[1] J.H. Alcantara and J.-S. Chen, Neural networks based on three classes of NCP-functions for solving nonlinear complementarity problems, Neurocomputing 359 (2019) 102–113.
[2] Y.-L. Chang, J.-S. Chen, and C.-Y. Yang, Symmetrization of generalized natural residual function for NCP, Operations Research Letters 43 (2015) 354–358.
[3] J.-S. Chen, On some NCP-functions based on the generalized Fischer-Burmeister function, Asia-Pacific Journal of Operations Research 24 (2007) 401–420.
ity problems, Information Sciences 180 (2010) 697–711.
[5] J.-S. Chen, C.-H. Ko, and X.-R. Wu, What is the generalization of natural residual function for NCP?, Pacific Journal of Optimization 12 (2016) 19–27.
[6] J.-S. Chen and S.-H. Pan, A family of NCP functions and a descent method for the nonlinear complementarity problem, Computational Optimization and Applications 40 (2008) 389–404.
[7] F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Volumes I and II, Springer-Verlag, New York, 2003.
[8] F. Facchinei and J. Soares, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization 7 (1997) 225–247.
[9] M.C. Ferris, O.L. Mangasarian, and J.-S. Pang, editors, Complementarity: Applications, Algorithms and Extensions, Kluwer Academic Publishers, Dordrecht, 2001.
[10] M.C. Ferris and J.-S. Pang, Engineering and economic applications of complementarity problems, SIAM Review 39 (1997) 669–713.
[11] A. Galantai, Properties and construction of NCP functions, Computational Optimization and Applications 52 (2012) 805–824.
[12] C.-H. Huang, J.-S. Chen, and J.E. Martinez-Legaz, Differentiability v.s. convexity of complementarity functions, Optimization Letters 11(1) (2017) 209–216.
[13] C. Kanzow, N. Yamashita, and M. Fukushima, New NCP-functions and their properties, Journal of Optimization Theory and Applications 94 (1997) 115–135.
[14] S. Li, S. Chen, and B. Liu, Accelerating a recurrent neural network to finite-time convergence for solving time-varying Sylvester equation by using a sign-bi-power activation function, Neural Processing Letters 37 (2013) 189–205.
[15] Z.Q. Luo and P. Tseng, A new class of merit functions for the nonlinear complementarity problem, In: Ferris, M.C., Pang, J.S. (eds.) Complementarity and Variational Problems: State of the Art, pp. 204–225. SIAM, Philadelphia (1997).
[16] O.L. Mangasarian, Equivalence of the complementarity problem to a system of nonlinear equations, SIAM Journal on Applied Mathematics 31 (1976) 89–92.
[17] R. Mifflin, Semismooth and semiconvex function in constrained optimization, SIAM Journal on Control and Optimization 15 (1977) 959–972.
[18] S.M. Miri and S. Effati, On generalized convexity of nonlinear complementarity functions, Journal of Optimization Theory and Applications 164 (2015) 723–730.
[19] L. Qi and J. Sun, A nonsmooth version of Newton’s method, Mathematical Programming 58 (1993) 353–367.
[20] Y. Zhang, Z. Fan, and Z. Li, Zhang neural network for online solution of time-varying Sylvester equation, In: Proceedings of the 2nd international conference on advances in computation and intelligence, ISICA’07. Berlin, Springer, 276–285.
[21] Y. Zhang and S.S. Ge, Design and analysis of a general recurrent neural network model for time-varying matrix inversion, IEEE Transactions on Neural Networks 16 (2005) 1477–1490.