Journal of Computational and Applied Mathematics
A new class of neural networks for NCPs using smooth perturbations of the natural residual function

Jan Harold Alcantara, Jein-Shan Chen∗,1
Department of Mathematics, National Taiwan Normal University, Taiwan
Article history: Received 10 February 2020; Received in revised form 4 January 2022

MSC: 37N40; 65K10; 65K15

Keywords: Complementarity problem; Smoothing method; Neural network; Stability

Abstract. We present a new class of neural networks for solving nonlinear complementarity problems (NCPs) based on a family of real-valued functions (denoted by ℱ) that can be used to construct smooth perturbations of the level curve defined by Φ_NR(x, y) = 0, where Φ_NR is the natural residual function (also called the ''min'' function). We introduce two important subclasses of ℱ, which deserve particular attention because of their significantly different theoretical and numerical properties. One of these subfamilies yields a smoothing function for Φ_NR, while the other yields only a smoothing curve for Φ_NR(x, y) = 0. We also propose a simple framework for generating functions from these subclasses. Using the smoothing approach, we build two types of neural networks and provide sufficient conditions to guarantee asymptotic and exponential stability of equilibrium solutions. Finally, we present extensive numerical experiments to validate the theoretical results and to illustrate the difference in numerical performance of functions from the two subclasses. Numerical comparisons with existing neural networks for NCPs are also demonstrated.

© 2022 Elsevier B.V. All rights reserved.
1. Introduction
A nonlinear complementarity problem (NCP) consists of a system of nonlinear inequalities together with nonnegativity and orthogonality conditions involving a given multivariate function. Such problems arise in constrained optimization and equilibrium problems [1]. Moreover, applications of NCPs in operations research, engineering, and economics have motivated significant research efforts in the past decades [1–3], which resulted in various numerical techniques, including the merit function approach [4–6], the nonsmooth Newton method [7–9], and the regularization approach [10,11]. We refer the interested reader to the monograph [1] and the paper [12] for a survey and thorough discussion of solution methods for complementarity problems.
Smoothing methods belong to another class of solution methods that have been extensively used in solving complementarity problems [10,13–17]. A natural smoothing technique for the NCP is to construct a differentiable approximation of

φ_NR(s, t) := min{s, t} = s − (s − t)_+    (1.1)

using a smoothing function for the plus function. Meanwhile, Haddou and Maheux [18] recently introduced a novel smoothing approach to handle NCPs. In their approach, they utilized smooth perturbations of φ_NR that do not necessarily correspond to smoothing functions for φ_NR. These perturbations can be viewed as smooth approximations of the level curve φ_NR(s, t) = 0, which are then used to obtain approximate solutions of the NCP. Unfortunately, there have been no further studies on algorithmic procedures for NCPs following this smoothing framework. Moreover, the choice of smooth perturbations, as well as a procedure for controlling the perturbation parameter, are numerical issues that were left for future study in [18].

Meanwhile, neural network (NN) approaches for complementarity problems have also been explored in [19–21], mainly because it is desirable to have a real-time solution of the NCP, which may not be attainable with the usual approaches mentioned above. Neural networks are hardware-implementable (e.g., via integrated circuits) and therefore exhibit real-time processing. Hopfield and Tank originally introduced neural networks for optimization [22,23], and they have since been used in solving linear and nonlinear programming problems and variational inequalities [24–30]. Notice, however, that the NCP is not an optimization problem. Nevertheless, the NCP can be reformulated as a smooth minimization problem by constructing a merit function, usually through the use of NCP-functions [4–6,19–21]. A nonnegative merit function usually serves as an energy function, which is then used to formulate a steepest-descent dynamical system whose equilibrium solutions correspond to NCP solutions under suitable conditions.

∗ Corresponding author.
E-mail addresses: 80640005s@ntnu.edu.tw (J.H. Alcantara), jschen@math.ntnu.edu.tw (J.-S. Chen).
1 The author's work is supported by the Ministry of Science and Technology, Taiwan.
https://doi.org/10.1016/j.cam.2022.114092
A neural network based on the Fischer–Burmeister (FB) function was designed to handle P_0-NCPs in [21]. In [20], these results were extended to the generalized Fischer–Burmeister (GFB) function, an NCP-function that involves a parameter p ∈ (1, ∞). It was shown that for the latter NN, better numerical performance of the network can be achieved by choosing a larger value of p. These neural networks have good stability and convergence properties and are not very sensitive to initial conditions. Aside from the FB and GFB neural networks, three classes of NNs based on the discrete generalization of φ_NR given in (1.1) were recently formulated in [19]. These neural networks have relatively better convergence speed than the FB and GFB neural networks, but their stability requires stricter assumptions and they are usually more sensitive to initial conditions.

In this paper, we build a new class of neural networks based on the smoothing method for NCPs introduced by Haddou and Maheux [18] using a family ℱ of smoothing functions. We introduce two subclasses of ℱ and propose a simple method to generate functions from these subclasses. Interestingly, the aforementioned issues of how to choose the smooth perturbations of φ_NR and how to control the decrease of the smoothing parameter can be best addressed by considering these two subclasses. Sufficient conditions to guarantee stability of the neural networks are provided and are illustrated through several numerical experiments. Finally, we compare the present neural networks with the well-known FB and GFB neural networks in solving P_0- and non-P_0-NCPs.

This paper is organized as follows: In Section 2, we recall the smoothing method proposed in [18] and introduce the family ℱ. In Section 3, we discuss a simple idea on how to generate smoothing functions. We also prove a characterization result for the two subfamilies of ℱ introduced. In Section 4, two neural networks for NCPs are presented. We provide several sufficient conditions for Lyapunov and exponential stability of the neural networks. Several numerical simulations are presented in Section 5 to understand how to choose a smoothing function for the NCP. A comparative analysis with the FB and GFB neural networks is also given. Conclusions and some recommendations for future studies are presented in Section 6.
The notation used in this study is as follows. R^n denotes the n-dimensional Euclidean space endowed with the usual inner product, and R^{m×n} denotes the space of m × n real matrices. We let R^n_+ and R^n_{++} denote the nonnegative and positive orthants of R^n, respectively. M^T denotes the transpose of a matrix M. M_{ij} and M_{·j} denote the (i, j)-entry and the jth column of M, respectively. For M ∈ R^{n×n} and Λ ⊆ {1, . . . , n}, we denote by M_Λ the principal submatrix of M indexed by Λ (i.e., the submatrix of M corresponding to the rows and columns indexed by Λ). Λ^c denotes the complement of Λ. For any differentiable function f : R² → R, ∇_a f(s, t) and ∇_b f(s, t) denote the partial derivatives of f with respect to s and t, respectively. Given a differentiable mapping F = (F_1, . . . , F_m)^T : R^n → R^m, ∇F(x) = [∇F_1(x) · · · ∇F_m(x)] ∈ R^{n×m} denotes the transposed Jacobian of F at x, where ∇F_i(x) denotes the gradient of F_i at x. Given a family of real-valued functions {φ_r : r > 0} on R^n, we denote by φ_0 the pointwise limit lim_{r↘0} φ_r, whenever it exists.

2. Smoothing approach for NCP
In this section, we present the smoothing approach for NCP proposed by Haddou and Maheux [18] and introduce two important classes of smoothing functions. We also recall some concepts related to nonlinear mappings.
Let F : R^n → R^n be given. The nonlinear complementarity problem, which we denote by NCP(F), is to find a point x ∈ R^n such that

x ≥ 0,  F(x) ≥ 0  and  ⟨x, F(x)⟩ = 0.    (2.1)

The set of all solutions of NCP(F) is denoted by SOL(F), which we assume to be nonempty. One can also show the equivalence of NCP(F) and the fixed-point equation x = P_K(x − F(x)), where K = R^n_+ and P_K denotes the projection onto K. In light of this equivalence, we see that x solves (2.1) if and only if Φ_NR(x, F(x)) = 0, where Φ_NR : R^n × R^n → R^n is the nonsmooth mapping given by

Φ_NR(x, y) = (φ_NR(x_1, y_1), . . . , φ_NR(x_n, y_n))^T.
In this paper, we will consider nonlinear monotone and P0-functions. We recall some basic types of nonlinear mappings.
Definition 2.1. Let F = (F_1, . . . , F_n)^T : R^n → R^n. Then, the mapping F is said to be

(i) monotone if ⟨x − y, F(x) − F(y)⟩ ≥ 0 for all x, y ∈ R^n;
(ii) strictly monotone if ⟨x − y, F(x) − F(y)⟩ > 0 for all x, y ∈ R^n with x ≠ y;
(iii) strongly monotone with modulus μ > 0 if ⟨x − y, F(x) − F(y)⟩ ≥ μ‖x − y‖² for all x, y ∈ R^n;
(iv) a P_0-function if max_{1≤i≤n, x_i≠y_i} (x_i − y_i)(F_i(x) − F_i(y)) ≥ 0 for all x, y ∈ R^n with x ≠ y;
(v) a P-function if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) > 0 for all x, y ∈ R^n with x ≠ y;
(vi) a uniform P-function with modulus κ > 0 if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) ≥ κ‖x − y‖² for all x, y ∈ R^n.

A matrix is called a P_0-matrix (resp. P-matrix) if all its principal minors are nonnegative (resp. positive). Note that F is a P_0-function if and only if ∇F(x) is a P_0-matrix for all x ∈ R^n. If ∇F(x) is a P-matrix for all x ∈ R^n, then F is a P-function. However, the converse is not necessarily true.
We now recall the smoothing approach described in [18]. First, we consider a continuously differentiable, strictly increasing function θ such that

lim_{t→−∞} θ(t) = −∞,  θ(0) = 0  and  lim_{t→+∞} θ(t) = 1.    (2.2)

We also impose the condition that θ(t) should approach 1 fast enough, with some uniformity, for large values of t. This condition is made precise as follows.

Definition 2.2. Let a ∈ (0, 1) and suppose θ(t) is a strictly increasing C¹ function such that θ(0) = 0 and lim_{t→∞} θ(t) = 1. We say that θ satisfies condition (Ha) if there exists t_a > 0 such that

1/2 + (1/2)θ(at) ≤ θ(t)  for all t ≥ t_a.

We say that θ belongs to the class ℱ if it satisfies condition (Ha) for some a ∈ (0, 1).

For instance, the following functions were considered in [18]:

θ^(1)(t) = t/(t+1) if t ≥ 0, and θ^(1)(t) = t if t < 0;  θ^(2)(t) = 1 − e^{−t}.    (2.3)

Here, θ^(1) satisfies (Ha) for all a ∈ (0, 1/2), while θ^(2) satisfies (Ha) for all a ∈ (0, 1).

For each r > 0, we define the function φ_r : R² → R by

φ_r(s, t) := r θ^{−1}( θ(s/r) + θ(t/r) − 1 ).    (2.4)

Note that strict monotonicity of θ was imposed to guarantee its invertibility, as required in (2.4). We summarize here some important facts proved in [18].

Lemma 2.1. Let θ be a strictly increasing C¹ function satisfying conditions (2.2). Then, the following hold.

(i) φ_r(s, t) ≤ min{s, t} for all s, t ∈ R and r > 0.
(ii) If θ ∈ ℱ, then lim_{r↘0} φ_r(s, t) = 0 ⟺ min{s, t} = 0, for all s, t ∈ R.
(iii) If θ satisfies condition (Ha) for all a ∈ (0, 1), then lim_{r↘0} φ_r(s, t) = min{s, t} for all s, t ∈ R_{++}.
Let G_r : R^n × R^n → R^n be given by

G_r(x, y) = (φ_r(x_1, y_1), . . . , φ_r(x_n, y_n))^T.    (2.5)
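As a quick numerical illustration of Lemma 2.1 (our own sketch, not part of [18]), the two generators in (2.3) can be plugged into (2.4) directly. For θ^(2), formula (2.4) has the closed form φ_r(s, t) = −r ln(e^{−s/r} + e^{−t/r}), which we evaluate in an overflow-free way; the contrast between θ^(2) (which satisfies (Ha) for every a) and θ^(1) (which does not) is visible in the limit r ↘ 0 on the diagonal s = t:

```python
import math

# The two generators from (2.3)
def theta1(t):       # t/(t+1) for t >= 0, t for t < 0
    return t / (t + 1.0) if t >= 0 else t

def theta1_inv(u):   # inverse of theta1 (defined for u < 1)
    return u / (1.0 - u) if u >= 0 else u

def theta2(t):       # 1 - e^{-t}
    return 1.0 - math.exp(-t)

def theta2_inv(u):
    return -math.log(1.0 - u)

def phi_r(s, t, r, theta, theta_inv):
    """Generic evaluation of (2.4): r * theta^{-1}(theta(s/r) + theta(t/r) - 1)."""
    return r * theta_inv(theta(s / r) + theta(t / r) - 1.0)

def phi_r_theta2(s, t, r):
    """For theta2, (2.4) collapses to -r*ln(e^{-s/r} + e^{-t/r});
    evaluated here in a numerically stable (log-sum-exp) form."""
    return min(s, t) - r * math.log(1.0 + math.exp(-abs(s - t) / r))

# Lemma 2.1(i): phi_r(s, t) <= min{s, t}
for (s, t) in [(1.0, 2.0), (0.5, 0.5), (-1.0, 3.0)]:
    for r in (1.0, 0.5, 0.1):
        assert phi_r(s, t, r, theta1, theta1_inv) <= min(s, t) + 1e-9
        assert phi_r(s, t, r, theta2, theta2_inv) <= min(s, t) + 1e-9

# theta2 recovers min{s, t} on R_{++} as r -> 0 (Lemma 2.1(iii)) ...
print(phi_r_theta2(1.0, 1.0, 1e-4))               # close to min{1, 1} = 1
# ... while theta1 leaves a gap on the diagonal: the limit is 1/2
print(phi_r(1.0, 1.0, 1e-4, theta1, theta1_inv))  # close to 1/2
```

The function names and sample points are ours; only the formulas come from (2.3)–(2.4).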
By Lemma 2.1(iii), G_r is a smoothing function for Φ_NR over R^n_{++} × R^n_{++} if θ satisfies (Ha) for all a ∈ (0, 1). In contrast, we may not obtain a smoothing function for Φ_NR if there exists a ∈ (0, 1) for which (Ha) does not hold. For instance, when θ = θ^(1), one can verify that lim_{r↘0} G_r(x, y) < Φ_NR(x, y) (see also [18]). In view of these contrasting properties, we introduce two subclasses of ℱ.

Definition 2.3. Let θ ∈ ℱ. We say that θ belongs to the class ℱ_1 if it satisfies condition (Ha) for all a ∈ (0, 1). Otherwise, we say that θ belongs to ℱ_2.

In Section 3, we will prove that the result of Lemma 2.1(iii) holds only for functions in ℱ_1. Although we do not obtain a smoothing function for Φ_NR when θ ∈ ℱ_2, Lemma 2.1(ii) is a very useful result for applying Haddou and Maheux's smoothing strategy. To see this, define the functions

Φ_r(x) := G_r(x, F(x))  and  Φ_0(x) := lim_{r↘0} Φ_r(x),

provided that the limit exists. Then, by Lemma 2.1(ii), (2.1) is equivalent to solving the system of equations Φ_0(x) = 0. Moreover, it is shown in [18] that whenever F is a P_0-function, Φ_r is a P-function for any r > 0, while Φ_0 is a P_0-function whenever it exists. Consequently, one may use Gowda and Tawhid's theory for P_0-equations given in [31] by viewing Φ_0 as a continuous perturbation of Φ_r. Indeed, Lemma 2.3, which was given in [18], is an easy consequence of Theorem 4 of [31]. Before stating the convergence result, we make some assumptions.
Assumption 1. SOL(F) is nonempty and bounded, θ ∈ ℱ, and Φ_0 exists.

Some conditions which guarantee the existence of Φ_0 are given in the following result, whose proof can be found in [18].
Lemma 2.2. Let s, t > 0. Then, φ_0(s, t) := lim_{r↘0} φ_r(s, t) exists and φ_0(s, t) ≤ min{s, t} if either of the following conditions holds:

(i) There exists ε > 0 such that (∂/∂r) φ_r(s, t) ≤ 0 for all r ∈ (0, ε);
(ii) V := (−ψ′ ∘ ψ^{−1}) × ψ^{−1} is locally subadditive at 0+; i.e., there exists η > 0 such that for all α, β > 0 with α + β < η, we have V(α + β) ≤ V(α) + V(β), where ψ := 1 − θ.
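For concreteness (our own check, not from [18]): with θ = θ^(2) we have ψ(t) = e^{−t}, ψ^{−1}(u) = −ln u, and −ψ′(ψ^{−1}(u)) = u, so V(u) = −u ln u, whose local subadditivity at 0+ is easy to verify numerically:

```python
import math
import random

def V(u):
    # V = (-psi' o psi^{-1}) * psi^{-1} for psi(t) = e^{-t} (theta = theta2):
    # psi^{-1}(u) = -ln(u) and -psi'(psi^{-1}(u)) = u, hence V(u) = -u*ln(u)
    return -u * math.log(u)

# local subadditivity at 0+: V(a + b) <= V(a) + V(b) for small a, b > 0
random.seed(0)
for _ in range(1000):
    a, b = random.uniform(1e-9, 0.1), random.uniform(1e-9, 0.1)
    assert V(a + b) <= V(a) + V(b)
```

(In fact V(α) + V(β) − V(α + β) = α ln((α+β)/α) + β ln((α+β)/β) > 0, so subadditivity here holds for all positive α, β.)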
Lemma 2.3. Suppose Assumption 1 holds and that F is a P_0-function. Then:

(i) There exists r̂ > 0 such that for any r ∈ (0, r̂), Φ_r(x) = 0 has a unique solution x(r), and the mapping r ↦ x(r) is continuous on (0, r̂).
(ii) lim_{r↘0} inf_{x*∈SOL(F)} ‖x(r) − x*‖ = 0.

Lemma 2.3(i) implies the well-posedness of the equation Φ_r(x) = 0, while Lemma 2.3(ii) suggests the following typical algorithm employed in smoothing techniques, which is the one presented in [18].

Algorithm 1.
Let ε > 0 and x⁰ ∈ R^n_{++}. Set r_0 = max{1, √(max_{1≤i≤n} |x⁰_i F_i(x⁰)|)} and k = 0.

For k = 0, 1, . . . until max_{1≤i≤n} |x^k_i F_i(x^k)| ≤ ε, do:

1. Solve Φ_{r_k}(x) = G_{r_k}(x, F(x)) = 0, where G_r(·, ·) is defined in (2.5), to get a solution x^k.
2. Set r_{k+1} = min{0.1 r_k, (r_k)², √(max_{1≤i≤n} |x^k_i F_i(x^k)|)}.
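A minimal sketch of Algorithm 1 (our own illustration, not the authors' implementation) can be written with θ^(2), for which (2.4) reduces to φ_r(s, t) = −r ln(e^{−s/r} + e^{−t/r}), and a damped-Newton inner solver. The toy monotone map F(x) = (2x₁ + x₂ − 2, x₁ + 2x₂ + 1) is ours; its NCP solution is x* = (1, 0):

```python
import math

def sig(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def phi(s, t, r):
    # For theta2, (2.4) reduces to -r*ln(e^{-s/r} + e^{-t/r}); stable form below.
    return min(s, t) - r * math.log(1.0 + math.exp(-abs(s - t) / r))

def dphi(s, t, r):
    # partials w.r.t. s and t; both lie in (0, 1) and sum to 1
    return sig((t - s) / r), sig((s - t) / r)

def F(x):
    # hypothetical toy monotone map; NCP solution is x* = (1, 0)
    return [2.0 * x[0] + x[1] - 2.0, x[0] + 2.0 * x[1] + 1.0]

JF = [[2.0, 1.0], [1.0, 2.0]]  # Jacobian of F (constant)

def Phi(x, r):
    Fx = F(x)
    return [phi(x[i], Fx[i], r) for i in range(2)]

def newton_solve(x, r, iters=100):
    """Inner solver for Step 1: damped Newton on Phi_r(x) = 0 (n = 2)."""
    for _ in range(iters):
        P = Phi(x, r)
        nP = max(abs(v) for v in P)
        if nP < 1e-12:
            break
        Fx = F(x)
        J = [[0.0, 0.0], [0.0, 0.0]]
        for i in range(2):          # J_ij = a_i*delta_ij + b_i*dF_i/dx_j
            a, b = dphi(x[i], Fx[i], r)
            for j in range(2):
                J[i][j] = (a if i == j else 0.0) + b * JF[i][j]
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        d0 = (J[1][1] * P[0] - J[0][1] * P[1]) / det
        d1 = (J[0][0] * P[1] - J[1][0] * P[0]) / det
        lam = 1.0
        while lam > 1e-8:           # simple backtracking safeguard
            xn = [x[0] - lam * d0, x[1] - lam * d1]
            if max(abs(v) for v in Phi(xn, r)) < nP:
                break
            lam *= 0.5
        x = xn
    return x

def residual(x):
    Fx = F(x)
    return max(abs(x[i] * Fx[i]) for i in range(2))

eps = 1e-8
x = [1.0, 1.0]                                       # x0 in R^n_{++}
r = max(1.0, math.sqrt(residual(x)))                 # r_0
while residual(x) > eps:
    x = newton_solve(x, r)                           # Step 1
    r = min(0.1 * r, r * r, math.sqrt(residual(x)))  # Step 2
    r = max(r, 1e-10)                                # guard against r = 0
print(x, residual(x))
```

The stopping test and the update of r follow Algorithm 1 verbatim; the Newton safeguard and the tiny lower bound on r are practical additions of ours.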
We remark that if x(r) solves Φ_r(x) = 0, then x(r) and F(x(r)) are both in R^n_{++}.

3. Designing smoothing functions

In Section 2, we introduced two subclasses of ℱ based on whether or not condition (Ha) is satisfied for all a ∈ (0, 1). In this section, we discuss some important remarks and results related to condition (Ha). Moreover, we present a simple and intuitive way to generate functions θ in ℱ. Under suitable assumptions, the construction also provides information on the classification of θ, i.e., whether θ belongs to ℱ_1 or ℱ_2.
We first consider a function θ satisfying (2.2) restricted to R_+. Observe that there is a one-to-one correspondence between these functions and probability density functions. Indeed, if p(t) := θ′(t), then p(t) ≥ 0 and

∫_0^∞ p(t) dt = lim_{t→∞} ∫_0^t θ′(τ) dτ = lim_{t→∞} (θ(t) − θ(0)) = 1.

Hence, the function θ corresponds to a probability density function on R_+. Conversely, one natural way to generate θ is to consider a probability density function p : R_+ → R_+ and take

θ(t) = ∫_0^t p(τ) dτ  for all t ∈ [0, ∞).

This construction was also mentioned in [32].

Another natural and intuitive way to generate θ with the desired asymptotic behavior at infinity is to consider a differential equation with an equilibrium solution at 1. Consider, for instance, the separable autonomous differential equation

dθ/dt = f(θ),  θ(0) = 0.    (3.1)

Throughout the paper, the following are our assumptions on f:

Assumption 2. (i) f is C¹ on an open interval containing [0, 1]. (ii) f > 0 on [0, 1) and f(1) = 0.

Assumption 2(i) is important in ensuring that the initial-value problem (3.1) has a unique solution. On the other hand, Assumption 2(ii) is required to obtain an increasing function θ(t) which approaches 1 as t → ∞.
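As a quick sanity check of this construction (our own, not part of the paper), integrating (3.1) numerically with a classical Runge–Kutta scheme for f(θ) = (1 − θ)^k reproduces the closed-form generators in (2.3): k = 1 gives θ^(2)(t) = 1 − e^{−t}, and k = 2 gives θ^(1)(t) = t/(t + 1) on t ≥ 0:

```python
import math

def solve_theta(f, T=10.0, h=1e-3):
    """Integrate theta' = f(theta), theta(0) = 0 on [0, T] with classical RK4."""
    theta, t, traj = 0.0, 0.0, [(0.0, 0.0)]
    for _ in range(int(T / h)):
        k1 = f(theta)
        k2 = f(theta + 0.5 * h * k1)
        k3 = f(theta + 0.5 * h * k2)
        k4 = f(theta + h * k3)
        theta += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        traj.append((t, theta))
    return traj

# k = 1: f(theta) = 1 - theta   ->  theta(t) = 1 - e^{-t}   (theta^(2) in (2.3))
traj1 = solve_theta(lambda u: 1.0 - u)
err1 = max(abs(th - (1.0 - math.exp(-t))) for t, th in traj1)

# k = 2: f(theta) = (1 - theta)^2  ->  theta(t) = t/(t + 1)  (theta^(1) in (2.3))
traj2 = solve_theta(lambda u: (1.0 - u) ** 2)
err2 = max(abs(th - t / (t + 1.0)) for t, th in traj2)

print(err1, err2)   # both tiny: RK4 matches the closed forms
```

Function names and step sizes are our choices; the differential equations themselves are exactly (3.1) with the factorized f used later in Theorem 3.1.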
Proposition 3.1. Let f be any function satisfying Assumption 2. Then the initial-value problem (3.1) has a unique solution θ(t), which is defined for all t ≥ 0 and satisfies lim_{t→∞} θ(t) = 1.

Proof. The existence and uniqueness of the solution follow from Assumption 2(i) (see [33]). Let [0, T) be the maximal interval of existence of the solution θ(t) of (3.1). Since f(θ) > 0 for all θ ∈ (0, 1), θ(t) is an increasing function. Thus, we may let L = lim_{t→∞} θ(t). Note also that the constant function 1 is a solution of the differential equation dθ/dt = f(θ). Since the solution through a point is unique, θ(t) < 1 for all t ∈ [0, T). Consequently, L ≤ 1 and {θ(t) : t ∈ [0, T)} ⊆ [0, 1]. Hence, θ(t) is defined for all t ≥ 0 (see [33]). Finally, note that from (3.1), we have

θ(t) = ∫_0^t f(θ(τ)) dτ.

Since θ(t) → L, we conclude that ∫_0^∞ f(θ(τ)) dτ is convergent. It follows that f(L) = lim_{t→∞} f(θ(t)) = 0. By Assumption 2(ii), we conclude that L = 1, as desired. □

It was mentioned in the preceding section that satisfying condition (Ha) is very important to guarantee the applicability of the smoothing approach. The benefit of constructing θ using (3.1) is that condition (Ha) can be easily deduced if f has a specific form. We describe this in the following theorem.

Theorem 3.1. Let f̄ be a continuously differentiable function on an open interval containing [0, 1] such that f̄ > 0 on [0, 1]. Suppose that f takes the form f(θ) = f̄(θ)(1 − θ)^k, where k ≥ 1. If θ(t) denotes the unique solution of (3.1), then θ ∈ ℱ. In particular:

(i) If k = 1, then θ ∈ ℱ_1.
(ii) If k > 1, then θ ∈ ℱ_2. Moreover, condition (Ha) is satisfied when a ∈ (0, 1/2^{k−1}) and is not satisfied when a ∈ (1/2^{k−1}, 1).

Before we prove Theorem 3.1, we need the following lemma, which will also be useful later in Theorems 4.3 and 4.4.

Lemma 3.1. Under the hypotheses of Theorem 3.1, for all a ∈ (0, 1) we have

lim_{t→∞} (1 − θ(t))/(1 − θ(at)) = 0 if k = 1,  and  lim_{t→∞} (1 − θ(t))/(1 − θ(at)) = a^{1/(k−1)} if k > 1.
Proof. Let a ∈ (0, 1) and define ψ(t) := 1 − θ(t). Note that

(d/dt) ψ(t) = −dθ/dt = −f̄(θ(t))(1 − θ(t))^k = −f̄(θ(t))(ψ(t))^k.    (3.2)

Suppose first that k = 1. Since θ(0) = 0, we have ψ(0) = 1. Moreover, we have from (3.2) that

1 − θ(t) = ψ(t) = exp( −∫_0^t f̄(θ(τ)) dτ ).

Consequently, for any a ∈ (0, 1),

(1 − θ(t))/(1 − θ(at)) = exp(−∫_0^t f̄(θ(τ)) dτ) / exp(−∫_0^{at} f̄(θ(τ)) dτ) = exp( −∫_{at}^t f̄(θ(τ)) dτ ).    (3.3)

Since f̄(θ) > 0 on [0, 1], there exists m > 0 such that f̄(θ) > m for all θ ∈ [0, 1]. Continuing from (3.3),

(1 − θ(t))/(1 − θ(at)) ≤ exp(−m(1 − a)t)  for all t ≥ 0.

Letting t → ∞ gives the desired limit for the first case.

Now, suppose that k > 1. From (3.2), we have

(1 − θ(t))^{1−k} = (ψ(t))^{1−k} = 1 + (k − 1)∫_0^t f̄(θ(τ)) dτ.

Note that since f̄(1) ≠ 0, we have ∫_0^∞ f̄(θ(τ)) dτ = ∞. This leads to

lim_{t→∞} ( (1 − θ(t))/(1 − θ(at)) )^{k−1} = lim_{t→∞} [1 + (k − 1)∫_0^{at} f̄(θ(τ)) dτ] / [1 + (k − 1)∫_0^t f̄(θ(τ)) dτ] = lim_{t→∞} a f̄(θ(at))/f̄(θ(t)) = a,

where the last equality follows from the fact that f̄(1) ≠ 0 and θ(t) → 1 by Proposition 3.1. □

Proof of Theorem 3.1. Fix a ∈ (0, 1) and define h_a(t) = 1/2 + (1/2)θ(at) − θ(t) for all t ≥ 0. Differentiating h_a, we obtain

h′_a(t) = (a/2)θ′(at) − θ′(t) = (a/2) f(θ(at)) − f(θ(t)).    (3.4)

Since f̄(1) ≠ 0 and θ(t) → 1, we have from Lemma 3.1 that

lim_{t→∞} f(θ(t))/f(θ(at)) = lim_{t→∞} [f̄(θ(t))(1 − θ(t))^k] / [f̄(θ(at))(1 − θ(at))^k] = 0 if k = 1,  and  = a^{k/(k−1)} if k > 1.    (3.5)

Suppose now that k = 1. By (3.5), we can find t_a > 0 such that f(θ(t)) < (a/2) f(θ(at)) for all t > t_a. From (3.4), we see that h′_a(t) > 0 for all t > t_a, and therefore h_a(t) is strictly increasing on (t_a, ∞). But clearly, h_a(t) approaches 0 as t → ∞. Thus, h_a(t) < 0 for all t > t_a. That is, θ(t) satisfies condition (Ha) for all a ∈ (0, 1).

On the other hand, suppose k > 1. If 0 < a < 1/2^{k−1}, then

a/2 − a^{k/(k−1)} = a(1/2 − a^{1/(k−1)}) > 0.

Then, by (3.5), we can find t_a > 0 such that f(θ(t))/f(θ(at)) < a/2 for all t > t_a. As in the preceding case, θ(t) satisfies condition (Ha) for all a ∈ (0, 1/2^{k−1}). If 1/2^{k−1} < a < 1, then a^{k/(k−1)} − a/2 > 0, and so there exists t_a > 0 for which f(θ(t))/f(θ(at)) > a/2 for all t > t_a. It follows that h_a(t) > 0 on (t_a, ∞). This completes the proof. □

Remark 3.1. Given a function f satisfying Assumption 2, we may apply Theorem 3.1 provided that there exists k
≥
1 such that L:=
limθ→1f (θ
)(1− θ
)−k̸=
0. In this case, we define¯
f (θ
):=
f (θ
)(1− θ
)−kforθ ∈ [
0,
1) and definef (1)¯ :=
L.We then obtain the desired factorization f (
θ
)= ¯
f (θ
)(1− θ
)k.In general, this factorization is not always achievable. To see this, consider the function g(t)
=
t3(
1
+
sin1 t)
+
t(1−
cos t).
It is easy to verify that g is continuously differentiable on R. Moreover, since the first term of g is nonnegative on (0
,
1]
while the second term is greater than 0 on (0,
1]
, then g is strictly positive on (0,
1]
. In addition,lim
t→0+
g(t)
tk
=
0∀
k∈ [
1,
3),
6
and the above limit does not exist for k
≥
3. Defining f (θ
):=
g(1− θ
), we see that f satisfiesAssumption 2. But, there is no k≥
1 such that limθ→1−f (θ
)(1− θ
)−kexists and is nonzero.The following is an easy consequence ofTheorem 3.1and the above remark.
Corollary 3.1. If f′(1)
<
0, thenθ ∈
F1.Example 3.1. We generate five functions from(3.1), two of which are the ones used in [18] given by Eq.(2.3). It can be easily verified that V given inLemma 2.2is locally subadditive at 0+for each of the functions below.
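As a numerical illustration of Theorem 3.1(ii) (our own check, not from the paper): θ^(1)(t) = t/(t + 1) solves (3.1) with f(θ) = (1 − θ)², i.e. k = 2 and threshold 1/2^{k−1} = 1/2, so the sign of h_a(t) = 1/2 + θ(at)/2 − θ(t) for large t flips as a crosses 1/2, while for θ^(2) (k = 1) h_a is eventually negative for every a ∈ (0, 1). To avoid floating-point cancellation we evaluate h_a through the complement ψ = 1 − θ, using the identity h_a(t) = ψ(t) − ψ(at)/2:

```python
import math

def psi1(t):   # 1 - theta1(t) for t >= 0, where theta1(t) = t/(t+1); k = 2
    return 1.0 / (t + 1.0)

def psi2(t):   # 1 - theta2(t), where theta2(t) = 1 - e^{-t}; k = 1
    return math.exp(-t)

def h(a, t, psi):
    # h_a(t) = 1/2 + theta(a t)/2 - theta(t) = psi(t) - psi(a t)/2,
    # computed via the complement to avoid cancellation for large t
    return psi(t) - 0.5 * psi(a * t)

# theta1 (k = 2): threshold 1/2^{k-1} = 1/2
assert h(0.4, 1e6, psi1) < 0        # a < 1/2: (Ha) holds eventually
assert h(0.6, 1e6, psi1) > 0        # a > 1/2: (Ha) fails
# theta2 (k = 1, class F1): h_a(t) < 0 for t > ln(2)/(1 - a), any a in (0, 1)
for a in (0.1, 0.5, 0.9, 0.99):
    assert h(a, 200.0, psi2) < 0
```

The sample values of a and t are ours; the threshold 1/2 is exactly the bound 1/2^{k−1} from Theorem 3.1(ii).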
The parameter k plays a significant role in numerical simulations. Notice that (1 − θ)^k decreases in value as k increases when θ ∈ (0, 1). In view of (3.1) and the form of f given in Theorem 3.1, k controls the growth rate of the function θ(t). In Section 5, it is shown that functions with a faster growth rate yield faster convergence times for the neural networks. However, the rate of increase of θ should not be too fast, so as to avoid ill-conditioning. Hence, the parameter k is useful when we take into account the conditioning of the problem. That is, a higher value of k may be adopted when the problem is ill-conditioned.

We now give an example of a function θ that does not belong to ℱ.

Example 3.2. Define θ(t) = (2/π) ∫_0^t (sin² τ)/τ² dτ. Then θ is a strictly increasing, continuously differentiable function such that θ(0) = 0 and θ(t) → 1 as t → ∞. However, the function h_a defined in the proof of Theorem 3.1 oscillates between positive and negative values for arbitrarily large t, and therefore θ does not satisfy condition (Ha) for any a ∈ (0, 1).

Remark 3.2. One can observe that class ℱ_1 functions can be viewed, in some sense, as the limit case of class ℱ_2 functions. First, notice that in Lemma 3.1, the zero limit obtained for class ℱ_1 functions can be viewed as the limit of a^{1/(k−1)} as k ↘ 1, where a^{1/(k−1)} is the calculated limit for class ℱ_2 functions. In Theorem 3.1, we see that for θ ∈ ℱ_2, condition (Ha) is satisfied only for a ∈ I_k := (0, 2^{1−k}). Note that I_k ↗ (0, 1) as k ↘ 1 and that condition (Ha) holds for all a ∈ (0, 1) for class ℱ_1 functions.

Remark 3.3. We mention some remarks regarding the definition of θ over (−∞, 0). In order for G_r given by (2.5) to be smooth over R^n × R^n for any r > 0, we must use a differentiable extension of θ for t < 0. In this paper, we let θ(t) = t f(0) for t < 0, similar to the construction of θ^(1) in (2.3).

We conclude this section with the following proposition, which implies that the limit in Lemma 2.1(iii) is in fact a characterization of functions in ℱ_1. That is, a smoothing function of φ_NR over the positive orthant is obtained if and only if θ ∈ ℱ_1.
θ ∈
F2and leta¯ ∈
inf S, where S= {
a∈
(0,
1)|
condition (Ha) does not hold} .
Let P¯a= {
(s,
t)∈
R2++: ¯
as<
t<
as¯}
. Then, we havelim
r↘0
φ
r(s,
t)<
min{
s,
t} ∀
(s,
t)∈
P¯a,
whenever the limit exists.Proof. We note first that (a
¯ ,
1)⊆
S. Indeed, the set S is nonempty and¯
a∈
(0,
1) sinceθ ∈
F2. Ifa˜ ∈
S, for any u>
0,θ
(v
)<
12+
12θ
(a˜ v
) for somev ≥
u. Sinceθ
is increasing, we see that condition (Ha) also does not hold for any a∈
(˜
a,
1).Now, fix (s
,
t)∈
Pa¯ and suppose s=
min{
s,
t}
. Given any a∈
(a¯ ,
1), then by the above note, there exists a sequence{
rn}
such that rn↘
0 andθ (
trn
)
<
1 2+
12
θ (
atrn
) .
Since
θ
andθ
−1are increasing and s≤
t, the above inequality yieldsφ
rn(s,
t)=
rnθ
−1( θ
(
s rn) + θ
(
t rn)
−
1)
≤
rnθ
−1(
2
θ (
trn
)
−
1)
<
atLetting n
→ ∞
, we obtain limn→∞φ
rn(s,
t)≤
at for any a∈
(a¯ ,
1). Since (s,
t)∈
Pa¯, then s> ¯
at and by letting a↘ ¯
a, we obtainlim
n→∞
φ
rn(s,
t)≤ ¯
at<
s=
min{
s,
t}
which is the desired result. □It is interesting to note that as inRemark 3.2, classF1can again be viewed as the limit case of classF2whena
¯ ↘
0.More precisely, the limit inLemma 2.1(iii) holds on
⋃
a¯∈(0,1)Pa¯
=
R2++whenθ ∈
F1.7
4. Smoothed neural networks
In this section, we present two gradient dynamical systems to solve the NCP (2.1) using the smoothing approach presented in Section 2. Similar to the approach used in [19,21], the stability analysis of our neural networks relies on the use of Lyapunov functions. For more details, we refer the reader to [33,34].
4.1. The first neural network
We first consider a neural network which can be used to obtain approximate solutions of NCP(F). From Lemma 2.3, solutions of (2.1) can be obtained by successively solving, for decreasing values of r, the equation Φ_r(x) = 0, which is exactly the motivation of Algorithm 1. Note that solving this equation, if a solution exists, is equivalent to solving the smooth minimization problem

min_{x∈R^n} Ψ_r(x) := (1/2)‖Φ_r(x)‖².    (4.1)

This motivates the steepest-descent-based neural network

dx/dt = −ρ∇Ψ_r(x),  x(0) = x⁰,    (NN1)

where ρ > 0 is a time-scaling parameter and r > 0 is a sufficiently small fixed number. Consequently, neural network (NN1) can be used to deal with Step 1 of Algorithm 1.

Observe that a solution of (4.1) is an equilibrium point of (NN1). However, the converse is not necessarily true. We are interested in conditions that can be imposed on F so that an equilibrium point of the neural network (NN1) is also an optimal solution of (4.1), and consequently an approximate solution of (2.1). To this end, we first establish an important property of ∇Φ_r.

Theorem 4.1. Let F be a P_0-function, r > 0, and suppose that θ′(u) ≠ 0 for all u ∈ R. Then ∇Φ_r(x) is nonsingular for any x ∈ R^n and r > 0.
Proof. First, we note that

∇_a φ_r(s, t) = (∂/∂s) φ_r(s, t) = (θ^{−1})′( θ(s/r) + θ(t/r) − 1 ) θ′(s/r) = θ′(s/r) / θ′( θ^{−1}( θ(s/r) + θ(t/r) − 1 ) ) > 0,    (4.2)

where we used the fact that θ′(θ^{−1}(u)) · (θ^{−1})′(u) = 1. By symmetry, we also have ∇_b φ_r(s, t) := (∂/∂t) φ_r(s, t) > 0.

Let x ∈ R^n and r > 0, and denote by A_r(x) and B_r(x) the n × n diagonal matrices such that

(A_r(x))_{ii} = ∇_a φ_r(x_i, F_i(x))  and  (B_r(x))_{ii} = ∇_b φ_r(x_i, F_i(x)).

Note that A_r(x) and B_r(x) are invertible since ∇_a φ_r(x_i, F_i(x)) and ∇_b φ_r(x_i, F_i(x)) are both strictly positive for all i, as noted above. Furthermore, we have by the chain rule that

∇Φ_r(x) = A_r(x) + ∇F(x) · B_r(x) = (D_r(x) + ∇F(x)) · B_r(x),    (4.3)

where D_r(x) = A_r(x) B_r(x)^{−1}. Since D_r(x) is a diagonal matrix,

det(D_r(x) + ∇F(x)) = Σ_{Λ⊆{1,...,n}} det(D_Λ) det(∇F(x)_{Λ^c}),    (4.4)

where D_Λ denotes the principal submatrix of D_r(x) indexed by Λ (see, for instance, Chapter 2 of [35]). Each term in the above summation is nonnegative since F is a P_0-function and the diagonal entries of D_r(x) are positive. In particular, the term corresponding to Λ = {1, . . . , n} is precisely det(D_r(x)) > 0. Thus, D_r(x) + ∇F(x) is nonsingular. Since B_r(x) is also nonsingular, we conclude from (4.3) that ∇Φ_r(x) is nonsingular. □

Observe that the hypothesis on θ in the above theorem holds if θ is generated from (3.1) using Assumption 2 and Remark 3.3. We now state some important consequences of the above result.

Corollary 4.1. Under the hypotheses of Theorem 4.1, the following hold:
(i) Every equilibrium point of (NN1) is an optimal solution of (4.1).
(ii) An isolated equilibrium point of (NN1) is exponentially stable.
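To illustrate (NN1) numerically (our own sketch, not the paper's implementation), we can integrate dx/dt = −ρ∇Ψ_r(x) by forward Euler for the θ^(2)-based smoothing on the hypothetical toy monotone map F(x) = (2x₁ + x₂ − 2, x₁ + 2x₂ + 1), whose NCP solution is x* = (1, 0). Here ∇Ψ_r(x) = ∇Φ_r(x)Φ_r(x), with ∇Φ_r assembled as in (4.3):

```python
import math

def sig(z):
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def phi(s, t, r):
    # phi_r for theta2(t) = 1 - e^{-t}: -r*ln(e^{-s/r} + e^{-t/r}), stable form
    return min(s, t) - r * math.log(1.0 + math.exp(-abs(s - t) / r))

def F(x):
    return [2.0 * x[0] + x[1] - 2.0, x[0] + 2.0 * x[1] + 1.0]

JF = [[2.0, 1.0], [1.0, 2.0]]   # constant Jacobian of F

def grad_Psi(x, r):
    # grad Psi_r(x) = J(x)^T Phi_r(x), where J_ij = a_i*delta_ij + b_i*dF_i/dx_j
    Fx = F(x)
    P = [phi(x[i], Fx[i], r) for i in range(2)]
    g = [0.0, 0.0]
    for i in range(2):
        a, b = sig((Fx[i] - x[i]) / r), sig((x[i] - Fx[i]) / r)
        for j in range(2):
            g[j] += ((a if i == j else 0.0) + b * JF[i][j]) * P[i]
    return g, P

# forward-Euler integration of (NN1): dx/dt = -rho * grad Psi_r(x)
rho, r, h = 1.0, 0.01, 0.02
x = [0.5, 0.5]
for _ in range(20000):
    g, P = grad_Psi(x, r)
    x = [x[0] - h * rho * g[0], x[1] - h * rho * g[1]]

print(x, P)   # x settles at the equilibrium x(r), close to (1, 0); Phi_r(x) ~ 0
```

The step size h and the number of iterations are ad hoc choices for this 2-dimensional example; an actual circuit realization or a stiff ODE integrator would replace the Euler loop.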