
## Journal of Computational and Applied Mathematics


## A new class of neural networks for NCPs using smooth perturbations of the natural residual function

### Jan Harold Alcantara, Jein-Shan Chen^{∗,1}

*Department of Mathematics, National Taiwan Normal University, Taiwan*

*Article history:* Received 10 February 2020; Received in revised form 4 January 2022

*MSC:* 37N40; 65K10; 65K15

*Keywords:* Complementarity problem; Smoothing method; Neural network; Stability

**Abstract**

We present a new class of neural networks for solving nonlinear complementarity problems (NCPs) based on some family of real-valued functions (denoted by $\mathcal{F}$) that can be used to construct smooth perturbations of the level curve defined by $\Phi_{NR}(x,y) = 0$, where $\Phi_{NR}$ is the natural residual function (also called the "min" function). We introduce two important subclasses of $\mathcal{F}$, which deserve particular attention because of their significantly different theoretical and numerical properties. One of these subfamilies yields a smoothing function for $\Phi_{NR}$, while the other subfamily only yields a smoothing curve for $\Phi_{NR}(x,y) = 0$. We also propose a simple framework for generating functions from these subclasses. Using the smoothing approach, we build two types of neural networks and provide sufficient conditions to guarantee asymptotic and exponential stability of equilibrium solutions. Finally, we present extensive numerical experiments to validate the theoretical results and to illustrate the difference in numerical performance of functions from the two subclasses. Numerical comparisons with existing neural networks for NCPs are also demonstrated.

© 2022 Elsevier B.V. All rights reserved.

**1. Introduction**

A nonlinear complementarity problem (NCP) consists of nonlinear inequalities with nonnegativity and orthogonality conditions on multiple variables and given multivariate functions. Such problems arise in constrained optimization and equilibrium problems [1]. Moreover, applications of NCPs in operations research, engineering, and economics motivated significant research efforts in the past decades [1–3], which resulted in various numerical techniques including the merit function approach [4–6], nonsmooth Newton method [7–9], and regularization approach [10,11]. We refer the interested reader to the monograph [1] and the paper [12] for a survey and thorough discussion of solution methods for complementarity problems.

Smoothing methods belong to another class of solution methods that have been extensively used in solving complementarity problems [10,13–17]. A natural smoothing technique for the NCP is to construct a differentiable approximation of

$$\phi_{NR}(s,t) := \min\{s,t\} = s - (s-t)_+ \qquad (1.1)$$

using a smoothing function for the plus function. Meanwhile, Haddou and Maheux [18] recently introduced a novel smoothing approach to handle NCPs. In their approach, they utilized smooth perturbations of $\phi_{NR}$ that do not necessarily correspond to smoothing functions for $\phi_{NR}$. These perturbations can be viewed as smooth approximations of the level curve $\phi_{NR}(s,t) = 0$, which are then used to obtain approximate solutions of the NCP. Unfortunately, there have been no further studies on algorithmic procedures for NCPs following this smoothing framework. Moreover, the choice of smooth perturbations, as well as a procedure for controlling the perturbation parameter, are numerical issues that were left for future studies in [18].

Meanwhile, neural network (NN) approaches for complementarity problems have also been explored in [19–21], mainly because it is desirable to have a real-time solution of the NCP, which may not be attainable with the usual approaches mentioned above. Neural networks are hardware-implementable, i.e. via integrated circuits, and therefore exhibit real-time processing. Hopfield and Tank originally introduced neural networks for optimization [22,23], and they have since been used in solving linear and nonlinear programming problems and variational inequalities [24–30]. Notice, however, that the NCP is not an optimization problem. Nevertheless, the NCP can be reformulated as a smooth minimization problem by constructing a merit function, usually through the use of NCP-functions [4–6,19–21]. A nonnegative merit function usually serves as an energy function, which is then used to formulate a steepest-descent dynamical system whose equilibrium solutions correspond to NCP solutions under suitable conditions.

∗ Corresponding author. *E-mail addresses:* 80640005s@ntnu.edu.tw (J.H. Alcantara), jschen@math.ntnu.edu.tw (J.-S. Chen).

1 The author's work is supported by Ministry of Science and Technology, Taiwan.

https://doi.org/10.1016/j.cam.2022.114092

A neural network based on the Fischer–Burmeister (FB) function was designed to handle $P_0$-NCPs in [21]. In [20], these results were extended to the generalized Fischer–Burmeister (GFB) function, an NCP function that involves a parameter $p \in (1, \infty)$. It was shown that for the latter NN, better numerical performance of the network can be achieved by choosing a larger value of $p$. These neural networks have good stability and convergence properties and are not very sensitive to initial conditions. Aside from the FB and GFB neural networks, three classes of NNs based on the discrete generalization of $\phi_{NR}$ given in (1.1) were recently formulated in [19]. These neural networks have relatively better convergence speed than the FB and GFB neural networks, but their stability requires stricter assumptions, and they are usually more sensitive to initial conditions.

In this paper, we build a new class of neural networks based on the smoothing method for the NCP introduced by Haddou and Maheux [18] using some family $\mathcal{F}$ of smoothing functions. We introduce two subclasses of $\mathcal{F}$ and propose a simple method to generate functions from these subclasses. Interestingly, the aforementioned issues of how to choose the smooth perturbations of $\phi_{NR}$ and how to control the decrease of the smoothing parameter can be best addressed by considering these two subclasses. Sufficient conditions to guarantee stability of the neural networks are provided and are illustrated through several numerical experiments. Finally, we compare the present neural networks with the well-known FB and GFB neural networks in solving $P_0$- and non-$P_0$-NCPs.

This paper is organized as follows: In Section 2, we recall the smoothing method proposed in [18] and introduce the family $\mathcal{F}$. In Section 3, we discuss a simple idea on how to generate smoothing functions. We also prove a characterization result for the two subfamilies of $\mathcal{F}$ introduced. In Section 4, two neural networks for NCPs are presented. We provide several sufficient conditions for Lyapunov and exponential stability of the neural networks. Several numerical simulations are presented in Section 5 to understand how to choose a smoothing function for the NCP. A comparative analysis with the FB and GFB neural networks is also shown. Conclusions and some recommendations for future studies are presented in Section 6.

The notations used in this study are as follows. $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space endowed with the usual inner product, and $\mathbb{R}^{m\times n}$ denotes the space of $m \times n$ real matrices. We let $\mathbb{R}^n_+$ and $\mathbb{R}^n_{++}$ denote the nonnegative and positive orthants of $\mathbb{R}^n$, respectively. $M^T$ denotes the transpose of a matrix $M$. $M_{ij}$ and $M_{\cdot j}$ will be used to denote the $(i,j)$-entry of $M$ and the $j$th column of $M$, respectively. For $M \in \mathbb{R}^{n\times n}$ and $\Lambda \subseteq \{1,\dots,n\}$, we denote by $M_\Lambda$ the principal submatrix of $M$ indexed by $\Lambda$ (i.e. the submatrix of $M$ corresponding to rows and columns indexed by $\Lambda$). $\Lambda^c$ denotes the complement of $\Lambda$. For any differentiable function $f : \mathbb{R}^2 \to \mathbb{R}$, $\nabla_a f(s,t)$ and $\nabla_b f(s,t)$ denote the partial derivatives of $f$ with respect to $s$ and $t$, respectively. Given a differentiable mapping $F = (F_1,\dots,F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \cdots \nabla F_m(x)] \in \mathbb{R}^{n\times m}$ denotes the transposed Jacobian of $F$ at $x$, where $\nabla F_i(x)$ denotes the gradient of $F_i$ at $x$. Given a family of real-valued functions on $\mathbb{R}^n$, $\{\varphi_r : r > 0\}$, we denote by $\varphi_0$ the pointwise limit $\lim_{r\searrow 0}\varphi_r$, whenever it exists.

**2. Smoothing approach for NCP**

In this section, we present the smoothing approach for NCP proposed by Haddou and Maheux [18] and introduce two important classes of smoothing functions. We also recall some concepts related to nonlinear mappings.

Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be given. The nonlinear complementarity problem, which we denote by NCP($F$), is to find a point $x \in \mathbb{R}^n$ such that

$$x \geq 0, \qquad F(x) \geq 0 \qquad \text{and} \qquad \langle x, F(x)\rangle = 0. \qquad (2.1)$$

The set of all solutions of NCP($F$) is denoted by SOL($F$), which we assume to be nonempty. We can also show the equivalence of NCP($F$) and the equation $x = P_K(x - F(x))$, where $K = \mathbb{R}^n_+$ and $P_K$ denotes the projection onto $K$. In light of this equivalence, we see that $x$ solves (2.1) if and only if $\Phi_{NR}(x, F(x)) = 0$, where $\Phi_{NR} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the nonsmooth mapping given by

$$\Phi_{NR}(x,y) = \begin{pmatrix} \phi_{NR}(x_1, y_1) \\ \vdots \\ \phi_{NR}(x_n, y_n) \end{pmatrix}.$$
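To make this equivalence concrete, here is a small numerical sketch (the map $F$ below is a hypothetical affine monotone example chosen only for illustration, not one from the paper) checking componentwise that $\Phi_{NR}(x, F(x)) = x - P_K(x - F(x))$, where $P_K(z) = \max\{z, 0\}$:

```python
import numpy as np

def Phi_NR(x, y):
    # natural residual applied componentwise: min{x_i, y_i}
    return np.minimum(x, y)

def F(x):
    # hypothetical affine monotone map (Jacobian [[2,1],[1,2]] is positive definite)
    return np.array([2.0*x[0] + x[1] - 1.0, x[0] + 2.0*x[1] + 1.0])

def proj_residual(x):
    # x - P_K(x - F(x)) with K = R^n_+, i.e. P_K(z) = max{z, 0}
    return x - np.maximum(x - F(x), 0.0)

x = np.array([0.5, 0.0])
print(Phi_NR(x, F(x)))   # zero vector: this x solves NCP(F)
print(proj_residual(x))  # identical residual
```

The two residuals agree because of the scalar identity $\min\{s,t\} = s - \max\{s-t, 0\}$, applied componentwise.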

In this paper, we will consider nonlinear monotone and $P_0$-functions. We recall some basic types of nonlinear mappings.

**Definition 2.1.** Let $F = (F_1,\dots,F_n)^T : \mathbb{R}^n \to \mathbb{R}^n$. Then, the mapping $F$ is said to be

**(i)** monotone if $\langle x - y, F(x) - F(y)\rangle \geq 0$ for all $x, y \in \mathbb{R}^n$.

**(ii)** strictly monotone if $\langle x - y, F(x) - F(y)\rangle > 0$ for all $x, y \in \mathbb{R}^n$ with $x \neq y$.

**(iii)** strongly monotone with modulus $\mu > 0$ if $\langle x - y, F(x) - F(y)\rangle \geq \mu\|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

**(iv)** a $P_0$-function if $\max_{1\leq i\leq n,\, x_i \neq y_i} (x_i - y_i)(F_i(x) - F_i(y)) \geq 0$ for all $x, y \in \mathbb{R}^n$ with $x \neq y$.

**(v)** a $P$-function if $\max_{1\leq i\leq n} (x_i - y_i)(F_i(x) - F_i(y)) > 0$ for all $x, y \in \mathbb{R}^n$ with $x \neq y$.

**(vi)** a uniform $P$-function with modulus $\kappa > 0$ if $\max_{1\leq i\leq n} (x_i - y_i)(F_i(x) - F_i(y)) \geq \kappa\|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

A matrix is called a $P_0$-matrix (resp. $P$-matrix) if all its principal minors are nonnegative (resp. positive). Note that $F$ is a $P_0$-function if and only if $\nabla F(x)$ is a $P_0$-matrix for all $x \in \mathbb{R}^n$. If $\nabla F(x)$ is a $P$-matrix for all $x \in \mathbb{R}^n$, then $F$ is a $P$-function. However, the converse is not necessarily true.
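For small matrices, the $P_0$- and $P$-matrix conditions can be checked directly by enumerating principal minors. A brute-force sketch (the two test matrices are hypothetical examples, not from the paper):

```python
import numpy as np
from itertools import combinations

def principal_minors(M):
    # determinants of all principal submatrices M_Lambda
    n = M.shape[0]
    return [np.linalg.det(M[np.ix_(idx, idx)])
            for k in range(1, n + 1)
            for idx in combinations(range(n), k)]

def is_P0_matrix(M):
    return all(d >= -1e-12 for d in principal_minors(M))  # all minors nonnegative

def is_P_matrix(M):
    return all(d > 1e-12 for d in principal_minors(M))    # all minors positive

M = np.array([[2.0, 1.0], [1.0, 3.0]])  # P-matrix: minors 2, 3, 5
N = np.array([[0.0, 1.0], [0.0, 1.0]])  # P0- but not P-matrix: minors 0, 1, 0
print(is_P_matrix(M), is_P0_matrix(N), is_P_matrix(N))  # True True False
```

The small tolerance guards against floating-point noise in the determinants; enumeration costs $2^n - 1$ determinants, so this is only practical for modest $n$.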

We now recall the smoothing approach described in [18]. First, we consider a continuously differentiable strictly increasing function $\theta$ such that

$$\lim_{t\to-\infty}\theta(t) = -\infty, \qquad \theta(0) = 0 \qquad \text{and} \qquad \lim_{t\to+\infty}\theta(t) = 1. \qquad (2.2)$$

We also impose the condition that $\theta(t)$ should approach 1 fast enough, but with some uniformity, for large values of $t$. This condition is precisely defined as follows.

**Definition 2.2.** Let $a \in (0,1)$ and suppose $\theta(t)$ is a strictly increasing $C^1$ function such that $\theta(0) = 0$ and $\lim_{t\to\infty}\theta(t) = 1$. We say that $\theta$ satisfies condition ($H_a$) if there exists $t_a > 0$ such that

$$\frac{1}{2} + \frac{1}{2}\theta(at) \leq \theta(t) \qquad \forall t \geq t_a.$$

We say that $\theta$ belongs to class $\mathcal{F}$ if it satisfies condition ($H_a$) for some $a \in (0,1)$.

For instance, the following functions were considered in [18]:

$$\theta^{(1)}(t) = \begin{cases} \dfrac{t}{t+1} & \text{if } t \geq 0, \\ t & \text{if } t < 0, \end{cases} \qquad \text{and} \qquad \theta^{(2)}(t) = 1 - e^{-t}. \qquad (2.3)$$

$\theta^{(1)}$ satisfies ($H_a$) for all $a \in (0, 1/2)$, while $\theta^{(2)}$ satisfies ($H_a$) for all $a \in (0,1)$.
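Condition ($H_a$) is easy to probe numerically on a grid of large $t$ values; the following sketch (an informal check on a finite grid, not a proof) is consistent with the ranges just stated:

```python
import numpy as np

def theta1(t):
    # theta^(1) from (2.3)
    return np.where(t >= 0, t/(t + 1.0), t)

def theta2(t):
    # theta^(2) from (2.3)
    return 1.0 - np.exp(-t)

def holds_Ha(theta, a, t_grid):
    # check 1/2 + theta(a t)/2 <= theta(t) on a grid of large t
    return bool(np.all(0.5 + 0.5*theta(a*t_grid) <= theta(t_grid)))

t = np.linspace(50.0, 1000.0, 500)
print(holds_Ha(theta1, 0.4, t))  # True:  a < 1/2
print(holds_Ha(theta1, 0.6, t))  # False: a > 1/2
print(holds_Ha(theta2, 0.9, t))  # True:  any a in (0,1)
```

For $\theta^{(1)}$ and $t \geq 0$, the inequality rearranges to $t(1-2a) \geq 1$, which explains why the threshold sits exactly at $a = 1/2$.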

For each $r > 0$, we define the function $\phi_r : \mathbb{R}^2 \to \mathbb{R}$ by

$$\phi_r(s,t) := r\,\theta^{-1}\!\left(\theta\!\left(\frac{s}{r}\right) + \theta\!\left(\frac{t}{r}\right) - 1\right). \qquad (2.4)$$
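As a worked special case, taking $\theta = \theta^{(2)}$ from (2.3) gives $\theta^{-1}(u) = -\ln(1-u)$, and (2.4) collapses to the "soft-min" $\phi_r(s,t) = -r\ln\!\left(e^{-s/r} + e^{-t/r}\right)$. A numerically stable sketch of this special case (factoring out $\min\{s,t\}$ to avoid overflow in the exponentials):

```python
import numpy as np

def phi_r(s, t, r):
    # phi_r(s,t) = r*theta^{-1}(theta(s/r) + theta(t/r) - 1) with theta = theta^(2);
    # algebra gives -r*log(exp(-s/r) + exp(-t/r)), rewritten stably below
    m = np.minimum(s, t)
    return m - r*np.log1p(np.exp(-np.abs(s - t)/r))

s, t = 1.3, 0.7
for r in (1.0, 0.1, 1e-3):
    print(phi_r(s, t, r) <= min(s, t))        # True in each case
print(abs(phi_r(s, t, 1e-3) - min(s, t)) < 1e-6)  # True: r -> 0 recovers min{s,t}
```

The first prints illustrate the underestimation property $\phi_r(s,t) \leq \min\{s,t\}$, and the last one the pointwise convergence as $r \searrow 0$ (for this particular $\theta$).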

Note that strict monotonicity of $\theta$ was imposed to guarantee its invertibility, as required in (2.4). We summarize here some important facts proved in [18].

**Lemma 2.1.** *Let $\theta$ be a strictly increasing $C^1$ function satisfying conditions (2.2). Then, the following hold.*

**(i)** *$\phi_r(s,t) \leq \min\{s,t\}$ for all $s, t \in \mathbb{R}$ and $r > 0$.*

**(ii)** *If $\theta \in \mathcal{F}$, then $\lim_{r\searrow 0}\phi_r(s,t) = 0 \iff \min\{s,t\} = 0$, for all $s, t \in \mathbb{R}$.*

**(iii)** *If $\theta$ satisfies condition ($H_a$) for all $a \in (0,1)$, then $\lim_{r\searrow 0}\phi_r(s,t) = \min\{s,t\}$ for all $s, t \in \mathbb{R}_{++}$.*

Let $G_r : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be given by

$$G_r(x,y) = \begin{pmatrix} \phi_r(x_1, y_1) \\ \vdots \\ \phi_r(x_n, y_n) \end{pmatrix}. \qquad (2.5)$$

By Lemma 2.1(iii), $G_r$ is a smoothing function for $\Phi_{NR}$ over $\mathbb{R}^n_{++} \times \mathbb{R}^n_{++}$ if $\theta$ satisfies ($H_a$) for all $a \in (0,1)$. In contrast, we may not obtain a smoothing function for $\Phi_{NR}$ if there exists $a \in (0,1)$ for which ($H_a$) does not hold. For instance, when $\theta = \theta^{(1)}$, one can verify that $\lim_{r\searrow 0} G_r(x,y) < \Phi_{NR}(x,y)$ (see also [18]). In view of these contrasting properties, we introduce two subclasses of $\mathcal{F}$.

**Definition 2.3.** Let $\theta \in \mathcal{F}$. We say that $\theta$ belongs to class $\mathcal{F}_1$ if it satisfies condition ($H_a$) for all $a \in (0,1)$. Otherwise, we say that $\theta$ belongs to $\mathcal{F}_2$.

In Section 3, we will prove that the result of Lemma 2.1(iii) holds only for functions in $\mathcal{F}_1$. Despite not obtaining a smoothing function for $\Phi_{NR}$ when $\theta \in \mathcal{F}_2$, Lemma 2.1(ii) is a very useful result for applying Haddou and Maheux's smoothing strategy. To see this, define the functions

$$\Phi_r(x) := G_r(x, F(x)) \qquad \text{and} \qquad \Phi_0(x) := \lim_{r\searrow 0}\Phi_r(x),$$

provided that the limit exists. Then by Lemma 2.1(ii), (2.1) is equivalent to solving the system of equations $\Phi_0(x) = 0$. Moreover, it is shown in [18] that whenever $F$ is a $P_0$-function, $\Phi_r$ is a $P$-function for any $r > 0$, while $\Phi_0$ is a $P_0$-function whenever it exists. Consequently, one may use Gowda and Tawhid's theory for $P_0$-equations given in [31] by viewing $\Phi_0$ as a continuous perturbation of $\Phi_r$. Indeed, Lemma 2.3, which was given in [18], is an easy consequence of Theorem 4 of [31]. Before stating the convergence result, we make some assumptions.

**Assumption 1.** SOL($F$) is nonempty and bounded, $\theta \in \mathcal{F}$, and $\Phi_0$ exists.

Some conditions which will guarantee the existence of $\Phi_0$ are given in the following result, whose proofs can be found in [18].

**Lemma 2.2.** *Let $s, t > 0$. Then $\phi_0(s,t) := \lim_{r\searrow 0}\phi_r(s,t)$ exists and $\phi_0(s,t) \leq \min\{s,t\}$ if any of the following conditions holds:*

**(i)** *There exists $\varepsilon > 0$ such that $\frac{\partial}{\partial r}\phi_r(s,t) \leq 0$ for all $r \in (0, \varepsilon)$;*

**(ii)** *$V := (-\psi' \circ \psi^{-1}) \times \psi^{-1}$ is locally subadditive at $0^+$; i.e. there exists $\eta > 0$ such that for all $0 < \alpha, \beta$ with $\alpha + \beta < \eta$, we have $V(\alpha + \beta) \leq V(\alpha) + V(\beta)$, where $\psi := 1 - \theta$.*

**Lemma 2.3.** *Suppose Assumption 1 holds and that $F$ is a $P_0$-function.*

**(i)** *There exists $\hat{r} > 0$ such that for any $r \in (0, \hat{r})$, $\Phi_r(x) = 0$ has a unique solution $x^{(r)}$, and the mapping $r \mapsto x^{(r)}$ is continuous on $(0, \hat{r})$.*

**(ii)** *$\lim_{r\searrow 0}\, \inf_{x^* \in \mathrm{SOL}(F)} \|x^{(r)} - x^*\| = 0$.*

Lemma 2.3(i) implies the well-posedness of the equation $\Phi_r(x) = 0$, while Lemma 2.3(ii) suggests the following typical algorithm employed in smoothing techniques, which is the one presented in [18].

**Algorithm 1.**

Let $\varepsilon > 0$ and $x^0 \in \mathbb{R}^n_{++}$. Set $r^0 = \max\left\{1, \sqrt{\max_{1\leq i\leq n}|x_i^0 F_i(x^0)|}\right\}$ and $k = 0$.

For $k = 0, 1, \dots$ until $\max_{1\leq i\leq n}|x_i^k F_i(x^k)| \leq \varepsilon$, do:

1. Solve $\Phi_{r^k}(x) = G_{r^k}(x, F(x)) = 0$, where $G_r(\cdot,\cdot)$ is defined in (2.5), to get a solution $x^k$.
2. Set $r^{k+1} = \min\left\{0.1\,r^k,\ (r^k)^2,\ \sqrt{\max_{1\leq i\leq n}|x_i^k F_i(x^k)|}\right\}$.

We remark that if $x^{(r)}$ solves $\Phi_r(x) = 0$, then $x^{(r)}$ and $F(x^{(r)})$ are both in $\mathbb{R}^n_{++}$.
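A minimal end-to-end sketch of Algorithm 1, under stated assumptions: $\theta = \theta^{(2)}$ from (2.3), so that (2.4) reduces to $\phi_r(s,t) = -r\ln(e^{-s/r} + e^{-t/r})$; the inner equation of step 1 is solved by a plain Newton iteration with a finite-difference Jacobian; and $F$ is a hypothetical affine $P$-function (not an example from the paper) whose NCP solution is $x^* = (0.2, 0.6)$:

```python
import numpy as np

def phi_r(s, t, r):
    # stable form of -r*log(exp(-s/r) + exp(-t/r)), i.e. (2.4) with theta = theta^(2)
    m = np.minimum(s, t)
    return m - r*np.log1p(np.exp(-np.abs(s - t)/r))

def F(x):
    # hypothetical affine P-function; F(x*) = 0 at the interior point x* = (0.2, 0.6)
    return np.array([2.0*x[0] + x[1] - 1.0, x[0] + 3.0*x[1] - 2.0])

def Phi(x, r):
    return phi_r(x, F(x), r)

def newton(x, r, tol=1e-12, h=1e-7, iters=100):
    # plain Newton with a forward-difference Jacobian (adequate for this tiny example)
    for _ in range(iters):
        g = Phi(x, r)
        if np.linalg.norm(g) < tol:
            break
        J = np.column_stack([(Phi(x + h*e, r) - g)/h for e in np.eye(x.size)])
        x = x - np.linalg.solve(J, g)
    return x

eps = 1e-8
x = np.array([1.0, 1.0])                                   # x^0 in R^n_{++}
r = max(1.0, np.sqrt(np.max(np.abs(x*F(x)))))              # r^0
for _ in range(30):
    if np.max(np.abs(x*F(x))) <= eps:
        break
    x = newton(x, r)                                       # step 1: solve Phi_{r^k}(x) = 0
    r = min(0.1*r, r**2, np.sqrt(np.max(np.abs(x*F(x)))))  # step 2: update r
print(x)  # close to (0.2, 0.6)
```

Consistent with the remark above, the inner solutions keep $x$ and $F(x)$ componentwise positive, while the complementarity residual $\max_i |x_i F_i(x)|$ is driven below $\varepsilon$ in a handful of outer iterations.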

**3. Designing smoothing functions**

In Section 2, we introduced two subclasses of $\mathcal{F}$ based on whether or not condition ($H_a$) is satisfied for all $a \in (0,1)$. In this section, we discuss some important remarks and results related to condition ($H_a$). Moreover, we present a simple and intuitive way to generate functions $\theta$ from $\mathcal{F}$. Under some suitable assumptions, this construction also provides information on the classification of $\theta$, i.e. whether $\theta$ belongs to $\mathcal{F}_1$ or $\mathcal{F}_2$.


We consider first a function $\theta$ satisfying (2.2) restricted to $\mathbb{R}_+$. Observe that there is a one-to-one correspondence between these functions and probability density functions. Indeed, if $p(t) := \theta'(t)$, then $p(t) \geq 0$ and

$$\int_0^\infty p(t)\,dt = \lim_{t\to\infty}\int_0^t \theta'(\tau)\,d\tau = \lim_{t\to\infty}\big(\theta(t) - \theta(0)\big) = 1.$$

Hence, the function $\theta$ corresponds to a probability density function on $\mathbb{R}_+$. Conversely, one natural way to generate $\theta$ is to consider a probability density function $p : \mathbb{R}_+ \to \mathbb{R}_+$ and take

$$\theta(t) = \int_0^t p(\tau)\,d\tau \qquad \text{for all } t \in [0, \infty).$$

This construction was also mentioned in [32].

Another natural and intuitive way to generate $\theta$ with the desired asymptotic behavior at infinity is to consider a differential equation with an equilibrium solution at 1. Consider, for instance, the separable autonomous differential equation given by

$$\frac{d\theta}{dt} = f(\theta), \qquad \theta(0) = 0. \qquad (3.1)$$
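As a sanity check of this construction: separating variables shows that $f(\theta) = (1-\theta)^2$ yields $\theta^{(1)}$ on $[0,\infty)$, and $f(\theta) = 1 - \theta$ yields $\theta^{(2)}$, i.e. the two functions in (2.3). A direct numerical integration of (3.1) (a sketch using a classical Runge–Kutta scheme) recovers these closed forms:

```python
import numpy as np

def solve_theta(f, t_end=20.0, n=20000):
    # classical RK4 integration of d(theta)/dt = f(theta), theta(0) = 0
    h = t_end/n
    t = np.linspace(0.0, t_end, n + 1)
    th = np.empty(n + 1); th[0] = 0.0
    for i in range(n):
        k1 = f(th[i]); k2 = f(th[i] + h*k1/2)
        k3 = f(th[i] + h*k2/2); k4 = f(th[i] + h*k3)
        th[i+1] = th[i] + h*(k1 + 2*k2 + 2*k3 + k4)/6
    return t, th

# f(theta) = (1-theta)^2 recovers theta^(1)(t) = t/(t+1) on [0, inf)
t, th1 = solve_theta(lambda u: (1.0 - u)**2)
err1 = np.max(np.abs(th1 - t/(t + 1.0)))
# f(theta) = 1 - theta recovers theta^(2)(t) = 1 - e^{-t}
t, th2 = solve_theta(lambda u: 1.0 - u)
err2 = np.max(np.abs(th2 - (1.0 - np.exp(-t))))
print(err1, err2)  # both tiny (RK4 global error ~ h^4)
```

Both right-hand sides have the factored form $\bar{f}(\theta)(1-\theta)^k$ discussed below, with $k = 2$ and $k = 1$ respectively.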

Throughout the paper, the following are our assumptions on $f$:

**Assumption 2.**

(i) $f$ is $C^1$ on an open interval containing $[0,1]$.

(ii) $f > 0$ on $[0,1)$ and $f(1) = 0$.

Assumption 2(i) is important in ensuring that the initial-value problem (3.1) has a unique solution. On the other hand, Assumption 2(ii) is required to obtain an increasing function $\theta(t)$ which approaches 1 as $t \to \infty$.

**Proposition 3.1.** *Let $f$ be any function satisfying Assumption 2. Then the initial-value problem (3.1) has a unique solution, which is defined for all $t \geq 0$, and $\lim_{t\to\infty}\theta(t) = 1$.*

**Proof.** The existence and uniqueness of the solution follows from Assumption 2(i) (see [33]). Let $[0, T)$ be the maximal interval of existence of the solution $\theta(t)$ of (3.1). Since $f(\theta) > 0$ for all $\theta \in (0,1)$, $\theta(t)$ is an increasing function. Thus, we may let $L = \lim_{t\to\infty}\theta(t)$. Note also that the constant function 1 is a solution of the differential equation $\frac{d\theta}{dt} = f(\theta)$. Since the solution through a point is unique, $\theta(t) < 1$ for all $t \in [0, T)$. Consequently, $L \leq 1$ and $\{\theta(t) : t \in [0,T)\} \subseteq [0,1]$. Hence, $\theta(t)$ is defined for all $t \geq 0$ (see [33]).

Finally, note that from (3.1), we have

$$\theta(t) = \int_0^t f(\theta(\tau))\,d\tau.$$

Since $\theta(t) \to L$, we conclude that $\int_0^\infty f(\theta(\tau))\,d\tau$ is convergent. It follows that

$$f(L) = \lim_{t\to\infty} f(\theta(t)) = 0.$$

By Assumption 2(ii), we conclude that $L = 1$, as desired. □

It was mentioned in the preceding section that satisfying condition ($H_a$) is very important to guarantee the applicability of the smoothing approach. The benefit of constructing $\theta$ using (3.1) is that condition ($H_a$) can be easily deduced if $f$ has a specific form. We describe this in the following theorem.

**Theorem 3.1.** *Let $\bar{f}$ be a continuously differentiable function on an open interval containing $[0,1]$ such that $\bar{f} > 0$ on $[0,1]$. Suppose that $f$ takes the form $f(\theta) = \bar{f}(\theta)(1-\theta)^k$, where $k \geq 1$. If $\theta(t)$ denotes the unique solution of (3.1), then $\theta \in \mathcal{F}$. In particular,*

**(i)** *If $k = 1$, then $\theta \in \mathcal{F}_1$.*

**(ii)** *If $k > 1$, then $\theta \in \mathcal{F}_2$. Moreover, condition ($H_a$) is satisfied when $a \in \left(0, \frac{1}{2^{k-1}}\right)$ and is not satisfied when $a \in \left(\frac{1}{2^{k-1}}, 1\right)$.*

Before we prove Theorem 3.1, we need the following lemma. This lemma will also be useful later in Theorems 4.3 and 4.4.

**Lemma 3.1.** *Under the hypotheses of Theorem 3.1, we have*

$$\lim_{t\to\infty}\frac{1-\theta(t)}{1-\theta(at)} = \begin{cases} 0 & \text{if } k = 1, \\ a^{\frac{1}{k-1}} & \text{if } k > 1, \end{cases} \qquad \forall a \in (0,1).$$
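Before the proof, the two limits can be illustrated with the closed forms from (2.3): $\theta^{(1)}$ arises from $k = 2$, where $1-\theta^{(1)}(t) = 1/(t+1)$, and $\theta^{(2)}$ from $k = 1$, where $1-\theta^{(2)}(t) = e^{-t}$ (a numerical illustration, not a proof):

```python
import numpy as np

a = 0.3
# k = 2 (theta^(1)): (1-theta(t))/(1-theta(at)) = (at+1)/(t+1) -> a^{1/(k-1)} = a
t = 1e8
ratio_k2 = (a*t + 1.0)/(t + 1.0)
print(abs(ratio_k2 - a) < 1e-6)  # True
# k = 1 (theta^(2)): the ratio equals e^{-(1-a)t} -> 0
t = 1000.0
ratio_k1 = np.exp(-(1.0 - a)*t)
print(ratio_k1 < 1e-300)         # True
```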

**Proof.** Let $a \in (0,1)$ and define $\psi(t) := 1 - \theta(t)$. Note that

$$\frac{d}{dt}\psi(t) = -\frac{d\theta}{dt} = -\bar{f}(\theta(t))(1-\theta(t))^k = -\bar{f}(\theta(t))(\psi(t))^k. \qquad (3.2)$$

Suppose first that $k = 1$. Since $\theta(0) = 0$, we have $\psi(0) = 1$. Moreover, we have from (3.2) that

$$1 - \theta(t) = \psi(t) = \exp\left(-\int_0^t \bar{f}(\theta(\tau))\,d\tau\right).$$

Consequently, we have for any $a \in (0,1)$ that

$$\frac{1-\theta(t)}{1-\theta(at)} = \frac{\exp\left(-\int_0^t \bar{f}(\theta(\tau))\,d\tau\right)}{\exp\left(-\int_0^{at} \bar{f}(\theta(\tau))\,d\tau\right)} = \exp\left(-\int_{at}^t \bar{f}(\theta(\tau))\,d\tau\right). \qquad (3.3)$$

Since $\bar{f}(\theta) > 0$ on $[0,1]$, there exists $m > 0$ such that $\bar{f}(\theta) > m$ for all $\theta \in [0,1]$. Continuing from (3.3),

$$\frac{1-\theta(t)}{1-\theta(at)} \leq \exp\big(-m(1-a)t\big) \qquad \forall t \geq 0.$$

Letting $t \to \infty$ gives the desired limit for the first case.

Now, suppose that $k > 1$. From (3.2), we have

$$(1-\theta(t))^{1-k} = (\psi(t))^{1-k} = 1 + (k-1)\int_0^t \bar{f}(\theta(\tau))\,d\tau.$$

Note that since $\bar{f}(1) \neq 0$, we have $\int_0^\infty \bar{f}(\theta(\tau))\,d\tau = \infty$. Then, by L'Hôpital's rule, this leads to

$$\lim_{t\to\infty}\left(\frac{1-\theta(t)}{1-\theta(at)}\right)^{k-1} = \lim_{t\to\infty}\frac{1 + (k-1)\int_0^{at}\bar{f}(\theta(\tau))\,d\tau}{1 + (k-1)\int_0^t \bar{f}(\theta(\tau))\,d\tau} = \lim_{t\to\infty}\frac{a\,\bar{f}(\theta(at))}{\bar{f}(\theta(t))} = a,$$

where the last equality follows from the fact that $\bar{f}(1) \neq 0$ and $\theta(t) \to 1$ by Proposition 3.1. □

**Proof of Theorem 3.1.** Fix $a \in (0,1)$ and define $h_a(t) = \frac{1}{2} + \frac{1}{2}\theta(at) - \theta(t)$ for all $t \geq 0$. Differentiating $h_a$, we obtain

$$h_a'(t) = \frac{a}{2}\theta'(at) - \theta'(t) = \frac{a}{2}f(\theta(at)) - f(\theta(t)). \qquad (3.4)$$

Since $\bar{f}(1) \neq 0$ and $\theta(t) \to 1$, we have from Lemma 3.1 that

$$\lim_{t\to\infty}\frac{f(\theta(t))}{f(\theta(at))} = \lim_{t\to\infty}\frac{\bar{f}(\theta(t))(1-\theta(t))^k}{\bar{f}(\theta(at))(1-\theta(at))^k} = \begin{cases} 0 & \text{if } k = 1, \\ a^{\frac{k}{k-1}} & \text{if } k > 1. \end{cases} \qquad (3.5)$$

Suppose now that $k = 1$. By (3.5), we can find $t_a > 0$ such that $f(\theta(t)) < \frac{a}{2}f(\theta(at))$ for all $t > t_a$. From (3.4), we see that $h_a'(t) > 0$ for all $t > t_a$, and therefore $h_a(t)$ is strictly increasing on $(t_a, \infty)$. But clearly, $h_a(t)$ approaches 0 as $t \to \infty$. Thus, $h_a(t) < 0$ for all $t > t_a$. That is, $\theta(t)$ satisfies condition ($H_a$) for all $a \in (0,1)$.

On the other hand, suppose $k > 1$. If $0 < a < \frac{1}{2^{k-1}}$, then

$$\frac{a}{2} - a^{\frac{k}{k-1}} = a\left(\frac{1}{2} - a^{\frac{1}{k-1}}\right) > 0.$$

Then, by (3.5), we can find $t_a > 0$ such that $\frac{f(\theta(t))}{f(\theta(at))} < \frac{a}{2}$ for all $t > t_a$. As in the preceding case, $\theta(t)$ satisfies condition ($H_a$) for all $a \in \left(0, \frac{1}{2^{k-1}}\right)$. If $\frac{1}{2^{k-1}} < a < 1$, then $a^{\frac{k}{k-1}} - \frac{a}{2} > 0$, and so there exists $t_a > 0$ for which $\frac{f(\theta(t))}{f(\theta(at))} > \frac{a}{2}$ for all $t > t_a$. It follows that $h_a(t) > 0$ over $(t_a, \infty)$. This completes the proof. □

**Remark 3.1.** Given a function $f$ satisfying Assumption 2, we may apply Theorem 3.1 provided that there exists $k \geq 1$ such that $L := \lim_{\theta\to 1} f(\theta)(1-\theta)^{-k} \neq 0$. In this case, we define $\bar{f}(\theta) := f(\theta)(1-\theta)^{-k}$ for $\theta \in [0,1)$ and define $\bar{f}(1) := L$. We then obtain the desired factorization $f(\theta) = \bar{f}(\theta)(1-\theta)^k$.

In general, this factorization is not always achievable. To see this, consider the function

$$g(t) = t^3\left(1 + \sin\frac{1}{t}\right) + t(1 - \cos t).$$

It is easy to verify that $g$ is continuously differentiable on $\mathbb{R}$. Moreover, since the first term of $g$ is nonnegative on $(0,1]$ while the second term is greater than 0 on $(0,1]$, $g$ is strictly positive on $(0,1]$. In addition,

$$\lim_{t\to 0^+}\frac{g(t)}{t^k} = 0 \qquad \forall k \in [1, 3),$$

and the above limit does not exist for $k \geq 3$. Defining $f(\theta) := g(1-\theta)$, we see that $f$ satisfies Assumption 2. But there is no $k \geq 1$ such that $\lim_{\theta\to 1^-} f(\theta)(1-\theta)^{-k}$ exists and is nonzero.

The following is an easy consequence of Theorem 3.1 and the above remark.

**Corollary 3.1.** *If $f'(1) < 0$, then $\theta \in \mathcal{F}_1$.*

**Example 3.1.** We generate five functions from (3.1), two of which are the ones used in [18], given by Eq. (2.3). It can be easily verified that $V$ given in Lemma 2.2 is locally subadditive at $0^+$ for each of the functions below.

The parameter $k$ has a significant role in numerical simulations. Notice that $(1-\theta)^k$ decreases in value as $k$ increases when $\theta \in (0,1)$. In view of (3.1) and the form of $f$ given in Theorem 3.1, $k$ controls the rate of growth of the function $\theta(t)$. In Section 5, it is shown that functions with a faster growth rate yield faster convergence times for the neural networks. However, the rate of increase of $\theta$ should not be too fast, so as to avoid ill-conditioning. Hence, the parameter $k$ is useful when we take into account the conditioning of the problem. That is, a higher value of $k$ may be adopted when the problem is ill-conditioned.

We now give an example of a function $\theta$ that does not belong to $\mathcal{F}$.

**Example 3.2.** Define $\theta(t) = \frac{2}{\pi}\int_0^t \frac{\sin^2\tau}{\tau^2}\,d\tau$. Indeed, $\theta$ is a strictly increasing continuously differentiable function such that $\theta(0) = 0$ and $\theta(t) \to 1$ as $t \to \infty$. The function $h_a$ defined in the proof of Theorem 3.1 oscillates between positive and negative values for arbitrarily large $t$, and therefore $\theta$ does not satisfy condition ($H_a$) for any $a \in (0,1)$.

**Remark 3.2.** One can observe that class $\mathcal{F}_1$ functions can be viewed, in some sense, as the limit case of class $\mathcal{F}_2$ functions. First, notice that in Lemma 3.1, the zero limit obtained for class $\mathcal{F}_1$ functions can be viewed as the limit of $a^{\frac{1}{k-1}}$ as $k \searrow 1$, where $a^{\frac{1}{k-1}}$ is the calculated limit for class $\mathcal{F}_2$ functions. In Theorem 3.1, we see that for $\theta \in \mathcal{F}_2$, condition ($H_a$) is satisfied only for $a \in I_k := (0, 2^{1-k})$. Note that $I_k \nearrow (0,1)$ as $k \searrow 1$, and that condition ($H_a$) holds for all $a \in (0,1)$ for class $\mathcal{F}_1$ functions.

**Remark 3.3.** We mention some remarks regarding the definition of $\theta$ over $(-\infty, 0)$. In order for $G_r$ given by (2.5) to be smooth over $\mathbb{R}^n \times \mathbb{R}^n$ for any $r > 0$, we must use a differentiable extension of $\theta$ for $t < 0$. In this paper, we let $\theta(t) = t f(0)$ for $t < 0$, similar to the construction of $\theta^{(1)}$ in (2.3).

We conclude this section with the following proposition, which implies that the limit in Lemma 2.1(iii) is in fact a characterization of functions in $\mathcal{F}_1$. That is, a smoothing function of $\phi_{NR}$ over the positive orthant is obtained if and only if $\theta \in \mathcal{F}_1$.

**Proposition 3.2.** *Suppose $\theta \in \mathcal{F}_2$ and let $\bar{a} = \inf S$, where*

$$S = \{a \in (0,1) \mid \text{condition } (H_a) \text{ does not hold}\}.$$

*Let $P_{\bar{a}} = \{(s,t) \in \mathbb{R}^2_{++} : \bar{a}s < t < s/\bar{a}\}$. Then, we have*

$$\lim_{r\searrow 0}\phi_r(s,t) < \min\{s,t\} \qquad \forall (s,t) \in P_{\bar{a}},$$

*whenever the limit exists.*

**Proof.** We note first that (ā, 1) ⊆ S. Indeed, the set S is nonempty and ā ∈ (0, 1) since θ ∈ F₂. If ã ∈ S, then for any u > 0 we have θ(v) < 1/2 + (1/2)θ(ãv) for some v ≥ u. Since θ is increasing, we see that condition (H_a) also does not hold for any a ∈ (ã, 1).

Now, fix (s, t) ∈ P_ā and suppose s = min{s, t}. Given any a ∈ (ā, 1), by the above note there exists a sequence {r_n} such that r_n ↘ 0 and

θ(t/r_n) < 1/2 + (1/2)θ(at/r_n).

Since θ and θ⁻¹ are increasing and s ≤ t, the above inequality yields

φ_{r_n}(s, t) = r_n θ⁻¹( θ(s/r_n) + θ(t/r_n) − 1 ) ≤ r_n θ⁻¹( 2θ(t/r_n) − 1 ) < at.

Letting n → ∞, we obtain lim_{n→∞} φ_{r_n}(s, t) ≤ at for any a ∈ (ā, 1). Since (s, t) ∈ P_ā, we have s > āt, and by letting a ↘ ā we obtain

lim_{n→∞} φ_{r_n}(s, t) ≤ āt < s = min{s, t},

which is the desired result. □

It is interesting to note that, as in Remark 3.2, class F₁ can again be viewed as the limit case of class F₂ when ā ↘ 0. More precisely, the limit in Lemma 2.1(iii) holds on ⋃_{ā∈(0,1)} P_ā = R²₊₊ when θ ∈ F₁.
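To make this concrete, the construction behind Proposition 3.2 can be tried numerically. The sketch below (a Python illustration of ours, not from the paper) uses the illustrative generator θ(t) = 1 − e^{−t}: it is strictly increasing and continuously differentiable with θ(0) = 0 and θ(t) → 1 as t → ∞, and one can check that it behaves like a class F₁ generator, so φ_r(s, t) = r θ⁻¹(θ(s/r) + θ(t/r) − 1) should recover min{s, t} as r ↘ 0 on the positive orthant, as in Lemma 2.1(iii).

```python
import numpy as np

# Illustrative generator (our choice, not one constructed in the paper):
# theta(t) = 1 - exp(-t) is strictly increasing and C^1 with theta(0) = 0
# and theta(t) -> 1 as t -> infinity.
theta = lambda t: 1.0 - np.exp(-t)
theta_inv = lambda u: -np.log(1.0 - u)  # inverse on (-inf, 1)

def phi_r(s, t, r):
    """Smooth perturbation phi_r(s,t) = r * theta^{-1}(theta(s/r) + theta(t/r) - 1)."""
    return r * theta_inv(theta(s / r) + theta(t / r) - 1.0)

# For this theta the formula collapses to a log-sum-exp smooth minimum:
#   phi_r(s, t) = -r * log(exp(-s/r) + exp(-t/r)),
# which increases toward min{s, t} from below as r decreases to 0.
s, t = 1.3, 0.7
for r in [1.0, 0.2, 0.05]:
    print(r, phi_r(s, t, r))  # approaches min{s, t} = 0.7 from below
```

The values stay strictly below min{s, t} for every r > 0 and converge to it as r ↘ 0, the behavior that Proposition 3.2 rules out for class F₂ generators on P_ā.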

**4. Smoothed neural networks**

In this section, we present two gradient dynamical systems to solve the NCP (2.1) using the smoothing approach presented in Section2. Similar to the approach used in [19,21], the stability analysis of our neural networks relies on the use of Lyapunov functions. For more details, we refer the reader to [33,34].

*4.1. The first neural network*

We consider first a neural network which can be used to obtain approximate solutions of NCP(F). From Lemma 2.3, solutions of (2.1) can be obtained by successively solving, for decreasing values of r, the equation Φ_r(x) = 0, which is exactly the motivation of Algorithm 1. Note that solving the aforementioned equation, if a solution exists, is equivalent to solving the smooth minimization problem

min_{x∈Rⁿ} Ψ_r(x) := (1/2)‖Φ_r(x)‖².  (4.1)

This motivates the steepest descent-based neural network

dx/dt = −ρ∇Ψ_r(x),  x(0) = x₀,  (NN1)

where ρ > 0 is a time-scaling parameter and r > 0 is a sufficiently small fixed number. Consequently, the neural network (NN1) can be used to deal with Step 1 of Algorithm 1.

Observe that a solution of (4.1) is an equilibrium point of (NN1). However, the converse is not necessarily true. We are interested in conditions that can be imposed on F so that an equilibrium point of the neural network (NN1) is also an optimal solution of (4.1), and consequently an approximate solution of (2.1). To this end, we first establish an important property of ∇Φ_r.

**Theorem 4.1.** *Let F be a P₀-function and suppose that θ′(u) ≠ 0 for all u ∈ R. Then ∇Φ_r(x) is nonsingular for any x ∈ Rⁿ and r > 0.*

**Proof.** First, we note that

∇_a φ_r(s, t) = (∂/∂s) φ_r(s, t) = (θ⁻¹)′( θ(s/r) + θ(t/r) − 1 ) θ′(s/r) = θ′(s/r) / θ′( θ⁻¹( θ(s/r) + θ(t/r) − 1 ) ) > 0,  (4.2)

where we used the fact that θ′(θ⁻¹(u)) · (θ⁻¹)′(u) = 1. By symmetry, we also have ∇_b φ_r(s, t) := (∂/∂t) φ_r(s, t) > 0.

Let x ∈ Rⁿ and r > 0, and denote by A_r(x) and B_r(x) the n × n diagonal matrices such that

(A_r(x))_ii = ∇_a φ_r(x_i, F_i(x)) and (B_r(x))_ii = ∇_b φ_r(x_i, F_i(x)).

Note that A_r(x) and B_r(x) are invertible since ∇_a φ_r(x_i, F_i(x)) and ∇_b φ_r(x_i, F_i(x)) are both strictly positive for all i, as noted above. Furthermore, we have by the chain rule that

∇Φ_r(x) = A_r(x) + ∇F(x) · B_r(x) = (D_r(x) + ∇F(x)) · B_r(x),  (4.3)

where D_r(x) = A_r(x)B_r(x)⁻¹. Since D_r(x) is a diagonal matrix,

det(D_r(x) + ∇F(x)) = ∑_{Λ⊆{1,...,n}} det(D_Λ) det(∇F(x)_{Λᶜ}),  (4.4)

where D_Λ denotes the principal submatrix of D_r(x) indexed by Λ and ∇F(x)_{Λᶜ} the principal submatrix of ∇F(x) indexed by the complement of Λ (see, for instance, Chapter 2 of [35]). Each term in the above summation is nonnegative since F is a P₀-function and the diagonal entries of D_r(x) are positive. In particular, the term corresponding to Λ = {1, . . . , n} is precisely det(D_r(x)) > 0. Thus, D_r(x) + ∇F(x) is nonsingular. Since B_r(x) is also nonsingular, we conclude from (4.3) that ∇Φ_r(x) is nonsingular. □
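The determinant expansion (4.4) is easy to verify numerically. The Python sketch below checks det(D + A) = ∑_Λ det(D_Λ) det(A_{Λᶜ}) for a random positive diagonal D (playing the role of D_r(x)) and a random square matrix A (playing the role of ∇F(x)); the names and the random test data are ours, and the determinant of an empty submatrix is taken to be 1.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4
d = rng.uniform(0.5, 2.0, size=n)   # positive diagonal, as for D_r(x)
D = np.diag(d)
A = rng.standard_normal((n, n))     # stand-in for the Jacobian of F

lhs = np.linalg.det(D + A)

# Sum over all index sets L of det(D_L) * det(A_{L^c}), with the
# determinant of an empty principal submatrix equal to 1.
rhs = 0.0
for k in range(n + 1):
    for L in itertools.combinations(range(n), k):
        Lc = [i for i in range(n) if i not in L]
        det_D_L = np.prod(d[list(L)]) if L else 1.0
        det_A_Lc = np.linalg.det(A[np.ix_(Lc, Lc)]) if Lc else 1.0
        rhs += det_D_L * det_A_Lc

print(abs(lhs - rhs))  # agrees up to floating-point roundoff
```

When every det(∇F(x)_{Λᶜ}) is nonnegative, as for a P₀-function, every term of the sum is nonnegative and the Λ = {1, . . . , n} term det(D_r(x)) > 0 forces the whole determinant to be positive, which is exactly the argument in the proof.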

Observe that the hypothesis on θ in the above theorem holds if θ is generated from (3.1) using Assumption 2 and Remark 3.3. We now state some important consequences of the above result.

**Corollary 4.1.** *Under the hypotheses of Theorem 4.1, the following hold:*

(i) *Every equilibrium point of (NN1) is an optimal solution of (4.1).*
(ii) *An isolated equilibrium point of (NN1) is exponentially stable.*
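As a sanity check of this behavior, (NN1) can be discretized with a forward Euler scheme. The sketch below is a minimal illustration of ours, not the authors' implementation: it assumes the generator θ(t) = 1 − e^{−t} (already smooth and strictly increasing on all of R, so no extension over (−∞, 0) is needed, and φ_r reduces to −r log(e^{−s/r} + e^{−t/r})) and a small linear test problem F(x) = Mx + q of our own choosing, whose NCP solution is x* = (1/3, 1/3).

```python
import numpy as np

# Hypothetical test problem (our choice): linear NCP with F(x) = M x + q,
# M positive definite (hence a P_0-function), solution x* = (1/3, 1/3).
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, -1.0])
F = lambda x: M @ x + q

r, rho, h = 1e-2, 1.0, 1e-3   # smoothing parameter, time scale, Euler step

def grad_Psi(x):
    s, t = x, F(x)
    # phi_r for theta(t) = 1 - exp(-t), written stably via logaddexp:
    Phi = -r * np.logaddexp(-s / r, -t / r)            # Phi_r(x), componentwise
    a = 1.0 / (1.0 + np.exp(np.clip((s - t) / r, -500, 500)))  # d phi_r / d s
    b = 1.0 - a                                        # d phi_r / d t
    # grad Psi_r(x) = (A_r(x) + grad F(x) B_r(x)) Phi_r(x), with grad F = M^T
    return a * Phi + M.T @ (b * Phi)

x = np.array([1.0, 1.0])
for _ in range(20000):        # forward Euler discretization of (NN1)
    x = x - h * rho * grad_Psi(x)
print(x)                      # close to the solution (1/3, 1/3)
```

With r fixed at 10⁻² the trajectory settles at the (here unique) zero of Φ_r, which already approximates the NCP solution well; repeating the run for decreasing r is exactly Step 1 of Algorithm 1.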
