Information Sciences

(1)

A neural network based on the generalized Fischer–Burmeister function for nonlinear complementarity problems

Jein-Shan Chen

^a,*,1

, Chun-Hsu Ko

^b

, Shaohua Pan

^c

aDepartment of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

bDepartment of Electrical Engineering, I-Shou University, Kaohsiung 840, Taiwan

cSchool of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China

a r t i c l e i n f o

Article history:

Received 27 February 2008

Received in revised form 18 August 2009 Accepted 6 November 2009

Keywords:

The nonlinear complementarity problem Neural network

Exponentially convergent

Generalized Fischer–Burmeister function

a b s t r a c t

In this paper, we consider a neural network model for solving the nonlinear complementarity problem (NCP). The neural network is derived from an equivalent unconstrained minimization reformulation of the NCP, which is based on the generalized Fischer–Burmei- ster function /pða; bÞ ¼ kða; bÞk_p ða þ bÞ. We establish the existence and the convergence of the trajectory of the neural network, and study its Lyapunov stability, asymptotic stability as well as exponential stability. It was found that a larger p leads to a better convergence rate of the trajectory. Numerical simulations verify the obtained theoretical results.

1. Introduction

For decades, the nonlinear complementarity problem (NCP) has attracted a lot of attention because of its wide applications in operations research, economics, and engineering[9,12]. Given a mapping F : Rⁿ! Rⁿ, the NCP is to ﬁnd a point x 2 Rⁿsuch that

x P 0; FðxÞ P 0; hx; FðxÞi ¼ 0; ð1Þ

where h; i is the Euclidean inner product. Throughout this paper, we assume that F is continuously differentiable, and let F ¼ ðF1; . . . ;FnÞ^Twith Fi:Rⁿ! R for i ¼ 1; . . . ; n.

There have been many methods proposed for solving the NCP[9,12]. One of the most popular approaches is to reformu- late the NCP as an unconstrained minimization problem via a merit function; see[14,19–21]. A merit function is a function whose global minimizers coincide with the solutions of the NCP. The class of NCP-functions deﬁned below is used to construct a merit function.

Deﬁnition 1.1. A function / : R R ! R is called an NCP-function if it satisﬁes

/ða; bÞ ¼ 0 () a P 0; b P 0; ab ¼ 0: ð2Þ

A popular NCP-function is the Fischer–Burmeister (FB) function[10,11], which is deﬁned as

doi:10.1016/j.ins.2009.11.014

*Corresponding author. Tel.: +886 2 29325417; fax: +886 2 29332342.

E-mail addresses:[email protected](J.-S. Chen),[email protected](C.-H. Ko),[email protected](S. Pan).

1Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Ofﬁce. The author’s work is partially supported by National Science Council of Taiwan.

Contents lists available atScienceDirect

Information Sciences

j o u r n a l h o m e p a g e : w w w . e l s e v i e r . c o m / l o c a t e / i n s

(2)

/FBða; bÞ ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a²þ b² q

ða þ bÞ: ð3Þ

The FB merit function w_FB:R R ! Rþcan be obtained by taking the square of /_FB, i.e.,

wFBða; bÞ :¼1

2j/FBða; bÞj²: ð4Þ

In[1,3,4], we studied a family of NCP-functions that subsumes the FB function /_FBas a special case. More speciﬁcally, we deﬁne /_p:R R ! R by

/_pða; bÞ :¼ kða; bÞkp ða þ bÞ; ð5Þ

where p is any fixed real number from ð1; þ1Þ and kða; bÞk_pdenotes the p-norm of ða; bÞ, i.e., kða; bÞk_p¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi jaj^pþ jbj^p pp

. In other words, in the function /_p, we replace the 2-norm of ða; bÞ in the FB function /_FBby a more general p-norm of ða; bÞ. The function /_pis still an NCP-function, as noted in Tseng’s paper[29]. There has been no further study on this NCP-function, even for p ¼ 3, until recently[1,3,4]. Similar to /_FB, the square of /_pinduces a nonnegative NCP-function w_p:R R ! Rþ:

w_pða; bÞ :¼1

2j/pða; bÞj²: ð6Þ

The function w_pis continuously differentiable and it has some favorable properties; see[1,3,4]. Moreover, if we deﬁne the functionWp:Rⁿ! Rþby

WpðxÞ :¼Xⁿ

i¼1

wpðxi;FiðxÞÞ ¼1

2kUpðxÞk²; ð7Þ

whereUp:Rⁿ! Rⁿis a mapping given as

UpðxÞ ¼

/_pðx1;F1ðxÞÞ ... /pðxn;FnðxÞÞ 0

BB

@

1 CC

A; ð8Þ

then the NCP can be reformulated into the following smooth minimization problem:

minx2RⁿWpðxÞ: ð9Þ

Thus,WpðxÞ in(7)is a smooth merit function for the NCP.

Effective gradient-type methods can be applied to the unconstrained smooth minimization problem(9). However, in many scientiﬁc and engineering applications, it is desirable to have a real-time solution of the NCP. Thus, traditional unconstrained optimization algorithms may not be suitable for real-time implementation because the computing time required for a solution greatly depends on the dimension and structure of the problem. One promising way to overcome this problem is to apply neural networks.

Neural networks for optimization were first introduced in the 1980s by Hopfield and Tank[16,28]. Since then, neural networks have been applied to various optimization problems, including linear programming, nonlinear programming, variational inequalities, and linear and nonlinear complementarity problems; see [6,8,7,15,17,18,22,24,31–35]. There have been many studies on neural-network approaches to real-world problems in some other fields, such as[26,27,36]. The main idea of the neural-network approach for optimization is to construct a nonnegative energy function and establish a dynamic system that represents an artificial neural network. The dynamic system is usually in the form of first order ordinary differential equations. Furthermore, it is expected that the dynamic system will approach its static state (or an equilibrium point), which corresponds to the solution for the underlying optimization problem, starting from an initial point. In addition, neural networks for solving optimization problems are hardware-implementable; that is, the neural networks can be implemented using integrated circuits.

In this paper, we focus on a neural-network approach to the NCP. We utilizeWpðxÞ as the traditional energy function. As mentioned above, the NCP is equivalent to the unconstrained smooth minimization problem(9). Therefore, it is natural to adopt the following steepest descent-based neural network model for NCP:

dxðtÞ

dt ¼

q

rWpðxðtÞÞ; xð0Þ ¼ x0; ð10Þ

where

q

>0 is a scaling factor. Most neural networks in the existing literature are projection-type ones based on other kinds of NCP-functions, such as natural residual function (e.g.[18,33]) and the regularized gap function (e.g.[6]). Recently, neural networks based on the FB function have been designed for linear and quadratic programming, and for nonlinear complementarity problems[8,24]. Our model is based on the generalized FB function, which is a generalization of the functions used in [8,24]. We show that the neural network(10)is Lyapunov stable, asymptotically stable, and exponentially stable. We observed in[2]that p has a great inﬂuence on the numerical performance of certain descent-type methods; a larger p yields

(3)

a better convergence rate, whereas a smaller p often gives a better global convergence. Thus, whether such phenomena occur in our neural network model is also investigated.

Throughout this paper, Rⁿdenotes the space of n-dimensional real column vectors and^Tdenotes the transpose. For any differentiable function f : Rⁿ! R; rf ðxÞ means the gradient of f at x. For any differentiable mapping F ¼ ðF1; . . . ;FmÞ^T:Rⁿ! R^m; rFðxÞ ¼ ½rF1ðxÞ rFmðxÞ 2 R^nmdenotes the transposed Jacobian of F at x. The p-norm of x is denoted by kxk_pand the Euclidean norm of x is denoted by kxk. Besides, e_iis the n-dimensional vector whose i-th compo- nent is 1 and 0 elsewhere. Unless otherwise stated, we assume that p in the sequel is any ﬁxed real number in ð1; þ1Þ if not speciﬁed.

2. Preliminaries

In this section, we review some properties of /_pand w_p, as well as materials of ordinary differential equations that will play an important role in the subsequent analysis. We start with some concepts for a nonlinear mapping.

Deﬁnition 2.1. Let F ¼ ðF1; . . . ;FnÞ^T:Rⁿ! Rⁿ. Then, the mapping F is said to be

(a) monotone if hx y; FðxÞ FðyÞi P 0 for all x; y 2 Rⁿ;

(b) strongly monotone with modulus

l

>0 if hx y; FðxÞ FðyÞi P

l

kx yk²for all x; y 2 Rⁿ; (c) an P0-function if max16i6n

xi–yiðxi y_iÞðFiðxÞ FiðyÞÞ P 0 for all x; y 2 Rⁿand x–y;

(d) a uniform P-function with modulus

j

>0 if max16i6nðxi y_iÞðFiðxÞ FiðyÞÞ P

j

kx yk², for all x; y 2 Rⁿ; (e) Lipschitz continuous if there exists a constant L > 0 such that kFðxÞ FðyÞk 6 Lkx yk for all x; y 2 Rⁿ.

FromDeﬁnition 2.1, the following one-sided implications can be obtained:

F is strongly monotone ) F is a uniformP-function ) F is an P0function;rF is positive semidefinite ) F is monotone ) F is an P0function:

Nevertheless, we point out that F being a uniform P-function does not necessarily imply that F is monotone. The following two lemmas summarize some favorable properties of /_pand w_p, respectively. Since their proofs can be found in[2–4], we here omit them.

Lemma 2.1. Let /p:R R ! R be given by(5). Then, the following properties hold.

(a) /_pis a positive homogeneous and sub-additive NCP-function.

(b) /_pis Lipschitz continuous with L ¼ ffiffiffi p2

þ 2^ð1=p1=2Þfor 1 < p < 2, and L ¼ ffiffiffi p2

þ 1 for p P 2.

(c) /_pis strongly semismooth.

(d) If fða^k;b^kÞg # R R with a^k! 1, or b^k! 1, or a^k! 1; b^k! 1, then j/pða^k;b^kÞj ! 1 when k ! 1.

(e) Given a point ða; bÞ 2 R R, every element in the generalized gradient @/_pða; bÞ has the representation ðn 1; f 1Þ with

n¼sgnðaÞ jaj^p1

kða; bÞk^p1_p and f ¼sgnðbÞ jbj^p1 kða; bÞk^p1_p

for ða; bÞ–ð0; 0Þ, where sgnðÞ represents the sign function; otherwise, n and f are real numbers that satisfy jnj^p1^p þ jfj^p1^p 61.

Lemma 2.2. Let /_pand w_p be deﬁned as in(5) and (6), respectively. Then, (a) w_pða; bÞ P 0 for all a; b 2 R and w_pis an NCP-function, i.e., it satisﬁes(2).

(b) w_pis continuously differentiable everywhere. Moreover,raw_pða; bÞ ¼rbw_pða; bÞ ¼ 0 if ða; bÞ ¼ ð0; 0Þ; otherwise, raw_pða; bÞ ¼ sgnðaÞ jaj^p1

kða; bÞk^p1_p 1

! /pða; bÞ;

rbw_pða; bÞ ¼ sgnðbÞ jbj^p1 kða; bÞk^p1p

1

! /_pða; bÞ:

ð11Þ

(c) raw_pða; bÞ rbw_pða; bÞ P 0 for all a; b 2 R. The inequality becomes an equality if and only if /_pða; bÞ ¼ 0.

(d) raw_pða; bÞ ¼ 0 ()rbw_pða; bÞ ¼ 0 () /pða; bÞ ¼ 0 () w_pða; bÞ ¼ 0.

(e) The gradient of w_pis Lipschitz continuous for p P 2, i.e., there exists L > 0 such that krwpða; bÞ rwpðc; dÞk 6 Lkða; bÞ ðc; dÞk for all ða; bÞ; ðc; dÞ 2 R²and p P 2:

(f) For all a; b 2 R, we have ð2 2^1=pÞ minfa; bg 6 j/_pða; bÞj 6 ð2 þ 2^1=pÞ minfa; bg.

(4)

Next, we recall some materials about ﬁrst order differential equations (ODE):

_xðtÞ ¼ HðxðtÞÞ; xðt0Þ ¼ x02 Rⁿ; ð12Þ

where H : Rⁿ! Rⁿis a mapping. We also introduce three kinds of stability that will be discussed later. These materials can be found in ODE textbooks; see[25].

Deﬁnition 2.2. A point x¼ xðtÞ is called an equilibrium point or a steady state of the dynamic system(12)if HðxÞ ¼ 0. If there is a neighborhoodX#Rⁿof xsuch that HðxÞ ¼ 0 and HðxÞ–08x 2Xn fxg, then xis called an isolated equilibrium point.

Lemma 2.3. Assume that H : Rⁿ! Rⁿis a continuous mapping. Then, for any t0P0 and x02 Rⁿ, there exists a local solution xðtÞ for(12)with t 2 ½t0;

s

Þ for some

s

>t0. If, in addition, H is locally Lipschitz continuous at x0, then the solution is unique; if H is Lipschitz continuous in Rⁿ, then

s

can be extended to 1.

If a local solution deﬁned on ½t0;

s

Þ cannot be extended to a local solution on a larger interval ½t0;

s

1Þ;

s

1>

s

, then it is called a maximal solution, and the interval ½t0;

s

Þ is the maximal interval of existence. Clearly, any local solution has an extension to a maximal one. We denote ½t₀;

s

ðx0ÞÞ by the maximal interval of existence associated with x0.

Lemma 2.4. Assume that H : Rⁿ! Rⁿ is continuous. If xðtÞ with t 2 ½t0;

s

ðx0ÞÞ is a maximal solution and

s

ðx0Þ < 1, then lim_t"_sðx0ÞkxðtÞk ¼ 1.

Deﬁnition 2.3. Stability in the sense of LyapunovLet xðtÞ be a solution for(12). An isolated equilibrium point xis Lyapunov stable if for any x0¼ xðt0Þ and any

e

>0, there exists a d > 0 such that kxðtÞ xk <

e

for all t P t0and kxðt0Þ xk < d.

Deﬁnition 2.4 (Asymptotic stability). An isolated equilibrium point xis said to be asymptotically stable if in addition to being Lyapunov stable, it has the property that xðtÞ ! xas t ! 1 for all kxðt0Þ xk < d.

Deﬁnition 2.5 (Lyapunov function). Let X#Rⁿ be an open neighborhood of x. A continuously differentiable function W : Rⁿ! R is said to be a Lyapunov function at the state x over the setXfor Eq.(12)if

WðxÞ ¼ 0; WðxÞ > 0; 8x 2Xn fxg:

dWðxðtÞÞ

dt ¼rWðxðtÞÞ^THðxðtÞÞ 6 0; 8x 2X: (

ð13Þ

Lemma 2.5

(a) An isolated equilibrium point xis Lyapunov stable if there exists a Lyapunov function over some neighborhoodXof x. (b) An isolated equilibrium point xis asymptotically stable if there is a Lyapunov function over some neighborhoodXof x

such that^dWðxðtÞÞ_dt <0 for all x 2Xn fxg.

Definition 2.6 (Exponential stability). An isolated equilibrium point xis exponentially stable if there exists a d > 0 such that arbitrary point xðtÞ of(10)with the initial condition xðt0Þ ¼ x0and kxðt0Þ xk < d is well-defined on ½0; þ1Þ and satisfies

kxðtÞ xk₂6ce^x^tkxðt0Þ xk 8t P t0;

where c > 0 and

x

>0 are constants independent of the initial point.

3. Neural network model

We now discuss properties of the neural network model introduced in(10). First, fromLemma 2.2(a), we obtain the following result.

Proposition 3.1. LetWp:Rⁿ! Rþbe deﬁned as in(7). Then,WpðxÞ P 0 for all x 2 RⁿandWpðxÞ ¼ 0 if and only if x solves the NCP.

Proposition 3.2. LetWp:Rⁿ! Rþbe given by(7). Then, the following results hold.

(a) The functionWpis continuously differentiable everywhere with

rWpðxÞ ¼ V^TUpðxÞ for any V 2 @UpðxÞ ð14Þ

or

rWpðxÞ ¼raw_pðx; FðxÞÞ þrFðxÞrbw_pðx; FðxÞÞ ð15Þ

(5)

with

rawpðx; FðxÞÞ :¼rawpðx1;F1ðxÞÞ; . . . ;rawpðxn;FnðxÞÞT

; rbw_pðx; FðxÞÞ :¼rbw_pðx1;F1ðxÞÞ; . . . ;rbw_pðxn;FnðxÞÞT

:

(b) If F is an P0-function, then every stationary point of(9)is a global minimizer ofWpðxÞ, and it consequently solves the NCP.

(c) If F is a uniform P-function, then the level sets LðWp;

c

Þ :¼ fx 2 RⁿjWpðxÞ 6

c

g are bounded for all

c

2 R.

(d) WpðxðtÞÞ is nonincreasing with respect to t.

Proof. The first equality in (a) follows fromLemma 2.2(c) and[5, Theorem 2.6.6]. The second one follows from the chain rule. Part (b) is the result of[3, Proposition 3.4], and part (c) is the result of[4, Proposition 3.5]. It remains to show part (d). By the definition ofWpðxÞ and(10), it is not difficult to compute

dWpðxðtÞÞ

dt ¼rWpðxðtÞÞ^TdxðtÞ

dt ¼rWpðxðtÞÞ^T

q

rWpðxðtÞÞ

¼

q

krWpðxðtÞÞk²60: ð16Þ

Therefore,WpðxðtÞÞ is a monotonically decreasing function with respect to t. h

Proposition 3.2(a) provides two ways to computerWpðxÞ, which is needed in the network(10). One is to use formula(14), for which we give an algorithm (seeAlgorithm 3.1below), to evaluate an element V 2 @UpðxÞ. The other is to adopt formula (15).

Algorithm 3.1. The procedure to evaluate an element V 2 @UpðxÞ

(S.0) Let x 2 Rⁿbe given, and let Videnote the i-th row of a matrix V 2 Rⁿⁿ. (S.1) Set IðxÞ :¼ fi 2 f1; 2; . . . ; ngj x_i¼ FiðxÞ ¼ 0g.

(S.2) Set z 2 Rⁿsuch that zi¼ 0 for i R IðxÞ, and zi¼ 1 for i 2 IðxÞ.

(S.3) For i 2 IðxÞ, let ui¼ ½jzij^p1^p þ jrFiðxÞ^Tzj^p1^p^p1^p, and Vi¼ zi

ui 1

e^T_i þ rFiðxÞ^Tz ui 1

! rFiðxÞ^T:

(S.4) For i R IðxÞ, set

Vi¼ sgnðxiÞ jxij^p1 kðxi;FiðxÞÞk^p1p

1

!

e^T_i þ sgnðFiðxÞÞ jFiðxÞj^p1 kðxi;FiðxÞÞk^p1p

1

! rFiðxÞ^T:

The above procedure is a traditional way of obtainingrWpðxðtÞÞ. For example, the neural network in[24]uses(14)and a similar algorithm to evaluate an element of V 2 @UFBðxÞ. We propose a simpler way of obtainingrWpðxðtÞÞ which is to com- puterWpðxðtÞÞ by using formula(15)rather than formula(14). Formula(15)also provides an indication on how the neural network(10)can be implemented on hardware; seeFig. 1.

To close this section, we claim thatWpprovides a global error bound for the solution of the NCP. This result is important and will be used to analyze the inﬂuence of p on the convergence rate of the trajectory xðtÞ of the neural network(10)in the next section.

Proposition 3.3. Suppose F is a uniform P-function with modulus

j

>0 and Lipschitz continuous with constant L > 0. Then, the NCP has a unique solution x, and

kx xk²6 4L²

j

²ð2 2^1=pÞ²WpðxÞ 8x 2 Rⁿ:

Proof. Since F is a uniform P-function, byProposition 3.2(c), there exists a global minimizer ofWpðxÞ which says the NCP has a solution. Assume that the NCP has two different solutions xand y, then byDeﬁnition 2.1(d) we have

j

kx yk²6max

16i6mðxi yiÞðFiðxÞ FiðyÞÞ ¼ max

16i6m xiFiðyÞ yiFiðxÞ 60;

where the equality is due to the fact that x_iFiðxÞ ¼ y_iFiðyÞ ¼ 0 for i ¼ 1; 2; . . . ; n (note that xand yare the solutions to the NCP), and the last inequality holds since x;yP0 and FðxÞ; FðyÞ P 0. This leads to a contradiction. Hence, the NCP has a unique solution.

For any x 2 Rⁿ, let rðxÞ :¼ ðr1ðxÞ; . . . ; rnðxÞÞ^T with riðxÞ ¼ minfxi;FiðxÞg for i ¼ 1; . . . ; n. Since F is Lipschitz continuous with constant L > 0, by[21, Lemma 7.4]we have

ðxi x_iÞðFiðxÞ FiðxÞÞ 6 2LjriðxÞjkx xk;

(6)

for all x 2 Rⁿand i ¼ 1; 2; . . . ; n. On the other hand, since F is a uniform P-function with modulus

j

>0, fromDeﬁnition 2.1(d) it follows that

j

kx xk²6max

16i6nðxi xiÞðFiðxÞ FiðxÞÞ

for any x 2 Rⁿ. Combining the last two equations yields kx xk 6 ð2L=

j

Þ max

16i6njriðxÞj 8x 2 Rⁿ: This together withLemma 2.2(f) implies

kx xk 6 2L

j

ð2 2^1=pÞmax

16i6nj/pðxi;FiðxÞÞj 6 2L

j

ð2 2^1=pÞkUpðxÞk:

Consequently, we obtain the desired result. h

4. Convergence and stability of the trajectory

This section focuses on issues of convergence and stability of the neural network(10). We analyze the behavior of the solution trajectory of(10)including the existence and convergence, and establish three kinds of stability for an isolated equilibrium point. We ﬁrst state the relationships between an equilibrium point of(10)and a solution to the NCP.

Proposition 4.1

(a) Every solution to the NCP is an equilibrium point of(10).

(b) If F is an P0-function, then every equilibrium point of(10)is a solution to the NCP.

Proof

(a) Suppose that x is a solution to the NCP. Then, fromProposition 3.1, it is clear thatUpðxÞ ¼ 0. UsingLemma 2.2(d) and (15), we then haverWpðxÞ ¼ 0. This, byDeﬁnition 2.2, shows that x is an equilibrium point of(10).

(b) This is a direct consequence ofProposition 3.2(b). h

The following proposition establishes the existence of the solution trajectory of(10).

Proposition 4.2. For any ﬁxed p P 2, the following hold.

(a) For any initial state x0¼ xðt0Þ, there exists exactly one maximal solution xðtÞ with t 2 ½t0;

s

ðx0ÞÞ for the neural network(10).

(b) If the level set Lðx0Þ ¼ fx 2 RⁿjWpðxÞ 6Wpðx0Þg is bounded or F is Lipschitz continuous, then

s

ðx0Þ ¼ þ1.

Fig. 1. A simpliﬁed block diagram for the neural network(10).

(7)

Proof

(a) Since F is continuously differentiable,rFðxÞ is continuous, and therefore,rFðxÞ is bounded on a local compact neighborhood of x. On the other hand,raw_pandrbw_pare Lipschitz continuous byLemma 2.2(e). These two facts together with formula(15)show thatrWpðxÞ is locally Lipschitz continuous. Thus, applyingLemma 2.3leads to the desired result.

(b) We proceed the arguments by the two cases as shown below.

Case (i): The level set Lðx0Þ is bounded. We prove the result by contradiction. Suppose

s

ðx0Þ < 1. Then, byLemma 2.4, limt"sðx0ÞkxðtÞk ¼ 1. Let L^cðx0Þ :¼ Rⁿn Lðx0Þ and

s

0:¼ inffs P 0js <

s

ðx0Þ; xðsÞ 2 L^cðx0Þg < 1:

We know that xð

s

0Þ lies on the boundary of Lðx0Þ and L^cðx0Þ. Moreover, Lðx0Þ is compact since it is bounded by assumption and it is also closed because of the continuity ofWpðxÞ. Therefore, we have xð

s

0Þ 2 Lðx0Þ and

s

0<

s

ðx0Þ, implying that

WpðxðsÞÞ >Wpðx0Þ >Wpðxð

s

0ÞÞ for some s 2 ð

s

0;

s

ðx0ÞÞ: ð17Þ However,Proposition 3.2(d) says thatWpðxðÞÞ is nonincreasing on ½t0;

s

ðx0ÞÞ, which contradicts(17). This completes the proof of Case (i).

Case (ii): F is Lipschitz continuous. From the proof of part (a), we know thatrWpðxÞ is Lipschitz continuous. Thus, by Lemma 2.3, we have

s

ðx0Þ ¼ 1. h

Next, we investigate the convergence of the solution trajectory of(10).

Theorem 4.1

(a) Let xðtÞ with t 2 ½t0;

s

ðx0ÞÞ be the unique maximal solution to (10). If

s

ðx0Þ ¼ 1 and fxðtÞg is bounded, then limt!1rWpðxðtÞÞ ¼ 0.

(b) If F is strongly monotone or a uniform P-function, then Lðx0Þ is bounded and every accumulation point of the trajectory xðtÞ is a solution to the NCP.

Proof. WithProposition 3.2(b) and (d) andProposition 4.2, the arguments are exactly the same as those for[24, Corollary 4.3]. Thus, we omit them. h

FromProposition 4.1(a), every solution xto the NCP is an equilibrium point of the neural network(10). If, in addition, x is an isolated equilibrium point of(10), then we can show that xis not only Lyapunov stable but also asymptotically stable.

Theorem 4.2. Let x be an isolated equilibrium point of the neural network(10). Then, x is Lyapunov stable for(10), and furthermore, it is asymptotically stable.

Proof. Since xis a solution to the NCP,WpðxÞ ¼ 0. In addition, since xis an isolated equilibrium point of(10), there exists a neighborhoodX#Rⁿof xsuch that

rWpðxÞ ¼ 0; and rWpðxÞ–0 8x 2Xn fxg:

Next, we argue thatWpðxÞ is indeed a Lyapunov function at xover the setXfor(10)by showing that the conditions in(13) are satisﬁed. First, notice thatWpðxÞ P 0. Suppose that there is an x 2Xn fxg such thatWpðxÞ ¼ 0. Then, by formula(15) andLemma 2.2(d), we haverWðxÞ ¼ 0, i.e., x is also an equilibrium point of(10), which clearly contradicts the assumption that xis an isolated equilibrium point inX. Thus, we prove thatWpðxÞ > 0 for any x 2Xn fxg. This together with(16) shows that the conditions in(13)are satisﬁed, and henceWpðxÞ is a Lyapunov function at xover the setXfor(10). There- fore, xis Lyapunov stable byLemma 2.5(a).

Now, we show that xis asymptotically stable. Since xis isolated, from(16)we have dWpðxðtÞÞ

dt <0; 8xðtÞ 2Xn fxg:

This, byLemma 2.5(b), implies that xis asymptotically stable. h

Furthermore, using the same arguments we can prove that the neural network(10)is also exponentially stable if xis a regular solution to the NCP. Recall that xis a regular solution to the NCP if every element V 2 @UpðxÞ is nonsingular.

Theorem 4.3. If xis a regular solution of the NCP, then it is exponentially stable.

(8)

Remark 4.1

(a) Using arguments similar to those used inProposition 3.2of[13], we can prove that xis regular ifrFaais nonsingular and the Schur complement ofrFaain

rFaaðxÞ rFabðxÞ rFbaðxÞ rFbbðxÞ

is an P-matrix, where

a

:¼ fijx_i >0g and b :¼ fijx_i ¼ FiðxÞ ¼ 0g. Clearly, ifrF is positive deﬁnite, then the conditions hold true.

(b) FromDeﬁnition 2.6, if an isolated equilibrium point xis exponentially stable, then there exists a d > 0 such that xðtÞ with x0¼ ðt0Þ, and kxðt0Þ xk < d satisﬁes

kxðtÞ xk 6 ce^x^tkxðt0Þ xk 8t P t0;

which together withProposition 3.3implies that kxðtÞ xk 6 2cL

j

ð2 2^1=pÞ

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Wpðx0Þ q

e^x^t 8t P t0: ð18Þ

Since the strong monotonicity of F implies that F is a uniform P-function and thatrF is positive deﬁnite, from(18)we obtain that the neural network(10)can yield a trajectory with an exponential convergence rate under the condition that F is strongly monotone and Lipschitz continuous.

(c) We observe from(18)that, when p increases, the coefﬁcient of e^x^tin the right hand side term becomes smaller, which in turn implies that a larger p yields a better convergence rate. This agrees with the result obtained by[2]

for a descent-type method based onWp. In addition, from(18) we notice that the energy of the initial state, i.e., Wpðx0Þ also has an inﬂuence on the convergence rate. A higher initial energy will lead to a worse convergence rate.

5. Simulation results

In this section, we test four well-known nonlinear complementarity problems by our neural network model(10). For each test problem, we also compare the numerical performance of the proposed neural network with various values of p and various initial states xðt0Þ. The test instances are described below.

Example 5.1[31, Example 2]. Consider the NCP, where F : R⁵! R⁵is given by

FðxÞ ¼

x1þ x2x3x4x5=50 x2þ x1x3x4x5=50 3 x3þ x1x2x4x5=50 1 x4þ x1x2x3x5=50 þ 1=2

x5þ x1x2x3x4=50 0

BB BB BB

@

1 CC CC CC A :

The NCP has only one solution x¼ ð0; 3; 1; 0; 0Þ.

Example 5.2[30, Watson]. Consider the NCP, where F : R⁵! R⁵is given by

FðxÞ ¼ 2 exp X⁵

i¼1

ðxi i þ 2Þ²

! x1þ 1

x2

x3 1 x4 2 x5 3 0 BB BB BB

@ 1 CC CC CC A :

Note that F is not a P0-function on Rⁿ. The solution to this problem is x¼ ð0; 0; 1; 2; 3Þ.

Example 5.3[23, Kojima–Shindo]. Consider the NCP, where F : R⁴! R⁴is given by

FðxÞ ¼

3x²₁þ 2x1x2þ 2x²2þ x3þ 3x4 6 2x²₁þ x1þ x²2þ 3x3þ 2x4 2 3x²₁þ x1x2þ 2x²2þ 2x3þ 3x4 1

x²₁þ 3x²2þ 2x3þ 3x4 3 0

BB B@

1 CC CA:

This is a non-degenerate NCP and the solution is x¼ ð ffiffiffi p6

=2; 0; 0; 1=2Þ.

(9)

Example 5.4 [23, Kojima–Shindo]. Consider the NCP, where F : R⁴! R⁴is given by

FðxÞ ¼

3x²₁þ 2x1x2þ 2x²2þ x3þ 3x4 6 2x²₁þ x1þ x²2þ 10x3þ 2x4 2 3x²₁þ x1x2þ 2x²2þ 2x3þ 9x4 9

x²₁þ 3x²2þ 2x3þ 3x4 3 0

BB B@

1 CC CA:

0 5 10 15 20 25 30 35 40 45 50

10⁻⁷ 10⁻⁶ 10⁻⁵ 10⁻⁴ 10⁻³ 10⁻² 10⁻¹ 10⁰ 10¹

Time (ms)

||x(t)−x*||

p=1.1 p=1.5 p=3 p=20

Fig. 3. Convergence behavior of the error kxðtÞ xk inExample 5.2with given x0.

0 5 10 15 20 25 30 35 40 45 50

10⁻⁷ 10⁻⁶ 10⁻⁵ 10⁻⁴ 10⁻³ 10⁻² 10⁻¹ 10⁰ 10¹

Time (ms)

||x(t)−x*||

p=1.1 p=1.5 p=3 p=20

(10)

This is a degenerate NCP and has two solutions x¼ ð ffiffiffi p6

=2; 0; 0; 1=2Þ and x¼ ð1; 0; 3; 0Þ.

The numerical implementation is coded by Matlab 7.0 and the ordinary differential equation solver adopted is ode23, which uses an Runge–Kutta (2, 3) formula. We ﬁrst test the inﬂuence of the parameter p on the value of kxðtÞ xk.Figs.

2–5 in the appendix describe how kxðtÞ xk varies with p for these instances with the initial states x0¼ ð10²;1; 0:5; 10²;10²Þ^T;x0¼ ð10²;10²;0:5; 0:5; 0:5Þ^T;x0¼ ð2; 10²;10²;0:1Þ^T, and x0¼ ð10³;10³;10³;10³Þ^T,

0 5 10 15 20 25 30 35 40 45 50

10⁻⁷ 10⁻⁶ 10⁻⁵ 10⁻⁴ 10⁻³ 10⁻² 10⁻¹ 10⁰ 10¹

Time (ms)

||x(t)−x*||

p=1.1 p=1.5 p=3 p=20

0 5 10 15 20 25 30 35 40 45 50

10⁻⁷ 10⁻⁶ 10⁻⁵ 10⁻⁴ 10⁻³ 10⁻² 10⁻¹ 10⁰ 10¹

Time (ms)

||x(t)−x*||

p=1.1 p=1.5 p=3p=20

(11)

respectively. In the tests, the design parameter

q

in the neural network(10)is set to be 1000. FromFigs. 2–5, we see that, when p ¼ 1:1, the neural network(10)generates the slowest decrease of kxðtÞ xk for all test instances, whereas when p ¼ 20 it generates the fastest decrease of kxðtÞ xk. This verifies the analysis ofRemark 4.1(c). We should emphasize that the conclusion inRemark 4.1(c) requires the initial state x0to be sufficiently close to an equilibrium point. If this condition is not satisfied, we cannot draw such conclusion; seeFig. 6.

0 5 10 15 20 25 30

10⁻⁶ 10⁻⁵ 10⁻⁴ 10⁻³ 10⁻² 10⁻¹ 10⁰ 10¹ 10²

Time (ms)

||x(t)−x*||

x⁽¹⁾₀ x⁽²⁾₀ x⁽³⁾₀

Fig. 7. Convergence behavior of kxðtÞ xk inExample 5.1with three different initial points x^ð1Þ0, x^ð2Þ0 , and x^ð3Þ0 (p ¼ 1:8).

0 5 10 15 20 25 30 35 40 45 50

10⁻⁷ 10⁻⁶ 10⁻⁵ 10⁻⁴ 10⁻³ 10⁻² 10⁻¹ 10⁰ 10¹

Time (ms)

||x(t)−x*||

p=1.1 p=1.5 p=3 p=20

Fig. 6. Convergence behavior of kxðtÞ xk inExample 5.1with x0¼ ½0; 0; 0; 0; 0.

(12)

Example 5.1shows how the value of kxðtÞ xk varies with initial state x0.Fig. 7describes the convergence behavior of kxðtÞ xk with initial states x^ð1Þ₀ ¼ ð1; 1; 1; 1; 1Þ^T, x^ð2Þ₀ ¼ ð5; 5; 5; 5; 5Þ^T, and x^ð3Þ₀ ¼ ð10; 10; 10; 10; 10Þ^T. Notice that the initial energies corresponding to these three states areWpðx^ð1Þ₀ Þ ¼ 5:814;Wpðx^ð2Þ₀ Þ ¼ 39:367, andWpðx^ð3Þ₀ Þ ¼ 226:333, respectively.

0 2 4 6 8 10 12 14

−1

−0.5 0 0.5 1 1.5 2 2.5 3 3.5 4

Time (ms)

Trajectories of x(t)

x1,x2 x3 x4 x5

Fig. 9. Transient behavior of xðtÞ of the neural network with 6 random initial points and p ¼ 1:4 inExample 5.2.

0 5 10 15 20 25 30

−0.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5

Time (ms)

x1,x4,x5 x2

x3

(13)

In the tests, we choose p ¼ 1:8 and

q

¼ 1000.Fig. 7, shows that a larger initial energy yields a slower decrease of the error kxðtÞ xk if the initial state is close to the solution of the NCP. This agrees with the analysis inRemark 4.1(c).

The convergence behavior of xðtÞ from several initial states with a ﬁxed p and

q

¼ 1000 for each example is shown inFigs.

8–12. The transient behavior of xðtÞ forExample 5.4is depicted inFigs. 11 and 12since there are two solutions for this prob-

0 10 20 30 40 50

−0.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5

x2,x3 x1 x4

Time (ms)

0 2 4 6 8 10 12 14 16 18

−0.5 0 0.5 1 1.5 2

Time (ms)

x2,x3 x1

x4

(14)

lem. More specifically, we test 12 random initial points for the NCP, 9 of which converge to ð ffiffiffi p6

=2; 0; 0; 1=2Þ; the remaining 3 converge to ð1; 0; 3; 0Þ. When ﬁnding the solution trajectory xðtÞ, we employ krWpðxðtÞÞk 6 10⁵as the stopping criterion.

To sum up, the neural network(10)is a better alternative for the network based on the FB function /_FBif an appropriate p is chosen. Based on the analysis ofRemark 4.1(c) and the above numerical simulations, we see that, to obtain a better convergence rate of the trajectory xðtÞ, the parameter p cannot be set too small. In addition, we should emphasize that the initial state xðt0Þ has a great inﬂuence on the convergence behavior of kxðtÞ xk.

To end this section, we answer a natural question: are there advantages of our proposed neural network compared to the existing ones? To answer this, we summarize what we have observed from numerical experiments and theoretical results as below.

We compare our neural network model with some existing models which also work for NCP, for instance, the ones used in [6,31,32]. At ﬁrst glance, the neural network models based on projection in[6,31,32]look having lower complexity. How- ever, we observe that the difference of the numerical performance is very marginal by testing MCPLIB benchmark problems.

Our proposed model seems having better properties from theoretical view. Note that there requires monotonicity (strong monotonicity) of F to guarantee the Lyapunov stability (exponential stability) of the neural network models used in [6,31,32]. In contrast, such conditions are not needed for our neural network model. In fact, it can be veriﬁed that all F’s are non-monotone in previous examples exceptExample 5.2(by checking the positive semi-deﬁniteness of their Jaco- bian matrices).

For the following special NCP:

x ¼ ðx1;x2;x3Þ P 0; FðxÞ ¼ ðx1;x2;x3Þ P 0; hx; FðxÞi ¼ x²₁ x²₂ x²₃¼ 0;

it is easy to verify that the unique solution is ð0; 0; 0Þ which can be solved easily by our neural network model. But, the solution trajectory diverges by using the model in[31].

Changing initial points may not having much effect for our neural network model, whereas it does for other existing models. For instance, choosing x0¼ ð12; 12; 12; 12; 12Þ as the initial point inExample 5.1causes the divergence of solution trajectory solved by the neural network model used in[31], while it does not affect anything by our neural network model.

6. Conclusions

In this paper, we have studied a (class of) neural network based on the generalized FB function /_pdeﬁned as in(5). We establish the Lyapunov stability, the asymptotic stability, and the exponential stability for the neural network. In addition,

0 100 200 300 400 500 600 700 800

−0.5 0 0.5 1 1.5 2 2.5 3 3.5

Time (ms)

x2,x4 x1 x3

(15)

we also analyze the inﬂuence of the parameter p on the convergence rate of the trajectory (or the local convergence behavior of the error kxðtÞ xk) and obtain that a larger p leads to a better convergence rate. This agrees with the result obtained by [2]for a descent-type method based on /_p, which also indicates how to choose a suitable p in practice. Numerical experiments verify the obtained theoretical results. The advantages of our proposed neural network compared to other existing neural networks are reported as well. One future topic is to modify the proposed neural network model for various optimization problems and establish its related stability accordingly.

Appendix SeeFigs. 2–12.

References

[1] J.-S. Chen, The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem, Journal of Global Optimization 36 (2006) 565–580.

[2] J.-S. Chen, H.-T. Gao, S.-H. Pan, A derivative-free R-linearly convergent algorithm based on the generalized Fischer–Burmeister merit function, Journal of Computational and Applied Mathematics, submitted for publication.

[3] J.-S. Chen, S.-H. Pan, A family of NCP functions and a descent method for the nonlinear complementarity problem, Computational Optimization and Applications 40 (2008) 389–404.

[4] J.-S. Chen, S.-H. Pan, A regularization semismooth Newton method based on the generalized Fischer–Burmeister function for P0-NCPs, Journal of Computational and Applied Mathematics 220 (2008) 464–479.

[5] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.

[6] C. Dang, Y. Leung, X. Gao, K. Chen, Neural networks for nonlinear and mixed complementarity problems and their applications, Neural Networks 17 (2004) 271–283.

[7] S. Effati, A. Ghomashi, A.R. Nazemi, Application of projection neural network in solving convex programming problems, Applied Mathematics and Computation 188 (2007) 1103–1114.

[8] S. Effati, A.R. Nazemi, Neural network and its application for solving linear and quadratic programming problems, Applied Mathematics and Computation 172 (2006) 305–331.

[9] M.C. Ferris, O.L. Mangasarian, J.-S. Pang (Eds.), Complementarity: Applications, Algorithms and Extensions, Kluwer Academic Publishers, Dordrecht, 2001.

[10] A. Fischer, A special Newton-type optimization methods, Optimization 24 (1992) 269–284.

[11] A. Fischer, Solution of the monotone complementarity problem with locally Lipschitzian functions, Mathematical Programming 76 (1997) 513–532.

[12] F. Facchinei, J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, vols. I and II, Springer-Verlag, New York, 2003.

[13] F. Facchinei, J. Soares, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization 7 (1997) 225–247.

[14] C. Geiger, C. Kanzow, On the resolution of monotone complementarity problems, Computational Optimization and Applications 5 (1996) 155–173.

[15] Q. Han, L.-Z. Liao, H. Qi, L. Qi, Stability analysis of gradient-based neural networks for optimization problems, Journal of Global Optimization 19 (2001) 363–381.

[16] J.J. Hopﬁeld, D.W. Tank, Neural computation of decision in optimization problems, Biological Cybernetics 52 (1985) 141–152.

[17] X. Hu, J. Wang, Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network, IEEE Transactions on Neural Networks 17 (2006) 1487–1499.

[18] X. Hu, J. Wang, A recurrent neural network for solving a class of general variational inequalities, IEEE Transactions on Systems, Man, and Cybernetics – B 37 (2007) 528–539.

[19] H. Jiang, Unconstrained minimization approaches to nonlinear complementarity problems, Journal of Global Optimization 9 (1996) 169–181.

[20] C. Kanzow, Nonlinear complementarity as unconstrained optimization, Journal of Optimization Theory and Applications 88 (1996) 139–155.

[21] C. Kanzow, M. Fukushima, Equivalence of the generalized complementarity problem to differentiable unconstrained minimization, Journal of Optimization Theory and Applications 90 (1996) 581–603.

[22] M.P. Kennedy, L.O. Chua, Neural network for nonlinear programming, IEEE Transaction on Circuits and Systems 35 (1988) 554–562.

[23] M. Kojima, S. Shindo, Extensions of Newton and quasi-Newton methods to systems of PC¹equations, Journal of Operations Research Society of Japan 29 (1986) 352–374.

[24] L.-Z. Liao, H. Qi, L. Qi, Solving nonlinear complementarity problems with neural networks: a reformulation method approach, Journal of Computational and Applied Mathematics 131 (2001) 342–359.

[25] R.K. Miller, A.N. Michel, Ordinary Differential Equations, Academic Press, 1982.

[26] S.-K. Oh, W. Pedrycz, S.-B. Roh, Genetically optimized fuzzy polynomial neural networks with fuzzy set-based polynomial neurons, Information Sciences 176 (2006) 3490–3519.

[27] A. Shortt, J. Keating, L. Monlinier, C. Pannell, Optical implementation of the Kak neural network, Information Sciences 171 (2005) 273–287.

[28] D.W. Tank, J.J. Hopﬁeld, Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit, IEEE Transactions on Circuits and Systems 33 (1986) 533–541.

[29] P. Tseng, Global behaviour of a class of merit functions for the nonlinear complementarity problem, Journal of Optimization Theory and Applications 89 (1996) 17–37.

[30] L.T. Watson, Solving the nonlinear complementarity problem by a homotopy method, SIAM Journal on Control and Optimization 17 (1979) 36–46.

[31] Y. Xia, H. Leung, J. Wang, A projection neural network and its application to constrained optimization problems, IEEE Transactions on Circuits and Systems – I 49 (2002) 447–458.

[32] Y. Xia, H. Leung, J. Wang, A general projection neural network for solving monotone variational inequalities and related optimization problems, IEEE Transactions on Neural Networks 15 (2004) 318–328.

[33] Y. Xia, H. Leung, J. Wang, A recurrent neural network for solving nonlinear convex programs subject to linear constraints, IEEE Transactions on Neural Networks 16 (2005) 379–386.

[34] M. Yashtini, A. Malek, Solving complementarity and variational inequalities problems using neural networks, Applied Mathematics and Computation 190 (2007) 216–230.

[35] S.H. Zak, V. Upatising, S. Hui, Solving linear programming problems with neural networks: a comparative study, IEEE Transactions on Neural Networks 6 (1995) 94–104.

[36] G. Zhang, A neural network ensemble method with jittered training data for time series forecasting, Information Sciences 177 (2007) 5329–5340.