A neural network based on the generalized Fischer–Burmeister function for nonlinear complementarity problems
Jein-Shan Chen
a,*,1, Chun-Hsu Ko
b, Shaohua Pan
caDepartment of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
bDepartment of Electrical Engineering, I-Shou University, Kaohsiung 840, Taiwan
cSchool of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China
a r t i c l e i n f o
Article history:
Received 27 February 2008
Received in revised form 18 August 2009 Accepted 6 November 2009
Keywords:
The nonlinear complementarity problem Neural network
Exponentially convergent
Generalized Fischer–Burmeister function
a b s t r a c t
In this paper, we consider a neural network model for solving the nonlinear complemen- tarity problem (NCP). The neural network is derived from an equivalent unconstrained minimization reformulation of the NCP, which is based on the generalized Fischer–Burmei- ster function /pða; bÞ ¼ kða; bÞkp ða þ bÞ. We establish the existence and the convergence of the trajectory of the neural network, and study its Lyapunov stability, asymptotic stabil- ity as well as exponential stability. It was found that a larger p leads to a better conver- gence rate of the trajectory. Numerical simulations verify the obtained theoretical results.
Ó 2009 Elsevier Inc. All rights reserved.
1. Introduction
For decades, the nonlinear complementarity problem (NCP) has attracted a lot of attention because of its wide applica- tions in operations research, economics, and engineering[9,12]. Given a mapping F : Rn! Rn, the NCP is to find a point x 2 Rnsuch that
x P 0; FðxÞ P 0; hx; FðxÞi ¼ 0; ð1Þ
where h; i is the Euclidean inner product. Throughout this paper, we assume that F is continuously differentiable, and let F ¼ ðF1; . . . ;FnÞTwith Fi:Rn! R for i ¼ 1; . . . ; n.
There have been many methods proposed for solving the NCP[9,12]. One of the most popular approaches is to reformu- late the NCP as an unconstrained minimization problem via a merit function; see[14,19–21]. A merit function is a function whose global minimizers coincide with the solutions of the NCP. The class of NCP-functions defined below is used to con- struct a merit function.
Definition 1.1. A function / : R R ! R is called an NCP-function if it satisfies
/ða; bÞ ¼ 0 () a P 0; b P 0; ab ¼ 0: ð2Þ
A popular NCP-function is the Fischer–Burmeister (FB) function[10,11], which is defined as
0020-0255/$ - see front matter Ó 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.ins.2009.11.014
*Corresponding author. Tel.: +886 2 29325417; fax: +886 2 29332342.
E-mail addresses:[email protected](J.-S. Chen),[email protected](C.-H. Ko),[email protected](S. Pan).
1Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan.
Contents lists available atScienceDirect
Information Sciences
j o u r n a l h o m e p a g e : w w w . e l s e v i e r . c o m / l o c a t e / i n s
/FBða; bÞ ¼
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2þ b2 q
ða þ bÞ: ð3Þ
The FB merit function wFB:R R ! Rþcan be obtained by taking the square of /FB, i.e.,
wFBða; bÞ :¼1
2j/FBða; bÞj2: ð4Þ
In[1,3,4], we studied a family of NCP-functions that subsumes the FB function /FBas a special case. More specifically, we define /p:R R ! R by
/pða; bÞ :¼ kða; bÞkp ða þ bÞ; ð5Þ
where p is any fixed real number from ð1; þ1Þ and kða; bÞkpdenotes the p-norm of ða; bÞ, i.e., kða; bÞkp¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi jajpþ jbjp pp
. In other words, in the function /p, we replace the 2-norm of ða; bÞ in the FB function /FBby a more general p-norm of ða; bÞ. The func- tion /pis still an NCP-function, as noted in Tseng’s paper[29]. There has been no further study on this NCP-function, even for p ¼ 3, until recently[1,3,4]. Similar to /FB, the square of /pinduces a nonnegative NCP-function wp:R R ! Rþ:
wpða; bÞ :¼1
2j/pða; bÞj2: ð6Þ
The function wpis continuously differentiable and it has some favorable properties; see[1,3,4]. Moreover, if we define the functionWp:Rn! Rþby
WpðxÞ :¼Xn
i¼1
wpðxi;FiðxÞÞ ¼1
2kUpðxÞk2; ð7Þ
whereUp:Rn! Rnis a mapping given as
UpðxÞ ¼
/pðx1;F1ðxÞÞ ... /pðxn;FnðxÞÞ 0
BB
@
1 CC
A; ð8Þ
then the NCP can be reformulated into the following smooth minimization problem:
minx2RnWpðxÞ: ð9Þ
Thus,WpðxÞ in(7)is a smooth merit function for the NCP.
Effective gradient-type methods can be applied to the unconstrained smooth minimization problem(9). However, in many scientific and engineering applications, it is desirable to have a real-time solution of the NCP. Thus, traditional uncon- strained optimization algorithms may not be suitable for real-time implementation because the computing time required for a solution greatly depends on the dimension and structure of the problem. One promising way to overcome this problem is to apply neural networks.
Neural networks for optimization were first introduced in the 1980s by Hopfield and Tank[16,28]. Since then, neural net- works have been applied to various optimization problems, including linear programming, nonlinear programming, varia- tional inequalities, and linear and nonlinear complementarity problems; see [6,8,7,15,17,18,22,24,31–35]. There have been many studies on neural-network approaches to real-world problems in some other fields, such as[26,27,36]. The main idea of the neural-network approach for optimization is to construct a nonnegative energy function and establish a dynamic system that represents an artificial neural network. The dynamic system is usually in the form of first order ordinary differ- ential equations. Furthermore, it is expected that the dynamic system will approach its static state (or an equilibrium point), which corresponds to the solution for the underlying optimization problem, starting from an initial point. In addition, neural networks for solving optimization problems are hardware-implementable; that is, the neural networks can be implemented using integrated circuits.
In this paper, we focus on a neural-network approach to the NCP. We utilizeWpðxÞ as the traditional energy function. As mentioned above, the NCP is equivalent to the unconstrained smooth minimization problem(9). Therefore, it is natural to adopt the following steepest descent-based neural network model for NCP:
dxðtÞ
dt ¼
q
rWpðxðtÞÞ; xð0Þ ¼ x0; ð10Þwhere
q
>0 is a scaling factor. Most neural networks in the existing literature are projection-type ones based on other kinds of NCP-functions, such as natural residual function (e.g.[18,33]) and the regularized gap function (e.g.[6]). Recently, neural networks based on the FB function have been designed for linear and quadratic programming, and for nonlinear complemen- tarity problems[8,24]. Our model is based on the generalized FB function, which is a generalization of the functions used in [8,24]. We show that the neural network(10)is Lyapunov stable, asymptotically stable, and exponentially stable. We ob- served in[2]that p has a great influence on the numerical performance of certain descent-type methods; a larger p yieldsa better convergence rate, whereas a smaller p often gives a better global convergence. Thus, whether such phenomena occur in our neural network model is also investigated.
Throughout this paper, Rndenotes the space of n-dimensional real column vectors andTdenotes the transpose. For any differentiable function f : Rn! R; rf ðxÞ means the gradient of f at x. For any differentiable mapping F ¼ ðF1; . . . ;FmÞT:Rn! Rm; rFðxÞ ¼ ½rF1ðxÞ rFmðxÞ 2 Rnmdenotes the transposed Jacobian of F at x. The p-norm of x is denoted by kxkpand the Euclidean norm of x is denoted by kxk. Besides, eiis the n-dimensional vector whose i-th compo- nent is 1 and 0 elsewhere. Unless otherwise stated, we assume that p in the sequel is any fixed real number in ð1; þ1Þ if not specified.
2. Preliminaries
In this section, we review some properties of /pand wp, as well as materials of ordinary differential equations that will play an important role in the subsequent analysis. We start with some concepts for a nonlinear mapping.
Definition 2.1. Let F ¼ ðF1; . . . ;FnÞT:Rn! Rn. Then, the mapping F is said to be
(a) monotone if hx y; FðxÞ FðyÞi P 0 for all x; y 2 Rn;
(b) strongly monotone with modulus
l
>0 if hx y; FðxÞ FðyÞi Pl
kx yk2for all x; y 2 Rn; (c) an P0-function if max16i6nxi–yiðxi yiÞðFiðxÞ FiðyÞÞ P 0 for all x; y 2 Rnand x–y;
(d) a uniform P-function with modulus
j
>0 if max16i6nðxi yiÞðFiðxÞ FiðyÞÞ Pj
kx yk2, for all x; y 2 Rn; (e) Lipschitz continuous if there exists a constant L > 0 such that kFðxÞ FðyÞk 6 Lkx yk for all x; y 2 Rn.FromDefinition 2.1, the following one-sided implications can be obtained:
F is strongly monotone ) F is a uniformP-function ) F is an P0function;rF is positive semidefinite ) F is monotone ) F is an P0function:
Nevertheless, we point out that F being a uniform P-function does not necessarily imply that F is monotone. The following two lemmas summarize some favorable properties of /pand wp, respectively. Since their proofs can be found in[2–4], we here omit them.
Lemma 2.1. Let /p:R R ! R be given by(5). Then, the following properties hold.
(a) /pis a positive homogeneous and sub-additive NCP-function.
(b) /pis Lipschitz continuous with L ¼ ffiffiffi p2
þ 2ð1=p1=2Þfor 1 < p < 2, and L ¼ ffiffiffi p2
þ 1 for p P 2.
(c) /pis strongly semismooth.
(d) If fðak;bkÞg # R R with ak! 1, or bk! 1, or ak! 1; bk! 1, then j/pðak;bkÞj ! 1 when k ! 1.
(e) Given a point ða; bÞ 2 R R, every element in the generalized gradient @/pða; bÞ has the representation ðn 1; f 1Þ with
n¼sgnðaÞ jajp1
kða; bÞkp1p and f ¼sgnðbÞ jbjp1 kða; bÞkp1p
for ða; bÞ–ð0; 0Þ, where sgnðÞ represents the sign function; otherwise, n and f are real numbers that satisfy jnjp1p þ jfjp1p 61.
Lemma 2.2. Let /pand wp be defined as in(5) and (6), respectively. Then, (a) wpða; bÞ P 0 for all a; b 2 R and wpis an NCP-function, i.e., it satisfies(2).
(b) wpis continuously differentiable everywhere. Moreover,rawpða; bÞ ¼rbwpða; bÞ ¼ 0 if ða; bÞ ¼ ð0; 0Þ; otherwise, rawpða; bÞ ¼ sgnðaÞ jajp1
kða; bÞkp1p 1
! /pða; bÞ;
rbwpða; bÞ ¼ sgnðbÞ jbjp1 kða; bÞkp1p
1
! /pða; bÞ:
ð11Þ
(c) rawpða; bÞ rbwpða; bÞ P 0 for all a; b 2 R. The inequality becomes an equality if and only if /pða; bÞ ¼ 0.
(d) rawpða; bÞ ¼ 0 ()rbwpða; bÞ ¼ 0 () /pða; bÞ ¼ 0 () wpða; bÞ ¼ 0.
(e) The gradient of wpis Lipschitz continuous for p P 2, i.e., there exists L > 0 such that krwpða; bÞ rwpðc; dÞk 6 Lkða; bÞ ðc; dÞk for all ða; bÞ; ðc; dÞ 2 R2and p P 2:
(f) For all a; b 2 R, we have ð2 21=pÞ minfa; bg 6 j/pða; bÞj 6 ð2 þ 21=pÞ minfa; bg.
Next, we recall some materials about first order differential equations (ODE):
_xðtÞ ¼ HðxðtÞÞ; xðt0Þ ¼ x02 Rn; ð12Þ
where H : Rn! Rnis a mapping. We also introduce three kinds of stability that will be discussed later. These materials can be found in ODE textbooks; see[25].
Definition 2.2. A point x¼ xðtÞ is called an equilibrium point or a steady state of the dynamic system(12)if HðxÞ ¼ 0. If there is a neighborhoodX#Rnof xsuch that HðxÞ ¼ 0 and HðxÞ–08x 2Xn fxg, then xis called an isolated equilibrium point.
Lemma 2.3. Assume that H : Rn! Rnis a continuous mapping. Then, for any t0P0 and x02 Rn, there exists a local solution xðtÞ for(12)with t 2 ½t0;
s
Þ for somes
>t0. If, in addition, H is locally Lipschitz continuous at x0, then the solution is unique; if H is Lipschitz continuous in Rn, thens
can be extended to 1.If a local solution defined on ½t0;
s
Þ cannot be extended to a local solution on a larger interval ½t0;s
1Þ;s
1>s
, then it is called a maximal solution, and the interval ½t0;s
Þ is the maximal interval of existence. Clearly, any local solution has an extension to a maximal one. We denote ½t0;s
ðx0ÞÞ by the maximal interval of existence associated with x0.Lemma 2.4. Assume that H : Rn! Rn is continuous. If xðtÞ with t 2 ½t0;
s
ðx0ÞÞ is a maximal solution ands
ðx0Þ < 1, then limt"sðx0ÞkxðtÞk ¼ 1.Definition 2.3. Stability in the sense of LyapunovLet xðtÞ be a solution for(12). An isolated equilibrium point xis Lyapunov stable if for any x0¼ xðt0Þ and any
e
>0, there exists a d > 0 such that kxðtÞ xk <e
for all t P t0and kxðt0Þ xk < d.Definition 2.4 (Asymptotic stability). An isolated equilibrium point xis said to be asymptotically stable if in addition to being Lyapunov stable, it has the property that xðtÞ ! xas t ! 1 for all kxðt0Þ xk < d.
Definition 2.5 (Lyapunov function). Let X#Rn be an open neighborhood of x. A continuously differentiable function W : Rn! R is said to be a Lyapunov function at the state x over the setXfor Eq.(12)if
WðxÞ ¼ 0; WðxÞ > 0; 8x 2Xn fxg:
dWðxðtÞÞ
dt ¼rWðxðtÞÞTHðxðtÞÞ 6 0; 8x 2X: (
ð13Þ
Lemma 2.5
(a) An isolated equilibrium point xis Lyapunov stable if there exists a Lyapunov function over some neighborhoodXof x. (b) An isolated equilibrium point xis asymptotically stable if there is a Lyapunov function over some neighborhoodXof x
such thatdWðxðtÞÞdt <0 for all x 2Xn fxg.
Definition 2.6 (Exponential stability). An isolated equilibrium point xis exponentially stable if there exists a d > 0 such that arbitrary point xðtÞ of(10)with the initial condition xðt0Þ ¼ x0and kxðt0Þ xk < d is well-defined on ½0; þ1Þ and satisfies
kxðtÞ xk26cextkxðt0Þ xk 8t P t0;
where c > 0 and
x
>0 are constants independent of the initial point.3. Neural network model
We now discuss properties of the neural network model introduced in(10). First, fromLemma 2.2(a), we obtain the fol- lowing result.
Proposition 3.1. LetWp:Rn! Rþbe defined as in(7). Then,WpðxÞ P 0 for all x 2 RnandWpðxÞ ¼ 0 if and only if x solves the NCP.
Proposition 3.2. LetWp:Rn! Rþbe given by(7). Then, the following results hold.
(a) The functionWpis continuously differentiable everywhere with
rWpðxÞ ¼ VTUpðxÞ for any V 2 @UpðxÞ ð14Þ
or
rWpðxÞ ¼rawpðx; FðxÞÞ þrFðxÞrbwpðx; FðxÞÞ ð15Þ
with
rawpðx; FðxÞÞ :¼rawpðx1;F1ðxÞÞ; . . . ;rawpðxn;FnðxÞÞT
; rbwpðx; FðxÞÞ :¼rbwpðx1;F1ðxÞÞ; . . . ;rbwpðxn;FnðxÞÞT
:
(b) If F is an P0-function, then every stationary point of(9)is a global minimizer ofWpðxÞ, and it consequently solves the NCP.
(c) If F is a uniform P-function, then the level sets LðWp;
c
Þ :¼ fx 2 RnjWpðxÞ 6c
g are bounded for allc
2 R.(d) WpðxðtÞÞ is nonincreasing with respect to t.
Proof. The first equality in (a) follows fromLemma 2.2(c) and[5, Theorem 2.6.6]. The second one follows from the chain rule. Part (b) is the result of[3, Proposition 3.4], and part (c) is the result of[4, Proposition 3.5]. It remains to show part (d). By the definition ofWpðxÞ and(10), it is not difficult to compute
dWpðxðtÞÞ
dt ¼rWpðxðtÞÞTdxðtÞ
dt ¼rWpðxðtÞÞT
q
rWpðxðtÞÞ¼
q
krWpðxðtÞÞk260: ð16ÞTherefore,WpðxðtÞÞ is a monotonically decreasing function with respect to t. h
Proposition 3.2(a) provides two ways to computerWpðxÞ, which is needed in the network(10). One is to use formula(14), for which we give an algorithm (seeAlgorithm 3.1below), to evaluate an element V 2 @UpðxÞ. The other is to adopt formula (15).
Algorithm 3.1. The procedure to evaluate an element V 2 @UpðxÞ
(S.0) Let x 2 Rnbe given, and let Videnote the i-th row of a matrix V 2 Rnn. (S.1) Set IðxÞ :¼ fi 2 f1; 2; . . . ; ngj xi¼ FiðxÞ ¼ 0g.
(S.2) Set z 2 Rnsuch that zi¼ 0 for i R IðxÞ, and zi¼ 1 for i 2 IðxÞ.
(S.3) For i 2 IðxÞ, let ui¼ ½jzijp1p þ jrFiðxÞTzjp1pp1p, and Vi¼ zi
ui 1
eTi þ rFiðxÞTz ui 1
! rFiðxÞT:
(S.4) For i R IðxÞ, set
Vi¼ sgnðxiÞ jxijp1 kðxi;FiðxÞÞkp1p
1
!
eTi þ sgnðFiðxÞÞ jFiðxÞjp1 kðxi;FiðxÞÞkp1p
1
! rFiðxÞT:
The above procedure is a traditional way of obtainingrWpðxðtÞÞ. For example, the neural network in[24]uses(14)and a similar algorithm to evaluate an element of V 2 @UFBðxÞ. We propose a simpler way of obtainingrWpðxðtÞÞ which is to com- puterWpðxðtÞÞ by using formula(15)rather than formula(14). Formula(15)also provides an indication on how the neural network(10)can be implemented on hardware; seeFig. 1.
To close this section, we claim thatWpprovides a global error bound for the solution of the NCP. This result is important and will be used to analyze the influence of p on the convergence rate of the trajectory xðtÞ of the neural network(10)in the next section.
Proposition 3.3. Suppose F is a uniform P-function with modulus
j
>0 and Lipschitz continuous with constant L > 0. Then, the NCP has a unique solution x, andkx xk26 4L2
j
2ð2 21=pÞ2WpðxÞ 8x 2 Rn:Proof. Since F is a uniform P-function, byProposition 3.2(c), there exists a global minimizer ofWpðxÞ which says the NCP has a solution. Assume that the NCP has two different solutions xand y, then byDefinition 2.1(d) we have
j
kx yk26max16i6mðxi yiÞðFiðxÞ FiðyÞÞ ¼ max
16i6m xiFiðyÞ yiFiðxÞ 60;
where the equality is due to the fact that xiFiðxÞ ¼ yiFiðyÞ ¼ 0 for i ¼ 1; 2; . . . ; n (note that xand yare the solutions to the NCP), and the last inequality holds since x;yP0 and FðxÞ; FðyÞ P 0. This leads to a contradiction. Hence, the NCP has a unique solution.
For any x 2 Rn, let rðxÞ :¼ ðr1ðxÞ; . . . ; rnðxÞÞT with riðxÞ ¼ minfxi;FiðxÞg for i ¼ 1; . . . ; n. Since F is Lipschitz continuous with constant L > 0, by[21, Lemma 7.4]we have
ðxi xiÞðFiðxÞ FiðxÞÞ 6 2LjriðxÞjkx xk;
for all x 2 Rnand i ¼ 1; 2; . . . ; n. On the other hand, since F is a uniform P-function with modulus
j
>0, fromDefinition 2.1(d) it follows thatj
kx xk26max16i6nðxi xiÞðFiðxÞ FiðxÞÞ
for any x 2 Rn. Combining the last two equations yields kx xk 6 ð2L=
j
Þ max16i6njriðxÞj 8x 2 Rn: This together withLemma 2.2(f) implies
kx xk 6 2L
j
ð2 21=pÞmax16i6nj/pðxi;FiðxÞÞj 6 2L
j
ð2 21=pÞkUpðxÞk:Consequently, we obtain the desired result. h
4. Convergence and stability of the trajectory
This section focuses on issues of convergence and stability of the neural network(10). We analyze the behavior of the solution trajectory of(10)including the existence and convergence, and establish three kinds of stability for an isolated equi- librium point. We first state the relationships between an equilibrium point of(10)and a solution to the NCP.
Proposition 4.1
(a) Every solution to the NCP is an equilibrium point of(10).
(b) If F is an P0-function, then every equilibrium point of(10)is a solution to the NCP.
Proof
(a) Suppose that x is a solution to the NCP. Then, fromProposition 3.1, it is clear thatUpðxÞ ¼ 0. UsingLemma 2.2(d) and (15), we then haverWpðxÞ ¼ 0. This, byDefinition 2.2, shows that x is an equilibrium point of(10).
(b) This is a direct consequence ofProposition 3.2(b). h
The following proposition establishes the existence of the solution trajectory of(10).
Proposition 4.2. For any fixed p P 2, the following hold.
(a) For any initial state x0¼ xðt0Þ, there exists exactly one maximal solution xðtÞ with t 2 ½t0;
s
ðx0ÞÞ for the neural network(10).(b) If the level set Lðx0Þ ¼ fx 2 RnjWpðxÞ 6Wpðx0Þg is bounded or F is Lipschitz continuous, then
s
ðx0Þ ¼ þ1.Fig. 1. A simplified block diagram for the neural network(10).
Proof
(a) Since F is continuously differentiable,rFðxÞ is continuous, and therefore,rFðxÞ is bounded on a local compact neigh- borhood of x. On the other hand,rawpandrbwpare Lipschitz continuous byLemma 2.2(e). These two facts together with formula(15)show thatrWpðxÞ is locally Lipschitz continuous. Thus, applyingLemma 2.3leads to the desired result.
(b) We proceed the arguments by the two cases as shown below.
Case (i): The level set Lðx0Þ is bounded. We prove the result by contradiction. Suppose
s
ðx0Þ < 1. Then, byLemma 2.4, limt"sðx0ÞkxðtÞk ¼ 1. Let Lcðx0Þ :¼ Rnn Lðx0Þ ands
0:¼ inffs P 0js <s
ðx0Þ; xðsÞ 2 Lcðx0Þg < 1:We know that xð
s
0Þ lies on the boundary of Lðx0Þ and Lcðx0Þ. Moreover, Lðx0Þ is compact since it is bounded by assumption and it is also closed because of the continuity ofWpðxÞ. Therefore, we have xðs
0Þ 2 Lðx0Þ ands
0<s
ðx0Þ, implying thatWpðxðsÞÞ >Wpðx0Þ >Wpðxð
s
0ÞÞ for some s 2 ðs
0;s
ðx0ÞÞ: ð17Þ However,Proposition 3.2(d) says thatWpðxðÞÞ is nonincreasing on ½t0;s
ðx0ÞÞ, which contradicts(17). This completes the proof of Case (i).Case (ii): F is Lipschitz continuous. From the proof of part (a), we know thatrWpðxÞ is Lipschitz continuous. Thus, by Lemma 2.3, we have
s
ðx0Þ ¼ 1. hNext, we investigate the convergence of the solution trajectory of(10).
Theorem 4.1
(a) Let xðtÞ with t 2 ½t0;
s
ðx0ÞÞ be the unique maximal solution to (10). Ifs
ðx0Þ ¼ 1 and fxðtÞg is bounded, then limt!1rWpðxðtÞÞ ¼ 0.(b) If F is strongly monotone or a uniform P-function, then Lðx0Þ is bounded and every accumulation point of the trajectory xðtÞ is a solution to the NCP.
Proof. WithProposition 3.2(b) and (d) andProposition 4.2, the arguments are exactly the same as those for[24, Corollary 4.3]. Thus, we omit them. h
FromProposition 4.1(a), every solution xto the NCP is an equilibrium point of the neural network(10). If, in addition, x is an isolated equilibrium point of(10), then we can show that xis not only Lyapunov stable but also asymptotically stable.
Theorem 4.2. Let x be an isolated equilibrium point of the neural network(10). Then, x is Lyapunov stable for(10), and furthermore, it is asymptotically stable.
Proof. Since xis a solution to the NCP,WpðxÞ ¼ 0. In addition, since xis an isolated equilibrium point of(10), there exists a neighborhoodX#Rnof xsuch that
rWpðxÞ ¼ 0; and rWpðxÞ–0 8x 2Xn fxg:
Next, we argue thatWpðxÞ is indeed a Lyapunov function at xover the setXfor(10)by showing that the conditions in(13) are satisfied. First, notice thatWpðxÞ P 0. Suppose that there is an x 2Xn fxg such thatWpðxÞ ¼ 0. Then, by formula(15) andLemma 2.2(d), we haverWðxÞ ¼ 0, i.e., x is also an equilibrium point of(10), which clearly contradicts the assumption that xis an isolated equilibrium point inX. Thus, we prove thatWpðxÞ > 0 for any x 2Xn fxg. This together with(16) shows that the conditions in(13)are satisfied, and henceWpðxÞ is a Lyapunov function at xover the setXfor(10). There- fore, xis Lyapunov stable byLemma 2.5(a).
Now, we show that xis asymptotically stable. Since xis isolated, from(16)we have dWpðxðtÞÞ
dt <0; 8xðtÞ 2Xn fxg:
This, byLemma 2.5(b), implies that xis asymptotically stable. h
Furthermore, using the same arguments we can prove that the neural network(10)is also exponentially stable if xis a regular solution to the NCP. Recall that xis a regular solution to the NCP if every element V 2 @UpðxÞ is nonsingular.
Theorem 4.3. If xis a regular solution of the NCP, then it is exponentially stable.
Remark 4.1
(a) Using arguments similar to those used inProposition 3.2of[13], we can prove that xis regular ifrFaais nonsingular and the Schur complement ofrFaain
rFaaðxÞ rFabðxÞ rFbaðxÞ rFbbðxÞ
is an P-matrix, where
a
:¼ fijxi >0g and b :¼ fijxi ¼ FiðxÞ ¼ 0g. Clearly, ifrF is positive definite, then the conditions hold true.(b) FromDefinition 2.6, if an isolated equilibrium point xis exponentially stable, then there exists a d > 0 such that xðtÞ with x0¼ ðt0Þ, and kxðt0Þ xk < d satisfies
kxðtÞ xk 6 cextkxðt0Þ xk 8t P t0;
which together withProposition 3.3implies that kxðtÞ xk 6 2cL
j
ð2 21=pÞffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Wpðx0Þ q
ext 8t P t0: ð18Þ
Since the strong monotonicity of F implies that F is a uniform P-function and thatrF is positive definite, from(18)we obtain that the neural network(10)can yield a trajectory with an exponential convergence rate under the condition that F is strongly monotone and Lipschitz continuous.
(c) We observe from(18)that, when p increases, the coefficient of extin the right hand side term becomes smaller, which in turn implies that a larger p yields a better convergence rate. This agrees with the result obtained by[2]
for a descent-type method based onWp. In addition, from(18) we notice that the energy of the initial state, i.e., Wpðx0Þ also has an influence on the convergence rate. A higher initial energy will lead to a worse convergence rate.
5. Simulation results
In this section, we test four well-known nonlinear complementarity problems by our neural network model(10). For each test problem, we also compare the numerical performance of the proposed neural network with various values of p and var- ious initial states xðt0Þ. The test instances are described below.
Example 5.1[31, Example 2]. Consider the NCP, where F : R5! R5is given by
FðxÞ ¼
x1þ x2x3x4x5=50 x2þ x1x3x4x5=50 3 x3þ x1x2x4x5=50 1 x4þ x1x2x3x5=50 þ 1=2
x5þ x1x2x3x4=50 0
BB BB BB
@
1 CC CC CC A :
The NCP has only one solution x¼ ð0; 3; 1; 0; 0Þ.
Example 5.2[30, Watson]. Consider the NCP, where F : R5! R5is given by
FðxÞ ¼ 2 exp X5
i¼1
ðxi i þ 2Þ2
! x1þ 1
x2
x3 1 x4 2 x5 3 0 BB BB BB
@ 1 CC CC CC A :
Note that F is not a P0-function on Rn. The solution to this problem is x¼ ð0; 0; 1; 2; 3Þ.
Example 5.3[23, Kojima–Shindo]. Consider the NCP, where F : R4! R4is given by
FðxÞ ¼
3x21þ 2x1x2þ 2x22þ x3þ 3x4 6 2x21þ x1þ x22þ 3x3þ 2x4 2 3x21þ x1x2þ 2x22þ 2x3þ 3x4 1
x21þ 3x22þ 2x3þ 3x4 3 0
BB B@
1 CC CA:
This is a non-degenerate NCP and the solution is x¼ ð ffiffiffi p6
=2; 0; 0; 1=2Þ.
Example 5.4 [23, Kojima–Shindo]. Consider the NCP, where F : R4! R4is given by
FðxÞ ¼
3x21þ 2x1x2þ 2x22þ x3þ 3x4 6 2x21þ x1þ x22þ 10x3þ 2x4 2 3x21þ x1x2þ 2x22þ 2x3þ 9x4 9
x21þ 3x22þ 2x3þ 3x4 3 0
BB B@
1 CC CA:
0 5 10 15 20 25 30 35 40 45 50
10−7 10−6 10−5 10−4 10−3 10−2 10−1 100 101
Time (ms)
||x(t)−x*||
p=1.1 p=1.5 p=3 p=20
Fig. 3. Convergence behavior of the error kxðtÞ xk inExample 5.2with given x0.
0 5 10 15 20 25 30 35 40 45 50
10−7 10−6 10−5 10−4 10−3 10−2 10−1 100 101
Time (ms)
||x(t)−x*||
p=1.1 p=1.5 p=3 p=20
Fig. 2. Convergence behavior of the error kxðtÞ xk inExample 5.1with given x0.
This is a degenerate NCP and has two solutions x¼ ð ffiffiffi p6
=2; 0; 0; 1=2Þ and x¼ ð1; 0; 3; 0Þ.
The numerical implementation is coded by Matlab 7.0 and the ordinary differential equation solver adopted is ode23, which uses an Runge–Kutta (2, 3) formula. We first test the influence of the parameter p on the value of kxðtÞ xk.Figs.
2–5 in the appendix describe how kxðtÞ xk varies with p for these instances with the initial states x0¼ ð102;1; 0:5; 102;102ÞT;x0¼ ð102;102;0:5; 0:5; 0:5ÞT;x0¼ ð2; 102;102;0:1ÞT, and x0¼ ð103;103;103;103ÞT,
0 5 10 15 20 25 30 35 40 45 50
10−7 10−6 10−5 10−4 10−3 10−2 10−1 100 101
Time (ms)
||x(t)−x*||
p=1.1 p=1.5 p=3 p=20
Fig. 5. Convergence behavior of the error kxðtÞ xk inExample 5.4with given x0.
0 5 10 15 20 25 30 35 40 45 50
10−7 10−6 10−5 10−4 10−3 10−2 10−1 100 101
Time (ms)
||x(t)−x*||
p=1.1 p=1.5 p=3p=20
Fig. 4. Convergence behavior of the error kxðtÞ xk inExample 5.3with given x0.
respectively. In the tests, the design parameter
q
in the neural network(10)is set to be 1000. FromFigs. 2–5, we see that, when p ¼ 1:1, the neural network(10)generates the slowest decrease of kxðtÞ xk for all test instances, whereas when p ¼ 20 it generates the fastest decrease of kxðtÞ xk. This verifies the analysis ofRemark 4.1(c). We should emphasize that the conclusion inRemark 4.1(c) requires the initial state x0to be sufficiently close to an equilibrium point. If this condition is not satisfied, we cannot draw such conclusion; seeFig. 6.0 5 10 15 20 25 30
10−6 10−5 10−4 10−3 10−2 10−1 100 101 102
Time (ms)
||x(t)−x*||
x(1)0 x(2)0 x(3)0
Fig. 7. Convergence behavior of kxðtÞ xk inExample 5.1with three different initial points xð1Þ0, xð2Þ0 , and xð3Þ0 (p ¼ 1:8).
0 5 10 15 20 25 30 35 40 45 50
10−7 10−6 10−5 10−4 10−3 10−2 10−1 100 101
Time (ms)
||x(t)−x*||
p=1.1 p=1.5 p=3 p=20
Fig. 6. Convergence behavior of kxðtÞ xk inExample 5.1with x0¼ ½0; 0; 0; 0; 0.
Example 5.1shows how the value of kxðtÞ xk varies with initial state x0.Fig. 7describes the convergence behavior of kxðtÞ xk with initial states xð1Þ0 ¼ ð1; 1; 1; 1; 1ÞT, xð2Þ0 ¼ ð5; 5; 5; 5; 5ÞT, and xð3Þ0 ¼ ð10; 10; 10; 10; 10ÞT. Notice that the initial energies corresponding to these three states areWpðxð1Þ0 Þ ¼ 5:814;Wpðxð2Þ0 Þ ¼ 39:367, andWpðxð3Þ0 Þ ¼ 226:333, respectively.
0 2 4 6 8 10 12 14
−1
−0.5 0 0.5 1 1.5 2 2.5 3 3.5 4
Time (ms)
Trajectories of x(t)
x1,x2 x3 x4 x5
Fig. 9. Transient behavior of xðtÞ of the neural network with 6 random initial points and p ¼ 1:4 inExample 5.2.
0 5 10 15 20 25 30
−0.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
Time (ms)
Trajectories of x(t)
x1,x4,x5 x2
x3
Fig. 8. Transient behavior of xðtÞ of the neural network with 6 random initial points and p ¼ 1:8 inExample 5.1.
In the tests, we choose p ¼ 1:8 and
q
¼ 1000.Fig. 7, shows that a larger initial energy yields a slower decrease of the error kxðtÞ xk if the initial state is close to the solution of the NCP. This agrees with the analysis inRemark 4.1(c).The convergence behavior of xðtÞ from several initial states with a fixed p and
q
¼ 1000 for each example is shown inFigs.8–12. The transient behavior of xðtÞ forExample 5.4is depicted inFigs. 11 and 12since there are two solutions for this prob-
0 10 20 30 40 50
−0.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
x2,x3 x1 x4
Time (ms)
Trajectories of x(t)
Fig. 11. Transient behavior of xðtÞ of the neural network with 9 random initial points and p ¼ 1:2 inExample 5.4.
0 2 4 6 8 10 12 14 16 18
−0.5 0 0.5 1 1.5 2
Time (ms)
Trajectories of x(t)
x2,x3 x1
x4
Fig. 10. Transient behavior of xðtÞ of the neural network with 6 random initial points and p ¼ 2:4 inExample 5.3.
lem. More specifically, we test 12 random initial points for the NCP, 9 of which converge to ð ffiffiffi p6
=2; 0; 0; 1=2Þ; the remaining 3 converge to ð1; 0; 3; 0Þ. When finding the solution trajectory xðtÞ, we employ krWpðxðtÞÞk 6 105as the stopping criterion.
To sum up, the neural network(10)is a better alternative for the network based on the FB function /FBif an appropriate p is chosen. Based on the analysis ofRemark 4.1(c) and the above numerical simulations, we see that, to obtain a better con- vergence rate of the trajectory xðtÞ, the parameter p cannot be set too small. In addition, we should emphasize that the initial state xðt0Þ has a great influence on the convergence behavior of kxðtÞ xk.
To end this section, we answer a natural question: are there advantages of our proposed neural network compared to the existing ones? To answer this, we summarize what we have observed from numerical experiments and theoretical results as below.
We compare our neural network model with some existing models which also work for NCP, for instance, the ones used in [6,31,32]. At first glance, the neural network models based on projection in[6,31,32]look having lower complexity. How- ever, we observe that the difference of the numerical performance is very marginal by testing MCPLIB benchmark problems.
Our proposed model seems having better properties from theoretical view. Note that there requires monotonicity (strong monotonicity) of F to guarantee the Lyapunov stability (exponential stability) of the neural network models used in [6,31,32]. In contrast, such conditions are not needed for our neural network model. In fact, it can be verified that all F’s are non-monotone in previous examples exceptExample 5.2(by checking the positive semi-definiteness of their Jaco- bian matrices).
For the following special NCP:
x ¼ ðx1;x2;x3Þ P 0; FðxÞ ¼ ðx1;x2;x3Þ P 0; hx; FðxÞi ¼ x21 x22 x23¼ 0;
it is easy to verify that the unique solution is ð0; 0; 0Þ which can be solved easily by our neural network model. But, the solution trajectory diverges by using the model in[31].
Changing initial points may not having much effect for our neural network model, whereas it does for other existing mod- els. For instance, choosing x0¼ ð12; 12; 12; 12; 12Þ as the initial point inExample 5.1causes the divergence of solution trajectory solved by the neural network model used in[31], while it does not affect anything by our neural network model.
6. Conclusions
In this paper, we have studied a (class of) neural network based on the generalized FB function /pdefined as in(5). We establish the Lyapunov stability, the asymptotic stability, and the exponential stability for the neural network. In addition,
0 100 200 300 400 500 600 700 800
−0.5 0 0.5 1 1.5 2 2.5 3 3.5
Time (ms)
Trajectories of x(t)
x2,x4 x1 x3
Fig. 12. Transient behavior of xðtÞ of the neural network with 3 random initial points and p ¼ 1:2 inExample 5.4.
we also analyze the influence of the parameter p on the convergence rate of the trajectory (or the local convergence behavior of the error kxðtÞ xk) and obtain that a larger p leads to a better convergence rate. This agrees with the result obtained by [2]for a descent-type method based on /p, which also indicates how to choose a suitable p in practice. Numerical experi- ments verify the obtained theoretical results. The advantages of our proposed neural network compared to other existing neural networks are reported as well. One future topic is to modify the proposed neural network model for various optimi- zation problems and establish its related stability accordingly.
Appendix SeeFigs. 2–12.
References
[1] J.-S. Chen, The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem, Journal of Global Optimization 36 (2006) 565–580.
[2] J.-S. Chen, H.-T. Gao, S.-H. Pan, A derivative-free R-linearly convergent algorithm based on the generalized Fischer–Burmeister merit function, Journal of Computational and Applied Mathematics, submitted for publication.
[3] J.-S. Chen, S.-H. Pan, A family of NCP functions and a descent method for the nonlinear complementarity problem, Computational Optimization and Applications 40 (2008) 389–404.
[4] J.-S. Chen, S.-H. Pan, A regularization semismooth Newton method based on the generalized Fischer–Burmeister function for P0-NCPs, Journal of Computational and Applied Mathematics 220 (2008) 464–479.
[5] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[6] C. Dang, Y. Leung, X. Gao, K. Chen, Neural networks for nonlinear and mixed complementarity problems and their applications, Neural Networks 17 (2004) 271–283.
[7] S. Effati, A. Ghomashi, A.R. Nazemi, Application of projection neural network in solving convex programming problems, Applied Mathematics and Computation 188 (2007) 1103–1114.
[8] S. Effati, A.R. Nazemi, Neural network and its application for solving linear and quadratic programming problems, Applied Mathematics and Computation 172 (2006) 305–331.
[9] M.C. Ferris, O.L. Mangasarian, J.-S. Pang (Eds.), Complementarity: Applications, Algorithms and Extensions, Kluwer Academic Publishers, Dordrecht, 2001.
[10] A. Fischer, A special Newton-type optimization methods, Optimization 24 (1992) 269–284.
[11] A. Fischer, Solution of the monotone complementarity problem with locally Lipschitzian functions, Mathematical Programming 76 (1997) 513–532.
[12] F. Facchinei, J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, vols. I and II, Springer-Verlag, New York, 2003.
[13] F. Facchinei, J. Soares, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization 7 (1997) 225–247.
[14] C. Geiger, C. Kanzow, On the resolution of monotone complementarity problems, Computational Optimization and Applications 5 (1996) 155–173.
[15] Q. Han, L.-Z. Liao, H. Qi, L. Qi, Stability analysis of gradient-based neural networks for optimization problems, Journal of Global Optimization 19 (2001) 363–381.
[16] J.J. Hopfield, D.W. Tank, Neural computation of decision in optimization problems, Biological Cybernetics 52 (1985) 141–152.
[17] X. Hu, J. Wang, Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network, IEEE Transactions on Neural Networks 17 (2006) 1487–1499.
[18] X. Hu, J. Wang, A recurrent neural network for solving a class of general variational inequalities, IEEE Transactions on Systems, Man, and Cybernetics – B 37 (2007) 528–539.
[19] H. Jiang, Unconstrained minimization approaches to nonlinear complementarity problems, Journal of Global Optimization 9 (1996) 169–181.
[20] C. Kanzow, Nonlinear complementarity as unconstrained optimization, Journal of Optimization Theory and Applications 88 (1996) 139–155.
[21] C. Kanzow, M. Fukushima, Equivalence of the generalized complementarity problem to differentiable unconstrained minimization, Journal of Optimization Theory and Applications 90 (1996) 581–603.
[22] M.P. Kennedy, L.O. Chua, Neural network for nonlinear programming, IEEE Transaction on Circuits and Systems 35 (1988) 554–562.
[23] M. Kojima, S. Shindo, Extensions of Newton and quasi-Newton methods to systems of PC1equations, Journal of Operations Research Society of Japan 29 (1986) 352–374.
[24] L.-Z. Liao, H. Qi, L. Qi, Solving nonlinear complementarity problems with neural networks: a reformulation method approach, Journal of Computational and Applied Mathematics 131 (2001) 342–359.
[25] R.K. Miller, A.N. Michel, Ordinary Differential Equations, Academic Press, 1982.
[26] S.-K. Oh, W. Pedrycz, S.-B. Roh, Genetically optimized fuzzy polynomial neural networks with fuzzy set-based polynomial neurons, Information Sciences 176 (2006) 3490–3519.
[27] A. Shortt, J. Keating, L. Monlinier, C. Pannell, Optical implementation of the Kak neural network, Information Sciences 171 (2005) 273–287.
[28] D.W. Tank, J.J. Hopfield, Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit, IEEE Transactions on Circuits and Systems 33 (1986) 533–541.
[29] P. Tseng, Global behaviour of a class of merit functions for the nonlinear complementarity problem, Journal of Optimization Theory and Applications 89 (1996) 17–37.
[30] L.T. Watson, Solving the nonlinear complementarity problem by a homotopy method, SIAM Journal on Control and Optimization 17 (1979) 36–46.
[31] Y. Xia, H. Leung, J. Wang, A projection neural network and its application to constrained optimization problems, IEEE Transactions on Circuits and Systems – I 49 (2002) 447–458.
[32] Y. Xia, H. Leung, J. Wang, A general projection neural network for solving monotone variational inequalities and related optimization problems, IEEE Transactions on Neural Networks 15 (2004) 318–328.
[33] Y. Xia, H. Leung, J. Wang, A recurrent neural network for solving nonlinear convex programs subject to linear constraints, IEEE Transactions on Neural Networks 16 (2005) 379–386.
[34] M. Yashtini, A. Malek, Solving complementarity and variational inequalities problems using neural networks, Applied Mathematics and Computation 190 (2007) 216–230.
[35] S.H. Zak, V. Upatising, S. Hui, Solving linear programming problems with neural networks: a comparative study, IEEE Transactions on Neural Networks 6 (1995) 94–104.
[36] G. Zhang, A neural network ensemble method with jittered training data for time series forecasting, Information Sciences 177 (2007) 5329–5340.