
Information Sciences, vol. 180, pp. 697-711, 2010

### A neural network based on the generalized Fischer-Burmeister function for nonlinear complementarity problems

Jein-Shan Chen ¹
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

Chun-Hsu Ko ²
Department of Electrical Engineering, I-Shou University, Kaohsiung 840, Taiwan

Shaohua Pan ³
School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China

February 18, 2008

(revised on June 11, 2008, February 25, 2009, March 16, 2009, August 18, 2009)

Abstract. In this paper, we consider a neural network model for solving the nonlinear complementarity problem (NCP). The neural network is derived from an equivalent unconstrained minimization reformulation of the NCP, which is based on the generalized Fischer-Burmeister function $\phi_p(a,b) = \|(a,b)\|_p - (a+b)$. We establish the existence and the convergence of the trajectory of the neural network, and study its Lyapunov stability, asymptotic stability, and exponential stability. We find that a larger p leads to a better convergence rate of the trajectory. Numerical simulations verify the theoretical results.

Key words: NCP, neural network, exponentially convergent, generalized Fischer-Burmeister function.

¹ Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw

² E-mail: chko@isu.edu.tw

³ E-mail: shhpan@scut.edu.cn


### 1 Introduction

For decades, the nonlinear complementarity problem (NCP) has attracted a lot of attention because of its wide applications in operations research, economics, and engineering [9, 12]. Given a mapping $F : \mathbb{R}^n \to \mathbb{R}^n$, the NCP is to find a point $x \in \mathbb{R}^n$ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product. Throughout this paper, we assume that F is continuously differentiable, and let $F = (F_1, \ldots, F_n)^T$ with $F_i : \mathbb{R}^n \to \mathbb{R}$ for $i = 1, \ldots, n$.

There have been many methods proposed for solving the NCP [9, 12]. One of the most popular approaches is to reformulate the NCP as an unconstrained minimization problem via a merit function; see [14, 19, 20, 21]. A merit function is a function whose global minimizers coincide with the solutions of the NCP. The class of NCP-functions deﬁned below is used to construct a merit function.

Definition 1.1 A function ϕ : IR× IR → IR is called an NCP-function if it satisﬁes

ϕ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0. (2)

A popular NCP-function is the Fischer-Burmeister (FB) function [10, 11], which is defined as

$$\phi_{FB}(a,b) = \sqrt{a^2 + b^2} - (a+b). \tag{3}$$

The FB merit function $\psi_{FB} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}_+$ is obtained by squaring $\phi_{FB}$, i.e.,

$$\psi_{FB}(a,b) := \frac{1}{2}\,|\phi_{FB}(a,b)|^2. \tag{4}$$

In [1, 3, 4], we studied a family of NCP-functions that subsumes the FB function $\phi_{FB}$ as a special case. More specifically, we define $\phi_p : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ by

$$\phi_p(a,b) := \|(a,b)\|_p - (a+b), \tag{5}$$

where p is any fixed real number in $(1, +\infty)$ and $\|(a,b)\|_p$ denotes the p-norm of $(a,b)$, i.e., $\|(a,b)\|_p = \sqrt[p]{|a|^p + |b|^p}$. In other words, in the function $\phi_p$ we replace the 2-norm of $(a,b)$ in the FB function $\phi_{FB}$ by the more general p-norm of $(a,b)$. The function $\phi_p$ is still an NCP-function, as noted in Tseng's paper [30]. There was no further study of this NCP-function, even for p = 3, until recently [1, 3, 4]. Similar to $\phi_{FB}$, the square of $\phi_p$ induces a nonnegative NCP-function $\psi_p : \mathbb{R} \times \mathbb{R} \to \mathbb{R}_+$:

$$\psi_p(a,b) := \frac{1}{2}\,|\phi_p(a,b)|^2. \tag{6}$$


The function $\psi_p$ is continuously differentiable and has some favorable properties; see [1, 3, 4]. Moreover, if we define the function $\Psi_p : \mathbb{R}^n \to \mathbb{R}_+$ by

$$\Psi_p(x) := \sum_{i=1}^{n} \psi_p(x_i, F_i(x)) = \frac{1}{2}\,\|\Phi_p(x)\|^2, \tag{7}$$

where $\Phi_p : \mathbb{R}^n \to \mathbb{R}^n$ is the mapping given by

$$\Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix}, \tag{8}$$

then the NCP can be reformulated as the following smooth minimization problem:

$$\min_{x \in \mathbb{R}^n} \Psi_p(x). \tag{9}$$

Thus, $\Psi_p(x)$ in (7) is a smooth merit function for the NCP.
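As an illustration of definitions (5) and (7), the following Python sketch evaluates the generalized FB function and the merit function. The names `phi_p` and `merit`, and the toy map F(x) = x + q, are assumptions made for this example only and do not come from the paper.

```python
import numpy as np

def phi_p(a, b, p=2.0):
    """Generalized Fischer-Burmeister function (5): ||(a, b)||_p - (a + b)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return (np.abs(a) ** p + np.abs(b) ** p) ** (1.0 / p) - (a + b)

def merit(x, F, p=2.0):
    """Merit function (7): Psi_p(x) = 0.5 * ||Phi_p(x)||^2."""
    x = np.asarray(x, dtype=float)
    return 0.5 * np.sum(phi_p(x, F(x), p) ** 2)

# Hypothetical toy problem: F(x) = x + q with q = (-1, 1).
# Its NCP solution is x = (1, 0), where F(x) = (0, 1).
q = np.array([-1.0, 1.0])
F = lambda x: x + q

print(merit([1.0, 0.0], F, p=3.0))  # vanishes at the solution
print(merit([2.0, 2.0], F, p=3.0))  # strictly positive away from it
```

The merit value is zero exactly when every component pair $(x_i, F_i(x))$ satisfies the complementarity conditions in (2), which is what makes minimizing it equivalent to solving the NCP.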

Effective gradient-type methods can be applied to the unconstrained smooth minimization problem (9). However, in many scientific and engineering applications, it is desirable to have a real-time solution of the NCP. Traditional unconstrained optimization algorithms may not be suitable for real-time implementation because the computing time required for a solution depends greatly on the dimension and structure of the problem. One promising way to overcome this problem is to apply neural networks.

Neural networks for optimization were first introduced in the 1980s by Hopfield and Tank [16, 29]. Since then, neural networks have been applied to various optimization problems, including linear programming, nonlinear programming, variational inequalities, and linear and nonlinear complementarity problems; see [6, 7, 8, 15, 17, 18, 23, 25, 32, 33, 34, 35, 36]. There have also been many studies on neural-network approaches to real-world problems in other fields, such as [27, 28, 37]. The main idea of the neural network approach for optimization is to construct a nonnegative energy function and establish a dynamic system that represents an artificial neural network. The dynamic system is usually in the form of first-order ordinary differential equations. Furthermore, starting from an initial point, the dynamic system is expected to approach its static state (or an equilibrium point), which corresponds to the solution of the underlying optimization problem. In addition, neural networks for solving optimization problems are hardware-implementable; that is, the neural networks can be implemented using integrated circuits.

In this paper, we focus on a neural network approach to the NCP. We utilize $\Psi_p(x)$ as the traditional energy function. As mentioned above, the NCP is equivalent to the unconstrained smooth minimization problem (9). Therefore, it is natural to adopt the following steepest descent-based neural network model for the NCP:

$$\frac{dx(t)}{dt} = -\rho\,\nabla \Psi_p(x(t)), \quad x(0) = x_0, \tag{10}$$

where ρ > 0 is a scaling factor. Most neural networks in the existing literature are projection-type ones based on other kinds of NCP-functions, such as the natural residual function (e.g., [18, 34]) and the regularized gap function (e.g., [6]). Recently, neural networks based on the FB function have been designed for linear and quadratic programming, and for nonlinear complementarity problems [8, 25]. Our model is based on the generalized FB function, which generalizes the functions used in [8, 25].

We show that the neural network (10) is Lyapunov stable, asymptotically stable, and exponentially stable. We observed in [2] that p has a great influence on the numerical performance of certain descent-type methods; a larger p yields a better convergence rate, whereas a smaller p often gives better global convergence. Thus, we also investigate whether such phenomena occur in our neural network model.

Throughout this paper, $\mathbb{R}^n$ denotes the space of n-dimensional real column vectors and T denotes the transpose. For any differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ means the gradient of f at x. For any differentiable mapping $F = (F_1, \ldots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \ \cdots \ \nabla F_m(x)] \in \mathbb{R}^{n \times m}$ denotes the transposed Jacobian of F at x. The p-norm of x is denoted by $\|x\|_p$ and the Euclidean norm of x by $\|x\|$. In addition, $e_i$ is the n-dimensional vector whose i-th component is 1 and whose other components are 0. Unless otherwise stated, p in the sequel is any fixed real number in $(1, +\infty)$.

### 2 Preliminaries

In this section, we review some properties of ϕp and ψp, as well as materials of ordinary diﬀerential equations that will play an important role in the subsequent analysis. We start with some concepts for a nonlinear mapping.

Definition 2.1 Let $F = (F_1, \ldots, F_n)^T : \mathbb{R}^n \to \mathbb{R}^n$. Then, the mapping F is said to be

(a) monotone if $\langle x - y, F(x) - F(y) \rangle \ge 0$ for all $x, y \in \mathbb{R}^n$.

(b) strongly monotone with modulus μ > 0 if $\langle x - y, F(x) - F(y) \rangle \ge \mu \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

(c) a $P_0$-function if $\max_{1 \le i \le n,\ x_i \ne y_i} (x_i - y_i)(F_i(x) - F_i(y)) \ge 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.

(d) a uniform P-function with modulus κ > 0 if $\max_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) \ge \kappa \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

(e) Lipschitz continuous if there exists a constant L > 0 such that $\|F(x) - F(y)\| \le L \|x - y\|$ for all $x, y \in \mathbb{R}^n$.

From Definition 2.1, the following one-sided implications can be obtained:

F is strongly monotone ⟹ F is a uniform P-function ⟹ F is a $P_0$-function;

∇F is positive semidefinite ⟹ F is monotone ⟹ F is a $P_0$-function.

Nevertheless, we point out that F being a uniform P-function does not necessarily imply that F is monotone. The following two lemmas summarize some favorable properties of $\phi_p$ and $\psi_p$, respectively. Since their proofs can be found in [2, 3, 4], we omit them here.

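Lemma 2.1 Let $\phi_p : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be given by (5). Then, the following properties hold.

(a) $\phi_p$ is a positive homogeneous and sub-additive NCP-function.

(b) $\phi_p$ is Lipschitz continuous with constant $L = \sqrt{2} + 2^{(1/p - 1/2)}$ for $1 < p < 2$, and $L = \sqrt{2} + 1$ for $p \ge 2$.

(c) $\phi_p$ is strongly semismooth.

(d) If $\{(a^k, b^k)\} \subseteq \mathbb{R} \times \mathbb{R}$ with $a^k \to -\infty$, or $b^k \to -\infty$, or $a^k \to \infty$ and $b^k \to \infty$, then $|\phi_p(a^k, b^k)| \to \infty$ as $k \to \infty$.

(e) Given a point $(a,b) \in \mathbb{R} \times \mathbb{R}$, every element in the generalized gradient $\partial \phi_p(a,b)$ has the representation $(\xi - 1, \zeta - 1)$ with

$$\xi = \frac{\operatorname{sgn}(a)\,|a|^{p-1}}{\|(a,b)\|_p^{p-1}} \quad \text{and} \quad \zeta = \frac{\operatorname{sgn}(b)\,|b|^{p-1}}{\|(a,b)\|_p^{p-1}}$$

for $(a,b) \ne (0,0)$, where sgn(·) represents the sign function; otherwise, ξ and ζ are real numbers that satisfy $|\xi|^{\frac{p}{p-1}} + |\zeta|^{\frac{p}{p-1}} \le 1$.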

Lemma 2.2 Let $\phi_p$ and $\psi_p$ be defined as in (5) and (6), respectively. Then,

(a) $\psi_p(a,b) \ge 0$ for all $a, b \in \mathbb{R}$ and $\psi_p$ is an NCP-function, i.e., it satisfies (2).

(b) $\psi_p$ is continuously differentiable everywhere. Moreover, $\nabla_a \psi_p(a,b) = \nabla_b \psi_p(a,b) = 0$ if $(a,b) = (0,0)$; otherwise,

$$\nabla_a \psi_p(a,b) = \left( \frac{\operatorname{sgn}(a)\,|a|^{p-1}}{\|(a,b)\|_p^{p-1}} - 1 \right) \phi_p(a,b), \qquad \nabla_b \psi_p(a,b) = \left( \frac{\operatorname{sgn}(b)\,|b|^{p-1}}{\|(a,b)\|_p^{p-1}} - 1 \right) \phi_p(a,b). \tag{11}$$

(c) $\nabla_a \psi_p(a,b) \cdot \nabla_b \psi_p(a,b) \ge 0$ for all $a, b \in \mathbb{R}$. Equality holds if and only if $\phi_p(a,b) = 0$.

(d) $\nabla_a \psi_p(a,b) = 0 \iff \nabla_b \psi_p(a,b) = 0 \iff \phi_p(a,b) = 0 \iff \psi_p(a,b) = 0$.

(e) The gradient of $\psi_p$ is Lipschitz continuous for $p \ge 2$, i.e., there exists L > 0 such that $\|\nabla \psi_p(a,b) - \nabla \psi_p(c,d)\| \le L \|(a,b) - (c,d)\|$ for all $(a,b), (c,d) \in \mathbb{R}^2$ and $p \ge 2$.

(f) For all $a, b \in \mathbb{R}$, we have $(2 - 2^{1/p})\,|\min\{a,b\}| \le |\phi_p(a,b)| \le (2 + 2^{1/p})\,|\min\{a,b\}|$.
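The closed-form gradient in Lemma 2.2(b) and the two-sided bound in Lemma 2.2(f) can be spot-checked numerically. The sketch below is only a sanity check on random samples, not a proof; the function names, the sampling range, and the finite-difference step are assumptions made for this illustration.

```python
import numpy as np

def phi_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def psi_p(a, b, p):
    return 0.5 * phi_p(a, b, p) ** 2

def grad_psi_p(a, b, p):
    """Closed-form gradient (11); zero at the origin by Lemma 2.2(b)."""
    if a == 0.0 and b == 0.0:
        return 0.0, 0.0
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    phi = norm - (a + b)
    ga = (np.sign(a) * abs(a) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    gb = (np.sign(b) * abs(b) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    return ga, gb

def fd_grad(a, b, p, h=1e-6):
    """Central finite differences of psi_p, used only as a cross-check."""
    ga = (psi_p(a + h, b, p) - psi_p(a - h, b, p)) / (2 * h)
    gb = (psi_p(a, b + h, p) - psi_p(a, b - h, p)) / (2 * h)
    return ga, gb

rng = np.random.default_rng(0)
max_err, bound_ok = 0.0, True
for p in (1.5, 2.0, 3.0, 20.0):
    for a, b in rng.uniform(-2.0, 2.0, size=(200, 2)):
        exact, approx = grad_psi_p(a, b, p), fd_grad(a, b, p)
        max_err = max(max_err, abs(exact[0] - approx[0]), abs(exact[1] - approx[1]))
        # Bound of Lemma 2.2(f), with a small tolerance for rounding
        m, c, val = abs(min(a, b)), 2.0 ** (1.0 / p), abs(phi_p(a, b, p))
        bound_ok = bound_ok and bool((2 - c) * m <= val + 1e-12 <= (2 + c) * m + 2e-12)

print("largest gradient mismatch:", max_err)
print("bound (f) held on all samples:", bound_ok)
```

The lower bound in (f) degrades as p approaches 1, since $2 - 2^{1/p}$ tends to 0; this is the same factor that later appears in the error bound of Proposition 3.3.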

Next, we recall some material on first-order ordinary differential equations (ODEs):

$$\dot{x}(t) = H(x(t)), \quad x(t_0) = x_0 \in \mathbb{R}^n, \tag{12}$$

where $H : \mathbb{R}^n \to \mathbb{R}^n$ is a mapping. We also introduce three kinds of stability that will be discussed later. This material can be found in ODE textbooks; see [26].

Definition 2.2 A point $x^* = x(t^*)$ is called an equilibrium point or a steady state of the dynamic system (12) if $H(x^*) = 0$. If there is a neighborhood $\Omega \subseteq \mathbb{R}^n$ of $x^*$ such that $H(x^*) = 0$ and $H(x) \ne 0$ for all $x \in \Omega \setminus \{x^*\}$, then $x^*$ is called an isolated equilibrium point.

Lemma 2.3 Assume that $H : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous mapping. Then, for any $t_0 \ge 0$ and $x_0 \in \mathbb{R}^n$, there exists a local solution x(t) for (12) with $t \in [t_0, \tau)$ for some $\tau > t_0$. If, in addition, H is locally Lipschitz continuous at $x_0$, then the solution is unique; if H is Lipschitz continuous in $\mathbb{R}^n$, then τ can be extended to ∞.

If a local solution defined on $[t_0, \tau)$ cannot be extended to a local solution on a larger interval $[t_0, \tau_1)$, $\tau_1 > \tau$, then it is called a maximal solution, and the interval $[t_0, \tau)$ is the maximal interval of existence. Clearly, any local solution has an extension to a maximal one. We denote by $[t_0, \tau(x_0))$ the maximal interval of existence associated with $x_0$.

Lemma 2.4 Assume that $H : \mathbb{R}^n \to \mathbb{R}^n$ is continuous. If x(t) with $t \in [t_0, \tau(x_0))$ is a maximal solution and $\tau(x_0) < \infty$, then $\lim_{t \uparrow \tau(x_0)} \|x(t)\| = \infty$.

Definition 2.3 (Stability in the sense of Lyapunov) Let x(t) be a solution of (12). An isolated equilibrium point $x^*$ is Lyapunov stable if for any $x_0 = x(t_0)$ and any ε > 0, there exists a δ > 0 such that $\|x(t_0) - x^*\| < \delta$ implies $\|x(t) - x^*\| < \varepsilon$ for all $t \ge t_0$.

Definition 2.4 (Asymptotic stability) An isolated equilibrium point $x^*$ is said to be asymptotically stable if, in addition to being Lyapunov stable, it has the property that $x(t) \to x^*$ as $t \to \infty$ whenever $\|x(t_0) - x^*\| < \delta$.

Definition 2.5 (Lyapunov function) Let $\Omega \subseteq \mathbb{R}^n$ be an open neighborhood of $\bar{x}$. A continuously differentiable function $W : \mathbb{R}^n \to \mathbb{R}$ is said to be a Lyapunov function at the state $\bar{x}$ over the set Ω for equation (12) if

$$\begin{cases} W(\bar{x}) = 0, \quad W(x) > 0 \quad \forall x \in \Omega \setminus \{\bar{x}\}, \\[4pt] \dfrac{dW(x(t))}{dt} = \nabla W(x(t))^T H(x(t)) \le 0 \quad \forall x \in \Omega. \end{cases} \tag{13}$$

Lemma 2.5 (a) An isolated equilibrium point $x^*$ is Lyapunov stable if there exists a Lyapunov function over some neighborhood Ω of $x^*$.

(b) An isolated equilibrium point $x^*$ is asymptotically stable if there is a Lyapunov function over some neighborhood Ω of $x^*$ such that $\frac{dW(x(t))}{dt} < 0$ for all $x \in \Omega \setminus \{x^*\}$.

Definition 2.6 (Exponential stability) An isolated equilibrium point $x^*$ is exponentially stable if there exists a δ > 0 such that every solution x(t) of (10) with the initial condition $x(t_0) = x_0$ and $\|x(t_0) - x^*\| < \delta$ is well-defined on $[0, +\infty)$ and satisfies

$$\|x(t) - x^*\| \le c\,e^{-\omega t}\,\|x(t_0) - x^*\| \quad \forall t \ge t_0,$$

where c > 0 and ω > 0 are constants independent of the initial point.

### 3 Neural network model

We now discuss properties of the neural network model introduced in (10). First, from Lemma 2.2(a), we obtain the following result.

Proposition 3.1 Let Ψp : IRn → IR+ be deﬁned as in (7). Then, Ψp(x) ≥ 0 for all x∈ IRn and Ψp(x) = 0 if and only if x solves the NCP.

Proposition 3.2 Let Ψp : IRn→ IR+ be given by (7). Then, the following results hold.

(a) The function $\Psi_p$ is continuously differentiable everywhere with

$$\nabla \Psi_p(x) = V^T \Phi_p(x) \quad \text{for any } V \in \partial \Phi_p(x) \tag{14}$$

or

$$\nabla \Psi_p(x) = \nabla_a \psi_p(x, F(x)) + \nabla F(x)\, \nabla_b \psi_p(x, F(x)) \tag{15}$$

with

$$\nabla_a \psi_p(x, F(x)) := [\nabla_a \psi_p(x_1, F_1(x)), \ldots, \nabla_a \psi_p(x_n, F_n(x))]^T,$$

$$\nabla_b \psi_p(x, F(x)) := [\nabla_b \psi_p(x_1, F_1(x)), \ldots, \nabla_b \psi_p(x_n, F_n(x))]^T.$$

(b) If F is a $P_0$-function, then every stationary point of (9) is a global minimizer of $\Psi_p(x)$, and it consequently solves the NCP.

(c) If F is a uniform P-function, then the level sets $\mathcal{L}(\Psi_p, \gamma) := \{x \in \mathbb{R}^n \mid \Psi_p(x) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$.

(d) $\Psi_p(x(t))$ is nonincreasing with respect to t.

Proof. The first equality in (a) follows from Lemma 2.2 (c) and [5, Theorem 2.6.6]. The second one follows from the chain rule. Part (b) is the result of [3, Proposition 3.4], and part (c) is the result of [4, Proposition 3.5]. It remains to show part (d). By the definition of $\Psi_p(x)$ and (10), it is not difficult to compute

$$\frac{d\Psi_p(x(t))}{dt} = \nabla \Psi_p(x(t))^T \frac{dx(t)}{dt} = \nabla \Psi_p(x(t))^T \big( -\rho \nabla \Psi_p(x(t)) \big) = -\rho\, \|\nabla \Psi_p(x(t))\|^2 \le 0. \tag{16}$$

Therefore, $\Psi_p(x(t))$ is nonincreasing with respect to t. □

Proposition 3.2(a) provides two ways to compute ∇Ψp(x), which is needed in the network (10). One is to use formula (14), for which we give an algorithm (see Algorithm 3.1 below), to evaluate an element V ∈ ∂Φp(x). The other is to adopt formula (15).

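Algorithm 3.1 (The procedure to evaluate an element $V \in \partial \Phi_p(x)$)

(S.0) Let $x \in \mathbb{R}^n$ be given, and let $V_i$ denote the i-th row of a matrix $V \in \mathbb{R}^{n \times n}$.

(S.1) Set $I(x) := \{ i \in \{1, 2, \ldots, n\} \mid x_i = F_i(x) = 0 \}$.

(S.2) Set $z \in \mathbb{R}^n$ such that $z_i = 0$ for $i \notin I(x)$, and $z_i = 1$ for $i \in I(x)$.

(S.3) For $i \in I(x)$, let $u_i = \left[ |z_i|^{\frac{p}{p-1}} + |\nabla F_i(x)^T z|^{\frac{p}{p-1}} \right]^{\frac{p-1}{p}}$, and

$$V_i = \left( \frac{z_i}{u_i} - 1 \right) e_i^T + \left( \frac{\nabla F_i(x)^T z}{u_i} - 1 \right) \nabla F_i(x)^T.$$

(S.4) For $i \notin I(x)$, set

$$V_i = \left( \frac{\operatorname{sgn}(x_i)\,|x_i|^{p-1}}{\|(x_i, F_i(x))\|_p^{p-1}} - 1 \right) e_i^T + \left( \frac{\operatorname{sgn}(F_i(x))\,|F_i(x)|^{p-1}}{\|(x_i, F_i(x))\|_p^{p-1}} - 1 \right) \nabla F_i(x)^T.$$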

The above procedure is a traditional way of obtaining $\nabla \Psi_p(x(t))$. For example, the neural network in [25] uses (14) and a similar algorithm to evaluate an element $V \in \partial \Phi_{FB}(x)$. We propose a simpler way: compute $\nabla \Psi_p(x(t))$ by formula (15) rather than formula (14). Formula (15) also indicates how the neural network (10) can be implemented in hardware; see Figure 1 below.
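To make the comparison between formulas (14) and (15) concrete, here is a minimal Python sketch that builds one element V by Algorithm 3.1 and checks that $V^T \Phi_p(x)$ agrees with the gradient from (15). It assumes an affine map F(x) = Mx + q, so that the ordinary Jacobian is the constant matrix M and the paper's $\nabla F(x)$ is its transpose. The matrix, the vector, the test point, and all function names are hypothetical choices for this illustration; the point is picked so that index 0 is degenerate ($x_0 = F_0(x) = 0$).

```python
import numpy as np

def dphi(a, b, p):
    """Partial derivatives of phi_p at (a, b) != (0, 0), cf. Lemma 2.1(e)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    da = np.sign(a) * abs(a) ** (p - 1) / norm ** (p - 1) - 1.0
    db = np.sign(b) * abs(b) ** (p - 1) / norm ** (p - 1) - 1.0
    return da, db

def element_of_jacobian(x, Fx, JF, p):
    """Algorithm 3.1: one element V of the generalized Jacobian of Phi_p at x.
    JF is the ordinary Jacobian of F, so row i of JF is grad F_i(x)^T."""
    n = x.size
    V = np.zeros((n, n))
    degenerate = (x == 0.0) & (Fx == 0.0)      # index set I(x) of (S.1)
    z = degenerate.astype(float)               # (S.2)
    q = p / (p - 1.0)                          # conjugate exponent p/(p-1)
    for i in range(n):
        e_i = np.eye(n)[i]
        if degenerate[i]:                      # (S.3)
            d = JF[i] @ z
            u = (abs(z[i]) ** q + abs(d) ** q) ** (1.0 / q)
            V[i] = (z[i] / u - 1.0) * e_i + (d / u - 1.0) * JF[i]
        else:                                  # (S.4)
            da, db = dphi(x[i], Fx[i], p)
            V[i] = da * e_i + db * JF[i]
    return V

p = 3.0
M = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
q_vec = np.array([-1.0, -3.0, 1.0])
x = np.array([0.0, 1.0, 2.0])
Fx = M @ x + q_vec                             # (0, 1, 6): index 0 is degenerate

Phi = (np.abs(x) ** p + np.abs(Fx) ** p) ** (1.0 / p) - (x + Fx)

# Formula (14): gradient as V^T Phi_p(x)
grad_14 = element_of_jacobian(x, Fx, M, p).T @ Phi

# Formula (15): componentwise gradients of psi_p, zero where (x_i, F_i) = (0, 0)
grad_a, grad_b = np.zeros(3), np.zeros(3)
for i in range(3):
    if x[i] != 0.0 or Fx[i] != 0.0:
        da, db = dphi(x[i], Fx[i], p)
        grad_a[i], grad_b[i] = da * Phi[i], db * Phi[i]
grad_15 = grad_a + M.T @ grad_b                # M.T plays the role of grad F(x)

print(grad_14, grad_15)
```

The two results coincide because $\phi_p$ vanishes at every degenerate index, so the rows of V built in step (S.3) are multiplied by zero. Formula (15) avoids the index-set bookkeeping altogether, which is the simplification the text refers to.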

Figure 1: A simpliﬁed block diagram for the neural network (10).

To close this section, we claim that Ψp provides a global error bound for the solution of the NCP. This result is important and will be used to analyze the inﬂuence of p on the convergence rate of the trajectory x(t) of the neural network (10) in the next section.

Proposition 3.3 Suppose F is a uniform P-function with modulus κ > 0 and Lipschitz continuous with constant L > 0. Then, the NCP has a unique solution $x^*$, and

$$\|x - x^*\|^2 \le \frac{4L^2}{\kappa^2 (2 - 2^{1/p})^2}\, \Psi_p(x) \quad \forall x \in \mathbb{R}^n.$$

Proof. Since F is a uniform P-function, by Proposition 3.2(c), there exists a global minimizer of $\Psi_p(x)$. Because a uniform P-function is a $P_0$-function, Proposition 3.2(b) shows that this minimizer solves the NCP, so the NCP has a solution. Assume that the NCP has two different solutions x and y. Then, by Definition 2.1(d), we have

$$\kappa \|x - y\|^2 \le \max_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) = \max_{1 \le i \le n} \{ -x_i F_i(y) - y_i F_i(x) \} \le 0,$$

where the equality is due to the fact that $x_i F_i(x) = y_i F_i(y) = 0$ for $i = 1, 2, \ldots, n$ (note that x and y are solutions to the NCP), and the last inequality holds since $x, y \ge 0$ and $F(x), F(y) \ge 0$. This leads to a contradiction. Hence, the NCP has a unique solution.

For any $x \in \mathbb{R}^n$, let $r(x) := (r_1(x), \ldots, r_n(x))^T$ with $r_i(x) = \min\{x_i, F_i(x)\}$ for $i = 1, \ldots, n$. Since F is Lipschitz continuous with constant L > 0, by [21, Lemma 7.4] we have

$$(x_i - x_i^*)(F_i(x) - F_i(x^*)) \le 2L\, |r_i(x)|\, \|x - x^*\|$$

for all $x \in \mathbb{R}^n$ and $i = 1, 2, \ldots, n$. On the other hand, since F is a uniform P-function with modulus κ > 0, from Definition 2.1(d) it follows that

$$\kappa \|x - x^*\|^2 \le \max_{1 \le i \le n} (x_i - x_i^*)(F_i(x) - F_i(x^*))$$

for any $x \in \mathbb{R}^n$. Combining the last two inequalities yields

$$\|x - x^*\| \le \frac{2L}{\kappa} \max_{1 \le i \le n} |r_i(x)| \quad \forall x \in \mathbb{R}^n.$$

This together with Lemma 2.2(f) implies

$$\|x - x^*\| \le \frac{2L}{\kappa (2 - 2^{1/p})} \max_{1 \le i \le n} |\phi_p(x_i, F_i(x))| \le \frac{2L}{\kappa (2 - 2^{1/p})}\, \|\Phi_p(x)\|.$$

Consequently, we obtain the desired result. □

### 4 Convergence and stability of the trajectory

This section focuses on issues of convergence and stability of the neural network (10).

We analyze the behavior of the solution trajectory of (10) including the existence and convergence, and establish three kinds of stability for an isolated equilibrium point. We ﬁrst state the relationships between an equilibrium point of (10) and a solution to the NCP.


Proposition 4.1 (a) Every solution to the NCP is an equilibrium point of (10).

(b) If F is a $P_0$-function, then every equilibrium point of (10) is a solution to the NCP.

Proof. (a) Suppose that $x^*$ is a solution to the NCP. Then, from Proposition 3.1, it is clear that $\Phi_p(x^*) = 0$. Using Lemma 2.2 (d) and (15), we then have $\nabla \Psi_p(x^*) = 0$. This, by Definition 2.2, shows that $x^*$ is an equilibrium point of (10).

(b) This is a direct consequence of Proposition 3.2 (b). □

The following proposition establishes the existence of the solution trajectory of (10).

Proposition 4.2 For any fixed $p \ge 2$, the following hold.

(a) For any initial state $x_0 = x(t_0)$, there exists exactly one maximal solution x(t) with $t \in [t_0, \tau(x_0))$ for the neural network (10).

(b) If the level set $\mathcal{L}(x_0) = \{x \in \mathbb{R}^n \mid \Psi_p(x) \le \Psi_p(x_0)\}$ is bounded or F is Lipschitz continuous, then $\tau(x_0) = +\infty$.

Proof. (a) Since F is continuously differentiable, $\nabla F(x)$ is continuous and therefore bounded on a compact neighborhood of x. On the other hand, $\nabla_a \psi_p$ and $\nabla_b \psi_p$ are Lipschitz continuous by Lemma 2.2 (e). These two facts together with formula (15) show that $\nabla \Psi_p(x)$ is locally Lipschitz continuous. Thus, applying Lemma 2.3 leads to the desired result.

(b) We consider the two cases separately.

Case (i): The level set $\mathcal{L}(x_0)$ is bounded. We prove the result by contradiction. Suppose $\tau(x_0) < \infty$. Then, by Lemma 2.4, $\lim_{t \uparrow \tau(x_0)} \|x(t)\| = \infty$. Let $\mathcal{L}^c(x_0) := \mathbb{R}^n \setminus \mathcal{L}(x_0)$ and

$$\tau_0 := \inf\{ s \ge 0 \mid s < \tau(x_0),\ x(s) \in \mathcal{L}^c(x_0) \} < \infty.$$

We know that $x(\tau_0)$ lies on the boundary of $\mathcal{L}(x_0)$ and $\mathcal{L}^c(x_0)$. Moreover, $\mathcal{L}(x_0)$ is compact since it is bounded by assumption and closed by the continuity of $\Psi_p(x)$. Therefore, we have $x(\tau_0) \in \mathcal{L}(x_0)$ and $\tau_0 < \tau(x_0)$, implying that

$$\Psi_p(x(s)) > \Psi_p(x_0) \ge \Psi_p(x(\tau_0)) \quad \text{for some } s \in (\tau_0, \tau(x_0)). \tag{17}$$

However, Proposition 3.2(d) says that $\Psi_p(x(\cdot))$ is nonincreasing on $[t_0, \tau(x_0))$, which contradicts (17). This completes the proof of Case (i).

Case (ii): F is Lipschitz continuous. From the proof of part (a), we know that $\nabla \Psi_p(x)$ is Lipschitz continuous. Thus, by Lemma 2.3, we have $\tau(x_0) = \infty$. □

Next, we investigate the convergence of the solution trajectory of (10).


Theorem 4.1 (a) Let x(t) with $t \in [t_0, \tau(x_0))$ be the unique maximal solution to (10). If $\tau(x_0) = \infty$ and $\{x(t)\}$ is bounded, then $\lim_{t \to \infty} \nabla \Psi_p(x(t)) = 0$.

(b) If F is strongly monotone or a uniform P-function, then $\mathcal{L}(x_0)$ is bounded and every accumulation point of the trajectory x(t) is a solution to the NCP.

Proof. With Proposition 3.2 (b) and (d) and Proposition 4.2, the arguments are exactly the same as those for [25, Corollary 4.3]. Thus, we omit them. □

From Proposition 4.1 (a), every solution $x^*$ to the NCP is an equilibrium point of the neural network (10). If, in addition, $x^*$ is an isolated equilibrium point of (10), then we can show that $x^*$ is not only Lyapunov stable but also asymptotically stable.

Theorem 4.2 Let $x^*$ be an isolated equilibrium point of the neural network (10). Then, $x^*$ is Lyapunov stable for (10), and furthermore, it is asymptotically stable.

Proof. Since $x^*$ is a solution to the NCP, $\Psi_p(x^*) = 0$. In addition, since $x^*$ is an isolated equilibrium point of (10), there exists a neighborhood $\Omega \subseteq \mathbb{R}^n$ of $x^*$ such that

$$\nabla \Psi_p(x^*) = 0 \quad \text{and} \quad \nabla \Psi_p(x) \ne 0 \quad \forall x \in \Omega \setminus \{x^*\}.$$

Next, we argue that $\Psi_p(x)$ is indeed a Lyapunov function at $x^*$ over the set Ω for (10) by showing that the conditions in (13) are satisfied. First, notice that $\Psi_p(x) \ge 0$. Suppose that there is an $\bar{x} \in \Omega \setminus \{x^*\}$ such that $\Psi_p(\bar{x}) = 0$. Then, by formula (15) and Lemma 2.2(d), we have $\nabla \Psi_p(\bar{x}) = 0$, i.e., $\bar{x}$ is also an equilibrium point of (10), which contradicts the assumption that $x^*$ is an isolated equilibrium point in Ω. Thus, $\Psi_p(x) > 0$ for any $x \in \Omega \setminus \{x^*\}$. This together with (16) shows that the conditions in (13) are satisfied, and hence $\Psi_p(x)$ is a Lyapunov function at $x^*$ over the set Ω for (10). Therefore, $x^*$ is Lyapunov stable by Lemma 2.5(a).

Now, we show that $x^*$ is asymptotically stable. Since $x^*$ is isolated, from (16) we have

$$\frac{d\Psi_p(x(t))}{dt} < 0 \quad \forall x(t) \in \Omega \setminus \{x^*\}.$$

This, by Lemma 2.5 (b), implies that $x^*$ is asymptotically stable. □

Furthermore, using the same arguments, we can prove that the neural network (10) is also exponentially stable if $x^*$ is a regular solution to the NCP. Recall that $x^*$ is a regular solution to the NCP if every element $V \in \partial \Phi_p(x^*)$ is nonsingular.

Theorem 4.3 If $x^*$ is a regular solution of the NCP, then it is exponentially stable.


Remark 4.1 (a) Using arguments similar to those used in Proposition 3.2 of [13], we can prove that $x^*$ is regular if $\nabla F_{\alpha\alpha}(x^*)$ is nonsingular and the Schur complement of $\nabla F_{\alpha\alpha}(x^*)$ in

$$\begin{pmatrix} \nabla F_{\alpha\alpha}(x^*) & \nabla F_{\alpha\beta}(x^*) \\ \nabla F_{\beta\alpha}(x^*) & \nabla F_{\beta\beta}(x^*) \end{pmatrix}$$

is a P-matrix, where $\alpha := \{ i \mid x_i^* > 0 \}$ and $\beta := \{ i \mid x_i^* = F_i(x^*) = 0 \}$. Clearly, if ∇F is positive definite, then these conditions hold.

(b) From Definition 2.6, if an isolated equilibrium point $x^*$ is exponentially stable, then there exists a δ > 0 such that x(t) with $x_0 = x(t_0)$ and $\|x(t_0) - x^*\| < \delta$ satisfies

$$\|x(t) - x^*\| \le c\,e^{-\omega t}\,\|x(t_0) - x^*\| \quad \forall t \ge t_0,$$

which together with Proposition 3.3 implies that

$$\|x(t) - x^*\| \le \frac{2cL}{\kappa (2 - 2^{1/p})} \sqrt{\Psi_p(x_0)}\; e^{-\omega t} \quad \forall t \ge t_0. \tag{18}$$

Since the strong monotonicity of F implies that F is a uniform P-function and that ∇F is positive definite, from (18) we obtain that the neural network (10) can yield a trajectory with an exponential convergence rate under the condition that F is strongly monotone and Lipschitz continuous.

(c) We observe from (18) that, when p increases, the coefficient of $e^{-\omega t}$ on the right-hand side becomes smaller, which in turn implies that a larger p yields a better convergence rate. This agrees with the result obtained in [2] for a descent-type method based on $\Psi_p$. In addition, from (18) we notice that the energy of the initial state, i.e., $\Psi_p(x_0)$, also has an influence on the convergence rate. A higher initial energy leads to a worse convergence rate.
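The dependence on p described in Remark 4.1(c) comes from the factor $2L / (\kappa (2 - 2^{1/p}))$. The short Python sketch below tabulates that factor for a few values of p. It assumes L = κ = 1 purely for illustration and holds c, ω, and $\Psi_p(x_0)$ fixed, although in practice $\Psi_p(x_0)$ itself changes with p; the function name is hypothetical.

```python
def error_coefficient(p, L=1.0, kappa=1.0):
    """Factor 2L / (kappa * (2 - 2**(1/p))) appearing in the bound (18)."""
    return 2.0 * L / (kappa * (2.0 - 2.0 ** (1.0 / p)))

for p in (1.1, 1.5, 2.0, 5.0, 20.0):
    print(f"p = {p:5.1f}   coefficient = {error_coefficient(p):.4f}")
```

The factor decreases monotonically in p and approaches 2L/κ as p grows, so most of the gain is obtained when moving away from values of p close to 1.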

### 5 Simulation results

In this section, we test four well-known nonlinear complementarity problems by our neural network model (10). For each test problem, we also compare the numerical performance of the proposed neural network with various values of p and various initial states x(t0).

The test instances are described below.

Example 5.1 [32, Example 2] Consider the NCP, where $F : \mathbb{R}^5 \to \mathbb{R}^5$ is given by

$$F(x) = \begin{pmatrix} x_1 + x_2 x_3 x_4 x_5 / 50 \\ x_2 + x_1 x_3 x_4 x_5 / 50 - 3 \\ x_3 + x_1 x_2 x_4 x_5 / 50 - 1 \\ x_4 + x_1 x_2 x_3 x_5 / 50 + 1/2 \\ x_5 + x_1 x_2 x_3 x_4 / 50 \end{pmatrix}.$$

The NCP has only one solution $x^* = (0, 3, 1, 0, 0)$.

Example 5.2 [31, Watson] Consider the NCP, where $F : \mathbb{R}^5 \to \mathbb{R}^5$ is given by

$$F(x) = 2 \exp\left( \sum_{i=1}^{5} (x_i - i + 2)^2 \right) \begin{pmatrix} x_1 + 1 \\ x_2 \\ x_3 - 1 \\ x_4 - 2 \\ x_5 - 3 \end{pmatrix}.$$

Note that F is not a $P_0$-function on $\mathbb{R}^n$. The solution to this problem is $x^* = (0, 0, 1, 2, 3)$.

Example 5.3 [24, Kojima-Shindo] Consider the NCP, where $F : \mathbb{R}^4 \to \mathbb{R}^4$ is given by

$$F(x) = \begin{pmatrix} 3x_1^2 + 2x_1 x_2 + 2x_2^2 + x_3 + 3x_4 - 6 \\ 2x_1^2 + x_1 + x_2^2 + 3x_3 + 2x_4 - 2 \\ 3x_1^2 + x_1 x_2 + 2x_2^2 + 2x_3 + 3x_4 - 1 \\ x_1^2 + 3x_2^2 + 2x_3 + 3x_4 - 3 \end{pmatrix}.$$

This is a non-degenerate NCP and the solution is $x^* = (\sqrt{6}/2, 0, 0, 1/2)$.

Example 5.4 [24, Kojima-Shindo] Consider the NCP, where $F : \mathbb{R}^4 \to \mathbb{R}^4$ is given by

$$F(x) = \begin{pmatrix} 3x_1^2 + 2x_1 x_2 + 2x_2^2 + x_3 + 3x_4 - 6 \\ 2x_1^2 + x_1 + x_2^2 + 10x_3 + 2x_4 - 2 \\ 3x_1^2 + x_1 x_2 + 2x_2^2 + 2x_3 + 9x_4 - 9 \\ x_1^2 + 3x_2^2 + 2x_3 + 3x_4 - 3 \end{pmatrix}.$$

This is a degenerate NCP and has two solutions $x^* = (\sqrt{6}/2, 0, 0, 1/2)$ and $x^* = (1, 0, 3, 0)$.
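The four test mappings can be written down directly and the stated solutions checked against the complementarity conditions (1). The Python sketch below does this with the natural residual $\min\{x, F(x)\}$, which vanishes exactly at solutions of the NCP. The function names are assumptions made for this illustration; the formulas are transcribed from Examples 5.1 to 5.4.

```python
import numpy as np

def F_51(x):
    x1, x2, x3, x4, x5 = x
    return np.array([
        x1 + x2 * x3 * x4 * x5 / 50,
        x2 + x1 * x3 * x4 * x5 / 50 - 3,
        x3 + x1 * x2 * x4 * x5 / 50 - 1,
        x4 + x1 * x2 * x3 * x5 / 50 + 0.5,
        x5 + x1 * x2 * x3 * x4 / 50,
    ])

def F_52(x):
    # (x_i - i + 2) for i = 1..5 is x - (-1, 0, 1, 2, 3)
    shifted = np.asarray(x, dtype=float) - np.array([-1.0, 0.0, 1.0, 2.0, 3.0])
    return 2.0 * np.exp(np.sum(shifted ** 2)) * shifted

def F_53(x):
    x1, x2, x3, x4 = x
    return np.array([
        3 * x1**2 + 2 * x1 * x2 + 2 * x2**2 + x3 + 3 * x4 - 6,
        2 * x1**2 + x1 + x2**2 + 3 * x3 + 2 * x4 - 2,
        3 * x1**2 + x1 * x2 + 2 * x2**2 + 2 * x3 + 3 * x4 - 1,
        x1**2 + 3 * x2**2 + 2 * x3 + 3 * x4 - 3,
    ])

def F_54(x):
    x1, x2, x3, x4 = x
    return np.array([
        3 * x1**2 + 2 * x1 * x2 + 2 * x2**2 + x3 + 3 * x4 - 6,
        2 * x1**2 + x1 + x2**2 + 10 * x3 + 2 * x4 - 2,
        3 * x1**2 + x1 * x2 + 2 * x2**2 + 2 * x3 + 9 * x4 - 9,
        x1**2 + 3 * x2**2 + 2 * x3 + 3 * x4 - 3,
    ])

def ncp_residual(x, F):
    """Norm of the natural residual min(x, F(x)); zero exactly at NCP solutions."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(np.minimum(x, F(x)))

s = np.sqrt(6.0) / 2.0
checks = {
    "5.1": ncp_residual([0, 3, 1, 0, 0], F_51),
    "5.2": ncp_residual([0, 0, 1, 2, 3], F_52),
    "5.3": ncp_residual([s, 0, 0, 0.5], F_53),
    "5.4a": ncp_residual([s, 0, 0, 0.5], F_54),
    "5.4b": ncp_residual([1, 0, 3, 0], F_54),
}
for name, value in checks.items():
    print(name, value)
```

In Example 5.4 the first solution has $x_3 = F_3(x) = 0$, which is the degeneracy mentioned in the text; the same point in Example 5.3 has $F_3(x) = 5 > 0$.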

The numerical implementation is coded in MATLAB 7.0, and the ordinary differential equation solver adopted is ode23, which uses a Runge-Kutta (2, 3) formula. We first test the influence of the parameter p on the value of $\|x(t) - x^*\|$. Figures 2–5 in the appendix describe how $\|x(t) - x^*\|$ varies with p for these instances with the initial states $x_0 = (10^{-2}, 1, 0.5, 10^{-2}, 10^{-2})^T$, $x_0 = (10^{-2}, 10^{-2}, 0.5, 0.5, 0.5)^T$, $x_0 = (2, 10^{-2}, 10^{-2}, 0.1)^T$, and $x_0 = (10^{-3}, 10^{-3}, 10^{-3}, 10^{-3})^T$, respectively. In the tests, the design parameter ρ in the neural network (10) is set to 1000. From Figures 2–5, we see that, when p = 1.1, the neural network (10) generates the slowest decrease of $\|x(t) - x^*\|$ for all test instances, whereas when p = 20 it generates the fastest decrease. This verifies the analysis of Remark 4.1 (c). We should emphasize that the conclusion in Remark 4.1 (c) requires the initial state $x_0$ to be sufficiently close to an equilibrium point. If this condition is not satisfied, we cannot draw such a conclusion; see Figure 6.

We use Example 5.1 to show how the value of $\|x(t) - x^*\|$ varies with the initial state $x_0$. Figure 7 describes the convergence behavior of $\|x(t) - x^*\|$ with initial states $x_0^{(1)} = (1, 1, 1, 1, 1)^T$, $x_0^{(2)} = (5, 5, 5, 5, 5)^T$, and $x_0^{(3)} = (10, 10, 10, 10, 10)^T$. The initial energies corresponding to these three states are $\Psi_p(x_0^{(1)}) = 5.814$, $\Psi_p(x_0^{(2)}) = 39.367$, and $\Psi_p(x_0^{(3)}) = 226.333$, respectively. In the tests, we choose p = 1.8 and ρ = 1000. Figure 7 shows that a larger initial energy yields a slower decrease of the error $\|x(t) - x^*\|$ if the initial state is close to the solution of the NCP. This agrees with the analysis in Remark 4.1(c).

The convergence behavior of x(t) from several initial states with a fixed p and ρ = 1000 for each example is shown in Figures 8–12. The transient behavior of x(t) for Example 5.4 is depicted in Figure 11 and Figure 12 since there are two solutions for this problem. More specifically, we test 12 random initial points for the NCP, 9 of which converge to $(\sqrt{6}/2, 0, 0, 1/2)$; the remaining 3 converge to $(1, 0, 3, 0)$. When finding the solution trajectory x(t), we employ $\|\nabla \Psi_p(x(t))\| \le 10^{-5}$ as the stopping criterion.
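The simulation setup can be mimicked in Python. The sketch below is an illustration and not the authors' code: it integrates the network (10) with SciPy's `solve_ivp` using the explicit Runge-Kutta `RK23` method (a (2, 3) pair comparable in spirit to `ode23`) and stops once $\|\nabla \Psi_p(x(t))\| \le 10^{-5}$. The toy map F(x) = x + q, the values p = 3 and ρ = 10, the tolerances, and all function names are assumptions chosen so that the solution $x^* = (1, 0)$ is known in closed form; none of the four test instances is reproduced here.

```python
import numpy as np
from scipy.integrate import solve_ivp

P, RHO = 3.0, 10.0
M = np.eye(2)                      # Jacobian of the toy map F(x) = M x + q
q = np.array([-1.0, 1.0])          # NCP solution is x* = (1, 0)

def F(x):
    return M @ x + q

def grad_merit(x):
    """Gradient of Psi_p via formula (15): grad_a psi + grad F(x) grad_b psi."""
    a, b = x, F(x)
    norm = (np.abs(a) ** P + np.abs(b) ** P) ** (1.0 / P)
    phi = norm - (a + b)
    safe = np.where(norm > 0.0, norm, 1.0)   # avoid 0/0 at (a, b) = (0, 0)
    da = np.sign(a) * np.abs(a) ** (P - 1) / safe ** (P - 1) - 1.0
    db = np.sign(b) * np.abs(b) ** (P - 1) / safe ** (P - 1) - 1.0
    # phi = 0 wherever norm = 0, so both products vanish there, as in Lemma 2.2(b)
    return da * phi + M.T @ (db * phi)

def network(t, x):
    """Right-hand side of the neural network (10)."""
    return -RHO * grad_merit(x)

def small_gradient(t, x):
    """Event function: crosses zero when the stopping criterion is met."""
    return np.linalg.norm(grad_merit(x)) - 1e-5

small_gradient.terminal = True

sol = solve_ivp(network, (0.0, 5.0), [2.0, 2.0], method="RK23",
                events=small_gradient, rtol=1e-8, atol=1e-10)
x_final = sol.y[:, -1]
print("final state:", x_final, "at t =", sol.t[-1])
```

Because this toy F is strongly monotone and Lipschitz continuous, Remark 4.1(b) applies and the trajectory approaches $x^*$ at an exponential rate; a larger ρ shortens the time scale but also forces an explicit solver such as `RK23` to take smaller steps.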

To sum up, the neural network (10) is a better alternative to the network based on the FB function $\phi_{FB}$ if an appropriate p is chosen. Based on the analysis of Remark 4.1 (c) and the above numerical simulations, we see that, to obtain a better convergence rate of the trajectory x(t), the parameter p cannot be set too small. In addition, we should emphasize that the initial state $x(t_0)$ has a great influence on the convergence behavior of $\|x(t) - x^*\|$.

To end this section, we answer a natural question: what advantages does our proposed neural network have over the existing ones? To answer this, we summarize below what we have observed from the numerical experiments and theoretical results.

• We compare our neural network model with some existing models that also work for the NCP, for instance, the ones used in [6, 32, 33]. At first glance, the projection-based neural network models in [6, 32, 33] appear to have lower complexity. However, in tests on MCPLIB benchmark problems, we observe that the difference in numerical performance is marginal.

• Our proposed model seems to have better properties from a theoretical viewpoint. The neural network models used in [6, 32, 33] require monotonicity (strong monotonicity) of F to guarantee Lyapunov stability (exponential stability). In contrast, such conditions are not needed for our neural network model. In fact, it can be verified that all F's in the previous examples are non-monotone except in Example 5.2 (by checking the positive semidefiniteness of their Jacobian matrices).

• For the following special NCP:

$$x = (x_1, x_2, x_3) \ge 0, \quad F(x) = (x_1, -x_2, -x_3) \ge 0, \quad \langle x, F(x) \rangle = x_1^2 - x_2^2 - x_3^2 = 0,$$

it is easy to verify that the unique solution is (0, 0, 0), which our neural network model finds easily. In contrast, the solution trajectory of the model in [32] diverges.

• Changing the initial point may not have much effect on our neural network model, whereas it does for other existing models. For instance, choosing $x_0 = (12, -12, 12, -12, 12)$ as the initial point in Example 5.1 causes the solution trajectory of the neural network model used in [32] to diverge, while it does not affect our neural network model.

### 6 Conclusions

In this paper, we have studied a class of neural networks based on the generalized FB function $\phi_p$ defined in (5). We establish the Lyapunov stability, the asymptotic stability, and the exponential stability of the neural network. In addition, we analyze the influence of the parameter p on the convergence rate of the trajectory (or the local convergence behavior of the error $\|x(t) - x^*\|$) and find that a larger p leads to a better convergence rate. This agrees with the result obtained in [2] for a descent-type method based on $\phi_p$, and it also indicates how to choose a suitable p in practice. Numerical experiments verify the theoretical results. The advantages of our proposed neural network compared to other existing neural networks are reported as well. One future topic is to modify the proposed neural network model for various optimization problems and to establish the corresponding stability results.

### References

[1] J.-S. Chen (2006), The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem, Journal of Global Optimization, vol. 36, 565–580.

[2] J.-S. Chen, H.-T. Gao and S.-H. Pan (2009), A derivative-free R-linearly convergent algorithm based on the generalized Fischer-Burmeister merit function, Journal of Computational and Applied Mathematics, vol. 232, 455–471.

[3] J.-S. Chen and S.-H. Pan (2008), A family of NCP functions and a descent method for the nonlinear complementarity problem, Computational Optimization and Applications, vol. 40, 389–404.

[4] J.-S. Chen and S.-H. Pan (2008), A regularization semismooth Newton method based on the generalized Fischer-Burmeister function for P0-NCPs, Journal of Computational and Applied Mathematics, vol. 220, 464–479.

[5] F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.

[6] C. Dang, Y. Leung, X. Gao, and K. Chen (2004), Neural networks for nonlinear and mixed complementarity problems and their applications, Neural Networks, vol. 17, 271–283.

[7] S. Effati, A. Ghomashi, and A. R. Nazemi (2007), Application of projection neural network in solving convex programming problems, Applied Mathematics and Computation, vol. 188, 1103–1114.

[8] S. Effati and A. R. Nazemi (2006), Neural network and its application for solving linear and quadratic programming problems, Applied Mathematics and Computation, vol. 172, 305–331.

[9] M. C. Ferris, O. L. Mangasarian, and J.-S. Pang, editors, Complementarity: Applications, Algorithms and Extensions, Kluwer Academic Publishers, Dordrecht, 2001.

[10] A. Fischer (1992), A special Newton-type optimization method, Optimization, vol. 24, 269–284.

[11] A. Fischer (1997), Solution of the monotone complementarity problem with locally Lipschitzian functions, Mathematical Programming, vol. 76, 513–532.

[12] F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Volumes I and II, Springer-Verlag, New York, 2003.

[13] F. Facchinei and J. Soares (1997), A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization, vol. 7, 225–247.

[14] C. Geiger and C. Kanzow (1996), On the resolution of monotone complementarity problems, Computational Optimization and Applications, vol. 5, 155–173.

[15] Q. Han, L.-Z. Liao, H. Qi, and L. Qi (2001), Stability analysis of gradient-based neural networks for optimization problems, Journal of Global Optimization, vol. 19, 363–381.

[16] J. J. Hopfield and D. W. Tank (1985), Neural computation of decisions in optimization problems, Biological Cybernetics, vol. 52, 141–152.

[17] X. Hu and J. Wang (2006), Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network, IEEE Transactions on Neural Networks, vol. 17, 1487–1499.

[18] X. Hu and J. Wang (2007), A recurrent neural network for solving a class of general variational inequalities, IEEE Transactions on Systems, Man, and Cybernetics-B, vol. 37, 528–539.

[19] H. Jiang (1996), Unconstrained minimization approaches to nonlinear complementarity problems, Journal of Global Optimization, vol. 9, 169–181.

[20] C. Kanzow (1996), Nonlinear complementarity as unconstrained optimization, Journal of Optimization Theory and Applications, vol. 88, 139–155.

[21] C. Kanzow and M. Fukushima (1996), Equivalence of the generalized complementarity problem to differentiable unconstrained minimization, Journal of Optimization Theory and Applications, vol. 90, 581–603.

[22] H. K. Khalil (1996), Nonlinear Systems, Upper Saddle River, NJ: Prentice Hall.

[23] M. P. Kennedy and L. O. Chua (1988), Neural network for nonlinear programming, IEEE Transactions on Circuits and Systems, vol. 35, 554–562.

[24] M. Kojima and S. Shindo (1986), Extensions of Newton and quasi-Newton methods to systems of PC¹ equations, Journal of Operations Research Society of Japan, vol. 29, 352–374.

[25] L.-Z. Liao, H. Qi, and L. Qi (2001), Solving nonlinear complementarity problems with neural networks: a reformulation method approach, Journal of Computational and Applied Mathematics, vol. 131, 342–359.

[26] R. K. Miller and A. N. Michel (1982), Ordinary Differential Equations, Academic Press.

[27] S-K. Oh, W. Pedrycz, and S-B. Roh (2006), Genetically optimized fuzzy polynomial neural networks with fuzzy set-based polynomial neurons, Information Sciences, vol. 176, 3490–3519.

[28] A. Shortt, J. Keating, L. Moulinier, and C. Pannell (2005), Optical implementation of the Kak neural network, Information Sciences, vol. 171, 273–287.

[29] D. W. Tank and J. J. Hopfield (1986), Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit, IEEE Transactions on Circuits and Systems, vol. 33, 533–541.

[30] P. Tseng (1996), Global behaviour of a class of merit functions for the nonlinear complementarity problem, Journal of Optimization Theory and Applications, vol. 89, 17–37.

[31] L. T. Watson (1979), Solving the nonlinear complementarity problem by a homotopy method, SIAM Journal on Control and Optimization, vol. 17, 36–46.

[32] Y. Xia, H. Leung, and J. Wang (2002), A projection neural network and its application to constrained optimization problems, IEEE Transactions on Circuits and Systems-I, vol. 49, 447–458.

[33] Y. Xia, H. Leung, and J. Wang (2004), A general projection neural network for solving monotone variational inequalities and related optimization problems, IEEE Transactions on Neural Networks, vol. 15, 318–328.

[34] Y. Xia, H. Leung, and J. Wang (2005), A recurrent neural network for solving nonlinear convex programs subject to linear constraints, IEEE Transactions on Neural Networks, vol. 16, 379–386.

[35] M. Yashtini and A. Malek (2007), Solving complementarity and variational inequalities problems using neural networks, Applied Mathematics and Computation, vol. 190, 216–230.

[36] S. H. Zak, V. Upatising, and S. Hui (1995), Solving linear programming problems with neural networks: a comparative study, IEEE Transactions on Neural Networks, vol. 6, 94–104.

[37] G. Zhang (2007), A neural network ensemble method with jittered training data for time series forecasting, Information Sciences, vol. 177, 5329–5340.

Appendix

Figure 2: Convergence behavior of the error ∥x(t) − x*∥ in Example 5.1 with given x0. (Plot: horizontal axis, time from 0 to 50 ms; vertical axis, ∥x(t) − x*∥ on a logarithmic scale from 10^−7 to 10^1; curves for p = 1.1, 1.5, 3, and 20.)

Figure 3: Convergence behavior of the error ∥x(t) − x*∥ in Example 5.2 with given x0. (Plot: horizontal axis, time from 0 to 50 ms; vertical axis, ∥x(t) − x*∥ on a logarithmic scale from 10^−7 to 10^1; curves for p = 1.1, 1.5, 3, and 20.)

Figure 4: Convergence behavior of the error ∥x(t) − x*∥ in Example 5.3 with given x0. (Plot: horizontal axis, time from 0 to 50 ms; vertical axis, ∥x(t) − x*∥ on a logarithmic scale from 10^−7 to 10^1; curves for p = 1.1, 1.5, 3, and 20.)

Figure 5: Convergence behavior of the error ∥x(t) − x*∥ in Example 5.4 with given x0. (Plot: horizontal axis, time from 0 to 50 ms; vertical axis, ∥x(t) − x*∥ on a logarithmic scale from 10^−7 to 10^1; curves for p = 1.1, 1.5, 3, and 20.)
