to appear in Neurocomputing, 2019

Neural networks based on three classes of NCP-functions for solving nonlinear complementarity problems

Jan Harold Alcantara1 Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

Jein-Shan Chen 2 Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

February 6, 2019 (revised on April 15, 2019)

Abstract. In this paper, we consider a family of neural networks for solving nonlinear complementarity problems (NCP). The neural networks are constructed from the merit functions based on three classes of NCP-functions: the generalized natural residual function and its two symmetrizations. In this paper, we first characterize the stationary points of the induced merit functions. Growth behavior of the complementarity functions is also described, as this will play an important role in describing the level sets of the merit functions. In addition, the stability of the steepest descent-based neural network model for NCP is analyzed. We provide numerical simulations to illustrate the theoretical results, and also compare the proposed neural networks with existing neural networks based on other well-known NCP-functions. Numerical results indicate that the performance of the neural network is better when the parameter p associated with the NCP-function is smaller. The efficiency of the neural networks in solving NCPs is also reported.

Keywords. NCP-function; natural residual function; complementarity problem; neural network; stability.

1E-mail: 80640005s@ntnu.edu.tw

2Corresponding author. E-mail: jschen@math.ntnu.edu.tw. The research is supported by the Ministry of Science and Technology, Taiwan.


1 Introduction and Motivation

Given a function F : IRn → IRn, the nonlinear complementarity problem (NCP) is to find a point x ∈ IRn such that

x ≥ 0,  F(x) ≥ 0,  ⟨x, F(x)⟩ = 0,    (1)

where ⟨·, ·⟩ is the Euclidean inner product and ≥ means the component-wise order on IRn. Throughout this paper, we assume that F is continuously differentiable, and let F = (F_1, . . . , F_n)^T with F_i : IRn → IR for i = 1, . . . , n.

For decades, substantial research efforts have been devoted to the study of nonlinear complementarity problems because of their wide range of applications in many areas such as optimization, operations research, engineering, and economics [8, 9, 12, 48]. Some source problems of NCPs include models of equilibrium problems in the aforementioned fields and complementarity conditions in constrained optimization problems [9, 12].

There are many methods for solving the NCP (1). In general, these solution methods may be categorized into two classes, depending on whether or not they make use of the so-called NCP-function (see Definition 2.1). Some techniques that usually exploit NCP-functions include the merit function approach [11, 19, 26], nonsmooth Newton methods [10, 45], smoothing methods [4, 31], and the regularization approach [17, 37]. On the other hand, the interior-point method [29, 30] and the proximal point algorithm [33] are some well-known approaches to solve (1) which do not utilize NCP-functions in general. The excellent monograph of Facchinei and Pang [9] provides a thorough survey and discussion of solution methods for complementarity problems and variational inequalities.

The above numerical approaches can efficiently solve the NCP; however, it is often desirable in scientific and engineering applications to obtain a real-time solution. One promising approach that can provide real-time solutions is the use of neural networks, which were first introduced in optimization by Hopfield and Tank in the 1980s [13, 38]. Neural networks based on circuit implementation exhibit real-time processing. Furthermore, prior research shows that neural networks can be used efficiently in linear and nonlinear programming, variational inequalities and nonlinear complementarity problems [2, 7, 14, 15, 20, 23, 42, 43, 44, 47, 49], as well as in other fields [25, 28, 36, 34, 39, 40, 46, 50, 51, 55].

Motivated by the preceding discussion, we construct a new family of neural networks based on recently discovered discrete-type NCP-functions to solve NCPs. Neural networks based on the Fischer-Burmeister (FB) function [23] and the generalized Fischer-Burmeister function [2] have already been studied. The latter NCP-functions, which have been extensively used in the different solution methods, are strongly semismooth functions, which often provide efficient performance [9]. In this paper, we explore the use of smooth NCP-functions as building blocks of the proposed neural networks. Moreover, the NCP-functions we consider herein have piecewise-defined formulas, as opposed to the FB and generalized FB functions, which have simple formulations. In turn, the subsequent analysis is more complicated. Nevertheless, we show that the proposed neural networks may offer promising results too. The analysis and numerical reports in this paper, on the other hand, pave the way for the use of piecewise-defined NCP-functions.

This paper is organized as follows: In Section 2, we revisit equivalent reformulations of the NCP (1) using NCP-functions. We also elaborate on the purpose and limitations of the paper. In Section 3, we review some mathematical preliminaries related to nonlinear mappings and stability analysis. We also summarize some important properties of the three classes of NCP-functions we used in constructing the neural networks. In Section 4, we describe the general properties of the neural networks, which include the characterization of stationary points of the induced merit functions. In Section 5, we look at the growth behavior of the three classes of NCP-functions considered. This result will be used to prove the boundedness of the level sets of the induced merit functions. We also prove some stability properties of the neural networks. In Section 6, we present the results of our numerical simulations. Conclusions and some recommendations for future studies are discussed in Section 7.

Throughout the paper, IRn denotes the space of n-dimensional real column vectors, IRm×n denotes the space of m × n real matrices, and AT denotes the transpose of A ∈ IRm×n. For any differentiable function f : IRn → IR, ∇f (x) means the gradient of f at x. For any differentiable mapping F = (F1, . . . , Fm)T : IRn → IRm, ∇F (x) = [∇F1(x) · · · ∇Fm(x)] ∈ IRn×m denotes the transposed Jacobian of F at x. We assume that p is an odd integer greater than 1, unless otherwise specified.

2 Overview and Contributions of the Paper

In this section, we give an overview of this research. We begin by looking at equivalent reformulations of the nonlinear complementarity problem (1) using NCP-functions, which are defined as follows.

Definition 2.1 A function φ : IR × IR → IR is called an NCP-function if it satisfies φ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0.

The well-known natural residual function given by

φ_NR(a, b) = a − (a − b)_+ = min{a, b}

is an example of an NCP-function, which is widely used in solving the NCP. Recently, in [3], the discrete-type generalization of φ_NR was proposed, described by

φ^p_NR(a, b) = a^p − (a − b)^p_+,  where p > 1 is an odd integer.    (2)

It is shown in [3] that φ^p_NR is twice continuously differentiable. However, its surface is not symmetric, which may result in difficulties in designing and analyzing solution methods [16]. To overcome this, two symmetrizations of φ^p_NR are presented in [1]. A natural symmetrization of φ^p_NR is given by

φ^p_S−NR(a, b) = { a^p − (a − b)^p   if a > b,
                   a^p = b^p          if a = b,
                   b^p − (b − a)^p   if a < b.    (3)

The above NCP-function is symmetric, but is only differentiable on {(a, b) | a ≠ b or a = b = 0}. It was, however, shown in [16] that φ^p_S−NR is semismooth and directionally differentiable. The second symmetrization of φ^p_NR is described by

ψ^p_S−NR(a, b) = { a^p b^p − (a − b)^p b^p   if a > b,
                   a^p b^p = a^{2p}           if a = b,
                   a^p b^p − (b − a)^p a^p   if a < b,    (4)

which possesses both differentiability and symmetry. The functions φ^p_NR, φ^p_S−NR and ψ^p_S−NR are three classes of the four recently discovered discrete-type families of NCP-functions, together with the discrete-type generalization of the Fischer-Burmeister function given by

φ^p_D−FB(a, b) = (√(a^2 + b^2))^p − (a + b)^p.    (5)

A comprehensive discussion of their properties is presented in [16].
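As a concrete illustration, the following Python sketch (our own illustration, not code from the paper) evaluates the three NCP-functions (2), (3) and (4) pointwise; it can be used to check numerically that each vanishes exactly when a ≥ 0, b ≥ 0 and ab = 0.

```python
def phi_nr(a, b, p=3):
    """Generalized natural residual function (2): a^p - (a - b)_+^p, with p > 1 odd."""
    return a**p - max(a - b, 0.0)**p

def phi_snr(a, b, p=3):
    """First symmetrization (3) of phi_nr."""
    if a > b:
        return a**p - (a - b)**p
    elif a < b:
        return b**p - (b - a)**p
    return a**p            # a = b case: the value is a^p (= b^p)

def psi_snr(a, b, p=3):
    """Second symmetrization (4), which is both smooth and symmetric."""
    if a > b:
        return a**p * b**p - (a - b)**p * b**p
    elif a < b:
        return a**p * b**p - (b - a)**p * a**p
    return a**(2 * p)      # a = b case: a^p b^p = a^(2p)

# Quick check of the NCP-function property at a few sample points:
for (a, b) in [(2.0, 0.0), (0.0, 3.0), (0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]:
    print((a, b), phi_nr(a, b), phi_snr(a, b), psi_snr(a, b))
# The first three pairs satisfy a, b >= 0 with ab = 0, so all three functions return 0;
# the last two pairs violate complementarity, so the returned values are nonzero.
```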

To see how an NCP-function φ can be useful in solving NCP (1), we define Φ : IRn → IRn by

Φ(x) = [φ(x_1, F_1(x)), . . . , φ(x_n, F_n(x))]^T.    (6)

It is easy to see that x solves NCP (1) if and only if Φ(x) = 0 (see also Proposition 4.1(a)). Thus, the NCP is equivalent to the nonlinear system of equations Φ(x) = 0.

Meanwhile, if φ is an NCP-function, then ψ : IR × IR → IR+ given by

ψ(a, b) := (1/2)|φ(a, b)|^2    (7)

is also an NCP-function. Accordingly, if we define Ψ : IRn → IR+ by

Ψ(x) = Σ_{i=1}^{n} ψ(x_i, F_i(x)) = (1/2)‖Φ(x)‖^2,    (8)

then the NCP can be reformulated as a minimization problem min_{x∈IRn} Ψ(x). Hence, Ψ given by (8) is a merit function for the NCP, that is, its global minimizer coincides with the solution of the NCP. Consequently, it is only natural to consider the steepest descent-based neural network

dx(t)/dt = −ρ∇Ψ(x(t)),  x(t_0) = x_0,    (9)

where ρ > 0 is a time-scaling factor. The above neural network (9) is also motivated by the ones considered in [23] and in [2], where the NCP-functions used are the Fischer-Burmeister (FB) function given by

φ_FB(a, b) = √(a^2 + b^2) − (a + b),    (10)

and the generalized Fischer-Burmeister function given by

φ^p_FB(a, b) = ‖(a, b)‖_p − (a + b),  where p ∈ (1, +∞),    (11)

respectively. We aim to compare the neural networks based on the generalized natural-residual functions (2), (3) and (4) with the well-studied networks based on the FB functions (10) and (11).
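To make the reformulation concrete, here is a minimal Python sketch (our illustration, not the paper's implementation; the helper names make_merit, grad_num and simulate are our own) that builds Φ in (6) and Ψ in (8) from a given NCP-function and integrates the gradient flow (9) with a SciPy stiff solver and a finite-difference gradient.

```python
import numpy as np
from scipy.integrate import solve_ivp

def make_merit(F, phi):
    """Build Phi as in (6) and Psi as in (8) from F: R^n -> R^n and an NCP-function phi."""
    def Phi(x):
        Fx = F(x)
        return np.array([phi(xi, Fi) for xi, Fi in zip(x, Fx)])
    def Psi(x):
        r = Phi(x)
        return 0.5 * (r @ r)
    return Phi, Psi

def grad_num(f, x, h=1e-6):
    """Central-difference gradient; the paper works with the analytic gradient (13) instead."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def simulate(F, phi, x0, rho=1e3, t_end=1e-2):
    """Integrate the steepest-descent neural network (9): dx/dt = -rho * grad Psi(x)."""
    _, Psi = make_merit(F, phi)
    rhs = lambda t, x: -rho * grad_num(Psi, x)
    return solve_ivp(rhs, (0.0, t_end), np.asarray(x0, dtype=float), method="BDF")
```

Any of the NCP-functions sketched above (phi_nr, phi_snr, psi_snr) can be passed as phi; the time-scaling factor rho plays the role described after (9).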

One of the contributions of this paper lies in establishing the theoretical properties of the generalized natural residual functions. These are fundamental in designing NCP-based solution methods, and in this paper, we use the neural network approach. Basic properties of these functions are already presented in [16]. The purpose of this paper is to elaborate some more properties and applications of the newly discovered discrete-type classes of NCP-functions given by (2), (3) and (4). Specifically, we look at the properties of their induced merit functions given by (8). First, it is important for us to determine the correspondence between the solutions of NCP (1) and the stationary points of Ψ.

From the above discussion (also see Proposition 4.1(d)), we already know that an NCP solution is a stationary point. On the other hand, we also want to determine which stationary points of Ψ are solutions to the NCP. For certain NCP functions such as the Mangasarian and Solodov function [19], FB function [11] and generalized FB function [5], a stationary point of the merit function was shown to be a solution to the NCP when F is monotone or a P0-function. It should be pointed out that these NCP-functions possess the following nice properties:

(P1) ∇aψ(a, b) · ∇bψ(a, b) ≥ 0 for all (a, b) ∈ IR2; and

(P2) For all (a, b) ∈ IR2, ∇aψ(a, b) = 0 ⇐⇒ ∇bψ(a, b) = 0 ⇐⇒ φ(a, b) = 0.

However, these properties are not possessed by φ^p_NR, φ^p_S−NR and ψ^p_S−NR, which leads to some difficulties in the subsequent analysis. Hence, we seek other conditions which will guarantee that a stationary point is an NCP solution. Furthermore, we also want to look at the growth behavior of the functions (2), (3) and (4). This will play a key role in characterizing the level sets of the induced merit functions. It must be noted that since the NCP-functions φ^p_S−NR and ψ^p_S−NR are piecewise-defined, the analyses of their growth behavior and of the properties of their induced merit functions are more difficult, as compared with the commonly used FB functions (10) and (11), which have simple formulations.

Another purpose of this paper is to discuss the stability properties of the neural networks based on φ^p_NR, φ^p_S−NR and ψ^p_S−NR. We further look into different examples to see the influence of p on the convergence of trajectories of the neural network to the NCP solution. Finally, we compare the numerical performance of these three types of neural networks with two well-studied neural networks based on the FB function [23] and the generalized FB function [2].

We recall that a solution x is said to be degenerate if {i | xi = Fi(x) = 0} is not empty. Note that if x is degenerate and φ is differentiable at x, then ∇Φ(x) is singular. Consequently, one should not expect locally fast convergence of numerical methods based on smooth NCP-functions if the computed solution is degenerate [9, 18].

Because of the differentiability of φ^p_NR, φ^p_S−NR and ψ^p_S−NR on the feasible region of the NCP problem, it is also expected that the convergence of the trajectories of the neural network (9) to a degenerate solution could be slow. Hence, in this paper, we will give particular attention to nondegenerate NCPs.

3 Preliminaries

In this section, we review some special nonlinear mappings, some properties of φ^p_NR, φ^p_S−NR and ψ^p_S−NR, as well as some tools from stability theory for dynamical systems that will be crucial in our analysis. We begin by recalling concepts related to nonlinear mappings.

Definition 3.1 Let F = (F_1, . . . , F_n)^T : IRn → IRn. Then, the mapping F is said to be

(a) monotone if ⟨x − y, F(x) − F(y)⟩ ≥ 0 for all x, y ∈ IRn;

(b) strictly monotone if ⟨x − y, F(x) − F(y)⟩ > 0 for all x, y ∈ IRn with x ≠ y;

(c) strongly monotone with modulus µ > 0 if ⟨x − y, F(x) − F(y)⟩ ≥ µ‖x − y‖^2 for all x, y ∈ IRn;

(d) a P0-function if max_{1≤i≤n, x_i≠y_i} (x_i − y_i)(F_i(x) − F_i(y)) ≥ 0 for all x, y ∈ IRn with x ≠ y;

(e) a P-function if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) > 0 for all x, y ∈ IRn with x ≠ y;

(f) a uniform P-function with modulus κ > 0 if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) ≥ κ‖x − y‖^2 for all x, y ∈ IRn.

From Definition 3.1, the following one-sided implications can be obtained:

F is strongly monotone =⇒ F is a uniform P -function =⇒ F is a P0-function.


It is known that F is monotone (resp. strictly monotone) if and only if ∇F (x) is positive semidefinite (resp. positive definite) for all x ∈ IRn. In addition, F is a P0-function if and only if ∇F (x) is a P0-matrix for all x ∈ IRn; that is, its principal minors are nonnegative. Further, if ∇F (x) is a P -matrix (that is, its principal minors are positive) for all x ∈ IRn, then F is a P -function. However, we point out that a P -function does not necessarily have a Jacobian which is a P -matrix.
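As a quick sanity check for the matrix conditions used later (for instance in Proposition 4.4 and Proposition 5.1), one can test the P-matrix and P0-matrix properties directly from the principal minors. The following sketch is our own illustration (not from the paper) and is only practical for small n, since it enumerates all principal submatrices.

```python
import numpy as np
from itertools import combinations

def principal_minors(M):
    """Yield every principal minor of the square matrix M (exponentially many in n)."""
    n = M.shape[0]
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            yield np.linalg.det(M[np.ix_(idx, idx)])

def is_P_matrix(M, tol=1e-12):
    """All principal minors strictly positive."""
    return all(m > tol for m in principal_minors(M))

def is_P0_matrix(M, tol=1e-12):
    """All principal minors nonnegative."""
    return all(m > -tol for m in principal_minors(M))

# Example: the identity is a P-matrix; a matrix with a zero principal minor is only P0.
print(is_P_matrix(np.eye(3)), is_P0_matrix(np.array([[0.0, 1.0], [0.0, 1.0]])))
```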

The following characterization of P -matrices and P0-matrices will be useful in our analysis.

Lemma 3.1 A matrix M ∈ IRn×n is a P -matrix (resp. a P0-matrix) if and only if whenever xi(M x)i ≤ 0 (resp. xi(M x)i < 0) for all i, then x = 0.

Proof. Please see [6]. □

The following two lemmas summarize some properties of φ^p_NR, φ^p_S−NR and ψ^p_S−NR that will be useful in our subsequent analysis.

Lemma 3.2 Let p > 1 be an odd integer. Then, the following hold.

(a) The function φ^p_NR is twice continuously differentiable. Its gradient is given by

∇φ^p_NR(a, b) = p [ a^{p−1} − (a − b)^{p−2}(a − b)_+ , (a − b)^{p−2}(a − b)_+ ]^T.

(b) The function φ^p_S−NR is twice continuously differentiable on the set Ω := {(a, b) | a ≠ b}. Its gradient is given by

∇φ^p_S−NR(a, b) = { p [ a^{p−1} − (a − b)^{p−1} , (a − b)^{p−1} ]^T            if a > b,
                    p [ (b − a)^{p−1} , b^{p−1} − (b − a)^{p−1} ]^T            if a < b.

Further, φ^p_S−NR is differentiable at (0, 0) with ∇φ^p_S−NR(0, 0) = [0, 0]^T.

(c) The function ψ^p_S−NR is twice continuously differentiable. Its gradient is given by

∇ψ^p_S−NR(a, b) = { p [ a^{p−1}b^p − (a − b)^{p−1}b^p , a^p b^{p−1} − (a − b)^p b^{p−1} + (a − b)^{p−1}b^p ]^T   if a > b,
                    p [ a^{p−1}b^p , a^p b^{p−1} ]^T = p a^{2p−1} [ 1 , 1 ]^T                                      if a = b,
                    p [ a^{p−1}b^p − (b − a)^p a^{p−1} + (b − a)^{p−1}a^p , a^p b^{p−1} − (b − a)^{p−1}a^p ]^T    if a < b.

Proof. Please see [3, Proposition 2.2], [1, Propositions 2.2 and 3.2], and [16, Proposition 4.3]. □

Lemma 3.3 Let p > 1 be a positive odd integer. Then, the following hold.


(a) If φ ∈ {φ^p_NR, φ^p_S−NR}, then φ(a, b) > 0 ⟺ a > 0, b > 0. On the other hand, ψ^p_S−NR(a, b) ≥ 0 on IR2.

(b) ∇_a φ^p_NR(a, b) · ∇_b φ^p_NR(a, b)
      > 0 on {(a, b) | a > b > 0 or a > b > 2a},
      = 0 on {(a, b) | a ≤ b or a > b = 2a or a > b = 0},
      < 0 otherwise,

∇_a φ^p_S−NR(a, b) · ∇_b φ^p_S−NR(a, b) > 0 on {(a, b) | a > b > 0} ∪ {(a, b) | b > a > 0}, and

∇_a ψ^p_S−NR(a, b) · ∇_b ψ^p_S−NR(a, b) > 0 on the first quadrant IR2_{++}.

(c) If φ ∈ {φ^p_NR, φ^p_S−NR}, then ∇_a φ(a, b) · ∇_b φ(a, b) = 0 provided that φ(a, b) = 0. On the other hand, ψ^p_S−NR(a, b) = 0 ⟺ ∇ψ^p_S−NR(a, b) = 0. In particular, we have ∇_a ψ^p_S−NR(a, b) · ∇_b ψ^p_S−NR(a, b) = 0 provided that ψ^p_S−NR(a, b) = 0.

Proof. Please see [16, Propositions 3.4, 4.5, and 5.4]. □

Next, we recall some material on first-order ordinary differential equations (ODEs):

ẋ(t) = H(x(t)),  x(t_0) = x_0 ∈ IRn,    (12)

where H : IRn → IRn is a mapping. We also introduce three kinds of stability that we will consider later. These materials can be found in ODE textbooks; see [27].

Definition 3.2 A point x̄ is called an equilibrium point or a steady state of the dynamic system (12) if H(x̄) = 0. If there is a neighborhood Ω ⊆ IRn of x̄ such that H(x̄) = 0 and H(x) ≠ 0 for all x ∈ Ω\{x̄}, then x̄ is called an isolated equilibrium point.

Lemma 3.4 Assume that H : IRn → IRn is a continuous mapping. Then, for any t0 ≥ 0 and x0 ∈ IRn, there exists a local solution x(t) for (12) with t ∈ [t0, τ ) for some τ > t0. If, in addition, H is locally Lipschitz continuous at x0, then the solution is unique; if H is Lipschitz continuous in IRn, then τ can be extended to ∞.

Definition 3.3 (Stability in the sense of Lyapunov) Let x(t) be a solution of (12). An isolated equilibrium point x̄ is Lyapunov stable if for any x_0 = x(t_0) and any ε > 0, there exists a δ > 0 such that ‖x(t) − x̄‖ < ε for all t ≥ t_0 whenever ‖x(t_0) − x̄‖ < δ.

Definition 3.4 (Asymptotic stability) An isolated equilibrium point x̄ is said to be asymptotically stable if, in addition to being Lyapunov stable, it has the property that x(t) → x̄ as t → ∞ for all x(t_0) with ‖x(t_0) − x̄‖ < δ.

(9)

Definition 3.5 (Lyapunov function) Let Ω ⊆ IRn be an open neighborhood of x̄. A continuously differentiable function W : IRn → IR is said to be a Lyapunov function at the state x̄ over the set Ω for equation (12) if

W(x̄) = 0,  W(x) > 0  ∀x ∈ Ω\{x̄},

dW(x(t))/dt = ∇W(x(t))^T H(x(t)) ≤ 0,  ∀x ∈ Ω.

Lemma 3.5 (a) An isolated equilibrium point x̄ is Lyapunov stable if there exists a Lyapunov function over some neighborhood Ω of x̄.

(b) An isolated equilibrium point x̄ is asymptotically stable if there is a Lyapunov function over some neighborhood Ω of x̄ such that dW(x(t))/dt < 0 for all x ∈ Ω\{x̄}.

Definition 3.6 (Exponential stability) An isolated equilibrium point x̄ is exponentially stable if there exists a δ > 0 such that every solution x(t) of (12) with the initial condition x(t_0) = x_0 and ‖x(t_0) − x̄‖ < δ is well-defined on [0, +∞) and satisfies

‖x(t) − x̄‖ ≤ c e^{−ωt} ‖x(t_0) − x̄‖  ∀t ≥ t_0,

where c > 0 and ω > 0 are constants independent of the initial point.

The following result will also be helpful in our stability analysis.

Lemma 3.6 Let F be locally Lipschitzian. If all V ∈ ∂F(x) are nonsingular, then there is a neighborhood N(x) of x and a constant C such that for any y ∈ N(x) and any V ∈ ∂F(y), V is nonsingular and ‖V^{−1}‖ ≤ C.

Proof. Please see [32, Proposition 3.1]. □

4 Neural Network Model

In this section, we describe the properties of the neural network (9) based on the functions φ^p_NR, φ^p_S−NR and ψ^p_S−NR. Before this, we first summarize some important properties of Ψ as defined in (8) for general NCP-functions. Proposition 4.1(a) is in fact Lemma 2.2 in [19]. On the other hand, Proposition 4.1(b) and (e) are true for all gradient systems (9).

Proposition 4.1 Let Ψ : IRn → IR+ be defined as in (8), with φ being any NCP-function, and let ψ be as in (7). Suppose that F is continuously differentiable. Then,


(a) Ψ(x) ≥ 0 for all x ∈ IRn. If the NCP (1) has a solution, x is a global minimizer of Ψ(x) if and only if x solves the NCP.

(b) Ψ(x(t)) is a nonincreasing function of t, where x(t) is a solution of (9).

(c) Let x ∈ IRn, and suppose that φ is differentiable at (x_i, F_i(x)) for each i = 1, . . . , n. Then

∇Ψ(x) = ∇_a ψ(x, F(x)) + ∇F(x) ∇_b ψ(x, F(x)),    (13)

where

∇_a ψ(x, F(x)) := [∇_a ψ(x_1, F_1(x)), . . . , ∇_a ψ(x_n, F_n(x))]^T,
∇_b ψ(x, F(x)) := [∇_b ψ(x_1, F_1(x)), . . . , ∇_b ψ(x_n, F_n(x))]^T.

(d) Let x be a solution to the NCP such that φ is differentiable at (xi, Fi(x)) for each i = 1, . . . , n. Then, x is a stationary point of Ψ.

(e) Every accumulation point of a solution x(t) of neural network (9) is an equilibrium point.

Proof. (a) It is clear that Ψ ≥ 0. Notice that Ψ(x) = 0 if and only if Φ(x) = 0, which occurs if and only if φ(x_i, F_i(x)) = 0 for all i. Since φ is an NCP-function, this is equivalent to having x_i ≥ 0, F_i(x) ≥ 0 and x_iF_i(x) = 0. Thus, Ψ(x) = 0 if and only if x ≥ 0, F(x) ≥ 0 and ⟨x, F(x)⟩ = 0. This proves part (a).

(b) The desired result follows from

dΨ(x(t))/dt = ∇Ψ(x(t))^T dx/dt = ∇Ψ(x(t))^T(−ρ∇Ψ(x(t))) = −ρ‖∇Ψ(x(t))‖^2 ≤ 0

for all solutions x(t).

(c) The formula is clear from the chain rule.

(d) First, note that from equation (7), we have ∇ψ(a, b) = φ(a, b) · ∇φ(a, b). Thus, if x is a solution to the NCP, it gives ∇ψ(x_i, F_i(x)) = 0 for all i = 1, . . . , n. Then, it follows from formula (13) in part (c) that ∇Ψ(x) = 0. That is, x is a stationary point of Ψ.

(e) Please see page 232 of [41]. □
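For the smooth case φ = φ^p_NR, formula (13) can be implemented directly. The sketch below is our own illustration (the function names are assumptions, not code from the paper); it uses the gradient of φ^p_NR from Lemma 3.2(a), the relation ∇ψ = φ · ∇φ from (7), and a user-supplied Jacobian of F.

```python
import numpy as np

def grad_phi_nr(a, b, p=3):
    """Gradient of phi^p_NR as in Lemma 3.2(a)."""
    t = max(a - b, 0.0) ** (p - 1)           # equals (a-b)^(p-2) * (a-b)_+
    return p * np.array([a ** (p - 1) - t, t])

def grad_Psi_nr(x, F, JF, p=3):
    """Analytic gradient (13) of Psi^p_NR; JF(x) is the usual Jacobian (rows = grad F_i^T)."""
    x = np.asarray(x, dtype=float)
    Fx, J = F(x), JF(x)
    ga = np.empty_like(x)                    # components of grad_a psi(x, F(x))
    gb = np.empty_like(x)                    # components of grad_b psi(x, F(x))
    for i, (ai, bi) in enumerate(zip(x, Fx)):
        phi = ai ** p - max(ai - bi, 0.0) ** p
        da, db = grad_phi_nr(ai, bi, p)
        ga[i], gb[i] = phi * da, phi * db    # grad psi = phi * grad phi, by (7)
    return ga + J.T @ gb                     # formula (13); J.T is the transposed Jacobian
```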

We adopt the neural network (9) with Ψ(x) = (1/2)‖Φ(x)‖^2, where Φ is given by (6) with φ ∈ {φ^p_NR, φ^p_S−NR, ψ^p_S−NR}. The function Φ corresponding to φ^p_NR, φ^p_S−NR and ψ^p_S−NR is denoted, respectively, by Φ^p_NR, Φ^p_S1−NR and Φ^p_S2−NR. Their corresponding merit functions will be denoted by Ψ^p_NR, Ψ^p_S1−NR and Ψ^p_S2−NR, respectively. We note that by formula (13) and the differentiability of Ψ ∈ {Ψ^p_NR, Ψ^p_S1−NR, Ψ^p_S2−NR} (see Proposition 4.2), the neural network (9) can be implemented on hardware as in Figure 1.

We first establish the existence and uniqueness of the solutions of neural network (9).


Figure 1: Simplified block diagram for neural network (9). This figure is lifted from Chen et al. [2]

Proposition 4.2 Let p > 1 be an odd integer. Then, the following hold.

(a) Ψ^p_NR and Ψ^p_S2−NR are both continuously differentiable on IRn.

(b) Ψ^p_S1−NR is continuously differentiable on the open set Ω = {x ∈ IRn | x_i ≠ F_i(x), ∀i = 1, 2, . . . , n}.

Consequently, the neural network (9) with Ψ^p_NR or Ψ^p_S2−NR has a unique solution for all x_0 ∈ IRn. The neural network (9) with Ψ^p_S1−NR has a unique solution for all x_0 ∈ Ω.

Proof. Parts (a) and (b) directly follow from Proposition 4.1(c) and Lemma 3.2. The existence and uniqueness of the solutions follow from Lemma 3.4, noting the continuous differentiability of F and of Ψ^p_NR, Ψ^p_S1−NR (on Ω), and Ψ^p_S2−NR. □

We note that because of Proposition 4.2(b), we only consider the neural network (9) with Ψ = Ψ^p_S1−NR as a dynamical system defined on the set Ω. Our next goal is to determine conditions such that equilibrium points of (9) are global minimizers of Ψ.

When an NCP-function has properties (P1) and (P2) (see the Introduction), an equilibrium point is a global minimizer when F is a P0-function. However, these properties only hold on a proper subset of IRn for the functions φ^p_NR, φ^p_S−NR and ψ^p_S−NR. Thus, we seek other conditions to achieve this goal. We start with the merit function Ψ^p_NR.


Proposition 4.3 If F is strongly monotone with modulus µ > 1, then every stationary point of Ψ^p_NR is a global minimizer.

Proof. Let x be a stationary point of Ψ^p_NR, that is, ∇Ψ^p_NR(x) = 0. For convenience, we denote by A(x) and B(x) the diagonal matrices such that for each i = 1, . . . , n,

A_ii(x) = (x_i)^{p−1}  and  B_ii(x) = (x_i − F_i(x))^{p−2}(x_i − F_i(x))_+.

Then, by formula (13) and Lemma 3.2(a), we have

p[A(x) − B(x)]Φ^p_NR(x) + p∇F(x)B(x)Φ^p_NR(x) = 0,

which yields

A(x)Φ^p_NR(x) + (∇F(x) − I)B(x)Φ^p_NR(x) = 0.    (14)

Analogous to the technique in [11], pre-multiplying both sides of (14) by (B(x)Φ^p_NR(x))^T leads to

Φ^p_NR(x)^T[B(x)A(x)]Φ^p_NR(x) + (B(x)Φ^p_NR(x))^T(∇F(x) − I)B(x)Φ^p_NR(x) = 0.    (15)

Since p is an odd integer, we have A(x) ≥ 0 and B(x) ≥ 0; and hence,

Φ^p_NR(x)^T[B(x)A(x)]Φ^p_NR(x) ≥ 0.

On the other hand, since F is strongly monotone with modulus µ > 1, defining G(x) := F(x) − x gives

⟨x − y, G(x) − G(y)⟩ = ⟨x − y, F(x) − x − F(y) + y⟩
                     = ⟨x − y, F(x) − F(y)⟩ − ‖x − y‖^2
                     ≥ (µ − 1)‖x − y‖^2
                     > 0,

for all x, y ∈ IRn with x ≠ y. Note then that ∇G(x) = ∇F(x) − I is positive definite. Consequently, each term on the left-hand side of (15) is nonnegative. With (∇F(x) − I) being positive definite, this yields B(x)Φ^p_NR(x) = 0. In addition, from (14), we have A(x)Φ^p_NR(x) = 0. To sum up, we have proved that A_ii(x)φ^p_NR(x_i, F_i(x)) = 0 and B_ii(x)φ^p_NR(x_i, F_i(x)) = 0 for all i.

Now, if φ^p_NR(x_i, F_i(x)) ≠ 0 for some i, then we must have A_ii(x) = B_ii(x) = 0. Thus, (x_i)^{p−1} = 0 (i.e., x_i = 0) and x_i ≤ F_i(x). Since φ^p_NR is an NCP-function, the latter implies that φ^p_NR(x_i, F_i(x)) = 0, a contradiction. Hence, φ^p_NR(x_i, F_i(x)) = 0 for all i, that is, x is a global minimizer of Ψ^p_NR. This completes the proof. □

The following proposition provides a weaker condition on F to guarantee that a stationary point of Ψ^p_NR is a global minimizer.


Proposition 4.4 If (∇F − I) is a P-matrix, then every stationary point of Ψ^p_NR is a global minimizer.

Proof. Suppose that ∇Ψ^p_NR(x) = 0. If B(x)Φ^p_NR(x) = 0, then A(x)Φ^p_NR(x) = 0 by equation (14). As in the preceding proof, we obtain Φ^p_NR(x) = 0, and hence we are done.

It remains to consider the case that B(x)Φ^p_NR(x) ≠ 0. Note that

(B(x)Φ^p_NR(x))_i = (x_i − F_i(x))^{p−2}(x_i − F_i(x))_+ φ^p_NR(x_i, F_i(x))
                  = { 0                                           if x_i ≤ F_i(x) or x_i > F_i(x) = 0,
                      (x_i − F_i(x))^{p−1} φ^p_NR(x_i, F_i(x))    if x_i > F_i(x) and F_i(x) ≠ 0.

Thus, the nonzero entries of B(x)Φ^p_NR(x) appear at indices i where x_i > F_i(x) and F_i(x) ≠ 0. To proceed, we denote

I_1 = {i | x_i ≠ 0 and (B(x)Φ^p_NR(x))_i ≠ 0},  I_2 = {i | x_i = 0 and (B(x)Φ^p_NR(x))_i ≠ 0}.

With these notations, we observe the following facts.

(i) For i ∈ I_1, since p is odd, it is clear that the i-th entries of A(x)Φ^p_NR(x) and B(x)Φ^p_NR(x) are both nonzero and have the same sign.

(ii) For i ∈ I_2, we have (B(x)Φ^p_NR(x))_i ≠ 0 and (A(x)Φ^p_NR(x))_i = 0.

Because (∇F − I) is a P-matrix, it follows from Lemma 3.1 that there exists an index j such that

(B(x)Φ^p_NR(x))_j [(∇F(x) − I)(B(x)Φ^p_NR(x))]_j > 0.

This says that (B(x)Φ^p_NR(x))_j ≠ 0 and therefore j ∈ I_1 ∪ I_2. If j ∈ I_1, then by (i) above, (A(x)Φ^p_NR(x))_j and (B(x)Φ^p_NR(x))_j have the same sign; since [(∇F(x) − I)(B(x)Φ^p_NR(x))]_j also has this sign, the j-th component of the left-hand side of (14) cannot vanish, a contradiction. On the other hand, if j ∈ I_2, we have from fact (ii) that (A(x)Φ^p_NR(x))_j = 0. However, we also have [(∇F(x) − I)(B(x)Φ^p_NR(x))]_j ≠ 0. This again violates equation (14). Thus, we conclude that B(x)Φ^p_NR(x) = 0, and hence Φ^p_NR(x) = 0. The proof is complete. □

Remark 4.1 In fact, if the function F is nonnegative (or if we at least have F(x) ≥ 0 for an equilibrium point x), then case (ii) in the above proof cannot happen. Thus, the above proposition is valid even when (∇F − I) is a P0-matrix, by Lemma 3.1.

From Lemma 3.2(b) and Lemma 3.2(c), we see that the structures of ∇Φ^p_S1−NR and ∇Φ^p_S2−NR corresponding to the NCP-functions φ^p_S−NR and ψ^p_S−NR are complex because of the piecewise nature of φ^p_S−NR and ψ^p_S−NR. This makes it difficult to find conditions on F so that a stationary point of Ψ^p_S1−NR or Ψ^p_S2−NR is also a global minimizer. However, if F is a nonnegative function, we have the following proposition.


Proposition 4.5 Suppose that F is a nonnegative P0-function and x ≥ 0. If x is a stationary point of Ψ^p_S1−NR or Ψ^p_S2−NR, then it is a global minimizer.

Proof. If we can show that properties (P1) and (P2) mentioned in the Introduction hold for φ^p_S−NR and ψ^p_S−NR on the nonnegative quadrant IR2_+, then we can proceed as in the proof of [5, Proposition 3.4]. Thus, it is enough to show that (P1) and (P2) hold on IR2_+. To simplify notation, we denote φ_1 = φ^p_S−NR, φ_2 = ψ^p_S−NR, and ψ_i = (1/2)|φ_i|^2 (i = 1, 2). Note that the domain of ∇Ψ^p_S1−NR is {x | x_i ≠ F_i(x) or x_i = F_i(x) = 0}. Thus, for ψ_1, it suffices to check that it has properties (P1) and (P2) only on the set {(a, b) ∈ IR2_+ | a ≠ b or a = b = 0}.

To proceed, we observe that

∇_a ψ_i(a, b) = φ_i(a, b)∇_a φ_i(a, b)  and  ∇_b ψ_i(a, b) = φ_i(a, b)∇_b φ_i(a, b),

which imply

∇_a ψ_i(a, b) · ∇_b ψ_i(a, b) = (φ_i(a, b))^2 · ∇_a φ_i(a, b) · ∇_b φ_i(a, b),  i = 1, 2.

If a ≥ b = 0 or b ≥ a = 0, then φ_i(a, b) = 0, and thus the above product is zero. Otherwise, the above product is positive by Lemma 3.3(b). This asserts (P1).

To show (P2), note that it is obvious that ∇_a ψ_i(a, b) = ∇_b ψ_i(a, b) = 0 if φ_i(a, b) = 0 for i = 1, 2. To show the converse, it is enough to argue that if ∇_a φ_i(a, b) = 0 or ∇_b φ_i(a, b) = 0, then φ_i(a, b) = 0. First, we analyze the case of φ_1. Suppose that ∇_a φ_1(a, b) = 0. From Lemma 3.2(b),

(1/p)∇_a φ_1(a, b) = { a^{p−1} − (a − b)^{p−1}   if a > b,
                       0                          if a = b = 0,
                       (b − a)^{p−1}              if a < b.

For a = b = 0, we have φ_1(a, b) = 0. For a > b, we get a^{p−1} = (a − b)^{p−1}, so |a| = |a − b| = a − b since p − 1 is even; as a ≥ 0, this gives b = 0, and because a > b, we obtain φ_1(a, b) = 0. For a < b, the above formula gives (b − a)^{p−1} = 0, which is impossible. This proves that ∇_a φ_1(a, b) = 0 implies φ_1(a, b) = 0. Similarly, we can show that ∇_b φ_1(a, b) = 0 implies φ_1(a, b) = 0. This asserts (P2) for the function ψ_1.

Analogously, for ψ_2, assume that ∇_a φ_2(a, b) = 0. From Lemma 3.2(c), we have

(1/p)∇_a φ_2(a, b) = { a^{p−1}b^p − (a − b)^{p−1}b^p                       if a > b,
                       a^{2p−1}                                            if a = b,
                       a^{p−1}b^p − (b − a)^p a^{p−1} + (b − a)^{p−1}a^p   if a < b.

For a = b, we get a^{2p−1} = 0, and hence a = 0 and φ_2(a, b) = 0. For a > b, we get a^{p−1}b^p − (a − b)^{p−1}b^p = 0. If b = 0, we obtain φ_2(a, b) = 0 by using a > b. Otherwise, a^{p−1} − (a − b)^{p−1} = 0. Because p − 1 is even and a ≥ 0, we have a = |a − b| = a − b. Consequently, b = 0 and φ_2(a, b) = 0. For a < b, we have from the above formula for ∇_a φ_2 that a^{p−1}b^p − (b − a)^p a^{p−1} + (b − a)^{p−1}a^p = 0. If a = 0, then φ_2(a, b) = 0 due to a < b. Otherwise, a > 0 and

0 = b^p − (b − a)^p + (b − a)^{p−1}a
  = b^p − (b − a)^{p−1}(b − 2a)
  = (a + k)^p − k^{p−1}(k − a),  where k = b − a > 0,
  = Σ_{i=0}^{p−1} (p choose i) a^{p−i}k^i + a k^{p−1}
  > 0,

which is a contradiction. To sum up, we have shown that ∇_a φ_2(a, b) = 0 implies φ_2(a, b) = 0. Similarly, it can be verified that φ_2(a, b) = 0 provided ∇_b φ_2(a, b) = 0. Thus, ψ_2 possesses property (P2). This completes the proof. □

5 Stability Analysis

We now look at the properties of the neural network (9) related to the behavior of its solutions. We have the following consequences, which easily follow from Proposition 4.1(a), Proposition 4.1(d), Proposition 4.4, and Proposition 4.5.

Proposition 5.1 Consider the neural network (9) with Ψ ∈ {Ψ^p_NR, Ψ^p_S1−NR, Ψ^p_S2−NR}.

(a) Every solution of the NCP is an equilibrium point.

(b) If (∇F − I) is a P-matrix, then every equilibrium point of (9) with Ψ = Ψ^p_NR solves the NCP.

(c) If F is a nonnegative P0-function, then every equilibrium point x ≥ 0 of (9) with Ψ ∈ {Ψ^p_S1−NR, Ψ^p_S2−NR} solves the NCP.

Theorem 5.1 below addresses the boundedness of the level sets of Ψ and convergence of the trajectories of the neural network. Before we state this theorem, we need the following lemma.

Lemma 5.1 Let {(a_k, b_k)}_{k=1}^{∞} ⊆ IR2 be such that |a_k| → ∞ and |b_k| → ∞ as k → ∞. Then, |φ^p_NR(a_k, b_k)| → ∞, |φ^p_S−NR(a_k, b_k)| → ∞, and |ψ^p_S−NR(a_k, b_k)| → ∞.


Proof. (a) First, we verify that |φ^p_S−NR(a_k, b_k)| → ∞. To proceed, we consider three cases.

(i) Suppose a_k → ∞ and b_k → ∞. Note that for all x ∈ [−1, 0] and n ∈ N, there holds the useful inequality (1 + x)^n ≤ (1 − nx)^{−1}. Thus, when a > b > 0, we have

φ^p_S−NR(a, b) = a^p − (a − b)^p = a^p − a^p(1 − b/a)^p
              ≥ a^p − a^p(1 + pb/a)^{−1}
              = a^p − a^p · a/(a + pb)
              = p a^p b/(a + pb)
              ≥ p a^{p−1} b/(1 + p)
              ≥ p b^p/(1 + p).

Similarly, φ^p_S−NR(a, b) ≥ p a^p/(1 + p) for b > a > 0. Thus, φ^p_S−NR(a_k, b_k) → ∞ as k → ∞.

(ii) Suppose a_k → −∞ and b_k → −∞. Observe that φ^p_S−NR(a, b) ≤ a^p when a > b, and φ^p_S−NR(a, b) ≤ b^p when a < b. Thus, φ^p_S−NR(a_k, b_k) → −∞ as k → ∞.

(iii) Suppose a_k → ∞ and b_k → −∞. For a > 0 and b < 0, we have (a − b)^p ≥ a^p + (−b)^p = a^p − b^p. Thus, φ^p_S−NR(a, b) = a^p − (a − b)^p ≤ b^p and we conclude that φ^p_S−NR(a_k, b_k) → −∞ as k → ∞. In the case that a_k → −∞ and b_k → ∞, we also have φ^p_S−NR(a_k, b_k) → −∞ as k → ∞ by the symmetry of φ^p_S−NR.

(b) Next, we show that |φ^p_NR(a_k, b_k)| → ∞. Again, we consider three cases.

(i) Suppose that a_k → −∞. Since φ^p_NR(a, b) = a^p − (a − b)^p_+ ≤ a^p for all (a, b) ∈ IR2, it is trivial to see that φ^p_NR(a_k, b_k) → −∞.

(ii) Suppose that a_k → ∞ and b_k → ∞. For a > b > 0, we have φ^p_NR(a, b) = φ^p_S−NR(a, b) ≥ p b^p/(1 + p). For 0 ≤ a < b, it is clear that φ^p_NR(a, b) = a^p. Then, we conclude that φ^p_NR(a_k, b_k) → ∞.

(iii) Suppose that a_k → ∞ and b_k → −∞. For a > 0 and b < 0, we have φ^p_NR(a, b) = φ^p_S−NR(a, b) ≤ b^p, and so φ^p_NR(a_k, b_k) → −∞. Thus, we have proved that |φ^p_NR(a_k, b_k)| → ∞.

(c) The last limit, |ψ^p_S−NR(a_k, b_k)| → ∞, follows from the fact that

ψ^p_S−NR(a, b) = { φ^p_S−NR(a, b) b^p   if a > b,
                   a^p b^p = a^{2p}     if a = b,
                   φ^p_S−NR(b, a) a^p   if a < b,

and the inequalities obtained above for φ^p_S−NR. □

Theorem 5.1 Let F be a uniform P-function and let Ψ ∈ {Ψ^p_NR, Ψ^p_S1−NR, Ψ^p_S2−NR}.

(a) The level sets L(Ψ, γ) := {x ∈ IRn | Ψ(x) ≤ γ} of Ψ are bounded for any γ ≥ 0. Consequently, the trajectory x(t) through any initial condition x_0 ∈ IRn is defined for all t ≥ 0.

(b) The trajectory x(t) of (9) through any x_0 ∈ IRn converges to an equilibrium point.

Proof. (a) Suppose otherwise. Then, there exists a sequence {x^k}_{k=1}^{∞} ⊆ L(Ψ, γ) such that ‖x^k‖ → ∞ as k → ∞. An argument similar to that in [10] shows that there exists an index i such that |x^k_i| → ∞ and |F_i(x^k)| → ∞ as k → ∞. By Lemma 5.1, we have |φ(x^k_i, F_i(x^k))| → ∞, where φ ∈ {φ^p_NR, φ^p_S−NR, ψ^p_S−NR}. But this is impossible since Ψ(x^k) ≤ γ for all k. Thus, the level set L(Ψ, γ) is bounded. The remaining part of the theorem can be proved similarly to Proposition 4.2(b) in [2].

(b) From part (a), the level sets of Ψ are compact, and so by LaSalle's Invariance Principle [22], we reach the desired conclusion. □

Theorem 5.2 Suppose x̄ is an isolated equilibrium point of (9). Then, x̄ is asymptotically stable provided that either

(i) Ψ = Ψ^p_NR and (∇F − I) is a P-matrix; or

(ii) Ψ ∈ {Ψ^p_S1−NR, Ψ^p_S2−NR}, F is a nonnegative P0-function, and the equilibrium point is nonnegative.

Proof. Let x̄ be an isolated equilibrium point of (9). Then, it has a neighborhood O such that

∇Ψ(x̄) = 0  and  ∇Ψ(x) ≠ 0 for all x ∈ O\{x̄}.

We claim that Ψ is a Lyapunov function at x̄ over O. To proceed, we note first that Ψ(x) ≥ 0. By Proposition 5.1(b) and (c), Ψ(x̄) = 0. Further, if Ψ(x) = 0 for some x ∈ O\{x̄}, then x solves the NCP and, by Proposition 5.1(a), it is an equilibrium point. This contradicts the isolation of x̄. Thus, Ψ(x) > 0 for all x ∈ O\{x̄}. Finally, it is clear that

dΨ(x(t))/dt = −ρ‖∇Ψ(x(t))‖^2 < 0

over the set O\{x̄}. Then, applying Lemma 3.5 yields that x̄ is asymptotically stable. □

We now look at the exponential stability of the neural network.

Theorem 5.3 Consider the neural network (9) with Ψ ∈ {Ψ^p_NR, Ψ^p_S1−NR, Ψ^p_S2−NR}. If ∇Φ(x̄) is nonsingular for some isolated equilibrium point x̄, then x̄ solves the NCP and x̄ is exponentially stable.

Proof. Let x̄ be an equilibrium point such that ∇Φ(x̄) is nonsingular. Note that ∇Ψ(x̄) = ∇Φ(x̄)Φ(x̄), and so ∇Ψ(x̄) = 0 implies that Φ(x̄) = 0. This proves the first claim of the theorem. Further, using Ψ as a Lyapunov function as in the preceding theorem, x̄ is asymptotically stable.

Note that since Φ is differentiable at x̄, we have

Φ(x) = ∇Φ(x̄)^T(x − x̄) + o(‖x − x̄‖)  as x → x̄.    (16)

By Lemma 3.6, there exist δ > 0 and a constant C such that ∇Φ(x) is nonsingular for all x with ‖x − x̄‖ < δ, and ‖∇Φ(x)^{−1}‖ ≤ C. This gives

κ‖y‖^2 ≤ ‖∇Φ(x)y‖^2    (17)

for any x in the δ-neighborhood (call it N_δ) and any y ∈ IRn, where κ = 1/C^2.

Let ε < 2ρκ. Since x̄ is asymptotically stable, we may choose δ small enough so that o(‖x − x̄‖^2) < ε‖x − x̄‖^2 and x(t) → x̄ as t → ∞ for any initial condition x(0) ∈ N_δ. Now, define g : [0, ∞) → IR by

g(t) := ‖x(t) − x̄‖^2,

where x(t) is the unique solution through x(0) ∈ N_δ. Using equations (16) and (17), we obtain

dg(t)/dt = 2(x(t) − x̄)^T dx(t)/dt
         = −2ρ(x(t) − x̄)^T ∇Ψ(x(t))
         = −2ρ(x(t) − x̄)^T ∇Φ(x(t))Φ(x(t))
         = −2ρ(x(t) − x̄)^T ∇Φ(x(t))∇Φ(x̄)^T(x(t) − x̄) + o(‖x(t) − x̄‖^2)
         ≤ (−2ρκ + ε)‖x(t) − x̄‖^2
         = (−2ρκ + ε)g(t).

Then, it follows that g(t) ≤ e^{(−2ρκ+ε)t} g(0), which says

‖x(t) − x̄‖ ≤ e^{(−ρκ+ε/2)t} ‖x(0) − x̄‖,

where −ρκ + ε/2 < 0. This proves that x̄ is exponentially stable. □

6 Simulation Results

In this section, we look at some nonlinear complementarity problems and test them using the neural network (9) with Ψ ∈ {Ψ^p_NR, Ψ^p_S1−NR, Ψ^p_S2−NR}. We also compare the rate of convergence of each network for different values of p. Further, we compare the numerical performance of these networks with the neural network based on the Fischer-Burmeister (FB) function [23] given by (10) and the neural network based on the generalized Fischer-Burmeister function [2] given by (11).

In the following simulations, we use the Matlab ordinary differential equation solver ode23s. Recall that ρ is a time-scaling parameter. In particular, if we wish to achieve faster convergence, a higher value of ρ can be used. In our simulations, the values of ρ used are 10^3, 10^6 or 10^9, as indicated in the figures. The stopping criterion in simulating the trajectories is ‖∇Ψ(x(t))‖ ≤ 10^{−5}.
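In a Python reimplementation (our assumption; the paper itself uses Matlab's ode23s), the same stopping rule can be expressed as a terminal event of the ODE solver, for example:

```python
import numpy as np
from scipy.integrate import solve_ivp

def simulate_until_stationary(grad_Psi, x0, rho=1e3, t_end=1.0, tol=1e-5):
    """Integrate (9) and stop once ||grad Psi(x(t))|| <= tol, mirroring the paper's criterion."""
    def rhs(t, x):
        return -rho * grad_Psi(x)
    def stationary(t, x):
        # Crosses zero from above when the gradient norm reaches the tolerance.
        return np.linalg.norm(grad_Psi(x)) - tol
    stationary.terminal = True
    return solve_ivp(rhs, (0.0, t_end), np.asarray(x0, dtype=float),
                     method="BDF", events=stationary)
```

Here BDF is used as a stiff solver in the same spirit as ode23s; any analytic or numerical gradient of Ψ (such as grad_Psi_nr sketched in Section 4) can be passed as grad_Psi.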

Example 6.1 [21, Kojima-Shindo] Consider the NCP, where F : IR4 → IR4 is given by

F(x) = [ 3x_1^2 + 2x_1x_2 + 2x_2^2 + x_3 + 3x_4 − 6,
         2x_1^2 + x_1 + x_2^2 + 3x_3 + 2x_4 − 2,
         3x_1^2 + x_1x_2 + 2x_2^2 + 2x_3 + 3x_4 − 1,
         x_1^2 + 3x_2^2 + 2x_3 + 3x_4 − 3 ]^T.

This is a non-degenerate NCP and the solution is x̄ = (√6/2, 0, 0, 1/2).
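For reference, this mapping can be coded directly as below (our sketch; simulate and phi_nr refer to the hypothetical helpers sketched in Section 2):

```python
import numpy as np

def F_kojima_shindo(x):
    """The mapping F of Example 6.1 (Kojima-Shindo)."""
    x1, x2, x3, x4 = x
    return np.array([
        3*x1**2 + 2*x1*x2 + 2*x2**2 + x3 + 3*x4 - 6,
        2*x1**2 + x1 + x2**2 + 3*x3 + 2*x4 - 2,
        3*x1**2 + x1*x2 + 2*x2**2 + 2*x3 + 3*x4 - 1,
        x1**2 + 3*x2**2 + 2*x3 + 3*x4 - 3,
    ])

x0 = np.array([2.0, 0.5, 0.5, 1.5])                 # initial condition used in Figures 2-4
x_star = np.array([np.sqrt(6)/2, 0.0, 0.0, 0.5])    # known solution of the problem
# e.g. sol = simulate(F_kojima_shindo, phi_nr, x0, rho=1e3)
# If the trajectory converges, the final state sol.y[:, -1] should approach x_star.
```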

We simulate the network (9) with different Ψ ∈ {Ψ^p_NR, Ψ^p_S1−NR, Ψ^p_S2−NR} for various values of p to see the influence of p on the convergence of trajectories to the NCP solution. From Figures 2-4, we see that a smaller value of p yields faster convergence when the initial condition is x_0 = (2, 0.5, 0.5, 1.5)^T. Figure 5 depicts the comparison of the different NCP-functions with p = 3, together with the FB and generalized FB functions. Among these five classes of NCP-functions, we see that the neural network based on φ^p_S−NR has the best numerical performance. In Figure 6, we simulate the neural network based on φ^p_S−NR using 6 random initial points, and the trajectories converge to x̄ at around t = 5.5 ms. One can also observe from Figure 6 that the convergence of x_2(t) and x_3(t) is very fast. We note that ∇Φ^p_S1−NR(x̄) is nonsingular, which leads to the exponential stability of x̄ by Theorem 5.3. This particular problem was also simulated using neural networks
