
A smoothed NR neural network for solving nonlinear convex programs with second-order cone constraints

Xinhe Miao 1
Department of Mathematics, School of Science, Tianjin University, Tianjin 300072, P.R. China

Jein-Shan Chen 2
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

Chun-Hsu Ko 3
Department of Electrical Engineering, I-Shou University, Kaohsiung 840, Taiwan

January 2, 2012

(1st revision on January 24, 2013) (2nd revision on May 27, 2013) (3rd revision on June 22, 2013)

Abstract. This paper proposes a neural network approach for efficiently solving general nonlinear convex programs with second-order cone constraints. The proposed neural network model is developed based on a smoothed natural residual merit function involving an unconstrained minimization reformulation of the complementarity problem. We study the existence and convergence of the trajectory of the neural network. Moreover, we show some stability properties of the considered neural network, namely Lyapunov stability, asymptotic stability, and exponential stability. The examples in this paper provide a further demonstration of the effectiveness of the proposed neural network. This paper can be viewed as a follow-up of [20] and [27] because more stability results are obtained.

1The author’s work is also supported by National Young Natural Science Foundation (No. 11101302) and The Seed Foundation of Tianjin University (No. 60302041). E-mail: xinhemiao@tju.edu.cn

2Corresponding author. Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw.

3E-mail: chko@isu.edu.tw.


Keywords: Merit function, Neural network, NR function, Second-order cone, Stability.

1 Introduction

In this paper, we are interested in finding a solution to the following nonlinear convex program with second-order cone constraints (henceforth SOCP):

min f(x)
s.t. Ax = b
     −g(x) ∈ K   (1)

where A ∈ R^{m×n} has full row rank, b ∈ R^m, f : R^n → R, g = [g1, ⋯, gl]^T : R^n → R^l with f and the gi's being twice continuously differentiable and convex on R^n, and K is a Cartesian product of second-order cones (also called Lorentz cones), expressed as

K = K^{n1} × K^{n2} × ⋯ × K^{nN}

with N, n1, ⋯, nN ≥ 1, n1 + ⋯ + nN = l, and

K^{ni} := { (xi1, xi2, ⋯, xini)^T ∈ R^{ni} | ‖(xi2, ⋯, xini)‖ ≤ xi1 }.

Here, ‖·‖ denotes the Euclidean norm and K^1 means the set of nonnegative reals R+. In fact, the problem (1) is equivalent to the following variational inequality problem, which is to find x ∈ D satisfying

⟨∇f(x), y − x⟩ ≥ 0, ∀y ∈ D,

where D = {x ∈ R^n | Ax = b, −g(x) ∈ K}. Many problems in the engineering, transportation science, and economics communities can be solved by transforming the original problems into such convex optimization problems or variational inequality problems; see [1, 7, 10, 17, 23].

Many computational approaches have been proposed for solving convex optimization problems, including the interior-point method [29], merit function methods [5, 16], Newton methods [18, 25], and projection methods [10]. However, real-time solutions are imperative in many applications, such as force analysis in robot grasping and control applications. Traditional optimization methods may not be suitable for these applications because of stringent computational time requirements.

Therefore, a feasible and efficient method is required for solving optimization problems in real time, and the neural network method is well suited to this task. Compared with the traditional methods, it has a clear advantage for real-time computation. Hence, researchers have developed many continuous-time neural networks for constrained optimization problems. The literature


contains many studies on neural networks for solving real-time optimization problems; see [4, 9, 12, 14, 15, 19, 20, 21, 22, 27, 30, 31, 32, 33, 34, 36] and the references therein.

Neural networks date back to McCulloch and Pitts' pioneering work half a century ago, and they were first introduced to the optimization domain in the 1980s [13, 28]. The essence of the neural network method for optimization [6] is to establish a nonnegative Lyapunov function (or energy function) and a dynamic system that represents an artificial neural network. This dynamic system usually takes the form of a first-order ordinary differential equation. Starting from an initial point, the neural network is expected to approach its equilibrium point, which corresponds to the solution of the considered optimization problem.

This paper presents a neural network method to solve general nonlinear convex programs with second-order cone constraints. In particular, we consider the Karush-Kuhn-Tucker (KKT) optimality conditions of the problem (1), which can be transformed into a second-order cone complementarity problem (SOCCP) together with some equality constraints. Following a reformulation of the complementarity problem, an unconstrained optimization problem is formulated. A smoothed natural residual (NR) complementarity function is then used to construct a Lyapunov function and a neural network model.

At the same time, we show the existence and convergence of the solution trajectory for the dynamic system. This study also investigates stability results, such as Lyapunov stability, asymptotic stability, and exponential stability. We want to point out that the optimization problem considered in this paper is more general than the one studied in [20], where g(x) = −x is investigated. From [20], for solving the specific SOCP (i.e., g(x) = −x), we know that the neural network based on the cone projection function performs better than the one based on the Fischer-Burmeister function in most cases (except for some oscillating cases). In light of this phenomenon, we employ a neural network model based on the cone projection function for a more general SOCP. Thus, this paper can be viewed as a follow-up of [20] in this sense. Nevertheless, the neural network model studied here is not exactly the same as the one considered in [20]. More specifically, we consider a neural network based on the smoothed NR function, which was studied in [16]. Why do we make such a change? As shown in Section 4, we can establish various stability results, including exponential stability, for the proposed neural network, which were not achieved in [20]. In addition, the second neural network studied in [27] (for various types of problems) is also similar to the proposed network. Again, not all three stabilities are guaranteed in that study, whereas three stabilities are proved here.

The remainder of this paper is organized as follows. Section 2 presents stability concepts and provides related results. Section 3 describes the neural network architecture, which is based on the smoothed NR function, to solve the problem (1). Section 4 presents


the convergence and stability results of the proposed neural network. Section 5 shows the simulation results of the new method. Finally, Section 6 gives the conclusion of this paper.

2 Preliminaries

In this section, we briefly recall background material on ordinary differential equations (ODEs) and some stability concepts regarding the solution of an ODE. We also present some related results that play an essential role in the subsequent analysis.

Let H : R^n → R^n be a mapping. The first-order differential equation (ODE) considered here takes the form

du/dt = H(u(t)),   u(t0) = u0 ∈ R^n.   (2)

We start with the existence and uniqueness of the solution of Eq. (2). Then, we introduce the equilibrium point of (2) and define various stabilities. All of these materials can be found in a typical ODE textbook, such as [24].

Lemma 2.1 (The existence and uniqueness) [21, Theorem 2.5] Assume that H : Rn → Rn is a continuous mapping. Then for arbitrary t0 ≥ 0 and u0 ∈ Rn, there exists a local solution u(t), t ∈ [t0, τ) to (2) for some τ > t0. Furthermore, if H is locally Lipschitz continuous at u0, then the solution is unique; if H is Lipschitz continuous in Rn , then τ can be extended to ∞.

Remark 2.1 For Eq. (2), if a local solution defined on [t0, τ) cannot be extended to a local solution on a larger interval [t0, τ1), where τ1 > τ, then it is called a maximal solution, and this interval [t0, τ) is the maximal interval of existence. It is obvious that an arbitrary local solution has an extension to a maximal one.

Lemma 2.2 [21, Theorem 2.6] Let H : R^n → R^n be a continuous mapping. If u(t) is a maximal solution and [t0, τ) is the maximal interval of existence associated with u0 and τ < +∞, then lim_{t↑τ} ‖u(t)‖ = +∞.

For the first-order differential equation (2), a point u∗ ∈ R^n is called an equilibrium point of (2) if H(u∗) = 0. If there is a neighborhood Ω ⊆ R^n of u∗ such that H(u∗) = 0 and H(u) ≠ 0 for any u ∈ Ω\{u∗}, then u∗ is called an isolated equilibrium point.

The following are definitions of various stabilities, and related materials can be found in [21, 24, 27].


Definition 2.1 (Lyapunov stability and Asymptotic stability) Let u(t) be a solution to Eq. (2).

(a) An isolated equilibrium point u∗ is Lyapunov stable (or stable in the sense of Lyapunov) if for any u0 = u(t0) and ε > 0, there exists a δ > 0 such that

‖u0 − u∗‖ < δ ⟹ ‖u(t) − u∗‖ < ε for t ≥ t0.

(b) Under the condition that an isolated equilibrium point u∗ is Lyapunov stable, u∗ is said to be asymptotically stable if it has the property that if ‖u0 − u∗‖ < δ, then u(t) → u∗ as t → ∞.

Definition 2.2 (Lyapunov function) Let Ω ⊆ R^n be an open neighborhood of ū. A continuously differentiable function g : R^n → R is said to be a Lyapunov function (or energy function) at the state ū (over the set Ω) for Eq. (2) if

g(ū) = 0,   g(u) > 0 ∀u ∈ Ω\{ū},   and   dg(u(t))/dt ≤ 0, ∀u ∈ Ω.

The following lemma shows the relationship between these stabilities and the existence of a Lyapunov function; see [3, 8, 35].

Lemma 2.3 (a) An isolated equilibrium point u∗ is Lyapunov stable if there exists a Lyapunov function over some neighborhood Ω of u∗.

(b) An isolated equilibrium point u∗ is asymptotically stable if there exists a Lyapunov function over some neighborhood Ω of u∗ that satisfies

dg(u(t))/dt < 0, ∀u ∈ Ω\{u∗}.

Definition 2.3 (Exponential stability) An isolated equilibrium point u∗ is exponentially stable for Eq. (2) if there exist ω < 0, κ > 0 and δ > 0 such that every solution u(t) to Eq. (2) with initial condition u(t0) = u0, ‖u0 − u∗‖ < δ, is defined on [0, ∞) and satisfies

‖u(t) − u∗‖ ≤ κ e^{ωt} ‖u(t0) − u∗‖,   t ≥ t0.

From the above definitions, it is obvious that exponential stability implies asymptotic stability.


3 NR neural network model

This section shows how the dynamic system in this study is formed. As mentioned previously, the key steps in the neural network method lie in constructing the dynamic system and the Lyapunov function. To this end, we first look into the KKT conditions of the problem (1), which are presented below:

∇f(x) − A^T y + ∇g(x)z = 0,
z ∈ K, −g(x) ∈ K, z^T g(x) = 0,
Ax − b = 0,   (3)

where y ∈ R^m and ∇g(x) denotes the gradient matrix of g. It is well known that if the problem (1) satisfies Slater's condition, which means that there exists a strictly feasible point for (1), i.e., a point x̂ ∈ R^n with −g(x̂) ∈ int(K) and Ax̂ = b, then x is a solution of the problem (1) if and only if there exist y, z such that (x, y, z) satisfies the KKT conditions (3). Hence, we assume throughout this paper that the problem (1) satisfies Slater's condition.

The following paragraphs provide a brief review of particular properties of the spectral factorization with respect to a second-order cone, which will be used in the subsequent analysis. Spectral factorization is one of the basic concepts in Jordan algebra. For more details, see [5, 11, 25].

For any vector z = (z1, z2) ∈ R × Rl−1 (l ≥ 2), its spectral factorization with respect to the second-order cone K is defined as

z = λ1e1 + λ2e2,

where λi = z1 + (−1)^i ‖z2‖ (i = 1, 2) are the spectral values of z, and

ei = ½ (1, (−1)^i z2/‖z2‖) if z2 ≠ 0,   ei = ½ (1, (−1)^i w) if z2 = 0,

with w ∈ R^{l−1} such that ‖w‖ = 1. The terms e1, e2 are called the spectral vectors of z. The spectral values and spectral vectors of z have the following properties: for any z ∈ R^l, we have λ1 ≤ λ2 and

λ1 ≥ 0 ⟺ z ∈ K.

Now we review the concept of metric projection onto K. For an arbitrary element z ∈ R^l, the metric projection of z onto K is denoted by PK(z) and defined as

PK(z) := argmin_{w∈K} ‖z − w‖.

Combining the spectral decomposition of z with the metric projection of z onto K yields the following expression of PK(z) (see [11]):

PK(z) = max{0, λ1}e1 + max{0, λ2}e2.
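To make the spectral decomposition and the projection formula above concrete, here is a small NumPy sketch (our own illustration, not part of the original paper; the helper names `soc_spectral` and `soc_project` are ours).

```python
import numpy as np

def soc_spectral(z):
    """Spectral factorization z = lam1*e1 + lam2*e2 with respect to the second-order cone."""
    z1, z2 = z[0], z[1:]
    n = np.linalg.norm(z2)
    if n > 0:
        w = z2 / n
    else:
        w = np.zeros(len(z2)); w[:1] = 1.0       # any unit vector works when z2 = 0
    e1 = 0.5 * np.concatenate(([1.0], -w))       # e_i = (1/2)(1, (-1)^i z2/||z2||)
    e2 = 0.5 * np.concatenate(([1.0], w))
    return z1 - n, z1 + n, e1, e2                # lam_i = z1 + (-1)^i ||z2||

def soc_project(z):
    """Metric projection P_K(z) = max{0, lam1} e1 + max{0, lam2} e2."""
    lam1, lam2, e1, e2 = soc_spectral(z)
    return max(0.0, lam1) * e1 + max(0.0, lam2) * e2

z_in = np.array([2.0, 1.0, 1.0])      # inside K^3, since ||(1, 1)|| <= 2
z_out = np.array([-1.0, 3.0, 0.0])    # outside K^3
print(soc_project(z_in))              # unchanged (up to rounding)
print(soc_project(z_out))             # lands on the boundary of K^3: (1, 1, 0)
```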


The projection function PK has the following property, which is called the Projection Theorem (see [2]).

Lemma 3.1 Let Ω be a closed convex set of R^n. Then, for all x, y ∈ R^n and any z ∈ Ω,

(x − P(x))^T (P(x) − z) ≥ 0 and ‖P(x) − P(y)‖ ≤ ‖x − y‖,

where P denotes the projection onto Ω.

Given the definition of the projection, suppose z+ denotes the metric projection PK(z) of z ∈ R^l onto K. Then, the natural residual (NR) function is given as follows [11]:

ΦNR(x, y) := x − (x − y)+, ∀x, y ∈ R^l.

The NR function is a popular SOC complementarity function, i.e.,

ΦNR(x, y) = 0 ⟺ x ∈ K, y ∈ K and ⟨x, y⟩ = 0.

Because of the non-differentiability of ΦNR, we consider a class of smoothed NR complementarity functions. To this end, we employ a continuously differentiable convex function ĝ : R → R such that

lim_{a→−∞} ĝ(a) = 0,   lim_{a→∞} (ĝ(a) − a) = 0   and   0 < ĝ′(a) < 1.   (4)

What kind of functions satisfies condition (4)? Here we present two examples:

ĝ(a) = (√(a² + 4) + a)/2   and   ĝ(a) = ln(eᵃ + 1).

Suppose z = λ1e1 + λ2e2, where λi and ei (i = 1, 2) are the spectral values and spectral vectors of z, respectively. By applying the function ĝ, we define the following function:

Pµ(z) := µ ĝ(λ1/µ) e1 + µ ĝ(λ2/µ) e2.   (5)

Fukushima, Luo, and Tseng [11] show that Pµ is smooth for any µ > 0; moreover, Pµ is a smoothing function of the projection PK, i.e., lim_{µ↓0} Pµ = PK. Hence, a smoothed NR complementarity function is given in the form of

Φµ(x, y) := x − Pµ(x − y).

In particular, from [11, Proposition 5.1], there exists a positive constant γ > 0 such that

‖Φµ(x, y) − ΦNR(x, y)‖ ≤ γµ

for any µ > 0 and (x, y) ∈ R^n × R^n.
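The following self-contained sketch (again our own illustration; the function names and test vectors are ours) implements the smoothed projection (5) with the first choice of ĝ given after (4), forms Φµ and ΦNR, and prints how their gap shrinks as µ ↓ 0, in line with the bound ‖Φµ(x, y) − ΦNR(x, y)‖ ≤ γµ. The small spectral helper repeats the previous sketch so that this block runs on its own.

```python
import numpy as np

def g_hat(a):
    """One choice of smoothing function satisfying (4): g_hat(a) = (sqrt(a^2 + 4) + a) / 2."""
    return 0.5 * (np.sqrt(a * a + 4.0) + a)

def soc_spectral(z):
    """Spectral values and vectors of z = (z1, z2) with respect to a second-order cone."""
    z1, z2 = z[0], z[1:]
    n = np.linalg.norm(z2)
    if n > 0:
        w = z2 / n
    else:
        w = np.zeros(len(z2)); w[:1] = 1.0       # any unit vector works when z2 = 0
    e1 = 0.5 * np.concatenate(([1.0], -w))
    e2 = 0.5 * np.concatenate(([1.0], w))
    return z1 - n, z1 + n, e1, e2

def P_mu(z, mu):
    """Smoothed projection (5): P_mu(z) = mu*g_hat(lam1/mu)*e1 + mu*g_hat(lam2/mu)*e2."""
    lam1, lam2, e1, e2 = soc_spectral(z)
    return mu * g_hat(lam1 / mu) * e1 + mu * g_hat(lam2 / mu) * e2

def Phi_mu(x, y, mu):
    """Smoothed NR function Phi_mu(x, y) = x - P_mu(x - y)."""
    return x - P_mu(x - y, mu)

def Phi_NR(x, y):
    """Exact NR function, using the exact projection (the mu -> 0 limit of P_mu)."""
    lam1, lam2, e1, e2 = soc_spectral(x - y)
    return x - (max(0.0, lam1) * e1 + max(0.0, lam2) * e2)

# The gap vanishes as mu decreases, consistent with ||Phi_mu - Phi_NR|| <= gamma * mu.
x, y = np.array([1.0, 2.0, -1.0]), np.array([0.5, -1.0, 0.3])
for mu in (1.0, 0.1, 0.01, 0.001):
    print(mu, np.linalg.norm(Phi_mu(x, y, mu) - Phi_NR(x, y)))
```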


Now we look into the KKT conditions (3) of the problem (1). Let

L(x, y, z) = ∇f(x) − A^T y + ∇g(x)z,   H(u) := [ µ ; Ax − b ; L(x, y, z) ; Φµ(z, −g(x)) ]

and

Ψµ(u) := ½‖H(u)‖² = ½‖Φµ(z, −g(x))‖² + ½‖L(x, y, z)‖² + ½‖Ax − b‖² + ½µ²,

where u = (µ, x^T, y^T, z^T)^T ∈ R+ × R^n × R^m × R^l. It is known that Ψµ(u) serves as a smoothing function of the merit function ΨNR, which means that the KKT conditions (3) are equivalent to the following unconstrained minimization problem via the merit function approach:

min Ψµ(u) := ½‖H(u)‖².   (6)

Theorem 3.1 (a) Let Pµ be defined by (5). Then, ∇Pµ(z) and I − ∇Pµ(z) are positive definite for any µ > 0 and z ∈ Rl .

(b) Let Ψµ be defined as in (6). Then, the smoothed merit function Ψµ is continuously differentiable everywhere with ∇Ψµ(u) = ∇H(u)H(u) where

∇H(u) = [ 1, 0, 0, −(∂Pµ(z+g(x))/∂µ)^T ;
          0, A^T, ∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x), −∇xPµ(z+g(x)) ;
          0, 0, −A, 0 ;
          0, 0, ∇g(x)^T, I − ∇zPµ(z+g(x)) ].   (7)

Proof. From the proof of [16, Proposition 3.1], it is clear that ∇Pµ(z) and I − ∇Pµ(z) are positive definite for any µ > 0 and z ∈ R^l. With the help of the definition of the smoothed merit function Ψµ, part (b) follows easily from the chain rule. □

In light of the main ideas for constructing artificial neural networks (see [6] for details), we establish a specific first-order ordinary differential equation, i.e., an artificial neural network. More specifically, based on the gradient of the objective function Ψµ in the minimization problem (6), we propose the following neural network for solving the KKT system (3) of the nonlinear SOCP (1):

du(t)/dt = −ρ∇Ψµ(u),   u(t0) = u0,   (8)


where ρ > 0 is a time scaling factor. In fact, if τ = ρt, then du(t)/dt = ρ du(τ)/dτ. Hence, it follows from (8) that du(τ)/dτ = −∇Ψµ(u). In view of this, for simplicity and convenience, we set ρ = 1 in this paper. Indeed, the dynamic system (8) can be realized by an architecture with the cone projection function shown in Figure 1. Moreover, the architecture of this artificial neural network is categorized as a "recurrent" neural network according to the classifications of artificial neural networks in [6, Chapter 2.3.1]. The circuit for (8) requires n + m + l + 1 integrators, n processors for ∇f(x), l processors for g(x), ln processors for ∇g(x), (l + 1)²n processors for ∇²f(x) + Σ_{i=1}^{l} ∇²gi(x), 1 processor for Φµ, 1 processor for ∂Pµ/∂µ, n processors for ∇xPµ, l processors for ∇zPµ, n² + 4mn + 3ln + l² + l connection weights, and some summers.

Figure 1: Block diagram of the proposed neural network with smoothed NR function
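As a concrete, much simplified software illustration of the dynamic system (8), the sketch below treats the smallest possible instance of (1): K = K^1 = R+, f(x) = (x − 3)², g(x) = −x, and no equality constraint, so that u = (µ, x, z) ∈ R³. This toy instance and all names are ours; the gradient ∇Ψµ is approximated by finite differences rather than assembled from ∇H(u) as in (7), and explicit Euler steps play the role of the integrators in the circuit of Figure 1.

```python
import numpy as np

# Smallest instance of (1): K = K^1 = R_+, f(x) = (x - 3)^2, g(x) = -x, no equality constraint.
# For K^1, P_mu(a) = mu * g_hat(a / mu) = (sqrt(a^2 + 4*mu^2) + a) / 2 smooths max(0, a).

def Psi(u):
    mu, x, z = u
    L = 2.0 * (x - 3.0) - z                                    # L(x, z) = f'(x) + g'(x) z
    P = 0.5 * (np.sqrt((z - x) ** 2 + 4.0 * mu ** 2) + (z - x))
    Phi = z - P                                                # Phi_mu(z, -g(x)) = z - P_mu(z + g(x))
    return 0.5 * (mu ** 2 + L ** 2 + Phi ** 2)                 # Psi_mu(u) = 0.5 * ||H(u)||^2

def grad_Psi(u, eps=1e-7):
    # Forward differences stand in for the analytic gradient grad Psi_mu(u) = grad H(u) H(u).
    base = Psi(u)
    return np.array([(Psi(u + eps * e) - base) / eps for e in np.eye(3)])

# Explicit Euler steps mimic the integrators in the circuit realization of (8).
rho, h = 1.0, 1e-3
u = np.array([1.0, 0.0, 0.0])            # u = (mu, x, z); initial smoothing parameter mu = 1
for _ in range(50000):
    u = u - h * rho * grad_Psi(u)
print(u)   # expected to settle near the KKT point (mu, x, z) = (0, 3, 0)
```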

4 Stability analysis

In this section, in order to study the stability issues of the proposed neural network (8) for solving the problem (1), we first make an assumption that will be required in our subsequent analysis.

Assumption 4.1 (a) The problem (1) satisfies the Slater’s condition.


(b) The matrix ∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x) is positive definite for each x.

Here we say a few words about Assumption 4.1(a) and (b). Slater's condition is a standard condition that is widely used in the optimization field. Assumption 4.1(b) seems stringent at first glance. Indeed, since f and the gi's are twice continuously differentiable and convex functions on R^n, if at least one of these functions is strictly convex, then Assumption 4.1(b) is guaranteed.

Lemma 4.1 (a) For any u∗, we have

‖H(u) − H(u∗) − V(u − u∗)‖ = o(‖u − u∗‖) as u → u∗ and V ∈ ∂H(u),

where ∂H(u) denotes the Clarke generalized Jacobian of H at u.

(b) Under Assumption 4.1, ∇H(u)T is nonsingular for any u = (µ, x, y, z) ∈ R++ × Rn× Rm× Rl, where R++ denotes the set {µ | µ > 0}.

(c) Under Assumption 4.1, and with every V ∈ ∂P0(w) being a positive definite matrix, where ∂P0(w) denotes the Clarke generalized Jacobian of the projection function PK at w = z + g(x), every element

T ∈ ∂H(u) = { [ 1, 0, 0, −(∂Pµ(z+g(x))/∂µ)^T |_{µ=0} ;
                0, A^T, ∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x), −∇g(x)V^T ;
                0, 0, −A, 0 ;
                0, 0, ∇g(x)^T, I − V ]  :  V ∈ ∂P0(w) }

is nonsingular for any u = (0, x, y, z) ∈ {0} × R^n × R^m × R^l.

(d) Ψµ(u(t)) is nonincreasing with respect to t.

Proof. (a) This result follows directly from the definition of semismoothness of H, see [26] for more details.

(b) From the expression of ∇H(u) in Theorem 3.1, it follows that ∇H(u)^T is nonsingular if and only if the matrix

M := [ A, 0, 0 ;
       ∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x), −A^T, ∇g(x) ;
       −∇xPµ(z+g(x))^T, 0, (I − ∇zPµ(z+g(x)))^T ]

is nonsingular. Suppose v = (x, y, z) ∈ R^n × R^m × R^l. To show the nonsingularity of M, it is enough to prove that

Mv = 0 ⟹ x = 0, y = 0 and z = 0.

Because −∇xPµ(z+g(x))^T = −∇Pµ(w)^T ∇g(x)^T, where w = z + g(x) ∈ R^l, from Mv = 0 we have

Ax = 0,   (∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x)) x − A^T y + ∇g(x) z = 0   (9)

and

−∇Pµ(w)^T ∇g(x)^T x + (I − ∇Pµ(w))^T z = 0.   (10)

From (9), it follows that

x^T (∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x)) x + (∇g(x)^T x)^T z = 0.   (11)

Moreover, equation (10) and Theorem 3.1 yield

∇g(x)^T x = (∇Pµ(w)^T)⁻¹ (I − ∇Pµ(w))^T z.   (12)

Combining (11)-(12) and Theorem 3.1, under Assumption 4.1, it is not hard to obtain that x = 0 and z = 0. Looking at equation (9) again, since A has full row rank, we have y = 0. Therefore, ∇H(u)^T is nonsingular.

(c) The proof of Part (c) is similar to that of Part (b); one only needs to replace ∇Pµ(w) with V ∈ ∂P0(w).

(d) According to the definition of Ψµ(u(t)) and Eq. (8), it is clear that

dΨµ(u(t))/dt = ∇Ψµ(u(t))^T du(t)/dt = −ρ‖∇Ψµ(u(t))‖² ≤ 0.

Consequently, Ψµ(u(t)) is nonincreasing with respect to t. □

Proposition 4.1 Assume that ∇H(u) is nonsingular for any u ∈ R+ × R^n × R^m × R^l. Then,

(a) (x∗, y∗, z∗) satisfies the KKT conditions (3) if and only if (0, x∗, y∗, z∗) is an equilibrium point of the neural network (8);

(b) under Slater's condition, x∗ is a solution to the problem (1) if and only if (0, x∗, y∗, z∗) is an equilibrium point of the neural network (8).

Proof. (a) Because Φ0 = ΦNR when µ = 0, it follows that (x∗, y∗, z∗) satisfies the KKT conditions (3) if and only if H(u∗) = 0, where u∗ = (0, x∗, y∗, z∗)^T. Since ∇H(u∗) is nonsingular, we have that H(u∗) = 0 if and only if ∇Ψµ(u∗) = ∇H(u∗)H(u∗) = 0. Thus, the desired result follows.

(b) Under Slater's condition, it is well known that x∗ is a solution of the problem (1) if and only if there exist y∗ and z∗ such that (x∗, y∗, z∗) satisfies the KKT conditions (3). Hence, according to Part (a), it follows that (0, x∗, y∗, z∗) is an equilibrium point of the neural network (8). □

The next result addresses the existence and uniqueness of the solution trajectory of the neural network (8).

Theorem 4.1 (a) For any initial point u0 = u(t0), there exists a unique continuous maximal solution u(t), t ∈ [t0, τ), of the neural network (8), where [t0, τ) is the maximal interval of existence.

(b) If the level set L(u0) := {u | Ψµ(u) ≤ Ψµ(u0)} is bounded, then τ can be extended to +∞.

Proof. This proof is exactly the same as that of [27, Proposition 3.4], and is therefore omitted here. □

Theorem 4.2 Assume that ∇H(u) is nonsingular and that u∗ is an isolated equilibrium point of the neural network (8). Then u∗ is Lyapunov stable for the neural network (8).

Proof. From Lemma 2.3, we only need to argue that there exists a Lyapunov function over some neighborhood Ω of u∗. Now, we consider the smoothed merit function

Ψµ(u) = ½‖H(u)‖².

Since u∗ is an isolated equilibrium point of (8), there is a neighborhood Ω of u∗ such that

∇Ψµ(u∗) = 0 and ∇Ψµ(u(t)) ≠ 0, ∀u(t) ∈ Ω\{u∗}.

By the nonsingularity of ∇H(u∗) and the definition of Ψµ, it is easy to obtain that Ψµ(u∗) = 0. From the definition of Ψµ, we claim that Ψµ(u(t)) > 0 for any u(t) ∈ Ω\{u∗}. Suppose not, namely, Ψµ(u(t)) = 0 for some u(t) ∈ Ω\{u∗}. It follows that H(u(t)) = 0. Then, we have ∇Ψµ(u(t)) = 0, which contradicts the fact that u∗ is an isolated equilibrium point of (8). Thus, Ψµ(u(t)) > 0 for any u(t) ∈ Ω\{u∗}.

Furthermore, by the proof of Lemma 4.1(d), we know that for any u(t) ∈ Ω

dΨµ(u(t))/dt = ∇Ψµ(u(t))^T du(t)/dt = −ρ‖∇Ψµ(u(t))‖² ≤ 0.   (13)

Consequently, the function Ψµ is a Lyapunov function over Ω. This implies that u∗ is Lyapunov stable for the neural network (8). □


Theorem 4.3 Assume that ∇H(u) is nonsingular and that u∗ is an isolated equilibrium point of the neural network (8). Then u∗ is asymptotically stable for the neural network (8).

Proof. From the proof of Theorem 4.2, we consider again the Lyapunov function Ψµ. By Lemma 2.3 again, we only need to verify that the Lyapunov function Ψµ over some neighborhood Ω of u∗ satisfies

dΨµ(u(t))/dt < 0, ∀u(t) ∈ Ω\{u∗}.   (14)

In fact, by using (13) and the definition of an isolated equilibrium point, it is not hard to check that equation (14) is true. Hence, u∗ is asymptotically stable. □

Theorem 4.4 Assume that u∗ is an isolated equilibrium point of the neural network (8). If ∇H(u)^T is nonsingular for any u = (µ, x, y, z) ∈ R+ × R^n × R^m × R^l, then u∗ is exponentially stable for the neural network (8).

Proof. From the definition of H(u), we know that H is semismooth. Hence, by Lemma 4.1, we have

H(u) = H(u∗) + ∇H(u(t))^T (u − u∗) + o(‖u − u∗‖), ∀u ∈ Ω\{u∗},   (15)

where ∇H(u(t))^T ∈ ∂H(u(t)) and Ω is a neighborhood of u∗. Now, we let

g(u(t)) = ‖u(t) − u∗‖²,   t ∈ [t0, ∞).

Then, we have

dg(u(t))/dt = 2(u(t) − u∗)^T du(t)/dt = −2ρ(u(t) − u∗)^T ∇Ψµ(u(t)) = −2ρ(u(t) − u∗)^T ∇H(u(t))H(u(t)).   (16)

Substituting Eq. (15) into Eq. (16), and noting that H(u∗) = 0 because u∗ is an equilibrium point with ∇H(u∗) nonsingular, yields

dg(u(t))/dt = −2ρ(u(t) − u∗)^T ∇H(u(t)) ( H(u∗) + ∇H(u(t))^T (u(t) − u∗) + o(‖u(t) − u∗‖) )
            = −2ρ(u(t) − u∗)^T ∇H(u(t))∇H(u(t))^T (u(t) − u∗) + o(‖u(t) − u∗‖²).

Because ∇H(u) and ∇H(u)^T are nonsingular, we claim that there exists a κ > 0 such that

(u(t) − u∗)^T ∇H(u(t))∇H(u(t))^T (u(t) − u∗) ≥ κ‖u(t) − u∗‖².   (17)

Otherwise, if (u(t) − u∗)^T ∇H(u(t))∇H(u(t))^T (u(t) − u∗) = 0, it implies that

∇H(u(t))^T (u(t) − u∗) = 0.

Indeed, from the nonsingularity of ∇H(u(t)), we then have u(t) − u∗ = 0, i.e., u(t) = u∗, which contradicts the assumption that u∗ is an isolated equilibrium point. Consequently, there exists a κ > 0 such that (17) holds. Moreover, for the term o(‖u(t) − u∗‖²), there is an ε > 0 such that o(‖u(t) − u∗‖²) ≤ ε‖u(t) − u∗‖². Hence,

dg(u(t))/dt ≤ (−2ρκ + ε)‖u(t) − u∗‖² = (−2ρκ + ε) g(u(t)).

This implies

g(u(t)) ≤ e^{(−2ρκ+ε)t} g(u(t0)),

which means

‖u(t) − u∗‖ ≤ e^{((−2ρκ+ε)/2)t} ‖u(t0) − u∗‖.

Thus, u∗ is exponentially stable for the neural network (8). □

To show the contribution of this paper, we present in Table 1 the stability comparisons of the neural networks considered in the current paper, [20], and [27]. More convergence comparisons will be presented in the next section. Generally speaking, we establish three stabilities for the proposed neural network, whereas not all three stabilities are guaranteed for the similar neural networks studied in [20, 27]. Why do we choose to investigate the proposed neural network? Indeed, in [20], two neural networks based on the NR function and the FB function are considered, and neither attains exponential stability. Our target optimization problem is a wider class than the one studied in [20]. In contrast, the smoothed FB function has good performance, as shown in [27], but not all three stabilities are established there even though exponential stability is good enough. In light of these observations, we decided to look into the smoothed NR function for our problem, which turns out to have better theoretical results. We summarize the differences in problem format, dynamical model, and stability in Table 1.

5 Numerical examples

In order to demonstrate the effectiveness of the proposed neural network, in this section we test several examples with our neural network (8). The numerical implementation is coded in Matlab 7.0, and the ordinary differential equation solver adopted here is ode23, which uses the Runge-Kutta (2,3) formula. As mentioned earlier, the parameter ρ is set to 1. How is µ chosen initially? From Theorem 4.2 in the previous section, the solution converges from any initial point, so we set the initial µ = 1 in the codes (and of course µ → 0, as seen in the trajectory behavior).


Table 1: Stability comparisons of the neural networks considered in the current paper, [20], and [27]

|              | current paper | [20] | [27] |
|--------------|---------------|------|------|
| problem      | min f(x) s.t. Ax = b, −g(x) ∈ K | min f(x) s.t. Ax = b, x ∈ K | ⟨F(x), y − x⟩ ≥ 0, ∀y ∈ C, where C = {x | h(x) = 0, −g(x) ∈ K} |
| ODE based on | smoothed NR-function | NR-function and FB-function | NR-function and smoothed FB-function |
| stability    | Lyapunov (smoothed NR), asymptotic (smoothed NR), exponential (smoothed NR) | Lyapunov (NR), Lyapunov (FB), asymptotic (FB) | Lyapunov (NR), asymptotic (NR), exponential (smoothed FB) |

Example 5.1 Consider the following nonlinear convex programming problem:

min exp((x1 − 3)² + x2² + (x3 − 1)² + (x4 − 2)² + (x5 + 1)²)
s.t. x ∈ K^5.

Here, we denote f(x) := exp((x1 − 3)² + x2² + (x3 − 1)² + (x4 − 2)² + (x5 + 1)²) and g(x) = −x. Then, we compute

L(x, z) = ∇f(x) + ∇g(x)z = 2f(x) [ x1 − 3 ; x2 ; x3 − 1 ; x4 − 2 ; x5 + 1 ] − [ z1 ; z2 ; z3 ; z4 ; z5 ].   (18)

Moreover, let x := (x1, x̄) ∈ R × R⁴ and z := (z1, z̄) ∈ R × R⁴. Then, the element z − x can be expressed as

z − x := λ1e1 + λ2e2,

where λi = z1 − x1 + (−1)^i ‖z̄ − x̄‖ and ei = ½(1, (−1)^i (z̄ − x̄)/‖z̄ − x̄‖) (i = 1, 2) if z̄ − x̄ ≠ 0; otherwise ei = ½(1, (−1)^i w) with w being any vector in R⁴ satisfying ‖w‖ = 1. This implies that

Φµ(z, −g(x)) = z − Pµ(z + g(x)) = z − ( µĝ(λ1/µ)e1 + µĝ(λ2/µ)e2 )   (19)

with ĝ(a) = (√(a² + 4) + a)/2 or ĝ(a) = ln(eᵃ + 1). Therefore, by Eqs. (18) and (19), we obtain the expression of H(u) as follows:

H(u) = [ µ ; L(x, z) ; Φµ(z, −g(x)) ].

This problem has an optimal solution x∗ = (3, 0, 1, 2, −1)^T. We use the proposed neural network to solve the above problem; the trajectories are depicted in Figure 2. All simulation results show that the state trajectories from any initial point always converge to the optimal solution x∗.


Figure 2: Transient behavior of the neural network with the smoothed NR function in Example 5.1.
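For readers who wish to reproduce a run of the kind shown in Figure 2 outside of Matlab, here is a hedged Python sketch of Example 5.1: it assembles H(u) for the f above and g(x) = −x, and integrates (8) with SciPy's RK23 method (the analogue of the ode23 solver used in the paper), starting from µ = 1 and ρ = 1 as in Section 5. The starting point, the finite-difference gradient, and all function names are our own choices, so the paper's trajectories are not claimed to be reproduced exactly.

```python
import numpy as np
from scipy.integrate import solve_ivp

def g_hat(a):
    return 0.5 * (np.sqrt(a * a + 4.0) + a)      # first example of the smoothing function after (4)

def P_mu(z, mu):
    """Smoothed projection (5) onto K^5 via the spectral decomposition of z."""
    z1, z2 = z[0], z[1:]
    n = np.linalg.norm(z2)
    if n > 0:
        w = z2 / n
    else:
        w = np.zeros(len(z2)); w[:1] = 1.0        # any unit vector when z2 = 0
    e1, e2 = 0.5 * np.concatenate(([1.0], -w)), 0.5 * np.concatenate(([1.0], w))
    return mu * g_hat((z1 - n) / mu) * e1 + mu * g_hat((z1 + n) / mu) * e2

def grad_f(x):
    d = x - np.array([3.0, 0.0, 1.0, 2.0, -1.0])
    return 2.0 * np.exp(d @ d) * d                # gradient of f(x) = exp(sum of squares)

def H(u):
    mu, x, z = u[0], u[1:6], u[6:11]
    L = grad_f(x) - z                             # L(x, z) = grad f(x) + grad g(x) z with g(x) = -x
    Phi = z - P_mu(z - x, mu)                     # Phi_mu(z, -g(x)) = z - P_mu(z + g(x))
    return np.concatenate(([mu], L, Phi))

def Psi(u):
    h = H(u)
    return 0.5 * h @ h

def grad_Psi(u, eps=1e-7):
    base, g = Psi(u), np.zeros_like(u)
    for i in range(len(u)):                       # forward-difference approximation of grad Psi_mu
        up = u.copy(); up[i] += eps
        g[i] = (Psi(up) - base) / eps
    return g

rho = 1.0
x0 = np.array([2.0, 0.5, 0.5, 1.5, -0.5])         # arbitrary starting point (not from the paper)
u0 = np.concatenate(([1.0], x0, np.zeros(5)))      # initial smoothing parameter mu = 1
sol = solve_ivp(lambda t, u: -rho * grad_Psi(u), (0.0, 50.0), u0, method="RK23")
print(sol.y[1:6, -1])   # x-component of the final state; expected to approach (3, 0, 1, 2, -1)
```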


Example 5.2 Consider the following nonlinear second-order cone programming problem:

min f(x) = x1² + 2x2² + 2x1x2 − 10x1 − 12x2
s.t. −g(x) = [ 8 − x1 + 3x2 ; 3 − x1² − 2x1 + 2x2 − x2² ] ∈ K².

For this example, we compute that

L(x, z) = ∇f(x) + ∇g(x)z = [ 2x1 + 2x2 − 10 ; 4x2 + 2x1 − 12 ] − [ −z1 − 2(x1 + 1)z2 ; 3z1 + 2(1 − x2)z2 ].   (20)

Since

z + g(x) = [ z1 − 8 + x1 − 3x2 ; z2 − 3 + x1² + 2x1 − 2x2 + x2² ],

the vector z + g(x) can be expressed as

z + g(x) := λ1e1 + λ2e2,

where λi = z1 − 8 + x1 − 3x2 + (−1)^i |z2 − 3 + x1² + 2x1 − 2x2 + x2²| and

ei = ½ ( 1, (−1)^i (z2 − 3 + x1² + 2x1 − 2x2 + x2²)/|z2 − 3 + x1² + 2x1 − 2x2 + x2²| )   (i = 1, 2)

if z2 − 3 + x1² + 2x1 − 2x2 + x2² ≠ 0; otherwise, ei = ½(1, (−1)^i w) with w being any element in R satisfying |w| = 1. This implies that

Φµ(z, −g(x)) = z − Pµ(z + g(x)) = z − ( µĝ(λ1/µ)e1 + µĝ(λ2/µ)e2 )   (21)

with ĝ(a) = (√(a² + 4) + a)/2 or ĝ(a) = ln(eᵃ + 1). Therefore, by (20) and (21), we obtain the expression of H(u) as follows:

H(u) = [ µ ; L(x, z) ; Φµ(z, −g(x)) ].

This problem has an approximate solution x∗ = (2.8308, 1.6375)^T. Note that the objective function is convex and the Hessian matrix ∇²f(x) is positive definite. Using the proposed neural network, we can easily obtain the approximate solution x∗ of the above problem; see Figure 3.



Figure 3: Transient behavior of the neural network with the smoothed NR function in Example 5.2.


Example 5.3 Consider the following nonlinear convex program with second-order cone constraints [18]:

min exp(x1 − x3) + 3(2x1 − x2)⁴ + √(1 + (3x2 + 5x3)²)
s.t. −g(x) = Ax + b ∈ K²,
     x ∈ K³,

where

A := [ 4, 6, 3 ; −1, 7, −5 ],   b := [ −1 ; 2 ].

For this example, f(x) := exp(x1 − x3) + 3(2x1 − x2)⁴ + √(1 + (3x2 + 5x3)²), from which we have

L(x, y, z) = ∇f(x) + ∇g(x)y − z
= [ exp(x1 − x3) + 24(2x1 − x2)³ ;
    −12(2x1 − x2)³ + 3(3x2 + 5x3)/√(1 + (3x2 + 5x3)²) ;
    −exp(x1 − x3) + 5(3x2 + 5x3)/√(1 + (3x2 + 5x3)²) ]
− [ 4y1 − y2 ; 6y1 + 7y2 ; 3y1 − 5y2 ] − [ z1 ; z2 ; z3 ].   (22)

Since

[ y + g(x) ; z − x ] = [ y1 − 4x1 − 6x2 − 3x3 + 1 ; y2 + x1 − 7x2 + 5x3 − 2 ; z1 − x1 ; z2 − x2 ; z3 − x3 ],

y + g(x) and z − x can be expressed as follows, respectively:

y + g(x) := λ1e1 + λ2e2   and   z − x := κ1f1 + κ2f2,

where λi = y1 − 4x1 − 6x2 − 3x3 + 1 + (−1)^i |y2 + x1 − 7x2 + 5x3 − 2| and

ei = ½ ( 1, (−1)^i (y2 + x1 − 7x2 + 5x3 − 2)/|y2 + x1 − 7x2 + 5x3 − 2| )   (i = 1, 2)

if y2 + x1 − 7x2 + 5x3 − 2 ≠ 0; otherwise, ei = ½(1, (−1)^i w) with w being any element in R satisfying |w| = 1. Moreover, let x := (x1, x̄) ∈ R × R² and z := (z1, z̄) ∈ R × R². Then, we obtain that κi = z1 − x1 + (−1)^i ‖z̄ − x̄‖ and fi = ½(1, (−1)^i (z̄ − x̄)/‖z̄ − x̄‖) (i = 1, 2) if z̄ − x̄ ≠ 0; otherwise fi = ½(1, (−1)^i υ) with υ being any vector in R² satisfying ‖υ‖ = 1. This implies that

Φµ(·) = [ y − Pµ(y + g(x)) ; z − Pµ(z − x) ]
      = [ y − ( µĝ(λ1/µ)e1 + µĝ(λ2/µ)e2 ) ; z − ( µĝ(κ1/µ)f1 + µĝ(κ2/µ)f2 ) ]   (23)

with ĝ(a) = (√(a² + 4) + a)/2 or ĝ(a) = ln(eᵃ + 1). Therefore, by (22) and (23), we obtain the expression of H(u) as below:

H(u) = [ µ ; L(x, y, z) ; Φµ(·) ].

The approximate solution to this problem is x∗ = (0.2324, −0.07309, 0.2206)^T. The trajectories are depicted in Figure 4. We want to point out one thing: Assumption 4.1(a) and (b) are not both satisfied in this example. More specifically, Assumption 4.1(a) is satisfied, which is obvious. However, ∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x) is not positive definite for every x. To see this, we compute

∇²f(x) + ∇²g1(x) + ⋯ + ∇²gl(x) = ∇²f(x)
= [ exp(x1 − x3) + 144(2x1 − x2)²,   −72(2x1 − x2)²,                                      −exp(x1 − x3) ;
    −72(2x1 − x2)²,                   36(2x1 − x2)² + 9/(1 + (3x2 + 5x3)²)^{3/2},          15/(1 + (3x2 + 5x3)²)^{3/2} ;
    −exp(x1 − x3),                    15/(1 + (3x2 + 5x3)²)^{3/2},                          exp(x1 − x3) + 25/(1 + (3x2 + 5x3)²)^{3/2} ],

which is not positive definite when 2x1 − x2 = 0 (because its determinant equals zero there). Hence, ∇H(u) is not guaranteed to be nonsingular and the theorems in Section 4 do not apply to this example. Nonetheless, the solution trajectory does converge, as depicted in Figure 4. This phenomenon also occurs when the example is solved by the second neural network studied in [27] (the stability is not guaranteed theoretically, but the solution trajectory does converge).
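The singularity claim can be verified numerically; the short sketch below (ours) evaluates the Hessian entries displayed above at an arbitrary point with 2x1 − x2 = 0 and confirms that the matrix is singular there, hence not positive definite.

```python
import numpy as np

def hess_f(x):
    """Hessian of f(x) = exp(x1-x3) + 3(2x1-x2)^4 + sqrt(1+(3x2+5x3)^2) in Example 5.3."""
    x1, x2, x3 = x
    a = np.exp(x1 - x3)
    q = 2.0 * x1 - x2
    c = (1.0 + (3.0 * x2 + 5.0 * x3) ** 2) ** (-1.5)
    return np.array([
        [a + 144.0 * q ** 2, -72.0 * q ** 2,            -a],
        [-72.0 * q ** 2,      36.0 * q ** 2 + 9.0 * c,  15.0 * c],
        [-a,                  15.0 * c,                  a + 25.0 * c],
    ])

x = np.array([0.5, 1.0, 0.2])            # arbitrary point with 2*x1 - x2 = 0
Hf = hess_f(x)
print(np.linalg.det(Hf))                  # numerically zero: the Hessian is singular here
print(np.linalg.eigvalsh(Hf))             # smallest eigenvalue is (numerically) zero
```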

In addition, for Example 5.3, we also compare three neural networks based on the FB function (considered in [20]), the smoothed NR function (considered in this paper), and the smoothed FB function (considered in [27]), respectively. Although Example 5.3 can be solved by all three neural networks, the neural network based on the FB function does not behave as well as the other two; see Figure 5.



Figure 4: Transient behavior of the neural network with the smoothed NR function in Example 5.3.



Figure 5: Comparisons of three neural networks based on the FB function, smoothed NR function, and smoothed FB function in Example 5.3.


Example 5.4 Consider the following nonlinear second-order cone programming problem:

min f(x) = exp(x1x3) + 3(x1 + x2)² − √(1 + (2x2 − x3)²) + ½x4² + ½x5²
s.t. h(x) = −24.51x1 + 58x2 − 16.67x3 − x4 − 3x5 + 11 = 0,
     −g1(x) = [ 3x1³ + 2x2 − x3 + 5x3² ; −5x1³ + 4x2 − 2x3 + 10x3³ ; x3 ] ∈ K³,
     −g2(x) = [ x4 ; 3x5 ] ∈ K².

For this example, we compute

L(x, y, z) = ∇f(x) + ∇g1(x)y + ∇g2(x)z
= [ x3 exp(x1x3) + 6(x1 + x2) ;
    6(x1 + x2) − 2(2x2 − x3)/√(1 + (2x2 − x3)²) ;
    x1 exp(x1x3) + (2x2 − x3)/√(1 + (2x2 − x3)²) ;
    x4 ;
    x5 ]
− [ 9x1²y1 − 15x1²y2 ;
    2y1 + 4y2 ;
    (10x3 − 1)y1 + (30x3² − 2)y2 + y3 ;
    z1 ;
    3z2 ].   (24)

Moreover, we know

[ y + g1(x) ; z + g2(x) ] = [ y1 − 3x1³ − 2x2 + x3 − 5x3² ; y2 + 5x1³ − 4x2 + 2x3 − 10x3³ ; y3 − x3 ; z1 − x4 ; z2 − 3x5 ].

Let y + g1(x) := (u, ū) ∈ R × R² and z + g2(x) := (v, v̄) ∈ R × R, where

u = y1 − 3x1³ − 2x2 + x3 − 5x3²,   ū = [ y2 + 5x1³ − 4x2 + 2x3 − 10x3³ ; y3 − x3 ]

and

v = z1 − x4,   v̄ = z2 − 3x5.

Then, y + g1(x) and z + g2(x) can be expressed as follows:

y + g1(x) := λ1e1 + λ2e2   and   z + g2(x) := κ1f1 + κ2f2,

where λi = u + (−1)^i ‖ū‖, ei = ½(1, (−1)^i ū/‖ū‖) and κi = v + (−1)^i |v̄|, fi = ½(1, (−1)^i v̄/|v̄|) (i = 1, 2) if ū ≠ 0 and v̄ ≠ 0; otherwise ei = ½(1, (−1)^i w) with w being any element in R² satisfying ‖w‖ = 1, and fi = ½(1, (−1)^i υ) with υ being any element in R satisfying |υ| = 1. This implies that

Φµ(·) = [ y − Pµ(y + g1(x)) ; z − Pµ(z + g2(x)) ]
      = [ y − ( µĝ(λ1/µ)e1 + µĝ(λ2/µ)e2 ) ; z − ( µĝ(κ1/µ)f1 + µĝ(κ2/µ)f2 ) ]   (25)

with ĝ(a) = (√(a² + 4) + a)/2 or ĝ(a) = ln(eᵃ + 1). Consequently, by Eqs. (24) and (25), we obtain the expression of H(u) as follows:

H(u) = [ µ ; L(x, y, z) ; Φµ(·) ].

This problem has an approximate solution x∗ = (−0.0903, −0.0449, 0.6366, 0.0001, 0)^T, and Figure 6 displays the trajectories obtained by using the proposed neural network. All simulation results show that the state trajectories from any initial point always converge to the solution x∗. As observed, the neural network with the smoothed NR function has a fast convergence rate.

Furthermore, we also compare the two neural networks based on the smoothed NR function (considered in this paper) and the smoothed FB function (considered in [27]) on Example 5.4. Note that Example 5.4 cannot be solved by the neural networks studied in [20]. Both neural networks possess exponential stability as shown in Table 1, which means the solution trajectories have the same order of convergence. This phenomenon is reflected in Figure 7.

6 Conclusion

In this paper, we have studied a neural network approach for solving general nonlinear convex programs with second-order cone constraints. The proposed neural network is based on the gradient of the merit function derived from the smoothed NR complementarity function. In particular, from Definition 2.1 and Lemma 2.3, we know that there exists a stable equilibrium point u∗ as long as there exists a Lyapunov function over some neighborhood of u∗, and the stable equilibrium point u∗ is exactly the solution of the problem under consideration. In addition to studying the existence and convergence of the solution trajectory of the neural network, this paper shows that the merit function is a Lyapunov function.



Figure 6: Transient behavior of the neural network with the smoothed NR function in Example 5.4.



Figure 7: Comparisons of two neural networks based on the smoothed NR function and the smoothed FB function in Example 5.4.


Furthermore, the equilibrium point of the neural network (8) is stable, including stability in the sense of Lyapunov, asymptotic stability, and exponential stability under suitable conditions.

Indeed, this paper can be viewed as a follow-up of [20] and [27] because we establish three stabilities for the proposed neural network, whereas not all three stabilities are guaranteed for the similar neural networks studied in [20, 27]. The numerical experiments presented in this study demonstrate the efficiency of the proposed neural network.

Acknowledgments The authors are grateful to the reviewers for their valuable suggestions, which have considerably improved the paper.

References

[1] F. Alizadeh and D. Goldfarb, Second-order cone programming, Mathematical Programming, 95(2003), 3–52.

[2] D.P. Bertsekas, Nonlinear programming, Athena Scientific, 1995.

[3] Y.-H. Chen and S.-C. Fang, Solving convex programming problems with equality constraints by neural networks, Computers and Mathematics with Applications, 36(1998), 41–68.

[4] J.-S. Chen, C.-H. Ko, and S.-H. Pan, A neural network based on the generalized Fischer-Burmeister function for nonlinear complementarity problems, Information Sciences, 180(2010), 697–711.

[5] J.-S. Chen and P. Tseng, An unconstrained smooth minimization reformulation of the second-order cone complementarity problem, Mathematical Programming, 104(2005), 293–327.

[6] A. Cichocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, New York: John Wiley, 1993.

[7] C. Dang, Y. Leung, X. Gao, and K. Chen, Neural networks for nonlinear and mixed complementarity problems and their applications, Neural Networks, 17(2004), 271–283.

[8] S. Effati, A. Ghomashi, and A.R. Nazemi, Application of projection neural network in solving convex programming problems, Applied Mathematics and Computation, 188(2007), 1103–1114.

[9] S. Effati and A.R. Nazemi, Neural network and its application for solving linear and quadratic programming problems, Applied Mathematics and Computation, 172(2006), 305–331.

[10] F. Facchinei and J. Pang, Finite-dimensional Variational Inequalities and Complementarity Problems, Springer, New York, 2003.

[11] M. Fukushima, Z.-Q. Luo, and P. Tseng, Smoothing functions for second-order-cone complementarity problems, SIAM Journal on Optimization, 12(2002), 436–460.

[12] Q. Han, L.-Z. Liao, H. Qi, and L. Qi, Stability analysis of gradient-based neural networks for optimization problems, Journal of Global Optimization, 19(2001), 363–381.

[13] J.J. Hopfield and D.W. Tank, Neural computation of decisions in optimization problems, Biological Cybernetics, 52(1985), 141–152.

[14] X. Hu and J. Wang, A recurrent neural network for solving nonlinear convex programs subject to linear constraints, IEEE Transactions on Neural Networks, 16(2005), 379–386.

[15] X. Hu and J. Wang, A recurrent neural network for solving a class of general variational inequalities, IEEE Transactions on Systems, Man, and Cybernetics-B, 37(2007), 528–539.

[16] S. Hayashi, N. Yamashita, and M. Fukushima, A combined smoothing and regularization method for monotone second-order cone complementarity problems, SIAM Journal on Optimization, 15(2005), 593–615.

[17] N. Kalouptsidis, Signal Processing Systems: Theory and Design, Wiley, New York, 1997.

[18] C. Kanzow, I. Ferenczi, and M. Fukushima, On the local convergence of semismooth Newton methods for linear and nonlinear second-order cone programs without strict complementarity, SIAM Journal on Optimization, 20(2009), 297–320.

[19] M.P. Kennedy and L.O. Chua, Neural networks for nonlinear programming, IEEE Transactions on Circuits and Systems, 35(1988), 554–562.

[20] C.-H. Ko, J.-S. Chen, and C.-Y. Yang, Recurrent neural networks for solving second-order cone programs, Neurocomputing, 74(2011), 3646-3653.

[21] L.-Z. Liao, H. Qi, and L. Qi, Solving nonlinear complementarity problems with neural networks: a reformulation method approach, Journal of Computational and Applied Mathematics, 131(2001), 342–359.
