A neural network based on the generalized FB function for nonlinear convex programs with second-order cone constraints
Xinhe Miao (a,1), Jein-Shan Chen (b,*,2), Chun-Hsu Ko (c)

a Department of Mathematics, School of Science, Tianjin University, Tianjin 300072, PR China
b Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
c Department of Electrical Engineering, I-Shou University, Kaohsiung 840, Taiwan

* Corresponding author. E-mail addresses: xinhemiao@tju.edu.cn (X. Miao), jschen@math.ntnu.edu.tw (J.-S. Chen), chko@isu.edu.tw (C.-H. Ko).
1 The author's work is also supported by the National Young Natural Science Foundation (No. 11101302) and the National Natural Science Foundation of China (No. 11471241).
2 The author's work is supported by the Ministry of Science and Technology, Taiwan.
Article info

Article history: Received 24 May 2015; received in revised form 26 January 2016; accepted 22 April 2016; available online 10 May 2016. Communicated by Ligang Wu.

Keywords: Neural network; Generalized FB function; Stability; Second-order cone

Abstract
This paper proposes a neural network approach to efficiently solve nonlinear convex programs with second-order cone constraints. The neural network model is designed via the generalized Fischer–Burmeister function associated with the second-order cone. We study the existence and convergence of the trajectory of the considered neural network. Moreover, we also establish its stability properties, including Lyapunov stability, asymptotic stability, and exponential stability. Illustrative examples give a further demonstration of the effectiveness of the proposed neural network. Numerical performance under perturbation of the parameter and numerical comparisons with other neural network models are also provided. Overall, our model performs better than the two comparative methods.
1. Introduction
The nonlinear convex program with second-order cone constraints (abbreviated as SOCP throughout this paper) is given as below:

$$\min\ f(x) \quad \text{s.t.}\quad Ax = b,\quad g(x) \in \mathcal{K}, \tag{1}$$

where $A \in \mathbb{R}^{m\times n}$ has full row rank, $b \in \mathbb{R}^m$, $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable convex mapping, and $g = [g_1, \dots, g_l]^T : \mathbb{R}^n \to \mathbb{R}^l$ is a twice continuously differentiable $\mathcal{K}$-convex mapping, which means that for every $x, y \in \mathbb{R}^n$ and $t \in [0,1]$,

$$t\,g(x) + (1-t)\,g(y) - g\big(tx + (1-t)y\big) \in \mathcal{K},$$

and $\mathcal{K}$ is a Cartesian product of second-order cones (also called Lorentz cones), expressed as

$$\mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_N}$$

with $N, n_1, \dots, n_N \ge 1$, $n_1 + \cdots + n_N = l$, and

$$\mathcal{K}^{n_i} := \big\{ (x_{i1}, x_{i2}, \dots, x_{in_i})^T \in \mathbb{R}^{n_i} \;\big|\; \|(x_{i2}, \dots, x_{in_i})\| \le x_{i1} \big\}.$$

Here $\|\cdot\|$ denotes the Euclidean norm and $\mathcal{K}^1$ means the set of nonnegative reals $\mathbb{R}_+$.
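As a concrete illustration of this constraint structure, the following sketch (ours, added for exposition; the function names and tolerance are assumptions, not from the paper) checks membership in a Cartesian product of second-order cones block by block:

```python
import numpy as np

def in_soc(x, tol=1e-12):
    """Check x = (x1, x2) in K^{n_i}, i.e., ||x2|| <= x1 (n_i = 1 means x1 >= 0)."""
    x = np.asarray(x, dtype=float)
    if x.size == 1:
        return x[0] >= -tol
    return np.linalg.norm(x[1:]) <= x[0] + tol

def in_product_cone(x, block_sizes):
    """Check membership in K = K^{n_1} x ... x K^{n_N} with n_1 + ... + n_N = l."""
    x = np.asarray(x, dtype=float)
    offsets = np.cumsum([0] + list(block_sizes))
    return all(in_soc(x[offsets[i]:offsets[i + 1]]) for i in range(len(block_sizes)))

# Example with l = 5 and K = K^3 x K^2:
print(in_product_cone([2.0, 1.0, 1.0, 1.0, 0.5], [3, 2]))  # True:  ||(1,1)|| <= 2 and |0.5| <= 1
print(in_product_cone([1.0, 1.0, 1.0, 1.0, 0.5], [3, 2]))  # False: ||(1,1)|| > 1
```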
It is well known that second-order cone programming problems (SOCP) have a wide range of applications in engineering, control and management science [1,23,26]. For example, the grasping force optimization problem for a multi-fingered robot hand can be recast as an SOCP; see [23, Example 5.3] for real application data. For solving SOCP (1), there also exist many traditional optimization methods such as the interior point method [24], the merit function method [7,18], the Newton method [21,31], the projection method [12], and so on. For a survey of solution methods, refer to [4]. In this paper, we are interested in the so-called neural network approach for solving SOCP (1), which is substantially different from the traditional ones. The main motivation for employing this approach arises from the following reason. In many applications, for example, force analysis in robot grasping and control applications, real-time solutions are usually imperative. For such applications, traditional optimization methods may not be competent due to the problem's stringent requirement on computational time.
Compared with the traditional optimization methods, the neural network method has its advantage in dealing with real-time optimization problems. Hence, many continuous-time neural networks for constrained optimization problems have been widely developed. At present, there are many results on neural networks for solving real-time optimization problems; see [6,9,11,14,16,17,19,22,23,25,27,33,35–39,41] and the references therein.
Neural networks date back to McCulloch and Pitts' pioneering work half a century ago, and they were first introduced to the optimization domain in the 1980s [15,20,34]. The essence of the neural network method for solving optimization problems [8] is to establish a nonnegative Lyapunov function (also called an energy function) and a dynamic system which represents an artificial neural network. Indeed, the dynamic system is usually in the form of first-order ordinary differential equations. When utilizing neural networks for solving optimization problems, we are usually much more interested in the stability of networks starting from an arbitrary point. It is expected that for any initial point, the neural network will approach its equilibrium point, which corresponds to the solution of the considered optimization problem.
In fact, the neural network approach for solving SOCP has been studied in [23,29]. More specifically, the SOCP studied in [23] is

$$\min\ f(x) \quad \text{s.t.}\quad Ax = b,\quad x \in \mathcal{K}, \tag{2}$$

which is a special case of problem (1). Two kinds of neural networks were proposed in [23]. One is based on the cone projection function (also called the NR function), for which only Lyapunov stability is guaranteed. The other is based on the Fischer–Burmeister function (FB function), for which both Lyapunov stability and asymptotic stability are proved. Moreover, when solving problem (2), it was observed that the neural network based on the NR function has better performance than the one based on the FB function in most cases (except for some oscillating cases). However, compared to the FB function, the NR function has a remarkable drawback, namely its nondifferentiability. In light of this phenomenon, the authors employed a neural network model based on the "smoothed" NR function for solving the more general SOCP (1); see [29]. In addition, all three kinds of stability, including Lyapunov stability, asymptotic stability, and exponential stability, are proved for that model in [29]. Moreover, a neural network based on the generalized FB function can be regulated appropriately by perturbing its parameter p. A previous study [6] has demonstrated its efficiency for solving nonlinear complementarity problems, which also motivates us to further explore its numerical performance for solving the SOCP. In view of the above discussions and the existing literature, we wish to keep tracking the performance of neural networks based on the "smoothed" FB function, which is the main motivation of this paper. In particular, we consider a more general function, called the generalized FB function. In other words, we propose a neural network model based on the "smoothed" generalized FB function, which includes the FB function as a special case. With this function, we perturb the parameter p associated with the generalized FB function to see how it affects the numerical performance. In addition, all three aforementioned types of stability are guaranteed for our proposed neural network. A numerical comparison between the model based on the smoothed NR function and the model based on the smoothed generalized FB function is provided.
The organization of this paper is as follows. In Section 2, we introduce concepts about stability and recall some background materials. In Section 3, based on the smoothed generalized FB function, the neural network architecture is proposed for solving problem (1). In Section 4, we study the convergence and stability results of the proposed neural network. Simulation results of the new method are reported in Section 5. Section 6 gives the conclusion of this paper.
2. Preliminaries
For a given mapping $H : \mathbb{R}^n \to \mathbb{R}^n$, consider the first-order ordinary differential equation (ODE)

$$\frac{du}{dt} = H(u(t)), \quad u(t_0) = u_0 \in \mathbb{R}^n. \tag{3}$$
In general, the issues of most concern regarding ODE (3) are the existence and uniqueness of the solution, as well as the convergence of the solution trajectory. To this end, concepts regarding equilibrium points and stability are needed. Below, we recall background materials about ODE (3) as well as stability concepts for its solution. All these materials can be found in standard ODE textbooks, e.g., [30].
Lemma 2.1 (Existence and uniqueness). Assume that $H : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous mapping. Then, for arbitrary $t_0 \ge 0$ and $u_0 \in \mathbb{R}^n$, there exists a local solution $u(t)$, $t \in [t_0, \tau)$ to (3) for some $\tau > t_0$. Furthermore, if $H$ is locally Lipschitz continuous at $u_0$, then the solution is unique; and if $H$ is Lipschitz continuous on $\mathbb{R}^n$, then $\tau$ can be extended to $\infty$.

Proof. See [25, Theorem 2.5]. □
Remark 2.1. For Eq. (3), if a local solution defined on $[t_0, \tau)$ cannot be extended to a local solution on a larger interval $[t_0, \tau_1)$, where $\tau_1 > \tau$, then it is called a maximal solution, and the interval $[t_0, \tau)$ is the maximal interval of existence. It is obvious that an arbitrary local solution has an extension to a maximal one.
Lemma 2.2. Let $H : \mathbb{R}^n \to \mathbb{R}^n$ be a continuous mapping. If $u(t)$ is a maximal solution and $[t_0, \tau)$ is the maximal interval of existence associated with $u_0$ and $\tau < +\infty$, then $\lim_{t \uparrow \tau} \|u(t)\| = +\infty$.

Proof. See [25, Theorem 2.6]. □
For ODE (3), a point $u^* \in \mathbb{R}^n$ is called an equilibrium point of (3) if $H(u^*) = 0$. If there is a neighborhood $\Omega \subseteq \mathbb{R}^n$ of $u^*$ such that $H(u^*) = 0$ and $H(u) \ne 0$ for any $u \in \Omega \setminus \{u^*\}$, then $u^*$ is called an isolated equilibrium point. The following are definitions of various notions of stability. More related materials can be found in [25,30,33].
Definition 2.1. Let $u(t)$ be a solution of ODE (3).

(a) An isolated equilibrium point $u^*$ is Lyapunov stable (or stable in the sense of Lyapunov) if for any $u_0 = u(t_0)$ and $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$\|u_0 - u^*\| < \delta \;\Longrightarrow\; \|u(t) - u^*\| < \varepsilon \quad \text{for } t \ge t_0.$$

(b) Under the condition that an isolated equilibrium point $u^*$ is Lyapunov stable, $u^*$ is said to be asymptotically stable if it has the property that $\|u_0 - u^*\| < \delta$ implies $u(t) \to u^*$ as $t \to \infty$.

(c) An isolated equilibrium point $u^*$ is exponentially stable for (3) if there exist $\omega < 0$, $\kappa > 0$ and $\delta > 0$ such that an arbitrary solution $u(t)$ of ODE (3) with the initial condition $u(t_0) = u_0$, $\|u_0 - u^*\| < \delta$, is defined on $[0, \infty)$ and satisfies

$$\|u(t) - u^*\| \le \kappa\, e^{\omega t}\, \|u(t_0) - u^*\|, \quad t \ge t_0.$$

Definition 2.2 (Lyapunov function). Let $\Omega \subseteq \mathbb{R}^n$ be an open neighborhood of $\bar u$. A continuously differentiable function $g : \mathbb{R}^n \to \mathbb{R}$ is said to be a Lyapunov function (or energy function) at the state $\bar u$ (over the set $\Omega$) for Eq. (3) if

$$\begin{cases} g(\bar u) = 0, \\ g(u) > 0 \quad \forall u \in \Omega \setminus \{\bar u\}, \\ \dfrac{dg(u(t))}{dt} \le 0 \quad \forall u \in \Omega. \end{cases}$$
From the above definition, it is obvious that exponential stability implies asymptotic stability. The next results show the relationship between stability and a Lyapunov function; see [5,10,40].
Lemma 2.3.

(a) An isolated equilibrium point $u^*$ is Lyapunov stable if there exists a Lyapunov function over some neighborhood $\Omega$ of $u^*$.

(b) An isolated equilibrium point $u^*$ is asymptotically stable if there exists a Lyapunov function over some neighborhood $\Omega$ of $u^*$ satisfying

$$\frac{dg(u(t))}{dt} < 0 \quad \forall u \in \Omega \setminus \{u^*\}.$$

To close this section, we briefly review some properties of the spectral factorization with respect to the second-order cone, which will be used in the subsequent analysis. Spectral factorization is one of the basic concepts in Jordan algebra; for more details, see [7,13,31]. For any vector $z = (z_1, z_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ $(l \ge 2)$, its spectral factorization with respect to the second-order cone $\mathcal{K}$ is defined as

$$z = \lambda_1(z)\, e_1(z) + \lambda_2(z)\, e_2(z),$$

where $\lambda_i(z) = z_1 + (-1)^i \|z_2\|$ $(i = 1, 2)$ are called the spectral values of $z$, and

$$e_i(z) = \begin{cases} \dfrac{1}{2}\Big(1,\ (-1)^i \dfrac{z_2}{\|z_2\|}\Big), & z_2 \ne 0, \\[6pt] \dfrac{1}{2}\big(1,\ (-1)^i w\big), & z_2 = 0, \end{cases}$$

with $w \in \mathbb{R}^{l-1}$ being an arbitrary element such that $\|w\| = 1$. Here $e_1(z)$ and $e_2(z)$ are called the spectral vectors of $z$. It is well known that for any $z \in \mathbb{R}^l$, we have

$$\lambda_1(z) \le \lambda_2(z) \quad \text{and} \quad \lambda_1(z) \ge 0 \iff z \in \mathcal{K}.$$

Note that any closed convex cone always induces a partial order.
Suppose that the partial order "$\succeq_{\mathcal{K}}$" is induced by $\mathcal{K}$, i.e., $z \succeq_{\mathcal{K}} 0 \iff z \in \mathcal{K}$. The following technical lemma is helpful for the subsequent analysis.

Lemma 2.4 (Pan et al. [32, Lemma 2.2]). For any $0 \le r \le 1$ and $z \succeq_{\mathcal{K}} w \succeq_{\mathcal{K}} 0$, we have $z^r \succeq_{\mathcal{K}} w^r$.
For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ and $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the Jordan product $x \circ y$ is defined as

$$x \circ y = \begin{bmatrix} \langle x, y\rangle \\ x_1 y_2 + y_1 x_2 \end{bmatrix}.$$
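Both definitions translate directly into code. The following sketch (ours, not from the paper) implements the spectral factorization and the Jordan product, and verifies the reconstruction $z = \lambda_1(z)e_1(z) + \lambda_2(z)e_2(z)$ together with the membership test $\lambda_1(z) \ge 0$:

```python
import numpy as np

def spec_decomp(z):
    """Spectral values and vectors of z = (z1, z2) with respect to the SOC."""
    z = np.asarray(z, dtype=float)
    z1, z2 = z[0], z[1:]
    nz2 = np.linalg.norm(z2)
    w = z2 / nz2 if nz2 > 0 else np.eye(z2.size)[0]       # arbitrary unit w when z2 = 0
    lam = np.array([z1 - nz2, z1 + nz2])                  # lambda_i = z1 + (-1)^i ||z2||
    e = [0.5 * np.concatenate(([1.0], (-1.0)**i * w)) for i in (1, 2)]
    return lam, e

def jordan_product(x, y):
    """x o y = (<x, y>, x1*y2 + y1*x2)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

z = np.array([3.0, 1.0, 2.0])
lam, e = spec_decomp(z)
print(np.allclose(lam[0] * e[0] + lam[1] * e[1], z))  # True: z = lam1*e1 + lam2*e2
print(lam[0] >= 0)                                    # True, so z is in K
print(jordan_product(e[0], e[1]))                     # ~0: the spectral vectors annihilate
```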
Based on the Jordan product and the spectral factorization with respect to the second-order cone $\mathcal{K}$, we often employ the following vector-valued functions (also called SOC-functions) associated with $|t|^p$ $(t \in \mathbb{R})$ and $\sqrt[p]{t}$ $(t \ge 0)$, respectively, which are expressed as

$$|x|^p = |\lambda_1(x)|^p\, e_1(x) + |\lambda_2(x)|^p\, e_2(x) \quad \forall x \in \mathbb{R}^n,$$
$$\sqrt[p]{x} = \sqrt[p]{\lambda_1(x)}\, e_1(x) + \sqrt[p]{\lambda_2(x)}\, e_2(x) \quad \forall x \in \mathcal{K}.$$

In light of the expressions of $|x|^p$ and $\sqrt[p]{x}$ above, for any $p > 1$, the generalized FB function $\phi_p : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ associated with the second-order cone is defined in [32] as

$$\phi_p(x, y) := \sqrt[p]{|x|^p + |y|^p} - (x + y).$$

In particular, the authors of [32] have shown that $\phi_p(x,y)$ is an SOC-complementarity function, i.e.,

$$\phi_p(x, y) = 0 \iff x \in \mathcal{K},\ y \in \mathcal{K} \text{ and } \langle x, y\rangle = 0.$$

This also yields that the function $\Phi_p : \mathbb{R}^n \to \mathbb{R}$ given by

$$\Phi_p(x) := \tfrac{1}{2}\|\phi_p(x, F(x))\|^2$$

(with $F : \mathbb{R}^n \to \mathbb{R}^n$ the mapping defining the complementarity problem) is a merit function for second-order cone complementarity problems.
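To make these SOC-functions concrete, the following sketch (ours; it reuses spec_decomp from the previous sketch) evaluates $|x|^p$, $\sqrt[p]{x}$ and $\phi_p$, and illustrates the complementarity characterization on a boundary pair with $\langle x, y\rangle = 0$:

```python
import numpy as np
# reuses spec_decomp from the previous sketch

def soc_abs_pow(x, p):
    """|x|^p = |lam1|^p e1 + |lam2|^p e2."""
    lam, e = spec_decomp(x)
    return abs(lam[0])**p * e[0] + abs(lam[1])**p * e[1]

def soc_root(x, p):
    """p-th root of x in K: lam1^{1/p} e1 + lam2^{1/p} e2 (requires lam_i >= 0)."""
    lam, e = spec_decomp(x)
    return lam[0]**(1.0 / p) * e[0] + lam[1]**(1.0 / p) * e[1]

def phi_p(x, y, p):
    """Generalized FB function: (|x|^p + |y|^p)^{1/p} - (x + y)."""
    return soc_root(soc_abs_pow(x, p) + soc_abs_pow(y, p), p) - (np.asarray(x) + np.asarray(y))

# x and y on the boundary of K with <x, y> = 0:
x = np.array([1.0, 1.0, 0.0])
y = np.array([1.0, -1.0, 0.0])
print(np.allclose(phi_p(x, y, p=2.5), 0.0, atol=1e-10))  # True: x, y in K and <x,y> = 0
```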
Moreover, the following conclusions are obtained in [32].

Lemma 2.5. For any $p > 1$, let $w := w(x,y) := |x|^p + |y|^p$, $t = t(x,y) := \sqrt[p]{w}$, and denote $g^{soc}(x) := |x|^p$. Then $t(x,y)$ is continuously differentiable at any $(x,y)$ with $w \in \mathrm{int}(\mathcal{K})$, and

$$\nabla_x t(x,y) = \nabla g^{soc}(x)\, \nabla g^{soc}(t)^{-1} \quad \text{and} \quad \nabla_y t(x,y) = \nabla g^{soc}(y)\, \nabla g^{soc}(t)^{-1},$$

where

$$\nabla g^{soc}(x) = \begin{cases} p\, \mathrm{sign}(x_1)\, |x_1|^{p-1}\, I, & x_2 = 0, \\[6pt] \begin{bmatrix} b(x) & c(x)\, \bar x_2^T \\ c(x)\, \bar x_2 & a(x)\, I + (b(x) - a(x))\, \bar x_2 \bar x_2^T \end{bmatrix}, & x_2 \ne 0, \end{cases}$$

with $\bar x_2 = \dfrac{x_2}{\|x_2\|}$ and

$$a(x) = \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{\lambda_2(x) - \lambda_1(x)},$$
$$b(x) = \frac{p}{2}\Big( \mathrm{sign}(\lambda_2(x))\, |\lambda_2(x)|^{p-1} + \mathrm{sign}(\lambda_1(x))\, |\lambda_1(x)|^{p-1} \Big),$$
$$c(x) = \frac{p}{2}\Big( \mathrm{sign}(\lambda_2(x))\, |\lambda_2(x)|^{p-1} - \mathrm{sign}(\lambda_1(x))\, |\lambda_1(x)|^{p-1} \Big).$$

Proof. See [32, Lemma 3.2]. □
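The gradient matrix in Lemma 2.5 can be assembled directly from $a(x)$, $b(x)$ and $c(x)$. The sketch below (ours, reusing spec_decomp from the earlier sketch) builds $\nabla g^{soc}(x)$ and sanity-checks it against a finite-difference Jacobian of $|x|^p$:

```python
import numpy as np
# reuses spec_decomp from the earlier sketch

def grad_gsoc(x, p):
    """Gradient matrix of g^soc(x) = |x|^p as given in Lemma 2.5."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if np.linalg.norm(x[1:]) == 0.0:
        return p * np.sign(x[0]) * abs(x[0])**(p - 1) * np.eye(n)
    (l1, l2), _ = spec_decomp(x)
    xb = x[1:] / np.linalg.norm(x[1:])                    # \bar{x}_2 = x_2 / ||x_2||
    a = (abs(l2)**p - abs(l1)**p) / (l2 - l1)
    b = 0.5 * p * (np.sign(l2) * abs(l2)**(p - 1) + np.sign(l1) * abs(l1)**(p - 1))
    c = 0.5 * p * (np.sign(l2) * abs(l2)**(p - 1) - np.sign(l1) * abs(l1)**(p - 1))
    M = np.empty((n, n))
    M[0, 0] = b
    M[0, 1:] = c * xb
    M[1:, 0] = c * xb
    M[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(xb, xb)
    return M

def gsoc(x, p):
    """|x|^p via the spectral definition, for a finite-difference cross-check."""
    lam, e = spec_decomp(x)
    return abs(lam[0])**p * e[0] + abs(lam[1])**p * e[1]

x0, p0, h = np.array([0.8, 0.3, -0.5]), 2.0, 1e-6
J = np.array([(gsoc(x0 + h * np.eye(3)[i], p0) - gsoc(x0 - h * np.eye(3)[i], p0)) / (2 * h)
              for i in range(3)])
print(np.allclose(J, grad_gsoc(x0, p0), atol=1e-5))       # True (the matrix is symmetric)
```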
Lemma 2.6. Let $\Phi_p$ be defined as $\Phi_p(x,y) := \frac{1}{2}\|\phi_p(x,y)\|^2$, and denote $w(x,y) := |x|^p + |y|^p$ and $g^{soc}(x) := |x|^p$. Then the function $\Phi_p$ for $p \in (1,4)$ is differentiable everywhere. Moreover, for any $x, y \in \mathbb{R}^n$,

(a) if $w(x,y) = 0$, then $\nabla_x \Phi_p(x,y) = \nabla_y \Phi_p(x,y) = 0$;

(b) if $w(x,y) \in \mathrm{int}(\mathcal{K})$, then

$$\nabla_x \Phi_p(x,y) = \big( \nabla g^{soc}(x)\, \nabla g^{soc}(t)^{-1} - I \big)\, \phi_p(x,y) = \big( \nabla g^{soc}(x) - \nabla g^{soc}(t) \big)\, \nabla g^{soc}(t)^{-1}\, \phi_p(x,y),$$
$$\nabla_y \Phi_p(x,y) = \big( \nabla g^{soc}(y)\, \nabla g^{soc}(t)^{-1} - I \big)\, \phi_p(x,y) = \big( \nabla g^{soc}(y) - \nabla g^{soc}(t) \big)\, \nabla g^{soc}(t)^{-1}\, \phi_p(x,y);$$

(c) if $w(x,y) \in \partial\mathcal{K} \setminus \{0\}$, where $\partial\mathcal{K}$ means the boundary of $\mathcal{K}$, then

$$\nabla_x \Phi_p(x,y) = \left( \frac{\mathrm{sign}(x_1)\, |x_1|^{p-1}}{\big(\sqrt[p]{|x_1|^p + |y_1|^p}\big)^{p-1}} - 1 \right) \phi_p(x,y),$$
$$\nabla_y \Phi_p(x,y) = \left( \frac{\mathrm{sign}(y_1)\, |y_1|^{p-1}}{\big(\sqrt[p]{|x_1|^p + |y_1|^p}\big)^{p-1}} - 1 \right) \phi_p(x,y).$$

Proof. See [32, Proposition 3.1]. □
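The interior-case gradient formula in Lemma 2.6(b) can be cross-checked numerically. In the following sketch (ours, reusing phi_p, soc_abs_pow, soc_root and grad_gsoc from the sketches above), the analytic gradient agrees with a finite-difference gradient of $\Phi_p$ at a point where $w(x,y) \in \mathrm{int}(\mathcal{K})$:

```python
import numpy as np
# reuses phi_p, soc_abs_pow, soc_root and grad_gsoc from the sketches above

def Phi(x, y, p):
    r = phi_p(x, y, p)
    return 0.5 * r @ r

p = 2.5
x, y = np.array([1.5, 0.2, -0.3]), np.array([0.7, 0.1, 0.4])
t = soc_root(soc_abs_pow(x, p) + soc_abs_pow(y, p), p)   # here w(x,y) is interior to K
# analytic gradient from the interior case of Lemma 2.6(b)
gx = (grad_gsoc(x, p) @ np.linalg.inv(grad_gsoc(t, p)) - np.eye(3)) @ phi_p(x, y, p)
# central finite-difference gradient of Phi_p in x
h, e = 1e-6, np.eye(3)
fd = np.array([(Phi(x + h * e[i], y, p) - Phi(x - h * e[i], y, p)) / (2 * h) for i in range(3)])
print(np.allclose(gx, fd, atol=1e-5))                    # True
```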
3. Generalized FB neural network model
In this section, we explain how we form the dynamic system. As mentioned earlier, the key points for the neural network method lie in constructing the dynamic system and the Lyapunov function. To this end, we first look into the KKT conditions of problem (1), which are presented as below:

$$\begin{cases} \nabla f(x) - A^T y - \nabla g(x)\, z = 0, \\ z \in \mathcal{K},\quad g(x) \in \mathcal{K},\quad z^T g(x) = 0, \\ Ax - b = 0, \end{cases} \tag{4}$$

where $y \in \mathbb{R}^m$, $z \in \mathbb{R}^l$, and $\nabla g(x)$ denotes the gradient matrix of $g$. It is well known that if problem (1) satisfies Slater's condition, meaning that there exists a strictly feasible point for problem (1), i.e., there exists an $x \in \mathbb{R}^n$ such that $g(x) \in \mathrm{int}(\mathcal{K})$ and $Ax = b$, then for the nonlinear convex program (1), $x^*$ is a solution of problem (1) if and only if there exist $y^*$ and $z^*$ such that $(x^*, y^*, z^*)$ satisfies the KKT conditions (4); see [2]. Hence, we assume throughout this paper that problem (1) satisfies Slater's condition.
Lemma 3.1. For $z = (z_1, z_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ and $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ with $z \succeq_{\mathcal{K}} x$, we have $\lambda_i(z) \ge \lambda_i(x)$ for $i = 1, 2$.

Proof. Since $z \succeq_{\mathcal{K}} x$, we may express $z = x + y$, where $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ and $y = z - x \succeq_{\mathcal{K}} 0$. This implies $y_1 \ge \|y_2\|$ and

$$\lambda_1(z) = (x_1 + y_1) - \|x_2 + y_2\| \ge (x_1 + y_1) - \|x_2\| - \|y_2\| \ge x_1 - \|x_2\| = \lambda_1(x).$$

Furthermore, we have

$$\lambda_2(z) = (x_1 + y_1) + \|x_2 + y_2\| \ge (x_1 + y_1) + \big|\, \|x_2\| - \|y_2\| \,\big| = \begin{cases} x_1 + y_1 + \|x_2\| - \|y_2\|, & \text{if } \|x_2\| \ge \|y_2\|, \\ x_1 + y_1 - \|x_2\| + \|y_2\|, & \text{if } \|x_2\| < \|y_2\|, \end{cases}$$
$$\ge \begin{cases} x_1 + \|x_2\|, & \text{if } \|x_2\| \ge \|y_2\|, \\ x_1 + y_1, & \text{if } \|x_2\| < \|y_2\|, \end{cases} \quad \ge\; x_1 + \|x_2\| = \lambda_2(x),$$

which is the desired result. □
Lemma 3.2. Let $w := w(x,y) = |x|^p + |y|^p$, $t = t(x,y) := \sqrt[p]{w}$ and $g^{soc}(x) := |x|^p$. Then the following three matrices

$$\nabla g^{soc}(t) - \nabla g^{soc}(x), \quad \nabla g^{soc}(t) - \nabla g^{soc}(y), \quad \big(\nabla g^{soc}(t) - \nabla g^{soc}(x)\big)\big(\nabla g^{soc}(t) - \nabla g^{soc}(y)\big)$$

are all positive semidefinite for $p = \frac{n}{2}$ with $n \in \mathbb{N}$.
Proof. From the expression of $\nabla g^{soc}(x)$ in Lemma 2.5 and the proof of [32, Lemma 3.2], we know that the eigenvalues of $\nabla g^{soc}(x)$ for $x_2 \ne 0$ are

$$b(x) - c(x), \quad a(x), \dots, a(x), \quad \text{and} \quad b(x) + c(x).$$

Let $w =: (w_1, w_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$. Then applying [32, Lemma 3.1] gives

$$w_1 = \frac{|\lambda_2(x)|^p + |\lambda_1(x)|^p}{2} + \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2}, \qquad w_2 = \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{2}\, \bar x_2 + \frac{|\lambda_2(y)|^p - |\lambda_1(y)|^p}{2}\, \bar y_2,$$

where $\bar x_2 = \frac{x_2}{\|x_2\|}$ if $x_2 \ne 0$, and otherwise $\bar x_2$ is an arbitrary vector in $\mathbb{R}^{n-1}$ satisfying $\|\bar x_2\| = 1$; the same convention applies to $\bar y_2$. We proceed with the proof by discussing two cases: $w_2 = 0$ or $w_2 \ne 0$.

Case 1: For $w_2 = 0$, we have $\nabla g^{soc}(t) = p\big(\sqrt[p]{w_1}\big)^{p-1} I$, where

$$w_1 = \frac{|\lambda_2(x)|^p + |\lambda_1(x)|^p}{2} + \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2}. \tag{5}$$

Under the condition $w_2 = 0$, there are the following two subcases.

(i) If $x_2 = 0$, then $w_1 = |x_1|^p + \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2}$, which implies that $p\big(\sqrt[p]{w_1}\big)^{p-1} \ge p\,\mathrm{sign}(x_1)|x_1|^{p-1}$. Hence, we see that the matrix $\nabla g^{soc}(t) - \nabla g^{soc}(x)$ is positive semidefinite. Indeed, if $x \ne 0$, then $\nabla g^{soc}(t) - \nabla g^{soc}(x)$ is positive definite.

(ii) If $x_2 \ne 0$, it follows from $w_2 = 0$ that

$$\left| \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{2} \right| = \left| \frac{|\lambda_2(y)|^p - |\lambda_1(y)|^p}{2} \right|. \tag{6}$$

We want to prove that the matrix $\nabla g^{soc}(t) - \nabla g^{soc}(x)$ is positive semidefinite. It is sufficient to show that

$$p\big(\sqrt[p]{w_1}\big)^{p-1} \ge \max\big\{ b(x) - c(x),\ a(x),\ b(x) + c(x) \big\}.$$

It is obvious that $p(\sqrt[p]{w_1})^{p-1} - (b(x) - c(x)) > 0$ when $\lambda_1(x) < 0$. When $\lambda_1(x) \ge 0$, using (5) and $\lambda_2(x) \ge \lambda_1(x)$, we have

$$p\big(\sqrt[p]{w_1}\big)^{p-1} - \big(b(x) - c(x)\big) \ge p\Big(\sqrt[p]{|\lambda_1(x)|^p}\Big)^{p-1} - p\,\mathrm{sign}(\lambda_1(x))|\lambda_1(x)|^{p-1} \ge 0.$$

Next, we verify that $p(\sqrt[p]{w_1})^{p-1} - a(x) \ge 0$. For $|\lambda_1(x)| \ge |\lambda_2(x)|$, it is clear that $p(\sqrt[p]{w_1})^{p-1} - a(x) \ge 0$. For $|\lambda_1(x)| < |\lambda_2(x)|$, it follows from $\lambda_2(x) \ge \lambda_1(x)$ that $x_1 > 0$, which yields

$$a(x) = \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{\lambda_2(x) - \lambda_1(x)} \le \frac{\lambda_2(x)^p - |\lambda_1(x)|^p}{\lambda_2(x) - |\lambda_1(x)|}.$$

Let $p = \frac{n}{m}$ $(n, m \in \mathbb{N})$, $a = \lambda_2(x)^{1/m}$ and $b = |\lambda_1(x)|^{1/m}$. From $p > 1$, it follows that $n > m$. Then we have $0 \le b < a$ and

$$a(x) \le \frac{a^n - b^n}{a^m - b^m} = \frac{a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1}}{a^{m-1} + a^{m-2}b + \cdots + ab^{m-2} + b^{m-1}}.$$

Now, letting $f(v) = \frac{a^n - v^n}{a^m - v^m}$ with $v \in [0, a)$, we obtain

$$f'(v) = \frac{-n v^{n-1}(a^m - v^m) + m v^{m-1}(a^n - v^n)}{(a^m - v^m)^2}.$$

In addition, it follows from $f'(v) = 0$ that

$$\frac{a^n - v^n}{a^m - v^m} = \frac{n}{m}\, v^{n-m}.$$

Since $f(0) = \frac{a^n}{a^m} = a^{n-m}$ at $v = 0$ and $f(v) \to \frac{n}{m}a^{n-m}$ as $v \to a$, it is easy to verify that $f(b) \le \frac{n}{m}a^{n-m}$ for $0 \le b < a$, i.e.,

$$\frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{\lambda_2(x) - \lambda_1(x)} \le p\,|\lambda_2(x)|^{p-1}.$$

Hence, we have

$$p\big(\sqrt[p]{w_1}\big)^{p-1} - a(x) \ge p\Big(\sqrt[p]{\max\{|\lambda_2(x)|^p, |\lambda_1(x)|^p\} + \min\{|\lambda_2(y)|^p, |\lambda_1(y)|^p\}}\Big)^{p-1} - \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{\lambda_2(x) - \lambda_1(x)} \ge p\big(\sqrt[p]{\lambda_2(x)^p}\big)^{p-1} - p\,|\lambda_2(x)|^{p-1} \ge 0,$$

where the first inequality holds due to (6). Lastly, we also see that

$$p\big(\sqrt[p]{w_1}\big)^{p-1} - \big(b(x) + c(x)\big) \ge p\Big(\sqrt[p]{\max\{|\lambda_2(x)|^p, |\lambda_1(x)|^p\} + \min\{|\lambda_2(y)|^p, |\lambda_1(y)|^p\}}\Big)^{p-1} - p\,\mathrm{sign}(\lambda_2(x))|\lambda_2(x)|^{p-1}$$
$$\ge p\Big(\sqrt[p]{\max\{|\lambda_2(x)|^p, |\lambda_1(x)|^p\}}\Big)^{p-1} - p\,\mathrm{sign}(\lambda_2(x))|\lambda_2(x)|^{p-1} \ge 0.$$

To sum up, in the case $x_2 \ne 0$ we have proved that the matrix $\nabla g^{soc}(t) - \nabla g^{soc}(x)$ is positive semidefinite.

Case 2: For $w_2 \ne 0$, from the expression of $t(x,y)$ and the properties of the spectral values of the vector-valued function $|x|^p$ with $p = \frac{n}{2}$ for $n \in \mathbb{N}$, all the eigenvalues of the matrix $\nabla g^{soc}(t)$ satisfy

$$b(t) - c(t) \le a(t) \le b(t) + c(t). \tag{7}$$

When $x_2 = 0$, we note that

$$b(t) - c(t) - p\,\mathrm{sign}(x_1)|x_1|^{p-1} = p\big(\sqrt[p]{\lambda_1(w)}\big)^{p-1} - p\,\mathrm{sign}(x_1)|x_1|^{p-1}$$
$$= p\left( \frac{|\lambda_2(x)|^p + |\lambda_1(x)|^p}{2} + \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2} - \left\| \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{2}\,\bar x_2 + \frac{|\lambda_2(y)|^p - |\lambda_1(y)|^p}{2}\,\bar y_2 \right\| \right)^{\frac{p-1}{p}} - p\,\mathrm{sign}(x_1)|x_1|^{p-1}$$
$$\ge p\,|x_1|^{p-1} - p\,\mathrm{sign}(x_1)|x_1|^{p-1} \ge 0,$$

where $\bar y_2$ denotes $\bar y_2 = \frac{y_2}{\|y_2\|}$ when $y_2 \ne 0$, and otherwise $\bar y_2$ is an arbitrary vector in $\mathbb{R}^{n-1}$ satisfying $\|\bar y_2\| = 1$. Now, applying the relation of the eigenvalues in (7), we have

$$b(t) + c(t) \ge a(t) \ge b(t) - c(t) \ge p\,\mathrm{sign}(x_1)|x_1|^{p-1},$$

which implies that the matrix $\nabla g^{soc}(t) - \nabla g^{soc}(x)$ is positive semidefinite.

When $x_2 \ne 0$, we also note that

$$b(t) - c(t) - \big(b(x) - c(x)\big) = p\big(\sqrt[p]{\lambda_1(w)}\big)^{p-1} - p\,\mathrm{sign}(\lambda_1(x))|\lambda_1(x)|^{p-1}.$$

For $\lambda_1(x) < 0$, it is clear that $b(t) - c(t) - (b(x) - c(x)) \ge 0$. For $\lambda_1(x) \ge 0$, we have $\lambda_2(x) \ge \lambda_1(x) \ge 0$, which leads to

$$\lambda_1(w) = \frac{|\lambda_2(x)|^p + |\lambda_1(x)|^p}{2} + \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2} - \left\| \frac{|\lambda_2(x)|^p - |\lambda_1(x)|^p}{2}\,\bar x_2 + \frac{|\lambda_2(y)|^p - |\lambda_1(y)|^p}{2}\,\bar y_2 \right\|$$
$$\ge \left( \frac{|\lambda_2(x)|^p + |\lambda_1(x)|^p}{2} - \frac{\big| |\lambda_2(x)|^p - |\lambda_1(x)|^p \big|}{2} \right) + \left( \frac{|\lambda_2(y)|^p + |\lambda_1(y)|^p}{2} - \frac{\big| |\lambda_2(y)|^p - |\lambda_1(y)|^p \big|}{2} \right)$$
$$\ge |\lambda_1(x)|^p.$$

Thus, it follows that $b(t) - c(t) - (b(x) - c(x)) \ge 0$. Moreover, since $t \succeq_{\mathcal{K}} |x|$ (which follows from Lemma 2.4), by Lemma 3.1 and the eigenvalues of $|x|$ being $|\lambda_1(x)|$ and $|\lambda_2(x)|$, we have

$$\lambda_2(t) \ge \max\{|\lambda_1(x)|, |\lambda_2(x)|\} \quad \text{and} \quad \lambda_1(t) \ge \min\{|\lambda_1(x)|, |\lambda_2(x)|\}. \tag{8}$$

When $p = \frac{n}{2}$ with $n \in \mathbb{N}$, we have

$$a(t) - a(x) = \frac{\lambda_2(t)^{n/2} - \lambda_1(t)^{n/2}}{\lambda_2(t) - \lambda_1(t)} - \frac{|\lambda_2(x)|^{n/2} - |\lambda_1(x)|^{n/2}}{\lambda_2(x) - \lambda_1(x)}.$$

If $|\lambda_2(x)| < |\lambda_1(x)|$, it is obvious that $a(t) - a(x) \ge 0$. If $|\lambda_2(x)| \ge |\lambda_1(x)|$, in light of $\lambda_2(x) \ge \lambda_1(x)$, we obtain that $x_1 \ge 0$ and $\lambda_2(x) \ge 0$. Now, let

$$a := \lambda_2(t)^{1/2}, \quad b := \lambda_1(t)^{1/2}, \quad c := \lambda_2(x)^{1/2} \quad \text{and} \quad d := |\lambda_1(x)|^{1/2}.$$

Then we get

$$a(t) - a(x) = \frac{a^n - b^n}{a^2 - b^2} - \frac{c^n - d^n}{c^2 - d^2}$$
$$= \frac{(a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1})(c + d) - (a + b)(c^{n-1} + c^{n-2}d + \cdots + cd^{n-2} + d^{n-1})}{(a+b)(c+d)}$$
$$= \frac{a^{n-1}c + bc(a^{n-2} + a^{n-3}b + \cdots + ab^{n-3} + b^{n-2}) + ad(a^{n-2} + a^{n-3}b + \cdots + ab^{n-3} + b^{n-2}) + b^{n-1}d}{(a+b)(c+d)}$$
$$- \frac{ac^{n-1} + ad(c^{n-2} + c^{n-3}d + \cdots + cd^{n-3} + d^{n-2}) + bc(c^{n-2} + c^{n-3}d + \cdots + cd^{n-3} + d^{n-2}) + bd^{n-1}}{(a+b)(c+d)},$$

which together with (8) implies that $a \ge c$, $b \ge d \ge 0$ and $a(t) - a(x) \ge 0$. In addition, we also verify that

$$b(t) + c(t) - \big(b(x) + c(x)\big) = p\,\lambda_2(t)^{p-1} - p\,\mathrm{sign}(\lambda_2(x))|\lambda_2(x)|^{p-1} \ge 0.$$

Therefore, for any $x \in \mathbb{R}^n$, we have

$$x^T\big(\nabla g^{soc}(t) - \nabla g^{soc}(x)\big)x = x^T \nabla g^{soc}(t)\, x - x^T \nabla g^{soc}(x)\, x \ge 0,$$

since each of the eigenvalues $b(t) - c(t)$, $a(t)$ (with multiplicity $n - 2$) and $b(t) + c(t)$ of $\nabla g^{soc}(t)$ dominates the corresponding eigenvalue of $\nabla g^{soc}(x)$. This shows that the matrix $\nabla g^{soc}(t) - \nabla g^{soc}(x)$ is positive semidefinite.

With the same arguments, we can verify that the matrix $\nabla g^{soc}(t) - \nabla g^{soc}(y)$ is also positive semidefinite.

Finally, using the property of the eigenvalues of a product of symmetric matrices, i.e.,

$$\lambda_i(AB) \ge \lambda_i(A)\, \lambda_{\min}(B), \quad i = 1, \dots, n, \quad \forall A, B \in S^{n\times n},$$

where $S^{n\times n}$ denotes the set of $n \times n$ symmetric matrices, we easily obtain that the matrix $(\nabla g^{soc}(t) - \nabla g^{soc}(x))(\nabla g^{soc}(t) - \nabla g^{soc}(y))$ is also positive semidefinite. □
Remark 3.1. From the above proof of Lemma 3.2, when $x \ne 0$ and $y \ne 0$, the matrices $\nabla g^{soc}(t) - \nabla g^{soc}(x)$, $\nabla g^{soc}(t) - \nabla g^{soc}(y)$ and $(\nabla g^{soc}(t) - \nabla g^{soc}(x))(\nabla g^{soc}(t) - \nabla g^{soc}(y))$ are all positive definite.
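Before relying on Lemma 3.2 in the stability analysis, it can be probed numerically. The sketch below (ours, reusing spec_decomp, soc_abs_pow, soc_root and grad_gsoc from the earlier sketches) samples random points and checks the eigenvalues of $\nabla g^{soc}(t) - \nabla g^{soc}(x)$; here $p = 5/2$, which has the form $\frac{n}{2}$ required by the lemma:

```python
import numpy as np
# reuses spec_decomp, soc_abs_pow, soc_root, grad_gsoc from the earlier sketches

rng = np.random.default_rng(0)
p = 2.5                                       # p = n/2 with n = 5
for _ in range(200):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    w = soc_abs_pow(x, p) + soc_abs_pow(y, p)
    t = soc_root(w, p)                        # t = (|x|^p + |y|^p)^{1/p}
    D = grad_gsoc(t, p) - grad_gsoc(x, p)
    assert np.linalg.eigvalsh(D).min() >= -1e-8   # positive semidefinite up to roundoff
print("no violations of Lemma 3.2 found")         # same check applies with y in place of x
```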
Now, we look into the KKT conditions (4) of problem (1). Let $L(x,y,z) = \nabla f(x) - A^T y - \nabla g(x)\, z$,

$$H(u) := \begin{bmatrix} Ax - b \\ L(x,y,z) \\ \phi_p(z, g(x)) \end{bmatrix} \tag{9}$$

and

$$\Psi_p(u) := \frac{1}{2}\|H(u)\|^2 = \frac{1}{2}\|\phi_p(z, g(x))\|^2 + \frac{1}{2}\|L(x,y,z)\|^2 + \frac{1}{2}\|Ax - b\|^2,$$

where $u = (x^T, y^T, z^T)^T \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$. From Lemma 2.5 in [32], we know that

$$\phi_p(z, g(x)) = 0 \iff z \in \mathcal{K},\ g(x) \in \mathcal{K},\ z^T g(x) = 0.$$

Hence, the KKT conditions (4) are equivalent to $H(u) = 0$, i.e., $\Psi_p(u) = 0$. It then follows that the KKT conditions (4) are equivalent to the following unconstrained minimization problem with zero optimal value via the merit function approach:

$$\min\ \Psi_p(u) := \frac{1}{2}\|H(u)\|^2. \tag{10}$$
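To see how (9) and (10) are assembled in practice, the following sketch (ours; the toy problem data are illustrative assumptions, not an example from the paper, and phi_p comes from the Section 2 sketch) builds the residual $H(u)$ and the merit function $\Psi_p$ for a small problem with $f(x) = \frac{1}{2}\|x\|^2$, $g(x) = x$ and one equality constraint:

```python
import numpy as np
# reuses phi_p from the Section 2 sketch

# toy data (illustrative): min 0.5*||x||^2  s.t.  x1 + x2 + x3 = 1,  x in K^3
A = np.array([[1.0, 1.0, 1.0]]); b = np.array([1.0]); p = 2.5
n, m, l = 3, 1, 3
grad_f = lambda x: x                      # f(x) = 0.5*||x||^2
g = lambda x: x                           # g(x) = x, so its gradient matrix is I

def H(u):
    x, y, z = u[:n], u[n:n + m], u[n + m:]
    L = grad_f(x) - A.T @ y - z           # L(x,y,z) = grad f(x) - A^T y - grad g(x) z
    return np.concatenate([A @ x - b, L, phi_p(z, g(x), p)])

def Psi(u):
    r = H(u)
    return 0.5 * r @ r                    # Psi_p(u) = 0.5*||H(u)||^2

u0 = np.zeros(n + m + l)
print(Psi(u0))                            # 0.5 > 0: u0 = 0 violates Ax = b
```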
However, the function $\phi_p$ is not $\mathcal{K}$-convex, and the merit function $\Psi_p$ is not a convex function even for $p = 2$, as shown in Example 3.5 of [3].

Theorem 3.1. Let $\Psi_p$ be defined as in (10).

(a) The matrix $\nabla g^{soc}(x)$ is positive definite for all $0 \ne x \in \mathcal{K}$.

(b) The function $\Psi_p$ for $p \in (1,4)$ is continuously differentiable everywhere. Moreover, $\nabla \Psi_p(u) = \nabla H(u)\, H(u)$, where

$$\nabla H(u) = \begin{bmatrix} A^T & \nabla_x L(x,y,z) & \nabla g(x)\, V_1 \\ 0 & -A & 0 \\ 0 & -\nabla g(x)^T & V_2 \end{bmatrix} \tag{11}$$

with

$$V_1 = \begin{cases} 0, & w(z, g(x)) = |z|^p + |g(x)|^p = 0, \\[4pt] \nabla g^{soc}(g(x))\, \nabla g^{soc}(t)^{-1} - I, & w(z, g(x)) \in \mathrm{int}(\mathcal{K}), \\[4pt] \left( \dfrac{\mathrm{sign}(g_1(x))\, |g_1(x)|^{p-1}}{\big(\sqrt[p]{|g_1(x)|^p + |z_1|^p}\big)^{p-1}} - 1 \right) I, & w(z, g(x)) \in \partial\mathcal{K} \setminus \{0\}, \end{cases}$$

and

$$V_2 = \begin{cases} 0, & w(z, g(x)) = |z|^p + |g(x)|^p = 0, \\[4pt] \nabla g^{soc}(z)\, \nabla g^{soc}(t)^{-1} - I, & w(z, g(x)) \in \mathrm{int}(\mathcal{K}), \\[4pt] \left( \dfrac{\mathrm{sign}(z_1)\, |z_1|^{p-1}}{\big(\sqrt[p]{|g_1(x)|^p + |z_1|^p}\big)^{p-1}} - 1 \right) I, & w(z, g(x)) \in \partial\mathcal{K} \setminus \{0\}, \end{cases}$$

with $t := \sqrt[p]{w(z, g(x))}$.
Proof. (a) For all $0 \ne x \in \mathcal{K}$: if $x_2 = 0$, it is obvious that the matrix $\nabla g^{soc}(x) = p\,\mathrm{sign}(x_1)|x_1|^{p-1} I$ is positive definite. If $x_2 \ne 0$, from the expression of $\nabla g^{soc}(x)$ in Lemma 2.5 and $x \in \mathcal{K}$, we have $b(x) > 0$. In order to prove that the matrix $\nabla g^{soc}(x)$ is positive definite, it suffices to show that the Schur complement of $b(x)$ in the matrix $\nabla g^{soc}(x)$ is positive definite. In fact, from the expression of $\nabla g^{soc}(x)$, the Schur complement has the form

$$a(x)\, I + \big(b(x) - a(x)\big)\bar x_2 \bar x_2^T - \frac{c^2(x)}{b(x)}\bar x_2 \bar x_2^T = a(x)\big(I - \bar x_2 \bar x_2^T\big) + b(x)\left(1 - \frac{c^2(x)}{b^2(x)}\right)\bar x_2 \bar x_2^T.$$

Since $x \in \mathcal{K}$, we have $\lambda_2(x) \ge \lambda_1(x) \ge 0$, which implies that $a(x) > 0$ and $b(x) > c(x) \ge 0$. Note that the matrices $I - \bar x_2 \bar x_2^T$ and $\bar x_2 \bar x_2^T$ are positive semidefinite. Thus, the Schur complement is positive definite. Further, we get that $\nabla g^{soc}(x)$ is positive definite for all $0 \ne x \in \mathcal{K}$.

(b) From the proofs of Proposition 3.1 and Lemma 3.2 of [32], we know that the function $\Psi_p$ for $p \in (1,4)$ is continuously differentiable everywhere. Hence, in view of the definition of the function $\Psi_p$ and the chain rule, the expression of $\nabla \Psi_p(u)$ is obtained. □
In light of the main ideas for constructing artificial neural networks (see [8] for details), we now establish a specific first-order ordinary differential equation, i.e., an artificial neural network. More specifically, based on the gradient of the merit function $\Psi_p$ in the minimization problem (10), we propose the following neural network for solving the KKT system (4) of the nonlinear SOCP (1):

$$\frac{du(t)}{dt} = -\rho\, \nabla \Psi_p(u), \quad u(t_0) = u_0, \tag{12}$$

where $\rho > 0$ is a time scaling factor. In fact, if $\tau = \rho t$, then $\frac{du(t)}{dt} = \rho\, \frac{du(\tau)}{d\tau}$. Hence, it follows from (12) that $\frac{du(\tau)}{d\tau} = -\nabla \Psi_p(u)$. For simplicity and convenience, we set $\rho = 1$ in this paper.
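A minimal discrete-time realization of (12) can then be obtained by integrating the gradient flow. The sketch below is ours: the paper integrates with Matlab's ode23, whereas here a forward Euler step and a finite-difference gradient stand in for the closed-form $\nabla \Psi_p$ of Theorem 3.1 (Psi and the toy problem are reused from the previous sketch):

```python
import numpy as np
# reuses Psi and the toy problem from the previous sketch

def num_grad(F, u, h=1e-6):
    """Central finite-difference gradient; Theorem 3.1 supplies the closed form instead."""
    e = np.eye(u.size)
    return np.array([(F(u + h * e[i]) - F(u - h * e[i])) / (2 * h) for i in range(u.size)])

rho, dt = 1.0, 1e-2                       # time scaling factor and Euler step size
u = np.zeros(7)                           # initial state u0 = (x, y, z) = 0
for _ in range(5000):
    u = u - dt * rho * num_grad(Psi, u)   # Euler step on du/dt = -rho * grad Psi_p(u)
print(Psi(u))                             # has decreased from Psi(u0), ideally toward 0
```

In practice one would supply the analytic gradient from Theorem 3.1 and a dedicated ODE solver, as the paper does; the Euler scheme above is only meant to expose the structure of (12).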
4. Stability analysis
In this section, we are interested in the stability analysis of the proposed neural network (12). By these theoretical analyses, the desired optimal solution of SOCP (1) can always be obtained by setting the initial state of the network to an arbitrary value. In order to study the stability of the proposed neural network (12) for solving SOCP (1), we first make an assumption, needed in our subsequent analysis in order to avoid singularity of $\nabla H(u)$.
Assumption 4.1.

(a) The SOCP problem (1) satisfies Slater's condition.

(b) The matrix $[A^T\ \ \nabla g(x)]$ has full column rank, and the matrix $\nabla_x L(x,y,z)$ is positive definite on the null space $\{t \mid At = 0\}$ of $A$.
Here we say a few words about Assumption 4.1(a) and (b). Slater's condition is a standard condition which is widely used in the optimization field. When $g$ is linear, Assumption 4.1(b) is indeed equivalent to the commonly used condition that $\nabla^2 f(x)$ is positive definite.
Lemma 4.1. Let $p = \frac{n}{2} \in (1,4)$ with $n \in \mathbb{N}$. Then, the following hold.

(a) Under Assumption 4.1, $\nabla H(u)$ is nonsingular for $u = (x, y, z) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$ with $(z, g(x)) \ne 0$.

(b) Every stationary point of $\Psi_p$ is a global minimizer of problem (10) for $(z, g(x)) \ne 0$.

(c) $\Psi_p(u(t))$ is nonincreasing with respect to $t$.

Proof. (a) Suppose $\xi = (s, t, v) \in \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^l$. From the expression (11) of $\nabla H(u)$ in Theorem 3.1, to show the nonsingularity of $\nabla H(u)$, it is enough to prove that

$$\nabla H(u)\, \xi = 0 \;\Longrightarrow\; s = 0,\ t = 0 \text{ and } v = 0.$$

Indeed, $\nabla H(u)\, \xi = 0$ gives

$$At = 0, \qquad A^T s + \nabla_x L(x,y,z)\, t + \nabla g(x)\, V_1 v = 0 \tag{13}$$

and

$$-\nabla g(x)^T t + V_2 v = 0. \tag{14}$$

From (13) and $At = 0$, it follows that

$$t^T \nabla_x L(x,y,z)\, t + t^T \nabla g(x)\, V_1 v = 0. \tag{15}$$

Moreover, by Eq. (14), we obtain

$$t^T \nabla g(x) = v^T V_2^T. \tag{16}$$

Then, combining (15) and (16) yields

$$t^T \nabla_x L(x,y,z)\, t + v^T V_2^T V_1 v = 0.$$

By Lemma 3.2 and Assumption 4.1(b), it is not hard to see that $t = 0$. In addition, from (13) and (14), we have

$$A^T s + \nabla g(x)\, V_1 v = 0 \quad \text{and} \quad V_2 v = 0.$$

By Assumption 4.1(b) again, we also get that $s = 0$ and $V_1 v = 0$. Thus, combining Lemma 3.2 with the expressions of $V_1$ and $V_2$ in Theorem 3.1, we have $v = 0$. Therefore, $\nabla H(u)$ is nonsingular.

(b) Suppose that $u^*$ is a stationary point of $\Psi_p$. This says $\nabla \Psi_p(u^*) = 0$, and from Theorem 3.1, we have $\nabla H(u^*)\, H(u^*) = 0$. According to part (a), $\nabla H(u^*)$ is nonsingular. Hence, it follows that $H(u^*) = 0$, i.e., $\Psi_p(u^*) = 0$, which says $u^*$ is a global minimizer of (10).

(c) By the definition of $\Psi_p(u(t))$ and (12), it is clear that

$$\frac{d\Psi_p(u(t))}{dt} = \nabla \Psi_p(u(t))^T\, \frac{du(t)}{dt} = -\rho\, \big\|\nabla \Psi_p(u(t))\big\|^2 \le 0.$$

Therefore, $\Psi_p(u(t))$ is nonincreasing with respect to $t$. □

Proposition 4.1. Assume that $\nabla H(u)$ is nonsingular for any $u \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$ and $p = \frac{n}{2} \in (1,4)$ with $n \in \mathbb{N}$. Then,
(a) $(x^*, y^*, z^*)$ satisfies the KKT conditions (4) if and only if $(x^*, y^*, z^*)$ is an equilibrium point of the neural network (12);

(b) under Slater's condition, $x^*$ is a solution of problem (1) if and only if $(x^*, y^*, z^*)$ is an equilibrium point of the neural network (12).

Proof. (a) It is easy to prove that $(x^*, y^*, z^*)$ satisfies the KKT conditions (4) if and only if $H(u^*) = 0$, where $u^* = (x^*, y^*, z^*)^T$. According to the condition that $\nabla H(u)$ is nonsingular, we have that $H(u^*) = 0$ if and only if $\nabla \Psi_p(u^*) = \nabla H(u^*)\, H(u^*) = 0$. Then the desired result follows.

(b) Under Slater's condition, it is well known that $x^*$ is a solution of problem (1) if and only if there exist $y^*$ and $z^*$ such that $(x^*, y^*, z^*)$ satisfies the KKT conditions (4). Hence, by part (a), it follows that $(x^*, y^*, z^*)$ is an equilibrium point of the neural network (12). □
The next result addresses the existence and uniqueness of the solution trajectory of the neural network (12).

Theorem 4.1. For any fixed $p = \frac{n}{2} \in (1,4)$ with $n \in \mathbb{N}$, the following hold.

(a) For any initial point $u_0 = u(t_0)$, there exists a unique continuous maximal solution $u(t)$ with $t \in [t_0, \tau)$ for the neural network (12), where $[t_0, \tau)$ is the maximal interval of existence.

(b) If the level set $\mathcal{L}(u_0) := \{u \mid \Psi_p(u) \le \Psi_p(u_0)\}$ is bounded, then $\tau$ can be extended to $+\infty$.

Proof. This proof is exactly the same as the proof of [33, Proposition 3.4]. Hence, we omit it here. □
Theorem 4.2. Assume that $\nabla H(u)$ is nonsingular and $u^*$ is an isolated equilibrium point of the neural network (12). Then, the solution of the neural network (12) with any initial point $u_0$ is Lyapunov stable.

Proof. From Lemma 2.3, we only need to argue that there exists a Lyapunov function over some neighborhood $\Omega$ of $u^*$. To this end, we consider the smoothed merit function for $p = \frac{n}{2} \in (1,4)$ with $n \in \mathbb{N}$:

$$\Psi_p(u) = \frac{1}{2}\|H(u)\|^2.$$

Since $u^*$ is an isolated equilibrium point of (12), there is a neighborhood $\Omega$ of $u^*$ such that

$$\nabla \Psi_p(u^*) = 0 \quad \text{and} \quad \nabla \Psi_p(u(t)) \ne 0 \;\; \forall u(t) \in \Omega \setminus \{u^*\}.$$

By the nonsingularity of $\nabla H(u)$ and the definition of $\Psi_p$, it is easy to obtain that $\Psi_p(u^*) = 0$. From the definition of $\Psi_p$, we claim that $\Psi_p(u(t)) > 0$ for any $u(t) \in \Omega \setminus \{u^*\}$, where $\Omega$ is a neighborhood of $u^*$. If not, that is, if $\Psi_p(u(t)) = 0$, it follows that $H(u(t)) = 0$. Then we have $\nabla \Psi_p(u(t)) = 0$, which contradicts the assumption that $u^*$ is an isolated equilibrium point of (12). Thus, $\Psi_p(u(t)) > 0$ for any $u(t) \in \Omega \setminus \{u^*\}$. Moreover, by the proof of Lemma 4.1(c), we know that for any $u(t) \in \Omega$,

$$\frac{d\Psi_p(u(t))}{dt} = \nabla \Psi_p(u(t))^T\, \frac{du(t)}{dt} = -\rho\, \|\nabla \Psi_p(u(t))\|^2 \le 0. \tag{17}$$

Therefore, the function $\Psi_p$ is a Lyapunov function over $\Omega$. This implies that $u^*$ is Lyapunov stable for the neural network (12). □

Theorem 4.3. Assume that $\nabla H(u)$ is nonsingular and $u^*$ is an isolated equilibrium point of the neural network (12). Then, $u^*$ is asymptotically stable for the neural network (12).
Proof. As in the proof of Theorem 4.2, we consider again the Lyapunov function $\Psi_p$ for $p = \frac{n}{2} \in (1,4)$ with $n \in \mathbb{N}$. By Lemma 2.3 again, we only need to verify that the Lyapunov function $\Psi_p$ over some neighborhood $\Omega$ of $u^*$ satisfies

$$\frac{d\Psi_p(u(t))}{dt} < 0, \quad \forall u(t) \in \Omega \setminus \{u^*\}. \tag{18}$$

In fact, by using (17) and the definition of an isolated equilibrium point, it is not hard to check that Eq. (18) is true. Hence, $u^*$ is asymptotically stable. □
Theorem 4.4. Assume that $u^*$ is an isolated equilibrium point of the neural network (12). If $\nabla H(u)^T$ is nonsingular for any $u = (x, y, z) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$, then $u^*$ is exponentially stable for the neural network (12).

Proof. From the definition of $H(u)$ and Lemma 2.6, we have

$$H(u) = H(u^*) + \nabla H(u(t))^T\, (u - u^*) + o(\|u - u^*\|), \quad \forall u \in \Omega \setminus \{u^*\}, \tag{19}$$

where $\nabla H(u(t))^T \in \partial H(u(t))$ and $\Omega$ is a neighborhood of $u^*$. Now, letting

$$g(u(t)) = \|u(t) - u^*\|^2, \quad t \in [t_0, \infty),$$

we have

$$\frac{dg(u(t))}{dt} = 2\big(u(t) - u^*\big)^T\, \frac{du(t)}{dt} = -2\rho\, \big(u(t) - u^*\big)^T \nabla \Psi_p(u(t)) = -2\rho\, \big(u(t) - u^*\big)^T \nabla H(u)\, H(u). \tag{20}$$

Substituting (19) into (20) and using $H(u^*) = 0$ yields

$$\frac{dg(u(t))}{dt} = -2\rho\, \big(u(t) - u^*\big)^T \nabla H(u(t)) \Big( H(u^*) + \nabla H(u(t))^T \big(u(t) - u^*\big) + o(\|u(t) - u^*\|) \Big)$$
$$= -2\rho\, \big(u(t) - u^*\big)^T \nabla H(u(t))\, \nabla H(u(t))^T \big(u(t) - u^*\big) + o\big(\|u(t) - u^*\|^2\big).$$

Since $\nabla H(u)$ and $\nabla H(u)^T$ are nonsingular, we claim that there exists a $\kappa > 0$ such that

$$\big(u(t) - u^*\big)^T \nabla H(u)\, \nabla H(u)^T \big(u(t) - u^*\big) \ge \kappa\, \|u(t) - u^*\|^2. \tag{21}$$

Otherwise, if $(u(t) - u^*)^T \nabla H(u(t))\, \nabla H(u(t))^T (u(t) - u^*) = 0$, this implies that $\nabla H(u(t))^T (u(t) - u^*) = 0$. Indeed, from the nonsingularity of $\nabla H(u)$, we would have $u(t) - u^* = 0$, i.e., $u(t) = u^*$, which contradicts the assumption that $u^*$ is an isolated equilibrium point. Therefore, there exists a $\kappa > 0$ such that (21) holds. Moreover, for the term $o(\|u(t) - u^*\|^2)$, there is an $\varepsilon > 0$ such that $o(\|u(t) - u^*\|^2) \le \varepsilon\, \|u(t) - u^*\|^2$. Hence,

$$\frac{dg(u(t))}{dt} \le (-2\rho\kappa + \varepsilon)\, \|u(t) - u^*\|^2 = (-2\rho\kappa + \varepsilon)\, g(u(t)).$$

This implies

$$g(u(t)) \le e^{(-2\rho\kappa + \varepsilon)t}\, g(u(t_0)),$$

which means

$$\|u(t) - u^*\| \le e^{\left(-\rho\kappa + \frac{\varepsilon}{2}\right)t}\, \|u(t_0) - u^*\|.$$

Thus, $u^*$ is exponentially stable for the neural network (12). □
5. Numerical examples
In order to demonstrate the effectiveness of the proposed neural network, we test several examples for our neural network (12) in this section. The numerical implementation is coded in Matlab 7.0, and the ordinary differential equation solver adopted here is ode23, which uses the Runge–Kutta (2,3) formula. As mentioned earlier, the parameter $\rho$ is set to be 1. How is $u_0$ chosen initially? From Theorem 4.2 in the last section, we know the solution will