to appear in Neurocomputing, 2011

### Recurrent Neural Networks for Solving Second-Order Cone Programs

Chun-Hsu Ko ^{1}

Department of Electrical Engineering, I-Shou University

Kaohsiung County 840, Taiwan

Jein-Shan Chen ^{2}
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan

Ching-Yu Yang ^{3}
Department of Mathematics
National Taiwan Normal University

Taipei, Taiwan 11677

January 5, 2011 (revised on June 1, 2011)

Abstract This paper proposes using neural networks to efficiently solve second-order cone programs (SOCP). To establish the neural networks, the SOCP is first reformulated as a second-order cone complementarity problem (SOCCP) via the Karush-Kuhn-Tucker conditions of the SOCP. The SOCCP functions, which transform the SOCCP into a set of nonlinear equations, are then utilized to design the neural networks. We propose two kinds of neural networks with different SOCCP functions. The first neural network uses the Fischer-Burmeister function to achieve an unconstrained minimization of a merit function. We show that the merit function is a Lyapunov function and that this neural network is asymptotically stable. The second neural network utilizes the natural residual function with the cone projection function to achieve low computational complexity. It is shown to be Lyapunov stable and to converge globally to an optimal solution

1E-mail: chko@isu.edu.tw

2Corresponding author, Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan. E-mail:

jschen@math.ntnu.edu.tw

3E-mail: yangcy@math.ntnu.edu.tw

under some condition. The SOCP simulation results demonstrate the effectiveness of the proposed neural networks.

Key words. SOCP, Neural network, Merit function, Fischer-Burmeister function, Cone projection function, Lyapunov stable.

AMS subject classifications. 92B20, 90C33, 65K05

### 1 Introduction

The second-order cone program (SOCP) has been widely applied in engineering optimization [1]. It requires solving an optimization problem subject to linear equality and second-order cone inequality constraints [2]. Numerical approaches such as the interior-point method [1] or the merit function method [3] can effectively solve the SOCP.

However, many engineering dynamic systems, such as force analysis in robot grasping [1, 4] and control applications [5, 6], require real-time SOCP solutions. As a result, efficient approaches for solving the SOCP in real time are needed. Prior research [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18] indicates that neural networks can be used to solve various optimization problems. Furthermore, neural networks based on circuit implementation exhibit real-time processing ability. It is therefore appropriate to utilize neural networks for efficiently solving SOCP problems.

The recurrent neural network was introduced by Hopfield and Tank [7] for solving linear programming problems. Kennedy and Chua [8] thereafter proposed an extended neural network for solving nonlinear convex programming problems, although their approach involves a penalty parameter that affects the network accuracy. To find exact solutions, further neural networks for optimization have been developed. Among them, the primal-dual neural network [9, 10, 11], which is globally stable, provides exact solutions of linear and quadratic programming problems. The projection neural network, developed by Xia and Wang [12, 14, 15], efficiently solves many optimization problems and variational inequalities. Since the SOCP is a nonlinear convex problem, both the primal-dual neural network [16] and the projection neural network [17] can be used to solve it. However, they require many state variables, leading to high model complexity. This motivates the development of more compact neural networks for the SOCP.

The SOCP can be solved by analyzing its Karush-Kuhn-Tucker (KKT) optimality conditions, which leads to the second-order cone complementarity problem (SOCCP) [3, 19, 20]. The approaches [3, 20] based on SOCCP functions, such as the Fischer-Burmeister (FB) and natural residual functions, can then be utilized for solving the SOCCP. In the merit function approach [3], an unconstrained smooth minimization with the FB function is solved to find the SOCCP solution. On the other hand, the semismooth approach [20] uses the natural residual function with the cone projection (CP) function to reformulate the SOCCP as a set of nonlinear equations and then applies a nonsmooth Newton method to obtain the solution. Previous studies have demonstrated the feasibility of these SOCCP functions in solving SOCP problems, and we also use them in our neural network design. In this paper, we propose two novel neural networks for efficiently solving SOCP problems. One is based on the gradient of the smooth merit function derived from the FB function [18]. The other is an extended projection neural network obtained by replacing the scalar projection function [12, 14, 15] with the CP function.

These neural networks require fewer state variables than those previously proposed [16, 17] for solving the SOCP. Furthermore, they are shown to be stable and globally convergent to the SOCP solutions.

This paper is organized as follows. Section 2 introduces the second-order cone program and its SOCCP formulation. In Section 3, the neural network based on the Fischer-Burmeister function is proposed and analyzed. In Section 4, the second neural network based on the cone projection function is proposed, and its global stability is verified. In Section 5, several SOCP examples are presented to demonstrate the effectiveness of the proposed neural networks. Finally, conclusions are given in Section 6.

### 2 Problem Formulation

In this section, we introduce the second-order cone program and reformulate it as a second-order cone complementarity problem. The second-order cone program is in the form of

minimize f (x)

subject to Ax = b, x ∈ K. (1)

Here f : R^{n} → R is a nonlinear continuously differentiable function, A ∈ R^{m×n} is a full row rank matrix, b ∈ R^{m} is a vector, and K is a Cartesian product of second-order cones (or Lorentz cones), expressed as

K = K^{n_1} × K^{n_2} × · · · × K^{n_N}, (2)

where N, n_1, · · · , n_N ≥ 1, n_1 + · · · + n_N = n, and

K^{n_i} := { (x_{i1}, x_{i2}, · · · , x_{in_i})^{T} ∈ R^{n_i} | ‖(x_{i2}, · · · , x_{in_i})‖ ≤ x_{i1} },

with ‖ · ‖ denoting the Euclidean norm and K^{1} the set of nonnegative reals R_{+}. A special case of equation (2) is K = R^{n}_{+}, namely the nonnegative orthant in R^{n}, which corresponds to N = n and n_1 = · · · = n_N = 1. When f is linear, i.e., f = c^{T}x with c ∈ R^{n}, SOCP (1) reduces to the following linear SOCP:

minimize c^{T}x

subject to Ax = b, x ∈ K. (3)
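To make the cone constraint concrete, here is a minimal sketch (assuming NumPy; the helper name and the sample block sizes are our own illustration, not from the paper) that checks membership of a vector in a Cartesian product of second-order cones:

```python
import numpy as np

def in_soc_product(x, dims, tol=1e-9):
    """Check x ∈ K^{n_1} × ... × K^{n_N}, where dims = [n_1, ..., n_N]."""
    x = np.asarray(x, dtype=float)
    start = 0
    for n in dims:
        block = x[start:start + n]
        start += n
        # Each block must satisfy ||(x_{i2}, ..., x_{in_i})|| <= x_{i1}.
        if np.linalg.norm(block[1:]) > block[0] + tol:
            return False
    return True

print(in_soc_product([1.0, 0.6, 0.6, 2.0, 1.0], [3, 2]))  # True: both blocks lie in their cones
print(in_soc_product([1.0, 2.0, 0.0], [3]))               # False: ||(2, 0)|| > 1
```

With dims = [1]*n this reduces to the nonnegative-orthant check x ≥ 0, matching the special case K = R^{n}_{+} above.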

The KKT optimality conditions for (1) are given by

∇f(x) − A^{T}y − λ = 0,
x^{T}λ = 0, x ∈ K, λ ∈ K, (4)
Ax = b,

where y ∈ R^{m} and λ ∈ R^{n}. When f is convex, these conditions are sufficient for optimality. Eliminating λ, they can equivalently be written as

x^{T}(∇f(x) − A^{T}y) = 0, x ∈ K, ∇f(x) − A^{T}y ∈ K,
Ax = b. (5)

By solving the system (5), we may obtain a primal-dual optimal solution of SOCP (1).

Note that system (5) involves the SOCCP. To efficiently solve it, we propose using the neural network approaches with the FB function and CP function, respectively, described below.

### 3 Neural Network Design with Fischer-Burmeister Function

It is known that the merit function approach [3] can be used for solving system (5). Motivated by this approach, we propose a neural network with the Fischer-Burmeister function to find the minimizer of the merit function, and we study its global stability.

In [3], system (5) is shown to be equivalent to an unconstrained smooth minimization problem via the merit function approach, described as

min E(x, y) = Ψ_{FB}(x, ∇f(x) − A^{T}y) + (1/2)‖Ax − b‖^{2}, (6)

where E(x, y) is a merit function, Ψ_{FB}(x, y) = (1/2) Σ_{i=1}^{N} ‖φ_{FB}(x_i, y_i)‖^{2}, x = (x_1, · · · , x_N)^{T}, y = (y_1, · · · , y_N)^{T} ∈ R^{n_1} × · · · × R^{n_N}, and φ_{FB} is the Fischer-Burmeister function defined as

φ_{FB}(x_i, y_i) := (x_i^{2} + y_i^{2})^{1/2} − x_i − y_i, (7)

where the square and square root are taken with respect to the Jordan product introduced in the appendix.
Based on the gradient of the objective E(x, y) in minimization problem (6), we propose the first neural network for solving the SOCP, with the following dynamic equation:

d/dt [x; y] = ρ [−∇_x E(x, y); −∇_y E(x, y)], (8)

where ρ is a positive scaling factor and

∇_x E(x, y) = ∇_x Ψ_{FB}(x, ∇f(x) − A^{T}y) + ∇^{2}f(x) · ∇_y Ψ_{FB}(x, ∇f(x) − A^{T}y) + A^{T}(Ax − b),
∇_y E(x, y) = −A · ∇_y Ψ_{FB}(x, ∇f(x) − A^{T}y). (9)

For the linear SOCP (3), the above equations reduce to

∇_x E(x, y) = ∇_x Ψ_{FB}(x, c − A^{T}y) + A^{T}(Ax − b),
∇_y E(x, y) = −A · ∇_y Ψ_{FB}(x, c − A^{T}y). (10)

Note that the Jordan product [3], introduced in the appendix, is required for calculating ∇_x Ψ_{FB} and ∇_y Ψ_{FB}. The dynamic equation (8) can be realized by a recurrent neural network with the FB function as shown in Figure 1. The circuit for the neural network realization requires n + m integrators, n processors for ∇f(x), n^{2} processors for ∇^{2}f(x), n processors for ∇_x Ψ_{FB}, m processors for ∇_y Ψ_{FB}, 4mn connection weights, and some summers. Furthermore, neural network (8) is asymptotically stable, as proven in the following theorem.

Theorem 3.1 If u^{∗} = (x^{∗}, y^{∗}) is an isolated equilibrium point of neural network (8),
then u^{∗} = (x^{∗}, y^{∗}) is asymptotically stable for (8).

Proof. Assume that u^{∗} = (x^{∗}, y^{∗}) is an isolated equilibrium point of neural network (8), i.e., there is a neighborhood Ω^{∗} ⊆ R^{n+m} of u^{∗} such that ∇E(x^{∗}, y^{∗}) = 0 and ∇E(x, y) ≠ 0 for all (x, y) ∈ Ω^{∗} \ {(x^{∗}, y^{∗})}. First we show that E(x, y) is a Lyapunov function for u^{∗} on Ω^{∗}. Since

∇_y E(x^{∗}, y^{∗}) = −A · ∇_y Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) = 0,

from Lemma 3 and Proposition 1 of [3], we have

∇_x Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) = ∇_y Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) = 0.

Moreover, from Proposition 1 of [3], this says

Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) = 0.

Then from equation (9),

∇_x E(x^{∗}, y^{∗}) = ∇_x Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) + ∇^{2}f(x^{∗}) · ∇_y Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) + A^{T}(Ax^{∗} − b) = 0,

which implies that A^{T}(Ax^{∗} − b) = 0. Because A ∈ R^{m×n} has full row rank, we must have Ax^{∗} − b = 0, which yields

E(x^{∗}, y^{∗}) = Ψ_{FB}(x^{∗}, ∇f(x^{∗}) − A^{T}y^{∗}) + (1/2)‖Ax^{∗} − b‖^{2} = 0.

Next, we claim that E(x, y) > 0 for all (x, y) ∈ Ω^{∗} \ {(x^{∗}, y^{∗})}. If not, there is an (x, y) ∈ Ω^{∗} \ {(x^{∗}, y^{∗})} such that E(x, y) = 0. This says that Ψ_{FB}(x, ∇f(x) − A^{T}y) = 0 and Ax = b, and hence ∇_x E(x, y) = 0 and ∇_y E(x, y) = 0. Thus (x, y) is an equilibrium point of neural network (8), contradicting the assumption that u^{∗} = (x^{∗}, y^{∗}) is an isolated equilibrium point.

Finally,

dE(x(t), y(t))/dt = [∇_{(x,y)} E(x(t), y(t))]^{T} (−ρ ∇_{(x,y)} E(x(t), y(t)))
                  = −ρ ‖∇_{(x,y)} E(x(t), y(t))‖^{2}
                  ≤ 0.

Therefore, the function E(x, y) is a Lyapunov function for neural network (8) over the set Ω^{∗}. Since u^{∗} = (x^{∗}, y^{∗}) is an isolated equilibrium point of neural network (8), we have

dE(x(t), y(t))/dt < 0, ∀(x(t), y(t)) ∈ Ω^{∗} \ {(x^{∗}, y^{∗})}.

Thus, u^{∗} is asymptotically stable for neural network (8). □

### 4 Neural Network Design with Cone Projection Function

In this section, we propose another neural network, associated with the cone projection function, to solve system (5) for obtaining the SOCP solution, and we study its stability. In fact, from [24, Prop. 3.3], the projection onto K has the special formula

P_K(z) = [λ_1(z)]_+ u_z^{(1)} + [λ_2(z)]_+ u_z^{(2)},

where [·]_+ denotes the projection onto the nonnegative reals, and λ_1(z), λ_2(z) and u_z^{(1)}, u_z^{(2)} are the spectral values and the associated spectral vectors of z = (z_1, z_2) ∈ R × R^{n−1}, respectively, given by

λ_i(z) = z_1 + (−1)^{i} ‖z_2‖,
u_z^{(i)} = (1/2) (1, (−1)^{i} z_2/‖z_2‖),

for i = 1, 2 (when z_2 = 0, z_2/‖z_2‖ may be replaced by any vector of unit norm). The CP function P_K(z) has the following property, known as the projection theorem [21], which is useful in our subsequent analysis.

Property 4.1 Let K be a nonempty closed convex subset of R^{n}. Then, for each z ∈ R^{n}, P_K(z) is the unique vector z̄ ∈ K satisfying (y − z̄)^{T}(z − z̄) ≤ 0, ∀y ∈ K.
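The spectral formula above admits a direct implementation; the following sketch (assuming NumPy; the function name is our own) projects a vector onto a single second-order cone K^{n}:

```python
import numpy as np

def proj_soc(z):
    """Project z = (z1, z2) ∈ R × R^{n-1} onto K^n via the spectral formula."""
    z = np.asarray(z, dtype=float)
    z1, z2 = z[0], z[1:]
    nz2 = np.linalg.norm(z2)
    if nz2 <= z1:              # both spectral values nonnegative: z already in K^n
        return z.copy()
    if nz2 <= -z1:             # both spectral values nonpositive: projection is 0
        return np.zeros_like(z)
    # Only lambda_2 = z1 + ||z2|| is positive: P_K(z) = [lambda_2]_+ u^(2).
    alpha = 0.5 * (z1 + nz2)
    return np.concatenate(([alpha], (alpha / nz2) * z2))

print(proj_soc([0.0, 3.0]))    # lands on the cone boundary point (1.5, 1.5)
```

One can verify Property 4.1 numerically: for any y in the cone, (y − P_K(z))^{T}(z − P_K(z)) stays nonpositive.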

Employing the natural residual function with the CP function [19, 20], system (5) can be equivalently written as

x − P_K(x − ∇f(x) + A^{T}y) = 0,
Ax − b = 0, (11)

where x = (x_1, · · · , x_N)^{T} ∈ R^{n_1} × · · · × R^{n_N} with x_i = (x_{i1}, x_{i2}, · · · , x_{in_i})^{T}, i = 1, · · · , N, and P_K(x) = [P_{K^{n_1}}(x_1), · · · , P_{K^{n_N}}(x_N)]^{T}.

Based on the equivalent formulation in (11) and employing the ideas for networks used in [12, 13], we consider the second neural network for solving the SOCP, with the following dynamic equations:

d/dt [x; y] = ρ [−x + P_K(x − ∇f(x) + A^{T}y); −Ax + b], (12)

where ρ is a positive scaling factor. The dynamic equations can be realized by a recurrent neural network with the cone projection function as shown in Figure 2. The circuit for the neural network realization requires n + m integrators, n processors for ∇f(x), N processors for the cone projection mapping P_K, 2mn connection weights, and some summers. Compared with the first neural network in (8), the second neural network (12) does not require calculating ∇^{2}f(x), resulting in lower model complexity.
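As an illustration of dynamics (12), the following sketch (assuming NumPy; the toy problem with f(x) = (1/2)‖x − p‖² and all parameter values are our own choices, not from the paper) integrates the network with forward Euler for a single cone K^{2}:

```python
import numpy as np

def proj_soc(z):
    # Projection onto K^n via the spectral formula of Section 4.
    z1, z2 = z[0], z[1:]
    nz2 = np.linalg.norm(z2)
    if nz2 <= z1:
        return z.copy()
    if nz2 <= -z1:
        return np.zeros_like(z)
    a = 0.5 * (z1 + nz2)
    return np.concatenate(([a], (a / nz2) * z2))

# Toy SOCP: minimize (1/2)||x - p||^2  s.t.  x1 + x2 = 2,  x ∈ K^2.
# Its optimizer is x* = (1, 1), on the cone boundary.
p = np.array([0.0, 3.0])
A = np.array([[1.0, 1.0]]); b = np.array([2.0])
x = np.array([1.5, 0.0]); y = np.zeros(1)    # initial x must lie in K^2
rho, dt = 1.0, 0.005
for _ in range(200_000):                     # forward-Euler integration of (12)
    grad_f = x - p                           # ∇f(x) for f = (1/2)||x - p||^2
    dx = -x + proj_soc(x - grad_f + A.T @ y)
    dy = -A @ x + b
    x, y = x + rho * dt * dx, y + rho * dt * dy

print(x)   # should settle near [1, 1]
```

Since ∇²f = I is positive definite here, the global convergence asserted in Theorem 4.2 below applies to this toy problem.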

To analyze the stability of the neural network in equation (12), we first give three lemmas and one proposition.

Lemma 4.1 Let F(u) be defined as

F(u) := F(x, y) := [−x + P_K(x − ∇f(x) + A^{T}y); −Ax + b]. (13)

Then F(u) is semismooth. Moreover, F(u) is strongly semismooth if ∇^{2}f(x) is locally Lipschitz continuous.

Proof. This is an immediate consequence of [20, Theorem 1]. □

Proposition 4.1 For any initial point u_{0} = (x_{0}, y_{0}) where x_{0} := x(t_{0}) ∈ K, there exists
a unique solution u(t) = (x(t), y(t)) for neural network (12). Moreover, x(t) ∈ K.

Proof. For simplicity, we assume K = K^{n}; the analysis carries over to the general case where K is a Cartesian product of second-order cones. From Lemma 4.1, F(u) := F(x, y) is semismooth and Lipschitz continuous. Thus, there exists a unique solution u(t) = (x(t), y(t)) for neural network (12). It remains to show that x(t) ∈ K^{n}. For convenience, we denote x(t) := (x_1(t), x_2(t)) ∈ R × R^{n−1}. To complete the proof, we need to verify two things: (i) x_1(t) ≥ 0 and (ii) ‖x_2(t)‖ ≤ x_1(t). First, from (12), we have

dx/dt + ρx(t) = ρ P_K(x − ∇f(x) + A^{T}y).

The solution of this first-order ordinary differential equation is

x(t) = e^{−ρ(t−t_0)} x(t_0) + ρ e^{−ρt} ∫_{t_0}^{t} e^{ρs} P_K(x − ∇f(x) + A^{T}y) ds.

Writing x(t_0) := (x_1(t_0), x_2(t_0)) ∈ R × R^{n−1} and denoting z(t) := (z_1(t), z_2(t)) := P_K(x − ∇f(x) + A^{T}y), we obtain

x_1(t) = e^{−ρ(t−t_0)} x_1(t_0) + ρ e^{−ρt} ∫_{t_0}^{t} e^{ρs} z_1(s) ds,
x_2(t) = e^{−ρ(t−t_0)} x_2(t_0) + ρ e^{−ρt} ∫_{t_0}^{t} e^{ρs} z_2(s) ds.

Since both x(t_0) and z(t) belong to K^{n}, we have x_1(t_0) ≥ 0, ‖x_2(t_0)‖ ≤ x_1(t_0), z_1(t) ≥ 0, and ‖z_2(t)‖ ≤ z_1(t). Therefore, x_1(t) ≥ 0, since both terms on the right-hand side are nonnegative. In addition,

‖x_2(t)‖ ≤ e^{−ρ(t−t_0)} ‖x_2(t_0)‖ + ρ e^{−ρt} ∫_{t_0}^{t} e^{ρs} ‖z_2(s)‖ ds
         ≤ e^{−ρ(t−t_0)} x_1(t_0) + ρ e^{−ρt} ∫_{t_0}^{t} e^{ρs} z_1(s) ds
         = x_1(t),

which implies that x(t) ∈ K^{n}. □

Lemma 4.2 Let H(u) be defined as

H(u) := H(x, y) := [∇f(x) − A^{T}y; Ax − b]. (14)

Then H is a monotone function if f is a convex function. Moreover, ∇H(u) is positive semidefinite if and only if ∇^{2}f(x) is positive semidefinite.

Proof. Let u = (x, y) and ũ = (x̃, ỹ). Then the monotonicity of H holds since

(u − ũ)^{T}(H(u) − H(ũ))
= (x − x̃)^{T}(∇f(x) − ∇f(x̃)) − (x − x̃)^{T}(A^{T}(y − ỹ)) + (y − ỹ)^{T}(A(x − x̃))
= (x − x̃)^{T}(∇f(x) − ∇f(x̃))
≥ 0,

where the last inequality is due to the convexity of f, see [22, Theorem 3.4.5]. Furthermore, we observe that

∇H(u) = [ ∇^{2}f(x)  −A^{T}
              A         0   ].

Thus, we have

u^{T} ∇H(u) u = [x^{T} y^{T}] [ ∇^{2}f(x)  −A^{T} ; A  0 ] [x; y] = x^{T} ∇^{2}f(x) x,

which indicates that the positive semidefiniteness of ∇H(u) is equivalent to the positive semidefiniteness of ∇^{2}f(x). □

Lemma 4.3 Let F(u), H(u) be defined as in (13) and (14), respectively. Also, let u^{∗} = (x^{∗}, y^{∗}) be an equilibrium point of neural network (12) with x^{∗} being an optimal solution of SOCP. Then, the following inequality holds:

(F(u) + u − u^{∗})^{T}(−F(u) − H(u)) ≥ 0. (15)

Proof. First, we denote λ := ∇f(x) − A^{T}y. Then, we obtain

(F(u) + u − u^{∗})^{T}(−F(u) − H(u))
= [−x + P_K(x − λ) + (x − x^{∗}); (−Ax + b) + (y − y^{∗})]^{T} [x − P_K(x − λ) − λ; (Ax − b) − (Ax − b)]
= [−x^{∗} + P_K(x − λ); (−Ax + b) + (y − y^{∗})]^{T} [(x − λ) − P_K(x − λ); 0]
= −(x^{∗} − P_K(x − λ))^{T}((x − λ) − P_K(x − λ)).

Since x^{∗} ∈ K, applying Property 4.1 gives

(x^{∗} − P_K(x − λ))^{T}((x − λ) − P_K(x − λ)) ≤ 0.

Thus, inequality (15) is proved. □

We now investigate the stability and convergence issues of neural network (12). First, we analyze the behavior of the solution trajectory of neural network (12) including exis- tence and convergence. We then establish two kinds of stability for an isolated equilibrium point.

We know that every solution u^{∗} of the SOCP is an equilibrium point of neural network (12). If further u^{∗} is an isolated equilibrium point of neural network (12), we show that u^{∗} is Lyapunov stable.

Theorem 4.1 If f is convex and twice differentiable, then the solution of neural network (12), with initial point u0 = (x0, y0) where x0 ∈ K, is Lyapunov stable. Moreover, the solution trajectory of neural network (12) is extendable to the global existence.

Proof. Again, for simplicity, we assume K = K^{n}. From Proposition 4.1, there exists a unique solution u(t) = (x(t), y(t)) for neural network (12) and x(t) ∈ K^{n}. Let u^{∗} = (x^{∗}, y^{∗}) be an equilibrium point of neural network (12) with x^{∗} being an optimal solution of SOCP. We define a Lyapunov function as

E(u) := E(x, y) := −H(u)^{T}F(u) − (1/2)‖F(u)‖^{2} + (1/2)‖u − u^{∗}‖^{2}, (16)

where F(u) and H(u) are given as in (13) and (14), respectively. From [23, Theorem 3.2], we know that E is continuously differentiable with

∇E(u) = H(u) − [∇H(u) − I]F(u) + (u − u^{∗}).

It is also clear that E(u^{∗}) = 0. Then, we have

dE(u(t))/dt = ∇E(u(t))^{T} du/dt
= {H(u) − [∇H(u) − I]F(u) + (u − u^{∗})}^{T} ρF(u)
= ρ( [H(u) + (u − u^{∗})]^{T}F(u) + ‖F(u)‖^{2} − F(u)^{T}∇H(u)F(u) ).

Inequality (15) in Lemma 4.3 implies

(H(u) + u − u^{∗})^{T}F(u) ≤ −H(u)^{T}(u − u^{∗}) − ‖F(u)‖^{2},

which yields

dE(u(t))/dt ≤ ρ( −H(u)^{T}(u − u^{∗}) − F(u)^{T}∇H(u)F(u) )
= ρ( −H(u^{∗})^{T}(u − u^{∗}) − (H(u) − H(u^{∗}))^{T}(u − u^{∗}) − F(u)^{T}∇H(u)F(u) ). (17)

On the other hand, with λ^{∗} := ∇f(x^{∗}) − A^{T}y^{∗}, we know that

(F(u^{∗}) + u^{∗} − u)^{T}(−F(u^{∗}) − H(u^{∗})) = −(x − P_K(x^{∗} − λ^{∗}))^{T}((x^{∗} − λ^{∗}) − P_K(x^{∗} − λ^{∗})).

Since x ∈ K^{n}, applying Property 4.1 gives

(x − P_K(x^{∗} − λ^{∗}))^{T}((x^{∗} − λ^{∗}) − P_K(x^{∗} − λ^{∗})) ≤ 0.

Thus, we have (F(u^{∗}) + u^{∗} − u)^{T}(−F(u^{∗}) − H(u^{∗})) ≥ 0. Since F(u^{∗}) = 0, we therefore obtain −H(u^{∗})^{T}(u − u^{∗}) ≤ 0. Also, the monotonicity of H implies −(H(u) − H(u^{∗}))^{T}(u − u^{∗}) ≤ 0. In addition, since f is convex and twice differentiable, ∇^{2}f(x) is positive semidefinite, and hence ∇H is positive semidefinite by Lemma 4.2, i.e., −F(u)^{T}∇H(u)F(u) ≤ 0. The above discussion leads to dE(u(t))/dt ≤ 0.

To show that E(u) is a Lyapunov function and u^{∗} is Lyapunov stable, we establish the inequality

−H(u)^{T}F(u) ≥ ‖F(u)‖^{2}. (18)

To see this, we first observe that

‖F(u)‖^{2} + H(u)^{T}F(u) = (x − P_K(x − λ))^{T}((x − λ) − P_K(x − λ)),

where λ := ∇f(x) − A^{T}y. Since x ∈ K, applying Property 4.1 again, there holds

(x − P_K(x − λ))^{T}((x − λ) − P_K(x − λ)) ≤ 0,

which yields the desired inequality (18). Combining equation (16) and inequality (18), we have

E(u) ≥ (1/2)‖F(u)‖^{2} + (1/2)‖u − u^{∗}‖^{2},

which says E(u) > 0 if u ≠ u^{∗}. Hence E(u) is indeed a Lyapunov function and u^{∗} is Lyapunov stable. Moreover, it holds that

E(u_0) ≥ E(u) ≥ (1/2)‖u − u^{∗}‖^{2} for t ≥ t_0, (19)

which means the solution trajectory u(t) is bounded. Hence, it can be extended to global existence. □

Theorem 4.2 Let u^{∗} = (x^{∗}, y^{∗}) be an equilibrium point of (12) with x^{∗} being an optimal
solution of SOCP. If f is twice differentiable and ∇^{2}f (x) is positive definite, the solution of
neural network (12), with initial point u0 = (x0, y0) where x0 ∈ K, is globally convergent
to u^{∗} and has finite convergence time.

Proof. From (19), the level set

L(u_0) := {u | E(u) ≤ E(u_0)}

is bounded. Then, the Invariant Set Theorem [25] implies that the solution trajectory u(t) converges to θ as t → ∞, where θ is the largest invariant set in

Π = { u ∈ L(u_0) | dE(u(t))/dt = 0 }.

We will show that du/dt = 0 if and only if dE(u(t))/dt = 0, which yields that u(t) converges globally to the equilibrium point u^{∗} = (x^{∗}, y^{∗}). Suppose du/dt = 0; then clearly dE(u(t))/dt = ∇E(u)^{T}(du/dt) = 0. Conversely, let û = (x̂, ŷ) ∈ Π, i.e., dE(û(t))/dt = 0. From (17), we know that

dE(û(t))/dt ≤ ρ( −(H(û) − H(u^{∗}))^{T}(û − u^{∗}) − F(û)^{T}∇H(û)F(û) ).

Both terms inside the parentheses are nonpositive as shown in Lemma 4.2, so (H(û) − H(u^{∗}))^{T}(û − u^{∗}) = 0 and F(û)^{T}∇H(û)F(û) = 0. Moreover,

F(û)^{T}∇H(û)F(û) = {−x̂ + P_K(x̂ − ∇f(x̂) + A^{T}ŷ)}^{T} ∇^{2}f(x̂) {−x̂ + P_K(x̂ − ∇f(x̂) + A^{T}ŷ)} = 0.

Since ∇^{2}f(x̂) is positive definite, this leads to

−x̂ + P_K(x̂ − ∇f(x̂) + A^{T}ŷ) = 0,

which is equivalent to dx̂/dt = 0. On the other hand, similar to the arguments in Lemma 4.2, we have

(û − u^{∗})^{T}(H(û) − H(u^{∗})) = (x̂ − x^{∗})^{T}(∇f(x̂) − ∇f(x^{∗})) = (x̂ − x^{∗})^{T}∇^{2}f(x_s)(x̂ − x^{∗}) = 0,

where x_s ∈ [x^{∗}, x̂] by the mean value theorem. Again, the positive definiteness of ∇^{2}f(x_s) yields x̂ = x^{∗}. Hence dŷ/dt = −Ax̂ + b = −Ax^{∗} + b = 0, and therefore dû(t)/dt = 0. From the above, u(t) converges globally to the equilibrium point u^{∗} = (x^{∗}, y^{∗}). Moreover, with Theorem 4.1 and following the same arguments as in [12, Theorem 2], neural network (12) has finite convergence time. □

### 5 Simulations

To demonstrate the effectiveness of the proposed neural networks, three illustrative SOCP problems are tested, as described below.

Example 5.1 Consider the nonlinear convex SOCP [20] given by

minimize exp(x_1 − x_3) + 3(2x_1 − x_2)^{4} + √(1 + (3x_2 + 5x_3)^{2})
subject to Ax = b, x ∈ K^{3} × K^{2},

where

A = [  4  6   3  −1   0
      −1  7  −5   0  −1 ]   and   b = [  1
                                        −2 ].
This problem has an approximate solution x^{∗} = [0.2324, −0.07309, 0.2206, 0.153, 0.153]^{T}.
We use the proposed neural networks with the FB and CP functions, respectively, to solve this problem; the resulting trajectories are shown in Figures 3 and 4. From the simulation results, we find that both trajectories converge globally to x^{∗} and that the neural network with the CP function converges to x^{∗} more quickly than that with the FB function. Moreover, as noted in Section 4, the neural network with the CP function also has lower model complexity than that with the FB function. Hence, the neural network with the CP function is preferable when both networks converge globally to the optimal solution.
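The reported solution can be checked numerically; a small sketch (assuming NumPy; the check is ours, not part of the original simulation) verifies the equality constraint and the cone memberships for the x^{∗} quoted above:

```python
import numpy as np

A = np.array([[ 4.0, 6.0,  3.0, -1.0,  0.0],
              [-1.0, 7.0, -5.0,  0.0, -1.0]])
b = np.array([1.0, -2.0])
x_star = np.array([0.2324, -0.07309, 0.2206, 0.153, 0.153])

# Equality constraint Ax = b (x* is quoted to about 4 digits, so allow a small residual).
print(A @ x_star - b)                            # residual close to zero

# Cone constraints for K^3 × K^2: ||(x2, x3)|| <= x1 and |x5| <= x4.
print(np.linalg.norm(x_star[1:3]), x_star[0])    # both ≈ 0.2324: first block on the boundary
print(abs(x_star[4]), x_star[3])                 # 0.153 <= 0.153: second block on the boundary
```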

Example 5.2 Consider the following linear SOCP:

minimize x_1 + x_2 + x_3 + x_4 + x_5 + x_6
subject to Ax = b, x ∈ K^{3} × K^{3},

where

A = [ 1 2 0 0 0 1
      1 0 0 1 4 0
      0 1 1 0 1 0
      1 1 0 0 0 0
      0 0 1 0 2 0 ]   and   b = [  9
                                  20
                                   6
                                   4
                                   8 ].
This problem has an optimal solution x^{∗} = [3, 1, 2, 5, 3, 4]^{T}. Note that its objective function is convex and the Hessian matrix ∇^{2}f(x) is the zero matrix. Hence, the neural network with the FB function is asymptotically stable by Theorem 3.1, while the neural network with the CP function is Lyapunov stable by Theorem 4.1. Figures 5 and 6 display the trajectories obtained using the neural networks with the FB and CP functions, respectively. The simulation results show that both trajectories converge to x^{∗}. Consistent with the results of Theorems 3.1 and 4.1, the neural network with the CP function yields an oscillating trajectory and has a longer convergence time than the neural network with the FB function.

Example 5.3 Consider the grasping force optimization problem for a multi-fingered robotic hand [1, 4, 17]. Its goal is to find the minimum grasping force for moving an object. For a robotic hand with m fingers, the optimization problem can be formulated as

minimize (1/2) f^{T}f
subject to Gf = −f_{ext},
‖(f_{i1}, f_{i2})‖ ≤ μ f_{i3}, i = 1, · · · , m,

where f = [f_{11}, f_{12}, · · · , f_{m3}]^{T} is the grasping force, G the grasping transformation matrix, f_{ext} the time-varying external wrench, and μ the friction coefficient.

Letting [x_{i1}, x_{i2}, x_{i3}] = [μf_{i3}, f_{i1}, f_{i2}], i = 1, · · · , m, and x = [x_{11}, x_{12}, · · · , x_{m3}]^{T}, the problem can be reformulated as a nonlinear convex SOCP. For the three-finger grasp example in [17], the robot hand grasps a polyhedron with the grasp points [0, 1, 1]^{T}, [1, 0.5, 0]^{T}, and [0, −1, 0]^{T}, and the hand moves along a vertical circular trajectory of radius r with a constant velocity ν. We reformulate the example as

minimize (1/2) x^{T}Qx
subject to Ax = b, x ∈ K^{3} × K^{3} × K^{3}, (20)

where Q = diag(1/μ^{2}, 1, 1, 1/μ^{2}, 1, 1, 1/μ^{2}, 1, 1),

A = [   0    0   1  −1/μ    0    0    0   1   0
      −1/μ   0   0    0     0   −1   1/μ  0   0
        0   −1   0    0    −1    0    0   0  −1
        0   −1   0    0   −0.5   0    0   0   1
        0    0   0    0     1    0    0   0   0
        0    0  −1  0.5/μ   0   −1    0   1   0 ],

and b = [0, −f_c sin θ(t), Mg − f_c cos θ(t), 0, 0, 0]^{T}, where M is the mass of the polyhedron, g = 9.8 m/s^{2}, f_c = Mν^{2}/r is the centripetal force, t is the time, and θ = νt/r ∈ [0, 2π]. Note that problem (20) is a nonlinear convex SOCP and the matrix Q is positive definite. We know from Theorems 3.1 and 4.2 that both proposed neural networks are globally convergent to the optimal solution. Under the conditions M = 0.1 kg, r = 0.2 m, ν = 0.4π m/s, and μ = 0.6, the time-varying grasping force obtained from the proposed neural networks is shown in Figure 7. We find that the maximum grasping force occurs at the position θ = π (t = 0.5 s), which corresponds to the maximum downward wrench. The simulation results demonstrate that the neural networks are effective in SOCP applications.

### 6 Conclusion

In this paper, we have proposed two neural networks for efficiently solving the SOCP. The first neural network is based on the gradient of the merit function derived from the FB function and was shown to be asymptotically stable. The second neural network, with the CP function, has low model complexity and has been shown to be Lyapunov stable and globally convergent to the SOCP solution under the positive definiteness condition on the Hessian matrix of the objective function. The convergence of the neural networks has been validated with the simulation results of the SOCP examples. When the second neural network with the CP function yields an oscillating trajectory, we can employ the neural network based on the FB function instead, though it has higher model complexity. The proposed neural networks are thus ready for SOCP applications.

During the review process of this paper, we published another paper [26] focusing on the second-order cone constrained variational inequality problem. Since the KKT conditions of second-order cone programs can be recast as a variational inequality problem, the paper [26] indeed deals with a broader class of optimization problems. However, the two neural networks considered therein are different from the two studied in this paper. More specifically, the FB method used in [26] is based on the smoothed FB function, while the one studied here is based on the regular FB function; the CP method in [26] is based on a Lagrangian model which, even when reduced to the SOCP setting, is not the same as the one investigated here. Due to this essential difference, the assumptions used to establish stability are also different. In view of this, a numerical comparison among these neural networks for the SOCP would be an interesting topic.

### Acknowledgement

The work was supported by National Science Council of Taiwan under the Grant NSC 97-2221-E-214-034.

### References

[1] M. S. Lobo, L. Vandenberghe, S. Boyd and H. Lebret, Applications of second-order cone programming, Linear Algebra and its Applications, vol. 284, no. 1, pp. 193-228, 1998.

[2] F. Alizadeh and D. Goldfarb, Second-order cone programming, Mathematical Programming , vol. 95, no. 1, pp. 3-51, 2003.

[3] J.-S. Chen and P. Tseng, An unconstrained smooth minimization reformulation of the second-order cone complementarity problem, Mathematical Programming, vol. 104, pp. 293-327, 2005.

[4] S. P. Boyd and B. Wegbreit, Fast computation of optimal contact forces, IEEE Transactions on Robotics, vol. 23, no. 6, pp. 1117-1132, 2007.

[5] S. Boyd, C. Crusius and A. Hansson, Control applications of nonlinear convex programming, Journal of Process Control, vol. 8, no. 5, pp. 313-324, 1998.

[6] D. Bertsimas and D.B. Brown, Constrained stochastic LQC: a tractable ap- proach, IEEE Transactions on Automatic Control, vol. 52, no. 10, pp. 1826-1841, 2007.

[7] D.W. Tank and J.J. Hopfield, Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit, IEEE Transac- tions on Circuits and Systems, vol. 33, no. 5, pp. 533-541, 1986.

[8] M. P. Kennedy and L. O. Chua, A neural network for nonlinear programming, IEEE Transactions on Circuits and Systems, vol. 35, no. 5, pp. 554-562, 1988.

[9] Y.S. Xia, A new neural network for solving linear and quadratic programming prob- lems, IEEE Transactions on Neural Networks, vol. 7, no. 6, pp. 1544-1547, 1996.

[10] Q. Tao, J.D. Cao, M.S. Xue and H. Qiao, A high performance neural network for solving nonlinear programming problems with hybrid constraints, Physics Letters A, vol. 288, no. 2, pp. 88-94, 2001.

[11] J. Wang, Q. Hu, and D. Jiang, A Lagrangian neural network for kinematic control of redundant robot manipulators, IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 1123-1132, 1999.

[12] Y. Xia and J. Wang, A recurrent neural network for solving nonlinear convex programs subject to linear constraints, IEEE Transactions on Neural Networks, vol. 16, no. 3, pp. 379-386, 2005.

[13] Y. Xia, H. Leung and J. Wang, A projection neural network and its application to constrained optimization problems, IEEE Transactions on Circuits and Systems - Part I, vol. 49, pp. 447-458, 2002.

[14] Y. Xia and J. Wang, A recurrent neural network for nonlinear convex optimiza- tion subject to nonlinear inequality constraints, IEEE Transactions on Circuits and Systems-I: Regular Papers, vol. 51, no. 7, pp. 1385-1394, 2004.

[15] Y. Xia, H. Leung and J. Wang, A general projection neural network for solving monotone variational inequalities and related optimization problems, IEEE Transac- tions on Neural Networks, vol. 15, no. 2, pp. 318-328, 2004.

[16] X. Mu, S. Liu and Y. Zhang, A neural network algorithm for second-order conic programming, Second International Symposium on Neural Networks, Chongqing, China, Proceedings Part II, pp. 718-724, 2005.

[17] Y. Xia, J. Wang and L. M. Fok, Grasping-force optimization for multifingered robotic hands using a recurrent neural network, IEEE Transactions on Robotics and Automation, vol. 20, no. 3, pp. 549-554, 2004.

[18] L. Z. Liao and H. D. Qi, A neural network for the linear complementarity problem, Mathematical and Computer Modelling, vol. 29, no. 3, pp. 9-18, 1999.

[19] J. S. Chen, X. Chen and P. Tseng, Analysis of nonsmooth vector-valued func- tion associated with second-order cone, Mathematical Programming, vol. 101, no. 1, pp. 95-117, 2004.

[20] C. Kanzow, I. Ferenczi and M. Fukushima, On the local convergence of semis- mooth Newton methods for linear and nonlinear second-order cone programs without strict complementarity, SIAM Journal on Optimization, vol. 20, pp. 297–320, 2009.

[21] D. P. Bertsekas, Nonlinear Programming, Belmont, MA: Athena Scientific, 1995.

[22] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Philadelphia: SIAM, 2000.

[23] M. Fukushima, Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems, Mathematical Programming, vol. 53, no. 1, pp. 99-110, 1992.

[24] M. Fukushima, Z.-Q. Luo, and P. Tseng, Smoothing functions for second-order-cone complementarity problems, SIAM Journal on Optimization, vol. 12, pp. 436-460, 2002.

[25] R. Golden, Mathematical Methods for Neural Network Analysis and Design, Cam- bridge, MA: The MIT Press, 1996.

[26] J. Sun, J.-S. Chen and C.-H. Ko, Neural networks for solving second-order cone constrained variational inequality problem, to appear in Computational Optimization and Applications, 2011.

### Appendix

In this appendix, we introduce the Jordan product and its properties used in the neural network with the FB function; these are needed when writing code for the simulations.

For any x = (x_1, x_2), y = (y_1, y_2) ∈ R × R^{n−1}, their Jordan product is defined as

x ◦ y = (x^{T}y, y_1 x_2 + x_1 y_2).

Their sum of squares is given by

x^{2} + y^{2} = (‖x‖^{2} + ‖y‖^{2}, 2x_1 x_2 + 2y_1 y_2).

The square root of x ∈ K^{n} is

x^{1/2} = (s, x_2/(2s)),   s = √( (1/2)( x_1 + √(x_1^{2} − ‖x_2‖^{2}) ) ),

with x^{1/2} = 0 if x = 0, and the determinant of x is det(x) = x_1^{2} − ‖x_2‖^{2}. Furthermore, a matrix L_x is defined as

L_x = [ x_1   x_2^{T}
        x_2   x_1 I ],

and when det(x) ≠ 0, L_x is invertible with

L_x^{−1} = (1/det(x)) [ x_1      −x_2^{T}
                        −x_2   (det(x)/x_1) I + (1/x_1) x_2 x_2^{T} ].

Based on the properties of the Jordan product described above, the formulas for ∇_x Ψ_{FB}(x, y) and ∇_y Ψ_{FB}(x, y) in neural network (8) are calculated (see [3]) as

∇_x Ψ_{FB}(x, y) = ( L_x L_{(x^{2}+y^{2})^{1/2}}^{−1} − I ) φ_{FB}(x, y),
∇_y Ψ_{FB}(x, y) = ( L_y L_{(x^{2}+y^{2})^{1/2}}^{−1} − I ) φ_{FB}(x, y).

Figure 1: Block diagram of the proposed neural network with FB function.

Figure 2: Block diagram of the proposed neural network with CP function.

Figure 3: Transient behavior of the neural network with FB function in Example 5.1.

Figure 4: Transient behavior of the neural network with CP function in Example 5.1.

Figure 5: Transient behavior of the neural network with FB function in Example 5.2.

Figure 6: Transient behavior of the neural network with CP function in Example 5.2.

Figure 7: Grasping force obtained by using proposed neural networks in Example 5.3.