
to appear in Pacific Journal of Optimization, 2017

Applying a type of SOC-functions to solve a system of equalities and inequalities under the order induced by second-order cone

Xin-He Miao 1
Department of Mathematics, Tianjin University, Tianjin 300072, China

Nuo Qi 2
Department of Mathematics, Tianjin University, Tianjin 300072, China

B. Saheya 3
College of Mathematical Science, Inner Mongolia Normal University, Hohhot 010022, Inner Mongolia, China

Jein-Shan Chen 4
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

April 28, 2016 (revised on July 15, 2017)

Abstract. In this paper, we introduce a special type of SOC-function, which is a vector-valued function associated with the second-order cone. Using it, we construct a class of smoothing functions that converge to the projection function onto the second-order cone. We then reformulate the system of equalities and inequalities under the order induced by the second-order cone as a system of parameterized smooth equations. Accordingly, we propose a smoothing-type Newton algorithm to solve the reformulation, and show that the proposed algorithm is globally convergent and locally quadratically convergent under suitable assumptions. Preliminary numerical results demonstrate that the approach is effective. A numerical comparison based on various smoothing functions is reported as well.

1 E-mail: xinhemiao@tju.edu.cn. The author's work is supported by the National Natural Science Foundation of China (No. 11471241).

2 E-mail: qinuotju@126.com.

3 E-mail: saheya@imnu.edu.cn. The author's work is supported by the National Natural Science Foundation of China (No. 11402127, 11401326).

4 Corresponding author. E-mail: jschen@math.ntnu.edu.tw. The author's work is supported by the Ministry of Science and Technology, Taiwan.

Keywords. System of equalities and inequalities, second-order cone, SOC-function, smoothing algorithm, global convergence.

1 Introduction

The second-order cone (SOC for short and denoted by Kn) in IRn (n ≥ 1), also called the Lorentz cone, is defined as

Kn := { (x1, x2) ∈ IR × IR^{n−1} | ‖x2‖ ≤ x1 },

where ‖ · ‖ denotes the Euclidean norm. By the definition of Kn, if n = 1, then K1 is the set of nonnegative reals IR+. Moreover, a general second-order cone K is a Cartesian product of SOCs, i.e.,

K := K^{n1} × K^{n2} × · · · × K^{nr}.

Since all the analysis carries over to the Cartesian product setting, we focus on a single second-order cone Kn for simplicity. It is well known that the second-order cone Kn is a symmetric cone. During the past decade, optimization problems involving SOC constraints and their corresponding solution methods have been studied extensively, see [1, 5, 8, 13, 19, 20, 24, 29, 30, 31, 34, 35] and references therein.

There is a spectral decomposition with respect to the second-order cone Kn in IRn, which plays a very important role in the study of second-order cone optimization problems. For any vector x = (x1, x2) ∈ IR × IR^{n−1}, the spectral decomposition (or spectral factorization) of x with respect to Kn is given by

x = λ1(x) u_x^{(1)} + λ2(x) u_x^{(2)},   (1)

where λ1(x), λ2(x) and u_x^{(1)}, u_x^{(2)} are called the spectral values and the spectral vectors of x, respectively, with the corresponding formulas given below:

λi(x) = x1 + (−1)^i ‖x2‖,  i = 1, 2,   (2)

u_x^{(i)} = (1/2) ( 1, (−1)^i x2/‖x2‖ )  if x2 ≠ 0,   u_x^{(i)} = (1/2) ( 1, (−1)^i w )  if x2 = 0,   (3)

for i = 1, 2, with w being any vector in IR^{n−1} satisfying ‖w‖ = 1. Moreover, { u_x^{(1)}, u_x^{(2)} } is called a Jordan frame and satisfies the following properties:

u_x^{(1)} + u_x^{(2)} = e,   ⟨u_x^{(1)}, u_x^{(2)}⟩ = 0,   u_x^{(1)} ◦ u_x^{(2)} = 0   and   u_x^{(i)} ◦ u_x^{(i)} = u_x^{(i)}  (i = 1, 2),

where e = (1, 0, · · · , 0)^T ∈ IR^n is the unit element and the Jordan product x ◦ y is defined by x ◦ y := (⟨x, y⟩, x1 y2 + y1 x2) ∈ IR × IR^{n−1} for any x = (x1, x2), y = (y1, y2) ∈ IR × IR^{n−1}. For more details about the Jordan product, please refer to [11].
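The following is a small illustrative sketch (ours, not part of the paper; the authors' experiments use MATLAB) of the spectral decomposition (1)-(3) and the Jordan frame properties above, written in Python with NumPy; the helper names spectral and jordan are our own.

import numpy as np

def spectral(x):
    # spectral values lambda_i(x) = x1 + (-1)^i ||x2|| and spectral vectors u_x^(i) as in (2)-(3)
    x1, x2 = x[0], x[1:]
    nx2 = np.linalg.norm(x2)
    lam = np.array([x1 - nx2, x1 + nx2])
    w = x2 / nx2 if nx2 > 0 else np.eye(len(x2))[0]   # any unit vector w when x2 = 0
    u = [0.5 * np.concatenate(([1.0], -w)), 0.5 * np.concatenate(([1.0], w))]
    return lam, u

def jordan(x, y):
    # Jordan product x o y = (<x, y>, x1*y2 + y1*x2)
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

x = np.array([1.0, 2.0, -0.5])
lam, u = spectral(x)
e = np.eye(len(x))[0]                                  # unit element e = (1, 0, ..., 0)
print(np.allclose(x, lam[0] * u[0] + lam[1] * u[1]))   # decomposition (1)
print(np.allclose(u[0] + u[1], e),
      np.isclose(u[0] @ u[1], 0.0),
      np.allclose(jordan(u[0], u[1]), 0.0),
      np.allclose(jordan(u[0], u[0]), u[0]))           # Jordan frame properties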

In [5, 6], for any real-valued function f : IR → IR and x = (x1, x2) ∈ IR × IRn−1, based on the spectral factorization of x with respect to Kn, a type of vector-valued function associated with Kn (also called SOC-function) is introduced. More specifically, if we apply f to the spectral values of x in (1), then we obtain the function fsoc : IRn → IRn given by

fsoc(x) = f(λ1(x)) u_x^{(1)} + f(λ2(x)) u_x^{(2)}.   (4)

From the expression (4), it is clear that the SOC-function fsoc is well defined whether x2 = 0 or x2 ≠ 0. Further properties regarding fsoc were discussed in [3, 4, 5, 7, 17, 32].

It is also known that such SOC-functions fsoc associated with the second-order cone play a crucial role in the theory and numerical algorithms for second-order cone programming, see [1, 5, 8, 13, 19, 20, 24, 29, 30, 31, 34, 35] again.

In this paper, in light of the definition of fsoc, we define another type of SOC-function Φµ (see Section 2 for details). In particular, using the SOC-function Φµ, we will solve the following system of equalities and inequalities under the order induced by the second-order cone:

fI(x) ⪯_{Km} 0,
fE(x) = 0,   (5)

where fI(x) = (f1(x), · · · , fm(x))^T, fE(x) = (fm+1(x), · · · , fn(x))^T, and "x ⪯_{Km} 0" means "−x ∈ Km". Likewise, x ⪰_{Km} 0 means x ∈ Km and x ≻_{Km} 0 means x ∈ int(Km), where int(Km) denotes the interior of Km. Throughout this paper, we assume that fi is continuously differentiable for any i ∈ {1, 2, ..., n}. We also define

f(x) := ( fI(x) ; fE(x) ),

and hence f is continuously differentiable. When Km = IR^m_+, the system (5) reduces to the standard system of equalities and inequalities over IR^m. The corresponding standard


system (5) has been studied extensively due to its various applications, and there are many methods for solving such problems, see [10, 27, 28, 33, 37]. In the setting of the second-order cone, the KKT conditions of second-order cone constrained optimization problems can be expressed in the form of (5), i.e., as a system of equalities and inequalities under the order induced by second-order cones. For example, for the second-order cone optimization problem

min h(x)
s.t. −g(x) ∈ Km,

the KKT conditions are

∇h(x) + ∇g(x)λ = 0,   λ^T g(x) = 0,
−λ ⪯_{Km} 0,   g(x) ⪯_{Km} 0,

where ∇g(x) denotes the gradient matrix of g. Now, by denoting

fI(x, λ) := ( −λ ; g(x) )   and   fE(x, λ) := ( ∇h(x) + ∇g(x)λ ; λ^T g(x) ),

it is clear that the KKT conditions of the second-order cone optimization problem are in the form of (5). From this viewpoint, the investigation of the system (5) provides a theoretical way for solving second-order cone optimization problems. Hence, the study of the system (5) is important, and that is the main motivation for this paper.

So far, there are many kinds of numerical methods for solving second-order cone optimization problems. Among them is a class of popular numerical methods, the so-called smoothing-type algorithms. This kind of algorithm has also been a powerful tool for solving many other optimization problems, including symmetric cone complementarity problems [15, 16, 20, 21, 22], symmetric cone linear programming [23, 26], the system of inequalities under the order induced by symmetric cones [18, 25, 38], and so on. In most of these recent studies, the existing smoothing-type algorithms were designed on the basis of a monotone line search. In order to achieve better computational results, a nonmonotone line search technique is sometimes adopted in the numerical implementation of smoothing-type algorithms [15, 36, 37]. The main reason is that the nonmonotone line search scheme can improve the likelihood of finding a global optimal solution and the convergence speed in cases where the function involved is highly nonconvex or has a valley in a small neighborhood of some point. In view of this, in this paper we also develop a nonmonotone smoothing-type algorithm for solving the system of equalities and inequalities under the order induced by second-order cones.


The remaining parts of this paper are organized as follows. In Section 2, some background concepts and preliminary results about the second-order cone are given. In Section 3, we reformulate (5) as a system of smoothing equations in which Φµ is employed.

In Section 4, we propose a nonmonotone smoothing-type algorithm for solving (5), and show that the algorithm is well defined. Moreover, we also discuss the global convergence and local quadratic convergence of the proposed algorithm. Preliminary numerical results demonstrating that the proposed algorithm is effective are reported in Section 5. Some numerical comparisons in light of performance profiles are presented, which indicate the difference in numerical performance when various smoothing functions are used.

2 Preliminaries

In this section, we briefly review some basic properties of the second-order cone and of the vector-valued functions associated with SOC, which will be used extensively in the subsequent analysis. More details about the second-order cone and the vector-valued functions can be found in [3, 4, 5, 13, 14, 17].

First, we review the projection of x ∈ IRn onto the second-order cone Kn ⊂ IRn. For the second-order cone Kn, let (Kn)* denote its dual cone. Then, (Kn)* is given by

(Kn)* := { y = (y1, y2) ∈ IR × IR^{n−1} | ⟨x, y⟩ ≥ 0, ∀x ∈ Kn }.

Moreover, it is well known that the second-order cone Kn is a self-dual cone, i.e., (Kn)* = Kn. Let x+ denote the projection of x ∈ IRn onto the second-order cone Kn, and x− denote the projection of −x onto the dual cone (Kn)*. With these notations, for any x ∈ IRn, it is not hard to verify that x = x+ − x−. In particular, due to the special structure of Kn, the explicit formula of the projection of x ∈ IRn onto Kn is obtained in [14] as below:

x+ = x  if x ∈ Kn,   x+ = 0  if x ∈ −(Kn)* = −Kn,   x+ = u  otherwise,   (6)

where

u = ( (x1 + ‖x2‖)/2 ,  ((x1 + ‖x2‖)/2) · x2/‖x2‖ ).

In fact, according to the spectral decomposition of x, the projection x+ onto Kn can alternatively be expressed as (see [13, Prop. 3.3(b)])

x+ = (λ1(x))_+ u_x^{(1)} + (λ2(x))_+ u_x^{(2)},

where (α)_+ = max{0, α} for any α ∈ IR.
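As a quick illustration (ours, not from the paper), the following Python/NumPy sketch computes x+ both by the explicit formula (6) and by the spectral formula above, and checks the decomposition x = x+ − x−; the function names are hypothetical.

import numpy as np

def proj_soc_explicit(x):
    x1, x2 = x[0], x[1:]
    nx2 = np.linalg.norm(x2)
    if nx2 <= x1:                      # x in K^n
        return x.copy()
    if nx2 <= -x1:                     # x in -K^n
        return np.zeros_like(x)
    t = (x1 + nx2) / 2.0               # otherwise: the "u" case in (6)
    return np.concatenate(([t], t * x2 / nx2))

def proj_soc_spectral(x):
    x1, x2 = x[0], x[1:]
    nx2 = np.linalg.norm(x2)
    lam = np.array([x1 - nx2, x1 + nx2])
    w = x2 / nx2 if nx2 > 0 else np.eye(len(x2))[0]
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0], w))
    return max(lam[0], 0.0) * u1 + max(lam[1], 0.0) * u2

x = np.array([-0.3, 1.2, 0.4, -0.7])
print(np.allclose(proj_soc_explicit(x), proj_soc_spectral(x)))             # the two formulas agree
print(np.allclose(x, proj_soc_explicit(x) - proj_soc_explicit(-x)))        # x = x_+ - x_-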

From the definition (4) of the vector-valued function associated with Kn, we know that the projection x+ onto Kn is a vector-valued function. Moreover, it is known that the projection x+ and the scalar function (α)_+ share many of the same properties, such as continuity, directional differentiability, semismoothness, and so on. Indeed, these properties are established for general vector-valued functions associated with SOC. In particular, Chen, Chen and Tseng [5] showed that many properties of fsoc are inherited from the function f, as presented in the following proposition.

Proposition 2.1. Suppose that x = (x1, x2) ∈ IR × IR^{n−1} has the spectral decomposition given as in (1)-(3). For any function f : IR → IR and the vector-valued function fsoc defined by (4), the following hold.

(a) fsoc is continuous at x ∈ IRn with spectral values λ1(x), λ2(x) ⇐⇒ f is continuous at λ1(x), λ2(x);

(b) fsoc is directionally differentiable at x ∈ IRn with spectral values λ1(x), λ2(x) ⇐⇒ f is directionally differentiable at λ1(x), λ2(x);

(c) fsoc is differentiable at x ∈ IRn with spectral values λ1(x), λ2(x) ⇐⇒ f is differentiable at λ1(x), λ2(x);

(d) fsoc is strictly continuous at x ∈ IRn with spectral values λ1(x), λ2(x) ⇐⇒ f is strictly continuous at λ1(x), λ2(x);

(e) fsoc is semismooth at x ∈ IRn with spectral values λ1(x), λ2(x) ⇐⇒ f is semismooth at λ1(x), λ2(x);

(f ) fsoc is continuously differentiable at x ∈ IRn with spectral values λ1(x), λ2(x) ⇐⇒ f is continuously differentiable at λ1(x), λ2(x).

Note that the projection function x+ onto Kn is not a smooth function on the whole space IRn. From Proposition 2.1, we can construct smoothing functions for the projection x+ onto Kn if we smooth the functions (λi(x))_+ for i = 1, 2. More specifically, we consider a family of smoothing functions φ(µ, ·) : IR → IR with respect to (α)_+ satisfying

lim_{µ↓0} φ(µ, α) = (α)_+   and   0 ≤ ∂φ/∂α (µ, α) ≤ 1   (7)

for all α ∈ IR. Are there functions satisfying the above conditions? Yes, there are many.


We illustrate three of them here:

φ1(µ, α) = ( √(α² + 4µ²) + α ) / 2,   (µ > 0)

φ2(µ, α) = µ ln( e^{α/µ} + 1 ),   (µ > 0)

φ3(µ, α) = α  if α ≥ µ,   φ3(µ, α) = (α + µ)²/(4µ)  if −µ < α < µ,   φ3(µ, α) = 0  if α ≤ −µ.   (µ > 0)

In fact, the functions φ1 and φ2 were considered in [13, 17], while the function φ3 was employed in [18, 37]. In addition, as for the function φ3, there is a more general function φp(µ, ·) : IR → IR given by

φp(µ, α) = α  if α ≥ µ/(p−1),   φp(µ, α) = (µ/(p−1)) [ (p−1)(α + µ)/(pµ) ]^p  if −µ < α < µ/(p−1),   φp(µ, α) = 0  if α ≤ −µ,

where µ > 0 and p ≥ 2. This function φp was recently studied in [9], and it is not hard to verify that φp also satisfies the above conditions (7). All the functions φ1, φ2 and φ3 will play the role of smoothing functions for f(λi(x)) in (4). In other words, based on these smoothing functions, we define a type of SOC-function Φµ(·) on IRn associated with Kn (n ≥ 1) as

Φµ(x) := φ(µ, λ1(x)) u_x^{(1)} + φ(µ, λ2(x)) u_x^{(2)}   ∀x = (x1, x2) ∈ IR × IR^{n−1},   (8)

where λ1(x), λ2(x) are given by (2) and u_x^{(1)}, u_x^{(2)} are given by (3). In light of the properties of φ(µ, α), we show below that the SOC-function Φµ(x) is a smoothing function for the projection function x+ onto Kn.

We depict the graphs of φi(µ, α) for i = 1, 2, 3 in Figure 1. From Figure 1, we see that φ3 is the one that best approximates the function (α)_+, in the sense that it is closest to (α)_+ among all φi(µ, α) for i = 1, 2, 3.
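To make the construction concrete, here is a small Python/NumPy sketch (ours, not the authors' MATLAB code) of φ1, φ2, φ3 and of Φµ in (8), illustrating the smoothing behavior stated in Proposition 2.2(c) below: as µ decreases, Φµ(x) approaches the projection x+.

import numpy as np

def phi1(mu, a): return (np.sqrt(a**2 + 4*mu**2) + a) / 2.0
def phi2(mu, a): return mu * np.logaddexp(0.0, a / mu)        # mu * ln(exp(a/mu) + 1), overflow-safe
def phi3(mu, a):
    if a >= mu:  return a
    if a <= -mu: return 0.0
    return (a + mu)**2 / (4.0 * mu)

def Phi(mu, x, phi):
    # Phi_mu(x) = phi(mu, lambda_1) u^(1) + phi(mu, lambda_2) u^(2) as in (8)
    x1, x2 = x[0], x[1:]
    nx2 = np.linalg.norm(x2)
    lam1, lam2 = x1 - nx2, x1 + nx2
    w = x2 / nx2 if nx2 > 0 else np.eye(len(x2))[0]
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0], w))
    return phi(mu, lam1) * u1 + phi(mu, lam2) * u2

x = np.array([0.2, -1.0, 0.5])
x_plus = Phi(1.0, x, lambda mu, a: max(a, 0.0))               # projection via the spectral formula
for mu in (0.5, 0.1, 0.01, 0.001):
    errs = [np.linalg.norm(Phi(mu, x, p) - x_plus) for p in (phi1, phi2, phi3)]
    print(mu, errs)                                           # errors shrink as mu decreases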

Proposition 2.2. Suppose that x = (x1, x2) ∈ IR × IR^{n−1} has the spectral decomposition given as in (1)-(3), and that φ(µ, ·) with µ > 0 is a continuously differentiable function satisfying (7). Then, the following hold.

(a) The function Φµ : IRn → IRn defined as in (8) is continuously differentiable.

Moreover, its Jacobian matrix at x is described as

∂Φµ(x)/∂x = (∂φ/∂α)(µ, x1) I   if x2 = 0,

∂Φµ(x)/∂x = [ b ,  c x2^T/‖x2‖ ;  c x2/‖x2‖ ,  a I + (b − a) x2 x2^T/‖x2‖² ]   if x2 ≠ 0,   (9)


Figure 1: Graphs of max(0, t) and all three φi(µ, t) with µ = 0.2.

where

a = ( φ(µ, λ2(x)) − φ(µ, λ1(x)) ) / ( λ2(x) − λ1(x) ),
b = (1/2) [ (∂φ/∂α)(µ, λ2(x)) + (∂φ/∂α)(µ, λ1(x)) ],
c = (1/2) [ (∂φ/∂α)(µ, λ2(x)) − (∂φ/∂α)(µ, λ1(x)) ];   (10)

(b) Both ∂Φµ(x)/∂x and I − ∂Φµ(x)/∂x are positive semi-definite matrices;

(c) lim_{µ→0} Φµ(x) = x+ = (λ1(x))_+ u_x^{(1)} + (λ2(x))_+ u_x^{(2)}.

Proof. (a) From the expression (8) and the assumption of φ(µ, ·) being continuously differentiable, it is easy to verify that the function Φµ is continuously differentiable. The Jacobian matrix (9) of Φµ(x) can be obtained by adopting the same arguments as in [13, Proposition 5.2]. Hence, we omit the details here.

(b) First, we prove that the matrix ∂Φµ(x)/∂x is positive semi-definite. For the case x2 = 0, we know that ∂Φµ(x)/∂x = (∂φ/∂α)(µ, x1) I. Then, from 0 ≤ (∂φ/∂α)(µ, α) ≤ 1, it is clear that the matrix ∂Φµ(x)/∂x is positive semi-definite. For the case x2 ≠ 0, from (∂φ/∂α)(µ, α) ≥ 0 and (10), we have b ≥ 0. In order to prove that the matrix ∂Φµ(x)/∂x is positive semi-definite, we only need to verify that the Schur complement of b with respect to ∂Φµ(x)/∂x is positive semi-definite. Note that the Schur complement of b has the form

a I + (b − a) x2 x2^T/‖x2‖² − (c²/b) x2 x2^T/‖x2‖² = a ( I − x2 x2^T/‖x2‖² ) + ((b² − c²)/b) x2 x2^T/‖x2‖².

Since (∂φ/∂α)(µ, α) ≥ 0, the function φ(µ, α) is increasing in α, which leads to a ≥ 0. Besides, from (10), we observe that

b² − c² = (∂φ/∂α)(µ, λ2(x)) · (∂φ/∂α)(µ, λ1(x)) ≥ 0.

With this, it follows that the Schur complement of b with respect to ∂Φµ(x)/∂x is a nonnegative linear combination of the matrices x2 x2^T/‖x2‖² and I − x2 x2^T/‖x2‖², both of which are positive semi-definite. Thus, the Schur complement of b is positive semi-definite, which says the matrix ∂Φµ(x)/∂x is positive semi-definite.

Combining this with (∂φ/∂α)(µ, α) ≤ 1 and following similar arguments as above, we can also show that the matrix I − ∂Φµ(x)/∂x is positive semi-definite.

(c) By the definition of the function Φµ(x), it can be verified directly. □

We point out that the definition (8) includes, as a special case, the way smoothing functions are defined in [13, Section 4], and hence [13, Prop. 4.1] is covered by Proposition 2.2. Indeed, Proposition 2.2 can also be verified from a geometric viewpoint. More specifically, from Figures 2, 3 and 4, we see that as µ ↓ 0, φi gets closer to (α)_+, which verifies Proposition 2.2(c).
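The following Python/NumPy sketch (ours, for illustration only) checks the Jacobian formula (9)-(10) for Φµ built from φ1 against a finite-difference Jacobian, and verifies Proposition 2.2(b) numerically via the eigenvalues of the (symmetric) Jacobian.

import numpy as np

mu = 0.2
phi  = lambda a: (np.sqrt(a**2 + 4*mu**2) + a) / 2.0          # phi_1(mu, .)
dphi = lambda a: (a / np.sqrt(a**2 + 4*mu**2) + 1.0) / 2.0    # d phi_1 / d alpha

def Phi(x):
    x1, x2 = x[0], x[1:]; nx2 = np.linalg.norm(x2)
    lam1, lam2 = x1 - nx2, x1 + nx2
    w = x2 / nx2 if nx2 > 0 else np.eye(len(x2))[0]
    u1 = 0.5 * np.concatenate(([1.0], -w)); u2 = 0.5 * np.concatenate(([1.0], w))
    return phi(lam1) * u1 + phi(lam2) * u2

def jac_formula(x):
    # Jacobian (9) with a, b, c as in (10)
    x1, x2 = x[0], x[1:]; nx2 = np.linalg.norm(x2)
    if nx2 == 0:
        return dphi(x1) * np.eye(len(x))
    lam1, lam2 = x1 - nx2, x1 + nx2
    a = (phi(lam2) - phi(lam1)) / (lam2 - lam1)
    b = 0.5 * (dphi(lam2) + dphi(lam1))
    c = 0.5 * (dphi(lam2) - dphi(lam1))
    w = x2 / nx2; P = np.outer(w, w)
    J = np.zeros((len(x), len(x)))
    J[0, 0] = b; J[0, 1:] = c * w; J[1:, 0] = c * w
    J[1:, 1:] = a * (np.eye(len(x2)) - P) + b * P             # a I + (b - a) x2 x2^T / ||x2||^2
    return J

x = np.array([0.3, -0.8, 0.6, 0.1])
J = jac_formula(x)
Jfd = np.column_stack([(Phi(x + 1e-6 * e) - Phi(x - 1e-6 * e)) / 2e-6 for e in np.eye(len(x))])
print(np.allclose(J, Jfd, atol=1e-5))                         # formula matches finite differences
eig = np.linalg.eigvalsh(J)
print(eig.min() >= -1e-10, eig.max() <= 1 + 1e-10)            # Prop 2.2(b): J and I - J are PSD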

Figure 2: Graphs of φ1(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.

3 Applying Φµ to solve the system (5)

In this section, in light of the smoothing vector-valued function Φµ, we reformulate (5) as a system of smoothing equations. To this end, we need the partial order induced by SOC. More specifically, for any x ∈ IRn, using the definition of the partial order "⪯_{Km}" and the projection function x+ in (6), we have

fI(x) ⪯_{Km} 0 ⇐⇒ −fI(x) ∈ Km ⇐⇒ fI(x) ∈ −Km ⇐⇒ (fI(x))_+ = 0.

Hence, the system (5) is equivalent to the following system of equations:

 (fI(x))+ = 0,

fE(x) = 0. (11)


Figure 3: Graphs of φ2(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.

Figure 4: Graphs of φ3(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.

Note that the function (fI(·))_+ in the above equation (11) is nonsmooth. Therefore, smoothing-type Newton methods cannot be directly applied to solve equation (11).

To overcome this, we employ the smoothing function Φµ(·) defined in (8), and define the following function:

F(µ, x, y) := ( fI(x) − y ;  fE(x) ;  Φµ(y) ).


From Proposition 2.2(c), it follows that

F(µ, x, y) = 0 and µ = 0
⇐⇒ y = fI(x), fE(x) = 0, Φµ(y) = 0 and µ = 0
⇐⇒ y = fI(x), fE(x) = 0 and y_+ = 0
⇐⇒ (fI(x))_+ = 0, fE(x) = 0
⇐⇒ fI(x) ⪯_{Km} 0, fE(x) = 0.

In other words, once the system F(µ, x, y) = 0 with µ = 0 is solved, the corresponding x is a solution to the original system (5). In view of Proposition 2.2(a), we can obtain a solution to the system (5) by applying a smoothing-type Newton method to F(µ, x, y) = 0 while driving µ ↓ 0 at the same time. To do this, for any z = (µ, x, y) ∈ IR++ × IRn × IRm, we further define a continuously differentiable function H : IR++ × IRn × IRm → IR++ × IRn × IRm as follows:

H(z) := ( µ ;  fI(x) − y + µ xI ;  fE(x) + µ xE ;  Φµ(y) + µ y ),   (12)

where xI := (x1, x2, ..., xm)^T ∈ IRm, xE := (xm+1, ..., xn)^T ∈ IR^{n−m}, x := (xI^T, xE^T)^T ∈ IRn and y ∈ IRm. Then, it is clear that when H(z) = 0, we have µ = 0 and x is a solution to the system (5). Now, letting H′(z) denote the Jacobian matrix of the function H at z, for any z ∈ IR++ × IRn × IRm we obtain

H′(z) = [ 1 ,  0_n^T ,  0_m^T ;
          xI ,  fI′(x) + µU ,  −Im ;
          xE ,  fE′(x) + µV ,  0_{(n−m)×m} ;
          ∂Φµ(y)/∂µ + y ,  0_{m×n} ,  ∂Φµ(y)/∂y + µIm ],   (13)

where U := [ Im , 0_{m×(n−m)} ], V := [ 0_{(n−m)×m} , I_{n−m} ], 0_l denotes the l-dimensional zero vector, and 0_{l×q} denotes the l × q zero matrix for any positive integers l and q. In summary, we apply a smoothing-type Newton method to the smoothed equation H(z) = 0, keeping µ > 0 at each iteration while driving H(z) → 0, in order to find a solution of the system (5).
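To illustrate (12), here is a small Python/NumPy sketch (ours; the functions fI and fE below are toy data, not from the paper) that assembles H(z) for an instance with m = 2 (one K² inequality block) and one equality, with Φµ built from φ1.

import numpy as np

m, n = 2, 3
def fI(x): return np.array([x[0] - 1.0, x[1] + x[2]])          # toy inequality block (order induced by K^2)
def fE(x): return np.array([x[0] + x[1] + x[2] - 1.0])         # toy equality

def Phi(mu, y):
    # Phi_mu from phi_1 via (8)
    y1, y2 = y[0], y[1:]; ny2 = np.linalg.norm(y2)
    lam = np.array([y1 - ny2, y1 + ny2])
    w = y2 / ny2 if ny2 > 0 else np.eye(len(y2))[0]
    u = [0.5 * np.concatenate(([1.0], -w)), 0.5 * np.concatenate(([1.0], w))]
    phi = lambda a: (np.sqrt(a**2 + 4 * mu**2) + a) / 2.0
    return phi(lam[0]) * u[0] + phi(lam[1]) * u[1]

def H(z):
    # H(z) = (mu; fI(x) - y + mu*xI; fE(x) + mu*xE; Phi_mu(y) + mu*y) as in (12)
    mu, x, y = z[0], z[1:1 + n], z[1 + n:]
    xI, xE = x[:m], x[m:]
    return np.concatenate(([mu], fI(x) - y + mu * xI, fE(x) + mu * xE, Phi(mu, y) + mu * y))

z0 = np.concatenate(([1.0], np.zeros(n), np.zeros(m)))
print(H(z0))     # driving H(z) to 0 (which forces mu = 0) certifies a solution of (5)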

4 A smoothing-type Newton algorithm

Now, we consider a smoothing-type Newton algorithm with a nonmonotone line search, and show that the algorithm is well defined. For convenience, we denote the merit function Ψ by Ψ(z) := ‖H(z)‖² for any z ∈ IR++ × IRn × IRm.


Algorithm 4.1. (A smoothing-type Newton Algorithm)

Step 0 Choose γ ∈ (0, 1), ξ ∈ (0, 1/2). Take η > 0, σ ∈ (0, 1) such that ση < 1. Let µ0 = η and (x0, y0) ∈ IRn × IRm be an arbitrary vector. Set z0 = (µ0, x0, y0), e0 := (1, 0, ..., 0) ∈ IR × IRn × IRm, G0 := ‖H(z0)‖² = Ψ(z0) and S0 := 1. Choose βmin and βmax such that 0 ≤ βmin ≤ βmax < 1. Set τ(z0) := σ min{1, Ψ(z0)} and k := 0.

Step 1 If ‖H(zk)‖ = 0, stop. Otherwise, go to Step 2.

Step 2 Compute ∆zk := (∆µk, ∆xk, ∆yk) ∈ IR × IRn × IRm by

H′(zk) ∆zk = −H(zk) + ητ(zk) e0.   (14)

Step 3 Let αk be the maximum of the values 1, γ, γ², ... such that

Ψ(zk + αk ∆zk) ≤ [1 − 2ξ(1 − ση)αk] Gk.   (15)

Step 4 Set zk+1 := zk + αk ∆zk. If ‖H(zk+1)‖ = 0, stop. Otherwise, go to Step 5.

Step 5 Choose βk ∈ [βmin, βmax]. Set

Sk+1 := βk Sk + 1,
τ(zk+1) := min{ σ, σΨ(zk+1), τ(zk) },
Gk+1 := ( βk Sk Gk + Ψ(zk+1) ) / Sk+1,   (16)

and set k := k + 1. Go to Step 2.

The nonmonotone line search technique in Algorithm 4.1 was introduced in [36]. From the first and third equations in (16), we know that Gk+1 is a convex combination of Gk and Ψ(zk+1). In fact, Gk is expressed as a convex combination of Ψ(z0), Ψ(z1), ..., Ψ(zk).

Moreover, the main role of βk is to control the degree of non-monotonicity. If βk = 0 for every k, then the corresponding line search is the usual monotone Armijo line search.
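A compact Python/NumPy sketch of Algorithm 4.1 is given below (ours, for illustration; the authors' implementation is in MATLAB). For brevity the Jacobian of H is approximated by central differences instead of the explicit H′(z) in (13), and the problem data at the end are a toy K²-inequality instance.

import numpy as np

def num_jac(H, z, h=1e-7):
    # central-difference Jacobian; the paper uses the explicit formula (13) instead
    return np.column_stack([(H(z + h * e) - H(z - h * e)) / (2 * h) for e in np.eye(len(z))])

def smoothing_newton(H, z0, gamma=0.3, xi=1e-4, eta=1.0, sigma=1e-5, beta=0.01, tol=1e-6, maxit=500):
    z = z0.copy()
    Psi = lambda v: np.linalg.norm(H(v))**2
    G, S = Psi(z), 1.0                                  # Step 0
    tau = sigma * min(1.0, Psi(z))
    e0 = np.zeros(len(z)); e0[0] = 1.0
    for _ in range(maxit):
        if np.linalg.norm(H(z)) <= tol:                 # Steps 1 / 4: stopping test
            break
        dz = np.linalg.solve(num_jac(H, z), -H(z) + eta * tau * e0)     # Newton equation (14)
        alpha = 1.0
        while Psi(z + alpha * dz) > (1 - 2 * xi * (1 - sigma * eta) * alpha) * G and alpha > 1e-12:
            alpha *= gamma                              # nonmonotone line search (15)
        z = z + alpha * dz
        S_new = beta * S + 1.0                          # updates (16), with beta_k fixed to beta
        tau = min(sigma, sigma * Psi(z), tau)
        G = (beta * S * G + Psi(z)) / S_new
        S = S_new
    return z

# toy instance with inequalities only (m = n = 2): f(x) = M x + q  <=_{K^2}  0
M = np.array([[2.0, 0.5], [0.5, 1.0]]); q = np.array([1.0, 1.0])
def Phi1(mu, y):
    y1, y2 = y[0], y[1:]; ny2 = np.linalg.norm(y2)
    lam = np.array([y1 - ny2, y1 + ny2])
    w = y2 / ny2 if ny2 > 0 else np.eye(len(y2))[0]
    u = [0.5 * np.concatenate(([1.0], -w)), 0.5 * np.concatenate(([1.0], w))]
    phi = lambda a: (np.sqrt(a**2 + 4 * mu**2) + a) / 2.0
    return phi(lam[0]) * u[0] + phi(lam[1]) * u[1]
def H(z):
    mu, x, y = z[0], z[1:3], z[3:]
    return np.concatenate(([mu], M @ x + q - y + mu * x, Phi1(mu, y) + mu * y))

z = smoothing_newton(H, np.concatenate(([1.0], np.zeros(2), np.zeros(2))))
print(np.linalg.norm(H(z)), z[1:3])    # final residual and a computed x satisfying M x + q  <=_{K^2}  0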

Proposition 4.1. Suppose that the sequences {zk}, {µk}, {Gk}, {Ψ(zk)} and {τ (zk)}

are generated by Algorithm 4.1. Then, the following hold.

(a) The sequence {Gk} is monotonically decreasing and Ψ(zk) ≤ Gk for all k ∈ N;

(b) The sequence {τ (zk)} is monotonically decreasing;

(c) ητ (zk) ≤ µk for all k ∈ N;

(d) The sequence {µk} is monotonically decreasing and µk > 0 for all k ∈ N.

(13)

Proof. The proof is similar to that of Remark 3.1 in [37]; we omit the details. □

Next, we show that Algorithm 4.1 is well-defined and establish its local quadratic convergence. For simplicity, we denote the Jacobian matrix of the function f by

f′(x) := ( fI′(x) ; fE′(x) )

and use the following assumption.

Assumption 4.1. f′(x) + µIn is invertible for any x ∈ IRn and µ ∈ IR++.

We point out that Assumption 4.1 is only a mild condition, and there are many functions satisfying it. For example, if f is a monotone function, then f′(x) is a positive semi-definite matrix for any x ∈ IRn. Thus, Assumption 4.1 is satisfied.

Theorem 4.1. Suppose that f is a continuously differentiable function and Assumption 4.1 is satisfied. Then, Algorithm 4.1 is well-defined.

Proof. In order to show that Algorithm 4.1 is well-defined, we need to prove that the Newton equation (14) is solvable and that the line search (15) is well-defined.

First, we prove that the Newton equation (14) is solvable. By the expression of the Jacobian matrix H′(z) in (13), we see that the determinant of H′(z) satisfies

det(H′(z)) = det( f′(x) + µIn ) · det( ∂Φµ(y)/∂y + µIm )

for any z ∈ IR++ × IRn × IRm. Moreover, from Proposition 2.2(b), we know that ∂Φµ(y)/∂y is positive semi-definite for µ ∈ IR++. Hence, combining this with Assumption 4.1, we obtain that H′(z) is nonsingular for any z ∈ IR++ × IRn × IRm with µ > 0. Applying Proposition 4.1(d), it follows that the Newton equation (14) is solvable.

Secondly, we prove that the line search (15) is well-defined. For notational convenience, we denote

wk(α) := Ψ(zk + α∆zk) − Ψ(zk) − αΨ′(zk)∆zk.

From the Newton equation (14) and the definition of Ψ, we have

Ψ(zk + α∆zk) = wk(α) + Ψ(zk) + αΨ′(zk)∆zk
             = wk(α) + Ψ(zk) + 2α H(zk)^T ( −H(zk) + ητ(zk) e0 )
             ≤ wk(α) + (1 − 2α)Ψ(zk) + 2αητ(zk)‖H(zk)‖.

If Ψ(zk) ≤ 1, then we have ‖H(zk)‖ ≤ 1. Hence, it follows that

τ(zk)‖H(zk)‖ ≤ σΨ(zk)‖H(zk)‖ ≤ σΨ(zk).


If Ψ(zk) > 1, then Ψ(zk) = ‖H(zk)‖² ≥ ‖H(zk)‖, which yields τ(zk)‖H(zk)‖ ≤ σ‖H(zk)‖ ≤ σΨ(zk).

Thus, from all the above, we obtain

Ψ(zk + α∆zk) ≤ wk(α) + (1 − 2α)Ψ(zk) + 2αησΨ(zk)
             = wk(α) + [1 − 2(1 − ση)α]Ψ(zk)   (17)
             ≤ wk(α) + [1 − 2(1 − ση)α]Gk.

Since the function H is continuously differentiable for any z ∈ IR++ × IRn × IRm, we have wk(α) = o(α) for all k ∈ N. Combining this with (17) indicates that the line search (15) is well-defined. □

Theorem 4.2. Suppose that f is a continuously differentiable function and Assumption 4.1 is satisfied. Then the sequence {zk} generated by Algorithm 4.1 is bounded; and any accumulation point of the sequence {xk} is a solution of the system (5).

Proof. The proof is similar to that of [37, Theorem 4.1], and we omit it. □

In Theorem 4.2, we give the global convergence of Algorithm 4.1. Now, we analyze the convergence rate of Algorithm 4.1. We start by introducing the following concepts. A locally Lipschitz function F : IRn → IRm is said to be semismooth (or strongly semismooth) at x ∈ IRn if F is directionally differentiable at x and

F(x + h) − F(x) − V h = o(‖h‖)   (or = O(‖h‖²))

holds for any V ∈ ∂F(x + h), where ∂F(x) is the generalized Jacobian of the function F at x ∈ IRn in the sense of Clarke [2]. Many functions are semismooth, such as convex functions, smooth functions, piecewise linear functions, and so on.

In addition, it is known that the composition of semismooth functions is still a semismooth function, and the composition of strongly semismooth functions is still a strongly semismooth function [12]. From Proposition 2.2(a), we know that Φµ(x) defined by (8) is smooth on IRn.

With the definition (12) of H, by mimicking the arguments in [37, Theorem 5.1], we obtain the local quadratic convergence of Algorithm 4.1.

Theorem 4.3. Suppose that the conditions given in Theorem 4.2 are satisfied, and that z* = (µ*, x*, y*) is an accumulation point of the sequence {zk} generated by Algorithm 4.1.

(a) If all V ∈ ∂H(z*) are nonsingular, then the sequence {zk} converges to z*, and ‖zk+1 − z*‖ = o(‖zk − z*‖), µk+1 = o(µk);

(b) If the functions f and Φµ are such that f′ and Φµ′ are Lipschitz continuous on IRn, then ‖zk+1 − z*‖ = O(‖zk − z*‖²) and µk+1 = O(µk²).

5 Numerical experiments

In this section, we present some numerical examples to demonstrate the efficiency of Algorithm 4.1 for solving the system (5). In our tests, all experiments are done on a PC with a 1.9 GHz CPU and 8.0 GB of RAM, and all program codes are written and run in MATLAB. We point out that if the index set I ∪ E does not consist of exactly n indices, we can adopt an approach similar to the one given in [37]: the system (5) is transformed into a new problem, and we solve the new problem using Algorithm 4.1. By this approach, a solution of the original problem can be found.

Throughout the following experiments, we employ three functions φ1, φ2 and φ3 along with the proposed algorithm to implement each example. Note that, for the function φ1, its corresponding SOC-function Φµ can be alternatively expressed as

Φ̃µ(x) = ( x + √(x² + 4µ² e) ) / 2   with e = (1, 0, · · · , 0)^T ∈ Kn,

where x² = x ◦ x and the square root is taken with respect to the Jordan product. This form is simpler than the Φµ(x) induced from (8). Hence, we adopt it in our implementation. Moreover, the parameters used in the algorithm are chosen as follows:

γ = 0.3, ξ = 10^{−4}, η = 1.0, β0 = 0.01, µ0 = 1.0, S0 = 1.0,

and the parameters c and σ are chosen as listed in Table 1 and Table 4. In the implementation, we stop when ‖H(z)‖ ≤ 10^{−6}, when the step length satisfies ν ≤ 10^{−6}, or when the number of iterations exceeds 500; the starting points are randomly generated from the interval [−1, 1].
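Returning to the alternative expression Φ̃µ above, the following Python/NumPy check (ours, not from the paper; the square root below is the SOC square root taken eigenvalue-wise in the Jordan frame) confirms numerically that Φ̃µ coincides with the Φµ obtained from φ1 via (8).

import numpy as np

def spectral(x):
    x1, x2 = x[0], x[1:]; nx2 = np.linalg.norm(x2)
    lam = np.array([x1 - nx2, x1 + nx2])
    w = x2 / nx2 if nx2 > 0 else np.eye(len(x2))[0]
    return lam, [0.5 * np.concatenate(([1.0], -w)), 0.5 * np.concatenate(([1.0], w))]

def jordan(x, y):
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

def soc_sqrt(z):
    # square root in the Jordan algebra: sqrt(lam_1) u^(1) + sqrt(lam_2) u^(2), for z in K^n
    lam, u = spectral(z)
    return np.sqrt(lam[0]) * u[0] + np.sqrt(lam[1]) * u[1]

def Phi_from_phi1(mu, x):
    lam, u = spectral(x)
    phi = lambda a: (np.sqrt(a**2 + 4 * mu**2) + a) / 2.0
    return phi(lam[0]) * u[0] + phi(lam[1]) * u[1]

mu, x = 0.3, np.array([0.4, -1.1, 0.7])
e = np.eye(len(x))[0]
Phi_tilde = 0.5 * (x + soc_sqrt(jordan(x, x) + 4 * mu**2 * e))
print(np.allclose(Phi_tilde, Phi_from_phi1(mu, x)))    # the two expressions agree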

Now, we present the test examples. We first consider two examples in which the system (5) only includes inequalities, i.e., m = n. Note that a similar way to construct the two examples was given in [25].

Example 5.1. Consider the system (5) with inequalities only, where f(x) := Mx + q ⪯_{Kn} 0 and Kn := K^{n1} × · · · × K^{nr}. Here M is generated by M = BB^T with B ∈ IR^{n×n} being a matrix whose every component is randomly chosen from the interval [0, 1], and q ∈ IRn being a vector whose every component is 1.
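For illustration, the data of Example 5.1 can be generated as in the following Python/NumPy sketch (ours, not the authors' MATLAB code; the seed and the helper name example_5_1 are our own choices), with the sizes and block structure described in the next paragraph.

import numpy as np

def example_5_1(n, ni=10, seed=0):
    # M = B B^T with entries of B uniform on [0, 1]; q is the all-ones vector
    rng = np.random.default_rng(seed)
    B = rng.uniform(0.0, 1.0, size=(n, n))
    M = B @ B.T
    q = np.ones(n)
    blocks = [ni] * (n // ni)          # K^n = K^{n_1} x ... x K^{n_r} with each n_i = ni
    return M, q, blocks

M, q, blocks = example_5_1(500)
print(M.shape, len(blocks))            # (500, 500) and 50 cones of size 10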

For Example 5.1, the tested problems are generated with sizes n = 500, 1000, ..., 4500 and each ni = 10. The random problems of each size are generated 10 times. Besides using the three functions along with Algorithm 4.1 for solving Example 5.1, we have also


n     fun   suc   iter    cpu      res
500   φ1    10    5.000   0.251    8.864e-09
500   φ2    10    7.800   1.496    2.600e-07
500   φ3    10    3.500   0.707    3.762e-07
1000  φ1    10    5.000   0.632    2.165e-08
1000  φ2    10    7.200   5.240    8.657e-08
1000  φ3    10    3.400   3.093    4.853e-07
1500  φ1    9     5.000   1.224    1.537e-07
1500  φ2    9     8.111   13.232   3.124e-07
1500  φ3    9     4.222   8.781    2.706e-07
2000  φ1    10    5.000   2.145    1.599e-07
2000  φ2    10    7.700   24.130   2.234e-07
2000  φ3    10    4.200   16.925   1.923e-07
2500  φ1    9     5.000   3.519    3.897e-08
2500  φ2    9     6.889   34.849   2.016e-07
2500  φ3    9     4.000   27.870   1.479e-07
3000  φ1    10    5.000   5.161    9.769e-08
3000  φ2    10    8.300   69.723   1.714e-07
3000  φ3    10    4.100   45.891   1.608e-07
3500  φ1    7     5.000   7.415    2.226e-07
3500  φ2    7     7.857   102.272  4.037e-07
3500  φ3    7     4.429   75.068   2.334e-07
4000  φ1    9     5.000   9.974    5.795e-08
4000  φ2    9     6.444   106.850  3.132e-07
4000  φ3    9     4.000   98.983   7.743e-08
4500  φ1    8     5.000   13.075   2.374e-07
4500  φ2    8     10.250  240.602  3.115e-07
4500  φ3    8     4.250   147.863  3.070e-07

Table 1: Average performance of Algorithm 4.1 for Example 5.1 (c = 0.01, σ = 10^{−5})

tested it by using the smoothing-type algorithm with the monotone line search which was introduced in [25] (for this case, we choose the function φ1). Table 1 shows the numerical results where


      Non-monotone                           Monotone
n     suc   iter   cpu     res          n     suc   iter   cpu     res
500   10    5.000  0.251   8.864e-09    500   10    5.500  0.289   4.905e-07
1000  10    5.000  0.632   2.165e-08    1000  10    5.500  0.616   7.184e-08
1500  9     5.000  1.224   1.537e-07    1500  9     6.000  1.466   4.654e-09
2000  10    5.000  2.145   1.599e-07    2000  10    6.500  2.866   3.151e-08
2500  9     5.000  3.519   3.897e-08    2500  10    6.000  4.477   4.320e-08
3000  10    5.000  5.161   9.769e-08    3000  10    6.500  7.348   1.743e-07
3500  7     5.000  7.415   2.226e-07    3500  10    8.000  11.957  5.674e-07
4000  9     5.000  9.974   5.795e-08    4000  10    7.000  14.875  2.166e-08
4500  8     5.000  13.075  2.374e-07    4500  10    7.000  19.204  2.433e-08

Table 2: Comparisons of the non-monotone Algorithm 4.1 and the monotone algorithm in [25] for Example 5.1

“fun” denotes the three functions,

“suc” denotes the number that Algorithm 4.1 successfully solves every generated problem,

“iter” denotes the average iteration numbers,

“cpu” denotes the average CPU time in seconds,

“res” denotes the average residual norm kH(z)k for 9 test problems.

The initial points are also randomly generated. In light of “iter” and “cpu” in Table 1, we can conclude that

φ3(µ, α) > φ1(µ, α) > φ2(µ, α)

" means">
where ">" means "better performance". In Table 2, we compare Algorithm 4.1 with the nonmonotone line search against the smoothing-type algorithm with the monotone line search studied in [25]. Although the number of generated problems that Algorithm 4.1 solves successfully is overall smaller than that of the monotone algorithm, our proposed algorithm outperforms it in terms of CPU time and iteration counts. This indicates that Algorithm 4.1 has some advantages over the method with the monotone line search in [25].

Another way to compare the performance of the functions φi(µ, α), i = 1, 2, 3, is via the so-called "performance profile", introduced in [39]. In this approach, we regard Algorithm 4.1 equipped with a smoothing function φi(µ, α), i = 1, 2, 3, as a solver, and assume that there are ns solvers and np test problems from the test set P, which is generated randomly. We are interested in using the iteration number as a performance measure for Algorithm 4.1 with different φi(µ, α). For each problem p and solver s, let

fp,s = iteration number required to solve problem p by solver s.


We employ the performance ratio

r_{p,s} := f_{p,s} / min{ f_{p,s} : s ∈ S },

where S is the set of solvers. We assume that a parameter rM with r_{p,s} ≤ rM for all p, s is chosen, and that r_{p,s} = rM if and only if solver s does not solve problem p. In order to obtain an overall assessment for each solver, we define

ρ_s(τ) := (1/np) size{ p ∈ P : r_{p,s} ≤ τ },

which is called the performance profile of the number of iterations for solver s. Then, ρ_s(τ) is the probability for solver s ∈ S that the performance ratio r_{p,s} is within a factor τ ∈ IR of the best possible ratio.
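The profile ρ_s(τ) can be computed as in the following Python/NumPy sketch (ours, not the authors' code); the iteration counts below are toy numbers, with np.inf marking a failure.

import numpy as np

def performance_profile(f, taus):
    # f[p, s] = iteration count of solver s on problem p (np.inf if s fails on p)
    best = f.min(axis=1, keepdims=True)                 # best solver per problem
    r = f / best                                        # performance ratios r_{p, s}
    return np.array([[np.mean(r[:, s] <= tau) for s in range(f.shape[1])] for tau in taus])

# toy iteration counts for three solvers (phi_1, phi_2, phi_3) on five problems
f = np.array([[5.0, 8.0, 4.0],
              [5.0, 7.0, 3.0],
              [6.0, 9.0, 4.0],
              [5.0, np.inf, 4.0],
              [5.0, 8.0, 5.0]])
taus = np.linspace(1.0, 3.0, 21)
rho = performance_profile(f, taus)
print(rho[0])     # rho_s(1): fraction of problems on which each solver is (one of) the best
print(rho[-1])    # rho_s(3): fraction of problems solved within a factor 3 of the best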

We then test the three functions on Example 5.1. In particular, the random problems of each size are generated 50 times. In order to obtain an overall assessment of the three functions, we are interested in using the number of iterations as a performance measure for Algorithm 4.1 with φ1(µ, α), φ2(µ, α), and φ3(µ, α), respectively. The performance plot based on iteration numbers is presented in Figure 5. From this figure, we also see that φ3(µ, α) working with Algorithm 4.1 has the best numerical performance, followed by φ1(µ, α). In other words, in view of "iteration numbers", we have

φ3(µ, α) > φ1(µ, α) > φ2(µ, α) where “>” means “better performance”.

We are also interested in using the computing time as a performance measure for Algorithm 4.1 with different φi(µ, α), i = 1, 2, 3. The performance plot based on computing time is presented in Figure 6. From this figure, we can also see that the function φ3(µ, α) has the best performance. In other words, in view of "computing time", we have

φ3(µ, α) > φ1(µ, α) > φ2(µ, α) where “>” means “better performance”.

In summary, for Example 5.1, whether the number of iterations or the computing time is taken into account, the function φ3(µ, α) is the best choice for Algorithm 4.1.

Example 5.2. Consider the system (5) with inequalities only, where x ∈ IR^5, K^5 = K^3 × K^2 and

f(x) := ( 24(2x1 − x2)³ + exp(x1 + x3) − 4x4 + x5 ;
          −12(2x1 − x2)³ + 3(3x2 + 5x3)/√(1 + (3x2 + 5x3)²) − 6x4 − 7x5 ;
          −exp(x1 − x3) + 5(3x2 + 5x3)/√(1 + (3x2 + 5x3)²) − 3x4 + 5x5 ;
          4x1 + 6x2 + 3x3 − 1 ;
          −x1 + 7x2 − 5x3 + 2 )  ⪯_{K^5} 0.


Figure 5: Performance profile of iteration numbers for Example 5.1.

Figure 6: Performance profile of computing time for Example 5.1.

This problem is taken from [17].

Example 5.2 is tested 20 times with 20 random starting points. Similar to the case of Example 5.1, besides using Algorithm 4.1 to test Example 5.2, we have also tested it using the monotone smoothing-type algorithm in [25]. From Table 3, we see that there is no big difference in performance between these two algorithms for Example 5.2.

Moreover, Figure 7 shows the performance profile of iteration number in Algorithm 4.1 for Example 5.2 on 100 test problems with random starting points. The three solvers


Non-monotone                       Monotone
suc   iter    cpu    res           suc   iter   cpu    res
20    13.500  0.002  5.835e-08     20    8.750  0.005  1.2510e-07

Table 3: Comparisons of the non-monotone Algorithm 4.1 and the monotone algorithm in [25] for Example 5.2

Figure 7: Performance profile of iteration number for Example 5.2.

correspond to Algorithm 4.1 with φ1(µ, α), φ2(µ, α), and φ3(µ, α), respectively. From this figure, we see that φ3(µ, α) working with Algorithm 4.1 has the best numerical performance, followed by φ2(µ, α). In summary, from the viewpoint of "iteration numbers", we conclude that

φ3(µ, α) > φ2(µ, α) > φ1(µ, α), where “>” means “better performance”.

Example 5.3. Consider the system of equalities and inequalities (5), where f(x) := ( fI(x)^T, fE(x)^T )^T, x ∈ IR^6,


with

fI(x) = ( −x1⁴ ;  3x2³ + 2x2 − x3 − 5x3² ;  −4x2² − 7x3 + 10x3³ ;  −x4³ − x5 ;  x5 + x6 )  ⪯_{K^5 = K^3 × K^2} 0,

fE(x) = 2x1 + 5x2² − 3x3² + 2x4 − x5 x6 − 7.

Example 5.4. Consider the system of equalities and inequalities (5), where f(x) := ( fI(x)^T, fE(x)^T )^T, x ∈ IR^6, with

fI(x) = ( −e^{5x1} + x2 ;  x2 + x3³ ;  −3e^{x4} ;  5x5 − x6 )  ⪯_{K^4 = K^2 × K^2} 0,

fE(x) = ( 3x1 + e^{x2 + x3} − 2x4 − 7x5 + x6 − 3 ;  2x1² + x2 + 3x3 − (x4 − x5)² + 2x6 − 13 ) = 0.

Example 5.5. Consider the system of equalities and inequalities (5), where f(x) := ( fI(x)^T, fE(x)^T )^T, x ∈ IR^7, with

fI(x) = ( 3x1³ ;  x2 − x3 ;  −2(x4 − 1)² ;  sin(x5 + x6) ;  2x6 + x7 )  ⪯_{K^5 = K^2 × K^3} 0,

fE(x) = ( x1 + x2 + 2x3 x4 + sin x5 + cos x6 + 2x7 ;  x1³ + x2 + √(x3² + 3) + 2x4 + x5 + x6 + 6x7 ) = 0.

Table 4 shows the numerical results, including the smoothing function (fun) used to solve each problem, the number (suc) of generated problems that Algorithm 4.1 solves successfully, the parameters c and σ, the average number of iterations (iter), the average CPU time (cpu) in seconds and the average residual norm ‖H(z)‖ (res) for Examples 5.2-5.5 with random initializations, respectively. Performance profiles are provided below.

Figure 8 and Figure 9 are the performance profiles in terms of iteration numbers for Example 5.3 and Example 5.5. From Figure 8, we see that although the best


Exam  fun   suc   c     σ      iter     cpu    res
5.2   φ1    20    5     0.02   13.500   0.002  5.835e-08
5.2   φ2    20    5     0.02   8.450    0.001  5.134e-07
5.2   φ3    20    5     0.02   8.600    0.002  2.260e-07
5.3   φ1    20    1     0.02   21.083   0.009  8.165e-07
5.3   φ2    17    1     0.02   14.647   0.001  2.899e-07
5.3   φ3    17    1     0.02   18.529   0.002  7.167e-07
5.4   φ1    20    0.5   0.002  46.750   0.033  1.648e-07
5.4   φ2    2     0.5   0.002  420.000  0.499  9.964e-07
5.4   φ3    0     0.5   0.002  Fail     Fail   Fail
5.5   φ1    20    0.1   0.002  14.250   0.009  6.251e-07
5.5   φ2    20    0.1   0.002  13.250   0.001  6.532e-07
5.5   φ3    20    0.1   0.002  12.650   0.001  6.016e-07

Table 4: Average performance of Algorithm 4.1 for Examples 5.2-5.5

Figure 8: Performance profile of iteration number for Example 5.3.

probability of the function φ3 is lower, the proportion of problems it can solve within a larger factor is higher than that of the other two. In this case, the difference among the three functions is not obvious. From Figure 9, we can also see that the function φ3 has the best performance.

Figure 9: Performance profile of iteration number for Example 5.5.

In summary, below are our numerical observations and conclusions.

1. Algorithm 4.1 is effective. In particular, the numerical results show that our method is better than the algorithm with the monotone line search studied in [25] when solving the system of inequalities under the order induced by the second-order cone.

2. For Examples 5.1 and 5.2, φ3 performs much better than the others. For the remaining problems, the differences in numerical performance are very marginal.

3. As future topics, it would be interesting to discover more efficient smoothing functions and to apply this type of SOC-function to other optimization problems involving second-order cones.

References

[1] F. Alizadeh and D. Goldfarb, Second-order cone programming, Mathematical Programming, vol. 95, pp. 3-51, 2003.

[2] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.

[3] J.-S. Chen, The convex and monotone functions associated with second-order cone, Optimization, vol. 55, pp. 363-385, 2006.

[4] J.-S. Chen, X. Chen, S.-H. Pan, and J. Zhang, Some characterizations for SOC-monotone and SOC-convex functions, Journal of Global Optimization, vol. 45, pp. 259-279, 2009.


[5] J.-S. Chen, X. Chen, and P. Tseng, Analysis of nonsmooth vector-valued functions associated with second-order cones, Mathematical Programming, vol. 101, pp. 95-117, 2004.

[6] J.-S. Chen and P. Tseng, An unconstrained smooth minimization reformulation of second-order cone complementarity problem, Mathematical Programming, vol. 104, pp. 293-327, 2005.

[7] J.-S. Chen, T.-K. Liao, and S.-H. Pan, Using Schur Complement Theorem to prove convexity of some SOC-functions, Journal of Nonlinear and Convex Analysis, vol. 13, pp. 421-431, 2012.

[8] J.-S. Chen and S.-H. Pan, A survey on SOC complementarity functions and solution methods for SOCPs and SOCCPs, Pacific Journal of Optimization, vol. 8, pp. 33-74, 2012.

[9] J.-S. Chen, C.-H. Ko, Y.-D. Liu, and S.-P. Wang, New smoothing functions for solving a system of equalities and inequalities, Pacific Journal of Optimization, vol. 12, pp. 185-206, 2016.

[10] J.W. Daniel, Newton’s method for nonlinear inequalities, Numerische Mathematik, vol. 21, pp. 381-387, 1973.

[11] J. Faraut and A. Korányi, Analysis on Symmetric Cones, Oxford Mathematical Monographs, Oxford University Press, New York, 1994.

[12] A. Fischer, Solution of monotone complementarity problems with locally Lipschitzian functions, Mathematical Programming, vol. 76, pp. 513-532, 1997.

[13] M. Fukushima, Z.Q. Luo, and P. Tseng, Smoothing functions for second-order cone complementarity problems, SIAM Journal on Optimization, vol. 12, pp. 436-460, 2002.

[14] F. Facchinei and J.S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Volume-I, Springer, New York, 2003.

[15] Z.-H. Huang, S.-L. Hu, and J. Han, Global convergence of a smoothing algorithm for symmetric cone complementarity problems with a nonmonotone line search, Science in China Series A, vol. 52, pp. 833-848, 2009.

[16] Z.-H. Huang and T. Ni, Smoothing algorithms for complementarity problems over symmetric cones, Computational Optimization Applications, vol. 45, pp. 557-579, 2010.
