see [1, 5, 8, 13, 19, 20, 24, 29–31, 34, 35] and references therein.

There is a spectral decomposition with respect to the second-order cone $\mathcal{K}^n$ in $\mathbb{R}^n$, which plays a very important role in the study of second-order cone optimization problems. For any vector $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the spectral decomposition (or spectral factorization) of $x$ with respect to $\mathcal{K}^n$ is given by
$$x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \tag{1.1}$$
where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are called the spectral values and the spectral vectors of $x$, respectively, with their corresponding formulas as below:

$$\lambda_i(x) = x_1 + (-1)^i \|x_2\|, \quad i = 1, 2, \tag{1.2}$$

$$u_x^{(i)} = \begin{cases} \dfrac{1}{2}\begin{bmatrix} 1 \\ (-1)^i \dfrac{x_2}{\|x_2\|} \end{bmatrix}, & \text{if } x_2 \neq 0, \\[10pt] \dfrac{1}{2}\begin{bmatrix} 1 \\ (-1)^i w \end{bmatrix}, & \text{if } x_2 = 0, \end{cases} \tag{1.3}$$
for $i = 1, 2$, with $w$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|w\| = 1$. Moreover, $\{u_x^{(1)}, u_x^{(2)}\}$ is called a Jordan frame satisfying the following properties:

$$u_x^{(1)} + u_x^{(2)} = e, \quad \big\langle u_x^{(1)}, u_x^{(2)} \big\rangle = 0, \quad u_x^{(1)} \circ u_x^{(2)} = 0 \quad \text{and} \quad u_x^{(i)} \circ u_x^{(i)} = u_x^{(i)} \ (i = 1, 2),$$
where $e = (1, 0, \cdots, 0)^T \in \mathbb{R}^n$ is the unit element and the Jordan product $x \circ y$ is defined by $x \circ y := (\langle x, y \rangle,\ x_1 y_2 + y_1 x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ for any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$. For more details about the Jordan product, please refer to [11].

In [5, 6], for any real-valued function $f: \mathbb{R} \to \mathbb{R}$ and $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, based on the spectral factorization of $x$ with respect to $\mathcal{K}^n$, a type of vector-valued function associated with $\mathcal{K}^n$ (also called SOC-function) is introduced. More specifically, if we apply $f$ to the spectral values of $x$ in (1.1), then we obtain the function $f^{\mathrm{soc}}: \mathbb{R}^n \to \mathbb{R}^n$ given by
$$f^{\mathrm{soc}}(x) = f(\lambda_1(x))\, u_x^{(1)} + f(\lambda_2(x))\, u_x^{(2)}. \tag{1.4}$$
From the expression (1.4), it is clear that the SOC-function $f^{\mathrm{soc}}$ is unambiguous whether $x_2 = 0$ or $x_2 \neq 0$. Further properties regarding $f^{\mathrm{soc}}$ were discussed in [3–5, 7, 17, 32].

It is also known that such SOC-functions $f^{\mathrm{soc}}$ associated with the second-order cone play a crucial role in the theory and numerical algorithms for second-order cone programming; see [1, 5, 8, 13, 19, 20, 24, 29–31, 34, 35] again.
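To make the decomposition concrete, here is a small NumPy sketch of (1.1)-(1.4); the function names are ours, chosen for illustration, and not taken from the cited references.

```python
import numpy as np

def spectral_decomposition(x):
    """Spectral values (1.2) and spectral vectors (1.3) of x = (x1, x2) w.r.t. K^n."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    lam = np.array([x1 - r, x1 + r])        # lambda_i = x1 + (-1)^i ||x2||
    if r > 0:
        w = x2 / r
    else:
        w = np.zeros_like(x2)
        if w.size:
            w[0] = 1.0                      # any unit vector w works when x2 = 0
    u1 = 0.5 * np.concatenate(([1.0], -w))  # u_x^(1)
    u2 = 0.5 * np.concatenate(([1.0],  w))  # u_x^(2)
    return lam, (u1, u2)

def f_soc(f, x):
    """SOC-function (1.4): f^soc(x) = f(lambda_1) u^(1) + f(lambda_2) u^(2)."""
    lam, (u1, u2) = spectral_decomposition(x)
    return f(lam[0]) * u1 + f(lam[1]) * u2
```

With $f(t) = \max\{0, t\}$, `f_soc` yields exactly the projection $x_+$ discussed in Section 2, and with the identity it reconstructs $x$.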

In this paper, in light of the definition of $f^{\mathrm{soc}}$, we define another type of SOC-function $\Phi_\mu$ (see Section 2 for details). In particular, using the SOC-function $\Phi_\mu$, we will solve the following system of equalities and inequalities under the order induced by the second-order cone:
$$\begin{cases} f_I(x) \preceq_{\mathcal{K}^m} 0, \\ f_E(x) = 0, \end{cases} \tag{1.5}$$
where $f_I(x) = (f_1(x), \cdots, f_m(x))^T$, $f_E(x) = (f_{m+1}(x), \cdots, f_n(x))^T$, and "$x \preceq_{\mathcal{K}^m} 0$" means "$-x \in \mathcal{K}^m$". Likewise, $x \succeq_{\mathcal{K}^m} 0$ means $x \in \mathcal{K}^m$, and $x \succ_{\mathcal{K}^m} 0$ means $x \in \operatorname{int}(\mathcal{K}^m)$, where

$\operatorname{int}(\mathcal{K}^m)$ denotes the interior of $\mathcal{K}^m$. Throughout this paper, we assume that $f_i$ is continuously differentiable for every $i \in \{1, 2, \dots, n\}$. We also define
$$f(x) := \begin{bmatrix} f_I(x) \\ f_E(x) \end{bmatrix},$$
and hence $f$ is continuously differentiable. When $\mathcal{K}^m = \mathbb{R}^m_+$, the system (1.5) reduces to the standard system of equalities and inequalities over $\mathbb{R}^m$. The corresponding standard system (1.5) has been studied extensively due to its various applications, and there are many methods for solving such problems; see [10, 27, 28, 33, 37]. For the setting of the second-order cone, the KKT conditions of second-order cone constrained optimization problems can be expressed in the form of (1.5), i.e., as a system of equalities and inequalities under the order induced by second-order cones. For example, consider the following second-order cone optimization problem:

$$\min\ h(x) \quad \text{s.t.}\ -g(x) \in \mathcal{K}^m.$$
The KKT conditions of this problem are as follows:

$$\nabla h(x) + \nabla g(x)\lambda = 0, \quad \lambda^T g(x) = 0, \quad -\lambda \preceq_{\mathcal{K}^m} 0, \quad g(x) \preceq_{\mathcal{K}^m} 0,$$
where $\nabla g(x)$ denotes the gradient matrix of $g$. Now, by denoting
$$f_I(x, \lambda) := \begin{bmatrix} -\lambda \\ g(x) \end{bmatrix} \quad \text{and} \quad f_E(x, \lambda) := \begin{bmatrix} \nabla h(x) + \nabla g(x)\lambda \\ \lambda^T g(x) \end{bmatrix},$$

it is clear that the KKT conditions of the second-order cone optimization problem take the form of (1.5). From this viewpoint, the investigation of the system (1.5) provides a theoretical approach to solving second-order cone optimization problems. Hence, the study of the system (1.5) is important, and that is the main motivation for this paper.

So far, there are many kinds of numerical methods for solving second-order cone optimization problems. Among them is a class of popular numerical methods, the so-called smoothing-type algorithms. This kind of algorithm has also been a powerful tool for solving many other optimization problems, including symmetric cone complementarity problems [15, 16, 20–22], symmetric cone linear programming [23, 26], systems of inequalities under the order induced by symmetric cones [18, 25, 38], and so on. In these recent studies, most of the existing smoothing-type algorithms were designed on the basis of a monotone line search. In order to achieve better computational results, the nonmonotone line search technique is sometimes adopted in the numerical implementations of smoothing-type algorithms [15, 36, 37]. The main reason is that the nonmonotone line search scheme can improve the likelihood of finding a global optimal solution and the convergence speed in cases where the function involved is highly nonconvex or has a valley in a small neighborhood of some point. In view of this, in this paper we also develop a nonmonotone smoothing-type algorithm for solving the system of equalities and inequalities under the order induced by second-order cones.

The remaining parts of this paper are organized as follows. In Section 2, some back-
ground concepts and preliminary results about the second-order cone are given. In Section
3, we reformulate (1.5) as a system of smoothing equations in which Φ*µ* is employed. In
Section 4, we propose a nonmonotone smoothing-type algorithm for solving (1.5), and show
that the algorithm is well defined. Moreover, we also discuss the global convergence and
locally quadratic convergence of the proposed algorithm. In Section 5, preliminary numerical
results are reported to demonstrate that the proposed algorithm is effective. Some numerical
comparisons based on performance profiles are presented, which indicate the differences in
numerical performance when various smoothing functions are used.

**2 Preliminaries**

In this section, we briefly review some basic properties about the second-order cone and the vector-valued functions with respect to SOC, which will be extensively used in subsequent analysis. More details about the second-order cone and the vector-valued functions can be found in [3–5, 13, 14, 17].

First, we review the projection of $x \in \mathbb{R}^n$ onto the second-order cone $\mathcal{K}^n \subset \mathbb{R}^n$. For the second-order cone $\mathcal{K}^n$, let $(\mathcal{K}^n)^*$ denote its dual cone. Then, $(\mathcal{K}^n)^*$ is given by
$$(\mathcal{K}^n)^* := \left\{ y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \mid \langle x, y \rangle \geq 0,\ \forall x \in \mathcal{K}^n \right\}.$$
Moreover, it is well known that the second-order cone $\mathcal{K}^n$ is self-dual, i.e., $(\mathcal{K}^n)^* = \mathcal{K}^n$.

Let $x_+$ denote the projection of $x \in \mathbb{R}^n$ onto the second-order cone $\mathcal{K}^n$, and let $x_-$ denote the projection of $-x$ onto the dual cone $(\mathcal{K}^n)^*$. With these notations, for any $x \in \mathbb{R}^n$, it is not hard to verify that $x = x_+ - x_-$. In particular, due to the special structure of $\mathcal{K}^n$, the explicit formula of the projection of $x \in \mathbb{R}^n$ onto $\mathcal{K}^n$ is obtained in [14] as below:
$$x_+ = \begin{cases} x & \text{if } x \in \mathcal{K}^n, \\ 0 & \text{if } x \in -(\mathcal{K}^n)^* = -\mathcal{K}^n, \\ u & \text{otherwise}, \end{cases} \tag{2.1}$$
where
$$u = \begin{bmatrix} \dfrac{x_1 + \|x_2\|}{2} \\[8pt] \left( \dfrac{x_1 + \|x_2\|}{2} \right) \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$

In fact, according to the spectral decomposition of $x$, the projection $x_+$ onto $\mathcal{K}^n$ can be alternatively expressed as (see [13, Prop. 3.3(b)])
$$x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)},$$
where $(\alpha)_+ = \max\{0, \alpha\}$ for any $\alpha \in \mathbb{R}$.
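As a sanity check on the case analysis in (2.1), the explicit projection can be coded directly; this is an illustrative sketch with our own naming, not the authors' MATLAB implementation.

```python
import numpy as np

def project_soc(x):
    """Projection x_+ of x = (x1, x2) onto K^n via the explicit formula (2.1)."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    if x1 >= r:                       # x already in K^n
        return x.copy()
    if x1 <= -r:                      # x in -K^n, so x_+ = 0
        return np.zeros_like(x)
    t = (x1 + r) / 2                  # otherwise x_+ = u, with first entry t
    return np.concatenate(([t], (t / r) * x2))
```

The result agrees with the spectral expression $(\lambda_1(x))_+ u_x^{(1)} + (\lambda_2(x))_+ u_x^{(2)}$ above.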

From the definition (1.4) of the vector-valued function associated with $\mathcal{K}^n$, we know that the projection $x_+$ onto $\mathcal{K}^n$ is a vector-valued function. Moreover, it is known that the projection $x_+$ and $(\alpha)_+$ for $\alpha \in \mathbb{R}$ share many properties, such as continuity, directional differentiability, semismoothness and so on. Indeed, these properties are established for general vector-valued functions associated with SOC. In particular, Chen, Chen and Tseng [5] have shown that many properties of $f^{\mathrm{soc}}$ are inherited from the function $f$, as presented in the following proposition.

**Proposition 2.1.** *Suppose that $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ has the spectral decomposition given as in (1.1)-(1.3). For any function $f: \mathbb{R} \to \mathbb{R}$ and the vector-valued function $f^{\mathrm{soc}}$ defined by (1.4), the following hold.*

**(a)** *$f^{\mathrm{soc}}$ is continuous at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is continuous at $\lambda_1(x), \lambda_2(x)$;*

**(b)** *$f^{\mathrm{soc}}$ is directionally differentiable at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is directionally differentiable at $\lambda_1(x), \lambda_2(x)$;*

**(c)** *$f^{\mathrm{soc}}$ is differentiable at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is differentiable at $\lambda_1(x), \lambda_2(x)$;*

**(d)** *$f^{\mathrm{soc}}$ is strictly continuous at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is strictly continuous at $\lambda_1(x), \lambda_2(x)$;*

**(e)** *$f^{\mathrm{soc}}$ is semismooth at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is semismooth at $\lambda_1(x), \lambda_2(x)$;*

**(f)** *$f^{\mathrm{soc}}$ is continuously differentiable at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is continuously differentiable at $\lambda_1(x), \lambda_2(x)$.*

Note that the projection function $x_+$ onto $\mathcal{K}^n$ is not a smooth function on the whole space $\mathbb{R}^n$. From Proposition 2.1, we can construct smoothing functions for the projection $x_+$ onto $\mathcal{K}^n$ by smoothing the functions $f(\lambda_i(x))$ for $i = 1, 2$. More specifically, we consider a family of smoothing functions $\phi(\mu, \cdot): \mathbb{R} \to \mathbb{R}$ with respect to $(\alpha)_+$ satisfying
$$\lim_{\mu \downarrow 0} \phi(\mu, \alpha) = (\alpha)_+ \quad \text{and} \quad 0 \leq \frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \leq 1 \tag{2.2}$$
for all $\alpha \in \mathbb{R}$. Are there functions satisfying the above conditions? Yes, there are many.

We illustrate three of them here:

$$\phi_1(\mu, \alpha) = \frac{\sqrt{\alpha^2 + 4\mu^2} + \alpha}{2}, \qquad (\mu > 0)$$
$$\phi_2(\mu, \alpha) = \mu \ln\left(e^{\alpha/\mu} + 1\right), \qquad (\mu > 0)$$
$$\phi_3(\mu, \alpha) = \begin{cases} \alpha, & \text{if } \alpha \geq \mu, \\[2pt] \dfrac{(\alpha + \mu)^2}{4\mu}, & \text{if } -\mu < \alpha < \mu, \\[2pt] 0, & \text{if } \alpha \leq -\mu. \end{cases} \qquad (\mu > 0)$$
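For reference, the three smoothing functions can be written down directly; the vectorized `np.where`/`logaddexp` forms below are our own numerically safe phrasing of the same formulas.

```python
import numpy as np

def phi1(mu, a):
    # phi_1(mu, alpha) = (sqrt(alpha^2 + 4 mu^2) + alpha) / 2
    return (np.sqrt(a**2 + 4*mu**2) + a) / 2

def phi2(mu, a):
    # phi_2(mu, alpha) = mu * ln(exp(alpha/mu) + 1), via logaddexp to avoid overflow
    return mu * np.logaddexp(a / mu, 0.0)

def phi3(mu, a):
    # phi_3: piecewise-quadratic smoothing of (alpha)_+
    return np.where(a >= mu, a,
                    np.where(a <= -mu, 0.0, (a + mu)**2 / (4*mu)))
```

Each function tends to $(\alpha)_+$ as $\mu \downarrow 0$ and has $\partial\phi/\partial\alpha \in [0, 1]$, so (2.2) holds.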

*In fact, the functions ϕ*1 *and ϕ*2 *were considered in [13, 17], while the function ϕ*3 was
*employed in [18, 37]. In addition, as for the function ϕ*3, there is a more general function
*ϕ**p**(µ,·) : IR → IR given by*

$$\phi_p(\mu, \alpha) = \begin{cases} \alpha & \text{if } \alpha \geq \dfrac{\mu}{p-1}, \\[6pt] \dfrac{\mu}{p-1} \left[ \dfrac{(p-1)(\alpha + \mu)}{p\mu} \right]^p & \text{if } -\mu < \alpha < \dfrac{\mu}{p-1}, \\[6pt] 0 & \text{if } \alpha \leq -\mu, \end{cases}$$

where $\mu > 0$ and $p \geq 2$. This function $\phi_p$ was recently studied in [9], and it is not hard to verify that $\phi_p$ also satisfies the conditions (2.2). All the functions $\phi_1, \phi_2$ and $\phi_3$ will play the role of smoothing functions for $f(\lambda_i(x))$ in (1.4). In other words, based on these smoothing functions, we define a type of SOC-function $\Phi_\mu(\cdot)$ on $\mathbb{R}^n$ associated with $\mathcal{K}^n$ ($n \geq 1$) as

$$\Phi_\mu(x) := \phi(\mu, \lambda_1(x))\, u_x^{(1)} + \phi(\mu, \lambda_2(x))\, u_x^{(2)} \qquad \forall x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}, \tag{2.3}$$
where $\lambda_1(x), \lambda_2(x)$ are given by (1.2) and $u_x^{(1)}, u_x^{(2)}$ are given by (1.3). In light of the properties of $\phi(\mu, \alpha)$, we show below that the SOC-function $\Phi_\mu(x)$ is a smoothing function for the projection $x_+$ onto $\mathcal{K}^n$.
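A direct transcription of (2.3) — again an illustrative sketch with our own naming — exhibits the smoothing behaviour numerically: as $\mu \downarrow 0$, $\Phi_\mu(x)$ approaches the projection $x_+$.

```python
import numpy as np

def Phi_mu(phi, mu, x):
    """SOC smoothing function (2.3): apply phi(mu, .) to the spectral values of x."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    w = x2 / r if r > 0 else np.zeros_like(x2)
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return phi(mu, x1 - r) * u1 + phi(mu, x1 + r) * u2

# example smoothing function: phi_1(mu, alpha) = (sqrt(alpha^2 + 4 mu^2) + alpha)/2
phi1 = lambda mu, a: (np.sqrt(a**2 + 4*mu**2) + a) / 2
```

For instance, $x = (1, 2, 0)$ has spectral values $(-1, 3)$ and projection $x_+ = (1.5, 1.5, 0)$, which `Phi_mu` recovers for small $\mu$.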

We depict the graphs of $\phi_i(\mu, \alpha)$ for $i = 1, 2, 3$ in Figure 1. From Figure 1, we see that $\phi_3$ best approximates the function $(\alpha)_+$, in the sense that it is closest to $(\alpha)_+$ among the $\phi_i(\mu, \alpha)$, $i = 1, 2, 3$.

*Figure 1: Graphs of max(0, t) and all three ϕ**i**(µ, t) with µ = 0.2.*

**Proposition 2.2.** *Suppose that $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ has the spectral decomposition given as in (1.1)-(1.3), and that $\phi(\mu, \cdot)$ with $\mu > 0$ is a continuously differentiable function satisfying (2.2). Then, the following hold.*

**(a)** *The function $\Phi_\mu: \mathbb{R}^n \to \mathbb{R}^n$ defined as in (2.3) is continuously differentiable. Moreover, its Jacobian matrix at $x$ is described as*
$$\frac{\partial \Phi_\mu(x)}{\partial x} = \begin{cases} \dfrac{\partial \phi}{\partial \lambda}(\mu, x_1)\, I & \text{if } x_2 = 0, \\[10pt] \begin{bmatrix} b & c\, x_2^T / \|x_2\| \\ c\, x_2 / \|x_2\| & aI + (b - a)\, x_2 x_2^T / \|x_2\|^2 \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases} \tag{2.4}$$
*where*
$$a = \frac{\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \quad b = \frac{1}{2}\left( \frac{\partial \phi}{\partial \lambda_2}(\mu, \lambda_2(x)) + \frac{\partial \phi}{\partial \lambda_1}(\mu, \lambda_1(x)) \right), \quad c = \frac{1}{2}\left( \frac{\partial \phi}{\partial \lambda_2}(\mu, \lambda_2(x)) - \frac{\partial \phi}{\partial \lambda_1}(\mu, \lambda_1(x)) \right); \tag{2.5}$$

**(b)** *Both $\dfrac{\partial \Phi_\mu(x)}{\partial x}$ and $I - \dfrac{\partial \Phi_\mu(x)}{\partial x}$ are positive semi-definite matrices;*

**(c)** *$\lim_{\mu \to 0} \Phi_\mu(x) = x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)}$.*

*Proof.* (a) From the expression (2.3) and the assumption that $\phi(\mu, \cdot)$ is continuously differentiable, it is easy to verify that the function $\Phi_\mu$ is continuously differentiable. The Jacobian matrix (2.4) of $\Phi_\mu(x)$ can be obtained by adopting the same arguments as in [13, Proposition 5.2]. Hence, we omit the details here.

(b) First, we prove that the matrix $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite. For the case $x_2 = 0$, we know that $\frac{\partial \Phi_\mu(x)}{\partial x} = \frac{\partial \phi}{\partial \lambda}(\mu, x_1)\, I$. Then, from $0 \leq \frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \leq 1$, it is clear that this matrix is positive semi-definite. For the case $x_2 \neq 0$, from $\frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \geq 0$ and (2.5), we have $b \geq 0$. In order to prove that the matrix $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite, we only need to verify that the Schur complement of $b$ with respect to $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite. Note that the Schur complement of $b$ has the form
$$aI + (b - a)\frac{x_2 x_2^T}{\|x_2\|^2} - \frac{c^2}{b}\, \frac{x_2 x_2^T}{\|x_2\|^2} = a\left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) + \frac{b^2 - c^2}{b}\, \frac{x_2 x_2^T}{\|x_2\|^2}.$$

Since $\frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \geq 0$, the function $\phi(\mu, \alpha)$ is increasing in $\alpha$, which leads to $a \geq 0$. Besides, from (2.5), we observe that
$$b^2 - c^2 = \frac{\partial \phi}{\partial \lambda_2}(\mu, \lambda_2(x))\, \frac{\partial \phi}{\partial \lambda_1}(\mu, \lambda_1(x)) \geq 0.$$
With this, it follows that the Schur complement of $b$ with respect to $\frac{\partial \Phi_\mu(x)}{\partial x}$ is a nonnegative linear combination of the matrices $\frac{x_2 x_2^T}{\|x_2\|^2}$ and $I - \frac{x_2 x_2^T}{\|x_2\|^2}$. Thus, the Schur complement of $b$ is positive semi-definite, which shows that the matrix $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite.

Combining this with $\frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \leq 1$ and following similar arguments as above, we can show that the matrix $I - \frac{\partial \Phi_\mu(x)}{\partial x}$ is also positive semi-definite.

(c) By the definition of the function Φ*µ**(x), it can be verified directly.*

We point out that the definition (2.3) includes, as a special case, the analogous way of defining smoothing functions in [13, Section 4], and hence [13, Prop. 4.1] is covered by Proposition 2.2. Indeed, Proposition 2.2 can also be verified from a geometric viewpoint. More specifically, from Figures 2, 3 and 4, we see that as $\mu \downarrow 0$, each $\phi_i$ gets closer to $(\alpha)_+$, which verifies Proposition 2.2(c).
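The closed-form Jacobian (2.4)-(2.5) can also be checked against finite differences; the sketch below fixes $\phi = \phi_1$ and uses hypothetical helper names of our own, for the case $x_2 \neq 0$.

```python
import numpy as np

mu = 0.5
phi  = lambda a: (np.sqrt(a**2 + 4*mu**2) + a) / 2          # phi_1 with fixed mu
dphi = lambda a: (a / np.sqrt(a**2 + 4*mu**2) + 1) / 2      # d phi_1 / d alpha

def Phi(x):
    """SOC-function (2.3) built from phi_1 (assumes x2 != 0)."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    w = x2 / r
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return phi(x1 - r) * u1 + phi(x1 + r) * u2

def jac_formula(x):
    """Closed-form Jacobian (2.4)-(2.5) for the case x2 != 0."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    l1, l2 = x1 - r, x1 + r
    a = (phi(l2) - phi(l1)) / (l2 - l1)
    b = (dphi(l2) + dphi(l1)) / 2
    c = (dphi(l2) - dphi(l1)) / 2
    w = x2 / r
    n = x.size
    J = np.empty((n, n))
    J[0, 0] = b
    J[0, 1:] = c * w
    J[1:, 0] = c * w
    J[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return J

x = np.array([0.3, -0.7, 1.2])
h = 1e-6
J_fd = np.column_stack([(Phi(x + h*e) - Phi(x - h*e)) / (2*h) for e in np.eye(3)])
```

The eigenvalues of the formula Jacobian lie in $[0, 1]$, illustrating Proposition 2.2(b).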

**3 Applying $\Phi_\mu$ to solve the system (1.5)**

In this section, in light of the smoothing vector-valued function Φ* _{µ}*, we reformulate (1.5) as
a system of smoothing equations. To this end, we need a partial order induced by SOC.

More specifically, for any $x \in \mathbb{R}^n$, using the definition of the partial order "$\preceq_{\mathcal{K}^m}$" and the projection function $x_+$ in (2.1), we have
$$f_I(x) \preceq_{\mathcal{K}^m} 0 \iff -f_I(x) \in \mathcal{K}^m \iff f_I(x) \in -\mathcal{K}^m \iff (f_I(x))_+ = 0.$$

Hence, the system (1.5) is equivalent to the following system of equations:

$$\begin{cases} (f_I(x))_+ = 0, \\ f_E(x) = 0. \end{cases} \tag{3.1}$$

*Figure 2: Graphs of ϕ*1*(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.*

*Figure 3: Graphs of ϕ*2*(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.*

Note that the function $(f_I(\cdot))_+$ in the above equation (3.1) is nonsmooth. Therefore, smoothing-type Newton methods cannot be directly applied to solve the equation (3.1). To overcome this, we employ the smoothing function $\Phi_\mu(\cdot)$ defined in (2.3), and define the following function:

$$F(\mu, x, y) := \begin{bmatrix} f_I(x) - y \\ f_E(x) \\ \Phi_\mu(y) \end{bmatrix}.$$

From Proposition 2.2(c), it follows that

$$\begin{aligned} F(\mu, x, y) = 0 \ \text{and}\ \mu = 0 &\iff y = f_I(x),\ f_E(x) = 0,\ \Phi_\mu(y) = 0 \ \text{and}\ \mu = 0 \\ &\iff y = f_I(x),\ f_E(x) = 0 \ \text{and}\ y_+ = 0 \\ &\iff (f_I(x))_+ = 0,\ f_E(x) = 0 \\ &\iff f_I(x) \preceq_{\mathcal{K}^m} 0,\ f_E(x) = 0. \end{aligned}$$

*Figure 4: Graphs of ϕ*3*(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.*

In other words, as long as the system $F(\mu, x, y) = 0$ with $\mu = 0$ is solved, the corresponding $x$ is a solution to the original system (1.5). In view of Proposition 2.2(a), we can obtain a solution to the system (1.5) by applying a smoothing-type Newton method to $F(\mu, x, y) = 0$ while driving $\mu \downarrow 0$ at the same time. To do this, for any $z = (\mu, x, y) \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we further define a continuously differentiable function $H: \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$ as follows:

$$H(z) := \begin{bmatrix} \mu \\ f_I(x) - y + \mu x_I \\ f_E(x) + \mu x_E \\ \Phi_\mu(y) + \mu y \end{bmatrix}, \tag{3.2}$$

where $x_I := (x_1, x_2, \dots, x_m)^T \in \mathbb{R}^m$, $x_E := (x_{m+1}, \dots, x_n)^T \in \mathbb{R}^{n-m}$, $x := (x_I^T, x_E^T)^T \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. Then, it is clear that when $H(z) = 0$, we have $\mu = 0$ and $x$ is a solution to the system (1.5). Now, let $H'(z)$ denote the Jacobian matrix of the function $H$ at $z$; then for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we obtain

$$H'(z) = \begin{bmatrix} 1 & 0_n^T & 0_m^T \\ x_I & f_I' + \mu U & -I_m \\ x_E & f_E' + \mu V & 0_{(n-m) \times m} \\ \dfrac{\partial \Phi_\mu(y)}{\partial \mu} + y & 0_{m \times n} & \dfrac{\partial \Phi_\mu(y)}{\partial y} + \mu I_m \end{bmatrix}, \tag{3.3}$$
where $U := \begin{bmatrix} I_m & 0_{m \times (n-m)} \end{bmatrix}$, $V := \begin{bmatrix} 0_{(n-m) \times m} & I_{n-m} \end{bmatrix}$, $0_l$ denotes the $l$-dimensional zero vector, and $0_{l \times q}$ denotes the $l \times q$ zero matrix for any positive integers $l$ and $q$. In summary, we will apply a smoothing-type Newton method to solve the smoothed equation $H(z) = 0$ at each iteration, keeping $\mu > 0$ while driving $H(z) \to 0$, to find a solution of the system (1.5).
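To illustrate how (3.2) fits together, here is a schematic assembly of $H(z)$ in NumPy. The scaffolding and names are our own (with $\Phi_\mu$ built from $\phi_1$); a real implementation would follow the authors' MATLAB code instead.

```python
import numpy as np

def make_H(f_I, f_E, m, n, phi):
    """Return H(z) of (3.2) for z = (mu, x, y) packed into one vector."""
    def soc_Phi(mu, y):
        # smoothing SOC-function (2.3) applied to y w.r.t. K^m
        y1, y2 = y[0], y[1:]
        r = np.linalg.norm(y2)
        w = y2 / r if r > 0 else np.zeros_like(y2)
        u1 = 0.5 * np.concatenate(([1.0], -w))
        u2 = 0.5 * np.concatenate(([1.0],  w))
        return phi(mu, y1 - r) * u1 + phi(mu, y1 + r) * u2

    def H(z):
        mu, x, y = z[0], z[1:1 + n], z[1 + n:]
        x_I, x_E = x[:m], x[m:]
        return np.concatenate(([mu],
                               f_I(x) - y + mu * x_I,
                               f_E(x) + mu * x_E,
                               soc_Phi(mu, y) + mu * y))
    return H
```

For a toy system with $f_I(x) = (x_1, x_2)^T$ and $f_E(x) = x_3$, the point $z = (0, 0, 0)$ satisfies $H(z) = 0$.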

**4 A smoothing-type Newton algorithm**

Now, we consider a smoothing-type Newton algorithm with a nonmonotone line search, and
show that the algorithm is well defined. For convenience, we denote the merit function Ψ
*as Ψ(z) :=∥H(z)∥*^{2} *for any z∈ IR*++*× IR*^{n}*× IR** ^{m}*.

**Algorithm 4.1. (A smoothing-type Newton Algorithm)**

**Step 0** Choose $\gamma \in (0, 1)$, $\xi \in (0, \frac{1}{2})$. Take $\eta > 0$, $\sigma \in (0, 1)$ such that $\sigma\eta < 1$. Let $\mu_0 = \eta$ and $(x^0, y^0) \in \mathbb{R}^n \times \mathbb{R}^m$ be an arbitrary vector. Set $z^0 = (\mu_0, x^0, y^0)$, $e^0 := (1, 0, \dots, 0) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m$, $G_0 := \|H(z^0)\|^2 = \Psi(z^0)$ and $S_0 := 1$. Choose $\beta_{\min}$ and $\beta_{\max}$ such that $0 \leq \beta_{\min} \leq \beta_{\max} < 1$. Set $\tau(z^0) := \sigma \min\{1, \Psi(z^0)\}$ and $k := 0$.

**Step 1** If $\|H(z^k)\| = 0$, stop. Otherwise, go to Step 2.

**Step 2** Compute $\Delta z^k := (\Delta\mu_k, \Delta x^k, \Delta y^k) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m$ by
$$H'(z^k)\, \Delta z^k = -H(z^k) + \eta\, \tau(z^k)\, e^0. \tag{4.1}$$

**Step 3** Let $\alpha_k$ be the maximum of the values $1, \gamma, \gamma^2, \dots$ such that
$$\Psi(z^k + \alpha_k \Delta z^k) \leq \left[1 - 2\xi(1 - \sigma\eta)\alpha_k\right] G_k. \tag{4.2}$$

**Step 4** Set $z^{k+1} := z^k + \alpha_k \Delta z^k$. If $\|H(z^{k+1})\| = 0$, stop. Otherwise, go to Step 5.

**Step 5** Choose $\beta_k \in [\beta_{\min}, \beta_{\max}]$. Set
$$S_{k+1} := \beta_k S_k + 1, \qquad \tau(z^{k+1}) := \min\left\{\sigma,\ \sigma\Psi(z^{k+1}),\ \tau(z^k)\right\}, \qquad G_{k+1} := \frac{\beta_k S_k G_k + \Psi(z^{k+1})}{S_{k+1}}, \tag{4.3}$$
and set $k := k + 1$. Go to Step 2.

The nonmonotone line search technique in Algorithm 4.1 was introduced in [36]. From the first and third equations in (4.3), we know that $G_{k+1}$ is a convex combination of $G_k$ and $\Psi(z^{k+1})$. In fact, $G_k$ can be expressed as a convex combination of $\Psi(z^0), \Psi(z^1), \dots, \Psi(z^k)$. Moreover, the main role of $\beta_k$ is to control the degree of non-monotonicity. If $\beta_k = 0$ for every $k$, then the corresponding line search is the usual monotone Armijo line search.
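The averaging rule in (4.3) is easy to state in code; this fragment (with our own names) just exercises the convex-combination property mentioned above.

```python
def nonmonotone_update(G_k, S_k, psi_next, beta_k):
    """One update of (G_k, S_k) per (4.3): G_{k+1} is a convex combination
    of G_k and Psi(z^{k+1}) with weights beta_k*S_k/S_{k+1} and 1/S_{k+1}."""
    S_next = beta_k * S_k + 1.0
    G_next = (beta_k * S_k * G_k + psi_next) / S_next
    return G_next, S_next
```

Setting `beta_k = 0` recovers the monotone reference value $G_{k+1} = \Psi(z^{k+1})$ of the usual Armijo search.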

**Proposition 4.2.** *Suppose that the sequences $\{z^k\}$, $\{\mu_k\}$, $\{G_k\}$, $\{\Psi(z^k)\}$ and $\{\tau(z^k)\}$ are generated by Algorithm 4.1. Then, the following hold.*

**(a)** *The sequence $\{G_k\}$ is monotonically decreasing and $\Psi(z^k) \leq G_k$ for all $k \in \mathbb{N}$;*

**(b)** *The sequence $\{\tau(z^k)\}$ is monotonically decreasing;*

**(c)** *$\eta\,\tau(z^k) \leq \mu_k$ for all $k \in \mathbb{N}$;*

**(d)** *The sequence $\{\mu_k\}$ is monotonically decreasing and $\mu_k > 0$ for all $k \in \mathbb{N}$.*

*Proof.* The proof is similar to that of Remark 3.1 in [37]; we omit the details.

Next, we show that Algorithm 4.1 is well-defined and establish its local quadratic convergence. For simplicity, we denote the Jacobian matrix of the function $f$ by
$$f'(x) := \begin{bmatrix} f_I'(x) \\ f_E'(x) \end{bmatrix}$$
and use the following assumption.

**Assumption 4.1.** *$f'(x) + \mu I_n$ is invertible for any $x \in \mathbb{R}^n$ and $\mu \in \mathbb{R}_{++}$.*

We point out that Assumption 4.1 is only a mild condition, and many functions satisfy it. For example, if $f$ is a monotone function, then $f'(x)$ is a positive semi-definite matrix for any $x \in \mathbb{R}^n$, and thus Assumption 4.1 is satisfied.

**Theorem 4.3.** *Suppose that $f$ is a continuously differentiable function and Assumption 4.1 is satisfied. Then, Algorithm 4.1 is well-defined.*

*Proof. In order to show that Algorithm 4.1 is well-defined, we need to prove that Newton*
equation (4.1) is solvable, and the line search (4.2) is well-defined.

First, we prove that the Newton equation (4.1) is solvable. By the expression of the Jacobian matrix $H'(z)$ in (3.3), we see that the determinant $\det(H'(z))$ of $H'(z)$ satisfies
$$\det(H'(z)) = \det\left(f'(x) + \mu I_n\right) \cdot \det\left(\frac{\partial \Phi_\mu(y)}{\partial y} + \mu I_m\right)$$
for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$. Moreover, from Proposition 2.2(b), we know that $\frac{\partial \Phi_\mu(y)}{\partial y}$ is positive semi-definite for $\mu \in \mathbb{R}_{++}$. Hence, combining this with Assumption 4.1, we obtain that $H'(z)$ is nonsingular for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$ with $\mu > 0$. Applying Proposition 4.2(d), it follows that the Newton equation (4.1) is solvable.

Secondly, we prove that the line search (4.2) is well-defined. For notational convenience, we denote
$$w_k(\alpha) := \Psi(z^k + \alpha\Delta z^k) - \Psi(z^k) - \alpha\Psi'(z^k)\Delta z^k.$$
From the Newton equation (4.1) and the definition of $\Psi$, we have
$$\begin{aligned} \Psi(z^k + \alpha\Delta z^k) &= w_k(\alpha) + \Psi(z^k) + \alpha\Psi'(z^k)\Delta z^k \\ &= w_k(\alpha) + \Psi(z^k) + 2\alpha H(z^k)^T\left(-H(z^k) + \eta\tau(z^k)e^0\right) \\ &\leq w_k(\alpha) + (1 - 2\alpha)\Psi(z^k) + 2\alpha\eta\tau(z^k)\|H(z^k)\|. \end{aligned}$$

If $\Psi(z^k) \leq 1$, then $\|H(z^k)\| \leq 1$. Hence, it follows that
$$\tau(z^k)\|H(z^k)\| \leq \sigma\Psi(z^k)\|H(z^k)\| \leq \sigma\Psi(z^k).$$

If $\Psi(z^k) > 1$, then $\Psi(z^k) = \|H(z^k)\|^2 \geq \|H(z^k)\|$, which yields
$$\tau(z^k)\|H(z^k)\| \leq \sigma\|H(z^k)\| \leq \sigma\Psi(z^k).$$

Thus, from all the above, we obtain that
$$\begin{aligned} \Psi(z^k + \alpha\Delta z^k) &\leq w_k(\alpha) + (1 - 2\alpha)\Psi(z^k) + 2\alpha\eta\sigma\Psi(z^k) \\ &= w_k(\alpha) + \left[1 - 2(1 - \sigma\eta)\alpha\right]\Psi(z^k) \\ &\leq w_k(\alpha) + \left[1 - 2(1 - \sigma\eta)\alpha\right] G_k. \end{aligned} \tag{4.4}$$
Since the function $H$ is continuously differentiable for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we have $w_k(\alpha) = o(\alpha)$ for all $k \in \mathbb{N}$. Combining this with (4.4), it follows that the line search (4.2) is well-defined.

**Theorem 4.4.** *Suppose that $f$ is a continuously differentiable function and Assumption 4.1 is satisfied. Then the sequence $\{z^k\}$ generated by Algorithm 4.1 is bounded, and any accumulation point of the sequence $\{x^k\}$ is a solution of the system (1.5).*

*Proof. The proof is similar to [37, Theorem 4.1] and we omit it.*

In Theorem 4.4, we give the global convergence of Algorithm 4.1. Now, we analyze the convergence rate of Algorithm 4.1. We start by introducing the following concepts. A locally Lipschitz function $F: \mathbb{R}^n \to \mathbb{R}^m$ is said to be semismooth (or strongly semismooth) at $x \in \mathbb{R}^n$ if $F$ is directionally differentiable at $x$ and
$$F(x + h) - F(x) - Vh = o(\|h\|) \quad (\text{or } = O(\|h\|^2))$$
holds for any $V \in \partial F(x + h)$, where $\partial F(x)$ is the generalized Jacobian of the function $F$ at $x \in \mathbb{R}^n$ in the sense of Clarke [2]. There are many functions that are semismooth, such as
convex functions, smooth functions, piecewise linear functions and so on. In addition, it is
known that the composition of semismooth functions is still a semismooth function, and the
composition of strongly semismooth functions is still a strongly semismooth function [12].

From Proposition 2.2(a), we know that $\Phi_\mu(x)$ defined by (2.3) is smooth on $\mathbb{R}^n$.

*With the definition (3.2) of H, mimicking the arguments as in [37, Theorem 5.1], we*
have the local quadratic convergence of Algorithm 4.1.

**Theorem 4.5.** *Suppose that the conditions given in Theorem 4.4 are satisfied, and $z^* = (\mu_*, x^*, y^*)$ is an accumulation point of the sequence $\{z^k\}$ generated by Algorithm 4.1.*

**(a)** *If all $V \in \partial H(z^*)$ are nonsingular, then the sequence $\{z^k\}$ converges to $z^*$, and*
$$\|z^{k+1} - z^k\| = o(\|z^k - z^*\|), \qquad \mu_{k+1} = o(\mu_k);$$

**(b)** *If the functions $f$ and $\Phi_\mu$ satisfy that $f'$ and $\Phi_\mu'$ are Lipschitz continuous on $\mathbb{R}^n$, then*
$$\|z^{k+1} - z^k\| = O(\|z^k - z^*\|^2) \quad \text{and} \quad \mu_{k+1} = O(\mu_k^2).$$

**5 Numerical experiments**

In this section, we present some numerical examples to demonstrate the efficiency of Algorithm 4.1 for solving the system (1.5). In our tests, all experiments are done on a PC with a 1.9 GHz CPU and 8.0 GB of RAM, and all program codes are written and run in MATLAB. We point out that if the number of functions in $I \cup E$ is not $n$, we can adopt an approach similar to that in [37]: the system (1.5) is transformed into a new problem, which we then solve using Algorithm 4.1. By this approach, a solution of the original problem can be found.

Throughout the following experiments, we employ the three functions $\phi_1, \phi_2$ and $\phi_3$ along with the proposed algorithm to implement each example. Note that, for the function $\phi_1$, its corresponding SOC-function $\Phi_\mu$ can be alternatively expressed as
$$\widetilde{\Phi}_\mu(x) = \frac{x + \sqrt{x^2 + 4\mu^2 e}}{2} \quad \text{with } e = (1, 0, \cdots, 0)^T \in \mathcal{K}^n,$$
where $x^2 = x \circ x$ and the square root is taken with respect to the Jordan product. This form is simpler than the $\Phi_\mu(x)$ induced from (2.3). Hence, we adopt it in our implementation. Moreover, the parameters used in the algorithm are chosen as follows:

$$\gamma = 0.3, \quad \xi = 10^{-4}, \quad \eta = 1.0, \quad \beta_0 = 0.01, \quad \mu_0 = 1.0, \quad S_0 = 1.0,$$

and the parameters $c$ and $\sigma$ are chosen according to the values listed in Table 1 and Table 4. In the implementation, the algorithm stops when $\|H(z)\| \leq 10^{-6}$, when the step length satisfies $\nu \leq 10^{-6}$, or when the number of iterations exceeds 500; the starting points are randomly generated from the interval $[-1, 1]$.

Now, we present the test examples. We first consider two examples in which the system
*(1.5) only includes inequalities, i.e., m = n. Note that a similar way to construct the two*
examples was given in [25].

**Example 5.1.** *Consider the system (1.5) with inequalities only, where $f(x) := Mx + q \preceq_{\mathcal{K}^n} 0$ and $\mathcal{K}^n := \mathcal{K}^{n_1} \times \cdots \times \mathcal{K}^{n_r}$. Here $M$ is generated by $M = BB^T$ with $B \in \mathbb{R}^{n \times n}$ being a matrix whose every component is randomly chosen from the interval $[0, 1]$, and $q \in \mathbb{R}^n$ being a vector whose every component is 1.*
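The random data of Example 5.1 can be generated as follows (a sketch; the RNG choice and seeding are ours, not specified in the text).

```python
import numpy as np

def make_example_51(n, seed=0):
    """Data for Example 5.1: f(x) = M x + q with M = B B^T, q = ones(n)."""
    rng = np.random.default_rng(seed)
    B = rng.uniform(0.0, 1.0, size=(n, n))   # entries of B drawn from [0, 1]
    M = B @ B.T                              # symmetric positive semi-definite
    q = np.ones(n)
    return M, q
```

Since $M = BB^T$ is positive semi-definite, $f$ is monotone and Assumption 4.1 holds for this example.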

For Example 5.1, the tested problems are generated with sizes $n = 500, 1000, \dots, 4500$ and each $n_i = 10$. The random problems of each size are generated 10 times. Besides using the three functions along with Algorithm 4.1 for solving Example 5.1, we have also tested it using the smoothing-type algorithm with the monotone line search introduced in [25] (for this case, we choose the function $\phi_1$). Table 1 shows the numerical results, where

"fun" denotes the smoothing function used,

"suc" denotes the number of generated problems that Algorithm 4.1 solves successfully,

"iter" denotes the average iteration number,

"cpu" denotes the average CPU time in seconds,

"res" denotes the average residual norm $\|H(z)\|$ over the solved test problems.

The initial points are also randomly generated. In light of "iter" and "cpu" in Table 1, we can conclude that
$$\phi_3(\mu, \alpha) > \phi_1(\mu, \alpha) > \phi_2(\mu, \alpha),$$
where "$>$" means "better performance". In Table 2, we compare Algorithm 4.1 with the nonmonotone line search against the smoothing-type algorithm with the monotone line search studied in [25]. Although overall Algorithm 4.1 successfully solves fewer of the generated problems than the monotone smoothing-type algorithm, it outperforms the latter in CPU time and iteration count. This indicates that Algorithm 4.1 has some advantages over the algorithm with the monotone line search in [25].

Another way to compare the performance of the functions $\phi_i(\mu, \alpha)$, $i = 1, 2, 3$, is via the so-called "performance profile", which was introduced in [39]. In this approach, we regard Algorithm 4.1 equipped with a smoothing function $\phi_i(\mu, \alpha)$, $i = 1, 2, 3$, as a solver, and assume that there are $n_s$ solvers and $n_p$ test problems from the randomly generated test set $\mathcal{P}$. We are interested in using the iteration number as the performance measure for Algorithm 4.1 with different $\phi_i(\mu, \alpha)$. For each problem $p$ and solver $s$, let
$$f_{p,s} = \text{iteration number required to solve problem } p \text{ by solver } s.$$

| $n$ | fun | suc | iter | cpu | res |
|-----|-----|-----|------|-----|-----|
| 500 | $\phi_1$ | 10 | 5.000 | 0.251 | 8.864e-09 |
| 500 | $\phi_2$ | 10 | 7.800 | 1.496 | 2.600e-07 |
| 500 | $\phi_3$ | 10 | 3.500 | 0.707 | 3.762e-07 |
| 1000 | $\phi_1$ | 10 | 5.000 | 0.632 | 2.165e-08 |
| 1000 | $\phi_2$ | 10 | 7.200 | 5.240 | 8.657e-08 |
| 1000 | $\phi_3$ | 10 | 3.400 | 3.093 | 4.853e-07 |
| 1500 | $\phi_1$ | 9 | 5.000 | 1.224 | 1.537e-07 |
| 1500 | $\phi_2$ | 9 | 8.111 | 13.232 | 3.124e-07 |
| 1500 | $\phi_3$ | 9 | 4.222 | 8.781 | 2.706e-07 |
| 2000 | $\phi_1$ | 10 | 5.000 | 2.145 | 1.599e-07 |
| 2000 | $\phi_2$ | 10 | 7.700 | 24.130 | 2.234e-07 |
| 2000 | $\phi_3$ | 10 | 4.200 | 16.925 | 1.923e-07 |
| 2500 | $\phi_1$ | 9 | 5.000 | 3.519 | 3.897e-08 |
| 2500 | $\phi_2$ | 9 | 6.889 | 34.849 | 2.016e-07 |
| 2500 | $\phi_3$ | 9 | 4.000 | 27.870 | 1.479e-07 |
| 3000 | $\phi_1$ | 10 | 5.000 | 5.161 | 9.769e-08 |
| 3000 | $\phi_2$ | 10 | 8.300 | 69.723 | 1.714e-07 |
| 3000 | $\phi_3$ | 10 | 4.100 | 45.891 | 1.608e-07 |
| 3500 | $\phi_1$ | 7 | 5.000 | 7.415 | 2.226e-07 |
| 3500 | $\phi_2$ | 7 | 7.857 | 102.272 | 4.037e-07 |
| 3500 | $\phi_3$ | 7 | 4.429 | 75.068 | 2.334e-07 |
| 4000 | $\phi_1$ | 9 | 5.000 | 9.974 | 5.795e-08 |
| 4000 | $\phi_2$ | 9 | 6.444 | 106.850 | 3.132e-07 |
| 4000 | $\phi_3$ | 9 | 4.000 | 98.983 | 7.743e-08 |
| 4500 | $\phi_1$ | 8 | 5.000 | 13.075 | 2.374e-07 |
| 4500 | $\phi_2$ | 8 | 10.250 | 240.602 | 3.115e-07 |
| 4500 | $\phi_3$ | 8 | 4.250 | 147.863 | 3.070e-07 |

Table 1: Average performance of Algorithm 4.1 for Example 5.1 ($c = 0.01$, $\sigma = 10^{-5}$)

We employ the performance ratio

\[
r_{p,s} := \frac{f_{p,s}}{\min\{f_{p,s} : s \in \mathcal{S}\}},
\]

where S denotes the set of solvers. We assume that a parameter r_M with r_{p,s} ≤ r_M for all p, s is chosen, and that r_{p,s} = r_M if and only if solver s fails to solve problem p. In order to obtain an overall assessment for each solver, we define

\[
\rho_s(\tau) := \frac{1}{n_p}\,\operatorname{size}\{\, p \in \mathcal{P} : r_{p,s} \le \tau \,\},
\]

which is called the performance profile of the iteration numbers for solver s. Then ρ_s(τ) is the probability for solver s ∈ S that the performance ratio r_{p,s} is within a factor τ ∈ IR of the best possible ratio.
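As a rough illustration (not part of the paper), the performance profile ρ_s(τ) above can be computed from a matrix of measures f_{p,s} as sketched below; the function name `performance_profile`, the failure cap `r_M`, and the default τ grid are our own choices, and failures are marked by `np.inf`.

```python
import numpy as np

def performance_profile(f, r_M=1e6, taus=None):
    """Compute performance profiles rho_s(tau) in the sense of [39].

    f : (n_p, n_s) array of measures f_{p,s} (e.g. iteration counts);
        np.inf marks a failure of solver s on problem p.
    Returns (taus, rho), where rho[k, s] is the fraction of problems
    solved by solver s within a factor taus[k] of the best solver.
    """
    f = np.asarray(f, dtype=float)
    n_p, n_s = f.shape
    best = np.min(f, axis=1, keepdims=True)   # best measure per problem
    r = f / best                              # performance ratios r_{p,s}
    r[~np.isfinite(r)] = r_M                  # failures get the cap r_M
    if taus is None:
        taus = np.linspace(1.0, 10.0, 100)    # assumed default tau window
    rho = np.array([[np.mean(r[:, s] <= t) for s in range(n_s)]
                    for t in taus])
    return taus, rho
```

Here ρ_s(1) is the fraction of problems on which solver s is (tied for) the best, while ρ_s(τ) for large τ approaches the fraction of problems solver s solves at all.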

We now test the three functions on Example 5.1; in particular, the random problems of each size are generated 50 times. In order to obtain an overall assessment for the

             Non-monotone                             Monotone
  n     suc   iter    cpu      res           suc   iter    cpu      res
  500   10    5.000    0.251   8.864e-09     10    5.500    0.289   4.905e-07
  1000  10    5.000    0.632   2.165e-08     10    5.500    0.616   7.184e-08
  1500   9    5.000    1.224   1.537e-07      9    6.000    1.466   4.654e-09
  2000  10    5.000    2.145   1.599e-07     10    6.500    2.866   3.151e-08
  2500   9    5.000    3.519   3.897e-08     10    6.000    4.477   4.320e-08
  3000  10    5.000    5.161   9.769e-08     10    6.500    7.348   1.743e-07
  3500   7    5.000    7.415   2.226e-07     10    8.000   11.957   5.674e-07
  4000   9    5.000    9.974   5.795e-08     10    7.000   14.875   2.166e-08
  4500   8    5.000   13.075   2.374e-07     10    7.000   19.204   2.433e-08

Table 2: Comparisons of non-monotone Algorithm 4.1 and monotone Algorithm in [25] for Example 5.1

three functions, we are interested in using the number of iterations as a performance measure for Algorithm 4.1 with ϕ_1(µ, α), ϕ_2(µ, α), and ϕ_3(µ, α), respectively. The performance plot based on iteration number is presented in Figure 5. From this figure, we see that ϕ_3(µ, α) working with Algorithm 4.1 has the best numerical performance, followed by ϕ_1(µ, α). In other words, in view of "iteration numbers", we have

ϕ_3(µ, α) > ϕ_1(µ, α) > ϕ_2(µ, α),

where ">" means "better performance".

We are also interested in using the computing time as a performance measure for Algorithm 4.1 with different ϕ_i(µ, α), i = 1, 2, 3. The performance plot based on computing time is presented in Figure 6. From this figure, we can also see that the function ϕ_3(µ, α) has the best performance. In other words, in view of "computing time", we have

ϕ_3(µ, α) > ϕ_1(µ, α) > ϕ_2(µ, α),

where ">" means "better performance".

In summary, for Example 5.1, whether the number of iterations or the computing time is taken into account, the function ϕ_3(µ, α) is the best choice for Algorithm 4.1.

**Example 5.2.** Consider the system (1.5) with inequalities only, where x ∈ IR^5, K^5 = K^3 × K^2 and
\[
f(x) :=
\begin{pmatrix}
24(2x_1 - x_2)^3 + \exp(x_1 + x_3) - 4x_4 + x_5 \\
-12(2x_1 - x_2)^3 + 3(3x_2 + 5x_3)/\sqrt{1 + (3x_2 + 5x_3)^2} - 6x_4 - 7x_5 \\
-\exp(x_1 - x_3) + 5(3x_2 + 5x_3)/\sqrt{1 + (3x_2 + 5x_3)^2} - 3x_4 + 5x_5 \\
4x_1 + 6x_2 + 3x_3 - 1 \\
-x_1 + 7x_2 - 5x_3 + 2
\end{pmatrix}
\preceq_{\mathcal{K}^5} 0.
\]

This problem is taken from [17].
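For readers wishing to reproduce the experiment, the map f of Example 5.2 (as reconstructed above) and the feasibility test f(x) ⪯_{K^5} 0, i.e. −f(x) ∈ K^3 × K^2, can be sketched as below. The helper names `f_ex52`, `in_soc`, and `feasible_ex52` are ours, not from [17] or this paper.

```python
import numpy as np

def f_ex52(x):
    """Evaluate f of Example 5.2 at x in R^5 (our reconstruction)."""
    x1, x2, x3, x4, x5 = x
    t = 3 * x2 + 5 * x3
    d = np.sqrt(1 + t ** 2)
    return np.array([
        24 * (2 * x1 - x2) ** 3 + np.exp(x1 + x3) - 4 * x4 + x5,
        -12 * (2 * x1 - x2) ** 3 + 3 * t / d - 6 * x4 - 7 * x5,
        -np.exp(x1 - x3) + 5 * t / d - 3 * x4 + 5 * x5,
        4 * x1 + 6 * x2 + 3 * x3 - 1,
        -x1 + 7 * x2 - 5 * x3 + 2,
    ])

def in_soc(v):
    # v in K^m iff v_1 >= ||v_2||
    return v[0] >= np.linalg.norm(v[1:])

def feasible_ex52(x):
    # f(x) <=_{K^5} 0 with K^5 = K^3 x K^2 means -f(x) in K^3 x K^2
    y = -f_ex52(np.asarray(x, dtype=float))
    return bool(in_soc(y[:3]) and in_soc(y[3:]))
```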

Figure 5: Performance profile of iteration numbers for Example 5.1.

Figure 6: Performance profile of computing time for Example 5.1.

Example 5.2 is tested 20 times for 20 random starting points. Similar to the case of Example 5.1, besides using Algorithm 4.1 to test Example 5.2, we have also tested it using the monotone smoothing-type algorithm in [25]. From Table 3, we see that there is no big difference regarding performance between these two algorithms for Example 5.2.

Moreover, Figure 7 shows the performance profile of iteration number in Algorithm
4.1 for Example 5.2 on 100 test problems with random starting points. The three solvers
correspond to Algorithm 4.1 with ϕ_1(µ, α), ϕ_2(µ, α), and ϕ_3(µ, α), respectively. From this figure, we see that ϕ_3(µ, α) working with Algorithm 4.1 has the best numerical performance, followed by ϕ_2(µ, α). In summary, from the viewpoint of "iteration numbers", we conclude
that

*ϕ*3*(µ, α) > ϕ*2*(µ, α) > ϕ*1*(µ, α),*
*where “>” means “better performance”.*

        Non-monotone                           Monotone
  suc   iter    cpu    res           suc   iter   cpu    res
  20    13.500  0.002  5.835e-08     20    8.750  0.005  1.2510e-07

Table 3: Comparisons of non-monotone Algorithm 4.1 and monotone Algorithm in [25] for Example 5.2

Figure 7: Performance profile of iteration number for Example 5.2.

**Example 5.3.** Consider the system of equalities and inequalities (1.5), where f(x) := (f_I(x)^T, f_E(x)^T)^T, x ∈ IR^6, with
\[
f_I(x) =
\begin{pmatrix}
-x_1^4 \\
3x_2^3 + 2x_2 - x_3 - 5x_3^2 \\
-4x_2^2 - 7x_3 + 10x_3^3 \\
-x_4^3 - x_5 \\
x_5 + x_6
\end{pmatrix}
\preceq_{\mathcal{K}^5 = \mathcal{K}^3 \times \mathcal{K}^2} 0,
\qquad
f_E(x) = 2x_1 + 5x_2^2 - 3x_3^2 + 2x_4 - x_5 x_6 - 7 = 0.
\]

**Example 5.4.** Consider the system of equalities and inequalities (1.5), where f(x) := (f_I(x)^T, f_E(x)^T)^T, x ∈ IR^6, with
\[
f_I(x) =
\begin{pmatrix}
-e^{5x_1} + x_2 \\
x_2 + x_3^3 \\
-3e^{x_4} \\
5x_5 - x_6
\end{pmatrix}
\preceq_{\mathcal{K}^4 = \mathcal{K}^2 \times \mathcal{K}^2} 0,
\qquad
f_E(x) =
\begin{pmatrix}
3x_1 + e^{x_2 + x_3} - 2x_4 - 7x_5 + x_6 - 3 \\
2x_1^2 + x_2 + 3x_3 - (x_4 - x_5)^2 + 2x_6 - 13
\end{pmatrix}
= 0.
\]

**Example 5.5.** Consider the system of equalities and inequalities (1.5), where f(x) := (f_I(x)^T, f_E(x)^T)^T, x ∈ IR^7, with
\[
f_I(x) =
\begin{pmatrix}
3x_1^3 \\
x_2 - x_3 \\
-2(x_4 - 1)^2 \\
\sin(x_5 + x_6) \\
2x_6 + x_7
\end{pmatrix}
\preceq_{\mathcal{K}^5 = \mathcal{K}^2 \times \mathcal{K}^3} 0,
\qquad
f_E(x) =
\begin{pmatrix}
x_1 + x_2 + 2x_3 x_4 + \sin x_5 + \cos x_6 + 2x_7 \\
x_1^3 + x_2 + \sqrt{x_3^2 + 3} + 2x_4 + x_5 + x_6 + 6x_7
\end{pmatrix}
= 0.
\]
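Checking whether a candidate x satisfies an inequality f_I(x) ⪯_K 0 for a product cone K = K^{n_1} × ··· × K^{n_k} reduces, via the spectral values in (1.2), to verifying λ_1(−f_I(x)) ≥ 0 on each block, since a vector lies in K^n exactly when its smaller spectral value is nonnegative. A minimal sketch (our own helper names, assuming NumPy):

```python
import numpy as np

def spectral_values(x):
    """Spectral values of x = (x_1, x_2) in R x R^{n-1} w.r.t. K^n:
    lambda_i(x) = x_1 + (-1)^i ||x_2||, i = 1, 2 (cf. (1.2))."""
    x = np.asarray(x, dtype=float)
    r = np.linalg.norm(x[1:])
    return x[0] - r, x[0] + r

def satisfies_soc_inequality(fI, blocks):
    """Check fI <=_K 0 for K = K^{n_1} x ... x K^{n_k}: equivalently
    -fI in K, i.e. lambda_1(-fI_block) >= 0 for every block."""
    fI = np.asarray(fI, dtype=float)
    i = 0
    for n in blocks:
        lam1, _ = spectral_values(-fI[i:i + n])
        if lam1 < 0:
            return False
        i += n
    return True
```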

  Exam   fun   suc   c     σ       iter      cpu     res
  5.2    ϕ1    20    5     0.02     13.500   0.002   5.835e-08
  5.2    ϕ2    20    5     0.02      8.450   0.001   5.134e-07
  5.2    ϕ3    20    5     0.02      8.600   0.002   2.260e-07
  5.3    ϕ1    20    1     0.02     21.083   0.009   8.165e-07
  5.3    ϕ2    17    1     0.02     14.647   0.001   2.899e-07
  5.3    ϕ3    17    1     0.02     18.529   0.002   7.167e-07
  5.4    ϕ1    20    0.5   0.002    46.750   0.033   1.648e-07
  5.4    ϕ2     2    0.5   0.002   420.000   0.499   9.964e-07
  5.4    ϕ3     0    0.5   0.002    Fail     Fail    Fail
  5.5    ϕ1    20    0.1   0.002    14.250   0.009   6.251e-07
  5.5    ϕ2    20    0.1   0.002    13.250   0.001   6.532e-07
  5.5    ϕ3    20    0.1   0.002    12.650   0.001   6.016e-07

Table 4: Average performance of Algorithm 4.1 for Examples 5.2–5.5

Table 4 shows the numerical results for Examples 5.2–5.5 with random initializations: the smoothing function (fun) used to solve the problems, the number of runs (suc) in which Algorithm 4.1 successfully solves the generated problem, the parameters c and σ, the average iteration number (iter), the average CPU time (cpu) in seconds, and the average residual norm ∥H(z)∥ (res). Performance profiles are provided below.

Figure 8 and Figure 9 are the performance profiles in terms of iteration number for Example 5.3 and Example 5.5. From Figure 8, we see that although the function ϕ_3 has a lower probability of being the best solver, it solves a larger fraction of the problems within a large factor than the other two; in this case, the difference between the three functions is not obvious. From Figure 9, we can also see that the function ϕ_3 has the best performance.

In summary, below are our numerical observations and conclusions.

1. Algorithm 4.1 is effective. In particular, the numerical results show that our proposed method is better than the algorithm with the monotone line search studied in [25] when solving the system of inequalities under the order induced by the second-order cone.

Figure 8: Performance profile of iteration number for Example 5.3.

Figure 9: Performance profile of iteration number for Example 5.5.

2. For Examples 5.1 and 5.2, the function ϕ_3 clearly outperforms the others. For the remaining problems, the differences in their numerical performance are marginal.

3. As for future topics, it would be interesting to discover more efficient smoothing functions and to apply this type of SOC-function to other optimization problems involving second-order cones.

**References**

*[1] F. Alizadeh, D. Goldfarb, Second-order cone programming, Math. Program. 95 (2003)*
3–51.