see [1, 5, 8, 13, 19, 20, 24, 29–31, 34, 35] and references therein.

There is a spectral decomposition with respect to the second-order cone $\mathcal{K}^n$ in $\mathbb{R}^n$, which plays a very important role in the study of second-order cone optimization problems. For any vector $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the spectral decomposition (or spectral factorization) of $x$ with respect to $\mathcal{K}^n$ is given by
$$x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \tag{1.1}$$
where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are called the spectral values and the spectral vectors of $x$, respectively, with their corresponding formulas as below:

$$\lambda_i(x) = x_1 + (-1)^i \|x_2\|, \quad i = 1, 2, \tag{1.2}$$

$$u_x^{(i)} = \begin{cases} \dfrac{1}{2}\begin{bmatrix} 1 \\ (-1)^i \dfrac{x_2}{\|x_2\|} \end{bmatrix}, & \text{if } x_2 \neq 0, \\[10pt] \dfrac{1}{2}\begin{bmatrix} 1 \\ (-1)^i w \end{bmatrix}, & \text{if } x_2 = 0, \end{cases} \tag{1.3}$$
for $i = 1, 2$, with $w$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|w\| = 1$. Moreover, $\{u_x^{(1)}, u_x^{(2)}\}$ is called a Jordan frame satisfying the following properties:

$$u_x^{(1)} + u_x^{(2)} = e, \quad \big\langle u_x^{(1)}, u_x^{(2)} \big\rangle = 0, \quad u_x^{(1)} \circ u_x^{(2)} = 0 \quad \text{and} \quad u_x^{(i)} \circ u_x^{(i)} = u_x^{(i)} \ (i = 1, 2),$$
where $e = (1, 0, \cdots, 0)^T \in \mathbb{R}^n$ is the unit element and the Jordan product $x \circ y$ is defined by $x \circ y := (\langle x, y \rangle,\ x_1 y_2 + y_1 x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ for any $x = (x_1, x_2),\ y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$. For more details about the Jordan product, please refer to [11].

In [5, 6], for any real-valued function $f: \mathbb{R} \to \mathbb{R}$ and $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, based on the spectral factorization of $x$ with respect to $\mathcal{K}^n$, a type of vector-valued function associated with $\mathcal{K}^n$ (also called SOC-function) is introduced. More specifically, if we apply $f$ to the spectral values of $x$ in (1.1), then we obtain the function $f^{\mathrm{soc}}: \mathbb{R}^n \to \mathbb{R}^n$ given by
$$f^{\mathrm{soc}}(x) = f(\lambda_1(x))\, u_x^{(1)} + f(\lambda_2(x))\, u_x^{(2)}. \tag{1.4}$$
From the expression (1.4), it is clear that the SOC-function $f^{\mathrm{soc}}$ is unambiguous whether $x_2 = 0$ or $x_2 \neq 0$. Further properties regarding $f^{\mathrm{soc}}$ were discussed in [3–5, 7, 17, 32].

It is also known that such SOC-functions $f^{\mathrm{soc}}$ associated with the second-order cone play a crucial role in the theory and numerical algorithms for second-order cone programming; see [1, 5, 8, 13, 19, 20, 24, 29–31, 34, 35] again.
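To make the decomposition concrete, here is a small NumPy sketch of (1.1)-(1.4); the function names are ours, chosen for illustration, and not taken from the cited references.

```python
import numpy as np

def spectral_decomposition(x):
    """Spectral values (1.2) and spectral vectors (1.3) of x = (x1, x2) w.r.t. K^n."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    lam = np.array([x1 - r, x1 + r])        # lambda_i = x1 + (-1)^i ||x2||
    if r > 0:
        w = x2 / r
    else:
        w = np.zeros_like(x2)
        if w.size:
            w[0] = 1.0                      # any unit vector w works when x2 = 0
    u1 = 0.5 * np.concatenate(([1.0], -w))  # u_x^(1)
    u2 = 0.5 * np.concatenate(([1.0],  w))  # u_x^(2)
    return lam, (u1, u2)

def f_soc(f, x):
    """SOC-function (1.4): f^soc(x) = f(lambda_1) u^(1) + f(lambda_2) u^(2)."""
    lam, (u1, u2) = spectral_decomposition(x)
    return f(lam[0]) * u1 + f(lam[1]) * u2
```

With $f(t) = \max\{0, t\}$, `f_soc` yields exactly the projection $x_+$ discussed in Section 2, and with the identity it reconstructs $x$.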

In this paper, in light of the definition of $f^{\mathrm{soc}}$, we define another type of SOC-function $\Phi_\mu$ (see Section 2 for details). In particular, using the SOC-function $\Phi_\mu$, we will solve the following system of equalities and inequalities under the order induced by the second-order cone:
$$\begin{cases} f_I(x) \preceq_{\mathcal{K}^m} 0, \\ f_E(x) = 0, \end{cases} \tag{1.5}$$
where $f_I(x) = (f_1(x), \cdots, f_m(x))^T$, $f_E(x) = (f_{m+1}(x), \cdots, f_n(x))^T$, and "$x \preceq_{\mathcal{K}^m} 0$" means "$-x \in \mathcal{K}^m$". Likewise, $x \succeq_{\mathcal{K}^m} 0$ means $x \in \mathcal{K}^m$, and $x \succ_{\mathcal{K}^m} 0$ means $x \in \operatorname{int}(\mathcal{K}^m)$, where

$\operatorname{int}(\mathcal{K}^m)$ denotes the interior of $\mathcal{K}^m$. Throughout this paper, we assume that $f_i$ is continuously differentiable for every $i \in \{1, 2, \dots, n\}$. We also define
$$f(x) := \begin{bmatrix} f_I(x) \\ f_E(x) \end{bmatrix},$$
and hence $f$ is continuously differentiable. When $\mathcal{K}^m = \mathbb{R}^m_+$, the system (1.5) reduces to the standard system of equalities and inequalities over $\mathbb{R}^m$. The corresponding standard system (1.5) has been studied extensively due to its various applications, and there are many methods for solving such problems; see [10, 27, 28, 33, 37]. For the setting of the second-order cone, the KKT conditions of second-order cone constrained optimization problems can be expressed in the form of (1.5), i.e., as a system of equalities and inequalities under the order induced by second-order cones. For example, consider the following second-order cone optimization problem:

$$\min\ h(x) \quad \text{s.t.}\ -g(x) \in \mathcal{K}^m.$$
The KKT conditions of this problem are as follows:

$$\nabla h(x) + \nabla g(x)\lambda = 0, \quad \lambda^T g(x) = 0, \quad -\lambda \preceq_{\mathcal{K}^m} 0, \quad g(x) \preceq_{\mathcal{K}^m} 0,$$
where $\nabla g(x)$ denotes the gradient matrix of $g$. Now, by denoting
$$f_I(x, \lambda) := \begin{bmatrix} -\lambda \\ g(x) \end{bmatrix} \quad \text{and} \quad f_E(x, \lambda) := \begin{bmatrix} \nabla h(x) + \nabla g(x)\lambda \\ \lambda^T g(x) \end{bmatrix},$$

it is clear that the KKT conditions of the second-order cone optimization problem take the form of (1.5). From this viewpoint, the investigation of the system (1.5) provides a theoretical approach to solving second-order cone optimization problems. Hence, the study of the system (1.5) is important, and that is the main motivation for this paper.

So far, there are many kinds of numerical methods for solving second-order cone optimization problems. Among them is a class of popular numerical methods, the so-called smoothing-type algorithms. This kind of algorithm has also been a powerful tool for solving many other optimization problems, including symmetric cone complementarity problems [15, 16, 20–22], symmetric cone linear programming [23, 26], systems of inequalities under the order induced by symmetric cones [18, 25, 38], and so on. In these recent studies, most of the existing smoothing-type algorithms were designed on the basis of a monotone line search. In order to achieve better computational results, the nonmonotone line search technique is sometimes adopted in the numerical implementations of smoothing-type algorithms [15, 36, 37]. The main reason is that the nonmonotone line search scheme can improve the likelihood of finding a global optimal solution and the convergence speed in cases where the function involved is highly nonconvex or has a valley in a small neighborhood of some point. In view of this, in this paper we also develop a nonmonotone smoothing-type algorithm for solving the system of equalities and inequalities under the order induced by second-order cones.

The remaining parts of this paper are organized as follows. In Section 2, some back-
ground concepts and preliminary results about the second-order cone are given. In Section
3, we reformulate (1.5) as a system of smoothing equations in which Φ*µ* is employed. In
Section 4, we propose a nonmonotone smoothing-type algorithm for solving (1.5), and show
that the algorithm is well defined. Moreover, we also discuss the global convergence and
locally quadratic convergence of the proposed algorithm. In Section 5, preliminary numerical
results are reported to demonstrate that the proposed algorithm is effective. Some numerical
comparisons based on performance profiles are presented, which indicate the differences in
numerical performance when various smoothing functions are used.

**2 Preliminaries**

In this section, we briefly review some basic properties about the second-order cone and the vector-valued functions with respect to SOC, which will be extensively used in subsequent analysis. More details about the second-order cone and the vector-valued functions can be found in [3–5, 13, 14, 17].

First, we review the projection of $x \in \mathbb{R}^n$ onto the second-order cone $\mathcal{K}^n \subset \mathbb{R}^n$. For the second-order cone $\mathcal{K}^n$, let $(\mathcal{K}^n)^*$ denote its dual cone. Then, $(\mathcal{K}^n)^*$ is given by
$$(\mathcal{K}^n)^* := \left\{ y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \mid \langle x, y \rangle \geq 0,\ \forall x \in \mathcal{K}^n \right\}.$$
Moreover, it is well known that the second-order cone $\mathcal{K}^n$ is self-dual, i.e., $(\mathcal{K}^n)^* = \mathcal{K}^n$.

Let $x_+$ denote the projection of $x \in \mathbb{R}^n$ onto the second-order cone $\mathcal{K}^n$, and let $x_-$ denote the projection of $-x$ onto the dual cone $(\mathcal{K}^n)^*$. With these notations, for any $x \in \mathbb{R}^n$, it is not hard to verify that $x = x_+ - x_-$. In particular, due to the special structure of $\mathcal{K}^n$, the explicit formula of the projection of $x \in \mathbb{R}^n$ onto $\mathcal{K}^n$ is obtained in [14] as below:
$$x_+ = \begin{cases} x & \text{if } x \in \mathcal{K}^n, \\ 0 & \text{if } x \in -(\mathcal{K}^n)^* = -\mathcal{K}^n, \\ u & \text{otherwise}, \end{cases} \tag{2.1}$$
where
$$u = \begin{bmatrix} \dfrac{x_1 + \|x_2\|}{2} \\[8pt] \left( \dfrac{x_1 + \|x_2\|}{2} \right) \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$

In fact, according to the spectral decomposition of $x$, the projection $x_+$ onto $\mathcal{K}^n$ can be alternatively expressed as (see [13, Prop. 3.3(b)])
$$x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)},$$
where $(\alpha)_+ = \max\{0, \alpha\}$ for any $\alpha \in \mathbb{R}$.
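As a sanity check on the case analysis in (2.1), the explicit projection can be coded directly; this is an illustrative sketch with our own naming, not the authors' MATLAB implementation.

```python
import numpy as np

def project_soc(x):
    """Projection x_+ of x = (x1, x2) onto K^n via the explicit formula (2.1)."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    if x1 >= r:                       # x already in K^n
        return x.copy()
    if x1 <= -r:                      # x in -K^n, so x_+ = 0
        return np.zeros_like(x)
    t = (x1 + r) / 2                  # otherwise x_+ = u, with first entry t
    return np.concatenate(([t], (t / r) * x2))
```

The result agrees with the spectral expression $(\lambda_1(x))_+ u_x^{(1)} + (\lambda_2(x))_+ u_x^{(2)}$ above.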

From the definition (1.4) of the vector-valued function associated with $\mathcal{K}^n$, we know that the projection $x_+$ onto $\mathcal{K}^n$ is a vector-valued function. Moreover, it is known that the projection $x_+$ and $(\alpha)_+$ for $\alpha \in \mathbb{R}$ share many properties, such as continuity, directional differentiability, semismoothness and so on. Indeed, these properties are established for general vector-valued functions associated with SOC. In particular, Chen, Chen and Tseng [5] have shown that many properties of $f^{\mathrm{soc}}$ are inherited from the function $f$, as presented in the following proposition.

**Proposition 2.1.** *Suppose that $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ has the spectral decomposition given as in (1.1)-(1.3). For any function $f: \mathbb{R} \to \mathbb{R}$ and the vector-valued function $f^{\mathrm{soc}}$ defined by (1.4), the following hold.*

**(a)** *$f^{\mathrm{soc}}$ is continuous at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is continuous at $\lambda_1(x), \lambda_2(x)$;*

**(b)** *$f^{\mathrm{soc}}$ is directionally differentiable at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is directionally differentiable at $\lambda_1(x), \lambda_2(x)$;*

**(c)** *$f^{\mathrm{soc}}$ is differentiable at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is differentiable at $\lambda_1(x), \lambda_2(x)$;*

**(d)** *$f^{\mathrm{soc}}$ is strictly continuous at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is strictly continuous at $\lambda_1(x), \lambda_2(x)$;*

**(e)** *$f^{\mathrm{soc}}$ is semismooth at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is semismooth at $\lambda_1(x), \lambda_2(x)$;*

**(f)** *$f^{\mathrm{soc}}$ is continuously differentiable at $x \in \mathbb{R}^n$ with spectral values $\lambda_1(x), \lambda_2(x)$ $\Longleftrightarrow$ $f$ is continuously differentiable at $\lambda_1(x), \lambda_2(x)$.*

Note that the projection function $x_+$ onto $\mathcal{K}^n$ is not a smooth function on the whole space $\mathbb{R}^n$. From Proposition 2.1, we can construct smoothing functions for the projection $x_+$ onto $\mathcal{K}^n$ by smoothing the functions $f(\lambda_i(x))$ for $i = 1, 2$. More specifically, we consider a family of smoothing functions $\phi(\mu, \cdot): \mathbb{R} \to \mathbb{R}$ with respect to $(\alpha)_+$ satisfying
$$\lim_{\mu \downarrow 0} \phi(\mu, \alpha) = (\alpha)_+ \quad \text{and} \quad 0 \leq \frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \leq 1 \tag{2.2}$$
for all $\alpha \in \mathbb{R}$. Are there functions satisfying the above conditions? Yes, there are many.

We illustrate three of them here:

$$\phi_1(\mu, \alpha) = \frac{\sqrt{\alpha^2 + 4\mu^2} + \alpha}{2}, \qquad (\mu > 0)$$
$$\phi_2(\mu, \alpha) = \mu \ln\left(e^{\alpha/\mu} + 1\right), \qquad (\mu > 0)$$
$$\phi_3(\mu, \alpha) = \begin{cases} \alpha, & \text{if } \alpha \geq \mu, \\[2pt] \dfrac{(\alpha + \mu)^2}{4\mu}, & \text{if } -\mu < \alpha < \mu, \\[2pt] 0, & \text{if } \alpha \leq -\mu. \end{cases} \qquad (\mu > 0)$$
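For reference, the three smoothing functions can be written down directly; the vectorized `np.where`/`logaddexp` forms below are our own numerically safe phrasing of the same formulas.

```python
import numpy as np

def phi1(mu, a):
    # phi_1(mu, alpha) = (sqrt(alpha^2 + 4 mu^2) + alpha) / 2
    return (np.sqrt(a**2 + 4*mu**2) + a) / 2

def phi2(mu, a):
    # phi_2(mu, alpha) = mu * ln(exp(alpha/mu) + 1), via logaddexp to avoid overflow
    return mu * np.logaddexp(a / mu, 0.0)

def phi3(mu, a):
    # phi_3: piecewise-quadratic smoothing of (alpha)_+
    return np.where(a >= mu, a,
                    np.where(a <= -mu, 0.0, (a + mu)**2 / (4*mu)))
```

Each function tends to $(\alpha)_+$ as $\mu \downarrow 0$ and has $\partial\phi/\partial\alpha \in [0, 1]$, so (2.2) holds.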

*In fact, the functions ϕ*1 *and ϕ*2 *were considered in [13, 17], while the function ϕ*3 was
*employed in [18, 37]. In addition, as for the function ϕ*3, there is a more general function
*ϕ**p**(µ,·) : IR → IR given by*

$$\phi_p(\mu, \alpha) = \begin{cases} \alpha & \text{if } \alpha \geq \dfrac{\mu}{p-1}, \\[6pt] \dfrac{\mu}{p-1} \left[ \dfrac{(p-1)(\alpha + \mu)}{p\mu} \right]^p & \text{if } -\mu < \alpha < \dfrac{\mu}{p-1}, \\[6pt] 0 & \text{if } \alpha \leq -\mu, \end{cases}$$

where $\mu > 0$ and $p \geq 2$. This function $\phi_p$ was recently studied in [9], and it is not hard to verify that $\phi_p$ also satisfies the conditions (2.2). All the functions $\phi_1, \phi_2$ and $\phi_3$ will play the role of smoothing functions for $f(\lambda_i(x))$ in (1.4). In other words, based on these smoothing functions, we define a type of SOC-function $\Phi_\mu(\cdot)$ on $\mathbb{R}^n$ associated with $\mathcal{K}^n$ ($n \geq 1$) as

$$\Phi_\mu(x) := \phi(\mu, \lambda_1(x))\, u_x^{(1)} + \phi(\mu, \lambda_2(x))\, u_x^{(2)} \qquad \forall x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}, \tag{2.3}$$
where $\lambda_1(x), \lambda_2(x)$ are given by (1.2) and $u_x^{(1)}, u_x^{(2)}$ are given by (1.3). In light of the properties of $\phi(\mu, \alpha)$, we show below that the SOC-function $\Phi_\mu(x)$ is a smoothing function for the projection $x_+$ onto $\mathcal{K}^n$.
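A direct transcription of (2.3) — again an illustrative sketch with our own naming — exhibits the smoothing behaviour numerically: as $\mu \downarrow 0$, $\Phi_\mu(x)$ approaches the projection $x_+$.

```python
import numpy as np

def Phi_mu(phi, mu, x):
    """SOC smoothing function (2.3): apply phi(mu, .) to the spectral values of x."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    w = x2 / r if r > 0 else np.zeros_like(x2)
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return phi(mu, x1 - r) * u1 + phi(mu, x1 + r) * u2

# example smoothing function: phi_1(mu, alpha) = (sqrt(alpha^2 + 4 mu^2) + alpha)/2
phi1 = lambda mu, a: (np.sqrt(a**2 + 4*mu**2) + a) / 2
```

For instance, $x = (1, 2, 0)$ has spectral values $(-1, 3)$ and projection $x_+ = (1.5, 1.5, 0)$, which `Phi_mu` recovers for small $\mu$.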

We depict the graphs of $\phi_i(\mu, \alpha)$ for $i = 1, 2, 3$ in Figure 1. From Figure 1, we see that $\phi_3$ best approximates the function $(\alpha)_+$, in the sense that it is closest to $(\alpha)_+$ among the $\phi_i(\mu, \alpha)$, $i = 1, 2, 3$.

*Figure 1: Graphs of max(0, t) and all three ϕ**i**(µ, t) with µ = 0.2.*

**Proposition 2.2.** *Suppose that $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ has the spectral decomposition given as in (1.1)-(1.3), and that $\phi(\mu, \cdot)$ with $\mu > 0$ is a continuously differentiable function satisfying (2.2). Then, the following hold.*

**(a)** *The function $\Phi_\mu: \mathbb{R}^n \to \mathbb{R}^n$ defined as in (2.3) is continuously differentiable. Moreover, its Jacobian matrix at $x$ is described as*
$$\frac{\partial \Phi_\mu(x)}{\partial x} = \begin{cases} \dfrac{\partial \phi}{\partial \lambda}(\mu, x_1)\, I & \text{if } x_2 = 0, \\[10pt] \begin{bmatrix} b & c\, x_2^T / \|x_2\| \\ c\, x_2 / \|x_2\| & aI + (b - a)\, x_2 x_2^T / \|x_2\|^2 \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases} \tag{2.4}$$
*where*
$$a = \frac{\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \quad b = \frac{1}{2}\left( \frac{\partial \phi}{\partial \lambda_2}(\mu, \lambda_2(x)) + \frac{\partial \phi}{\partial \lambda_1}(\mu, \lambda_1(x)) \right), \quad c = \frac{1}{2}\left( \frac{\partial \phi}{\partial \lambda_2}(\mu, \lambda_2(x)) - \frac{\partial \phi}{\partial \lambda_1}(\mu, \lambda_1(x)) \right); \tag{2.5}$$

**(b)** *Both $\dfrac{\partial \Phi_\mu(x)}{\partial x}$ and $I - \dfrac{\partial \Phi_\mu(x)}{\partial x}$ are positive semi-definite matrices;*

**(c)** *$\lim_{\mu \to 0} \Phi_\mu(x) = x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)}$.*

*Proof.* (a) From the expression (2.3) and the assumption that $\phi(\mu, \cdot)$ is continuously differentiable, it is easy to verify that the function $\Phi_\mu$ is continuously differentiable. The Jacobian matrix (2.4) of $\Phi_\mu(x)$ can be obtained by adopting the same arguments as in [13, Proposition 5.2]. Hence, we omit the details here.

(b) First, we prove that the matrix $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite. For the case $x_2 = 0$, we know that $\frac{\partial \Phi_\mu(x)}{\partial x} = \frac{\partial \phi}{\partial \lambda}(\mu, x_1)\, I$. Then, from $0 \leq \frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \leq 1$, it is clear that this matrix is positive semi-definite. For the case $x_2 \neq 0$, from $\frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \geq 0$ and (2.5), we have $b \geq 0$. In order to prove that the matrix $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite, we only need to verify that the Schur complement of $b$ with respect to $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite. Note that the Schur complement of $b$ has the form
$$aI + (b - a)\frac{x_2 x_2^T}{\|x_2\|^2} - \frac{c^2}{b}\, \frac{x_2 x_2^T}{\|x_2\|^2} = a\left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) + \frac{b^2 - c^2}{b}\, \frac{x_2 x_2^T}{\|x_2\|^2}.$$

Since $\frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \geq 0$, the function $\phi(\mu, \alpha)$ is increasing in $\alpha$, which leads to $a \geq 0$. Besides, from (2.5), we observe that
$$b^2 - c^2 = \frac{\partial \phi}{\partial \lambda_2}(\mu, \lambda_2(x))\, \frac{\partial \phi}{\partial \lambda_1}(\mu, \lambda_1(x)) \geq 0.$$
With this, it follows that the Schur complement of $b$ with respect to $\frac{\partial \Phi_\mu(x)}{\partial x}$ is a nonnegative linear combination of the matrices $\frac{x_2 x_2^T}{\|x_2\|^2}$ and $I - \frac{x_2 x_2^T}{\|x_2\|^2}$. Thus, the Schur complement of $b$ is positive semi-definite, which shows that the matrix $\frac{\partial \Phi_\mu(x)}{\partial x}$ is positive semi-definite.

Combining this with $\frac{\partial \phi}{\partial \alpha}(\mu, \alpha) \leq 1$ and following similar arguments as above, we can show that the matrix $I - \frac{\partial \Phi_\mu(x)}{\partial x}$ is also positive semi-definite.

(c) By the definition of the function Φ*µ**(x), it can be verified directly.*

We point out that the definition (2.3) includes, as a special case, the analogous way of defining smoothing functions in [13, Section 4], and hence [13, Prop. 4.1] is covered by Proposition 2.2. Indeed, Proposition 2.2 can also be verified from a geometric viewpoint. More specifically, from Figures 2, 3 and 4, we see that as $\mu \downarrow 0$, each $\phi_i$ gets closer to $(\alpha)_+$, which verifies Proposition 2.2(c).
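The closed-form Jacobian (2.4)-(2.5) can also be checked against finite differences; the sketch below fixes $\phi = \phi_1$ and uses hypothetical helper names of our own, for the case $x_2 \neq 0$.

```python
import numpy as np

mu = 0.5
phi  = lambda a: (np.sqrt(a**2 + 4*mu**2) + a) / 2          # phi_1 with fixed mu
dphi = lambda a: (a / np.sqrt(a**2 + 4*mu**2) + 1) / 2      # d phi_1 / d alpha

def Phi(x):
    """SOC-function (2.3) built from phi_1 (assumes x2 != 0)."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    w = x2 / r
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return phi(x1 - r) * u1 + phi(x1 + r) * u2

def jac_formula(x):
    """Closed-form Jacobian (2.4)-(2.5) for the case x2 != 0."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    l1, l2 = x1 - r, x1 + r
    a = (phi(l2) - phi(l1)) / (l2 - l1)
    b = (dphi(l2) + dphi(l1)) / 2
    c = (dphi(l2) - dphi(l1)) / 2
    w = x2 / r
    n = x.size
    J = np.empty((n, n))
    J[0, 0] = b
    J[0, 1:] = c * w
    J[1:, 0] = c * w
    J[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return J

x = np.array([0.3, -0.7, 1.2])
h = 1e-6
J_fd = np.column_stack([(Phi(x + h*e) - Phi(x - h*e)) / (2*h) for e in np.eye(3)])
```

The eigenvalues of the formula Jacobian lie in $[0, 1]$, illustrating Proposition 2.2(b).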

**3 Applying $\Phi_\mu$ to solve the system (1.5)**

In this section, in light of the smoothing vector-valued function Φ* _{µ}*, we reformulate (1.5) as
a system of smoothing equations. To this end, we need a partial order induced by SOC.

More specifically, for any $x \in \mathbb{R}^n$, using the definition of the partial order "$\preceq_{\mathcal{K}^m}$" and the projection function $x_+$ in (2.1), we have
$$f_I(x) \preceq_{\mathcal{K}^m} 0 \iff -f_I(x) \in \mathcal{K}^m \iff f_I(x) \in -\mathcal{K}^m \iff (f_I(x))_+ = 0.$$

Hence, the system (1.5) is equivalent to the following system of equations:

$$\begin{cases} (f_I(x))_+ = 0, \\ f_E(x) = 0. \end{cases} \tag{3.1}$$

*Figure 2: Graphs of ϕ*1*(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.*

*Figure 3: Graphs of ϕ*2*(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.*

Note that the function $(f_I(\cdot))_+$ in the above equation (3.1) is nonsmooth. Therefore, smoothing-type Newton methods cannot be directly applied to solve the equation (3.1). To overcome this, we employ the smoothing function $\Phi_\mu(\cdot)$ defined in (2.3), and define the following function:

$$F(\mu, x, y) := \begin{bmatrix} f_I(x) - y \\ f_E(x) \\ \Phi_\mu(y) \end{bmatrix}.$$

From Proposition 2.2(c), it follows that

$$\begin{aligned} F(\mu, x, y) = 0 \ \text{and}\ \mu = 0 &\iff y = f_I(x),\ f_E(x) = 0,\ \Phi_\mu(y) = 0 \ \text{and}\ \mu = 0 \\ &\iff y = f_I(x),\ f_E(x) = 0 \ \text{and}\ y_+ = 0 \\ &\iff (f_I(x))_+ = 0,\ f_E(x) = 0 \\ &\iff f_I(x) \preceq_{\mathcal{K}^m} 0,\ f_E(x) = 0. \end{aligned}$$

*Figure 4: Graphs of ϕ*3*(µ, α) with µ = 0.01, 0.1, 0.3, 0.5.*

In other words, as long as the system $F(\mu, x, y) = 0$ with $\mu = 0$ is solved, the corresponding $x$ is a solution to the original system (1.5). In view of Proposition 2.2(a), we can obtain a solution to the system (1.5) by applying a smoothing-type Newton method to $F(\mu, x, y) = 0$ while driving $\mu \downarrow 0$ at the same time. To do this, for any $z = (\mu, x, y) \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we further define a continuously differentiable function $H: \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$ as follows:

$$H(z) := \begin{bmatrix} \mu \\ f_I(x) - y + \mu x_I \\ f_E(x) + \mu x_E \\ \Phi_\mu(y) + \mu y \end{bmatrix}, \tag{3.2}$$

where $x_I := (x_1, x_2, \dots, x_m)^T \in \mathbb{R}^m$, $x_E := (x_{m+1}, \dots, x_n)^T \in \mathbb{R}^{n-m}$, $x := (x_I^T, x_E^T)^T \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. Then, it is clear that when $H(z) = 0$, we have $\mu = 0$ and $x$ is a solution to the system (1.5). Now, let $H'(z)$ denote the Jacobian matrix of the function $H$ at $z$; then for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we obtain

$$H'(z) = \begin{bmatrix} 1 & 0_n^T & 0_m^T \\ x_I & f_I' + \mu U & -I_m \\ x_E & f_E' + \mu V & 0_{(n-m) \times m} \\ \dfrac{\partial \Phi_\mu(y)}{\partial \mu} + y & 0_{m \times n} & \dfrac{\partial \Phi_\mu(y)}{\partial y} + \mu I_m \end{bmatrix}, \tag{3.3}$$
where $U := \begin{bmatrix} I_m & 0_{m \times (n-m)} \end{bmatrix}$, $V := \begin{bmatrix} 0_{(n-m) \times m} & I_{n-m} \end{bmatrix}$, $0_l$ denotes the $l$-dimensional zero vector, and $0_{l \times q}$ denotes the $l \times q$ zero matrix for any positive integers $l$ and $q$. In summary, we will apply a smoothing-type Newton method to solve the smoothed equation $H(z) = 0$ at each iteration, keeping $\mu > 0$ while driving $H(z) \to 0$, to find a solution of the system (1.5).
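To illustrate how (3.2) fits together, here is a schematic assembly of $H(z)$ in NumPy. The scaffolding and names are our own (with $\Phi_\mu$ built from $\phi_1$); a real implementation would follow the authors' MATLAB code instead.

```python
import numpy as np

def make_H(f_I, f_E, m, n, phi):
    """Return H(z) of (3.2) for z = (mu, x, y) packed into one vector."""
    def soc_Phi(mu, y):
        # smoothing SOC-function (2.3) applied to y w.r.t. K^m
        y1, y2 = y[0], y[1:]
        r = np.linalg.norm(y2)
        w = y2 / r if r > 0 else np.zeros_like(y2)
        u1 = 0.5 * np.concatenate(([1.0], -w))
        u2 = 0.5 * np.concatenate(([1.0],  w))
        return phi(mu, y1 - r) * u1 + phi(mu, y1 + r) * u2

    def H(z):
        mu, x, y = z[0], z[1:1 + n], z[1 + n:]
        x_I, x_E = x[:m], x[m:]
        return np.concatenate(([mu],
                               f_I(x) - y + mu * x_I,
                               f_E(x) + mu * x_E,
                               soc_Phi(mu, y) + mu * y))
    return H
```

For a toy system with $f_I(x) = (x_1, x_2)^T$ and $f_E(x) = x_3$, the point $z = (0, 0, 0)$ satisfies $H(z) = 0$.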

**4 A smoothing-type Newton algorithm**

Now, we consider a smoothing-type Newton algorithm with a nonmonotone line search, and
show that the algorithm is well defined. For convenience, we denote the merit function Ψ
*as Ψ(z) :=∥H(z)∥*^{2} *for any z∈ IR*++*× IR*^{n}*× IR** ^{m}*.

**Algorithm 4.1. (A smoothing-type Newton Algorithm)**

**Step 0** Choose $\gamma \in (0, 1)$, $\xi \in (0, \frac{1}{2})$. Take $\eta > 0$, $\sigma \in (0, 1)$ such that $\sigma\eta < 1$. Let $\mu_0 = \eta$ and $(x^0, y^0) \in \mathbb{R}^n \times \mathbb{R}^m$ be an arbitrary vector. Set $z^0 = (\mu_0, x^0, y^0)$, $e^0 := (1, 0, \dots, 0) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m$, $G_0 := \|H(z^0)\|^2 = \Psi(z^0)$ and $S_0 := 1$. Choose $\beta_{\min}$ and $\beta_{\max}$ such that $0 \leq \beta_{\min} \leq \beta_{\max} < 1$. Set $\tau(z^0) := \sigma \min\{1, \Psi(z^0)\}$ and $k := 0$.

**Step 1** If $\|H(z^k)\| = 0$, stop. Otherwise, go to Step 2.

**Step 2** Compute $\Delta z^k := (\Delta\mu_k, \Delta x^k, \Delta y^k) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m$ by
$$H'(z^k)\, \Delta z^k = -H(z^k) + \eta\, \tau(z^k)\, e^0. \tag{4.1}$$

**Step 3** Let $\alpha_k$ be the maximum of the values $1, \gamma, \gamma^2, \dots$ such that
$$\Psi(z^k + \alpha_k \Delta z^k) \leq \left[1 - 2\xi(1 - \sigma\eta)\alpha_k\right] G_k. \tag{4.2}$$

**Step 4** Set $z^{k+1} := z^k + \alpha_k \Delta z^k$. If $\|H(z^{k+1})\| = 0$, stop. Otherwise, go to Step 5.

**Step 5** Choose $\beta_k \in [\beta_{\min}, \beta_{\max}]$. Set
$$S_{k+1} := \beta_k S_k + 1, \qquad \tau(z^{k+1}) := \min\left\{\sigma,\ \sigma\Psi(z^{k+1}),\ \tau(z^k)\right\}, \qquad G_{k+1} := \frac{\beta_k S_k G_k + \Psi(z^{k+1})}{S_{k+1}}, \tag{4.3}$$
and set $k := k + 1$. Go to Step 2.

The nonmonotone line search technique in Algorithm 4.1 was introduced in [36]. From the first and third equations in (4.3), we know that $G_{k+1}$ is a convex combination of $G_k$ and $\Psi(z^{k+1})$. In fact, $G_k$ can be expressed as a convex combination of $\Psi(z^0), \Psi(z^1), \dots, \Psi(z^k)$. Moreover, the main role of $\beta_k$ is to control the degree of non-monotonicity. If $\beta_k = 0$ for every $k$, then the corresponding line search is the usual monotone Armijo line search.
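The averaging rule in (4.3) is easy to state in code; this fragment (with our own names) just exercises the convex-combination property mentioned above.

```python
def nonmonotone_update(G_k, S_k, psi_next, beta_k):
    """One update of (G_k, S_k) per (4.3): G_{k+1} is a convex combination
    of G_k and Psi(z^{k+1}) with weights beta_k*S_k/S_{k+1} and 1/S_{k+1}."""
    S_next = beta_k * S_k + 1.0
    G_next = (beta_k * S_k * G_k + psi_next) / S_next
    return G_next, S_next
```

Setting `beta_k = 0` recovers the monotone reference value $G_{k+1} = \Psi(z^{k+1})$ of the usual Armijo search.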

**Proposition 4.2.** *Suppose that the sequences $\{z^k\}$, $\{\mu_k\}$, $\{G_k\}$, $\{\Psi(z^k)\}$ and $\{\tau(z^k)\}$ are generated by Algorithm 4.1. Then, the following hold.*

**(a)** *The sequence $\{G_k\}$ is monotonically decreasing and $\Psi(z^k) \leq G_k$ for all $k \in \mathbb{N}$;*

**(b)** *The sequence $\{\tau(z^k)\}$ is monotonically decreasing;*

**(c)** *$\eta\,\tau(z^k) \leq \mu_k$ for all $k \in \mathbb{N}$;*

**(d)** *The sequence $\{\mu_k\}$ is monotonically decreasing and $\mu_k > 0$ for all $k \in \mathbb{N}$.*

*Proof.* The proof is similar to that of Remark 3.1 in [37]; we omit the details.

Next, we show that Algorithm 4.1 is well-defined and establish its local quadratic convergence. For simplicity, we denote the Jacobian matrix of the function $f$ by
$$f'(x) := \begin{bmatrix} f_I'(x) \\ f_E'(x) \end{bmatrix}$$
and use the following assumption.

**Assumption 4.1.** *$f'(x) + \mu I_n$ is invertible for any $x \in \mathbb{R}^n$ and $\mu \in \mathbb{R}_{++}$.*

We point out that Assumption 4.1 is only a mild condition, and many functions satisfy it. For example, if $f$ is a monotone function, then $f'(x)$ is a positive semi-definite matrix for any $x \in \mathbb{R}^n$, and thus Assumption 4.1 is satisfied.

**Theorem 4.3.** *Suppose that $f$ is a continuously differentiable function and Assumption 4.1 is satisfied. Then, Algorithm 4.1 is well-defined.*

*Proof. In order to show that Algorithm 4.1 is well-defined, we need to prove that Newton*
equation (4.1) is solvable, and the line search (4.2) is well-defined.

First, we prove that the Newton equation (4.1) is solvable. By the expression of the Jacobian matrix $H'(z)$ in (3.3), we see that the determinant $\det(H'(z))$ of $H'(z)$ satisfies
$$\det(H'(z)) = \det\left(f'(x) + \mu I_n\right) \cdot \det\left(\frac{\partial \Phi_\mu(y)}{\partial y} + \mu I_m\right)$$
for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$. Moreover, from Proposition 2.2(b), we know that $\frac{\partial \Phi_\mu(y)}{\partial y}$ is positive semi-definite for $\mu \in \mathbb{R}_{++}$. Hence, combining this with Assumption 4.1, we obtain that $H'(z)$ is nonsingular for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$ with $\mu > 0$. Applying Proposition 4.2(d), it follows that the Newton equation (4.1) is solvable.

Secondly, we prove that the line search (4.2) is well-defined. For notational convenience, we denote
$$w_k(\alpha) := \Psi(z^k + \alpha\Delta z^k) - \Psi(z^k) - \alpha\Psi'(z^k)\Delta z^k.$$
From the Newton equation (4.1) and the definition of $\Psi$, we have
$$\begin{aligned} \Psi(z^k + \alpha\Delta z^k) &= w_k(\alpha) + \Psi(z^k) + \alpha\Psi'(z^k)\Delta z^k \\ &= w_k(\alpha) + \Psi(z^k) + 2\alpha H(z^k)^T\left(-H(z^k) + \eta\tau(z^k)e^0\right) \\ &\leq w_k(\alpha) + (1 - 2\alpha)\Psi(z^k) + 2\alpha\eta\tau(z^k)\|H(z^k)\|. \end{aligned}$$

If $\Psi(z^k) \leq 1$, then $\|H(z^k)\| \leq 1$. Hence, it follows that
$$\tau(z^k)\|H(z^k)\| \leq \sigma\Psi(z^k)\|H(z^k)\| \leq \sigma\Psi(z^k).$$

If $\Psi(z^k) > 1$, then $\Psi(z^k) = \|H(z^k)\|^2 \geq \|H(z^k)\|$, which yields
$$\tau(z^k)\|H(z^k)\| \leq \sigma\|H(z^k)\| \leq \sigma\Psi(z^k).$$

Thus, from all the above, we obtain that
$$\begin{aligned} \Psi(z^k + \alpha\Delta z^k) &\leq w_k(\alpha) + (1 - 2\alpha)\Psi(z^k) + 2\alpha\eta\sigma\Psi(z^k) \\ &= w_k(\alpha) + \left[1 - 2(1 - \sigma\eta)\alpha\right]\Psi(z^k) \\ &\leq w_k(\alpha) + \left[1 - 2(1 - \sigma\eta)\alpha\right] G_k. \end{aligned} \tag{4.4}$$
Since the function $H$ is continuously differentiable for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we have $w_k(\alpha) = o(\alpha)$ for all $k \in \mathbb{N}$. Combining this with (4.4), it follows that the line search (4.2) is well-defined.

**Theorem 4.4.** *Suppose that $f$ is a continuously differentiable function and Assumption 4.1 is satisfied. Then the sequence $\{z^k\}$ generated by Algorithm 4.1 is bounded, and any accumulation point of the sequence $\{x^k\}$ is a solution of the system (1.5).*

*Proof. The proof is similar to [37, Theorem 4.1] and we omit it.*

In Theorem 4.4, we give the global convergence of Algorithm 4.1. Now, we analyze the convergence rate of Algorithm 4.1. We start by introducing the following concepts. A locally Lipschitz function $F: \mathbb{R}^n \to \mathbb{R}^m$ is said to be semismooth (or strongly semismooth) at $x \in \mathbb{R}^n$ if $F$ is directionally differentiable at $x$ and
$$F(x + h) - F(x) - Vh = o(\|h\|) \quad (\text{or } = O(\|h\|^2))$$
holds for any $V \in \partial F(x + h)$, where $\partial F(x)$ is the generalized Jacobian of the function $F$ at $x \in \mathbb{R}^n$ in the sense of Clarke [2]. There are many functions that are semismooth, such as
convex functions, smooth functions, piecewise linear functions and so on. In addition, it is
known that the composition of semismooth functions is still a semismooth function, and the
composition of strongly semismooth functions is still a strongly semismooth function [12].

From Proposition 2.2(a), we know that $\Phi_\mu(x)$ defined by (2.3) is smooth on $\mathbb{R}^n$.

*With the definition (3.2) of H, mimicking the arguments as in [37, Theorem 5.1], we*
have the local quadratic convergence of Algorithm 4.1.

**Theorem 4.5.** *Suppose that the conditions given in Theorem 4.4 are satisfied, and $z^* = (\mu_*, x^*, y^*)$ is an accumulation point of the sequence $\{z^k\}$ generated by Algorithm 4.1.*

**(a)** *If all $V \in \partial H(z^*)$ are nonsingular, then the sequence $\{z^k\}$ converges to $z^*$, and*
$$\|z^{k+1} - z^k\| = o(\|z^k - z^*\|), \qquad \mu_{k+1} = o(\mu_k);$$

**(b)** *If the functions $f$ and $\Phi_\mu$ satisfy that $f'$ and $\Phi_\mu'$ are Lipschitz continuous on $\mathbb{R}^n$, then*
$$\|z^{k+1} - z^k\| = O(\|z^k - z^*\|^2) \quad \text{and} \quad \mu_{k+1} = O(\mu_k^2).$$

**5 Numerical experiments**

In this section, we present some numerical examples to demonstrate the efficiency of Algorithm 4.1 for solving the system (1.5). In our tests, all experiments are done on a PC with a 1.9 GHz CPU and 8.0 GB of RAM, and all program codes are written and run in MATLAB. We point out that if the number of functions in $I \cup E$ is not $n$, we can adopt an approach similar to that in [37]: the system (1.5) is transformed into a new problem, which we then solve using Algorithm 4.1. By this approach, a solution of the original problem can be found.

Throughout the following experiments, we employ the three functions $\phi_1, \phi_2$ and $\phi_3$ along with the proposed algorithm to implement each example. Note that, for the function $\phi_1$, its corresponding SOC-function $\Phi_\mu$ can be alternatively expressed as
$$\widetilde{\Phi}_\mu(x) = \frac{x + \sqrt{x^2 + 4\mu^2 e}}{2} \quad \text{with } e = (1, 0, \cdots, 0)^T \in \mathcal{K}^n,$$
where $x^2 = x \circ x$ and the square root is taken with respect to the Jordan product. This form is simpler than the $\Phi_\mu(x)$ induced from (2.3). Hence, we adopt it in our implementation. Moreover, the parameters used in the algorithm are chosen as follows:

$$\gamma = 0.3, \quad \xi = 10^{-4}, \quad \eta = 1.0, \quad \beta_0 = 0.01, \quad \mu_0 = 1.0, \quad S_0 = 1.0,$$

and the parameters $c$ and $\sigma$ are chosen according to the values listed in Table 1 and Table 4. In the implementation, the algorithm stops when $\|H(z)\| \leq 10^{-6}$, when the step length satisfies $\nu \leq 10^{-6}$, or when the number of iterations exceeds 500; the starting points are randomly generated from the interval $[-1, 1]$.

Now, we present the test examples. We first consider two examples in which the system
*(1.5) only includes inequalities, i.e., m = n. Note that a similar way to construct the two*
examples was given in [25].

**Example 5.1.** *Consider the system (1.5) with inequalities only, where $f(x) := Mx + q \preceq_{\mathcal{K}^n} 0$ and $\mathcal{K}^n := \mathcal{K}^{n_1} \times \cdots \times \mathcal{K}^{n_r}$. Here $M$ is generated by $M = BB^T$ with $B \in \mathbb{R}^{n \times n}$ being a matrix whose every component is randomly chosen from the interval $[0, 1]$, and $q \in \mathbb{R}^n$ being a vector whose every component is 1.*
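The random data of Example 5.1 can be generated as follows (a sketch; the RNG choice and seeding are ours, not specified in the text).

```python
import numpy as np

def make_example_51(n, seed=0):
    """Data for Example 5.1: f(x) = M x + q with M = B B^T, q = ones(n)."""
    rng = np.random.default_rng(seed)
    B = rng.uniform(0.0, 1.0, size=(n, n))   # entries of B drawn from [0, 1]
    M = B @ B.T                              # symmetric positive semi-definite
    q = np.ones(n)
    return M, q
```

Since $M = BB^T$ is positive semi-definite, $f$ is monotone and Assumption 4.1 holds for this example.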

For Example 5.1, the tested problems are generated with sizes $n = 500, 1000, \dots, 4500$ and each $n_i = 10$. The random problems of each size are generated 10 times. Besides using the three functions along with Algorithm 4.1 for solving Example 5.1, we have also tested it using the smoothing-type algorithm with the monotone line search introduced in [25] (for this case, we choose the function $\phi_1$). Table 1 shows the numerical results, where

"fun" denotes the smoothing function used,

"suc" denotes the number of generated problems that Algorithm 4.1 solves successfully,

"iter" denotes the average iteration number,

"cpu" denotes the average CPU time in seconds,

"res" denotes the average residual norm $\|H(z)\|$ over the solved test problems.

The initial points are also randomly generated. In light of "iter" and "cpu" in Table 1, we can conclude that
$$\phi_3(\mu, \alpha) > \phi_1(\mu, \alpha) > \phi_2(\mu, \alpha),$$
where "$>$" means "better performance". In Table 2, we compare Algorithm 4.1 with the nonmonotone line search against the smoothing-type algorithm with the monotone line search studied in [25]. Although overall Algorithm 4.1 successfully solves fewer of the generated problems than the monotone smoothing-type algorithm, it outperforms the latter in CPU time and iteration count. This indicates that Algorithm 4.1 has some advantages over the algorithm with the monotone line search in [25].

Another way to compare the performance of the functions $\phi_i(\mu, \alpha)$, $i = 1, 2, 3$, is via the so-called "performance profile", which was introduced in [39]. In this approach, we regard Algorithm 4.1 equipped with a smoothing function $\phi_i(\mu, \alpha)$, $i = 1, 2, 3$, as a solver, and assume that there are $n_s$ solvers and $n_p$ test problems from the randomly generated test set $\mathcal{P}$. We are interested in using the iteration number as the performance measure for Algorithm 4.1 with different $\phi_i(\mu, \alpha)$. For each problem $p$ and solver $s$, let
$$f_{p,s} = \text{iteration number required to solve problem } p \text{ by solver } s.$$

| $n$ | fun | suc | iter | cpu | res |
|-----|-----|-----|------|-----|-----|
| 500 | $\phi_1$ | 10 | 5.000 | 0.251 | 8.864e-09 |
| 500 | $\phi_2$ | 10 | 7.800 | 1.496 | 2.600e-07 |
| 500 | $\phi_3$ | 10 | 3.500 | 0.707 | 3.762e-07 |
| 1000 | $\phi_1$ | 10 | 5.000 | 0.632 | 2.165e-08 |
| 1000 | $\phi_2$ | 10 | 7.200 | 5.240 | 8.657e-08 |
| 1000 | $\phi_3$ | 10 | 3.400 | 3.093 | 4.853e-07 |
| 1500 | $\phi_1$ | 9 | 5.000 | 1.224 | 1.537e-07 |
| 1500 | $\phi_2$ | 9 | 8.111 | 13.232 | 3.124e-07 |
| 1500 | $\phi_3$ | 9 | 4.222 | 8.781 | 2.706e-07 |
| 2000 | $\phi_1$ | 10 | 5.000 | 2.145 | 1.599e-07 |
| 2000 | $\phi_2$ | 10 | 7.700 | 24.130 | 2.234e-07 |
| 2000 | $\phi_3$ | 10 | 4.200 | 16.925 | 1.923e-07 |
| 2500 | $\phi_1$ | 9 | 5.000 | 3.519 | 3.897e-08 |
| 2500 | $\phi_2$ | 9 | 6.889 | 34.849 | 2.016e-07 |
| 2500 | $\phi_3$ | 9 | 4.000 | 27.870 | 1.479e-07 |
| 3000 | $\phi_1$ | 10 | 5.000 | 5.161 | 9.769e-08 |
| 3000 | $\phi_2$ | 10 | 8.300 | 69.723 | 1.714e-07 |
| 3000 | $\phi_3$ | 10 | 4.100 | 45.891 | 1.608e-07 |
| 3500 | $\phi_1$ | 7 | 5.000 | 7.415 | 2.226e-07 |
| 3500 | $\phi_2$ | 7 | 7.857 | 102.272 | 4.037e-07 |
| 3500 | $\phi_3$ | 7 | 4.429 | 75.068 | 2.334e-07 |
| 4000 | $\phi_1$ | 9 | 5.000 | 9.974 | 5.795e-08 |
| 4000 | $\phi_2$ | 9 | 6.444 | 106.850 | 3.132e-07 |
| 4000 | $\phi_3$ | 9 | 4.000 | 98.983 | 7.743e-08 |
| 4500 | $\phi_1$ | 8 | 5.000 | 13.075 | 2.374e-07 |
| 4500 | $\phi_2$ | 8 | 10.250 | 240.602 | 3.115e-07 |
| 4500 | $\phi_3$ | 8 | 4.250 | 147.863 | 3.070e-07 |

Table 1: Average performance of Algorithm 4.1 for Example 5.1 ($c = 0.01$, $\sigma = 10^{-5}$)

We employ the performance ratio

\[
r_{p,s} := \frac{f_{p,s}}{\min\{f_{p,s} : s \in \mathcal{S}\}},
\]

where S denotes the set of solvers. We assume that a parameter r_M with r_{p,s} ≤ r_M for all p, s is chosen, and that r_{p,s} = r_M if and only if solver s fails to solve problem p. In order to obtain an overall assessment for each solver, we define

\[
\rho_s(\tau) := \frac{1}{n_p}\,\operatorname{size}\{\, p \in \mathcal{P} : r_{p,s} \le \tau \,\},
\]

which is called the performance profile of the iteration numbers for solver s. Then ρ_s(τ) is the probability for solver s ∈ S that the performance ratio r_{p,s} is within a factor τ ∈ IR of the best possible ratio.
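As a rough illustration (not part of the paper), the performance profile ρ_s(τ) above can be computed from a matrix of measures f_{p,s} as sketched below; the function name `performance_profile`, the failure cap `r_M`, and the default τ grid are our own choices, and failures are marked by `np.inf`.

```python
import numpy as np

def performance_profile(f, r_M=1e6, taus=None):
    """Compute performance profiles rho_s(tau) in the sense of [39].

    f : (n_p, n_s) array of measures f_{p,s} (e.g. iteration counts);
        np.inf marks a failure of solver s on problem p.
    Returns (taus, rho), where rho[k, s] is the fraction of problems
    solved by solver s within a factor taus[k] of the best solver.
    """
    f = np.asarray(f, dtype=float)
    n_p, n_s = f.shape
    best = np.min(f, axis=1, keepdims=True)   # best measure per problem
    r = f / best                              # performance ratios r_{p,s}
    r[~np.isfinite(r)] = r_M                  # failures get the cap r_M
    if taus is None:
        taus = np.linspace(1.0, 10.0, 100)    # assumed default tau window
    rho = np.array([[np.mean(r[:, s] <= t) for s in range(n_s)]
                    for t in taus])
    return taus, rho
```

Here ρ_s(1) is the fraction of problems on which solver s is (tied for) the best, while ρ_s(τ) for large τ approaches the fraction of problems solver s solves at all.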

We now test the three functions on Example 5.1; in particular, the random problems of each size are generated 50 times. In order to obtain an overall assessment for the

             Non-monotone                             Monotone
  n     suc   iter    cpu      res           suc   iter    cpu      res
  500   10    5.000    0.251   8.864e-09     10    5.500    0.289   4.905e-07
  1000  10    5.000    0.632   2.165e-08     10    5.500    0.616   7.184e-08
  1500   9    5.000    1.224   1.537e-07      9    6.000    1.466   4.654e-09
  2000  10    5.000    2.145   1.599e-07     10    6.500    2.866   3.151e-08
  2500   9    5.000    3.519   3.897e-08     10    6.000    4.477   4.320e-08
  3000  10    5.000    5.161   9.769e-08     10    6.500    7.348   1.743e-07
  3500   7    5.000    7.415   2.226e-07     10    8.000   11.957   5.674e-07
  4000   9    5.000    9.974   5.795e-08     10    7.000   14.875   2.166e-08
  4500   8    5.000   13.075   2.374e-07     10    7.000   19.204   2.433e-08

Table 2: Comparisons of non-monotone Algorithm 4.1 and monotone Algorithm in [25] for Example 5.1

three functions, we are interested in using the number of iterations as a performance measure for Algorithm 4.1 with ϕ_1(µ, α), ϕ_2(µ, α), and ϕ_3(µ, α), respectively. The performance plot based on iteration number is presented in Figure 5. From this figure, we see that ϕ_3(µ, α) working with Algorithm 4.1 has the best numerical performance, followed by ϕ_1(µ, α). In other words, in view of "iteration numbers", we have

ϕ_3(µ, α) > ϕ_1(µ, α) > ϕ_2(µ, α),

where ">" means "better performance".

We are also interested in using the computing time as a performance measure for Algorithm 4.1 with different ϕ_i(µ, α), i = 1, 2, 3. The performance plot based on computing time is presented in Figure 6. From this figure, we can also see that the function ϕ_3(µ, α) has the best performance. In other words, in view of "computing time", we have

ϕ_3(µ, α) > ϕ_1(µ, α) > ϕ_2(µ, α),

where ">" means "better performance".

In summary, for Example 5.1, whether the number of iterations or the computing time is taken into account, the function ϕ_3(µ, α) is the best choice for Algorithm 4.1.

**Example 5.2.** Consider the system (1.5) with inequalities only, where x ∈ IR^5, K^5 = K^3 × K^2 and
\[
f(x) :=
\begin{pmatrix}
24(2x_1 - x_2)^3 + \exp(x_1 + x_3) - 4x_4 + x_5 \\
-12(2x_1 - x_2)^3 + 3(3x_2 + 5x_3)/\sqrt{1 + (3x_2 + 5x_3)^2} - 6x_4 - 7x_5 \\
-\exp(x_1 - x_3) + 5(3x_2 + 5x_3)/\sqrt{1 + (3x_2 + 5x_3)^2} - 3x_4 + 5x_5 \\
4x_1 + 6x_2 + 3x_3 - 1 \\
-x_1 + 7x_2 - 5x_3 + 2
\end{pmatrix}
\preceq_{\mathcal{K}^5} 0.
\]

This problem is taken from [17].
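For readers wishing to reproduce the experiment, the map f of Example 5.2 (as reconstructed above) and the feasibility test f(x) ⪯_{K^5} 0, i.e. −f(x) ∈ K^3 × K^2, can be sketched as below. The helper names `f_ex52`, `in_soc`, and `feasible_ex52` are ours, not from [17] or this paper.

```python
import numpy as np

def f_ex52(x):
    """Evaluate f of Example 5.2 at x in R^5 (our reconstruction)."""
    x1, x2, x3, x4, x5 = x
    t = 3 * x2 + 5 * x3
    d = np.sqrt(1 + t ** 2)
    return np.array([
        24 * (2 * x1 - x2) ** 3 + np.exp(x1 + x3) - 4 * x4 + x5,
        -12 * (2 * x1 - x2) ** 3 + 3 * t / d - 6 * x4 - 7 * x5,
        -np.exp(x1 - x3) + 5 * t / d - 3 * x4 + 5 * x5,
        4 * x1 + 6 * x2 + 3 * x3 - 1,
        -x1 + 7 * x2 - 5 * x3 + 2,
    ])

def in_soc(v):
    # v in K^m iff v_1 >= ||v_2||
    return v[0] >= np.linalg.norm(v[1:])

def feasible_ex52(x):
    # f(x) <=_{K^5} 0 with K^5 = K^3 x K^2 means -f(x) in K^3 x K^2
    y = -f_ex52(np.asarray(x, dtype=float))
    return bool(in_soc(y[:3]) and in_soc(y[3:]))
```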

Figure 5: Performance profile of iteration numbers for Example 5.1.

Figure 6: Performance profile of computing time for Example 5.1.

Example 5.2 is tested 20 times for 20 random starting points. Similar to the case of Example 5.1, besides using Algorithm 4.1 to test Example 5.2, we have also tested it using the monotone smoothing-type algorithm in [25]. From Table 3, we see that there is no big difference regarding performance between these two algorithms for Example 5.2.

Moreover, Figure 7 shows the performance profile of iteration number in Algorithm
4.1 for Example 5.2 on 100 test problems with random starting points. The three solvers
correspond to Algorithm 4.1 with ϕ_1(µ, α), ϕ_2(µ, α), and ϕ_3(µ, α), respectively. From this figure, we see that ϕ_3(µ, α) working with Algorithm 4.1 has the best numerical performance, followed by ϕ_2(µ, α). In summary, from the viewpoint of "iteration numbers", we conclude
that

*ϕ*3*(µ, α) > ϕ*2*(µ, α) > ϕ*1*(µ, α),*
*where “>” means “better performance”.*

        Non-monotone                           Monotone
  suc   iter    cpu    res           suc   iter   cpu    res
  20    13.500  0.002  5.835e-08     20    8.750  0.005  1.2510e-07

Table 3: Comparisons of non-monotone Algorithm 4.1 and monotone Algorithm in [25] for Example 5.2

Figure 7: Performance profile of iteration number for Example 5.2.

**Example 5.3.** Consider the system of equalities and inequalities (1.5), where f(x) := (f_I(x)^T, f_E(x)^T)^T, x ∈ IR^6, with
\[
f_I(x) =
\begin{pmatrix}
-x_1^4 \\
3x_2^3 + 2x_2 - x_3 - 5x_3^2 \\
-4x_2^2 - 7x_3 + 10x_3^3 \\
-x_4^3 - x_5 \\
x_5 + x_6
\end{pmatrix}
\preceq_{\mathcal{K}^5 = \mathcal{K}^3 \times \mathcal{K}^2} 0,
\qquad
f_E(x) = 2x_1 + 5x_2^2 - 3x_3^2 + 2x_4 - x_5 x_6 - 7 = 0.
\]

**Example 5.4.** Consider the system of equalities and inequalities (1.5), where f(x) := (f_I(x)^T, f_E(x)^T)^T, x ∈ IR^6, with
\[
f_I(x) =
\begin{pmatrix}
-e^{5x_1} + x_2 \\
x_2 + x_3^3 \\
-3e^{x_4} \\
5x_5 - x_6
\end{pmatrix}
\preceq_{\mathcal{K}^4 = \mathcal{K}^2 \times \mathcal{K}^2} 0,
\qquad
f_E(x) =
\begin{pmatrix}
3x_1 + e^{x_2 + x_3} - 2x_4 - 7x_5 + x_6 - 3 \\
2x_1^2 + x_2 + 3x_3 - (x_4 - x_5)^2 + 2x_6 - 13
\end{pmatrix}
= 0.
\]

**Example 5.5.** Consider the system of equalities and inequalities (1.5), where f(x) := (f_I(x)^T, f_E(x)^T)^T, x ∈ IR^7, with
\[
f_I(x) =
\begin{pmatrix}
3x_1^3 \\
x_2 - x_3 \\
-2(x_4 - 1)^2 \\
\sin(x_5 + x_6) \\
2x_6 + x_7
\end{pmatrix}
\preceq_{\mathcal{K}^5 = \mathcal{K}^2 \times \mathcal{K}^3} 0,
\qquad
f_E(x) =
\begin{pmatrix}
x_1 + x_2 + 2x_3 x_4 + \sin x_5 + \cos x_6 + 2x_7 \\
x_1^3 + x_2 + \sqrt{x_3^2 + 3} + 2x_4 + x_5 + x_6 + 6x_7
\end{pmatrix}
= 0.
\]
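Checking whether a candidate x satisfies an inequality f_I(x) ⪯_K 0 for a product cone K = K^{n_1} × ··· × K^{n_k} reduces, via the spectral values in (1.2), to verifying λ_1(−f_I(x)) ≥ 0 on each block, since a vector lies in K^n exactly when its smaller spectral value is nonnegative. A minimal sketch (our own helper names, assuming NumPy):

```python
import numpy as np

def spectral_values(x):
    """Spectral values of x = (x_1, x_2) in R x R^{n-1} w.r.t. K^n:
    lambda_i(x) = x_1 + (-1)^i ||x_2||, i = 1, 2 (cf. (1.2))."""
    x = np.asarray(x, dtype=float)
    r = np.linalg.norm(x[1:])
    return x[0] - r, x[0] + r

def satisfies_soc_inequality(fI, blocks):
    """Check fI <=_K 0 for K = K^{n_1} x ... x K^{n_k}: equivalently
    -fI in K, i.e. lambda_1(-fI_block) >= 0 for every block."""
    fI = np.asarray(fI, dtype=float)
    i = 0
    for n in blocks:
        lam1, _ = spectral_values(-fI[i:i + n])
        if lam1 < 0:
            return False
        i += n
    return True
```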

  Exam   fun   suc   c     σ       iter      cpu     res
  5.2    ϕ1    20    5     0.02     13.500   0.002   5.835e-08
  5.2    ϕ2    20    5     0.02      8.450   0.001   5.134e-07
  5.2    ϕ3    20    5     0.02      8.600   0.002   2.260e-07
  5.3    ϕ1    20    1     0.02     21.083   0.009   8.165e-07
  5.3    ϕ2    17    1     0.02     14.647   0.001   2.899e-07
  5.3    ϕ3    17    1     0.02     18.529   0.002   7.167e-07
  5.4    ϕ1    20    0.5   0.002    46.750   0.033   1.648e-07
  5.4    ϕ2     2    0.5   0.002   420.000   0.499   9.964e-07
  5.4    ϕ3     0    0.5   0.002    Fail     Fail    Fail
  5.5    ϕ1    20    0.1   0.002    14.250   0.009   6.251e-07
  5.5    ϕ2    20    0.1   0.002    13.250   0.001   6.532e-07
  5.5    ϕ3    20    0.1   0.002    12.650   0.001   6.016e-07

Table 4: Average performance of Algorithm 4.1 for Examples 5.2–5.5

Table 4 shows the numerical results for Examples 5.2–5.5 with random initializations: the smoothing function (fun) used to solve the problems, the number of runs (suc) in which Algorithm 4.1 successfully solves the generated problem, the parameters c and σ, the average iteration number (iter), the average CPU time (cpu) in seconds, and the average residual norm ∥H(z)∥ (res). Performance profiles are provided below.

Figure 8 and Figure 9 are the performance profiles in terms of iteration number for Example 5.3 and Example 5.5. From Figure 8, we see that although the function ϕ_3 has a lower probability of being the best solver, it solves a larger fraction of the problems within a large factor than the other two; in this case, the difference between the three functions is not obvious. From Figure 9, we can also see that the function ϕ_3 has the best performance.

In summary, below are our numerical observations and conclusions.

1. Algorithm 4.1 is effective. In particular, the numerical results show that our proposed method is better than the algorithm with the monotone line search studied in [25] when solving the system of inequalities under the order induced by the second-order cone.

Figure 8: Performance profile of iteration number for Example 5.3.

Figure 9: Performance profile of iteration number for Example 5.5.

2. For Examples 5.1 and 5.2, the function ϕ_3 clearly outperforms the others. For the remaining problems, the differences in their numerical performance are marginal.

3. As for future topics, it would be interesting to discover more efficient smoothing functions and to apply this type of SOC-function to other optimization problems involving second-order cones.

**References**

*[1] F. Alizadeh, D. Goldfarb, Second-order cone programming, Math. Program. 95 (2003)*
3–51.