
to appear in Applied Numerical Mathematics, 2017

A smoothing Newton method for absolute value equation associated with second-order cone

Xin-He Miao1
Department of Mathematics, Tianjin University
Tianjin 300072, China

Jian-Tao Yang2
Department of Mathematics, Tianjin University
Tianjin 300072, China

B. Saheya3
College of Mathematical Science, Inner Mongolia Normal University
Hohhot 010022, Inner Mongolia, P. R. China

Jein-Shan Chen4
Department of Mathematics, National Taiwan Normal University
Taipei 11677, Taiwan

May 20, 2016

(1st revision on November 14, 2016) (2nd revision on February 26, 2017)

Abstract. In this paper, we consider the smoothing Newton method for solving a type of absolute value equation associated with second-order cone (SOCAVE for short), which is a generalization of the standard absolute value equation frequently discussed in the literature during the past decade. Based on a class of smoothing functions, we reformulate the SOCAVE as a family of parameterized smooth equations and propose a smoothing Newton algorithm to solve the problem iteratively. Moreover, the algorithm is proved to be locally quadratically convergent under suitable conditions. Preliminary numerical results demonstrate that the algorithm is effective. In addition, two kinds of numerical comparisons are presented, which provide numerical evidence about why the smoothing Newton method is employed and also suggest a suitable smoothing function for future numerical implementations. Finally, we point out that although the main idea for proving the convergence is similar to the one used in the literature, the analysis is indeed more subtle and involves more techniques due to the features of the second-order cone.

1 E-mail: xinhemiao@tju.edu.cn. The author's work is supported by National Natural Science Foundation of China (No. 11471241).

2 E-mail: zzlyyjt@163.com.

3 E-mail: saheya@imnu.edu.cn. The author's work is supported by Natural Science Foundation of Inner Mongolia (Award Number: 2014MS0119).

4 E-mail: jschen@math.ntnu.edu.tw. The author's work is supported by Ministry of Science and Technology, Taiwan.

Keywords. Second-order cone, absolute value equations, smoothing Newton algorithm.

1 Introduction

The standard absolute value equation (AVE) is in the form of

$$Ax + B|x| = b, \qquad (1)$$

where A ∈ IR^{n×n}, B ∈ IR^{n×n}, B ≠ 0, and b ∈ IR^n. Here |x| means the componentwise absolute value of the vector x ∈ IR^n. When B = −I, where I is the identity matrix, the AVE (1) reduces to the special form

$$Ax - |x| = b.$$

It is known that the AVE (1) was first introduced by Rohn in [38] and recently has been investigated by many researchers, for example, Caccetta, Qu and Zhou [1], Hu and Huang [14], Jiang and Zhang [22], Ketabchi and Moosaei [23], Mangasarian [25, 26, 27, 28, 29, 30, 31, 32], Mangasarian and Meyer [34], Prokopyev [35], and Rohn [40].

In particular, Mangasarian and Meyer [34] show that the AVE (1) is equivalent to the bilinear program, the generalized LCP (linear complementarity problem), and the standard LCP provided 1 is not an eigenvalue of A. With these equivalent reformulations, they also show that the AVE (1) is NP-hard in its general form and provide existence results. Prokopyev [35] further improves the above equivalence by showing that the AVE (1) can be equivalently recast as an LCP without any assumption on A and B, and also provides a relationship with mixed integer programming. In general, if solvable, the AVE (1) can have either a unique solution or multiple (e.g., exponentially many) solutions.

Indeed, various sufficiency conditions on solvability and non-solvability of the AVE (1) with unique and multiple solutions are discussed in [34, 35, 39]. Some variants of the AVE, like the absolute value equation associated with second-order cone and the absolute


value programs, are investigated in [16] and [41], respectively.

In this paper, we target another type of absolute value equation which is a natural extension of the standard AVE (1), namely the following absolute value equation associated with second-order cones, abbreviated as SOCAVE:

$$Ax + B|x| = b, \qquad (2)$$

where A, B ∈ IR^{n×n} and b ∈ IR^n are the same as those in (1); |x| denotes the absolute value of x coming from the square root of the Jordan product "◦" of x and x. What is the difference between the standard AVE (1) and the SOCAVE (2)? Their mathematical formats look the same. In fact, the main difference is that |x| in the standard AVE (1) means the componentwise |x_i| of each x_i ∈ IR, i.e., |x| = (|x_1|, |x_2|, · · · , |x_n|)^T ∈ IR^n; however, |x| in the SOCAVE (2) denotes the vector $\sqrt{x^2} := \sqrt{x \circ x}$ associated with the second-order cone under the Jordan product. To understand its meaning, we need to introduce the definition of the second-order cone (SOC). The second-order cone in IR^n (n ≥ 1), also called the Lorentz cone, is defined as

$$K^n := \left\{ (x_1, x_2) \in IR \times IR^{n-1} \;\middle|\; \|x_2\| \le x_1 \right\},$$

where ‖·‖ denotes the Euclidean norm. If n = 1, then K^n is the set of nonnegative reals IR_+. In general, a second-order cone K could be the Cartesian product of SOCs, i.e.,

$$K := K^{n_1} \times \cdots \times K^{n_r}.$$
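For readers who wish to experiment numerically, the membership test follows directly from the definition. The following is a minimal NumPy sketch; the helper name `in_soc` and its tolerance argument are ours, not from the paper.

```python
import numpy as np

def in_soc(x, tol=0.0):
    """True if x = (x1, x2) lies in K^n, i.e., ||x2|| <= x1 (up to a tolerance)."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x[1:]) <= x[0] + tol

# Example: (2, 1, -1) is in K^3 since ||(1, -1)|| = sqrt(2) <= 2.
print(in_soc([2.0, 1.0, -1.0]))   # True
print(in_soc([1.0, 2.0, 2.0]))    # False
```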

For simplicity, we focus on the single SOC K^n because all the analysis can be carried over to the setting of Cartesian products. The SOC is a special case of symmetric cones and can be analyzed under the Jordan product; see [11]. In particular, for any two vectors x = (x_1, x_2) ∈ IR × IR^{n−1} and y = (y_1, y_2) ∈ IR × IR^{n−1}, the Jordan product of x and y associated with K^n is defined as

$$x \circ y := \begin{bmatrix} x^T y \\ y_1 x_2 + x_1 y_2 \end{bmatrix}.$$

The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of optimization problems involving SOC; see [3, 10, 12] and references therein for more details. The identity element under this Jordan product is e = (1, 0, . . . , 0)^T ∈ IR^n. With these definitions, x² means the Jordan product of x with itself, i.e., x² := x ◦ x; and √x with x ∈ K^n denotes the unique vector such that √x ◦ √x = x. In other words, the vector |x| in the SOCAVE (2) is computed by

$$|x| := \sqrt{x \circ x}.$$

As mentioned earlier, the significance of the AVE (1) arises from the fact that the AVE is capable of formulating many optimization problems (see also [26, 30, 32, 34, 35]), such as linear programs, quadratic programs, bimatrix games, and so on. Moreover, the absolute value equation is equivalent to the linear complementarity problem [34]. Accordingly, we see that the SOCAVE (2) plays a similar role in various optimization problems involving second-order cones. For solving the standard AVE (1), many numerical methods have been proposed in the literature (see [1, 21, 22, 25, 26, 27, 35, 43]). As for the SOCAVE (2), Hu, Huang and Zhang [16] propose a generalized Newton method for solving it. It is well known that smoothing-type algorithms are a powerful tool for solving many optimization problems, for example, the linear and nonlinear complementarity problems [3, 12, 19, 20, 24] and systems of equalities and inequalities [17, 42]. In this paper, we are interested in a smoothing Newton method for solving the SOCAVE (2). Our numerical results also support that the smoothing Newton method is a better way than the generalized Newton method employed in [16]. That is why we adopt this algorithm as the main tool for our numerical implementations. In addition, we show that the proposed smoothing Newton method is locally quadratically convergent under suitable conditions. We report some preliminary numerical results to show that the method is efficient. Moreover, numerical comparisons based on various values of p are presented as well.

To close this section, we say a few words about notations and the organization of this paper. As usual, IR^n denotes the space of n-dimensional real column vectors; IR_+ and IR_{++} denote the nonnegative and positive reals, respectively. For any x, y ∈ IR^n, the Euclidean inner product is denoted by ⟨x, y⟩ = x^T y, and the Euclidean norm is ‖x‖ = √⟨x, x⟩. This paper is organized as follows. In Section 2, we briefly describe some concepts and properties of the second-order cone. Besides, we review the Jordan product and the spectral decomposition for elements in IR^n. In Section 3, we introduce a smoothing function of the absolute value |x| and study its Jacobian matrix. In Section 4, we propose a smoothing Newton algorithm for solving the SOCAVE (2) and discuss the convergence of the proposed method under suitable conditions. In Section 5, preliminary numerical results and numerical comparisons are given.

2 Preliminaries

In this section, we recall some basic concepts and background materials regarding the second-order cone, which will be extensively used in the subsequent analysis. More details can be found in [3, 10, 11, 12, 16]. First, we recall the spectral decomposition of x with respect to the SOC. For x = (x_1, x_2) ∈ IR × IR^{n−1}, the spectral decomposition of x with respect to the SOC is given by

$$x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \qquad (3)$$


where λ_i(x) = x_1 + (−1)^i ‖x_2‖ for i = 1, 2 and

$$u_x^{(i)} = \begin{cases} \dfrac{1}{2}\left(1,\; (-1)^i\, \dfrac{x_2^T}{\|x_2\|}\right)^T & \text{if } \|x_2\| \neq 0, \\[2mm] \dfrac{1}{2}\left(1,\; (-1)^i\, \omega^T\right)^T & \text{if } \|x_2\| = 0, \end{cases} \qquad (4)$$

with ω ∈ IR^{n−1} being any vector satisfying ‖ω‖ = 1. The two scalars λ_1(x) and λ_2(x) are called the spectral values of x, while the two vectors u_x^{(1)} and u_x^{(2)} are called the spectral vectors of x. Moreover, it is obvious that the spectral decomposition of x ∈ IR^n is unique if x_2 ≠ 0.
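As a computational aid, the spectral decomposition (3)-(4) takes only a few lines. The sketch below assumes NumPy; the function name and return convention are our own choices.

```python
import numpy as np

def spectral_decomposition(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, as in (3)-(4)."""
    x = np.asarray(x, dtype=float)
    x1, x2 = x[0], x[1:]
    nx2 = np.linalg.norm(x2)
    if nx2 > 0:
        w = x2 / nx2
    else:
        w = np.zeros_like(x2)
        if w.size > 0:
            w[0] = 1.0                         # any unit vector works when x2 = 0
    lam = np.array([x1 - nx2, x1 + nx2])       # lambda_1(x) <= lambda_2(x)
    u1 = 0.5 * np.concatenate(([1.0], -w))     # u_x^(1)
    u2 = 0.5 * np.concatenate(([1.0],  w))     # u_x^(2)
    return lam, u1, u2

# Sanity check: x = lambda_1 u^(1) + lambda_2 u^(2).
x = np.array([0.5, 2.0, -1.0])
lam, u1, u2 = spectral_decomposition(x)
print(np.allclose(x, lam[0] * u1 + lam[1] * u2))   # True
```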

Lemma 2.1. For any x = (x_1, x_2) ∈ IR × IR^{n−1} with the spectral decomposition given as in (3)-(4), the following results hold.

(a) u_x^{(1)} ◦ u_x^{(2)} = 0 and u_x^{(i)} ◦ u_x^{(i)} = u_x^{(i)} for i = 1, 2;

(b) ‖u_x^{(1)}‖² = ‖u_x^{(2)}‖² = 1/2 and ‖x‖² = (1/2)(λ_1²(x) + λ_2²(x)).

Proof. The properties can be verified directly or can be found in [3, 10, 11, 12, 16]. □

Next, we talk about the projection onto the second-order cone. We let x_+ be the projection of x onto the SOC K^n, and x_− be the projection of −x onto the dual cone (K^n)^* of K^n, where the dual cone (K^n)^* is defined by (K^n)^* := {y ∈ IR^n | ⟨x, y⟩ ≥ 0, ∀x ∈ K^n}. In fact, the dual cone of K^n is itself, i.e., (K^n)^* = K^n. Due to the special structure of the SOC K^n, the explicit formula of the projection of x = (x_1, x_2) ∈ IR × IR^{n−1} onto K^n is obtained in [3, 10, 11, 12, 13] as below:

$$x_+ = \begin{cases} x & \text{if } x \in K^n, \\ 0 & \text{if } x \in -K^n, \\ u & \text{otherwise}, \end{cases} \quad \text{where } u = \begin{bmatrix} \dfrac{x_1 + \|x_2\|}{2} \\[2mm] \dfrac{x_1 + \|x_2\|}{2} \cdot \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$

Similarly, the expression of x_− is in the form of

$$x_- = \begin{cases} 0 & \text{if } x \in K^n, \\ -x & \text{if } x \in -K^n, \\ w & \text{otherwise}, \end{cases} \quad \text{where } w = \begin{bmatrix} -\dfrac{x_1 - \|x_2\|}{2} \\[2mm] \dfrac{x_1 - \|x_2\|}{2} \cdot \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$


Together with the spectral decomposition of x, it is shown that x = x_+ − x_− and that x_+ and x_− have the forms

$$x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)}$$

and

$$x_- = (-\lambda_1(x))_+\, u_x^{(1)} + (-\lambda_2(x))_+\, u_x^{(2)},$$

where (α)_+ = max{0, α} for α ∈ IR.
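Following the spectral formulas above, the projections become one-liners on top of the `spectral_decomposition` helper sketched earlier (again, the function names are ours):

```python
def proj_soc(x):
    """x_+ = (lambda_1)_+ u^(1) + (lambda_2)_+ u^(2): the projection of x onto K^n."""
    lam, u1, u2 = spectral_decomposition(x)
    return max(lam[0], 0.0) * u1 + max(lam[1], 0.0) * u2

def proj_soc_minus(x):
    """x_- = (-lambda_1)_+ u^(1) + (-lambda_2)_+ u^(2): projection of -x onto (K^n)* = K^n."""
    lam, u1, u2 = spectral_decomposition(x)
    return max(-lam[0], 0.0) * u1 + max(-lam[1], 0.0) * u2
```

A quick check of the decomposition x = x_+ − x_−: for any x, `proj_soc(x) - proj_soc_minus(x)` should reproduce x.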

Next, we talk about the expression of |x| associated with the SOC. There is an alternative way, via the so-called SOC-function, to obtain the expression of |x|, which can be found in [2, 4]. More specifically, for any x ∈ IR^n, we define the absolute value |x| of x with respect to the SOC as |x| := x_+ + x_−. In fact, in the setting of SOC, the form |x| = x_+ + x_− is equivalent to the form |x| = √(x ◦ x). Combining the above expressions of x_+ and x_−, it can be verified that the absolute value |x| is in the form of

$$|x| = \left[(\lambda_1(x))_+ + (-\lambda_1(x))_+\right] u_x^{(1)} + \left[(\lambda_2(x))_+ + (-\lambda_2(x))_+\right] u_x^{(2)} = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)}.$$
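In code, the absolute value is therefore as cheap as one spectral decomposition; a small sketch building on the helpers above:

```python
def abs_soc(x):
    """|x| = |lambda_1(x)| u^(1) + |lambda_2(x)| u^(2), equivalently x_+ + x_-."""
    lam, u1, u2 = spectral_decomposition(x)
    return abs(lam[0]) * u1 + abs(lam[1]) * u2

# Consistency check against |x| = x_+ + x_-:
x = np.array([0.5, 2.0, -1.0])
print(np.allclose(abs_soc(x), proj_soc(x) + proj_soc_minus(x)))   # True
```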

To end this section, we point out the relation between the SOCAVE and the SOCLCP (second-order cone linear complementarity problem). In [16], it was shown that the SOCAVE (2) is equivalent to the following SOCLCP: find x, y ∈ IR^n such that

$$Mx + Py = c, \quad \text{and} \quad x \in K^n, \; y \in K^n, \; \langle x, y \rangle = 0,$$

where M, P ∈ IR^{n×n} are matrices and c ∈ IR^n. However, the above is not a standard SOCLCP because of the equation Mx + Py = c therein. As below, we show that the SOCAVE (2) can be further converted into a standard SOCLCP.

Theorem 2.1. The SOCAVE (2) can be reduced to the second-order cone linear complementarity problem (SOCLCP):

$$v \in K^n \times K^n \times K^n, \quad w = Qv + q \in K^n \times K^n \times K^n \quad \text{and} \quad \langle v, w \rangle = 0, \qquad (5)$$

where

$$Q := \begin{bmatrix} -I & 2I & 0 \\ A & B - A & 0 \\ -A & A - B & 0 \end{bmatrix}, \quad v := \begin{bmatrix} 2x_+ \\ |x| \\ 0 \end{bmatrix} \quad \text{and} \quad q := \begin{bmatrix} 0 \\ -b \\ b \end{bmatrix}. \qquad (6)$$

Proof. By looking into (6), we have

$$w = Qv + q = \begin{bmatrix} 2x_- \\ Ax + B|x| - b \\ -Ax - B|x| + b \end{bmatrix}.$$

Plugging this into the SOCLCP (5) implies that

$$Ax + B|x| - b \in K^n \quad \text{and} \quad -Ax - B|x| + b \in K^n.$$

Since K^n is pointed, it follows that Ax + B|x| − b = 0. On the other hand, the above argument is reversible. Thus, we show that the SOCAVE (2) is equivalent to the second-order cone linear complementarity problem. □

Remark 2.1. From Theorem 2.1, it follows that we can also solve the SOCAVE (2) by employing many efficient algorithms for solving the SOCLCP (5). Nonetheless, when we apply the Newton method to solve the SOCLCP, it still needs to be reformulated as smooth or nonsmooth equations. This means that we would need two reformulations if we followed this route. In view of this, in this paper we reformulate the SOCAVE (2) directly as smooth equations and solve them by a smoothing Newton method.

3 Smoothing functions associated with SOCAVE

In this paper, we employ the smoothing Newton method for solving the SOCAVE (2). To this end, we need to adopt a smoothing function. Due to the non-differentiability of |α| for α ∈ IR, we consider a class of smoothing functions for the absolute value function |α|. More specifically, we define the function φ_p(·, ·) : IR² → IR as

$$\phi_p(a, b) := \sqrt[p]{|a|^p + |b|^p}, \quad p > 1. \qquad (7)$$

This class of functions is extracted from the so-called generalized Fischer-Burmeister function $\phi_p(a, b) = \sqrt[p]{|a|^p + |b|^p} - (a + b)$, which is heavily studied in many references [5, 6, 7, 8, 9, 15]. For convenience, we still use the notation φ_p even though it is no longer exactly the same as the generalized Fischer-Burmeister function.

Lemma 3.1. Let φ_p : IR² → IR be defined as in (7). Then, the following hold.

(a) φ_p(a, 0) = |a| and φ_p(0, b) = |b|;

(b) φ_p(·, ·) is Lipschitz continuous on IR²;

(c) φ_p(·, ·) is strongly semismooth on IR²;

(d) φ_p(a, b) is continuously differentiable for any (a, b) ≠ (0, 0) ∈ IR² with

$$\frac{\partial \phi_p(a, b)}{\partial a} = \frac{\mathrm{sgn}(a)\,|a|^{p-1}}{(\phi_p(a, b))^{p-1}} \quad \text{and} \quad \frac{\partial \phi_p(a, b)}{\partial b} = \frac{\mathrm{sgn}(b)\,|b|^{p-1}}{(\phi_p(a, b))^{p-1}},$$

where the function sgn(·) is defined by

$$\mathrm{sgn}(\alpha) := \begin{cases} 1 & \text{if } \alpha > 0, \\ 0 & \text{if } \alpha = 0, \\ -1 & \text{if } \alpha < 0. \end{cases}$$


Proof. Please refer to [5, 6, 7, 8, 9, 15] for a proof. □

According to Lemma 3.1, for any fixed b ∈ IR, we have φ_p(a, b) → |b| as a → 0. Therefore, combining the spectral decomposition of x and the function φ_p, we define a vector-valued smoothing function Φ_p : IR × IR^n → IR^n as

$$\Phi_p(\mu, x) = \phi_p(\mu, \lambda_1(x))\, u_x^{(1)} + \phi_p(\mu, \lambda_2(x))\, u_x^{(2)} = \sqrt[p]{|\mu|^p + |\lambda_1(x)|^p}\; u_x^{(1)} + \sqrt[p]{|\mu|^p + |\lambda_2(x)|^p}\; u_x^{(2)},$$

where µ ∈ IR is a parameter, and λ_1(x), λ_2(x) are the spectral values of x. From Lemma 3.1, it is easy to verify that

$$\lim_{\mu \to 0} \Phi_p(\mu, x) = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)} = |x|.$$
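A direct implementation of φ_p and Φ_p makes this limiting behavior easy to observe numerically; the sketch below reuses the `spectral_decomposition` and `abs_soc` helpers introduced earlier.

```python
def phi_p(a, b, p):
    """Smoothing kernel (7): phi_p(a, b) = (|a|^p + |b|^p)^(1/p)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p)

def Phi_p(mu, x, p):
    """Phi_p(mu, x) = phi_p(mu, lambda_1) u^(1) + phi_p(mu, lambda_2) u^(2)."""
    lam, u1, u2 = spectral_decomposition(x)
    return phi_p(mu, lam[0], p) * u1 + phi_p(mu, lam[1], p) * u2

# As mu -> 0, Phi_p(mu, x) approaches |x|:
x = np.array([0.5, 2.0, -1.0])
for mu in (1e-1, 1e-4, 1e-8):
    print(mu, np.linalg.norm(Phi_p(mu, x, p=2) - abs_soc(x)))
```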

In other words, the function Φ_p(µ, x) is a uniformly smoothing function of |x| associated with the SOC. With this function, for the SOCAVE (2), we further define a function H(µ, x) : IR × IR^n → IR × IR^n by

$$H(\mu, x) = \begin{bmatrix} \mu \\ Ax + B\Phi_p(\mu, x) - b \end{bmatrix}, \quad \forall \mu \in IR, \; x \in IR^n. \qquad (8)$$

Then, we observe that

$$H(\mu, x) = 0 \iff \mu = 0 \text{ and } Ax + B\Phi_p(\mu, x) - b = 0 \iff Ax + B|x| - b = 0 \text{ and } \mu = 0.$$

This indicates that x is a solution to the SOCAVE (2) if and only if (µ, x) is a solution to the equation H(µ, x) = 0. In fact, we often choose µ ∈ IR_{++}. Applying Lemma 3.1 again, it is not difficult to show that the function H(µ, x) is continuously differentiable on IR_{++} × IR^n. From direct calculation, we can also obtain the explicit formula of the Jacobian matrix of the function H as below:

$$H'(\mu, x) = \begin{bmatrix} 1 & 0 \\[1mm] B\,\dfrac{\partial \Phi_p(\mu, x)}{\partial \mu} & A + B\,\dfrac{\partial \Phi_p(\mu, x)}{\partial x} \end{bmatrix} \qquad (9)$$

for all (µ, x) ∈ IR_{++} × IR^n with x = (x_1, x_2) ∈ IR × IR^{n−1}, where

$$\frac{\partial \Phi_p(\mu, x)}{\partial \mu} = \frac{\partial \phi_p(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi_p(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)} = \frac{\mu^{p-1}}{[\phi_p(\mu, \lambda_1(x))]^{p-1}}\, u_x^{(1)} + \frac{\mu^{p-1}}{[\phi_p(\mu, \lambda_2(x))]^{p-1}}\, u_x^{(2)}$$

and

$$\frac{\partial \Phi_p(\mu, x)}{\partial x} = \begin{cases} \dfrac{\mathrm{sgn}(x_1)\,|x_1|^{p-1}}{\left[\sqrt[p]{\mu^p + |x_1|^p}\right]^{p-1}}\, I & \text{if } x_2 = 0, \\[5mm] \begin{bmatrix} b & c\,\dfrac{x_2^T}{\|x_2\|} \\[2mm] c\,\dfrac{x_2}{\|x_2\|} & aI + (b - a)\,\dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a = \frac{\phi_p(\mu, \lambda_2(x)) - \phi_p(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)},$$

$$b = \frac{1}{2}\left[\frac{\mathrm{sgn}(\lambda_2(x))\,|\lambda_2(x)|^{p-1}}{[\phi_p(\mu, \lambda_2(x))]^{p-1}} + \frac{\mathrm{sgn}(\lambda_1(x))\,|\lambda_1(x)|^{p-1}}{[\phi_p(\mu, \lambda_1(x))]^{p-1}}\right], \qquad (10)$$

$$c = \frac{1}{2}\left[\frac{\mathrm{sgn}(\lambda_2(x))\,|\lambda_2(x)|^{p-1}}{[\phi_p(\mu, \lambda_2(x))]^{p-1}} - \frac{\mathrm{sgn}(\lambda_1(x))\,|\lambda_1(x)|^{p-1}}{[\phi_p(\mu, \lambda_1(x))]^{p-1}}\right].$$
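For completeness, here is a sketch of these partial derivatives in NumPy, valid for µ > 0 and following the formulas above; the function name and return convention (a vector for ∂Φ_p/∂µ and a matrix for ∂Φ_p/∂x) are our choices.

```python
def jacobian_Phi(mu, x, p):
    """Partial derivatives of Phi_p at (mu, x) with mu > 0, per the formulas above."""
    x = np.asarray(x, dtype=float)
    n = x.size
    lam, u1, u2 = spectral_decomposition(x)
    phi1, phi2 = phi_p(mu, lam[0], p), phi_p(mu, lam[1], p)
    dmu = (mu ** (p - 1) / phi1 ** (p - 1)) * u1 + (mu ** (p - 1) / phi2 ** (p - 1)) * u2

    def g(t, phi):                         # sgn(t) |t|^(p-1) / phi^(p-1)
        return np.sign(t) * abs(t) ** (p - 1) / phi ** (p - 1)

    x2 = x[1:]
    nx2 = np.linalg.norm(x2)
    if nx2 == 0.0:
        dx = g(x[0], phi_p(mu, x[0], p)) * np.eye(n)
    else:
        a = (phi2 - phi1) / (lam[1] - lam[0])
        b = 0.5 * (g(lam[1], phi2) + g(lam[0], phi1))
        c = 0.5 * (g(lam[1], phi2) - g(lam[0], phi1))
        w = x2 / nx2
        dx = np.zeros((n, n))
        dx[0, 0] = b
        dx[0, 1:] = c * w
        dx[1:, 0] = c * w
        dx[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return dmu, dx
```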

4 Smoothing Newton method

In this section, we investigate the smoothing algorithm based on the smoothing function Φ_p(µ, x) for solving the SOCAVE (2), and show the convergence properties of the considered algorithm. First, we present the generic framework of the smoothing algorithm.

Algorithm 4.1. (A Smoothing Newton Algorithm)

Step 0 Choose δ ∈ (0, 1), σ ∈ (0, 1), and µ_0 ∈ IR_{++}, x^0 ∈ IR^n. Set z^0 := (µ_0, x^0) and e := (1, 0) ∈ IR × IR^n. Choose β > 1 satisfying min{1, ‖H(z^0)‖}² ≤ βµ_0. Set k := 0.

Step 1 If ‖H(z^k)‖ = 0, stop. Otherwise, set τ_k := min{1, ‖H(z^k)‖}.

Step 2 Compute Δz^k = (Δµ_k, Δx^k) ∈ IR × IR^n by

$$H(z^k) + H'(z^k)\,\Delta z^k = \frac{1}{\beta}\,\tau_k^2\, e, \qquad (11)$$

where H'(z^k) denotes the Jacobian matrix of H at z^k = (µ_k, x^k) given by (9).

Step 3 Let α_k be the maximum of the values 1, δ, δ², · · · such that

$$\left\|H(z^k + \alpha_k \Delta z^k)\right\| \le \left[1 - \sigma\left(1 - \frac{1}{\beta}\right)\alpha_k\right] \left\|H(z^k)\right\|. \qquad (12)$$

Step 4 Set z^{k+1} := z^k + α_k Δz^k and k := k + 1. Go to Step 1.
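The following NumPy sketch assembles the pieces above into Algorithm 4.1 for a single cone K^n. It is an illustrative implementation under our own naming and with a safeguard on the line search, not the authors' Matlab code; the parameter defaults mirror the choices reported in Section 5.

```python
def smoothing_newton(A, B, b, p=2, mu0=0.1, x0=None,
                     delta=0.5, sigma=1e-5, tol=1e-6, max_iter=100):
    """Smoothing Newton method (Algorithm 4.1) for the SOCAVE Ax + B|x| = b."""
    n = b.size
    x = np.random.rand(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    mu = mu0
    e = np.zeros(n + 1)
    e[0] = 1.0                                   # e = (1, 0) in IR x IR^n

    def H(mu, x):
        return np.concatenate(([mu], A @ x + B @ Phi_p(mu, x, p) - b))

    Hz = H(mu, x)
    beta = max(1.0, 1.01 * min(1.0, np.linalg.norm(Hz)) ** 2 / mu0)
    for _ in range(max_iter):
        normH = np.linalg.norm(Hz)
        if normH <= tol:
            break
        tau = min(1.0, normH)
        dmu, dx = jacobian_Phi(mu, x, p)
        J = np.zeros((n + 1, n + 1))             # Jacobian H'(z) as in (9)
        J[0, 0] = 1.0
        J[1:, 0] = B @ dmu
        J[1:, 1:] = A + B @ dx
        dz = np.linalg.solve(J, (tau ** 2 / beta) * e - Hz)   # Newton equation (11)
        alpha = 1.0                              # backtracking line search (12)
        while np.linalg.norm(H(mu + alpha * dz[0], x + alpha * dz[1:])) > \
                (1.0 - sigma * (1.0 - 1.0 / beta) * alpha) * normH:
            alpha *= delta
            if alpha < 1e-12:                    # safeguard, not in the paper
                break
        mu += alpha * dz[0]
        x += alpha * dz[1:]
        Hz = H(mu, x)
    return x, mu
```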

In order to show that Algorithm 4.1 is well defined, we have to prove that the Newton equation (11) is solvable and that the line search (12) is well defined. To this end, we need the next two technical lemmas.

Lemma 4.1. For any M, N ∈ IR^{n×n}, σ_min(M) > σ_max(N) if and only if σ_min(M^T M) > σ_max(N^T N). In addition, if σ_min(M^T M) > σ_max(N^T N), then M^T M − N^T N is positive definite. Here σ_min(M) denotes the minimum singular value of M, and σ_max(N) denotes the maximum singular value of N.


Proof. The proof is straightforward or can be found in standard textbooks of matrix analysis, so we omit it here. □

Lemma 4.2. Let A, S ∈ IR^{n×n} and A be symmetric. Suppose that the eigenvalues of A and SS^T are arranged in non-increasing order. Then, for each k = 1, 2, · · · , n, there exists a nonnegative real number θ_k such that

$$\lambda_{\min}(SS^T) \le \theta_k \le \lambda_{\max}(SS^T) \quad \text{and} \quad \lambda_k(SAS^T) = \theta_k\, \lambda_k(A).$$

Proof. Please see [18, Corollary 4.5.11] for a proof. □

In order to show that the Jacobian matrix H'(µ, x) in the Newton equation (11) is nonsingular for any µ > 0, we need the following assumption:

Assumption 4.1. For the SOCAVE (2), it holds that σ_min(A) > σ_max(B).

In fact, under the condition of Assumption 4.1, the SOCAVE (2) has a unique solution, which is verified in [33].
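Assumption 4.1 is easy to test numerically for a given instance, e.g. via the SVD (a small sketch with our own helper name):

```python
def assumption_4_1_holds(A, B):
    """Check sigma_min(A) > sigma_max(B) via singular values."""
    smin_A = np.linalg.svd(A, compute_uv=False).min()
    smax_B = np.linalg.svd(B, compute_uv=False).max()
    return smin_A > smax_B
```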

Theorem 4.1. Let H be defined as in (8). Suppose that Assumption 4.1 holds. Then, the Jacobian matrix H'(µ, x) in the Newton equation (11) is nonsingular for any µ > 0.

Proof. From the expression of H'(µ, x) given as in (9), we know that H'(µ, x) is nonsingular if and only if the matrix A + B(∂Φ_p(µ, x)/∂x) is nonsingular. Thus, it suffices to show that the matrix A + B(∂Φ_p(µ, x)/∂x) is nonsingular. Suppose not, i.e., there exists a vector 0 ≠ v ∈ IR^n such that

$$\left(A + B\,\frac{\partial \Phi_p(\mu, x)}{\partial x}\right) v = 0.$$

This implies that

$$v^T A^T A\, v = v^T \left(\frac{\partial \Phi_p(\mu, x)}{\partial x}\right)^T B^T B \left(\frac{\partial \Phi_p(\mu, x)}{\partial x}\right) v. \qquad (13)$$

For convenience, we denote C := ∂Φ_p(µ, x)/∂x. Then, it follows that v^T A^T A v = v^T C^T B^T B C v. By Lemma 4.2, there exists a constant θ̂ such that

$$\lambda_{\min}(C^T C) \le \hat{\theta} \le \lambda_{\max}(C^T C) \quad \text{and} \quad \lambda_{\max}(C^T B^T B C) = \hat{\theta}\, \lambda_{\max}(B^T B).$$

Note that if we can prove that 0 ≤ λ_min(C^T C) ≤ λ_max(C^T C) ≤ 1, then λ_max(C^T B^T B C) ≤ λ_max(B^T B). Then, by the assumption that the minimum singular value of A strictly exceeds the maximum singular value of B, and applying Lemma 4.1, we obtain v^T A^T A v > v^T C^T B^T B C v. This contradicts the formula (13), which shows that the Jacobian matrix H'(µ, x) in the Newton equation (11) is nonsingular for µ > 0.


Thus, as discussed above, we only need to prove 0 ≤ λ_min(C^T C) ≤ λ_max(C^T C) ≤ 1. For x_2 = 0, we compute that

$$C = \frac{\mathrm{sgn}(x_1)\,|x_1|^{p-1}}{\left[\sqrt[p]{\mu^p + |x_1|^p}\right]^{p-1}}\, I.$$

Then, it is clear that 0 < λ(C^T C) < 1 for µ > 0. For x_2 ≠ 0, using the fact that the matrix M^T M is always positive semidefinite for any matrix M ∈ IR^{m×n}, we see that the inequality λ_min(C^T C) ≥ 0 always holds. In order to prove that λ_max(C^T C) ≤ 1, we need to further prove that the matrix I − C^T C is positive semidefinite. To see this, note that

$$I - C^T C = \begin{bmatrix} 1 - b^2 - c^2 & -2bc\,\dfrac{x_2^T}{\|x_2\|} \\[2mm] -2bc\,\dfrac{x_2}{\|x_2\|} & (1 - a^2)I + (a^2 - b^2 - c^2)\,\dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}.$$

Because

$$b^2 + c^2 = \frac{1}{2}\left[\frac{|\lambda_2(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_2(x))]^{2(p-1)}} + \frac{|\lambda_1(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_1(x))]^{2(p-1)}}\right] < \frac{1}{2}\cdot 2 = 1$$

for µ > 0, we have 1 − b² − c² > 0. Moreover, the Schur complement of 1 − b² − c² has the form of

$$(1 - a^2)I + (a^2 - b^2 - c^2)\frac{x_2 x_2^T}{\|x_2\|^2} - \frac{4b^2c^2}{1 - b^2 - c^2}\,\frac{x_2 x_2^T}{\|x_2\|^2} = (1 - a^2)\left(I - \frac{x_2 x_2^T}{\|x_2\|^2}\right) + \left(1 - b^2 - c^2 - \frac{4b^2c^2}{1 - b^2 - c^2}\right)\frac{x_2 x_2^T}{\|x_2\|^2}. \qquad (14)$$

On the other hand, since |λ_i(x)| < φ_p(µ, λ_i(x)) (i = 1, 2) for µ > 0, we have

$$\left|\phi_p(\mu, \lambda_2(x)) - \phi_p(\mu, \lambda_1(x))\right| = \frac{\left|\,|\lambda_2(x)|^p - |\lambda_1(x)|^p\,\right|}{\displaystyle\sum_{i=1}^{p} [\phi_p(\mu, \lambda_2(x))]^{p-i}\, [\phi_p(\mu, \lambda_1(x))]^{i-1}} = \frac{\left|\,(|\lambda_2(x)| - |\lambda_1(x)|) \displaystyle\sum_{i=1}^{p} |\lambda_2(x)|^{p-i}\, |\lambda_1(x)|^{i-1}\,\right|}{\displaystyle\sum_{i=1}^{p} [\phi_p(\mu, \lambda_2(x))]^{p-i}\, [\phi_p(\mu, \lambda_1(x))]^{i-1}} < \left|\,|\lambda_2(x)| - |\lambda_1(x)|\,\right| \le |\lambda_2(x) - \lambda_1(x)|.$$

This together with (10) implies that 1 − a² > 0 for any µ > 0. In addition, for any µ > 0, we observe that

$$(1 - b^2 - c^2)^2 - 4b^2c^2 = \left(1 - (b - c)^2\right)\left(1 - (b + c)^2\right) = \left[1 - \frac{|\lambda_1(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_1(x))]^{2(p-1)}}\right] \cdot \left[1 - \frac{|\lambda_2(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_2(x))]^{2(p-1)}}\right] > 0,$$


where the inequality holds due to |λ_i(x)| < φ_p(µ, λ_i(x)) for i = 1, 2 and µ > 0. With all of these, we see that the Schur complement of 1 − b² − c² given as in (14) is a positive linear combination of the matrices I − x_2x_2^T/‖x_2‖² and x_2x_2^T/‖x_2‖², which yields that the Schur complement (14) of 1 − b² − c² is positive semidefinite. Hence, the matrix I − C^T C is also positive semidefinite, which is equivalent to saying 0 ≤ λ_min(C^T C) ≤ λ_max(C^T C) ≤ 1. Thus, the proof is complete. □

Theorem 4.1 indicates that the Newton equation (11) in Algorithm 4.1 is solvable. It paves the way to show that the line search (12) in Algorithm 4.1 is well defined, which is given in Theorem 4.2 below. Indeed, since the proof is very similar to the one in [17, Remark 2.1 (v)], we only state the result and omit its proof.

Theorem 4.2. Suppose that Assumption 4.1 holds. Then, for Δz ∈ IR × IR^n given by (11), the line search (12) is well defined.

Next, we discuss the convergence of Algorithm 4.1. To this end, we need the following results whose arguments are similar to the ones in [17, Remark 2.1].

Theorem 4.3. Let H be defined as in (8). Suppose that Assumption 4.1 holds and that the sequence {z^k} is generated by Algorithm 4.1. Then, the following results hold.

(a) The sequences {‖H(z^k)‖} and {τ_k} are monotonically non-increasing.

(b) βµ_k ≥ τ_k² for all k.

(c) The sequence {µ_k} is monotonically non-increasing and µ_k > 0 for all k.

(d) The sequence {z^k} is bounded.

Proof. (a) From the definition of the line search in (12) and τ_k := min{1, ‖H(z^k)‖}, it is clear that {‖H(z^k)‖} and {τ_k} are monotonically non-increasing.

(b) We prove this conclusion by induction. First, by Algorithm 4.1, it is clear that τ_0² ≤ βµ_0 with τ_0, β and µ_0 chosen in Algorithm 4.1. Secondly, we suppose that τ_k² ≤ βµ_k for some k. Then, for k + 1, we have

$$\mu_{k+1} - \frac{\tau_{k+1}^2}{\beta} = \mu_k + \alpha_k \Delta\mu_k - \frac{\tau_{k+1}^2}{\beta} = (1 - \alpha_k)\mu_k + \alpha_k\,\frac{\tau_k^2}{\beta} - \frac{\tau_{k+1}^2}{\beta} \ge (1 - \alpha_k)\frac{\tau_k^2}{\beta} + \alpha_k\,\frac{\tau_k^2}{\beta} - \frac{\tau_{k+1}^2}{\beta} \ge 0,$$

where the second equality holds due to the Newton equation (11), and the second inequality holds due to part (a). Hence, it follows that βµ_k ≥ τ_k² for all k.

(c) From the iterative scheme z^{k+1} = z^k + α_k Δz^k, we know µ_{k+1} = µ_k + α_k Δµ_k. By the Newton equation (11) and the line search (12) again, it follows that

$$\mu_{k+1} = (1 - \alpha_k)\mu_k + \alpha_k\,\frac{\tau_k^2}{\beta} \ge (1 - \alpha_k)\frac{\tau_k^2}{\beta} + \alpha_k\,\frac{\tau_k^2}{\beta} > 0$$

for all k. On the other hand, we have

$$\mu_{k+1} = (1 - \alpha_k)\mu_k + \alpha_k\,\frac{\tau_k^2}{\beta} \le (1 - \alpha_k)\mu_k + \alpha_k \mu_k \le \mu_k,$$

where the first inequality holds due to part (b). Hence, the sequence {µ_k} is monotonically non-increasing and µ_k > 0 for all k.

(d) From part (a), we know the sequence {‖H(z^k)‖} is bounded. Thus, there is a constant C such that ‖H(z^k)‖ ≤ C. In addition, since

$$4\left\|\lambda_1(x^k)u_x^{(1)} + \lambda_2(x^k)u_x^{(2)}\right\|^2 - \frac{\sqrt[p]{4}}{4}\left(|\lambda_1(x^k)| + |\lambda_2(x^k)|\right)^2 = \frac{1}{4}\left[\left(8 - 2\sqrt[p]{4}\right)\left(|\lambda_1(x^k)|^2 + |\lambda_2(x^k)|^2\right) + \sqrt[p]{4}\left(|\lambda_1(x^k)| - |\lambda_2(x^k)|\right)^2\right] > 0 \quad (\forall\, p > 1),$$

it follows that

$$\begin{aligned}
\|H(z^k)\| &\ge \left\|Ax^k + B\Phi_p(\mu_k, x^k) - b\right\| \\
&\ge \|Ax^k\| - \|B\Phi_p(\mu_k, x^k)\| - \|b\| \\
&= \sqrt{(x^k)^T A^T A\, x^k} - \sqrt{[\Phi_p(\mu_k, x^k)]^T B^T B\, \Phi_p(\mu_k, x^k)} - \|b\| \\
&\ge \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)\,\|\Phi_p(\mu_k, x^k)\|^2} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)\left[\phi_p^2(\mu_k, \lambda_1(x^k))\|u_x^{(1)}\|^2 + \phi_p^2(\mu_k, \lambda_2(x^k))\|u_x^{(2)}\|^2\right]} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)\,\frac{1}{2}\left[\left(\sqrt[p]{\mu_k^p + |\lambda_1(x^k)|^p}\right)^2 + \left(\sqrt[p]{\mu_k^p + |\lambda_2(x^k)|^p}\right)^2\right]} - \|b\| \\
&\ge \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\cdot\sqrt{\frac{1}{2}\left[\left(\mu_k^2 + |\lambda_1(x^k)|^2 + \sqrt[p]{4}\,\mu_k|\lambda_1(x^k)|\right) + \left(\mu_k^2 + |\lambda_2(x^k)|^2 + \sqrt[p]{4}\,\mu_k|\lambda_2(x^k)|\right)\right]} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\,\sqrt{\mu_k^2 + \frac{1}{2}|\lambda_1(x^k)|^2 + \frac{1}{2}|\lambda_2(x^k)|^2 + \frac{\sqrt[p]{4}}{2}\,\mu_k\left(|\lambda_1(x^k)| + |\lambda_2(x^k)|\right)} - \|b\| \\
&\ge \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\,\sqrt{\mu_k^2 + \frac{1}{2}|\lambda_1(x^k)|^2 + \frac{1}{2}|\lambda_2(x^k)|^2 + 2\mu_k\left\|\lambda_1(x^k)u_x^{(1)} + \lambda_2(x^k)u_x^{(2)}\right\|} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\left(\mu_k + \left\|\lambda_1(x^k)u_x^{(1)} + \lambda_2(x^k)u_x^{(2)}\right\|\right) - \|b\| \\
&= \left(\sqrt{\lambda_{\min}(A^T A)} - \sqrt{\lambda_{\max}(B^T B)}\right)\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\,\mu_k - \|b\|.
\end{aligned}$$

This together with ‖H(z^k)‖ ≤ C implies that

$$\|x^k\| \le \frac{C + \sqrt{\lambda_{\max}(B^T B)}\,\mu_k + \|b\|}{\sqrt{\lambda_{\min}(A^T A)} - \sqrt{\lambda_{\max}(B^T B)}}$$

holds for all k. Thus, the sequence {x^k} is bounded. □

Theorem 4.4. Suppose that Assumption 4.1 holds and that {z^k} is generated by Algorithm 4.1. Then, any accumulation point of {z^k} is a solution to the SOCAVE (2).

Proof. From Theorem 4.3 (d), we know the sequence {z^k} is bounded. Hence, there exists at least one accumulation point of the sequence {z^k}. Without loss of generality, let lim_{k→∞} z^k := z* = (µ*, x*). Then, it follows that H* := H(z*) = lim_{k→∞} H(z^k) and τ* := min{1, ‖H*‖} = lim_{k→∞} min{1, ‖H(z^k)‖}. Now, we will show H* = 0. Suppose not, i.e., ‖H*‖ > 0. To proceed, we discuss two cases according to whether lim_{k→∞} α_k = 0 or α_k ≥ α̂ > 0 with α̂ ∈ IR_{++}.

Case 1: lim_{k→∞} α_k = 0. Then, from the line search (12), for the number ᾱ_k := α_k/δ with all sufficiently large k, we have

$$\left\|H(z^k + \bar{\alpha}_k \Delta z^k)\right\| > \left[1 - \sigma\left(1 - \frac{1}{\beta}\right)\bar{\alpha}_k\right]\left\|H(z^k)\right\|.$$

Furthermore, this leads to

$$\frac{\|H(z^k + \bar{\alpha}_k \Delta z^k)\| - \|H(z^k)\|}{\bar{\alpha}_k} > -\sigma\left(1 - \frac{1}{\beta}\right)\|H(z^k)\|. \qquad (15)$$


Besides, from Theorem 4.3 (b) and (c), we know µ* ≥ (τ*)²/β > 0. It follows that the function H is continuously differentiable at the point z*. Taking k → ∞ in the formula (15), we have

$$\frac{\langle H(z^*), H'(z^*)\Delta z^*\rangle}{\|H(z^*)\|} \ge -\sigma\left(1 - \frac{1}{\beta}\right)\|H(z^*)\|. \qquad (16)$$

Combining this with the Newton equation (11) yields

$$\frac{\langle H(z^*), H'(z^*)\Delta z^*\rangle}{\|H(z^*)\|} = \frac{(\tau^*)^2}{\beta\|H(z^*)\|}\langle H(z^*), e\rangle - \|H(z^*)\| \le \frac{(\tau^*)^2\|H(z^*)\|}{\beta\|H(z^*)\|} - \|H(z^*)\| \le \frac{\tau^*}{\beta} - \|H(z^*)\| \le \left(\frac{1}{\beta} - 1\right)\|H(z^*)\|, \qquad (17)$$

where the first inequality holds due to the Hölder inequality ⟨H(z*), e⟩ ≤ ‖H(z*)‖‖e‖ = ‖H(z*)‖, and the second and third inequalities hold due to τ* = min{1, ‖H(z*)‖}. Putting (16) and (17) together gives 1/β − 1 ≥ −σ(1 − 1/β). This contradicts σ ∈ (0, 1) and β > 1.

Case 2: α_k ≥ α̂ > 0 for all k. From the line search (12), we have

$$\|H(z^{k+1})\| \le \left[1 - \sigma\left(1 - \frac{1}{\beta}\right)\hat{\alpha}\right]\|H(z^k)\| = \|H(z^k)\| - \sigma\left(1 - \frac{1}{\beta}\right)\hat{\alpha}\,\|H(z^k)\|.$$

Then, it follows from the boundedness of {‖H(z^k)‖} that $\sum_{k=0}^{\infty} \sigma(1 - \frac{1}{\beta})\hat{\alpha}\,\|H(z^k)\|$ is bounded. Moreover, we have lim_{k→∞} ‖H(z^k)‖ = 0, i.e., ‖H*‖ = 0. This contradicts ‖H*‖ > 0.

Hence, from all the above, we show H(z*) = 0. That is, the element x* is a solution of the SOCAVE (2). The proof is complete. □

Now, we show the local quadratic convergence of Algorithm 4.1. In fact, we can achieve the following result by arguments similar to those in [37, Theorem 8]. For completeness, we also provide a detailed proof.

Theorem 4.5. Let H be defined as in (8) and z* be the unique solution to the SOCAVE (2). Suppose that Assumption 4.1 holds and that all V ∈ ∂H(z*) are nonsingular. Then, the whole sequence {z^k} converges to z*, and ‖z^{k+1} − z*‖ = O(‖z^k − z*‖²).

Proof. Since z* is the solution to the SOCAVE (2), using Assumption 4.1 and applying Theorem 4.1 yield that the Jacobian matrix H'(z^k) is nonsingular for all z^k sufficiently close to z*. On the other hand, applying the condition that all V ∈ ∂H(z*) are nonsingular and from [36, Proposition 3.1], we have ‖H'(z^k)^{-1}‖ = O(1) for all z^k sufficiently close to z*. Because z* is the solution to the SOCAVE (2), it is clear that z* is a solution of H(z) = 0. In addition, since the function H is strongly semismooth, it follows that

$$\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\| = O(\|z^k - z^*\|^2).$$

Thus, we have

$$\begin{aligned}
\left\|z^k + \Delta z^k - z^*\right\| &= \left\|z^k + H'(z^k)^{-1}\left(-H(z^k) + \frac{1}{\beta}\tau_k^2\, e\right) - z^*\right\| \\
&\le \left\|H'(z^k)^{-1}\left[-H(z^k) + H'(z^k)(z^k - z^*)\right]\right\| + \left\|H'(z^k)^{-1}\,\frac{1}{\beta}\tau_k^2\, e\right\| \\
&\le \left\|H'(z^k)^{-1}\right\|\left\|-H(z^k) + H'(z^k)(z^k - z^*)\right\| + O(1)\,\frac{1}{\beta}\tau_k^2 \\
&= O\!\left(\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\|\right) + O\!\left(\|H(z^k)\|^2\right) \\
&= O(\|z^k - z^*\|^2) + O(\|z^k - z^*\|^2) \\
&= O(\|z^k - z^*\|^2),
\end{aligned}$$

where the first equality holds due to the Newton equation (11), and the final equalities hold since H(z*) = 0 and the function H is locally Lipschitz continuous near z*. The proof is complete. □

5 Numerical Results

This section is devoted to the numerical results. First, we show the numerical comparison between the smoothing Newton algorithm and the generalized Newton method. This provides numerical evidence about why we adopt the smoothing Newton algorithm, not the generalized Newton algorithm, in this paper. Secondly, we use the performance profile to depict the comparison among different values of p. This shows that the smoothing Newton algorithm is not strongly affected when p is perturbed. Moreover, a suitable smoothing function from the class of smoothing functions is suggested in view of the numerical comparisons.

5.1 Smoothing Newton algorithm vs Generalized Newton method

In this subsection, for fixed p = 2, we provide some numerical examples to evaluate the efficiency of Algorithm 4.1. In our tests, we choose the parameters

µ_0 = 0.1, x^0 = rand(n, 1), δ = 0.5, σ = 10^{-5} and β = max(1, 1.01·τ_0²/µ_0).

We stop the iterations when ‖H(z^k)‖ ≤ 10^{-6} or the number of iterations exceeds 100. All the experiments are done on a PC with an Intel(R) CPU of 2.40GHz and 4.00GB of RAM,


and all the program codes are written in Matlab and run in the Matlab environment. We consider the following four problems, and solve them by using the Smoothing Newton Algorithm 4.1 (SN for short) and the Generalized Newton method (GN for short) introduced in [16], respectively. Illustrative examples further demonstrate the superiority of our proposed algorithm.

Problem 5.1. Consider the SOCAVE (2) which is generated in the following way: first, choose two random matrices B, C ∈ IR^{n×n} with every element drawn from a uniform distribution on [−10, 10]. We compute the maximal singular value σ_1 of B and the minimal singular value σ_2 of C, and let σ := min{1, σ_2/σ_1}. Next, we divide C by σ multiplied by a random number in the interval [0, 1], and the resulting matrix is denoted as A. Accordingly, the minimum singular value of A exceeds the maximal singular value of B, and hence (by Assumption 4.1) the resulting SOCAVE (2) is solvable. We choose b ∈ IR^n randomly with every element in [0, 1]. The initial point is chosen in the range [0, 1] entry-wise. Note that a similar way to construct the problem was given in [16].
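A NumPy sketch of this construction (our reading of the recipe above; the generator name is ours) could look as follows:

```python
def make_problem_5_1(n, rng=None):
    """Random SOCAVE data (A, B, b) in the spirit of Problem 5.1."""
    rng = np.random.default_rng() if rng is None else rng
    B = rng.uniform(-10.0, 10.0, (n, n))
    C = rng.uniform(-10.0, 10.0, (n, n))
    sigma1 = np.linalg.svd(B, compute_uv=False).max()   # sigma_max(B)
    sigma2 = np.linalg.svd(C, compute_uv=False).min()   # sigma_min(C)
    sigma = min(1.0, sigma2 / sigma1)
    A = C / (sigma * rng.uniform(0.0, 1.0))             # sigma_min(A) > sigma_max(B)
    b = rng.uniform(0.0, 1.0, n)
    return A, B, b

# Example usage with the solver sketched in Section 4:
A, B, b = make_problem_5_1(100)
x, mu = smoothing_newton(A, B, b, p=2, x0=np.random.rand(100))
```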

Table 1: Numerical results for Problem 5.1

               SN                                           GN
n      ares       itn  time    maxit minit fails    ares       itn  time    maxit minit fails
100    8.618e-08  2.8  0.078   3     2     0        9.992e-08  2.8  0.349   3     2     0
200    4.901e-08  2.6  0.051   3     2     0        6.904e-10  2.9  0.134   3     2     0
300    1.574e-08  2.7  0.122   3     2     0        3.779e-09  2.9  0.231   3     2     0
400    3.041e-09  2.7  0.232   3     2     0        9.155e-08  2.7  0.326   3     2     0
500    1.778e-07  2.2  0.300   3     2     0        1.445e-07  2.6  0.421   3     2     0
600    1.385e-07  2.5  0.498   3     2     0        5.626e-08  2.8  0.844   3     2     0
700    2.578e-07  2.4  0.668   3     2     0        1.527e-08  2.6  1.334   3     2     0
800    2.356e-07  2.1  0.771   3     2     0        6.846e-08  2.6  1.905   3     2     0
900    2.420e-08  2.5  1.031   3     2     0        1.272e-09  2.7  2.685   3     2     0
1000   4.718e-08  2.5  1.193   3     2     0        1.135e-07  2.7  3.691   3     2     0
1500   2.027e-07  2.3  1.919   3     2     0        6.417e-08  2.6  13.369  3     2     0
2000   3.121e-08  2.2  3.892   3     2     0        1.015e-07  2.5  32.982  3     2     0
2500   1.565e-07  2.1  6.625   3     2     0        3.940e-08  2.5  53.510  3     2     0
3000   1.028e-07  2.3  12.340  3     2     0        1.293e-07  2.5  87.910  3     2     0

Problem 5.2. Consider the SOCAVE (2) which is generated in the following way: choose two random matrices C, D ∈ IR^{n×n} with every element drawn from a uniform distribution on [−10, 10], and compute their singular value decompositions C := U_1 S_1 V_1^T and D := U_2 S_2 V_2^T with diagonal matrices S_1 and S_2 and unitary matrices U_1, V_1, U_2 and V_2. Then, we choose random vectors b, c ∈ IR^n with every element in [0, 10]. Next, we take a ∈ IR^n by setting a_i = c_i + 10 for all i ∈ {1, . . . , n}, so that a ≥ b. Set A := U_1 Diag(a) V_1^T and B := U_2 Diag(b) V_2^T, where Diag(x) denotes a diagonal matrix with its i-th diagonal element being x_i. The gap between the minimal singular value of A and the maximal singular value of B is limited and can be very small. We choose the right-hand side b ∈ IR^n randomly in [0, 10]. The initial point is chosen in the range [0, 1] entry-wise.

Table 2: Numerical results for Problem 5.2

               SN                                           GN
n      ares       itn  time    maxit minit fails    ares       itn  time     maxit minit fails
100    2.884e-07  4.2  0.050   5     4     0        1.920e-07  4.4  0.134    5     4     0
200    4.556e-07  4.3  0.067   5     4     0        2.637e-07  4.6  0.346    5     4     0
300    2.805e-07  4.5  0.172   5     4     0        3.522e-07  4.4  0.615    5     4     0
400    2.453e-07  4.6  0.312   5     4     0        2.617e-07  4.6  0.863    5     4     0
500    1.809e-13  5.0  0.516   5     5     0        1.037e-07  4.8  1.440    5     4     0
600    1.870e-07  4.8  0.680   5     4     0        3.414e-12  5.0  2.346    5     5     0
700    2.550e-13  5.0  0.880   5     5     0        6.571e-08  4.9  3.535    5     4     0
800    2.868e-13  5.0  1.083   5     5     0        1.606e-07  4.8  5.317    5     4     0
900    7.559e-08  4.9  1.201   5     4     0        2.485e-07  4.7  7.596    5     4     0
1000   3.595e-13  5.0  1.572   5     5     0        1.662e-07  4.8  10.552   5     4     0
1500   5.412e-13  5.0  4.196   5     5     0        1.782e-11  5.0  34.400   5     5     0
2000   7.230e-13  5.0  8.962   5     5     0        2.851e-11  5.0  79.108   5     5     0
2500   8.893e-13  5.0  17.207  5     5     0        4.451e-11  5.0  146.769  5     5     0
3000   1.054e-12  5.0  29.175  5     5     0        6.119e-11  5.0  247.029  5     5     0

Problem 5.3. Consider the SOCAVE (2) which is generated in the following way: choose two random matrices A, B ∈ IR^{n×n} with every element drawn from a uniform distribution on [−10, 10]. In order to ensure that the SOCAVE (2) is solvable, we update the matrix A as follows: let [U, S, V] = svd(A). If min{S(i, i)} = 0 for i = 1, · · · , n, we set A = U(S + 0.01E)V, and then

$$A = \frac{\lambda_{\max}(B^T B) + 0.01}{\lambda_{\min}(A^T A)}\, A.$$

We choose b ∈ IR^n randomly with every element in [0, 10]. The initial point is chosen in the range [0, 1] entry-wise.

Problem 5.4. We consider the SOCAVE (2) which is generated in the same way as Problem 5.1, but here the SOC is given by K := K^{n_1} × · · · × K^{n_r}, where n_1 = · · · = n_r = n/r.

The above Problems 5.1–5.4 are all generated randomly. Below, as suggested by the reviewer, we consider a real application problem. It is well known that the second-order cone linear complementarity problem (SOCLCP) has various applications in engineering,


Table 3: Numerical results for Problem 5.3

               SN                                           GN
n      ares       itn  time    maxit minit fails    ares       itn  time    maxit minit fails
100    7.928e-10  3.0  0.048   3     3     0        2.085e-08  3.0  0.075   3     3     0
200    9.461e-10  3.0  0.062   3     3     0        4.297e-09  3.0  0.108   3     3     0
300    2.388e-10  3.0  0.122   3     3     0        5.843e-08  2.9  0.237   3     2     0
400    5.780e-11  3.0  0.236   3     3     0        3.841e-08  2.8  0.379   3     2     0
500    1.133e-08  2.9  0.360   3     2     0        1.183e-09  2.9  0.501   3     2     0
600    2.655e-08  2.9  0.566   3     2     0        1.225e-10  3.0  0.627   3     3     0
700    2.202e-11  3.0  0.807   3     3     0        2.525e-10  3.0  0.978   3     3     0
800    8.893e-08  2.8  0.975   3     2     0        2.563e-10  3.0  1.576   3     3     0
900    1.818e-08  2.9  1.240   3     2     0        2.505e-10  3.0  2.374   3     3     0
1000   6.951e-10  3.0  1.502   3     3     0        3.247e-10  3.0  3.367   3     3     0
1500   4.225e-08  2.9  2.482   3     2     0        4.245e-10  3.0  11.625  3     3     0
2000   6.979e-08  2.6  4.683   3     2     0        1.705e-09  3.0  27.704  3     3     0
2500   9.459e-10  2.9  9.441   3     2     0        1.376e-09  3.0  53.306  3     3     0
3000   5.624e-08  2.9  15.765  3     2     0        1.943e-08  2.8  91.226  3     2     0

control, finance, robust optimization and combinatorial optimization, since the KKT system of a second-order cone program can be recast as an SOCLCP. In general, the SOCLCP is to find x, y ∈ IR^n such that

$$Mx + Py = c, \quad x \in K, \; y \in K, \; x^T y = 0, \qquad (18)$$

where M, P ∈ IR^{n×n} are given matrices and c ∈ IR^n is a given vector. From [16, Theorem 1.1], we know that the SOCLCP (18) is equivalent to the SOCAVE (2). In view of this, the next experiment is on this case.

Problem 5.5. Consider the SOCLCP with P = −I, which is generated in the following way: first, we generate a matrix B and a vector b as those given in Problem 5.1. Then, let d be a random number in [0, 1]. We set M := BB^T + (1 + d)I and c := 0.5(M(b + |b|) + |b| − b) to ensure the solvability of the SOCLCP. We test the above SOCLCP by casting it into an SOCAVE according to [16, Theorem 1.1], i.e., we implement the corresponding SOCAVE with A = M + I, B = M − I and b = 2c. Moreover, the initial point is chosen in the range [0, 1] entry-wise.
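A sketch of this construction (again with hypothetical helper names, reusing `abs_soc` from Section 2):

```python
def make_problem_5_5(n, rng=None):
    """SOCLCP with P = -I cast as an SOCAVE, following Problem 5.5."""
    rng = np.random.default_rng() if rng is None else rng
    Bmat = rng.uniform(-10.0, 10.0, (n, n))
    bvec = rng.uniform(0.0, 1.0, n)
    d = rng.uniform(0.0, 1.0)
    M = Bmat @ Bmat.T + (1.0 + d) * np.eye(n)
    absb = abs_soc(bvec)
    c = 0.5 * (M @ (bvec + absb) + absb - bvec)
    # SOCAVE data per [16, Theorem 1.1]: A = M + I, B = M - I, b = 2c
    return M + np.eye(n), M - np.eye(n), 2.0 * c
```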

In our experiments, every set of simulations for every problem is randomly generated ten times, and the numerical results are listed in Tables 1–5, respectively. In Tables 1–5, n denotes the size of the testing problem; ares denotes the average value of ‖H(z^k)‖ when the test stops; itn denotes the average value of the iteration numbers; time denotes


Table 4: Numerical results for Problem 5.4

                    SN                                           GN
n      r    ares       itn  time    maxit minit fails    ares       itn  time    maxit minit fails
1000   2    9.933e-08  2.4  1.318   3     2     0        2.995e-10  2.9  3.627   3     2     0
1000   4    1.174e-07  2.5  1.245   3     2     0        1.594e-08  2.6  3.106   3     2     0
1000   5    1.056e-07  2.4  1.293   3     2     0        9.657e-08  2.7  3.115   3     2     0
1000   10   3.380e-13  5.0  1.791   5     5     0        3.971e-08  2.5  3.218   3     2     0
1000   20   3.360e-13  5.0  2.103   5     5     0        5.291e-08  2.7  3.181   3     2     0
2000   2    1.971e-08  2.6  5.084   3     2     0        2.494e-08  2.6  28.888  3     2     0
2000   4    1.047e-07  2.3  4.270   3     2     0        5.363e-08  2.6  29.002  3     2     0
2000   5    1.257e-07  2.5  4.813   3     2     0        1.360e-08  2.8  29.055  3     2     0
2000   10   6.689e-13  5.0  10.463  5     5     0        1.360e-08  2.8  29.055  3     3     0
2000   20   6.653e-13  5.0  11.255  5     5     0        1.360e-08  2.8  29.055  3     4     0
3000   2    1.560e-07  2.1  12.312  3     2     0        2.496e-07  2.5  90.699  3     2     0
3000   4    1.162e-07  2.5  14.457  3     2     0        1.609e-07  2.3  89.813  3     2     0
3000   5    3.156e-07  2.2  12.995  3     2     0        6.872e-0   2.4  88.921  3     2     0
3000   10   9.922e-13  5.0  32.011  5     5     0        1.688e-07  2.4  90.041  3     2     0
3000   20   1.016e-12  5.0  33.877  5     5     0        1.411e-08  2.5  88.949  3     2     0

the average value of the CPU time in seconds; maxit and minit denote the maximal and minimal values of the iteration numbers, respectively; and fails denotes the number of failed tests. From the numerical results presented in Tables 1–5, it is easy to see that the proposed smoothing Newton method is effective for solving all the simulated SOCAVE problems. For the SOCLCP, although the smoothing Newton method performs slightly worse than the generalized Newton method, the difference is marginal. To sum up, both approaches are competitive and can be employed to solve the SOCAVE.

5.2 Numerical Comparisons with different values of p

In this subsection, we observe the numerical comparison of Algorithm 4.1 with different values of p. In particular, we consider the performance profile introduced in [44] as a means of comparison. In other words, we regard Algorithm 4.1 corresponding to each p = 1.1, 2, 3, 10, 20, 80 as a solver, and assume that there are n_s solvers and n_q test problems from the test set P, which is generated randomly. We are interested in using the computing time as a performance measure for Algorithm 4.1 with different p. For each problem q and solver s, let

f_{q,s} = computing time required to solve problem q by solver s.

We employ the performance ratio

$$r_{q,s} := \frac{f_{q,s}}{\min\{f_{q,s} : s \in S\}},$$
