
to appear in Applied Numerical Mathematics, 2017

A smoothing Newton method for absolute value equation associated with second-order cone

Xin-He Miao1
Department of Mathematics, Tianjin University
Tianjin 300072, China

Jian-Tao Yang2
Department of Mathematics, Tianjin University
Tianjin 300072, China

B. Saheya3
College of Mathematical Science, Inner Mongolia Normal University
Hohhot 010022, Inner Mongolia, P. R. China

Jein-Shan Chen4
Department of Mathematics, National Taiwan Normal University
Taipei 11677, Taiwan

May 20, 2016

(1st revision on November 14, 2016) (2nd revision on February 26, 2017)

Abstract. In this paper, we consider the smoothing Newton method for solving a type of absolute value equation associated with second-order cone (SOCAVE for short), which is a generalization of the standard absolute value equation frequently discussed in the literature during the past decade. Based on a class of smoothing functions, we reformulate the SOCAVE as a family of parameterized smooth equations and propose a smoothing Newton algorithm to solve the problem iteratively. Moreover, the algorithm is proved to be locally quadratically convergent under suitable conditions. Preliminary numerical results demonstrate that the algorithm is effective. In addition, two kinds of numerical comparisons are presented, which provide numerical evidence about why the smoothing Newton method is employed and also suggest a suitable smoothing function for future numerical implementations. Finally, we point out that although the main idea for proving the convergence is similar to the one used in the literature, the analysis is indeed more subtle and involves more techniques due to the features of the second-order cone.

1 E-mail: xinhemiao@tju.edu.cn. The author's work is supported by National Natural Science Foundation of China (No. 11471241).

2 E-mail: zzlyyjt@163.com.

3 E-mail: saheya@imnu.edu.cn. The author's work is supported by Natural Science Foundation of Inner Mongolia (Award Number: 2014MS0119).

4 E-mail: jschen@math.ntnu.edu.tw. The author's work is supported by Ministry of Science and Technology, Taiwan.

Keywords. Second-order cone, absolute value equations, smoothing Newton algorithm.

1 Introduction

The standard absolute value equation (AVE) is in the form of

$$Ax + B|x| = b, \qquad (1)$$

where A ∈ IR^{n×n}, B ∈ IR^{n×n}, B ≠ 0, and b ∈ IR^n. Here |x| means the componentwise absolute value of the vector x ∈ IR^n. When B = −I, where I is the identity matrix, the AVE (1) reduces to the special form

$$Ax - |x| = b.$$

It is known that the AVE (1) was first introduced by Rohn in [38] and recently has been investigated by many researchers, for example, Caccetta, Qu and Zhou [1], Hu and Huang [14], Jiang and Zhang [22], Ketabchi and Moosaei [23], Mangasarian [25, 26, 27, 28, 29, 30, 31, 32], Mangasarian and Meyer [34], Prokopyev [35], and Rohn [40].

In particular, Mangasarian and Meyer [34] show that the AVE (1) is equivalent to the bilinear program, the generalized LCP (linear complementarity problem), and the standard LCP provided 1 is not an eigenvalue of A. With these equivalent reformulations, they also show that the AVE (1) is NP-hard in its general form and provide existence results. Prokopyev [35] further improves the above equivalence by showing that the AVE (1) can be equivalently recast as an LCP without any assumption on A and B, and also provides a relationship with mixed integer programming. In general, if solvable, the AVE (1) can have either a unique solution or multiple (e.g., exponentially many) solutions.

Indeed, various sufficiency conditions on solvability and non-solvability of the AVE (1) with unique and multiple solutions are discussed in [34, 35, 39]. Some variants of the AVE, like the absolute value equation associated with second-order cone and the absolute


value programs, are investigated in [16] and [41], respectively.

In this paper, we target another type of absolute value equation which is a natural extension of the standard AVE (1), namely the following absolute value equation associated with second-order cones, abbreviated as SOCAVE:

$$Ax + B|x| = b, \qquad (2)$$

where A, B ∈ IR^{n×n} and b ∈ IR^n are the same as those in (1); |x| denotes the absolute value of x coming from the square root of the Jordan product "◦" of x and x. What is the difference between the standard AVE (1) and the SOCAVE (2)? Their mathematical formats look the same. In fact, the main difference is that |x| in the standard AVE (1) means the componentwise |x_i| of each x_i ∈ IR, i.e., |x| = (|x_1|, |x_2|, · · · , |x_n|)^T ∈ IR^n; however, |x| in the SOCAVE (2) denotes the vector $\sqrt{x^2} := \sqrt{x \circ x}$ associated with the second-order cone under the Jordan product. To understand its meaning, we need to introduce the definition of the second-order cone (SOC). The second-order cone in IR^n (n ≥ 1), also called the Lorentz cone, is defined as

$$K^n := \left\{ (x_1, x_2) \in IR \times IR^{n-1} \;\middle|\; \|x_2\| \le x_1 \right\},$$

where ‖·‖ denotes the Euclidean norm. If n = 1, then K^n is the set of nonnegative reals IR_+. In general, a second-order cone K could be the Cartesian product of SOCs, i.e.,

$$K := K^{n_1} \times \cdots \times K^{n_r}.$$
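For readers who wish to experiment numerically, the membership test follows directly from the definition. The following is a minimal NumPy sketch; the helper name `in_soc` and its tolerance argument are ours, not from the paper.

```python
import numpy as np

def in_soc(x, tol=0.0):
    """True if x = (x1, x2) lies in K^n, i.e., ||x2|| <= x1 (up to a tolerance)."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x[1:]) <= x[0] + tol

# Example: (2, 1, -1) is in K^3 since ||(1, -1)|| = sqrt(2) <= 2.
print(in_soc([2.0, 1.0, -1.0]))   # True
print(in_soc([1.0, 2.0, 2.0]))    # False
```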

For simplicity, we focus on the single SOC K^n because all the analysis can be carried over to the setting of Cartesian products. The SOC is a special case of symmetric cones and can be analyzed under the Jordan product; see [11]. In particular, for any two vectors x = (x_1, x_2) ∈ IR × IR^{n−1} and y = (y_1, y_2) ∈ IR × IR^{n−1}, the Jordan product of x and y associated with K^n is defined as

$$x \circ y := \begin{bmatrix} x^T y \\ y_1 x_2 + x_1 y_2 \end{bmatrix}.$$

The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of optimization problems involving SOC; see [3, 10, 12] and references therein for more details. The identity element under this Jordan product is e = (1, 0, . . . , 0)^T ∈ IR^n. With these definitions, x² means the Jordan product of x with itself, i.e., x² := x ◦ x; and √x with x ∈ K^n denotes the unique vector such that √x ◦ √x = x. In other words, the vector |x| in the SOCAVE (2) is computed by

$$|x| := \sqrt{x \circ x}.$$

As mentioned earlier, the significance of the AVE (1) arises from the fact that the AVE is capable of formulating many optimization problems (see also [26, 30, 32, 34, 35]), such as linear programs, quadratic programs, bimatrix games, and so on. Moreover, the absolute value equation is equivalent to the linear complementarity problem [34]. Accordingly, we see that the SOCAVE (2) plays a similar role in various optimization problems involving second-order cones. For solving the standard AVE (1), many numerical methods have been proposed in the literature (see [1, 21, 22, 25, 26, 27, 35, 43]). As for the SOCAVE (2), Hu, Huang and Zhang [16] propose a generalized Newton method for solving it. It is well known that smoothing-type algorithms are a powerful tool for solving many optimization problems, for example, the linear and nonlinear complementarity problems [3, 12, 19, 20, 24] and systems of equalities and inequalities [17, 42]. In this paper, we are interested in a smoothing Newton method for solving the SOCAVE (2). Our numerical results also support that the smoothing Newton method is a better way than the generalized Newton method employed in [16]. That is why we adopt this algorithm as the main tool for our numerical implementations. In addition, we show that the proposed smoothing Newton method is locally quadratically convergent under suitable conditions. We report some preliminary numerical results to show that the method is efficient. Moreover, numerical comparisons based on various values of p are presented as well.

To close this section, we say a few words about notations and the organization of this paper. As usual, IR^n denotes the space of n-dimensional real column vectors; IR_+ and IR_{++} denote the nonnegative and positive reals, respectively. For any x, y ∈ IR^n, the Euclidean inner product is denoted by ⟨x, y⟩ = x^T y, and the Euclidean norm is ‖x‖ = √⟨x, x⟩. This paper is organized as follows. In Section 2, we briefly describe some concepts and properties of the second-order cone. Besides, we review the Jordan product and the spectral decomposition for elements in IR^n. In Section 3, we introduce a smoothing function of the absolute value |x| and study its Jacobian matrix. In Section 4, we propose a smoothing Newton algorithm for solving the SOCAVE (2) and discuss the convergence of the proposed method under suitable conditions. In Section 5, preliminary numerical results and numerical comparisons are given.

2 Preliminaries

In this section, we recall some basic concepts and background materials regarding the second-order cone, which will be extensively used in the subsequent analysis. More details can be found in [3, 10, 11, 12, 16]. First, we recall the spectral decomposition of x with respect to the SOC. For x = (x_1, x_2) ∈ IR × IR^{n−1}, the spectral decomposition of x with respect to the SOC is given by

$$x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \qquad (3)$$


where λ_i(x) = x_1 + (−1)^i ‖x_2‖ for i = 1, 2 and

$$u_x^{(i)} = \begin{cases} \dfrac{1}{2}\left(1,\; (-1)^i\, \dfrac{x_2^T}{\|x_2\|}\right)^T & \text{if } \|x_2\| \neq 0, \\[2mm] \dfrac{1}{2}\left(1,\; (-1)^i\, \omega^T\right)^T & \text{if } \|x_2\| = 0, \end{cases} \qquad (4)$$

with ω ∈ IR^{n−1} being any vector satisfying ‖ω‖ = 1. The two scalars λ_1(x) and λ_2(x) are called the spectral values of x, while the two vectors u_x^{(1)} and u_x^{(2)} are called the spectral vectors of x. Moreover, it is obvious that the spectral decomposition of x ∈ IR^n is unique if x_2 ≠ 0.
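As a computational aid, the spectral decomposition (3)-(4) takes only a few lines. The sketch below assumes NumPy; the function name and return convention are our own choices.

```python
import numpy as np

def spectral_decomposition(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, as in (3)-(4)."""
    x = np.asarray(x, dtype=float)
    x1, x2 = x[0], x[1:]
    nx2 = np.linalg.norm(x2)
    if nx2 > 0:
        w = x2 / nx2
    else:
        w = np.zeros_like(x2)
        if w.size > 0:
            w[0] = 1.0                         # any unit vector works when x2 = 0
    lam = np.array([x1 - nx2, x1 + nx2])       # lambda_1(x) <= lambda_2(x)
    u1 = 0.5 * np.concatenate(([1.0], -w))     # u_x^(1)
    u2 = 0.5 * np.concatenate(([1.0],  w))     # u_x^(2)
    return lam, u1, u2

# Sanity check: x = lambda_1 u^(1) + lambda_2 u^(2).
x = np.array([0.5, 2.0, -1.0])
lam, u1, u2 = spectral_decomposition(x)
print(np.allclose(x, lam[0] * u1 + lam[1] * u2))   # True
```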

Lemma 2.1. For any x = (x_1, x_2) ∈ IR × IR^{n−1} with the spectral decomposition given as in (3)-(4), the following results hold.

(a) u_x^{(1)} ◦ u_x^{(2)} = 0 and u_x^{(i)} ◦ u_x^{(i)} = u_x^{(i)} for i = 1, 2;

(b) ‖u_x^{(1)}‖² = ‖u_x^{(2)}‖² = 1/2 and ‖x‖² = (1/2)(λ_1²(x) + λ_2²(x)).

Proof. The properties can be verified directly or can be found in [3, 10, 11, 12, 16]. □

Next, we talk about the projection onto the second-order cone. We let x_+ be the projection of x onto the SOC K^n, and x_− be the projection of −x onto the dual cone (K^n)^* of K^n, where the dual cone (K^n)^* is defined by (K^n)^* := {y ∈ IR^n | ⟨x, y⟩ ≥ 0, ∀x ∈ K^n}. In fact, the dual cone of K^n is itself, i.e., (K^n)^* = K^n. Due to the special structure of the SOC K^n, the explicit formula of the projection of x = (x_1, x_2) ∈ IR × IR^{n−1} onto K^n is obtained in [3, 10, 11, 12, 13] as below:

$$x_+ = \begin{cases} x & \text{if } x \in K^n, \\ 0 & \text{if } x \in -K^n, \\ u & \text{otherwise}, \end{cases} \quad \text{where } u = \begin{bmatrix} \dfrac{x_1 + \|x_2\|}{2} \\[2mm] \dfrac{x_1 + \|x_2\|}{2} \cdot \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$

Similarly, the expression of x_− is in the form of

$$x_- = \begin{cases} 0 & \text{if } x \in K^n, \\ -x & \text{if } x \in -K^n, \\ w & \text{otherwise}, \end{cases} \quad \text{where } w = \begin{bmatrix} -\dfrac{x_1 - \|x_2\|}{2} \\[2mm] \dfrac{x_1 - \|x_2\|}{2} \cdot \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$


Together with the spectral decomposition of x, it is shown that x = x_+ − x_− and that x_+ and x_− have the forms

$$x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)}$$

and

$$x_- = (-\lambda_1(x))_+\, u_x^{(1)} + (-\lambda_2(x))_+\, u_x^{(2)},$$

where (α)_+ = max{0, α} for α ∈ IR.
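Following the spectral formulas above, the projections become one-liners on top of the `spectral_decomposition` helper sketched earlier (again, the function names are ours):

```python
def proj_soc(x):
    """x_+ = (lambda_1)_+ u^(1) + (lambda_2)_+ u^(2): the projection of x onto K^n."""
    lam, u1, u2 = spectral_decomposition(x)
    return max(lam[0], 0.0) * u1 + max(lam[1], 0.0) * u2

def proj_soc_minus(x):
    """x_- = (-lambda_1)_+ u^(1) + (-lambda_2)_+ u^(2): projection of -x onto (K^n)* = K^n."""
    lam, u1, u2 = spectral_decomposition(x)
    return max(-lam[0], 0.0) * u1 + max(-lam[1], 0.0) * u2
```

A quick check of the decomposition x = x_+ − x_−: for any x, `proj_soc(x) - proj_soc_minus(x)` should reproduce x.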

Next, we talk about the expression of |x| associated with the SOC. There is an alternative way, via the so-called SOC-function, to obtain the expression of |x|, which can be found in [2, 4]. More specifically, for any x ∈ IR^n, we define the absolute value |x| of x with respect to the SOC as |x| := x_+ + x_−. In fact, in the setting of SOC, the form |x| = x_+ + x_− is equivalent to the form |x| = √(x ◦ x). Combining the above expressions of x_+ and x_−, it can be verified that the absolute value |x| is in the form of

$$|x| = \left[(\lambda_1(x))_+ + (-\lambda_1(x))_+\right] u_x^{(1)} + \left[(\lambda_2(x))_+ + (-\lambda_2(x))_+\right] u_x^{(2)} = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)}.$$
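In code, the absolute value is therefore as cheap as one spectral decomposition; a small sketch building on the helpers above:

```python
def abs_soc(x):
    """|x| = |lambda_1(x)| u^(1) + |lambda_2(x)| u^(2), equivalently x_+ + x_-."""
    lam, u1, u2 = spectral_decomposition(x)
    return abs(lam[0]) * u1 + abs(lam[1]) * u2

# Consistency check against |x| = x_+ + x_-:
x = np.array([0.5, 2.0, -1.0])
print(np.allclose(abs_soc(x), proj_soc(x) + proj_soc_minus(x)))   # True
```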

To end this section, we point out the relation between the SOCAVE and the SOCLCP (second-order cone linear complementarity problem). In [16], it was shown that the SOCAVE (2) is equivalent to the following SOCLCP: find x, y ∈ IR^n such that

$$Mx + Py = c, \quad \text{and} \quad x \in K^n, \; y \in K^n, \; \langle x, y \rangle = 0,$$

where M, P ∈ IR^{n×n} are matrices and c ∈ IR^n. However, the above is not a standard SOCLCP because of the equation Mx + Py = c therein. As below, we show that the SOCAVE (2) can be further converted into a standard SOCLCP.

Theorem 2.1. The SOCAVE (2) can be reduced to the second-order cone linear complementarity problem (SOCLCP):

$$v \in K^n \times K^n \times K^n, \quad w = Qv + q \in K^n \times K^n \times K^n \quad \text{and} \quad \langle v, w \rangle = 0, \qquad (5)$$

where

$$Q := \begin{bmatrix} -I & 2I & 0 \\ A & B - A & 0 \\ -A & A - B & 0 \end{bmatrix}, \quad v := \begin{bmatrix} 2x_+ \\ |x| \\ 0 \end{bmatrix} \quad \text{and} \quad q := \begin{bmatrix} 0 \\ -b \\ b \end{bmatrix}. \qquad (6)$$

Proof. By looking into (6), we have

$$w = Qv + q = \begin{bmatrix} 2x_- \\ Ax + B|x| - b \\ -Ax - B|x| + b \end{bmatrix}.$$

Plugging this into the SOCLCP (5) implies that

$$Ax + B|x| - b \in K^n \quad \text{and} \quad -Ax - B|x| + b \in K^n.$$

Since K^n is pointed, it follows that Ax + B|x| − b = 0. On the other hand, the above argument is reversible. Thus, we show that the SOCAVE (2) is equivalent to the second-order cone linear complementarity problem. □

Remark 2.1. From Theorem 2.1, it follows that we can also solve the SOCAVE (2) by employing many efficient algorithms for solving the SOCLCP (5). Nonetheless, when we apply the Newton method to solve the SOCLCP, it still needs to be reformulated as smooth or nonsmooth equations. This means that we would need two reformulations if we followed this route. In view of this, in this paper we reformulate the SOCAVE (2) directly as smooth equations and solve them by a smoothing Newton method.

3 Smoothing functions associated with SOCAVE

In this paper, we employ the smoothing Newton method for solving the SOCAVE (2). To this end, we need to adopt a smoothing function. Due to the non-differentiability of |α| for α ∈ IR, we consider a class of smoothing functions for the absolute value function |α|. More specifically, we define the function φ_p(·, ·) : IR² → IR as

$$\phi_p(a, b) := \sqrt[p]{|a|^p + |b|^p}, \quad p > 1. \qquad (7)$$

This class of functions is extracted from the so-called generalized Fischer-Burmeister function $\phi_p(a, b) = \sqrt[p]{|a|^p + |b|^p} - (a + b)$, which is heavily studied in many references [5, 6, 7, 8, 9, 15]. For convenience, we still use the notation φ_p even though it is no longer exactly the same as the generalized Fischer-Burmeister function.

Lemma 3.1. Let φ_p : IR² → IR be defined as in (7). Then, the following hold.

(a) φ_p(a, 0) = |a| and φ_p(0, b) = |b|;

(b) φ_p(·, ·) is Lipschitz continuous on IR²;

(c) φ_p(·, ·) is strongly semismooth on IR²;

(d) φ_p(a, b) is continuously differentiable for any (a, b) ≠ (0, 0) ∈ IR² with

$$\frac{\partial \phi_p(a, b)}{\partial a} = \frac{\mathrm{sgn}(a)\,|a|^{p-1}}{(\phi_p(a, b))^{p-1}} \quad \text{and} \quad \frac{\partial \phi_p(a, b)}{\partial b} = \frac{\mathrm{sgn}(b)\,|b|^{p-1}}{(\phi_p(a, b))^{p-1}},$$

where the function sgn(·) is defined by

$$\mathrm{sgn}(\alpha) := \begin{cases} 1 & \text{if } \alpha > 0, \\ 0 & \text{if } \alpha = 0, \\ -1 & \text{if } \alpha < 0. \end{cases}$$


Proof. Please refer to [5, 6, 7, 8, 9, 15] for a proof. □

According to Lemma 3.1, for any fixed b ∈ IR, we have φ_p(a, b) → |b| as a → 0. Therefore, combining the spectral decomposition of x and the function φ_p, we define a vector-valued smoothing function Φ_p : IR × IR^n → IR^n as

$$\Phi_p(\mu, x) = \phi_p(\mu, \lambda_1(x))\, u_x^{(1)} + \phi_p(\mu, \lambda_2(x))\, u_x^{(2)} = \sqrt[p]{|\mu|^p + |\lambda_1(x)|^p}\; u_x^{(1)} + \sqrt[p]{|\mu|^p + |\lambda_2(x)|^p}\; u_x^{(2)},$$

where µ ∈ IR is a parameter, and λ_1(x), λ_2(x) are the spectral values of x. From Lemma 3.1, it is easy to verify that

$$\lim_{\mu \to 0} \Phi_p(\mu, x) = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)} = |x|.$$
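A direct implementation of φ_p and Φ_p makes this limiting behavior easy to observe numerically; the sketch below reuses the `spectral_decomposition` and `abs_soc` helpers introduced earlier.

```python
def phi_p(a, b, p):
    """Smoothing kernel (7): phi_p(a, b) = (|a|^p + |b|^p)^(1/p)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p)

def Phi_p(mu, x, p):
    """Phi_p(mu, x) = phi_p(mu, lambda_1) u^(1) + phi_p(mu, lambda_2) u^(2)."""
    lam, u1, u2 = spectral_decomposition(x)
    return phi_p(mu, lam[0], p) * u1 + phi_p(mu, lam[1], p) * u2

# As mu -> 0, Phi_p(mu, x) approaches |x|:
x = np.array([0.5, 2.0, -1.0])
for mu in (1e-1, 1e-4, 1e-8):
    print(mu, np.linalg.norm(Phi_p(mu, x, p=2) - abs_soc(x)))
```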

In other words, the function Φ_p(µ, x) is a uniformly smoothing function of |x| associated with the SOC. With this function, for the SOCAVE (2), we further define a function H(µ, x) : IR × IR^n → IR × IR^n by

$$H(\mu, x) = \begin{bmatrix} \mu \\ Ax + B\Phi_p(\mu, x) - b \end{bmatrix}, \quad \forall \mu \in IR, \; x \in IR^n. \qquad (8)$$

Then, we observe that

$$H(\mu, x) = 0 \iff \mu = 0 \text{ and } Ax + B\Phi_p(\mu, x) - b = 0 \iff Ax + B|x| - b = 0 \text{ and } \mu = 0.$$

This indicates that x is a solution to the SOCAVE (2) if and only if (µ, x) is a solution to the equation H(µ, x) = 0. In fact, we often choose µ ∈ IR_{++}. Applying Lemma 3.1 again, it is not difficult to show that the function H(µ, x) is continuously differentiable on IR_{++} × IR^n. From direct calculation, we can also obtain the explicit formula of the Jacobian matrix of the function H as below:

$$H'(\mu, x) = \begin{bmatrix} 1 & 0 \\[1mm] B\,\dfrac{\partial \Phi_p(\mu, x)}{\partial \mu} & A + B\,\dfrac{\partial \Phi_p(\mu, x)}{\partial x} \end{bmatrix} \qquad (9)$$

for all (µ, x) ∈ IR_{++} × IR^n with x = (x_1, x_2) ∈ IR × IR^{n−1}, where

$$\frac{\partial \Phi_p(\mu, x)}{\partial \mu} = \frac{\partial \phi_p(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi_p(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)} = \frac{\mu^{p-1}}{[\phi_p(\mu, \lambda_1(x))]^{p-1}}\, u_x^{(1)} + \frac{\mu^{p-1}}{[\phi_p(\mu, \lambda_2(x))]^{p-1}}\, u_x^{(2)}$$

and

$$\frac{\partial \Phi_p(\mu, x)}{\partial x} = \begin{cases} \dfrac{\mathrm{sgn}(x_1)\,|x_1|^{p-1}}{\left[\sqrt[p]{\mu^p + |x_1|^p}\right]^{p-1}}\, I & \text{if } x_2 = 0, \\[5mm] \begin{bmatrix} b & c\,\dfrac{x_2^T}{\|x_2\|} \\[2mm] c\,\dfrac{x_2}{\|x_2\|} & aI + (b - a)\,\dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a = \frac{\phi_p(\mu, \lambda_2(x)) - \phi_p(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)},$$

$$b = \frac{1}{2}\left[\frac{\mathrm{sgn}(\lambda_2(x))\,|\lambda_2(x)|^{p-1}}{[\phi_p(\mu, \lambda_2(x))]^{p-1}} + \frac{\mathrm{sgn}(\lambda_1(x))\,|\lambda_1(x)|^{p-1}}{[\phi_p(\mu, \lambda_1(x))]^{p-1}}\right], \qquad (10)$$

$$c = \frac{1}{2}\left[\frac{\mathrm{sgn}(\lambda_2(x))\,|\lambda_2(x)|^{p-1}}{[\phi_p(\mu, \lambda_2(x))]^{p-1}} - \frac{\mathrm{sgn}(\lambda_1(x))\,|\lambda_1(x)|^{p-1}}{[\phi_p(\mu, \lambda_1(x))]^{p-1}}\right].$$
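For completeness, here is a sketch of these partial derivatives in NumPy, valid for µ > 0 and following the formulas above; the function name and return convention (a vector for ∂Φ_p/∂µ and a matrix for ∂Φ_p/∂x) are our choices.

```python
def jacobian_Phi(mu, x, p):
    """Partial derivatives of Phi_p at (mu, x) with mu > 0, per the formulas above."""
    x = np.asarray(x, dtype=float)
    n = x.size
    lam, u1, u2 = spectral_decomposition(x)
    phi1, phi2 = phi_p(mu, lam[0], p), phi_p(mu, lam[1], p)
    dmu = (mu ** (p - 1) / phi1 ** (p - 1)) * u1 + (mu ** (p - 1) / phi2 ** (p - 1)) * u2

    def g(t, phi):                         # sgn(t) |t|^(p-1) / phi^(p-1)
        return np.sign(t) * abs(t) ** (p - 1) / phi ** (p - 1)

    x2 = x[1:]
    nx2 = np.linalg.norm(x2)
    if nx2 == 0.0:
        dx = g(x[0], phi_p(mu, x[0], p)) * np.eye(n)
    else:
        a = (phi2 - phi1) / (lam[1] - lam[0])
        b = 0.5 * (g(lam[1], phi2) + g(lam[0], phi1))
        c = 0.5 * (g(lam[1], phi2) - g(lam[0], phi1))
        w = x2 / nx2
        dx = np.zeros((n, n))
        dx[0, 0] = b
        dx[0, 1:] = c * w
        dx[1:, 0] = c * w
        dx[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return dmu, dx
```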

4 Smoothing Newton method

In this section, we investigate the smoothing algorithm based on the smoothing function Φ_p(µ, x) for solving the SOCAVE (2), and show the convergence properties of the considered algorithm. First, we present the generic framework of the smoothing algorithm.

Algorithm 4.1. (A Smoothing Newton Algorithm)

Step 0 Choose δ ∈ (0, 1), σ ∈ (0, 1), and µ_0 ∈ IR_{++}, x^0 ∈ IR^n. Set z^0 := (µ_0, x^0) and e := (1, 0) ∈ IR × IR^n. Choose β > 1 satisfying min{1, ‖H(z^0)‖}² ≤ βµ_0. Set k := 0.

Step 1 If ‖H(z^k)‖ = 0, stop. Otherwise, set τ_k := min{1, ‖H(z^k)‖}.

Step 2 Compute Δz^k = (Δµ_k, Δx^k) ∈ IR × IR^n by

$$H(z^k) + H'(z^k)\,\Delta z^k = \frac{1}{\beta}\,\tau_k^2\, e, \qquad (11)$$

where H'(z^k) denotes the Jacobian matrix of H at z^k = (µ_k, x^k) given by (9).

Step 3 Let α_k be the maximum of the values 1, δ, δ², · · · such that

$$\left\|H(z^k + \alpha_k \Delta z^k)\right\| \le \left[1 - \sigma\left(1 - \frac{1}{\beta}\right)\alpha_k\right] \left\|H(z^k)\right\|. \qquad (12)$$

Step 4 Set z^{k+1} := z^k + α_k Δz^k and k := k + 1. Go to Step 1.
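The following NumPy sketch assembles the pieces above into Algorithm 4.1 for a single cone K^n. It is an illustrative implementation under our own naming and with a safeguard on the line search, not the authors' Matlab code; the parameter defaults mirror the choices reported in Section 5.

```python
def smoothing_newton(A, B, b, p=2, mu0=0.1, x0=None,
                     delta=0.5, sigma=1e-5, tol=1e-6, max_iter=100):
    """Smoothing Newton method (Algorithm 4.1) for the SOCAVE Ax + B|x| = b."""
    n = b.size
    x = np.random.rand(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    mu = mu0
    e = np.zeros(n + 1)
    e[0] = 1.0                                   # e = (1, 0) in IR x IR^n

    def H(mu, x):
        return np.concatenate(([mu], A @ x + B @ Phi_p(mu, x, p) - b))

    Hz = H(mu, x)
    beta = max(1.0, 1.01 * min(1.0, np.linalg.norm(Hz)) ** 2 / mu0)
    for _ in range(max_iter):
        normH = np.linalg.norm(Hz)
        if normH <= tol:
            break
        tau = min(1.0, normH)
        dmu, dx = jacobian_Phi(mu, x, p)
        J = np.zeros((n + 1, n + 1))             # Jacobian H'(z) as in (9)
        J[0, 0] = 1.0
        J[1:, 0] = B @ dmu
        J[1:, 1:] = A + B @ dx
        dz = np.linalg.solve(J, (tau ** 2 / beta) * e - Hz)   # Newton equation (11)
        alpha = 1.0                              # backtracking line search (12)
        while np.linalg.norm(H(mu + alpha * dz[0], x + alpha * dz[1:])) > \
                (1.0 - sigma * (1.0 - 1.0 / beta) * alpha) * normH:
            alpha *= delta
            if alpha < 1e-12:                    # safeguard, not in the paper
                break
        mu += alpha * dz[0]
        x += alpha * dz[1:]
        Hz = H(mu, x)
    return x, mu
```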

In order to show that Algorithm 4.1 is well defined, we have to prove that the Newton equation (11) is solvable and that the line search (12) is well defined. To this end, we need the next two technical lemmas.

Lemma 4.1. For any M, N ∈ IR^{n×n}, σ_min(M) > σ_max(N) if and only if σ_min(M^T M) > σ_max(N^T N). In addition, if σ_min(M^T M) > σ_max(N^T N), then M^T M − N^T N is positive definite. Here σ_min(M) denotes the minimum singular value of M, and σ_max(N) denotes the maximum singular value of N.


Proof. The proof is straightforward or can be found in standard textbooks of matrix analysis, so we omit it here. □

Lemma 4.2. Let A, S ∈ IR^{n×n} and A be symmetric. Suppose that the eigenvalues of A and SS^T are arranged in non-increasing order. Then, for each k = 1, 2, · · · , n, there exists a nonnegative real number θ_k such that

$$\lambda_{\min}(SS^T) \le \theta_k \le \lambda_{\max}(SS^T) \quad \text{and} \quad \lambda_k(SAS^T) = \theta_k\, \lambda_k(A).$$

Proof. Please see [18, Corollary 4.5.11] for a proof. □

In order to show that the Jacobian matrix H'(µ, x) in the Newton equation (11) is nonsingular for any µ > 0, we need the following assumption:

Assumption 4.1. For the SOCAVE (2), it holds that σ_min(A) > σ_max(B).

In fact, under the condition of Assumption 4.1, the SOCAVE (2) has a unique solution, which is verified in [33].
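Assumption 4.1 is easy to test numerically for a given instance, e.g. via the SVD (a small sketch with our own helper name):

```python
def assumption_4_1_holds(A, B):
    """Check sigma_min(A) > sigma_max(B) via singular values."""
    smin_A = np.linalg.svd(A, compute_uv=False).min()
    smax_B = np.linalg.svd(B, compute_uv=False).max()
    return smin_A > smax_B
```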

Theorem 4.1. Let H be defined as in (8). Suppose that Assumption 4.1 holds. Then, the Jacobian matrix H'(µ, x) in the Newton equation (11) is nonsingular for any µ > 0.

Proof. From the expression of H'(µ, x) given as in (9), we know that H'(µ, x) is nonsingular if and only if the matrix A + B(∂Φ_p(µ, x)/∂x) is nonsingular. Thus, it suffices to show that the matrix A + B(∂Φ_p(µ, x)/∂x) is nonsingular. Suppose not, i.e., there exists a vector 0 ≠ v ∈ IR^n such that

$$\left(A + B\,\frac{\partial \Phi_p(\mu, x)}{\partial x}\right) v = 0.$$

This implies that

$$v^T A^T A\, v = v^T \left(\frac{\partial \Phi_p(\mu, x)}{\partial x}\right)^T B^T B \left(\frac{\partial \Phi_p(\mu, x)}{\partial x}\right) v. \qquad (13)$$

For convenience, we denote C := ∂Φ_p(µ, x)/∂x. Then, it follows that v^T A^T A v = v^T C^T B^T B C v. By Lemma 4.2, there exists a constant θ̂ such that

$$\lambda_{\min}(C^T C) \le \hat{\theta} \le \lambda_{\max}(C^T C) \quad \text{and} \quad \lambda_{\max}(C^T B^T B C) = \hat{\theta}\, \lambda_{\max}(B^T B).$$

Note that if we can prove that 0 ≤ λ_min(C^T C) ≤ λ_max(C^T C) ≤ 1, then λ_max(C^T B^T B C) ≤ λ_max(B^T B). Then, by the assumption that the minimum singular value of A strictly exceeds the maximum singular value of B, and applying Lemma 4.1, we obtain v^T A^T A v > v^T C^T B^T B C v. This contradicts the formula (13), which shows that the Jacobian matrix H'(µ, x) in the Newton equation (11) is nonsingular for µ > 0.


Thus, as discussed above, we only need to prove 0 ≤ λ_min(C^T C) ≤ λ_max(C^T C) ≤ 1. For x_2 = 0, we compute that

$$C = \frac{\mathrm{sgn}(x_1)\,|x_1|^{p-1}}{\left[\sqrt[p]{\mu^p + |x_1|^p}\right]^{p-1}}\, I.$$

Then, it is clear that 0 < λ(C^T C) < 1 for µ > 0. For x_2 ≠ 0, using the fact that the matrix M^T M is always positive semidefinite for any matrix M ∈ IR^{m×n}, we see that the inequality λ_min(C^T C) ≥ 0 always holds. In order to prove that λ_max(C^T C) ≤ 1, we need to further prove that the matrix I − C^T C is positive semidefinite. To see this, note that

$$I - C^T C = \begin{bmatrix} 1 - b^2 - c^2 & -2bc\,\dfrac{x_2^T}{\|x_2\|} \\[2mm] -2bc\,\dfrac{x_2}{\|x_2\|} & (1 - a^2)I + (a^2 - b^2 - c^2)\,\dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}.$$

Because

$$b^2 + c^2 = \frac{1}{2}\left[\frac{|\lambda_2(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_2(x))]^{2(p-1)}} + \frac{|\lambda_1(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_1(x))]^{2(p-1)}}\right] < \frac{1}{2}\cdot 2 = 1$$

for µ > 0, we have 1 − b² − c² > 0. Moreover, the Schur complement of 1 − b² − c² has the form of

$$(1 - a^2)I + (a^2 - b^2 - c^2)\frac{x_2 x_2^T}{\|x_2\|^2} - \frac{4b^2c^2}{1 - b^2 - c^2}\,\frac{x_2 x_2^T}{\|x_2\|^2} = (1 - a^2)\left(I - \frac{x_2 x_2^T}{\|x_2\|^2}\right) + \left(1 - b^2 - c^2 - \frac{4b^2c^2}{1 - b^2 - c^2}\right)\frac{x_2 x_2^T}{\|x_2\|^2}. \qquad (14)$$

On the other hand, since |λ_i(x)| < φ_p(µ, λ_i(x)) (i = 1, 2) for µ > 0, we have

$$\left|\phi_p(\mu, \lambda_2(x)) - \phi_p(\mu, \lambda_1(x))\right| = \frac{\left|\,|\lambda_2(x)|^p - |\lambda_1(x)|^p\,\right|}{\displaystyle\sum_{i=1}^{p} [\phi_p(\mu, \lambda_2(x))]^{p-i}\, [\phi_p(\mu, \lambda_1(x))]^{i-1}} = \frac{\left|\,(|\lambda_2(x)| - |\lambda_1(x)|) \displaystyle\sum_{i=1}^{p} |\lambda_2(x)|^{p-i}\, |\lambda_1(x)|^{i-1}\,\right|}{\displaystyle\sum_{i=1}^{p} [\phi_p(\mu, \lambda_2(x))]^{p-i}\, [\phi_p(\mu, \lambda_1(x))]^{i-1}} < \left|\,|\lambda_2(x)| - |\lambda_1(x)|\,\right| \le |\lambda_2(x) - \lambda_1(x)|.$$

This together with (10) implies that 1 − a² > 0 for any µ > 0. In addition, for any µ > 0, we observe that

$$(1 - b^2 - c^2)^2 - 4b^2c^2 = \left(1 - (b - c)^2\right)\left(1 - (b + c)^2\right) = \left[1 - \frac{|\lambda_1(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_1(x))]^{2(p-1)}}\right] \cdot \left[1 - \frac{|\lambda_2(x)|^{2(p-1)}}{[\phi_p(\mu, \lambda_2(x))]^{2(p-1)}}\right] > 0,$$


where the inequality holds due to |λ_i(x)| < φ_p(µ, λ_i(x)) for i = 1, 2 and µ > 0. With all of these, we see that the Schur complement of 1 − b² − c² given as in (14) is a positive linear combination of the matrices I − x_2x_2^T/‖x_2‖² and x_2x_2^T/‖x_2‖², which yields that the Schur complement (14) of 1 − b² − c² is positive semidefinite. Hence, the matrix I − C^T C is also positive semidefinite, which is equivalent to saying 0 ≤ λ_min(C^T C) ≤ λ_max(C^T C) ≤ 1. Thus, the proof is complete. □

Theorem 4.1 indicates that the Newton equation (11) in Algorithm 4.1 is solvable. It paves the way to show that the line search (12) in Algorithm 4.1 is well defined, which is given in Theorem 4.2 below. Indeed, since the proof is very similar to the one in [17, Remark 2.1 (v)], we only state the result and omit its proof.

Theorem 4.2. Suppose that Assumption 4.1 holds. Then, for Δz ∈ IR × IR^n given by (11), the line search (12) is well defined.

Next, we discuss the convergence of Algorithm 4.1. To this end, we need the following results whose arguments are similar to the ones in [17, Remark 2.1].

Theorem 4.3. Let H be defined as in (8). Suppose that Assumption 4.1 holds and that the sequence {z^k} is generated by Algorithm 4.1. Then, the following results hold.

(a) The sequences {‖H(z^k)‖} and {τ_k} are monotonically non-increasing.

(b) βµ_k ≥ τ_k² for all k.

(c) The sequence {µ_k} is monotonically non-increasing and µ_k > 0 for all k.

(d) The sequence {z^k} is bounded.

Proof. (a) From the definition of the line search in (12) and τ_k := min{1, ‖H(z^k)‖}, it is clear that {‖H(z^k)‖} and {τ_k} are monotonically non-increasing.

(b) We prove this conclusion by induction. First, by Algorithm 4.1, it is clear that τ_0² ≤ βµ_0 with τ_0, β and µ_0 chosen in Algorithm 4.1. Secondly, we suppose that τ_k² ≤ βµ_k for some k. Then, for k + 1, we have

$$\mu_{k+1} - \frac{\tau_{k+1}^2}{\beta} = \mu_k + \alpha_k \Delta\mu_k - \frac{\tau_{k+1}^2}{\beta} = (1 - \alpha_k)\mu_k + \alpha_k\,\frac{\tau_k^2}{\beta} - \frac{\tau_{k+1}^2}{\beta} \ge (1 - \alpha_k)\frac{\tau_k^2}{\beta} + \alpha_k\,\frac{\tau_k^2}{\beta} - \frac{\tau_{k+1}^2}{\beta} \ge 0,$$

where the second equality holds due to the Newton equation (11), and the second inequality holds due to part (a). Hence, it follows that βµ_k ≥ τ_k² for all k.

(c) From the iterative scheme z^{k+1} = z^k + α_k Δz^k, we know µ_{k+1} = µ_k + α_k Δµ_k. By the Newton equation (11) and the line search (12) again, it follows that

$$\mu_{k+1} = (1 - \alpha_k)\mu_k + \alpha_k\,\frac{\tau_k^2}{\beta} \ge (1 - \alpha_k)\frac{\tau_k^2}{\beta} + \alpha_k\,\frac{\tau_k^2}{\beta} > 0$$

for all k. On the other hand, we have

$$\mu_{k+1} = (1 - \alpha_k)\mu_k + \alpha_k\,\frac{\tau_k^2}{\beta} \le (1 - \alpha_k)\mu_k + \alpha_k \mu_k \le \mu_k,$$

where the first inequality holds due to part (b). Hence, the sequence {µ_k} is monotonically non-increasing and µ_k > 0 for all k.

(d) From part (a), we know the sequence {‖H(z^k)‖} is bounded. Thus, there is a constant C such that ‖H(z^k)‖ ≤ C. In addition, since

$$4\left\|\lambda_1(x^k)u_x^{(1)} + \lambda_2(x^k)u_x^{(2)}\right\|^2 - \frac{\sqrt[p]{4}}{4}\left(|\lambda_1(x^k)| + |\lambda_2(x^k)|\right)^2 = \frac{1}{4}\left[\left(8 - 2\sqrt[p]{4}\right)\left(|\lambda_1(x^k)|^2 + |\lambda_2(x^k)|^2\right) + \sqrt[p]{4}\left(|\lambda_1(x^k)| - |\lambda_2(x^k)|\right)^2\right] > 0 \quad (\forall\, p > 1),$$

it follows that

$$\begin{aligned}
\|H(z^k)\| &\ge \left\|Ax^k + B\Phi_p(\mu_k, x^k) - b\right\| \\
&\ge \|Ax^k\| - \|B\Phi_p(\mu_k, x^k)\| - \|b\| \\
&= \sqrt{(x^k)^T A^T A\, x^k} - \sqrt{[\Phi_p(\mu_k, x^k)]^T B^T B\, \Phi_p(\mu_k, x^k)} - \|b\| \\
&\ge \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)\,\|\Phi_p(\mu_k, x^k)\|^2} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)\left[\phi_p^2(\mu_k, \lambda_1(x^k))\|u_x^{(1)}\|^2 + \phi_p^2(\mu_k, \lambda_2(x^k))\|u_x^{(2)}\|^2\right]} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)\,\frac{1}{2}\left[\left(\sqrt[p]{\mu_k^p + |\lambda_1(x^k)|^p}\right)^2 + \left(\sqrt[p]{\mu_k^p + |\lambda_2(x^k)|^p}\right)^2\right]} - \|b\| \\
&\ge \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\cdot\sqrt{\frac{1}{2}\left[\left(\mu_k^2 + |\lambda_1(x^k)|^2 + \sqrt[p]{4}\,\mu_k|\lambda_1(x^k)|\right) + \left(\mu_k^2 + |\lambda_2(x^k)|^2 + \sqrt[p]{4}\,\mu_k|\lambda_2(x^k)|\right)\right]} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\,\sqrt{\mu_k^2 + \frac{1}{2}|\lambda_1(x^k)|^2 + \frac{1}{2}|\lambda_2(x^k)|^2 + \frac{\sqrt[p]{4}}{2}\,\mu_k\left(|\lambda_1(x^k)| + |\lambda_2(x^k)|\right)} - \|b\| \\
&\ge \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\,\sqrt{\mu_k^2 + \frac{1}{2}|\lambda_1(x^k)|^2 + \frac{1}{2}|\lambda_2(x^k)|^2 + 2\mu_k\left\|\lambda_1(x^k)u_x^{(1)} + \lambda_2(x^k)u_x^{(2)}\right\|} - \|b\| \\
&= \sqrt{\lambda_{\min}(A^T A)}\,\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\left(\mu_k + \left\|\lambda_1(x^k)u_x^{(1)} + \lambda_2(x^k)u_x^{(2)}\right\|\right) - \|b\| \\
&= \left(\sqrt{\lambda_{\min}(A^T A)} - \sqrt{\lambda_{\max}(B^T B)}\right)\|x^k\| - \sqrt{\lambda_{\max}(B^T B)}\,\mu_k - \|b\|.
\end{aligned}$$

This together with ‖H(z^k)‖ ≤ C implies that

$$\|x^k\| \le \frac{C + \sqrt{\lambda_{\max}(B^T B)}\,\mu_k + \|b\|}{\sqrt{\lambda_{\min}(A^T A)} - \sqrt{\lambda_{\max}(B^T B)}}$$

holds for all k. Thus, the sequence {x^k} is bounded. □

Theorem 4.4. Suppose that Assumption 4.1 holds and that {z^k} is generated by Algorithm 4.1. Then, any accumulation point of {z^k} is a solution to the SOCAVE (2).

Proof. From Theorem 4.3 (d), we know the sequence {z^k} is bounded. Hence, there exists at least one accumulation point of the sequence {z^k}. Without loss of generality, let lim_{k→∞} z^k := z* = (µ*, x*). Then, it follows that H* := H(z*) = lim_{k→∞} H(z^k) and τ* := min{1, ‖H*‖} = lim_{k→∞} min{1, ‖H(z^k)‖}. Now, we will show H* = 0. Suppose not, i.e., ‖H*‖ > 0. To proceed, we discuss two cases according to whether lim_{k→∞} α_k = 0 or α_k ≥ α̂ > 0 with α̂ ∈ IR_{++}.

Case 1: lim_{k→∞} α_k = 0. Then, from the line search (12), for the number ᾱ_k := α_k/δ with all sufficiently large k, we have

$$\left\|H(z^k + \bar{\alpha}_k \Delta z^k)\right\| > \left[1 - \sigma\left(1 - \frac{1}{\beta}\right)\bar{\alpha}_k\right]\left\|H(z^k)\right\|.$$

Furthermore, this leads to

$$\frac{\|H(z^k + \bar{\alpha}_k \Delta z^k)\| - \|H(z^k)\|}{\bar{\alpha}_k} > -\sigma\left(1 - \frac{1}{\beta}\right)\|H(z^k)\|. \qquad (15)$$


Besides, from Theorem 4.3 (b) and (c), we know µ* ≥ (τ*)²/β > 0. It follows that the function H is continuously differentiable at the point z*. Taking k → ∞ in the formula (15), we have

$$\frac{\langle H(z^*), H'(z^*)\Delta z^*\rangle}{\|H(z^*)\|} \ge -\sigma\left(1 - \frac{1}{\beta}\right)\|H(z^*)\|. \qquad (16)$$

Combining this with the Newton equation (11) yields

$$\frac{\langle H(z^*), H'(z^*)\Delta z^*\rangle}{\|H(z^*)\|} = \frac{(\tau^*)^2}{\beta\|H(z^*)\|}\langle H(z^*), e\rangle - \|H(z^*)\| \le \frac{(\tau^*)^2\|H(z^*)\|}{\beta\|H(z^*)\|} - \|H(z^*)\| \le \frac{\tau^*}{\beta} - \|H(z^*)\| \le \left(\frac{1}{\beta} - 1\right)\|H(z^*)\|, \qquad (17)$$

where the first inequality holds due to the Hölder inequality ⟨H(z*), e⟩ ≤ ‖H(z*)‖‖e‖ = ‖H(z*)‖, and the second and third inequalities hold due to τ* = min{1, ‖H(z*)‖}. Putting (16) and (17) together gives 1/β − 1 ≥ −σ(1 − 1/β). This contradicts σ ∈ (0, 1) and β > 1.

Case 2: α_k ≥ α̂ > 0 for all k. From the line search (12), we have

$$\|H(z^{k+1})\| \le \left[1 - \sigma\left(1 - \frac{1}{\beta}\right)\hat{\alpha}\right]\|H(z^k)\| = \|H(z^k)\| - \sigma\left(1 - \frac{1}{\beta}\right)\hat{\alpha}\,\|H(z^k)\|.$$

Then, it follows from the boundedness of {‖H(z^k)‖} that $\sum_{k=0}^{\infty} \sigma(1 - \frac{1}{\beta})\hat{\alpha}\,\|H(z^k)\|$ is bounded. Moreover, we have lim_{k→∞} ‖H(z^k)‖ = 0, i.e., ‖H*‖ = 0. This contradicts ‖H*‖ > 0.

Hence, from all the above, we show H(z*) = 0. That is, the element x* is a solution of the SOCAVE (2). The proof is complete. □

Now, we show the local quadratic convergence of Algorithm 4.1. In fact, we can achieve the following result by arguments similar to those in [37, Theorem 8]. For completeness, we also provide a detailed proof.

Theorem 4.5. Let H be defined as in (8) and z* be the unique solution to the SOCAVE (2). Suppose that Assumption 4.1 holds and that all V ∈ ∂H(z*) are nonsingular. Then, the whole sequence {z^k} converges to z*, and ‖z^{k+1} − z*‖ = O(‖z^k − z*‖²).

Proof. Since z* is the solution to the SOCAVE (2), using Assumption 4.1 and applying Theorem 4.1 yield that the Jacobian matrix H'(z^k) is nonsingular for all z^k sufficiently close to z*. On the other hand, applying the condition that all V ∈ ∂H(z*) are nonsingular and from [36, Proposition 3.1], we have ‖H'(z^k)^{-1}‖ = O(1) for all z^k sufficiently close to z*. Because z* is the solution to the SOCAVE (2), it is clear that z* is a solution of H(z) = 0. In addition, since the function H is strongly semismooth, it follows that

$$\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\| = O(\|z^k - z^*\|^2).$$

Thus, we have

$$\begin{aligned}
\left\|z^k + \Delta z^k - z^*\right\| &= \left\|z^k + H'(z^k)^{-1}\left(-H(z^k) + \frac{1}{\beta}\tau_k^2\, e\right) - z^*\right\| \\
&\le \left\|H'(z^k)^{-1}\left[-H(z^k) + H'(z^k)(z^k - z^*)\right]\right\| + \left\|H'(z^k)^{-1}\,\frac{1}{\beta}\tau_k^2\, e\right\| \\
&\le \left\|H'(z^k)^{-1}\right\|\left\|-H(z^k) + H'(z^k)(z^k - z^*)\right\| + O(1)\,\frac{1}{\beta}\tau_k^2 \\
&= O\!\left(\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\|\right) + O\!\left(\|H(z^k)\|^2\right) \\
&= O(\|z^k - z^*\|^2) + O(\|z^k - z^*\|^2) \\
&= O(\|z^k - z^*\|^2),
\end{aligned}$$

where the first equality holds due to the Newton equation (11), and the final equalities hold since H(z*) = 0 and the function H is locally Lipschitz continuous near z*. The proof is complete. □

5 Numerical Results

This section is devoted to the numerical results. First, we show the numerical comparison between the smoothing Newton algorithm and the generalized Newton method. This provides numerical evidence about why we adopt the smoothing Newton algorithm, not the generalized Newton algorithm, in this paper. Secondly, we use the performance profile to depict the comparison among different values of p. This shows that the smoothing Newton algorithm is not strongly affected when p is perturbed. Moreover, a suitable smoothing function from the class of smoothing functions is suggested in view of the numerical comparisons.

5.1 Smoothing Newton algorithm vs Generalized Newton method

In this subsection, for fixed p = 2, we provide some numerical examples to evaluate the efficiency of Algorithm 4.1. In our tests, we choose the parameters

µ_0 = 0.1, x^0 = rand(n, 1), δ = 0.5, σ = 10^{-5} and β = max(1, 1.01·τ_0²/µ_0).

We stop the iterations when ‖H(z^k)‖ ≤ 10^{-6} or the number of iterations exceeds 100. All the experiments are done on a PC with an Intel(R) CPU of 2.40GHz and 4.00GB of RAM,


and all the program codes are written in Matlab and run in the Matlab environment. We consider the following four problems, and solve them by using the Smoothing Newton Algorithm 4.1 (SN for short) and the Generalized Newton method (GN for short) introduced in [16], respectively. Illustrative examples further demonstrate the superiority of our proposed algorithm.

Problem 5.1. Consider the SOCAVE (2) which is generated in the following way: first, choose two random matrices B, C ∈ IR^{n×n} with every element drawn from a uniform distribution on [−10, 10]. We compute the maximal singular value σ_1 of B and the minimal singular value σ_2 of C, and let σ := min{1, σ_2/σ_1}. Next, we divide C by σ multiplied by a random number in the interval [0, 1], and the resulting matrix is denoted as A. Accordingly, the minimum singular value of A exceeds the maximal singular value of B, and hence (by Assumption 4.1) the resulting SOCAVE (2) is solvable. We choose b ∈ IR^n randomly with every element in [0, 1]. The initial point is chosen in the range [0, 1] entry-wise. Note that a similar way to construct the problem was given in [16].
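A NumPy sketch of this construction (our reading of the recipe above; the generator name is ours) could look as follows:

```python
def make_problem_5_1(n, rng=None):
    """Random SOCAVE data (A, B, b) in the spirit of Problem 5.1."""
    rng = np.random.default_rng() if rng is None else rng
    B = rng.uniform(-10.0, 10.0, (n, n))
    C = rng.uniform(-10.0, 10.0, (n, n))
    sigma1 = np.linalg.svd(B, compute_uv=False).max()   # sigma_max(B)
    sigma2 = np.linalg.svd(C, compute_uv=False).min()   # sigma_min(C)
    sigma = min(1.0, sigma2 / sigma1)
    A = C / (sigma * rng.uniform(0.0, 1.0))             # sigma_min(A) > sigma_max(B)
    b = rng.uniform(0.0, 1.0, n)
    return A, B, b

# Example usage with the solver sketched in Section 4:
A, B, b = make_problem_5_1(100)
x, mu = smoothing_newton(A, B, b, p=2, x0=np.random.rand(100))
```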

Table 1: Numerical results for Problem 5.1

               SN                                           GN
n      ares       itn  time    maxit minit fails    ares       itn  time    maxit minit fails
100    8.618e-08  2.8  0.078   3     2     0        9.992e-08  2.8  0.349   3     2     0
200    4.901e-08  2.6  0.051   3     2     0        6.904e-10  2.9  0.134   3     2     0
300    1.574e-08  2.7  0.122   3     2     0        3.779e-09  2.9  0.231   3     2     0
400    3.041e-09  2.7  0.232   3     2     0        9.155e-08  2.7  0.326   3     2     0
500    1.778e-07  2.2  0.300   3     2     0        1.445e-07  2.6  0.421   3     2     0
600    1.385e-07  2.5  0.498   3     2     0        5.626e-08  2.8  0.844   3     2     0
700    2.578e-07  2.4  0.668   3     2     0        1.527e-08  2.6  1.334   3     2     0
800    2.356e-07  2.1  0.771   3     2     0        6.846e-08  2.6  1.905   3     2     0
900    2.420e-08  2.5  1.031   3     2     0        1.272e-09  2.7  2.685   3     2     0
1000   4.718e-08  2.5  1.193   3     2     0        1.135e-07  2.7  3.691   3     2     0
1500   2.027e-07  2.3  1.919   3     2     0        6.417e-08  2.6  13.369  3     2     0
2000   3.121e-08  2.2  3.892   3     2     0        1.015e-07  2.5  32.982  3     2     0
2500   1.565e-07  2.1  6.625   3     2     0        3.940e-08  2.5  53.510  3     2     0
3000   1.028e-07  2.3  12.340  3     2     0        1.293e-07  2.5  87.910  3     2     0

Problem 5.2. Consider the SOCAVE (2) which is generated in the following way: choose two random matrices C, D ∈ IR^{n×n} with every element drawn from a uniform distribution on [−10, 10], and compute their singular value decompositions C := U_1 S_1 V_1^T and D := U_2 S_2 V_2^T with diagonal matrices S_1 and S_2 and unitary matrices U_1, V_1, U_2 and V_2. Then, we choose random vectors b, c ∈ IR^n with every element in [0, 10]. Next, we take a ∈ IR^n by setting a_i = c_i + 10 for all i ∈ {1, . . . , n}, so that a ≥ b. Set A := U_1 Diag(a) V_1^T and B := U_2 Diag(b) V_2^T, where Diag(x) denotes a diagonal matrix with its i-th diagonal element being x_i. The gap between the minimal singular value of A and the maximal singular value of B is limited and can be very small. We choose the right-hand side b ∈ IR^n randomly in [0, 10]. The initial point is chosen in the range [0, 1] entry-wise.

Table 2: Numerical results for Problem 5.2

               SN                                           GN
n      ares       itn  time    maxit minit fails    ares       itn  time     maxit minit fails
100    2.884e-07  4.2  0.050   5     4     0        1.920e-07  4.4  0.134    5     4     0
200    4.556e-07  4.3  0.067   5     4     0        2.637e-07  4.6  0.346    5     4     0
300    2.805e-07  4.5  0.172   5     4     0        3.522e-07  4.4  0.615    5     4     0
400    2.453e-07  4.6  0.312   5     4     0        2.617e-07  4.6  0.863    5     4     0
500    1.809e-13  5.0  0.516   5     5     0        1.037e-07  4.8  1.440    5     4     0
600    1.870e-07  4.8  0.680   5     4     0        3.414e-12  5.0  2.346    5     5     0
700    2.550e-13  5.0  0.880   5     5     0        6.571e-08  4.9  3.535    5     4     0
800    2.868e-13  5.0  1.083   5     5     0        1.606e-07  4.8  5.317    5     4     0
900    7.559e-08  4.9  1.201   5     4     0        2.485e-07  4.7  7.596    5     4     0
1000   3.595e-13  5.0  1.572   5     5     0        1.662e-07  4.8  10.552   5     4     0
1500   5.412e-13  5.0  4.196   5     5     0        1.782e-11  5.0  34.400   5     5     0
2000   7.230e-13  5.0  8.962   5     5     0        2.851e-11  5.0  79.108   5     5     0
2500   8.893e-13  5.0  17.207  5     5     0        4.451e-11  5.0  146.769  5     5     0
3000   1.054e-12  5.0  29.175  5     5     0        6.119e-11  5.0  247.029  5     5     0

Problem 5.3. Consider the SOCAVE (2) which is generated in the following way: choose two random matrices A, B ∈ IR^{n×n} with every element drawn from a uniform distribution on [−10, 10]. In order to ensure that the SOCAVE (2) is solvable, we update the matrix A as follows: let [U, S, V] = svd(A). If min{S(i, i)} = 0 for i = 1, · · · , n, we set A = U(S + 0.01E)V, and then

$$A = \frac{\lambda_{\max}(B^T B) + 0.01}{\lambda_{\min}(A^T A)}\, A.$$

We choose b ∈ IR^n randomly with every element in [0, 10]. The initial point is chosen in the range [0, 1] entry-wise.

Problem 5.4. We consider the SOCAVE (2) which is generated in the same way as Problem 5.1, but here the SOC is given by K := K^{n_1} × · · · × K^{n_r}, where n_1 = · · · = n_r = n/r.

The above Problems 5.1–5.4 are all generated randomly. Below, as suggested by the reviewer, we consider a real application problem. It is well known that the second-order cone linear complementarity problem (SOCLCP) has various applications in engineering,


Table 3: Numerical results for Problem 5.3

               SN                                           GN
n      ares       itn  time    maxit minit fails    ares       itn  time    maxit minit fails
100    7.928e-10  3.0  0.048   3     3     0        2.085e-08  3.0  0.075   3     3     0
200    9.461e-10  3.0  0.062   3     3     0        4.297e-09  3.0  0.108   3     3     0
300    2.388e-10  3.0  0.122   3     3     0        5.843e-08  2.9  0.237   3     2     0
400    5.780e-11  3.0  0.236   3     3     0        3.841e-08  2.8  0.379   3     2     0
500    1.133e-08  2.9  0.360   3     2     0        1.183e-09  2.9  0.501   3     2     0
600    2.655e-08  2.9  0.566   3     2     0        1.225e-10  3.0  0.627   3     3     0
700    2.202e-11  3.0  0.807   3     3     0        2.525e-10  3.0  0.978   3     3     0
800    8.893e-08  2.8  0.975   3     2     0        2.563e-10  3.0  1.576   3     3     0
900    1.818e-08  2.9  1.240   3     2     0        2.505e-10  3.0  2.374   3     3     0
1000   6.951e-10  3.0  1.502   3     3     0        3.247e-10  3.0  3.367   3     3     0
1500   4.225e-08  2.9  2.482   3     2     0        4.245e-10  3.0  11.625  3     3     0
2000   6.979e-08  2.6  4.683   3     2     0        1.705e-09  3.0  27.704  3     3     0
2500   9.459e-10  2.9  9.441   3     2     0        1.376e-09  3.0  53.306  3     3     0
3000   5.624e-08  2.9  15.765  3     2     0        1.943e-08  2.8  91.226  3     2     0

control, finance, robust optimization and combinatorial optimization, since the KKT system of a second-order cone program can be recast as an SOCLCP. In general, the SOCLCP is to find x, y ∈ IR^n such that

$$Mx + Py = c, \quad x \in K, \; y \in K, \; x^T y = 0, \qquad (18)$$

where M, P ∈ IR^{n×n} are given matrices and c ∈ IR^n is a given vector. From [16, Theorem 1.1], we know that the SOCLCP (18) is equivalent to the SOCAVE (2). In view of this, the next experiment is on this case.

Problem 5.5. Consider the SOCLCP with P = −I, which is generated in the following way: first, we generate a matrix B and a vector b as those given in Problem 5.1. Then, let d be a random number in [0, 1]. We set M := BB^T + (1 + d)I and c := 0.5(M(b + |b|) + |b| − b) to ensure the solvability of the SOCLCP. We test the above SOCLCP by casting it into an SOCAVE according to [16, Theorem 1.1], i.e., we implement the corresponding SOCAVE with A = M + I, B = M − I and b = 2c. Moreover, the initial point is chosen in the range [0, 1] entry-wise.
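A sketch of this construction (again with hypothetical helper names, reusing `abs_soc` from Section 2):

```python
def make_problem_5_5(n, rng=None):
    """SOCLCP with P = -I cast as an SOCAVE, following Problem 5.5."""
    rng = np.random.default_rng() if rng is None else rng
    Bmat = rng.uniform(-10.0, 10.0, (n, n))
    bvec = rng.uniform(0.0, 1.0, n)
    d = rng.uniform(0.0, 1.0)
    M = Bmat @ Bmat.T + (1.0 + d) * np.eye(n)
    absb = abs_soc(bvec)
    c = 0.5 * (M @ (bvec + absb) + absb - bvec)
    # SOCAVE data per [16, Theorem 1.1]: A = M + I, B = M - I, b = 2c
    return M + np.eye(n), M - np.eye(n), 2.0 * c
```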

In our experiments, every set of simulations for every problem is randomly generated ten times, and the numerical results are listed in Tables 1–5, respectively. In Tables 1–5, n denotes the size of the testing problem; ares denotes the average value of ‖H(z^k)‖ when the test stops; itn denotes the average value of the iteration numbers; time denotes


Table 4: Numerical results for Problem 5.4

                    SN                                           GN
n      r    ares       itn  time    maxit minit fails    ares       itn  time    maxit minit fails
1000   2    9.933e-08  2.4  1.318   3     2     0        2.995e-10  2.9  3.627   3     2     0
1000   4    1.174e-07  2.5  1.245   3     2     0        1.594e-08  2.6  3.106   3     2     0
1000   5    1.056e-07  2.4  1.293   3     2     0        9.657e-08  2.7  3.115   3     2     0
1000   10   3.380e-13  5.0  1.791   5     5     0        3.971e-08  2.5  3.218   3     2     0
1000   20   3.360e-13  5.0  2.103   5     5     0        5.291e-08  2.7  3.181   3     2     0
2000   2    1.971e-08  2.6  5.084   3     2     0        2.494e-08  2.6  28.888  3     2     0
2000   4    1.047e-07  2.3  4.270   3     2     0        5.363e-08  2.6  29.002  3     2     0
2000   5    1.257e-07  2.5  4.813   3     2     0        1.360e-08  2.8  29.055  3     2     0
2000   10   6.689e-13  5.0  10.463  5     5     0        1.360e-08  2.8  29.055  3     3     0
2000   20   6.653e-13  5.0  11.255  5     5     0        1.360e-08  2.8  29.055  3     4     0
3000   2    1.560e-07  2.1  12.312  3     2     0        2.496e-07  2.5  90.699  3     2     0
3000   4    1.162e-07  2.5  14.457  3     2     0        1.609e-07  2.3  89.813  3     2     0
3000   5    3.156e-07  2.2  12.995  3     2     0        6.872e-0   2.4  88.921  3     2     0
3000   10   9.922e-13  5.0  32.011  5     5     0        1.688e-07  2.4  90.041  3     2     0
3000   20   1.016e-12  5.0  33.877  5     5     0        1.411e-08  2.5  88.949  3     2     0

the average value of the CPU time in seconds; maxit and minit denote the maximal and minimal values of the iteration numbers, respectively; and fails denotes the number of failed tests. From the numerical results presented in Tables 1–5, it is easy to see that the proposed smoothing Newton method is effective for solving all the simulated SOCAVE problems. For the SOCLCP, although the smoothing Newton method performs slightly worse than the generalized Newton method, the difference is marginal. To sum up, both approaches are competitive and can be employed to solve the SOCAVE.

5.2 Numerical Comparisons with different values of p

In this subsection, we observe the numerical comparison of Algorithm 4.1 with different values of p. In particular, we consider the performance profile introduced in [44] as a means of comparison. In other words, we regard Algorithm 4.1 corresponding to each p = 1.1, 2, 3, 10, 20, 80 as a solver, and assume that there are n_s solvers and n_q test problems from the test set P, which is generated randomly. We are interested in using the computing time as a performance measure for Algorithm 4.1 with different p. For each problem q and solver s, let

f_{q,s} = computing time required to solve problem q by solver s.

We employ the performance ratio

$$r_{q,s} := \frac{f_{q,s}}{\min\{f_{q,s} : s \in S\}},$$
