Applied Numerical Mathematics, vol. 135, January, pp. 206-227, 2019
Unified smoothing functions for absolute value equation associated with second-order cone
Chieu Thanh Nguyen 1
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

B. Saheya 2
College of Mathematical Science, Inner Mongolia Normal University, Hohhot 010022, Inner Mongolia, P. R. China

Yu-Lin Chang 3
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

Jein-Shan Chen 4
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
February 19, 2018 (revised on July 12, 2018)
Abstract. In this paper, we explore a unified way to construct smoothing functions for solving the absolute value equation associated with second-order cone (SOCAVE). Numerical comparisons are presented, which illustrate what kinds of smoothing functions work well along with the smoothing Newton algorithm. In particular, the numerical experiments show that the well-known loss function widely used in the engineering community is the worst one among the constructed smoothing functions, which indicates that the other proposed smoothing functions can be employed for solving engineering problems.

1 E-mail: thanhchieu90@gmail.com.
2 E-mail: saheya@imnu.edu.cn. The author's work is supported by Natural Science Foundation of Inner Mongolia (Award Number: 2017MS0125).
3 E-mail: ylchang@math.ntnu.edu.tw.
4 Corresponding author. E-mail: jschen@math.ntnu.edu.tw. The author's work is supported by Ministry of Science and Technology, Taiwan.
Keywords. Second-order cone, absolute value equations, smoothing Newton algorithm.
1 Introduction
Recently, the paper [36] investigated a family of smoothing functions along with a smoothing-type algorithm to tackle the absolute value equation associated with second-order cone (SOCAVE) and showed the efficiency of such an approach. Motivated by this article, we continue to ask two natural questions. (i) Are there other suitable smoothing functions that can be employed for solving the SOCAVE? (ii) Is there a unified way to construct smoothing functions for solving the SOCAVE? In this paper, we provide affirmative answers to these two queries. In order to smoothly convey the story of how we figure out the answers, we begin by recalling where the SOCAVE comes from.
The standard absolute value equation (AVE) is in the form of
$$Ax + B|x| = b, \qquad (1)$$
where $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times n}$, $B \neq 0$, and $b \in \mathbb{R}^n$. Here $|x|$ means the componentwise absolute value of the vector $x \in \mathbb{R}^n$. When $B = -I$, where $I$ is the identity matrix, the AVE (1) reduces to the special form:
$$Ax - |x| = b.$$
It is known that the AVE (1) was first introduced by Rohn in [41], while the term was coined by Mangasarian [34]. During the past decade, many researchers have paid attention to this equation, for example, Caccetta, Qu and Zhou [2], Hu and Huang [12], Jiang and Zhang [20], Ketabchi and Moosaei [21], Mangasarian [26, 27, 28, 29, 30, 31, 32, 33], Mangasarian and Meyer [34], Prokopyev [37], and Rohn [43].
We elaborate more about the developments of the AVE. Mangasarian and Meyer [34]
show that the AVE (1) is equivalent to the bilinear program, the generalized LCP (linear complementarity problem), and the standard LCP provided 1 is not an eigenvalue of $A$. With these equivalent reformulations, they also show that the AVE (1) is NP-hard in its general form and provide existence results. Prokopyev [37] further improves the above equivalence, which indicates that the AVE (1) can be equivalently recast as an LCP without any assumption on $A$ and $B$, and also provides a relationship with mixed integer programming. In general, if solvable, the AVE (1) can have either a unique solution or
multiple (e.g., exponentially many) solutions. Indeed, various sufficiency conditions on solvability and non-solvability of the AVE (1) with unique and multiple solutions are discussed in [34, 37, 42]. Some variants of the AVE, like the absolute value equation associated with second-order cone and the absolute value programs, are investigated in [14] and [46], respectively.
Recently, another type of absolute value equation, a natural extension of the standard AVE (1), has been considered in [14, 35, 36]. More specifically, the following absolute value equation associated with second-order cones, abbreviated as SOCAVE, is studied:
$$Ax + B|x| = b, \qquad (2)$$
where $A, B \in \mathbb{R}^{n\times n}$ and $b \in \mathbb{R}^n$ are the same as those in (1); $|x|$ denotes the absolute value of $x$ coming from the square root of the Jordan product "$\circ$" of $x$ with itself. What is the difference between the standard AVE (1) and the SOCAVE (2)? Their mathematical formats look the same. In fact, the main difference is that $|x|$ in the standard AVE (1) means the componentwise absolute value $|x_i|$ of each $x_i \in \mathbb{R}$, i.e., $|x| = (|x_1|, |x_2|, \cdots, |x_n|)^T \in \mathbb{R}^n$; however, $|x|$ in the SOCAVE (2) denotes the vector satisfying $|x| := \sqrt{x^2} = \sqrt{x \circ x}$ associated with second-order cone under the Jordan product. To understand its meaning, we need to introduce the definition of the second-order cone (SOC). The second-order cone in $\mathbb{R}^n$ ($n \ge 1$), also called the Lorentz cone, is defined as
$$\mathcal{K}^n := \left\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \;\middle|\; \|x_2\| \le x_1 \right\},$$
where $\|\cdot\|$ denotes the Euclidean norm. If $n = 1$, then $\mathcal{K}^n$ is the set of nonnegative reals $\mathbb{R}_+$. In general, a general second-order cone $\mathcal{K}$ could be the Cartesian product of SOCs, i.e.,
$$\mathcal{K} := \mathcal{K}^{n_1} \times \cdots \times \mathcal{K}^{n_r}.$$
For simplicity, we focus on the single SOC $\mathcal{K}^n$ because all the analysis can be carried over to the setting of Cartesian products. The SOC is a special case of symmetric cones and can be analyzed under the Jordan product; see [8]. In particular, for any two vectors $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ and $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the Jordan product of $x$ and $y$ associated with $\mathcal{K}^n$ is defined as
$$x \circ y := \begin{bmatrix} x^T y \\ y_1 x_2 + x_1 y_2 \end{bmatrix}.$$
The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of optimization problems involving SOC; see [4, 6, 9] and references therein for more details. The identity element under this Jordan product is $e = (1, 0, \cdots, 0)^T \in \mathbb{R}^n$. With these definitions, $x^2$ means the Jordan product of $x$ with itself, i.e., $x^2 := x \circ x$; and $\sqrt{x}$ with $x \in \mathcal{K}^n$ denotes the unique vector such that $\sqrt{x} \circ \sqrt{x} = x$. In other words, the vector $|x|$ in the SOCAVE (2) is computed by
$$|x| := \sqrt{x \circ x}.$$
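To make the Jordan-algebraic operations concrete, the following minimal sketch (ours, not part of the original development; the helper name `jordan_product` is an assumption) implements $x \circ y$ and illustrates the identity element and the failure of associativity numerically.

```python
import numpy as np

def jordan_product(x, y):
    """Jordan product x o y associated with the second-order cone K^n."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

rng = np.random.default_rng(0)
x, y, z = rng.standard_normal((3, 4))
e = np.array([1.0, 0.0, 0.0, 0.0])            # identity element e = (1, 0, ..., 0)^T
assert np.allclose(jordan_product(e, x), x)   # e o x = x
# the product is commutative but, in general, NOT associative:
lhs = jordan_product(jordan_product(x, y), z)
rhs = jordan_product(x, jordan_product(y, z))
print(np.linalg.norm(lhs - rhs))              # typically nonzero
```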
As remarked in the literature, the significance of the AVE (1) arises from the fact that the AVE is capable of formulating many optimization problems such as linear programs, quadratic programs, bimatrix games, and so on. Likewise, the SOCAVE (2) plays a similar role in various optimization problems involving second-order cones. Many numerical methods have been proposed for solving the standard AVE (1) and the SOCAVE (2); please refer to [36] for a quick review. Basically, we follow the smoothing Newton algorithm employed in [36] to deal with the SOCAVE (2). This kind of algorithm has been a powerful tool for solving many other optimization problems, including symmetric cone complementarity problems [11, 23, 24], the system of inequalities under the order induced by symmetric cone [18, 25, 47], and so on. It is also employed for the standard AVE (1) in [19, 44]. The new upshot of this paper lies in discovering more suitable smoothing functions and exploring a unified way to construct smoothing functions. Of course, the numerical performance of the different smoothing functions is compared.
These are totally new to the literature and are the main contribution of this paper.
To close this section, we recall some basic concepts and background materials regarding the second-order cone, which will be used in the subsequent analysis. More details can be found in [4, 6, 8, 9, 14]. First, we recall the expression of the spectral decomposition of $x$ with respect to SOC. For $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, the spectral decomposition of $x$ with respect to SOC is given by
$$x = \lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}, \qquad (3)$$
where $\lambda_i(x) = x_1 + (-1)^i \|x_2\|$ for $i = 1, 2$ and
$$u_x^{(i)} = \begin{cases} \dfrac{1}{2} \left( 1, \; (-1)^i \dfrac{x_2^T}{\|x_2\|} \right)^T & \text{if } \|x_2\| \neq 0, \\[2mm] \dfrac{1}{2} \left( 1, \; (-1)^i \omega^T \right)^T & \text{if } \|x_2\| = 0, \end{cases} \qquad (4)$$
with $\omega \in \mathbb{R}^{n-1}$ being any vector satisfying $\|\omega\| = 1$. The two scalars $\lambda_1(x)$ and $\lambda_2(x)$ are called the spectral values of $x$, while the two vectors $u_x^{(1)}$ and $u_x^{(2)}$ are called the spectral vectors of $x$. Moreover, it is obvious that the spectral decomposition of $x \in \mathbb{R}^n$ is unique if $x_2 \neq 0$. It is known that the spectral values and spectral vectors possess the following properties:
(i) $u_x^{(1)} \circ u_x^{(2)} = 0$ and $u_x^{(i)} \circ u_x^{(i)} = u_x^{(i)}$ for $i = 1, 2$;

(ii) $\|u_x^{(1)}\|^2 = \|u_x^{(2)}\|^2 = \frac{1}{2}$ and $\|x\|^2 = \frac{1}{2}\left(\lambda_1^2(x) + \lambda_2^2(x)\right)$.
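The spectral decomposition (3)-(4) is straightforward to compute. The sketch below (ours; the helper name `spectral_decomposition` is an assumption) implements it and checks properties (i)-(ii) numerically, reusing `jordan_product` from the earlier sketch.

```python
import numpy as np

def spectral_decomposition(x):
    """Spectral values and vectors of x = (x1, x2) w.r.t. K^n, as in (3)-(4)."""
    x1, x2 = x[0], x[1:]
    nrm = np.linalg.norm(x2)
    if nrm > 0:
        w = x2 / nrm
    else:
        w = np.zeros_like(x2); w[0] = 1.0     # any unit vector works when x2 = 0
    lam1, lam2 = x1 - nrm, x1 + nrm           # lambda_i(x) = x1 + (-1)^i ||x2||
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return lam1, lam2, u1, u2

x = np.array([1.0, 2.0, -2.0, 0.5])
lam1, lam2, u1, u2 = spectral_decomposition(x)
assert np.allclose(lam1 * u1 + lam2 * u2, x)                   # decomposition (3)
assert np.allclose(jordan_product(u1, u2), 0)                  # property (i)
assert np.allclose(jordan_product(u1, u1), u1)                 # property (i)
assert np.isclose(u1 @ u1, 0.5) and np.isclose(u2 @ u2, 0.5)   # property (ii)
```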
Next is the concept of the projection onto the second-order cone. Let $x_+$ denote the projection of $x$ onto $\mathcal{K}^n$, and $x_-$ be the projection of $-x$ onto the dual cone $(\mathcal{K}^n)^*$ of $\mathcal{K}^n$, where the dual cone $(\mathcal{K}^n)^*$ is defined by $(\mathcal{K}^n)^* := \{ y \in \mathbb{R}^n \mid \langle x, y \rangle \ge 0, \; \forall x \in \mathcal{K}^n \}$. In fact, the dual cone of $\mathcal{K}^n$ is itself, i.e., $(\mathcal{K}^n)^* = \mathcal{K}^n$. Due to the special structure of $\mathcal{K}^n$, the explicit formula for the projection of $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ onto $\mathcal{K}^n$ is obtained in [4, 6, 8, 9, 10] as below:
$$x_+ = \begin{cases} x & \text{if } x \in \mathcal{K}^n, \\ 0 & \text{if } x \in -\mathcal{K}^n, \\ u & \text{otherwise}, \end{cases} \qquad \text{where } u = \begin{bmatrix} \dfrac{x_1 + \|x_2\|}{2} \\[2mm] \dfrac{x_1 + \|x_2\|}{2} \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$
Similarly, the expression of $x_-$ can be written out as
$$x_- = \begin{cases} 0 & \text{if } x \in \mathcal{K}^n, \\ -x & \text{if } x \in -\mathcal{K}^n, \\ w & \text{otherwise}, \end{cases} \qquad \text{where } w = \begin{bmatrix} -\dfrac{x_1 - \|x_2\|}{2} \\[2mm] \dfrac{x_1 - \|x_2\|}{2} \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$
It is easy to verify that $x = x_+ - x_-$ and
$$x_+ = (\lambda_1(x))_+ u_x^{(1)} + (\lambda_2(x))_+ u_x^{(2)}, \qquad x_- = (-\lambda_1(x))_+ u_x^{(1)} + (-\lambda_2(x))_+ u_x^{(2)},$$
where $(\alpha)_+ = \max\{0, \alpha\}$ for $\alpha \in \mathbb{R}$. As for the expression of $|x|$ associated with SOC, there is an alternative way via the so-called SOC-function to obtain it, which can be found in [3, 5]. In any case, it comes out that
$$|x| = \left[ (\lambda_1(x))_+ + (-\lambda_1(x))_+ \right] u_x^{(1)} + \left[ (\lambda_2(x))_+ + (-\lambda_2(x))_+ \right] u_x^{(2)} = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)}.$$
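Using the spectral formulas above, the projections and $|x|$ can be sketched in a few lines (ours; `soc_plus` and `soc_abs` are assumed helper names, and `spectral_decomposition` and `jordan_product` come from the earlier sketches).

```python
import numpy as np

def soc_plus(x):
    """Projection x_+ of x onto K^n via the spectral values/vectors."""
    lam1, lam2, u1, u2 = spectral_decomposition(x)
    return max(lam1, 0.0) * u1 + max(lam2, 0.0) * u2

def soc_abs(x):
    """|x| = |lambda_1(x)| u1 + |lambda_2(x)| u2 associated with K^n."""
    lam1, lam2, u1, u2 = spectral_decomposition(x)
    return abs(lam1) * u1 + abs(lam2) * u2

x = np.array([0.5, 1.0, -1.5])
x_plus, x_minus = soc_plus(x), soc_plus(-x)       # x_- is the projection of -x
assert np.allclose(x, x_plus - x_minus)           # x = x_+ - x_-
assert np.allclose(soc_abs(x), x_plus + x_minus)  # |x| = x_+ + x_-
assert np.allclose(jordan_product(soc_abs(x), soc_abs(x)),
                   jordan_product(x, x))          # |x| o |x| = x o x
```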
2 Unified smoothing functions for SOCAVE
As mentioned in Section 1, we employ the smoothing Newton method for solving the SOCAVE (2), which needs a smoothing function to work with. Indeed, a family of smoothing functions was already considered in [36]. In this section, we look into what kinds of smoothing functions can be employed to work with the smoothing Newton algorithm for solving the SOCAVE (2).
Definition 2.1. A function $\phi : \mathbb{R}_{++} \times \mathbb{R} \to \mathbb{R}$ is called a smoothing function of $|t|$ if it satisfies the following:

(i) $\phi$ is continuously differentiable at $(\mu, t) \in \mathbb{R}_{++} \times \mathbb{R}$;

(ii) $\lim_{\mu \downarrow 0} \phi(\mu, t) = |t|$ for any $t \in \mathbb{R}$.
Given a smoothing function $\phi$, we further define a vector-valued function $\Phi : \mathbb{R}_{++} \times \mathbb{R}^n \to \mathbb{R}^n$ as
$$\Phi(\mu, x) = \phi(\mu, \lambda_1(x))\, u_x^{(1)} + \phi(\mu, \lambda_2(x))\, u_x^{(2)}, \qquad (5)$$
where $\mu \in \mathbb{R}_{++}$ is a parameter, $\lambda_1(x), \lambda_2(x)$ are the spectral values of $x$, and $u_x^{(1)}, u_x^{(2)}$ are the spectral vectors of $x$. Consequently, $\Phi$ is also smooth on $\mathbb{R}_{++} \times \mathbb{R}^n$. Moreover, it is easy to verify that
$$\lim_{\mu \downarrow 0} \Phi(\mu, x) = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)} = |x|,$$
which means each function $\Phi(\mu, x)$ serves as a smoothing function of $|x|$ associated with SOC. With this observation, for the SOCAVE (2), we further define the function $H : \mathbb{R}_{++} \times \mathbb{R}^n \to \mathbb{R} \times \mathbb{R}^n$ by
$$H(\mu, x) = \begin{bmatrix} \mu \\ Ax + B\Phi(\mu, x) - b \end{bmatrix}, \qquad \forall \mu \in \mathbb{R}_{++} \text{ and } x \in \mathbb{R}^n. \qquad (6)$$
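The constructions (5)-(6) translate directly into code. Below is a minimal sketch (ours; `Phi` and `H_map` are assumed names, and `spectral_decomposition` and `soc_abs` come from the earlier sketches), using for illustration the smoothing function later labeled (17):

```python
import numpy as np

def Phi(phi, mu, x):
    """The vector-valued smoothing function (5)."""
    lam1, lam2, u1, u2 = spectral_decomposition(x)
    return phi(mu, lam1) * u1 + phi(mu, lam2) * u2

def H_map(phi, mu, x, A, B, b):
    """The map (6): H(mu, x) = (mu, A x + B Phi(mu, x) - b)."""
    return np.concatenate(([mu], A @ x + B @ Phi(phi, mu, x) - b))

phi3 = lambda mu, t: np.sqrt(4.0 * mu**2 + t**2)    # the function (17) below
x = np.array([0.5, 1.0, -1.5])
# Phi(mu, x) -> |x| as mu -> 0, so H(mu, x) -> (0, A x + B |x| - b):
assert np.allclose(Phi(phi3, 1e-9, x), soc_abs(x), atol=1e-6)
```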
Proposition 2.1. Suppose that $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ has the spectral decomposition as in (3)-(4). Let $H : \mathbb{R}_{++} \times \mathbb{R}^n \to \mathbb{R} \times \mathbb{R}^n$ be defined as in (6). Then,

(a) $H(\mu, x) = 0$ if and only if $x$ solves the SOCAVE (2);

(b) $H$ is continuously differentiable at $(\mu, x) \in \mathbb{R}_{++} \times \mathbb{R}^n$ with the Jacobian matrix given by
$$H'(\mu, x) = \begin{bmatrix} 1 & 0 \\[1mm] B \dfrac{\partial \Phi(\mu, x)}{\partial \mu} & A + B \dfrac{\partial \Phi(\mu, x)}{\partial x} \end{bmatrix}, \qquad (7)$$
where
$$\frac{\partial \Phi(\mu, x)}{\partial \mu} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \mu} u_x^{(1)} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \mu} u_x^{(2)},$$
$$\frac{\partial \Phi(\mu, x)}{\partial x} = \begin{cases} \dfrac{\partial \phi(\mu, x_1)}{\partial x_1} I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b & c \dfrac{x_2^T}{\|x_2\|} \\[2mm] c \dfrac{x_2}{\|x_2\|} & a I + (b - a) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$
with
$$a = \frac{\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \qquad b = \frac{1}{2} \left[ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} + \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right], \qquad c = \frac{1}{2} \left[ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right]. \qquad (8)$$
Proof. (a) First, we observe that
$$H(\mu, x) = 0 \iff \mu = 0 \;\text{ and }\; Ax + B\Phi(\mu, x) - b = 0 \iff \mu = 0 \;\text{ and }\; Ax + B|x| - b = 0.$$
This indicates that $x$ is a solution to the SOCAVE (2) if and only if $(\mu, x)$ is a solution to $H(\mu, x) = 0$.
(b) Since $\Phi(\mu, x)$ is continuously differentiable on $\mathbb{R}_{++} \times \mathbb{R}^n$, it is clear that $H(\mu, x)$ is continuously differentiable on $\mathbb{R}_{++} \times \mathbb{R}^n$. Thus, it remains to compute the Jacobian matrix of $H(\mu, x)$. Note that, writing $x_2 := (\bar{x}_2, \cdots, \bar{x}_n)^T \in \mathbb{R}^{n-1}$,
$$\Phi(\mu, x) = \phi(\mu, \lambda_1(x)) u_x^{(1)} + \phi(\mu, \lambda_2(x)) u_x^{(2)} = \begin{cases} \dfrac{1}{2} \begin{bmatrix} \phi(\mu, \lambda_1(x)) + \phi(\mu, \lambda_2(x)) \\ \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \frac{\bar{x}_2}{\|x_2\|} \\ \vdots \\ \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \frac{\bar{x}_n}{\|x_2\|} \end{bmatrix} & \text{if } x_2 \neq 0, \\[6mm] \dfrac{1}{2} \begin{bmatrix} \phi(\mu, \lambda_1(x)) + \phi(\mu, \lambda_2(x)) \\ 0 \\ \vdots \\ 0 \end{bmatrix} & \text{if } x_2 = 0, \end{cases}$$
where, in the case $x_2 = 0$, the entries $\left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \omega_i$ with $\omega = (\omega_2, \cdots, \omega_n)^T \in \mathbb{R}^{n-1}$ vanish because $\lambda_1(x) = \lambda_2(x) = x_1$. From the chain rule, it is straightforward that
$$\frac{\partial \Phi(\mu, x)}{\partial \mu} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \mu} u_x^{(1)} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \mu} u_x^{(2)}.$$
In order to compute $\frac{\partial \Phi(\mu, x)}{\partial x}$, for simplicity, we denote
$$\Phi(\mu, x) := \frac{1}{2} \begin{bmatrix} \tau_1(\mu, x) \\ \tau_2(\mu, x) \\ \vdots \\ \tau_n(\mu, x) \end{bmatrix}.$$
To proceed, we discuss two cases.
(i) For $x_2 \neq 0$, we compute
$$\frac{\partial \tau_1(\mu, x)}{\partial x_1} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \lambda_1(x)} \frac{\partial \lambda_1(x)}{\partial x_1} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \lambda_2(x)} \frac{\partial \lambda_2(x)}{\partial x_1} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \lambda_1(x)} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \lambda_2(x)} := 2b$$
and
$$\frac{\partial \tau_1(\mu, x)}{\partial \bar{x}_i} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \bar{x}_i} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \bar{x}_i} = -\frac{\partial \phi(\mu, \lambda_1(x))}{\partial \lambda_1(x)} \frac{\bar{x}_i}{\|x_2\|} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \lambda_2(x)} \frac{\bar{x}_i}{\|x_2\|} = \left[ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right] \frac{\bar{x}_i}{\|x_2\|} := 2c \frac{\bar{x}_i}{\|x_2\|}, \quad i = 2, \cdots, n.$$
Moreover,
$$\frac{\partial \tau_i(\mu, x)}{\partial x_1} = \left[ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right] \frac{\bar{x}_i}{\|x_2\|} = 2c \frac{\bar{x}_i}{\|x_2\|}, \quad i = 2, \cdots, n.$$
Similarly, we have
$$\begin{aligned} \frac{\partial \tau_2(\mu, x)}{\partial \bar{x}_2} &= \left[ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \bar{x}_2} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \bar{x}_2} \right] \frac{\bar{x}_2}{\|x_2\|} + \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \frac{\partial}{\partial \bar{x}_2} \left( \frac{\bar{x}_2}{\|x_2\|} \right) \\ &= 2b \frac{\bar{x}_2 \cdot \bar{x}_2}{\|x_2\|^2} + \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \left( \frac{1}{\|x_2\|} - \frac{\bar{x}_2 \cdot \bar{x}_2}{\|x_2\|^3} \right) \\ &= 2a + 2(b - a) \frac{\bar{x}_2 \cdot \bar{x}_2}{\|x_2\|^2}, \end{aligned}$$
where $a := \dfrac{\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}$. In general, mimicking the same derivation yields
$$\frac{\partial \tau_i(\mu, x)}{\partial \bar{x}_j} = \begin{cases} 2a + 2(b - a) \dfrac{\bar{x}_i \cdot \bar{x}_i}{\|x_2\|^2} & \text{if } i = j, \\[2mm] 2(b - a) \dfrac{\bar{x}_i \cdot \bar{x}_j}{\|x_2\|^2} & \text{if } i \neq j. \end{cases}$$
To sum up, we obtain
$$\frac{\partial \Phi(\mu, x)}{\partial x} = \begin{bmatrix} b & c \dfrac{x_2^T}{\|x_2\|} \\[2mm] c \dfrac{x_2}{\|x_2\|} & a I + (b - a) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix},$$
which is the desired result.
(ii) For $x_2 = 0$, it is clear to see that
$$\frac{\partial \tau_1(\mu, x)}{\partial x_1} = 2 \frac{\partial \phi(\mu, x_1)}{\partial x_1} \quad \text{and} \quad \frac{\partial \tau_1(\mu, x)}{\partial \bar{x}_i} = 0 \;\text{ for } i = 2, \cdots, n.$$
Since $\tau_i(\mu, x) = 0$ for $i = 2, \cdots, n$, it gives $\frac{\partial \tau_i(\mu, x)}{\partial x_1} = 0$. Moreover,
$$\begin{aligned} \frac{\partial \tau_2(\mu, x)}{\partial \bar{x}_2} &= \lim_{\bar{x}_2 \to 0} \frac{\tau_2(\mu, x_1, \bar{x}_2, 0, \cdots, 0) - \tau_2(\mu, x_1, 0, \cdots, 0)}{\bar{x}_2} \\ &= \lim_{\bar{x}_2 \to 0} \frac{\phi(\mu, x_1 + |\bar{x}_2|) - \phi(\mu, x_1 - |\bar{x}_2|)}{\bar{x}_2} \cdot \frac{\bar{x}_2}{|\bar{x}_2|} \\ &= \lim_{\bar{x}_2 \to 0} \frac{\phi(\mu, x_1 + |\bar{x}_2|) - \phi(\mu, x_1 - |\bar{x}_2|)}{|\bar{x}_2|} \\ &= \lim_{\bar{x}_2 \to 0} \left[ \frac{\partial \phi(\mu, x_1 + |\bar{x}_2|)}{\partial |\bar{x}_2|} - \frac{\partial \phi(\mu, x_1 - |\bar{x}_2|)}{\partial |\bar{x}_2|} \right] \quad (\text{by L'H\^{o}pital's rule}) \\ &= \lim_{\bar{x}_2 \to 0} \left[ \frac{\partial \phi(\mu, x_1 + |\bar{x}_2|)}{\partial (x_1 + |\bar{x}_2|)} + \frac{\partial \phi(\mu, x_1 - |\bar{x}_2|)}{\partial (x_1 - |\bar{x}_2|)} \right] = 2 \frac{\partial \phi(\mu, x_1)}{\partial x_1}. \end{aligned}$$
Thus, we obtain
$$\frac{\partial \tau_i(\mu, x)}{\partial \bar{x}_j} = \begin{cases} 2 \dfrac{\partial \phi(\mu, x_1)}{\partial x_1} & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}$$
which is equivalent to saying
$$\frac{\partial \Phi(\mu, x)}{\partial x} = \frac{\partial \phi(\mu, x_1)}{\partial x_1} I.$$
From all the above, we conclude that
$$\frac{\partial \Phi(\mu, x)}{\partial x} = \begin{cases} \dfrac{\partial \phi(\mu, x_1)}{\partial x_1} I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b & c \dfrac{x_2^T}{\|x_2\|} \\[2mm] c \dfrac{x_2}{\|x_2\|} & a I + (b - a) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0. \end{cases}$$
Thus, the proof is complete. $\Box$
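The Jacobian formula in Proposition 2.1(b) is easy to validate numerically. The sketch below (ours; it reuses `spectral_decomposition` and `Phi` from the earlier sketches and checks only the generic branch $x_2 \neq 0$) compares the closed form against central finite differences for the smoothing function (17).

```python
import numpy as np

def dPhi_dx(phi, dphi_dt, mu, x):
    """Closed-form Jacobian of Phi w.r.t. x (Proposition 2.1(b), x_2 != 0 branch)."""
    lam1, lam2, _, _ = spectral_decomposition(x)
    w = x[1:] / np.linalg.norm(x[1:])
    a = (phi(mu, lam2) - phi(mu, lam1)) / (lam2 - lam1)
    b = 0.5 * (dphi_dt(mu, lam2) + dphi_dt(mu, lam1))
    c = 0.5 * (dphi_dt(mu, lam2) - dphi_dt(mu, lam1))
    n = len(x)
    J = np.empty((n, n))
    J[0, 0], J[0, 1:], J[1:, 0] = b, c * w, c * w
    J[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return J

phi3  = lambda mu, t: np.sqrt(4.0 * mu**2 + t**2)      # smoothing function (17)
dphi3 = lambda mu, t: t / np.sqrt(4.0 * mu**2 + t**2)  # its derivative in t
mu, x = 0.3, np.array([0.4, 1.2, -0.7])
J = dPhi_dx(phi3, dphi3, mu, x)
J_fd = np.empty_like(J)                                # finite-difference Jacobian
eps = 1e-6
for j in range(len(x)):
    d = np.zeros(len(x)); d[j] = eps
    J_fd[:, j] = (Phi(phi3, mu, x + d) - Phi(phi3, mu, x - d)) / (2 * eps)
print(np.max(np.abs(J - J_fd)))   # tiny, at finite-difference accuracy
```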
Now, we are ready to answer the question about what kind of smoothing functions can be adopted in the smoothing-type algorithm. Two technical lemmas are needed toward the answer.
Lemma 2.1. Suppose that $M, N \in \mathbb{R}^{n \times n}$. Let $\sigma_{\min}(M)$ denote the minimum singular value of $M$, and $\sigma_{\max}(N)$ denote the maximum singular value of $N$. Then, the following hold.

(a) $\sigma_{\min}(M) > \sigma_{\max}(N)$ if and only if $\sigma_{\min}(M^T M) > \sigma_{\max}(N^T N)$.

(b) If $\sigma_{\min}(M^T M) > \sigma_{\max}(N^T N)$, then $M^T M - N^T N$ is positive definite.

Proof. The proof is straightforward and can be found in standard textbooks on matrix analysis, so we omit it here. $\Box$
Lemma 2.2. Let $A, S \in \mathbb{R}^{n \times n}$ and let $A$ be symmetric. Suppose that the eigenvalues of $A$ and $S S^T$ are arranged in non-increasing order. Then, for each $k = 1, 2, \cdots, n$, there exists a nonnegative real number $\theta_k$ such that
$$\lambda_{\min}(S S^T) \le \theta_k \le \lambda_{\max}(S S^T) \quad \text{and} \quad \lambda_k(S A S^T) = \theta_k \lambda_k(A).$$

Proof. Please see [15, Corollary 4.5.11] for a proof. $\Box$
We point out that the crucial key, which guarantees that a smoothing function can be employed in the smoothing-type algorithm, is the nonsingularity of the Jacobian matrix $H'(\mu, x)$ given in (7). Below, we provide a condition under which the Jacobian matrix $H'(\mu, x)$ is nonsingular.
Theorem 2.1. Consider the SOCAVE (2) with $\sigma_{\min}(A) > \sigma_{\max}(B)$. Let $H$ be defined as in (6). Suppose that $\phi : \mathbb{R}_{++} \times \mathbb{R} \to \mathbb{R}$ is a smoothing function of $|t|$. If $-1 \le \frac{d}{dt}\phi(\mu, t) \le 1$ is satisfied, then the Jacobian matrix $H'(\mu, x)$ is nonsingular for any $\mu > 0$.
Proof. From the expression of $H'(\mu, x)$ given as in (7), we know that $H'(\mu, x)$ is nonsingular if and only if the matrix $A + B \frac{\partial \Phi(\mu, x)}{\partial x}$ is nonsingular. Thus, it suffices to show that the matrix $A + B \frac{\partial \Phi(\mu, x)}{\partial x}$ is nonsingular under the stated conditions. Suppose not; that is, there exists a vector $0 \neq v \in \mathbb{R}^n$ such that
$$\left( A + B \frac{\partial \Phi(\mu, x)}{\partial x} \right) v = 0,$$
which implies that
$$v^T A^T A v = v^T \left( \frac{\partial \Phi(\mu, x)}{\partial x} \right)^T B^T B \, \frac{\partial \Phi(\mu, x)}{\partial x} \, v. \qquad (9)$$
For convenience, we denote $C := \frac{\partial \Phi(\mu, x)}{\partial x}$. Then, it follows that $v^T A^T A v = v^T C^T B^T B C v$. Applying Lemma 2.2, there exists a constant $\hat{\theta}$ such that
$$\lambda_{\min}(C^T C) \le \hat{\theta} \le \lambda_{\max}(C^T C) \quad \text{and} \quad \lambda_{\max}(C^T B^T B C) = \hat{\theta} \, \lambda_{\max}(B^T B).$$
Note that if we can prove that
$$0 \le \lambda_{\min}(C^T C) \le \lambda_{\max}(C^T C) \le 1,$$
we will have $\lambda_{\max}(C^T B^T B C) \le \lambda_{\max}(B^T B)$. Then, by the assumption that the minimum singular value of $A$ strictly exceeds the maximum singular value of $B$ (i.e., $\sigma_{\min}(A) > \sigma_{\max}(B)$) and applying Lemma 2.1, we obtain $v^T A^T A v > v^T C^T B^T B C v$. This contradicts the identity (9), which shows that the Jacobian matrix $H'(\mu, x)$ is nonsingular for $\mu > 0$.
Thus, in light of the above discussion, it suffices to claim that $0 \le \lambda_{\min}(C^T C) \le \lambda_{\max}(C^T C) \le 1$. To this end, we discuss two cases.

Case 1: For $x_2 = 0$, we compute that $C = \frac{\partial \phi(\mu, x_1)}{\partial x_1} I$. Since $-1 \le \frac{\partial \phi(\mu, x_1)}{\partial x_1} \le 1$, it is clear that $0 \le \lambda(C^T C) \le 1$ for $\mu > 0$. Then, the claim is done.
Case 2: For $x_2 \neq 0$, using the fact that the matrix $M^T M$ is always positive semidefinite for any matrix $M \in \mathbb{R}^{m \times n}$, we see that the inequality $\lambda_{\min}(C^T C) \ge 0$ always holds. In order to prove $\lambda_{\max}(C^T C) \le 1$, we need to further argue that the matrix $I - C^T C$ is positive semidefinite. First, we write out
$$I - C^T C = \begin{bmatrix} 1 - b^2 - c^2 & -2bc \dfrac{x_2^T}{\|x_2\|} \\[2mm] -2bc \dfrac{x_2}{\|x_2\|} & (1 - a^2) I + (a^2 - b^2 - c^2) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}.$$
If $-1 < \frac{\partial \phi(\mu, \lambda_i(x))}{\partial x_1} < 1$ for $i = 1, 2$, then we obtain
$$b^2 + c^2 = \frac{1}{2} \left[ \left( \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right)^2 + \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} \right)^2 \right] < 1.$$
This indicates that $1 - b^2 - c^2 > 0$. By considering $[1 - b^2 - c^2]$ as a $1 \times 1$ matrix, this says $[1 - b^2 - c^2]$ is positive definite. Hence, its Schur complement can be computed as below:
$$(1 - a^2) I + (a^2 - b^2 - c^2) \frac{x_2 x_2^T}{\|x_2\|^2} - \frac{4b^2 c^2}{1 - b^2 - c^2} \frac{x_2 x_2^T}{\|x_2\|^2} = (1 - a^2) \left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) + \left( 1 - b^2 - c^2 - \frac{4b^2 c^2}{1 - b^2 - c^2} \right) \frac{x_2 x_2^T}{\|x_2\|^2}. \qquad (10)$$
On the other hand, by the Mean Value Theorem, we have
$$\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) = \frac{\partial \phi(\mu, \xi)}{\partial \xi} \left( \lambda_2(x) - \lambda_1(x) \right),$$
where $\xi \in (\lambda_1(x), \lambda_2(x))$. To proceed, we need to further discuss two subcases.
(1) When $-1 < \frac{\partial \phi(\mu, \xi)}{\partial \xi} < 1$, we know $|\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))| < |\lambda_2(x) - \lambda_1(x)|$. This together with (8) implies that $1 - a^2 > 0$ for any $\mu > 0$. In addition, for any $\mu > 0$, we observe that
$$(1 - b^2 - c^2)^2 - 4b^2 c^2 = \left( 1 - (b - c)^2 \right) \left( 1 - (b + c)^2 \right) = \left[ 1 - \left( \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right)^2 \right] \cdot \left[ 1 - \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} \right)^2 \right] > 0.$$
With all of these, we verify that the Schur complement (10) of $[1 - b^2 - c^2]$ is a positive combination of the matrices $I - \frac{x_2 x_2^T}{\|x_2\|^2}$ and $\frac{x_2 x_2^T}{\|x_2\|^2}$, and is therefore positive semidefinite. Hence, the matrix $I - C^T C$ is also positive semidefinite, which is equivalent to saying $0 \le \lambda_{\min}(C^T C) \le \lambda_{\max}(C^T C) \le 1$.
(2) When $\frac{\partial \phi(\mu, \xi)}{\partial \xi} = \pm 1$, we have
$$1 - a^2 = 0 \quad \text{and} \quad (1 - b^2 - c^2)^2 - 4b^2 c^2 > 0.$$
Since the matrix $\frac{x_2 x_2^T}{\|x_2\|^2}$ is positive semidefinite, the matrix $I - C^T C$ is positive semidefinite. Hence, $0 \le \lambda_{\min}(C^T C) \le \lambda_{\max}(C^T C) \le 1$.
If either
$$\frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} = \pm 1 \;\text{ and }\; \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} = \pm 1, \qquad \text{or} \qquad \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} = \pm 1 \;\text{ and }\; \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} = \mp 1,$$
then we have $b = \pm 1, c = 0$ or $b = 0, c = \mp 1$, which yields $b^2 + c^2 = 1$. Again, two subcases are needed.
(1) When $-1 < \frac{\partial \phi(\mu, \xi)}{\partial \xi} < 1$, we have $|\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))| < |\lambda_2(x) - \lambda_1(x)|$. This implies that $1 - a^2 > 0$ for any $\mu > 0$. Therefore,
$$I - C^T C = \begin{bmatrix} 0 & 0 \\ 0 & (1 - a^2) \left( I - \dfrac{x_2 x_2^T}{\|x_2\|^2} \right) \end{bmatrix}.$$
Since the matrix $I - \frac{x_2 x_2^T}{\|x_2\|^2}$ is positive semidefinite, the matrix $I - C^T C$ is positive semidefinite. Hence, $0 \le \lambda_{\min}(C^T C) \le \lambda_{\max}(C^T C) \le 1$.

(2) When $\frac{\partial \phi(\mu, \xi)}{\partial \xi} = \pm 1$, we have $I - C^T C = 0$, which leads to $\lambda(C^T C) = 1$.
From all the above, the proof is complete. $\Box$
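Theorem 2.1 is also easy to probe numerically. The following sketch (ours, a spot check rather than a proof; it reuses `dPhi_dx` from the sketch after Proposition 2.1) builds a random pair with $\sigma_{\min}(A) > \sigma_{\max}(B)$ and confirms that $A + B \frac{\partial \Phi(\mu, x)}{\partial x}$ stays nonsingular over many random points.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
# construct A with singular values in [2, 3] and B with sigma_max(B) = 1.5 < 2
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(rng.uniform(2.0, 3.0, n)) @ V.T
B = rng.standard_normal((n, n))
B *= 1.5 / np.linalg.svd(B, compute_uv=False)[0]

phi3  = lambda mu, t: np.sqrt(4.0 * mu**2 + t**2)     # satisfies |d(phi)/dt| <= 1
dphi3 = lambda mu, t: t / np.sqrt(4.0 * mu**2 + t**2)
smallest = np.inf
for _ in range(1000):
    mu = 10.0 ** rng.uniform(-6, 1)
    x = rng.standard_normal(n)                        # x2 != 0 almost surely
    M = A + B @ dPhi_dx(phi3, dphi3, mu, x)
    smallest = min(smallest, np.linalg.svd(M, compute_uv=False)[-1])
print(smallest)   # stays bounded away from zero, as Theorem 2.1 predicts
```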
We point out that the condition $\sigma_{\min}(A) > \sigma_{\max}(B)$ in Theorem 2.1 guarantees that the SOCAVE (2) has a unique solution according to [35, Theorem 4.1]. From Theorem 2.1, we realize that for a SOCAVE (2) with $\sigma_{\min}(A) > \sigma_{\max}(B)$, any smoothing function of $|t|$ with $-1 \le \frac{d}{dt}\phi(\mu, t) \le 1$ can serve in the smoothing Newton algorithm for solving the SOCAVE. With this, it is easy to find or construct smoothing functions of $|t|$ satisfying the above condition. One popular approach is smoothing approximation via convolution for the absolute value function [1, 22, 38, 45], which is described as below.
First, we construct a smoothing approximation for the plus function $(t)_+ = \max\{0, t\}$. To this end, we consider a piecewise continuous function $d(t)$ with a finite number of pieces, which is a density (kernel) function; in other words, it satisfies
$$d(t) \ge 0 \quad \text{and} \quad \int_{-\infty}^{+\infty} d(t)\, dt = 1.$$
With this $d(t)$, we further define $\hat{s}(t, \mu) := \frac{1}{\mu} d\!\left( \frac{t}{\mu} \right)$, where $\mu$ is a positive parameter. If $\int_{-\infty}^{+\infty} |t|\, d(t)\, dt < +\infty$, then a smoothing approximation for $(t)_+$ is formed. In particular,
$$\hat{p}(t, \mu) = \int_{-\infty}^{+\infty} (t - s)_+\, \hat{s}(s, \mu)\, ds = \int_{-\infty}^{t} (t - s)\, \hat{s}(s, \mu)\, ds \approx (t)_+.$$
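The convolution recipe can be checked directly. The rough sketch below (ours, assuming SciPy is available for the quadrature) convolves $(t)_+$ with the scaled kernel built from the kernel $d_3$ listed just below, and compares the result with the closed form (13); the two printed columns should agree.

```python
import numpy as np
from scipy.integrate import quad

d3 = lambda s: 2.0 / (s**2 + 4.0) ** 1.5       # kernel d_3, integrates to 1
mu = 0.7
s_hat = lambda s: d3(s / mu) / mu              # scaled kernel s_hat(s, mu)
p_hat = lambda t: quad(lambda s: (t - s) * s_hat(s), -np.inf, t)[0]
phi3_hat = lambda t: (np.sqrt(4.0 * mu**2 + t**2) + t) / 2.0   # closed form (13)
for t in (-2.0, -0.3, 0.0, 0.5, 1.7):
    print(f"{t:5.1f}  {p_hat(t):.6f}  {phi3_hat(t):.6f}")
```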
The following are four well-known smoothing functions for the plus function [1, 38]:
$$\hat{\phi}_1(\mu, t) = t + \mu \ln\left( 1 + e^{-\frac{t}{\mu}} \right). \qquad (11)$$
$$\hat{\phi}_2(\mu, t) = \begin{cases} t & \text{if } t \ge \frac{\mu}{2}, \\[1mm] \dfrac{1}{2\mu} \left( t + \dfrac{\mu}{2} \right)^2 & \text{if } -\frac{\mu}{2} < t < \frac{\mu}{2}, \\[1mm] 0 & \text{if } t \le -\frac{\mu}{2}. \end{cases} \qquad (12)$$
$$\hat{\phi}_3(\mu, t) = \frac{\sqrt{4\mu^2 + t^2} + t}{2}. \qquad (13)$$
$$\hat{\phi}_4(\mu, t) = \begin{cases} t - \dfrac{\mu}{2} & \text{if } t > \mu, \\[1mm] \dfrac{t^2}{2\mu} & \text{if } 0 \le t \le \mu, \\[1mm] 0 & \text{if } t < 0, \end{cases} \qquad (14)$$
where the corresponding kernel functions are
$$d_1(t) = \frac{e^{-t}}{(1 + e^{-t})^2}, \qquad d_2(t) = \begin{cases} 1 & \text{if } -\frac{1}{2} \le t \le \frac{1}{2}, \\ 0 & \text{otherwise}, \end{cases} \qquad d_3(t) = \frac{2}{(t^2 + 4)^{3/2}}, \qquad d_4(t) = \begin{cases} 1 & \text{if } 0 \le t \le 1, \\ 0 & \text{otherwise}. \end{cases}$$
Next, in light of $|t| = (t)_+ + (-t)_+$, the smoothing function of $|t|$ via convolution can be written as
$$\hat{p}(|t|, \mu) = \hat{p}(t, \mu) + \hat{p}(-t, \mu) = \int_{-\infty}^{+\infty} |t - s|\, \hat{s}(s, \mu)\, ds.$$
Analogous to (11)-(14), we achieve the following smoothing functions for $|t|$:
$$\phi_1(\mu, t) = \mu \left[ \ln\left( 1 + e^{-\frac{t}{\mu}} \right) + \ln\left( 1 + e^{\frac{t}{\mu}} \right) \right]. \qquad (15)$$
$$\phi_2(\mu, t) = \begin{cases} t & \text{if } t \ge \frac{\mu}{2}, \\[1mm] \dfrac{t^2}{\mu} + \dfrac{\mu}{4} & \text{if } -\frac{\mu}{2} < t < \frac{\mu}{2}, \\[1mm] -t & \text{if } t \le -\frac{\mu}{2}. \end{cases} \qquad (16)$$
$$\phi_3(\mu, t) = \sqrt{4\mu^2 + t^2}. \qquad (17)$$
$$\phi_4(\mu, t) = \begin{cases} \dfrac{t^2}{2\mu} & \text{if } |t| \le \mu, \\[1mm] |t| - \dfrac{\mu}{2} & \text{if } |t| > \mu. \end{cases} \qquad (18)$$
If we take the Epanechnikov kernel function
$$K(t) = \begin{cases} \frac{3}{4}(1 - t^2) & \text{if } |t| \le 1, \\ 0 & \text{otherwise}, \end{cases}$$
then we obtain the following smoothing function for $|t|$:
$$\phi_5(\mu, t) = \begin{cases} t & \text{if } t > \mu, \\[1mm] -\dfrac{t^4}{8\mu^3} + \dfrac{3t^2}{4\mu} + \dfrac{3\mu}{8} & \text{if } -\mu \le t \le \mu, \\[1mm] -t & \text{if } t < -\mu. \end{cases} \qquad (19)$$
Moreover, taking the Gaussian kernel function $K(t) = \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}}$ for all $t \in \mathbb{R}$ yields
$$\hat{s}(t, \mu) := \frac{1}{\mu} K\!\left( \frac{t}{\mu} \right) = \frac{1}{\sqrt{2\pi\mu^2}} e^{-\frac{t^2}{2\mu^2}},$$
and it leads to the smoothing function [45] for $|t|$:
$$\phi_6(\mu, t) = t \, \mathrm{erf}\!\left( \frac{t}{\sqrt{2}\mu} \right) + \sqrt{\frac{2}{\pi}} \, \mu \, e^{-\frac{t^2}{2\mu^2}}, \qquad (20)$$
where the error function is defined by
$$\mathrm{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-u^2}\, du, \qquad \forall t \in \mathbb{R}.$$
In summary, we have constructed six smoothing functions from the above discussions. Can all the above functions serve as smoothing functions for solving the SOCAVE? The answer is affirmative because it is not hard to verify that each $\phi_i$ possesses $-1 \le \frac{d}{dt}\phi_i(\mu, t) \le 1$; see the sketch below for a numerical confirmation. Thus, these six functions will be adopted for our numerical implementations. Accordingly, we need to define $\Phi_i(\mu, x)$ and $H_i(\mu, x)$ based on each $\phi_i$. For subsequent needs, we only present the expression of each Jacobian matrix $H_i'(\mu, x)$ without detailed derivations.
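As a sanity check, the sketch below (ours; `phi1` through `phi6` are our implementations of (15)-(20), with a numerically stable rewriting of (15), and SciPy's `erf` is assumed available) verifies Definition 2.1(ii) and the derivative bound $-1 \le \frac{d}{dt}\phi_i(\mu, t) \le 1$ on a grid.

```python
import numpy as np
from scipy.special import erf

def phi1(mu, t):   # (15), rewritten stably as |t| + 2 mu ln(1 + exp(-|t|/mu))
    return np.abs(t) + 2.0 * mu * np.log1p(np.exp(-np.abs(t) / mu))
def phi2(mu, t):   # (16)
    return np.where(np.abs(t) >= mu / 2, np.abs(t), t**2 / mu + mu / 4)
def phi3(mu, t):   # (17)
    return np.sqrt(4.0 * mu**2 + t**2)
def phi4(mu, t):   # (18)
    return np.where(np.abs(t) <= mu, t**2 / (2 * mu), np.abs(t) - mu / 2)
def phi5(mu, t):   # (19), Epanechnikov kernel
    mid = -t**4 / (8 * mu**3) + 3 * t**2 / (4 * mu) + 3 * mu / 8
    return np.where(np.abs(t) <= mu, mid, np.abs(t))
def phi6(mu, t):   # (20), Gaussian kernel
    return t * erf(t / (np.sqrt(2) * mu)) + np.sqrt(2 / np.pi) * mu * np.exp(-t**2 / (2 * mu**2))

t = np.linspace(-3.0, 3.0, 2001)
for phi in (phi1, phi2, phi3, phi4, phi5, phi6):
    assert np.max(np.abs(phi(1e-8, t) - np.abs(t))) < 1e-6  # phi(mu, t) -> |t| as mu -> 0
    slope = np.diff(phi(0.5, t)) / np.diff(t)               # finite-difference d(phi)/dt
    assert np.all(np.abs(slope) <= 1.0 + 1e-8)              # -1 <= d(phi)/dt <= 1
```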
Based on each $\phi_i$, let $\Phi_i : \mathbb{R}_{++} \times \mathbb{R}^n \to \mathbb{R}^n$ for $i = 1, 2, \cdots, 6$ be similarly defined as in (5), i.e.,
$$\Phi_i(\mu, x) = \phi_i(\mu, \lambda_1(x))\, u_x^{(1)} + \phi_i(\mu, \lambda_2(x))\, u_x^{(2)}, \qquad (21)$$
and $H_i : \mathbb{R}_{++} \times \mathbb{R}^n \to \mathbb{R} \times \mathbb{R}^n$ for $i = 1, 2, \cdots, 6$ be similarly defined as in (6), i.e.,
$$H_i(\mu, x) = \begin{bmatrix} \mu \\ Ax + B\Phi_i(\mu, x) - b \end{bmatrix}, \qquad \forall \mu \in \mathbb{R}_{++} \text{ and } x \in \mathbb{R}^n. \qquad (22)$$
Then, each $H_i$ is continuously differentiable on $\mathbb{R}_{++} \times \mathbb{R}^n$ with the Jacobian matrix given by
$$H_i'(\mu, x) = \begin{bmatrix} 1 & 0 \\[1mm] B \dfrac{\partial \Phi_i(\mu, x)}{\partial \mu} & A + B \dfrac{\partial \Phi_i(\mu, x)}{\partial x} \end{bmatrix} \qquad (23)$$
for all $(\mu, x) \in \mathbb{R}_{++} \times \mathbb{R}^n$ with $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$. Moreover, the differentiation of each $\Phi_i$ is expressed as below.
(1) The Jacobian of $\Phi_1$ is characterized as below.
$$\frac{\partial \Phi_1(\mu, x)}{\partial \mu} = \frac{\partial \phi_1(\mu, \lambda_1(x))}{\partial \mu} u_x^{(1)} + \frac{\partial \phi_1(\mu, \lambda_2(x))}{\partial \mu} u_x^{(2)} = \left[ \frac{\phi_1(\mu, \lambda_1(x))}{\mu} + \frac{\lambda_1(x)}{\mu} \cdot \frac{1 - e^{\frac{\lambda_1(x)}{\mu}}}{1 + e^{\frac{\lambda_1(x)}{\mu}}} \right] u_x^{(1)} + \left[ \frac{\phi_1(\mu, \lambda_2(x))}{\mu} + \frac{\lambda_2(x)}{\mu} \cdot \frac{1 - e^{\frac{\lambda_2(x)}{\mu}}}{1 + e^{\frac{\lambda_2(x)}{\mu}}} \right] u_x^{(2)}.$$
$$\frac{\partial \Phi_1(\mu, x)}{\partial x} = \begin{cases} \dfrac{e^{\frac{x_1}{\mu}} - 1}{e^{\frac{x_1}{\mu}} + 1} I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_1 & c_1 \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_1 \dfrac{x_2}{\|x_2\|} & a_1 I + (b_1 - a_1) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$
with
$$a_1 = \frac{\phi_1(\mu, \lambda_2(x)) - \phi_1(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \qquad b_1 = \frac{1}{2} \left( \frac{e^{\frac{\lambda_1(x)}{\mu}} - 1}{e^{\frac{\lambda_1(x)}{\mu}} + 1} + \frac{e^{\frac{\lambda_2(x)}{\mu}} - 1}{e^{\frac{\lambda_2(x)}{\mu}} + 1} \right), \qquad c_1 = \frac{1}{2} \left( \frac{1 - e^{\frac{\lambda_1(x)}{\mu}}}{e^{\frac{\lambda_1(x)}{\mu}} + 1} + \frac{e^{\frac{\lambda_2(x)}{\mu}} - 1}{e^{\frac{\lambda_2(x)}{\mu}} + 1} \right).$$
(2) The Jacobian of $\Phi_2$ is characterized as below.
$$\frac{\partial \Phi_2(\mu, x)}{\partial \mu} = \frac{\partial \phi_2(\mu, \lambda_1(x))}{\partial \mu} u_x^{(1)} + \frac{\partial \phi_2(\mu, \lambda_2(x))}{\partial \mu} u_x^{(2)}$$
with
$$\frac{\partial \phi_2(\mu, \lambda_i(x))}{\partial \mu} = \begin{cases} 0 & \text{if } \lambda_i(x) \ge \frac{\mu}{2}, \\[1mm] -\left( \dfrac{\lambda_i(x)}{\mu} \right)^2 + \dfrac{1}{4} & \text{if } -\frac{\mu}{2} < \lambda_i(x) < \frac{\mu}{2}, \\[1mm] 0 & \text{if } \lambda_i(x) \le -\frac{\mu}{2}. \end{cases}$$
$$\frac{\partial \Phi_2(\mu, x)}{\partial x} = \begin{cases} d\, I & \text{if } x_2 = 0, \\[2mm] \begin{bmatrix} b_2 & c_2 \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_2 \dfrac{x_2}{\|x_2\|} & a_2 I + (b_2 - a_2) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$
with
$$a_2 = \frac{\phi_2(\mu, \lambda_2(x)) - \phi_2(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)},$$
$$b_2 = \begin{cases} 0 & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > -\frac{\mu}{2} \ge \lambda_1(x), \\ 1 & \text{if } \lambda_2(x) > \lambda_1(x) \ge \frac{\mu}{2}, \\ \frac{\lambda_1(x)}{\mu} + \frac{1}{2} & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_1(x) + \lambda_2(x)}{\mu} & \text{if } \frac{\mu}{2} > \lambda_2(x) > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_2(x)}{\mu} - \frac{1}{2} & \text{if } \frac{\mu}{2} > \lambda_2(x) > -\frac{\mu}{2} \ge \lambda_1(x), \\ -1 & \text{if } \lambda_1(x) < \lambda_2(x) \le -\frac{\mu}{2}, \end{cases} \qquad c_2 = \begin{cases} 1 & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > -\frac{\mu}{2} \ge \lambda_1(x), \\ 0 & \text{if } \lambda_2(x) > \lambda_1(x) \ge \frac{\mu}{2}, \\ \frac{1}{2} - \frac{\lambda_1(x)}{\mu} & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_2(x) - \lambda_1(x)}{\mu} & \text{if } \frac{\mu}{2} > \lambda_2(x) > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_2(x)}{\mu} + \frac{1}{2} & \text{if } \frac{\mu}{2} > \lambda_2(x) > -\frac{\mu}{2} \ge \lambda_1(x), \\ 0 & \text{if } \lambda_1(x) < \lambda_2(x) \le -\frac{\mu}{2}, \end{cases} \qquad d = \begin{cases} 1 & \text{if } x_1 \ge \frac{\mu}{2}, \\ \frac{2x_1}{\mu} & \text{if } -\frac{\mu}{2} < x_1 < \frac{\mu}{2}, \\ -1 & \text{if } x_1 \le -\frac{\mu}{2}. \end{cases}$$

(3) The Jacobian of $\Phi_3$ is characterized as below.
$$\frac{\partial \Phi_3(\mu, x)}{\partial \mu} = \frac{4\mu}{\sqrt{4\mu^2 + \lambda_1^2(x)}} u_x^{(1)} + \frac{4\mu}{\sqrt{4\mu^2 + \lambda_2^2(x)}} u_x^{(2)},$$
$$\frac{\partial \Phi_3(\mu, x)}{\partial x} = \begin{cases} \dfrac{x_1}{\sqrt{4\mu^2 + x_1^2}} I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_3 & c_3 \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_3 \dfrac{x_2}{\|x_2\|} & a_3 I + (b_3 - a_3) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$
with
$$a_3 = \frac{\phi_3(\mu, \lambda_2(x)) - \phi_3(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \qquad b_3 = \frac{1}{2} \left( \frac{\lambda_1(x)}{\sqrt{4\mu^2 + \lambda_1^2(x)}} + \frac{\lambda_2(x)}{\sqrt{4\mu^2 + \lambda_2^2(x)}} \right), \qquad c_3 = \frac{1}{2} \left( \frac{-\lambda_1(x)}{\sqrt{4\mu^2 + \lambda_1^2(x)}} + \frac{\lambda_2(x)}{\sqrt{4\mu^2 + \lambda_2^2(x)}} \right).$$
(4) The Jacobian of $\Phi_4$ is characterized as below.
$$\frac{\partial \Phi_4(\mu, x)}{\partial \mu} = \frac{\partial \phi_4(\mu, \lambda_1(x))}{\partial \mu} u_x^{(1)} + \frac{\partial \phi_4(\mu, \lambda_2(x))}{\partial \mu} u_x^{(2)}$$
with
$$\frac{\partial \phi_4(\mu, \lambda_i(x))}{\partial \mu} = \begin{cases} -\frac{1}{2} & \text{if } \lambda_i(x) > \mu, \\[1mm] -\frac{1}{2} \left( \dfrac{\lambda_i(x)}{\mu} \right)^2 & \text{if } -\mu \le \lambda_i(x) \le \mu, \\[1mm] -\frac{1}{2} & \text{if } \lambda_i(x) < -\mu. \end{cases}$$
$$\frac{\partial \Phi_4(\mu, x)}{\partial x} = \begin{cases} e\, I & \text{if } x_2 = 0, \\[2mm] \begin{bmatrix} b_4 & c_4 \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_4 \dfrac{x_2}{\|x_2\|} & a_4 I + (b_4 - a_4) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$
with
$$a_4 = \frac{\phi_4(\mu, \lambda_2(x)) - \phi_4(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)},$$
$$b_4 = \begin{cases} 0 & \text{if } \lambda_2(x) > \mu > -\mu > \lambda_1(x), \\ 1 & \text{if } \lambda_2(x) > \lambda_1(x) > \mu, \\ \frac{\lambda_1(x)}{2\mu} + \frac{1}{2} & \text{if } \lambda_2(x) > \mu \ge \lambda_1(x) \ge -\mu, \\ \frac{\lambda_1(x) + \lambda_2(x)}{2\mu} & \text{if } \mu \ge \lambda_2(x) > \lambda_1(x) \ge -\mu, \\ \frac{\lambda_2(x)}{2\mu} - \frac{1}{2} & \text{if } \mu \ge \lambda_2(x) \ge -\mu > \lambda_1(x), \\ -1 & \text{if } \lambda_1(x) < \lambda_2(x) < -\mu, \end{cases} \qquad c_4 = \begin{cases} 1 & \text{if } \lambda_2(x) > \mu > -\mu > \lambda_1(x), \\ 0 & \text{if } \lambda_2(x) > \lambda_1(x) > \mu, \\ \frac{1}{2} - \frac{\lambda_1(x)}{2\mu} & \text{if } \lambda_2(x) > \mu \ge \lambda_1(x) \ge -\mu, \\ \frac{\lambda_2(x) - \lambda_1(x)}{2\mu} & \text{if } \mu \ge \lambda_2(x) > \lambda_1(x) \ge -\mu, \\ \frac{\lambda_2(x)}{2\mu} + \frac{1}{2} & \text{if } \mu \ge \lambda_2(x) \ge -\mu > \lambda_1(x), \\ 0 & \text{if } \lambda_1(x) < \lambda_2(x) < -\mu, \end{cases} \qquad e = \begin{cases} 1 & \text{if } x_1 > \mu, \\ \frac{x_1}{\mu} & \text{if } -\mu \le x_1 \le \mu, \\ -1 & \text{if } x_1 < -\mu. \end{cases}$$
(5) The Jacobian of $\Phi_5$ is characterized as below.
$$\frac{\partial \Phi_5(\mu, x)}{\partial \mu} = \frac{\partial \phi_5(\mu, \lambda_1(x))}{\partial \mu} u_x^{(1)} + \frac{\partial \phi_5(\mu, \lambda_2(x))}{\partial \mu} u_x^{(2)}$$
with
$$\frac{\partial \phi_5(\mu, \lambda_i(x))}{\partial \mu} = \begin{cases} 0 & \text{if } \lambda_i(x) > \mu, \\[1mm] \dfrac{3}{8} \left[ \left( \dfrac{\lambda_i(x)}{\mu} \right)^2 - 1 \right]^2 & \text{if } -\mu \le \lambda_i(x) \le \mu, \\[1mm] 0 & \text{if } \lambda_i(x) < -\mu. \end{cases}$$
$$\frac{\partial \Phi_5(\mu, x)}{\partial x} = \begin{cases} e\, I & \text{if } x_2 = 0, \\[2mm] \begin{bmatrix} b_5 & c_5 \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_5 \dfrac{x_2}{\|x_2\|} & a_5 I + (b_5 - a_5) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$