Applied Numerical Mathematics, vol. 135, January, pp. 206-227, 2019

### Unified smoothing functions for absolute value equation associated with second-order cone

Chieu Thanh Nguyen ^{1}
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan.

B. Saheya^{2}

College of Mathematical Science
Inner Mongolia Normal University

Hohhot 010022, Inner Mongolia, P. R. China.

Yu-Lin Chang^{3}
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan.

Jein-Shan Chen ^{4}
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan.

February 19, 2018 (revised on July 12, 2018)

Abstract. In this paper, we explore a unified way to construct smoothing functions for solving the absolute value equation associated with second-order cone (SOCAVE). Numerical comparisons are presented, which illustrate what kinds of smoothing functions work well along with the smoothing Newton algorithm. In particular, the numerical experiments show that the well-known loss function widely used in the engineering community is the worst one among the constructed smoothing functions, which indicates that the other proposed smoothing functions can be employed for solving engineering problems.

Keywords. Second-order cone, absolute value equations, smoothing Newton algorithm.

^{1} E-mail: thanhchieu90@gmail.com.

^{2} E-mail: saheya@imnu.edu.cn. The author's work is supported by Natural Science Foundation of Inner Mongolia (Award Number: 2017MS0125).

^{3} E-mail: ylchang@math.ntnu.edu.tw.

^{4} Corresponding author. E-mail: jschen@math.ntnu.edu.tw. The author's work is supported by Ministry of Science and Technology, Taiwan.

### 1 Introduction

Recently, the paper [36] investigates a family of smoothing functions along with a smoothing-type algorithm to tackle the absolute value equation associated with second-order cone (SOCAVE) and shows the efficiency of such an approach. Motivated by this article, we continue to ask two natural questions. (i) Are there other suitable smoothing functions that can be employed for solving the SOCAVE? (ii) Is there a unified way to construct smoothing functions for solving the SOCAVE? In this paper, we provide affirmative answers to these two queries. In order to smoothly convey the story of how we figure out the answers, we begin by recalling where the SOCAVE comes from.

The standard absolute value equation (AVE) is in the form of

Ax + B|x| = b, (1)

where A ∈ IR^{n×n}, B ∈ IR^{n×n}, B ≠ 0, and b ∈ IR^{n}. Here |x| means the componentwise absolute value of the vector x ∈ IR^{n}. When B = −I, where I is the identity matrix, the AVE (1) reduces to the special form

Ax − |x| = b.

It is known that the AVE (1) was first introduced by Rohn in [41], but was so termed by Mangasarian [34]. During the past decade, many researchers have paid attention to this equation, for example, Caccetta, Qu and Zhou [2], Hu and Huang [12], Jiang and Zhang [20], Ketabchi and Moosaei [21], Mangasarian [26, 27, 28, 29, 30, 31, 32, 33], Mangasarian and Meyer [34], Prokopyev [37], and Rohn [43].

We elaborate more on the developments of the AVE. Mangasarian and Meyer [34] show that the AVE (1) is equivalent to the bilinear program, to the generalized LCP (linear complementarity problem), and to the standard LCP provided 1 is not an eigenvalue of A. With these equivalent reformulations, they also show that the AVE (1) is NP-hard in its general form and provide existence results. Prokopyev [37] further improves the above equivalence, showing that the AVE (1) can be equivalently recast as an LCP without any assumption on A and B, and also provides a relationship with mixed integer programming. In general, if solvable, the AVE (1) can have either a unique solution or multiple (e.g., exponentially many) solutions. Indeed, various sufficient conditions on solvability and non-solvability of the AVE (1) with unique and multiple solutions are discussed in [34, 37, 42]. Some variants of the AVE, like the absolute value equation associated with second-order cone and the absolute value programs, are investigated in [14] and [46], respectively.

Recently, another type of absolute value equation, a natural extension of the standard AVE (1), has been considered in [14, 35, 36]. More specifically, the following absolute value equation associated with second-order cones, abbreviated as SOCAVE, is studied:

Ax + B|x| = b, (2)

where A, B ∈ IR^{n×n} and b ∈ IR^{n} are the same as those in (1), while |x| denotes the absolute value of x coming from the square root of the Jordan product "◦" of x with itself. What is the difference between the standard AVE (1) and the SOCAVE (2)? Their mathematical formats look the same. In fact, the main difference is that |x| in the standard AVE (1) means the componentwise |x_{i}| of each x_{i} ∈ IR, i.e., |x| = (|x_{1}|, |x_{2}|, · · · , |x_{n}|)^{T} ∈ IR^{n}; however, |x| in the SOCAVE (2) denotes the vector √(x^{2}) := √(x ◦ x) associated with second-order cone under the Jordan product. To understand its meaning, we need to introduce the definition of the second-order cone (SOC). The second-order cone in IR^{n} (n ≥ 1), also called the Lorentz cone, is defined as

$$K^n := \left\{ (x_1, x_2) \in \mathrm{IR} \times \mathrm{IR}^{n-1} \ \middle|\ \|x_2\| \le x_1 \right\},$$

where ‖ · ‖ denotes the Euclidean norm. If n = 1, then K^{n} is the set of nonnegative reals IR_{+}. In general, a second-order cone K could be the Cartesian product of SOCs, i.e.,

$$K := K^{n_1} \times \cdots \times K^{n_r}.$$

For simplicity, we focus on the single SOC K^{n} because all the analysis can be carried over to the setting of Cartesian products. The SOC is a special case of symmetric cones and can be analyzed under the Jordan product, see [8]. In particular, for any two vectors x = (x_{1}, x_{2}) ∈ IR × IR^{n−1} and y = (y_{1}, y_{2}) ∈ IR × IR^{n−1}, the Jordan product of x and y associated with K^{n} is defined as

$$x \circ y := \begin{bmatrix} x^T y \\ y_1 x_2 + x_1 y_2 \end{bmatrix}.$$

The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of optimization problems involving SOC; see [4, 6, 9] and references therein for more details. The identity element under this Jordan product is e = (1, 0, . . . , 0)^{T} ∈ IR^{n}. With these definitions, x^{2} means the Jordan product of x with itself, i.e., x^{2} := x ◦ x, and √x with x ∈ K^{n} denotes the unique vector such that √x ◦ √x = x. In other words, the vector |x| in the SOCAVE (2) is computed by

$$|x| := \sqrt{x \circ x}.$$

As remarked in the literature, the significance of the AVE (1) arises from the fact that the AVE is capable of formulating many optimization problems such as linear programs, quadratic programs, bimatrix games, and so on. Likewise, the SOCAVE (2) plays a similar role in various optimization problems involving second-order cones. Many numerical methods have been proposed for solving the standard AVE (1) and the SOCAVE (2); please refer to [36] for a quick review. Basically, we follow the smoothing Newton algorithm employed in [36] to deal with the SOCAVE (2). This kind of algorithm has been a powerful tool for solving many other optimization problems, including symmetric cone complementarity problems [11, 23, 24], the system of inequalities under the order induced by symmetric cone [18, 25, 47], and so on. It is also employed for the standard AVE (1) in [19, 44]. The new upshot of this paper lies in discovering more suitable smoothing functions and exploring a unified way to construct smoothing functions. Of course, the numerical performance among the different smoothing functions is compared. These results are totally new to the literature and are the main contribution of this paper.

To close this section, we recall some basic concepts and background materials regarding the second-order cone, which will be used in the subsequent analysis. More details can be found in [4, 6, 8, 9, 14]. First, we recall the spectral decomposition of x with respect to the SOC. For x = (x_{1}, x_{2}) ∈ IR × IR^{n−1}, the spectral decomposition of x with respect to the SOC is given by

$$x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \qquad (3)$$

where λ_{i}(x) = x_{1} + (−1)^{i}‖x_{2}‖ for i = 1, 2 and

$$u_x^{(i)} = \begin{cases} \dfrac{1}{2} \left( 1,\ (-1)^i \dfrac{x_2^T}{\|x_2\|} \right)^T & \text{if } \|x_2\| \neq 0, \\[3mm] \dfrac{1}{2} \left( 1,\ (-1)^i \omega^T \right)^T & \text{if } \|x_2\| = 0, \end{cases} \qquad (4)$$

with ω ∈ IR^{n−1} being any vector satisfying ‖ω‖ = 1. The two scalars λ_{1}(x) and λ_{2}(x) are called the spectral values of x, while the two vectors u^{(1)}_{x} and u^{(2)}_{x} are called the spectral vectors of x. Moreover, it is obvious that the spectral decomposition of x ∈ IR^{n} is unique if x_{2} ≠ 0. It is known that the spectral values and spectral vectors possess the following properties:

(i) u^{(1)}_{x} ◦ u^{(2)}_{x} = 0 and u^{(i)}_{x} ◦ u^{(i)}_{x} = u^{(i)}_{x} for i = 1, 2;

(ii) ‖u^{(1)}_{x}‖^{2} = ‖u^{(2)}_{x}‖^{2} = 1/2 and ‖x‖^{2} = (1/2)(λ_{1}^{2}(x) + λ_{2}^{2}(x)).
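These objects are straightforward to experiment with numerically. Below is a minimal sketch (our own illustration, not part of the paper) that builds the spectral decomposition (3)-(4) with NumPy and verifies properties (i)-(ii); the helper name `jordan` is ours.

```python
# Numerical check of the spectral decomposition (3)-(4) and properties (i)-(ii).
import numpy as np

def jordan(x, y):
    """Jordan product x o y = (x^T y, y_1 x_2 + x_1 y_2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

x = np.array([0.5, 1.0, -2.0])
x1, x2 = x[0], x[1:]
n2 = np.linalg.norm(x2)
lam1, lam2 = x1 - n2, x1 + n2                 # spectral values
u1 = 0.5 * np.concatenate(([1.0], -x2 / n2))  # spectral vectors (x_2 != 0 here)
u2 = 0.5 * np.concatenate(([1.0],  x2 / n2))

assert np.allclose(x, lam1 * u1 + lam2 * u2)         # decomposition (3)
assert np.allclose(jordan(u1, u2), 0)                # (i): u1 o u2 = 0
assert np.allclose(jordan(u1, u1), u1)               # (i): u_i o u_i = u_i
assert np.isclose(x @ x, 0.5 * (lam1**2 + lam2**2))  # (ii)
```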

Next is the concept of projection onto the second-order cone. Let x_{+} denote the projection of x onto K^{n}, and x_{−} be the projection of −x onto the dual cone (K^{n})^{∗} of K^{n}, where the dual cone (K^{n})^{∗} is defined by (K^{n})^{∗} := {y ∈ IR^{n} | ⟨x, y⟩ ≥ 0, ∀x ∈ K^{n}}. In fact, the dual cone of K^{n} is itself, i.e., (K^{n})^{∗} = K^{n}. Due to the special structure of K^{n}, the explicit formula for the projection of x = (x_{1}, x_{2}) ∈ IR × IR^{n−1} onto K^{n} is obtained in [4, 6, 8, 9, 10] as below:

$$x_+ = \begin{cases} x & \text{if } x \in K^n, \\ 0 & \text{if } x \in -K^n, \\ u & \text{otherwise}, \end{cases} \qquad \text{where } u = \begin{bmatrix} \dfrac{x_1 + \|x_2\|}{2} \\[2mm] \dfrac{x_1 + \|x_2\|}{2} \cdot \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$

Similarly, the expression of x_{−} can be written out as

$$x_- = \begin{cases} 0 & \text{if } x \in K^n, \\ -x & \text{if } x \in -K^n, \\ w & \text{otherwise}, \end{cases} \qquad \text{where } w = \begin{bmatrix} -\dfrac{x_1 - \|x_2\|}{2} \\[2mm] \dfrac{x_1 - \|x_2\|}{2} \cdot \dfrac{x_2}{\|x_2\|} \end{bmatrix}.$$

It is easy to verify that x = x_{+} − x_{−} and

$$x_+ = (\lambda_1(x))_+\, u_x^{(1)} + (\lambda_2(x))_+\, u_x^{(2)}, \qquad x_- = (-\lambda_1(x))_+\, u_x^{(1)} + (-\lambda_2(x))_+\, u_x^{(2)},$$

where (α)_{+} = max{0, α} for α ∈ IR. As for the expression of |x| associated with the SOC, there is an alternative way via the so-called SOC-function to obtain it, which can be found in [3, 5]. In any case, it comes out that

$$|x| = \left[ (\lambda_1(x))_+ + (-\lambda_1(x))_+ \right] u_x^{(1)} + \left[ (\lambda_2(x))_+ + (-\lambda_2(x))_+ \right] u_x^{(2)} = \left| \lambda_1(x) \right| u_x^{(1)} + \left| \lambda_2(x) \right| u_x^{(2)}.$$
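These formulas translate directly into code. The following sketch (ours; the names `soc_spectral`, `soc_proj`, `soc_abs` are illustrative) computes x_{+}, x_{−} and |x| through the spectral values and checks the identities just stated.

```python
# Minimal sketch of x_+, x_- and |x| via the spectral decomposition.
import numpy as np

def soc_spectral(x):
    """Spectral values and vectors of x = (x1, x2) w.r.t. K^n, as in (3)-(4)."""
    x1, x2 = x[0], x[1:]
    n2 = np.linalg.norm(x2)
    w = x2 / n2 if n2 > 0 else np.eye(len(x2))[0]   # any unit vector if x2 = 0
    lam = np.array([x1 - n2, x1 + n2])
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return lam, u1, u2

def soc_proj(x):
    """x_+ = (lambda_1)_+ u^(1) + (lambda_2)_+ u^(2)."""
    lam, u1, u2 = soc_spectral(x)
    return max(lam[0], 0.0) * u1 + max(lam[1], 0.0) * u2

def soc_abs(x):
    """|x| = |lambda_1| u^(1) + |lambda_2| u^(2)."""
    lam, u1, u2 = soc_spectral(x)
    return abs(lam[0]) * u1 + abs(lam[1]) * u2

x = np.array([0.5, 1.0, -2.0])
xp, xm = soc_proj(x), soc_proj(-x)        # x_- is the projection of -x
assert np.allclose(x, xp - xm)            # x = x_+ - x_-
assert np.allclose(soc_abs(x), xp + xm)   # |x| = x_+ + x_-
```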

### 2 Unified smoothing functions for SOCAVE

As mentioned in Section 1, we employ the smoothing Newton method for solving the SOCAVE (2), which needs a smoothing function to work with. Indeed, a family of smoothing functions was already considered in [36]. In this section, we look into what kinds of smoothing functions can be employed to work with the smoothing Newton algorithm for solving the SOCAVE (2).

Definition 2.1. A function φ : IR_{++} × IR → IR is called a smoothing function of |t| if it satisfies the following:

(i) φ is continuously differentiable at (µ, t) ∈ IR_{++} × IR;

(ii) lim_{µ↓0} φ(µ, t) = |t| for any t ∈ IR.

Given a smoothing function φ, we further define a vector-valued function Φ : IR_{++} × IR^{n} → IR^{n} as

$$\Phi(\mu, x) = \phi(\mu, \lambda_1(x))\, u_x^{(1)} + \phi(\mu, \lambda_2(x))\, u_x^{(2)}, \qquad (5)$$

where µ ∈ IR_{++} is a parameter, λ_{1}(x), λ_{2}(x) are the spectral values of x, and u^{(1)}_{x}, u^{(2)}_{x} are the spectral vectors of x. Consequently, Φ is also smooth on IR_{++} × IR^{n}. Moreover, it is easy to verify that

$$\lim_{\mu \downarrow 0} \Phi(\mu, x) = |\lambda_1(x)|\, u_x^{(1)} + |\lambda_2(x)|\, u_x^{(2)} = |x|,$$

which means each function Φ(µ, x) serves as a smoothing function of |x| associated with the SOC. With this observation, for the SOCAVE (2), we further define the function H : IR_{++} × IR^{n} → IR × IR^{n} by

$$H(\mu, x) = \begin{bmatrix} \mu \\ Ax + B\,\Phi(\mu, x) - b \end{bmatrix}, \qquad \forall \mu \in \mathrm{IR}_{++},\ x \in \mathrm{IR}^n. \qquad (6)$$
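As a sketch (ours) of how (5) and (6) look in code, reusing the `soc_spectral` and `soc_abs` helpers from the earlier snippet; the factory names are our own choice:

```python
# Lifting a scalar smoothing function phi(mu, t) to Phi in (5) and H in (6).
import numpy as np

def make_Phi(phi):
    def Phi(mu, x):
        lam, u1, u2 = soc_spectral(x)
        return phi(mu, lam[0]) * u1 + phi(mu, lam[1]) * u2
    return Phi

def make_H(phi, A, B, b):
    Phi = make_Phi(phi)
    def H(mu, x):
        return np.concatenate(([mu], A @ x + B @ Phi(mu, x) - b))
    return H

# Example with phi(mu, t) = sqrt(4 mu^2 + t^2), which appears later as (17):
phi3 = lambda mu, t: np.sqrt(4 * mu**2 + t**2)
Phi3 = make_Phi(phi3)
x = np.array([0.5, 1.0, -2.0])
print(Phi3(1e-8, x), soc_abs(x))   # Phi(mu, x) approaches |x| as mu -> 0+
```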

Proposition 2.1. Suppose that x = (x_{1}, x_{2}) ∈ IR × IR^{n−1} has the spectral decomposition as in (3)-(4). Let H : IR_{++} × IR^{n} → IR × IR^{n} be defined as in (6). Then,

(a) H(µ, x) = 0 if and only if x solves the SOCAVE (2);

(b) H is continuously differentiable at (µ, x) ∈ IR_{++} × IR^{n} with the Jacobian matrix given by

$$H'(\mu, x) = \begin{bmatrix} 1 & 0 \\[1mm] B\, \dfrac{\partial \Phi(\mu, x)}{\partial \mu} & A + B\, \dfrac{\partial \Phi(\mu, x)}{\partial x} \end{bmatrix}, \qquad (7)$$

where

$$\frac{\partial \Phi(\mu, x)}{\partial \mu} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)},$$

$$\frac{\partial \Phi(\mu, x)}{\partial x} = \begin{cases} \dfrac{\partial \phi(\mu, x_1)}{\partial x_1}\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b & c\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c\, \dfrac{x_2}{\|x_2\|} & a I + (b - a)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a = \frac{\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \qquad b = \frac{1}{2} \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} + \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right), \qquad c = \frac{1}{2} \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right). \qquad (8)$$

Proof. (a) First, we observe that

$$H(\mu, x) = 0 \iff \mu = 0 \ \text{and} \ Ax + B\,\Phi(\mu, x) - b = 0 \iff \mu = 0 \ \text{and} \ Ax + B|x| - b = 0.$$

This indicates that x is a solution to the SOCAVE (2) if and only if (µ, x) is a solution to H(µ, x) = 0.

(b) Since Φ(µ, x) is continuously differentiable on IR_{++} × IR^{n}, it is clear that H(µ, x) is continuously differentiable on IR_{++} × IR^{n}. Thus, it remains to compute the Jacobian matrix of H(µ, x). Note that

$$\Phi(\mu, x) = \phi(\mu, \lambda_1(x))\, u_x^{(1)} + \phi(\mu, \lambda_2(x))\, u_x^{(2)} = \begin{cases} \dfrac{1}{2} \begin{bmatrix} \phi(\mu, \lambda_1(x)) + \phi(\mu, \lambda_2(x)) \\[1mm] \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \dfrac{x_2}{\|x_2\|} \end{bmatrix} & \text{if } x_2 \neq 0, \\[6mm] \dfrac{1}{2} \begin{bmatrix} \phi(\mu, \lambda_1(x)) + \phi(\mu, \lambda_2(x)) \\[1mm] \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \omega \end{bmatrix} & \text{if } x_2 = 0, \end{cases}$$

that is, componentwise (noting that λ_{1}(x) = λ_{2}(x) = x_{1} when x_{2} = 0),

$$\Phi(\mu, x) = \begin{cases} \dfrac{1}{2} \begin{bmatrix} \phi(\mu, \lambda_1(x)) + \phi(\mu, \lambda_2(x)) \\[1mm] \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \dfrac{\bar{x}_2}{\|x_2\|} \\ \vdots \\ \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \dfrac{\bar{x}_n}{\|x_2\|} \end{bmatrix} & \text{if } x_2 \neq 0, \\[10mm] \dfrac{1}{2} \begin{bmatrix} \phi(\mu, \lambda_1(x)) + \phi(\mu, \lambda_2(x)) \\ 0 \\ \vdots \\ 0 \end{bmatrix} & \text{if } x_2 = 0, \end{cases}$$

where x_{2} := (x̄_{2}, · · · , x̄_{n})^{T} ∈ IR^{n−1} and ω = (ω_{2}, · · · , ω_{n})^{T} ∈ IR^{n−1}. From the chain rule, it is immediate that

$$\frac{\partial \Phi(\mu, x)}{\partial \mu} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)}.$$

In order to compute ∂Φ(µ, x)/∂x, for simplicity, we denote

$$\Phi(\mu, x) := \frac{1}{2} \begin{bmatrix} \tau_1(\mu, x) \\ \tau_2(\mu, x) \\ \vdots \\ \tau_n(\mu, x) \end{bmatrix}.$$

To proceed, we discuss two cases.

(i) For x_{2} ≠ 0, we compute

$$\frac{\partial \tau_1(\mu, x)}{\partial x_1} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \lambda_1(x)} \frac{\partial \lambda_1(x)}{\partial x_1} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \lambda_2(x)} \frac{\partial \lambda_2(x)}{\partial x_1} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \lambda_1(x)} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \lambda_2(x)} := 2b$$

and, for i = 2, · · · , n,

$$\frac{\partial \tau_1(\mu, x)}{\partial \bar{x}_i} = \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \bar{x}_i} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \bar{x}_i} = -\frac{\partial \phi(\mu, \lambda_1(x))}{\partial \lambda_1(x)} \frac{\bar{x}_i}{\|x_2\|} + \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \lambda_2(x)} \frac{\bar{x}_i}{\|x_2\|} = \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right) \frac{\bar{x}_i}{\|x_2\|} := 2c\, \frac{\bar{x}_i}{\|x_2\|}.$$

Moreover,

$$\frac{\partial \tau_i(\mu, x)}{\partial x_1} = \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right) \frac{\bar{x}_i}{\|x_2\|} = 2c\, \frac{\bar{x}_i}{\|x_2\|}, \qquad i = 2, \cdots, n.$$

Similarly, we have

$$\frac{\partial \tau_2(\mu, x)}{\partial \bar{x}_2} = \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial \bar{x}_2} - \frac{\partial \phi(\mu, \lambda_1(x))}{\partial \bar{x}_2} \right) \frac{\bar{x}_2}{\|x_2\|} + \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \frac{\partial}{\partial \bar{x}_2} \left( \frac{\bar{x}_2}{\|x_2\|} \right)$$

$$= 2b\, \frac{\bar{x}_2 \bar{x}_2}{\|x_2\|^2} + \left( \phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) \right) \left( \frac{1}{\|x_2\|} - \frac{\bar{x}_2 \bar{x}_2}{\|x_2\|^3} \right) = 2a + 2(b - a)\, \frac{\bar{x}_2 \bar{x}_2}{\|x_2\|^2},$$

where a := (φ(µ, λ_{2}(x)) − φ(µ, λ_{1}(x)))/(λ_{2}(x) − λ_{1}(x)). In general, mimicking the same derivation yields

$$\frac{\partial \tau_i(\mu, x)}{\partial \bar{x}_j} = \begin{cases} 2a + 2(b - a)\, \dfrac{\bar{x}_i \bar{x}_i}{\|x_2\|^2} & \text{if } i = j, \\[2mm] 2(b - a)\, \dfrac{\bar{x}_i \bar{x}_j}{\|x_2\|^2} & \text{if } i \neq j. \end{cases}$$

To sum up, we obtain

$$\frac{\partial \Phi(\mu, x)}{\partial x} = \begin{bmatrix} b & c\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c\, \dfrac{x_2}{\|x_2\|} & a I + (b - a)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix},$$

which is the desired result.

(ii) For x_{2} = 0, it is clear to see that

$$\frac{\partial \tau_1(\mu, x)}{\partial x_1} = 2\, \frac{\partial \phi(\mu, x_1)}{\partial x_1} \qquad \text{and} \qquad \frac{\partial \tau_1(\mu, x)}{\partial \bar{x}_i} = 0 \quad \text{for } i = 2, \cdots, n.$$

Since τ_{i}(µ, x) = 0 for i = 2, · · · , n, it gives ∂τ_{i}(µ, x)/∂x_{1} = 0. Moreover,

$$\frac{\partial \tau_2(\mu, x)}{\partial \bar{x}_2} = \lim_{\bar{x}_2 \to 0} \frac{\tau_2(\mu, x_1, \bar{x}_2, 0, \cdots, 0) - \tau_2(\mu, x_1, 0, \cdots, 0)}{\bar{x}_2} = \lim_{\bar{x}_2 \to 0} \frac{\phi(\mu, x_1 + |\bar{x}_2|) - \phi(\mu, x_1 - |\bar{x}_2|)}{\bar{x}_2} \cdot \frac{\bar{x}_2}{|\bar{x}_2|}$$

$$= \lim_{\bar{x}_2 \to 0} \frac{\phi(\mu, x_1 + |\bar{x}_2|) - \phi(\mu, x_1 - |\bar{x}_2|)}{|\bar{x}_2|} = \lim_{\bar{x}_2 \to 0} \left[ \frac{\partial \phi(\mu, x_1 + |\bar{x}_2|)}{\partial |\bar{x}_2|} - \frac{\partial \phi(\mu, x_1 - |\bar{x}_2|)}{\partial |\bar{x}_2|} \right] \quad (\text{by L'Hôpital's rule})$$

$$= \lim_{\bar{x}_2 \to 0} \left[ \frac{\partial \phi(\mu, x_1 + |\bar{x}_2|)}{\partial (x_1 + |\bar{x}_2|)} + \frac{\partial \phi(\mu, x_1 - |\bar{x}_2|)}{\partial (x_1 - |\bar{x}_2|)} \right] = 2\, \frac{\partial \phi(\mu, x_1)}{\partial x_1}.$$

Thus, we obtain

$$\frac{\partial \tau_i(\mu, x)}{\partial \bar{x}_j} = \begin{cases} 2\, \dfrac{\partial \phi(\mu, x_1)}{\partial x_1} & \text{if } i = j, \\[2mm] 0 & \text{if } i \neq j, \end{cases}$$

which is equivalent to saying

$$\frac{\partial \Phi(\mu, x)}{\partial x} = \frac{\partial \phi(\mu, x_1)}{\partial x_1}\, I.$$

From all the above, we conclude that

$$\frac{\partial \Phi(\mu, x)}{\partial x} = \begin{cases} \dfrac{\partial \phi(\mu, x_1)}{\partial x_1}\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b & c\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c\, \dfrac{x_2}{\|x_2\|} & a I + (b - a)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0. \end{cases}$$

Thus, the proof is complete. □
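Proposition 2.1(b) is easy to sanity-check numerically. The following sketch (ours) compares the closed-form block ∂Φ/∂x against central finite differences for the smoothing function φ(µ, t) = √(4µ² + t²), which appears later as (17); its derivative ∂φ/∂t = t/√(4µ² + t²) is elementary.

```python
# Finite-difference verification of the Jacobian block in Proposition 2.1(b).
import numpy as np

phi  = lambda mu, t: np.sqrt(4 * mu**2 + t**2)
dphi = lambda mu, t: t / np.sqrt(4 * mu**2 + t**2)   # d(phi)/dt

def Phi(mu, y):
    """Phi(mu, x) from (5), via the spectral decomposition (3)-(4)."""
    y1, y2 = y[0], y[1:]
    m = np.linalg.norm(y2)
    w = y2 / m if m > 0 else np.zeros_like(y2)
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    return phi(mu, y1 - m) * u1 + phi(mu, y1 + m) * u2

def dPhi_dx(mu, y):
    """Closed-form dPhi/dx with the constants a, b, c of (8), case x_2 != 0."""
    y1, y2 = y[0], y[1:]
    m = np.linalg.norm(y2)
    l1, l2 = y1 - m, y1 + m
    a = (phi(mu, l2) - phi(mu, l1)) / (l2 - l1)
    b = 0.5 * (dphi(mu, l2) + dphi(mu, l1))
    c = 0.5 * (dphi(mu, l2) - dphi(mu, l1))
    w = y2 / m
    J = np.empty((len(y), len(y)))
    J[0, 0], J[0, 1:], J[1:, 0] = b, c * w, c * w
    J[1:, 1:] = a * np.eye(len(y2)) + (b - a) * np.outer(w, w)
    return J

mu, x, eps = 0.3, np.array([0.5, 1.0, -2.0]), 1e-6
J_fd = np.column_stack([(Phi(mu, x + eps * e) - Phi(mu, x - eps * e)) / (2 * eps)
                        for e in np.eye(len(x))])
assert np.allclose(J_fd, dPhi_dx(mu, x), atol=1e-6)   # formula (8) checks out
```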

Now, we are ready to answer the question about what kind of smoothing functions can be adopted in the smoothing-type algorithm. Two technical lemmas are needed towards the answer.

Lemma 2.1. Suppose that M, N ∈ IR^{n×n}. Let σ_{min}(M) denote the minimum singular value of M, and σ_{max}(N) denote the maximum singular value of N. Then, the following hold.

(a) σ_{min}(M) > σ_{max}(N) if and only if σ_{min}(M^{T}M) > σ_{max}(N^{T}N).

(b) If σ_{min}(M^{T}M) > σ_{max}(N^{T}N), then M^{T}M − N^{T}N is positive definite.

Proof. The proof is straightforward and can be found in standard textbooks on matrix analysis, so we omit it here. □

Lemma 2.2. Let A, S ∈ IR^{n×n} and let A be symmetric. Suppose that the eigenvalues of A and SS^{T} are arranged in non-increasing order. Then, for each k = 1, 2, · · · , n, there exists a nonnegative real number θ_{k} such that

$$\lambda_{\min}(SS^T) \le \theta_k \le \lambda_{\max}(SS^T) \qquad \text{and} \qquad \lambda_k(S A S^T) = \theta_k\, \lambda_k(A).$$

Proof. Please see [15, Corollary 4.5.11] for a proof. □

We point out that the crucial key, which guarantees that a smoothing function can be employed in the smoothing-type algorithm, is the nonsingularity of the Jacobian matrix H′(µ, x) given in (7). Below, we provide a condition under which the Jacobian matrix H′(µ, x) is nonsingular.

Theorem 2.1. Consider the SOCAVE (2) with σ_{min}(A) > σ_{max}(B). Let H be defined as in (6). Suppose that φ : IR_{++} × IR → IR is a smoothing function of |t|. If −1 ≤ dφ(µ, t)/dt ≤ 1 is satisfied, then the Jacobian matrix H′(µ, x) is nonsingular for any µ > 0.

Proof. From the expression of H′(µ, x) given in (7), we know that H′(µ, x) is nonsingular if and only if the matrix A + B ∂Φ(µ, x)/∂x is nonsingular. Thus, it suffices to show that the matrix A + B ∂Φ(µ, x)/∂x is nonsingular under the stated conditions. Suppose not; that is, there exists a vector 0 ≠ v ∈ IR^{n} such that

$$\left( A + B\, \frac{\partial \Phi(\mu, x)}{\partial x} \right) v = 0,$$

which implies that

$$v^T A^T A v = v^T \left( \frac{\partial \Phi(\mu, x)}{\partial x} \right)^T B^T B\, \frac{\partial \Phi(\mu, x)}{\partial x}\, v. \qquad (9)$$

For convenience, we denote C := ∂Φ(µ, x)/∂x. Then, it follows that v^{T}A^{T}Av = v^{T}C^{T}B^{T}BCv. Applying Lemma 2.2, there exists a constant θ̂ such that

$$\lambda_{\min}(C^T C) \le \hat{\theta} \le \lambda_{\max}(C^T C) \qquad \text{and} \qquad \lambda_{\max}(C^T B^T B C) = \hat{\theta}\, \lambda_{\max}(B^T B).$$

Note that if we can prove that

$$0 \le \lambda_{\min}(C^T C) \le \lambda_{\max}(C^T C) \le 1,$$

we will have λ_{max}(C^{T}B^{T}BC) ≤ λ_{max}(B^{T}B). Then, by the assumption that the minimum singular value of A strictly exceeds the maximum singular value of B (i.e., σ_{min}(A) > σ_{max}(B)) and applying Lemma 2.1, we obtain v^{T}A^{T}Av > v^{T}C^{T}B^{T}BCv. This contradicts the identity (9), which shows that the Jacobian matrix H′(µ, x) is nonsingular for µ > 0.

Thus, in light of the above discussion, it suffices to claim that 0 ≤ λ_{min}(C^{T}C) ≤ λ_{max}(C^{T}C) ≤ 1. To this end, we discuss two cases.

Case 1: For x_{2} = 0, we have C = (∂φ(µ, x_{1})/∂x_{1}) I. Since −1 ≤ ∂φ(µ, x_{1})/∂x_{1} ≤ 1, it is clear that 0 ≤ λ(C^{T}C) ≤ 1 for µ > 0. Then, the claim is done.

Case 2: For x_{2} ≠ 0, using the fact that the matrix M^{T}M is always positive semidefinite for any matrix M ∈ IR^{m×n}, we see that the inequality λ_{min}(C^{T}C) ≥ 0 always holds. In order to prove λ_{max}(C^{T}C) ≤ 1, we need to further argue that the matrix I − C^{T}C is positive semidefinite. First, we write out

$$I - C^T C = \begin{bmatrix} 1 - b^2 - c^2 & -2bc\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] -2bc\, \dfrac{x_2}{\|x_2\|} & (1 - a^2) I + (a^2 - b^2 - c^2)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}.$$

If −1 < ∂φ(µ, λ_{i}(x))/∂x_{1} < 1 for i = 1, 2, then we obtain

$$b^2 + c^2 = \frac{1}{2} \left[ \left( \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right)^2 + \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} \right)^2 \right] < 1.$$

This indicates that 1 − b² − c² > 0. By considering [1 − b² − c²] as a 1 × 1 matrix, this says [1 − b² − c²] is positive definite. Hence, its Schur complement can be computed as below:

$$(1 - a^2) I + (a^2 - b^2 - c^2)\, \frac{x_2 x_2^T}{\|x_2\|^2} - \frac{4 b^2 c^2}{1 - b^2 - c^2}\, \frac{x_2 x_2^T}{\|x_2\|^2} = (1 - a^2) \left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) + \frac{(1 - b^2 - c^2)^2 - 4 b^2 c^2}{1 - b^2 - c^2}\, \frac{x_2 x_2^T}{\|x_2\|^2}. \qquad (10)$$
On the other hand, by the Mean Value Theorem, we have

$$\phi(\mu, \lambda_2(x)) - \phi(\mu, \lambda_1(x)) = \frac{\partial \phi(\mu, \xi)}{\partial \xi}\, (\lambda_2(x) - \lambda_1(x)),$$

where ξ ∈ (λ_{1}(x), λ_{2}(x)). To proceed, we need to further discuss two subcases.

(1) When −1 < ∂φ(µ, ξ)/∂ξ < 1, we know |φ(µ, λ_{2}(x)) − φ(µ, λ_{1}(x))| < |λ_{2}(x) − λ_{1}(x)|. This together with (8) implies that 1 − a² > 0 for any µ > 0. In addition, for any µ > 0, we observe that

$$(1 - b^2 - c^2)^2 - 4 b^2 c^2 = \left( 1 - (b - c)^2 \right) \left( 1 - (b + c)^2 \right) = \left[ 1 - \left( \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} \right)^2 \right] \cdot \left[ 1 - \left( \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} \right)^2 \right] > 0.$$

With all of these, we verify that the Schur complement (10) of [1 − b² − c²] is a positive linear combination of the matrices I − x_{2}x_{2}^{T}/‖x_{2}‖² and x_{2}x_{2}^{T}/‖x_{2}‖², which yields that it is positive semidefinite. Hence, the matrix I − C^{T}C is also positive semidefinite, which is equivalent to saying 0 ≤ λ_{min}(C^{T}C) ≤ λ_{max}(C^{T}C) ≤ 1.

(2) When ∂φ(µ, ξ)/∂ξ = ±1, we have

$$1 - a^2 = 0 \qquad \text{and} \qquad (1 - b^2 - c^2)^2 - 4 b^2 c^2 > 0.$$

Since the matrix x_{2}x_{2}^{T}/‖x_{2}‖² is positive semidefinite, the matrix I − C^{T}C is positive semidefinite. Hence, 0 ≤ λ_{min}(C^{T}C) ≤ λ_{max}(C^{T}C) ≤ 1.

If either

$$\frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} = \pm 1 \ \text{and} \ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} = \pm 1, \qquad \text{or} \qquad \frac{\partial \phi(\mu, \lambda_1(x))}{\partial x_1} = \pm 1 \ \text{and} \ \frac{\partial \phi(\mu, \lambda_2(x))}{\partial x_1} = \mp 1,$$

then we have b = ±1, c = 0 or b = 0, c = ∓1, which yields b² + c² = 1. Again, two subcases are needed.

(1) When −1 < ∂φ(µ, ξ)/∂ξ < 1, we have |φ(µ, λ_{2}(x)) − φ(µ, λ_{1}(x))| < |λ_{2}(x) − λ_{1}(x)|. This implies that 1 − a² > 0 for any µ > 0. Therefore,

$$I - C^T C = \begin{bmatrix} 0 & 0 \\ 0 & (1 - a^2) \left( I - \dfrac{x_2 x_2^T}{\|x_2\|^2} \right) \end{bmatrix}.$$

Since the matrix I − x_{2}x_{2}^{T}/‖x_{2}‖² is positive semidefinite, the matrix I − C^{T}C is positive semidefinite. Hence, 0 ≤ λ_{min}(C^{T}C) ≤ λ_{max}(C^{T}C) ≤ 1.

(2) When ∂φ(µ, ξ)/∂ξ = ±1, we have I − C^{T}C = 0, which leads to λ(C^{T}C) = 1.

From all the above, the proof is complete. □
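For experiments, one convenient way (ours, for illustration) to generate instances satisfying the assumption of Theorem 2.1 is to prescribe the singular values of A directly:

```python
# Constructing A with sigma_min(A) > sigma_max(B) via a prescribed SVD.
import numpy as np

rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
smax_B = np.linalg.norm(B, 2)                     # largest singular value of B
U, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal factors
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = smax_B + 1.0 + rng.random(n)                  # every singular value exceeds smax_B
A = U @ np.diag(s) @ V.T
assert np.linalg.svd(A, compute_uv=False).min() > smax_B
```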

We point out that the condition σ_{min}(A) > σ_{max}(B) in Theorem 2.1 guarantees that the SOCAVE (2) has a unique solution, according to [35, Theorem 4.1]. From Theorem 2.1, we realize that for a SOCAVE (2) with σ_{min}(A) > σ_{max}(B), any smoothing function of |t| with −1 ≤ dφ(µ, t)/dt ≤ 1 can serve in the smoothing Newton algorithm for solving the SOCAVE. In light of this, it is easy to find or construct smoothing functions of |t| satisfying the above condition. One popular approach is smoothing approximation via convolution for the absolute value function [1, 22, 38, 45], which is described below.

First, we construct a smoothing approximation for the plus function (t)_{+} = max{0, t}. To this end, we consider a piecewise continuous function d(t) with a finite number of pieces which is a density (kernel) function; in other words, it satisfies

$$d(t) \ge 0 \qquad \text{and} \qquad \int_{-\infty}^{+\infty} d(t)\, dt = 1.$$

With this d(t), we further define ŝ(t, µ) := (1/µ) d(t/µ), where µ is a positive parameter. If ∫_{−∞}^{+∞} |t| d(t) dt < +∞, then a smoothing approximation for (t)_{+} is formed. In particular,

$$\hat{p}(t, \mu) = \int_{-\infty}^{+\infty} (t - s)_+\, \hat{s}(s, \mu)\, ds = \int_{-\infty}^{t} (t - s)\, \hat{s}(s, \mu)\, ds \approx (t)_+.$$
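To see the construction concretely, the sketch below (ours) approximates the integral by a Riemann sum for the uniform kernel, listed right after as d_{2}, and compares with the closed form that appears as (12).

```python
# Convolution smoothing of the plus function with the uniform kernel d_2.
import numpy as np

s = np.linspace(-5.0, 5.0, 200001)
ds = s[1] - s[0]

def p_hat(t, mu, d):
    """p_hat(t, mu) = int (t - s)_+ * (1/mu) d(s/mu) ds, via a Riemann sum."""
    return np.sum(np.maximum(t - s, 0.0) * d(s / mu) / mu) * ds

d2 = lambda u: np.where(np.abs(u) <= 0.5, 1.0, 0.0)   # uniform density
mu = 0.4
for t in (-1.0, -0.1, 0.0, 0.1, 1.0):
    exact = t if t >= mu / 2 else (0.0 if t <= -mu / 2 else (t + mu / 2)**2 / (2 * mu))
    assert abs(p_hat(t, mu, d2) - exact) < 1e-3       # matches (12)
```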

The following are four well-known smoothing functions for the plus function [1, 38]:

$$\hat{\phi}_1(\mu, t) = t + \mu \ln\left( 1 + e^{-t/\mu} \right), \qquad (11)$$

$$\hat{\phi}_2(\mu, t) = \begin{cases} t & \text{if } t \ge \frac{\mu}{2}, \\[1mm] \dfrac{1}{2\mu} \left( t + \dfrac{\mu}{2} \right)^2 & \text{if } -\frac{\mu}{2} < t < \frac{\mu}{2}, \\[1mm] 0 & \text{if } t \le -\frac{\mu}{2}, \end{cases} \qquad (12)$$

$$\hat{\phi}_3(\mu, t) = \frac{\sqrt{4\mu^2 + t^2} + t}{2}, \qquad (13)$$

$$\hat{\phi}_4(\mu, t) = \begin{cases} t - \frac{\mu}{2} & \text{if } t > \mu, \\[1mm] \dfrac{t^2}{2\mu} & \text{if } 0 \le t \le \mu, \\[1mm] 0 & \text{if } t < 0, \end{cases} \qquad (14)$$

where the corresponding kernel functions are

$$d_1(t) = \frac{e^{-t}}{(1 + e^{-t})^2}, \qquad d_2(t) = \begin{cases} 1 & \text{if } -\frac{1}{2} \le t \le \frac{1}{2}, \\ 0 & \text{otherwise}, \end{cases}$$

$$d_3(t) = \frac{2}{(t^2 + 4)^{3/2}}, \qquad d_4(t) = \begin{cases} 1 & \text{if } 0 \le t \le 1, \\ 0 & \text{otherwise}. \end{cases}$$

Next, in light of |t| = (t)_{+} + (−t)_{+}, the smoothing function of |t| via convolution can be written as

$$\hat{p}(|t|, \mu) = \hat{p}(t, \mu) + \hat{p}(-t, \mu) = \int_{-\infty}^{+\infty} |t - s|\, \hat{s}(s, \mu)\, ds.$$

Analogous to (11)-(14), we achieve the following smoothing functions for |t|:

$$\phi_1(\mu, t) = \mu \left[ \ln\left( 1 + e^{-t/\mu} \right) + \ln\left( 1 + e^{t/\mu} \right) \right], \qquad (15)$$

$$\phi_2(\mu, t) = \begin{cases} t & \text{if } t \ge \frac{\mu}{2}, \\[1mm] \dfrac{t^2}{\mu} + \dfrac{\mu}{4} & \text{if } -\frac{\mu}{2} < t < \frac{\mu}{2}, \\[1mm] -t & \text{if } t \le -\frac{\mu}{2}, \end{cases} \qquad (16)$$

$$\phi_3(\mu, t) = \sqrt{4\mu^2 + t^2}, \qquad (17)$$

$$\phi_4(\mu, t) = \begin{cases} \dfrac{t^2}{2\mu} & \text{if } |t| \le \mu, \\[1mm] |t| - \dfrac{\mu}{2} & \text{if } |t| > \mu. \end{cases} \qquad (18)$$

If we take the Epanechnikov kernel function

$$K(t) = \begin{cases} \frac{3}{4}(1 - t^2) & \text{if } |t| \le 1, \\ 0 & \text{otherwise}, \end{cases}$$

then we obtain the following smoothing function for |t|:

$$\phi_5(\mu, t) = \begin{cases} t & \text{if } t > \mu, \\[1mm] -\dfrac{t^4}{8\mu^3} + \dfrac{3t^2}{4\mu} + \dfrac{3\mu}{8} & \text{if } -\mu \le t \le \mu, \\[1mm] -t & \text{if } t < -\mu. \end{cases} \qquad (19)$$

Moreover, taking the Gaussian kernel function K(t) = (1/√(2π)) e^{−t²/2} for all t ∈ IR yields

$$\hat{s}(t, \mu) := \frac{1}{\mu} K\!\left( \frac{t}{\mu} \right) = \frac{1}{\sqrt{2\pi\mu^2}}\, e^{-t^2/(2\mu^2)},$$

and it leads to the smoothing function [45] for |t|:

$$\phi_6(\mu, t) = t\, \operatorname{erf}\!\left( \frac{t}{\sqrt{2}\,\mu} \right) + \sqrt{\frac{2}{\pi}}\, \mu\, e^{-t^2/(2\mu^2)}, \qquad (20)$$

where the error function is defined by

$$\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-u^2}\, du, \qquad \forall t \in \mathrm{IR}.$$

In summary, we have constructed six smoothing functions from the above discussions. Can all of these functions serve as smoothing functions for solving the SOCAVE? The answer is affirmative because it is not hard to verify that each φ_{i} possesses −1 ≤ dφ_{i}(µ, t)/dt ≤ 1. Thus, these six functions will be adopted for our numerical implementations; a small numerical sanity check is sketched below. Accordingly, we need to define Φ_{i}(µ, x) and H_{i}(µ, x) based on each φ_{i}. For subsequent needs, we only present the expression of each Jacobian matrix H_{i}′(µ, x) without detailed derivations.
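The sketch below (ours) implements (15)-(20) and performs this verification numerically; φ_{1} is written with `np.logaddexp` for numerical stability, and `scipy.special.erf` (assumed available) is used for φ_{6}.

```python
# The six smoothing functions (15)-(20) and a check of their two key properties.
import numpy as np
from scipy.special import erf

def phi1(m, t): return m * (np.logaddexp(0, -t / m) + np.logaddexp(0, t / m))
def phi2(m, t): return np.where(np.abs(t) >= m / 2, np.abs(t), t**2 / m + m / 4)
def phi3(m, t): return np.sqrt(4 * m**2 + t**2)
def phi4(m, t): return np.where(np.abs(t) <= m, t**2 / (2 * m), np.abs(t) - m / 2)
def phi5(m, t): return np.where(np.abs(t) > m, np.abs(t),
                                -t**4 / (8 * m**3) + 3 * t**2 / (4 * m) + 3 * m / 8)
def phi6(m, t): return t * erf(t / (np.sqrt(2) * m)) \
                       + np.sqrt(2 / np.pi) * m * np.exp(-t**2 / (2 * m**2))

t = np.linspace(-3, 3, 2001)
for phi in (phi1, phi2, phi3, phi4, phi5, phi6):
    assert np.max(np.abs(phi(1e-6, t) - np.abs(t))) < 1e-4   # phi -> |t| as mu -> 0+
    slope = np.diff(phi(0.5, t)) / np.diff(t)                # numerical d(phi)/dt
    assert np.all(np.abs(slope) <= 1 + 1e-8)                 # -1 <= d(phi)/dt <= 1
```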

Based on each φ_{i}, let Φ_{i} : IR_{++} × IR^{n} → IR^{n} for i = 1, 2, · · · , 6 be defined as in (5), i.e.,

$$\Phi_i(\mu, x) = \phi_i(\mu, \lambda_1(x))\, u_x^{(1)} + \phi_i(\mu, \lambda_2(x))\, u_x^{(2)}, \qquad (21)$$

and let H_{i} : IR_{++} × IR^{n} → IR × IR^{n} for i = 1, 2, · · · , 6 be defined as in (6), i.e.,

$$H_i(\mu, x) = \begin{bmatrix} \mu \\ Ax + B\,\Phi_i(\mu, x) - b \end{bmatrix}, \qquad \forall \mu \in \mathrm{IR}_{++},\ x \in \mathrm{IR}^n. \qquad (22)$$

Then, each H_{i} is continuously differentiable on IR_{++} × IR^{n} with the Jacobian matrix given by

$$H_i'(\mu, x) = \begin{bmatrix} 1 & 0 \\[1mm] B\, \dfrac{\partial \Phi_i(\mu, x)}{\partial \mu} & A + B\, \dfrac{\partial \Phi_i(\mu, x)}{\partial x} \end{bmatrix} \qquad (23)$$

for all (µ, x) ∈ IR_{++} × IR^{n} with x = (x_{1}, x_{2}) ∈ IR × IR^{n−1}. Moreover, the differentiation of each Φ_{i} is expressed as below.
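Before the individual Jacobians are listed, here is a rough sketch (ours) of how (22)-(23) enter a Newton-type iteration. The actual smoothing Newton algorithm of [36] uses a perturbed Newton equation, line searches, and safeguards that keep µ positive; we only imitate that here by damping the step. The example uses φ_{3} and the closed-form blocks from item (3) below.

```python
# One damped Newton step on H_3(mu, x) = 0, assembled from (22)-(23).
import numpy as np

phi = lambda mu, t: np.sqrt(4 * mu**2 + t**2)            # phi_3 from (17)
dt  = lambda mu, t: t / np.sqrt(4 * mu**2 + t**2)        # d(phi_3)/dt
dmu = lambda mu, t: 4 * mu / np.sqrt(4 * mu**2 + t**2)   # d(phi_3)/dmu

def newton_step(mu, x, A, B, b, tau=0.9):
    n = len(x)
    x1, x2 = x[0], x[1:]
    m = np.linalg.norm(x2)
    w = x2 / m if m > 0 else np.zeros(n - 1)
    l1, l2 = x1 - m, x1 + m
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0],  w))
    Phi = phi(mu, l1) * u1 + phi(mu, l2) * u2
    dPhi_dmu = dmu(mu, l1) * u1 + dmu(mu, l2) * u2
    if m == 0:
        dPhi_dx = dt(mu, x1) * np.eye(n)
    else:
        a  = (phi(mu, l2) - phi(mu, l1)) / (l2 - l1)
        bb = 0.5 * (dt(mu, l2) + dt(mu, l1))
        cc = 0.5 * (dt(mu, l2) - dt(mu, l1))
        dPhi_dx = np.empty((n, n))
        dPhi_dx[0, 0], dPhi_dx[0, 1:], dPhi_dx[1:, 0] = bb, cc * w, cc * w
        dPhi_dx[1:, 1:] = a * np.eye(n - 1) + (bb - a) * np.outer(w, w)
    H = np.concatenate(([mu], A @ x + B @ Phi - b))      # (22)
    J = np.zeros((n + 1, n + 1))                         # (23)
    J[0, 0] = 1.0
    J[1:, 0] = B @ dPhi_dmu
    J[1:, 1:] = A + B @ dPhi_dx
    d = np.linalg.solve(J, -H)   # nonsingular by Theorem 2.1 when sigma_min(A) > sigma_max(B)
    return mu + tau * d[0], x + tau * d[1:], np.linalg.norm(H)
```

Repeating `mu, x, res = newton_step(mu, x, A, B, b)` until `res` is small gives a bare-bones solver for the SOCAVE (2) under the assumption of Theorem 2.1.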

(1) The Jacobian of Φ_{1} is characterized as below.

$$\frac{\partial \Phi_1(\mu, x)}{\partial \mu} = \frac{\partial \phi_1(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi_1(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)}$$

$$= \left[ \frac{\phi_1(\mu, \lambda_1(x))}{\mu} + \frac{\lambda_1(x)}{\mu} \cdot \frac{1 - e^{\lambda_1(x)/\mu}}{1 + e^{\lambda_1(x)/\mu}} \right] u_x^{(1)} + \left[ \frac{\phi_1(\mu, \lambda_2(x))}{\mu} + \frac{\lambda_2(x)}{\mu} \cdot \frac{1 - e^{\lambda_2(x)/\mu}}{1 + e^{\lambda_2(x)/\mu}} \right] u_x^{(2)}.$$

$$\frac{\partial \Phi_1(\mu, x)}{\partial x} = \begin{cases} \dfrac{e^{x_1/\mu} - 1}{e^{x_1/\mu} + 1}\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_1 & c_1\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_1\, \dfrac{x_2}{\|x_2\|} & a_1 I + (b_1 - a_1)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a_1 = \frac{\phi_1(\mu, \lambda_2(x)) - \phi_1(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \qquad b_1 = \frac{1}{2} \left( \frac{e^{\lambda_1(x)/\mu} - 1}{e^{\lambda_1(x)/\mu} + 1} + \frac{e^{\lambda_2(x)/\mu} - 1}{e^{\lambda_2(x)/\mu} + 1} \right), \qquad c_1 = \frac{1}{2} \left( \frac{1 - e^{\lambda_1(x)/\mu}}{e^{\lambda_1(x)/\mu} + 1} + \frac{e^{\lambda_2(x)/\mu} - 1}{e^{\lambda_2(x)/\mu} + 1} \right).$$
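As a quick check (ours) that the ∂Φ_{1}/∂µ coefficient above is consistent with (15), one can compare it against a central finite difference:

```python
# Finite-difference check of the d(phi_1)/dmu coefficient used above.
import numpy as np

phi1 = lambda m, t: m * (np.logaddexp(0, -t / m) + np.logaddexp(0, t / m))   # (15)
dphi1_dmu = lambda m, t: phi1(m, t) / m \
                         + (t / m) * (1 - np.exp(t / m)) / (1 + np.exp(t / m))

m, t, h = 0.7, 1.3, 1e-6
fd = (phi1(m + h, t) - phi1(m - h, t)) / (2 * h)
assert np.isclose(fd, dphi1_dmu(m, t))
```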

(2) The Jacobian of Φ_{2} is characterized as below.

$$\frac{\partial \Phi_2(\mu, x)}{\partial \mu} = \frac{\partial \phi_2(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi_2(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)}$$

with

$$\frac{\partial \phi_2(\mu, \lambda_i(x))}{\partial \mu} = \begin{cases} 0 & \text{if } \lambda_i(x) \ge \frac{\mu}{2}, \\[1mm] -\left( \dfrac{\lambda_i(x)}{\mu} \right)^2 + \dfrac{1}{4} & \text{if } -\frac{\mu}{2} < \lambda_i(x) < \frac{\mu}{2}, \\[1mm] 0 & \text{if } \lambda_i(x) \le -\frac{\mu}{2}. \end{cases}$$

$$\frac{\partial \Phi_2(\mu, x)}{\partial x} = \begin{cases} d\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_2 & c_2\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_2\, \dfrac{x_2}{\|x_2\|} & a_2 I + (b_2 - a_2)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a_2 = \frac{\phi_2(\mu, \lambda_2(x)) - \phi_2(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)},$$

$$b_2 = \begin{cases} 0 & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > -\frac{\mu}{2} \ge \lambda_1(x), \\ 1 & \text{if } \lambda_2(x) > \lambda_1(x) \ge \frac{\mu}{2}, \\ \frac{\lambda_1(x)}{\mu} + \frac{1}{2} & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_1(x) + \lambda_2(x)}{\mu} & \text{if } \frac{\mu}{2} > \lambda_2(x) > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_2(x)}{\mu} - \frac{1}{2} & \text{if } \frac{\mu}{2} > \lambda_2(x) > -\frac{\mu}{2} \ge \lambda_1(x), \\ -1 & \text{if } \lambda_1(x) < \lambda_2(x) \le -\frac{\mu}{2}, \end{cases} \qquad c_2 = \begin{cases} 1 & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > -\frac{\mu}{2} \ge \lambda_1(x), \\ 0 & \text{if } \lambda_2(x) > \lambda_1(x) \ge \frac{\mu}{2}, \\ \frac{1}{2} - \frac{\lambda_1(x)}{\mu} & \text{if } \lambda_2(x) \ge \frac{\mu}{2} > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_2(x) - \lambda_1(x)}{\mu} & \text{if } \frac{\mu}{2} > \lambda_2(x) > \lambda_1(x) > -\frac{\mu}{2}, \\ \frac{\lambda_2(x)}{\mu} + \frac{1}{2} & \text{if } \frac{\mu}{2} > \lambda_2(x) > -\frac{\mu}{2} \ge \lambda_1(x), \\ 0 & \text{if } \lambda_1(x) < \lambda_2(x) \le -\frac{\mu}{2}, \end{cases}$$

$$d = \begin{cases} 1 & \text{if } x_1 \ge \frac{\mu}{2}, \\[1mm] \frac{2 x_1}{\mu} & \text{if } -\frac{\mu}{2} < x_1 < \frac{\mu}{2}, \\[1mm] -1 & \text{if } x_1 \le -\frac{\mu}{2}. \end{cases}$$
(3) The Jacobian of Φ_{3} is characterized as below.

$$\frac{\partial \Phi_3(\mu, x)}{\partial \mu} = \frac{4\mu}{\sqrt{4\mu^2 + \lambda_1^2(x)}}\, u_x^{(1)} + \frac{4\mu}{\sqrt{4\mu^2 + \lambda_2^2(x)}}\, u_x^{(2)}.$$

$$\frac{\partial \Phi_3(\mu, x)}{\partial x} = \begin{cases} \dfrac{x_1}{\sqrt{4\mu^2 + x_1^2}}\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_3 & c_3\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_3\, \dfrac{x_2}{\|x_2\|} & a_3 I + (b_3 - a_3)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a_3 = \frac{\phi_3(\mu, \lambda_2(x)) - \phi_3(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)}, \qquad b_3 = \frac{1}{2} \left( \frac{\lambda_1(x)}{\sqrt{4\mu^2 + \lambda_1^2(x)}} + \frac{\lambda_2(x)}{\sqrt{4\mu^2 + \lambda_2^2(x)}} \right), \qquad c_3 = \frac{1}{2} \left( \frac{-\lambda_1(x)}{\sqrt{4\mu^2 + \lambda_1^2(x)}} + \frac{\lambda_2(x)}{\sqrt{4\mu^2 + \lambda_2^2(x)}} \right).$$

(4) The Jacobian of Φ_{4} is characterized as below.

$$\frac{\partial \Phi_4(\mu, x)}{\partial \mu} = \frac{\partial \phi_4(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi_4(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)}$$

with

$$\frac{\partial \phi_4(\mu, \lambda_i(x))}{\partial \mu} = \begin{cases} -\dfrac{1}{2} & \text{if } \lambda_i(x) > \mu, \\[1mm] -\dfrac{1}{2} \left( \dfrac{\lambda_i(x)}{\mu} \right)^2 & \text{if } -\mu \le \lambda_i(x) \le \mu, \\[1mm] -\dfrac{1}{2} & \text{if } \lambda_i(x) < -\mu. \end{cases}$$

$$\frac{\partial \Phi_4(\mu, x)}{\partial x} = \begin{cases} e\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_4 & c_4\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_4\, \dfrac{x_2}{\|x_2\|} & a_4 I + (b_4 - a_4)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0, \end{cases}$$

with

$$a_4 = \frac{\phi_4(\mu, \lambda_2(x)) - \phi_4(\mu, \lambda_1(x))}{\lambda_2(x) - \lambda_1(x)},$$

$$b_4 = \begin{cases} 0 & \text{if } \lambda_2(x) > \mu > -\mu > \lambda_1(x), \\ 1 & \text{if } \lambda_2(x) > \lambda_1(x) > \mu, \\ \frac{\lambda_1(x)}{2\mu} + \frac{1}{2} & \text{if } \lambda_2(x) > \mu \ge \lambda_1(x) \ge -\mu, \\ \frac{\lambda_1(x) + \lambda_2(x)}{2\mu} & \text{if } \mu \ge \lambda_2(x) > \lambda_1(x) \ge -\mu, \\ \frac{\lambda_2(x)}{2\mu} - \frac{1}{2} & \text{if } \mu \ge \lambda_2(x) \ge -\mu > \lambda_1(x), \\ -1 & \text{if } \lambda_1(x) < \lambda_2(x) < -\mu, \end{cases} \qquad c_4 = \begin{cases} 1 & \text{if } \lambda_2(x) > \mu > -\mu > \lambda_1(x), \\ 0 & \text{if } \lambda_2(x) > \lambda_1(x) > \mu, \\ \frac{1}{2} - \frac{\lambda_1(x)}{2\mu} & \text{if } \lambda_2(x) > \mu \ge \lambda_1(x) \ge -\mu, \\ \frac{\lambda_2(x) - \lambda_1(x)}{2\mu} & \text{if } \mu \ge \lambda_2(x) > \lambda_1(x) \ge -\mu, \\ \frac{\lambda_2(x)}{2\mu} + \frac{1}{2} & \text{if } \mu \ge \lambda_2(x) \ge -\mu > \lambda_1(x), \\ 0 & \text{if } \lambda_1(x) < \lambda_2(x) < -\mu, \end{cases}$$

$$e = \begin{cases} 1 & \text{if } x_1 > \mu, \\[1mm] \frac{x_1}{\mu} & \text{if } -\mu \le x_1 \le \mu, \\[1mm] -1 & \text{if } x_1 < -\mu. \end{cases}$$

(5) The Jacobian of Φ_{5} is characterized as below.

$$\frac{\partial \Phi_5(\mu, x)}{\partial \mu} = \frac{\partial \phi_5(\mu, \lambda_1(x))}{\partial \mu}\, u_x^{(1)} + \frac{\partial \phi_5(\mu, \lambda_2(x))}{\partial \mu}\, u_x^{(2)}$$

with

$$\frac{\partial \phi_5(\mu, \lambda_i(x))}{\partial \mu} = \begin{cases} 0 & \text{if } \lambda_i(x) > \mu, \\[1mm] \dfrac{3}{8} \left( \left( \dfrac{\lambda_i(x)}{\mu} \right)^2 - 1 \right)^2 & \text{if } -\mu \le \lambda_i(x) \le \mu, \\[1mm] 0 & \text{if } \lambda_i(x) < -\mu. \end{cases}$$

$$\frac{\partial \Phi_5(\mu, x)}{\partial x} = \begin{cases} e\, I & \text{if } x_2 = 0, \\[3mm] \begin{bmatrix} b_5 & c_5\, \dfrac{x_2^T}{\|x_2\|} \\[2mm] c_5\, \dfrac{x_2}{\|x_2\|} & a_5 I + (b_5 - a_5)\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} & \text{if } x_2 \neq 0. \end{cases}$$