# SOC Functions and Their Applications


### Jein-Shan Chen

Department of Mathematics, National Taiwan Normal University


### Preface

Second-order cone programs (SOCPs) have attracted much attention due to their wide range of applications in engineering, data science, and finance. To deal with this special type of optimization problem involving the second-order cone (SOC), we believe that the following items are crucial concepts: (i) the spectral decomposition associated with SOC, (ii) the analysis of SOC functions, and (iii) SOC-convexity and SOC-monotonicity. In this book, we go through all these concepts and try to provide the reader with a whole picture regarding SOC functions and their applications.

As introduced in Chapter 1, SOC functions are vector-valued functions associated with SOC, accompanied by the Jordan product. However, unlike matrix multiplication, the Jordan product associated with SOC is not associative, which is the main source of difficulty in the analysis. Therefore, the ideas for proofs are usually quite different from those for matrix-valued functions. In other words, although the SOC and the positive semidefinite cone both belong to the family of symmetric cones, the analysis for them is different. In general, the arguments are more tedious and need subtle arrangements in the SOC setting. This is due to the feature of SOC.

To deal with second-order cone programs (SOCPs) and second-order cone complementarity problems (SOCCPs), many methods rely on some SOC complementarity function or merit function to reformulate the KKT optimality conditions as a nonsmooth (or smoothing) system of equations or an unconstrained minimization problem. In fact, such SOC complementarity and merit functions are connected to SOC functions. In other words, the vector-valued functions associated with SOC are heavily used in solution methods for SOCPs and SOCCPs. Therefore, further study of these functions will be helpful for developing and analyzing more solution methods.

For SOCPs, there are still many approaches that do not use SOC complementarity functions. In this case, the concepts of SOC-convexity and SOC-monotonicity introduced in Chapter 2 play a key role in those solution methods. In Chapter 3, we present proximal-type algorithms in which SOC-convexity and SOC-monotonicity are needed for designing solution methods and proving convergence.

In Chapter 4, we pay attention to some other types of applications of SOC functions, SOC-convexity, and SOC-monotonicity introduced in this monograph. These include the so-called SOC means, SOC weighted means, and a few SOC trace versions of the Young, Hölder, and Minkowski inequalities, as well as the Powers-Størmer inequality. All these materials are newly discovered, and we believe that they will be helpful in the convergence analysis of various optimization problems involving SOC. Chapter 5 offers a direction for future investigation, although it is not yet complete.


This book is based on my series of studies on the second-order cone, SOCP, SOCCP, SOC functions, etc. during the past fifteen years. It is dedicated to the memory of my supervisor, Prof. Paul Tseng, who guided me into optimization research, especially second-order cone optimization. Without his encouragement, it would not have been possible to achieve the whole picture of SOC functions, which is the main subject of this monograph. His attitude towards doing research always remains in my heart, although he went missing in 2009. I would like to thank all my co-authors of the materials that appear in this book, including Prof. Shaohua Pan, Prof. Xin Chen, Prof. Jiawei Zhang, Prof. Yu-Lin Chang, and Dr. Chien-Hao Huang. The collaborations with them were wonderful and enjoyable experiences. I also thank Dr. Chien-Hao Huang, Dr. Yue Lu, Dr. Liguo Jiao, Prof. Xinhe Miao, and Prof. Chu-Chin Hu for their help with proofreading. Final gratitude goes to my family, Vivian, Benjamin, and Ian, who offer me support and endless strength in pursuing my academic career.

January 07, 2019
Taipei, Taiwan


### Notations

• Throughout this book, an n-dimensional vector x = (x_1, x_2, . . . , x_n) ∈ IR^n means a column vector, i.e., x = (x_1, x_2, . . . , x_n)^T. In other words, without ambiguity, we also write the column vector as x = (x_1, x_2, . . . , x_n).

• IR^n_+ denotes {x = (x_1, x_2, . . . , x_n) | x_i ≥ 0 for all i = 1, 2, . . . , n}, whereas IR^n_{++} denotes {x = (x_1, x_2, . . . , x_n) | x_i > 0 for all i = 1, 2, . . . , n}.

• ⟨·, ·⟩ denotes the Euclidean inner product.

• The superscript T means the transpose of a vector or a matrix.

• B(x, δ) denotes the neighborhood of x with radius δ > 0.

• IR^{n×n} denotes the space of n × n real matrices.

• I represents an identity matrix of suitable dimension.

• For any symmetric matrices A, B ∈ IR^{n×n}, we write A ⪰ B (respectively, A ≻ B) to mean that A − B is positive semidefinite (respectively, positive definite).

• S^n denotes the space of n × n symmetric matrices, and S^n_+ means the space of n × n symmetric positive semidefinite matrices.

• O denotes the set of P ∈ IR^{n×n} that are orthogonal, i.e., P^T = P^{−1}.

• ‖ · ‖ is the Euclidean norm.

• Given a set S, we denote by S̄, int(S), and bd(S) the closure, the interior, and the boundary of S, respectively.

• A function f : IR^n → (−∞, ∞] is said to be proper if f(ζ) < ∞ for at least one ζ ∈ IR^n and f(ζ) > −∞ for all ζ ∈ IR^n.

• For a mapping f : IR^n → IR, ∇f(x) denotes the gradient of f at x.

• For a closed proper convex function f : IR^n → (−∞, ∞], we denote its domain by dom f := {ζ ∈ IR^n | f(ζ) < ∞}.


• For a closed proper convex function f : IR^n → (−∞, ∞], we denote the subdifferential of f at ζ̂ by

  ∂f(ζ̂) := { w ∈ IR^n | f(ζ) ≥ f(ζ̂) + ⟨w, ζ − ζ̂⟩, ∀ζ ∈ IR^n }.

• C^(i)(J) denotes the family of functions defined from J ⊆ IR^n to IR that have continuous i-th derivatives.

• For any differentiable mapping F = (F_1, F_2, . . . , F_m) : IR^n → IR^m, ∇F(x) = [∇F_1(x) · · · ∇F_m(x)] is an n × m matrix which denotes the transposed Jacobian of F at x.

• For any x, y ∈ IR^n, we write x ⪰_{K^n} y if x − y ∈ K^n, and write x ≻_{K^n} y if x − y ∈ int(K^n).

• For a real-valued function f : J → IR, f'(t) and f''(t) denote the first and second derivatives of f at a differentiable point t ∈ J, respectively.

• For a mapping F : S ⊆ IR^n → IR^m, ∂F(x) denotes the subdifferential of F at x, while ∂_B F(x) denotes the B-subdifferential of F at x.


## SOC Functions

During the past two decades, there has been active research on second-order cone programs (SOCPs) and second-order cone complementarity problems (SOCCPs). Various methods have been proposed, including interior-point methods [1, 102, 109, 123, 146], smoothing Newton methods [51, 63, 71], semismooth Newton methods [86, 120], and merit function methods [43, 48]. All of these methods use some SOC complementarity function or merit function to reformulate the KKT optimality conditions as a nonsmooth (or smoothing) system of equations or an unconstrained minimization problem. In fact, such SOC complementarity and merit functions are closely connected to so-called SOC functions. In other words, studying SOC functions is crucial to dealing with SOCPs and SOCCPs, which is the main target of this chapter.

### 1.1 On the second-order cone

The second-order cone (SOC) in IR^n, also called the Lorentz cone, is defined by

  K^n = { (x_1, x_2) ∈ IR × IR^{n−1} | ‖x_2‖ ≤ x_1 },   (1.1)

where ‖ · ‖ denotes the Euclidean norm. If n = 1, K^n denotes the set of nonnegative reals IR_+. For n = 2 and n = 3, K^n is depicted in Figure 1.1(a) and Figure 1.1(b), respectively. It is known that K^n is a pointed closed convex cone, so a partial ordering can be deduced. More specifically, for any x, y ∈ IR^n, we write x ⪰_{K^n} y if x − y ∈ K^n, and write x ≻_{K^n} y if x − y ∈ int(K^n). In other words, x ⪰_{K^n} 0 if and only if x ∈ K^n, whereas x ≻_{K^n} 0 if and only if x ∈ int(K^n). The relation ⪰_{K^n} is a partial ordering, but not a linear ordering on K^n, i.e., there exist x, y ∈ K^n such that neither x ⪰_{K^n} y nor y ⪰_{K^n} x. To see this, for n = 2, let x = (1, 1) ∈ K^2 and y = (1, 0) ∈ K^2. Then x − y = (0, 1) ∉ K^2 and y − x = (0, −1) ∉ K^2.

Figure 1.1: The graphs of SOC: (a) the 2-dimensional SOC, (b) the 3-dimensional SOC.

The second-order cone has received much attention in optimization, particularly in the context of applications and solution methods for the second-order cone program (SOCP) [1, 47, 48, 102, 115, 116, 118] and the second-order cone complementarity problem (SOCCP) [42, 43, 45, 48, 63, 71, 117]. Those solution methods require the spectral decomposition associated with SOC, whose basic concept is described below. Any x = (x_1, x_2) ∈ IR × IR^{n−1} can be decomposed as

  x = λ_1(x) u_x^{(1)} + λ_2(x) u_x^{(2)},   (1.2)

where λ_1(x), λ_2(x) and u_x^{(1)}, u_x^{(2)} are the spectral values and the associated spectral vectors of x, given by

  λ_i(x) = x_1 + (−1)^i ‖x_2‖,   (1.3)

  u_x^{(i)} = (1/2) (1, (−1)^i x_2/‖x_2‖)  if x_2 ≠ 0;  u_x^{(i)} = (1/2) (1, (−1)^i w)  if x_2 = 0,   (1.4)

for i = 1, 2, with w being any vector in IR^{n−1} satisfying ‖w‖ = 1. If x_2 ≠ 0, the decomposition is unique.
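The decomposition (1.2)-(1.4) is straightforward to implement and verify numerically. The sketch below (ours, not the book's; NumPy assumed, and `spectral` is our own helper name) computes the spectral values and vectors and checks that they reconstruct x.

```python
import numpy as np

def spectral(x):
    # Spectral values lam_i(x) and vectors u_x^{(i)} per (1.3)-(1.4).
    x = np.asarray(x, dtype=float)
    x1, x2 = x[0], x[1:]
    n2 = np.linalg.norm(x2)
    # Any unit vector w works when x2 = 0; we pick the first coordinate axis.
    w = x2 / n2 if n2 > 0 else np.eye(len(x2))[0]
    lam = [x1 - n2, x1 + n2]
    u = [0.5 * np.concatenate(([1.0], s * w)) for s in (-1.0, 1.0)]
    return lam, u

x = np.array([3.0, 1.0, -2.0])
lam, u = spectral(x)
print(np.allclose(lam[0] * u[0] + lam[1] * u[1], x))  # -> True, recovering (1.2)
```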

For any x = (x_1, x_2) ∈ IR × IR^{n−1} and y = (y_1, y_2) ∈ IR × IR^{n−1}, we define their Jordan product as

  x ∘ y = (⟨x, y⟩, y_1 x_2 + x_1 y_2) ∈ IR × IR^{n−1}.   (1.5)

The Jordan product is not associative. For example, for n = 3, let x = (1, −1, 1) and y = z = (1, 0, 1); then (x ∘ y) ∘ z = (4, −1, 4) ≠ x ∘ (y ∘ z) = (4, −2, 4). However, it is power associative, i.e., x ∘ (x ∘ x) = (x ∘ x) ∘ x for all x ∈ IR^n. Thus, without fear of ambiguity, we may write x^m for the product of m copies of x, and x^{m+n} = x^m ∘ x^n for all positive integers m and n. The vector e = (1, 0, . . . , 0) is the unique identity element for the Jordan product, and we define x^0 = e for convenience. In addition, K^n is not closed under the Jordan product. For example, x = (√2, 1, 1) ∈ K^3 and y = (√2, 1, −1) ∈ K^3, but x ∘ y = (2, 2√2, 0) ∉ K^3. We point out that the lack of associativity of the Jordan product and the lack of closedness of SOC are the main sources of difficulty when dealing with SOC.

We write x^2 to denote x ∘ x and write x + y to mean the usual componentwise addition of vectors. Then ∘, + together with e = (1, 0, . . . , 0) ∈ IR^n have the following basic properties (see [61, 63]):

(1) e ∘ x = x for all x ∈ IR^n.

(2) x ∘ y = y ∘ x for all x, y ∈ IR^n.

(3) x ∘ (x^2 ∘ y) = x^2 ∘ (x ∘ y) for all x, y ∈ IR^n.

(4) (x + y) ∘ z = x ∘ z + y ∘ z for all x, y, z ∈ IR^n.

For each x = (x_1, x_2) ∈ IR × IR^{n−1}, the determinant and the trace of x are defined by

  det(x) = x_1^2 − ‖x_2‖^2,  tr(x) = 2x_1.

In view of the definition (1.3) of the spectral values, it is clear that the determinant, the trace, and the Euclidean norm of x can all be represented in terms of λ_1(x) and λ_2(x):

  det(x) = λ_1(x)λ_2(x),  tr(x) = λ_1(x) + λ_2(x),  ‖x‖^2 = (1/2)(λ_1(x)^2 + λ_2(x)^2).   (1.6)

Below, we elaborate more on the determinant and the trace by showing some properties.

Proposition 1.1. For any x ⪰_{K^n} 0 and y ⪰_{K^n} 0, the following results hold.

(a) If x ⪰_{K^n} y, then det(x) ≥ det(y) and tr(x) ≥ tr(y).

(b) If x ⪰_{K^n} y, then λ_i(x) ≥ λ_i(y) for i = 1, 2.

Proof. (a) From the definition, we know that

  det(x) = x_1^2 − ‖x_2‖^2,  tr(x) = 2x_1,  det(y) = y_1^2 − ‖y_2‖^2,  tr(y) = 2y_1.

Since x − y = (x_1 − y_1, x_2 − y_2) ⪰_{K^n} 0, we have ‖x_2 − y_2‖ ≤ x_1 − y_1. Thus, x_1 ≥ y_1, and hence tr(x) ≥ tr(y). Besides, using the assumption on x and y gives

  x_1 − y_1 ≥ ‖x_2 − y_2‖ ≥ | ‖x_2‖ − ‖y_2‖ |,   (1.7)

which is equivalent to x_1 − ‖x_2‖ ≥ y_1 − ‖y_2‖ ≥ 0 and x_1 + ‖x_2‖ ≥ y_1 + ‖y_2‖ ≥ 0. Hence,

  det(x) = x_1^2 − ‖x_2‖^2 = (x_1 + ‖x_2‖)(x_1 − ‖x_2‖) ≥ (y_1 + ‖y_2‖)(y_1 − ‖y_2‖) = det(y).


(b) From the definition of the spectral values, we know that

  λ_1(x) = x_1 − ‖x_2‖,  λ_2(x) = x_1 + ‖x_2‖  and  λ_1(y) = y_1 − ‖y_2‖,  λ_2(y) = y_1 + ‖y_2‖.

Then, by the inequality (1.7) in the proof of part (a), the results follow immediately.  □

We point out that there may be simpler ways to prove Proposition 1.1; the approach here is straightforward and intuitive, checking the definitions directly. The converse of Proposition 1.1 does not hold: a counterexample is given by x = (5, 3) ∈ K^2 and y = (3, −1) ∈ K^2. In fact, if (x_1, x_2) ∈ IR × IR^{n−1} serves as a counterexample for K^n, then (x_1, x_2, 0, . . . , 0) ∈ IR × IR^{m−1} is automatically a counterexample for K^m whenever m ≥ n. Moreover, for any x ⪰_{K^n} y, we always have λ_i(x) ≥ λ_i(y) and tr(x) ≥ tr(y) for i = 1, 2; there is no need to restrict x ⪰_{K^n} 0 and y ⪰_{K^n} 0 as in Proposition 1.1.

Proposition 1.2. Let x ⪰_{K^n} 0, y ⪰_{K^n} 0 and e = (1, 0, . . . , 0). Then, the following hold.

(a) det(x + y) ≥ det(x) + det(y).

(b) det(x ∘ y) ≤ det(x) det(y).

(c) det(αx + (1 − α)y) ≥ α^2 det(x) + (1 − α)^2 det(y) for all 0 < α < 1.

(d) [det(e + x)]^{1/2} ≥ 1 + det(x)^{1/2}.

(e) det(e + x + y) ≤ det(e + x) det(e + y).

Proof. (a) For any x ⪰_{K^n} 0 and y ⪰_{K^n} 0, we know ‖x_2‖ ≤ x_1 and ‖y_2‖ ≤ y_1, which implies |⟨x_2, y_2⟩| ≤ ‖x_2‖ ‖y_2‖ ≤ x_1 y_1. Hence, we obtain

  det(x + y) = (x_1 + y_1)^2 − ‖x_2 + y_2‖^2
            = (x_1^2 − ‖x_2‖^2) + (y_1^2 − ‖y_2‖^2) + 2(x_1 y_1 − ⟨x_2, y_2⟩)
            ≥ (x_1^2 − ‖x_2‖^2) + (y_1^2 − ‖y_2‖^2)
            = det(x) + det(y).

(b) Applying the Cauchy inequality gives

  det(x ∘ y) = ⟨x, y⟩^2 − ‖x_1 y_2 + y_1 x_2‖^2
            = (x_1 y_1 + ⟨x_2, y_2⟩)^2 − (x_1^2 ‖y_2‖^2 + 2 x_1 y_1 ⟨x_2, y_2⟩ + y_1^2 ‖x_2‖^2)
            = x_1^2 y_1^2 + ⟨x_2, y_2⟩^2 − x_1^2 ‖y_2‖^2 − y_1^2 ‖x_2‖^2
            ≤ x_1^2 y_1^2 + ‖x_2‖^2 ‖y_2‖^2 − x_1^2 ‖y_2‖^2 − y_1^2 ‖x_2‖^2
            = (x_1^2 − ‖x_2‖^2)(y_1^2 − ‖y_2‖^2)
            = det(x) det(y).


(c) For any x ⪰_{K^n} 0 and y ⪰_{K^n} 0, it is clear that αx ⪰_{K^n} 0 and (1 − α)y ⪰_{K^n} 0 for every 0 < α < 1. In addition, we observe that det(αx) = α^2 det(x). Hence,

  det(αx + (1 − α)y) ≥ det(αx) + det((1 − α)y) = α^2 det(x) + (1 − α)^2 det(y),

where the inequality comes from part (a).

(d) For any x ⪰_{K^n} 0, we know det(x) = λ_1(x)λ_2(x) ≥ 0, where λ_i(x) are the spectral values of x. Hence,

  det(e + x) = (1 + λ_1(x))(1 + λ_2(x)) ≥ (1 + √(λ_1(x)λ_2(x)))^2 = (1 + det(x)^{1/2})^2.

Then, taking the square root on both sides yields the desired result.

(e) Again, for any x ⪰_{K^n} 0 and y ⪰_{K^n} 0, we have the inequalities

  x_1 − ‖x_2‖ ≥ 0,  y_1 − ‖y_2‖ ≥ 0,  |⟨x_2, y_2⟩| ≤ ‖x_2‖ ‖y_2‖ ≤ x_1 y_1.   (1.8)

Moreover, det(e + x + y) = (1 + x_1 + y_1)^2 − ‖x_2 + y_2‖^2, det(e + x) = (1 + x_1)^2 − ‖x_2‖^2, and det(e + y) = (1 + y_1)^2 − ‖y_2‖^2. Hence,

  det(e + x) det(e + y) − det(e + x + y)
  = [(1 + x_1)^2 − ‖x_2‖^2] [(1 + y_1)^2 − ‖y_2‖^2] − [(1 + x_1 + y_1)^2 − ‖x_2 + y_2‖^2]
  = 2(x_1 y_1 + ⟨x_2, y_2⟩) + 2x_1 (y_1^2 − ‖y_2‖^2) + 2y_1 (x_1^2 − ‖x_2‖^2) + (x_1^2 − ‖x_2‖^2)(y_1^2 − ‖y_2‖^2)
  ≥ 0,

where we multiply out all the expansions to obtain the second equality, and the last inequality holds by (1.8).  □

Proposition 1.2(c) can be extended to a more general case:

  det(αx + βy) ≥ α^2 det(x) + β^2 det(y)  for all α ≥ 0, β ≥ 0.

Note that, together with the Cauchy-Schwarz inequality and properties of the determinant, one may find other ways to verify Proposition 1.2; the approach here is only one choice of proof, which is straightforward and intuitive. There are more inequalities about the determinant, see Proposition 1.8 and Proposition 2.32, which are established by using the concept of SOC-convexity introduced in Chapter 2. Next, we move to inequalities about the trace.
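The determinant inequalities of Proposition 1.2 can also be stress-tested on random points of K^n. The sketch below is ours, not the book's (NumPy assumed; `random_soc_point` is a hypothetical helper that forces the first component to dominate).

```python
import numpy as np

def det_soc(x):
    # det(x) = x1^2 - ||x2||^2
    return x[0] ** 2 - np.linalg.norm(x[1:]) ** 2

def random_soc_point(n, rng):
    # Sample a point of K^n by setting x1 >= ||x2||.
    x2 = rng.standard_normal(n - 1)
    return np.concatenate(([np.linalg.norm(x2) + rng.random()], x2))

rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    x, y = random_soc_point(4, rng), random_soc_point(4, rng)
    ok &= det_soc(x + y) >= det_soc(x) + det_soc(y) - 1e-9          # Prop 1.2(a)
    det_xy = (x @ y) ** 2 - np.linalg.norm(x[0] * y[1:] + y[0] * x[1:]) ** 2
    ok &= det_xy <= det_soc(x) * det_soc(y) + 1e-9                  # Prop 1.2(b)
print(ok)  # -> True
```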

Proposition 1.3. For any x, y ∈ IR^n, we have

(a) tr(x + y) = tr(x) + tr(y) and tr(αx) = α tr(x) for any α ∈ IR. In other words, tr(·) is a linear function on IR^n.

(b) λ_1(x)λ_2(y) + λ_1(y)λ_2(x) ≤ tr(x ∘ y) ≤ λ_1(x)λ_1(y) + λ_2(x)λ_2(y).

Proof. Part (a) is trivial, so it remains to verify part (b). Using the fact that tr(x ∘ y) = 2⟨x, y⟩, we obtain

  λ_1(x)λ_2(y) + λ_1(y)λ_2(x) = (x_1 − ‖x_2‖)(y_1 + ‖y_2‖) + (x_1 + ‖x_2‖)(y_1 − ‖y_2‖)
                              = 2(x_1 y_1 − ‖x_2‖ ‖y_2‖)
                              ≤ 2(x_1 y_1 + ⟨x_2, y_2⟩)
                              = 2⟨x, y⟩
                              = tr(x ∘ y)
                              ≤ 2(x_1 y_1 + ‖x_2‖ ‖y_2‖)
                              = (x_1 − ‖x_2‖)(y_1 − ‖y_2‖) + (x_1 + ‖x_2‖)(y_1 + ‖y_2‖)
                              = λ_1(x)λ_1(y) + λ_2(x)λ_2(y),

which completes the proof.  □
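Proposition 1.3(b) sandwiches tr(x ∘ y) = 2⟨x, y⟩ between two spectral expressions, and the bounds hold for arbitrary vectors, not just cone points; a random spot-check of ours (NumPy assumed):

```python
import numpy as np

def spectral_values(x):
    # lam_1(x) = x1 - ||x2||, lam_2(x) = x1 + ||x2||
    n2 = np.linalg.norm(x[1:])
    return x[0] - n2, x[0] + n2

rng = np.random.default_rng(1)
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    l1x, l2x = spectral_values(x)
    l1y, l2y = spectral_values(y)
    t = 2.0 * (x @ y)  # tr(x o y) = 2 <x, y>
    ok &= l1x * l2y + l1y * l2x - 1e-9 <= t <= l1x * l1y + l2x * l2y + 1e-9
print(ok)  # -> True
```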

In general, det(x ∘ y) ≠ det(x) det(y) unless x_2 = αy_2. A vector x = (x_1, x_2) ∈ IR × IR^{n−1} is said to be invertible if det(x) ≠ 0. If x is invertible, then there exists a unique y = (y_1, y_2) ∈ IR × IR^{n−1} satisfying x ∘ y = y ∘ x = e. We call this y the inverse of x and denote it by x^{−1}. In fact, we have

  x^{−1} = (1 / (x_1^2 − ‖x_2‖^2)) (x_1, −x_2) = (1 / det(x)) (tr(x) e − x).

Therefore, x ∈ int(K^n) if and only if x^{−1} ∈ int(K^n). Moreover, if x ∈ int(K^n), then x^{−k} = (x^k)^{−1} = (x^{−1})^k is also well defined. For any x ∈ K^n, it is known that there exists a unique vector in K^n, denoted by x^{1/2} (also sometimes by √x), such that (x^{1/2})^2 = x^{1/2} ∘ x^{1/2} = x. Indeed,

  x^{1/2} = ( s, x_2/(2s) ),  where  s = √( (1/2) ( x_1 + √(x_1^2 − ‖x_2‖^2) ) ).

In the above formula, the term x_2/(2s) is defined to be the zero vector if s = 0 (and hence x_2 = 0), i.e., x = 0.

For any x ∈ IR^n, we always have x^2 ∈ K^n (i.e., x^2 ⪰_{K^n} 0). Hence, there exists a unique vector (x^2)^{1/2} ∈ K^n, denoted by |x|. It is easy to verify that |x| ⪰_{K^n} 0 and x^2 = |x|^2 for any x ∈ IR^n. It is also known that |x| ⪰_{K^n} x. For any x ∈ IR^n, we define [x]_+ to be the projection of x onto K^n, in the same way as for IR^n_+. In other words, [x]_+ is the optimal solution of the parametric SOCP:

  [x]_+ = argmin { ‖x − y‖ | y ∈ K^n }.

Here the norm is the Euclidean norm, since the Jordan product does not induce a norm. Likewise, [x]_− means the projection of x onto −K^n, which implies [x]_− = −[−x]_+. It is well known that [x]_+ = (1/2)(x + |x|) and [x]_− = (1/2)(x − |x|); see Property 1.2(f).

The spectral decomposition, along with the Jordan algebra associated with SOC, entails some basic properties, as below. We omit the proofs since they can be found in [61, 63].

Property 1.1. For any x = (x_1, x_2) ∈ IR × IR^{n−1} with spectral values λ_1(x), λ_2(x) and spectral vectors u_x^{(1)}, u_x^{(2)} given as in (1.3)-(1.4), we have

(a) u_x^{(1)} and u_x^{(2)} are orthogonal under the Jordan product and have length 1/√2, i.e.,

  u_x^{(1)} ∘ u_x^{(2)} = 0,  ‖u_x^{(1)}‖ = ‖u_x^{(2)}‖ = 1/√2.

(b) u_x^{(1)} and u_x^{(2)} are idempotent under the Jordan product, i.e., u_x^{(i)} ∘ u_x^{(i)} = u_x^{(i)} for i = 1, 2.

(c) λ_1(x), λ_2(x) are nonnegative (positive) if and only if x ∈ K^n (x ∈ int(K^n)), i.e.,

  λ_i(x) ≥ 0 for i = 1, 2 ⇐⇒ x ⪰_{K^n} 0;
  λ_i(x) > 0 for i = 1, 2 ⇐⇒ x ≻_{K^n} 0.

Although the converse of Proposition 1.1(b) does not hold, as mentioned earlier, Property 1.1(c) is useful for verifying whether a point x belongs to K^n or not.

Property 1.2. For any x = (x_1, x_2) ∈ IR × IR^{n−1} with spectral values λ_1(x), λ_2(x) and spectral vectors u_x^{(1)}, u_x^{(2)} given as in (1.3)-(1.4), we have

(a) x^2 = λ_1(x)^2 u_x^{(1)} + λ_2(x)^2 u_x^{(2)} and x^{−1} = λ_1(x)^{−1} u_x^{(1)} + λ_2(x)^{−1} u_x^{(2)}.

(b) If x ∈ K^n, then x^{1/2} = √(λ_1(x)) u_x^{(1)} + √(λ_2(x)) u_x^{(2)}.

(c) |x| = |λ_1(x)| u_x^{(1)} + |λ_2(x)| u_x^{(2)}.

(d) [x]_+ = [λ_1(x)]_+ u_x^{(1)} + [λ_2(x)]_+ u_x^{(2)} and [x]_− = [λ_1(x)]_− u_x^{(1)} + [λ_2(x)]_− u_x^{(2)}.

(e) |x| = [x]_+ + [−x]_+ = [x]_+ − [x]_−.

(f) [x]_+ = (1/2)(x + |x|) and [x]_− = (1/2)(x − |x|).
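Property 1.2 says each of these operations reduces to applying a scalar function to the spectral values. The sketch below (ours, not the book's; NumPy assumed, and `soc_apply` is our own helper name) checks (x^{1/2})^2 = x and Property 1.2(f).

```python
import numpy as np

def soc_apply(f, x):
    # f applied spectrally: f(lam1) u^(1) + f(lam2) u^(2).
    x1, x2 = x[0], x[1:]
    n2 = np.linalg.norm(x2)
    w = x2 / n2 if n2 > 0 else np.eye(len(x2))[0]
    out = np.zeros_like(x)
    for lam, s in ((x1 - n2, -1.0), (x1 + n2, 1.0)):
        out = out + f(lam) * 0.5 * np.concatenate(([1.0], s * w))
    return out

def jordan(x, y):
    # Jordan product (1.5).
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

x = np.array([2.0, 1.0, -1.0])                # x in int(K^3)
r = soc_apply(np.sqrt, x)                     # x^{1/2}, Property 1.2(b)
print(np.allclose(jordan(r, r), x))           # -> True

y = np.array([1.0, 2.0, 0.0])                 # y not in K^3 (lam1 = -1 < 0)
absy = soc_apply(abs, y)                      # |y|, Property 1.2(c)
proj = soc_apply(lambda t: max(t, 0.0), y)    # [y]_+, Property 1.2(d)
print(np.allclose(proj, 0.5 * (y + absy)))    # -> True, Property 1.2(f)
```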

Property 1.3. Let x = (x_1, x_2) ∈ IR × IR^{n−1} and y = (y_1, y_2) ∈ IR × IR^{n−1}. Then, the following hold.

(a) Any x ∈ IR^n satisfies |x| ⪰_{K^n} x.

(b) For any x, y ⪰_{K^n} 0, if x ⪰_{K^n} y, then x^{1/2} ⪰_{K^n} y^{1/2}.

(c) For any x, y ∈ IR^n, if x^2 ⪰_{K^n} y^2, then |x| ⪰_{K^n} |y|.

(d) For any x ∈ IR^n, x ⪰_{K^n} 0 if and only if ⟨x, y⟩ ≥ 0 for all y ⪰_{K^n} 0.

(e) For any x ⪰_{K^n} 0 and y ∈ IR^n, if x^2 ⪰_{K^n} y^2, then x ⪰_{K^n} y.

Note that for any x, y ≻_{K^n} 0, if x ⪰_{K^n} y, one can also conclude that y^{−1} ⪰_{K^n} x^{−1}. However, this is not trivial to verify directly; we prove it by another approach, see Proposition 2.3(a).

Property 1.4. For any x = (x_1, x_2) ∈ IR × IR^{n−1} with spectral values λ_1(x), λ_2(x) and any y = (y_1, y_2) ∈ IR × IR^{n−1} with spectral values λ_1(y), λ_2(y), we have

  |λ_i(x) − λ_i(y)| ≤ √2 ‖x − y‖,  i = 1, 2.

Proof. First, we compute that

  |λ_1(x) − λ_1(y)| = |x_1 − ‖x_2‖ − y_1 + ‖y_2‖|
                    ≤ |x_1 − y_1| + | ‖x_2‖ − ‖y_2‖ |
                    ≤ |x_1 − y_1| + ‖x_2 − y_2‖
                    ≤ √2 ( |x_1 − y_1|^2 + ‖x_2 − y_2‖^2 )^{1/2}
                    = √2 ‖x − y‖,

where the second inequality uses ‖x_2‖ ≤ ‖x_2 − y_2‖ + ‖y_2‖ and ‖y_2‖ ≤ ‖x_2 − y_2‖ + ‖x_2‖, and the last inequality uses the relation between the 1-norm and the 2-norm. A similar argument applies to |λ_2(x) − λ_2(y)|.  □
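Property 1.4 says the spectral values are Lipschitz continuous with modulus √2. A random spot-check of ours (NumPy assumed):

```python
import numpy as np

def spectral_values(x):
    # Returns (lam_1(x), lam_2(x)) as an array.
    n2 = np.linalg.norm(x[1:])
    return np.array([x[0] - n2, x[0] + n2])

rng = np.random.default_rng(2)
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(6), rng.standard_normal(6)
    gap = np.abs(spectral_values(x) - spectral_values(y))
    ok &= np.all(gap <= np.sqrt(2) * np.linalg.norm(x - y) + 1e-12)
print(ok)  # -> True
```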

In fact, Properties 1.1-1.3 are parallel to results associated with the positive semidefinite cone S^n_+. Even though both K^n and S^n_+ belong to the family of symmetric cones and share similar properties, as we will see, the ideas and techniques for proving these results are quite different. One reason is that the Jordan product is not associative, as mentioned earlier.

### 1.2 SOC function and SOC trace function

In this section, we introduce two types of functions, the SOC function and the SOC trace function, which are very useful in dealing with optimization problems involving SOC. Some inequalities are established in light of these functions.


Let x = (x_1, x_2) ∈ IR × IR^{n−1} with spectral values λ_1(x), λ_2(x) given as in (1.3) and spectral vectors u_x^{(1)}, u_x^{(2)} given as in (1.4). We first define the corresponding SOC function as below. For any real-valued function f : IR → IR, the following vector-valued function associated with K^n (n ≥ 1) was considered in [45, 63]:

  f^soc(x) := f(λ_1(x)) u_x^{(1)} + f(λ_2(x)) u_x^{(2)},  ∀x = (x_1, x_2) ∈ IR × IR^{n−1}.   (1.9)

The definition (1.9) is unambiguous whether x_2 ≠ 0 or x_2 = 0. The cases f^soc(x) = x^{1/2}, x^2, exp(x), which correspond to f(t) = t^{1/2}, t^2, e^t, are already discussed in the literature.

Indeed, the definition (1.9) is analogous to the one associated with the semidefinite cone S^n_+, see [140, 145]. For subsequent analysis, we also need the concept of the SOC trace function, defined by

  f^tr(x) := f(λ_1(x)) + f(λ_2(x)) = tr(f^soc(x)).   (1.10)

If f is defined only on a subset J of IR, then f^soc and f^tr are defined on the corresponding subset of IR^n. More specifically, from Proposition 1.4 shown below, we see that the corresponding subset for f^soc and f^tr is

  S = { x ∈ IR^n | λ_i(x) ∈ J, i = 1, 2 },   (1.11)

provided f is defined on J ⊆ IR. In addition, S is open in IR^n whenever J is open in IR. To see this assertion, we need the following technical lemma.
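Definitions (1.9) and (1.10) translate directly into code. The sketch below (ours, not the book's; NumPy assumed) checks the identity f^tr(x) = tr(f^soc(x)) = 2(f^soc(x))_1 for f = exp.

```python
import numpy as np

def f_soc(f, x):
    # f^soc(x) = f(lam1) u^(1) + f(lam2) u^(2), definition (1.9).
    x1, x2 = x[0], x[1:]
    n2 = np.linalg.norm(x2)
    w = x2 / n2 if n2 > 0 else np.eye(len(x2))[0]
    return sum(f(x1 + s * n2) * 0.5 * np.concatenate(([1.0], s * w))
               for s in (-1.0, 1.0))

def f_tr(f, x):
    # f^tr(x) = f(lam1) + f(lam2) = tr(f^soc(x)), definition (1.10).
    n2 = np.linalg.norm(x[1:])
    return f(x[0] - n2) + f(x[0] + n2)

x = np.array([0.5, 0.3, -0.4])
# tr(z) = 2 z_1, so f^tr must equal twice the first component of f^soc.
print(np.allclose(f_tr(np.exp, x), 2.0 * f_soc(np.exp, x)[0]))  # -> True
```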

Lemma 1.1. Let A ∈ IR^{m×m} be a symmetric positive definite matrix, C ∈ IR^{n×n} be a symmetric matrix, and B ∈ IR^{m×n}. Then,

  [ A    B ]
  [ B^T  C ]  ⪰ O  ⇐⇒  C − B^T A^{−1} B ⪰ O   (1.12)

and

  [ A    B ]
  [ B^T  C ]  ≻ O  ⇐⇒  C − B^T A^{−1} B ≻ O.   (1.13)

Proof. This is indeed the Schur Complement Theorem; see [21, 22, 74] for a proof.  □
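Lemma 1.1 can be spot-checked numerically: with A positive definite, the block matrix is positive semidefinite exactly when the Schur complement is. A sketch of ours (NumPy assumed):

```python
import numpy as np

def is_psd(M, tol=1e-10):
    # Positive semidefinite iff the smallest eigenvalue is (numerically) nonnegative.
    return np.min(np.linalg.eigvalsh(M)) >= -tol

rng = np.random.default_rng(3)
ok = True
for _ in range(200):
    m, n = 3, 2
    L = rng.standard_normal((m, m))
    A = L @ L.T + np.eye(m)             # symmetric positive definite
    B = rng.standard_normal((m, n))
    C0 = rng.standard_normal((n, n))
    C = (C0 + C0.T) / 2                 # symmetric
    block = np.block([[A, B], [B.T, C]])
    schur = C - B.T @ np.linalg.solve(A, B)
    ok &= is_psd(block) == is_psd(schur)  # equivalence (1.12)
print(ok)  # -> True
```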

Proposition 1.4. For any given f : J ⊆ IR → IR, let f^soc : S → IR^n and f^tr : S → IR be given by (1.9) and (1.10), respectively. Assume that J is open. Then, the following results hold.

(a) The domain S of f^soc and f^tr is also open.


(b) If f is (continuously) differentiable on J, then f^soc is (continuously) differentiable on S. Moreover, for any x ∈ S, ∇f^soc(x) = f'(x_1) I if x_2 = 0, and otherwise

  ∇f^soc(x) = [ b(x)              c(x) x_2^T/‖x_2‖                             ]
              [ c(x) x_2/‖x_2‖   a(x) I + (b(x) − a(x)) x_2 x_2^T/‖x_2‖^2 ],   (1.14)

where

  a(x) = (f(λ_2(x)) − f(λ_1(x))) / (λ_2(x) − λ_1(x)),
  b(x) = (f'(λ_2(x)) + f'(λ_1(x))) / 2,
  c(x) = (f'(λ_2(x)) − f'(λ_1(x))) / 2.

(c) If f is (continuously) differentiable, then f^tr is (continuously) differentiable on S with ∇f^tr(x) = 2(f')^soc(x); if f is twice (continuously) differentiable, then f^tr is twice (continuously) differentiable on S with ∇^2 f^tr(x) = 2∇(f')^soc(x).

Proof. (a) Fix any x ∈ S. Then λ_1(x), λ_2(x) ∈ J. Since J is an open subset of IR, there exist δ_1, δ_2 > 0 such that {t ∈ IR | |t − λ_1(x)| < δ_1} ⊆ J and {t ∈ IR | |t − λ_2(x)| < δ_2} ⊆ J. Let δ := min{δ_1, δ_2}/√2. Then, for any y satisfying ‖y − x‖ < δ, we have |λ_1(y) − λ_1(x)| < δ_1 and |λ_2(y) − λ_2(x)| < δ_2 by noting that

  (λ_1(x) − λ_1(y))^2 + (λ_2(x) − λ_2(y))^2
  = 2(x_1^2 + ‖x_2‖^2) + 2(y_1^2 + ‖y_2‖^2) − 4(x_1 y_1 + ‖x_2‖ ‖y_2‖)
  ≤ 2(x_1^2 + ‖x_2‖^2) + 2(y_1^2 + ‖y_2‖^2) − 4(x_1 y_1 + ⟨x_2, y_2⟩)
  = 2( ‖x‖^2 + ‖y‖^2 − 2⟨x, y⟩ )
  = 2 ‖x − y‖^2,

and consequently λ_1(y) ∈ J and λ_2(y) ∈ J. Since f is a function from J to IR, this means that {y ∈ IR^n | ‖y − x‖ < δ} ⊆ S, and therefore the set S is open. In addition, from the above, we see that S is characterized as in (1.11).

(b) The arguments are similar to those of Proposition 1.13 and Proposition 1.14 in Section 1.3.

(c) If f is (continuously) differentiable, then from part (b) and f^tr(x) = 2⟨e, f^soc(x)⟩ it follows that f^tr is (continuously) differentiable. In addition, a simple computation yields that ∇f^tr(x) = 2∇f^soc(x) e = 2(f')^soc(x). Similarly, by part (b), the second part follows.  □
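Formula (1.14) can be validated against finite differences of f^soc. The sketch below (ours, not the book's; NumPy assumed; it only covers the case x_2 ≠ 0) does so for f = exp.

```python
import numpy as np

def f_soc(f, x):
    # f^soc(x), definition (1.9).
    x1, x2 = x[0], x[1:]
    n2 = np.linalg.norm(x2)
    w = x2 / n2 if n2 > 0 else np.eye(len(x2))[0]
    return sum(f(x1 + s * n2) * 0.5 * np.concatenate(([1.0], s * w))
               for s in (-1.0, 1.0))

def grad_f_soc(f, df, x):
    # Jacobian of f^soc at x with x2 != 0, formula (1.14).
    x1, x2 = x[0], x[1:]
    n2 = np.linalg.norm(x2)
    l1, l2 = x1 - n2, x1 + n2
    a = (f(l2) - f(l1)) / (l2 - l1)
    b = (df(l2) + df(l1)) / 2.0
    c = (df(l2) - df(l1)) / 2.0
    u = x2 / n2
    n = len(x)
    M = np.zeros((n, n))
    M[0, 0] = b
    M[0, 1:] = c * u
    M[1:, 0] = c * u
    M[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(u, u)
    return M

x = np.array([0.7, 0.2, -0.5])
G = grad_f_soc(np.exp, np.exp, x)
# Central finite differences of f^soc, one coordinate direction per column.
h = 1e-6
num = np.column_stack([(f_soc(np.exp, x + h * e) - f_soc(np.exp, x - h * e)) / (2 * h)
                       for e in np.eye(3)])
print(np.allclose(G, num, atol=1e-5))  # -> True
```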


Proposition 1.5. For any f : J → IR, let f^soc : S → IR^n and f^tr : S → IR be given by (1.9) and (1.10), respectively. Assume that J is open. If f is twice differentiable on J, then

(a) f''(t) ≥ 0 for any t ∈ J ⇐⇒ ∇(f')^soc(x) ⪰ O for any x ∈ S ⇐⇒ f^tr is convex on S.

(b) f''(t) > 0 for any t ∈ J ⇐⇒ ∇(f')^soc(x) ≻ O for any x ∈ S =⇒ f^tr is strictly convex on S.

Proof. (a) By Proposition 1.4(c), ∇^2 f^tr(x) = 2∇(f')^soc(x) for any x ∈ S, and the second equivalence follows by [20, Prop. B.4(a) and (c)]. We next come to the first equivalence. By Proposition 1.4(b), for any fixed x ∈ S, ∇(f')^soc(x) = f''(x_1) I if x_2 = 0; otherwise, ∇(f')^soc(x) has the same expression as in (1.14) except that

  a(x) = (f'(λ_2(x)) − f'(λ_1(x))) / (λ_2(x) − λ_1(x)),
  b(x) = (f''(λ_2(x)) + f''(λ_1(x))) / 2,
  c(x) = (f''(λ_2(x)) − f''(λ_1(x))) / 2.

Suppose that ∇(f')^soc(x) ⪰ O for any x ∈ S. Then we readily have b(x) ≥ 0 for any x ∈ S. Noting that b(x) = f''(x_1) when x_2 = 0, we particularly have f''(x_1) ≥ 0 for all x_1 ∈ J, and consequently f''(t) ≥ 0 for all t ∈ J. Conversely, suppose that f''(t) ≥ 0 for all t ∈ J, and fix any x ∈ S. Clearly, b(x) ≥ 0 and a(x) ≥ 0. If b(x) = 0, then f''(λ_1(x)) = f''(λ_2(x)) = 0, and consequently c(x) = 0, which in turn implies that

  ∇(f')^soc(x) = [ 0   0                              ]
                 [ 0   a(x) (I − x_2 x_2^T/‖x_2‖^2) ]  ⪰ O.   (1.15)

If b(x) > 0, then by the first equivalence of Lemma 1.1 and the expression of ∇(f')^soc(x), it suffices to argue that the matrix

  a(x) I + (b(x) − a(x)) x_2 x_2^T/‖x_2‖^2 − (c(x)^2/b(x)) x_2 x_2^T/‖x_2‖^2   (1.16)

is positive semidefinite. Since the rank-one matrix x_2 x_2^T has only one nonzero eigenvalue ‖x_2‖^2, the matrix in (1.16) has the eigenvalue a(x) of multiplicity n − 2 and the eigenvalue (b(x)^2 − c(x)^2)/b(x) of multiplicity 1. Since a(x) ≥ 0 and b(x)^2 − c(x)^2 = f''(λ_1(x)) f''(λ_2(x)) ≥ 0, the matrix in (1.16) is positive semidefinite. By the arbitrariness of x, we have that ∇(f')^soc(x) ⪰ O for all x ∈ S.

(b) The first equivalence follows directly by using (1.13) of Lemma 1.1, noting that ∇(f')^soc(x) ≻ O implies a(x) > 0 when x_2 ≠ 0, and following the same arguments as in part (a). The second part is due to [20, Prop. B.4(b)].  □


Remark 1.1. Note that the strict convexity of f^tr does not necessarily imply the positive definiteness of ∇^2 f^tr(x). Consider f(t) = t^4 for t ∈ IR. We next show that f^tr is strictly convex. Indeed, f^tr is convex on IR^n by Proposition 1.5(a) since f''(t) = 12t^2 ≥ 0. Taking into account that f^tr is continuous, it remains to prove that

  f^tr((x + y)/2) = (f^tr(x) + f^tr(y))/2  =⇒  x = y.   (1.17)

Since h(t) = (t_0 + t)^4 + (t_0 − t)^4 (for fixed t_0 ∈ IR) is increasing on [0, +∞), and the function f(t) = t^4 is strictly convex on IR, we have that

  f^tr((x + y)/2)
  = λ_1((x + y)/2)^4 + λ_2((x + y)/2)^4
  = ((x_1 + y_1 − ‖x_2 + y_2‖)/2)^4 + ((x_1 + y_1 + ‖x_2 + y_2‖)/2)^4
  ≤ ((x_1 + y_1 − ‖x_2‖ − ‖y_2‖)/2)^4 + ((x_1 + y_1 + ‖x_2‖ + ‖y_2‖)/2)^4
  = ((λ_1(x) + λ_1(y))/2)^4 + ((λ_2(x) + λ_2(y))/2)^4
  ≤ (λ_1(x)^4 + λ_1(y)^4 + λ_2(x)^4 + λ_2(y)^4)/2
  = (f^tr(x) + f^tr(y))/2,

and moreover, the above inequalities become equalities if and only if

  ‖x_2 + y_2‖ = ‖x_2‖ + ‖y_2‖,  λ_1(x) = λ_1(y),  λ_2(x) = λ_2(y).

It is easy to verify that the three equalities hold if and only if x = y. Thus, the implication in (1.17) holds, i.e., f^tr is strictly convex. However, by Proposition 1.5(b), ∇(f')^soc(x) ≻ O does not hold for all x ∈ IR^n, since f''(t) > 0 does not hold for all t ∈ IR.

We point out that the fact that the strict convexity of f implies the strict convexity of f^tr was proved in [7, 15] via the definition of a convex function; here we instead use the Schur Complement Theorem and the relation between ∇(f')^soc and ∇^2 f^tr to establish the convexity of SOC trace functions. Next, we illustrate the application of Proposition 1.5 with some SOC trace functions.

Proposition 1.6. The following functions associated with K^n are all strictly convex.

(a) F_1(x) = − ln(det(x)) for x ∈ int(K^n).

(b) F_2(x) = tr(x^{−1}) for x ∈ int(K^n).

(c) F_3(x) = tr(φ(x)) for x ∈ int(K^n), where

  φ(x) = (x^{p+1} − e)/(p + 1) + (x^{1−q} − e)/(q − 1)  if p ∈ [0, 1], q > 1;
  φ(x) = (x^{p+1} − e)/(p + 1) − ln x                   if p ∈ [0, 1], q = 1.

(d) F_4(x) = − ln(det(e − x)) for x ≺_{K^n} e.

(e) F_5(x) = tr((e − x)^{−1} ∘ x) for x ≺_{K^n} e.

(f) F_6(x) = tr(exp(x)) for x ∈ IR^n.

(g) F_7(x) = ln(det(e + exp(x))) for x ∈ IR^n.

(h) F_8(x) = tr( (x + (x^2 + 4e)^{1/2}) / 2 ) for x ∈ IR^n.

Proof. Note that F_1(x), F_2(x), and F_3(x) are the SOC trace functions associated with f_1(t) = − ln t (t > 0), f_2(t) = t^{−1} (t > 0), and f_3(t) (t > 0), respectively, where

  f_3(t) = (t^{p+1} − 1)/(p + 1) + (t^{1−q} − 1)/(q − 1)  if p ∈ [0, 1], q > 1;
  f_3(t) = (t^{p+1} − 1)/(p + 1) − ln t                   if p ∈ [0, 1], q = 1.

Next, F_4(x) is the SOC trace function associated with f_4(t) = − ln(1 − t) (t < 1), and F_5(x) is the SOC trace function associated with f_5(t) = t/(1 − t) (t < 1), by noting that

  (e − x)^{−1} ∘ x = (λ_1(x)/(1 − λ_1(x))) u_x^{(1)} + (λ_2(x)/(1 − λ_2(x))) u_x^{(2)}.

In addition, F_6(x) and F_7(x) are the SOC trace functions associated with f_6(t) = exp(t) (t ∈ IR) and f_7(t) = ln(1 + exp(t)) (t ∈ IR), respectively, and F_8(x) is the SOC trace function associated with f_8(t) = (1/2)(t + √(t^2 + 4)) (t ∈ IR). It is easy to verify that all the functions f_1-f_8 have positive second derivatives on their respective domains, and therefore F_1-F_8 are strictly convex functions by Proposition 1.5(b).  □

The functions F_1, F_2, and F_3 are popular barrier functions which play a key role in the development of interior-point methods for SOCPs, see, e.g., [14, 19, 109, 123, 146], where F_3 covers a wide range of barrier functions, including the classical logarithmic barrier function, the self-regular functions, and the non-self-regular functions. The functions F_4 and F_5 are popular shifted barrier functions [6, 7, 9] for SOCPs, and F_6-F_8 can be used as penalty functions for SOCPs; these functions are added to the objective of an SOCP to force the solution to be feasible.
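As a small illustration of Proposition 1.6(a), the log barrier F_1(x) = −ln(det(x)) satisfies the midpoint convexity inequality on random interior points. The sketch below is ours, not the book's (NumPy assumed; `random_int_soc` is a hypothetical helper producing points of int(K^n)).

```python
import numpy as np

def det_soc(x):
    return x[0] ** 2 - np.linalg.norm(x[1:]) ** 2

def F1(x):
    # F1(x) = -ln(det(x)), the log barrier for int(K^n).
    return -np.log(det_soc(x))

def random_int_soc(n, rng):
    # x1 strictly exceeds ||x2||, so x lies in the interior of K^n.
    x2 = rng.standard_normal(n - 1)
    return np.concatenate(([np.linalg.norm(x2) + 0.1 + rng.random()], x2))

rng = np.random.default_rng(4)
ok = True
for _ in range(1000):
    x, y = random_int_soc(4, rng), random_int_soc(4, rng)
    ok &= F1((x + y) / 2) <= (F1(x) + F1(y)) / 2 + 1e-9
print(ok)  # -> True
```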

Besides its application in establishing convexity of SOC trace functions, the Schur Complement Theorem can be employed to establish convexity of some compound functions of SOC trace functions and scalar-valued functions, which is usually difficult to achieve by checking the definition of convexity directly. The following proposition presents such an application.

Proposition 1.7. For any x ∈ K^n, let F_9(x) := −[det(x)]^{1/p} with p > 1. Then,

(a) F_9 is twice continuously differentiable on int(K^n).

(b) F_9 is convex when p ≥ 2; moreover, it is strictly convex when p > 2.

Proof. (a) Note that −F_9(x) = exp( p^{−1} ln(det(x)) ) for any x ∈ int(K^n), and ln(det(x)) = f^tr(x) with f(t) = ln(t) for t ∈ IR_{++}. By Proposition 1.4(c), ln(det(x)) is twice continuously differentiable on int(K^n). Hence −F_9(x) is twice continuously differentiable on int(K^n), and the result follows.

(b) In view of the continuity of F_9, we only need to prove its convexity over int(K^n). By part (a), we achieve this goal by proving that the Hessian matrix ∇^2 F_9(x) is positive semidefinite for any x ∈ int(K^n) when p ≥ 2, and positive definite when p > 2. Fix any x ∈ int(K^n). Direct computation gives

  ∇F_9(x) = −(1/p) (x_1^2 − ‖x_2‖^2)^{1/p − 1} (2x_1, −2x_2)

and

  ∇^2 F_9(x) = ((p − 1)/p^2) (det(x))^{1/p − 2} [ 4x_1^2 − 2p(x_1^2 − ‖x_2‖^2)/(p − 1)    −4x_1 x_2^T                                     ]
                                                [ −4x_1 x_2    4x_2 x_2^T + (2p(x_1^2 − ‖x_2‖^2)/(p − 1)) I ].

Since x ∈ int(K^n), we have x_1 > 0 and det(x) = x_1^2 − ‖x_2‖^2 > 0, and therefore

  a_1(x) := 4x_1^2 − 2p(x_1^2 − ‖x_2‖^2)/(p − 1) = (4 − 2p/(p − 1)) x_1^2 + (2p/(p − 1)) ‖x_2‖^2.

We proceed by considering the following two cases: a_1(x) = 0 or a_1(x) > 0.

Case 1: a_1(x) = 0. Since p ≥ 2, in this case we must have x_2 = 0, and consequently

  ∇^2 F_9(x) = ((p − 1)/p^2) x_1^{2/p − 4} [ 0   0                      ]
                                           [ 0   (2p/(p − 1)) x_1^2 I ]  ⪰ O.

Case 2: a_1(x) > 0. In this case, we calculate that

  ( 4x_1^2 − 2p(x_1^2 − ‖x_2‖^2)/(p − 1) ) ( 4x_2 x_2^T + (2p(x_1^2 − ‖x_2‖^2)/(p − 1)) I ) − 16x_1^2 x_2 x_2^T
  = (4p(x_1^2 − ‖x_2‖^2)/(p − 1)) [ ((p − 2)/(p − 1)) x_1^2 I + (p/(p − 1)) ‖x_2‖^2 I − 2x_2 x_2^T ].   (1.18)


Since the rank-one matrix 2x_2 x_2^T has only one nonzero eigenvalue 2‖x_2‖^2, the matrix in the bracket on the right-hand side of (1.18) has one eigenvalue of multiplicity 1 given by

  ((p − 2)/(p − 1)) x_1^2 + (p/(p − 1)) ‖x_2‖^2 − 2‖x_2‖^2 = ((p − 2)/(p − 1)) (x_1^2 − ‖x_2‖^2) ≥ 0,

and one eigenvalue of multiplicity n − 2 given by ((p − 2)/(p − 1)) x_1^2 + (p/(p − 1)) ‖x_2‖^2 ≥ 0. Furthermore, these eigenvalues must be positive when p > 2, since x_1^2 > 0 and x_1^2 − ‖x_2‖^2 > 0. This means that the matrix on the right-hand side of (1.18) is positive semidefinite, and moreover, it is positive definite when p > 2. Applying Lemma 1.1, we have ∇^2 F_9(x) ⪰ O, and furthermore ∇^2 F_9(x) ≻ O when p > 2. Since a_1(x) > 0 must hold when p > 2, the arguments above show that F_9 is convex over int(K^n) when p ≥ 2, and strictly convex over int(K^n) when p > 2.  □

It is worthwhile to point out that det(x) is neither convex nor concave on K^n, and it is difficult to argue the convexity of such compound functions involving det(x) directly from the definition of a convex function. Our SOC trace functions offer a simple way to prove their convexity. Moreover, they help in establishing more inequalities associated with SOC. Some of these inequalities have been used to analyze the properties of the SOC function f^soc and the convergence of interior-point methods for SOCPs.

Proposition 1.8. For any x ⪰_{K^n} 0 and y ⪰_{K^n} 0, the following inequalities hold.

(a) det(αx + (1 − α)y) ≥ (det(x))^α (det(y))^{1−α} for any 0 < α < 1.

(b) det(x + y)^{1/p} ≥ 2^{2/p − 1} ( det(x)^{1/p} + det(y)^{1/p} ) for any p ≥ 2.

(c) det(αx + (1 − α)y) ≥ α^2 det(x) + (1 − α)^2 det(y) for any 0 < α < 1.

(d) [det(e + x)]^{1/2} ≥ 1 + det(x)^{1/2}.

(e) det(x)^{1/2} = inf { (1/2) tr(x ∘ y) | det(y) = 1, y ⪰_{K^n} 0 }. Furthermore, when x ≻_{K^n} 0, the same relation holds with inf replaced by min.

(f) tr(x ∘ y) ≥ 2 det(x)^{1/2} det(y)^{1/2}.

Proof. (a) From Proposition 1.6(a), we know that ln(det(x)) is strictly concave on int(K^n). With this, we have

  ln(det(αx + (1 − α)y)) ≥ α ln(det(x)) + (1 − α) ln(det(y)) = ln(det(x)^α) + ln(det(y)^{1−α})

for any 0 < α < 1 and x, y ∈ int(K^n). This, together with the monotonicity of ln t (t > 0) and the continuity of det(x), implies the desired result.


(b) By Proposition 1.7(b), det(x)^{1/p} is concave over K^n. Then, for any x, y ∈ K^n, we have

  det((x + y)/2)^{1/p} ≥ (1/2) ( det(x)^{1/p} + det(y)^{1/p} )
  ⇐⇒ 2 [ ((x_1 + y_1)/2)^2 − ‖(x_2 + y_2)/2‖^2 ]^{1/p} ≥ (x_1^2 − ‖x_2‖^2)^{1/p} + (y_1^2 − ‖y_2‖^2)^{1/p}
  ⇐⇒ [ (x_1 + y_1)^2 − ‖x_2 + y_2‖^2 ]^{1/p} ≥ (4^{1/p}/2) [ (x_1^2 − ‖x_2‖^2)^{1/p} + (y_1^2 − ‖y_2‖^2)^{1/p} ]
  ⇐⇒ det(x + y)^{1/p} ≥ 2^{2/p − 1} ( det(x)^{1/p} + det(y)^{1/p} ),

which is the desired result.

(c) Using the inequality in part (b) with p = 2, we have det(x + y)^{1/2} ≥ det(x)^{1/2} + det(y)^{1/2}. Squaring both sides yields

  det(x + y) ≥ det(x) + det(y) + 2 det(x)^{1/2} det(y)^{1/2} ≥ det(x) + det(y),

where the last inequality follows from the nonnegativity of det(x) and det(y) since x, y ∈ K^n. This, together with the fact det(αx) = α^2 det(x), leads to the desired result.

(d) This inequality was presented in Proposition 1.2(d). Nonetheless, we provide a different proof by applying part (b) with p = 2 and the fact that det(e) = 1.

(e) From Proposition 1.3(b), we have

  tr(x ∘ y) ≥ λ_1(x)λ_2(y) + λ_1(y)λ_2(x),  ∀x, y ∈ IR^n.

For any x, y ∈ K^n, this along with the arithmetic-geometric mean inequality implies that

  (1/2) tr(x ∘ y) ≥ (λ_1(x)λ_2(y) + λ_1(y)λ_2(x))/2 ≥ √( λ_1(x)λ_2(y)λ_1(y)λ_2(x) ) = det(x)^{1/2} det(y)^{1/2},

which means that inf { (1/2) tr(x ∘ y) | det(y) = 1, y ⪰_{K^n} 0 } = det(x)^{1/2} for fixed x ∈ K^n. If x ≻_{K^n} 0, then one can verify that the feasible point y = √(det(x)) x^{−1} satisfies (1/2) tr(x ∘ y) = det(x)^{1/2}, and the second part follows.

(f) Using part (e), for any x ∈ K^n and y ∈ int(K^n), we have

  tr(x ∘ y) / (2 √(det(y))) = (1/2) tr( x ∘ (y/√(det(y))) ) ≥ det(x)^{1/2},

since det( y/√(det(y)) ) = 1. Hence tr(x ∘ y) ≥ 2 det(x)^{1/2} det(y)^{1/2}, and the case y ∈ bd(K^n) follows by continuity.  □
