
DOI 10.1007/s10898-008-9373-z

Some characterizations for SOC-monotone and SOC-convex functions

Jein-Shan Chen · Xin Chen · Shaohua Pan · Jiawei Zhang

Received: 29 June 2007 / Accepted: 21 October 2008 / Published online: 7 November 2008

© Springer Science+Business Media, LLC. 2008

Abstract We provide some characterizations for SOC-monotone and SOC-convex functions by using differential analysis. From these characterizations, we particularly obtain that a continuously differentiable function defined on an open interval is SOC-monotone (SOC-convex) of order n ≥ 3 if and only if it is 2-matrix monotone (matrix convex), and furthermore, such a function is also SOC-monotone (SOC-convex) of order n ≤ 2 if it is 2-matrix monotone (matrix convex). In addition, we also prove that Conjecture 4.2 proposed in Chen (Optimization 55:363–385, 2006) does not hold in general. Some examples are included to illustrate that these characterizations open convenient ways to verify the SOC-monotonicity and the SOC-convexity of a continuously differentiable function defined on an open interval, which are often involved in solution methods for convex second-order cone optimization.

Keywords Second-order cone · SOC-monotone function · SOC-convex function

Mathematics Subject Classification (2000) 26A48 · 26A51 · 26B05 · 90C25

J.-S. Chen (corresponding author)
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
e-mail: jschen@math.ntnu.edu.tw

X. Chen

Department of Industrial and Enterprise System Engineering, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA

e-mail: xinchen@uiuc.edu

S. Pan
School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China
e-mail: shhpan@scut.edu.cn

J. Zhang

Department of Information, Operations and Management Sciences, New York University, New York, NY 10012-1126, USA

e-mail: jzhang@stern.nyu.edu


1 Introduction

The second-order cone (SOC) in $\mathbb{R}^n$, also called the Lorentz cone, is the set defined by
$$
\mathcal{K}^n := \bigl\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \;\big|\; \|x_2\| \le x_1 \bigr\}, \qquad (1)
$$
where $\|\cdot\|$ denotes the Euclidean norm, and $\mathcal{K}^1$ denotes the set of nonnegative reals $\mathbb{R}_+$. It is known that $\mathcal{K}^n$ is a closed convex self-dual cone with nonempty interior $\mathrm{int}(\mathcal{K}^n)$. For any $x, y \in \mathbb{R}^n$, we write $x \succeq_{\mathcal{K}^n} y$ if $x - y \in \mathcal{K}^n$, and write $x \succ_{\mathcal{K}^n} y$ if $x - y \in \mathrm{int}(\mathcal{K}^n)$. In other words, we have $x \succeq_{\mathcal{K}^n} 0$ if and only if $x \in \mathcal{K}^n$, and $x \succ_{\mathcal{K}^n} 0$ if and only if $x \in \mathrm{int}(\mathcal{K}^n)$. The relation $\succeq_{\mathcal{K}^n}$ is a partial ordering, but not a linear ordering on $\mathcal{K}^n$; i.e., there exist $x, y \in \mathcal{K}^n$ such that neither $x \succeq_{\mathcal{K}^n} y$ nor $y \succeq_{\mathcal{K}^n} x$. To see this, let $x = (1, 1)$ and $y = (1, 0)$; then $x - y = (0, 1) \notin \mathcal{K}^2$ and $y - x = (0, -1) \notin \mathcal{K}^2$.
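These facts are easy to probe numerically. The following Python/NumPy sketch (our own helper name, not from the paper) checks membership in $\mathcal{K}^n$ and reproduces the counterexample showing that $\succeq_{\mathcal{K}^2}$ is not a linear ordering:

```python
import numpy as np

def in_soc(x, tol=1e-12):
    """Check whether x = (x1, x2) belongs to the second-order cone K^n."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x[1:]) <= x[0] + tol

x = np.array([1.0, 1.0])
y = np.array([1.0, 0.0])
print(in_soc(x - y))   # False: x - y = (0, 1) is not in K^2
print(in_soc(y - x))   # False: y - x = (0, -1) is not in K^2, so x and y are incomparable
```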

For any $x = (x_1, x_2),\, y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define their Jordan product as
$$
x \circ y = \bigl( \langle x, y \rangle,\; y_1 x_2 + x_1 y_2 \bigr). \qquad (2)
$$
We write $x^2$ to mean $x \circ x$ and write $x + y$ to mean the usual componentwise addition of vectors. Then $\circ$, $+$, and $e = (1, 0, \ldots, 0)^T \in \mathbb{R}^n$ have the following basic properties (see [7,8]): (1) $e \circ x = x$ for all $x \in \mathbb{R}^n$; (2) $x \circ y = y \circ x$ for all $x, y \in \mathbb{R}^n$; (3) $x \circ (x^2 \circ y) = x^2 \circ (x \circ y)$ for all $x, y \in \mathbb{R}^n$; (4) $(x + y) \circ z = x \circ z + y \circ z$ for all $x, y, z \in \mathbb{R}^n$. Note that the Jordan product is not associative. Besides, $\mathcal{K}^n$ is not closed under the Jordan product.
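As a small sanity check (a Python/NumPy sketch; the function name is ours), the Jordan product (2) is a one-liner, and a random trial typically exhibits its non-associativity:

```python
import numpy as np

def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2) on R x R^{n-1}."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

rng = np.random.default_rng(0)
x, y, z = rng.standard_normal((3, 4))
lhs = jordan(jordan(x, y), z)   # (x o y) o z
rhs = jordan(x, jordan(y, z))   # x o (y o z)
print(np.allclose(lhs, rhs))    # generally False: the Jordan product is not associative
```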

We recall from [7,8] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization, associated with $\mathcal{K}^n$, of the form
$$
x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \qquad (3)
$$
where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are the spectral values and the associated spectral vectors of $x$, given by
$$
\lambda_i(x) = x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} = \tfrac{1}{2}\bigl( 1, (-1)^i \bar{x}_2 \bigr) \quad \text{for } i = 1, 2, \qquad (4)
$$
with $\bar{x}_2 = \frac{x_2}{\|x_2\|}$ if $x_2 \ne 0$, and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{n-1}$ such that $\|\bar{x}_2\| = 1$. If $x_2 \ne 0$, the factorization is unique. By the spectral factorization, for any $f : \mathbb{R} \to \mathbb{R}$, we can define a vector-valued function associated with $\mathcal{K}^n$ ($n \ge 1$) by
$$
f^{\mathrm{soc}}(x) = f(\lambda_1(x))\, u_x^{(1)} + f(\lambda_2(x))\, u_x^{(2)}, \qquad \forall x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}, \qquad (5)
$$
and call it the SOC-function induced by $f$. If $f$ is defined only on a subset of $\mathbb{R}$, then $f^{\mathrm{soc}}$ is defined on the corresponding subset of $\mathbb{R}^n$. The definition is unambiguous whether $x_2 \ne 0$ or $x_2 = 0$. The cases $f^{\mathrm{soc}}(x) = x^{1/2}, x^2, \exp(x)$ were discussed in [7]. In fact, the above definition (5) is analogous to the one associated with the semidefinite cone; see [19,20].
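The spectral factorization (3)–(4) and the induced SOC-function (5) translate directly into code. The following Python/NumPy sketch (helper names are ours) implements $f^{\mathrm{soc}}$ for an arbitrary scalar $f$ and verifies, for instance, that the function induced by $f(t) = t^2$ coincides with the Jordan product $x \circ x$:

```python
import numpy as np

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) with respect to K^n, cf. (3)-(4)."""
    x = np.asarray(x, dtype=float)
    x1, x2 = x[0], x[1:]
    norm2 = np.linalg.norm(x2)
    # any unit vector works when x2 = 0; we pick the first coordinate direction
    xbar = x2 / norm2 if norm2 > 0 else np.eye(len(x2))[0]
    lam = np.array([x1 - norm2, x1 + norm2])
    u1 = 0.5 * np.concatenate(([1.0], -xbar))
    u2 = 0.5 * np.concatenate(([1.0],  xbar))
    return lam, (u1, u2)

def fsoc(f, x):
    """SOC-function induced by f, cf. (5): f^soc(x) = f(lam1) u1 + f(lam2) u2."""
    lam, (u1, u2) = spectral(x)
    return f(lam[0]) * u1 + f(lam[1]) * u2

x = np.array([2.0, 0.5, -1.0])
print(np.allclose(fsoc(np.square, x),
                  np.concatenate(([x @ x], 2 * x[0] * x[1:]))))  # True: equals x o x
```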

Recently, the concepts of SOC-monotone and SOC-convex functions were introduced in [5]. In particular, a function $f : J \to \mathbb{R}$ with $J \subseteq \mathbb{R}$ is said to be SOC-monotone of order $n$ if
$$
x \succeq_{\mathcal{K}^n} y \;\Longrightarrow\; f^{\mathrm{soc}}(x) \succeq_{\mathcal{K}^n} f^{\mathrm{soc}}(y) \qquad (6)
$$
for any $x, y \in \mathrm{dom}\, f^{\mathrm{soc}} \subseteq \mathbb{R}^n$, where $\mathrm{dom}\, f^{\mathrm{soc}}$ denotes the domain of the function $f^{\mathrm{soc}}$; and $f$ is said to be SOC-convex of order $n$ if, for any $x, y \in \mathrm{dom}\, f^{\mathrm{soc}}$,
$$
f^{\mathrm{soc}}\bigl( \lambda x + (1 - \lambda) y \bigr) \preceq_{\mathcal{K}^n} \lambda f^{\mathrm{soc}}(x) + (1 - \lambda) f^{\mathrm{soc}}(y), \qquad \forall \lambda \in [0, 1]. \qquad (7)
$$


The function $f$ is said to be SOC-monotone (respectively, SOC-convex) if it is SOC-monotone of all orders $n$ (respectively, SOC-convex of all orders $n$), and $f$ is SOC-convex on $J$ if and only if $-f$ is SOC-concave on $J$. The concepts of SOC-monotone and SOC-convex functions are analogous to matrix monotone and matrix convex functions [2,10,11,14], and are special cases of operator monotone and operator convex functions [1,3,12]. For example, the function $f$ is said to be $n$-matrix convex on $J$ if
$$
f\bigl( \lambda A + (1 - \lambda) B \bigr) \preceq \lambda f(A) + (1 - \lambda) f(B), \qquad \forall \lambda \in [0, 1],
$$
for arbitrary Hermitian $n \times n$ matrices $A$ and $B$ with spectra in $J$. It is clear that the set of SOC-monotone functions and the set of SOC-convex functions are closed under positive linear combinations and under pointwise limits.
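Definitions (6) and (7) can also be probed numerically by sampling. The Python/NumPy sketch below (all helper names are ours) searches random pairs for violations; absence of violations on samples is of course only supporting evidence, not a proof, while a single violation does certify failure of the property:

```python
import numpy as np

def fsoc(f, x):
    """SOC-function induced by f, cf. (5)."""
    x1, x2 = x[0], np.asarray(x[1:], float)
    nrm = np.linalg.norm(x2)
    xbar = x2 / nrm if nrm > 0 else np.eye(len(x2))[0]
    u1 = 0.5 * np.concatenate(([1.0], -xbar))
    u2 = 0.5 * np.concatenate(([1.0], xbar))
    return f(x1 - nrm) * u1 + f(x1 + nrm) * u2

def in_soc(v, tol=1e-9):
    return np.linalg.norm(v[1:]) <= v[0] + tol

def random_cone_vector(n, rng):
    """A random element of K^n."""
    h2 = rng.standard_normal(n - 1)
    return np.concatenate(([np.linalg.norm(h2) + rng.uniform(0, 2)], h2))

def violates_soc_monotonicity(f, n=4, trials=5000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        y = rng.standard_normal(n)
        x = y + random_cone_vector(n, rng)          # x >=_{K^n} y
        if not in_soc(fsoc(f, x) - fsoc(f, y)):     # (6) fails for this sample
            return True
    return False

def violates_soc_convexity(f, n=4, trials=5000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = rng.standard_normal((2, n))
        lam = rng.uniform()
        gap = lam * fsoc(f, x) + (1 - lam) * fsoc(f, y) - fsoc(f, lam * x + (1 - lam) * y)
        if not in_soc(gap):                          # (7) fails for this sample
            return True
    return False

print(violates_soc_monotonicity(np.square))  # typically True: t^2 is not monotone on R
print(violates_soc_convexity(np.square))     # False: consistent with t^2 being SOC-convex
```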

There has been a systematic study of matrix monotone and matrix convex functions, and characterizations of such functions have been explored; see [4,10,11,13,14] and the references therein. In contrast, the study of SOC-monotone and SOC-convex functions is just taking its first steps. One reason is that they were viewed as special cases of operator monotone and operator convex functions. However, we recently observed that SOC-monotone and SOC-convex functions play an important role in the design of solution methods for convex second-order cone programs (SOCPs); for example, the proximal-like methods in [15] and the augmented Lagrangian method introduced in Sect. 5. On the other hand, it is well known that developments in matrix-valued functions have contributed greatly to the solution of optimization problems. Thus, we hope that a similarly systematic study of SOC-functions can be carried out so that it can be readily adopted in the optimization field. This is the main motivation of the paper.

Although some work was done in [5] for SOC-monotone and SOC-convex functions, the focus there is on providing specific examples via the definition, and it seems difficult to use the characterizations there to verify whether a given function is SOC-convex or not. In this paper, we employ differential analysis to establish some useful characterizations which open convenient ways to verify the SOC-monotonicity and the SOC-convexity of a function defined on an open interval. In particular, from these characterizations, we obtain that a continuously differentiable function defined on an open interval is SOC-monotone (SOC-convex) of order n ≥ 3 if and only if it is 2-matrix monotone (matrix convex), and such a function is also SOC-monotone (SOC-convex) of order n ≤ 2 if it is 2-matrix monotone (matrix convex). Thus, if such a function is 2-matrix monotone (matrix convex), then it must be SOC-monotone (SOC-convex). It should be pointed out that the analysis of this paper cannot be obtained from that for matrix-valued functions. One of the reasons is that matrix multiplication is associative whereas the Jordan product is not.

Throughout the paper, $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors, and $\mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_m}$ is identified with $\mathbb{R}^{n_1 + \cdots + n_m}$. Thus, $(x_1, \ldots, x_m) \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_m}$ is viewed as a column vector in $\mathbb{R}^{n_1 + \cdots + n_m}$. Also, $I$ represents an identity matrix of suitable dimension; $J$ is a subset of $\mathbb{R}$; and $0$ is a zero matrix or vector of suitable dimension. The superscript $T$ means transpose, and $C^{(i)}(J)$ denotes the family of functions defined from $J \subseteq \mathbb{R}$ to $\mathbb{R}$ that have an $i$-th continuous derivative. For a function $f : \mathbb{R} \to \mathbb{R}$, $f^{(i)}(x)$ represents the $i$-th order derivative of $f$ at $x \in \mathbb{R}$, and the first-order and second-order derivatives of $f$ are also written as $f'$ and $f''$, respectively. For any $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x \in \mathbb{R}^n$, and $\mathrm{dom}\, f$ denotes the domain of $f$. For any differentiable mapping $F = (F_1, \ldots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \; \cdots \; \nabla F_m(x)]$ is the $n \times m$ matrix denoting the transposed Jacobian of $F$ at $x$. For any symmetric matrices $A, B \in \mathbb{R}^{n \times n}$, we write $A \succeq B$ (respectively, $A \succ B$) to mean that $A - B$ is positive semidefinite (respectively, positive definite).


2 Preliminaries

In this section, we develop the second-order Taylor expansion of the vector-valued SOC-function $f^{\mathrm{soc}}$ defined in (5), which is crucial in our subsequent analysis. To this end, we assume that $f \in C^{(2)}(J)$ with $J$ being an open interval in $\mathbb{R}$ and $\mathrm{dom}\, f^{\mathrm{soc}} \subseteq \mathbb{R}^n$.

Given any $x \in \mathrm{dom}\, f^{\mathrm{soc}}$ and $h = (h_1, h_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we have $x + th \in \mathrm{dom}\, f^{\mathrm{soc}}$ for any sufficiently small $t > 0$. We wish to calculate the Taylor expansion of the function $f^{\mathrm{soc}}(x + th)$ at $x$ for any sufficiently small $t > 0$. In particular, we are interested in finding the matrices $\nabla f^{\mathrm{soc}}(x)$ and $A_i(x)$ for $i = 1, 2, \ldots, n$ such that
$$
f^{\mathrm{soc}}(x + th) = f^{\mathrm{soc}}(x) + t \nabla f^{\mathrm{soc}}(x) h + \frac{1}{2} t^2 \begin{bmatrix} h^T A_1(x) h \\ h^T A_2(x) h \\ \vdots \\ h^T A_n(x) h \end{bmatrix} + o(t^2). \qquad (8)
$$
For convenience, we omit the variable notation $x$ in $\lambda_i(x)$ for $i = 1, 2$ in the discussions below.

It is known that $f^{\mathrm{soc}}$ is differentiable (respectively, smooth) if and only if $f$ is differentiable (respectively, smooth); see [6,8]. Moreover, it holds that
$$
\nabla f^{\mathrm{soc}}(x) = \begin{bmatrix} b^{(1)} & c^{(1)} \dfrac{x_2^T}{\|x_2\|} \\[2mm] c^{(1)} \dfrac{x_2}{\|x_2\|} & a^{(0)} I + \bigl( b^{(1)} - a^{(0)} \bigr) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix} \qquad (9)
$$
if $x_2 \ne 0$, and otherwise
$$
\nabla f^{\mathrm{soc}}(x) = f'(x_1) I, \qquad (10)
$$
where
$$
a^{(0)} = \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1}, \qquad b^{(1)} = \frac{f'(\lambda_2) + f'(\lambda_1)}{2}, \qquad c^{(1)} = \frac{f'(\lambda_2) - f'(\lambda_1)}{2}. \qquad (11)
$$
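Formulas (9)–(11) are straightforward to implement, and one can cross-check them against a finite-difference Jacobian of $f^{\mathrm{soc}}$. The Python/NumPy sketch below (helper names are ours) does exactly that for $f = \exp$:

```python
import numpy as np

def fsoc(f, x):
    x1, x2 = x[0], np.asarray(x[1:], float)
    nrm = np.linalg.norm(x2)
    xbar = x2 / nrm if nrm > 0 else np.eye(len(x2))[0]
    u1 = 0.5 * np.concatenate(([1.0], -xbar))
    u2 = 0.5 * np.concatenate(([1.0], xbar))
    return f(x1 - nrm) * u1 + f(x1 + nrm) * u2

def grad_fsoc(f, fprime, x):
    """Transposed Jacobian of f^soc at x, cf. (9)-(11); it is symmetric here."""
    x1, x2 = x[0], np.asarray(x[1:], float)
    nrm = np.linalg.norm(x2)
    if nrm == 0:
        return fprime(x1) * np.eye(len(x))
    lam1, lam2 = x1 - nrm, x1 + nrm
    a0 = (f(lam2) - f(lam1)) / (lam2 - lam1)
    b1 = (fprime(lam2) + fprime(lam1)) / 2.0
    c1 = (fprime(lam2) - fprime(lam1)) / 2.0
    xbar = x2 / nrm
    top = np.concatenate(([b1], c1 * xbar))
    bottom = np.hstack([c1 * xbar[:, None],
                        a0 * np.eye(len(x2)) + (b1 - a0) * np.outer(xbar, xbar)])
    return np.vstack([top, bottom])

# finite-difference check at a random point with x2 != 0
rng = np.random.default_rng(2)
x = rng.standard_normal(4)
eps = 1e-6
J = np.column_stack([(fsoc(np.exp, x + eps * e) - fsoc(np.exp, x - eps * e)) / (2 * eps)
                     for e in np.eye(4)])
print(np.allclose(J, grad_fsoc(np.exp, np.exp, x), atol=1e-5))  # True up to discretization error
```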

Therefore, we only need to derive the formulas of $A_i(x)$ for $i = 1, 2, \ldots, n$ in (8).

We first consider the case where $x_2 \ne 0$ and $x_2 + t h_2 \ne 0$. By the definition (5),
$$
f^{\mathrm{soc}}(x + th) = \frac{1}{2} f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr) \begin{bmatrix} 1 \\ -\dfrac{x_2 + t h_2}{\|x_2 + t h_2\|} \end{bmatrix} + \frac{1}{2} f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr) \begin{bmatrix} 1 \\ \dfrac{x_2 + t h_2}{\|x_2 + t h_2\|} \end{bmatrix}
$$
$$
= \begin{bmatrix} \dfrac{f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr) + f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr)}{2} \\[3mm] \dfrac{f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr) - f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr)}{2} \cdot \dfrac{x_2 + t h_2}{\|x_2 + t h_2\|} \end{bmatrix} := \begin{bmatrix} \Xi_1 \\ \Xi_2 \end{bmatrix}, \qquad (12)
$$
where $\Xi_1 \in \mathbb{R}$ and $\Xi_2 \in \mathbb{R}^{n-1}$ denote the first component and the remaining block of $f^{\mathrm{soc}}(x + th)$, respectively.


To derive the Taylor expansion of $f^{\mathrm{soc}}(x + th)$ at $x$ with $x_2 \ne 0$, we first write out and expand $\|x_2 + t h_2\|$. Notice that
$$
\|x_2 + t h_2\| = \sqrt{\|x_2\|^2 + 2 t\, x_2^T h_2 + t^2 \|h_2\|^2} = \|x_2\| \sqrt{1 + \frac{2 t\, x_2^T h_2}{\|x_2\|^2} + \frac{t^2 \|h_2\|^2}{\|x_2\|^2}}.
$$
Therefore, using the fact that $\sqrt{1 + \epsilon} = 1 + \frac{1}{2}\epsilon - \frac{1}{8}\epsilon^2 + o(\epsilon^2)$, we may obtain
$$
\|x_2 + t h_2\| = \|x_2\| \left( 1 + t \frac{\alpha}{\|x_2\|} + \frac{1}{2} t^2 \frac{\beta}{\|x_2\|^2} \right) + o(t^2), \qquad (13)
$$
where
$$
\alpha = \frac{x_2^T h_2}{\|x_2\|}, \qquad \beta = \|h_2\|^2 - \frac{(x_2^T h_2)^2}{\|x_2\|^2} = \|h_2\|^2 - \alpha^2 = h_2^T M_{x_2} h_2, \quad \text{with} \quad M_{x_2} = I - \frac{x_2 x_2^T}{\|x_2\|^2}.
$$
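A quick numerical sanity check of (13) (a Python/NumPy sketch under our own variable names): the error of the second-order approximation of $\|x_2 + t h_2\|$ should vanish faster than $t^2$, and $M_{x_2}$ is the orthogonal projector onto the complement of $\mathrm{span}\{x_2\}$:

```python
import numpy as np

rng = np.random.default_rng(3)
x2, h2 = rng.standard_normal((2, 5))
nrm = np.linalg.norm(x2)
alpha = x2 @ h2 / nrm
M = np.eye(5) - np.outer(x2, x2) / nrm**2
beta = h2 @ M @ h2

for t in (1e-2, 1e-3, 1e-4):
    approx = nrm * (1 + t * alpha / nrm + 0.5 * t**2 * beta / nrm**2)
    err = abs(np.linalg.norm(x2 + t * h2) - approx)
    print(err / t**2)            # tends to 0 as t -> 0, i.e. the remainder is o(t^2)

print(np.allclose(M @ M, M), np.allclose(M @ x2, 0))  # projector properties of M_{x2}
```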

Furthermore, from (13) and the fact that $(1 + \epsilon)^{-1} = 1 - \epsilon + \epsilon^2 + o(\epsilon^2)$, it follows that
$$
\|x_2 + t h_2\|^{-1} = \|x_2\|^{-1} \left[ 1 - t \frac{\alpha}{\|x_2\|} + \frac{1}{2} t^2 \left( \frac{2 \alpha^2}{\|x_2\|^2} - \frac{\beta}{\|x_2\|^2} \right) + o(t^2) \right]. \qquad (14)
$$
Combining Eqs. (13) and (14) then yields
$$
\frac{x_2 + t h_2}{\|x_2 + t h_2\|} = \frac{x_2}{\|x_2\|} + t \left( \frac{h_2}{\|x_2\|} - \frac{\alpha}{\|x_2\|} \frac{x_2}{\|x_2\|} \right) + \frac{1}{2} t^2 \left[ \left( \frac{2 \alpha^2}{\|x_2\|^2} - \frac{\beta}{\|x_2\|^2} \right) \frac{x_2}{\|x_2\|} - 2 \frac{h_2}{\|x_2\|} \frac{\alpha}{\|x_2\|} \right] + o(t^2)
$$
$$
= \frac{x_2}{\|x_2\|} + t\, \frac{M_{x_2} h_2}{\|x_2\|} + \frac{1}{2} t^2 \left( \frac{3 h_2^T x_2 x_2^T h_2}{\|x_2\|^4} \frac{x_2}{\|x_2\|} - \frac{\|h_2\|^2}{\|x_2\|^2} \frac{x_2}{\|x_2\|} - \frac{2 h_2 h_2^T}{\|x_2\|^2} \frac{x_2}{\|x_2\|} \right) + o(t^2). \qquad (15)
$$

In addition, from (13), we have the following equalities:
$$
f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr) = f\left( x_1 + t h_1 - \|x_2\| \left( 1 + t \frac{\alpha}{\|x_2\|} + \frac{1}{2} t^2 \frac{\beta}{\|x_2\|^2} \right) + o(t^2) \right)
$$
$$
= f\left( \lambda_1 + t (h_1 - \alpha) - \frac{1}{2} t^2 \frac{\beta}{\|x_2\|} + o(t^2) \right)
$$
$$
= f(\lambda_1) + t f'(\lambda_1) (h_1 - \alpha) + \frac{1}{2} t^2 \left( -f'(\lambda_1) \frac{\beta}{\|x_2\|} + f''(\lambda_1) (h_1 - \alpha)^2 \right) + o(t^2) \qquad (16)
$$


and
$$
f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr) = f\left( \lambda_2 + t (h_1 + \alpha) + \frac{1}{2} t^2 \frac{\beta}{\|x_2\|} + o(t^2) \right)
$$
$$
= f(\lambda_2) + t f'(\lambda_2) (h_1 + \alpha) + \frac{1}{2} t^2 \left( f'(\lambda_2) \frac{\beta}{\|x_2\|} + f''(\lambda_2) (h_1 + \alpha)^2 \right) + o(t^2). \qquad (17)
$$
For $i = 0, 1, 2$, we define

$$
a^{(i)} = \frac{f^{(i)}(\lambda_2) - f^{(i)}(\lambda_1)}{\lambda_2 - \lambda_1}, \qquad b^{(i)} = \frac{f^{(i)}(\lambda_2) + f^{(i)}(\lambda_1)}{2}, \qquad c^{(i)} = \frac{f^{(i)}(\lambda_2) - f^{(i)}(\lambda_1)}{2}, \qquad (18)
$$
where $f^{(i)}$ means the $i$-th derivative of $f$ and $f^{(0)}$ is the same as the original $f$. Then, by Eqs. (16)–(18), it can be verified that

$$
\Xi_1 = \frac{1}{2} \Bigl[ f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr) + f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr) \Bigr]
$$
$$
= b^{(0)} + t \bigl( b^{(1)} h_1 + c^{(1)} \alpha \bigr) + \frac{1}{2} t^2 \Bigl( a^{(1)} \beta + b^{(2)} (h_1^2 + \alpha^2) + 2 c^{(2)} h_1 \alpha \Bigr) + o(t^2)
$$
$$
= b^{(0)} + t \left( b^{(1)} h_1 + c^{(1)} \frac{h_2^T x_2}{\|x_2\|} \right) + \frac{1}{2} t^2\, h^T A_1(x) h + o(t^2),
$$
where
$$
A_1(x) = \begin{bmatrix} b^{(2)} & c^{(2)} \dfrac{x_2^T}{\|x_2\|} \\[2mm] c^{(2)} \dfrac{x_2}{\|x_2\|} & a^{(1)} I + \bigl( b^{(2)} - a^{(1)} \bigr) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}. \qquad (19)
$$

Note that in the above expression for $\Xi_1$, $b^{(0)}$ is exactly the first component of $f^{\mathrm{soc}}(x)$ and $\bigl( b^{(1)} h_1 + c^{(1)} \frac{h_2^T x_2}{\|x_2\|} \bigr)$ is the first component of $\nabla f^{\mathrm{soc}}(x) h$. Using the same techniques again,
$$
\frac{1}{2} \Bigl[ f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr) - f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr) \Bigr]
$$
$$
= c^{(0)} + t \bigl( c^{(1)} h_1 + b^{(1)} \alpha \bigr) + \frac{1}{2} t^2 \left( b^{(1)} \frac{\beta}{\|x_2\|} + c^{(2)} (h_1^2 + \alpha^2) + 2 b^{(2)} h_1 \alpha \right) + o(t^2)
$$
$$
= c^{(0)} + t \bigl( c^{(1)} h_1 + b^{(1)} \alpha \bigr) + \frac{1}{2} t^2\, h^T B(x) h + o(t^2), \qquad (20)
$$
where
$$
B(x) = \begin{bmatrix} c^{(2)} & b^{(2)} \dfrac{x_2^T}{\|x_2\|} \\[2mm] b^{(2)} \dfrac{x_2}{\|x_2\|} & c^{(2)} I + \left( \dfrac{b^{(1)}}{\|x_2\|} - c^{(2)} \right) M_{x_2} \end{bmatrix}. \qquad (21)
$$

Using Eqs. (15) and (20), we obtain that
$$
\Xi_2 = \frac{1}{2} \Bigl[ f\bigl( x_1 + t h_1 + \|x_2 + t h_2\| \bigr) - f\bigl( x_1 + t h_1 - \|x_2 + t h_2\| \bigr) \Bigr] \frac{x_2 + t h_2}{\|x_2 + t h_2\|}
$$
$$
= c^{(0)} \frac{x_2}{\|x_2\|} + t \left[ \frac{x_2}{\|x_2\|} \bigl( c^{(1)} h_1 + b^{(1)} \alpha \bigr) + c^{(0)} \frac{M_{x_2} h_2}{\|x_2\|} \right] + \frac{1}{2} t^2 W + o(t^2),
$$


where
$$
W = \frac{x_2}{\|x_2\|}\, h^T B(x) h + 2\, \frac{M_{x_2} h_2}{\|x_2\|} \bigl( c^{(1)} h_1 + b^{(1)} \alpha \bigr) + c^{(0)} \left( \frac{3 h_2^T x_2 x_2^T h_2}{\|x_2\|^4} \frac{x_2}{\|x_2\|} - \frac{\|h_2\|^2}{\|x_2\|^2} \frac{x_2}{\|x_2\|} - \frac{2 h_2 h_2^T}{\|x_2\|^2} \frac{x_2}{\|x_2\|} \right).
$$

Now we denote
$$
d := \frac{b^{(1)} - a^{(0)}}{\|x_2\|} = \frac{2 \bigl( b^{(1)} - a^{(0)} \bigr)}{\lambda_2 - \lambda_1}, \qquad U := h^T C(x) h,
$$
$$
V := \frac{2 \bigl( c^{(1)} h_1 + b^{(1)} \alpha \bigr)}{\|x_2\|} - c^{(0)} \frac{2 x_2^T h_2}{\|x_2\|^3} = 2 a^{(1)} h_1 + \frac{2 d\, x_2^T h_2}{\|x_2\|},
$$
where
$$
C(x) := \begin{bmatrix} c^{(2)} & \bigl( b^{(2)} - a^{(1)} \bigr) \dfrac{x_2^T}{\|x_2\|} \\[2mm] \bigl( b^{(2)} - a^{(1)} \bigr) \dfrac{x_2}{\|x_2\|} & d I + \bigl( c^{(2)} - 3 d \bigr) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}. \qquad (22)
$$

Then $U$ can be further recast as
$$
U = h^T B(x) h + c^{(0)} \frac{3 h_2^T x_2 x_2^T h_2}{\|x_2\|^4} - c^{(0)} \frac{\|h_2\|^2}{\|x_2\|^2} - \frac{2 x_2^T h_2}{\|x_2\|^2} \bigl( c^{(1)} h_1 + b^{(1)} \alpha \bigr).
$$
Consequently,
$$
W = \frac{x_2}{\|x_2\|} U + h_2 V.
$$

We next consider the case where $x_2 = 0$ and $x_2 + t h_2 \ne 0$. By definition (5),
$$
f^{\mathrm{soc}}(x + th) = \frac{f\bigl( x_1 + t (h_1 - \|h_2\|) \bigr)}{2} \begin{bmatrix} 1 \\ -\dfrac{h_2}{\|h_2\|} \end{bmatrix} + \frac{f\bigl( x_1 + t (h_1 + \|h_2\|) \bigr)}{2} \begin{bmatrix} 1 \\ \dfrac{h_2}{\|h_2\|} \end{bmatrix}
$$
$$
= \begin{bmatrix} \dfrac{f\bigl( x_1 + t (h_1 - \|h_2\|) \bigr) + f\bigl( x_1 + t (h_1 + \|h_2\|) \bigr)}{2} \\[3mm] \dfrac{f\bigl( x_1 + t (h_1 + \|h_2\|) \bigr) - f\bigl( x_1 + t (h_1 - \|h_2\|) \bigr)}{2} \cdot \dfrac{h_2}{\|h_2\|} \end{bmatrix}. \qquad (23)
$$

Using the Taylor expansion of $f$ at $x_1$, we can obtain that
$$
\frac{1}{2} \Bigl[ f\bigl( x_1 + t (h_1 - \|h_2\|) \bigr) + f\bigl( x_1 + t (h_1 + \|h_2\|) \bigr) \Bigr] = f(x_1) + t f^{(1)}(x_1) h_1 + \frac{1}{2} t^2 f^{(2)}(x_1)\, h^T h + o(t^2),
$$
$$
\frac{1}{2} \Bigl[ f\bigl( x_1 + t (h_1 + \|h_2\|) \bigr) - f\bigl( x_1 + t (h_1 - \|h_2\|) \bigr) \Bigr] = t f^{(1)}(x_1) \|h_2\| + \frac{1}{2} t^2 f^{(2)}(x_1)\, 2 h_1 \|h_2\| + o(t^2).
$$
Therefore,
$$
f^{\mathrm{soc}}(x + th) = f^{\mathrm{soc}}(x) + t f^{(1)}(x_1) h + \frac{1}{2} t^2 f^{(2)}(x_1) \begin{bmatrix} h^T h \\ 2 h_1 h_2 \end{bmatrix} + o(t^2). \qquad (24)
$$


Thus, in this case we have that
$$
A_1(x) = f^{(2)}(x_1) I, \qquad A_i(x) = f^{(2)}(x_1) \begin{bmatrix} 0 & \bar{e}_{i-1}^T \\ \bar{e}_{i-1} & 0 \end{bmatrix}, \quad i = 2, \ldots, n, \qquad (25)
$$
where $\bar{e}_j \in \mathbb{R}^{n-1}$ is the vector whose $j$-th component is 1 and whose other components are 0.

Summing up the above discussions, we may obtain the following conclusion.

Proposition 2.1 Let $f \in C^{(2)}(J)$ with $J$ being an open interval in $\mathbb{R}$ and $\mathrm{dom}\, f^{\mathrm{soc}} \subseteq \mathbb{R}^n$. Then, for given $x \in \mathrm{dom}\, f^{\mathrm{soc}}$, $h \in \mathbb{R}^n$ and any sufficiently small $t > 0$,
$$
f^{\mathrm{soc}}(x + th) = f^{\mathrm{soc}}(x) + t \nabla f^{\mathrm{soc}}(x) h + \frac{1}{2} t^2 \begin{bmatrix} h^T A_1(x) h \\ h^T A_2(x) h \\ \vdots \\ h^T A_n(x) h \end{bmatrix} + o(t^2),
$$
where $\nabla f^{\mathrm{soc}}(x)$ and $A_i(x)$, $i = 1, 2, \ldots, n$, are given by (10) and (25) if $x_2 = 0$; otherwise $\nabla f^{\mathrm{soc}}(x)$ and $A_1(x)$ are given by (9) and (19), respectively, and for $i \ge 2$,
$$
A_i(x) = \frac{(x_2)_{i-1}}{\|x_2\|}\, C(x) + B_i(x), \quad \text{where} \quad B_i(x) = v e_i^T + e_i v^T, \qquad v = \begin{bmatrix} a^{(1)} \\[1mm] d\, \dfrac{x_2}{\|x_2\|} \end{bmatrix},
$$
and $e_i$ denotes the $i$-th column of the identity matrix $I$.
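The expansion in Proposition 2.1 can be checked numerically. The Python/NumPy sketch below (helper names are ours) builds $A_1(x)$ from (19) for $x_2 \ne 0$ and compares $h^T A_1(x) h$ with a second-order central difference of the first component of $f^{\mathrm{soc}}$ along $h$:

```python
import numpy as np

def fsoc(f, x):
    x1, x2 = x[0], np.asarray(x[1:], float)
    nrm = np.linalg.norm(x2)
    xbar = x2 / nrm if nrm > 0 else np.eye(len(x2))[0]
    u1 = 0.5 * np.concatenate(([1.0], -xbar))
    u2 = 0.5 * np.concatenate(([1.0], xbar))
    return f(x1 - nrm) * u1 + f(x1 + nrm) * u2

def A1(f1, f2, x):
    """The matrix A_1(x) of (19), for x2 != 0; f1, f2 stand for f' and f''."""
    x1, x2 = x[0], np.asarray(x[1:], float)
    nrm = np.linalg.norm(x2)
    lam1, lam2 = x1 - nrm, x1 + nrm
    a1 = (f1(lam2) - f1(lam1)) / (lam2 - lam1)
    b2 = (f2(lam2) + f2(lam1)) / 2.0
    c2 = (f2(lam2) - f2(lam1)) / 2.0
    xbar = x2 / nrm
    top = np.concatenate(([b2], c2 * xbar))
    bottom = np.hstack([c2 * xbar[:, None],
                        a1 * np.eye(len(x2)) + (b2 - a1) * np.outer(xbar, xbar)])
    return np.vstack([top, bottom])

rng = np.random.default_rng(4)
x, h = rng.standard_normal((2, 4))
t = 1e-4
second_diff = (fsoc(np.exp, x + t * h)[0] - 2 * fsoc(np.exp, x)[0]
               + fsoc(np.exp, x - t * h)[0]) / t**2
print(np.isclose(second_diff, h @ A1(np.exp, np.exp, x) @ h, atol=1e-4))  # True
```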

From Proposition 4.3 of [5] and Proposition 2.1, we readily have the following result.

Proposition 2.2 Let $f \in C^{(2)}(J)$ with $J$ being an open interval in $\mathbb{R}$ and $\mathrm{dom}\, f^{\mathrm{soc}} \subseteq \mathbb{R}^n$. Then $f$ is SOC-convex if and only if for any $x \in \mathrm{dom}\, f^{\mathrm{soc}}$ and $h \in \mathbb{R}^n$, the vector
$$
\begin{bmatrix} h^T A_1(x) h \\ h^T A_2(x) h \\ \vdots \\ h^T A_n(x) h \end{bmatrix} \in \mathcal{K}^n.
$$

3 Characterizations of SOC-monotone functions

Now we are ready to show our main result concerning the characterization of SOC-monotone functions. We need the following technical lemmas for the proof. The first one is the so-called S-Lemma, whose proof can be found in [16,18].

Lemma 3.1 Let $A, B$ be symmetric matrices and $y^T A y > 0$ for some $y$. Then, the implication
$$
z^T A z \ge 0 \;\Longrightarrow\; z^T B z \ge 0
$$
is valid if and only if $B \succeq \lambda A$ for some $\lambda \ge 0$.

Lemma 3.2 Given $\theta \in \mathbb{R}$, $a \in \mathbb{R}^{n-1}$, and a symmetric matrix $A \in \mathbb{R}^{n \times n}$, let $\mathcal{B}^{n-1} := \{ z \in \mathbb{R}^{n-1} \mid \|z\| \le 1 \}$. Then, the following results hold:

(a) $Ah \in \mathcal{K}^n$ for all $h \in \mathcal{K}^n$ if and only if $A \begin{bmatrix} 1 \\ z \end{bmatrix} \in \mathcal{K}^n$ for all $z \in \mathcal{B}^{n-1}$.

(b) For any $z \in \mathcal{B}^{n-1}$, $\theta + a^T z \ge 0$ is equivalent to $\theta \ge \|a\|$.

(c) If $A = \begin{bmatrix} \theta & a^T \\ a & H \end{bmatrix}$ with $H$ being an $(n-1) \times (n-1)$ symmetric matrix, then $Ah \in \mathcal{K}^n$ for all $h \in \mathcal{K}^n$ if and only if $\theta \ge \|a\|$ and there exists $\lambda \ge 0$ such that the matrix
$$
\begin{bmatrix} \theta^2 - \|a\|^2 - \lambda & \theta a^T - a^T H \\ \theta a - H^T a & a a^T - H^T H + \lambda I \end{bmatrix} \succeq O.
$$

Proof (a) Suppose that $Ah \in \mathcal{K}^n$ for all $h \in \mathcal{K}^n$. For any $z \in \mathcal{B}^{n-1}$, let $h = \begin{bmatrix} 1 \\ z \end{bmatrix}$; then $h \in \mathcal{K}^n$ and the desired result follows. For the other direction, if $h = 0$, the conclusion is obvious. Now let $h := (h_1, h_2)$ be any nonzero vector in $\mathcal{K}^n$. Then $h_1 > 0$ and $\|h_2\| \le h_1$. Consequently, $\frac{h_2}{h_1} \in \mathcal{B}^{n-1}$ and $A \begin{bmatrix} 1 \\ h_2 / h_1 \end{bmatrix} \in \mathcal{K}^n$. Since $\mathcal{K}^n$ is a cone, we have
$$
h_1 A \begin{bmatrix} 1 \\ h_2 / h_1 \end{bmatrix} = A h \in \mathcal{K}^n.
$$

(b) For $z \in \mathcal{B}^{n-1}$, suppose $\theta + a^T z \ge 0$. If $a = 0$, then the result is clear since $\theta \ge 0$. If $a \ne 0$, let $z := -a / \|a\|$. Clearly, $z \in \mathcal{B}^{n-1}$, and hence $\theta + \frac{-a^T a}{\|a\|} \ge 0$, which gives $\theta - \|a\| \ge 0$. For the other direction, the result follows from the Cauchy–Schwarz inequality:
$$
\theta + a^T z \ge \theta - \|a\| \cdot \|z\| \ge \theta - \|a\| \ge 0.
$$

(c) From part (a), $Ah \in \mathcal{K}^n$ for all $h \in \mathcal{K}^n$ is equivalent to $A \begin{bmatrix} 1 \\ z \end{bmatrix} \in \mathcal{K}^n$ for all $z \in \mathcal{B}^{n-1}$. Notice that
$$
A \begin{bmatrix} 1 \\ z \end{bmatrix} = \begin{bmatrix} \theta & a^T \\ a & H \end{bmatrix} \begin{bmatrix} 1 \\ z \end{bmatrix} = \begin{bmatrix} \theta + a^T z \\ a + H z \end{bmatrix}.
$$
Then, $Ah \in \mathcal{K}^n$ for all $h \in \mathcal{K}^n$ is equivalent to the following two conditions:
$$
\theta + a^T z \ge 0 \quad \text{for any } z \in \mathcal{B}^{n-1} \qquad (26)
$$
and
$$
(a + H z)^T (a + H z) \le (\theta + a^T z)^2 \quad \text{for any } z \in \mathcal{B}^{n-1}. \qquad (27)
$$
By part (b), (26) is equivalent to $\theta \ge \|a\|$. Now, we rewrite (27) as
$$
z^T (a a^T - H^T H) z + 2 (\theta a^T - a^T H) z + \theta^2 - a^T a \ge 0 \quad \text{for any } z \in \mathcal{B}^{n-1},
$$
which can be further written as
$$
\begin{bmatrix} 1 \\ z \end{bmatrix}^T \begin{bmatrix} \theta^2 - \|a\|^2 & \theta a^T - a^T H \\ \theta a - H^T a & a a^T - H^T H \end{bmatrix} \begin{bmatrix} 1 \\ z \end{bmatrix} \ge 0, \quad \text{for any } z \in \mathcal{B}^{n-1}.
$$
Observe that $z \in \mathcal{B}^{n-1}$ is the same as
$$
\begin{bmatrix} 1 \\ z \end{bmatrix}^T \begin{bmatrix} 1 & 0 \\ 0 & -I \end{bmatrix} \begin{bmatrix} 1 \\ z \end{bmatrix} \ge 0.
$$
Thus, by applying the S-Lemma (Lemma 3.1), there exists $\lambda \ge 0$ such that
$$
\begin{bmatrix} \theta^2 - \|a\|^2 & \theta a^T - a^T H \\ \theta a - H^T a & a a^T - H^T H \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & -I \end{bmatrix} \succeq O.
$$
This completes the proof of part (c). $\square$

Theorem 3.1 Let $f \in C^{(1)}(J)$ with $J$ being an open interval and $\mathrm{dom}\, f^{\mathrm{soc}} \subseteq \mathbb{R}^n$. Then,

(i) when $n = 2$, $f$ is SOC-monotone if and only if $f'(\tau) \ge 0$ for any $\tau \in J$;

(ii) when $n \ge 3$, $f$ is SOC-monotone if and only if the $2 \times 2$ matrix
$$
\begin{bmatrix} f^{(1)}(t_1) & \dfrac{f(t_2) - f(t_1)}{t_2 - t_1} \\[2mm] \dfrac{f(t_2) - f(t_1)}{t_2 - t_1} & f^{(1)}(t_2) \end{bmatrix} \succeq O \quad \text{for all } t_1, t_2 \in J.
$$

Proof By the definition of SOC-monotonicity, $f$ is SOC-monotone if and only if
$$
f^{\mathrm{soc}}(x + h) - f^{\mathrm{soc}}(x) \in \mathcal{K}^n \qquad (28)
$$
for any $x \in \mathrm{dom}\, f^{\mathrm{soc}}$ and $h \in \mathcal{K}^n$ such that $x + h \in \mathrm{dom}\, f^{\mathrm{soc}}$. By the first-order Taylor expansion of $f^{\mathrm{soc}}$, i.e.,
$$
f^{\mathrm{soc}}(x + h) = f^{\mathrm{soc}}(x) + \nabla f^{\mathrm{soc}}(x + t h) h \quad \text{for some } t \in (0, 1),
$$
it is clear that (28) is equivalent to $\nabla f^{\mathrm{soc}}(x + t h) h \in \mathcal{K}^n$ for any $x \in \mathrm{dom}\, f^{\mathrm{soc}}$ and $h \in \mathcal{K}^n$ such that $x + h \in \mathrm{dom}\, f^{\mathrm{soc}}$, and some $t \in (0, 1)$. Let $y := x + t h = \mu_1 v^{(1)} + \mu_2 v^{(2)}$ (the spectral factorization (3) of $y$) for such $x, h$ and $t$. We next proceed with the arguments for the two cases $y_2 \ne 0$ and $y_2 = 0$.

Case (1): $y_2 \ne 0$. In this case, we notice that
$$
\nabla f^{\mathrm{soc}}(y) = \begin{bmatrix} \theta & a^T \\ a & H \end{bmatrix},
$$
where
$$
\theta = \tilde{b}^{(1)}, \qquad a = \tilde{c}^{(1)} \frac{y_2}{\|y_2\|}, \qquad H = \tilde{a}^{(0)} I + \bigl( \tilde{b}^{(1)} - \tilde{a}^{(0)} \bigr) \frac{y_2 y_2^T}{\|y_2\|^2},
$$
with
$$
\tilde{a}^{(0)} = \frac{f(\mu_2) - f(\mu_1)}{\mu_2 - \mu_1}, \qquad \tilde{b}^{(1)} = \frac{f'(\mu_2) + f'(\mu_1)}{2}, \qquad \tilde{c}^{(1)} = \frac{f'(\mu_2) - f'(\mu_1)}{2}. \qquad (29)
$$
In addition, we also observe that
$$
\theta^2 - \|a\|^2 = \bigl( \tilde{b}^{(1)} \bigr)^2 - \bigl( \tilde{c}^{(1)} \bigr)^2, \qquad \theta a^T - a^T H = 0,
$$
and
$$
a a^T - H^T H = -\bigl( \tilde{a}^{(0)} \bigr)^2 I + \Bigl( \bigl( \tilde{c}^{(1)} \bigr)^2 - \bigl( \tilde{b}^{(1)} \bigr)^2 + \bigl( \tilde{a}^{(0)} \bigr)^2 \Bigr) \frac{y_2 y_2^T}{\|y_2\|^2}.
$$
Thus, by Lemma 3.2, $f$ is SOC-monotone if and only if

(a) $\tilde{b}^{(1)} \ge |\tilde{c}^{(1)}|$;

(b) there exists $\lambda \ge 0$ such that the matrix
$$
\begin{bmatrix} \bigl( \tilde{b}^{(1)} \bigr)^2 - \bigl( \tilde{c}^{(1)} \bigr)^2 - \lambda & 0 \\[1mm] 0 & \bigl( \lambda - ( \tilde{a}^{(0)} )^2 \bigr) I + \Bigl( \bigl( \tilde{c}^{(1)} \bigr)^2 - \bigl( \tilde{b}^{(1)} \bigr)^2 + \bigl( \tilde{a}^{(0)} \bigr)^2 \Bigr) \dfrac{y_2 y_2^T}{\|y_2\|^2} \end{bmatrix} \succeq O.
$$

When $n = 2$, (a) together with (b) is equivalent to saying that $f'(\mu_1) \ge 0$ and $f'(\mu_2) \ge 0$. Then we conclude that $f$ is SOC-monotone if and only if $f'(\tau) \ge 0$ for any $\tau \in J$.

When $n \ge 3$, (b) is equivalent to saying that $\bigl( \tilde{b}^{(1)} \bigr)^2 - \bigl( \tilde{c}^{(1)} \bigr)^2 = \lambda \ge 0$ and $\lambda - \bigl( \tilde{a}^{(0)} \bigr)^2 \ge 0$, i.e., $\bigl( \tilde{b}^{(1)} \bigr)^2 - \bigl( \tilde{c}^{(1)} \bigr)^2 \ge \bigl( \tilde{a}^{(0)} \bigr)^2$. Therefore, (a) together with (b) is equivalent to
$$
\begin{bmatrix} f^{(1)}(\mu_1) & \dfrac{f(\mu_2) - f(\mu_1)}{\mu_2 - \mu_1} \\[2mm] \dfrac{f(\mu_2) - f(\mu_1)}{\mu_2 - \mu_1} & f^{(1)}(\mu_2) \end{bmatrix} \succeq O
$$
for any $x \in \mathbb{R}^n$, $h \in \mathcal{K}^n$ such that $x + h \in \mathrm{dom}\, f^{\mathrm{soc}}$, and some $t \in (0, 1)$. Thus, we conclude that $f$ is SOC-monotone if and only if
$$
\begin{bmatrix} f^{(1)}(t_1) & \dfrac{f(t_2) - f(t_1)}{t_2 - t_1} \\[2mm] \dfrac{f(t_2) - f(t_1)}{t_2 - t_1} & f^{(1)}(t_2) \end{bmatrix} \succeq O \quad \text{for all } t_1, t_2 \in J.
$$

Case (2): $y_2 = 0$. Now we have $\mu_1 = \mu_2$ and $\nabla f^{\mathrm{soc}}(y) = f^{(1)}(\mu_1) I = f^{(1)}(\mu_2) I$. Hence, $f$ being SOC-monotone is equivalent to $f^{(1)}(\mu_1) \ge 0$, which is also equivalent to
$$
\begin{bmatrix} f^{(1)}(\mu_1) & \dfrac{f(\mu_2) - f(\mu_1)}{\mu_2 - \mu_1} \\[2mm] \dfrac{f(\mu_2) - f(\mu_1)}{\mu_2 - \mu_1} & f^{(1)}(\mu_2) \end{bmatrix} \succeq O,
$$
since $f^{(1)}(\mu_1) = f^{(1)}(\mu_2)$ and $\frac{f(\mu_2) - f(\mu_1)}{\mu_2 - \mu_1} = f^{(1)}(\mu_1) = f^{(1)}(\mu_2)$ by the Taylor formula and $\mu_1 = \mu_2$ (the difference quotient being understood as its limit). Thus, similar to Case (1), the conclusion also holds in this case. $\square$
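Theorem 3.1(ii) yields a practical test: form the $2 \times 2$ matrix above and check its positive semidefiniteness over a grid of $(t_1, t_2) \in J \times J$. The following Python/NumPy sketch (function names are ours; a finite grid provides only numerical evidence for the "if" direction, whereas a single failure certifies non-monotonicity for $n \ge 3$) applies the test to $f(t) = -1/t$ and $f(t) = t^2$ on $(0, \infty)$:

```python
import numpy as np

def soc_monotone_test(f, fprime, grid, tol=1e-10):
    """Grid check of the 2x2 PSD condition of Theorem 3.1(ii) (n >= 3)."""
    for t1 in grid:
        for t2 in grid:
            s = fprime(t1) if np.isclose(t1, t2) else (f(t2) - f(t1)) / (t2 - t1)
            M = np.array([[fprime(t1), s], [s, fprime(t2)]])
            if np.linalg.eigvalsh(M).min() < -tol:
                return False
    return True

grid = np.linspace(0.1, 10.0, 60)
# True: consistent with -1/t being SOC-monotone on (0, inf)
print(soc_monotone_test(lambda t: -1.0 / t, lambda t: 1.0 / t**2, grid))
# False: t^2 fails the test, hence it is not SOC-monotone of order n >= 3
print(soc_monotone_test(np.square, lambda t: 2.0 * t, grid))
```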

From Theorem 3.1 and [11, Theorem 6.6.36], we immediately have the following results.

Corollary 3.1 Let $f \in C^{(1)}(J)$ with $J$ being an open interval in $\mathbb{R}$. Then,

(a) $f$ is SOC-monotone of order $n \ge 3$ if and only if it is 2-matrix monotone, and $f$ is SOC-monotone of order $n \le 2$ if it is 2-matrix monotone.

(b) Suppose that $n \ge 3$ and $f$ is SOC-monotone of order $n$. Then $f'(t_0) = 0$ for some $t_0 \in J$ if and only if $f(\cdot)$ is a constant function on $J$.

Note that SOC-monotonicity of order 2 does not imply 2-matrix monotonicity. For example, $f(t) = t^2$ is SOC-monotone of order 2 on $(0, +\infty)$ by Example 3.2(a) in [5], but by [11, Theorem 6.6.36] we can verify that it is not 2-matrix monotone. Corollary 3.1(a) implies that a continuously differentiable function defined on an open interval must be SOC-monotone if it is 2-matrix monotone. In addition, from the following proposition, we also have that the composition of two SOC-monotone functions is SOC-monotone.
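Both observations for $f(t) = t^2$ can be checked numerically. In the Python/NumPy sketch below (helper names are ours), the $2 \times 2$ matrix of [11, Theorem 6.6.36] has determinant $-(t_1 - t_2)^2 < 0$, so 2-matrix monotonicity fails, while sampling pairs $x \succeq_{\mathcal{K}^2} y$ with spectral values in $(0, \infty)$ produces no violation of (6) for $n = 2$ (supporting evidence only, not a proof):

```python
import numpy as np

def fsoc2(x):                      # f^soc for f(t) = t^2 and n = 2
    lam = np.array([x[0] - abs(x[1]), x[0] + abs(x[1])])
    s = np.sign(x[1]) if x[1] != 0 else 1.0
    return 0.5 * np.array([lam[0]**2 + lam[1]**2, s * (lam[1]**2 - lam[0]**2)])

def in_soc2(v, tol=1e-9):
    return abs(v[1]) <= v[0] + tol

rng = np.random.default_rng(5)
violations = 0
for _ in range(20000):
    # y with spectral values in (0, oo), and x = y + k with k in K^2
    y2 = rng.uniform(-3, 3)
    y = np.array([abs(y2) + rng.uniform(0.01, 3.0), y2])
    k2 = rng.uniform(-3, 3)
    k = np.array([abs(k2) + rng.uniform(0.0, 3.0), k2])
    x = y + k
    if not in_soc2(fsoc2(x) - fsoc2(y)):
        violations += 1
print(violations)   # 0: consistent with t^2 being SOC-monotone of order 2 on (0, oo)

t1, t2 = 1.0, 2.0   # 2-matrix monotonicity fails: the 2x2 matrix is indefinite
print(np.linalg.det(np.array([[2 * t1, t1 + t2], [t1 + t2, 2 * t2]])))  # -(t1 - t2)^2 = -1
```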

Proposition 3.1 If $f : J_1 \to J$ and $g : J \to \mathbb{R}$ with $J_1, J \subseteq \mathbb{R}$ are SOC-monotone on $J_1$ and $J$, respectively, then the composite function $g \circ f : J_1 \to \mathbb{R}$ is SOC-monotone on $J_1$.

Proof It is easy to verify that for all $x, y \in \mathbb{R}^n$, $x \succeq_{\mathcal{K}^n} y$ if and only if $\lambda_i(x) \ge \lambda_i(y)$ for $i = 1, 2$. In addition, $g$ is monotone on $J$ since it is SOC-monotone. From these two facts, we immediately obtain the result. $\square$


4 Characterizations of SOC-convex functions

In this section, we exploit the Peirce decomposition to derive some characterizations of SOC-convex functions. Let $f \in C^{(2)}(J)$ with $J$ being an open interval in $\mathbb{R}$ and $\mathrm{dom}\, f^{\mathrm{soc}} \subseteq \mathbb{R}^n$. For any $x \in \mathrm{dom}\, f^{\mathrm{soc}}$ and $h \in \mathbb{R}^n$, if $x_2 = 0$, then from Proposition 2.1 we have that
$$
\begin{bmatrix} h^T A_1(x) h \\ h^T A_2(x) h \\ \vdots \\ h^T A_n(x) h \end{bmatrix} = f^{(2)}(x_1) \begin{bmatrix} h^T h \\ 2 h_1 h_2 \end{bmatrix}.
$$

Since $(h^T h, 2 h_1 h_2) \in \mathcal{K}^n$, from Proposition 2.2 it follows that $f$ is SOC-convex if and only if $f^{(2)}(x_1) \ge 0$. By the arbitrariness of $x_1$, $f$ is SOC-convex if and only if $f$ is convex on $J$. In what follows, we assume that $x_2 \ne 0$. Let $x = \lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}$, where $u_x^{(1)}$ and $u_x^{(2)}$ are given by (4) with $\bar{x}_2 = \frac{x_2}{\|x_2\|}$. Let $u_x^{(i)} = (0, \upsilon_2^{(i)})$ for $i = 3, \ldots, n$, where $\upsilon_2^{(3)}, \ldots, \upsilon_2^{(n)}$ is any orthonormal set of vectors spanning the $(n-2)$-dimensional subspace of $\mathbb{R}^{n-1}$ orthogonal to $x_2$. It is easy to verify that the vectors $u_x^{(1)}, u_x^{(2)}, u_x^{(3)}, \ldots, u_x^{(n)}$ are linearly independent. Hence, for any given $h = (h_1, h_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, there exist $\mu_i$, $i = 1, 2, \ldots, n$, such that
$$
h = \mu_1 \sqrt{2}\, u_x^{(1)} + \mu_2 \sqrt{2}\, u_x^{(2)} + \sum_{i=3}^{n} \mu_i u_x^{(i)}.
$$

From (19), we can verify that $b^{(2)} + c^{(2)}$ and $b^{(2)} - c^{(2)}$ are the eigenvalues of $A_1(x)$ with $u_x^{(2)}$ and $u_x^{(1)}$ being the corresponding eigenvectors, and $a^{(1)}$ is the eigenvalue of multiplicity $n - 2$ with $u_x^{(i)} = (0, \upsilon_2^{(i)})$ for $i = 3, \ldots, n$ being the corresponding eigenvectors. Therefore,
$$
h^T A_1(x) h = \mu_1^2 \bigl( b^{(2)} - c^{(2)} \bigr) + \mu_2^2 \bigl( b^{(2)} + c^{(2)} \bigr) + a^{(1)} \sum_{i=3}^{n} \mu_i^2 = f^{(2)}(\lambda_1)\, \mu_1^2 + f^{(2)}(\lambda_2)\, \mu_2^2 + a^{(1)} \mu^2, \qquad (30)
$$
where $\mu^2 := \sum_{i=3}^{n} \mu_i^2$.
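This eigenstructure is easy to confirm numerically. A short Python/NumPy sketch (the A1 helper simply implements (19); names are ours) for $f = \exp$ at a sample point with $x_2 \ne 0$:

```python
import numpy as np

def A1(f1, f2, x):                       # A_1(x) of (19), for x2 != 0; f1, f2 are f', f''
    x1, x2 = x[0], np.asarray(x[1:], float)
    nrm = np.linalg.norm(x2)
    lam1, lam2 = x1 - nrm, x1 + nrm
    a1 = (f1(lam2) - f1(lam1)) / (lam2 - lam1)
    b2 = (f2(lam2) + f2(lam1)) / 2.0
    c2 = (f2(lam2) - f2(lam1)) / 2.0
    xbar = x2 / nrm
    top = np.concatenate(([b2], c2 * xbar))
    bottom = np.hstack([c2 * xbar[:, None],
                        a1 * np.eye(len(x2)) + (b2 - a1) * np.outer(xbar, xbar)])
    return np.vstack([top, bottom])

x = np.array([0.3, 1.0, -2.0, 0.5])                 # n = 4, x2 != 0
lam1 = x[0] - np.linalg.norm(x[1:])
lam2 = x[0] + np.linalg.norm(x[1:])
a1 = (np.exp(lam2) - np.exp(lam1)) / (lam2 - lam1)  # a^(1) for f = exp
eigs = np.linalg.eigvalsh(A1(np.exp, np.exp, x))
print(np.allclose(np.sort(eigs),
                  np.sort([np.exp(lam1), np.exp(lam2), a1, a1])))  # True: f''(lam1), f''(lam2), a^(1) twice
```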

Similarly, we can verify that $c^{(2)} + b^{(2)} - a^{(1)}$ and $c^{(2)} - b^{(2)} + a^{(1)}$ are the eigenvalues of
$$
\begin{bmatrix} c^{(2)} & \bigl( b^{(2)} - a^{(1)} \bigr) \dfrac{x_2^T}{\|x_2\|} \\[2mm] \bigl( b^{(2)} - a^{(1)} \bigr) \dfrac{x_2}{\|x_2\|} & d I + \bigl( c^{(2)} - d \bigr) \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}
$$
with $u_x^{(2)}$ and $u_x^{(1)}$ being the corresponding eigenvectors, and $d$ is the eigenvalue of multiplicity $n - 2$ with $u_x^{(i)} = (0, \upsilon_2^{(i)})$ for $i = 3, \ldots, n$ being the corresponding eigenvectors. Notice that $C(x)$ in (22) can be decomposed as the sum of the above matrix and
$$
\begin{bmatrix} 0 & 0 \\ 0 & -2 d\, \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}.
$$
