
A CLASS OF INTERIOR PROXIMAL-LIKE ALGORITHMS FOR CONVEX SECOND-ORDER CONE PROGRAMMING

SHAOHUA PAN AND JEIN-SHAN CHEN

Abstract. We propose a class of interior proximal-like algorithms for the second-order cone program, which is to minimize a closed proper convex function subject to general second-order cone constraints. The class of methods uses a distance measure generated by a twice continuously differentiable strictly convex function on (0, +∞), and includes as a special case the entropy-like proximal algorithm [Eggermont, Linear Algebra Appl., 130 (1990), pp. 25–42], which was originally proposed for minimizing a convex function subject to nonnegativity constraints. In particular, we consider an approximate version of these methods, allowing the inexact solution of subproblems. As with the entropy-like proximal algorithm for convex programming with nonnegativity constraints, under some mild assumptions we establish the global convergence of the proposed algorithm expressed in terms of the objective values, and we show that the generated sequence is bounded and that every accumulation point is a solution of the considered problem. Preliminary numerical results are reported for two approximate entropy-like proximal algorithms, and numerical comparisons are made with the merit function approach [Chen and Tseng, Math. Program., 104 (2005), pp. 293–327], which verify the effectiveness of the proposed method.

Key words. proximal method, measure of distance, second-order cone, second-order cone-convexity

AMS subject classifications. 65K05, 90C30

DOI. 10.1137/070685683

1. Introduction. We consider the following convex second-order cone programming (CSOCP) problem:

min f(ζ)

subject to (s.t.) Aζ + b ⪰_K 0,  (1)

where f : R^m → (−∞, +∞] is a closed proper convex function; A is an n × m matrix with n ≥ m; b is a vector in R^n; x ⪰_K 0 means x ∈ K; and K is the Cartesian product of second-order cones (SOCs), also called Lorentz cones [14]. In other words,

K = K^{n1} × K^{n2} × · · · × K^{nN},  (2)

where N, n1, . . . , nN ≥ 1, n1 + n2 + · · · + nN = n, and

K^{ni} := { (x1, x2) ∈ R × R^{ni−1} | x1 ≥ ‖x2‖ },

with ‖·‖ denoting the Euclidean norm and K^1 denoting the set of nonnegative reals R+. The CSOCP, as an extension of the standard second-order cone programming, has a wide range of applications from engineering, control, and finance to robust optimization and combinatorial optimization; see [1, 21, 23] and the references therein.
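Concretely, membership in K can be checked block by block from the defining inequality x1 ≥ ‖x2‖. The helper names below (`in_soc`, `in_cartesian_soc`) are our own illustrative sketch, not anything from the paper:

```python
import math

def in_soc(x):
    """Check x = (x1, x2) in K^n, i.e., x1 >= ||x2|| (Euclidean norm of the tail)."""
    x1, x2 = x[0], x[1:]
    return x1 >= math.sqrt(sum(v * v for v in x2))

def in_cartesian_soc(x, dims):
    """Check membership in K^{n1} x ... x K^{nN} block by block."""
    i = 0
    for n in dims:
        if not in_soc(x[i:i + n]):
            return False
        i += n
    return True

# (2, 1, 1) lies in K^3 since 2 >= ||(1, 1)||, and 0.5 lies in K^1 = R_+
print(in_cartesian_soc([2.0, 1.0, 1.0, 0.5], [3, 1]))  # True
```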

Received by the editors March 19, 2007; accepted for publication (in revised form) April 25, 2008; published electronically August 13, 2008.

http://www.siam.org/journals/siopt/19-2/68568.html

School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China (shhpan@cut.edu.cn). This author's work is partially supported by the Doctoral Starting-up Foundation (B13B6050640) of GuangDong Province.

Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan (jschen@math.ntnu.edu.tw). Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. This author's work is partially supported by the National Science Council of Taiwan.



Recently, second-order cone programming (SOCP) and the SOC complementarity problem have received much attention in optimization. There exist many methods for solving the CSOCP, including the smoothing methods [10, 15], the smoothing-regularization method [17], the semismooth Newton method [22], and the merit function approach [8]. All of these methods are proposed by using some SOC complementarity function or merit function to reformulate the KKT optimality conditions of the CSOCP as a nonsmooth (or smoothing) system of equations or an unconstrained minimization problem. Notice that the CSOCP is a typical convex programming problem with extensive applications. But, to the best of our knowledge, there are few convex programming methods developed for (or extended to) the CSOCP except the interior point method [33]. Hence, it is worthwhile to explore other types of convex programming methods for the CSOCP that differ from the aforementioned ones.

One such method is the proximal point algorithm for minimizing a convex function f(ζ) over R^m, which generates a sequence {ζ^k} by the following iterative scheme:

ζ^k = argmin_{ζ∈R^m} { f(ζ) + (1/(2μ_k)) ‖ζ − ζ^{k−1}‖² },  (3)

where {μ_k} is a sequence of positive numbers. The method was originally introduced by Martinet [24] with the Moreau proximal approximation of f (see [25]), and was further developed by Rockafellar [30, 31]. Later, some researchers [5, 13, 32] proposed and studied nonquadratic proximal point algorithms by replacing the quadratic distance in (3) with a Bregman distance or an entropy-like distance.
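As a toy illustration of scheme (3) (our own example, not an algorithm from the paper): for f(ζ) = |ζ| on R, each subproblem has the closed-form soft-thresholding solution, so the iteration can be run exactly:

```python
def prox_abs(z, mu):
    """Closed-form minimizer of |w| + (1/(2*mu)) * (w - z)**2 (soft-thresholding)."""
    if z > mu:
        return z - mu
    if z < -mu:
        return z + mu
    return 0.0

zeta = 5.0
for k in range(10):
    zeta = prox_abs(zeta, 1.0)  # one proximal step of (3) with mu_k = 1
print(zeta)  # 0.0, the minimizer of |.|
```

Each step shrinks the iterate toward the minimizer by at most μ_k, reaching 0 after five steps here.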

The entropy-like proximal algorithm was designed for minimizing a convex function f(ζ) subject to nonnegativity constraints ζ ≥ 0. In [12], Eggermont first introduced the Kullback–Leibler relative entropy,¹ defined by

d(ζ, ξ) = Σ_{i=1}^m [ ζ_i ln(ζ_i/ξ_i) + ξ_i − ζ_i ]  ∀ζ ≥ 0, ξ > 0,

and established the following entropy-like proximal point algorithm:

ζ⁰ > 0,  ζ^k = argmin_{ζ>0} { f(ζ) + μ_k^{−1} d(ζ^{k−1}, ζ) }.  (4)

Later, Teboulle [32] proposed to replace the usual Kullback–Leibler relative entropy with a new type of distance-like function, called the ϕ-divergence, to define the entropy-like proximal map. Let ϕ : R → (−∞, +∞] be a closed proper convex function satisfying certain conditions (see [18, 32]). The ϕ-divergence induced by ϕ is defined as

dϕ(ζ, ξ) := Σ_{i=1}^m ξ_i ϕ(ζ_i/ξ_i).  (5)

Based on the ϕ-divergence, Iusem et al. [18, 19] generalized Eggermont's algorithm as

ζ⁰ > 0,  ζ^k = argmin_{ζ>0} { f(ζ) + μ_k^{−1} dϕ(ζ, ζ^{k−1}) },  (6)

1The convention of 0 ln 0 = 0 is used throughout this paper.


and they obtained convergence theorems under weaker assumptions. Clearly, when

ϕ(t) = −ln t + t − 1  (t > 0),

we have that dϕ(ζ, ξ) = d(ξ, ζ), and consequently the algorithm reduces to Eggermont's.
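This identity is easy to confirm numerically; the snippet below (an illustrative check of ours, not part of the paper) evaluates both sides on a fixed pair of positive vectors:

```python
import math

def d(a, b):
    """Kullback-Leibler relative entropy: sum of a_i*ln(a_i/b_i) + b_i - a_i."""
    return sum(ai * math.log(ai / bi) + bi - ai for ai, bi in zip(a, b))

def d_phi(z, x):
    """phi-divergence (5) with phi(t) = -ln t + t - 1."""
    phi = lambda t: -math.log(t) + t - 1.0
    return sum(xi * phi(zi / xi) for zi, xi in zip(z, x))

z, x = [1.0, 2.5, 0.3], [0.7, 1.2, 2.0]
assert abs(d_phi(z, x) - d(x, z)) < 1e-12  # d_phi(z, x) = d(x, z)
print("d_phi(z, x) == d(x, z)")
```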

Observing that the proximal-like algorithm (6) associated with ϕ(t) = −ln t + t − 1 inherits the features of the interior point method as well as the proximal point method, Auslender [2] extended the algorithm to general linearly constrained convex minimization problems and variational inequalities on polyhedra. Is it then possible to extend the algorithm to nonpolyhedral symmetric conic optimization problems and establish the corresponding convergence results? In this paper, we explore its extension to the setting of SOCs and establish a class of interior proximal-like algorithms for the CSOCP. We should mention that the algorithm (6) with the entropy function t ln t − t + 1 (t ≥ 0) was recently extended to convex semidefinite programming [11].

For simplicity, in the rest of this paper we focus on the case where K = K^n. All of the analysis can be carried over to the general case where K has the direct product structure in (2). It is known that K^n is a closed convex cone whose interior is given by

int(K^n) := { (x1, x2) ∈ R × R^{n−1} | x1 > ‖x2‖ }.

For any x, y in R^n, we write x ⪰_{K^n} y if x − y ∈ K^n, and write x ≻_{K^n} y if x − y ∈ int(K^n). In other words, x ⪰_{K^n} 0 if and only if x ∈ K^n, and x ≻_{K^n} 0 if and only if x ∈ int(K^n). We denote by F the constraint set of the CSOCP, i.e.,

F := { ζ ∈ R^m | Aζ + b ⪰_{K^n} 0 }.  (7)

It is not difficult to verify that F is convex, and its interior int(F) is given by

int(F) := { ζ ∈ R^m | Aζ + b ≻_{K^n} 0 }.

The proximal-like algorithm that we propose for the CSOCP is defined as follows:

ζ⁰ ∈ int(F),  ζ^k = argmin_{ζ∈int(F)} { f(ζ) + μ_k^{−1} D(Aζ + b, Aζ^{k−1} + b) },  (8)

where D : R^n × R^n → (−∞, +∞] is a closed proper convex function generated by a class of twice continuously differentiable strictly convex functions on (0, +∞); its specific expression is given in section 3. The class of distance measures, as will be shown in section 3, includes as a special case the natural extension of dϕ(x, y) with ϕ(t) = −ln t + t − 1 to the SOCs. For the proximal-like algorithm (8), we particularly consider an approximate version which allows an inexact minimization of the subproblem (8), and we establish its global convergence under some mild assumptions. Numerical results are reported for two approximate entropy-like proximal algorithms, which verify the effectiveness of the proposed proximal method. In addition, numerical comparisons with the merit function approach [8] indicate that the condition number of the Hessian matrix ∇²f(ζ) has a great influence on the numerical performance of both the proximal-like algorithm and the merit function approach; however, the former seems to have no direct relation with the density of the test problems, whereas the latter tends to require more function evaluations as the density increases.

The outline of this paper is as follows. In section 2, we review some basic concepts and properties associated with SOCs. In section 3, we state the definition of D(x, y) and present some specific examples. Some favorable properties of D(x, y) are investigated in section 4. In section 5, we describe an approximate proximal-like algorithm allowing inexact minimization in (8) and establish the global convergence of the algorithm. In section 6, we report our numerical experience with the proposed proximal-like algorithm by solving some convex SOCPs. Finally, we conclude this paper in section 7.

Throughout this paper, I represents an identity matrix of suitable dimension, and R^n denotes the space of n-dimensional real column vectors. For a differentiable function h on R, we denote by h′, h″, and h‴ its first, second, and third derivatives, respectively. Given a set S, we denote by S̄, int(S), and bd(S) the closure, the interior, and the boundary of S, respectively. Note that a function is closed if and only if it is lower semicontinuous, and a function f is proper if f(ζ) < ∞ for at least one ζ ∈ R^m and f(ζ) > −∞ for all ζ ∈ R^m. For a closed proper convex function f : R^m → (−∞, +∞], we denote its domain by domf := { ζ ∈ R^m | f(ζ) < ∞ } and the subdifferential of f at ζ by

∂f(ζ) := { w ∈ R^m | f(ξ) ≥ f(ζ) + ⟨w, ξ − ζ⟩ ∀ξ ∈ R^m }.

If f is differentiable at ζ, the notation ∇f(ζ) represents the gradient of f at ζ.

2. Preliminaries. This section recalls some basic concepts and preliminary results related to SOCs that will be used in the subsequent analysis. For any x = (x1, x2) ∈ R × R^{n−1} and y = (y1, y2) ∈ R × R^{n−1}, we define their Jordan product as

x ∘ y := ( ⟨x, y⟩, y1x2 + x1y2 ).  (9)

We write x² to mean x ∘ x and write x + y to mean the usual componentwise addition of vectors. Then ∘, +, and e = (1, 0, . . . , 0)ᵀ ∈ R^n have the following basic properties (see [14, 15]): (1) e ∘ x = x for all x ∈ R^n. (2) x ∘ y = y ∘ x for all x, y ∈ R^n. (3) x ∘ (x² ∘ y) = x² ∘ (x ∘ y) for all x, y ∈ R^n. (4) (x + y) ∘ z = x ∘ z + y ∘ z for all x, y, z ∈ R^n. The Jordan product is not associative. For example, for n = 3, let x = (1, −1, 1) and y = z = (1, 0, 1); then (x ∘ y) ∘ z = (4, −1, 4) ≠ x ∘ (y ∘ z) = (4, −2, 4).

However, it is power associative, i.e., x ∘ (x ∘ x) = (x ∘ x) ∘ x for all x ∈ R^n. Thus, we may, without fear of ambiguity, write x^m for the product of m copies of x and x^{m+n} = x^m ∘ x^n for all positive integers m and n. We stipulate that x⁰ = e. Besides, K^n is not closed under the Jordan product: for example, x = (1, 1, 0), y = (1, 0, 1) ∈ K³, but x ∘ y = (1, 1, 1) ∉ K³ since 1 < ‖(1, 1)‖.
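These properties are simple to verify numerically; the following sketch (our own code) recomputes the non-associativity example above and checks power associativity for the same x:

```python
def jordan(x, y):
    """Jordan product (9): x o y = (<x, y>, y1*x2 + x1*y2)."""
    ip = sum(a * b for a, b in zip(x, y))
    return [ip] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

x, y = [1.0, -1.0, 1.0], [1.0, 0.0, 1.0]
z = y
print(jordan(jordan(x, y), z))  # [4.0, -1.0, 4.0]
print(jordan(x, jordan(y, z)))  # [4.0, -2.0, 4.0]  -> not associative
assert jordan(x, jordan(x, x)) == jordan(jordan(x, x), x)  # power associative
```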

For each x = (x1, x2) ∈ R × R^{n−1}, the determinant and the trace of x are defined by

det(x) = x1² − ‖x2‖²,  tr(x) = 2x1.  (10)

In general, det(x ∘ y) ≠ det(x) det(y) unless x2 = αy2 for some α ∈ R. A vector x = (x1, x2) ∈ R × R^{n−1} is said to be invertible if det(x) ≠ 0. If x is invertible, then there exists a unique y = (y1, y2) ∈ R × R^{n−1} satisfying x ∘ y = y ∘ x = e. We call


this y the inverse of x and denote it by x⁻¹. In fact, we have that

x⁻¹ = (1/(x1² − ‖x2‖²)) (x1, −x2) = (1/det(x)) (tr(x)e − x).  (11)

Hence, x ∈ int(K^n) if and only if x⁻¹ ∈ int(K^n), and x⁻¹ is well-defined if x ∈ int(K^n).
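A quick numerical check of formula (11) (illustrative code of ours, not from the paper):

```python
def jordan(x, y):
    """Jordan product: x o y = (<x, y>, y1*x2 + x1*y2)."""
    ip = sum(a * b for a, b in zip(x, y))
    return [ip] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def inverse(x):
    """(11): x^{-1} = (x1, -x2) / det(x), with det(x) = x1^2 - ||x2||^2."""
    det = x[0] ** 2 - sum(v * v for v in x[1:])
    assert det != 0.0, "x must be invertible (det(x) != 0)"
    return [x[0] / det] + [-v / det for v in x[1:]]

x, e = [2.0, 1.0, 0.5], [1.0, 0.0, 0.0]
prod = jordan(x, inverse(x))
assert all(abs(p - c) < 1e-12 for p, c in zip(prod, e))
print("x o x^{-1} = e")
```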

In the following, we recall from [15] that each x = (x1, x2) ∈ R × R^{n−1} admits a spectral factorization associated with K^n of the form

x = λ1(x)·u_x^(1) + λ2(x)·u_x^(2),

where λi(x) and u_x^(i) for i = 1, 2 are the spectral values and the associated spectral vectors of x, respectively, given by

λi(x) = x1 + (−1)^i ‖x2‖,

u_x^(i) = (1/2)( 1, (−1)^i x2/‖x2‖ )  if x2 ≠ 0;
u_x^(i) = (1/2)( 1, (−1)^i x̄2 )  if x2 = 0,  (12)

with x̄2 being any vector in R^{n−1} such that ‖x̄2‖ = 1. If x2 ≠ 0, then the factorization is unique. The spectral decomposition, along with the Jordan algebra associated with the SOC, has some basic properties whose proofs can be found in [14, 15]. Here we list four of them that will often be used in the subsequent sections.
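The factorization (12) can be verified directly; `spectral` below is our illustrative helper, not code from the paper:

```python
import math

def spectral(x):
    """(12): lambda_i(x) = x1 + (-1)^i ||x2||, u_x^(i) = 0.5 * (1, (-1)^i x2/||x2||)."""
    x1, x2 = x[0], x[1:]
    norm = math.sqrt(sum(v * v for v in x2))
    # if x2 = 0, any unit vector may stand in for x2/||x2||; take (1, 0, ..., 0)
    w = [v / norm for v in x2] if norm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    lam = [x1 - norm, x1 + norm]
    u = [[0.5] + [-0.5 * v for v in w], [0.5] + [0.5 * v for v in w]]
    return lam, u

x = [1.0, 3.0, 4.0]
lam, u = spectral(x)
print(lam)  # [-4.0, 6.0]
rebuilt = [lam[0] * a + lam[1] * b for a, b in zip(u[0], u[1])]
assert all(abs(r - v) < 1e-12 for r, v in zip(rebuilt, x))  # x = l1*u1 + l2*u2
```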

Property 2.1. For any x = (x1, x2) ∈ R × R^{n−1} with the spectral values λ1(x), λ2(x) and spectral vectors u_x^(1), u_x^(2) given as in (12), the following results hold:

(a) u_x^(1) and u_x^(2) are orthogonal under the Jordan product and have length 1/√2, i.e., u_x^(1) ∘ u_x^(2) = 0, ‖u_x^(1)‖ = ‖u_x^(2)‖ = 1/√2.

(b) u_x^(1) and u_x^(2) are idempotent under the Jordan product, i.e., u_x^(i) ∘ u_x^(i) = u_x^(i) for i = 1, 2.

(c) The determinant, the trace, and the Euclidean norm of x can be expressed in terms of λ1(x), λ2(x):

det(x) = λ1(x)λ2(x),  tr(x) = λ1(x) + λ2(x),  ‖x‖² = ( [λ1(x)]² + [λ2(x)]² )/2.

(d) λ1(x), λ2(x) are nonnegative (positive) if and only if x ∈ K^n (x ∈ int(K^n)).

Lemma 2.1.

(a) For any x ∈ R^n, x ⪰_{K^n} 0 ⟺ ⟨x, y⟩ ≥ 0 for any y ⪰_{K^n} 0.

(b) For any x ∈ R^n, x ≻_{K^n} 0 ⟺ ⟨x, y⟩ > 0 for any y ⪰_{K^n} 0 with y ≠ 0.

(c) For any x, y ∈ R^n, let λi(x) and λi(y) for i = 1, 2 be their spectral values. Then

λ1(x)λ2(y) + λ2(x)λ1(y) ≤ tr(x ∘ y) ≤ λ1(x)λ1(y) + λ2(x)λ2(y).

Proof. Part (a) follows directly from the self-duality of K^n; we next consider parts (b) and (c).

(b) Let x = (x1, x2), y = (y1, y2) ∈ R × R^{n−1}. The necessity follows from

⟨x, y⟩ = x1y1 + x2ᵀy2 ≥ x1y1 − ‖x2‖‖y2‖ ≥ x1y1 − y1‖x2‖ = y1(x1 − ‖x2‖) > 0,

where the first inequality is by Cauchy–Schwarz, the second is due to y ⪰_{K^n} 0, and the final strict inequality holds since x ≻_{K^n} 0 and y ⪰_{K^n} 0 with y ≠ 0 (which forces y1 > 0). Next, we prove the sufficiency. First, from ⟨x, y⟩ > 0 for any y ⪰_{K^n} 0 with y ≠ 0, we deduce that x1 > 0 by setting y = e. If x2 = 0, then the conclusion follows. If x2 ≠ 0, then we set y = (1, −x2/‖x2‖). Clearly, y ⪰_{K^n} 0, y ≠ 0, and 0 < ⟨x, y⟩ = x1 − ‖x2‖ = λ1(x). By Property 2.1(d), we then have x ≻_{K^n} 0.

(c) For any x = (x1, x2), y = (y1, y2) ∈ R × R^{n−1}, by (12) we can compute that

λ1(x)λ2(y) + λ2(x)λ1(y) = 2x1y1 − 2‖x2‖‖y2‖ ≤ 2(x1y1 + x2ᵀy2) = tr(x ∘ y),
λ1(x)λ1(y) + λ2(x)λ2(y) = 2x1y1 + 2‖x2‖‖y2‖ ≥ 2(x1y1 + x2ᵀy2) = tr(x ∘ y).

Combining the two inequalities above yields the desired result.

For any h : R → R, the following vector-valued function was considered in [6, 15]:

hsoc(x) = h(λ1(x))·u_x^(1) + h(λ2(x))·u_x^(2)  ∀x = (x1, x2) ∈ R × R^{n−1}.  (13)

If h is defined only on a subset of R, then hsoc is defined on the corresponding subset of R^n. The definition in (13) is unambiguous whether x2 ≠ 0 or x2 = 0. For the vector-valued function hsoc induced by h, we have the following results.
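In code, (13) amounts to applying h to the two spectral values and recombining with the spectral vectors; the sketch below (ours, with h = exp as an arbitrary choice) also checks the trace identity tr[hsoc(x)] = h(λ1(x)) + h(λ2(x)):

```python
import math

def h_soc(h, x):
    """(13): h^soc(x) = h(lambda_1(x)) u_x^(1) + h(lambda_2(x)) u_x^(2)."""
    x1, x2 = x[0], x[1:]
    norm = math.sqrt(sum(v * v for v in x2))
    w = [v / norm for v in x2] if norm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    h1, h2 = h(x1 - norm), h(x1 + norm)  # h at the two spectral values
    return [0.5 * (h1 + h2)] + [0.5 * (h2 - h1) * v for v in w]

x = [2.0, 1.0, 0.0]            # spectral values 1 and 3
y = h_soc(math.exp, x)
# tr[h^soc(x)] = 2*y[0] equals exp(1) + exp(3)
assert abs(2.0 * y[0] - (math.exp(1.0) + math.exp(3.0))) < 1e-9
assert h_soc(lambda t: t, x) == x  # the identity map reproduces x
```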

Lemma 2.2. Given a function h : IR → R, let hsoc : S → R^n be the vector-valued function induced by h as in (13), where IR ⊆ R and S ⊆ R^n. Then, the following results hold:

(a) For any x ∈ S, λi[hsoc(x)] = h(λi(x)) for i = 1, 2, and tr[hsoc(x)] = Σ_{i=1}^2 h(λi(x)).

(b) If h is continuously differentiable on IR, then hsoc is continuously differentiable on the set S, and its transposed Jacobian at x = (x1, x2) ∈ S is given by the formula

∇hsoc(x) = h′(x1) I  (14)

if x2 = 0, and otherwise

∇hsoc(x) = [ b , c x2ᵀ/‖x2‖ ; c x2/‖x2‖ , aI + (b − a) x2x2ᵀ/‖x2‖² ],  (15)

where

a = ( h(λ2(x)) − h(λ1(x)) ) / ( λ2(x) − λ1(x) ),  b = ( h′(λ2(x)) + h′(λ1(x)) )/2,  c = ( h′(λ2(x)) − h′(λ1(x)) )/2.

(c) If h is continuously differentiable on IR, then tr[hsoc(x)] is continuously differentiable on the set S, and its gradient ∇tr[hsoc(x)] = 2∇hsoc(x)·e = 2(h′)soc(x).

(d) If h is (strictly) convex on IR, then tr[hsoc(x)] is (strictly) convex on the set S.

Proof. (a) The proof is direct by the definition of hsoc and the spectral values.

(b) The conclusion follows directly from [15, Proposition 5.2] or [6, Proposition 4].

(c) Since tr[hsoc(x)] = 2⟨hsoc(x), e⟩, by part (b) tr[hsoc(x)] is obviously continuously differentiable. Applying the chain rule for the inner product of two functions yields

∇tr[hsoc(x)] = 2∇hsoc(x)·e,

where ∇hsoc(x) is given by (14)–(15). By a simple computation, it is easy to verify that

∇hsoc(x)·e = h′(λ1(x)) u_x^(1) + h′(λ2(x)) u_x^(2) = (h′)soc(x).

Combining the last two equalities immediately gives the second part of the conclusion.

(d) The proof is similar to that of [26, Lemma 3.2(d)], and so we omit it.

To close this section, we review the definitions of SOC-convexity and SOC-monotonicity. These two concepts, like matrix-convexity and matrix-monotonicity in semidefinite programming, play an important role in the solution methods for SOCPs.

Definition 2.1 (see [7]). Given a function h : IR → R, let hsoc : S → R^n be the vector-valued function defined as in (13), where IR ⊆ R and S ⊆ R^n. Then,

(a) h is said to be SOC-monotone of order n on IR if for any x, y ∈ S,

x ⪰_{K^n} y ⟹ hsoc(x) ⪰_{K^n} hsoc(y).

(b) h is said to be SOC-convex of order n on IR if for any x, y ∈ S and 0 ≤ β ≤ 1,

hsoc(βx + (1 − β)y) ⪯_{K^n} βhsoc(x) + (1 − β)hsoc(y).  (16)

We say that h is SOC-convex (respectively, SOC-monotone) on IR if h is SOC-convex of all orders n (respectively, SOC-monotone of all orders n) on IR. A function h is said to be SOC-concave on IR whenever −h is SOC-convex on IR. When h is continuous on IR, the condition in (16) can be replaced by the special midpoint condition

hsoc((x + y)/2) ⪯_{K^n} (1/2)( hsoc(x) + hsoc(y) ).  (17)

Obviously, the set of SOC-monotone functions and the set of SOC-convex functions are both closed under positive linear combinations and under pointwise limits.

3. Distance-like functions in SOCs. In this section, we present the definition of the distance-like function D(x, y) involved in the proximal-like algorithm (8) and some specific examples. Let φ : R → (−∞, +∞] be a closed proper convex function with domφ = [0, +∞), and assume that

(C.1) φ is strictly convex on its domain.

(C.2) φ is twice continuously differentiable on int(domφ), with lim_{t→0⁺} φ″(t) = +∞.

(C.3) φ′(t)t − φ(t) is convex on int(domφ).

(C.4) φ′ is SOC-concave on int(domφ).

In what follows, we denote by Φ the class of functions satisfying Conditions C.1–C.4.

Given φ ∈ Φ, let φsoc and (φ′)soc be the vector-valued functions given as in (13). We define D(x, y) involved in the proximal-like algorithm (8) by

D(x, y) := tr( φsoc(y) − φsoc(x) − (φ′)soc(x) ∘ (y − x) )  ∀x ∈ int(K^n), y ∈ K^n;
D(x, y) := +∞  otherwise.  (18)

This function, as will be shown in the next section, possesses some favorable properties. In particular, D(x, y) ≥ 0 for any x, y ∈ int(K^n), and D(x, y) = 0 if and only if x = y. Hence, D(x, y) can be used to measure the distance between two points in int(K^n).


In the following, we concentrate on examples of the distance-like function D(x, y). For this purpose, we first give another characterization of Condition C.3.

Lemma 3.1. Let φ : R → (−∞, +∞] be a closed proper function with domφ = [0, +∞). If φ is thrice continuously differentiable on int(domφ), then φ satisfies Condition C.3 if and only if its derivative φ′ is exponentially convex,² i.e.,

φ′(t1t2) ≤ (1/2)( φ′(t1²) + φ′(t2²) )  ∀t1, t2 > 0.  (19)

Proof. Since φ is thrice continuously differentiable on int(domφ), φ satisfies Condition C.3 if and only if

φ″(t) + tφ‴(t) ≥ 0  (∀t > 0).

Observe that this inequality is also equivalent to

tφ″(t) + t²φ‴(t) ≥ 0  (∀t > 0),

and hence substituting t = exp(θ) for θ ∈ R into the inequality yields

exp(θ)φ″(exp(θ)) + exp(2θ)φ‴(exp(θ)) ≥ 0  ∀θ ∈ R.

Since the left-hand side of this inequality is exactly [φ′(exp(θ))]″, this means that φ′(exp(·)) is convex on R. Consequently, the first part of the conclusion follows.

Note that the convexity of φ′(exp(·)) on R is equivalent to saying that, for any θ1, θ2 ∈ R,

φ′(exp(rθ1 + (1 − r)θ2)) ≤ rφ′(exp(θ1)) + (1 − r)φ′(exp(θ2)),  r ∈ [0, 1],

which, by letting t1 = exp(θ1) and t2 = exp(θ2), can be rewritten as

φ′(t1^r t2^{1−r}) ≤ rφ′(t1) + (1 − r)φ′(t2)  ∀t1, t2 > 0 and r ∈ [0, 1].

This is clearly equivalent to the statement in (19) due to the continuity of φ′.

Remark 3.1. Exponential convexity was also used in the definition of the self-regular function [27], in which the authors denote by Ω the set of functions that are twice continuously differentiable and exponentially convex on (0, +∞). By Lemma 3.1, clearly, if h ∈ Ω, then the function ∫₀ᵗ h(θ)dθ necessarily satisfies Condition C.3. For example, ln t belongs to Ω, and so ∫₀ᵗ ln θ dθ = t ln t − t satisfies Condition C.3.
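Inequality (19) can be spot-checked numerically. Below (an illustrative check of ours) we use φ′(t) = 2t, i.e. φ(t) = t², where the slack in (19) reduces to the AM–GM gap (t1 − t2)², and φ′(t) = ln t, where (19) holds with equality:

```python
import math

def econvex_gap(dphi, t1, t2):
    """Slack in (19): 0.5*(dphi(t1^2) + dphi(t2^2)) - dphi(t1*t2), nonnegative
    when dphi is exponentially convex."""
    return 0.5 * (dphi(t1 * t1) + dphi(t2 * t2)) - dphi(t1 * t2)

for t1, t2 in [(0.3, 1.7), (1.5, 2.5), (4.0, 0.2)]:
    assert econvex_gap(lambda t: 2.0 * t, t1, t2) >= 0.0   # = (t1 - t2)^2
    assert abs(econvex_gap(math.log, t1, t2)) < 1e-12      # equality for ln
print("(19) verified on sample points")
```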

For characterizations of SOC-concavity, interested readers may refer to [7, 9]. Here, we present a lemma which states that the composition of two SOC-concave functions is SOC-concave under some conditions. By this lemma, we may conveniently obtain new SOC-concave functions from existing ones.

Lemma 3.2. Let g : JR → R and h : IR → JR, where JR ⊆ R and IR ⊆ R. If g is SOC-concave and SOC-monotone on JR and h is SOC-concave on IR, then their composition g(h(·)) is also SOC-concave on IR. If, in addition, h is SOC-monotone on IR, then g(h(·)) is also SOC-monotone on IR.

Proof. For ease of notation, let gsoc : S′ → R^n and hsoc : S → S′ be the vector-valued functions associated with g and h, respectively, where S ⊆ R^n and S′ ⊆ R^n.

²That is, the function φ′(exp(·)) : R → R is convex on R.


Define ĝ(t) := g(h(t)). Then, for any x ∈ S, it follows from (13) that

gsoc(hsoc(x)) = gsoc( h(λ1(x))u_x^(1) + h(λ2(x))u_x^(2) ) = g(h(λ1(x)))u_x^(1) + g(h(λ2(x)))u_x^(2) = ĝsoc(x).  (20)

We next prove that ĝ(t) is SOC-concave on IR. For any x, y ∈ S and 0 ≤ β ≤ 1, from the SOC-concavity of h(t) it follows that

hsoc(βx + (1 − β)y) ⪰_{K^n} βhsoc(x) + (1 − β)hsoc(y).

Using the SOC-monotonicity and SOC-concavity of g, we then obtain that

gsoc( hsoc(βx + (1 − β)y) ) ⪰_{K^n} gsoc( βhsoc(x) + (1 − β)hsoc(y) ) ⪰_{K^n} βgsoc(hsoc(x)) + (1 − β)gsoc(hsoc(y)).

This together with (20) implies that, for any x, y ∈ S and 0 ≤ β ≤ 1,

ĝsoc(βx + (1 − β)y) ⪰_{K^n} βĝsoc(x) + (1 − β)ĝsoc(y).

Consequently, the function ĝ(t), i.e., g(h(·)), is SOC-concave on IR. The second part of the conclusion is obvious.

Proposition 3.1. (a) The function h(t) = t^r, with 0 ≤ r ≤ 1, is both SOC-concave and SOC-monotone on [0, +∞).

(b) h(t) = −t^{−r}, with 0 ≤ r ≤ 1, is SOC-concave and SOC-monotone on (0, +∞).

(c) For all u ≤ 0, h(t) = 1/(u − t) is SOC-concave as well as SOC-monotone on (0, +∞).

(d) The function ln t is SOC-concave and SOC-monotone on (0, +∞).

Proof. (a) The proof is given in [7, Proposition 3.7], and we omit it here.

(b) The conclusion follows directly from [9, Corollary 4.2].

(c) Let g(t) = −1/t and ĥ(t) = t − u. Then, h(t) = 1/(u − t) is exactly the composition of the two functions, i.e., h(t) = g(ĥ(t)). From part (b), g(t) is SOC-monotone and SOC-concave on (0, +∞), whereas by [7, Proposition 3.1(b)] ĥ(t) is SOC-monotone and SOC-concave on (0, +∞). Thus, applying Lemma 3.2, we readily obtain the conclusion.

(d) The proof can be found in [9]. In view of the importance of ln t, we present here a different proof, following the same lines as [3]. Noting that

ln t = ∫_{−∞}^0 ( 1/(u − t) − u/(u² + 1) ) du  (t > 0),

we have for any x ∈ int(K^n) that

ln x = ∫_{−∞}^0 ( (ue − x)⁻¹ − (u/(u² + 1)) e ) du.  (21)

For any x = (x1, x2), y = (y1, y2) ∈ int(K^n) and any 0 ≤ β ≤ 1, let

w = ln(βx + (1 − β)y) − β ln x − (1 − β) ln y.

Then, by the definition of SOC-concavity, proving the SOC-concavity of ln t on (0, +∞) is equivalent to showing that w ∈ K^n. From (21) and (11), it follows that

w = ∫_{−∞}^0 [ (ue − βx − (1 − β)y)⁻¹ − β(ue − x)⁻¹ − (1 − β)(ue − y)⁻¹ ] du := (w1, w2),

where

w1 = ∫_{−∞}^0 [ (u − βx1 − (1 − β)y1)/det(ue − βx − (1 − β)y) − β(u − x1)/det(ue − x) − (1 − β)(u − y1)/det(ue − y) ] du ∈ R,

w2 = ∫_{−∞}^0 [ (βx2 + (1 − β)y2)/det(ue − βx − (1 − β)y) − βx2/det(ue − x) − (1 − β)y2/det(ue − y) ] du ∈ R^{n−1}.

However, by Proposition 3.1(c) and Definition 2.1, for every u < 0,

(ue − βx − (1 − β)y)⁻¹ − β(ue − x)⁻¹ − (1 − β)(ue − y)⁻¹ ∈ K^n,

which implies that

(u − βx1 − (1 − β)y1)/det(ue − βx − (1 − β)y) − β(u − x1)/det(ue − x) − (1 − β)(u − y1)/det(ue − y) ≥ 0

and

‖ (βx2 + (1 − β)y2)/det(ue − βx − (1 − β)y) − βx2/det(ue − x) − (1 − β)y2/det(ue − y) ‖
≤ (u − βx1 − (1 − β)y1)/det(ue − βx − (1 − β)y) − β(u − x1)/det(ue − x) − (1 − β)(u − y1)/det(ue − y).

As a consequence, w1 ≥ 0 and

‖w2‖ ≤ ∫_{−∞}^0 ‖ (βx2 + (1 − β)y2)/det(ue − βx − (1 − β)y) − βx2/det(ue − x) − (1 − β)y2/det(ue − y) ‖ du
≤ ∫_{−∞}^0 [ (u − βx1 − (1 − β)y1)/det(ue − βx − (1 − β)y) − β(u − x1)/det(ue − x) − (1 − β)(u − y1)/det(ue − y) ] du = w1.

This shows that w ∈ K^n, and consequently ln t is SOC-concave on (0, +∞). By a similar argument, we can prove that ln t is SOC-monotone on (0, +∞).

From Lemma 3.2 and Proposition 3.1, we obtain the following corollary, which in particular shows that the modified logarithmic barrier function is SOC-concave.

Corollary 3.1. (a) The modified logarithmic barrier function ln(α + t), for α > 0, is both SOC-concave and SOC-monotone on (−α, +∞).

(b) For any α > 0 and β > 0, the functions ln(α + βt^r), with 0 ≤ r ≤ 1, are SOC-concave and SOC-monotone on [0, +∞).

(c) For any u > 0, the functions t/(u + t) are SOC-concave and SOC-monotone on (0, +∞).

(d) For all u > 0, the functions −1/√(u + t) are SOC-concave and SOC-monotone on (−u, +∞).

Proof. (a) The result follows from Proposition 3.1(d), [7, Proposition 3.1], and Lemma 3.2 by letting g : (0, +∞) → R be g(t) = ln t and h : (−α, +∞) → (0, +∞) be h(t) = α + t.

(b) Let g : (0, +∞) → R be g(t) = ln t and h : [0, +∞) → (0, +∞) be h(t) = α + βt^r. The result follows from Proposition 3.1(a), Proposition 3.1(d), and Lemma 3.2.

(c) Let g : (−1, 0) → (0, 1) be g(t) = 1 + t and h : (0, +∞) → (−1, 0) be h(t) = −u/(u + t). Then, we obtain the result from Proposition 3.1(c), [7, Proposition 3.1], and Lemma 3.2. This result also extends the conclusion of [7, Proposition 3.4].

(d) Let g : (0, +∞) → (0, +∞) be g(t) = √t and h : (−u, +∞) → (0, +∞) be h(t) = u + t. Then, from Lemma 3.2 it follows that g(h(t)) = √(u + t) is SOC-concave and SOC-monotone on (−u, +∞). Using Lemma 3.2 again with g(t) = −1/t and h(t) = √(u + t), we obtain the desired result.

Now we present several examples of D(x, y) to close this section. From these examples, we may see that the conditions required by φ ∈ Φ are not overly strict, and that distance-like functions in SOCs can be constructed by selecting from a class of univariate convex functions.

Example 3.1. Let φ(t) = t ln t − t + 1 if t ≥ 0, and φ(t) = +∞ if t < 0. It is easy to verify that φ satisfies Conditions C.1–C.3. Also, by Proposition 3.1(d), Condition C.4 holds. From formula (13), it follows that, for any y ∈ K^n and x ∈ int(K^n),

φsoc(y) = y ∘ ln y − y + e  and  (φ′)soc(x) = ln x.

Consequently, the distance-like function induced by φ is given by

D1(x, y) = tr( y ∘ ln y − y ∘ ln x + x − y )  ∀x ∈ int(K^n), y ∈ K^n.

This function is precisely the natural extension of the entropy-like distance dϕ(·, ·), with ϕ(t) = −ln t + t − 1, to the SOCs. In addition, comparing D1(x, y) with the distance-like function H(x, y) in Example 3.1 of [26], we note that D1(x, y) = H(y, x), but the proximal-like algorithms corresponding to them are completely different.
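D1 is straightforward to evaluate through the spectral machinery of section 2. The sketch below (our own code, not from the paper) computes ln x as (ln)soc(x), evaluates D1 via the trace formula, and confirms D1(x, x) = 0 and D1(x, y) > 0 for a sample pair:

```python
import math

def jordan(x, y):
    """Jordan product: x o y = (<x, y>, y1*x2 + x1*y2)."""
    ip = sum(a * b for a, b in zip(x, y))
    return [ip] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def ln_soc(x):
    """ln x via the spectral decomposition (13) with h = ln; needs x in int(K^n)."""
    x1, x2 = x[0], x[1:]
    norm = math.sqrt(sum(v * v for v in x2))
    w = [v / norm for v in x2] if norm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    l1, l2 = math.log(x1 - norm), math.log(x1 + norm)
    return [0.5 * (l1 + l2)] + [0.5 * (l2 - l1) * v for v in w]

def D1(x, y):
    """D1(x, y) = tr(y o ln y - y o ln x + x - y), with tr(z) = 2*z[0]."""
    z = [p - q + r - s for p, q, r, s in
         zip(jordan(y, ln_soc(y)), jordan(y, ln_soc(x)), x, y)]
    return 2.0 * z[0]

x, y = [2.0, 0.5, 0.0], [3.0, 0.0, 1.0]  # both in int(K^3)
assert abs(D1(x, x)) < 1e-12
assert D1(x, y) > 0.0
print("D1(x, x) = 0 and D1(x, y) > 0")
```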

Example 3.2. Let φ(t) = t ln t + (1 + t) ln(1 + t) − (1 + t) ln 2 if t ≥ 0, and φ(t) = +∞ if t < 0. By computation, we can show that φ satisfies Conditions C.1–C.3. Furthermore, from Proposition 3.1(d) and Corollary 3.1(a), we learn that φ also satisfies Condition C.4. This means that φ ∈ Φ. For any y ∈ K^n and x ∈ int(K^n), we can compute that

φsoc(y) = y ∘ ln y + (e + y) ∘ ln(e + y) − (ln 2)(e + y),
(φ′)soc(x) = (2 − ln 2)e + ln x + ln(e + x).

Therefore, the distance-like function generated by such a φ is given by

D2(x, y) = tr( y ∘ (ln y − ln x) + (e + y) ∘ ln(e + y) − (e + y) ∘ ln(e + x) − 2(y − x) )

for any x ∈ int(K^n) and y ∈ K^n. It should be pointed out that D2(x, y) is not the extension of dϕ(·, ·), with ϕ(t) = φ(t) given by [18], to the SOCs.

Example 3.3. Take φ(t) = t^{(2r+3)/2} + t², with 0 ≤ r < 1/2, if t ≥ 0, and φ(t) = +∞ if t < 0. It is easy to verify that φ satisfies Conditions C.1–C.3. Furthermore, from Proposition 3.1(a) it follows that φ satisfies Condition C.4. Thus, φ ∈ Φ. By a simple computation,

φsoc(y) = y^{(2r+3)/2} + y²  ∀y ∈ K^n  and  (φ′)soc(x) = ((2r+3)/2) x^{(2r+1)/2} + 2x  ∀x ∈ int(K^n).

Hence, the distance-like function induced by φ has the following expression:

D3(x, y) = tr( ((2r+1)/2) x^{(2r+3)/2} + x² − y ∘ ( ((2r+3)/2) x^{(2r+1)/2} + 2x ) + y^{(2r+3)/2} + y² ).

Example 3.4. Let φ(t) = t^{a+1} + at ln t − at, with 0 < a ≤ 1, if t ≥ 0, and φ(t) = +∞ if t < 0. It is easily shown that φ satisfies Conditions C.1–C.3. By Proposition 3.1(a) and Proposition 3.1(d), φ′ is SOC-concave on (0, +∞). Hence, φ ∈ Φ. For any y ∈ K^n and x ∈ int(K^n),

φsoc(y) = y^{a+1} + ay ∘ ln y − ay  and  (φ′)soc(x) = (a + 1)x^a + a ln x.

Consequently, the distance-like function induced by φ has the following expression:

D4(x, y) = tr( ax^{a+1} + ax − y ∘ ( (a + 1)x^a + a ln x ) + y^{a+1} + ay ∘ ln y − ay ).

4. Properties of distance-like functions. In what follows, we study some favorable properties of the function D(x, y). We begin with two technical lemmas that will be used in the subsequent analysis. The first of them is a direct consequence of Lemma 2.2 and the definition of Φ.

Lemma 4.1. Given φ ∈ Φ, let φsoc and (φ′)soc be the vector-valued functions given as in (13). Then, we have the following results:

(a) φsoc(x) and (φ′)soc(x) are well-defined on K^n and int(K^n), respectively, and

λi[φsoc(x)] = φ(λi(x)),  λi[(φ′)soc(x)] = φ′(λi(x)),  i = 1, 2.

(b) φsoc(x) and (φ′)soc(x) are continuously differentiable on int(K^n), with the transposed Jacobian at x given as in formulas (14)–(15).

(c) tr[φsoc(x)] and tr[(φ′)soc(x)] are continuously differentiable on int(K^n), and

∇tr[φsoc(x)] = 2∇φsoc(x)·e = 2(φ′)soc(x),
∇tr[(φ′)soc(x)] = 2∇(φ′)soc(x)·e = 2(φ″)soc(x).  (22)

(d) The function tr[φsoc(x)] is strictly convex on int(K^n).

Lemma 4.2. Given φ ∈ Φ and a fixed point z ∈ R^n, let φ_z : int(K^n) → R be given by

φ_z(x) := tr( −z ∘ (φ′)soc(x) ).  (23)

Then, the function φ_z(x) possesses the following properties:

(a) φ_z(x) is continuously differentiable on int(K^n), with ∇φ_z(x) = −2∇(φ′)soc(x)·z.

(b) φ_z(x) is convex over int(K^n) when z ∈ K^n; furthermore, it is strictly convex over int(K^n) when z ∈ int(K^n).


Proof. (a) Since φ_z(x) = −2⟨(φ′)soc(x), z⟩ for any x ∈ int(K^n), we have that φ_z(x) is continuously differentiable on int(K^n) by Lemma 4.1(c). Moreover, applying the chain rule for the inner product of two functions readily yields ∇φ_z(x) = −2∇(φ′)soc(x)·z.

(b) By the continuous differentiability of φ_z(x), to prove the convexity of φ_z on int(K^n), it suffices to prove the following inequality:

φ_z((x + y)/2) ≤ (1/2)( φ_z(x) + φ_z(y) )  ∀x, y ∈ int(K^n).  (24)

By Condition C.4, φ′ is SOC-concave on (0, +∞). Therefore, we have that

−(φ′)soc((x + y)/2) ⪯_{K^n} −(1/2)( (φ′)soc(x) + (φ′)soc(y) ),

i.e.,

(φ′)soc((x + y)/2) − (1/2)(φ′)soc(x) − (1/2)(φ′)soc(y) ⪰_{K^n} 0.

Using Lemma 2.1(a) and the fact that z ∈ K^n, we then obtain that

⟨ z, (φ′)soc((x + y)/2) − (1/2)(φ′)soc(x) − (1/2)(φ′)soc(y) ⟩ ≥ 0,  (25)

which in turn implies that

⟨ −z, (φ′)soc((x + y)/2) ⟩ ≤ (1/2)⟨ −z, (φ′)soc(x) ⟩ + (1/2)⟨ −z, (φ′)soc(y) ⟩.

The last inequality is exactly the one in (24). Hence, φ_z is convex on int(K^n) for z ∈ K^n.

To prove the second part of the conclusion, we need only prove that the inequality in (25) holds strictly for any x, y ∈ int(K^n) with x ≠ y. By Lemma 2.1(b), this is equivalent to proving that the vector (φ′)soc((x+y)/2) − (1/2)(φ′)soc(x) − (1/2)(φ′)soc(y) is nonzero, since

(φ′)soc((x + y)/2) − (1/2)(φ′)soc(x) − (1/2)(φ′)soc(y) ∈ K^n  and  z ∈ int(K^n).

From Condition C.4, it follows that φ′ is concave on (0, +∞), since SOC-concavity implies concavity. This, together with the strict monotonicity of φ′, implies that φ′ is strictly concave on (0, +∞). Using Lemma 2.2(d), we then have that tr[(φ′)soc(x)] is strictly concave on int(K^n). This means that, for any x, y ∈ int(K^n) with x ≠ y,

tr[(φ′)soc((x + y)/2)] − (1/2)tr[(φ′)soc(x)] − (1/2)tr[(φ′)soc(y)] > 0.  (26)

In addition, we note that the first component of (φ′)soc((x+y)/2) − (1/2)(φ′)soc(x) − (1/2)(φ′)soc(y) is

( φ′(λ1((x+y)/2)) + φ′(λ2((x+y)/2)) )/2 − ( φ′(λ1(x)) + φ′(λ2(x)) )/4 − ( φ′(λ1(y)) + φ′(λ2(y)) )/4,

which, by Property 2.1(c), can be rewritten as

(1/2)tr[(φ′)soc((x + y)/2)] − (1/4)tr[(φ′)soc(x)] − (1/4)tr[(φ′)soc(y)].

This together with (26) shows that (φ′)soc((x+y)/2) − (1/2)(φ′)soc(x) − (1/2)(φ′)soc(y) is nonzero for any x, y ∈ int(K^n) with x ≠ y. Consequently, φ_z is strictly convex on int(K^n).

Now we are in a position to study the properties of the distance-like function D(x, y).

Proposition 4.1. Given φ ∈ Φ, let D(x, y) be defined as in (18). Then,

(a) D(x, y) ≥ 0 for any x ∈ int(K^n) and y ∈ K^n, and D(x, y) = 0 if and only if x = y;

(b) for any fixed y ∈ K^n, D(·, y) is continuously differentiable on int(K^n), with

∇_x D(x, y) = 2∇(φ′)soc(x)·(x − y);  (27)

(c) for any fixed y ∈ K^n, the function D(·, y) is convex over int(K^n), and for any fixed y ∈ int(K^n), D(·, y) is strictly convex over int(K^n);

(d) for any fixed y ∈ int(K^n), the function D(·, y) is essentially smooth;

(e) for any fixed y ∈ K^n, the level sets L_D(y, γ) := { x ∈ int(K^n) : D(x, y) ≤ γ } are bounded for all γ ≥ 0.

Proof. (a) By Lemma 4.1(c), for any x ∈ int(K^n) and y ∈ K^n, we can rewrite D(x, y) as

D(x, y) = tr[φsoc(y)] − tr[φsoc(x)] − ⟨ ∇tr[φsoc(x)], y − x ⟩.

Notice that tr[φsoc(x)] is strictly convex on int(K^n) by Lemma 4.1(d); hence D(x, y) ≥ 0 for any x ∈ int(K^n) and y ∈ K^n, and D(x, y) = 0 if and only if x = y.

(b) By Lemma 4.1(b) and Lemma 4.1(c), the functions tr[φsoc(x)] and ⟨(φ′)soc(x), x⟩ are continuously differentiable on int(K^n). Noting that, for any x ∈ int(K^n) and y ∈ K^n,

D(x, y) = tr[φsoc(y)] − tr[φsoc(x)] − 2⟨ (φ′)soc(x), y − x ⟩,

we then have the continuous differentiability of D(·, y) on int(K^n). Furthermore,

∇_x D(x, y) = −∇tr[φsoc(x)] − 2∇(φ′)soc(x)·(y − x) + 2(φ′)soc(x)
            = −2(φ′)soc(x) + 2∇(φ′)soc(x)·(x − y) + 2(φ′)soc(x)
            = 2∇(φ′)soc(x)·(x − y).

(c) By the definition of φ_y given as in (23), D(x, y) can be rewritten as

D(x, y) = tr[ (φ′)soc(x) ∘ x − φsoc(x) ] + φ_y(x) + tr[φsoc(y)].

Thus, to prove the (strict) convexity of D(·, y) on int(K^n), it suffices to show that

tr[ (φ′)soc(x) ∘ x − φsoc(x) ] + φ_y(x)

is (strictly) convex on int(K^n). Let ψ : (0, +∞) → R be the function defined by

ψ(t) := φ′(t)t − φ(t).  (28)
