DOI 10.1007/s10589-008-9227-0

**An entropy-like proximal algorithm and the exponential multiplier method for convex symmetric cone programming**

**Jein-Shan Chen · Shaohua Pan**

Received: 28 April 2006 / Revised: 6 December 2008 / Published online: 18 December 2008

© Springer Science+Business Media, LLC 2008

**Abstract We introduce an entropy-like proximal algorithm for the problem of min-**
imizing a closed proper convex function subject to symmetric cone constraints. The
algorithm is based on a distance-like function that is an extension of the Kullback-Leibler relative entropy to the setting of symmetric cones. Like the proximal algo-
rithms for convex programming with nonnegative orthant cone constraints, we show
that, under some mild assumptions, the sequence generated by the proposed algo-
rithm is bounded and every accumulation point is a solution of the considered prob-
lem. In addition, we also present a dual application of the proposed algorithm to the
symmetric cone linear program, leading to a multiplier method which is shown to
possess properties similar to those of the exponential multiplier method (Tseng and
Bertsekas in Math. Program. 60:1–19, 1993).

**Keywords** Symmetric cone optimization · Proximal-like method · Entropy-like distance · Exponential multiplier method

**1 Introduction**

Symmetric cone programming provides a unified framework for linear programming, second-order cone programming and semidefinite programming, which arise from a

J.-S. Chen is a member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan.

J.-S. Chen (✉)
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
e-mail: jschen@math.ntnu.edu.tw

S. Pan

School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China
e-mail: shhpan@scut.edu.cn

wide range of applications in engineering, economics, management science, optimal control, combinatorial optimization, and other fields; see [1,16,28] and references therein. Recently, symmetric cone programming, especially symmetric cone linear programming (SCLP), has attracted the attention of some researchers with a focus on the development of interior point methods similar to those for linear programming;

see [9,10,23]. Although interior point methods were successfully applied for SCLPs, it is worthwhile to explore other solution methods for general convex symmetric cone optimization problems.

Let A = (V, ◦, ⟨·,·⟩) be a Euclidean Jordan algebra, where (V, ⟨·,·⟩) is a finite dimensional inner product space over the real field R and "◦" denotes the Jordan product which will be defined in the next section. Let K be the symmetric cone in V. In this paper, we consider the following convex symmetric cone programming (CSCP):

$$\min\ f(x) \qquad \text{s.t.}\ x \succeq_{\mathcal{K}} 0, \tag{1}$$

where f : V → (−∞, ∞] is a closed proper convex function, and x ⪰_K 0 means x ∈ K. In general, for any x, y ∈ V, we write x ⪰_K y if x − y ∈ K, and write x ≻_K y if x − y ∈ int(K). A function is closed if and only if it is lower semi-continuous, and a function f is proper if f(x) < ∞ for at least one x ∈ V and f(x) > −∞ for all x ∈ V.

Notice that the CSCP is a special class of convex programs, and hence in princi-
ple it can be solved via general convex programming methods. One such method is
the proximal point algorithm for minimizing a convex function f(x) on R^n, which generates a sequence {x^k}_{k∈N} via the following iterative scheme:

$$x^k = \operatorname*{argmin}_{x \in \mathbb{R}^n} \left\{ f(x) + \frac{1}{\mu_k} \|x - x^{k-1}\|_2^2 \right\}, \tag{2}$$

where μ_k > 0 and ‖·‖_2 denotes the Euclidean norm in R^n. This method was first
introduced by Martinet [17], based on the Moreau proximal approximation of f (see
[18]), and further developed and studied by Rockafellar [20,21]. Later, several gen-
eralizations of the proximal point algorithm have been considered where the usual
quadratic proximal term in (2) is replaced by a nonquadratic distance-like function;

see, for example, [5,7,8,14,25]. Among others, the algorithms using an entropy-like
distance [13,14,25,26] for minimizing a convex function f(x) subject to x ∈ R^n_+ generate the iterates by

$$x^0 \in \mathbb{R}^n_{++}, \qquad x^k = \operatorname*{argmin}_{x \in \mathbb{R}^n_+} \left\{ f(x) + \frac{1}{\mu_k}\, d_\varphi(x, x^{k-1}) \right\}, \tag{3}$$

where d_φ(·,·) : R^n_{++} × R^n_{++} → R_+ is the entropy-like distance defined by

$$d_\varphi(x, y) = \sum_{i=1}^{n} y_i\, \varphi(x_i / y_i) \tag{4}$$

with φ satisfying certain conditions; see [13,14,25,26]. An important choice of φ is the function φ(t) = t ln t − t + 1, for which the corresponding d_φ, given by

$$d_\varphi(x, y) = \sum_{i=1}^{n} \left( x_i \ln x_i - x_i \ln y_i + y_i - x_i \right), \tag{5}$$

is the popular Kullback-Leibler relative entropy from statistics, which is where the
"entropy" terminology stems from. One key feature of entropic proximal methods is
that they generate a sequence automatically staying in the interior of R^n_+, thus eliminating
the combinatorial nature of the problem. One of the main applications of such proxi-
mal methods is to the dual of smooth convex programs, yielding twice continuously
differentiable nonquadratic augmented Lagrangians and thereby allowing the usage
of Newton’s methods.
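For the orthant case K = R^n_+, the entropy distance (5) and one step of scheme (3) are easy to make concrete. The following sketch (assuming NumPy; the linear objective f(x) = c^T x and the specific data are illustrative, not from the paper) uses the fact that for linear f the optimality condition of (3) reads c + (1/μ_k)(ln x − ln x^{k−1}) = 0 componentwise, giving the closed form x^k = x^{k−1} · exp(−μ_k c), which stays in R^n_{++} automatically.

```python
import numpy as np

def d_phi(x, y):
    """Entropy-like distance (5): sum_i x_i ln x_i - x_i ln y_i + y_i - x_i,
    for x in R^n_+ and y in R^n_++, with the convention 0 ln 0 = 0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xlnx = np.where(x > 0, x * np.log(np.where(x > 0, x, 1.0)), 0.0)
    return float(np.sum(xlnx - x * np.log(y) + y - x))

def entropy_prox_step(c, y, mu):
    """One step of (3) for f(x) = c^T x: solving c + (1/mu)(ln x - ln y) = 0
    componentwise gives x = y * exp(-mu * c) > 0."""
    return y * np.exp(-mu * c)

y = np.array([1.0, 2.0, 0.5])        # current iterate in R^3_++
c = np.array([1.0, 0.3, -0.2])       # illustrative linear objective
x = entropy_prox_step(c, y, mu=0.5)
```

Note that d_phi(x, y) = 0 exactly when x = y, and the new iterate decreases the proximal objective, mirroring the monotonicity arguments used later in Sect. 4.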

The main purpose of this paper is to propose an interior proximal-like method and
the corresponding dual augmented Lagrangian method for the CSCP (1). Specifically,
by using the Euclidean Jordan algebraic techniques, we extend the entropy-like proximal algorithm defined by (3)–(4) with φ(t) = t ln t − t + 1 to the solution of (1). For
the proposed algorithm, we establish a global convergence estimate in terms of the
objective value, and moreover present a dual application to the standard SCLP, which
leads to an exponential multiplier method shown to possess properties analogous to
the method proposed by [3,27] for convex programming over nonnegative orthant
coneR^{n}_{+}.

The paper is organized as follows. Section 2 reviews some basic concepts and
materials on Euclidean Jordan algebras which are needed in the analysis of the algorithm.
In Sect. 3, we introduce a distance-like function H to measure the closeness
of two points in the symmetric cone K and investigate some related properties.
Furthermore, we outline a basic proximal-like algorithm with the measure function
H. The convergence analysis of the algorithm is the main content of Sect. 4. In
Sect. 5, we consider a dual application of the algorithm to the SCLP and establish
the convergence results for the corresponding multiplier method. We close this paper
with some remarks in Sect. 6.

**2 Preliminaries on Euclidean Jordan algebra**

This section recalls some concepts and results on Euclidean Jordan algebras that will be used in the subsequent sections. More detailed expositions of Euclidean Jordan algebras can be found in Koecher's lecture notes [15] and Faraut and Korányi's monograph [11].

Let V be a finite dimensional inner product space endowed with a bilinear mapping (x, y) → x ◦ y : V × V → V. For a given x ∈ V, let L(x) be the linear operator of V defined by

$$L(x)y := x \circ y \quad \text{for every } y \in \mathbb{V}.$$

The pair (V, ◦) is called a Jordan algebra if, for all x, y ∈ V,

(i) x ◦ y = y ◦ x,

(ii) x ◦ (x² ◦ y) = x² ◦ (x ◦ y), where x² := x ◦ x.

In a Jordan algebra (V, ◦), x ◦ y is said to be the Jordan product of x and y. Note that a Jordan algebra is not necessarily associative, i.e., x ◦ (y ◦ z) = (x ◦ y) ◦ z may not hold in general. If for some element e ∈ V, x ◦ e = e ◦ x = x for all x ∈ V, then e is called a unit element of the Jordan algebra (V, ◦). The unit element, if it exists, is unique.

A Jordan algebra does not necessarily have a unit element. For x ∈ V, let ζ(x) be the degree of the minimal polynomial of x, which can be equivalently defined as

$$\zeta(x) := \min\left\{ k : \{e, x, x^2, \ldots, x^k\} \text{ is linearly dependent} \right\}.$$

Then the rank of (V, ◦) is well defined by r := max{ζ(x) : x ∈ V}.

*A Jordan algebra (V, ◦), with a unit element e ∈ V, defined over the real field R is*
*called a Euclidean Jordan algebra or formally real Jordan algebra, if there exists a*
positive definite symmetric bilinear form onV which is associative; in other words,
there exists on*V an inner product denoted by ·, ·*_{V}*such that for all x, y, z*∈ V:

(iii) *x ◦ y, z*_{V}*= y, x ◦ z*_{V}.

In a Euclidean Jordan algebra A = (V, ◦, ⟨·,·⟩_V), we define the set of squares as

$$\mathcal{K} := \left\{ x^2 : x \in \mathbb{V} \right\}.$$

By [11, Theorem III. 2.1],*K is a symmetric cone. This means that K is a self-dual*
*closed convex cone with nonempty interior and for any two elements x, y∈ int(K),*
there exists an invertible linear transformation*T : V → V such that T (K) = K and*
*T (x) = y.*

Here are two popular examples of Euclidean Jordan algebras. Let S^n be the space of n × n real symmetric matrices with the inner product given by ⟨X, Y⟩_{S^n} := Tr(XY), where XY is the matrix multiplication of X and Y and Tr(XY) is the trace of XY. Then (S^n, ◦, ⟨·,·⟩_{S^n}) is a Euclidean Jordan algebra with the Jordan product defined by

$$X \circ Y := (XY + YX)/2, \qquad X, Y \in \mathcal{S}^n.$$

In this case, the unit element is the identity matrix I in S^n, and the cone of squares K is the set of all positive semidefinite matrices in S^n. Let R^n be the Euclidean space of dimension n with the usual inner product ⟨x, y⟩_{R^n} = x^T y. For any x = (x_1, x_2), y = (y_1, y_2) ∈ R × R^{n−1}, define x ◦ y := (x^T y, x_1 y_2 + y_1 x_2)^T. Then (R^n, ◦, ⟨·,·⟩_{R^n}) is a Euclidean Jordan algebra, also called the quadratic forms algebra. In this algebra, the unit element is e = (1, 0, …, 0)^T and K = {x = (x_1, x_2) ∈ R × R^{n−1} : x_1 ≥ ‖x_2‖_2}, the second-order cone.
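The algebra axioms can be checked numerically for the symmetric-matrix example. A small sketch (assuming NumPy; the random 4 × 4 matrices are only for illustration):

```python
import numpy as np

def jprod(X, Y):
    """Jordan product on S^n: X ∘ Y = (XY + YX)/2."""
    return (X @ Y + Y @ X) / 2.0

rng = np.random.default_rng(0)
def sym(n):
    """A random n x n real symmetric matrix."""
    A = rng.standard_normal((n, n))
    return (A + A.T) / 2.0

X, Y, Z = sym(4), sym(4), sym(4)
I = np.eye(4)
X2 = jprod(X, X)

ok_commutative = np.allclose(jprod(X, Y), jprod(Y, X))              # axiom (i)
ok_jordan_id = np.allclose(jprod(X, jprod(X2, Y)),                  # axiom (ii)
                           jprod(X2, jprod(X, Y)))
ok_unit = np.allclose(jprod(X, I), X)                               # unit e = I
ok_assoc_ip = np.isclose(np.trace(jprod(X, Y) @ Z),                 # axiom (iii)
                         np.trace(Y @ jprod(X, Z)))
```

Note that the Jordan product of two symmetric matrices is again symmetric, while the plain matrix product XY in general is not, which is why the symmetrization (XY + YX)/2 is used.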

Recall that an element c ∈ V is said to be idempotent if c² = c. Two idempotents c and q are said to be orthogonal if c ◦ q = 0. One says that {c_1, c_2, …, c_k} is a complete system of orthogonal idempotents if

$$c_j^2 = c_j, \qquad c_j \circ c_i = 0 \ \text{ if } j \neq i \ \text{ for all } j, i = 1, 2, \ldots, k, \qquad \text{and} \qquad \sum_{j=1}^{k} c_j = e.$$

*An idempotent is said to be primitive if it is nonzero and cannot be written as the sum*
of two other nonzero idempotents. We call a complete system of orthogonal primitive
*idempotents a Jordan frame. Then we have the following spectral decomposition*
theorem.

**Theorem 2.1** [11, Theorem III.1.2] Suppose that A = (V, ◦, ⟨·,·⟩_V) is a Euclidean Jordan algebra and the rank of A is r. Then for any x ∈ V, there exist a Jordan frame {c_1, c_2, …, c_r} and real numbers λ_1(x), λ_2(x), …, λ_r(x), arranged in the decreasing order λ_1(x) ≥ λ_2(x) ≥ ⋯ ≥ λ_r(x), such that $x = \sum_{j=1}^{r} \lambda_j(x) c_j$. The numbers λ_j(x) (counting multiplicities), which are uniquely determined by x, are called the eigenvalues, $\sum_{j=1}^{r} \lambda_j(x) c_j$ the spectral decomposition of x, and $\mathrm{tr}(x) = \sum_{j=1}^{r} \lambda_j(x)$ the trace of x.
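In the algebra S^n, Theorem 2.1 is just the ordinary eigendecomposition: the Jordan frame consists of the rank-one projections onto an orthonormal eigenbasis. A quick numerical sketch (assuming NumPy; the random 3 × 3 matrix is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
X = (A + A.T) / 2.0                        # a point of S^3

lam, V = np.linalg.eigh(X)                 # eigenvalues, orthonormal eigenvectors
frames = [np.outer(V[:, j], V[:, j]) for j in range(3)]   # c_j = v_j v_j^T

# Jordan-frame properties: idempotent, mutually orthogonal, summing to e = I
idempotent = all(np.allclose(c @ c, c) for c in frames)
orthogonal = all(np.allclose(frames[i] @ frames[j], 0)    # hence c_i ∘ c_j = 0
                 for i in range(3) for j in range(3) if i != j)
complete = np.allclose(sum(frames), np.eye(3))

# spectral decomposition x = sum_j lam_j c_j
decomposes = np.allclose(sum(l * c for l, c in zip(lam, frames)), X)
```

The trace identity tr(x) = Σ_j λ_j(x) of the theorem is the usual matrix trace equaling the sum of eigenvalues.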

From [11, Proposition III.1.5], a Jordan algebra (V, ◦) with a unit element e ∈ V is Euclidean if and only if the symmetric bilinear form tr(x ◦ y) is positive definite. Therefore, we may define another inner product on V by

$$\langle x, y \rangle := \mathrm{tr}(x \circ y) \quad \forall x, y \in \mathbb{V}.$$

By the associativity of tr(·) [11, Proposition II.4.3], the inner product ⟨·,·⟩ is associative, i.e., for all x, y, z ∈ V, there holds ⟨x, y ◦ z⟩ = ⟨y, x ◦ z⟩. Thus, the operator L(x) for each x ∈ V is symmetric with respect to the inner product ⟨·,·⟩ in the sense that

$$\langle L(x)y, z \rangle = \langle y, L(x)z \rangle \quad \forall y, z \in \mathbb{V}.$$

In the sequel, we let

$$\|x\| := \langle x, x \rangle^{1/2} = \left( \sum_{j=1}^{r} \lambda_j^2(x) \right)^{1/2} \quad \forall x \in \mathbb{V},$$

and denote by λ_min(x) and λ_max(x) the smallest and the largest eigenvalue of x, respectively. Then, by Lemma 13 and the proof of Lemma 14 in [23], we can prove the following lemma.

**Lemma 2.1** Let x, y ∈ V. Then we can bound the minimum eigenvalue of x + y as follows:

$$\lambda_{\min}(x) + \lambda_{\min}(y) \le \lambda_{\min}(x + y) \le \lambda_{\min}(x) + \lambda_{\max}(y).$$

Let g : R → R be a scalar-valued function. Then it is natural to define a vector-valued function associated with the Euclidean Jordan algebra (V, ◦, ⟨·,·⟩_V) by

$$g^{\mathrm{sc}}(x) := g(\lambda_1(x))\, c_1 + g(\lambda_2(x))\, c_2 + \cdots + g(\lambda_r(x))\, c_r, \tag{6}$$

where x ∈ V has the spectral decomposition $x = \sum_{j=1}^{r} \lambda_j(x) c_j$. This function is called the Löwner operator in [24] and was shown to have the following important property.

**Lemma 2.2** [24, Theorem 13] For any $x = \sum_{j=1}^{r} \lambda_j(x) c_j$, let g^sc be defined by (6). Then g^sc is (continuously) differentiable at x if and only if g is (continuously) differentiable at all λ_j(x). Furthermore, the derivative of g^sc at x, for any h ∈ V, is given by

$$(g^{\mathrm{sc}})'(x) h = \sum_{j=1}^{r} g'(\lambda_j(x))\, \langle c_j, h \rangle\, c_j + \sum_{1 \le j < l \le r} 4\, [\lambda_j(x), \lambda_l(x)]_g\; c_j \circ (c_l \circ h)$$

with

$$[\lambda_i(x), \lambda_j(x)]_g := \frac{g(\lambda_i(x)) - g(\lambda_j(x))}{\lambda_i(x) - \lambda_j(x)} \quad \forall\, i, j = 1, 2, \ldots, r,\ i \neq j.$$

In fact, the Jacobian (g^sc)'(·) is a linear and symmetric operator, and can be written as

$$(g^{\mathrm{sc}})'(x) = \sum_{j=1}^{r} g'(\lambda_j(x))\, Q(c_j) + 2 \sum_{\substack{i,j=1 \\ i \neq j}}^{r} [\lambda_i(x), \lambda_j(x)]_g\; L(c_j) L(c_i), \tag{7}$$

where Q(x) := 2L²(x) − L(x²) for any x ∈ V is called the quadratic representation of V.
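In S^n, the Löwner operator (6) is the familiar "apply g to the eigenvalues" construction. A minimal sketch (assuming NumPy; the checks with g(t) = t² and the exp/log pair are illustrative):

```python
import numpy as np

def lowner(g, X):
    """Löwner operator (6) on S^n: apply g to the eigenvalues of X and
    recombine with the Jordan frame c_j = v_j v_j^T."""
    lam, V = np.linalg.eigh(X)
    return (V * g(lam)) @ V.T              # = sum_j g(lam_j) v_j v_j^T

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
X = (A + A.T) / 2.0

# g(t) = t^2 must reproduce the Jordan square X ∘ X = X^2
square_ok = np.allclose(lowner(lambda t: t**2, X), X @ X)
# exp and ln are mutually inverse Löwner operators on the interior of the cone
P = lowner(np.exp, X)                      # positive definite
log_ok = np.allclose(lowner(np.log, P), X)
```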

Finally, we recall the spectral function generated by a symmetric function. Let P denote the set of all permutations of r-dimensional vectors. A subset of R^r is said to be symmetric if it remains unchanged under every permutation of P. Let S be a symmetric set in R^r. A real-valued function f : S → R is said to be symmetric if for every permutation P ∈ P and each s ∈ S, there holds f(Ps) = f(s). For any x ∈ V with the spectral decomposition $x = \sum_{j=1}^{r} \lambda_j(x) c_j$, define K := {x ∈ V | λ(x) ∈ S} with λ(x) = (λ_1(x), …, λ_r(x))^T being the spectral vector of x. Then F : K → R defined by

$$F(x) := f(\lambda(x)) \tag{8}$$

is called the spectral function generated by f. From Theorem 41 of [2], F is (strictly) convex if f is (strictly) convex.

Unless otherwise stated, in the rest of this paper, the notation A = (V, ◦, ⟨·,·⟩) represents a Euclidean Jordan algebra of rank r with dim(V) = n. For a closed proper convex function f : V → (−∞, +∞], we denote by dom f := {x ∈ V | f(x) < +∞} the domain of f. The subdifferential of f at x_0 ∈ V is the convex set

$$\partial f(x_0) = \left\{ \xi \in \mathbb{V} \mid f(x) \ge f(x_0) + \langle \xi, x - x_0 \rangle \ \forall x \in \mathbb{V} \right\}. \tag{9}$$

Since ⟨x, y⟩ = tr(x ◦ y) for any x, y ∈ V, the above subdifferential set is equivalent to

$$\partial f(x_0) = \left\{ \xi \in \mathbb{V} \mid f(x) \ge f(x_0) + \mathrm{tr}\big( \xi \circ (x - x_0) \big) \ \forall x \in \mathbb{V} \right\}. \tag{10}$$

For a sequence {x^k}_{k∈N}, the notation N denotes the set of natural numbers.

**3 Entropy-like proximal algorithm**

To solve the CSCP (1), we suggest the following proximal-like minimization algorithm:

$$x^0 \succ_{\mathcal{K}} 0, \qquad x^k = \operatorname*{argmin}_{x \succeq_{\mathcal{K}} 0} \left\{ f(x) + \frac{1}{\mu_k} H(x, x^{k-1}) \right\}, \tag{11}$$

where μ_k > 0 and H : V × V → (−∞, +∞] is defined by

$$H(x, y) := \begin{cases} \mathrm{tr}(x \circ \ln x - x \circ \ln y + y - x) & \forall\, x \in \mathrm{int}(\mathcal{K}),\ y \in \mathrm{int}(\mathcal{K}), \\ +\infty & \text{otherwise}. \end{cases} \tag{12}$$

This algorithm is indeed of proximal type except that the classical quadratic term ‖x − x^{k−1}‖²_2 is replaced by the distance-like function H to guarantee that {x^k}_{k∈N} ⊂ int(K), thus leading to an interior proximal-like method (see Proposition 4.1).

By the definition of the Löwner operator, the function H(x, y) is clearly well-defined for all x, y ∈ int(K). Moreover, the domain of x can be extended from int(K) to K by adopting the convention 0 ln 0 ≡ 0. The function H is a natural extension of the distance-like entropy function in (5), and is used to measure the "distance" between two points in K. In fact, H reduces to the entropy function d_φ in (5) if the Euclidean Jordan algebra A is specified as (R^n, ◦, ⟨·,·⟩_{R^n}) with "◦" denoting the componentwise product of two vectors in R^n and ⟨·,·⟩_{R^n} the usual Euclidean inner product. As shown by Proposition 3.1 below, most, but not all, of the important properties of d_φ also hold for H.
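For the algebra S^n, the function (12) reads H(X, Y) = tr(X ln X − X ln Y + Y − X) with matrix logarithms, a quantum relative-entropy-type quantity. A small sketch (assuming NumPy; the random positive definite test matrices are illustrative):

```python
import numpy as np

def lowner(g, X):
    """Apply the scalar function g to the eigenvalues of symmetric X."""
    lam, V = np.linalg.eigh(X)
    return (V * g(lam)) @ V.T

def H(X, Y):
    """Distance-like function (12) on S^n: tr(X ln X - X ln Y + Y - X),
    for X positive semidefinite and Y positive definite (0 ln 0 = 0)."""
    lam, V = np.linalg.eigh(X)
    xlnx = np.where(lam > 0, lam * np.log(np.where(lam > 0, lam, 1.0)), 0.0)
    XlnX = (V * xlnx) @ V.T
    return float(np.trace(XlnX - X @ lowner(np.log, Y) + Y - X))

rng = np.random.default_rng(3)
def pd(n):
    """A random n x n positive definite matrix."""
    B = rng.standard_normal((n, n))
    return B @ B.T + 0.1 * np.eye(n)

X, Y = pd(3), pd(3)
```

The checks below mirror Proposition 3.1(c): H(X, X) = 0 and H(X, Y) > 0 for X ≠ Y.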

The following two technical lemmas will be used to investigate the favorable properties of the distance measure H. Lemma 3.1 states an extension of the von Neumann inequality to Euclidean Jordan algebras, and Lemma 3.2 gives the properties of tr(x ◦ ln x).

**Lemma 3.1** [2] For any x, y ∈ V, we have $\mathrm{tr}(x \circ y) \le \sum_{j=1}^{r} \lambda_j(x) \lambda_j(y) = \lambda(x)^T \lambda(y)$, where λ(x) and λ(y) are the spectral vectors of x and y, respectively.

**Lemma 3.2** For any x ∈ K, let Φ(x) := tr(x ◦ ln x). Then, we have the following results.

(a) Φ(x) is the spectral function generated by the symmetric entropy function
$$\phi(u) = \sum_{j=1}^{r} u_j \ln u_j \quad \forall u \in \mathbb{R}^r_+. \tag{13}$$

(b) Φ(x) is continuously differentiable on int(K) with ∇Φ(x) = ln x + e.

(c) The function Φ(x) is strictly convex over K.

Proof (a) Suppose that x has the spectral decomposition $x = \sum_{j=1}^{r} \lambda_j(x) c_j$. Let g(t) = t ln t for t ≥ 0 (with 0 ln 0 ≡ 0). From Sect. 2, it follows that the vector-valued function x ◦ ln x is the Löwner function g^sc(x), i.e., g^sc(x) = x ◦ ln x. Clearly, g^sc is well-defined for any x ∈ K and

$$g^{\mathrm{sc}}(x) = x \circ \ln x = \sum_{j=1}^{r} \lambda_j(x) \ln(\lambda_j(x))\, c_j.$$

Therefore,

$$\Phi(x) = \mathrm{tr}(x \circ \ln x) = \mathrm{tr}(g^{\mathrm{sc}}(x)) = \sum_{j=1}^{r} \lambda_j(x) \ln(\lambda_j(x)) = \phi(\lambda(x))$$

with φ : R^r_+ → R given by (13). Since the function φ is symmetric, Φ(x) is the spectral function generated by the symmetric function φ in view of (8).

(b) From Lemma 2.2, g^sc(x) = x ◦ ln x is continuously differentiable on int(K). Thus, Φ(x) is also continuously differentiable on int(K), because Φ is the composition of the trace function (clearly continuously differentiable) and g^sc. Now, it remains to find its gradient formula. From the fact that tr(x ◦ y) = ⟨x, y⟩, we have

$$\Phi(x) = \mathrm{tr}(x \circ \ln x) = \langle x, \ln x \rangle.$$

Applying the chain rule for the inner product of two functions, we then obtain

$$\nabla \Phi(x) = \ln x + (\nabla \ln x)\, x = \ln x + (\ln x)'\, x. \tag{14}$$

On the other hand, from formula (7) it follows that for any h ∈ V,

$$(\ln x)' h = \sum_{j=1}^{r} \frac{1}{\lambda_j(x)} \langle c_j, h \rangle\, c_j + \sum_{1 \le j < l \le r} \frac{\ln(\lambda_j(x)) - \ln(\lambda_l(x))}{\lambda_j(x) - \lambda_l(x)}\; c_j \circ (c_l \circ h).$$

By this and the spectral decomposition of x, it is easy to compute that

$$(\ln x)' x = \sum_{j=1}^{r} \frac{1}{\lambda_j(x)} \langle c_j, x \rangle\, c_j + \sum_{1 \le j < l \le r} \frac{\ln(\lambda_j(x)) - \ln(\lambda_l(x))}{\lambda_j(x) - \lambda_l(x)}\; c_j \circ (c_l \circ x)$$
$$= \sum_{j=1}^{r} \frac{1}{\lambda_j(x)}\, \lambda_j(x)\, c_j + \sum_{1 \le j < l \le r} \frac{\ln(\lambda_j(x)) - \ln(\lambda_l(x))}{\lambda_j(x) - \lambda_l(x)}\; \lambda_l(x)\, c_j \circ c_l = e,$$

where the last equality uses c_j ◦ c_l = 0 for j ≠ l and $\sum_{j=1}^{r} c_j = e$. Combining with (14), we readily obtain the desired result.

(c) Note that the function φ in (13) is strictly convex over R^r_+. Therefore, the conclusion immediately follows from part (a) and Theorem 41 of [2]. □
Next we study the favorable properties of the distance-like function H. These properties play a crucial role in the convergence analysis of the algorithm defined by (11)–(12).

**Proposition 3.1** Let H(x, y) be defined by (12). Then the following results hold.

(a) H(x, y) is continuous on K × int(K), and H(·, y) is strictly convex for any y ∈ int(K).

(b) For any fixed y ∈ int(K), H(·, y) is continuously differentiable on int(K) with
$$\nabla_x H(x, y) = \ln x - \ln y.$$

(c) H(x, y) ≥ 0 for any x ∈ K and y ∈ int(K), and H(x, y) = 0 if and only if x = y.

(d) H(x, y) ≥ d(λ(x), λ(y)) ≥ 0 for any x ∈ K, y ∈ int(K), where d(·,·) is defined by
$$d(u, v) = \sum_{i=1}^{r} \left( u_i \ln u_i - u_i \ln v_i + v_i - u_i \right) \quad \forall\, u \in \mathbb{R}^r_+,\ v \in \mathbb{R}^r_{++}. \tag{15}$$

(e) For fixed y ∈ int(K), the level sets L_H(x, γ) := {x ∈ K | H(x, y) ≤ γ} are bounded for all γ ≥ 0, and for fixed x ∈ K, the level sets L_H(y, γ) := {y ∈ int(K) | H(x, y) ≤ γ} are bounded for all γ ≥ 0.

Proof (a) Since x ◦ ln x and x ◦ ln y are continuous in x ∈ K and y ∈ int(K), and the trace function is also continuous, the function H is continuous over K × int(K). Notice that

$$H(x, y) = \Phi(x) - \mathrm{tr}(x \circ \ln y) + \mathrm{tr}(y) - \mathrm{tr}(x), \tag{16}$$

Φ(x) is strictly convex over K by Lemma 3.2(c), and the other terms on the right-hand side of (16) are clearly convex for any fixed y ∈ int(K). Therefore, H(·, y) is strictly convex for any fixed y ∈ int(K).

(b) From the expression of H(x, y) given by (16) and Lemma 3.2(b), the function H(·, y) is obviously continuously differentiable on int(K). Moreover,

$$\nabla_x H(x, y) = \nabla_x \Phi(x) - \ln y - e = \ln x - \ln y.$$

(c) From the definition of Φ(x) and its gradient formula shown in Lemma 3.2(b),

$$\Phi(x) - \Phi(y) - \langle \nabla\Phi(y), x - y \rangle = \mathrm{tr}(x \circ \ln x) - \mathrm{tr}(y \circ \ln y) - \langle \ln y + e, x - y \rangle$$
$$= \mathrm{tr}(x \circ \ln x) - \mathrm{tr}(x \circ \ln y) - \mathrm{tr}(x) + \mathrm{tr}(y) = H(x, y) \tag{17}$$

for any x ∈ K and y ∈ int(K). In addition, the strict convexity of Φ implies that

$$\Phi(x) - \Phi(y) - \langle \nabla\Phi(y), x - y \rangle \ge 0,$$

with equality if and only if x = y. The two facts readily give the desired result.

(d) Using the definition of H and Lemma 3.1, we have for all x ∈ K and y ∈ int(K),

$$H(x, y) = \mathrm{tr}(x \circ \ln x + y - x) - \mathrm{tr}(x \circ \ln y)$$
$$\ge \mathrm{tr}(x \circ \ln x + y - x) - \sum_{j=1}^{r} \lambda_j(x) \ln(\lambda_j(y))$$
$$= \sum_{j=1}^{r} \left( \lambda_j(x) \ln(\lambda_j(x)) + \lambda_j(y) - \lambda_j(x) \right) - \sum_{j=1}^{r} \lambda_j(x) \ln(\lambda_j(y))$$
$$= \sum_{j=1}^{r} \left( \lambda_j(x) \ln(\lambda_j(x)) - \lambda_j(x) \ln(\lambda_j(y)) + \lambda_j(y) - \lambda_j(x) \right) = d(\lambda(x), \lambda(y)).$$

The nonnegativity of d(λ(x), λ(y)) is direct by the definition of d(·,·) in (15).

(e) For fixed y ∈ int(K), from part (d) we have L_H(x, γ) ⊆ {x ∈ K | d(λ(x), λ(y)) ≤ γ} for all γ ≥ 0. Since the sets {u ∈ R^r_+ | d(u, v) ≤ γ} are bounded for all γ ≥ 0 by [26, Lemma 2.3], we have from the continuity of λ(·) that the sets L_H(x, γ) are bounded for all γ ≥ 0. Similarly, for fixed x ∈ K, using Lemma 2.1(i) of [26], the sets L_H(y, γ) are bounded for all γ ≥ 0. □

**Proposition 3.2** Let H(x, y) be defined by (12). Suppose {x^k}_{k∈N} ⊆ K and {y^k}_{k∈N} ⊂ int(K) are bounded sequences such that H(x^k, y^k) → 0. Then, as k → ∞, we have

(a) λ_j(x^k) − λ_j(y^k) → 0 for all j = 1, 2, …, r.

(b) tr(x^k − y^k) → 0.

Proof (a) From Proposition 3.1(d), H(x^k, y^k) ≥ d(λ(x^k), λ(y^k)) ≥ 0. Hence, H(x^k, y^k) → 0 implies d(λ(x^k), λ(y^k)) → 0. By the definition of d(·,·) given by (15),

$$d(\lambda(x^k), \lambda(y^k)) = \sum_{j=1}^{r} \lambda_j(y^k)\, \varphi\big( \lambda_j(x^k) / \lambda_j(y^k) \big)$$

with φ(t) = t ln t − t + 1 (t ≥ 0). Since φ(t) ≥ 0 for any t ≥ 0, each term of the above sum is nonnegative, and consequently, d(λ(x^k), λ(y^k)) → 0 implies

$$\lambda_j(y^k)\, \varphi\big( \lambda_j(x^k) / \lambda_j(y^k) \big) \to 0, \quad j = 1, 2, \ldots, r.$$

This is equivalent to saying that

$$\lambda_j(x^k) \ln(\lambda_j(x^k)) - \lambda_j(x^k) \ln(\lambda_j(y^k)) + \lambda_j(y^k) - \lambda_j(x^k) \to 0, \quad j = 1, 2, \ldots, r.$$

Since {λ_j(x^k)} and {λ_j(y^k)} are bounded, using Lemma A.1 of [6] then yields that λ_j(x^k) − λ_j(y^k) → 0 for all j = 1, 2, …, r.

(b) Since $\mathrm{tr}(x^k - y^k) = \sum_{j=1}^{r} \big( \lambda_j(x^k) - \lambda_j(y^k) \big)$, the result follows from part (a). □
To close this section, we present two useful relations for the function H, which can be easily verified by using the definition of H and recalling the nonnegativity of H.

**Proposition 3.3** Let H(x, y) be defined by (12). For all x, y ∈ int(K) and z ∈ K, we have

(a) H(z, x) − H(z, y) = tr(z ◦ ln y − z ◦ ln x + x − y).

(b) tr((z − y) ◦ (ln y − ln x)) = H(z, x) − H(z, y) − H(y, x) ≤ H(z, x) − H(z, y).

**4 Convergence analysis of the algorithm**

The convergence analysis of the entropy-like proximal algorithm defined by (11)–(12) is similar to that of the proximal point method [12] for convex minimization problems and the proximal-like algorithm [4] using Bregman functions. In this section, we will present a global convergence rate estimate for the algorithm in terms of function values. For this purpose, we need the following assumption for the CSCP (1):

(A) inf{f(x) | x ⪰_K 0} := f_* > −∞ and dom f ∩ int(K) ≠ ∅.

In addition, we denote by X* := {x ∈ K | f(x) = f_*} the solution set of (1).

First, we show that the algorithm in (11)–(12) is well defined, i.e., it generates a sequence {x^k}_{k∈N} ⊂ int(K). This is a direct consequence of Proposition 4.1 below. For the proof of Proposition 4.1, we need the following technical lemma.

**Lemma 4.1** For any x ⪰_K 0 and y ≻_K 0 with tr(x ◦ y) = 0, we always have x = 0.

Proof From the self-duality of K and [11, Proposition I.1.4], we have that

$$u \in \mathrm{int}(\mathcal{K}) \iff \langle u, v \rangle > 0 \quad \forall\, 0 \neq v \in \mathcal{K},$$

which together with tr(x ◦ y) = ⟨x, y⟩ immediately implies x = 0. □
**Proposition 4.1** For any fixed y ∈ int(K) and μ > 0, let F_μ(·, y) := f(·) + μ^{−1} H(·, y). Then, under assumption (A), we have the following results.

(a) The function F_μ(·, y) has bounded level sets.

(b) There exists a unique x(y, μ) ∈ int(K) such that
$$x(y, \mu) := \operatorname*{argmin}_{x \succeq_{\mathcal{K}} 0} F_\mu(x, y)$$
and
$$\mu^{-1}\big( \ln y - \ln x(y, \mu) \big) \in \partial f(x(y, \mu)). \tag{18}$$

Proof (a) Since F_μ(·, y) is convex, to show that F_μ(·, y) has bounded level sets, it suffices to show that for any ν ≥ f_* > −∞, the level set L(ν) := {x | F_μ(x, y) ≤ ν} is bounded. Let ν̄ := (ν − f_*) μ. Clearly, we have L(ν) ⊂ L_H(x, ν̄). Moreover, by Proposition 3.1(e), L_H(x, ν̄) is bounded. Therefore, L(ν) is bounded.

(b) From part (a), F_μ(·, y) has bounded level sets, which in turn implies the existence of the minimum point x(y, μ). Also, the strict convexity of F_μ(x, y) by Proposition 3.1(a) guarantees the uniqueness. Under assumption (A), using Proposition 3.1(b) and the optimality condition for the minimization problem argmin_{x ⪰_K 0} F_μ(x, y) gives that

$$0 \in \partial f(x(y, \mu)) + \mu^{-1}\big( \ln x(y, \mu) - \ln y \big) + \partial \delta(x(y, \mu) \,|\, \mathcal{K}), \tag{19}$$

where δ(z | K) = 0 if z ⪰_K 0 and +∞ otherwise. We will show that ∂δ(x(y, μ) | K) = {0}, and hence the desired result follows. Notice that for any x^k ∈ int(K) with x^k → x ∈ bd(K),

$$\|\ln x^k\| = \left( \sum_{j=1}^{r} \big[ \ln(\lambda_j(x^k)) \big]^2 \right)^{1/2} \to +\infty,$$

where the divergence is due to the continuity of λ_j(·) and the fact that ln t → −∞ as t → 0⁺. Consequently, ‖∇_x H(x^k, y)‖ → +∞ as k → +∞. This, by [19, p. 251], means that H(·, y) is essentially smooth, and then ∂_x H(x, y) = ∅ for all x ∈ bd(K) by [19, Theorem 26.1]. Together with (19), we must have x(y, μ) ∈ int(K). Furthermore, from [19, p. 226], it follows that

$$\partial \delta(z \,|\, \mathcal{K}) = \{ v \in \mathbb{V} \mid v \preceq_{\mathcal{K}} 0,\ \mathrm{tr}(v \circ z) = 0 \}.$$

Using Lemma 4.1 then yields ∂δ(x(y, μ) | K) = {0}. Thus, the proof is completed. □
Next we establish several important properties for the algorithm defined by (11)–(12), from which our convergence result will follow.

**Proposition 4.2** Let {x^k}_{k∈N} be the sequence generated by the algorithm defined by (11)–(12), and let $\sigma_m := \sum_{k=1}^{m} \mu_k$. Then, the following results hold.

(a) The sequence {f(x^k)}_{k∈N} is nonincreasing.

(b) μ_k (f(x^k) − f(x)) ≤ H(x, x^{k−1}) − H(x, x^k) for all x ∈ K.

(c) σ_m (f(x^m) − f(x)) ≤ H(x, x^0) − H(x, x^m) for all x ∈ K.

(d) {H(x, x^k)}_{k∈N} is nonincreasing for any x ∈ X*, if the optimal set X* ≠ ∅.

(e) H(x^k, x^{k−1}) → 0 and tr(x^k − x^{k−1}) → 0, if the optimal set X* ≠ ∅.

Proof (a) From the definition of x^k in (11), it follows that

$$f(x^k) + \mu_k^{-1} H(x^k, x^{k-1}) \le f(x^{k-1}) + \mu_k^{-1} H(x^{k-1}, x^{k-1}).$$

Since H(x^k, x^{k−1}) ≥ 0 and H(x^{k−1}, x^{k−1}) = 0 (see Proposition 3.1(c)), we obtain

$$f(x^k) \le f(x^{k-1}) \quad \forall k \in \mathbb{N},$$

and therefore the sequence {f(x^k)}_{k∈N} is nonincreasing.

(b) From (18), −μ_k^{−1}(ln x^k − ln x^{k−1}) ∈ ∂f(x^k). Plugging this into the formula of ∂f(x^k) given by (10), we have

$$f(x) \ge f(x^k) - \mu_k^{-1}\, \mathrm{tr}\big( (x - x^k) \circ (\ln x^k - \ln x^{k-1}) \big) \quad \forall x \in \mathbb{V}.$$

Consequently, for any x ∈ K,

$$\mu_k \big( f(x^k) - f(x) \big) \le \mathrm{tr}\big( (x - x^k) \circ (\ln x^k - \ln x^{k-1}) \big)$$
$$= H(x, x^{k-1}) - H(x, x^k) - H(x^k, x^{k-1}) \le H(x, x^{k-1}) - H(x, x^k), \tag{20}$$

where the equality holds due to Proposition 3.3(b), and the last inequality follows from the nonnegativity of H(x, y).

(c) First, summing the inequalities in part (b) over k = 1, 2, …, m yields

$$\sum_{k=1}^{m} \mu_k f(x^k) - \sigma_m f(x) \le H(x, x^0) - H(x, x^m) \quad \forall x \in \mathcal{K}. \tag{21}$$

On the other hand, since f(x^m) ≤ f(x^k) by part (a), multiplying this inequality by μ_k and summing over k = 1, 2, …, m gives

$$\sum_{k=1}^{m} \mu_k f(x^k) \ge \sum_{k=1}^{m} \mu_k f(x^m) = \sigma_m f(x^m). \tag{22}$$

Now, combining (21) with (22), we readily obtain the desired result.

(d) Since f(x^k) − f(x) ≥ 0 for all x ∈ X*, the result immediately follows from part (b).

(e) From part (d), the sequence {H(x, x^k)}_{k∈N} is nonincreasing for any x ∈ X*. This together with H(x, x^k) ≥ 0 implies that {H(x, x^k)}_{k∈N} is convergent. Thus,

$$H(x, x^{k-1}) - H(x, x^k) \to 0 \quad \forall x \in \mathcal{X}^*. \tag{23}$$

On the other hand, from (20) and part (d), we have

$$0 \le \mu_k \big( f(x^k) - f(x) \big) \le H(x, x^{k-1}) - H(x, x^k) - H(x^k, x^{k-1}) \quad \forall x \in \mathcal{X}^*,$$

which in turn implies

$$H(x^k, x^{k-1}) \le H(x, x^{k-1}) - H(x, x^k).$$

Using (23) and the nonnegativity of H(x^k, x^{k−1}), the first result is then obtained. Since {x^k}_{k∈N} ⊂ {y ∈ int(K) | H(x, y) ≤ H(x, x^0)} with x ∈ X*, the sequence {x^k}_{k∈N} is bounded. Thus, the second result follows by the first result and Proposition 3.2(b). □

By now, we have proved that the algorithm in (11)–(12) is well-defined and satisfies some favorable properties. With these properties, we are ready to establish the convergence results of the algorithm, which is one of the main purposes of this paper.

**Proposition 4.3** Let {x^k}_{k∈N} be the sequence generated by the algorithm in (11)–(12), and let $\sigma_m := \sum_{k=1}^{m} \mu_k$. Then, under assumption (A), the following results hold.

(a) f(x^m) − f(x) ≤ σ_m^{−1} H(x, x^0) for all x ∈ K.

(b) If σ_m → +∞, then lim_{m→+∞} f(x^m) = f_*.

(c) If the optimal set X* ≠ ∅, then the sequence {x^k}_{k∈N} is bounded, and if, in addition, σ_m → +∞, every accumulation point is a solution of the CSCP (1).

Proof (a) This result follows by Proposition 4.2(c) and the nonnegativity of H(x, x^m).

(b) Since σ_m → +∞, passing to the limit in the inequality in part (a), we have

$$\limsup_{m \to +\infty} f(x^m) \le f(x) \quad \forall x \in \mathcal{K},$$

which particularly implies

$$\limsup_{m \to +\infty} f(x^m) \le \inf\{ f(x) : x \in \mathcal{K} \}.$$

This together with the fact that f(x^m) ≥ f_* = inf{f(x) : x ∈ K} yields the result.

(c) From Proposition 3.1(e), H(x, ·) has bounded level sets for every x ∈ K. Also, from Proposition 4.2(d), {H(x, x^k)}_{k∈N} is nonincreasing for all x ∈ X* if X* ≠ ∅. Thus, the sequence {x^k}_{k∈N} is bounded and therefore has an accumulation point. Let x̂ ∈ K be an accumulation point of {x^k}_{k∈N}. Then {x^{k_j}} → x̂ for some k_j → +∞. Since f is lower semi-continuous, we have f(x̂) ≤ lim inf_{k_j→+∞} f(x^{k_j}) (see [22, p. 8]). On the other hand, by part (b) we also have f(x^{k_j}) → f_*; hence f(x̂) = f_*, which says that x̂ is a solution of (1). □

Proposition 4.3(a) gives a global rate-of-convergence estimate similar to the one obtained for proximal-like algorithms for convex minimization over the nonnegative orthant cone. Analogously, as remarked in [6, Remark 4.1], global convergence of {x^k} itself to a solution of (1) can be achieved under the assumption that {x^k} ⊂ int(K) has a limit point in int(K). Nonetheless, this assumption is rather stringent.
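To illustrate, consider the simplest instance K = R_+ (so r = 1), where H reduces to the Kullback–Leibler distance H(x, y) = x ln(x/y) − x + y and the proximal step in (11)–(12) amounts to solving f′(x) + μ_k^{−1} ln(x/x^{k−1}) = 0 over x > 0. The following minimal sketch is our own illustration, not code from the paper; the quadratic objective f(x) = (x − 2)²/2 and the bisection inner solver are assumptions made for the example:

```python
import math

def entropy_prox_step(f_grad, x_prev, mu, hi_hint):
    """One entropy proximal step specialized to K = R_+: solve
    f'(x) + (1/mu) * ln(x / x_prev) = 0 by bisection.  The left-hand
    side is increasing in x when f is convex, so the root is unique."""
    lo, hi = 1e-12, max(hi_hint, x_prev) + 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        g = f_grad(mid) + math.log(mid / x_prev) / mu
        if g > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Minimize f(x) = 0.5*(x - 2)^2 over x >= 0, starting from x^0 = 5.
a = 2.0
x = 5.0
for _ in range(30):
    x = entropy_prox_step(lambda t: t - a, x, mu=1.0, hi_hint=a)
print(x)  # close to the minimizer 2.0
```

The entropy term blows up at the boundary of R_+, so every iterate stays in int(K) automatically, mirroring the interior-point character of the algorithm.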

**5 Exponential multiplier method for SCLP**

In this section, we give a dual application of the entropy-like proximal algorithm in (11)–(12), leading to an exponential multiplier method for the symmetric cone linear program

(SCLP)  min { −b^T y : c − Σ_{i=1}^{m} y_i a_i ⪰_K 0, y ∈ R^m },

where c ∈ V, a_i ∈ V and b ∈ R^m. For the sake of notation, we denote the feasible set of (SCLP) by F := {y ∈ R^m : A(y) ⪰_K 0}, where A : R^m → V is the linear operator defined by

A(y) := c − Σ_{i=1}^{m} y_i a_i  ∀ y ∈ R^m.  (24)

In addition, we make the following assumptions for the problem (SCLP):

(A1) The optimal solution set of (SCLP) is nonempty and bounded;

(A2) Slater’s condition holds, i.e., there exists a ŷ ∈ R^m such that A(ŷ) ≻_K 0.

Notice that the Lagrangian function associated with (SCLP) is defined as follows:

L(y, x) = −b^T y − tr[x ∘ A(y)]  if x ∈ K,
L(y, x) = +∞  otherwise.

Therefore, the dual problem corresponding to the SCLP is given by

(DSCLP)  max_{x ⪰_K 0} inf_{y∈R^m} L(y, x),

which can be explicitly written as

(DSCLP)  max −tr(c ∘ x)
         s.t. tr(a_i ∘ x) = b_i,  i = 1, 2, . . . , m,
              x ⪰_K 0.

From the standard convex analysis arguments in [19], it follows that under assumption (A2) there is no duality gap between (SCLP) and (DSCLP), and furthermore, the solution set of (DSCLP) is nonempty and compact.
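For completeness, the explicit form of (DSCLP) can be obtained by evaluating the inner infimum; a short derivation in the paper's notation:

```latex
\inf_{y\in\mathbb{R}^m} L(y,x)
  = \inf_{y\in\mathbb{R}^m}\Big\{-b^{T}y
      -\operatorname{tr}\Big[x\circ\Big(c-\sum_{i=1}^{m}y_i a_i\Big)\Big]\Big\}
  = -\operatorname{tr}(c\circ x)
      +\inf_{y\in\mathbb{R}^m}\sum_{i=1}^{m}y_i\big(\operatorname{tr}(a_i\circ x)-b_i\big).
```

The last infimum equals 0 when tr(a_i ∘ x) = b_i for every i and −∞ otherwise, which produces the equality constraints in (DSCLP).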

To solve the problem (SCLP), we introduce the following multiplier-type algorithm: given x^0 ∈ int(K), generate the sequences {y^k}_{k∈N} ⊆ R^m and {x^k}_{k∈N} ⊂ int(K) by

y^k = argmin_{y∈R^m} { −b^T y + μ_k^{−1} tr[exp(−μ_k A(y) + ln x^{k−1})] },  (25)

x^k = exp(−μ_k A(y^k) + ln x^{k−1}),  (26)

where the parameters μ_k satisfy μ_k > μ̄ > 0 for all k ∈ N. The algorithm can be viewed as a natural extension of the exponential multiplier method for convex programs over nonnegative orthant cones (see, e.g., [3, 27]) to symmetric cones. In the rest of this section, we will show that the algorithm defined by (25)–(26) possesses similar properties.
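For intuition, specialize (25)–(26) to K = R_+ with a single scalar constraint c − a y ≥ 0; then exp and ln are the ordinary scalar functions and (26) becomes the classical exponential multiplier update x^k = x^{k−1} exp(−μ_k(c − a y^k)). The sketch below is our own toy illustration (the one-dimensional instance and the bisection inner solver are assumptions, not part of the paper):

```python
import math

def multiplier_step(b, a, c, x_prev, mu):
    """One exponential-multiplier iteration for the scalar instance
    min -b*y  s.t.  c - a*y >= 0  (the orthant specialization of (25)-(26)).
    The inner minimizer y^k solves F'(y) = -b + x_prev*a*exp(-mu*(c - a*y)) = 0,
    found here by bisection since F' is increasing in y for a > 0; the
    multiplier update is the exponential rule (26)."""
    lo, hi = -100.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        dF = -b + x_prev * a * math.exp(-mu * (c - a * mid))
        if dF > 0.0:
            hi = mid
        else:
            lo = mid
    y = 0.5 * (lo + hi)
    x_new = x_prev * math.exp(-mu * (c - a * y))
    return y, x_new

# Instance: min -y subject to 1 - y >= 0; optimum y* = 1, multiplier x* = 1.
y, x = 0.0, 0.5   # x^0 = 0.5 lies in int(R_+)
for _ in range(5):
    y, x = multiplier_step(b=1.0, a=1.0, c=1.0, x_prev=x, mu=2.0)
print(y, x)  # both approach 1.0
```

Note that the first iterate y^1 overshoots the constraint, a typical feature of exponential multiplier methods, while the multiplier x^k converges to the optimal dual value.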

We first present two technical lemmas, where Lemma 5.1 collects some properties of the Löwner operator exp(z) and Lemma 5.2 characterizes the recession cone of F.

**Lemma 5.1**

(a) For any z ∈ V, there always holds exp(z) ∈ int(K).

(b) The function exp(z) is continuously differentiable everywhere with

(exp(z))′ e = ∇_z(exp(z)) e = exp(z)  ∀ z ∈ V.

(c) The function tr[exp(Σ_{i=1}^{m} y_i a_i)] is continuously differentiable everywhere with

∇_y tr[exp(Σ_{i=1}^{m} y_i a_i)] = A^T(exp(Σ_{i=1}^{m} y_i a_i)),

where y ∈ R^m and a_i ∈ V for all i = 1, 2, . . . , m, and A^T : V → R^m is the linear transformation defined by A^T(x) := (⟨a_1, x⟩, . . . , ⟨a_m, x⟩)^T for any x ∈ V.

Proof (a) The result is clear since, for any z ∈ V with z = Σ_{j=1}^{r} λ_j(z) c_j, all eigenvalues of exp(z), given by exp(λ_j(z)) for j = 1, 2, . . . , r, are positive.

(b) By Lemma 2.2, clearly, exp(z) is continuously differentiable everywhere and

(exp(z))′ h = Σ_{j=1}^{r} exp(λ_j(z)) ⟨c_j, h⟩ c_j + Σ_{1≤j<l≤r} 4 · [exp(λ_j(z)) − exp(λ_l(z))] / [λ_j(z) − λ_l(z)] · c_j ∘ (c_l ∘ h)

for any h ∈ V. From this formula, we particularly have

(exp(z))′ e = Σ_{j=1}^{r} exp(λ_j(z)) ⟨c_j, e⟩ c_j + Σ_{1≤j<l≤r} 4 · [exp(λ_j(z)) − exp(λ_l(z))] / [λ_j(z) − λ_l(z)] · c_j ∘ (c_l ∘ e)
            = Σ_{j=1}^{r} exp(λ_j(z)) c_j = exp(z).

(c) The first part is direct by the continuous differentiability of the trace function and part (b), and the second part follows from the differential chain rule.
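As a concrete sanity check of Lemma 5.1(b), take V to be the 2×2 real symmetric matrices with the Jordan product x ∘ y = (xy + yx)/2, so that e is the identity matrix and exp is the usual matrix exponential. The directional derivative of exp at z in the direction e is then (d/dt) exp(z + te)|_{t=0} = exp(z), which the following script verifies by a central finite difference (our own illustration; the matrix entries and step size are arbitrary choices):

```python
import math

def expm2x2(a, b, c):
    """exp of the symmetric 2x2 matrix [[a, b], [b, c]] via its
    spectral decomposition exp(Z) = sum_j exp(lambda_j) v_j v_j^T."""
    if b == 0.0:
        return [[math.exp(a), 0.0], [0.0, math.exp(c)]]
    mean = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    l1, l2 = mean + rad, mean - rad
    n = math.hypot(b, l1 - a)          # eigenvector for l1 is (b, l1 - a)
    u, v = b / n, (l1 - a) / n
    e1, e2 = math.exp(l1), math.exp(l2)
    return [[e1 * u * u + e2 * v * v, (e1 - e2) * u * v],
            [(e1 - e2) * u * v, e1 * v * v + e2 * u * u]]

# Central finite difference of exp at Z = [[0.3, 0.4], [0.4, -0.2]]
# in the direction of the identity element e = I.
a, b, c = 0.3, 0.4, -0.2
t = 1e-5
Ep = expm2x2(a + t, b, c + t)   # exp(Z + t*e)
Em = expm2x2(a - t, b, c - t)   # exp(Z - t*e)
E = expm2x2(a, b, c)            # exp(Z)
err = max(abs((Ep[i][j] - Em[i][j]) / (2 * t) - E[i][j])
          for i in range(2) for j in range(2))
print(err)  # tiny: the derivative in direction e reproduces exp(Z)
```

This matches the identity (exp(z))′ e = exp(z), since I commutes with Z and hence exp(Z + tI) = e^t exp(Z).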
**Lemma 5.2** Let F_∞ denote the recession cone of the feasible set F. Then,

F_∞ = { d ∈ R^m : A(d) − c ⪰_K 0 }.

Proof Assume that d ∈ R^m satisfies A(d) − c ⪰_K 0. If d = 0, clearly, d ∈ F_∞. If d ≠ 0, we take any y ∈ F. From the definition of the linear operator A, for any τ > 0,

A(y + τd) = c − Σ_{i=1}^{m} (y_i + τ d_i) a_i = A(y) + τ(A(d) − c) ⪰_K 0.

This, by the definition of a recession direction [19, p. 61], shows that d ∈ F_∞. Thus, we prove that {d ∈ R^m : A(d) − c ⪰_K 0} ⊆ F_∞.

Now take any d ∈ F_∞ and y ∈ F. Then A(y + τd) ⪰_K 0 for any τ > 0, and therefore λ_min[A(y + τd)] ≥ 0. This must imply λ_min(A(d) − c) ≥ 0. If not, using the fact that

λ_min[A(y + τd)] = λ_min[A(y) + τ(A(d) − c)] ≤ λ_min(τ(A(d) − c)) + λ_max(A(y)) = τ λ_min(A(d) − c) + λ_max(A(y)),

where the inequality is due to Lemma 2.1, and letting τ → +∞, we then have λ_min[A(y + τd)] → −∞, contradicting the fact that λ_min[A(y + τd)] ≥ 0. Consequently, A(d) − c ⪰_K 0. Together with the above discussion, we show that F_∞ = {d ∈ R^m | A(d) − c ⪰_K 0}.

Next we establish the convergence of the algorithm in (25)–(26). We first prove that the generated sequence is well-defined, which is implied by the following lemma.

**Lemma 5.3** For any μ > 0, let F : R^m → R be defined by

F(y) := −b^T y + μ^{−1} tr[exp(−μ A(y) + ln x^{k−1})].  (27)

Then under assumption (A1) the minimum set of F is nonempty and bounded.

Proof We prove that F is coercive by contradiction. Suppose not, i.e., some level set of F is not bounded. Then, there exists a sequence {y^k} ⊆ R^m such that

‖y^k‖ → +∞,  lim_{k→+∞} y^k/‖y^k‖ = d ≠ 0  and  F(y^k) ≤ δ  (28)
for some δ ∈ R. Since exp(−μA(y) + ln x^{k−1}) ∈ int(K) for any y ∈ R^m by Lemma 5.1(a),

tr[exp(−μA(y) + ln x^{k−1})] = Σ_{j=1}^{r} exp(λ_j(−μA(y) + ln x^{k−1})) > 0.

Therefore, F(y^k) ≤ δ implies that the following two inequalities hold:

−b^T y^k < δ,  (29)

Σ_{j=1}^{r} exp(λ_j(−μA(y^k) + ln x^{k−1})) ≤ μ(δ + b^T y^k).  (30)

Since the exponential function is positive, each summand in (30) is bounded by the whole sum, and thus (30) implies

exp(λ_j(−μA(y^k) + ln x^{k−1})) ≤ μ(δ + b^T y^k)  ∀ j = 1, 2, . . . , r.  (31)

Dividing both sides of (31) by ‖y^k‖ and using the monotonicity of ln t (t > 0) then gives

λ_j(−μA(y^k) + ln x^{k−1}) − ln ‖y^k‖ ≤ ln[μ(δ + b^T y^k)/‖y^k‖] ≤ μ(δ + b^T y^k)/‖y^k‖ − 1  ∀ j = 1, 2, . . . , r,

where the last inequality is due to ln t ≤ t − 1 (t > 0). Now dividing both sides by ‖y^k‖ again and using the homogeneity of the function λ_j(·) yields

λ_j(−μA(y^k)/‖y^k‖ + (ln x^{k−1})/‖y^k‖) − (ln ‖y^k‖)/‖y^k‖ ≤ μ(δ + b^T y^k)/‖y^k‖² − 1/‖y^k‖  ∀ j = 1, 2, . . . , r.

Passing to the limit k → +∞ on both sides and noting that ‖y^k‖ → +∞, we get

λ_j(μ Σ_{i=1}^{m} d_i a_i) ≤ 0  ∀ j = 1, 2, . . . , r,

which, by the homogeneity of λ_j(·) again, implies

μ λ_j(A(d) − c) = μ λ_j(−Σ_{i=1}^{m} d_i a_i) ≥ 0  ∀ j = 1, 2, . . . , r.

Consequently, A(d) − c ⪰_K 0, and from Lemma 5.2, d ∈ F_∞. Moreover, dividing (29) by ‖y^k‖ and letting k → +∞ gives −b^T d ≤ 0. Thus there exists a nonzero d ∈ R^m such that d ∈ F_∞ and −b^T d ≤ 0, obviously contradicting assumption (A1). Thus, we complete the proof.

To analyze the convergence of the algorithm defined by (25)–(26), we also need the following lemma, which states that the sequence {x^k}_{k∈N} generated by (25)–(26) is exactly the one given by the algorithm in (11)–(12) when applied to the dual problem (DSCLP).