Stationary point conditions for the FB merit function associated with symmetric cones


Operations Research Letters


Shaohua Pan$^{a}$, Yu-Lin Chang$^{b}$, Jein-Shan Chen$^{b,*}$

$^{a}$ School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China

$^{b}$ Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

Article info

Article history: Received 28 October 2009; accepted 13 July 2010; available online 24 July 2010.

Keywords: Fischer–Burmeister merit function; Symmetric cones; Stationary points

Abstract

For the symmetric cone complementarity problem, we show that each stationary point of the unconstrained minimization reformulation based on the Fischer–Burmeister merit function is a solution to the problem, provided that the gradient operators of the mappings involved in the problem satisfy column monotonicity or have the Cartesian $P_0$-property. These results answer the open question posed in the article published in Journal of Mathematical Analysis and Applications 355 (2009) 195–215.

1. Introduction

Let $\mathbb{A} = (\mathbb{V}, \circ, \langle\cdot,\cdot\rangle)$ be an $n$-dimensional Euclidean Jordan algebra (see Section 2 for the definition) and let $\mathcal{K}$ be the symmetric cone in $\mathbb{V}$. We consider the following symmetric cone complementarity problem (SCCP), which is to find a vector $\zeta \in \mathbb{V}$ such that

$$F(\zeta) \in \mathcal{K}, \quad G(\zeta) \in \mathcal{K}, \quad \langle F(\zeta), G(\zeta)\rangle = 0, \tag{1}$$

where $F$ and $G$ are differentiable mappings from $\mathbb{V}$ to $\mathbb{V}$. This class of problems provides a unified framework for the nonlinear complementarity problem (NCP) over the nonnegative orthant cone in $\mathbb{R}^n$, the second-order cone complementarity problem (SOCCP) and the semidefinite complementarity problem (SDCP), and has become one of the main research interests in the current optimization field; see, e.g., [5,10,14,16,17,19].

Recently, there have been active studies of merit functions (or complementarity functions) for the SCCP. For example, Liu, Zhang and Wang [14] extended a class of merit functions proposed in [8] for the NCP to the SCCP; Kong, Tuncel and Xiu [11] studied the implicit Lagrangian merit function for the SCCP; Kong, Sun and Xiu [10] proposed a regularized smoothing method using the natural residual complementarity function associated with symmetric cones; and Huang and Ni [6] developed a smoothing algorithm with the regularized CHKS smoothing function over symmetric cones. Along this line, we also extended the one-parametric class of merit functions in [7] to the SCCP [15]. Specifically, a function $\psi: \mathbb{V} \times \mathbb{V} \to \mathbb{R}_+$ is called a merit function associated with the cone $\mathcal{K}$ if

$$\psi(x, y) = 0 \iff x \in \mathcal{K}, \quad y \in \mathcal{K}, \quad \langle x, y\rangle = 0. \tag{2}$$

With such a function, the SCCP can be reformulated as an unconstrained minimization problem

$$\min_{\zeta \in \mathbb{V}} \Psi(\zeta) := \psi(F(\zeta), G(\zeta)), \tag{3}$$

in the sense that $\zeta^*$ solves (1) if and only if it is a solution of (3) with zero optimal value. Then, effective unconstrained minimization methods can be applied to solve it.

$^{*}$ Corresponding author.

doi:10.1016/j.orl.2010.07.011

A popular choice for $\psi$ is the Fischer–Burmeister (FB) merit function $\psi_{\rm FB}$ defined as

$$\psi_{\rm FB}(x, y) := \frac{1}{2}\,\|\phi_{\rm FB}(x, y)\|^2 \quad \forall x, y \in \mathbb{V}, \tag{4}$$

where $\phi_{\rm FB}: \mathbb{V} \times \mathbb{V} \to \mathbb{V}$ is the FB complementarity function associated with $\mathcal{K}$, given by

$$\phi_{\rm FB}(x, y) = (x^2 + y^2)^{1/2} - (x + y) \tag{5}$$

with $x^2 = x \circ x$ denoting the Jordan product of $x$ with itself, and $x^{1/2}$ the unique square root of $x \in \mathcal{K}$, i.e., $x^{1/2} \circ x^{1/2} = x$. The function $\psi_{\rm FB}$ was first proved to be differentiable in [14], and later the authors of [12,15] independently showed that it is continuously differentiable everywhere with Lipschitz continuous gradients. However, it has been an open question under what conditions every stationary point of the minimization problem

$$\min_{\zeta \in \mathbb{V}} \Psi_{\rm FB}(\zeta) := \psi_{\rm FB}(F(\zeta), G(\zeta)) \tag{6}$$

is guaranteed to be a solution of (1). The main difficulty in establishing such results is described in [15]. The study of stationary point conditions is extremely important in the merit function approach since, when applying effective gradient-type methods to the minimization reformulation, one can at most expect to obtain a stationary point due to the nonconvexity of the merit functions.
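As a concrete illustration of (4) and (5), the following minimal sketch evaluates $\phi_{\rm FB}$ and $\psi_{\rm FB}$ in the special case where $\mathbb{V}$ is the algebra of real symmetric matrices with $x \circ y = \frac{1}{2}(xy + yx)$, so that $\mathcal{K}$ is the positive semidefinite cone and $\langle x, y\rangle = \mathrm{Tr}(xy)$. The helper names (`jordan`, `psd_sqrt`, `phi_fb`, `psi_fb`) and the sample matrices are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np

def jordan(x, y):
    """Jordan product on real symmetric matrices: x o y = (xy + yx) / 2."""
    return 0.5 * (x @ y + y @ x)

def psd_sqrt(w):
    """Square root of a symmetric positive semidefinite matrix via eigh."""
    vals, vecs = np.linalg.eigh(w)
    vals = np.clip(vals, 0.0, None)  # clip tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T

def phi_fb(x, y):
    """FB complementarity function (5): (x^2 + y^2)^{1/2} - (x + y)."""
    return psd_sqrt(jordan(x, x) + jordan(y, y)) - (x + y)

def psi_fb(x, y):
    """FB merit function (4): half the squared trace norm of phi_fb."""
    p = phi_fb(x, y)
    return 0.5 * np.trace(p @ p)

# A complementary pair: both matrices are PSD and Tr(xy) = 0.
x_ok = np.diag([2.0, 0.0])
y_ok = np.diag([0.0, 3.0])
# A pair violating x in K: x has a negative eigenvalue.
x_bad = np.diag([-1.0, 0.0])
y_bad = np.zeros((2, 2))

print(psi_fb(x_ok, y_ok))    # merit value for the complementary pair
print(psi_fb(x_bad, y_bad))  # merit value for the infeasible pair
```

In this sketch the merit value vanishes (up to round-off) for the complementary pair and is strictly positive for the infeasible one, which is the behaviour required by (2).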

The main purpose of this paper is to settle this open problem. By exploiting the classification of simple Euclidean Jordan algebras and extending a weaker result than the first implication of [4, Prop. 3.4] to the setting of symmetric cones, we show that each stationary point of the minimization problem (6) is a solution to (1) if the gradient operators $\nabla F$ and $-\nabla G$ are column monotone. If the operator $\nabla G$ is invertible, this condition can be relaxed to the requirement that $\nabla G^{-1} \nabla F$ has the Cartesian $P_0$-property.

2. Preliminaries

This section recalls some results on Euclidean Jordan algebras that will be used in the subsequent section. More detailed expositions of Euclidean Jordan algebras can be found in Koecher’s lecture notes [9] and the monograph by Faraut and Korányi [3].

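A Euclidean Jordan algebra is a triple $(\mathbb{V}, \circ, \langle\cdot,\cdot\rangle_{\mathbb{V}})$, where $(\mathbb{V}, \langle\cdot,\cdot\rangle_{\mathbb{V}})$ is a finite-dimensional inner product space over the real number field $\mathbb{R}$ and $(x, y) \mapsto x \circ y: \mathbb{V} \times \mathbb{V} \to \mathbb{V}$ is a bilinear mapping satisfying the following conditions:

(i) $x \circ y = y \circ x$ for all $x, y \in \mathbb{V}$;

(ii) $x \circ (x^2 \circ y) = x^2 \circ (x \circ y)$ for all $x, y \in \mathbb{V}$, where $x^2 = x \circ x$;

(iii) $\langle x \circ y, z\rangle_{\mathbb{V}} = \langle y, x \circ z\rangle_{\mathbb{V}}$ for all $x, y, z \in \mathbb{V}$.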
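Conditions (i)-(iii) can be checked numerically in a familiar special case. The sketch below assumes $\mathbb{V}$ is the space of $3 \times 3$ real symmetric matrices with $x \circ y = \frac{1}{2}(xy + yx)$ and the trace inner product; the function names and the random test data are illustrative assumptions, not part of the paper.

```python
import numpy as np

def jordan(x, y):
    """Jordan product on real symmetric matrices: x o y = (xy + yx) / 2."""
    return 0.5 * (x @ y + y @ x)

def inner(x, y):
    """Trace inner product <x, y> = Tr(xy) on symmetric matrices."""
    return np.trace(x @ y)

def random_symmetric(n, rng):
    """Random n x n real symmetric matrix."""
    a = rng.standard_normal((n, n))
    return 0.5 * (a + a.T)

rng = np.random.default_rng(0)
x, y, z = (random_symmetric(3, rng) for _ in range(3))
x2 = jordan(x, x)

# (i) commutativity
commutative = np.allclose(jordan(x, y), jordan(y, x))
# (ii) Jordan identity
jordan_identity = np.allclose(jordan(x, jordan(x2, y)),
                              jordan(x2, jordan(x, y)))
# (iii) compatibility of the product with the inner product
associative_form = np.isclose(inner(jordan(x, y), z),
                              inner(y, jordan(x, z)))

print(commutative, jordan_identity, associative_form)
```

Note that the Jordan product is commutative but in general not associative, which is why condition (ii) is stated separately.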

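Let $\mathbb{A} = (\mathbb{V}, \circ, \langle\cdot,\cdot\rangle_{\mathbb{V}})$ denote a Euclidean Jordan algebra. We assume that there is an element $e \in \mathbb{V}$ (called the unit element) such that $x \circ e = x$ for all $x \in \mathbb{V}$. By [3, Theorem III.2.1], the set of squares $\mathcal{K} := \{x^2 \mid x \in \mathbb{V}\}$ is a symmetric cone. We write $x \succeq_{\mathcal{K}} y$ (respectively, $x \succ_{\mathcal{K}} y$) to mean $x - y \in \mathcal{K}$ (respectively, $x - y \in \operatorname{int}\mathcal{K}$).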

For $x \in \mathbb{V}$, let $m(x) := \min\{k : \{e, x, x^2, \ldots, x^k\} \text{ are linearly dependent}\}$ and define the rank of $\mathbb{A}$ by $r := \max\{m(x) : x \in \mathbb{V}\}$. Recall that an element $c \in \mathbb{V}$ is idempotent if $c^2 = c$, and it is a primitive idempotent if it is nonzero and cannot be written as a sum of two nonzero idempotents. One says that a finite set $\{c_1, c_2, \ldots, c_k\}$ of primitive idempotents in $\mathbb{V}$ is a Jordan frame if

$$c_j \circ c_i = 0 \ \text{ if } j \ne i \ \text{ for all } j, i = 1, 2, \ldots, k, \quad \text{and} \quad \sum_{j=1}^{k} c_j = e.$$

Now we may state the second version of the spectral decomposition theorem.

Theorem 2.1 ([3, Theorem III.1.2]). Let $\mathbb{A}$ be a Euclidean Jordan algebra with rank $r$. Then for every $x \in \mathbb{V}$, there exist a Jordan frame $\{c_1, c_2, \ldots, c_r\}$ and real numbers $\lambda_1(x), \ldots, \lambda_r(x)$, arranged in the decreasing order $\lambda_1(x) \ge \lambda_2(x) \ge \cdots \ge \lambda_r(x)$, such that

$$x = \lambda_1(x) c_1 + \lambda_2(x) c_2 + \cdots + \lambda_r(x) c_r.$$

The numbers $\lambda_j(x)$ (counting multiplicities), which are uniquely determined by $x$, are called the eigenvalues of $x$, and $\operatorname{tr}(x) = \sum_{j=1}^{r} \lambda_j(x)$ is called the trace of $x$.
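For real symmetric matrices, the spectral decomposition of Theorem 2.1 reduces to the usual eigendecomposition, with the Jordan frame given by the rank-one projectors onto an orthonormal eigenbasis. The following sketch illustrates this under that assumption; `spectral_decomposition` is a hypothetical helper name and the matrix `x` is sample data.

```python
import numpy as np

def jordan(x, y):
    """Jordan product on real symmetric matrices: x o y = (xy + yx) / 2."""
    return 0.5 * (x @ y + y @ x)

def spectral_decomposition(x):
    """Return eigenvalues in decreasing order and the matching Jordan frame.

    For symmetric matrices the frame elements are c_j = q_j q_j^T, where
    q_j are orthonormal eigenvectors of x.
    """
    vals, vecs = np.linalg.eigh(x)       # ascending eigenvalues
    order = np.argsort(vals)[::-1]       # reorder to decreasing
    lam = vals[order]
    frame = [np.outer(vecs[:, k], vecs[:, k]) for k in order]
    return lam, frame

x = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
lam, frame = spectral_decomposition(x)

# Reassemble x from its eigenvalues and Jordan frame.
x_rebuilt = sum(l * c for l, c in zip(lam, frame))
# The frame elements sum to the unit element e (the identity matrix).
unit = sum(frame)

print(lam, np.trace(x), lam.sum())
```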

Since, by [3, Prop. III.1.5], a Jordan algebra $(\mathbb{V}, \circ)$ with a unit element $e \in \mathbb{V}$ is Euclidean if and only if the symmetric bilinear form $\operatorname{tr}(x \circ y)$ is positive definite, we may define another inner product on $\mathbb{V}$ by

$$\langle x, y\rangle := \operatorname{tr}(x \circ y) \quad \forall x, y \in \mathbb{V}. \tag{7}$$

The inner product $\langle\cdot,\cdot\rangle$ is associative by [3, Prop. II.4.3], i.e., $\langle x, y \circ z\rangle = \langle y, x \circ z\rangle$ for any $x, y, z \in \mathbb{V}$. For any given $x \in \mathbb{V}$, let $\mathcal{L}(x)$ be the Lyapunov operator defined by

$$\mathcal{L}(x) y := x \circ y \quad \forall y \in \mathbb{V}.$$

Then, $\mathcal{L}(x)$ is symmetric with respect to the inner product $\langle\cdot,\cdot\rangle$ in the sense that

$$\langle \mathcal{L}(x) y, z\rangle = \langle y, \mathcal{L}(x) z\rangle \quad \forall y, z \in \mathbb{V}.$$

In what follows, we let $\|\cdot\|$ be the norm on $\mathbb{V}$ induced by this inner product, i.e.,

$$\|x\| := \sqrt{\langle x, x\rangle} = \sqrt{\operatorname{tr}(x^2)} = \Big(\sum_{j=1}^{r} \lambda_j^2(x)\Big)^{1/2} \quad \forall x \in \mathbb{V}. \tag{8}$$

This definition implies that the unit element $e$ in this paper has length $\sqrt{r}$.

Unless otherwise stated, in the rest of this paper, we assume that $\mathbb{A} = (\mathbb{V}, \circ, \langle\cdot,\cdot\rangle)$ is a simple Euclidean Jordan algebra of rank $r$ and dimension $n$. By [3, Theorem V.3.7], $r \ge 2$.

Let $x \in \mathbb{V}$ have the spectral decomposition $x = \sum_{j=1}^{r} \lambda_j(x) c_j$, where $\lambda_1(x) \ge \lambda_2(x) \ge \cdots \ge \lambda_r(x)$ are the eigenvalues of $x$ and $\{c_1, c_2, \ldots, c_r\}$ is the corresponding Jordan frame. By [3, Lemma IV.1.3], the operators $\mathcal{L}(c_j)$, $j = 1, 2, \ldots, r$, commute and admit a simultaneous diagonalization. For all $i, j \in \{1, 2, \ldots, r\}$, define the subspaces

$$\mathbb{V}_{ii} := \mathbb{R} c_i = \{x \in \mathbb{V} \mid x \circ c_i = x\}, \qquad \mathbb{V}_{ij} := \Big\{x \in \mathbb{V} \;\Big|\; x \circ c_i = \tfrac{1}{2} x = x \circ c_j\Big\} \ \text{ when } i \ne j,$$

and let $\mathcal{C}_{ij}(x)$ be the orthogonal projection operator onto $\mathbb{V}_{ij}$. The following lemma gives the spectral decomposition of the operator $\mathcal{L}(x)$, whose proof can be found in [9].

Lemma 2.1. Let $x \in \mathbb{V}$ have the spectral decomposition $x = \sum_{j=1}^{r} \lambda_j(x) c_j$. Then the linear symmetric operator $\mathcal{L}(x)$ has the spectral decomposition

$$\mathcal{L}(x) = \sum_{j=1}^{r} \lambda_j(x)\, \mathcal{C}_{jj}(x) + \sum_{1 \le j < l \le r} \frac{1}{2}\big(\lambda_j(x) + \lambda_l(x)\big)\, \mathcal{C}_{jl}(x) \tag{9}$$

with the spectrum $\sigma(\mathcal{L}(x))$ consisting of all distinct $\frac{1}{2}(\lambda_j(x) + \lambda_l(x))$ for $j, l = 1, \ldots, r$.
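The spectrum described in Lemma 2.1 can be observed numerically for real symmetric matrices. The sketch below, given only as an illustration, represents $\mathcal{L}(x)$ as a matrix in an orthonormal basis of the $3 \times 3$ symmetric matrices (with respect to the trace inner product) and compares its eigenvalues with the averages $\frac{1}{2}(\lambda_j(x) + \lambda_l(x))$. The helpers `sym_basis` and `lyapunov_matrix` are assumed names introduced here.

```python
import numpy as np

def jordan(x, y):
    """Jordan product on real symmetric matrices: x o y = (xy + yx) / 2."""
    return 0.5 * (x @ y + y @ x)

def sym_basis(n):
    """Orthonormal basis of n x n symmetric matrices w.r.t. <x, y> = Tr(xy)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            b = np.zeros((n, n))
            if i == j:
                b[i, i] = 1.0
            else:
                b[i, j] = b[j, i] = 1.0 / np.sqrt(2.0)
            basis.append(b)
    return basis

def lyapunov_matrix(x, basis):
    """Matrix of the Lyapunov operator L(x): y -> x o y in the given basis."""
    m = len(basis)
    mat = np.empty((m, m))
    for q, bq in enumerate(basis):
        image = jordan(x, bq)
        for p, bp in enumerate(basis):
            mat[p, q] = np.trace(bp @ image)
    return mat

x = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, -1.0],
              [0.0, -1.0, 3.0]])
lam = np.linalg.eigvalsh(x)
r = len(lam)

# Averages (lambda_j + lambda_l) / 2 for j <= l, as predicted by (9).
predicted = np.sort([0.5 * (lam[j] + lam[l])
                     for j in range(r) for l in range(j, r)])
# Eigenvalues of the matrix representing L(x).
computed = np.linalg.eigvalsh(lyapunov_matrix(x, sym_basis(r)))

print(predicted)
print(computed)
```

For symmetric matrices each off-diagonal Peirce space $\mathbb{V}_{jl}$ is one-dimensional, so the two sorted lists have the same length $r(r+1)/2$ and can be compared entry by entry.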

To close this section, we recall the smoothness of the FB merit function $\psi_{\rm FB}$ defined by (4) and (5), whose proof can be found in [14, Lemma 12] and [15, Prop. 4.3].

Lemma 2.2. Let $\psi_{\rm FB}$ be defined by (4) and (5). Then, $\psi_{\rm FB}$ is continuously differentiable everywhere. Furthermore, $\nabla_x \psi_{\rm FB}(0, 0) = \nabla_y \psi_{\rm FB}(0, 0) = 0$; and if $(x, y) \ne (0, 0)$,

$$\nabla_x \psi_{\rm FB}(x, y) = \big[\mathcal{L}(x) \mathcal{L}^{-1}(z) - \mathcal{I}\big]\, \phi_{\rm FB}(x, y),$$

$$\nabla_y \psi_{\rm FB}(x, y) = \big[\mathcal{L}(y) \mathcal{L}^{-1}(z) - \mathcal{I}\big]\, \phi_{\rm FB}(x, y),$$

where $z = (x^2 + y^2)^{1/2}$, and $\mathcal{I}$ denotes the identity operator from $\mathbb{V}$ to $\mathbb{V}$.
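The gradient formulas of Lemma 2.2 can be sanity-checked against central finite differences. The sketch below does so for $3 \times 3$ real symmetric matrices, at a point where $x^2 + y^2$ is positive definite so that $\mathcal{L}(z)$ is invertible. All function names, the sample matrices `x`, `y`, `h`, and the step size `t` are assumptions chosen for illustration.

```python
import numpy as np

def jordan(x, y):
    return 0.5 * (x @ y + y @ x)

def sym_basis(n):
    """Orthonormal basis of symmetric matrices w.r.t. <x, y> = Tr(xy)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            b = np.zeros((n, n))
            if i == j:
                b[i, i] = 1.0
            else:
                b[i, j] = b[j, i] = 1.0 / np.sqrt(2.0)
            basis.append(b)
    return basis

def lyapunov_matrix(x, basis):
    """Matrix of L(x): y -> x o y in the given orthonormal basis."""
    return np.array([[np.trace(bp @ jordan(x, bq)) for bq in basis]
                     for bp in basis])

def psd_sqrt(w):
    vals, vecs = np.linalg.eigh(w)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def psi_fb(x, y):
    p = psd_sqrt(x @ x + y @ y) - (x + y)
    return 0.5 * np.trace(p @ p)

def grad_psi_fb(x, y, basis):
    """Gradients from Lemma 2.2: [L(x) L(z)^{-1} - I] phi and the y analogue."""
    z = psd_sqrt(x @ x + y @ y)
    phi = z - (x + y)
    p = np.array([np.trace(b @ phi) for b in basis])        # coordinates of phi
    w = np.linalg.solve(lyapunov_matrix(z, basis), p)       # L(z)^{-1} phi
    gx = lyapunov_matrix(x, basis) @ w - p
    gy = lyapunov_matrix(y, basis) @ w - p
    to_matrix = lambda v: sum(c * b for c, b in zip(v, basis))
    return to_matrix(gx), to_matrix(gy)

x = np.array([[1.0, 2.0, 0.0], [2.0, -1.0, 1.0], [0.0, 1.0, 3.0]])
y = np.array([[2.0, 0.0, 1.0], [0.0, 1.0, -1.0], [1.0, -1.0, 0.5]])
h = np.array([[0.3, -0.2, 0.1], [-0.2, 0.5, 0.4], [0.1, 0.4, -0.6]])
gx, gy = grad_psi_fb(x, y, sym_basis(3))

t = 1e-5  # finite-difference step (an arbitrary small value)
fd_x = (psi_fb(x + t * h, y) - psi_fb(x - t * h, y)) / (2 * t)
fd_y = (psi_fb(x, y + t * h) - psi_fb(x, y - t * h)) / (2 * t)
an_x = np.trace(gx @ h)  # directional derivative <grad_x psi, h>
an_y = np.trace(gy @ h)
print(fd_x, an_x, fd_y, an_y)
```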

3. Main result

First of all, we present a new representation for the elements of $\mathbb{V}$. Let $\mathbb{V}_e$ denote the subspace generated by the unit element $e$, and $\mathbb{V}_e^{\perp}$ the orthogonal complement of $\mathbb{V}_e$. Note that the unit element $e$ of $\mathbb{A}$ is unique. Hence, any $x \in \mathbb{V}$ can be uniquely written as $\lambda_x e + x_e$ with $\lambda_x \in \mathbb{R}$ and $x_e \in \mathbb{V}_e^{\perp}$. Moreover, we have the following result.

Lemma 3.1. For $z = \lambda_z e + z_e \in \mathbb{V}$ with $\lambda_z \in \mathbb{R}$ and $z_e \in \mathbb{V}_e^{\perp}$, the following results hold.

(a) $\operatorname{tr}(z) = r\lambda_z$ and $\|z\|^2 = r\lambda_z^2 + \|z_e\|^2$.

(b) If $z \in \mathcal{K}$, then $\sqrt{r^2 - r}\,\lambda_z \ge \|z_e\|$. If in addition $z \ne 0$, then $\lambda_z > 0$ and $\|z_e\| > 0$.

(c) When $r = 2$, $\operatorname{tr}(\mathcal{L}^2(z_e)) = \|z_e\|^2$.

Proof. (a) The result follows directly from the definition of $\langle\cdot,\cdot\rangle$ and the fact that $\|e\|^2 = r$.

(b) Since $z \in \mathcal{K}$, we have $\operatorname{tr}(z) \ge \|z\|$. By part (a), this implies $(r^2 - r)\lambda_z^2 \ge \|z_e\|^2$, and the first part then follows. Since $r \ge 2$, from the inequality $\sqrt{r^2 - r}\,\lambda_z \ge \|z_e\|$ we obtain $\lambda_z \ge 0$, and $\|z_e\| = 0$ whenever $\lambda_z = 0$. This shows $\lambda_z > 0$ and $\|z_e\| > 0$ if $0 \ne z \in \mathcal{K}$.

(c) Since $\mathcal{L}(e) = \mathcal{I}$, we have $\mathcal{L}(z) = \lambda_z \mathcal{I} + \mathcal{L}(z_e)$, which together with Lemma 2.1 implies

$$\mathcal{L}(z_e) = \sum_{j=1}^{r} \big(\lambda_j(z) - \lambda_z\big)\, \mathcal{C}_{jj}(z) + \sum_{1 \le j < l \le r} \frac{1}{2}\big(\lambda_j(z) + \lambda_l(z) - 2\lambda_z\big)\, \mathcal{C}_{jl}(z).$$

Since $\mathcal{C}_{jl}(z)$ for all $j, l = 1, 2, \ldots, r$ are orthogonal projection operators, we have

$$\mathcal{L}^2(z_e) = \sum_{j=1}^{r} \big(\lambda_j(z) - \lambda_z\big)^2\, \mathcal{C}_{jj}(z) + \sum_{1 \le j < l \le r} \frac{1}{4}\big(\lambda_j(z) + \lambda_l(z) - 2\lambda_z\big)^2\, \mathcal{C}_{jl}(z).$$

Note that when $r = 2$, part (a) implies $\lambda_1(z) + \lambda_2(z) - 2\lambda_z = 0$, and therefore we have

$$\operatorname{tr}(\mathcal{L}^2(z_e)) = (\lambda_1(z) - \lambda_z)^2 + (\lambda_2(z) - \lambda_z)^2 = \|z\|^2 - 2\operatorname{tr}(z)\lambda_z + 2\lambda_z^2 = \|z\|^2 - 2\lambda_z^2 = \|z_e\|^2,$$

where the last equality is due to part (a). Thus, the proof is complete.
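The three statements of Lemma 3.1 can be illustrated on small symmetric matrices, where the unit element is the identity matrix and the rank equals the matrix size. The sketch below checks (a) and (b) for a $3 \times 3$ positive definite matrix and (c) for a $2 \times 2$ matrix. The helper `split_along_unit` and the sample matrices are assumptions made for this example.

```python
import numpy as np

def jordan(x, y):
    return 0.5 * (x @ y + y @ x)

def sym_basis(n):
    """Orthonormal basis of symmetric matrices w.r.t. <x, y> = Tr(xy)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            b = np.zeros((n, n))
            if i == j:
                b[i, i] = 1.0
            else:
                b[i, j] = b[j, i] = 1.0 / np.sqrt(2.0)
            basis.append(b)
    return basis

def lyapunov_matrix(x, basis):
    """Matrix of L(x): y -> x o y in the given orthonormal basis."""
    return np.array([[np.trace(bp @ jordan(x, bq)) for bq in basis]
                     for bp in basis])

def split_along_unit(z):
    """Write z = lam * e + z_e with z_e orthogonal to e (i.e. Tr(z_e) = 0)."""
    r = z.shape[0]
    lam = np.trace(z) / r
    return lam, z - lam * np.eye(r)

def sq_norm(x):
    """Squared norm induced by the trace inner product."""
    return np.trace(x @ x)

# Parts (a) and (b) on a positive definite 3 x 3 matrix (rank r = 3).
z3 = np.array([[2.0, 1.0, 0.0],
               [1.0, 3.0, 1.0],
               [0.0, 1.0, 1.0]])
r3 = 3
lam3, z3_e = split_along_unit(z3)
part_a = np.isclose(sq_norm(z3), r3 * lam3**2 + sq_norm(z3_e))
part_b = np.sqrt(r3**2 - r3) * lam3 >= np.sqrt(sq_norm(z3_e))

# Part (c) on a 2 x 2 matrix (rank r = 2).
z2 = np.array([[3.0, 1.0],
               [1.0, 1.0]])
lam2, z2_e = split_along_unit(z2)
L = lyapunov_matrix(z2_e, sym_basis(2))
part_c = np.isclose(np.trace(L @ L), sq_norm(z2_e))

print(part_a, part_b, part_c)
```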

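To achieve the main result of this paper, the key is to establish the implication

$$z^2 \succ_{\mathcal{K}} x^2 + y^2 \implies c\,[\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)] \succ [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2 \tag{10}$$

for all $x, y \in \mathbb{V}$ and $z \succ_{\mathcal{K}} 0$, where $c > 0$ is a constant and, for operators $\mathcal{G}, \mathcal{H}: \mathbb{V} \to \mathbb{V}$, $\mathcal{G} \succ \mathcal{H}$ means $\langle x, (\mathcal{G} - \mathcal{H}) x\rangle_{\mathbb{V}} > 0$ for any $0 \ne x \in \mathbb{V}$ and $\mathcal{G} \succeq \mathcal{H}$ means $\langle x, (\mathcal{G} - \mathcal{H}) x\rangle_{\mathbb{V}} \ge 0$ for any $x \in \mathbb{V}$.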

The following proposition tries to establish such an implication.

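Proposition 3.1. For any $x = \lambda_x e + x_e$, $y = \lambda_y e + y_e \in \mathbb{V}$ and $z = \lambda_z e + z_e \in \operatorname{int}\mathcal{K}$, if $r\lambda_z^2 \ge \|z_e\|^2$ and

$$\operatorname{tr}\big[\mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e)\big] \ge r^{-1}\big(\|z_e\|^2 - \|x_e\|^2 - \|y_e\|^2\big),$$

then

$$z^2 \succ_{\mathcal{K}} x^2 + y^2 \implies \mathcal{L}^2(z) - \mathcal{L}^2(x) - \mathcal{L}^2(y) \succ 0, \tag{11}$$

which is equivalent to saying that

$$z^2 \succ_{\mathcal{K}} x^2 + y^2 \implies 2\,[\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)] \succ [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2. \tag{12}$$

Moreover, the two implications remain true when "$\succ$" is replaced by "$\succeq$".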

Proof. We adopt the proof technique of [4, Prop. 3.4]. First, consider the case where $z = (x^2 + y^2 + \delta e)^{1/2}$ for some $\delta > 0$. Fix any $x, y \in \mathbb{V}$ with $x = \lambda_x e + x_e$ and $y = \lambda_y e + y_e$, where $\lambda_x, \lambda_y \in \mathbb{R}$ and $x_e, y_e \in \mathbb{V}_e^{\perp}$. From $z^2 = x^2 + y^2 + \delta e$ and $z = \lambda_z e + z_e$, we have

$$\lambda_z^2 e + 2\lambda_z z_e + z_e^2 = \lambda_x^2 e + 2\lambda_x x_e + x_e^2 + \lambda_y^2 e + 2\lambda_y y_e + y_e^2 + \delta e.$$

Noting that $z_e^2, x_e^2, y_e^2 \in \mathbb{V}_e$ and $x_e, y_e, z_e \in \mathbb{V}_e^{\perp}$, we obtain from the last equality that

$$\lambda_z z_e = \lambda_x x_e + \lambda_y y_e \quad \text{and} \quad \lambda_z^2 e + z_e^2 = \lambda_x^2 e + x_e^2 + \lambda_y^2 e + y_e^2 + \delta e. \tag{13}$$

From the first equality of (13), $\lambda_z \mathcal{L}(z_e) - \lambda_x \mathcal{L}(x_e) - \lambda_y \mathcal{L}(y_e) = 0$, which implies that

$$\mathcal{L}^2(z) - \mathcal{L}^2(x) - \mathcal{L}^2(y) = \big(\lambda_z^2 - \lambda_x^2 - \lambda_y^2\big)\mathcal{L}(e) + \mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e).$$

Thus, to prove (11), it suffices to prove that for any $0 \ne h = \lambda_h e + h_e \in \mathbb{V}$,

$$\big(\lambda_z^2 - \lambda_x^2 - \lambda_y^2\big)\|h\|^2 + \|z_e \circ h\|^2 - \|x_e \circ h\|^2 - \|y_e \circ h\|^2 > 0,$$

which, by noting that $z_e^2, x_e^2, y_e^2 \in \mathbb{V}_e$, $h = \lambda_h e + h_e$ and $h_e \in \mathbb{V}_e^{\perp}$, is equivalent to

$$\big(\lambda_z^2 - \lambda_x^2 - \lambda_y^2\big)\|h\|^2 + \lambda_h^2\big(\|z_e\|^2 - \|x_e\|^2 - \|y_e\|^2\big) + \big(\|z_e \circ h_e\|^2 - \|x_e \circ h_e\|^2 - \|y_e \circ h_e\|^2\big) > 0. \tag{14}$$

Since $\lambda_z > 0$ by Lemma 3.1(b), from the two equalities in (13) we have

$$r\lambda_z^2 + \frac{\|\lambda_x x_e + \lambda_y y_e\|^2}{\lambda_z^2} = \|z\|^2 = r\lambda_x^2 + \|x_e\|^2 + r\lambda_y^2 + \|y_e\|^2 + r\delta.$$

Multiplying the two sides by $\lambda_z^2$ and adding $\lambda_y^2\|x_e\|^2 + \lambda_x^2\|y_e\|^2$ simultaneously yields

$$\big(\lambda_x^2 + \lambda_y^2\big)\big(\|x_e\|^2 + \|y_e\|^2\big) + r\lambda_z^4 - \lambda_z^2\big(r\lambda_x^2 + \|x_e\|^2 + r\lambda_y^2 + \|y_e\|^2\big) > \|\lambda_y x_e - \lambda_x y_e\|^2,$$

which is equivalent to

$$\big(\lambda_z^2 - \lambda_x^2 - \lambda_y^2\big)\big(r\lambda_z^2 - \|x_e\|^2 - \|y_e\|^2\big) > \|\lambda_y x_e - \lambda_x y_e\|^2.$$

This means that $\lambda_z^2 - \lambda_x^2 - \lambda_y^2$ and $r\lambda_z^2 - \|x_e\|^2 - \|y_e\|^2$ are either both positive or both negative. If both are negative, we must have $\|x\|^2 + \|y\|^2 > 2r\lambda_z^2$, which by the assumption $r\lambda_z^2 \ge \|z_e\|^2$ yields the contradiction

$$\|z\|^2 > \|x\|^2 + \|y\|^2 > 2r\lambda_z^2 \ge r\lambda_z^2 + \|z_e\|^2 = \|z\|^2.$$

Thus, we get

$$\lambda_z^2 > \lambda_x^2 + \lambda_y^2 \quad \text{and} \quad r\lambda_z^2 > \|x_e\|^2 + \|y_e\|^2. \tag{15}$$

Using the first equality of (13) and the second inequality of (15), for any $s_e \in \mathbb{V}_e^{\perp}$,

$$\begin{aligned}
\big\langle s_e, \big[\mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e)\big] s_e \big\rangle
&= \|z_e \circ s_e\|^2 - \|x_e \circ s_e\|^2 - \|y_e \circ s_e\|^2 \\
&= \frac{\|\lambda_x x_e \circ s_e + \lambda_y y_e \circ s_e\|^2}{\lambda_z^2} - \big(\|x_e \circ s_e\|^2 + \|y_e \circ s_e\|^2\big) \\
&= \big(\lambda_x^2 + \lambda_y^2 - \lambda_z^2\big)\frac{\|x_e \circ s_e\|^2 + \|y_e \circ s_e\|^2}{\lambda_z^2} - \frac{\|s_e \circ (\lambda_x y_e - \lambda_y x_e)\|^2}{\lambda_z^2} \le 0.
\end{aligned}$$

This shows that $\mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e)$ is negative semidefinite on $\mathbb{V}_e^{\perp}$. Therefore,

$$\begin{aligned}
\|z_e \circ h_e\|^2 - \|x_e \circ h_e\|^2 - \|y_e \circ h_e\|^2
&= \big\langle h_e, \big[\mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e)\big] h_e \big\rangle \\
&\ge \big\langle h_e, \operatorname{tr}\big(\mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e)\big)\, \mathcal{I} h_e \big\rangle \\
&\ge r^{-1}\|h_e\|^2\big(\|z_e\|^2 - \|x_e\|^2 - \|y_e\|^2\big),
\end{aligned}$$

where the last inequality is due to the given assumption. Along with (15), we have that

$$\begin{aligned}
&\big(\lambda_z^2 - \lambda_x^2 - \lambda_y^2\big)\|h\|^2 + \lambda_h^2\big(\|z_e\|^2 - \|x_e\|^2 - \|y_e\|^2\big) + \big(\|z_e \circ h_e\|^2 - \|x_e \circ h_e\|^2 - \|y_e \circ h_e\|^2\big) \\
&\quad \ge \lambda_h^2\big(\|z\|^2 - \|x\|^2 - \|y\|^2\big) + r^{-1}\|h_e\|^2\big(\|z\|^2 - \|x\|^2 - \|y\|^2\big) \\
&\quad = r^{-1}\|h\|^2\big(\|z\|^2 - \|x\|^2 - \|y\|^2\big) > 0.
\end{aligned}$$

This shows that (14) holds, and the implication in (11) is true for any $x, y \in \mathbb{V}$ and $z = (x^2 + y^2 + \delta e)^{1/2}$. Using the same arguments as in [4, Prop. 3.4] yields that (11) holds.

We next prove that the implication in (11) is equivalent to that of (12). Suppose that the implication in (11) holds. Fix any $0 \ne h \in \mathbb{V}$. By the symmetry of $\mathcal{L}(x)$, clearly,

$$\big\langle h, [\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)]\, h \big\rangle = \big\langle h, [\mathcal{L}(z) - \mathcal{L}(y)][\mathcal{L}(z) - \mathcal{L}(x)]\, h \big\rangle.$$

Let $\mathcal{S}(x, y)$ denote the symmetric part of $[\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)]$. Then,

$$\big\langle h, [\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)]\, h \big\rangle = \langle h, \mathcal{S}(x, y)\, h\rangle.$$

Using the definition of $\mathcal{S}(x, y)$, a simple computation yields that

$$\begin{aligned}
\mathcal{S}(x, y) &= \frac{1}{2}[\mathcal{L}(y) - \mathcal{L}(z)][\mathcal{L}(x) - \mathcal{L}(z)] + \frac{1}{2}[\mathcal{L}(x) - \mathcal{L}(z)][\mathcal{L}(y) - \mathcal{L}(z)] \\
&= \frac{1}{2}[\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2 + \frac{1}{2}\big(\mathcal{L}^2(z) - \mathcal{L}^2(x) - \mathcal{L}^2(y)\big).
\end{aligned}$$

The last two equations along with Eq. (11) imply the implication in (12). Conversely, if the implication in (12) holds, from the last two equations we obtain the implication in (11). The last part follows by the continuity of the operators.
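The operator identity for $\mathcal{S}(x, y)$ used in the equivalence of (11) and (12) is purely algebraic, so it can be confirmed numerically with matrix representations of the Lyapunov operators. The sketch below does this for $3 \times 3$ real symmetric matrices; the helper names, the random data, and the value `delta = 0.5` are illustrative assumptions.

```python
import numpy as np

def jordan(x, y):
    return 0.5 * (x @ y + y @ x)

def sym_basis(n):
    """Orthonormal basis of symmetric matrices w.r.t. <x, y> = Tr(xy)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            b = np.zeros((n, n))
            if i == j:
                b[i, i] = 1.0
            else:
                b[i, j] = b[j, i] = 1.0 / np.sqrt(2.0)
            basis.append(b)
    return basis

def lyapunov_matrix(x, basis):
    """Matrix of L(x): y -> x o y in the given orthonormal basis."""
    return np.array([[np.trace(bp @ jordan(x, bq)) for bq in basis]
                     for bp in basis])

def psd_sqrt(w):
    vals, vecs = np.linalg.eigh(w)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def random_symmetric(n, rng):
    a = rng.standard_normal((n, n))
    return 0.5 * (a + a.T)

rng = np.random.default_rng(0)
n, delta = 3, 0.5
x, y = random_symmetric(n, rng), random_symmetric(n, rng)
z = psd_sqrt(x @ x + y @ y + delta * np.eye(n))

basis = sym_basis(n)
Lx, Ly, Lz = (lyapunov_matrix(m, basis) for m in (x, y, z))

A, B = Lz - Lx, Lz - Ly
S = 0.5 * (A @ B + B @ A)            # symmetric part of [L(z)-L(x)][L(z)-L(y)]
M = Lz - Lx - Ly
rhs = 0.5 * (M @ M) + 0.5 * (Lz @ Lz - Lx @ Lx - Ly @ Ly)

print(np.max(np.abs(S - rhs)))       # discrepancy between the two sides
```

Rearranging the identity gives $2\mathcal{S}(x, y) - [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2 = \mathcal{L}^2(z) - \mathcal{L}^2(x) - \mathcal{L}^2(y)$, which is exactly why (11) and (12) are equivalent.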

By Lemma 3.1(b)–(c), when $r = 2$, the assumptions of Proposition 3.1 automatically hold, and we recover the first implication of [4, Prop. 3.4], or its equivalent result as below.

Corollary 3.1. Suppose that $r = 2$. Then, for any $x, y \in \mathbb{V}$ and $z \succ_{\mathcal{K}} 0$, it holds that

$$z^2 \succ_{\mathcal{K}} x^2 + y^2 \implies 2\,[\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)] \succ [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2. \tag{16}$$

Moreover, the implication remains true when "$\succ$" is replaced by "$\succeq$".

When $r \ge 3$, the assumption $r\lambda_z^2 \ge \|z_e\|^2$ in Proposition 3.1 may not hold. Also, it is hard to verify whether $\operatorname{tr}\big(\mathcal{L}^2(z_e) - \mathcal{L}^2(x_e) - \mathcal{L}^2(y_e)\big) \ge r^{-1}\big(\|z_e\|^2 - \|x_e\|^2 - \|y_e\|^2\big)$ holds or not. In other words, it is difficult to achieve our goal for $r \ge 3$ by means of Proposition 3.1.

However, as will be shown by Proposition 3.2, an implication as in (10) can be established for $r \ge 3$ by extending the proof of [18, Lemma 6.3(c)] to another three classes of matrix algebras.

Proposition 3.2. Suppose that $r \ge 3$. Then, for any $x, y \in \mathbb{V}$ and $z \succ_{\mathcal{K}} 0$,

$$z^2 \succ_{\mathcal{K}} x^2 + y^2 \implies 4\,[\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)] \succ [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2. \tag{17}$$

Moreover, the implication remains true when "$\succ$" is replaced by "$\succeq$".
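A numerical spot check of (17) is possible in the simplest rank-3 case, the $3 \times 3$ real symmetric matrices. The sketch below builds $z = (x^2 + y^2 + \delta e)^{1/2}$ so that $z^2 \succ_{\mathcal{K}} x^2 + y^2$ holds by construction, and then inspects the smallest eigenvalue of the symmetric part of $4[\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)] - [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2$. It is only an illustration on random data, not a substitute for the proof; the helper names, the seed, and `delta = 0.5` are assumptions.

```python
import numpy as np

def jordan(x, y):
    return 0.5 * (x @ y + y @ x)

def sym_basis(n):
    """Orthonormal basis of symmetric matrices w.r.t. <x, y> = Tr(xy)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            b = np.zeros((n, n))
            if i == j:
                b[i, i] = 1.0
            else:
                b[i, j] = b[j, i] = 1.0 / np.sqrt(2.0)
            basis.append(b)
    return basis

def lyapunov_matrix(x, basis):
    """Matrix of L(x): y -> x o y in the given orthonormal basis."""
    return np.array([[np.trace(bp @ jordan(x, bq)) for bq in basis]
                     for bp in basis])

def psd_sqrt(w):
    vals, vecs = np.linalg.eigh(w)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gap_min_eigenvalue(x, y, delta, factor=4.0):
    """Smallest eigenvalue of sym(factor * A B) - M^2 for z^2 = x^2 + y^2 + delta e."""
    n = x.shape[0]
    z = psd_sqrt(x @ x + y @ y + delta * np.eye(n))
    basis = sym_basis(n)
    Lx, Ly, Lz = (lyapunov_matrix(m, basis) for m in (x, y, z))
    A, B, M = Lz - Lx, Lz - Ly, Lz - Lx - Ly
    gap = factor * 0.5 * (A @ B + B @ A) - M @ M
    return np.linalg.eigvalsh(gap).min()

rng = np.random.default_rng(1)
min_eigs = []
for _ in range(5):
    a, b = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
    x, y = 0.5 * (a + a.T), 0.5 * (b + b.T)
    min_eigs.append(gap_min_eigenvalue(x, y, delta=0.5))

print(min_eigs)  # all entries should be positive if (17) holds on these samples
```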

Proof. By [3, Theorem V.3.7], it suffices to prove this result for the following algebras:

(i) The algebra $\mathbb{S}_n$ of all $n \times n$ real symmetric matrices;

(ii) The algebra $\mathbb{H}_n$ of all $n \times n$ complex Hermitian matrices;

(iii) The algebra $\mathbb{Q}_n$ of all $n \times n$ quaternionic Hermitian matrices;

(iv) The algebra $\mathbb{O}_3$ of all $3 \times 3$ octonionic Hermitian matrices.

These four classes of matrix algebras are equipped with the Jordan product $x \circ y := \frac{1}{2}(xy + yx)$ and the trace inner product

$$\langle x, y\rangle_T := \Re\, \mathrm{Tr}(x y^*),$$

where the notation "$*$" means the conjugate transpose, $\mathrm{Tr}(xy)$ denotes the trace of $xy$, the matrix product of $x$ and $y$, and $\Re\, a$ means the real part of $a$.

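Let $\mathbb{C}$, $\mathbb{Q}$ and $\mathbb{O}$ denote the complex number field, the quaternion field and the octonion field, respectively. Let $\mathbb{W}$ be the algebra of $n \times n$ matrices with entries in $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{Q}$, or the algebra of $3 \times 3$ matrices with entries in $\mathbb{O}$, equipped with the inner product $\langle\cdot,\cdot\rangle_T$ and the norm $\|\cdot\|_T$ induced by $\langle\cdot,\cdot\rangle_T$. By [3, Propositions V.1.2, V.1.5 and V.2.1], it is not difficult to verify that for any $u, v, w \in \mathbb{W}$,

$$\Re\, \mathrm{Tr}[(wu)(vw)] = \Re\, \mathrm{Tr}[w(uvw)] = \Re\, \mathrm{Tr}[wuvw] = \Re\, \mathrm{Tr}[w(uv)w], \tag{18}$$

$$\Re\, \mathrm{Tr}[w(uv)w] = \Re\, \mathrm{Tr}[w(vu)w] \quad \text{if } u, v, w \text{ are Hermitian}, \tag{19}$$

and

$$\Re\, \mathrm{Tr}(uv) = \Re\, \mathrm{Tr}(uv^*) \quad \text{if } u \text{ is Hermitian}. \tag{20}$$

Also, by [3, Prop. V.2.1] we may verify that $\langle \mathcal{L}(x) y, z\rangle_T = \langle y, \mathcal{L}(x) z\rangle_T$ for all $x$, $y$ and $z$ from the space $\mathbb{S}_n$, or $\mathbb{H}_n$, or $\mathbb{Q}_n$, or $\mathbb{O}_3$. Fix any $x$ and $y$ from $\mathbb{S}_n$, or $\mathbb{H}_n$, or $\mathbb{Q}_n$, or $\mathbb{O}_3$. Since $z^2 \succ_{\mathcal{K}} x^2 + y^2$, from the Löwner–Heinz inequality in [13] it follows that

$$z \succ_{\mathcal{K}} x \quad \text{and} \quad z \succ_{\mathcal{K}} y. \tag{21}$$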

Fix any $0 \ne a$ from the same space as $x$ and $y$. From the above discussions, we have

$$\begin{aligned}
4\big\langle a, [\mathcal{L}(z) - \mathcal{L}(x)][\mathcal{L}(z) - \mathcal{L}(y)]\, a \big\rangle_T
&= 4\big\langle (z - x) \circ a, (z - y) \circ a \big\rangle_T \\
&= \big\langle a(z - x) + (z - x)a,\; a(z - y) + (z - y)a \big\rangle_T \\
&= 2\,\Re\, \mathrm{Tr}\big[\big(a(z - x) + (z - x)a\big)\big((z - y)a\big)\big] \\
&= 2\,\Re\, \mathrm{Tr}\big[a(z - x)(z - y)a + (z - x)a(z - y)a\big] \\
&= 2\,\Re\, \mathrm{Tr}\big[a(z^2 - zy - xz + xy)a\big] \\
&\quad + 2\,\Re\, \mathrm{Tr}\big[(z - x)^{1/2}(z - x)^{1/2} a (z - y)^{1/2}(z - y)^{1/2} a\big] \\
&> \Re\, \mathrm{Tr}\big[a(2xy - 2zx - 2zy + z^2 + x^2 + y^2)a\big] \\
&\quad + 2\,\Re\, \mathrm{Tr}\big[(z - x)^{1/2} a (z - y)^{1/2}(z - y)^{1/2} a (z - x)^{1/2}\big] \\
&= \Re\, \mathrm{Tr}\big[a(z - x - y)^2 a\big] + 2\big\|(z - x)^{1/2} a (z - y)^{1/2}\big\|_T^2 \\
&\ge \Re\, \mathrm{Tr}\big[a(z - x - y)^2 a\big] \\
&= \Re\, \mathrm{Tr}\big[\big(a(z - x - y)\big)\big((z - x - y)a\big)\big] \\
&= \Re\, \mathrm{Tr}\big[a(z - x - y)\big(a(z - x - y)\big)^*\big] \\
&= \|(z - x - y)a\|_T^2,
\end{aligned}$$

where the first equality is by the symmetry of $\mathcal{L}(\cdot)$ with respect to $\langle\cdot,\cdot\rangle_T$, the third is due to (20) and the fact that $a(z - x) + (z - x)a$ is Hermitian, the fourth is by (18), the fifth is by (18) and (21), and the first inequality uses $z^2 \succ_{\mathcal{K}} x^2 + y^2$. On the other hand,

$$\begin{aligned}
\big\langle a, [\mathcal{L}(z) - \mathcal{L}(x) - \mathcal{L}(y)]^2 a \big\rangle_T
&= \big\langle (z - x - y) \circ a, (z - x - y) \circ a \big\rangle_T \\
&= \Re\, \mathrm{Tr}\big[\big((z - x - y) \circ a\big)\big((z - x - y) \circ a\big)\big] \\
&\le \Re\, \mathrm{Tr}\big[\big((z - x - y) \circ a\big)\big((z - x - y) \circ a\big)^*\big] \\
&= \|(z - x - y) \circ a\|_T^2 \\
&= \frac{\|(z - x - y)a + a(z - x - y)\|_T^2}{4} \\
&\le \|(z - x - y)a\|_T^2,
\end{aligned}$$