to appear in Optimization, 2016
Further relationship between second-order cone and positive semidefinite cone
Jinchuan Zhou^1
Department of Mathematics, School of Science
Shandong University of Technology, Zibo 255049, P.R. China
E-mail: jinchuanzhou@163.com

Jingyong Tang^2
College of Mathematics and Information Science
Xinyang Normal University, Xinyang 464000, Henan, P.R. China
E-mail: jingyongtang@163.com

Jein-Shan Chen^3
Department of Mathematics, National Taiwan Normal University
Taipei 11677, Taiwan
E-mail: jschen@math.ntnu.edu.tw
February 2, 2016 (1st revised on May 2, 2016) (2nd revised on July 21, 2016)
Abstract. It is well known that second-order cone programming can be regarded as a special case of positive semidefinite programming by using the arrow matrix. This paper further studies the relationship between second-order cones and positive semidefinite matrix cones. In particular, we explore this relationship through expressions regarding distance, projection, tangent cone, normal cone, and the KKT system. Understanding these relationships helps us to see the connection and difference between the SOC and its PSD reformulation more clearly.

^1 The author's work is supported by National Natural Science Foundation of China (11101248, 11271233) and Shandong Province Natural Science Foundation (ZR2010AQ026).
^2 The author's work is supported by Basic and Frontier Technology Research Project of Henan Province (162300410071).
^3 Corresponding author. The author's work is supported by Ministry of Science and Technology, Taiwan.
Keywords. Positive semidefinite matrix cone, second-order cone, projection, tangent cone, normal cone, KKT system.
AMS subject classifications. 90C25; 90C22.
1 Introduction
The second-order cone (SOC) in $\mathbb{R}^n$, also called the Lorentz cone, is defined as
\[
\mathcal{K}^n := \left\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \,\middle|\, x_1 \ge \|x_2\| \right\}, \tag{1}
\]
where $\|\cdot\|$ denotes the Euclidean norm. If $n = 1$, $\mathcal{K}^n$ is the set of nonnegative reals $\mathbb{R}_+$. The positive semidefinite matrix cone (PSD cone), denoted by $\mathcal{S}^n_+$, is the collection of all symmetric positive semidefinite matrices in $\mathbb{R}^{n \times n}$, i.e.,
\begin{align*}
\mathcal{S}^n_+ &:= \left\{ X \in \mathbb{R}^{n \times n} \,\middle|\, X \in \mathcal{S}^n \text{ and } X \succeq O \right\} \\
&:= \left\{ X \in \mathbb{R}^{n \times n} \,\middle|\, X = X^T \text{ and } v^T X v \ge 0 \ \forall v \in \mathbb{R}^n \right\}.
\end{align*}
It is well known that the second-order cone and the positive semidefinite matrix cone both belong to the category of symmetric cones [7], which are unified under Euclidean Jordan algebra.
In [1], for each vector $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, an arrow-shaped matrix $L_x$ (alternatively called an arrow matrix and denoted by $\mathrm{Arw}(x)$) is defined as
\[
L_x := \begin{pmatrix} x_1 & x_2^T \\ x_2 & x_1 I_{n-1} \end{pmatrix}. \tag{2}
\]
It can be verified that there is a close relationship between the SOC and the PSD cone as below:
\[
x \in \mathcal{K}^n \iff L_x := \begin{pmatrix} x_1 & x_2^T \\ x_2 & x_1 I_{n-1} \end{pmatrix} \succeq O. \tag{3}
\]
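The equivalence (3) is easy to check numerically. Below is a minimal numpy sketch; the helper names arrow, in_soc, and is_psd are ours, not from the paper.

```python
import numpy as np

def arrow(x):
    """Arrow matrix L_x of x = (x1, x2), as in (2)."""
    x = np.asarray(x, dtype=float)
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def in_soc(x, tol=1e-12):
    """x in K^n  <=>  x1 >= ||x2||."""
    return x[0] >= np.linalg.norm(x[1:]) - tol

def is_psd(M, tol=1e-12):
    """M in S^n_+, tested via the smallest eigenvalue."""
    return np.min(np.linalg.eigvalsh(M)) >= -tol

x = np.array([1.0, 2.0, 0.0])       # outside K^3 since 1 < ||(2, 0)||
print(in_soc(x), is_psd(arrow(x)))  # False False, matching (3)
```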
Hence, a second-order cone program (SOCP) can be recast as a special semidefinite program (SDP). In light of this, it seems that we need only focus on SDP. Nevertheless, this reformulation has some disadvantages. For example, the reference [11] indicates that

"Solving SOCPs via SDP is not a good idea, however. Interior-point methods that solve the SOCP directly have a much better worst-case complexity than an SDP method.... The difference between these numbers is significant if the dimensions of the second-order constraints are large."

This comment mainly concerns the algorithmic aspects; see [1, 11] for more information.
In fact, "reformulation" is the main idea behind many approaches to studying various optimization problems, and it is necessary to discuss the relationship between the primal problem and the transformed problem. For example, complementarity problems (or variational inequality problems) can be reformulated as minimization problems via merit functions (or gap functions). The properties of merit functions ensure that the solutions to the complementarity problem coincide with the global optimal solutions to the minimization problem. Nonetheless, finding a global optimal solution is very difficult. Thus, one turns to studying the connection between the solutions to the complementarity problem and the stationary points of the transformed optimization problem. Similarly, for mathematical programming with complementarity constraints (MPCC), the ordinary KKT conditions do not hold, because the standard constraint qualification fails (due to the presence of the complementarity constraints). One therefore recasts MPCC as other types of optimization problems via different approaches. These approaches also ensure that the solution set of MPCC is the same as that of the transformed optimization problems. However, the KKT conditions for these transformed problems differ, which is the source of the various concepts of stationary points for MPCC, such as S-, M-, and C-stationary points.
A similar question arises for SOCP and its SDP reformulation. In view of the above discussion, it is interesting to study their relation from both theoretical and numerical aspects. As mentioned above, the reference [11] mainly deals with the SOCP and its SDP reformulation from the algorithmic perspective; studies of the relationship between an SOCP and its corresponding SDP from the theoretical aspect are rare. Sim and Zhao [13] discuss the relation between SOCP and its SDP counterpart from the perspective of duality theory. There are already some known relations between the SOC and the PSD cone; for instance,
(a) $x \in \operatorname{int} \mathcal{K}^n \iff L_x \in \operatorname{int} \mathcal{S}^n_+$;

(b) $x = 0 \iff L_x = O$;

(c) $x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\} \iff L_x \in \operatorname{bd} \mathcal{S}^n_+ \setminus \{O\}$.
Besides the interior and boundary point sets, we know that for an optimization problem some other structures, such as tangent cones, normal cones, projections, and KKT systems, play very important roles. One may wonder whether there exist analogous relationships between the SOC and the PSD cone. We answer this question in this paper. In particular, by comparing the expressions of distance, projection, tangent cone, normal cone, and the KKT system between the SOC and the PSD cone, we learn more about the differences between an SOCP and its SDP reformulation.
2 Preliminaries
In this section, we introduce some background materials that will be used in the subsequent analysis. If we equip the space of matrices with the trace inner product and the Frobenius norm
\[
\langle X, Y \rangle_F := \operatorname{tr}(X^T Y), \qquad \|X\|_F := \sqrt{\langle X, X \rangle_F},
\]
then, for any $X \in \mathcal{S}^n$, its (repeated) eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_n$ are real and it admits a spectral decomposition of the form
\[
X = P \operatorname{diag}[\lambda_1, \lambda_2, \cdots, \lambda_n] P^T \tag{4}
\]
for some $P \in \mathcal{O}$. Here $\mathcal{O}$ denotes the set of orthogonal matrices $P \in \mathbb{R}^{n \times n}$, i.e., those with $P^T = P^{-1}$.
The factorization (4) is the well-known spectral decomposition (eigenvalue decomposition) in matrix analysis [9]. There is a similar spectral decomposition associated with $\mathcal{K}^n$. To see this, we first introduce the so-called Jordan product. For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ and $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, their Jordan product [7] is defined by
\[
x \circ y := \left( \langle x, y \rangle, \; y_1 x_2 + x_1 y_2 \right).
\]
Since the Jordan product, unlike scalar or matrix multiplication, is not associative, it is a main source of complication in the analysis of the second-order cone complementarity problem (SOCCP). The identity element under this product is $e := (1, 0, \cdots, 0)^T \in \mathbb{R}^n$. It can be verified that the arrow matrix $L_x$ defines a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$ given by $L_x y = x \circ y$. Each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral decomposition [4, 5, 6, 7] associated with $\mathcal{K}^n$ of the form
\[
x = \lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}, \tag{5}
\]
where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are the spectral values and the corresponding spectral vectors of $x$, respectively, given by
\[
\lambda_i(x) := x_1 + (-1)^i \|x_2\| \quad \text{and} \quad u_x^{(i)} := \frac{1}{2} \begin{pmatrix} 1 \\ (-1)^i \bar{x}_2 \end{pmatrix}, \quad i = 1, 2, \tag{6}
\]
with $\bar{x}_2 = x_2 / \|x_2\|$ if $x_2 \neq 0$, and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{n-1}$ with $\|\bar{x}_2\| = 1$. When $x_2 \neq 0$, the spectral decomposition is unique.
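The decomposition (5)-(6) is straightforward to implement; here is a small sketch (the function name spectral and the tie-breaking choice for $x_2 = 0$ are ours).

```python
import numpy as np

def spectral(x):
    """Spectral values and vectors of x with respect to K^n, as in (5)-(6)."""
    x = np.asarray(x, dtype=float)
    nx2 = np.linalg.norm(x[1:])
    # when x2 = 0 any unit vector works; we arbitrarily pick e_1
    x2bar = x[1:] / nx2 if nx2 > 0 else np.eye(x.size - 1)[0]
    lam = np.array([x[0] - nx2, x[0] + nx2])
    u1 = 0.5 * np.concatenate(([1.0], -x2bar))
    u2 = 0.5 * np.concatenate(([1.0], x2bar))
    return lam, u1, u2

lam, u1, u2 = spectral(np.array([1.0, 2.0, 0.0]))
assert np.allclose(lam[0] * u1 + lam[1] * u2, [1.0, 2.0, 0.0])  # (5) holds
```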
The following lemma states the relation between the spectral decomposition of $x$ and the eigenvalue decomposition of $L_x$.

Lemma 2.1. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ have the spectral decomposition given as in (5)-(6). Then, $L_x$ has the eigenvalue decomposition
\[
L_x = P \operatorname{diag}\left[ \lambda_1(x), \lambda_2(x), x_1, \cdots, x_1 \right] P^T,
\]
where
\[
P = \begin{bmatrix} \sqrt{2}\, u_x^{(1)} & \sqrt{2}\, u_x^{(2)} & u_x^{(3)} & \cdots & u_x^{(n)} \end{bmatrix} \in \mathbb{R}^{n \times n}
\]
is an orthogonal matrix, and $u_x^{(i)}$ for $i = 3, \cdots, n$ have the form $(0, \bar{u}_i)$ with $\bar{u}_3, \ldots, \bar{u}_n$ being any unit vectors in $\mathbb{R}^{n-1}$ that span the linear subspace orthogonal to $x_2$.
Proof. Please refer to [5, 6, 8]. □
From Lemma 2.1, it is not hard to calculate the inverse of $L_x$ whenever it exists:
\[
L_x^{-1} = \frac{1}{\det(x)} \begin{pmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{pmatrix}, \tag{7}
\]
where $\det(x) := x_1^2 - \|x_2\|^2$ denotes the determinant of $x$.
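Formula (7) can be sanity-checked against a generic matrix inverse; the sketch below (function name ours) reuses the arrow helper from the earlier snippet.

```python
import numpy as np
# assumes arrow(x) from the earlier sketch

def arrow_inv(x):
    """Closed-form inverse of L_x from (7); needs det(x) != 0 and x1 != 0."""
    x = np.asarray(x, dtype=float)
    x1, x2 = x[0], x[1:]
    det = x1**2 - x2 @ x2
    n = x.size
    M = np.empty((n, n))
    M[0, 0], M[0, 1:], M[1:, 0] = x1, -x2, -x2
    M[1:, 1:] = (det / x1) * np.eye(n - 1) + np.outer(x2, x2) / x1
    return M / det

x = np.array([3.0, 1.0, 1.0])  # det(x) = 7 > 0, so L_x is invertible
assert np.allclose(arrow_inv(x), np.linalg.inv(arrow(x)))
```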
Throughout the whole paper, we use $\Pi_C(\cdot)$ to denote the projection mapping onto a closed convex set $C$. In addition, for $\alpha \in \mathbb{R}$, $(\alpha)_+ := \max\{\alpha, 0\}$ and $(\alpha)_- := \min\{\alpha, 0\}$. Given a nonempty subset $A$ of $\mathbb{R}^n$, we define $AA^T := \{ uu^T \,|\, u \in A \}$ and $L_A := \{ L_u \,|\, u \in A \}$, respectively. We denote by $\Lambda^n$ the set of all arrow-shaped matrices and by $\Lambda^n_+$ the set of all positive semidefinite arrow matrices, i.e.,
\[
\Lambda^n := \{ L_y \in \mathbb{R}^{n \times n} \,|\, y \in \mathbb{R}^n \} \quad \text{and} \quad \Lambda^n_+ := \{ L_y \succeq O \,|\, y \in \mathbb{R}^n \}.
\]
Lemma 2.2. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ have the spectral decomposition given as in (5)-(6). Then, the following hold:

(a) $\Pi_{\mathcal{K}^n}(x) = (x_1 - \|x_2\|)_+ u_x^{(1)} + (x_1 + \|x_2\|)_+ u_x^{(2)}$;

(b) $\Pi_{\mathcal{S}^n_+}(L_x) = P \begin{pmatrix} (x_1 - \|x_2\|)_+ & 0 & 0 \\ 0 & (x_1 + \|x_2\|)_+ & 0 \\ 0 & 0 & (x_1)_+ I_{n-2} \end{pmatrix} P^T$, where $P$ is an orthogonal matrix that diagonalizes $L_x$ as in Lemma 2.1.

Proof. Please see [8, 15] for a proof. □
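Both projections in Lemma 2.2 are one-liners once the decompositions are available; a sketch (helper names ours), reusing arrow and spectral from the earlier snippets:

```python
import numpy as np
# assumes arrow(x) and spectral(x) from the earlier sketches

def proj_soc(x):
    """Projection onto K^n via Lemma 2.2(a): clip negative spectral values."""
    lam, u1, u2 = spectral(x)
    return max(lam[0], 0.0) * u1 + max(lam[1], 0.0) * u2

def proj_psd(M):
    """Projection onto S^n_+: clip negative eigenvalues of M."""
    w, Q = np.linalg.eigh(M)
    return Q @ np.diag(np.maximum(w, 0.0)) @ Q.T

x = np.array([1.0, 2.0, 0.0])
print(proj_soc(x))           # [1.5 1.5 0. ]
print(proj_psd(arrow(x)))    # diagonal (1.5, 1.5, 1): not an arrow matrix
```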
3 Relation on Distance and Projection
In this section, we show the relation on distance and projection associated with the SOC and the PSD cone. We begin with some explanation of why we need to do so. First, let us consider the projection of $x$ onto $\mathcal{K}^n$. In light of the relationship (3) between the SOC and the PSD cone, one may ask "Can we obtain the expression of the projection $\Pi_{\mathcal{K}^n}(x)$ by using $\Pi_{\mathcal{S}^n_+}(L_x)$, the projection of $L_x$ onto $\mathcal{S}^n_+$?". In other words,
\[
\text{Is } \ \Pi_{\mathcal{K}^n}(x) = L^{-1}\left( \Pi_{\mathcal{S}^n_+}(L_x) \right) \ \text{ or } \ \Pi_{\mathcal{S}^n_+}(L_x) = L\left( \Pi_{\mathcal{K}^n}(x) \right) \ \text{ right?} \tag{8}
\]
Here the operator $L$, defined by $L(x) := L_x$, is a single-valued mapping from $\mathbb{R}^n$ to $\mathcal{S}^n$, and $L^{-1}$ is the inverse mapping of $L$, which can be obtained as in (7). To see this, take $x = (1, 2, 0) \in \mathbb{R}^3$; then applying Lemma 2.1 yields
\[
L_x = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
. Hence, by Lemma 2.2, we have
ΠS3
+(Lx) =
√1 2
√1 2 0
−√12 √12 0
0 0 1
0 0 0 0 3 0 0 0 1
√1
2 −√12 0
√1 2
√1 2 0
0 0 1
=
3 2
3 2 0
3 2
3 2 0 0 0 1
,
which is not of the form of an arrow matrix as in (2), because its diagonal entries are not all equal. This means that we cannot find a vector $y$ such that $L_y = \Pi_{\mathcal{S}^n_+}(L_x)$. Note that
\[
\Pi_{\mathcal{K}^n}(x) = (1 + 2) \cdot \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{3}{2} \\ \frac{3}{2} \\ 0 \end{pmatrix},
\]
which gives
\[
L\left( \Pi_{\mathcal{K}^n}(x) \right) = \begin{pmatrix} \frac{3}{2} & \frac{3}{2} & 0 \\ \frac{3}{2} & \frac{3}{2} & 0 \\ 0 & 0 & \frac{3}{2} \end{pmatrix}.
\]
Hence $\Pi_{\mathcal{K}^n}(x) \neq L^{-1}(\Pi_{\mathcal{S}^n_+}(L_x))$ and $\Pi_{\mathcal{S}^n_+}(L_x) \neq L(\Pi_{\mathcal{K}^n}(x))$. The distances $\operatorname{dist}(x, \mathcal{K}^n)$ and $\operatorname{dist}(L_x, \mathcal{S}^3_+)$ are also different, since
\[
\operatorname{dist}(x, \mathcal{K}^n) = \|x - \Pi_{\mathcal{K}^n}(x)\| = \left\| \begin{pmatrix} -\frac{1}{2} \\ \frac{1}{2} \\ 0 \end{pmatrix} \right\| = \frac{\sqrt{2}}{2}
\]
and
\[
\operatorname{dist}(L_x, \mathcal{S}^n_+) = \|L_x - \Pi_{\mathcal{S}^n_+}(L_x)\|_F = \left\| \begin{pmatrix} -\frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & -\frac{1}{2} & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\|_F = 1.
\]
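The two distances computed above can be reproduced with the projection helpers from the earlier sketches:

```python
import numpy as np
# assumes arrow, proj_soc, proj_psd from the earlier sketches

x = np.array([1.0, 2.0, 0.0])
print(np.linalg.norm(x - proj_soc(x)))                       # 0.7071... = sqrt(2)/2
print(np.linalg.norm(arrow(x) - proj_psd(arrow(x)), 'fro'))  # 1.0
```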
The failure of the above approach comes from the fact that the PSD cone is much larger; i.e., there exist positive semidefinite matrices that are not arrow-shaped. Consequently, we may ask whether (8) holds if we restrict the positive semidefinite matrices to arrow-shaped matrices. Still for $x = (1, 2, 0)$, by the expression given in Theorem 3.1 below, we know that
\[
\Pi_{\Lambda^n_+}(L_x) = \begin{pmatrix} \frac{7}{5} & \frac{7}{5} & 0 \\ \frac{7}{5} & \frac{7}{5} & 0 \\ 0 & 0 & \frac{7}{5} \end{pmatrix},
\]
which implies $L^{-1}(\Pi_{\Lambda^n_+}(L_x)) = (\frac{7}{5}, \frac{7}{5}, 0)$. To sum up, $\Pi_{\mathcal{K}^n}(x) \neq L^{-1}(\Pi_{\Lambda^n_+}(L_x))$ and $\Pi_{\Lambda^n_+}(L_x) \neq L(\Pi_{\mathcal{K}^n}(x))$. All the above observations and discussions lead us to further explore relationships, other than (3), between the SOC and the PSD cone.
Lemma 3.1. The problem of finding the projection of $L_x$ onto $\Lambda^n_+$,
\[
\begin{array}{ll} \min & \|L_x - L_y\|_F \\ \text{s.t.} & L_y \in \Lambda^n_+, \end{array} \tag{9}
\]
is equivalent to the following optimization problem:
\[
\begin{array}{ll} \min & \|L_{x-y}\|_F \\ \text{s.t.} & y \in \mathcal{K}^n. \end{array} \tag{10}
\]
More precisely, $L_y$ is an optimal solution to (9) if and only if $y$ is an optimal solution to (10).

Proof. The result follows from the facts that $L_x - L_y = L_{x-y}$ and $L_y \in \Lambda^n_+ \iff y \in \mathcal{K}^n$. □
Lemma 3.1 will help us to find the expressions for the distance and projection of $x$ onto $\mathcal{K}^n$ and of $L_x$ onto $\mathcal{S}^n_+$ and $\Lambda^n_+$. In particular, the distances of $x$ to $\mathcal{K}^n$ and of $L_x$ to $\mathcal{S}^n_+$ can be obtained by using the expressions for the projections given in Lemma 2.2.
Theorem 3.1. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ have the spectral decomposition given as in (5)-(6). Then, the following hold:

(a) $\operatorname{dist}(x, \mathcal{K}^n) = \sqrt{\frac{1}{2}(x_1 - \|x_2\|)_-^2 + \frac{1}{2}(x_1 + \|x_2\|)_-^2}$;

(b) $\operatorname{dist}(L_x, \mathcal{S}^n_+) = \sqrt{(x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2 + (n-2)(x_1)_-^2}$;

(c) $\Pi_{\Lambda^n_+}(L_x) = \begin{cases} L_x & \text{if } x_1 \ge \|x_2\|, \\ O & \text{if } x_1 \le -\frac{2}{n}\|x_2\|, \\[1ex] \dfrac{n}{n+2}\left( x_1 + \dfrac{2}{n}\|x_2\| \right) \begin{pmatrix} 1 & \bar{x}_2^T \\ \bar{x}_2 & I_{n-1} \end{pmatrix} & \text{if } -\frac{2}{n}\|x_2\| < x_1 < \|x_2\|; \end{cases}$

(d) $\operatorname{dist}(L_x, \Lambda^n_+) = \sqrt{\dfrac{2n}{n+2}(x_1 - \|x_2\|)_-^2 + \dfrac{n^2}{n+2}\left( x_1 + \dfrac{2}{n}\|x_2\| \right)_-^2}$.
Proof. (a) From Lemma 2.2, we know that $x = (x_1 - \|x_2\|) u_x^{(1)} + (x_1 + \|x_2\|) u_x^{(2)}$ and $\Pi_{\mathcal{K}^n}(x) = (x_1 - \|x_2\|)_+ u_x^{(1)} + (x_1 + \|x_2\|)_+ u_x^{(2)}$. Thus, it is clear that
\begin{align*}
\operatorname{dist}(x, \mathcal{K}^n) = \|x - \Pi_{\mathcal{K}^n}(x)\| &= \left\| (x_1 - \|x_2\|)_- u_x^{(1)} + (x_1 + \|x_2\|)_- u_x^{(2)} \right\| \\
&= \sqrt{\frac{1}{2}(x_1 - \|x_2\|)_-^2 + \frac{1}{2}(x_1 + \|x_2\|)_-^2},
\end{align*}
where the last step follows from $\|u_x^{(i)}\| = \sqrt{2}/2$ for $i = 1, 2$ and $\langle u_x^{(1)}, u_x^{(2)} \rangle = 0$.
(b) By Lemma 2.1 and Lemma 2.2(b),
\[
L_x = P \begin{pmatrix} x_1 - \|x_2\| & 0 & 0 \\ 0 & x_1 + \|x_2\| & 0 \\ 0 & 0 & x_1 I_{n-2} \end{pmatrix} P^T
\]
and
\[
\Pi_{\mathcal{S}^n_+}(L_x) = P \begin{pmatrix} (x_1 - \|x_2\|)_+ & 0 & 0 \\ 0 & (x_1 + \|x_2\|)_+ & 0 \\ 0 & 0 & (x_1)_+ I_{n-2} \end{pmatrix} P^T.
\]
Combining the above yields
\begin{align*}
\operatorname{dist}(L_x, \mathcal{S}^n_+) &= \left\| \begin{pmatrix} (x_1 - \|x_2\|)_- & 0 & 0 \\ 0 & (x_1 + \|x_2\|)_- & 0 \\ 0 & 0 & (x_1)_- I_{n-2} \end{pmatrix} \right\|_F \\
&= \sqrt{(x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2 + (n-2)(x_1)_-^2}.
\end{align*}

(c) To find $\Pi_{\Lambda^n_+}(L_x)$, we need to solve the optimization problem (9). From Lemma 3.1, it is equivalent to look into problem (10). Thus, we first compute
\begin{align*}
\|L_{x-y}\|_F &= \sqrt{(x_1 - y_1 - \|x_2 - y_2\|)^2 + (x_1 - y_1 + \|x_2 - y_2\|)^2 + (n-2)(x_1 - y_1)^2} \\
&= \sqrt{n(x_1 - y_1)^2 + 2\|x_2 - y_2\|^2} \\
&= \sqrt{n}\, \sqrt{(x_1 - y_1)^2 + \frac{2}{n}\|x_2 - y_2\|^2} \\
&= \sqrt{n}\, \sqrt{(x_1 - y_1)^2 + \left\| \sqrt{\tfrac{2}{n}}\, x_2 - \sqrt{\tfrac{2}{n}}\, y_2 \right\|^2}.
\end{align*}
Now, we denote
\[
y' := \left( y_1, \sqrt{\tfrac{2}{n}}\, y_2 \right) = (y_1, \gamma y_2) = \Gamma y, \quad \text{where } \gamma := \sqrt{\tfrac{2}{n}} \ \text{ and } \ \Gamma := \begin{pmatrix} 1 & 0 \\ 0 & \gamma I \end{pmatrix}.
\]
Then $y_1 \ge \|y_2\|$ if and only if $y'_1 \ge \frac{1}{\gamma}\|y'_2\|$; that is, $y \in \mathcal{K}^n$ if and only if $y' \in \mathcal{L}_\theta$ with $\cot\theta = \frac{1}{\gamma}$, where $\mathcal{L}_\theta := \{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \,|\, x_1 \ge \|x_2\| \cot\theta \}$; see [16]. We therefore conclude that problem (10) is equivalent to the following optimization problem:
\[
\begin{array}{ll} \min & \sqrt{(x_1 - y'_1)^2 + \left\| \sqrt{\frac{2}{n}}\, x_2 - y'_2 \right\|^2} \\ \text{s.t.} & y' \in \mathcal{L}_\theta. \end{array} \tag{11}
\]
The optimal solution to problem (11) is $\Pi_{\mathcal{L}_\theta}(x')$, the projection of $x' := (x_1, \gamma x_2) = \Gamma x$ onto $\mathcal{L}_\theta$, which according to [16, Theorems 3.1 and 3.2] is given by
\begin{align*}
\Pi_{\mathcal{L}_\theta}(x') &= \frac{1}{1+\cot^2\theta}\left(x'_1 - \|x'_2\|\cot\theta\right)_+ \begin{pmatrix} 1 \\ -\bar{x}'_2 \cot\theta \end{pmatrix} + \frac{1}{1+\tan^2\theta}\left(x'_1 + \|x'_2\|\tan\theta\right)_+ \begin{pmatrix} 1 \\ \bar{x}'_2 \tan\theta \end{pmatrix} \\
&= \frac{\gamma^2}{1+\gamma^2}\left(x_1 - \|x_2\|\right)_+ \begin{pmatrix} 1 \\ -\frac{1}{\gamma}\bar{x}_2 \end{pmatrix} + \frac{1}{1+\gamma^2}\left(x_1 + \gamma^2\|x_2\|\right)_+ \begin{pmatrix} 1 \\ \gamma \bar{x}_2 \end{pmatrix}.
\end{align*}
Hence the optimal solution to (10) is
\[
y = \Gamma^{-1} y' = \Gamma^{-1}\Pi_{\mathcal{L}_\theta}(x') = \Gamma^{-1}\Pi_{\mathcal{L}_\theta}(\Gamma x) = \begin{pmatrix} \frac{\gamma^2}{1+\gamma^2}(x_1 - \|x_2\|)_+ + \frac{1}{1+\gamma^2}(x_1 + \gamma^2\|x_2\|)_+ \\[1ex] \left[ -\frac{1}{1+\gamma^2}(x_1 - \|x_2\|)_+ + \frac{1}{1+\gamma^2}(x_1 + \gamma^2\|x_2\|)_+ \right] \bar{x}_2 \end{pmatrix}, \tag{12}
\]
which can be written piecewise as
\[
y = \begin{cases} x & \text{if } x_1 \ge \|x_2\|, \\ 0 & \text{if } x_1 \le -\frac{2}{n}\|x_2\|, \\[1ex] \dfrac{1}{1+\gamma^2}\left(x_1 + \gamma^2\|x_2\|\right) \begin{pmatrix} 1 \\ \bar{x}_2 \end{pmatrix} & \text{if } -\frac{2}{n}\|x_2\| < x_1 < \|x_2\|. \end{cases}
\]
By Lemma 3.1, the optimal solution to (9) is $L_y$, i.e.,
\[
L_y = \Pi_{\Lambda^n_+}(L_x) = \begin{cases} L_x & \text{if } x_1 \ge \|x_2\|, \\ O & \text{if } x_1 \le -\frac{2}{n}\|x_2\|, \\[1ex] \dfrac{n}{n+2}\left( x_1 + \dfrac{2}{n}\|x_2\| \right) \begin{pmatrix} 1 & \bar{x}_2^T \\ \bar{x}_2 & I_{n-1} \end{pmatrix} & \text{if } -\frac{2}{n}\|x_2\| < x_1 < \|x_2\|. \end{cases}
\]
(d) In view of the expression (12), we can compute the distance $\operatorname{dist}(L_x, \Lambda^n_+)$ as follows:
\begin{align*}
\operatorname{dist}(L_x, \Lambda^n_+) &= \|L_x - L_y\|_F = \|L_{x-y}\|_F \\
&= \left[ n \left( x_1 - \frac{\gamma^2}{1+\gamma^2}(x_1 - \|x_2\|)_+ - \frac{1}{1+\gamma^2}\left( x_1 + \gamma^2\|x_2\| \right)_+ \right)^2 \right. \\
&\qquad \left. + \, 2 \left( \|x_2\| + \frac{1}{1+\gamma^2}(x_1 - \|x_2\|)_+ - \frac{1}{1+\gamma^2}\left( x_1 + \gamma^2\|x_2\| \right)_+ \right)^2 \right]^{\frac{1}{2}} \\
&= \left[ n \left( x_1 - \frac{2}{n+2}(x_1 - \|x_2\|)_+ - \frac{n}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right)_+ \right)^2 \right. \\
&\qquad \left. + \, 2 \left( \|x_2\| + \frac{n}{n+2}(x_1 - \|x_2\|)_+ - \frac{n}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right)_+ \right)^2 \right]^{\frac{1}{2}} \\
&= \left[ n \left( \frac{2}{n+2}(x_1 - \|x_2\|)_- + \frac{n}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right)_- \right)^2 \right. \\
&\qquad \left. + \, 2 \left( -\frac{n}{n+2}(x_1 - \|x_2\|)_- + \frac{n}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right)_- \right)^2 \right]^{\frac{1}{2}} \\
&= \sqrt{\frac{2n}{n+2}(x_1 - \|x_2\|)_-^2 + \frac{n^2}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right)_-^2},
\end{align*}
where the third equality comes from the facts that
\[
x_1 = \frac{2}{n+2}(x_1 - \|x_2\|) + \frac{n}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right)
\]
and
\[
\|x_2\| = -\frac{n}{n+2}(x_1 - \|x_2\|) + \frac{n}{n+2}\left( x_1 + \tfrac{2}{n}\|x_2\| \right). \qquad \square
\]
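Parts (c) and (d) translate directly into code. The following sketch (function names ours) implements the closed forms and checks that the distance formula matches the projection it comes from, reusing arrow from the earlier snippet.

```python
import numpy as np
# assumes arrow(x) from the earlier sketch

neg = lambda t: min(t, 0.0)

def proj_arrow_psd(x):
    """Projection of L_x onto Lambda^n_+, Theorem 3.1(c)."""
    x = np.asarray(x, dtype=float)
    n, x1, nx2 = x.size, x[0], np.linalg.norm(x[1:])
    if x1 >= nx2:
        return arrow(x)
    if x1 <= -(2.0 / n) * nx2:
        return np.zeros((n, n))
    t = n / (n + 2.0) * (x1 + (2.0 / n) * nx2)
    return arrow(np.concatenate(([t], t * x[1:] / nx2)))

def dist_arrow_psd(x):
    """Distance from L_x to Lambda^n_+, Theorem 3.1(d)."""
    x = np.asarray(x, dtype=float)
    n, x1, nx2 = x.size, x[0], np.linalg.norm(x[1:])
    a, b = neg(x1 - nx2), neg(x1 + (2.0 / n) * nx2)
    return np.sqrt(2.0 * n / (n + 2.0) * a**2 + n**2 / (n + 2.0) * b**2)

x = np.array([1.0, 2.0, 0.0])
P = proj_arrow_psd(x)  # diagonal 7/5, as in Section 3's example
assert np.isclose(np.linalg.norm(arrow(x) - P, 'fro'), dist_arrow_psd(x))
```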
Theorem 3.2. For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$,
\[
\operatorname{dist}(x, \mathcal{K}^n) \le \operatorname{dist}(L_x, \mathcal{S}^n_+) \le \operatorname{dist}(L_x, \Lambda^n_+).
\]
In particular, for $n = 2$, $\operatorname{dist}(x, \mathcal{K}^2) = \frac{\sqrt{2}}{2} \operatorname{dist}(L_x, \mathcal{S}^2_+)$ and $\operatorname{dist}(L_x, \mathcal{S}^2_+) = \operatorname{dist}(L_x, \Lambda^2_+)$.

Proof. The first inequality follows from the distance formulas given in Theorem 3.1; the second inequality comes from the fact that $\Lambda^n_+$ is a subset of $\mathcal{S}^n_+$, i.e., $\Lambda^n_+ \subset \mathcal{S}^n_+$. For $n = 2$, by part (d) of Theorem 3.1, we have
\[
\operatorname{dist}(L_x, \Lambda^2_+) = \sqrt{(x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2}.
\]
Combining this with Theorem 3.1(a)-(b) yields $\operatorname{dist}(x, \mathcal{K}^2) = \frac{\sqrt{2}}{2} \operatorname{dist}(L_x, \mathcal{S}^2_+)$ and $\operatorname{dist}(L_x, \mathcal{S}^2_+) = \operatorname{dist}(L_x, \Lambda^2_+)$. □
Note that $\Lambda^2_+$ is strictly included in $\mathcal{S}^2_+$, i.e., $\Lambda^2_+ \subsetneq \mathcal{S}^2_+$, because an arrow matrix has equal diagonal entries, whereas a positive semidefinite matrix need not satisfy this requirement. Thus, $\operatorname{dist}(L_x, \Lambda^2_+) \ge \operatorname{dist}(L_x, \mathcal{S}^2_+)$. In Theorem 3.2, we further show that equality holds.
In view of Theorem 3.2, a natural question arises: are these distances equivalent? Recall that two functions $g, h : \mathbb{R}^n \to \mathbb{R}$ are said to be equivalent if there exist $\tau_1, \tau_2 > 0$ such that
\[
\tau_1 g(x) \le h(x) \le \tau_2 g(x), \quad \forall x \in \mathbb{R}^n.
\]
For instance, the 1-norm and the 2-norm are equivalent in this sense. To answer this question, we need the following lemma.
Lemma 3.2. For $a, b \in \mathbb{R}$, the following inequality holds:
\[
\left( \frac{a+b}{2} \right)_-^2 \le \frac{1}{2}\left( a_-^2 + b_-^2 \right).
\]
Proof. We assume without loss of generality that $a \le b$, and we consider the following four cases.

Case 1: For $a \ge 0$ and $b \ge 0$, we have
\[
\left( \frac{a+b}{2} \right)_-^2 = 0 = \frac{1}{2}\left( a_-^2 + b_-^2 \right).
\]

Case 2: For $a \le 0$ and $b \le 0$, we have
\[
\left( \frac{a+b}{2} \right)_-^2 = \left( \frac{a+b}{2} \right)^2 \le \frac{a^2 + b^2}{2} = \frac{1}{2}\left( a_-^2 + b_-^2 \right).
\]

Case 3: For $a \le 0$, $b \ge 0$, and $a \le -b$, it follows that $(a+b)/2 \le 0$. Then, we have
\[
\left( \frac{a+b}{2} \right)_-^2 = \left( \frac{a+b}{2} \right)^2 = \frac{a^2 + b^2 + 2ab}{4} \le \frac{a^2 + b^2}{4} \le \frac{1}{2} a^2 = \frac{1}{2}\left( a_-^2 + b_-^2 \right),
\]
where the first inequality comes from the fact that $ab \le 0$ and the second inequality follows from $a^2 \ge b^2$, which holds because $a \le -b \le 0$.

Case 4: For $a \le 0$, $b \ge 0$, and $a \ge -b$, we have
\[
\left( \frac{a+b}{2} \right)_-^2 = 0 \le \frac{1}{2} a^2 = \frac{1}{2}\left( a_-^2 + b_-^2 \right). \qquad \square
\]
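Lemma 3.2 is elementary; here is a quick randomized sanity check (seed and sample size arbitrary):

```python
import numpy as np

neg = lambda t: np.minimum(t, 0.0)  # vectorized (t)_-

a, b = np.random.default_rng(0).normal(size=(2, 10**5))
assert np.all(neg((a + b) / 2)**2 <= 0.5 * (neg(a)**2 + neg(b)**2) + 1e-12)
```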
Theorem 3.3. The distances $\operatorname{dist}(x, \mathcal{K}^n)$, $\operatorname{dist}(L_x, \mathcal{S}^n_+)$, and $\operatorname{dist}(L_x, \Lambda^n_+)$ are all equivalent in the sense that
\[
\operatorname{dist}(x, \mathcal{K}^n) \le \operatorname{dist}(L_x, \mathcal{S}^n_+) \le \sqrt{n}\, \operatorname{dist}(x, \mathcal{K}^n) \tag{13}
\]
and
\[
\operatorname{dist}(L_x, \mathcal{S}^n_+) \le \operatorname{dist}(L_x, \Lambda^n_+) \le \sqrt{\frac{2n}{n+2}}\, \operatorname{dist}(L_x, \mathcal{S}^n_+). \tag{14}
\]

Proof. (i) The key to proving inequality (13) is to examine $\operatorname{dist}^2(L_x, \mathcal{S}^n_+)$, which is computed as below:
\begin{align*}
\operatorname{dist}^2(L_x, \mathcal{S}^n_+) &= (x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2 + (n-2)(x_1)_-^2 \\
&= (x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2 + (n-2) \left( \frac{(x_1 - \|x_2\|) + (x_1 + \|x_2\|)}{2} \right)_-^2 \\
&\le (x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2 + \frac{n-2}{2} \left[ (x_1 - \|x_2\|)_-^2 + (x_1 + \|x_2\|)_-^2 \right] \\
&= n \left[ \frac{1}{2}(x_1 - \|x_2\|)_-^2 + \frac{1}{2}(x_1 + \|x_2\|)_-^2 \right] \\
&= n \operatorname{dist}^2(x, \mathcal{K}^n),
\end{align*}
where the inequality is due to Lemma 3.2. Hence, we obtain
\[
\operatorname{dist}(x, \mathcal{K}^n) \le \operatorname{dist}(L_x, \mathcal{S}^n_+) \le \sqrt{n}\, \operatorname{dist}(x, \mathcal{K}^n),
\]
which says that the distance from $x$ to $\mathcal{K}^n$ and the distance from $L_x$ to $\mathcal{S}^n_+$ are equivalent.

(ii) It remains to show the equivalence between $\operatorname{dist}(L_x, \mathcal{S}^n_+)$ and $\operatorname{dist}(L_x, \Lambda^n_+)$. Note that the first inequality in (14) always holds by Theorem 3.2, so it suffices to verify the second one. To proceed, we consider the following cases.

Case 1: For $x_1 \ge \|x_2\|$, $\operatorname{dist}(L_x, \mathcal{S}^n_+) = 0 = \operatorname{dist}(L_x, \Lambda^n_+)$.

Case 2: For $x_1 \le -\|x_2\|$, $\operatorname{dist}(L_x, \Lambda^n_+) = \sqrt{n x_1^2 + 2\|x_2\|^2} = \operatorname{dist}(L_x, \mathcal{S}^n_+)$.

Case 3: For $0 \le x_1 \le \|x_2\|$, $\operatorname{dist}(L_x, \Lambda^n_+) = \sqrt{\frac{2n}{n+2}}\, |x_1 - \|x_2\||$ and $\operatorname{dist}(L_x, \mathcal{S}^n_+) = |x_1 - \|x_2\||$.

Case 4: For $-\frac{2}{n}\|x_2\| \le x_1 \le 0$, $\operatorname{dist}^2(L_x, \Lambda^n_+) = \frac{2n}{n+2}(x_1 - \|x_2\|)^2$ and $\operatorname{dist}^2(L_x, \mathcal{S}^n_+) = (x_1 - \|x_2\|)^2 + (n-2)x_1^2$. Then,
\[
\frac{2n}{n+2} \operatorname{dist}^2(L_x, \mathcal{S}^n_+) = \frac{2n}{n+2}(x_1 - \|x_2\|)^2 + \frac{2n}{n+2}(n-2)x_1^2 \ge \operatorname{dist}^2(L_x, \Lambda^n_+).
\]

Case 5: For $-\|x_2\| \le x_1 \le -\frac{2}{n}\|x_2\|$,
\[
\operatorname{dist}^2(L_x, \Lambda^n_+) = n x_1^2 + 2\|x_2\|^2 \quad \text{and} \quad \operatorname{dist}^2(L_x, \mathcal{S}^n_+) = (x_1 - \|x_2\|)^2 + (n-2)x_1^2.
\]
Note that
\begin{align*}
\operatorname{dist}(L_x, \Lambda^n_+) \le \sqrt{\frac{2n}{n+2}}\, \operatorname{dist}(L_x, \mathcal{S}^n_+)
&\iff n x_1^2 + 2\|x_2\|^2 \le \frac{2n}{n+2} \left[ (x_1 - \|x_2\|)^2 + (n-2)x_1^2 \right] \\
&\iff 4\|x_2\| \left( n x_1 + \|x_2\| \right) \le n(n-4)x_1^2.
\end{align*}
Since $x_1 \le -\frac{2}{n}\|x_2\|$, it follows that
\[
4\|x_2\| \left( n x_1 + \|x_2\| \right) \le -4\|x_2\|^2 \le \frac{4(n-4)}{n}\|x_2\|^2 = n(n-4)\left( -\frac{2}{n}\|x_2\| \right)^2 \le n(n-4)x_1^2,
\]
where the second inequality is due to the fact that $\frac{n-4}{n} \ge -1$ for all $n \ge 2$. (For $2 \le n < 4$, the coefficient $n(n-4)$ is negative, and the desired inequality instead follows from $x_1^2 \le \|x_2\|^2$ together with $n(n-4) \ge -4$, i.e., $(n-2)^2 \ge 0$.) Hence,
\[
\operatorname{dist}(L_x, \Lambda^n_+) \le \sqrt{\frac{2n}{n+2}}\, \operatorname{dist}(L_x, \mathcal{S}^n_+),
\]
which is the desired result. □
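The bounds (13)-(14) can also be stress-tested numerically with the helpers from the earlier sketches (random points, assorted dimensions; the harness is ours):

```python
import numpy as np
# assumes arrow, proj_soc, proj_psd, dist_arrow_psd from the earlier sketches

rng = np.random.default_rng(3)
for _ in range(200):
    n = int(rng.integers(2, 9))
    x = rng.normal(size=n)
    d_soc = np.linalg.norm(x - proj_soc(x))
    d_psd = np.linalg.norm(arrow(x) - proj_psd(arrow(x)), 'fro')
    d_arw = dist_arrow_psd(x)
    assert d_soc <= d_psd + 1e-10 <= np.sqrt(n) * d_soc + 2e-10             # (13)
    assert d_psd <= d_arw + 1e-10 <= np.sqrt(2*n/(n+2)) * d_psd + 2e-10     # (14)
```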
The following example demonstrates that the inequalities (13) and (14) in Theorem 3.3 may be strict.

Example 3.1. Consider $x = (-1, 2, \underbrace{0, \ldots, 0}_{n-2})$ with $n \ge 4$. Then,
\[
\operatorname{dist}(x, \mathcal{K}^n) < \operatorname{dist}(L_x, \mathcal{S}^n_+) < \sqrt{n}\, \operatorname{dist}(x, \mathcal{K}^n)
\]
and
\[
\operatorname{dist}(L_x, \mathcal{S}^n_+) < \operatorname{dist}(L_x, \Lambda^n_+) < \sqrt{\frac{2n}{n+2}}\, \operatorname{dist}(L_x, \mathcal{S}^n_+).
\]
To see this, from Theorem 3.1, we know that
\[
\operatorname{dist}(x, \mathcal{K}^n) = \sqrt{\frac{9}{2}}, \quad \operatorname{dist}(L_x, \mathcal{S}^n_+) = \sqrt{n+7}, \quad \operatorname{dist}(L_x, \Lambda^n_+) = \sqrt{n+8}. \tag{15}
\]
Note that for $n \ge 4$, we have
\[
\sqrt{\frac{9}{2}} < \sqrt{n+7} < \sqrt{\frac{9n}{2}} \quad \text{and} \quad \sqrt{n+7} < \sqrt{n+8} < \sqrt{\frac{2n}{n+2}}\, \sqrt{n+7},
\]
which says
\[
\operatorname{dist}(x, \mathcal{K}^n) < \operatorname{dist}(L_x, \mathcal{S}^n_+) < \sqrt{n}\, \operatorname{dist}(x, \mathcal{K}^n)
\]
and
\[
\operatorname{dist}(L_x, \mathcal{S}^n_+) < \operatorname{dist}(L_x, \Lambda^n_+) < \sqrt{\frac{2n}{n+2}}\, \operatorname{dist}(L_x, \mathcal{S}^n_+).
\]
From this example, we see that the distance related to the second-order cone is independent of $n$; nonetheless, if we treat $x$ as the semidefinite matrix $L_x$, the distance depends on $n$; see (15).
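A quick check of the numbers in (15), using the same helpers as before:

```python
import numpy as np
# assumes arrow, proj_soc, proj_psd, dist_arrow_psd from the earlier sketches

for n in (4, 6, 10):
    x = np.zeros(n)
    x[0], x[1] = -1.0, 2.0
    assert np.isclose(np.linalg.norm(x - proj_soc(x)), np.sqrt(4.5))
    assert np.isclose(np.linalg.norm(arrow(x) - proj_psd(arrow(x)), 'fro'),
                      np.sqrt(n + 7.0))
    assert np.isclose(dist_arrow_psd(x), np.sqrt(n + 8.0))
```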
4 Relation on Tangent Cone
As shown earlier, all the distances introduced in Section 3 are equivalent. This allows us to study the relation on tangent cones, because the tangent cone can be described via the distance function [3]. More specifically, for a convex set $C$,
\[
T_C(x) := \{ h \,|\, \operatorname{dist}(x + th, C) = o(t), \ t \ge 0 \}.
\]
In light of this, this section is devoted to exploring the relation on tangent cones.
Theorem 4.1. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ belong to $\mathcal{K}^n$, i.e., $x \in \mathcal{K}^n$. Then,

(a) $T_{\mathcal{K}^n}(x) = \begin{cases} \mathcal{K}^n & \text{if } x = 0, \\ \mathbb{R}^n & \text{if } x \in \operatorname{int} \mathcal{K}^n, \\ \left\{ (d_1, d_2) \in \mathbb{R}^n \,\middle|\, d_2^T x_2 - x_1 d_1 \le 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}; \end{cases}$

(b) $T_{\mathcal{S}^n_+}(L_x) = \begin{cases} \mathcal{S}^n_+ & \text{if } x = 0, \\ \mathcal{S}^n & \text{if } x \in \operatorname{int} \mathcal{K}^n, \\ \left\{ H \in \mathcal{S}^n \,\middle|\, (u_x^{(1)})^T H u_x^{(1)} \ge 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}; \end{cases}$

(c) $T_{\Lambda^n_+}(L_x) = \{ L_h \,|\, h \in T_{\mathcal{K}^n}(x) \} = T_{\mathcal{S}^n_+}(L_x) \cap \Lambda^n$.
Proof. The formulas for $T_{\mathcal{K}^n}(x)$ and $T_{\mathcal{S}^n_+}(L_x)$ follow from the results given in [2, 14]. To verify part (c), recall that
\[
T_{\Lambda^n_+}(L_x) = \left\{ H \in \mathcal{S}^n \,\middle|\, L_x + t_n H_n \in \Lambda^n_+ \text{ for some } t_n \to 0^+, \ H_n \to H \right\}.
\]
Since $t_n H_n \in \Lambda^n_+ - L_x$, each $H_n$ is also an arrow matrix; that is, $H_n = L_{h_n}$ for some $h_n \in \mathbb{R}^n$. In addition, $H_n \to H$ implies $H = L_h$ for some $h$ with $h_n \to h$. Thus, we obtain that $L_x + t_n H_n = L_{x + t_n h_n} \in \Lambda^n_+$, which is equivalent to saying $x + t_n h_n \in \mathcal{K}^n$, i.e., $h \in T_{\mathcal{K}^n}(x)$. Moreover, since $\Lambda^n_+ = \mathcal{S}^n_+ \cap \Lambda^n$ and $\mathcal{S}^n_+$, $\Lambda^n$ cannot be separated, it follows from [12, Theorem 6.42] that
\[
T_{\Lambda^n_+}(L_x) = T_{\mathcal{S}^n_+}(L_x) \cap T_{\Lambda^n}(L_x) = T_{\mathcal{S}^n_+}(L_x) \cap \Lambda^n,
\]
where the last step comes from the fact that $\Lambda^n$ is a subspace. □
The relation between $T_{\mathcal{K}^n}(x)$ and $T_{\mathcal{S}^n_+}(L_x)$ can also be characterized by using their explicit expressions.
Theorem 4.2. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ belong to $\mathcal{K}^n$, i.e., $x \in \mathcal{K}^n$. Then,
\[
L_{T_{\mathcal{K}^n}(x)} = T_{\mathcal{S}^n_+}(L_x) \cap \Lambda^n. \tag{16}
\]

Proof. We proceed by discussing the following three cases.

Case 1: For $x \in \operatorname{int} \mathcal{K}^n$, we have $L_x \in \operatorname{int} \mathcal{S}^n_+$. Thus, $T_{\mathcal{K}^n}(x) = \mathbb{R}^n$ and $T_{\mathcal{S}^n_+}(L_x) = \mathcal{S}^n$. This implies
\[
L_{T_{\mathcal{K}^n}(x)} = L_{\mathbb{R}^n} = \Lambda^n = \mathcal{S}^n \cap \Lambda^n = T_{\mathcal{S}^n_+}(L_x) \cap \Lambda^n.
\]

Case 2: For $x = 0$, we have $T_{\mathcal{K}^n}(x) = \mathcal{K}^n$ and $T_{\mathcal{S}^n_+}(L_x) = \mathcal{S}^n_+$. Since $y \in \mathcal{K}^n$ if and only if $L_y \in \mathcal{S}^n_+$,
\[
L_{T_{\mathcal{K}^n}(x)} = L_{\mathcal{K}^n} = \Lambda^n_+ = \mathcal{S}^n_+ \cap \Lambda^n = T_{\mathcal{S}^n_+}(L_x) \cap \Lambda^n.
\]

Case 3: For $x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}$, take $d \in T_{\mathcal{K}^n}(x)$. Then,
\[
(u_x^{(1)})^T L_d\, u_x^{(1)} = \frac{1}{4} \begin{pmatrix} 1 & -\bar{x}_2^T \end{pmatrix} \begin{pmatrix} d_1 & d_2^T \\ d_2 & d_1 I \end{pmatrix} \begin{pmatrix} 1 \\ -\bar{x}_2 \end{pmatrix} = \frac{1}{2}\left( d_1 - d_2^T \bar{x}_2 \right) \ge 0,
\]
where the inequality comes from $d \in T_{\mathcal{K}^n}(x)$. Hence, $L_d \in T_{\mathcal{S}^n_+}(L_x)$ by Theorem 4.1, i.e., $L_{T_{\mathcal{K}^n}(x)} \subset T_{\mathcal{S}^n_+}(L_x) \cap \Lambda^n$. The converse inclusion can be proved by a similar argument. □
The restriction to $\Lambda^n$ in (16) is necessary, as illustrated by the following example. Taking $x = (1, 1) \in \mathbb{R}^2$, we have
\[
T_{\mathcal{K}^2}(x) = \left\{ d = (d_1, d_2) \in \mathbb{R}^2 \,\middle|\, -d_1 + d_2 \le 0 \right\}
\]
and
\[
T_{\mathcal{S}^2_+}(L_x) = \left\{ H \in \mathcal{S}^2 \,\middle|\, (u_x^{(1)})^T H u_x^{(1)} \ge 0 \right\} = \left\{ H \in \mathcal{S}^2 \,\middle|\, H_{11} - 2H_{12} + H_{22} \ge 0 \right\}.
\]
Hence, $L_{T_{\mathcal{K}^2}(x)}$ does not equal $T_{\mathcal{S}^2_+}(L_x)$.
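At a boundary point, Theorem 4.1 says that $d$ lies in $T_{\mathcal{K}^n}(x)$ exactly when the quadratic form of $L_d$ at $u_x^{(1)}$ is nonnegative. A sampled consistency check (the specific point and harness are ours), reusing arrow and spectral from the earlier sketches:

```python
import numpy as np
# assumes arrow(x) and spectral(x) from the earlier sketches

x = np.array([5.0, 3.0, 4.0])        # boundary point: x1 = ||x2|| = 5
_, u1, _ = spectral(x)
rng = np.random.default_rng(1)
for _ in range(1000):
    d = rng.normal(size=3)
    in_t_soc = d[1:] @ x[1:] - x[0] * d[0] <= 0   # Theorem 4.1(a)
    in_t_psd = u1 @ arrow(d) @ u1 >= 0            # Theorem 4.1(b)
    assert in_t_soc == in_t_psd                   # the two tests agree
```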
5 Relation on Normal Cone
In this section, we continue to explore the relation on normal cones between the SOC and its PSD reformulation. To this end, we first write out the expressions of $N_{\mathcal{K}^n}(x)$, $N_{\mathcal{S}^n_+}(L_x)$, and $N_{\Lambda^n_+}(L_x)$, respectively.
Theorem 5.1. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ belong to $\mathcal{K}^n$, i.e., $x \in \mathcal{K}^n$. Then,

(a) $N_{\mathcal{K}^n}(x) = \begin{cases} -\mathcal{K}^n & \text{if } x = 0, \\ \{0\} & \text{if } x \in \operatorname{int} \mathcal{K}^n, \\ \mathbb{R}_+ (-x_1, x_2) & \text{if } x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}; \end{cases}$

(b) $N_{\mathcal{S}^n_+}(L_x) = \begin{cases} -\mathcal{S}^n_+ & \text{if } x = 0, \\ \{O\} & \text{if } x \in \operatorname{int} \mathcal{K}^n, \\ \left\{ \alpha \begin{pmatrix} 1 & -\bar{x}_2^T \\ -\bar{x}_2 & \bar{x}_2 \bar{x}_2^T \end{pmatrix} \,\middle|\, \alpha \le 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}; \end{cases}$

(c) $N_{\Lambda^n_+}(L_x) = N_{\mathcal{S}^n_+}(L_x) + (\Lambda^n)^\perp$, where
\[
(\Lambda^n)^\perp = \left\{ H \in \mathcal{S}^n \,\middle|\, \operatorname{tr}(H) = 0, \ H_{1,i} = 0, \ i = 2, \cdots, n \right\}.
\]

Proof. Parts (a) and (b) follow from [2] and [14]. For part (c), since $\Lambda^n_+ = \mathcal{S}^n_+ \cap \Lambda^n$, it follows from [12, Theorem 6.42] that
\[
N_{\Lambda^n_+}(L_x) = N_{\mathcal{S}^n_+}(L_x) + N_{\Lambda^n}(L_x).
\]
Because $\Lambda^n$ is a subspace, we know that $N_{\Lambda^n}(L_x) = (\Lambda^n)^\perp$, where
\[
(\Lambda^n)^\perp = \left\{ H \in \mathcal{S}^n \,\middle|\, \langle H, L_y \rangle = 0, \ \forall y \in \mathbb{R}^n \right\} = \left\{ H \in \mathcal{S}^n \,\middle|\, \operatorname{tr}(H) = 0, \ H_{1,i} = 0, \ i = 2, \cdots, n \right\}. \qquad \square
\]
The relation between $N_{\Lambda^n_+}(L_x)$ and $N_{\mathcal{S}^n_+}(L_x)$ is already described in Theorem 5.1. Next, we further describe the relation between $N_{\mathcal{K}^n}(x)$ and $N_{\mathcal{S}^n_+}(L_x)$.
Theorem 5.2. Let $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ belong to $\mathcal{K}^n$, i.e., $x \in \mathcal{K}^n$. Then, for $x \in \operatorname{int} \mathcal{K}^n$ and $x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}$,
\[
N_{\mathcal{S}^n_+}(L_x) = -N_{\mathcal{K}^n}(x) N_{\mathcal{K}^n}(x)^T.
\]
Proof. Case 1: For $x \in \operatorname{int} \mathcal{K}^n$, $N_{\mathcal{K}^n}(x) = \{0\}$ and $N_{\mathcal{S}^n_+}(L_x) = \{O\}$. The desired result holds in this case.

Case 2: For $x \in \operatorname{bd} \mathcal{K}^n \setminus \{0\}$, it follows from Theorem 5.1 that
\[
N_{\mathcal{S}^n_+}(L_x) = \left\{ \alpha \begin{pmatrix} 1 & -\bar{x}_2^T \\ -\bar{x}_2 & \bar{x}_2 \bar{x}_2^T \end{pmatrix} \,\middle|\, \alpha \le 0 \right\} = \left\{ \alpha \begin{pmatrix} 1 \\ -\bar{x}_2 \end{pmatrix} \begin{pmatrix} 1 & -\bar{x}_2^T \end{pmatrix} \,\middle|\, \alpha \le 0 \right\}. \tag{17}
\]
Since $N_{\mathcal{K}^n}(x) = \{ y \,|\, y = \beta \hat{x}, \ \beta \le 0 \}$ with $\hat{x} := (x_1, -x_2)$,
\[
-N_{\mathcal{K}^n}(x) N_{\mathcal{K}^n}(x)^T = \left\{ -\beta^2 \hat{x} \hat{x}^T \,\middle|\, \beta \le 0 \right\} = \left\{ -(\beta x_1)^2 \begin{pmatrix} 1 \\ -\bar{x}_2 \end{pmatrix} \begin{pmatrix} 1 & -\bar{x}_2^T \end{pmatrix} \,\middle|\, \beta \le 0 \right\}. \tag{18}
\]
Comparing (17) with (18) yields the desired result. □
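The identity of Theorem 5.2 at a boundary point can be seen element by element: every $\alpha \le 0$ in (17) corresponds to $\beta = -\sqrt{-\alpha}/x_1 \le 0$ in (18). A small sketch (the specific point and $\alpha$ are arbitrary choices of ours):

```python
import numpy as np

x = np.array([5.0, 3.0, 4.0])               # boundary point of K^3
x2bar = x[1:] / np.linalg.norm(x[1:])
v = np.concatenate(([1.0], -x2bar))
alpha = -2.0                                 # any alpha <= 0
H = alpha * np.outer(v, v)                   # element of N_{S^n_+}(L_x), (17)
beta = -np.sqrt(-alpha) / x[0]               # matching beta <= 0
xhat = np.concatenate(([x[0]], -x[1:]))      # xhat = (x1, -x2)
assert np.allclose(H, -np.outer(beta * xhat, beta * xhat))   # as in (18)
```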
From Theorem 5.1(c), we know that $N_{\Lambda^n_+}(L_x) \supset N_{\mathcal{S}^n_+}(L_x)$, since $O \in (\Lambda^n)^\perp$. For $x = 0$, $N_{\mathcal{K}^n}(x) = -\mathcal{K}^n$ and $N_{\mathcal{S}^n_+}(L_x) = -\mathcal{S}^n_+$. In this case, $N_{\mathcal{S}^n_+}(L_x)$ and $-N_{\mathcal{K}^n}(x) N_{\mathcal{K}^n}(x)^T$ do not coincide; i.e., Theorem 5.2 fails when $x = 0$. Below, we give the algebraic expressions for $N_{\Lambda^n_+}(L_x)$ and $N_{\mathcal{S}^n_+}(L_x)$ for $n = 2$, from which we can see the difference between them more clearly.
Theorem 5.3. For $n = 2$, the explicit expressions of $N_{\mathcal{S}^2_+}(L_x)$ and $N_{\Lambda^2_+}(L_x)$ are as below:
\[
N_{\mathcal{S}^2_+}(L_x) = \begin{cases} \left\{ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \,\middle|\, a \le 0, \ c \le 0, \ ac \ge b^2 \right\} & \text{if } x = 0, \\[1.5ex] \{O\} & \text{if } x \in \operatorname{int} \mathcal{K}^2, \\[1.5ex] \left\{ \alpha \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \,\middle|\, \alpha \le 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^2 \setminus \{0\}, \ x_2 > 0, \\[1.5ex] \left\{ \alpha \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \,\middle|\, \alpha \le 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^2 \setminus \{0\}, \ x_2 < 0, \end{cases}
\]
and
\[
N_{\Lambda^2_+}(L_x) = \begin{cases} \left\{ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \,\middle|\, a + c \le -2|b| \right\} & \text{if } x = 0, \\[1.5ex] \left\{ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \,\middle|\, a + c = 0, \ b = 0 \right\} & \text{if } x \in \operatorname{int} \mathcal{K}^2, \\[1.5ex] \left\{ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \,\middle|\, a + c + 2b = 0, \ b \ge 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^2 \setminus \{0\}, \ x_2 > 0, \\[1.5ex] \left\{ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \,\middle|\, a + c - 2b = 0, \ b \le 0 \right\} & \text{if } x \in \operatorname{bd} \mathcal{K}^2 \setminus \{0\}, \ x_2 < 0. \end{cases}
\]
Proof. First, we claim that
\[
(\Lambda^2_+)^\circ = \left\{ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \,\middle|\, a + c \le -2|b| \right\}.
\]
In fact,
\begin{align*}
\begin{pmatrix} a & b \\ b & c \end{pmatrix} \in (\Lambda^2_+)^\circ
&\iff \left\langle \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \begin{pmatrix} x_1 & x_2 \\ x_2 & x_1 \end{pmatrix} \right\rangle \le 0, \quad \forall x_1 \ge |x_2| \\
&\iff (a + c)x_1 + 2b x_2 \le 0, \quad \forall x_1 \ge |x_2|. \tag{19}
\end{align*}
If we plug in $x_1 = |x_2| + \tau$ with $\tau \ge 0$, then (19) can be rewritten as
\[
(a + c)|x_2| + 2b x_2 + (a + c)\tau \le 0, \quad \forall x_2 \in \mathbb{R} \text{ and } \tau \ge 0,
\]
i.e.,
\[
(a + c + 2b)x_2 + (a + c)\tau \le 0, \quad \forall x_2 \ge 0 \text{ and } \tau \ge 0, \tag{20}
\]
and
\[
(-a - c + 2b)x_2 + (a + c)\tau \le 0, \quad \forall x_2 \le 0 \text{ and } \tau \ge 0. \tag{21}
\]
By the arbitrariness of $\tau \ge 0$, we have $a + c \le 0$. Likewise, we have $a + c + 2b \le 0$ by (20) and $-a - c + 2b \ge 0$ by (21). Thus, $a + c \le -2b$ and $a + c \le 2b$. In other words, we conclude that the inequality (19) implies
\[
a + c \le \min\{-2b, 2b\} = -2|b|.
\]
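The polar-cone characterization just established can be sanity-checked by sampling $\mathcal{K}^2$; the test harness and the sample matrices below are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

def sampled_polar_test(a, b, c, samples=1000):
    """Test <[[a,b],[b,c]], L_x> <= 0 on random x in K^2, cf. (19)."""
    x2 = rng.normal(size=samples)
    x1 = np.abs(x2) + rng.exponential(size=samples)   # ensures x1 >= |x2|
    return bool(np.all((a + c) * x1 + 2 * b * x2 <= 1e-12))

for a, b, c in [(-1.0, 0.3, -1.0), (-1.0, 1.5, -1.0), (1.0, 0.0, -0.5)]:
    print(sampled_polar_test(a, b, c), a + c <= -2 * abs(b))  # pairs agree
```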