
to appear in Abstract and Applied Analysis, 2013

Nonsingularity conditions for FB system of reformulating nonlinear second-order cone programming¹

Shaohua Pan^†, Shujun Bi^‡ and Jein-Shan Chen^§

June 2, 2012

Abstract. This paper is a counterpart of [2]. Specifically, for a locally optimal solution to the nonlinear second-order cone programming (SOCP), under Robinson’s constraint qualification, we establish the equivalence among the following three conditions: the nonsingularity of Clarke’s Jacobian of Fischer-Burmeister (FB) nonsmooth system for the Karush-Kuhn-Tucker conditions, the strong second-order sufficient condition and constraint nondegeneracy, and the strong regularity of the Karush-Kuhn-Tucker point.

Key words: nonlinear second-order cone programming; FB nonsmooth system; nonsingularity; Clarke’s generalized Jacobian; strong regularity.

1 Introduction

The nonlinear second-order cone programming (SOCP) problem can be stated as

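\[
\begin{array}{cl}
\displaystyle\min_{\zeta\in\mathbb{R}^n} & f(\zeta)\\[2pt]
\text{s.t.} & h(\zeta)=0,\\
 & g(\zeta)\in\mathcal{K},
\end{array}
\tag{1}
\]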

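where $f:\mathbb{R}^n\to\mathbb{R}$, $h:\mathbb{R}^n\to\mathbb{R}^m$ and $g:\mathbb{R}^n\to\mathbb{R}^n$ are given twice continuously differentiable functions, and $\mathcal{K}$ is the Cartesian product of some second-order cones, i.e.,

\[
\mathcal{K} := \mathcal{K}^{n_1}\times\mathcal{K}^{n_2}\times\cdots\times\mathcal{K}^{n_r}
\]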

¹This work was supported by National Young Natural Science Foundation (No. 10901058) and Guangdong Natural Science Foundation (No. 9251802902000001), the Fundamental Research Funds for the Central Universities, and Project of Liaoning Innovative Research Team in University WT2010004.

†Department of Mathematics, South China University of Technology, Guangzhou, China (shh- [email protected]).

‡Department of Mathematics, South China University of Technology, Guangzhou, China (beami- [email protected]).

§Corresponding author. Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Oﬃce. The author’s work is supported by National Science Council of Taiwan, Department of Mathematics, National Taiwan Normal University, Taipei, Taiwan 11677 ([email protected]).

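with $n_1+\cdots+n_r=n$ and $\mathcal{K}^{n_j}$ being the second-order cone (SOC) in $\mathbb{R}^{n_j}$ defined by

\[
\mathcal{K}^{n_j} := \left\{ (x_{j1},x_{j2})\in\mathbb{R}\times\mathbb{R}^{n_j-1} \;\middle|\; x_{j1}\ge\|x_{j2}\| \right\}.
\]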

By introducing a slack variable to the second constraint, the SOCP (1) is equivalent to

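\[
\begin{array}{cl}
\displaystyle\min_{\zeta,\,x\in\mathbb{R}^n} & f(\zeta)\\[2pt]
\text{s.t.} & h(\zeta)=0,\\
 & g(\zeta)-x=0,\quad x\in\mathcal{K}.
\end{array}
\tag{2}
\]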

In this paper, we will concentrate on this equivalent formulation of problem (1).

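Let $L:\mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^n\times\mathcal{K}\to\mathbb{R}$ be the Lagrangian function of problem (2),

\[
L(\zeta,x,\mu,s,y) := f(\zeta)+\langle\mu,h(\zeta)\rangle+\langle g(\zeta)-x,\,s\rangle-\langle x,y\rangle,
\]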

and denote by NK(x) the normal cone of K at x in the sense of convex analysis [19]:

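\[
N_{\mathcal{K}}(x) =
\begin{cases}
\{d\in\mathbb{R}^n : \langle d,\,z-x\rangle\le 0\ \ \forall z\in\mathcal{K}\} & \text{if } x\in\mathcal{K},\\
\emptyset & \text{if } x\notin\mathcal{K}.
\end{cases}
\]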

Then the Karush-Kuhn-Tucker (KKT) conditions for (2) take the following form

\[
\mathcal{J}_{\zeta,x}L(\zeta,x,\mu,s,y)=0,\quad h(\zeta)=0,\quad g(\zeta)-x=0\quad\text{and}\quad -y\in N_{\mathcal{K}}(x),
\tag{3}
\]

where $\mathcal{J}_{\zeta,x}L(\zeta,x,\mu,s,y)$ is the derivative of $L$ at $(\zeta,x,\mu,s,y)$ with respect to $(\zeta,x)$.

Recall that $\phi^{\rm soc}$ is an SOC complementarity function associated with the cone $\mathcal{K}$ if

\[
\phi^{\rm soc}(x,y)=0 \iff x\in\mathcal{K},\ y\in\mathcal{K},\ \langle x,y\rangle=0 \iff -y\in N_{\mathcal{K}}(x).
\tag{4}
\]

With an SOC complementarity function $\phi^{\rm soc}$ associated with $\mathcal{K}$, we may reformulate the KKT optimality conditions in (3) as the following nonsmooth system:

\[
E(\zeta,x,\mu,s,y) :=
\begin{bmatrix}
\mathcal{J}_{\zeta,x}L(\zeta,x,\mu,s,y)\\
h(\zeta)\\
g(\zeta)-x\\
\phi^{\rm soc}(x,y)
\end{bmatrix}
= 0.
\tag{5}
\]

The most popular SOC complementarity functions include the vector-valued natural residual (NR) function and Fischer-Burmeister (FB) function, respectively, deﬁned as

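\[
\phi_{\rm NR}(x,y) := x-\Pi_{\mathcal{K}}(x-y)\quad\forall x,y\in\mathbb{R}^n
\]

and

\[
\phi_{\rm FB}(x,y) := (x+y)-\sqrt{x^2+y^2}\quad\forall x,y\in\mathbb{R}^n,
\tag{6}
\]

where $\Pi_{\mathcal{K}}(\cdot)$ is the projection operator onto the closed convex cone $\mathcal{K}$, $x^2=x\circ x$ means the Jordan product of $x$ and itself, and $\sqrt{x}$ denotes the unique square root of $x\in\mathcal{K}$.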
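As an illustration of the two definitions above, the following minimal sketch evaluates both functions on a single cone $\mathcal{K}^n$, using the spectral factorization recalled in Section 2. It assumes NumPy; the helper names (`spectral`, `jordan_square`, `proj_soc`, `soc_sqrt`, `phi_nr`, `phi_fb`) are hypothetical and chosen only for this example.

```python
import numpy as np

def spectral(x):
    """Spectral values (lam1, lam2) and vectors (u1, u2) of x w.r.t. one SOC."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    # When x2 = 0 any unit vector is admissible; the first axis is an arbitrary pick.
    xt = x2 / r if r > 0 else np.eye(x2.size)[0]
    u1 = 0.5 * np.concatenate(([1.0], -xt))
    u2 = 0.5 * np.concatenate(([1.0], xt))
    return x1 - r, x1 + r, u1, u2

def jordan_square(x):
    """x o x = (||x||^2, 2*x1*x2)."""
    return np.concatenate(([x @ x], 2.0 * x[0] * x[1:]))

def proj_soc(x):
    """Projection of x onto the second-order cone."""
    lam1, lam2, u1, u2 = spectral(x)
    return max(lam1, 0.0) * u1 + max(lam2, 0.0) * u2

def soc_sqrt(w):
    """Square root of w in the cone (tiny negative round-off is clipped to 0)."""
    lam1, lam2, u1, u2 = spectral(w)
    return np.sqrt(max(lam1, 0.0)) * u1 + np.sqrt(max(lam2, 0.0)) * u2

def phi_nr(x, y):
    return x - proj_soc(x - y)

def phi_fb(x, y):
    return (x + y) - soc_sqrt(jordan_square(x) + jordan_square(y))

# A complementary pair on the boundary of K^3: x, y in K^3 and <x, y> = 0.
x = np.array([1.0, 1.0, 0.0])
y = np.array([2.0, -2.0, 0.0])
print(phi_nr(x, y), phi_fb(x, y))   # both residuals vanish

# A pair in K^3 that is not complementary: <e, e> = 1.
e = np.array([1.0, 0.0, 0.0])
print(phi_nr(e, e), phi_fb(e, e))   # both residuals are nonzero
```

The two test pairs are consistent with the characterization (4): the residuals vanish exactly at the complementary pair and not at the other one.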

It turns out that the FB SOC complementarity function ϕ_FB enjoys almost all favorable properties of the NR SOC complementarity function ϕ_NR (see [22]). Also, the squared norm of ϕ_FB induces a continuously diﬀerentiable merit function with globally Lipschitz continuous derivative [6, 7]. This greatly facilitates the globalization of the semismooth Newton method [16, 17] for solving the FB nonsmooth system of KKT conditions:

\[
E_{\rm FB}(\zeta,x,\mu,s,y) :=
\begin{bmatrix}
\mathcal{J}_{\zeta,x}L(\zeta,x,\mu,s,y)\\
h(\zeta)\\
g(\zeta)-x\\
\phi_{\rm FB}(x,y)
\end{bmatrix}
= 0.
\tag{7}
\]

Recently, with the help of [3, Theorem 30] and [5, Lemma 11], Wang and Zhang [23] gave a characterization for the strong regularity of the KKT point of the SOCP (1) via the nonsingularity study of Clarke’s Jacobian of the NR nonsmooth system

\[
E_{\rm NR}(\zeta,x,\mu,s,y) :=
\begin{bmatrix}
\mathcal{J}_{\zeta,x}L(\zeta,x,\mu,s,y)\\
h(\zeta)\\
g(\zeta)-x\\
\phi_{\rm NR}(x,y)
\end{bmatrix}
= 0.
\tag{8}
\]

They showed that the strong regularity of the KKT point, the nonsingularity of Clarke’s Jacobian of E_NR at the KKT point, and the strong second-order sufficient condition together with constraint nondegeneracy [3] are all equivalent. These nonsingularity conditions are better structured than those of [14] for the nonsingularity of the B-subdifferential of the NR system. It is then natural to ask: can one obtain a characterization of the strong regularity of the KKT point by studying the nonsingularity of Clarke’s Jacobian of E_FB? Note that, up to now, it is not even known whether the B-subdifferential of the FB system is nonsingular without the strict complementarity assumption.

In this work, for a locally optimal solution to the nonlinear SOCP (2), under Robinson’s constraint qualification, we show that the strong second-order sufficient condition and constraint nondegeneracy introduced in [3], the nonsingularity of Clarke’s Jacobian of E_FB at the KKT point, and the strong regularity of the KKT point are equivalent to each other. This, on the one hand, gives a new characterization for the strong regularity of the KKT point; on the other hand, it provides a mild condition to guarantee the quadratic convergence rate of the semismooth Newton method [16, 17] for the FB system. Note that parallel results were obtained recently for the FB system of nonlinear semidefinite programming (see [2]); however, we do not duplicate them. As will be seen in Section 3 and Section 4, the analysis techniques here are totally different from those in [2]. It seems hard to put them together in a unified framework under the Euclidean Jordan algebra, mainly because the Clarke Jacobians associated with the FB SOC complementarity function and the FB semidefinite cone complementarity function require completely different analyses.

Throughout this paper, $I$ denotes an identity matrix of appropriate dimension, $\mathbb{R}^n$ ($n>1$) denotes the space of $n$-dimensional real column vectors, and $\mathbb{R}^{n_1}\times\cdots\times\mathbb{R}^{n_r}$ is identified with $\mathbb{R}^{n_1+\cdots+n_r}$. Thus, $(x_1,\ldots,x_r)\in\mathbb{R}^{n_1}\times\cdots\times\mathbb{R}^{n_r}$ is viewed as a column vector in $\mathbb{R}^{n_1+\cdots+n_r}$. The notations ${\rm int}\,\mathcal{K}^n$, ${\rm bd}\,\mathcal{K}^n$ and ${\rm bd}^+\mathcal{K}^n$ denote the interior, the boundary, and the boundary excluding the origin of $\mathcal{K}^n$, respectively. For any $x\in\mathbb{R}^n$, we write $x\succeq_{\mathcal{K}^n}0$ (respectively, $x\succ_{\mathcal{K}^n}0$) if $x\in\mathcal{K}^n$ (respectively, $x\in{\rm int}\,\mathcal{K}^n$). For any given real symmetric matrix $A$, we write $A\succeq0$ (respectively, $A\succ0$) if $A$ is positive semidefinite (respectively, positive definite). In addition, $\mathcal{J}_\omega f(\omega)$ and $\mathcal{J}^2_{\omega\omega}f(\omega)$ denote the derivative and the second order derivative, respectively, of a twice differentiable function $f$ with respect to the variable $\omega$.

2 Preliminary results

First we recall from [11] the deﬁnition of Jordan product and spectral factorization.

Definition 2.1 The Jordan product of $x=(x_1,x_2),\ y=(y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ is given by

\[
x\circ y := (\langle x,y\rangle,\ x_1y_2+y_1x_2).
\tag{9}
\]

Unlike scalar or matrix multiplication, the Jordan product is not associative in general.

The identity element under this product is $e:=(1,0,\ldots,0)^T\in\mathbb{R}^n$, i.e., $e\circ x=x$ for all $x\in\mathbb{R}^n$. For each $x=(x_1,x_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, we define the associated arrow matrix by

\[
L_x :=
\begin{bmatrix}
x_1 & x_2^T\\
x_2 & x_1I
\end{bmatrix}.
\tag{10}
\]

Then it is easy to verify that $L_xy=x\circ y$ for any $x,y\in\mathbb{R}^n$. Recall that each $x=(x_1,x_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ admits a spectral factorization, associated with $\mathcal{K}^n$, of the form

\[
x=\lambda_1(x)u_x^{(1)}+\lambda_2(x)u_x^{(2)},
\tag{11}
\]

where $\lambda_1(x),\lambda_2(x)\in\mathbb{R}$ and $u_x^{(1)},u_x^{(2)}\in\mathbb{R}^n$ are the spectral values and the associated spectral vectors of $x$, respectively, with respect to the Jordan product, defined by

\[
\lambda_i(x) := x_1+(-1)^i\|x_2\|,\qquad
u_x^{(i)} := \frac12\begin{pmatrix}1\\ (-1)^i\tilde x_2\end{pmatrix}
\qquad\text{for } i=1,2,
\tag{12}
\]

with $\tilde x_2=\dfrac{x_2}{\|x_2\|}$ if $x_2\neq0$ and otherwise being any vector in $\mathbb{R}^{n-1}$ satisfying $\|\tilde x_2\|=1$.

Deﬁnition 2.2 The determinant of a vector x∈IRⁿ is deﬁned as det(x) := λ₁(x)λ₂(x), and a vector x is said to be invertible if its determinant det(x) is nonzero.
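The objects introduced so far in this section are easy to experiment with numerically. The sketch below, assuming NumPy and using hypothetical helper names (`jordan`, `arrow`, `spectral`), checks the identity $L_xy=x\circ y$, the factorization (11), the determinant of Definition 2.2, and the failure of associativity on one concrete triple.

```python
import numpy as np

def jordan(x, y):
    """Jordan product (9): x o y = (<x, y>, x1*y2 + y1*x2)."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

def arrow(x):
    """Arrow matrix L_x of (10)."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def spectral(x):
    """Spectral values and vectors of (12)."""
    x1, x2 = x[0], x[1:]
    r = np.linalg.norm(x2)
    # For x2 = 0 any unit vector is allowed; the first axis is an arbitrary pick.
    xt = x2 / r if r > 0 else np.eye(x2.size)[0]
    u1 = 0.5 * np.concatenate(([1.0], -xt))
    u2 = 0.5 * np.concatenate(([1.0], xt))
    return x1 - r, x1 + r, u1, u2

x = np.array([3.0, 1.0, 2.0])
y = np.array([1.0, -2.0, 0.5])
z = np.array([2.0, 0.0, 1.0])

lam1, lam2, u1, u2 = spectral(x)
arrow_ok = np.allclose(arrow(x) @ y, jordan(x, y))      # L_x y = x o y
factor_ok = np.allclose(lam1 * u1 + lam2 * u2, x)       # factorization (11)
det_x = lam1 * lam2                                     # det(x) = x1^2 - ||x2||^2
left = jordan(jordan(x, y), z)                          # (x o y) o z
right = jordan(x, jordan(y, z))                         # x o (y o z)

print(arrow_ok, factor_ok, det_x)
print(left, right)                                      # the two products differ
```

For this triple the two bracketings differ in their second and third components, which illustrates the remark after Definition 2.1 that the Jordan product is not associative.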

By the formula of spectral factorization, it is easy to compute that the projection of $x\in\mathbb{R}^n$ onto the closed convex cone $\mathcal{K}^n$, denoted by $\Pi_{\mathcal{K}^n}(x)$, has the expression

\[
\Pi_{\mathcal{K}^n}(x)=\max(0,\lambda_1(x))\,u_x^{(1)}+\max(0,\lambda_2(x))\,u_x^{(2)}.
\]

Define $|x| := 2\Pi_{\mathcal{K}^n}(x)-x$. Then, using the expression of $\Pi_{\mathcal{K}^n}(x)$, it follows that

\[
|x| = |\lambda_1(x)|\,u_x^{(1)}+|\lambda_2(x)|\,u_x^{(2)}.
\]

The spectral factorization of the vectors $x$, $x^2$, $\sqrt{x}$ and the matrix $L_x$ have various interesting properties (see [12]). We list several properties that we will use later.

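Property 2.1 For any $x=(x_1,x_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ with spectral factorization (11), we have

(a) $x^2=\lambda_1^2(x)u_x^{(1)}+\lambda_2^2(x)u_x^{(2)}\in\mathcal{K}^n$.

(b) If $x\in\mathcal{K}^n$, then $0\le\lambda_1(x)\le\lambda_2(x)$ and $\sqrt{x}=\sqrt{\lambda_1(x)}\,u_x^{(1)}+\sqrt{\lambda_2(x)}\,u_x^{(2)}$.

(c) If $x\in{\rm int}\,\mathcal{K}^n$, then $0<\lambda_1(x)\le\lambda_2(x)$ and $L_x$ is invertible with

\[
L_x^{-1}=\frac{1}{\det(x)}
\begin{bmatrix}
x_1 & -x_2^T\\[2pt]
-x_2 & \dfrac{\det(x)}{x_1}I+\dfrac{x_2x_2^T}{x_1}
\end{bmatrix}.
\tag{13}
\]

(d) $L_x\succeq0$ (respectively, $L_x\succ0$) if and only if $x\in\mathcal{K}^n$ (respectively, $x\in{\rm int}\,\mathcal{K}^n$).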
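Parts (c) and (d) of Property 2.1 can be sanity-checked on small examples. The following sketch assumes NumPy; `arrow` and `arrow_inv` are hypothetical helper names, and the three test vectors are arbitrary choices in the interior, on the boundary, and outside of $\mathcal{K}^3$.

```python
import numpy as np

def arrow(x):
    """Arrow matrix L_x of (10)."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def arrow_inv(x):
    """Closed-form inverse (13); only meaningful for x in int K^n."""
    x1, x2 = x[0], x[1:]
    d = x1**2 - x2 @ x2                       # det(x) = lam1(x) * lam2(x)
    n = x.size
    M = np.empty((n, n))
    M[0, 0] = x1
    M[0, 1:] = -x2
    M[1:, 0] = -x2
    M[1:, 1:] = (d / x1) * np.eye(n - 1) + np.outer(x2, x2) / x1
    return M / d

x_int = np.array([2.0, 1.0, 1.0])   # x1 > ||x2||: interior
x_bd = np.array([1.0, 1.0, 0.0])    # x1 = ||x2||: boundary
x_out = np.array([1.0, 2.0, 0.0])   # x1 < ||x2||: outside the cone

inverse_ok = np.allclose(arrow(x_int) @ arrow_inv(x_int), np.eye(3))
min_eigs = [np.linalg.eigvalsh(arrow(v)).min() for v in (x_int, x_bd, x_out)]
print(inverse_ok, min_eigs)   # smallest eigenvalue: positive, about zero, negative
```

The sign pattern of the smallest eigenvalue of $L_x$ across the three vectors matches part (d).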

The following lemma states a result for the arrow matrices associated with $x,y\in\mathbb{R}^n$ and $z\succeq_{\mathcal{K}^n}\sqrt{x^2+y^2}$, which will be used in the next section to characterize an important property for the elements of Clarke’s Jacobian of $\phi_{\rm FB}$ at a general point.

Lemma 2.1 For any given $x,y\in\mathbb{R}^n$ and $z\succ_{\mathcal{K}^n}0$, if $z^2\succeq_{\mathcal{K}^n}x^2+y^2$, then

\[
\left\|\left[\,L_z^{-1}L_x\ \ \ L_z^{-1}L_y\,\right]\right\|_2\le1,
\]

where $\|A\|_2$ means the spectral norm of a real matrix $A$. Consequently, it holds that

\[
\left\|L_z^{-1}L_x\Delta u+L_z^{-1}L_y\Delta v\right\|\le\sqrt{\|\Delta u\|^2+\|\Delta v\|^2}\qquad\forall\,\Delta u,\Delta v\in\mathbb{R}^n.
\]

Proof. Let $A=[\,L_z^{-1}L_x\ \ L_z^{-1}L_y\,]$. From [12, Proposition 3.4], it follows that

\[
AA^T=L_z^{-1}(L_x^2+L_y^2)L_z^{-1}\preceq L_z^{-1}L_z^2L_z^{-1}=I.
\]

This shows that $\|A\|_2\le1$, and the first part follows. Note that for any $\xi\in\mathbb{R}^{2n}$,

\[
\|A\xi\|^2=\xi^TA^TA\xi\le\lambda_{\max}(A^TA)\|\xi\|^2\le\|\xi\|^2.
\]

By letting $\xi=(\Delta u,\Delta v)\in\mathbb{R}^n\times\mathbb{R}^n$, we immediately obtain the second part. □

The following two lemmas state the properties of $x,y$ with $x^2+y^2\in{\rm bd}\,\mathcal{K}^n$ which are often used in the subsequent sections. The proof of Lemma 2.2 is given in [6, Lemma 2].
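Before turning to these lemmas, the bound of Lemma 2.1 can be checked numerically. The sketch below assumes NumPy and uses hypothetical helpers (`sq`, `soc_sqrt`, `arrow`); the vectors $x,y$ are arbitrary, and the two choices $z=\sqrt{x^2+y^2}$ and $z=\sqrt{x^2+y^2+e}$ are merely convenient ways to satisfy $z\succ_{\mathcal{K}^n}0$ and $z^2\succeq_{\mathcal{K}^n}x^2+y^2$.

```python
import numpy as np

def sq(x):
    """Jordan square x o x."""
    return np.concatenate(([x @ x], 2.0 * x[0] * x[1:]))

def soc_sqrt(w):
    """Square root of w in K^n via its spectral factorization."""
    w1, w2 = w[0], w[1:]
    r = np.linalg.norm(w2)
    s1, s2 = np.sqrt(max(w1 - r, 0.0)), np.sqrt(w1 + r)
    wt = w2 / r if r > 0 else np.zeros_like(w2)
    return np.concatenate(([0.5 * (s1 + s2)], 0.5 * (s2 - s1) * wt))

def arrow(x):
    """Arrow matrix L_x of (10)."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

x = np.array([1.0, 2.0, 0.0, -1.0])
y = np.array([0.5, -1.0, 3.0, 0.0])
e = np.array([1.0, 0.0, 0.0, 0.0])
w = sq(x) + sq(y)                       # lies in int K^4 for this choice of x, y

rng = np.random.default_rng(0)
norms, second_part = [], []
for z in (soc_sqrt(w), soc_sqrt(w + e)):        # z^2 = w and z^2 = w + e
    Lz_inv = np.linalg.inv(arrow(z))
    A = np.hstack((Lz_inv @ arrow(x), Lz_inv @ arrow(y)))
    norms.append(np.linalg.norm(A, 2))          # spectral norm of A
    du, dv = rng.standard_normal(4), rng.standard_normal(4)
    lhs = np.linalg.norm(A @ np.concatenate((du, dv)))
    second_part.append(lhs <= np.sqrt(du @ du + dv @ dv) + 1e-12)

print(norms, second_part)
```

Up to round-off, both spectral norms stay at or below one, as the lemma predicts.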

Lemma 2.2 For any $x=(x_1,x_2),\ y=(y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ with $x^2+y^2\in{\rm bd}\,\mathcal{K}^n$, we have

\[
x_1^2=\|x_2\|^2,\qquad y_1^2=\|y_2\|^2,\qquad x_1y_1=x_2^Ty_2,\qquad x_1y_2=y_1x_2.
\]

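Lemma 2.3 For any $x=(x_1,x_2),\ y=(y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, let $w=(w_1,w_2):=x^2+y^2$.

(a) If $w\in{\rm bd}\,\mathcal{K}^n$, then for any $g=(g_1,g_2),\ h=(h_1,h_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, it holds that

\[
(x_1x_2+y_1y_2)^T(x_1g_2+g_1x_2+y_1h_2+h_1y_2)=(x_1^2+y_1^2)(x^Tg+y^Th).
\]

(b) If $w\in{\rm bd}^+\mathcal{K}^n$, then the following four equalities hold

\[
\frac{x_1w_2}{\|w_2\|}=x_2,\qquad \frac{x_2^Tw_2}{\|w_2\|}=x_1,\qquad \frac{y_1w_2}{\|w_2\|}=y_2,\qquad \frac{y_2^Tw_2}{\|w_2\|}=y_1;
\]

and consequently the expression of $\phi_{\rm FB}(x,y)$ can be simplified as

\[
\phi_{\rm FB}(x,y)=
\begin{pmatrix}
x_1+y_1-\sqrt{x_1^2+y_1^2}\\[4pt]
x_2+y_2-\dfrac{x_1x_2+y_1y_2}{\sqrt{x_1^2+y_1^2}}
\end{pmatrix}.
\tag{14}
\]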

Proof. (a) The result is direct by the equalities of Lemma 2.2 since $x^2+y^2\in{\rm bd}\,\mathcal{K}^n$. (b) Since $w\in{\rm bd}^+\mathcal{K}^n$, we must have $w_2=2(x_1x_2+y_1y_2)\neq0$. Using Lemma 2.2, $w_2=2(x_1x_2+y_1y_2)$ and $\|w_2\|=w_1=2(x_1^2+y_1^2)$, we easily obtain the first part. Note that $\phi_{\rm FB}(x,y)=(x+y)-\sqrt{w}$. Using Property 2.1(b) and Lemma 2.2 yields (14). □

When $x,y\in{\rm bd}\,\mathcal{K}^n$ satisfy the complementary condition, we have the following result.

Lemma 2.4 For any given $x=(x_1,x_2),\ y=(y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, if $x,y\in{\rm bd}\,\mathcal{K}^n$ and $\langle x,y\rangle=0$, then there exists a constant $\alpha>0$ such that $x_1=\alpha y_1$ and $x_2=-\alpha y_2$.

Proof. Since $x,y\in{\rm bd}\,\mathcal{K}^n$, we have that $x_1=\|x_2\|$ and $y_1=\|y_2\|$, and consequently,

\[
0=\langle x,y\rangle=x_1y_1+x_2^Ty_2=\|x_2\|\|y_2\|+x_2^Ty_2.
\]

This means that there exists $\alpha>0$ such that $x_2=-\alpha y_2$, and then $x_1=\alpha y_1$. □

Next we recall from [21] the strong regularity for a solution of generalized equation

\[
0\in\phi(z)+N_D(z),
\tag{15}
\]

where $\phi$ is a continuously differentiable mapping from a finite dimensional real vector space $Z$ to itself, $D$ is a closed convex set in $Z$, and $N_D(z)$ is the normal cone of $D$ at $z$.

As will be shown in Sec. 4, the KKT condition (3) can be written in the form of (15).


Definition 2.3 We say that $\bar z$ is a strongly regular solution of the generalized equation (15) if there exist neighborhoods $\mathbb{B}$ of the origin $0\in Z$ and $\mathbb{V}$ of $\bar z$ such that for every $\delta\in\mathbb{B}$, the linearized generalized equation $\delta\in\phi(\bar z)+\mathcal{J}_z\phi(\bar z)(z-\bar z)+N_D(z)$ has a unique solution in $\mathbb{V}$, denoted by $z_{\mathbb{V}}(\delta)$, and the mapping $z_{\mathbb{V}}:\mathbb{B}\to\mathbb{V}$ is Lipschitz continuous.

To close this section, we recall from [8] Clarke’s (generalized) Jacobian of a locally Lipschitz mapping. Let $S\subset\mathbb{R}^n$ be an open set and $\Xi:S\to\mathbb{R}^n$ be a locally Lipschitz continuous function on $S$. By Rademacher’s theorem, $\Xi$ is almost everywhere F(réchet)-differentiable in $S$. We denote by $S_\Xi$ the set of points in $S$ where $\Xi$ is F-differentiable. Then Clarke’s Jacobian of $\Xi$ at $y$ is defined by $\partial\Xi(y):={\rm conv}\{\partial_B\Xi(y)\}$, where “conv” means the convex hull, and the B-subdifferential $\partial_B\Xi(y)$, a name coined in [18], has the form

\[
\partial_B\Xi(y) := \left\{ V : V=\lim_{k\to\infty}\mathcal{J}_y\Xi(y^k),\ y^k\to y,\ y^k\in S_\Xi \right\}.
\]

For the concept of (strong) semismoothness, please refer to the literature [16, 17].

Unless otherwise stated, in the rest of this paper, for any $x\in\mathbb{R}^n$ ($n>1$), we write $x=(x_1,x_2)$ where $x_1$ is the first component of $x$, and $x_2$ is a column vector consisting of the remaining $n-1$ entries of $x$. For any $x=(x_1,x_2),\ y=(y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, let

\[
w=w(x,y):=x^2+y^2,\qquad \tilde w_2:=\frac{w_2}{\|w_2\|}\ \text{ if } w_2\neq0\qquad\text{and}\qquad z=z(x,y)=\sqrt{w(x,y)}.
\tag{16}
\]

3 Directional derivative and B-subdiﬀerential

The function ϕ_FB is directionally differentiable everywhere by [22, Corollary 3.3]. But, to the best of our knowledge, the expression of its directional derivative is not given in the literature. In this section, we derive its expression, and then prove that the B-subdifferential of ϕ_FB at a general point coincides with that of its directional derivative function at the origin. Throughout this section, we assume that K = Kⁿ.

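Proposition 3.1 For any given $x=(x_1,x_2),\ y=(y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, the directional derivative $\phi'_{\rm FB}((x,y);(g,h))$ of $\phi_{\rm FB}$ at $(x,y)$ with the direction $(g,h)$ has the following form.

(a) If $(x,y)=(0,0)$, then $\phi'_{\rm FB}((x,y);(g,h))=\phi_{\rm FB}(g,h)$.

(b) If $x^2+y^2\in{\rm int}\,\mathcal{K}^n$, then $\phi'_{\rm FB}((x,y);(g,h))=\left(I-L_z^{-1}L_x\right)g+\left(I-L_z^{-1}L_y\right)h$.

(c) If $x^2+y^2\in{\rm bd}^+\mathcal{K}^n$, then

\[
\begin{aligned}
\phi'_{\rm FB}((x,y);(g,h)) ={}& (g+h)-\frac{\varphi(g,h)}{2}\begin{pmatrix}1\\ -\tilde w_2\end{pmatrix}
+\frac{x_2^Tg_2+y_2^Th_2}{2\sqrt{x_1^2+y_1^2}}\begin{pmatrix}0\\ \tilde w_2\end{pmatrix}\\
&-\frac{1}{2\sqrt{x_1^2+y_1^2}}\begin{pmatrix}x^Tg+y^Th\\ 2x_1g_2+g_1x_2+2y_1h_2+h_1y_2\end{pmatrix}
\end{aligned}
\tag{17}
\]

where $g=(g_1,g_2),\ h=(h_1,h_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, and $\varphi:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$ is defined by

\[
\varphi(g,h) := \frac{\sqrt{(x_1g_1-x_2^Tg_2+y_1h_1-y_2^Th_2)^2+\|x_1h_2-h_1x_2+g_1y_2-y_1g_2\|^2}}{\sqrt{x_1^2+y_1^2}}.
\tag{18}
\]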
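Formula (17) can be compared against a one-sided difference quotient. The sketch below assumes NumPy; the helper names, the base point, the direction, and the step size `t = 1e-5` are all illustrative choices. The base point is picked so that $x^2+y^2$ lies on ${\rm bd}^+\mathcal{K}^4$ exactly in floating point, which avoids taking the square root of a round-off-sized spectral value. Since $\phi_{\rm FB}$ is positively homogeneous, the derivative in the direction $(x,y)$ should also reproduce $\phi_{\rm FB}(x,y)$, and this gives a second, difference-free check.

```python
import numpy as np

def sq(x):
    """Jordan square x o x."""
    return np.concatenate(([x @ x], 2.0 * x[0] * x[1:]))

def soc_sqrt(w):
    """Square root of w in K^n via its spectral factorization."""
    w1, w2 = w[0], w[1:]
    r = np.linalg.norm(w2)
    s1, s2 = np.sqrt(max(w1 - r, 0.0)), np.sqrt(w1 + r)
    wt = w2 / r if r > 0 else np.zeros_like(w2)
    return np.concatenate(([0.5 * (s1 + s2)], 0.5 * (s2 - s1) * wt))

def phi_fb(x, y):
    """FB function (6)."""
    return (x + y) - soc_sqrt(sq(x) + sq(y))

def dphi_fb_bd(x, y, g, h):
    """Formula (17), valid when x^2 + y^2 lies in bd+ K^n."""
    x1, x2, y1, y2 = x[0], x[1:], y[0], y[1:]
    g1, g2, h1, h2 = g[0], g[1:], h[0], h[1:]
    s = np.sqrt(x1**2 + y1**2)
    w2 = 2.0 * (x1 * x2 + y1 * y2)
    wt = w2 / np.linalg.norm(w2)
    # varphi(g, h) of (18)
    varphi = np.sqrt((x1 * g1 - x2 @ g2 + y1 * h1 - y2 @ h2) ** 2
                     + np.linalg.norm(x1 * h2 - h1 * x2 + g1 * y2 - y1 * g2) ** 2) / s
    d1 = g1 + h1 - 0.5 * varphi - (x @ g + y @ h) / (2.0 * s)
    d2 = (g2 + h2 + 0.5 * varphi * wt
          + (x2 @ g2 + y2 @ h2) / (2.0 * s) * wt
          - (2.0 * x1 * g2 + g1 * x2 + 2.0 * y1 * h2 + h1 * y2) / (2.0 * s))
    return np.concatenate(([d1], d2))

# Base point with x^2 + y^2 = (2.5, 2.5, 0, 0), exactly on the boundary.
x = np.array([1.0, 1.0, 0.0, 0.0])
y = np.array([-0.5, -0.5, 0.0, 0.0])
g = np.array([1.0, 0.0, 2.0, -1.0])
h = np.array([0.5, 1.0, -1.0, 0.0])

t = 1e-5
quotient = (phi_fb(x + t * g, y + t * h) - phi_fb(x, y)) / t
err_fd = np.max(np.abs(quotient - dphi_fb_bd(x, y, g, h)))
err_homog = np.max(np.abs(dphi_fb_bd(x, y, x, y) - phi_fb(x, y)))
print(err_fd, err_homog)
```

The difference quotient only approximates the limit, so the first error is expected to be small but not at machine precision; the homogeneity check involves no limit and should agree up to round-off.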

Proof. Part (a) is immediate by noting that ϕ_FB is a positively homogeneous function.

Part (b) is due to [12, Proposition 5.2]. We next prove part (c) by two subcases as shown below. In the rest of proof, we let λ₁, λ₂ with λ₁ ≤ λ2 denote the spectral values of w.

Since $w=x^2+y^2\in{\rm bd}^+\mathcal{K}^n$, we have $w_2\neq0$, and from Lemma 2.3(b) it follows that

\[
w_1=\|w_2\|=2\|x_1x_2+y_1y_2\|=2\|x_1^2\tilde w_2+y_1^2\tilde w_2\|=2(x_1^2+y_1^2),
\]

\[
\lambda_1=w_1-\|w_2\|=0,\qquad \lambda_2=w_1+\|w_2\|=4(x_1^2+y_1^2).
\]

(c.1): $(x+tg)^2+(y+th)^2\in{\rm bd}^+\mathcal{K}^n$ for sufficiently small $t>0$. In this case, from Lemma 2.3(b), we know that $\phi_{\rm FB}(x+tg,y+th)$ has the following expression

\[
\begin{pmatrix}
(x_1+y_1)+t(g_1+h_1)-\sqrt{(x_1+tg_1)^2+(y_1+th_1)^2}\\[6pt]
(x_2+y_2)+t(g_2+h_2)-\dfrac{(x_1+tg_1)(x_2+tg_2)+(y_1+th_1)(y_2+th_2)}{\sqrt{(x_1+tg_1)^2+(y_1+th_1)^2}}
\end{pmatrix}.
\]

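Let $[\phi_{\rm FB}(x,y)]_1$ be the first element of $\phi_{\rm FB}(x,y)$ and $[\phi_{\rm FB}(x,y)]_2$ be the vector consisting of the remaining $n-1$ components of $\phi_{\rm FB}(x,y)$. By the above expression of $\phi_{\rm FB}(x+tg,y+th)$,

\[
\begin{aligned}
\lim_{t\downarrow0}\frac{[\phi_{\rm FB}(x+tg,y+th)]_1-[\phi_{\rm FB}(x,y)]_1}{t}
&=(g_1+h_1)-\lim_{t\downarrow0}\frac{\sqrt{(x_1+tg_1)^2+(y_1+th_1)^2}-\sqrt{x_1^2+y_1^2}}{t}\\
&=(g_1+h_1)-\frac{x_1g_1+y_1h_1}{\sqrt{x_1^2+y_1^2}}
\end{aligned}
\]

and

\[
\begin{aligned}
&\lim_{t\downarrow0}\frac{[\phi_{\rm FB}(x+tg,y+th)]_2-[\phi_{\rm FB}(x,y)]_2}{t}\\
&=(g_2+h_2)-\lim_{t\downarrow0}\left[\frac{(x_1+tg_1)(x_2+tg_2)+(y_1+th_1)(y_2+th_2)}{t\sqrt{(x_1+tg_1)^2+(y_1+th_1)^2}}-\frac{x_1x_2+y_1y_2}{t\sqrt{x_1^2+y_1^2}}\right]\\
&=(g_2+h_2)-\frac{g_1x_2+x_1g_2+y_1h_2+h_1y_2}{\sqrt{x_1^2+y_1^2}}
-\lim_{t\downarrow0}\left[\frac{x_1x_2+y_1y_2}{t\sqrt{(x_1+tg_1)^2+(y_1+th_1)^2}}-\frac{x_1x_2+y_1y_2}{t\sqrt{x_1^2+y_1^2}}\right]\\
&=(g_2+h_2)-\frac{g_1x_2+x_1g_2+y_1h_2+h_1y_2}{\sqrt{x_1^2+y_1^2}}
+\frac{(x_1x_2+y_1y_2)(x_1g_1+y_1h_1)}{(x_1^2+y_1^2)\sqrt{x_1^2+y_1^2}}\\
&=(g_2+h_2)-\frac{x_1g_2+y_1h_2}{\sqrt{x_1^2+y_1^2}}
\end{aligned}
\]

where the last equality is using $x_1y_2=y_1x_2$ by Lemma 2.2. The above two limits imply

\[
\phi'_{\rm FB}((x,y);(g,h))=(g+h)-\frac{x_1}{\sqrt{x_1^2+y_1^2}}\,g-\frac{y_1}{\sqrt{x_1^2+y_1^2}}\,h.
\tag{19}
\]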

(c.2): $(x+tg)^2+(y+th)^2\in{\rm int}\,\mathcal{K}^n$ for sufficiently small $t>0$. Let $u=(u_1,u_2):=(x+tg)^2+(y+th)^2$ with the spectral values $\mu_1,\mu_2$. An elementary calculation gives

\[
u_1=\|x+tg\|^2+\|y+th\|^2=w_1+2t(x^Tg+y^Th)+t^2(\|g\|^2+\|h\|^2),
\tag{20}
\]

\[
\begin{aligned}
u_2&=2(x_1+tg_1)(x_2+tg_2)+2(y_1+th_1)(y_2+th_2)\\
&=w_2+2t(x_1g_2+g_1x_2+y_1h_2+h_1y_2)+2t^2(g_1g_2+h_1h_2).
\end{aligned}
\tag{21}
\]

Also, since $w_2\neq0$, applying the Taylor formula of $\|\cdot\|$ at $w_2$ and Lemma 2.3(a) yields

\[
\|u_2\|=\|w_2\|+\frac{w_2^T(u_2-w_2)}{\|w_2\|}+o(t)=\|w_2\|+2t(x^Tg+y^Th)+o(t).
\tag{22}
\]

Now using the definition of $\phi_{\rm FB}$ and noting that $\lambda_1=0$ and $w_2\neq0$, we have that

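\[
\begin{aligned}
\phi_{\rm FB}(x+tg,y+th)-\phi_{\rm FB}(x,y)
&=(x+tg+y+th)-\sqrt{u}-(x+y)+\sqrt{w}\\
&=t(g+h)-
\begin{pmatrix}
\dfrac{\sqrt{\mu_1}+\sqrt{\mu_2}-\sqrt{\lambda_2}}{2}\\[10pt]
\dfrac{\sqrt{\mu_2}-\sqrt{\mu_1}}{2}\dfrac{u_2}{\|u_2\|}-\dfrac{\sqrt{\lambda_2}}{2}\dfrac{w_2}{\|w_2\|}
\end{pmatrix},
\end{aligned}
\]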

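which in turn implies that

\[
\phi'_{\rm FB}((x,y);(g,h))=(g+h)-
\begin{pmatrix}
\displaystyle\lim_{t\downarrow0}\frac{\sqrt{\mu_1}+\sqrt{\mu_2}-\sqrt{\lambda_2}}{2t}\\[12pt]
\displaystyle\lim_{t\downarrow0}\left(\frac{\sqrt{\mu_2}-\sqrt{\mu_1}}{2t}\frac{u_2}{\|u_2\|}-\frac{\sqrt{\lambda_2}}{2t}\frac{w_2}{\|w_2\|}\right)
\end{pmatrix}.
\tag{23}
\]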

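We first calculate $\lim_{t\downarrow0}\dfrac{\sqrt{\mu_2}-\sqrt{\lambda_2}}{t}$. Using equations (20) and (22), it is easy to see that

\[
\mu_2-\lambda_2=(u_1-w_1)+(\|u_2\|-\|w_2\|)=4t(x^Tg+y^Th)+o(t),
\]

and consequently, since $\sqrt{\lambda_2}=2\sqrt{x_1^2+y_1^2}$,

\[
\lim_{t\downarrow0}\frac{\sqrt{\mu_2}-\sqrt{\lambda_2}}{t}
=\lim_{t\downarrow0}\frac{\mu_2-\lambda_2}{t}\cdot\frac{1}{\sqrt{\mu_2}+\sqrt{\lambda_2}}
=\frac{2(x^Tg+y^Th)}{\sqrt{\lambda_2}}
=\frac{x^Tg+y^Th}{\sqrt{x_1^2+y_1^2}}.
\tag{24}
\]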

We next calculate $\lim_{t\downarrow0}\dfrac{\sqrt{\mu_1}}{t}$. Since $w_1-\|w_2\|=0$, using (20)-(21) and Lemma 2.3(a),

\[
\begin{aligned}
\mu_1&=(u_1-w_1)-(\|u_2\|-\|w_2\|)=(u_1-w_1)-\frac{\|u_2\|^2-\|w_2\|^2}{\|u_2\|+\|w_2\|}\\
&=2t(x^Tg+y^Th)-\frac{4t\,w_2^T(x_1g_2+g_1x_2+y_1h_2+h_1y_2)}{\|u_2\|+\|w_2\|}+t^2(\|g\|^2+\|h\|^2)\\
&\quad-\frac{4t^2\|g_1x_2+x_1g_2+y_1h_2+h_1y_2\|^2}{\|u_2\|+\|w_2\|}-\frac{4t^2w_2^T(g_1g_2+h_1h_2)}{\|u_2\|+\|w_2\|}+o(t^2)\\
&=2t(x^Tg+y^Th)-\frac{8t(x_1^2+y_1^2)(x^Tg+y^Th)}{\|u_2\|+\|w_2\|}+t^2(\|g\|^2+\|h\|^2)+o(t^2)\\
&\quad-\frac{4t^2\|g_1x_2+x_1g_2+y_1h_2+h_1y_2\|^2}{\|u_2\|+\|w_2\|}-\frac{8t^2(x_1x_2+y_1y_2)^T(g_1g_2+h_1h_2)}{\|u_2\|+\|w_2\|}.
\end{aligned}
\tag{25}
\]

Using $\|w_2\|=2(x_1^2+y_1^2)$ and (22), we simplify the sum of the first two terms in (25) as

\[
\begin{aligned}
2t(x^Tg+y^Th)-\frac{4t\|w_2\|(x^Tg+y^Th)}{\|u_2\|+\|w_2\|}
&=2t(x^Tg+y^Th)\,\frac{\|u_2\|-\|w_2\|}{\|u_2\|+\|w_2\|}\\
&=\frac{4t^2(x^Tg+y^Th)^2}{\|u_2\|+\|w_2\|}+o(t^2).
\end{aligned}
\]

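Then, from equation (25) and $\|w_2\|=2(x_1^2+y_1^2)$, we obtain that

\[
\begin{aligned}
\lim_{t\downarrow0}\frac{\mu_1}{t^2}
={}&\frac{(x_1^2+y_1^2)(\|g\|^2+\|h\|^2)-\|g_1x_2+x_1g_2+y_1h_2+h_1y_2\|^2}{x_1^2+y_1^2}\\
&+\frac{(x^Tg+y^Th)^2-2(x_1x_2+y_1y_2)^T(g_1g_2+h_1h_2)}{x_1^2+y_1^2}.
\end{aligned}
\tag{26}
\]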

We next simplify the numerator of the right hand side of (26). Note that

\[
\begin{aligned}
&(x_1^2+y_1^2)(\|g\|^2+\|h\|^2)-\|g_1x_2+x_1g_2+y_1h_2+h_1y_2\|^2\\
&=(x_1^2+y_1^2)(\|g\|^2+\|h\|^2)-\|g_1x_2+x_1g_2\|^2-\|y_1h_2+h_1y_2\|^2-2(g_1x_2+x_1g_2)^T(y_1h_2+h_1y_2)\\
&=x_1^2\|h\|^2+y_1^2\|g\|^2-2x_1g_1x_2^Tg_2-2y_1h_1y_2^Th_2-2(g_1x_2+x_1g_2)^T(y_1h_2+h_1y_2)
\end{aligned}
\]

and

\[
\begin{aligned}
&(x^Tg+y^Th)^2-2(x_1x_2+y_1y_2)^T(g_1g_2+h_1h_2)\\
&=(x_1g_1+x_2^Tg_2)^2+(y_1h_1+y_2^Th_2)^2+2x^Tg\,y^Th-2(x_1x_2+y_1y_2)^T(g_1g_2+h_1h_2)\\
&=(x_1g_1)^2+(x_2^Tg_2)^2+(y_1h_1)^2+(y_2^Th_2)^2+2x^Tg\,y^Th-2x_1h_1x_2^Th_2-2g_1y_1g_2^Ty_2.
\end{aligned}
\]

Therefore, adding the last two equalities and using Lemma 2.2 yields that

\[
\begin{aligned}
&(x_1^2+y_1^2)(\|g\|^2+\|h\|^2)-\|g_1x_2+x_1g_2+y_1h_2+h_1y_2\|^2+(x^Tg+y^Th)^2-2(x_1x_2+y_1y_2)^T(g_1g_2+h_1h_2)\\
&=\left(x_1^2\|h\|^2-2x_1h_1x_2^Th_2\right)+\left(y_1^2\|g\|^2-2g_1y_1g_2^Ty_2\right)
+\left((x_1g_1)^2+(x_2^Tg_2)^2-2x_1g_1x_2^Tg_2\right)\\
&\quad+\left((y_1h_1)^2+(y_2^Th_2)^2-2y_1h_1y_2^Th_2\right)
+2x^Tg\,y^Th-2(g_1x_2+x_1g_2)^T(y_1h_2+h_1y_2)\\
&=\|x_1h_2-h_1x_2\|^2+\|g_1y_2-y_1g_2\|^2+(x_1g_1-x_2^Tg_2)^2+(y_1h_1-y_2^Th_2)^2\\
&\quad+2(g_1x_1+g_2^Tx_2)(y_1h_1+y_2^Th_2)-2(g_1x_2+x_1g_2)^T(y_1h_2+h_1y_2)\\
&=\|x_1h_2-h_1x_2\|^2+\|g_1y_2-y_1g_2\|^2+(x_1g_1-x_2^Tg_2)^2+(y_1h_1-y_2^Th_2)^2\\
&\quad+2(x_1h_2-h_1x_2)^T(g_1y_2-g_2y_1)+2(x_1g_1-x_2^Tg_2)(y_1h_1-y_2^Th_2)\\
&=\|x_1h_2-h_1x_2+g_1y_2-y_1g_2\|^2+(x_1g_1-x_2^Tg_2+y_1h_1-y_2^Th_2)^2.
\end{aligned}
\]

Combining this equality with (26) and using the definition of $\varphi$ in (18), we readily get

\[
\lim_{t\downarrow0}\frac{\sqrt{\mu_1}}{t}=\varphi(g,h).
\tag{27}
\]

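We next calculate $\displaystyle\lim_{t\downarrow0}\left[\frac{\sqrt{\mu_2}-\sqrt{\mu_1}}{2t}\frac{u_2}{\|u_2\|}-\frac{\sqrt{\lambda_2}}{2t}\frac{w_2}{\|w_2\|}\right]$. To this end, we also need to take a look at $\|w_2\|u_2-\|u_2\|w_2$. From equations (20)-(21) and (22), it follows that

\[
\|w_2\|u_2-\|u_2\|w_2=2t\|w_2\|\left[(x_1g_2+g_1x_2+y_1h_2+h_1y_2)-(x^Tg+y^Th)\tilde w_2\right]+o(t).
\]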

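Together with equations (24) and (27), we have that

\[
\begin{aligned}
&\lim_{t\downarrow0}\left[\frac{\sqrt{\mu_2}-\sqrt{\mu_1}}{2t}\frac{u_2}{\|u_2\|}-\frac{\sqrt{\lambda_2}}{2t}\frac{w_2}{\|w_2\|}\right]\\
&=-\lim_{t\downarrow0}\frac{\sqrt{\mu_1}}{2t}\frac{u_2}{\|u_2\|}
+\lim_{t\downarrow0}\left[\frac{\sqrt{\mu_2}}{2t}\frac{u_2}{\|u_2\|}-\frac{\sqrt{\lambda_2}}{2t}\frac{w_2}{\|w_2\|}\right]\\
&=-\lim_{t\downarrow0}\frac{\sqrt{\mu_1}}{2t}\frac{u_2}{\|u_2\|}
+\lim_{t\downarrow0}\frac{\sqrt{\mu_2}-\sqrt{\lambda_2}}{2t}\frac{u_2}{\|u_2\|}
+\lim_{t\downarrow0}\frac{\sqrt{\lambda_2}\left(\|w_2\|u_2-\|u_2\|w_2\right)}{2t\|u_2\|\|w_2\|}\\
&=-\frac{\varphi(g,h)}{2}\tilde w_2+\frac{x_1g_2+g_1x_2+y_1h_2+h_1y_2}{\sqrt{x_1^2+y_1^2}}-\frac{x^Tg+y^Th}{2\sqrt{x_1^2+y_1^2}}\tilde w_2\\
&=-\frac{\varphi(g,h)}{2}\tilde w_2+\frac{2x_1g_2+g_1x_2+2y_1h_2+h_1y_2}{2\sqrt{x_1^2+y_1^2}}-\frac{x_2^Tg_2+y_2^Th_2}{2\sqrt{x_1^2+y_1^2}}\tilde w_2,
\end{aligned}
\]

where the last equality is using $x_1\tilde w_2=x_2$ and $y_1\tilde w_2=y_2$. Combining with (23), (24) and (27), a suitable rearrangement shows that $\phi'_{\rm FB}((x,y);(g,h))$ has the expression (17).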

Finally, we show that when $(x+tg)^2+(y+th)^2\in{\rm bd}^+\mathcal{K}^n$ for sufficiently small $t>0$, the formula in (17) reduces to the one in (19). Indeed, an elementary calculation yields

\[
\begin{aligned}
&\lambda_1\big((x+tg)^2+(y+th)^2\big)\,\lambda_2\big((x+tg)^2+(y+th)^2\big)\\
&=\left[\|x+tg\|^2+\|y+th\|^2\right]^2-4\left\|(x_1+tg_1)(x_2+tg_2)+(y_1+th_1)(y_2+th_2)\right\|^2\\
&=4t^2\varphi(g,h)^2(x_1^2+y_1^2)+4t^3(x^Tg+y^Th)(\|g\|^2+\|h\|^2)\\
&\quad-8t^3(x_1g_2+g_1x_2+y_1h_2+h_1y_2)^T(g_1g_2+h_1h_2)
+t^4\left[(\|g\|^2+\|h\|^2)^2-4\|g_1g_2+h_1h_2\|^2\right]\\
&=4t^2\varphi(g,h)^2(x_1^2+y_1^2)+o(t^2).
\end{aligned}
\]