A one-parametric class of merit functions for the second-order cone complementarity problem


Jein-Shan Chen ¹ Department of Mathematics National Taiwan Normal University

Taipei, Taiwan 11677 E-mail: [email protected]

Shaohua Pan

School of Mathematical Sciences South China University of Technology

Guangzhou 510640, China E-mail: [email protected]

January 10, 2007

(first revised September 3, 2007) (second revised December 24, 2007)

Abstract: We investigate a one-parametric class of merit functions for the second-order cone complementarity problem (SOCCP) that is closely related to two popular merit functions: the Fischer-Burmeister (FB) merit function and the natural residual merit function. The class reduces to the FB merit function when the parameter τ equals 2, and as τ tends to zero its limit becomes a multiple of the natural residual merit function. We show that this class of merit functions shares several favorable properties with the FB merit function, for example, smoothness. These properties play an important role in reformulating the SOCCP as an unconstrained minimization problem or as a nonsmooth system of equations. Numerical results are reported for some convex second-order cone programs (SOCPs), obtained by solving the unconstrained minimization reformulation of the KKT optimality conditions; they indicate that the FB merit function is not the best choice. For the sparse linear SOCPs, the merit functions associated with τ ∈ (2, 3] work as well as, or even better than, the FB merit function, whereas for the dense convex SOCPs, the merit functions with τ ∈ [0.1, 1.5] have better numerical performance.

Key words. Second-order cone complementarity problem, merit function, smoothness, Jordan product.

AMS subject classifications. 26B05, 26B35, 90C33, 65K05

¹ Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan.


1 Introduction

We consider the conic complementarity problem of finding a vector ζ ∈ IRⁿ such that

$$F(\zeta) \in \mathcal{K}, \quad G(\zeta) \in \mathcal{K}, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1}$$

where ⟨·, ·⟩ is the Euclidean inner product, F : IRⁿ → IRⁿ and G : IRⁿ → IRⁿ are smooth mappings, and K is the Cartesian product of second-order cones (SOCs). In other words,

$$\mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_N}, \tag{2}$$

where N, n₁, . . . , n_N ≥ 1, n₁ + · · · + n_N = n, and

$$\mathcal{K}^{n_i} := \left\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \;\middle|\; \|x_2\| \le x_1 \right\}, \tag{3}$$

with ‖ · ‖ denoting the Euclidean norm and K¹ denoting the set of nonnegative reals IR₊. A special case of (2) is K = IRⁿ₊, the nonnegative orthant in IRⁿ, which corresponds to N = n and n₁ = · · · = n_N = 1. We will refer to (1)-(2) as the second-order cone complementarity problem (SOCCP). For convenience, in the sequel, we will focus on K = Kⁿ. All of the analysis carries over to the general case where K has the direct product structure (2).

An important special case of the SOCCP corresponds to G(ζ) = ζ for all ζ ∈ IRⁿ. Then (1) reduces to

$$\langle F(\zeta), \zeta \rangle = 0, \quad F(\zeta) \in \mathcal{K}, \quad \zeta \in \mathcal{K}, \tag{4}$$

which is a natural extension of the nonlinear complementarity problem (NCP) [9, 11] with K = IRⁿ₊. Another important special case corresponds to the Karush-Kuhn-Tucker (KKT) conditions for the convex second-order cone program (CSOCP):

$$\begin{array}{ll} \text{minimize} & g(x) \\ \text{subject to} & Ax = b, \quad x \in \mathcal{K}, \end{array} \tag{5}$$

where g : IRⁿ → IR is a convex, twice continuously differentiable function, A ∈ IR^{m×n} has full row rank, and b ∈ IR^m. When g is linear, this reduces to the linear SOCP, which arises in numerous applications in engineering design, finance, and robust optimization, and includes as special cases convex quadratically constrained quadratic programs and linear programs; see [1, 19] and references therein.

There have been various methods proposed for solving SOCPs and SOCCPs. They include interior-point methods [2, 3, 19, 21, 22, 27], non-interior smoothing Newton methods [8, 14, 15], and smoothing–regularization methods [16]. Recently, there was an alternative approach [6] based on reformulating the SOCCP as an unconstrained minimization problem.

That approach seeks a smooth function ψ : IRⁿ × IRⁿ → IR₊ such that

$$\psi(x, y) = 0 \iff x \in \mathcal{K}, \; y \in \mathcal{K}, \; \langle x, y \rangle = 0. \tag{6}$$

We call such a ψ a smooth merit function. Consequently, the SOCCP can be expressed as the unconstrained minimization of the smooth function f(ζ) := ψ(F(ζ), G(ζ)).

A popular choice of ψ is the squared norm of the Fischer-Burmeister (FB) function:

$$\psi_{\rm FB}(x, y) = \frac{1}{2} \|\phi_{\rm FB}(x, y)\|^2, \tag{7}$$

where φ_FB : IRⁿ × IRⁿ → IRⁿ is the well-known FB function [12, 13] given by

$$\phi_{\rm FB}(x, y) = (x^2 + y^2)^{1/2} - x - y, \tag{8}$$

with x² meaning x ◦ x and x + y meaning the usual componentwise addition of vectors.

More specifically, for any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IRⁿ⁻¹, their Jordan product x ◦ y associated with Kⁿ is defined as

$$x \circ y := \left( \langle x, y \rangle, \; y_1 x_2 + x_1 y_2 \right). \tag{9}$$

The Jordan product ◦, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of the SOCCP. The identity element under this product is e := (1, 0, . . . , 0)^T ∈ IRⁿ. The function ψ_FB was studied in [6] and, in particular, shown to be a smooth merit function for the SOCCP. Another popular choice of ψ is

$$\psi_{\rm NR}(x, y) := \frac{1}{2} \|\phi_{\rm NR}(x, y)\|^2, \tag{10}$$

which is induced by the natural residual function φ_NR : IRⁿ × IRⁿ → IRⁿ given by

$$\phi_{\rm NR}(x, y) := x - [x - y]_+, \tag{11}$$

where [ · ]₊ means the projection in the Euclidean norm onto Kⁿ. The function φ_NR was studied in [14, 16], where it is used in smoothing methods for the SOCCP. Compared with the FB merit function ψ_FB, the function ψ_NR has a drawback: it is not differentiable.
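The natural residual in (11) can be evaluated numerically with a few lines of code. The following is a minimal sketch in plain Python, not code from the paper: the helper names `spectral`, `project` and `phi_nr` are hypothetical, vectors are plain lists with n ≥ 2, and the projection is computed with the standard spectral formula for [ · ]₊ (an assumption here, since this section does not state it).

```python
import math

def spectral(x):
    """Spectral values and vectors of x = (x1, x2) w.r.t. K^n (assumes n >= 2)."""
    x1, x2 = x[0], x[1:]
    norm_x2 = math.sqrt(sum(v * v for v in x2))
    if norm_x2 > 0.0:
        xbar = [v / norm_x2 for v in x2]
    else:
        xbar = [1.0] + [0.0] * (len(x2) - 1)  # any unit vector works here
    lam = (x1 - norm_x2, x1 + norm_x2)
    u1 = [0.5] + [-0.5 * v for v in xbar]
    u2 = [0.5] + [0.5 * v for v in xbar]
    return lam, (u1, u2)

def project(x):
    """Euclidean projection [x]_+ onto K^n: keep only nonnegative spectral values."""
    (l1, l2), (u1, u2) = spectral(x)
    return [max(l1, 0.0) * a + max(l2, 0.0) * b for a, b in zip(u1, u2)]

def phi_nr(x, y):
    """Natural residual phi_NR(x, y) = x - [x - y]_+ from (11)."""
    p = project([a - b for a, b in zip(x, y)])
    return [a - b for a, b in zip(x, p)]

# x and y lie on the boundary of K^3 and are orthogonal: a complementary pair.
res_comp = phi_nr([1.0, 1.0, 0.0], [1.0, -1.0, 0.0])
# x = y = e is feasible but not complementary, since <x, y> = 1.
res_not = phi_nr([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
print(res_comp, res_not)
```

For the complementary pair the residual vanishes, while for the feasible but non-complementary pair it does not, which is the behavior a complementarity function should have.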

In this paper, we investigate the following one-parametric class of merit functions

$$\psi_\tau(x, y) := \frac{1}{2} \|\phi_\tau(x, y)\|^2, \tag{12}$$

where φ_τ : IRⁿ × IRⁿ → IRⁿ is a family of functions defined by

$$\phi_\tau(x, y) := \left[ (x - y)^2 + \tau (x \circ y) \right]^{1/2} - (x + y), \tag{13}$$

with τ being a fixed parameter such that τ ∈ (0, 4). Notice that, for any x, y ∈ IRⁿ,

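$$\begin{aligned} (x - y)^2 + \tau (x \circ y) &= \left( x + \frac{\tau - 2}{2} y \right)^2 + \frac{\tau (4 - \tau)}{4} y^2 \\ &= \left( y + \frac{\tau - 2}{2} x \right)^2 + \frac{\tau (4 - \tau)}{4} x^2 \in \mathcal{K}^n, \end{aligned} \tag{14}$$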

and hence φ_τ and ψ_τ are well-defined. We will prove that ψ_τ is a smooth merit function for the SOCCP with computable gradient formulas (see Propositions 3.1–3.3). In other words, the SOCCP can be expressed as an unconstrained smooth minimization problem:

$$\min_{\zeta \in \mathbb{R}^n} f_\tau(\zeta) := \psi_\tau(F(\zeta), G(\zeta)). \tag{15}$$

Also, we will show that every stationary point of f_τ solves the SOCCP if ∇F and −∇G are column monotone (see Proposition 4.2). Observe that φ_τ reduces to φ_FB when τ = 2, whereas its limit as τ → 0 becomes a multiple of φ_NR. In this sense, the class covers two of the most important merit functions for SOCCPs, which makes a closer study worthwhile. Indeed, if τ = 0, then φ_τ is exactly a multiple of φ_NR, whose squared norm is not differentiable everywhere. Since we require ‖φ_τ‖² to be smooth, we exclude τ = 0 and restrict τ to the open interval (0, 4). This study is motivated by the work [18], where φ_τ was used to develop a nonsmooth Newton method for the NCP. This paper is mainly concerned with the merit function approach based on the unconstrained smooth minimization problem (15). Numerical results are also reported for some convex SOCPs, which indicate that the merit function ψ_τ can be an alternative to the FB merit function if a suitable τ is selected.
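The identity (14) and the two limiting cases of φ_τ can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, vectors are plain Python lists with n ≥ 2, and the square root and projection are evaluated through the spectral values of the argument. The factor −2 in the τ → 0 comparison is worked out here from |u| = 2[u]₊ − u; the text only states that the limit is "a multiple" of φ_NR.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def jordan(x, y):
    """Jordan product (9): x o y = (<x, y>, y1*x2 + x1*y2)."""
    return [dot(x, y)] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def spectral_map(x, f):
    """Apply a scalar function f to the spectral values of x (assumes n >= 2)."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(dot(x2, x2))
    xbar = [v / nrm for v in x2] if nrm > 0.0 else [1.0] + [0.0] * (len(x2) - 1)
    f1, f2 = f(x1 - nrm), f(x1 + nrm)
    return [0.5 * (f1 + f2)] + [0.5 * (f2 - f1) * v for v in xbar]

def soc_sqrt(w):
    # Clamp at zero so rounding on the boundary of K^n cannot produce sqrt(<0).
    return spectral_map(w, lambda t: math.sqrt(max(t, 0.0)))

def w_tau(x, y, tau):
    d = [a - b for a, b in zip(x, y)]
    return [p + tau * q for p, q in zip(jordan(d, d), jordan(x, y))]

def phi_tau(x, y, tau):
    """phi_tau from (13)."""
    return [zi - a - b for zi, a, b in zip(soc_sqrt(w_tau(x, y, tau)), x, y)]

def phi_fb(x, y):
    s = [p + q for p, q in zip(jordan(x, x), jordan(y, y))]
    return [zi - a - b for zi, a, b in zip(soc_sqrt(s), x, y)]

def phi_nr(x, y):
    p = spectral_map([a - b for a, b in zip(x, y)], lambda t: max(t, 0.0))
    return [a - b for a, b in zip(x, p)]

def max_gap(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

x, y, tau = [0.3, -1.2, 0.5], [-0.7, 0.4, 2.0], 1.0

# Identity (14): w = (x + (tau-2)/2 y)^2 + tau(4-tau)/4 y^2.
s = [a + 0.5 * (tau - 2.0) * b for a, b in zip(x, y)]
rhs14 = [p + 0.25 * tau * (4.0 - tau) * q
         for p, q in zip(jordan(s, s), jordan(y, y))]
gap14 = max_gap(w_tau(x, y, tau), rhs14)

gap_fb = max_gap(phi_tau(x, y, 2.0), phi_fb(x, y))                        # tau = 2
gap_nr = max_gap(phi_tau(x, y, 1e-8), [-2.0 * v for v in phi_nr(x, y)])   # tau near 0
print(gap14, gap_fb, gap_nr)
```

All three gaps should be at the level of rounding error, except the last one, which is expected to be of the order of the small value of τ used.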

Throughout this paper, IRⁿ denotes the space of n-dimensional real column vectors, IR^{n₁} × · · · × IR^{n_m} is identified with IR^{n₁+···+n_m}, and int(Kⁿ) denotes the interior of Kⁿ. For any x, y in IRⁿ, we write x ⪰_{Kⁿ} y if x − y ∈ Kⁿ, and x ≻_{Kⁿ} y if x − y ∈ int(Kⁿ).

For any differentiable mapping F : IRⁿ → IR^m, ∇F(x) ∈ IR^{n×m} denotes the transposed Jacobian of F at x. For a symmetric matrix A, we write A ⪰ O (respectively, A ≻ O) to mean A is positive semidefinite (respectively, positive definite). For nonnegative α and β, we write α = O(β) to mean α ≤ Cβ, with C > 0 independent of α and β.

2 Preliminaries

For any x = (x₁, x₂) ∈ IR × IRⁿ⁻¹, we define its determinant and trace as follows:

$$\det(x) := x_1^2 - \|x_2\|^2, \qquad \operatorname{tr}(x) := 2 x_1.$$

A vector x = (x₁, x₂) ∈ IR × IRⁿ⁻¹ is said to be invertible if det(x) ≠ 0. If x is invertible, then there exists a unique y = (y₁, y₂) ∈ IR × IRⁿ⁻¹ satisfying x ◦ y = y ◦ x = e. We call this y the inverse of x and denote it by x⁻¹. For any x = (x₁, x₂) ∈ IR × IRⁿ⁻¹, let

$$L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix}, \tag{16}$$

which can be regarded as a linear mapping from IRⁿ to IRⁿ. It is easily verified that L_x y = x ◦ y and L_{x+y} = L_x + L_y for any x, y ∈ IRⁿ, but generally L_x² = L_x L_x ≠ L_{x²} and L_x⁻¹ ≠ L_{x⁻¹}.
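The properties of the Jordan product and of the matrix L_x stated so far are easy to confirm on a small example. The following sketch uses plain Python lists; the helper names `jordan`, `arrow`, `matvec` and `matmul` are hypothetical and chosen only for this illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def jordan(x, y):
    """Jordan product (9): x o y = (<x, y>, y1*x2 + x1*y2)."""
    return [dot(x, y)] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def arrow(x):
    """Matrix L_x from (16), stored as a list of rows."""
    n = len(x)
    rows = [list(x)]
    for i in range(1, n):
        row = [0.0] * n
        row[0], row[i] = x[i], x[0]
        rows.append(row)
    return rows

def matvec(A, v):
    return [dot(row, v) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

x, y, z = [1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [2.0, 0.0, 3.0]
e = [1.0, 0.0, 0.0]

Lx_y = matvec(arrow(x), y)            # L_x y, should equal x o y
left = jordan(jordan(x, y), z)        # (x o y) o z
right = jordan(x, jordan(y, z))       # x o (y o z)
Lx_sq = matmul(arrow(x), arrow(x))    # L_x L_x
L_xsq = arrow(jordan(x, x))           # L_{x^2}
print(Lx_y, left, right, Lx_sq[2][2], L_xsq[2][2])
```

On this example the two bracketings of the triple product differ, and L_x L_x differs from L_{x²} in the last diagonal entry, illustrating the lack of associativity noted above.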

We next recall from [14] that each x = (x₁, x₂) ∈ IR × IRⁿ⁻¹ admits a spectral factorization, associated with Kⁿ, of the form

$$x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)},$$

where λᵢ(x) and u_x⁽ⁱ⁾ are the spectral values and the associated spectral vectors of x:

$$\lambda_i(x) = x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} = \frac{1}{2} \left( 1, \; (-1)^i \bar{x}_2 \right) \quad \text{for } i = 1, 2,$$

with x̄₂ = x₂/‖x₂‖ if x₂ ≠ 0, and otherwise x̄₂ being any vector in IRⁿ⁻¹ such that ‖x̄₂‖ = 1. If x₂ ≠ 0, the factorization is unique. The spectral factorization of x and the matrix L_x have various interesting properties, and we list several that will be used later.

Property 2.1 For any x = (x₁, x₂) ∈ IR × IRⁿ⁻¹, the following results always hold:

(a) $x^2 = \lambda_1^2(x)\, u_x^{(1)} + \lambda_2^2(x)\, u_x^{(2)} \in \mathcal{K}^n$, and $x^{1/2} = \sqrt{\lambda_1(x)}\, u_x^{(1)} + \sqrt{\lambda_2(x)}\, u_x^{(2)} \in \mathcal{K}^n$ if x ∈ Kⁿ;

(b) x ⪰_{Kⁿ} 0 ⟺ λ₁(x) ≥ 0 ⟺ L_x ⪰ O;

(c) x ≻_{Kⁿ} 0 ⟺ λ₁(x) > 0 ⟺ L_x ≻ O, and the inverse of L_x is given by

$$L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}. \tag{17}$$

3 Smoothness of Merit Functions

In this section we show that ψ_τ defined by (12) is a smooth merit function for the SOCCP.

First, we show that ψ_τ is a merit function, which follows directly from the following proposition.

Proposition 3.1 The function φ_τ given by (13) is a complementarity function of the SOCCP, i.e.,

$$\phi_\tau(x, y) = 0 \iff x \in \mathcal{K}^n, \; y \in \mathcal{K}^n, \; \langle x, y \rangle = 0.$$

Proof. "⇐". Since the conditions x ∈ Kⁿ, y ∈ Kⁿ and ⟨x, y⟩ = 0 imply x ◦ y = 0, substituting this into the expression of φ_τ(x, y) yields

$$\phi_\tau(x, y) = (x^2 + y^2)^{1/2} - (x + y) = \phi_{\rm FB}(x, y).$$

Applying Proposition 2.1 of [14], we readily obtain φ_τ(x, y) = 0.

"⇒". Suppose that φ_τ(x, y) = 0. Then x + y = [(x − y)² + τ(x ◦ y)]^{1/2}. Squaring both sides gives (x + y)² = (x − y)² + τ(x ◦ y), that is, (τ − 4)(x ◦ y) = 0. Since τ − 4 ≠ 0, we have x ◦ y = 0. This means that

$$x + y = \left[ (x - y)^2 + \tau (x \circ y) \right]^{1/2} = (x^2 + y^2)^{1/2},$$

i.e., φ_FB(x, y) = 0. Consequently, the desired result follows from Proposition 2.1 of [14]. □

Now we introduce some notation that will be used in the rest of this paper. For any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IRⁿ⁻¹ and τ ∈ (0, 4), let

$$\begin{aligned} w &= (w_1, w_2) = w(x, y) := (x - y)^2 + \tau (x \circ y), \\ z &= (z_1, z_2) = z(x, y) := \left[ (x - y)^2 + \tau (x \circ y) \right]^{1/2}. \end{aligned} \tag{18}$$

Then w ∈ Kⁿ and z ∈ Kⁿ always hold. Moreover, by the definition of the Jordan product,

$$\begin{aligned} w_1 &= w_1(x, y) = \|x\|^2 + \|y\|^2 + (\tau - 2) x^T y, \\ w_2 &= w_2(x, y) = 2 (x_1 x_2 + y_1 y_2) + (\tau - 2)(x_1 y_2 + y_1 x_2). \end{aligned} \tag{19}$$

Let λ₁(w) and λ₂(w) be the spectral values of w. From Property 2.1 (a),

$$z_1 = z_1(x, y) = \frac{\sqrt{\lambda_1(w)} + \sqrt{\lambda_2(w)}}{2}, \qquad z_2 = z_2(x, y) = \frac{\sqrt{\lambda_2(w)} - \sqrt{\lambda_1(w)}}{2} \, \bar{w}_2, \tag{20}$$

where w̄₂ := w₂/‖w₂‖ if w₂ ≠ 0 and otherwise w̄₂ is any vector in IRⁿ⁻¹ satisfying ‖w̄₂‖ = 1.
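Formulas (19) and (20) give a direct recipe for evaluating w, z and hence φ_τ. The sketch below, with hypothetical helper names and plain Python lists (n ≥ 2 assumed), computes both and checks two facts from the text: z ◦ z = w, and φ_τ vanishes on a complementary pair as stated in Proposition 3.1.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def jordan(x, y):
    return [dot(x, y)] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def w_of(x, y, tau):
    """Components (19) of w = (x - y)^2 + tau (x o y)."""
    w1 = dot(x, x) + dot(y, y) + (tau - 2.0) * dot(x, y)
    w2 = [2.0 * (x[0] * a + y[0] * b) + (tau - 2.0) * (x[0] * b + y[0] * a)
          for a, b in zip(x[1:], y[1:])]
    return [w1] + w2

def z_of(x, y, tau):
    """Components (20) of z = w^{1/2} (assumes n >= 2)."""
    w = w_of(x, y, tau)
    nrm = math.sqrt(dot(w[1:], w[1:]))
    wbar = [v / nrm for v in w[1:]] if nrm > 0.0 else [1.0] + [0.0] * (len(w) - 2)
    r1 = math.sqrt(max(w[0] - nrm, 0.0))   # sqrt(lambda_1(w)), clamped against rounding
    r2 = math.sqrt(max(w[0] + nrm, 0.0))   # sqrt(lambda_2(w))
    return [0.5 * (r1 + r2)] + [0.5 * (r2 - r1) * v for v in wbar]

def phi_tau(x, y, tau):
    return [zi - a - b for zi, a, b in zip(z_of(x, y, tau), x, y)]

tau = 3.0
x, y = [0.3, -1.2, 0.5], [-0.7, 0.4, 2.0]
w, z = w_of(x, y, tau), z_of(x, y, tau)
square_gap = max(abs(a - b) for a, b in zip(jordan(z, z), w))   # z o z vs. w

# Complementary pair on the boundary of K^3: Proposition 3.1 predicts phi_tau = 0.
xc, yc = [1.0, 1.0, 0.0], [1.0, -1.0, 0.0]
residual = max(abs(v) for v in phi_tau(xc, yc, tau))
print(square_gap, residual)
```

The choice τ = 3 is arbitrary; any value in (0, 4) can be substituted.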

In what follows, we concentrate on the proof of the smoothness of ψ_τ. First, we state an important lemma which describes the behavior of x, y when (x − y)²+ τ (x ◦ y) is on the boundary of Kⁿ. In fact, it may be viewed as an extension of [6, Lemma 3.2].

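Lemma 3.1 For any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IRⁿ⁻¹, let w be given as in (18). If (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), then

$$x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2; \tag{21}$$

$$\begin{aligned} x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 &= \|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \\ &= \|x_2\|^2 + \|y_2\|^2 + (\tau - 2) x_2^T y_2. \end{aligned} \tag{22}$$

If, in addition, (x, y) ≠ (0, 0), then w₂ = w₂(x, y) ≠ 0, and furthermore,

$$x_2^T \frac{w_2}{\|w_2\|} = x_1, \quad x_1 \frac{w_2}{\|w_2\|} = x_2, \quad y_2^T \frac{w_2}{\|w_2\|} = y_1, \quad y_1 \frac{w_2}{\|w_2\|} = y_2. \tag{23}$$

Proof. Since (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), using [6, Lemma 3.2] and (14) yields

$$\begin{gathered} \left( x_1 + \frac{\tau - 2}{2} y_1 \right)^2 = \left\| x_2 + \frac{\tau - 2}{2} y_2 \right\|^2, \qquad y_1^2 = \|y_2\|^2, \\ \left( x_1 + \frac{\tau - 2}{2} y_1 \right) y_2 = \left( x_2 + \frac{\tau - 2}{2} y_2 \right) y_1, \qquad \left( x_1 + \frac{\tau - 2}{2} y_1 \right) y_1 = \left( x_2 + \frac{\tau - 2}{2} y_2 \right)^T y_2; \\ \left( y_1 + \frac{\tau - 2}{2} x_1 \right)^2 = \left\| y_2 + \frac{\tau - 2}{2} x_2 \right\|^2, \qquad x_1^2 = \|x_2\|^2, \\ \left( y_1 + \frac{\tau - 2}{2} x_1 \right) x_2 = \left( y_2 + \frac{\tau - 2}{2} x_2 \right) x_1, \qquad \left( y_1 + \frac{\tau - 2}{2} x_1 \right) x_1 = \left( y_2 + \frac{\tau - 2}{2} x_2 \right)^T x_2. \end{gathered}$$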

From the above equalities, we immediately obtain the results in (21). In addition, since w ∉ int(Kⁿ) but w ∈ Kⁿ, we have λ₁(w) = 0, which implies that

$$\|x\|^2 + \|y\|^2 + (\tau - 2) x^T y = \|2 x_1 x_2 + 2 y_1 y_2 + (\tau - 2)(x_1 y_2 + y_1 x_2)\|.$$

Applying the relations in (21) then gives the equalities in (22). If, in addition, (x, y) ≠ (0, 0), then it is clear that ‖x₁x₂ + y₁y₂ + (τ − 2)x₁y₂‖ = x₁² + y₁² + (τ − 2)x₁y₁ ≠ 0, and hence, using x₁y₂ = y₁x₂, w₂ = 2(x₁x₂ + y₁y₂ + (τ − 2)x₁y₂) ≠ 0. To prove the equalities in (23), we only need to verify that x₂ᵀ w₂/‖w₂‖ = x₁ and x₁ w₂/‖w₂‖ = x₂, in view of the symmetry of x and y in w. The verifications are straightforward due to x₁y₂ = y₁x₂ and equation (22). □

By Lemma 3.1, when w(x, y) = (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), the spectral values of w can be further simplified. Clearly, λ₁(w) = 0 and λ₂(w) can be rewritten as

$$\begin{aligned} \lambda_2(w) &= 2 \left( x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 \right) + 2 \|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \\ &= 4 \left( x_1^2 + y_1^2 + (\tau - 2) x_1 y_1 \right). \end{aligned} \tag{24}$$

Therefore, if (x, y) ≠ (0, 0) also holds, using equations (20), (22) and (24) yields

$$z_1(x, y) = \sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}, \qquad z_2(x, y) = \frac{x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}}.$$

Thus, if (x, y) ≠ (0, 0) and (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), the function φ_τ can be rewritten as

$$\phi_\tau(x, y) = z(x, y) - (x + y) = \begin{bmatrix} \sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1} - (x_1 + y_1) \\[2mm] \dfrac{x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} - (x_2 + y_2) \end{bmatrix}. \tag{25}$$

This specific expression will be employed in the proof of the main result below, Proposition 3.2.
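As a concrete check of Lemma 3.1 and of the closed form (25), consider two vectors on the same boundary ray of K³, for which w lies on the boundary of the cone. The sketch below uses hypothetical helper names and plain Python lists; it compares the general evaluation through (19)-(20) with the boundary formula (25).

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def phi_tau_general(x, y, tau):
    """phi_tau evaluated through (19)-(20); assumes n >= 2."""
    w1 = dot(x, x) + dot(y, y) + (tau - 2.0) * dot(x, y)
    w2 = [2.0 * (x[0] * a + y[0] * b) + (tau - 2.0) * (x[0] * b + y[0] * a)
          for a, b in zip(x[1:], y[1:])]
    nrm = math.sqrt(dot(w2, w2))
    wbar = [v / nrm for v in w2] if nrm > 0.0 else [1.0] + [0.0] * (len(w2) - 1)
    r1 = math.sqrt(max(w1 - nrm, 0.0))
    r2 = math.sqrt(max(w1 + nrm, 0.0))
    z = [0.5 * (r1 + r2)] + [0.5 * (r2 - r1) * v for v in wbar]
    return [zi - a - b for zi, a, b in zip(z, x, y)]

def phi_tau_boundary(x, y, tau):
    """Closed form (25); valid when w is not in int(K^n) and (x, y) != (0, 0)."""
    x1, y1 = x[0], y[0]
    root = math.sqrt(x1 * x1 + y1 * y1 + (tau - 2.0) * x1 * y1)
    head = root - (x1 + y1)
    tail = [(x1 * a + y1 * b + (tau - 2.0) * x1 * b) / root - (a + b)
            for a, b in zip(x[1:], y[1:])]
    return [head] + tail

tau = 1.0
x, y = [1.0, 1.0, 0.0], [2.0, 2.0, 0.0]   # both on the same boundary ray of K^3

# Relations (21) of Lemma 3.1 (exact in floating point for these small integers).
rel21 = (x[0] ** 2 == dot(x[1:], x[1:])
         and y[0] ** 2 == dot(y[1:], y[1:])
         and x[0] * y[0] == dot(x[1:], y[1:])
         and [x[0] * b for b in y[1:]] == [y[0] * a for a in x[1:]])

gap25 = max(abs(p - q) for p, q in zip(phi_tau_general(x, y, tau),
                                        phi_tau_boundary(x, y, tau)))
print(rel21, gap25)
```

Note that this pair is not complementary, so φ_τ is nonzero here; the lemma only requires w to lie outside the interior of Kⁿ.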

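Proposition 3.2 The function ψ_τ given by (12) is differentiable at every (x, y) ∈ IRⁿ × IRⁿ. Moreover, ∇_xψ_τ(0, 0) = ∇_yψ_τ(0, 0) = 0; and if (x − y)² + τ(x ◦ y) ∈ int(Kⁿ), then

$$\begin{aligned} \nabla_x \psi_\tau(x, y) &= \left[ L_{x + \frac{\tau - 2}{2} y} L_z^{-1} - I \right] \phi_\tau(x, y), \\ \nabla_y \psi_\tau(x, y) &= \left[ L_{y + \frac{\tau - 2}{2} x} L_z^{-1} - I \right] \phi_\tau(x, y). \end{aligned} \tag{26}$$

If (x, y) ≠ (0, 0) and (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), then x₁² + y₁² + (τ − 2)x₁y₁ ≠ 0 and

$$\begin{aligned} \nabla_x \psi_\tau(x, y) &= \left[ \frac{x_1 + \frac{\tau - 2}{2} y_1}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} - 1 \right] \phi_\tau(x, y), \\ \nabla_y \psi_\tau(x, y) &= \left[ \frac{y_1 + \frac{\tau - 2}{2} x_1}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} - 1 \right] \phi_\tau(x, y). \end{aligned} \tag{27}$$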

Proof. Case (1): (x, y) = (0, 0). For any h, k ∈ IRⁿ, let µ₁ ≤ µ₂ be the spectral values of (h − k)² + τ(h ◦ k) and v⁽¹⁾, v⁽²⁾ be the corresponding spectral vectors. Then,

$$\begin{aligned} \psi_\tau(h, k) - \psi_\tau(0, 0) &= \frac{1}{2} \left\| \left[ h^2 + k^2 + (\tau - 2)(h \circ k) \right]^{1/2} - h - k \right\|^2 \\ &= \frac{1}{2} \left\| \sqrt{\mu_1}\, v^{(1)} + \sqrt{\mu_2}\, v^{(2)} - h - k \right\|^2 \\ &\le \frac{1}{2} \left[ \sqrt{2 \mu_2} + \|h\| + \|k\| \right]^2. \end{aligned}$$

In addition, by the definition of the spectral value µ₂, it is easy to verify that

$$\mu_2 \le 2 \|h\|^2 + 2 \|k\|^2 + 3 |\tau - 2| \, \|h\| \, \|k\| \le 5 \left( \|h\|^2 + \|k\|^2 \right).$$

Combining the last two relations then yields ψ_τ(h, k) − ψ_τ(0, 0) = O(‖h‖² + ‖k‖²). This shows that ψ_τ is differentiable at (0, 0) with ∇_xψ_τ(0, 0) = ∇_yψ_τ(0, 0) = 0.

Case (2): (x − y)² + τ(x ◦ y) ∈ int(Kⁿ). By [7, Proposition 5] or [14, Proposition 5.2], we know that z(x, y) defined by (20) is continuously differentiable at such (x, y), and consequently, φ_τ(x, y) = z(x, y) − (x + y) is also continuously differentiable at such (x, y). Notice that

$$z^2(x, y) = \left( x + \frac{\tau - 2}{2} y \right)^2 + \frac{\tau (4 - \tau)}{4} y^2.$$

Differentiating both sides with respect to x, it follows that

$$\nabla_x z(x, y) \, L_z = L_{x + \frac{\tau - 2}{2} y}.$$

Using Property 2.1 (c) and noting that ∇_xφ_τ(x, y) = ∇_x z(x, y) − I, we have

$$\nabla_x \phi_\tau(x, y) = L_{x + \frac{\tau - 2}{2} y} L_z^{-1} - I,$$

which, together with ∇_xψ_τ(x, y) = ∇_xφ_τ(x, y) φ_τ(x, y), yields the first formula in (26). By the symmetry of x and y in ψ_τ, the second formula in (26) also holds.

Case (3): (x, y) ≠ (0, 0) and (x − y)² + τ(x ◦ y) ∉ int(Kⁿ). For any (x′, y′) ∈ IRⁿ × IRⁿ,

$$\begin{aligned} 2 \psi_\tau(x', y') &= \left\| \left[ (x')^2 + (y')^2 + (\tau - 2)(x' \circ y') \right]^{1/2} \right\|^2 + \|x' + y'\|^2 - 2 \left\langle \left[ (x')^2 + (y')^2 + (\tau - 2)(x' \circ y') \right]^{1/2}, \, x' + y' \right\rangle \\ &= \|x'\|^2 + \|y'\|^2 + (\tau - 2) \langle x', y' \rangle + \|x' + y'\|^2 - 2 \left\langle \left[ (x')^2 + (y')^2 + (\tau - 2)(x' \circ y') \right]^{1/2}, \, x' + y' \right\rangle, \end{aligned}$$

where the second equality uses the observation that ‖z‖² = ⟨z², e⟩ for any z ∈ IRⁿ. Since ‖x′‖² + ‖y′‖² + (τ − 2)⟨x′, y′⟩ + ‖x′ + y′‖² is clearly differentiable in (x′, y′), it suffices to show that ⟨[(x′)² + (y′)² + (τ − 2)(x′ ◦ y′)]^{1/2}, x′ + y′⟩ is differentiable at (x′, y′) = (x, y). By Lemma 3.1, w₂(x, y) = 2(x₁x₂ + y₁y₂) + 2(τ − 2)x₁y₂ ≠ 0, which implies

$$w_2(x', y') = 2 x'_1 x'_2 + 2 y'_1 y'_2 + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \ne 0$$

for all (x′, y′) ∈ IRⁿ × IRⁿ sufficiently near to (x, y). Let µ₁, µ₂ be the spectral values of (x′)² + (y′)² + (τ − 2)(x′ ◦ y′). Then we can compute that

$$\begin{aligned} 2 \left\langle \left[ (x')^2 + (y')^2 + (\tau - 2)(x' \circ y') \right]^{1/2}, \, x' + y' \right\rangle = {} & \sqrt{\mu_2} \left[ x'_1 + y'_1 + \frac{\left[ 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right]^T (x'_2 + y'_2)}{\left\| 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right\|} \right] \\ & + \sqrt{\mu_1} \left[ x'_1 + y'_1 - \frac{\left[ 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right]^T (x'_2 + y'_2)}{\left\| 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right\|} \right]. \end{aligned} \tag{28}$$

Since λ₂(w) > 0 and w₂(x, y) ≠ 0, the first term on the right-hand side of (28) is differentiable at (x′, y′) = (x, y). We claim that the second term is o(‖h‖ + ‖k‖) with h := x′ − x, k := y′ − y, i.e., it is differentiable at (x, y) with zero gradient. To see this, note that w₂(x, y) ≠ 0, and hence

$$\mu_1 = \|x'\|^2 + \|y'\|^2 + (\tau - 2) \langle x', y' \rangle - \left\| 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right\|,$$

viewed as a function of (x′, y′), is differentiable at (x′, y′) = (x, y). Moreover, µ₁ = λ₁(w) = 0 when (x′, y′) = (x, y). Thus, the first-order Taylor expansion of µ₁ at (x, y) yields

$$\mu_1 = O(\|x' - x\| + \|y' - y\|) = O(\|h\| + \|k\|).$$

Also, since w₂(x, y) ≠ 0, by the product and quotient rules for differentiation, the function

$$x'_1 + y'_1 - \frac{\left[ 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right]^T (x'_2 + y'_2)}{\left\| 2 (x'_1 x'_2 + y'_1 y'_2) + (\tau - 2)(x'_1 y'_2 + y'_1 x'_2) \right\|} \tag{29}$$

is also differentiable at (x′, y′) = (x, y), and it has value 0 at (x′, y′) = (x, y) because

$$x_1 + y_1 - \frac{\left[ x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2 \right]^T (x_2 + y_2)}{\left\| x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2 \right\|} = x_1 - x_2^T \frac{w_2}{\|w_2\|} + y_1 - y_2^T \frac{w_2}{\|w_2\|} = 0$$

by Lemma 3.1. Thus, the function (29) is O(‖h‖ + ‖k‖) in magnitude, which together with µ₁ = O(‖h‖ + ‖k‖) shows that the second term on the right-hand side of (28) is

$$O\left( (\|h\| + \|k\|)^{3/2} \right) = o(\|h\| + \|k\|).$$

Thus, we have shown that ψ_τ is differentiable at (x, y). Moreover, since the inner-product term enters 2ψ_τ with a minus sign, 2∇ψ_τ(x, y) is the gradient of ‖x′‖² + ‖y′‖² + (τ − 2)⟨x′, y′⟩ + ‖x′ + y′‖² minus the gradient of the first term on the right-hand side of (28), both evaluated at (x′, y′) = (x, y).

The gradient of ‖x′‖² + ‖y′‖² + (τ − 2)⟨x′, y′⟩ + ‖x′ + y′‖² with respect to x′, evaluated at (x′, y′) = (x, y), is 2x + (τ − 2)y + 2(x + y). The derivative of the first term on the right-hand side of (28) with respect to x′₁, evaluated at (x′, y′) = (x, y), works out to be

$$\begin{aligned} & \frac{1}{\sqrt{\lambda_2(w)}} \left[ \left( x_1 + \frac{\tau - 2}{2} y_1 \right) + \left( x_2 + \frac{\tau - 2}{2} y_2 \right)^T \frac{w_2}{\|w_2\|} \right] \left( x_1 + y_1 + (x_2 + y_2)^T \frac{w_2}{\|w_2\|} \right) \\ & \quad + \sqrt{\lambda_2(w)} \left[ 1 + \frac{\left( x_2 + \frac{\tau - 2}{2} y_2 \right)^T (x_2 + y_2)}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\|} - \frac{w_2^T (x_2 + y_2) \cdot w_2^T \left( x_2 + \frac{\tau - 2}{2} y_2 \right)}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \cdot \|w_2\|^2} \right] \\ & = \frac{2 \left( x_1 + \frac{\tau - 2}{2} y_1 \right) (x_1 + y_1)}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} + 2 \sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}, \end{aligned}$$

where the equality follows from Lemma 3.1. Similarly, the gradient of the first term on the right-hand side of (28) with respect to x′₂, evaluated at (x′, y′) = (x, y), works out to be

$$\begin{aligned} & \frac{1}{\sqrt{\lambda_2(w)}} \left[ \left( x_2 + \frac{\tau - 2}{2} y_2 \right) + \left( x_1 + \frac{\tau - 2}{2} y_1 \right) \frac{w_2}{\|w_2\|} \right] \left( x_1 + y_1 + (x_2 + y_2)^T \frac{w_2}{\|w_2\|} \right) \\ & \quad + \sqrt{\lambda_2(w)} \left[ \frac{(2 x_1 + (\tau - 2) y_1) x_2 + \frac{\tau}{2} (x_1 + y_1) y_2}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\|} - \frac{w_2^T (x_2 + y_2) \cdot \left( x_1 + \frac{\tau - 2}{2} y_1 \right) w_2}{\|x_1 x_2 + y_1 y_2 + (\tau - 2) x_1 y_2\| \cdot \|w_2\|^2} \right] \\ & = \frac{2 \left[ (2 x_1 + (\tau - 2) y_1) x_2 + \frac{\tau}{2} (x_1 + y_1) y_2 \right]}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}}. \end{aligned}$$

Since λ₂(w) = 4(x₁² + y₁² + (τ − 2)x₁y₁), combining the above gradient expressions yields

$$2 \nabla_x \psi_\tau(x, y) = 2x + (\tau - 2) y + 2 (x + y) - \begin{bmatrix} 2 \sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1} \\ 0 \end{bmatrix} - \frac{2}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}} \begin{bmatrix} \left( x_1 + \frac{\tau - 2}{2} y_1 \right) (x_1 + y_1) \\ (2 x_1 + (\tau - 2) y_1) x_2 + \frac{\tau}{2} (x_1 + y_1) y_2 \end{bmatrix}.$$

Using the fact x₁y₂ = y₁x₂ and noting that φ_τ can be simplified to the form (25) in this case, we readily rewrite the above expression for ∇_xψ_τ(x, y) in the form of (27). By symmetry, ∇_yψ_τ(x, y) also has the form given in (27). □
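The interior gradient formula (26) lends itself to a numerical sanity check against finite differences. The following is a minimal sketch, assuming plain Python lists, n ≥ 2, and hypothetical helper names; L_z⁻¹ is applied through formula (17), and the product L_s t is evaluated as the Jordan product s ◦ t. The finite-difference routine is only an independent check and plays no role in the paper.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def jordan(x, y):
    return [dot(x, y)] + [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def z_of(x, y, tau):
    """z = [(x - y)^2 + tau (x o y)]^{1/2} via spectral values (assumes n >= 2)."""
    d = [a - b for a, b in zip(x, y)]
    w = [p + tau * q for p, q in zip(jordan(d, d), jordan(x, y))]
    nrm = math.sqrt(dot(w[1:], w[1:]))
    wbar = [v / nrm for v in w[1:]] if nrm > 0.0 else [1.0] + [0.0] * (len(w) - 2)
    r1 = math.sqrt(max(w[0] - nrm, 0.0))
    r2 = math.sqrt(max(w[0] + nrm, 0.0))
    return [0.5 * (r1 + r2)] + [0.5 * (r2 - r1) * v for v in wbar]

def phi_tau(x, y, tau):
    return [zi - a - b for zi, a, b in zip(z_of(x, y, tau), x, y)]

def psi_tau(x, y, tau):
    p = phi_tau(x, y, tau)
    return 0.5 * dot(p, p)

def apply_arrow_inverse(z, v):
    """Compute L_z^{-1} v with formula (17); requires z in int(K^n)."""
    z1, z2, v1, v2 = z[0], z[1:], v[0], v[1:]
    det = z1 * z1 - dot(z2, z2)
    c = dot(z2, v2)
    head = (z1 * v1 - c) / det
    tail = [(-v1 * a + (det / z1) * b + (c / z1) * a) / det for a, b in zip(z2, v2)]
    return [head] + tail

def grad_x_psi(x, y, tau):
    """First formula in (26): [L_s L_z^{-1} - I] phi_tau, s = x + (tau-2)/2 y."""
    p = phi_tau(x, y, tau)
    s = [a + 0.5 * (tau - 2.0) * b for a, b in zip(x, y)]
    t = apply_arrow_inverse(z_of(x, y, tau), p)
    return [g - q for g, q in zip(jordan(s, t), p)]   # L_s t = s o t

def grad_x_fd(x, y, tau, h=1e-6):
    """Central finite differences of psi_tau in x, used only as a check."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((psi_tau(xp, y, tau) - psi_tau(xm, y, tau)) / (2.0 * h))
    return g

tau = 1.0
x, y = [0.3, -1.2, 0.5], [-0.7, 0.4, 2.0]   # here w lies in int(K^3)
grad_gap = max(abs(a - b)
               for a, b in zip(grad_x_psi(x, y, tau), grad_x_fd(x, y, tau)))
print(grad_gap)
```

The test point was chosen so that w is in the interior of K³; at points where w lies on the boundary, formula (27) applies instead and this routine should not be used.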

Proposition 3.2 gives a formula for computing ∇ψ_τ. A natural question is whether ψ_τ is smooth, that is, whether ∇ψ_τ is continuous. In what follows, we concentrate on this issue, for which two crucial technical lemmas are needed. Their proofs are provided in the appendix.

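Lemma 3.2 For any x = (x₁, x₂), y = (y₁, y₂), let w be given as in (18). If w₂ ≠ 0, then

$$\left[ \left( x_1 + \frac{\tau - 2}{2} y_1 \right) + (-1)^i \left( x_2 + \frac{\tau - 2}{2} y_2 \right)^T \frac{w_2}{\|w_2\|} \right]^2 \le \left\| \left( x_2 + \frac{\tau - 2}{2} y_2 \right) + (-1)^i \left( x_1 + \frac{\tau - 2}{2} y_1 \right) \frac{w_2}{\|w_2\|} \right\|^2 \le \lambda_i(w)$$

for i = 1, 2, and furthermore, these relations also hold when x and y are interchanged.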
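The chain of inequalities in Lemma 3.2 can be spot-checked numerically. The sketch below evaluates the three quantities at a few hand-picked points and parameter values; the function name `lemma32_chain`, the test points, and the rounding tolerance are all choices made for this illustration, and a numerical check of this kind is of course no substitute for the proof in the appendix.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def lemma32_chain(x, y, tau, i):
    """Return (lhs, mid, lambda_i(w)) from Lemma 3.2 for i in {1, 2}; needs w2 != 0."""
    sign = -1.0 if i == 1 else 1.0
    half = 0.5 * (tau - 2.0)
    w1 = dot(x, x) + dot(y, y) + (tau - 2.0) * dot(x, y)
    w2 = [2.0 * (x[0] * p + y[0] * q) + (tau - 2.0) * (x[0] * q + y[0] * p)
          for p, q in zip(x[1:], y[1:])]
    nrm = math.sqrt(dot(w2, w2))
    wbar = [v / nrm for v in w2]
    s1 = x[0] + half * y[0]                              # x1 + (tau-2)/2 y1
    s2 = [p + half * q for p, q in zip(x[1:], y[1:])]    # x2 + (tau-2)/2 y2
    lhs = (s1 + sign * dot(s2, wbar)) ** 2
    mid = sum((p + sign * s1 * q) ** 2 for p, q in zip(s2, wbar))
    lam = w1 + sign * nrm
    return lhs, mid, lam

points = [
    ([0.3, -1.2, 0.5], [-0.7, 0.4, 2.0]),
    ([1.0, 1.0, 0.0], [2.0, 2.0, 0.0]),      # w on the boundary of K^3
    ([-2.0, 0.5, 1.5], [0.1, -0.3, 0.2]),
]
tol = 1e-12   # slack for floating-point rounding
holds = all(lhs <= mid + tol and mid <= lam + tol
            for x, y in points
            for tau in (0.5, 2.0, 3.5)
            for i in (1, 2)
            for lhs, mid, lam in [lemma32_chain(x, y, tau, i)])
print(holds)
```

The second test point has λ₁(w) = 0, so for i = 1 all three quantities collapse to zero there, which shows the bound can be tight.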

Lemma 3.3 Let z = z(x, y) be given as in (18). Then there exists a scalar constant C > 0, independent of (x, y), such that for any (x, y) satisfying (x − y)² + τ(x ◦ y) ∈ int(Kⁿ),

$$\left\| L_{x + \frac{\tau - 2}{2} y} L_z^{-1} \right\|_F \le C, \qquad \left\| L_{y + \frac{\tau - 2}{2} x} L_z^{-1} \right\|_F \le C, \tag{30}$$

where ‖A‖_F denotes the Frobenius norm of the n × n matrix A.

Proposition 3.3 The function ψ_τ defined by (12) is smooth everywhere on IRⁿ × IRⁿ.

Proof. By Proposition 3.2 and the symmetry of x and y in ∇ψ_τ, it suffices to show that ∇_xψ_τ is continuous at every (a, b) ∈ IRⁿ × IRⁿ. If (a − b)² + τ(a ◦ b) ∈ int(Kⁿ), the conclusion has been shown in Proposition 3.2. We next consider the other two cases.

Case (1): (a, b) = (0, 0). By Proposition 3.2, we need to show that ∇_xψ_τ(x, y) → 0 as (x, y) → (0, 0). If (x − y)² + τ(x ◦ y) ∈ int(Kⁿ), then ∇_xψ_τ(x, y) is given by (26), whereas if (x, y) ≠ (0, 0) and (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), then ∇_xψ_τ(x, y) is given by (27). Notice that $L_{x + \frac{\tau - 2}{2} y} L_z^{-1}$ and $\dfrac{x_1 + \frac{\tau - 2}{2} y_1}{\sqrt{x_1^2 + y_1^2 + (\tau - 2) x_1 y_1}}$ are uniformly bounded, with bounds independent of (x, y); for the former this is Lemma 3.3. Using the continuity of φ_τ and φ_τ(0, 0) = 0 then leads to the desired result.

Case (2): (a, b) ≠ (0, 0) and (a − b)² + τ(a ◦ b) ∉ int(Kⁿ). We will show that ∇_xψ_τ(x, y) → ∇_xψ_τ(a, b) by considering two subcases: (i) (x, y) ≠ (0, 0) and (x − y)² + τ(x ◦ y) ∉ int(Kⁿ), and (ii) (x − y)² + τ(x ◦ y) ∈ int(Kⁿ). In subcase (i), ∇_xψ_τ(x, y) is given by (27). Noting that the right-hand side of (27) is continuous at (a, b), the desired result follows.

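Next, we prove that ∇_xψ_τ(x, y) → ∇_xψ_τ(a, b) in subcase (ii). From (26),

$$\begin{aligned} \nabla_x \psi_\tau(x, y) &= L_{x + \frac{\tau - 2}{2} y} L_z^{-1} \left( L_z e - (x + y) \right) - \phi_\tau(x, y) \\ &= \left( x + \frac{\tau - 2}{2} y \right) - L_{x + \frac{\tau - 2}{2} y} L_z^{-1} (x + y) - \phi_\tau(x, y). \end{aligned} \tag{31}$$

On the other hand, since (a, b) ≠ (0, 0) and (a − b)² + τ(a ◦ b) ∉ int(Kⁿ), we have

$$\|a\|^2 + \|b\|^2 + (\tau - 2) a^T b = \left\| 2 (a_1 a_2 + b_1 b_2) + (\tau - 2)(a_1 b_2 + b_1 a_2) \right\| \ne 0, \tag{32}$$

and moreover from (22) in Lemma 3.1, it follows that

$$\begin{aligned} \|a\|^2 + \|b\|^2 + (\tau - 2) a^T b &= 2 \left( a_1^2 + b_1^2 + (\tau - 2) a_1 b_1 \right) \\ &= 2 \left( \|a_2\|^2 + \|b_2\|^2 + (\tau - 2) a_2^T b_2 \right) \\ &= 2 \left\| (a_1 a_2 + b_1 b_2) + (\tau - 2) a_1 b_2 \right\|. \end{aligned} \tag{33}$$

Using the equalities in (33), it is not hard to verify that

$$\frac{a_1 + \frac{\tau - 2}{2} b_1}{\sqrt{a_1^2 + b_1^2 + (\tau - 2) a_1 b_1}} \left( (a - b)^2 + \tau (a \circ b) \right)^{1/2} = a + \frac{\tau - 2}{2} b.$$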