Mathematical Methods of Operations Research, vol. 64, pp. 495-519, 2006

Two classes of merit functions for the second-order cone complementarity problem

Jein-Shan Chen 1 Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

June 2, 2005

(revised December 8, 2005) (second revised March 10, 2006)

Abstract. Recently Tseng [Merit function for semidefinite complementarity, Mathematical Programming, 83, pp. 159-185, 1998] extended a class of merit functions, proposed by Z. Luo and P. Tseng [A new class of merit functions for the nonlinear complementarity problem, in Complementarity and Variational Problems: State of the Art, pp. 204-225, 1997], for the nonlinear complementarity problem (NCP) to the semidefinite complementarity problem (SDCP) and showed several related properties. In this paper, we extend this class of merit functions to the second-order cone complementarity problem (SOCCP) and show properties analogous to those in the NCP and SDCP cases. In addition, we study another class of merit functions which is based on a slight modification of the aforementioned class. Both classes of merit functions provide an error bound for the SOCCP and have bounded level sets.

Key words. Error bound, Jordan product, level set, merit function, second-order cone, spectral factorization.

AMS subject classifications. 26B05, 90C33

1 Introduction

We consider the following conic complementarity problem of finding $x, y \in \mathbb{R}^n$ and $\zeta \in \mathbb{R}^n$ satisfying

\[ \langle x, y\rangle = 0, \quad x \in K, \quad y \in K, \tag{1} \]

1Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342.


\[ x = F(\zeta), \quad y = G(\zeta), \tag{2} \]

where $\langle\cdot,\cdot\rangle$ is the Euclidean inner product, $F : \mathbb{R}^n \to \mathbb{R}^n$ and $G : \mathbb{R}^n \to \mathbb{R}^n$ are smooth (i.e., continuously differentiable) mappings, and $K$ is the Cartesian product of second-order cones (SOC), also called Lorentz cones [8]. In other words,

\[ K = K^{n_1} \times \cdots \times K^{n_N}, \tag{3} \]

where $N, n_1, \ldots, n_N \ge 1$, $n_1 + \cdots + n_N = n$, and

\[ K^{n_i} := \{(x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i-1} \mid \|x_2\| \le x_1\}, \tag{4} \]

with $\|\cdot\|$ denoting the Euclidean norm and $K^1$ denoting the set of nonnegative reals $\mathbb{R}_+$. A special case of (3) is $K = \mathbb{R}^n_+$, the nonnegative orthant in $\mathbb{R}^n$, which corresponds to $N = n$ and $n_1 = \cdots = n_N = 1$. We will refer to (1), (2), (3) as the second-order cone complementarity problem (SOCCP).
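The membership conditions (3) and (4) translate directly into a numerical test. The following Python sketch (NumPy assumed; the names `in_soc` and `in_product_cone` and the tolerance `tol` are illustrative choices, not notation from the paper) checks whether a vector lies in a single cone $K^n$ or in a Cartesian product of such cones.

```python
import numpy as np

def in_soc(x, tol=1e-12):
    """True if x = (x1, x2) satisfies ||x2|| <= x1, i.e. x lies in K^n (up to tol)."""
    x = np.asarray(x, dtype=float)
    # For n = 1 the slice x[1:] is empty and its norm is 0, so K^1 = [0, inf).
    return bool(np.linalg.norm(x[1:]) <= x[0] + tol)

def in_product_cone(x, dims, tol=1e-12):
    """True if x lies in K^{n_1} x ... x K^{n_N}, with dims = (n_1, ..., n_N)."""
    x = np.asarray(x, dtype=float)
    assert sum(dims) == x.size, "block sizes must sum to n"
    offsets = np.cumsum((0,) + tuple(dims))
    return all(in_soc(x[a:b], tol) for a, b in zip(offsets[:-1], offsets[1:]))

inside = in_soc([1.0, 0.5, 0.5])                             # ||x2|| ~ 0.707 <= 1
outside = in_soc([1.0, 1.0, 1.0])                            # ||x2|| ~ 1.414 >  1
orthant = in_product_cone([1.0, 2.0, 3.0], (1, 1, 1))        # K = R^3_+
mixed = in_product_cone([2.0, 1.0, 1.0, 1.0, 0.5], (3, 2))   # K^3 x K^2
```

Taking all block sizes equal to 1 recovers the nonnegative orthant, which is the NCP special case mentioned above.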

An important special case of SOCCP corresponds to $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$. Then (1) and (2) reduce to

\[ \langle F(\zeta), \zeta\rangle = 0, \quad F(\zeta) \in K, \quad \zeta \in K, \tag{5} \]

which is a natural extension of the nonlinear complementarity problem (NCP) where $K = \mathbb{R}^n_+$. Another important special case of SOCCP corresponds to the Karush-Kuhn-Tucker (KKT) optimality conditions for the second-order cone program (SOCP) (see [4] for details):

\[ \begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = b, \quad x \in K, \end{array} \tag{6} \]

where $A \in \mathbb{R}^{m \times n}$ has full row rank, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$.

For simplicity, we will focus on $K = K^n$ throughout the whole paper. All the analysis can be carried over to the general case where $K$ has the direct product structure (3). It is known that $K^n$ is a closed convex cone with interior given by

\[ \mathrm{int}(K^n) = \{(x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \mid \|x_2\| < x_1\}. \]

For any $x, y$ in $\mathbb{R}^n$, we write $x \succeq_{K^n} y$ if $x - y \in K^n$, and write $x \succ_{K^n} y$ if $x - y \in \mathrm{int}(K^n)$. In other words, we have $x \succeq_{K^n} 0$ if and only if $x \in K^n$, and $x \succ_{K^n} 0$ if and only if $x \in \mathrm{int}(K^n)$. The relation $\succeq_{K^n}$ is a partial ordering, i.e., it is anti-symmetric, transitive, and reflexive. Nonetheless, it is not a total ordering in $K^n$.

There have been various methods proposed for solving SOCP and SOCCP. They include interior-point methods [1, 2, 18, 20, 21, 23, 28], non-interior smoothing Newton methods [6, 11, 13], and smoothing–regularization methods [14]. Recently, the author and his co-author studied an alternative approach based on reformulating SOCP and SOCCP as an unconstrained smooth minimization problem [4]. That approach aims to find a smooth function $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ such that

\[ \psi(x, y) = 0 \iff x \in K^n, \quad y \in K^n, \quad \langle x, y\rangle = 0. \tag{7} \]

Then SOCCP can be expressed as an unconstrained smooth (global) minimization problem:

\[ \min_{\zeta \in \mathbb{R}^n} f(\zeta) := \psi(F(\zeta), G(\zeta)). \tag{8} \]

We call such an $f$ a merit function for the SOCCP.

A popular choice of $\psi$ is the squared norm of the Fischer-Burmeister function, i.e., $\psi_{FB} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ associated with the second-order cone given by

\[ \psi_{FB}(x, y) = \frac{1}{2}\|\phi_{FB}(x, y)\|^2, \tag{9} \]

where $\phi_{FB} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the well-known Fischer-Burmeister function [9, 10] defined by

\[ \phi_{FB}(x, y) = (x^2 + y^2)^{1/2} - x - y. \tag{10} \]

More specifically, for any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define their Jordan product associated with $K^n$ as

\[ x \circ y := (\langle x, y\rangle, \; y_1 x_2 + x_1 y_2). \tag{11} \]

The Jordan product $\circ$, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of SOCCP. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. We write $x^2$ to mean $x \circ x$ and write $x + y$ to mean the usual componentwise addition of vectors. It is known that $x^2 \in K^n$ for all $x \in \mathbb{R}^n$. Moreover, if $x \in K^n$, then there exists a unique vector in $K^n$, denoted by $x^{1/2}$, such that $(x^{1/2})^2 = x^{1/2} \circ x^{1/2} = x$. Thus, $\phi_{FB}$ defined as (10) is well-defined for all $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ and maps $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}^n$. It was shown in [11] that $\phi_{FB}(x, y) = 0$ if and only if $(x, y)$ satisfies (1). Therefore, $\psi_{FB}$ defined as (9) induces a merit function for the SOCCP.
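To make the Jordan product (11) concrete, the following is a minimal Python sketch (NumPy assumed; the function name `jordan` and the sample vectors are illustrative choices, not taken from the paper). It checks numerically that $e$ acts as the identity, that the product is commutative but not associative on one example, and that $x \circ x$ lands in $K^n$.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2) associated with K^n."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

e = np.array([1.0, 0.0, 0.0])   # identity element
x = np.array([1.0, 2.0, 0.0])   # note: x itself is NOT in K^3 (||x2|| = 2 > 1)
y = np.array([0.0, 1.0, 1.0])
z = np.array([1.0, 0.0, 3.0])

left = jordan(jordan(x, y), z)   # (x o y) o z
right = jordan(x, jordan(y, z))  # x o (y o z)
x_sq = jordan(x, x)              # x^2 = (||x||^2, 2*x1*x2)

print("(x o y) o z =", left)
print("x o (y o z) =", right)
print("x^2         =", x_sq)
```

On this example the two bracketings give different vectors, while `x_sq` satisfies $\|(x^2)_2\| \le (x^2)_1$ even though $x \notin K^n$.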

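In this paper, we study two classes of merit functions for the SOCCP. The first class is

\[ f_{LT}(\zeta) := \psi_0(\langle F(\zeta), G(\zeta)\rangle) + \psi(F(\zeta), G(\zeta)), \tag{12} \]

where $\psi_0 : \mathbb{R} \to \mathbb{R}_+$ satisfies

\[ \psi_0(t) = 0 \;\; \forall t \le 0 \quad \text{and} \quad \psi_0'(t) > 0 \;\; \forall t > 0, \tag{13} \]

and $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfies

\[ \psi(x, y) = 0, \;\; \langle x, y\rangle \le 0 \iff (x, y) \in K^n \times K^n, \;\; \langle x, y\rangle = 0. \tag{14} \]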


The function $f_{LT}$ was proposed by Z. Luo and P. Tseng for the NCP case in [19] and was extended to the SDCP case by P. Tseng in [27]. We explore the extension to the SOCCP, as will be seen in Sec. 3 and Sec. 4. In addition, we make a slight modification of $f_{LT}$, which forms another class of merit functions:

\[ \widehat{f_{LT}}(\zeta) := \psi_0(F(\zeta) \circ G(\zeta)) + \psi(F(\zeta), G(\zeta)), \tag{15} \]

where $\psi_0 : \mathbb{R}^n \to \mathbb{R}_+$ is given as

\[ \psi_0(w) = \frac{1}{2}\|(w)_+\|^2, \tag{16} \]

and $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfies (14). We notice that this $\psi_0$ possesses the following property:

\[ \psi_0(w) = 0 \iff w \preceq_{K^n} 0, \tag{17} \]

which is a similar feature to (13) in some sense. Examples of $\psi_0$ and $\psi$ will be given in Sec. 3. The second class of merit functions for the SDCP case was recently studied in [12], and a variant of $\widehat{f_{LT}}$ was also studied by the author in [3].

We will show that both $f_{LT}$ and $\widehat{f_{LT}}$ provide a global error bound (Prop. 4.1 and Prop. 4.2) if $F$ and $G$ are jointly strongly monotone; such a bound plays an important role in analyzing the convergence rate of some iterative methods for solving the SOCCP. We will also prove that if $F$ and $G$ are jointly monotone and a strictly feasible solution exists, then both $f_{LT}$ and $\widehat{f_{LT}}$ have bounded level sets (Prop. 4.3 and Prop. 4.4), which ensures that the sequence generated by a descent algorithm has at least one accumulation point. All these properties make it possible to construct a descent algorithm for solving the equivalent unconstrained reformulation of the SOCCP. In contrast, the merit function induced by $\psi_{FB}$ lacks these properties. In addition, we will show that both $f_{LT}$ and $\widehat{f_{LT}}$ are differentiable and their gradients have computable formulas. All the aforementioned features are significant reasons for choosing and studying these new merit functions.

Finally, we point out that SOCCP can be reduced to an SDCP by observing that, for any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we have $x \in K^n$ if and only if

\[ L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix} \]

is positive semidefinite (also see [11, p. 437] and [24]). However, this reduction increases the problem dimension from $n$ to $n(n+1)/2$, and it is not known whether this increase can be mitigated by exploiting the special "arrow" structure of $L_x$.
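The equivalence between $x \in K^n$ and positive semidefiniteness of the arrow matrix $L_x$ can be checked numerically. The sketch below is illustrative only (NumPy assumed; the helper name `arrow` is hypothetical). It also confirms on an example that $L_x y$ reproduces the Jordan product $x \circ y = (\langle x, y\rangle, y_1 x_2 + x_1 y_2)$.

```python
import numpy as np

def arrow(x):
    """Arrow matrix L_x = [[x1, x2^T], [x2, x1*I]] of x = (x1, x2)."""
    x = np.asarray(x, float)
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

x_in = np.array([2.0, 1.0, 1.0])    # ||x2|| = sqrt(2) <= 2, so x_in is in K^3
x_out = np.array([1.0, 2.0, 0.0])   # ||x2|| = 2 > 1, so x_out is not in K^3

# The smallest eigenvalue of L_x equals x1 - ||x2||.
min_eig_in = np.linalg.eigvalsh(arrow(x_in)).min()
min_eig_out = np.linalg.eigvalsh(arrow(x_out)).min()

# L_x y coincides with the Jordan product x o y.
y = np.array([0.5, -1.0, 3.0])
jordan_xy = np.concatenate(([x_in @ y], y[0] * x_in[1:] + x_in[0] * y[1:]))
matvec_xy = arrow(x_in) @ y
```

The eigenvalues of $L_x$ are $x_1 \pm \|x_2\|$ together with $x_1$ (repeated), which is why the sign of the smallest eigenvalue decides cone membership.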

Throughout this paper, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors and $^T$ denotes transpose. For any differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For any differentiable mapping $F = (F_1, \ldots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \; \cdots \; \nabla F_m(x)]$ is an $n \times m$ matrix which denotes the transposed Jacobian of $F$ at $x$. For any symmetric matrices $A, B \in \mathbb{R}^{n \times n}$, we write $A \succeq B$ (respectively, $A \succ B$) to mean $A - B$ is positive semidefinite (respectively, positive definite). For nonnegative scalars $\alpha$ and $\beta$, we write $\alpha = O(\beta)$ to mean $\alpha \le C\beta$, with $C$ independent of $\alpha$ and $\beta$. For any $x \in \mathbb{R}^n$, $(x)_+$ denotes the orthogonal projection of $x$ onto $K^n$, whereas $(x)_-$ denotes the orthogonal projection of $x$ onto $-K^n$. Also, given any closed convex cone $C$, we denote by $C^* := \{y \mid \langle x, y\rangle \ge 0 \;\; \forall x \in C\}$ the dual cone of $C$.

2 Preliminaries

In this section, we review some background materials and preliminary results obtained by the author and his co-author in [4] that will be used later. We begin with the determinant and trace of $x$. For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, its determinant and trace are defined by

\[ \det(x) := x_1^2 - \|x_2\|^2, \qquad \mathrm{tr}(x) := 2x_1. \]

In general, $\det(x \circ y) \ne \det(x)\det(y)$ unless $x_2 = y_2$. Besides, we observe that $\mathrm{tr}(x \circ y) = 2\langle x, y\rangle$. We next recall from [11] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization, associated with $K^n$, of the form

\[ x = \lambda_1 u^{(1)} + \lambda_2 u^{(2)}, \]

where $\lambda_1, \lambda_2$ and $u^{(1)}, u^{(2)}$ are the spectral values and the associated spectral vectors of $x$ given by

\[ \lambda_i = x_1 + (-1)^i \|x_2\|, \qquad u^{(i)} = \begin{cases} \dfrac{1}{2}\left(1, \; (-1)^i \dfrac{x_2}{\|x_2\|}\right) & \text{if } x_2 \ne 0; \\[2mm] \dfrac{1}{2}\left(1, \; (-1)^i w_2\right) & \text{if } x_2 = 0, \end{cases} \]

for $i = 1, 2$, with $w_2$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|w_2\| = 1$. If $x_2 \ne 0$, the factorization is unique.
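The spectral factorization above is easy to compute explicitly. The following Python sketch is a minimal illustration (NumPy assumed; the function name `spectral` and the choice of $w_2$ as the first coordinate vector when $x_2 = 0$ are assumptions made here, and the code presumes $n \ge 2$). It reconstructs $x$ from its spectral values and vectors and checks the trace and determinant identities.

```python
import numpy as np

def spectral(x):
    """Spectral values (lam1, lam2) and spectral vectors u1, u2 of x w.r.t. K^n."""
    x = np.asarray(x, float)
    r = np.linalg.norm(x[1:])
    if r > 0:
        w = x[1:] / r
    else:
        # x2 = 0: any unit vector works; pick the first coordinate vector.
        w = np.zeros(x.size - 1)
        w[0] = 1.0
    lam = np.array([x[0] - r, x[0] + r])
    u1 = 0.5 * np.concatenate(([1.0], -w))
    u2 = 0.5 * np.concatenate(([1.0], w))
    return lam, u1, u2

x = np.array([1.0, 3.0, 4.0])              # ||x2|| = 5
lam, u1, u2 = spectral(x)
x_rebuilt = lam[0] * u1 + lam[1] * u2      # should equal x
trace = lam[0] + lam[1]                    # tr(x) = 2*x1
determinant = lam[0] * lam[1]              # det(x) = x1^2 - ||x2||^2
```

Here the spectral values are $-4$ and $6$, so $x$ lies neither in $K^3$ nor in $-K^3$.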

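The above spectral factorization of $x$, as well as $x^2$ and $x^{1/2}$ and the matrix $L_x$, have various interesting properties; see [11]. We list four properties that we will use in the subsequent sections.

Property 2.1 For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, with spectral values $\lambda_1, \lambda_2$ and spectral vectors $u^{(1)}, u^{(2)}$, the following results hold.

(a) $\mathrm{tr}(x) = \lambda_1 + \lambda_2$ and $\det(x) = \lambda_1 \lambda_2$.

(b) If $x \in K^n$, then $0 \le \lambda_1 \le \lambda_2$ and $x^{1/2} = \sqrt{\lambda_1}\, u^{(1)} + \sqrt{\lambda_2}\, u^{(2)}$.

(c) If $x \in \mathrm{int}(K^n)$, then $0 < \lambda_1 \le \lambda_2$, and $L_x$ is invertible with

\[ L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}. \]

(d) $x \circ y = L_x y$ for all $y \in \mathbb{R}^n$, and $L_x \succ 0$ if and only if $x \in \mathrm{int}(K^n)$.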

In the following, we present some preliminary properties of $\phi_{FB}$ and $\psi_{FB}$ given as (10) and (9), respectively, which are crucial to proving the results in Sec. 3 and Sec. 4. We only indicate their sources and omit the proofs since they can be found in [4] and [11].

Lemma 2.1 ([11, Prop. 2.1]) Let $\phi_{FB} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be given by (10). Then

\[ \begin{aligned} \phi_{FB}(x, y) = 0 &\iff x, y \in K^n, \;\; x \circ y = 0, \\ &\iff x, y \in K^n, \;\; \langle x, y\rangle = 0. \end{aligned} \]
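Lemma 2.1 can be illustrated numerically. The sketch below is a hedged example (NumPy assumed; `jordan`, `soc_sqrt`, `phi_fb` and `psi_fb` are names chosen here, and `soc_sqrt` uses the formula $x^{1/2} = \sqrt{\lambda_1}u^{(1)} + \sqrt{\lambda_2}u^{(2)}$ from Property 2.1(b)). It evaluates $\phi_{FB}$ at one complementary pair on the boundary of $K^3$ and at one pair that is feasible but not complementary.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def soc_sqrt(x):
    """Square root x^{1/2} of x in K^n via its spectral factorization."""
    r = np.linalg.norm(x[1:])
    w = x[1:] / r if r > 0 else np.zeros(x.size - 1)
    s1 = np.sqrt(max(x[0] - r, 0.0))   # clip tiny negative roundoff
    s2 = np.sqrt(x[0] + r)
    return np.concatenate(([0.5 * (s1 + s2)], 0.5 * (s2 - s1) * w))

def phi_fb(x, y):
    """Fischer-Burmeister function (10): (x^2 + y^2)^{1/2} - x - y."""
    return soc_sqrt(jordan(x, x) + jordan(y, y)) - x - y

def psi_fb(x, y):
    """Merit function (9): half the squared norm of phi_fb."""
    return 0.5 * np.linalg.norm(phi_fb(x, y)) ** 2

# Complementary pair: both on the boundary of K^3 with <x, y> = 0.
x = np.array([1.0, 1.0, 0.0])
y = np.array([1.0, -1.0, 0.0])
phi_comp = phi_fb(x, y)

# Both in K^3 but <e, e> = 1 != 0, so phi_fb must be nonzero.
e = np.array([1.0, 0.0, 0.0])
psi_noncomp = psi_fb(e, e)
```

For the second pair, $\phi_{FB}(e, e) = (\sqrt{2} - 2)e$, so $\psi_{FB}(e, e) = \tfrac{1}{2}(\sqrt{2} - 2)^2 > 0$.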

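Lemma 2.2 ([4, Lem. 3.2]) For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ with $x^2 + y^2 \notin \mathrm{int}(K^n)$, we have

\[ x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2. \]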

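Lemma 2.3 ([4, Prop. 3.1, 3.2]) Let $\phi_{FB}$, $\psi_{FB}$ be given as (10) and (9), respectively. Then, $\psi_{FB}$ has the following properties.

(a) $\psi_{FB} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfies (7).

(b) $\psi_{FB}$ is continuously differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. Moreover, $\nabla_x \psi_{FB}(0, 0) = \nabla_y \psi_{FB}(0, 0) = 0$. If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \mathrm{int}(K^n)$, then

\[ \begin{aligned} \nabla_x \psi_{FB}(x, y) &= \left( L_x L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{FB}(x, y), \\ \nabla_y \psi_{FB}(x, y) &= \left( L_y L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{FB}(x, y). \end{aligned} \tag{18} \]

If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \mathrm{int}(K^n)$, then $x_1^2 + y_1^2 \ne 0$ and

\[ \nabla_x \psi_{FB}(x, y) = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y), \tag{19} \]

\[ \nabla_y \psi_{FB}(x, y) = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y). \tag{20} \]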


Lemma 2.4 ([4, Lem. 5.1]) Let $C$ be any closed convex cone in $\mathbb{R}^n$. For each $x \in \mathbb{R}^n$, let $x_C^+$ and $x_C^-$ denote the nearest-point (in the Euclidean norm) projection of $x$ onto $C$ and $-C^*$, respectively. Then, the following results hold.

(a) For any $x \in \mathbb{R}^n$, we have $x = x_C^+ + x_C^-$ and $\|x\|^2 = \|x_C^+\|^2 + \|x_C^-\|^2$.

(b) For any $x \in \mathbb{R}^n$ and $y \in C$, we have $\langle x, y\rangle \le \langle x_C^+, y\rangle$.

(c) If $C$ is self-dual, then for any $x \in \mathbb{R}^n$ and $y \in C$, we have $\|(x + y)_C^+\| \ge \|x_C^+\|$.

Proof. In fact, parts (a) and (b) are classical results of [16]. □
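For the self-dual cone $C = K^n$, the projections in Lemma 2.4 have a closed form through the spectral factorization: $(x)_+ = \max(0, \lambda_1)u^{(1)} + \max(0, \lambda_2)u^{(2)}$. The Python sketch below (NumPy assumed; `proj_soc` is a name chosen here) uses this standard formula to check parts (a), (b) and (c) on one example; it is an illustration, not a proof.

```python
import numpy as np

def proj_soc(x):
    """Orthogonal projection (x)_+ of x onto K^n via the spectral factorization."""
    x = np.asarray(x, float)
    r = np.linalg.norm(x[1:])
    w = x[1:] / r if r > 0 else np.zeros(x.size - 1)
    p1, p2 = max(x[0] - r, 0.0), max(x[0] + r, 0.0)
    return np.concatenate(([0.5 * (p1 + p2)], 0.5 * (p2 - p1) * w))

x = np.array([0.0, 3.0, 4.0])        # spectral values -5 and 5
x_plus = proj_soc(x)                 # projection onto K^3
x_minus = -proj_soc(-x)              # projection onto -K^3
y = np.array([1.0, 0.0, 0.0])        # a point of K^3

# (a) Moreau decomposition: x = x_plus + x_minus with orthogonal parts.
decomposition_gap = np.linalg.norm(x - (x_plus + x_minus))
norm_gap = x @ x - (x_plus @ x_plus + x_minus @ x_minus)

# (b) <x, y> <= <x_plus, y>  and  (c) ||(x + y)_+|| >= ||x_+||.
part_b = x @ y <= x_plus @ y
part_c = np.linalg.norm(proj_soc(x + y)) >= np.linalg.norm(x_plus)
```

On this example $(x)_+ = (2.5, 1.5, 2)$ and $(x)_- = (-2.5, 1.5, 2)$, which are orthogonal and sum to $x$.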

Lemma 2.5 ([4, Lem. 5.2]) Let $\phi_{FB}$, $\psi_{FB}$ be given by (10) and (9), respectively. For any $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

\[ 4\psi_{FB}(x, y) \ge 2\left\|\phi_{FB}(x, y)_+\right\|^2 \ge \left\|(-x)_+\right\|^2 + \left\|(-y)_+\right\|^2. \]

To close this section, we recall some definitions that will be used for analysis in subsequent sections. We say that $F$ and $G$ are jointly monotone if

\[ \langle F(\zeta) - F(\xi), G(\zeta) - G(\xi)\rangle \ge 0 \quad \forall \zeta, \xi \in \mathbb{R}^n. \]

Similarly, $F$ and $G$ are jointly strongly monotone if there exists $\rho > 0$ such that

\[ \langle F(\zeta) - F(\xi), G(\zeta) - G(\xi)\rangle \ge \rho\|\zeta - \xi\|^2 \quad \forall \zeta, \xi \in \mathbb{R}^n. \]

In the case where $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$, the above notions are equivalent to the well-known notions of $F$ being, respectively, monotone and strongly monotone [7, Sec. 2.3].
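As a simple illustration of joint strong monotonicity, consider the affine case $F(\zeta) = M\zeta + q$, $G(\zeta) = \zeta$, which is an example chosen here and not one treated in the paper. Then $\langle F(\zeta) - F(\xi), \zeta - \xi\rangle = d^T M d = d^T \tfrac{1}{2}(M + M^T) d$ with $d = \zeta - \xi$, so the definition holds with $\rho$ equal to the smallest eigenvalue of the symmetric part of $M$ whenever that eigenvalue is positive. The Python sketch below (NumPy assumed; the matrix, vector and helper name are hypothetical) checks this on random pairs.

```python
import numpy as np

def strong_monotonicity_modulus(M):
    """Smallest eigenvalue of the symmetric part of M (hypothetical helper)."""
    return float(np.linalg.eigvalsh(0.5 * (M + M.T)).min())

# Illustrative data: M is nonsymmetric, its symmetric part is diag(2, 3).
M = np.array([[2.0, 1.0], [-1.0, 3.0]])
q = np.array([1.0, -1.0])
F = lambda z: M @ z + q
G = lambda z: z

rho = strong_monotonicity_modulus(M)

# Sample random pairs and record <F(z)-F(xi), G(z)-G(xi)> - rho*||z-xi||^2.
rng = np.random.default_rng(0)
gaps = []
for _ in range(100):
    z, xi = rng.standard_normal(2), rng.standard_normal(2)
    lhs = (F(z) - F(xi)) @ (G(z) - G(xi))
    gaps.append(lhs - rho * np.linalg.norm(z - xi) ** 2)
min_gap = min(gaps)   # nonnegative up to roundoff if the bound holds
```

Sampling cannot prove monotonicity for a general nonlinear pair; here the eigenvalue argument does that, and the random test only confirms it.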

3 Two classes of merit functions

In this section, we study two classes of merit functions for the SOCCP. We are motivated by a class of merit functions that was originally proposed by Z. Luo and P. Tseng [19] for the NCP and was later extended to the SDCP by P. Tseng [27]. We introduce them below.

Let $f_{LT}$ be given as (12), i.e.,

\[ f_{LT}(\zeta) := \psi_0(\langle F(\zeta), G(\zeta)\rangle) + \psi(F(\zeta), G(\zeta)), \]

where $\psi_0$ satisfies (13) and $\psi$ satisfies (14). We notice that $\psi_0$ is differentiable and strictly increasing on $[0, \infty)$. An example of $\psi_0$ is $\psi_0(t) = \frac{1}{4}(\max\{0, t\})^4$. Let $\Psi_+$ (we adopt the notation used in [27]) denote the collection of $\psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ satisfying (14) that are differentiable and satisfy the following conditions:

\[ \begin{cases} \langle \nabla_x \psi(x, y), \nabla_y \psi(x, y)\rangle \ge 0, & \forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n, \\ \langle x, \nabla_x \psi(x, y)\rangle + \langle y, \nabla_y \psi(x, y)\rangle \ge 0, & \forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n. \end{cases} \tag{21} \]

We will give an example of $\psi$ belonging to $\Psi_+$ in Prop. 3.1. Before that, we need a couple of technical lemmas, which will be used for proving Prop. 3.1 and Prop. 3.2.

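Lemma 3.1 (a) For any $x \in \mathbb{R}^n$, $\langle x, (x)_-\rangle = \|(x)_-\|^2$ and $\langle x, (x)_+\rangle = \|(x)_+\|^2$.

(b) For any $x \in \mathbb{R}^n$, we have

\[ x \in K^n \iff \langle x, y\rangle \ge 0 \quad \forall y \in K^n. \tag{22} \]

Proof. (a) By definition of trace, we know that $\mathrm{tr}(x \circ y) = 2\langle x, y\rangle$. Thus,

\[ \langle x, (x)_-\rangle = \frac{1}{2}\mathrm{tr}\big( x \circ (x)_- \big) = \frac{1}{2}\mathrm{tr}\big( [(x)_+ + (x)_-] \circ (x)_- \big) = \frac{1}{2}\mathrm{tr}\big( (x)_-^2 \big) = \|(x)_-\|^2, \]

where the third equality uses $(x)_+ \circ (x)_- = 0$ and the last equality is from the definition of trace again. Similar arguments apply for $\langle x, (x)_+\rangle = \|(x)_+\|^2$.

(b) Since $K^n$ is self-dual, that is, $K^n = (K^n)^*$, the desired result follows. □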

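Lemma 3.2 ([11, Prop. 3.4]) For any $x, y \in \mathbb{R}^n$ and $w \in K^n$, we have

\[ \begin{aligned} w^2 \succeq_{K^n} x^2 + y^2 &\implies L_w^2 \succeq L_x^2 + L_y^2, \\ w^2 \succeq_{K^n} x^2 &\implies w \succeq_{K^n} x. \end{aligned} \]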

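Proposition 3.1 Let $\psi_1 : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ be given by

\[ \psi_1(x, y) := \frac{1}{2}\left( \|(-x)_+\|^2 + \|(-y)_+\|^2 \right). \tag{23} \]

Then, the following results hold.

(a) $\psi_1$ satisfies (14).

(b) $\psi_1$ is convex and differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ with $\nabla_x \psi_1(x, y) = (x)_-$ and $\nabla_y \psi_1(x, y) = (y)_-$.

(c) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

\[ \langle \nabla_x \psi_1(x, y), \nabla_y \psi_1(x, y)\rangle \ge 0. \]

(d) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

\[ \langle x, \nabla_x \psi_1(x, y)\rangle + \langle y, \nabla_y \psi_1(x, y)\rangle = \|(x)_-\|^2 + \|(y)_-\|^2. \]

(e) $\psi_1$ belongs to $\Psi_+$.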

Proof. (a) Suppose $\psi_1(x, y) = 0$ and $\langle x, y\rangle \le 0$. Then by the definition of $\psi_1$ in (23), we have $(-x)_+ = 0$ and $(-y)_+ = 0$, which implies $x \in K^n$, $y \in K^n$. Since $K^n$ is self-dual, $x, y \in K^n$ leads to $\langle x, y\rangle \ge 0$ by (22). This together with $\langle x, y\rangle \le 0$ yields $\langle x, y\rangle = 0$. The other direction is clear from the above arguments. Hence, we have proved that $\psi_1$ satisfies (14).

(b) For any $x \in \mathbb{R}^n$, we have the decomposition $x = (x)_+ + (x)_- = (x)_+ - (-x)_+$. Hence,

\[ \frac{1}{2}\|(-x)_+\|^2 = \frac{1}{2}\|(x)_+ - x\|^2 = \min_{w \in K^n} \frac{1}{2}\|w - x\|^2, \]

which is convex and differentiable in $x$ (see [22, page 255]). Moreover, the chain rule gives

\[ \nabla_x \left[ \frac{1}{2}\|(-x)_+\|^2 \right] = -(-x)_+ = (x)_-. \]

A similar formula holds for $y$. Thus, $\psi_1$ is convex and differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$ with $\nabla_x \psi_1(x, y) = -(-x)_+ = (x)_-$ and $\nabla_y \psi_1(x, y) = -(-y)_+ = (y)_-$.

(c) From part (b), we have

\[ \langle \nabla_x \psi_1(x, y), \nabla_y \psi_1(x, y)\rangle = \langle (x)_-, (y)_-\rangle = \langle (-x)_+, (-y)_+\rangle \ge 0, \]

where the inequality is true by (22).

(d) By applying Lemma 3.1(a), we obtain

\[ \langle x, \nabla_x \psi_1(x, y)\rangle = \langle x, (x)_-\rangle = \|(x)_-\|^2. \]

Similarly, $\langle y, \nabla_y \psi_1(x, y)\rangle = \|(y)_-\|^2$, and hence the desired result holds.

(e) This is an immediate consequence of (a) through (d). □
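The pieces introduced so far can be assembled into a small numerical sketch of $f_{LT}$ from (12), using $\psi_0(t) = \frac{1}{4}(\max\{0, t\})^4$ and $\psi_1$ from (23) with gradient $((x)_-, (y)_-)$. Everything problem-specific below is an assumption made for illustration: the maps $F(\zeta) = \zeta + q$ and $G(\zeta) = \zeta$, the vector `q`, and all function names. The projection formula is the standard spectral one.

```python
import numpy as np

def proj_soc(x):
    """Projection (x)_+ onto K^n via the spectral factorization."""
    x = np.asarray(x, float)
    r = np.linalg.norm(x[1:])
    w = x[1:] / r if r > 0 else np.zeros(x.size - 1)
    p1, p2 = max(x[0] - r, 0.0), max(x[0] + r, 0.0)
    return np.concatenate(([0.5 * (p1 + p2)], 0.5 * (p2 - p1) * w))

def psi0(t):
    """Example psi_0(t) = (1/4) * max(0, t)^4, which satisfies (13)."""
    return 0.25 * max(0.0, t) ** 4

def psi1(x, y):
    """psi_1 from (23)."""
    return 0.5 * (np.linalg.norm(proj_soc(-x)) ** 2 + np.linalg.norm(proj_soc(-y)) ** 2)

def grad_psi1(x, y):
    """Gradients from Prop. 3.1(b): ((x)_-, (y)_-) = (-(-x)_+, -(-y)_+)."""
    return -proj_soc(-x), -proj_soc(-y)

def f_LT(zeta, F, G):
    """Merit function (12) built from psi0 and psi1."""
    x, y = F(zeta), G(zeta)
    return psi0(x @ y) + psi1(x, y)

# Hypothetical problem data with known solution zeta_star = (1, 1, 0).
q = np.array([0.0, -2.0, 0.0])
F = lambda z: z + q
G = lambda z: z
zeta_star = np.array([1.0, 1.0, 0.0])
value_at_solution = f_LT(zeta_star, F, G)
value_at_origin = f_LT(np.zeros(3), F, G)

# Spot-check the two conditions in (21) at random points.
rng = np.random.default_rng(1)
conditions_hold = True
for _ in range(200):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    gx, gy = grad_psi1(x, y)
    conditions_hold &= bool(gx @ gy >= -1e-12) and bool(x @ gx + y @ gy >= -1e-12)
```

At `zeta_star` both $F(\zeta)$ and $G(\zeta)$ lie on the boundary of $K^3$ with zero inner product, so the merit function vanishes there, while it is positive at the origin, which is infeasible for this data.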


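Next, we consider a further restriction on $\psi$. Let $\Psi_{++}$ denote the collection of $\psi \in \Psi_+$ satisfying the following condition:

\[ \psi(x, y) = 0 \quad \forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n \text{ whenever } \langle \nabla_x \psi(x, y), \nabla_y \psi(x, y)\rangle = 0. \tag{24} \]

We notice that the $\psi_1$ defined as (23) in Prop. 3.1 does not belong to $\Psi_{++}$: for instance, at $x = -e$, $y = e$ we have $\nabla_y \psi_1(x, y) = (y)_- = 0$, so the inner product in (24) vanishes, yet $\psi_1(x, y) = 1/2$. An example of $\psi$ belonging to $\Psi_{++}$ is given in Prop. 3.2.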

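Proposition 3.2 Let $\psi_2 : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ be given by

\[ \psi_2(x, y) := \frac{1}{2}\|\phi_{FB}(x, y)_+\|^2, \tag{25} \]

where $\phi_{FB}$ is defined as (10). Then, the following results hold.

(a) $\psi_2$ satisfies (14).

(b) $\psi_2$ is differentiable at every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$. Moreover, $\nabla_x \psi_2(0, 0) = \nabla_y \psi_2(0, 0) = 0$. If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \mathrm{int}(K^n)$, then

\[ \begin{aligned} \nabla_x \psi_2(x, y) &= \left( L_x L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{FB}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \left( L_y L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{FB}(x, y)_+. \end{aligned} \tag{26} \]

If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \mathrm{int}(K^n)$, then $x_1^2 + y_1^2 \ne 0$ and

\[ \begin{aligned} \nabla_x \psi_2(x, y) &= \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+. \end{aligned} \tag{27} \]

(c) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

\[ \langle \nabla_x \psi_2(x, y), \nabla_y \psi_2(x, y)\rangle \ge 0, \]

and the equality holds whenever $\psi_2(x, y) = 0$.

(d) For every $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$, we have

\[ \langle x, \nabla_x \psi_2(x, y)\rangle + \langle y, \nabla_y \psi_2(x, y)\rangle = \|\phi_{FB}(x, y)_+\|^2. \]

(e) $\psi_2$ belongs to $\Psi_{++}$.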
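The identity in part (d) can be sanity-checked numerically without the closed-form gradients. The Python sketch below (NumPy assumed; all function names, the test point and the step size `h` are choices made here) evaluates $\psi_2$ from (25), approximates its gradient by central finite differences, and compares $\langle x, \nabla_x\psi_2\rangle + \langle y, \nabla_y\psi_2\rangle$ with $\|\phi_{FB}(x, y)_+\|^2$. One way to see why the identity is plausible: $\psi_2(tx, ty) = t^2\psi_2(x, y)$ for $t \ge 0$, so Euler's relation for positively homogeneous functions of degree 2 gives the left-hand side as $2\psi_2(x, y)$.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x o y = (<x, y>, y1*x2 + x1*y2)."""
    return np.concatenate(([x @ y], y[0] * x[1:] + x[0] * y[1:]))

def spectral_apply(x, g):
    """Apply the scalar function g to the spectral values of x."""
    r = np.linalg.norm(x[1:])
    w = x[1:] / r if r > 0 else np.zeros(x.size - 1)
    g1, g2 = g(x[0] - r), g(x[0] + r)
    return np.concatenate(([0.5 * (g1 + g2)], 0.5 * (g2 - g1) * w))

def phi_fb(x, y):
    """Fischer-Burmeister function (10); the max() guards against roundoff."""
    return spectral_apply(jordan(x, x) + jordan(y, y), lambda t: np.sqrt(max(t, 0.0))) - x - y

def proj_soc(x):
    """Projection (x)_+ onto K^n."""
    return spectral_apply(x, lambda t: max(t, 0.0))

def psi2(x, y):
    """psi_2 from (25)."""
    return 0.5 * np.linalg.norm(proj_soc(phi_fb(x, y))) ** 2

def num_grad(f, v, h=1e-6):
    """Central finite-difference approximation of the gradient of f at v."""
    g = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = h
        g[i] = (f(v + step) - f(v - step)) / (2.0 * h)
    return g

x = np.array([1.0, 2.0, 0.0])
y = np.array([0.5, 0.0, -1.0])
n = x.size
g = num_grad(lambda v: psi2(v[:n], v[n:]), np.concatenate((x, y)))
lhs = x @ g[:n] + y @ g[n:]                           # <x, grad_x> + <y, grad_y>
rhs = np.linalg.norm(proj_soc(phi_fb(x, y))) ** 2     # ||phi_FB(x, y)_+||^2
print("difference:", abs(lhs - rhs))
```

The test point is generic ($x^2 + y^2 \in \mathrm{int}(K^3)$ and $\phi_{FB}(x, y) \notin K^3 \cup -K^3$), so finite differences are reliable there; near the nonsmooth boundaries of $\phi_{FB}$ a finite-difference check would be less trustworthy.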


Proof. (a) Suppose $\psi_2(x, y) = 0$ and $\langle x, y\rangle \le 0$. Let $z := -\phi_{FB}(x, y)$. Then $(-z)_+ = \phi_{FB}(x, y)_+ = 0$, which says $z \in K^n$. Since $x + y = (x^2 + y^2)^{1/2} + z$, squaring both sides and simplifying yield

\[ 2(x \circ y) = 2\left( (x^2 + y^2)^{1/2} \circ z \right) + z^2. \]

Now, taking the trace of both sides and using the fact $\mathrm{tr}(x \circ y) = 2\langle x, y\rangle$, we obtain

\[ 4\langle x, y\rangle = 4\langle (x^2 + y^2)^{1/2}, z\rangle + 2\|z\|^2. \tag{28} \]

Since $(x^2 + y^2)^{1/2} \in K^n$ and $z \in K^n$, we know $\langle (x^2 + y^2)^{1/2}, z\rangle \ge 0$ by Lemma 3.1(b). Thus, the right-hand side of (28) is nonnegative, which together with $\langle x, y\rangle \le 0$ implies $\langle x, y\rangle = 0$. With this, equation (28) says $z = 0$, which is equivalent to $\phi_{FB}(x, y) = 0$. Then by Lemma 2.1, we have $x, y \in K^n$. Conversely, if $x, y \in K^n$ and $\langle x, y\rangle = 0$, then again Lemma 2.1 yields $\phi_{FB}(x, y) = 0$. Thus, $\psi_2(x, y) = 0$ and $\langle x, y\rangle \le 0$.

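(b) For the proof of part (b), we need to discuss three cases.

Case (1): If $(x, y) = (0, 0)$, then for any $h, k \in \mathbb{R}^n$, let $\mu_1 \le \mu_2$ be the spectral values and let $v^{(1)}, v^{(2)}$ be the corresponding spectral vectors of $h^2 + k^2$. Hence, by Property 2.1(b),

\[ \begin{aligned} \|(h^2 + k^2)^{1/2} - h - k\| &= \|\sqrt{\mu_1}\, v^{(1)} + \sqrt{\mu_2}\, v^{(2)} - h - k\| \\ &\le \sqrt{\mu_1}\|v^{(1)}\| + \sqrt{\mu_2}\|v^{(2)}\| + \|h\| + \|k\| \\ &= (\sqrt{\mu_1} + \sqrt{\mu_2})/\sqrt{2} + \|h\| + \|k\|. \end{aligned} \]

Also,

\[ \begin{aligned} \mu_1 \le \mu_2 &= \|h\|^2 + \|k\|^2 + 2\|h_1 h_2 + k_1 k_2\| \\ &\le \|h\|^2 + \|k\|^2 + 2|h_1|\|h_2\| + 2|k_1|\|k_2\| \\ &\le 2\|h\|^2 + 2\|k\|^2. \end{aligned} \]

Combining the above two inequalities yields

\[ \begin{aligned} \psi_2(h, k) - \psi_2(0, 0) &= \frac{1}{2}\|\phi_{FB}(h, k)_+\|^2 \\ &\le \|\phi_{FB}(h, k)\|^2 \\ &= \|(h^2 + k^2)^{1/2} - h - k\|^2 \\ &\le \left( (\sqrt{\mu_1} + \sqrt{\mu_2})/\sqrt{2} + \|h\| + \|k\| \right)^2 \\ &\le \left( 2\sqrt{2\|h\|^2 + 2\|k\|^2}/\sqrt{2} + \|h\| + \|k\| \right)^2 \\ &= O(\|h\|^2 + \|k\|^2), \end{aligned} \]

where the first inequality is from Lemma 2.5. This shows that $\psi_2$ is differentiable at $(0, 0)$ with

\[ \nabla_x \psi_2(0, 0) = \nabla_y \psi_2(0, 0) = 0. \]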


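Case (2): If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \in \mathrm{int}(K^n)$, let any $z \in \mathbb{R}^n$ be factored as $z = \lambda_1 u^{(1)} + \lambda_2 u^{(2)}$. Now, let $g : \mathbb{R}^n \to \mathbb{R}^n$ be defined as

\[ g(z) := \frac{1}{2}((z)_+)^2 = \hat{g}(\lambda_1) u^{(1)} + \hat{g}(\lambda_2) u^{(2)}, \]

where $\hat{g} : \mathbb{R} \to \mathbb{R}$ is given by $\hat{g}(\lambda) := \frac{1}{2}(\max(0, \lambda))^2$. From the continuous differentiability of $\hat{g}$ and Prop. 5.2 of [5], the vector-valued function $g$ is also continuously differentiable. Hence, the first component $g_1(z) = \frac{1}{2}\|(z)_+\|^2$ of $g(z)$ is continuously differentiable as well. By an easy computation, we have $\nabla g_1(z) = (z)_+$. Since $\psi_2(x, y) = g_1(\phi_{FB}(x, y))$ and $\phi_{FB}$ is differentiable at $(x, y) \ne (0, 0)$ with $x^2 + y^2 \in \mathrm{int}(K^n)$ (see [11, Cor. 5.2]), the chain rule yields

\[ \begin{aligned} \nabla_x \psi_2(x, y) &= \nabla_x \phi_{FB}(x, y) \nabla g_1(\phi_{FB}(x, y)) = \left( L_x L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{FB}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \nabla_y \phi_{FB}(x, y) \nabla g_1(\phi_{FB}(x, y)) = \left( L_y L^{-1}_{(x^2+y^2)^{1/2}} - I \right) \phi_{FB}(x, y)_+. \end{aligned} \]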

Case (3): If $(x, y) \ne (0, 0)$ and $x^2 + y^2 \notin \mathrm{int}(K^n)$, by direct computation, we know $\|x\|^2 + \|y\|^2 = 2\|x_1 x_2 + y_1 y_2\|$ under this case. Since $(x, y) \ne (0, 0)$, this also implies $x_1 x_2 + y_1 y_2 \ne 0$. We notice that we cannot apply the chain rule as in Case (2), since $\phi_{FB}$ is no longer differentiable at such $(x, y)$. By the spectral factorization, we observe that

\[ \begin{aligned} \phi_{FB}(x, y)_+ = \phi_{FB}(x, y) &\iff \phi_{FB}(x, y) \in K^n, \\ \phi_{FB}(x, y)_+ = 0 &\iff \phi_{FB}(x, y) \in -K^n, \\ \phi_{FB}(x, y)_+ = \lambda_2 u^{(2)} &\iff \phi_{FB}(x, y) \notin K^n \cup -K^n, \end{aligned} \tag{29} \]

where $\lambda_2$ is the bigger spectral value of $\phi_{FB}(x, y)$ and $u^{(2)}$ is the corresponding spectral vector. Indeed, by applying Lemma 2.2, under this case, we have (as in [4, eq. (26)])

\[ \phi_{FB}(x, y) = \left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1), \;\; \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}} - (x_2 + y_2) \right). \tag{30} \]

Therefore, $\lambda_2$ and $u^{(2)}$ are given as below:

\[ \lambda_2 = \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\|, \tag{31} \]

\[ u^{(2)} = \frac{1}{2}\left( 1, \; \frac{w_2}{\|w_2\|} \right), \tag{32} \]

where $w_2 = \dfrac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}} - (x_2 + y_2)$. To prove the differentiability of $\psi_2$ under this case, we shall discuss the following three subcases according to the above observation (29).

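(i) If $\phi_{FB}(x, y) \notin K^n \cup -K^n$, then $\phi_{FB}(x, y)_+ = \lambda_2 u^{(2)}$, where $\lambda_2$ and $u^{(2)}$ are given as in (31) and (32). From the fact that $\|u^{(2)}\| = 1/\sqrt{2}$, we obtain

\[ \begin{aligned} \psi_2(x, y) &= \frac{1}{2}\|\phi_{FB}(x, y)_+\|^2 = \frac{1}{4}\lambda_2^2 \\ &= \frac{1}{4}\left[ \left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)^2 + 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right) \cdot \|w_2\| + \|w_2\|^2 \right]. \end{aligned} \]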


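Since $(x, y) \ne (0, 0)$ in this case, $\psi_2$ is clearly differentiable. Moreover, using the product rule and chain rule for differentiation, the derivative of $\psi_2$ with respect to $x_1$ works out to be

\[ \begin{aligned} \frac{\partial}{\partial x_1}\psi_2(x, y) &= \frac{1}{4}\Bigg[ 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) + 2\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\|w_2\| \\ &\qquad\; + 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right) \cdot \frac{w_2^T \nabla_{x_1} w_2}{\|w_2\|} + 2 w_2^T \nabla_{x_1} w_2 \Bigg] \\ &= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right). \end{aligned} \]

The last equality of the above expression is true because of

\[ \begin{aligned} \nabla_{x_1} w_2 &= \frac{x_2 \cdot \sqrt{x_1^2 + y_1^2} - (x_1 x_2 + y_1 y_2) \cdot \dfrac{x_1}{\sqrt{x_1^2 + y_1^2}}}{x_1^2 + y_1^2} \\ &= \frac{\dfrac{1}{\sqrt{x_1^2 + y_1^2}}\left[ x_2 (x_1^2 + y_1^2) - (x_1^2 x_2 + x_1 y_1 y_2) \right]}{x_1^2 + y_1^2} \\ &= \frac{x_1^2 x_2 + y_1^2 x_2 - x_1^2 x_2 - x_1 y_1 y_2}{\left( \sqrt{x_1^2 + y_1^2} \right)^3} \\ &= 0, \end{aligned} \]

where the last equality holds by Lemma 2.2 (which gives $y_1 x_2 = x_1 y_2$). Similarly, the gradient of $\psi_2$ with respect to $x_2$ works out to be

\[ \begin{aligned} \nabla_{x_2}\psi_2(x, y) &= \frac{1}{4}\left[ 2\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right) \nabla_{x_2} w_2 \cdot \frac{w_2}{\|w_2\|} + 2 \nabla_{x_2} w_2 \cdot w_2 \right] \\ &= \frac{1}{2}\left[ \left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) \right)\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \frac{w_2}{\|w_2\|} + \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) w_2 \right] \\ &= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right) \frac{w_2}{\|w_2\|}. \end{aligned} \]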

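Then, we can rewrite $\nabla_x \psi_2(x, y)$ as

\[ \nabla_x \psi_2(x, y) = \begin{bmatrix} \dfrac{\partial}{\partial x_1}\psi_2(x, y) \\[2mm] \nabla_{x_2}\psi_2(x, y) \end{bmatrix} := \begin{bmatrix} \Xi_1 \\ \Xi_2 \end{bmatrix} = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \lambda_2 u^{(2)} = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+, \tag{33} \]

where

\[ \begin{aligned} \Xi_1 &:= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right) \in \mathbb{R}, \\ \Xi_2 &:= \frac{1}{2}\left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right)\left( \sqrt{x_1^2 + y_1^2} - (x_1 + y_1) + \|w_2\| \right) \frac{w_2}{\|w_2\|} \in \mathbb{R}^{n-1}. \end{aligned} \]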

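(ii) If $\phi_{FB}(x, y) \in K^n$, then $\phi_{FB}(x, y)_+ = \phi_{FB}(x, y)$ and hence $\psi_2(x, y) = \frac{1}{2}\|\phi_{FB}(x, y)_+\|^2 = \frac{1}{2}\|\phi_{FB}(x, y)\|^2$. Thus, by [4, Prop. 3.1(b)], we know that the gradient of $\psi_2$ under this subcase is as below:

\[ \begin{aligned} \nabla_x \psi_2(x, y) &= \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y) = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y) = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+. \end{aligned} \tag{34} \]

Now suppose $(x', y')$ satisfies $\phi_{FB}(x', y') \notin K^n \cup -K^n$ and $\phi_{FB}(x', y') \to \phi_{FB}(x, y) \in K^n$ (that is, $(x', y')$ lies in a neighborhood of a point belonging to this subcase). From (33) and (34), it can be seen that

\[ \nabla_x \psi_2(x', y') \to \nabla_x \psi_2(x, y), \qquad \nabla_y \psi_2(x', y') \to \nabla_y \psi_2(x, y). \]

Thus, $\psi_2$ is differentiable under this subcase.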

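(iii) If $\phi_{FB}(x, y) \in -K^n$, then $\phi_{FB}(x, y)_+ = 0$. Thus, $\psi_2(x, y) = \frac{1}{2}\|\phi_{FB}(x, y)_+\|^2 = 0$ and it is clear that its gradient under this subcase is

\[ \begin{aligned} \nabla_x \psi_2(x, y) &= 0 = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+, \\ \nabla_y \psi_2(x, y) &= 0 = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x, y)_+. \end{aligned} \tag{35} \]

Again, suppose $(x', y')$ satisfies $\phi_{FB}(x', y') \notin K^n \cup -K^n$ and $\phi_{FB}(x', y') \to \phi_{FB}(x, y) \in -K^n$ (that is, $(x', y')$ lies in a neighborhood of a point belonging to this subcase). From (33) and (35), it can be seen that

\[ \nabla_x \psi_2(x', y') \to 0 = \nabla_x \psi_2(x, y), \qquad \nabla_y \psi_2(x', y') \to 0 = \nabla_y \psi_2(x, y). \]

Thus, $\psi_2$ is differentiable under this subcase.

From the above, we complete the proof of this case, and therefore the proof of part (b) is done.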

(c) We wish to show that $\langle \nabla_x \psi_2(x, y), \nabla_y \psi_2(x, y)\rangle \ge 0$ and that the equality holds if and only if $\psi_2(x, y) = 0$. We follow the three cases as above.

Case (1): If $(x, y) = (0, 0)$, by part (b), we know $\nabla_x \psi_2(x, y) = \nabla_y \psi_2(x, y) = 0$. Therefore, the desired equality holds.
