
Applied Mathematics and Optimization, vol. 59, pp. 293-318, 2009

A damped Gauss-Newton method for the second-order cone complementarity problem

Shaohua Pan¹

School of Mathematical Sciences, South China University of Technology

Guangzhou 510640, China

Jein-Shan Chen²

Department of Mathematics, National Taiwan Normal University

Taipei 11677, Taiwan

June 5, 2007

(revised January 18, 2008) (final version June 25, 2008)

Abstract. We investigate some properties related to the generalized Newton method for the Fischer-Burmeister (FB) function over second-order cones, which allows us to reformulate the second-order cone complementarity problem (SOCCP) as a semismooth system of equations. Specifically, we characterize the B-subdifferential of the FB function at a general point and study the condition for every element of the B-subdifferential at a solution being nonsingular. In addition, for the induced FB merit function, we establish its coerciveness and provide a weaker condition than [7] for each stationary point to be a solution, under suitable Cartesian P-properties of the involved mapping. Based on these results, a damped Gauss-Newton method is proposed, and its global and superlinear convergence is established. Numerical results are reported for the second-order cone programs from the DIMACS library, which verify the good theoretical properties of the method.

Key words: second-order cones; complementarity; Fischer-Burmeister function; B-subdifferential; generalized Newton method.

¹ The author's work is partially supported by the Doctoral Starting-up Foundation (B13B6050640) of GuangDong Province. E-mail: shhpan@scut.edu.cn.

² Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342.


1 Introduction

Consider the following conic complementarity problem of finding ζ ∈ IRⁿ such that

\[ F(\zeta) \in \mathcal{K}, \quad G(\zeta) \in \mathcal{K}, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1} \]

where ⟨·, ·⟩ represents the Euclidean inner product, F, G : IRⁿ → IR^m are mappings assumed to be continuously differentiable throughout this paper, and K is the Cartesian product of second-order cones (SOCs), also called Lorentz cones. In other words,

\[ \mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_q}, \tag{2} \]

where q, n₁, . . . , n_q ≥ 1, n₁ + · · · + n_q = m and

\[ \mathcal{K}^{n_i} := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_1 \ge \|x_2\| \right\}, \]

with ‖·‖ denoting the Euclidean norm and K¹ denoting the set of nonnegative reals IR₊. We will refer to (1)–(2) as the second-order cone complementarity problem (SOCCP).
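As a small illustration of the cone structure in (2), the following Python sketch tests membership in a product of second-order cones. The function names `in_soc` and `in_product_cone` and the tolerance `tol` are hypothetical choices made here for illustration; they do not come from the paper.

```python
import math

def in_soc(x, tol=1e-12):
    """Return True if x = (x1, x2) satisfies x1 >= ||x2|| (up to tol)."""
    x1, x2 = x[0], x[1:]
    return x1 >= math.sqrt(sum(t * t for t in x2)) - tol

def in_product_cone(x, dims, tol=1e-12):
    """Check membership in K^{n_1} x ... x K^{n_q}, with block sizes given by dims."""
    if sum(dims) != len(x):
        raise ValueError("block sizes must add up to len(x)")
    start = 0
    for n in dims:
        if not in_soc(x[start:start + n], tol):
            return False
        start += n
    return True

# Blocks (2, 1, 1) in K^3 and (0.5) in K^1 = nonnegative reals.
print(in_product_cone([2.0, 1.0, 1.0, 0.5], dims=[3, 1]))
# First block (1, 1, 1) violates x1 >= ||x2|| since 1 < sqrt(2).
print(in_product_cone([1.0, 1.0, 1.0, 0.5], dims=[3, 1]))
```

For a block of size 1 the tail x₂ is empty, so the test reduces to x₁ ≥ 0, matching the convention that K¹ is IR₊.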

Corresponding to the Cartesian structure of K, in the rest of this paper, we always write F = (F₁, . . . , F_q) and G = (G₁, . . . , G_q) with F_i, G_i : IRⁿ → IR^{n_i}.

An important special case of the SOCCP corresponds to n = m and G(ζ) = ζ for all ζ ∈ IRⁿ. Then (1) and (2) reduce to

\[ F(\zeta) \in \mathcal{K}, \quad \zeta \in \mathcal{K}, \quad \langle F(\zeta), \zeta \rangle = 0, \tag{3} \]

which is a natural extension of the nonlinear complementarity problem (NCP) over the nonnegative orthant cone IRⁿ₊. Another special case corresponds to the Karush-Kuhn-Tucker (KKT) conditions for the convex second-order cone program (CSOCP):

\[ \min \; g(x) \quad \text{s.t.} \quad Ax = b, \; x \in \mathcal{K}, \tag{4} \]

where A ∈ IR^{p×m} has full row rank, b ∈ IR^p and g : IR^m → IR is a twice continuously differentiable convex function. From [7], the KKT conditions of (4), which are sufficient but not necessary for optimality, can be rewritten in the form of (1) with n = m and

\[ F(\zeta) := \hat{x} + \left( I - A^T (A A^T)^{-1} A \right) \zeta, \qquad G(\zeta) := \nabla g(F(\zeta)) - A^T (A A^T)^{-1} A \zeta, \tag{5} \]

where x̂ ∈ IRⁿ is any vector satisfying Ax̂ = b. When g is a linear function, (4) reduces to the standard second-order cone program, which has extensive applications in engineering design, finance, control, and robust optimization; see [1, 14] and references therein.

There have been many methods proposed for solving SOCPs and SOCCPs. They include the interior-point methods [1, 2, 14, 16, 24, 26], the non-interior smoothing Newton methods [6, 11], the smoothing-regularization method [13], and the merit function approach [7]. Among these, the last three kinds of methods are all based on an SOC complementarity function. Specifically, a mapping φ : IR^l × IR^l → IR^l is called an SOC complementarity function associated with the cone K^l (l ≥ 1) if

\[ \phi(x, y) = 0 \iff x \in \mathcal{K}^l, \; y \in \mathcal{K}^l, \; \langle x, y \rangle = 0. \tag{6} \]

A popular choice of φ is the vector-valued Fischer-Burmeister (FB) function, defined by

\[ \phi(x, y) := (x^2 + y^2)^{1/2} - (x + y) \quad \forall x, y \in \mathbb{R}^l, \tag{7} \]

where x² = x ◦ x denotes the Jordan product of x and itself, x^{1/2} denotes a vector such that (x^{1/2})² = x, and x + y means the usual componentwise addition of vectors. From the next section, we see that φ in (7) is well-defined for all (x, y) ∈ IR^l × IR^l. The function was shown in [11] to satisfy the equivalence (6), and therefore its squared norm

\[ \psi(x, y) := \frac{1}{2} \|\phi(x, y)\|^2 \tag{8} \]

is a merit function for the SOCCP, i.e., ψ(x, y) = 0 if and only if x ∈ K^l, y ∈ K^l and ⟨x, y⟩ = 0. The functions φ and ψ were studied in [7, 21], where ψ was shown to be continuously differentiable everywhere by Chen and Tseng [7] and φ was proved to be strongly semismooth by D. Sun and J. Sun [21].
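To make the equivalence (6) concrete, consider the simplest case l = 1, where K¹ is the set of nonnegative reals and the Jordan product is ordinary multiplication, so (7) reduces to the scalar Fischer-Burmeister function. The Python sketch below is illustrative only; the names `fb_scalar` and `psi_scalar` are hypothetical.

```python
import math

def fb_scalar(a, b):
    """FB function for l = 1: sqrt(a^2 + b^2) - (a + b)."""
    return math.sqrt(a * a + b * b) - (a + b)

def psi_scalar(a, b):
    """Merit function (8) for l = 1: half the squared FB residual."""
    return 0.5 * fb_scalar(a, b) ** 2

# The first two pairs are complementary (a >= 0, b >= 0, a*b = 0);
# the last two violate complementarity or nonnegativity.
for a, b in [(0.0, 3.0), (2.0, 0.0), (1.0, 1.0), (-1.0, 0.0)]:
    print((a, b), fb_scalar(a, b), psi_scalar(a, b))
```

The residual vanishes for the complementary pairs and is nonzero otherwise, which is the scalar instance of (6).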

In view of the characterization in (6), clearly, the SOCCP can be reformulated as the following nonsmooth system of equations:

\[ \Phi(\zeta) := \begin{pmatrix} \phi(F_1(\zeta), G_1(\zeta)) \\ \vdots \\ \phi(F_i(\zeta), G_i(\zeta)) \\ \vdots \\ \phi(F_q(\zeta), G_q(\zeta)) \end{pmatrix} = 0, \tag{9} \]

where φ is defined as in (7) with a suitable dimension l. By Corollary 3.3 of [21], it is not hard to show that the operator Φ : IRⁿ → IR^m in (9) is semismooth. Furthermore, from Proposition 2 of [7], its squared norm induces a smooth merit function, given by

\[ \Psi(\zeta) := \frac{1}{2} \|\Phi(\zeta)\|^2 = \sum_{i=1}^{q} \psi(F_i(\zeta), G_i(\zeta)). \tag{10} \]

In this paper, we mainly characterize the B-subdifferential of φ at a general point and present an estimate for the B-subdifferential of Φ. Using this, a condition is given to guarantee that every element of the B-subdifferential of Φ at a solution is nonsingular, which plays an important role in the local convergence analysis of nonsmooth Newton methods for the SOCCP. In addition, two important results are also presented for the merit function Ψ(ζ). One of them shows that each stationary point of Ψ is a solution of the SOCCP under a weaker condition than the one used by [7], and the other establishes the coerciveness of Ψ for the SOCCP (3) under the uniform Cartesian P-property of F. Based on these results, we finally propose a damped Gauss-Newton method by applying the generalized Newton method [19, 20] to the system (9), and analyze its global and superlinear (quadratic) convergence. Numerical results are reported for the SOCPs from the DIMACS library [18], which verify the good theoretical properties of the method.

Throughout this paper, I represents an identity matrix of suitable dimension, IRⁿ denotes the space of n-dimensional real column vectors, and IR^{n₁} × · · · × IR^{n_q} is identified with IR^{n₁+···+n_q}. Thus, (x₁, . . . , x_q) ∈ IR^{n₁} × · · · × IR^{n_q} is viewed as a column vector in IR^{n₁+···+n_q}. For any differentiable mapping F : IRⁿ → IR^m, the notation ∇F(x) ∈ IR^{n×m} denotes the transpose of the Jacobian F′(x). For a symmetric matrix A, we write A ≻ O (respectively, A ⪰ O) if A is positive definite (respectively, positive semidefinite). Given a finite number of square matrices Q₁, . . . , Q_q, we denote the block diagonal matrix with these matrices as block diagonals by diag(Q₁, . . . , Q_q) or by diag(Q_i, i = 1, . . . , q). If J and B are index sets such that J, B ⊆ {1, 2, . . . , q}, we denote by P_{JB} the block matrix consisting of the sub-matrices P_{jk} ∈ IR^{n_j×n_k} of P with j ∈ J, k ∈ B, and denote by x_B a vector consisting of sub-vectors x_i ∈ IR^{n_i} with i ∈ B.

2 Preliminaries

This section recalls some background materials and preliminary results that will be used in the subsequent sections. We start with the interior and the boundary of K^l (l > 1). It is known that K^l is a closed convex self-dual cone with nonempty interior given by

\[ \mathrm{int}(\mathcal{K}^l) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 > \|x_2\| \right\} \]

and the boundary given by

\[ \mathrm{bd}(\mathcal{K}^l) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 = \|x_2\| \right\}. \]

For any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IR^{l−1}, we define their Jordan product [9] as

\[ x \circ y := \left( \langle x, y \rangle, \; x_1 y_2 + y_1 x_2 \right). \]

The Jordan product "◦", unlike scalar or matrix multiplication, is not associative, which is the main source of complication in the analysis of SOCCP. The identity element under this product is e := (1, 0, . . . , 0)^T ∈ IR^l. For each x = (x₁, x₂) ∈ IR × IR^{l−1}, define the matrix L_x by

\[ L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix}, \]

which can be viewed as a linear mapping from IR^l to IR^l with the following properties.

Property 2.1 (a) L_x y = x ◦ y and L_{x+y} = L_x + L_y for any y ∈ IR^l.

(b) x ∈ K^l ⟺ L_x ⪰ O and x ∈ int(K^l) ⟺ L_x ≻ O.

(c) L_x is invertible whenever x ∈ int(K^l) with the inverse L_x^{-1} given by

\[ L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{x_2 x_2^T}{x_1} \end{bmatrix}, \tag{11} \]

where det(x) := x₁² − ‖x₂‖² denotes the determinant of x.
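The Jordan product, the matrix L_x and the inverse formula (11) can be checked numerically. The sketch below uses plain Python lists; all function names (`jordan`, `arrow`, `arrow_inverse`, `matvec`, `matmul`) and the sample vectors are hypothetical and chosen here only for illustration.

```python
def jordan(x, y):
    """Jordan product x o y = (<x, y>, x1*y2 + y1*x2)."""
    head = sum(a * b for a, b in zip(x, y))
    tail = [x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:])]
    return [head] + tail

def arrow(x):
    """Matrix L_x = [[x1, x2^T], [x2, x1*I]] as a list of rows."""
    l = len(x)
    rows = [list(x)]
    for i in range(1, l):
        row = [0.0] * l
        row[0], row[i] = x[i], x[0]
        rows.append(row)
    return rows

def arrow_inverse(x):
    """Inverse of L_x via formula (11); assumes x lies in int(K^l)."""
    l, x1, x2 = len(x), x[0], x[1:]
    det = x1 * x1 - sum(t * t for t in x2)
    inv = [[0.0] * l for _ in range(l)]
    inv[0][0] = x1 / det
    for i in range(1, l):
        inv[0][i] = inv[i][0] = -x2[i - 1] / det
        for j in range(1, l):
            diag = det / x1 if i == j else 0.0
            inv[i][j] = (diag + x2[i - 1] * x2[j - 1] / x1) / det
    return inv

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matmul(A, B):
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]

x, y, z = [2.0, 1.0, 0.5], [1.0, -1.0, 2.0], [0.5, 0.0, 1.0]
print(matvec(arrow(x), y), jordan(x, y))                  # Property 2.1(a): both agree
print(jordan(jordan(x, y), z), jordan(x, jordan(y, z)))   # the two products differ
print(matmul(arrow(x), arrow_inverse(x)))                 # close to the identity matrix
```

Here x = (2, 1, 0.5) lies in int(K³) because 2 > ‖(1, 0.5)‖, so formula (11) applies to it.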

In the following, we recall from [9, 11] that each x = (x₁, x₂) ∈ IR × IR^{l−1} admits a spectral factorization, associated with K^l, of the form

\[ x = \lambda_1(x) \, u_x^{(1)} + \lambda_2(x) \, u_x^{(2)}, \]

where λ₁(x), λ₂(x) and u_x^{(1)}, u_x^{(2)} are the spectral values and the associated spectral vectors of x, respectively, defined by

\[ \lambda_i(x) = x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} = \frac{1}{2} \left( 1, \; (-1)^i \bar{x}_2 \right), \qquad i = 1, 2, \]

with x̄₂ = x₂/‖x₂‖ if x₂ ≠ 0 and otherwise x̄₂ being any vector in IR^{l−1} satisfying ‖x̄₂‖ = 1. If x₂ ≠ 0, the factorization is unique. The spectral factorizations of x, x² and x^{1/2} have various interesting properties, and some of them are summarized as follows.

Property 2.2 For any x = (x₁, x₂) ∈ IR × IR^{l−1}, let λ₁(x), λ₂(x) and u_x^{(1)}, u_x^{(2)} be the spectral values and the associated spectral vectors. Then, the following results hold.

(a) x ∈ K^l ⟺ 0 ≤ λ₁(x) ≤ λ₂(x) and x ∈ int(K^l) ⟺ 0 < λ₁(x) ≤ λ₂(x).

(b) x² = [λ₁(x)]² u_x^{(1)} + [λ₂(x)]² u_x^{(2)} ∈ K^l for any x ∈ IR^l.

(c) If x ∈ K^l, then x^{1/2} = √(λ₁(x)) u_x^{(1)} + √(λ₂(x)) u_x^{(2)} ∈ K^l.
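The spectral factorization and Property 2.2(c) translate directly into a few lines of code. The following Python sketch is an illustration under assumptions made here: the names `spectral`, `apply_spectral`, `soc_sqrt` and `jordan` are hypothetical, and when x₂ = 0 the first coordinate axis is picked as the (arbitrary) unit vector x̄₂.

```python
import math

def spectral(x):
    """Spectral values [lam1, lam2] and spectral vectors [u1, u2] of x."""
    x1, x2 = x[0], x[1:]
    nrm = math.sqrt(sum(t * t for t in x2))
    if nrm > 0:
        xbar = [t / nrm for t in x2]
    else:
        # Any unit vector is admissible when x2 = 0; this pick is arbitrary.
        xbar = ([1.0] + [0.0] * (len(x2) - 1)) if x2 else []
    lam = [x1 - nrm, x1 + nrm]
    u = [[0.5] + [0.5 * s * t for t in xbar] for s in (-1.0, 1.0)]
    return lam, u

def apply_spectral(f, x):
    """Return f(lam1) * u1 + f(lam2) * u2."""
    lam, u = spectral(x)
    return [f(lam[0]) * a + f(lam[1]) * b for a, b in zip(u[0], u[1])]

def soc_sqrt(x):
    """x^{1/2} for x in K^l, as in Property 2.2(c); tiny negative values are clipped."""
    return apply_spectral(lambda t: math.sqrt(max(t, 0.0)), x)

def jordan(x, y):
    head = sum(a * b for a, b in zip(x, y))
    return [head] + [x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:])]

x = [3.0, 1.0, 2.0]                    # in int(K^3) since 3 > sqrt(5)
r = soc_sqrt(x)
print(apply_spectral(lambda t: t, x))  # reconstructs x from its factorization
print(jordan(r, r))                    # squares back to x up to rounding
```

The same helper `apply_spectral` with f(t) = t² reproduces Property 2.2(b).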

Now we recall the concepts of the B-subdifferential and (strong) semismoothness. Given a mapping H : IRⁿ → IR^m, if H is locally Lipschitz continuous, then the set

\[ \partial_B H(z) := \left\{ V \in \mathbb{R}^{m \times n} \mid \exists \{z^k\} \subseteq D_H : z^k \to z, \; H'(z^k) \to V \right\} \]

is nonempty and is called the B-subdifferential of H at z, where D_H ⊆ IRⁿ denotes the set of points at which H is differentiable. The convex hull ∂H(z) := conv ∂_B H(z) is the generalized Jacobian of Clarke [4]. Semismoothness was originally introduced by Mifflin [15] for functionals. Smooth functions, convex functionals, and piecewise linear functions are examples of semismooth functions. Later, Qi and Sun [19] extended the definition of semismooth functions to a mapping H : IRⁿ → IR^m. H is called semismooth at x if H is directionally differentiable at x and for all V ∈ ∂H(x + h) and h → 0,

\[ V h - H'(x; h) = o(\|h\|); \]

H is called strongly semismooth at x if H is semismooth at x and for all V ∈ ∂H(x + h) and h → 0,

\[ V h - H'(x; h) = O(\|h\|^2); \]

H is called (strongly) semismooth if it is (strongly) semismooth everywhere. Here, o(‖h‖) means a function α : IRⁿ → IR^m satisfying lim_{h→0} α(h)/‖h‖ = 0, while O(‖h‖²) denotes a function α : IRⁿ → IR^m satisfying ‖α(h)‖ ≤ C‖h‖² for all ‖h‖ ≤ δ and some C > 0, δ > 0.

Next, we present the definitions of Cartesian P-properties for a matrix M ∈ IR^{m×m}, which are special cases of those introduced by Chen and Qi [5] for a linear transformation.

Definition 2.1 A matrix M ∈ IR^{m×m} is said to have

(a) the Cartesian P-property if for any 0 ≠ x = (x₁, . . . , x_q) ∈ IR^m with x_i ∈ IR^{n_i}, there exists an index ν ∈ {1, 2, . . . , q} such that ⟨x_ν, (Mx)_ν⟩ > 0;

(b) the Cartesian P₀-property if for any 0 ≠ x = (x₁, . . . , x_q) ∈ IR^m with x_i ∈ IR^{n_i}, there exists an index ν ∈ {1, 2, . . . , q} such that x_ν ≠ 0 and ⟨x_ν, (Mx)_ν⟩ ≥ 0.

Some nonlinear generalizations of these concepts in the setting of K are defined as follows.

Definition 2.2 Given a mapping F = (F₁, . . . , F_q) with F_i : IRⁿ → IR^{n_i}, F is said to

(a) have the uniform Cartesian P-property if for any x = (x₁, . . . , x_q), y = (y₁, . . . , y_q) ∈ IR^m, there is an index ν ∈ {1, 2, . . . , q} and a positive constant ρ > 0 such that

\[ \langle x_\nu - y_\nu, \; F_\nu(x) - F_\nu(y) \rangle \ge \rho \|x - y\|^2; \]

(b) have the Cartesian P₀-property if for any x = (x₁, . . . , x_q), y = (y₁, . . . , y_q) ∈ IR^m and x ≠ y, there exists an index ν ∈ {1, 2, . . . , q} such that

\[ x_\nu \ne y_\nu \quad \text{and} \quad \langle x_\nu - y_\nu, \; F_\nu(x) - F_\nu(y) \rangle \ge 0. \]

From the above definitions, if a continuously differentiable mapping F : IRⁿ → IRⁿ has the uniform Cartesian P-property (Cartesian P₀-property), then ∇F(x) at any x ∈ IRⁿ enjoys the Cartesian P-property (Cartesian P₀-property). In addition, we may see that, when n₁ = · · · = n_q = 1, the above concepts reduce to the definitions of P-matrices and P-functions, respectively, for the NCP.

Finally, we introduce some notations which will be used in the rest of this paper. For any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IR^{l−1}, we define w, z : IR^l × IR^l → IR^l by

\[ \begin{aligned} w &= (w_1, w_2) = (w_1(x, y), w_2(x, y)) = w(x, y) := x^2 + y^2, \\ z &= (z_1, z_2) = (z_1(x, y), z_2(x, y)) = z(x, y) := (x^2 + y^2)^{1/2}. \end{aligned} \tag{12} \]

Clearly, w ∈ K^l with w₁ = ‖x‖² + ‖y‖² and w₂ = 2(x₁x₂ + y₁y₂). Let w̄₂ = w₂/‖w₂‖ if w₂ ≠ 0, and otherwise w̄₂ be any vector in IR^{l−1} satisfying ‖w̄₂‖ = 1. Then, using Property 2.2 (b) and (c), it is not hard to compute that

\[ z = \left( \frac{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}{2}, \; \frac{\sqrt{\lambda_2(w)} - \sqrt{\lambda_1(w)}}{2} \, \bar{w}_2 \right) \in \mathcal{K}^l. \]
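The closed form of z gives a direct way to evaluate the FB function φ(x, y) = z(x, y) − (x + y). The Python sketch below is a minimal illustration; the function name `fb` and the test points are hypothetical. When w₂ = 0 the coefficient of w̄₂ vanishes (since λ₁(w) = λ₂(w)), so the zero vector is used as a placeholder for w̄₂.

```python
import math

def fb(x, y):
    """phi(x, y) = (x^2 + y^2)^{1/2} - (x + y), using the closed form of z."""
    l = len(x)
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * a + y[0] * b) for a, b in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    lam1, lam2 = max(w1 - nw2, 0.0), w1 + nw2   # clip rounding noise at the boundary
    wbar = [t / nw2 for t in w2] if nw2 > 0 else [0.0] * (l - 1)
    z1 = 0.5 * (math.sqrt(lam2) + math.sqrt(lam1))
    coef = 0.5 * (math.sqrt(lam2) - math.sqrt(lam1))
    z = [z1] + [coef * t for t in wbar]
    return [zi - (a + b) for zi, a, b in zip(z, x, y)]

# x, y on the boundary of K^3 with <x, y> = 0: a complementary pair.
print(fb([1.0, 1.0, 0.0], [1.0, -1.0, 0.0]))
# x in int(K^3) and y = 0: also complementary.
print(fb([2.0, 1.0, 0.0], [0.0, 0.0, 0.0]))
# x = y = e: both in K^3 but <x, y> = 1, so the residual is nonzero.
print(fb([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))
```

The first two residuals vanish and the third does not, in line with the equivalence (6).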

3 B-subdifferential of the FB Function

In this section, we characterize the B-subdifferential of the FB function φ at a general point (x, y) ∈ IR^l × IR^l. For this purpose, we need several important technical lemmas.

The first lemma characterizes the set of the points where z(x, y) is differentiable. Since the proof follows directly from [3, Proposition 4] and formula (11), we omit it here.

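Lemma 3.1 The function z(x, y) in (12) is continuously differentiable at a point (x, y) if and only if x² + y² ∈ int(K^l). Moreover, ∇_x z(x, y) = L_x L_z^{-1} and ∇_y z(x, y) = L_y L_z^{-1}, where L_z^{-1} = (1/√w₁) I if w₂ = 0, and otherwise

\[ L_z^{-1} = \begin{pmatrix} b & c \, \bar{w}_2^T \\ c \, \bar{w}_2 & a I + (b - a) \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{13} \]

with

\[ a = \frac{2}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}, \qquad b = \frac{1}{2} \left( \frac{1}{\sqrt{\lambda_2(w)}} + \frac{1}{\sqrt{\lambda_1(w)}} \right), \qquad c = \frac{1}{2} \left( \frac{1}{\sqrt{\lambda_2(w)}} - \frac{1}{\sqrt{\lambda_1(w)}} \right). \]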
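Formula (13) can be sanity-checked by building L_z from the closed form of z and multiplying it by the matrix with entries a, b, c. The Python sketch below does this for one sample point; the names `lz_and_formula` and `identity_error` and the sample vectors are hypothetical.

```python
import math

def lz_and_formula(x, y):
    """Return (L_z, M) with M the matrix in (13); needs w in int(K^l) and w2 != 0."""
    l = len(x)
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * p + y[0] * q) for p, q in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    lam1, lam2 = w1 - nw2, w1 + nw2
    if not (lam1 > 0 and nw2 > 0):
        raise ValueError("sample point must give w in int(K^l) with w2 != 0")
    wbar = [t / nw2 for t in w2]
    r1, r2 = math.sqrt(lam1), math.sqrt(lam2)
    a = 2.0 / (r2 + r1)
    b = 0.5 * (1.0 / r2 + 1.0 / r1)
    c = 0.5 * (1.0 / r2 - 1.0 / r1)
    z = [0.5 * (r2 + r1)] + [0.5 * (r2 - r1) * t for t in wbar]
    Lz = [[0.0] * l for _ in range(l)]
    M = [[0.0] * l for _ in range(l)]
    Lz[0][0], M[0][0] = z[0], b
    for i in range(1, l):
        Lz[0][i] = Lz[i][0] = z[i]
        Lz[i][i] = z[0]
        M[0][i] = M[i][0] = c * wbar[i - 1]
        for j in range(1, l):
            M[i][j] = (a if i == j else 0.0) + (b - a) * wbar[i - 1] * wbar[j - 1]
    return Lz, M

def identity_error(A, B):
    """Largest entry of |A B - I|."""
    n = len(A)
    return max(abs(sum(A[i][k] * B[k][j] for k in range(n)) - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

Lz, M = lz_and_formula([1.0, 0.5, -0.2], [0.3, 0.1, 0.4])
print(identity_error(Lz, M))   # at rounding level if (13) is the inverse of L_z
```

The check only covers the case w₂ ≠ 0; for w₂ = 0 the matrix L_z is √w₁ times the identity, so its inverse is immediate.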

The following two lemmas extend the results of Lemmas 2 and 3 of [7], respectively. Since the proofs follow directly by the same technique as in [7], we omit them here.

Lemma 3.2 For any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IR^{l−1} with w = x² + y² ∈ bd(K^l), we have

\[ x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2. \]

If, in addition, w₂ ≠ 0, then ‖w‖² = 2w₁² = 2‖w₂‖² = 8(x₁² + y₁²)² ≠ 0 and

\[ x_1 \bar{w}_2 = x_2, \quad x_2^T \bar{w}_2 = x_1, \quad y_1 \bar{w}_2 = y_2, \quad y_2^T \bar{w}_2 = y_1. \]

Lemma 3.3 For any x = (x₁, x₂), y = (y₁, y₂) ∈ IR × IR^{l−1} with w₂ = 2(x₁x₂ + y₁y₂) ≠ 0, there holds that

\[ \left( x_1 + (-1)^i x_2^T \bar{w}_2 \right)^2 \le \left\| x_2 + (-1)^i x_1 \bar{w}_2 \right\|^2 \le \lambda_i(w) \quad \text{for } i = 1, 2. \]
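The chain of inequalities in Lemma 3.3 is easy to probe numerically. The Python sketch below samples random points and checks both inequalities for i = 1, 2; it is a sanity check under assumptions made here (the function name `lemma33_terms`, the seed, the sampling box and the tolerance are all hypothetical), not a proof.

```python
import math
import random

def lemma33_terms(x, y):
    """For i = 1, 2 return (lhs, mid, rhs) of Lemma 3.3; requires w2 != 0."""
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * a + y[0] * b) for a, b in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    if nw2 == 0:
        raise ValueError("Lemma 3.3 assumes w2 != 0")
    wbar = [t / nw2 for t in w2]
    x1, x2 = x[0], x[1:]
    dot = sum(a * b for a, b in zip(x2, wbar))
    terms = []
    for sign in (-1.0, 1.0):                     # sign = (-1)^i for i = 1, 2
        lhs = (x1 + sign * dot) ** 2
        mid = sum((a + sign * x1 * b) ** 2 for a, b in zip(x2, wbar))
        rhs = w1 + sign * nw2                    # lambda_i(w)
        terms.append((lhs, mid, rhs))
    return terms

random.seed(0)
ok = True
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(4)]
    y = [random.uniform(-1.0, 1.0) for _ in range(4)]
    for lhs, mid, rhs in lemma33_terms(x, y):
        ok = ok and lhs <= mid + 1e-12 and mid <= rhs + 1e-12
print(ok)
```

The first inequality is the Cauchy-Schwarz inequality in disguise, since the gap equals ‖x₂‖² − (x₂ᵀw̄₂)²; the second has gap ‖y‖² + 2(−1)ⁱ y₁ y₂ᵀw̄₂, which is nonnegative.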

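Based on Lemmas 3.1–3.3, we are now in a position to present the representation for the elements of the B-subdifferential ∂_Bφ(x, y) at a general point (x, y) ∈ IR^l × IR^l.

Proposition 3.1 Given a general point (x, y) ∈ IR^l × IR^l, each element in ∂_Bφ(x, y) is given by [V_x − I   V_y − I] with V_x and V_y having the following representation:

(a) If x² + y² ∈ int(K^l), then V_x = L_z^{-1} L_x and V_y = L_z^{-1} L_y.

(b) If x² + y² ∈ bd(K^l) and (x, y) ≠ (0, 0), then

\[ \begin{aligned} V_x &\in \left\{ \frac{1}{2\sqrt{2 w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix} L_x + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T \right\}, \\ V_y &\in \left\{ \frac{1}{2\sqrt{2 w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix} L_y + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \right\} \end{aligned} \tag{14} \]

for some u = (u₁, u₂), v = (v₁, v₂) ∈ IR × IR^{l−1} satisfying |u₁| ≤ ‖u₂‖ ≤ 1 and |v₁| ≤ ‖v₂‖ ≤ 1, where w̄₂ = w₂/‖w₂‖.

(c) If (x, y) = (0, 0), then V_x ∈ {L_x̂}, V_y ∈ {L_ŷ} for some x̂, ŷ with ‖x̂‖² + ‖ŷ‖² = 1, or

\[ \begin{aligned} V_x &\in \left\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} \right\}, \\ V_y &\in \left\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} \right\} \end{aligned} \tag{15} \]

for some u = (u₁, u₂), v = (v₁, v₂), ξ = (ξ₁, ξ₂), η = (η₁, η₂) ∈ IR × IR^{l−1} such that |u₁| ≤ ‖u₂‖ ≤ 1, |v₁| ≤ ‖v₂‖ ≤ 1, |ξ₁| ≤ ‖ξ₂‖ ≤ 1, |η₁| ≤ ‖η₂‖ ≤ 1, w̄₂ ∈ IR^{l−1} satisfying ‖w̄₂‖ = 1, and s = (s₁, s₂), ω = (ω₁, ω₂) ∈ IR × IR^{l−1} satisfying ‖s‖² + ‖ω‖² ≤ 1/2.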

Proof. Let D_φ denote the set of points where φ is differentiable. Recall that this set is characterized by Lemma 3.1 since φ(x, y) = z(x, y) − (x + y), and moreover,

\[ \phi'_x(x, y) = L_z^{-1} L_x - I, \qquad \phi'_y(x, y) = L_z^{-1} L_y - I \qquad \forall (x, y) \in D_\phi. \]

(a) In this case, φ is continuously differentiable at (x, y) by Lemma 3.1. Hence, ∂_Bφ(x, y) consists of a single element, i.e. φ′(x, y) = [L_z^{-1} L_x − I   L_z^{-1} L_y − I], and the result is clear.

(b) Assume that (x, y) ≠ (0, 0) satisfies x² + y² ∈ bd(K^l). Then w ∈ bd(K^l) and w₁ > 0, which means ‖w₂‖ = w₁ > 0 and λ₂(w) > λ₁(w) = 0. Observe that, when w₂ ≠ 0, the matrix L_z^{-1} in (13) can be decomposed as the sum of

\[ L_1(w) := \frac{1}{2\sqrt{\lambda_1(w)}} \begin{pmatrix} 1 & -\bar{w}_2^T \\ -\bar{w}_2 & \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{16} \]

and

\[ L_2(w) := \frac{1}{2\sqrt{\lambda_2(w)}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & \dfrac{4\sqrt{\lambda_2(w)}}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}} (I - \bar{w}_2 \bar{w}_2^T) + \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{17} \]

with w̄₂ = w₂/‖w₂‖. Consequently, φ′_x and φ′_y can be rewritten as

\[ \phi'_x(x, y) = \left( L_1(w) + L_2(w) \right) L_x - I, \qquad \phi'_y(x, y) = \left( L_1(w) + L_2(w) \right) L_y - I. \tag{18} \]

Let {(x^k, y^k)} ⊆ D_φ be an arbitrary sequence converging to (x, y). Let w^k = (w₁^k, w₂^k) = w(x^k, y^k) and z^k = z(x^k, y^k) for each k, where w(x, y) and z(x, y) are given as in (12). Since w₂ ≠ 0, we may assume without loss of generality that ‖w₂^k‖ ≠ 0 for each k. Let w̄₂^k = w₂^k/‖w₂^k‖ for each k. From (18), it follows that

\[ \phi'_x(x^k, y^k) = \left( L_1(w^k) + L_2(w^k) \right) L_{x^k} - I, \qquad \phi'_y(x^k, y^k) = \left( L_1(w^k) + L_2(w^k) \right) L_{y^k} - I. \tag{19} \]

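Since lim_{k→∞} λ₁(w^k) = 0, lim_{k→∞} λ₂(w^k) = 2w₁ > 0 and lim_{k→∞} w̄₂^k = w̄₂, we have

\[ \lim_{k \to \infty} L_2(w^k) L_{x^k} = C(w) L_x \quad \text{and} \quad \lim_{k \to \infty} L_2(w^k) L_{y^k} = C(w) L_y, \tag{20} \]

where

\[ C(w) = \frac{1}{2\sqrt{2 w_1}} \begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix}. \tag{21} \]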
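The decomposition of L_z^{-1} into L₁(w) + L₂(w) and the limit matrix C(w) can both be verified numerically. The Python sketch below parametrizes the matrices by λ₁, λ₂ and a unit vector w̄₂ (recall w₁ = (λ₁ + λ₂)/2); the helper names `outer`, `block`, `scaled`, `L1`, `L2`, `Lz_inv`, `C` and `max_diff`, as well as the sample values, are hypothetical choices made for this illustration.

```python
import math

def outer(u, v):
    return [[a * b for b in v] for a in u]

def block(top_left, top_row, left_col, corner):
    """Assemble [[top_left, top_row], [left_col, corner]] as a list of rows."""
    return [[top_left] + list(top_row)] + [[c] + list(r) for c, r in zip(left_col, corner)]

def scaled(M, s):
    return [[s * e for e in row] for row in M]

def L1(lam1, wbar):                                   # formula (16)
    neg = [-t for t in wbar]
    return scaled(block(1.0, neg, neg, outer(wbar, wbar)), 0.5 / math.sqrt(lam1))

def L2(lam1, lam2, wbar):                             # formula (17)
    n, P = len(wbar), outer(wbar, wbar)
    k = 4.0 * math.sqrt(lam2) / (math.sqrt(lam2) + math.sqrt(lam1))
    corner = [[k * ((i == j) - P[i][j]) + P[i][j] for j in range(n)] for i in range(n)]
    return scaled(block(1.0, wbar, wbar, corner), 0.5 / math.sqrt(lam2))

def Lz_inv(lam1, lam2, wbar):                         # formula (13)
    n, P = len(wbar), outer(wbar, wbar)
    r1, r2 = math.sqrt(lam1), math.sqrt(lam2)
    a, b, c = 2.0 / (r2 + r1), 0.5 * (1 / r2 + 1 / r1), 0.5 * (1 / r2 - 1 / r1)
    corner = [[a * (i == j) + (b - a) * P[i][j] for j in range(n)] for i in range(n)]
    cw = [c * t for t in wbar]
    return block(b, cw, cw, corner)

def C(w1, wbar):                                      # formula (21)
    n, P = len(wbar), outer(wbar, wbar)
    corner = [[4.0 * (i == j) - 3.0 * P[i][j] for j in range(n)] for i in range(n)]
    return scaled(block(1.0, wbar, wbar, corner), 0.5 / math.sqrt(2.0 * w1))

def max_diff(A, B):
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

wbar = [0.6, 0.8]                                     # a unit vector
lam1, lam2 = 0.5, 3.0                                 # interior case
total = [[p + q for p, q in zip(ra, rb)]
         for ra, rb in zip(L1(lam1, wbar), L2(lam1, lam2, wbar))]
print(max_diff(total, Lz_inv(lam1, lam2, wbar)))      # decomposition of (13)
w1 = 1.5
print(max_diff(L2(0.0, 2.0 * w1, wbar), C(w1, wbar))) # L2 at lam1 = 0 versus C(w)
```

Both printed differences should be at rounding level. Note that L₁(w) itself blows up as λ₁(w) → 0, which is why the proof treats the products L₁(w^k)L_{x^k} separately.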

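Next we focus on the limit of L₁(w^k)L_{x^k} and L₁(w^k)L_{y^k} as k → ∞. By computing,

\[ L_1(w^k) L_{x^k} = \frac{1}{2} \begin{pmatrix} u_1^k & (u_2^k)^T \\ -u_1^k \bar{w}_2^k & -\bar{w}_2^k (u_2^k)^T \end{pmatrix}, \qquad L_1(w^k) L_{y^k} = \frac{1}{2} \begin{pmatrix} v_1^k & (v_2^k)^T \\ -v_1^k \bar{w}_2^k & -\bar{w}_2^k (v_2^k)^T \end{pmatrix}, \]

where

\[ u_1^k = \frac{x_1^k - (x_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad u_2^k = \frac{x_2^k - x_1^k \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad v_1^k = \frac{y_1^k - (y_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad v_2^k = \frac{y_2^k - y_1^k \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}. \]

By Lemma 3.3, |u₁^k| ≤ ‖u₂^k‖ ≤ 1 and |v₁^k| ≤ ‖v₂^k‖ ≤ 1. So, taking the limit (possibly on a subsequence) on L₁(w^k)L_{x^k} and L₁(w^k)L_{y^k}, respectively, gives

\[ \begin{aligned} L_1(w^k) L_{x^k} &\to \frac{1}{2} \begin{pmatrix} u_1 & u_2^T \\ -u_1 \bar{w}_2 & -\bar{w}_2 u_2^T \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \\ L_1(w^k) L_{y^k} &\to \frac{1}{2} \begin{pmatrix} v_1 & v_2^T \\ -v_1 \bar{w}_2 & -\bar{w}_2 v_2^T \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \end{aligned} \tag{22} \]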

for some u = (u₁, u₂), v = (v₁, v₂) ∈ IR × IR^{l−1} satisfying |u₁| ≤ ‖u₂‖ ≤ 1 and |v₁| ≤ ‖v₂‖ ≤ 1. In fact, u and v are accumulation points of the sequences {u^k} and {v^k}, respectively. From equations (19)–(22), we immediately obtain

\[ \phi'_x(x^k, y^k) \to C(w) L_x + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T - I, \qquad \phi'_y(x^k, y^k) \to C(w) L_y + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T - I. \]

This shows that φ′(x^k, y^k) → [V_x − I   V_y − I] as k → ∞ with V_x, V_y satisfying (14).

(c) Assume that (x, y) = (0, 0). Let {(x^k, y^k)} ⊆ D_φ be an arbitrary sequence converging to (x, y). Let w^k = (w₁^k, w₂^k) = w(x^k, y^k) and z^k = z(x^k, y^k) for each k. Since w = 0, we may assume without loss of generality that w₂^k = 0 for all k, or w₂^k ≠ 0 for all k.

Case (1): w₂^k = 0 for all k. From Lemma 3.1, it follows that L_{z^k}^{-1} = (1/√(w₁^k)) I. Therefore,

\[ \phi'_x(x^k, y^k) = \frac{1}{\sqrt{w_1^k}} L_{x^k} - I \quad \text{and} \quad \phi'_y(x^k, y^k) = \frac{1}{\sqrt{w_1^k}} L_{y^k} - I. \]

Since w₁^k = ‖x^k‖² + ‖y^k‖², every element in φ′_x(x^k, y^k) and φ′_y(x^k, y^k) is bounded. Taking the limit (possibly on a subsequence) on φ′_x(x^k, y^k) and φ′_y(x^k, y^k), we obtain

\[ \phi'_x(x^k, y^k) \to L_{\hat{x}} - I \quad \text{and} \quad \phi'_y(x^k, y^k) \to L_{\hat{y}} - I \]

for some vectors x̂, ŷ ∈ IR^l satisfying ‖x̂‖² + ‖ŷ‖² = 1, where x̂ and ŷ are accumulation points of the sequences {x^k/√(w₁^k)} and {y^k/√(w₁^k)}, respectively. Thus, we prove that φ′(x^k, y^k) → [V_x − I   V_y − I] as k → ∞ with V_x ∈ {L_x̂} and V_y ∈ {L_ŷ}.

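Case (2): w₂^k ≠ 0 for all k. Now φ′_x(x^k, y^k) and φ′_y(x^k, y^k) are given as in (19). Using the same arguments as part (b) and noting the boundedness of {w̄₂^k}, we have

\[ L_1(w^k) L_{x^k} \to \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \qquad L_1(w^k) L_{y^k} \to \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \tag{23} \]

for some u = (u₁, u₂), v = (v₁, v₂) ∈ IR × IR^{l−1} satisfying |u₁| ≤ ‖u₂‖ ≤ 1 and |v₁| ≤ ‖v₂‖ ≤ 1, and w̄₂ ∈ IR^{l−1} satisfying ‖w̄₂‖ = 1. We next compute the limit of L₂(w^k)L_{x^k} and L₂(w^k)L_{y^k} as k → ∞. By the definition of L₂(w) in (17),

\[ \begin{aligned} L_2(w^k) L_{x^k} &= \frac{1}{2} \begin{pmatrix} \xi_1^k & (\xi_2^k)^T \\ \xi_1^k \bar{w}_2^k + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) s_2^k & \bar{w}_2^k (\xi_2^k)^T + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) s_1^k \end{pmatrix}, \\ L_2(w^k) L_{y^k} &= \frac{1}{2} \begin{pmatrix} \eta_1^k & (\eta_2^k)^T \\ \eta_1^k \bar{w}_2^k + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) \omega_2^k & \bar{w}_2^k (\eta_2^k)^T + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) \omega_1^k \end{pmatrix}, \end{aligned} \]

where

\[ \xi_1^k = \frac{x_1^k + (x_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad \xi_2^k = \frac{x_2^k + x_1^k \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad \eta_1^k = \frac{y_1^k + (y_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad \eta_2^k = \frac{y_2^k + y_1^k \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \tag{24} \]

and

\[ s_1^k = \frac{x_1^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad s_2^k = \frac{x_2^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad \omega_1^k = \frac{y_1^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad \omega_2^k = \frac{y_2^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}. \tag{25} \]

By Lemma 3.3, |ξ₁^k| ≤ ‖ξ₂^k‖ ≤ 1 and |η₁^k| ≤ ‖η₂^k‖ ≤ 1. In addition,

\[ \|s^k\|^2 + \|\omega^k\|^2 = \frac{\|x^k\|^2 + \|y^k\|^2}{2 \left( \|x^k\|^2 + \|y^k\|^2 \right) + 2 \sqrt{\lambda_1(w^k)} \sqrt{\lambda_2(w^k)}} \le \frac{1}{2}. \]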
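The bound on ‖s^k‖² + ‖ω^k‖² rests on the identity (√λ₁(w) + √λ₂(w))² = 2w₁ + 2√λ₁(w)√λ₂(w), which holds because λ₁(w) + λ₂(w) = 2w₁. The short Python sketch below compares the direct computation from (25) with the closed form on random points; the function name `s_omega_norm_sq`, the seed and the tolerances are hypothetical choices for this illustration.

```python
import math
import random

def s_omega_norm_sq(x, y):
    """Return (direct, closed) values of ||s||^2 + ||omega||^2 for (x, y) != (0, 0)."""
    w1 = sum(t * t for t in x) + sum(t * t for t in y)
    w2 = [2.0 * (x[0] * a + y[0] * b) for a, b in zip(x[1:], y[1:])]
    nw2 = math.sqrt(sum(t * t for t in w2))
    lam1, lam2 = max(w1 - nw2, 0.0), w1 + nw2   # clip rounding noise
    root_sum = math.sqrt(lam2) + math.sqrt(lam1)
    # s = x / root_sum and omega = y / root_sum, as in (25)
    direct = sum((t / root_sum) ** 2 for t in x) + sum((t / root_sum) ** 2 for t in y)
    closed = w1 / (2.0 * w1 + 2.0 * math.sqrt(lam1) * math.sqrt(lam2))
    return direct, closed

random.seed(1)
ok = True
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(3)]
    y = [random.uniform(-1.0, 1.0) for _ in range(3)]
    direct, closed = s_omega_norm_sq(x, y)
    ok = ok and abs(direct - closed) < 1e-9 and direct <= 0.5 + 1e-12
print(ok)
```

Equality with 1/2 is attained exactly when λ₁(w) = 0, that is, when w lies on the boundary of K^l.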

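Hence, taking the limit (possibly on a subsequence) on L₂(w^k)L_{x^k} and L₂(w^k)L_{y^k} yields

\[ \begin{aligned} L_2(w^k) L_{x^k} &\to \frac{1}{2} \begin{pmatrix} \xi_1 & \xi_2^T \\ \xi_1 \bar{w}_2 + 4 (I - \bar{w}_2 \bar{w}_2^T) s_2 & \bar{w}_2 \xi_2^T + 4 (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix}, \\ L_2(w^k) L_{y^k} &\to \frac{1}{2} \begin{pmatrix} \eta_1 & \eta_2^T \\ \eta_1 \bar{w}_2 + 4 (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & \bar{w}_2 \eta_2^T + 4 (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} \end{aligned} \tag{26} \]

for some vectors ξ = (ξ₁, ξ₂), η = (η₁, η₂) ∈ IR × IR^{l−1} satisfying |ξ₁| ≤ ‖ξ₂‖ ≤ 1 and |η₁| ≤ ‖η₂‖ ≤ 1, w̄₂ ∈ IR^{l−1} satisfying ‖w̄₂‖ = 1, and s = (s₁, s₂), ω = (ω₁, ω₂) ∈ IR × IR^{l−1} satisfying ‖s‖² + ‖ω‖² ≤ 1/2. Here, ξ and η are accumulation points of the sequences {ξ^k} and {η^k}, respectively, and s and ω are accumulation points of the sequences {s^k} and {ω^k}, respectively. From (19), (23) and (26), we obtain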

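\[ \begin{aligned} \phi'_x(x^k, y^k) &\to \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} - I, \\ \phi'_y(x^k, y^k) &\to \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T + 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} - I. \end{aligned} \]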