

Applied Mathematics and Optimization, vol. 59, pp. 293-318, 2009

A damped Gauss-Newton method for the second-order cone complementarity problem

Shaohua Pan¹

School of Mathematical Sciences, South China University of Technology

Guangzhou 510640, China

Jein-Shan Chen²

Department of Mathematics, National Taiwan Normal University

Taipei 11677, Taiwan

June 5, 2007

(revised January 18, 2008) (final version June 25, 2008)

Abstract. We investigate some properties related to the generalized Newton method for the Fischer-Burmeister (FB) function over second-order cones, which allows us to reformulate the second-order cone complementarity problem (SOCCP) as a semismooth system of equations. Specifically, we characterize the B-subdifferential of the FB function at a general point and study the condition for every element of the B-subdifferential at a solution being nonsingular. In addition, for the induced FB merit function, we establish its coerciveness and provide a weaker condition than that of [7] for each stationary point to be a solution, under suitable Cartesian P-properties of the involved mapping. By this, a damped Gauss-Newton method is proposed and the global and superlinear convergence results are obtained. Numerical results are reported for the second-order cone programs from the DIMACS library, which verify the good theoretical properties of the method.

Key words: second-order cones; complementarity; Fischer-Burmeister function; B-subdifferential; generalized Newton method.

¹ The author's work is partially supported by the Doctoral Starting-up Foundation (B13B6050640) of GuangDong Province. E-mail: shhpan@scut.edu.cn.

² Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342.


1 Introduction

Consider the following conic complementarity problem of finding $\zeta \in \mathbb{R}^n$ such that

\[
F(\zeta) \in K, \quad G(\zeta) \in K, \quad \langle F(\zeta), G(\zeta) \rangle = 0, \tag{1}
\]

where $\langle \cdot, \cdot \rangle$ represents the Euclidean inner product, $F, G : \mathbb{R}^n \to \mathbb{R}^m$ are mappings assumed to be continuously differentiable throughout this paper, and $K$ is the Cartesian product of second-order cones (SOCs), also called Lorentz cones. In other words,

\[
K = K^{n_1} \times K^{n_2} \times \cdots \times K^{n_q}, \tag{2}
\]

where $q, n_1, \ldots, n_q \ge 1$, $n_1 + \cdots + n_q = m$ and

\[
K^{n_i} := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_1 \ge \|x_2\| \right\}
\]

with $\|\cdot\|$ denoting the Euclidean norm and $K^1$ denoting the set of nonnegative reals $\mathbb{R}_+$. We will refer to (1)–(2) as the second-order cone complementarity problem (SOCCP).

Corresponding to the Cartesian structure of $K$, in the rest of this paper, we always write $F = (F_1, \ldots, F_q)$ and $G = (G_1, \ldots, G_q)$ with $F_i, G_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$.

An important special case of the SOCCP corresponds to $n = m$ and $G(\zeta) = \zeta$ for all $\zeta \in \mathbb{R}^n$. Then (1) and (2) reduce to

\[
F(\zeta) \in K, \quad \zeta \in K, \quad \langle F(\zeta), \zeta \rangle = 0, \tag{3}
\]

which is a natural extension of the nonlinear complementarity problem (NCP) over the nonnegative orthant cone $\mathbb{R}^n_+$. Another special case corresponds to the Karush-Kuhn-Tucker (KKT) conditions for the convex second-order cone program (CSOCP):

\[
\begin{array}{ll}
\min & g(x) \\
\text{s.t.} & Ax = b, \quad x \in K,
\end{array} \tag{4}
\]

where $A \in \mathbb{R}^{p \times m}$ has full row rank, $b \in \mathbb{R}^p$ and $g : \mathbb{R}^m \to \mathbb{R}$ is a twice continuously differentiable convex function. From [7], the KKT conditions of (4), which are sufficient but not necessary for optimality, can be rewritten in the form of (1) with $n = m$ and

\[
F(\zeta) := \hat{x} + \left(I - A^T (AA^T)^{-1} A\right)\zeta, \quad G(\zeta) := \nabla g(F(\zeta)) - A^T (AA^T)^{-1} A \zeta, \tag{5}
\]

where $\hat{x} \in \mathbb{R}^n$ is any vector satisfying $A\hat{x} = b$. When $g$ is a linear function, (4) reduces to the standard second-order cone program, which has extensive applications in engineering design, finance, control, and robust optimization; see [1, 14] and references therein.

There have been many methods proposed for solving SOCPs and SOCCPs. They include the interior-point methods [1, 2, 14, 16, 24, 26], the non-interior smoothing Newton methods [6, 11], the smoothing-regularization method [13], and the merit function approach [7]. Among others, the last three kinds of methods are all based on an SOC complementarity function. Specifically, a mapping $\phi : \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l$ is called an SOC complementarity function associated with the cone $K^l$ ($l \ge 1$) if

\[
\phi(x, y) = 0 \iff x \in K^l, \; y \in K^l, \; \langle x, y \rangle = 0. \tag{6}
\]

A popular choice of $\phi$ is the vector-valued Fischer-Burmeister (FB) function, defined by

\[
\phi(x, y) := (x^2 + y^2)^{1/2} - (x + y) \quad \forall x, y \in \mathbb{R}^l, \tag{7}
\]

where $x^2 = x \circ x$ denotes the Jordan product of $x$ with itself, $x^{1/2}$ denotes a vector such that $(x^{1/2})^2 = x$, and $x + y$ means the usual componentwise addition of vectors. From the next section, we see that $\phi$ in (7) is well-defined for all $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$. The function was shown in [11] to satisfy the equivalence (6), and therefore its squared norm

\[
\psi(x, y) := \frac{1}{2} \|\phi(x, y)\|^2 \tag{8}
\]

is a merit function for the SOCCP, i.e., $\psi(x, y) = 0$ if and only if $x \in K^l$, $y \in K^l$ and $\langle x, y \rangle = 0$. The functions $\phi$ and $\psi$ were studied in the literature [7, 21], where $\psi$ was shown to be continuously differentiable everywhere by Chen and Tseng [7] and $\phi$ was proved to be strongly semismooth by D. Sun and J. Sun [21].
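As a rough numerical sketch of (6)–(8), the FB function can be evaluated with the closed-form square root that follows from the spectral factorization recalled in Section 2. The helper names `jordan_sq`, `soc_sqrt`, `fb`, `merit` and the test vectors are illustrative assumptions, not part of the paper.

```python
import numpy as np

def jordan_sq(x):
    """Jordan square x∘x = (||x||^2, 2*x1*x2) for x = (x1, x2)."""
    return np.concatenate(([x @ x], 2.0 * x[0] * x[1:]))

def soc_sqrt(w):
    """Square root of w in K^l, computed from its spectral values."""
    n2 = np.linalg.norm(w[1:])
    lam1 = max(w[0] - n2, 0.0)   # clip tiny negative rounding errors
    lam2 = w[0] + n2
    # If w2 = 0 the direction is arbitrary; its coefficient is zero anyway.
    wbar = w[1:] / n2 if n2 > 0 else np.zeros_like(w[1:])
    s1, s2 = np.sqrt(lam1), np.sqrt(lam2)
    return np.concatenate(([(s2 + s1) / 2.0], (s2 - s1) / 2.0 * wbar))

def fb(x, y):
    """Vector-valued FB function (7)."""
    return soc_sqrt(jordan_sq(x) + jordan_sq(y)) - (x + y)

def merit(x, y):
    """Merit function (8)."""
    r = fb(x, y)
    return 0.5 * (r @ r)

# x and y lie on the boundary of K^3 and are orthogonal: a complementary pair.
x = np.array([1.0, 1.0, 0.0])
y = np.array([1.0, -1.0, 0.0])
print(fb(x, y), merit(x, y))   # both vanish up to rounding
print(merit(x, x))             # positive, since <x, x> != 0
```

The first pair satisfies the right-hand side of (6), so both quantities vanish; the second pair violates complementarity and the merit value is strictly positive.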

In view of the characterization in (6), clearly, the SOCCP can be reformulated as the following nonsmooth system of equations:

\[
\Phi(\zeta) :=
\begin{pmatrix}
\phi(F_1(\zeta), G_1(\zeta)) \\
\vdots \\
\phi(F_i(\zeta), G_i(\zeta)) \\
\vdots \\
\phi(F_q(\zeta), G_q(\zeta))
\end{pmatrix}
= 0, \tag{9}
\]

where $\phi$ is defined as in (7) with a suitable dimension $l$. By Corollary 3.3 of [21], it is not hard to show that the operator $\Phi : \mathbb{R}^n \to \mathbb{R}^m$ in (9) is semismooth. Furthermore, from Proposition 2 of [7], its squared norm induces a smooth merit function, given by

\[
\Psi(\zeta) := \frac{1}{2} \|\Phi(\zeta)\|^2 = \sum_{i=1}^{q} \psi(F_i(\zeta), G_i(\zeta)). \tag{10}
\]
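The block structure of (9) and (10) can be sketched as follows. The mappings `F` and `G`, the block sizes `dims` and the point `zeta` are hypothetical choices made only for illustration; the closed-form square root anticipates the spectral formula of Section 2.

```python
import numpy as np

def fb(x, y):
    """FB function (7) on a single block, via the spectral square root."""
    w = np.concatenate(([x @ x + y @ y], 2.0 * (x[0] * x[1:] + y[0] * y[1:])))
    n2 = np.linalg.norm(w[1:])
    s1 = np.sqrt(max(w[0] - n2, 0.0))
    s2 = np.sqrt(w[0] + n2)
    wbar = w[1:] / n2 if n2 > 0 else np.zeros_like(w[1:])
    z = np.concatenate(([(s2 + s1) / 2.0], (s2 - s1) / 2.0 * wbar))
    return z - (x + y)

def Phi(zeta, F, G, dims):
    """Stack phi(F_i(zeta), G_i(zeta)) over the blocks of K, as in (9)."""
    Fz, Gz = F(zeta), G(zeta)
    ends = np.cumsum(dims)
    starts = ends - np.array(dims)
    return np.concatenate([fb(Fz[a:b], Gz[a:b]) for a, b in zip(starts, ends)])

def Psi(zeta, F, G, dims):
    """Merit function (10)."""
    r = Phi(zeta, F, G, dims)
    return 0.5 * (r @ r)

# Hypothetical toy instance with K = K^3 x K^1.
dims = (3, 1)
F = lambda z: z
G = lambda z: np.array([z[0], -z[1], -z[2], 0.0])
zeta = np.array([1.0, 1.0, 0.0, 2.0])   # F(zeta), G(zeta) in K and orthogonal
print(Psi(zeta, F, G, dims))            # vanishes up to rounding at a solution
```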

In this paper, we mainly characterize the B-subdifferential of $\phi$ at a general point and present an estimate for the B-subdifferential of $\Phi$. By this, a condition is given to guarantee every element of the B-subdifferential of $\Phi$ at a solution to be nonsingular, which plays an important role in the local convergence analysis of nonsmooth Newton methods for the SOCCP. In addition, two important results are also presented for the merit function $\Psi(\zeta)$. One of them shows that each stationary point of $\Psi$ is a solution of the SOCCP under a weaker condition than the one used by [7], and the other establishes the coerciveness of $\Psi$ for the SOCCP (3) under the uniform Cartesian P-property of $F$. Based on these results, we finally propose a damped Gauss-Newton method by applying the generalized Newton method [19, 20] for the system (9), and analyze its global and superlinear (quadratic) convergence. Numerical results are reported for the SOCPs from the DIMACS library [18], which verify the good theoretical properties of the method.

Throughout this paper, $I$ represents an identity matrix of suitable dimension, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors, and $\mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_q}$ is identified with $\mathbb{R}^{n_1 + \cdots + n_q}$. Thus, $(x_1, \ldots, x_q) \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_q}$ is viewed as a column vector in $\mathbb{R}^{n_1 + \cdots + n_q}$. For any differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^m$, the notation $\nabla F(x) \in \mathbb{R}^{n \times m}$ denotes the transpose of the Jacobian $F'(x)$. For a symmetric matrix $A$, we write $A \succ O$ (respectively, $A \succeq O$) if $A$ is positive definite (respectively, positive semidefinite). Given a finite number of square matrices $Q_1, \ldots, Q_q$, we denote the block diagonal matrix with these matrices as block diagonals by $\operatorname{diag}(Q_1, \ldots, Q_q)$ or by $\operatorname{diag}(Q_i, i = 1, \ldots, q)$. If $J$ and $B$ are index sets such that $J, B \subseteq \{1, 2, \ldots, q\}$, we denote by $P_{JB}$ the block matrix consisting of the sub-matrices $P_{jk} \in \mathbb{R}^{n_j \times n_k}$ of $P$ with $j \in J$, $k \in B$, and denote by $x_B$ a vector consisting of sub-vectors $x_i \in \mathbb{R}^{n_i}$ with $i \in B$.

2 Preliminaries

This section recalls some background materials and preliminary results that will be used in the subsequent sections. We start with the interior and the boundary of $K^l$ ($l > 1$).

It is known that $K^l$ is a closed convex self-dual cone with nonempty interior given by

\[
\operatorname{int}(K^l) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 > \|x_2\| \right\}
\]

and the boundary given by

\[
\operatorname{bd}(K^l) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1} \mid x_1 = \|x_2\| \right\}.
\]

For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, we define their Jordan product [9] as

\[
x \circ y := (\langle x, y \rangle, \; x_1 y_2 + y_1 x_2).
\]
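A minimal NumPy sketch of this product is given below; the function name `jordan` and the test vectors are illustrative choices. It also checks numerically that the product is commutative, that $(1, 0, \ldots, 0)$ acts as an identity, and that associativity fails for this particular triple.

```python
import numpy as np

def jordan(x, y):
    """Jordan product x∘y = (<x, y>, x1*y2 + y1*x2)."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

x = np.array([1.0, 1.0, 0.0])
y = np.array([1.0, 0.0, 1.0])
z = np.array([0.0, 1.0, 1.0])
e = np.array([1.0, 0.0, 0.0])

left = jordan(jordan(x, y), z)    # (x∘y)∘z = (2, 1, 1)
right = jordan(x, jordan(y, z))   # x∘(y∘z) = (2, 2, 1)
print(left, right)
print(np.allclose(jordan(x, y), jordan(y, x)), np.allclose(jordan(x, e), x))
```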

The Jordan product "$\circ$", unlike scalar or matrix multiplication, is not associative, which is the main source of complication in the analysis of the SOCCP. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^l$. For each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, define the matrix $L_x$ by

\[
L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix},
\]

which can be viewed as a linear mapping from $\mathbb{R}^l$ to $\mathbb{R}^l$ with the following properties.

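Property 2.1 (a) $L_x y = x \circ y$ and $L_{x+y} = L_x + L_y$ for any $y \in \mathbb{R}^l$.

(b) $x \in K^l \iff L_x \succeq O$ and $x \in \operatorname{int}(K^l) \iff L_x \succ O$.

(c) $L_x$ is invertible whenever $x \in \operatorname{int}(K^l)$ with the inverse $L_x^{-1}$ given by

\[
L_x^{-1} = \frac{1}{\det(x)}
\begin{bmatrix}
x_1 & -x_2^T \\
-x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{x_2 x_2^T}{x_1}
\end{bmatrix}, \tag{11}
\]

where $\det(x) := x_1^2 - \|x_2\|^2$ denotes the determinant of $x$.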
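Property 2.1 lends itself to a quick numerical check. The sketch below builds $L_x$, evaluates the closed-form inverse (11), and compares against the Jordan product and the identity matrix; the names `arrow` and `arrow_inv` and the sample vectors are assumptions made for illustration.

```python
import numpy as np

def arrow(x):
    """Matrix L_x = [[x1, x2^T], [x2, x1*I]]."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def arrow_inv(x):
    """Closed-form inverse (11); requires x in int(K^l)."""
    x1, x2 = x[0], x[1:]
    det = x1**2 - x2 @ x2
    M = np.empty((x.size, x.size))
    M[0, 0] = x1
    M[0, 1:] = -x2
    M[1:, 0] = -x2
    M[1:, 1:] = (det / x1) * np.eye(x.size - 1) + np.outer(x2, x2) / x1
    return M / det

x = np.array([2.0, 1.0, 0.5])    # interior point: 2 > ||(1, 0.5)||
y = np.array([0.3, -1.0, 2.0])
jordan_xy = np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

print(np.allclose(arrow(x) @ y, jordan_xy))                # part (a)
print(np.linalg.eigvalsh(arrow(x)).min() > 0)              # part (b)
print(np.allclose(arrow(x) @ arrow_inv(x), np.eye(3)))     # part (c)
```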

In the following, we recall from [9, 11] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ admits a spectral factorization, associated with $K^l$, of the form

\[
x = \lambda_1(x) \cdot u_x^{(1)} + \lambda_2(x) \cdot u_x^{(2)},
\]

where $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ are the spectral values and the associated spectral vectors of $x$, respectively, defined by

\[
\lambda_i(x) = x_1 + (-1)^i \|x_2\|, \quad u_x^{(i)} = \frac{1}{2} \left( 1, \; (-1)^i \bar{x}_2 \right), \quad i = 1, 2,
\]

with $\bar{x}_2 = x_2 / \|x_2\|$ if $x_2 \ne 0$ and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{l-1}$ satisfying $\|\bar{x}_2\| = 1$. If $x_2 \ne 0$, the factorization is unique. The spectral factorizations of $x$, $x^2$ and $x^{1/2}$ have various interesting properties, and some of them are summarized as follows.

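Property 2.2 For any $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, let $\lambda_1(x), \lambda_2(x)$ and $u_x^{(1)}, u_x^{(2)}$ be the spectral values and the associated spectral vectors. Then, the following results hold.

(a) $x \in K^l \iff 0 \le \lambda_1(x) \le \lambda_2(x)$ and $x \in \operatorname{int}(K^l) \iff 0 < \lambda_1(x) \le \lambda_2(x)$.

(b) $x^2 = [\lambda_1(x)]^2 u_x^{(1)} + [\lambda_2(x)]^2 u_x^{(2)} \in K^l$ for any $x \in \mathbb{R}^l$.

(c) If $x \in K^l$, then $x^{1/2} = \sqrt{\lambda_1(x)}\, u_x^{(1)} + \sqrt{\lambda_2(x)}\, u_x^{(2)} \in K^l$.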
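The spectral factorization and Property 2.2 (b), (c) can be sketched in a few lines. The function name `spectral`, the sample vectors, and the choice of unit vector when $x_2 = 0$ are assumptions for illustration; the code assumes $l \ge 2$.

```python
import numpy as np

def spectral(x):
    """Return (lam1, lam2, u1, u2) with x = lam1*u1 + lam2*u2 (assumes l >= 2)."""
    n2 = np.linalg.norm(x[1:])
    if n2 > 0:
        xbar = x[1:] / n2
    else:
        xbar = np.zeros_like(x[1:])   # any unit vector is admissible here
        xbar[0] = 1.0
    u1 = 0.5 * np.concatenate(([1.0], -xbar))
    u2 = 0.5 * np.concatenate(([1.0], xbar))
    return x[0] - n2, x[0] + n2, u1, u2

x = np.array([3.0, 0.0, 4.0])                 # not in K^3: lam1 = -1, lam2 = 7
lam1, lam2, u1, u2 = spectral(x)
x_sq = lam1**2 * u1 + lam2**2 * u2            # Property 2.2 (b)
x_sq_direct = np.concatenate(([x @ x], 2.0 * x[0] * x[1:]))   # x∘x

p = np.array([5.0, 3.0, 0.0])                 # in K^3: spectral values 2 and 8
m1, m2, v1, v2 = spectral(p)
p_root = np.sqrt(m1) * v1 + np.sqrt(m2) * v2  # Property 2.2 (c)
p_back = np.concatenate(([p_root @ p_root], 2.0 * p_root[0] * p_root[1:]))

print(np.allclose(lam1 * u1 + lam2 * u2, x))
print(np.allclose(x_sq, x_sq_direct), np.allclose(p_back, p))
```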

Now we recall the concepts of the B-subdifferential and (strong) semismoothness.

Given a mapping $H : \mathbb{R}^n \to \mathbb{R}^m$, if $H$ is locally Lipschitz continuous, then the set

\[
\partial_B H(z) := \left\{ V \in \mathbb{R}^{m \times n} \mid \exists \{z^k\} \subseteq D_H : z^k \to z, \; H'(z^k) \to V \right\}
\]

is nonempty and is called the B-subdifferential of $H$ at $z$, where $D_H \subseteq \mathbb{R}^n$ denotes the set of points at which $H$ is differentiable. The convex hull $\partial H(z) := \operatorname{conv} \partial_B H(z)$ is the generalized Jacobian of Clarke [4]. Semismoothness was originally introduced by Mifflin [15] for functionals. Smooth functions, convex functionals, and piecewise linear functions are examples of semismooth functions. Later, Qi and Sun [19] extended the definition of semismooth functions to a mapping $H : \mathbb{R}^n \to \mathbb{R}^m$. $H$ is called semismooth at $x$ if $H$ is directionally differentiable at $x$ and for all $V \in \partial H(x + h)$ and $h \to 0$,

\[
V h - H'(x; h) = o(\|h\|);
\]

$H$ is called strongly semismooth at $x$ if $H$ is semismooth at $x$ and for all $V \in \partial H(x + h)$ and $h \to 0$,

\[
V h - H'(x; h) = O(\|h\|^2);
\]

$H$ is called (strongly) semismooth if it is (strongly) semismooth everywhere. Here, $o(\|h\|)$ means a function $\alpha : \mathbb{R}^n \to \mathbb{R}^m$ satisfying $\lim_{h \to 0} \alpha(h) / \|h\| = 0$, while $O(\|h\|^2)$ denotes a function $\alpha : \mathbb{R}^n \to \mathbb{R}^m$ satisfying $\|\alpha(h)\| \le C \|h\|^2$ for all $\|h\| \le \delta$ and some $C > 0$, $\delta > 0$.

Next, we present the definitions of Cartesian P-properties for a matrix $M \in \mathbb{R}^{m \times m}$, which are special cases of those introduced by Chen and Qi [5] for a linear transformation.

Definition 2.1 A matrix $M \in \mathbb{R}^{m \times m}$ is said to have

(a) the Cartesian P-property if for any $0 \ne x = (x_1, \ldots, x_q) \in \mathbb{R}^m$ with $x_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that $\langle x_\nu, (Mx)_\nu \rangle > 0$;

(b) the Cartesian $P_0$-property if for any $0 \ne x = (x_1, \ldots, x_q) \in \mathbb{R}^m$ with $x_i \in \mathbb{R}^{n_i}$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that $x_\nu \ne 0$ and $\langle x_\nu, (Mx)_\nu \rangle \ge 0$.

Some nonlinear generalizations of these concepts in the setting of $K$ are defined as follows.

Definition 2.2 Given a mapping $F = (F_1, \ldots, F_q)$ with $F_i : \mathbb{R}^n \to \mathbb{R}^{n_i}$, $F$ is said to

(a) have the uniform Cartesian P-property if for any $x = (x_1, \ldots, x_q), y = (y_1, \ldots, y_q) \in \mathbb{R}^m$, there is an index $\nu \in \{1, 2, \ldots, q\}$ and a positive constant $\rho > 0$ such that

\[
\langle x_\nu - y_\nu, F_\nu(x) - F_\nu(y) \rangle \ge \rho \|x - y\|^2;
\]

(b) have the Cartesian $P_0$-property if for any $x = (x_1, \ldots, x_q), y = (y_1, \ldots, y_q) \in \mathbb{R}^m$ and $x \ne y$, there exists an index $\nu \in \{1, 2, \ldots, q\}$ such that

\[
x_\nu \ne y_\nu \quad \text{and} \quad \langle x_\nu - y_\nu, F_\nu(x) - F_\nu(y) \rangle \ge 0.
\]

From the above definitions, if a continuously differentiable mapping $F : \mathbb{R}^n \to \mathbb{R}^n$ has the uniform Cartesian P-property (Cartesian $P_0$-property), then $\nabla F(x)$ at any $x \in \mathbb{R}^n$ enjoys the Cartesian P-property (Cartesian $P_0$-property). In addition, we may see that, when $n_1 = \cdots = n_q = 1$, the above concepts reduce to the definitions of P-matrices and P-functions, respectively, for the NCP.

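Finally, we introduce some notation which will be used in the rest of this paper. For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$, we define $w, z : \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l$ by

\[
\begin{aligned}
w &= (w_1, w_2) = (w_1(x, y), w_2(x, y)) = w(x, y) := x^2 + y^2, \\
z &= (z_1, z_2) = (z_1(x, y), z_2(x, y)) = z(x, y) := (x^2 + y^2)^{1/2}.
\end{aligned} \tag{12}
\]

Clearly, $w \in K^l$ with $w_1 = \|x\|^2 + \|y\|^2$ and $w_2 = 2(x_1 x_2 + y_1 y_2)$. Let $\bar{w}_2 = w_2 / \|w_2\|$ if $w_2 \ne 0$, and otherwise let $\bar{w}_2$ be any vector in $\mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$. Then, using Property 2.2 (b) and (c), it is not hard to compute that

\[
z = \left( \frac{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}{2}, \; \frac{\sqrt{\lambda_2(w)} - \sqrt{\lambda_1(w)}}{2}\, \bar{w}_2 \right) \in K^l.
\]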

3 B-subdifferential of the FB Function

In this section, we characterize the B-subdifferential of the FB function $\phi$ at a general point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$. For this purpose, we need several important technical lemmas.

The first lemma characterizes the set of points where $z(x, y)$ is differentiable. Since the proof is direct by [3, Proposition 4] and formula (11), we omit it here.

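Lemma 3.1 The function $z(x, y)$ in (12) is continuously differentiable at a point $(x, y)$ if and only if $x^2 + y^2 \in \operatorname{int}(K^l)$. Moreover, $\nabla_x z(x, y) = L_x L_z^{-1}$ and $\nabla_y z(x, y) = L_y L_z^{-1}$, where $L_z^{-1} = (1/\sqrt{w_1})\, I$ if $w_2 = 0$, and otherwise

\[
L_z^{-1} =
\begin{pmatrix}
b & c\, \bar{w}_2^T \\
c\, \bar{w}_2 & a I + (b - a)\, \bar{w}_2 \bar{w}_2^T
\end{pmatrix} \tag{13}
\]

with

\[
a = \frac{2}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}}, \quad
b = \frac{1}{2} \left( \frac{1}{\sqrt{\lambda_2(w)}} + \frac{1}{\sqrt{\lambda_1(w)}} \right), \quad
c = \frac{1}{2} \left( \frac{1}{\sqrt{\lambda_2(w)}} - \frac{1}{\sqrt{\lambda_1(w)}} \right).
\]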
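Lemma 3.1 can be checked numerically at a point of differentiability. The sketch below assembles $L_z^{-1}$ from (13), compares it with a generic matrix inverse, and compares the Jacobian $L_z^{-1} L_x$ (the transpose of $\nabla_x z$) with central finite differences. The function names, the sample point and the step size `h` are illustrative assumptions.

```python
import numpy as np

def arrow(x):
    """Matrix L_x = [[x1, x2^T], [x2, x1*I]]."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def z_and_Lz_inv(x, y):
    """z = (x^2 + y^2)^{1/2} and L_z^{-1} from (13); needs x^2 + y^2 in int(K^l)."""
    w1 = x @ x + y @ y
    w2 = 2.0 * (x[0] * x[1:] + y[0] * y[1:])
    n2 = np.linalg.norm(w2)
    if n2 == 0.0:
        z = np.concatenate(([np.sqrt(w1)], np.zeros(x.size - 1)))
        return z, np.eye(x.size) / np.sqrt(w1)
    s1, s2 = np.sqrt(w1 - n2), np.sqrt(w1 + n2)   # sqrt of lambda_1(w), lambda_2(w)
    wbar = w2 / n2
    a = 2.0 / (s2 + s1)
    b = 0.5 * (1.0 / s2 + 1.0 / s1)
    c = 0.5 * (1.0 / s2 - 1.0 / s1)
    Linv = np.empty((x.size, x.size))
    Linv[0, 0] = b
    Linv[0, 1:] = c * wbar
    Linv[1:, 0] = c * wbar
    Linv[1:, 1:] = a * np.eye(x.size - 1) + (b - a) * np.outer(wbar, wbar)
    z = np.concatenate(([(s2 + s1) / 2.0], (s2 - s1) / 2.0 * wbar))
    return z, Linv

x = np.array([1.0, 0.5, 0.0])
y = np.array([2.0, 0.0, 1.0])     # here x^2 + y^2 = (6.25, 1, 4) is interior
z, Linv = z_and_Lz_inv(x, y)
jac = Linv @ arrow(x)             # Jacobian of z with respect to x

h = 1e-6                          # finite-difference step (assumed)
fd = np.column_stack([
    (z_and_Lz_inv(x + h * e, y)[0] - z_and_Lz_inv(x - h * e, y)[0]) / (2 * h)
    for e in np.eye(x.size)
])
print(np.allclose(Linv, np.linalg.inv(arrow(z))), np.allclose(fd, jac, atol=1e-6))
```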

The following two lemmas extend the results of Lemmas 2 and 3 of [7], respectively. Since the proofs are direct by using the same technique as in [7], we omit them here.


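Lemma 3.2 For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ with $w = x^2 + y^2 \in \operatorname{bd}(K^l)$, we have

\[
x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1 y_1 = x_2^T y_2, \quad x_1 y_2 = y_1 x_2.
\]

If, in addition, $w_2 \ne 0$, then $\|w\|^2 = 2 w_1^2 = 2 \|w_2\|^2 = 4(x_1^2 + y_1^2) \ne 0$ and

\[
x_1 \bar{w}_2 = x_2, \quad x_2^T \bar{w}_2 = x_1, \quad y_1 \bar{w}_2 = y_2, \quad y_2^T \bar{w}_2 = y_1.
\]

Lemma 3.3 For any $x = (x_1, x_2), y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ with $w_2 = 2(x_1 x_2 + y_1 y_2) \ne 0$, there holds that

\[
\left( x_1 + (-1)^i x_2^T \bar{w}_2 \right)^2 \le \left\| x_2 + (-1)^i x_1 \bar{w}_2 \right\|^2 \le \lambda_i(w) \quad \text{for } i = 1, 2.
\]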
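The two inequalities of Lemma 3.3 can be spot-checked on random data. This is only a sanity check under assumed sampling choices (seed, dimensions, Gaussian draws), not a substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, chosen arbitrarily
max_violation = 0.0
for _ in range(1000):
    l = int(rng.integers(2, 6))       # dimension l in {2, ..., 5}
    x, y = rng.standard_normal(l), rng.standard_normal(l)
    w1 = x @ x + y @ y
    w2 = 2.0 * (x[0] * x[1:] + y[0] * y[1:])
    n2 = np.linalg.norm(w2)
    wbar = w2 / n2                    # w2 != 0 almost surely for random draws
    for i in (1, 2):
        sgn = (-1.0) ** i
        lam = w1 + sgn * n2           # spectral value lambda_i(w)
        inner = (x[0] + sgn * (x[1:] @ wbar)) ** 2
        middle = np.linalg.norm(x[1:] + sgn * x[0] * wbar) ** 2
        # Positive values would indicate a violated inequality.
        max_violation = max(max_violation, inner - middle, middle - lam)

print(max_violation)                  # should not exceed rounding level
```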

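Based on Lemmas 3.1–3.3, we are now in a position to present the representation for the elements of the B-subdifferential $\partial_B \phi(x, y)$ at a general point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$.

Proposition 3.1 Given a general point $(x, y) \in \mathbb{R}^l \times \mathbb{R}^l$, each element in $\partial_B \phi(x, y)$ is given by $[V_x - I \;\; V_y - I]$ with $V_x$ and $V_y$ having the following representation:

(a) If $x^2 + y^2 \in \operatorname{int}(K^l)$, then $V_x = L_z^{-1} L_x$ and $V_y = L_z^{-1} L_y$.

(b) If $x^2 + y^2 \in \operatorname{bd}(K^l)$ and $(x, y) \ne (0, 0)$, then

\[
\begin{aligned}
V_x &\in \left\{ \frac{1}{2\sqrt{2 w_1}}
\begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix} L_x
+ \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T \right\}, \\
V_y &\in \left\{ \frac{1}{2\sqrt{2 w_1}}
\begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix} L_y
+ \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \right\}
\end{aligned} \tag{14}
\]

for some $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$, where $\bar{w}_2 = w_2 / \|w_2\|$.

(c) If $(x, y) = (0, 0)$, then $V_x \in \{L_{\hat{x}}\}$, $V_y \in \{L_{\hat{y}}\}$ for some $\hat{x}, \hat{y}$ with $\|\hat{x}\|^2 + \|\hat{y}\|^2 = 1$, or

\[
\begin{aligned}
V_x &\in \left\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T
+ \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T
+ 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} \right\}, \\
V_y &\in \left\{ \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T
+ \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T
+ 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} \right\}
\end{aligned} \tag{15}
\]

for some $u = (u_1, u_2), v = (v_1, v_2), \xi = (\xi_1, \xi_2), \eta = (\eta_1, \eta_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ such that $|u_1| \le \|u_2\| \le 1$, $|v_1| \le \|v_2\| \le 1$, $|\xi_1| \le \|\xi_2\| \le 1$, $|\eta_1| \le \|\eta_2\| \le 1$, $\bar{w}_2 \in \mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$, and $s = (s_1, s_2), \omega = (\omega_1, \omega_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $\|s\|^2 + \|\omega\|^2 \le 1/2$.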
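Part (b) can be illustrated at a simple boundary point. The sketch below takes $x = (1, 1, 0)$, $y = 0$, so that $x^2 + y^2 = (2, 2, 0) \in \operatorname{bd}(K^3)$, approaches it along two hand-picked directions, and compares $L_z^{-1} L_x$ at the nearby differentiable points with the right-hand side of (14). The point, the two directions, the vectors `u` (worked out by hand as the limits for these two sequences) and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def arrow(x):
    """Matrix L_x = [[x1, x2^T], [x2, x1*I]]."""
    L = x[0] * np.eye(x.size)
    L[0, 1:] = x[1:]
    L[1:, 0] = x[1:]
    return L

def Lz_inv_Lx(x, y):
    """L_z^{-1} L_x at a point where x^2 + y^2 lies in int(K^l) and w2 != 0."""
    w1 = x @ x + y @ y
    w2 = 2.0 * (x[0] * x[1:] + y[0] * y[1:])
    n2 = np.linalg.norm(w2)
    s1, s2 = np.sqrt(w1 - n2), np.sqrt(w1 + n2)
    z = np.concatenate(([(s2 + s1) / 2.0], (s2 - s1) / 2.0 * w2 / n2))
    return np.linalg.solve(arrow(z), arrow(x))

def limit_Vx(x, y, u):
    """Right-hand side of (14) for V_x, given a vector u."""
    w1 = x @ x + y @ y
    w2 = 2.0 * (x[0] * x[1:] + y[0] * y[1:])
    wbar = w2 / np.linalg.norm(w2)
    C = np.empty((x.size, x.size))
    C[0, 0] = 1.0
    C[0, 1:] = wbar
    C[1:, 0] = wbar
    C[1:, 1:] = 4.0 * np.eye(x.size - 1) - 3.0 * np.outer(wbar, wbar)
    C /= 2.0 * np.sqrt(2.0 * w1)
    col = np.concatenate(([1.0], -wbar))
    return C @ arrow(x) + 0.5 * np.outer(col, u)

x = np.array([1.0, 1.0, 0.0])
y = np.zeros(3)
d = np.array([1e-4, 0.0, 0.0])   # small perturbation (assumed size)

V_plus = limit_Vx(x, y, np.array([1.0, -1.0, 0.0]))    # approaching along x + d
V_minus = limit_Vx(x, y, np.array([-1.0, 1.0, 0.0]))   # approaching along x - d
print(np.allclose(Lz_inv_Lx(x + d, y), V_plus, atol=1e-3))
print(np.allclose(Lz_inv_Lx(x - d, y), V_minus, atol=1e-3))
```

For this point the two directions lead to different limits (the identity and a permutation matrix), which shows that the set in (14) need not be a singleton.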


Proof. Let $D_\phi$ denote the set of points where $\phi$ is differentiable. Recall that this set is characterized by Lemma 3.1 since $\phi(x, y) = z(x, y) - (x + y)$, and moreover,

\[
\phi'_x(x, y) = L_z^{-1} L_x - I, \quad \phi'_y(x, y) = L_z^{-1} L_y - I \quad \forall (x, y) \in D_\phi.
\]

(a) In this case, $\phi$ is continuously differentiable at $(x, y)$ by Lemma 3.1. Hence, $\partial_B \phi(x, y)$ consists of a single element, i.e. $\phi'(x, y) = [L_z^{-1} L_x - I \;\; L_z^{-1} L_y - I]$, and the result is clear.

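(b) Assume that $(x, y) \ne (0, 0)$ satisfies $x^2 + y^2 \in \operatorname{bd}(K^l)$. Then $w \in \operatorname{bd}(K^l)$ and $w_1 > 0$, which means $\|w_2\| = w_1 > 0$ and $\lambda_2(w) > \lambda_1(w) = 0$. Observe that, when $w_2 \ne 0$, the matrix $L_z^{-1}$ in (13) can be decomposed as the sum of

\[
L_1(w) := \frac{1}{2\sqrt{\lambda_1(w)}}
\begin{pmatrix} 1 & -\bar{w}_2^T \\ -\bar{w}_2 & \bar{w}_2 \bar{w}_2^T \end{pmatrix} \tag{16}
\]

and

\[
L_2(w) := \frac{1}{2\sqrt{\lambda_2(w)}}
\begin{pmatrix}
1 & \bar{w}_2^T \\
\bar{w}_2 & \dfrac{4\sqrt{\lambda_2(w)}}{\sqrt{\lambda_2(w)} + \sqrt{\lambda_1(w)}} (I - \bar{w}_2 \bar{w}_2^T) + \bar{w}_2 \bar{w}_2^T
\end{pmatrix} \tag{17}
\]

with $\bar{w}_2 = w_2 / \|w_2\|$. Consequently, $\phi'_x$ and $\phi'_y$ can be rewritten as

\[
\phi'_x(x, y) = (L_1(w) + L_2(w)) L_x - I, \quad \phi'_y(x, y) = (L_1(w) + L_2(w)) L_y - I. \tag{18}
\]

Let $\{(x^k, y^k)\} \subseteq D_\phi$ be an arbitrary sequence converging to $(x, y)$. Let $w^k = (w_1^k, w_2^k) = w(x^k, y^k)$ and $z^k = z(x^k, y^k)$ for each $k$, where $w(x, y)$ and $z(x, y)$ are given as in (12).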

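Since $w_2 \ne 0$, we may assume without loss of generality that $\|w_2^k\| \ne 0$ for each $k$. Let $\bar{w}_2^k = w_2^k / \|w_2^k\|$ for each $k$. From (18), it follows that

\[
\phi'_x(x^k, y^k) = \left( L_1(w^k) + L_2(w^k) \right) L_{x^k} - I, \quad
\phi'_y(x^k, y^k) = \left( L_1(w^k) + L_2(w^k) \right) L_{y^k} - I. \tag{19}
\]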

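Since $\lim_{k \to \infty} \lambda_1(w^k) = 0$, $\lim_{k \to \infty} \lambda_2(w^k) = 2 w_1 > 0$ and $\lim_{k \to \infty} \bar{w}_2^k = \bar{w}_2$, we have

\[
\lim_{k \to \infty} L_2(w^k) L_{x^k} = C(w) L_x \quad \text{and} \quad \lim_{k \to \infty} L_2(w^k) L_{y^k} = C(w) L_y, \tag{20}
\]

where

\[
C(w) = \frac{1}{2\sqrt{2 w_1}}
\begin{pmatrix} 1 & \bar{w}_2^T \\ \bar{w}_2 & 4I - 3 \bar{w}_2 \bar{w}_2^T \end{pmatrix}. \tag{21}
\]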

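Next we focus on the limit of $L_1(w^k) L_{x^k}$ and $L_1(w^k) L_{y^k}$ as $k \to \infty$. By computing,

\[
L_1(w^k) L_{x^k} = \frac{1}{2}
\begin{pmatrix} u_1^k & (u_2^k)^T \\ -u_1^k \bar{w}_2^k & -\bar{w}_2^k (u_2^k)^T \end{pmatrix}, \quad
L_1(w^k) L_{y^k} = \frac{1}{2}
\begin{pmatrix} v_1^k & (v_2^k)^T \\ -v_1^k \bar{w}_2^k & -\bar{w}_2^k (v_2^k)^T \end{pmatrix},
\]

where

\[
u_1^k = \frac{x_1^k - (x_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad
u_2^k = \frac{x_2^k - x_1^k \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad
v_1^k = \frac{y_1^k - (y_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}, \quad
v_2^k = \frac{y_2^k - y_1^k \bar{w}_2^k}{\sqrt{\lambda_1(w^k)}}.
\]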

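By Lemma 3.3, $|u_1^k| \le \|u_2^k\| \le 1$ and $|v_1^k| \le \|v_2^k\| \le 1$. So, taking the limit (possibly on a subsequence) on $L_1(w^k) L_{x^k}$ and $L_1(w^k) L_{y^k}$, respectively, gives

\[
\begin{aligned}
L_1(w^k) L_{x^k} &\to \frac{1}{2}
\begin{pmatrix} u_1 & u_2^T \\ -u_1 \bar{w}_2 & -\bar{w}_2 u_2^T \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \\
L_1(w^k) L_{y^k} &\to \frac{1}{2}
\begin{pmatrix} v_1 & v_2^T \\ -v_1 \bar{w}_2 & -\bar{w}_2 v_2^T \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T
\end{aligned} \tag{22}
\]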

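for some $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$. In fact, $u$ and $v$ are accumulation points of the sequences $\{u^k\}$ and $\{v^k\}$, respectively. From equations (19)–(22), we immediately obtain

\[
\phi'_x(x^k, y^k) \to C(w) L_x + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T - I, \quad
\phi'_y(x^k, y^k) \to C(w) L_y + \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T - I.
\]

This shows that $\phi'(x^k, y^k) \to [V_x - I \;\; V_y - I]$ as $k \to \infty$ with $V_x, V_y$ satisfying (14).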

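(c) Assume that $(x, y) = (0, 0)$. Let $\{(x^k, y^k)\} \subseteq D_\phi$ be an arbitrary sequence converging to $(x, y)$. Let $w^k = (w_1^k, w_2^k) = w(x^k, y^k)$ and $z^k = z(x^k, y^k)$ for each $k$. Since $w = 0$, we may assume without loss of generality, passing to a subsequence if necessary, that either $w_2^k = 0$ for all $k$ or $w_2^k \ne 0$ for all $k$.

Case (1): $w_2^k = 0$ for all $k$. From Lemma 3.1, it follows that $L_{z^k}^{-1} = (1/\sqrt{w_1^k})\, I$. Therefore,

\[
\phi'_x(x^k, y^k) = \frac{1}{\sqrt{w_1^k}} L_{x^k} - I \quad \text{and} \quad
\phi'_y(x^k, y^k) = \frac{1}{\sqrt{w_1^k}} L_{y^k} - I.
\]

Since $w_1^k = \|x^k\|^2 + \|y^k\|^2$, every element in $\phi'_x(x^k, y^k)$ and $\phi'_y(x^k, y^k)$ is bounded. Taking the limit (possibly on a subsequence) on $\phi'_x(x^k, y^k)$ and $\phi'_y(x^k, y^k)$, we obtain

\[
\phi'_x(x^k, y^k) \to L_{\hat{x}} - I \quad \text{and} \quad \phi'_y(x^k, y^k) \to L_{\hat{y}} - I
\]

for some vectors $\hat{x}, \hat{y} \in \mathbb{R}^l$ satisfying $\|\hat{x}\|^2 + \|\hat{y}\|^2 = 1$, where $\hat{x}$ and $\hat{y}$ are accumulation points of the sequences $\left\{ x^k / \sqrt{w_1^k} \right\}$ and $\left\{ y^k / \sqrt{w_1^k} \right\}$, respectively. Thus, we prove that $\phi'(x^k, y^k) \to [V_x - I \;\; V_y - I]$ as $k \to \infty$ with $V_x \in \{L_{\hat{x}}\}$ and $V_y \in \{L_{\hat{y}}\}$.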

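Case (2): $w_2^k \ne 0$ for all $k$. Now $\phi'_x(x^k, y^k)$ and $\phi'_y(x^k, y^k)$ are given as in (19). Using the same arguments as in part (b) and noting the boundedness of $\{\bar{w}_2^k\}$, we have

\[
L_1(w^k) L_{x^k} \to \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T, \quad
L_1(w^k) L_{y^k} \to \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T \tag{23}
\]

for some $u = (u_1, u_2), v = (v_1, v_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|u_1| \le \|u_2\| \le 1$ and $|v_1| \le \|v_2\| \le 1$, and $\bar{w}_2 \in \mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$. We next compute the limit of $L_2(w^k) L_{x^k}$ and $L_2(w^k) L_{y^k}$ as $k \to \infty$. By the definition of $L_2(w)$ in (17),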

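\[
\begin{aligned}
L_2(w^k) L_{x^k} &= \frac{1}{2}
\begin{pmatrix}
\xi_1^k & (\xi_2^k)^T \\
\xi_1^k \bar{w}_2^k + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) s_2^k &
\bar{w}_2^k (\xi_2^k)^T + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) s_1^k
\end{pmatrix}, \\
L_2(w^k) L_{y^k} &= \frac{1}{2}
\begin{pmatrix}
\eta_1^k & (\eta_2^k)^T \\
\eta_1^k \bar{w}_2^k + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) \omega_2^k &
\bar{w}_2^k (\eta_2^k)^T + 4 \left( I - \bar{w}_2^k (\bar{w}_2^k)^T \right) \omega_1^k
\end{pmatrix},
\end{aligned}
\]

where

\[
\xi_1^k = \frac{x_1^k + (x_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad
\xi_2^k = \frac{x_2^k + x_1^k \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad
\eta_1^k = \frac{y_1^k + (y_2^k)^T \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \quad
\eta_2^k = \frac{y_2^k + y_1^k \bar{w}_2^k}{\sqrt{\lambda_2(w^k)}}, \tag{24}
\]

and

\[
s_1^k = \frac{x_1^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad
s_2^k = \frac{x_2^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad
\omega_1^k = \frac{y_1^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}, \quad
\omega_2^k = \frac{y_2^k}{\sqrt{\lambda_2(w^k)} + \sqrt{\lambda_1(w^k)}}. \tag{25}
\]

By Lemma 3.3, $|\xi_1^k| \le \|\xi_2^k\| \le 1$ and $|\eta_1^k| \le \|\eta_2^k\| \le 1$. In addition,

\[
\|s^k\|^2 + \|\omega^k\|^2 = \frac{\|x^k\|^2 + \|y^k\|^2}{2 \left( \|x^k\|^2 + \|y^k\|^2 \right) + 2 \sqrt{\lambda_1(w^k)} \sqrt{\lambda_2(w^k)}} \le \frac{1}{2}.
\]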

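Hence, taking the limit (possibly on a subsequence) on $L_2(w^k) L_{x^k}$ and $L_2(w^k) L_{y^k}$ yields

\[
\begin{aligned}
L_2(w^k) L_{x^k} &\to \frac{1}{2}
\begin{pmatrix}
\xi_1 & \xi_2^T \\
\xi_1 \bar{w}_2 + 4 (I - \bar{w}_2 \bar{w}_2^T) s_2 & \bar{w}_2 \xi_2^T + 4 (I - \bar{w}_2 \bar{w}_2^T) s_1
\end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T
+ 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix}, \\
L_2(w^k) L_{y^k} &\to \frac{1}{2}
\begin{pmatrix}
\eta_1 & \eta_2^T \\
\eta_1 \bar{w}_2 + 4 (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & \bar{w}_2 \eta_2^T + 4 (I - \bar{w}_2 \bar{w}_2^T) \omega_1
\end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T
+ 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix}
\end{aligned} \tag{26}
\]

for some vectors $\xi = (\xi_1, \xi_2), \eta = (\eta_1, \eta_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $|\xi_1| \le \|\xi_2\| \le 1$ and $|\eta_1| \le \|\eta_2\| \le 1$, $\bar{w}_2 \in \mathbb{R}^{l-1}$ satisfying $\|\bar{w}_2\| = 1$, and $s = (s_1, s_2), \omega = (\omega_1, \omega_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ satisfying $\|s\|^2 + \|\omega\|^2 \le 1/2$. Here $\xi$ and $\eta$ are accumulation points of the sequences $\{\xi^k\}$ and $\{\eta^k\}$, respectively, and $s$ and $\omega$ are accumulation points of the sequences $\{s^k\}$ and $\{\omega^k\}$, respectively. From (19), (23) and (26), we obtain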

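\[
\begin{aligned}
\phi'_x(x^k, y^k) &\to \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \xi^T
+ \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} u^T
+ 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) s_2 & (I - \bar{w}_2 \bar{w}_2^T) s_1 \end{pmatrix} - I, \\
\phi'_y(x^k, y^k) &\to \frac{1}{2} \begin{pmatrix} 1 \\ \bar{w}_2 \end{pmatrix} \eta^T
+ \frac{1}{2} \begin{pmatrix} 1 \\ -\bar{w}_2 \end{pmatrix} v^T
+ 2 \begin{pmatrix} 0 & 0 \\ (I - \bar{w}_2 \bar{w}_2^T) \omega_2 & (I - \bar{w}_2 \bar{w}_2^T) \omega_1 \end{pmatrix} - I.
\end{aligned}
\]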
