
DOI 10.1007/s10898-006-9027-y ORIGINAL ARTICLE

The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem

Jein-Shan Chen

Received: 13 March 2006 / Accepted: 20 March 2006 / Published online: 14 June 2006

© Springer Science+Business Media B.V. 2006

Abstract This paper is a follow-up of the work [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted for publication (2004)], where an NCP-function and a descent method were proposed for the nonlinear complementarity problem. There, an unconstrained reformulation was obtained from a merit function based on the proposed NCP-function. We continue to explore properties of the merit function in this paper. In particular, we show that the gradient of the merit function is globally Lipschitz continuous, which is important from a computational point of view. Moreover, we show that the merit function is an SC$^1$ function, which means it is continuously differentiable and its gradient is semismooth. We also provide an alternative proof, which uses the new properties of the merit function, for the convergence result of the descent method considered in [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted for publication (2004)].

Keywords Complementarity · SC$^1$ function · Merit function · Semismooth function · Descent method

1 Introduction

In the past decades, the well-known nonlinear complementarity problem (NCP) has attracted much attention due to its various applications in operations research, economics, and engineering [6, 11, 17]. The NCP is to find a point $x \in \mathbb{R}^n$ such that

\[ x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \tag{1} \]

where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product and $F = (F_1, F_2, \ldots, F_n)^T$ maps from $\mathbb{R}^n$ to $\mathbb{R}^n$. We assume that $F$ is continuously differentiable throughout this paper.

There have been many methods proposed for solving the NCP [9, 11, 17]. Among them, one of the most popular approaches, which has been studied intensively in recent years, is to reformulate the NCP as an unconstrained minimization problem [5, 7, 10, 13, 14, 28]. A function that yields such an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP.

J.-S. Chen (✉), Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan. e-mail: jschen@math.ntnu.edu.tw

For constructing a merit function, the class of so-called NCP-functions, defined below, plays an important role.

Definition 1.1 A function $\phi : \mathbb{R}^2 \to \mathbb{R}$ is called an NCP-function if it satisfies

\[ \phi(a, b) = 0 \iff a \ge 0, \ b \ge 0, \ ab = 0. \tag{2} \]

A popular NCP-function intensively studied recently is the well-known Fischer–Burmeister NCP-function [7, 8, 24] defined as

\[ \phi(a, b) = \sqrt{a^2 + b^2} - (a + b). \tag{3} \]

Let $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ be

\[ \Phi(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}. \tag{4} \]

Then the function $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ defined by

\[ \Psi(x) := \frac{1}{2} \|\Phi(x)\|^2 = \frac{1}{2} \sum_{i=1}^{n} \phi(x_i, F_i(x))^2 \tag{5} \]

is a merit function for the NCP, i.e., the NCP can be recast as an unconstrained minimization:

\[ \min_{x \in \mathbb{R}^n} \Psi(x). \tag{6} \]

In the paper [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)], an NCP-function which is an extension of the Fischer–Burmeister function (3) was studied. More specifically, the author defines $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ by

\[ \phi_p(a, b) := \|(a, b)\|_p - (a + b), \tag{7} \]

where $\|(a, b)\|_p$ denotes the $p$-norm of $(a, b)$, i.e., $\|(a, b)\|_p = \sqrt[p]{|a|^p + |b|^p}$. In other words, in the function $\phi_p$, the 2-norm of $(a, b)$ in the Fischer–Burmeister function (3) is replaced by the more general $p$-norm of $(a, b)$ with $p \ge 2$. This function $\phi_p$ is still an NCP-function, as was noted in Tseng's paper [26]. Nonetheless, there was no further study on this NCP-function, even for $p = 3$, until the recent paper [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)] by the author. Following the function $\phi_p$, we can further define $\psi_p : \mathbb{R}^2 \to \mathbb{R}_+$ by

\[ \psi_p(a, b) := \frac{1}{2} |\phi_p(a, b)|^2. \tag{8} \]

The function $\psi_p$ is a nonnegative NCP-function and is smooth on $\mathbb{R}^2$ with some favorable properties; see [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)]. In this paper, we continue to explore properties of $\psi_p$, as will be seen in Sect. 3. Analogous to $\Phi$, the function $\Phi_p : \mathbb{R}^n \to \mathbb{R}^n$ given by

\[ \Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix} \tag{9} \]

yields a merit function $\Psi_p : \mathbb{R}^n \to \mathbb{R}_+$ for the NCP, where

\[ \Psi_p(x) := \frac{1}{2} \|\Phi_p(x)\|^2 = \frac{1}{2} \sum_{i=1}^{n} \phi_p(x_i, F_i(x))^2 = \sum_{i=1}^{n} \psi_p(x_i, F_i(x)). \tag{10} \]

As shown in [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)], $\Psi_p$ is a continuously differentiable merit function for the NCP. Therefore, classical iterative methods such as Newton's method can be applied to the unconstrained smooth minimization reformulation of the NCP, i.e.,

\[ \min_{x \in \mathbb{R}^n} \Psi_p(x). \tag{11} \]
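As an illustration of definitions (7), (8) and (10), the following is a minimal Python sketch. The function names (`phi_p`, `psi_p`, `merit`) and the toy mapping `toy_F` are assumptions made here for demonstration; they do not come from the paper.

```python
def phi_p(a, b, p=3):
    """Generalized Fischer-Burmeister function (7): ||(a, b)||_p - (a + b)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def psi_p(a, b, p=3):
    """Squared NCP-function (8): 0.5 * phi_p(a, b)^2."""
    return 0.5 * phi_p(a, b, p) ** 2


def merit(x, F, p=3):
    """Merit function (10): sum of psi_p(x_i, F_i(x)) over all components."""
    return sum(psi_p(xi, fi, p) for xi, fi in zip(x, F(x)))


def toy_F(x):
    """Hypothetical test mapping F(x) = x + q with q = (-1, 2)."""
    return [x[0] - 1.0, x[1] + 2.0]


# x = (1, 0) solves this toy NCP: x >= 0, F(x) = (0, 2) >= 0, <x, F(x)> = 0.
print(merit([1.0, 0.0], toy_F))  # numerically zero at the solution
print(merit([0.0, 0.0], toy_F))  # strictly positive at a non-solution
```

The merit value vanishes (up to rounding) exactly at points satisfying (1), which is what makes (11) an equivalent reformulation.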

On the other hand, derivative-free methods, which do not require computation of derivatives of $F$, have also attracted much attention [10, 13, 27]. Derivative-free methods, taking advantage of particular properties of a merit function, are suitable for problems where the derivatives of $F$ are not available or are expensive to compute. In this paper, we also study a derivative-free descent algorithm for solving the NCP based on the merit function $\Psi_p$ in Sect. 4. This descent method was already considered in [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)]; here we apply the new properties of $\psi_p$ explored in this paper to provide an alternative proof for the convergence result.

Throughout this paper, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors and $^T$ denotes transpose. For any differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For any differentiable mapping $F = (F_1, \ldots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \cdots \nabla F_m(x)]$ denotes the transpose Jacobian of $F$ at $x$. We write $z = o(\alpha)$ with $\alpha \in \mathbb{R}$ and $z \in \mathbb{R}^n$ to mean that $\|z\|/|\alpha|$ tends to zero as $\alpha \to 0$. Also, we denote by $\|x\|_p$ the $p$-norm of $x$ and by $\|x\|$ the Euclidean norm of $x$. In the whole paper, we assume $p$ is an integer greater than or equal to 2.

2 Preliminaries

In this section, we recall some background concepts and review some known materials which are crucial to the subsequent analysis. We begin with the monotonicity of a mapping. Let $F : \mathbb{R}^n \to \mathbb{R}^n$. Then $F$ is monotone if $\langle x - y, F(x) - F(y) \rangle \ge 0$ for all $x, y \in \mathbb{R}^n$; $F$ is strictly monotone if $\langle x - y, F(x) - F(y) \rangle > 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$; and $F$ is strongly monotone with modulus $\mu > 0$ if $\langle x - y, F(x) - F(y) \rangle \ge \mu \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

Next, we recall the so-called semismooth functions. First, we say that $F$ is strictly continuous (also called 'locally Lipschitz continuous') at $x \in \mathbb{R}^n$ [23, Chap. 9] if there exist scalars $\kappa > 0$ and $\delta > 0$ such that

\[ \|F(y) - F(z)\| \le \kappa \|y - z\| \quad \forall y, z \in \mathbb{R}^n \text{ with } \|y - x\| \le \delta, \ \|z - x\| \le \delta; \]

and $F$ is strictly continuous if $F$ is strictly continuous at every $x \in \mathbb{R}^n$. If $\delta$ can be taken to be $\infty$, then $F$ is Lipschitz continuous with Lipschitz constant $\kappa$. Define the function $\mathrm{lip} F : \mathbb{R}^n \to [0, \infty]$ by

\[ \mathrm{lip} F(x) := \limsup_{\substack{y, z \to x \\ y \ne z}} \frac{\|F(y) - F(z)\|}{\|y - z\|}. \]

Then $F$ is strictly continuous at $x$ if and only if $\mathrm{lip} F(x)$ is finite. We say $F$ is directionally differentiable at $x \in \mathbb{R}^n$ if

\[ F'(x; h) := \lim_{t \to 0^+} \frac{F(x + th) - F(x)}{t} \quad \text{exists } \forall h \in \mathbb{R}^n; \]

and $F$ is directionally differentiable if $F$ is directionally differentiable at every $x \in \mathbb{R}^n$. $F$ is differentiable (in the Fréchet sense) at $x \in \mathbb{R}^n$ if there exists a linear mapping $\nabla F(x) : \mathbb{R}^n \to \mathbb{R}^n$ such that

\[ F(x + h) - F(x) - \nabla F(x) h = o(\|h\|). \]

We say that $F$ is continuously differentiable if $F$ is differentiable at every $x \in \mathbb{R}^n$ and $\nabla F$ is continuous.

If $F$ is strictly continuous, then $F$ is almost everywhere differentiable by Rademacher's Theorem; see [3] and [23, Sect. 9J]. In this case, the generalized Jacobian $\partial F(x)$ of $F$ at $x$ (in the Clarke sense) can be defined as the convex hull of the generalized Jacobian $\partial_B F(x)$, where

\[ \partial_B F(x) := \left\{ \lim_{x^j \to x} \nabla F(x^j) \ \Big| \ F \text{ is differentiable at } x^j \in \mathbb{R}^n \right\}. \]

The notation $\partial_B$ is adopted from [19]. In [23, Chap. 9], the case of $n = 1$ is considered and the notations "$\bar{\nabla}$" and "$\bar{\partial}$" are used instead of, respectively, "$\partial_B$" and "$\partial$".

Assume $F : \mathbb{R}^n \to \mathbb{R}^n$ is strictly continuous. We say $F$ is semismooth at $x$ if $F$ is directionally differentiable at $x$ and, for any $V \in \partial F(x + h)$, we have

\[ F(x + h) - F(x) - Vh = o(\|h\|). \]

We say $F$ is $\rho$-order semismooth at $x$ ($0 < \rho < \infty$) if $F$ is semismooth at $x$ and, for any $V \in \partial F(x + h)$, we have

\[ F(x + h) - F(x) - Vh = O(\|h\|^{1+\rho}). \]

The following lemma, proven by Sun and Sun [25, Thm. 3.6] using the definition of the generalized Jacobian (Sun and Sun did not consider the case of $o(\|h\|)$, but their argument readily applies to this case), enables one to study the semismooth property of $F$ by examining only those points $x \in \mathbb{R}^n$ where $F$ is differentiable, and thus to work only with the Jacobian of $F$ rather than the generalized Jacobian.

Lemma 2.1 Suppose $F : \mathbb{R}^n \to \mathbb{R}^n$ is strictly continuous and directionally differentiable in a neighborhood of $x \in \mathbb{R}^n$. Then, for any $0 < \rho < \infty$, the following two statements (where $O(\cdot)$ depends on $F$ and $x$ only) are equivalent:

(a) For any $h \in \mathbb{R}^n$ and any $V \in \partial F(x + h)$,

\[ F(x + h) - F(x) - Vh = o(\|h\|) \quad (\text{respectively, } O(\|h\|^{1+\rho})). \]

(b) For any $h \in \mathbb{R}^n$ such that $F$ is differentiable at $x + h$,

\[ F(x + h) - F(x) - \nabla F(x + h) h = o(\|h\|) \quad (\text{respectively, } O(\|h\|^{1+\rho})). \]


We say $F$ is semismooth (respectively, $\rho$-order semismooth) if $F$ is semismooth (respectively, $\rho$-order semismooth) at every $x \in \mathbb{R}^n$. We say $F$ is strongly semismooth if it is 1-order semismooth. Convex functions and piecewise continuously differentiable functions are examples of semismooth functions. The composition of two (respectively, $\rho$-order) semismooth functions is also a (respectively, $\rho$-order) semismooth function. The property of semismoothness plays an important role in nonsmooth Newton methods [19, 21] as well as in some smoothing methods. For extensive discussions of semismooth functions, see [8, 15, 21].

Now, we review some useful properties of $\phi_p$ and $\psi_p$, defined as in (7) and (8), respectively, which will be used in the analysis of the subsequent sections. We notice that the function $\phi_p$ reduces to the Fischer–Burmeister function given in (3) when $p = 2$. Thus, most of these properties are extensions of properties of the Fischer–Burmeister function. For detailed proofs, please refer to [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)].

Lemma 2.2 ([Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004), Prop. 3.1]) Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be defined as (7) where $p \ge 2$. Then

(a) $\phi_p$ is an NCP-function, i.e., it satisfies (2).

(b) $\phi_p$ is sub-additive, i.e., $\phi_p(w + w') \le \phi_p(w) + \phi_p(w')$ for all $w, w' \in \mathbb{R}^2$.

(c) $\phi_p$ is positively homogeneous, i.e., $\phi_p(\alpha w) = \alpha \phi_p(w)$ for all $w \in \mathbb{R}^2$ and $\alpha \ge 0$.

(d) $\phi_p$ is convex, i.e., $\phi_p(\alpha w + (1 - \alpha) w') \le \alpha \phi_p(w) + (1 - \alpha) \phi_p(w')$ for all $w, w' \in \mathbb{R}^2$ and $\alpha \in [0, 1]$.

(e) $\phi_p$ is Lipschitz continuous with $L_1 = 1 + \sqrt{2}$, i.e., $|\phi_p(w) - \phi_p(w')| \le L_1 \|w - w'\|$; or with $L_2 = 1 + 2^{(1 - 1/p)}$, i.e., $|\phi_p(w) - \phi_p(w')| \le L_2 \|w - w'\|_p$ for all $w, w' \in \mathbb{R}^2$.

Lemma 2.2(b) and (c) imply that $\phi_p$ is sublinear, i.e., it satisfies

\[ \phi_p(\alpha w + \beta w') \le \alpha \phi_p(w) + \beta \phi_p(w') \]

for all $w, w' \in \mathbb{R}^2$ and $\alpha, \beta \ge 0$. This can be seen from the fact [1, Prop. 3.11] that a function from $\mathbb{R}^n$ to $\mathbb{R}$ is sublinear if and only if it is positively homogeneous and sub-additive. Note that the sublinear condition is stronger than convexity. In fact, under Lemma 2.2(c), Lemma 2.2(b) is equivalent to Lemma 2.2(d). This follows from [22, Thm. 4.7], which states that a positively homogeneous function is convex if and only if it is sub-additive.

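Lemma 2.3 ([Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004), Prop. 3.2]) Let $\phi_p$, $\psi_p$ be defined as (7) and (8), respectively, where $p \ge 2$. Then

(a) $\psi_p$ is an NCP-function, i.e., it satisfies (2).

(b) $\psi_p(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$.

(c) $\psi_p$ is continuously differentiable everywhere. Moreover, $\nabla_a \psi_p(0, 0) = \nabla_b \psi_p(0, 0) = 0$ and

\[
\begin{aligned}
\nabla_a \psi_p(a, b) &= \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b), \\
\nabla_b \psi_p(a, b) &= \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b),
\end{aligned}
\tag{12}
\]

for $(a, b) \ne (0, 0)$ when $p$ is even, whereas

\[
\begin{aligned}
\nabla_a \psi_p(a, b) &= \left( \frac{\mathrm{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b), \\
\nabla_b \psi_p(a, b) &= \left( \frac{\mathrm{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b),
\end{aligned}
\tag{13}
\]

for $(a, b) \ne (0, 0)$ when $p$ is odd.

(d) $\nabla_a \psi_p(a, b) \cdot \nabla_b \psi_p(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$. The equality holds if and only if $\phi_p(a, b) = 0$.

(e) $\nabla_a \psi_p(a, b) = 0 \iff \nabla_b \psi_p(a, b) = 0 \iff \phi_p(a, b) = 0$.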
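The gradient formulas (12) and (13) can be checked numerically. The sketch below is only an illustration under our own assumptions: the names `grad_psi_p` and `fd_grad`, the sample points, and the step size are not from the paper. Both parity cases are written in the unified form $\mathrm{sgn}(a)|a|^{p-1}$, which equals $a^{p-1}$ for even $p$ and $\mathrm{sgn}(a) a^{p-1}$ for odd $p$.

```python
import math


def psi_p(a, b, p):
    """psi_p from (8): 0.5 * (||(a, b)||_p - (a + b))^2."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.5 * (norm - (a + b)) ** 2


def grad_psi_p(a, b, p):
    """Gradient of psi_p from (12)/(13); returns (0, 0) at the origin."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    if norm == 0.0:
        return (0.0, 0.0)
    phi = norm - (a + b)
    # copysign(|a|^(p-1), a) is sgn(a)|a|^(p-1), valid for even and odd p.
    ga = (math.copysign(abs(a) ** (p - 1), a) / norm ** (p - 1) - 1.0) * phi
    gb = (math.copysign(abs(b) ** (p - 1), b) / norm ** (p - 1) - 1.0) * phi
    return (ga, gb)


def fd_grad(a, b, p, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    da = (psi_p(a + h, b, p) - psi_p(a - h, b, p)) / (2.0 * h)
    db = (psi_p(a, b + h, p) - psi_p(a, b - h, p)) / (2.0 * h)
    return (da, db)


points = [(1.5, -0.7), (-2.0, 3.0), (-1.0, -1.0), (0.3, 0.0)]
for p in (2, 3, 4):
    for a, b in points:
        ga, gb = grad_psi_p(a, b, p)
        fa, fb = fd_grad(a, b, p)
        print(p, (a, b), abs(ga - fa), abs(gb - fb), ga * gb >= -1e-12)
```

The last printed column corresponds to Lemma 2.3(d), checked with a small tolerance to absorb rounding.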

Lemma 2.4 ([Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004), Prop. 3.5]) Let $\Psi_p : \mathbb{R}^n \to \mathbb{R}$ be defined as (10) where $p \ge 2$. Assume $F$ is either strongly monotone or a uniform $P$-function. Then the level sets $\mathcal{L}(\Psi_p, \gamma)$ are bounded for all $\gamma \in \mathbb{R}$.

In addition to the above properties of $\phi_p$ and $\psi_p$, we still need the following two lemmas for the analysis in the subsequent sections.

Lemma 2.5 ([12, (1.3)]) Let $x \in \mathbb{R}^n$ and $1 < p_1 < p_2$. Then

\[ \|x\|_{p_2} \le \|x\|_{p_1} \le n^{(1/p_1 - 1/p_2)} \|x\|_{p_2}. \]
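As a quick sanity check of Lemma 2.5, consider the following worked instance; the particular vector and exponents are chosen here purely for illustration and do not appear in the paper. Take $n = 2$, $x = (1, 1)$, $p_1 = 2$ and $p_2 = 4$:

```latex
\[
\|x\|_4 = (1^4 + 1^4)^{1/4} = 2^{1/4}, \qquad
\|x\|_2 = (1^2 + 1^2)^{1/2} = 2^{1/2}, \qquad
n^{(1/p_1 - 1/p_2)} \|x\|_4 = 2^{1/4} \cdot 2^{1/4} = 2^{1/2},
\]
\[
\text{so that} \quad 2^{1/4} \le 2^{1/2} \le 2^{1/2}.
\]
```

Here the right-hand inequality holds with equality, while for $x = (1, 0)$ all three quantities in the left-hand inequality equal 1, so neither constant can be improved in general.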

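Lemma 2.6 If $F : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ has a second derivative at each point of a convex set $D_0 \subseteq D$, then

\[ \|\nabla F(y) - \nabla F(x)\| \le \sup_{0 \le t \le 1} \|\nabla^2 F(x + t(y - x))\| \cdot \|y - x\|. \]

Proof This is Theorem 3.3.5 of [16] (page 78). □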

3 The semismooth-related properties of the NCP and merit functions

In this section, we study some semismooth-related properties of $\phi_p$, including the semismooth and almost smooth properties, as well as the SC$^1$ and LC$^1$ properties of $\psi_p$. The semismooth property is very important from the computational point of view. In particular, it plays a fundamental role in the superlinear convergence analysis of generalized Newton methods; see [19, 21, 29]. The classes of SC$^1$ and LC$^1$ functions have been a subject of interest in relation to the development of minimization algorithms. We will introduce their definitions later. We begin this section by showing that the functions $\phi_p$ and $\Phi_p$ are semismooth (in fact, they are strongly semismooth, as shown in Corollary 3.1). The proof is easy and routine.

Proposition 3.1 The function $\Phi_p : \mathbb{R}^n \to \mathbb{R}^n$ defined as (9) is semismooth.

Proof We notice that $\phi_p$ is convex by Lemma 2.2(d), and hence is a semismooth function. We also observe that each component of $\Phi_p(x)$ is the composite of the convex function $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ and the differentiable function $(x_i, F_i(x))^T : \mathbb{R}^n \to \mathbb{R}^2$. Since convex and differentiable functions are semismooth and the composition of semismooth functions is semismooth, it follows that $\Phi_p$ is semismooth. □

An important concept in relation to semismooth functions is the SC$^1$ function, so we next introduce its definition.


Definition 3.1 A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be an SC$^1$ function if $f$ is continuously differentiable and its gradient is semismooth.

We can view SC$^1$ functions as functions lying between $C^1$ and $C^2$ functions. With the notion of SC$^1$ functions, many results regarding the minimization of $C^2$ functions can be extended to the minimization of SC$^1$ functions; see [18] and references therein. For applications and more details of SC$^1$ functions, please refer to the excellent book [4]. Prop. 3.2 shows that $\psi_p$ is an SC$^1$ function; hence, if every $F_i$ is an SC$^1$ function, then so is $\Psi_p$. Before presenting its proof, we need a crucial technical lemma, which states that $\nabla \psi_p$ is globally Lipschitz continuous. The lemma will be used not only in the proof of Prop. 3.2 but also in the convergence analysis of the descent algorithm in Sect. 4.

Lemma 3.1 The gradient of the function $\psi_p$ defined as (8) is Lipschitz continuous, that is, there exists $L > 0$ such that

\[ \|\nabla \psi_p(a, b) - \nabla \psi_p(c, d)\| \le L \|(a, b) - (c, d)\|, \tag{14} \]

for all $(a, b), (c, d) \in \mathbb{R}^2$.

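Proof Following the gradient of $\psi_p$ given in (12) and (13) and then applying the chain rule and quotient rule (the computation is routine though tedious, so we omit the details), we have the following two cases.

If $p$ is even and $(a, b) \ne (0, 0)$, then

\[
\begin{aligned}
\nabla^2_{aa} \psi_p(a, b) &= \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right)^2 + \frac{(p - 1) a^{p-2} b^p}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p - (a + b) \Big), \\
\nabla^2_{ab} \psi_p(a, b) = \nabla^2_{ba} \psi_p(a, b) &= \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \frac{(p - 1) a^{p-1} b^{p-1}}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p - (a + b) \Big), \\
\nabla^2_{bb} \psi_p(a, b) &= \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right)^2 + \frac{(p - 1) a^p b^{p-2}}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p - (a + b) \Big).
\end{aligned}
\]

It is clear that $\dfrac{|a|^{p-1}}{\|(a, b)\|_p^{p-1}} \le 1$, and it also follows from

\[ |a|^{p-2} \cdot |b|^p \le \Big( \max\{|a|, |b|\} \Big)^{2p-2} \le \Big( \sqrt[p]{|a|^p + |b|^p} \Big)^{2p-2} \le \|(a, b)\|_p^{2p-2} \]

that

\[ \frac{|a|^{p-2} |b|^p}{\|(a, b)\|_p^{2p-2}} \le 1. \quad \text{Similarly,} \quad \frac{|a|^p |b|^{p-2}}{\|(a, b)\|_p^{2p-2}} \le 1. \tag{15} \]

On the other hand, by Lemma 2.5, we have

\[ |a| + |b| \le \sqrt{2} \sqrt{a^2 + b^2} = \sqrt{2} \|(a, b)\|_2 \le \sqrt{2} \cdot 2^{(1/2 - 1/p)} \|(a, b)\|_p = 2^{(1 - 1/p)} \|(a, b)\|_p. \]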

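Applying all the above, we can give an upper bound for $\nabla^2_{aa} \psi_p(a, b)$ as below.

\[
\begin{aligned}
\left| \nabla^2_{aa} \psi_p(a, b) \right|
&\le \left( \frac{|a|^{p-1}}{\|(a, b)\|_p^{p-1}} + 1 \right)^2 + \frac{(p - 1) |a|^{p-2} |b|^p}{\|(a, b)\|_p^{2p-2}} + \frac{(p - 1) |a|^{p-2} |b|^p \cdot (|a| + |b|)}{\|(a, b)\|_p^{2p-1}} \\
&\le 4 + (p - 1) + \frac{(p - 1) |a|^{p-2} |b|^p \cdot 2^{(1 - 1/p)} \|(a, b)\|_p}{\|(a, b)\|_p^{2p-1}} \\
&\le 4 + (p - 1) + (p - 1) 2^{(1 - 1/p)} \\
&= 4 + (p - 1) \left( 1 + 2^{(1 - 1/p)} \right),
\end{aligned}
\]

where the last inequality holds due to (15). By the same arguments, we also have

\[ \left| \nabla^2_{bb} \psi_p(a, b) \right| \le 4 + (p - 1) \left( 1 + 2^{(1 - 1/p)} \right). \]

Now, we estimate the upper bound for $\nabla^2_{ab} \psi_p(a, b) = \nabla^2_{ba} \psi_p(a, b)$ as below.

\[
\begin{aligned}
\left| \nabla^2_{ab} \psi_p(a, b) \right| = \left| \nabla^2_{ba} \psi_p(a, b) \right|
&\le \left| \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right| \cdot \left| \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right| + \frac{(p - 1) |a|^{p-1} |b|^{p-1}}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p + (|a| + |b|) \Big) \\
&\le \left( \frac{|a|^{p-1}}{\|(a, b)\|_p^{p-1}} + 1 \right) \left( \frac{|b|^{p-1}}{\|(a, b)\|_p^{p-1}} + 1 \right) + \frac{(p - 1) |a|^{p-1} |b|^{p-1}}{\|(a, b)\|_p^{2p-2}} + \frac{(p - 1) |a|^{p-1} |b|^{p-1} \cdot (|a| + |b|)}{\|(a, b)\|_p^{2p-1}} \\
&\le 4 + (p - 1) + \frac{(p - 1) |a|^{p-1} |b|^{p-1} \cdot 2^{(1 - 1/p)} \|(a, b)\|_p}{\|(a, b)\|_p^{2p-1}} \\
&\le 4 + (p - 1) + (p - 1) 2^{(1 - 1/p)} \\
&= 4 + (p - 1) \left( 1 + 2^{(1 - 1/p)} \right),
\end{aligned}
\]

where the third and fourth inequalities are true by a result similar to (15), that is,

\[ \frac{|a|^{p-1} |b|^{p-1}}{\|(a, b)\|_p^{2p-2}} \le 1. \]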

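If $p$ is odd and $(a, b) \ne (0, 0)$, then we obtain

\[
\begin{aligned}
\nabla^2_{aa} \psi_p(a, b) &= \left( \frac{\mathrm{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right)^2 + \frac{\mathrm{sgn}(a)\,\mathrm{sgn}(b) \cdot (p - 1) a^{p-2} b^p}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p - (a + b) \Big), \\
\nabla^2_{ab} \psi_p(a, b) = \nabla^2_{ba} \psi_p(a, b) &= \left( \frac{\mathrm{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{\mathrm{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \frac{\mathrm{sgn}(a)\,\mathrm{sgn}(b) \cdot (p - 1) a^{p-1} b^{p-1}}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p - (a + b) \Big), \\
\nabla^2_{bb} \psi_p(a, b) &= \left( \frac{\mathrm{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right)^2 + \frac{\mathrm{sgn}(a)\,\mathrm{sgn}(b) \cdot (p - 1) a^p b^{p-2}}{\|(a, b)\|_p^{2p-1}} \Big( \|(a, b)\|_p - (a + b) \Big).
\end{aligned}
\]

In fact, the upper bounds for $\nabla^2_{aa} \psi_p(a, b)$, $\nabla^2_{ab} \psi_p(a, b)$, $\nabla^2_{bb} \psi_p(a, b)$ remain the same, by following exactly the same steps as in the case where $p$ is even. Thus, there exists a constant $L > 0$ independent of $(a, b)$ such that

\[ \|\nabla^2 \psi_p(a, b)\| \le L, \quad \forall (a, b) \ne (0, 0) \in \mathbb{R}^2. \]

Then, by Lemma 2.6, we have

\[ \|\nabla \psi_p(a, b) - \nabla \psi_p(c, d)\| \le L \|(a, b) - (c, d)\|, \tag{16} \]

for all $(a, b), (c, d) \in \mathbb{R}^2$ with $(0, 0) \notin [(a, b), (c, d)]$. Moreover, (16) also holds in the case $(a, b) = (c, d) = (0, 0)$, since $\nabla_a \psi_p(a, b) = \nabla_b \psi_p(a, b) = 0$ there. Therefore, we can assume $(a, b) \ne (0, 0)$. From Lemma 2.3(c), $\psi_p$ is continuously differentiable for all $(a, b) \in \mathbb{R}^2$ with $\nabla \psi_p(0, 0) = (0, 0)$; then, using a continuity argument, we obtain that (16) remains true for all $(c, d) \in \mathbb{R}^2$. Thus, (16) holds for all $(a, b), (c, d) \in \mathbb{R}^2$, which says that $\nabla \psi_p$ is globally Lipschitz continuous. □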
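The Lipschitz property of Lemma 3.1 can be probed numerically. In the sketch below, the constant $2C$ with $C = 4 + (p-1)(1 + 2^{(1-1/p)})$ is our own choice: it uses the entrywise Hessian bound from the proof together with the fact that the operator norm of a $2 \times 2$ matrix is at most its Frobenius norm. The paper itself does not state an explicit value of $L$, and the sampling box and function names are likewise assumptions.

```python
import math
import random


def grad_psi_p(a, b, p):
    """Gradient of psi_p, unified form of (12) and (13)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    if norm == 0.0:
        return (0.0, 0.0)
    phi = norm - (a + b)
    ga = (math.copysign(abs(a) ** (p - 1), a) / norm ** (p - 1) - 1.0) * phi
    gb = (math.copysign(abs(b) ** (p - 1), b) / norm ** (p - 1) - 1.0) * phi
    return (ga, gb)


def hessian_entry_bound(p):
    """Entrywise bound C from the proof: 4 + (p - 1)(1 + 2^(1 - 1/p))."""
    return 4.0 + (p - 1) * (1.0 + 2.0 ** (1.0 - 1.0 / p))


def max_lipschitz_ratio(p, trials=2000, seed=0):
    """Largest observed ||grad(u) - grad(v)|| / ||u - v|| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        v = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        dist = math.hypot(u[0] - v[0], u[1] - v[1])
        if dist == 0.0:
            continue
        gu = grad_psi_p(u[0], u[1], p)
        gv = grad_psi_p(v[0], v[1], p)
        worst = max(worst, math.hypot(gu[0] - gv[0], gu[1] - gv[1]) / dist)
    return worst


for p in (2, 3, 5):
    print(p, max_lipschitz_ratio(p), 2.0 * hessian_entry_bound(p))
```

Random sampling can only fail to contradict (14), not prove it; the observed ratios are expected to stay well below the (non-tight) constant $2C$.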

Proposition 3.2 The function $\psi_p$ defined as in (8) is an SC$^1$ function. Hence, if every $F_i$ is an SC$^1$ function, then the function $\Psi_p$ given as (10) is also an SC$^1$ function.

Proof It is known by Lemma 2.3(c) that $\psi_p$ is continuously differentiable, so it remains to show that the gradient of $\psi_p$ is semismooth. From Lemma 3.1, $\nabla \psi_p$ is Lipschitz continuous; hence it is strictly continuous (locally Lipschitz continuous). Therefore, to check semismoothness of $\nabla \psi_p$, we only need to show that $\nabla \psi_p$ satisfies Lemma 2.1(b). More specifically, we only need to check semismoothness at $(0, 0)$, because at other points $\nabla \psi_p$ is continuously differentiable (see the proof of Lemma 3.1) and hence semismooth. For this purpose, we have to verify that the equation in Lemma 2.1(b) is satisfied, i.e., for any $(h_1, h_2) \in \mathbb{R}^2$ such that $\nabla \psi_p$ is differentiable at $(h_1, h_2)$, we have

\[ \nabla \psi_p(h_1, h_2) - \nabla \psi_p(0, 0) - \nabla^2 \psi_p(h_1, h_2) \cdot \begin{pmatrix} h_1 \\ h_2 \end{pmatrix} = o(\|(h_1, h_2)\|). \tag{17} \]

To prove (17), we consider two cases: $p$ even and $p$ odd.

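For $p$ even, we denote by $(\Delta_1, \Delta_2)$ the left-hand side of (17). Then, we have

\[
\begin{pmatrix} \Delta_1 \\ \Delta_2 \end{pmatrix} :=
\begin{pmatrix} k_1 \\ k_2 \end{pmatrix} \cdot \phi_p(h_1, h_2) -
\begin{pmatrix} 0 \\ 0 \end{pmatrix} -
\begin{pmatrix}
k_1^2 + \dfrac{(p - 1) h_1^{p-2} h_2^p}{\|(h_1, h_2)\|_p^{2p-1}} \phi_p(h_1, h_2) & k_1 \cdot k_2 - k_3 \phi_p(h_1, h_2) \\[2ex]
k_1 \cdot k_2 - k_3 \phi_p(h_1, h_2) & k_2^2 + \dfrac{(p - 1) h_1^p h_2^{p-2}}{\|(h_1, h_2)\|_p^{2p-1}} \phi_p(h_1, h_2)
\end{pmatrix}
\cdot
\begin{pmatrix} h_1 \\ h_2 \end{pmatrix},
\tag{18}
\]

where

\[
\begin{aligned}
k_1 &= \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1, \\
k_2 &= \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1, \\
k_3 &= \frac{(p - 1) h_1^{p-1} h_2^{p-1}}{\|(h_1, h_2)\|_p^{2p-1}}.
\end{aligned}
\tag{19}
\]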

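By plugging (19) into (18) and writing out the two components $\Delta_1$ and $\Delta_2$ of the left-hand side of (17), we obtain that $\Delta_1 = 0$ and $\Delta_2 = 0$. To see this, we compute $\Delta_1$ as below:

\[
\begin{aligned}
\Delta_1 &= \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \phi_p(h_1, h_2) - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right)^2 h_1 - \frac{(p - 1) h_1^{p-1} h_2^p}{\|(h_1, h_2)\|_p^{2p-1}} \cdot \phi_p(h_1, h_2) \\
&\quad - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_2 + \frac{(p - 1) h_1^{p-1} h_2^p}{\|(h_1, h_2)\|_p^{2p-1}} \cdot \phi_p(h_1, h_2) \\
&= \phi_p(h_1, h_2) \left[ \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) - \frac{(p - 1) h_1^{p-1} h_2^p}{\|(h_1, h_2)\|_p^{2p-1}} + \frac{(p - 1) h_1^{p-1} h_2^p}{\|(h_1, h_2)\|_p^{2p-1}} \right] \\
&\quad - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right)^2 h_1 - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_2 \\
&= \phi_p(h_1, h_2) \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right)^2 h_1 - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_2 \\
&= \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left[ \phi_p(h_1, h_2) - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_1 - \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_2 \right] \\
&= \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left[ \|(h_1, h_2)\|_p - \frac{h_1^p + h_2^p}{\|(h_1, h_2)\|_p^{p-1}} \right] \\
&= \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \cdot 0 \\
&= 0,
\end{aligned}
\]

where the second-to-last equality is true since $h_1^p + h_2^p = \|(h_1, h_2)\|_p^p$ when $p$ is even.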

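Similarly,

\[
\begin{aligned}
\Delta_2 &= \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \phi_p(h_1, h_2) - \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right)^2 h_2 - \frac{(p - 1) h_1^p h_2^{p-1}}{\|(h_1, h_2)\|_p^{2p-1}} \cdot \phi_p(h_1, h_2) \\
&\quad - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_1 + \frac{(p - 1) h_1^p h_2^{p-1}}{\|(h_1, h_2)\|_p^{2p-1}} \cdot \phi_p(h_1, h_2) \\
&= \phi_p(h_1, h_2) \left[ \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) - \frac{(p - 1) h_1^p h_2^{p-1}}{\|(h_1, h_2)\|_p^{2p-1}} + \frac{(p - 1) h_1^p h_2^{p-1}}{\|(h_1, h_2)\|_p^{2p-1}} \right] \\
&\quad - \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right)^2 h_2 - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_1 \\
&= \phi_p(h_1, h_2) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) - \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right)^2 h_2 - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_1 \\
&= \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left[ \phi_p(h_1, h_2) - \left( \frac{h_1^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_1 - \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) h_2 \right] \\
&= \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \left[ \|(h_1, h_2)\|_p - \frac{h_1^p + h_2^p}{\|(h_1, h_2)\|_p^{p-1}} \right] \\
&= \left( \frac{h_2^{p-1}}{\|(h_1, h_2)\|_p^{p-1}} - 1 \right) \cdot 0 \\
&= 0,
\end{aligned}
\]

where the second-to-last equality is true since $h_1^p + h_2^p = \|(h_1, h_2)\|_p^p$ when $p$ is even. These two computations show that (17) is satisfied. Thus, $\nabla \psi_p$ is semismooth at $(0, 0)$ for the case where $p$ is even.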

For $p$ odd, the same arguments lead to the same conclusion. Therefore, we have proved that $\nabla \psi_p$ is semismooth, and hence $\psi_p$ is an SC$^1$ function. The second statement follows immediately from this result. □

We point out that, for $p = 2$, $\psi_p$ was already proved to be an SC$^1$ function in [4, 5] (indeed, it was first formally shown in [5]). Prop. 3.2 is a general extension to any $p \ge 2$, and its proof is much more complicated than in the case of $p = 2$.
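The fact that the left-hand side of (17) vanishes identically can also be seen from homogeneity. The following short derivation is a side remark of ours, not part of the original proof; it uses only Lemma 2.2(c) and the definition (8).

```latex
% By Lemma 2.2(c), \phi_p(\alpha w) = \alpha \phi_p(w) for \alpha \ge 0, hence
\[
\psi_p(\alpha w) = \tfrac{1}{2} \phi_p(\alpha w)^2 = \alpha^2 \psi_p(w),
\qquad w \in \mathbb{R}^2, \ \alpha \ge 0 .
\]
% Differentiating with respect to w and dividing by \alpha > 0:
\[
\nabla \psi_p(\alpha w) = \alpha \, \nabla \psi_p(w), \qquad \alpha > 0 .
\]
% Differentiating with respect to \alpha at \alpha = 1, at any h \ne 0
% where \nabla \psi_p is differentiable (Euler's identity):
\[
\nabla^2 \psi_p(h) \, h = \nabla \psi_p(h) .
\]
% Since \nabla \psi_p(0,0) = 0, the left-hand side of (17) equals
\[
\nabla \psi_p(h) - 0 - \nabla \psi_p(h) = 0 .
\]
```

This argument does not depend on the parity of $p$, which is consistent with the statement that the odd case leads to the same conclusion.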

In addition to SC$^1$ functions, we also introduce LC$^1$ functions here.

Definition 3.2 A function $f : \mathbb{R}^n \to \mathbb{R}$ is called an LC$^1$ function if $f$ is continuously differentiable and its gradient is locally Lipschitz continuous.

The class of LC$^1$ minimization problems was studied in [20], where the local superlinear convergence of an approximate Newton method was established under a semismoothness assumption on the gradient function at a solution point. It is obvious that any SC$^1$ function is an LC$^1$ function. With the results of Lemma 3.1 and Prop. 3.2, we therefore have the following corollaries.

Corollary 3.1 If every $F_i$ is an LC$^1$ function, then the function $\Phi_p$ given as (9) is strongly semismooth.


Proof We know that $\phi_p$ is semismooth; indeed, it is strongly semismooth. This can be seen from Lemma 2.2(c), Lemma 3.1 and Theorem 7 of [Qi, L., Tseng, P.: Math. Oper. Res., Submitted (2002)]. Also, every LC$^1$ function is strongly semismooth. Thus, the result follows. □

Corollary 3.2 The function $\psi_p$ defined as in (8) is an LC$^1$ function. Hence, if every $F_i$ is an LC$^1$ function, then the function $\Psi_p$ given as (10) is also an LC$^1$ function.

Other notions related to semismooth functions are those of piecewise smooth and almost smooth functions. It is well known that piecewise smooth functions are examples of semismooth functions, and other examples of semismooth functions that are not piecewise smooth have emerged recently; see [Qi, L., Tseng, P.: Math. Oper. Res., Submitted (2002)] and references therein. In particular, these examples include the $p$-norm function with $1 < p < \infty$ defined on $\mathbb{R}^n$ where $n \ge 2$, the Euclidean norm function, pseudo-smooth NCP-functions, smoothing functions, etc. To close this section, we point out that the NCP-function studied in this paper and in [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)] is indeed strongly almost smooth, since it is based on the $p$-norm function. We briefly state the definition of almost smooth functions and the result below.

Definition 3.3 The almost smooth (respectively, strongly almost smooth) functions are functions that are semismooth (respectively, strongly semismooth) on the whole space $\mathbb{R}^n$ and smooth everywhere except on sets with "dimension" less than $n - 1$, in the sense that the sets do not locally partition $\mathbb{R}^n$ into multiple connected components.

By applying Lemma 2.2(c), 3.1 and a result in [Qi, L., Tseng, P.: Math. Oper. Res., Submitted (2002)], we immediately have an interesting property regarding strong almost smoothness of $\Phi_p$. For more details regarding almost smooth and strongly almost smooth functions, please refer to the recent paper [Qi, L., Tseng, P.: Math. Oper. Res., Submitted (2002)].

Proposition 3.3 If every $F_i$ is an LC$^1$ function, then the function $\Phi_p$ defined as (9) is a strongly almost smooth function.

Proof This result follows from Lemma 2.2(c), Prop. 3.1, and Theorem 7 of [Qi, L., Tseng, P.: Math. Oper. Res., Submitted (2002)]. □

4 A descent method

In this section, we study a descent method that is almost the same as the one in Sect. 4 of [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)] for solving the unconstrained minimization (11), and which does not require the derivative of the mapping $F$ involved in the NCP. In fact, we consider the same search direction for the algorithm as in [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)]:

\[ d^k := -\nabla_b \psi_p(x^k, F(x^k)), \tag{20} \]

except that the way to obtain the step-size is slightly different (see Step 3). Such a way of finding the step-size can also be found in the literature, for instance in [10]. Using the property that $\nabla \psi_p$ is globally Lipschitz continuous (see Lemma 3.1), we have an alternative proof for the convergence result of the descent method considered in [Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004)]. We state the detailed steps below.

Algorithm 4.1
(Step 0) Choose $x^0 \in \mathbb{R}^n$, $\varepsilon \ge 0$, $\sigma \in (0, 1)$, $\beta \in (0, 1)$ and set $k := 0$.
(Step 1) If $\Psi_p(x^k) \le \varepsilon$, then Stop.
(Step 2) Let

\[ d^k := -\nabla_b \psi_p(x^k, F(x^k)). \]

(Step 3) Compute a step-size $t_k := \beta^{m_k}$, where $m_k$ is the smallest nonnegative integer $m$ satisfying the Armijo-type condition:

\[ \Psi_p(x^k + \beta^m d^k) \le \Psi_p(x^k) - \sigma \beta^{2m} \|d^k\|^2. \tag{21} \]

(Step 4) Set $x^{k+1} := x^k + t_k d^k$, $k := k + 1$ and go to Step 1.
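The steps of Algorithm 4.1 can be sketched in Python as follows. This is a minimal illustration under several assumptions of ours: the direction (20) is read componentwise, i.e., $d^k_i = -\nabla_b \psi_p(x^k_i, F_i(x^k))$; the iteration and backtracking caps (`max_iter`, `max_backtracks`) are safeguards that the algorithm as stated does not have; and the test mapping `toy_F` and the parameter values are hypothetical.

```python
import math


def psi_p(a, b, p):
    """psi_p from (8)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return 0.5 * (norm - (a + b)) ** 2


def grad_b_psi_p(a, b, p):
    """Partial derivative of psi_p in b, unified form of (12) and (13)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    if norm == 0.0:
        return 0.0
    signed_pow = math.copysign(abs(b) ** (p - 1), b)
    return (signed_pow / norm ** (p - 1) - 1.0) * (norm - (a + b))


def merit(x, F, p):
    """Merit function Psi_p from (10)."""
    return sum(psi_p(xi, fi, p) for xi, fi in zip(x, F(x)))


def descent(F, x0, p=3, eps=1e-12, sigma=0.1, beta=0.5,
            max_iter=1000, max_backtracks=60):
    """Derivative-free descent in the spirit of Algorithm 4.1."""
    x = list(x0)                                   # Step 0
    for _ in range(max_iter):
        val = merit(x, F, p)
        if val <= eps:                             # Step 1
            break
        d = [-grad_b_psi_p(xi, fi, p) for xi, fi in zip(x, F(x))]  # Step 2
        d_sq = sum(di * di for di in d)
        t = 1.0                                    # Step 3: t = beta^m
        for _ in range(max_backtracks):
            trial = [xi + t * di for xi, di in zip(x, d)]
            # Armijo-type test (21): sigma * beta^(2m) * ||d||^2 = sigma * t^2 * ||d||^2
            if merit(trial, F, p) <= val - sigma * t * t * d_sq:
                break
            t *= beta
        else:
            return x                               # safeguard: line search failed
        x = trial                                  # Step 4
    return x


def toy_F(x):
    """Hypothetical strongly monotone mapping F(x) = x - (1, 2)."""
    return [x[0] - 1.0, x[1] - 2.0]


solution = descent(toy_F, [0.0, 0.0])
print(solution, merit(solution, toy_F, 3))
```

For this toy problem the NCP solution is $x = (1, 2)$ with $F(x) = 0$. Note that the sketch never evaluates a derivative of $F$, which is the point of the derivative-free direction (20).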

We wish to show the global convergence result for Algorithm 4.1 under the assumption that $F$ is strongly monotone. The following lemmas, together with Lemma 3.1, will yield the convergence result for the algorithm. In what follows, we assume that the parameter $\varepsilon$ used in Algorithm 4.1 is set to zero and that Algorithm 4.1 generates an infinite sequence $\{x^k\}$.

Lemma 4.1 ([Chen, J.-S.: J. Optimiz. Theory Appl., Submitted (2004), Lem. 4.1]) Let $x^k \in \mathbb{R}^n$ and $F$ be a monotone function. Then the search direction defined as (20) satisfies the descent condition $\nabla \Psi_p(x^k)^T d^k < 0$ as long as $x^k$ is not a solution of the NCP. Moreover, if $F$ is strongly monotone with modulus $\mu > 0$, then $\nabla \Psi_p(x^k)^T d^k \le -\mu \|d^k\|^2$.

Lemma 4.2 If $F$ is strongly monotone, then the NCP has at most one solution.

Proof Suppose there are two distinct solutions $\zeta, x \in \mathbb{R}^n$ such that

\[ \langle F(\zeta), \zeta \rangle = 0, \quad F(\zeta) \ge 0, \quad \zeta \ge 0 \]

and

\[ \langle F(x), x \rangle = 0, \quad F(x) \ge 0, \quad x \ge 0. \]

Since $F$ is strongly monotone, we have $\langle F(\zeta) - F(x), \zeta - x \rangle > 0$. However,

\[
\begin{aligned}
\langle F(\zeta) - F(x), \zeta - x \rangle
&= \langle F(\zeta), \zeta \rangle + \langle F(x), x \rangle - \langle F(\zeta), x \rangle - \langle F(x), \zeta \rangle \\
&= -\langle F(\zeta), x \rangle - \langle F(x), \zeta \rangle \\
&\le 0,
\end{aligned}
\]

where the inequality holds because $F(\zeta)$, $\zeta$, $F(x)$, $x$ are all nonnegative. This is a contradiction, and therefore there is at most one solution of the NCP. □

Proposition 4.1 Suppose that $F$ is continuously differentiable and strongly monotone with modulus $\mu > 0$. Let $x^0 \in \mathbb{R}^n$ be any starting point and let $\mathcal{L}(x^0)$ denote its level set. Assume $\nabla F$ is Lipschitz continuous in $\mathcal{L}(x^0)$. Then the sequence $\{x^k\}$ generated by Algorithm 4.1 converges to the unique solution of the NCP.

Proof From Lemma 3.1 and the assumption that $\nabla F$ is Lipschitz continuous, we obtain that $\nabla \Psi_p$ is also Lipschitz continuous in $\mathcal{L}(x^0)$, i.e.,

\[ \|\nabla \Psi_p(x) - \nabla \Psi_p(y)\| \le L \|x - y\|, \quad \forall x, y \in \mathcal{L}(x^0) \]
