Journal of Global Optimization, vol. 36, pp. 565-580, 2006

The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem

Jein-Shan Chen¹
Department of Mathematics
National Taiwan Normal University

Taipei 11677, Taiwan

February 1, 2005 (revised March 11, 2006)

Abstract. This paper is a follow-up to the work [1], where an NCP-function and a descent method were proposed for the nonlinear complementarity problem. An unconstrained reformulation of the problem was obtained there via a merit function based on the proposed NCP-function.

We continue to explore properties of the merit function in this paper. In particular, we show that the gradient of the merit function is globally Lipschitz continuous, which is important from a computational point of view. Moreover, we show that the merit function is an SC1 function, which means that it is continuously differentiable and its gradient is semismooth. In addition, we provide an alternative proof, which uses the new properties of the merit function, for the convergence result of the descent method considered in [1].

Key words. Complementarity, SC1 function, merit function, semismooth function, descent method.

1 Introduction

In the past decades, the well-known nonlinear complementarity problem (NCP) has attracted much attention due to its various applications in operations research, economics, and engineering [7, 12, 18]. The NCP is to find a point $x \in \mathbb{R}^n$ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x)\rangle = 0, \qquad (1)$$

where $\langle\cdot,\cdot\rangle$ is the Euclidean inner product and $F = (F_1, F_2, \dots, F_n)^T$ maps from $\mathbb{R}^n$ to $\mathbb{R}^n$. We assume that $F$ is continuously differentiable throughout this paper.

¹ E-mail: jschen@math.ntnu.edu.tw, TEL: 886-2-29320206, FAX: 886-2-29332342.


There have been many methods proposed for solving the NCP [10, 12, 18]. Among them, one of the most popular approaches, which has been studied intensively in recent years, is to reformulate the NCP as an unconstrained minimization problem [6, 8, 11, 14, 15, 30].

A function that yields such an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP. In constructing a merit function, the class of so-called NCP-functions, defined below, plays an important role.

Definition 1.1 A function φ : IR2 → IR is called an NCP-function if it satisfies

φ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0. (2)

A popular NCP-function intensively studied recently is the well-known Fischer-Burmeister NCP-function [8, 9, 26] defined as

$$\phi(a,b) = \sqrt{a^2 + b^2} - (a + b). \qquad (3)$$
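For instance, as a quick check of property (2): the pair $(0,3)$ is complementary while the pair $(1,1)$ is not, and indeed
$$\phi(0,3) = \sqrt{0^2 + 3^2} - (0 + 3) = 0, \qquad \phi(1,1) = \sqrt{1^2 + 1^2} - (1 + 1) = \sqrt{2} - 2 \approx -0.586 \ne 0.$$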

Let $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ be
$$\Phi(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}. \qquad (4)$$

Then the function $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ defined by
$$\Psi(x) := \frac{1}{2}\|\Phi(x)\|^2 = \frac{1}{2}\sum_{i=1}^{n}\phi(x_i, F_i(x))^2 \qquad (5)$$
is a merit function for the NCP, i.e., the NCP can be recast as the unconstrained minimization
$$\min_{x\in\mathbb{R}^n}\Psi(x). \qquad (6)$$

In the paper [1], an NCP-function that extends the Fischer-Burmeister function (3) was studied. More specifically, $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ is defined by
$$\phi_p(a,b) := \|(a,b)\|_p - (a + b), \qquad (7)$$
where $\|(a,b)\|_p$ denotes the p-norm of $(a,b)$, i.e., $\|(a,b)\|_p = \sqrt[p]{|a|^p + |b|^p}$. In other words, in the function $\phi_p$, the 2-norm of $(a,b)$ in the Fischer-Burmeister function (3) is replaced by the more general p-norm of $(a,b)$ with $p \ge 2$. This function $\phi_p$ is still an NCP-function


as was noted in Tseng's paper [28]. Nonetheless, there was no further study of this NCP-function, even for $p = 3$, until the recent paper [1] by the author. Following the function $\phi_p$, we can further define $\psi_p : \mathbb{R}^2 \to \mathbb{R}_+$ by
$$\psi_p(a,b) := \frac{1}{2}\,|\phi_p(a,b)|^2. \qquad (8)$$

The function $\psi_p$ is a nonnegative NCP-function, smooth on $\mathbb{R}^2$, and has some favorable properties; see [1]. In this paper, we continue to explore properties of $\psi_p$, as will be seen in Sec. 3. Analogous to $\Phi$, the function $\Phi_p : \mathbb{R}^n \to \mathbb{R}^n$ given by
$$\Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix} \qquad (9)$$

yields a merit function $\Psi_p : \mathbb{R}^n \to \mathbb{R}_+$ for the NCP, where
$$\Psi_p(x) := \frac{1}{2}\|\Phi_p(x)\|^2 = \frac{1}{2}\sum_{i=1}^{n}\phi_p(x_i, F_i(x))^2 = \sum_{i=1}^{n}\psi_p(x_i, F_i(x)). \qquad (10)$$
As shown in [1], $\Psi_p$ is a continuously differentiable merit function for the NCP. Therefore, classical iterative methods such as the Newton method can be applied to the unconstrained smooth minimization reformulation of the NCP, i.e.,
$$\min_{x\in\mathbb{R}^n}\Psi_p(x). \qquad (11)$$
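To make the reformulation (10)-(11) concrete, the following minimal Python sketch (an illustration only, not from the paper; NumPy is assumed, and the linear map $F(x) = Mx + q$ with the particular numbers in $M$ and $q$ is purely hypothetical test data) evaluates the merit function $\Psi_p$ for a given $F$.

```python
import numpy as np

def phi_p(a, b, p=3):
    """NCP-function (7): phi_p(a, b) = ||(a, b)||_p - (a + b)."""
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def Psi_p(x, F, p=3):
    """Merit function (10): Psi_p(x) = 0.5 * sum_i phi_p(x_i, F_i(x))^2."""
    Fx = F(x)
    return 0.5 * sum(phi_p(xi, fi, p)**2 for xi, fi in zip(x, Fx))

# Hypothetical test problem (not from the paper): F(x) = M x + q.
M = np.array([[3.0, 1.0],
              [1.0, 2.0]])
q = np.array([-1.0, 0.5])
F = lambda x: M @ x + q

x = np.array([0.2, 0.1])
print(Psi_p(x, F, p=3))   # Psi_p(x) >= 0, and Psi_p(x) = 0 exactly at NCP solutions
```

Solving (11) then amounts to driving this value to zero.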

On the other hand, derivative-free methods, which do not require the computation of derivatives of $F$, have also attracted much attention [11, 14, 29]. Derivative-free methods, which take advantage of particular properties of a merit function, are suitable for problems where the derivatives of $F$ are not available or are expensive to compute. In Sec. 4, we also study a derivative-free descent algorithm for solving the NCP based on the merit function $\Psi_p$. Indeed, this descent method was already considered in [1]; here we apply the new properties of $\psi_p$ explored in this paper to provide an alternative proof of the convergence result.

Throughout this paper, $\mathbb{R}^n$ denotes the space of n-dimensional real column vectors and $T$ denotes transpose. For any differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For any differentiable mapping $F = (F_1, \dots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \cdots \nabla F_m(x)]$ denotes the transposed Jacobian of $F$ at $x$. We write $z = o(\alpha)$, with $\alpha \in \mathbb{R}$ and $z \in \mathbb{R}^n$, to mean that $\|z\|/|\alpha|$ tends to zero as $\alpha \to 0$. Also, we denote by $\|x\|_p$ the p-norm of $x$ and by $\|x\|$ the Euclidean norm of $x$. Finally, we assume throughout that $p$ is an integer greater than or equal to 2.


2 Preliminaries

In this section, we recall some background concepts and review some known material that is crucial to the subsequent analysis. We begin with the monotonicity of a mapping. Let $F : \mathbb{R}^n \to \mathbb{R}^n$. Then $F$ is monotone if $\langle x - y, F(x) - F(y)\rangle \ge 0$ for all $x, y \in \mathbb{R}^n$; $F$ is strictly monotone if $\langle x - y, F(x) - F(y)\rangle > 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$; and $F$ is strongly monotone with modulus $\mu > 0$ if $\langle x - y, F(x) - F(y)\rangle \ge \mu\|x - y\|^2$ for all $x, y \in \mathbb{R}^n$. Next, we recall the so-called semismooth functions. First, we say that $F$ is strictly continuous (also called locally Lipschitz continuous) at $x \in \mathbb{R}^n$ [25, Chap. 9] if there exist scalars $\kappa > 0$ and $\delta > 0$ such that

$$\|F(y) - F(z)\| \le \kappa\|y - z\| \quad \forall\, y, z \in \mathbb{R}^n \text{ with } \|y - x\| \le \delta,\ \|z - x\| \le \delta;$$

and $F$ is strictly continuous if $F$ is strictly continuous at every $x \in \mathbb{R}^n$. If $\delta$ can be taken to be $\infty$, then $F$ is Lipschitz continuous with Lipschitz constant $\kappa$. Define the function $\operatorname{lip} F : \mathbb{R}^n \to [0, \infty]$ by
$$\operatorname{lip} F(x) := \limsup_{\substack{y, z \to x \\ y \ne z}} \frac{\|F(y) - F(z)\|}{\|y - z\|}.$$

Then $F$ is strictly continuous at $x$ if and only if $\operatorname{lip} F(x)$ is finite. We say $F$ is directionally differentiable at $x \in \mathbb{R}^n$ if
$$F'(x; h) := \lim_{t \to 0^+} \frac{F(x + th) - F(x)}{t} \quad \text{exists for all } h \in \mathbb{R}^n;$$

and $F$ is directionally differentiable if $F$ is directionally differentiable at every $x \in \mathbb{R}^n$. $F$ is differentiable (in the Fréchet sense) at $x \in \mathbb{R}^n$ if there exists a linear mapping $\nabla F(x) : \mathbb{R}^n \to \mathbb{R}^n$ such that
$$F(x + h) - F(x) - \nabla F(x)h = o(\|h\|).$$

We say that F is continuously differentiable if F is differentiable at every x ∈ IRn and ∇F is continuous.

If $F$ is strictly continuous, then $F$ is almost everywhere differentiable by Rademacher's Theorem; see [4] and [25, Sec. 9J]. In this case, the generalized Jacobian $\partial F(x)$ of $F$ at $x$ (in the Clarke sense) can be defined as the convex hull of the set $\partial_B F(x)$, where
$$\partial_B F(x) := \left\{\lim_{x^j \to x} \nabla F(x^j) \;\Big|\; F \text{ is differentiable at } x^j \in \mathbb{R}^n\right\}.$$

The notation $\partial_B$ is adopted from [20]. In [25, Chap. 9], the case $n = 1$ is considered and the notations $\bar\nabla$ and $\bar\partial$ are used instead of $\partial_B$ and $\partial$, respectively.

Assume $F : \mathbb{R}^n \to \mathbb{R}^n$ is strictly continuous. We say $F$ is semismooth at $x$ if $F$ is directionally differentiable at $x$ and, for any $V \in \partial F(x + h)$, we have
$$F(x + h) - F(x) - Vh = o(\|h\|).$$


We say $F$ is $\rho$-order semismooth at $x$ ($0 < \rho < \infty$) if $F$ is semismooth at $x$ and, for any $V \in \partial F(x + h)$, we have
$$F(x + h) - F(x) - Vh = O(\|h\|^{1+\rho}).$$
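As a simple illustration of these definitions, the absolute value function $F(x) = |x|$ on $\mathbb{R}$ is strongly semismooth at its kink $x = 0$: it is directionally differentiable there with $F'(0; h) = |h|$, and for $h \ne 0$ we have $\partial F(h) = \{\operatorname{sgn}(h)\}$, so that for $V = \operatorname{sgn}(h)$
$$F(0 + h) - F(0) - Vh = |h| - \operatorname{sgn}(h)\,h = 0 = O(|h|^{2}),$$
while the expression is trivially zero for $h = 0$; hence $F$ is 1-order semismooth at $0$ even though it is not differentiable there.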

The following lemma, proven by Sun and Sun [27, Thm. 3.6] using the definition of the generalized Jacobian (Sun and Sun did not consider the case of $o(\|h\|)$, but their argument readily applies to this case), enables one to study the semismoothness of $F$ by examining only those points $x \in \mathbb{R}^n$ where $F$ is differentiable, and thus to work only with the Jacobian of $F$ rather than the generalized Jacobian.

Lemma 2.1 Suppose F : IRn → IRn is strictly continuous and directionally differentiable in a neighborhood of x ∈ IRn. Then, for any 0 < ρ < ∞, the following two statements (where O(·) depends on F and x only) are equivalent:

(a) For any $h \in \mathbb{R}^n$ and any $V \in \partial F(x + h)$,
$$F(x + h) - F(x) - Vh = o(\|h\|) \quad (\text{respectively, } O(\|h\|^{1+\rho})).$$

(b) For any $h \in \mathbb{R}^n$ such that $F$ is differentiable at $x + h$,
$$F(x + h) - F(x) - \nabla F(x + h)h = o(\|h\|) \quad (\text{respectively, } O(\|h\|^{1+\rho})).$$

We say $F$ is semismooth (respectively, $\rho$-order semismooth) if $F$ is semismooth (respectively, $\rho$-order semismooth) at every $x \in \mathbb{R}^n$. We say $F$ is strongly semismooth if it is 1-order semismooth. Convex functions and piecewise continuously differentiable functions are examples of semismooth functions. The composition of two (respectively, $\rho$-order) semismooth functions is also a (respectively, $\rho$-order) semismooth function. The property of semismoothness plays an important role in nonsmooth Newton methods [20, 22] as well as in some smoothing methods. For extensive discussions of semismooth functions, see [9, 16, 22].

Now we review some useful properties of $\phi_p$ and $\psi_p$, defined as in (7) and (8), respectively, which will be used in the analysis of the subsequent sections. We notice that the function $\phi_p$ reduces to the Fischer-Burmeister function (3) when $p = 2$. Thus, most of these properties are extensions of properties of the Fischer-Burmeister function. For detailed proofs, please refer to [1].

Lemma 2.2 ([1, Prop. 3.1]) Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be defined as in (7), where $p \ge 2$. Then

(a) $\phi_p$ is an NCP-function, i.e., it satisfies (2).


(b) $\phi_p$ is sub-additive, i.e., $\phi_p(w + w') \le \phi_p(w) + \phi_p(w')$ for all $w, w' \in \mathbb{R}^2$.

(c) $\phi_p$ is positively homogeneous, i.e., $\phi_p(\alpha w) = \alpha\phi_p(w)$ for all $w \in \mathbb{R}^2$ and $\alpha \ge 0$.

(d) $\phi_p$ is convex, i.e., $\phi_p(\alpha w + (1 - \alpha)w') \le \alpha\phi_p(w) + (1 - \alpha)\phi_p(w')$ for all $w, w' \in \mathbb{R}^2$ and $\alpha \in (0, 1)$.

(e) $\phi_p$ is Lipschitz continuous with constant $L_1 = 1 + \sqrt{2}$, i.e., $|\phi_p(w) - \phi_p(w')| \le L_1\|w - w'\|$; or with constant $L_2 = 1 + 2^{(1-1/p)}$, i.e., $|\phi_p(w) - \phi_p(w')| \le L_2\|w - w'\|_p$, for all $w, w' \in \mathbb{R}^2$.

Lemma 2.2(b) and (c) imply that $\phi_p$ is sublinear, i.e., it satisfies
$$\phi_p(\alpha w + \beta w') \le \alpha\phi_p(w) + \beta\phi_p(w')$$
for all $w, w' \in \mathbb{R}^2$ and $\alpha, \beta \ge 0$. This follows from the fact [3, Prop. 3.11] that a function from $\mathbb{R}^n$ to $\mathbb{R}$ is sublinear if and only if it is positively homogeneous and sub-additive.

Note that sublinearity is stronger than convexity. In fact, under Lemma 2.2(c), Lemma 2.2(b) is equivalent to Lemma 2.2(d). This follows from [24, Thm. 4.7]: a positively homogeneous function is convex if and only if it is sub-additive.

Lemma 2.3 ([1, Prop. 3.2]) Let φp, ψp be defined as (7) and (8), respectively, where p ≥ 2.

Then

(a) ψp is an NCP-function, i.e., it satisfies (2).

(b) ψp(a, b) ≥ 0 for all (a, b) ∈ IR2.

(c) $\psi_p$ is continuously differentiable everywhere. Moreover, $\nabla_a\psi_p(0,0) = \nabla_b\psi_p(0,0) = 0$ and
$$\nabla_a\psi_p(a,b) = \left(\frac{a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)\phi_p(a,b), \qquad
\nabla_b\psi_p(a,b) = \left(\frac{b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)\phi_p(a,b), \qquad (12)$$
for $(a,b) \ne (0,0)$ when $p$ is even, whereas
$$\nabla_a\psi_p(a,b) = \left(\frac{\operatorname{sgn}(a)\cdot a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)\phi_p(a,b), \qquad
\nabla_b\psi_p(a,b) = \left(\frac{\operatorname{sgn}(b)\cdot b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)\phi_p(a,b), \qquad (13)$$
for $(a,b) \ne (0,0)$ when $p$ is odd.

(d) ∇aψp(a, b) · ∇bψp(a, b) ≥ 0 for all (a, b) ∈ IR2. The equality holds if and only if φp(a, b) = 0.


(e) ∇aψp(a, b) = 0 ⇐⇒ ∇bψp(a, b) = 0 ⇐⇒ φp(a, b) = 0.
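To make the formulas (12) and (13) concrete, the following short Python sketch (an illustration only; NumPy assumed) evaluates $\nabla\psi_p$ through the single expression $\operatorname{sgn}(a)|a|^{p-1}/\|(a,b)\|_p^{p-1} - 1$, which reduces to (12) for even $p$ and to (13) for odd $p$, and compares it with a central finite difference of $\psi_p$ at a sample point.

```python
import numpy as np

def phi_p(a, b, p):
    """NCP-function (7)."""
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def psi_p(a, b, p):
    """psi_p(a, b) = 0.5 * phi_p(a, b)^2, cf. (8)."""
    return 0.5 * phi_p(a, b, p)**2

def grad_psi_p(a, b, p):
    """Gradient of psi_p via (12)/(13); valid for (a, b) != (0, 0)."""
    r = (abs(a)**p + abs(b)**p)**(1.0 / p)
    phi = r - (a + b)
    ga = (np.sign(a) * abs(a)**(p - 1) / r**(p - 1) - 1.0) * phi
    gb = (np.sign(b) * abs(b)**(p - 1) / r**(p - 1) - 1.0) * phi
    return np.array([ga, gb])

# Compare with a central finite difference at a sample point (p = 3, odd case).
a, b, p, eps = 0.7, -1.3, 3, 1e-6
fd = np.array([(psi_p(a + eps, b, p) - psi_p(a - eps, b, p)) / (2 * eps),
               (psi_p(a, b + eps, p) - psi_p(a, b - eps, p)) / (2 * eps)])
print(grad_psi_p(a, b, p), fd)   # the two vectors should agree to about 1e-6
```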

Lemma 2.4 ([1, Prop. 3.5]) Let $\Psi_p : \mathbb{R}^n \to \mathbb{R}$ be defined as in (10), where $p \ge 2$. Assume that $F$ is either strongly monotone or a uniform P-function; then the level sets $L(\Psi_p, \gamma)$ are bounded for all $\gamma \in \mathbb{R}$.

In addition to the above properties of $\phi_p$ and $\psi_p$, we also need the following two lemmas for the analysis in the subsequent sections.

Lemma 2.5 ([13, (1.3)]) Let $x \in \mathbb{R}^n$ and $1 < p_1 < p_2$. Then $\|x\|_{p_2} \le \|x\|_{p_1} \le n^{(1/p_1 - 1/p_2)}\|x\|_{p_2}$.

Lemma 2.6 If $F : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ has a second derivative at each point of a convex set $D_0 \subseteq D$, then, for all $x, y \in D_0$,
$$\|\nabla F(y) - \nabla F(x)\| \le \sup_{0 \le t \le 1}\|\nabla^2 F(x + t(y - x))\| \cdot \|y - x\|.$$

Proof. This is Theorem 3.3.5 of [17, p. 78]. □

3 The semismooth-related properties of the NCP and merit functions

In this section, we study some semismooth-related properties of $\phi_p$, including semismoothness and almost smoothness, as well as the SC1 and LC1 properties of $\psi_p$. The semismooth property is very important from the computational point of view. In particular, it plays a fundamental role in the superlinear convergence analysis of generalized Newton methods; see [20, 22, 31]. The classes of SC1 and LC1 functions have been of interest in relation to the development of minimization algorithms. We introduce their definitions later. We begin this section by showing that the functions $\phi_p$ and $\Phi_p$ are semismooth (in fact, they are strongly semismooth, as shown in Corollary 3.1). The proof is easy and routine.

Proposition 3.1 The function Φp : IRn → IRn defined as (9) is semismooth.


Proof. We notice that φp is convex by Lemma 2.2(d), and hence is a semismooth function.

We also observe that each component of $\Phi_p(x)$ is the composition of the convex function $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ with the differentiable mapping $(x_i, F_i(x))^T : \mathbb{R}^n \to \mathbb{R}^2$. Since convex functions and differentiable functions are semismooth and the composition of semismooth functions is semismooth, it follows that $\Phi_p$ is semismooth. □

An important concept related to semismooth functions is that of an SC1 function, whose definition we introduce next.

Definition 3.1 A function f : IRn→ IR is said to be an SC1 function if f is continuously differentiable and its gradient is semismooth.

We can view SC1 functions as functions lying between C1 and C2 functions. Through the notion of SC1 functions, many results regarding the minimization of C2 functions can be extended to the minimization of SC1 functions; see [19] and references therein. For applications and more details on SC1 functions, please refer to the excellent book [5]. Prop. 3.2 shows that $\psi_p$ is an SC1 function; hence, if every $F_i$ is an SC1 function, then so is $\Psi_p$. Before presenting its proof, we need an important and crucial technical lemma, which states that $\nabla\psi_p$ is globally Lipschitz continuous. The lemma will be used not only in the proof of Prop. 3.2 but also in the convergence analysis of the descent algorithm in Sec. 4.

Lemma 3.1 The gradient of the function $\psi_p$ defined as in (8) is Lipschitz continuous; that is, there exists $L > 0$ such that
$$\|\nabla\psi_p(a,b) - \nabla\psi_p(c,d)\| \le L\,\|(a,b) - (c,d)\|, \qquad (14)$$
for all $(a,b), (c,d) \in \mathbb{R}^2$.

Proof. Starting from the gradient of $\psi_p$ given in (12) and (13) and applying the chain rule and the quotient rule (the computation is routine though tedious, so we omit the details), we have the following two cases.

If $p$ is even and $(a,b) \ne (0,0)$, then
$$\begin{aligned}
\nabla^2_{aa}\psi_p(a,b) &= \left(\frac{a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)^2 + \frac{(p-1)a^{p-2}b^{p}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p - (a+b)\Big),\\[4pt]
\nabla^2_{ab}\psi_p(a,b) = \nabla^2_{ba}\psi_p(a,b) &= \left(\frac{a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)\left(\frac{b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right) - \frac{(p-1)a^{p-1}b^{p-1}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p - (a+b)\Big),\\[4pt]
\nabla^2_{bb}\psi_p(a,b) &= \left(\frac{b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)^2 + \frac{(p-1)a^{p}b^{p-2}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p - (a+b)\Big).
\end{aligned}$$


It is clear that $\dfrac{|a|^{p-1}}{\|(a,b)\|_p^{p-1}} \le 1$, and it also follows from
$$|a|^{p-2}\cdot|b|^{p} \le \Big(\max\{|a|,|b|\}\Big)^{2p-2} \le \Big(\sqrt[p]{|a|^p + |b|^p}\Big)^{2p-2} = \|(a,b)\|_p^{2p-2}$$
that
$$\frac{|a|^{p-2}|b|^{p}}{\|(a,b)\|_p^{2p-2}} \le 1. \qquad \text{Similarly,} \qquad \frac{|a|^{p}|b|^{p-2}}{\|(a,b)\|_p^{2p-2}} \le 1. \qquad (15)$$

On the other hand, by Lemma 2.5, we have
$$|a| + |b| \le \sqrt{2}\,\sqrt{a^2 + b^2} = \sqrt{2}\,\|(a,b)\|_2 \le \sqrt{2}\cdot 2^{(1/2 - 1/p)}\|(a,b)\|_p = 2^{(1-1/p)}\|(a,b)\|_p.$$
Applying all of the above, we can give an upper bound for $\nabla^2_{aa}\psi_p(a,b)$ as follows:

$$\begin{aligned}
\left|\nabla^2_{aa}\psi_p(a,b)\right|
&\le \left(\frac{|a|^{p-1}}{\|(a,b)\|_p^{p-1}} + 1\right)^2 + \frac{(p-1)|a|^{p-2}|b|^{p}}{\|(a,b)\|_p^{2p-2}} + \frac{(p-1)|a|^{p-2}|b|^{p}\,(|a|+|b|)}{\|(a,b)\|_p^{2p-1}}\\
&\le 4 + (p-1) + \frac{(p-1)|a|^{p-2}|b|^{p}\cdot 2^{(1-1/p)}\|(a,b)\|_p}{\|(a,b)\|_p^{2p-1}}\\
&\le 4 + (p-1) + (p-1)2^{(1-1/p)}\\
&= 4 + (p-1)\left[1 + 2^{(1-1/p)}\right],
\end{aligned}$$

where the last inequality holds due to (15). By the same arguments, we also have

$$\left|\nabla^2_{bb}\psi_p(a,b)\right| \le 4 + (p-1)\left[1 + 2^{(1-1/p)}\right].$$

Now, we estimate the upper bound for $\nabla^2_{ab}\psi_p(a,b) = \nabla^2_{ba}\psi_p(a,b)$ as follows:
$$\begin{aligned}
\left|\nabla^2_{ab}\psi_p(a,b)\right| = \left|\nabla^2_{ba}\psi_p(a,b)\right|
&\le \left|\frac{a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right|\cdot\left|\frac{b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right| + \frac{(p-1)|a|^{p-1}|b|^{p-1}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p + (|a|+|b|)\Big)\\
&\le \left(\frac{|a|^{p-1}}{\|(a,b)\|_p^{p-1}} + 1\right)\left(\frac{|b|^{p-1}}{\|(a,b)\|_p^{p-1}} + 1\right) + \frac{(p-1)|a|^{p-1}|b|^{p-1}}{\|(a,b)\|_p^{2p-2}} + \frac{(p-1)|a|^{p-1}|b|^{p-1}\,(|a|+|b|)}{\|(a,b)\|_p^{2p-1}}\\
&\le 4 + (p-1) + \frac{(p-1)|a|^{p-1}|b|^{p-1}\cdot 2^{(1-1/p)}\|(a,b)\|_p}{\|(a,b)\|_p^{2p-1}}\\
&\le 4 + (p-1) + (p-1)2^{(1-1/p)}\\
&= 4 + (p-1)\left[1 + 2^{(1-1/p)}\right],
\end{aligned}$$


where the third and fourth inequalities hold by a result similar to (15), namely
$$\frac{|a|^{p-1}|b|^{p-1}}{\|(a,b)\|_p^{2p-2}} \le 1.$$

If $p$ is odd and $(a,b) \ne (0,0)$, then we obtain
$$\begin{aligned}
\nabla^2_{aa}\psi_p(a,b) &= \left(\frac{\operatorname{sgn}(a)\cdot a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)^2 + \frac{\operatorname{sgn}(a)\operatorname{sgn}(b)\,(p-1)a^{p-2}b^{p}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p - (a+b)\Big),\\[4pt]
\nabla^2_{ab}\psi_p(a,b) = \nabla^2_{ba}\psi_p(a,b) &= \left(\frac{\operatorname{sgn}(a)\cdot a^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)\left(\frac{\operatorname{sgn}(b)\cdot b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right) - \frac{\operatorname{sgn}(a)\operatorname{sgn}(b)\,(p-1)a^{p-1}b^{p-1}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p - (a+b)\Big),\\[4pt]
\nabla^2_{bb}\psi_p(a,b) &= \left(\frac{\operatorname{sgn}(b)\cdot b^{p-1}}{\|(a,b)\|_p^{p-1}} - 1\right)^2 + \frac{\operatorname{sgn}(a)\operatorname{sgn}(b)\,(p-1)a^{p}b^{p-2}}{\|(a,b)\|_p^{2p-1}}\Big(\|(a,b)\|_p - (a+b)\Big).
\end{aligned}$$

In fact, the upper bounds for $\nabla^2_{aa}\psi_p(a,b)$, $\nabla^2_{ab}\psi_p(a,b)$, $\nabla^2_{bb}\psi_p(a,b)$ remain the same, by following exactly the same steps as in the case where $p$ is even. Thus, there exists a constant $L > 0$, independent of $(a,b)$, such that
$$\|\nabla^2\psi_p(a,b)\| \le L, \qquad \forall\, (a,b) \ne (0,0).$$
Then, by Lemma 2.6, we have
$$\|\nabla\psi_p(a,b) - \nabla\psi_p(c,d)\| \le L\,\|(a,b) - (c,d)\|, \qquad (16)$$
for all $(a,b), (c,d) \in \mathbb{R}^2$ with $(0,0) \notin [(a,b),(c,d)]$. Moreover, (16) also holds in the case $(a,b) = (c,d) = (0,0)$ since $\nabla_a\psi_p(0,0) = \nabla_b\psi_p(0,0) = 0$. Therefore, we can assume $(a,b) \ne (0,0)$. From Lemma 2.3(c), $\psi_p$ is continuously differentiable for all $(a,b) \in \mathbb{R}^2$ with $\nabla\psi_p(0,0) = (0,0)$; then, using a continuity argument, we obtain that (16) remains true for all $(c,d) \in \mathbb{R}^2$. Thus, (16) holds for all $(a,b), (c,d) \in \mathbb{R}^2$, which says that $\nabla\psi_p$ is globally Lipschitz continuous. □

Proposition 3.2 The function ψp defined as in (8) is an SC1 function. Hence, if every Fi is an SC1 function, then the function Ψp given as (10) is also an SC1 function.

Proof. It is known from Lemma 2.3(c) that $\psi_p$ is continuously differentiable, so it remains to show that the gradient of $\psi_p$ is semismooth. From Lemma 3.1, $\nabla\psi_p$ is Lipschitz continuous, and hence strictly continuous (locally Lipschitz continuous). Therefore, to check the semismoothness of $\nabla\psi_p$, we only need to show that $\nabla\psi_p$ satisfies Lemma 2.1(b). More specifically, we only need to check semismoothness at $(0,0)$, because at other points $\nabla\psi_p$ is continuously differentiable (see the proof of Lemma 3.1), hence semismooth. For this purpose, we will


have to verify that the equation in Lemma 2.1(b) is satisfied, i.e., for any $h = (h_1, h_2) \in \mathbb{R}^2$ such that $\nabla\psi_p$ is differentiable at $(h_1, h_2)$, we have
$$\nabla\psi_p(h_1, h_2) - \nabla\psi_p(0, 0) - \nabla^2\psi_p(h_1, h_2)\cdot h = o(\|(h_1, h_2)\|). \qquad (17)$$
To prove (17), we consider two cases: $p$ even and $p$ odd.

When $p$ is even, we denote by $(\Xi_1, \Xi_2)$ the left-hand side of (17). Then we have
$$\begin{bmatrix} \Xi_1 \\ \Xi_2 \end{bmatrix}
:= \begin{bmatrix} k_1 \\ k_2 \end{bmatrix}\phi_p(h_1,h_2) - \begin{bmatrix} 0 \\ 0 \end{bmatrix}
- \begin{bmatrix}
k_1^2 + \dfrac{(p-1)h_1^{p-2}h_2^{p}}{\|(h_1,h_2)\|_p^{2p-1}}\,\phi_p(h_1,h_2) & k_1 k_2 - k_3\,\phi_p(h_1,h_2)\\[8pt]
k_1 k_2 - k_3\,\phi_p(h_1,h_2) & k_2^2 + \dfrac{(p-1)h_1^{p}h_2^{p-2}}{\|(h_1,h_2)\|_p^{2p-1}}\,\phi_p(h_1,h_2)
\end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}, \qquad (18)$$
where
$$k_1 = \frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1, \qquad
k_2 = \frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1, \qquad
k_3 = \frac{(p-1)h_1^{p-1}h_2^{p-1}}{\|(h_1,h_2)\|_p^{2p-1}}. \qquad (19)$$

By plugging (19) into (18) and writing out Ξ1 and Ξ2, we obtain that Ξ1 = 0 and Ξ2 = 0.

To see this, we compute Ξ1 as below:

$$\begin{aligned}
\Xi_1 &= \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\phi_p(h_1,h_2)
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)^2 h_1
- \frac{(p-1)h_1^{p-1}h_2^{p}}{\|(h_1,h_2)\|_p^{2p-1}}\,\phi_p(h_1,h_2)\\
&\qquad - \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_2
+ \frac{(p-1)h_1^{p-1}h_2^{p}}{\|(h_1,h_2)\|_p^{2p-1}}\,\phi_p(h_1,h_2)\\[4pt]
&= \phi_p(h_1,h_2)\left[\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1
- \frac{(p-1)h_1^{p-1}h_2^{p}}{\|(h_1,h_2)\|_p^{2p-1}}
+ \frac{(p-1)h_1^{p-1}h_2^{p}}{\|(h_1,h_2)\|_p^{2p-1}}\right]\\
&\qquad - \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)^2 h_1
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_2\\[4pt]
&= \phi_p(h_1,h_2)\left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)^2 h_1
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_2\\[4pt]
&= \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left[\phi_p(h_1,h_2)
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_1
- \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_2\right]\\[4pt]
&= \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left[\|(h_1,h_2)\|_p - \frac{h_1^{p} + h_2^{p}}{\|(h_1,h_2)\|_p^{p-1}}\right]
= \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\cdot 0 = 0,
\end{aligned}$$

where the second-to-last equality holds since $h_1^p + h_2^p = \|(h_1,h_2)\|_p^p$ when $p$ is even. Similarly,

$$\begin{aligned}
\Xi_2 &= \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\phi_p(h_1,h_2)
- \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)^2 h_2
- \frac{(p-1)h_1^{p}h_2^{p-1}}{\|(h_1,h_2)\|_p^{2p-1}}\,\phi_p(h_1,h_2)\\
&\qquad - \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_1
+ \frac{(p-1)h_1^{p}h_2^{p-1}}{\|(h_1,h_2)\|_p^{2p-1}}\,\phi_p(h_1,h_2)\\[4pt]
&= \phi_p(h_1,h_2)\left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)
- \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)^2 h_2
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_1\\[4pt]
&= \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left[\phi_p(h_1,h_2)
- \left(\frac{h_1^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_1
- \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right) h_2\right]\\[4pt]
&= \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\left[\|(h_1,h_2)\|_p - \frac{h_1^{p} + h_2^{p}}{\|(h_1,h_2)\|_p^{p-1}}\right]
= \left(\frac{h_2^{p-1}}{\|(h_1,h_2)\|_p^{p-1}} - 1\right)\cdot 0 = 0,
\end{aligned}$$

where the second-to-last equality holds since $h_1^p + h_2^p = \|(h_1,h_2)\|_p^p$ when $p$ is even. The above two expressions for $\Xi_1$ and $\Xi_2$ imply that (17) is satisfied. Thus, $\nabla\psi_p$ is semismooth at $(0,0)$ in the case where $p$ is even.

For $p$ odd, following the same arguments leads to the same verification. Therefore, we have shown that $\nabla\psi_p$ is semismooth, and hence $\psi_p$ is an SC1 function. The second statement follows immediately from this result. □

We point out that, for $p = 2$, $\psi_p$ was already proved to be an SC1 function in [5, 6] (indeed, it was first formally shown in [6]). Prop. 3.2 is a general extension for any


p ≥ 2 and its proof is much more complicated than the case of p = 2. In addition to SC1 functions, we also introduce LC1 functions here.

Definition 3.2 A function f : IRn → IR is called an LC1 function if f is continuously differentiable and its gradient is locally Lipschitz continuous.

The class of LC1 minimization problems was studied in [21], where the local superlinear convergence of an approximate Newton method was established under a semismoothness assumption on the gradient function at a solution point. It is obvious that any SC1 function is an LC1 function. With the results of Lemma 3.1 and Prop. 3.2, we therefore have the following corollaries.

Corollary 3.1 If every $F_i$ is an LC1 function, then the function $\Phi_p$ given as in (9) is strongly semismooth.

Proof. We know that $\phi_p$ is semismooth; indeed, it is strongly semismooth. This can be seen from Lemma 2.2(c), Lemma 3.1, and Theorem 7 of [23]. Also, every LC1 function is strongly semismooth. Thus, the result follows. □

Corollary 3.2 The function ψp defined as in (8) is an LC1 function. Hence, if every Fi is an LC1 function, then the function Ψp given as (10) is also an LC1 function.

Some other notions related to semismooth functions are piecewise smooth and almost smooth functions. It is well known that piecewise smooth functions are examples of semismooth functions, and other examples of semismooth functions that are not piecewise smooth have emerged recently; see [23] and references therein. In particular, these examples include the p-norm function with $1 < p < \infty$ defined on $\mathbb{R}^n$ with $n \ge 2$, the Euclidean norm function, pseudo-smooth NCP-functions, smoothing functions, etc. To close this section, we point out that the NCP-function studied in this paper and in [1] is indeed strongly almost smooth, since it is based on the p-norm function. We briefly state the definition of almost smooth functions and the result below.

Definition 3.3 The almost smooth (respectively, strongly almost smooth) functions are functions that are semismooth (respectively, strongly semismooth) on the whole space $\mathbb{R}^n$ and smooth everywhere except on sets with "dimension" less than $n - 1$, in the sense that the sets do not locally partition $\mathbb{R}^n$ into multiple connected components.

By applying Lemma 2.2(c), Lemma 3.1, and a result in [23], we immediately obtain an interesting property related to the strong almost smoothness of $\Phi_p$. For more details regarding almost smooth and strongly almost smooth functions, please refer to the recent paper [23].


Proposition 3.3 If every $F_i$ is an LC1 function, then the function $\Phi_p$ defined as in (9) is a strongly almost smooth function.

Proof. This result follows from Lemma 2.2(c), Prop. 3.1, and Theorem 7 of [23]. □

4 A descent method

In this section, we study a descent method that is almost the same as the one in Sec. 4 of [1] for solving the unconstrained minimization (11); it does not require the derivatives of the mapping $F$ involved in the NCP. In fact, we consider the same search direction for the algorithm as in [1]:

$$d^k := -\nabla_b\psi_p(x^k, F(x^k)), \qquad (20)$$
except that the way of obtaining the step-size is slightly different (see Step 3). Such a way of finding the step-size can also be found in the literature, for instance in [11]. Using the property that $\nabla\psi_p$ is globally Lipschitz continuous (see Lemma 3.1), we obtain an alternative proof of the convergence result for the same descent method considered in [1]. We state the detailed steps below.

Algorithm 4.1

(Step 0) Choose x0 ∈ IRn, ε ≥ 0, σ ∈ (0, 1), β ∈ (0, 1) and set k := 0.

(Step 1) If Ψp(xk) ≤ ε, then Stop.

(Step 2) Let
$$d^k := -\nabla_b\psi_p(x^k, F(x^k)).$$

(Step 3) Compute the step-size $t_k := \beta^{m_k}$, where $m_k$ is the smallest nonnegative integer $m$ satisfying the Armijo-type condition
$$\Psi_p(x^k + \beta^m d^k) \le \Psi_p(x^k) - \sigma\,\beta^{2m}\|d^k\|^2. \qquad (21)$$

(Step 4) Set $x^{k+1} := x^k + t_k d^k$, $k := k + 1$, and go to Step 1.
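A minimal Python sketch of Algorithm 4.1 is given below, purely as an illustration of how Steps 0-4 fit together; NumPy is assumed, and the parameter values as well as the linear test problem $F(x) = Mx + q$ (with a strongly monotone $M$) are hypothetical choices, not from the paper.

```python
import numpy as np

def phi_p(a, b, p):
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def grad_b_psi_p(a, b, p):
    """Partial derivative of psi_p with respect to b, cf. (12)/(13); (a, b) != (0, 0)."""
    r = (abs(a)**p + abs(b)**p)**(1.0 / p)
    return (np.sign(b) * abs(b)**(p - 1) / r**(p - 1) - 1.0) * phi_p(a, b, p)

def Psi_p(x, F, p):
    Fx = F(x)
    return 0.5 * sum(phi_p(xi, fi, p)**2 for xi, fi in zip(x, Fx))

def descent_method(F, x0, p=3, eps=1e-8, sigma=1e-4, beta=0.5, max_iter=5000):
    """Derivative-free descent sketch following Steps 0-4 of Algorithm 4.1."""
    x = np.asarray(x0, dtype=float)                            # Step 0
    for _ in range(max_iter):
        if Psi_p(x, F, p) <= eps:                              # Step 1
            break
        Fx = F(x)
        d = -np.array([grad_b_psi_p(xi, fi, p)                 # Step 2, direction (20)
                       for xi, fi in zip(x, Fx)])
        m = 0                                                  # Step 3, Armijo-type rule (21)
        while Psi_p(x + beta**m * d, F, p) > Psi_p(x, F, p) - sigma * beta**(2 * m) * (d @ d):
            m += 1
        x = x + beta**m * d                                    # Step 4
    return x

# Hypothetical strongly monotone test problem (not from the paper): F(x) = M x + q.
M = np.array([[4.0, 1.0],
              [1.0, 3.0]])
q = np.array([-2.0, 1.0])
F = lambda x: M @ x + q
x_star = descent_method(F, x0=np.ones(2))
print(x_star, Psi_p(x_star, F, 3))   # x_star should approach the NCP solution (0.5, 0)
```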

We wish to show a global convergence result for Algorithm 4.1 under the assumption that $F$ is strongly monotone. The following lemmas, together with Lemma 3.1, will enable the convergence result for the algorithm. In what follows, we assume that the parameter $\varepsilon$ used in Algorithm 4.1 is set to zero and that Algorithm 4.1 generates an infinite sequence $\{x^k\}$.

Lemma 4.1 ([1, Lem. 4.1]) Let $x^k \in \mathbb{R}^n$ and let $F$ be a monotone function. Then the search direction defined in (20) satisfies the descent condition $\nabla\Psi_p(x^k)^T d^k < 0$ as long as $x^k$ is not a solution of the NCP. Moreover, if $F$ is strongly monotone with modulus $\mu > 0$, then
$$\nabla\Psi_p(x^k)^T d^k \le -\mu\|d^k\|^2.$$
