© World Scientific Publishing Co. & Operational Research Society of Singapore

ON SOME NCP-FUNCTIONS BASED ON THE GENERALIZED FISCHER–BURMEISTER FUNCTION

JEIN-SHAN CHEN Department of Mathematics National Taiwan Normal University

Taipei, Taiwan 11677 jschen@math.ntnu.edu.tw

Received 4 June 2005 Revised 18 November 2005 Second Revised 11 January 2006

In this paper, we study several NCP-functions for the nonlinear complementarity problem (NCP) that are based on the generalized Fischer–Burmeister function, φ_p(a, b) = ‖(a, b)‖_p − (a + b). It is well known that the NCP can be reformulated as an equivalent unconstrained minimization by means of merit functions involving NCP-functions. Thus, we investigate some important properties of these NCP-functions that will be used in solving and analyzing the reformulation of the NCP.

Keywords: NCP-function; complementarity; merit function; bounded level sets; stationary point.

1. Introduction

The nonlinear complementarity problem (NCP) (Harker and Pang, 1990; Pang, 1994) is to find a point x ∈ Rⁿ such that

\[ x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \tag{1.1} \]

where ⟨·, ·⟩ is the Euclidean inner product and F = (F_1, F_2, . . . , F_n)^T maps from Rⁿ to Rⁿ. We assume that F is continuously differentiable throughout this paper. The NCP has attracted much attention due to its various applications in operations research, economics, and engineering (Ferris and Pang, 1997; Harker and Pang, 1990; Pang, 1994).

There have been many methods proposed for solving the NCP (Harker and Pang, 1990; Pang, 1994). Among them, one of the most popular and powerful approaches, which has been studied intensively in recent years, is to reformulate the NCP as a system of nonlinear equations (Mangasarian, 1976) or as an unconstrained minimization problem (Facchinei and Soares, 1997; Fischer, 1992; Kanzow, 1996). A function that can constitute an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP. In constructing a merit function, the class of so-called NCP-functions, defined below, plays an important role.

A function φ : R² → R is called an NCP-function if it satisfies

\[ \phi(a, b) = 0 \iff a \ge 0,\ b \ge 0,\ ab = 0. \tag{1.2} \]

Many NCP-functions and merit functions have been explored during the past two decades (De Luca et al., 1996; Kanzow et al., 1997; Sun and Qi, 1999; Tseng, 1996). Among them, a popular NCP-function that has been studied intensively in recent years is the well-known Fischer–Burmeister NCP-function (Fischer, 1992, 1997) defined as

\[ \phi_{FB}(a, b) = \sqrt{a^2 + b^2} - (a + b). \tag{1.3} \]

With the above characterization of φ_FB, the NCP is equivalent to a system of nonsmooth equations:

\[
\Phi_{FB}(x) =
\begin{pmatrix}
\phi_{FB}(x_1, F_1(x)) \\
\vdots \\
\phi_{FB}(x_n, F_n(x))
\end{pmatrix}
= 0. \tag{1.4}
\]

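For each NCP-function, there is a natural merit function. For φ_FB it is Ψ_FB : Rⁿ → R_+ given by

\[ \Psi_{FB}(x) := \frac{1}{2} \|\Phi_{FB}(x)\|^2 = \frac{1}{2} \sum_{i=1}^{n} \phi_{FB}(x_i, F_i(x))^2, \tag{1.5} \]

from which the NCP can be recast as an unconstrained minimization:

\[ \min_{x \in \mathbb{R}^n} \Psi_{FB}(x). \tag{1.6} \]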
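The reformulation (1.3)–(1.6) can be sketched numerically as follows. This is only a minimal illustration: the affine map `F_example` and the point `x_star` are hypothetical choices made for the example and do not come from the paper.

```python
import math

def phi_fb(a, b):
    """Fischer-Burmeister function (1.3): sqrt(a^2 + b^2) - (a + b)."""
    return math.hypot(a, b) - (a + b)

def merit_fb(x, F):
    """Natural merit function (1.5): half the squared norm of Phi_FB(x)."""
    return 0.5 * sum(phi_fb(xi, fi) ** 2 for xi, fi in zip(x, F(x)))

def F_example(x):
    """Hypothetical affine map F(x) = M x + q, used only for illustration."""
    return [2.0 * x[0] + x[1] - 2.0, x[0] + 2.0 * x[1] + 1.0]

# x_star = (1, 0) gives F(x_star) = (0, 2), so x >= 0, F(x) >= 0 and <x, F(x)> = 0.
x_star = [1.0, 0.0]
print(merit_fb(x_star, F_example))      # vanishes at a solution of the NCP
print(merit_fb([0.0, 0.0], F_example))  # strictly positive at a non-solution
```

The merit value is zero exactly at solutions of (1.1), which is what makes (1.6) an equivalent formulation.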

In this paper, we are particularly interested in the generalized Fischer–Burmeister function, i.e., φ_p : R² → R given by

\[ \phi_p(a, b) := \|(a, b)\|_p - (a + b), \tag{1.7} \]

where p is a positive integer greater than one and ‖(a, b)‖_p = (|a|^p + |b|^p)^{1/p} denotes the p-norm of (a, b). Notice that φ_p reduces to the well-known Fischer–Burmeister function φ_FB when p = 2; its related properties were recently presented in (Chen and Pan, 2006; Chen, 2006). Corresponding to φ_p, we define ψ_p : R² → R_+ by

\[ \psi_p(a, b) := \frac{1}{2} |\phi_p(a, b)|^2. \tag{1.8} \]

Then both φ_p and ψ_p are NCP-functions and yield a merit function

\[ \Psi_p(x) := \sum_{i=1}^{n} \psi_p(x_i, F_i(x)) = \frac{1}{2} \sum_{i=1}^{n} \phi_p(x_i, F_i(x))^2, \tag{1.9} \]

from which the NCP can be reformulated as an unconstrained minimization:

\[ \min_{x \in \mathbb{R}^n} \Psi_p(x). \tag{1.10} \]
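As a small sketch of definitions (1.7)–(1.9), the following code evaluates φ_p, ψ_p, and Ψ_p for a general integer p > 1. The function names are our own, and the map `F` passed to `merit_p` is assumed to return a list of the same length as `x`.

```python
def phi_p(a, b, p=2):
    """Generalized Fischer-Burmeister function (1.7): ||(a, b)||_p - (a + b)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def psi_p(a, b, p=2):
    """Squared version (1.8): 0.5 * |phi_p(a, b)|^2."""
    return 0.5 * phi_p(a, b, p) ** 2

def merit_p(x, F, p=2):
    """Merit function (1.9): sum of psi_p over the pairs (x_i, F_i(x))."""
    return sum(psi_p(xi, fi, p) for xi, fi in zip(x, F(x)))

# For p = 2 the function reduces to the Fischer-Burmeister function.
print(phi_p(3.0, 4.0, 2))   # 5 - 7
# For any p, phi_p vanishes on complementary pairs such as (2, 0).
print(phi_p(2.0, 0.0, 3))
```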

However, the (generalized) Fischer–Burmeister function and some of its variants have limitations when dealing with monotone complementarity problems. In particular, the natural merit function Ψ_p does not guarantee bounded level sets for this important class of problems (see page 4 of Chen et al., 2000). Some modifications of the Fischer–Burmeister function have been proposed to overcome this difficulty; see (Kanzow et al., 1997; Sun and Qi, 1999). In this paper, we extend these modifications to the generalized Fischer–Burmeister function φ_p. More specifically, we study the following NCP-functions:

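\[
\begin{aligned}
\phi_1(a, b) &:= \phi_p(a, b) - \alpha a_+ b_+, & \alpha > 0, \\
\phi_2(a, b) &:= \phi_p(a, b) - \alpha (ab)_+, & \alpha > 0, \\
\phi_3(a, b) &:= \sqrt{[\phi_p(a, b)]^2 + \alpha (a_+ b_+)^2}, & \alpha > 0, \\
\phi_4(a, b) &:= \sqrt{[\phi_p(a, b)]^2 + \alpha [(ab)_+]^2}, & \alpha > 0.
\end{aligned}
\tag{1.11}
\]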

The function φ_1 is called the penalized Fischer–Burmeister function when p = 2 and was studied in (Chen et al., 2000). The functions φ_2, φ_3, φ_4 generalize the merit functions of the case p = 2, which were discussed in Sun and Qi (1999) and Yamada et al. (2000). Note that for i = 1, 2, 3, 4, we have

\[ \phi_i(a, b) \equiv \phi_p(a, b) \tag{1.12} \]

for all (a, b) ∈ N_− (this notation is used in Sun and Qi, 1999), where

\[ N_- := \{(a, b) \mid ab \le 0\}. \tag{1.13} \]

Thus, the functions φ_i, i = 1, 2, 3, 4, differ only in the first or third quadrant.
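The four functions in (1.11) and the identity (1.12) can be checked numerically with a short sketch. The grid, the value p = 3, and α = 0.5 are arbitrary choices made for illustration only.

```python
import math

def plus(z):
    """Positive part z_+ = max(z, 0)."""
    return max(z, 0.0)

def phi_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def phi_1(a, b, p, alpha):
    return phi_p(a, b, p) - alpha * plus(a) * plus(b)

def phi_2(a, b, p, alpha):
    return phi_p(a, b, p) - alpha * plus(a * b)

def phi_3(a, b, p, alpha):
    return math.sqrt(phi_p(a, b, p) ** 2 + alpha * (plus(a) * plus(b)) ** 2)

def phi_4(a, b, p, alpha):
    return math.sqrt(phi_p(a, b, p) ** 2 + alpha * plus(a * b) ** 2)

def max_gap_on_N_minus(p=3, alpha=0.5, n=21):
    """Largest |phi_i - phi_p| over grid points with ab <= 0, i = 1..4."""
    grid = [-2.0 + 4.0 * i / (n - 1) for i in range(n)]
    gap = 0.0
    for a in grid:
        for b in grid:
            if a * b <= 0:
                base = phi_p(a, b, p)
                for f in (phi_1, phi_2, phi_3, phi_4):
                    gap = max(gap, abs(f(a, b, p, alpha) - base))
    return gap

print(max_gap_on_N_minus())  # of the order of rounding error
```

Off N_−, for instance at (1, 1), the penalty terms are active and the functions no longer coincide with φ_p.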

Similarly, for each φ_i there is an associated ψ_i : R² → R_+ given by

\[ \psi_i(a, b) := \frac{1}{2} |\phi_i(a, b)|^2, \quad i = 1, 2, 3, 4, \tag{1.14} \]

which is also an NCP-function for every i. Moreover, for φ ∈ {φ_1, φ_2, φ_3, φ_4}, we can define

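\[
\Phi(x) =
\begin{pmatrix}
\phi(x_1, F_1(x)) \\
\vdots \\
\phi(x_n, F_n(x))
\end{pmatrix},
\tag{1.15}
\]

from which the NCP is equivalent to the unconstrained minimization:

\[ \min_{x \in \mathbb{R}^n} \Psi(x), \tag{1.16} \]

where

\[ \Psi(x) := \frac{1}{2} \|\Phi(x)\|^2 = \frac{1}{2} \sum_{i=1}^{n} \phi(x_i, F_i(x))^2 \tag{1.17} \]

is the natural merit function corresponding to φ ∈ {φ_1, φ_2, φ_3, φ_4}.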

The paper is organized as follows. In Section 2, we review some background definitions, including monotonicity, P0-functions, and semismoothness, as well as known results about Ψ_p and its related properties. In Section 3, we show that all (φ_i)², i ∈ {1, 2, 3, 4}, are continuously differentiable and investigate properties of the merit function Ψ constructed via φ_i with i ∈ {1, 2, 3, 4}. In particular, it provides bounded level sets for a monotone NCP with a strictly feasible point. In addition, we give conditions under which a stationary point of Ψ is a solution of the NCP. In general, the analytic techniques used in this paper are similar to those in Chen et al. (2000), Facchinei and Soares (1997), and Sun and Qi (1999), since this work can be regarded as an extension of the NCP-functions studied in those papers.

Throughout this paper, Rⁿ denotes the space of n-dimensional real column vectors and ^T denotes transpose. For any differentiable function f : Rⁿ → R, ∇f(x) denotes the gradient of f at x. For any differentiable mapping F = (F_1, . . . , F_m)^T : Rⁿ → R^m, ∇F(x) = [∇F_1(x) · · · ∇F_m(x)] denotes the transposed Jacobian of F at x. We denote by ‖x‖_p the p-norm of x and by ‖x‖ the Euclidean norm of x. Throughout the paper, we assume that p is a positive integer greater than one.

2. Preliminaries

In this section, we recall some background concepts and materials which will play an important role in the subsequent analysis. Let F : Rⁿ → Rⁿ. Then,

(1) F is monotone if ⟨x − y, F(x) − F(y)⟩ ≥ 0 for all x, y ∈ Rⁿ.

(2) F is strictly monotone if ⟨x − y, F(x) − F(y)⟩ > 0 for all x, y ∈ Rⁿ with x ≠ y.

(3) F is strongly monotone with modulus µ > 0 if ⟨x − y, F(x) − F(y)⟩ ≥ µ‖x − y‖² for all x, y ∈ Rⁿ.

(4) F is a P0-function if max_{1≤i≤n, x_i≠y_i} (x_i − y_i)(F_i(x) − F_i(y)) ≥ 0 for all x, y ∈ Rⁿ with x ≠ y.

(5) F is a P-function if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) > 0 for all x, y ∈ Rⁿ with x ≠ y.

(6) F is a uniform P-function with modulus µ > 0 if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) ≥ µ‖x − y‖² for all x, y ∈ Rⁿ.

(7) F is an R0-function if for every sequence {x^k} satisfying ‖x^k‖ → ∞, lim inf_{k→∞} min_i x_i^k / ‖x^k‖ ≥ 0, and lim inf_{k→∞} min_i F_i(x^k) / ‖x^k‖ ≥ 0, there exists an index j such that {x_j^k} → ∞ and {F_j(x^k)} → ∞.

It is clear that strongly monotone functions are strictly monotone, and strictly monotone functions are monotone. Moreover, F is a P0-function if F is monotone, and F is a uniform P-function with modulus µ > 0 if F is strongly monotone with modulus µ > 0. In addition, when F is continuously differentiable, we have the following: (i) F is monotone if and only if ∇F(x) is positive semidefinite for all x ∈ Rⁿ. (ii) F is strictly monotone if ∇F(x) is positive definite for all x ∈ Rⁿ. (iii) F is strongly monotone if and only if ∇F(x) is uniformly positive definite. An R0-function can be viewed as a generalization of a uniform P-function, since every uniform P-function is an R0-function (see Chen and Harker, 1997, Proposition 3.11).

A matrix M ∈ R^{n×n} is a P0-matrix if each of its principal minors is nonnegative, and it is a P-matrix if each of its principal minors is positive. In addition, it is said to be an R0-matrix if the following system has only the zero solution:

\[ x \ge 0, \qquad M_i x = 0 \ \text{ if } x_i > 0, \qquad M_i x \ge 0 \ \text{ if } x_i = 0. \]

It is obvious that every P-matrix is also a P0-matrix, and it is known that the Jacobian of every continuously differentiable P0-function is a P0-matrix. For more properties of P-matrices and P0-matrices, please refer to Facchinei and Pang (2003).

It is also known that, when F is an affine function, say F(x) = Mx + q, F is an R0-function if and only if M is an R0-matrix (see Chen and Harker, 1997, Proposition 3.10).
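The principal-minor definitions of P-matrices and P0-matrices above can be tested directly for small matrices. The sketch below enumerates all principal minors by brute force; it is meant only as an illustration of the definition (the cost grows exponentially with n), and the function names are our own.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def principal_minors(m):
    """Yield the determinant of every principal submatrix of m."""
    n = len(m)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            yield det([[m[i][j] for j in idx] for i in idx])

def is_P_matrix(m):
    return all(d > 0 for d in principal_minors(m))

def is_P0_matrix(m):
    return all(d >= 0 for d in principal_minors(m))

print(is_P_matrix([[2.0, 1.0], [1.0, 2.0]]))   # minors 2, 2, 3
print(is_P0_matrix([[0.0, 1.0], [0.0, 0.0]]))  # minors 0, 0, 0
```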

Next, we recall the definition of semismoothness. First, we say that F is strictly continuous (also called "locally Lipschitz continuous") at x ∈ Rⁿ (Rockafellar and Wets, 1998, Chapter 9) if there exist scalars κ > 0 and δ > 0 such that

\[ \|F(y) - F(z)\| \le \kappa \|y - z\| \quad \forall y, z \in \mathbb{R}^n \ \text{with} \ \|y - x\| \le \delta,\ \|z - x\| \le \delta; \]

and F is strictly continuous if F is strictly continuous at every x ∈ Rⁿ. If δ can be taken to be ∞, then F is Lipschitz continuous with Lipschitz constant κ. We say F is directionally differentiable at x ∈ Rⁿ if

\[ F'(x; h) := \lim_{t \to 0^+} \frac{F(x + th) - F(x)}{t} \quad \text{exists} \ \forall h \in \mathbb{R}^n; \]

and F is directionally differentiable if F is directionally differentiable at every x ∈ Rⁿ.

Assume F : Rⁿ → R^m is strictly continuous. We say F is semismooth at x if F is directionally differentiable at x and, for any V ∈ ∂F(x + h) (the generalized Jacobian), we have

\[ F(x + h) - F(x) - Vh = o(\|h\|). \]

We say F is ρ-order semismooth at x (0 < ρ < ∞) if F is semismooth at x and, for any V ∈ ∂F(x + h), we have

\[ F(x + h) - F(x) - Vh = O(\|h\|^{1+\rho}). \]

We say F is semismooth (respectively, ρ-order semismooth) if F is semismooth (respectively, ρ-order semismooth) at every x ∈ Rⁿ. We say F is strongly semismooth if it is 1-order semismooth. Convex functions and piecewise continuously differentiable functions are examples of semismooth functions. Examples of strongly semismooth functions include piecewise linear functions and LC¹ functions, meaning smooth functions whose gradients are locally Lipschitz continuous (strictly continuous) (Facchinei and Soares, 2003; Qi, 1994). The composition of two (respectively, ρ-order) semismooth functions is also a (respectively, ρ-order) semismooth function.

The property of semismoothness plays an important role in nonsmooth Newton methods (Qi, 1993; Qi and Sun, 1993) as well as in some smoothing methods mentioned in Section 1. For extensive discussions of semismooth functions, see Fischer (1997), Mifflin (1977), and Qi and Sun (1993).

To end this section, we collect some useful properties of φ_p and ψ_p, defined as in (1.7) and (1.8), respectively, that will be used in the subsequent analysis. All the proofs can be found in Chen and Pan (2006).

Property 2.1 (Chen and Pan, 2006, Proposition 3.1, Lemma 3.1). Let φ_p : R² → R be defined as (1.7). Then

(a) φ_p is an NCP-function, i.e., it satisfies (1.2).

(b) φ_p is sub-additive, i.e., φ_p(w + w′) ≤ φ_p(w) + φ_p(w′) for all w, w′ ∈ R².

(c) φ_p is positive homogeneous, i.e., φ_p(αw) = αφ_p(w) for all w ∈ R² and α ≥ 0.

(d) φ_p is convex, i.e., φ_p(αw + (1 − α)w′) ≤ αφ_p(w) + (1 − α)φ_p(w′) for all w, w′ ∈ R² and α ∈ [0, 1].

(e) φ_p is Lipschitz continuous with L_1 = 1 + √2, i.e., |φ_p(w) − φ_p(w′)| ≤ L_1‖w − w′‖; or with L_2 = 1 + 2^(1−1/p), i.e., |φ_p(w) − φ_p(w′)| ≤ L_2‖w − w′‖_p, for all w, w′ ∈ R².

(f) φ_p is semismooth.

(g) If {(a^k, b^k)} ⊆ R² with (a^k → −∞) or (b^k → −∞) or (a^k → ∞ and b^k → ∞), then |φ_p(a^k, b^k)| → ∞ as k → ∞.
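Parts (b), (c), and (e) of Property 2.1 lend themselves to a quick randomized sanity check. The sketch below is not a proof; the sampling box [−5, 5], the number of trials, and the seed are arbitrary assumptions for the example.

```python
import math
import random

def phi_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def check_property_2_1(p=3, trials=1000, seed=0):
    """Return the worst observed violation of (b), (c), and (e) with L1 = 1 + sqrt(2)."""
    rng = random.Random(seed)
    L1 = 1.0 + math.sqrt(2.0)
    worst = {"subadd": 0.0, "homog": 0.0, "lipschitz": 0.0}
    for _ in range(trials):
        a, b, c, d = (rng.uniform(-5.0, 5.0) for _ in range(4))
        t = rng.uniform(0.0, 3.0)
        # (b): phi_p(w + w') - phi_p(w) - phi_p(w') should not be positive
        worst["subadd"] = max(worst["subadd"],
                              phi_p(a + c, b + d, p) - phi_p(a, b, p) - phi_p(c, d, p))
        # (c): phi_p(t w) should equal t * phi_p(w) for t >= 0
        worst["homog"] = max(worst["homog"],
                             abs(phi_p(t * a, t * b, p) - t * phi_p(a, b, p)))
        # (e): |phi_p(w) - phi_p(w')| - L1 * ||w - w'|| should not be positive
        worst["lipschitz"] = max(worst["lipschitz"],
                                 abs(phi_p(a, b, p) - phi_p(c, d, p))
                                 - L1 * math.hypot(a - c, b - d))
    return worst

print(check_property_2_1())  # all entries should be at rounding-error level
```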

Property 2.2 (Chen and Pan, 2006, Proposition 3.2). Let φ_p, ψ_p be deﬁned as (1.7) and (1.8), respectively. Then

(a) ψp is an NCP-function, i.e. it satisﬁes (1.2).

(b) ψp(a, b) ≥ 0 for all (a, b) ∈ R².

(c) ψ_p is continuously diﬀerentiable everywhere.

(d) ∇aψp(a, b)·∇bψp(a, b) ≥ 0 for all (a, b) ∈ R². The equality holds⇔ φp(a, b) = 0.

(e) ∇_aψ_p(a, b) = 0 ⇔ ∇_bψ_p(a, b) = 0 ⇔ φ_p(a, b) = 0.
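Property 2.2(c)–(d) can be illustrated with the chain rule ∇ψ_p = φ_p · ∇φ_p. In the sketch below, the gradient of φ_p away from the origin is written as sgn(t)|t|^(p−1) / ‖(a, b)‖_p^(p−1) − 1, which covers both even and odd p; the grid and the value p = 3 are assumptions for the example.

```python
import math

def phi_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def psi_p(a, b, p):
    return 0.5 * phi_p(a, b, p) ** 2

def grad_psi_p(a, b, p):
    """Gradient of psi_p; it is zero at the origin because phi_p(0, 0) = 0."""
    if a == 0.0 and b == 0.0:
        return (0.0, 0.0)
    n = (abs(a) ** p + abs(b) ** p) ** ((p - 1.0) / p)
    ga = math.copysign(abs(a) ** (p - 1), a) / n - 1.0
    gb = math.copysign(abs(b) ** (p - 1), b) / n - 1.0
    f = phi_p(a, b, p)
    return (f * ga, f * gb)

def min_gradient_product(p=3, n=25, lo=-3.0, hi=3.0):
    """Smallest value of grad_a(psi_p) * grad_b(psi_p) over a uniform grid."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    worst = float("inf")
    for a in grid:
        for b in grid:
            ga, gb = grad_psi_p(a, b, p)
            worst = min(worst, ga * gb)
    return worst

print(min_gradient_product())  # nonnegative up to rounding, as in Property 2.2(d)
```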

From these properties, it was proved in Chen and Pan (2006) that Ψ_p(x) ≥ 0 for all x ∈ Rⁿ and Ψ_p(x) = 0 if and only if x solves the NCP (1.1), where Ψ_p : Rⁿ → R is defined as (1.9). Moreover, suppose that the NCP has at least one solution. Then x is a global minimizer of Ψ_p if and only if x solves the NCP. In addition, it was also shown in Chen and Pan (2006) that if F is either monotone or a P0-function, then every stationary point of Ψ_p is a global minimizer of (1.10) and therefore solves the original NCP. We will investigate the analogous results for the merit function Ψ, which is based on the φ_i studied in this paper. On the other hand, as mentioned, the natural merit function induced by the generalized Fischer–Burmeister function (which behaves like the Fischer–Burmeister function) does not guarantee bounded level sets under the assumption that F is monotone. Instead, F needs to be strongly monotone or a uniform P-function to ensure that this property holds. Another main purpose of this work is to obtain the same result for the merit function Ψ studied in this paper under the weaker assumption that F is only monotone (see Section 4).

3. Properties of φ and ψ

In this section, we investigate properties of φ ∈ {φ_1, φ_2, φ_3, φ_4} and ψ ∈ {ψ_1, ψ_2, ψ_3, ψ_4} defined as in (1.11) and (1.14), respectively. These include strong semismoothness of φ and continuous differentiability of ψ. First, we denote

\[ N_\phi := \{(a, b) \mid a \ge 0,\ b \ge 0,\ ab = 0\}. \tag{3.1} \]

This notation is adopted from Chen et al. (2000), and it is easy to see that (a, b) ∈ N_φ if and only if (a, b) satisfies (1.2). Now we are ready to show the favorable properties of φ and ψ.

Proposition 3.1. Let φ ∈ {φ_1, φ_2, φ_3, φ_4} be defined as in (1.11). Then

(a) φ(a, b) = 0 ⇔ (a, b) ∈ N_φ ⇔ (a, b) satisfies (1.2).

(b) φ is strongly semismooth.

(c) Let {a^k}, {b^k} ⊆ R be any two sequences such that either a^k_+ b^k_+ → ∞ or a^k → −∞ or b^k → −∞. Then |φ(a^k, b^k)| → ∞ as k → ∞.

Proof. (a) It is enough to prove the first equivalence. Suppose φ(a, b) = 0. For i = 2, 3, 4, φ_i(a, b) = 0 yields φ_p(a, b) = 0, which says (a, b) ∈ N_φ by Property 2.1(a). For i = 1, φ_1(a, b) = 0 implies φ_p(a, b) = αa_+b_+. Since α could be any arbitrary positive number, the above leads to φ_p(a, b) = a_+b_+ = 0, which says (a, b) ∈ N_φ by Property 2.1(a) again. On the other hand, suppose (a, b) ∈ N_φ; then φ_p(a, b) = 0 by Property 2.1(a). Since a ≥ 0, b ≥ 0, we obtain a_+b_+ = ab = 0. Hence φ_i(a, b) = 0 for all i = 1, 2, 3, 4.

(b) The verification of the strong semismoothness of φ is routine and can be done as in Lemma 1 of Yamada et al. (2000). We omit it.

(c) This follows from Property 2.1(g) and the definition of (·)_+.

Proposition 3.2. Let Φ be defined as in (1.15) with φ ∈ {φ_1, φ_2, φ_3, φ_4}. Then

(a) Φ is semismooth.

(b) Φ is strongly semismooth if every F_i is an LC¹ function.

Proof. By using Proposition 3.1(b) and the fact that every LC¹ function is strongly semismooth, the results follow.

The following is a technical lemma which describes the generalized gradients of all φ_i, i = 1, 2, 3, 4, defined as in Eq. (1.11). It will be used for proving Proposition 3.3.

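Lemma 3.1. Let φ_1, φ_2, φ_3, φ_4 be defined as (1.11).

(a) The generalized gradient ∂φ_1(a, b) of φ_1 at a point (a, b) is equal to the set of all (v_a, v_b) such that

\[
(v_a, v_b) =
\begin{cases}
\left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \alpha (b_+ \partial a_+,\ a_+ \partial b_+), & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is even}, \\[2ex]
\left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \alpha (b_+ \partial a_+,\ a_+ \partial b_+), & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is odd}, \\[2ex]
(\xi - 1,\ \zeta - 1), & \text{if } (a, b) = (0, 0),
\end{cases}
\tag{3.2}
\]

where (ξ, ζ) is any vector satisfying ‖(ξ, ζ)‖_p ≤ 1 and

\[
\partial z_+ =
\begin{cases}
1, & \text{if } z > 0, \\
[0, 1], & \text{if } z = 0, \\
0, & \text{if } z < 0.
\end{cases}
\]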

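(b) The generalized gradient ∂φ_2(a, b) of φ_2 at a point (a, b) is equal to the set of all (v_a, v_b) such that

\[
(v_a, v_b) =
\begin{cases}
\left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \alpha (b, a), & \text{if } (a, b) \ne (0, 0),\ ab > 0 \text{ and } p \text{ is even}, \\[2ex]
\left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \alpha (b, a) \cdot [0, 1], & \text{if } (a, b) \ne (0, 0),\ ab = 0 \text{ and } p \text{ is even}, \\[2ex]
\left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right), & \text{if } (a, b) \ne (0, 0),\ ab < 0 \text{ and } p \text{ is even}, \\[2ex]
\left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \alpha (b, a), & \text{if } (a, b) \ne (0, 0),\ ab > 0 \text{ and } p \text{ is odd}, \\[2ex]
\left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) - \alpha (b, a) \cdot [0, 1], & \text{if } (a, b) \ne (0, 0),\ ab = 0 \text{ and } p \text{ is odd}, \\[2ex]
\left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right), & \text{if } (a, b) \ne (0, 0),\ ab < 0 \text{ and } p \text{ is odd}, \\[2ex]
(\xi - 1,\ \zeta - 1) - \alpha (b, a) \cdot [0, 1], & \text{if } (a, b) = (0, 0),
\end{cases}
\tag{3.3}
\]

where (ξ, ζ) is any vector satisfying ‖(ξ, ζ)‖_p ≤ 1.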

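(c) φ_3 is continuously differentiable everywhere except at (0, 0) with

\[
\nabla_a \phi_3(a, b) =
\begin{cases}
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)(b_+)^2}{\phi_3(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is even}, \\[3ex]
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)(b_+)^2}{\phi_3(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is odd},
\end{cases}
\tag{3.4}
\]

\[
\nabla_b \phi_3(a, b) =
\begin{cases}
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)^2 (b_+)}{\phi_3(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is even}, \\[3ex]
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)^2 (b_+)}{\phi_3(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is odd},
\end{cases}
\tag{3.5}
\]

and ∂φ_3(0, 0) = (v_a, v_b) where (v_a, v_b) ∈ (−∞, ∞).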

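(d) φ_4 is continuously differentiable everywhere except at (0, 0) with

\[
\nabla_a \phi_4(a, b) =
\begin{cases}
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot b}{\phi_4(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is even}, \\[3ex]
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot b}{\phi_4(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is odd},
\end{cases}
\tag{3.6}
\]

\[
\nabla_b \phi_4(a, b) =
\begin{cases}
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot a}{\phi_4(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is even}, \\[3ex]
\dfrac{\phi_p(a, b) \cdot \left( \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot a}{\phi_4(a, b)}, & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is odd},
\end{cases}
\tag{3.7}
\]

and ∂φ_4(0, 0) = (v_a, v_b) where (v_a, v_b) ∈ (−∞, ∞).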

Proof. (a) First, we note that φ_p is continuously differentiable everywhere except at (0, 0) (see Chen and Pan, 2006). Hence, by the Corollary to Proposition 2.2.1 in Clarke (1983), φ_p is strictly differentiable everywhere except at the origin. Let φ_+(a, b) := a_+b_+. Then φ_+ is strictly differentiable at the origin, as proved in Proposition 2.1 of Chen et al. (2000). Since both φ_1 and φ_+ are strongly semismooth functions, we know that they are locally Lipschitz (strictly continuous) functions. Thus, Corollary 2 to Proposition 2.3.3 in Clarke (1983) yields

\[ \partial \phi_1(a, b) = \partial \phi_p(a, b) - \alpha \cdot \partial \phi_+(a, b). \]

On the other hand, the generalized gradient of φ_p can be verified as below (see Chen, 2004):

\[
\partial \phi_p(a, b) =
\begin{cases}
\left( \dfrac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right), & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is even}, \\[2ex]
\left( \dfrac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,\ \dfrac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right), & \text{if } (a, b) \ne (0, 0) \text{ and } p \text{ is odd}, \\[2ex]
(\xi - 1,\ \zeta - 1), & \text{if } (a, b) = (0, 0),
\end{cases}
\tag{3.8}
\]

where (ξ, ζ) is any vector satisfying ‖(ξ, ζ)‖_p ≤ 1. In addition, it was already shown in Proposition 2.1 of Chen et al. (2000) that

\[ \partial \phi_+(a, b) = (b_+ \partial a_+,\ a_+ \partial b_+). \]

Thus, the desired results follow.

(b) Following the same arguments as in part (a) and using the fact that

\[
\partial (ab)_+ =
\begin{cases}
(b, a), & \text{if } ab > 0, \\
(0, 0), & \text{if } ab < 0, \\
(b, a) \cdot [0, 1], & \text{if } ab = 0,
\end{cases}
\]

the desired results hold.

(c) It is known that (φ_p)² and (a_+b_+)² are continuously differentiable. Then the desired results follow by direct computations using the chain rule and the fact that

\[
\partial (\sqrt{z}) =
\begin{cases}
\dfrac{1}{2\sqrt{z}}, & \text{if } z > 0, \\[1.5ex]
(-\infty, \infty), & \text{if } z = 0.
\end{cases}
\]

(d) Same arguments as part (c).
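The gradient formulas (3.4)–(3.5) for φ_3 can be compared with central finite differences at a few sample points. The sketch below is an illustration under our own assumptions: it uses the unified form sgn(t)|t|^(p−1) for the numerator (which agrees with the even and odd cases of the lemma), it is only evaluated at points where φ_3(a, b) ≠ 0, and the sample points, p = 3, and α = 0.5 are arbitrary.

```python
import math

def plus(z):
    return max(z, 0.0)

def phi_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def phi_3(a, b, p, alpha):
    return math.sqrt(phi_p(a, b, p) ** 2 + alpha * (plus(a) * plus(b)) ** 2)

def grad_phi_p(a, b, p):
    """Gradient of phi_p for (a, b) != (0, 0)."""
    n = (abs(a) ** p + abs(b) ** p) ** ((p - 1.0) / p)
    return (math.copysign(abs(a) ** (p - 1), a) / n - 1.0,
            math.copysign(abs(b) ** (p - 1), b) / n - 1.0)

def grad_phi_3(a, b, p, alpha):
    """Formulas (3.4)-(3.5); requires phi_3(a, b) != 0."""
    ga, gb = grad_phi_p(a, b, p)
    fp, f3 = phi_p(a, b, p), phi_3(a, b, p, alpha)
    return ((fp * ga + alpha * plus(a) * plus(b) ** 2) / f3,
            (fp * gb + alpha * plus(a) ** 2 * plus(b)) / f3)

def central_diff(f, a, b, h=1e-6):
    """Central finite-difference approximation of the gradient of f at (a, b)."""
    return ((f(a + h, b) - f(a - h, b)) / (2 * h),
            (f(a, b + h) - f(a, b - h)) / (2 * h))

p, alpha = 3, 0.5
for a, b in [(1.0, 2.0), (-1.0, 0.5), (0.7, -1.3), (-2.0, -1.0)]:
    exact = grad_phi_3(a, b, p, alpha)
    approx = central_diff(lambda s, t: phi_3(s, t, p, alpha), a, b)
    print((a, b), exact, approx)
```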

Proposition 3.3. Let ψ ∈ {ψ_1, ψ_2, ψ_3, ψ_4} be defined as in (1.14). Then

(a) ψ(a, b) = 0 ⇔ (a, b) ∈ N_φ ⇔ (a, b) satisfies (1.2).

(b) ψ is continuously differentiable on R².

(c) ∇_aψ(a, b) · ∇_bψ(a, b) ≥ 0 for all (a, b) ∈ R².

(d) ψ(a, b) = 0 ⇔ ∇ψ(a, b) = 0 ⇔ ∇_aψ(a, b) = 0 ⇔ ∇_bψ(a, b) = 0.

Proof. (a) The proof is straightforward by the same arguments as in Proposition 3.1(a).

(b) The ideas for the proof are borrowed from Proposition 3.4 of Facchinei and Soares (1997).

For i = 1 and p even, ψ_1(a, b) = ½(φ_1(a, b))². By the chain rule (see Clarke, 1983, Theorem 2.2.4), we obtain ∂ψ_1(a, b) = ∂φ_1(a, b)^T φ_1(a, b). We will show that ∂φ_1(a, b)^T φ_1(a, b) is single-valued for all (a, b) ∈ R², because the zero of φ_1 cancels the multi-valued portion of ∂φ_1(a, b)^T. To see this, we discuss several cases below.

(i) If a > 0, b > 0, then (b_+∂a_+, a_+∂b_+) = (b, a), which is single-valued. Hence, by (3.2), it is easy to see that ∂φ_1(a, b)^T φ_1(a, b) is single-valued.

(ii) If a > 0, b < 0, then (b_+∂a_+, a_+∂b_+) = (0, a), which is single-valued. Hence, by (3.2), ∂φ_1(a, b)^T φ_1(a, b) is single-valued.

(iii) If a > 0, b = 0, then (b_+∂a_+, a_+∂b_+) = (0, a · [0, 1]), which is multi-valued. However, in this case, we observe that φ_1(a, b) = ‖(a, b)‖_p − (a + b) − αa_+b_+ = 0. Hence, ∂φ_1(a, b)^T φ_1(a, b) is still single-valued.

(iv) If a < 0, b > 0, or a < 0, b < 0, or a < 0, b = 0, then (b_+∂a_+, a_+∂b_+) equals (0, 0), which is single-valued. Hence, by (3.2), ∂φ_1(a, b)^T φ_1(a, b) is single-valued.

(v) If a = 0, b > 0, then (b_+∂a_+, a_+∂b_+) = (b · [0, 1], 0), which is multi-valued. However, in this case, we observe that φ_1(a, b) = ‖(a, b)‖_p − (a + b) − αa_+b_+ = 0. Hence, ∂φ_1(a, b)^T φ_1(a, b) is still single-valued.

(vi) If a = 0, b < 0, then (b_+∂a_+, a_+∂b_+) = (0, 0), which is single-valued. Hence, by (3.2), ∂φ_1(a, b)^T φ_1(a, b) is single-valued.

(vii) If a = 0, b = 0, then φ_1(a, b) = 0. Hence, ∂φ_1(a, b)^T φ_1(a, b) is single-valued.

Thus, by applying the corollary to Theorem 2.2.4 in Clarke (1983), the above facts yield that ψ_1 is continuously differentiable everywhere. For p odd, going over the same cases, the proof follows.

For i = 2, ψ_2(a, b) = ½(φ_2(a, b))². We discuss the following cases: (i) (a, b) ≠ (0, 0) and ab > 0, (ii) (a, b) ≠ (0, 0) and ab = 0, (iii) (a, b) ≠ (0, 0) and ab < 0, (iv) (a, b) = (0, 0). From (3.3), we know that ∂φ_2(a, b) becomes multi-valued when ab = 0 or (a, b) = (0, 0). However, φ_2(a, b) = 0 in these two cases, which implies that ∂φ_2(a, b)^T φ_2(a, b) is still single-valued. Hence, ψ_2 is continuously differentiable everywhere, again by the Corollary to Theorem 2.2.4 in Clarke (1983).

For i = 3, 4, from (3.4)–(3.7), it is trivial that ∂φ_3(a, b) and ∂φ_4(a, b) are single-valued when (a, b) ≠ (0, 0). When (a, b) = (0, 0), we observe that φ_3(a, b) = φ_4(a, b) = 0. Hence, ∂φ_3(a, b)^T φ_3(a, b) and ∂φ_4(a, b)^T φ_4(a, b) are still single-valued, which yields that ψ_3 and ψ_4 are continuously differentiable everywhere by the same reason as above.

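(c) For i = 1, ψ_1 = ½(φ_1)², we go over the cases discussed in part (b).

(i) If a > 0, b > 0, then (b_+∂a_+, a_+∂b_+) = (b, a). Hence, from (3.2), we obtain that

\[
\nabla_a \psi_1(a, b) = \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha b \right) \phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a \right) \phi_1(a, b),
\]

for both even and odd p. Then, ∇_aψ_1(a, b) · ∇_bψ_1(a, b) equals

\[
\left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha b \right) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a \right) \phi_1^2(a, b).
\]

Since

\[
\left| \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} \right| \le 1, \qquad \left| \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} \right| \le 1,
\]

and αa > 0, αb > 0, we know

\[
\frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha b < 0 \quad \text{and} \quad \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a < 0,
\]

which implies that ∇_aψ_1(a, b) · ∇_bψ_1(a, b) ≥ 0.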

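(ii) If a > 0, b < 0, then (b_+∂a_+, a_+∂b_+) = (0, a). Hence, from (3.2), we have

\[
\nabla_a \psi_1(a, b) = \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a \right) \phi_1(a, b),
\]

for p even; and

\[
\nabla_a \psi_1(a, b) = \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{-b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a \right) \phi_1(a, b),
\]

for p odd. Again, since |a^{p−1}| / ‖(a, b)‖_p^{p−1} ≤ 1, |b^{p−1}| / ‖(a, b)‖_p^{p−1} ≤ 1, and αa > 0, it can be easily verified that ∇_aψ_1(a, b) · ∇_bψ_1(a, b) ≥ 0.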

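(iii) If a > 0, b = 0, then φ_1(a, b) = 0, which says ∇_aψ_1(a, b) = 0 = ∇_bψ_1(a, b). Hence, ∇_aψ_1(a, b) · ∇_bψ_1(a, b) = 0.

(iv) If a < 0, b > 0, or a < 0, b < 0, or a < 0, b = 0, then (b_+∂a_+, a_+∂b_+) = (0, 0). Hence, from (3.2), we have

\[
\nabla_a \psi_1(a, b) = \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b),
\]

for p even; and

\[
\nabla_a \psi_1(a, b) = \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b),
\]

for p odd. Again, by |a^{p−1}| / ‖(a, b)‖_p^{p−1} ≤ 1 and |b^{p−1}| / ‖(a, b)‖_p^{p−1} ≤ 1, the desired inequality holds.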

(v) If a = 0, b > 0, then φ_1(a, b) = 0, which says ∇_aψ_1(a, b) = 0 = ∇_bψ_1(a, b). Hence, ∇_aψ_1(a, b) · ∇_bψ_1(a, b) = 0.

(vi) If a = 0, b < 0, then (b_+∂a_+, a_+∂b_+) = (0, 0). Hence, from (3.2), we have

\[
\nabla_a \psi_1(a, b) = -\phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b),
\]

for p even; and

\[
\nabla_a \psi_1(a, b) = -\phi_1(a, b), \qquad
\nabla_b \psi_1(a, b) = \left( \frac{-b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_1(a, b),
\]

for p odd. By the same reasons as in the previous discussions, we obtain that ∇_aψ_1(a, b) · ∇_bψ_1(a, b) ≥ 0.

(vii) If a = 0, b = 0, then φ_1(a, b) = 0. Hence, ∇_aψ_1(a, b) = 0 = ∇_bψ_1(a, b) and ∇_aψ_1(a, b) · ∇_bψ_1(a, b) = 0.

For i = 2, ψ_2 = ½(φ_2)², we discuss four cases as in part (b).

(i) If (a, b) ≠ (0, 0) and ab > 0, from (3.3), we have

\[
\nabla_a \psi_2(a, b) = \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha b \right) \phi_2(a, b), \qquad
\nabla_b \psi_2(a, b) = \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a \right) \phi_2(a, b),
\]

for p even; and

\[
\nabla_a \psi_2(a, b) = \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha b \right) \phi_2(a, b), \qquad
\nabla_b \psi_2(a, b) = \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 - \alpha a \right) \phi_2(a, b),
\]

for p odd. By the same reasons as in the previous discussions, it can be easily verified that ∇_aψ_2(a, b) · ∇_bψ_2(a, b) ≥ 0.

(ii) If (a, b) ≠ (0, 0) and ab = 0, then φ_2(a, b) = 0. Hence, ∇_aψ_2(a, b) = 0 = ∇_bψ_2(a, b) and ∇_aψ_2(a, b) · ∇_bψ_2(a, b) = 0.

(iii) If (a, b) ≠ (0, 0) and ab < 0, the arguments are the same as in case (iv) for i = 1, except that φ_1 is replaced by φ_2.

(iv) If (a, b) = (0, 0), then φ_2(a, b) = 0. Hence, ∇_aψ_2(a, b) = 0 = ∇_bψ_2(a, b) and ∇_aψ_2(a, b) · ∇_bψ_2(a, b) = 0.

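For i = 3, ψ_3 = ½(φ_3)², we have two cases as below.

(i) If (a, b) ≠ (0, 0), from (3.4)–(3.5), we have

\[
\nabla_a \psi_3(a, b) = \phi_p(a, b) \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)(b_+)^2, \qquad
\nabla_b \psi_3(a, b) = \phi_p(a, b) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)^2 (b_+),
\]

for p even; and

\[
\nabla_a \psi_3(a, b) = \phi_p(a, b) \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)(b_+)^2, \qquad
\nabla_b \psi_3(a, b) = \phi_p(a, b) \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (a_+)^2 (b_+),
\]

for p odd. Thus, ∇_aψ_3(a, b) · ∇_bψ_3(a, b) equals

\[
\begin{aligned}
&\phi_p^2(a, b) \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha^2 (a_+)^3 (b_+)^3 \\
&\quad + \phi_p(a, b) \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (a_+)^2 (b_+)
+ \phi_p(a, b) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (a_+)(b_+)^2
\end{aligned}
\]

or

\[
\begin{aligned}
&\phi_p^2(a, b) \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha^2 (a_+)^3 (b_+)^3 \\
&\quad + \phi_p(a, b) \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (a_+)^2 (b_+)
+ \phi_p(a, b) \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (a_+)(b_+)^2.
\end{aligned}
\]

Note that in the above expressions, it is trivial that the first and second terms are nonnegative. We also notice that

\[
(a_+)(b_+) =
\begin{cases}
ab, & \text{if } a > 0,\ b > 0, \\
0, & \text{else}.
\end{cases}
\]

Therefore, we only need to consider the subcase a > 0, b > 0 for the third and fourth terms. In fact, summing up the third and fourth terms in this subcase gives

\[
\begin{aligned}
&\alpha ab \cdot \phi_p(a, b) \left[ \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) a + \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) b \right] \\
&= \alpha ab \cdot \phi_p(a, b) \left[ \frac{a^p + b^p}{\|(a, b)\|_p^{p-1}} - (a + b) \right] \\
&= \alpha ab \cdot \phi_p(a, b) \left[ \|(a, b)\|_p - (a + b) \right] \\
&= \alpha ab \cdot \phi_p^2(a, b) \\
&\ge 0.
\end{aligned}
\]

Thus, we proved ∇_aψ_3(a, b) · ∇_bψ_3(a, b) ≥ 0.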

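For i = 4, ψ_4 = ½(φ_4)², we also have two cases as below.

(i) If (a, b) ≠ (0, 0), from (3.6) and (3.7), we have

\[
\nabla_a \psi_4(a, b) = \phi_p(a, b) \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot b, \qquad
\nabla_b \psi_4(a, b) = \phi_p(a, b) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot a,
\]

for p even; and

\[
\nabla_a \psi_4(a, b) = \phi_p(a, b) \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot b, \qquad
\nabla_b \psi_4(a, b) = \phi_p(a, b) \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha (ab)_+ \cdot a,
\]

for p odd. Thus, ∇_aψ_4(a, b) · ∇_bψ_4(a, b) equals

\[
\begin{aligned}
&\phi_p^2(a, b) \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha^2 (ab)_+^2 \cdot (ab) \\
&\quad + \phi_p(a, b) \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (ab)_+ \cdot a
+ \phi_p(a, b) \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (ab)_+ \cdot b
\end{aligned}
\]

or

\[
\begin{aligned}
&\phi_p^2(a, b) \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) + \alpha^2 (ab)_+^2 \cdot (ab) \\
&\quad + \phi_p(a, b) \left( \frac{\operatorname{sgn}(a) \cdot a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (ab)_+ \cdot a
+ \phi_p(a, b) \left( \frac{\operatorname{sgn}(b) \cdot b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \alpha (ab)_+ \cdot b.
\end{aligned}
\]

The first and second terms are nonnegative by the same reasons as in the previous discussions. We notice that

\[
(ab)_+ =
\begin{cases}
ab, & \text{if } ab > 0, \\
0, & \text{else}.
\end{cases}
\]

Thus, we only need to consider the subcase ab > 0 for the third and fourth terms. In fact, summing up the third and fourth terms in this subcase gives

\[
\begin{aligned}
&\alpha (ab)_+ \cdot \phi_p(a, b) \left[ \left( \frac{a^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) a + \left( \frac{b^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) b \right] \\
&= \alpha (ab)_+ \cdot \phi_p(a, b) \left[ \frac{a^p + b^p}{\|(a, b)\|_p^{p-1}} - (a + b) \right] \\
&= \alpha (ab)_+ \cdot \phi_p(a, b) \left[ \|(a, b)\|_p - (a + b) \right] \\
&= \alpha (ab)_+ \cdot \phi_p^2(a, b) \\
&\ge 0.
\end{aligned}
\]

The arguments hold as well for p odd. Hence, we proved ∇_aψ_4(a, b) · ∇_bψ_4(a, b) ≥ 0.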

(d) Going over exactly the same cases for each i discussed in part (c), where ∇_aψ(a, b) and ∇_bψ(a, b) are formed, it is not hard to verify that the desired result is satisfied. We omit the details.
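Proposition 3.3(c) for ψ_3 and ψ_4 can be illustrated by evaluating the gradient expressions from the proof on a grid and inspecting the smallest product. This is a numerical sanity check under our own assumptions (grid on [−3, 3], α = 1, unified sgn(t)|t|^(p−1) form of the numerator), not a substitute for the argument above.

```python
import math

def plus(z):
    return max(z, 0.0)

def phi_p_and_grad(a, b, p):
    """Return phi_p(a, b) and its partial derivatives; intended for (a, b) != (0, 0)."""
    s = abs(a) ** p + abs(b) ** p
    n = s ** ((p - 1.0) / p)
    f = s ** (1.0 / p) - (a + b)
    ga = math.copysign(abs(a) ** (p - 1), a) / n - 1.0
    gb = math.copysign(abs(b) ** (p - 1), b) / n - 1.0
    return f, ga, gb

def grad_psi(a, b, p, alpha, kind):
    """Gradient of psi_3 (kind=3) or psi_4 (kind=4) as written in the proof of part (c)."""
    if a == 0.0 and b == 0.0:
        return (0.0, 0.0)
    f, ga, gb = phi_p_and_grad(a, b, p)
    if kind == 3:
        return (f * ga + alpha * plus(a) * plus(b) ** 2,
                f * gb + alpha * plus(a) ** 2 * plus(b))
    if kind == 4:
        return (f * ga + alpha * plus(a * b) * b,
                f * gb + alpha * plus(a * b) * a)
    raise ValueError("kind must be 3 or 4")

def min_product(p, alpha, kind, n=25, lo=-3.0, hi=3.0):
    """Smallest value of grad_a(psi) * grad_b(psi) over a uniform grid."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    worst = float("inf")
    for a in grid:
        for b in grid:
            ga, gb = grad_psi(a, b, p, alpha, kind)
            worst = min(worst, ga * gb)
    return worst

for kind in (3, 4):
    for p in (2, 3):
        print(kind, p, min_product(p, 1.0, kind))
```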

Based on the properties of ψ stated in Proposition 3.3 and using the same proof techniques developed in (De Luca et al., 1996; Kanzow and Kleinmichel, 1998; Kanzow et al., 1997), we have the following condition for a stationary point to be a solution of the NCP. We omit the details.

Proposition 3.4. Assume that x^∗ ∈ Rⁿ is a stationary point of Ψ defined as in (1.15)–(1.17) (except for Ψ induced from ψ_2) such that the Jacobian ∇F(x^∗) is a P0-matrix. Then x^∗ is a solution of the NCP.

As pointed out in Proposition 3.4, if Ψ is induced from ψ_2, then Proposition 3.4 does not necessarily hold for such a Ψ. The reason is that the proof requires ∇_aψ(a, b) · ∇_bψ(a, b) > 0 whenever ψ(a, b) ≠ 0. However, this is not always true for ψ_2 (we only proved that ∇_aψ(a, b) · ∇_bψ(a, b) ≥ 0). A counterexample for p = 2 was given in Sun and Qi (1999, pp. 206–207). For this reason, the merit function induced from ψ_2 may not be recommended even though it is continuously differentiable.
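To see concretely how the strict inequality can fail for ψ_2, consider the following sketch. The point (a, b) = (−√3/2, −3/2) with p = 2 and α = 1 is our own illustrative choice (it is not claimed to be the counterexample of Sun and Qi); the gradient formula used is the ab > 0 case of (3.3) combined with the chain rule.

```python
import math

def phi_p(a, b, p):
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def phi_2(a, b, p, alpha):
    return phi_p(a, b, p) - alpha * max(a * b, 0.0)

def psi_2(a, b, p, alpha):
    return 0.5 * phi_2(a, b, p, alpha) ** 2

def grad_a_psi_2(a, b, p, alpha):
    """Partial derivative of psi_2 in a; this formula assumes ab > 0."""
    n = (abs(a) ** p + abs(b) ** p) ** ((p - 1.0) / p)
    d_phi = math.copysign(abs(a) ** (p - 1), a) / n - 1.0 - alpha * b
    return d_phi * phi_2(a, b, p, alpha)

a, b, p, alpha = -math.sqrt(3.0) / 2.0, -1.5, 2, 1.0
print(psi_2(a, b, p, alpha))         # clearly positive, so (a, b) is not in N_phi
print(grad_a_psi_2(a, b, p, alpha))  # zero up to rounding
```

At this point the factor a/‖(a, b)‖ − 1 − αb equals −1/2 − 1 + 3/2 = 0, so the product ∇_aψ_2 · ∇_bψ_2 vanishes although ψ_2 does not.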

4. Bounded Level Sets

As mentioned earlier, the merit function Ψ_p deﬁned as in Eq. (1.9) does not guarantee bounded level sets for monotone NCP. In fact, it needs that F is either