A family of NCP functions and a descent method for the nonlinear complementarity problem


DOI 10.1007/s10589-007-9086-0


Jein-Shan Chen · Shaohua Pan

Received: 13 May 2006 / Revised: 20 November 2006 / Published online: 23 October 2007

Abstract In the last decades, there has been much effort on the solution and the analysis of the nonlinear complementarity problem (NCP) by reformulating the NCP as an unconstrained minimization involving an NCP function. In this paper, we propose a family of new NCP functions, which includes the Fischer-Burmeister function as a special case, based on a $p$-norm with $p$ being any fixed real number in the interval $(1,+\infty)$, and show several favorable properties of the proposed functions. In addition, we propose a descent algorithm that is derivative-free for solving the unconstrained minimization based on the merit functions from the proposed NCP functions. Numerical results for the test problems from MCPLIB indicate that the descent algorithm performs better as the parameter $p$ decreases in $(1,+\infty)$. This implies that the merit functions associated with $p \in (1,2)$, for example $p = 1.5$, are more effective in numerical computations than the Fischer-Burmeister merit function, which corresponds exactly to $p = 2$.

Keywords NCP · NCP function · Merit function · Descent method

1 Introduction

The nonlinear complementarity problem (NCP) is to find a point $x \in \mathbb{R}^n$ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \tag{1}$$

J.-S. Chen is a member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. J.-S. Chen’s work is partially supported by National Science Council of Taiwan.

J.-S. Chen

Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan e-mail:jschen@math.ntnu.edu.tw

S. Pan

School of Mathematical Sciences, South China University of Technology, Guangzhou 510641, China e-mail:shhpan@scut.edu.cn


where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product and $F = (F_1, F_2, \ldots, F_n)^T$ is a map from $\mathbb{R}^n$ to $\mathbb{R}^n$. We assume that $F$ is continuously differentiable throughout this paper.

The NCP has attracted much attention due to its various applications in operations research, economics, and engineering [6,12,18].

Many methods have been proposed for solving the NCP [9,12,18]. Among them, one of the most popular and powerful approaches, which has been studied intensively in recent years, is to reformulate the NCP as a system of nonlinear equations [17,24] or as an unconstrained minimization problem [5,7,10,14–16,23]. A function that yields an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP. In constructing a merit function, the class of so-called NCP-functions, defined below, plays an important role.

Definition 1.1 A function $\phi: \mathbb{R}^2 \to \mathbb{R}$ is called an NCP-function if it satisfies

$$\phi(a, b) = 0 \iff a \ge 0, \ b \ge 0, \ ab = 0. \tag{2}$$

Over the past two decades, a variety of NCP-functions have been studied; see [9,20] and references therein. Among them, a popular NCP-function that has been studied intensively in recent years is the well-known Fischer-Burmeister NCP-function [7,8], defined as

$$\phi(a, b) = \sqrt{a^2 + b^2} - (a + b). \tag{3}$$

With the above characterization of φ, the NCP is equivalent to a system of nonsmooth equations:

$$\Phi(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix} = 0. \tag{4}$$

Then the function $\Psi: \mathbb{R}^n \to \mathbb{R}_+$ defined by

$$\Psi(x) := \frac{1}{2} \|\Phi(x)\|^2 = \frac{1}{2} \sum_{i=1}^{n} \phi(x_i, F_i(x))^2 \tag{5}$$

is a merit function for the NCP, i.e., the NCP can be recast as an unconstrained minimization:

$$\min_{x \in \mathbb{R}^n} \Psi(x). \tag{6}$$

In this paper, we propose and investigate a family of new NCP functions based on the Fischer-Burmeister function (3). In particular, we define $\phi_p: \mathbb{R}^2 \to \mathbb{R}$ by

$$\phi_p(a, b) := \|(a, b)\|_p - (a + b), \tag{7}$$

where $p$ is any fixed real number in the interval $(1,+\infty)$ and $\|(a, b)\|_p$ denotes the $p$-norm of $(a, b)$, i.e., $\|(a, b)\|_p = \sqrt[p]{|a|^p + |b|^p}$. In other words, in the function $\phi_p$, we replace the 2-norm of $(a, b)$ in the Fischer-Burmeister function (3) by a more general $p$-norm with $p \in (1,+\infty)$. The function $\phi_p$ is still an NCP-function, as was noted in Tseng's paper [21]. Nonetheless, to our knowledge, there has been no further study of this family of NCP functions except for $p = 2$. We aim to explore and study properties of $\phi_p$ in this paper. More specifically, we define $\psi_p: \mathbb{R}^2 \to \mathbb{R}_+$ by

$$\psi_p(a, b) := \frac{1}{2} |\phi_p(a, b)|^2. \tag{8}$$
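The definitions (7) and (8) can be illustrated with a minimal Python sketch. The function names `phi_p` and `psi_p` and the rescaling by the larger magnitude are choices made for this illustration only; they are not taken from the authors' MATLAB code.

```python
def phi_p(a: float, b: float, p: float = 2.0) -> float:
    """Return ||(a, b)||_p - (a + b) for a fixed p > 1, as in (7)."""
    if p <= 1.0:
        raise ValueError("p must lie in (1, +inf)")
    scale = max(abs(a), abs(b))
    if scale == 0.0:
        return 0.0
    # Factor out the larger magnitude so the powers are taken of values in [0, 1].
    norm = scale * ((abs(a) / scale) ** p + (abs(b) / scale) ** p) ** (1.0 / p)
    return norm - (a + b)


def psi_p(a: float, b: float, p: float = 2.0) -> float:
    """Return 0.5 * phi_p(a, b)^2, as in (8)."""
    return 0.5 * phi_p(a, b, p) ** 2


if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        # Complementary pairs give 0; a violated sign constraint gives a positive value.
        print(p, phi_p(0.0, 3.0, p), phi_p(2.0, 0.0, p), phi_p(-1.0, 0.0, p))
```

For $p = 2$ the sketch reduces to the Fischer-Burmeister function (3).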

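For any given $p > 1$, the function $\psi_p$ is a nonnegative NCP-function and smooth on $\mathbb{R}^2$, as will be seen in Sect. 3. Analogous to $\Phi$, the function $\Phi_p: \mathbb{R}^n \to \mathbb{R}^n$ given as

$$\Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix} \tag{9}$$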

yields a family of merit functions $\Psi_p: \mathbb{R}^n \to \mathbb{R}$ for the NCP for which

$$\Psi_p(x) := \frac{1}{2} \|\Phi_p(x)\|^2 = \frac{1}{2} \sum_{i=1}^{n} \phi_p(x_i, F_i(x))^2 = \sum_{i=1}^{n} \psi_p(x_i, F_i(x)). \tag{10}$$
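As a rough illustration of (10), the following Python sketch evaluates the merit function for a small affine map. The map `affine_map` (with $M = [[2,1],[1,2]]$ and $q = (-2, 1)^T$) is a hypothetical example chosen here, not a problem from the paper; its NCP solution is $x = (1, 0)^T$, where $F(x) = (0, 2)^T$.

```python
from typing import Callable, List, Sequence


def phi_p(a: float, b: float, p: float) -> float:
    """Generalized Fischer-Burmeister function from (7)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def merit(x: Sequence[float], F: Callable[[Sequence[float]], List[float]],
          p: float = 2.0) -> float:
    """Merit function (10): half the sum of squared componentwise phi_p values."""
    return 0.5 * sum(phi_p(xi, fi, p) ** 2 for xi, fi in zip(x, F(x)))


def affine_map(x: Sequence[float]) -> List[float]:
    """Hypothetical test map F(x) = Mx + q (an assumption of this sketch)."""
    m = [[2.0, 1.0], [1.0, 2.0]]
    q = [-2.0, 1.0]
    return [sum(mij * xj for mij, xj in zip(row, x)) + qi for row, qi in zip(m, q)]


if __name__ == "__main__":
    print(merit([1.0, 0.0], affine_map, p=1.5))  # essentially zero at the solution
    print(merit([0.0, 0.0], affine_map, p=1.5))  # positive away from the solution
```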

As will be seen later, $\Psi_p$ for any given $p > 1$ is a continuously differentiable merit function for the NCP. Therefore, classical iterative methods such as Newton's method can be applied to the unconstrained smooth minimization reformulation of the NCP, i.e.,

$$\min_{x \in \mathbb{R}^n} \Psi_p(x). \tag{11}$$

On the other hand, derivative-free methods [22], which do not require computation of derivatives of $F$, have also attracted much attention. Derivative-free methods, which take advantage of particular properties of a merit function, are suitable for problems where the derivatives of $F$ are not available or are expensive to compute.

In this paper, we also study a derivative-free descent algorithm for solving the NCP based on the merit function $\Psi_p$. The algorithm is shown to be convergent for strongly monotone NCPs. In addition, we report numerical experiments with three specific merit functions $\Psi_{1.5}$, $\Psi_2$ and $\Psi_3$ for the test problems from MCPLIB. Numerical results show that the descent algorithm performs better as $p$ decreases in the interval $(1,+\infty)$. This means that a more effective NCP function than the Fischer-Burmeister function, at least in numerical computations, can be obtained by setting $p \in (1, 2)$ in $\phi_p(a, b)$.

Throughout this paper, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors and $^T$ denotes transpose. For any differentiable function $f: \mathbb{R}^n \to \mathbb{R}$, $\nabla f(x)$ denotes the gradient of $f$ at $x$. For any differentiable mapping $F = (F_1, \ldots, F_m)^T: \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \cdots \nabla F_m(x)]$ denotes the transposed Jacobian of $F$ at $x$. We denote by $\|x\|_p$ the $p$-norm of $x$ and by $\|x\|$ the Euclidean norm of $x$. Unless otherwise stated, we always assume that $p$ in the sequel is any fixed real number in $(1,+\infty)$.


2 Preliminaries

In this section, we recall some background concepts and materials which will play an important role in the subsequent analysis.

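Definition 2.1 Let $F: \mathbb{R}^n \to \mathbb{R}^n$. Then

(a) $F$ is monotone if $\langle x - y, F(x) - F(y) \rangle \ge 0$ for all $x, y \in \mathbb{R}^n$.

(b) $F$ is strictly monotone if $\langle x - y, F(x) - F(y) \rangle > 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.

(c) $F$ is strongly monotone with modulus $\mu > 0$ if $\langle x - y, F(x) - F(y) \rangle \ge \mu \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

(d) $F$ is a $P_0$-function if $\max_{1 \le i \le n,\ x_i \ne y_i} (x_i - y_i)(F_i(x) - F_i(y)) \ge 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.

(e) $F$ is a $P$-function if $\max_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) > 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.

(f) $F$ is a uniform $P$-function with modulus $\mu > 0$ if $\max_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) \ge \mu \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

(g) $\nabla F(x)$ is uniformly positive definite with modulus $\mu > 0$ if $d^T \nabla F(x) d \ge \mu \|d\|^2$ for all $x \in \mathbb{R}^n$ and $d \in \mathbb{R}^n$.

(h) $F$ is Lipschitz continuous if there exists a constant $L > 0$ such that $\|F(x) - F(y)\| \le L \|x - y\|$ for all $x, y \in \mathbb{R}^n$.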

From the above definitions, it is obvious that strongly monotone functions are strictly monotone, and strictly monotone functions are monotone. Moreover, F is a P₀-function if F is monotone and F is a uniform P -function with modulus μ > 0 if F is strongly monotone with modulus μ > 0. In addition, when F is continuously differentiable, we have the following conclusions:

1. $F$ is monotone if and only if $\nabla F(x)$ is positive semidefinite for all $x \in \mathbb{R}^n$.

2. $F$ is strictly monotone if $\nabla F(x)$ is positive definite for all $x \in \mathbb{R}^n$.

3. $F$ is strongly monotone if and only if $\nabla F(x)$ is uniformly positive definite.

Next, we recall the definition of P0-matrix and P -matrix.

Definition 2.2 A matrix $M \in \mathbb{R}^{n \times n}$ is a

(a) P0-matrix if each of its principal minors is nonnegative.

(b) P -matrix if each of its principal minors is positive.

It is obvious that every P -matrix is also a P0-matrix. Furthermore, it is known that the Jacobian of every continuously differentiable P0-function is a P0-matrix.
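Definition 2.2 can be checked directly for small matrices by enumerating principal minors. The sketch below is only illustrative: the helper names are hypothetical, the determinant uses plain cofactor expansion, and the enumeration is exponential in $n$, so it is meant for toy examples rather than for the MCPLIB problems.

```python
from itertools import combinations
from typing import Iterator, List

Matrix = List[List[float]]


def det(m: Matrix) -> float:
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_minors(m: Matrix) -> Iterator[float]:
    """Yield the determinant of every principal submatrix of m."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield det([[m[i][j] for j in idx] for i in idx])


def is_p0_matrix(m: Matrix) -> bool:
    """All principal minors nonnegative."""
    return all(v >= 0.0 for v in principal_minors(m))


def is_p_matrix(m: Matrix) -> bool:
    """All principal minors positive."""
    return all(v > 0.0 for v in principal_minors(m))


if __name__ == "__main__":
    print(is_p_matrix([[2.0, 1.0], [1.0, 2.0]]))   # True
    print(is_p0_matrix([[0.0, 1.0], [0.0, 0.0]]))  # True, but not a P-matrix
```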

Finally, we state one of the characterizations of P0-matrices that will be used later, and for more properties about P -matrix and P0-matrix, please refer to [4].

Lemma 2.1 A matrix $M \in \mathbb{R}^{n \times n}$ is a $P_0$-matrix if and only if for every nonzero vector $x$ there exists an index $i$ such that $x_i \ne 0$ and $x_i (Mx)_i \ge 0$.


3 A family of NCP functions and their properties

In this section, we study the family of NCP functions $\phi_p$ defined as (7) with $p > 1$, which are variants of the Fischer-Burmeister function, and show that these functions have several favorable properties analogous to those of the Fischer-Burmeister function. We first present some properties of $\phi_p$ that are similar to those of the Fischer-Burmeister function.

Proposition 3.1 Let $\phi_p: \mathbb{R}^2 \to \mathbb{R}$ be defined as (7) with $p$ being any fixed real number in the interval $(1,+\infty)$. Then

(a) $\phi_p$ is an NCP-function, i.e., it satisfies (2).

(b) $\phi_p$ is sub-additive, i.e., $\phi_p(w + w') \le \phi_p(w) + \phi_p(w')$ for all $w, w' \in \mathbb{R}^2$.

(c) $\phi_p$ is positive homogeneous, i.e., $\phi_p(\alpha w) = \alpha \phi_p(w)$ for all $w \in \mathbb{R}^2$ and $\alpha \ge 0$.

(d) $\phi_p$ is convex, i.e., $\phi_p(\alpha w + (1 - \alpha) w') \le \alpha \phi_p(w) + (1 - \alpha) \phi_p(w')$ for all $w, w' \in \mathbb{R}^2$ and $\alpha \in [0, 1]$.

(e) $\phi_p$ is Lipschitz continuous with $L_1 = \sqrt{2} + 2^{(1/p - 1/2)}$ when $1 < p < 2$, and with $L_2 = 1 + \sqrt{2}$ when $p \ge 2$. In other words, $|\phi_p(w) - \phi_p(w')| \le L_1 \|w - w'\|$ when $1 < p < 2$ and $|\phi_p(w) - \phi_p(w')| \le L_2 \|w - w'\|$ when $p \ge 2$, for all $w, w' \in \mathbb{R}^2$.

Proof (a) The proof can be found in [21, page 20]. For completeness, we include it here. Consider any $a \ge 0$ and $b \ge 0$ satisfying $ab = 0$. Then we have either $a = 0$ or $b = 0$. This implies that $\phi_p(a, b) = \sqrt[p]{|a|^p} - a$ or $\phi_p(a, b) = \sqrt[p]{|b|^p} - b$, and consequently $\phi_p(a, b) = 0$. Conversely, consider any $(a, b) \in \mathbb{R}^2$ satisfying $\phi_p(a, b) = 0$. Then there must hold $a \ge 0$ and $b \ge 0$; otherwise we have $\sqrt[p]{|a|^p + |b|^p} > (a + b)$ and hence $\phi_p(a, b) > 0$. Now we prove that one of $a$ and $b$ must be 0. If not, then $\|(a, b)\|_p < \|(a, b)\|_1 = a + b$, which obviously contradicts the fact that $\phi_p(a, b) = 0$. The two directions show that $\phi_p$ is indeed an NCP-function.

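(b) Let $w = (a, b)$ and $w' = (c, d)$. Then

$$\begin{aligned}
\phi_p(w + w') &= \|(a + c, b + d)\|_p - (a + b + c + d) \\
&\le \|(a, b)\|_p + \|(c, d)\|_p - (a + b) - (c + d) \\
&= \phi_p(a, b) + \phi_p(c, d) = \phi_p(w) + \phi_p(w'),
\end{aligned}$$

where the inequality is true since the triangle inequality holds for the $p$-norm when $p > 1$.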

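(c) Let $w = (a, b) \in \mathbb{R}^2$ and $\alpha > 0$. Then the proof follows from

$$\phi_p(\alpha w) = \sqrt[p]{|\alpha a|^p + |\alpha b|^p} - (\alpha a + \alpha b) = \alpha \sqrt[p]{|a|^p + |b|^p} - \alpha (a + b) = \alpha \phi_p(w).$$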

(d) This is true by part (b) and part (c).

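(e) Let $w = (a, b)$ and $w' = (c, d)$. We have

$$\begin{aligned}
|\phi_p(w) - \phi_p(w')| &= \big| \|(a, b)\|_p - (a + b) - \|(c, d)\|_p + (c + d) \big| \\
&\le \big| \|(a, b)\|_p - \|(c, d)\|_p \big| + |a - c| + |b - d| \\
&\le \|(a - c, b - d)\|_p + \sqrt{2} \sqrt{|a - c|^2 + |b - d|^2} \\
&= \|w - w'\|_p + \sqrt{2} \|w - w'\|.
\end{aligned}$$

Then, from the inequality below (see [13, (1.3)]),

$$\|x\|_{p_2} \le \|x\|_{p_1} \le n^{(1/p_1 - 1/p_2)} \|x\|_{p_2} \quad \text{for } x \in \mathbb{R}^n \text{ and } 1 < p_1 < p_2,$$

we obtain the desired results.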
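The Lipschitz bound in Proposition 3.1(e) can be spot-checked numerically. The following Python sketch samples random pairs of points and compares the largest observed difference quotient with the constants $L_1$ and $L_2$; the sampling box, sample count and seed are arbitrary choices made for this illustration, and a random check of this kind is of course no substitute for the proof.

```python
import math
import random


def phi_p(a: float, b: float, p: float) -> float:
    """Generalized Fischer-Burmeister function from (7)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def lipschitz_constant(p: float) -> float:
    """L1 for 1 < p < 2 and L2 for p >= 2, as stated in Proposition 3.1(e)."""
    if p < 2.0:
        return math.sqrt(2.0) + 2.0 ** (1.0 / p - 0.5)
    return 1.0 + math.sqrt(2.0)


def max_difference_quotient(p: float, samples: int = 2000, seed: int = 0) -> float:
    """Largest sampled |phi_p(w) - phi_p(w')| / ||w - w'|| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        a, b, c, d = (rng.uniform(-10.0, 10.0) for _ in range(4))
        dist = math.hypot(a - c, b - d)
        if dist > 0.0:
            worst = max(worst, abs(phi_p(a, b, p) - phi_p(c, d, p)) / dist)
    return worst


if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        print(p, max_difference_quotient(p), lipschitz_constant(p))
```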

The function $\phi_p$ has a further property, stated below, which is key to proving the results that follow.

Lemma 3.1 Let φp: R²→ R be defined as (7) where p > 1. If {(a^k, b^k)} ⊆ R² with (a^k → −∞) or (b^k → −∞) or (a^k → ∞ and b^k → ∞), then we have

|φp(a^k, b^k)| → ∞ for k → ∞.

Proof This result has been mentioned in [21, p. 20].

Next, we study another family of NCP functions $\psi_p: \mathbb{R}^2 \to \mathbb{R}_+$ defined by (8). This class of functions leads to a reformulation of the NCP as an unconstrained minimization. In other words, they induce a family of merit functions for the NCP. Furthermore, they have some favorable properties, shown below. In particular, $\psi_p$ for any given $p > 1$ is continuously differentiable everywhere, whereas $\phi_p$ is not differentiable everywhere.

Proposition 3.2 Let $\phi_p$, $\psi_p$ be defined as (7) and (8), respectively, where $p$ is any fixed real number in the interval $(1,+\infty)$. Then

(a) $\psi_p$ is an NCP-function, i.e., it satisfies (2).

(b) $\psi_p(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$.

(c) $\psi_p$ is continuously differentiable everywhere.

(d) $\nabla_a \psi_p(a, b) \cdot \nabla_b \psi_p(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$. The equality holds if and only if $\phi_p(a, b) = 0$.

(e) $\nabla_a \psi_p(a, b) = 0 \iff \nabla_b \psi_p(a, b) = 0 \iff \phi_p(a, b) = 0$.

Proof (a) Since $\psi_p(a, b) = 0$ if and only if $\phi_p(a, b) = 0$, the desired result follows from Proposition 3.1(a).

(b) This is clear from the definition of $\psi_p$.

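(c) From direct computation, we obtain $\nabla_a \psi_p(0, 0) = \nabla_b \psi_p(0, 0) = 0$. For $(a, b) \ne (0, 0)$,

$$\begin{aligned}
\nabla_a \psi_p(a, b) &= \left( \frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b), \\
\nabla_b \psi_p(a, b) &= \left( \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b),
\end{aligned} \tag{12}$$

where $\operatorname{sgn}(\cdot)$ is the sign function. Clearly,

$$\left| \frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} \right| \le 1 \quad \text{and} \quad \left| \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \right| \le 1 \tag{13}$$

(i.e., these quotients are uniformly bounded), and moreover $\phi_p(a, b) \to 0$ as $(a, b) \to (0, 0)$. Therefore, we have $\nabla_a \psi_p(a, b) \to 0$ and $\nabla_b \psi_p(a, b) \to 0$ as $(a, b) \to (0, 0)$. This means that $\psi_p$ is continuously differentiable everywhere.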
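The gradient formula (12) can be compared against central finite differences as a sanity check. The Python sketch below does this at a single point and also evaluates the product of the two partial derivatives on a small grid, which should be nonnegative up to rounding. The function names, the step size `h` and the test point are assumptions of this illustration.

```python
import math
from typing import Tuple


def phi_p(a: float, b: float, p: float) -> float:
    """Generalized Fischer-Burmeister function from (7)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def grad_psi_p(a: float, b: float, p: float) -> Tuple[float, float]:
    """Partial derivatives of psi_p = 0.5 * phi_p^2 following formula (12)."""
    if a == 0.0 and b == 0.0:
        return 0.0, 0.0
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    phi = norm - (a + b)
    # sgn(t) * |t|^(p-1) / ||(a, b)||_p^(p-1) for t = a and t = b
    ratio_a = math.copysign(abs(a) ** (p - 1.0), a) / norm ** (p - 1.0)
    ratio_b = math.copysign(abs(b) ** (p - 1.0), b) / norm ** (p - 1.0)
    return (ratio_a - 1.0) * phi, (ratio_b - 1.0) * phi


def finite_difference(a: float, b: float, p: float, h: float = 1e-6) -> Tuple[float, float]:
    """Central-difference approximation of the gradient of psi_p."""
    def psi(u: float, v: float) -> float:
        return 0.5 * phi_p(u, v, p) ** 2

    return ((psi(a + h, b) - psi(a - h, b)) / (2.0 * h),
            (psi(a, b + h) - psi(a, b - h)) / (2.0 * h))


if __name__ == "__main__":
    print(grad_psi_p(1.0, -2.0, 1.5))
    print(finite_difference(1.0, -2.0, 1.5))
```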

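(d) From part (c), if $(a, b) = (0, 0)$, it is clear that $\nabla_a \psi_p(a, b) \cdot \nabla_b \psi_p(a, b) = 0$ and $\psi_p(a, b) = 0$. Now we assume that $(a, b) \ne (0, 0)$. Then,

$$\nabla_a \psi_p(a, b) \cdot \nabla_b \psi_p(a, b) = \left( \frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p^2(a, b). \tag{14}$$

Again, from (13), it follows immediately that $\nabla_a \psi_p(a, b) \cdot \nabla_b \psi_p(a, b) \ge 0$ for all $(a, b) \in \mathbb{R}^2$. The equality holds if and only if $\phi_p(a, b) = 0$, or $\frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} = 1$, or $\frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} = 1$. In fact, if $\frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} = 1$, then $a > 0$ and $|a| = \|(a, b)\|_p$, which leads to $b = 0$ and hence $\phi_p(a, b) = \sqrt[p]{|a|^p} - a = a - a = 0$. Similarly, we have $\phi_p(a, b) = 0$ if $\frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} = 1$. Thus, we conclude that the equality holds if and only if $\phi_p(a, b) = 0$.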

(e) This was already shown in the last part of the proof of part (d).

It was shown that if $F$ is monotone [10] or a $P_0$-function [5], then any stationary point of $\Psi$ is a global minimizer of the unconstrained minimization $\min_{x \in \mathbb{R}^n} \Psi(x)$, and hence solves the NCP. Moreover, it was also shown that if $F$ is strongly monotone [10] or a uniform $P$-function [5], then the level sets of $\Psi$ are bounded. In what follows, we present and prove analogous results for $\Psi_p$ under the same conditions as in [5,10]. The ideas for proving the following propositions are borrowed from the analogous results in [5,10].

Proposition 3.3 Let $\Psi_p: \mathbb{R}^n \to \mathbb{R}$ be defined as (10) with $p > 1$. Then $\Psi_p(x) \ge 0$ for all $x \in \mathbb{R}^n$, and $\Psi_p(x) = 0$ if and only if $x$ solves the NCP (1). Moreover, suppose that the NCP (1) has at least one solution. Then $x$ is a global minimizer of $\Psi_p$ if and only if $x$ solves the NCP (1).

Proof The results follow directly from Proposition 3.2.

Proposition 3.4 Let $\Psi_p: \mathbb{R}^n \to \mathbb{R}$ be defined as (10) with $p > 1$. Assume $F$ is either monotone or a $P_0$-function. Then every stationary point of $\Psi_p$ is a global minimizer of (11), and therefore solves the NCP (1).


Proof (I) Assume first that $F$ is monotone, and suppose that $x^*$ is a stationary point of $\Psi_p$. Then we have $\nabla \Psi_p(x^*) = 0$, which implies that

$$\sum_{i=1}^{n} \left( \nabla_a \psi_p(x_i^*, F_i(x^*)) e_i + \nabla_b \psi_p(x_i^*, F_i(x^*)) \nabla F_i(x^*) \right) = 0, \tag{15}$$

where $e_i = (0, \ldots, 1, \ldots, 0)^T$. We denote $\nabla_a \psi_p(x^*, F(x^*)) = (\ldots, \nabla_a \psi_p(x_i^*, F_i(x^*)), \ldots)^T$ and $\nabla_b \psi_p(x^*, F(x^*)) = (\ldots, \nabla_b \psi_p(x_i^*, F_i(x^*)), \ldots)^T$, respectively. Then (15) can be abbreviated as

$$\nabla_a \psi_p(x^*, F(x^*)) + \nabla F(x^*) \nabla_b \psi_p(x^*, F(x^*)) = 0. \tag{16}$$

Now, multiplying (16) by $\nabla_b \psi_p(x^*, F(x^*))^T$ leads to

$$\sum_{i=1}^{n} \left( \nabla_a \psi_p(x_i^*, F_i(x^*)) \cdot \nabla_b \psi_p(x_i^*, F_i(x^*)) \right) + \nabla_b \psi_p(x^*, F(x^*))^T \nabla F(x^*) \nabla_b \psi_p(x^*, F(x^*)) = 0. \tag{17}$$

Since $F$ is monotone, $\nabla F(x^*)$ is positive semidefinite, so the second term of (17) is nonnegative. Moreover, each term in the first summation of (17) is nonnegative as well, due to Proposition 3.2(d). Therefore, we have

$$\nabla_a \psi_p(x_i^*, F_i(x^*)) \cdot \nabla_b \psi_p(x_i^*, F_i(x^*)) = 0, \quad \forall i = 1, 2, \ldots, n,$$

which yields $\phi_p(x_i^*, F_i(x^*)) = 0$ for all $i = 1, 2, \ldots, n$ by Proposition 3.2(d). Thus, $\Psi_p(x^*) = 0$, which says that $x^*$ is a global minimizer of (11).

(II) Now assume that $F$ is a $P_0$-function, and suppose $x^*$ is a stationary point of $\Psi_p$. Then $\nabla \Psi_p(x^*) = 0$, which yields (16). Notice that $\nabla_a \psi_p(a, b)$ and $\nabla_b \psi_p(a, b)$ are given by (12). Denote by $A(x^*)$ and $B(x^*)$ the possibly multivalued $n \times n$ diagonal matrices whose diagonal elements are given by

$$A_{ii}(x^*) = \frac{\operatorname{sgn}(x_i^*) \cdot |x_i^*|^{p-1}}{\|(x_i^*, F_i(x^*))\|_p^{p-1}} \quad \text{if } (x_i^*, F_i(x^*)) \ne (0, 0)$$

and

$$B_{ii}(x^*) = \frac{\operatorname{sgn}(F_i(x^*)) \cdot |F_i(x^*)|^{p-1}}{\|(x_i^*, F_i(x^*))\|_p^{p-1}} \quad \text{if } (x_i^*, F_i(x^*)) \ne (0, 0).$$

If $(x_i^*, F_i(x^*)) = (0, 0)$ then we let $A(x^*) = B(x^*) = I$, i.e., the $n \times n$ identity matrix. With the notation $A(x^*)$, $B(x^*)$ and (12), (16) can be rewritten as

$$\left[ (A(x^*) - I) + \nabla F(x^*) (B(x^*) - I) \right] \Phi_p(x^*) = 0. \tag{18}$$

We want to prove that $\Phi_p(x^*) = 0$ (and hence $\Psi_p(x^*) = 0$). Suppose not, i.e., $\Phi_p(x^*) \ne 0$. Recall that $\Phi_p(x^*) = 0$ if and only if (1) is satisfied, and that the $i$th component of $\Phi_p(x^*)$ is $\phi_p(x_i^*, F_i(x^*))$. Thus, $\phi_p(x_i^*, F_i(x^*)) \ne 0$ means that one of the following occurs:


1. $x_i^* \ne 0$ and $F_i(x^*) \ne 0$.

2. $x_i^* = 0$ and $F_i(x^*) < 0$.

3. $x_i^* < 0$ and $F_i(x^*) = 0$.

In every case, we have $B_{ii}(x^*) \ne 1$ (since $B_{ii}(x^*) = 1$ if and only if $\phi_p(x_i^*, F_i(x^*)) = 0$ by Proposition 3.2(d, e)), so that $(B_{ii}(x^*) - 1) \cdot \phi_p(x_i^*, F_i(x^*)) \ne 0$. Similar arguments apply to the vector $(A(x^*) - I) \Phi_p(x^*)$. Thus, from the above, we can easily verify that if $\Phi_p(x^*) \ne 0$ then $(B(x^*) - I) \Phi_p(x^*)$ and $(A(x^*) - I) \Phi_p(x^*)$ are both nonzero. Moreover, their nonzero elements are in the same positions, and such nonzero elements have the same sign. But, for (18) to hold, it would be necessary that $\nabla F(x^*)$ "reverts the sign" of all the nonzero elements of $(B(x^*) - I) \Phi_p(x^*)$, which contradicts the fact that $\nabla F(x^*)$ is a $P_0$-matrix by Lemma 2.1.

Proposition 3.5 Let $\Psi_p: \mathbb{R}^n \to \mathbb{R}$ be defined as (10) with $p > 1$. Assume $F$ is either strongly monotone or a uniform $P$-function. Then the level sets

$$\mathcal{L}(\Psi_p, \gamma) := \{ x \in \mathbb{R}^n \mid \Psi_p(x) \le \gamma \}$$

are bounded for all $\gamma \in \mathbb{R}$.

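Proof (I) First, we consider the case where $F$ is strongly monotone. Suppose there exists an unbounded sequence $\{x^k\}_{k \in K}$, $\|x^k\| \to \infty$, with $\{x^k\}_{k \in K} \subseteq \mathcal{L}(\Psi_p, \gamma)$ for some $\gamma \ge 0$, where $K$ is a subset of $\mathbb{N}$. We define the index set

$$J := \{ i \in \{1, 2, \ldots, n\} \mid \{x_i^k\} \text{ is unbounded} \}.$$

Since $\{x^k\}$ is unbounded, $J \ne \emptyset$. Let $\{z^k\}$ denote a bounded sequence defined by

$$z_i^k = \begin{cases} 0, & \text{if } i \in J, \\ x_i^k, & \text{if } i \notin J. \end{cases}$$

Then from the definition of $\{z^k\}$ and the strong monotonicity of $F$, we obtain

$$\begin{aligned}
\mu \sum_{i \in J} (x_i^k)^2 &= \mu \|x^k - z^k\|^2 \\
&\le \langle x^k - z^k, F(x^k) - F(z^k) \rangle \\
&= \sum_{i=1}^{n} (x_i^k - z_i^k)(F_i(x^k) - F_i(z^k)) \\
&= \sum_{i \in J} x_i^k (F_i(x^k) - F_i(z^k)) \\
&\le \Big( \sum_{i \in J} (x_i^k)^2 \Big)^{1/2} \sum_{i \in J} |F_i(x^k) - F_i(z^k)|.
\end{aligned} \tag{19}$$

Since $\sum_{i \in J} (x_i^k)^2 \ne 0$ for $k \in K$, dividing both sides of (19) by $\big( \sum_{i \in J} (x_i^k)^2 \big)^{1/2}$ yields

$$\mu \Big( \sum_{i \in J} (x_i^k)^2 \Big)^{1/2} \le \sum_{i \in J} |F_i(x^k) - F_i(z^k)|, \quad k \in K. \tag{20}$$

On the other hand, we know that $\{F_i(z^k)\}_{k \in K}$ is bounded ($i \in J$) because $\{z^k\}_{k \in K}$ is bounded and $F$ is continuous. Therefore, from (20), we have

$$\{|F_{i_0}(x^k)|\} \to \infty \quad \text{for some } i_0 \in J.$$

Also, $\{x_{i_0}^k\} \to \infty$ by the definition of the index set $J$. Thus, Lemma 3.1 yields

$$\phi_p(x_{i_0}^k, F_{i_0}(x^k)) \to \infty \quad \text{as } k \to \infty.$$

But this contradicts $\{x^k\} \subseteq \mathcal{L}(\Psi_p, \gamma)$.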

(II) If $F$ is a uniform $P$-function, then the proof follows almost the same arguments as above. In particular, (19) is replaced by

$$\begin{aligned}
\mu \sum_{i \in J} (x_i^k)^2 &= \mu \|x^k - z^k\|^2 \\
&\le \max_{1 \le i \le n} (x_i^k - z_i^k)(F_i(x^k) - F_i(z^k)) \\
&= \max_{i \in J} x_i^k (F_i(x^k) - F_i(z^k)) \\
&= x_{j_0}^k (F_{j_0}(x^k) - F_{j_0}(z^k)) \\
&\le |x_{j_0}^k| \cdot |F_{j_0}(x^k) - F_{j_0}(z^k)|,
\end{aligned} \tag{21}$$

where $j_0$ is one of the indices for which the max is attained. Then we divide both sides of (21) by $|x_{j_0}^k|$ and the proof follows.

4 A descent method

In this section, we study a descent method for solving the unconstrained minimization (11), which does not require the derivative of F involved in the NCP. In addition, we prove a global convergence result for this derivative-free descent algorithm. More precisely, we consider the search direction as below:

$$d^k := -\nabla_b \psi_p(x^k, F(x^k)), \tag{22}$$

where $\nabla_b \psi_p(x^k, F(x^k)) = (\nabla_b \psi_p(x_1^k, F_1(x^k)), \ldots, \nabla_b \psi_p(x_n^k, F_n(x^k)))^T$. From the following lemma, we see that $d^k$ is a descent direction of $\Psi_p$ at $x^k$ under the monotonicity assumption.


Lemma 4.1 Let $x^k \in \mathbb{R}^n$ and $F$ be a monotone function. Then the search direction defined as (22) satisfies the descent condition $\nabla \Psi_p(x^k)^T d^k < 0$ as long as $x^k$ is not a solution of the NCP (1). Moreover, if $F$ is strongly monotone with modulus $\mu > 0$, then $\nabla \Psi_p(x^k)^T d^k \le -\mu \|d^k\|^2$.

Proof Since $\nabla \Psi_p(x^k) = \nabla_a \psi_p(x^k, F(x^k)) + \nabla F(x^k) \nabla_b \psi_p(x^k, F(x^k))$, we have that

$$\nabla \Psi_p(x^k)^T d^k = -\sum_{i=1}^{n} \nabla_a \psi_p(x_i^k, F_i(x^k)) \cdot \nabla_b \psi_p(x_i^k, F_i(x^k)) - (d^k)^T \nabla F(x^k) (d^k). \tag{23}$$

From the monotonicity of $F$, it follows that $\nabla F(x^k)$ is positive semidefinite. Therefore, the quadratic form $(d^k)^T \nabla F(x^k) (d^k)$ in the second term of (23) is nonnegative. Also, by Proposition 3.2(d), each product in the sum of the first term of (23) is nonnegative. Therefore, $\nabla \Psi_p(x^k)^T d^k \le 0$. We next prove that $\nabla \Psi_p(x^k)^T d^k < 0$ by contradiction. Assume that $\nabla \Psi_p(x^k)^T d^k = 0$. Then $\nabla_a \psi_p(x_i^k, F_i(x^k)) \cdot \nabla_b \psi_p(x_i^k, F_i(x^k)) = 0$ for all $i$, which, by Proposition 3.2(d) again, yields $\phi_p(x_i^k, F_i(x^k)) = 0$. Thus, $\Phi_p(x^k) = 0$ and $\Psi_p(x^k) = 0$. Consequently, $x^k$ solves the NCP (1). This obviously contradicts our assumption that $x^k$ is not a solution of the NCP (1).

If $F$ is strongly monotone with modulus $\mu > 0$, then we have that

$$\nabla \Psi_p(x^k)^T d^k \le -(d^k)^T \nabla F(x^k) (d^k) \le -\mu \|d^k\|^2,$$

where the first inequality follows from (23) and Proposition 3.2(d).

The above lemma motivates the following descent algorithm.

Algorithm 4.1

(Step 0) Given a real number $p > 1$ and $x^0 \in \mathbb{R}^n$, choose the parameters $\varepsilon \ge 0$, $\sigma \in (0, 1)$ and $\beta \in (0, 1)$. Set $k := 0$.

(Step 1) If $\Psi_p(x^k) \le \varepsilon$, then stop.

(Step 2) Let

$$d^k := -\nabla_b \psi_p(x^k, F(x^k)).$$

(Step 3) Compute a step-size $t_k := \beta^{m_k}$, where $m_k$ is the smallest nonnegative integer $m$ satisfying the Armijo-type condition:

$$\Psi_p(x^k + \beta^m d^k) \le (1 - \sigma \beta^{2m}) \Psi_p(x^k). \tag{24}$$

(Step 4) Set $x^{k+1} := x^k + t_k d^k$, $k := k + 1$ and go to Step 1.
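The steps of Algorithm 4.1 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB implementation: the test map `example_map` ($F(x) = Mx + q$ with $M = [[2,1],[1,2]]$, $q = (-1,-1)^T$, whose NCP solution is $x = (1/3, 1/3)^T$), the parameter values, the iteration cap and the backtracking cap are all assumptions made here. Note that only values of $F$ are used, never its Jacobian.

```python
import math
from typing import Callable, List, Sequence, Tuple

Map = Callable[[Sequence[float]], List[float]]


def phi_p(a: float, b: float, p: float) -> float:
    """Generalized Fischer-Burmeister function from (7)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def grad_b_psi_p(a: float, b: float, p: float) -> float:
    """Partial derivative of psi_p with respect to b, following (12)."""
    if a == 0.0 and b == 0.0:
        return 0.0
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    ratio = math.copysign(abs(b) ** (p - 1.0), b) / norm ** (p - 1.0)
    return (ratio - 1.0) * (norm - (a + b))


def merit(x: Sequence[float], F: Map, p: float) -> float:
    """Merit function (10)."""
    return 0.5 * sum(phi_p(xi, fi, p) ** 2 for xi, fi in zip(x, F(x)))


def descent(F: Map, x0: Sequence[float], p: float = 1.5, eps: float = 1e-12,
            sigma: float = 1e-4, beta: float = 0.5, max_iter: int = 5000,
            max_backtracks: int = 60) -> Tuple[List[float], float, int]:
    """Derivative-free descent with the Armijo-type rule (24)."""
    x = list(x0)
    value = merit(x, F, p)
    k = 0
    while k < max_iter and value > eps:                              # Step 1
        d = [-grad_b_psi_p(xi, fi, p) for xi, fi in zip(x, F(x))]   # Step 2
        accepted = False
        for m in range(max_backtracks):                              # Step 3
            t = beta ** m
            trial = [xi + t * di for xi, di in zip(x, d)]
            trial_value = merit(trial, F, p)
            if trial_value <= (1.0 - sigma * beta ** (2 * m)) * value:
                x, value, accepted = trial, trial_value, True
                break
        if not accepted:  # step length became too small; give up
            break
        k += 1                                                       # Step 4
    return x, value, k


def example_map(x: Sequence[float]) -> List[float]:
    """Hypothetical strongly monotone affine map (an assumption of this sketch)."""
    return [2.0 * x[0] + x[1] - 1.0, x[0] + 2.0 * x[1] - 1.0]


if __name__ == "__main__":
    solution, final_value, iterations = descent(example_map, [0.0, 0.0])
    print(solution, final_value, iterations)
```

The backtracking cap plays the role of the minimum step length used as a stopping rule in Sect. 5; without such a guard the inner loop would not terminate when $d^k$ fails to be a descent direction.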

We next show the global convergence result for Algorithm 4.1 under the assumption that $F$ is strongly monotone. To this end, we assume that the parameter $\varepsilon$ used in Algorithm 4.1 is set to zero and that Algorithm 4.1 generates an infinite sequence $\{x^k\}$.


Proposition 4.1 Suppose that $F$ is strongly monotone. Then the sequence $\{x^k\}$ generated by Algorithm 4.1 has at least one accumulation point, and any accumulation point is a solution of the NCP (1).

Proof Firstly, we show that there exists a nonnegative integer $m_k$ in Step 3 of Algorithm 4.1 whenever $x^k$ is not a solution. Assume that the conclusion does not hold. Then for any $m > 0$,

$$\Psi_p(x^k + \beta^m d^k) - \Psi_p(x^k) > -\sigma \beta^{2m} \Psi_p(x^k).$$

Dividing by $\beta^m$ on both sides and taking the limit $m \to \infty$ yields

$$\langle \nabla \Psi_p(x^k), d^k \rangle \ge 0.$$

Since $F$ is strongly monotone, this obviously contradicts Lemma 4.1. Hence, we can find an integer $m_k$ in Step 3.

Secondly, we show that the sequence $\{x^k\}$ generated by Algorithm 4.1 has at least one accumulation point. By the descent property of Algorithm 4.1, the sequence $\{\Psi_p(x^k)\}$ is decreasing. Thus, by Proposition 3.5, the generated sequence is bounded and hence it has at least one accumulation point.

Finally, we prove that every accumulation point is a solution of the NCP (1). Let $x^*$ be an arbitrary accumulation point of the generated sequence $\{x^k\}$. Then there exists a subsequence $\{x^k\}_{k \in K}$ converging to $x^*$. We know that $-\nabla_b \psi_p(\cdot, F(\cdot))$ is continuous since $\psi_p$ is continuously differentiable; therefore, $\{d^k\}_{k \in K} \to d^*$. Next, we need to discuss two cases. First, we consider the case where there exists a constant $\bar{\beta}$ such that $\beta^{m_k} \ge \bar{\beta} > 0$ for all $k \in K$. Then, from (24), we obtain

$$\Psi_p(x^{k+1}) \le (1 - \sigma \bar{\beta}^2) \Psi_p(x^k)$$

for all $k \in K$, and the entire sequence $\{\Psi_p(x^k)\}$ is decreasing. Thus, we have $\Psi_p(x^*) = 0$ (by taking the limit), which says that $x^*$ is a solution of the NCP (1). Now, we consider the other case, where there exists a further subsequence such that $\beta^{m_k} \to 0$. Note that by Armijo's rule (24) in Step 3, we have

$$\Psi_p(x^k + \beta^{m_k - 1} \cdot d^k) - \Psi_p(x^k) > -\sigma \beta^{2(m_k - 1)} \Psi_p(x^k).$$

Dividing both sides by $\beta^{m_k - 1}$ and passing to the limit on the subsequence, we obtain

$$\langle \nabla \Psi_p(x^*), d^* \rangle \ge 0,$$

which implies that $x^*$ is a solution of the NCP (1) by Lemma 4.1.

5 Numerical experiments

We implemented Algorithm 4.1 with our own code in MATLAB 6.1 and ran it on all test problems in MCPLIB [1] with all available starting points. All numerical experiments were done on a PC with a 2.8 GHz CPU and 512 MB of RAM. In order to improve the numerical behavior of Algorithm 4.1, we replaced the standard (monotone) Armijo rule by the nonmonotone line search described in [11], i.e., we computed the smallest nonnegative integer $l$ such that

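$$\Psi_p(x^k + \beta^l d^k) \le W_k - \sigma \beta^{2l} \Psi_p(x^k),$$

where $W_k$ is given by

$$W_k = \max_{j = k - m_k, \ldots, k} \Psi_p(x^j)$$

and where, for given nonnegative integers $\hat{m}$ and $s$, we set

$$m_k = \begin{cases} 0 & \text{if } k \le s, \\ \min\{m_{k-1} + 1, \hat{m}\} & \text{otherwise.} \end{cases}$$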
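The bookkeeping for the nonmonotone reference value $W_k$ can be sketched in a few lines of Python. The helper names and the list-based history are assumptions of this illustration; the defaults $\hat{m} = 5$ and $s = 5$ mirror the values reported for the experiments.

```python
from typing import List, Sequence


def memory_lengths(num_iters: int, m_hat: int = 5, s: int = 5) -> List[int]:
    """Return m_0, ..., m_{num_iters-1}: zero while k <= s, then growing by one up to m_hat."""
    lengths: List[int] = []
    for k in range(num_iters):
        if k <= s:
            lengths.append(0)
        else:
            lengths.append(min(lengths[-1] + 1, m_hat))
    return lengths


def reference_value(history: Sequence[float], m_k: int) -> float:
    """W_k: the largest of the last m_k + 1 merit values, where history[-1] is Psi_p(x^k)."""
    return max(history[-(m_k + 1):])


if __name__ == "__main__":
    print(memory_lengths(12))
    print(reference_value([5.0, 3.0, 4.0, 2.0], 2))
```

With $m_k = 0$ the reference value equals $\Psi_p(x^k)$, so the test reduces to a monotone sufficient-decrease condition during the first $s + 1$ iterations.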

Throughout the experiments, we use $\hat{m} = 5$ and $s = 5$. Moreover, we use the parameters $\sigma = 1.0\mathrm{e}{-10}$ and $\beta = 0.2$ in Algorithm 4.1. We terminated the iteration when the number of iterations exceeded 500 000, or the steplength was less than $1.0\mathrm{e}{-10}$, or one of the following conditions was satisfied:

(C1) $\Psi_p(x^k) \le 1.0\mathrm{e}{-5}$ and $(x^k)^T F(x^k) \le 5.0\mathrm{e}{-3}$;

(C2) $\Psi_p(x^k) \le 3.0\mathrm{e}{-7}$ and $(x^k)^T F(x^k) \le 3.0\mathrm{e}{-2}$;

(C3) $\Psi_p(x^k) \le 3.0\mathrm{e}{-6}$ and $(x^k)^T F(x^k) \le 1.0\mathrm{e}{-2}$.

Our computational results are summarized in Tables 1, 2, 3 (see the Appendix). In these tables, the first column lists the name of the problem and the starting point number in MCPLIB, Gap denotes the value of $x^T F(x)$ at the final iteration, NF indicates the number of evaluations of the merit function $\Psi_p$ needed for solving each problem, and Time represents the CPU time in seconds for solving each problem.

The results reported in Tables 1–3 show that our descent method based on the merit function $\Psi_{1.5}(x)$, $\Psi_2(x)$ or $\Psi_3(x)$ was able to solve most complementarity problems in MCPLIB. More precisely, there are seven failures (pgvon105, pgvon106, powell, scarfanum, scarfasum, scarfbnum, scarfbsum) for Algorithm 4.1 due to a too small steplength. After a careful check, we found that the direction $d$ defined in Algorithm 4.1 is not a descent direction for these problems. In fact, these seven problems are regarded as difficult ones for Newton-type algorithms [19,20]. In addition, we see that the descent algorithm using the merit function $\Psi_{1.5}(x)$ gives better numerical results than the one using the Fischer-Burmeister function. In particular, it appears from Tables 1–3 that the descent algorithm based on $\Psi_p(x)$ takes more function evaluations and yields a larger value of Gap as the parameter $p$ increases. A reasonable interpretation is that the value of $\Psi_p(x)$ becomes smaller when $p$ increases, which causes some difficulty for the descent Algorithm 4.1. This also implies that the performance of Algorithm 4.1 becomes worse as the parameter $p$ increases. To the best of our knowledge, this observation has not been reported in the literature, and it is an important finding for the construction of new NCP-functions.


6 Final remarks

In this paper, we have studied a family of NCP-functions $\phi_p(a, b)$, which includes the well-known Fischer-Burmeister function as a special case, and have shown that this class of functions enjoys favorable properties similar to those of other NCP-functions. In addition, we have proposed a descent method for the unconstrained minimization (11), which is a reformulation of the NCP via the proposed NCP-functions. Numerical results for the test problems in MCPLIB have shown that this method is promising when $\Psi_p(x)$ is specified as $\Psi_{1.5}(x)$, $\Psi_2(x)$ or $\Psi_3(x)$. Moreover, our numerical experiments indicate that the performance of the descent method becomes better as $p$ decreases, which is a new and important finding. This suggests that there do exist NCP-functions that perform better than the Fischer-Burmeister function. It is not yet known whether a similar phenomenon occurs in other algorithms, which is an interesting topic for future study.

There are still many issues to be explored for this family of NCP-functions, like those already studied for other NCP-functions in the literature. For instance, it would be of interest to know the semismoothness property of $\psi_p$ and the Lipschitz continuity of $\nabla \psi_p$. In fact, some of these have recently been studied in [2,3]. In addition, it is interesting to know whether this class of NCP-functions can be used for SDCP and SOCCP. Some researchers have started to investigate this issue, but no results have been reported so far. We leave these as topics for future research.

Acknowledgements The authors are grateful to Professor P. Tseng for his suggestion to study this family of NCP-functions, and thank the referees for their careful reading and helpful suggestions.

Appendix

Table 1 Numerical results for MCPLIB problems based on $\Psi_{1.5}(x)$, $\Psi_2(x)$ and $\Psi_3(x)$ (Time in seconds)

| Problem | Gap ($\Psi_{1.5}$) | NF ($\Psi_{1.5}$) | Time ($\Psi_{1.5}$) | Gap ($\Psi_2$) | NF ($\Psi_2$) | Time ($\Psi_2$) | Gap ($\Psi_3$) | NF ($\Psi_3$) | Time ($\Psi_3$) |
|---|---|---|---|---|---|---|---|---|---|
| bertsekas(1) | 9.99e-3 | 63 094 | 30.74 | 3.00e-2 | 86 826 | 36.42 | 3.00e-2 | 71 127 | 35.92 |
| bertsekas(2) | 3.00e-2 | 63 764 | 34.92 | 3.00e-2 | 65 801 | 31.39 | 1.00e-2 | 89 556 | 49.22 |
| bertsekas(3) | 3.00e-2 | 318 308 | 176.4 | 3.00e-2 | 322 751 | 161.5 | 3.00e-2 | 416 869 | 231.8 |
| billups | 3.35e-19 | 25 | 0.00 | 3.35e-19 | 25 | 0.00 | 3.35e-19 | 25 | 0.00 |
| colvdual(1) | 1.48e-2 | 69 675 | 41.84 | 2.33e-2 | 70 393 | 36.72 | 3.00e-2 | 181 627 | 109.2 |
| colvdual(2) | 1.00e-2 | 34 266 | 21.36 | 9.98e-3 | 49 436 | 24.81 | 9.94e-3 | 53 895 | 32.13 |
| colvnlp(1) | 1.08e-2 | 206 856 | 104.1 | 1.43e-2 | 207 529 | 93.99 | 3.00e-2 | 221 400 | 117.23 |
| colvnlp(2) | 9.99e-3 | 11 390 | 5.37 | 9.99e-3 | 11 753 | 4.84 | 9.96e-3 | 11 964 | 5.83 |
| cycle | 1.14e-3 | 7 | 0.00 | 9.05e-6 | 5 | 0.00 | 2.81e-4 | 4 | 0.00 |
| explcp | 2.43e-3 | 5895 | 3.14 | 2.58e-3 | 6001 | 2.70 | 2.88e-3 | 6008 | 3.13 |
| gafni(1) | 2.15e-3 | 202 | 0.08 | 2.68e-3 | 203 | 0.06 | 4.65e-3 | 203 | 0.08 |
| gafni(2) | 2.08e-3 | 236 | 0.09 | 2.63e-3 | 229 | 0.08 | 4.64e-3 | 227 | 0.08 |
| gafni(3) | 2.05e-3 | 250 | 0.09 | 2.61e-3 | 240 | 0.08 | 4.62e-3 | 238 | 0.08 |