A family of NCP functions and a descent method for the nonlinear complementarity problem


Computational Optimization and Applications, vol. 40, pp. 389-404, 2008


Jein-Shan Chen ¹ Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

Shaohua Pan²

School of Mathematical Sciences South China University of Technology

Guangzhou 510641, China

May 12, 2006 (revised, July 15, 2006) (second revised, September 7, 2006)

Abstract In recent decades, much effort has been devoted to the solution and analysis of the nonlinear complementarity problem (NCP) by reformulating the NCP as an unconstrained minimization involving an NCP function. In this paper, we propose a family of new NCP functions, which includes the Fischer-Burmeister function as a special case, based on a p-norm with p being any fixed real number in the interval (1, +∞), and show several favorable properties of the proposed functions. In addition, we propose a derivative-free descent algorithm for solving the unconstrained minimization based on the merit functions built from the proposed NCP functions. Numerical results for the test problems from MCPLIB indicate that the descent algorithm performs better as the parameter p decreases in (1, +∞). This suggests that the merit functions associated with p ∈ (1, 2), for example p = 1.5, are more effective in numerical computations than the Fischer-Burmeister merit function, which corresponds exactly to p = 2.

Key words. NCP, NCP function, merit function, descent method.

¹ Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342. The author's work is partially supported by National Science Council of Taiwan.

² E-mail: shhpan@scut.edu.cn


1 Introduction

The nonlinear complementarity problem (NCP) is to find a point x ∈ IRⁿ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \tag{1}$$

where ⟨·, ·⟩ is the Euclidean inner product and F = (F₁, F₂, …, F_n)^T is a map from ℝⁿ to ℝⁿ. We assume that F is continuously differentiable throughout this paper. The NCP has attracted much attention due to its various applications in operations research, economics, and engineering [6, 12, 18].

Many methods have been proposed for solving the NCP [9, 12, 18]. Among these, one of the most popular and powerful approaches, studied intensively in recent years, is to reformulate the NCP as a system of nonlinear equations [17, 24] or as an unconstrained minimization problem [5, 7, 10, 14, 15, 16, 23]. A function that yields an equivalent unconstrained minimization problem for the NCP is called a merit function. In other words, a merit function is a function whose global minima coincide with the solutions of the original NCP. In constructing a merit function, the class of so-called NCP-functions, defined below, plays an important role.

Definition 1.1 A function φ : IR² → IR is called an NCP-function if it satisfies

φ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0. (2)

Over the past two decades, a variety of NCP-functions have been studied; see [9, 20] and the references therein. Among these, a popular NCP-function that has been studied intensively is the well-known Fischer-Burmeister NCP-function [7, 8], defined as

$$\phi(a, b) = \sqrt{a^2 + b^2} - (a + b). \tag{3}$$

With the above characterization of φ, the NCP is equivalent to a system of nonsmooth equations:

$$\Phi(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix} = 0. \tag{4}$$

Then the function Ψ : ℝⁿ → ℝ₊ defined by

$$\Psi(x) := \frac{1}{2}\|\Phi(x)\|^2 = \frac{1}{2}\sum_{i=1}^{n} \phi(x_i, F_i(x))^2 \tag{5}$$

is a merit function for the NCP, i.e., the NCP can be recast as an unconstrained minimization:

$$\min_{x \in \mathbb{R}^n} \Psi(x). \tag{6}$$

In this paper, we propose and investigate a family of new NCP functions based on the Fischer-Burmeister function (3). In particular, we define φ_p : IR² → IR by

$$\phi_p(a, b) := \|(a, b)\|_p - (a + b), \tag{7}$$

where p is any fixed real number in the interval (1, +∞) and ‖(a, b)‖_p denotes the p-norm of (a, b), i.e., ‖(a, b)‖_p = (|a|^p + |b|^p)^{1/p}. In other words, in the function φ_p, we replace the 2-norm of (a, b) in the Fischer-Burmeister function (3) by a more general p-norm with p ∈ (1, +∞). The function φ_p is still an NCP-function, as was noted in Tseng's paper [21]. Nonetheless, to our knowledge, there has been no further study of this family of NCP functions except for p = 2. We aim to explore and study the properties of φ_p in this paper.
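As a minimal sketch of definition (7), the function can be evaluated directly in Python. The names `phi_p` and `phi_fb` and the sample points are illustrative assumptions, not part of the paper; the loop simply prints the value at one complementary pair, one pair with ab ≠ 0, and one infeasible pair for three values of p.

```python
import math


def phi_p(a: float, b: float, p: float = 2.0) -> float:
    """Generalized Fischer-Burmeister function (7): ||(a, b)||_p - (a + b)."""
    if not p > 1.0:
        raise ValueError("p must lie in the interval (1, +inf)")
    norm_p = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm_p - (a + b)


def phi_fb(a: float, b: float) -> float:
    """Classical Fischer-Burmeister function (3), i.e. the case p = 2."""
    return math.hypot(a, b) - (a + b)


# Illustrative sample points (hypothetical, not taken from the paper).
for p in (1.5, 2.0, 3.0):
    on_axis = phi_p(2.0, 0.0, p)    # a >= 0, b = 0: complementarity holds
    interior = phi_p(1.0, 1.0, p)   # a > 0, b > 0: ab != 0
    negative = phi_p(-1.0, 0.0, p)  # a < 0: infeasible pair
    print(f"p={p}: {on_axis:+.3e} {interior:+.3e} {negative:+.3e}")
```

Up to rounding, the first value should vanish for every p, while the other two should not, which is the behaviour required by (2).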

More specifically, we define ψ_p : ℝ² → ℝ₊ by

$$\psi_p(a, b) := \frac{1}{2}|\phi_p(a, b)|^2. \tag{8}$$

For any given p > 1, the function ψp is a nonnegative NCP-function and smooth on IR² as will be seen in Sec. 3. Analogous to Φ, the function Φ_p : IRⁿ → IRⁿ given as

$$\Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix} \tag{9}$$

yields a family of merit functions Ψ_p : ℝⁿ → ℝ for the NCP, where

$$\Psi_p(x) := \frac{1}{2}\|\Phi_p(x)\|^2 = \frac{1}{2}\sum_{i=1}^{n} \phi_p(x_i, F_i(x))^2 = \sum_{i=1}^{n} \psi_p(x_i, F_i(x)). \tag{10}$$

As will be seen later, Ψ_p for any given p > 1 is a continuously differentiable merit function for the NCP. Therefore, classical iterative methods such as Newton's method can be applied to the unconstrained smooth minimization reformulation of the NCP, i.e.,

$$\min_{x \in \mathbb{R}^n} \Psi_p(x). \tag{11}$$
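The merit function (10) can be sketched in a few lines of Python. The affine map `F_example` below is a hypothetical test problem chosen for illustration (it is not one of the MCPLIB problems); its NCP solution is x = (1, 0), where F(x) = (0, 2).

```python
from typing import Callable, List, Sequence


def phi_p(a: float, b: float, p: float) -> float:
    """Generalized Fischer-Burmeister function (7)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def merit(x: Sequence[float],
          F: Callable[[Sequence[float]], List[float]],
          p: float = 1.5) -> float:
    """Merit function (10): half the sum of squared phi_p(x_i, F_i(x))."""
    return 0.5 * sum(phi_p(xi, fi, p) ** 2 for xi, fi in zip(x, F(x)))


def F_example(x: Sequence[float]) -> List[float]:
    """Hypothetical affine map F(x) = M x + q, M = [[2, 1], [1, 2]], q = (-2, 1)."""
    return [2.0 * x[0] + x[1] - 2.0, x[0] + 2.0 * x[1] + 1.0]


print(merit([1.0, 0.0], F_example))  # evaluated at the NCP solution
print(merit([0.0, 0.0], F_example))  # evaluated at a non-solution
```

The first value should be zero up to rounding, and the second strictly positive, in line with the claim that the global minimizers of Ψ_p are the NCP solutions.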

On the other hand, derivative-free methods [22], which do not require computation of the derivatives of F, have also attracted much attention. Derivative-free methods, taking advantage of particular properties of a merit function, are suitable for problems where the derivatives of F are not available or are expensive to compute.


In this paper, we also study a derivative-free descent algorithm for solving the NCP based on the merit function Ψp. The algorithm is shown to be convergent for strongly monotone NCPs. In addition, we also do numerical experiments with three specific merit functions Ψ_1.5, Ψ₂ and Ψ₃ for the test problems from MCPLIB. Numerical results show that the descent algorithm has better performance as p decreases in the interval (1, +∞).

This means that a more effective NCP function than the Fischer-Burmeister function, at least in numerical computations, can be obtained by setting p ∈ (1, 2) in φ_p(a, b).

Throughout this paper, ℝⁿ denotes the space of n-dimensional real column vectors and ^T denotes transpose. For any differentiable function f : ℝⁿ → ℝ, ∇f(x) denotes the gradient of f at x. For any differentiable mapping F = (F₁, …, F_m)^T : ℝⁿ → ℝ^m, ∇F(x) = [∇F₁(x) … ∇F_m(x)] denotes the transposed Jacobian of F at x. We denote by ‖x‖_p the p-norm of x and by ‖x‖ the Euclidean norm of x. In addition, unless otherwise stated, we always assume that p in the sequel is any fixed real number in (1, +∞).

2 Preliminaries

In this section, we recall some background concepts and materials which will play an important role in the subsequent analysis.

Definition 2.1 Let F : ℝⁿ → ℝⁿ. Then

(a) F is monotone if ⟨x − y, F(x) − F(y)⟩ ≥ 0 for all x, y ∈ ℝⁿ.

(b) F is strictly monotone if ⟨x − y, F(x) − F(y)⟩ > 0 for all x, y ∈ ℝⁿ with x ≠ y.

(c) F is strongly monotone with modulus µ > 0 if ⟨x − y, F(x) − F(y)⟩ ≥ µ‖x − y‖² for all x, y ∈ ℝⁿ.

(d) F is a P₀-function if max_{1≤i≤n, x_i≠y_i} (x_i − y_i)(F_i(x) − F_i(y)) ≥ 0 for all x, y ∈ ℝⁿ with x ≠ y.

(e) F is a P-function if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) > 0 for all x, y ∈ ℝⁿ with x ≠ y.

(f) F is a uniform P-function with modulus µ > 0 if max_{1≤i≤n} (x_i − y_i)(F_i(x) − F_i(y)) ≥ µ‖x − y‖² for all x, y ∈ ℝⁿ.

(g) ∇F(x) is uniformly positive definite with modulus µ > 0 if d^T∇F(x)d ≥ µ‖d‖² for all x ∈ ℝⁿ and d ∈ ℝⁿ.

(h) F is Lipschitz continuous if there exists a constant L > 0 such that ‖F(x) − F(y)‖ ≤ L‖x − y‖ for all x, y ∈ ℝⁿ.


From the above definitions, it is obvious that strongly monotone functions are strictly monotone, and strictly monotone functions are monotone. Moreover, F is a P0-function if F is monotone and F is a uniform P -function with modulus µ > 0 if F is strongly monotone with modulus µ > 0. In addition, when F is continuously differentiable, we have the following conclusions.

1. F is monotone if and only if ∇F(x) is positive semidefinite for all x ∈ ℝⁿ.

2. F is strictly monotone if ∇F(x) is positive definite for all x ∈ ℝⁿ.

3. F is strongly monotone if and only if ∇F(x) is uniformly positive definite.
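The monotonicity notions in Definition 2.1 can be probed numerically. The sketch below samples random pairs and reports the smallest observed ratio ⟨x − y, F(x) − F(y)⟩ / ‖x − y‖²; the helper name `monotonicity_ratio` and both example maps are assumptions made for illustration. Sampling can only refute a monotonicity property or suggest a modulus, it cannot prove one.

```python
import random
from typing import Callable, List, Sequence


def monotonicity_ratio(F: Callable[[Sequence[float]], List[float]],
                       n: int, trials: int = 200, seed: int = 0,
                       box: float = 5.0) -> float:
    """Smallest sampled value of <x - y, F(x) - F(y)> / ||x - y||^2."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x = [rng.uniform(-box, box) for _ in range(n)]
        y = [rng.uniform(-box, box) for _ in range(n)]
        d = [xi - yi for xi, yi in zip(x, y)]
        dd = sum(di * di for di in d)
        if dd == 0.0:
            continue  # skip the degenerate pair x == y
        Fx, Fy = F(x), F(y)
        inner = sum(di * (fx - fy) for di, fx, fy in zip(d, Fx, Fy))
        worst = min(worst, inner / dd)
    return worst


def F_affine(x: Sequence[float]) -> List[float]:
    """Hypothetical map M x + q with M = [[2, 1], [1, 2]]: strongly monotone."""
    return [2.0 * x[0] + x[1] - 3.0, x[0] + 2.0 * x[1] - 3.0]


def F_rotation(x: Sequence[float]) -> List[float]:
    """Hypothetical map with M = [[0, 1], [-1, 0]]: monotone, not strongly."""
    return [x[1], -x[0]]


print(monotonicity_ratio(F_affine, 2))
print(monotonicity_ratio(F_rotation, 2))
```

For `F_affine` the ratio equals d^T M d / ‖d‖², which lies between the extreme eigenvalues 1 and 3 of M, so the sampled minimum is consistent with strong monotonicity with modulus µ = 1. For `F_rotation` the ratio is identically zero, so the map is monotone but admits no positive modulus.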

Next, we recall the definition of P₀-matrix and P -matrix.

Definition 2.2 A matrix M ∈ IR^n×n is a

(a) P₀-matrix if each of its principal minors is nonnegative.

(b) P -matrix if each of its principal minors is positive.

It is obvious that every P -matrix is also a P₀-matrix. Furthermore, it is known that the Jacobian of every continuously differentiable P0-function is a P0-matrix.

Finally, we state one of the characterizations of P₀-matrices that will be used later, and for more properties about P -matrix and P₀-matrix, please refer to [4].

Lemma 2.1 A matrix M ∈ ℝ^{n×n} is a P₀-matrix if and only if for every nonzero vector x there exists an index i such that x_i ≠ 0 and x_i(Mx)_i ≥ 0.
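For small matrices, Definition 2.2 and the criterion in Lemma 2.1 can be checked by brute force. The following is a sketch under stated assumptions: the function names are hypothetical, the determinant uses a Laplace expansion that is only practical for small n, and `has_sign_preserving_index` tests the condition of Lemma 2.1 for a single given vector x rather than for all nonzero x.

```python
from itertools import combinations
from typing import Iterator, List, Sequence

Matrix = List[List[float]]


def det(m: Matrix) -> float:
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_minors(m: Matrix) -> Iterator[float]:
    """Yield the determinant of every principal submatrix of m."""
    n = len(m)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            yield det([[m[i][j] for j in idx] for i in idx])


def is_p0_matrix(m: Matrix, tol: float = 1e-12) -> bool:
    return all(v >= -tol for v in principal_minors(m))


def is_p_matrix(m: Matrix, tol: float = 1e-12) -> bool:
    return all(v > tol for v in principal_minors(m))


def has_sign_preserving_index(m: Matrix, x: Sequence[float]) -> bool:
    """Condition of Lemma 2.1 for one x: some i with x_i != 0, x_i (Mx)_i >= 0."""
    mx = [sum(mij * xj for mij, xj in zip(row, x)) for row in m]
    return any(xi != 0 and xi * yi >= 0 for xi, yi in zip(x, mx))


print(is_p_matrix([[1.0, 2.0], [0.0, 1.0]]))
print(is_p0_matrix([[0.0, 1.0], [1.0, 0.0]]))
print(has_sign_preserving_index([[0.0, 1.0], [1.0, 0.0]], [1.0, -1.0]))
```

The matrix [[0, 1], [1, 0]] has determinant −1, so it is not a P₀-matrix, and the vector x = (1, −1) is a witness in the sense of Lemma 2.1: every component satisfies x_i(Mx)_i < 0.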

3 A family of NCP functions and their properties

In this section, we study a family of NCP functions φ_p defined as (7) with p > 1, which are variants of the Fischer-Burmeister function, and show that these functions have several favorable properties analogous to those of the Fischer-Burmeister function. We first present some properties of φ_p similar to those of the Fischer-Burmeister function.

Proposition 3.1 Let φp : IR² → IR be defined as (7) with p being any fixed real number in the interval (1, +∞). Then

(a) φp is an NCP-function, i.e., it satisfies (2).

(b) φ_p is sub-additive, i.e., φ_p(w + w′) ≤ φ_p(w) + φ_p(w′) for all w, w′ ∈ ℝ².

(c) φ_p is positive homogeneous, i.e., φ_p(αw) = αφ_p(w) for all w ∈ ℝ² and α ≥ 0.

(d) φ_p is convex, i.e., φ_p(αw + (1 − α)w′) ≤ αφ_p(w) + (1 − α)φ_p(w′) for all w, w′ ∈ ℝ² and α ∈ (0, 1).

(e) φ_p is Lipschitz continuous with L₁ = √2 + 2^(1/p − 1/2) when 1 < p < 2, and with L₂ = 1 + √2 when p ≥ 2. In other words, |φ_p(w) − φ_p(w′)| ≤ L₁‖w − w′‖ when 1 < p < 2 and |φ_p(w) − φ_p(w′)| ≤ L₂‖w − w′‖ when p ≥ 2 for all w, w′ ∈ ℝ².

Proof. (a) The proof can be seen in [21, page 20]. For completeness, we include it here.

Consider any a ≥ 0 and b ≥ 0 satisfying ab = 0. Then we have either a = 0 or b = 0. This implies that φ_p(a, b) = (|a|^p)^{1/p} − a or φ_p(a, b) = (|b|^p)^{1/p} − b, and consequently φ_p(a, b) = 0.

Conversely, consider any (a, b) ∈ ℝ² satisfying φ_p(a, b) = 0. Then there must hold a ≥ 0 and b ≥ 0; otherwise we have (|a|^p + |b|^p)^{1/p} > a + b and hence φ_p(a, b) > 0. Now we prove that one of a and b must be 0. If not, then ‖(a, b)‖_p < ‖(a, b)‖₁ = a + b, which contradicts the fact that φ_p(a, b) = 0. The two directions together show that φ_p is indeed an NCP-function.

(b) Let w = (a, b) and w′ = (c, d). Then

$$\begin{aligned} \phi_p(w + w') &= \|(a, b) + (c, d)\|_p - (a + b + c + d) \\ &\le \|(a, b)\|_p + \|(c, d)\|_p - (a + b) - (c + d) \\ &= \phi_p(a, b) + \phi_p(c, d) = \phi_p(w) + \phi_p(w'), \end{aligned}$$

where the inequality is true since the triangle inequality holds for the p-norm when p > 1.

(c) Let w = (a, b) ∈ ℝ² and α > 0. Then the proof follows by

$$\phi_p(\alpha w) = (|\alpha a|^p + |\alpha b|^p)^{1/p} - (\alpha a + \alpha b) = \alpha (|a|^p + |b|^p)^{1/p} - \alpha(a + b) = \alpha \phi_p(w).$$

(d) This is true by part (b) and part (c).

(e) Let w = (a, b) and w′ = (c, d). We have

$$\begin{aligned} |\phi_p(w) - \phi_p(w')| &= \big|\, \|(a, b)\|_p - (a + b) - \|(c, d)\|_p + (c + d) \,\big| \\ &\le \big|\, \|(a, b)\|_p - \|(c, d)\|_p \,\big| + |a - c| + |b - d| \\ &\le \|(a, b) - (c, d)\|_p + \sqrt{2}\,\sqrt{|a - c|^2 + |b - d|^2} \\ &\le \|(a, b) - (c, d)\|_p + \sqrt{2}\,\|(a, b) - (c, d)\| \\ &= \|w - w'\|_p + \sqrt{2}\,\|w - w'\|. \end{aligned}$$

Then, from the following inequality (see [13, (1.3)]),

$$\|x\|_{p_2} \le \|x\|_{p_1} \le n^{(1/p_1 - 1/p_2)} \|x\|_{p_2} \quad \text{for } x \in \mathbb{R}^n \text{ and } 1 < p_1 < p_2,$$

we obtain the desired results. □
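The last step of part (e) can be spelled out as follows. This is a short worked derivation that only combines the bound ‖w − w′‖_p + √2‖w − w′‖ with the norm inequality from [13, (1.3)] for n = 2; the case split is our reading of how the two constants arise.

```latex
% Case 1 < p < 2: take n = 2, p_1 = p, p_2 = 2 in the norm inequality.
\|w - w'\|_p \le 2^{1/p - 1/2}\,\|w - w'\|
\;\Longrightarrow\;
|\phi_p(w) - \phi_p(w')| \le \bigl(\sqrt{2} + 2^{1/p - 1/2}\bigr)\|w - w'\| = L_1 \|w - w'\|.

% Case p > 2: take n = 2, p_1 = 2, p_2 = p, so the p-norm is dominated by the 2-norm.
\|w - w'\|_p \le \|w - w'\|
\;\Longrightarrow\;
|\phi_p(w) - \phi_p(w')| \le \bigl(1 + \sqrt{2}\bigr)\|w - w'\| = L_2 \|w - w'\|.

% Case p = 2: the two norms coincide, which again gives the constant L_2 = 1 + \sqrt{2}.
```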

The function φ_p has further properties which are key to proving the results of the subsequent section.

Lemma 3.1 Let φ_p : IR² → IR be defined as (7) where p > 1. If {(a^k, b^k)} ⊆ IR² with (a^k → −∞) or (b^k → −∞) or (a^k → ∞ and b^k → ∞), then we have |φ_p(a^k, b^k)| → ∞ for k → ∞.

Proof. This result has been mentioned in [21, page 20]. □

Next, we study another family of NCP functions ψ_p : ℝ² → ℝ₊ defined by (8). This class of functions leads to a reformulation of the NCP as an unconstrained minimization; in other words, they form a family of merit functions for the NCP. Furthermore, they have some favorable properties, shown below. In particular, ψ_p for any given p > 1 is continuously differentiable everywhere, whereas φ_p is not differentiable everywhere.

Proposition 3.2 Let φ_p, ψ_p be defined as (7) and (8), respectively, where p is any fixed real number in the interval (1, +∞). Then

(a) ψ_p is an NCP-function, i.e., it satisfies (2).

(b) ψ_p(a, b) ≥ 0 for all (a, b) ∈ IR².

(c) ψ_p is continuously differentiable everywhere.

(d) ∇_aψ_p(a, b) · ∇_bψ_p(a, b) ≥ 0 for all (a, b) ∈ IR². The equality holds if and only if φ_p(a, b) = 0.

(e) ∇_aψ_p(a, b) = 0 ⇐⇒ ∇_bψ_p(a, b) = 0 ⇐⇒ φ_p(a, b) = 0.

Proof. (a) Since ψp(a, b) = 0 if and only if φp(a, b) = 0, the desired result is satisfied by Prop. 3.1(a).

(b) It is clear by definition of ψ_p.

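(c) From direct computation, we obtain ∇_aψ_p(0, 0) = ∇_bψ_p(0, 0) = 0. For (a, b) ≠ (0, 0),

$$\begin{aligned} \nabla_a \psi_p(a, b) &= \left( \frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b), \\ \nabla_b \psi_p(a, b) &= \left( \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p(a, b), \end{aligned} \tag{12}$$

where sgn(·) is the sign function. Clearly,

$$\left| \frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} \right| \le 1 \quad \text{and} \quad \left| \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \right| \le 1 \tag{13}$$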

(i.e., uniformly bounded) and moreover φp(a, b) → 0 as (a, b) → (0, 0). Therefore, we have ∇_aψ_p(a, b) → 0 and ∇_bψ_p(a, b) → 0 as (a, b) → (0, 0). This means that ψ_p is continuously differentiable everywhere.

(d) From part (c), if (a, b) = (0, 0), it is clear that ∇_aψ_p(a, b) · ∇_bψ_p(a, b) = 0 and ψ_p(a, b) = 0. Now we assume that (a, b) ≠ (0, 0). Then,

$$\nabla_a \psi_p(a, b) \cdot \nabla_b \psi_p(a, b) = \left( \frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \left( \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \right) \phi_p^2(a, b). \tag{14}$$

Again, from (13), it follows immediately that ∇_aψ_p(a, b) · ∇_bψ_p(a, b) ≥ 0 for all (a, b) ∈ ℝ². The equality holds if and only if φ_p(a, b) = 0, or

$$\frac{\operatorname{sgn}(a) \cdot |a|^{p-1}}{\|(a, b)\|_p^{p-1}} = 1 \quad \text{or} \quad \frac{\operatorname{sgn}(b) \cdot |b|^{p-1}}{\|(a, b)\|_p^{p-1}} = 1.$$

In fact, if the first of these ratios equals 1, then we have a > 0 and |a| = ‖(a, b)‖_p, which leads to b = 0 and hence φ_p(a, b) = (|a|^p)^{1/p} − a = a − a = 0. Similarly, we have φ_p(a, b) = 0 if the second ratio equals 1. Thus, we conclude that the equality holds if and only if φ_p(a, b) = 0.

(e) It is already seen in the last part of the proof of part (d). □
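The gradient formulas (12) can be sanity-checked against central finite differences. The sketch below is illustrative only: the function names, the step size `h`, and the test points are assumptions, and agreement at a few points is evidence for the formulas rather than a proof.

```python
from typing import Tuple


def phi_p(a: float, b: float, p: float) -> float:
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def psi_p(a: float, b: float, p: float) -> float:
    """psi_p from (8): half the square of phi_p."""
    return 0.5 * phi_p(a, b, p) ** 2


def sgn(t: float) -> float:
    return float((t > 0) - (t < 0))


def grad_psi_p(a: float, b: float, p: float) -> Tuple[float, float]:
    """Partial derivatives of psi_p following (12); zero at the origin."""
    if a == 0.0 and b == 0.0:
        return 0.0, 0.0
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    phi = norm - (a + b)
    ga = (sgn(a) * abs(a) ** (p - 1.0) / norm ** (p - 1.0) - 1.0) * phi
    gb = (sgn(b) * abs(b) ** (p - 1.0) / norm ** (p - 1.0) - 1.0) * phi
    return ga, gb


def central_difference(a: float, b: float, p: float,
                       h: float = 1e-6) -> Tuple[float, float]:
    """Numerical gradient of psi_p, used only as a cross-check."""
    da = (psi_p(a + h, b, p) - psi_p(a - h, b, p)) / (2.0 * h)
    db = (psi_p(a, b + h, p) - psi_p(a, b - h, p)) / (2.0 * h)
    return da, db


for point in [(0.7, -1.3), (2.0, 0.5), (-0.4, -0.9)]:
    exact = grad_psi_p(*point, 1.5)
    approx = central_difference(*point, 1.5)
    print(point, exact, approx, exact[0] * exact[1] >= 0.0)
```

The last printed flag checks the sign condition of Proposition 3.2(d) at each sample point.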

It was shown that if F is monotone [10] or a P₀-function [5], then any stationary point of Ψ is a global minimizer of the unconstrained minimization min_{x∈ℝⁿ} Ψ(x), and hence solves the NCP. Moreover, it was also shown that if F is strongly monotone [10] or a uniform P-function [5], then the level sets of Ψ are bounded. In what follows, we present and prove analogous results for Ψ_p under the same conditions as in [5, 10]. The ideas for proving the following propositions are borrowed from the analogous results in [5, 10].

Proposition 3.3 Let Ψp : IRⁿ→ IR be defined as (10) with p > 1. Then Ψp(x) ≥ 0 for all x ∈ IRⁿ and Ψ_p(x) = 0 if and only if x solves the NCP (1). Moreover, suppose that the NCP (1) has at least one solution. Then x is a global minimizer of Ψ_p if and only if x solves the NCP (1).

Proof. The results follow directly from Prop. 3.2. □

Proposition 3.4 Let Ψ_p : ℝⁿ → ℝ be defined as (10) with p > 1. Assume F is either monotone or a P₀-function. Then every stationary point of Ψ_p is a global minimizer of (11), and therefore solves the NCP (1).


Proof. (I) Assume first that F is monotone, and suppose that x^∗ is a stationary point of Ψ_p. Then we have ∇Ψ_p(x^∗) = 0, which implies that

$$\sum_{i=1}^{n} \Big( \nabla_a \psi_p(x_i^*, F_i(x^*))\, e_i + \nabla_b \psi_p(x_i^*, F_i(x^*))\, \nabla F_i(x^*) \Big) = 0, \tag{15}$$

where e_i = (0, …, 1, …, 0)^T. We denote ∇_aψ_p(x^∗, F(x^∗)) = (…, ∇_aψ_p(x^∗_i, F_i(x^∗)), …)^T and ∇_bψ_p(x^∗, F(x^∗)) = (…, ∇_bψ_p(x^∗_i, F_i(x^∗)), …)^T, respectively. Then (15) can be abbreviated as

$$\nabla_a \psi_p(x^*, F(x^*)) + \nabla F(x^*)\, \nabla_b \psi_p(x^*, F(x^*)) = 0. \tag{16}$$

Now, multiplying (16) by ∇_bψ_p(x^∗, F(x^∗))^T leads to

$$\sum_{i=1}^{n} \Big( \nabla_a \psi_p(x_i^*, F_i(x^*)) \cdot \nabla_b \psi_p(x_i^*, F_i(x^*)) \Big) + \nabla_b \psi_p(x^*, F(x^*))^T\, \nabla F(x^*)\, \nabla_b \psi_p(x^*, F(x^*)) = 0. \tag{17}$$

Since F is monotone, ∇F(x^∗) is positive semidefinite, so the second term of (17) is nonnegative. Moreover, each term in the first summation of (17) is nonnegative as well due to Prop. 3.2(d). Therefore, we have

$$\nabla_a \psi_p(x_i^*, F_i(x^*)) \cdot \nabla_b \psi_p(x_i^*, F_i(x^*)) = 0, \quad \forall i = 1, 2, \dots, n,$$

which yields φ_p(x^∗_i, F_i(x^∗)) = 0 for all i = 1, 2, …, n by Prop. 3.2(d). Thus, Ψ_p(x^∗) = 0, which says x^∗ is a global minimizer of (11).

(II) Now assume that F is a P₀-function and suppose x^∗ is a stationary point of Ψ_p. Then ∇Ψ_p(x^∗) = 0, which yields (16). Notice that ∇_aψ_p(a, b) and ∇_bψ_p(a, b) have the forms given in (12). Let A(x^∗) and B(x^∗) denote the possibly multivalued n × n diagonal matrices whose diagonal elements are given by

$$A_{ii}(x^*) = \frac{\operatorname{sgn}(x_i^*) \cdot |x_i^*|^{p-1}}{\|(x_i^*, F_i(x^*))\|_p^{p-1}} \quad \text{if } (x_i^*, F_i(x^*)) \ne (0, 0)$$

and

$$B_{ii}(x^*) = \frac{\operatorname{sgn}(F_i(x^*)) \cdot |F_i(x^*)|^{p-1}}{\|(x_i^*, F_i(x^*))\|_p^{p-1}} \quad \text{if } (x_i^*, F_i(x^*)) \ne (0, 0).$$

If (x^∗_i, F_i(x^∗)) = (0, 0), then we let A_ii(x^∗) = B_ii(x^∗) = 1, i.e., the corresponding diagonal entries of the n × n identity matrix.

With the notation A(x^∗), B(x^∗) and (12), equation (16) can be rewritten as

$$\big[ (A(x^*) - I) + \nabla F(x^*) (B(x^*) - I) \big] \Phi_p(x^*) = 0. \tag{18}$$

We want to prove that Φ_p(x^∗) = 0 (and hence Ψ_p(x^∗) = 0). Suppose not, i.e., Φ_p(x^∗) ≠ 0. Recall that Φ_p(x^∗) = 0 if and only if (1) is satisfied, and that the i-th component of Φ_p(x^∗) is φ_p(x^∗_i, F_i(x^∗)). Thus, φ_p(x^∗_i, F_i(x^∗)) ≠ 0 means one of the following occurs:

1. x^∗_i ≠ 0 and F_i(x^∗) ≠ 0.

2. x^∗_i = 0 and F_i(x^∗) < 0.

3. x^∗_i < 0 and F_i(x^∗) = 0.

In every case, we have B_ii(x^∗) ≠ 1 (since B_ii(x^∗) = 1 if and only if φ_p(x^∗_i, F_i(x^∗)) = 0 by Prop. 3.2(d)(e)), so that (B_ii(x^∗) − 1) · φ_p(x^∗_i, F_i(x^∗)) ≠ 0. Similar arguments apply to the vector (A(x^∗) − I)Φ_p(x^∗). Thus, from the above, we can easily verify that if Φ_p(x^∗) ≠ 0 then (B(x^∗) − I)Φ_p(x^∗) and (A(x^∗) − I)Φ_p(x^∗) are both nonzero. Moreover, their nonzero elements are in the same positions, and such nonzero elements have the same sign. But, for equation (18) to hold, it would be necessary that ∇F(x^∗) "revert the sign" of all the nonzero elements of (B(x^∗) − I)Φ_p(x^∗), which contradicts the fact that ∇F(x^∗) is a P₀-matrix by Lemma 2.1. □

Proposition 3.5 Let Ψ_p : IRⁿ → IR be defined as (10) with p > 1. Assume F is either strongly monotone or uniform P -function, then the level sets

L(Ψp, γ) := {x ∈ IRⁿ | Ψp(x) ≤ γ}

are bounded for all γ ∈ IR.

Proof. (I) First, we consider the case where F is strongly monotone. Suppose there exists a sequence {x^k}_{k∈K} ⊆ L(Ψ_p, γ) for some γ ≥ 0 with ‖x^k‖ → ∞, where K is a subset of ℕ. We define the index set

$$J := \{\, i \in \{1, 2, \dots, n\} \mid \{x_i^k\} \text{ is unbounded} \,\}.$$

Since {x^k} is unbounded, J ≠ ∅. Let {z^k} denote a bounded sequence defined by

$$z_i^k = \begin{cases} 0, & \text{if } i \in J, \\ x_i^k, & \text{if } i \notin J. \end{cases}$$

Then, from the definition of {z^k} and the strong monotonicity of F, we obtain

$$\begin{aligned} \mu \sum_{i \in J} (x_i^k)^2 &= \mu \|x^k - z^k\|^2 \\ &\le \langle x^k - z^k, F(x^k) - F(z^k) \rangle \\ &= \sum_{i=1}^{n} (x_i^k - z_i^k)(F_i(x^k) - F_i(z^k)) \\ &= \sum_{i \in J} x_i^k (F_i(x^k) - F_i(z^k)) \\ &\le \Big( \sum_{i \in J} (x_i^k)^2 \Big)^{1/2} \sum_{i \in J} |F_i(x^k) - F_i(z^k)|. \end{aligned} \tag{19}$$

Since Σ_{i∈J} (x^k_i)² ≠ 0 for k ∈ K, dividing both sides of (19) by (Σ_{i∈J} (x^k_i)²)^{1/2} yields

$$\mu \Big( \sum_{i \in J} (x_i^k)^2 \Big)^{1/2} \le \sum_{i \in J} |F_i(x^k) - F_i(z^k)|, \quad k \in K. \tag{20}$$

On the other hand, {F_i(z^k)}_{k∈K} is bounded (i ∈ J) because {z^k}_{k∈K} is bounded and F is continuous. Therefore, from (20), we have

{|F_{i₀}(x^k)|} → ∞ for some i₀ ∈ J.

Also, {|x^k_{i₀}|} → ∞ by the definition of the index set J. Thus, Lemma 3.1 yields

φ_p(x^k_{i₀}, F_{i₀}(x^k)) → ∞ as k → ∞.

But this contradicts {x^k} ⊆ L(Ψ_p, γ).

(II) If F is a uniform P-function, then the proof follows almost the same arguments as above. In particular, (19) is replaced by

$$\begin{aligned} \mu \sum_{i \in J} (x_i^k)^2 &= \mu \|x^k - z^k\|^2 \\ &\le \max_{1 \le i \le n} (x_i^k - z_i^k)(F_i(x^k) - F_i(z^k)) \\ &= \max_{i \in J} x_i^k (F_i(x^k) - F_i(z^k)) \\ &= x_{j_0}^k (F_{j_0}(x^k) - F_{j_0}(z^k)) \\ &\le |x_{j_0}^k| \cdot |F_{j_0}(x^k) - F_{j_0}(z^k)|, \end{aligned} \tag{21}$$

where j₀ is one of the indices for which the max is attained. Dividing both sides of (21) by |x^k_{j₀}|, the proof then follows as in part (I). □

4 A descent method

In this section, we study a descent method for solving the unconstrained minimization (11), which does not require the derivative of F involved in the NCP. In addition, we prove a global convergence result for this derivative-free descent algorithm. More precisely, we consider the search direction as below:

$$d^k := -\nabla_b \psi_p(x^k, F(x^k)), \tag{22}$$

where ∇_bψ_p(x^k, F(x^k)) = (∇_bψ_p(x^k_1, F_1(x^k)), …, ∇_bψ_p(x^k_n, F_n(x^k)))^T. From the following lemma, we see that d^k is a descent direction of Ψ_p at x^k under the monotonicity assumption.


Lemma 4.1 Let x^k ∈ ℝⁿ and F be a monotone function. Then the search direction defined as (22) satisfies the descent condition ∇Ψ_p(x^k)^T d^k < 0 as long as x^k is not a solution of the NCP (1). Moreover, if F is strongly monotone with modulus µ > 0, then

$$\nabla \Psi_p(x^k)^T d^k \le -\mu \|d^k\|^2.$$

Proof. Since ∇Ψ_p(x^k) = ∇_aψ_p(x^k, F(x^k)) + ∇F(x^k)∇_bψ_p(x^k, F(x^k)), we have that

$$\nabla \Psi_p(x^k)^T d^k = -\sum_{i=1}^{n} \nabla_a \psi_p(x_i^k, F_i(x^k)) \cdot \nabla_b \psi_p(x_i^k, F_i(x^k)) - (d^k)^T \nabla F(x^k)\, d^k. \tag{23}$$

From the monotonicity of F, it follows that ∇F(x^k) is positive semidefinite, so the quadratic form (d^k)^T∇F(x^k)d^k in the second term of (23) is nonnegative. Also, by Prop. 3.2(d), each product in the sum in the first term of (23) is nonnegative. Therefore, ∇Ψ_p(x^k)^T d^k ≤ 0. We next prove that ∇Ψ_p(x^k)^T d^k < 0 by contradiction. Assume that ∇Ψ_p(x^k)^T d^k = 0. Then ∇_aψ_p(x^k_i, F_i(x^k)) · ∇_bψ_p(x^k_i, F_i(x^k)) = 0 for all i which, by Prop. 3.2(d) again, yields φ_p(x^k_i, F_i(x^k)) = 0. Thus, Φ_p(x^k) = 0 and Ψ_p(x^k) = 0. Consequently, x^k solves the NCP (1). This contradicts our assumption that x^k is not a solution of the NCP (1).

If F is strongly monotone with modulus µ > 0, then we have that

$$\nabla \Psi_p(x^k)^T d^k \le -(d^k)^T \nabla F(x^k)\, d^k \le -\mu \|d^k\|^2,$$

where the first inequality follows from (23) and Prop. 3.2(d). □

The above lemma motivates the following descent algorithm.

Algorithm 4.1

(Step 0) Given a real number p > 1 and x⁰ ∈ ℝⁿ. Choose the parameters ε ≥ 0, σ ∈ (0, 1) and β ∈ (0, 1). Set k := 0.

(Step 1) If Ψ_p(x^k) ≤ ε, then stop.

(Step 2) Let d^k := −∇_bψ_p(x^k, F(x^k)).

(Step 3) Compute a step-size t_k := β^{m_k}, where m_k is the smallest nonnegative integer m satisfying the Armijo-type condition:

$$\Psi_p(x^k + \beta^m d^k) \le (1 - \sigma \beta^{2m})\, \Psi_p(x^k). \tag{24}$$

(Step 4) Set x^{k+1} := x^k + t_k d^k, k := k + 1 and go to Step 1.
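The steps of Algorithm 4.1 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code: the function names, the iteration and backtracking caps, the parameter values `eps`, `sigma`, `beta`, and the test map `F_example` are all assumptions. `F_example` is a hypothetical strongly monotone affine map whose NCP solution is x = (1, 1).

```python
from typing import Callable, List, Sequence, Tuple


def phi_p(a: float, b: float, p: float) -> float:
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def grad_b_psi_p(a: float, b: float, p: float) -> float:
    """Partial derivative of psi_p with respect to b, following (12)."""
    if a == 0.0 and b == 0.0:
        return 0.0
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    sign_b = (b > 0) - (b < 0)
    return (sign_b * abs(b) ** (p - 1.0) / norm ** (p - 1.0) - 1.0) * (norm - (a + b))


def merit(x: Sequence[float], Fx: Sequence[float], p: float) -> float:
    return 0.5 * sum(phi_p(xi, fi, p) ** 2 for xi, fi in zip(x, Fx))


def descent(F: Callable[[Sequence[float]], List[float]], x0: Sequence[float],
            p: float = 1.5, eps: float = 1e-10, sigma: float = 1e-4,
            beta: float = 0.5, max_iter: int = 10_000,
            max_backtracks: int = 60) -> Tuple[List[float], int]:
    x = list(x0)
    for k in range(max_iter):
        Fx = F(x)
        psi = merit(x, Fx, p)
        if psi <= eps:                       # Step 1: stopping test
            return x, k
        # Step 2: derivative-free direction, uses F(x) but not its Jacobian
        d = [-grad_b_psi_p(xi, fi, p) for xi, fi in zip(x, Fx)]
        t = 1.0
        for _ in range(max_backtracks):      # Step 3: Armijo-type rule (24)
            trial = [xi + t * di for xi, di in zip(x, d)]
            if merit(trial, F(trial), p) <= (1.0 - sigma * t * t) * psi:
                break
            t *= beta
        else:
            return x, k                      # no acceptable step-size found
        x = trial                            # Step 4: accept the step
    return x, max_iter


def F_example(x: Sequence[float]) -> List[float]:
    """Hypothetical map M x + q with M = [[2, 1], [1, 2]], q = (-3, -3)."""
    return [2.0 * x[0] + x[1] - 3.0, x[0] + 2.0 * x[1] - 3.0]


solution, iterations = descent(F_example, [0.0, 0.0])
print(solution, iterations)
```

Since t = β^m, the factor β^{2m} in (24) appears as `t * t` in the acceptance test. The early return when backtracking fails mirrors the "too small steplength" stopping rule used in the experiments of Section 5.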

We next show the global convergence result for Algorithm 4.1 under the assumption that F is strongly monotone. To this end, we assume that the parameter ε used in Algorithm 4.1 is set to zero and that Algorithm 4.1 generates an infinite sequence {x^k}.


Proposition 4.1 Suppose that F is strongly monotone. Then the sequence {x^k} generated by Algorithm 4.1 has at least one accumulation point, and any accumulation point is a solution of the NCP (1).

Proof. First, we show that there exists a nonnegative integer m_k in Step 3 of Algorithm 4.1 whenever x^k is not a solution. Assume that the conclusion does not hold. Then for any integer m ≥ 0,

$$\Psi_p(x^k + \beta^m d^k) - \Psi_p(x^k) > -\sigma \beta^{2m}\, \Psi_p(x^k).$$

Dividing both sides by β^m and taking the limit m → ∞ yields ⟨∇Ψ_p(x^k), d^k⟩ ≥ 0.

Since F is strongly monotone, this contradicts Lemma 4.1. Hence, we can find an integer m_k in Step 3.

Secondly, we show that the sequence {x^k} generated by Algorithm 4.1 has at least one accumulation point. By the descent property of Algorithm 4.1, the sequence {Ψp(x^k)} is decreasing. Thus, by Prop. 3.5, the generated sequence is bounded and hence it has at least one accumulation point.

Finally, we prove that every accumulation point is a solution of the NCP (1). Let x^∗ be an arbitrary accumulation point of the generated sequence {x^k}. Then there exists a subsequence {x^k}_{k∈K} converging to x^∗. Since ψ_p is continuously differentiable, −∇_bψ_p(·, F(·)) is continuous, and therefore {d^k}_{k∈K} → d^∗ := −∇_bψ_p(x^∗, F(x^∗)). Next, we discuss two cases. First, we consider the case where there exists a constant β̄ such that β^{m_k} ≥ β̄ > 0 for all k ∈ K. Then, from (24), we obtain

$$\Psi_p(x^{k+1}) \le (1 - \sigma \bar{\beta}^2)\, \Psi_p(x^k)$$

for all k ∈ K, and the entire sequence {Ψ_p(x^k)} is decreasing. Thus, we have Ψ_p(x^∗) = 0 (by taking the limit), which says x^∗ is a solution of the NCP (1). Now, we consider the other case, where there exists a further subsequence such that β^{m_k} → 0. Note that by Armijo's rule (24) in Step 3, we have

$$\Psi_p(x^k + \beta^{m_k - 1} d^k) - \Psi_p(x^k) > -\sigma \beta^{2(m_k - 1)}\, \Psi_p(x^k).$$

Dividing both sides by β^{m_k − 1} and passing to the limit on the subsequence, we obtain ⟨∇Ψ_p(x^∗), d^∗⟩ ≥ 0,

which implies that x^∗ is a solution of the NCP (1) by Lemma 4.1. □


5 Numerical experiments

We implemented Algorithm 4.1 with our code in MATLAB 6.1 for all test problems with all available starting points in MCPLIB [1]. All numerical experiments were done on a PC with a 2.8 GHz CPU and 512 MB of RAM. In order to improve the numerical behavior of Algorithm 4.1, we replaced the standard (monotone) Armijo rule by the nonmonotone line search described in [11], i.e., we computed the smallest nonnegative integer l such that

$$\Psi_p(x^k + \beta^l d^k) \le W_k - \sigma \beta^{2l}\, \Psi_p(x^k),$$

where W_k is given by

$$W_k = \max_{j = k - m_k, \dots, k} \Psi_p(x^j)$$

and where, for given nonnegative integers m̂ and s, we set

$$m_k = \begin{cases} 0 & \text{if } k \le s, \\ \min\{m_{k-1} + 1, \hat{m}\} & \text{otherwise.} \end{cases}$$
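The nonmonotone rule above can be sketched as three small helpers. The names, the backtracking cap, and the one-dimensional toy merit function in the usage lines are hypothetical; the defaults m̂ = 5, s = 5, σ = 1.0e−10 and β = 0.2 follow the values reported in this section.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def memory_length(k: int, prev_m: int, m_hat: int = 5, s: int = 5) -> int:
    """m_k: zero for the first s iterations, then grows by one up to m_hat."""
    return 0 if k <= s else min(prev_m + 1, m_hat)


def reference_value(history: Sequence[float], m_k: int) -> float:
    """W_k: largest of the last m_k + 1 merit values; history[-1] is Psi_p(x^k)."""
    return max(history[-(m_k + 1):])


def nonmonotone_step(merit: Callable[[Sequence[float]], float],
                     x: Sequence[float], d: Sequence[float],
                     history: Sequence[float], m_k: int,
                     sigma: float = 1e-10, beta: float = 0.2,
                     max_backtracks: int = 60
                     ) -> Optional[Tuple[float, List[float]]]:
    """Return (step, new point) for the smallest l passing the test, else None."""
    w_k = reference_value(history, m_k)
    psi = history[-1]
    t = 1.0
    for _ in range(max_backtracks):
        trial = [xi + t * di for xi, di in zip(x, d)]
        if merit(trial) <= w_k - sigma * t * t * psi:
            return t, trial
        t *= beta
    return None


# Toy usage with a hypothetical one-dimensional merit function.
toy_history = [4.0, 1.0]
print(nonmonotone_step(lambda v: v[0] ** 2, [1.0], [-4.0], toy_history, m_k=1))
```

With m_k = 0 the reference value W_k reduces to Ψ_p(x^k), so the test coincides with the monotone condition (24); a larger memory lets the merit function increase temporarily, as long as it stays below the recent maximum.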

Throughout the experiments, we use m̂ = 5 and s = 5. Moreover, we use the parameters σ = 1.0e−10 and β = 0.2 in Algorithm 4.1. We terminated the iteration when the number of iterations exceeded 500000, or the steplength was less than 1.0e−10, or one of the following conditions was satisfied:

(C1) Ψ_p(x^k) ≤ 1.0e−5 and (x^k)^T F(x^k) ≤ 5.0e−3;

(C2) Ψ_p(x^k) ≤ 3.0e−7 and (x^k)^T F(x^k) ≤ 3.0e−2;

(C3) Ψ_p(x^k) ≤ 3.0e−6 and (x^k)^T F(x^k) ≤ 1.0e−2.

Our computational results are summarized in Tables 1-3 (see the Appendix). In these tables, the first column lists the name of the problems and the starting point number in MCPLIB, Gap denotes the value of x^TF (x) at the final iteration, NF indicates the number of function evaluations of the merit function Ψ_p for solving each problem, and Time represents the CPU time in seconds for solving each problem.

The results reported in Tables 1-3 show that our descent method based on the merit function Ψ_1.5(x), Ψ₂(x) or Ψ₃(x) was able to solve most complementarity problems in MCPLIB. More precisely, there are seven failures (pgvon105, pgvon106, powell, scarfanum, scarfasum, scarfbnum, scarfbsum) for Algorithm 4.1 due to a too small steplength. After a careful check, we found that the direction d defined in Algorithm 4.1 is not a descent direction for these problems. In fact, these seven problems are regarded as difficult ones for Newton-type algorithms [19, 20]. In addition, the descent algorithm using the merit function Ψ_1.5(x) gives better numerical results than the one using the Fischer-Burmeister merit function. In particular, it appears from Tables 1-3 that the descent algorithm based on Ψ_p(x) takes more function evaluations and yields a larger value of Gap as the parameter p increases. A plausible interpretation is that the value of Ψ_p(x) becomes smaller as p increases, which causes some difficulty for the descent Algorithm 4.1. This also suggests that the performance of Algorithm 4.1 becomes worse as the parameter p increases. To the best of our knowledge, this observation has not been reported in the literature, and it may be useful in constructing new NCP-functions.

6 Final remarks

In this paper, we have studied a family of NCP-functions φ_p(a, b) which includes the well-known Fischer-Burmeister function as a special case, and have shown that this class of functions enjoys some favorable properties, as other NCP-functions do. In addition, we proposed a descent method for the unconstrained minimization (11), which is a reformulation of the NCP via the proposed NCP-functions. Numerical results for the test problems in MCPLIB have shown that this method is promising when Ψ_p(x) is specified as Ψ_1.5(x), Ψ₂(x) or Ψ₃(x). Moreover, our numerical experiments indicate that the performance of the descent method becomes better as p decreases, which is a new observation. This suggests that there exist NCP-functions that perform better than the Fischer-Burmeister function, at least for this method. It is not yet known whether a similar phenomenon occurs in other algorithms, which is an interesting topic for future study.

There are still many issues to be explored for this family of NCP-functions, like those studied for other NCP-functions in the literature. For instance, it would be of interest to know the semismoothness property of ψ_p and the Lipschitz continuity of ∇ψ_p. In fact, some of them have recently been studied in [2, 3]. In addition, it is interesting to know whether this class of NCP-functions can be used for SDCP and SOCCP. Some researchers have started to investigate this issue, but no results have been reported so far. We leave these as topics for future research.

Acknowledgement. The authors are grateful to Professor P. Tseng for his suggestion on studying this family of NCP-functions and thank the referees for their careful reading and helpful suggestions.

References

[1] S. C. Billups, S. P. Dirkse and M. C. Soares, A comparison of algorithms for large scale mixed complementarity problems, Computational Optimization and Applications, vol. 7, pp. 3-25, 1997.

[2] J.-S. Chen, On some NCP-functions based on the generalized Fischer-Burmeister function, Asia-Pacific Journal of Operational Research, vol. 24, pp. 401-420, 2007.


[3] J.-S. Chen, The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem, Journal of Global Optimization, vol. 36, pp. 565-580, 2006.

[4] R.W. Cottle, J.-S. Pang and R.-E. Stone, The Linear Complementarity Problem, Academic Press, New York, 1992.

[5] F. Facchinei and J. Soares, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization, vol. 7, pp. 225-247, 1997.

[6] M. C. Ferris, O. L. Mangasarian, and J.-S. Pang, editors, Complementarity: Applications, Algorithms and Extensions, Kluwer Academic Publishers, Dordrecht, 2001.

[7] A. Fischer, A special Newton-type optimization method, Optimization, vol. 24, pp. 269-284, 1992.

[8] A. Fischer, Solution of the monotone complementarity problem with locally Lipschitzian functions, Mathematical Programming, vol. 76, pp. 513-532, 1997.

[9] M. Fukushima, Merit functions for variational inequality and complementarity problems, Nonlinear Optimization and Applications, edited by G. Di Pillo and F. Giannessi, Plenum Press, New York, pp. 155-170, 1996.

[10] C. Geiger and C. Kanzow, On the resolution of monotone complementarity problems, Computational Optimization and Applications, vol. 5, pp. 155-173.

[11] L. Grippo, F. Lampariello and S. Lucidi, A nonmonotone line search technique for Newton's method, SIAM Journal on Numerical Analysis, vol. 23, pp. 707-716, 1986.

[12] P. T. Harker and J.-S. Pang, Finite dimensional variational inequality and nonlinear complementarity problem: A survey of theory, algorithms and applications, Mathematical Programming, vol. 48, pp. 161-220, 1990.

[13] N.J. Higham, Estimating the matrix p-norm, Numerische Mathematik, vol. 62, pp. 539-555, 1992.

[14] H. Jiang, Unconstrained minimization approaches to nonlinear complementarity problems, Journal of Global Optimization, vol. 9, pp. 169-181, 1996.

[15] C. Kanzow, Nonlinear complementarity as unconstrained optimization, Journal of Optimization Theory and Applications, vol. 88, pp. 139-155, 1996.

[16] O. L. Mangasarian and M. V. Solodov, Nonlinear complementarity as unconstrained and constrained minimization, Mathematical Programming, vol. 62, pp. 277-297, 1993.

[17] O. L. Mangasarian, Equivalence of the complementarity problem to a system of nonlinear equations, SIAM Journal on Applied Mathematics, vol. 31, pp. 89-92, 1976.

[18] J.-S. Pang, Complementarity problems, Handbook of Global Optimization, edited by R. Horst and P. Pardalos, Kluwer Academic Publishers, Boston, Massachusetts, pp. 271-338, 1994.

[19] H.-D. Qi and L.-Z. Liao, A smoothing Newton method for general nonlinear complementarity problems, Computational Optimization and Applications, vol. 17, pp. 231-253, 2000.

[20] D. Sun and L.-Q. Qi, On NCP-functions, Computational Optimization and Applications, vol. 13, pp. 201-220, 1999.

[21] P. Tseng, Global behaviour of a class of merit functions for the nonlinear complementarity problem, Journal of Optimization Theory and Applications, vol. 89, pp. 17-37, 1996.

[22] K. Yamada, N. Yamashita, and M. Fukushima, A new derivative-free descent method for the nonlinear complementarity problems, in Nonlinear Optimization and Related Topics, edited by G. Di Pillo and F. Giannessi, Kluwer Academic Publishers, Netherlands, pp. 463-489, 2000.

[23] N. Yamashita and M. Fukushima, On stationary points of the implicit Lagrangian for the nonlinear complementarity problems, Journal of Optimization Theory and Applications, vol. 84, pp. 653-663, 1995.

[24] N. Yamashita and M. Fukushima, Modified Newton methods for solving a semismooth reformulation of monotone complementarity problems, Mathematical Programming, vol. 76, pp. 469-491, 1997.