

Journal of Computational and Applied Mathematics, vol. 232, pp. 455-471, 2009

A R-linearly convergent derivative-free algorithm for the NCPs based on the generalized Fischer-Burmeister merit function

Jein-Shan Chen ¹ Department of Mathematics National Taiwan Normal University

Taipei, Taiwan 11677 E-mail: jschen@math.ntnu.edu.tw

Hung-Ta Gao

Department of Mathematics National Taiwan Normal University

Taipei, Taiwan 11677 E-mail: kleinmankao@gmail.com

Shaohua Pan

School of Mathematical Sciences South China University of Technology

Guangzhou 510640, China E-mail: shhpan@scut.edu.cn

July 10, 2008

Abstract. In the paper [4], the authors proposed a derivative-free descent algorithm for the nonlinear complementarity problems (NCPs) by the generalized Fischer-Burmeister merit function: ψ_p(a, b) = (1/2)[∥(a, b)∥_p − (a + b)]², and observed that the choice of the parameter p has a great influence on the numerical performance of the algorithm. In this paper, we analyze the phenomenon theoretically for a derivative-free descent algorithm which is based on a penalized form of ψ_p and uses a different direction from [4]. More specifically, we show that the algorithm proposed is globally convergent and has a locally R-linear convergence rate, and furthermore, its convergence rate will become worse when the parameter p decreases. Numerical results are also reported for the test problems from MCPLIB, which further verify the obtained theoretical results.

¹ Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan.


Key Words. Nonlinear complementarity problem, NCP-function, merit function, global error bound, convergence rate.

1 Introduction

The nonlinear complementarity problem (NCP) is to ﬁnd a point x∈ IRⁿ such that

x≥ 0, F (x) ≥ 0, ⟨x, F (x)⟩ = 0, (1)

where ⟨·, ·⟩ is the Euclidean inner product and F = (F1, . . . , F_n)^T is a map from IRⁿ to IRⁿ. We assume that F is continuously diﬀerentiable throughout this paper. The NCP has attracted much attention because of its wide applications in the ﬁelds of economics, engineering, and operations research [5, 12], to name a few.

Many methods have been proposed to solve the NCP; see [9, 12, 18] and the references therein. One of the most powerful and popular methods is to reformulate the NCP as a system of nonlinear equations [16, 19, 24], or an unconstrained minimization problem [6, 7, 8, 10, 13, 14, 20, 23]. The objective function that can constitute an equivalent unconstrained minimization problem is called a merit function, whose global minima are coincident with the solutions of the original NCP. To construct a merit function, a class of functions, called NCP-functions and deﬁned below, plays a signiﬁcant role.

Definition 1.1 A function ϕ : IR² → IR is called an NCP-function if it satisﬁes

ϕ(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0. (2)

The Fischer-Burmeister (FB) function is a well-known NCP-function defined as

$$\phi_{FB}(a, b) = \sqrt{a^2 + b^2} - (a + b), \tag{3}$$

by which the NCP can be reformulated as a system of nonsmooth equations:

$$\Phi_{FB}(x) = \begin{pmatrix} \phi_{FB}(x_1, F_1(x)) \\ \vdots \\ \phi_{FB}(x_n, F_n(x)) \end{pmatrix} = 0. \tag{4}$$

Thus, the function Ψ_FB : IRⁿ → IR+ deﬁned as below is a merit function for the NCP:

$$\Psi_{FB}(x) := \frac{1}{2}\|\Phi_{FB}(x)\|^2 = \sum_{i=1}^{n} \psi_{FB}(x_i, F_i(x)), \tag{5}$$

where ψ_FB : IR² → IR₊ is one half of the square of ϕ_FB, i.e.,

$$\psi_{FB}(a, b) = \frac{1}{2}\left|\sqrt{a^2 + b^2} - (a + b)\right|^2. \tag{6}$$

Consequently, the NCP is equivalent to an unconstrained minimization problem:

$$\min_{x \in \mathbb{R}^n} \Psi_{FB}(x). \tag{7}$$

Recently, an extension of the FB function was considered in [2, 3, 4] by the authors. More specifically, we define the generalized FB function ϕ_p : IR² → IR by

$$\phi_p(a, b) := \|(a, b)\|_p - (a + b), \tag{8}$$

where p > 1 is an arbitrary fixed real number and ∥(a, b)∥_p denotes the p-norm of (a, b), i.e., ∥(a, b)∥_p = (|a|^p + |b|^p)^{1/p}. In other words, in the function ϕ_p, we replace the 2-norm of (a, b) in the FB function by a more general p-norm. The function ϕ_p is still an NCP-function, which naturally induces another NCP-function ψ_p : IR² → IR₊ given by

$$\psi_p(a, b) := \frac{1}{2}|\phi_p(a, b)|^2. \tag{9}$$

For any given p > 1, the function ψ_p is shown to possess all favorable properties of the FB function ψ_FB; see [2, 3, 4]. For example, ψ_p is also continuously differentiable everywhere on IR². Like ϕ_FB, the operator Φ_p : IRⁿ → IRⁿ defined as

$$\Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix} \tag{10}$$

yields a family of merit functions Ψ_p : IRⁿ → IR₊ for the NCP:

$$\Psi_p(x) := \frac{1}{2}\|\Phi_p(x)\|^2 = \sum_{i=1}^{n} \psi_p(x_i, F_i(x)). \tag{11}$$
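The definitions (8), (9) and (11) translate directly into code. The following is a minimal sketch in plain Python; the names `phi_p`, `psi_p`, `merit_p` and the test map `F_shift` (F(x) = x − 1, whose NCP solution is x = (1, ..., 1)) are our own illustrative choices and do not come from the paper.

```python
def phi_p(a, b, p):
    """Generalized FB function (8): p-norm of (a, b) minus (a + b)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def psi_p(a, b, p):
    """Scalar merit term (9): half the squared generalized FB residual."""
    return 0.5 * phi_p(a, b, p) ** 2


def merit_p(x, F, p):
    """Merit function (11): sum of psi_p over the pairs (x_i, F_i(x))."""
    return sum(psi_p(xi, fi, p) for xi, fi in zip(x, F(x)))


def F_shift(x):
    """Hypothetical test map F(x) = x - 1 (an assumption for illustration)."""
    return [xi - 1.0 for xi in x]


# The merit value vanishes at the solution and is positive elsewhere.
for p in (1.5, 2.0, 3.0):
    print(p, merit_p([1.0, 1.0], F_shift, p), merit_p([0.5, 2.0], F_shift, p))
```

The loop only illustrates the defining property of a merit function: zero at a solution of the NCP, positive at a non-solution, for each tested value of p.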

In this paper, we study the following merit function Ψ_α,p : IRⁿ → IR for the NCP:

$$\Psi_{\alpha,p}(x) := \sum_{i=1}^{n} \psi_{\alpha,p}(x_i, F_i(x)), \tag{12}$$

where ψ_α,p : IR² → IR₊ is an NCP-function defined by

$$\psi_{\alpha,p}(a, b) := \frac{\alpha}{2}\left(\max\{0, ab\}\right)^2 + \psi_p(a, b) = \frac{\alpha}{2}(ab)_+^2 + \frac{1}{2}\left(\|(a, b)\|_p - (a + b)\right)^2 \tag{13}$$

with α ≥ 0 being a real parameter. When α = 0, the function ψ_α,p reduces to ψ_p. Hence, ψ_α,p is an extension of ψ_p. Besides, ψ_α,p also extends the function ψ_α studied in [25] by Yamada, Yamashita, and Fukushima, which corresponds to p = 2. Indeed, ψ_α,p was studied in [3] by one of the authors (see ψ₄ therein), but the error bound property was not investigated there. In this paper, we present more favorable properties of ψ_α,p, and particularly, the conditions under which Ψ_α,p provides a global error bound for the NCP.
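To make the role of the penalty term in (13) concrete, here is a small Python sketch. The function names and the sample point (a, b) = (2, 3) are assumptions chosen for illustration only.

```python
def psi_alpha_p(a, b, p, alpha):
    """Penalized term (13): (alpha/2) * (ab)_+^2 + (1/2) * phi_p(a, b)^2."""
    ab_plus = max(0.0, a * b)
    phi = (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)
    return 0.5 * alpha * ab_plus ** 2 + 0.5 * phi ** 2


def merit_alpha_p(x, Fx, p, alpha):
    """Merit function (12), given the vectors x and F(x)."""
    return sum(psi_alpha_p(xi, fi, p, alpha) for xi, fi in zip(x, Fx))


# With alpha = 0 the term reduces to psi_p; with alpha > 0 a point with
# ab > 0 is penalized by the extra amount (alpha/2) * (ab)^2.
a, b, p = 2.0, 3.0, 3.0
base = psi_alpha_p(a, b, p, alpha=0.0)
penalized = psi_alpha_p(a, b, p, alpha=1.0)
print(base, penalized - base)
```

At a complementary pair such as (0, 5) both terms vanish, so the penalty does not change the set of zeros.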

With these results, we propose a derivative-free descent algorithm based on Ψ_α,p and establish its global convergence and local R-linear convergence rate. Moreover, we also analyze the influence of p on the convergence rate of the proposed algorithm theoretically and conclude that the convergence rate of the algorithm becomes worse as the value of p decreases. Thus, this paper can be viewed as a follow-up of [3] and [4].

This paper is organized as follows. In Section 2, we review some definitions and preliminary results to be used in the subsequent analysis. In Section 3, we show some important properties of the proposed merit function. In Section 4, we propose a derivative-free algorithm associated with Ψ_α,p, prove its global convergence and the R-linear convergence rate, and analyze the influence of p on the convergence rate. Some numerical experiments are reported in Section 5, and we make concluding remarks in Section 6.

Throughout this paper, IRⁿ denotes the space of n-dimensional real column vectors and ^T denotes transpose. For every differentiable function f : IRⁿ → IR, ∇f(x) denotes the gradient of f at x. For every differentiable mapping F = (F₁, . . . , F_n)^T : IRⁿ → IRⁿ, ∇F(x) = (∇F₁(x) . . . ∇F_n(x)) denotes the transposed Jacobian of F at x. We denote by ∥x∥_p the p-norm of x and by ∥x∥ the Euclidean norm of x. The level set of a function Ψ : IRⁿ → IR is denoted by L(Ψ, c) := {x ∈ IRⁿ | Ψ(x) ≤ c}. In addition, we also use the natural residual merit function Ψ_NR : IRⁿ → IR₊ defined by

$$\Psi_{NR}(x) := \frac{1}{2}\sum_{i=1}^{n} \phi_{NR}^2(x_i, F_i(x)), \tag{14}$$

where ϕ_NR : IR² → IR denotes the minimum NCP-function min{a, b}. Unless otherwise stated, in the sequel, we always suppose that p is a ﬁxed real number in (1,∞).

2 Preliminaries

This section mainly recalls some concepts about the mapping F that will be used later.

Definition 2.1 Let F = (F₁, . . . , F_n)^T with F_i : IRⁿ → IR for i = 1, . . . , n. We say that

(a) F is monotone if ⟨x − y, F(x) − F(y)⟩ ≥ 0 for all x, y ∈ IRⁿ.

(b) F is strongly monotone with modulus µ > 0 if ⟨x − y, F(x) − F(y)⟩ ≥ µ∥x − y∥² for all x, y ∈ IRⁿ.

(c) F is a P₀-function if $\max_{1 \le i \le n,\ x_i \ne y_i} (x_i - y_i)(F_i(x) - F_i(y)) \ge 0$ for all x, y ∈ IRⁿ with x ≠ y.

(d) F is a uniform P-function with modulus µ > 0 if $\max_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) \ge \mu\|x - y\|^2$ for all x, y ∈ IRⁿ.

(e) ∇F(x) is uniformly positive definite with modulus µ > 0 if d^T∇F(x)d ≥ µ∥d∥² for all x ∈ IRⁿ and d ∈ IRⁿ.

(f) F is Lipschitz continuous if there exists a constant L > 0 such that ∥F(x) − F(y)∥ ≤ L∥x − y∥ for all x, y ∈ IRⁿ.

From Deﬁnition 2.1, it is easy to see that F is a uniform P -function with modulus µ > 0 if F is strongly monotone with modulus µ > 0, and F is a P₀-function if F is monotone. In addition, when F is continuously diﬀerentiable, the following results hold:

1. F is monotone if and only if ∇F(x) is positive semidefinite for all x ∈ IRⁿ.

2. F is strongly monotone if and only if ∇F(x) is uniformly positive definite.

3 Properties of the Merit Function

In this section, we study some favorable properties of the function ψ_α,p which will be used in the subsequent analysis, and then present some mild conditions under which the merit function Ψ_α,p has bounded level sets and provides a global error bound, respectively.

The following lemma states that ψ_α,p shares many of the favorable properties of ψ_p. Furthermore, when α > 0, it has an important property that ψ_p does not have (see Lemma 3.1(f)). Although most results of the lemma were investigated in [3, Prop. 3.3], where only integer values of p were considered, we provide here more detailed arguments for the general case where p is any real number greater than one.

Lemma 3.1 The function ψ_α,p deﬁned by (13) has the following favorable properties:

(a) ψ_α,p is an NCP-function and ψ_α,p ≥ 0 for all (a, b) ∈ IR².

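(b) ψ_α,p is continuously differentiable everywhere, and moreover, if (a, b) ≠ (0, 0),

$$\begin{aligned}
\nabla_a\psi_{\alpha,p}(a, b) &= \alpha b(ab)_+ + \left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b),\\
\nabla_b\psi_{\alpha,p}(a, b) &= \alpha a(ab)_+ + \left(\frac{\operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b);
\end{aligned} \tag{15}$$

and otherwise ∇_aψ_α,p(0, 0) = ∇_bψ_α,p(0, 0) = 0.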

(c) For p≥ 2, the gradient of ψα,p is Lipschitz continuous on any nonempty bounded set S, i.e., there exists L > 0 such that for any (a, b), (c, d)∈ S,

∥∇ψα,p(a, b)− ∇ψα,p(c, d)∥ ≤ L∥(a, b) − (c, d)∥.

(d) ∇aψ_α,p(a, b) · ∇bψ_α,p(a, b) ≥ 0 for any (a, b) ∈ IR², and furthermore, the equality holds if and only if ψ_α,p(a, b)=0.

(e) ∇aψα,p(a, b) = 0⇐⇒ ∇bψα,p(a, b) = 0⇐⇒ ψα,p(a, b) = 0.

(f ) Suppose that α > 0. If a→ −∞ or b → −∞ or ab → ∞, then ψα,p(a, b)→ ∞.
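The gradient formulas (15) and the sign property in part (d) can be spot-checked numerically. The sketch below compares (15) against central finite differences at a few hand-picked points; the helper names, the sample points, and the step size h are our own assumptions, and such a check is of course no substitute for the proof.

```python
def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)


def psi_alpha_p(a, b, p, alpha):
    """Penalized term (13)."""
    phi = (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)
    return 0.5 * alpha * max(0.0, a * b) ** 2 + 0.5 * phi ** 2


def grad_psi_alpha_p(a, b, p, alpha):
    """Partial derivatives of psi_{alpha,p} according to (15)."""
    if a == 0.0 and b == 0.0:
        return 0.0, 0.0
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    phi = norm - (a + b)
    ab_plus = max(0.0, a * b)
    ga = alpha * b * ab_plus + (sgn(a) * abs(a) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    gb = alpha * a * ab_plus + (sgn(b) * abs(b) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    return ga, gb


def fd_grad(a, b, p, alpha, h=1e-6):
    """Central finite-difference approximation of the same partial derivatives."""
    da = (psi_alpha_p(a + h, b, p, alpha) - psi_alpha_p(a - h, b, p, alpha)) / (2 * h)
    db = (psi_alpha_p(a, b + h, p, alpha) - psi_alpha_p(a, b - h, p, alpha)) / (2 * h)
    return da, db


points = [(1.5, -0.7), (-2.0, -1.0), (0.3, 2.0), (2.0, 3.0)]
for a, b in points:
    ga, gb = grad_psi_alpha_p(a, b, p=1.5, alpha=2.0)
    da, db = fd_grad(a, b, p=1.5, alpha=2.0)
    # Report the finite-difference error and the product from part (d).
    print((a, b), abs(ga - da), abs(gb - db), ga * gb)
```

The non-integer choice p = 1.5 is deliberate, since the lemma covers any real p > 1.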

Proof. Parts (a), (b) and (f) directly follow from the deﬁnition of ψ_α,p and Proposition 3.2 (a)–(c) and Lemma 3.1 of [4]. It remains to show parts (c)–(e).

(c) Notice that the functions a(ab)₊ and b(ab)₊ are Lipschitz continuous on any nonempty bounded set S, whereas ϕ_p(a, b) is Lipschitz continuous on IR² by [4, Proposition 3.1 (e)]. Therefore, by the expression of ∇ψ_α,p(a, b) and the boundedness of

$$\frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \quad\text{and}\quad \frac{\operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1,$$

it is not hard to verify that the gradient ∇ψ_α,p(a, b) is Lipschitz continuous on S for p ≥ 2.

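(d) If (a, b) = (0, 0), part (d) clearly holds. Now suppose that (a, b) ≠ (0, 0). Then,

$$\begin{aligned}
\nabla_a\psi_{\alpha,p}(a, b)\cdot\nabla_b\psi_{\alpha,p}(a, b) ={}& \left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\left(\frac{\operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p^2(a, b) + \alpha^2 ab(ab)_+^2\\
&+ \alpha a(ab)_+\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b)\\
&+ \alpha b(ab)_+\left(\frac{\operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b).
\end{aligned} \tag{16}$$

Since

$$ab(ab)_+^2 \ge 0, \quad \frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \le 0, \quad\text{and}\quad \frac{\operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1 \le 0, \tag{17}$$

it suffices to show that the last two terms of (16) are nonnegative. We next claim that

$$\alpha a(ab)_+\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b) \ge 0, \quad \forall\, (a, b) \ne (0, 0). \tag{18}$$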

If a ≤ 0, then ϕ_p(a, b) ≥ 0, which together with the second inequality in (17) implies that (18) holds. If a > 0 and b > 0, then ϕ_p(a, b) < 0, which implies (18) for a similar reason. If a > 0 and b ≤ 0, then (ab)₊ = 0, and hence (18) holds. Similarly, we have that

$$\alpha b(ab)_+\left(\frac{\operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b) \ge 0, \quad \forall\, (a, b) \ne (0, 0).$$

Consequently, ∇_aψ_α,p(a, b) · ∇_bψ_α,p(a, b) ≥ 0. From (16), ∇_aψ_α,p(a, b) · ∇_bψ_α,p(a, b) = 0 if and only if {a = 0 or (a ≥ 0 and b = 0) or ϕ_p(a, b) = 0} and {b = 0 or (b ≥ 0 and a = 0) or ϕ_p(a, b) = 0} and {ab = 0}. Thus, ∇_aψ_α,p(a, b) · ∇_bψ_α,p(a, b) = 0 if and only if ψ_α,p(a, b) = 0.

(e) If ψ_α,p(a, b) = 0, then ab = 0 and ϕ_p(a, b) = 0 by part (a), which in turn implies that ∇_aψ_α,p(a, b) = 0 and ∇_bψ_α,p(a, b) = 0. Next, we claim that ∇_aψ_α,p(a, b) = 0 implies ψ_α,p(a, b) = 0. Suppose that ∇_aψ_α,p(a, b) = 0. Then,

$$\alpha b(ab)_+ = -\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1}}{\|(a, b)\|_p^{p-1}} - 1\right)\phi_p(a, b). \tag{19}$$

We can verify that the equality (19) implies either b = 0, a ≥ 0 or b > 0, a = 0. In both cases, we have ψ_α,p(a, b) = 0. Similarly, ∇_bψ_α,p(a, b) = 0 also implies ψ_α,p(a, b) = 0. □

Notice that ab → ∞ does not necessarily imply ψ_p(a, b) → ∞, which means that ψ_p does not share the property in Lemma 3.1(f). In fact, for α = 0, the statement needs to be modified as “if (a → −∞) or (b → −∞) or (a → ∞ and b → ∞), then ψ_α,p(a, b) → ∞”. As we will see later, Lemma 3.1(f) is useful for proving that the level sets of Ψ_α,p are bounded. Besides, by Lemma 3.1(a), we immediately have the following theorem.

Theorem 3.1 Let Ψ_α,p be deﬁned as in (12). Then Ψ_α,p(x) ≥ 0 for all x ∈ IRⁿ and Ψα,p(x) = 0 if and only if x solves the NCP. Moreover, if the NCP has at least one solution, then x is a global minimizer of Ψ_α,p if and only if x solves the NCP.

Theorem 3.1 indicates that the NCP can be recast as the unconstrained minimization:

$$\min_{x \in \mathbb{R}^n} \Psi_{\alpha,p}(x). \tag{20}$$

In general, it is hard to find a global minimum of Ψ_α,p. Therefore, it is important to know under what conditions a stationary point of Ψ_α,p is a global minimum. Using Lemma 3.1(d) and the same proof techniques as in [10, Theorem 3.5], we can establish that each stationary point of Ψ_α,p is a global minimum provided that F is a P₀-function.

Theorem 3.2 Let F be a P₀-function. Then x^∗ ∈ IRⁿ is a global minimum of the unconstrained optimization problem (20) if and only if x^∗ is a stationary point of Ψ_α,p.


From the following theorem, we see that the unconstrained minimization problem (20) has a stationary point under rather weak conditions of the mapping F . Since similar results and analogous analysis can be found in [3, Proposition 4.1], [10, Theorem 3.8] and [15, Theorem 4.1], we here omit the proof.

Theorem 3.3 The function Ψ_α,p has bounded level sets L(Ψα,p, c) for all c∈ IR, if F is monotone and the NCP is strictly feasible (i.e., there exists ˆx > 0 such that F (ˆx) > 0) when α > 0, or F is a uniform P -function when α≥ 0.

In what follows, we will show that the merit functions Ψ_p, Ψ_NR and Ψ_α,p have the same order on every bounded set. For this purpose, we need the following crucial technical lemma, which generalizes the important property of ϕ_FB proved by Tseng in [22].

Lemma 3.2 Let ϕ_p : IR² → IR be defined as in (8). Then for any p > 1 we have

$$(2 - 2^{1/p})\,|\min\{a, b\}| \le |\phi_p(a, b)| \le (2 + 2^{1/p})\,|\min\{a, b\}|. \tag{21}$$

Proof. Without loss of generality, suppose a ≥ b. We will prove the desired results by considering the following two cases: (1) a + b ≤ 0 and (2) a + b > 0.

Case (1): a + b ≤ 0. In this case, we have

$$|\phi_p(a, b)| \ge \|(a, b)\|_p \ge |b| = |\min\{a, b\}| \ge (2 - 2^{1/p})\,|\min\{a, b\}|. \tag{22}$$

On the other hand, since a ≥ b and a + b ≤ 0, we have |b| ≥ |a|. Then

$$|\phi_p(a, b)| \le \|(b, b)\|_p - 2b = (2 + 2^{1/p})\,|b| = (2 + 2^{1/p})\,|\min\{a, b\}|. \tag{23}$$

Case (2): a + b > 0. If ab = 0, then (21) clearly holds. Thus, we discuss two subcases:

(i) ab < 0. In this subcase, we have a > 0, b < 0, and |a| > |b|. Consequently,

$$\phi_p(a, b) \le |a| + |b| - (a + b) = -2b = 2\,|\min\{a, b\}| \le (2 + 2^{1/p})\,|\min\{a, b\}|, \tag{24}$$

and

$$\phi_p(a, b) \ge \|(a, b)\|_\infty - (a + b) = -b = |\min\{a, b\}| \ge (2 - 2^{1/p})\,|\min\{a, b\}|. \tag{25}$$

(ii) ab > 0. Now we have a ≥ b > 0. Since for any p > 1 there holds that

$$0 \ge \phi_p(a, b) \ge \|(a, b)\|_\infty - (a + b) = a - (a + b) = -b = -\min\{a, b\},$$

we immediately obtain that

$$|\phi_p(a, b)| \le |\min\{a, b\}| \le (2 + 2^{1/p})\,|\min\{a, b\}|. \tag{26}$$

On the other hand, since ϕ_p(a, b) ≤ 0, it follows that

$$|\phi_p(a, b)| = a + b - \|(a, b)\|_p = b\left[\left(\frac{a}{b} + 1\right) - \left(\left(\frac{a}{b}\right)^p + 1\right)^{1/p}\right].$$

Let f(t) = t + 1 − (t^p + 1)^{1/p} for t ≥ 1. Then

$$f'(t) = 1 - \left(\frac{t^p}{t^p + 1}\right)^{\frac{p-1}{p}}.$$

Notice that f′(t) > 0 for t ≥ 1 and f(1) = 2 − 2^{1/p}, and hence we obtain that

$$|\phi_p(a, b)| \ge (2 - 2^{1/p})\,b = (2 - 2^{1/p})\,|\min\{a, b\}| \quad\text{for any } p > 1. \tag{27}$$

All the aforementioned inequalities (22)–(27) imply that (21) holds. □
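As a sanity check on (21), the two-sided bound can be tested on random samples. The sketch below is only an empirical spot-check under assumed settings (sampling box [−10, 10]², a small floating-point tolerance `tol`, and the helper name `lemma_3_2_holds`), none of which appear in the paper.

```python
import random


def phi_p(a, b, p):
    """Generalized FB function (8)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def lemma_3_2_holds(p, trials=20000, seed=0, tol=1e-9):
    """Check (21) on random points of [-10, 10]^2, up to rounding tolerance."""
    rng = random.Random(seed)
    lower = 2.0 - 2.0 ** (1.0 / p)
    upper = 2.0 + 2.0 ** (1.0 / p)
    for _ in range(trials):
        a = rng.uniform(-10.0, 10.0)
        b = rng.uniform(-10.0, 10.0)
        m = abs(min(a, b))
        v = abs(phi_p(a, b, p))
        if not (lower * m - tol <= v <= upper * m + tol):
            return False
    return True


# The lower constant 2 - 2^(1/p) shrinks toward 0 as p decreases toward 1.
for p in (1.1, 2.0, 10.0):
    print(p, 2.0 - 2.0 ** (1.0 / p), lemma_3_2_holds(p))
```

The printed lower constant makes visible why small p is unfavorable: the bound relating |ϕ_p| to the natural residual degenerates as p approaches 1.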

Proposition 3.1 Let Ψ_p, Ψ_NR and Ψ_α,p be defined as in (11), (14) and (12), respectively. Let S be an arbitrary bounded set. Then, for any p > 1, we have

$$\left(2 - 2^{1/p}\right)^2 \Psi_{NR}(x) \le \Psi_p(x) \le \left(2 + 2^{1/p}\right)^2 \Psi_{NR}(x) \quad\text{for all } x \in \mathbb{R}^n \tag{28}$$

and

$$\left(2 - 2^{1/p}\right)^2 \Psi_{NR}(x) \le \Psi_{\alpha,p}(x) \le \left(\alpha B^2 + (2 + 2^{1/p})^2\right)\Psi_{NR}(x) \quad\text{for all } x \in S, \tag{29}$$

where B is a constant defined by

$$B = \max_{1 \le i \le n}\left\{\sup_{x \in S}\,\max\{|x_i|, |F_i(x)|\}\right\} < \infty.$$

Proof. The inequality in (28) follows directly from Lemma 3.2 and the definitions of Ψ_p and Ψ_NR. In addition, from Lemma 3.2 and the definition of Ψ_α,p, it follows that

$$\Psi_{\alpha,p}(x) \ge \left(2 - 2^{1/p}\right)^2 \Psi_{NR}(x) \quad\text{for all } x \in \mathbb{R}^n.$$

We next prove the inequality on the right-hand side of (29). We claim that, for each i,

$$(x_i F_i(x))_+ \le B\,|\min\{x_i, F_i(x)\}| \quad\text{for all } x \in S. \tag{30}$$

Without loss of generality, suppose F_i(x) ≥ x_i. If F_i(x) ≥ x_i ≥ 0, it follows that

$$(x_i F_i(x))_+ = x_i F_i(x) = F_i(x)\,|\min\{x_i, F_i(x)\}| \le B\,|\min\{x_i, F_i(x)\}|.$$

If F_i(x) ≥ 0 ≥ x_i, then (x_i F_i(x))₊ = 0. If 0 ≥ F_i(x) ≥ x_i, it follows that

$$(x_i F_i(x))_+ = |x_i F_i(x)| \le |x_i|^2 \le B\,|\min\{x_i, F_i(x)\}|.$$

Thus, (30) holds for all x ∈ S. By Lemma 3.2 and (30), for all i = 1, . . . , n and x ∈ S,

$$\psi_{\alpha,p}(x_i, F_i(x)) \le \frac{1}{2}\left\{\alpha B^2 + (2 + 2^{1/p})^2\right\}\min\{x_i, F_i(x)\}^2$$

holds for any p > 1. The proof is then complete by the definition of Ψ_α,p and Ψ_NR. □
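To give a feeling for the size of the constants in (28), the following worked evaluation (our own arithmetic, not taken from the paper) compares the FB case p = 2 with a value of p close to 1, here p = 1.1 as an arbitrary example.

```latex
% p = 2 (Fischer-Burmeister case):
%   2 - 2^{1/2} \approx 0.586, \qquad 2 + 2^{1/2} \approx 3.414,
\[
  0.343\,\Psi_{NR}(x) \;\lesssim\; \Psi_{2}(x) \;\lesssim\; 11.657\,\Psi_{NR}(x).
\]
% p = 1.1:
%   2 - 2^{1/1.1} \approx 0.122, \qquad 2 + 2^{1/1.1} \approx 3.878,
\[
  0.0149\,\Psi_{NR}(x) \;\lesssim\; \Psi_{1.1}(x) \;\lesssim\; 15.04\,\Psi_{NR}(x).
\]
```

The lower constant (2 − 2^{1/p})² tends to 0 as p decreases to 1, so the two-sided equivalence between Ψ_p and Ψ_NR becomes increasingly loose for small p.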

From Proposition 3.1, we immediately obtain the following result.

Corollary 3.1 Let Ψ_p and Ψ_α,p be defined by (11) and (12), respectively, and S be any bounded set. Then, for any p > 1 and all x ∈ S, we have the following inequalities:

$$\frac{(2 - 2^{1/p})^2}{\alpha B^2 + (2 + 2^{1/p})^2}\,\Psi_{\alpha,p}(x) \le \Psi_p(x) \le \frac{(2 + 2^{1/p})^2}{(2 - 2^{1/p})^2}\,\Psi_{\alpha,p}(x),$$

where B is the constant defined as in Proposition 3.1.

Since Ψ_p, Ψ_NR and Ψ_α,p have the same order on a bounded set, each of them provides a global error bound for the NCP as long as one of the others does. Below, we show that Ψ_α,p provides a global error bound without the Lipschitz continuity of F when α > 0.

Theorem 3.4 Let Ψ_α,p be defined as in (12). Suppose that F is a uniform P-function with modulus µ > 0. If α > 0, then there exists a constant κ₁ > 0 such that

$$\|x - x^*\| \le \kappa_1 \Psi_{\alpha,p}(x)^{1/4} \quad\text{for all } x \in \mathbb{R}^n;$$

if α = 0 and S is any bounded set, there exists a constant κ₂ > 0 such that

$$\|x - x^*\| \le \kappa_2\left(\max\left\{\Psi_{\alpha,p}(x), \sqrt{\Psi_{\alpha,p}(x)}\right\}\right)^{1/2} \quad\text{for all } x \in S;$$

where x^∗ = (x^∗₁, · · · , x^∗_n) is the unique solution of the NCP.

Proof. Since F is a uniform P-function, the NCP has a unique solution, and moreover,

$$\begin{aligned}
\mu\|x - x^*\|^2 &\le \max_{1 \le i \le n}\,(x_i - x_i^*)(F_i(x) - F_i(x^*))\\
&= \max_{1 \le i \le n}\,\{x_i F_i(x) - x_i^* F_i(x) - x_i F_i(x^*) + x_i^* F_i(x^*)\}\\
&= \max_{1 \le i \le n}\,\{x_i F_i(x) - x_i^* F_i(x) - x_i F_i(x^*)\}\\
&\le \max_{1 \le i \le n}\,\tau_i\{(x_i F_i(x))_+ + (-F_i(x))_+ + (-x_i)_+\},
\end{aligned} \tag{31}$$

where τ_i := max{1, x^∗_i, F_i(x^∗)}. We next prove that for all (a, b) ∈ IR²,

$$(-a)_+^2 + (-b)_+^2 \le \left[\|(a, b)\|_p - (a + b)\right]^2. \tag{32}$$

Without loss of generality, suppose a ≥ b. If a ≥ b ≥ 0, then (32) holds obviously. If a ≥ 0 ≥ b, then ∥(a, b)∥_p − (a + b) ≥ −b ≥ 0, which in turn implies that

$$(-a)_+^2 + (-b)_+^2 = b^2 \le \left[\|(a, b)\|_p - (a + b)\right]^2.$$

If 0 ≥ a ≥ b, then

$$(-a)_+^2 + (-b)_+^2 = a^2 + b^2 \le \left[\|(a, b)\|_p - (a + b)\right]^2.$$

Hence, (32) follows.

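Suppose that α > 0. Using the inequality (32), we then obtain that

$$\begin{aligned}
\left[(ab)_+ + (-a)_+ + (-b)_+\right]^2 &= (ab)_+^2 + (-b)_+^2 + (-a)_+^2 + 2(ab)_+(-a)_+ + 2(-a)_+(-b)_+ + 2(ab)_+(-b)_+\\
&\le (ab)_+^2 + (-b)_+^2 + (-a)_+^2 + (ab)_+^2 + (-a)_+^2 + (-a)_+^2 + (-b)_+^2 + (ab)_+^2 + (-b)_+^2\\
&\le 3\left[(ab)_+^2 + \left(\|(a, b)\|_p - (a + b)\right)^2\right]\\
&\le \tau\left[\frac{\alpha}{2}(ab)_+^2 + \frac{1}{2}\left(\|(a, b)\|_p - (a + b)\right)^2\right]\\
&= \tau\,\psi_{\alpha,p}(a, b) \quad\text{for all } (a, b) \in \mathbb{R}^2,
\end{aligned} \tag{33}$$

where τ := max{6/α, 6} > 0. Combining (33) with (31) and letting τ̂ = max_{1≤i≤n} τ_i, we get

$$\begin{aligned}
\mu\|x - x^*\|^2 &\le \max_{1 \le i \le n}\,\tau_i\left\{\tau\,\psi_{\alpha,p}(x_i, F_i(x))\right\}^{1/2}\\
&\le \hat\tau\,\tau^{1/2}\max_{1 \le i \le n}\,\psi_{\alpha,p}(x_i, F_i(x))^{1/2}\\
&\le \hat\tau\,\tau^{1/2}\left\{\sum_{i=1}^{n}\psi_{\alpha,p}(x_i, F_i(x))\right\}^{1/2}\\
&= \hat\tau\,\tau^{1/2}\,\Psi_{\alpha,p}(x)^{1/2}.
\end{aligned}$$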

From this, the first desired result follows immediately by setting κ₁ := [τ̂ τ^{1/2}/µ]^{1/2}.

Suppose that α = 0. From the proof of Proposition 3.1, the inequality (30) holds. Combining with equations (31)–(32), it then follows that for all x ∈ S,

$$\begin{aligned}
\mu\|x - x^*\|^2 &\le \max_{1 \le i \le n}\,\tau_i\left[B\,|\min\{x_i, F_i(x)\}| + \left(\psi_p(x_i, F_i(x))\right)^{1/2}\right]\\
&\le \hat\tau\max_{1 \le i \le n}\left[\sqrt{2}\,\hat B\,\psi_p(x_i, F_i(x)) + \left(\psi_p(x_i, F_i(x))\right)^{1/2}\right]\\
&\le \sqrt{2}\,\hat\tau\hat B\left(\Psi_p(x) + \sqrt{\Psi_p(x)}\right)\\
&\le 4\hat\tau\hat B\max\left\{\Psi_p(x), \sqrt{\Psi_p(x)}\right\}\\
&= 4\hat\tau\hat B\max\left\{\Psi_{\alpha,p}(x), \sqrt{\Psi_{\alpha,p}(x)}\right\},
\end{aligned}$$

where B̂ = B/(2 − 2^{1/p}) and the second inequality is from Lemma 3.2. Letting κ₂ := 2[τ̂ B̂/µ]^{1/2}, we obtain the desired result from the above inequality. □


The following lemma is needed for the proof of Proposition 3.2, which plays a crucial role in showing the convergence rate of the algorithm described in the next section.

Lemma 3.3 For all (a, b) ≠ (0, 0) and p > 1, we have the following inequality:

$$\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1} + \operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 2\right)^2 \ge \left(2 - 2^{1/p}\right)^2.$$

Proof. If a = 0 or b = 0, the inequality holds obviously. Then we complete the proof by considering three cases: (i) a > 0 and b > 0, (ii) a < 0 and b < 0, and (iii) ab < 0.

Case (i): Without loss of generality, we suppose a ≥ b > 0. Then

$$\frac{|a|^{p-1} + |b|^{p-1}}{\|(a, b)\|_p^{p-1}} = \frac{\left(\frac{a}{b}\right)^{p-1} + 1}{\left(\left(\frac{a}{b}\right)^p + 1\right)^{1 - \frac{1}{p}}}. \tag{34}$$

Let

$$f(t) := \frac{t^{p-1} + 1}{(t^p + 1)^{1 - \frac{1}{p}}}$$

for any t > 0. By computation, we have that

$$f'(t) = \frac{t^{p-2}(p - 1)(1 - t)}{(t^p + 1)^{2 - \frac{1}{p}}} \quad \forall\, t > 0.$$

Since f′(t) ≤ 0 for t ≥ 1 and f(1) = 2^{1/p}, it follows that f(t) ≤ 2^{1/p} for t ≥ 1. Therefore,

$$\frac{|a|^{p-1} + |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \le 2^{1/p} \quad\text{for } p > 1,$$

which in turn implies that

$$2 - \frac{|a|^{p-1} + |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \ge 2 - 2^{1/p} \quad\text{for } p > 1.$$

Squaring both sides then leads to the desired inequality.

Case (ii): By similar arguments as in case (i), we obtain

$$2 - 2^{1/p} \le 2 - \frac{|a|^{p-1} + |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \le 2 + \frac{|a|^{p-1} + |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \quad\text{for } p > 1,$$

from which the result follows immediately.

Case (iii): Again, we suppose |a| ≥ |b| and therefore have

$$2^{1/p} \ge \frac{|a|^{p-1} + |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \ge \frac{|a|^{p-1} - |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \quad\text{for } p > 1.$$

Thus

$$2 - 2^{1/p} \le 2 - \frac{|a|^{p-1} - |b|^{p-1}}{\|(a, b)\|_p^{p-1}} \quad\text{for } p > 1,$$

and the desired result is also satisfied. □
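The inequality of Lemma 3.3 can also be probed numerically. The sketch below evaluates the left-hand side on random points; the function names, the sampling box and the tolerance are our own assumptions, and the check only complements the proof above.

```python
import random


def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)


def lemma_3_3_lhs(a, b, p):
    """Left-hand side of Lemma 3.3 for (a, b) != (0, 0)."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    s = (sgn(a) * abs(a) ** (p - 1) + sgn(b) * abs(b) ** (p - 1)) / norm ** (p - 1)
    return (s - 2.0) ** 2


def lemma_3_3_holds(p, trials=20000, seed=0, tol=1e-9):
    """Check the bound on random points of [-10, 10]^2, skipping the origin."""
    rng = random.Random(seed)
    bound = (2.0 - 2.0 ** (1.0 / p)) ** 2
    for _ in range(trials):
        a = rng.uniform(-10.0, 10.0)
        b = rng.uniform(-10.0, 10.0)
        if a == 0.0 and b == 0.0:
            continue
        if lemma_3_3_lhs(a, b, p) < bound - tol:
            return False
    return True


for p in (1.1, 2.0, 5.0):
    # Along a = b > 0 the left-hand side equals the bound (up to rounding).
    print(p, lemma_3_3_holds(p), lemma_3_3_lhs(1.0, 1.0, p), (2.0 - 2.0 ** (1.0 / p)) ** 2)
```

The last two printed columns agree because equality in case (i) is attained at t = a/b = 1.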


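Proposition 3.2 Let ψ_α,p be given as in (13). Then, for all x ∈ IRⁿ and p > 1,

$$\|\nabla_a\psi_{\alpha,p}(x, F(x)) + \nabla_b\psi_{\alpha,p}(x, F(x))\|^2 \ge 2\left(2 - 2^{1/p}\right)^2 \Psi_p(x),$$

and particularly, for all x belonging to any bounded set S and p > 1,

$$\|\nabla_a\psi_{\alpha,p}(x, F(x)) + \nabla_b\psi_{\alpha,p}(x, F(x))\|^2 \ge \frac{2\left(2 - 2^{1/p}\right)^4}{\alpha B^2 + (2 + 2^{1/p})^2}\,\Psi_{\alpha,p}(x),$$

where B is defined as in Proposition 3.1 and

$$\begin{aligned}
\nabla_a\psi_{\alpha,p}(x, F(x)) &:= \left(\nabla_a\psi_{\alpha,p}(x_1, F_1(x)), \cdots, \nabla_a\psi_{\alpha,p}(x_n, F_n(x))\right)^T,\\
\nabla_b\psi_{\alpha,p}(x, F(x)) &:= \left(\nabla_b\psi_{\alpha,p}(x_1, F_1(x)), \cdots, \nabla_b\psi_{\alpha,p}(x_n, F_n(x))\right)^T.
\end{aligned} \tag{35}$$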

Proof. The second part of the conclusions follows directly from Corollary 3.1 and the first part. From the definition of ∇_aψ_α,p(x, F(x)), ∇_bψ_α,p(x, F(x)) and Ψ_p(x), the first part of the conclusions is equivalent to proving that the following inequality

$$\left(\nabla_a\psi_{\alpha,p}(a, b) + \nabla_b\psi_{\alpha,p}(a, b)\right)^2 \ge 2\left(2 - 2^{1/p}\right)^2 \psi_p(a, b) \tag{36}$$

holds for all (a, b) ∈ IR². When (a, b) = (0, 0), the inequality (36) clearly holds. Suppose (a, b) ≠ (0, 0). Then, it follows from equation (15) that

$$\begin{aligned}
\left(\nabla_a\psi_{\alpha,p}(a, b) + \nabla_b\psi_{\alpha,p}(a, b)\right)^2 ={}& \left\{\alpha(a + b)(ab)_+ + \phi_p(a, b)\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1} + \operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 2\right)\right\}^2\\
={}& \alpha^2(a + b)^2(ab)_+^2 + \left(\phi_p(a, b)\right)^2\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1} + \operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 2\right)^2\\
&+ 2\alpha(a + b)(ab)_+\,\phi_p(a, b)\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1} + \operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 2\right).
\end{aligned} \tag{37}$$

Now, we claim that for all (a, b) ≠ (0, 0) ∈ IR²,

$$2\alpha(a + b)(ab)_+\,\phi_p(a, b)\left(\frac{\operatorname{sgn}(a)\cdot|a|^{p-1} + \operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 2\right) \ge 0. \tag{38}$$

If ab ≤ 0, then (ab)₊ = 0 and the inequality (38) is clear. If a, b > 0, then by noting that

$$\frac{\operatorname{sgn}(a)\cdot|a|^{p-1} + \operatorname{sgn}(b)\cdot|b|^{p-1}}{\|(a, b)\|_p^{p-1}} - 2 \le 0, \quad \forall\, (a, b) \ne (0, 0) \in \mathbb{R}^2 \tag{39}$$

and ϕ_p(a, b) ≤ 0, the inequality (38) also holds. If a, b < 0, then ϕ_p(a, b) ≥ 0, which together with (39) yields the inequality (38). Thus, the inequality (38) holds for all (a, b) ≠ (0, 0). Using Lemma 3.3 and equations (38)–(39), we readily obtain that the inequality (36) holds for all (a, b) ≠ (0, 0). The proof is thus complete. □
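The scalar inequality (36) is what links the size of the search direction to the merit value, so it is worth a quick empirical check. The sketch below computes the gap between the two sides of (36) on random samples; `prop_3_2_gap`, `smallest_gap`, the sampling box and the tested parameter values are assumptions made for illustration.

```python
import random


def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)


def prop_3_2_gap(a, b, p, alpha):
    """Left side minus right side of (36); nonnegative when (36) holds."""
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    if norm == 0.0:
        return 0.0
    phi = norm - (a + b)
    ab_plus = max(0.0, a * b)
    # Partial derivatives from (15).
    ga = alpha * b * ab_plus + (sgn(a) * abs(a) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    gb = alpha * a * ab_plus + (sgn(b) * abs(b) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    psi_p = 0.5 * phi ** 2
    return (ga + gb) ** 2 - 2.0 * (2.0 - 2.0 ** (1.0 / p)) ** 2 * psi_p


def smallest_gap(p, alpha, trials=20000, seed=1):
    """Smallest observed gap over random points of [-5, 5]^2."""
    rng = random.Random(seed)
    return min(
        prop_3_2_gap(rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0), p, alpha)
        for _ in range(trials)
    )


for p in (1.5, 2.0, 4.0):
    for alpha in (0.0, 1.0):
        print(p, alpha, smallest_gap(p, alpha))
```

A smallest gap that is nonnegative up to rounding is consistent with (36); for α = 0 the gap is zero along a = b > 0, where Lemma 3.3 holds with equality.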


4 A descent algorithm and convergence results

In this section, we propose a derivative-free descent algorithm based on the function Ψ_α,p. By Lemma 3.1 (d), it is easy to verify that d̄ := −∇_bψ_α,p(x, F(x)) is a descent direction for monotone nonlinear complementarity problems, i.e., the following result holds.

Lemma 4.1 Let Ψ_α,p be defined as in (12). If the mapping F is monotone, then d̄ := −∇_bψ_α,p(x, F(x)) is a descent direction of Ψ_α,p at any x ∈ IRⁿ, i.e., ∇Ψ_α,p(x)^T d̄ < 0.

However, we observe that d̄ does not involve any information about ∇_aψ_α,p(x, F(x)) and lacks a certain symmetry, and as a result we cannot find a constant c > 0 such that

$$\|\bar d\| \ge c\,\psi_{\alpha,p}(x, F(x)).$$

This is a major obstacle to establishing the convergence rate of the derivative-free algorithm based on d̄. In view of this, we follow the same line as [25] and adopt a search direction of the following form:

$$d^k(\rho) := -\nabla_b\psi_{\alpha,p}(x^k, F(x^k)) - \rho\,\nabla_a\psi_{\alpha,p}(x^k, F(x^k)), \tag{40}$$

where ρ ∈ (0, 1) is a parameter and ∇_aψ_α,p(x, F(x)), ∇_bψ_α,p(x, F(x)) are defined as in (35). Although d^k(ρ) for any ρ ∈ (0, 1) is not necessarily a descent direction of Ψ_α,p at the iterate x^k, Lemma 4.1 implies that it is a descent direction if ρ ∈ (0, ρ̄_k), where ρ̄_k := 1 if ∇_aψ_α,p(x^k, F(x^k))^T∇Ψ_α,p(x^k) ≥ 0, and otherwise

$$\bar\rho_k := \min\left\{1,\ \frac{-\nabla_b\psi_{\alpha,p}(x^k, F(x^k))^T\nabla\Psi_{\alpha,p}(x^k)}{\nabla_a\psi_{\alpha,p}(x^k, F(x^k))^T\nabla\Psi_{\alpha,p}(x^k)}\right\}.$$

Clearly, ρ̄_k ∈ (0, 1] unless x^k is a solution of the NCP. Thus, d^k is a descent direction of Ψ_α,p at x^k for monotone NCPs provided that ρ is chosen sufficiently small. Similar to [25], instead of using the value of ρ̄_k, we determine an appropriate ρ_k by an Armijo-type backtracking search in our algorithm, described below.

Algorithm 4.1

(Step 0) Given real numbers p > 1 and α ≥ 0 and a starting point x⁰ ∈ IRⁿ. Choose the parameters σ ∈ (0, 1), β ∈ (0, 1), γ ∈ (0, 1) and ε ≥ 0. Set k := 0.

(Step 1) If Ψ_α,p(x^k) ≤ ε, then stop.

(Step 2) Let m_k be the smallest nonnegative integer m satisfying

$$\Psi_{\alpha,p}\left(x^k + \beta^m d^k(\gamma^m)\right) \le \left(1 - \sigma\beta^{2m}\right)\Psi_{\alpha,p}(x^k), \tag{41}$$

where

$$d^k(\gamma^m) := -\nabla_b\psi_{\alpha,p}(x^k, F(x^k)) - \gamma^m\,\nabla_a\psi_{\alpha,p}(x^k, F(x^k)).$$

(Step 3) Set $x^{k+1} := x^k + \beta^{m_k} d^k(\gamma^{m_k})$, k := k + 1 and go to Step 1.
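The steps of Algorithm 4.1 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the parameter defaults (σ = 10⁻⁴, β = γ = 0.5, ε = 10⁻¹²), the iteration caps `max_iter` and `max_backtracks`, and the test map `F_linear` (F(x) = Mx + q with M = [[2, 1], [1, 2]], q = (−3, −3), solution x = (1, 1)) are all our own choices.

```python
def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)


def psi_and_grads(a, b, p, alpha):
    """Return psi_{alpha,p}(a, b) from (13) and its partial derivatives from (15)."""
    ab_plus = max(0.0, a * b)
    norm = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    phi = norm - (a + b)
    psi = 0.5 * alpha * ab_plus ** 2 + 0.5 * phi ** 2
    if norm == 0.0:
        return psi, 0.0, 0.0
    ga = alpha * b * ab_plus + (sgn(a) * abs(a) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    gb = alpha * a * ab_plus + (sgn(b) * abs(b) ** (p - 1) / norm ** (p - 1) - 1.0) * phi
    return psi, ga, gb


def merit(x, F, p, alpha):
    """Merit function (12)."""
    return sum(psi_and_grads(xi, fi, p, alpha)[0] for xi, fi in zip(x, F(x)))


def algorithm_4_1(F, x0, p=2.0, alpha=1.0, sigma=1e-4, beta=0.5, gamma=0.5,
                  eps=1e-12, max_iter=10000, max_backtracks=60):
    """Derivative-free descent iteration; only values of F are used."""
    x = list(x0)
    for k in range(max_iter):
        terms = [psi_and_grads(xi, fi, p, alpha) for xi, fi in zip(x, F(x))]
        value = sum(t[0] for t in terms)
        if value <= eps:                          # Step 1: stopping test
            return x, value, k
        for m in range(max_backtracks):           # Step 2: Armijo-type search (41)
            step, rho = beta ** m, gamma ** m
            trial = [xi + step * (-gb - rho * ga)
                     for xi, (_, ga, gb) in zip(x, terms)]
            if merit(trial, F, p, alpha) <= (1.0 - sigma * beta ** (2 * m)) * value:
                break
        else:
            return x, value, k                    # safeguard, not part of Algorithm 4.1
        x = trial                                 # Step 3: accept the trial point
    return x, merit(x, F, p, alpha), max_iter


def F_linear(x):
    """Hypothetical strongly monotone test map (an assumption for illustration)."""
    return [2.0 * x[0] + x[1] - 3.0, x[0] + 2.0 * x[1] - 3.0]


x, value, iters = algorithm_4_1(F_linear, [0.0, 0.0])
print(x, value, iters)
```

Note that the step length β^m and the weight γ^m are reduced together in the backtracking loop, exactly as in (41), so a single integer m controls both. The caps on iterations and backtracking steps are practical safeguards added here and are not part of the algorithm as stated.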

We see that Algorithm 4.1 does not involve the computation of ∇Ψ_α,p and ∇F, and hence it is a derivative-free algorithm. In what follows, we establish the convergence results for Algorithm 4.1, and particularly, analyze its convergence rate under the assumption that F is strongly monotone. To this end, we assume that the parameter ε in Algorithm 4.1 equals zero and that Algorithm 4.1 generates an infinite sequence {x^k}.

Proposition 4.1 Suppose that F is monotone. Then Algorithm 4.1 is well-deﬁned for any starting point x⁰. Furthermore, if x^∗ is an accumulation point of the sequence {x^k} generated by Algorithm 4.1, then x^∗ is a solution of the NCP.

Proof. We first prove that Algorithm 4.1 is well-defined. From the construction of the algorithm, it suffices to show that Step 2 is well-defined. Assume to the contrary that there is no nonnegative integer m satisfying (41). Then, for any integer m ≥ 0,

$$\Psi_{\alpha,p}\left(x^k + \beta^m d^k(\gamma^m)\right) - \Psi_{\alpha,p}(x^k) > -\sigma\beta^{2m}\,\Psi_{\alpha,p}(x^k).$$

Dividing the above inequality by β^m and passing to the limit m → +∞, we obtain that

$$\lim_{m \to +\infty}\frac{\Psi_{\alpha,p}\left(x^k + \beta^m d^k(\gamma^m)\right) - \Psi_{\alpha,p}(x^k)}{\beta^m} \ge 0. \tag{42}$$

Since Ψ_α,p is continuously differentiable, we have that Ψ_α,p is locally Lipschitz continuous at x^k, which in turn implies that there exists L > 0 such that

$$\left|\Psi_{\alpha,p}\left(x^k + \beta^m d^k(\gamma^m)\right) - \Psi_{\alpha,p}\left(x^k + \beta^m d^k(0)\right)\right| \le L\beta^m\|d^k(\gamma^m) - d^k(0)\|$$

for all sufficiently large m. Consequently,

$$\begin{aligned}
\lim_{m \to +\infty}\frac{\Psi_{\alpha,p}\left(x^k + \beta^m d^k(\gamma^m)\right) - \Psi_{\alpha,p}(x^k)}{\beta^m}
={}& \lim_{m \to +\infty}\frac{\Psi_{\alpha,p}\left(x^k + \beta^m d^k(0)\right) - \Psi_{\alpha,p}(x^k)}{\beta^m}\\
&+ \lim_{m \to +\infty}\frac{\Psi_{\alpha,p}\left(x^k + \beta^m d^k(\gamma^m)\right) - \Psi_{\alpha,p}\left(x^k + \beta^m d^k(0)\right)}{\beta^m}\\
\le{}& \nabla\Psi_{\alpha,p}(x^k)^T d^k(0).
\end{aligned}$$

This together with (42) yields that ∇Ψ_α,p(x^k)^T d^k(0) ≥ 0. However, by Lemma 4.1, ∇Ψ_α,p(x^k)^T d^k(0) < 0, which leads to a contradiction. Hence, Algorithm 4.1 is well-defined.

Next we prove that any accumulation point x^∗ of {x^k} is a solution of the NCP. Let {x^k}_{k∈K} be a subsequence converging to x^∗. Notice that Ψ_α,p is continuously differentiable