### An R-Linearly Convergent Nonmonotone Derivative-Free Method for Symmetric Cone Complementarity Problems

Shaohua Pan^{1}
Department of Mathematics
South China University of Technology

Guangzhou 510640, China E-mail: shhpan@scut.edu.cn.

Jein-Shan Chen^{2}
Department of Mathematics
National Taiwan Normal University

Taipei, Taiwan 11677 E-mail: jschen@math.ntnu.edu.tw

Abstract. This paper extends the derivative-free descent method [18] for the nonlinear complementarity problem to the symmetric cone complementarity problem (SCCP). The algorithm is based on the unconstrained implicit Lagrangian reformulation of the SCCP, but uses a convex combination of the negative partial gradients of the implicit Lagrangian function $\psi_\alpha$, i.e., a vector of the form $-\theta\nabla_x\psi_\alpha-(1-\theta)\nabla_y\psi_\alpha$ with $\theta\in[0,1]$, as the search direction, and a nonmonotone line search rule to find a suitable stepsize. We show that the derivative-free algorithm converges in terms of the implicit Lagrangian value for a large class of SCCPs that need not even be monotone. If $\theta$ is restricted to be less than a threshold $\bar\theta\in(0,1)$ and the SCCP is strongly monotone, the generated sequence converges globally to the solution of the SCCP at an R-linear rate.

Key words: Symmetric cone complementarity problem, implicit Lagrangian, derivative-free methods, nonmonotone, linear convergence.

^{1} The author's work is supported by Guangdong Natural Science Foundation (No. 9251802902000001) and the Fundamental Research Funds for the Central Universities (SCUT).

^{2} Corresponding author. Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by National Science Council of Taiwan.

AMO - Advanced Modeling and Optimization. ISSN: 1841-4311

### 1 Introduction

Let $V$ be a finite-dimensional vector space over the real field $\mathbb{R}$, let $A \equiv (V, \circ, \langle\cdot,\cdot\rangle)$ be a Euclidean Jordan algebra (see Section 2 for the definition), and let $K$ be a symmetric cone in $A$. Given a continuously differentiable mapping $F : V \to V$, we are interested in the following symmetric cone complementarity problem (SCCP): find $\zeta \in V$ such that

$$\zeta \in K, \quad F(\zeta) \in K, \quad \langle \zeta, F(\zeta)\rangle = 0. \tag{1}$$

This class of problems provides a unified framework for the classical nonlinear complementarity problem (NCP), the second-order cone complementarity problem (SOCCP), and the semidefinite complementarity problem (SDCP), and it also arises from the KKT system of a nonlinear symmetric cone optimization problem. When $F(\zeta) = L(\zeta) + q$ with $L : V \to V$ a linear transformation and $q \in V$, problem (1) reduces to the linear complementarity problem over symmetric cones (LSCCP):

$$\zeta \in K, \quad L(\zeta) + q \in K, \quad \langle \zeta, L(\zeta) + q\rangle = 0. \tag{2}$$

Recently, symmetric cone optimization and complementarity problems have been the subject of active research, and various solution methods have been proposed. They include interior-point methods [3, 22, 29], merit function methods [14, 16], the regularized smoothing method [13], and the smoothing Newton method [8]. This paper is concerned with a derivative-free method based on the implicit Lagrangian reformulation (5) of the SCCP (1). An attractive feature of this method is that no derivatives of $F(\cdot)$ need to be computed, which makes the method suitable for large-scale problems, as well as for applications where the derivatives of $F(\cdot)$ are unavailable or costly to compute.

The implicit Lagrangian function was first introduced by Mangasarian and Solodov [17] as a smooth merit function for the NCP, and was further studied in the setting of the nonnegative orthant in [10, 15, 18, 25, 27] and elsewhere. Recently, Kong, Tuncel and Xiu [14] used Jordan-algebraic techniques to extend the implicit Lagrangian to the symmetric cone $K$. The corresponding function is defined as

$$\psi_\alpha(x, y) := \langle x, y\rangle + \frac{1}{2\alpha}\Big\{\|(x-\alpha y)_+\|^2 - \|x\|^2 + \|(y-\alpha x)_+\|^2 - \|y\|^2\Big\}, \quad \forall x, y \in V, \tag{3}$$

where $\alpha > 1$ is a parameter, $\|\cdot\|$ is the norm induced by the inner product $\langle\cdot,\cdot\rangle$, and $(\cdot)_+$ denotes the metric projection onto the symmetric cone $K$. They showed that $\psi_\alpha$ is a continuously differentiable merit function associated with $K$, that is,

$$\psi_\alpha(x, y) = 0 \iff x \in K, \ y \in K, \ \langle x, y\rangle = 0, \tag{4}$$

and thus the SCCP can be formulated as the unconstrained smooth minimization problem

$$\min_{\zeta\in V} \Psi_\alpha(\zeta) := \psi_\alpha(\zeta, F(\zeta)) \tag{5}$$

in the sense that a minimizer of (5) with zero objective value is a solution of (1). For this unconstrained reformulation, they gave a necessary and sufficient condition for each stationary point of $\Psi_\alpha$ to be a solution of (1), and established that $\Psi_\alpha$ provides a global error bound for the SCCP (1) when $F$ has the uniform Cartesian $P$-property.
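To make (3) and (4) concrete, consider the simplest symmetric cone, the nonnegative orthant $K=\mathbb{R}^n_+$, for which the metric projection is the componentwise maximum with zero. The following Python sketch evaluates $\psi_\alpha$ in that special case only; the function names `proj_orthant` and `implicit_lagrangian` and the sample points are illustrative choices, not taken from [14].

```python
import numpy as np

def proj_orthant(z):
    """Metric projection onto the nonnegative orthant (componentwise max with 0)."""
    return np.maximum(z, 0.0)

def implicit_lagrangian(x, y, alpha=2.0):
    """Evaluate psi_alpha(x, y) from (3) for K = R^n_+ (alpha > 1 is assumed)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bracket = (np.sum(proj_orthant(x - alpha * y) ** 2) - x @ x
               + np.sum(proj_orthant(y - alpha * x) ** 2) - y @ y)
    return x @ y + bracket / (2.0 * alpha)

# A complementary pair: x >= 0, y >= 0, <x, y> = 0, so psi_alpha vanishes.
comp = implicit_lagrangian([1.0, 0.0], [0.0, 3.0])
# A pair with y outside the cone: psi_alpha is strictly positive.
viol = implicit_lagrangian([1.0, 1.0], [1.0, -1.0])
```

For the second pair with $\alpha=2$, the bracket in (3) equals $9-2+0-2=5$ and $\langle x,y\rangle=0$, so the value is $5/4$.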

Although the literature on derivative-free algorithms for NCPs is vast (see, e.g., [1, 7, 9, 12, 18, 26, 27]), to the best of our knowledge, few papers consider such algorithms for nonpolyhedral symmetric cone complementarity problems, the exceptions being [20, 28]. In these two papers, derivative-free methods are developed for the SDCP and the SOCCP, respectively, using the Fischer-Burmeister type merit function. Moreover, no rate of convergence result is established in [28], and the one in [20] is only shown to be Q-linear. We also note that almost all derivative-free methods mentioned above are descent methods with a monotone Armijo-type line search.

In contrast, in this work we develop a nonmonotone derivative-free method for the SCCP by using a vector of the form $d(\zeta) \equiv -\theta\nabla_x\psi_\alpha(\zeta, F(\zeta)) - (1-\theta)\nabla_y\psi_\alpha(\zeta, F(\zeta))$ with $\theta \in [0, 1]$ as the search direction. As shown in Prop. 3.4, when $\theta$ is sufficiently small, $d(\zeta)$ is a descent direction, and it reduces to the one adopted in [18]. However, for a general $\theta \in [0, 1]$, $d(\zeta)$ is not necessarily a descent direction, and we adopt a nonmonotone line search rule to find a suitable stepsize. We show that the method converges in terms of the implicit Lagrangian value for a large class of SCCPs, and if $\theta$ is restricted to be less than a threshold $\bar\theta \in (0, 1)$ and the SCCP is strongly monotone, the generated sequence converges to the solution at an R-linear rate. Numerical tests verify the theoretical results, and show that the method with a smaller $\theta$ does not perform better than the method with $\theta$ close to 1, even though an R-linear rate of convergence is guaranteed when $\theta$ is sufficiently small.

Throughout this paper, $\operatorname{int} K$ denotes the interior of the cone $K$ and $\|\cdot\|$ represents the norm induced by the inner product $\langle\cdot,\cdot\rangle$, i.e., $\|x\| := \sqrt{\langle x, x\rangle}$. For any $x \in V$, we write $(x)_+$ and $(x)_-$ for the metric projections of $x$ onto $K$ and $-K$, respectively, i.e.,

$$(x)_+ := \operatorname{argmin}_{y\in K}\{\|x - y\|\}, \qquad (x)_- := \operatorname{argmin}_{y\in -K}\{\|x - y\|\}.$$

For a differentiable mapping $F : V \to V$, we denote its transposed Jacobian at $x \in V$ by $\nabla F(x)$. Unless otherwise stated, the parameter $\alpha$ in the sequel always satisfies $\alpha > 1$.

### 2 Preliminaries

A Euclidean Jordan algebra is a triple $(V, \circ, \langle\cdot,\cdot\rangle_V)$, where $(V, \langle\cdot,\cdot\rangle_V)$ is a finite-dimensional inner product space over $\mathbb{R}$ and $(x, y) \mapsto x \circ y : V \times V \to V$ is a bilinear mapping satisfying:

(i) $x \circ y = y \circ x$ for all $x, y \in V$;

(ii) $x \circ (x^2 \circ y) = x^2 \circ (x \circ y)$ for all $x, y \in V$, where $x^2 := x \circ x$;

(iii) $\langle x \circ y, z\rangle_V = \langle y, x \circ z\rangle_V$ for all $x, y, z \in V$.

We call $x \circ y$ the Jordan product of $x$ and $y$. We assume that there is an element $e \in V$ such that $x \circ e = x$ for all $x \in V$, and call such $e$ the unit element. Let

$$\zeta(x) := \min\big\{k : \{e, x, x^2, \ldots, x^k\} \text{ are linearly dependent}\big\}.$$

Since $\zeta(x)$ is bounded by the dimension of $V$, denoted by $\dim(V)$, the rank of $(V, \circ)$ is well defined by $r := \max\{\zeta(x) : x \in V\}$. Define the set of squares as $K := \{x^2 : x \in V\}$. Then, from [11, Theorem III.2.1], it follows that $K$ is a symmetric cone. This means that $K$ is a self-dual closed convex cone with nonempty interior $\operatorname{int} K$, and for any $x, y \in \operatorname{int} K$, there exists an invertible linear transformation $T : V \to V$ such that $T(K) = K$ and $T(x) = y$.

Recall that an element $c \in V$ is idempotent if $c^2 = c$, and two idempotents $c$ and $d$ are orthogonal if $c \circ d = 0$. A nonzero idempotent is primitive if it cannot be written as the sum of two other nonzero idempotents. A complete system of orthogonal idempotents is a finite set $\{c_1, c_2, \ldots, c_k\}$ of idempotents with $c_i \circ c_j = 0$ ($i \ne j$) and $\sum_{i=1}^{k} c_i = e$. We call a complete system of orthogonal primitive idempotents a Jordan frame.

Theorem 2.1 [11, Theorem III.1.2] Suppose that $A = (V, \circ, \langle\cdot,\cdot\rangle_V)$ is a Euclidean Jordan algebra with rank $r$. Then for each $x \in V$, there exist a Jordan frame $\{c_1, c_2, \ldots, c_r\}$ and real numbers $\lambda_1(x), \lambda_2(x), \ldots, \lambda_r(x)$ such that

$$x = \lambda_1(x)c_1 + \lambda_2(x)c_2 + \cdots + \lambda_r(x)c_r.$$

The numbers $\lambda_1(x), \ldots, \lambda_r(x)$ (counting multiplicities) are called the eigenvalues of $x$. Furthermore, the trace of $x$, denoted by $\operatorname{tr}(x)$, is defined as $\operatorname{tr}(x) = \sum_{j=1}^{r} \lambda_j(x)$.

Since, by [11, Prop. III.1.5], a Jordan algebra $A = (V, \circ)$ over $\mathbb{R}$ with a unit element $e \in V$ is Euclidean if and only if the symmetric bilinear form $\operatorname{tr}(x \circ y)$ is positive definite, we may define another inner product $\langle\cdot,\cdot\rangle$ on $V$ by

$$\langle x, y\rangle := \operatorname{tr}(x \circ y), \quad \forall x, y \in V. \tag{6}$$

By the associativity of $\operatorname{tr}(\cdot)$ (see [11, Prop. II.4.3]), the inner product $\langle\cdot,\cdot\rangle$ is associative, i.e., for all $x, y, z \in V$, it holds that $\langle x \circ y, z\rangle = \langle y, x \circ z\rangle$.

Unless otherwise stated, in the rest of this paper we always assume that $A = (V, \circ, \langle\cdot,\cdot\rangle)$ is a Euclidean Jordan algebra of rank $r$ with $\dim(V) = n$ and with $\langle\cdot,\cdot\rangle$ defined as in (6).

Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a scalar-valued function. By Theorem 2.1, it is natural to define a vector-valued function associated with the Euclidean Jordan algebra $A = (V, \circ, \langle\cdot,\cdot\rangle)$ by

$$\varphi_V(x) := \varphi(\lambda_1(x))c_1 + \varphi(\lambda_2(x))c_2 + \cdots + \varphi(\lambda_r(x))c_r, \tag{7}$$

where $x \in V$ has the spectral decomposition $x = \sum_{j=1}^{r} \lambda_j(x)c_j$. The function $\varphi_V$ is also called the Löwner operator in [23] and is shown there to inherit many properties from $\varphi$. In particular, when $\varphi(t)$ is chosen as $\max\{0, t\}$ and $\min\{0, t\}$ for $t \in \mathbb{R}$, the Löwner operator $\varphi_V(\cdot)$ becomes the metric projection operator onto $K$ and $-K$, respectively:

$$(x)_+ := \sum_{j=1}^{r} \max\{0, \lambda_j(x)\}\, c_j \quad \text{and} \quad (x)_- := \sum_{j=1}^{r} \min\{0, \lambda_j(x)\}\, c_j. \tag{8}$$
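As an illustration of (7) and (8), take the standard example of the algebra of real symmetric matrices with Jordan product $X \circ Y = (XY + YX)/2$ and the trace inner product, whose cone of squares is the positive semidefinite cone. In that setting the spectral decomposition of Theorem 2.1 is the usual eigendecomposition. The sketch below is written under that assumption only; the helper names `loewner`, `proj_psd` and `proj_nsd` are hypothetical.

```python
import numpy as np

def loewner(phi, X):
    """Apply the scalar function phi to the eigenvalues of a symmetric matrix X, as in (7)."""
    lam, Q = np.linalg.eigh(X)          # X = Q diag(lam) Q^T
    return (Q * phi(lam)) @ Q.T         # scale column j of Q by phi(lam_j)

def proj_psd(X):
    """(X)_+ from (8): projection onto the positive semidefinite cone."""
    return loewner(lambda t: np.maximum(t, 0.0), X)

def proj_nsd(X):
    """(X)_- from (8): projection onto the negative semidefinite cone."""
    return loewner(lambda t: np.minimum(t, 0.0), X)

X = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues -1 and 3
Xp, Xm = proj_psd(X), proj_nsd(X)
```

Here `Xp + Xm` recovers `X` and the two parts are orthogonal in the trace inner product, which matches the decomposition $x = (x)_+ + (x)_-$ implied by (8).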

A Euclidean Jordan algebra is called simple if it is not the direct sum of two Euclidean Jordan algebras. By [11, Prop. III.4.4-4.5 & Theorem V.3.7], each Euclidean Jordan algebra is, in a unique way, a direct sum of simple Euclidean Jordan algebras. Also, the symmetric cone in a given Euclidean Jordan algebra is, in a unique way, a direct sum of symmetric cones in the constituent simple Euclidean Jordan algebras. In the sequel, we assume that $V \equiv V_1 \times \cdots \times V_m$ and $K \equiv K^1 \times \cdots \times K^m$, where each $A_i = (V_i, \circ, \langle\cdot,\cdot\rangle)$ is a simple Euclidean Jordan algebra and $K^i$ is the symmetric cone in $V_i$. Corresponding to the Cartesian structure of $V$ and $K$, let $\zeta = (\zeta_1, \ldots, \zeta_m)$ with $\zeta_i \in V_i$ and $F(\zeta) = (F_1(\zeta), \ldots, F_m(\zeta))$ with $F_i : V \to V_i$.

To close this section, we recall the definitions of the uniform Cartesian $P$-property [5, 14] and the uniform Jordan $P$-property [24].

Definition 2.1 The mapping $F = (F_1, F_2, \ldots, F_m)$ with $F_i : V \to V_i$ is said to have

(a) the uniform Cartesian $P$-property if there exists a positive scalar $\rho$ such that for any $\zeta, \xi \in V$, there is an index $i \in \{1, 2, \ldots, m\}$ such that

$$\langle \zeta_i - \xi_i, F_i(\zeta) - F_i(\xi)\rangle \ge \rho\|\zeta - \xi\|^2;$$

(b) the uniform Jordan $P$-property if there is a scalar $\rho > 0$ such that for any $\zeta, \xi \in V$,

$$\lambda_{\max}\big[(\zeta - \xi) \circ (F(\zeta) - F(\xi))\big] \ge \rho\|\zeta - \xi\|^2,$$

where $\lambda_{\max}(x)$ denotes the largest eigenvalue of a vector $x \in V$.

### 3 Properties of the function $\Psi_\alpha$

This section is devoted to the favorable properties of the implicit Lagrangian function $\Psi_\alpha$. Most of the properties were given by Kong, Tuncel and Xiu [14], and we supplement them with some that play an important role in the convergence analysis of the algorithms. For this purpose, let $r_\alpha : V \times V \to V$ and $R_\alpha : V \to V$ be defined, respectively, by

$$r_\alpha(x, y) := x - (x - \alpha y)_+ \quad \text{for } \alpha > 0 \tag{9}$$

and

$$R_\alpha(\zeta) := r_\alpha(\zeta, F(\zeta)) \quad \text{for } \alpha > 0. \tag{10}$$

Then, by the same arguments as those of [4, Lemma 1] and [21, Theorem 4.2], it is easy to obtain the following properties of $r_\alpha$, and we omit the proof.

Lemma 3.1 Let $r_\alpha$ be defined as in (9). Then the following hold:

(a) $\min\{1, \alpha\}\|r_1(x, y)\| \le \|r_\alpha(x, y)\| \le \max\{1, \alpha\}\|r_1(x, y)\|$ for all $x, y \in V$ and $\alpha > 0$;

(b) $\min\{1, \alpha\}\|r_1(x, y)\| \le \|r_\alpha(y, x)\| \le \max\{1, \alpha\}\|r_1(x, y)\|$ for all $x, y \in V$ and $\alpha > 0$;

(c) $\alpha^{-1}(\alpha - 1)\|r_1(x, y)\|^2 \le \psi_\alpha(x, y) \le (\alpha - 1)\|r_1(x, y)\|^2$ for all $x, y \in V$ and $\alpha > 1$.
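The bounds of Lemma 3.1 can be spot-checked numerically in the special case $K=\mathbb{R}^n_+$, where the projection is a componentwise maximum. The following sketch samples random pairs $(x, y)$ with $\alpha = 1.5$; it is a sanity check under that assumption rather than a proof, and the names `residual`, `psi` and `lemma_holds`, the sample size, and the tolerance are arbitrary choices.

```python
import numpy as np

def residual(x, y, alpha):
    """Natural residual r_alpha(x, y) = x - (x - alpha*y)_+ from (9), for K = R^n_+."""
    return x - np.maximum(x - alpha * y, 0.0)

def psi(x, y, alpha):
    """Implicit Lagrangian (3) for K = R^n_+."""
    px = np.maximum(x - alpha * y, 0.0)
    py = np.maximum(y - alpha * x, 0.0)
    return x @ y + (px @ px - x @ x + py @ py - y @ y) / (2.0 * alpha)

rng = np.random.default_rng(0)
alpha, tol = 1.5, 1e-10          # alpha > 1, so min{1, alpha} = 1 and max{1, alpha} = alpha
lemma_holds = True
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    r1 = np.linalg.norm(residual(x, y, 1.0))
    ra = np.linalg.norm(residual(x, y, alpha))     # part (a)
    rb = np.linalg.norm(residual(y, x, alpha))     # part (b)
    val = psi(x, y, alpha)                         # part (c)
    lemma_holds &= bool(r1 <= ra + tol and ra <= alpha * r1 + tol)
    lemma_holds &= bool(r1 <= rb + tol and rb <= alpha * r1 + tol)
    lemma_holds &= bool((alpha - 1) / alpha * r1**2 <= val + tol)
    lemma_holds &= bool(val <= (alpha - 1) * r1**2 + tol)
```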

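Now we give a proposition that summarizes the favorable properties of the function $\Psi_\alpha$.

Proposition 3.1 Let $\Psi_\alpha$ be defined as in (5). Then the following statements hold:

(a) $\Psi_\alpha(\zeta) \ge 0$ for all $\zeta \in V$, and $\Psi_\alpha(\zeta) = 0$ if and only if $\zeta \in V$ solves the SCCP (1).

(b) $\Psi_\alpha$ is continuously differentiable everywhere on $V$ with the gradient given by

$$\nabla\Psi_\alpha(\zeta) = \nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta)).$$

(c) $\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta)) = 0$ if and only if $\zeta \in V$ solves the SCCP (1).

(d) $\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle \ge 0$ for any $\zeta \in V$.

(e) $\big\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\big\|^2 \ge \dfrac{(\alpha^2 - 1)^2}{\alpha^2(\alpha^2 + 1)}\|R_\alpha(\zeta)\|^2$ for any $\zeta \in V$.

(f) $\big\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\big\|^2 \le 2\alpha(\alpha - 1)\Psi_\alpha(\zeta)$ for any $\zeta \in V$.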

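Proof. The proof of parts (a)-(d) can be found in [14]. To prove parts (e) and (f), we only need to show that for all $x, y \in V$,

$$\big\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\big\|^2 \ge \frac{(\alpha^2 - 1)^2}{\alpha^2(\alpha^2 + 1)}\Big[\|r_\alpha(x, y)\|^2 + \|r_\alpha(y, x)\|^2\Big], \tag{11}$$

$$\big\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\big\|^2 \le 2\alpha(\alpha - 1)\psi_\alpha(x, y). \tag{12}$$

From [14], it follows that for any $x, y \in V$,

$$\begin{aligned}
\nabla_x\psi_\alpha(x, y) &= y + \alpha^{-1}\big[(x - \alpha y)_+ - x - \alpha(y - \alpha x)_+\big],\\
\nabla_y\psi_\alpha(x, y) &= x + \alpha^{-1}\big[(y - \alpha x)_+ - y - \alpha(x - \alpha y)_+\big].
\end{aligned} \tag{13}$$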
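Therefore, we have

$$\begin{aligned}
\big\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\big\|^2
&= \frac{(\alpha - 1)^2}{\alpha^2}\big\|x - (x - \alpha y)_+ + y - (y - \alpha x)_+\big\|^2\\
&= \frac{(\alpha - 1)^2}{\alpha^2}\Big[\|x - (x - \alpha y)_+\|^2 + \|y - (y - \alpha x)_+\|^2\Big]\\
&\quad + \frac{2(\alpha - 1)^2}{\alpha^2}\big\langle x - (x - \alpha y)_+,\ y - (y - \alpha x)_+\big\rangle.
\end{aligned} \tag{14}$$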
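From part (d), we see that $\langle\nabla_x\psi_\alpha(x, y), \nabla_y\psi_\alpha(x, y)\rangle \ge 0$ for any $x, y \in V$, that is,

$$\begin{aligned}
0 &\le \Big\langle y - (y - \alpha x)_+ + \frac{1}{\alpha}\big[(x - \alpha y)_+ - x\big],\ x - (x - \alpha y)_+ + \frac{1}{\alpha}\big[(y - \alpha x)_+ - y\big]\Big\rangle\\
&= -\frac{1}{\alpha}\|x - (x - \alpha y)_+\|^2 - \frac{1}{\alpha}\|y - (y - \alpha x)_+\|^2 + \Big(1 + \frac{1}{\alpha^2}\Big)\big\langle x - (x - \alpha y)_+,\ y - (y - \alpha x)_+\big\rangle.
\end{aligned}$$

This in turn implies that

$$\big\langle x - (x - \alpha y)_+,\ y - (y - \alpha x)_+\big\rangle \ge \frac{\alpha}{\alpha^2 + 1}\Big[\|x - (x - \alpha y)_+\|^2 + \|y - (y - \alpha x)_+\|^2\Big] \quad \forall x, y \in V. \tag{15}$$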

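Combining (14) with (15) and noting that $\alpha > 0$, we immediately obtain

$$\begin{aligned}
\big\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\big\|^2
&\ge \frac{(\alpha - 1)^2(\alpha + 1)^2}{\alpha^2(\alpha^2 + 1)}\Big[\|x - (x - \alpha y)_+\|^2 + \|y - (y - \alpha x)_+\|^2\Big]\\
&= \frac{(\alpha^2 - 1)^2}{\alpha^2(\alpha^2 + 1)}\Big[\|r_\alpha(x, y)\|^2 + \|r_\alpha(y, x)\|^2\Big].
\end{aligned}$$

This completes the proof of (11). To see inequality (12), we verify the following:

$$\begin{aligned}
\big\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\big\|^2
&= \frac{(\alpha - 1)^2}{\alpha^2}\big\|r_\alpha(x, y) + r_\alpha(y, x)\big\|^2\\
&\le \frac{2(\alpha - 1)^2}{\alpha^2}\Big[\|r_\alpha(x, y)\|^2 + \|r_\alpha(y, x)\|^2\Big]\\
&\le \frac{(\alpha - 1)^2}{\alpha^2}\cdot 2\alpha^2\|r_1(x, y)\|^2\\
&\le 2(\alpha - 1)^2\cdot\frac{\alpha}{\alpha - 1}\psi_\alpha(x, y)\\
&= 2\alpha(\alpha - 1)\psi_\alpha(x, y),
\end{aligned}$$

where the first equality is due to the first equation of (14) and the definition of $r_\alpha$, the second inequality holds by Lemma 3.1(a)-(b), and the third by Lemma 3.1(c). Thus, we complete the proof. □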
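Parts (d) and (e) can also be spot-checked numerically through the gradient formulas (13). The sketch below does so for the special case $K=\mathbb{R}^n_+$ with $\alpha = 1.5$, recording the smallest observed inner product in (d) and the smallest observed slack in (11). The names `grad_psi`, `min_inner` and `min_gap`, the sample size, and the tolerance used in any check are illustrative assumptions.

```python
import numpy as np

def grad_psi(x, y, alpha):
    """Partial gradients of psi_alpha from (13), specialized to K = R^n_+."""
    px = np.maximum(x - alpha * y, 0.0)
    py = np.maximum(y - alpha * x, 0.0)
    gx = y + (px - x - alpha * py) / alpha
    gy = x + (py - y - alpha * px) / alpha
    return gx, gy

rng = np.random.default_rng(0)
alpha = 1.5
c = (alpha**2 - 1) ** 2 / (alpha**2 * (alpha**2 + 1))   # constant in (11)
min_inner, min_gap = np.inf, np.inf
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    gx, gy = grad_psi(x, y, alpha)
    ra = x - np.maximum(x - alpha * y, 0.0)              # r_alpha(x, y)
    rb = y - np.maximum(y - alpha * x, 0.0)              # r_alpha(y, x)
    min_inner = min(min_inner, gx @ gy)                  # part (d): should be >= 0
    lhs = np.sum((gx + gy) ** 2)
    min_gap = min(min_gap, lhs - c * (ra @ ra + rb @ rb))  # (11): should be >= 0
```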

The assertions of Prop. 3.1(e)-(f) are new, and they play a key role in establishing the rate of convergence of the nonmonotone descent algorithm of this paper. When $V$ reduces to the Euclidean space $\mathbb{R}^n$ with the standard inner product and the Jordan product defined as the componentwise product of vectors, Prop. 3.1(e) implies the second result of [18, Lemma 1] by observing that $\alpha > 1$ and that the following inequalities hold:

$$\big\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\big\| \ge \frac{\alpha^2 - 1}{\alpha\sqrt{\alpha^2 + 1}}\|R_\alpha(\zeta)\| \ge \frac{(\alpha - 1)(\alpha + 1)}{\alpha\sqrt{\alpha^2 + 1}}\|R_1(\zeta)\| \ge \frac{\alpha - 1}{\alpha}\|R_1(\zeta)\|,$$

where the first inequality is Prop. 3.1(e), the second is by Lemma 3.1(a), and the last is due to $\alpha + 1 > \sqrt{\alpha^2 + 1}$.

The following results for $\Psi_\alpha$ can be found in [14, Corollary 6.4] and [14, Theorem 6.3].

Proposition 3.2 Assume that $F$ has the uniform Cartesian $P$-property. Then,

(a) Each stationary point of $\Psi_\alpha$ is a solution of the SCCP (1).

(b) If, in addition, $F$ is Lipschitz continuous with constant $L > 0$, then for any $\zeta \in V$,

$$\frac{1}{(\alpha - 1)(2 + L)^2}\Psi_\alpha(\zeta) \le \|\zeta - \zeta^*\|^2 \le \frac{\alpha(1 + L)^2}{(\alpha - 1)\rho^2}\Psi_\alpha(\zeta),$$

where $\zeta^*$ is the unique solution of (1), and the constant $\rho$ is the same as in Def. 2.1.

It is well known that the coerciveness of the merit function plays an important role in the convergence analysis of unconstrained reformulation methods for complementarity problems. The next proposition presents a mild condition that guarantees the coerciveness of $\Psi_\alpha$; its proof can be found in [19, Theorem 4.1].

Proposition 3.3 The function $\Psi_\alpha$ is coercive under the following condition:

(C.1) $F$ has the uniform Jordan $P$-property and linear growth, i.e., there exists a constant $C > 0$ such that for any $\zeta \in V$, $\|F(\zeta)\| \le \|F(0)\| + C\|\zeta\|$.

In particular, if $F$ is given as in (2) with $L$ having the $P$-property, then $\Psi_\alpha$ is coercive.

To close this section, we present the direction $d(\zeta)$ that will be employed to design our derivative-free algorithm. Specifically, let the mapping $d : V \to V$ be given by

$$d(\zeta) := -\theta\nabla_x\psi_\alpha(\zeta, F(\zeta)) - (1 - \theta)\nabla_y\psi_\alpha(\zeta, F(\zeta)), \quad \theta \in [0, 1]. \tag{16}$$

The vector $d$ enjoys the properties stated in the following proposition.

Proposition 3.4 Suppose that $\nabla F$ is positive definite. Then, for sufficiently small $\theta > 0$,

$$d(\zeta)^T\nabla\Psi_\alpha(\zeta) < 0 \quad \text{when } d(\zeta) \ne 0.$$

If $F$ is strongly monotone with modulus $\mu > 0$ and $S \subseteq V$ is any bounded set, then there exists $\bar\theta \in (0, 1)$ such that for all $\theta \le \bar\theta$,

$$d(\zeta)^T\nabla\Psi_\alpha(\zeta) \le -\frac{1}{2}\theta\big\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\big\|^2 \quad \forall \zeta \in S.$$

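Proof. By the formula for $\nabla\Psi_\alpha$ and Prop. 3.1(d), for any $\theta \in [0, 1]$, we have

$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta)
&= -\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|^2 - (1 - \theta)\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\quad - \theta\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\quad - (1 - \theta)\langle\nabla_y\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\le -\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|^2 - \theta\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\quad - (1 - \theta)\langle\nabla_y\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle.
\end{aligned} \tag{17}$$

Notice that for sufficiently small $\theta > 0$ and any given $\zeta \in V$, $d(\zeta) \ne 0$ must imply that $\nabla_y\psi_\alpha(\zeta, F(\zeta)) \ne 0$. Thus, the last term on the right-hand side is always strictly negative by the positive definiteness of $\nabla F$, whereas the first two terms are sufficiently small. Therefore, we obtain that $d(\zeta)^T\nabla\Psi_\alpha(\zeta) < 0$ whenever $d(\zeta) \ne 0$.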

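Since $\nabla F$ is continuous and $S$ is bounded, there exists a constant $\nu > 0$ such that

$$\|\nabla F(\zeta)\| \le \nu \quad \forall \zeta \in S. \tag{18}$$

On the other hand, using the strong monotonicity of $F$, we have

$$\langle\nabla F(\zeta)u, u\rangle \ge \mu\|u\|^2 \quad \forall \zeta, u \in V. \tag{19}$$

Now, from (17)-(19), it follows that for any $\theta \in [0, 1]$ and $\zeta \in S$,

$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta)
&\le -\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|^2 - (1 - \theta)\mu\|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2\\
&\quad + \theta\nu\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|\cdot\|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|\\
&= -\frac{1}{2}\theta\Big(\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\| + \|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|\Big)^2\\
&\quad - \frac{1}{2}\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|^2 - \frac{2(1 - \theta)\mu - \theta}{2}\|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2\\
&\quad + \theta(\nu + 1)\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|\cdot\|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|.
\end{aligned} \tag{20}$$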
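If $\theta \le 2\mu/(2\mu + 1)$, then the last inequality can be rewritten as

$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta)
&\le -\frac{1}{2}\theta\Big(\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\| + \|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|\Big)^2\\
&\quad - \left(\sqrt{\frac{\theta}{2}}\,\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\| - \sqrt{\frac{2\mu - (2\mu + 1)\theta}{2}}\,\|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|\right)^2\\
&\quad + \Big[\theta(\nu + 1) - \sqrt{2\mu\theta - (2\mu + 1)\theta^2}\Big]\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|\,\|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|.
\end{aligned} \tag{21}$$

If $\theta(\nu + 1) \le \sqrt{2\mu\theta - (2\mu + 1)\theta^2}$, that is, $\theta \le 2\mu/(2\mu + 1 + (\nu + 1)^2)$, then using (21) and the Cauchy-Schwarz inequality yields

$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta)
&\le -\frac{1}{2}\theta\Big(\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\| + \|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|\Big)^2\\
&\le -\frac{1}{2}\theta\big\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\big\|^2.
\end{aligned}$$

Thus, by setting

$$\bar\theta := \min\left\{\frac{2\mu}{2\mu + 1},\ \frac{2\mu}{2\mu + 1 + (\nu + 1)^2}\right\} = \frac{2\mu}{2\mu + 1 + (\nu + 1)^2}, \tag{22}$$

we obtain the desired result. The proof is complete. □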
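The threshold (22) is easy to evaluate once $\mu$ and $\nu$ are known. The sketch below does this for a hypothetical affine map $F(\zeta) = M\zeta + q$ on $K = \mathbb{R}^2_+$ with a symmetric positive definite matrix $M$, so that $\mu$ and $\nu$ can be taken as the smallest and largest eigenvalues of $M$ and the bounds (18)-(19) hold on the whole space. It then samples random points and records the largest value of $d(\zeta)^T\nabla\Psi_\alpha(\zeta) + \tfrac{1}{2}\theta h(\zeta)$ with $\theta = \bar\theta$, which Prop. 3.4 predicts to be nonpositive. The matrix, vector, sample size and helper names are illustrative assumptions.

```python
import numpy as np

def theta_bar(mu, nu):
    """Threshold (22): 2*mu / (2*mu + 1 + (nu + 1)^2)."""
    return 2.0 * mu / (2.0 * mu + 1.0 + (nu + 1.0) ** 2)

def grad_psi(x, y, alpha):
    """Partial gradients (13) of psi_alpha for K = R^n_+."""
    px = np.maximum(x - alpha * y, 0.0)
    py = np.maximum(y - alpha * x, 0.0)
    return y + (px - x - alpha * py) / alpha, x + (py - y - alpha * px) / alpha

Mmat = np.array([[2.0, 1.0], [1.0, 2.0]])   # assumed test matrix, eigenvalues 1 and 3
q = np.array([-1.0, 1.0])                   # assumed test vector
eigs = np.linalg.eigvalsh(Mmat)
mu, nu = eigs[0], eigs[-1]                  # strong monotonicity modulus and bound on ||grad F||
theta = theta_bar(mu, nu)                   # approximately 2/19
alpha = 2.0

rng = np.random.default_rng(1)
worst = -np.inf
for _ in range(1000):
    zeta = rng.normal(scale=3.0, size=2)
    gx, gy = grad_psi(zeta, Mmat @ zeta + q, alpha)
    d = -theta * gx - (1.0 - theta) * gy            # direction (16)
    grad_Psi = gx + Mmat.T @ gy                     # Prop. 3.1(b), transposed Jacobian
    h = np.sum((gx + gy) ** 2)
    worst = max(worst, d @ grad_Psi + 0.5 * theta * h)
```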

### 4 Nonmonotone derivative-free algorithm

In this section, we utilize the direction $d(\zeta)$ defined by (16) to design a derivative-free algorithm. By Prop. 3.4, $d(\zeta)$ with a general $\theta \in [0, 1]$ may not satisfy the descent condition. Moreover, a nonmonotone line search is often more effective than an Armijo-type line search. We therefore adopt a nonmonotone line search rule to find a suitable stepsize.

Algorithm 4.1

(Step 0) Choose $\zeta^0 \in V$, $\epsilon \ge 0$, $\theta \in [0, 1]$ and $\gamma, \delta \in (0, 1)$. Let $M > 0$ be an integer. Set $k := 0$.

(Step 1) If $\Psi_\alpha(\zeta^k) \le \epsilon$, then stop. Otherwise, go to Step 2.

(Step 2) Let $m(0) = 0$ and $0 \le m(k) \le \min\{m(k - 1) + 1, M - 1\}$ for $k \ge 1$. Let $l_k$ be the smallest nonnegative integer $l$ satisfying

$$\Psi_\alpha(\zeta^k + \gamma^l d^k) \le \max_{0\le j\le m(k)}\Psi_\alpha(\zeta^{k-j}) - \delta\gamma^{2l}h(\zeta^k), \tag{23}$$

where $d^k := d(\zeta^k)$ with $d(\zeta)$ defined as in (16), and

$$h(\zeta) := \|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2. \tag{24}$$

(Step 3) Set $\zeta^{k+1} := \zeta^k + \gamma^{l_k}d^k$ and $k := k + 1$, and then go to Step 1.
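The steps above can be sketched in Python for the special case $K = \mathbb{R}^n_+$, where the projection is a componentwise maximum. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `nonmonotone_df`, the choice $m(k) = \min\{k, M-1\}$, the default parameter values, the iteration and backtracking caps, and the test problem $F(\zeta) = \zeta + q$ are all hypothetical choices.

```python
import numpy as np

def psi(x, y, alpha):
    """Implicit Lagrangian (3) for K = R^n_+."""
    px = np.maximum(x - alpha * y, 0.0)
    py = np.maximum(y - alpha * x, 0.0)
    return x @ y + (px @ px - x @ x + py @ py - y @ y) / (2.0 * alpha)

def grad_psi(x, y, alpha):
    """Partial gradients (13) of psi_alpha for K = R^n_+."""
    px = np.maximum(x - alpha * y, 0.0)
    py = np.maximum(y - alpha * x, 0.0)
    return y + (px - x - alpha * py) / alpha, x + (py - y - alpha * px) / alpha

def nonmonotone_df(F, zeta0, alpha=2.0, theta=0.2, gamma=0.5, delta=1e-4,
                   M=5, eps=1e-12, max_iter=500, max_backtracks=60):
    zeta = np.asarray(zeta0, dtype=float)
    history = [psi(zeta, F(zeta), alpha)]           # merit values Psi(zeta^0), ..., Psi(zeta^k)
    for _ in range(max_iter):
        if history[-1] <= eps:                      # Step 1
            break
        gx, gy = grad_psi(zeta, F(zeta), alpha)
        d = -theta * gx - (1.0 - theta) * gy        # direction (16); only F values are used
        h = float(np.sum((gx + gy) ** 2))           # h from (24)
        ref = max(history[-M:])                     # max over the last m(k)+1 values, m(k) = min(k, M-1)
        for l in range(max_backtracks):             # Step 2
            t = gamma ** l
            trial = zeta + t * d
            val = psi(trial, F(trial), alpha)
            if val <= ref - delta * t * t * h:      # nonmonotone condition (23)
                break
        else:
            break                                   # safeguard (not in Algorithm 4.1): no stepsize found
        zeta = trial                                # Step 3
        history.append(val)
    return zeta, history

q = np.array([-1.0, 1.0])
F = lambda z: z + q                                 # assumed strongly monotone test map
zeta, history = nonmonotone_df(F, [0.0, 0.5])
```

For this test map the complementarity problem has the solution $(1, 0)$, with $F(1, 0) = (0, 1)$, and $\theta = 0.2$ lies below the threshold $2/7$ obtained from (22) with $\mu = \nu = 1$.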

Observe that no derivatives of $F$ are needed to compute the search direction or the stepsize in Algorithm 4.1. Hence, Algorithm 4.1 requires little computation and storage at each iteration. Since $\theta$ is any fixed constant in $[0, 1]$, the direction $d^k$ is different from the one used in [18] and may not satisfy the descent condition $(d^k)^T\nabla\Psi_\alpha(\zeta^k) < 0$ at each iteration. For this reason, a nonmonotone line search rule is used in Step 2. The line search rule is different from the ones adopted in [2, 6], where the gradient of the merit (or objective) function is needed, and when $m(k) \equiv 0$, the nonmonotone line search reduces to the Armijo line search. In particular, if $\theta$ is restricted to be less than $\bar\theta$ given by (22) and $F$ is strongly monotone, then Prop. 3.4 implies that Algorithm 4.1 becomes a nonmonotone derivative-free descent algorithm.

In what follows, we study the convergence of Algorithm 4.1. To this end, assume that Algorithm 4.1 generates an infinite sequence $\{\zeta^k\}$, i.e., $\epsilon = 0$. We define the level set

$$L(\Psi_\alpha, \zeta^0) := \{\zeta \in V \mid \Psi_\alpha(\zeta) \le \Psi_\alpha(\zeta^0)\}.$$

Then $L(\Psi_\alpha, \zeta^0)$ is bounded under either of the conditions given in Prop. 3.3. By the continuity of $F(\cdot)$, we know that $D(\zeta^0) := \sup\{\|d(\zeta)\| \mid \zeta \in L(\Psi_\alpha, \zeta^0)\}$ is finite. Consequently,

$$B(\zeta^0) := L(\Psi_\alpha, \zeta^0) + \{\zeta \in V \mid \|\zeta\| \le D(\zeta^0)\}$$

is also bounded under the condition stated in Prop. 3.3.

Lemma 4.1 Let $\{\zeta^k\}$ be the sequence generated by Algorithm 4.1. Then,

(a) the sequence $\{\zeta^k\}$ is contained in $L(\Psi_\alpha, \zeta^0)$;

(b) for any $p \ge 1$,

$$\max_{1\le i\le M}\Psi_\alpha(\zeta^{Mp+i}) \le \max_{1\le i\le M}\Psi_\alpha(\zeta^{M(p-1)+i}) - \delta\min_{0\le i\le M-1}\gamma^{2l_{Mp+i}}h(\zeta^{Mp+i}).$$

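Proof. (a) For each $k \ge 0$, let $\sigma(k)$ be an integer from $[k - m(k), k]$ such that

$$\Psi_\alpha(\zeta^{\sigma(k)}) = \max_{0\le j\le m(k)}\Psi_\alpha(\zeta^{k-j}).$$

Then, the line search condition (23) can be rewritten as

$$\Psi_\alpha(\zeta^{k+1}) \le \Psi_\alpha(\zeta^{\sigma(k)}) - \delta\gamma^{2l_k}h(\zeta^k). \tag{25}$$

Noting that $m(k + 1) \le m(k) + 1$ and $h(\zeta) \ge 0$ for any $\zeta \in V$, we have from (23) that

$$\begin{aligned}
\Psi_\alpha(\zeta^{\sigma(k+1)}) &= \max_{0\le j\le m(k+1)}\Psi_\alpha(\zeta^{k+1-j}) \le \max_{0\le j\le m(k)+1}\Psi_\alpha(\zeta^{k+1-j})\\
&= \max\{\Psi_\alpha(\zeta^{\sigma(k)}), \Psi_\alpha(\zeta^{k+1})\}\\
&= \Psi_\alpha(\zeta^{\sigma(k)}),
\end{aligned}$$

where the last equality is from (25) and the nonnegativity of $h(\zeta^k)$. This shows that the sequence $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ is nonincreasing. Noting that $\zeta^{\sigma(0)} = \zeta^0$, we then have $\Psi_\alpha(\zeta^k) \le \Psi_\alpha(\zeta^0)$ for all $k$, which in turn implies $\{\zeta^k\} \subseteq L(\Psi_\alpha, \zeta^0)$.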

(b) We only need to show that the following inequality holds for $j = 1, 2, \ldots, M$:

$$\Psi_\alpha(\zeta^{Mp+j}) \le \max_{1\le i\le M}\Psi_\alpha(\zeta^{M(p-1)+i}) - \delta\gamma^{2l_{Mp+j-1}}h(\zeta^{Mp+j-1}) \quad \forall p \ge 1. \tag{26}$$

Notice that the line search condition (23) implies

$$\Psi_\alpha(\zeta^{Mp+1}) \le \max_{0\le i\le m(Mp)}\Psi_\alpha(\zeta^{Mp-i}) - \delta\gamma^{2l_{Mp}}h(\zeta^{Mp}),$$

which together with $m(Mp) \le M - 1$ shows that inequality (26) holds for $j = 1$.

Suppose that (26) holds for the indices $1, \ldots, j$ with some $1 \le j \le M - 1$. Then, from the nonnegativity of $h(\zeta)$, it follows that

$$\max_{1\le i\le j}\Psi_\alpha(\zeta^{Mp+i}) \le \max_{1\le i\le M}\Psi_\alpha(\zeta^{M(p-1)+i}).$$

Consequently, by using (23), the induction hypothesis and $m(Mp + j) \le M - 1$, we get

$$\begin{aligned}
\Psi_\alpha(\zeta^{Mp+j+1}) &\le \max_{0\le i\le m(Mp+j)}\Psi_\alpha(\zeta^{Mp+j-i}) - \delta\gamma^{2l_{Mp+j}}h(\zeta^{Mp+j})\\
&\le \max\Big\{\max_{1\le i\le M}\Psi_\alpha(\zeta^{M(p-1)+i}),\ \max_{1\le i\le j}\Psi_\alpha(\zeta^{Mp+i})\Big\} - \delta\gamma^{2l_{Mp+j}}h(\zeta^{Mp+j})\\
&\le \max_{1\le i\le M}\Psi_\alpha(\zeta^{M(p-1)+i}) - \delta\gamma^{2l_{Mp+j}}h(\zeta^{Mp+j}).
\end{aligned}$$

This shows that (26) also holds for $j + 1$. By induction, (26) is true for all $1 \le j \le M$. Consequently, the assertion of part (b) follows. □

Now we are in a position to state and prove our convergence result for Algorithm 4.1.

Theorem 4.1 Let $\{\zeta^k\}$ be the sequence generated by Algorithm 4.1. Suppose that $F$ is Lipschitz continuous and satisfies the condition in Prop. 3.3, and $\nabla F(\cdot)$ is Lipschitz continuous on $B(\zeta^0)$. Then, the following results hold.

(a) The sequence $\{\zeta^k\}$ is bounded.

(b) The sequence $\{\Psi_\alpha(\zeta^k)\}$ is convergent.

(c) $\lim_{k\to\infty}\gamma^{2l_k}h(\zeta^k) = 0$, $\lim_{k\to\infty}\gamma^{l_k}\|d^k\| = 0$ and $\lim_{k\to\infty}\|\zeta^{k+1} - \zeta^k\| = 0$.

(d) Each accumulation point $\zeta$ of $\{\zeta^k\}$ either is a solution of the SCCP (1) or satisfies

$$\frac{|\nabla\Psi_\alpha(\zeta)^Td(\zeta)|}{h(\zeta)} = 0. \tag{27}$$

Proof. (a) By Prop. 3.3, $L(\Psi_\alpha, \zeta^0)$ is bounded, and the result holds by Lemma 4.1(a).

(b) First, by the proof of Lemma 4.1(a), the sequence $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ is nonincreasing. Together with the nonnegativity of $\Psi_\alpha(\zeta)$ for any $\zeta \in V$, this implies that $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ admits a limit as $k \to \infty$. Let $j$ be an integer such that $1 \le j \le M + 1$. We first show by induction on $j$ that

$$\lim_{k\to\infty}\|\zeta^{\sigma(k)-j+1} - \zeta^{\sigma(k)-j}\| = 0, \tag{28}$$

$$\lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)-j}), \tag{29}$$

where $\sigma(k)$ is defined as in the proof of Lemma 4.1, and the sequences are considered for sufficiently large $k$ such that $\sigma(k) \ge k - M > 1$. If $j = 1$, then using (25) with $k$ replaced by $\sigma(k) - 1$, we obtain that

$$\Psi_\alpha(\zeta^{\sigma(k)}) \le \Psi_\alpha(\zeta^{\sigma(\sigma(k)-1)}) - \delta\gamma^{2l_{\sigma(k)-1}}h(\zeta^{\sigma(k)-1}). \tag{30}$$

Since $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ admits a limit, taking limits on both sides of (30) yields

$$\lim_{k\to\infty}\gamma^{2l_{\sigma(k)-1}}h(\zeta^{\sigma(k)-1}) = 0.$$

From the definitions of $d(\zeta)$ and $h(\zeta)$, it is easy to verify that

$$h(\zeta) \ge \|d(\zeta)\|^2 \quad \text{for any } \zeta \in V.$$

Using the last two relations, it then follows that

$$0 \ge \lim_{k\to\infty}\|\gamma^{l_{\sigma(k)-1}}d^{\sigma(k)-1}\| = \lim_{k\to\infty}\|\zeta^{\sigma(k)} - \zeta^{\sigma(k)-1}\| \ge 0. \tag{31}$$

On the other hand, since $\Psi_\alpha$ is continuously differentiable everywhere and $L(\Psi_\alpha, \zeta^0)$ is bounded, the function $\Psi_\alpha$ is Lipschitz continuous on $L(\Psi_\alpha, \zeta^0)$. This means that there exists a constant $L_2 > 0$ such that

$$|\Psi_\alpha(\zeta) - \Psi_\alpha(\xi)| \le L_2\|\zeta - \xi\| \quad \forall \zeta, \xi \in L(\Psi_\alpha, \zeta^0). \tag{32}$$

From (31)-(32), we immediately obtain

$$\lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)-1}).$$

This shows that (28) and (29) hold for $j = 1$. Now assume that (29) holds for a given $j$. Using (25) with $k$ replaced by $\sigma(k) - j - 1$, we have

$$\Psi_\alpha(\zeta^{\sigma(k)-j}) \le \Psi_\alpha(\zeta^{\sigma(\sigma(k)-j-1)}) - \delta\gamma^{2l_{\sigma(k)-j-1}}h(\zeta^{\sigma(k)-j-1}).$$

Taking limits as $k \to \infty$ and recalling (29) give

$$\lim_{k\to\infty}\gamma^{2l_{\sigma(k)-j-1}}h(\zeta^{\sigma(k)-j-1}) = 0.$$

This together with $h(\zeta^{\sigma(k)-j-1}) \ge \|d^{\sigma(k)-j-1}\|^2$ implies

$$0 \ge \lim_{k\to\infty}\gamma^{l_{\sigma(k)-j-1}}\|d^{\sigma(k)-j-1}\| = \lim_{k\to\infty}\|\zeta^{\sigma(k)-j} - \zeta^{\sigma(k)-j-1}\| = 0.$$

Combining with (29) and (32), we then obtain

$$\lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)-j-1}).$$

The last two equations show that (28) and (29) hold when $j$ is replaced with $j + 1$, and hence (28) and (29) hold for any given $j \in \{1, \ldots, M\}$. Let $\hat\sigma(k) = \sigma(k + M + 1)$. Then,

$$\begin{aligned}
\zeta^{\hat\sigma(k)} &= \zeta^k + (\zeta^{k+1} - \zeta^k) + \cdots + (\zeta^{\hat\sigma(k)} - \zeta^{\hat\sigma(k)-1})\\
&= \zeta^k + \sum_{j=1}^{\hat\sigma(k)-k}(\zeta^{\hat\sigma(k)-j+1} - \zeta^{\hat\sigma(k)-j}).
\end{aligned} \tag{33}$$

Notice that $\sigma(k + M + 1) \le k + M + 1$ and hence $\hat\sigma(k) - k \le M + 1$. Therefore, from (33) and (28), it follows that

$$\lim_{k\to\infty}\|\zeta^k - \zeta^{\hat\sigma(k)}\| = 0. \tag{34}$$

Since $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ has a limit, using (32) and (34), we have

$$\lim_{k\to\infty}\Psi_\alpha(\zeta^k) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\hat\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k+M+1)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}).$$

Thus, we complete the proof of assertion (b).

(c) From the line search condition (23) and part (b), it readily follows that

$$\lim_{k\to\infty}\gamma^{2l_k}h(\zeta^k) = 0.$$

This together with $h(\zeta^k) \ge \|d^k\|^2$ and $\|\gamma^{l_k}d^k\| = \|\zeta^{k+1} - \zeta^k\|$ yields

$$\lim_{k\to\infty}\gamma^{l_k}\|d^k\| = \lim_{k\to\infty}\|\zeta^{k+1} - \zeta^k\| = 0.$$

Consequently, the assertions of part (c) hold.

(d) If $l = 0$ does not satisfy the line search condition (23), i.e., $l_k \ge 1$, then we have

$$\begin{aligned}
\Psi_\alpha(\zeta^k + \gamma^{l_k-1}d^k) &> \max_{0\le j\le m(k)}\Psi_\alpha(\zeta^{k-j}) - \delta\gamma^{2(l_k-1)}h(\zeta^k)\\
&\ge \Psi_\alpha(\zeta^k) - \delta\gamma^{2(l_k-1)}h(\zeta^k).
\end{aligned} \tag{35}$$

Since $F(\cdot)$ and $\nabla F(\cdot)$ are Lipschitz continuous on $B(\zeta^0)$, it is clear that $\nabla\Psi_\alpha(\cdot)$ is Lipschitz continuous on this bounded set, i.e., there exists a constant $L_3 > 0$ such that

$$\|\nabla\Psi_\alpha(\zeta) - \nabla\Psi_\alpha(\xi)\| \le L_3\|\zeta - \xi\| \quad \forall \zeta, \xi \in B(\zeta^0). \tag{36}$$

Notice that $\zeta^k$ and $\zeta^k + td^k$ for any $t \in [0, 1]$ belong to the set $B(\zeta^0)$. By the mean-value theorem and the Lipschitz continuity of $\nabla\Psi_\alpha$ on $B(\zeta^0)$, it then follows that

$$\begin{aligned}
\Psi_\alpha(\zeta^k + td^k) - \Psi_\alpha(\zeta^k)
&= t\nabla\Psi_\alpha(\zeta^k)^Td^k + \int_0^t\big[\nabla\Psi_\alpha(\zeta^k + sd^k) - \nabla\Psi_\alpha(\zeta^k)\big]^Td^k\,ds\\
&\le t\nabla\Psi_\alpha(\zeta^k)^Td^k + \int_0^t L_3\|d^k\|^2 s\,ds\\
&= t\nabla\Psi_\alpha(\zeta^k)^Td^k + (1/2)L_3t^2\|d^k\|^2\\
&\le t\nabla\Psi_\alpha(\zeta^k)^Td^k + (1/2)L_3t^2h(\zeta^k)\\
&\le -\delta t^2h(\zeta^k) \quad \text{for all } t \in \left[0,\ \frac{2|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)(2\delta + L_3)}\right].
\end{aligned} \tag{37}$$

Combining the inequality (37) with (35), we obtain that

$$\gamma^{l_k-1} > \frac{2|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)(2\delta + L_3)}.$$

If $l = 0$ satisfies the line search condition (23), then $l_k = 0$ and $\gamma^{l_k} = 1$. Thus, with the constant $C_1 = 2\gamma/(2\delta + L_3) > 0$, we have

$$\gamma^{l_k} \ge \min\left\{1,\ C_1\frac{|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)}\right\} \quad \text{for all } k. \tag{38}$$

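Now let $\zeta^*$ be an accumulation point of $\{\zeta^k\}$ and let $\{\zeta^k\}_{k\in\mathcal{K}}$ be a subsequence, indexed by an infinite set $\mathcal{K}$, such that

$$\lim_{k\to\infty,\,k\in\mathcal{K}}\zeta^k = \zeta^*.$$

By part (c), $\lim_{k\to\infty}\gamma^{2l_k}h(\zeta^k) = 0$. If $\lim_{k\to\infty,\,k\in\mathcal{K}}h(\zeta^k) = h(\zeta^*) = 0$, then

$$\nabla_x\psi_\alpha(\zeta^*, F(\zeta^*)) + \nabla_y\psi_\alpha(\zeta^*, F(\zeta^*)) = 0.$$

By Proposition 3.1(c), $\zeta^*$ is a solution of the SCCP. If $\lim_{k\to\infty,\,k\in\mathcal{K}}h(\zeta^k) \ne 0$, then $\lim_{k\to\infty,\,k\in\mathcal{K}}\gamma^{l_k} = 0$. This together with (38) implies

$$0 = \lim_{k\to\infty,\,k\in\mathcal{K}}\frac{|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)} = \frac{|\nabla\Psi_\alpha(\zeta^*)^Td(\zeta^*)|}{h(\zeta^*)}.$$

Thus, we complete the proof. □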

Theorem 4.1 states that, when $\theta$ is any fixed real number in $[0, 1]$, the nonmonotone derivative-free algorithm converges in terms of the value of the merit function $\Psi_\alpha$ and the sequence $\{\zeta^k\}$ is bounded for a large class of SCCPs that need not even be monotone. If $\theta$ is chosen to be less than $\bar\theta$ and $F$ is strongly monotone, then by Prop. 3.4,

$$|\nabla\Psi_\alpha(\zeta)^Td(\zeta)| \ge \frac{1}{2}\theta h(\zeta) \quad \forall \zeta \in B(\zeta^0).$$

This implies that no accumulation point of $\{\zeta^k\}$ can satisfy (27), and consequently, each accumulation point of $\{\zeta^k\}$ is a solution of the SCCP (1). In fact, in this case, $\{\zeta^k\}$ converges to the solution of (1) at an R-linear rate. We next prove this assertion.

Theorem 4.2 Let $\{\zeta^k\}$ be the sequence generated by Algorithm 4.1. Suppose that $F$ is strongly monotone and Lipschitz continuous, and $\nabla F(\cdot)$ is Lipschitz continuous on $B(\zeta^0)$. If $\theta \le \bar\theta$ with $\bar\theta$ given by (22), then there exist constants $\nu_0 > 0$ and $\nu_6 \in (0, 1)$ such that

$$\Psi_\alpha(\zeta^k) \le \nu_0\nu_6^k\Psi_\alpha(\zeta^1).$$

Moreover, $\{\zeta^k\}$ converges to the unique solution $\zeta^*$ of the SCCP (1) at an R-linear rate.

Proof. Strong monotonicity implies the uniform Jordan $P$-property, and Lipschitz continuity of $F$ implies linear growth. By Prop. 3.3, $B(\zeta^0)$ is therefore bounded and all results of Theorem 4.1 hold.

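To prove the conclusion, we first show that there exist constants $\nu_1, \nu_2 > 0$ such that

$$\Psi_\alpha(\zeta^{k+1}) \le \nu_1\Psi_\alpha(\zeta^k) \quad \text{for all } k \ge 0, \tag{39}$$

and

$$h(\zeta^{k+1}) \le \nu_2h(\zeta^k) \quad \text{for all } k \ge 0. \tag{40}$$

Because $\theta \le \bar\theta$ and $F$ is strongly monotone, using (37) and Proposition 3.4 yields

$$\begin{aligned}
\Psi_\alpha(\zeta^{k+1}) - \Psi_\alpha(\zeta^k) &\le \gamma^{l_k}\nabla\Psi_\alpha(\zeta^k)^Td^k + (1/2)L_3\gamma^{2l_k}h(\zeta^k)\\
&\le -\frac{1}{2}\gamma^{l_k}(\theta - L_3\gamma^{l_k})h(\zeta^k).
\end{aligned} \tag{41}$$

By Proposition 3.1(e)-(f) and Lemma 3.1(a) and (c), it is easy to verify that

$$h(\zeta) \ge \frac{(\alpha - 1)^2}{\alpha^2}\|R_\alpha(\zeta)\|^2 \ge \frac{(\alpha - 1)^2}{\alpha^2}\|R_1(\zeta)\|^2 \ge \frac{\alpha - 1}{\alpha^2}\Psi_\alpha(\zeta) \quad \forall \zeta \in V, \tag{42}$$

and

$$h(\zeta) \le 2\alpha(\alpha - 1)\Psi_\alpha(\zeta) \quad \forall \zeta \in V. \tag{43}$$

Therefore, if $\theta - L_3\gamma^{l_k} \ge 0$, (41) and (42) imply

$$\begin{aligned}
\Psi_\alpha(\zeta^{k+1}) &\le \Psi_\alpha(\zeta^k) - \frac{1}{2}\gamma^{l_k}(\theta - L_3\gamma^{l_k})\frac{\alpha - 1}{\alpha^2}\Psi_\alpha(\zeta^k)\\
&= \left[1 - \frac{1}{2}\gamma^{l_k}(\theta - L_3\gamma^{l_k})\frac{\alpha - 1}{\alpha^2}\right]\Psi_\alpha(\zeta^k) \le \Psi_\alpha(\zeta^k);
\end{aligned}$$

whereas if $\theta - L_3\gamma^{l_k} < 0$, (41) and (43) lead to

$$\begin{aligned}
\Psi_\alpha(\zeta^{k+1}) &\le \big[1 - \gamma^{l_k}(\theta - L_3\gamma^{l_k})\alpha(\alpha - 1)\big]\Psi_\alpha(\zeta^k)\\
&\le \big[1 + (L_3 - \theta)\alpha(\alpha - 1)\big]\Psi_\alpha(\zeta^k).
\end{aligned}$$

This shows that (39) holds with $\nu_1 := \max\{1,\ 1 + (L_3 - \theta)\alpha(\alpha - 1)\}$. Using (43), (39) and (42), we have

$$h(\zeta^{k+1}) \le 2\alpha(\alpha - 1)\Psi_\alpha(\zeta^{k+1}) \le 2\alpha(\alpha - 1)\nu_1\Psi_\alpha(\zeta^k) \le 2\nu_1\alpha^3h(\zeta^k),$$

which implies that (40) holds with $\nu_2 := 2\nu_1\alpha^3 > 0$.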

Now for any p ≥ 1, let φ(p) be any index in [Mp + 1, M(p + 1)] satisfying
Ψα(ζ^{φ(p)}) := max

1≤i≤MΨα(ζ^{M p+i}).

From Lemma 4.1 (b), it then follows

Ψα(ζ^{φ(p)}) ≤ Ψ^{α}(ζ^{φ(p−1)}) − δ min

0≤i≤M −1γ^{2l}^{(M p+i)}h(ζ^{M p+i}).

Notice that γ^{l_k} ≥ min{1, C1θ/2} for all k by using (38) and the second assertion of Proposition 3.4. Hence, there exists a constant ν3 := δ min{1, C1θ/2} > 0 such that

Ψα(ζ^{φ(p)}) ≤ Ψα(ζ^{φ(p−1)}) − ν3 min_{0≤i≤M−1} h(ζ^{Mp+i}). (44)
Let s(p) and w(p) be any indices in [Mp + 1, M(p + 2)] for which

h(ζ^{s(p)}) := min_{1≤i≤2M} h(ζ^{Mp+i}) and Ψα(ζ^{w(p)}) := min_{1≤i≤2M} Ψα(ζ^{Mp+i}), (45)

and denote by ν4 the constant given by

ν4 = [ν3 + (α^{2}/(α − 1)) ν_{2}^{4M}]^{−1}. (46)

We now define an infinite subsequence {k_i : i ≥ 0} ⊂ {1, 2, . . .} as follows. Let k_0 = φ(0). Suppose that k_i = φ(¯p) has been chosen for some ¯p. Define

k_{i+1} := w(¯p + 1) if h(ζ^{s(¯p+1)}) ≤ ν4 Ψα(ζ^{φ(¯p)}), and k_{i+1} := φ(¯p + 3) otherwise. (47)

For the subsequence {k_i} defined as above, it is obvious that

k_{i+1} − k_i ≤ 4M. (48)

In addition, there necessarily exists a constant ν5 ∈ (0, 1) such that

Ψα(ζ^{k_{i+1}}) ≤ ν5 Ψα(ζ^{k_i}) for all i ≥ 1. (49)

In fact, if h(ζ^{s(¯p+1)}) ≤ ν4 Ψα(ζ^{φ(¯p)}), from (42), (40) and (48), it follows that

Ψα(ζ^{k_{i+1}}) ≤ (α^{2}/(α − 1)) h(ζ^{k_{i+1}}) ≤ (α^{2}/(α − 1)) ν_{2}^{4M} h(ζ^{s(¯p+1)}) ≤ (α^{2}/(α − 1)) ν_{2}^{4M} ν4 Ψα(ζ^{k_i}).

If h(ζ^{s(¯p+1)}) > ν4 Ψα(ζ^{φ(¯p)}), using (44) and (45) yields

Ψα(ζ^{k_{i+1}}) ≤ (1 − ν3ν4) Ψα(ζ^{k_i}).

By the choice of ν4, the last two inequalities imply that (49) holds with ν5 = (1 − ν3ν4).
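The phrase "by the choice of ν4" can be unpacked in one line. Writing c for the factor α^{2}ν_{2}^{4M}/(α − 1) (a shorthand introduced here only for this check), definition (46) reads ν4 = 1/(ν3 + c), so the contraction factors obtained in the two cases coincide:

```latex
c\,\nu_4 \;=\; \frac{c}{\nu_3 + c} \;=\; 1 - \frac{\nu_3}{\nu_3 + c} \;=\; 1 - \nu_3\nu_4 \in (0,1),
\qquad c := \frac{\alpha^{2}}{\alpha-1}\,\nu_2^{4M} > 0 .
```

The membership in (0, 1) uses only ν3 > 0 and c > 0.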

For any k ≥ 1, assume that k ∈ [k_i, k_{i+1}) for some i. Then from (48) we have that

k − k_i ≤ 4M and k_i ≤ 4Mi + k_0. (50)

Using equation (50) and noting that 1 ≤ k_0 ≤ M give

i ≥ (k_i − k_0)/(4M) ≥ (k − 4M − k_0)/(4M) ≥ k/(4M) − 5/4. (51)

Thus, by (39), (49), (50)–(51), we obtain

Ψα(ζ^{k}) ≤ ν_{1}^{k−k_i} Ψα(ζ^{k_i}) ≤ ν_{1}^{4M} ν_{5}^{i} Ψα(ζ^{k_0})
≤ ν_{1}^{4M} ν_{5}^{k/(4M)−5/4} Ψα(ζ^{k_0})
≤ ν_{1}^{5M} ν_{5}^{k/(4M)−5/4} Ψα(ζ^{1}).

Letting ν0 = ν_{1}^{5M} ν_{5}^{−5/4} and ν6 = ν_{5}^{1/(4M)} and noting that ν5 = (1 − ν3ν4) < 1, we prove the first part of the conclusion. The second part is direct since {Ψα(ζ^{k})} converges Q-linearly to zero and ‖ζ^{k} − ζ^{∗}‖ ≤ ((L + 1)/ρ) √(α/(α − 1)) √(Ψα(ζ^{k})) by Prop. 3.2(b). □
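As a quick numerical sanity check of the final substitution in the proof, the sketch below confirms that the last bound in the chain, ν1^{5M} ν5^{k/(4M) − 5/4}, is the same quantity as ν0 ν6^{k}. The function name and the values chosen for ν1, ν5 and M are illustrative assumptions and are not taken from the paper.

```python
import math


def rate_constants(nu1, nu5, M):
    """Return (nu0, nu6) as defined at the end of the proof of Theorem 4.2."""
    nu0 = nu1 ** (5 * M) * nu5 ** (-5.0 / 4.0)
    nu6 = nu5 ** (1.0 / (4 * M))
    return nu0, nu6


# Illustrative values only (assumptions, not results from the paper).
nu1, nu5, M = 1.5, 0.9, 6
nu0, nu6 = rate_constants(nu1, nu5, M)

# Compare nu1^{5M} * nu5^{k/(4M) - 5/4} with nu0 * nu6^k for a few k.
checks = []
for k in (1, 10, 100):
    lhs = nu1 ** (5 * M) * nu5 ** (k / (4 * M) - 5.0 / 4.0)
    rhs = nu0 * nu6 ** k
    checks.append(math.isclose(lhs, rhs, rel_tol=1e-9))
```

Because ν5 ∈ (0, 1), the resulting ν6 also lies in (0, 1), while ν0 may be large; the bound is a statement about the asymptotic rate rather than a tight estimate.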

Theorem 4.2 is the first rate of convergence result for the class of derivative-free descent methods with a nonmonotone line search rule for the non-polyhedral SCCPs. In the next section, we compare the numerical performance of Algorithm 4.1 with that of Algorithm 4.2 described below, which is a monotone descent derivative-free method similar to the one in [26] for the NCPs. The stepsize and the search direction of Algorithm 4.2 are adjusted during the Armijo-type backtracking search.

Algorithm 4.2

(Step 0) Choose ζ^{0} ∈ V, ε ≥ 0, δ ∈ (0, 1), γ ∈ (0, 1), and a sufficiently small
β ∈ (0, 1). Set k := 0.

(Step 1) If Ψα(ζ^{k}) ≤ ε, then stop. Otherwise, go to Step 2.

(Step 2) Let lk be the smallest nonnegative integer l satisfying

Ψα(ζ^{k}+ γ^{l}d^{k}(β^{l})) ≤ Ψα(ζ^{k}) − δγ^{2l}h(ζ^{k}), (52)
where h(ζ) is defined as in (24) and

d^{k}(β^{l}) := −β^{l}∇xψα(ζ^{k}, F (ζ^{k})) − (1 − β^{l})∇yψα(ζ^{k}, F (ζ^{k})). (53)

(Step 3) Set ζ^{k+1} := ζ^{k} + γ^{l_k} d^{k}(β^{l_k}), k := k + 1, and go to Step 1.
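Steps 2 and 3 of Algorithm 4.2 can be sketched as a single backtracking routine. This is a minimal illustration and not the authors' Matlab code: the callables `psi`, `grad_x`, `grad_y` and `h` are a hypothetical interface that the caller must supply (they stand for Ψα, the two partial gradients of ψα evaluated at (ζ, F(ζ)), and the function h of (24)), and the `max_backtracks` guard is an added safeguard that does not appear in the algorithm.

```python
def armijo_step(psi, grad_x, grad_y, h, zeta,
                gamma=0.2, delta=1e-10, beta=0.1, max_backtracks=60):
    """One pass of Steps 2-3 of Algorithm 4.2 on a point stored as a list of floats.

    Returns (new_zeta, l_k), or (copy of zeta, None) if no l <= max_backtracks
    satisfies the sufficient decrease condition (52).
    """
    psi0, h0 = psi(zeta), h(zeta)
    gx, gy = grad_x(zeta), grad_y(zeta)
    for l in range(max_backtracks + 1):
        t, b = gamma ** l, beta ** l
        # Direction (53): d^k(beta^l) = -beta^l * grad_x - (1 - beta^l) * grad_y
        d = [-b * gxi - (1.0 - b) * gyi for gxi, gyi in zip(gx, gy)]
        trial = [zi + t * di for zi, di in zip(zeta, d)]
        # Condition (52): Psi(trial) <= Psi(zeta) - delta * gamma^(2l) * h(zeta)
        if psi(trial) <= psi0 - delta * t * t * h0:
            return trial, l
    return list(zeta), None


# Toy stand-ins (assumptions for illustration only): a scaled quadratic merit
# function whose two "partial gradients" coincide, with h taken as ||grad||^2.
def toy_psi(z):
    return 5.0 * sum(v * v for v in z)


def toy_grad(z):
    return [10.0 * v for v in z]


def toy_h(z):
    return sum(g * g for g in toy_grad(z))


new_zeta, l_k = armijo_step(toy_psi, toy_grad, toy_grad, toy_h, [1.0, 2.0])
```

On this toy problem the trial steps for l = 0 and l = 1 do not decrease the merit function, so the routine backtracks to l = 2. Note that both the stepsize γ^{l} and the weight β^{l} in the direction change with l, which is the feature of Algorithm 4.2 described in the text.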

### 5 Numerical experiments

In this section, we test the performance of Algorithms 4.1 and 4.2 for the affine SOCCP

ζ ∈ K_{+}^{n}, F(ζ) = Mζ + b ∈ K_{+}^{n}, ⟨ζ, F(ζ)⟩ = 0, (54)

where K_{+}^{n} = K_{+}^{n_1} × · · · × K_{+}^{n_m} with n1 + · · · + nm = n, M ∈ R^{n×n} and b ∈ R^{n}.

During the testing, we set M ≡ diag(M1, · · · , Mm) with Mi = Ni N_{i}^{T} + τ Ii for all i, where τ ≥ 0 is a given parameter, Ii is an ni × ni identity matrix, and each Ni ∈ R^{n_i × n_i} was generated randomly such that it has 1% nonzero density with the nonzero entries from a normal distribution of mean −1 and variance 4. It is not hard to see that the matrix M generated in this way is positive semidefinite (respectively, positive definite) if τ = 0 (respectively, τ > 0), which means that the corresponding F is monotone (respectively, strongly monotone). The vector b was obtained by setting b = −Mw with w = (w1, . . . , wm) ∈ K_{+}^{n}, where wi ∈ K_{+}^{n_i} was generated as follows: let the elements of wi be chosen randomly from a normal distribution with mean −1 and variance 4, and then set the first element w_{i1} of wi to be ‖w_{i2}‖, where w_{i2} is the vector composed of the remaining ni − 1 components of wi. In this way, the affine SOCCP is guaranteed to have a solution ζ^{∗} = w.
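The construction of the test problems can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' Matlab code: all function names are hypothetical, and the 1% density is modelled by making each entry of Ni nonzero independently with probability 0.01, which may differ from the sparse generator actually used.

```python
import math
import random


def random_soc_point(n_i, rng):
    """Draw w_i with N(-1, 4) entries, then set w_i[0] to the norm of the rest."""
    w = [rng.gauss(-1.0, 2.0) for _ in range(n_i)]  # variance 4 means std 2
    w[0] = math.sqrt(sum(v * v for v in w[1:]))
    return w


def random_block(n_i, tau, density, rng):
    """Return M_i = N_i N_i^T + tau * I_i for a randomly generated sparse N_i."""
    N = [[rng.gauss(-1.0, 2.0) if rng.random() < density else 0.0
          for _ in range(n_i)] for _ in range(n_i)]
    return [[sum(N[r][k] * N[c][k] for k in range(n_i)) + (tau if r == c else 0.0)
             for c in range(n_i)] for r in range(n_i)]


def make_problem(block_sizes, tau, density=0.01, seed=0):
    """Return (blocks of M, blocks of b, blocks of w) with b = -M w."""
    rng = random.Random(seed)
    blocks = [random_block(n_i, tau, density, rng) for n_i in block_sizes]
    w = [random_soc_point(n_i, rng) for n_i in block_sizes]
    # M is block diagonal, so b = -M w is computed one block at a time.
    b = [[-sum(Mi[r][c] * wi[c] for c in range(len(wi))) for r in range(len(wi))]
         for Mi, wi in zip(blocks, w)]
    return blocks, b, w


def affine_map(blocks, b, zeta):
    """Evaluate F(zeta) = M zeta + b block by block."""
    return [[sum(Mi[r][c] * zi[c] for c in range(len(zi))) + bi[r]
             for r in range(len(zi))]
            for Mi, bi, zi in zip(blocks, b, zeta)]


# Small instance for illustration (the paper uses m = 100 blocks of size 10).
blocks, b, w = make_problem([10, 10, 10], tau=0.1, seed=1)
F_w = affine_map(blocks, b, w)
```

By construction F(w) = 0 and each wi lies on the boundary of its second-order cone, so w satisfies all three conditions in (54).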

All experiments were done with a PC of Pentium 4 with 2.8GHz CPU and 512MB memory. The computer codes were written in Matlab 6.5. During the tests, we chose ni and m such that n1= · · · = nm= 10 and m = 100. We set m(k) in Algorithm 4.1 as

m(k) := 0 if k < 5, and m(k) := min{m(k − 1) + 1, M − 1} otherwise, with M = 6.
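The recursion for m(k) can be tabulated with a few lines of Python. The function name and the `warmup` parameter are illustrative choices; the values M = 6 and the threshold 5 come from the definition above.

```python
def memory_schedule(num_iters, M=6, warmup=5):
    """Return [m(0), ..., m(num_iters - 1)].

    m(k) = 0 for k < warmup, and m(k) = min(m(k - 1) + 1, M - 1) afterwards.
    """
    m = []
    for k in range(num_iters):
        if k < warmup:
            m.append(0)
        else:
            m.append(min(m[k - 1] + 1, M - 1))
    return m


schedule = memory_schedule(12)
```

With these settings m(k) stays at 0 for the first five iterations, then grows by one per iteration until it reaches the cap M − 1 = 5.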

We started Algorithms 4.1 and 4.2 from the initial point ζ^{0} = (¯ζ^{n_1}, . . . , ¯ζ^{n_m}) with ¯ζ^{n_i} = (10, ωi/‖ωi‖), where ωi ∈ R^{n_i − 1} for all i were generated randomly by Matlab's rand.m. The parameters γ and δ in the two algorithms, and β in Algorithm 4.2, were chosen as

γ = 0.2, δ = 10^{−10}, and β = 0.1.

The algorithms were terminated once one of the following conditions is satisfied:

(a) min{Ψα(ζ^{k}), |⟨ζ^{k}, F(ζ^{k})⟩|} ≤ 10^{−5};
(b) The stepsize is less than 10^{−8};

(c) The number of iterations exceeds 5 × 10^{5}.

If the algorithms are stopped under condition (a), we say that they solve the test problem successfully; otherwise we say that they fail on the test problem.
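The three stopping conditions and the success criterion can be written as a small classification routine. This is a sketch with hypothetical names; in particular, checking (a) before (b) and (c) when several conditions hold at once is an assumption, since the text does not specify an order.

```python
def stopping_status(psi_value, comp_gap, stepsize, iteration,
                    tol=1e-5, min_step=1e-8, max_iter=5 * 10**5):
    """Classify the current iterate according to conditions (a)-(c).

    psi_value is Psi_alpha(zeta^k) and comp_gap is <zeta^k, F(zeta^k)>.
    Returns 'success', 'small_step', 'max_iter', or None to keep iterating.
    """
    if min(psi_value, abs(comp_gap)) <= tol:   # condition (a): solved
        return "success"
    if stepsize < min_step:                    # condition (b): failure
        return "small_step"
    if iteration > max_iter:                   # condition (c): failure
        return "max_iter"
    return None
```

Only the return value "success" corresponds to a test problem counted as solved in the averages reported below.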

We first tested the influence of α on the iterations and the function evaluations needed by Algorithms 4.1 and 4.2 for solving (54) with τ in each Mi chosen as 0.1. For every α = 2, 5, 10, 20, 40, 50, 60, 80, 100, 150, 200, we applied Algorithm 4.1 with θ = 0.95 and Algorithm 4.2, respectively, to the same 50 test problems generated as above. The average iteration and the average function evaluation were taken as the averages of the iterations and the function evaluations, respectively, over the test problems solved successfully. The testing results show that Algorithm 4.1 with α = 2 failed on 4 test problems due to too small a stepsize, and successfully solved all test problems with the other values of α; whereas Algorithm 4.2 with α = 2 and α = 5 failed on 11 and 1 test problems, respectively, due to too small a stepsize, and successfully solved all test problems with the remaining values of α.

Figures 1 and 2 depict the curves of the average function evaluation and the average iteration, respectively, of Algorithms 4.1 and 4.2 with respect to α. From these figures, we see that the number of function evaluations and the iteration times needed by Algorithm 4.1 and Algorithm 4.2 increase with α.

Taking into account that the global convergence of the two algorithms is not stable when α is close to 1 (for example, they fail on some test problems when α = 2), a desirable choice for α should be in the interval [10, 50]. Also, the average function evaluation and the average iteration of Algorithm 4.2 are more than those of Algorithm 4.1, especially when α > 40. This implies that the nonmonotone derivative-free method has better performance than the monotone descent one.

Then, we tested the influence of θ on the rate of convergence of Algorithm 4.1, by using this algorithm with α = 15 and four different θ to solve a test example generated as above with τ = 0.01. Figure 3 below depicts the convergence curve of Algorithm 4.1. From this figure, we see that the curve corresponding to θ = 0.5 has the largest slope, the curve corresponding to θ = 10^{−4} has the smallest slope, and the curve corresponding to a smaller θ has a smaller slope when θ ≤ 0.1. This shows that Algorithm 4.1 with a smaller θ has a better rate of convergence, and it has the worst rate of convergence when θ = 0.5. This coincides with the theoretical results of Theorem 4.2.

We also tested the influence of θ on the performance of Algorithm 4.1. Specifically, for every θ = 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, we employed Algorithm 4.1 with α = 15 to solve the same 50 test problems generated as above with τ = 0. Note that this class of problems is more difficult than the one used above since the mapping F is now only monotone, instead of strongly monotone. The testing results show that Algorithm 4.1 successfully solved all test problems with all these θ. This shows that Algorithm 4.1 is also suitable for the solution of monotone SCCPs although the global convergence of the sequence generated is not established for this class of problems. Figure 4 below depicts the curves of the function evaluation and the iteration times of Algorithm 4.1 with respect to θ. From this figure, we see that Algorithm 4.1 has the worst performance when θ = 0.5, and a desirable θ should be from the interval [0.2, 0.4] or [0.9, 1).

Figure 1: Influence of α on the average function evaluation of Algorithms 4.1 and 4.2. [Plot: function evaluations versus alpha for the two algorithms.]

Figure 2: Influence of α on the average iteration of Algorithms 4.1 and 4.2. [Plot: iteration times versus alpha for the two algorithms.]

### 6 Conclusion

We have extended the derivative-free method [18] for the NCP to the general SCCPs by using a different search direction. It was shown that the algorithm is convergent in terms of the value of Ψα for a large class of SCCPs which may not even be monotone, whereas if θ ≤ ¯θ with ¯θ given by (22) and F is strongly monotone, the sequence generated by the algorithm converges globally to the solution of the problem at an R-linear rate. It is interesting to note that the linear convergence rate of the nonmonotone descent algorithm is obtained without requiring any convexity of Ψα, and the relation among R1(ζ), h(ζ) and Ψα(ζ) plays a key role. In future research, it is worthwhile to study the convergence rate of nonmonotone derivative-free methods based on other merit functions, and to explore other derivative-free methods for the SCCPs, for example, the pattern search algorithms.

### References

[1] J.-S. Chen, H.-T. Gao and S.-H. Pan, A derivative-free R-linearly convergent algorithm based on the generalized Fischer-Burmeister merit function, Journal of Computational and Applied Mathematics, vol. 232, pp. 455-471, 2009.

[2] Y.-H. Dai, On the nonmonotone line search, Journal of Optimization Theory and Applications, vol. 112, pp. 315-330, 2002.

[3] L. Faybusovich, Euclidean Jordan algebras and interior-point algorithms, J. Positivity, vol. 1, pp. 331-357, 1997.

Figure 3: Convergence process of Algorithm 4.1 with different θ. [Plot: merit function values (logarithmic scale) versus iterations, with curves for theta = 1.0e−4, 0.05, 0.5 and 0.95.]

Figure 4: Influence of θ on the performance of Algorithm 4.1. [Plot: the curve of the average function evaluation and the curve of the average iteration versus theta.]