An R-Linearly Convergent Nonmonotone Derivative-Free Method for Symmetric Cone Complementarity Problems
Shaohua Pan^1, Department of Mathematics, South China University of Technology, Guangzhou 510640, China. E-mail: shhpan@scut.edu.cn
Jein-Shan Chen^2, Department of Mathematics, National Taiwan Normal University, Taipei, Taiwan 11677. E-mail: jschen@math.ntnu.edu.tw
Abstract. This paper extends the derivative-free descent method [18] for the nonlinear complementarity problem to the symmetric cone complementarity problem (SCCP). The algorithm is based on the unconstrained implicit Lagrangian reformulation of the SCCP, but uses a convex combination of the negative partial gradients of the implicit Lagrangian function $\psi_\alpha$, i.e., a vector of the form $-\theta\nabla_x\psi_\alpha - (1-\theta)\nabla_y\psi_\alpha$ with $\theta \in [0, 1]$, as the search direction, together with a nonmonotone line search rule to seek a desirable stepsize. We show that the derivative-free algorithm converges in terms of the implicit Lagrangian value for a large class of SCCPs that may not even be monotone. If $\theta$ is restricted to be less than a threshold $\bar\theta \in (0, 1)$ and the SCCP is strongly monotone, the generated sequence converges globally to the solution of the SCCP at an R-linear rate.
Key words: Symmetric cone complementarity problem, implicit Lagrangian, derivative-free methods, nonmonotone, linear convergence.
^1 This author's work is supported by the Guangdong Natural Science Foundation (No. 9251802902000001) and the Fundamental Research Funds for the Central Universities (SCUT).
^2 Corresponding author. Member of the Mathematics Division, National Center for Theoretical Sciences, Taipei Office. This author's work is partially supported by the National Science Council of Taiwan.
1 Introduction
Let $V$ be a finite-dimensional vector space over the real field $\mathbb{R}$, $A \equiv (V, \circ, \langle\cdot,\cdot\rangle)$ be a Euclidean Jordan algebra (see Section 2 for the definition), and $K$ be a symmetric cone in $A$. Given a continuously differentiable mapping $F: V \to V$, we are interested in the following symmetric cone complementarity problem (SCCP): find $\zeta \in V$ such that
$$\zeta \in K, \quad F(\zeta) \in K, \quad \langle\zeta, F(\zeta)\rangle = 0. \tag{1}$$
This class of problems provides a unified framework for the classical nonlinear complementarity problem (NCP), the second-order cone complementarity problem (SOCCP), and the semidefinite complementarity problem (SDCP), and it also arises from the KKT system of nonlinear symmetric cone optimization problems. When $F(\zeta) = L(\zeta) + q$ with $L: V \to V$ a linear transformation and $q \in V$, problem (1) reduces to the linear complementarity problem over symmetric cones (LSCCP):
$$\zeta \in K, \quad L(\zeta) + q \in K, \quad \langle\zeta, L(\zeta) + q\rangle = 0. \tag{2}$$
Recently, the solution of symmetric cone optimization and complementarity problems has been an active research topic, and various solution methods have been proposed, including interior-point methods [3, 22, 29], merit function methods [14, 16], a regularized smoothing method [13], and a smoothing Newton method [8]. This paper is concerned with a derivative-free method based on the implicit Lagrangian reformulation (5) of the SCCP (1). An attractive feature of this class of methods is that no derivatives of $F(\cdot)$ need to be computed, which makes them suitable for large-scale problems, as well as for applications where the derivatives of $F(\cdot)$ are not available or are costly to compute.
The implicit Lagrangian function was first introduced by Mangasarian and Solodov [17] as a smooth merit function for the NCP, and was further studied in the setting of the nonnegative orthant cone in [10, 15, 18, 25, 27] and elsewhere. Recently, Kong, Tuncel and Xiu [14] used Jordan-algebraic techniques to extend the implicit Lagrangian to the symmetric cone $K$. The corresponding function is defined as
$$\psi_\alpha(x, y) := \langle x, y\rangle + \frac{1}{2\alpha}\Big\{\|(x-\alpha y)_+\|^2 - \|x\|^2 + \|(y-\alpha x)_+\|^2 - \|y\|^2\Big\}, \quad \forall x, y \in V, \tag{3}$$
where $\alpha > 1$ is a parameter, $\|\cdot\|$ is the norm induced by the inner product $\langle\cdot,\cdot\rangle$, and $(\cdot)_+$ denotes the metric projection onto the symmetric cone $K$. They showed that $\psi_\alpha$ is a continuously differentiable merit function associated with $K$, that is,
$$\psi_\alpha(x, y) = 0 \iff x \in K,\ y \in K,\ \langle x, y\rangle = 0, \tag{4}$$
and thus the SCCP can be formulated as the unconstrained smooth minimization problem
$$\min_{\zeta \in V}\ \Psi_\alpha(\zeta) := \psi_\alpha(\zeta, F(\zeta)) \tag{5}$$
in the sense that a minimizer of (5) with zero objective value is a solution of (1). For this unconstrained reformulation, they gave a necessary and sufficient condition for each stationary point of $\Psi_\alpha$ to be a solution of (1), and established that $\Psi_\alpha$ offers a global error bound for the SCCP (1) when $F$ has the uniform Cartesian $P$-property.
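To make the reformulation concrete, the following is a minimal sketch of (3) and (5) in the NCP case $V = \mathbb{R}^n$ with the componentwise Jordan product, where the projection $(\cdot)_+$ onto $K = \mathbb{R}^n_+$ is the componentwise $\max\{\cdot, 0\}$; the names `psi_alpha` and `Psi_alpha` are ours, not the paper's.

```python
import numpy as np

def psi_alpha(x, y, alpha=2.0):
    """Implicit Lagrangian (3) in the NCP case V = R^n, where the
    projection onto K = R^n_+ is the componentwise max(., 0)."""
    plus = lambda z: np.maximum(z, 0.0)  # metric projection onto R^n_+
    quad = (np.linalg.norm(plus(x - alpha * y)) ** 2 - np.linalg.norm(x) ** 2
            + np.linalg.norm(plus(y - alpha * x)) ** 2 - np.linalg.norm(y) ** 2)
    return x @ y + quad / (2.0 * alpha)

def Psi_alpha(zeta, F, alpha=2.0):
    """Merit function (5): Psi_alpha(zeta) = psi_alpha(zeta, F(zeta))."""
    return psi_alpha(zeta, F(zeta), alpha)
```

For a complementary pair such as $x = (1, 0)$ and $y = (0, 2)$, `psi_alpha` returns zero, in accordance with (4).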
Although the literature on derivative-free algorithms for NCPs is vast (see, e.g., [1, 7, 9, 12, 18, 26, 27]), to the best of our knowledge only a few papers consider such algorithms for nonpolyhedral symmetric cone complementarity problems, namely [20, 28]. In these two papers, derivative-free methods are developed for the SDCP and the SOCCP, respectively, based on the Fischer-Burmeister type merit function. Moreover, no rate of convergence result is established in [28], and the one in [20] is only shown to be Q-linear. We also note that almost all the derivative-free methods mentioned above are descent methods with a monotone Armijo-type line search.
In contrast, in this work we develop a nonmonotone derivative-free method for the SCCP by using the vector $d(\zeta) \equiv -\theta\nabla_x\psi_\alpha(\zeta, F(\zeta)) - (1-\theta)\nabla_y\psi_\alpha(\zeta, F(\zeta))$ with $\theta \in [0, 1]$ as the search direction. As shown in Prop. 3.4, when $\theta$ is sufficiently small, $d(\zeta)$ is a descent direction, and it reduces to the direction adopted in [18]. For a general $\theta \in [0, 1]$, however, $d(\zeta)$ is not necessarily a descent direction, and we adopt a nonmonotone line search rule to seek a desirable stepsize. We show that the method converges in terms of the implicit Lagrangian value for a large class of SCCPs, and that if $\theta$ is restricted to be less than a threshold $\bar\theta \in (0, 1)$ and the SCCP is strongly monotone, the generated sequence converges to the solution at an R-linear rate. Numerical tests verify the theoretical results, and show that the method with a smaller $\theta$ does not perform better than the method with $\theta$ close to 1, even though it may attain an R-linear rate of convergence when $\theta$ is sufficiently small.
Throughout this paper, $\mathrm{int}\,K$ denotes the interior of the cone $K$ and $\|\cdot\|$ represents the norm induced by the inner product $\langle\cdot,\cdot\rangle$, i.e., $\|\cdot\| := \sqrt{\langle\cdot,\cdot\rangle}$. For any $x \in V$, we write $(x)_+$ and $(x)_-$ for the metric projections of $x$ onto $K$ and $-K$, respectively, i.e.,
$$(x)_+ := \arg\min_{y \in K}\{\|x - y\|\} \quad \text{and} \quad (x)_- := \arg\min_{y \in -K}\{\|x - y\|\}.$$
For a differentiable mapping $F: V \to V$, we denote its transposed Jacobian at $x \in V$ by $\nabla F(x)$. Unless otherwise stated, the parameter $\alpha$ in the sequel always satisfies $\alpha > 1$.
2 Preliminaries
A Euclidean Jordan algebra is a triple $(V, \circ, \langle\cdot,\cdot\rangle_V)$, where $(V, \langle\cdot,\cdot\rangle_V)$ is a finite-dimensional inner product space over $\mathbb{R}$ and $(x, y) \mapsto x \circ y: V \times V \to V$ is a bilinear mapping satisfying:
(i) $x \circ y = y \circ x$ for all $x, y \in V$;
(ii) $x \circ (x^2 \circ y) = x^2 \circ (x \circ y)$ for all $x, y \in V$, where $x^2 := x \circ x$;
(iii) $\langle x \circ y, z\rangle_V = \langle y, x \circ z\rangle_V$ for all $x, y, z \in V$.
We call $x \circ y$ the Jordan product of $x$ and $y$. We assume that there is an element $e \in V$ such that $x \circ e = x$ for all $x \in V$, and call such $e$ the unit element. Let
$$\zeta(x) := \min\big\{k : \{e, x, x^2, \ldots, x^k\} \text{ is linearly dependent}\big\}.$$
Since $\zeta(x)$ is bounded by the dimension of $V$, denoted by $\dim(V)$, the rank of $(V, \circ)$ is well defined by $r := \max\{\zeta(x) : x \in V\}$. Define the set of squares as $K := \{x^2 : x \in V\}$. Then, from [11, Theorem III.2.1], it follows that $K$ is a symmetric cone. This means that $K$ is a self-dual closed convex cone with nonempty interior $\mathrm{int}\,K$, and that for any $x, y \in \mathrm{int}\,K$, there exists an invertible linear transformation $T: V \to V$ such that $T(K) = K$ and $T(x) = y$.
Recall that an element $c \in V$ is idempotent if $c^2 = c$, and two idempotents $c$ and $d$ are orthogonal if $c \circ d = 0$. A nonzero idempotent is primitive if it cannot be written as the sum of two other nonzero idempotents. A complete system of orthogonal idempotents is a finite set $\{c_1, c_2, \ldots, c_k\}$ of idempotents with $c_i \circ c_j = 0$ ($i \neq j$) and $\sum_{i=1}^k c_i = e$. We call a complete system of orthogonal primitive idempotents a Jordan frame.
Theorem 2.1 [11, Theorem III.1.2] Suppose that $A = (V, \circ, \langle\cdot,\cdot\rangle_V)$ is a Euclidean Jordan algebra with rank $r$. Then for each $x \in V$, there exist a Jordan frame $\{c_1, c_2, \ldots, c_r\}$ and real numbers $\lambda_1(x), \lambda_2(x), \ldots, \lambda_r(x)$ such that
$$x = \lambda_1(x)c_1 + \lambda_2(x)c_2 + \cdots + \lambda_r(x)c_r.$$
The numbers $\lambda_1(x), \ldots, \lambda_r(x)$ (counting multiplicities) are called the eigenvalues of $x$. Furthermore, the trace of $x$, denoted by $\mathrm{tr}(x)$, is defined as $\mathrm{tr}(x) = \sum_{j=1}^r \lambda_j(x)$.
Since, by [11, Prop. III.1.5], a Jordan algebra $A = (V, \circ)$ over $\mathbb{R}$ with a unit element $e \in V$ is Euclidean if and only if the symmetric bilinear form $\mathrm{tr}(x \circ y)$ is positive definite, we may define another inner product $\langle\cdot,\cdot\rangle$ on $V$ by
$$\langle x, y\rangle := \mathrm{tr}(x \circ y), \quad \forall x, y \in V. \tag{6}$$
By the associativity of $\mathrm{tr}(\cdot)$ (see [11, Prop. II.4.3]), the inner product $\langle\cdot,\cdot\rangle$ is associative, i.e., for all $x, y, z \in V$, it holds that $\langle x \circ y, z\rangle = \langle y, x \circ z\rangle$.
Unless otherwise stated, in the rest of this paper we always assume that $A = (V, \circ, \langle\cdot,\cdot\rangle)$ is a Euclidean Jordan algebra of rank $r$ with $\dim(V) = n$ and $\langle\cdot,\cdot\rangle$ defined as in (6).
Let $\varphi: \mathbb{R} \to \mathbb{R}$ be a scalar-valued function. By Theorem 2.1, it is natural to define a vector-valued function associated with the Euclidean Jordan algebra $A = (V, \circ, \langle\cdot,\cdot\rangle)$ by
$$\varphi_V(x) := \varphi(\lambda_1(x))c_1 + \varphi(\lambda_2(x))c_2 + \cdots + \varphi(\lambda_r(x))c_r, \tag{7}$$
where $x \in V$ has the spectral decomposition $x = \sum_{j=1}^r \lambda_j(x)c_j$. The function $\varphi_V$ is also called the Löwner operator in [23] and is shown to inherit many properties from $\varphi$. In particular, when $\varphi(t)$ is chosen as $\max\{0, t\}$ or $\min\{0, t\}$ for $t \in \mathbb{R}$, the Löwner operator $\varphi_V(\cdot)$ becomes the metric projection operator onto $K$ and $-K$, respectively:
$$(x)_+ := \sum_{j=1}^r \max\{0, \lambda_j(x)\}\, c_j \quad \text{and} \quad (x)_- := \sum_{j=1}^r \min\{0, \lambda_j(x)\}\, c_j. \tag{8}$$
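As an illustration of (8), here is a hedged sketch of the projection onto a single second-order cone $\{(x_0, \bar x) : x_0 \geq \|\bar x\|\}$, whose Jordan algebra has rank $r = 2$ with eigenvalues $\lambda_{1,2} = x_0 \mp \|\bar x\|$ and Jordan frame $c_{1,2} = \frac{1}{2}(1, \mp\bar x/\|\bar x\|)$; this standard special case is not spelled out in the paper, and the function name is ours.

```python
import numpy as np

def soc_projection(x):
    """Projection (x)_+ onto the second-order cone via formula (8):
    clamp the two spectral eigenvalues of x = (x0, xbar) at zero."""
    x0, xbar = x[0], x[1:]
    nx = np.linalg.norm(xbar)
    # When xbar = 0 the Jordan frame is not unique; taking w = 0 still
    # gives the right answer below because the two eigenvalues coincide.
    w = xbar / nx if nx > 0 else np.zeros_like(xbar)
    lam1, lam2 = x0 - nx, x0 + nx
    c1 = 0.5 * np.concatenate(([1.0], -w))
    c2 = 0.5 * np.concatenate(([1.0], w))
    return max(lam1, 0.0) * c1 + max(lam2, 0.0) * c2
```

If $x$ already lies in the cone, both eigenvalues are nonnegative and $x$ is returned unchanged.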
A Euclidean Jordan algebra is called simple if it is not the direct sum of two Euclidean Jordan algebras. By [11, Prop. III.4.4-4.5 & Theorem V.3.7], each Euclidean Jordan algebra is, in a unique way, a direct sum of simple Euclidean Jordan algebras. Also, the symmetric cone in a given Euclidean Jordan algebra is, in a unique way, a direct sum of symmetric cones in the constituent simple Euclidean Jordan algebras. In the sequel, we assume that $V \equiv V_1 \times \cdots \times V_m$ and $K \equiv K_1 \times \cdots \times K_m$, where each $A_i = (V_i, \circ, \langle\cdot,\cdot\rangle)$ is a simple Euclidean Jordan algebra and $K_i$ is a symmetric cone in $V_i$. Corresponding to the Cartesian structure of $V$ and $K$, let $\zeta = (\zeta_1, \ldots, \zeta_m)$ with $\zeta_i \in V_i$ and $F(\zeta) = (F_1(\zeta), \ldots, F_m(\zeta))$ with $F_i: V \to V_i$.
To close this section, we recall the definitions of the uniform Cartesian $P$-property [5, 14] and the uniform Jordan $P$-property [24].
Definition 2.1 The mapping $F = (F_1, F_2, \ldots, F_m)$ with $F_i: V \to V_i$ is said to have
(a) the uniform Cartesian $P$-property if there exists a positive scalar $\rho$ such that for any $\zeta, \xi \in V$, there is an index $i \in \{1, 2, \ldots, m\}$ such that
$$\langle\zeta_i - \xi_i, F_i(\zeta) - F_i(\xi)\rangle \geq \rho\|\zeta - \xi\|^2;$$
(b) the uniform Jordan $P$-property if there is a scalar $\rho > 0$ such that for any $\zeta, \xi \in V$,
$$\lambda_{\max}\big[(\zeta - \xi) \circ (F(\zeta) - F(\xi))\big] \geq \rho\|\zeta - \xi\|^2,$$
where $\lambda_{\max}(x)$ denotes the largest eigenvalue of a vector $x \in V$.
3 Properties of the function $\Psi_\alpha$
This section is devoted to the favorable properties of the implicit Lagrangian function $\Psi_\alpha$. Most of these properties were given by Kong, Tuncel and Xiu [14]; we supplement them with several that play an important role in the convergence analysis of the algorithms. For this purpose, let $r_\alpha: V \times V \to V$ and $R_\alpha: V \to V$ be defined by
$$r_\alpha(x, y) := x - (x - \alpha y)_+ \quad \text{for } \alpha > 0 \tag{9}$$
and
$$R_\alpha(\zeta) := r_\alpha(\zeta, F(\zeta)) \quad \text{for } \alpha > 0. \tag{10}$$
Then, by the same arguments as those of [4, Lemma 1] and [21, Theorem 4.2], it is easy to obtain the following properties of $r_\alpha$; we omit the proof for brevity.
Lemma 3.1 Let $r_\alpha$ be defined as in (9). Then the following hold:
(a) $\min\{1, \alpha\}\|r_1(x, y)\| \leq \|r_\alpha(x, y)\| \leq \max\{1, \alpha\}\|r_1(x, y)\|$ for all $x, y \in V$ and $\alpha > 0$;
(b) $\min\{1, \alpha\}\|r_1(x, y)\| \leq \|r_\alpha(y, x)\| \leq \max\{1, \alpha\}\|r_1(x, y)\|$ for all $x, y \in V$ and $\alpha > 0$;
(c) $\alpha^{-1}(\alpha - 1)\|r_1(x, y)\|^2 \leq \psi_\alpha(x, y) \leq (\alpha - 1)\|r_1(x, y)\|^2$ for all $x, y \in V$ and $\alpha > 1$.
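The bounds in part (c) are easy to observe numerically. The snippet below is a sketch for the NCP case, reusing `psi_alpha` from the sketch in Section 1; it prints the two bounds alongside $\psi_\alpha(x, y)$ for a few random pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, n = 2.0, 5
plus = lambda z: np.maximum(z, 0.0)
r1 = lambda x, y: x - plus(x - y)          # r_1(x, y) of (9), NCP case

for _ in range(3):
    x, y = rng.normal(size=n), rng.normal(size=n)
    sq = np.linalg.norm(r1(x, y)) ** 2
    print((alpha - 1) / alpha * sq, "<=", psi_alpha(x, y, alpha),
          "<=", (alpha - 1) * sq)
```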
We now give a proposition summarizing the favorable properties of the function $\Psi_\alpha$.
Proposition 3.1 Let $\Psi_\alpha$ be defined as in (5). Then the following statements hold:
(a) $\Psi_\alpha(\zeta) \geq 0$ for all $\zeta \in V$, and $\Psi_\alpha(\zeta) = 0$ if and only if $\zeta \in V$ solves the SCCP (1).
(b) $\Psi_\alpha$ is continuously differentiable everywhere on $V$ with the gradient given by
$$\nabla\Psi_\alpha(\zeta) = \nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta)).$$
(c) $\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta)) = 0$ if and only if $\zeta \in V$ solves the SCCP (1).
(d) $\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle \geq 0$ for any $\zeta \in V$.
(e) $\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2 \geq \dfrac{(\alpha^2-1)^2}{\alpha^2(\alpha^2+1)}\|R_\alpha(\zeta)\|^2$ for any $\zeta \in V$.
(f) $\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2 \leq 2\alpha(\alpha-1)\Psi_\alpha(\zeta)$ for any $\zeta \in V$.
Proof. The proofs of parts (a)-(d) can be found in [14]. To prove parts (e) and (f), it suffices to show that for all $x, y \in V$,
$$\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\|^2 \geq \frac{(\alpha^2-1)^2}{\alpha^2(\alpha^2+1)}\big(\|r_\alpha(x, y)\|^2 + \|r_\alpha(y, x)\|^2\big), \tag{11}$$
$$\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\|^2 \leq 2\alpha(\alpha-1)\psi_\alpha(x, y). \tag{12}$$
From [14], it follows that for any $x, y \in V$,
$$\begin{aligned}
\nabla_x\psi_\alpha(x, y) &= y + \alpha^{-1}\big[(x-\alpha y)_+ - x - \alpha(y-\alpha x)_+\big],\\
\nabla_y\psi_\alpha(x, y) &= x + \alpha^{-1}\big[(y-\alpha x)_+ - y - \alpha(x-\alpha y)_+\big].
\end{aligned} \tag{13}$$
Therefore, we have
$$\begin{aligned}
\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\|^2
&= \frac{(\alpha-1)^2}{\alpha^2}\big\|[x - (x-\alpha y)_+] + [y - (y-\alpha x)_+]\big\|^2\\
&= \frac{(\alpha-1)^2}{\alpha^2}\Big[\|x - (x-\alpha y)_+\|^2 + \|y - (y-\alpha x)_+\|^2\Big]\\
&\quad + \frac{2(\alpha-1)^2}{\alpha^2}\big\langle x - (x-\alpha y)_+,\, y - (y-\alpha x)_+\big\rangle.
\end{aligned} \tag{14}$$
From part (d), we see that $\langle\nabla_x\psi_\alpha(x, y), \nabla_y\psi_\alpha(x, y)\rangle \geq 0$ for any $x, y \in V$, that is,
$$\begin{aligned}
0 &\leq \Big\langle y - (y-\alpha x)_+ + \frac{1}{\alpha}\big[(x-\alpha y)_+ - x\big],\ x - (x-\alpha y)_+ + \frac{1}{\alpha}\big[(y-\alpha x)_+ - y\big]\Big\rangle\\
&= -\frac{1}{\alpha}\|x - (x-\alpha y)_+\|^2 - \frac{1}{\alpha}\|y - (y-\alpha x)_+\|^2 + \Big(1 + \frac{1}{\alpha^2}\Big)\big\langle x - (x-\alpha y)_+,\, y - (y-\alpha x)_+\big\rangle.
\end{aligned}$$
This in turn implies that
$$\big\langle x - (x-\alpha y)_+,\, y - (y-\alpha x)_+\big\rangle \geq \frac{\alpha}{\alpha^2+1}\Big(\|x - (x-\alpha y)_+\|^2 + \|y - (y-\alpha x)_+\|^2\Big) \quad \forall x, y \in V. \tag{15}$$
Combining (14) with (15) and noting that $\alpha > 0$, we immediately obtain
$$\begin{aligned}
\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\|^2
&\geq \frac{(\alpha-1)^2(\alpha+1)^2}{\alpha^2(\alpha^2+1)}\Big[\|x - (x-\alpha y)_+\|^2 + \|y - (y-\alpha x)_+\|^2\Big]\\
&= \frac{(\alpha^2-1)^2}{\alpha^2(\alpha^2+1)}\big(\|r_\alpha(x, y)\|^2 + \|r_\alpha(y, x)\|^2\big).
\end{aligned}$$
This completes the proof of (11). To see inequality (12), we verify the following:
$$\begin{aligned}
\|\nabla_x\psi_\alpha(x, y) + \nabla_y\psi_\alpha(x, y)\|^2
&= \frac{(\alpha-1)^2}{\alpha^2}\|r_\alpha(x, y) + r_\alpha(y, x)\|^2\\
&\leq \frac{2(\alpha-1)^2}{\alpha^2}\big(\|r_\alpha(x, y)\|^2 + \|r_\alpha(y, x)\|^2\big)\\
&\leq \frac{(\alpha-1)^2}{\alpha^2}\cdot 2\alpha^2\|r_1(x, y)\|^2\\
&\leq 2(\alpha-1)^2\cdot\frac{\alpha}{\alpha-1}\psi_\alpha(x, y)\\
&= 2\alpha(\alpha-1)\psi_\alpha(x, y),
\end{aligned}$$
where the first equality is due to the first equality of (14) and the definition of $r_\alpha$, and the second inequality holds by Lemma 3.1(a)-(b). Thus, we complete the proof. $\Box$
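The closed forms (13) make the partial gradients as cheap as two projections. Below is a sketch for the NCP case, reusing `psi_alpha` from the sketch in Section 1, together with a central-difference check that the formula for $\nabla_x\psi_\alpha$ matches the function values.

```python
import numpy as np

def grad_psi_alpha(x, y, alpha=2.0):
    """Partial gradients (13) of psi_alpha in the NCP case V = R^n."""
    plus = lambda z: np.maximum(z, 0.0)
    gx = y + (plus(x - alpha * y) - x - alpha * plus(y - alpha * x)) / alpha
    gy = x + (plus(y - alpha * x) - y - alpha * plus(x - alpha * y)) / alpha
    return gx, gy

# Central-difference sanity check of the x-gradient.
rng = np.random.default_rng(1)
x, y, alpha, eps = rng.normal(size=4), rng.normal(size=4), 2.0, 1e-6
gx, _ = grad_psi_alpha(x, y, alpha)
fd = np.array([(psi_alpha(x + eps * e, y, alpha)
                - psi_alpha(x - eps * e, y, alpha)) / (2 * eps)
               for e in np.eye(4)])
print(np.max(np.abs(gx - fd)))  # typically of order 1e-9 for generic data
```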
The assertions of Prop. 3.1(e)-(f) are new, and they play a key role in establishing the rate of convergence result for the nonmonotone descent algorithm of this paper. When $V$ reduces to the Euclidean space $\mathbb{R}^n$ with the standard inner product and the Jordan product defined as the componentwise product of vectors, Prop. 3.1(e) implies the second result of [18, Lemma 1] by observing $\alpha > 1$ and the following inequalities:
$$\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\| \geq \frac{\alpha^2-1}{\alpha\sqrt{\alpha^2+1}}\|R_\alpha(\zeta)\| \geq \frac{(\alpha-1)(\alpha+1)}{\alpha\sqrt{\alpha^2+1}}\|R_1(\zeta)\| \geq \frac{\alpha-1}{\alpha}\|R_1(\zeta)\|,$$
where the second inequality follows from Lemma 3.1(a)-(b) and the third is due to $\alpha + 1 > \sqrt{\alpha^2+1}$.
The following results for $\Psi_\alpha$ can be found in [14, Corollary 6.4] and [14, Theorem 6.3].
Proposition 3.2 Assume that $F$ has the uniform Cartesian $P$-property. Then:
(a) Each stationary point of $\Psi_\alpha$ is a solution of the SCCP (1).
(b) If, in addition, $F$ is Lipschitz continuous with constant $L > 0$, then for any $\zeta \in V$,
$$\frac{1}{(\alpha-1)(2+L)^2}\Psi_\alpha(\zeta) \leq \|\zeta - \zeta^*\|^2 \leq \frac{\alpha(1+L)^2}{(\alpha-1)\rho^2}\Psi_\alpha(\zeta),$$
where $\zeta^*$ is the unique solution of (1) and the constant $\rho$ is the same as in Def. 2.1.
It is well known that the coerciveness of the merit function plays an important role in the convergence analysis of unconstrained reformulation methods for complementarity problems. The next proposition presents a mild condition guaranteeing the coerciveness of $\Psi_\alpha$; its proof can be found in [19, Theorem 4.1].
Proposition 3.3 The function $\Psi_\alpha$ is coercive under the following condition:
(C.1) $F$ has the uniform Jordan $P$-property and linear growth, i.e., there exists a constant $C > 0$ such that $\|F(\zeta)\| \leq \|F(0)\| + C\|\zeta\|$ for any $\zeta \in V$.
In particular, if $F$ is given as in (2) with $L$ having the $P$-property, then $\Psi_\alpha$ is coercive.
To close this section, we present the direction $d(\zeta)$ that will be employed to design our derivative-free algorithm. Specifically, let the mapping $d: V \to V$ be given by
$$d(\zeta) := -\theta\nabla_x\psi_\alpha(\zeta, F(\zeta)) - (1-\theta)\nabla_y\psi_\alpha(\zeta, F(\zeta)), \quad \theta \in [0, 1]. \tag{16}$$
This vector $d$ enjoys the properties stated in the following proposition.
Proposition 3.4 Suppose that $\nabla F$ is positive definite. Then, for sufficiently small $\theta > 0$,
$$d(\zeta)^T\nabla\Psi_\alpha(\zeta) < 0 \quad \text{whenever } d(\zeta) \neq 0.$$
If $F$ is strongly monotone with modulus $\mu > 0$ and $S \subseteq V$ is any bounded set, then there exists $\bar\theta \in (0, 1)$ such that for all $\theta \leq \bar\theta$,
$$d(\zeta)^T\nabla\Psi_\alpha(\zeta) \leq -\frac{1}{2}\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2 \quad \forall \zeta \in S.$$
Proof. By the formula for $\nabla\Psi_\alpha$ and Prop. 3.1(d), for any $\theta \in [0, 1]$ we have
$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta) &= -\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|^2 - (1-\theta)\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\quad - \theta\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\quad - (1-\theta)\langle\nabla_y\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\leq -\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|^2 - \theta\langle\nabla_x\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle\\
&\quad - (1-\theta)\langle\nabla_y\psi_\alpha(\zeta, F(\zeta)), \nabla F(\zeta)\nabla_y\psi_\alpha(\zeta, F(\zeta))\rangle.
\end{aligned} \tag{17}$$
Notice that for sufficiently small $\theta > 0$ and any given $\zeta \in V$, $d(\zeta) \neq 0$ implies $\nabla_y\psi_\alpha(\zeta, F(\zeta)) \neq 0$. Thus, the last term on the right-hand side is always strictly negative by the positive definiteness of $\nabla F$, whereas the first two terms are sufficiently small. Therefore, $d(\zeta)^T\nabla\Psi_\alpha(\zeta) < 0$ whenever $d(\zeta) \neq 0$.
Since $\nabla F$ is continuous and $S$ is bounded, there exists a constant $\nu > 0$ such that
$$\|\nabla F(\zeta)\| \leq \nu \quad \forall \zeta \in S. \tag{18}$$
On the other hand, by the strong monotonicity of $F$,
$$\langle\nabla F(\zeta)u, u\rangle \geq \mu\|u\|^2 \quad \forall \zeta, u \in V. \tag{19}$$
Now, from (17)-(19) it follows that for any $\theta \in [0, 1]$ and $\zeta \in S$ (writing $a := \|\nabla_x\psi_\alpha(\zeta, F(\zeta))\|$ and $b := \|\nabla_y\psi_\alpha(\zeta, F(\zeta))\|$ for brevity),
$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta) &\leq -\theta a^2 - (1-\theta)\mu b^2 + \theta\nu ab\\
&= -\frac{1}{2}\theta(a + b)^2 - \frac{1}{2}\theta a^2 - \frac{2(1-\theta)\mu - \theta}{2}b^2 + \theta(\nu+1)ab.
\end{aligned} \tag{20}$$
If $\theta \leq 2\mu/(2\mu+1)$, then the last inequality can be rewritten as
$$\begin{aligned}
d(\zeta)^T\nabla\Psi_\alpha(\zeta) &\leq -\frac{1}{2}\theta(a + b)^2 - \left(\sqrt{\frac{\theta}{2}}\,a - \sqrt{\frac{2\mu - (2\mu+1)\theta}{2}}\,b\right)^2\\
&\quad + \Big(\theta(\nu+1) - \sqrt{2\mu\theta - (2\mu+1)\theta^2}\Big)ab.
\end{aligned} \tag{21}$$
If $\theta(\nu+1) \leq \sqrt{2\mu\theta - (2\mu+1)\theta^2}$, that is, $\theta \leq 2\mu/(2\mu+1+(\nu+1)^2)$, then (21) together with the triangle inequality $\|\nabla_x\psi_\alpha + \nabla_y\psi_\alpha\| \leq a + b$ yields
$$d(\zeta)^T\nabla\Psi_\alpha(\zeta) \leq -\frac{1}{2}\theta(a + b)^2 \leq -\frac{1}{2}\theta\|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2.$$
Thus, by setting
$$\bar\theta := \min\left\{\frac{2\mu}{2\mu+1},\ \frac{2\mu}{2\mu+1+(\nu+1)^2}\right\} = \frac{2\mu}{2\mu+1+(\nu+1)^2}, \tag{22}$$
we obtain the desired result. The proof is complete. $\Box$
4 Nonmonotone derivative-free algorithm
In this section, we utilize the direction $d(\zeta)$ defined by (16) to design a derivative-free algorithm. By Prop. 3.4, $d(\zeta)$ with a general $\theta \in [0, 1]$ may not satisfy the descent condition. Moreover, a nonmonotone line search is often more effective than a monotone Armijo-type line search. We therefore adopt a nonmonotone line search rule to seek a suitable stepsize.
Algorithm 4.1
(Step 0) Choose $\zeta^0 \in V$, $\epsilon \geq 0$, $\theta \in [0, 1]$ and $\gamma, \delta \in (0, 1)$. Let $M > 0$ be an integer. Set $k := 0$.
(Step 1) If $\Psi_\alpha(\zeta^k) \leq \epsilon$, then stop. Otherwise, go to Step 2.
(Step 2) Let $m(0) = 0$ and $0 \leq m(k) \leq \min\{m(k-1)+1, M-1\}$ for $k \geq 1$. Let $l_k$ be the smallest nonnegative integer $l$ satisfying
$$\Psi_\alpha(\zeta^k + \gamma^l d^k) \leq \max_{0 \leq j \leq m(k)}\Psi_\alpha(\zeta^{k-j}) - \delta\gamma^{2l}h(\zeta^k), \tag{23}$$
where $d^k := d(\zeta^k)$ with $d(\zeta)$ defined as in (16), and
$$h(\zeta) := \|\nabla_x\psi_\alpha(\zeta, F(\zeta)) + \nabla_y\psi_\alpha(\zeta, F(\zeta))\|^2. \tag{24}$$
(Step 3) Set $\zeta^{k+1} := \zeta^k + \gamma^{l_k}d^k$ and $k := k+1$, and then go to Step 1.
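For concreteness, the following is a minimal sketch of Algorithm 4.1 in the NCP case, reusing `Psi_alpha` and `grad_psi_alpha` from the earlier sketches; the memory schedule $m(k)$ and the stepsize safeguard follow the choices reported in Section 5 and are not part of the algorithm statement above.

```python
import numpy as np

def algorithm_41(F, zeta0, alpha=15.0, theta=0.95, gamma=0.2, delta=1e-10,
                 eps=1e-5, M=6, max_iter=500000):
    """Nonmonotone derivative-free method (Algorithm 4.1), NCP case."""
    zeta = zeta0.copy()
    hist = [Psi_alpha(zeta, F, alpha)]   # merit values for the nonmonotone test
    m = 0
    for k in range(max_iter):
        if hist[-1] <= eps:              # Step 1: termination
            break
        gx, gy = grad_psi_alpha(zeta, F(zeta), alpha)
        dk = -theta * gx - (1.0 - theta) * gy    # direction (16)
        hk = np.linalg.norm(gx + gy) ** 2        # h(zeta) of (24)
        m = 0 if k < 5 else min(m + 1, M - 1)    # memory schedule from Section 5
        ref = max(hist[-(m + 1):])       # max_{0<=j<=m(k)} Psi_alpha(zeta^{k-j})
        l = 0                            # Step 2: smallest l satisfying (23)
        while Psi_alpha(zeta + gamma**l * dk, F, alpha) > ref - delta * gamma**(2*l) * hk:
            l += 1
            if gamma**l < 1e-8:          # stepsize safeguard from Section 5
                return zeta
        zeta = zeta + gamma**l * dk      # Step 3
        hist.append(Psi_alpha(zeta, F, alpha))
    return zeta
```

For instance, `algorithm_41(lambda z: A @ z + b, np.ones(n))` would run the method on an affine NCP with a (hypothetical) matrix `A` and vector `b`.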
Observe that no derivatives of $F$ are needed to compute the search direction or the stepsize in Algorithm 4.1. Hence, Algorithm 4.1 requires little computation and storage at each iteration. Since $\theta$ is an arbitrary fixed constant in $[0, 1]$, the direction $d^k$ is different from the one used in [18] and may not satisfy the descent condition $(d^k)^T\nabla\Psi_\alpha(\zeta^k) < 0$ at each iteration. For this reason, a nonmonotone line search rule is used in Step 2. This line search rule differs from those adopted in [2, 6], where the gradient of the merit (or objective) function is needed; when $m(k) \equiv 0$, the nonmonotone line search reduces to the Armijo line search. In particular, if $\theta$ is restricted to be less than the $\bar\theta$ given by (22) and $F$ is strongly monotone, then Prop. 3.4 implies that Algorithm 4.1 becomes a nonmonotone derivative-free descent algorithm.
In what follows, we study the convergence of Algorithm 4.1. To this end, assume that Algorithm 4.1 generates an infinite sequence $\{\zeta^k\}$, i.e., $\epsilon = 0$. We define the level set
$$L(\Psi_\alpha, \zeta^0) := \{\zeta \in V \mid \Psi_\alpha(\zeta) \leq \Psi_\alpha(\zeta^0)\}.$$
Then $L(\Psi_\alpha, \zeta^0)$ is bounded under the condition given in Prop. 3.3. By the continuity of $F(\cdot)$, the quantity $D(\zeta^0) := \sup\{\|d(\zeta)\| \mid \zeta \in L(\Psi_\alpha, \zeta^0)\}$ is then finite. Consequently,
$$B(\zeta^0) := L(\Psi_\alpha, \zeta^0) + \{\zeta \in V \mid \|\zeta\| \leq D(\zeta^0)\}$$
is also bounded under the condition stated in Prop. 3.3.
Lemma 4.1 Let $\{\zeta^k\}$ be the sequence generated by Algorithm 4.1. Then:
(a) the sequence $\{\zeta^k\}$ is contained in $L(\Psi_\alpha, \zeta^0)$;
(b) $\displaystyle\max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{Mp+i}) \leq \max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{M(p-1)+i}) - \delta\min_{0 \leq i \leq M-1}\gamma^{2l(Mp+i)}h(\zeta^{Mp+i})$ for any $p \geq 1$.
Proof. (a) For each $k \geq 0$, let $\sigma(k)$ be an integer in $[k - m(k), k]$ such that
$$\Psi_\alpha(\zeta^{\sigma(k)}) = \max_{0 \leq j \leq m(k)}\Psi_\alpha(\zeta^{k-j}).$$
Then, the line search condition (23) can be rewritten as
$$\Psi_\alpha(\zeta^{k+1}) \leq \Psi_\alpha(\zeta^{\sigma(k)}) - \delta\gamma^{2l_k}h(\zeta^k). \tag{25}$$
Noting that $m(k+1) \leq m(k) + 1$ and $h(\zeta) \geq 0$ for any $\zeta \in V$, we have from (23) that
$$\Psi_\alpha(\zeta^{\sigma(k+1)}) = \max_{0 \leq j \leq m(k+1)}\Psi_\alpha(\zeta^{k+1-j}) \leq \max_{0 \leq j \leq m(k)+1}\Psi_\alpha(\zeta^{k+1-j}) = \max\{\Psi_\alpha(\zeta^{\sigma(k)}), \Psi_\alpha(\zeta^{k+1})\} = \Psi_\alpha(\zeta^{\sigma(k)}),$$
where the last equality follows from (25) and the nonnegativity of $h(\zeta^k)$. This shows that the sequence $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ is nonincreasing. Noting that $\zeta^{\sigma(0)} = \zeta^0$, we then have $\Psi_\alpha(\zeta^k) \leq \Psi_\alpha(\zeta^0)$ for all $k$, which in turn implies $\{\zeta^k\} \subseteq L(\Psi_\alpha, \zeta^0)$.
(b) We only need to show that the following inequality holds for $j = 1, 2, \ldots, M$:
$$\Psi_\alpha(\zeta^{Mp+j}) \leq \max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{M(p-1)+i}) - \delta\gamma^{2l(Mp+j-1)}h(\zeta^{Mp+j-1}) \quad \forall p \geq 1. \tag{26}$$
Notice that the line search condition (23) implies
$$\Psi_\alpha(\zeta^{Mp+1}) \leq \max_{0 \leq i \leq m(Mp)}\Psi_\alpha(\zeta^{Mp-i}) - \delta\gamma^{2l(Mp)}h(\zeta^{Mp}),$$
which together with $m(Mp) \leq M-1$ shows that inequality (26) holds for $j = 1$. Suppose that (26) holds for all indices up to some $j$ with $1 \leq j \leq M-1$. Then, from the nonnegativity of $h(\zeta)$, it follows that
$$\max_{1 \leq i \leq j}\Psi_\alpha(\zeta^{Mp+i}) \leq \max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{M(p-1)+i}).$$
Consequently, using (23), the induction hypothesis and $m(Mp+j) \leq M-1$, we get
$$\begin{aligned}
\Psi_\alpha(\zeta^{Mp+j+1}) &\leq \max_{0 \leq i \leq m(Mp+j)}\Psi_\alpha(\zeta^{Mp+j-i}) - \delta\gamma^{2l(Mp+j)}h(\zeta^{Mp+j})\\
&\leq \max\Big\{\max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{M(p-1)+i}),\ \max_{1 \leq i \leq j}\Psi_\alpha(\zeta^{Mp+i})\Big\} - \delta\gamma^{2l(Mp+j)}h(\zeta^{Mp+j})\\
&\leq \max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{M(p-1)+i}) - \delta\gamma^{2l(Mp+j)}h(\zeta^{Mp+j}).
\end{aligned}$$
This shows that (26) also holds for $j+1$. By induction, (26) is true for all $1 \leq j \leq M$, and the assertion of part (b) follows. $\Box$
We are now in a position to state and prove our convergence result for Algorithm 4.1.
Theorem 4.1 Let $\{\zeta^k\}$ be the sequence generated by Algorithm 4.1. Suppose that $F$ is Lipschitz continuous and satisfies the condition in Prop. 3.3, and that $\nabla F(\cdot)$ is Lipschitz continuous on $B(\zeta^0)$. Then the following results hold.
(a) The sequence $\{\zeta^k\}$ is bounded.
(b) The sequence $\{\Psi_\alpha(\zeta^k)\}$ is convergent.
(c) $\lim_{k\to\infty}\gamma^{2l_k}h(\zeta^k) = 0$, $\lim_{k\to\infty}\gamma^{l_k}\|d^k\| = 0$ and $\lim_{k\to\infty}\|\zeta^{k+1} - \zeta^k\| = 0$.
(d) Each accumulation point of $\{\zeta^k\}$ either is a solution of the SCCP (1) or satisfies
$$\frac{|\nabla\Psi_\alpha(\zeta)^T d(\zeta)|}{h(\zeta)} = 0. \tag{27}$$
Proof. (a) By Prop. 3.3, $L(\Psi_\alpha, \zeta^0)$ is bounded, and the result holds by Lemma 4.1(a).
(b) First, by the proof of Lemma 4.1(a), the sequence $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ is nonincreasing. Together with the nonnegativity of $\Psi_\alpha(\zeta)$ for any $\zeta \in V$, this implies that $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ admits a limit as $k \to \infty$. Let $j$ be an integer with $1 \leq j \leq M+1$. We first show by induction on $j$ that
$$\lim_{k\to\infty}\|\zeta^{\sigma(k)-j+1} - \zeta^{\sigma(k)-j}\| = 0, \tag{28}$$
$$\lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)-j}), \tag{29}$$
where $\sigma(k)$ is defined as in Lemma 4.1, and the sequences are considered for sufficiently large $k$ such that $\sigma(k) \geq k - M > 1$. If $j = 1$, then using (25) with $k$ replaced by $\sigma(k)-1$, we obtain
$$\Psi_\alpha(\zeta^{\sigma(k)}) \leq \Psi_\alpha(\zeta^{\sigma(\sigma(k)-1)}) - \delta\gamma^{2l_{\sigma(k)-1}}h(\zeta^{\sigma(k)-1}). \tag{30}$$
Since $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ admits a limit, taking limits on both sides of (30) yields
$$\lim_{k\to\infty}\gamma^{2l_{\sigma(k)-1}}h(\zeta^{\sigma(k)-1}) = 0.$$
From the definitions of $d(\zeta)$ and $h(\zeta)$ together with Prop. 3.1(d), it is easy to verify that $h(\zeta) \geq \|d(\zeta)\|^2$ for any $\zeta \in V$. Using the last two relations, it then follows that
$$0 \geq \lim_{k\to\infty}\|\gamma^{l_{\sigma(k)-1}}d^{\sigma(k)-1}\| = \lim_{k\to\infty}\|\zeta^{\sigma(k)} - \zeta^{\sigma(k)-1}\| \geq 0. \tag{31}$$
On the other hand, since $\Psi_\alpha$ is continuously differentiable everywhere and $L(\Psi_\alpha, \zeta^0)$ is bounded, $\Psi_\alpha$ is Lipschitz continuous on $L(\Psi_\alpha, \zeta^0)$, i.e., there exists a constant $L_2 > 0$ such that
$$|\Psi_\alpha(\zeta) - \Psi_\alpha(\xi)| \leq L_2\|\zeta - \xi\| \quad \forall \zeta, \xi \in L(\Psi_\alpha, \zeta^0). \tag{32}$$
From (31)-(32), we immediately obtain
$$\lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)-1}).$$
This shows that (28) and (29) hold for $j = 1$. Now assume that (29) holds for a given $j$. Using (25) with $k$ replaced by $\sigma(k)-j-1$, we have
$$\Psi_\alpha(\zeta^{\sigma(k)-j}) \leq \Psi_\alpha(\zeta^{\sigma(\sigma(k)-j-1)}) - \delta\gamma^{2l_{\sigma(k)-j-1}}h(\zeta^{\sigma(k)-j-1}).$$
Taking limits as $k \to \infty$ and recalling (29) give
$$\lim_{k\to\infty}\gamma^{2l_{\sigma(k)-j-1}}h(\zeta^{\sigma(k)-j-1}) = 0.$$
This together with $h(\zeta^{\sigma(k)-j-1}) \geq \|d^{\sigma(k)-j-1}\|^2$ implies
$$\lim_{k\to\infty}\gamma^{l_{\sigma(k)-j-1}}\|d^{\sigma(k)-j-1}\| = \lim_{k\to\infty}\|\zeta^{\sigma(k)-j} - \zeta^{\sigma(k)-j-1}\| = 0.$$
Combining this with (29) and (32), we then obtain
$$\lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)-j-1}).$$
The last two relations show that (28) and (29) hold with $j$ replaced by $j+1$; hence (28) and (29) hold for every $j \in \{1, \ldots, M+1\}$. Let $\hat\sigma(k) = \sigma(k+M+1)$. Then
$$\zeta^{\hat\sigma(k)} = \zeta^k + (\zeta^{k+1} - \zeta^k) + \cdots + (\zeta^{\hat\sigma(k)} - \zeta^{\hat\sigma(k)-1}) = \zeta^k + \sum_{j=1}^{\hat\sigma(k)-k}(\zeta^{\hat\sigma(k)-j+1} - \zeta^{\hat\sigma(k)-j}). \tag{33}$$
Notice that $\sigma(k+M+1) \leq k+M+1$ and $\hat\sigma(k) - k \leq M+1$; therefore, from (33) and (28), it follows that
$$\lim_{k\to\infty}\|\zeta^k - \zeta^{\hat\sigma(k)}\| = 0. \tag{34}$$
Since $\{\Psi_\alpha(\zeta^{\sigma(k)})\}$ has a limit, using (32) and (34), we have
$$\lim_{k\to\infty}\Psi_\alpha(\zeta^k) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\hat\sigma(k)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k+M+1)}) = \lim_{k\to\infty}\Psi_\alpha(\zeta^{\sigma(k)}).$$
This completes the proof of assertion (b).
(c) From the line search condition (23) and part (b), it readily follows that
$$\lim_{k\to\infty}\gamma^{2l_k}h(\zeta^k) = 0.$$
This together with $h(\zeta^k) \geq \|d^k\|^2$ and $\|\gamma^{l_k}d^k\| = \|\zeta^{k+1} - \zeta^k\|$ yields
$$\lim_{k\to\infty}\gamma^{l_k}\|d^k\| = \lim_{k\to\infty}\|\zeta^{k+1} - \zeta^k\| = 0.$$
Consequently, the assertions of part (c) hold.
(d) If $l_k \geq 1$, then $l = l_k - 1$ fails the line search condition (23), and hence
$$\Psi_\alpha(\zeta^k + \gamma^{l_k-1}d^k) > \max_{0 \leq j \leq m(k)}\Psi_\alpha(\zeta^{k-j}) - \delta\gamma^{2(l_k-1)}h(\zeta^k) \geq \Psi_\alpha(\zeta^k) - \delta\gamma^{2(l_k-1)}h(\zeta^k). \tag{35}$$
Since $F(\cdot)$ and $\nabla F(\cdot)$ are Lipschitz continuous on $B(\zeta^0)$, it is clear that $\nabla\Psi_\alpha(\cdot)$ is Lipschitz continuous on this bounded set, i.e., there exists a constant $L_3 > 0$ such that
$$\|\nabla\Psi_\alpha(\zeta) - \nabla\Psi_\alpha(\xi)\| \leq L_3\|\zeta - \xi\| \quad \forall \zeta, \xi \in B(\zeta^0). \tag{36}$$
Notice that $\zeta^k$ and $\zeta^k + td^k$ for any $t \in [0, 1]$ belong to the set $B(\zeta^0)$. By the mean-value theorem and the Lipschitz continuity of $\nabla\Psi_\alpha$ on $B(\zeta^0)$, it then follows that
$$\begin{aligned}
\Psi_\alpha(\zeta^k + td^k) - \Psi_\alpha(\zeta^k)
&= t\nabla\Psi_\alpha(\zeta^k)^Td^k + \int_0^t\big[\nabla\Psi_\alpha(\zeta^k + sd^k) - \nabla\Psi_\alpha(\zeta^k)\big]^Td^k\,ds\\
&\leq t\nabla\Psi_\alpha(\zeta^k)^Td^k + \int_0^t L_3\|d^k\|^2 s\,ds\\
&= t\nabla\Psi_\alpha(\zeta^k)^Td^k + \frac{1}{2}L_3t^2\|d^k\|^2\\
&\leq t\nabla\Psi_\alpha(\zeta^k)^Td^k + \frac{1}{2}L_3t^2h(\zeta^k)\\
&\leq -\delta t^2h(\zeta^k) \quad \text{for all } t \in \left[0,\ \frac{2|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)(2\delta + L_3)}\right].
\end{aligned} \tag{37}$$
Combining (37) with (35), we obtain
$$\gamma^{l_k-1} > \frac{2|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)(2\delta + L_3)}.$$
If $l_k = 0$ satisfies the line search condition (23), then $\gamma^{l_k} = 1$. Thus, with the constant $C_1 := 2\gamma/(2\delta + L_3) > 0$, we have
$$\gamma^{l_k} > \min\left\{1,\ \frac{C_1|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)}\right\} \quad \text{for all } k. \tag{38}$$
Now let $\zeta^*$ be an accumulation point of $\{\zeta^k\}$ and $\{\zeta^k\}_{k\in\mathcal{K}}$ a subsequence such that
$$\lim_{k\to\infty,\,k\in\mathcal{K}}\zeta^k = \zeta^*.$$
By part (c), $\lim_{k\to\infty}\gamma^{2l_k}h(\zeta^k) = 0$. If $\lim_{k\to\infty,\,k\in\mathcal{K}}h(\zeta^k) = h(\zeta^*) = 0$, then
$$\nabla_x\psi_\alpha(\zeta^*, F(\zeta^*)) + \nabla_y\psi_\alpha(\zeta^*, F(\zeta^*)) = 0,$$
and by Proposition 3.1(c), $\zeta^*$ is a solution of the SCCP. If $\lim_{k\to\infty,\,k\in\mathcal{K}}h(\zeta^k) \neq 0$, then $\lim_{k\to\infty,\,k\in\mathcal{K}}\gamma^{l_k} = 0$. This together with (38) implies
$$0 = \lim_{k\to\infty,\,k\in\mathcal{K}}\frac{|\nabla\Psi_\alpha(\zeta^k)^Td^k|}{h(\zeta^k)} = \frac{|\nabla\Psi_\alpha(\zeta^*)^Td(\zeta^*)|}{h(\zeta^*)}.$$
Thus, we complete the proof. $\Box$
Theorem 4.1 states that, for any fixed $\theta \in [0, 1]$, the nonmonotone derivative-free algorithm converges in terms of the value of the merit function $\Psi_\alpha$, and the sequence $\{\zeta^k\}$ is bounded, for a large class of SCCPs that may not even be monotone. If $\theta$ is chosen less than $\bar\theta$ and $F$ is strongly monotone, then by Prop. 3.4,
$$|\nabla\Psi_\alpha(\zeta)^Td(\zeta)| \geq \frac{1}{2}\theta h(\zeta) \quad \forall \zeta \in B(\zeta^0).$$
This implies that no accumulation point of $\{\zeta^k\}$ can satisfy (27), and consequently, each accumulation point of $\{\zeta^k\}$ is a solution of the SCCP (1). In fact, in this case $\{\zeta^k\}$ converges to the solution of (1) at an R-linear rate, as we prove next.
Theorem 4.2 Let $\{\zeta^k\}$ be the sequence generated by Algorithm 4.1. Suppose that $F$ is strongly monotone and Lipschitz continuous, and that $\nabla F(\cdot)$ is Lipschitz continuous on $B(\zeta^0)$. If $\theta \leq \bar\theta$ with $\bar\theta$ given by (22), then there exist constants $\nu_0 > 0$ and $\nu_6 \in (0, 1)$ such that
$$\Psi_\alpha(\zeta^k) \leq \nu_0\nu_6^k\Psi_\alpha(\zeta^1).$$
Moreover, $\{\zeta^k\}$ converges to the unique solution $\zeta^*$ of the SCCP (1) at an R-linear rate.
Proof. Strong monotonicity implies the uniform Jordan $P$-property, which by Prop. 3.3 implies that $B(\zeta^0)$ is bounded; hence all results of Theorem 4.1 hold.
To prove the conclusion, we first show that there exist constants $\nu_1, \nu_2 > 0$ such that
$$\Psi_\alpha(\zeta^{k+1}) \leq \nu_1\Psi_\alpha(\zeta^k) \quad \text{for all } k \geq 0 \tag{39}$$
and
$$h(\zeta^{k+1}) \leq \nu_2 h(\zeta^k) \quad \text{for all } k \geq 0. \tag{40}$$
Because $\theta \leq \bar\theta$ and $F$ is strongly monotone, using (37) and Proposition 3.4 yields
$$\Psi_\alpha(\zeta^{k+1}) - \Psi_\alpha(\zeta^k) \leq \gamma^{l_k}\nabla\Psi_\alpha(\zeta^k)^Td^k + \frac{1}{2}L_3\gamma^{2l_k}h(\zeta^k) \leq -\frac{1}{2}\gamma^{l_k}(\theta - L_3\gamma^{l_k})h(\zeta^k). \tag{41}$$
By Proposition 3.1(e)-(f) and Lemma 3.1(a) and (c), it is easy to verify that
$$h(\zeta) \geq \frac{(\alpha-1)^2}{\alpha^2}\|R_\alpha(\zeta)\|^2 \geq \frac{(\alpha-1)^2}{\alpha^2}\|R_1(\zeta)\|^2 \geq \frac{\alpha-1}{\alpha^2}\Psi_\alpha(\zeta) \quad \forall \zeta \in V \tag{42}$$
and
$$h(\zeta) \leq 2\alpha(\alpha-1)\Psi_\alpha(\zeta) \quad \forall \zeta \in V. \tag{43}$$
Therefore, if $\theta - L_3\gamma^{l_k} \geq 0$, relations (41) and (42) imply
$$\Psi_\alpha(\zeta^{k+1}) \leq \left[1 - \frac{1}{2}\gamma^{l_k}(\theta - L_3\gamma^{l_k})\frac{\alpha-1}{\alpha^2}\right]\Psi_\alpha(\zeta^k) \leq \Psi_\alpha(\zeta^k),$$
whereas if $\theta - L_3\gamma^{l_k} < 0$, relations (41) and (43) lead to
$$\Psi_\alpha(\zeta^{k+1}) \leq \big[1 - \gamma^{l_k}(\theta - L_3\gamma^{l_k})\alpha(\alpha-1)\big]\Psi_\alpha(\zeta^k) \leq \big[1 + (L_3 - \theta)\alpha(\alpha-1)\big]\Psi_\alpha(\zeta^k).$$
This shows that (39) holds with $\nu_1 := \max\{1,\ 1 + (L_3 - \theta)\alpha(\alpha-1)\}$. Using (43), (39) and (42), we have
$$h(\zeta^{k+1}) \leq 2\alpha(\alpha-1)\Psi_\alpha(\zeta^{k+1}) \leq 2\alpha(\alpha-1)\nu_1\Psi_\alpha(\zeta^k) \leq 2\nu_1\alpha^3h(\zeta^k),$$
which implies that (40) holds with $\nu_2 := 2\nu_1\alpha^3 > 0$.
Now for any $p \geq 1$, let $\phi(p)$ be any index in $[Mp+1, M(p+1)]$ satisfying
$$\Psi_\alpha(\zeta^{\phi(p)}) := \max_{1 \leq i \leq M}\Psi_\alpha(\zeta^{Mp+i}).$$
From Lemma 4.1(b), it then follows that
$$\Psi_\alpha(\zeta^{\phi(p)}) \leq \Psi_\alpha(\zeta^{\phi(p-1)}) - \delta\min_{0 \leq i \leq M-1}\gamma^{2l(Mp+i)}h(\zeta^{Mp+i}).$$
Notice that $\gamma^{l_k} \geq \min\{1, C_1\theta/2\}$ for all $k$, by (38) and the second assertion of Proposition 3.4. Hence, with the constant $\nu_3 := \delta\big(\min\{1, C_1\theta/2\}\big)^2 > 0$, we get
$$\Psi_\alpha(\zeta^{\phi(p)}) \leq \Psi_\alpha(\zeta^{\phi(p-1)}) - \nu_3\min_{0 \leq i \leq M-1}h(\zeta^{Mp+i}). \tag{44}$$
Let $s(p)$ and $w(p)$ be any indices in $[Mp+1, M(p+2)]$ for which
$$h(\zeta^{s(p)}) := \min_{1 \leq i \leq 2M}h(\zeta^{Mp+i}) \quad \text{and} \quad \Psi_\alpha(\zeta^{w(p)}) := \min_{1 \leq i \leq 2M}\Psi_\alpha(\zeta^{Mp+i}), \tag{45}$$
and denote by $\nu_4$ the constant
$$\nu_4 := \left(\nu_3 + \frac{\alpha^2}{\alpha-1}\nu_2^{4M}\right)^{-1}. \tag{46}$$
We now define an infinite subsequence $\{k_i : i \geq 0\} \subseteq \{1, 2, \ldots\}$ as follows. Let $k_0 = \phi(0)$. Suppose that $k_i = \phi(\bar p)$ has been chosen for some $\bar p$. Define
$$k_{i+1} := \begin{cases} w(\bar p+1) & \text{if } h(\zeta^{s(\bar p+1)}) \leq \nu_4\Psi_\alpha(\zeta^{\phi(\bar p)}),\\ \phi(\bar p+3) & \text{otherwise}. \end{cases} \tag{47}$$
For the subsequence $\{k_i\}$ defined above, it is obvious that
$$k_{i+1} - k_i \leq 4M. \tag{48}$$
In addition, there necessarily exists a constant $\nu_5 \in (0, 1)$ such that
$$\Psi_\alpha(\zeta^{k_{i+1}}) \leq \nu_5\Psi_\alpha(\zeta^{k_i}) \quad \text{for all } i \geq 1. \tag{49}$$
In fact, if $h(\zeta^{s(\bar p+1)}) \leq \nu_4\Psi_\alpha(\zeta^{\phi(\bar p)})$, then from (42), (40) and (48) it follows that
$$\Psi_\alpha(\zeta^{k_{i+1}}) \leq \frac{\alpha^2}{\alpha-1}h(\zeta^{k_{i+1}}) \leq \frac{\alpha^2}{\alpha-1}\nu_2^{4M}h(\zeta^{s(\bar p+1)}) \leq \frac{\alpha^2}{\alpha-1}\nu_2^{4M}\nu_4\Psi_\alpha(\zeta^{k_i});$$
if $h(\zeta^{s(\bar p+1)}) > \nu_4\Psi_\alpha(\zeta^{\phi(\bar p)})$, then using (44) and (45) yields
$$\Psi_\alpha(\zeta^{k_{i+1}}) \leq (1 - \nu_3\nu_4)\Psi_\alpha(\zeta^{k_i}).$$
By the choice (46) of $\nu_4$, the last two relations imply that (49) holds with $\nu_5 = 1 - \nu_3\nu_4$.
For any $k \geq 1$, assume that $k \in [k_i, k_{i+1})$ for some $i$. Then from (48) we have
$$k - k_i \leq 4M \quad \text{and} \quad k_i \leq 4Mi + k_0. \tag{50}$$
Using (50) and noting that $1 \leq k_0 \leq M$ give
$$i \geq \frac{k_i - k_0}{4M} \geq \frac{k - 4M - k_0}{4M} \geq \frac{k}{4M} - \frac{5}{4}. \tag{51}$$
Thus, by (39), (49) and (50)-(51), we obtain
$$\Psi_\alpha(\zeta^k) \leq \nu_1^{k-k_i}\Psi_\alpha(\zeta^{k_i}) \leq \nu_1^{4M}\nu_5^{\,i}\Psi_\alpha(\zeta^{k_0}) \leq \nu_1^{4M}\nu_5^{\,k/(4M)-5/4}\Psi_\alpha(\zeta^{k_0}) \leq \nu_1^{5M}\nu_5^{\,k/(4M)-5/4}\Psi_\alpha(\zeta^1).$$
Letting $\nu_0 = \nu_1^{5M}\nu_5^{-5/4}$ and $\nu_6 = \nu_5^{1/(4M)}$, and noting that $\nu_5 = 1 - \nu_3\nu_4 < 1$, we obtain the first part of the conclusion. The second part follows immediately, since $\{\Psi_\alpha(\zeta^k)\}$ converges R-linearly to zero and, by Prop. 3.2(b),
$$\|\zeta^k - \zeta^*\| \leq \frac{1+L}{\rho}\sqrt{\frac{\alpha}{\alpha-1}}\sqrt{\Psi_\alpha(\zeta^k)}. \qquad \Box$$
Theorem 4.2 is the first rate of convergence result for the class of derivative-free descent methods with a nonmonotone line search rule for non-polyhedral SCCPs. In the next section, we compare the numerical performance of Algorithm 4.1 with that of Algorithm 4.2 described below, which is a monotone descent derivative-free method similar to the one in [26] for NCPs. The stepsize and the search direction of Algorithm 4.2 are adjusted jointly during the Armijo-type backtracking search.
Algorithm 4.2
(Step 0) Choose $\zeta^0 \in V$, $\epsilon \geq 0$, $\delta \in (0, 1)$, $\gamma \in (0, 1)$, and a sufficiently small $\beta \in (0, 1)$. Set $k := 0$.
(Step 1) If $\Psi_\alpha(\zeta^k) \leq \epsilon$, then stop. Otherwise, go to Step 2.
(Step 2) Let $l_k$ be the smallest nonnegative integer $l$ satisfying
$$\Psi_\alpha(\zeta^k + \gamma^l d^k(\beta^l)) \leq \Psi_\alpha(\zeta^k) - \delta\gamma^{2l}h(\zeta^k), \tag{52}$$
where $h(\zeta)$ is defined as in (24) and
$$d^k(\beta^l) := -\beta^l\nabla_x\psi_\alpha(\zeta^k, F(\zeta^k)) - (1-\beta^l)\nabla_y\psi_\alpha(\zeta^k, F(\zeta^k)). \tag{53}$$
(Step 3) Set $\zeta^{k+1} := \zeta^k + \gamma^{l_k}d^k(\beta^{l_k})$, $k := k+1$, and go to Step 1.
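The essential difference from Algorithm 4.1 is that the blending weight of the direction changes together with the stepsize during backtracking. Below is a sketch of one iteration in the NCP case, reusing `Psi_alpha` and `grad_psi_alpha` from the earlier sketches; the stepsize safeguard is ours.

```python
import numpy as np

def algorithm_42_step(zeta, F, alpha=15.0, beta=0.1, gamma=0.2, delta=1e-10):
    """One iteration of Algorithm 4.2: the direction d^k(beta^l) of (53)
    is re-blended at every backtracking trial of the Armijo test (52)."""
    gx, gy = grad_psi_alpha(zeta, F(zeta), alpha)
    hk = np.linalg.norm(gx + gy) ** 2
    Psik = Psi_alpha(zeta, F, alpha)
    l = 0
    while True:
        dk = -beta**l * gx - (1.0 - beta**l) * gy   # direction (53)
        if Psi_alpha(zeta + gamma**l * dk, F, alpha) <= Psik - delta * gamma**(2*l) * hk:
            return zeta + gamma**l * dk              # accept the step
        l += 1
        if gamma**l < 1e-8:                          # safeguard (ours)
            return zeta
```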
5 Numerical experiments
In this section, we test the performance of Algorithms 4.1 and 4.2 on the affine SOCCP
$$\zeta \in K^n_+, \quad F(\zeta) = M\zeta + b \in K^n_+, \quad \langle\zeta, F(\zeta)\rangle = 0, \tag{54}$$
where $K^n_+ = K^{n_1}_+ \times \cdots \times K^{n_m}_+$ with $n_1 + \cdots + n_m = n$, $M \in \mathbb{R}^{n\times n}$ and $b \in \mathbb{R}^n$.
During the testing, we set $M \equiv \mathrm{diag}(M_1, \ldots, M_m)$ with $M_i = N_iN_i^T + \tau I_i$ for all $i$, where $\tau \geq 0$ is a given parameter, $I_i$ is the $n_i \times n_i$ identity matrix, and each $N_i \in \mathbb{R}^{n_i\times n_i}$ was generated randomly with 1% nonzero density, the nonzero entries being drawn from a normal distribution with mean $-1$ and variance $4$. It is not hard to see that the matrix $M$ generated in this way is positive semidefinite (respectively, positive definite) if $\tau = 0$ (respectively, $\tau > 0$), which means that the corresponding $F$ is monotone (respectively, strongly monotone). The vector $b$ was obtained by setting $b = -Mw$ with $w = (w_1, \ldots, w_m) \in K^n_+$, where each $w_i \in K^{n_i}_+$ was generated as follows: the elements of $w_i$ were chosen randomly from a normal distribution with mean $-1$ and variance $4$, and then the first element $w_{i1}$ of $w_i$ was set to $\|w_{i2}\|$, where $w_{i2}$ is the vector composed of the remaining $n_i - 1$ components of $w_i$. In this way, the affine SOCCP is guaranteed to have a solution $\zeta^* = w$.
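The generator below is a sketch of this construction (the function name is ours; we use dense blocks rather than Matlab's sparse storage, and `rng.normal(-1.0, 2.0, ...)` draws from the stated $N(-1, 4)$ distribution, the second argument being the standard deviation).

```python
import numpy as np

def make_soccp(m=100, ni=10, tau=0.1, seed=0):
    """Random affine SOCCP (54): M = diag(M_1,...,M_m), M_i = N_i N_i^T + tau*I
    with N_i having ~1% nonzeros drawn from N(-1, 4), and b = -M w for a
    w in the cone, so that zeta* = w solves (54)."""
    rng = np.random.default_rng(seed)
    n = m * ni
    M, w = np.zeros((n, n)), np.zeros(n)
    for i in range(m):
        N = rng.normal(-1.0, 2.0, size=(ni, ni))
        N *= rng.random((ni, ni)) < 0.01            # ~1% nonzero density
        M[i*ni:(i+1)*ni, i*ni:(i+1)*ni] = N @ N.T + tau * np.eye(ni)
        wi = rng.normal(-1.0, 2.0, size=ni)
        wi[0] = np.linalg.norm(wi[1:])              # w_i on the SOC boundary
        w[i*ni:(i+1)*ni] = wi
    return M, -M @ w, w                             # (M, b, solution zeta*)
```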
All experiments were done on a PC with a Pentium 4 2.8GHz CPU and 512MB of memory. The computer codes were written in Matlab 6.5. During the tests, we chose $n_i$ and $m$ such that $n_1 = \cdots = n_m = 10$ and $m = 100$. We set $m(k)$ in Algorithm 4.1 as
$$m(k) := \begin{cases} 0 & k < 5,\\ \min\{m(k-1)+1, M-1\} & \text{otherwise}, \end{cases}$$
with $M = 6$.
We started Algorithms 4.1 and 4.2 from the initial point $\zeta^0 = (\bar\zeta_{n_1}, \ldots, \bar\zeta_{n_m})$ with $\bar\zeta_{n_i} = (10, \omega_i/\|\omega_i\|)$, where each $\omega_i \in \mathbb{R}^{n_i-1}$ was generated randomly by Matlab's rand.m. The parameters $\gamma$ and $\delta$ in the two algorithms, and $\beta$ in Algorithm 4.2, were chosen as
$$\gamma = 0.2, \quad \delta = 10^{-10}, \quad \beta = 0.1.$$
The algorithms were terminated once one of the following conditions was satisfied:
(a) $\min\{\Psi_\alpha(\zeta^k), |\langle\zeta^k, F(\zeta^k)\rangle|\} \leq 10^{-5}$;
(b) the stepsize is less than $10^{-8}$;
(c) the iteration count exceeds $5 \times 10^5$.
If an algorithm stops under condition (a), we say that it solves the test problem successfully; otherwise we say that it fails on the test problem.
We first tested the influence of $\alpha$ on the iterations and function evaluations needed by Algorithms 4.1 and 4.2 for solving (54) with $\tau = 0.1$ in each $M_i$. For every $\alpha = 2, 5, 10, 20, 40, 50, 60, 80, 100, 150, 200$, we applied Algorithm 4.1 with $\theta = 0.95$ and Algorithm 4.2, respectively, to the same 50 test problems generated as above. The average iteration count and average number of function evaluations were taken over the test problems solved successfully. The results show that Algorithm 4.1 with $\alpha = 2$ failed on 4 test problems due to too small a stepsize and successfully solved all test problems for the other values of $\alpha$, whereas Algorithm 4.2 with $\alpha = 2$ and $\alpha = 5$ failed on 11 and 1 test problems, respectively, due to too small a stepsize, and successfully solved all test problems for the remaining values of $\alpha$.
Figures 1 and 2 depict the curves of the average number of function evaluations and the average iteration count, respectively, of Algorithms 4.1 and 4.2 with respect to $\alpha$. From these figures, we see that the number of function evaluations and the number of iterations needed by Algorithms 4.1 and 4.2 increase with $\alpha$. Taking into account that the global convergence of the two algorithms is not stable when $\alpha$ is close to 1 (for example, they fail on some test problems when $\alpha = 2$), a desirable choice of $\alpha$ lies in the interval $[10, 50]$. Also, the average function evaluations and iterations of Algorithm 4.2 exceed those of Algorithm 4.1, especially when $\alpha > 40$. This indicates that the nonmonotone derivative-free method performs better than the monotone descent one.
We then tested the influence of $\theta$ on the rate of convergence of Algorithm 4.1 by using the algorithm with $\alpha = 15$ and four different values of $\theta$ to solve a test example generated as above with $\tau = 0.01$. Figure 3 depicts the convergence curves of Algorithm 4.1. From this figure, we see that the curve corresponding to $\theta = 0.5$ has the largest slope, the curve corresponding to $\theta = 10^{-4}$ has the smallest slope, and for $\theta \leq 0.1$ a smaller $\theta$ gives a smaller slope. This shows that Algorithm 4.1 with a smaller $\theta$ has a better rate of convergence, and that it has the worst rate of convergence when $\theta = 0.5$. This coincides with the theoretical results of Theorem 4.2.
[Figure 1: Influence of $\alpha$ on the average function evaluations of Algorithms 4.1 and 4.2.]
[Figure 2: Influence of $\alpha$ on the average iterations of Algorithms 4.1 and 4.2.]
We also tested the influence of $\theta$ on the performance of Algorithm 4.1. Specifically, for every $\theta = 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95$, we employed Algorithm 4.1 with $\alpha = 15$ to solve the same 50 test problems generated as above with $\tau = 0$. Note that this class of problems is more difficult than the one used above, since the mapping $F$ is now only monotone instead of strongly monotone. The results show that Algorithm 4.1 successfully solved all test problems for all these values of $\theta$. This indicates that Algorithm 4.1 is also suitable for solving monotone SCCPs, although global convergence of the generated sequence is not established for this class of problems. Figure 4 depicts the curves of the function evaluations and iteration counts of Algorithm 4.1 with respect to $\theta$. From this figure, we see that Algorithm 4.1 has the worst performance when $\theta = 0.5$, and a desirable $\theta$ should be taken from the interval $[0.2, 0.4]$ or $[0.9, 1)$.
6 Conclusion
We have extended the derivative-free method [18] for the NCP to general SCCPs by using a different search direction. It was shown that the algorithm converges in terms of the value of $\Psi_\alpha$ for a large class of SCCPs that may not even be monotone, whereas if $\theta \leq \bar\theta$ with $\bar\theta$ given by (22) and $F$ is strongly monotone, the sequence generated by the algorithm converges globally to the solution of the problem at an R-linear rate. It is interesting to note that the linear convergence rate of the nonmonotone descent algorithm is obtained without requiring any convexity of $\Psi_\alpha$; the relation among $R_1(\zeta)$, $h(\zeta)$ and $\Psi_\alpha(\zeta)$ plays a key role. In future research, it is worthwhile to study the convergence rate of nonmonotone derivative-free methods based on other merit functions, and to explore other derivative-free methods for SCCPs, for example pattern search algorithms.
References
[1] J.-S. Chen, H.-T. Gao and S.-H. Pan, A derivative-free R-linearly convergent algorithm based on the generalized Fischer-Burmeister merit function, Journal of Computational and Applied Mathematics, vol. 232, pp. 455-471, 2009.
[2] Y.-H. Dai, On the nonmonotone line search, Journal of Optimization Theory and Applications, vol. 112, pp. 315-330, 2002.
[3] L. Faybusovich, Euclidean Jordan algebras and interior-point algorithms, Positivity, vol. 1, pp. 331-357, 1997.
[Figure 3: Convergence of Algorithm 4.1 with different $\theta$ (merit function value versus iterations, for $\theta = 10^{-4}, 0.05, 0.5, 0.95$).]
[Figure 4: Influence of $\theta$ on the performance of Algorithm 4.1 (average function evaluations and average iterations versus $\theta$).]