to appear in Pacific Journal of Optimization, January, 2016
New smoothing functions for solving a system of equalities and inequalities
Jein-Shan Chen 1
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
E-mail: jschen@math.ntnu.edu.tw

Chun-Hsu Ko
Department of Electrical Engineering, I-Shou University, Kaohsiung 840, Taiwan
E-mail: chko@isu.edu.tw

Yan-Di Liu
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
E-mail: 60140026S@ntnu.edu.tw

Sheng-Pen Wang 2
Department of Industrial and Business Management, Chang Gung University, Taoyuan 333, Taiwan
E-mail: wangsp@mail.cgu.edu.tw
September 22, 2014 (revised on November 20, 2014)
Abstract. In this paper, we propose a family of new smoothing functions for solving a system of equalities and inequalities, which generalizes the one considered in [11]. We then investigate an algorithm based on a new reformulation Ĥ with lower dimensionality and show, as in [11], that it is globally and locally convergent under suitable assumptions. Numerical evidence shows the better performance of the algorithm in the sense that some examples unsolved in [11] can be solved by our proposed method. Moreover, the parameter p involved in the family of new smoothing functions has no influence on the algorithm, which is a new discovery in the literature.

1 Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is supported by the Ministry of Science and Technology, Taiwan.
2 Corresponding author.
Keywords. Smoothing function, System of equations and inequalities, Convergence
1 Introduction and Motivation
The target problem of this paper is the following system of equalities and inequalities:
$$\begin{cases} f_I(x) \le 0, \\ f_E(x) = 0, \end{cases} \qquad (1)$$
where I = {1, 2, · · · , m} and E = {m + 1, m + 2, · · · , n}. In other words, the function f_I : IR^n → IR^m is given by

$$f_I(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_m(x) \end{pmatrix},$$

where f_i : IR^n → IR for i ∈ {1, 2, · · · , m}; and the function f_E : IR^n → IR^{n-m} is given by

$$f_E(x) = \begin{pmatrix} f_{m+1}(x) \\ f_{m+2}(x) \\ \vdots \\ f_n(x) \end{pmatrix},$$

where f_j : IR^n → IR for j ∈ {m + 1, m + 2, · · · , n}. For simplicity, throughout this paper, we denote f : IR^n → IR^n as

$$f(x) := \begin{pmatrix} f_I(x) \\ f_E(x) \end{pmatrix} = \begin{pmatrix} f_1(x) \\ \vdots \\ f_m(x) \\ f_{m+1}(x) \\ \vdots \\ f_n(x) \end{pmatrix}$$
and assume that f is continuously differentiable. When E is the empty set, the system (1) reduces to a system of inequalities, whereas it reduces to a system of equations when I is empty.
Problems in the form of (1) arise in real applications, including data analysis, computer-aided design, image reconstruction, and set separation problems. Many optimization methods have been proposed for solving the system (1), for instance, the non-interior continuation method [12], smoothing-type algorithms [6, 11], the Newton algorithm [14], and iteration methods [4, 7, 8, 10]. In this paper, we consider a smoothing-type algorithm similar to the ones studied in [6, 11] for solving the system (1). In particular, we propose a family of smoothing functions, investigate its properties, and report the numerical performance of an algorithm in which this family of new smoothing functions is involved.
As seen in [6, 11], the main idea of the smoothing-type algorithm for solving the system (1) is to reformulate system (1) as a system of smooth equations via the projection function. More specifically, for any x = (x_1, x_2, · · · , x_n) ∈ IR^n, one defines
$$(x)_+ := \begin{pmatrix} \max\{0, x_1\} \\ \vdots \\ \max\{0, x_n\} \end{pmatrix}.$$
Then, the system (1) is equivalent to the following system of equations:
$$\begin{cases} (f_I(x))_+ = 0, \\ f_E(x) = 0. \end{cases} \qquad (2)$$
Note that the function (f_I(x))_+ in the reformulation (2) is nonsmooth, so the classical Newton methods cannot be directly applied to solve (2). To overcome this, a smoothing algorithm was considered in [6, 11], in which the following smoothing function was employed:
$$\varphi(\mu, t) = \begin{cases} t & \text{if } t \ge \mu, \\ \dfrac{(t+\mu)^2}{4\mu} & \text{if } -\mu < t < \mu, \\ 0 & \text{if } t \le -\mu, \end{cases} \qquad (3)$$
where µ > 0.
In this paper, we propose a family of new smoothing functions, which includes the function ϕ(µ, t) given as in (3) as a special case, for solving the reformulation (2). More specifically, we consider the family of smoothing functions as below:
$$\varphi_p(\mu, t) = \begin{cases} t & \text{if } t \ge \dfrac{\mu}{p-1}, \\ \dfrac{\mu}{p-1}\left[\dfrac{(p-1)(t+\mu)}{p\mu}\right]^p & \text{if } -\mu < t < \dfrac{\mu}{p-1}, \\ 0 & \text{if } t \le -\mu, \end{cases} \qquad (4)$$
where µ > 0 and p ≥ 2. Note that ϕ_p reduces to the smoothing function studied in [11] when p = 2. The graphs of ϕ_p with different values of p and various µ are depicted in Figures 1-3.
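To make (4) concrete, the following is a minimal NumPy sketch (our own illustration rather than the authors' Matlab code; the name phi_p is ours) that evaluates ϕ_p and numerically confirms that p = 2 recovers (3) and that a small µ approximates the plus function:

```python
import numpy as np

def phi_p(mu, t, p=2.0):
    """Evaluate the smoothing function (4) at (mu, t), with mu > 0 and p >= 2."""
    t = np.asarray(t, dtype=float)
    # Clip the base at 0 so the middle branch is never raised to a power on
    # negative arguments (those points are covered by the other branches).
    base = np.clip((p - 1.0) * (t + mu) / (p * mu), 0.0, None)
    mid = mu / (p - 1.0) * base ** p
    return np.where(t >= mu / (p - 1.0), t, np.where(t > -mu, mid, 0.0))

t = np.linspace(-2.0, 2.0, 401)
# p = 2 recovers the smoothing function (3) studied in [11].
ref = np.where(t >= 1.0, t, np.where(t > -1.0, (t + 1.0) ** 2 / 4.0, 0.0))
print(np.allclose(phi_p(1.0, t, p=2.0), ref))                        # True
# For small mu, phi_p(mu, .) is uniformly close to the plus function.
print(np.max(np.abs(phi_p(1e-6, t, p=3.0) - np.maximum(t, 0.0))))    # ~1e-7
```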
Proposition 1.1. Let ϕ_p be defined as in (4). Then, for any (µ, t) ∈ IR_{++} × IR, the following hold.

(a) ϕ_p(·, ·) is continuously differentiable at (µ, t).
(b) ϕ_p(0, t) = (t)_+.
(c) ∂ϕ_p(µ, t)/∂t ≥ 0.
(d) lim_{p→∞} ϕ_p(µ, t) = (t)_+.
Proof. (a) First, we calculate ∂ϕ_p(µ, t)/∂t and ∂ϕ_p(µ, t)/∂µ as below:

$$\frac{\partial \varphi_p(\mu, t)}{\partial t} = \begin{cases} 1 & \text{if } t \ge \frac{\mu}{p-1}, \\ \left[\frac{(p-1)(t+\mu)}{p\mu}\right]^{p-1} & \text{if } -\mu < t < \frac{\mu}{p-1}, \\ 0 & \text{if } t \le -\mu, \end{cases}$$

$$\frac{\partial \varphi_p(\mu, t)}{\partial \mu} = \begin{cases} 0 & \text{if } t \ge \frac{\mu}{p-1}, \\ \left[\frac{(p-1)(t+\mu)}{p\mu}\right]^{p-1} \dfrac{t+\mu-pt}{p\mu} & \text{if } -\mu < t < \frac{\mu}{p-1}, \\ 0 & \text{if } t \le -\mu. \end{cases}$$

Then, we see that ∂ϕ_p(µ, t)/∂t is continuous because

$$\lim_{t \to \frac{\mu}{p-1}} \frac{\partial \varphi_p(\mu, t)}{\partial t} = \left[\frac{(p-1)\big(\frac{\mu}{p-1}+\mu\big)}{p\mu}\right]^{p-1} = 1, \qquad \lim_{t \to -\mu} \frac{\partial \varphi_p(\mu, t)}{\partial t} = \left[\frac{(p-1)(-\mu+\mu)}{p\mu}\right]^{p-1} = 0,$$

and ∂ϕ_p(µ, t)/∂µ is continuous since

$$\lim_{t \to \frac{\mu}{p-1}} \frac{\partial \varphi_p(\mu, t)}{\partial \mu} = \left[\frac{(p-1)\big(\frac{\mu}{p-1}+\mu\big)}{p\mu}\right]^{p-1} \frac{\frac{\mu}{p-1}+\mu-\frac{p\mu}{p-1}}{p\mu} = 0, \qquad \lim_{t \to -\mu} \frac{\partial \varphi_p(\mu, t)}{\partial \mu} = \left[\frac{(p-1)(-\mu+\mu)}{p\mu}\right]^{p-1} \frac{-\mu+\mu-p(-\mu)}{p\mu} = 0.$$

The above verifications imply that ϕ_p(·, ·) is continuously differentiable.
(b) From the definition of ϕ_p(µ, t), it is clear that

$$\varphi_p(0, t) = \begin{cases} t & \text{if } t \ge 0 \\ 0 & \text{if } t \le 0 \end{cases} \;=\; (t)_+,$$

which is the desired result.
(c) When −µ < t < µ/(p−1), we have t + µ > 0. Hence, from the expression of ∂ϕ_p(µ, t)/∂t, it is obvious that $\left[\frac{(p-1)(t+\mu)}{p\mu}\right]^{p-1} \ge 0$, which says ∂ϕ_p(µ, t)/∂t ≥ 0.
(d) Part (d) is clear from the definition. □
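As a sanity check on the closed-form derivatives computed in part (a), one can compare them with central finite differences. The sketch below is our own (reusing phi_p from the earlier sketch; dphi_dt and dphi_dmu are our names):

```python
import numpy as np

def dphi_dt(mu, t, p=2.0):
    """Closed-form partial derivative of phi_p with respect to t."""
    base = np.clip((p - 1.0) * (t + mu) / (p * mu), 0.0, None)
    return np.where(t >= mu / (p - 1.0), 1.0,
                    np.where(t > -mu, base ** (p - 1.0), 0.0))

def dphi_dmu(mu, t, p=2.0):
    """Closed-form partial derivative of phi_p with respect to mu."""
    base = np.clip((p - 1.0) * (t + mu) / (p * mu), 0.0, None)
    mid = base ** (p - 1.0) * (t + mu - p * t) / (p * mu)
    return np.where(t >= mu / (p - 1.0), 0.0, np.where(t > -mu, mid, 0.0))

mu, p, h = 0.7, 3.0, 1e-6
for t in [-1.2, -0.3, 0.1, 0.9]:   # points covering all three branches of (4)
    fd_t = (phi_p(mu, t + h, p) - phi_p(mu, t - h, p)) / (2.0 * h)
    fd_mu = (phi_p(mu + h, t, p) - phi_p(mu - h, t, p)) / (2.0 * h)
    assert abs(fd_t - dphi_dt(mu, t, p)) < 1e-6
    assert abs(fd_mu - dphi_dmu(mu, t, p)) < 1e-6
```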
The properties of ϕ_p in Proposition 1.1 can also be verified via the graphs. In particular, in Figures 1-2, we see that as µ → 0, ϕ_p(µ, t) goes to (t)_+, which verifies Proposition 1.1(b).
Figure 1: Graphs of ϕ_p(µ, t) with p = 2 and µ = 0.1, 0.5, 1, 2.
Figure 3 shows that, for fixed µ > 0, ϕ_p(µ, t) approaches (t)_+ as p → ∞. This also verifies Proposition 1.1(d).
Next, we will form another reformulation for problem (1). To this end, we define
$$F(z) := \begin{pmatrix} f_I(x) - s \\ f_E(x) \\ \Phi_p(\mu, s) \end{pmatrix} \quad \text{with} \quad \Phi_p(\mu, s) := \begin{pmatrix} \varphi_p(\mu, s_1) \\ \vdots \\ \varphi_p(\mu, s_m) \end{pmatrix} \quad \text{and} \quad z = (\mu, x, s), \qquad (5)$$

where Φ_p is a mapping from IR^{1+m} to IR^m. Then, in light of Proposition 1.1(b), we see that

$$F(z) = 0 \ \text{and} \ \mu = 0 \iff s = f_I(x), \ (s)_+ = 0, \ f_E(x) = 0.$$
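For illustration, (5) can be transcribed directly; the sketch below is ours (make_F is a hypothetical helper, and phi_p is the function from the earlier sketch):

```python
import numpy as np

def make_F(fI, fE, n, p=2.0):
    """Build F(z) of (5) for z = (mu, x, s) with x in R^n."""
    def F(z):
        mu, x, s = z[0], z[1:1 + n], z[1 + n:]
        return np.concatenate([fI(x) - s, fE(x), phi_p(mu, s, p)])
    return F
```

At µ = 0 the last block reduces to (s)_+ by Proposition 1.1(b), which is exactly the equivalence displayed above.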
Figure 2: Graphs of ϕ_p(µ, t) with p = 10 and µ = 0.1, 0.5, 1, 2.
Figure 3: Graphs of ϕ_p(µ, t) with p = 2, 3, 10, 20 and µ = 0.2.
This, together with Proposition 1.1(a), indicates that one can solve system (1) by applying Newton-type methods to F(z) = 0 while letting µ ↓ 0. Furthermore, by adding perturbation terms involving µ, we define a function H : IR^{1+n+m} → IR^{1+n+m} by
$$H(z) := \begin{pmatrix} \mu \\ f_I(x) - s + \mu x_I \\ f_E(x) + \mu x_E \\ \Phi_p(\mu, s) + \mu s \end{pmatrix}, \qquad (6)$$
where x_I = (x_1, x_2, · · · , x_m), x_E = (x_{m+1}, x_{m+2}, · · · , x_n), s ∈ IR^m, x := (x_I, x_E) ∈ IR^n, and the functions ϕ_p and Φ_p are defined as in (4) and (5), respectively. Thereby, it is obvious that if H(z) = 0, then µ = 0 and x solves the system (1). It is not difficult to see that, for any z ∈ IR_{++} × IR^n × IR^m, the function H is continuously differentiable. Let H′ denote the Jacobian of the function H. Then, for any z ∈ IR_{++} × IR^n × IR^m, we have
$$H'(z) = \begin{pmatrix} 1 & O_{1\times n} & O_{1\times m} \\ \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix} & A & \begin{pmatrix} -1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & -1 \end{pmatrix} \\ \begin{pmatrix} x_{m+1} \\ \vdots \\ x_n \end{pmatrix} & B & 0_{(n-m)\times m} \\ \begin{pmatrix} s_1 + \frac{\partial \varphi_p(\mu, s_1)}{\partial \mu} \\ \vdots \\ s_m + \frac{\partial \varphi_p(\mu, s_m)}{\partial \mu} \end{pmatrix} & 0_{m\times n} & \begin{pmatrix} \frac{\partial \varphi_p(\mu, s_1)}{\partial s_1} + \mu & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \frac{\partial \varphi_p(\mu, s_m)}{\partial s_m} + \mu \end{pmatrix} \end{pmatrix},$$

where

$$A := f_I'(x) + \mu \begin{bmatrix} I_m & 0_{m\times(n-m)} \end{bmatrix} \in IR^{m\times n} \quad \text{and} \quad B := f_E'(x) + \mu \begin{bmatrix} 0_{(n-m)\times m} & I_{n-m} \end{bmatrix} \in IR^{(n-m)\times n}.$$
With the above, we can simplify the matrix H′(z) as
$$H'(z) = \begin{pmatrix} 1 & 0_n^{T} & 0_m^{T} \\ x_I & f_I'(x) + \mu U & -I_m \\ x_E & f_E'(x) + \mu V & 0_{(n-m)\times m} \\ s + \Phi'_\mu(\mu, s) & 0_{m\times n} & \Phi'_s(\mu, s) + \mu I_m \end{pmatrix}, \qquad (7)$$

where

$$U := \begin{bmatrix} I_m & 0_{m\times(n-m)} \end{bmatrix}, \qquad V := \begin{bmatrix} 0_{(n-m)\times m} & I_{n-m} \end{bmatrix},$$

$$s + \Phi'_\mu(\mu, s) = \begin{pmatrix} s_1 + \frac{\partial \varphi_p(\mu, s_1)}{\partial \mu} \\ \vdots \\ s_m + \frac{\partial \varphi_p(\mu, s_m)}{\partial \mu} \end{pmatrix}, \qquad \Phi'_s(\mu, s) + \mu I_m = \begin{pmatrix} \frac{\partial \varphi_p(\mu, s_1)}{\partial s_1} + \mu & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \frac{\partial \varphi_p(\mu, s_m)}{\partial s_m} + \mu \end{pmatrix}.$$
Here, we use 0_l to denote the l-dimensional zero vector and 0_{l×q} to denote the l × q zero matrix for any positive integers l and q. Thus, we may apply a Newton-type method to the system of smooth equations H(z) = 0, keeping µ > 0 at each iteration and driving H(z) → 0, so that a solution of (1) can be found. This is the main idea of the smoothing approach for solving system (1).
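The block structure of (6) and (7) can be transcribed directly. The sketch below is our own Python illustration rather than the authors' Matlab code (make_H and the toy problem are hypothetical; phi_p, dphi_dt, dphi_dmu come from the earlier sketches); it assembles H and H′ and checks (7) against a finite-difference Jacobian:

```python
import numpy as np

def make_H(fI, fE, JI, JE, m, n, p=2.0):
    """Build H(z) of (6) and H'(z) of (7), where JI, JE return f_I'(x), f_E'(x)."""
    def H(z):
        mu, x, s = z[0], z[1:1 + n], z[1 + n:]
        return np.concatenate(([mu],
                               fI(x) - s + mu * x[:m],
                               fE(x) + mu * x[m:],
                               phi_p(mu, s, p) + mu * s))
    def Hprime(z):
        mu, x, s = z[0], z[1:1 + n], z[1 + n:]
        U = np.hstack([np.eye(m), np.zeros((m, n - m))])
        V = np.hstack([np.zeros((n - m, m)), np.eye(n - m)])
        J = np.zeros((1 + n + m, 1 + n + m))
        J[0, 0] = 1.0
        J[1:1 + m, 0] = x[:m]                      # block row of f_I(x) - s + mu*x_I
        J[1:1 + m, 1:1 + n] = JI(x) + mu * U
        J[1:1 + m, 1 + n:] = -np.eye(m)
        J[1 + m:1 + n, 0] = x[m:]                  # block row of f_E(x) + mu*x_E
        J[1 + m:1 + n, 1:1 + n] = JE(x) + mu * V
        J[1 + n:, 0] = s + dphi_dmu(mu, s, p)      # block row of Phi_p(mu, s) + mu*s
        J[1 + n:, 1 + n:] = np.diag(dphi_dt(mu, s, p) + mu)
        return J
    return H, Hprime

# Toy problem with m = 1 inequality and n - m = 1 equality.
fI = lambda x: np.array([x[0] ** 2 + x[1] - 1.0])
fE = lambda x: np.array([x[0] - x[1]])
JI = lambda x: np.array([[2.0 * x[0], 1.0]])
JE = lambda x: np.array([[1.0, -1.0]])
H, Hp = make_H(fI, fE, JI, JE, m=1, n=2, p=3.0)

z = np.array([0.5, 0.3, -0.2, 0.1])
Jfd = np.zeros((4, 4))
for j in range(4):                                 # finite-difference Jacobian
    e = np.zeros(4); e[j] = 1e-6
    Jfd[:, j] = (H(z + e) - H(z - e)) / 2e-6
print(np.max(np.abs(Jfd - Hp(z))))                 # ~1e-9: matches (7)
```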
Alternatively, one may have another smoothing reformulation for system (1) without introducing the extra variable s. More specifically, we can define Ĥ : IR^{1+n} → IR^{1+n} as

$$\widehat{H}(\mu, x) := \begin{pmatrix} \mu \\ f_E(x) + \mu x_E \\ \Phi_p(\mu, f_I(x)) + \mu x_I \end{pmatrix}. \qquad (8)$$
The Jacobian of Ĥ(µ, x) is similar to H′(z) and indeed is a bit tedious, so we omit its presentation here. The reformulation Ĥ(µ, x) = 0 has lower dimensionality than H(z) = 0, whereas the expression of Ĥ′(µ, x) is more tedious than that of H′(z). Both smoothing approaches can lead to the solution of system (1). The numerical results based on H(z) = 0 and Ĥ(µ, x) = 0 are compared in this paper. Moreover, we also investigate how the parameter p affects the numerical performance when different ϕ_p are employed. Proposing the new family of smoothing functions, together with the above two numerical aspects, constitutes the main motivation and contribution of this paper.
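A corresponding minimal sketch of (8) (again ours; make_H_hat is a hypothetical name, and phi_p, fI, fE are from the earlier sketches) needs no slack vector s:

```python
import numpy as np

def make_H_hat(fI, fE, m, n, p=2.0):
    """Build the lower-dimensional map (8) on (mu, x) in R^{1+n}."""
    def H_hat(w):
        mu, x = w[0], w[1:]
        return np.concatenate(([mu],
                               fE(x) + mu * x[m:],
                               phi_p(mu, fI(x), p) + mu * x[:m]))
    return H_hat

Hh = make_H_hat(fI, fE, m=1, n=2)      # reuse the toy fI, fE from above
print(Hh(np.array([0.5, 0.3, -0.2])))  # a vector in R^{1+2}
```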
2 A smoothing-type algorithm
In this section, we consider a nonmonotone smoothing-type algorithm whose framework is similar to the one discussed in [6, 11]. In particular, we correct a flaw in Step 5 of [11] and show that only this modification makes the algorithm well-defined. Moreover, to cover Ĥ(µ, x), the reformulation of H(z) with lower dimensionality, we use the merit function ψ(·) := ∥H(z)∥² or ψ(·) := ∥Ĥ(µ, x)∥², as appropriate. Below are the details of the algorithm.
Algorithm 2.1. (A Nonmonotone Smoothing-Type Algorithm)
Step 0. Choose δ ∈ (0, 1), σ ∈ (0, 1/2), β > 0. Take τ ∈ (0, 1) such that τβ < 1. Let µ_0 = β and (x^0, s^0) ∈ IR^{n+m} be an arbitrary vector. Set z^0 := (µ_0, x^0, s^0). Take e^0 := (1, 0, · · · , 0) ∈ IR^{1+n+m}, R_0 := ∥H(z^0)∥^2 = ψ(z^0) and Q_0 = 1. Choose η_min and η_max such that 0 ≤ η_min ≤ η_max < 1. Set θ(z^0) := τ min{1, ψ(z^0)} and k := 0.

Step 1. If ∥H(z^k)∥ = 0, stop.

Step 2. Compute Δz^k := (Δµ_k, Δx^k, Δs^k) ∈ IR × IR^n × IR^m from

$$H'(z^k)\, \Delta z^k = -H(z^k) + \beta\, \theta(z^k)\, e^0. \qquad (9)$$

Step 3. Let α_k be the maximum of the values 1, δ, δ^2, · · · such that

$$\psi(z^k + \alpha_k \Delta z^k) \le \left[1 - 2\sigma(1 - \tau\beta)\alpha_k\right] R_k. \qquad (10)$$

Step 4. Set z^{k+1} := z^k + α_k Δz^k. If ∥H(z^{k+1})∥ = 0, stop.

Step 5. Choose η_k ∈ [η_min, η_max]. Set

$$Q_{k+1} := \eta_k Q_k + 1, \qquad \theta(z^{k+1}) := \min\{\tau, \ \tau\psi(z^{k+1}), \ \theta(z^k)\}, \qquad R_{k+1} := \frac{\eta_k Q_k R_k + \psi(z^{k+1})}{Q_{k+1}}, \qquad (11)$$

and k := k + 1. Go to Step 2.
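A compact Python transcription of Algorithm 2.1 is sketched below; it is our own hedged rendering, not the authors' Matlab implementation (the iteration cap, the backtracking floor, and the tolerance-based stopping test replacing the exact test ∥H(z^k)∥ = 0 are our implementation choices, and a fixed η_k = η is used although the algorithm allows any η_k ∈ [η_min, η_max]):

```python
import numpy as np

def algorithm_2_1(H, Hprime, z0, beta=1.0, tau=0.2, delta=0.3, sigma=0.001,
                  eta=0.01, tol=1e-6, max_iter=100):
    """Nonmonotone smoothing-type algorithm; z0[0] plays the role of mu_0 = beta."""
    assert 0.0 < tau < 1.0 and tau * beta < 1.0
    psi = lambda v: np.linalg.norm(H(v)) ** 2
    e0 = np.zeros_like(z0); e0[0] = 1.0
    z, R, Q = z0.copy(), psi(z0), 1.0                  # Step 0
    theta = tau * min(1.0, psi(z))
    for _ in range(max_iter):
        if np.linalg.norm(H(z)) <= tol:                # Step 1 (inexact stopping)
            break
        dz = np.linalg.solve(Hprime(z), -H(z) + beta * theta * e0)   # Step 2, (9)
        alpha = 1.0                                    # Step 3, line search (10)
        while psi(z + alpha * dz) > (1.0 - 2.0 * sigma * (1.0 - tau * beta) * alpha) * R:
            alpha *= delta
            if alpha < 1e-12:
                break
        z = z + alpha * dz                             # Step 4
        Qn = eta * Q + 1.0                             # Step 5, updates (11)
        theta = min(tau, tau * psi(z), theta)          # the corrected theta update
        R = (eta * Q * R + psi(z)) / Qn
        Q = Qn
    return z
```

With the toy H, Hp built after (7), one would invoke it as z = algorithm_2_1(H, Hp, np.array([1.0, 0.0, 0.0, 0.0])).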
In Algorithm 2.1, a nonmonotone line search technique is adopted. It is easy to see that R_{k+1} is a convex combination of R_k and ψ(z^{k+1}). Since R_0 = ψ(z^0), it follows that R_k is a convex combination of the function values ψ(z^0), ψ(z^1), · · · , ψ(z^k). The choice of η_k controls the degree of nonmonotonicity: if η_k = 0 for every k, then the line search is the usual monotone Armijo line search. The scheme of Algorithm 2.1 is not exactly the same as the one in [11]. In particular, we take θ(z^{k+1}) := min{τ, τψ(z^{k+1}), θ(z^k)}, which is different from θ(z^{k+1}) := min{τ, τψ(z^k), θ(z^k)} given in [11]. Only this modification makes the algorithm well-defined, as shown in Theorem 2.1 below. For convenience, we denote

$$f'(x) := \begin{bmatrix} f_I'(x) \\ f_E'(x) \end{bmatrix}$$
and make the following assumption.
Assumption 2.1. f′(x) + µI_n is invertible for any x ∈ IR^n and µ ∈ IR_{++}.
Some basic properties of Algorithm 2.1 are stated in the following lemma. Since the proof arguments are almost the same as those in [11], they are thus omitted.
Lemma 2.1. Let the sequences {R_k} and {z^k} be generated by Algorithm 2.1. Then, the following hold.
(a) The sequence {R_k} is monotonically decreasing.
(b) ψ(z^k) ≤ R_k for all k ∈ J.
(c) The sequence {θ(z^k)} is monotonically decreasing.
(d) βθ(z^k) ≤ µ_k for all k ∈ J.
(e) µ_k > 0 for all k ∈ J, and the sequence {µ_k} is monotonically decreasing.
Lemma 2.2. Suppose that A ∈ IR^{n×n} is partitioned as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$$

where A_{11} and A_{22} are square matrices. If A_{12} or A_{21} is a zero matrix, then det(A) = det(A_{11}) · det(A_{22}).
Proof. This is a well-known result in matrix analysis, which is a special case of Fischer's inequality [1, 5]. Please refer to [9, Theorem 7.3] for a proof. □
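Lemma 2.2 can also be sanity-checked numerically; the following tiny NumPy illustration is ours, not part of the original paper:

```python
import numpy as np

rng = np.random.default_rng(0)
A11 = rng.standard_normal((3, 3))
A22 = rng.standard_normal((2, 2))
A12 = rng.standard_normal((3, 2))
A = np.block([[A11, A12],
              [np.zeros((2, 3)), A22]])   # A21 is the zero matrix
lhs = np.linalg.det(A)
rhs = np.linalg.det(A11) * np.linalg.det(A22)
print(np.isclose(lhs, rhs))               # True
```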
Theorem 2.1. Suppose that f is a continuously differentiable function and Assumption 2.1 is satisfied. Then Algorithm 2.1 is well defined.
Proof. Applying Lemmas 2.1-2.2 and mimicking the arguments in [11], it is easy to achieve the desired result. However, we point out again that θ(z^{k+1}) in Step 5 is different from the one in [11]; only this modification makes the algorithm well-defined. □
3 Convergence analysis
In this section, we analyze the convergence of the algorithm proposed in the previous section. To this end, the following assumption, introduced in [6], is needed.
Assumption 3.1. For an arbitrary sequence {(µ_k, x^k)} with lim_{k→∞} ∥x^k∥ = +∞ and {µ_k} ⊂ IR_+ bounded, either

(i) there is at least one index i_0 such that lim sup_{k→∞} {f_{i_0}(x^k) + µ_k x^k_{i_0}} = +∞; or

(ii) there is at least one index i_0 such that lim sup_{k→∞} {µ_k (f_{i_0}(x^k) + µ_k x^k_{i_0})} = −∞.
It can be seen that many functions satisfy Assumption 3.1; see [6]. The global convergence of Algorithm 2.1 is stated as follows. In fact, with Proposition 1.1, the main idea of the proof is almost the same as that of [11, Theorem 4.1]; only a few technical parts are different. Thus, we omit the details.
Theorem 3.1. Suppose that f is a continuously differentiable function and Assumptions 2.1 and 3.1 are satisfied. Then, the sequence {z^k} generated by Algorithm 2.1 is bounded. Moreover, any accumulation point of {x^k} is a solution of (1).
Next, we analyze the convergence rate of Algorithm 2.1. Before presenting the result, we introduce some concepts that will be used in the subsequent analysis, as well as a technical lemma.
A locally Lipschitz function F : IR^n → IR^m, which has the generalized Jacobian ∂F(x), is said to be semismooth (respectively, strongly semismooth) at x ∈ IR^n if F is directionally differentiable at x and

$$F(x + h) - F(x) - Vh = o(\|h\|) \quad (\text{respectively, } = O(\|h\|^2))$$

holds for any V ∈ ∂F(x + h). It is well known that convex functions, smooth functions, and piecewise linear functions are examples of semismooth functions. The composition of (strongly) semismooth functions is still a (strongly) semismooth function. It can be verified that the function ϕ_p defined by (4) is strongly semismooth on IR^2. Thus, f being continuously differentiable implies that the function H defined by (6) and Ĥ defined by (8) are semismooth (and strongly semismooth if f′ is Lipschitz continuous on IR^n).
Lemma 3.1. For any α, β ∈ IR_{++}, α = O(β) means that α/β is uniformly bounded, and α = o(β) means that α/β → 0 as β → 0. Then, we have
(a) O(β) ± O(β) = O(β);
(b) o(β) ± o(β) = o(β);
(c) if c ≠ 0, then O(cβ) = O(β) and o(cβ) = o(β);
(d) O(o(β)) = o(β), o(O(β)) = o(β);
(e) O(β_1)O(β_2) = O(β_1β_2), O(β_1)o(β_2) = o(β_1β_2), o(β_1)O(β_2) = o(β_1β_2);
(f) if α = O(β_1) and β_1 = o(β_2), then α = o(β_2).
Proof. For parts (a)-(e), please refer to [13] for a proof. Part (f) can be verified straightforwardly: α/β_2 = (α/β_1)(β_1/β_2), where the first factor is bounded and the second tends to zero. □
With Proposition 1.1 and Lemma 3.1, mimicking the arguments of [11, Theorem 5.1] gives the following theorem.
Theorem 3.2. Suppose that f is a continuously differentiable function and Assumptions 2.1 and 3.1 are satisfied. Let z∗ = (µ∗, x∗, s∗) be an accumulation point of {z^k} generated by Algorithm 2.1. If all V ∈ ∂H(z∗) are nonsingular, then the following hold.

(a) α_k ≡ 1 for all z^k sufficiently close to z∗;
(b) the whole sequence {z^k} converges to z∗;
(c) ∥z^{k+1} − z∗∥ = o(∥z^k − z∗∥) (or ∥z^{k+1} − z∗∥ = O(∥z^k − z∗∥^2) if f′ is Lipschitz continuous on IR^n);
(d) µ_{k+1} = o(µ_k) (or µ_{k+1} = O(µ_k^2) if f′ is Lipschitz continuous on IR^n).
4 Numerical Results
In this section, we present our test problems and report numerical results. In this paper, the function f is assumed to be a mapping from IR^n to IR^n, which means the dimension of x is exactly the same as the total number of inequalities and equalities. In reality, this may not be the case. In other words, there may be a system like

$$\begin{cases} f_I(x) \le 0, & I = \{1, 2, \cdots, m\}, \\ f_E(x) = 0, & E = \{m+1, m+2, \cdots, l\}, \end{cases} \qquad (12)$$

which says that f could be a mapping from IR^n to IR^l. When l ≠ n, the scheme in Algorithm 2.1 cannot be applied to the system (12) because the dimension of x is not equal to the total number of inequalities and equalities. To make system (12) solvable by the proposed algorithm, as remarked in [11, Sec. 6], some additional inequality or variable needs to be added. For example,
(i) if l < n, we may add a trivial inequality like $\sum_{i=1}^{n} x_i^2 \le M$, where M is sufficiently large, into system (12) so that Algorithm 2.1 can be applied;
(ii) if l > n and m ≥ 1, we may add a variable x_{n+1} to the inequalities so that f_i(x) ≤ 0 becomes f_i(x) + x_{n+1}^2 ≤ 0 (see the sketch after this list);
(iii) if l > n and m = 0, we may add a trivial inequality like $\sum_{i=1}^{n+2} x_i^2 \le M$, where M is sufficiently large, into system (12) so that Algorithm 2.1 can be applied.
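As an illustration of modification (ii), the lifting f_i(x) ≤ 0 → f_i(x) + x_{n+1}^2 ≤ 0 can be coded generically. The sketch below is our own hedged helper (lift_inequalities is a hypothetical name); it lifts each inequality by its own squared slack variable, in the spirit of how Example 4.2 below becomes Example 4.2':

```python
import numpy as np

def lift_inequalities(fI, n, k):
    """Lift fI : R^n -> R^k to R^{n+k}: the i-th inequality f_i(x) <= 0
    becomes f_i(x) + x_{n+i}^2 <= 0 in the augmented variables."""
    def fI_lifted(xa):
        x, extra = xa[:n], xa[n:]
        return fI(x) + extra ** 2
    return fI_lifted

# Two inequalities in one variable, lifted by two new squared variables.
fI = lambda x: np.array([x[0] - 3.0 * np.pi, -x[0] - np.pi])
fI_lifted = lift_inequalities(fI, n=1, k=2)
print(fI_lifted(np.array([0.5, 1.0, 2.0])))   # adds 1.0 and 4.0, respectively
```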
In the real implementation, the H(z) given as in (6) is replaced by

$$H(z) := \begin{pmatrix} \mu \\ f_I(x) - s + c\mu x_I \\ f_E(x) + c\mu x_E \\ \Phi_p(\mu, s) + c\mu s \end{pmatrix}, \qquad (13)$$
where c is a given constant. Likewise, the Ĥ(µ, x) given as in (8) is replaced by

$$\widehat{H}(\mu, x) := \begin{pmatrix} \mu \\ f_E(x) + c\mu x_E \\ \Phi_p(\mu, f_I(x)) + c\mu x_I \end{pmatrix}. \qquad (14)$$
Adding such a constant c is helpful when coding the algorithm because µ approaches zero eventually; the theoretical results are not affected in any case. In practice, in order to obtain an interior solution x∗ for the inequalities (f_I(x∗) < 0), the following system
$$\begin{cases} f_I(x) + \varepsilon e \le 0, \\ f_E(x) = 0, \end{cases}$$

is considered, where ε is a small number and e is the vector of all ones. Now, we list the test problems, which are taken from [6, 11].
Example 4.1. Consider f(x) = (f_1(x), f_2(x))^T with x ∈ IR^2, where
f_1(x) = x_1^2 + x_2^2 − 1 + ε ≤ 0,
f_2(x) = −x_1^2 − x_2^2 + (0.999)^2 + ε ≤ 0.
Example 4.2. Consider f(x) = (f_1(x), f_2(x), f_3(x), f_4(x), f_5(x), f_6(x))^T with x ∈ IR^2, where
f_1(x) = sin(x_1) + ε ≤ 0,
f_2(x) = −cos(x_2) + ε ≤ 0,
f_3(x) = x_1 − 3π + ε ≤ 0,
f_4(x) = x_2 − π/2 − 2 + ε ≤ 0,
f_5(x) = −x_1 − π + ε ≤ 0,
f_6(x) = −x_2 − π/2 + ε ≤ 0.
Example 4.3. Consider f(x) = (f_1(x), f_2(x))^T with x ∈ IR^2, where
f_1(x) = sin(x_1) + ε ≤ 0,
f_2(x) = −cos(x_2) + ε ≤ 0.
Example 4.4. Consider f(x) = (f_1(x), f_2(x), f_3(x), f_4(x), f_5(x))^T with x ∈ IR^5, where
f_1(x) = x_1 + x_3 − 1.6 + ε ≤ 0,
f_2(x) = 1.333x_2 + x_4 − 3 + ε ≤ 0,
f_3(x) = −x_3 − x_4 + x_5 + ε ≤ 0,
f_4(x) = x_1^2 + x_3^2 − 1.25 = 0,
f_5(x) = x_2^{1.5} + 1.5x_4 − 3 = 0.
Example 4.5. Consider f(x) = (f_1(x), f_2(x), f_3(x))^T with x ∈ IR^3, where
f_1(x) = x_1 + x_2 e^{0.8x_3} + e^{1.6} + ε ≤ 0,
f_2(x) = x_1^2 + x_2^2 + x_3^2 − 5.2675 + ε ≤ 0,
f_3(x) = x_1 + x_2 + x_3 − 0.2605 = 0.
Example 4.6. Consider f(x) = (f_1(x), f_2(x), f_3(x))^T with x ∈ IR^2, where
f_1(x) = 0.8 − e^{x_1+x_2} + ε ≤ 0,
f_2(x) = 1.21e^{x_1} + e^{x_2} − 2.2 = 0,
f_3(x) = x_1^2 + x_2^2 + x_2 − 0.1135 = 0.
Example 4.7. Consider f(x) = (f_1(x), f_2(x))^T with x ∈ IR^2, where
f_1(x) = x_1 − 0.7 sin(x_1) − 0.2 cos(x_2) = 0,
f_2(x) = x_2 − 0.7 cos(x_1) + 0.2 sin(x_2) = 0.
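To connect the test set with the earlier sketches, Example 4.3 (pure inequalities, so f_E is empty and m = n = 2) can be fed to make_H and algorithm_2_1 as follows. This is our own hedged setup: it effectively takes c = 1, i.e. it uses (6) rather than the scaled variant (13) employed for the reported tables, so the iterates need not match Table 1.

```python
import numpy as np

eps = 1e-5
fI = lambda x: np.array([np.sin(x[0]) + eps, -np.cos(x[1]) + eps])
fE = lambda x: np.zeros(0)                     # E is empty
JI = lambda x: np.array([[np.cos(x[0]), 0.0],
                         [0.0, np.sin(x[1])]])
JE = lambda x: np.zeros((0, 2))

H, Hp = make_H(fI, fE, JI, JE, m=2, n=2, p=2.0)
z0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])       # mu_0 = beta = 1, x^0 = (0, 0)
z = algorithm_2_1(H, Hp, z0, tau=0.2, eta=0.01)
print("x =", z[1:3], " residual =", np.linalg.norm(H(z)))
```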
Moreover, in light of the aforementioned discussions, there are corresponding modified problems, Examples 4.2', 4.6', and 4.7', which are stated below. The other examples are kept unchanged. In other words, Example 4.1 and Example 4.1' are the same, and so are Examples 4.3 and 4.3', Examples 4.4 and 4.4', and Examples 4.5 and 4.5'.
Example 4.2'. Consider f(x) = (f_1(x), f_2(x), f_3(x), f_4(x), f_5(x), f_6(x))^T with x ∈ IR^6, where
f_1(x) = sin(x_1) + ε ≤ 0,
f_2(x) = −cos(x_2) + ε ≤ 0,
f_3(x) = x_1 − 3π + x_3^2 + ε ≤ 0,
f_4(x) = x_2 − π/2 − 2 + x_4^2 + ε ≤ 0,
f_5(x) = −x_1 − π + x_5^2 + ε ≤ 0,
f_6(x) = −x_2 − π/2 + x_6^2 + ε ≤ 0.
Example 4.6'. Consider f(x) = (f_1(x), f_2(x), f_3(x))^T with x ∈ IR^3, where
f_1(x) = 0.8 − e^{x_1+x_2} + x_3^2 + ε ≤ 0,
f_2(x) = 1.21e^{x_1} + e^{x_2} − 2.2 = 0,
f_3(x) = x_1^2 + x_2^2 + x_2 − 0.1135 = 0.
Example 4.7'. Consider f(x) = (f_1(x), f_2(x), f_3(x))^T with x ∈ IR^3, where
f_1(x) = x_1^2 + x_2^2 + x_3^2 − 10000 + ε ≤ 0,
f_2(x) = x_1 − 0.7 sin(x_1) − 0.2 cos(x_2) = 0,
f_3(x) = x_2 − 0.7 cos(x_1) + 0.2 sin(x_2) = 0.
The numerical implementations are coded in Matlab. In the numerical reports, x^0 is the starting point, NI is the total number of iterations, NF denotes the number of function evaluations of H(z^k) or Ĥ(µ_k, x^k), and SOL means the solution obtained by the algorithm. The parameters used in the algorithm are set as
ε = 0.00001, δ = 0.3, σ = 0.001, β = 1.0, µ_0 = 1.0, Q_0 = 1.0.
Table 1: Numerical performance when p = 2, stop criterion: 0.001.

Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | 8 | 12 | (-0.6188, 0.7853)
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | Fail | Fail | Fail
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 3 | 4 | (-0.01516, 0.7207)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 4 | 5 | (0.5557, 1.324, 0.9703, 0.984, 1.156)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 5 | 6 | (-0.8301, -0.8662, 1.957)
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 7 | 8 | (0.2743, -0.4975, 1.5e+006)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 9 | 15 | (0.5268, 0.5084, -100)
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | 8 | 12 | (-0.6188, 0.7853)
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 6 | 9 | (-0.009654, 1.428, 2.846, 1.28, 1.639, 1.666)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 3 | 4 | (-0.01516, 0.7207)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 4 | 5 | (0.5557, 1.324, 0.9703, 0.984, 1.156)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 5 | 6 | (-0.8301, -0.8662, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 5 | 7 | (-0.09533, 0.09533, 0.3321)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 9 | 15 | (0.5268, 0.5084, -100)

Note: Based on H(z) = 0 given as in (13).
In Table 1 and Table 2, we adopt the same x^0, c, τ, η as used in [11] for p = 2. Basically, in Table 1 and Table 2, the bottom-half data for the modified problems are the same as those in [11], respectively. Below are our numerical observations and conclusions.
• From Table 1 and Table 2, we see that, when employing the formulation H(z) = 0, solving the modified problems is more successful than solving the original problems.

• Table 3 indicates that the numerical results are the same for the original problems and the modified problems when Ĥ(µ, x) = 0 is employed. Hence, in Tables 4-11, we focus on the modified problems when the formulation H(z) = 0 is employed, whereas we only test the original problems whenever the implementations are based on Ĥ(µ, x) = 0.

• From Table 5 (p = 2), we see that the algorithm based on Ĥ(µ, x) = 0 can solve more problems than the one in [11] does. In view of the lower dimensionality of Ĥ(µ, x) = 0 and this performance, we can confirm the merit of this new reformulation.

• In Table 4 and Table 5, the initial point and other parameters are the same as those in [11]. In Tables 6-7, we fix the initial point x^0 for all test problems. In Table 8 and Table 9, even x^0, c, τ and η are all fixed for all test problems. In Table 10 and Table 11, x^0 is fixed for all test problems and some of c, τ and η are fixed. In general, we observe that the numerical performance based on the formulation Ĥ(µ, x) = 0 is better than that based on H(z) = 0.
Table 2: Numerical performance when p = 2, stop criterion: 1e-006.

Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | Fail | Fail | Fail
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01516, 0.7206)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 5 | 6 | (0.5563, 1.326, 0.9698, 0.9822, 1.155)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 6 | 7 | (-0.8299, -0.8663, 1.957)
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 7 | 8 | (0.2743, -0.4975, 1.5e+006)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 10 | 16 | (0.5265, 0.5079, -100)
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.009276, 1.429, 2.846, 1.279, 1.64, 1.667)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01516, 0.7206)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 5 | 6 | (0.5563, 1.326, 0.9698, 0.9822, 1.155)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 6 | 7 | (-0.8299, -0.8663, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 8 | (-0.09533, 0.09533, 0.332)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 10 | 16 | (0.5265, 0.5079, -100)

Note: Based on H(z) = 0 given as in (13).
• Moreover, changing the parameter p seems to have no influence on the numerical performance, no matter whether Ĥ(µ, x) = 0 or H(z) = 0 is adopted. This indicates that the smoothing approach may not be affected when p is perturbed. This phenomenon is different from the one observed for other approaches in [2, 3] and is a new discovery in the literature. We guess that the main reason is that µ dominates the algorithm in the smoothing approach even when p is varied. This conjecture still needs further verification and investigation.
In summary, the main contribution of this paper is to propose a new family of smoothing functions and to correct a flaw in the algorithm studied in [11], a correction that is needed to guarantee its convergence. We believe that the proposed new smoothing functions can also be employed in other contexts where the projection function is involved; the related numerical performance can be investigated accordingly. We leave these as future research topics.
Acknowledgements. We are grateful to the anonymous referees for their valuable suggestions, which helped us improve the presentation of the paper.
References
[1] R. Bhatia, Matrix Analysis, Springer-Verlag, New York, 1997.
Table 3: Numerical performance when p = 2, stop criterion: 0.001.

Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | 10 | 15 | (0.5942, -0.8031)
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 3 | 4 | (-0.01407, 7.663e-006, 0, 0, 0, 0)
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 3 | 4 | (-0.01407, 7.663e-006)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 3 | 4 | (0.5489, 2.066, 0.9741, 0.0204, 9.748e-007)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 24 | 39 | (0.5031, -1.7, 1.458)
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 3 | 4 | (-0.09533, 0.09533, 0.08515)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 3 | 4 | (0.5271, 0.508, 0)
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | 10 | 15 | (0.5942, -0.8031)
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 3 | 4 | (-0.01407, 7.663e-006, 0, 0, 0, 0)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 3 | 4 | (-0.01407, 7.663e-006)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 3 | 4 | (0.5489, 2.066, 0.9741, 0.0204, 9.748e-007)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 24 | 39 | (0.5031, -1.7, 1.458)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 3 | 4 | (-0.09533, 0.09533, 0.1698)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 3 | 4 | (0.5271, 0.508, 0)

Note: Based on Ĥ(µ, x) = 0 given as in (14).
[2] J.-S. Chen and S.-H. Pan, A family of NCP-functions and a descent method for the nonlinear complementarity problem, Computational Optimization and Applications, vol. 40, no. 3, pp. 389-404, 2008.

[3] J.-S. Chen, S.-H. Pan, and C.-Y. Yang, Numerical comparison of two effective methods for mixed complementarity problems, Journal of Computational and Applied Mathematics, vol. 234, no. 3, pp. 667-683, 2010.

[4] J.W. Daniel, Newton's method for nonlinear inequalities, Numerische Mathematik, vol. 21, pp. 381-387, 1973.

[5] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1986.

[6] Z.-H. Huang, Y. Zhang, and W. Wu, A smoothing-type algorithm for solving system of inequalities, Journal of Computational and Applied Mathematics, vol. 220, pp. 355-363, 2008.

[7] D.Q. Mayne, E. Polak, and A.J. Heunis, Solving nonlinear inequalities in a finite number of iterations, Journal of Optimization Theory and Applications, vol. 33, pp. 207-221, 1981.

[8] M. Sahba, On the solution of nonlinear inequalities in a finite number of iterations, Numerische Mathematik, vol. 46, pp. 229-236, 1985.

[9] J.M. Schott, Matrix Analysis for Statistics, 2nd edition, John Wiley, New Jersey, 2005.

[10] H.-X. Ying, Z.-H. Huang, and L. Qi, The convergence of a Levenberg-Marquardt method for the l2-norm solution of nonlinear inequalities, Numerical Functional Analysis and Optimization, vol. 29, pp. 687-716, 2008.

[11] Y. Zhang and Z.-H. Huang, A nonmonotone smoothing-type algorithm for solving a system of equalities and inequalities, Journal of Computational and Applied Mathematics, vol. 233, pp. 2312-2321, 2010.

[12] J. Zhu and B. Hao, A new non-interior continuation method for solving a system of equalities and inequalities, Journal of Applied Mathematics, Article Number 592540, 2014.

[13] R.G. Bartle, The Elements of Real Analysis, 2nd edition, John Wiley, 1976.

[14] S.L. Hu, Z.-H. Huang, and P. Wang, A non-monotone smoothing Newton algorithm for solving nonlinear complementarity problems, Optimization Methods and Software, vol. 24, pp. 447-460, 2009.
Table 4: Numerical performance with different p for modified problems.

p = 2, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.009276, 1.429, 2.846, 1.279, 1.64, 1.667)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01516, 0.7206)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 5 | 6 | (0.5563, 1.326, 0.9698, 0.9822, 1.155)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 6 | 7 | (-0.8299, -0.8663, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 8 | (-0.09533, 0.09533, 0.332)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 10 | 16 | (0.5265, 0.5079, -100)

p = 1.5, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.03532, 1.428, 2.849, 1.278, 1.63, 1.666)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.0189, 0.7217)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 5 | 6 | (0.5546, 1.329, 0.9708, 0.979, 1.135)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 6 | 7 | (-0.8288, -0.8673, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 8 | (-0.09533, 0.09533, 0.3509)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 10 | 16 | (0.5265, 0.5079, -100)

p = 3, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.007125, 1.43, 2.848, 1.281, 1.641, 1.667)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01061, 0.72)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 6 | 7 | (0.5589, 1.33, 0.9683, 0.9771, 1.17)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 6 | 7 | (-0.8313, -0.8648, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 7 | (-0.09533, 0.09533, 0.4472)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 10 | 16 | (0.5265, 0.5079, -100)

p = 8, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 5) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.00355, 1.431, 2.849, 1.282, 1.643, 1.668)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.001338, 0.7196)
Ex 4.4' | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 6 | 7 | (0.5622, 1.351, 0.9664, 0.9528, 1.18)
Ex 4.5' | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 6 | 7 | (-0.8338, -0.8624, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 5 | 6 | (-0.09533, 0.09533, 0.2985)
Ex 4.7' | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 10 | 16 | (0.5265, 0.5079, -100)

Note: Based on H(z) = 0 given as in (13).
Table 5: Numerical performance with different p for original problems.

p = 2, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | 12 | 20 | (0.5821, -0.812)
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01407, 7.663e-006, 0, 0, 0, 0)
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01407, 7.663e-006)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 4 | 5 | (0.549, 2.066, 0.974, 0.02039, 9.747e-007)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | 25 | 40 | (0.5029, -1.7, 1.458)
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 4 | 5 | (-0.09533, 0.09533, 0.08515)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 4 | 5 | (0.5265, 0.5079, 0)

p = 1.5, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | 12 | 20 | (0.5821, -0.812)
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01739, 0.002797, 0, 0, 0, 0)
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01739, 0.002797)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 4 | 5 | (0.547, 2.061, 0.9751, 0.02811, 0.000359)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | Fail | Fail | Fail
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 4 | 5 | (-0.09533, 0.09533, 0.1086)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 4 | 5 | (0.5265, 0.5079, 0)

p = 3, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | 12 | 20 | (0.5821, -0.812)
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.009902, 6.814e-011, 0, 0, 0, 0)
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.009902, 6.814e-011)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 4 | 5 | (0.5513, 2.071, 0.9727, 0.01239, 8.581e-012)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | Fail | Fail | Fail
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 4 | 5 | (-0.09533, 0.09533, 0.06131)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 4 | 5 | (0.5265, 0.5079, 0)

p = 8, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1 | (0, 5) | 100 | 0.006 | 0.01 | 12 | 20 | (0.5821, -0.812)
Ex 4.2 | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.002048, 6.022e-036, 0, 0, 0, 0)
Ex 4.3 | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.002048, 6.022e-036)
Ex 4.4 | (0.5, 2, 1, 0, 0) | 5 | 0.02 | 0.8 | 4 | 5 | (0.557, 2.079, 0.9694, 0.001483, 7.47e-037)
Ex 4.5 | (-1, -1, 1) | 0.5 | 0.2 | 0.8 | Fail | Fail | Fail
Ex 4.6 | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 4 | 5 | (-0.09533, 0.09533, 0.01803)
Ex 4.7 | (0, 1, 0) | 0.5 | 0.006 | 0.8 | 4 | 5 | (0.5265, 0.5079, 0)

Note: Based on Ĥ(µ, x) = 0 given as in (14).
Table 6: Numerical performance with different p for modified problems.

p = 1.5, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 0) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.03532, 1.428, 2.849, 1.278, 1.63, 1.666)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.0189, 0.7217)
Ex 4.4' | (0, 0, 0, 0, 0) | 5 | 0.02 | 0.8 | Fail | Fail | Fail
Ex 4.5' | (0, 0, 0) | 0.5 | 0.2 | 0.8 | 8 | 13 | (-0.8355, -0.8607, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 8 | (-0.09533, 0.09533, 0.3509)
Ex 4.7' | (0, 0, 0) | 5 | 0.02 | 0.8 | 17 | 21 | (0.5265, 0.5079, 100)

p = 2, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 0) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.009276, 1.429, 2.846, 1.279, 1.64, 1.667)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01516, 0.7206)
Ex 4.4' | (0, 0, 0, 0, 0) | 5 | 0.02 | 0.8 | Fail | Fail | Fail
Ex 4.5' | (0, 0, 0) | 0.5 | 0.2 | 0.8 | 8 | 13 | (-0.8356, -0.8606, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 8 | (-0.09533, 0.09533, 0.332)
Ex 4.7' | (0, 0, 0) | 5 | 0.02 | 0.8 | 17 | 21 | (0.5265, 0.5079, 100)

p = 3, stop criterion: 1e-006.
Problem | x^0 | c | τ | η | NI | NF | SOL
Ex 4.1' | (0, 0) | 100 | 0.006 | 0.01 | Fail | Fail | Fail
Ex 4.2' | (0, 0, 0, 0, 0, 0) | 0.5 | 0.2 | 0.01 | 7 | 10 | (-0.007125, 1.43, 2.848, 1.281, 1.641, 1.667)
Ex 4.3' | (0, 0) | 0.5 | 0.2 | 0.01 | 4 | 5 | (-0.01061, 0.72)
Ex 4.4' | (0, 0, 0, 0, 0) | 5 | 0.02 | 0.8 | Fail | Fail | Fail
Ex 4.5' | (0, 0, 0) | 0.5 | 0.2 | 0.8 | 8 | 13 | (-0.8356, -0.8606, 1.957)
Ex 4.6' | (0, 0, 0) | 0.5 | 0.02 | 0.8 | 6 | 7 | (-0.09533, 0.09533, 0.4472)
Ex 4.7' | (0, 0, 0) | 5 | 0.02 | 0.8 | 17 | 21 | (0.5265, 0.5079, 100)

Note: Based on H(z) = 0 given as in (13); x^0 is fixed.