
4 Geometric analysis of merit function in descent algorithms

In this section, we employ the derivative-free descent algorithms presented in [4, 5] to solve the unconstrained minimization problems (11) by using the merit function (10). We then compare the two algorithms and study their convergence behavior through an intuitive visualization. We first list the two algorithms below.

Algorithm 4.1 [4, Algorithm 4.1]

(Step 0) Given a real number p > 1 and a starting point x^0 ∈ IR^n, choose the parameters σ ∈ (0, 1), β ∈ (0, 1) and ε ≥ 0. Set k := 0.

(Step 1) If Ψ_p(x^k) ≤ ε, then stop.

(Step 2) Let m_k be the smallest nonnegative integer m satisfying

    Ψ_p(x^k + β^m d^k) ≤ (1 − σβ^{2m}) Ψ_p(x^k),

where d^k := −∇_b ψ_p(x^k, F(x^k)) and

    ∇_b ψ_p(x, F(x)) := ( ∇_b ψ_p(x_1, F_1(x)), · · · , ∇_b ψ_p(x_n, F_n(x)) )^T.

(Step 3) Set x^{k+1} := x^k + β^{m_k} d^k, k := k + 1, and go to Step 1.
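To make Steps 0-3 concrete, here is a minimal Python sketch of Algorithm 4.1. This is our own illustrative implementation, not the one used for the experiments; the default values of σ, β, ε are placeholders. It uses the closed form ∇_b ψ_p(a, b) = φ_p(a, b) (sign(b)|b|^{p−1}/∥(a, b)∥_p^{p−1} − 1), valid for (a, b) ≠ (0, 0), where φ_p(a, b) = ∥(a, b)∥_p − (a + b).

```python
import numpy as np

def phi_p(a, b, p):
    """Generalized Fischer-Burmeister function phi_p(a,b) = ||(a,b)||_p - (a+b)."""
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def grad_b_psi_p(a, b, p):
    """d/db of psi_p(a,b) = 0.5*phi_p(a,b)**2, valid for (a,b) != (0,0)."""
    norm = (abs(a)**p + abs(b)**p)**(1.0 / p)
    dphi_db = np.sign(b) * abs(b)**(p - 1) / norm**(p - 1) - 1.0
    return phi_p(a, b, p) * dphi_db

def merit(x, F, p):
    """Psi_p(x) = sum_i psi_p(x_i, F_i(x))."""
    return sum(0.5 * phi_p(xi, fi, p)**2 for xi, fi in zip(x, F(x)))

def algorithm_4_1(F, x0, p=2.0, sigma=1e-4, beta=0.5, eps=1e-12, max_iter=5000):
    """Derivative-free descent sketch in the spirit of Algorithm 4.1."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        if merit(x, F, p) <= eps:
            break
        Fx = F(x)
        # d^k := -grad_b psi_p(x^k, F(x^k)), taken componentwise
        d = -np.array([grad_b_psi_p(xi, fi, p) for xi, fi in zip(x, Fx)])
        m = 0  # Armijo-type search: smallest m with sufficient decrease
        while (merit(x + beta**m * d, F, p)
               > (1 - sigma * beta**(2 * m)) * merit(x, F, p)) and m < 60:
            m += 1
        x = x + beta**m * d
    return x
```

Applied to Example 4.1 below (F(x) = (x − 3)³ + 1) from x^0 = 1.5, this sketch drives the merit function toward zero and the iterates toward the solution x* = 2.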

Algorithm 4.2 [5, Algorithm 4.1]

(Step 0) Given real numbers p > 1 and α ≥ 0 and a starting point x^0 ∈ IR^n, choose the parameters σ ∈ (0, 1), β ∈ (0, 1), γ ∈ (0, 1) and ε ≥ 0. Set k := 0.

(Step 1) If Ψ_{α,p}(x^k) ≤ ε, then stop.

(Step 2) Let m_k be the smallest nonnegative integer m satisfying

    Ψ_{α,p}(x^k + β^m d^k(m)) ≤ (1 − σβ^{2m}) Ψ_{α,p}(x^k),

where d^k(m) := −∇_b ψ_{α,p}(x^k, F(x^k)) − γ^m ∇_a ψ_{α,p}(x^k, F(x^k)) and

    ∇_a ψ_{α,p}(x, F(x)) := ( ∇_a ψ_{α,p}(x_1, F_1(x)), · · · , ∇_a ψ_{α,p}(x_n, F_n(x)) )^T,

    ∇_b ψ_{α,p}(x, F(x)) := ( ∇_b ψ_{α,p}(x_1, F_1(x)), · · · , ∇_b ψ_{α,p}(x_n, F_n(x)) )^T.

(Step 3) Set x^{k+1} := x^k + β^{m_k} d^k(m_k), k := k + 1, and go to Step 1.
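A matching sketch of Algorithm 4.2 is given below; its direction d^k(m) mixes both partial gradients of ψ_{α,p}, the function defined in the following paragraph. For the α-part of ψ_{α,p}, the partial gradients are α(ab)₊ b in a and α(ab)₊ a in b. Again, this is our own illustrative code with placeholder parameter values, not the implementation from [5].

```python
import numpy as np

def phi_p(a, b, p):
    """Generalized Fischer-Burmeister function ||(a,b)||_p - (a+b)."""
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def grads_psi_alpha_p(a, b, p, alpha):
    """Return (d/da, d/db) of psi_{alpha,p}(a,b) =
    (alpha/2)*max(0,ab)**2 + 0.5*phi_p(a,b)**2, for (a,b) != (0,0)."""
    norm = (abs(a)**p + abs(b)**p)**(1.0 / p)
    phi = phi_p(a, b, p)
    ga = alpha * max(0.0, a * b) * b + phi * (np.sign(a) * abs(a)**(p - 1) / norm**(p - 1) - 1.0)
    gb = alpha * max(0.0, a * b) * a + phi * (np.sign(b) * abs(b)**(p - 1) / norm**(p - 1) - 1.0)
    return ga, gb

def merit_alpha(x, F, p, alpha):
    """Psi_{alpha,p}(x) = sum_i psi_{alpha,p}(x_i, F_i(x))."""
    return sum(0.5 * alpha * max(0.0, xi * fi)**2 + 0.5 * phi_p(xi, fi, p)**2
               for xi, fi in zip(x, F(x)))

def algorithm_4_2(F, x0, p=2.0, alpha=0.0, sigma=1e-4, beta=0.5, gamma=0.5,
                  eps=1e-12, max_iter=5000):
    """Derivative-free descent sketch with the symmetric direction d^k(m)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        psi0 = merit_alpha(x, F, p, alpha)
        if psi0 <= eps:
            break
        Fx = F(x)
        grads = [grads_psi_alpha_p(xi, fi, p, alpha) for xi, fi in zip(x, Fx)]
        ga = np.array([g[0] for g in grads])
        gb = np.array([g[1] for g in grads])
        m = 0
        while m < 60:
            d = -gb - gamma**m * ga      # symmetric search direction d^k(m)
            if merit_alpha(x + beta**m * d, F, p, alpha) <= (1 - sigma * beta**(2 * m)) * psi0:
                break
            m += 1
        x = x + beta**m * (-gb - gamma**m * ga)
    return x
```

With α = 0 (as used throughout this section) the merit function reduces to Ψ_p, so the two sketches differ only in the search direction.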

In Algorithm 4.2, ψ_{α,p} : IR² → IR₊ is an NCP-function defined by

    ψ_{α,p}(a, b) := (α/2) (max{0, ab})² + ψ_p(a, b) = (α/2) (ab)₊² + (1/2) (∥(a, b)∥_p − (a + b))²

with α ≥ 0 being a real parameter. When α = 0, the function ψ_{α,p} reduces to ψ_p. To compare the two algorithms, we take α = 0 whenever we use Algorithm 4.2 in this section. Note that the descent direction in Algorithm 4.1 lacks a certain symmetry, whereas Algorithm 4.2 adopts a symmetric search direction. Under the assumption of monotonicity, i.e.,

⟨x − y, F (x) − F (y)⟩ ≥ 0 for all x, y ∈ IRn,

an error bound is established and Algorithm 4.2 is shown to have a locally R-linear convergence rate in [5]. In other words, there exists a positive constant κ₂ such that

    ∥x^k − x*∥ ≤ κ₂ ( max{ Ψ_{α,p}(x^k), Ψ_{α,p}(x^k)^{1/2} } )^{1/2}   when α = 0.

Furthermore, the convergence rate of Algorithm 4.2 is closely related to the constant

    ⌈ log_γ ( β / (L₁ + σ C(B, α, p)) ) ⌉,   where   C(B, α, p) = (2 − 2^{1/p})⁴ αB² + (2 + 2^{1/p})².

Therefore, when the value of p decreases, the convergence rate of Algorithm 4.2 becomes worse; see Remark 4.1 in [5].
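The qualitative effect of p can be checked numerically. The formula for C(B, α, p) used below is our reading of the typographically damaged expression above and should be treated as an assumption; the point is only that the term 2^{1/p} grows as p decreases toward 1, so C(B, α, p) grows, the argument of log_γ shrinks, and (since γ ∈ (0, 1)) the ceiling constant grows.

```python
# Numerical illustration; the form of C(B, alpha, p) is an assumption
# reconstructed from the damaged display above.
def C(B, alpha, p):
    t = 2**(1.0 / p)                      # grows as p decreases toward 1
    return (2 - t)**4 * alpha * B**2 + (2 + t)**2

for p in (1.1, 1.5, 2.0, 100.0):
    print(p, round(C(1.0, 0.0, p), 3))    # monotonically smaller as p grows
```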

Recall that the merit function Ψ_p(x) is the sum of n nonnegative functions ψ_p, i.e.,

    Ψ_p(x) = ∑_{i=1}^{n} ψ_p(x_i, F_i(x)).

This encourages us to view each component ψ_p(x_i^k, F_i(x^k)) for i = 1, 2, . . . , n as a motion with a different velocity on the same surface z = ψ_p(a, b) at each iteration. Based on our study in Sections 2 and 3, we develop a visualization that helps us understand the convergence behavior in detail. Figure 19 depicts this visualization for the four-dimensional NCP in Example 4.3, whose merit function is Ψ_p(x) = ∑_{i=1}^{4} ψ_p(x_i, F_i(x)). We plot the point sequences {(x_i^k, F_i(x^k))} for i = 1, 2, 3, 4 together, in different colors, with the level curves of the surface z = ψ_{1.1}(a, b) in Figure 19(a). The vertical line represents the value of each x_i, the horizontal line represents the value of each F_i(x), and the skew line means x_i = F_i(x). We take the initial point x^0 = (0, 0, 0, 0), which implies F(x^0) = (−6, −2, −1, −3), and observe the convergence behavior separately for each i from the initial point to the solution x* = (√6/2, 0, 0, 1/2), which lies on the horizontal line in this figure. Furthermore, we observe the positions of the point sequences on the surface in Figure 19(a) and the merit function, which is the sum of their heights at each iteration, in Figure 19(b).
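The componentwise view just described can be made concrete: given the iterates x^k, one records, for each component i, the planar point (x_i^k, F_i(x^k)) and its height ψ_p on the surface. A minimal sketch (the function name is our own):

```python
def phi_p(a, b, p):
    """Generalized Fischer-Burmeister function ||(a,b)||_p - (a+b)."""
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def component_trajectories(iterates, F, p):
    """For each iterate x^k, record the point (x_i^k, F_i(x^k)) and its
    height psi_p(x_i^k, F_i(x^k)) on the surface z = psi_p(a, b),
    one trajectory per component i."""
    n = len(iterates[0])
    traj = [[] for _ in range(n)]
    for x in iterates:
        Fx = F(x)
        for i in range(n):
            a, b = x[i], Fx[i]
            traj[i].append((a, b, 0.5 * phi_p(a, b, p)**2))
    return traj
```

Summing the recorded heights over i at iteration k recovers Ψ_p(x^k), which is exactly the quantity plotted against the iteration count in the figures below.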

In a one-dimensional NCP, F is continuously differentiable and there is only one variable x in F, so (x, F(x)) is a continuous curve in IR² and the merit function Ψ_p(x) = ψ_p(x, F(x)) is obviously a curve on the surface z = ψ_p(a, b); see Figures 15(a) and (b). Therefore, the point sequence of a one-dimensional problem can only lie on the curve (x, F(x), ψ_p(x, F(x))).

Example 4.1 Consider the NCP where F : IR → IR is given by F(x) = (x − 3)³ + 1.

The unique solution of this NCP is x* = 2. Note that F is strictly monotone; see the geometric view of this NCP in Figure 15. The value of the merit function at each iteration is plotted in Figure 15(c), which presents the different behavior of the functions with different values of p near the solution. Figures 16(a)-(d) depict the convergence behavior of Algorithm 4.1 from two directions with two different initial points, and Figures 16(e)-(f) show the convergence behavior with different p. Figures 18(a)-(d) depict the convergence behavior of Algorithm 4.2 from two directions with two different initial points. We found that Algorithm 4.2 always produces a point sequence in or close to the boundary of the feasible set, i.e., {(x, F(x)) : x ≥ 0 and F(x) ≥ 0}. Based on Proposition 3.2, the rate of decrease of the merit function in Algorithm 4.1 differs with different initial points when we increase p, but it is similar for different initial points in Algorithm 4.2. This phenomenon is consistent with the geometric properties studied in Section 3.
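The basic claims of Example 4.1 are easy to verify directly; the snippet below is a quick sanity check of the solution and of the sign of F′, not part of the original analysis:

```python
def F(x):
    return (x - 3.0)**3 + 1.0

def dF(x):                      # F'(x) = 3(x-3)^2 >= 0, so F is monotone
    return 3.0 * (x - 3.0)**2

x_star = 2.0
print(F(x_star))                # 0.0: x* = 2 is feasible with x* > 0, F(x*) = 0
print(dF(x_star))               # 3.0: F is strictly increasing near x*
```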

To show the importance of the inflection point, we give an extreme example as follows:

Example 4.2 Consider the NCP where F : IR → IR is given by F(x) = 1.

The unique solution of this NCP is x* = 0. From the above discussion, we know that the point sequence lies on the curve (x, 1, ψ_p(x, 1)); see Figure 17(a). Figure 17(c) shows a rapid decrease of the merit function from the 80th to the 120th iteration, and Figure 17(b) shows the behavior during those iterations. Observing the width of the level curves in Figure 17(b), we find that the rapid decrease may arise from the existence of an inflection point on the surface. Figures 17(c)-(f) and Figures 18(e) and (f) show that the position of the inflection point may change with different p.
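Since all iterates of Example 4.2 lie on the cross-section g(x) = ψ_p(x, 1), an inflection point of g can be located numerically; the sketch below uses a crude sign change of a numerical second derivative (grid and tolerance are our own choices):

```python
import numpy as np

def psi_p(a, b, p):
    """psi_p(a,b) = 0.5*(||(a,b)||_p - (a+b))**2."""
    return 0.5 * ((abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b))**2

# Cross-section g(x) = psi_p(x, 1) for Example 4.2, and a crude numerical
# search for inflection points: sign changes of the second derivative.
p = 1.1
xs = np.linspace(0.1, 10.0, 2000)
g = psi_p(xs, 1.0, p)
g2 = np.gradient(np.gradient(g, xs), xs)          # numerical second derivative
sign_changes = np.where(np.diff(np.sign(g2)) != 0)[0]
print(xs[sign_changes])   # candidate inflection points of the cross-section
```

Since g grows from 0 near x = 0 toward its asymptotic value 1/2 as x → ∞, it passes from convex to concave, so at least one inflection point must appear on this cross-section.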

Example 4.3 Consider the NCP where F : IR⁴ → IR⁴ is given by

    F(x) = ( 3x₁² + 2x₁x₂ + 2x₂² + x₃ + 3x₄ − 6,
             2x₁² + x₁ + x₂² + 3x₃ + 2x₄ − 2,
             3x₁² + x₁x₂ + 2x₂² + 2x₃ + 3x₄ − 1,
             x₁² + 3x₂² + 2x₃ + 3x₄ − 3 )^T.

This is a non-degenerate NCP and its solution is x* = (√6/2, 0, 0, 1/2). Figure 19 shows that the behavior of the merit function in Algorithm 4.1 is consistent with Proposition 3.2(b).

Figure 20 shows that the convergence behavior of Algorithm 4.1 differs with different initial points, and Figure 21 shows that the convergence behavior of Algorithm 4.2 also differs with different initial points. Moreover, Figures 20 and 21 show that the behavior of the two algorithms still differs even with the same initial point. Figure 22(e) shows that the merit function decreases more and more quickly when p is smaller; however, Figure 23(e) shows that the merit function decreases more and more quickly when p is bigger.

Example 4.4 Consider the NCP where F : IR⁵ → IR⁵ is given by

    F(x) = ( x₁ + x₂x₃x₄x₅/50,
             x₂ + x₁x₃x₄x₅/50 − 3,
             x₃ + x₁x₂x₄x₅/50 − 1,
             x₄ + x₁x₂x₃x₅/50 + 1/2,
             x₅ + x₁x₂x₃x₄/50 )^T.

This NCP has a unique solution x* = (0, 3, 1, 0, 0). We choose the initial point x^0 = (3, 4, 2, 2, 2) in Algorithm 4.2; see Figure 24. Figures 24(e) and 23(e) show different results for different p.
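As for Example 4.3, the stated solution can be verified directly from the displayed formula (a sanity check of our own, not part of the original experiments):

```python
import numpy as np

def F(x):
    x1, x2, x3, x4, x5 = x
    return np.array([
        x1 + x2*x3*x4*x5 / 50,
        x2 + x1*x3*x4*x5 / 50 - 3,
        x3 + x1*x2*x4*x5 / 50 - 1,
        x4 + x1*x2*x3*x5 / 50 + 0.5,
        x5 + x1*x2*x3*x4 / 50,
    ])

x_star = np.array([0.0, 3.0, 1.0, 0.0, 0.0])
print(F(x_star))           # (0, 0, 0, 0.5, 0): x* is feasible and complementary
print(x_star * F(x_star))  # componentwise complementarity holds exactly
```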

The above examples suggest that the convergence behavior is influenced by the position of the initial point, the properties of F(x), and the geometric structure of the NCP-function ψ_p. Indeed, the convergence behavior of Algorithm 4.1 can be classified into three cases according to the initial point.

• Case 1: the point sequences {(x_i^k, F_i(x^k))}, i = 1, . . . , n, are almost located in or close to the boundary of the surface z = ψ_p(a, b) where a > 0 and b > 0; see Figure 20(f).

• Case 2: the point sequences {(x_i^k, F_i(x^k))}, i = 1, . . . , n, are almost located in or close to the boundary of the surface z = ψ_p(a, b) where a > 0 and b < 0; see Figures 16(a) and (e).

• Case 3: the point sequences do not belong to Case 1 or Case 2; see Figure 20(a).

In addition, the value of the merit function decreases more quickly when the value of p increases in Case 1; see Figure 22. The value of the merit function also decreases more quickly when the value of p increases in Case 2; see Figure 19. As the visualization shows, the value of the merit function seems to depend on the slope of the surface. Thus, the above two cases can be explained geometrically by Proposition 3.2. Although the convergence behavior is complicated in Case 3, if there exists some initial point x^0 such that F_i(x^0) < 0 for some i and p is small, for example p ∈ (1, 2), we can easily deduce that the value of the merit function decreases more quickly when the value of p increases, as in Case 2; see Figure 16. This is because the surface z = ψ_p(a, b) is much higher where a > 0 and b < 0 than where a > 0 and b > 0; see Figure 12 and Figure 14(d). Therefore, the value of the merit function is dominated by the components of F(x) whose initial values satisfy F_i(x^0) < 0. This result was discussed and presented in [4].

The convergence behavior of Algorithm 4.2 belongs to Case 1; see Figure 18(a) and Figure 21. As the visualization shows, the behavior of the merit function seems to depend on the height scale of the surface with a steady step length at each iteration. The surface in Case 1 is closer to zero but becomes flatter when p is smaller. This is the geometric reason why Algorithm 4.2 has better global convergence but a worse convergence rate when p decreases; see the concluding remarks in [5] and Figure 23(e).

Although Algorithm 4.2 seems to have a different global convergence result from Algorithm 4.1, we can choose the initial point we want by observing our visualization. Figure 24(e) and Figure 22(e) show similar global convergence results for different p. This is because they have similar convergence behavior; see Figure 20(f) and Figures 24(a)-(d).

5 Conclusion

In this paper, we view the generalized Fischer-Burmeister function ϕ_p as a family of surfaces in IR³ and study some geometric properties of ϕ_p and ψ_p. If we use a descent method to solve the NCP with the merit function Ψ_p, then the convergence behavior is influenced by the descent direction and by the geometric structure of the surface z = ψ_p(a, b). From the visualization presented in this paper, we observe that the existence of an inflection point causes a rapid decrease of the merit function. The convergence behavior of both algorithms is sensitive to the initial point, which is also mentioned in [5, 7]. In addition, the global convergence of Algorithm 4.1 and Algorithm 4.2 may depend on the geometric structure together with the initial point. From the geometric view, an NCP-function can be regarded as a surface in IR³, and the parameters in an NCP-function may change the geometric structure of this surface. This geometric view and visualization greatly help us understand how the convergence behavior changes when parameters in the NCP-function are changed or added.

Figure 15: Geometric view of the NCP in Example 4.1. (a) Graph of F(x) = (x − 3)³ + 1. (b) The plane curve in (a) and the level curves of the surface z = ψ₂(a, b) near the solution (x, F(x)) = (2, 0). (c) The merit function with different p (p = 1.1, 1.5, 2, 3, 100) near the solution.

Figure 16: Convergence behavior of Algorithm 4.1 and the value of the merit function in Example 4.1.

Figure 17: Convergence behavior of Algorithm 4.1 and the merit function with initial point x^0 = 1 in Example 4.2; panel (b) shows the local convergence behavior of (a) during the 80th to 120th iterations of (c).

Figure 18: Convergence behavior of Algorithm 4.2 and the value of the merit function in Examples 4.1 and 4.2; panel (e): merit function in Example 4.2 with p = 1.1 and x^0 = 10; panel (f): merit function in Example 4.2 with p = 5 and x^0 = 10.

Figure 19: Convergence behavior of Algorithm 4.1 and the merit function with initial point x^0 = (0, 0, 0, 0) in Example 4.3; panels (b), (d), (f): iterations and merit function for p = 1.1, 1.5 and 2, respectively.

Figure 20: Convergence behavior of Algorithm 4.1 with different initial points and p = 2 in Example 4.3.

Figure 21: Convergence behavior of Algorithm 4.2 with different initial points and p = 2 in Example 4.3.

Figure 22: Merit function in the first 200 iterations of Algorithm 4.1 with different p ((a) p = 1.1, (b) p = 2, (c) p = 3, (d) p = 100, (e) all together) and initial point x^0 = (2, 1, 1, 1) of Figure 20(f).

Figure 23: Merit function of Algorithm 4.2 with different p ((a) p = 1.1, (b) p = 1.5, (c) p = 2, (d) p = 100, (e) all together) and initial point x^0 = (2, 1, 1, 1) of Figure 21(f).

Figure 24: Convergence behavior with different p and the merit function of Algorithm 4.2 with initial point x^0 = (3, 4, 2, 2, 2) in Example 4.4; panel (e): merit function with different p.

References

[1] S. C. Billups, S. P. Dirkse and M. C. Soares, A comparison of algorithms for large scale mixed complementarity problems, Computational Optimization and Applications, vol. 7, pp. 3-25, 1997.

[2] J.-S. Chen, The Semismooth-related Properties of a Merit Function and a Descent Method for the Nonlinear Complementarity Problem, Journal of Global Optimization, vol. 36, pp. 565-580, 2006.

[3] J.-S. Chen, On Some NCP-functions Based on the Generalized Fischer-Burmeister Function, Asia-Pacific Journal of Operational Research, vol. 24, pp. 401-420, 2007.

[4] J.-S. Chen and S. Pan, A Family of NCP-functions and a Descent Method for the Nonlinear Complementarity Problem, Computational Optimization and Applications, vol. 40, pp. 389-404, 2008.

[5] J.-S. Chen, H.-T. Gao and S. Pan, An R-linearly Convergent Derivative-free Algorithm for the NCPs Based on the Generalized Fischer-Burmeister Merit Function, Journal of Computational and Applied Mathematics, vol. 230, pp. 69-82, 2009.

[6] J.-S. Chen, Z.-H. Huang and C.-Y. She, A New Class of Penalized NCP-functions and Its Properties, to appear in Computational Optimization and Applications, 2011.

[7] J.-S. Chen, C.-H. Ko and S.-H. Pan, A neural network based on generalized Fischer-Burmeister function for nonlinear complementarity problems, Information Sciences, vol. 180, pp. 697-711, 2010.

[8] R. W. Cottle, J.-S. Pang and R. E. Stone, The Linear Complementarity Problem, Academic Press, New York, 1992.

[9] S. Dafermos, An Iterative Scheme for Variational Inequalities, Mathematical Programming, vol. 26, pp. 40-47, 1983.

[10] F. Facchinei and J. Soares, A New Merit Function for Nonlinear Complementarity Problems and a Related Algorithm, SIAM Journal on Optimization, vol. 7, pp. 225-247, 1997.

[11] F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Springer, New York, vol. I and II, 2003.

[12] A. Fischer, A Special Newton-type Optimization Method, Optimization, vol. 24, pp. 269-284, 1992.

[13] A. Fischer, Solution of the monotone complementarity problem with locally Lips-chitzian functions, Mathematical Programming, vol. 76, pp. 513-532, 1997.

[14] M. Fukushima, Merit Functions for Variational Inequality and Complementarity Problems, Nonlinear Optimization and Applications, edited by G. Di Pillo and F. Giannessi, Plenum Press, New York, pp. 155-170, 1996.

[15] C. Geiger and C. Kanzow, On the Resolution of Monotone Complementarity Problems, Computational Optimization and Applications, vol. 5, pp. 155-173, 1996.

[16] P. T. Harker and J.-S. Pang, Finite Dimensional Variational Inequality and Nonlinear Complementarity Problem: A Survey of Theory, Algorithms and Applications, Mathematical Programming, vol. 48, pp. 161-220, 1990.

[17] N. J. Higham, Estimating the Matrix p-norm, Numerische Mathematik, vol. 62, pp. 539-555, 1992.

[18] H. Jiang, Unconstrained Minimization Approaches to Nonlinear Complementarity Problems, Journal of Global Optimization, vol. 9, pp. 169-181, 1996.

[19] C. Kanzow, Nonlinear Complementarity as Unconstrained Optimization, Journal of Optimization Theory and Applications, vol. 88, pp. 139-155, 1996.

[20] C. Kanzow, N. Yamashita and M. Fukushima, New NCP-functions and Their Properties, Journal of Optimization Theory and Applications, vol. 94, pp. 115-135, 1997.

[21] O. L. Mangasarian, Equivalence of the Complementarity Problem to a System of Nonlinear Equations, SIAM Journal on Applied Mathematics, vol. 31, pp. 89-92, 1976.

[22] J.-S. Pang, Complementarity problems, Handbook of Global Optimization, edited by R. Horst and P. Pardalos, Kluwer Academic Publishers, Boston, Massachusetts, pp. 271-338, 1994.

[23] J.-S. Pang, Newton's Method for B-differentiable Equations, Mathematics of Operations Research, vol. 15, pp. 311-341, 1990.

[24] J.-S. Pang and D. Chan, Iterative Methods for Variational and Complementarity Problems, Mathematical Programming, vol. 27, pp. 284-313, 1982.

[25] D. Sun and L.-Q. Qi, On NCP-functions, Computational Optimization and Applications, vol. 13, pp. 201-220, 1999.

[26] P. Tseng, Growth Behavior of a Class of Merit Functions for the Nonlinear Complementarity Problem, Journal of Optimization Theory and Applications, vol. 89, pp. 17-37, 1996.

[27] N. Yamashita and M. Fukushima, On Stationary Points of the Implicit Lagrangian for the Nonlinear Complementarity Problems, Journal of Optimization Theory and Applications, vol. 84, pp. 653-663, 1995.

[28] N. Yamashita and M. Fukushima, Modified Newton Methods for Solving a Semismooth Reformulation of Monotone Complementarity Problems, Mathematical Programming, vol. 76, pp. 469-491, 1997.

[29] K. Yamada, N. Yamashita and M. Fukushima, A New Derivative-free Descent Method for the Nonlinear Complementarity Problems, Nonlinear Optimization and Related Topics, edited by G. Di Pillo and F. Giannessi, Kluwer Academic Publishers, Netherlands, pp. 463-487, 2000.
