National Taiwan Normal University, Department of Mathematics, Master's Thesis. Advisor: Dr. Jein-Shan Chen (陳界山). Geometric view of generalized Fischer-Burmeister function and its induced merit function. Graduate student: 蔡懷潁. December 2010.

Acknowledgments

In completing the first thesis of my life, I would like to thank my advisor, Prof. Jein-Shan Chen, for his guidance, the oral examination committee members Prof. Liang-Ju Chu and Prof. Yuh-Jye Lee for their valuable comments, and my parents and classmates for their support. The experience of publishing my own work for the first time has been exceptionally precious, and I have gained a great deal from it.

Looking back on this fulfilling period of study in the Department of Mathematics at NTNU, with its constant learning and growth and a deeper appreciation of mathematics, I owe all of this to the NTNU teachers who so generously offered inspiration and guidance, and to classmates such as 筱芸 and 敬凡 for their company and help.

I remember Prof. Jein-Shan Chen saying, "Differential equations, probability and statistics, and optimization are the bridges between mathematics and the outside world." I am fortunate to have encountered the field of optimization, and I hope to keep improving. My mathematical foundations and thesis writing still need much strengthening, and I hope in the future to write theses that are more rigorously structured and more original.

Respectfully, 蔡懷潁
December 2010

Contents

1. Abstract
2. Introduction
3. Geometric view of ϕp
4. Geometric view of ψp
5. Geometric analysis of merit function in descent algorithms
6. Conclusion
7. Reference

Geometric view of generalized Fischer-Burmeister function and its induced merit function

Abstract. In this paper, we study some geometric properties of the generalized Fischer-Burmeister function, ϕp(a, b) = ∥(a, b)∥p − (a + b) where p ∈ (1, +∞), and the merit function ψp(a, b) induced from ϕp(a, b). It is well known that the nonlinear complementarity problem (NCP) can be reformulated as an equivalent unconstrained minimization by means of merit functions involving NCP-functions. From the geometric view of curves and surfaces, we gain more intuitive ideas about the convergent behavior of the descent algorithms that we use. Furthermore, the geometric view indicates how to improve the algorithm to achieve our goal by setting a proper value of the parameter in the merit function approach.

Keywords. Curvature, surface, level curve, NCP-function, merit function.

1. Introduction

The nonlinear complementarity problem (NCP) is to find a point x ∈ IR^n such that

x ≥ 0,   F(x) ≥ 0,   ⟨x, F(x)⟩ = 0,   (1)

where ⟨·, ·⟩ is the Euclidean inner product and F = (F1, ..., Fn)ᵀ is a map from IR^n to IR^n. We assume that F is continuously differentiable throughout this paper. The NCP has attracted much attention because of its wide applications in the fields of economics, engineering, and operations research [8, 11, 16], to name a few.

Many methods have been proposed to solve the NCP; see [14, 16, 22] and the references therein. One of the most powerful and popular approaches is to reformulate the NCP as a system of nonlinear equations [21, 23, 28], or as an unconstrained minimization problem [9, 10, 12, 15, 18, 19, 24, 27]. The objective function that constitutes an equivalent unconstrained minimization problem is called a merit function, whose global minima coincide with the solutions of the original NCP. To construct a merit function, a class of functions, called NCP-functions and defined below, plays a significant role.
A function ϕ : IR² → IR is called an NCP-function if it satisfies

ϕ(a, b) = 0  ⇐⇒  a ≥ 0, b ≥ 0, ab = 0.   (2)

Equivalently, ϕ is an NCP-function if the set of its zeros is the union of the two nonnegative semiaxes. An important NCP-function, which plays a central role in the development of efficient algorithms for the solution of the NCP, is the well-known Fischer-Burmeister (FB) NCP-function [12, 13] defined as

ϕ(a, b) = √(a² + b²) − (a + b).   (3)
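As a quick sanity check, the defining equivalence (2) for the FB function (3) can be verified numerically. The following minimal Python sketch is our own illustration, not part of the original text:

```python
import math

def phi_fb(a, b):
    """Fischer-Burmeister NCP-function: sqrt(a^2 + b^2) - (a + b)."""
    return math.sqrt(a * a + b * b) - (a + b)

# phi vanishes exactly on the two nonnegative semiaxes (property (2))
print(phi_fb(0.0, 3.0))    # a = 0, b >= 0 -> 0.0
print(phi_fb(2.0, 0.0))    # b = 0, a >= 0 -> 0.0
print(phi_fb(1.0, 1.0))    # a, b > 0 -> negative (sqrt(2) - 2)
print(phi_fb(-1.0, 0.0))   # a < 0, b = 0 -> -2a = 2.0
```

The last line also illustrates Proposition 2.1(c) below, where ϕ(a, 0) = −2a for a < 0.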

With the NCP-function, we can obtain an equivalent formulation of the NCP as a system of equations:

Φ(x) := ( ϕ(x1, F1(x)), ..., ϕ(xn, Fn(x)) )ᵀ = 0.   (4)

In other words, x solves the NCP ⇐⇒ Φ(x) = 0. In view of this, we define a real-valued function Ψ : IR^n → IR+ by

Ψ(x) := (1/2)∥Φ(x)∥² = (1/2) Σ_{i=1}^n ϕ²(xi, Fi(x)).   (5)

It is known that Ψ is a merit function of the NCP, i.e., the NCP is equivalent to the unconstrained minimization problem

min_{x ∈ IR^n} Ψ(x).   (6)

Merit functions are frequently used in designing numerical algorithms for solving the NCP. In particular, we can apply an iterative algorithm to minimize the merit function with the hope of obtaining its global minimum.

Recently, the so-called generalized Fischer-Burmeister function was proposed in [3, 4]. More specifically, the authors considered ϕp : IR² → IR defined by

ϕp(a, b) := ∥(a, b)∥p − (a + b),   (7)

where p > 1 is an arbitrary fixed real number and ∥(a, b)∥p denotes the p-norm of (a, b), i.e., ∥(a, b)∥p = (|a|^p + |b|^p)^(1/p). In other words, in the function ϕp, the 2-norm of (a, b) in the FB function is replaced by a more general p-norm. The function ϕp is still an NCP-function, and it naturally induces another NCP-function ψp : IR² → IR+ given by

ψp(a, b) := (1/2)|ϕp(a, b)|².   (8)

For any given p > 1, the function ψp is shown to possess all favorable properties of the FB merit function ψ; see [2, 3, 4]. It plays an important part in our study throughout the paper. Like Φ, the operator Φp : IR^n → IR^n defined as

Φp(x) := ( ϕp(x1, F1(x)), ..., ϕp(xn, Fn(x)) )ᵀ   (9)

yields a family of merit functions Ψp : IR^n → IR+ for the NCP:

Ψp(x) := (1/2)∥Φp(x)∥² = Σ_{i=1}^n ψp(xi, Fi(x)).   (10)

Analogously, the NCP is equivalent to the unconstrained minimization problem

min_{x ∈ IR^n} Ψp(x).   (11)

It was shown that if F is monotone [15] or a P0-function [10], then any stationary point of Ψ is a global minimum of the unconstrained minimization min_{x ∈ IR^n} Ψ(x), and hence solves the NCP. Similar results were generalized to Ψp in [4]. On the other hand, many classical iterative methods have been applied to this unconstrained minimization of the NCP. Derivative-free methods [29] are suitable for problems where the derivatives of F are not available or are expensive to evaluate. Some derivative-free algorithms with global convergence results have been proposed to solve the NCP based on the generalized Fischer-Burmeister merit function. For example, the numerical results in [4, 5] suggest that the performance of the algorithm is influenced by the parameter p. However, not much is known about how the parameter p affects convergent behavior in detail. In this paper, we aim to analyze it from a geometric point of view. More specifically, the objective of this paper is to study the relation between convergent behavior and the parameter p via geometry, in which the graphs of ϕp and ψp can be regarded as families of surfaces embedded in IR³.

This paper is organized as follows. In Section 2, we propose some geometric properties of ϕp and present its surface structure by figures. In Section 3, we study properties of ψp, and summarize the comparison between ϕp and ψp. In Section 4, we investigate a geometric visualization to see possible convergence behavior with different p by a few examples. Finally, we state the conclusion.

2. Geometric view of ϕp

In this section, we study some geometric properties of ϕp and interpret their meanings. We present the family of surfaces of ϕp(a, b) where p ∈ (1, +∞); see Figures 1-2.
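For concreteness, the objects ϕp, ψp, and Ψp introduced above can be coded in a few lines. The sketch below is our own illustration; the map F is a toy example (not one from the text) whose unique NCP solution is x* = (1, 0), so Ψp should vanish exactly there:

```python
def phi_p(a, b, p):
    # generalized FB function (7): ||(a,b)||_p - (a + b)
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def psi_p(a, b, p):
    # induced merit function (8): 0.5 * |phi_p(a,b)|^2
    return 0.5 * phi_p(a, b, p)**2

def Psi_p(x, F, p):
    # merit function (10) of the NCP: sum_i psi_p(x_i, F_i(x))
    Fx = F(x)
    return sum(psi_p(xi, fi, p) for xi, fi in zip(x, Fx))

# Toy NCP with F(x) = (x1 - 1, x2): solution x* = (1, 0)
F = lambda x: [x[0] - 1.0, x[1]]
print(Psi_p([1.0, 0.0], F, p=3.0))        # 0.0 at the solution
print(Psi_p([2.0, 1.0], F, p=3.0) > 0.0)  # positive away from it
```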
When we fix a real number p with 1 < p < +∞, Figure 2 gives an intuitive picture of how the surface shape is influenced by the value of p. From the definition of the p-norm, we know that ∥(a, b)∥1 := |a| + |b| and ∥(a, b)∥∞ := max{|a|, |b|}. It is straightforward that ϕp(a, b) → ϕ1(a, b) := |a| + |b| − (a + b) pointwise as p → 1, see Figure 2(a) and (b). On the other hand, ϕp(a, b) → ϕ∞(a, b) := max{|a|, |b|} − (a + b) pointwise as p → +∞, see Figure 2(e) and (f). Note that ϕ1(a, b) is not an NCP-function, because ϕ1(a, b) = 0 whenever a > 0 and b > 0, whereas ϕ∞(a, b) is an NCP-function but is not differentiable on the line a = b. Next, we give some lemmas which will be used in the subsequent analysis.

Lemma 2.1 [6, Lemma 3.1] If a > 0 and b > 0, then (a + b)^p > a^p + b^p for all p ∈ (1, +∞).
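The two pointwise limits noted above can be observed numerically. The following sketch (our own check, with a sample point of our choosing) compares ϕp for p near 1 and for large p against ϕ1 and ϕ∞:

```python
def phi_p(a, b, p):
    # generalized FB function: ||(a,b)||_p - (a + b)
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

a, b = 2.0, -3.0
phi_1   = abs(a) + abs(b) - (a + b)        # limiting function as p -> 1
phi_inf = max(abs(a), abs(b)) - (a + b)    # limiting function as p -> +infinity

print(abs(phi_p(a, b, 1.00001) - phi_1) < 1e-3)   # True: close to phi_1
print(abs(phi_p(a, b, 200.0) - phi_inf) < 1e-6)   # True: close to phi_inf
```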

Figure 1: The surface of z = ϕ2(a, b) with (a, b) ∈ [−10, 10] × [−10, 10].

Lemma 2.2 [17, Lemma 1.3] Let x = (x1, x2, ..., xn) ∈ IR^n and ∥x∥p := (Σ_{i=1}^n |xi|^p)^(1/p). If 1 < p1 < p2, then ∥x∥p2 ≤ ∥x∥p1 ≤ n^(1/p1 − 1/p2) ∥x∥p2.

Lemma 2.3 [5, Lemma 3.2] Let ϕp : IR² → IR be given as in (7) where p ∈ (1, +∞). Then,

(2 − 2^(1/p)) |min{a, b}| ≤ |ϕp(a, b)| ≤ (2 + 2^(1/p)) |min{a, b}|.

Proposition 2.1 Let ϕp : IR² → IR be given as in (7) where p ∈ (1, +∞). Then,

(a) (a > 0 and b > 0) ⇐⇒ ϕp(a, b) < 0;
(b) (a = 0 and b ≥ 0) or (b = 0 and a ≥ 0) ⇐⇒ ϕp(a, b) = 0;
(c) b = 0 and a < 0 ⇒ ϕp(a, b) = −2a > 0;
(d) a = 0 and b < 0 ⇒ ϕp(a, b) = −2b > 0.

Proof. (a) If a > 0 and b > 0, it is easy to see that ϕp(a, b) < 0 by Lemma 2.1. Conversely, suppose ϕp(a, b) < 0. Because (|a|^p + |b|^p)^(1/p) ≥ |a| and (|a|^p + |b|^p)^(1/p) ≥ |b|, we have (|a|^p + |b|^p)^(1/p) ≥ max{|a|, |b|}. If a ≤ 0 or b ≤ 0, then max{|a|, |b|} ≥ (a + b), which implies ϕp(a, b) ≥ 0, a contradiction. Hence a > 0 and b > 0.

(b) By the definition of ϕp(a, b), we know

ϕp(a, 0) = |a| − a = { 0 if a ≥ 0; −2a if a < 0 },   ϕp(0, b) = |b| − b = { 0 if b ≥ 0; −2b if b < 0 },

which shows that (a = 0 and b ≥ 0) or (b = 0 and a ≥ 0) ⇒ ϕp(a, b) = 0. Conversely, suppose ϕp(a, b) = 0. If a < 0 or b < 0, mimicking the arguments of part (a) yields

(|a|^p + |b|^p)^(1/p) ≥ max{|a|, |b|} > (a + b),

which implies ϕp(a, b) > 0. Thus, there must hold a ≥ 0 and b ≥ 0. Furthermore, one of a and b must be 0 by part (a).

The proofs of (c) and (d) follow directly from the proof of part (b). □

Proposition 2.1(a) shows that ϕp(a, b) is negative on the first quadrant of the IR² plane, see Figure 3, while Proposition 2.1(b) shows that ϕp(a, b) = 0 can only happen on the nonnegative semiaxes (i.e., a ≥ 0, b = 0 or a = 0, b ≥ 0). In fact, this proposition is equivalent to saying that ϕp(a, b) is an NCP-function. In addition, Proposition 2.1(b)-(d) indicate that the value of p does not affect the value of ϕp(a, b) on the a-axis and b-axis.

Proposition 2.2 Let ϕp : IR² → IR be given as in (7) where p ∈ (1, +∞). Then,

(a) ϕp(a, b) = ϕp(b, a);
(b) ϕp is convex, i.e., ϕp(αw + (1 − α)w′) ≤ αϕp(w) + (1 − α)ϕp(w′) for all w, w′ ∈ IR² and α ∈ [0, 1];
(c) if 1 < p1 < p2, then ϕp1(a, b) ≥ ϕp2(a, b).

Proof. The verifications of parts (a) and (b) are straightforward, so we omit them. Part (c) follows by applying Lemma 2.2. □

Proposition 2.2(a) shows the symmetric property of ϕp(a, b), which means that any pair of points symmetric about the line a = b have the same height. In other words, the surface z = ϕp(a, b) has the same structure over the second and fourth quadrants of the plane, see Figures 3-5. Proposition 2.2(b) says that the shape of the surface is convex because the function ϕp is convex, while Proposition 2.2(c) implies that the value of ϕp decreases as the value of p increases. In summary, the value of p affects the geometric structure.

Proposition 2.3 If {(ak, bk)} ⊆ IR² with (ak → −∞) or (bk → −∞) or (ak → +∞ and bk → +∞), then |ϕp(ak, bk)| → +∞ as k → +∞.

Proof. This can be found in [26, page 20]. □

Proposition 2.3 indicates the increasing directions on the surface. This can be seen from the contour graph of z = ϕp(a, b) plotted in Figure 3, where the deeper color represents the lower height.

In order to understand the structure of the surface, it is natural to investigate special curves on the surface. We consider a family of curves αr,p : IR → IR³ defined as follows:

αr,p(t) := ( r + t, r − t, ϕp(r + t, r − t) ),   (12)

where r ∈ IR and p ∈ (1, +∞) are two arbitrary fixed real numbers. These curves can be viewed as the intersection of the surface z = ϕp(a, b) and the plane a + b = 2r, see Figure 5. We study some properties regarding these special curves.

Lemma 2.4 Let ϕp : IR² → IR be given as in (7) where p ∈ (1, +∞). Fix any r ∈ IR and define f : IR → IR as f(t) := ϕp(r + t, r − t). Then f is a convex function.

Proof. We know that ϕp is a convex function by Proposition 2.2(b), and we observe that f is a composition of ϕp and an affine function. Thus, f is convex since it is a composition of a convex function with an affine map. □

Theorem 2.1 Let ϕp : IR² → IR be given as in (7) where p ∈ (1, +∞). Suppose a and b are constrained on the curve determined by the plane a + b = 2r (r ∈ IR) and the surface. Then, ϕp(a, b) attains its minimum ϕp(r, r) = 2^(1/p)|r| − 2r along this curve at (a, b) = (r, r).

Proof. We know that ϕp(a, b) is differentiable except at (0, 0); therefore we discuss two cases as follows.

(i) Case 1: r = 0. Because a + b = 0, a and b have opposite signs except at a = b = 0. From Proposition 2.1, we know ϕp(a, b) ≥ 0 in this case. Thus, ϕp(a, b) attains its minimum zero at (a, b) = (0, 0).

(ii) Case 2: r ≠ 0. Fix r and p > 1. Let f : IR → IR and g : IR → IR be respectively defined as

f(t) := ϕp(r + t, r − t),   g(t) := |r + t|^p + |r − t|^p.

Then we calculate that

f′(t) = g′(t) / ( p · g(t)^((p−1)/p) )   and   g′(t) = p [ sgn(r + t)|r + t|^(p−1) − sgn(r − t)|r − t|^(p−1) ].

We know g(t) > 0 for all t ∈ IR.
It is clear that g′(0) = 0, and hence f′(0) = 0. By Lemma 2.4, f is convex on IR; since it is also continuous, t = 0 is a critical point of f and hence a global minimizer of f. The proof is complete since a = b = r and ϕp(r, r) = 2^(1/p)|r| − 2r when t = 0. □
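Theorem 2.1, together with the symmetry and monotonicity of Proposition 2.2, is easy to spot-check numerically. The sketch below (our own verification on sample values of r and p) minimizes f(t) = ϕp(r + t, r − t) on a grid and compares against the closed-form minimum 2^(1/p)|r| − 2r:

```python
def phi_p(a, b, p):
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

r, p = 1.5, 3.0
f = lambda t: phi_p(r + t, r - t, p)    # slice along the plane a + b = 2r

# grid search over t in [-3, 3]
ts = [i / 100.0 for i in range(-300, 301)]
t_min = min(ts, key=f)

print(t_min)   # minimizer t = 0, i.e. (a, b) = (r, r)
print(abs(f(0.0) - (2**(1.0 / p) * abs(r) - 2 * r)) < 1e-12)   # True

# Proposition 2.2(a),(c): symmetry and monotone decrease in p
print(phi_p(1.5, -2.0, 2.5) == phi_p(-2.0, 1.5, 2.5))   # True
print(phi_p(1.5, -2.0, 1.5) >= phi_p(1.5, -2.0, 3.0))   # True
```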

Lemma 2.4 and Theorem 2.1 show that the curve determined by the plane a + b = 2r and the surface z = ϕp(a, b) is convex and attains its minimum at a = b, see Figure 6.

We now study the curvature of the family of curves αr,p defined as in (12) at the point (r, r, ϕp(r, r)). Because the function ϕp is not differentiable at (a, b) = (0, 0) (i.e., r = 0), we choose two points (−t0, t0, ϕp(−t0, t0)) and (t0, −t0, ϕp(t0, −t0)) where t0 > 0, and calculate the cosine of the angle between α0,p(−t0) and α0,p(t0), see Figure 7.

Proposition 2.4 Let αr,p : IR → IR³ be defined as in (12), and let cosp(θ) denote the cosine of the angle θ between the two vectors α0,p(−t0) and α0,p(t0) where t0 > 0. Then,

(a) cosp(θ) = (2^(2/p) − 6) / √( (2^(2/p) − 2)² + 32 );

(b) cosp(θ) → −1/3 as p → 1, and cosp(θ) → −5/√33 as p → +∞;

(c) if 1 < p1 < p2, then cosp1(θ) > cosp2(θ).

Proof. (a) By direct computation, we obtain

cosp(θ) = α0,p(−t0) · α0,p(t0) / ( ∥α0,p(−t0)∥ ∥α0,p(t0)∥ ) = (2^(2/p) − 6) / √( (2^(2/p) − 2)² + 32 ).

(b) From part (a), let f : (1, +∞) → IR be f(p) := cosp(θ). Then f(p) is continuous on (1, +∞). By taking the limit, we have cosp(θ) → −1/3 as p → 1, and cosp(θ) → −5/√33 as p → +∞.

(c) Writing u = 2^(2/p), part (a) gives f = (u − 6)/√((u − 2)² + 32), whose derivative with respect to u is (4u + 24)/((u − 2)² + 32)^(3/2) > 0. Since u = 2^(2/p) is strictly decreasing in p, f(p) is strictly decreasing on (1, +∞). □

Proposition 2.5 Let αr,p : IR → IR³ be defined as in (12) with r ≠ 0. Then the following hold.

(a) The curvature at the point αr,p(0) = (r, r, ϕp(r, r)) is κp(0) = (p − 1)2^(1/p − 1) / |r|.

(b) κp(0) → 0 as p → 1 and κp(0) → +∞ as p → +∞.

(c) If 1 < p1 < p2, then κp1(0) < κp2(0).

Proof. (a) Because αr,p(t) = ( r + t, r − t, ϕp(r + t, r − t) ), we know

α′r,p(0) = (1, −1, 0)   and   α′′r,p(0) = ( 0, 0, (p − 1)2^(1/p) / |r| ).

Recall the formula for curvature,

κp(t) = |α′r,p(t) ∧ α′′r,p(t)| / |α′r,p(t)|³,

where the wedge operator ∧ denotes the cross product of two vectors. Thus, we have

κp(0) = |α′r,p(0) ∧ α′′r,p(0)| / |α′r,p(0)|³ = (p − 1)2^(1/p − 1) / |r|.

(b) Let f : (1, +∞) → IR be defined as

f(p) := κp(0) = (p − 1)2^(1/p − 1) / |r|;

then obviously f(p) is continuous on (1, +∞). Thus, the desired result follows by taking the limit directly.

(c) From part (b), we compute that

f′(p) = ( 2^(1/p − 1) / |r| ) ( 1 − (ln 2)/p + (ln 2)/p² ),

which implies f′(p) > 0 for all p ∈ (1, +∞). Then f(p) is strictly increasing on (1, +∞). □

The above two propositions show how p affects the geometric structure, see Figure 8(a) and (b). Proposition 2.5(b) says that as p → 1 the curve approaches a straight line, see Figure 8(c), while as p → +∞ the curve becomes sharper and sharper at the point; in the limit the curve is not differentiable at t = 0, see Figure 8(d). To sum up, all the properties presented in this section show that p indeed affects the geometric behavior of the surface z = ϕp(a, b) both locally and globally.

3. Geometric view of ψp

In the previous section, we saw that the generalized FB function ϕp is convex and differentiable everywhere except at (0, 0). In contrast, the function ψp(a, b) defined as in (8) is non-convex, but continuously differentiable everywhere. Nonetheless, ϕp and ψp have many similar geometric properties, as will be seen later. In this section, we study properties parallel to those of Section 2 and compare the differences between ψp and ϕp.

Figure 2: The surface of z = ϕp(a, b) with different p: (a) z = ϕ1(a, b), (b) p = 1.1, (c) p = 2, (d) p = 3, (e) p = 100, (f) z = ϕ∞(a, b).

Figure 3: Level curves of z = ϕp(a, b) with different p: (a) ϕ1(a, b), (b) p = 1.1, (c) p = 2, (d) p = 3, (e) p = 100, (f) ϕ∞(a, b).

Figure 4: The surface of z = ϕ2(a, b) with (a, b) ∈ [0, 10] × [0, 10].

Figure 5: The curve intersected by the surface z = ϕp(a, b) and the plane a + b = 2r: (a) a + b = 4 and z = ϕ2(a, b), (b) a + b = −4 and z = ϕ2(a, b), (c) a + b = 0 and z = ϕ2(a, b), (d) a + b = 0 and z = ϕ1.1(a, b).

Figure 6: The curve f(t) = ϕp(r + t, r − t): (a) r = 1/2 and p = 1.1, (b) r = 5/2 and p = 2, (c) r = 5 and p = 100, (d) r = −5 and p = 100.

Figure 7: Angle between vectors α0,p(−t0) and α0,p(t0): (a) angle with different p (p = 1.1, 1.5, 2, 3, 10), (b) the change of angle as p → 1, (c) the change of angle as p → +∞.

Figure 8: The curvature κp(0) at the point αr,p(0): (a) the curvature with different p (p = 1.1, 1.5, 2, 3, 10) with r = 1/2, (b) the curvature with different p with r = −1/2, (c) the change of curvature as p → 1, (d) the change of curvature as p → +∞.
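The closed-form curvature of Proposition 2.5(a), visualized in Figure 8, can also be cross-checked with a finite-difference computation. The sketch below (our own numerical check, with sample values r = 2 and p = 2) uses the fact that at t = 0 the tangent of αr,p is horizontal (f′(0) = 0), so the curvature reduces to |f′′(0)|/2 for the slice f(t) = ϕp(r + t, r − t):

```python
def phi_p(a, b, p):
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

r, p, h = 2.0, 2.0, 1e-3
f = lambda t: phi_p(r + t, r - t, p)

# at t = 0 the tangent is horizontal, so kappa = |f''(0)| / 2
f2 = (f(h) - 2.0 * f(0.0) + f(-h)) / h**2        # central second difference
kappa_numeric = abs(f2) / 2.0
kappa_closed = (p - 1) * 2**(1.0 / p - 1) / abs(r)   # Proposition 2.5(a)

print(abs(kappa_numeric - kappa_closed) < 1e-5)   # True
```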

Figure 9: The surface of z = ψ2(a, b) with (a, b) ∈ [−10, 10] × [−10, 10].

Figure 10: The surface of z = ψ2(a, b) with (a, b) ∈ [0, 10] × [0, 10].

Figure 11: The surface of z = ψp(a, b) with different p: (a) z = ψ1(a, b), (b) p = 1.1, (c) p = 2, (d) p = 3, (e) p = 100, (f) z = ψ∞(a, b).

Figure 12: Level curves of z = ψp(a, b) with different p: (a) ψ1(a, b), (b) p = 1.1, (c) p = 2, (d) p = 3, (e) p = 100, (f) ψ∞(a, b).

Proposition 3.1 Let ψp : IR² → IR be given as in (8) where p ∈ (1, +∞). Then,

(a) ψp(a, b) ≥ 0 for all (a, b) ∈ IR²;
(b) ψp(a, b) = ψp(b, a) for all (a, b) ∈ IR²;
(c) (a = 0 and b ≥ 0) or (b = 0 and a ≥ 0) ⇐⇒ ψp(a, b) = 0;
(d) b = 0 and a < 0 ⇒ ψp(a, b) = 2a² > 0;
(e) a = 0 and b < 0 ⇒ ψp(a, b) = 2b² > 0;
(f) ψp is continuously differentiable everywhere.

Proof. Parts (d) and (e) come from Proposition 2.1(c) and Proposition 2.1(d); please see [2, 3, 4] for the rest. □

Proposition 2.2(c) says that the value of ϕp is decreasing with respect to p. In contrast, ψp does not have this property in general; more specifically, ψp is monotone in p only on certain quadrants.

Proposition 3.2 Suppose 1 < p1 < p2 and (a, b) ∈ IR². Then,

(a) if a < 0 or b < 0, then ψp1(a, b) ≥ ψp2(a, b);
(b) if a > 0 and b > 0, then ψp1(a, b) ≤ ψp2(a, b).

Proof. (a) This is clear from Proposition 2.2(c): in this case ϕp(a, b) ≥ 0 by Proposition 2.1(a), so the ordering ϕp1(a, b) ≥ ϕp2(a, b) ≥ 0 is preserved under squaring.

(b) Suppose a > 0 and b > 0. From Proposition 2.1(a), we have ϕp(a, b) < 0. Then Proposition 2.2(c) yields 0 > ϕp1(a, b) ≥ ϕp2(a, b), and hence ϕ²p1(a, b) ≤ ϕ²p2(a, b). □

Since ψp is not convex in general, the counterpart of Theorem 2.1 reads as follows.

Theorem 3.1 Let ψp(a, b) be defined as in (8) with a + b = 2r. Then, the following hold.

(a) If r ∈ IR+ and a > 0, b > 0, then ψp(a, b) attains its maximum (2^(2/p − 1) − 2^(1/p + 1) + 2) r² at (a, b) = (r, r).

(b) If r ∈ IR− ∪ {0}, then ψp(a, b) attains its minimum (2^(2/p − 1) + 2^(1/p + 1) + 2) r² at (a, b) = (r, r).

Figure 13: The curve intersected by the surface z = ψp(a, b) and the plane a + b = 2r: (a) a + b = 0 and z = ψ2(a, b), (b) a + b = 2 and z = ψ2(a, b).

Proof. (a) When a > 0 and b > 0, Proposition 2.1(a) says that ϕp(a, b) < 0. Since ψp = (1/2)ϕ²p, the minimum of ϕp(a, b) along the curve given by Theorem 2.1 becomes the maximum of ψp(a, b).

(b) This is a consequence of Theorem 2.1. □

The aforementioned results show that ψp has many properties similar to those of ϕp, see Figures 11-12, where we denote ψ1(a, b) := (1/2)|ϕ1(a, b)|² and ψ∞(a, b) := (1/2)|ϕ∞(a, b)|². However, there are still some differences between ϕp and ψp; for example, ψp is not convex whereas ϕp is. Figure 12 depicts the increasing directions of ψp. Note that ψp(a, b) is nonnegative and behaves differently when a > 0 and b > 0, see Figure 10.

In order to further understand the geometric properties, we define a family of curves as follows:

βr,p(t) := ( r + t, r − t, ψp(r + t, r − t) ),   (13)

where r is a fixed real number and t ∈ IR. This family of curves can be regarded as the intersection of the plane a + b = 2r and the surface z = ψp(a, b), see Figure 13.

Proposition 3.3 Let βr,p : IR → IR³ be defined as in (13). Then the following hold.

(a) The curvature at the point βr,p(0) = (r, r, ψp(r, r)) is κ̄p(0) = (p − 1)2^(1/p) (1 − 2^(1/p − 1)).

(b) κ̄p(0) → 0 as p → 1 and κ̄p(0) → +∞ as p → +∞.

(c) If 1 < p1 < p2, then κ̄p1(0) < κ̄p2(0).

Proof. (a) From βr,p(t) = (r + t, r − t, ψp(r + t, r − t)), we know

β′r,p(0) = (1, −1, 0)   and   β′′r,p(0) = ( 0, 0, (p − 1)2^(2/p) − sgn(r)(p − 1)2^(1/p + 1) ),

which yields

κ̄p(0) = |β′r,p(0) ∧ β′′r,p(0)| / |β′r,p(0)|³ = (p − 1)2^(1/p) (1 − 2^(1/p − 1)).

(b) Let f : (1, +∞) → IR be defined as f(p) := κ̄p(0) = (p − 1)2^(1/p) (1 − 2^(1/p − 1)). Then the result follows by taking the limit directly.

(c) From part (b), it can be verified that f′(p) > 0 for all p ∈ (1, +∞). Thus, f(p) is strictly increasing on (1, +∞). □

Figure 14 depicts the change of the curve for different values of p, from which we can see the change of curvature as p approaches one or infinity. We state an addendum to part (a) here: the curvature at the two further special points βr,p(−r) = (0, 2r, 0) and βr,p(r) = (2r, 0, 0) is the same, namely, κ̄p(r) = κ̄p(−r) = 1/2. Note that although ψp is differentiable everywhere, the mean curvature at (0, 0) does not exist.

To end this section, we summarize the similarities and differences between ϕp and ψp as below.

Differences:
- ϕp is convex; ψp is nonconvex.
- ϕp is differentiable everywhere except (0, 0); ψp is differentiable everywhere.
- ϕp(a, b) < 0 when a > 0 and b > 0; ψp(a, b) ≥ 0 for all (a, b) ∈ IR².

Similarities:
(1) Both are NCP-functions.
(2) Symmetry: ϕp(a, b) = ϕp(b, a) and ψp(a, b) = ψp(b, a).
(3) The function values on the axes are not affected by p.
(4) When (ak → −∞) or (bk → −∞) or (ak, bk → +∞), we have |ϕp(ak, bk)| → ∞ and |ψp(ak, bk)| → ∞.
(5) Both are non-coercive.
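The quadrant-dependent monotonicity of Proposition 3.2 and the extreme values in Theorem 3.1 can both be spot-checked numerically. The following sketch is our own verification on sample values:

```python
def phi_p(a, b, p):
    return (abs(a)**p + abs(b)**p)**(1.0 / p) - (a + b)

def psi_p(a, b, p):
    return 0.5 * phi_p(a, b, p)**2

p1, p2 = 1.5, 4.0   # p1 < p2

# Proposition 3.2: psi_p decreasing in p off the first quadrant, increasing on it
print(psi_p(-1.0, 2.0, p1) >= psi_p(-1.0, 2.0, p2))   # True (a < 0)
print(psi_p(2.0, 3.0, p1) <= psi_p(2.0, 3.0, p2))     # True (a > 0, b > 0)

# Theorem 3.1(a): along a + b = 2r with r > 0 the value at (r, r) is
# (2^(2/p - 1) - 2^(1/p + 1) + 2) r^2
r, p = 1.5, 2.0
value = (2**(2.0 / p - 1) - 2**(1.0 / p + 1) + 2) * r**2
print(abs(psi_p(r, r, p) - value) < 1e-12)            # True
```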

Figure 14: The curvature κ̄p(0) at the point βr,p(0): (a) the curvature with different p (p = 1.1, 1.5, 2, 3, 10) with r = 1, (b) the change of curvature as p → 1 at β1,p(0), (c) the change of curvature as p → +∞ at β1,p(0), (d) the change of curvature as p → 1 at β1,p(1), (e) the change of curvature as p → +∞ at β1,p(1).

4. Geometric analysis of merit function in descent algorithms

In this section, we employ the derivative-free descent algorithms presented in [4, 5] to solve the unconstrained minimization problem (11) by using the merit function (10). We then compare the two algorithms and study their convergent behavior through an intuitive visualization. We first list the two algorithms below.

Algorithm 4.1 [4, Algorithm 4.1]

(Step 0) Given a real number p > 1 and a starting point x0 ∈ IR^n, choose the parameters σ ∈ (0, 1), β ∈ (0, 1) and ε ≥ 0. Set k := 0.

(Step 1) If Ψp(x^k) ≤ ε, then stop.

(Step 2) Let m_k be the smallest nonnegative integer m satisfying

Ψp(x^k + β^m d^k) ≤ (1 − σβ^(2m)) Ψp(x^k),

where d^k := −∇b ψp(x^k, F(x^k)) and

∇b ψp(x, F(x)) := ( ∇b ψp(x1, F1(x)), ..., ∇b ψp(xn, Fn(x)) )ᵀ.

(Step 3) Set x^(k+1) := x^k + β^(m_k) d^k, k := k + 1, and go to Step 1.

Algorithm 4.2 [5, Algorithm 4.1]

(Step 0) Given real numbers p > 1, α ≥ 0 and a starting point x0 ∈ IR^n, choose the parameters σ ∈ (0, 1), β ∈ (0, 1), γ ∈ (0, 1) and ε ≥ 0. Set k := 0.

(Step 1) If Ψα,p(x^k) ≤ ε, then stop.

(Step 2) Let m_k be the smallest nonnegative integer m satisfying

Ψα,p(x^k + β^m d^k(γ^m)) ≤ (1 − σβ^(2m)) Ψα,p(x^k),

where d^k(γ^m) := −∇b ψα,p(x^k, F(x^k)) − γ^m ∇a ψα,p(x^k, F(x^k)) and

∇a ψα,p(x, F(x)) := ( ∇a ψα,p(x1, F1(x)), ..., ∇a ψα,p(xn, Fn(x)) )ᵀ,
∇b ψα,p(x, F(x)) := ( ∇b ψα,p(x1, F1(x)), ..., ∇b ψα,p(xn, Fn(x)) )ᵀ.

(Step 3) Set x^(k+1) := x^k + β^(m_k) d^k(γ^(m_k)), k := k + 1, and go to Step 1.

In Algorithm 4.2, ψα,p : IR² → IR+ is an NCP-function defined by

ψα,p(a, b) := (α/2)(max{0, ab})² + ψp(a, b) = (α/2)(max{0, ab})² + (1/2)(∥(a, b)∥p − (a + b))²,

with α ≥ 0 a real parameter. When α = 0, the function ψα,p reduces to ψp. To compare the two algorithms, we take α = 0 when we use Algorithm 4.2 in this section. Note that the descent direction in Algorithm 4.1 lacks a certain symmetry, whereas Algorithm 4.2 adopts a symmetric search direction. Under the assumption of monotonicity, i.e., ⟨x − y, F(x) − F(y)⟩ ≥ 0 for all x, y ∈ IR^n, an error bound is available and Algorithm 4.2 is shown to have a locally R-linear convergence rate in [5]. In other words, there exists a positive constant κ2 such that

∥x^k − x*∥ ≤ κ2 max{ Ψα,p(x^k), Ψα,p(x^k)^(1/2) }   when α = 0.

Furthermore, the convergence rate of Algorithm 4.2 has a close relation with the constant

⌈ log_(βγ) ( σ / (L1 + C(B, α, p)) ) ⌉,   where C(B, α, p) = αB²(2 − 2^(1/p))⁴ + (2 + 2^(1/p))².

Therefore, when the value of p decreases, the convergence rate of Algorithm 4.2 becomes worse and worse; see Remark 4.1 in [5].

Recall that the merit function Ψp(x) is the sum of the n nonnegative functions ψp, i.e.,

Ψp(x) = Σ_{i=1}^n ψp(xi, Fi(x)).

This encourages us to view each component ψp(x_i^k, F_i(x^k)), i = 1, 2, ..., n, as a motion with its own velocity on the same surface z = ψp(a, b) at each iteration. Based on our study in Section 2 and Section 3, we obtain a visualization that helps us understand the convergent behavior in detail. Figure 19 depicts the visualization for the four-dimensional NCP in Example 4.3, whose merit function is Ψp(x) = Σ_{i=1}^4 ψp(xi, Fi(x)). We plot the point sequences {(x_i^k, F_i(x^k))} for i = 1, 2, 3, 4 together, in different colors, with the level curves of the surface z = ψ1.1(a, b) in Figure 19(a).
The vertical axis represents the value of each xi, the horizontal axis represents the value of each Fi(x), and the skew line marks xi = Fi(x). We take the initial point x0 = (0, 0, 0, 0), which implies F(x0) = (−6, −2, −1, −3), and observe the convergent behavior separately for each i from the initial point to the

solution x* = (√6/2, 0, 0, 1/2), which lies on the horizontal line in this figure. Furthermore, we observe the position of the point sequence on the surface in Figure 19(a) and the merit function, which is the sum of their heights at each iteration, in Figure 19(b).

For a one-dimensional NCP, F is continuously differentiable and there is only one variable x in F, so (x, F(x)) is a continuous curve in IR² and the merit function Ψp(x) = ψp(x, F(x)) is a curve on the surface z = ψp(a, b), see Figure 15(a) and (b). Therefore, the point sequence of a one-dimensional problem can only lie on the curve (x, F(x), ψp(x, F(x))).

Example 4.1 Consider the NCP where F : IR → IR is given by F(x) = (x − 3)³ + 1. The unique solution of this NCP is x* = 2. Note that F is strictly monotone; see the geometric view of this NCP in Figure 15. The value of the merit function at each iteration is plotted in Figure 15(c), which presents the different behavior of the functions with different values of p near the solution. Figures 16(a)-(d) depict the convergent behavior of Algorithm 4.1 from two directions with two different initial points, and Figures 16(e)-(f) show the convergent behavior with different p. Figures 18(a)-(d) depict the convergent behavior of Algorithm 4.2 from two directions with two different initial points. We found that Algorithm 4.2 always produces a point sequence in or close to the boundary of the feasible set, i.e., {(x, F(x)) : x ≥ 0 and F(x) ≥ 0}. Based on Proposition 3.2, the rate of decrease of the merit function in Algorithm 4.1 differs across initial points when we increase p, but it is similar across initial points in Algorithm 4.2. This phenomenon is consistent with the geometric properties studied in Section 3.

To show the importance of inflection points, we give an extreme example as follows.

Example 4.2 Consider the NCP where F : IR → IR is given by F(x) = 1. The unique solution of this NCP is x* = 0.
From the above discussion, we know that the point sequence lies on the curve (x, 1, ψp(x, 1)); see Figure 17(a). Figure 17(c) shows a rapid decrease of the merit function from the 80th to the 120th iteration, and Figure 17(b) shows the behavior during those iterations. Observing the width of the level curves in Figure 17(b), we find that the rapid decrease may arise from the existence of an inflection point on the surface. Figures 17(c)-(f) and Figures 18(e) and (f) show that the position of the inflection point may change with different p.

Example 4.3 Consider the NCP where F : IR4 → IR4 is given by

F(x) = ( 3x1^2 + 2x1x2 + 2x2^2 + x3 + 3x4 − 6,
         2x1^2 + x1 + x2^2 + 3x3 + 2x4 − 2,
         3x1^2 + x1x2 + 2x2^2 + 2x3 + 3x4 − 1,
         x1^2 + 3x2^2 + 2x3 + 3x4 − 3 )^T.

This is a non-degenerate NCP and its solution is x∗ = (√6/2, 0, 0, 1/2). Figure 19 shows that the behavior of the merit function is consistent with Proposition 3.2(b) for Algorithm 4.1. Figure 20 shows that the convergent behavior of Algorithm 4.1 differs with different initial points, and Figure 21 shows that the convergent behavior of Algorithm 4.2 also differs with different initial points. Nevertheless, Figures 20 and 21 show that the behavior still differs between the two algorithms even with the same initial point. Figure 22(e) shows that the merit function decreases more and more quickly when p is smaller, whereas Figure 23(e) shows that the merit function decreases more and more quickly when p is bigger.

Example 4.4 Consider the NCP where F : IR5 → IR5 is given by

F(x) = ( x1 + x2x3x4x5/50,
         x2 + x1x3x4x5/50 − 3,
         x3 + x1x2x4x5/50 − 1,
         x4 + x1x2x3x5/50 + 1/2,
         x5 + x1x2x3x4/50 )^T.

This NCP has only one solution, x∗ = (0, 3, 1, 0, 0). We choose the initial point x0 = (3, 4, 2, 2, 2) in Algorithm 4.2; see Figure 24. Figure 24(e) and Figure 23(e) show different results with different p.

The results of the above examples suggest that the convergent behavior is influenced by the position of the initial point, the properties of F(x), and the geometric structure of the NCP function ψp. Indeed, the convergent behavior of Algorithm 4.1 can be classified into three cases when starting from different initial points.

• Case 1: the point sequences {(xi^k, Fi(x^k))}, i = 1, . . . , n, are almost located in or close to the boundary of the surface z = ψp(a, b) where a > 0 and b > 0; see Figure 20(f).

• Case 2: the point sequences {(xi^k, Fi(x^k))}, i = 1, . . . , n, are almost located in or close to the boundary of the surface z = ψp(a, b) where a > 0 and b < 0; see Figures 16(a) and (e).

• Case 3: the point sequence belongs to neither case 1 nor case 2; see Figure 20(a).
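The stated solutions of Examples 4.3 and 4.4 can be verified directly against the definition (1) of the NCP: nonnegativity of x and F(x) plus complementarity. The sketch below does exactly that; the helper name is_ncp_solution is ours, introduced only for this check.

```python
import numpy as np

def F_ex43(x):
    # F of Example 4.3
    x1, x2, x3, x4 = x
    return np.array([
        3*x1**2 + 2*x1*x2 + 2*x2**2 + x3 + 3*x4 - 6,
        2*x1**2 + x1 + x2**2 + 3*x3 + 2*x4 - 2,
        3*x1**2 + x1*x2 + 2*x2**2 + 2*x3 + 3*x4 - 1,
        x1**2 + 3*x2**2 + 2*x3 + 3*x4 - 3,
    ])

def F_ex44(x):
    # F of Example 4.4
    x1, x2, x3, x4, x5 = x
    return np.array([
        x1 + x2*x3*x4*x5/50,
        x2 + x1*x3*x4*x5/50 - 3,
        x3 + x1*x2*x4*x5/50 - 1,
        x4 + x1*x2*x3*x5/50 + 0.5,
        x5 + x1*x2*x3*x4/50,
    ])

def is_ncp_solution(x, F, tol=1e-10):
    # x >= 0, F(x) >= 0, and <x, F(x)> = 0, up to rounding
    Fx = F(x)
    return bool((x >= -tol).all() and (Fx >= -tol).all() and abs(x @ Fx) < tol)

x43 = np.array([np.sqrt(6)/2, 0.0, 0.0, 0.5])
x44 = np.array([0.0, 3.0, 1.0, 0.0, 0.0])
print(is_ncp_solution(x43, F_ex43), is_ncp_solution(x44, F_ex44))  # prints: True True
```

For x∗ = (√6/2, 0, 0, 1/2) the check also confirms non-degeneracy: xi + Fi(x∗) > 0 for every i.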
In addition, the value of the merit function decreases more quickly when the value of p decreases in case 1, see Figure 22, while it decreases more quickly when the value of p increases in case 2, see Figure 19. As the visualization shows, the rate of decrease of the merit function seems to depend on the slope of the surface; thus, the above two cases can be explained geometrically by Proposition 3.2. Although the convergent behavior is complicated in case 3, if there exists some initial point x0 such that Fi(x0) < 0 for some i and p is small, for example p ∈ (1, 2), we can easily deduce that the value of the merit function decreases more quickly when the value of p increases, as in case 2; see Figure 16. This is because the surface z = ψp(a, b) is much higher where a > 0 and b < 0 than where a > 0 and b > 0; see Figure 12 and Figure 14(d). Therefore the value of the merit function is dominated by the components of F(x) for which the initial point satisfies Fi(x0) < 0. This result was discussed and presented in [4].

The convergent behavior of Algorithm 4.2 belongs to case 1; see Figure 18(a) and Figure 21. As the visualization shows, with a steady step length at each iteration the behavior of the merit function seems to depend on the height scale of the surface. The surface in case 1 is closer to zero but becomes flatter when p is smaller. This is the geometric reason why Algorithm 4.2 has better global convergence but a worse convergence rate when p decreases; see the concluding remark in [5] and Figure 23(e). Although Algorithm 4.2 seems to have a different global convergence result from Algorithm 4.1, we can choose the initial point we want by observing our visualization. Figure 24(e) and Figure 22(e) show similar global convergence results with different p; this is because they have similar convergent behavior, see Figure 20(f) and Figures 24(a)-(d).

5. Conclusion

In this paper, we view the generalized Fischer-Burmeister function ϕp as a family of surfaces in IR3 and study some geometric properties of ϕp and ψp. If we use a descent method to solve the NCP with the merit function Ψp, then the convergent behavior is influenced by the descent direction and the geometric structure of the surface z = ψp(a, b). From the visualization presented in this paper, we observe that the existence of an inflection point causes a rapid decrease of the merit function. The convergent behavior of both algorithms is sensitive to the initial point, which is also mentioned in [5, 7]. In addition, the global convergence of Algorithm 4.1 and Algorithm 4.2 may depend on the geometric structure for different initial points. From the geometric view, an NCP function can be regarded as a surface in IR3, and the parameters defined in NCP functions may change the geometric structure of the surface.
This geometric view and visualization greatly help us understand how the convergent behavior changes when parameters in the NCP function are changed or added.
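The dominance argument above (the surface z = ψp(a, b) is much higher for a > 0, b < 0 than for a > 0, b > 0, and smaller p brings the first-quadrant part of the surface closer to zero) can be checked pointwise. The sketch below again assumes ψp = ½ϕp², the squared generalized FB function:

```python
def phi_p(a, b, p):
    # generalized Fischer-Burmeister function: ||(a, b)||_p - (a + b)
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)

def psi_p(a, b, p):
    # induced merit function, assumed to be the squared FB function
    return 0.5 * phi_p(a, b, p) ** 2

for p in (1.1, 2.0, 100.0):
    pos = psi_p(1.0, 1.0, p)   # first quadrant: a > 0, b > 0
    neg = psi_p(1.0, -1.0, p)  # fourth quadrant: a > 0, b < 0
    print(f"p={p:6}: psi(1, 1)={pos:.4f}  psi(1, -1)={neg:.4f}")
```

For p = 1.1 the height ψp(1, −1) is more than two hundred times ψp(1, 1), while for p = 100 the two heights are nearly equal; this is the pointwise version of why, for small p, the merit value is dominated by components with Fi(x0) < 0.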

[Figure 15: Geometric view of the NCP in Example 4.1. (a) Graph of F(x) = (x − 3)^3 + 1. (b) The plane curve in (a) and level curves of the surface z = ψ2(a, b) near the solution (x∗, F(x∗)) = (2, 0). (c) The merit function with different p near the solution.]

[Figure 16: Convergent behavior of Algorithm 4.1 and the value of the merit function in Example 4.1, for initial points x0 = 1 and x0 = 4 with p = 1.1, and x0 = 1 with p = 100.]

[Figure 17: Convergent behavior of Algorithm 4.1 and the merit function with initial point x0 = 1 in Example 4.2, for p = 1.1, 2, 3, 5; panel (b) shows the local behavior from the 80th to the 120th iteration of (c).]

[Figure 18: Convergent behavior of Algorithm 4.2 and the value of the merit function in Examples 4.1 and 4.2.]

[Figure 19: Convergent behavior of Algorithm 4.1 and the merit function with initial point x0 = (0, 0, 0, 0) in Example 4.3, for p = 1.1, 1.5, 2.]

[Figure 20: Convergent behavior of Algorithm 4.1 with different initial points and p = 2 in Example 4.3.]

[Figure 21: Convergent behavior of Algorithm 4.2 with different initial points and p = 2 in Example 4.3.]

[Figure 22: Merit function in the first 200 iterations of Algorithm 4.1 with different p and initial point x0 = (2, 1, 1, 1) of Figure 20(f).]

[Figure 23: Merit function of Algorithm 4.2 with different p and initial point x0 = (2, 1, 1, 1) of Figure 21(f).]

[Figure 24: Convergent behavior with different p and the merit function of Algorithm 4.2 with initial point x0 = (3, 4, 2, 2, 2) in Example 4.4.]

References

[1] S. C. Billups, S. P. Dirkse and M. C. Soares, A comparison of algorithms for large scale mixed complementarity problems, Computational Optimization and Applications, vol. 7, pp. 3-25, 1997.

[2] J.-S. Chen, The Semismooth-related Properties of a Merit Function and a Descent Method for the Nonlinear Complementarity Problem, Journal of Global Optimization, vol. 36, pp. 565-580, 2006.

[3] J.-S. Chen, On Some NCP-functions Based on the Generalized Fischer-Burmeister Function, Asia-Pacific Journal of Operational Research, vol. 24, pp. 401-420, 2007.

[4] J.-S. Chen and S. Pan, A Family of NCP-functions and a Descent Method for the Nonlinear Complementarity Problem, Computational Optimization and Applications, vol. 40, pp. 389-404, 2008.

[5] J.-S. Chen, H.-T. Gao and S. Pan, An R-linearly convergent derivative-free algorithm for the NCPs based on the generalized Fischer-Burmeister merit function, Journal of Computational and Applied Mathematics, vol. 230, pp. 69-82, 2009.

[6] J.-S. Chen, Z.-H. Huang and C.-Y. She, A new class of penalized NCP-functions and its properties, to appear in Computational Optimization and Applications, 2011.

[7] J.-S. Chen, C.-H. Ko and S.-H. Pan, A neural network based on generalized Fischer-Burmeister function for nonlinear complementarity problems, Information Sciences, vol. 180, pp. 697-711, 2010.

[8] R. W. Cottle, J.-S. Pang and R.-E. Stone, The Linear Complementarity Problem, Academic Press, New York, 1992.

[9] S. Dafermos, An Iterative Scheme for Variational Inequalities, Mathematical Programming, vol. 26, pp. 40-47, 1983.

[10] F. Facchinei and J. Soares, A New Merit Function for Nonlinear Complementarity Problems and a Related Algorithm, SIAM Journal on Optimization, vol. 7, pp. 225-247, 1997.

[11] F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Springer, New York, vol. I and II, 2003.

[12] A. Fischer, A Special Newton-type Optimization Method, Optimization, vol. 24, pp. 269-284, 1992.

[13] A. Fischer, Solution of the monotone complementarity problem with locally Lipschitzian functions, Mathematical Programming, vol. 76, pp. 513-532, 1997.

[14] M. Fukushima, Merit Functions for Variational Inequality and Complementarity Problems, Nonlinear Optimization and Applications, edited by G. Di Pillo and F. Giannessi, Plenum Press, New York, pp. 155-170, 1996.

[15] C. Geiger and C. Kanzow, On the Resolution of Monotone Complementarity Problems, Computational Optimization and Applications, vol. 5, pp. 155-173, 1996.

[16] P. T. Harker and J.-S. Pang, Finite Dimensional Variational Inequality and Nonlinear Complementarity Problem: A Survey of Theory, Algorithms and Applications, Mathematical Programming, vol. 48, pp. 161-220, 1990.

[17] N. J. Higham, Estimating the matrix p-norm, Numerische Mathematik, vol. 62, pp. 539-555, 1992.

[18] H. Jiang, Unconstrained Minimization Approaches to Nonlinear Complementarity Problems, Journal of Global Optimization, vol. 9, pp. 169-181, 1996.

[19] C. Kanzow, Nonlinear Complementarity as Unconstrained Optimization, Journal of Optimization Theory and Applications, vol. 88, pp. 139-155, 1996.

[20] C. Kanzow, N. Yamashita and M. Fukushima, New NCP-functions and Their Properties, Journal of Optimization Theory and Applications, vol. 94, pp. 115-135, 1997.

[21] O. L. Mangasarian, Equivalence of the Complementarity Problem to a System of Nonlinear Equations, SIAM Journal on Applied Mathematics, vol. 31, pp. 89-92, 1976.

[22] J.-S. Pang, Complementarity problems, Handbook of Global Optimization, edited by R. Horst and P. Pardalos, Kluwer Academic Publishers, Boston, Massachusetts, pp. 271-338, 1994.

[23] J.-S. Pang, Newton's Method for B-differentiable Equations, Mathematics of Operations Research, vol. 15, pp. 311-341, 1990.

[24] J.-S. Pang and D. Chan, Iterative Methods for Variational and Complementarity Problems, Mathematical Programming, vol. 27, pp. 284-313, 1982.

[25] D. Sun and L.-Q. Qi, On NCP-functions, Computational Optimization and Applications, vol. 13, pp. 201-220, 1999.

[26] P. Tseng, Growth Behavior of a Class of Merit Functions for the Nonlinear Complementarity Problem, Journal of Optimization Theory and Applications, vol. 89, pp. 17-37, 1996.

[27] N. Yamashita and M. Fukushima, On Stationary Points of the Implicit Lagrangian for the Nonlinear Complementarity Problems, Journal of Optimization Theory and Applications, vol. 84, pp. 653-663, 1995.

[28] N. Yamashita and M. Fukushima, Modified Newton Methods for Solving a Semismooth Reformulation of Monotone Complementarity Problems, Mathematical Programming, vol. 76, pp. 469-491, 1997.

[29] K. Yamada, N. Yamashita and M. Fukushima, A New Derivative-free Descent Method for the Nonlinear Complementarity Problems, in Nonlinear Optimization and Related Topics, edited by G. D. Pillo and F. Giannessi, Kluwer Academic Publishers, Netherlands, pp. 463-487, 2000.
