Master's Thesis, Department of Mathematics, National Taiwan Normal University

Advisor: Dr. Jein-Shan Chen

New smoothing functions for solving a system of equalities and inequalities

Graduate student: Yan-Di Liu

June 2014

New smoothing functions for solving a system of equalities and inequalities

By Yan-Di Liu

Advisor: Jein-Shan Chen

Department of Mathematics
National Taiwan Normal University
Taipei 11677, Taiwan

June 2014

Acknowledgments

I am grateful to my advisor, Professor Jein-Shan Chen, who spared no effort in guiding me over these two years and who led me, step by step, into a field I had never encountered as an undergraduate. In the course of writing this thesis I also learned several software applications, which I believe will be of great help in my future teaching at secondary school. I thank the two oral examination committee members, Professor Chun-Hsu Ko and Professor Yu-Lin Chang, for taking time out of their busy schedules to serve on my defense and for their many detailed suggestions along the way. I also thank my fellow members of the Buddhist practice club at NTNU, who gave me constant encouragement and support while I was busy with this thesis, and my graduate classmates who helped me whenever I ran into problems. I am truly grateful to everyone for looking after me during these two years.

Yan-Di Liu, June 2014

Contents

1. Introduction
2. Smooth reformulation
3. A smoothing-type algorithm
4. Global convergence
5. Local superlinear convergence
6. References

New smoothing functions for solving a system of equalities and inequalities

Yan-Di Liu¹
Department of Mathematics
National Taiwan Normal University
Taipei 11677, Taiwan

Abstract. In this paper, we propose a family of new smoothing functions for solving a system of equalities and inequalities. We also investigate a nonmonotone smoothing-type algorithm and show that it is globally and locally superlinearly convergent under suitable assumptions.

Key words. Smoothing function, system of equations and inequalities, convergence.

1. Introduction

The target problem of this paper is the following system of equalities and inequalities:

$$\begin{cases} f_I(x) \le 0 \\ f_E(x) = 0 \end{cases} \qquad (1)$$

where $I = \{1, 2, \cdots, m\}$ and $E = \{m+1, m+2, \cdots, n\}$. In other words, the function $f_I : \mathbb{R}^n \to \mathbb{R}^m$ is given by

$$f_I(x) = \begin{bmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_m(x) \end{bmatrix},$$

where $f_i : \mathbb{R}^n \to \mathbb{R}$ for $i \in \{1, 2, \cdots, m\}$; and the function $f_E : \mathbb{R}^n \to \mathbb{R}^{n-m}$ is given by

$$f_E(x) = \begin{bmatrix} f_{m+1}(x) \\ f_{m+2}(x) \\ \vdots \\ f_n(x) \end{bmatrix},$$

¹ E-mail: 60140026S@ntnu.edu.tw

where $f_j : \mathbb{R}^n \to \mathbb{R}$ for $j \in \{m+1, m+2, \cdots, n\}$. In this paper, for simplicity, we denote by $f : \mathbb{R}^n \to \mathbb{R}^n$ the function

$$f(x) := \begin{bmatrix} f_I(x) \\ f_E(x) \end{bmatrix} = \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \\ f_{m+1}(x) \\ \vdots \\ f_n(x) \end{bmatrix}$$

and assume that $f$ is continuously differentiable. When $E$ is the empty set, problem (1) reduces to a system of inequalities, whereas it reduces to a system of equations when $I$ is empty. Problems of the form (1) arise in real applications, including data analysis, computer-aided design problems, image reconstruction, and set separation problems. Many optimization methods have been proposed for solving the system (1), for instance, the noninterior continuation method [10], smoothing-type algorithms [4, 9], the Newton algorithm [12], and iteration methods [2, 5, 6, 8]. In this paper, we consider a smoothing-type algorithm similar to those studied in [4, 9] for solving the system (1). In particular, we propose a family of smoothing functions and investigate its properties.

2. Smooth reformulation

The main idea of smoothing-type algorithms for solving the system (1) is to reformulate (1) as a system of smooth equations via the projection function; see [4, 9]. More specifically, for any $x = (x_1, x_2, \cdots, x_n) \in \mathbb{R}^n$, we define

$$(x)_+ := \begin{bmatrix} \max\{0, x_1\} \\ \vdots \\ \max\{0, x_n\} \end{bmatrix}.$$

Then, the system (1) is equivalent to the following system of equations:

$$\begin{cases} (f_I(x))_+ = 0 \\ f_E(x) = 0. \end{cases} \qquad (2)$$

Since the function in the reformulation (2) is nonsmooth, classical Newton methods cannot be directly applied to solve (2).
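To make the reformulation (2) concrete, here is a minimal Python sketch that evaluates its nonsmooth residual on a toy instance; the particular $f_I$ and $f_E$ below are hypothetical choices used only for illustration, not an example from the thesis:

```python
import numpy as np

def proj_plus(x):
    # componentwise projection (x)_+ = max{0, x_i}
    return np.maximum(0.0, x)

# toy instance with n = 2, m = 1: f_I(x) = x_1^2 - 1 <= 0, f_E(x) = x_1 + x_2 - 1 = 0
fI = lambda x: np.array([x[0]**2 - 1.0])
fE = lambda x: np.array([x[0] + x[1] - 1.0])

def residual(x):
    # stacked left-hand side of (2); x solves (1) iff this vector vanishes
    return np.concatenate([proj_plus(fI(x)), fE(x)])

print(residual(np.array([0.5, 0.5])))   # [0. 0.]  -> solves (1)
print(residual(np.array([2.0, -1.0])))  # [3. 0.]  -> violates the inequality
```

The kink of $\max\{0, \cdot\}$ at the origin is exactly the nonsmoothness that the smoothing functions introduced next are designed to remove.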

To overcome this, the following smoothing function was considered (see [4, 9]):

$$\phi(\mu, t) = \begin{cases} t & \text{if } t \ge \mu, \\ \dfrac{(t+\mu)^2}{4\mu} & \text{if } -\mu < t < \mu, \\ 0 & \text{if } t \le -\mu, \end{cases} \qquad (3)$$

where $\mu > 0$. In this paper, we propose a new class of smoothing functions, which includes the one given in (3) as a special case, for solving the reformulation (2). Here is the family of smoothing functions we consider:

$$\phi_p(\mu, t) = \begin{cases} t & \text{if } t \ge \frac{\mu}{p-1}, \\ \dfrac{\mu}{p-1}\left[\dfrac{(p-1)(t+\mu)}{p\mu}\right]^p & \text{if } -\mu < t < \frac{\mu}{p-1}, \\ 0 & \text{if } t \le -\mu, \end{cases} \qquad (4)$$

where $\mu > 0$ and $p \ge 2$. Note that $\phi_p$ reduces to the smoothing function studied in [9] when $p = 2$. The graphs of $\phi_p$ for different values of $p$ and various $\mu$ are depicted in Figures 1-3.

Proposition 2.1. Let $\phi_p$ be defined as in (4). For any $(\mu, t) \in \mathbb{R}_{++} \times \mathbb{R}$, we have:

(a) $\phi_p(\cdot, \cdot)$ is continuously differentiable at any $(\mu, t) \in \mathbb{R}_{++} \times \mathbb{R}$;

(b) $\phi_p(0, t) = (t)_+$;

(c) $\frac{\partial \phi_p(\mu, t)}{\partial t} \ge 0$ for any $(\mu, t) \in \mathbb{R}_{++} \times \mathbb{R}$;

(d) $\lim_{p\to\infty} \phi_p(\mu, t) = (t)_+$.

Proof. (a) First, we calculate $\frac{\partial \phi_p(\mu,t)}{\partial t}$ and $\frac{\partial \phi_p(\mu,t)}{\partial \mu}$ as below:

$$\frac{\partial \phi_p(\mu, t)}{\partial t} = \begin{cases} 1 & \text{if } t \ge \frac{\mu}{p-1}, \\ \left[\dfrac{(p-1)(t+\mu)}{p\mu}\right]^{p-1} & \text{if } -\mu < t < \frac{\mu}{p-1}, \\ 0 & \text{if } t \le -\mu, \end{cases}$$

$$\frac{\partial \phi_p(\mu, t)}{\partial \mu} = \begin{cases} 0 & \text{if } t \ge \frac{\mu}{p-1}, \\ \left[\dfrac{(p-1)(t+\mu)}{p\mu}\right]^{p-1} \dfrac{t+\mu-pt}{p\mu} & \text{if } -\mu < t < \frac{\mu}{p-1}, \\ 0 & \text{if } t \le -\mu. \end{cases}$$

Then, $\frac{\partial \phi_p(\mu,t)}{\partial t}$ is continuous because

$$\lim_{t \to \frac{\mu}{p-1}} \frac{\partial \phi_p(\mu, t)}{\partial t} = \left[\frac{(p-1)\left(\frac{\mu}{p-1}+\mu\right)}{p\mu}\right]^{p-1} = 1, \qquad \lim_{t \to -\mu} \frac{\partial \phi_p(\mu, t)}{\partial t} = \left[\frac{(p-1)(-\mu+\mu)}{p\mu}\right]^{p-1} = 0,$$

and $\frac{\partial \phi_p(\mu,t)}{\partial \mu}$ is continuous since

$$\lim_{t \to \frac{\mu}{p-1}} \frac{\partial \phi_p(\mu, t)}{\partial \mu} = \left[\frac{(p-1)\left(\frac{\mu}{p-1}+\mu\right)}{p\mu}\right]^{p-1} \cdot \frac{\frac{\mu}{p-1}+\mu-\frac{p\mu}{p-1}}{p\mu} = 0, \qquad \lim_{t \to -\mu} \frac{\partial \phi_p(\mu, t)}{\partial \mu} = \left[\frac{(p-1)(-\mu+\mu)}{p\mu}\right]^{p-1} \cdot \frac{-\mu+\mu+p\mu}{p\mu} = 0.$$

The above says that $\phi_p(\cdot, \cdot)$ is continuously differentiable.

(b) From the definition of $\phi_p(\mu, t)$, it is clear that

$$\phi_p(0, t) = \begin{cases} t & \text{if } t \ge 0, \\ 0 & \text{if } t \le 0, \end{cases} \;=\; (t)_+,$$

which is the desired result.

(c) When $-\mu < t < \frac{\mu}{p-1}$, we have $t + \mu > 0$. Hence, from the expression of $\frac{\partial \phi_p(\mu,t)}{\partial t}$, it is obvious that $\left[\frac{(p-1)(t+\mu)}{p\mu}\right]^{p-1} \ge 0$, which says $\frac{\partial \phi_p(\mu,t)}{\partial t} \ge 0$; the other two branches are immediate.

(d) This is clear from the definition: as $p \to \infty$, the middle interval $\left(-\mu, \frac{\mu}{p-1}\right)$ shrinks and the middle value tends to $0$, so $\phi_p(\mu, t) \to (t)_+$. □
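The derivative formulas in part (a) can also be spot-checked symbolically. A minimal sketch, assuming sympy is available (this check is ours, not part of the original development):

```python
import sympy as sp

mu, t, p = sp.symbols('mu t p', positive=True)
A = (p - 1)*(t + mu)/(p*mu)
phi_mid = mu/(p - 1) * A**p        # middle branch of (4), valid for -mu < t < mu/(p-1)

# partial derivatives claimed in the proof of Proposition 2.1(a)
dt_claim = A**(p - 1)
dmu_claim = A**(p - 1) * (t + mu - p*t)/(p*mu)

for pv in [2, 3, 5, 10]:           # spot-check several admissible exponents
    assert sp.simplify((sp.diff(phi_mid, t) - dt_claim).subs(p, pv)) == 0
    assert sp.simplify((sp.diff(phi_mid, mu) - dmu_claim).subs(p, pv)) == 0
print("derivative formulas of Proposition 2.1(a) confirmed for p = 2, 3, 5, 10")
```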

The properties of $\phi_p$ in Proposition 2.1 can be seen via the graphs. In particular, Figures 1-2 show that $\phi_p(\mu, t)$ goes to $(t)_+$ as $\mu \to 0$, which verifies Proposition 2.1(b); Figure 3 shows that $\phi_p(\mu, t)$ approaches $(t)_+$ as $p \to \infty$, which verifies Proposition 2.1(d).

Next, we form another reformulation of problem (1). To this end, we define

$$F(z) := \begin{bmatrix} f_I(x) - s \\ f_E(x) \\ \Phi_p(\mu, s) \end{bmatrix} \quad \text{with} \quad \Phi_p(\mu, s) := \begin{bmatrix} \phi_p(\mu, s_1) \\ \vdots \\ \phi_p(\mu, s_m) \end{bmatrix} \quad \text{and} \quad z := (\mu, x, s). \qquad (5)$$

Then, by Proposition 2.1(b), we have

$$F(z) = 0 \text{ and } \mu = 0 \iff s = f_I(x), \; (s)_+ = 0, \; f_E(x) = 0,$$

that is, $x$ solves (1). This, together with Proposition 2.1(a), indicates that one can solve (1) by applying Newton-type methods to $F(z) = 0$ while letting $\mu \downarrow 0$. Furthermore, we define a function $H : \mathbb{R}^{1+n+m} \to \mathbb{R}^{1+n+m}$ by

$$H(z) := \begin{bmatrix} \mu \\ f_I(x) - s + \mu x_I \\ f_E(x) + \mu x_E \\ \Phi_p(\mu, s) + \mu s \end{bmatrix} \qquad (6)$$

Figure 1: Graphs of $\phi_p(\mu, t)$ with $p = 2$ and $\mu = 0.1, 0.5, 1, 2$.

Figure 2: Graphs of $\phi_p(\mu, t)$ with $p = 10$ and $\mu = 0.1, 0.5, 1, 2$.
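Figures 1-3 can be reproduced with a short script along the following lines (a sketch, assuming numpy and matplotlib are available); it also checks numerically that $\phi_2$ coincides with (3) and that $\phi_p(\mu, t) \to (t)_+$ as $\mu \downarrow 0$:

```python
import numpy as np
import matplotlib.pyplot as plt

def phi_p(mu, t, p):
    # smoothing function (4): piecewise in t, for mu > 0 and p >= 2
    mid = mu/(p - 1.0) * ((p - 1.0)*(t + mu)/(p*mu))**p
    return np.where(t >= mu/(p - 1.0), t, np.where(t <= -mu, 0.0, mid))

t = np.linspace(-2.5, 2.5, 501)

# phi_2 agrees with the smoothing function (3) (here with mu = 1)
phi3 = np.where(t >= 1.0, t, np.where(t <= -1.0, 0.0, (t + 1.0)**2/4.0))
assert np.allclose(phi_p(1.0, t, 2), phi3)

# phi_p(mu, t) -> (t)_+ as mu -> 0, cf. Proposition 2.1(b)
print(np.max(np.abs(phi_p(1e-8, t, 3) - np.maximum(t, 0.0))))   # ~ 0

for mu in [0.1, 0.5, 1.0, 2.0]:     # reproduces Figure 1
    plt.plot(t, phi_p(mu, t, 2), label=f"p=2, mu={mu}")
plt.xlabel("t"); plt.ylabel("phi_p(mu,t)"); plt.legend(); plt.show()
```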

Figure 3: Graphs of $\phi_p(\mu, t)$ with $p = 2, 3, 10, 20$ and $\mu = 0.2$.

In (6), $x_I = (x_1, x_2, \cdots, x_m)$, $x_E = (x_{m+1}, x_{m+2}, \cdots, x_n)$, $s \in \mathbb{R}^m$, $x := (x_I, x_E) \in \mathbb{R}^n$, and the functions $\phi_p$ and $\Phi_p$ are defined as in (4) and (5), respectively. Thereby, it is obvious that if $H(z) = 0$, then $\mu = 0$ and $x$ solves the system (1). It is not difficult to see that, for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, the function $H$ is continuously differentiable. Let $H'$ denote the Jacobian of the function $H$. Then, for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$, we have

$$H'(z) = \begin{bmatrix} 1 & 0_n^T & 0_m^T \\ x_I & f_I'(x) + \mu U & -I_m \\ x_E & f_E'(x) + \mu V & 0_{(n-m)\times m} \\ s + \Phi_\mu'(\mu, s) & 0_{m\times n} & \Phi_s'(\mu, s) + \mu I_m \end{bmatrix} \qquad (7)$$

where

$$U := \begin{bmatrix} I_m & 0_{m\times(n-m)} \end{bmatrix}, \qquad V := \begin{bmatrix} 0_{(n-m)\times m} & I_{n-m} \end{bmatrix},$$

$$s + \Phi_\mu'(\mu, s) = \begin{bmatrix} s_1 + \frac{\partial}{\partial\mu}\phi_p(\mu, s_1) \\ \vdots \\ s_m + \frac{\partial}{\partial\mu}\phi_p(\mu, s_m) \end{bmatrix}, \qquad \Phi_s'(\mu, s) + \mu I_m = \begin{bmatrix} \frac{\partial}{\partial s_1}\phi_p(\mu, s_1) + \mu & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \frac{\partial}{\partial s_m}\phi_p(\mu, s_m) + \mu \end{bmatrix}.$$

Here, we use $0_l$ to denote the $l$-dimensional zero vector and $0_{l\times q}$ to denote the $l \times q$ zero matrix for any positive integers $l$ and $q$. Thus, we may apply Newton-type methods to the system of smooth equations $H(z) = 0$, keeping $\mu > 0$ at each iteration while driving $H(z) \to 0$, so that a solution of (1) can be found.
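As a sanity check on the block form (7), the following sketch (our own illustration, on the same hypothetical toy instance as before, with $p = 3$) assembles $H$ from (6) and $H'$ from (7), using the partial derivatives of $\phi_p$ computed in the proof of Proposition 2.1, and compares the result with central finite differences:

```python
import numpy as np

p, n, m = 3, 2, 1
fI = lambda x: np.array([x[0]**2 - 1.0])        # hypothetical toy f_I
fE = lambda x: np.array([x[0] + x[1] - 1.0])    # hypothetical toy f_E
fI_jac = lambda x: np.array([[2*x[0], 0.0]])    # f_I'(x)
fE_jac = lambda x: np.array([[1.0, 1.0]])       # f_E'(x)

def phi(mu, s):                                  # (4), componentwise in s
    mid = mu/(p-1) * ((p-1)*(s + mu)/(p*mu))**p
    return np.where(s >= mu/(p-1), s, np.where(s <= -mu, 0.0, mid))

def dphi_ds(mu, s):                              # from the proof of Proposition 2.1(a)
    mid = ((p-1)*(s + mu)/(p*mu))**(p-1)
    return np.where(s >= mu/(p-1), 1.0, np.where(s <= -mu, 0.0, mid))

def dphi_dmu(mu, s):
    mid = ((p-1)*(s + mu)/(p*mu))**(p-1) * (s + mu - p*s)/(p*mu)
    return np.where(s >= mu/(p-1), 0.0, np.where(s <= -mu, 0.0, mid))

def H(z):                                        # (6), z = (mu, x, s)
    mu, x, s = z[0], z[1:1+n], z[1+n:]
    return np.concatenate(([mu], fI(x) - s + mu*x[:m],
                           fE(x) + mu*x[m:], phi(mu, s) + mu*s))

def H_jac(z):                                    # block form (7)
    mu, x, s = z[0], z[1:1+n], z[1+n:]
    U = np.hstack([np.eye(m), np.zeros((m, n-m))])
    V = np.hstack([np.zeros((n-m, m)), np.eye(n-m)])
    J = np.zeros((1+n+m, 1+n+m))
    J[0, 0] = 1.0
    J[1:1+m, 0], J[1:1+m, 1:1+n], J[1:1+m, 1+n:] = x[:m], fI_jac(x) + mu*U, -np.eye(m)
    J[1+m:1+n, 0], J[1+m:1+n, 1:1+n] = x[m:], fE_jac(x) + mu*V
    J[1+n:, 0] = s + dphi_dmu(mu, s)
    J[1+n:, 1+n:] = np.diag(dphi_ds(mu, s) + mu)
    return J

z, h, N = np.array([0.7, 0.3, -0.4, 0.2]), 1e-6, 1 + n + m
num = np.column_stack([(H(z + h*np.eye(N)[j]) - H(z - h*np.eye(N)[j]))/(2*h)
                       for j in range(N)])
print(np.max(np.abs(H_jac(z) - num)))            # small: (7) matches finite differences
```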

3. A smoothing-type algorithm

In this section, we propose a smoothing-type algorithm with a nonmonotone line search and establish some basic properties; in particular, we show that the algorithm is well defined. We will use the merit function

$$\psi(z) := \|H(z)\|^2.$$

The scheme is similar to those in [4, 9]. Below are the details of the algorithm.

Algorithm 3.1 (A Nonmonotone Smoothing-Type Algorithm)

Step 0. Choose $\delta \in (0, 1)$, $\sigma \in (0, 1/2)$, and $\beta > 0$. Take $\tau \in (0, 1)$ such that $\tau\beta < 1$. Let $\mu_0 = \beta$ and let $(x^0, s^0) \in \mathbb{R}^{n+m}$ be an arbitrary vector. Set $z^0 := (\mu_0, x^0, s^0)$. Take $e^0 := (1, 0, \cdots, 0) \in \mathbb{R}^{1+n+m}$, $R_0 := \|H(z^0)\|^2 = \psi(z^0)$, and $Q_0 = 1$. Choose $\eta_{\min}$ and $\eta_{\max}$ such that $0 \le \eta_{\min} \le \eta_{\max} < 1$. Set $\theta(z^0) := \tau\min\{1, \psi(z^0)\}$ and $k := 0$.

Step 1. If $\|H(z^k)\| = 0$, stop.

Step 2. Compute $\Delta z^k := (\Delta\mu_k, \Delta x^k, \Delta s^k) \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m$ by solving

$$H'(z^k)\,\Delta z^k = -H(z^k) + \beta\theta(z^k)e^0. \qquad (8)$$

Step 3. Let $\alpha_k$ be the maximum of the values $1, \delta, \delta^2, \cdots$ such that

$$\psi(z^k + \alpha_k\Delta z^k) \le \left[1 - 2\sigma(1-\tau\beta)\alpha_k\right] R_k. \qquad (9)$$

Step 4. Set $z^{k+1} := z^k + \alpha_k\Delta z^k$. If $\|H(z^{k+1})\| = 0$, stop.

Step 5. Choose $\eta_k \in [\eta_{\min}, \eta_{\max}]$. Set

$$Q_{k+1} := \eta_k Q_k + 1, \qquad \theta(z^{k+1}) := \min\left\{\tau, \ \tau\psi(z^{k+1}), \ \theta(z^k)\right\}, \qquad R_{k+1} := \frac{\eta_k Q_k R_k + \psi(z^{k+1})}{Q_{k+1}}, \qquad (10)$$

set $k := k + 1$, and go to Step 2.

In Algorithm 3.1, a nonmonotone line search technique is adopted. It is easy to see that $R_{k+1}$ is a convex combination of $R_k$ and $\psi(z^{k+1})$. Since $R_0 = \psi(z^0)$, it follows that $R_k$ is a convex combination of the function values $\psi(z^0), \psi(z^1), \cdots, \psi(z^k)$. The choice of $\eta_k$ controls the degree of nonmonotonicity: if $\eta_k = 0$ for every $k$, then the line search reduces to the usual monotone Armijo line search. For convenience, we denote

$$f'(x) := \begin{bmatrix} f_I'(x) \\ f_E'(x) \end{bmatrix}$$

and make the following assumption.

Assumption 3.1. $f'(x) + \mu I_n$ is invertible for any $x \in \mathbb{R}^n$ and any $\mu \in \mathbb{R}_{++}$.
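To fix ideas, the following self-contained Python sketch runs Algorithm 3.1 on a small hypothetical instance. The toy $f_I$ and $f_E$, the parameter values, the choice $p = 3$, the finite-difference stand-in for $H'$ (the exact block form is (7)), and the monotone choice $\eta_k = 0$ are all our own illustrative assumptions:

```python
import numpy as np

p, n, m = 3, 2, 1                                  # exponent in (4) and toy dimensions
fI = lambda x: np.array([x[0]**2 - 1.0])           # hypothetical f_I(x) <= 0
fE = lambda x: np.array([x[0] + x[1] - 1.0])       # hypothetical f_E(x) = 0

def phi(mu, s):
    # smoothing function (4), applied componentwise to s
    mid = mu/(p-1) * ((p-1)*(s + mu)/(p*mu))**p
    return np.where(s >= mu/(p-1), s, np.where(s <= -mu, 0.0, mid))

def H(z):
    # the map (6), z = (mu, x, s)
    mu, x, s = z[0], z[1:1+n], z[1+n:]
    return np.concatenate(([mu], fI(x) - s + mu*x[:m],
                           fE(x) + mu*x[m:], phi(mu, s) + mu*s))

def H_jac(z, h=1e-7):
    # forward-difference stand-in for H'(z); (7) gives the exact form
    Hz, N = H(z), 1 + n + m
    return np.column_stack([(H(z + h*np.eye(N)[j]) - Hz)/h for j in range(N)])

psi = lambda z: H(z) @ H(z)                        # psi(z) = ||H(z)||^2

# Step 0 (eta_k = 0, the monotone Armijo case, so R_k = psi(z^k))
delta, sigma, beta, tau = 0.5, 0.25, 1.0, 0.5      # tau*beta < 1
z = np.concatenate(([beta], np.zeros(n + m)))      # z^0 = (mu_0, x^0, s^0)
e0 = np.eye(1 + n + m)[0]
theta, R = tau*min(1.0, psi(z)), psi(z)

for k in range(100):
    if np.linalg.norm(H(z)) < 1e-10:               # Steps 1/4: termination
        break
    dz = np.linalg.solve(H_jac(z), -H(z) + beta*theta*e0)        # Step 2, (8)
    alpha = 1.0
    while psi(z + alpha*dz) > (1 - 2*sigma*(1 - tau*beta)*alpha)*R:
        alpha *= delta                             # Step 3, line search (9)
    z = z + alpha*dz                               # Step 4
    theta = min(tau, tau*psi(z), theta)            # Step 5, (10)
    R = psi(z)

print("mu =", z[0], " x =", z[1:1+n], " s =", z[1+n:])
```

On this toy instance, $\mu_k$ is driven down to zero and $x^k$ approaches a point satisfying $x_1^2 \le 1$ and $x_1 + x_2 = 1$; the nonmonotone variant with $\eta_k > 0$ only changes the bookkeeping of $Q_k$ and $R_k$ in Step 5.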

Some basic results involving Algorithm 3.1 are collected in the following lemma. Here and below, $J := \{0, 1, 2, \cdots\}$ denotes the index set of the iterations.

Lemma 3.1. Let the sequences $\{R_k\}$ and $\{z^k\}$ be generated by Algorithm 3.1. Then the following hold.

(a) The sequence $\{R_k\}$ is monotonically decreasing.

(b) $\psi(z^k) \le R_k$ for all $k \in J$.

(c) The sequence $\{\theta(z^k)\}$ is monotonically decreasing.

(d) $\beta\theta(z^k) \le \mu_k$ for all $k \in J$.

(e) $\mu_k > 0$ for all $k \in J$, and the sequence $\{\mu_k\}$ is monotonically decreasing.

Proof. (a) From (9) and the definition of $R_{k+1}$ in (10), it follows that for any $k \in J$,

$$R_{k+1} = \frac{\eta_k Q_k R_k + \psi(z^{k+1})}{Q_{k+1}} \le \frac{\eta_k Q_k R_k + \left[1 - 2\sigma(1-\tau\beta)\alpha_k\right]R_k}{Q_{k+1}} = R_k - \frac{2\sigma(1-\tau\beta)\alpha_k}{Q_{k+1}}R_k \le R_k, \qquad (11)$$

which implies that $\{R_k\}$ is monotonically decreasing.

(b) This can be obtained by induction. First, by the choice of $R_0$, the result holds when $k = 0$. Next, assuming the result holds when $k = \ell$, we show that it holds when $k = \ell + 1$. Note that

$$\psi(z^{\ell+1}) = Q_{\ell+1}R_{\ell+1} - \eta_\ell Q_\ell R_\ell \le Q_{\ell+1}R_{\ell+1} - \eta_\ell Q_\ell R_{\ell+1} = (Q_{\ell+1} - \eta_\ell Q_\ell)R_{\ell+1} = R_{\ell+1},$$

where the first equality follows from the definition of $R_{\ell+1}$ in (10), the inequality from part (a), and the last equality from $Q_{\ell+1} = \eta_\ell Q_\ell + 1$.

(c) This is obvious from (10).

(d) Since $\theta(z^0) = \tau\min\{1, \psi(z^0)\} \le \tau < 1$, we have $\beta\theta(z^0) \le \beta = \mu_0$. Next, assume that $\beta\theta(z^\ell) \le \mu_\ell$ for some index $\ell \in J$. Then

$$\mu_{\ell+1} - \beta\theta(z^{\ell+1}) = \mu_\ell + \alpha_\ell\Delta\mu_\ell - \beta\theta(z^{\ell+1}) = (1-\alpha_\ell)\mu_\ell + \alpha_\ell\beta\theta(z^\ell) - \beta\theta(z^{\ell+1}) \ge (1-\alpha_\ell)\beta\theta(z^\ell) + \alpha_\ell\beta\theta(z^\ell) - \beta\theta(z^{\ell+1}) = \beta\theta(z^\ell) - \beta\theta(z^{\ell+1}) \ge 0,$$

where the second equality follows from the first equation of (8), namely $\Delta\mu_\ell = -\mu_\ell + \beta\theta(z^\ell)$; the first inequality holds since $\beta\theta(z^\ell) \le \mu_\ell$; and the second inequality follows from part (c). Thus, by induction we obtain the desired result.

(e) From the first equation of (8) it follows that

$$\mu_{\ell+1} = \mu_\ell + \alpha_\ell\Delta\mu_\ell = (1-\alpha_\ell)\mu_\ell + \alpha_\ell\beta\theta(z^\ell) > 0, \qquad (12)$$

since $\mu_\ell > 0$ (by induction, starting from $\mu_0 = \beta > 0$), $\theta(z^\ell) > 0$ (otherwise $\psi(z^\ell) = 0$ and the algorithm stops), and $\alpha_\ell \in (0, 1]$. This shows $\mu_k > 0$ for all $k \in J$. Combining parts (d) and (e), we have that for any $\ell \in J$,

$$\mu_{\ell+1} = (1-\alpha_\ell)\mu_\ell + \alpha_\ell\beta\theta(z^\ell) \le (1-\alpha_\ell)\mu_\ell + \alpha_\ell\mu_\ell = \mu_\ell,$$

which implies that the sequence $\{\mu_k\}$ is monotonically decreasing. □

Lemma 3.2. Suppose $A \in \mathbb{R}^{n\times n}$ is partitioned as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$$

where $A_{11}$ and $A_{22}$ are square matrices. If $A_{12}$ or $A_{21}$ is a zero matrix, then $\det(A) = \det(A_{11}) \cdot \det(A_{22})$.

Proof. This is a well-known result in matrix analysis and a special case of Fischer's inequality [1, 3]; please refer to [7, Theorem 7.3] for a proof. □
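A quick numerical illustration of Lemma 3.2 (our own check, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
A11 = rng.standard_normal((2, 2))
A22 = rng.standard_normal((3, 3))
A12 = rng.standard_normal((2, 3))

# block upper-triangular matrix: A21 = 0
A = np.block([[A11, A12], [np.zeros((3, 2)), A22]])
print(np.isclose(np.linalg.det(A), np.linalg.det(A11) * np.linalg.det(A22)))  # True
```

In the proof of Theorem 3.1 below, this lemma is applied to the block structure of (7), whose lower-left block is $0_{m\times n}$.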

Theorem 3.1. Suppose that $f$ is a continuously differentiable function and Assumption 3.1 is satisfied. Then Algorithm 3.1 is well defined.

Proof. First, we show that the line search (9) is well defined. Let

$$L_k(\alpha) := \psi(z^k + \alpha\Delta z^k) - \psi(z^k) - \alpha\psi'(z^k)\Delta z^k.$$

Then, in light of (8), we have

$$\begin{aligned} \psi(z^k + \alpha\Delta z^k) &= L_k(\alpha) + \psi(z^k) + \alpha\psi'(z^k)\Delta z^k \\ &= L_k(\alpha) + \psi(z^k) + 2\alpha H(z^k)^T H'(z^k)\Delta z^k \\ &= L_k(\alpha) + \psi(z^k) + 2\alpha H(z^k)^T\left[-H(z^k) + \beta\theta(z^k)e^0\right] \\ &\le L_k(\alpha) + (1-2\alpha)\psi(z^k) + 2\alpha\beta\theta(z^k)\|H(z^k)\|. \end{aligned}$$

On one hand, if $\psi(z^k) \le 1$, then $\|H(z^k)\| \le 1$ since $\psi(z^k) = \|H(z^k)\|^2$, and the definition of $\theta$ gives $\theta(z^k) \le \tau\psi(z^k)$, so

$$\theta(z^k)\|H(z^k)\| \le \tau\psi(z^k)\|H(z^k)\| \le \tau\psi(z^k).$$

On the other hand, if $\psi(z^k) > 1$, then $\psi(z^k) = \|H(z^k)\|^2 \ge \|H(z^k)\|$, and $\theta(z^k) \le \tau$ yields

$$\theta(z^k)\|H(z^k)\| \le \tau\|H(z^k)\| \le \tau\psi(z^k).$$

Thus, we obtain

$$\psi(z^k + \alpha\Delta z^k) \le L_k(\alpha) + (1-2\alpha)\psi(z^k) + 2\alpha\beta\tau\psi(z^k) = L_k(\alpha) + \left[1 - 2(1-\tau\beta)\alpha\right]\psi(z^k) \le L_k(\alpha) + \left[1 - 2(1-\tau\beta)\alpha\right]R_k. \qquad (13)$$

Since the function $H$ is continuously differentiable at any $z \in \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^m$ with $\mu > 0$, it follows from (12) and Lemma 3.1(e) that $L_k(\alpha) = o(\alpha)$ for all $k \in J$. Since $\sigma < 1/2$ implies $2\sigma(1-\tau\beta) < 2(1-\tau\beta)$, the desired result follows from (13) and $\psi(z^k) \le R_k$ for all $k \in J$.

Secondly, we show that Step 2 is well defined. For any square matrix $A$, we use $\det(A)$ to denote the determinant of $A$. It is easy to see from (7) and Lemma 3.2 that

$$\det(H'(z)) = \det\left(f'(x) + \mu I_n\right) \cdot \det\left(\Phi_s'(\mu, s) + \mu I_m\right)$$

for any $z \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m$. Furthermore, we know from Proposition 2.1(c) that $\Phi_s'(\mu, s)$ is positive semidefinite, so $\Phi_s'(\mu, s) + \mu I_m$ is nonsingular for $\mu > 0$. Thus, by Assumption 3.1, $H'(z)$ is nonsingular for any $z \in \mathbb{R}^{1+n+m}$ with $\mu > 0$. This, together with Lemma 3.1(e), implies that the system of equations (8) is solvable, i.e., Step 2 is well defined.

From all the above, we conclude that Algorithm 3.1 is well defined. □

4. Global convergence

The following assumption was introduced in [4].

Assumption 4.1. For any sequence $\{(\mu_k, x^k)\}$ with $\lim_{k\to\infty}\|x^k\| = +\infty$ and with $\{\mu_k\} \subset \mathbb{R}_+$ bounded, either

(i) there is at least one index $i_0$ such that $\limsup_{k\to\infty}\left\{f_{i_0}(x^k) + \mu_k x^k_{i_0}\right\} = +\infty$; or

(ii) there is at least one index $i_0$ such that $\limsup_{k\to\infty}\left\{\mu_k\left(f_{i_0}(x^k) + \mu_k x^k_{i_0}\right)\right\} = -\infty$.

It can be seen that many functions satisfy Assumption 4.1. The global convergence of Algorithm 3.1 is stated as follows.

Theorem 4.1. Suppose that $f$ is a continuously differentiable function and Assumptions 3.1 and 4.1 are satisfied. Then the infinite sequence $\{z^k\}$ generated by Algorithm 3.1 is bounded, and any accumulation point of $\{x^k\}$ is a solution of (1).

Proof. We divide the proof into two parts.

Part 1. We show that the sequence $\{z^k\}$ is bounded. By Lemma 3.1(e), the sequence $\{\mu_k\}$ is bounded, and hence we only need to show that $\{(x^k, s^k)\}$ is bounded. In the following, assuming that $\{x^k\}$ is unbounded, we derive a contradiction. By (6) and the definition of $\psi$, it follows that

$$\psi(z^k) = \mu_k^2 + \|f_I(x^k) - s^k + \mu_k x_I^k\|^2 + \|f_E(x^k) + \mu_k x_E^k\|^2 + \|\Phi_p(\mu_k, s^k) + \mu_k s^k\|^2. \qquad (14)$$

Since the sequence $\{R_k\}$ is monotonically decreasing and nonnegative, it is bounded. Then, by Lemma 3.1(b), $\{\psi(z^k)\}$ is bounded. Thus, from (14) we obtain that the sequences

$$\{f_I(x^k) - s^k + \mu_k x_I^k\}, \qquad \{f_E(x^k) + \mu_k x_E^k\}, \qquad \{\Phi_p(\mu_k, s^k) + \mu_k s^k\} \qquad (15)$$

are bounded. For any $k \in J$, let $h(z^k) := s^k - f_I(x^k) - \mu_k x_I^k$; then $\{h(z^k)\}$ is bounded and

$$s^k = h(z^k) + f_I(x^k) + \mu_k x_I^k. \qquad (16)$$

Since $\{f_E(x^k) + \mu_k x_E^k\}$ is bounded by (15), it follows from Assumption 4.1 that either

(i) there is at least one index $i_0$ such that $\limsup_{k\to\infty}\{f_{i_0}(x^k) + \mu_k x^k_{i_0}\} = +\infty$; or

(ii) there is at least one index $i_0$ such that $\limsup_{k\to\infty}\{\mu_k(f_{i_0}(x^k) + \mu_k x^k_{i_0})\} = -\infty$.

We consider these two cases separately.

• If case (i) holds, then by (16) we have

$$\limsup_{k\to\infty} s^k_{i_0} = \limsup_{k\to\infty}\left\{h_{i_0}(z^k) + f_{i_0}(x^k) + \mu_k x^k_{i_0}\right\} = +\infty.$$

Furthermore, by the definitions of $\phi_p$ and $\Phi_p$ (note that $\phi_p(\mu_k, s^k_{i_0}) = s^k_{i_0}$ once $s^k_{i_0} \ge \frac{\mu_k}{p-1}$), we have

$$\limsup_{k\to\infty}\left\{\Phi_{i_0}(\mu_k, s^k) + \mu_k s^k_{i_0}\right\} = \limsup_{k\to\infty}\left\{s^k_{i_0} + \mu_k s^k_{i_0}\right\} = +\infty,$$

which indicates that $\{\Phi_p(\mu_k, s^k) + \mu_k s^k\}$ is unbounded. This contradicts (15).

• If case (ii) holds, then $\limsup_{k\to\infty}\{f_{i_0}(x^k) + \mu_k x^k_{i_0}\} = -\infty$ since $\{\mu_k\}$ is a nonnegative and bounded sequence. Thus, by (16), $\limsup_{k\to\infty} s^k_{i_0} = -\infty$.

Furthermore (note that $\phi_p(\mu_k, s^k_{i_0}) = 0$ once $s^k_{i_0} \le -\mu_k$),

$$\limsup_{k\to\infty}\left\{\Phi_{i_0}(\mu_k, s^k) + \mu_k s^k_{i_0}\right\} = \limsup_{k\to\infty}\left\{\mu_k s^k_{i_0}\right\} = -\infty,$$

which implies that $\{\Phi_p(\mu_k, s^k) + \mu_k s^k\}$ is unbounded. This contradicts (15). Therefore, the sequence $\{x^k\}$ is bounded. Since the sequences $\{\mu_k\}$, $\{x^k\}$, and $\{h(z^k)\}$ are bounded and the function $f$ is continuous, by (16) we further obtain that the sequence $\{s^k\}$ is bounded. Therefore, the sequence $\{z^k\}$ is bounded.

Part 2. We prove that any accumulation point of $\{x^k\}$ generated by Algorithm 3.1 is a solution of (1). By Lemma 3.1(a), the sequence $\{R_k\}$ is nonnegative and monotonically decreasing, and hence convergent. From Lemma 3.1(a) and (b), we have

$$0 \le \|H(z^k)\|^2 = \psi(z^k) \le R_k \le R_{k-1} \le \cdots \le R_0, \qquad (17)$$

so both sequences $\{\psi(z^k)\}$ and $\{\|H(z^k)\|\}$ are bounded. In addition, by Part 1, the sequence $\{z^k\}$ is bounded, and hence it has at least one convergent subsequence, which we denote by $\{z^k\}_{k\in\bar{J}}$ with $\bar{J} \subseteq J$. Thus, there exists a point $z^* = (\mu^*, x^*, s^*)$ such that $\lim_{\bar{J}\ni k\to\infty} z^k = z^*$, and by continuity of $H$, $\lim_{\bar{J}\ni k\to\infty}\|H(z^k)\| = \|H(z^*)\|$. Define $R^* := \lim_{k\to\infty} R_k$. If $R^* = 0$, then $\|H(z^*)\| = 0$, and hence $x^*$ is a solution of (1). In the following, we assume that $R^* > 0$ and $\|H(z^*)\| > 0$, and derive a contradiction.

• Suppose that $\alpha_k \ge \alpha^* > 0$ for all $k \in J$, where $\alpha^*$ is a fixed constant. From (11) it follows that

$$R_{k+1} \le R_k - \frac{2\sigma(1-\tau\beta)\alpha^*}{Q_{k+1}}R_k$$

for any $k \in J$. Since $\{R_k\}$ is nonnegative and decreasing, the telescoping sum $\sum_{k=0}^{\infty}(R_k - R_{k+1})$ is finite, and hence

$$\sum_{k=0}^{\infty}\frac{2\sigma(1-\tau\beta)\alpha^*}{Q_{k+1}}R_k \le \sum_{k=0}^{\infty}(R_k - R_{k+1}) < \infty. \qquad (18)$$

On the other hand, by the definition of $Q_k$ and the fact that $\eta_{\max} \in [0, 1)$, we have, by summing the geometric series,

$$Q_{k+1} = \eta_k Q_k + 1 = 1 + \eta_k + \eta_k\eta_{k-1} + \cdots + \eta_k\eta_{k-1}\cdots\eta_0 \le 1 + \eta_{\max} + \eta_{\max}^2 + \cdots + \eta_{\max}^{k+1} \le \frac{1}{1-\eta_{\max}}.$$

Hence $\frac{1}{Q_{k+1}} \ge 1 - \eta_{\max}$, and therefore

$$\sum_{k=0}^{\infty}\frac{2\sigma(1-\tau\beta)\alpha^*}{Q_{k+1}}R_k \ge \sum_{k=0}^{\infty} C R_k, \qquad C := 2\sigma(1-\tau\beta)\alpha^*(1-\eta_{\max}) > 0.$$

By (18), $\sum_{k=0}^{\infty} C R_k < \infty$, so $\lim_{k\to\infty} R_k = 0$, which contradicts $R^* > 0$.

• Suppose that $\lim_{\bar{J}\ni k\to\infty}\alpha_k = 0$. Then the step size $\hat{\alpha}_k := \alpha_k/\delta$ does not satisfy the line search criterion (9) for any sufficiently large $k \in \bar{J}$, i.e.,

$$\|H(z^k + \hat{\alpha}_k\Delta z^k)\|^2 = \psi(z^k + \hat{\alpha}_k\Delta z^k) > \left[1 - 2\sigma(1-\tau\beta)\hat{\alpha}_k\right]R_k \qquad (19)$$

holds for any sufficiently large $k \in \bar{J}$. It is easy to see that the following results hold.

(a) $\mu^* > 0$. This can be obtained from Lemma 3.1(e).

(b) $\{\Delta z^k\}_{k\in\bar{J}}$ is convergent. In fact, since $\mu^* > 0$, the matrix $H'(z^k)$ is invertible for all sufficiently large $k \in \bar{J}$, so by (8) we obtain the desired result. Define $\Delta z^* := \lim_{\bar{J}\ni k\to\infty}\Delta z^k$.

(c) $R^* = \psi(z^*)$. In fact, from (17) it follows that $R^* \ge \psi(z^*)$, and letting $\bar{J}\ni k \to \infty$ in (19), with $\hat{\alpha}_k \to 0$, gives $\psi(z^*) \ge R^*$.

By (19) and $R_k \ge \psi(z^k)$, we have

$$\psi(z^k + \hat{\alpha}_k\Delta z^k) > \left[1 - 2\sigma(1-\tau\beta)\hat{\alpha}_k\right]R_k \ge \psi(z^k) - 2\sigma(1-\tau\beta)\hat{\alpha}_k R_k,$$

so that

$$\frac{\psi(z^k + \hat{\alpha}_k\Delta z^k) - \psi(z^k)}{\hat{\alpha}_k} > -2\sigma(1-\tau\beta)R_k. \qquad (20)$$

By result (a), $\psi(\cdot) = \|H(\cdot)\|^2$ is continuously differentiable near $z^*$. Hence, taking the limit in (20) over $k \in \bar{J}$ and using result (b), we have

$$\psi'(z^*)\Delta z^* = 2H(z^*)^T H'(z^*)\Delta z^* \ge -2\sigma(1-\tau\beta)R^*. \qquad (21)$$

In addition, by (8) and results (b) and (c), we have

$$2H(z^*)^T H'(z^*)\Delta z^* = 2H(z^*)^T\left(-H(z^*) + \beta\theta(z^*)e^0\right) \le -2\psi(z^*) + 2\beta\theta(z^*)\|H(z^*)\| = -2R^* + 2\beta\theta(z^*)\|H(z^*)\|.$$

By the definition of $\theta(\cdot)$ and the same argument as in the proof of Theorem 3.1 (with $z^k$ replaced by $z^*$), we obtain: if $\psi(z^*) \ge 1$, then $\theta(z^*)\|H(z^*)\| \le \tau\|H(z^*)\| \le \tau\psi(z^*) \le \tau R^*$; if $\psi(z^*) < 1$, then $\theta(z^*)\|H(z^*)\| \le \tau\psi(z^*)\|H(z^*)\| \le \tau\psi(z^*) \le \tau R^*$. Thus, combining with (21), we get

$$-2\sigma(1-\tau\beta)R^* \le 2H(z^*)^T H'(z^*)\Delta z^* \le -2R^* + 2\beta\tau R^* = -2(1-\tau\beta)R^*. \qquad (22)$$

Furthermore, from (22) and $R^* > 0$, dividing by $-2(1-\tau\beta)R^* < 0$ yields $\sigma \ge 1$, which contradicts the fact that $\sigma \in (0, \frac{1}{2})$. This completes the proof. □

5. Local superlinear convergence

In this section, we analyze the rate of convergence of Algorithm 3.1. A locally Lipschitz function $F : \mathbb{R}^n \to \mathbb{R}^m$, which has the generalized Jacobian $\partial F(x)$, is said to be semismooth (respectively, strongly semismooth) at $x \in \mathbb{R}^n$ if $F$ is directionally differentiable at $x$ and

$$F(x+h) - F(x) - Vh = o(\|h\|) \quad (\text{respectively, } = O(\|h\|^2))$$

holds for any $V \in \partial F(x+h)$. It is well known that convex functions, smooth functions, and piecewise linear functions are examples of semismooth functions, and that the composition of (strongly) semismooth functions is still a (strongly) semismooth function. It is easy to show that the function $\phi_p$ defined by (4) is strongly semismooth on $\mathbb{R}^2$. Thus, since $f$ is continuously differentiable, the function $H$ defined by (6) is semismooth (strongly semismooth if $f'$ is Lipschitz continuous on $\mathbb{R}^n$).

Now, we show the local superlinear convergence of Algorithm 3.1. To this end, we present a technical lemma.

Lemma 5.1. Let $O(X)$ and $O(Y)$ denote Big-O of $X$ and $Y$, respectively, and let $o(X)$ and $o(Y)$ denote little-o of $X$ and $Y$. Then the following hold.

(a) $O(Y) \pm O(Y) = O(Y)$;

(b) $o(Y) \pm o(Y) = o(Y)$;

(c) if $c \ne 0$, then $O(cY) = O(Y)$ and $o(cY) = o(Y)$;

(d) $O(o(Y)) = o(Y)$ and $o(O(Y)) = o(Y)$;

(e) $O(X)O(Y) = O(XY)$, $O(X)o(Y) = o(XY)$, and $o(X)O(Y) = o(XY)$;

(f) if $\alpha = O(\beta_1)$ and $\beta_1 = o(\beta_2)$, then $\alpha = o(\beta_2)$.

Proof. For (a) through (e), please refer to [11] for a proof. Part (f) can be verified straightforwardly. □

Theorem 5.1. Suppose that the conditions given in Theorem 4.1 are satisfied and that $z^* = (\mu^*, x^*, s^*)$ is an accumulation point of $\{z^k\}$ generated by Algorithm 3.1. If all $V \in \partial H(z^*)$ are nonsingular, then the following hold:

(a) $\alpha_k \equiv 1$ for all $z^k$ sufficiently close to $z^*$;

(b) the whole sequence $\{z^k\}$ converges to $z^*$;

(c) $\|z^{k+1} - z^*\| = o(\|z^k - z^*\|)$ (respectively, $\|z^{k+1} - z^*\| = O(\|z^k - z^*\|^2)$ if $f'$ is Lipschitz continuous on $\mathbb{R}^n$);

(d) $\mu_{k+1} = o(\mu_k)$ (respectively, $\mu_{k+1} = O(\mu_k^2)$ if $f'$ is Lipschitz continuous on $\mathbb{R}^n$).

Proof. It holds by the proof of Theorem 4.1 that $H(z^*) = 0$, i.e., $z^*$ is a solution of $H(z) = 0$. Since all $V \in \partial H(z^*)$ are nonsingular, we have that $\|H'(z^k)^{-1}\| = O(1)$ holds for all $z^k$ sufficiently close to $z^*$. Since the function $H$ is semismooth (strongly semismooth if $f'$ is Lipschitz continuous on $\mathbb{R}^n$), it follows that for all $z^k$ sufficiently close to $z^*$,

$$\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\| = o(\|z^k - z^*\|) \quad (\text{respectively, } = O(\|z^k - z^*\|^2)).$$

Notice that the function $H$ is locally Lipschitz continuous near $z^*$. Therefore, for all $z^k$ sufficiently close to $z^*$, $\|H(z^k)\| = O(\|z^k - z^*\|)$, which implies that $\psi(z^k) = \|H(z^k)\|^2 = o(\|z^k - z^*\|)$ and hence $\theta(z^k) \le \tau\psi(z^k) = o(\|z^k - z^*\|)$. Thus, for all $z^k$ sufficiently close to $z^*$, applying Lemma 5.1 and $H(z^*) = 0$ yields

$$\begin{aligned} \|z^k + \Delta z^k - z^*\| &= \left\|z^k + H'(z^k)^{-1}\left(-H(z^k) + \beta\theta(z^k)e^0\right) - z^*\right\| \\ &\le \|H'(z^k)^{-1}\|\left(\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\| + \beta\theta(z^k)\right) \\ &\le \|H'(z^k)^{-1}\|\left(\|H(z^k) - H(z^*) - H'(z^k)(z^k - z^*)\| + \beta\tau\psi(z^k)\right) \\ &= O(1)\,o(\|z^k - z^*\|) + O(1)\,o(\|z^k - z^*\|) \\ &= o(\|z^k - z^*\|) \quad (\text{respectively, } = O(\|z^k - z^*\|^2)). \end{aligned} \qquad (23)$$

Since all $V \in \partial H(z^*)$ are nonsingular, we also have that $\|z^k - z^*\| = O(\|H(z^k) - H(z^*)\|)$ holds for all $z^k$ sufficiently close to $z^*$. Hence, for all $z^k$ sufficiently close to $z^*$,

$$\begin{aligned} \|H(z^k + \Delta z^k)\| &= O(\|z^k + \Delta z^k - z^*\|) = o(\|z^k - z^*\|) \quad (\text{respectively, } = O(\|z^k - z^*\|^2)) \\ &= o(\|H(z^k) - H(z^*)\|) \quad (\text{respectively, } = O(\|H(z^k) - H(z^*)\|^2)) \\ &= o(\|H(z^k)\|) \quad (\text{respectively, } = O(\|H(z^k)\|^2)). \end{aligned} \qquad (24)$$

By the proof of Theorem 4.1, $\lim_{k\to\infty}\|H(z^k)\| = 0$. Hence, (24) implies that $\alpha_k = 1$ holds for all $z^k$ sufficiently close to $z^*$, which proves result (a). Therefore, for all $z^k$ sufficiently close to $z^*$, we have $z^{k+1} = z^k + \Delta z^k$, which, together with (23), indicates that results (b) and (c) hold. In addition, since $\alpha_k = 1$, we have

$$\mu_{k+1} = \mu_k + \Delta\mu_k = \beta\theta(z^k) \le \beta\tau\psi(z^k)$$

for all sufficiently large $k$, so that $\mu_{k+1} = O(\psi(z^k))$. Moreover, by (24),

$$\psi(z^{k+1}) = \|H(z^{k+1})\|^2 = o(\|H(z^k)\|^2) = o(\psi(z^k)) \quad (\text{respectively, } = O(\psi(z^k)^2)).$$

By using (24) and Lemma 5.1(f),

$$\mu_{k+1} = O(\psi(z^k)) = O(o(\psi(z^{k-1}))) = o(\psi(z^{k-1})) = o(\mu_k) \quad (\text{respectively, } = O(\psi(z^{k-1})^2) = O(\mu_k^2)),$$

which is the desired result of part (d). □

References

[1] R. Bhatia, Matrix Analysis, Springer-Verlag, New York, 1997.

[2] J.W. Daniel, Newton's method for nonlinear inequalities, Numerische Mathematik, vol. 21, pp. 381-387, 1973.

[3] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1986.

[4] Z.-H. Huang, Y. Zhang, and W. Wu, A smoothing-type algorithm for solving system of inequalities, Journal of Computational and Applied Mathematics, vol. 220, pp. 355-363, 2008.

[5] D.Q. Mayne, E. Polak, and A.J. Heunis, Solving nonlinear inequalities in a finite number of iterations, Journal of Optimization Theory and Applications, vol. 33, pp. 207-221, 1981.

[6] M. Sahba, On the solution of nonlinear inequalities in a finite number of iterations, Numerische Mathematik, vol. 46, pp. 229-236, 1985.

[7] J.M. Schott, Matrix Analysis for Statistics, 2nd edition, John Wiley, New Jersey, 2005.

[8] H.-X. Ying, Z.-H. Huang, and L. Qi, The convergence of a Levenberg-Marquardt method for the $l_2$-norm solution of nonlinear inequalities, Numerical Functional Analysis and Optimization, vol. 29, pp. 687-716, 2008.

[9] Y. Zhang and Z.-H. Huang, A nonmonotone smoothing-type algorithm for solving a system of equalities and inequalities, Journal of Computational and Applied Mathematics, vol. 233, pp. 2312-2321, 2010.

[10] J. Zhu and B. Hao, A new non-interior continuation method for solving a system of equalities and inequalities, submitted manuscript, 2014.

[11] R.G. Bartle, The Elements of Real Analysis, 2nd edition, John Wiley, New York, 1976.

[12] S.-L. Hu, Z.-H. Huang, and P. Wang, A nonmonotone smoothing Newton algorithm for solving nonlinear complementarity problems, Optimization Methods and Software, vol. 24, pp. 447-460, 2009.

