A Dynamical Systems Approach to Complementarity Problems


Department of Mathematics, College of Science, National Taiwan Normal University
Doctoral Dissertation
A Dynamical Systems Approach to Complementarity Problems
Alcantara, Jan Harold Mercado
Advisor: Dr. Jein-Shan Chen (陳界山)
June 2020


Acknowledgments

I am immensely grateful to my very talented thesis adviser, Professor Jein-Shan Chen, for all his support, encouragement and trust in my capabilities. His door has always been open to me regardless of the matter that I wanted to consult with him. He always finds value in my ideas and appreciates all my efforts, small or otherwise. Whenever I am veering off the right course, he always puts me on the correct track. His attitude towards research has always amazed and inspired me. I would not be where I am today if not for his guidance and support.

To the panel members, Prof. Ruey-Lin Shieu, Prof. Chun-Hsu Ko, Prof. Pengwen Chen, and Prof. Yu-Lin Chang, who spent their valuable time reading my thesis and gave me helpful comments to improve this manuscript, I am very grateful. I am also very thankful to the members of our research group, especially Prof. Yu-Lin Chang, Prof. Chu-Chin Hu, Dr. Chieu Thanh Nguyen and Lee-Chen Han, for our fruitful discussions, which contributed a lot to my journey of learning different areas of optimization.

My research experience in optimization would not have been complete had I not learned how to implement algorithms. For this, I am very thankful to Prof. Shao-Tung Chang, from whom I learned how to write "good" code in Matlab. I will always remember his scrupulousness when writing code. Without him, I would not have accomplished as much as I have right now. I will not forget the weekly challenge of writing code for our courses on Statistical Computing and Cluster Analysis, from which I have learned so much.

I would also like to express my deepest gratitude to various staff members of the Department of Mathematics. First of all, I am very thankful to Mr. Chi-Tai Chu, who welcomed me into the department and helped me whenever I encountered problems. All our small talks about studies, culture, and music will not be forgotten. To Ms. Cindy Shen and Ms. Chia-Hua Chang, thank you for always being there to attend to the questions of international students. To Mr. Hung-Lin Huang, thank you for always helping me fix technical problems in my office.

On a personal note, I want to sincerely thank my friend Chieu for our meaningful discussions, Bernise and her family for the love and support, and my family for the inspiration. Above all else, I am thankful to God for guiding me.

This work is dedicated to the memory of my beloved grandmother, Estelita Jolo.

Abstract

The nonlinear complementarity problem (NCP) is not only central in the study of constrained optimization but also provides an important framework for modelling equilibrium problems in several areas such as engineering, economics and operations research. We solve the NCP using systems of ordinary differential equations inspired by (i) a reformulation approach via complementarity functions and (ii) a special type of smoothing method for NCPs.

First, a neural network model is constructed based on the discrete-type generalization of the natural residual (NR) function and its two symmetrizations. We establish several important properties of their induced merit functions, which are necessary not only in the neural network approach but also in most NCP function-based algorithms. Using these results, we analyze the formulated dynamical systems with an odd parameter p ≥ 3. Numerical experiments suggest that lower values of p provide the best convergence speed and are further recommended due to ill-conditioning problems encountered when p is large. To provide better convergence results, we construct new NCP functions by proposing a continuous-type generalization of the NR function, together with two symmetrizations, which involve a continuous tunable parameter p ∈ (1, ∞). The extension is meaningful as it offers more stable dynamical systems with faster convergence speeds. More importantly, we discovered one class of NCP functions which can outperform the traditionally used (generalized) Fischer-Burmeister function.

Second, a novel smoothing approach for complementarity problems is utilized to construct alternative dynamical systems for solving the NCP. We use a family of functions to construct smooth perturbations of the zero-level curve of the NR function, and introduce two important subclasses which have significantly different theoretical and numerical properties. We establish sufficient conditions to guarantee asymptotic and exponential stability. Comparisons between the NCP-based and the smoothing-type neural networks are also presented.

Keywords: complementarity problems, neural network, NCP-functions, natural residual function, smoothing approach, stability.


Contents

Acknowledgments
Abstract
Contents
List of Notations
List of Tables
List of Figures

1 The Problem and its Background
  1.1. Introduction
  1.2. Solution Methods for NCP
  1.3. Neural Network Approach
  1.4. Complementarity Functions
  1.5. Overview and Contributions

2 Preliminaries
  2.1. Nonlinear Mappings
  2.2. Stability Analysis of Dynamical Systems

3 Properties of Gradient Dynamical Systems
  3.1. General Properties
  3.2. Properties of NCP Functions-based Gradient Systems

4 Neural Networks Based on Discrete Generalization and Symmetrizations of the Natural Residual Function
  4.1. Introduction
  4.2. Motivation and Contributions
  4.3. Properties of Induced Merit Functions
  4.4. Stability Analysis
  4.5. Numerical Experiments
  4.6. Summary and Recommendations

5 Neural Networks based on Novel Generalization of the Natural Residual Function
  5.1. Motivation
  5.2. Continuous Generalization of the Natural Residual Function
  5.3. Stability Analysis
  5.4. Numerical Experiments
  5.5. Discussion of Numerical Results
  5.6. Summary

6 Neural Network based on Haddou-Maheux Smoothing Framework
  6.1. Introduction
  6.2. Smoothing Approach
  6.3. Construction of Smoothing Functions
  6.4. Smoothed Neural Networks
    6.4.1. The First Neural Network
    6.4.2. The Second Neural Network
  6.5. Numerical Experiments
  6.6. Comparisons between different neural networks
  6.7. Conclusions

7 Conclusions and Future Research

Bibliography
Appendix A: Collection of NCP Test Problems
Appendix B: Simulation Results for Chapter 5


List of Notations

• $\mathbb{R}^n$ denotes the n-dimensional Euclidean space endowed with the usual inner product $\langle x, y \rangle = x^T y$.
• NCP($F$) denotes the problem of finding a point $x \in \mathbb{R}^n$ such that $x \ge 0$, $F(x) \ge 0$ and $\langle x, F(x) \rangle = 0$.
• SOL($F$) denotes the solution set of NCP($F$).
• $\Omega_F$ denotes the feasible region of NCP($F$), i.e. $x \in \Omega_F$ if and only if $x \ge 0$ and $F(x) \ge 0$.
• $\|\cdot\|$ denotes the usual norm on $\mathbb{R}^n$, and $\|\cdot\|_p$ denotes the $l_p$-norm, i.e. $\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$, where $p \in (0, \infty)$.
• $\mathbb{R}^n_+$ and $\mathbb{R}^n_{++}$ denote the nonnegative and positive orthants of $\mathbb{R}^n$, respectively.
• $\mathbb{R}^{m \times n}$ denotes the space of $m \times n$ real matrices.
• $M^T$ denotes the transpose of a matrix $M$, and $M_{ij}$ denotes the $(i, j)$-entry of $M$.
• For $M \in \mathbb{R}^{n \times n}$ and $\Lambda \subseteq \{1, \dots, n\}$, we denote by $M_\Lambda$ the principal submatrix of $M$ indexed by $\Lambda$ (i.e. the submatrix of $M$ corresponding to rows and columns indexed by $\Lambda$). $\Lambda^c$ denotes the complement of $\Lambda$.
• For any differentiable function $f : \mathbb{R}^2 \to \mathbb{R}$, $\nabla_a f(a, b)$ and $\nabla_b f(a, b)$ denote the partial derivatives of $f$ with respect to $a$ and $b$, respectively.
• Given a differentiable mapping $F = (F_1, \dots, F_m)^T : \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) = [\nabla F_1(x) \cdots \nabla F_m(x)] \in \mathbb{R}^{n \times m}$ denotes the transposed Jacobian of $F$ at $x$, where $\nabla F_i(x)$ denotes the gradient of $F_i$ at $x$.
• Given a family of real-valued functions on $\mathbb{R}^n$, $\{\phi_r : r > 0\}$, we denote by $\phi_0$ the pointwise limit $\lim_{r \searrow 0} \phi_r$, whenever it exists.

List of Tables

5.1 Numerical results for NCP1 and NCP2 using the neural networks based on $\tilde{\phi}^p_{NR}$, $\tilde{\phi}^p_{S-NR}$ and $\tilde{\psi}^p_{S-NR}$ for different values of $p$.
5.2 Numerical results for NCP3 and NCP4 using the neural networks based on $\tilde{\phi}^p_{NR}$, $\tilde{\phi}^p_{S-NR}$ and $\tilde{\psi}^p_{S-NR}$ for different values of $p$.
5.3 Numerical results for NCP5 and NCP6 using the neural networks based on $\tilde{\phi}^p_{NR}$, $\tilde{\phi}^p_{S-NR}$ and $\tilde{\psi}^p_{S-NR}$ for different values of $p$.
5.4 Numerical results for NCP7 and NCP8 using the neural networks based on $\tilde{\phi}^p_{NR}$, $\tilde{\phi}^p_{S-NR}$ and $\tilde{\psi}^p_{S-NR}$ for different values of $p$.
5.5 Numerical results for NCP9 and NCP10 using the neural networks based on $\tilde{\phi}^p_{NR}$, $\tilde{\phi}^p_{S-NR}$ and $\tilde{\psi}^p_{S-NR}$ for different values of $p$.
6.1 $\theta$ functions generated from (6.6)
6.2 Average Convergence Time of Neural Network (NN2) for Linear Complementarity Problems
6.3 Average Convergence Time of Neural Network (NN2)
6.4 Average Convergence Time of Neural Network (NN2)
6.5 Convergence Time of Neural Networks based on $\theta_2$, $\theta_5$ and $\tilde{\phi}^p_{S-NR}$ for all NCP test problems


List of Figures

1.1 Thesis Plan
4.1 Simplified block diagram for neural network (3.9).
4.2 Convergence behavior of the error $\|x(t) - x^*\|$ in Example 4.5.1 using the neural network with $\phi^p_{NR}$ for different values of $p$, where $x_0 = (2, 0.5, 0.5, 1.5)^T$ and $\rho = 10^6$.
4.3 Convergence behavior of the error $\|x(t) - x^*\|$ in Example 4.5.1 using the neural network with $\phi^p_{S-NR}$ for different values of $p$, where $x_0 = (2, 0.5, 0.5, 1.5)^T$ and $\rho = 10^3$.
4.4 Convergence behavior of the error $\|x(t) - x^*\|$ in Example 4.5.1 using the neural network with $\psi^p_{S-NR}$ for different values of $p$, where $x_0 = (2, 0.5, 0.5, 1.5)^T$ and $\rho = 10^9$.
4.5 Comparison of convergence speeds of $\|x(t) - x^*\|$ in Example 4.5.1 using the neural network with different NCP functions, where $x_0 = (2, 0.5, 0.5, 1.5)^T$ and $\rho = 10^3$.
4.6 Transient behavior of $x(t)$ in Example 4.5.1 of the neural network with $\phi^p_{S-NR}$ ($p = 3$) with 6 random initial points, where $\rho = 10^3$.
4.7 Convergence behavior of the error $\|x(t) - x^*\|$ in Example 4.5.2 using the neural network with $\phi^p_{NR}$ for different values of $p$, where $x_0 = (0.5, 0.5, 3.5, 0.5)^T$ and $\rho = 10^6$.
4.8 Convergence behavior of the error $\|x(t) - x^*\|$ in Example 4.5.2 using the neural network with $\psi^p_{S-NR}$ for different values of $p$, where $x_0 = (0.5, 0.5, 3.5, 0.5)^T$ and $\rho = 10^6$.
4.9 Comparison of convergence speeds of $\|x(t) - x^*\|$ in Example 4.5.2 using the neural network with different NCP functions, where $x_0 = (0.5, 0.5, 3.5, 0.5)^T$ and $\rho = 10^3$.
4.10 Transient behavior of $x(t)$ in Example 4.5.2 of the neural network with $\phi^p_{S-NR}$ ($p = 3$) with 6 random initial points, where $\rho = 10^3$.
4.11 Comparison of convergence speeds of $\|x(t) - x^*\|$ in Example 4.5.3 using the neural network with different NCP functions, where $x_0 = (0.5, 1, 1.5, 0.5, 0.5)^T$ and $\rho = 10^3$.
4.12 Transient behavior of $x(t)$ in Example 4.5.3 of the neural network with $\phi^p_{S-NR}$ ($p = 3$) with 6 random initial points, where $\rho = 10^3$.
5.1 Graph of upper bound for the error term $\|x(t) - x^*\|$ for different values of $a$ and $b$ with $a, b \ge 0$ and $a > b$.
5.2 Graph of $g_{4,0.5}(p)$ on the interval $[30, 40]$.
6.1 Simplified block diagram for neural network (NN1).
6.2 Simplified block diagram for neural network (NN2).
6.3 Performance profile of convergence time for linear complementarity problems
6.4 Performance profile of convergence time for nonlinear complementarity problems
6.5 Performance profile of convergence time for complementarity problems.
6.6 Comparison of convergence speeds of $\|x(t) - x^*\|$ for NCP7 using $\alpha = 0.5$ for functions from $\mathcal{F}_1$ ($\theta_1$, $\theta_2$) and $\alpha = 1$ for functions from $\mathcal{F}_2$ ($\theta_3$, $\theta_4$, $\theta_5$), where $x_0 = (15, 15, 15, 15)^T$ and $r_0 = 10$.

6.7 Trajectories starting at $x_0 = (300, 600, 300, 150)^T$ of the neural network using (NN1) and (NN2) with functions from $\mathcal{F}_2$: $\theta_3$, $\theta_4$ and $\theta_5$ were used for Figures (a) & (d), (b) & (e) and (c) & (f), respectively. Figures (a)-(c) show the trajectories of (NN2) with $\alpha = 1$, which all converged to an equilibrium point which is not the NCP solution, while Figures (d)-(f) show the trajectories of (NN1) with $\beta = 0.7$, which all converge to $x^* = (\sqrt{6}/2, 0, 0, 1/2)^T$.
6.8 Trajectories starting at $x_0 = (300, 600, 300, 150)^T$ of the neural network based on FB function [42].
6.9 Performance profile of convergence time of (NN2) and the neural networks based on FB and generalized FB function for solving complementarity problems
6.10 Performance profiles for (a) all solvers $\theta_2$, $\theta_5$ and $\tilde{\phi}^p_{S-NR}$, (b) $\theta_2$ and $\theta_5$, (c) $\theta_5$ and $\tilde{\phi}^p_{S-NR}$ and (d) $\theta_5$ and $\tilde{\phi}^p_{S-NR}$.
7.1 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP1.
7.2 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP1.
7.3 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the approximate solution $x^* = (0.1837, 0.2652, 0.3068, 0.3030, 0.4015)$ for NCP1.
7.4 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP2.

7.5 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP2.
7.6 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the approximate solution $x^* = (0.6555, 0.3913, 0, 0, 0)$ for NCP2.
7.7 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP3.
7.8 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP3.
7.9 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the solution $x^* = (0, 0, 1, 2, 3)$ for NCP3.
7.10 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP4.
7.11 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP4.
7.12 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the solution $x^* = (2, 0, 1)$ for NCP4.
7.13 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP5.

7.14 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP5.
7.15 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the solution $x^* = (1, 1, 8, 4)$ for NCP5.
7.16 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP6.
7.17 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP6.
7.18 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the solution $x^* = (\sqrt{6}/2, 0, 0, 0.5)$ for NCP6.
7.19 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP7.
7.20 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP7.
7.21 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ to the degenerate solution $x^* = (\sqrt{6}/2, 0, 0, 0.5)$ and non-degenerate solution $x^* = (1, 0, 3, 0)$ (using $p = 1.01$ and $p = 2$, respectively) for NCP7.
7.22 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP8.

7.23 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP8.
7.24 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to the solution $x^* = (0, 3, 1, 0, 0)$ for NCP8.
7.25 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP9.
7.26 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP9.
7.27 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ ($p = 1.01$) to a solution $x^* = (k, 0, 0, 0)$ (where $k \in [0, 3]$) for NCP9.
7.28 Influence of $p$ on convergence time and Gap value of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP10.
7.29 Influence of $p$ on convergence of the error term $\|x(t) - x^*\|$ and $\Psi_F(x(t))$ of the neural networks based on $\tilde{\phi}^p_{NR}$ (Fig. (a) and (b)), $\tilde{\phi}^p_{S-NR}$ (Fig. (c) and (d)) and $\tilde{\psi}^p_{S-NR}$ (Fig. (e) and (f)) for NCP10.
7.30 Comparison of neural networks based on NR and FB functions, and the convergence of the neural network based on $\tilde{\phi}^p_{S-NR}$ to the solution $x^* = (0, 1, 0, 1, 0)$ for NCP10.

Chapter 1
The Problem and its Background

1.1. Introduction

The nonlinear complementarity problem (NCP) is a system of nonlinear inequalities together with a nonnegativity condition on the variables and an equation which requires the orthogonality of the variables and the functions involved in the inequalities. In precise terms, suppose we are given a function $F : \mathbb{R}^n \to \mathbb{R}^n$. The nonlinear complementarity problem, which we denote by NCP($F$), is to find a point $x \in \mathbb{R}^n$ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \text{and} \quad \langle x, F(x) \rangle = 0,$$

where $\ge$ means the component-wise order on $\mathbb{R}^n$. The set of all solutions of NCP($F$) will be denoted by SOL($F$), which we shall assume to be nonempty. The feasible region of NCP($F$), which we will denote by $\Omega_F$, is the set of all $x \in \mathbb{R}^n_+$ such that $F(x) \ge 0$. Hence, SOL($F$) $\subseteq \Omega_F$. It is clear that if $x^* = 0 \in \Omega_F$ or if $F(x^*) = 0$ with $x^* \in \Omega_F$, then $x^* \in$ SOL($F$).

Throughout this manuscript, we assume that $F$ is continuously differentiable, and let $F = (F_1, \dots, F_n)^T$ with $F_i : \mathbb{R}^n \to \mathbb{R}$ for $i = 1, \dots, n$. The problem NCP($F$) is in fact a special case of the more general problem of finding a point $x \in K \subseteq \mathbb{R}^n$ such that

$$\langle y - x, F(x) \rangle \ge 0, \quad \forall y \in K,$$

which is known as a variational inequality problem denoted by VI($F$, $K$). Indeed, NCP($F$) corresponds to the variational inequality with $K = \mathbb{R}^n_+$.
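As a quick illustration of the definition above, the following minimal sketch checks numerically whether a given point satisfies the three NCP conditions. The function name `is_ncp_solution`, the tolerance `tol`, and the small linear map $F(x) = Mx + q$ are illustrative assumptions introduced here; they are not taken from the thesis.

```python
import numpy as np

def is_ncp_solution(F, x, tol=1e-10):
    """Return True if x >= 0, F(x) >= 0 and <x, F(x)> = 0 hold up to tol."""
    x = np.asarray(x, dtype=float)
    Fx = np.asarray(F(x), dtype=float)
    nonneg_x = np.all(x >= -tol)          # x >= 0 (component-wise)
    nonneg_F = np.all(Fx >= -tol)         # F(x) >= 0 (component-wise)
    orthogonal = abs(x @ Fx) <= tol       # <x, F(x)> = 0
    return bool(nonneg_x and nonneg_F and orthogonal)

# Hypothetical test problem (an assumption for illustration): F(x) = M x + q
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
F = lambda x: M @ x + q

x_star = np.array([0.5, 0.0])   # F(x_star) = (0, 1.5), so x_star solves NCP(F)
x_bad = np.array([1.0, 0.0])    # F(x_bad) = (1, 2), but <x, F(x)> = 1

print(is_ncp_solution(F, x_star))
print(is_ncp_solution(F, x_bad))
```

Note that `x_bad` lies in the feasible region $\Omega_F$ but is not a solution, which is consistent with the inclusion SOL($F$) $\subseteq \Omega_F$ being strict in general.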

These problems arise in the study of constrained optimization problems and in modelling equilibrium problems [15]. Significant research efforts in the study of NCPs have been put forth in the past decades due to their broad scope of applications in areas such as engineering, economics and operations research [15, 18, 19]. Among others, traffic network design, obstacle problems, frictional contact problems, elastoplastic structural analysis, and pricing American options are some source problems for NCP [15, 18].

Our main goal in this dissertation is to solve NCP($F$) using systems of ordinary differential equations. This is more popularly known as the neural network (NN) approach in the optimization literature. Several models will be presented, and we discuss the theoretical and numerical advantages and disadvantages of each dynamical system. These proposed dynamical systems are important in applications and can in fact be extended to convex programming problems, variational inequalities, and other mathematical programming problems.

1.2. Solution Methods for NCP

As mentioned above, NCP provides an important framework for the study of equilibrium problems which arise in different areas [15, 18, 19]. Because of the wide range of applications of the NCP, many methods for solving this problem have been developed over the years. Several of these techniques utilize a special function, defined as follows, which naturally takes the structure of NCP($F$) into account.

Definition 1.2.1 (NCP Function). A function $\phi : \mathbb{R}^2 \to \mathbb{R}$ is called an NCP function if it satisfies

$$\phi(a, b) = 0 \iff a \ge 0, \quad b \ge 0 \quad \text{and} \quad ab = 0.$$

An NCP function is very useful in solving NCP($F$) as it naturally exploits the structure of the problem. In turn, the problem can be recast into another form which can then be solved by various techniques. For instance, define $\Phi_F : \mathbb{R}^n \to \mathbb{R}^n$ as

$$\Phi_F(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}. \tag{1.1}$$
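To make Definition 1.2.1 and the map $\Phi_F$ in (1.1) concrete, here is a small sketch. It uses $\phi(a, b) = \min\{a, b\}$ purely as an illustrative NCP function; the helper names `phi_min` and `make_Phi` and the test map `F` are assumptions introduced for this example.

```python
import numpy as np

def phi_min(a, b):
    """Illustrative NCP function: min(a, b) = 0 iff a >= 0, b >= 0 and ab = 0."""
    return np.minimum(a, b)

def make_Phi(phi, F):
    """Build Phi_F(x) = (phi(x_1, F_1(x)), ..., phi(x_n, F_n(x))) as in (1.1)."""
    def Phi(x):
        x = np.asarray(x, dtype=float)
        return phi(x, np.asarray(F(x), dtype=float))
    return Phi

# Check the defining equivalence of an NCP function on a small grid of (a, b).
grid = np.linspace(-2.0, 2.0, 9)   # step 0.5, so the grid contains 0 exactly
for a in grid:
    for b in grid:
        is_zero = bool(phi_min(a, b) == 0.0)
        complementary = bool(a >= 0 and b >= 0 and a * b == 0)
        assert is_zero == complementary

# Hypothetical test map (an assumption): F(x) = (x_1 - 1, x_2 + 1)
F = lambda x: np.array([x[0] - 1.0, x[1] + 1.0])
Phi = make_Phi(phi_min, F)

print(Phi([1.0, 0.0]))   # x = (1, 0) solves NCP(F), so Phi_F vanishes there
print(Phi([2.0, 1.0]))   # a non-solution gives a nonzero residual
```

A grid check of this kind only probes finitely many points, so it is a sanity test of the equivalence rather than a proof.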

It is then clear that NCP($F$) is equivalent to solving the system of equations $\Phi_F(x) = 0$. Hence, solution methods for nonlinear systems of equations can be employed. In particular, depending on the smoothness properties of $\phi$ and $F$, we may employ the Newton method [46], the semismooth Newton method [16, 50, 64], or some smoothing approaches [7, 49].

NCP functions also naturally give rise to merit functions, i.e. functions whose set of global minimizers coincides with the set SOL($F$). In particular, the function

$$\Psi_F(x) := \frac{1}{2} \|\Phi_F(x)\|^2 \tag{1.2}$$

is a merit function for NCP($F$). That is, NCP($F$) is equivalent to solving the unconstrained minimization problem

$$\min_{x \in \mathbb{R}^n} \Psi_F(x),$$

which is known as the merit-function approach [21, 35, 43]. This nonlinear least-squares formulation of NCP($F$) allows us to take advantage of several approaches such as the steepest descent method, the Newton method, the Gauss-Newton method and the Levenberg-Marquardt method, provided the merit function $\Psi_F$ is smooth [46]. As with the system-of-equations approach described above, smoothing strategies can also be adopted if $\Psi_F$ is nonsmooth.

Another technique involves a regularization approach [31, 55], where we consider a sequence of regularized complementarity problems NCP($F_\varepsilon$). The function $F_\varepsilon$ is a simple perturbation of $F$ given by $F_\varepsilon(x) := F(x) + \varepsilon x$. A solution of the NCP is then obtained by letting $\varepsilon \to 0$. On the other hand, interior-point methods [47, 48] and the proximal point algorithm [51] are some well-known approaches to solving NCP($F$) which do not utilize NCP functions in general. We refer the interested reader to the monograph [15] and the paper [25] for a survey and thorough discussion of solution methods for complementarity problems.
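The merit function (1.2) and the regularized map $F_\varepsilon$ can be sketched in a few lines. The choice $\phi = \min$, the helper names `merit` and `regularize`, and the sample map `F` are illustrative assumptions; the thesis studies other NCP functions in later chapters.

```python
import numpy as np

def merit(phi, F, x):
    """Psi_F(x) = 0.5 * ||Phi_F(x)||^2, with Phi_F built componentwise from phi."""
    x = np.asarray(x, dtype=float)
    Phi = phi(x, np.asarray(F(x), dtype=float))
    return 0.5 * float(Phi @ Phi)

def regularize(F, eps):
    """Return the perturbed map F_eps(x) = F(x) + eps * x."""
    def F_eps(x):
        x = np.asarray(x, dtype=float)
        return np.asarray(F(x), dtype=float) + eps * x
    return F_eps

phi = np.minimum                                    # illustrative NCP function
F = lambda x: np.array([x[0] - 1.0, x[1] + 1.0])    # hypothetical test map

print(merit(phi, F, [1.0, 0.0]))   # x = (1, 0) solves NCP(F): merit value is zero
print(merit(phi, F, [2.0, 1.0]))   # a non-solution: merit value is positive

F_eps = regularize(F, 0.1)
print(F_eps([1.0, 0.0]))           # F(x) + 0.1 * x
```

Since $\Psi_F \ge 0$ everywhere and vanishes exactly on SOL($F$), any minimizer with zero merit value is an NCP solution; with $\phi = \min$ the function $\Psi_F$ is nonsmooth, which is the situation where the smoothing strategies mentioned above become relevant.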

1.3. Neural Network Approach

The above-mentioned numerical approaches can solve the NCP efficiently; however, it is often desirable in scientific and engineering applications to obtain a real-time solution, which may not be attainable with these traditional approaches. One promising approach that can provide real-time solutions is the use of neural networks, which were first introduced in optimization by Hopfield and Tank in the 1980s [26, 57]. Neural networks are hardware-implementable, i.e. they can be implemented via integrated circuits, and therefore exhibit real-time processing. Prior research shows that neural networks can be used efficiently in linear and nonlinear programming, variational inequalities and nonlinear complementarity problems [8, 12, 27, 28, 37, 42, 61, 62, 63, 66, 68], as well as in other fields [53, 58, 65].

Notice, however, that the original formulation of the NCP is not an optimization problem. Nevertheless, as discussed in the preceding section, NCP functions can be used to construct merit functions $\Psi_F$ for NCP($F$). In particular, a nonnegative merit function usually serves as an energy function, which is then used to formulate a steepest-descent dynamical system given by

$$\frac{dx(t)}{dt} = -\rho \nabla \Psi_F(x(t)), \quad x(0) = x_0 \in \mathbb{R}^n,$$

where $\rho > 0$, whose equilibrium solutions correspond to NCP solutions under some suitable conditions.

Aside from utilizing merit functions to construct a dynamical system, another typical class of neural networks is based on projection mappings. For instance, the variational inequality problem VI($F$, $K$) can equivalently be reformulated as a fixed point problem given by

$$x = P_K(x - F(x)), \tag{1.3}$$

which motivates a simple projection-based neural network

$$\frac{dx(t)}{dt} = -\rho \left( x - P_K(x - F(x)) \right), \quad x(0) = x_0.$$

This type of neural network has been extensively studied in the literature because of its simplicity. The main hurdle, however, is the difficulty involved in calculating the projection onto $K$. Moreover, there are instances when this dynamical system has oscillating trajectories and in turn fails to converge to the solution [56].

1.4. Complementarity Functions

In Section 1.2, we illustrated how NCP functions are useful in solving NCP($F$). We now turn to some examples of these functions, which will be used in designing neural networks for NCP($F$). Two well-known NCP functions are the Fischer-Burmeister (FB) function

$$\phi_{FB}(a, b) = \sqrt{a^2 + b^2} - (a + b), \tag{1.4}$$

and the natural residual (NR) function (also known as the minimum function) given by

$$\phi_{NR}(a, b) = \min\{a, b\} = a - (a - b)_+, \tag{1.5}$$

where $t_+$ denotes the projection of $t$ onto $[0, \infty)$, i.e.

$$t_+ = \begin{cases} t & \text{if } t \ge 0, \\ 0 & \text{if } t < 0. \end{cases}$$

Observe that the NR function is motivated by the fixed point reformulation given in (1.3) with $K = \mathbb{R}^n_+$. The FB function has gained significant attention and has been widely used in several studies because of its desirable numerical properties. In addition, as noted in [20], it is remarkable that several NCP functions are akin to the FB function. For instance, the generalized FB function (or GFB function)

$$\phi^p_{FB}(a, b) = \|(a, b)\|_p - (a + b), \quad p > 1 \tag{1.6}$$

is an interesting generalization of $\phi_{FB}$ which can be used efficiently in solving NCPs. Here, $\|\cdot\|_p$ denotes the $l_p$-norm, and the tunable parameter $p$ has been shown to possibly improve the numerical performance of some algorithms [8, 10]. Inspired by this, Chen, Ko and Wu [9] proposed a discrete-type generalization of the natural residual function $\phi_{NR}$ in (1.5), which is given by

$$\phi^p_{NR}(a, b) = a^p - (a - b)^p_+, \tag{1.7}$$

(26) where p ≥ 1 is an odd integer. This generalization is considered “discrete” in the sense that p admits integral values only, as opposed to the generalized FB function (1.6) where p is a continuous parameter in (1, ∞). It is shown in [9] that φpNR is twice continuously differentiable when p > 1. However, its surface is not symmetric, which may result to difficulties in designing and analyzing solution methods [30]. To conquer this, two symmetrizations of the φpNR are presented in [5]. A natural symmetrization of φpNR is given by p. φS−NR (a, b) =. (. ap − (a − b)p if a > b, ap = b p if a = b, p p b − (b − a) if a < b,. (1.8). where p > 1 is an odd integer. The above NCP-function is symmetric, but is only differentiable on {(a, b) | a 6= b or a = b = 0}. It was however shown in [30] that φpS−NR. is semismooth and is directionally differentiable. The second symmetrization of φpNR is described by p ψS−NR (a, b) =. (. ap bp − (a − b)p bp if a > b, ap bp = a2p if a = b, ap bp − (b − a)p ap if a < b,. (1.9). where p > 1 is an odd integer. This possesses both differentiability and symmetry. The p are three classes of the four discrete-type families of NCPfunctions φpNR , φpS−NR and ψS−NR. functions which are recently discovered, together with the discrete-type generalization of the Fischer-Burmeister function given by φpD−FB (a, b) =. p p x2 + y 2 − (x + y)p ,. where p > 1 is an odd integer. A comprehensive discussion of their properties is presented in [30]. Several other NCP functions are surveyed in [20].. 1.5. Overview and Contributions. In Chapter 2, we recall some preliminary concepts and results on nonlinear mappings which will be the focus of discussion in the succeeding chapters. We also list some important results from dynamical systems, especially those pertaining to stability analysis. Chapter 3 is dedicated to establishing some general results which will often be used in. 6.

(27) Chapters 4 to 6. We first discuss in Section 3.1 the general properties of gradient dynamical systems arising from taking the gradient of an arbitrary function $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ given by $\Psi(x) := \frac{1}{2}\|\Phi(x)\|^2$, where $\Phi$ is any function from $\mathbb{R}^n$ to $\mathbb{R}^n$. That is, $\Phi$ does not necessarily correspond to an NCP function as in (1.1). The latter case will be the main discussion in Section 3.2. Our primary goal in this thesis is to use systems of first-order differential equations to solve NCP($F$). Figure 1.1 shows an overview of the flow of this manuscript.

Figure 1.1: Thesis Plan

In particular, we address the following research problems:

(i) In Chapter 4, we construct three classes of neural networks based on the discrete generalization of the NR function (1.7) and its two symmetrizations (1.8) and (1.9) for solving NCP($F$). Since these are newly discovered NCP functions, some analytic properties prerequisite to the stability analysis of the neural network must be established first. These include the analysis of the growth behavior of these discrete-type functions, and the characterization of stationary points of their induced

(28) merit functions given by (1.2). We provide numerical simulations to illustrate the theoretical results, and also compare the proposed neural networks with existing neural networks based on other well-known NCP functions. Preliminary numerical results indicate that the performance of the neural network is better when the parameter $p$ associated with the NCP function is smaller.

(ii) We propose a continuous-type generalization of the NR function together with two symmetrizations in Chapter 5. The new family admits a continuous parameter $p > 1$, giving us a wider range of choices for $p$. Moreover, this generalization subsumes the discrete generalization originally proposed in [9]. The proposed generalization is a meaningful extension since it not only induces more stable dynamical systems, but also offers values of $p$ which result in faster convergence of the neural network. We also prove some useful error bounds which are helpful in understanding the influence of $p$ on the performance of the network.

(iii) We formulate a dynamical system based on a certain family $\mathcal{F}$ of real-valued functions that can be used to construct smooth perturbations of the level curve defined by $\phi_{\rm NR}(a,b) = 0$. Two important subclasses of $\mathcal{F}$, which deserve particular attention because of their significantly different theoretical and numerical properties, are introduced. One of these subfamilies yields a smoothing function for $\phi_{\rm NR}$, while the other subfamily only yields a smoothing curve for $\phi_{\rm NR}(a,b) = 0$. We also propose a simple framework for generating functions from these subclasses. Using the smoothing approach, we build two types of neural networks and provide several sufficient conditions to guarantee asymptotic and exponential stability of equilibrium solutions. The discussions are presented in Chapter 6.

We close the discussion in Chapter 7 with some concluding remarks and some suggested future research directions which extend the models presented in this manuscript.

(29) Chapter 2 Preliminaries

Despite its simple formulation, the nonlinear complementarity problem is in general a difficult problem, yet a very important one in many applications. In fact, even just the feasibility of the problem, i.e. the existence (or non-existence) of a point $x \in \mathbb{R}^n$ such that $x \ge 0$ and $F(x) \ge 0$, may be as difficult as solving NCP($F$) itself. In other words, determining whether or not $\Omega_F$ is nonempty is already a difficult problem. To date, there are only a few studies dealing with this problem (see the papers [32, 69] and the monograph [15]). However, the feasibility problem is clearly important since feasibility is a necessary condition for the existence of NCP solutions, i.e. the nonemptiness of SOL($F$). There are even instances when feasibility is sufficient for solvability of the complementarity problem (see [11, 15, 45]). In general, conditions on $F$ must be imposed to ensure that SOL($F$) is nonempty. To deal with both the feasibility and solvability of NCP($F$), special types of nonlinear functions $F$ are usually involved. These special mappings will be the subject of Section 2.1. In Section 2.2, we recall some important definitions and theorems pertaining to systems of ordinary differential equations.

(30) 2.1. Nonlinear Mappings

This section summarizes basic concepts related to continuously differentiable nonlinear mappings together with their properties. Most of the material here can be found in [11, 15]. We first review some special types of matrices.

Definition 2.1.1. [11] Let $M \in \mathbb{R}^{n\times n}$. Then
(a) $M$ is said to be positive definite (resp. positive semidefinite) if $\langle x, Mx\rangle > 0$ (resp. $\langle x, Mx\rangle \ge 0$) for all $x \in \mathbb{R}^n \setminus \{0\}$.
(b) $M$ is said to be a $P$-matrix (resp. a $P_0$-matrix) if all its principal minors are positive (resp. nonnegative).

Observe that our definition of a positive definite (semidefinite) matrix does not require $M$ to be a symmetric matrix, as is usually assumed. Rather, a matrix is considered positive definite (semidefinite) if and only if its symmetric part is positive definite (semidefinite) in the usual sense. Moreover, it is known that a positive definite matrix (resp. a positive semidefinite matrix) is a $P$-matrix (resp. a $P_0$-matrix). We also note that $M$ is a $P_0$-matrix if and only if $M + \varepsilon I$ is a $P$-matrix for all $\varepsilon > 0$. These facts can be found in [11]. The following is a very important characterization of $P$- and $P_0$-matrices.

Lemma 2.1.1. [11] Let $M \in \mathbb{R}^{n\times n}$.
(a) $M$ is a $P$-matrix if and only if whenever $x_i (Mx)_i \le 0$ for all $i$, then $x = 0$.
(b) $M$ is a $P_0$-matrix if and only if whenever $x \ne 0$, there exists $j$ such that $x_j \ne 0$ and $x_j (Mx)_j \ge 0$.

We now define some important nonlinear mappings which are closely related to the matrices defined above, as we shall see later. We begin with monotone maps.

Definition 2.1.2. [15] Let $F = (F_1, \dots, F_n)^T : \mathbb{R}^n \to \mathbb{R}^n$. Then, the mapping $F$ is said to be

(31) (a) monotone if $\langle x - y, F(x) - F(y)\rangle \ge 0$ for all $x, y \in \mathbb{R}^n$.
(b) strictly monotone if $\langle x - y, F(x) - F(y)\rangle > 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.
(c) strongly monotone with modulus $\mu > 0$ if $\langle x - y, F(x) - F(y)\rangle \ge \mu \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

It is clear that strongly monotone functions are strictly monotone, and strictly monotone functions are monotone. The following result indicates the relationship between monotone maps and their transposed Jacobians. A consequence of this lemma is that the monotonicity of an affine map $F(x) = Mx + b$ is completely determined by $M$.

Lemma 2.1.2. [15] Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable. Then
(a) $F$ is monotone if and only if $\nabla F(x)$ is positive semidefinite for all $x \in \mathbb{R}^n$.
(b) $F$ is strictly monotone if $\nabla F(x)$ is positive definite for all $x \in \mathbb{R}^n$. (The converse fails in general: $F(x) = x^3$ is strictly monotone on $\mathbb{R}$ although $F'(0) = 0$.)
(c) $F$ is strongly monotone if and only if $\nabla F(x)$ is uniformly positive definite; i.e. there exists a constant $c > 0$ such that
\[ \langle y, \nabla F(x) y\rangle \ge c\|y\|^2, \quad \forall y \in \mathbb{R}^n, \]
for all $x \in \mathbb{R}^n$.

It is known that there is at most one solution to NCP($F$) if $F$ is a strictly monotone function, while a unique solution exists when $F$ is a strongly monotone function [15]. On the other hand, NCP($F$) may not have a solution when $F$ is only monotone. For instance, the function $F(x) = -e^{-x}$ is clearly monotone on $\mathbb{R}$, and it is easy to verify that SOL($F$) $= \emptyset$ since $F(x) < 0$ for all $x$.

More general classes of nonlinear functions are defined below, followed by an important result relating the function $F$ to $\nabla F$. Clearly, the class of $P_0$-functions contains the class of $P$-functions, which in turn contains the class of uniformly $P$-functions. Moreover, we

(32) remark that from Definitions 2.1.2 and 2.1.3, it is clear that monotone functions (resp. strongly monotone and strictly monotone functions) are contained in the class of $P_0$-functions (resp. uniformly $P$-functions and $P$-functions).

Definition 2.1.3. [15] Let $F = (F_1, \dots, F_n)^T : \mathbb{R}^n \to \mathbb{R}^n$. Then, the mapping $F$ is said to be
(a) a $P_0$-function if $\max\limits_{1 \le i \le n,\ x_i \ne y_i} (x_i - y_i)(F_i(x) - F_i(y)) \ge 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.
(b) a $P$-function if $\max\limits_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) > 0$ for all $x, y \in \mathbb{R}^n$ with $x \ne y$.
(c) a uniformly $P$-function with modulus $\kappa > 0$ if $\max\limits_{1 \le i \le n} (x_i - y_i)(F_i(x) - F_i(y)) \ge \kappa \|x - y\|^2$ for all $x, y \in \mathbb{R}^n$.

Lemma 2.1.3. [15] Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable. Then
(a) $F$ is a $P_0$-function if and only if $\nabla F(x)$ is a $P_0$-matrix for all $x \in \mathbb{R}^n$.
(b) If $\nabla F(x)$ is a $P$-matrix for all $x \in \mathbb{R}^n$, then $F$ is a $P$-function.
(c) If $F$ is a uniformly $P$-function, then there exists a constant $c > 0$ such that for all $x \in \mathbb{R}^n$,
\[ \max_{1 \le i \le n} y_i (\nabla F(x) y)_i \ge c\|y\|^2, \quad \forall y \in \mathbb{R}^n. \]

We point out that the converse of Lemma 2.1.3(b) does not hold, that is, a $P$-function does not necessarily have a Jacobian which is a $P$-matrix. As for the existence of solutions, it is known that when $F$ is a $P$-function, then NCP($F$) has at most one solution. If $F$ is a continuous uniformly $P$-function, then NCP($F$) has a unique solution [15]. On the other hand, solutions may not exist if $F$ is only a $P_0$-function. For a necessary and sufficient condition for the existence of solutions for $P_0$-functions, please see [15, Theorem 3.5.11].

Finally, we consider the class of $R_0$-functions introduced in [6], which is a generalization of an affine function $F(x) = Mx + b$ where $M$ is an $R_0$-matrix [11]. It was proved in [6] that a uniformly $P$-function is an $R_0$-function.
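As an illustration of Definition 2.1.1(b), the following is a minimal Python sketch (not part of the original text) that tests the $P$- and $P_0$-matrix properties by enumerating principal minors. The helper names `principal_minors`, `is_P_matrix` and `is_P0_matrix`, the tolerance `tol`, and the sample matrix are assumptions made only for this example. The enumeration is exponential in $n$, so it is only meant for small matrices.

```python
from itertools import combinations

import numpy as np


def principal_minors(M):
    """Return the determinants of all principal submatrices of M."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    minors = []
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            # Principal submatrix: same index set for rows and columns.
            minors.append(np.linalg.det(M[np.ix_(idx, idx)]))
    return minors


def is_P_matrix(M, tol=1e-12):
    """All principal minors positive (up to a hypothetical tolerance)."""
    return all(m > tol for m in principal_minors(M))


def is_P0_matrix(M, tol=1e-12):
    """All principal minors nonnegative (up to a hypothetical tolerance)."""
    return all(m >= -tol for m in principal_minors(M))


# A P-matrix whose symmetric part is indefinite, so it is not positive definite.
M = np.array([[1.0, -3.0], [0.0, 1.0]])
sym_part_min_eig = np.linalg.eigvalsh(0.5 * (M + M.T)).min()
```

For an affine map $F(x) = Mx + b$, Lemma 2.1.3 then says that such a check on $M$ decides whether $F$ is a $P_0$-function, and gives a sufficient test for $F$ to be a $P$-function. The sample matrix shows that the $P$-matrix class is strictly larger than the class of positive definite matrices.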

(33) Definition 2.1.4. [6] The mapping $F : \mathbb{R}^n \to \mathbb{R}^n$ is called an $R_0$-function if for any sequence $\{x^k\}_{k=1}^\infty$ such that
\[ \|x^k\| \to \infty, \quad \liminf_{k\to\infty} \frac{\min_{1\le i\le n} x_i^k}{\|x^k\|} \ge 0, \quad \liminf_{k\to\infty} \frac{\min_{1\le i\le n} F_i(x^k)}{\|x^k\|} \ge 0, \]
there exists an index $j$ such that $x_j^k \to \infty$ and $F_j(x^k) \to \infty$.

2.2. Stability Analysis of Dynamical Systems

We recall some basic concepts and results from stability analysis (cf. [38, 44]), which will be useful in the convergence analysis of the neural networks. First, we consider a first-order autonomous ordinary differential equation
\[ \dot{x}(t) = f(x(t)), \quad x(t_0) = x_0 \in \mathbb{R}^n, \tag{2.1} \]
where $f : \mathbb{R}^n \to \mathbb{R}^n$. The following result provides a sufficient condition for existence and uniqueness of solutions.

Lemma 2.2.1. [44] Assume that $f : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous mapping. Then, for any $t_0 \ge 0$ and $x_0 \in \mathbb{R}^n$, there exists a local solution $x(t)$ of (2.1) with $t \in [t_0, \tau)$ for some $\tau > t_0$. If, in addition, $f$ is locally Lipschitz continuous at $x_0$, then the solution is unique; if $f$ is Lipschitz continuous in $\mathbb{R}^n$, then $\tau$ can be extended to $\infty$. If $[t_0, \tau)$ is the maximal interval of existence of $x(t)$ and there exists a compact set $K$ such that $\{x(t) : t \in [t_0, \tau)\} \subseteq K$, then $\tau = \infty$.

A point $x^*$ is called an equilibrium point or a steady state of the dynamical system (2.1) if $f(x^*) = 0$. It is called an isolated equilibrium point if there exists some neighborhood $\Omega^* \subseteq \mathbb{R}^n$ of $x^*$ such that $f(x^*) = 0$ and $f(x) \ne 0$ for all $x \in \Omega^* \setminus \{x^*\}$. We consider three types of stability.

Definition 2.2.1. [38, Stability in the sense of Lyapunov] Let $x^*$ be an isolated equilibrium point, and let $x(t)$ denote the solution to (2.1).

(34) (a) $x^*$ is said to be stable if for any $\varepsilon > 0$, there exists a $\delta > 0$ such that for any $x_0 = x(t_0)$ with $\|x_0 - x^*\| < \delta$ and for any $t \ge t_0$, we have $\|x(t) - x^*\| < \varepsilon$.
(b) $x^*$ is said to be asymptotically stable if it is stable and if there exists $\delta > 0$ such that for any $x_0$ with $\|x_0 - x^*\| < \delta$, $x(t) \to x^*$ as $t \to \infty$. If the limit $\lim_{t\to\infty} x(t) = x^*$ holds for any $x_0 \in \mathbb{R}^n$, then $x^*$ is said to be globally asymptotically stable.
(c) $x^*$ is said to be exponentially stable if there exist positive constants $\delta$, $c$ and $\omega$ such that for any $x_0 = x(t_0)$ with $\|x_0 - x^*\| < \delta$ and for any $t \ge t_0$, we have
\[ \|x(t) - x^*\| \le c e^{-\omega t} \|x_0 - x^*\|. \]

In determining stability, the usual approach is to use so-called Lyapunov functions, which are defined as follows.

Definition 2.2.2. [38, Lyapunov Function] Let $\Omega \subseteq \mathbb{R}^n$ be an open neighborhood of $x^*$. A continuously differentiable function $\Psi : \mathbb{R}^n \to \mathbb{R}$ is said to be a Lyapunov function at the state $x^*$ over the set $\Omega$ for equation (2.1) if
\[ \begin{cases} \Psi(x^*) = 0, \quad \Psi(x) > 0 \quad \forall x \in \Omega \setminus \{x^*\}, \\ \dfrac{d}{dt}\Psi(x(t)) = \nabla\Psi(x(t))^T f(x(t)) \le 0, \quad \forall x \in \Omega. \end{cases} \]

The following result gives the relationship between Lyapunov functions and stability. Lemma 2.2.3 then provides a sufficient condition to achieve global asymptotic stability.

Lemma 2.2.2. [38] Let $x^*$ be an isolated equilibrium point of (2.1). Then
(a) $x^*$ is Lyapunov stable if there exists a Lyapunov function over some neighborhood $\Omega$ of $x^*$.
(b) $x^*$ is asymptotically stable if there is a Lyapunov function $\Psi$ over some neighborhood $\Omega$ of $x^*$ such that $\dfrac{d\Psi(x(t))}{dt} < 0$ for all $x \in \Omega \setminus \{x^*\}$.
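To make Definition 2.2.2 and Lemma 2.2.2(b) concrete, here is a small Python sketch (not from the original text) under stated assumptions: the linear system $\dot{x} = Ax$ with the hypothetical matrix $A$ below, the candidate Lyapunov function $\Psi(x) = \frac{1}{2}\|x\|^2$, and a forward Euler discretization with an assumed step size `h`. For this $A$ we have $\nabla\Psi(x)^T f(x) = x^T A x = -\|x\|^2 < 0$ for $x \ne 0$, so the origin is asymptotically stable.

```python
import numpy as np

# Hypothetical example system x' = A x; x^T A x = -||x||^2 for this A.
A = np.array([[-1.0, 2.0], [-2.0, -1.0]])


def f(x):
    """Right-hand side of the autonomous system (2.1) for this example."""
    return A @ x


def lyapunov(x):
    """Candidate Lyapunov function Psi(x) = 0.5 * ||x||^2."""
    return 0.5 * float(x @ x)


def lyapunov_rate(x):
    """d/dt Psi(x(t)) = grad Psi(x)^T f(x) evaluated at the state x."""
    return float(x @ f(x))


def euler_trajectory(x0, h=0.01, steps=1000):
    """Forward Euler approximation of the trajectory through x0."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(xs[-1] + h * f(xs[-1]))
    return np.array(xs)


traj = euler_trajectory([1.0, -2.0])
values = np.array([lyapunov(x) for x in traj])
```

Along the computed trajectory, the sampled values of $\Psi$ decrease and the state approaches the origin, which is the behavior the lemma predicts. The Euler scheme is only a numerical stand-in for the exact flow.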

(35) Lemma 2.2.3. [38, Barbashin-Krasovskii Theorem] Let $x^*$ be an equilibrium point of (2.1) and let $\Psi : \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function such that
\[ \begin{cases} \Psi(x^*) = 0 \ \text{and} \ \Psi(x) > 0, \quad \forall x \ne x^*, \\ \|x\| \to \infty \ \Rightarrow \ \Psi(x) \to \infty, \\ \dfrac{d}{dt}\Psi(x(t)) < 0, \quad \forall x \ne x^*. \end{cases} \]
Then $x^*$ is globally asymptotically stable.

LaSalle's Invariance Principle [40] is another useful result for establishing the asymptotic behavior of trajectories. We use the following formulation of this result.

Lemma 2.2.4. [60, LaSalle's Invariance Principle] Let $M$ be a compact subset of $\mathbb{R}^n$ such that any trajectory $x(t)$ of (2.1) with $x_0 \in M$ stays in $M$ for all $t \ge 0$. Let $\Psi : \mathbb{R}^n \to \mathbb{R}$ be such that $\frac{d}{dt}\Psi(x(t)) \le 0$ for any trajectory $x(t)$ that starts at a point in $M$. Further, let
\[ E = \left\{ x \in M \ \Big|\ \frac{d}{dt}\Psi(x(t)) \equiv 0 \right\} \]
and let $\mathcal{M}$ be the union of all trajectories of (2.1) that start in $E$ and stay in $E$ for all time $t \ge 0$. Then for any initial condition $x_0 \in M$, we have $\mathrm{dist}(x(t), \mathcal{M}) \to 0$ as $t \to \infty$, where $x(t)$ is the solution of (2.1).

We note that in the above result, the distance from a point $x$ to a set $\mathcal{M}$ is defined as
\[ \mathrm{dist}(x, \mathcal{M}) = \inf_{y \in \mathcal{M}} \|x - y\|. \]

The following result will also be helpful in our stability analysis of steepest-descent dynamical systems.

Lemma 2.2.5. [50] Let $F$ be locally Lipschitzian. If all $V \in \partial F(x)$ are nonsingular, then there is a neighborhood $N(x)$ of $x$ and a constant $C > 0$ such that for any $y \in N(x)$ and any $V \in \partial F(y)$, $V$ is nonsingular and $\|V^{-1}\| \le C$.

In the case of continuously differentiable functions, we have the following version of the above result.

(36) Corollary 2.2.6. If $F$ is continuously differentiable and $\nabla F(x)$ is nonsingular, then there exist $\delta, C > 0$ such that for any $y$ with $\|y - x\| < \delta$, the matrix $\nabla F(y)$ is nonsingular and $\|\nabla F(y)^{-1}\| \le C$.

(37) Chapter 3 Properties of Gradient Dynamical Systems

In this chapter, we discuss general properties of a gradient dynamical system
\[ \frac{dx(t)}{dt} = -\rho \nabla\Psi(x(t)), \quad x(0) = x_0, \tag{3.1} \]
where $\rho > 0$, which will be very useful in our subsequent discussions. We will also discuss some general results when $\Psi = \Psi_F$ is given by (1.2), i.e. when $\Psi$ is a merit function for NCP($F$). The propositions herein are generalizations of results that have been frequently used in several papers on steepest-descent based neural networks.

3.1. General Properties

Throughout the discussion, we assume that $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable function. We define $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ by
\[ \Psi(x) := \frac{1}{2}\|\Phi(x)\|^2. \tag{3.2} \]
Then the global minimizers of $\Psi$ are precisely the solutions of the system $\Phi(x) = 0$, provided the system is solvable. Since these global minimizers correspond to stationary points of $\Psi$, it is only natural to consider a steepest-descent based neural network as in (3.1). In the following propositions, we summarize some important results which will be frequently used in the subsequent chapters.

(38) Proposition 3.1.1. Let $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable, and define $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ as in (3.2). Then
(a) If $\Phi(x) = 0$ has a solution, then $x^*$ is a global minimizer of $\Psi$ if and only if $\Phi(x^*) = 0$.
(b) $\Psi(x(t))$ is a nonincreasing function of $t$, where $x(t)$ is a solution of (3.1).
(c) Every accumulation point of a solution $x(t)$ of (3.1) is an equilibrium point of (3.1).
(d) Every solution of $\Phi(x) = 0$ is an equilibrium point of (3.1).

Proof. Claim (a) is obvious. Nevertheless, we point out that the hypothesis is important to guarantee the "only if" part. On the other hand, (b) follows from
\[ \frac{d\Psi(x(t))}{dt} = \nabla\Psi(x(t))^T \frac{dx}{dt} = \nabla\Psi(x(t))^T \left(-\rho\nabla\Psi(x(t))\right) = -\rho\|\nabla\Psi(x(t))\|^2 \le 0 \tag{3.3} \]
for all solutions $x(t)$. For a proof of (c), please see page 232 of [60]. Claim (d) follows from the formula $\nabla\Psi(x) = \nabla\Phi(x)\Phi(x)$. □

In the succeeding discussions, we always assume that $\Phi(x) = 0$ has a solution. Then by Proposition 3.1.1(a), this solution corresponds to a global minimizer of $\Psi$, which in turn corresponds to an equilibrium point of (3.1). Thus, the assumption that $\Phi$ has zeros guarantees the existence of equilibrium points of the dynamical system (3.1).

Now, we present a very important result that establishes exponential stability of an equilibrium point. The idea of the proof of this exponential stability theorem has been repeatedly used in several research papers. Here, we show that this idea can be used for the general case (3.1) where $\Psi$ is given by (3.2).

Theorem 3.1.2. Suppose $\nabla\Phi(x^*)$ is nonsingular for some isolated equilibrium point $x^*$ of (3.1). Then $x^*$ is a zero of $\Phi$ and is an exponentially stable equilibrium point of (3.1).

Proof. Since $\nabla\Psi(x^*) = \nabla\Phi(x^*)\Phi(x^*)$ and $x^*$ is an equilibrium point, the nonsingularity assumption implies that $\Phi(x^*) = 0$. This proves the first claim. This consequently implies that $\Psi(x^*) = 0$ and that $\Psi$ is a Lyapunov function satisfying the assumptions of Lemma 2.2.2(b). Hence, $x^*$ is locally asymptotically stable.

(39) Now, note that since $\Phi$ is differentiable at $x^*$, we have
\[ \Phi(x) = \nabla\Phi(x)^T (x - x^*) + o(\|x - x^*\|) \quad \text{as } x \to x^*. \tag{3.4} \]
By Lemma 2.2.5, there exist $\delta > 0$ and a constant $C > 0$ such that $\nabla\Phi(x)$ is nonsingular for all $x$ with $\|x - x^*\| < \delta$, and $\|\nabla\Phi(x)^{-1}\| \le C$. Then, it gives
\[ \kappa\|y\|^2 \le \|\nabla\Phi(x) y\|^2 \tag{3.5} \]
for any $x$ in the $\delta$-neighborhood (call it $N_\delta(x^*)$) and any $y \in \mathbb{R}^n$, where $\kappa = 1/C^2$. Let $0 < \varepsilon < 2\rho\kappa$. Since $x^*$ is asymptotically stable, we may choose $\delta$ small enough so that $o(\|x - x^*\|^2) < \varepsilon\|x - x^*\|^2$ and $x(t) \to x^*$ as $t \to \infty$ for any initial condition $x(0) \in N_\delta(x^*)$. Now, define $g : [0,\infty) \to \mathbb{R}$ by
\[ g(t) := \|x(t) - x^*\|^2, \]
where $x(t)$ is the unique solution through $x(0) \in N_\delta(x^*)$. Using equations (3.4) and (3.5), we obtain
\begin{align*}
\frac{dg(t)}{dt} &= 2(x(t) - x^*)^T \frac{dx(t)}{dt} \\
&= -2\rho (x(t) - x^*)^T \nabla\Psi(x(t)) \\
&= -2\rho (x(t) - x^*)^T \nabla\Phi(x(t)) \Phi(x(t)) \\
&= -2\rho (x(t) - x^*)^T \nabla\Phi(x(t)) \nabla\Phi(x(t))^T (x(t) - x^*) + o(\|x(t) - x^*\|^2) \\
&\le (-2\rho\kappa + \varepsilon)\|x(t) - x^*\|^2 \\
&= (-2\rho\kappa + \varepsilon) g(t).
\end{align*}
Then, it follows that $g(t) \le e^{(-2\rho\kappa + \varepsilon)t} g(0)$, which says
\[ \|x(t) - x^*\| \le e^{(-\rho\kappa + \varepsilon/2)t} \|x(0) - x^*\|, \]
where $-\rho\kappa + \varepsilon/2 < 0$. This proves that $x^*$ is exponentially stable by Definition 2.2.1(c). □

The following proposition presents some conditions to achieve convergence to equilibrium points and to guarantee global stability (see Definition 2.2.1).

(40) Proposition 3.1.3. Let $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ be as in (3.2), and suppose the level sets $L(\Psi, \gamma) := \{x \in \mathbb{R}^n \mid \Psi(x) \le \gamma\}$ of $\Psi$ are bounded for any $\gamma \ge 0$. Then
(a) The trajectory $x(t)$ through any initial condition $x_0 \in \mathbb{R}^n$ is defined for all $t \ge 0$. Moreover, the trajectory $x(t)$ of (3.1) through any $x_0 \in \mathbb{R}^n$ converges to an equilibrium point.
(b) If $\Phi(x) = 0$ has a unique solution $x^*$, then $x^*$ is globally asymptotically stable.

Proof. That $x(t)$ is defined for all $t \ge 0$ follows from the proof of Proposition 4.2(b) in [8]. Meanwhile, $L(\Psi, \gamma)$ is positively invariant with respect to the solution $x(t)$ (i.e. if the initial condition $x_0$ is in $L(\Psi, \gamma)$, then $x(t) \in L(\Psi, \gamma)$ for all $t \ge 0$) by Proposition 3.1.1(b). In addition, the calculation in (3.3) indicates that $\frac{d\Psi(x(t))}{dt} = 0$ if and only if $\nabla\Psi(x(t)) = 0$, i.e. $x$ is an equilibrium point. By LaSalle's Invariance Principle (Lemma 2.2.4), trajectories $x(t)$ converge to an equilibrium point of (3.1). This proves (a). On the other hand, if $\Phi(x) = 0$ has a unique solution, then (3.1) has a unique equilibrium solution. Then (b) immediately follows from (a). □

We note that an alternative characterization of the boundedness of the level sets $L(\Psi, \gamma)$ for any $\gamma \ge 0$ is that $\Phi$ is coercive, i.e. $\|\Phi(x)\| \to \infty$ whenever $\|x\| \to \infty$. We state this result without proof.

Proposition 3.1.4. $L(\Psi, \gamma)$ is bounded for all $\gamma \ge 0$ if and only if $\Phi$ is coercive.

Proposition 3.1.3(a), in turn, is a direct consequence of Lemma 2.2.3.

3.2. Properties of NCP Functions-based Gradient Systems

In Section 3.1, we discussed the general properties of the gradient system (3.1) with $\Psi$ given by (3.2), where $\Phi$ is any mapping. We now turn to the case when $\Phi = \Phi_F$ as given by (1.1). To recall, let $\phi : \mathbb{R}^2 \to \mathbb{R}$ be an NCP function. We further define $\psi : \mathbb{R}^2 \to \mathbb{R}_+$

(41) as
\[ \psi(a,b) = \frac{1}{2}|\phi(a,b)|^2, \tag{3.6} \]
which is also an NCP function. We define $\Phi_F : \mathbb{R}^n \to \mathbb{R}^n$ as
\[ \Phi_F(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix}, \tag{3.7} \]
and $\Psi_F : \mathbb{R}^n \to \mathbb{R}_+$ as
\[ \Psi_F(x) := \frac{1}{2}\|\Phi_F(x)\|^2 = \sum_{i=1}^n \psi(x_i, F_i(x)). \tag{3.8} \]
The following proposition lists some general properties of $\Psi_F$ and the corresponding gradient system
\[ \frac{dx(t)}{dt} = -\rho\nabla\Psi_F(x(t)), \quad x(0) = x_0. \tag{3.9} \]

Proposition 3.2.1. Let $\Psi_F : \mathbb{R}^n \to \mathbb{R}_+$ be defined as in (3.8). Suppose that $F$ is continuously differentiable. Then,
(a) If SOL($F$) $\ne \emptyset$, then $x^*$ is a global minimizer of $\Psi_F$ if and only if $x^* \in$ SOL($F$).
(b) $\Psi_F(x(t))$ is a nonincreasing function of $t$, where $x(t)$ is a solution of (3.9).
(c) Let $x \in \mathbb{R}^n$, and suppose that $\phi$ is differentiable at $(x_i, F_i(x))$ for each $i = 1, \dots, n$. Then
\[ \nabla\Psi_F(x) = \nabla_a\psi(x, F(x)) + \nabla F(x)\nabla_b\psi(x, F(x)), \tag{3.10} \]
where
\[ \nabla_a\psi(x, F(x)) := [\nabla_a\psi(x_1, F_1(x)), \dots, \nabla_a\psi(x_n, F_n(x))]^T, \]
\[ \nabla_b\psi(x, F(x)) := [\nabla_b\psi(x_1, F_1(x)), \dots, \nabla_b\psi(x_n, F_n(x))]^T. \]
(d) Let $x^* \in$ SOL($F$) such that $\phi$ is differentiable at $(x_i^*, F_i(x^*))$ for each $i = 1, \dots, n$. Then, $x^*$ is an equilibrium point of (3.9).

(42) Proof. Claims (a) and (b) follow from Proposition 3.1.1(a) and (b). The formula in (c) can be obtained using the chain rule. Claim (d) is clear from Proposition 3.1.1(d). □

From Proposition 3.2.1(d), we know that solutions of NCP($F$) are stationary points of the merit function $\Psi_F$. In general, the converse does not necessarily hold. Nevertheless, we can provide a nice characterization when $F$ is a $P_0$-function and the NCP function $\phi$ satisfies some nice properties. We state this in the following theorem, whose proof is a generalization of the proof given in [16, Theorem 4.1].

Theorem 3.2.2. Let $\phi$ be a differentiable NCP function and let $\psi$ be given by (3.6). Suppose that the following properties hold:
(P1) $\nabla_a\psi(a,b) \cdot \nabla_b\psi(a,b) \ge 0$ for all $(a,b) \in \mathbb{R}^2$; and
(P2) For all $(a,b) \in \mathbb{R}^2$, $\nabla_a\psi(a,b) = 0 \iff \nabla_b\psi(a,b) = 0 \iff \phi(a,b) = 0$.
If $F$ is a $P_0$-function, then every equilibrium point of (3.9) solves the NCP.

Proof. From equation (3.10), an equilibrium point $x$ of (3.9) satisfies
\[ \nabla_a\psi(x, F(x)) + \nabla F(x)\nabla_b\psi(x, F(x)) = 0. \tag{3.11} \]
Suppose that $\nabla_b\psi(x, F(x)) \ne 0$. Since $F$ is a $P_0$-function, $\nabla F(x)$ is a $P_0$-matrix by Lemma 2.1.3(a). Then by Lemma 2.1.1(b), we can furnish an index $j$ for which
\[ (\nabla_b\psi(x, F(x)))_j \ne 0, \tag{3.12} \]
and
\[ (\nabla_b\psi(x, F(x)))_j \cdot (\nabla F(x)\nabla_b\psi(x, F(x)))_j \ge 0. \tag{3.13} \]
On the other hand, we have from property (P1) that
\[ (\nabla_a\psi(x, F(x)))_j \cdot (\nabla_b\psi(x, F(x)))_j \ge 0. \tag{3.14} \]
Meanwhile, from equation (3.11), we also have that
\[ (\nabla_a\psi(x, F(x)))_j + (\nabla F(x)\nabla_b\psi(x, F(x)))_j = 0. \tag{3.15} \]

(43) From equations (3.13), (3.14), and (3.15), we see that $(\nabla_a\psi(x, F(x)))_j = 0$. From (P2), we conclude that $(\nabla_b\psi(x, F(x)))_j = 0$, which contradicts (3.12). Thus, the case $\nabla_b\psi(x, F(x)) \ne 0$ cannot take place if $x$ is an equilibrium point. We conclude that $\nabla_b\psi(x, F(x)) = 0$, which implies that $\nabla_b\psi(x_i, F_i(x)) = 0$ for all $i$. It follows from property (P2) that $\phi(x_i, F_i(x)) = 0$ for any $i$, and thus $x$ solves NCP($F$). □

Example 3.2.1. The generalized Fischer-Burmeister function is an example of an NCP function which satisfies properties (P1) and (P2), which makes it very suitable for $P_0$ complementarity problems. Another NCP function which possesses these nice properties is the Mangasarian-Solodov function [35] given by
\[ \phi_{\rm MS}(a,b) = ab + \frac{1}{2\alpha}\left( (\max\{0, a - \alpha b\})^2 - a^2 + (\max\{0, b - \alpha a\})^2 - b^2 \right), \quad \alpha > 1. \]

Now, we establish global asymptotic stability of (3.9) for uniformly $P$-functions.

Theorem 3.2.3. Let $\phi$ be an NCP function such that $|\phi(a^k, b^k)| \to \infty$ as $|a^k| \to \infty$ and $|b^k| \to \infty$. If $F$ is a uniformly $P$-function, then the dynamical system (3.9) has a unique equilibrium point, which is globally asymptotically stable.

Proof. By Proposition 3.1.3(b), it is enough to show that the level sets $L(\Psi_F, \gamma)$ are bounded for any $\gamma \ge 0$. Suppose there exists $\gamma$ for which the level set is unbounded. Then, there exists a sequence $\{x^k\}_{k=1}^\infty \subseteq L(\Psi_F, \gamma)$ such that $\|x^k\| \to \infty$ as $k \to \infty$. A similar argument as in [16] shows that there exists an index $i$ such that $|x_i^k| \to \infty$ and $|F_i(x^k)| \to \infty$ as $k \to \infty$. By hypothesis, we have $|\phi(x_i^k, F_i(x^k))| \to \infty$. But this is impossible since $\Psi_F(x^k) \le \gamma$ for all $k$. Thus, the level set $L(\Psi_F, \gamma)$ is bounded for all $\gamma$. □

We point out that the above result does not necessarily require $F$ to be differentiable.

(44) Example 3.2.2. We note that the generalized FB function $\phi^p_{\rm FB}$ satisfies the hypothesis of Theorem 3.2.3. It follows that the neural network (3.9) equipped with $\phi = \phi^p_{\rm FB}$ has a unique equilibrium point which is globally asymptotically stable whenever $F$ is a uniformly $P$-function.
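The gradient system (3.9) can be sketched numerically as follows. This Python example is not part of the original text and rests on several assumptions: a hypothetical two-dimensional affine map $F(x) = Mx + q$ with $M$ symmetric positive definite (so $F$ is strongly monotone, hence a uniformly $P$-function), the FB function (1.4), i.e. $\phi^p_{\rm FB}$ with $p = 2$, and a forward Euler discretization with assumed step size `h` in place of an exact ODE solver. The gradient follows formula (3.10) with $\nabla_a\psi = \phi\,\partial_a\phi$ and $\nabla_b\psi = \phi\,\partial_b\phi$. The data are chosen so that $x^* = (1, 0)$ solves NCP($F$), since $F(x^*) = (0, 2)$.

```python
import numpy as np

# Hypothetical test problem: F(x) = M x + q with solution x* = (1, 0).
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-2.0, 1.0])


def F(x):
    return M @ x + q


def phi_fb(a, b):
    """Fischer-Burmeister function (1.4), applied componentwise."""
    return np.sqrt(a**2 + b**2) - (a + b)


def merit(x):
    """Merit function Psi_F(x) = 0.5 * ||Phi_F(x)||^2 as in (3.8)."""
    return 0.5 * np.sum(phi_fb(x, F(x)) ** 2)


def grad_merit(x):
    """Gradient of Psi_F via (3.10); for affine F, grad F(x) = M^T."""
    a, b = x, F(x)
    r = np.sqrt(a**2 + b**2)
    safe_r = np.where(r > 0, r, 1.0)  # avoid 0/0; psi has zero gradient at (0, 0)
    phi = r - (a + b)
    grad_a = np.where(r > 0, a / safe_r - 1.0, 0.0) * phi
    grad_b = np.where(r > 0, b / safe_r - 1.0, 0.0) * phi
    return grad_a + M.T @ grad_b


def integrate(x0, rho=1.0, h=0.01, steps=5000):
    """Forward Euler steps for dx/dt = -rho * grad Psi_F(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - h * rho * grad_merit(x)
    return x


x_final = integrate([0.5, 0.5])
```

In this sketch the iterates approach $x^* = (1, 0)$ and the merit value tends to zero, in line with Example 3.2.2. A hardware or production implementation would more likely use an adaptive ODE solver; the fixed-step Euler scheme is used here only to keep the example self-contained.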

(45) Chapter 4 Neural Networks Based on Discrete Generalization and Symmetrizations of the Natural Residual Function

As mentioned in Section 1.3, traditional methods for solving NCPs may not be desirable in applications where real-time solutions are required. As a result, there has been a growing research interest in the use of neural networks, which are hardware-implementable and therefore offer real-time processing. In Chapters 4 to 6, we explore several neural networks based on NCP functions and a special type of smoothing strategy to solve the nonlinear complementarity problem.

In particular, this chapter† is devoted to a family of neural networks for solving nonlinear complementarity problems (NCP). The neural networks are constructed from the merit functions based on three classes of NCP functions: the generalized natural residual function $\phi^p_{\rm NR}$ and its two symmetrizations $\phi^p_{\rm S-NR}$ and $\psi^p_{\rm S-NR}$, which were presented in Section 1.4. We first characterize the stationary points of the induced merit functions. The growth behavior of the complementarity functions is also described, as this will play an important role in describing the level sets of the merit functions. In addition, the stability of the steepest descent-based neural network model for NCP is analyzed. We provide numerical simulations to illustrate the theoretical results, and also compare the proposed neural networks with existing neural networks based on other well-known NCP functions.

† The results presented in this chapter are the author's work published in [3].

(46) Numerical results indicate that the performance of the neural network is better when the parameter $p$ associated with the NCP function is smaller. The efficiency of the neural networks in solving NCPs is also reported.

4.1. Introduction

In [42], a neural network based on the Fischer-Burmeister (FB) function (1.4) was designed to handle $P_0$-NCPs, that is, an NCP($F$) with a $P_0$-function $F$ (see Definition 2.1.3(a)). These results were extended in [8] to the generalized Fischer-Burmeister (GFB) function (1.6), an NCP function with continuous parameter $p \in (1,\infty)$. It was shown that for the latter neural network, better numerical performance can be achieved by choosing a larger value of $p$. Moreover, these neural networks have good stability and convergence properties, as well as insensitivity to initial conditions. The FB and generalized FB functions, which have been extensively used in different solution methods, are strongly semismooth functions, which often provide efficient performance [15].

In contrast, we explore in this chapter the neural network approach using smooth NCP functions. In particular, we use NCP functions which induce a merit function $\Psi_F$ that is differentiable on the feasible region $\Omega_F$. However, we give significant attention to nondegenerate NCPs. Recall that a solution $x^*$ is said to be degenerate if $\{i \mid x_i^* = F_i(x^*) = 0\}$ is not empty. Note that if $x^*$ is degenerate and $\phi$ is differentiable at $x^*$, then $\nabla\Phi(x^*)$ is singular. Consequently, one should not expect locally fast convergence of numerical methods based on smooth NCP functions if the computed solution is degenerate [15, 36]. Because of the differentiability of $\phi^p_{\rm NR}$, $\phi^p_{\rm S-NR}$ and $\psi^p_{\rm S-NR}$ on the feasible region of the NCP, it is also expected that the convergence of the trajectories of the neural network (3.9) to a degenerate solution could be slow. Hence, in this chapter, we will give particular attention to nondegenerate NCPs.

Finally, the NCP functions we consider herein have piecewise-defined formulas, in contrast with the FB and generalized FB functions which have simple formulations.

(47) In turn, the subsequent analysis is more complicated. Nevertheless, we show that the proposed neural networks may offer promising results too. The analysis and numerical reports in this chapter pave the way for the use of piecewise-defined NCP functions.

4.2. Motivation and Contributions

One of the contributions we present in this chapter lies in establishing the theoretical properties of the generalized natural residual functions and their symmetrizations. These are fundamental in designing NCP-based solution methods, and in this manuscript, we use the neural network approach. Basic properties of these functions are already presented in [30]. The purpose of this chapter is to elaborate on further properties and applications of the newly discovered discrete-type classes of NCP functions given by (1.7), (1.8) and (1.9), which we recall here for ease of reference. The generalized natural residual function is given by
\[ \phi^p_{\rm NR}(a,b) = a^p - (a-b)_+^p, \tag{4.1} \]
and its two symmetrizations are
\[ \phi^p_{\rm S-NR}(a,b) = \begin{cases} a^p - (a-b)^p & \text{if } a > b, \\ a^p = b^p & \text{if } a = b, \\ b^p - (b-a)^p & \text{if } a < b, \end{cases} \tag{4.2} \]
and
\[ \psi^p_{\rm S-NR}(a,b) = \begin{cases} a^p b^p - (a-b)^p b^p & \text{if } a > b, \\ a^p b^p = a^{2p} & \text{if } a = b, \\ a^p b^p - (b-a)^p a^p & \text{if } a < b, \end{cases} \tag{4.3} \]
where $p > 1$ is an odd integer. Specifically, we look at the properties of their induced merit functions $\Psi_F$ given by (3.8). First, it is important for us to determine the correspondence between the solutions of NCP($F$) and the stationary points of $\Psi_F$. From the above discussion (also see Proposition 3.2.1(a)), we already know that an NCP solution is a stationary point. On the other hand, we also want to determine which stationary points of $\Psi_F$ are solutions to the NCP. For certain NCP functions such as the Mangasarian-Solodov function [35], the FB function [21] and the generalized FB function [10], a stationary point of the merit function was shown to be a solution to the NCP when $F$ is monotone

(48) or a P0 -function. It should be pointed out that these NCP functions possess the following nice properties mentioned in Theorem 3.2.2: (P1) ∇a ψ(a, b) · ∇b ψ(a, b) ≥ 0 for all (a, b) ∈ IR2 ; and (P2) For all (a, b) ∈ IR2 , ∇a ψ(a, b) = 0 ⇐⇒ ∇b ψ(a, b) = 0 ⇐⇒ φ(a, b) = 0. p However, these properties are not possessed by φpNR , φpS−NR and ψS−NR , which leads to. some difficulties in the subsequent analysis. Hence, we seek for other conditions which will guarantee that a stationary point is an NCP solution. Furthermore, we also want to look at the growth behavior of the functions (4.1), (4.2) and (4.3). This will play a key role in characterizing the level sets of the induced merit functions (see Theorem 3.2.3). p It must be noted that since the NCP functions φpS−NR and ψS−NR are piecewise-defined. functions, then the analyses of their growth behavior and the properties of their induced merit functions are more difficult, as compared with the commonly used FB functions (1.4) and (1.6) which have simple formulations. Another goal we wish to accomplish in this chapter is to discuss the stability properties p . We further look into different of the neural networks based on φpNR , φpS−NR and ψS−NR. examples to see the influence of p on the convergence of trajectories of the neural network to the NCP solution. Finally, we compare the numerical performance of these three types of neural networks with two well-studied neural networks based on the FB function [42] and generalized FB function [8].. 4.3. Properties of Induced Merit Functions. Before we look into the properties of the neural network, we first discuss some properties of the induced merit functions ΨF (x) =. 1 kΦF (x)k2 , 2. where ΦF is given by. p (3.7) with φ ∈ {φpNR , φpS−NR , ψS−NR }. The function ΦF corresponding to φpNR , φpS−NR and. p ψS−NR is denoted, respectively, by ΦpNR , ΦpS1−NR and ΦpS2−NR , where it is understood that. we consider a particular function F , thus we omit the subscript F. 
Their corresponding merit functions will be denoted by ΨpNR , ΨpS1−NR and ΨpS2−NR , respectively. 28.
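As a rough numerical illustration of (4.1), (4.2), (4.3) and the induced merit function, the following Python sketch evaluates the three NCP functions and $\Psi_F$ on a small example. It assumes that $\Phi_F$ in (3.7) applies $\phi$ componentwise to the pairs $(x_i, F_i(x))$. The affine map `F_example`, the point `x_star`, and all function names are hypothetical choices made for this illustration only and do not come from the thesis.

```python
import numpy as np

def phi_nr(a, b, p=3):
    """Generalized natural residual function (4.1): a^p - (a - b)_+^p."""
    return a**p - np.maximum(a - b, 0.0)**p

def phi_s_nr(a, b, p=3):
    """First symmetrization (4.2), defined piecewise in a and b."""
    if a > b:
        return a**p - (a - b)**p
    if a < b:
        return b**p - (b - a)**p
    return a**p  # case a == b

def psi_s_nr(a, b, p=3):
    """Second symmetrization (4.3), defined piecewise in a and b."""
    if a > b:
        return a**p * b**p - (a - b)**p * b**p
    if a < b:
        return a**p * b**p - (b - a)**p * a**p
    return a**(2 * p)  # case a == b

def merit(x, F, phi, p=3):
    """Psi_F(x) = 0.5 * ||Phi_F(x)||^2, with phi applied componentwise
    to (x_i, F_i(x)); this componentwise form is an assumption about (3.7)."""
    Fx = F(x)
    Phi = np.array([phi(xi, fi, p) for xi, fi in zip(x, Fx)])
    return 0.5 * float(Phi @ Phi)

# Hypothetical affine test map F(x) = M x + q (not from the thesis).
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-2.0, 1.0])

def F_example(x):
    return M @ x + q

# x_star = (1, 0) gives F(x_star) = (0, 2), so x >= 0, F(x) >= 0 and
# x_i * F_i(x) = 0 for each i, i.e. x_star solves this NCP.
x_star = np.array([1.0, 0.0])

for phi in (phi_nr, phi_s_nr, psi_s_nr):
    print(phi.__name__,
          "merit at x_star:", merit(x_star, F_example, phi),
          "merit at origin:", merit(np.zeros(2), F_example, phi))
```

In this sketch the merit function vanishes at the NCP solution and is strictly positive at the origin, which is not a solution since $F_1(0) < 0$. The piecewise branches are written with plain comparisons for readability; a vectorized version would be preferable for large $n$.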

To establish some properties of these merit functions, we recall the following important lemmas.

Lemma 4.3.1. Let $p > 1$ be an odd integer. Then, the following hold.

(a) The function $\phi^p_{\rm NR}$ is twice continuously differentiable. Its gradient is given by

\[
\nabla \phi^p_{\rm NR}(a,b) = p
\begin{bmatrix}
a^{p-1} - (a-b)^{p-2}(a-b)_+ \\
(a-b)^{p-2}(a-b)_+
\end{bmatrix}.
\]

(b) The function $\phi^p_{\rm S-NR}$ is twice continuously differentiable on the set $\Omega := \{(a,b) \mid a \neq b\}$. Its gradient is given by

\[
\nabla \phi^p_{\rm S-NR}(a,b) =
\begin{cases}
p\,[\, a^{p-1} - (a-b)^{p-1},\ (a-b)^{p-1} \,]^T & \text{if } a > b, \\
p\,[\, (b-a)^{p-1},\ b^{p-1} - (b-a)^{p-1} \,]^T & \text{if } a < b.
\end{cases}
\]

Further, $\phi^p_{\rm S-NR}$ is differentiable at $(0,0)$ with $\nabla \phi^p_{\rm S-NR}(0,0) = [0, 0]^T$.

(c) The function $\psi^p_{\rm S-NR}$ is twice continuously differentiable. Its gradient is given by

\[
\nabla \psi^p_{\rm S-NR}(a,b) =
\begin{cases}
p \begin{bmatrix} a^{p-1} b^p - (a-b)^{p-1} b^p \\ a^p b^{p-1} - (a-b)^p b^{p-1} + (a-b)^{p-1} b^p \end{bmatrix} & \text{if } a > b, \\[2ex]
p\,[\, a^{p-1} b^p,\ a^p b^{p-1} \,]^T = p\,a^{2p-1} [\,1,\ 1\,]^T & \text{if } a = b, \\[2ex]
p \begin{bmatrix} a^{p-1} b^p - (b-a)^p a^{p-1} + (b-a)^{p-1} a^p \\ a^p b^{p-1} - (b-a)^{p-1} a^p \end{bmatrix} & \text{if } a < b.
\end{cases}
\]

Proof. Please see [9, Proposition 2.2], [5, Propositions 2.2 and 3.2], and [30, Proposition 4.3]. $\Box$

We also need the following result. Observe from Lemma 4.3.2(b) and (c) that indeed, properties (P1) and (P2) (see Section 4.2) do not hold for the three discrete-type classes of NCP functions considered.

Lemma 4.3.2. Let $p > 1$ be a positive odd integer. Then, the following hold.

(a) If $\phi \in \{\phi^p_{\rm NR}, \phi^p_{\rm S-NR}\}$, then $\phi(a,b) > 0 \iff a > 0,\ b > 0$, while $\psi^p_{\rm S-NR}(a,b) \geq 0$ on $\mathbb{R}^2$.

(b) We have

\[
\nabla_a \phi^p_{\rm NR}(a,b) \cdot \nabla_b \phi^p_{\rm NR}(a,b)
\begin{cases}
> 0 & \text{on } \{(a,b) \mid a > b > 0 \text{ or } a > b > 2a\}, \\
= 0 & \text{on } \{(a,b) \mid a \leq b \text{ or } a > b = 2a \text{ or } a > b = 0\}, \\
< 0 & \text{otherwise},
\end{cases}
\]

$\nabla_a \phi^p_{\rm S-NR}(a,b) \cdot \nabla_b \phi^p_{\rm S-NR}(a,b) > 0$ on $\{(a,b) \mid a > b > 0\} \cup \{(a,b) \mid b > a > 0\}$, and $\nabla_a \psi^p_{\rm S-NR}(a,b) \cdot \nabla_b \psi^p_{\rm S-NR}(a,b) > 0$ on the first quadrant $\mathbb{R}^2_{++}$.

(c) If $\phi \in \{\phi^p_{\rm NR}, \phi^p_{\rm S-NR}\}$, then $\nabla_a \phi(a,b) \cdot \nabla_b \phi(a,b) = 0$ provided that $\phi(a,b) = 0$. On the other hand, $\psi^p_{\rm S-NR}(a,b) = 0 \iff \nabla \psi^p_{\rm S-NR}(a,b) = 0$. In particular, we have $\nabla_a \psi^p_{\rm S-NR}(a,b) \cdot \nabla_b \psi^p_{\rm S-NR}(a,b) = 0$ provided that $\psi^p_{\rm S-NR}(a,b) = 0$.

Proof. Please see [30, Propositions 3.4, 4.5, and 5.4]. $\Box$

Our next goal is to determine conditions under which stationary points of $\Psi_F$ are also global minimizers. When an NCP function has properties (P1) and (P2), an equilibrium point is a global minimizer when $F$ is a $P_0$-function. However, properties (P1) and (P2) only hold on a proper subset of $\mathbb{R}^2$ for the functions $\phi^p_{\rm NR}$, $\phi^p_{\rm S-NR}$ and $\psi^p_{\rm S-NR}$. Thus, we seek other conditions to achieve the goal. We start with the merit function $\Psi^p_{\rm NR}$.

Proposition 4.3.3. If $F$ is strongly monotone with modulus $\mu > 1$, then every stationary point of $\Psi^p_{\rm NR}$ is a global minimizer.

Proof. Let $x^*$ be a stationary point of $\Psi^p_{\rm NR}$, that is, $\nabla \Psi^p_{\rm NR}(x^*) = 0$. For convenience, we denote by $A(x^*)$ and $B(x^*)$ the diagonal matrices such that for each $i = 1, \ldots, n$,

\[
A_{ii}(x^*) = (x_i^*)^{p-1} \quad \text{and} \quad B_{ii}(x^*) = (x_i^* - F_i(x^*))^{p-2} (x_i^* - F_i(x^*))_+.
\]

Then, by formula (3.10) and Lemma 4.3.1(a), we have

\[
p\,[A(x^*) - B(x^*)]\,\Phi^p_{\rm NR}(x^*) + p\,\nabla F(x^*) B(x^*) \Phi^p_{\rm NR}(x^*) = 0, \tag{4.4}
\]

which yields

\[
A(x^*) \Phi^p_{\rm NR}(x^*) + (\nabla F(x^*) - I) B(x^*) \Phi^p_{\rm NR}(x^*) = 0. \tag{4.5}
\]

Analogous to the technique in [21], pre-multiplying both sides of (4.5) by $(B(x^*) \Phi^p_{\rm NR}(x^*))^T$ leads to

\[
\Phi^p_{\rm NR}(x^*)^T [B(x^*) A(x^*)] \Phi^p_{\rm NR}(x^*) + (B(x^*) \Phi^p_{\rm NR}(x^*))^T (\nabla F(x^*) - I) B(x^*) \Phi^p_{\rm NR}(x^*) = 0. \tag{4.6}
\]

Since $p$ is an odd integer, we have $A(x^*) \geq 0$ and $B(x^*) \geq 0$; and hence,

\[
\Phi^p_{\rm NR}(x^*)^T [B(x^*) A(x^*)] \Phi^p_{\rm NR}(x^*) \geq 0.
\]