## Geometric views of the generalized Fischer–Burmeister function and its induced merit function

### Huai-Yin Tsai, Jein-Shan Chen ^{⇑,1}

Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

Article info

Keywords: Curvature; Surface; Level curve; NCP-function; Merit function

Abstract

In this paper, we study geometric properties of the surfaces of the generalized Fischer–Burmeister function and its induced merit function. Then, a visualization is proposed to explain how the convergent behavior is influenced by two descent directions in the merit function approach. Based on the geometric properties and the visualization, we obtain more intuitive ideas about how the convergent behavior is affected by changing the parameter. Furthermore, the geometric view indicates how to improve the algorithm by setting a proper value of the parameter in the merit function approach.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

The nonlinear complementarity problem (NCP) is to find a point $x \in \mathbb{R}^n$ such that

$$x \ge 0, \quad F(x) \ge 0, \quad \langle x, F(x) \rangle = 0, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product and $F = (F_1, \ldots, F_n)^T$ is a map from $\mathbb{R}^n$ to $\mathbb{R}^n$. We assume that $F$ is continuously differentiable throughout this paper. The NCP has attracted much attention because of its wide applications in the fields of economics, engineering, and operations research [8,11,16], to name a few.

Many methods have been proposed to solve the NCP; see [1,14,16,20,22,25] and the references therein. One of the most powerful and popular approaches is to reformulate the NCP as a system of nonlinear equations [21,23,28] or as an unconstrained minimization problem [9,10,12,15,18,19,24,27]. The objective function of such an equivalent unconstrained minimization problem is called a merit function; its global minima coincide with the solutions of the original NCP. To construct a merit function, a class of functions called NCP-functions, defined below, plays a significant role.

A function $\phi : \mathbb{R}^2 \to \mathbb{R}$ is called an NCP-function if it satisfies

$$\phi(a,b) = 0 \iff a \ge 0, \ b \ge 0, \ ab = 0. \tag{2}$$

Equivalently, $\phi$ is an NCP-function if the set of its zeros is the two nonnegative semiaxes. An important NCP-function, which plays a central role in the development of efficient algorithms for the solution of the NCP, is the well-known Fischer–Burmeister (FB) NCP-function [12,13] defined as

$$\phi(a,b) = \sqrt{a^2 + b^2} - (a+b). \tag{3}$$

http://dx.doi.org/10.1016/j.amc.2014.03.089
0096-3003/© 2014 Elsevier Inc. All rights reserved.

⇑ Corresponding author.

E-mail addresses: tasiwhyin@gmail.com (H.-Y. Tsai), jschen@math.ntnu.edu.tw (J.-S. Chen).

1 Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is supported by Ministry of Science and Technology, Taiwan.

Contents lists available at ScienceDirect. Applied Mathematics and Computation. Journal homepage: www.elsevier.com/locate/amc

With the NCP-function, we can obtain an equivalent formulation of the NCP as a system of equations:

$$\Phi(x) = \begin{pmatrix} \phi(x_1, F_1(x)) \\ \vdots \\ \phi(x_n, F_n(x)) \end{pmatrix} = 0. \tag{4}$$

In other words, we have

$$x \text{ solves the NCP} \iff \Phi(x) = 0.$$

In view of this, we define a real-valued function $\Psi : \mathbb{R}^n \to \mathbb{R}_+$ by

$$\Psi(x) := \frac{1}{2}\|\Phi(x)\|^2 = \frac{1}{2}\sum_{i=1}^{n} \phi^2(x_i, F_i(x)). \tag{5}$$

It is known that $\Psi$ is a merit function of the NCP, i.e., the NCP is equivalent to the unconstrained minimization problem

$$\min_{x \in \mathbb{R}^n} \Psi(x). \tag{6}$$

Merit functions are frequently used in designing numerical algorithms for solving the NCP. In particular, we can apply an iterative algorithm to minimize the merit function in the hope of obtaining its global minimum.

Recently, the so-called generalized Fischer–Burmeister function was proposed in [3,4]. More specifically, the authors considered $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$\phi_p(a,b) := \|(a,b)\|_p - (a+b), \tag{7}$$

where $p > 1$ is an arbitrary fixed real number and $\|(a,b)\|_p$ denotes the $p$-norm of $(a,b)$, i.e., $\|(a,b)\|_p = \sqrt[p]{|a|^p + |b|^p}$. In other words, in the function $\phi_p$, the 2-norm of $(a,b)$ in the FB function is replaced by a more general $p$-norm. The function $\phi_p$ is still an NCP-function, which naturally induces another NCP-function $\psi_p : \mathbb{R}^2 \to \mathbb{R}_+$ given by

$$\psi_p(a,b) := \frac{1}{2}|\phi_p(a,b)|^2. \tag{8}$$

For any given $p > 1$, the function $\psi_p$ is shown to possess all favorable properties of the FB function $\psi$; see [2–4]. It plays an important part in our study throughout the paper. Like $\Phi$, the operator $\Phi_p : \mathbb{R}^n \to \mathbb{R}^n$ defined as

$$\Phi_p(x) = \begin{pmatrix} \phi_p(x_1, F_1(x)) \\ \vdots \\ \phi_p(x_n, F_n(x)) \end{pmatrix} \tag{9}$$

yields a family of merit functions $\Psi_p : \mathbb{R}^n \to \mathbb{R}_+$ for the NCP:

$$\Psi_p(x) := \frac{1}{2}\|\Phi_p(x)\|^2 = \sum_{i=1}^{n} \psi_p(x_i, F_i(x)). \tag{10}$$

Analogously, the NCP is equivalent to the unconstrained minimization problem

$$\min_{x \in \mathbb{R}^n} \Psi_p(x). \tag{11}$$
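As a reading aid, definitions (7), (8) and (10) can be written out in a few lines of Python. This is only a minimal sketch: the function names `phi_p`, `psi_p`, `merit` and the toy map `F_toy` are hypothetical choices made here and do not come from the paper.

```python
def phi_p(a, b, p=2.0):
    """Generalized Fischer-Burmeister function (7): ||(a, b)||_p - (a + b)."""
    norm_p = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    return norm_p - (a + b)


def psi_p(a, b, p=2.0):
    """Induced NCP-function (8): 0.5 * phi_p(a, b)^2."""
    return 0.5 * phi_p(a, b, p) ** 2


def merit(x, F, p=2.0):
    """Merit function (10): sum of psi_p(x_i, F_i(x)) over all components."""
    return sum(psi_p(xi, fi, p) for xi, fi in zip(x, F(x)))


def F_toy(x):
    """Hypothetical toy map (not from the paper): F_i(x) = x_i - 1."""
    return [xi - 1.0 for xi in x]


# x = (1, 1) gives F(x) = (0, 0), so it solves the toy NCP and the merit is zero.
print(merit([1.0, 1.0], F_toy, p=3.0))
# x = (0, 2) gives F(x) = (-1, 1), which violates F(x) >= 0, so the merit is positive.
print(merit([0.0, 2.0], F_toy, p=3.0))
```

The merit value vanishes exactly at solutions of the NCP, which is what makes the unconstrained problem (11) equivalent to the original problem.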

It was shown that if $F$ is monotone [15] or a $P_0$-function [10], then any stationary point of $\Psi$ is a global minimum of the unconstrained minimization $\min_{x \in \mathbb{R}^n} \Psi(x)$, and hence solves the NCP. Similar results were generalized to the $\Psi_p$ case in [4]. On the other hand, many classical iterative methods have been applied to this unconstrained minimization reformulation of the NCP.

Derivative-free methods [29] are suitable for problems where the derivatives of $F$ are not available or are expensive to compute. Some derivative-free algorithms with global convergence results were proposed to solve the NCP based on the generalized Fischer–Burmeister merit function. For example, [4,5] pointed out that the performance of the algorithm is influenced by the parameter $p$. In addition, a certain phenomenon has been observed in the derivative-free algorithm studied in [5]. More specifically, a kind of "cliff" occurs in the convergent behavior, as depicted in Fig. 1.

Over the years, we have frequently been asked what the main factor causing this is and how the parameter $p$ affects the convergent behavior. These are questions we are eager to answer. In light of our earlier numerical experience, we find that figuring out the geometric properties of $\phi_p$ and $\psi_p$ may be a key way to answer the aforementioned puzzles. With this motivation, we carry out an analysis from a geometric view in this paper. More specifically, the objective of this paper is to study the relation between the convergent behavior and the parameter $p$ from the viewpoint of geometry, in which the graphs of $\phi_p$ and $\psi_p$ are regarded as families of surfaces embedded in $\mathbb{R}^3$.

This paper is organized as follows. In Section 2, we present some geometric properties of $\phi_p$ and illustrate its surface structure by figures. In Section 3, we study properties of $\psi_p$ and summarize the comparison between $\phi_p$ and $\psi_p$. In Section 4, we investigate a geometric visualization to see the possible convergence behavior with different $p$ through a few examples. Finally, we state the conclusion.

2. Geometric view of $\phi_p$

In this section, we study some geometric properties of $\phi_p$ and interpret their meanings. We present the family of surfaces of $\phi_p(a,b)$ where $p \in (1, +\infty)$, see Figs. 2 and 3. When we fix a real number $p$ with $1 < p < +\infty$, Fig. 3 gives the intuitive picture that the surface shape is indeed influenced by the value of $p$. From the definition of the $p$-norm, we know that $\|(a,b)\|_1 := |a| + |b|$ and $\|(a,b)\|_\infty := \max\{|a|, |b|\}$. It is trivial that $\phi_p(a,b) \to \phi_1(a,b) := |a| + |b| - (a+b)$ pointwise as $p \to 1$, see Fig. 3(a) and (b). On the other hand, $\phi_p(a,b) \to \phi_\infty(a,b) := \max\{|a|, |b|\} - (a+b)$ pointwise as $p \to +\infty$, see Fig. 3(e) and (f). Note that $\phi_1$ is not an NCP-function because $\phi_1(a,b) = 0$ whenever $a > 0$ and $b > 0$, whereas $\phi_\infty$ is an NCP-function but is not differentiable when $a = b$.
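The two pointwise limits can be observed numerically. The sketch below evaluates the gap between $\phi_p$ and the two limiting functions at one arbitrarily chosen sample point; the sample point and the list of $p$ values are assumptions made for illustration only.

```python
def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def phi_1(a, b):
    """Limiting function as p -> 1 (uses the 1-norm)."""
    return abs(a) + abs(b) - (a + b)


def phi_inf(a, b):
    """Limiting function as p -> +infinity (uses the max-norm)."""
    return max(abs(a), abs(b)) - (a + b)


a, b = 3.0, -2.0  # arbitrary sample point, chosen only for illustration

# Gaps shrink as p approaches 1 from above.
gaps_to_1 = [abs(phi_p(a, b, p) - phi_1(a, b)) for p in (1.5, 1.1, 1.01, 1.001)]
# Gaps shrink as p grows.
gaps_to_inf = [abs(phi_p(a, b, p) - phi_inf(a, b)) for p in (2.0, 10.0, 50.0, 200.0)]

print(gaps_to_1)
print(gaps_to_inf)
# phi_1 vanishes in the open first quadrant, so it is not an NCP-function;
# phi_inf does not vanish there.
print(phi_1(1.0, 1.0), phi_inf(1.0, 1.0))
```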

Next, we give some lemmas which will be used in subsequent analysis.

Lemma 2.1 [6, Lemma 3.1]. If $a > 0$ and $b > 0$, then $(a+b)^p > a^p + b^p$ for all $p \in (1, +\infty)$.

Fig. 1. "Cliff" phenomenon that appears in some derivative-free algorithms.

Fig. 2. The surface of $z = \phi_2(a,b)$ with $(a,b) \in [-10,10] \times [-10,10]$ (axes: a-axis, b-axis, z-axis).

Lemma 2.2 [17, Lemma 1.3]. Let $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ and $\|x\|_p := \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$. If $1 < p_1 < p_2$, then

$$\|x\|_{p_2} \le \|x\|_{p_1} \le n^{\frac{1}{p_1} - \frac{1}{p_2}} \|x\|_{p_2}.$$

Lemma 2.3 [5, Lemma 3.2]. Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be given as in (7) where $p \in (1, +\infty)$. Then,

$$\left(2 - 2^{1/p}\right) |\min\{a,b\}| \le |\phi_p(a,b)| \le \left(2 + 2^{1/p}\right) |\min\{a,b\}|.$$
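The two-sided bound of Lemma 2.3 can be spot-checked on a grid. The following sketch is a numerical sanity check, not a proof; the grid, the list of $p$ values and the floating-point tolerance are assumptions introduced here.

```python
def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def bounds_hold(a, b, p, tol=1e-9):
    """Check (2 - 2^(1/p)) |min{a,b}| <= |phi_p(a,b)| <= (2 + 2^(1/p)) |min{a,b}|."""
    m = abs(min(a, b))
    lower = (2.0 - 2.0 ** (1.0 / p)) * m
    upper = (2.0 + 2.0 ** (1.0 / p)) * m
    value = abs(phi_p(a, b, p))
    # The bounds are attained on the line a = b, hence the small tolerance.
    return lower - tol <= value <= upper + tol


grid = [0.5 * k for k in range(-10, 11)]  # points in [-5, 5]
all_ok = all(
    bounds_hold(a, b, p)
    for p in (1.1, 1.5, 2.0, 3.0, 10.0)
    for a in grid
    for b in grid
)
print(all_ok)
```

Both bounds are tight on the diagonal: for $a = b > 0$ the lower bound is attained, and for $a = b < 0$ the upper bound is attained.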

Fig. 3. The surface of $z = \phi_p(a,b)$ with different $p$ (six panels (a)–(f); axes: a-axis, b-axis, z-axis).

Proposition 2.1. Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be given as in (7) where $p \in (1, +\infty)$. Then,

(a) $(a > 0 \text{ and } b > 0) \iff \phi_p(a,b) < 0$;

(b) $(a = 0 \text{ and } b \ge 0)$ or $(b = 0 \text{ and } a \ge 0) \iff \phi_p(a,b) = 0$;

(c) $b = 0$ and $a < 0 \implies \phi_p(a,b) = -2a > 0$;

(d) $a = 0$ and $b < 0 \implies \phi_p(a,b) = -2b > 0$.

Proof

(a) If $a > 0$ and $b > 0$, it is easy to see $\phi_p(a,b) < 0$ by Lemma 2.1. Conversely, suppose $\phi_p(a,b) < 0$. Because $\sqrt[p]{|a|^p + |b|^p} \ge |a|$ and $\sqrt[p]{|a|^p + |b|^p} \ge |b|$, we have $\sqrt[p]{|a|^p + |b|^p} \ge \max\{|a|, |b|\}$. Suppose $a \le 0$ or $b \le 0$; then we have $\max\{|a|, |b|\} \ge a + b$, which implies $\phi_p(a,b) \ge 0$. This is a contradiction.

(b) By the definition of $\phi_p(a,b)$, we know

$$\phi_p(a, 0) = |a| - a = \begin{cases} 0, & a \ge 0, \\ -2a, & a < 0, \end{cases} \qquad \phi_p(0, b) = |b| - b = \begin{cases} 0, & b \ge 0, \\ -2b, & b < 0, \end{cases}$$

which says that $(a = 0 \text{ and } b \ge 0)$ or $(b = 0 \text{ and } a \ge 0) \implies \phi_p(a,b) = 0$. Conversely, suppose $\phi_p(a,b) = 0$. If $a < 0$ or $b < 0$, mimicking the arguments of part (a) yields

$$\sqrt[p]{|a|^p + |b|^p} \ge \max\{|a|, |b|\} > a + b,$$

which implies $\phi_p(a,b) > 0$. Thus, there must hold $a \ge 0$ and $b \ge 0$. Furthermore, one of $a$ and $b$ must be $0$ by part (a).

The proofs of (c) and (d) follow directly from the proof of part (b). □

Proposition 2.1(a) shows that $\phi_p(a,b)$ is negative on the first quadrant of the $\mathbb{R}^2$-plane, see Fig. 4, while Proposition 2.1(b) shows that $\phi_p(a,b) = 0$ can only happen on the nonnegative semiaxes (i.e., $a \ge 0, b = 0$ or $a = 0, b \ge 0$). In fact, this proposition is also equivalent to saying that $\phi_p$ is an NCP-function. In addition, Proposition 2.1(b)–(d) indicate that the value of $p$ does not affect the value of $\phi_p(a,b)$ on the a-axis and b-axis.

Proposition 2.2. Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be given as in (7) where $p \in (1, +\infty)$. Then,

(a) $\phi_p(a,b) = \phi_p(b,a)$;

(b) $\phi_p$ is convex, i.e., $\phi_p(\alpha w + (1-\alpha) w') \le \alpha \phi_p(w) + (1-\alpha) \phi_p(w')$ for all $w, w' \in \mathbb{R}^2$ and $\alpha \in [0,1]$;

(c) if $1 < p_1 < p_2$, then $\phi_{p_1}(a,b) \ge \phi_{p_2}(a,b)$.

Proof. The verifications of parts (a) and (b) are straightforward, so we omit them. Part (c) follows by applying Lemma 2.2. □

Proposition 2.2(a) shows the symmetry of $\phi_p(a,b)$: any two points of the plane that are mirror images of each other with respect to the line $a = b$ have the same height. In other words, the surface $z = \phi_p(a,b)$ has the same structure on the second and fourth quadrants of the plane, see Figs. 4–6. Proposition 2.2(b) says that the shape of the surface is convex because the function $\phi_p$ is convex, while Proposition 2.2(c) implies that the value of $\phi_p$ decreases as the value of $p$ increases. In summary, the value of $p$ affects the geometric structure.
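The three parts of Proposition 2.2 lend themselves to a quick numerical spot check. The sketch below tests symmetry, midpoint convexity (a necessary consequence of convexity) and monotonicity in $p$ on small grids; the grids, the $p$ values and the tolerance are assumptions made for this illustration.

```python
def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


ps = (1.1, 1.5, 2.0, 3.0, 10.0)
fine = [0.5 * k for k in range(-8, 9)]     # points in [-4, 4]
coarse = [float(k) for k in range(-4, 5)]  # coarser grid for the 4-fold loop
tol = 1e-9

# (a) symmetry: phi_p(a, b) = phi_p(b, a)
symmetric = all(
    abs(phi_p(a, b, p) - phi_p(b, a, p)) <= tol
    for p in ps for a in fine for b in fine
)

# (b) midpoint convexity: phi_p((w + w') / 2) <= (phi_p(w) + phi_p(w')) / 2
midpoint_convex = all(
    phi_p(0.5 * (a + c), 0.5 * (b + d), p)
    <= 0.5 * phi_p(a, b, p) + 0.5 * phi_p(c, d, p) + tol
    for p in ps for a in coarse for b in coarse for c in coarse for d in coarse
)

# (c) monotonicity: p1 < p2 implies phi_{p1} >= phi_{p2}
monotone_in_p = all(
    phi_p(a, b, p1) >= phi_p(a, b, p2) - tol
    for p1, p2 in zip(ps, ps[1:]) for a in fine for b in fine
)

print(symmetric, midpoint_convex, monotone_in_p)
```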

Proposition 2.3. If $\{(a^k, b^k)\} \subseteq \mathbb{R}^2$ with $(a^k \to -\infty)$ or $(b^k \to -\infty)$ or $(a^k \to +\infty \text{ and } b^k \to +\infty)$, then $|\phi_p(a^k, b^k)| \to +\infty$ as $k \to +\infty$.

Proof. This can be found in [26, p. 20]. □

Proposition 2.3 indicates the increasing direction on the surface. This can be seen from the contour graph of $z = \phi_p(a,b)$ plotted in Fig. 4, where the darker color represents the lower height. In order to understand the structure of the surface, it is natural to investigate special curves on the surface. We consider a family of curves $\alpha_{r,p} : \mathbb{R} \to \mathbb{R}^3$ defined as follows:

$$\alpha_{r,p}(t) := \left(r + t, \ r - t, \ \phi_p(r+t, r-t)\right), \tag{12}$$

where $r \in \mathbb{R}$ and $p \in (1, +\infty)$ are two arbitrary fixed real numbers. These curves can be viewed as the intersection of the surface $z = \phi_p(a,b)$ and the plane $a + b = 2r$, see Fig. 6. We study some properties of these special curves.

Lemma 2.4. Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be given as in (7) where $p \in (1, +\infty)$. Fix any $r \in \mathbb{R}$ and define $f : \mathbb{R} \to \mathbb{R}$ as $f(t) := \phi_p(r+t, r-t)$. Then $f$ is a convex function.

Proof. We know that $\phi_p$ is a convex function by Proposition 2.2(b) and observe that $f$ is a composition of $\phi_p$ and an affine function. Thus, $f$ is convex since it is a composition of a convex function and an affine function (the composition of two convex functions is not necessarily convex; however, our case does guarantee convexity because one of them is affine). □

Fig. 4. Level curves of $z = \phi_p(a,b)$ with different $p$ (six panels; axes: a-axis, b-axis).

Theorem 2.1. Let $\phi_p : \mathbb{R}^2 \to \mathbb{R}$ be given as in (7) where $p \in (1, +\infty)$. Suppose $a$ and $b$ are constrained to the curve determined by the plane $a + b = 2r$ ($r \in \mathbb{R}$) and the surface. Then, $\phi_p(a,b)$ attains its minimum $\phi_p(r,r) = 2^{1/p}|r| - 2r$ along this curve at $(a,b) = (r,r)$.

Proof. We know that $\phi_p(a,b)$ is differentiable except at $(0,0)$; therefore we discuss two cases as follows.

Fig. 5. The surface of $z = \phi_2(a,b)$ with $(a,b) \in [0,10] \times [0,10]$ (axes: a-axis, b-axis, z-axis).

Fig. 6. The curve intersected by surface $z = \phi_p(a,b)$ and plane $a + b = 2r$.

(i) Case (1): $r = 0$. Because $a + b = 0$, $a$ and $b$ have opposite signs except when $a = b = 0$. From Proposition 2.1, we know $\phi_p(a,b) \ge 0$ in this case. Thus, $\phi_p(a,b)$ attains its minimum zero at $(a,b) = (0,0)$.

(ii) Case (2): $r \ne 0$. Fix $r$ and $p > 1$. Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be respectively defined as

$$f(t) := \phi_p(r+t, r-t), \qquad g(t) := |r+t|^p + |r-t|^p.$$

Then, we calculate that

$$f'(t) = \frac{g'(t)}{p \, (g(t))^{\frac{p-1}{p}}} \quad \text{and} \quad g'(t) = p \left[ \operatorname{sgn}(r+t)\,|r+t|^{p-1} - \operatorname{sgn}(r-t)\,|r-t|^{p-1} \right].$$

We know $g(t) > 0$ for all $t \in \mathbb{R}$. It is clear that $g'(0) = 0$, and hence $f'(0) = 0$. By Lemma 2.4, $f$ is convex on $\mathbb{R}$. In addition, it is also continuous; therefore, $t = 0$ is a critical point of $f$ which is also a global minimizer of $f$. The proof is complete since $a = b = r$ and $\phi_p(r,r) = 2^{1/p}|r| - 2r$ when $t = 0$. □
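The statement of Theorem 2.1 can be illustrated by sampling $f(t) = \phi_p(r+t, r-t)$ along the line $a + b = 2r$ and locating the smallest sample. This is a rough numerical sketch; the helper name `min_along_line`, the sampling window and the chosen values of $r$ and $p$ are assumptions made here.

```python
def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def min_along_line(r, p, half_width=5.0, steps=2001):
    """Sample f(t) = phi_p(r + t, r - t) on [-half_width, half_width]
    and return the sample point and value where f is smallest."""
    ts = [-half_width + 2.0 * half_width * k / (steps - 1) for k in range(steps)]
    values = [phi_p(r + t, r - t, p) for t in ts]
    k_min = min(range(steps), key=values.__getitem__)
    return ts[k_min], values[k_min]


for r in (-2.0, 0.0, 1.5):
    for p in (1.5, 2.0, 4.0):
        t_min, v_min = min_along_line(r, p)
        closed_form = 2.0 ** (1.0 / p) * abs(r) - 2.0 * r  # value from Theorem 2.1
        print(r, p, t_min, v_min, closed_form)
```

In each case the smallest sample sits at $t = 0$, i.e., at $(a,b) = (r,r)$, and its value agrees with $2^{1/p}|r| - 2r$ up to rounding.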

Lemma 2.4 and Theorem 2.1 show that the curve determined by the plane $a + b = 2r$ and the surface $z = \phi_p(a,b)$ is convex and attains its minimum when $a = b$, see Fig. 7. We now study the curvature of the family of curves $\alpha_{r,p}$ defined as in (12) at the point $\left(r, r, \phi_p(r,r)\right)$. Because the function $\phi_p$ is not differentiable at $(a,b) = (0,0)$ (i.e., $r = 0$), we choose two points $\left(t_0, -t_0, \phi_p(t_0, -t_0)\right)$ and $\left(-t_0, t_0, \phi_p(-t_0, t_0)\right)$ where $t_0 > 0$, and calculate the cosine of the angle between $\alpha_{0,p}(t_0)$ and $\alpha_{0,p}(-t_0)$, see Fig. 8.

Proposition 2.4. Let $\alpha_{r,p} : \mathbb{R} \to \mathbb{R}^3$ be defined as in (12), and let $\cos_p(\theta)$ be the cosine of the angle between the two vectors $\alpha_{0,p}(t_0)$ and $\alpha_{0,p}(-t_0)$ where $t_0 > 0$. Then,

(a) $\cos_p(\theta) = -\dfrac{2^{2/p} - 6}{\sqrt{\left(2^{2/p} - 2\right)^2 + 32}}$;

(b) $\cos_p(\theta) \to \frac{1}{3}$ as $p \to 1$, and $\cos_p(\theta) \to \frac{5}{\sqrt{33}}$ as $p \to +\infty$;

(c) if $1 < p_1 < p_2$, then $\cos_{p_1}(\theta) < \cos_{p_2}(\theta)$.

Proof

(a) By direct computation, we obtain

$$\cos_p(\theta) = \frac{\alpha_{0,p}(t_0) \cdot \alpha_{0,p}(-t_0)}{\|\alpha_{0,p}(t_0)\| \, \|\alpha_{0,p}(-t_0)\|} = -\frac{2^{2/p} - 6}{\sqrt{2^{2/p} + 6 + 2^{\frac{1}{p}+2}} \, \sqrt{2^{2/p} + 6 - 2^{\frac{1}{p}+2}}} = -\frac{2^{2/p} - 6}{\sqrt{\left(2^{2/p} - 2\right)^2 + 32}}.$$

(b) From part (a), let $f : (1, +\infty) \to \mathbb{R}$ be $f(p) := \cos_p(\theta)$. Then $f$ is continuous on $(1, +\infty)$. By taking the limit, we have $\cos_p(\theta) \to \frac{1}{3}$ as $p \to 1$, and $\cos_p(\theta) \to \frac{5}{\sqrt{33}}$ as $p \to +\infty$.

(c) From part (b), we know

$$f'(p) = \frac{8 \ln 2 \cdot 2^{2/p} \left(2^{2/p} + 6\right)}{p^2 \left[\left(2^{2/p} - 2\right)^2 + 32\right]^{3/2}},$$

which implies $f'(p) > 0$ for all $p > 1$. Therefore, $f$ is a strictly increasing function on $(1, +\infty)$. □

Proposition 2.5. Let $\alpha_{r,p} : \mathbb{R} \to \mathbb{R}^3$ be defined as in (12). Then the following hold.

(a) The curvature at the point $\alpha_{r,p}(0) = \left(r, r, \phi_p(r,r)\right)$ is $\kappa_p(0) = \dfrac{(p-1)\, 2^{\frac{1}{p} - 1}}{|r|}$.

(b) $\kappa_p(0) \to 0$ as $p \to 1$ and $\kappa_p(0) \to +\infty$ as $p \to +\infty$.

(c) If $1 < p_1 < p_2$, then $\kappa_{p_1}(0) < \kappa_{p_2}(0)$.

Proof

(a) Because $\alpha_{r,p}(t) = \left(r+t, r-t, \phi_p(r+t, r-t)\right)$, we know

$$\alpha'_{r,p}(0) = (1, -1, 0) \quad \text{and} \quad \alpha''_{r,p}(0) = \left(0, 0, \frac{(p-1)\, 2^{1/p}}{|r|}\right).$$

Recall the formula for curvature

$$\kappa_p(t) = \frac{|\alpha'_{r,p}(t) \wedge \alpha''_{r,p}(t)|}{|\alpha'_{r,p}(t)|^3},$$

where the wedge operator $\wedge$ denotes the outer product of two vectors. Thus, we have

$$\kappa_p(0) = \frac{|\alpha'_{r,p}(0) \wedge \alpha''_{r,p}(0)|}{|\alpha'_{r,p}(0)|^3} = \frac{(p-1)\, 2^{\frac{1}{p} - 1}}{|r|}.$$

(b) Let $f : (1, +\infty) \to \mathbb{R}$ be defined as

$$f(p) := \kappa_p(0) = \frac{(p-1)\, 2^{\frac{1}{p} - 1}}{|r|};$$

then obviously $f$ is continuous on $(1, +\infty)$. Thus, the desired result follows by taking the limit directly.

(c) From part (b), we compute that

$$f'(p) = \frac{2^{\frac{1}{p} - 1}}{|r|} \left(1 - \frac{\ln 2}{p} + \frac{\ln 2}{p^2}\right),$$

which implies $f'(p) > 0$ for all $p \in (1, +\infty)$. Then $f$ is strictly increasing on $(1, +\infty)$. □
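The curvature formula of Proposition 2.5(a) can be compared with a finite-difference estimate. The sketch below approximates the velocity and acceleration of the curve (12) at $t = 0$ by central differences and applies the cross-product curvature formula; the step size `h`, the function names and the sample values of $r$ and $p$ are assumptions made here, and the comparison is only as accurate as the finite differences allow.

```python
import math


def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def curvature_numeric(r, p, h=1e-4):
    """Finite-difference curvature of t -> (r + t, r - t, phi_p(r + t, r - t)) at t = 0."""
    f = lambda t: phi_p(r + t, r - t, p)
    d1 = (f(h) - f(-h)) / (2.0 * h)             # approximates f'(0)
    d2 = (f(h) - 2.0 * f(0.0) + f(-h)) / h**2   # approximates f''(0)
    # velocity (1, -1, d1), acceleration (0, 0, d2); their cross product is (-d2, -d2, 0)
    cross_norm = math.sqrt(2.0) * abs(d2)
    speed = math.sqrt(2.0 + d1 * d1)
    return cross_norm / speed**3


def curvature_closed_form(r, p):
    """Closed form stated in Proposition 2.5(a); requires r != 0."""
    return (p - 1.0) * 2.0 ** (1.0 / p - 1.0) / abs(r)


for r in (0.5, 1.0, -2.0):
    for p in (1.5, 2.0, 5.0):
        print(r, p, curvature_numeric(r, p), curvature_closed_form(r, p))
```

For fixed $r$, the printed values grow with $p$, in line with part (c).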

Fig. 7. The curve $f(t) = \phi_p(r+t, r-t)$.

The above two propositions show how $p$ affects the geometric structure, see Fig. 9(a) and (b). Proposition 2.5(b) says that when $p \to 1$ the curve becomes a straight line, see Fig. 9(c). Note that when $p \to +\infty$ the curve becomes sharper and sharper at the point. This curve is not differentiable at $t = 0$, see Fig. 9(d). To sum up, from all the properties presented in this section, we see that $p$ indeed affects the geometric behavior of the surface $z = \phi_p(a,b)$ both locally and globally.

3. Geometric view of $\psi_p$

In the previous section, we saw that the generalized FB function $\phi_p$ is convex and differentiable everywhere except at $(0,0)$. In contrast, the function $\psi_p(a,b)$ defined as in (8) is non-convex, but continuously differentiable everywhere. Nonetheless, $\phi_p$ and $\psi_p$ have many similar geometric properties, as will be seen later. In this section, we study properties analogous to those in Section 2 and compare $\psi_p$ with $\phi_p$ (see Figs. 10 and 11).

Proposition 3.1. Let $\psi_p : \mathbb{R}^2 \to \mathbb{R}$ be given as in (8) where $p \in (1, +\infty)$. Then,

(a) $\psi_p(a,b) \ge 0, \ \forall (a,b) \in \mathbb{R}^2$;

(b) $\psi_p(a,b) = \psi_p(b,a), \ \forall (a,b) \in \mathbb{R}^2$;

(c) $(a = 0 \text{ and } b \ge 0)$ or $(b = 0 \text{ and } a \ge 0) \iff \psi_p(a,b) = 0$;

(d) $b = 0$ and $a < 0 \implies \psi_p(a,b) = 2a^2 > 0$;

(e) $a = 0$ and $b < 0 \implies \psi_p(a,b) = 2b^2 > 0$;

(f) $\psi_p$ is continuously differentiable everywhere.

Fig. 8. Angle between vectors $\alpha_{0,p}(t_0)$ and $\alpha_{0,p}(-t_0)$ (curves shown for p = 1.1, 1.5, 2, 3, 10).

Proof. Parts (d) and (e) come from Proposition 2.1(c) and (d); please see [2–4] for the rest. □

Proposition 2.2(c) says that the value of $\phi_p$ is decreasing with respect to $p$. In contrast, $\psi_p$ does not have such a property in general. More specifically, $\psi_p$ has such a monotonicity property only on certain quadrants.

Proposition 3.2. Suppose $1 < p_1 < p_2$ and $(a,b) \in \mathbb{R}^2$. Then,

(a) if $a < 0$ or $b < 0$, then $\psi_{p_1}(a,b) \ge \psi_{p_2}(a,b)$;

(b) if $a > 0$ and $b > 0$, then $\psi_{p_1}(a,b) \le \psi_{p_2}(a,b)$.

Proof

(a) This is clear from Proposition 2.2(c).

(b) Suppose $a > 0$ and $b > 0$. From Proposition 2.1(a), we have $\phi_p(a,b) < 0$. Then Proposition 2.2(c) yields $\phi_{p_1}(a,b) \ge \phi_{p_2}(a,b)$, and hence $\phi_{p_1}^2(a,b) \le \phi_{p_2}^2(a,b)$. □

Since $\psi_p$ is not convex in general, the counterpart of Theorem 2.1 reads as follows.

Theorem 3.1. Let $\psi_p(a,b)$ be defined as in (8) with $a + b = 2r$. Then, the following hold.

(a) If $r \in \mathbb{R}^{+}$ and $a > 0, b > 0$, then $\psi_p(a,b)$ attains its maximum $\left(2^{\frac{2}{p} - 1} - 2^{\frac{1}{p} + 1} + 2\right) r^2$ when $(a,b) = (r,r)$.

(b) If $r \in \mathbb{R}^{-} \cup \{0\}$, then $\psi_p(a,b)$ attains its minimum $\left(2^{\frac{2}{p} - 1} + 2^{\frac{1}{p} + 1} + 2\right) r^2$ when $(a,b) = (r,r)$.

Fig. 9. The curvature $\kappa_p(0)$ at point $\alpha_{r,p}(0)$ (curves shown for p = 1.1, 1.5, 2, 3, 10).

Proof

(a) When $a > 0$ and $b > 0$, Proposition 2.1(a) says that $\phi_p(a,b) < 0$. Since $\phi_p^2(a,b) > 0$, by Theorem 2.1, the minimum of $\phi_p(a,b)$ becomes the maximum of $\psi_p(a,b)$.

(b) This is a consequence of Theorem 2.1. □
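Both parts of Theorem 3.1 can be checked by sampling $\psi_p$ along the line $a + b = 2r$. The sketch below is a numerical illustration under assumed sample sizes and tolerances; the helper names are hypothetical.

```python
def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def psi_p(a, b, p):
    """psi_p = 0.5 * phi_p^2."""
    return 0.5 * phi_p(a, b, p) ** 2


def value_at_diagonal(r, p):
    """Closed-form value of psi_p(r, r) as stated in Theorem 3.1."""
    sign = -1.0 if r > 0 else 1.0
    return (2.0 ** (2.0 / p - 1.0) + sign * 2.0 ** (1.0 / p + 1.0) + 2.0) * r * r


def check(r, p, samples=400, tol=1e-9):
    """Return True if psi_p(r, r) matches the closed form and is extremal on the samples."""
    diag = psi_p(r, r, p)
    if r > 0:
        # part (a): stay strictly inside the first quadrant, |t| < r; (r, r) is a maximizer
        ts = [r * (2.0 * k / samples - 1.0) for k in range(1, samples)]
        extremal = all(psi_p(r + t, r - t, p) <= diag + tol for t in ts)
    else:
        # part (b): sample the line on [-5, 5]; (r, r) is a minimizer
        ts = [5.0 * (2.0 * k / samples - 1.0) for k in range(samples + 1)]
        extremal = all(psi_p(r + t, r - t, p) >= diag - tol for t in ts)
    return extremal and abs(diag - value_at_diagonal(r, p)) <= tol


results = {(r, p): check(r, p) for r in (2.0, 0.0, -1.5) for p in (1.5, 2.0, 4.0)}
print(results)
```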

The aforementioned results show that $\psi_p$ shares many properties with $\phi_p$, see Figs. 11 and 12, where we denote $\psi_1(a,b) := \frac{1}{2}|\phi_1(a,b)|^2$ and $\psi_\infty(a,b) := \frac{1}{2}|\phi_\infty(a,b)|^2$. However, there are still some differences between $\phi_p$ and $\psi_p$. For example, $\psi_p$ is not convex whereas $\phi_p$ is. Fig. 13 depicts the increasing direction of $\psi_p$. Note that $\psi_p(a,b)$ is nonnegative and has different properties when $a > 0$ and $b > 0$, see Fig. 11.

In order to further understand the geometric properties, we define a family of curves as follows:

$$\beta_{r,p}(t) := \left(r + t, \ r - t, \ \psi_p(r+t, r-t)\right), \tag{13}$$

where $r$ is a fixed real number and $t \in \mathbb{R}$. This family of curves can be regarded as the intersection of the plane $a + b = 2r$ and the surface $z = \psi_p(a,b)$, see Fig. 14.

Proposition 3.3. Let $\beta_{r,p} : \mathbb{R} \to \mathbb{R}^3$ be defined as in (13). Then the following hold.

(a) The curvature at the point $\beta_{r,p}(0) = \left(r, r, \psi_p(r,r)\right)$ is $\kappa_p(0) = (p-1)\, 2^{\frac{1}{p}} \left(1 - 2^{\frac{1}{p} - 1}\right)$.

(b) $\kappa_p(0) \to 0$ as $p \to 1$ and $\kappa_p(0) \to +\infty$ as $p \to +\infty$.

(c) If $1 < p_1 < p_2$, then $\kappa_{p_1}(0) < \kappa_{p_2}(0)$.

Proof

(a) From $\beta_{r,p}(t) = \left(r+t, r-t, \psi_p(r+t, r-t)\right)$, we know

$$\beta'_{r,p}(0) = (1, -1, 0) \quad \text{and} \quad \beta''_{r,p}(0) = \left(0, 0, (p-1)\, 2^{\frac{2}{p}} - \operatorname{sgn}(r)\,(p-1)\, 2^{\frac{1}{p} + 1}\right),$$

which yields

$$\kappa_p(0) = \frac{|\beta'_{r,p}(0) \wedge \beta''_{r,p}(0)|}{|\beta'_{r,p}(0)|^3} = (p-1)\, 2^{\frac{1}{p}} \left(1 - 2^{\frac{1}{p} - 1}\right).$$

(b) Let $f : (1, +\infty) \to \mathbb{R}$ be defined as $f(p) := \kappa_p(0) = (p-1)\, 2^{\frac{1}{p}} \left(1 - 2^{\frac{1}{p} - 1}\right)$. Then the result follows by taking the limit directly.

(c) From part (b), it can be verified that $f'(p) > 0$ for all $p \in (1, +\infty)$. Thus, $f$ is strictly increasing on $(1, +\infty)$. □
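The curvature value in Proposition 3.3(a) can be compared with a finite-difference estimate in the same way as for $\phi_p$. The sketch below does this for a few positive values of $r$ only, which is the case in which the sign term $\operatorname{sgn}(r)$ in the proof equals $+1$; the step size, the helper names and the sample values are assumptions made here.

```python
import math


def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def psi_p(a, b, p):
    """psi_p = 0.5 * phi_p^2."""
    return 0.5 * phi_p(a, b, p) ** 2


def curvature_numeric(r, p, h=1e-4):
    """Finite-difference curvature of t -> (r + t, r - t, psi_p(r + t, r - t)) at t = 0."""
    g = lambda t: psi_p(r + t, r - t, p)
    d1 = (g(h) - g(-h)) / (2.0 * h)             # approximates the first derivative
    d2 = (g(h) - 2.0 * g(0.0) + g(-h)) / h**2   # approximates the second derivative
    # velocity (1, -1, d1), acceleration (0, 0, d2); cross product has norm sqrt(2) * |d2|
    return math.sqrt(2.0) * abs(d2) / math.sqrt(2.0 + d1 * d1) ** 3


def curvature_closed_form(p):
    """Value stated in Proposition 3.3(a); it does not depend on r."""
    return (p - 1.0) * 2.0 ** (1.0 / p) * (1.0 - 2.0 ** (1.0 / p - 1.0))


for r in (0.5, 1.0, 3.0):  # positive r only in this sketch
    for p in (1.5, 2.0, 5.0):
        print(r, p, curvature_numeric(r, p), curvature_closed_form(p))
```

Unlike the curvature of the $\phi_p$ curve in Proposition 2.5, the value here is the same for every sampled $r > 0$.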

Fig. 14 depicts the change of the curve for different values of $p$, from which we can see the change of curvature when $p$ is close to one or infinity. We state an addendum to part (a) here: the curvature at the two other special points $\beta_{r,p}(-r) = (0, 2r, 0)$ and $\beta_{r,p}(r) = (2r, 0, 0)$ is the same, namely, $\kappa_p(-r) = \kappa_p(r) = \frac{1}{2}$. Note that although $\psi_p$ is differentiable everywhere, the mean curvature at $(0,0)$ does not exist. To end this section, we summarize the similarities and differences between $\phi_p$ and $\psi_p$ as below.

Fig. 10. The surface of $z = \psi_2(a,b)$ with $(a,b) \in [-10,10] \times [-10,10]$ (axes: a-axis, b-axis, z-axis).

| | $\phi_p(a,b)$ | $\psi_p(a,b)$ |
|---|---|---|
| Difference | Convex | Nonconvex |
| | Differentiable everywhere except $(0,0)$ | Differentiable everywhere |
| | $\phi_p(a,b) < 0$ when $a > 0$ and $b > 0$ | $\psi_p(a,b) \ge 0, \ \forall (a,b) \in \mathbb{R}^2$ |

Similarity (both functions):

(1) NCP-function

(2) Symmetry (i.e., $\phi_p(a,b) = \phi_p(b,a)$ and $\psi_p(a,b) = \psi_p(b,a)$)

(3) The function is not affected by $p$ on the axes

(4) When $(a^k \to -\infty)$ or $(b^k \to -\infty)$ or $(a^k, b^k \to +\infty)$, we have $|\phi_p(a^k, b^k)| \to \infty$ and $|\psi_p(a^k, b^k)| \to \infty$

(5) Non-coercive

4. Geometric analysis of merit function in descent algorithms

In this section, we employ the derivative-free descent algorithms presented in [4,5] to solve the unconstrained minimization problem (11) by using the merit function (10). We then compare the two algorithms and study their convergent behavior by means of an intuitive visualization. We first list these two algorithms below.

Algorithm 4.1 [4, Algorithm 4.1].

(Step 0) Given a real number $p > 1$ and a starting point $x^0 \in \mathbb{R}^n$. Choose the parameters $\sigma \in (0,1)$, $\beta \in (0,1)$ and $\varepsilon \ge 0$. Set $k := 0$.

(Step 1) If $\Psi_p(x^k) \le \varepsilon$, then stop.

(Step 2) Let $m_k$ be the smallest nonnegative integer $m$ satisfying

$$\Psi_p(x^k + \beta^m d^k) \le (1 - \sigma \beta^{2m}) \Psi_p(x^k),$$

where

$$d^k := -\nabla_b \psi_p(x^k, F(x^k))$$

and

$$\nabla_b \psi_p(x, F(x)) := \left(\nabla_b \psi_p(x_1, F_1(x)), \ldots, \nabla_b \psi_p(x_n, F_n(x))\right)^T.$$

(Step 3) Set $x^{k+1} := x^k + \beta^{m_k} d^k$, $k := k + 1$ and go to Step 1.
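The steps of Algorithm 4.1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the parameter values ($\sigma = \beta = 0.5$), the iteration caps `max_iter` and `max_m`, the starting point, and the closed-form partial derivative in `grad_b_psi_p` (obtained here by differentiating (7) and (8) with respect to $b$, with the value $0$ assigned at the origin) are all choices made for this example. The test map is the one-dimensional $F(x) = (x-3)^3 + 1$ used in Example 4.1.

```python
import math


def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def grad_b_psi_p(a, b, p):
    """Partial derivative of psi_p = 0.5 * phi_p^2 with respect to b (assumed form)."""
    if a == 0.0 and b == 0.0:
        return 0.0
    norm_p = (abs(a) ** p + abs(b) ** p) ** (1.0 / p)
    dphi_db = math.copysign(abs(b) ** (p - 1.0), b) / norm_p ** (p - 1.0) - 1.0
    return phi_p(a, b, p) * dphi_db


def merit(x, F, p):
    """Psi_p(x) = sum_i 0.5 * phi_p(x_i, F_i(x))^2."""
    return sum(0.5 * phi_p(xi, fi, p) ** 2 for xi, fi in zip(x, F(x)))


def descent(F, x0, p=2.0, sigma=0.5, beta=0.5, eps=1e-10, max_iter=200, max_m=60):
    """Derivative-free descent in the spirit of Algorithm 4.1 (no Jacobian of F is used)."""
    x = list(x0)
    history = [merit(x, F, p)]
    for _ in range(max_iter):
        if history[-1] <= eps:                      # Step 1: stopping test
            break
        d = [-grad_b_psi_p(xi, fi, p) for xi, fi in zip(x, F(x))]
        for m in range(max_m):                      # Step 2: smallest acceptable m
            step = beta ** m
            trial = [xi + step * di for xi, di in zip(x, d)]
            value = merit(trial, F, p)
            if value <= (1.0 - sigma * step ** 2) * history[-1]:
                break
        else:
            break                                   # no acceptable step within max_m trials
        x = trial                                   # Step 3: accept the step
        history.append(value)
    return x, history


def F_example(x):
    """One-dimensional map F(x) = (x - 3)^3 + 1 from Example 4.1."""
    return [(x[0] - 3.0) ** 3 + 1.0]


x_final, history = descent(F_example, [3.5], p=2.0)
print(x_final, history[0], history[-1], len(history))
```

By construction of the line search, the recorded merit values never increase from one accepted iterate to the next; how fast they decrease depends on $p$ and on the starting point, which is the behavior studied in this section.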

Algorithm 4.2 [5, Algorithm 4.1].

(Step 0) Given real numbers $p > 1$ and $\alpha \ge 0$ and a starting point $x^0 \in \mathbb{R}^n$. Choose the parameters $\sigma \in (0,1)$, $\beta \in (0,1)$, $\gamma \in (0,1)$ and $\varepsilon \ge 0$. Set $k := 0$.

(Step 1) If $\Psi_{\alpha,p}(x^k) \le \varepsilon$, then stop.

(Step 2) Let $m_k$ be the smallest nonnegative integer $m$ satisfying

$$\Psi_{\alpha,p}(x^k + \beta^m d^k(\gamma^m)) \le (1 - \sigma \beta^{2m}) \Psi_{\alpha,p}(x^k),$$

where

$$d^k(\gamma^m) := -\nabla_b \psi_{\alpha,p}(x^k, F(x^k)) - \gamma^m \nabla_a \psi_{\alpha,p}(x^k, F(x^k))$$

and

$$\nabla_a \psi_{\alpha,p}(x, F(x)) := \left(\nabla_a \psi_{\alpha,p}(x_1, F_1(x)), \ldots, \nabla_a \psi_{\alpha,p}(x_n, F_n(x))\right)^T,$$

$$\nabla_b \psi_{\alpha,p}(x, F(x)) := \left(\nabla_b \psi_{\alpha,p}(x_1, F_1(x)), \ldots, \nabla_b \psi_{\alpha,p}(x_n, F_n(x))\right)^T.$$

(Step 3) Set $x^{k+1} := x^k + \beta^{m_k} d^k(\gamma^{m_k})$, $k := k + 1$ and go to Step 1.

Fig. 11. The surface of $z = \psi_2(a,b)$ with $(a,b) \in [0,10] \times [0,10]$ (axes: a-axis, b-axis, z-axis).

Fig. 12. The surface of $z = \phi_p(a,b)$ with different $p$ (six panels; axes: a-axis, b-axis, z-axis).

Fig. 13. Level curves of $z = \psi_p(a,b)$ with different $p$ (six panels; axes: a-axis, b-axis).

In Algorithm 4.2, $\psi_{\alpha,p} : \mathbb{R}^2 \to \mathbb{R}_+$ is an NCP-function defined by

$$\psi_{\alpha,p}(a,b) := \frac{\alpha}{2}\left(\max\{0, ab\}\right)^2 + \psi_p(a,b) = \frac{\alpha}{2}(ab)_+^2 + \frac{1}{2}\left(\|(a,b)\|_p - (a+b)\right)^2$$

with $\alpha \ge 0$ being a real parameter. When $\alpha = 0$, the function $\psi_{\alpha,p}$ reduces to $\psi_p$. To compare these two algorithms, we take $\alpha = 0$ when we use Algorithm 4.2 in this section. Note that the descent direction in Algorithm 4.1 lacks a certain symmetry, whereas Algorithm 4.2 adopts a symmetric search direction. Under the assumption of monotonicity, i.e.,

$$\langle x - y, F(x) - F(y) \rangle \ge 0 \quad \text{for all } x, y \in \mathbb{R}^n,$$

the error bound is proposed and Algorithm 4.2 is shown to have a locally R-linear convergence rate in [5]. In other words, there exists a positive constant $\kappa_2$ such that

$$\|x^k - x^*\| \le \kappa_2 \max\left\{\Psi_{\alpha,p}(x^k), \sqrt{\Psi_{\alpha,p}(x^k)}\right\}^{\frac{1}{2}} \quad \text{when } \alpha = 0.$$

Furthermore, the convergence rate of Algorithm 4.2 has a close relation with the constant

$$\log_\beta\left(\frac{\gamma}{L_1 + \sigma}\, C(B, \alpha, p)\right), \quad \text{where } C(B, \alpha, p) = \frac{\left(2 - 2^{\frac{1}{p}}\right)^4}{\left(\alpha B^2 + 2 + 2^{\frac{1}{p}}\right)^2}.$$

Therefore, when the value of $p$ decreases, the convergence rate of Algorithm 4.2 becomes worse and worse, see Remark 4.1 in [5].
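The dependence of the factor $C(B, \alpha, p)$ on $p$ can be tabulated directly. The sketch below assumes the expression $C(B,\alpha,p) = (2 - 2^{1/p})^4 / (\alpha B^2 + 2 + 2^{1/p})^2$ as it is read in this section, and uses a hypothetical value for the bound $B$, since the text does not specify one; $\alpha = 0$ matches the setting used here.

```python
def C(B, alpha, p):
    """Factor C(B, alpha, p) as read in this section (assumed form)."""
    numerator = (2.0 - 2.0 ** (1.0 / p)) ** 4
    denominator = (alpha * B ** 2 + 2.0 + 2.0 ** (1.0 / p)) ** 2
    return numerator / denominator


B = 10.0      # hypothetical bound, chosen only for illustration
alpha = 0.0   # the setting used in this section

values = [(p, C(B, alpha, p)) for p in (1.1, 1.5, 2.0, 5.0, 20.0)]
for p, c in values:
    print(p, c)
```

The factor shrinks towards zero as $p$ approaches 1, which is consistent with the remark that smaller $p$ gives a worse rate.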

Recall that the merit function $\Psi_p(x)$ is the sum of $n$ nonnegative functions $\psi_p$, i.e.,

$$\Psi_p(x) = \sum_{i=1}^{n} \psi_p(x_i, F_i(x)).$$

This encourages us to view each component $\psi_p(x_i^k, F_i(x^k))$, $i = 1, 2, \ldots, n$, as a motion with a different velocity on the same surface $z = \psi_p(a,b)$ at each iteration. Building on our study in Sections 2 and 3, we propose a visualization that helps us understand the convergent behavior in detail. Fig. 20 depicts the visualization for the four-dimensional NCP in Example 4.3. The merit function of this NCP is $\Psi_p(x) = \sum_{i=1}^{4} \psi_p(x_i, F_i(x))$. We plot the point sequences $\{(x_i^k, F_i(x^k))\}$ for $i = 1, 2, 3, 4$ in different colors together with the level curves of the surface $\psi_{1.1}(a,b)$ in Fig. 20(a). The vertical line represents the value of $x_i$, the horizontal line represents the value of $F_i(x)$, and the skew line means $x_i = F_i(x)$. We take the initial point $x^0 = (0, 0, 0, 0)$, which implies $F(x^0) = (-6, -2, -1, -3)$, and observe the convergent behavior separately for each $i$ from the initial point to the solution $x^* = (\sqrt{6}/2, 0, 0, 1/2)$, which is on the horizontal line in this figure. Furthermore, we observe the position of the point sequence on the surface in Fig. 20(a) and the merit function, which is the sum of their heights at each iteration, as shown in Fig. 20(b).

In a one-dimensional NCP, $F$ is continuously differentiable and there is only one variable $x$ in $F$, so $(x, F(x))$ is a continuous curve in $\mathbb{R}^2$ and the merit function $\Psi_p(x) = \psi_p(x, F(x))$ is obviously a curve on the surface $z = \psi_p(a,b)$, see Fig. 16(a) and (b). Therefore, the point sequence in a one-dimensional problem can only lie on the curve $\left(x, F(x), \psi_p(x, F(x))\right)$.

Fig. 14. The curve intersected by surface $z = \psi_p(a,b)$ and plane $a + b = 2r$ (axes: x-axis, y-axis, z-axis).

Example 4.1. Consider the NCP, where $F : \mathbb{R} \to \mathbb{R}$ is given by

$$F(x) = (x - 3)^3 + 1.$$

Fig. 15. The curvature $\kappa_p(0)$ at point $\beta_{r,p}(0)$.

The unique solution of this NCP is $x^* = 2$. Note that $F$ is strictly monotone; see the geometric view of this NCP in Fig. 16.
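Both claims about Example 4.1 are easy to confirm numerically. The following sketch checks that $x^* = 2$ satisfies the complementarity conditions and that $F$ is increasing on a sample grid; the grid and the choice $p = 2$ are assumptions made for this illustration.

```python
def F(x):
    """Map of Example 4.1: F(x) = (x - 3)^3 + 1."""
    return (x - 3.0) ** 3 + 1.0


def phi_p(a, b, p):
    """Generalized FB function with the p-norm, p > 1."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


x_star = 2.0
# x* >= 0, F(x*) = 0, hence x* F(x*) = 0 and phi_p(x*, F(x*)) = 0.
print(F(x_star), phi_p(x_star, F(x_star), 2.0))

# F is strictly increasing on the sampled grid [0, 5].
xs = [0.1 * k for k in range(0, 51)]
increasing = all(F(u) < F(v) for u, v in zip(xs, xs[1:]))
print(increasing)
```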

The value of the merit function at each iteration is plotted in Fig. 16(c), which presents the different behavior of the functions with different values of $p$ near the solution. Fig. 17(a)–(d) depict the convergent behavior of Algorithm 4.1 from two directions with two different initial points, and Fig. 17(e) and (f) show the convergent behavior with different $p$. Fig. 19(a)–(d) depict the convergent behavior of Algorithm 4.2 from two directions with two different initial points. We found that Algorithm 4.2 always produces a point sequence in or close to the boundary of the feasible set, i.e., $\{(x, F(x)) : x \ge 0 \text{ and } F(x) \ge 0\}$. Based on Proposition 3.2, the speed of decrease of the merit function with different initial points in Algorithm 4.1 is different when we increase $p$. But it is similar for different initial points in Algorithm 4.2. This phenomenon is consistent with the geometric properties studied in Section 3.

To show the importance of the inflection point, we give an extreme example as follows:

Example 4.2. Consider the NCP, where $F : \mathbb{R} \to \mathbb{R}$ is given by

$$F(x) = 1.$$

The unique solution of this NCP is $x^* = 0$. From the above discussion, we know that the point sequence is on the curve $\left(x, 1, \psi_p(x, 1)\right)$, see Fig. 18(a). Fig. 18(c) shows that there is a rapid decrease of the merit function from the 80th to the 120th iteration. Fig. 18(b) shows the behavior during the 80th to 120th iterations. Observing the width of the level curves in Fig. 18(b), we found that the rapid decrease may arise from the existence of an inflection point on the surface. Figs. 18(c)–(f) and Fig. 19(e) and (f) show that the position of the inflection point may change with different $p$.

Fig. 16. Geometric view of NCP in Example 4.1 (axes: $x$, $F(x)$, $\Psi_p(x, F(x))$; curves shown for p = 1.1, 1.5, 2, 3, 100).

Example 4.3. Consider the NCP, where $F : \mathbb{R}^4 \to \mathbb{R}^4$ is given by

$$F(x) = \begin{pmatrix} 3x_1^2 + 2x_1 x_2 + 2x_2^2 + x_3 + 3x_4 - 6 \\ 2x_1^2 + x_1 + x_2^2 + 3x_3 + 2x_4 - 2 \\ 3x_1^2 + x_1 x_2 + 2x_2^2 + 2x_3 + 3x_4 - 1 \\ x_1^2 + 3x_2^2 + 2x_3 + 3x_4 - 3 \end{pmatrix}.$$
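The map of Example 4.3 can be coded directly to confirm the two facts quoted in this section: $F(x^0) = (-6, -2, -1, -3)$ at $x^0 = 0$, and $x^* = (\sqrt{6}/2, 0, 0, 1/2)$ satisfies the complementarity conditions. The residual $\max_i |\min\{x_i, F_i(x)\}|$ used below is a standard way to measure violation of (1); it is a choice made for this sketch and is not taken from the paper.

```python
import math


def F(x):
    """Map of Example 4.3 as written in this section."""
    x1, x2, x3, x4 = x
    return [
        3 * x1**2 + 2 * x1 * x2 + 2 * x2**2 + x3 + 3 * x4 - 6,
        2 * x1**2 + x1 + x2**2 + 3 * x3 + 2 * x4 - 2,
        3 * x1**2 + x1 * x2 + 2 * x2**2 + 2 * x3 + 3 * x4 - 1,
        x1**2 + 3 * x2**2 + 2 * x3 + 3 * x4 - 3,
    ]


def natural_residual(x):
    """max_i |min(x_i, F_i(x))|; it is zero exactly at solutions of the NCP."""
    return max(abs(min(xi, fi)) for xi, fi in zip(x, F(x)))


x0 = [0.0, 0.0, 0.0, 0.0]
x_star = [math.sqrt(6.0) / 2.0, 0.0, 0.0, 0.5]

print(F(x0))                     # value of F at the initial point
print(F(x_star))                 # first and fourth components vanish up to rounding
print(natural_residual(x_star))  # close to zero
```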

Fig. 17. Convergent behavior of Algorithm 4.1 and the value of merit function in Example 4.1 (panels plot $F(x)$ against $x$, and the merit function against the iteration number).