
to appear in Applied Mathematics and Computation, 2012

A continuation approach for solving binary quadratic program based on a class of NCP-functions

Jein-Shan Chen¹
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
E-mail: jschen@math.ntnu.edu.tw

Jing-Fan Li
Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
E-mail: 697400011@ntnu.edu.tw

Jia Wu
School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
E-mail: jwu_dora@mail.dlut.edu.cn

February 12, 2012

Abstract. In this paper, we consider a continuation approach for the binary quadratic program (BQP) based on a class of NCP-functions. More specifically, we recast the BQP as an equivalent minimization problem and then seek its global minimizer via a global continuation method. Such an approach was considered in [11], based on the Fischer-Burmeister function. We investigate this continuation approach again using a more general function, called the generalized Fischer-Burmeister function. However, the theoretical background for such an extension cannot be easily carried over; indeed, it requires some subtle analysis.

Keywords. Nonlinear complementarity problem, generalized Fischer-Burmeister function, binary quadratic program.

¹Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. This author's work is supported by the National Science Council of Taiwan.


1 Introduction

In this paper, we consider the following binary quadratic program (BQP)

min x^T Q x + c^T x  over x ∈ S,  (1)

where Q is an n × n symmetric matrix, c is a vector in IR^n and S is the binary discrete set {0, 1}^n. It is known that BQP is NP-hard and has a variety of applications in computer science, operations research and engineering; see [1, 3, 8, 13, 14] and references therein.

Several continuous approaches for solving BQP have been proposed [9, 12, 15]; they often need to cooperate with branch-and-bound algorithms or some heuristic strategies to generate an exact or approximate solution. In [10], another type of continuous approach was proposed, which reformulates BQP as an equivalent mathematical programming problem with equilibrium constraints (MPEC) and then applies an effective algorithm to find its global solution. In this approach, NCP-functions are employed to convert the equilibrium constraints into a collection of quasi-linear equality constraints. Among others, the Fischer-Burmeister function ϕFB : IR² → IR defined as

ϕFB(a, b) = √(a² + b²) − (a + b)  (2)

is a popular one. In this paper, we investigate this continuation approach again by using a more general function ϕp : IR2 → IR, called the generalized Fischer-Burmeister function and defined by

ϕp(a, b) := ‖(a, b)‖p − (a + b),  (3)

where p > 1 is an arbitrary fixed real number and ‖(a, b)‖p denotes the p-norm of (a, b), i.e., ‖(a, b)‖p = (|a|^p + |b|^p)^{1/p}. In other words, in the generalized FB function ϕp, we replace the 2-norm of (a, b) that appears in the FB function by the more general p-norm.

The function ϕp is still an NCP-function, which naturally induces another NCP-function ψp : IR2 → IR+ given by

ψp(a, b) := (1/2)|ϕp(a, b)|².  (4)

For any given p > 1, the function ψp is shown to possess all favorable properties of the FB function ψFB.
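As a concrete illustration (not part of the paper's development), the two functions in (3) and (4) can be transcribed directly; the names `phi_p` and `psi_p` below are our own, and the NCP property ϕp(a, b) = 0 ⇐⇒ a ≥ 0, b ≥ 0, ab = 0 can be spot-checked numerically.

```python
def phi_p(a, b, p=2.0):
    """Generalized Fischer-Burmeister function (3): ||(a, b)||_p - (a + b)."""
    return (abs(a) ** p + abs(b) ** p) ** (1.0 / p) - (a + b)


def psi_p(a, b, p=2.0):
    """Induced nonnegative NCP-function (4): (1/2) |phi_p(a, b)|^2."""
    return 0.5 * phi_p(a, b, p) ** 2
```

For instance, phi_p(0, 3) and phi_p(2, 0, p=4) vanish (complementary pairs), while phi_p(-1, 1) is positive, and psi_p is nonnegative everywhere.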

Traditionally, the continuation approach for BQP utilizes the fact that

x ∈ {0, 1}^n ⇐⇒ x_i = x_i², i = 1, 2, ..., n.  (5)

In contrast, our proposed continuous optimization approach arises from the complementarity condition formulation of the 0−1 vector x ∈ {0, 1}^n, which augments the equivalence (5) with the redundant constraints

0 ≤ x_i ≤ 1, i = 1, 2, ..., n,


so that it can generate an integer feasible solution. For finding the global minimizer of our continuous optimization problem, we employ a similar strategy to that in [10, 11]. In summary, the method adds to the objective function a quadratic penalty term associated with the equilibrium constraints and a logarithmic barrier term associated with the box constraints −1 ≤ x_i ≤ 1, i = 1, 2, ..., n, and then constructs a global smoothing function.

Since the generalized Fischer-Burmeister function ψp is quasi-linear, the quadratic penalty for the equilibrium constraints makes the convexity of the global smoothing function stronger. In particular, we show that the global smoothing function is strictly convex on the whole domain when the barrier parameter is large enough, and on a subset of its domain when the penalty parameter is large enough. Based on this feature, we use the global continuation algorithm defined in [11], which solves a sequence of unconstrained minimizations of this function with varying penalty and barrier parameters. Although the idea is borrowed from [11], as will be seen, the theoretical background for this extension cannot be easily carried over; indeed, it requires some subtle analysis to extend the background materials. Without loss of generality, in this paper we consider the case S = {−1, 1}^n. By the transformation z = (x + e)/2, where e is the vector of all ones in IR^n, the conclusions extend to the case S = {0, 1}^n.

2 Continuous formulation based on the ϕp function

In this section we reformulate (1) as an equivalent continuous optimization problem based on the ϕp function. As will be seen, the following equivalence plays a key role; it says that a binary constraint t ∈ {a, b} with a, b ∈ IR is equivalent to a complementarity condition (or equilibrium constraint), i.e.,

t ∈ {a, b} ⇐⇒ t − a ≥ 0, b − t ≥ 0, (t − a)(t − b) = 0.

With this, the unconstrained BQP problem in (1), with f(x) := x^T Q x + c^T x, can be recast as a mathematical programming problem with equilibrium constraints (MPEC):

min f(x)
s.t. (1 + x_i)(1 − x_i) = 0, i = 1, 2, ..., n,  (6)
     1 + x_i ≥ 0, 1 − x_i ≥ 0, i = 1, 2, ..., n.

In fact, given any NCP-function ϕ : IR × IR → IR, the defining property of NCP-functions (see [6]) yields that the equilibrium constraint in (6) is equivalent to an equality constraint associated with ϕ:

(1 + x_i)(1 − x_i) = 0, 1 + x_i ≥ 0, 1 − x_i ≥ 0 ⇐⇒ ϕ(1 + x_i, 1 − x_i) = 0.  (7)

Thus, combining (6) and (7), we reformulate the original BQP problem as the following continuous optimization problem:

min f(x)
s.t. ϕ(1 + x_i, 1 − x_i) = 0, i = 1, 2, ..., n,  (8)
     −1 ≤ x_i ≤ 1, i = 1, 2, ..., n.

Although the box constraints −1 ≤ x_i ≤ 1, i = 1, 2, ..., n in (8) are indeed redundant, we keep them on purpose. Actually, we shall see that these constraints play a crucial role in the construction of a global smoothing function for problem (8), as was shown in [9, 10]. Generally speaking, most NCP-functions are non-differentiable, such as the popular Fischer-Burmeister function in (2), the generalized Fischer-Burmeister function in (3), as well as the minimum function

ϕmin(a, b) = min{a, b}.

However, it is very interesting to observe that, when specializing ϕ in (8) to the generalized Fischer-Burmeister function, we reach the smooth constraint functions

ϕp(1 + x_i, 1 − x_i) = (|1 + x_i|^p + |1 − x_i|^p)^{1/p} − 2 = 0, i = 1, 2, ..., n,

and consequently some standard nonlinear programming solvers can be employed to design an effective algorithm for solving problem (8). In view of this, in this paper we pay attention to the following equivalent continuous formulation obtained from the generalized Fischer-Burmeister function:

min f(x)
s.t. ϕp(1 + x_i, 1 − x_i) = 0, i = 1, 2, ..., n,  (9)
     −1 ≤ x_i ≤ 1, i = 1, 2, ..., n.

We also note that using the equivalence x_i ∈ {−1, 1} ⇐⇒ x_i² = 1 gives another type of continuous optimization problem:

min f(x)
s.t. x_i² = 1, i = 1, 2, ..., n,  (10)
     −1 ≤ x_i ≤ 1, i = 1, 2, ..., n.

The formulation (10) looks simple and friendly at first glance; nonetheless, the following remarkable advantages explain why we still stick to the smooth constrained optimization problem (9):

(i) The quasi-linearity of the generalized Fischer-Burmeister function implies that its feasible set tends to be convex.

(ii) The equality constraints ϕp(1 + x_i, 1 − x_i) = 0, i = 1, 2, ..., n incorporate the equivalent formulation x_i² = 1, i = 1, 2, ..., n, of x ∈ {−1, 1}^n together with its relaxation −1 ≤ x_i ≤ 1, i = 1, 2, ..., n, which indicates that, when solving (9) with a penalty function method, an implicit interior point constraint is additionally imposed.

(iii) From Proposition 2.1 below, the quadratic penalty function of the equality constraints is strictly convex in a very large region when the penalty parameter is large enough.


These advantages contribute greatly to the search for an optimal solution or a favorable suboptimal solution of (1), as will be shown later. Before we prove the main proposition, we first introduce several technical lemmas that are important for building up the background materials of our extension.

Lemma 2.1 Let f, g be real-valued functions from IR to IR₊. Suppose f, g satisfy

(i) f′(x) > 0 and g′(x) < 0 for all x ∈ (a, b),

(ii) f″(x) < 0 and g″(x) < 0 for all x ∈ (a, b),

(iii) (fg)′(a) < 0 and f(a) ≥ g(a).

Then (fg)′(x) < 0 for all x ∈ (a, b).

Proof. To achieve our result, we need to verify two things: (i) (fg)′(a) < 0 and (ii) (fg)′(x) is decreasing on (a, b). We proceed with these verifications as below.

(i) From the assumptions and the product rule, it is clear that (fg)′(a) = f′(a)g(a) + f(a)g′(a) < 0.

(ii) Since (fg)′(x) = f′(x)g(x) + f(x)g′(x), in order to show that (fg)′(x) is decreasing on (a, b), it is enough to argue that both f′(x)g(x) and f(x)g′(x) are decreasing on (a, b). For the first term, note that

(f′(x)g(x))′ = f″(x)g(x) + f′(x)g′(x) ≤ 0  ∀x ∈ (a, b)

because f″(x) < 0, g(x) ≥ 0, f′(x) > 0 and g′(x) < 0. This shows that f′(x)g(x) is decreasing on (a, b). That f(x)g′(x) is decreasing on (a, b) can be concluded similarly.

Thus, from all the above, the proof is complete. □

The conclusion of the next lemma is simple and neat; however, its arguments are very tedious. Indeed, the main idea behind them is approximation.

Lemma 2.2 Let ψp be defined as in (4). Then ψp″(1 + t, 1 − t) is positive at t = ±√(2^{1/3} − 1) for all p ≥ 2.

Proof. By symmetry, we only prove the case t = √(2^{1/3} − 1). First, from direct computation and simplification of the expression of ψp″, we have

ψp″(1 + t, 1 − t) = [((1 + t)^p + (1 − t)^p)^{1/p} / ((1 + t)²(1 − t)²((1 + t)^p + (1 − t)^p)²)] × F(p, t),  (11)

where F(p, t) = f0(p, t)[f1(p, t) + f2(p, t) + f3(p, t)] + f4(p, t) with

f0(p, t) = ((1 + t)^p + (1 − t)^p)^{1/p},
f1(p, t) = (1 − t)²(1 + t)^{2p},
f2(p, t) = (1 + t)²(1 − t)^{2p},
f3(p, t) = (2t² + 4p − 6)(1 + t)^p (1 − t)^p,
f4(p, t) = (8 − 8p)(1 − t²)^p.

Since the first factor on the right side of (11) is always positive for all p ≥ 2, it suffices to show that F(p, √(2^{1/3} − 1)) > 0 for all p ≥ 2. However, it is very hard to prove this fact directly. Our strategy is to construct a function A : IR → IR such that

A(p) ≤ F(p, √(2^{1/3} − 1))  ∀p ≥ 2.  (12)

The special feature of A(p) is that it is easier to verify A(p) ≥ 0 for all p ≥ 2, so that our goal can be reached. We now proceed with the proof through these two steps.

Step (1): Construct a function A(·) satisfying (12). The function F(·, ·) is composed of f0, f1, f2, f3 and f4, so for each fi we construct a corresponding piecewise function ai such that ai(p) ≤ fi(p, √(2^{1/3} − 1)) for i = 0, 1, 2, 3, 4, and then combine them to build up the function A(·). To make the construction easier to follow, we provide some pictures along the way.

(i) First, we explain how to set up a0(p). Since the second derivative of f0 with respect to p is positive at t = √(2^{1/3} − 1) for all p ≥ 2, f0 is strictly convex in p at t = √(2^{1/3} − 1) for all p ≥ 2 (the detailed arguments are provided in Appendix A). Hence, we consider the real piecewise function defined as

a0(p) = −(1/8)(p − 2) + 2^{2/3}   if 2 ≤ p ≤ −8√(2^{1/3} − 1) − 6 + 8(2^{2/3}),
a0(p) = √(2^{1/3} − 1) + 1        if p ≥ −8√(2^{1/3} − 1) − 6 + 8(2^{2/3}).

Figure 1: The graphs of a0 and f0.

Figure 1 depicts the relation between a0(p) and f0(p, √(2^{1/3} − 1)). Besides, the facts

a0(2) = f0(2, √(2^{1/3} − 1)),
lim_{p→2+} a0′(p) < (d/dp) f0(2, √(2^{1/3} − 1)),
a0″(p) = 0 < (d²/dp²) f0(p, √(2^{1/3} − 1)),

indicate that the first piece of a0(p) is less than f0(p, √(2^{1/3} − 1)) for 2 < p ≤ −8√(2^{1/3} − 1) − 6 + 8(2^{2/3}). On the other hand, the fact

lim_{p→∞} f0(p, √(2^{1/3} − 1)) = √(2^{1/3} − 1) + 1

says that the second piece of a0(p) is less than or equal to f0(p, √(2^{1/3} − 1)) for p ≥ −8√(2^{1/3} − 1) − 6 + 8(2^{2/3}). Thus, we conclude that

a0(p) ≤ f0(p, √(2^{1/3} − 1))  ∀p ≥ 2.

(ii) Secondly, we consider the quadratic function defined as

a1(p) = (1 − √(2^{1/3} − 1))²(1 + √(2^{1/3} − 1))⁴ ln(1 + √(2^{1/3} − 1)) (p − 1)²
      + (1 − √(2^{1/3} − 1))²(1 + √(2^{1/3} − 1))⁴ [1 − ln(1 + √(2^{1/3} − 1))].

Figure 2 depicts the relation between a1(p) and f1(p, √(2^{1/3} − 1)). Again, using the facts

a1(2) = f1(2, √(2^{1/3} − 1)),
a1′(2) = (d/dp) f1(2, √(2^{1/3} − 1)),
a1″(p) ≤ (d²/dp²) f1(p, √(2^{1/3} − 1))  ∀p ≥ 2,

we immediately achieve

a1(p) ≤ f1(p, √(2^{1/3} − 1))  ∀p ≥ 2.

(iii) Thirdly, we consider the function defined as

a2(p) = −(1/5)(p − 2) + (1 − √(2^{1/3} − 1))⁴(1 + √(2^{1/3} − 1))²
        if 2 ≤ p ≤ 12 + 20(2^{1/3} − 2^{2/3}) + √(2^{1/3} − 1)(40(2^{1/3}) − 40 − 10(2^{2/3})),
a2(p) = 0
        if p ≥ 12 + 20(2^{1/3} − 2^{2/3}) + √(2^{1/3} − 1)(40(2^{1/3}) − 40 − 10(2^{2/3})).

Figure 3: The graphs of a2 and f2.

Figure 3 depicts the relation between a2(p) and f2(p, √(2^{1/3} − 1)). We observe that the function f2 is positive and convex for p ≥ 2; then the facts

a2(2) = f2(2, √(2^{1/3} − 1)),
lim_{p→2+} a2′(p) < (d/dp) f2(2, √(2^{1/3} − 1)),
a2″(p) = 0 < (d²/dp²) f2(p, √(2^{1/3} − 1))  ∀p > 2,

yield a2(p) ≤ f2(p, √(2^{1/3} − 1)) for all p ≥ 2.

(iv) Fourthly, we consider the real piecewise function defined as

a3(p) = [√(2 − 2^{1/3})(24 − 12(2^{2/3})) + 16(2^{2/3} − 2^{1/3}) − 8] p
        + √(2 − 2^{1/3})(24(2^{2/3}) − 48) − 40(2^{2/3} − 2^{1/3}) + 20   if 2 ≤ p ≤ 5/2,
a3(p) = −(4/31)(2^{1/3} + 1)(2 − 2^{1/3})^{5/2} p + (72/31)(2^{1/3} + 1)(2 − 2^{1/3})^{5/2}   if 5/2 ≤ p ≤ 18,
a3(p) = 0   if p ≥ 18.

Figure 4 depicts the relation between a3(p) and f3(p, √(2^{1/3} − 1)). The relation is clear from the picture; however, we need to go through three subcases to verify it mathematically.

If 2 ≤ p ≤ 5/2, we compute f3(p, √(2^{1/3} − 1)) = (2(2^{1/3}) − 8 + 4p)(2 − 2^{1/3})^p. Moreover, we have

(d/dp) f3(p, √(2^{1/3} − 1)) = (2 − 2^{1/3})^p [4 + (2(2^{1/3}) − 8 + 4p) ln(2 − 2^{1/3})],
(d²/dp²) f3(p, √(2^{1/3} − 1)) = (2 − 2^{1/3})^p ln(2 − 2^{1/3}) [8 + (2(2^{1/3}) − 8 + 4p) ln(2 − 2^{1/3})].

Then the facts

a3(2) = f3(2, √(2^{1/3} − 1)),  a3(5/2) = f3(5/2, √(2^{1/3} − 1)),  lim_{p→2+} a3′(p) ≤ (d/dp) f3(2, √(2^{1/3} − 1)),

together with f3(p, √(2^{1/3} − 1)) being concave on [2, 5/2], imply a3(p) ≤ f3(p, √(2^{1/3} − 1)) in this case.

If 5/2 ≤ p ≤ 18, using the facts

a3(5/2) = f3(5/2, √(2^{1/3} − 1)),  lim_{p→(5/2)+} a3′(p) ≤ (d/dp) f3(5/2, √(2^{1/3} − 1)),

and the fact that the equation a3(p) = f3(p, √(2^{1/3} − 1)) has only one solution, at p = 5/2, we obtain a3(p) ≤ f3(p, √(2^{1/3} − 1)) in this case.

If p ≥ 18, knowing f3(p, √(2^{1/3} − 1)) > 0 for all such p, it is clear that a3(p) = 0 ≤ f3(p, √(2^{1/3} − 1)) in this case.

(v) Finally, notice that the second derivative of f4 with respect to p at t = √(2^{1/3} − 1) is positive for all p ≤ (−2 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}) and negative for p ≥ (−2 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}), so f4 is strictly convex in p at t = √(2^{1/3} − 1) for all p ≤ (−2 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}) and strictly concave for all p ≥ (−2 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}). Hence, we consider the real piecewise function defined as

a4(p) = −(153/50)p − 8(2 − 2^{1/3})² + 153/25                        if 2 ≤ p ≤ 5/2,
a4(p) = [−497/50 + 16(2 − 2^{1/3})²] p + 583/25 − 48(2 − 2^{1/3})²   if 5/2 ≤ p ≤ 3,
a4(p) = −(13/10)p − 13/5                                             if 3 ≤ p ≤ 49/13,
a4(p) = −15/2                                                        if p ≥ 49/13.

Figure 5 depicts the relation between a4(p) and f4(p, √(2^{1/3} − 1)). Again, we need to discuss several subcases to prove the relation mathematically.

For 2 ≤ p ≤ 5/2, the facts

a4(2) = f4(2, √(2^{1/3} − 1)),
lim_{p→2+} a4′(p) < (d/dp) f4(2, √(2^{1/3} − 1)),
a4″(p) = 0 < (d²/dp²) f4(p, √(2^{1/3} − 1)),

yield that the first piece of a4(p) is less than f4(p, √(2^{1/3} − 1)) in this case.

For 5/2 ≤ p ≤ 3, using the facts

a4(3) < f4(3, √(2^{1/3} − 1)),
lim_{p→3−} a4′(p) > (d/dp) f4(3, √(2^{1/3} − 1)),  (13)
a4″(p) = 0 < (d²/dp²) f4(p, √(2^{1/3} − 1)),

we have that a4(p) is less than f4(p, √(2^{1/3} − 1)) in this case.

For 3 ≤ p ≤ 49/13, we know that

lim_{p→3+} a4′(p) > (d/dp) f4(3, √(2^{1/3} − 1)).

This together with (13) gives that a4(p) is less than f4(p, √(2^{1/3} − 1)) in this case.

Figure 6: The graphs of A and F.

For p ≥ 49/13, since f4(p, √(2^{1/3} − 1)) is strictly convex for all p ≤ (−2 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}) and strictly concave for all p ≥ (−2 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}), we know

(d/dp) f4((−1 + ln(2 − 2^{1/3}))/ln(2 − 2^{1/3}), √(2^{1/3} − 1)) = 0  and  lim_{p→∞} f4(p, √(2^{1/3} − 1)) = 0,

which lead to f4(p, √(2^{1/3} − 1)) > −15/2 for all p ≥ 2. Thus, a4(p) ≤ f4(p, √(2^{1/3} − 1)) in this case.

Now we are ready to define a function A : IR → IR satisfying (12). Following the idea mentioned above, the function is defined by

A(p) = a0(p)[a1(p) + a2(p) + a3(p)] + a4(p).

According to our constructions of the ai(p), it is clear that A(p) ≤ F(p, √(2^{1/3} − 1)) for all p ≥ 2. Figure 6 shows the relation between A(p) and F(p, √(2^{1/3} − 1)).

Step (2): We show that A(p) ≥ 0 for all p ≥ 2. Notice that A(p) is piecewise smooth, hence A′(p) is a piecewise function. Indeed, the expression of A(p) looks very complicated, so we give its explicit expression in Appendix C, which helps us understand the structure of A(p).

The key point is that, from the expression of A(p), we can verify the following facts:

A(2) = 0,  A(−8√(2^{1/3} − 1) − 6 + 8(2^{2/3})) > A(5/2) > 0,

and

A′(p) < 0 if p ∈ (5/2, −8√(2^{1/3} − 1) − 6 + 8(2^{2/3})),  A′(p) > 0 otherwise,

with the exception of points of nondifferentiability. Thus, we conclude that A(p) ≥ 0 for all p ≥ 2 and (12) is satisfied, which implies F(p, √(2^{1/3} − 1)) ≥ 0 for all p ≥ 2. Then the proof is complete. □

Lemma 2.3 (a) Let f be a convex function defined on a convex set C in IR^n and g be a nondecreasing convex function defined on an interval I in IR. Suppose f(C) ⊆ I. Then the composite function g ∘ f defined by (g ∘ f)(x) = g(f(x)) is convex on C.

(b) Suppose ϕ₁ : U → IR is a twice continuously differentiable function on a compact set U ⊂ IR^n and ϕ₂ : X → IR is a twice continuously differentiable function such that the minimum eigenvalue of its Hessian matrix ∇²ₓₓϕ₂(x) is greater than ε (> 0) for all x ∈ X, where X ⊂ U. Then there exists a constant β̂ > 0 such that ϕ₁ + βϕ₂ is a strictly convex function on X for β > β̂.

Proof. (a) See [2, Chap III, Lemma 1.4]. (b) See [9, Theorem 3.1]. □

Proposition 2.1 Let ϕp and ψp be defined as in (3) and (4), respectively. Then, for any fixed p ≥ 2, the following hold.

(a) The function ϕp(1 + t, 1 − t) is strictly convex for all t ∈ IR.

(b) The function ψp(1 + t, 1 − t) is strictly convex for all t ∉ [−√(2^{1/3} − 1), √(2^{1/3} − 1)].

Proof. (a) It is known that ϕp is a convex function [4, 5, 6]. Note that ϕp(1 + t, 1 − t) is a composition of ϕp and an affine function; thus it is convex (the composition of two convex functions is not necessarily convex, but our case does guarantee convexity because one of them is affine).


(b) Due to the symmetry of ψp(1 + t, 1 − t) in t, it is enough to show that ψp(1 + t, 1 − t) is strictly convex for t ≥ √(2^{1/3} − 1). To proceed, we discuss two cases.

(i) If t ≥ 1, the function ψp(1 + t, 1 − t) can be regarded as a composite function of ϕp(1 + t, 1 − t) and h(·) = (1/2)(·)². Because h(·) is a nondecreasing convex function on [0, ∞) and ϕp(1 + t, 1 − t) is positive and strictly convex for t ≥ 1, from Lemma 2.3 we obtain that ψp(1 + t, 1 − t) is strictly convex for t ≥ 1.

(ii) If √(2^{1/3} − 1) ≤ t < 1, we know that

−ψp′(1 + t, 1 − t) = −ϕp(1 + t, 1 − t) ϕp′(1 + t, 1 − t),
−ψp″(1 + t, 1 − t) = −[ϕp′(1 + t, 1 − t)]² − ϕp(1 + t, 1 − t) ϕp″(1 + t, 1 − t).

Then it suffices to show that −ψp″(1 + t, 1 − t) < 0 for p ≥ 2. To this end, we compute the third derivative of ϕp(1 + t, 1 − t) with respect to t and prove that it is negative. To see this,

ϕp‴(1 + t, 1 − t) = [4((1 + t)^p + (1 − t)^p)^{1/p} (1 + t)^p (1 − t)^p (p − 1) / ((1 + t)³(t − 1)³((1 + t)^p + (1 − t)^p)³)] × T(p, t),  (14)

where T is a real-valued function defined by

T(p, t) = (1 + t)^p(2p − 1 − 3t) − (1 − t)^p(2p + 3t − 1).

It is not hard to verify that the first factor on the right side of (14) is always negative for all p ≥ 2. Thus, we only need to show that T(p, t) > 0 for all p ≥ 2, which is equivalent to verifying T(2, t) > 0 and T(p, t) > T(2, t) for all p > 2. These can be done as below.

(i) Because T(2, t) = 6t − 6t³, it is clear that T(2, t) > 0 for 0 < t < 1.

(ii) To show that T(p, t) > T(2, t) for p > 2, we first argue that

(1 + t)^p > (1 − t)^{p−1}(2p + 3t − 1)  ∀p > 2,  (15)

which is equivalent to showing that (1 + t)^p / [(1 − t)^{p−1}(2p + 3t − 1)] is greater than 1 for all p > 2. To this end, we consider the derivative of this quotient with respect to p:

(d/dp) { (1 + t)^p / [(1 − t)^{p−1}(2p + 3t − 1)] }
  = (1 + t)^p / [(1 − t)^{p−1}(2p + 3t − 1)²] × [(1 − 3t − 2p) ln(1 − t) + (2p + 3t − 1) ln(1 + t) − 2].  (16)

Observing that both factors on the right side of (16) are positive for all p > 2, and using (1 + t)^p / [(2p + 3t − 1)(1 − t)^{p−1}] > 1 when p = 2, we achieve (15). Secondly, we know that

2p − 1 − 3t > 1 − t  ∀p > 2.  (17)

Combining (15) and (17) then yields T(p, t) > T(2, t) for all p > 2, and hence

ϕp‴(1 + t, 1 − t) < 0  ∀p ≥ 2.

Then applying Lemma 2.1, with f(t) = ϕp′(1 + t, 1 − t) and g(t) = −ϕp(1 + t, 1 − t), gives the desired result. □

The result of Proposition 2.1(b) could be improved in some sense. More specifically, the interval on which ψp(1 + t, 1 − t) is strictly convex varies as p changes. We originally wished to determine the exact interval of strict convexity of ψp(1 + t, 1 − t) for each p. However, it is very hard to find a closed form depending on p that reflects this feature (indeed, it may not be possible, in our opinion). As a compromise, we find an appropriate common interval for all p ≥ 2, as shown in Proposition 2.1(b). The following two figures (Figures 7-8) depict the geometric view of what we just mentioned.

Figure 7: The graphs of ψp(1 + t, 1 − t) for p = 1.1, 1.5, 2, 3, 10.

3 Global continuation algorithm for BQP

Figure 8: The graph of ψp(1 + t, 1 − t) with a fixed p.

Since the logarithmic barrier function is strictly convex, and in view of Proposition 2.1, we now introduce the quadratic penalty ∑_{i=1}^n ψp(1 + x_i, 1 − x_i) for the equality constraints and the logarithmic barrier ∑_{i=1}^n [ln(1 + x_i) + ln(1 − x_i)] for the box constraints of (9), and construct the global smoothing function

ϕ(x, α, τ) = f(x) + α ∑_{i=1}^n ψp(1 + x_i, 1 − x_i) − τ ∑_{i=1}^n [ln(1 + x_i) + ln(1 − x_i)],  (18)

where τ > 0 is a barrier parameter and α > 0 is a penalty parameter. The next proposition establishes the strict convexity of ϕ(x, α, τ) on (−1, 1)^n when the barrier parameter is large enough, and its strict convexity on a large subset of its domain when the penalty parameter is large enough.
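For illustration, (18) can be transcribed directly as follows. This is our own sketch assuming NumPy; the function name and the quadratic objective f(x) = x^T Q x + c^T x are our conventions, and x must lie in (−1, 1)^n for the barrier term to be defined.

```python
import numpy as np


def global_smoothing(x, alpha, tau, Q, c, p=2.0):
    """phi(x, alpha, tau) of (18): objective + quadratic penalty + log-barrier."""
    a, b = 1.0 + x, 1.0 - x                      # both positive on (-1, 1)^n
    phi = (a ** p + b ** p) ** (1.0 / p) - (a + b)
    psi = 0.5 * phi ** 2
    f = x @ (Q @ x) + c @ x                      # f(x) = x'Qx + c'x
    return f + alpha * psi.sum() - tau * (np.log(a) + np.log(b)).sum()
```

At x = 0 the barrier term vanishes and each penalty term equals (1/2)(2^{1/p} − 2)², which gives a quick sanity check.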

Proposition 3.1 Let ϕ(x, α, τ) be the function defined by (18). Then the following hold.

(a) There exists a constant τ̂ > 0 such that if τ > τ̂ and α > 0, then ϕ(x, α, τ) is strictly convex on (−1, 1)^n.

(b) There exists a constant α̂ > 0 such that if α > α̂ and τ > 0, then ϕ(x, α, τ) is strictly convex on the set D := { x ∈ (−1, 1)^n : |x_i| > √(2^{1/3} − 1), i = 1, 2, ..., n }.

Proof. (a) Set

ϕa(x) := f(x) + α ∑_{i=1}^n ψp(1 + x_i, 1 − x_i),
ϕb(x) := −∑_{i=1}^n [ln(1 + x_i) + ln(1 − x_i)],

so that ϕ(x, α, τ) = ϕa(x) + τ ϕb(x). The Hessian matrix of ϕb(x) at any x ∈ (−1, 1)^n is given by

∇²ₓₓϕb(x) = diag( 1/(1 − x₁)² + 1/(1 + x₁)², ..., 1/(1 − x_n)² + 1/(1 + x_n)² ),

where diag(·) denotes a diagonal matrix with the listed components as diagonal elements. Moreover, the function 1/(1 − x_i)² + 1/(1 + x_i)² attains its minimum 2 at x_i = 0, and so every diagonal element of ∇²ₓₓϕb(x) is at least 2. Thus, letting U = [−1, 1]^n, ε = 2 and using Lemma 2.3(b) yields the desired result.

(b) Set

ϕa(x) := f(x)  and  ϕb(x) := ∑_{i=1}^n ψp(1 + x_i, 1 − x_i) − (τ/α) ∑_{i=1}^n [ln(1 + x_i) + ln(1 − x_i)],

so that ϕ(x, α, τ) = ϕa(x) + α ϕb(x). From the proof of Lemma 2.2, it follows that

∇²ₓₓ( ∑_{i=1}^n ψp(1 − x_i, 1 + x_i) ) = diag( ψp″(1 − x₁, 1 + x₁), ..., ψp″(1 − x_n, 1 + x_n) ),

where ψp″(1 − x_i, 1 + x_i) can be found in (11). Now, taking f(t) = ϕp′(1 + t, 1 − t), g(t) = −ϕp(1 + t, 1 − t) and applying part (ii) of the proof of Lemma 2.1, we have

ψp″(1 − t, 1 + t) > ψ₂″(1 − t, 1 + t)  ∀p > 2.

In addition, from [11, Lemma 3.1], we also have

ψ₂″(1 − t, 1 + t) = 2 − 8/√((2t² + 2)³) > 0.0004  ∀|t| > 0.51.

Therefore, the above two inequalities imply

ψp″(1 − t, 1 + t) > 0.0004  ∀|t| > 0.51 and ∀p ≥ 2.

This indicates that every diagonal element of ∇²ₓₓ(∑_{i=1}^n ψp(1 + x_i, 1 − x_i)) is at least 0.0004 on D. Using the fact that the Hessian matrix of −(τ/α) ∑_{i=1}^n [ln(1 + x_i) + ln(1 − x_i)] is positive definite, we obtain that every diagonal element of ∇²ₓₓϕb is at least 0.0004 on D. Now, taking

U = [−1, 1]^n, X = D and ε = 0.0004,

and applying Lemma 2.3(b) gives the desired conclusion. □
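The closed-form bound quoted from [11, Lemma 3.1] is easy to sanity-check numerically; the following spot check (ours, not a proof) evaluates ψ₂″(1 − t, 1 + t) = 2 − 8/(2t² + 2)^{3/2} at a few sample points with |t| > 0.51.

```python
# psi''_2(1 - t, 1 + t) = 2 - 8/(2 t^2 + 2)^(3/2); check it exceeds 0.0004
# at a few sample points with |t| > 0.51, as used in the proof above.
def psi2_second(t):
    return 2.0 - 8.0 / (2.0 * t * t + 2.0) ** 1.5


for t in (0.511, 0.52, 0.6, 0.9, -0.511, -0.75):
    assert psi2_second(t) > 0.0004
```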

As remarked in [11], the result of Proposition 3.1 motivates using the function ϕ(x, α, τ) to develop a global continuation algorithm for the constrained optimization problem (9). This method generates a global optimal solution, or at least a desirable local solution, via a sequence of unconstrained minimizations

min_{x ∈ IR^n} ϕ(x, α_k, τ_k)  (19)

with an increasing penalty parameter sequence {α_k} and a decreasing barrier parameter sequence {τ_k}. Note that, to ensure the strict convexity of ϕ(x, α_k, τ_k), we have to start the algorithm with a sufficiently large initial value τ₀. As the iteration goes on, the convexity contributed by the logarithmic barrier −τ_k ∑_{i=1}^n [ln(1 + x_i) + ln(1 − x_i)] becomes weak, but the strict convexity of ϕ(x, α_k, τ_k) can still be guaranteed by the increasing penalty parameter α_k. This means that, for each k ∈ IN, the minimization problem (19) can be solved easily if the parameters α and τ are adjusted skillfully.

Algorithm 3.1

Step 0 Given parameters α₀, τ₀, σ₁ > 1, σ₂ ∈ (0, 1) and ϵ > 0. Select a starting point x̂⁰ and set k = 0.

Step 1 Solve the unconstrained minimization problem (19) with the starting point x̂ᵏ, and denote by xᵏ its optimal solution.

Step 2 If √(∑_{i=1}^n ψp(1 + x_iᵏ, 1 − x_iᵏ)) ≤ ϵ, terminate the algorithm; else go to Step 3.

Step 3 Update the parameters α_{k+1} = σ₁α_k and τ_{k+1} = σ₂τ_k.

Step 4 Set x̂^{k+1} = xᵏ, k = k + 1 and go to Step 1.
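A minimal runnable sketch of Algorithm 3.1 is given below. It is ours, not the authors' Matlab code: a crude projected gradient descent with a conservative step size stands in for the BFGS inner solver of Step 1, the parameter defaults are illustrative only, and the gradient is written for general p > 1.

```python
import numpy as np


def solve_bqp(Q, c, p=2.0, alpha=0.5, tau=1.0, sigma1=2.0, sigma2=0.5,
              eps=1e-3, max_outer=40):
    """Continuation sketch of Algorithm 3.1 for min x'Qx + c'x, x in {-1,1}^n."""
    n = len(c)
    x = np.full(n, 0.9)                          # starting point, as in Sec. 4

    def grad(x, alpha, tau):
        a, b = 1.0 + x, 1.0 - x
        s = a ** p + b ** p
        phi = s ** (1.0 / p) - 2.0               # phi_p(1 + x_i, 1 - x_i)
        dphi = (a ** (p - 1.0) - b ** (p - 1.0)) * s ** (1.0 / p - 1.0)
        return (2.0 * Q @ x + c                  # objective gradient
                + alpha * phi * dphi             # penalty gradient
                - tau * (1.0 / a - 1.0 / b))     # barrier gradient

    for _ in range(max_outer):
        for _ in range(1000):                    # Step 1 (stand-in inner solver)
            g = grad(x, alpha, tau)
            step = 0.05 / (1.0 + np.abs(g).max())
            x_new = np.clip(x - step * g, -0.999999, 0.999999)
            if np.abs(x_new - x).max() < 1e-12:
                break
            x = x_new
        phi = ((1.0 + x) ** p + (1.0 - x) ** p) ** (1.0 / p) - 2.0
        if np.sqrt((0.5 * phi ** 2).sum()) <= eps:   # Step 2 stopping test
            break
        alpha *= sigma1                          # Step 3 updates
        tau *= sigma2
    return np.where(x >= 0.0, 1.0, -1.0)         # round to {-1, 1}
```

On a tiny separable instance such as Q = 0, c = (−1, 1), the iterates drift toward the binary optimizer (1, −1) as τ shrinks and α grows, and the final rounding recovers it.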

Is Algorithm 3.1 well defined? To answer this, we give an existence theorem for solutions of the unconstrained minimization problem (19). Its proof can in fact be found in [11, Lemma 3.2]; we give a brief proof here for completeness.

Proposition 3.2 Let ϕ(x, α_k, τ_k) be the function defined as in (18). Then the following hold.

(a) For each k ∈ IN, the minimization problem (19) has a solution xᵏ.

(b) There exists a τ̂ such that the solution of problem (19) in (a) is unique when τ_k > τ̂.

Proof. (a) Take any nonempty compact subset X₁ of (−1, 1)^n. Since ϕ(x, α_k, τ_k) is continuous and X₁ is compact, there exist two real numbers L₁ and U₁ such that

L₁ ≤ ϕ(x, α_k, τ_k) ≤ U₁  ∀x ∈ X₁.

On the other hand, we note that ϕ(x, α_k, τ_k) → +∞ when x_{i₀} → −1⁺ or x_{i₀} → 1⁻ for some i₀ ∈ {1, 2, ..., n}. Hence, the continuity of ϕ(x, α_k, τ_k) implies that there exists a δ with 0 < δ < 1/4 such that

ϕ(x, α_k, τ_k) ≥ U₁  ∀x ∈ (−1, 1)^n \ [−1 + δ, 1 − δ]^n.  (20)

Let X = [−1 + δ, 1 − δ]^n. Again, ϕ(x, α_k, τ_k) being continuous on the compact set X implies that there exists an x̂ ∈ X such that, for each k ∈ IN,

ϕ(x̂, α_k, τ_k) ≤ ϕ(x, α_k, τ_k)  ∀x ∈ X.

Moreover, due to X₁ ⊆ X, we know

ϕ(x̂, α_k, τ_k) ≤ U₁.  (21)

Combining (20) and (21) yields

ϕ(x̂, α_k, τ_k) ≤ ϕ(x, α_k, τ_k)  ∀x ∈ (−1, 1)^n \ X.

Thus, x̂ is exactly the desired solution xᵏ.

(b) From Proposition 3.1(a), ϕ(x, α_k, τ_k) is strictly convex on (−1, 1)^n when τ_k > τ̂. Hence xᵏ is unique. □

4 Numerical experiments

In this section, we report numerical results of Algorithm 3.1 for solving the unconstrained binary quadratic programming problem. Our numerical experiments are carried out in Matlab (version 7.8) on a PC with an Intel Core 2 Q8200 2.33 GHz CPU and 2.00 GB memory.

In our numerical experiments, we employ the BFGS algorithm with strong Wolfe-Powell line search to solve the unconstrained minimization problem (19), and terminate the current iteration once xᵏ satisfies the criterion

‖∇ₓϕ(xᵏ, α_k, τ_k)‖ ≤ 5.0e−3.


The values of the parameters involved in Algorithm 3.1 are chosen as follows:

α₀ = 0, σ₁ = 2, σ₂ = 0.5, ϵ = 1.0e−3,

and the initial barrier parameter τ₀ varies with the scale of the problems (we choose its value the same as in [11]). The starting point x̂⁰ = 0.9(1, 1, ..., 1)^T ∈ IR^n is used for all test problems. To obtain an integer solution x from the final iterate x̂ of Algorithm 3.1, we let

x_i = −1 if |x̂_i + 1| ≤ 1.0e−2,
x_i =  1 if |x̂_i − 1| ≤ 1.0e−2,   for i = 1, 2, ..., n.

The test problems are all from the OR-Library and have the formulation

max z^T Q z
s.t. z_i ∈ {0, 1}, i = 1, 2, ..., n.

To solve these problems with Algorithm 3.1, we use the transformation z = (x + e)/2 to rewrite them in the form

−min −(1/4)x^T Q x − (1/2)x^T Q e − (1/4)e^T Q e
 s.t. x_i ∈ {−1, 1}, i = 1, 2, ..., n.
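The transformation can be verified numerically. The following check is our own (with a random symmetric Q as a stand-in for an OR-Library instance); it confirms on every binary vertex that maximizing z^T Q z over {0, 1}^n matches the displayed minimization over {−1, 1}^n.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
Q = rng.integers(-5, 6, size=(n, n)).astype(float)
Q = (Q + Q.T) / 2.0                        # the OR-Library Q is symmetric
e = np.ones(n)


def max_form(z):                           # max z'Qz over z in {0, 1}^n
    return z @ Q @ z


def min_form(x):                           # transformed form over {-1, 1}^n
    return -0.25 * (x @ Q @ x) - 0.5 * (x @ Q @ e) - 0.25 * (e @ Q @ e)


for bits in range(2 ** n):                 # exhaustive check on all vertices
    z = np.array([(bits >> i) & 1 for i in range(n)], dtype=float)
    x = 2.0 * z - e                        # inverse of z = (x + e)/2
    assert abs(max_form(z) + min_form(x)) < 1e-9
```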

The optimal values generated by Algorithm 3.1 with different p (p = 1.1, 2, 4, 5, 10, 20, 50, 100) are listed in Tables 1-5 (see Appendix D), where an empty entry means that the algorithm fails to obtain an optimal solution before the maximum CPU time is reached. Moreover, to present an objective evaluation and comparison of the performance of Algorithm 3.1 with different p, we adopt the performance profile introduced in [7]. In particular, we regard Algorithm 3.1 with a given p as a solver, and assume that there are n_s solvers and n_j test problems from the OR-Library collection J. We use the optimal values computed by Algorithm 3.1 as the performance measure for different p. For each problem j and solver s, let

t_{j,s} := the optimal value of problem j by solver s,  µ_{j,s} := 1/t_{j,s}.

We compare the performance on problem j by solver s with the best performance by any of the n_s solvers on this problem; that is, we employ the performance ratio

r_{j,s} := µ_{j,s} / min{ µ_{j,s} : s ∈ S } = max{ t_{j,s} : s ∈ S } / t_{j,s},

where S is the set of eight solvers. An overall assessment of each solver is obtained from

ρ_s(τ) := (1/n_j) |{ j ∈ J : r_{j,s} ≤ τ }|,

which is the probability for solver s that the performance ratio r_{j,s} is within a factor τ of the best attained ratio.

Figure 9 shows the performance profile of the reciprocals of the optimal values obtained by Algorithm 3.1 on the interval [1, 1.04] for the eight solvers on 50 test problems. The eight solvers correspond to Algorithm 3.1 with p = 1.1, 2, 4, 5, 10, 20, 50 and 100, respectively. From this figure, we see that Algorithm 3.1 is considerably efficient no matter which value of p is chosen. In fact, Algorithm 3.1 with the aforementioned p values can solve all 50 test problems, except for p = 5, 20, 100.

Moreover, Algorithm 3.1 with p = 4 has the best numerical performance (the highest probability of being the optimal solver); the probability of its being the winner on a given BQP is around 0.48. Besides, p = 1.1 and p = 2 have performance comparable to p = 4; please refer to Appendix D for more detailed numerical reports.


Figure 9: Performance profile of the reciprocals of optimal values by Algorithm 3.1 with different p.

References

[1] B. Alidaee, G. Kochenberger and A. Ahmadian, 0−1 quadratic programming approach for the optimal solution of two scheduling problems, International Journal of Systems Science, 25 (1994), 401-408.
