• 沒有找到結果。

# (4)Max-cut problem – Given a graph G = (V, E), find a partition of the nodes of V into two disjoint sets V1 and V2 (V1 ∩ V2

N/A
N/A
Protected

Share "(4)Max-cut problem – Given a graph G = (V, E), find a partition of the nodes of V into two disjoint sets V1 and V2 (V1 ∩ V2"

Copied!
47
0
0

(1)

Duan Li

Department of Systems Engineering & Engineering Management

The Chinese University of Hong Kong At National Cheng Kung University

March 24, 2011

(2)

Outline

• Binary quadratic program and maximum cut problem

• Polynomially solvable subclasses of BQP

• Exploiting geometric properties of BQP

• SDP relaxation

• Goemanns and Williamson’s bound and Nesterov’s SDP bound

• Further improvement of the SDP bound

(3)

Binary quadratic optimization and max-cut problem

• A general 0-1 or binary quadratic problem can be expressed as:

(P ) min f (x) = 1

2xT Qx + cTx s.t. x ∈ X,

where Q is a symmetric n × n matrix, c ∈ Rn and X is either {−1, 1}n or {0, 1}n. Note

x2i = 1, xi ∈ {−1, 1} or x2i = xi, xi ∈ {0, 1}.

Problem (P ) is in general NP-hard even if the matrix Q is positive definite, since xTQx = xT (Q + diag(λ))x − eT λ if xi ∈ {−1, 1} or xT Qx = xT (Q + diag(λ))x − λT x if xi ∈ {0, 1}.

(4)

Max-cut problem

– Given a graph G = (V, E), find a partition of the nodes of V into two disjoint sets V1 and V2 (V1 ∩ V2 = ∅, V1 ∪ V2 = V ) in such a way to maximize the weights of edges that have one endpoint in V1 and the other in V2.

(5)

– Let wij be the weight corresponding to the (i, j) edge, and is zero if the nodes i and j are not connected.

Define:

yi = −1⇐⇒i ∈ V1, yi = 1⇐⇒i ∈ V2. – The max-cut problem can be written as:

max 1 4

X

i,j

wij(1 − yiyj)

s.t. yi ∈ {−1, 1}, i = 1, . . . , n,

– The max cut problem is also APX-hard, meaning that there is no polynomial-time (ǫ-) approximation scheme (PTAS) for it unless P = NP.

(6)

Maximal stable set

– Given an undirected graph G = (V, E).

Figure 1: Maximal stable set of 9 blue nodes in a 24-node graph

(7)

– Independent subset of a graph: no two nodes from it are linked by an arc in E. Maximal stable set : the independent subset with maximum size of nodes α(G).

– Define

(SQP ) min xT (I + A)x

s.t. eT x = 1, x ≥ 0,

where A is the adjacency matrix of G. Then α(G) =

1

v(SQP ).

– Maximal stable set problem is NP-hard and is closely related to quadratic knapsack problem.

(8)

Polynomially solvable subclasses of BQP

• As the binary quadratic programming program is NP-hard in general, identifying polynomially solvable subclasses of binary quadratic programming problems not only offers the- oretical insight into the complicated nature of the problem, but also provides platforms to design relaxation schemes for exact solution methods.

(9)

Problems with all off-diagonal elements of Q being non-positive

• Consider a subclass of problem (0-1QP ) where all off-diagonal elements of Q are non-positive and qii = 0, i = 1, . . ., n.

• As xixj = min(xi, xj) when xi, xj ∈ {0, 1}. and qij ≤ 0 for 1 ≤ i < j ≤ n ⇒ A linear programming problem with a totally unimodular constraint matrix and an integral right- hand side

min

n

X

i=1

cixi + X

1≤i<j≤n

qijzij

s.t. zij ≤ xi, 1 ≤ i < j ≤ n, zij ≤ xj, 1 ≤ i < j ≤ n,

xi, xj, zij ∈ {0, 1}, 1 ≤ i < j ≤ n.

• LP generates integer solution!

(10)

Homogenous problems with fixed rank Q

• We consider now a subclass of homogenous problems where Q is negative semidefinite and rank(Q) = d.

• Let G = −Q ⇒ There a row full rank d × n matrix, V , such that G = V T V .

• ⇒ Equivalent problem:

(BQPf r) max

x∈{0,1}n xT Gx = xTV TV x =

d

X

i=1

(vix)2, (1) where vi is the i-th row of matrix V .

(11)

• If d is equal to 1, i.e., the matrix G is of rank one with G

= v1T v1, the solution to (BQPf r) can be easily found by inspection. More specifically, we only need to select x such that the absolute value of v1x is maximized on {0, 1}n.

• In general cases with rank(G) = d > 1, we consider a linear map Φ: x ∈ Rn → z = V x ∈ Rd, in which Φ maps the hypercube [0, 1]n into a convex polytope Z(V ) = Φ([0, 1]n) = {z ∈ Rd | z = V x, x ∈ [0, 1]n}, known as a zonotope.

• Note that

x∈{0,1}maxn xTGx = max

x∈{0,1}n d

X

i=1

(vix)2 = max

z∈Z(V ) d

X

i=1

zi2 = max

z∈Z(V ) kzk2. Thus, (BQPf r) reduces to a problem of finding the maxi-

mum norm in a zonotope.

(12)

• Let Nep(Z) denote the number of extreme points of the zonotope Z(V ). Then Nep(Z) = O(nd−1).

• All the extreme points of the zonotope Z(V ) can be found by cell enumeration algorithms in discrete geometry with complexity O(nd).

• ⇒ Homogenous problems with fixed rank Q is polynomially solvable.

(13)

Problems with Q Being a tridiagonal matrix are polynomially solvable

(BQP) defined by a series-parallel graph are polynomially solvable

• (BQP ) can be always reduced to a max-cut problem. Con- sider an instance of (BQP ) with

Q =

0 1 0 −1.5

1 0 0 0

0 0 0 −0.5

−1.5 0 −0.5 0

, c =

2.5

−2 3 1.5

(14)

• A graph is called series-parallel if it is not contractible to K4.

• If graph G(Q) is series-parallel, then graph G(Q, c) is not contractible to K5.

(15)

Exploiting Geometric Properties of QBP

• Equivalent perturbed problem:

(Pµ) min

x∈{0,1}n fµ(x) = 1

2xT (Q + µI)x + (c − 1

2µe)T x,

where µ ≥ 0, I is the n × n identity matrix and e is the vector with all elements equal to 1.

– The shape of contour changes when µ changes.

– The properties of (Pµ), e.g., the conditional number, change when µ changes.

(16)

−1 −0.5 0 0.5 1 1.5 2

−4

−3

−2

−1 0 1 2

µ=1 µ=2

µ=3 µ=5

v = fµ(x)

(17)

Properties of Perturbation Problem

• The ellipse contour of fµ(x) with level ˜v:

Eµ(˜v) = {x ∈ Rn | fµ(x) = ˜v}.

• The center of Eµ(˜v):

x0(µ) = −(Q + µI)−1(c − 1

2µe). (2)

• Denote x0 = x0(0) = −Q−1c and t0(µ) = fµ(x0(µ)). The perturbed objective function can be rewritten as

fµ(x) = 1

2(x − x0(µ))T(Q + µI)(x − x0(µ)) + t0(µ). (3)

(18)

Properties of Perturbation Problem

Theorem 1 Assume that x0 6= −12e. (i) t0(µ) is a strictly con- cave function on [0, ∞).

(ii) Let d0(µ) = kx0(µ) − 12ek. Then d0(µ) is a strictly de- creasing function on [0, ∞).

(iii) Let S = {x ∈ Rn | kx− 12ek ≤ 2n}. Let ¯µ be the critical point on the sphere such that kx00) − 12ek = 2n. If x0 6∈ S, then t0(µ) is strictly increasing on [0, ¯µ] and strictly decreasing on [¯µ, ∞). Otherwise, if x0 ∈ S, then t0(µ) is strictly decreasing on [0, ∞).

(19)

Lower Bound Derived from the Minimum Distance Sphere

−0.5 0 0.5 1 1.5

−1

−0.5 0 0.5

1 (0,1) (1,1)

(0,0) (1,0)

x0

f(x) = f (xc)

f(x) = ℓ(0) S0(0)

f(x) = f

xc x2

x1

(20)

Lower Bound Derived from the Minimum Distance Sphere

Theorem 2 Let f denote the optimal value of (P ) and ¯x(µ) be the nearest 0-1 point to x0(µ).

(i) For any µ ≥ 0, a lower bound of f is given by ℓ(µ) = λ1 + µ

2 k¯x(µ) − x0(µ)k2 + t0(µ). (4) (ii) The limit of ℓ(µ) exists:

≡ lim

µ→∞ ℓ(µ) = f (1

2e) + nλ1

8 − (¯x − 1

2e)TQ(x0 − 1 2e).

(21)

Lower Bound derived from Multiple Minimum Distance Spheres centered at the Longest Axis of the Contour

−1.5 −1 −0.5 0 0.5 1 1.5 2 2.5

−4

−3

−2

−1 0 1 2

(22)

Lower Bound

−2.5 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5

−2.5

−2

−1.5

−1

−0.5 0 0.5 1 1.5 2 2.5

α0 α1 α2 α3

z0 z1 z2

Ep(η)

W0 W1 W2 W3

(23)

• Let zi be the intersection point of Ωi and Ωi+1 and define η = min{mini=0,...,m h(zi), h(u1), h(u2)}.

• Let Ψ be the set of all points of {0, 1}n in ∪m+1i=0 Si.

Theorem 3 Let P (¯v) = {x ∈ Rn | f (x) ≤ ¯v}, where ¯v =

1

2η + f (x0)

(i) If P (¯v) ∩ Ψ 6= ∅, Then

˜

x ∈ arg min

x∈P (¯v)∩Ψ f (x) is an optimal solution to (P ).

(ii) If P (¯v) ∩ Ψ = ∅, then ¯v is a lower bound of f.

(24)

SDP Problem

• Example of Linear program and semidefinite program (SDP):

(LP ) min 2x1 + x2 + x3 s.t. x1 + x2 + x3 = 1

(x1, x2, x3) ≥ 0.

(SDP ) min 2x1 + x2 + x3 s.t. x1 + x2 + x3 = 1

 x1 x2 x2 x3



 0.

(25)

Figure 2: Set of 3 × 3 positive semidefinite matrices with unit diagonal

(26)

• General form of SDP problem:

(SDP ) min C • X

s.t. Ai • X = bi, i = 1, . . . , m, X  0,

where C, Ai are given n × n symmetric and bis are given scalars, and

A • X =

n

X

i=1 n

X

j=1

aijxij = T r(ATX).

(27)

• The dual of (SDP) is

(SDD) max bT y s.t.

m

X

i=1

yiAi  C,

where y ∈ Rm. Or equivalently (SDD) max bTy

s.t.

m

X

i=1

yiAi + S = C, S  0.

• Strong duality: v(SDP ) = v(SDD) if (SDP ) or (SDD) is strictly feasible.

(28)

• The SDP interior-point algorithm finds an ǫ-approximate solution where solution time is linear in log(1/ǫ) and poly- nomial in m and n.

• SDP solvers and software based on MatLab: SeDuMi, CVX (n ≤ 1000).

(29)

SDP relaxation for 0-1 QP

Consider the unconstrained case of (P ):

(BQP ) min f (x) = 1

2xTQx + cT x x ∈ {−1, 1}n,

Two ways of constructing SDP relaxations:

• Lifting method. Note that f (x) = 1

2(1, xT )  0 c c Q

  1 x



= 1

2Q•[˜

 1 x



(1, xT)].

(30)

Lifting x to a symmetric matrix X =

 1 x



(1, xT), we get 1

2xTQx + cTx = 1

2Q • X,˜

x ∈ {−1, 1}n⇐⇒Xii = 1, i = 2, . . . , n + 1.

• Note also X =

 1 x



(1, xT )⇐⇒X  0, X11 = 1, rank(X) = 1.

Dropping rank(X) = 1, we obtain the following SDP relax- ation for (BQP ):

(SDP0) min 1

2Q • X˜

s.t. X  0, Xii = 1, i = 1, . . . , n + 1.

(31)

• Shor’s relaxation scheme. Note that (BQP ) is equivalent to

(BQPc) min f (x) = 1

2xT Qx + cTx x2i = 1, i = 1, . . . , n.

The dual problem of (BQP ) is

(D) max

λ∈Rn min

x∈Rn

1

2xT [Q + 2diag(λ)]x + cTx − eT λ which is equivalent to an SDP problem:

(D) max τ

s.t.  Q + 2diag(λ) c

c −2eT λ − 2τ



 0.

• (SDP0)⇐⇒(D) (conic dual).

(32)

• The nature of the SDP relaxation is a nonlinear lifting to a higher dimensional space. Indeed, rather than solving the original quadratic problem in the n-dimensional vector space, we are instead solving a linear problem in the n × n matrix space.

• If we find an optimal solution X of the SDP that has rank one, ⇒ we solve the original problem.

• In general, it is not the case that the optimal solution of the SDP relaxation will be rank one. However, it is possible to use rounding schemes to obtain nearby rank one solutions.

Furthermore, in some cases, it is possible to do so while obtaining some approximation guarantees on the quality of the rounded solutions.

(33)

Goemanns and Williamson’s bound

• Basic questions:

– Approximation guarantees: Is it possible to prove gen- eral properties on the quality of the bounds obtained by SDP?

– Feasible solutions: Can we (somehow) use the SDP re- laxations to provide not just bounds, but actual feasible points with good (or optimal) values of the objective?

• In their celebrated MAXCUT paper (JACM, 1995), Goe- mans and Williamson developed the following randomized method for finding a good feasible cut from the solution of the SDP:

(34)

– Factorize X as X = V TV , where V = [v1, . . . , vn] ∈ Rr×n, where r is the rank of X.

– Then Xij = viT vj, and, since Xii = 1, this factorization gives n vectors vi on the unit sphere in Rr.

– Now, choose a random hyperplane (passing through the origin) in Rr, and assign to each variable xi either +1 or −1, depending on which side of the hyperplane the point vi lies.

• It turns out that this procedure gives a solution that, on average, is quite close to the value of the SDP bound. The random hyperplane can be characterized by its normal vector p, which is chosen to be uniformly distributed on the unit sphere.

(35)

• The rounded solution is given by xi = sign(pTvi). ⇒ The expected value of this solution in xT W x:

Ep[xTW x] = X

i,j

wijEp[xixj]

= X

i,j

wijEp[sign(pTvi) · sign(pT vj)]

= 2

π

X

i,j

wij arcsin Xij.

• The objective of maximum cut is 14 P

i,j wij(1 − yiyj). ⇒ csdp−expected = 1

4 · 2 π

X

i,j

wij arccos Xij.

(36)

• On the other hand, the solution of the SDP gives an upper bound on the cut capacity equal to:

csdp−upperbound = 1 4

X

i,j

wij(1 − Xij).

• Consider the problem of finding a constant α such that α(1 − t) ≤ 2

π arccos(t), ∀t ∈ [−1, 1].

This is

α = min

t∈[−1,1]

2 π

arccos(t)

1 − t = min

θ∈[0,π]

2 π

θ

1 − cos θ.

• It can be shown that 0.87856 < α < 0.87857.

(37)

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 0

0.5 1 1.5 2 2.5

α(1−t) 2/π arccos(t)

• Thus

csdp−upperbound ≤ 1 4· 1

α

X

i,j

wij arccos(Xij) = 1

αcsdp−expected.

(38)

• So far we have the following inequalities:

csdp−upperbound ≤ 1

αcsdp−expected

csdp−expected ≤ cmax cmax ≤ csdp−upperbound

• Therefore

0.878 · csdp−upperbound ≤ cmax

(approximation ratio for SDP bound) 0.878 · cmax ≤ csdp−expected

(approximation ratio for feasible solution)

(39)

Nesterov’s SDP bound

• In the MAXCUT problem, we are in fact maximizing the homogeneous quadratic form (omitting 14):

xT Ax =

n

X

i,j=1

wij(1−xixj) =

n

X

i=1

(

n

X

j=1

wij)x2i

n

X

i,j=1

wijxixj over {−1, 1}n.

• Special properties of A:

– A  0;

– Aij ≤ 0 for all i 6= j;

Pn

j=1 Aij = 0.

(40)

• What happens if A is a general positive semidefinite matrix?

• Let A  0, consider the problem:

(P ) max

x∈{−1,1}n xT Ax.

• The SDP relaxation of (P ) is

(SDP ) max A • X

s.t. diag(X) = e, X  0.

v(SDP )

v(P ) ≥ 1.

• Nesterov (1998). Let A  0. Then

v(P ) ≤ v(SDP ) ≤ π2v(P )

(41)

Improvement of the SDP bound

– Saddle point characterization of optimality conditions:

Let x ∈ {−1, 1}n and λ ∈ Rn. Then x solves (P ), λ solves (D) and v(P ) = v(D) ⇔

(1) Q + 2diag(λ)  0;

(2) [Q + 2diag(λ)]x = −c.

– Let

Q = Q + 2diag(λ),

C = {x ∈ Rn | Qx = −c}.

Then, v(P ) = v(D) ⇔∈ {−1, 1}n ∩ C 6= ∅.

(42)

– A sufficient condition for the polynomial solvability of (P ):

Q ≻ 0 ⇒ v(P ) = v(D), i.e. there is no duality gap.

– What if Q 6≻ 0? The saddle point conditions motivate us to define the distance:

δ = dist({−1, 1}n, C).

– δ = 0⇔ v(P ) = v(D).

– Can we improve the SDP bound v(D) if δ 6= 0?

(43)

– Let

Q = Q + 2diag(λ),

C = {x ∈ Rn | Qx = −c}.

Then, v(P ) − v(D) = 0 ⇔ ∃ x ∈ {−1, 1}n ∩ C.

– This motivates us to define the distance:

δ = dist({−1, 1}n, C).

– δ = 0⇔ v(P ) = v(D).

(44)

• Main result: δ > 0 ⇒ an improved lower bound:

νs = v(D) + 1

r+1δ2,

where 0 < ξr+1 ≤ · · · ≤ ξn are the positive eigenvalues of Q = Q + 2diag(λ).

• Now, the question is how to compute

δ = dist({−1, 1}n, C)?

(45)

C1 C2

C3

C4 C5

C6

C7

C9 C10

Figure 3: Partition of C

(46)

Computation of distance δ

The idea is to partition the set C into certain subsets in each of which the signs of x ∈ C are the same.

The distance between {−1, 1}n and each subset must be achieved at sign(π), where π is an interior point of the subset.

Via enumerating all the subsets and finding an interior point in each of the subsets, we obtain δ.

• Let the solution in Rs of C be given as xj = x0j +

s

X

i=1

Gijzi = 0, i = 1, . . . , n. (5)

(47)

• Consider n hyperplanes in Rs: x0j +

s

X

i=1

Gijzi = 0, i = 1, . . . , n, (6) which consist of an arrangement of hyperplanes (in a term of geometric computation). Finding all Tx’s corresponds to finding all cells of the arrangement.

• It has been known that the number of cells of the hyperplane arrangement is O(ns).

• For fixed s, the distance δ can be computed in polynomial time. Another polynomially solvable case is s = n − 1.

• Solution method for cell enumeration: Reverse search algo- rithm (Avis and Fukuda (1996), Sleumer (1999))

Create an edge in T2 between two vertices if their corr- esponding faces in G share an edge in G that is not in T1 6.. Let e be the lone edge of G in the face corresponding to v

For 5 to be the precise limit of f(x) as x approaches 3, we must not only be able to bring the difference between f(x) and 5 below each of these three numbers; we must be able

[This function is named after the electrical engineer Oliver Heaviside (1850–1925) and can be used to describe an electric current that is switched on at time t = 0.] Its graph

As discussed in the example (Figure 3.3), the goal of the min-sum algorithm is to find a valid configuration such that the sum of the local costs (over all sites and check sets) is

5 Create features of V1,V2 and testing data sets for validation set blending, including the predictions of models in step 2 and some optional extra features.. 6 Treat V1 as the

• The Java programming language is based on the virtual machine concept. • A program written in the Java language is translated by a Java compiler into Java

Numerical experiments are done for a class of quasi-convex optimization problems where the function f (x) is a composition of a quadratic convex function from IR n to IR and

The CME drastically changes the time evolution of the chiral fluid in a B-field. - Chiral fluid is not stable against a small perturbation on v

Since the generalized Fischer-Burmeister function ψ p is quasi-linear, the quadratic penalty for equilibrium constraints will make the convexity of the global smoothing function

From all the above, φ is zero only on the nonnegative sides of the a, b-axes. Hence, φ is an NCP function.. Graph of g functions given in Example 3.9.. Graphs of generated NCP

Miroslav Fiedler, Praha, Algebraic connectivity of graphs, Czechoslovak Mathematical Journal 23 (98) 1973,

Given a connected graph G together with a coloring f from the edge set of G to a set of colors, where adjacent edges may be colored the same, a u-v path P in G is said to be a

The vertex-cover problem is to find a vertex cover of minimum size in a given undirected graph. • 此問題的decision版本為NP-Complete

A Boolean function described by an algebraic expression consists of binary variables, the constant 0 and 1, and the logic operation symbols.. For a given value of the binary

 If SAT can be solved in deterministic polynomial time, then so can any NP problems  SAT ∈ NP-hard..  If A is an NP-hard problem and B can be reduced from A, then B is an

The well-known halting problem (a decision problem), which is to determine whether or not an algorithm will terminate with a given input, is NP-hard, but

The min-max and the max-min k-split problem are defined similarly except that the objectives are to minimize the maximum subgraph, and to maximize the minimum subgraph respectively..

Given a graph and a set of p sources, the problem of finding the minimum routing cost spanning tree (MRCT) is NP-hard for any constant p &gt; 1 .. When p = 1, i.e., there is only

There is no general formula for counting the number of transitive binary relations on A... The poset A in the above example is not

• Definition: A binary tree is a finite set of nodes that is either empty or consists of a root and two disjoint binary trees called the left subtree and t he right subtree2.

• Any node that does not have a local replica of the object periodically creates a QoS-advert message contains (a) its δ i deadline value and (b) depending-on , the ID of the node

Experiment a little with the Hello program. It will say that it has no clue what you mean by ouch. The exact wording of the error message is dependent on the compiler, but it might

To decide the correspondence between different sets of fea- ture points and to consider the binary relationships of point pairs at the same time, we construct a graph for each set