Duan Li
Department of Systems Engineering & Engineering Management
The Chinese University of Hong Kong
At National Cheng Kung University
March 24, 2011
Outline
• Binary quadratic program and maximum cut problem
• Polynomially solvable subclasses of BQP
• Exploiting geometric properties of BQP
• SDP relaxation
• Goemans and Williamson’s bound and Nesterov’s SDP bound
• Further improvement of the SDP bound
Binary quadratic optimization and max-cut problem
• A general 0-1 or binary quadratic problem can be expressed as:
(P)   min f(x) = (1/2) x^T Q x + c^T x
      s.t. x ∈ X,
where Q is a symmetric n × n matrix, c ∈ R^n and X is either {−1, 1}^n or {0, 1}^n. Note that x_i^2 = 1 if x_i ∈ {−1, 1}, and x_i^2 = x_i if x_i ∈ {0, 1}.
Problem (P) is in general NP-hard even if the matrix Q is positive definite, since x^T Q x = x^T (Q + diag(λ)) x − e^T λ if x_i ∈ {−1, 1}, or x^T Q x = x^T (Q + diag(λ)) x − λ^T x if x_i ∈ {0, 1}.
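As a quick numerical check of these identities, the following sketch (using NumPy with an arbitrary random instance, not data from the source) verifies both of them on all binary points:

```python
import itertools
import numpy as np

# Arbitrary small instance (illustrative only).
np.random.seed(0)
n = 4
Q = np.random.randn(n, n)
Q = (Q + Q.T) / 2                      # symmetric Q

# Shift the diagonal enough to make Q + diag(lam) positive definite.
lam = (1.0 - np.linalg.eigvalsh(Q).min()) * np.ones(n)
Q_shift = Q + np.diag(lam)
assert np.linalg.eigvalsh(Q_shift).min() > 0

# On {-1,1}^n:  x^T Q x = x^T (Q + diag(lam)) x - e^T lam
for x in itertools.product([-1.0, 1.0], repeat=n):
    x = np.array(x)
    assert np.isclose(x @ Q @ x, x @ Q_shift @ x - lam.sum())

# On {0,1}^n:  x^T Q x = x^T (Q + diag(lam)) x - lam^T x
for x in itertools.product([0.0, 1.0], repeat=n):
    x = np.array(x)
    assert np.isclose(x @ Q @ x, x @ Q_shift @ x - lam @ x)

print("convexification identities verified on all binary points")
```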
Max-cut problem
– Given a graph G = (V, E), find a partition of the nodes of V into two disjoint sets V1 and V2 (V1 ∩ V2 = ∅, V1 ∪ V2 = V) in such a way as to maximize the total weight of the edges that have one endpoint in V1 and the other in V2.
– Let w_ij be the weight of edge (i, j), with w_ij = 0 if nodes i and j are not connected.
Define:
y_i = −1 ⇐⇒ i ∈ V1,   y_i = 1 ⇐⇒ i ∈ V2.
– The max-cut problem can be written as:
  max (1/4) Σ_{i,j} w_ij (1 − y_i y_j)
  s.t. y_i ∈ {−1, 1},  i = 1, . . . , n.
– The max-cut problem is also APX-hard, meaning that there is no polynomial-time (ε-)approximation scheme (PTAS) for it unless P = NP.
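For a small graph, the (1 − y_i y_j) formulation can be checked by brute force; the sketch below uses an arbitrary 4-node weight matrix and is of course not a practical max-cut algorithm:

```python
import itertools
import numpy as np

# Arbitrary symmetric weight matrix of a 4-node graph (w_ij = 0 means no edge).
W = np.array([[0, 1, 0, 2],
              [1, 0, 3, 0],
              [0, 3, 0, 1],
              [2, 0, 1, 0]], dtype=float)
n = W.shape[0]

def cut_value(y):
    """Cut weight via the identity (1/4) * sum_{i,j} w_ij (1 - y_i y_j)."""
    return 0.25 * np.sum(W * (1 - np.outer(y, y)))

best = max((cut_value(np.array(y)), y)
           for y in itertools.product([-1, 1], repeat=n))
print("max cut value:", best[0], "  partition labels:", best[1])
```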
Maximal stable set
– Given an undirected graph G = (V, E).
Figure 1: Maximal stable set of 9 blue nodes in a 24-node graph
– Independent (stable) set of a graph: a subset of nodes no two of which are linked by an edge in E. Maximal stable set: the independent set with the maximum number of nodes; its size is denoted α(G).
– Define
(SQP)   min x^T (I + A) x
        s.t. e^T x = 1, x ≥ 0,
where A is the adjacency matrix of G. Then α(G) = 1 / v(SQP).
– The maximal stable set problem is NP-hard and is closely related to the quadratic knapsack problem.
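The relation α(G) = 1/v(SQP) can be illustrated on a small graph. The sketch below (arbitrary 5-cycle, not from the source) brute-forces α(G) and verifies that the uniform weight vector supported on a maximum stable set attains the value 1/α(G) in (SQP); it only exhibits a feasible point with that value rather than solving the nonconvex QP globally:

```python
import itertools
import numpy as np

# Arbitrary small graph: a 5-cycle.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1

def is_stable(S):
    return all(A[i, j] == 0 for i in S for j in S if i < j)

# Brute-force a maximum stable set.
best = max((S for r in range(n + 1)
            for S in itertools.combinations(range(n), r) if is_stable(S)),
           key=len)
alpha = len(best)

# Uniform weights on the stable set attain x^T (I + A) x = 1/alpha.
x = np.zeros(n)
x[list(best)] = 1.0 / alpha
val = x @ (np.eye(n) + A) @ x
print("alpha(G) =", alpha, "  x^T(I+A)x at uniform point =", val, "  1/alpha =", 1 / alpha)
```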
Polynomially solvable subclasses of BQP
• As the binary quadratic programming problem is NP-hard in general, identifying polynomially solvable subclasses of binary quadratic programming problems not only offers theoretical insight into the complicated nature of the problem, but also provides platforms for designing relaxation schemes for exact solution methods.
Problems with all off-diagonal elements of Q being non-positive
• Consider a subclass of problem (0-1QP ) where all off-diagonal elements of Q are non-positive and qii = 0, i = 1, . . ., n.
• Since x_i x_j = min(x_i, x_j) when x_i, x_j ∈ {0, 1}, and q_ij ≤ 0 for 1 ≤ i < j ≤ n, the problem reduces to a linear programming problem with a totally unimodular constraint matrix and an integral right-hand side:
  min  Σ_{i=1}^n c_i x_i + Σ_{1≤i<j≤n} q_ij z_ij
  s.t. z_ij ≤ x_i,  1 ≤ i < j ≤ n,
       z_ij ≤ x_j,  1 ≤ i < j ≤ n,
       x_i, x_j, z_ij ∈ {0, 1},  1 ≤ i < j ≤ n.
• The LP relaxation of this formulation generates an integer optimal solution! (See the sketch below.)
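A minimal sketch of this claim with scipy.optimize.linprog: relax all variables to [0, 1] and observe that the returned vertex solution is integral, as total unimodularity guarantees (the data below are arbitrary, with q_ij ≤ 0 off-diagonal):

```python
import numpy as np
from scipy.optimize import linprog

# Arbitrary data: q_ij <= 0 off-diagonal, q_ii = 0.
n = 4
c = np.array([1.0, -2.0, 0.5, -1.0])
q = {(0, 1): -3.0, (0, 2): -1.0, (1, 3): -2.0, (2, 3): -0.5}  # all other q_ij are zero

pairs = list(q.keys())
nz = len(pairs)
# Variable order: x_0 .. x_{n-1}, then one z for each pair (i, j).
obj = np.concatenate([c, [q[p] for p in pairs]])

# Constraints z_ij - x_i <= 0 and z_ij - x_j <= 0 (each row has one +1 and one -1).
A_ub, b_ub = [], []
for k, (i, j) in enumerate(pairs):
    for node in (i, j):
        row = np.zeros(n + nz)
        row[n + k] = 1.0
        row[node] = -1.0
        A_ub.append(row)
        b_ub.append(0.0)

res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0, 1)] * (n + nz), method="highs")
print("LP optimum:", res.fun)
print("vertex solution (integral by total unimodularity):", np.round(res.x, 6))
```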
Homogeneous problems with fixed rank Q
• We now consider a subclass of homogeneous problems where Q is negative semidefinite and rank(Q) = d.
• Let G = −Q ⇒ there exists a full row rank d × n matrix V such that G = V^T V.
• ⇒ Equivalent problem:
  (BQP_fr)   max_{x∈{0,1}^n} x^T G x = x^T V^T V x = Σ_{i=1}^d (v_i x)^2,   (1)
  where v_i is the i-th row of matrix V.
• If d = 1, i.e., the matrix G is of rank one with G = v_1^T v_1, the solution to (BQP_fr) can be easily found by inspection. More specifically, we only need to select x such that the absolute value of v_1 x is maximized over {0, 1}^n.
• In general cases with rank(G) = d > 1, consider the linear map Φ : R^n → R^d, Φ(x) = V x, which maps the hypercube [0, 1]^n onto the convex polytope Z(V) = Φ([0, 1]^n) = {z ∈ R^d | z = V x, x ∈ [0, 1]^n}, known as a zonotope.
• Note that
  max_{x∈{0,1}^n} x^T G x = max_{x∈{0,1}^n} Σ_{i=1}^d (v_i x)^2 = max_{z∈Z(V)} Σ_{i=1}^d z_i^2 = max_{z∈Z(V)} ||z||^2.
  Thus, (BQP_fr) reduces to the problem of finding the maximum norm point in a zonotope.
• Let N_ep(Z) denote the number of extreme points of the zonotope Z(V). Then N_ep(Z) = O(n^{d−1}).
• All the extreme points of the zonotope Z(V) can be found by cell enumeration algorithms from discrete geometry with complexity O(n^d).
• ⇒ Homogeneous problems with fixed-rank Q are polynomially solvable (a brute-force illustration for small n follows).
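A brute-force illustration for small n (arbitrary rank-2 data): it evaluates max_{x∈{0,1}^n} x^T G x directly and reports the corresponding zonotope point z = Vx; it does not implement the O(n^d) cell-enumeration algorithm mentioned above:

```python
import itertools
import numpy as np

# Arbitrary rank-2 example: G = V^T V with V of size d x n.
np.random.seed(1)
d, n = 2, 6
V = np.random.randn(d, n)
G = V.T @ V            # G is PSD with rank d, so Q = -G is negative semidefinite

best_val, best_x = max(
    ((np.linalg.norm(V @ np.array(x)) ** 2, x)
     for x in itertools.product([0.0, 1.0], repeat=n)),
    key=lambda t: t[0])

print("max x^T G x over {0,1}^n:", best_val)
print("attained at x =", best_x, " i.e. at the zonotope point z = V x =", V @ np.array(best_x))
```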
Problems with Q Being a tridiagonal matrix are polynomially solvable
(BQP) defined by a series-parallel graph are polynomially solvable
• (BQP) can always be reduced to a max-cut problem. Consider an instance of (BQP) with
Q = [   0     1     0   −1.5
        1     0     0     0
        0     0     0   −0.5
      −1.5    0   −0.5     0  ],    c = (2.5, −2, 3, 1.5)^T.
• A graph is called series-parallel if it is not contractible to K4.
• If graph G(Q) is series-parallel, then graph G(Q, c) is not contractible to K5.
Exploiting Geometric Properties of BQP
• Equivalent perturbed problem:
(P_µ)   min_{x∈{0,1}^n} f_µ(x) = (1/2) x^T (Q + µI) x + (c − (1/2)µe)^T x,
where µ ≥ 0, I is the n × n identity matrix and e is the vector with all elements equal to 1.
– The shape of the contours changes as µ changes.
– The properties of (P_µ), e.g., the condition number, change as µ changes.
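Since x_i^2 = x_i on {0,1}^n, the perturbation leaves the objective unchanged at every binary point; a quick NumPy check with arbitrary data:

```python
import itertools
import numpy as np

np.random.seed(2)
n = 4
Q = np.random.randn(n, n); Q = (Q + Q.T) / 2
c = np.random.randn(n)
e = np.ones(n)

def f(x, mu=0.0):
    """Perturbed objective f_mu; mu = 0 gives the original f."""
    return 0.5 * x @ (Q + mu * np.eye(n)) @ x + (c - 0.5 * mu * e) @ x

for x in itertools.product([0.0, 1.0], repeat=n):
    x = np.array(x)
    for mu in (0.0, 1.0, 5.0):
        assert np.isclose(f(x, mu), f(x, 0.0))   # f_mu(x) = f(x) on {0,1}^n

print("f_mu agrees with f on all binary points")
```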
[Figure: contours v = f_µ(x) of the perturbed objective for µ = 1, 2, 3, 5]
Properties of Perturbation Problem
• The ellipsoidal contour of f_µ(x) at level ṽ:
  E_µ(ṽ) = {x ∈ R^n | f_µ(x) = ṽ}.
• The center of E_µ(ṽ):
  x_0(µ) = −(Q + µI)^{−1}(c − (1/2)µe).   (2)
• Denote x_0 = x_0(0) = −Q^{−1}c and t_0(µ) = f_µ(x_0(µ)). The perturbed objective function can be rewritten as
  f_µ(x) = (1/2)(x − x_0(µ))^T (Q + µI)(x − x_0(µ)) + t_0(µ).   (3)
Properties of Perturbation Problem
Theorem 1  Assume that x_0 ≠ (1/2)e.
(i) t_0(µ) is a strictly concave function on [0, ∞).
(ii) Let d_0(µ) = ||x_0(µ) − (1/2)e||. Then d_0(µ) is a strictly decreasing function on [0, ∞).
(iii) Let S = {x ∈ R^n | ||x − (1/2)e|| ≤ √n/2}. Let µ̄ be the critical value such that ||x_0(µ̄) − (1/2)e|| = √n/2. If x_0 ∉ S, then t_0(µ) is strictly increasing on [0, µ̄] and strictly decreasing on [µ̄, ∞). Otherwise, if x_0 ∈ S, then t_0(µ) is strictly decreasing on [0, ∞).
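A numerical illustration of Theorem 1 on one instance (arbitrary data, with Q taken positive definite for this illustration so that Q + µI is nonsingular on the whole grid); it only samples d_0(µ) and t_0(µ), it is not a proof:

```python
import numpy as np

np.random.seed(3)
n = 4
B = np.random.randn(n, n)
Q = B @ B.T + 0.5 * np.eye(n)          # positive definite for this illustration
c = np.random.randn(n)
e = np.ones(n)

def x0(mu):
    return -np.linalg.solve(Q + mu * np.eye(n), c - 0.5 * mu * e)

def f_mu(x, mu):
    return 0.5 * x @ (Q + mu * np.eye(n)) @ x + (c - 0.5 * mu * e) @ x

mus = np.linspace(0.0, 20.0, 81)
d0 = np.array([np.linalg.norm(x0(m) - 0.5 * e) for m in mus])
t0 = np.array([f_mu(x0(m), m) for m in mus])

print("d_0(mu) strictly decreasing on the grid:", bool(np.all(np.diff(d0) < 0)))
print("t_0(mu) concave on the grid (second differences <= 0):",
      bool(np.all(np.diff(t0, 2) <= 1e-9)))
```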
Lower Bound Derived from the Minimum Distance Sphere
[Figure: contours f(x) = f(x_c), f(x) = ℓ(0) and f(x) = f* over the unit square, with the center x_0, the nearest 0-1 point x_c, and the minimum distance sphere S_0(0)]
Lower Bound Derived from the Minimum Distance Sphere
Theorem 2  Let f* denote the optimal value of (P) and x̄(µ) be the nearest 0-1 point to x_0(µ).
(i) For any µ ≥ 0, a lower bound of f* is given by
  ℓ(µ) = ((λ_1 + µ)/2) ||x̄(µ) − x_0(µ)||^2 + t_0(µ).   (4)
(ii) The limit of ℓ(µ) exists:
  ℓ_∞ ≡ lim_{µ→∞} ℓ(µ) = f((1/2)e) + nλ_1/8 − (x̄_∞ − (1/2)e)^T Q (x_0 − (1/2)e).
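A sketch checking the lower bound ℓ(µ) ≤ f* on one instance. It takes λ_1 as the smallest eigenvalue of Q (an assumption consistent with the derivation, not stated explicitly above), takes x̄(µ) as the componentwise rounding of x_0(µ), and keeps µ large enough that λ_1 + µ > 0:

```python
import itertools
import numpy as np

np.random.seed(4)
n = 6
Q = np.random.randn(n, n); Q = (Q + Q.T) / 2
c = np.random.randn(n)
e = np.ones(n)
lam1 = np.linalg.eigvalsh(Q).min()      # assumed meaning of lambda_1

def f(x):
    return 0.5 * x @ Q @ x + c @ x

def lower_bound(mu):
    x0m = -np.linalg.solve(Q + mu * np.eye(n), c - 0.5 * mu * e)
    t0 = 0.5 * x0m @ (Q + mu * np.eye(n)) @ x0m + (c - 0.5 * mu * e) @ x0m
    xbar = (x0m >= 0.5).astype(float)   # nearest 0-1 point, componentwise
    return 0.5 * (lam1 + mu) * np.linalg.norm(xbar - x0m) ** 2 + t0

f_star = min(f(np.array(x)) for x in itertools.product([0.0, 1.0], repeat=n))
for mu in [max(0.0, -lam1) + s for s in (0.01, 1.0, 5.0, 50.0)]:
    print(f"mu = {mu:7.2f}   l(mu) = {lower_bound(mu):9.4f}   f* = {f_star:9.4f}")
```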
Lower Bound Derived from Multiple Minimum Distance Spheres Centered at the Longest Axis of the Contour
[Figure: the longest axis of the contour E_p(η) covered by spheres W_0, W_1, W_2, W_3 centered at α_0, α_1, α_2, α_3, with intersection points z_0, z_1, z_2]
• Let zi be the intersection point of Ωi and Ωi+1 and define η = min{mini=0,...,m h(zi), h(u1), h(u2)}.
• Let Ψ be the set of all points of {0, 1}^n in ∪_{i=0}^{m+1} S_i.
Theorem 3  Let P(v̄) = {x ∈ R^n | f(x) ≤ v̄}, where v̄ = (1/2)η + f(x_0).
(i) If P(v̄) ∩ Ψ ≠ ∅, then x̃ ∈ arg min_{x ∈ P(v̄)∩Ψ} f(x) is an optimal solution to (P).
(ii) If P(v̄) ∩ Ψ = ∅, then v̄ is a lower bound of f*.
SDP Problem
• Examples of a linear program (LP) and a semidefinite program (SDP):
  (LP)   min 2x_1 + x_2 + x_3
         s.t. x_1 + x_2 + x_3 = 1,
              (x_1, x_2, x_3) ≥ 0.
  (SDP)  min 2x_1 + x_2 + x_3
         s.t. x_1 + x_2 + x_3 = 1,
              [ x_1  x_2
                x_2  x_3 ] ⪰ 0.
Figure 2: Set of 3 × 3 positive semidefinite matrices with unit diagonal
• General form of SDP problem:
(SDP)   min C • X
        s.t. A_i • X = b_i,  i = 1, . . . , m,
             X ⪰ 0,
where C and the A_i are given n × n symmetric matrices, the b_i are given scalars, and
  A • X = Σ_{i=1}^n Σ_{j=1}^n a_ij x_ij = Tr(A^T X).
• The dual of (SDP) is
(SDD)   max b^T y
        s.t. Σ_{i=1}^m y_i A_i ⪯ C,
where y ∈ R^m. Or equivalently,
(SDD)   max b^T y
        s.t. Σ_{i=1}^m y_i A_i + S = C,  S ⪰ 0.
• Strong duality: v(SDP ) = v(SDD) if (SDP ) or (SDD) is strictly feasible.
• The SDP interior-point algorithm finds an ε-approximate solution, where the solution time is linear in log(1/ε) and polynomial in m and n.
• SDP solvers and MATLAB-based modeling software: SeDuMi, CVX (n ≤ 1000).
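The small (SDP) example above can be written directly in CVXPY, a Python modelling layer analogous to the MATLAB-based CVX mentioned here (a minimal sketch assuming cvxpy and a default conic solver are installed):

```python
import cvxpy as cp

# Identify x1 = X[0,0], x2 = X[0,1] = X[1,0], x3 = X[1,1].
X = cp.Variable((2, 2), symmetric=True)
prob = cp.Problem(cp.Minimize(2 * X[0, 0] + X[0, 1] + X[1, 1]),
                  [X[0, 0] + X[0, 1] + X[1, 1] == 1, X >> 0])
prob.solve()
print("optimal value:", prob.value)
print("x1, x2, x3 =", X.value[0, 0], X.value[0, 1], X.value[1, 1])
```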
SDP relaxation for 0-1 QP
Consider the unconstrained case of (P ):
(BQP)   min f(x) = (1/2) x^T Q x + c^T x
        s.t. x ∈ {−1, 1}^n.
Two ways of constructing SDP relaxations:
• Lifting method. Let Q̃ = [ 0   c^T
                             c    Q  ]. Note that
  f(x) = (1/2)(1, x^T) Q̃ (1, x^T)^T = (1/2) Q̃ • [(1, x^T)^T (1, x^T)].
  Lifting x to the symmetric matrix X = (1, x^T)^T (1, x^T), we get
  (1/2) x^T Q x + c^T x = (1/2) Q̃ • X,
  x ∈ {−1, 1}^n ⇐⇒ X_ii = 1, i = 2, . . . , n + 1.
• Note also that X = (1, x^T)^T (1, x^T) ⇐⇒ X ⪰ 0, X_11 = 1, rank(X) = 1.
  Dropping rank(X) = 1, we obtain the following SDP relaxation for (BQP):
  (SDP_0)   min (1/2) Q̃ • X
            s.t. X ⪰ 0, X_ii = 1, i = 1, . . . , n + 1.
• Shor’s relaxation scheme. Note that (BQP) is equivalent to
  (BQP_c)   min f(x) = (1/2) x^T Q x + c^T x
            s.t. x_i^2 = 1, i = 1, . . . , n.
  The Lagrangian dual problem of (BQP_c) is
  (D)   max_{λ∈R^n} min_{x∈R^n} (1/2) x^T [Q + 2 diag(λ)] x + c^T x − e^T λ,
  which is equivalent to an SDP problem:
  (D)   max τ
        s.t. [ Q + 2 diag(λ)    c
               c^T              −2e^T λ − 2τ ] ⪰ 0.
• (SDP_0) ⇐⇒ (D) (they form a conic primal-dual pair).
• The nature of the SDP relaxation is a nonlinear lifting to a higher-dimensional space. Indeed, rather than solving the original quadratic problem in the n-dimensional vector space, we solve a linear problem in the space of (n + 1) × (n + 1) matrices.
• If we find an optimal solution X of the SDP that has rank one, then we have solved the original problem.
• In general, it is not the case that the optimal solution of the SDP relaxation will be rank one. However, it is possible to use rounding schemes to obtain nearby rank one solutions.
Furthermore, in some cases, it is possible to do so while obtaining some approximation guarantees on the quality of the rounded solutions.
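A CVXPY sketch of the lifted relaxation (SDP_0) on a small random instance, compared against brute-force enumeration of {−1, 1}^n (the data are arbitrary; assumes cvxpy is installed):

```python
import itertools
import numpy as np
import cvxpy as cp

np.random.seed(5)
n = 6
Q = np.random.randn(n, n); Q = (Q + Q.T) / 2
c = np.random.randn(n)

# Q_tilde = [[0, c^T], [c, Q]], the (n+1) x (n+1) lifted matrix.
Qt = np.zeros((n + 1, n + 1))
Qt[0, 1:] = c
Qt[1:, 0] = c
Qt[1:, 1:] = Q

X = cp.Variable((n + 1, n + 1), symmetric=True)
prob = cp.Problem(cp.Minimize(0.5 * cp.trace(Qt @ X)),
                  [X >> 0, cp.diag(X) == 1])
prob.solve()

f_star = min(0.5 * np.array(x) @ Q @ np.array(x) + c @ np.array(x)
             for x in itertools.product([-1.0, 1.0], repeat=n))
print("SDP_0 lower bound:", prob.value, "   exact minimum:", f_star)
```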
Goemans and Williamson’s bound
• Basic questions:
– Approximation guarantees: Is it possible to prove general properties on the quality of the bounds obtained by SDP?
– Feasible solutions: Can we (somehow) use the SDP re- laxations to provide not just bounds, but actual feasible points with good (or optimal) values of the objective?
• In their celebrated MAXCUT paper (JACM, 1995), Goemans and Williamson developed the following randomized method for finding a good feasible cut from the solution of the SDP:
– Factorize X as X = V^T V, where V = [v_1, . . . , v_n] ∈ R^{r×n} and r is the rank of X.
– Then X_ij = v_i^T v_j, and, since X_ii = 1, this factorization gives n vectors v_i on the unit sphere in R^r.
– Now, choose a random hyperplane (passing through the origin) in Rr, and assign to each variable xi either +1 or −1, depending on which side of the hyperplane the point vi lies.
• It turns out that this procedure gives a solution that, on average, is quite close to the value of the SDP bound. The random hyperplane can be characterized by its normal vector p, which is chosen to be uniformly distributed on the unit sphere.
• The rounded solution is given by x_i = sign(p^T v_i). ⇒ The expected value of this solution in x^T W x is
  E_p[x^T W x] = Σ_{i,j} w_ij E_p[x_i x_j]
               = Σ_{i,j} w_ij E_p[sign(p^T v_i) · sign(p^T v_j)]
               = (2/π) Σ_{i,j} w_ij arcsin X_ij.
• The objective of maximum cut is (1/4) Σ_{i,j} w_ij (1 − y_i y_j). ⇒
  c_sdp-expected = (1/4) · (2/π) Σ_{i,j} w_ij arccos X_ij.
• On the other hand, the solution of the SDP gives an upper bound on the cut capacity equal to:
  c_sdp-upperbound = (1/4) Σ_{i,j} w_ij (1 − X_ij).
• Consider the problem of finding a constant α such that
  α(1 − t) ≤ (2/π) arccos(t),  ∀t ∈ [−1, 1].
  This is
  α = min_{t∈[−1,1]} (2/π) arccos(t) / (1 − t) = min_{θ∈[0,π]} (2/π) θ / (1 − cos θ).
• It can be shown that 0.87856 < α < 0.87857.
[Figure: plots of α(1 − t) and (2/π) arccos(t) over t ∈ [−1, 1]]
• Thus
  c_sdp-upperbound ≤ (1/4) · (1/α) · (2/π) Σ_{i,j} w_ij arccos(X_ij) = (1/α) c_sdp-expected.
• So far we have the following inequalities:
  c_sdp-upperbound ≤ (1/α) c_sdp-expected,
  c_sdp-expected ≤ c_max,
  c_max ≤ c_sdp-upperbound.
• Therefore
  0.878 · c_sdp-upperbound ≤ c_max   (approximation ratio for the SDP bound),
  0.878 · c_max ≤ c_sdp-expected   (approximation ratio for the feasible solution).
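The whole Goemans-Williamson pipeline can be sketched in a few lines: solve the max-cut SDP with CVXPY, factor X = V^T V via an eigendecomposition, and round with random hyperplanes. The weight matrix below is arbitrary, and the best of several rounding trials is reported; by the expectation argument above, the ratio to the SDP upper bound is typically around α ≈ 0.878 or better:

```python
import numpy as np
import cvxpy as cp

np.random.seed(6)
n = 8
W = np.random.rand(n, n)
W = np.triu(W, 1); W = W + W.T               # arbitrary nonnegative symmetric weights

# Max-cut SDP relaxation: max (1/4) sum_ij w_ij (1 - X_ij), diag(X) = e, X psd.
X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(0.25 * cp.sum(cp.multiply(W, 1 - X))),
                  [cp.diag(X) == 1, X >> 0])
prob.solve()
sdp_upper = prob.value

# Factor X = V^T V (columns v_i are unit vectors in R^r).
eigvals, U = np.linalg.eigh(X.value)
V = (U * np.sqrt(np.clip(eigvals, 0, None))).T

def cut_value(x):
    return 0.25 * np.sum(W * (1 - np.outer(x, x)))

# Random hyperplane rounding: x_i = sign(p^T v_i); keep the best of several trials.
best = -np.inf
for _ in range(100):
    p = np.random.randn(V.shape[0])
    x = np.sign(V.T @ p)
    x[x == 0] = 1.0
    best = max(best, cut_value(x))

print("SDP upper bound :", sdp_upper)
print("best rounded cut:", best)
print("ratio           :", best / sdp_upper)   # typically around 0.878 or better
```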
Nesterov’s SDP bound
• In the MAXCUT problem, we are in fact maximizing the homogeneous quadratic form (omitting the factor 1/4):
  x^T A x = Σ_{i,j=1}^n w_ij (1 − x_i x_j) = Σ_{i=1}^n (Σ_{j=1}^n w_ij) x_i^2 − Σ_{i,j=1}^n w_ij x_i x_j
over {−1, 1}^n.
• Special properties of A:
  – A ⪰ 0;
  – A_ij ≤ 0 for all i ≠ j;
  – Σ_{j=1}^n A_ij = 0.
• What happens if A is a general positive semidefinite matrix?
• Let A ⪰ 0 and consider the problem:
  (P)   max_{x∈{−1,1}^n} x^T A x.
• The SDP relaxation of (P) is
  (SDP)   max A • X
          s.t. diag(X) = e, X ⪰ 0.
  Then v(SDP)/v(P) ≥ 1.
• Nesterov (1998). Let A ⪰ 0. Then
  v(P) ≤ v(SDP) ≤ (π/2) v(P).
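A quick sanity check of Nesterov's bound on a random A ⪰ 0 (brute force for v(P), CVXPY for v(SDP); small n only, arbitrary data):

```python
import itertools
import numpy as np
import cvxpy as cp

np.random.seed(7)
n = 6
B = np.random.randn(n, n)
A = B @ B.T                                   # A is positive semidefinite

# v(P): brute force over {-1,1}^n (feasible only for small n).
v_P = max(np.array(x) @ A @ np.array(x)
          for x in itertools.product([-1.0, 1.0], repeat=n))

# v(SDP): max A . X  s.t. diag(X) = e, X psd.
X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(A @ X)), [cp.diag(X) == 1, X >> 0])
prob.solve()

print("v(P) =", v_P, "  v(SDP) =", prob.value,
      "  ratio =", prob.value / v_P, " (Nesterov: ratio <= pi/2 ~ 1.571)")
```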
Improvement of the SDP bound
– Saddle point characterization of optimality conditions:
Let x∗ ∈ {−1, 1}n and λ∗ ∈ Rn. Then x∗ solves (P ), λ∗ solves (D) and v(P ) = v(D) ⇔
(1) Q + 2 diag(λ*) ⪰ 0;
(2) [Q + 2diag(λ∗)]x∗ = −c.
– Let
  Q* = Q + 2 diag(λ*),
  C = {x ∈ R^n | Q* x = −c}.
  Then, v(P) = v(D) ⇔ {−1, 1}^n ∩ C ≠ ∅.
– A sufficient condition for the polynomial solvability of (P ):
Q∗ ≻ 0 ⇒ v(P ) = v(D), i.e. there is no duality gap.
– What if Q* ⊁ 0? The saddle point conditions motivate us to define the distance:
δ = dist({−1, 1}n, C).
– δ = 0 ⇔ v(P) = v(D).
– Can we improve the SDP bound v(D) if δ 6= 0?
• Main result: δ > 0 ⇒ an improved lower bound:
  ν_s = v(D) + (1/4) ξ_{r+1} δ^2,
  where 0 < ξ_{r+1} ≤ · · · ≤ ξ_n are the positive eigenvalues of Q* = Q + 2 diag(λ*).
• Now, the question is how to compute δ = dist({−1, 1}^n, C).
Figure 3: Partition of C into subsets C_1, C_2, . . .
Computation of distance δ
The idea is to partition the set C into certain subsets in each of which the signs of x ∈ C are the same.
⇓
The distance between {−1, 1}n and each subset must be achieved at sign(π), where π is an interior point of the subset.
⇓
Via enumerating all the subsets and finding an interior point in each of the subsets, we obtain δ.
• Let the solution set C be parametrized in R^s as
  x_j = x_j^0 + Σ_{i=1}^s G_ij z_i,  j = 1, . . . , n.   (5)
• Consider the n hyperplanes in R^s:
  x_j^0 + Σ_{i=1}^s G_ij z_i = 0,  j = 1, . . . , n,   (6)
  which form an arrangement of hyperplanes (a term from computational geometry). Finding all the T_x's corresponds to finding all cells of this arrangement.
• It is known that the number of cells of the hyperplane arrangement is O(n^s).
• For fixed s, the distance δ can be computed in polynomial time. Another polynomially solvable case is s = n − 1.
• Solution method for cell enumeration: the reverse search algorithm (Avis and Fukuda (1996), Sleumer (1999)).
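For small n, δ can also be obtained by brute force: project every point of {−1, 1}^n onto the affine set C = {x | Q* x = −c} and take the minimum distance. The sketch below uses hypothetical stand-ins for Q* and c (it does not compute λ* or implement the reverse search cell enumeration described above) and is exponential in n:

```python
import itertools
import numpy as np

# Hypothetical stand-ins for Q* = Q + 2 diag(lambda*) and c (not from a real dual solution).
np.random.seed(8)
n = 5
M = np.random.randn(n, 3)
Q_star = M @ M.T                   # PSD with rank 3, so C is a 2-dimensional affine set
c = -Q_star @ np.random.randn(n)   # chosen so that Q* x = -c is consistent

Qp = np.linalg.pinv(Q_star)

def dist_to_C(v):
    # Projection of v onto {x : Q* x = -c} is v - Q*^+ (Q* v + c).
    return np.linalg.norm(Qp @ (Q_star @ v + c))

delta = min(dist_to_C(np.array(v))
            for v in itertools.product([-1.0, 1.0], repeat=n))
print("delta = dist({-1,1}^n, C) =", delta)
```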