Maximum Satisfiability
• Given a set of clauses, maxsat seeks the truth assignment that satisfies the most clauses simultaneously.
• max2sat is already NP-complete (p. 349), so maxsat is NP-complete.
• Consider the more general k-maxgsat for constant k.
– Let Φ = { φ1, φ2, . . . , φm } be a set of boolean expressions in n variables.
– Each φi is a general expression involving up to k variables.
– k-maxgsat seeks the truth assignment that satisfies the most expressions simultaneously.
A Probabilistic Interpretation of an Algorithm
• Let φi involve ki ≤ k variables and be satisfied by si of the 2^ki truth assignments.
• A random truth assignment ∈ { 0, 1 }^n satisfies φi with probability p(φi) = si/2^ki.
– p(φi) is easy to calculate as k is a constant.
• Hence a random truth assignment satisfies an average of

p(Φ) = Σ_{i=1}^{m} p(φi)

expressions φi.
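In code, p(φi) is a one-line enumeration. A minimal Python sketch (the helper name p and the example expression are ours, purely for illustration):

from itertools import product

def p(phi, k):
    """Fraction of the 2^k assignments to phi's k variables satisfying it.

    Enumeration costs O(2^k) time, a constant since k is fixed.
    """
    return sum(phi(*bits) for bits in product((False, True), repeat=k)) / 2**k

# phi(x, y) = x OR (NOT y) is satisfied by 3 of the 4 assignments.
assert p(lambda x, y: x or not y, 2) == 0.75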
The Search Procedure
• Clearly,

p(Φ) = ( p(Φ[ x1 = true ]) + p(Φ[ x1 = false ]) ) / 2.
• Select the t1 ∈ { true, false } such that p(Φ[ x1 = t1 ]) is the larger one.
• Note that p(Φ[ x1 = t1 ]) ≥ p(Φ).
• Repeat the procedure with expression Φ[ x1 = t1 ] until all variables xi have been given truth values ti and all φi are either true or false.
The Search Procedure (continued)
• By our hill-climbing procedure,

p(Φ) ≤ p(Φ[ x1 = t1 ]) ≤ p(Φ[ x1 = t1, x2 = t2 ]) ≤ · · · ≤ p(Φ[ x1 = t1, x2 = t2, . . . , xn = tn ]).
• So at least p(Φ) expressions are satisfied by truth assignment (t1, t2, . . . , tn).
The Search Procedure (concluded)
• Note that the algorithm is deterministic!
• It is called the method of conditional expectations.a
aErdős & Selfridge (1973); Spencer (1987).
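A compact Python sketch of the method of conditional expectations (the representation of each φi as a tuple of variable indices plus a predicate is our own choice, not prescribed by the text):

from itertools import product

def p_phi(exprs, assignment):
    """p(Phi[assignment]): expected number of satisfied expressions when all
    unassigned variables are set uniformly at random."""
    total = 0.0
    for vars_, f in exprs:              # f takes len(vars_) boolean arguments
        free = [x for x in vars_ if x not in assignment]
        hits = 0
        for bits in product((False, True), repeat=len(free)):
            a = {**assignment, **dict(zip(free, bits))}
            hits += f(*(a[x] for x in vars_))
        total += hits / 2 ** len(free)
    return total

def conditional_expectations(exprs, n):
    """Fix x_0, ..., x_{n-1} one by one, never letting p(Phi[...]) decrease."""
    assignment = {}
    for x in range(n):
        assignment[x] = max(
            (True, False),
            key=lambda b: p_phi(exprs, {**assignment, x: b}))
    return assignment

# Phi = { x0 or x1, (not x0) or x2, x1 != x2 } over n = 3 variables.
exprs = [((0, 1), lambda a, b: a or b),
         ((0, 2), lambda a, b: (not a) or b),
         ((1, 2), lambda a, b: a != b)]
t = conditional_expectations(exprs, 3)
assert p_phi(exprs, t) >= p_phi(exprs, {})   # at least p(Phi) satisfied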
Approximation Analysis
• The optimum is at most the number of satisfiable φi—i.e., those with p(φi) > 0.
• The ratio of the algorithm’s output vs. the optimum is

≥ p(Φ) / ( Σ_{p(φi)>0} 1 ) = ( Σ_i p(φi) ) / ( Σ_{p(φi)>0} 1 ) ≥ min_{p(φi)>0} p(φi).
• This is a polynomial-time ε-approximation algorithm with ε = 1 − min_{p(φi)>0} p(φi) by Eq. (20) on p. 732.
• Because p(φi) ≥ 2^−k for a satisfiable φi, the heuristic is a polynomial-time ε-approximation algorithm with ε = 1 − 2^−k.
Back to maxsat
• In maxsat, the φi’s are clauses (like x ∨ y ∨ ¬z).
• Hence p(φi) ≥ 1/2 (why?).
• The heuristic becomes a polynomial-time ε-approximation algorithm with ε = 1/2.a
• Suppose we set each boolean variable to true with probability (√5 − 1)/2, the golden ratio.
• Then follow through the method of conditional expectations to derandomize it.
aJohnson (1974).
Back to maxsat (concluded)
• We will obtain a [ (3 − √5)/2 ]-approximation algorithm.a
– Note (3 − √5)/2 ≈ 0.382.
• If the clauses have k distinct literals, p(φi) = 1 − 2^−k.
• The heuristic becomes a polynomial-time ε-approximation algorithm with ε = 2^−k.
– This is the best possible for k ≥ 3 unless P = NP.
• All the results hold even if clauses are weighted.
max cut Revisited
• max cut seeks to partition the nodes of graph
G = (V, E) into (S, V − S) so that there are as many edges as possible between S and V − S.
• It is NP-complete (p. 384).
• Local search starts from a feasible solution and
performs “local” improvements until none are possible.
• Next we present a local-search algorithm for max cut.
A 0.5-Approximation Algorithm for max cut
1: S := ∅;
2: while ∃v ∈ V whose switching sides results in a larger cut do
3: Switch the side of v;
4: end while
5: return S;
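A direct Python rendition of the algorithm above, using an adjacency list; the representation and names are ours:

def local_search_max_cut(n, edges):
    """0.5-approximation for MAX CUT by local search.

    Start from S = {} and keep switching any node whose move enlarges the
    cut; each switch raises the cut size by at least 1, so the loop
    terminates after at most |E| switches.
    """
    side = [0] * n                           # side[v] == 0 means v in S
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    improved = True
    while improved:
        improved = False
        for v in range(n):
            same = sum(side[u] == side[v] for u in adj[v])
            # Switching v gains `same` cut edges and loses deg(v) - same.
            if same > len(adj[v]) - same:
                side[v] = 1 - side[v]
                improved = True
    S = {v for v in range(n) if side[v] == 0}
    return S, sum(side[u] != side[v] for u, v in edges)

# On a 4-cycle the local optimum happens to be the global one (all 4 edges).
S, cut = local_search_max_cut(4, [(0, 1), (1, 2), (2, 3), (3, 0)])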
Analysis
[Figure: V partitioned into V1, V2, V3, V4, with eij counting the edges between Vi and Vj; the optimal cut is (V1 ∪ V3, V2 ∪ V4), our cut is (V1 ∪ V2, V3 ∪ V4).]
Analysis (continued)
• Partition V = V1 ∪ V2 ∪ V3 ∪ V4, where
– Our algorithm returns (V1 ∪ V2, V3 ∪ V4).
– The optimum cut is (V1 ∪ V3, V2 ∪ V4).
• Let eij be the number of edges between Vi and Vj.
• Our algorithm returns a cut of size
e13 + e14 + e23 + e24.
• The optimum cut size is
e12 + e34 + e14 + e23.
Analysis (continued)
• For each node v ∈ V1, its edges to V3 ∪ V4 cannot be outnumbered by those to V1 ∪ V2.
– Otherwise, v would have been moved to V3 ∪ V4 to improve the cut.
• Considering all nodes in V1 together, we have 2e11 + e12 ≤ e13 + e14.
– 2e11, because each edge in V1 is counted twice.
• The above inequality implies
e12 ≤ e13 + e14.
Analysis (concluded)
• Similarly,

e12 ≤ e23 + e24,
e34 ≤ e23 + e13,
e34 ≤ e14 + e24.
• Add all four inequalities, divide both sides by 2, and add the inequality e14 + e23 ≤ e14 + e23 + e13 + e24 to obtain
e12 + e34 + e14 + e23 ≤ 2(e13 + e14 + e23 + e24).
• The above says our solution is at least half the optimum.
Remarks
• A 0.12-approximation algorithm exists.a
• 0.059-approximation algorithms do not exist unless NP = ZPP.b
aGoemans & Williamson (1995).
bHåstad (1997).
Approximability, Unapproximability, and Between
• Some problems have approximation thresholds less than 1.
– knapsack has a threshold of 0 (p. 782).
– node cover (p. 738), bin packing, and maxsata have a threshold larger than 0.
• The situation is maximally pessimistic for tsp (p. 757) and independent set,b which cannot be approximated well:
– Their approximation threshold is 1.
aWilliamson & Shmoys (2011).
bSee the textbook.
Unapproximability of tsp
Theorem 83a The approximation threshold of tsp is 1 unless P = NP.
• Suppose there is a polynomial-time ε-approximation algorithm for tsp for some ε < 1.
• We shall construct a polynomial-time algorithm to solve the NP-complete hamiltonian cycle.
• Given any graph G = (V, E), construct a tsp with | V | cities with distances
dij =
1, if [ i, j ] ∈ E,
|V|/(1 − ε), otherwise.
aSahni & Gonzales (1976).
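A small Python helper illustrating this construction (names and the edge representation are ours):

def tsp_instance_from_graph(n, edges, eps):
    """Distance matrix for the gap-introducing reduction in Theorem 83.

    Edges of G get length 1; non-edges get length n/(1 - eps), so any tour
    through a non-edge already costs more than n = |V|.
    """
    E = {frozenset(e) for e in edges}
    long = n / (1 - eps)
    return [[0 if i == j else (1 if frozenset((i, j)) in E else long)
             for j in range(n)] for i in range(n)]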
The Proof (continued)
• Run the alleged approximation algorithm on this tsp instance.
• Note that if a tour includes an edge of length |V|/(1 − ε), then the tour costs more than |V|.
• Note also that no tour has a cost less than | V |.
• Suppose a tour of cost | V | is returned.
– Then every edge on the tour exists in the original graph G.
– So this tour is a Hamiltonian cycle on G.
The Proof (concluded)
• Suppose a tour that includes an edge of length |V|/(1 − ε) is returned.
– The total length of this tour exceeds |V|/(1 − ε).a
– Because the algorithm is ε-approximate, the optimum is at least 1 − ε times the returned tour’s length.
– The optimum tour has a cost exceeding | V |.
– Hence G has no Hamiltonian cycles.
aSo this reduction is gap introducing.
metric tsp
• metric tsp is similar to tsp.
• But the distances must satisfy the triangular inequality:
dij ≤ dik + dkj for all i, j, k.
• Inductively,
dij ≤ dik + dkl + · · · + dzj.
A 0.5-Approximation Algorithm for metric tspa
• It suffices to present an algorithm with the approximation ratio
c(M(x))/opt(x) ≤ 2 (see p. 733).
aChoukhmane (1978); Iwainsky, Canuto, Taraszow, & Villa (1986); Kou, Markowsky, & Berman (1981); Plesník (1981).
A 0.5-Approximation Algorithm for metric tsp (concluded)
1: T := a minimum spanning tree of G;
2: T′ := T with every edge duplicated, costs included; {Note: T′ is an Eulerian multigraph.}
3: C := an Euler cycle of T′;
4: Remove repeated nodes of C; {“Shortcutting.”}
5: return C;
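One way to realize Lines 1–4 in Python, assuming a complete distance matrix d: build the MST with Prim’s algorithm, and note that shortcutting the Euler cycle of the doubled tree visits the nodes in DFS preorder, so the preorder walk is the tour. A sketch:

def mst_double_tour(d):
    """2-approximation for metric TSP from a distance matrix d.

    Prim's algorithm builds the MST; shortcutting the Euler cycle of the
    doubled tree visits nodes in exactly DFS preorder, so return that.
    """
    n = len(d)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    best[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: best[u])
        in_tree[v] = True
        if parent[v] != -1:
            children[parent[v]].append(v)
        for u in range(n):
            if not in_tree[u] and d[v][u] < best[u]:
                best[u], parent[u] = d[v][u], v
    tour, stack = [], [0]
    while stack:                      # iterative DFS preorder from the root
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour                       # close the cycle by returning to tour[0]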
Analysis
• Let Copt be an optimal tsp tour.
• Note first that
c(T) ≤ c(Copt). (21)
– Copt is a spanning tree after the removal of one edge.
– But T is a minimum spanning tree.
• Because T′ doubles the edges of T, c(T′) = 2c(T).
Analysis (concluded)
• Because of the triangular inequality, “shortcutting” does not increase the cost.
– (1, 2, 3, 2, 1, 4, . . .) → (1, 2, 3, 4, . . .), a Hamiltonian cycle.
• Thus
c(C) ≤ c(T′).
• Combine all the inequalities to yield
c(C) ≤ c(T′) = 2c(T) ≤ 2c(Copt), as desired.
A 100-Node Example
The cost is 7.72877.
A 100-Node Example (continued)
The minimum spanning tree T .
A 100-Node Example (continued)
“Shortcutting” the repeated nodes on the Euler cycle C.
A 100-Node Example (concluded)
The cost is 10.5718 ≤ 2 × 7.72877 = 15.4576.
A (1/3)-Approximation Algorithm for metric tspa
• It suffices to present an algorithm with the approximation ratio
c(M(x))/opt(x) ≤ 3/2 (see p. 733).
• This is the best approximation ratio for metric tsp as of 2016!
aChristofides (1976).
A (1/3)-Approximation Algorithm for metric tsp (concluded)
1: T := a minimum spanning tree of G;
2: V′ := the set of nodes with an odd degree in T; {|V′| must be even by a well-known parity result (the handshaking lemma).}
3: G′ := the subgraph of G induced by V′; {G′ is a complete graph on V′.}
4: M := a minimum-cost perfect matching of G′;
5: G′′ := T ∪ M; {G′′ is an Eulerian multigraph.}
6: C := an Euler cycle of G′′;
7: Remove repeated nodes of C; {“Shortcutting.”}
8: return C;
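A sketch of the algorithm on top of networkx. The matching API has changed across networkx versions, so take the min_weight_matching call as an assumption (recent releases also ship a ready-made christofides in their approximation package):

import networkx as nx

def christofides(G):
    """Sketch of the 3/2-approximation; G: complete graph, 'weight' attrs."""
    T = nx.minimum_spanning_tree(G, weight="weight")
    odd = [v for v, deg in T.degree() if deg % 2 == 1]    # even count
    M = nx.min_weight_matching(G.subgraph(odd), weight="weight")
    H = nx.MultiGraph(T)
    H.add_edges_from(M)                   # now every node has even degree
    tour, seen = [], set()
    for u, _ in nx.eulerian_circuit(H):   # shortcut repeated nodes
        if u not in seen:
            seen.add(u)
            tour.append(u)
    return tour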
Analysis
• Let Copt be an optimal tsp tour.
• By Eq. (21) on p. 763,
c(T ) ≤ c(Copt). (22)
• Let C′ be Copt on V′ by “shortcutting.”
– Copt is a Hamiltonian cycle on V.
– Replace any path (v1, v2, . . . , vk) on Copt with (v1, vk), where v1, vk ∈ V′ but v2, . . . , vk−1 ∉ V′.
• So C′ is simply the restriction of Copt to V′.
Analysis (continued)
• By the triangular inequality,
c(C′) ≤ c(Copt).
• C′ is now a Hamiltonian cycle on V′.
• C′ consists of two perfect matchings on G′.a
– The first, third, . . . edges constitute one.
– The second, fourth, . . . edges constitute the other.
aNote that G′ is a complete graph with an even |V′|.
Analysis (continued)
• By Eq. (22) on p. 771, the cheaper of the two perfect matchings has a cost of at most
c(C′)/2 ≤ c(Copt)/2.
• As a result, the minimum-cost one, M, must satisfy
c(M) ≤ c(C′)/2 ≤ c(Copt)/2.
• Minimum-cost perfect matching can be solved in polynomial time.a
aEdmonds (1965); Micali & V. Vazirani (1980).
Analysis (concluded)
• By combining the two earlier inequalities, any Euler cycle C has a cost of
c(C) ≤ c(T) + c(M) {by Line 5 of the algorithm}
≤ c(Copt) + c(Copt)/2
= (3/2) c(Copt),
as desired.
A 100-Node Example
The cost is 7.72877.
A 100-Node Example (continued)
A 100-Node Example (continued)
A minimum-cost perfect matching M .
A 100-Node Example (continued)
T ∪ M.
A 100-Node Example (continued)
“Shortcutting” the repeated nodes on the Euler cycle C.
A 100-Node Example (continued)
The cost is 8.74583 ≤ (3/2) × 7.72877 = 11.5932.a
aIn comparison, the earlier 0.5-approximation algorithm gave a cost of 10.5718 on p. 768.
A 100-Node Example (concluded)
If a different Euler cycle were generated on p. 778, the cost could be different, such as 8.54902 (above), 8.85674, 8.53410, 9.20841, and 8.87152.a
aContributed by Mr. Yu-Chuan Liu (B00507010, R04922040) on July 15, 2017.
knapsack Has an Approximation Threshold of Zero
Theorem 84a For any ε, there is a polynomial-time ε-approximation algorithm for knapsack.
• We have n weights w1, w2, . . . , wn ∈ Z+, a weight limit W , and n values v1, v2, . . . , vn ∈ Z+.b
• We must find an I ⊆ { 1, 2, . . . , n } such that Σ_{i∈I} wi ≤ W and Σ_{i∈I} vi is the largest possible.
aIbarra & Kim (1975).
bIf the values are fractional, the result is slightly messier, but the main conclusion remains correct. Contributed by Mr. Jr-Ben Tian (B89902011, R93922045) on December 29, 2004.
The Proof (continued)
• Let
V = max{ v1, v2, . . . , vn }.
• Clearly, Σ_{i∈I} vi ≤ nV.
• Let 0 ≤ i ≤ n and 0 ≤ v ≤ nV .
• W (i, v) is the minimum weight attainable by selecting only from the first i items and with a total value of v.
– It is an (n + 1) × (nV + 1) table.
The Proof (continued)
• Set W (0, v) = ∞ for v ∈ { 1, 2, . . . , nV } and W (i, 0) = 0 for i = 0, 1, . . . , n.a
• Then, for 0 ≤ i < n and 1 ≤ v ≤ nV,b

W(i + 1, v) =
min{ W(i, v), W(i, v − v_{i+1}) + w_{i+1} }, if v ≥ v_{i+1},
W(i, v), otherwise.
• Finally, pick the largest v such that W (n, v) ≤ W .c
aContributed by Mr. Ren-Shuo Liu (D98922016) and Mr. Yen-Wei Wu (D98922013) on December 28, 2009.
bThe textbook’s formula has an error here.
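The recurrence translates directly into Python (0-based item indices; W_tab[i][val] plays the role of W(i, v)):

def knapsack_by_value(w, v, W):
    """Exact knapsack via the W(i, v) table of the proof.

    W_tab[i][val] = minimum weight achievable using the first i items
    with total value exactly val; O(n^2 V) time overall.
    """
    n, V = len(v), max(v)
    INF = float("inf")
    W_tab = [[INF] * (n * V + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        W_tab[i][0] = 0               # zero value costs zero weight
    for i in range(n):
        for val in range(1, n * V + 1):
            keep = W_tab[i][val - v[i]] + w[i] if val >= v[i] else INF
            W_tab[i + 1][val] = min(W_tab[i][val], keep)
    return max(val for val in range(n * V + 1) if W_tab[n][val] <= W)

# The example from p. 786: maximum total value 16 within W = 12.
assert knapsack_by_value([3, 3, 1, 3, 2, 1], [4, 3, 3, 3, 2, 3], 12) == 16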
The Proof (continued)
With 6 items, values (4, 3, 3, 3, 2, 3), weights (3, 3, 1, 3, 2, 1), and W = 12, the maximum total value 16 is achieved with I = { 1, 2, 3, 4, 6 }; I’s weight is 11.
The Proof (continued)
• The running time O(n²V) is not polynomial in the input length, since V is encoded with only O(log V) bits.
• Call the problem instance
x = (w1, . . . , wn, W, v1, . . . , vn).
• Additional idea: Limit the number of precision bits.
• Define
v′i = ⌊ vi/2^b ⌋.
• Note that
vi ≥ 2^b v′i > vi − 2^b.
The Proof (continued)
• Call the approximate instance
x′ = (w1, . . . , wn, W, v′1, . . . , v′n).
• Solving x′ takes time O(n²V/2^b).
– Use v′i = ⌊vi/2^b⌋ and V′ = max(v′1, v′2, . . . , v′n) in the dynamic programming.
– The table is now of size (n + 1) × (nV′ + 1), where V′ ≤ V/2^b.
• The selection I′ is optimal for x′.
• But I′ may not be optimal for x, although it still satisfies the weight budget W.
The Proof (continued)
With the same parameters as p. 786 and b = 1: the scaled values are (2, 1, 1, 1, 1, 1), and the optimal selection I′ = { 1, 2, 3, 5, 6 } for x′ has a smaller value for x, 4 + 3 + 3 + 2 + 3 = 15, than I’s 16; its weight is 10 < W = 12.a
aThe original optimal I = { 1, 2, 3, 4, 6 } on p. 786 has the same value 6 but a higher weight 11 for x′.
The Proof (continued)
• The value of I′ for x is close to that of the optimal I:

Σ_{i∈I′} vi ≥ Σ_{i∈I′} 2^b v′i = 2^b Σ_{i∈I′} v′i
≥ 2^b Σ_{i∈I} v′i (because I′ is optimal for x′)
= Σ_{i∈I} 2^b v′i
≥ Σ_{i∈I} (vi − 2^b)
≥ ( Σ_{i∈I} vi ) − n 2^b.
The Proof (continued)
• In summary,
Σ_{i∈I′} vi ≥ ( Σ_{i∈I} vi ) − n 2^b.
• Without loss of generality, assume wi ≤ W for all i.
– Otherwise, item i is redundant and can be removed early on.
• V is a lower bound on opt.
– Picking one single item with value V is a legitimate choice.
The Proof (concluded)
• The relative error from the optimum is

( Σ_{i∈I} vi − Σ_{i∈I′} vi ) / Σ_{i∈I} vi ≤ ( Σ_{i∈I} vi − Σ_{i∈I′} vi ) / V ≤ n 2^b / V.
• Suppose we pick b = ⌊ log2(εV/n) ⌋, so that n 2^b/V ≤ ε.
• The algorithm becomes ε-approximate.a
• The running time is then O(n²V/2^b) = O(n³/ε), a polynomial in n and 1/ε.b
aSee Eq. (17) on p. 727.
bIt hence depends on the value of 1/ε. Thanks to a lively class discussion on December 20, 2006. If we fix ε and let the problem size increase, then the complexity is cubic. Contributed by Mr. Ren-Shan
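Putting the proof together as code, a sketch of the ε-approximation scheme; knapsack_items is a hypothetical exact solver returning the chosen index set, e.g. the dynamic program above extended with standard back-pointers:

from math import floor, log2

def knapsack_fptas(w, v, W, eps, knapsack_items):
    """eps-approximate knapsack by value scaling (Ibarra & Kim, 1975).

    knapsack_items(w, v, W) is a hypothetical exact solver returning the
    chosen index set. Assumes w_i <= W for all i (oversized items were
    dropped beforehand).
    """
    n, V = len(v), max(v)
    b = max(0, floor(log2(eps * V / n)))    # b = floor(log2(eps V / n))
    I = knapsack_items(w, [vi >> b for vi in v], W)   # solve the scaled x'
    return I                                # value >= (1 - eps) * optimum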
Comments
• independent set and node cover are reducible to each other (Corollary 45, p. 375).
• node cover has an approximation threshold at most 0.5 (p. 740).
• But independent set is unapproximable (see the textbook).
• independent set limited to graphs with degree ≤ k is called k-degree independent set.
• k-degree independent set is approximable (see the textbook).
On P vs. NP
If 50 million people believe a foolish thing, it’s still a foolish thing.
— George Bernard Shaw (1856–1950)
Exponential Circuit Complexity for NP-Complete Problems
• We shall prove exponential lower bounds for NP-complete problems using monotone circuits.
– Monotone circuits are circuits without ¬ gates.a
• Note that this result does not settle the P vs. NP problem.
aRecall p. 313.
The Power of Monotone Circuits
• Monotone circuits can only compute monotone boolean functions.
• They are powerful enough to solve a P-complete problem: monotone circuit value (p. 314).
• There are NP-complete problems that are not monotone;
they cannot be computed by monotone circuits at all.
• There are NP-complete problems that are monotone;
they can be computed by monotone circuits.
– hamiltonian path and clique.
clique_{n,k}
• clique_{n,k} is the boolean function deciding whether a graph G = (V, E) with n nodes has a clique of size k.
• The input gates are the C(n, 2) = n(n − 1)/2 entries of the adjacency matrix of G.
– Gate gij is set to true if the associated undirected edge { i, j } exists.
• clique_{n,k} is a monotone function.
• Thus it can be computed by a monotone circuit.
• This does not rule out that nonmonotone circuits for clique_{n,k} may use fewer gates.
Crude Circuits
• One possible circuit for clique_{n,k} does the following.
1. For each S ⊆ V with | S | = k, there is a circuit with O(k2) ∧-gates testing whether S forms a clique.
2. We then take an OR of the outcomes of all the C(n, k) subsets S1, S2, . . . , S_{C(n,k)}.
• This is a monotone circuit with O(k² C(n, k)) gates, which is exponentially large unless k or n − k is a constant.
• A crude circuit CC(X1, X2, . . . , Xm) tests if there is an Xi ⊆ V that forms a clique.
– The above-mentioned circuit is CC(S1, S2, . . . , S_{C(n,k)}).
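Emulating the crude circuit in Python is straightforward; the brute-force loop below mirrors its OR-of-ANDs structure (the function name is ours):

from itertools import combinations

def crude_circuit(n, k, adj):
    """Evaluate CC(S_1, ..., S_{C(n,k)}): an OR over all k-subsets S of an
    AND over the C(k, 2) adjacency entries inside S."""
    return any(all(adj[i][j] for i, j in combinations(S, 2))
               for S in combinations(range(n), k))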
The Proof: Positive Examples
• Analysis will be applied to only the following positive examples and negative examples as input graphs.
• A positive example is a graph that has C(k, 2) edges connecting k nodes in all possible ways.
• There are C(n, k) such graphs.
• They all should elicit a true output from clique_{n,k}.
The Proof: Negative Examples
• Color the nodes with k − 1 different colors and join by an edge any two nodes that are colored differently.
• There are (k − 1)^n such graphs.
• They all should elicit a false output from clique_{n,k}.
– Each set of k nodes must contain two identically colored nodes; hence there is no edge between them.
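Both families of input graphs are easy to generate; a sketch (the asserts use crude_circuit from the earlier sketch):

from itertools import combinations
from random import randrange

def positive_example(n, K):
    """Adjacency matrix of the graph whose only edges form a clique on K."""
    adj = [[False] * n for _ in range(n)]
    for i, j in combinations(sorted(K), 2):
        adj[i][j] = adj[j][i] = True
    return adj

def negative_example(n, k):
    """Color the nodes with k - 1 colors and join differently colored nodes.

    Any k nodes repeat some color, so the graph has no k-clique.
    """
    color = [randrange(k - 1) for _ in range(n)]
    return [[i != j and color[i] != color[j] for j in range(n)]
            for i in range(n)]

assert crude_circuit(6, 3, positive_example(6, {0, 2, 4}))
assert not crude_circuit(6, 3, negative_example(6, 3))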
Positive and Negative Examples with k = 5
[Figure: a positive example and a negative example.]