# The Proof (concluded)


### Unapproximability of tsp


Theorem 80 The approximation threshold of tsp is 1 unless P = NP.

Suppose there is a polynomial-time ε-approximation algorithm for tsp for some ε < 1.

We shall construct a polynomial-time algorithm for the NP-complete hamiltonian cycle.

Given any graph G = (V, E), construct a tsp instance with |V| cities and distances

$$d_{ij} = \begin{cases} 1, & \text{if } \{i, j\} \in E, \\ \dfrac{|V|}{1-\varepsilon}, & \text{otherwise.} \end{cases}$$

Sahni and Gonzalez (1976).

### The Proof (concluded)

Run the alleged approximation algorithm on this tsp.

Suppose a tour of cost |V | is returned.

– This tour must be a Hamiltonian cycle.

Suppose a tour with at least one edge of length |V|/(1 − ε) is returned.

The total length of this tour is > |V|/(1 − ε).

Because the algorithm is ε-approximate, the optimum is at least (1 − ε) times the returned tour's length.

So the optimum tour has a cost exceeding (1 − ε) · |V|/(1 − ε) = |V|.

Hence G has no Hamiltonian cycles.
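The reduction can be exercised end to end on a tiny graph. The sketch below is an illustration added here, not part of the slides: it builds the distance matrix of the reduction and finds the optimal tour by brute force. The 4-cycle C4 is an assumed toy input.

```python
from itertools import permutations

def tsp_instance(n, edges, eps):
    # Distance matrix of the reduction: 1 for edges of G,
    # |V|/(1 - eps) for non-edges.
    big = n / (1 - eps)
    return [[0.0 if i == j else
             (1.0 if frozenset((i, j)) in edges else big)
             for j in range(n)] for i in range(n)]

def optimal_tour_cost(d):
    # Brute-force search over all tours (fine for tiny n); city 0 is fixed.
    n = len(d)
    return min(sum(d[a][b] for a, b in zip((0,) + p + (0,), p + (0,)))
               for p in permutations(range(1, n)))

# C4 (a 4-cycle) has a Hamiltonian cycle, so the optimum is exactly |V| = 4.
edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}
print(optimal_tour_cost(tsp_instance(4, edges, 0.5)))   # 4.0
# Drop one edge: every tour now uses a long edge, so the optimum exceeds
# |V|/(1 - eps) = 8 and can no longer be confused with |V|.
print(optimal_tour_cost(tsp_instance(4, edges - {frozenset((3, 0))}, 0.5)))  # 11.0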


### knapsack Has an Approximation Threshold of Zero


Theorem 81 For any ε, there is a polynomial-time ε-approximation algorithm for knapsack.

We have n weights w1, w2, . . . , wn ∈ Z⁺, a weight limit W, and n values v1, v2, . . . , vn ∈ Z⁺.

We must find an S ⊆ {1, 2, . . . , n} such that Σ_{i∈S} wi ≤ W and Σ_{i∈S} vi is the largest possible.

Ibarra and Kim (1975).

If the values are fractional, the result is slightly messier, but the main conclusion remains correct. Contributed by Mr. Jr-Ben Tian (R92922045) on December 29, 2004.


### The Proof (continued)

Let

V = max{v1, v2, . . . , vn}.

Clearly, Σ_{i∈S} vi ≤ nV.

Let 0 ≤ i ≤ n and 0 ≤ v ≤ nV .

W (i, v) is the minimum weight attainable by selecting some of the first i items with a total value of v.

Set W(0, v) = ∞ for v ∈ {1, 2, . . . , nV} and W(i, 0) = 0 for i = 0, 1, . . . , n.

Contributed by Mr. Ren-Shuo Liu (D98922016) and Mr. Yen-Wei Wu (D98922013) on December 28, 2009.


### The Proof (continued)

Then, for 0 ≤ i < n,

W(i + 1, v) = min{W(i, v), W(i, v − v_{i+1}) + w_{i+1}}.

Finally, pick the largest v such that W (n, v) ≤ W .

The running time is O(n²V), which is not polynomial in the input length because V can be exponential in the number of input bits.
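The W(i, v) table can be sketched as follows; collapsing the table into a single rolling array is an implementation choice added here, not part of the slides.

```python
def knapsack_exact(w, v, W):
    # Dynamic program of the slides, with W(i, v) collapsed into one
    # rolling array: best[val] = minimum weight achieving total value val.
    n, V = len(w), max(v)
    INF = float("inf")
    best = [0] + [INF] * (n * V)          # best[0] = 0, the rest infinity
    for i in range(n):
        # Traverse values downward so each item is used at most once.
        for val in range(n * V, v[i] - 1, -1):
            if best[val - v[i]] + w[i] < best[val]:
                best[val] = best[val - v[i]] + w[i]
    # Finally, pick the largest value whose minimum weight fits the limit W.
    return max(val for val in range(n * V + 1) if best[val] <= W)

print(knapsack_exact([3, 4, 5], [4, 5, 6], 8))  # 10: the items of weight 3 and 5
```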

Key idea: Limit the number of precision bits.


### The Proof (continued)

Define

$$v_i' = 2^b \left\lfloor \frac{v_i}{2^b} \right\rfloor.$$

This is equivalent to zeroing each vi’s last b bits.

From the original instance x = (w1, . . . , wn, W, v1, . . . , vn), define the approximate instance x′ = (w1, . . . , wn, W, v′1, . . . , v′n).


### The Proof (continued)

Solving x′ takes time O(n²V/2^b).

– The algorithm only performs subtractions on the vi-related values.

– So the last b bits can be removed from the calculations.

– That is, use ⌊vi/2^b⌋ in the calculations.

– Then multiply the returned value by 2^b.

The solution S′ is close to the optimum solution S:

$$\sum_{i\in S'} v_i \;\ge\; \sum_{i\in S'} v_i' \;\ge\; \sum_{i\in S} v_i' \;\ge\; \sum_{i\in S} (v_i - 2^b) \;\ge\; \sum_{i\in S} v_i - n2^b.$$


### The Proof (continued)

Hence

$$\sum_{i\in S'} v_i \;\ge\; \sum_{i\in S} v_i - n2^b.$$

Without loss of generality, assume wi ≤ W for all i.

Otherwise item i is redundant.

V is a lower bound on the optimum value.

– Picking the single item of value V alone is a feasible choice (its weight is at most W).

The relative error from the optimum is ≤ n2^b/V:

$$\frac{\sum_{i\in S} v_i - \sum_{i\in S'} v_i}{\sum_{i\in S} v_i} \;\le\; \frac{\sum_{i\in S} v_i - \sum_{i\in S'} v_i}{V} \;\le\; \frac{n2^b}{V}.$$


### The Proof (concluded)

Suppose we pick b = ⌊log₂(εV/n)⌋.

The algorithm becomes ε-approximate (see Eq. (10) on p. 639).

The running time is then O(n²V/2^b) = O(n³/ε), a polynomial in n and 1/ε.

It hence depends on the value of 1/ε. Thanks to a lively class discussion on December 20, 2006. If we fix ε and let the problem size increase, then the complexity is cubic. Contributed by Mr. Ren-Shan Luoh (D97922014) on December 23, 2008.
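The whole rounding-and-DP scheme can be sketched in one function. This is an added illustration, not the slides' own code; remembering the chosen item set in `pick` is extra bookkeeping so the answer can be reported in the original (unscaled) values.

```python
import math

def knapsack_fptas(w, v, W, eps):
    # Drop the last b bits of each value, b = floor(log2(eps*V/n)), run the
    # exact dynamic program on the scaled values, then report the chosen
    # set's value in the original instance.
    n, V = len(w), max(v)
    b = max(0, math.floor(math.log2(eps * V / n)))   # b = 0 when eps*V < n
    scaled = [x >> b for x in v]
    top = n * max(scaled)
    INF = float("inf")
    best = [0] + [INF] * top
    pick = [frozenset()] * (top + 1)                 # chosen item sets
    for i in range(n):
        for val in range(top, scaled[i] - 1, -1):
            cand = best[val - scaled[i]] + w[i]
            if cand < best[val]:
                best[val] = cand
                pick[val] = pick[val - scaled[i]] | {i}
    v_star = max(val for val in range(top + 1) if best[val] <= W)
    return sum(v[i] for i in pick[v_star])

# Optimum here is 80 (the items of weight 2 and 4); the guarantee is only
# that the answer is at least (1 - eps) times the optimum.
print(knapsack_fptas([2, 3, 4], [30, 40, 50], 6, 0.5))  # 80
```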


### Pseudo-Polynomial-Time Algorithms

Consider problems whose inputs consist of a collection of integer parameters (tsp, knapsack, etc.).

An algorithm for such a problem whose running time is a polynomial of the input length and the value (not length) of the largest integer parameter is a pseudo-polynomial-time algorithm.

On p. 665, we presented a pseudo-polynomial-time algorithm for knapsack that runs in time O(n2V ).

How about tsp (d), another NP-complete problem?

Garey and Johnson (1978).


### No Pseudo-Polynomial-Time Algorithms for tsp (d)

By definition, a pseudo-polynomial-time algorithm becomes polynomial-time if each integer parameter is limited to having a value polynomial in the input length.

Corollary 43 (p. 344) showed that hamiltonian path is reducible to tsp (d) with weights 1 and 2.

As hamiltonian path is NP-complete, tsp (d) cannot have pseudo-polynomial-time algorithms unless P = NP.

tsp (d) is said to be strongly NP-hard.

Many weighted versions of NP-complete problems are strongly NP-hard.


### Polynomial-Time Approximation Scheme

Algorithm M is a polynomial-time approximation scheme (PTAS) for a problem if:

For each ε > 0 and each instance x of the problem, M runs in time polynomial (depending on ε) in |x|.

– Think of ε as a constant.

M is an ε-approximation algorithm for every ε > 0.


### Fully Polynomial-Time Approximation Scheme

A polynomial-time approximation scheme is fully polynomial (FPTAS) if the running time depends polynomially on |x| and 1/ε.

– Maybe the best result for a “hard” problem.

– For instance, knapsack is fully polynomial with a running time of O(n³/ε) (p. 663).


### Square of G

Let G = (V, E) be an undirected graph.

G² has node set {(v1, v2) : v1, v2 ∈ V} and edge set

{{(u, u′), (v, v′)} : (u = v ∧ {u′, v′} ∈ E) ∨ {u, v} ∈ E}.

[Figure: the path G on nodes 1, 2, 3, and its square G², whose nodes are the pairs (i, j) for 1 ≤ i, j ≤ 3.]
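The definition of G² is direct to program. The sketch below is an added illustration: it builds G² for the 3-node path of the figure (0-indexed) and checks that a maximum independent set of G lifts to one of squared size, as Lemma 82 below asserts.

```python
from itertools import combinations

def square(n, edges):
    # Nodes of G^2 are the pairs (v1, v2); {(u,u'),(v,v')} is an edge iff
    # u = v and {u',v'} in E, or {u,v} in E.
    E = {frozenset(e) for e in edges}
    nodes = [(a, b) for a in range(n) for b in range(n)]
    sq = set()
    for (u, u2), (v, v2) in combinations(nodes, 2):
        if (u == v and frozenset((u2, v2)) in E) or frozenset((u, v)) in E:
            sq.add(frozenset(((u, u2), (v, v2))))
    return sq

# The path 1-2-3 of the figure, 0-indexed as 0-1-2:
E2 = square(3, [(0, 1), (1, 2)])
# The maximum independent set {0, 2} of G lifts to the 4 pairs below,
# which are pairwise non-adjacent in G^2 (Lemma 82 with k = 2).
I = {(a, b) for a in (0, 2) for b in (0, 2)}
print(len(E2), all(frozenset(p) not in E2 for p in combinations(I, 2)))
```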


### Independent Sets of G and G²

Lemma 82 G = (V, E) has an independent set of size k if and only if G² has an independent set of size k².

Suppose G has an independent set I ⊆ V of size k.

{(u, v) : u, v ∈ I} is an independent set of size k² of G².



### The Proof (continued)

Suppose G² has an independent set I₂ of size k².

U ≡ {u : ∃v ∈ V such that (u, v) ∈ I₂} is an independent set of G.


| U | is the number of “rows” that the nodes in I₂ occupy.


### The Proof (concluded)


If | U | ≥ k, then we are done.

Now assume | U | < k.

As the k² nodes in I₂ cover fewer than k “rows,” there must be a “row” containing more than k nodes of I₂.

Those > k nodes are independent in G because each “row” is a copy of G; hence G has an independent set of size ≥ k.

Thanks to a lively class discussion on December 29, 2004.


### Approximability of independent set

The approximation threshold of the maximum independent set problem is either zero or one (it is in fact one!).

Theorem 83 If there is a polynomial-time ε-approximation algorithm for independent set for any 0 < ε < 1, then there is a polynomial-time approximation scheme.

Let G be a graph with a maximum independent set of size k.

Suppose there is an O(n^i)-time ε-approximation algorithm for independent set.

We seek a polynomial-time ε′-approximation algorithm for any 0 < ε′ < ε.


### The Proof (continued)

By Lemma 82 (p. 675), the maximum independent set of G2 has size k2.

Apply the algorithm to G2.

The running time is O(n^{2i}).

The resulting independent set has size ≥ (1 − ε)k².

By the construction in Lemma 82 (p. 675), we can obtain an independent set of size ≥ √((1 − ε)k²) = √(1 − ε) · k for G.

Hence there is a (1 − √(1 − ε))-approximation algorithm for independent set by Eq. (11) on p. 640.


### The Proof (concluded)

In general, we can apply the algorithm to G^{2^ℓ} to obtain a (1 − (1 − ε)^{2^{−ℓ}})-approximation algorithm for independent set.

The running time is n^{2^ℓ i}.

Now pick ℓ = ⌈log₂(log(1 − ε)/log(1 − ε′))⌉.

The running time becomes n^{i log(1 − ε)/log(1 − ε′)}.

It is an ε′-approximation algorithm for independent set.

It is not fully polynomial.


independent set and node cover are reducible to each other (Corollary 40, p. 309).

node cover has an approximation threshold at most 0.5 (p. 645).

But independent set is unapproximable (see the textbook).

independent set limited to graphs with degree ≤ k is called k-degree independent set.

k-degree independent set is approximable (see the textbook).


## On P vs. NP


### Density

The density of a language L ⊆ Σ* is defined as densL(n) = |{x ∈ L : | x | ≤ n}|.

If L = {0, 1}*, then densL(n) = 2^{n+1} − 1.

So the density function grows at most exponentially.

For a unary language L ⊆ {0}*,

densL(n) ≤ n + 1.

– Because L ⊆ {ε, 0, 00, . . . , 0ⁿ, . . .}.

Berman and Hartmanis (1977).


### Sparsity

Sparse languages are languages with polynomially bounded density functions.

Dense languages are languages with superpolynomial density functions.


### Self-Reducibility for sat

An algorithm exhibits self-reducibility if it finds a certificate by exploiting algorithms for the decision version of the same problem.

Let φ be a boolean expression in n variables x1, x2, . . . , xn.

t ∈ {0, 1}^j is a partial truth assignment for x1, x2, . . . , xj.

φ[ t ] denotes the expression after substituting the truth values of t for x1, x2, . . . , x| t | in φ.


### An Algorithm for sat with Self-Reduction

We call the algorithm below with empty t.

1: if | t | = n then

2: return φ[ t ];

3: else

4: return φ[ t0 ] ∨ φ[ t1 ];

5: end if

The above algorithm runs in exponential time, by visiting all the partial assignments (or nodes on a depth-n binary tree).
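The self-reducing search can be sketched directly; representing φ as a Python predicate on complete assignments is an assumption of this illustration, not the slides' formulation.

```python
def sat_value(phi, t, n):
    # phi is a predicate on complete truth assignments (tuples of n bools);
    # t is the partial assignment for x1 .. x|t|.
    if len(t) == n:
        return phi(t)
    # "phi[t0] or phi[t1]": extend t by 0, then by 1.
    return sat_value(phi, t + (False,), n) or sat_value(phi, t + (True,), n)

# (x1 or x2) and not x3: satisfiable, e.g. by x1 = True, x3 = False.
phi = lambda t: (t[0] or t[1]) and not t[2]
print(sat_value(phi, (), 3))  # True
```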


### NP-Completeness and Density

Theorem 84 If a unary language U ⊆ {0}* is NP-complete, then P = NP.

Suppose there is a reduction R from sat to U .

We use R to find a truth assignment that satisfies boolean expression φ with n variables if it is satisfiable.

Specifically, we use R to prune the exponential-time exhaustive search on p. 686.

The trick is to keep the already discovered results φ[ t ] in a table H.

Berman (1978).


1: if | t | = n then

2: return φ[ t ];

3: else

4: if (R(φ[ t ]), v) is in table H then

5: return v;

6: else

7: if φ[ t0 ] = “satisfiable” or φ[ t1 ] = “satisfiable” then

8: Insert (R(φ[ t ]), “satisfiable”) into H;

9: return “satisfiable”;

10: else

11: Insert (R(φ[ t ]), “unsatisfiable”) into H;

12: return “unsatisfiable”;

13: end if

14: end if

15: end if
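A runnable sketch of this pruned search follows. Since the reduction R is only assumed to exist, this illustration substitutes a canonical form of the restricted formula for R(φ[ t ]): any key satisfying "equal keys imply equal satisfiability" keeps the pruning sound, while the unary range of the real R is what bounds the table size in the proof below.

```python
def restrict(cnf, t):
    # Substitute the partial assignment t (values of x1 .. x|t|) into a CNF
    # given as a frozenset of clauses, each a frozenset of DIMACS literals.
    out = set()
    for clause in cnf:
        rest = set()
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var <= len(t):
                if (lit > 0) == t[var - 1]:
                    satisfied = True
                    break
            else:
                rest.add(lit)
        if satisfied:
            continue
        if not rest:
            return None               # an empty clause: no extension works
        out.add(frozenset(rest))
    return frozenset(out)

def sat_memo(cnf, n, t=(), H=None):
    # The pruned search: the canonical restricted CNF plays the role of
    # R(phi[t]) -- equal keys guarantee equal satisfiability.
    if H is None:
        H = {}
    key = restrict(cnf, t)
    if key is None:
        return False
    if key in H:
        return H[key]
    if len(t) == n:
        ans = True                    # every clause already satisfied
    else:
        ans = (sat_memo(cnf, n, t + (False,), H)
               or sat_memo(cnf, n, t + (True,), H))
    H[key] = ans
    return ans

phi = {frozenset(c) for c in [(1, 2), (-1, 3), (-2, -3)]}
print(sat_memo(phi, 3))  # True (e.g. x1 = x3 = True, x2 = False)
```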


### The Proof (continued)

Since R is a reduction, R(φ[ t ]) = R(φ[ t′ ]) implies that φ[ t ] and φ[ t′ ] are both satisfiable or both unsatisfiable.

R(φ[ t ]) has polynomial length ≤ p(n) because R runs in log space.

As R maps to unary numbers of length at most p(n), there are only polynomially many possible values of R(φ[ t ]).

How many nodes of the complete binary tree (of invocations/truth assignments) need to be visited?

If that number is a polynomial, the overall algorithm runs in polynomial time and we are done.


### The Proof (continued)

A search of the table takes time O(p(n)) in the random access memory model.

The running time is O(M p(n)), where M is the total number of invocations of the algorithm.

The invocations of the algorithm form a binary tree of depth at most n.


### The Proof (continued)

There is a set T = {t1, t2, . . .} of invocations (i.e., partial truth assignments) such that:

1. |T | ≥ (M − 1)/(2n).

2. All invocations in T are recursive (nonleaves).

3. None of the elements of T is a prefix of another.


[Figure: a tree of recursive invocations; the two marked nodes h and j form T = { h, j }.]


### The Proof (continued)

All invocations t ∈ T have different R(φ[ t ]) values.

None of h, j ∈ T is a prefix of the other.

– The invocation of one started after the invocation of the other had terminated.

– If they had the same value, the one that was invoked second would have looked it up, and therefore would not be recursive, a contradiction.

The existence of T implies that there are at least (M − 1)/(2n) different R(φ[ t ]) values in the table.


### The Proof (concluded)

We already know that there are at most p(n) such values.

Hence (M − 1)/(2n) ≤ p(n).

Thus M ≤ 2np(n) + 1.

The running time is therefore O(M p(n)) = O(n p²(n)).

We comment that this theorem holds for any sparse language, not just unary ones.a

Mahaney (1980).


### coNP-Completeness and Density

Theorem 85 (Fortune (1979)) If a unary language U ⊆ {0}* is coNP-complete, then P = NP.

Suppose there is a reduction R from sat complement to U .

The rest of the proof is basically identical except that, now, we want to make sure a formula is unsatisfiable.


### Exponential Circuit Complexity

Almost all boolean functions require 2ⁿ/(2n) gates to compute (generalized Theorem 14 on p. 164).

Progress of using circuit complexity to prove exponential lower bounds for NP-complete problems has been slow.

– As of January 2006, the best lower bound is 5n − o(n).a

Iwama and Morizumi (2002).


### Exponential Circuit Complexity for NP-Complete Problems

We shall prove exponential lower bounds for NP-complete problems using monotone circuits.

Monotone circuits are circuits without ¬ gates.

Note that this does not settle the P vs. NP problem or any of the conjectures on p. 545.


### The Power of Monotone Circuits

Monotone circuits can only compute monotone boolean functions.

They are powerful enough to solve a P-complete problem, monotone circuit value (p. 257).

There are NP-complete problems that are not monotone; they cannot be computed by monotone circuits at all.

There are NP-complete problems that are monotone; they can be computed by monotone circuits.

– hamiltonian path and clique.


### cliquen,k

cliquen,k is the boolean function deciding whether a graph G = (V, E) with n nodes has a clique of size k.

The input gates are the $\binom{n}{2}$ entries of the adjacency matrix of G.

Gate gij is set to true if the associated undirected edge { i, j } exists.

cliquen,k is a monotone function.

Thus it can be computed by a monotone circuit.

This does not rule out that nonmonotone circuits for cliquen,k may use fewer gates.

(41)

### Crude Circuits

One possible circuit for cliquen,k does the following.

1. For each S ⊆ V with |S| = k, there is a subcircuit with O(k2) ∧-gates testing whether S forms a clique.

2. We then take an or of the outcomes over all the $\binom{n}{k}$ subsets S1, S2, . . . , S_{\binom{n}{k}}.

This is a monotone circuit with $O(k^2 \binom{n}{k})$ gates, which is exponentially large unless k or n − k is a constant.

A crude circuit CC(X1, X2, . . . , Xm) tests whether any of X1, X2, . . . , Xm ⊆ V forms a clique.

The above-mentioned circuit is CC(S1, S2, . . . , S_{\binom{n}{k}}).
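The crude circuit's behavior can be simulated directly; the sketch below is an added illustration evaluating CC(S1, . . . , S_{C(n,k)}) on a tiny graph rather than a gate-level construction.

```python
from itertools import combinations

def crude_circuit(n, k):
    # CC(S1, ..., S_{C(n,k)}): for each k-subset S, AND together the
    # C(k,2) adjacency inputs of S; then OR all the subcircuits.
    subsets = list(combinations(range(n), k))
    def circuit(adj):                 # adj: set of frozenset({i, j}) edges
        return any(all(frozenset(e) in adj for e in combinations(S, 2))
                   for S in subsets)
    return circuit

c = crude_circuit(4, 3)
triangle = {frozenset(e) for e in [(0, 1), (1, 2), (0, 2)]}
print(c(triangle))                         # True: {0, 1, 2} is a 3-clique
print(c(triangle - {frozenset((0, 1))}))   # False
```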


### Sunflowers

Fix p ∈ Z⁺ and ℓ ∈ Z⁺.

A sunflower is a family of p sets {P1, P2, . . . , Pp}, called petals, each of cardinality at most ℓ.

All pairs of sets in the family must have the same intersection (called the core of the sunflower).


### A Sample Sunflower

{{1, 2, 3, 5}, {1, 2, 6, 9}, {0, 1, 2, 11}, {1, 2, 12, 13}, {1, 2, 8, 10}, {1, 2, 4, 7}}





### The Erdős–Rado Lemma

Lemma 86 Let Z be a family of more than M = (p − 1)^ℓ ℓ! nonempty sets, each of cardinality ℓ or less. Then Z must contain a sunflower (of size p).

Induction on ℓ.

For ℓ = 1, p different singletons form a sunflower (with an empty core).

Suppose ℓ > 1.

Consider a maximal subset D ⊆ Z of pairwise disjoint sets.

Every set in Z − D intersects some set in D.


### The Proof of the Erdős–Rado Lemma (continued)

Suppose D contains at least p sets.

D constitutes a sunflower with an empty core.

Suppose D contains fewer than p sets.

Let C be the union of all sets in D.

| C | ≤ (p − 1)ℓ, and C intersects every set in Z.

There is a d ∈ C that belongs to more than M/((p − 1)ℓ) = (p − 1)^{ℓ−1}(ℓ − 1)! sets in Z.

Consider Z′ = {Z − {d} : Z ∈ Z, d ∈ Z}.

Z′ has more than M′ = (p − 1)^{ℓ−1}(ℓ − 1)! sets.


### The Proof of the Erdős–Rado Lemma (concluded)

M′ is just M with ℓ replaced by ℓ − 1.

Z0 contains a sunflower by induction, say {P1, P2, . . . , Pp}.

– Now, {P1 ∪ {d}, P2 ∪ {d}, . . . , Pp ∪ {d}} is a sunflower in Z.


A family of more than M sets must contain a sunflower.

Plucking a sunflower entails replacing the sets in the sunflower by its core.

By repeatedly finding a sunflower and plucking it, we can reduce a family with more than M sets to a family with at most M sets.

If Z is a family of sets, the above result is denoted by pluck(Z).

Note: pluck(Z) is not unique.


### An Example of Plucking

Recall the sunflower on p. 703:

Z = {{1, 2, 3, 5}, {1, 2, 6, 9}, {0, 1, 2, 11}, {1, 2, 12, 13}, {1, 2, 8, 10}, {1, 2, 4, 7}}

Then

pluck(Z) = {{1, 2}}.
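Finding and plucking sunflowers can be sketched by brute force; this added illustration uses the example family above with p = 6 (all six sets share the pairwise intersection {1, 2}), and is not the efficient procedure the lemma's proof suggests.

```python
from itertools import combinations

def find_sunflower(family, p):
    # Brute force: any p sets whose pairwise intersections all coincide
    # form a sunflower; the common intersection is the core.
    for cand in combinations(family, p):
        cores = {a & b for a, b in combinations(cand, 2)}
        if len(cores) == 1:
            return set(cand), set(cores.pop())
    return None

def pluck(family, p):
    # Repeatedly replace the sets of a sunflower by their common core.
    fam = {frozenset(s) for s in family}
    while True:
        found = find_sunflower(fam, p)
        if found is None:
            return fam
        petals, core = found
        fam = (fam - petals) | {frozenset(core)}

Z = [{1, 2, 3, 5}, {1, 2, 6, 9}, {0, 1, 2, 11},
     {1, 2, 12, 13}, {1, 2, 8, 10}, {1, 2, 4, 7}]
print(pluck(Z, 6))  # {frozenset({1, 2})}
```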


### Razborov’s Theorem

Theorem 87 (Razborov (1985)) There is a constant c such that, for large enough n, all monotone circuits for cliquen,k with k = n^{1/4} have size at least n^{c n^{1/8}}.

We shall approximate any monotone circuit for cliquen,k by a restricted kind of crude circuit.

The approximation will proceed in steps: one step for each gate of the monotone circuit.

Each step introduces few errors (false positives and false negatives).

But the resulting crude circuit has exponentially many errors.

