### Back to maxsat

• In maxsat, the φ_i's are clauses.

• Hence p(φ_i) ≥ 1/2, which happens when φ_i contains a single literal.

• And the heuristic becomes a polynomial-time ε-approximation algorithm with ε = 1/2.^{a}

• If the clauses have k distinct literals, p(φ_i) = 1 − 2^{−k}.

• And the heuristic becomes a polynomial-time ε-approximation algorithm with ε = 2^{−k}.

  – This is the best possible for k ≥ 3 unless P = NP.

^{a}Johnson (1974).
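Concretely, the heuristic can be sketched as follows: fix the variables one at a time, each time keeping the value that maximizes the expected number of satisfied clauses when the remaining variables are set uniformly at random. A minimal Python sketch (the clause encoding and all names are our own, not from the text):

```python
def greedy_maxsat(clauses, n):
    """Fix x_1, ..., x_n one by one, each time keeping the value that
    maximizes the expected number of satisfied clauses when the still
    unassigned variables are set uniformly at random.
    A clause is a list of nonzero ints: +i means x_i, -i means not x_i."""
    def expected(assignment):
        total = 0.0
        for clause in clauses:
            unknown = 0
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assignment:
                    if assignment[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    unknown += 1
            if satisfied:
                total += 1.0
            elif unknown > 0:
                total += 1.0 - 2.0 ** (-unknown)  # p(clause) over the free literals
        return total

    assignment = {}
    for i in range(1, n + 1):
        assignment[i] = True
        e_true = expected(assignment)
        assignment[i] = False
        e_false = expected(assignment)
        assignment[i] = e_true >= e_false  # keep whichever value is better
    return assignment


def num_satisfied(clauses, assignment):
    return sum(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)
```

The expectation never decreases at any step, so the final assignment satisfies at least the initial expectation, which is at least 1/2 (or 1 − 2^{−k} for k-literal clauses) of all clauses.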

### max cut Revisited

• The NP-complete max cut seeks to partition the nodes of graph G = (V, E) into (S, V − S) so that there are as many edges as possible between S and V − S (p. 318).

• Local search starts from a feasible solution and performs “local” improvements until none are possible.

• Next we present a local search algorithm for max cut.

### A 0.5-Approximation Algorithm for max cut

1: S := ∅;
2: while ∃v ∈ V whose switching sides results in a larger cut do
3:   Switch the side of v;
4: end while
5: return S;

• A 0.12-approximation algorithm exists.^{a}

• 0.059-approximation algorithms do not exist unless NP = ZPP.

^{a}Goemans and Williamson (1995).
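The local-search procedure can be written down directly. A minimal Python sketch (names are ours; recomputing the cut size makes each pass O(|V||E|), which is enough to see the idea):

```python
def local_search_maxcut(vertices, edges):
    """Local search for max cut: start from S = empty set and switch the
    side of any vertex whose switch enlarges the cut, until no switch helps.
    Returns S; the cut is (S, V - S)."""
    def cut_size(side):
        return sum(1 for u, v in edges if (u in side) != (v in side))

    S = set()
    improved = True
    while improved:
        improved = False
        for v in vertices:
            before = cut_size(S)
            S.symmetric_difference_update({v})  # tentatively switch v's side
            if cut_size(S) > before:
                improved = True                  # keep the switch
            else:
                S.symmetric_difference_update({v})  # undo
    return S
```

On a triangle, the first switch already reaches a local optimum cutting 2 of the 3 edges, at least half of any cut.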

### Analysis

[Figure: V partitioned into V_1, V_2, V_3, V_4. “Our cut” separates V_1 ∪ V_2 from V_3 ∪ V_4; the optimal cut separates V_1 ∪ V_3 from V_2 ∪ V_4. The quantities e_{12}, e_{13}, e_{14}, e_{23}, e_{24}, e_{34} label the edges between the parts.]

### Analysis (continued)

• Partition V = V_1 ∪ V_2 ∪ V_3 ∪ V_4, where

  – Our algorithm returns (V_1 ∪ V_2, V_3 ∪ V_4).

  – The optimum cut is (V_1 ∪ V_3, V_2 ∪ V_4).

• Let e_{ij} be the number of edges between V_i and V_j.

• For each node v ∈ V_1, its edges to V_1 ∪ V_2 cannot outnumber those to V_3 ∪ V_4.

  – Otherwise, v would have been moved to V_3 ∪ V_4 to improve the cut.

### Analysis (continued)

• Considering all nodes in V_1 together, we have

      2e_{11} + e_{12} ≤ e_{13} + e_{14}.

  – The coefficient 2 on e_{11} is because each edge within V_1 is counted twice.

• The above inequality implies

      e_{12} ≤ e_{13} + e_{14}.

### Analysis (concluded)

• Similarly,

      e_{12} ≤ e_{23} + e_{24},
      e_{34} ≤ e_{23} + e_{13},
      e_{34} ≤ e_{14} + e_{24}.

• Add all four inequalities, divide both sides by 2, and add the inequality e_{14} + e_{23} ≤ e_{14} + e_{23} + e_{13} + e_{24} to obtain

      e_{12} + e_{34} + e_{14} + e_{23} ≤ 2(e_{13} + e_{14} + e_{23} + e_{24}).

• The left-hand side is the size of the optimum cut and the right-hand side is twice the size of our cut, so our solution is at least half the optimum.

### Approximability, Unapproximability, and Between

• knapsack, node cover, maxsat, and max cut have approximation thresholds less than 1.

  – knapsack has a threshold of 0 (p. 664).

  – But node cover and maxsat have a threshold larger than 0.

• The situation is maximally pessimistic for tsp: it cannot be approximated unless P = NP (p. 662).

  – The approximation threshold of tsp is 1.

    ∗ The threshold is 1/3 if the tsp satisfies the triangle inequality.

  – The same holds for independent set.

### Unapproximability of tsp^{a}

Theorem 77 The approximation threshold of tsp is 1 unless P = NP.

• Suppose there is a polynomial-time ε-approximation algorithm for tsp for some ε < 1.

• We shall construct a polynomial-time algorithm for the NP-complete hamiltonian cycle.

• Given any graph G = (V, E), construct a tsp instance with |V| cities with distances

      d_{ij} = 1,              if {i, j} ∈ E,
      d_{ij} = |V|/(1 − ε),    otherwise.

^{a}Sahni and Gonzalez (1976).

### The Proof (concluded)

• Run the alleged approximation algorithm on this tsp instance.

• Suppose a tour of cost |V| is returned.

  – This tour must be a Hamiltonian cycle.

• Suppose a tour with at least one edge of length |V|/(1 − ε) is returned.

  – The total length of this tour is > |V|/(1 − ε).

  – Because the algorithm is ε-approximate, the optimum is at least 1 − ε times the returned tour's length.

  – The optimum tour has a cost exceeding |V|.

  – Hence G has no Hamiltonian cycles.
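The reduction can be exercised end to end. Since no such ε-approximation algorithm exists (that is the point of the theorem), a brute-force exact solver stands in for it below; every name here is our own illustration:

```python
import itertools

def hc_to_tsp_distances(n, edges, eps):
    """Distances of the reduction: 1 on edges of G, |V|/(1 - eps) otherwise."""
    big = n / (1.0 - eps)
    E = {frozenset(e) for e in edges}
    d = {}
    for i in range(n):
        for j in range(i + 1, n):
            d[frozenset((i, j))] = 1.0 if frozenset((i, j)) in E else big
    return d

def optimal_tour_cost(n, d):
    """Brute-force optimum, standing in for the alleged algorithm."""
    best = float("inf")
    for perm in itertools.permutations(range(1, n)):
        tour = (0,) + perm
        cost = sum(d[frozenset((tour[k], tour[(k + 1) % n]))] for k in range(n))
        best = min(best, cost)
    return best

def has_hamiltonian_cycle(n, edges, eps=0.5):
    # A tour of cost exactly |V| uses only true edges: a Hamiltonian cycle.
    return optimal_tour_cost(n, hc_to_tsp_distances(n, edges, eps)) == n
```

A 4-cycle yields a tour of cost 4; removing one edge forces a long edge into every tour.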

### knapsack Has an Approximation Threshold of Zero^{a}

Theorem 78 For any ε > 0, there is a polynomial-time ε-approximation algorithm for knapsack.

• We have n weights w_1, w_2, . . . , w_n ∈ Z^+, a weight limit W, and n values v_1, v_2, . . . , v_n ∈ Z^+.^{b}

• We must find an S ⊆ {1, 2, . . . , n} such that Σ_{i∈S} w_i ≤ W and Σ_{i∈S} v_i is the largest possible.

^{a}Ibarra and Kim (1975).

^{b}If the values are fractional, the result is slightly messier, but the main conclusion remains correct. Contributed by Mr. Jr-Ben Tian (R92922045) on December 29, 2004.

### The Proof (continued)

• Let

      V = max{v_1, v_2, . . . , v_n}.

• Clearly, Σ_{i∈S} v_i ≤ nV.

• Let 0 ≤ i ≤ n and 0 ≤ v ≤ nV.

• W(i, v) is the minimum weight attainable by selecting some of the first i items with a total value of v.

• Set W(0, v) = ∞ for v ∈ {1, 2, . . . , nV} and W(i, 0) = 0 for i = 0, 1, . . . , n.^{a}

^{a}Contributed by Mr. Ren-Shuo Liu (D98922016) and Mr. Yen-Wei Wu (D98922013) on December 28, 2009.

### The Proof (continued)

• Then, for 0 ≤ i < n,

      W(i + 1, v) = min{W(i, v), W(i, v − v_{i+1}) + w_{i+1}}.

• Finally, pick the largest v such that W(n, v) ≤ W.

• The running time is O(n^2 V), not polynomial time, because V can be exponential in the length of the input.

*•* Key idea: Limit the number of precision bits.
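The dynamic program above translates directly into code; a Python sketch keeping one row W(i, ·) at a time (names are ours):

```python
def knapsack(weights, values, W):
    """Pseudo-polynomial knapsack: W_table[v] is the minimum weight
    attainable by selecting some of the first i items with total value
    exactly v.  Time O(n^2 V) with V = max value."""
    n = len(values)
    INF = float("inf")
    nV = n * max(values)
    W_table = [0] + [INF] * nV       # W(0, 0) = 0, W(0, v) = infinity
    for i in range(n):
        new = W_table[:]             # option 1: skip item i
        for v in range(values[i], nV + 1):
            cand = W_table[v - values[i]] + weights[i]  # option 2: take item i
            if cand < new[v]:
                new[v] = cand
        W_table = new
    # pick the largest v such that W(n, v) <= W
    return max(v for v in range(nV + 1) if W_table[v] <= W)
```

For weights (3, 4, 5), values (4, 5, 6), and W = 7, the best choice is the first two items, for value 9.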

### The Proof (continued)

• Define

      v_i' = 2^b ⌊v_i/2^b⌋.

  – This is equivalent to zeroing each v_i's last b bits.

• From the original instance

      x = (w_1, . . . , w_n, W, v_1, . . . , v_n),

  define the approximate instance

      x' = (w_1, . . . , w_n, W, v_1', . . . , v_n').

### The Proof (continued)

• Solving x' takes time O(n^2 V/2^b).

  – The algorithm only performs subtractions on the v_i-related values.

  – So the last b bits can be removed from the calculations.

  – That is, use ⌊v_i/2^b⌋ in the calculations.

  – Then multiply the returned value by 2^b.

• The solution S' is close to the optimum solution S:

      Σ_{i∈S'} v_i ≥ Σ_{i∈S'} v_i' ≥ Σ_{i∈S} v_i' ≥ Σ_{i∈S} (v_i − 2^b) ≥ Σ_{i∈S} v_i − n2^b.

### The Proof (continued)

• Hence

      Σ_{i∈S'} v_i ≥ Σ_{i∈S} v_i − n2^b.

• Without loss of generality, assume w_i ≤ W for all i.

  – Otherwise, item i is redundant.

• V is a lower bound on opt.

  – Picking the single item with value V is a legitimate choice.

• The relative error from the optimum is ≤ n2^b/V:

      (Σ_{i∈S} v_i − Σ_{i∈S'} v_i) / Σ_{i∈S} v_i ≤ (Σ_{i∈S} v_i − Σ_{i∈S'} v_i) / V ≤ n2^b/V.

### The Proof (concluded)

• Suppose we pick b = ⌊log₂(εV/n)⌋.

• The algorithm becomes ε-approximate (see Eq. (10) on p. 640).

• The running time is then O(n^2 V/2^b) = O(n^3/ε), a polynomial in n and 1/ε.^{a}

^{a}It hence depends on the value of 1/ε. Thanks to a lively class discussion on December 20, 2006. If we fix ε and let the problem size increase, then the complexity is cubic. Contributed by Mr. Ren-Shan Luoh (D97922014) on December 23, 2008.
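Putting the scaling and an exact solver together gives the full scheme. The sketch below uses a simple value-indexed exact solver that also recovers the item set; the structure, not the constants, is the point, and all names are our own:

```python
import math

def exact_knapsack_set(weights, values, W):
    """Exact solver: map each reachable total value to its minimum-weight
    item set, adding items one at a time."""
    best = {0: (0, frozenset())}          # value -> (min weight, item set)
    for i, (w, v) in enumerate(zip(weights, values)):
        cur = dict(best)
        for val, (wt, items) in best.items():
            nv, nw = val + v, wt + w
            if nw <= W and (nv not in cur or nw < cur[nv][0]):
                cur[nv] = (nw, items | {i})
        best = cur
    return best[max(best)][1]

def knapsack_fptas(weights, values, W, eps):
    """Zero the last b bits of each value, b = floor(log2(eps*V/n)),
    then solve the scaled instance exactly."""
    n, V = len(values), max(values)
    ratio = eps * V / n
    b = math.floor(math.log2(ratio)) if ratio >= 1 else 0
    scaled = [v >> b for v in values]     # floor(v_i / 2^b)
    return exact_knapsack_set(weights, scaled, W)
```

With values (40, 50, 60), weights (3, 4, 5), W = 7, and ε = 0.5, we get b = 3, and the scaled instance still picks the optimum set {0, 1}.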

### Pseudo-Polynomial-Time Algorithms

• Consider problems with inputs that consist of a collection of integer parameters (tsp, knapsack, etc.).

• An algorithm for such a problem whose running time is a polynomial of the input length and the value (not length) of the largest integer parameter is a pseudo-polynomial-time algorithm.^{a}

• On p. 666, we presented a pseudo-polynomial-time algorithm for knapsack that runs in time O(n^2 V).

• How about tsp (d), another NP-complete problem?

^{a}Garey and Johnson (1978).

### No Pseudo-Polynomial-Time Algorithms for tsp (d)

• By definition, a pseudo-polynomial-time algorithm becomes polynomial-time if each integer parameter is limited to having a value polynomial in the input length.

• Corollary 42 (p. 335) showed that hamiltonian path is reducible to tsp (d) with weights 1 and 2.

• As hamiltonian path is NP-complete, tsp (d) cannot have pseudo-polynomial-time algorithms unless P = NP.

• tsp (d) is said to be strongly NP-hard.

• Many weighted versions of NP-complete problems are strongly NP-hard.

### Polynomial-Time Approximation Scheme

• Algorithm M is a polynomial-time approximation scheme (PTAS) for a problem if:

  – For each ε > 0 and instance x of the problem, M runs in time polynomial (depending on ε) in |x|.

    ∗ Think of ε as a constant.

  – M is an ε-approximation algorithm for every ε > 0.

### Fully Polynomial-Time Approximation Scheme

• A polynomial-time approximation scheme is fully polynomial (FPTAS) if the running time depends polynomially on |x| and 1/ε.

  – Maybe the best result for a “hard” problem.

  – For instance, knapsack has an FPTAS with a running time of O(n^3/ε) (p. 664).

### Square of G

• Let G = (V, E) be an undirected graph.

• G^2 has nodes {(v_1, v_2) : v_1, v_2 ∈ V} and edges

      {{(u, u'), (v, v')} : (u = v ∧ {u', v'} ∈ E) ∨ {u, v} ∈ E}.

[Figure: a three-node graph G (nodes 1, 2, 3) and its square G^2 (nodes (1,1) through (3,3)).]
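A direct construction of G^2 from this definition, together with the product map that turns an independent set of G into one of G^2 (a small sketch; names are ours):

```python
from itertools import product

def graph_square(V, E):
    """G^2: nodes are pairs (v1, v2); {(u, u'), (v, v')} is an edge iff
    u = v and {u', v'} in E, or {u, v} in E."""
    E = {frozenset(e) for e in E}
    V2 = list(product(V, V))
    E2 = set()
    for (u, u1), (v, v1) in product(V2, V2):
        if (u, u1) == (v, v1):
            continue
        if (u == v and frozenset((u1, v1)) in E) or frozenset((u, v)) in E:
            E2.add(frozenset(((u, u1), (v, v1))))
    return V2, E2

def lift_independent_set(I):
    """An independent set I of G yields I x I, independent in G^2."""
    return {(u, v) for u in I for v in I}
```

On the path 1-2-3, the independent set {1, 3} of G lifts to the four pairwise nonadjacent nodes {1, 3} × {1, 3} of G^2.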

### Independent Sets of G and G^2

Lemma 79 G = (V, E) has an independent set of size k if and only if G^2 has an independent set of size k^2.

• Suppose G has an independent set I ⊆ V of size k.

• {(u, v) : u, v ∈ I} is an independent set of size k^2 of G^2.


### The Proof (continued)

• Suppose G^2 has an independent set I^2 of size k^2.

• U ≡ {u : ∃v ∈ V, (u, v) ∈ I^2} is an independent set of G.

• |U| is the number of “rows” that the nodes in I^2 occupy.

### The Proof (concluded)^{a}

• If |U| ≥ k, then we are done.

• Now assume |U| < k.

• As the k^2 nodes in I^2 cover fewer than k “rows,” there must be a “row” in possession of > k nodes of I^2.

• Those > k nodes will be independent in G, as each “row” is a copy of G.

^{a}Thanks to a lively class discussion on December 29, 2004.

### Approximability of independent set

• The approximation threshold of the maximum independent set is either zero or one (it is one!).

Theorem 80 If there is a polynomial-time ε-approximation algorithm for independent set for some 0 < ε < 1, then there is a polynomial-time approximation scheme.

• Let G be a graph with a maximum independent set of size k.

• Suppose there is an O(n^i)-time ε-approximation algorithm for independent set.

• We seek a polynomial-time ε'-approximation algorithm with ε' < ε.

### The Proof (continued)

• By Lemma 79 (p. 676), the maximum independent set of G^2 has size k^2.

• Apply the algorithm to G^2.

• The running time is O(n^{2i}).

• The resulting independent set has size ≥ (1 − ε) k^2.

• By the construction in Lemma 79 (p. 676), we can obtain an independent set of size ≥ √((1 − ε) k^2) = √(1 − ε) k for G.

• Hence there is a (1 − √(1 − ε))-approximation algorithm for independent set by Eq. (11) on p. 641.

### The Proof (concluded)

• In general, we can apply the algorithm to G^{2^ℓ} to obtain a (1 − (1 − ε)^{2^{−ℓ}})-approximation algorithm for independent set.

• The running time is n^{2^ℓ i}.^{a}

• Now pick ℓ = ⌈log₂(log(1 − ε)/log(1 − ε'))⌉.

• The running time becomes n^{i log(1 − ε)/log(1 − ε')}.

• It is an ε'-approximation algorithm for independent set.

^{a}It is not fully polynomial.

### Comments

• independent set and node cover are reducible to each other (Corollary 39, p. 312).

• node cover has an approximation threshold at most 0.5 (p. 646).

• But independent set is unapproximable (see the textbook).

• independent set limited to graphs with degree ≤ k is called k-degree independent set.

• k-degree independent set is approximable (see the textbook).

### On P vs. NP

### Density^{a}

The density of language L ⊆ Σ^* is defined as

      dens_L(n) = |{x ∈ L : |x| ≤ n}|.

• If L = {0, 1}^*, then dens_L(n) = 2^{n+1} − 1.

• So the density function grows at most exponentially.

• For a unary language L ⊆ {0}^*,

      dens_L(n) ≤ n + 1.

  – Because L ⊆ {ε, 0, 00, . . . , 0^n, . . .}.

^{a}Berman and Hartmanis (1977).

### Sparsity

• Sparse languages are languages with polynomially bounded density functions.

• Dense languages are languages with superpolynomial density functions.

### Self-Reducibility for sat

• An algorithm exhibits self-reducibility if it finds a certificate by exploiting algorithms for the decision version of the same problem.

• Let φ be a boolean expression in n variables x_1, x_2, . . . , x_n.

• t ∈ {0, 1}^j is a partial truth assignment for x_1, x_2, . . . , x_j.

• φ[t] denotes the expression after substituting the truth values of t for x_1, x_2, . . . , x_{|t|} in φ.

### An Algorithm for sat with Self-Reduction

We call the algorithm below with empty t.

1: if |t| = n then
2:   return φ[t];
3: else
4:   return φ[t0] ∨ φ[t1];
5: end if

The above algorithm runs in exponential time, by visiting all the partial assignments (or nodes on a depth-n binary tree).
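The self-reduction in runnable form: a brute-force satisfiability test stands in for the decision algorithm (the whole point is that only decision answers are consulted); the encoding and names are ours:

```python
from itertools import product

def sat_decide(phi, n, t):
    """Decision-oracle stand-in: brute-force satisfiability of phi under
    partial assignment t (a tuple of 0/1 for x_1..x_len(t)).
    phi is a list of clauses; +i means x_i, -i means not x_i."""
    for rest in product((0, 1), repeat=n - len(t)):
        a = t + rest
        if all(any((lit > 0) == bool(a[abs(lit) - 1]) for lit in c) for c in phi):
            return True
    return False

def sat_certificate(phi, n):
    """Self-reduction: extend t one variable at a time, always keeping
    phi[t] satisfiable, to recover a satisfying assignment."""
    if not sat_decide(phi, n, ()):
        return None
    t = ()
    while len(t) < n:
        # If phi[t0] is satisfiable take 0; otherwise 1 must work.
        t = t + (0,) if sat_decide(phi, n, t + (0,)) else t + (1,)
    return t
```

Only n + 1 decision calls are needed on top of the initial one, so a polynomial-time decision algorithm would yield a polynomial-time certificate finder.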

### NP-Completeness and Density^{a}

Theorem 81 If a unary language U ⊆ {0}^* is NP-complete, then P = NP.

• Suppose there is a reduction R from sat to U.

• We use R to find a truth assignment that satisfies boolean expression φ with n variables if it is satisfiable.

• Specifically, we use R to prune the exponential-time exhaustive search on p. 687.

• The trick is to keep the already discovered results φ[t] in a table H.

^{a}Berman (1978).

1: if |t| = n then
2:   return φ[t];
3: else
4:   if (R(φ[t]), v) is in table H then
5:     return v;
6:   else
7:     if φ[t0] = “satisfiable” or φ[t1] = “satisfiable” then
8:       Insert (R(φ[t]), “satisfiable”) into H;
9:       return “satisfiable”;
10:    else
11:      Insert (R(φ[t]), “unsatisfiable”) into H;
12:      return “unsatisfiable”;
13:    end if
14:  end if
15: end if
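A Python rendering of the pruned search. The table H is keyed by R(φ[t]); a genuine log-space reduction R to a unary language is what bounds the table and is not reproduced here, so R is a parameter (with the identity key the sketch degenerates to plain memoization, but the structure is the same):

```python
def pruned_search(phi, n, R):
    """Search over partial assignments t, memoizing results in a table H
    keyed by R(phi, t).  Soundness needs only that equal keys imply equal
    satisfiability of phi[t], which any reduction guarantees."""
    H = {}
    def search(t):
        if len(t) == n:                   # leaf: evaluate phi[t]
            return all(any((lit > 0) == bool(t[abs(lit) - 1]) for lit in c)
                       for c in phi)
        key = R(phi, t)
        if key in H:
            return H[key]                 # pruned: answer already known
        result = search(t + (0,)) or search(t + (1,))
        H[key] = result
        return result
    return search(())
```

When R maps to unary strings of length ≤ p(n), the table, and hence the number of recursive invocations, stays polynomial, which is the heart of the proof below.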

### The Proof (continued)

• Since R is a reduction, R(φ[t]) = R(φ[t']) implies that φ[t] and φ[t'] must be both satisfiable or both unsatisfiable.

• R(φ[t]) has polynomial length ≤ p(n) because R runs in log space.

• As R maps to unary numbers, there are at most p(n) distinct values of R(φ[t]).

• How many nodes of the complete binary tree (of invocations/truth assignments) need to be visited?

• If that number is a polynomial, the overall algorithm runs in polynomial time and we are done.

### The Proof (continued)

• A search of the table takes time O(p(n)) in the random access memory model.

• The running time is O(M p(n)), where M is the total number of invocations of the algorithm.

• The invocations of the algorithm form a binary tree of depth at most n.

### The Proof (continued)

• There is a set T = {t_1, t_2, . . .} of invocations (i.e., partial truth assignments) such that:

  1. |T| ≥ (M − 1)/(2n).

  2. All invocations in T are recursive (nonleaves).

  3. None of the elements of T is a prefix of another.

[Figure: 1st step: delete the leaves; (M − 1)/2 nonleaves remain. 2nd step: select any bottom undeleted invocation t and add it to T. 3rd step: delete all of t's at most n ancestors (prefixes) from further consideration.]

### An Example

[Figure: an invocation tree on nodes r, a, c, d, e, f, g, h, i, j, k, l; applying the three steps (numbered 1–5 in the figure) yields T = {h, j}.]

### The Proof (continued)

• All invocations t ∈ T have different R(φ[t]) values.

  – None of h, j ∈ T is a prefix of the other.

  – The invocation of one started after the invocation of the other had terminated.

  – If they had the same value, the one that was invoked second would have looked it up, and therefore would not be recursive, a contradiction.

• The existence of T implies that there are at least (M − 1)/(2n) different R(φ[t]) values in the table.

### The Proof (concluded)

• We already know that there are at most p(n) such values.

• Hence (M − 1)/(2n) ≤ p(n).

• Thus M ≤ 2np(n) + 1.

• The running time is therefore O(M p(n)) = O(np^2(n)).

• We comment that this theorem holds for any sparse language, not just unary ones.^{a}

^{a}Mahaney (1980).

### coNP-Completeness and Density

Theorem 82 (Fortune (1979)) If a unary language U ⊆ {0}^* is coNP-complete, then P = NP.

• Suppose there is a reduction R from sat complement to U.

• The rest of the proof is basically identical, except that now we want to make sure a formula is unsatisfiable.