## NP-Complete Problems

Wir müssen wissen, wir werden wissen.

(We must know, we shall know.)

— David Hilbert (1900)

### Two Notions

• Let R ⊆ Σ^{∗} × Σ^{∗} be a binary relation on strings.

• R is called polynomially decidable if {x; y : (x, y) ∈ R} is in P.

• R is said to be polynomially balanced if there is a k ≥ 1 such that (x, y) ∈ R implies |y| ≤ |x|^{k}.

### An Alternative Characterization of NP

Proposition 30 (Edmonds (1965)) Let L ⊆ Σ^{∗} be a language. Then L ∈ NP if and only if there is a polynomially decidable and polynomially balanced relation R such that

L = {x : ∃y (x, y) ∈ R}.

• Suppose such an R exists.

• L can be decided by this NTM:

– On input x, the NTM guesses a y of length ≤ |x|^{k} and tests if (x, y) ∈ R in polynomial time.

– It returns “yes” if the test is positive.

### The Proof (concluded)

• Now suppose L ∈ NP.

• Let NTM N decide L in time |x|^{k}.

• Define R as follows: (x, y) ∈ R if and only if y is the encoding of an accepting computation of N on input x.

• Clearly R is polynomially balanced because N is polynomially bounded.

• R is polynomially decidable because y can be verified efficiently by checking each step against N's transition function.

• Finally L = {x : (x, y) ∈ R for some y} because N decides L.

### Comments

• Any “yes” instance x of an NP problem has at least one succinct certificate or polynomial witness y.

• “No” instances have none.

• Certificates are short and easy to verify.

– An alleged satisfying truth assignment for sat; an alleged Hamiltonian path for hamiltonian path.

• Certificates may be hard to generate (otherwise, NP equals P), but verification must be easy.

• NP is the class of easy-to-verify (in P) problems.
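As a hedged illustration of "easy to verify," the sketch below checks an alleged satisfying truth assignment for a CNF formula in time linear in the formula size. The integer encoding of literals (+i for x_i, −i for ¬x_i) and the function name are assumptions made for this example only.

```python
from typing import Dict, List

# Assumed encoding: literal +i stands for x_i, literal -i stands for NOT x_i.
Clause = List[int]

def verify_sat_certificate(clauses: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Return True iff the assignment (the certificate y) satisfies every clause of the input x."""
    def literal_is_true(lit: int) -> bool:
        value = assignment.get(abs(lit), False)  # unassigned variables default to False
        return value if lit > 0 else not value

    # One pass over the formula: each clause needs at least one true literal.
    return all(any(literal_is_true(lit) for lit in clause) for clause in clauses)

# (x1 or x2 or not x3) and (x1 or x1 or not x2) and (x1 or not x2 or not x3)
phi = [[1, 2, -3], [1, 1, -2], [1, -2, -3]]
print(verify_sat_certificate(phi, {1: True, 2: False, 3: False}))
```

The verifier never searches for an assignment; finding one is the part that may be hard.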

### You Have an NP-Complete Problem (for Your Thesis)

• By Proposition 24 (p. 221) and Proposition 25 (p. 224), it is among the problems in NP least likely to be in P.

• Your options are:

– Approximations.

– Special cases.

– Average performance.

– Randomized algorithms.

– Exponential-time algorithms that work well in practice.

– “Heuristics” (and pray).

### 3sat

• k-sat, where k ∈ Z^{+}, is a special case of sat.

• The formula is in CNF and all clauses have exactly k literals (repetition of literals is allowed).

• For example,

(x1 ∨ x2 ∨ ¬x3) ∧ (x1 ∨ x1 ∨ ¬x2) ∧ (x1 ∨ ¬x2 ∨ ¬x3).

### 3sat Is NP-Complete

• Recall Cook’s Theorem (p. 242) and the reduction of circuit sat to sat (p. 210).

• The resulting CNF has at most 3 literals for each clause.

– This shows that 3sat where each clause has at most 3 literals is NP-complete.

• Finally, duplicate one literal once or twice to make it a 3sat formula.

• Note: The overall reduction remains parsimonious.
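The padding step can be sketched as follows. This is a minimal sketch that assumes clauses are lists of nonzero integers (+i for x_i, −i for ¬x_i) and that every clause already has between 1 and 3 literals; the function name is hypothetical.

```python
from typing import List

def pad_to_three(clauses: List[List[int]]) -> List[List[int]]:
    """Duplicate the last literal of each short clause until it has exactly 3 literals."""
    padded = []
    for clause in clauses:
        if not 1 <= len(clause) <= 3:
            raise ValueError("each clause must have 1 to 3 literals")
        # Repeating a literal does not change which assignments satisfy the clause.
        padded.append(list(clause) + [clause[-1]] * (3 - len(clause)))
    return padded

print(pad_to_three([[1], [1, -2], [1, 2, 3]]))
```

Because a repeated literal adds no new satisfying assignments, the number of satisfying assignments is unchanged, which is why the reduction stays parsimonious.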

### The Satisfiability of Random 3sat Expressions

• Consider a random 3sat expression φ with n variables and cn clauses.

• Each clause is chosen independently and uniformly from the set of all possible clauses.

• Intuitively, the larger the c, the less likely φ is satisfiable as more constraints are added.

• Indeed, there is a c_{n} such that for c < c_{n}(1 − ε), φ is satisfiable almost surely, and for c > c_{n}(1 + ε), φ is unsatisfiable almost surely.^{a}

^{a}Friedgut and Bourgain (1999). As of 2006, 3.52 < c_{n} < 4.596.

### Another Variant of 3sat

Proposition 31 3sat is NP-complete for expressions in which each variable is restricted to appear at most three times, and each literal at most twice. (3sat here requires only that each clause has at most 3 literals.)

• Consider a general 3sat expression in which x appears k times.

• Replace the first occurrence of x by x1, the second by x2, and so on, where x1, x2, . . . , xk are k new variables.

### The Proof (concluded)

• Add (¬x1 ∨ x2) ∧ (¬x2 ∨ x3) ∧ · · · ∧ (¬xk ∨ x1) to the expression.

– This is logically equivalent to x1 ⇒ x2 ⇒ · · · ⇒ xk ⇒ x1.

– Note that each clause above has fewer than 3 literals.

• The resulting equivalent expression satisfies the condition for x.

### An Example

• Suppose we are given the following 3sat expression

· · · (¬x ∨ w ∨ g) ∧ · · · ∧ (x ∨ y ∨ z) · · · .

• The transformed expression is

· · · (¬x1 ∨ w ∨ g) ∧ · · · ∧ (x2 ∨ y ∨ z) · · · (¬x1 ∨ x2) ∧ (¬x2 ∨ x1).

– Variable x1 appears thrice.

– Literal x1 appears once.

– Literal ¬x1 appears twice.
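The transformation can be sketched in code. This is an illustrative sketch under assumptions: literals are nonzero integers (+i for x_i, −i for ¬x_i), fresh variables are numbered after the existing ones, and variables that occur only once are left untouched since they already meet the restriction.

```python
from collections import Counter
from itertools import count
from typing import List

def restrict_occurrences(clauses: List[List[int]]) -> List[List[int]]:
    """Give every occurrence of a repeated variable its own copy, then chain the copies in a cycle."""
    num_vars = max(abs(lit) for clause in clauses for lit in clause)
    occurrences = Counter(abs(lit) for clause in clauses for lit in clause)
    fresh = count(num_vars + 1)                 # generator of unused variable indices
    copies = {var: [] for var in occurrences}   # variable -> its new copies, in order
    result = []
    for clause in clauses:
        new_clause = []
        for lit in clause:
            var = abs(lit)
            if occurrences[var] == 1:           # already within the limits
                new_clause.append(lit)
                continue
            new_var = next(fresh)
            copies[var].append(new_var)
            new_clause.append(new_var if lit > 0 else -new_var)
        result.append(new_clause)
    # Add (not x_j or x_{j+1}) around the cycle, forcing all copies to be equal.
    for new_vars in copies.values():
        for j, v in enumerate(new_vars):
            result.append([-v, new_vars[(j + 1) % len(new_vars)]])
    return result

# The example above with x=1, w=2, g=3, y=4, z=5; the copies x1, x2 become 6, 7.
print(restrict_occurrences([[-1, 2, 3], [1, 4, 5]]))
```

Each copy occurs once in an original clause and twice in the cycle, once positively and once negatively, so no variable occurs more than three times and no literal more than twice.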

### 2sat and Graphs

• Let φ be an instance of 2sat: Each clause has 2 literals.

• Define graph G(φ) as follows:

– The nodes are the variables and their negations.

– Add edges (¬α, β) and (¬β, α) to G(φ) if α ∨ β is a clause in φ.

∗ For example, if x ∨ ¬y ∈ φ, add (¬x, ¬y) and (y, x).

∗ Two edges are added for each clause.

• Think of the edges as ¬α ⇒ β and ¬β ⇒ α.

• b is reachable from a iff ¬a is reachable from ¬b.

• Paths in G(φ) are valid implications.

### Illustration: Directed Graph for (x1 ∨ x2) ∧ (x1 ∨ ¬x3) ∧ (¬x1 ∨ x2) ∧ (x2 ∨ x3)

[Figure: the directed graph G(φ) on the nodes x1, x2, x3, ¬x1, ¬x2, ¬x3.]

### Properties of G(φ)

Theorem 32 φ is unsatisfiable if and only if there is a variable x such that there are paths from x to ¬x and from ¬x to x in G(φ).
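A minimal sketch of the test in Theorem 32, assuming 2sat clauses are pairs of nonzero integers (+i for x_i, −i for ¬x_i). It uses plain depth-first search for reachability, so it illustrates the polynomial-time test rather than a logarithmic-space one; all function names are hypothetical.

```python
from collections import defaultdict

def implication_graph(clauses):
    """Build G(phi): for each clause (a or b), add edges (not a -> b) and (not b -> a)."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    return graph

def reachable(graph, source, target):
    """Depth-first search: is there a path from source to target?"""
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def is_unsatisfiable_2sat(clauses):
    """Theorem 32: unsatisfiable iff some x reaches not-x and not-x reaches x."""
    graph = implication_graph(clauses)
    variables = {abs(lit) for clause in clauses for lit in clause}
    return any(reachable(graph, x, -x) and reachable(graph, -x, x) for x in variables)

# The illustrated expression (x1 or x2)(x1 or not x3)(not x1 or x2)(x2 or x3).
print(is_unsatisfiable_2sat([(1, 2), (1, -3), (-1, 2), (2, 3)]))
```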

### 2sat Is in NL ⊆ P

• NL is a subset of P (p. 181).

• By Eq. (3) on p. 191, coNL equals NL.

• We need to show only that recognizing unsatisfiable expressions is in NL.

• In nondeterministic logarithmic space, we can test the conditions of Theorem 32 (p. 263) by guessing a variable x and testing if ¬x is reachable from x and if ¬x can reach x.

– See the algorithm for reachability (p. 94).

### Generalized 2sat: max2sat

• Consider a 2sat expression.

• Let K ∈ N.

• max2sat is the problem of whether there is a truth assignment that satisfies at least K of the clauses.

• max2sat becomes 2sat when K equals the number of clauses.

• max2sat is an optimization problem.

• max2sat ∈ NP: Guess a truth assignment and verify the count.

### max2sat Is NP-Complete^{a}

• Consider the following 10 clauses:

(x) ∧ (y) ∧ (z) ∧ (w)

(¬x ∨ ¬y) ∧ (¬y ∨ ¬z) ∧ (¬z ∨ ¬x)

(x ∨ ¬w) ∧ (y ∨ ¬w) ∧ (z ∨ ¬w)

• Let the 2sat formula r(x, y, z, w) represent the conjunction of these clauses.

• How many clauses can we satisfy?

• The clauses are symmetric with respect to x, y, and z.

^{a}Garey, Johnson, and Stockmeyer (1976).

### The Proof (continued)

All of x, y, z are true: By setting w to true, we satisfy 4 + 0 + 3 = 7 clauses, whereas by setting w to false, we satisfy only 3 + 0 + 3 = 6 clauses.

Two of x, y, z are true: By setting w to true, we satisfy 3 + 2 + 2 = 7 clauses, whereas by setting w to false, we satisfy 2 + 2 + 3 = 7 clauses.

### The Proof (continued)

One of x, y, z is true: By setting w to false, we satisfy 1 + 3 + 3 = 7 clauses, whereas by setting w to true, we satisfy only 2 + 3 + 1 = 6 clauses.

None of x, y, z is true: By setting w to false, we satisfy 0 + 3 + 3 = 6 clauses, whereas by setting w to true, we satisfy only 1 + 3 + 0 = 4 clauses.
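The four cases can be checked exhaustively. The sketch below enumerates all 16 assignments to x, y, z, w and records, for each choice of x, y, z, the best count over the two values of w; the function and variable names are assumptions for this example.

```python
from itertools import product

def satisfied_count(x: bool, y: bool, z: bool, w: bool) -> int:
    """Count how many of the 10 clauses of r(x, y, z, w) are true."""
    clauses = [
        x, y, z, w,                                   # the four unit clauses
        not (x and y), not (y and z), not (z and x),  # (not x or not y), (not y or not z), (not z or not x)
        x or not w, y or not w, z or not w,           # (x or not w), (y or not w), (z or not w)
    ]
    return sum(clauses)

# For each (x, y, z), the best achievable count over both choices of w.
best_over_w = {
    xyz: max(satisfied_count(*xyz, w) for w in (False, True))
    for xyz in product((False, True), repeat=3)
}
for xyz, best in best_over_w.items():
    print(xyz, best)
```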

### The Proof (continued)

• Any truth assignment that satisfies x ∨ y ∨ z can be extended to satisfy 7 of the 10 clauses and no more.

• Any other truth assignment can be extended to satisfy only 6 of them.

• The reduction from 3sat φ to max2sat R(φ):

– For each clause Ci = (α ∨ β ∨ γ) of φ, add group r(α, β, γ, wi) to R(φ).

– If φ has m clauses, then R(φ) has 10m clauses.

• Set K = 7m.

### The Proof (concluded)

• We now show that K clauses of R(φ) can be satisfied if and only if φ is satisfiable.

• Suppose 7m clauses of R(φ) can be satisfied.

– 7 clauses must be satisfied in each group because each group can have at most 7 clauses satisfied.

– Hence all clauses of φ must be satisfied.

• Suppose all clauses of φ are satisfied.

– Each group can set its wi appropriately to have 7 clauses satisfied.
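A hedged sketch of the reduction, with a brute-force check that is only feasible for tiny instances. Literals are assumed to be nonzero integers (+i for x_i, −i for ¬x_i), and the fresh variable wi is assumed to get index num_vars + 1 + i; both helper names are hypothetical.

```python
from itertools import product

def reduce_3sat_to_max2sat(clauses, num_vars):
    """Return (R(phi), K): one 10-clause group r(a, b, c, w_i) per clause, and K = 7m."""
    reduced = []
    for i, (a, b, c) in enumerate(clauses):
        w = num_vars + 1 + i  # fresh variable w_i for clause C_i (assumed numbering)
        reduced += [[a], [b], [c], [w],
                    [-a, -b], [-b, -c], [-c, -a],
                    [a, -w], [b, -w], [c, -w]]
    return reduced, 7 * len(clauses)

def max_satisfied(clauses, num_vars):
    """Brute force over all assignments: the largest number of clauses satisfied at once."""
    best = 0
    for bits in product((False, True), repeat=num_vars):
        def value(lit):
            v = bits[abs(lit) - 1]
            return v if lit > 0 else not v
        best = max(best, sum(any(value(l) for l in clause) for clause in clauses))
    return best

phi = [[1, 2, -3], [1, 1, -2], [1, -2, -3]]          # satisfiable, m = 3
R, K = reduce_3sat_to_max2sat(phi, 3)
print(len(R), K, max_satisfied(R, 3 + len(phi)))
```

For an unsatisfiable φ, at least one group stays at 6 under every assignment, so the maximum falls short of 7m.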

### naesat

• naesat (for “not-all-equal” sat) is like 3sat.

• But we require additionally that there be a satisfying truth assignment under which no clause has its three literals equal in truth value.

– Each clause must have at least one literal assigned true and at least one literal assigned false.

### naesat Is NP-Complete^{a}

• Recall the reduction of circuit sat to sat on p. 210.

• It produced a CNF φ in which each clause has at most 3 literals.

• Add the same variable z to all clauses with fewer than 3 literals to make it a 3sat formula.

• Goal: The new formula φ(z) is nae-satisfiable if and only if the original circuit is satisfiable.

^{a}Karp (1972).

### The Proof (continued)

• Suppose T nae-satisfies φ(z).

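– The complementary assignment T̄ also nae-satisfies φ(z).

– Under one of T and T̄, variable z takes the value false.

– This truth assignment must still satisfy all clauses of φ, because a padded clause with z false needs a true literal among its original literals.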

– So it satisfies the original circuit.
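The symmetry used above can be seen in a small checker. This sketch assumes clauses are triples of nonzero integers (+i for x_i, −i for ¬x_i) and assignments are dictionaries from variable index to truth value; the function name is hypothetical.

```python
def nae_satisfies(clauses, assignment):
    """True iff every clause has at least one true literal and at least one false literal."""
    def value(lit):
        v = assignment[abs(lit)]
        return v if lit > 0 else not v
    # A clause is fine exactly when its literals take both truth values.
    return all(len({value(lit) for lit in clause}) == 2 for clause in clauses)

phi = [[1, 2, 2], [1, -3, -3], [-1, -2, 3]]
T = {1: True, 2: False, 3: True}
T_bar = {var: not val for var, val in T.items()}   # the complementary assignment
print(nae_satisfies(phi, T), nae_satisfies(phi, T_bar))
```

Flipping every variable flips every literal, so a clause with mixed truth values stays mixed.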

### The Proof (concluded)

• Suppose there is a truth assignment that satisfies the circuit.

– Then there is a truth assignment T that satisfies every clause of φ.

– Extend T by adding T(z) = false to obtain T^{′}.

– T^{′} satisfies φ(z).

– So in no clauses are all three literals false under T^{′}.

– Under T^{′}, in no clauses are all three literals true.

∗ Review the detailed construction on p. 211 and p. 212.

### Undirected Graphs

• An undirected graph G = (V, E) has a finite set of nodes, V , and a set of undirected edges, E.

• It is like a directed graph except that the edges have no directions and there are no self-loops.

• We use [ i, j ] to denote the fact that there is an edge between node i and node j.

### Independent Sets

• Let G = (V, E) be an undirected graph.

• Let I ⊆ V.

• I is independent if whenever i, j ∈ I, there is no edge between i and j.

• The independent set problem: Given an undirected graph and a goal K, is there an independent set of size K?

– Many applications.

### independent set Is NP-Complete

• This problem is in NP: Guess a set of nodes and verify that it is independent and meets the count.

• If a graph contains a triangle, any independent set can contain at most one node of the triangle.

• We consider graphs whose nodes can be partitioned into m disjoint triangles.

– If the special case is hard, the original problem must be at least as hard.

### Reduction from 3sat to independent set

• Let φ be an instance of 3sat with m clauses.

• We will construct graph G (with nodes partitioned into m disjoint triangles, as above) with K = m such that φ is satisfiable if and only if G has an independent set of size K.

• There is a triangle for each clause with the literals as the nodes.

• Add additional edges between x and ¬x for every variable x.

### A Sample Construction

[Figure: the graph G constructed for the expression below, with one triangle per clause and edges joining contradictory literals.]

(x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3).

### The Proof (continued)

• Suppose G has an independent set I of size K = m.

– An independent set can contain at most m nodes, one from each triangle.

– An independent set of size m exists if and only if it contains exactly one node from each triangle.

– Truth assignment T assigns true to those literals in I.

– T is consistent because contradictory literals are connected by an edge, hence not both in I.

– T satisfies φ because it has a node from every triangle, thus satisfying every clause.

### The Proof (concluded)

• Suppose a satisfying truth assignment T exists for φ.

– Collect one node from each triangle whose literal is true under T .

– The choice is arbitrary if there is more than one true literal.

– This set of m nodes must be independent by construction.

∗ Literals x and ¬x cannot be both assigned true.
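The construction and both directions of the proof can be tried on small inputs. In this sketch a node is a pair (clause index, position) so that repeated literals get distinct nodes, literals are nonzero integers (+i for x_i, −i for ¬x_i), and the independent set search is plain brute force; these choices are assumptions for illustration.

```python
from itertools import combinations

def reduce_3sat_to_independent_set(clauses):
    """Return (nodes, edges, K): a triangle per clause plus edges between contradictory literals."""
    label = {(i, j): lit for i, clause in enumerate(clauses) for j, lit in enumerate(clause)}
    nodes = list(label)
    edges = set()
    for u, v in combinations(nodes, 2):
        same_triangle = u[0] == v[0]
        contradictory = label[u] == -label[v]
        if same_triangle or contradictory:
            edges.add(frozenset((u, v)))
    return nodes, edges, len(clauses)

def has_independent_set(nodes, edges, k):
    """Brute force: is there a set of k nodes with no edge inside it?"""
    return any(
        all(frozenset(pair) not in edges for pair in combinations(subset, 2))
        for subset in combinations(nodes, k)
    )

# The sample construction: (x1 or x2 or x3)(not x1 or not x2 or not x3)(not x1 or x2 or x3).
nodes, edges, K = reduce_3sat_to_independent_set([[1, 2, 3], [-1, -2, -3], [-1, 2, 3]])
print(K, has_independent_set(nodes, edges, K))
```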

### Other independent set-Related NP-Complete Problems

Corollary 33 4-degree independent set is NP-complete.

Theorem 34 independent set is NP-complete for planar graphs.

### clique

• We are given an undirected graph G and a goal K.

• clique asks if there is a set C with K nodes such that whenever i, j ∈ C, there is an edge between i and j.

### clique Is NP-Complete

Corollary 35 clique is NP-complete.

• Let Ḡ be the complement of G, where [x, y] ∈ Ḡ if and only if [x, y] ∉ G.

• I is an independent set in G ⇔ I is a clique in Ḡ.
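The equivalence can be confirmed on a small graph by checking every subset. The sketch assumes edges are stored as sorted pairs (i, j) with i < j; the graph and helper names are made up for the example.

```python
from itertools import combinations

def complement_edges(nodes, edges):
    """All pairs of distinct nodes that are not edges of G."""
    return {pair for pair in combinations(sorted(nodes), 2) if pair not in edges}

def is_independent(subset, edges):
    return all(pair not in edges for pair in combinations(sorted(subset), 2))

def is_clique(subset, edges):
    return all(pair in edges for pair in combinations(sorted(subset), 2))

nodes = [1, 2, 3, 4]
edges = {(1, 2), (2, 3), (3, 4)}            # a path on four nodes
co_edges = complement_edges(nodes, edges)

# Compare the two properties on every subset of the nodes.
agree = all(
    is_independent(subset, edges) == is_clique(subset, co_edges)
    for size in range(len(nodes) + 1)
    for subset in combinations(nodes, size)
)
print(sorted(co_edges), agree)
```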

### node cover

• We are given an undirected graph G and a goal K.

• node cover asks if there is a set C with K or fewer nodes such that each edge of G has at least one of its endpoints in C.

### node cover Is NP-Complete

Corollary 36 node cover is NP-complete.

• I is an independent set of G = (V, E) if and only if V − I is a node cover of G.
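The same kind of exhaustive check works for the node cover identity. This sketch assumes edges are pairs of node labels and uses a 4-cycle chosen only for illustration.

```python
from itertools import combinations

def is_independent(subset, edges):
    """No edge has both endpoints inside subset."""
    return not any(u in subset and v in subset for u, v in edges)

def is_node_cover(subset, edges):
    """Every edge has at least one endpoint inside subset."""
    return all(u in subset or v in subset for u, v in edges)

V = {1, 2, 3, 4}
E = [(1, 2), (2, 3), (3, 4), (4, 1)]        # a 4-cycle

# I is independent exactly when V - I is a node cover, for every I.
agree = all(
    is_independent(set(I), E) == is_node_cover(V - set(I), E)
    for size in range(len(V) + 1)
    for I in combinations(V, size)
)
print(agree)
```

So an independent set of size K corresponds to a node cover of size |V| − K.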


### min cut and max cut

• A cut in an undirected graph G = (V, E) is a partition of the nodes into two nonempty sets S and V − S.

• The size of a cut (S, V − S) is the number of edges between S and V − S.

• min cut ∈ P by the maxflow algorithm.

• max cut asks if there is a cut of size at least K.

– K is part of the input.

### min cut and max cut (concluded)

• max cut has applications in VLSI layout.

– The minimum area of a VLSI layout of a graph is not less than the square of its maximum cut size.^{a}

^{a}Raspaud, Sýkora, and Vrťo (1995).

### A Cut

### max cut Is NP-Complete^{a}

• We will reduce naesat to max cut.

• Given an instance φ of 3sat with m clauses, we shall construct a graph G = (V, E) and a goal K such that:

– There is a cut of size at least K if and only if φ is nae-satisfiable.

• Our graph will have multiple edges between two nodes.

– Each such edge contributes one to the cut if its nodes are separated.

^{a}Garey, Johnson, and Stockmeyer (1976).

### The Proof

• Suppose φ’s m clauses are C1, C2, . . . , Cm.

• The boolean variables are x1, x2, . . . , xn.

• G has 2n nodes: x1, x2, . . . , xn,¬x1,¬x2, . . . ,¬xn.

• Each clause with 3 distinct literals makes a triangle in G.

• For each clause with two identical literals, there are two parallel edges between the two distinct literals.

• No need to consider clauses with one literal (why?).

• For each variable xi, add ni copies of edge [xi, ¬xi], where ni is the number of occurrences of xi and ¬xi in φ.

[Figure: clause edges among literal nodes such as xi, xj, and ¬xk, together with ni copies of the edge [xi, ¬xi].]

### The Proof (continued)

• Set K = 5m.

• Suppose there is a cut (S, V − S) of size 5m or more.

• A clause (a triangle or two parallel edges) contributes at most 2 to a cut no matter how you split it.

• Suppose both xi and ¬xi are on the same side of the cut.

• Then they together contribute at most 2ni edges to the cut as they appear in at most ni different clauses.

[Figure: xi and ¬xi on the same side of the cut, incident to ni “triangles” and joined by ni parallel lines.]

### The Proof (continued)

• Changing the side of a literal contributing at most ni to the cut does not decrease the size of the cut.

• Hence we assume variables are separated from their negations.

• The total number of edges in the cut that join opposite literals is ∑_{i} ni = 3m.

– The total number of literals is 3m.

### The Proof (concluded)

• The remaining 2m edges in the cut must come from the m triangles or parallel edges that correspond to the clauses.

• As each can contribute at most 2 to the cut, all are split.

• A split clause means at least one of its literals is true and at least one false.

• The other direction is left as an exercise.

[Figure: the graph for the expression below on the nodes x1, x2, x3, ¬x1, ¬x2, ¬x3, with a cut that is not maximum.]

• (x1 ∨ x2 ∨ x2) ∧ (x1 ∨ ¬x3 ∨ ¬x3) ∧ (¬x1 ∨ ¬x2 ∨ x3).

• The cut size is 13 < 5 × 3 = 15.

[Figure: the same graph, with the literal nodes split into a “true” side and a “false” side.]

• (x1 ∨ x2 ∨ x2) ∧ (x1 ∨ ¬x3 ∨ ¬x3) ∧ (¬x1 ∨ ¬x2 ∨ x3).

• The cut size is now 15.
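The example can be reproduced by building the multigraph and trying every cut. This is a sketch under assumptions: literals are nonzero integers (+i for xi, −i for ¬xi), parallel edges are kept as repeated entries in a list, and the exhaustive search is only practical for a handful of variables.

```python
from itertools import product

def naesat_to_maxcut(clauses):
    """Return (variables, edges, K) for the multigraph of the reduction, with K = 5m."""
    edges = []
    for a, b, c in clauses:
        # A triangle for 3 distinct literals; a repeated literal leaves two parallel edges.
        edges += [(u, v) for u, v in ((a, b), (b, c), (a, c)) if u != v]
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for x in variables:
        n_x = sum(abs(lit) == x for clause in clauses for lit in clause)
        edges += [(x, -x)] * n_x          # n_x copies of the edge between x and not-x
    return variables, edges, 5 * len(clauses)

def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(side[u] != side[v] for u, v in edges)

def max_cut(variables, edges):
    """Brute force over all ways to place the 2n literal nodes on two sides."""
    nodes = [sign * x for x in variables for sign in (1, -1)]
    return max(
        cut_size(edges, dict(zip(nodes, bits)))
        for bits in product((False, True), repeat=len(nodes))
    )

phi = [[1, 2, 2], [1, -3, -3], [-1, -2, 3]]
variables, edges, K = naesat_to_maxcut(phi)
# Put each literal on the side given by its truth value under x1 = true, x2 = false, x3 = true.
side = {1: True, -1: False, 2: False, -2: True, 3: True, -3: False}
print(K, cut_size(edges, side), max_cut(variables, edges))
```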

### A Remark

• We have proved that max cut is NP-complete for multigraphs.

• How about proving the same thing for simple graphs?^{a}

• For 4sat, how do you modify the proof?^{b}

^{a}Contributed by Mr. Tai-Dai Chou (J93922005) on June 2, 2005.

^{b}Contributed by Mr. Chien-Lin Chen (J94922015) on June 8, 2006.

### max bisection

• max cut becomes max bisection if we require that |S| = |V − S|.

• It has many applications, especially in VLSI layout.

### max bisection Is NP-Complete

• We shall reduce the more general max cut to max bisection.

• Add |V | isolated nodes to G to yield G^{′}.

• G^{′} has 2 × |V | nodes.

• As the new nodes have no edges, moving them around contributes nothing to the cut.

### The Proof (concluded)

• Every cut (S, V − S) of G = (V, E) can be made into a bisection of G^{′} by appropriately allocating the new nodes between S and V − S.

• Hence each cut of G can be made a bisection of G^{′} of the same size, and vice versa.

### bisection width

• bisection width is like max bisection except that it asks if there is a bisection of size at most K (sort of min bisection).

• Unlike min cut, bisection width remains NP-complete.

– A graph G = (V, E), where |V| = 2n, has a bisection of size K if and only if the complement of G has a bisection of size n^{2} − K.

– So G has a bisection of size ≥ K if and only if its complement has a bisection of size ≤ n^{2} − K.
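The identity holds because a bisection of 2n nodes separates exactly n × n pairs, and each pair is an edge of G or of its complement but not both. A small sketch with a made-up 4-node path checks this for every bisection; the names are assumptions for this example.

```python
from itertools import combinations

def crossing_edges(edges, s):
    """Number of edges with exactly one endpoint in the node set s."""
    return sum((u in s) != (v in s) for u, v in edges)

n = 2
nodes = range(2 * n)                         # |V| = 2n
edges = [(0, 1), (1, 2), (2, 3)]             # a path, stored as pairs (i, j) with i < j
complement = [pair for pair in combinations(nodes, 2) if pair not in edges]

# For every bisection (S, V - S), the two sizes add up to n^2.
totals = {
    crossing_edges(edges, set(S)) + crossing_edges(complement, set(S))
    for S in combinations(nodes, n)
}
print(totals)
```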

### Illustration

### hamiltonian path Is NP-Complete^{a}

Theorem 37 Given an undirected graph, the question whether it has a Hamiltonian path is NP-complete.

^{a}Karp (1972).

### tsp (d) Is NP-Complete

Corollary 38 tsp (d) is NP-complete.

• Consider a graph G with n nodes.

• Define dij = 1 if [ i, j ] ∈ G and dij = 2 if [ i, j ] ∉ G.

• Set the budget B = n + 1.

• Suppose G has no Hamiltonian paths.

• Then every tour on the new graph must contain at least two edges with weight 2.

– Otherwise, by removing up to one edge with weight 2, one obtains a Hamiltonian path, a contradiction.

### tsp (d) Is NP-Complete (concluded)

• The total cost is then at least (n − 2) + 2 · 2 = n + 2 > B.

• On the other hand, suppose G has Hamiltonian paths.

• Then there is a tour on the new graph containing at most one edge with weight 2.

• The total cost is then at most (n − 1) + 2 = n + 1 = B.

• We conclude that there is a tour of length B or less if and only if G has a Hamiltonian path.
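The reduction and its conclusion can be tested on small graphs. This sketch assumes nodes are numbered 0 to n − 1 and finds the shortest tour by trying every permutation, which is only feasible for very small n; the function names are hypothetical.

```python
from itertools import permutations

def hamiltonian_path_to_tsp(n, edges):
    """Return (distance matrix, budget B): weight 1 for edges of G, 2 for non-edges, B = n + 1."""
    edge_set = {frozenset(e) for e in edges}
    dist = [
        [0 if i == j else (1 if frozenset((i, j)) in edge_set else 2) for j in range(n)]
        for i in range(n)
    ]
    return dist, n + 1

def shortest_tour(dist):
    """Brute force: the cheapest tour that starts and ends at node 0."""
    n = len(dist)
    return min(
        sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
        for tour in ((0,) + rest for rest in permutations(range(1, n)))
    )

path_dist, B = hamiltonian_path_to_tsp(4, [(0, 1), (1, 2), (2, 3)])   # has a Hamiltonian path
star_dist, _ = hamiltonian_path_to_tsp(4, [(0, 1), (0, 2), (0, 3)])   # has none
print(B, shortest_tour(path_dist), shortest_tour(star_dist))
```

On the path graph the best tour closes the Hamiltonian path with one edge of weight 2 and meets the budget; on the star every tour needs at least two such edges and exceeds it.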