knapsack Is NP-Completea
• knapsack ∈ NP: Guess an S and check the constraints.
• We shall reduce exact cover by 3-sets to knapsack, in which vi = wi for all i and K = W.
• The simplified knapsack now asks if a subset of v1, v2, . . . , vn adds up to exactly K.b
– Picture yourself as a radio DJ.
aKarp (1972).
bThis problem is called subset sum.
The Proof (continued)
• The primary differences between the two problems are:a
– Sets vs. numbers.
– Union vs. addition.
• We are given a family F = {S1, S2, . . . , Sn} of size-3 subsets of U = {1, 2, . . . , 3m}.
• exact cover by 3-sets asks if there are m disjoint sets in F that cover the set U.
aThanks to a lively class discussion on November 16, 2010.
The Proof (continued)
• Think of a set as a bit vector in $\{0, 1\}^{3m}$.
– Assume m = 3.
– 110010000 means the set {1, 2, 5}.
– 001100010 means the set {3, 4, 8}.
• Assume there are n = 5 size-3 subsets in F.
• Our goal is $\overbrace{1\,1 \cdots 1}^{3m}$.
The Proof (continued)
• A bit vector can also be seen as a binary number.
• Set union resembles addition:
001100010 + 110010000 = 111110010,
which denotes the set {1, 2, 3, 4, 5, 8}, as desired.
The Proof (continued)
• Trouble occurs when there is carry:
010000000 + 010000000 = 100000000,
which denotes the set {1}, not the desired {2}.
The Proof (continued)
• Or consider
001100010 + 001110000 = 011010010,
which denotes the set {2, 3, 5, 8}, not the desired {3, 4, 5, 8}.a
aCorrected by Mr. Chihwei Lin (D97922003) on January 21, 2010.
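The bit-vector view and the carry problem can be reproduced in a few lines of Python. This is only an illustrative sketch: the helper names `to_bits`, `to_set`, and `add_binary` are made up here, and positions are numbered 1 to 3m from the left, as in the examples above.

```python
def to_bits(s, width):
    """Encode a subset of {1, ..., width} as a bit string; position 1 is leftmost."""
    return "".join("1" if j in s else "0" for j in range(1, width + 1))

def to_set(bits):
    """Decode a bit string back into the set of positions that hold a 1."""
    return {j for j, b in enumerate(bits, start=1) if b == "1"}

def add_binary(a, b):
    """Add two bit strings as binary numbers, padding to the original width."""
    return format(int(a, 2) + int(b, 2), "0{}b".format(len(a)))

# Disjoint sets: binary addition agrees with set union.
x, y = to_bits({3, 4, 8}, 9), to_bits({1, 2, 5}, 9)
print(add_binary(x, y), to_set(add_binary(x, y)))

# Overlapping sets: carries corrupt the result.
z = to_bits({3, 4, 5}, 9)
print(add_binary(x, z), to_set(add_binary(x, z)))
```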
The Proof (continued)
• Carry may also lead to a situation where we obtain our solution $1\,1 \cdots 1$ with more than m sets in F.
• For example,
000100010 + 001110000 + 101100000 + 000001101 = 111111111.
• But the correct answer, {1, 3, 4, 5, 6, 7, 8, 9}, is not an exact cover.
The Proof (continued)
• And it uses 4 sets instead of the required m = 3.a
• To fix this problem, we enlarge the base just enough so that there are no carries.b
• Because there are n vectors in total, we change the base from 2 to n + 1.
aThanks to a lively class discussion on November 20, 2002.
bYou cannot map ∪ to ∨ because knapsack requires +.
The Proof (continued)
• Set vi to be the integer corresponding to the bit vector encoding Si in base n + 1:
$$v_i = \sum_{j \in S_i} 1 \times (n+1)^{3m-j}. \qquad (4)$$
• Set
$$K = \sum_{j=0}^{3m-1} 1 \times (n+1)^{j} = \overbrace{1\,1 \cdots 1}^{3m} \quad (\text{base } n+1).$$
• Now in base n + 1, if there is a set S such that
$$\sum_{i \in S} v_i = \overbrace{1\,1 \cdots 1}^{3m},$$
then every position must be contributed by exactly one vi and |S| = m.
The Proof (continued)
• For example, the case on p. 420 becomes
000100010 + 001110000 + 101100000 + 000001101 = 102311111
in base n + 1 = 6.
• It does not meet the goal.
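The effect of enlarging the base can be checked numerically. The sketch below (helper names `column_sums` and `value` are illustrative, not from the slides) adds the same four digit strings in base 2 and in base 6.

```python
def column_sums(vectors):
    """Add digit strings column by column, without carrying."""
    return [sum(int(v[j]) for v in vectors) for j in range(len(vectors[0]))]

def value(digits, base):
    """Interpret a digit string as an integer in the given base."""
    return int(digits, base)

vectors = ["000100010", "001110000", "101100000", "000001101"]
goal = "111111111"

# Base 2: carries let the four vectors hit the all-ones goal.
print(sum(value(v, 2) for v in vectors) == value(goal, 2))          # True

# Base 6: every column sum is below the base, so nothing carries.
print(column_sums(vectors))                                          # [1, 0, 2, 3, 1, 1, 1, 1, 1]
print(sum(value(v, 6) for v in vectors) == value("102311111", 6))   # True
print(sum(value(v, 6) for v in vectors) == value(goal, 6))          # False
```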
The Proof (continued)
• Suppose F admits an exact cover, say {S1, S2, . . . , Sm}.
• Then picking S = {1, 2, . . . , m} clearly results in
$$v_1 + v_2 + \cdots + v_m = \overbrace{1\,1 \cdots 1}^{3m}.$$
• It is important to note that the meaning of addition (+) is independent of the base.a
– It is just regular addition.
– But an Si may give rise to different integer vi’s in Eq. (4) on p. 422 under different bases.
aContributed by Mr. Kuan-Yu Chen (R92922047) on November 3, 2004.
The Proof (concluded)
• On the other hand, suppose there exists an S such that
$$\sum_{i \in S} v_i = \overbrace{1\,1 \cdots 1}^{3m}$$
in base n + 1.
• The no-carry property implies that |S| = m and {Si : i ∈ S} is an exact cover.
An Example
• Let m = 3, U = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and S1 = {1, 3, 4},
S2 = {2, 3, 4}, S3 = {2, 5, 6}, S4 = {6, 7, 8}, S5 = {7, 8, 9}.
• Note that n = 5, as there are 5 Si’s.
An Example (continued)
• Our reduction produces
$$K = \sum_{j=0}^{3 \times 3 - 1} 6^{j} = \overbrace{1\,1 \cdots 1}^{3 \times 3} \ (\text{base } 6) = 2015539,$$
v1 = 101100000 = 1734048,
v2 = 011100000 = 334368,
v3 = 010011000 = 281448,
v4 = 000001110 = 258,
v5 = 000000111 = 43.
An Example (concluded)
• Note v1 + v3 + v5 = K because
101100000 + 010011000 + 000000111 = 111111111.
• Indeed, S1 ∪ S3 ∪ S5 = {1, 2, 3, 4, 5, 6, 7, 8, 9}, an exact cover by 3-sets.
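The whole reduction fits in a few lines. The following is a sketch under the conventions of Eq. (4); the function names are illustrative, and the brute-force search is exponential and included only to confirm the example.

```python
from itertools import combinations

def x3c_to_subset_sum(family, m):
    """Map an exact-cover-by-3-sets instance to (values, K) following Eq. (4)."""
    base = len(family) + 1                      # base n + 1 rules out carries
    values = [sum(base ** (3 * m - j) for j in s) for s in family]
    target = sum(base ** j for j in range(3 * m))
    return values, target

def brute_force_subset_sum(values, target):
    """Exponential-time search over index subsets; for tiny illustrations only."""
    for size in range(len(values) + 1):
        for idx in combinations(range(len(values)), size):
            if sum(values[i] for i in idx) == target:
                return idx
    return None

family = [{1, 3, 4}, {2, 3, 4}, {2, 5, 6}, {6, 7, 8}, {7, 8, 9}]
values, K = x3c_to_subset_sum(family, m=3)
print(values, K)                           # [1734048, 334368, 281448, 258, 43] 2015539
print(brute_force_subset_sum(values, K))   # (0, 2, 4), i.e., S1, S3, S5
```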
bin packing
• We are given N positive integers a1, a2, . . . , aN, an
integer C (the capacity), and an integer B (the number of bins).
• bin packing asks if these numbers can be partitioned into B subsets, each of which has total sum at most C.
• Think of packing bags at the check-out counter.
Theorem 47 bin packing is NP-complete.
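Membership in NP is easy to see: a packing can be verified in linear time. Below is a minimal verifier sketch; the representation of a certificate as a list of bin indices and the sample numbers are assumptions made for illustration.

```python
def is_valid_packing(items, assignment, num_bins, capacity):
    """Check a certificate: assignment[i] is the 0-based bin that holds items[i]."""
    if len(assignment) != len(items):
        return False
    loads = [0] * num_bins
    for size, b in zip(items, assignment):
        if not 0 <= b < num_bins:
            return False
        loads[b] += size
    return all(load <= capacity for load in loads)

# Hypothetical instance: N = 5 items, B = 2 bins, capacity C = 7.
items = [4, 3, 3, 2, 2]
print(is_valid_packing(items, [0, 0, 1, 1, 1], num_bins=2, capacity=7))  # True
print(is_valid_packing(items, [0, 0, 0, 1, 1], num_bins=2, capacity=7))  # False
```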
bin packing (concluded)
• But suppose a1, a2, . . . , aN are randomly distributed between 0 and 1.
• Let B be the smallest number of unit-capacity bins capable of holding them.
• Then B can differ from its average by more than t with probability at most $2e^{-2t^2/N}$.a
aDubhashi and Panconesi (2012).
integer programming
• integer programming asks whether a system of linear inequalities with integer coefficients has an integer
solution.
• In contrast, linear programming asks whether a
system of linear inequalities with integer coefficients has a rational solution.
integer programming Is NP-Completea
• set covering can be expressed by the inequalities $Ax \ge \vec{1}$, $\sum_{i=1}^{n} x_i \le B$, $0 \le x_i \le 1$, where
– xi is one if and only if Si is in the cover.
– A is the matrix whose columns are the bit vectors of the sets S1, S2, . . . .
– $\vec{1}$ is the vector of 1s.
– The operations in Ax are standard matrix operations.
• This shows integer programming is NP-hard.
• Many NP-complete problems can be expressed as an integer programming problem.
aKarp (1972).
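To make the encoding concrete, the sketch below builds the matrix A for a small, made-up set covering instance and checks the three groups of inequalities for a candidate 0/1 vector x. Function names and data are illustrative only.

```python
def cover_matrix(universe_size, sets):
    """Rows are elements 1..universe_size; column i is the bit vector of sets[i]."""
    return [[1 if e in s else 0 for s in sets] for e in range(1, universe_size + 1)]

def satisfies(A, x, budget):
    """Check Ax >= 1 componentwise, sum(x) <= budget, and x in {0, 1}^n."""
    if any(xi not in (0, 1) for xi in x):
        return False
    covered = all(sum(a * xi for a, xi in zip(row, x)) >= 1 for row in A)
    return covered and sum(x) <= budget

# Hypothetical instance over U = {1, 2, 3, 4} with budget B = 2.
sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
A = cover_matrix(4, sets)
print(satisfies(A, [1, 0, 1, 0], budget=2))  # True: S1 and S3 cover U
print(satisfies(A, [1, 1, 0, 0], budget=2))  # False: element 4 is uncovered
print(satisfies(A, [1, 1, 1, 0], budget=2))  # False: covers U but exceeds B
```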
Easier or Harder?
• Adding restrictions on the allowable problem instances will not make a problem harder.
– We are now solving a subset of problem instances or special cases.
– The independent set proof (p. 361) and the knapsack proof (p. 414).
– sat to 2sat (easier by p. 342).
– circuit value to monotone circuit value (equally hard by p. 314).
Easier or Harder? (concluded)
• Adding restrictions on the allowable solutions (the solution space) may make a problem harder, equally hard, or easier.
• It is problem dependent.
– min cut to bisection width (harder by p. 389).
– linear programming to integer programming (harder by p. 431).
– sat to naesat (equally hard by p. 355) and max cut to max bisection (equally hard by p. 387).
– 3-coloring to 2-coloring (easier by p. 398).
coNP and Function Problems
coNP
• NP is the class of problems that have succinct certificates (recall Proposition 36 on p. 326).
• By definition, coNP is the class of problems whose complement is in NP.
• coNP is therefore the class of problems that have succinct disqualifications:
– A “no” instance of a problem in coNP possesses a short proof of its being a “no” instance.
– Only “no” instances have such proofs.
coNP (continued)
• Suppose L is a coNP problem.
• There exists a polynomial-time nondeterministic algorithm M such that:
– If x ∈ L, then M(x) = “yes” for all computation paths.
– If x ̸∈ L, then M(x) = “no” for some computation path.
• Note that if we swap “yes” and “no” of M, the new algorithm M′ decides ¯L ∈ NP in the classic sense (p.
[Figure: computation trees of M. For x ∉ L, some computation path answers “no”; for x ∈ L, every computation path answers “yes.”]
coNP (continued)
• So there are 3 major approaches to proving L ∈ coNP.
1. Prove ¯L ∈ NP.
2. Prove that “no” instances possess short proofs.
3. Write an algorithm for it.
coNP (concluded)
• Clearly P ⊆ coNP.
• It is not known if
P = NP ∩ coNP.
– Contrast this with
R = RE ∩ coRE (see Proposition 12 on p. 170).
Some coNP Problems
• validity ∈ coNP.
– If ϕ is not valid, it can be disqualified very succinctly:
a truth assignment that does not satisfy it.
• sat complement ∈ coNP.
– sat complement is the complement of sat.
– The disqualification is a truth assignment that satisfies it.
• hamiltonian path complement ∈ coNP.
Some coNP Problems (concluded)
• optimal tsp (d) ∈ coNP.
– optimal tsp (d) asks if the optimal tour has a total distance of B, where B is an input.a
– The disqualification is a tour with a length < B.
aDefined by Mr. Che-Wei Chang (R95922093) on September 27, 2006.
A Nondeterministic Algorithm for sat complement
ϕ is a boolean formula with n variables.
for i = 1, 2, . . . , n do
    Guess xi ∈ {0, 1}; {Nondeterministic choice.}
end for
{Verification:}
if ϕ(x1, x2, . . . , xn) = 1 then
    “no”;
else
    “yes”;
end if
Analysis
• The algorithm decides language {ϕ : ϕ is unsatisfiable}.
– The computation tree is a complete binary tree of depth n.
– Every computation path corresponds to a particular truth assignment out of $2^n$.
– ϕ is unsatisfiable iff every truth assignment falsifies ϕ.
– But every truth assignment falsifies ϕ iff every computation path results in “yes.”
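A deterministic machine can simulate the algorithm by walking through every computation path, at the cost of exponential time. The sketch below does exactly that; representing ϕ as a Python function of n arguments is an assumption made for illustration.

```python
from itertools import product

def sat_complement(phi, n):
    """Answer "yes" iff every computation path (truth assignment) answers "yes"."""
    for assignment in product((0, 1), repeat=n):
        # One computation path: the guessed assignment is verified against phi.
        path_answer = "no" if phi(*assignment) else "yes"
        if path_answer == "no":
            return "no"
    return "yes"

unsatisfiable = lambda x1, x2: x1 and not x1
satisfiable = lambda x1, x2: (x1 or x2) and not x1
print(sat_complement(unsatisfiable, 2))  # yes
print(sat_complement(satisfiable, 2))    # no
```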
An Alternative Characterization of coNP
Proposition 48 Let L ⊆ Σ∗ be a language. Then L ∈ coNP if and only if there is a polynomially decidable and
polynomially balanced relation R such that L = {x : ∀y (x, y) ∈ R}.
(As on p. 325, we assume $|y| \le |x|^k$ for some k.)
• ¯L = {x : ∃y (x, y) ∈ ¬R}.
• Because ¬R remains polynomially balanced, ¯L ∈ NP by Proposition 36 (p. 326).
coNP-Completeness
Proposition 49 L is NP-complete if and only if its complement ¯L = Σ∗ − L is coNP-complete.
Proof (⇒; the ⇐ part is symmetric)
• Let ¯L′ be any coNP language.
• Hence L′ ∈ NP.
• Let R be the reduction from L′ to L.
• So x ∈ L′ if and only if R(x) ∈ L.
• Equivalently, x ̸∈ L′ if and only if R(x) ̸∈ L (the law of transposition).
coNP-Completeness (concluded)
• So x ∈ ¯L′ if and only if R(x) ∈ ¯L.
• R is a reduction from ¯L′ to ¯L.
• This shows ¯L is coNP-hard.
• But ¯L ∈ coNP.
• This shows ¯L is coNP-complete.
Some coNP-Complete Problems
• sat complement is coNP-complete.
• validity is coNP-complete.
– ϕ is valid if and only if ¬ϕ is not satisfiable.
– The reduction from sat complement to validity is hence easy.
• hamiltonian path complement is coNP-complete.
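The reduction from sat complement to validity simply negates the formula. The brute-force check below is a sketch for tiny formulas (exponential time; formulas are modeled as Python functions, which is an assumption of this example).

```python
from itertools import product

def assignments(n):
    """All 2^n truth assignments to n variables."""
    return product((False, True), repeat=n)

def is_satisfiable(phi, n):
    return any(phi(*a) for a in assignments(n))

def is_valid(phi, n):
    return all(phi(*a) for a in assignments(n))

def negate(phi):
    """The reduction: map phi to its negation."""
    return lambda *a: not phi(*a)

contradiction = lambda x, y: x and not x
tautology = lambda x, y: x or not x
contingent = lambda x, y: x and y
for phi in (contradiction, tautology, contingent):
    # phi is unsatisfiable exactly when its negation is valid.
    print((not is_satisfiable(phi, 2)) == is_valid(negate(phi), 2))  # True
```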
Possible Relations between P, NP, coNP
1. P = NP = coNP.
2. NP = coNP but P ̸= NP.
3. NP ̸= coNP and P ̸= NP.
• This is the current “consensus.”a
aCarl Gauss (1777–1855), “I could easily lay down a multitude of such propositions, which one could neither prove nor dispose of.”
The Primality Problem
• An integer p is prime if p > 1 and no positive integer other than 1 and p itself divides it.
• primes asks if an integer N is a prime number.
• Dividing N by 2, 3, . . . , $\sqrt{N}$ is not efficient.
– The length of N is only log N, but $\sqrt{N} = 2^{0.5 \log N}$.
– So it is an exponential-time algorithm.
• A polynomial-time algorithm for primes was not found until 2002 by Agrawal, Kayal, and Saxena!
• Later, we will focus on efficient “probabilistic”
algorithms for primes (used in Mathematica, e.g.).
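For comparison, here is a sketch of the naive trial-division test. It also reports how many divisions were tried, which grows like $\sqrt{N}$ and hence exponentially in the bit length of N. The function name and the returned pair are choices made for this illustration.

```python
from math import isqrt

def is_prime_trial_division(N):
    """Divide N by 2, 3, ..., floor(sqrt(N)); return (answer, divisions tried)."""
    if N < 2:
        return False, 0
    tried = 0
    for d in range(2, isqrt(N) + 1):
        tried += 1
        if N % d == 0:
            return False, tried
    return True, tried

print(is_prime_trial_division(97))   # (True, 8)
print(is_prime_trial_division(91))   # (False, 6), since 91 = 7 * 13
N = 1000003
print(N.bit_length(), is_prime_trial_division(N))
```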
if $n = a^b$ for some a, b > 1 then
    return “composite”;
end if
for r = 2, 3, . . . , n − 1 do
    if gcd(n, r) > 1 then
        return “composite”;
    end if
    if r is a prime then
        Let q be the largest prime factor of r − 1;
        if $q \ge 4\sqrt{r} \log n$ and $n^{(r-1)/q} \ne 1 \bmod r$ then
            break; {Exit the for-loop.}
        end if
    end if
end for {r − 1 has a prime factor $q \ge 4\sqrt{r} \log n$.}
for a = 1, 2, . . . , $2\sqrt{r} \log n$ do
    if $(x - a)^n \ne (x^n - a) \bmod (x^r - 1)$ in $Z_n[x]$ then
        return “composite”;
    end if
end for
return “prime”;
The Primality Problem (concluded)
• NP ∩ coNP is the class of problems that have succinct certificates and succinct disqualifications.
– Each “yes” instance has a succinct certificate.
– Each “no” instance has a succinct disqualification.
– No instances have both.
• We will see that primes ∈ NP ∩ coNP.
– In fact, primes ∈ P as mentioned earlier.
Primitive Roots in Finite Fields
Theorem 50 (Lucas and Lehmer (1927))a A number p > 1 is a prime if and only if there is a number 1 < r < p such that
1. $r^{p-1} = 1 \bmod p$, and
2. $r^{(p-1)/q} \ne 1 \bmod p$ for all prime divisors q of p − 1.
• This r is called a primitive root or generator.
• We will prove the theorem later (see pp. 464ff).
aFrançois Édouard Anatole Lucas (1842–1891); Derrick Henry Lehmer (1905–1991).
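The two conditions of Theorem 50 are straightforward to test for small numbers. The sketch below factors p − 1 by trial division, so it is meant only as an illustration for small odd p; the helper names are not from the slides.

```python
def prime_divisors(m):
    """Distinct prime divisors of m, by trial division (small inputs only)."""
    divisors, d = [], 2
    while d * d <= m:
        if m % d == 0:
            divisors.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        divisors.append(m)
    return divisors

def satisfies_theorem_50(p, r):
    """Check conditions 1 and 2 of Theorem 50 for the pair (p, r)."""
    if not 1 < r < p or pow(r, p - 1, p) != 1:
        return False
    return all(pow(r, (p - 1) // q, p) != 1 for q in prime_divisors(p - 1))

def has_witness(p):
    """Search all candidates 1 < r < p for one meeting both conditions."""
    return any(satisfies_theorem_50(p, r) for r in range(2, p))

print(satisfies_theorem_50(23, 7))   # True
print(satisfies_theorem_50(23, 2))   # False: 2^11 = 1 mod 23
print(has_witness(15))               # False: 15 is composite
```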
Pratt’s Theorem
Theorem 51 (Pratt (1975)) primes ∈ NP ∩ coNP.
• primes is in coNP because a succinct disqualification is a proper divisor.
– A proper divisor of a number n means n is not a prime.
• Now suppose p is a prime.
• p’s certificate includes the r in Theorem 50 (p. 453).
• Use recursive doubling to check if $r^{p-1} = 1 \bmod p$ in time polynomial in the length of the input, $\log_2 p$.
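Recursive doubling (repeated squaring) can be sketched as follows. The function name `power_mod` is illustrative; Python's built-in three-argument `pow` performs the same computation.

```python
def power_mod(r, e, p):
    """Compute r**e mod p with O(log e) modular multiplications."""
    result = 1
    square = r % p              # holds r^(2^i) mod p at iteration i
    while e > 0:
        if e & 1:               # the current bit of the exponent is set
            result = (result * square) % p
        square = (square * square) % p
        e >>= 1
    return result

print(power_mod(7, 22, 23))   # 1
print(power_mod(7, 11, 23))   # 22
```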
The Proof (concluded)
• We also need all prime divisors of p − 1: q1, q2, . . . , qk.
– Whether r, q1, . . . , qk are easy to find is irrelevant.
– There may be multiple choices for r.
• Checking $r^{(p-1)/q_i} \ne 1 \bmod p$ is also easy.
• Checking that q1, q2, . . . , qk are all the prime divisors of p − 1 is easy: divide p − 1 by each qi as often as possible and confirm that 1 remains.
• We still need certificates for the primality of the qi’s.
• The complete certificate is recursive and tree-like:
C(p) = (r; q1, C(q1), q2, C(q2), . . . , qk, C(qk)).
• We next prove that C(p) is succinct.
• As a result, C(p) can be checked in polynomial time.
The Succinctness of the Certificate
Lemma 52 The length of C(p) is at most quadratic, namely at most $5 \log_2^2 p$.
• This claim holds when p = 2 or p = 3.
• In general, p − 1 has $k \le \log_2 p$ prime divisors q1 = 2, q2, . . . , qk.
– Reason: $2^k \le \prod_{i=1}^{k} q_i \le p - 1$.
• Note also that, as q1 = 2,
$$\prod_{i=2}^{k} q_i \le \frac{p-1}{2}. \qquad (5)$$
The Proof (continued)
• C(p) requires:
– 2 parentheses;
– $2k < 2 \log_2 p$ separators (at most $2 \log_2 p$ bits);
– r (at most $\log_2 p$ bits);
– q1 = 2 and its certificate 1 (at most 5 bits);
– q2, . . . , qk (at most $2 \log_2 p$ bits);a
– C(q2), . . . , C(qk).
aWhy?
The Proof (concluded)
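• C(p) is succinct because, by induction,
$$\begin{aligned}
|C(p)| &\le 5 \log_2 p + 5 + 5 \sum_{i=2}^{k} \log_2^2 q_i \\
&\le 5 \log_2 p + 5 + 5 \left( \sum_{i=2}^{k} \log_2 q_i \right)^2 \\
&\le 5 \log_2 p + 5 + 5 \log_2^2 \frac{p-1}{2} \quad \text{by inequality (5)} \\
&< 5 \log_2 p + 5 + 5 (\log_2 p - 1)^2 \\
&= 5 \log_2^2 p + 10 - 5 \log_2 p \le 5 \log_2^2 p \quad \text{for } p \ge 4.
\end{aligned}$$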
A Certificate for 23a
• Note that 7 is a primitive root modulo 23 and 23 − 1 = 22 = 2 × 11.
• So
C(23) = (7; 2, C(2), 11, C(11)).
• Note that 2 is a primitive root modulo 11 and 11 − 1 = 10 = 2 × 5.
• So
C(11) = (2; 2, C(2), 5, C(5)).
aThanks to a lively discussion on April 24, 2008.
A Certificate for 23 (concluded)
• Note that 2 is a primitive root modulo 5 and $5 - 1 = 4 = 2^2$.
• So
C(5) = (2; 2, C(2)).
• In summary,
C(23) = (7; 2, C(2), 11, (2; 2, C(2), 5, (2; 2, C(2)))).
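A recursive checker for such certificates might look like the sketch below. The encoding is an assumption made for this example: C(p) is a pair `(r, [(q1, C(q1)), ...])`, and the trivial certificate of 2 is represented by `None`.

```python
def verify(p, cert):
    """Return True if cert is a valid Pratt-style certificate for p."""
    if p == 2:
        return True                       # base case: 2 needs no witness
    if p < 2 or not isinstance(cert, tuple) or len(cert) != 2:
        return False
    r, factors = cert
    if not 1 < r < p or pow(r, p - 1, p) != 1:
        return False
    # The listed q's must account for every prime divisor of p - 1.
    rest = p - 1
    for q, _ in factors:
        if q < 2 or rest % q != 0:
            return False
        while rest % q == 0:
            rest //= q
    if rest != 1:
        return False
    # Each q must pass condition 2 of Theorem 50 and be certified itself.
    return all(pow(r, (p - 1) // q, p) != 1 and verify(q, sub)
               for q, sub in factors)

C2 = None                                 # stand-in for the certificate of 2
C5 = (2, [(2, C2)])
C11 = (2, [(2, C2), (5, C5)])
C23 = (7, [(2, C2), (11, C11)])
print(verify(23, C23))                    # True
```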
Basic Modular Arithmeticsa
• Let $m, n \in \mathbb{Z}^+$.
• m | n means m divides n; m is n’s divisor.
• We call the numbers 0, 1, . . . , n − 1 the residues modulo n.
• The greatest common divisor of m and n is denoted gcd(m, n).
• The r in Theorem 50 (p. 453) is a primitive root of p.
• We now prove the existence of primitive roots and then Theorem 50 (p. 453).
aCarl Friedrich Gauss.
Basic Modular Arithmetics (concluded)
• We use
a ≡ b mod n if n| (a − b).
– So 25 ≡ 38 mod 13.
• We use
a = b mod n
if b is the remainder of a divided by n.
– So 25 = 12 mod 13.
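The two notations correspond to two different operations, as this small sketch shows (function names are illustrative).

```python
def congruent(a, b, n):
    """a ≡ b mod n: n divides a - b."""
    return (a - b) % n == 0

def remainder(a, n):
    """The b with a = b mod n: the remainder of a divided by n, in 0..n-1."""
    return a % n

print(congruent(25, 38, 13))  # True
print(remainder(25, 13))      # 12
```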
Euler’sa Totient or Phi Function
• Let
Φ(n) = {m : 1 ≤ m < n, gcd(m, n) = 1}
be the set of all positive integers less than n that are prime to n.b
– Φ(12) = {1, 5, 7, 11}.
• Define Euler’s function of n to be ϕ(n) = |Φ(n)|.
• ϕ(p) = p − 1 for prime p, and ϕ(1) = 1 by convention.
• Euler’s function is not expected to be easy to compute without knowing n’s factorization.
aLeonhard Euler (1707–1783).
b$Z_n^*$ is an alternative notation.
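Both definitions translate directly into code. This direct enumeration is only a sketch for small n; it takes time linear in n, which is exponential in the length of n.

```python
from math import gcd

def Phi(n):
    """The set of positive integers less than n that are prime to n."""
    return {m for m in range(1, n) if gcd(m, n) == 1}

def phi(n):
    """Euler's function, with phi(1) = 1 by convention."""
    return 1 if n == 1 else len(Phi(n))

print(sorted(Phi(12)))   # [1, 5, 7, 11]
print(phi(12), phi(7))   # 4 6
```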
Two Properties of Euler’s Function
The inclusion-exclusion principlea can be used to prove the following.
Lemma 53 $\phi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$.
• If $n = p_1^{e_1} p_2^{e_2} \cdots p_\ell^{e_\ell}$ is the prime factorization of n, then
$$\phi(n) = n \prod_{i=1}^{\ell} \left(1 - \frac{1}{p_i}\right).$$
Corollary 54 ϕ(mn) = ϕ(m) ϕ(n) if gcd(m, n) = 1.
aConsult any textbook on discrete mathematics.
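Lemma 53 and Corollary 54 can be sanity-checked numerically for small n. The sketch below evaluates the product formula with exact integer arithmetic and compares it with a direct count; all function names are illustrative.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of n, by trial division (small inputs only)."""
    divisors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            divisors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        divisors.append(n)
    return divisors

def phi_product(n):
    """phi(n) = n * prod over primes p | n of (1 - 1/p), in integer arithmetic."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result

def phi_count(n):
    """phi(n) by direct counting, with phi(1) = 1 by convention."""
    return 1 if n == 1 else sum(1 for m in range(1, n) if gcd(m, n) == 1)

print(phi_product(12), phi_count(12))                   # 4 4
print(phi_product(9 * 8), phi_product(9) * phi_product(8))  # 24 24
```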
A Key Lemma
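Lemma 55 $\sum_{m \mid n} \phi(m) = n$.
• Let $n = \prod_{i=1}^{\ell} p_i^{k_i}$ be the prime factorization of n and consider
$$\prod_{i=1}^{\ell} \left[\, \phi(1) + \phi(p_i) + \cdots + \phi(p_i^{k_i}) \,\right]. \qquad (6)$$
• Equation (6) equals n because $\phi(p_i^{k}) = p_i^{k} - p_i^{k-1}$ by Lemma 53 (p. 466), so $\phi(1) + \phi(p_i) + \cdots + \phi(p_i^{k_i}) = p_i^{k_i}$.
• Expand Eq. (6) to yield
$$n = \sum_{k_1' \le k_1, \ldots, k_\ell' \le k_\ell} \ \prod_{i=1}^{\ell} \phi(p_i^{k_i'}).$$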
The Proof (concluded)
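• By Corollary 54 (p. 466),
$$\prod_{i=1}^{\ell} \phi(p_i^{k_i'}) = \phi\left( \prod_{i=1}^{\ell} p_i^{k_i'} \right).$$
• So Eq. (6) becomes
$$n = \sum_{k_1' \le k_1, \ldots, k_\ell' \le k_\ell} \phi\left( \prod_{i=1}^{\ell} p_i^{k_i'} \right).$$
• Each $\prod_{i=1}^{\ell} p_i^{k_i'}$ is a unique divisor of $n = \prod_{i=1}^{\ell} p_i^{k_i}$.
• Equation (6) becomes
$$\sum_{m \mid n} \phi(m).$$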
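Lemma 55 is easy to confirm by brute force for small n. The sketch below uses a direct count for Euler's function and a naive divisor enumeration; both helpers are illustrative.

```python
from math import gcd

def phi(n):
    """Euler's function by direct counting, with phi(1) = 1 by convention."""
    return 1 if n == 1 else sum(1 for m in range(1, n) if gcd(m, n) == 1)

def divisors(n):
    """All positive divisors of n."""
    return [m for m in range(1, n + 1) if n % m == 0]

n = 12
print([(m, phi(m)) for m in divisors(n)])  # [(1, 1), (2, 1), (3, 2), (4, 2), (6, 2), (12, 4)]
print(sum(phi(m) for m in divisors(n)))    # 12
```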