knapsack Is NP-Complete

(1)

tripartite matching^a

• We are given three sets B, G, and H, each containing n elements.

• Let T ⊆ B × G × H be a ternary relation.

• tripartite matching asks if there is a set of n triples in T , none of which has a component in common.

– Each element in B is matched to a diﬀerent element in G and diﬀerent element in H.

Theorem 49 (Karp, 1972) tripartite matching is NP-complete.

aPrincess Diana (November 20, 1995), “There were three of us in this marriage, so it was a bit crowded.”

(2)

knapsack

• There is a set of n items.

• Item i has value v_i ∈ Z⁺ and weight w_i ∈ Z⁺.

• We are given K ∈ Z⁺ and W ∈ Z⁺.

• knapsack asks if there exists a subset I ⊆ { 1, 2, . . . , n } such that

$$\sum_{i \in I} w_i \le W \quad \text{and} \quad \sum_{i \in I} v_i \ge K.$$

– We want to achieve the maximum satisfaction within the budget.

(6)

knapsack Is NP-Complete^a

• knapsack ∈ NP: Guess an I and check the constraints.

• We shall reduce x3c to knapsack, in which vi = w_i for all i and K = W .

• The simpliﬁed knapsack now asks if a subset of v1, v2, . . . , v_n adds up to exactly K.^b

– Picture yourself as a radio DJ.

aKarp (1972). It can be solved in time O(2^{n/2}) with space O(2^{n/4}) (Schroeppel & Shamir, 1981; Vyskoč, 1987).

bThis problem is called subset sum or 0-1 knapsack.
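The membership argument (guess an I, then check the constraints) comes down to a polynomial-time check of the guessed subset. The following is a minimal sketch of that check; the function name `check_knapsack` and the toy instance are illustrative assumptions, not part of the slides.

```python
def check_knapsack(values, weights, K, W, I):
    """Verify a guessed index set I (0-based) against both knapsack constraints."""
    total_weight = sum(weights[i] for i in I)
    total_value = sum(values[i] for i in I)
    return total_weight <= W and total_value >= K


# Hypothetical instance: three items, budget W = 5, target value K = 7.
values = [3, 4, 5]
weights = [2, 3, 4]
print(check_knapsack(values, weights, K=7, W=5, I={0, 1}))  # weight 5, value 7
print(check_knapsack(values, weights, K=7, W=5, I={1, 2}))  # weight 7 exceeds W
```

The check runs in time linear in the number of items, which is all that membership in NP requires; finding a good I is the hard part.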

(7)

The Proof (continued)

• The primary differences between the two problems are:^a

– Sets vs. numbers.

– Union vs. addition.

• We are given a family F = { S1, S2, . . . , S_n } of size-3 subsets of U = { 1, 2, . . . , 3m }.

• x3c asks if there are m disjoint sets in F that cover the set U .

aThanks to a lively class discussion on November 16, 2010.

(8)

The Proof (continued)

• Think of a set as a bit vector in { 0, 1 }^{3m}.

– Assume m = 3.

– 110010000 means the set { 1, 2, 5 }.

– 001100010 means the set { 3, 4, 8 }.

• Assume there are n = 5 size-3 subsets in F .

• Our goal is $\overbrace{1\,1 \cdots 1}^{3m}$.

(9)

The Proof (continued)

• A bit vector can also be seen as a binary number.

• Set union resembles addition:

    001100010
  + 110010000
  -----------
    111110010

which denotes the set { 1, 2, 3, 4, 5, 8 }, as desired.

(10)

The Proof (continued)

• Trouble occurs when there is carry:

    010000000
  + 010000000
  -----------
    100000000

which denotes the wrong set { 1 }, not the correct { 2 }.

(11)

The Proof (continued)

• Or consider

    001100010
  + 001110000
  -----------
    011010010

which denotes the set { 2, 3, 5, 8 }, not the correct { 3, 4, 5, 8 }.^a

aCorrected by Mr. Chihwei Lin (D97922003) on January 21, 2010.

(12)

The Proof (continued)

• Carry may also lead to a situation where we obtain our solution 1 1 · · · 1 with more than m sets in F .

• For example, with m = 3,

    000100010
    001110000
    101100000
  + 000001101
  -----------
    111111111

• But the correct union result, { 1, 3, 4, 5, 6, 7, 8, 9 }, is not an exact cover.

(13)

The Proof (continued)

• And it uses 4 sets instead of the required m = 3.^a

• To ﬁx this problem, we enlarge the base just enough so that there are no carries.^b

• Because there are n vectors in total, we change the base from 2 to n + 1.

• Every positive integer N has a unique expression in base b: There are b-adic digits 0 ≤ d_i < b such that

$$N = \sum_{i=0}^{k} d_i b^i, \quad d_k \ne 0.$$

aThanks to a lively class discussion on November 20, 2002.

bYou cannot map ∪ to ∨ because knapsack requires + not ∨!

(14)

The Proof (continued)

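• Set v_i to be the integer corresponding to the bit vector encoding S_i in base n + 1:

$$v_i = \sum_{j \in S_i} 1 \times (n+1)^{3m-j}. \qquad (4)$$

• Set

$$K = \sum_{j=0}^{3m-1} 1 \times (n+1)^j = \overbrace{1\,1 \cdots 1}^{3m} \quad \text{(base } n+1\text{)}.$$

• Now in base n + 1, if there is a set S such that $\sum_{i \in S} v_i = \overbrace{1\,1 \cdots 1}^{3m}$, then every position must be contributed by exactly one v_i and | S | = m.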

(15)

The Proof (continued)

• For example, the case on p. 429 becomes

    000100010
    001110000
    101100000
  + 000001101
  -----------
    102311111

in base n + 1 = 6.

• As desired, it no longer meets the goal.

(16)

The Proof (continued)

• Suppose F admits an exact cover, say { S1, S2, . . . , S_m }.

• Then picking I = { 1, 2, . . . , m } clearly results in

$$v_1 + v_2 + \cdots + v_m = \overbrace{1\,1 \cdots 1}^{3m}.$$

• It is important to note that the meaning of addition (+) is independent of the base.^a

– It is just regular addition.

– But an S_i may give rise to different integers v_i in Eq. (4) on p. 431 under different bases.

a R92922047) on November 3,

(17)

The Proof (concluded)

• On the other hand, suppose there exists an I such that

$$\sum_{i \in I} v_i = \overbrace{1\,1 \cdots 1}^{3m} \quad \text{in base } n+1.$$

• The no-carry property implies that | I | = m and { S_i : i ∈ I } is an exact cover.

The proof actually proves:

Corollary 51 subset sum (p. 423) is NP-complete.

(18)

An Example

• Let m = 3, U = { 1, 2, 3, 4, 5, 6, 7, 8, 9 }, and S1 = { 1, 3, 4 },

S2 = { 2, 3, 4 }, S3 = { 2, 5, 6 }, S4 = { 6, 7, 8 }, S5 = { 7, 8, 9 }.

• Note that n = 5, as there are 5 S_i’s.

(19)

An Example (continued)

• Our reduction produces

$$K = \sum_{j=0}^{3 \times 3 - 1} 6^j = \overbrace{1\,1 \cdots 1}^{3 \times 3}{}_6 = 2015539_{10},$$

v1 = 101100000 (base 6) = 1734048,

v2 = 011100000 (base 6) = 334368,

v3 = 010011000 (base 6) = 281448,

v4 = 000001110 (base 6) = 258,

v5 = 000000111 (base 6) = 43.

(20)

An Example (concluded)

• Note v1 + v3 + v5 = K because

    101100000
    010011000
  + 000000111
  -----------
    111111111

• Indeed,

S1 ∪ S3 ∪ S5 = { 1, 2, 3, 4, 5, 6, 7, 8, 9 }, an exact cover by 3-sets.
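The example can be replayed mechanically. The sketch below applies Eq. (4) with base n + 1 to the five sets and then brute-forces the resulting subset sum instance; the function names are illustrative, and the exhaustive search is only there to confirm the example, not to suggest an efficient method.

```python
from itertools import combinations


def x3c_to_subset_sum(sets, m):
    """Map an x3c instance (size-3 sets over {1..3m}) to numbers v_i and target K."""
    n = len(sets)
    base = n + 1  # adding at most n digits that are 0 or 1 can never carry
    values = [sum(base ** (3 * m - j) for j in s) for s in sets]
    K = sum(base ** j for j in range(3 * m))
    return values, K


def subsets_hitting(values, K):
    """Brute force (exponential time): all index sets whose values add up to K."""
    idx = range(len(values))
    return [c for r in range(len(values) + 1)
            for c in combinations(idx, r)
            if sum(values[i] for i in c) == K]


sets = [{1, 3, 4}, {2, 3, 4}, {2, 5, 6}, {6, 7, 8}, {7, 8, 9}]
values, K = x3c_to_subset_sum(sets, m=3)
print(values, K)                   # the v_i and K listed on the slide
print(subsets_hitting(values, K))  # 0-based indices of S1, S3, S5
```

Indices are 0-based here, so the single solution (0, 2, 4) corresponds to I = { 1, 3, 5 }.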

(21)

bin packing

• We are given N positive integers a1, a2, . . . , a_N, an integer C (the capacity), and an integer B (the number of bins).

• bin packing asks if these numbers can be partitioned into B subsets, each of which has total sum at most C.

• Think of packing bags at the check-out counter.

Theorem 52 bin packing is NP-complete.

(22)

bin packing (concluded)

• But suppose a1, a2, . . . , a_N are randomly distributed between 0 and 1.

• Let B be the smallest number of unit-capacity bins capable of holding them.

• Then B can deviate from its average by more than t with probability at most 2e^{−2t²/N}.^a

aDubhashi & Panconesi (2012).

(23)

integer programming (ip)

• ip asks whether a system of linear inequalities with integer coeﬃcients has an integer solution.

• In contrast, linear programming (lp) asks whether a system of linear inequalities with integer coeﬃcients has a rational solution.

– lp is solvable in polynomial time.^a

aKhachiyan (1979).

(24)

ip Is NP-Complete^a

• set covering can be expressed by the inequalities Ax ≥ 1, $\sum_{i=1}^{n} x_i \le B$, 0 ≤ x_i ≤ 1, where

– x_i = 1 if and only if S_i is in the cover.

– A is the matrix whose columns are the bit vectors of the sets S1, S2, . . ..

– 1 is the vector of 1s.

– The operations in Ax are standard matrix operations.

– The ith row of Ax is at least 1 means item i is covered.
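The inequalities can be checked directly on a candidate 0/1 vector x. A minimal sketch follows, assuming a small hypothetical instance with four sets over the items { 1, 2, 3, 4 }; the name `is_cover` is illustrative.

```python
def is_cover(sets, universe_size, x, B):
    """Check Ax >= 1, sum(x) <= B, and x in {0,1}^n for a set covering instance."""
    n = len(sets)
    if len(x) != n or any(xi not in (0, 1) for xi in x):
        return False
    if sum(x) > B:
        return False
    # Row `item` of Ax counts how many chosen sets contain that item.
    for item in range(1, universe_size + 1):
        row = sum(x[j] for j in range(n) if item in sets[j])
        if row < 1:
            return False
    return True


# Hypothetical instance, not from the slides.
sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
print(is_cover(sets, 4, [1, 0, 1, 0], B=2))  # S1 and S3 cover every item
print(is_cover(sets, 4, [1, 1, 0, 0], B=2))  # item 4 is left uncovered
```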

a

(25)

ip Is NP-Complete (concluded)

• This shows ip is NP-hard.

• Many NP-complete problems can be expressed as an ip problem.

• ip with a ﬁxed number of variables is in P.^a

aLenstra (1983).

(26)

Christos Papadimitriou (1949–)

(27)

Easier or Harder?^a

• Adding restrictions on the allowable problem instances will not make a problem harder.

– We are now solving a subset of problem instances or special cases.

– The independent set proof (p. 365) and the knapsack proof (p. 423): equally hard.

– circuit value to monotone circuit value (p. 314): equally hard.

– sat to 2sat (p. 346): easier.

aThanks to a lively class discussion on October 29, 2003.

(28)

Easier or Harder? (concluded)

• Adding restrictions on the allowable solutions (the solution space) may make a problem harder, equally hard, or easier.

• It is problem dependent.

– min cut to bisection width (p. 398): harder.

– lp to ip (p. 440): harder.

– sat to naesat (p. 358) and max cut to max bisection (p. 396): equally hard.

– 3-coloring to 2-coloring (p. 407): easier.

(29)

coNP and Function Problems

(30)

coNP

• By deﬁnition, coNP is the class of problems whose complement is in NP.

– L ∈ coNP if and only if ¯L ∈ NP.

• NP problems have succinct certiﬁcates.^a

• coNP is therefore the class of problems that have succinct disqualifications:^b

– A “no” instance possesses a short proof of its being a “no” instance.

– Only “no” instances have such proofs.

aRecall Proposition 40 (p. 328).

(31)

coNP (continued)

• Suppose L is a coNP problem.

• There exists a nondeterministic polynomial-time algorithm M such that:

– If x ∈ L, then M (x) = “yes” for all computation paths.

– If x ∉ L, then M (x) = “no” for some computation path.

• If we swap “yes” and “no” in M, the new algorithm decides ¯L ∈ NP in the classic sense (p. 107).

(32)

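[Figure: computation trees of M. When x ∉ L, some computation path ends in “no”; when x ∈ L, every computation path ends in “yes.”]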

(33)

coNP (continued)

• So there are 3 major approaches to proving L ∈ coNP.

1. Prove ¯L ∈ NP.

2. Prove that only “no” instances possess short proofs.

3. Write an algorithm for it directly.

(34)

coNP (concluded)

• Clearly P ⊆ coNP.

• It is not known if

P = NP ∩ coNP.

– Contrast this with

R = RE ∩ coRE (see p. 155).

(35)

Some coNP Problems

• sat complement ∈ coNP.

– sat complement is the complement of sat.

– Or, the disqualiﬁcation is a truth assignment that satisfies it.

• hamiltonian path complement ∈ coNP.

– hamiltonian path complement is the complement of hamiltonian path.

– Or, the disqualiﬁcation is a Hamiltonian path.

(36)

Some coNP Problems (concluded)

• validity ∈ coNP.

– If φ is not valid, it can be disqualiﬁed very succinctly:

a truth assignment that does not satisfy it.

• optimal tsp (d) ∈ coNP.

– optimal tsp (d) asks if the optimal tour has a total distance of B, where B is an input.^a

– The disqualiﬁcation is a tour with a length < B.

aDeﬁned by Mr. Che-Wei Chang (R95922093) on September 27, 2006.

(37)

A Nondeterministic Algorithm for sat complement (See also p. 117)

φ is a boolean formula with n variables.

for i = 1, 2, . . . , n do
    Guess x_i ∈ { 0, 1 }; {Nondeterministic choice.}
end for
{Verification:}
if φ(x1, x2, . . . , xn) = 1 then
    “no”;
else
    “yes”;
end if
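A deterministic machine can only simulate this algorithm by walking every computation path. The sketch below does exactly that, at exponential cost; representing φ as a Python function of n booleans is an assumption made for illustration.

```python
from itertools import product


def sat_complement(phi, n):
    """Simulate the nondeterministic algorithm by enumerating all 2^n paths.

    The overall answer is "yes" only if every path answers "yes",
    that is, only if every truth assignment falsifies phi.
    """
    for assignment in product([False, True], repeat=n):
        if phi(*assignment):  # this path answers "no"
            return False
    return True               # all paths answered "yes"


# Hypothetical formulas for illustration.
contradiction = lambda x, y: (x or y) and (not x) and (not y)
satisfiable = lambda x, y: x and not y
print(sat_complement(contradiction, 2))
print(sat_complement(satisfiable, 2))
```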

(38)

Analysis

• The algorithm decides language { φ : φ is unsatisﬁable }.

– The computation tree is a complete binary tree of depth n.

– Every computation path corresponds to a particular truth assignment out of 2ⁿ.

– φ is unsatisﬁable if and only if every truth assignment falsiﬁes φ.

– But every truth assignment falsiﬁes φ if and only if every computation path results in “yes.”

(39)

An Alternative Characterization of coNP

Proposition 53 Let L ⊆ Σ^∗ be a language. Then L ∈ coNP if and only if there is a polynomially decidable and polynomially balanced relation R such that L = { x : ∀y (x, y) ∈ R }.

(As on p. 327, we assume | y | ≤ | x |^k for some k.)

• ¯L = { x : ∃y (x, y) ∈ ¬R }.

• Because ¬R remains polynomially balanced, ¯L ∈ NP by Proposition 40 (p. 328).

• Hence L ∈ coNP by deﬁnition.

(40)

coNP-Completeness

Proposition 54 L is NP-complete if and only if its complement ¯L = Σ^∗ − L is coNP-complete.

Proof (⇒; the ⇐ part is symmetric)

• Let ¯L′ be any coNP language.

• Hence L′ ∈ NP.

• Let R be the reduction from L′ to L.

• So x ∈ L′ if and only if R(x) ∈ L.

• By the law of transposition, x ∉ L′ if and only if R(x) ∉ L.

(41)

coNP Completeness (concluded)

• So x ∈ ¯L′ if and only if R(x) ∈ ¯L.

• The same R is a reduction from ¯L′ to ¯L.

• This shows ¯L is coNP-hard.

• But ¯L ∈ coNP.

• This shows ¯L is coNP-complete.

(42)

Some coNP-Complete Problems

• sat complement is coNP-complete.

• hamiltonian path complement is coNP-complete.

• validity is coNP-complete.

– φ is valid if and only if ¬φ is not satisﬁable.

– φ ∈ validity if and only if ¬φ ∈ sat complement.

– The reduction from sat complement to validity is hence easy.

(43)

Possible Relations between P, NP, coNP

1. P = NP = coNP.

2. NP = coNP but P ≠ NP.

3. NP ≠ coNP and P ≠ NP.

• This is the current “consensus.”^a

aCarl Gauss (1777–1855), “I could easily lay down a multitude of such propositions, which one could neither prove nor dispose of.”

(44)

The Primality Problem

• An integer p is prime if p > 1 and all positive numbers other than 1 and p itself cannot divide it.

• primes asks if an integer N is a prime number.

• Dividing N by 2, 3, . . . , √N is not efficient.

– The length of N is only log N , but √N = 2^{0.5 log N}.

– It is an exponential-time algorithm.

• A polynomial-time algorithm for primes was not found until 2002 by Agrawal, Kayal, and Saxena!

• The running time is Õ(log^{7.5} N ).
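To make the cost of trial division concrete, the sketch below counts the divisions performed. The function name and the decision to return the step count are illustrative choices.

```python
import math


def is_prime_trial_division(N):
    """Return (is_prime, number_of_divisions) using divisors 2..floor(sqrt(N))."""
    if N < 2:
        return False, 0
    steps = 0
    for d in range(2, math.isqrt(N) + 1):
        steps += 1
        if N % d == 0:
            return False, steps
    return True, steps


print(is_prime_trial_division(97))
print(is_prime_trial_division(91))
# A 31-bit prime already needs tens of thousands of divisions.
print((2**31 - 1).bit_length(), is_prime_trial_division(2**31 - 1))
```

For a prime input the loop runs about √N times, so each extra two bits in N roughly doubles the work.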

(45)

if n = a^b for some a, b > 1 then
    return “composite”;
end if
for r = 2, 3, . . . , n − 1 do
    if gcd(n, r) > 1 then
        return “composite”;
    end if
    if r is a prime then
        Let q be the largest prime factor of r − 1;
        if q ≥ 4√r log n and n^{(r−1)/q} ≠ 1 mod r then
            break; {Exit the for-loop.}
        end if
    end if
end for {r − 1 has a prime factor q ≥ 4√r log n.}
for a = 1, 2, . . . , 2√r log n do
    if (x − a)^n ≠ (x^n − a) mod (x^r − 1) in Z_n[ x ] then
        return “composite”;
    end if
end for
return “prime”; {The only place with “prime” output.}

(46)

The Primality Problem (concluded)

• Later, we will focus on eﬃcient “randomized” algorithms for primes (used in Mathematica, e.g.).

• NP ∩ coNP is the class of problems that have succinct certiﬁcates and succinct disqualiﬁcations.

– Each “yes” instance has a succinct certiﬁcate.

– Each “no” instance has a succinct disqualiﬁcation.

– No instances have both.

• We will see that primes ∈ NP ∩ coNP.

– In fact, primes ∈ P as mentioned earlier.

(47)

Basic Modular Arithmetics^a

• Let m, n ∈ Z⁺.

• m | n means m divides n; m is n’s divisor.

• We call the numbers 0, 1, . . . , n − 1 the residues modulo n.

• The greatest common divisor of m and n is denoted gcd(m, n).

• The r in Theorem 55 (p. 466) is a primitive root of p.

aCarl Friedrich Gauss.

(48)

Basic Modular Arithmetics (concluded)

• We use

a ≡ b mod n if n | (a − b).

– So 25 ≡ 38 mod 13.

• We use

a = b mod n

if b is the remainder of a divided by n.

– So 25 = 12 mod 13.
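The two notations differ only in whether b has to be the remainder itself. A small sketch of both conventions, with illustrative function names:

```python
def congruent(a, b, n):
    """a ≡ b mod n: n divides a - b."""
    return (a - b) % n == 0


def remainder(a, n):
    """The b with a = b mod n: the remainder of a divided by n."""
    return a % n


print(congruent(25, 38, 13))  # 13 divides 25 - 38 = -13
print(remainder(25, 13))      # the unique residue in 0..12
```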

(49)

Primitive Roots in Finite Fields

Theorem 55 (Lucas & Lehmer, 1927) ^a A number p > 1 is a prime if and only if there is a number 1 < r < p such that

1. r^{p−1} = 1 mod p, and

2. r^{(p−1)/q} ≠ 1 mod p for all prime divisors q of p − 1.

• This r is called the primitive root or generator.

• We will prove one direction of the theorem later.^b

aFrançois Édouard Anatole Lucas (1842–1891); Derrick Henry Lehmer (1905–1991).

bSee pp. 477ﬀ.

(50)

Derrick Lehmer

^a

(1905–1991)

(51)

Pratt’s Theorem

Theorem 56 (Pratt, 1975) primes ∈ NP ∩ coNP.

• primes ∈ coNP because a succinct disqualiﬁcation is a proper divisor.

– A proper divisor of a number means it is not a prime.

• Now suppose p is a prime.

• p’s certiﬁcate includes the r in Theorem 55 (p. 466).

– There may be multiple choices for r.

(52)

The Proof (continued)

• Use recursive doubling to check if r^{p−1} = 1 mod p in time polynomial in the length of the input, log₂ p.

– r, r², r⁴, . . . mod p, a total of ∼ log₂ p steps.

• We also need all prime divisors of p − 1: q1, q2, . . . , q_k.

– Whether r, q1, . . . , q_k are easy to find is irrelevant.

• Checking r^{(p−1)/q_i} ≠ 1 mod p is also easy.

• Checking q1, q2, . . . , q_k are all the prime divisors of p − 1 is easy: divide p − 1 by each q_i repeatedly and confirm that 1 remains.
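Recursive doubling can be sketched as the usual square-and-multiply loop. The name `power_mod` is illustrative; Python's built-in three-argument `pow` computes the same value and is used in the usage lines only as a cross-check.

```python
def power_mod(r, e, p):
    """Compute r^e mod p with about log2(e) squarings (recursive doubling)."""
    result = 1
    square = r % p  # holds r^(2^i) mod p at iteration i
    while e > 0:
        if e & 1:   # this bit of the exponent contributes r^(2^i)
            result = (result * square) % p
        square = (square * square) % p
        e >>= 1
    return result


# With p = 23 and r = 5: condition 1 holds, and the powers for q = 2, 11 differ from 1.
print(power_mod(5, 22, 23), power_mod(5, 11, 23), power_mod(5, 2, 23))
print(power_mod(5, 22, 23) == pow(5, 22, 23))
```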

(53)

The Proof (concluded)

• We still need certiﬁcates for the primality of the q_i’s.

• The complete certiﬁcate is recursive and tree-like:

C(p) = (r; q1, C(q1), q2, C(q2), . . . , qk, C(qk)). (5)

• We next prove that C(p) is succinct.

• As a result, C(p) can be checked in polynomial time.

(54)

The Succinctness of the Certificate

Lemma 57 The length of C(p) is at most quadratic at $5 \log_2^2 p$.

• This claim holds when p = 2 or p = 3.

• In general, p − 1 has k ≤ log₂ p prime divisors q1 = 2, q2, . . . , q_k.

– Reason: $2^k \le \prod_{i=1}^{k} q_i \le p - 1$.

• Note also that, as q1 = 2,

$$\prod_{i=2}^{k} q_i \le \frac{p-1}{2}. \qquad (6)$$

(55)

The Proof (continued)

• C(p) requires:

– 2 parentheses;

– 2k < 2 log₂ p separators (at most 2 log₂ p bits);

– r (at most log₂ p bits);

– q1 = 2 and its certiﬁcate 1 (at most 5 bits);

– q2, . . . , q_k (at most 2 log₂ p bits);^a

– C(q2), . . . , C(q_k).

aWhy?

(56)

The Proof (concluded)

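• C(p) is succinct because, by induction,

$$\begin{aligned}
| C(p) | &\le 5 \log_2 p + 5 + 5 \sum_{i=2}^{k} \log_2^2 q_i \\
&\le 5 \log_2 p + 5 + 5 \left( \sum_{i=2}^{k} \log_2 q_i \right)^2 \\
&\le 5 \log_2 p + 5 + 5 \log_2^2 \frac{p-1}{2} \quad \text{by inequality (6)} \\
&< 5 \log_2 p + 5 + 5\,[\,(\log_2 p) - 1\,]^2 \\
&= 5 \log_2^2 p + 10 - 5 \log_2 p \le 5 \log_2^2 p
\end{aligned}$$

for p ≥ 4, as the last step needs log₂ p ≥ 2.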

(57)

A Certificate for 23^a

• Note that 5 is a primitive root modulo 23 and 23 − 1 = 22 = 2 × 11.^b

• So

C(23) = (5; 2, C(2), 11, C(11)).

• Note that 2 is a primitive root modulo 11 and 11 − 1 = 10 = 2 × 5.

• So

C(11) = (2; 2, C(2), 5, C(5)).

aThanks to a lively discussion on April 24, 2008.

bOther primitive roots are 7, 10, 11, 14, 15, 17, 19, 20, 21.

(58)

A Certificate for 23 (concluded)

• Note that 2 is a primitive root modulo 5 and 5 − 1 = 4 = 2².

• So

C(5) = (2; 2, C(2)).

• In summary,

C(23) = (5; 2, C(2), 11, (2; 2, C(2), 5, (2; 2, C(2)))).

– In Mathematica, PrimeQCertificate[23] yields { 23, 5, { 2, { 11, 2, { 2, { 5, 2, { 2 }}}}}}
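The certificate can be checked recursively along the lines of the proof. The sketch below is one possible encoding, not the slides' or Mathematica's format: C(p) is assumed to be a pair (r, [(q, C(q)), ...]), and 2 is treated as a base case whose certificate is `None`.

```python
def verify(p, cert):
    """Check a Pratt-style certificate cert = (r, [(q, cert_q), ...]) for p."""
    if p == 2:
        return True  # base case (assumption: 2 needs no further witness)
    if cert is None:
        return False
    r, factors = cert
    if not 1 < r < p:
        return False
    # The q's must be all the prime divisors of p - 1: dividing them out leaves 1.
    rest = p - 1
    for q, _ in factors:
        if q < 2 or rest % q != 0:
            return False
        while rest % q == 0:
            rest //= q
    if rest != 1:
        return False
    # The two conditions of Theorem 55.
    if pow(r, p - 1, p) != 1:
        return False
    if any(pow(r, (p - 1) // q, p) == 1 for q, _ in factors):
        return False
    # Each q must itself come with a valid certificate.
    return all(verify(q, c) for q, c in factors)


C2 = None
C5 = (2, [(2, C2)])
C11 = (2, [(2, C2), (5, C5)])
C23 = (5, [(2, C2), (11, C11)])
print(verify(23, C23))
print(verify(23, (2, [(2, C2), (11, C11)])))  # 2 is not a primitive root modulo 23
```

Every step is a modular exponentiation or a division on numbers of at most log₂ p bits, so the check is polynomial in the certificate length.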

(59)

Turning the Proof into an Algorithm^a

• How to turn the proof into a nondeterministic polynomial-time algorithm?

• First, guess a log₂ p-bit number r.

• Then guess up to log₂ p numbers q1, q2, . . . , q_k each containing at most log₂ p bits.

• Then recursively do the same thing for each of the q_i to form a certiﬁcate (5) on p. 470.

• Finally check if the two conditions of Theorem 55 (p. 466) hold throughout the tree.

aContributed by Mr. Kai-Yuan Hou (B99201038, R03922014) on November 24, 2015.

(60)

Euler’s^a Totient or Phi Function

• Let

Φ(n) = { m : 1 ≤ m < n, gcd(m, n) = 1 }

be the set of all positive integers less than n that are prime to n.^b

– Φ(12) = { 1, 5, 7, 11 }.

• Deﬁne Euler’s function of n to be φ(n) = | Φ(n) |.

• φ(p) = p − 1 for prime p, and φ(1) = 1 by convention.

• Euler’s function is not expected to be easy to compute without knowing n’s factorization.
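The definitions translate directly into code. The sketch below enumerates Φ(n) by brute force, which takes time proportional to n and is therefore exponential in the length of n; the function names follow the slide's symbols.

```python
from math import gcd


def Phi(n):
    """The set of positive integers less than n that are prime to n."""
    return {m for m in range(1, n) if gcd(m, n) == 1}


def phi(n):
    """Euler's function; phi(1) = 1 by convention."""
    return 1 if n == 1 else len(Phi(n))


print(sorted(Phi(12)))
print(phi(12), phi(23), phi(1))
```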

a

(61)

(62)

Leonhard Euler (1707–1783)

(63)

Three Properties of Euler’s Function^a

The inclusion-exclusion principle^b can be used to prove the following.

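Lemma 58 If $n = p_1^{e_1} p_2^{e_2} \cdots p_\ell^{e_\ell}$ is the prime factorization of n, then

$$\phi(n) = n \prod_{i=1}^{\ell} \left( 1 - \frac{1}{p_i} \right).$$

• For example, if n = pq, where p and q are distinct primes, then

$$\phi(n) = pq \left( 1 - \frac{1}{p} \right) \left( 1 - \frac{1}{q} \right) = pq - p - q + 1.$$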

aSee p. 224 of the textbook.

bConsult any textbooks on discrete mathematics.

(64)

Three Properties of Euler’s Function (concluded)

Corollary 59 φ(mn) = φ(m) φ(n) if gcd(m, n) = 1.

Lemma 60 $\sum_{m \mid n} \phi(m) = n$.
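All three properties can be spot-checked numerically against the definition of φ. This is a sanity check over small n, not a proof; the helper names are illustrative.

```python
from math import gcd


def phi(n):
    """Euler's function by direct counting (phi(1) = 1)."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def prime_divisors(n):
    """Distinct prime divisors of n by trial division."""
    ps, d = [], 2
    while d * d <= n:
        if n % d == 0:
            ps.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        ps.append(n)
    return ps


def phi_by_product(n):
    """Lemma 58: n times the product of (1 - 1/p_i), kept in integers."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result


lemma_58 = all(phi(n) == phi_by_product(n) for n in range(1, 200))
corollary_59 = all(phi(m * n) == phi(m) * phi(n)
                   for m in range(1, 30) for n in range(1, 30) if gcd(m, n) == 1)
lemma_60 = all(sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n
               for n in range(1, 200))
print(lemma_58, corollary_59, lemma_60)
```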

(65)

The Density Attack for primes

[Figure: the witnesses to compositeness of n, drawn as a region inside the set of all numbers < n.]

(66)

The Density Attack for primes

Pick k ∈ { 1, . . . , n } randomly;
if k | n and k ≠ 1 and k ≠ n then
    return “n is composite”;
else
    return “n is (probably) a prime”;
end if
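One trial of the attack is only a few lines of code. In the sketch below, the `rng` parameter is an illustrative addition that lets a fixed choice of k be injected for testing; it is not part of the slide's algorithm.

```python
import random


def density_attack(n, rng=random):
    """One trial: pick k in {1, ..., n} and test whether it is a nontrivial divisor."""
    k = rng.randint(1, n)
    if n % k == 0 and k != 1 and k != n:
        return "n is composite"
    return "n is (probably) a prime"


# A prime never produces a witness; a composite produces one only on a lucky draw.
print(density_attack(13))
print({density_attack(15) for _ in range(1000)})
```

An answer of “n is composite” is always correct, whereas “n is (probably) a prime” may simply mean the draw missed every witness.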

(67)

The Density Attack for primes (continued)

• It works, but does it work well?

• The ratio of numbers ≤ n relatively prime to n (the white ring) is φ(n)/n.

• When n = pq, where p and q are distinct primes,

$$\frac{\phi(n)}{n} = \frac{pq - p - q + 1}{pq} > 1 - \frac{1}{q} - \frac{1}{p}.$$

(68)

The Density Attack for primes (concluded)

• So the ratio of numbers ≤ n not relatively prime to n (the gray area) is < (1/q) + (1/p).

– The “density attack” has probability about 2/√n of factoring n = pq when p ∼ q = O(√n).

– The “density attack” to factor n = pq hence takes Ω(√n) steps on average when p ∼ q = O(√n).

– This running time is exponential: Ω(2^{0.5 log₂ n}).
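The bound can be checked on a small case by counting the numbers ≤ n that share a factor with n. The primes 101 and 103 below are hypothetical values chosen for illustration; the count is by gcd, matching the ratio used in the analysis above.

```python
from math import gcd, sqrt

p, q = 101, 103  # hypothetical primes of similar size
n = p * q

# Numbers k <= n with gcd(k, n) > 1: multiples of p or of q.
shared = sum(1 for k in range(1, n + 1) if gcd(k, n) > 1)
ratio = shared / n

print(shared, p + q - 1)       # both counts agree
print(ratio, 1 / p + 1 / q)    # the ratio stays below 1/q + 1/p
print(2 / sqrt(n))             # and is close to 2 / sqrt(n)
```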
