The Proof (continued)

(1)

knapsack Is NP-Complete^a

• knapsack ∈ NP: Guess an S and check the constraints.

• We shall reduce exact cover by 3-sets to knapsack, in which v_i = w_i for all i and K = W .

• The simplified knapsack now asks if a subset of v₁, v₂, . . . , v_n adds up to exactly K.^b

– Picture yourself as a radio DJ.

^a Karp (1972).

^b This problem is called subset sum.

(2)

The Proof (continued)

• The primary differences between the two problems are:^a

– Sets vs. numbers.

– Union vs. addition.

• We are given a family F = {S₁, S₂, . . . , S_n} of size-3 subsets of U = {1, 2, . . . , 3m}.

• exact cover by 3-sets asks if there are m disjoint sets in F that cover the set U .

^a Thanks to a lively class discussion on November 16, 2010.

(3)

The Proof (continued)

• Think of a set as a bit vector in {0, 1}^{3m}.

– Assume m = 3.

– 110010000 means the set {1, 2, 5}.

– 001100010 means the set {3, 4, 8}.

• Assume there are n = 5 size-3 subsets in F .

• Our goal is $\overbrace{1\,1 \cdots 1}^{3m}$.

(4)

The Proof (continued)

• A bit vector can also be seen as a binary number.

• Set union resembles addition:

  001100010
+ 110010000
-----------
  111110010

which denotes the set {1, 2, 3, 4, 5, 8}, as desired.

(5)

The Proof (continued)

• Trouble occurs when there is carry:

  010000000
+ 010000000
-----------
  100000000

which denotes the set {1}, not the desired {2}.

(6)

The Proof (continued)

• Or consider

  001100010
+ 001110000
-----------
  011010010

which denotes the set {2, 3, 5, 8}, not the desired {3, 4, 5, 8}.^a

^a Corrected by Mr. Chihwei Lin (D97922003) on January 21, 2010.
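The carry problem can be reproduced with a small sketch. The helper names `bits_to_set` and `add_binary` are illustrative assumptions, not part of the notes; positions are counted from the left starting at 1, as in the slides.

```python
def bits_to_set(bits: str) -> set[int]:
    """Return the 1-indexed positions (from the left) that hold a 1."""
    return {i + 1 for i, b in enumerate(bits) if b == "1"}


def add_binary(x: str, y: str) -> str:
    """Add two equal-length bit strings as ordinary binary numbers."""
    width = len(x)
    return format(int(x, 2) + int(y, 2), f"0{width}b")


a, b = "001100010", "001110000"
union = bits_to_set(a) | bits_to_set(b)      # what we want
total = bits_to_set(add_binary(a, b))        # what binary addition gives

print(sorted(union))  # [3, 4, 5, 8]
print(sorted(total))  # [2, 3, 5, 8]
```

Positions 3 and 4 occur in both vectors, so the carries corrupt the result.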

(7)

The Proof (continued)

• Carry may also lead to a situation where we obtain our solution 1 1· · · 1 with more than m sets in F .

• For example,

  000100010
  001110000
  101100000
+ 000001101
-----------
  111111111

• But the correct answer, {1, 3, 4, 5, 6, 7, 8, 9}, is not an exact cover.

(8)

The Proof (continued)

• And it uses 4 sets instead of the required m = 3.^a

• To fix this problem, we enlarge the base just enough so that there are no carries.^b

• Because there are n vectors in total, we change the base from 2 to n + 1.

^a Thanks to a lively class discussion on November 20, 2002.

^b You cannot map ∪ to ∨ because knapsack requires +.

(9)

The Proof (continued)

• Set v_i to be the integer corresponding to the bit vector encoding S_i in base n + 1:

$$v_i = \sum_{j \in S_i} 1 \times (n+1)^{3m-j}. \qquad (4)$$

• Set

$$K = \sum_{j=0}^{3m-1} 1 \times (n+1)^{j} = \overbrace{1\,1 \cdots 1}^{3m} \quad \text{(base } n+1\text{)}.$$

• Now in base n + 1, if there is a set S such that

$$\sum_{i \in S} v_i = \overbrace{1\,1 \cdots 1}^{3m},$$

then every position must be contributed by exactly one v_i and |S| = m.
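A minimal sketch of the reduction, assuming the family is given as a list of Python sets over {1, . . . , 3m}. The function name `x3c_to_subset_sum` is hypothetical; the sample family is the one from the worked example later in these notes.

```python
def x3c_to_subset_sum(family: list[set[int]], m: int) -> tuple[list[int], int]:
    """Map an exact-cover-by-3-sets instance to (values, K) as in Eq. (4)."""
    n = len(family)
    base = n + 1                      # large enough that no column can carry
    values = [sum(base ** (3 * m - j) for j in s) for s in family]
    target = sum(base ** j for j in range(3 * m))   # 11...1 in base n + 1
    return values, target


family = [{1, 3, 4}, {2, 3, 4}, {2, 5, 6}, {6, 7, 8}, {7, 8, 9}]
values, K = x3c_to_subset_sum(family, m=3)
print(values)  # [1734048, 334368, 281448, 258, 43]
print(K)       # 2015539
```

The construction only performs 3n + 3m exponentiations on numbers of O(m log n) bits, so it runs in polynomial time.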

(10)

The Proof (continued)

• For example, the case on p. 420 becomes

  000100010
  001110000
  101100000
+ 000001101
-----------
  102311111

in base n + 1 = 6.

• It does not meet the goal.
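The two sums can be compared with a short sketch that reads the same four digit strings once in base 2 and once in base 6. The helper `to_base` is an illustrative assumption.

```python
def to_base(value: int, base: int, width: int) -> str:
    """Write value in the given base, left-padded with zeros to width digits."""
    digits = []
    for _ in range(width):
        value, digit = divmod(value, base)
        digits.append(str(digit))
    return "".join(reversed(digits))


vectors = ["000100010", "001110000", "101100000", "000001101"]

binary_sum = sum(int(v, 2) for v in vectors)
base6_sum = sum(int(v, 6) for v in vectors)

print(to_base(binary_sum, 2, 9))  # 111111111 (carries hide the overlaps)
print(to_base(base6_sum, 6, 9))   # 102311111 (each digit counts the overlaps)
```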

(11)

The Proof (continued)

• Suppose F admits an exact cover, say {S₁, S₂, . . . , S_m}.

• Then picking S = {1, 2, . . . , m} clearly results in

$$v_1 + v_2 + \cdots + v_m = \overbrace{1\,1 \cdots 1}^{3m}.$$

• It is important to note that the meaning of addition (+) is independent of the base.^a

– It is just regular addition.

– But an S_i may give rise to different integer v_i’s in Eq. (4) on p. 422 under different bases.

^a Contributed by Mr. Kuan-Yu Chen (R92922047) on November 3, 2004.

(12)

The Proof (concluded)

• On the other hand, suppose there exists an S such that

$$\sum_{i \in S} v_i = \overbrace{1\,1 \cdots 1}^{3m}$$

in base n + 1.

• The no-carry property implies that |S| = m and {S_i : i ∈ S} is an exact cover.

(13)

An Example

• Let m = 3, U = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and S₁ = {1, 3, 4},

S₂ = {2, 3, 4}, S₃ = {2, 5, 6}, S₄ = {6, 7, 8}, S₅ = {7, 8, 9}.

• Note that n = 5, as there are 5 S_i’s.

(14)

An Example (continued)

• Our reduction produces

$$K = \sum_{j=0}^{3 \times 3 - 1} 6^{j} = \overbrace{1\,1 \cdots 1}^{3 \times 3} \ \text{(base 6)} = 2015539,$$

v₁ = 101100000 (base 6) = 1734048,

v₂ = 011100000 (base 6) = 334368,

v₃ = 010011000 (base 6) = 281448,

v₄ = 000001110 (base 6) = 258,

v₅ = 000000111 (base 6) = 43.

(15)

An Example (concluded)

• Note v₁ + v₃ + v₅ = K because

  101100000
  010011000
+ 000000111
-----------
  111111111

• Indeed,

S₁ ∪ S₃ ∪ S₅ = {1, 2, 3, 4, 5, 6, 7, 8, 9}, an exact cover by 3-sets.
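For an instance this small, the subset sum question can be settled by brute force. The sketch below uses the five values and K from the example; the function name `subsets_summing_to` is an assumption, and the exhaustive search takes exponential time in general.

```python
from itertools import combinations

values = {1: 1734048, 2: 334368, 3: 281448, 4: 258, 5: 43}
K = 2015539


def subsets_summing_to(values: dict[int, int], target: int) -> list[tuple[int, ...]]:
    """Try every nonempty index subset and keep those whose values sum to target."""
    hits = []
    for size in range(1, len(values) + 1):
        for combo in combinations(sorted(values), size):
            if sum(values[i] for i in combo) == target:
                hits.append(combo)
    return hits


print(subsets_summing_to(values, K))  # [(1, 3, 5)]
```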

(16)

bin packing

• We are given N positive integers a₁, a₂, . . . , a_N, an integer C (the capacity), and an integer B (the number of bins).

• bin packing asks if these numbers can be partitioned into B subsets, each of which has total sum at most C.

• Think of packing bags at the check-out counter.

Theorem 47 bin packing is NP-complete.
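To make the decision problem concrete, here is a brute-force sketch that tries every assignment of items to bins. The function name `bin_packing` is hypothetical, bins are allowed to stay empty (an assumption about how "partitioned" is read), and the running time is B^N, so this is only an illustration.

```python
from itertools import product


def bin_packing(items: list[int], capacity: int, bins: int) -> bool:
    """Return True if the items fit into the given number of bins of this capacity."""
    for assignment in product(range(bins), repeat=len(items)):
        loads = [0] * bins
        for item, b in zip(items, assignment):
            loads[b] += item          # put the item into bin b
        if all(load <= capacity for load in loads):
            return True
    return False


print(bin_packing([4, 3, 3, 2, 2], capacity=7, bins=2))  # True: {4, 3} and {3, 2, 2}
print(bin_packing([4, 4, 4], capacity=7, bins=2))        # False
```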

(17)

bin packing (concluded)

• But suppose a₁, a₂, . . . , a_N are randomly distributed between 0 and 1.

• Let B be the smallest number of unit-capacity bins capable of holding them.

• Then B can differ from its average by more than t with probability at most $2e^{-2t^2/N}$.^a

^a Dubhashi and Panconesi (2012).

(18)

integer programming

• integer programming asks whether a system of linear inequalities with integer coefficients has an integer solution.

• In contrast, linear programming asks whether a system of linear inequalities with integer coefficients has a rational solution.

(19)

integer programming Is NP-Complete^a

• set covering can be expressed by the inequalities $Ax \ge \vec{1}$, $\sum_{i=1}^{n} x_i \le B$, $0 \le x_i \le 1$, where

– x_i is one if and only if S_i is in the cover.

– A is the matrix whose columns are the bit vectors of the sets S₁, S₂, . . . .

– ⃗1 is the vector of 1s.

– The operations in Ax are standard matrix operations.

• This shows integer programming is NP-hard.

• Many NP-complete problems can be expressed as an integer programming problem.

^a Karp (1972).
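The encoding of set covering as inequalities can be sketched as follows. The function name `set_cover_ip_feasible` and the sample instance are assumptions made for illustration; the search over all 0/1 vectors is exponential and only serves to check the inequalities.

```python
from itertools import product


def set_cover_ip_feasible(universe_size: int, sets: list[set[int]], budget: int):
    """Search for an integer x with Ax >= 1, sum(x) <= budget, 0 <= x_i <= 1."""
    # Column i of A is the bit vector of sets[i]; row u is for element u + 1.
    A = [[1 if (u + 1) in s else 0 for s in sets] for u in range(universe_size)]
    for x in product((0, 1), repeat=len(sets)):
        covers = all(sum(a * xi for a, xi in zip(row, x)) >= 1 for row in A)
        if covers and sum(x) <= budget:
            return x
    return None


sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
print(set_cover_ip_feasible(4, sets, budget=2))  # (0, 1, 0, 1)
print(set_cover_ip_feasible(4, sets, budget=1))  # None
```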

(20)

Easier or Harder?^a

• Adding restrictions on the allowable problem instances will not make a problem harder.

– We are now solving a subset of problem instances or special cases.

– The independent set proof (p. 361) and the knapsack proof (p. 414).

– sat to 2sat (easier by p. 342).

– circuit value to monotone circuit value (equally hard by p. 314).

(21)

Easier or Harder? (concluded)

• Adding restrictions on the allowable solutions (the solution space) may make a problem harder, equally hard, or easier.

• It is problem dependent.

– min cut to bisection width (harder by p. 389).

– linear programming to integer programming (harder by p. 431).

– sat to naesat (equally hard by p. 355) and max cut to max bisection (equally hard by p. 387).

– 3-coloring to 2-coloring (easier by p. 398).

(22)

coNP and Function Problems

(23)

coNP

• NP is the class of problems that have succinct certificates (recall Proposition 36 on p. 326).

• By definition, coNP is the class of problems whose complement is in NP.

• coNP is therefore the class of problems that have succinct disqualifications:

– A “no” instance of a problem in coNP possesses a short proof of its being a “no” instance.

– Only “no” instances have such proofs.

(24)

coNP (continued)

• Suppose L is a coNP problem.

• There exists a polynomial-time nondeterministic algorithm M such that:

– If x ∈ L, then M(x) = “yes” for all computation paths.

– If x ∉ L, then M(x) = “no” for some computation path.

• Note that if we swap “yes” and “no” of M, the new algorithm M^′ decides ¯L ∈ NP in the classic sense (p.

(25)

[Figure: computation trees of M. When x ∉ L, the leaves include both “yes” and “no”; when x ∈ L, every leaf is “yes”.]

(26)

coNP (continued)

• So there are 3 major approaches to proving L ∈ coNP.

1. Prove ¯L ∈ NP.

2. Prove that “no” instances possess short proofs.

3. Write an algorithm for it.

(27)

coNP (concluded)

• Clearly P ⊆ coNP.

• It is not known if

P = NP ∩ coNP.

– Contrast this with

R = RE ∩ coRE (see Proposition 12 on p. 170).

(28)

Some coNP Problems

• validity ∈ coNP.

– If ϕ is not valid, it can be disqualified very succinctly:

a truth assignment that does not satisfy it.

• sat complement ∈ coNP.

– sat complement is the complement of sat.

– The disqualification is a truth assignment that satisfies it.

• hamiltonian path complement ∈ coNP.

(29)

Some coNP Problems (concluded)

• optimal tsp (d) ∈ coNP.

– optimal tsp (d) asks if the optimal tour has a total distance of B, where B is an input.^a

– The disqualification is a tour with a length < B.

^a Defined by Mr. Che-Wei Chang (R95922093) on September 27, 2006.

(30)

A Nondeterministic Algorithm for sat complement

ϕ is a boolean formula with n variables.

for i = 1, 2, . . . , n do
    Guess x_i ∈ {0, 1}; {Nondeterministic choice.}
end for
{Verification:}
if ϕ(x₁, x₂, . . . , x_n) = 1 then
    “no”;
else
    “yes”;
end if

(31)

Analysis

• The algorithm decides language {ϕ : ϕ is unsatisfiable}.

– The computation tree is a complete binary tree of depth n.

– Every computation path corresponds to a particular truth assignment out of 2ⁿ.

– ϕ is unsatisfiable iff every truth assignment falsifies ϕ.

– But every truth assignment falsifies ϕ iff every computation path results in “yes.”
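The analysis can be mirrored by a deterministic simulation that walks all 2ⁿ computation paths. This is a sketch, assuming ϕ is supplied as a Python function of n boolean arguments; the name `sat_complement` is hypothetical, and the simulation takes exponential time, unlike the nondeterministic algorithm it imitates.

```python
from itertools import product


def sat_complement(phi, n: int) -> bool:
    """Accept iff every path says "yes", i.e. every truth assignment falsifies phi."""
    for assignment in product((False, True), repeat=n):   # one computation path each
        if phi(*assignment):
            return False      # this path answers "no": phi is satisfiable
    return True               # all paths answered "yes"


unsat = lambda a, b: (a or b) and (not a) and (not b)
sat = lambda a, b: a or b

print(sat_complement(unsat, 2))  # True
print(sat_complement(sat, 2))    # False
```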

(32)

An Alternative Characterization of coNP

Proposition 48 Let L ⊆ Σ^∗ be a language. Then L ∈ coNP if and only if there is a polynomially decidable and

polynomially balanced relation R such that L = {x : ∀y (x, y) ∈ R}.

(As on p. 325, we assume | y | ≤ | x |^k for some k.)

• ¯L = {x : ∃y (x, y) ∈ ¬R}.

• Because ¬R remains polynomially balanced, ¯L ∈ NP by Proposition 36 (p. 326).

(33)

coNP-Completeness

Proposition 49 L is NP-complete if and only if its complement ¯L = Σ^∗ − L is coNP-complete.

Proof (⇒; the ⇐ part is symmetric)

• Let ¯L^′ be any coNP language.

• Hence L^′ ∈ NP.

• Let R be the reduction from L^′ to L.

• So x ∈ L^′ if and only if R(x) ∈ L.

• Equivalently, x ∉ L^′ if and only if R(x) ∉ L (the law of transposition).

(34)

coNP Completeness (concluded)

• So x ∈ ¯L^′ if and only if R(x) ∈ ¯L.

• R is a reduction from ¯L^′ to ¯L.

• This shows ¯L is coNP-hard.

• But ¯L ∈ coNP.

• This shows ¯L is coNP-complete.

(35)

Some coNP-Complete Problems

• sat complement is coNP-complete.

• validity is coNP-complete.

– ϕ is valid if and only if ¬ϕ is not satisfiable.

– The reduction from sat complement to validity is hence easy.

• hamiltonian path complement is coNP-complete.

(36)

Possible Relations between P, NP, coNP

1. P = NP = coNP.

2. NP = coNP but P ≠ NP.

3. NP ≠ coNP and P ≠ NP.

• This is the current “consensus.”^a

^a Carl Gauss (1777–1855), “I could easily lay down a multitude of such propositions, which one could neither prove nor dispose of.”

(37)

The Primality Problem

• An integer p is prime if p > 1 and no positive integer other than 1 and p itself divides it.

• primes asks if an integer N is a prime number.

• Dividing N by 2, 3, . . . , √N is not efficient.

– The length of N is only log N, but √N = 2^{0.5 log N}.

– So it is an exponential-time algorithm.

• A polynomial-time algorithm for primes was not found until 2002 by Agrawal, Kayal, and Saxena!

• Later, we will focus on efficient “probabilistic” algorithms for primes (used in Mathematica, e.g.).
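A short sketch of trial division makes the cost visible by counting the divisions performed. The function name `is_prime_trial` and the choice of returning a step count are assumptions for illustration.

```python
from math import isqrt


def is_prime_trial(n: int) -> tuple[bool, int]:
    """Trial division by 2, 3, ..., floor(sqrt(n)); also report divisions used."""
    if n < 2:
        return False, 0
    steps = 0
    for d in range(2, isqrt(n) + 1):
        steps += 1
        if n % d == 0:
            return False, steps
    return True, steps


for n in (97, 91, 2**31 - 1):
    prime, steps = is_prime_trial(n)
    print(n, n.bit_length(), prime, steps)
```

For a prime input the number of divisions is about √N, which doubles every time the input grows by two bits.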

(38)

if n = a^b for some a, b > 1 then
    return “composite”;
end if
for r = 2, 3, . . . , n − 1 do
    if gcd(n, r) > 1 then
        return “composite”;
    end if
    if r is a prime then
        Let q be the largest prime factor of r − 1;
        if q ≥ 4√r log n and n^{(r−1)/q} ≠ 1 mod r then
            break; {Exit the for-loop.}
        end if
    end if
end for {r − 1 has a prime factor q ≥ 4√r log n.}
for a = 1, 2, . . . , 2√r log n do
    if (x − a)^n ≠ (x^n − a) mod (x^r − 1) in Z_n[x] then
        return “composite”;
    end if
end for
return “prime”;

(39)

The Primality Problem (concluded)

• NP ∩ coNP is the class of problems that have succinct certificates and succinct disqualifications.

– Each “yes” instance has a succinct certificate.

– Each “no” instance has a succinct disqualification.

– No instances have both.

• We will see that primes ∈ NP ∩ coNP.

– In fact, primes ∈ P as mentioned earlier.

(40)

Primitive Roots in Finite Fields

Theorem 50 (Lucas and Lehmer (1927))^a A number p > 1 is a prime if and only if there is a number 1 < r < p such that

1. r^{p−1} = 1 mod p, and

2. r^{(p−1)/q} ≠ 1 mod p for all prime divisors q of p − 1.

• This r is called the primitive root or generator.

• We will prove the theorem later (see pp. 464ff).

^a François Édouard Anatole Lucas (1842–1891); Derrick Henry Lehmer (1905–1991).

(41)

Derrick Lehmer (1905–1991)

(42)

Pratt’s Theorem

Theorem 51 (Pratt (1975)) primes ∈ NP ∩ coNP.

• primes is in coNP because a succinct disqualification is a proper divisor.

– A proper divisor of a number n means n is not a prime.

• Now suppose p is a prime.

• p’s certificate includes the r in Theorem 50 (p. 453).

• Use recursive doubling to check if r^{p−1} = 1 mod p in time polynomial in the length of the input, log₂ p.
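The exponentiation can be sketched with repeated squaring, which is how we read "recursive doubling" here. The function name `power_mod` is an assumption; Python's built-in three-argument `pow` computes the same value.

```python
def power_mod(r: int, e: int, p: int) -> int:
    """Compute r**e mod p with O(log e) modular multiplications."""
    result = 1
    r %= p
    while e > 0:
        if e & 1:                 # current binary digit of e is 1
            result = result * r % p
        r = r * r % p             # square for the next binary digit
        e >>= 1
    return result


print(power_mod(7, 22, 23))  # 1
print(power_mod(7, 11, 23))  # 22
```

The loop runs once per bit of the exponent, so checking r^{p−1} = 1 mod p needs about log₂ p multiplications of log₂ p-bit numbers.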

(43)

The Proof (concluded)

• We also need all prime divisors of p − 1: q₁, q₂, . . . , q_k.

– Whether r, q₁, . . . , q_k are easy to find is irrelevant.

– There may be multiple choices for r.

• Checking r^{(p−1)/q_i} ≠ 1 mod p is also easy.

• Checking that q₁, q₂, . . . , q_k are all the prime divisors of p − 1 is easy.

• We still need certificates for the primality of the q_i’s.

• The complete certificate is recursive and tree-like:

C(p) = (r; q₁, C(q₁), q₂, C(q₂), . . . , q_k, C(q_k)).

• We next prove that C(p) is succinct.

• As a result, C(p) can be checked in polynomial time.

(44)

The Succinctness of the Certificate

Lemma 52 The length of C(p) is at most quadratic, namely at most 5 log₂² p.

• This claim holds when p = 2 or p = 3.

• In general, p − 1 has k ≤ log₂ p prime divisors q₁ = 2, q₂, . . . , q_k.

– Reason:

$$2^{k} \le \prod_{i=1}^{k} q_i \le p - 1.$$

• Note also that, as q₁ = 2,

$$\prod_{i=2}^{k} q_i \le \frac{p-1}{2}. \qquad (5)$$

(45)

The Proof (continued)

• C(p) requires:

– 2 parentheses;

– 2k < 2 log₂ p separators (at most 2 log₂ p bits);

– r (at most log₂ p bits);

– q₁ = 2 and its certificate 1 (at most 5 bits);

– q₂, . . . , q_k (at most 2 log₂ p bits);^a

– C(q₂), . . . , C(q_k).

^a Why?

(46)

The Proof (concluded)

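• C(p) is succinct because, by induction,

$$\begin{aligned}
|C(p)| &\le 5\log_2 p + 5 + 5\sum_{i=2}^{k} \log_2^2 q_i \\
&\le 5\log_2 p + 5 + 5\Bigl(\sum_{i=2}^{k} \log_2 q_i\Bigr)^{2} \\
&\le 5\log_2 p + 5 + 5\log_2^2 \frac{p-1}{2} \quad \text{by inequality (5)} \\
&< 5\log_2 p + 5 + 5(\log_2 p - 1)^2 \\
&= 5\log_2^2 p + 10 - 5\log_2 p \\
&\le 5\log_2^2 p,
\end{aligned}$$

where the last step uses log₂ p ≥ 2, that is, p ≥ 4.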

(47)

A Certificate for 23^a

• Note that 7 is a primitive root modulo 23 and 23 − 1 = 22 = 2 × 11.

• So

C(23) = (7; 2, C(2), 11, C(11)).

• Note that 2 is a primitive root modulo 11 and 11 − 1 = 10 = 2 × 5.

• So

C(11) = (2; 2, C(2), 5, C(5)).

^a Thanks to a lively discussion on April 24, 2008.

(48)

A Certificate for 23 (concluded)

• Note that 2 is a primitive root modulo 5 and 5 − 1 = 4 = 2².

• So

C(5) = (2; 2, C(2)).

• In summary,

C(23) = (7; 2, C(2), 11, (2; 2, C(2), 5, (2; 2, C(2)))).
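The certificate for 23 can be checked mechanically. The sketch below assumes a particular encoding that is not in the notes: C(p) is a Python pair `(r, [(q, C(q)), ...])` and C(2) is written as `None`; the function name `verify` is likewise hypothetical.

```python
def verify(p: int, cert) -> bool:
    """Check a Pratt-style certificate for p using the conditions of Theorem 50."""
    if p == 2:
        return True                       # base case, accepted outright
    r, factors = cert
    # The listed q's must be exactly the prime divisors of p - 1.
    rest = p - 1
    for q, _ in factors:
        if rest % q != 0:
            return False
        while rest % q == 0:
            rest //= q
    if rest != 1:
        return False
    # Condition 1: r^(p-1) = 1 mod p.
    if not (1 < r < p) or pow(r, p - 1, p) != 1:
        return False
    # Condition 2 for each q, plus a recursive check that q is prime.
    for q, sub in factors:
        if pow(r, (p - 1) // q, p) == 1 or not verify(q, sub):
            return False
    return True


C5 = (2, [(2, None)])
C11 = (2, [(2, None), (5, C5)])
C23 = (7, [(2, None), (11, C11)])
print(verify(23, C23))                              # True
print(verify(23, (2, [(2, None), (11, C11)])))      # False: 2^11 = 1 mod 23
```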

(49)

Basic Modular Arithmetics^a

• Let m, n ∈ Z⁺.

• m | n means m divides n; m is n’s divisor.

• We call the numbers 0, 1, . . . , n − 1 the residues modulo n.

• The greatest common divisor of m and n is denoted gcd(m, n).

• The r in Theorem 50 (p. 453) is a primitive root of p.

• We now prove the existence of primitive roots and then Theorem 50 (p. 453).

^a Carl Friedrich Gauss.

(50)

Basic Modular Arithmetics (concluded)

• We use

a ≡ b mod n if n| (a − b).

– So 25 ≡ 38 mod 13.

• We use

a = b mod n

if b is the remainder of a divided by n.

– So 25 = 12 mod 13.
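The two notations correspond to two different checks, sketched here with hypothetical helper names `congruent` and `remainder`. For a positive modulus n, Python's `%` returns the remainder in 0, 1, . . . , n − 1.

```python
def congruent(a: int, b: int, n: int) -> bool:
    """a ≡ b mod n: n divides a - b."""
    return (a - b) % n == 0


def remainder(a: int, n: int) -> int:
    """The b with a = b mod n: the remainder of a divided by n."""
    return a % n


print(congruent(25, 38, 13))  # True
print(remainder(25, 13))      # 12
```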

(51)

Euler’s^a Totient or Phi Function

• Let

Φ(n) = {m : 1 ≤ m < n, gcd(m, n) = 1}

be the set of all positive integers less than n that are prime to n.^b

– Φ(12) = {1, 5, 7, 11}.

• Define Euler’s function of n to be ϕ(n) = |Φ(n)|.

• ϕ(p) = p − 1 for prime p, and ϕ(1) = 1 by convention.

• Euler’s function is not expected to be easy to compute without knowing n’s factorization.

^a Leonhard Euler (1707–1783).

^b Z_n^∗ is an alternative notation.
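The definition translates directly into code. In this sketch the names `Phi` and `phi` follow the notation of the slide, and the convention ϕ(1) = 1 is handled as a special case; the enumeration takes time linear in n, which is exponential in the length of n.

```python
from math import gcd


def Phi(n: int) -> set[int]:
    """All m with 1 <= m < n and gcd(m, n) = 1."""
    return {m for m in range(1, n) if gcd(m, n) == 1}


def phi(n: int) -> int:
    """Euler's function, with phi(1) = 1 by convention."""
    return 1 if n == 1 else len(Phi(n))


print(sorted(Phi(12)))  # [1, 5, 7, 11]
print(phi(12), phi(7), phi(1))  # 4 6 1
```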

(52)

(53)

Two Properties of Euler’s Function

The inclusion-exclusion principle^a can be used to prove the following.

Lemma 53 $\phi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$.

• If $n = p_1^{e_1} p_2^{e_2} \cdots p_\ell^{e_\ell}$ is the prime factorization of n, then

$$\phi(n) = n \prod_{i=1}^{\ell} \left(1 - \frac{1}{p_i}\right).$$

Corollary 54 ϕ(mn) = ϕ(m) ϕ(n) if gcd(m, n) = 1.

^a Consult any textbook on discrete mathematics.
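Both properties can be checked numerically on small inputs. The sketch below compares the product formula with a direct count; the helper names `prime_divisors`, `phi_product`, and `phi_count` are assumptions, and the test range is arbitrary.

```python
from math import gcd


def prime_divisors(n: int) -> list[int]:
    """Distinct prime divisors of n, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def phi_product(n: int) -> int:
    """phi(n) = n * prod over primes p | n of (1 - 1/p), in integer arithmetic."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result


def phi_count(n: int) -> int:
    """phi(n) by counting, with phi(1) = 1."""
    return 1 if n == 1 else sum(1 for m in range(1, n) if gcd(m, n) == 1)


print(all(phi_product(n) == phi_count(n) for n in range(1, 200)))  # True
print(phi_product(4) * phi_product(9), phi_product(36))            # 12 12
```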

(54)

A Key Lemma

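Lemma 55 $\sum_{m \mid n} \phi(m) = n$.

• Let $n = \prod_{i=1}^{\ell} p_i^{k_i}$ be the prime factorization of n and consider

$$\prod_{i=1}^{\ell} \left[\, \phi(1) + \phi(p_i) + \cdots + \phi(p_i^{k_i}) \,\right]. \qquad (6)$$

• Equation (6) equals n because $\phi(p_i^{k}) = p_i^{k} - p_i^{k-1}$ by Lemma 53 (p. 466), so $\phi(1) + \phi(p_i) + \cdots + \phi(p_i^{k_i}) = p_i^{k_i}$.

• Expand Eq. (6) to yield

$$n = \sum_{k_1' \le k_1, \ldots, k_\ell' \le k_\ell} \ \prod_{i=1}^{\ell} \phi(p_i^{k_i'}).$$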

(55)

The Proof (concluded)

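• By Corollary 54 (p. 466),

$$\prod_{i=1}^{\ell} \phi(p_i^{k_i'}) = \phi\Bigl(\prod_{i=1}^{\ell} p_i^{k_i'}\Bigr).$$

• So Eq. (6) becomes

$$n = \sum_{k_1' \le k_1, \ldots, k_\ell' \le k_\ell} \phi\Bigl(\prod_{i=1}^{\ell} p_i^{k_i'}\Bigr).$$

• Each $\prod_{i=1}^{\ell} p_i^{k_i'}$ is a unique divisor of $n = \prod_{i=1}^{\ell} p_i^{k_i}$.

• Equation (6) becomes

$$\sum_{m \mid n} \phi(m).$$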
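Lemma 55 is easy to spot-check numerically. In this sketch `phi` counts directly and `divisor_phi_sum` is a hypothetical helper that sums ϕ over the divisors of n; the range tested is arbitrary.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's function by direct counting; gives phi(1) = 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def divisor_phi_sum(n: int) -> int:
    """Sum of phi(m) over all divisors m of n."""
    return sum(phi(m) for m in range(1, n + 1) if n % m == 0)


print(divisor_phi_sum(12))  # 12 = 1 + 1 + 2 + 2 + 2 + 4
print(all(divisor_phi_sum(n) == n for n in range(1, 101)))  # True
```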

(56)