The Proof (continued)

(1)

Legendre’s Law of Quadratic Reciprocity^a

• Let p and q be two odd primes.

• The next result says that (p | q) and (q | p) are distinct if and only if both p and q are 3 mod 4.

Lemma 65 (Legendre (1785), Gauss)

(p|q)(q|p) = (−1)^{((p−1)/2)((q−1)/2)}.

^a First stated by Euler in 1751. Legendre (1785) did not give a correct proof. Gauss proved the theorem when he was 19. He gave at least 6 different proofs during his life. The 152nd proof appeared in 1963.
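The law can be spot-checked numerically. The sketch below is only an illustration, not part of the proof: it assumes Euler's criterion, (a | p) = a^{(p−1)/2} mod p, to evaluate Legendre symbols, and the helper names `legendre` and `reciprocity_holds` are made up for this example.

```python
from itertools import combinations

def legendre(a: int, p: int) -> int:
    """Legendre symbol (a|p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)      # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def reciprocity_holds(p: int, q: int) -> bool:
    """Check (p|q)(q|p) = (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes."""
    exponent = ((p - 1) // 2) * ((q - 1) // 2)
    return legendre(p, q) * legendre(q, p) == (-1) ** exponent

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
assert all(reciprocity_holds(p, q) for p, q in combinations(odd_primes, 2))
```

For instance, 3 and 7 are both 3 mod 4, so (3 | 7) and (7 | 3) must differ.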

(2)

The Proof (continued)

• Sum the elements of R′ in the previous proof modulo 2.

• On one hand, this is just Σ_{i=1}^{(p−1)/2} i mod 2.

• On the other hand, the sum equals

Σ_{i=1}^{(p−1)/2} (qi − p⌊iq/p⌋) + mp mod 2

= (q Σ_{i=1}^{(p−1)/2} i − p Σ_{i=1}^{(p−1)/2} ⌊iq/p⌋) + mp mod 2.

– Signs are irrelevant under mod 2.

– m is as in Lemma 64 (p. 486).

(3)

The Proof (continued)

• Ignore odd multipliers to make the sum equal

(Σ_{i=1}^{(p−1)/2} i − Σ_{i=1}^{(p−1)/2} ⌊iq/p⌋) + m mod 2.

• Equate the above with Σ_{i=1}^{(p−1)/2} i mod 2 to obtain

m = Σ_{i=1}^{(p−1)/2} ⌊iq/p⌋ mod 2.

(4)

The Proof (concluded)

• Σ_{i=1}^{(p−1)/2} ⌊iq/p⌋ is the number of integral points under the line y = (q/p) x for 1 ≤ x ≤ (p − 1)/2.

• Gauss’s lemma (p. 486) says (q|p) = (−1)^m.

• Repeat the proof with p and q reversed.

• So (p|q) = (−1)^{m′}, where m′ is the number of integral points above the line y = (q/p) x for 1 ≤ y ≤ (q − 1)/2.

• As a result, (p|q)(q|p) = (−1)^{m+m′}.

• But m + m′ is the total number of integral points in the (p−1)/2 × (q−1)/2 rectangle, which is ((p−1)/2)((q−1)/2).

(5)

Eisenstein’s Rectangle

[Figure: Eisenstein’s rectangle, with labels (p, q), (p − 1)/2, and (q − 1)/2, drawn for p = 11 and q = 7.]
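The counting argument can be checked on the p = 11, q = 7 example. This is a minimal sketch; `points_under` is a hypothetical helper that simply evaluates the floor sum from the proof, and the counts it returns agree with m and m′ modulo 2.

```python
def points_under(p: int, q: int) -> int:
    """Integral points (x, y) with y >= 1 under y = (q/p) x for 1 <= x <= (p-1)/2."""
    return sum(i * q // p for i in range(1, (p - 1) // 2 + 1))

p, q = 11, 7
below = points_under(p, q)   # points under the line, counted column by column
above = points_under(q, p)   # points above the line: same count with p and q swapped
rectangle = ((p - 1) // 2) * ((q - 1) // 2)

assert below + above == rectangle
print(below, above, rectangle)   # prints: 7 8 15
```

No integral point lies on the line inside the rectangle because p and q are distinct primes, so the two counts partition the rectangle.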

(6)

The Jacobi Symbol


• The Legendre symbol only works for odd prime moduli.

• The Jacobi symbol (a | m) extends it to cases where m is not prime.

• Let m = p₁p₂ · · · p_k be the prime factorization of m.

• When m > 1 is odd and gcd(a, m) = 1, then

(a | m) = Π_{i=1}^{k} (a | p_i).

– Note that the Jacobi symbol equals ±1.

– It reduces to the Legendre symbol when m is a prime.

• Define (a | 1) = 1.

(7)

Properties of the Jacobi Symbol

The Jacobi symbol has the following properties, for arguments for which it is defined.

1. (ab | m) = (a | m)(b | m).

2. (a | m₁m₂) = (a | m₁)(a | m₂).

3. If a = b mod m, then (a | m) = (b | m).

4. (−1 | m) = (−1)^{(m−1)/2} (by Lemma 64 on p. 486).

5. (2 | m) = (−1)^{(m²−1)/8}.

6. If a and m are both odd, then (a | m)(m | a) = (−1)^{(a−1)(m−1)/4}.

(8)

Calculation of (2200|999)

Similar to the Euclidean algorithm and does not require factorization. As 2200 = 202 mod 999, (2200|999) = (202|999) by property 3.

(202|999) = (−1)^{(999²−1)/8}(101|999)
= (−1)^{124750}(101|999) = (101|999)
= (−1)^{(100)(998)/4}(999|101) = (−1)^{24950}(999|101)
= (999|101) = (90|101) = (−1)^{(101²−1)/8}(45|101)
= (−1)^{1275}(45|101) = −(45|101)
= −(−1)^{(44)(100)/4}(101|45) = −(101|45) = −(11|45)
= −(−1)^{(10)(44)/4}(45|11) = −(45|11)
= −(1|11) = −1.
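The same reduction can be sketched in code. This is one possible arrangement of properties 3, 5, and 6, not necessarily the one the slides have in mind. Returning 0 when gcd(a, m) > 1 is a common convention assumed here, since the slides leave that case undefined.

```python
def jacobi(a: int, m: int) -> int:
    """Jacobi symbol (a|m) for odd m >= 1; returns 0 when gcd(a, m) > 1."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    a %= m                               # property 3: only a mod m matters
    sign = 1
    while a != 0:
        while a % 2 == 0:                # property 5: pull out factors of 2
            a //= 2
            if m % 8 in (3, 5):          # (2|m) = -1 exactly when m = 3 or 5 mod 8
                sign = -sign
        if a % 4 == 3 and m % 4 == 3:    # property 6: sign of the flip
            sign = -sign
        a, m = m % a, a                  # flip, then reduce by property 3
    return sign if m == 1 else 0

print(jacobi(2200, 999))   # prints: -1
```

Each pass of the outer loop shrinks the arguments as in the Euclidean algorithm, so the number of steps is logarithmic in m.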

(9)

A Result Generalizing Proposition 10.3 in the Textbook

Theorem 66 The group formed by the set Φ(n) under multiplication mod n has a primitive root if and only if n is either 1, 2, 4, p^k, or 2p^k for some nonnegative integer k and an odd prime p.

This result is essential in the proof of the next lemma.

(10)

The Jacobi Symbol and Primality Test^a

Lemma 67 If (M |N ) = M^{(N −1)/2} mod N for all M ∈ Φ(N ), then N is prime. (Assume N is odd.)

• Assume N = mp, where p is an odd prime, gcd(m, p) = 1, and m > 1 (not necessarily prime).

• Let r ∈ Φ(p) such that (r | p) = −1.

• The Chinese remainder theorem says that there is an M ∈ Φ(N ) such that

M = r mod p, M = 1 mod m.

^a Mr. Clement Hsiao (R88526067) pointed out that the textbook’s

(11)

The Proof (continued)

• By the hypothesis,

M^{(N −1)/2} = (M | N ) = (M | p)(M | m) = −1 mod N.

• Hence

M^{(N −1)/2} = −1 mod m.

• But because M = 1 mod m,

M^{(N −1)/2} = 1 mod m, a contradiction.

(12)

The Proof (continued)

• Second, assume that N = p^a, where p is an odd prime and a ≥ 2.

• By Theorem 66 (p. 498), there exists a primitive root r modulo p^a.

• From the assumption, M^{N−1} = [ M^{(N−1)/2} ]² = (M | N)² = 1 mod N for all M ∈ Φ(N).

(13)

The Proof (continued)

• As r ∈ Φ(N ) (prove it), we have

r^{N −1} = 1 mod N.

• As r’s exponent modulo N = p^a is φ(N) = p^{a−1}(p − 1), p^{a−1}(p − 1) | N − 1, which implies that p | N − 1.

• But this is impossible given that p | N .

(14)

The Proof (continued)

• Third, assume that N = mp^a, where p is an odd prime, gcd(m, p) = 1, m > 1 (not necessarily prime), and a is even.

• The proof mimics that of the second case.

• By Theorem 66 (p. 498), there exists a primitive root r modulo p^a.

• From the assumption, M^{N−1} = [ M^{(N−1)/2} ]² = (M | N)² = 1 mod N for all M ∈ Φ(N).

(15)

The Proof (continued)

• In particular,

M^{N −1} = 1 mod p^a (7)

for all M ∈ Φ(N ).

• The Chinese remainder theorem says that there is an M ∈ Φ(N ) such that

M = r mod p^a, M = 1 mod m.

• Because M = r mod p^a and Eq. (7), r^{N −1} = 1 mod p^a.

(16)

The Proof (concluded)

• As r’s exponent modulo p^a is φ(p^a) = p^{a−1}(p − 1), p^{a−1}(p − 1) | N − 1, which implies that p | N − 1 because a ≥ 2.

• But this is impossible given that p | N .

(17)

The Number of Witnesses to Compositeness

Theorem 68 (Solovay and Strassen (1977)) If N is an odd composite, then (M | N) ≠ M^{(N−1)/2} mod N for at least half of M ∈ Φ(N).

• By Lemma 67 (p. 499) there is at least one a ∈ Φ(N) such that (a | N) ≠ a^{(N−1)/2} mod N.

• Let B = {b₁, b₂, . . . , b_k} ⊆ Φ(N) be the set of all distinct residues such that (b_i | N) = b_i^{(N−1)/2} mod N.

• Let aB = {ab_i mod N : i = 1, 2, . . . , k}.

(18)

The Proof (concluded)

• |aB| = k.

– ab_i = ab_j mod N implies N | a(b_i − b_j), which is impossible because gcd(a, N) = 1 and N > |b_i − b_j|.

• aB ∩ B = ∅ because

(ab_i)^{(N−1)/2} = a^{(N−1)/2} b_i^{(N−1)/2} ≠ (a | N)(b_i | N) = (ab_i | N).

• Combining the above two results, we know

|B| / φ(N) ≤ 0.5.

(19)

if N is even and N ≠ 2 then
    return “N is composite”;
else if N = 2 then
    return “N is a prime”;
end if
Pick M ∈ {2, 3, . . . , N − 1} randomly;
if gcd(M, N) > 1 then
    return “N is composite”;
else
    if (M | N) ≠ M^{(N−1)/2} mod N then
        return “N is composite”;
    else
        return “N is a prime”;
    end if
end if
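A runnable sketch of the test follows. Three things here are assumptions beyond the pseudocode: the `rounds` parameter, which repeats the random draw, the handling of N < 2, and the convention that `jacobi` returns 0 when the arguments share a factor. The function names are illustrative only.

```python
import math
import random

def jacobi(a: int, m: int) -> int:
    """Jacobi symbol (a|m) for odd m >= 1 (0 if gcd(a, m) > 1)."""
    a %= m
    sign = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if m % 8 in (3, 5):
                sign = -sign
        if a % 4 == 3 and m % 4 == 3:
            sign = -sign
        a, m = m % a, a
    return sign if m == 1 else 0

def is_witness(M: int, N: int) -> bool:
    """True if M proves that the odd number N >= 3 is composite."""
    if math.gcd(M, N) > 1:
        return True                                      # witness via common factor
    return jacobi(M, N) % N != pow(M, (N - 1) // 2, N)   # witness via Jacobi

def probably_prime(N: int, rounds: int = 20) -> bool:
    """False means N is certainly not prime; True may be wrong for composite N."""
    if N < 2:
        return False          # assumption: inputs below 2 are rejected
    if N == 2:
        return True
    if N % 2 == 0:
        return False
    return not any(is_witness(random.randrange(2, N), N) for _ in range(rounds))
```

By Theorem 68, each independent draw misses a witness with probability at most 1/2, so `rounds` draws wrongly report a composite as prime with probability at most 2^{−rounds}.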

(20)

Analysis

• The algorithm certainly runs in polynomial time.

• There are no false positives (for compositeness).

– When the algorithm says the number is composite, it is always correct.

• The probability of a false negative is at most one half.

– If the input is composite, then the probability that the algorithm says the number is a prime is ≤ 0.5.

• The error probability can be reduced but not eliminated.

(21)

The Improved Density Attack for compositeness

[Figure: the set of all numbers < N, containing the witnesses to compositeness of N via Jacobi and the witnesses to compositeness of N via common factor.]

(22)

Randomized Complexity Classes; RP

• Let N be a polynomial-time precise NTM that runs in time p(n) and has 2 nondeterministic choices at each step.

• N is a polynomial Monte Carlo Turing machine for a language L if the following conditions hold:

– If x ∈ L, then at least half of the 2^{p(n)} computation paths of N on x halt with “yes” where n = | x |.

– If x ∉ L, then all computation paths halt with “no.”

• The class of all languages with polynomial Monte Carlo TMs is denoted RP (randomized polynomial time).^a

(23)

Comments on RP

• Nondeterministic steps can be seen as fair coin flips.

• There are no false positive answers.

• The probability of false negatives, 1 − ε, is at most 0.5.

• But any constant between 0 and 1 can replace 0.5.

– By repeating the algorithm k = ⌈−1/log₂(1 − ε)⌉ times, the probability of false negatives becomes (1 − ε)^k ≤ 0.5.

• In fact, ε can be arbitrarily close to 0 as long as it is of the order 1/p(n) for some polynomial p(n).

– −1/log₂(1 − ε) = O(1/ε) = O(p(n)).
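The repetition count can be computed directly. In this small sketch, `eps` stands for ε, the probability that a single run answers “yes” on a yes-instance, and `repetitions` is a name made up for the example.

```python
import math

def repetitions(eps: float) -> int:
    """k = ceil(-1 / log2(1 - eps)), so that (1 - eps)^k <= 0.5."""
    if not 0 < eps <= 1:
        raise ValueError("eps must lie in (0, 1]")
    if eps == 1:
        return 1                       # a single run never errs
    return math.ceil(-1 / math.log2(1 - eps))

for eps in (0.5, 0.25, 0.1, 0.01):
    k = repetitions(eps)
    assert (1 - eps) ** k <= 0.5       # false-negative probability after k runs
```

For small ε the count grows like (ln 2)/ε, which matches the O(1/ε) estimate above.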

(24)

Where RP Fits

• P ⊆ RP ⊆ NP.

– A deterministic TM is like a Monte Carlo TM except that all the coin flips are ignored.

– A Monte Carlo TM is an NTM with extra demands on the number of accepting paths.

• compositeness ∈ RP; primes ∈ coRP; primes ∈ RP.^a

– In fact, primes ∈ P.

• RP ∪ coRP is another “plausible” notion of efficient computation.

^a Adleman and Huang (1987).

(25)

ZPP (Zero Probabilistic Polynomial)

• The class ZPP is defined as RP ∩ coRP.

• A language in ZPP has two Monte Carlo algorithms, one with no false positives and the other with no false negatives.

• If we repeatedly run both Monte Carlo algorithms, eventually one definite answer will come (unlike RP).

– A positive answer from the one without false positives.

– A negative answer from the one without false negatives.

(26)

The ZPP Algorithm (Las Vegas)

{Suppose L ∈ ZPP.}
{N₁ has no false positives, and N₂ has no false negatives.}
while true do
    if N₁(x) = “yes” then
        return “yes”;
    end if
    if N₂(x) = “no” then
        return “no”;
    end if
end while

(27)

ZPP (concluded)

• The expected running time for the correct answer to emerge is polynomial.

– The probability that a run of the 2 algorithms does not generate a definite answer is at most 0.5.

– Let p(n) be the running time of each run.

– The expected running time for a definite answer is at most Σ_{i=1}^{∞} 0.5^i · i · p(n) = 2p(n).

• Essentially, ZPP is the class of problems that can be solved without errors in expected polynomial time.
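The Las Vegas loop can be sketched as follows. The two Monte Carlo machines are modeled as Python callables returning True for “yes” and False for “no”. This interface and the name `las_vegas` are assumptions made for illustration.

```python
from typing import Callable, Tuple

def las_vegas(n1: Callable[[str], bool],
              n2: Callable[[str], bool],
              x: str) -> Tuple[bool, int]:
    """Repeat both Monte Carlo runs until one gives a definite answer.

    n1 has no false positives, so its "yes" (True) is trusted.
    n2 has no false negatives, so its "no" (False) is trusted.
    Returns the answer and the number of rounds used.
    """
    rounds = 0
    while True:
        rounds += 1
        if n1(x):
            return True, rounds
        if not n2(x):
            return False, rounds

# The series behind the 2p(n) bound: sum over i of i * 0.5^i equals 2.
expected_rounds = sum(i * 0.5 ** i for i in range(1, 200))
assert abs(expected_rounds - 2) < 1e-9
```

The loop has no fixed bound on its running time. Only its expectation is bounded, which is what separates Las Vegas from Monte Carlo algorithms.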

(28)

Et Tu, RP?

{Suppose L ∈ RP.}
{N decides L without false positives.}
while true do
    if N(x) = “yes” then
        return “yes”;
    end if
    {But what to do here?}
end while

• You eventually get a “yes” if x ∈ L.

• But how to get a “no” when x ∉ L?

• You have to sacrifice either correctness or bounded running time.

(29)

Large Deviations

• Suppose you have a biased coin.

• One side has probability 0.5 + ε to appear and the other 0.5 − ε, for some 0 < ε < 0.5.

• But you do not know which is which.

• How to decide which side is the more likely—with high confidence?

• Answer: Flip the coin many times and pick the side that appeared the most times.

• Question: Can you quantify the confidence?

(30)

The Chernoff Bound

Theorem 69 (Chernoff (1952)) Suppose x₁, x₂, . . . , x_n are independent random variables taking the values 1 and 0 with probabilities p and 1 − p, respectively. Let X = Σ_{i=1}^{n} x_i. Then for all 0 ≤ θ ≤ 1,

prob[ X ≥ (1 + θ) pn ] ≤ e^{−θ²pn/3}.

• The probability that a binomial random variable deviates from its expected value E[ X ] = E[ Σ_{i=1}^{n} x_i ] = pn decreases exponentially with the deviation.

• The Chernoff bound is asymptotically optimal.

(31)

The Proof

• Let t be any positive real number.

• Then

prob[ X ≥ (1 + θ) pn ] = prob[ e^{tX} ≥ e^{t(1+θ) pn} ].

• Markov’s inequality (p. 460) generalized to real-valued random variables says that

prob[ e^{tX} ≥ kE[ e^{tX} ] ] ≤ 1/k.

• With k = e^{t(1+θ) pn}/E[ e^{tX} ], we have

prob[ X ≥ (1 + θ) pn ] ≤ e^{−t(1+θ) pn}E[ e^{tX} ].

(32)

The Proof (continued)

• Because X = Σ_{i=1}^{n} x_i and the x_i’s are independent, E[ e^{tX} ] = (E[ e^{t x₁} ])ⁿ = [ 1 + p(e^t − 1) ]ⁿ.

• Substituting, we obtain

prob[ X ≥ (1 + θ) pn ] ≤ e^{−t(1+θ) pn}[ 1 + p(e^t − 1) ]ⁿ

≤ e^{−t(1+θ) pn}e^{pn(e^t − 1)} as (1 + a)ⁿ ≤ e^{an} for all a > 0.

(33)

The Proof (concluded)

• With the choice of t = ln(1 + θ), the above becomes prob[ X ≥ (1 + θ) pn ] ≤ e^{pn[ θ−(1+θ) ln(1+θ) ]}.

• The exponent expands to −θ²/2 + θ³/6 − θ⁴/12 + · · · for 0 ≤ θ ≤ 1, which is less than

−θ²/2 + θ³/6 ≤ θ²(−1/2 + θ/6) ≤ θ²(−1/2 + 1/6) = −θ²/3.
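The final inequality on the exponent can be sanity-checked numerically on a grid. This is only a check, not a substitute for the series argument. `chernoff_exponent` is a hypothetical name for θ − (1 + θ) ln(1 + θ).

```python
import math

def chernoff_exponent(theta: float) -> float:
    """theta - (1 + theta) ln(1 + theta): the exponent per unit of pn at t = ln(1 + theta)."""
    return theta - (1 + theta) * math.log1p(theta)

# Compare against -theta^2 / 3 on a grid over [0, 1]; the tiny slack absorbs rounding.
grid = [i / 1000 for i in range(1001)]
assert all(chernoff_exponent(t) <= -t * t / 3 + 1e-15 for t in grid)
```

At θ = 1 the exponent is 1 − 2 ln 2 ≈ −0.386, comfortably below −1/3.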

(34)

Power of the Majority Rule

From prob[ X ≤ (1 − θ) pn ] ≤ e^{−θ²pn/2} (prove it):

Corollary 70 If p = (1/2) + ε for some 0 ≤ ε ≤ 1/2, then

prob[ Σ_{i=1}^{n} x_i ≤ n/2 ] ≤ e^{−ε²n/2}.

• The textbook’s corollary to Lemma 11.9 seems incorrect.

• Our original problem (p. 518) hence demands ≈ 1.4k/ε² independent coin flips to guarantee making an error with probability at most 2^{−k} with the majority rule.
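The ≈ 1.4k/ε² estimate comes from solving e^{−ε²n/2} ≤ 2^{−k} for n, which gives n ≥ 2k ln 2/ε². A minimal sketch, with illustrative function names, that also compares the bound against the exact binomial tail:

```python
import math

def flips_needed(k: int, eps: float) -> int:
    """Smallest n with exp(-eps^2 n / 2) <= 2^-k, i.e. n >= 2 k ln 2 / eps^2."""
    return math.ceil(2 * k * math.log(2) / eps ** 2)

def majority_error(n: int, p: float) -> float:
    """Exact prob[X <= n/2] for X ~ Binomial(n, p): the majority rule picks the wrong side."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(n // 2 + 1))

n = flips_needed(5, 0.1)                     # k = 5, eps = 0.1
assert n == 694
assert majority_error(n, 0.6) <= 2 ** -5     # the true error is within the target
```

The exact tail is typically far smaller than 2^{−k}, since Corollary 70 is an upper bound rather than an estimate.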

(35)

BPP^a (Bounded Probabilistic Polynomial)

• The class BPP contains all languages for which there is a precise polynomial-time NTM N such that:

– If x ∈ L, then at least 3/4 of the computation paths of N on x lead to “yes.”

– If x ∉ L, then at least 3/4 of the computation paths of N on x lead to “no.”

• N accepts or rejects by a clear majority.

^a Gill (1977).

(36)

Magic 3/4?

• The number 3/4 bounds the probability of a right answer away from 1/2.

• Any constant strictly between 1/2 and 1 can be used without affecting the class BPP.

• In fact, 0.5 plus any inverse polynomial between 1/2 and 1, 0.5 + 1/p(n), can be used.

(37)

The Majority Vote Algorithm

Suppose L is decided by N by majority (1/2) + ε.

for i = 1, 2, . . . , 2k + 1 do
    Run N on input x;
end for
if “yes” is the majority answer then
    return “yes”;
else
    return “no”;
end if

(38)

Analysis

• The running time remains polynomial, being 2k + 1 times N ’s running time.

• By Corollary 70 (p. 523), the probability of a false answer is at most e^{−ε²k}.

• By taking k = ⌈2/ε²⌉, the error probability is at most 1/4.

• As with the RP case, ε can be any inverse polynomial, because k remains polynomial in n.
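A sketch of the majority vote, assuming the machine N is available as a Python callable `run` that returns True for “yes” and False for “no”. That interface and the names `rounds_for` and `majority_vote` are made up for this example.

```python
import math
from typing import Callable

def rounds_for(eps: float) -> int:
    """k = ceil(2 / eps^2); the vote then uses 2k + 1 runs."""
    return math.ceil(2 / eps ** 2)

def majority_vote(run: Callable[[str], bool], x: str, k: int) -> bool:
    """Run the decider 2k + 1 times on x and return the majority answer."""
    yes = sum(1 for _ in range(2 * k + 1) if run(x))
    return yes > k                      # strictly more than half of 2k + 1 runs

# With k = ceil(2 / eps^2), the bound exp(-eps^2 k) is at most exp(-2) < 1/4.
assert math.exp(-2) <= 0.25
```

Using an odd number of runs rules out ties, so the majority answer is always defined.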

(39)

Probability Amplification for BPP

• Let m be the number of random bits used by a BPP algorithm.

– By definition, m is polynomial in n.

• With k = Θ(log m) in the majority vote algorithm, we can lower the error probability to, say, ≤ (3m)⁻¹.

(40)

Aspects of BPP

• BPP is the most comprehensive yet plausible notion of efficient computation.

– If a problem is in BPP, we take it to mean that the problem can be solved efficiently.

– In this aspect, BPP has effectively replaced P.

• (RP ∪ coRP) ⊆ (NP ∪ coNP).

• (RP ∪ coRP) ⊆ BPP.

• Whether BPP ⊆ (NP ∪ coNP) is unknown.

• But it is unlikely that NP ⊆ BPP (p. 544).

(41)

coBPP

• The definition of BPP is symmetric: acceptance by clear majority and rejection by clear majority.

• An algorithm for L ∈ BPP becomes one for L̄ by reversing the answer.

• So L̄ ∈ BPP and BPP ⊆ coBPP.

• Similarly coBPP ⊆ BPP.

• Hence BPP = coBPP.

• This approach does not work for RP.

• It did not work for NP either.

(42)

BPP and coBPP

[Figure: regions labeled “yes”, “no”, “no”, “yes”.]

(43)

“The Good, the Bad, and the Ugly”

[Figure: diagram of the classes P, ZPP, RP, coRP, BPP, NP, and coNP.]

(44)

Circuit Complexity

• Circuit complexity is based on boolean circuits instead of Turing machines.

• A boolean circuit with n inputs computes a boolean function of n variables.

• By identifying true with 1 and false with 0, a boolean circuit with n inputs accepts certain strings in { 0, 1 }ⁿ.

• To relate circuits with arbitrary languages, we need one circuit for each possible input length n.

(45)

Formal Definitions

• The size of a circuit is the number of gates in it.

• A family of circuits is an infinite sequence C = (C₀, C₁, . . .) of boolean circuits, where C_n has n boolean inputs.

• L ⊆ {0, 1}^∗ has polynomial circuits if there is a family of circuits C such that:

– The size of C_n is at most p(n) for some fixed polynomial p.

– For input x ∈ {0, 1}^∗, C_{| x |} outputs 1 if and only if x ∈ L.

∗ C_n accepts L ∩ {0, 1}ⁿ.

(46)

Exponential Circuits Contain All Languages

• Theorem 15 (p. 171) implies that there are languages that cannot be solved by circuits of size 2ⁿ/(2n).

• But exponential circuits can solve all problems.

Proposition 71 All decision problems (decidable or otherwise) can be solved by a circuit of size 2ⁿ⁺².

• We will show that for any language L ⊆ {0, 1}^∗, L ∩ {0, 1}ⁿ can be decided by a circuit of size 2ⁿ⁺².

(47)

The Proof (concluded)

• Define boolean function f : {0, 1}ⁿ → {0, 1}, where

f(x₁x₂ · · · x_n) = 1 if x₁x₂ · · · x_n ∈ L, and 0 if x₁x₂ · · · x_n ∉ L.

• f(x₁x₂ · · · x_n) = (x₁ ∧ f(1x₂ · · · x_n)) ∨ (¬x₁ ∧ f(0x₂ · · · x_n)).

• The circuit size s(n) for f(x₁x₂ · · · x_n) hence satisfies s(n) = 4 + 2s(n − 1) with s(1) = 1.

• Solve it to obtain s(n) = 5 × 2ⁿ⁻¹ − 4 ≤ 2ⁿ⁺².
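The closed form can be checked against the recurrence for small n. This sketch only unrolls s(n) = 4 + 2s(n − 1) and does not build the circuit itself. `circuit_size` is an illustrative name.

```python
def circuit_size(n: int) -> int:
    """s(n) from the recurrence s(n) = 4 + 2 s(n - 1) with s(1) = 1."""
    s = 1
    for _ in range(2, n + 1):
        s = 4 + 2 * s      # two subcircuits plus the 4 gates that combine them
    return s

for n in range(1, 21):
    # closed form, and the bound claimed in Proposition 71
    assert circuit_size(n) == 5 * 2 ** (n - 1) - 4 <= 2 ** (n + 2)
```

The 4 in the recurrence accounts for one ¬, two ∧, and one ∨ gate in the expansion of f on x₁.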

(48)

Comments

• Proposition 71 (p. 535) does not contradict anything we knew so far about computation theory.

– Yes, there are only a finite number of circuits with size 2ⁿ⁺².

– Yes, there are only 2ⁿ possible inputs of length n.

– Yes, those circuits can solve all problems of length n.

– But is there an algorithm to tell us which circuit is the correct one?

(49)

The Circuit Complexity of P

Proposition 72 All languages in P have polynomial circuits.

• Let L ∈ P be decided by a TM in time p(n).

• By Corollary 28 (p. 263), there is a circuit with O(p(n)²) gates that accepts L ∩ {0, 1}ⁿ.

• The size of the circuit depends only on L and the length of the input.

• The size of the circuit is polynomial in n.

(50)

Languages That Polynomial Circuits Accept

• Do polynomial circuits accept only languages in P?

• There are undecidable languages that have polynomial circuits.

– Let L ⊆ {0, 1}^∗ be an undecidable language.

– Let U = {1ⁿ : the binary expansion of n is in L}.

– U is also undecidable.

– U ∩ {1}ⁿ can be accepted by C_n that is trivially true if 1ⁿ ∈ U and trivially false if 1ⁿ ∉ U.

– The family of circuits (C₀, C₁, . . .) is polynomial in size.

(51)

A Patch

• Despite the simplicity of a circuit, the previous discussions imply the following:

– Circuits are not a realistic model of computation.

– Polynomial circuits are not a plausible notion of efficient computation.

• What gives?

• The effective and efficient constructibility of C₀, C₁, . . . .

(52)

Uniformity

• A family (C₀, C₁, . . .) of circuits is uniform if there is a log n-space bounded TM which on input 1ⁿ outputs C_n.

– Circuits now cannot accept undecidable languages (why?).

– The circuit family on p. 539 is not constructible by a single Turing machine (algorithm).

• A language has uniformly polynomial circuits if there is a uniform family of polynomial circuits that decide it.

(53)

Uniformly Polynomial Circuits and P

Theorem 73 L ∈ P if and only if L has uniformly polynomial circuits.

• One direction was proved in Proposition 72 (p. 538).

• Now suppose L has uniformly polynomial circuits.

• Decide x ∈ L in polynomial time as follows:

– Let n = | x |.

– Build C_n in log n space, hence polynomial time.

– Evaluate the circuit with input x in polynomial time.

• Therefore L ∈ P.

(54)

Relation to P vs. NP

• Theorem 73 implies that P ≠ NP if and only if NP-complete problems have no uniformly polynomial circuits.

• A stronger conjecture: NP-complete problems have no polynomial circuits, uniformly or not.

• The above is currently the preferred approach to proving the P ≠ NP conjecture, without success so far.