# The Fermat Test for Primality


(1)

### Primality Tests

• primes asks if a number N is a prime.

• The classic algorithm tests if $k \mid N$ for $k = 2, 3, \ldots, \sqrt{N}$.

• But it runs in $\Omega(2^{n/2})$ steps, where $n = |N| = \log_2 N$.

(2)

### The Density Attack for primes

1: Pick k ∈ {2, . . . , N − 1} randomly; {Assume N > 2.}

2: if k| N then

3: return “N is composite”;

4: else

5: return “N is a prime”;

6: end if
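The six steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function names and the `rng` parameter are assumptions introduced here. The helper `success_probability` counts the exact fraction of choices of k that expose a composite N, which makes the weakness of the attack concrete.

```python
import random

def density_attack(n: int, rng: random.Random) -> str:
    """One round of the density attack; assumes n > 2."""
    k = rng.randrange(2, n)       # uniform over {2, ..., n - 1}
    if n % k == 0:                # k | n, so k is a nontrivial divisor
        return "N is composite"
    return "N is a prime"         # unreliable: most k reveal nothing

def success_probability(n: int) -> float:
    """Exact chance that one round exposes a composite n > 2."""
    divisors = sum(1 for k in range(2, n) if n % k == 0)
    return divisors / (n - 2)
```

For N = 15, only k = 3 and k = 5 succeed out of 13 candidates, so a single round detects compositeness with probability 2/13.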

(3)

### Analysis


• Suppose N = P Q, a product of 2 primes.

• The probability of success is

$$< 1 - \frac{\phi(N)}{N} = 1 - \frac{(P-1)(Q-1)}{PQ} = \frac{P+Q-1}{PQ}.$$

• In the case where $P \approx Q$, this probability becomes

$$< \frac{1}{P} + \frac{1}{Q} \approx \frac{2}{\sqrt{N}}.$$

• This probability is exponentially small.

(4)

### The Fermat Test for Primality

Fermat’s “little” theorem on p. 365 suggests the following primality test for any given number N:

1: Pick a number a randomly from {1, 2, . . . , N − 1};

2: if $a^{N-1} \neq 1 \bmod N$ then

3: return “N is composite”;

4: else

5: return “N is probably a prime”;

6: end if
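A minimal Python sketch of the test above, using the built-in three-argument `pow` for modular exponentiation. The names `is_fermat_witness` and `fermat_test` and the `rng` parameter are assumptions made for illustration.

```python
import random

def is_fermat_witness(a: int, n: int) -> bool:
    """True if a^(n-1) != 1 (mod n), which proves n composite."""
    return pow(a, n - 1, n) != 1

def fermat_test(n: int, rng: random.Random) -> str:
    """One round of the Fermat test; assumes n > 1."""
    a = rng.randrange(1, n)       # uniform over {1, ..., n - 1}
    if is_fermat_witness(a, n):
        return "N is composite"
    return "N is probably a prime"
```

For example, a = 2 exposes N = 15 because 2^14 mod 15 = 4, whereas a = 4 does not because 4^14 mod 15 = 1.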

(5)

### The Fermat Test for Primality (concluded)

• Unfortunately, there are composite numbers called Carmichael numbers that will pass the Fermat test for all $a \in \{1, 2, \ldots, N-1\}$ with $\gcd(a, N) = 1$.

• There are infinitely many Carmichael numbers (Alford, Granville, and Pomerance, 1992).
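As a quick check, the sketch below examines 561 = 3 · 11 · 17, the smallest Carmichael number (a well-known fact that the text above does not state). It lists every base that passes the Fermat test and compares that list with the bases coprime to 561. The helper name `fermat_liars` is an assumption.

```python
from math import gcd

def fermat_liars(n: int) -> list[int]:
    """Bases a in {1, ..., n-1} with a^(n-1) = 1 (mod n)."""
    return [a for a in range(1, n) if pow(a, n - 1, n) == 1]

n = 561                                    # 3 * 11 * 17
liars = fermat_liars(n)
coprime = [a for a in range(1, n) if gcd(a, n) == 1]
print(liars == coprime, len(liars))        # every coprime base passes
```

Bases that share a factor with 561 still fail the test, but they are a minority of the candidates.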

(6)

### Square Roots Modulo a Prime

• Equation $x^2 = a \bmod p$ has at most two (distinct) roots by Lemma 54 (p. 370).

– The roots are called square roots.

– Numbers a with square roots and gcd(a, p) = 1 are called quadratic residues.

∗ They are $1^2 \bmod p, 2^2 \bmod p, \ldots, (p-1)^2 \bmod p$.

• We shall show that a number either has two roots or has none, and testing which one is true is trivial.

• There are no known efficient deterministic algorithms to find the roots.

(7)

### Euler’s Test

Lemma 60 (Euler) Let p be an odd prime and $a \neq 0 \bmod p$.

1. If $a^{(p-1)/2} = 1 \bmod p$, then $x^2 = a \bmod p$ has two roots.

2. If $a^{(p-1)/2} \neq 1 \bmod p$, then $a^{(p-1)/2} = -1 \bmod p$ and $x^2 = a \bmod p$ has no roots.

• Let r be a primitive root of p.

• By Fermat’s “little” theorem, $r^{(p-1)/2}$ is a square root of 1, so $r^{(p-1)/2} = \pm 1 \bmod p$.

• But as r is a primitive root, $r^{(p-1)/2} \neq 1 \bmod p$.

• Hence $r^{(p-1)/2} = -1 \bmod p$.

(8)

### The Proof (continued)

• Suppose $a = r^{2j}$ for some $1 \le j \le (p-1)/2$.

• Then $a^{(p-1)/2} = r^{j(p-1)} = 1 \bmod p$ and the two distinct roots of a are $r^j$ and $-r^j \ (= r^{j+(p-1)/2})$.

– If $r^j = -r^j \bmod p$, then $2r^j = 0 \bmod p$, which implies $r^j = 0 \bmod p$, a contradiction.

• As $1 \le j \le (p-1)/2$, there are (p − 1)/2 such a’s.

(9)

### The Proof (continued)

• Each such a has 2 distinct square roots.

• The square roots of all the a’s are distinct.

– The square roots of different a’s must be different.

• Hence the set of square roots is {1, 2, . . . , p − 1}.

– Because there are (p − 1)/2 such a’s and each a has two square roots.

• As a result, $a = r^{2j}$, $1 \le j \le (p-1)/2$, are all the quadratic residues.

(10)

### The Proof (concluded)

• If $a = r^{2j+1}$, then it has no roots because all the square roots have been taken.

• Now,

$$a^{(p-1)/2} = \left[ r^{(p-1)/2} \right]^{2j+1} = (-1)^{2j+1} = -1 \bmod p.$$
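Euler's test can be checked numerically against a brute-force root search. The sketch below is only an illustration; the function names and the choice p = 23 are assumptions.

```python
def euler_power(a: int, p: int) -> int:
    """a^((p-1)/2) mod p reported as +1 or -1; p an odd prime, a != 0 mod p."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def square_roots(a: int, p: int) -> list[int]:
    """All x in {1, ..., p-1} with x^2 = a (mod p), found by exhaustive search."""
    return [x for x in range(1, p) if (x * x - a) % p == 0]

p = 23
for a in range(1, p):
    expected = 2 if euler_power(a, p) == 1 else 0
    assert len(square_roots(a, p)) == expected
```

The exhaustive search takes time linear in p, while the test itself needs only one modular exponentiation.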

(11)

### The Legendre Symbol and Quadratic Residuacity Test

• By Lemma 60 (p. 426), $a^{(p-1)/2} \bmod p = \pm 1$ for $a \neq 0 \bmod p$.

• For odd prime p, define the Legendre symbol (a | p) as

$$(a \mid p) = \begin{cases} 0 & \text{if } p \mid a, \\ 1 & \text{if } a \text{ is a quadratic residue modulo } p, \\ -1 & \text{if } a \text{ is a quadratic nonresidue modulo } p. \end{cases}$$

• Euler’s test implies $a^{(p-1)/2} = (a \mid p) \bmod p$ for any odd prime p and any integer a.

• Note that (ab|p) = (a|p)(b|p).

(The symbol is named after Adrien-Marie Legendre (1752–1833).)

(12)

### Gauss’s Lemma

Lemma 61 (Gauss) Let p and q be two odd primes. Then $(q|p) = (-1)^m$, where m is the number of residues in $R = \{ iq \bmod p : 1 \le i \le (p-1)/2 \}$ that are greater than (p − 1)/2.

• All residues in R are distinct.

– If $iq = jq \bmod p$ with $i \neq j$, then $p \mid (j - i)$ or $p \mid q$, neither of which is possible.

• No two elements of R add up to p.

– If iq + jq = 0 mod p, then p|(i + j) or p|q.

– But neither is possible.

(13)

### The Proof (continued)

• Consider the set R′ of residues that result from R if we replace each of the m elements a ∈ R such that a > (p − 1)/2 by p − a.

– This is equivalent to performing −a mod p.

• All residues in R′ are now at most (p − 1)/2.

• In fact, R′ = {1, 2, . . . , (p − 1)/2} (see the illustration that follows).

– Otherwise, two elements of R would add up to p, which has been shown to be impossible.

(14)

[Figure: illustration of the residue replacement for p = 7 and q = 5.]

(15)

### The Proof (concluded)

• Alternatively, the replaced set is $R' = \{ \pm iq \bmod p : 1 \le i \le (p-1)/2 \}$, where exactly m of the elements have the minus sign.

• Take the product of all elements in the two representations of R′.

• So $[(p-1)/2]! = (-1)^m q^{(p-1)/2} [(p-1)/2]! \bmod p$.

• Because $\gcd([(p-1)/2]!, p) = 1$, the above implies $1 = (-1)^m q^{(p-1)/2} \bmod p$.

• By Euler’s test, $q^{(p-1)/2} = (q|p) \bmod p$, so $(q|p) = (-1)^m$.
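Gauss's lemma is easy to confirm by direct counting. In the sketch below, `gauss_m` counts the residues iq mod p that exceed (p − 1)/2, and `legendre` evaluates the symbol through Euler's test; both names are assumptions made for this illustration.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a | p) via Euler's test; p must be an odd prime."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def gauss_m(q: int, p: int) -> int:
    """Number of residues iq mod p, 1 <= i <= (p-1)/2, greater than (p-1)/2."""
    half = (p - 1) // 2
    return sum(1 for i in range(1, half + 1) if (i * q) % p > half)

primes = [3, 5, 7, 11, 13, 17, 19, 23]
for p in primes:
    for q in primes:
        if p != q:
            assert legendre(q, p) == (-1) ** gauss_m(q, p)
```

With p = 7 and q = 5 the residues are 5, 3, and 1, so m = 1 and (5|7) = −1.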

(16)

### Legendre’s Law of Quadratic Reciprocity

• Let p and q be two odd primes.

• The next result says their Legendre symbols are distinct if and only if both numbers are 3 mod 4.

Lemma 62 (Legendre (1785), Gauss) $(p|q)(q|p) = (-1)^{\frac{p-1}{2} \frac{q-1}{2}}$.

(First stated by Euler in 1751. Legendre (1785) did not give a correct proof. Gauss proved the theorem when he was 19. He gave at least 6 different proofs during his life. The 152nd proof appeared in 1963.)

(17)

### The Proof (continued)

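• Sum the elements of R′ (the set obtained from R in the previous proof by replacing each a > (p − 1)/2 with p − a) in mod 2.

• On one hand, this is just $\sum_{i=1}^{(p-1)/2} i \bmod 2$.

• On the other hand, the sum equals

$$\sum_{i=1}^{(p-1)/2} \left( qi - p \left\lfloor \frac{iq}{p} \right\rfloor \right) + mp \bmod 2 = \left( q \sum_{i=1}^{(p-1)/2} i - p \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor \right) + mp \bmod 2.$$

– Signs are irrelevant under mod 2.

– m is as in Lemma 61 (p. 431).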

(18)

### The Proof (continued)

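• Ignore odd multipliers to make the sum equal

$$\sum_{i=1}^{(p-1)/2} i - \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor + m \bmod 2.$$

• Equate the above with $\sum_{i=1}^{(p-1)/2} i \bmod 2$ to obtain

$$m = \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor \bmod 2.$$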

(19)

### The Proof (concluded)

• $\sum_{i=1}^{(p-1)/2} \lfloor iq/p \rfloor$ is the number of integral points under the line y = (q/p) x for 1 ≤ x ≤ (p − 1)/2.

• Gauss’s lemma (p. 431) says $(q|p) = (-1)^m$.

• Repeat the proof with p and q reversed.

• So $(p|q) = (-1)^{m'}$, where m′ is the number of integral points above the line y = (q/p) x for 1 ≤ y ≤ (q − 1)/2.

• As a result, $(p|q)(q|p) = (-1)^{m+m'}$.

• But m + m′ is the total number of integral points in the $\frac{p-1}{2} \times \frac{q-1}{2}$ rectangle, which is $\frac{p-1}{2} \cdot \frac{q-1}{2}$.

(20)

### Eisenstein’s Rectangle

[Figure: lattice diagram with labels (p, q), (p − 1)/2, and (q − 1)/2, drawn for p = 11 and q = 7.]
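The lattice-point count behind the reciprocity law can be reproduced for p = 11 and q = 7. This is a small sketch under assumed names (`points_under`, `legendre`); it counts the points on each side of the line y = (q/p) x and compares the parity of the total with the product of the two Legendre symbols.

```python
def points_under(p: int, q: int) -> int:
    """Integral points (x, y) with 1 <= x <= (p-1)/2 and 1 <= y < (q/p) x."""
    return sum((i * q) // p for i in range(1, (p - 1) // 2 + 1))

def legendre(a: int, p: int) -> int:
    """Legendre symbol (a | p) via Euler's test; p must be an odd prime."""
    t = pow(a % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

p, q = 11, 7
m = points_under(p, q)          # points below the line
m_prime = points_under(q, p)    # points above it (roles of p and q swapped)
total = ((p - 1) // 2) * ((q - 1) // 2)
print(m, m_prime, total)        # 7 8 15
```

Here (11|7) = 1 and (7|11) = −1, so their product is −1 = (−1)^15, as the law predicts.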

(21)

### The Jacobi Symbol

• The Legendre symbol only works for odd prime moduli.

• The Jacobi symbol (a | m) extends it to cases where m is not prime.

• Let $m = p_1 p_2 \cdots p_k$ be the prime factorization of m.

• When m > 1 is odd and gcd(a, m) = 1, then

$$(a \mid m) = \prod_{i=1}^{k} (a \mid p_i).$$

• Define (a | 1) = 1.

(The symbol is named after Carl Jacobi (1804–1851).)

(22)

### Properties of the Jacobi Symbol

The Jacobi symbol has the following properties, for arguments for which it is defined.

1. (ab | m) = (a | m)(b | m).

2. (a | m1m2) = (a | m1)(a| m2).

3. If a = b mod m, then (a | m) = (b | m).

4. $(-1 \mid m) = (-1)^{(m-1)/2}$ (by Lemma 61 on p. 431).

5. $(2 \mid m) = (-1)^{(m^2-1)/8}$ (by Lemma 61 on p. 431).

6. If a and m are both odd, then $(a \mid m)(m \mid a) = (-1)^{(a-1)(m-1)/4}$.

(23)

### Calculation of (2200 | 999)

Similar to the Euclidean algorithm and does not require factorization. Because 2200 = 202 mod 999, the calculation starts from (202|999).

$$\begin{aligned}
(2200|999) = (202|999) &= (-1)^{(999^2-1)/8}(101|999) \\
&= (-1)^{124750}(101|999) = (101|999) \\
&= (-1)^{(100)(998)/4}(999|101) = (-1)^{24950}(999|101) \\
&= (999|101) = (90|101) = (-1)^{(101^2-1)/8}(45|101) \\
&= (-1)^{1275}(45|101) = -(45|101) \\
&= -(-1)^{(44)(100)/4}(101|45) = -(101|45) = -(11|45) \\
&= -(-1)^{(10)(44)/4}(45|11) = -(45|11) \\
&= -(1|11) = -(11|1) = -1.
\end{aligned}$$
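The calculation above follows a loop that can be written down directly. The sketch below is one common way to organize it, not necessarily the procedure the slides have in mind: it strips factors of 2 using property 5, swaps the arguments using property 6, and reduces using property 3. The function name `jacobi` is an assumption.

```python
def jacobi(a: int, m: int) -> int:
    """Jacobi symbol (a | m) for odd m >= 1; returns 0 when gcd(a, m) > 1."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    a %= m                              # property 3
    sign = 1
    while a != 0:
        while a % 2 == 0:               # property 5: (2 | m) = -1 iff m = 3, 5 mod 8
            a //= 2
            if m % 8 in (3, 5):
                sign = -sign
        a, m = m, a                     # property 6: reciprocity for odd a and m
        if a % 4 == 3 and m % 4 == 3:
            sign = -sign
        a %= m                          # property 3 again
    return sign if m == 1 else 0

print(jacobi(2200, 999))                # -1
```

Each pass shrinks the arguments the way the Euclidean algorithm does, so the number of iterations is logarithmic in m.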

(24)

### A Result Generalizing Proposition 10.3 in the Textbook

Theorem 63 The group formed by the set Φ(n) under multiplication mod n has a primitive root if and only if n is either 1, 2, 4, $p^k$, or $2p^k$ for some nonnegative integer k and an odd prime p.

This result is essential in the proof of the next lemma.

(25)

### The Jacobi Symbol and Primality Test

Lemma 64 If $(M|N) = M^{(N-1)/2} \bmod N$ for all M ∈ Φ(N), then N is prime. (Assume N is odd.)

• First, assume N = mp, where p is an odd prime, gcd(m, p) = 1, and m > 1 (not necessarily prime).

• Let r ∈ Φ(p) such that (r | p) = −1.

• The Chinese remainder theorem says that there is an M ∈ Φ(N) such that

$$M = r \bmod p, \quad M = 1 \bmod m.$$

(Mr. Clement Hsiao (R88526067) pointed out that the textbook’s proof in Lemma 11.8 is incorrect while he was a senior in January 1999.)

(26)

### The Proof (continued)

• By the hypothesis,

$$M^{(N-1)/2} = (M \mid N) = (M \mid p)(M \mid m) = -1 \bmod N.$$

• Hence

$$M^{(N-1)/2} = -1 \bmod m.$$

• But because M = 1 mod m,

$M^{(N-1)/2} = 1 \bmod m$, a contradiction.

(27)

### The Proof (continued)

• Second, assume that $N = p^a$, where p is an odd prime and a ≥ 2.

• By Theorem 63 (p. 443), there exists a primitive root r modulo $p^a$.

• From the assumption,

$$M^{N-1} = \left[ M^{(N-1)/2} \right]^2 = (M|N)^2 = 1 \bmod N$$

for all M ∈ Φ(N).

(28)

### The Proof (continued)

• As r ∈ Φ(N) (prove it), we have

$$r^{N-1} = 1 \bmod N.$$

• As r’s exponent modulo $N = p^a$ is $\phi(N) = p^{a-1}(p-1)$,

$$p^{a-1}(p-1) \mid N - 1,$$

which implies that p | N − 1.

• But this is impossible given that p | N.

(29)

### The Proof (continued)

• Third, assume that $N = mp^a$, where p is an odd prime, gcd(m, p) = 1, m > 1 (not necessarily prime), and a is even.

• The proof mimics that of the second case.

• By Theorem 63 (p. 443), there exists a primitive root r modulo $p^a$.

• From the assumption,

$$M^{N-1} = \left[ M^{(N-1)/2} \right]^2 = (M|N)^2 = 1 \bmod N$$

for all M ∈ Φ(N).

(30)

### The Proof (continued)

• In particular,

$$M^{N-1} = 1 \bmod p^a \tag{6}$$

for all M ∈ Φ(N).

• The Chinese remainder theorem says that there is an M ∈ Φ(N) such that

$$M = r \bmod p^a, \quad M = 1 \bmod m.$$

• Because $M = r \bmod p^a$ and Eq. (6), $r^{N-1} = 1 \bmod p^a$.

(31)

### The Proof (concluded)

• As r’s exponent modulo $p^a$ is $\phi(p^a) = p^{a-1}(p-1)$,

$$p^{a-1}(p-1) \mid N - 1,$$

which implies that p | N − 1.

• But this is impossible given that p | N.

(32)

### The Number of Witnesses to Compositeness

Theorem 65 (Solovay and Strassen (1977)) If N is an odd composite, then $(M|N) \neq M^{(N-1)/2} \bmod N$ for at least half of M ∈ Φ(N).

• By Lemma 64 (p. 444) there is at least one a ∈ Φ(N) such that $(a|N) \neq a^{(N-1)/2} \bmod N$.

• Let $B = \{b_1, b_2, \ldots, b_k\} \subseteq \Phi(N)$ be the set of all distinct residues such that $(b_i|N) = b_i^{(N-1)/2} \bmod N$.

• Let $aB = \{ab_i \bmod N : i = 1, 2, \ldots, k\}$.

(33)

### The Proof (concluded)

• |aB| = k.

– $ab_i = ab_j \bmod N$ implies $N \mid a(b_i - b_j)$, which is impossible because gcd(a, N) = 1 and $N > |b_i - b_j|$.

• aB ∩ B = ∅ because

$$(ab_i)^{(N-1)/2} = a^{(N-1)/2} b_i^{(N-1)/2} \neq (a|N)(b_i|N) = (ab_i|N).$$

• Combining the above two results, we know

$$\frac{|B|}{\phi(N)} \le 0.5.$$

(34)

1: if N is even but N ≠ 2 then

2: return “N is composite”;

3: else if N = 2 then

4: return “N is a prime”;

5: end if

6: Pick M ∈ {2, 3, . . . , N − 1} randomly;

7: if gcd(M, N) > 1 then

8: return “N is composite”;

9: else

10: if $(M|N) \neq M^{(N-1)/2} \bmod N$ then

11: return “N is composite”;

12: else

13: return “N is a prime”;

14: end if

15: end if
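The randomized algorithm above can be sketched in Python as follows. The helper `jacobi`, the predicate `is_witness`, and the `rounds` parameter (repeating the random pick) are assumptions added for illustration; with `rounds=1` the function follows the listed steps and assumes N ≥ 2.

```python
import random
from math import gcd

def jacobi(a: int, m: int) -> int:
    """Jacobi symbol (a | m) for odd m >= 1."""
    a %= m
    sign = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if m % 8 in (3, 5):
                sign = -sign
        a, m = m, a
        if a % 4 == 3 and m % 4 == 3:
            sign = -sign
        a %= m
    return sign if m == 1 else 0

def is_witness(M: int, N: int) -> bool:
    """True if M proves odd N > 2 composite (common factor or Jacobi mismatch)."""
    if gcd(M, N) > 1:
        return True
    return jacobi(M, N) % N != pow(M, (N - 1) // 2, N)

def solovay_strassen(N: int, rng: random.Random, rounds: int = 1) -> str:
    if N % 2 == 0:
        return "N is a prime" if N == 2 else "N is composite"
    for _ in range(rounds):
        M = rng.randrange(2, N)         # uniform over {2, ..., N - 1}
        if is_witness(M, N):
            return "N is composite"
    return "N is a prime"               # may be wrong when N is composite
```

Comparing `jacobi(M, N) % N` with the modular power maps the symbol value −1 to N − 1, so both sides are residues in the same range.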

(35)

### Analysis

• The algorithm certainly runs in polynomial time.

• There are no false positives (for compositeness).

– When the algorithm says the number is composite, it is always correct.

• The probability of a false negative is at most one half.

– When the algorithm says the number is a prime, it may err.

– If the input is composite, then the probability that the algorithm errs is at most one half.

• The error probability can be reduced but not eliminated.

(36)

### The Improved Density Attack for compositeness

[Figure: among all numbers < N, the witnesses to compositeness of N via Jacobi and the witnesses to compositeness of N via common factor.]

(37)

### Randomized Complexity Classes; RP

• Let N be a polynomial-time precise NTM that runs in time p(n) and has 2 nondeterministic choices at each step.

• N is a polynomial Monte Carlo Turing machine for a language L if the following conditions hold:

– If x ∈ L, then at least half of the $2^{p(n)}$ computation paths of N on x halt with “yes” where n = | x |.

– If x ∉ L, then all computation paths halt with “no.”

• The class of all languages with polynomial Monte Carlo TMs is denoted RP (randomized polynomial time).

(38)

• Nondeterministic steps can be seen as fair coin flips.

• There are no false positive answers.

• The probability of false negatives, 1 − ε, is at most 0.5.

• But any constant between 0 and 1 can replace 0.5.

– By repeating the algorithm $k = \lceil -1/\log_2(1-\epsilon) \rceil$ times, the probability of false negatives becomes $(1-\epsilon)^k \le 0.5$.

• In fact, ε can be arbitrarily close to 0 as long as it is of the order 1/p(n) for some polynomial p(n).

– $-\dfrac{1}{\log_2(1-\epsilon)} = O\!\left(\dfrac{1}{\epsilon}\right) = O(p(n))$.
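The repetition count k = ⌈−1/log₂(1 − ε)⌉ is easy to evaluate for concrete values. The short sketch below, with the assumed name `rounds_needed`, computes it and confirms that the resulting false-negative probability drops to at most one half.

```python
import math

def rounds_needed(eps: float) -> int:
    """Smallest k with (1 - eps)**k <= 0.5, for 0 < eps < 1."""
    return math.ceil(-1 / math.log2(1 - eps))

for eps in (0.5, 0.25, 0.01):
    k = rounds_needed(eps)
    print(eps, k, (1 - eps) ** k)
```

For ε = 0.25 this gives k = 3, since 0.75² > 0.5 but 0.75³ < 0.5. For small ε the count grows roughly like 0.69/ε, in line with the O(1/ε) estimate.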

(39)

### Where RP Fits

• P ⊆ RP ⊆ NP.

– A deterministic TM is like a Monte Carlo TM except that all the coin flips are ignored.

– A Monte Carlo TM is an NTM with extra demands on the number of accepting paths.

• compositeness ∈ RP; primes ∈ coRP; primes ∈ RP.

– In fact, primes ∈ P (Agrawal, Kayal, and Saxena, 2002).

• RP ∪ coRP is another “plausible” notion of efficient computation.

(40)

### ZPP (Zero Probabilistic Polynomial)

• The class ZPP is defined as RP ∩ coRP.

• A language in ZPP has two Monte Carlo algorithms, one with no false positives and the other with no false negatives.

• If we repeatedly run both Monte Carlo algorithms, eventually one definite answer will come (unlike RP).

– A positive answer from the one without false positives.

– A negative answer from the one without false negatives.

(41)

### The ZPP Algorithm (Las Vegas)

1: {Suppose L ∈ ZPP.}

2: {N1 has no false positives, and N2 has no false negatives.}

3: while true do

4: if N1(x) = “yes” then

5: return “yes”;

6: end if

7: if N2(x) = “no” then

8: return “no”;

9: end if

10: end while
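The loop above translates almost line for line into Python. In this sketch the two Monte Carlo machines are passed in as callables; the toy machines `n1` and `n2` (for the language of even integers) are hypothetical stand-ins that err with probability 1/2 on the side where they are allowed to.

```python
import random
from typing import Callable

def las_vegas(x: int, n1: Callable[[int], bool], n2: Callable[[int], bool]) -> bool:
    """Alternate the two Monte Carlo algorithms until one answer is definite."""
    while True:
        if n1(x):            # n1 has no false positives, so "yes" is final
            return True
        if not n2(x):        # n2 has no false negatives, so "no" is final
            return False

rng = random.Random(0)

def n1(x: int) -> bool:
    """Toy machine: never says yes for odd x, misses even x half the time."""
    return x % 2 == 0 and rng.random() < 0.5

def n2(x: int) -> bool:
    """Toy machine: never says no for even x, wrongly accepts odd x half the time."""
    return x % 2 == 0 or rng.random() < 0.5

print([las_vegas(x, n1, n2) for x in range(6)])
```

The returned answer is always correct; only the number of loop iterations is random.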

(42)

### ZPP (concluded)

• The expected running time for the correct answer to emerge is polynomial.

– The probability that a run of the 2 algorithms does not generate a definite answer is at most 0.5.

– Let p(n) be the running time of each run.

– The expected running time for a definite answer is at most

$$\sum_{i=1}^{\infty} 0.5^i \, i \, p(n) = 2p(n).$$

• Essentially, ZPP is the class of problems that can be solved without errors in expected polynomial time.

(43)

### Et Tu, RP?

1: {Suppose L ∈ RP.}

2: {N decides L without false positives.}

3: while true do

4: if N(x) = “yes” then

5: return “yes”;

6: end if

7: {But what to do here?}

8: end while

• You eventually get a “yes” if x ∈ L.

• But how to get a “no” when x ∉ L?

• You have to sacrifice either correctness or bounded running time.

(44)

### Large Deviations

• Suppose you have a biased coin.

• One side has probability 0.5 + ε to appear and the other 0.5 − ε, for some 0 < ε < 0.5.

• But you do not know which is which.

• How to decide which side is the more likely—with high confidence?

• Answer: Flip the coin many times and pick the side that appeared the most times.

• Question: Can you quantify the confidence?

(45)

### The Chernoff Bound

Theorem 66 (Chernoff (1952)) Suppose $x_1, x_2, \ldots, x_n$ are independent random variables taking the values 1 and 0 with probabilities p and 1 − p, respectively. Let $X = \sum_{i=1}^{n} x_i$. Then for all 0 ≤ θ ≤ 1,

$$\mathrm{prob}[\, X \ge (1+\theta)\, pn \,] \le e^{-\theta^2 pn/3}.$$

• The probability that a binomial random variable deviates from its expected value $E[X] = E\left[\sum_{i=1}^{n} x_i\right] = pn$ decreases exponentially with the deviation.

• The Chernoff bound is asymptotically optimal.

(The bound is named after Herman Chernoff (1923–).)

(46)

### The Proof

• Let t be any positive real number.

• Then

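$$\mathrm{prob}[\, X \ge (1+\theta)\, pn \,] = \mathrm{prob}[\, e^{tX} \ge e^{t(1+\theta)pn} \,].$$

• Markov’s inequality (p. 405) generalized to real-valued random variables says that

$$\mathrm{prob}\left[\, e^{tX} \ge k E[e^{tX}] \,\right] \le 1/k.$$

• With $k = e^{t(1+\theta)pn}/E[e^{tX}]$, we have

$$\mathrm{prob}[\, X \ge (1+\theta)\, pn \,] \le e^{-t(1+\theta)pn} E[e^{tX}].$$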

(47)

### The Proof (continued)

• Because $X = \sum_{i=1}^{n} x_i$ and the $x_i$’s are independent, $E[e^{tX}] = (E[e^{tx_1}])^n = [\, 1 + p(e^t - 1) \,]^n$.

• Substituting, we obtain

$$\mathrm{prob}[\, X \ge (1+\theta)\, pn \,] \le e^{-t(1+\theta)pn} [\, 1 + p(e^t - 1) \,]^n \le e^{-t(1+\theta)pn} e^{pn(e^t - 1)}$$

as $(1 + a)^n \le e^{an}$ for all a > 0.

(48)

### The Proof (concluded)

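• With the choice of t = ln(1 + θ), the above becomes $\mathrm{prob}[\, X \ge (1+\theta)\, pn \,] \le e^{pn[\, \theta - (1+\theta)\ln(1+\theta) \,]}$.

• The bracketed term in the exponent expands to $-\frac{\theta^2}{2} + \frac{\theta^3}{6} - \frac{\theta^4}{12} + \cdots$ for 0 ≤ θ ≤ 1, which is less than

$$-\frac{\theta^2}{2} + \frac{\theta^3}{6} \le \theta^2 \left( -\frac{1}{2} + \frac{\theta}{6} \right) \le \theta^2 \left( -\frac{1}{2} + \frac{1}{6} \right) = -\frac{\theta^2}{3}.$$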
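The bound prob[X ≥ (1 + θ)pn] ≤ e^{−θ²pn/3} can be compared with the exact binomial tail for small n. The sketch below uses `math.comb` for the exact sum (up to floating-point rounding); the function names and the parameters n = 100, p = 0.5 are assumptions chosen for illustration.

```python
from math import ceil, comb, exp

def upper_tail(n: int, p: float, theta: float) -> float:
    """prob[X >= (1 + theta) p n] for X ~ Binomial(n, p), summed term by term."""
    lo = ceil((1 + theta) * p * n - 1e-9)   # small tolerance guards float round-off
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, n + 1))

def chernoff_bound(n: int, p: float, theta: float) -> float:
    """The bound exp(-theta^2 p n / 3), valid for 0 <= theta <= 1."""
    return exp(-theta**2 * p * n / 3)

for theta in (0.1, 0.5, 1.0):
    print(theta, upper_tail(100, 0.5, theta), chernoff_bound(100, 0.5, theta))
```

The exact tail is typically far below the bound; the bound is valued for its simple closed form rather than its tightness at small n.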

(49)

### Power of the Majority Rule

From $\mathrm{prob}[\, X \le (1-\theta)\, pn \,] \le e^{-\frac{\theta^2}{2} pn}$ (prove it):

Corollary 67 If p = (1/2) + ε for some 0 ≤ ε ≤ 1/2, then

$$\mathrm{prob}\left[ \sum_{i=1}^{n} x_i \le n/2 \right] \le e^{-\epsilon^2 n/2}.$$

• The textbook’s corollary to Lemma 11.9 seems incorrect.

• Our original problem (p. 463) hence demands $\approx 1.4k/\epsilon^2$ independent coin flips to guarantee making an error with probability at most $2^{-k}$ with the majority rule.
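The figure of about 1.4k/ε² flips comes from solving e^{−ε²n/2} ≤ 2^{−k} for n, which gives n ≥ 2k ln 2/ε² ≈ 1.386k/ε². A small sketch, with the assumed name `flips_needed`:

```python
from math import ceil, exp, log

def flips_needed(k: int, eps: float) -> int:
    """Smallest n with exp(-eps**2 * n / 2) <= 2**-k."""
    return ceil(2 * k * log(2) / eps**2)

n = flips_needed(10, 0.1)
print(n, exp(-0.1**2 * n / 2) <= 2**-10)   # 1387 True
```

Halving ε quadruples the number of flips, while each extra bit of confidence k adds only a fixed increment.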

(50)

### BPP (Bounded Probabilistic Polynomial)

• The class BPP (Gill, 1977) contains all languages L for which there is a precise polynomial-time NTM N such that:

– If x ∈ L, then at least 3/4 of the computation paths of N on x lead to “yes.”

– If x ∉ L, then at least 3/4 of the computation paths of N on x lead to “no.”

• N accepts or rejects by a clear majority.

(51)

### Magic 3/4?

• The number 3/4 bounds the probability of a right answer away from 1/2.

• Any constant strictly between 1/2 and 1 can be used without affecting the class BPP.

• In fact, 0.5 plus any inverse polynomial between 1/2 and 1, $0.5 + \frac{1}{p(n)}$, can be used.

(52)

### The Majority Vote Algorithm

Suppose L is decided by N by majority (1/2) + ε.

1: for i = 1, 2, . . . , 2k + 1 do

2: Run N on input x;

3: end for

4: if “yes” is the majority answer then

5: return “yes”;

6: else

7: return “no”;

8: end if
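A minimal Python sketch of the majority vote. The randomized machine N is modeled as a zero-argument callable `run` that returns True for “yes”; this interface is an assumption made for illustration.

```python
from typing import Callable

def majority_vote(run: Callable[[], bool], k: int) -> bool:
    """Run the randomized decider 2k + 1 times and return the majority answer."""
    yes = sum(1 for _ in range(2 * k + 1) if run())
    return yes > k               # strictly more than half of the 2k + 1 runs

# Example with a fixed sequence of answers standing in for random runs.
answers = iter([True, False, True])
print(majority_vote(lambda: next(answers), k=1))   # True
```

Using an odd number of runs rules out ties, so the function always returns a definite answer.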

(53)

### Analysis

• The running time remains polynomial, being 2k + 1 times N ’s running time.

• By Corollary 67 (p. 468), the probability of a false answer is at most $e^{-\epsilon^2 k}$.

• By taking $k = \lceil 2/\epsilon^2 \rceil$, the error probability is at most 1/4.

• As with the RP case, ε can be any inverse polynomial, because k remains polynomial in n.

(54)

### Probability Amplification for BPP

• Let m be the number of random bits used by a BPP algorithm.

– By definition, m is polynomial in n.

• With k = Θ(log m) in the majority vote algorithm, we can lower the error probability to $\le (3m)^{-1}$.

(55)

### Aspects of BPP

• BPP is the most comprehensive yet plausible notion of efficient computation.

– If a problem is in BPP, we take it to mean that the problem can be solved efficiently.

– In this aspect, BPP has effectively replaced P.

• (RP ∪ coRP) ⊆ (NP ∪ coNP).

• (RP ∪ coRP) ⊆ BPP.

• Whether BPP ⊆ (NP ∪ coNP) is unknown.

• But it is unlikely that NP ⊆ BPP (p. 489).

(56)

### coBPP

• The definition of BPP is symmetric: acceptance by clear majority and rejection by clear majority.

• An algorithm for L ∈ BPP becomes one for $\bar{L}$ by reversing the answer.

• So $\bar{L}$ ∈ BPP and BPP ⊆ coBPP.

• Similarly coBPP ⊆ BPP.

• Hence BPP = coBPP.

• This approach does not work for RP.

• It did not work for NP either.

(57)

### BPP and coBPP

[Figure: diagram with regions labeled “yes”, “no”, “no”, “yes”.]

(58)

[Figure: diagram relating the classes P, ZPP, RP, coRP, BPP, NP, and coNP.]

(59)

### Circuit Complexity

• Circuit complexity is based on boolean circuits instead of Turing machines.

• A boolean circuit with n inputs computes a boolean function of n variables.

• By identifying true with 1 and false with 0, a boolean circuit with n inputs accepts certain strings in $\{0, 1\}^n$.

• To relate circuits with arbitrary languages, we need one circuit for each possible input length n.

(60)

### Formal Definitions

• The size of a circuit is the number of gates in it.

• A family of circuits is an infinite sequence $C = (C_0, C_1, \ldots)$ of boolean circuits, where $C_n$ has n boolean inputs.

• $L \subseteq \{0, 1\}^*$ has polynomial circuits if there is a family of circuits C such that:

– The size of $C_n$ is at most p(n) for some fixed polynomial p.

– For input $x \in \{0, 1\}^*$, $C_{|x|}$ outputs 1 if and only if x ∈ L.

∗ $C_n$ accepts $L \cap \{0, 1\}^n$.

(61)

### Exponential Circuits Contain All Languages

• Theorem 14 (p. 153) implies that there are languages that cannot be solved by circuits of size $2^n/(2n)$.

• But exponential circuits can solve all problems.

Proposition 68 All decision problems (decidable or otherwise) can be solved by a circuit of size $2^{n+2}$.

• We will show that for any language $L \subseteq \{0, 1\}^*$, $L \cap \{0, 1\}^n$ can be decided by a circuit of size $2^{n+2}$.

(62)

### The Proof (concluded)

• Define boolean function $f : \{0, 1\}^n \to \{0, 1\}$, where

$$f(x_1 x_2 \cdots x_n) = \begin{cases} 1 & x_1 x_2 \cdots x_n \in L, \\ 0 & x_1 x_2 \cdots x_n \notin L. \end{cases}$$

$$f(x_1 x_2 \cdots x_n) = (x_1 \wedge f(1 x_2 \cdots x_n)) \vee (\neg x_1 \wedge f(0 x_2 \cdots x_n)).$$

• The circuit size s(n) for $f(x_1 x_2 \cdots x_n)$ hence satisfies $s(n) = 4 + 2s(n-1)$ with s(1) = 1.

• Solve it to obtain $s(n) = 5 \times 2^{n-1} - 4 \le 2^{n+2}$.
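The closed form can be checked by iterating the recurrence s(n) = 4 + 2s(n − 1) with s(1) = 1. A minimal sketch, with the assumed name `circuit_size`:

```python
def circuit_size(n: int) -> int:
    """Iterate s(n) = 4 + 2 s(n - 1) starting from s(1) = 1."""
    s = 1
    for _ in range(2, n + 1):
        s = 4 + 2 * s
    return s

for n in range(1, 11):
    assert circuit_size(n) == 5 * 2 ** (n - 1) - 4 <= 2 ** (n + 2)
print([circuit_size(n) for n in range(1, 6)])   # [1, 6, 16, 36, 76]
```

The 4 extra gates per level correspond to the two ANDs, the OR, and the NOT in the expansion of f on its first variable.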

(63)

• Proposition 68 (p. 480) does not contradict anything we knew so far about computation theory.

• Yes, there are only a finite number of circuits with size $2^{n+2}$.

• Yes, there are only $2^n$ possible inputs of length n.

• Yes, those circuits can solve all problems of length n.

• But is there an algorithm to tell which circuit is the correct one?
