Basic Modular Arithmetics^a
• Let m, n ∈ Z+.
• m | n means m divides n; m is n’s divisor.
• We call the numbers 0, 1, . . . , n − 1 the residues modulo n.
• The greatest common divisor of m and n is denoted gcd(m, n).
• The r in Theorem 51 (p. 448) is a primitive root of p.
^a Carl Friedrich Gauss.
Basic Modular Arithmetics (concluded)
• We write a ≡ b mod n if n | (a − b).
– So 25 ≡ 38 mod 13.
• We write a = b mod n if b is the remainder of a divided by n.
– So 25 = 12 mod 13.
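The two notations above can be told apart with a small Python sketch. The function names `congruent` and `remainder` are illustrative choices, not part of the slides.

```python
def congruent(a, b, n):
    """a ≡ b mod n, i.e., n divides a - b."""
    return (a - b) % n == 0

def remainder(a, n):
    """The b with a = b mod n: the remainder of a divided by n, so 0 <= b < n."""
    return a % n

print(congruent(25, 38, 13))  # 13 divides 25 - 38 = -13
print(remainder(25, 13))      # the remainder of 25 divided by 13
```

Note that 25 ≡ 38 mod 13 holds, yet 38 is not the remainder of 25 divided by 13; only 12 is.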
Euler’s^a Totient or Phi Function
• Let
Φ(n) = { m : 1 ≤ m < n, gcd(m, n) = 1 }
be the set of all positive integers less than n that are prime to n.^b
– Φ(12) = { 1, 5, 7, 11 }.
• Define Euler’s function of n to be φ(n) = | Φ(n) |.
• φ(p) = p − 1 for prime p, and φ(1) = 1 by convention.
• Euler’s function is not expected to be easy to compute without knowing n’s factorization.
^a Leonhard Euler (1707–1783).
^b $Z_n^*$ is an alternative notation.
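As a minimal sketch of these definitions, the set Φ(n) and Euler's function φ(n) can be computed by brute force. The names `Phi` and `phi` are illustrative; the loop takes time linear in n, so it is only suitable for small n.

```python
from math import gcd

def Phi(n):
    """The set { m : 1 <= m < n, gcd(m, n) = 1 }."""
    return {m for m in range(1, n) if gcd(m, n) == 1}

def phi(n):
    """Euler's function |Phi(n)|, with phi(1) = 1 by convention."""
    return 1 if n == 1 else len(Phi(n))

print(sorted(Phi(12)))
print(phi(13))
```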
Three Properties of Euler’s Function
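The inclusion-exclusion principle^a can be used to prove the following.
Lemma 54 $\phi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$.
• If $n = p_1^{e_1} p_2^{e_2} \cdots p_\ell^{e_\ell}$ is the prime factorization of n, then
$\phi(n) = n \prod_{i=1}^{\ell} \left(1 - \frac{1}{p_i}\right)$.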
Corollary 55 φ(mn) = φ(m) φ(n) if gcd(m, n) = 1.
Lemma 56 $\sum_{m \mid n} \phi(m) = n$.
^a Consult any textbook on discrete mathematics.
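The three properties can be checked numerically for small n. The sketch below is a sanity check, not a proof. The helper names (`phi_brute`, `prime_divisors`, `phi_product`, `divisors`) are made up for this illustration, and trial division is assumed to be good enough at this scale.

```python
from fractions import Fraction
from math import gcd

def phi_brute(n):
    """phi(n) by direct counting (gives phi(1) = 1)."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def prime_divisors(n):
    """Distinct primes dividing n, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def phi_product(n):
    """phi(n) via the product formula n * prod over p | n of (1 - 1/p)."""
    value = Fraction(n)
    for p in prime_divisors(n):
        value *= 1 - Fraction(1, p)
    return int(value)

def divisors(n):
    return [m for m in range(1, n + 1) if n % m == 0]

# Product formula (Lemma 54) and divisor sum (Lemma 56) for n = 360.
print(phi_product(360), phi_brute(360))
print(sum(phi_brute(m) for m in divisors(360)))
```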
The Density Attack for primes
[Figure: all numbers < n, with the witnesses to compositeness of n marked.]
The Density Attack for primes
Pick k ∈ { 1, . . . , n } randomly;
if k | n and k ≠ 1 and k ≠ n then
  return “n is composite”;
else
  return “n is (probably) a prime”;
end if
The Density Attack for primes (continued)
• It works, but does it work well?
• The ratio of numbers ≤ n relatively prime to n (the white ring) is $\phi(n)/n$.
• When n = pq, where p and q are distinct primes,
$\frac{\phi(n)}{n} = \frac{pq - p - q + 1}{pq} > 1 - \frac{1}{q} - \frac{1}{p}$.
The Density Attack for primes (concluded)
• So the ratio of numbers ≤ n not relatively prime to n (the gray area) is < (1/q) + (1/p).
– The “density attack” has probability about $2/\sqrt{n}$ of factoring n = pq when $p \sim q = O(\sqrt{n})$.
– The “density attack” to factor n = pq hence takes $\Omega(\sqrt{n})$ steps on average when $p \sim q = O(\sqrt{n})$.
– This running time is exponential: $\Omega(2^{0.5 \log_2 n})$.
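The density bound can be checked on a small example. The sketch below makes one assumption that differs from the pseudocode: it tests gcd(k, n) > 1 instead of k | n, because the analysis counts the numbers that are not relatively prime to n. The function names and the choice p = 101, q = 103 are illustrative.

```python
from fractions import Fraction
from math import gcd
import random

def non_coprime_ratio(p, q):
    """Exact ratio of k in {1, ..., pq} with gcd(k, pq) > 1."""
    n = p * q
    hits = sum(1 for k in range(1, n + 1) if gcd(k, n) > 1)
    return Fraction(hits, n)

def density_attack(n, rng=random):
    """One random trial; returns a nontrivial factor of n or None."""
    k = rng.randint(1, n)          # randint includes both endpoints
    d = gcd(k, n)
    return d if 1 < d < n else None

ratio = non_coprime_ratio(101, 103)
print(ratio, float(ratio))         # compare with 1/101 + 1/103
```

With p and q both near the square root of n, a single trial succeeds with probability about 2/√n, so on the order of √n trials are expected before a factor shows up.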
The Chinese Remainder Theorem
• Let $n = n_1 n_2 \cdots n_k$, where the $n_i$ are pairwise relatively prime.
• For any integers $a_1, a_2, \ldots, a_k$, the set of simultaneous equations
$x = a_1 \bmod n_1$,
$x = a_2 \bmod n_2$,
...
$x = a_k \bmod n_k$,
has a unique solution modulo n for the unknown x.
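One standard constructive way to obtain the solution is sketched below. The slides only state existence and uniqueness, so this construction is an addition, and the function name `crt` is illustrative. It relies on Python 3.8+ where `pow(m, -1, n)` returns a modular inverse.

```python
from math import prod

def crt(residues, moduli):
    """Return (x, n) with x = a_i mod n_i for all i and 0 <= x < n = n_1 * ... * n_k.

    Assumes the moduli are pairwise relatively prime.
    """
    n = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        m_i = n // n_i                      # product of the other moduli
        x += a_i * m_i * pow(m_i, -1, n_i)  # equals a_i mod n_i and 0 mod the others
    return x % n, n

print(crt([2, 3, 2], [3, 5, 7]))
```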
Fermat’s “Little” Theorem^a
Lemma 57 For all 0 < a < p, $a^{p-1} = 1 \bmod p$.
• Recall Φ(p) = { 1, 2, . . . , p − 1 }.
• Consider aΦ(p) = { am mod p : m ∈ Φ(p) }.
• aΦ(p) = Φ(p).
– aΦ(p) ⊆ Φ(p) as a remainder must be between 1 and p − 1.
– Suppose am ≡ am′ mod p for m > m′, where m, m′ ∈ Φ(p).
– That means a(m − m′) = 0 mod p, and p divides a or m − m′, which is impossible.
^a Pierre de Fermat (1601–1665).
The Proof (concluded)
• Multiply all the numbers in Φ(p) to yield (p − 1)!.
• Multiply all the numbers in aΦ(p) to yield $a^{p-1} (p - 1)!$.
• As aΦ(p) = Φ(p), we have
$a^{p-1} (p - 1)! \equiv (p - 1)! \bmod p$.
• Finally, $a^{p-1} = 1 \bmod p$ because $p \nmid (p - 1)!$.
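The key step of the proof, that multiplying by a merely permutes Φ(p), can be observed directly for a small prime. This is a sketch; the helper name `a_Phi` and the choice p = 13 are illustrative.

```python
def a_Phi(a, p):
    """The set { a*m mod p : m in Phi(p) } for a prime p."""
    return {a * m % p for m in range(1, p)}

p = 13
for a in range(1, p):
    assert a_Phi(a, p) == set(range(1, p))   # a*Phi(p) = Phi(p)
    assert pow(a, p - 1, p) == 1             # Fermat's "little" theorem
print(sorted(a_Phi(5, p)))
```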
The Fermat-Euler Theorem^a
Corollary 58 For all a ∈ Φ(n), $a^{\phi(n)} = 1 \bmod n$.
• The proof is similar to that of Lemma 57 (p. 470).
• Consider aΦ(n) = { am mod n : m ∈ Φ(n) }.
• aΦ(n) = Φ(n).
– aΦ(n) ⊆ Φ(n) as a remainder must be between 0 and n − 1 and relatively prime to n.
– Suppose am ≡ am′ mod n for m′ < m < n, where m, m′ ∈ Φ(n).
– That means a(m − m′) = 0 mod n; as gcd(a, n) = 1, n divides m − m′, which is impossible because 0 < m − m′ < n.
^a Proof by Mr. Wei-Cheng Cheng (R93922108, D95922011) on November 24, 2004.
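The Proof (concluded)^a
• Multiply all the numbers in Φ(n) to yield $\prod_{m \in \Phi(n)} m$.
• Multiply all the numbers in aΦ(n) to yield $a^{\phi(n)} \prod_{m \in \Phi(n)} m$.
• As aΦ(n) = Φ(n),
$\prod_{m \in \Phi(n)} m \equiv a^{\phi(n)} \left( \prod_{m \in \Phi(n)} m \right) \bmod n$.
• Finally, $a^{\phi(n)} = 1 \bmod n$ because $n \nmid \prod_{m \in \Phi(n)} m$; indeed, the product is relatively prime to n, so it can be cancelled from both sides.
^a Some typographical errors corrected by Mr. Jung-Ying Chen (D95723006) on November 18, 2008.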
An Example
• As $12 = 2^2 \times 3$,
$\phi(12) = 12 \times \left(1 - \frac{1}{2}\right) \left(1 - \frac{1}{3}\right) = 4$.
• In fact, Φ(12) = { 1, 5, 7, 11 }.
• For example,
$5^4 = 625 = 1 \bmod 12$.
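The example can be replayed, and the Fermat-Euler theorem spot-checked for other small moduli, with the following sketch. The function names are illustrative, and the brute-force `Phi` is only meant for small n.

```python
from math import gcd

def Phi(n):
    return [m for m in range(1, n) if gcd(m, n) == 1]

def fermat_euler_holds(n):
    """Check a^phi(n) = 1 mod n for every a in Phi(n)."""
    units = Phi(n)
    return all(pow(a, len(units), n) == 1 for a in units)

print(Phi(12), pow(5, 4, 12))
print(all(fermat_euler_holds(n) for n in range(2, 200)))
```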
Exponents
• The exponent of m ∈ Φ(p) is the least k ∈ Z+ such that $m^k = 1 \bmod p$.
• Every residue s ∈ Φ(p) has an exponent.
– $1, s, s^2, s^3, \ldots$ eventually repeats itself modulo p, say $s^i \equiv s^j \bmod p$ with i < j, which means $s^{j-i} = 1 \bmod p$.
• If the exponent of m is k and $m^\ell = 1 \bmod p$, then $k \mid \ell$.
– Otherwise, $\ell = qk + a$ for 0 < a < k, and
$m^\ell = m^{qk+a} \equiv m^a \equiv 1 \bmod p$, a contradiction.
Lemma 59 Any nonzero polynomial of degree k has at most k distinct roots modulo p.
Exponents and Primitive Roots
• From Fermat’s “little” theorem (p. 470), all exponents divide p − 1.
• A primitive root of p is thus a number with exponent p − 1.
• Let R(k) denote the total number of residues in Φ(p) = { 1, 2, . . . , p − 1 } that have exponent k.
• We already knew that R(k) = 0 for $k \nmid (p - 1)$.
• So $\sum_{k \mid (p-1)} R(k) = p - 1$ as every number has an exponent.
Size of R(k)
• Any a ∈ Φ(p) of exponent k satisfies $x^k = 1 \bmod p$.
• By Lemma 59 (p. 475) there are at most k residues of exponent k, i.e., R(k) ≤ k.
• Let s be a residue of exponent k.
• $1, s, s^2, \ldots, s^{k-1}$ are distinct modulo p.
– Otherwise, $s^i \equiv s^j \bmod p$ with i < j.
– Then $s^{j-i} = 1 \bmod p$ with j − i < k, a contradiction.
• As all these k distinct numbers satisfy $x^k = 1 \bmod p$, they comprise all the solutions of $x^k = 1 \bmod p$.
Size of R(k) (continued)
• But do all of them have exponent k (i.e., R(k) = k)?
• And if not (i.e., R(k) < k), how many of them do?
• Pick $s^\ell$, where ℓ < k.
• Suppose ℓ ∉ Φ(k) with gcd(ℓ, k) = d > 1.
• Then
$(s^\ell)^{k/d} = (s^k)^{\ell/d} = 1 \bmod p$.
• Therefore, $s^\ell$ has exponent at most k/d < k.
• So $s^\ell$ has exponent k only if ℓ ∈ Φ(k).
• We conclude that
R(k) ≤ φ(k).
Size of R(k) (concluded)
• Because all p − 1 residues have an exponent,
$p - 1 = \sum_{k \mid (p-1)} R(k) \le \sum_{k \mid (p-1)} \phi(k) = p - 1$
by Lemma 56 (p. 464).
• Hence
$R(k) = \begin{cases} \phi(k), & \text{when } k \mid (p - 1), \\ 0, & \text{otherwise}. \end{cases}$
• In particular, R(p − 1) = φ(p − 1) > 0, and p has at least one primitive root.
• This proves one direction of Theorem 51 (p. 448).
A Few Calculations
• Let p = 13.
• From p. 472, φ(p − 1) = 4.
• Hence R(12) = 4.
• Indeed, there are 4 primitive roots of p.
• As Φ(p − 1) = { 1, 5, 7, 11 }, the primitive roots are $g^1, g^5, g^7, g^{11}$, where g is any primitive root.
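These calculations can be reproduced by brute force. The sketch below tabulates R(k) for p = 13 and lists the primitive roots; the helper names `exponent`, `R`, and `phi` are illustrative, and g = 2 is used as the primitive root in the last line.

```python
from math import gcd

def exponent(m, p):
    """Least k >= 1 with m^k = 1 mod p, for m in {1, ..., p - 1} and prime p."""
    k, x = 1, m % p
    while x != 1:
        x = x * m % p
        k += 1
    return k

def R(p):
    """Map each k to the number of residues in Phi(p) having exponent k."""
    counts = {}
    for m in range(1, p):
        k = exponent(m, p)
        counts[k] = counts.get(k, 0) + 1
    return counts

def phi(n):
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

p = 13
print(R(p))                                               # compare with phi(k) for k | 12
print([m for m in range(1, p) if exponent(m, p) == p - 1])
print(sorted(pow(2, l, p) for l in (1, 5, 7, 11)))        # g^1, g^5, g^7, g^11 with g = 2
```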
Function Problems
• Decision problems are yes/no problems (sat, tsp (d), etc.).
• Function problems require a solution (a satisfying truth assignment, a best tsp tour, etc.).
• Optimization problems are clearly function problems.
• What is the relation between function and decision problems?
• Which one is harder?
Function Problems Cannot Be Easier than Decision Problems
• If we know how to generate a solution, we can solve the corresponding decision problem.
– If you can find a satisfying truth assignment efficiently, then sat is in P.
– If you can find the best tsp tour efficiently, then tsp (d) is in P.
• But decision problems can be as hard as the corresponding function problems, as we will see immediately.
fsat
• fsat is this function problem:
– Let φ(x1, x2, . . . , xn) be a boolean expression.
– If φ is satisfiable, then return a satisfying truth assignment.
– Otherwise, return “no.”
• We next show that if sat ∈ P, then fsat has a polynomial-time algorithm.
• sat is a subroutine (black box) that returns “yes” or “no” on the satisfiability of the input.
An Algorithm for fsat Using sat
t := ∅; {Truth assignment.}
if φ ∈ sat then
  for i = 1, 2, . . . , n do
    if φ[ x_i = true ] ∈ sat then
      t := t ∪ { x_i = true };
      φ := φ[ x_i = true ];
    else
      t := t ∪ { x_i = false };
      φ := φ[ x_i = false ];
    end if
  end for
  return t;
else
  return “no”;
end if
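The algorithm can be sketched in Python as follows, under several assumptions that are not in the slides: φ is given in CNF as a list of clauses of nonzero integers (literal i stands for x_i, and −i for its negation), the black box `sat` is replaced by an exponential brute-force stand-in, and `substitute` is a hypothetical helper implementing φ[ x_i = value ].

```python
from itertools import product

def substitute(cnf, var, value):
    """phi[x_var = value]: drop satisfied clauses, delete falsified literals."""
    lit = var if value else -var
    return [[l for l in clause if l != -lit] for clause in cnf if lit not in clause]

def sat(cnf, n):
    """Stand-in for the sat black box: brute force over x_1, ..., x_n."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in cnf):
            return True
    return False

def fsat(cnf, n):
    """Return a satisfying truth assignment {i: bool}, or None for "no"."""
    if not sat(cnf, n):
        return None
    t = {}
    for i in range(1, n + 1):
        trial = substitute(cnf, i, True)
        if sat(trial, n):
            t[i], cnf = True, trial
        else:
            t[i], cnf = False, substitute(cnf, i, False)
    return t

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(fsat([[1, 2], [-1, 3], [-2, -3]], 3))
```

Only the stand-in `sat` is exponential here; `fsat` itself makes n + 1 oracle calls, each on an expression no longer than the input.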
Analysis
• If sat can be solved in polynomial time, so can fsat.
– There are ≤ n + 1 calls to the algorithm for sat.^a
– Boolean expressions shorter than φ are used in each call to the algorithm for sat.
• Hence sat and fsat are equally hard (or easy).
• Note that this reduction from fsat to sat is not a Karp reduction.^b
• Instead, it calls sat multiple times as a subroutine, and its answers guide the search on the computation tree.
^a Contributed by Ms. Eva Ou (R93922132) on November 24, 2004.
^b Recall p. 247 and p. 251.
tsp and tsp (d) Revisited
• We are given n cities 1, 2, . . . , n and integer distances $d_{ij} = d_{ji}$ between any two cities i and j.
• tsp (d) asks if there is a tour with a total distance at most B.
• tsp asks for a tour with the shortest total distance.
– The shortest total distance is at most $\sum_{i,j} d_{ij}$.
∗ Recall that the input string contains $d_{11}, \ldots, d_{nn}$.
• Thus the shortest total distance is less than $2^{|x|}$ in magnitude, where x is the input (why?).
• We next show that if tsp (d) ∈ P, then tsp has a polynomial-time algorithm.
An Algorithm for tsp Using tsp (d)
Perform a binary search over interval $[0, 2^{|x|}]$ by calling tsp (d) to obtain the shortest distance, C;
for i, j = 1, 2, . . . , n do
  Call tsp (d) with B = C and $d_{ij} = C + 1$;
  if “no” then
    Restore $d_{ij}$ to its old value; {Edge [ i, j ] is critical.}
  end if
end for
return the tour with edges whose $d_{ij} \le C$;
Analysis
• An edge which is not on any remaining optimal tour will be eliminated, with its $d_{ij}$ set to C + 1.
• So the algorithm ends with n edges which are not eliminated (why?).
• This is true even if there are multiple optimal tours!^a
^a Thanks to a lively class discussion on November 12, 2013.
Analysis (concluded)
• There are $O(|x| + n^2)$ calls to the algorithm for tsp (d).
• Each call has an input length of O(| x |).
• So if tsp (d) can be solved in polynomial time, so can tsp.
• Hence tsp (d) and tsp are equally hard (or easy).
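The reduction from tsp to tsp (d) can be sketched as below. The assumptions are made for illustration only: cities are numbered 0, . . . , n − 1, distances are nonnegative integers in a symmetric matrix, the oracle `tsp_d` is a brute-force stand-in, and the binary search uses the sum of all entries as its upper bound in place of $2^{|x|}$.

```python
from itertools import permutations

def tsp_d(d, B):
    """Stand-in for the tsp (d) black box: is there a tour of total distance <= B?"""
    n = len(d)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        if sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n)) <= B:
            return True
    return False

def tsp(d):
    """Return (C, edges): the shortest tour distance and the edges of one optimal tour."""
    n = len(d)
    d = [row[:] for row in d]              # work on a copy
    lo, hi = 0, sum(map(sum, d))           # every tour length lies in [lo, hi]
    while lo < hi:                         # binary search for the shortest distance C
        mid = (lo + hi) // 2
        if tsp_d(d, mid):
            hi = mid
        else:
            lo = mid + 1
    C = lo
    for i in range(n):
        for j in range(i + 1, n):
            old = d[i][j]
            d[i][j] = d[j][i] = C + 1      # tentatively eliminate edge [i, j]
            if not tsp_d(d, C):
                d[i][j] = d[j][i] = old    # edge [i, j] is critical: restore it
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if d[i][j] <= C]
    return C, edges

dist = [[0, 1, 4, 2],
        [1, 0, 1, 5],
        [4, 1, 0, 1],
        [2, 5, 1, 0]]
print(tsp(dist))
```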
Randomized Computation
I know that half my advertising works, I just don’t know which half.
— John Wanamaker
I know that half my advertising is a waste of money, I just don’t know which half!
— McGraw-Hill ad.
Randomized Algorithms^a
• Randomized algorithms flip unbiased coins.
• There are important problems for which there are no known efficient deterministic algorithms but for which very efficient randomized algorithms exist.
– Extraction of square roots, for instance.
• There are problems where randomization is necessary.
– Secure protocols.
• Randomized versions can be more efficient.
– Parallel algorithms for maximal independent set.^b
^a Rabin (1976); Solovay and Strassen (1977).
^b “Maximal” (a local maximum) not “maximum” (a global maximum).
Randomized Algorithms (concluded)
• Are randomized algorithms algorithms?^a
• Coin flips are occasionally used in politics.^b
^a Pascal, “Truth is so delicate that one has only to depart the least bit from it to fall into error.”
^b In the 2016 Iowa Democratic caucuses, e.g. (see http://edition.cnn.com/2016/02/02/politics/hillary-clinton-coin-flip-iowa-bernie-sanders/index.html).
“Four Most Important Randomized Algorithms”^a
1. Primality testing.^b
2. Graph connectivity using random walks.^c
3. Polynomial identity testing.^d
4. Algorithms for approximate counting.^e
^a Trevisan (2006).
^b Rabin (1976); Solovay and Strassen (1977).
^c Aleliunas, Karp, Lipton, Lovász, and Rackoff (1979).
^d Schwartz (1980); Zippel (1979).
^e Sinclair and Jerrum (1989).
Bipartite Perfect Matching
• We are given a bipartite graph G = (U, V, E).
– $U = \{ u_1, u_2, \ldots, u_n \}$.
– $V = \{ v_1, v_2, \ldots, v_n \}$.
– E ⊆ U × V.
• We are asked if there is a perfect matching.
– A permutation π of { 1, 2, . . . , n } such that $(u_i, v_{\pi(i)}) \in E$ for all i ∈ { 1, 2, . . . , n }.
• A perfect matching contains n edges.
A Perfect Matching in a Bipartite Graph
[Figure: a bipartite graph with a perfect matching.]
Symbolic Determinants
• We are given a bipartite graph G.
• Construct the n × n matrix $A^G$ whose (i, j)th entry $A^G_{ij}$ is a symbolic variable $x_{ij}$ if $(u_i, v_j) \in E$ and 0 otherwise:
$A^G_{ij} = \begin{cases} x_{ij}, & \text{if } (u_i, v_j) \in E, \\ 0, & \text{otherwise}. \end{cases}$
Symbolic Determinants (continued)
• The matrix for the bipartite graph G on p. 496 is^a
$A^G = \begin{bmatrix}
0 & 0 & x_{13} & x_{14} & 0 \\
0 & x_{22} & 0 & 0 & 0 \\
x_{31} & 0 & 0 & 0 & x_{35} \\
x_{41} & 0 & x_{43} & x_{44} & 0 \\
x_{51} & 0 & 0 & 0 & x_{55}
\end{bmatrix}$. (6)
^a The idea is similar to the Tanner graph in coding theory by Tanner (1981).
Symbolic Determinants (concluded)
• The determinant of $A^G$ is
$\det(A^G) = \sum_{\pi} \mathrm{sgn}(\pi) \prod_{i=1}^{n} A^G_{i,\pi(i)}$. (7)
– π ranges over all permutations of n elements.
– sgn(π) is 1 if π is the product of an even number of transpositions and −1 otherwise.^a
• $\det(A^G)$ contains n! terms, many of which may be 0s.
^a Equivalently, sgn(π) = 1 if the number of (i, j)s such that i < j and π(i) > π(j) is even. Contributed by Mr. Hwan-Jeu Yu (D95922028) on May 1, 2008.
Determinant and Bipartite Perfect Matching
• In $\sum_{\pi} \mathrm{sgn}(\pi) \prod_{i=1}^{n} A^G_{i,\pi(i)}$, note the following:
– Each summand corresponds to a possible perfect matching π.
– Nonzero summands $\prod_{i=1}^{n} A^G_{i,\pi(i)}$ are distinct monomials and will not cancel.
• $\det(A^G)$ is essentially an exhaustive enumeration.
Proposition 60 (Edmonds (1967)) G has a perfect matching if and only if $\det(A^G)$ is not identically zero.
Perfect Matching and Determinant (p. 496)
[Figure: the bipartite graph G of p. 496.]
Perfect Matching and Determinant (concluded)
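• The matrix is (p. 498)
$A^G = \begin{bmatrix}
0 & 0 & x_{13} & x_{14} & 0 \\
0 & x_{22} & 0 & 0 & 0 \\
x_{31} & 0 & 0 & 0 & x_{35} \\
x_{41} & 0 & x_{43} & x_{44} & 0 \\
x_{51} & 0 & 0 & 0 & x_{55}
\end{bmatrix}$.
• $\det(A^G) = -x_{14} x_{22} x_{35} x_{43} x_{51} + x_{13} x_{22} x_{35} x_{44} x_{51} + x_{14} x_{22} x_{31} x_{43} x_{55} - x_{13} x_{22} x_{31} x_{44} x_{55}$.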
• Each nonzero term denotes a perfect matching, and vice versa.
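The four terms can be recovered by enumerating permutations exactly as in the definition of the determinant. In this sketch the edge set `E` is read off the nonzero entries of the matrix, and the function names are illustrative. The enumeration takes n! steps, which is the exhaustive search the randomized algorithm is designed to avoid.

```python
from itertools import permutations

# (i, j) in E means (u_i, v_j) is an edge, i.e., entry x_ij of the matrix is nonzero.
E = {(1, 3), (1, 4), (2, 2), (3, 1), (3, 5),
     (4, 1), (4, 3), (4, 4), (5, 1), (5, 5)}

def sgn(pi):
    """+1 if the number of inversions of pi is even, -1 otherwise."""
    n = len(pi)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])
    return 1 if inversions % 2 == 0 else -1

def nonzero_terms(E, n):
    """All (sgn(pi), pi) whose summand in det(A^G) is a nonzero monomial."""
    return [(sgn(pi), pi) for pi in permutations(range(1, n + 1))
            if all((i, pi[i - 1]) in E for i in range(1, n + 1))]

for sign, pi in nonzero_terms(E, 5):
    monomial = " ".join(f"x{i}{pi[i - 1]}" for i in range(1, 6))
    print("+" if sign > 0 else "-", monomial)
```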
How To Test If a Polynomial Is Identically Zero?
• $\det(A^G)$ is a polynomial in $n^2$ variables.
• It has, potentially, exponentially many terms.
• Expanding the determinant polynomial is thus infeasible.
• If $\det(A^G) \equiv 0$, then it remains zero if we substitute arbitrary integers for the variables $x_{11}, \ldots, x_{nn}$.
• When $\det(A^G) \not\equiv 0$, what is the likelihood of obtaining a zero?
Number of Roots of a Polynomial
Lemma 61 (Schwartz (1980)) Let $p(x_1, x_2, \ldots, x_m) \not\equiv 0$ be a polynomial in m variables each of degree at most d. Let M ∈ Z+. Then the number of m-tuples
$(x_1, x_2, \ldots, x_m) \in \{ 0, 1, \ldots, M - 1 \}^m$ such that $p(x_1, x_2, \ldots, x_m) = 0$ is
$\le m d M^{m-1}$.
• By induction on m (consult the textbook).
Density Attack
• The density of roots in the domain is at most
$\frac{m d M^{m-1}}{M^m} = \frac{md}{M}$. (8)
• So suppose $p(x_1, x_2, \ldots, x_m) \not\equiv 0$.
• Then a random
$(x_1, x_2, \ldots, x_m) \in \{ 0, 1, \ldots, M - 1 \}^m$ has a probability of ≤ md/M of being a root of p.
• Note that M is under our control!
– One can raise M to lower the error probability, for example.
Density Attack (concluded)
Here is a sampling algorithm to test if $p(x_1, x_2, \ldots, x_m) \not\equiv 0$.
Choose $i_1, \ldots, i_m$ from { 0, 1, . . . , M − 1 } randomly;
if $p(i_1, i_2, \ldots, i_m) \neq 0$ then
  return “p is not identically zero”;
else
  return “p is (probably) identically zero”;
end if
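A minimal Python sketch of the sampling test follows. It assumes the polynomial is available as a black-box callable; the names `sample_test` and `count_roots` and the two example polynomials are illustrative choices. The exhaustive `count_roots` is there only to compare the true number of roots with the bound of Lemma 61 on a tiny grid.

```python
import random
from itertools import product

def sample_test(p, m, M, rng=random):
    """One round. True: "p is not identically zero" (never wrong).
    False: "p is (probably) identically zero"."""
    point = [rng.randrange(M) for _ in range(m)]
    return p(*point) != 0

def count_roots(p, m, M):
    """Number of roots of p in {0, ..., M - 1}^m, by exhaustive search."""
    return sum(1 for point in product(range(M), repeat=m) if p(*point) == 0)

def zero(x, y):
    return (x + y) ** 2 - x * x - 2 * x * y - y * y   # identically zero

def nonzero(x, y):
    return x * y - x                                   # m = 2, degree d = 1 per variable

m, d, M = 2, 1, 10
print(count_roots(nonzero, m, M), "roots; bound", m * d * M ** (m - 1))
print(sample_test(zero, m, M))
```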
Analysis
• If $p(x_1, x_2, \ldots, x_m) \equiv 0$, the algorithm will always be correct as $p(i_1, i_2, \ldots, i_m) = 0$.
• Suppose $p(x_1, x_2, \ldots, x_m) \not\equiv 0$.
– The algorithm will answer incorrectly with probability at most md/M by Eq. (8) on p. 505.
• We next return to the original problem of bipartite perfect matching.
A Randomized Bipartite Perfect Matching Algorithm^a
Choose $n^2$ integers $i_{11}, \ldots, i_{nn}$ from $\{ 0, 1, \ldots, 2n^2 - 1 \}$ randomly; {So $M = 2n^2$.}
Calculate $\det(A^G(i_{11}, \ldots, i_{nn}))$ by Gaussian elimination;
if $\det(A^G(i_{11}, \ldots, i_{nn})) \neq 0$ then
  return “G has a perfect matching”;
else
  return “G has (probably) no perfect matchings”;
end if
^a Lovász (1979). According to Paul Erdős, Lovász wrote his first significant paper “at the ripe old age of 17.”
Analysis
• If G has no perfect matchings, the algorithm will always be correct as $\det(A^G(i_{11}, \ldots, i_{nn})) = 0$.
• Suppose G has a perfect matching.
– The algorithm will answer incorrectly with probability at most md/M = 0.5 with $m = n^2$, d = 1 and $M = 2n^2$ in Eq. (8) on p. 505.
• Run the algorithm independently k times.
• Output “G has no perfect matchings” if and only if all say “(probably) no perfect matchings.”
• The error probability is now reduced to at most $2^{-k}$.
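The algorithm with k independent repetitions can be sketched as follows. Exact rational arithmetic via `fractions.Fraction` is an assumption made here to sidestep rounding; it is not necessarily how the slides intend intermediate results to be kept small. The names `det` and `has_perfect_matching` are illustrative, and vertices are numbered from 1 as in the text.

```python
import random
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]   # a row swap flips the sign
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def has_perfect_matching(E, n, k, rng=random):
    """True is always correct; False is wrong with probability at most 2^-k."""
    M = 2 * n * n
    for _ in range(k):
        A = [[rng.randrange(M) if (i, j) in E else 0 for j in range(1, n + 1)]
             for i in range(1, n + 1)]
        if det(A) != 0:
            return True
    return False

E = {(1, 3), (1, 4), (2, 2), (3, 1), (3, 5),
     (4, 1), (4, 3), (4, 4), (5, 1), (5, 5)}
print(has_perfect_matching(E, 5, k=20))
```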
László Lovász (1948–)
Remarks^a
• Note that we are calculating
prob[ algorithm answers “no” | G has no perfect matchings ],
prob[ algorithm answers “yes” | G has a perfect matching ].
• We are not calculating^b
prob[ G has no perfect matchings | algorithm answers “no” ],
prob[ G has a perfect matching | algorithm answers “yes” ].
^a Thanks to a lively class discussion on May 1, 2008.
^b Numerical Recipes in C (1988), “statistics is not a branch of mathematics!”
But How Large Can $\det(A^G(i_{11}, \ldots, i_{nn}))$ Be?
• It is at most $n! \, (2n^2)^n$.
• Stirling’s formula says $n! \sim \sqrt{2\pi n} \, (n/e)^n$.
• Hence
$\log_2 \det(A^G(i_{11}, \ldots, i_{nn})) = O(n \log_2 n)$
bits are sufficient for representing the determinant.
• We skip the details about how to make sure that all intermediate results are of polynomial size.
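For completeness, here is one way to spell out the bit-length bound, assuming only the upper bound $n! \, (2n^2)^n$ and Stirling's formula stated above:

```latex
\begin{align*}
\log_2\!\left( n! \, (2n^2)^n \right)
  &= \log_2 n! + n \log_2 (2n^2) \\
  &= \log_2 n! + n + 2n \log_2 n \\
  &= \left( n \log_2 n - n \log_2 e + O(\log_2 n) \right) + n + 2n \log_2 n \\
  &= 3n \log_2 n + O(n) \\
  &= O(n \log_2 n).
\end{align*}
```

The third line uses $\log_2 n! = n \log_2 n - n \log_2 e + O(\log_2 n)$, which follows from Stirling's formula.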
An Intriguing Question^a
• Is there an $(i_{11}, \ldots, i_{nn})$ that will always give correct answers for the algorithm on p. 508?
• A theorem on p. 604 shows that such an $(i_{11}, \ldots, i_{nn})$ exists!
– Whether it can be found efficiently is another matter.
• Once $(i_{11}, \ldots, i_{nn})$ is available, the algorithm can be made deterministic.
^a Thanks to a lively class discussion on November 24, 2004.
Randomization vs. Nondeterminism^a
• What are the differences between randomized algorithms and nondeterministic algorithms?
• One can think of a randomized algorithm as a nondeterministic algorithm but with a probability associated with every guess/branch.
• So each computation path of a randomized algorithm has a probability associated with it.
^a Contributed by Mr. Olivier Valery (D01922033) and Mr. Hasan Alhasan (D01922034) on November 27, 2012.
Monte Carlo Algorithms^a
• The randomized bipartite perfect matching algorithm is called a Monte Carlo algorithm in the sense that
– If the algorithm finds that a matching exists, it is always correct (no false positives; no type 1 errors).
– If the algorithm answers in the negative, then it may make an error (false negatives; type 2 errors).
^a Metropolis and Ulam (1949).
Monte Carlo Algorithms (continued)
• The algorithm makes a false negative with probability ≤ 0.5.^a
• Again, this probability refers to^b
prob[ algorithm answers “no” | G has a perfect matching ]
not
prob[ G has a perfect matching | algorithm answers “no” ].
^a Equivalently, among the coin flip sequences, at most half of them lead to the wrong answer.
^b In general, prob[ algorithm answers “no” | input is a yes instance ].
Monte Carlo Algorithms (concluded)
• This probability 0.5 is not over the space of all graphs or determinants, but over the algorithm’s own coin flips.
– It holds for any bipartite graph.
• In contrast, to calculate
prob[ G has a perfect matching | algorithm answers “no” ],
we will need the distribution of G.
• But it is an empirical statement that is very hard to verify.