ZPP
(Zero Probabilistic Polynomial)
• The class ZPP is defined as RP ∩ coRP.
• A language in ZPP has two Monte Carlo algorithms, one with no false positives (RP) and the other with no false negatives (coRP).
• If we repeatedly run both Monte Carlo algorithms, eventually one definite answer will come (unlike RP).
– A positive answer from the one without false positives.
– A negative answer from the one without false negatives.
The ZPP Algorithm (Las Vegas)
{Suppose L ∈ ZPP.}
{N1 has no false positives, and N2 has no false negatives.}
while true do
    if N1(x) = “yes” then
        return “yes”;
    end if
    if N2(x) = “no” then
        return “no”;
    end if
end while
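The loop above can be sketched in Python. This is only an illustration under stated assumptions: the toy language (even integers) and the two testers `n1` and `n2` are hypothetical stand-ins for the Monte Carlo algorithms N1 and N2, each giving its definite answer with probability 1/2.

```python
import random

def las_vegas(x, n1, n2, rng):
    """Repeat both Monte Carlo tests until one gives a definite answer."""
    rounds = 0
    while True:
        rounds += 1
        if n1(x, rng):          # n1 never says "yes" wrongly
            return True, rounds
        if not n2(x, rng):      # n2 never says "no" wrongly
            return False, rounds

# Toy language (an assumption for illustration): L = even integers.
def n1(x, rng):
    # RP-style: "yes" only if x is in L, and then only with probability 1/2.
    return x % 2 == 0 and rng.random() < 0.5

def n2(x, rng):
    # coRP-style: "no" only if x is not in L, and then only with probability 1/2.
    return not (x % 2 == 1 and rng.random() < 0.5)

rng = random.Random(0)
answers = [las_vegas(x, n1, n2, rng)[0] for x in range(10)]
```

Whatever the random choices, every answer returned is correct; only the number of rounds varies.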
ZPP (concluded)
• The expected running time for the correct answer to emerge is polynomial.
– The probability that a run of the 2 algorithms does not generate a definite answer is at most 0.5 (why?).
– Let p(n) be the running time of each run of the while-loop.
– The expected running time for a definite answer is at most
$\sum_{i=1}^{\infty} 0.5^i \, i \, p(n) = 2p(n)$.
• Essentially, ZPP is the class of problems that can be solved, without errors, in expected polynomial time.
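As a quick numerical check of the series behind the expected running time, the sketch below sums the expected number of loop iterations when each round fails to produce a definite answer with probability `fail`. The function name and the truncation at 200 terms are illustrative choices.

```python
def expected_rounds(fail=0.5, terms=200):
    """Partial sum of sum_{i>=1} i * fail**(i-1) * (1 - fail)."""
    return sum(i * fail ** (i - 1) * (1 - fail) for i in range(1, terms + 1))

rounds_at_half = expected_rounds(0.5)   # equals sum_i i * 0.5**i, about 2
```

With `fail = 0.5` the sum is the series for 2, so the expected time is 2p(n); a smaller failure probability only lowers it.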
Large Deviations
• Suppose you have a biased coin.
• One side has probability 0.5 + ε to appear and the other 0.5 − ε, for some 0 < ε < 0.5.
• But you do not know which is which.
• How to decide which side is the more likely side—with high confidence?
• Answer: Flip the coin many times and pick the side that appeared the most times.
• Question: Can you quantify your confidence?
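The flip-and-take-the-majority strategy can be tried in a small simulation. The bias ε = 0.1, the 1001 flips, and the seed are assumptions chosen for illustration.

```python
import random

def guess_likelier_side(flip, n):
    """Flip n times and return the side that appeared more often."""
    heads = sum(flip() for _ in range(n))
    return "heads" if 2 * heads > n else "tails"

rng = random.Random(1)
eps = 0.1

def flip():
    # Heads appears with probability 0.5 + eps (unknown to the guesser).
    return rng.random() < 0.5 + eps

guess = guess_likelier_side(flip, 1001)
```

With these parameters the expected lead of heads is about 100 flips, roughly six standard deviations, so a wrong guess is extremely unlikely.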
The Chernoff Bound
Theorem 70 (Chernoff (1952)) Suppose $x_1, x_2, \ldots, x_n$ are independent random variables taking the values 1 and 0 with probabilities p and 1 − p, respectively. Let $X = \sum_{i=1}^{n} x_i$. Then for all 0 ≤ θ ≤ 1,
$\mathrm{prob}[\, X \ge (1+\theta)\,pn \,] \le e^{-\theta^2 pn/3}$.
• The probability that a binomial random variable deviates from its expected value
$E[\,X\,] = E\left[\sum_{i=1}^{n} x_i\right] = pn$
decreases exponentially with the deviation.
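To see how the bound compares with the true tail, the following sketch computes the exact binomial upper tail and the bound $e^{-\theta^2 pn/3}$ side by side. The parameters n = 200 and p = 0.25 are arbitrary illustrative choices.

```python
from math import ceil, comb, exp

def upper_tail(n, p, theta):
    """Exact prob[X >= (1 + theta) * p * n] for X ~ Binomial(n, p)."""
    k0 = ceil((1 + theta) * p * n)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

def chernoff_bound(n, p, theta):
    """The upper bound exp(-theta^2 * p * n / 3) of the theorem."""
    return exp(-theta**2 * p * n / 3)

n, p = 200, 0.25
rows = [(theta, upper_tail(n, p, theta), chernoff_bound(n, p, theta))
        for theta in (0.2, 0.5, 1.0)]
```

Each row holds θ, the exact tail, and the bound; the exact tail never exceeds the bound and is usually far below it.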
The Proof
• Let t be any positive real number.
• Then
$\mathrm{prob}[\, X \ge (1+\theta)\,pn \,] = \mathrm{prob}[\, e^{tX} \ge e^{t(1+\theta)pn} \,]$.
• Markov’s inequality (p. 519) generalized to real-valued random variables says that
$\mathrm{prob}\left[\, e^{tX} \ge k\,E[\,e^{tX}\,] \,\right] \le 1/k$.
• With $k = e^{t(1+\theta)pn}/E[\,e^{tX}\,]$, we have^a
$\mathrm{prob}[\, X \ge (1+\theta)\,pn \,] \le e^{-t(1+\theta)pn}\,E[\,e^{tX}\,]$.
^a Note that X does not appear in k. Contributed by Mr. Ao Sun
The Proof (continued)
• Because $X = \sum_{i=1}^{n} x_i$ and the $x_i$’s are independent,
$E[\,e^{tX}\,] = (E[\,e^{tx_1}\,])^n = [\,1 + p(e^t - 1)\,]^n$.
• Substituting, we obtain
$\mathrm{prob}[\, X \ge (1+\theta)\,pn \,] \le e^{-t(1+\theta)pn}\,[\,1 + p(e^t - 1)\,]^n \le e^{-t(1+\theta)pn}\,e^{pn(e^t-1)}$
as $(1+a)^n \le e^{an}$ for all a > 0.
The Proof (concluded)
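• With the choice of $t = \ln(1+\theta)$, the above becomes
$\mathrm{prob}[\, X \ge (1+\theta)\,pn \,] \le e^{pn[\,\theta - (1+\theta)\ln(1+\theta)\,]}$.
• Apart from the factor pn, the exponent expands to
$-\frac{\theta^2}{2} + \frac{\theta^3}{6} - \frac{\theta^4}{12} + \cdots$ for 0 ≤ θ ≤ 1.
• But it is less than
$-\frac{\theta^2}{2} + \frac{\theta^3}{6} = \theta^2\left(-\frac{1}{2} + \frac{\theta}{6}\right) \le \theta^2\left(-\frac{1}{2} + \frac{1}{6}\right) = -\frac{\theta^2}{3}$.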
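The last step of the proof can be checked numerically. The sketch below evaluates θ − (1 + θ) ln(1 + θ) on a grid over [0, 1] and compares it with −θ²/3; the grid spacing of 0.001 is an arbitrary choice.

```python
from math import log1p

def exponent(theta):
    """theta - (1 + theta) * ln(1 + theta), the exponent without the factor pn."""
    return theta - (1 + theta) * log1p(theta)

grid = [i / 1000 for i in range(1001)]
# Largest value of exponent(theta) - (-theta^2 / 3); it should never be positive.
worst_gap = max(exponent(t) + t * t / 3 for t in grid)
```

The gap is zero at θ = 0 and negative elsewhere on the grid, in line with the inequality used in the proof.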
Other Variations of the Chernoff Bound
The following can be proved similarly (prove it).
Theorem 71 Given the same terms as Theorem 70 (p. 583),
$\mathrm{prob}[\, X \le (1-\theta)\,pn \,] \le e^{-\theta^2 pn/2}$.
The following slightly looser inequalities achieve symmetry.
Theorem 72 (Karp, Luby, & Madras (1989)) Given the same terms as Theorem 70 (p. 583) except with 0 ≤ θ ≤ 2,
$\mathrm{prob}[\, X \ge (1+\theta)\,pn \,] \le e^{-\theta^2 pn/4}$,
$\mathrm{prob}[\, X \le (1-\theta)\,pn \,] \le e^{-\theta^2 pn/4}$.
Power of the Majority Rule
The next result follows from Theorem 71 (p. 587).
Corollary 73 If p = (1/2) + ε for some 0 ≤ ε ≤ 1/2, then
$\mathrm{prob}\left[\, \sum_{i=1}^{n} x_i \le n/2 \,\right] \le e^{-\epsilon^2 n/2}$.
• The textbook’s corollary to Lemma 11.9 seems too loose, at $e^{-\epsilon^2 n/6}$.
• Our original problem (p. 582) hence demands, e.g., $n \approx 1.4k/\epsilon^2$ independent coin flips to guarantee making an error with probability $\le 2^{-k}$ with the majority rule.
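The figure of about 1.4k/ε² flips comes from solving $e^{-\epsilon^2 n/2} \le 2^{-k}$ for n, which gives n ≥ 2k ln 2/ε² ≈ 1.386k/ε². A minimal sketch of that calculation follows; the function name is illustrative.

```python
from math import ceil, exp, log

def flips_needed(k, eps):
    """Smallest n with exp(-eps^2 * n / 2) <= 2^(-k), per Corollary 73."""
    return ceil(2 * k * log(2) / eps**2)

k, eps = 20, 0.1
n = flips_needed(k, eps)               # about 1.386 * k / eps^2
error_bound = exp(-eps**2 * n / 2)
```

For k = 20 and ε = 0.1 this asks for 2773 flips, and the resulting bound on the error is just below 2⁻²⁰.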
BPP^a
(Bounded Probabilistic Polynomial)
• The class BPP contains all languages L for which there is a precise polynomial-time NTM N such that:
– If x ∈ L, then at least 3/4 of the computation paths of N on x lead to “yes.”
– If x ∉ L, then at least 3/4 of the computation paths of N on x lead to “no.”
• So N accepts or rejects by a clear majority.
^a Gill (1977).
Magic 3/4?
• The number 3/4 bounds the probability (ratio) of a right answer away from 1/2.
• Any constant strictly between 1/2 and 1 can be used without affecting the class BPP.
• In fact, as with RP, $\frac{1}{2} + \frac{1}{q(n)}$ for any polynomial q(n) can replace 3/4.
• The next algorithm shows why.
The Majority Vote Algorithm
Suppose L is decided by N by majority (1/2) + ε.
for i = 1, 2, . . . , 2k + 1 do
    Run N on input x;
end for
if “yes” is the majority answer then
    return “yes”;
else
    return “no”;
end if
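A Python sketch of the majority vote follows. The noisy decider `noisy_is_even` is a hypothetical stand-in for N: it answers correctly with probability 1/2 + ε, with ε = 0.1 and k = 500 picked only for illustration.

```python
import random

def majority_vote(run_once, x, k):
    """Run the randomized decider 2k + 1 times and return the majority answer."""
    yes = sum(run_once(x) for _ in range(2 * k + 1))
    return yes > k

rng = random.Random(2)
eps = 0.1

def noisy_is_even(x):
    # Toy decider (assumption): correct with probability 1/2 + eps.
    truth = (x % 2 == 0)
    return truth if rng.random() < 0.5 + eps else not truth

results = [majority_vote(noisy_is_even, x, 500) for x in range(20)]
```

A single run errs 40% of the time, yet the vote over 1001 runs is wrong only with negligible probability.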
Analysis
• The running time remains polynomial: 2k + 1 times N’s running time.
• By Corollary 73 (p. 588), the probability of a false answer is at most $e^{-\epsilon^2 k}$.
• By taking $k = \lceil 2/\epsilon^2 \rceil$, the error probability is at most 1/4.
• Even if ε is any inverse polynomial, k remains a polynomial in n.
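The choice of k can be made concrete. This sketch, with an illustrative function name, computes k = ⌈2/ε²⌉ and the resulting bound $e^{-\epsilon^2 k}$ on the error probability.

```python
from math import ceil, exp

def repetitions(eps):
    """Return (k, error bound) for k = ceil(2 / eps^2)."""
    k = ceil(2 / eps**2)
    return k, exp(-eps**2 * k)

k, err = repetitions(0.1)         # advantage eps = 0.1
k_poly, err_poly = repetitions(1 / 50)   # eps = 1/q(n) with q(n) = 50
```

With ε = 0.1 this gives k = 200 and a bound of about e⁻² ≈ 0.135. For ε = 1/q(n), k grows like 2q(n)², which is still polynomial.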
Aspects of BPP
• BPP is the most comprehensive yet plausible notion of efficient computation.
– If a problem is in BPP, we take it to mean that the problem can be solved efficiently.
– In this aspect, BPP has effectively replaced P.
• (RP ∪ coRP) ⊆ (NP ∪ coNP).
• (RP ∪ coRP) ⊆ BPP.
• Whether BPP ⊆ (NP ∪ coNP) is unknown.
• But it is unlikely that NP ⊆ BPP.
coBPP
• The definition of BPP is symmetric: acceptance by clear majority and rejection by clear majority.
• An algorithm for L ∈ BPP becomes one for L̄ by reversing the answer.
• So L̄ ∈ BPP and BPP ⊆ coBPP.
• Similarly coBPP ⊆ BPP.
• Hence BPP = coBPP.
• This approach does not work for RP.^a
^a It did not work for NP either.
BPP and coBPP
[Figure: computation paths labeled “yes”, “no”, “no”, “yes”.]
“The Good, the Bad, and the Ugly”
[Figure: diagram of the classes P, ZPP, RP, coRP, BPP, NP, and coNP.]
Circuit Complexity
• Circuit complexity is based on boolean circuits instead of Turing machines.
• A boolean circuit with n inputs computes a boolean function of n variables.
• Now, identify true/1 with “yes” and false/0 with “no.”
• Then a boolean circuit with n inputs accepts certain strings in $\{0,1\}^n$.
• To relate circuits with an arbitrary language, we need one circuit for each possible input length n.
Formal Definitions
• The size of a circuit is the number of gates in it.
• A family of circuits is an infinite sequence
$C = (C_0, C_1, \ldots)$ of boolean circuits, where $C_n$ has n boolean inputs.
• For input $x \in \{0,1\}^*$, $C_{|x|}$ outputs 1 if and only if x ∈ L.
• In other words,
$C_n$ accepts $L \cap \{0,1\}^n$.
Formal Definitions (concluded)
• $L \subseteq \{0,1\}^*$ has polynomial circuits if there is a family of circuits C such that:
– The size of $C_n$ is at most p(n) for some fixed polynomial p.
– $C_n$ accepts $L \cap \{0,1\}^n$.
Exponential Circuits Suffice for All Languages
• Theorem 14 (p. 195) implies that there are languages that cannot be solved by circuits of size $2^n/(2n)$.
• But surprisingly, circuits of size $2^{n+2}$ can solve all problems, decidable or otherwise!
Exponential Circuits Suffice for All Languages (continued)
Proposition 74 All decision problems (decidable or otherwise) can be solved by a circuit of size $2^{n+2}$.
• We will show that for any language $L \subseteq \{0,1\}^*$, $L \cap \{0,1\}^n$ can be decided by a circuit of size $2^{n+2}$.
• Define boolean function $f : \{0,1\}^n \to \{0,1\}$, where
$f(x_1 x_2 \cdots x_n) = \begin{cases} 1, & x_1 x_2 \cdots x_n \in L, \\ 0, & x_1 x_2 \cdots x_n \notin L. \end{cases}$
The Proof (concluded)
• Clearly, any circuit that implements f decides $L \cap \{0,1\}^n$.
• Now,
$f(x_1 x_2 \cdots x_n) = (x_1 \wedge f(1 x_2 \cdots x_n)) \vee (\neg x_1 \wedge f(0 x_2 \cdots x_n))$.
• The circuit size s(n) for $f(x_1 x_2 \cdots x_n)$ hence satisfies
$s(n) = 4 + 2s(n-1)$
with s(1) = 1.
• Solve it to obtain $s(n) = 5 \times 2^{n-1} - 4 \le 2^{n+2}$.
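The recurrence and the expansion on the first variable can both be checked in a few lines. In this sketch `in_language` is a hypothetical membership oracle given as a Python function; for an undecidable L no such program exists, although the circuit still does.

```python
def size(n):
    """Circuit size from the recurrence s(1) = 1, s(n) = 4 + 2 s(n - 1)."""
    return 1 if n == 1 else 4 + 2 * size(n - 1)

def evaluate(in_language, bits, prefix=""):
    """Evaluate f via f = (x1 AND f(1...)) OR (NOT x1 AND f(0...))."""
    if not bits:
        return in_language(prefix)       # all variables fixed: a constant
    x1, rest = bits[0] == "1", bits[1:]
    return (x1 and evaluate(in_language, rest, prefix + "1")) or \
           (not x1 and evaluate(in_language, rest, prefix + "0"))

# Toy language (assumption): strings with an odd number of 1s.
odd_parity = lambda s: s.count("1") % 2 == 1
accepted = evaluate(odd_parity, "1011")
```

Each level of the expansion adds four gates (two ANDs, one OR, one NOT) around two subcircuits on n − 1 variables, which is where s(n) = 4 + 2s(n − 1) comes from.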
The Circuit Complexity of P
Proposition 75 All languages in P have polynomial circuits.
• Let L ∈ P be decided by a TM in time p(n).
• By Corollary 31 (p. 297), there is a circuit with $O(p(n)^2)$ gates that accepts $L \cap \{0,1\}^n$.
• The size of that circuit depends only on L and the length of the input.
• The size of that circuit is polynomial in n.
Polynomial Circuits vs. P
• Is the converse of Proposition 75 true?
– Do polynomial circuits accept only languages in P?
• No.
• Polynomial circuits can accept undecidable languages!
BPP’s Circuit Complexity
Theorem 76 (Adleman (1978)) All languages in BPP have polynomial circuits.
• Our proof will be nonconstructive in that only the existence of the desired circuits is shown.
– Recall our proof of Theorem 14 (p. 195).
– Something exists if its probability of existence is nonzero.
• It is not known how to efficiently generate circuit $C_n$.
– If the construction of $C_n$ can be made efficient, then
The Proof
• Let L ∈ BPP be decided by a precise polynomial-time NTM N by clear majority.
• We shall prove that L has polynomial circuits $C_0, C_1, \ldots$.
– These deterministic circuits do not err.
• Suppose N runs in time p(n), where p(n) is a polynomial.
• Let $A_n = \{a_1, a_2, \ldots, a_m\}$, where $a_i \in \{0,1\}^{p(n)}$.
• Each $a_i \in A_n$ represents a sequence of nondeterministic choices (i.e., a computation path) for N.
• Pick m = 12(n + 1).
The Proof (continued)
• Let x be an input with | x | = n.
• Circuit $C_n$ simulates N on x with all sequences of choices in $A_n$ and then takes the majority of the m outcomes.^a
– Note that each $A_n$ yields a circuit.
• As N with $a_i$ is a polynomial-time deterministic TM, it can be simulated by polynomial circuits of size $O(p(n)^2)$.
– See the proof of Proposition 75 (p. 603).
^a As m is even, there may be no clear majority. Still, the probability of that happening is very small and does not materially affect our general
The Circuit
[Figure: the circuit $C_n$.]
The Proof (continued)
• The size of $C_n$ is therefore $O(m\,p(n)^2) = O(n\,p(n)^2)$.
– This is a polynomial.
• We now confirm the existence of an $A_n$ making $C_n$ correct on all n-bit inputs.
• Call $a_i$ bad if it leads N to an error (a false positive or a false negative) for x.
• Select An uniformly randomly.
The Proof (continued)
• For each $x \in \{0,1\}^n$, at most 1/4 of the computations of N are erroneous.
• Because the sequences in $A_n$ are chosen randomly and independently, the expected number of bad $a_i$’s is at most m/4.^a
• By the Chernoff bound (p. 583) with p = 1/4 and θ = 1, the probability that the number of bad $a_i$’s is m/2 or more is at most
$e^{-m/12} < 2^{-(n+1)}$.
• The error probability of using the majority rule is thus
$< 2^{-(n+1)}$ for each $x \in \{0,1\}^n$.
^a So the proof will not work for NP. Contributed by Mr. Ching-Hua
The Proof (continued)
• The probability that there is an x such that $A_n$ results in an incorrect answer is
$< 2^n \, 2^{-(n+1)} = 2^{-1}$.
– Recall the union bound (Boole’s inequality):
prob[ A ∪ B ∪ · · · ] ≤ prob[ A ] + prob[ B ] + · · · .
• We just showed that at least half of the possible $A_n$’s are correct.
• So with probability ≥ 0.5, a random $A_n$ produces a correct $C_n$ for all inputs of length n.
– Of course, verifying this fact may take a long time.
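The numbers in this argument are easy to tabulate. The sketch below, with an illustrative function name, takes m = 12(n + 1) and evaluates the per-input bound $e^{-m/12}$ and the union bound over all $2^n$ inputs.

```python
from math import exp

def adleman_parameters(n):
    """Return (m, per-input error bound, union bound over all n-bit inputs)."""
    m = 12 * (n + 1)
    per_input = exp(-m / 12)     # Chernoff bound with p = 1/4 and theta = 1
    union = 2**n * per_input     # Boole's inequality over 2^n inputs
    return m, per_input, union

table = [adleman_parameters(n) for n in range(1, 50)]
```

For every n in the table the per-input bound is below $2^{-(n+1)}$ and the union bound is below 1/2, as the proof requires.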
The Proof (concluded)
• Because this probability exceeds 0, an $A_n$ that makes majority vote work for all inputs of length n exists.
• Hence a correct $C_n$ exists.^a
• We have used the probabilistic method popularized by Erdős.^b
• This result answers the question on p. 514 with a “yes.”
^a Quine (1948), “To be is to be the value of a bound variable.”
^b A counting argument in the probabilistic language.
Leonard Adleman (1945–)
Paul Erdős (1913–1996)
Cryptography
Whoever wishes to keep a secret must hide the fact that he possesses one.
— Johann Wolfgang von Goethe (1749–1832)
Cryptography
• Alice (A) wants to send a message to Bob (B) over a channel monitored by Eve (eavesdropper).
• The protocol should be such that the message is known only to Alice and Bob.
• The art and science of keeping messages secure is cryptography.
[Figure: Alice sends a message to Bob over a channel monitored by Eve.]
Encryption and Decryption
• Alice and Bob agree on two algorithms E and D—the encryption and the decryption algorithms.
• Both E and D are known to the public in the analysis.
• Alice runs E and wants to send a message x to Bob.
• Bob operates D.
• Privacy is assured in terms of two numbers e, d, the encryption and decryption keys.
• Alice sends y = E(e, x) to Bob, who then performs D(d, y) = x to recover x.
• x is called plaintext, and y is called ciphertext.
Some Requirements
• D should be an inverse of E given e and d.
• D and E must both run in (probabilistic) polynomial time.
• Eve should not be able to recover x from y without knowing d.
– As D is public, d must be kept secret.
– e may or may not be a secret.
Degrees of Security
• Perfect secrecy: After a ciphertext is intercepted by the enemy, the a posteriori probabilities of the plaintext that this ciphertext represents are identical to the a priori probabilities of the same plaintext before the interception.
– The probability that plaintext P occurs is independent of the ciphertext C being observed.
– So knowing C yields no advantage in recovering P.
• Such systems are said to be informationally secure.
• A system is computationally secure if breaking it is theoretically possible but computationally infeasible.
Conditions for Perfect Secrecy^a
• Consider a cryptosystem where:
– The space of ciphertext is as large as that of keys.
– Every plaintext has a nonzero probability of being used.
• It is perfectly secure if and only if the following hold.
– A key is chosen with uniform distribution.
– For each plaintext x and ciphertext y, there exists a unique key e such that E(e, x) = y.
^a Shannon (1949).
The One-Time Pad^a
Alice generates a random string r as long as x;
Alice sends r to Bob over a secret channel;
Alice sends y = x ⊕ r to Bob over a public channel;
Bob receives y;
Bob recovers x := y ⊕ r;
^a Mauborgne and Vernam (1917); Shannon (1949). It was allegedly used for the hotline between Russia and the U.S.
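The one-time pad is short enough to sketch directly. The message and the helper name `xor_bytes` are illustrative; the pad comes from Python's `secrets` module, and the secret channel for r is simply assumed.

```python
import secrets

def xor_bytes(a, b):
    """Bitwise XOR of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("pad must be as long as the message")
    return bytes(s ^ t for s, t in zip(a, b))

x = b"attack at dawn"                 # plaintext
r = secrets.token_bytes(len(x))       # fresh random pad, as long as x
y = xor_bytes(x, r)                   # ciphertext sent over the public channel
recovered = xor_bytes(y, r)           # Bob's decryption
```

Decryption works because (x ⊕ r) ⊕ r = x. The pad r must never be reused for a second message.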
Analysis
• The one-time pad uses e = d = r.
• This is said to be a private-key cryptosystem.
• Given y, knowing x and knowing r are equivalent.
• Because r is random and private, the one-time pad achieves perfect secrecy (see also p. 621).
• The random bit string must be new for each round of communication.
• But the assumption of a private channel is problematic.
Public-Key Cryptography
• Suppose only d is private to Bob, whereas e is public knowledge.
• Bob generates the (e, d) pair and publishes e.
• Anybody like Alice can send E(e, x) to Bob.
• Knowing d, Bob can recover x by D(d, E(e, x)) = x.
• The assumptions are complexity-theoretic.
– It is computationally difficult to compute d from e.
– It is computationally difficult to compute x from y without knowing d.
Whitfield Diffie^a (1944–)
^a Turing Award (2016).
Martin Hellman^a (1945–)
^a Turing Award (2016).
Complexity Issues
• Given y and x, it is easy to verify whether E(e, x) = y.
• Hence one can always guess an x and verify.
• Cracking a public-key cryptosystem is thus in NP.
• A necessary condition for the existence of secure public-key cryptosystems is P ≠ NP.
• But more is needed than P ≠ NP.
• For instance, it is not sufficient that D is hard to compute in the worst case.
• It should be hard in “most” or “average” cases.
One-Way Functions
A function f is a one-way function if the following hold.^a
1. f is one-to-one.
2. For all $x \in \Sigma^*$, $|x|^{1/k} \le |f(x)| \le |x|^k$ for some k > 0.
• f is said to be honest.
3. f can be computed in polynomial time.
4. $f^{-1}$ cannot be computed in polynomial time.
• Exhaustive search works, but it must be slow.
^a Diffie & Hellman (1976); Boppana & Lagarias (1986); Grollmann & Selman (1988); Ko (1985); Ko, Long, & Du (1986); Watanabe (1985); Young (1983).
Existence of One-Way Functions (OWFs)
• Even if P ≠ NP, there is no guarantee that one-way functions exist.
• No functions have been proved to be one-way.
• Is breaking glass a one-way function?
Candidates of One-Way Functions
• Modular exponentiation $f(x) = g^x \bmod p$, where g is a primitive root of p.
– Discrete logarithm is hard.^a
• The RSA^b function $f(x) = x^e \bmod pq$ for an odd e relatively prime to φ(pq).
– Breaking the RSA function is hard.
^a Conjectured to be $2^{n^\epsilon}$ for some ε > 0 in both the worst-case sense and average sense. Doable in time $n^{O(\log n)}$ for finite fields of small characteristic (Barbulescu, et al., 2013). It is in NP in some sense (Grollmann and Selman, 1988).
^b Rivest, Shamir, & Adleman (1978).
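The asymmetry behind modular exponentiation can be seen even at toy scale. In the sketch below the parameters p = 101 and g = 2 are assumptions for illustration (2 is a primitive root of 101). The forward direction uses Python's built-in three-argument `pow`, while the inverse is found by exhaustive search.

```python
def discrete_log(g, y, p):
    """Exhaustive search for x in [0, p - 2] with g**x mod p == y."""
    value = 1
    for x in range(p - 1):
        if value == y:
            return x
        value = value * g % p
    return None

p, g = 101, 2            # toy parameters (assumption)
secret = 57
y = pow(g, secret, p)    # forward direction: fast modular exponentiation
found = discrete_log(g, y, p)
```

The search takes up to p − 1 multiplications, which is exponential in the bit length of p, whereas `pow` needs only a number of multiplications proportional to that bit length.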
Candidates of One-Way Functions (concluded)
• Modular squaring $f(x) = x^2 \bmod pq$.
– Determining if a number with a Jacobi symbol 1 is a quadratic residue is hard; this is the quadratic residuacity assumption (QRA).^a
– Breaking it is as hard as factorization when p ≡ q ≡ 3 mod 4.^b
^a Due to Gauss.
^b Rabin (1979).
The Secret-Key Agreement Problem
• Exchanging messages securely using a private-key cryptosystem requires Alice and Bob possessing the same key (p. 623).
– An example is the r in the one-time pad (p. 622).
• How can they agree on the same secret key when the channel is insecure?
• This is called the secret-key agreement problem.
• It was solved by Diffie and Hellman (1976) using one-way functions.
The Diffie-Hellman Secret-Key Agreement Protocol
Alice and Bob agree on a large prime p and a primitive root g of p; {p and g are public.}
Alice chooses a large number a at random;
Alice computes $\alpha = g^a \bmod p$;
Bob chooses a large number b at random;
Bob computes $\beta = g^b \bmod p$;
Alice sends α to Bob, and Bob sends β to Alice;
Alice computes her key $\beta^a \bmod p$;
Bob computes his key $\alpha^b \bmod p$;
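The protocol steps can be traced with toy numbers. Here p = 23 and g = 5 are illustrative assumptions (5 is a primitive root of 23); real use calls for a very large prime.

```python
import secrets

p, g = 23, 5                          # toy public parameters (assumption)

a = secrets.randbelow(p - 2) + 1      # Alice's secret exponent in [1, p - 2]
b = secrets.randbelow(p - 2) + 1      # Bob's secret exponent in [1, p - 2]

alpha = pow(g, a, p)                  # Alice sends alpha to Bob
beta = pow(g, b, p)                   # Bob sends beta to Alice

alice_key = pow(beta, a, p)           # beta^a mod p
bob_key = pow(alpha, b, p)            # alpha^b mod p
```

Eve sees p, g, α, and β but neither a nor b; both parties end up with $g^{ab} \bmod p$.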
Analysis
• The keys computed by Alice and Bob are identical as $\beta^a = g^{ba} = g^{ab} = \alpha^b \bmod p$.
• To compute the common key from p, g, α, β is known as the Diffie-Hellman problem.
• It is conjectured to be hard.
• If discrete logarithm is easy, then one can solve the Diffie-Hellman problem.
– Because a and b can then be obtained by Eve.
• But the other direction is still open.
The RSA Function
• Let p, q be two distinct primes.
• The RSA function is $x^e \bmod pq$ for an odd e relatively prime to φ(pq).
– By Lemma 54 (p. 464),
$\phi(pq) = pq\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right) = pq - p - q + 1. \quad (14)$
• As gcd(e, φ(pq)) = 1, there is a d such that
$ed \equiv 1 \bmod \phi(pq)$,
which can be found by the Euclidean algorithm.
A Public-Key Cryptosystem Based on RSA
• Bob generates p and q.
• Bob publishes pq and the encryption key e, a number relatively prime to φ(pq).
– The encryption function is $y = x^e \bmod pq$.
– Bob calculates φ(pq) by Eq. (14) (p. 635).
– Bob then calculates d such that ed = 1 + kφ(pq) for some k ∈ Z.
• The decryption function is $y^d \bmod pq$.
• It works because $y^d = x^{ed} = x^{1+k\phi(pq)} = x \bmod pq$ by the Fermat-Euler theorem when gcd(x, pq) = 1 (p. 473).
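A toy run of the scheme is sketched below. The primes 61 and 53, the exponent e = 17, and the message 65 are illustrative assumptions, far too small for real use. The inverse d is obtained with Python's `pow(e, -1, phi)` (Python 3.8 or later), which plays the role of the Euclidean algorithm.

```python
from math import gcd

p, q = 61, 53                 # Bob's secret primes (toy values)
n = p * q                     # published modulus pq
phi = n - p - q + 1           # Eq. (14): phi(pq) = pq - p - q + 1

e = 17                        # public encryption key
if gcd(e, phi) != 1:
    raise ValueError("e must be relatively prime to phi(pq)")
d = pow(e, -1, phi)           # private key with e*d = 1 mod phi(pq)

x = 65                        # plaintext, with gcd(x, pq) = 1
y = pow(x, e, n)              # Alice encrypts
recovered = pow(y, d, n)      # Bob decrypts
```

Only n and e are published; d, p, q, and φ(pq) stay with Bob.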
The “Security” of the RSA Function
• Factoring pq or calculating d from (e, pq) seems hard.^a
• Breaking the last bit of RSA is as hard as breaking the RSA.^b
• Recommended RSA key sizes:^c
– 1024 bits up to 2010.
– 2048 bits up to 2030.
– 3072 bits up to 2031 and beyond.
^a See also p. 469.
^b Alexi, Chor, Goldreich, & Schnorr (1988).
^c RSA (2003). RSA was acquired by EMC in 2006 for 2.1 billion US
The “Security” of the RSA Function (continued)
• Recall that problem A is “harder than” problem B if solving A results in solving B.
– Factorization is “harder than” breaking the RSA.
– It is not hard to show that calculating Euler’s phi function^a is “harder than” breaking the RSA.
– Factorization is “harder than” calculating Euler’s phi function (see Lemma 54 on p. 464).
– So factorization is harder than calculating Euler’s phi function, which is harder than breaking the RSA.
^a When the input is not factorized!
The “Security” of the RSA Function (concluded)
• Factorization cannot be NP-hard unless NP = coNP.^a
• So breaking the RSA is unlikely to imply P = NP.
• But numbers can be factorized efficiently by quantum computers.^b
• RSA was alleged to have received 10 million US dollars from the government to promote insecure p and q.^c
^a Brassard (1979).
^b Shor (1994).
^c Menn (2013).
Adi Shamir, Ron Rivest, and Leonard Adleman
Ron Rivest^a (1947–)
Adi Shamir^a (1952–)
^a Turing Award (2002).
A Parallel History
• Diffie and Hellman’s solution to the secret-key agreement problem led to public-key cryptography.
• At around the same time (or earlier) in Britain, the RSA public-key cryptosystem was invented before the Diffie-Hellman secret-key agreement scheme was.
– This was the work of Ellis, Cocks, and Williamson of the Communications Electronics Security Group of the British Government Communications Headquarters (GCHQ).