Large Deviations
(1)

ZPP

(Zero Probabilistic Polynomial)

• The class ZPP is defined as RP ∩ coRP.

• A language in ZPP has two Monte Carlo algorithms, one with no false positives (RP) and the other with no false negatives (coRP).

• If we repeatedly run both Monte Carlo algorithms, eventually one definite answer will come (unlike RP).

– A positive answer from the one without false positives.

– A negative answer from the one without false negatives.

(2)

The ZPP Algorithm (Las Vegas)

1: {Suppose L ∈ ZPP.}
2: {N1 has no false positives, and N2 has no false negatives.}
3: while true do
4:   if N1(x) = “yes” then
5:     return “yes”;
6:   end if
7:   if N2(x) = “no” then
8:     return “no”;
9:   end if
10: end while
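A direct transliteration of this loop into Python, with N1 and N2 standing in for the two (hypothetical, unspecified) Monte Carlo procedures:

def las_vegas(x, N1, N2):
    # N1 has no false positives (RP-style): a "yes" from it is always correct.
    # N2 has no false negatives (coRP-style): a "no" from it is always correct.
    while True:
        if N1(x) == "yes":
            return "yes"
        if N2(x) == "no":
            return "no"

The returned answer is always correct; only the number of iterations is random, with expectation analyzed on the next slide.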

(3)

ZPP (concluded)

• The expected running time for the correct answer to emerge is polynomial.

– The probability that a run of the two algorithms does not generate a definite answer is at most 0.5 (why?).

– Let p(n) be the running time of each run of the while-loop.

– The expected running time for a definite answer is

Σ_{i=1}^{∞} 0.5^i · i · p(n) = 2 p(n).
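A quick check of the constant 2, using the standard identity Σ_{i≥1} i x^i = x/(1 − x)² for |x| < 1 (the identity itself is not stated on the slides):

\[
\sum_{i=1}^{\infty} 0.5^{i}\, i\, p(n) \;=\; \frac{0.5}{(1-0.5)^{2}}\, p(n) \;=\; 2\, p(n).
\]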

• Essentially, ZPP is the class of problems that can be solved, without errors, in expected polynomial time.

(4)

Large Deviations

• Suppose you have a biased coin.

• One side has probability 0.5 + ε of appearing and the other 0.5 − ε, for some 0 < ε < 0.5.

• But you do not know which is which.

• How to decide which side is the more likely side—with high confidence?

• Answer: Flip the coin many times and pick the side that appeared the most times.

• Question: Can you quantify your confidence?
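A small Monte Carlo sketch in Python of the majority rule just described (the parameters eps, n, and trials are illustrative choices, not from the slides):

import random

def majority_confidence(eps, n, trials=10_000):
    # Estimate the probability that n flips of a coin with bias 0.5 + eps
    # yield a majority for the truly likelier side.
    correct = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 + eps for _ in range(n))
        if heads > n / 2:
            correct += 1
    return correct / trials

print(majority_confidence(0.05, 1000))   # typically prints a value close to 1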

(5)

The Chernoff Bound

Theorem 70 (Chernoff (1952)) Suppose x1, x2, . . . , xn are independent random variables taking the values 1 and 0 with probabilities p and 1 − p, respectively. Let X = Σ_{i=1}^{n} x_i. Then for all 0 ≤ θ ≤ 1,

prob[ X ≥ (1 + θ) pn ] ≤ e^{−θ²pn/3}.

• The probability that a binomial random variable deviates from its expected value

E[ X ] = E[ Σ_{i=1}^{n} x_i ] = pn

decreases exponentially with the deviation.

(6)

The Proof

• Let t be any positive real number.

• Then

prob[ X ≥ (1 + θ) pn ] = prob[ e^{tX} ≥ e^{t(1+θ) pn} ].

• Markov’s inequality (p. 519) generalized to real-valued random variables says that

prob[ e^{tX} ≥ k E[ e^{tX} ] ] ≤ 1/k.

• With k = e^{t(1+θ) pn}/E[ e^{tX} ], we havea

prob[ X ≥ (1 + θ) pn ] ≤ e^{−t(1+θ) pn} E[ e^{tX} ].

aNote that X does not appear in k. Contributed by Mr. Ao Sun

(7)

The Proof (continued)

• Because X = Σ_{i=1}^{n} x_i and the x_i’s are independent,

E[ e^{tX} ] = (E[ e^{tx_1} ])^n = [ 1 + p(e^t − 1) ]^n.

• Substituting, we obtain

prob[ X ≥ (1 + θ) pn ] ≤ e^{−t(1+θ) pn} [ 1 + p(e^t − 1) ]^n

≤ e^{−t(1+θ) pn} e^{pn(e^t − 1)}, as (1 + a)^n ≤ e^{an} for all a > 0.

(8)

The Proof (concluded)

• With the choice of t = ln(1 + θ), the above becomes

prob[ X ≥ (1 + θ) pn ] ≤ e^{pn[ θ − (1+θ) ln(1+θ) ]}.

• The exponent θ − (1 + θ) ln(1 + θ) expands to

−θ²/2 + θ³/6 − θ⁴/12 + · · · for 0 ≤ θ ≤ 1.

• But it is less than

−θ²/2 + θ³/6 = θ² ( −1/2 + θ/6 ) ≤ θ² ( −1/2 + 1/6 ) = −θ²/3.

(9)

Other Variations of the Chernoff Bound

The following can be proved similarly (prove it).

Theorem 71 Given the same terms as Theorem 70 (p. 583),

prob[ X ≤ (1 − θ) pn ] ≤ e^{−θ²pn/2}.

The following slightly looser inequalities achieve symmetry.

Theorem 72 (Karp, Luby, & Madras (1989)) Given the same terms as Theorem 70 (p. 583) except with 0 ≤ θ ≤ 2,

prob[ X ≥ (1 + θ) pn ] ≤ e^{−θ²pn/4},

prob[ X ≤ (1 − θ) pn ] ≤ e^{−θ²pn/4}.

(10)

Power of the Majority Rule

The next result follows from Theorem 71 (p. 587).

Corollary 73 If p = (1/2) + ε for some 0 ≤ ε ≤ 1/2, then

prob[ Σ_{i=1}^{n} x_i ≤ n/2 ] ≤ e^{−ε²n/2}.

• The textbook’s corollary to Lemma 11.9 seems too loose, at e^{−ε²n/6}.

• Our original problem (p. 582) hence demands, e.g., n ≈ 1.4k/ε² independent coin flips to guarantee making an error with probability ≤ 2^{−k} with the majority rule.
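Where the constant 1.4 comes from (a routine rearrangement of Corollary 73, not spelled out on the slide): requiring the error bound to be at most 2^{−k} gives

\[
e^{-\epsilon^{2} n/2} \le 2^{-k}
\iff n \ge \frac{2\ln 2}{\epsilon^{2}}\, k \approx \frac{1.386\, k}{\epsilon^{2}}.
\]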


(11)

BPP

a

(Bounded Probabilistic Polynomial)

• The class BPP contains all languages L for which there is a precise polynomial-time NTM N such that:

– If x ∈ L, then at least 3/4 of the computation paths of N on x lead to “yes.”

– If x ∉ L, then at least 3/4 of the computation paths of N on x lead to “no.”

• So N accepts or rejects by a clear majority.

aGill (1977).

(12)

Magic 3/4?

• The number 3/4 bounds the probability (ratio) of a right answer away from 1/2.

• Any constant strictly between 1/2 and 1 can be used without affecting the class BPP.

• In fact, as with RP,

1/2 + 1/q(n)

for any polynomial q(n) can replace 3/4.

• The next algorithm shows why.

(13)

The Majority Vote Algorithm

Suppose L is decided by N by majority (1/2) + ε.

1: for i = 1, 2, . . . , 2k + 1 do
2:   Run N on input x;
3: end for
4: if “yes” is the majority answer then
5:   “yes”;
6: else
7:   “no”;
8: end if

(14)

Analysis

• The running time remains polynomial: 2k + 1 times N’s running time.

• By Corollary 73 (p. 588), the probability of a false answer is at most e^{−ε²k}.

• By taking k = ⌈ 2/ε² ⌉, the error probability is at most 1/4 (see the check after this list).

• Even if ε is any inverse polynomial, k remains a polynomial in n.
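The check promised above (simple arithmetic, not on the slide): with k = ⌈ 2/ε² ⌉,

\[
e^{-\epsilon^{2} k} \le e^{-\epsilon^{2}\cdot 2/\epsilon^{2}} = e^{-2} \approx 0.135 < \tfrac{1}{4}.
\]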

(15)

Aspects of BPP

• BPP is the most comprehensive yet plausible notion of efficient computation.

– If a problem is in BPP, we take it to mean that the problem can be solved efficiently.

– In this aspect, BPP has effectively replaced P.

• (RP ∪ coRP) ⊆ (NP ∪ coNP).

• (RP ∪ coRP) ⊆ BPP.

• Whether BPP ⊆ (NP ∪ coNP) is unknown.

• But it is unlikely that NP ⊆ BPP.a

(16)

coBPP

• The definition of BPP is symmetric: acceptance by clear majority and rejection by clear majority.

• An algorithm for L ∈ BPP becomes one for ¯L by reversing the answer.

• So ¯L ∈ BPP and BPP ⊆ coBPP.

• Similarly coBPP ⊆ BPP.

• Hence BPP = coBPP.

• This approach does not work for RP.a

aIt did not work for NP either.

(17)

BPP and coBPP

[Figure: computation-path outcomes “yes”/“no” for a BPP machine and, with the answers reversed, for its complement.]

(18)

“The Good, the Bad, and the Ugly”

[Figure: inclusion diagram relating P, ZPP, RP, coRP, BPP, NP, and coNP.]

(19)

Circuit Complexity

• Circuit complexity is based on boolean circuits instead of Turing machines.

• A boolean circuit with n inputs computes a boolean function of n variables.

• Now, identify true/1 with “yes” and false/0 with “no.”

• Then a boolean circuit with n inputs accepts certain strings in { 0, 1 }^n.

• To relate circuits with an arbitrary language, we need one circuit for each possible input length n.

(20)

Formal Definitions

• The size of a circuit is the number of gates in it.

• A family of circuits is an infinite sequence

C = (C0, C1, . . .) of boolean circuits, where Cn has n boolean inputs.

• For input x ∈ { 0, 1 }*, C| x | outputs 1 if and only if x ∈ L.

• In other words,

Cn accepts L ∩ { 0, 1 }^n.

(21)

Formal Definitions (concluded)

• L ⊆ { 0, 1 }* has polynomial circuits if there is a family of circuits C such that:

– The size of Cn is at most p(n) for some fixed polynomial p.

– Cn accepts L ∩ { 0, 1 }^n.

(22)

Exponential Circuits Suffice for All Languages

• Theorem 14 (p. 195) implies that there are languages that cannot be solved by circuits of size 2^n/(2n).

• But surprisingly, circuits of size 2^{n+2} can solve all problems, decidable or otherwise!

(23)

Exponential Circuits Suffice for All Languages (continued)

Proposition 74 All decision problems (decidable or otherwise) can be solved by a circuit of size 2^{n+2}.

• We will show that for any language L ⊆ { 0, 1 }*, L ∩ { 0, 1 }^n can be decided by a circuit of size 2^{n+2}.

• Define the boolean function f : { 0, 1 }^n → { 0, 1 }, where

f(x1x2 · · · xn) = 1 if x1x2 · · · xn ∈ L, and 0 if x1x2 · · · xn ∉ L.

(24)

The Proof (concluded)

• Clearly, any circuit that implements f decides L ∩ { 0, 1 }^n.

• Now,

f(x1x2 · · · xn) = (x1 ∧ f(1x2 · · · xn)) ∨ (¬x1 ∧ f(0x2 · · · xn)).

• The circuit size s(n) for f(x1x2 · · · xn) hence satisfies s(n) = 4 + 2s(n − 1)

with s(1) = 1.

• Solve it to obtain s(n) = 5 × 2^{n−1} − 4 ≤ 2^{n+2}.
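The closed form follows by unrolling the recurrence (a routine step the slide omits):

\[
s(n) + 4 = 2\,[\, s(n-1) + 4 \,] = \cdots = 2^{\,n-1}\,[\, s(1) + 4 \,] = 5 \cdot 2^{\,n-1},
\]

so s(n) = 5 · 2^{n−1} − 4 ≤ 4 · 2^n = 2^{n+2}.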

(25)

The Circuit Complexity of P

Proposition 75 All languages in P have polynomial circuits.

• Let L ∈ P be decided by a TM in time p(n).

• By Corollary 31 (p. 297), there is a circuit with O(p(n)²) gates that accepts L ∩ { 0, 1 }^n.

• The size of that circuit depends only on L and the length of the input.

• The size of that circuit is polynomial in n.

(26)

Polynomial Circuits vs. P

• Is the converse of Proposition 75 true?

– Do polynomial circuits accept only languages in P?

• No.

• Polynomial circuits can accept undecidable languages!

(27)

BPP’s Circuit Complexity

Theorem 76 (Adleman (1978)) All languages in BPP have polynomial circuits.

• Our proof will be nonconstructive in that only the existence of the desired circuits is shown.

– Recall our proof of Theorem 14 (p. 195).

– Something exists if its probability of existence is nonzero.

• It is not known how to efficiently generate circuit Cn.

– If the construction of Cn can be made efficient, then P = BPP.

(28)

The Proof

• Let L ∈ BPP be decided by a precise polynomial-time NTM N by clear majority.

• We shall prove that L has polynomial circuits C0, C1, . . ..

– These deterministic circuits do not err.

• Suppose N runs in time p(n), where p(n) is a polynomial.

• Let An = { a1, a2, . . . , am }, where ai ∈ { 0, 1 }^{p(n)}.

• Each ai ∈ An represents a sequence of nondeterministic choices (i.e., a computation path) for N .

• Pick m = 12(n + 1).

(29)

The Proof (continued)

• Let x be an input with | x | = n.

• Circuit Cn simulates N on x with all sequences of choices in An and then takes the majority of the m outcomes.a

– Note that each An yields a circuit.

• As N with ai is a polynomial-time deterministic TM, it can be simulated by polynomial circuits of size O(p(n)²).

– See the proof of Proposition 75 (p. 603).

aAs m is even, there may be no clear majority. Still, the probability of that happening is very small and does not materially affect our general argument.

(30)

The Circuit

[Figure: the circuit Cn simulates N on x under each choice sequence a1, a2, . . . , am and outputs the majority of the m outcomes.]

(31)

The Proof (continued)

• The size of Cn is therefore O(m p(n)²) = O(n p(n)²).

– This is a polynomial.

• We now confirm the existence of an An making Cn correct on all n-bit inputs.

• Call ai bad if it leads N to an error (a false positive or a false negative) for x.

• Select An uniformly randomly.

(32)

The Proof (continued)

• For each x ∈ { 0, 1 }^n, at most 1/4 of the computations of N are erroneous.

• Because the sequences in An are chosen randomly and independently, the expected number of bad ai’s is m/4.a

• By the Chernoff bound (p. 583), the probability that the number of bad ai’s is m/2 or more is at most

e^{−m/12} < 2^{−(n+1)}.
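This is Theorem 70 (p. 583) applied with p = 1/4, θ = 1, and m in place of n (a substitution the slide leaves implicit):

\[
\mathrm{prob}\Big[\, \#\{\text{bad } a_i\} \ge \tfrac{m}{2} = (1+1)\cdot\tfrac{m}{4} \,\Big]
\le e^{-1^{2}\cdot (m/4)/3} = e^{-m/12} = e^{-(n+1)} < 2^{-(n+1)},
\]

using m = 12(n + 1) and e > 2.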

• The error probability of using the majority rule is thus

< 2−(n+1) for each x ∈ { 0, 1 }n.

aSo the proof will not work for NP. Contributed by Mr. Ching-Hua

(33)

The Proof (continued)

• The probability that there is an x such that An results in an incorrect answer is

< 2^n · 2^{−(n+1)} = 2^{−1}.

– Recall the union bound (Boole’s inequality):

prob[ A ∪ B ∪ · · · ] ≤ prob[ A ] + prob[ B ] + · · · .

• We just showed that at least half of them are correct.

• So with probability ≥ 0.5, a random An produces a correct Cn for all inputs of length n.

– Of course, verifying this fact may take a long time.

(34)

The Proof (concluded)

• Because this probability exceeds 0, an An that makes majority vote work for all inputs of length n exists.

• Hence a correct Cn exists.a

• We have used the probabilistic method popularized by Erdős.b

• This result answers the question on p. 514 with a “yes.”

aQuine (1948), “To be is to be the value of a bound variable.”

bA counting argument in the probabilistic language.

(35)

Leonard Adleman


(1945–)

(36)

Paul Erdős (1913–1996)

(37)

Cryptography

(38)

Whoever wishes to keep a secret must hide the fact that he possesses one.

— Johann Wolfgang von Goethe (1749–1832)

(39)

Cryptography

• Alice (A) wants to send a message to Bob (B) over a channel monitored by Eve (eavesdropper).

• The protocol should be such that the message is known only to Alice and Bob.

• The art and science of keeping messages secure is cryptography.

[Figure: Alice sends a message to Bob over a channel monitored by Eve.]

(40)

Encryption and Decryption

• Alice and Bob agree on two algorithms E and D—the encryption and the decryption algorithms.

• Both E and D are known to the public in the analysis.

• Alice runs E and wants to send a message x to Bob.

• Bob operates D.

• Privacy is assured in terms of two numbers e, d, the encryption and decryption keys.

• Alice sends y = E(e, x) to Bob, who then performs D(d, y) = x to recover x.

• x is called plaintext, and y is called ciphertext.a

(41)

Some Requirements

• D should be an inverse of E given e and d.

• D and E must both run in (probabilistic) polynomial time.

• Eve should not be able to recover x from y without knowing d.

– As D is public, d must be kept secret.

– e may or may not be a secret.

(42)

Degrees of Security

• Perfect secrecy: After a ciphertext is intercepted by the enemy, the a posteriori probabilities of the plaintext that this ciphertext represents are identical to the a priori probabilities of the same plaintext before the interception.

– The probability that plaintext P occurs is

independent of the ciphertext C being observed.

– So knowing C yields no advantage in recovering P.

• Such systems are said to be informationally secure.

• A system is computationally secure if breaking it is theoretically possible but computationally infeasible.

(43)

Conditions for Perfect Secrecy

a

• Consider a cryptosystem where:

– The space of ciphertext is as large as that of keys.

– Every plaintext has a nonzero probability of being used.

• It is perfectly secure if and only if the following hold.

– A key is chosen with uniform distribution.

– For each plaintext x and ciphertext y, there exists a unique key e such that E(e, x) = y.

aShannon (1949).

(44)

The One-Time Pad

a

1: Alice generates a random string r as long as x;
2: Alice sends r to Bob over a secret channel;
3: Alice sends y := x ⊕ r to Bob over a public channel;
4: Bob receives y;
5: Bob recovers x := y ⊕ r;

aMauborgne and Vernam (1917); Shannon (1949). It was allegedly used for the hotline between Russia and U.S.
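A minimal sketch of the pad in Python (illustrative only; the sample plaintext is made up, and the secret delivery of r is simply assumed):

import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Bitwise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))

x = b"ATTACK AT DAWN"            # plaintext
r = secrets.token_bytes(len(x))  # Alice's one-time pad, shared secretly with Bob
y = xor_bytes(x, r)              # ciphertext sent over the public channel
assert xor_bytes(y, r) == x      # Bob recovers the plaintext with the same pad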

(45)

Analysis

• The one-time pad uses e = d = r.

• This is said to be a private-key cryptosystem.

• Knowing x and knowing r are equivalent.

• Because r is random and private, the one-time pad achieves perfect secrecy (see also p. 621).

• The random bit string must be new for each round of communication.

• But the assumption of a private channel is problematic.

(46)

Public-Key Cryptography


• Suppose only d is private to Bob, whereas e is public knowledge.

• Bob generates the (e, d) pair and publishes e.

• Anybody like Alice can send E(e, x) to Bob.

• Knowing d, Bob can recover x by D(d, E(e, x)) = x.

• The assumptions are complexity-theoretic.

– It is computationally difficult to compute d from e.

– It is computationally difficult to compute x from y without knowing d.


(47)

Whitfield Diffie

a

(1944–)

aTuring Award (2016).

(48)

Martin Hellman

a

(1945–)

aTuring Award (2016).

(49)

Complexity Issues

• Given y and x, it is easy to verify whether E(e, x) = y.

• Hence one can always guess an x and verify.

• Cracking a public-key cryptosystem is thus in NP.

• A necessary condition for the existence of secure public-key cryptosystems is P ≠ NP.

• But more is needed than P ≠ NP.

• For instance, it is not sufficient that D is hard to compute in the worst case.

• It should be hard in “most” or “average” cases.

(50)

One-Way Functions

A function f is a one-way function if the following hold.a

1. f is one-to-one.

2. For all x ∈ Σ*, | x |^{1/k} ≤ | f(x) | ≤ | x |^k for some k > 0.

• f is said to be honest.

3. f can be computed in polynomial time.

4. f^{−1} cannot be computed in polynomial time.

• Exhaustive search works, but it must be slow.

aDiffie & Hellman (1976); Boppana & Lagarias (1986); Grollmann & Selman (1988); Ko (1985); Ko, Long, & Du (1986); Watanabe (1985); Young (1983).

(51)

Existence of One-Way Functions (OWFs)

• Even if P ≠ NP, there is no guarantee that one-way functions exist.

• No functions have been proved to be one-way.

• Is breaking glass a one-way function?

(52)

Candidates of One-Way Functions

• Modular exponentiation f(x) = g^x mod p, where g is a primitive root of p.

– Discrete logarithm is hard.a

• The RSAb function f(x) = x^e mod pq for an odd e relatively prime to φ(pq).

– Breaking the RSA function is hard.

aConjectured to be 2^{n^ε} for some ε > 0 in both the worst-case sense and average sense. Doable in time n^{O(log n)} for finite fields of small characteristic (Barbulescu, et al., 2013). It is in NP in some sense (Grollmann and Selman, 1988).

bRivest, Shamir, & Adleman (1978).

(53)

Candidates of One-Way Functions (concluded)

• Modular squaring f(x) = x² mod pq.

– Determining if a number with a Jacobi symbol 1 is a quadratic residue is hard—the quadratic

residuacity assumption (QRA).a

– Breaking it is as hard as factorization when p ≡ q ≡ 3 mod 4.b

aDue to Gauss.

bRabin (1979).

(54)

The Secret-Key Agreement Problem

• Exchanging messages securely using a private-key cryptosystem requires Alice and Bob possessing the same key (p. 623).

– An example is the r in the one-time pad (p. 622).

• How can they agree on the same secret key when the channel is insecure?

• This is called the secret-key agreement problem.

• It was solved by Diffie and Hellman (1976) using one-way functions.

(55)

The Diffie-Hellman Secret-Key Agreement Protocol

1: Alice and Bob agree on a large prime p and a primitive root g of p; {p and g are public.}
2: Alice chooses a large number a at random;
3: Alice computes α = g^a mod p;
4: Bob chooses a large number b at random;
5: Bob computes β = g^b mod p;
6: Alice sends α to Bob, and Bob sends β to Alice;
7: Alice computes her key β^a mod p;
8: Bob computes his key α^b mod p;
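A toy run of the protocol in Python (the prime p = 2³¹ − 1 and primitive root g = 7 are small illustrative choices, far from the sizes used in practice):

import random

p, g = 2**31 - 1, 7              # public parameters (toy values)
a = random.randrange(2, p - 1)   # Alice's secret exponent
b = random.randrange(2, p - 1)   # Bob's secret exponent
alpha = pow(g, a, p)             # Alice sends alpha to Bob
beta = pow(g, b, p)              # Bob sends beta to Alice
key_alice = pow(beta, a, p)      # beta^a = g^(ab) mod p
key_bob = pow(alpha, b, p)       # alpha^b = g^(ab) mod p
assert key_alice == key_bob      # both sides hold the same key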

(56)

Analysis

• The keys computed by Alice and Bob are identical as β^a = g^{ba} = g^{ab} = α^b mod p.

• To compute the common key from p, g, α, β is known as the Diffie-Hellman problem.

• It is conjectured to be hard.

• If discrete logarithm is easy, then one can solve the Diffie-Hellman problem.

– Because a and b can then be obtained by Eve.

• But the other direction is still open.

(57)

The RSA Function

• Let p, q be two distinct primes.

• The RSA function is x^e mod pq for an odd e relatively prime to φ(pq).

– By Lemma 54 (p. 464),

φ(pq) = pq (1 − 1/p)(1 − 1/q) = pq − p − q + 1. (14)

• As gcd(e, φ(pq)) = 1, there is a d such that ed ≡ 1 mod φ(pq),

which can be found by the Euclidean algorithm.a

(58)

A Public-Key Cryptosystem Based on RSA

• Bob generates p and q.

• Bob publishes pq and the encryption key e, a number relatively prime to φ(pq).

– The encryption function is y = x^e mod pq.

– Bob calculates φ(pq) by Eq. (14) (p. 635).

– Bob then calculates d such that ed = 1 + kφ(pq) for some k ∈ Z.

• The decryption function is y^d mod pq.

• It works because y^d = x^{ed} = x^{1+kφ(pq)} = x mod pq by the Fermat-Euler theorem when gcd(x, pq) = 1 (p. 473).
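A toy instantiation in Python (the primes and plaintext are the usual textbook-sized illustration, far too small to be secure):

p, q = 61, 53                      # toy primes
n, phi = p * q, (p - 1) * (q - 1)  # n = 3233, phi(pq) = 3120
e = 17                             # odd and relatively prime to phi(pq)
d = pow(e, -1, phi)                # modular inverse: ed ≡ 1 (mod phi(pq)); here d = 2753
x = 65                             # plaintext with gcd(x, n) = 1
y = pow(x, e, n)                   # encryption: x^e mod pq
assert pow(y, d, n) == x           # decryption recovers x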

(59)

The “Security” of the RSA Function

• Factoring pq or calculating d from (e, pq) seems hard.a

• Breaking the last bit of RSA is as hard as breaking the RSA.b

• Recommended RSA key sizes:c – 1024 bits up to 2010.

– 2048 bits up to 2030.

– 3072 bits up to 2031 and beyond.

aSee also p. 469.

bAlexi, Chor, Goldreich, & Schnorr (1988).

cRSA (2003). RSA was acquired by EMC in 2006 for 2.1 billion US dollars.

(60)

The “Security” of the RSA Function (continued)

• Recall that problem A is “harder than” problem B if solving A results in solving B.

– Factorization is “harder than” breaking the RSA.

– It is not hard to show that calculating Euler’s phi functiona is “harder than” breaking the RSA.

– Factorization is “harder than” calculating Euler’s phi function (see Lemma 54 on p. 464).

– So factorization is harder than calculating Euler’s phi function, which is harder than breaking the RSA.

aWhen the input is not factorized!

(61)

The “Security” of the RSA Function (concluded)

• Factorization cannot be NP-hard unless NP = coNP.a

• So breaking the RSA is unlikely to imply P = NP.

• But numbers can be factorized efficiently by quantum computers.b

• RSA was alleged to have received 10 million US dollars from the government to promote insecure p and q.c

aBrassard (1979).

bShor (1994).

cMenn (2013).

(62)

Adi Shamir, Ron Rivest, and Leonard Adleman

(63)

Ron Rivest


(1947–)

(64)

Adi Shamir

a

(1952–)

aTuring Award (2002).

(65)

A Parallel History

• Diffie and Hellman’s solution to the secret-key

agreement problem led to public-key cryptography.

• At around the same time (or earlier) in Britain, the RSA public-key cryptosystem was invented first, before the Diffie-Hellman secret-key agreement scheme.

– Ellis, Cocks, and Williamson of the Communications Electronics Security Group of the British Government Communications Head Quarters (GCHQ).
