Oblivious Polynomial Evaluation and Oblivious Neural Learning

Yan-Cheng Chang

Department of Computer Science and Information Engineering

National Taiwan University

Abstract

We study the problem of Oblivious Polynomial Evaluation (OPE), where one party has a polynomial $P$ and the other party, with an input $x$, wants to learn $P(x)$ in an oblivious way. Existing protocols rely on intractability assumptions that have not been well studied [10, 9], and they apply only to polynomials over finite fields. In this paper, we propose efficient OPE protocols that are based on Oblivious Transfer only. Slight modifications to our protocols immediately give protocols that handle polynomials over floating-point numbers. Many important real-world applications deal with floating-point numbers rather than integers or arbitrary finite fields, and our protocols have the advantage of operating directly on floating-point numbers instead of going through a finite-field simulation as in [9]. As an example, we study the problem of Oblivious Neural Learning, where one party has a neural network and the other party wants to train the neural network in an oblivious way with some training set. We give an efficient protocol for this problem, and in a sense it says that one can get smarter in an oblivious way.

Chapter 1 Introduction

Assume that there are two parties, Alice who has a function $f$ and Bob who has an input $x$. They want to collaborate in such a way that Alice learns nothing and Bob learns $f(x)$ and nothing more. A protocol achieving this task for any function $f$ and any input $x$ is called an Oblivious Function Evaluation protocol. The remarkable results of Yao [12] and Goldreich, Micali, and Wigderson [5] showed that such protocols exist under some standard cryptographic assumptions. Their protocols use a Boolean circuit to represent the function $f$ and then simulate the computation of this circuit in an oblivious way. The computational and communication overhead of their protocols depends only linearly on the circuit size of $f$, which is the best one can expect from a complexity-theoretic point of view. However, their protocols are far from practical in general, and much work remains to be done. One line of research is to consider different representations of functions and see whether more efficient simulation can be achieved via such representations. Naor and Pinkas [10] considered the case of polynomials over finite fields. Note that any function from $m$ bits to $m$ bits can be represented as a polynomial over the finite field $GF(2^m)$, but its degree could be as high as $2^m - 1$. So one would like to focus on those functions that can be represented by low-degree polynomials. This turns out to have several interesting applications [10, 4, 9, 8]. The scheme proposed in [10] is much more efficient than the conventional way of going through oblivious circuit evaluation protocols, but its security is based on two assumptions. One assumption is the existence of a secure Oblivious Transfer protocol, while the other, a new one, is the intractability of a Noisy Polynomial Interpolation Problem.
Bleichenbacher and Nguyen [3] later showed that this new assumption may be much weaker than expected and suggested the use of a possibly stronger intractability assumption on a Polynomial Reconstruction Problem. Still, no one can say how hard this problem is, as it is not well studied. Recently, Lindell and Pinkas [9] mentioned a not-yet-published OPE protocol, which is also based on a newly proposed assumption: that the Decisional Diffie-Hellman Assumption, denoted as DDH, also holds over the group $\mathbb{Z}^*_{n^2}$, where $n$ is the product of two large primes. Contrary to the well-studied DDH over $\mathbb{Z}^*_n$ [2], no one knows how hard this problem is in this new setting.

In this paper, three OPE protocols of different flavors are proposed. Compared to previous ones, the security of our first two protocols is based only on the well-accepted cryptographic assumption that a secure 1-out-of-2 oblivious transfer protocol, denoted as $OT^2_1$, exists. Our third protocol involves a third party who does not collude with the others but may be curious, and this protocol is perfectly secure, without any cryptographic assumption. Unlike that of [10], our protocols can immediately handle multi-variate polynomials and solve the Inner Product Problem, where each of two parties holds a vector and one of them learns the inner product in an oblivious way. Nice properties of $OT^2_1$ can imply nice properties of our protocols. For example, the $OT^2_1$ proposed by Bellare and Micali [1] runs in one round and is perfectly secure for one party. Using such an $OT^2_1$, we have a protocol that also runs in one round and is perfectly secure for the party Alice. One attractive feature of our protocols is that they can be modified very easily to handle floating-point numbers. This is not the case for existing OPE protocols, which rely on particular properties of finite fields. Many important real-life applications involve numerical computation over floating-point numbers, instead of over integers or arbitrary finite fields. No efficient mapping is known that embeds floating-point numbers into finite fields in a way that allows arithmetic to be carried out easily. The approach of [9] is to scale floating-point numbers up to integers with some book-keeping, apply an existing OPE protocol over integers, and then normalize to get back floating-point numbers. The extra work of scaling up, scaling down, and book-keeping makes their algorithm less appealing. We show how our OPE protocols over finite fields can be easily modified to operate directly on floating-point numbers, and we believe that such protocols are more likely to have practical applications.

In addition to computing functions obliviously, some computational tasks may also involve security issues, and people may want to perform them in an oblivious way. We use machine learning as an example, and demonstrate the applicability of our OPE protocol over floating-point numbers. Lindell and Pinkas [9] considered the scenario where two parties, each holding a private database, want to jointly construct a decision tree that classifies entries in both databases, using the so-called ID3 algorithm. This kind of learning is not robust to changes, in the sense that changes to a database may cause the whole process to be run again. We use a neural network as our learning model and consider the following scenario. Alice has a neural network which is trained to some degree, and she uses it to serve the classification requests from other parties. Alice wants to keep her neural network secret, while others want to keep their requests secret. This is the task of oblivious neural computing. At some point, another party Bob with a set of training examples wants to help Alice’s neural network get better, maybe for his own good later. Alice wants to have a secure learning process so that Bob learns nothing from her, while Bob also wants to keep his training set secret. Later, other parties having their own training sets can help Alice too, and Alice’s neural network can adapt in an incremental way. This is the task of oblivious neural learning. We will apply our OPE protocol over floating-point numbers, and derive protocols for oblivious neural computing and oblivious neural learning.

The rest of the paper is organized as follows. In Section 2, we give definitions and tools that will be used later. Three OPE protocols are proposed in Section 3. We derive OPE protocols for floating-point numbers in Section 4. In Section 5, we show oblivious protocols for neural computing and learning.

Chapter 2 Preliminaries

For a positive integer $n$, let $[n]$ denote the set $\{1, \ldots, n\}$. For an $n$-dimensional vector $v$, let $v_i$, for $i \in [n]$, denote the component in the $i$’th dimension, and we write $v = (v_1, \ldots, v_n) = (v_i)_{i \in [n]}$. We fix a security parameter $\tau$, so that any number within a small factor of $2^{-\tau}$ is considered negligible. For a distribution $D$ over a set $S$, let $D(i)$, for $i \in S$, denote the probability of $i$ according to $D$, and define $D(A)$, for $A \subseteq S$, to be $\sum_{i \in A} D(i)$.

Definition 1. Let $D$ and $D'$ be two distributions over a set $S$. Let $d_A(D, D') = |D(A) - D'(A)|$. The distance of $D$ and $D'$ is defined as $d(D, D') = \max_{A \subseteq S} d_A(D, D')$.

Note that $d(D, D') = \frac{1}{2} \sum_{i \in S} |D(i) - D'(i)|$, which is a useful way of calculating $d(D, D')$.
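The two expressions for the distance can be compared on a small example. The sketch below is illustrative only: the function names and the sample distributions are assumptions, not part of the paper, and the subset-maximising version is exponential in the size of the support, so it is only meant for tiny sets.

```python
from fractions import Fraction
from itertools import chain, combinations

def distance_max_subset(D, D_prime):
    """d(D, D') as the maximum over all subsets A of |D(A) - D'(A)|."""
    support = sorted(set(D) | set(D_prime))
    subsets = chain.from_iterable(
        combinations(support, size) for size in range(len(support) + 1))
    return max(abs(sum(D.get(i, 0) - D_prime.get(i, 0) for i in A))
               for A in subsets)

def distance_half_l1(D, D_prime):
    """d(D, D') as half of the sum over i of |D(i) - D'(i)|."""
    support = set(D) | set(D_prime)
    return sum(abs(D.get(i, 0) - D_prime.get(i, 0)) for i in support) / 2

# Hypothetical sample distributions over S = {a, b, c}.
D = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
D_prime = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
```

On this example both functions return 1/4; the maximising subset is $\{a\}$.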

Definition 2. Let $D$ and $D'$ be two distributions. They are statistically indistinguishable, denoted as $D \stackrel{s}{\equiv} D'$, if $d(D, D')$ is negligible. They are computationally indistinguishable, denoted as $D \stackrel{c}{\equiv} D'$, if $d_A(D, D')$ is negligible for any subset $A$ decided by a polynomial-size circuit.$^1$

$^1$ Note that for $A$ decided by a circuit $C$, $d_A(D, D') = |P_{x \in D}[C(x) = 1] - P_{x \in D'}[C(x) = 1]|$.

We will assume that parties in our protocols have only polynomial-size circuits for computation unless mentioned otherwise. So we will focus on computational security, and the default distinguishability will be the computational one.

An important cryptographic primitive is the 1-out-of-2 oblivious transfer, denoted as $OT^2_1$. There are several variants which are all equivalent, and the one most suited for us is the following string version of $OT^2_1$. Let $F$ be a set.

Definition 3. An $OT^2_1$ protocol has two parties, Sender who has input $(x_0, x_1) \in F^2$ and Chooser who has a choice $c \in \{0, 1\}$. The protocol is correct if Chooser learns $x_c$ for any $(x_0, x_1)$ and $c$. The protocol is secure if both conditions below are satisfied for any $(x_0, x_1)$ and $c$:

• Chooser cannot distinguish the distribution of Sender’s messages from that induced by Sender having a different value of $x_{1-c}$.

• Sender cannot distinguish the distributions of Chooser’s messages induced by $c$ and $1 - c$.

Similarly, one can define $OT^k_1$ for any $k \ge 3$, with Sender having $k$ elements and Chooser wanting to learn one. We will use $OT^k_1$, for $k \ge 2$, to denote an assumed correct and secure $OT^k_1$ protocol. It is known that the existence of an $OT^2_1$ implies the existence of $OT^k_1$ for any $k \ge 3$ [10].

Definition 4. A protocol for oblivious polynomial evaluation has two parties, Alice who has a polynomial $P$ over some finite field $F$ and Bob who has an input $x_* \in F$. An OPE protocol is correct if Bob learns $P(x_*)$ for any $x_*$ and $P$. It is secure if both conditions below are satisfied for any $x_*$ and $P$:

• Alice cannot distinguish the distribution of Bob’s messages from that induced by Bob having a different $x'_*$.

• Bob cannot distinguish the distribution of Alice’s messages from that induced by Alice having a different $P'$ with $P'(x_*) = P(x_*)$.

We say that a party in a protocol is semi-honest if the party follows the protocol but may try to learn more information than he or she should. We only consider semi-honest parties in this paper. The case of malicious parties can be handled in a standard way, which is omitted here.

Suppose $D$ and $D'$ are two distributions depending on distributions $E$ and $E'$ respectively. For any possible outcome $t$ of $E$ and $E'$, let $(D|E = t)$ and $(D'|E' = t)$ denote the distributions of $D$ and $D'$ conditioned on $E = t$ and $E' = t$ respectively. Here is a useful lemma for showing $D \stackrel{c}{\equiv} D'$, which will be used several times in our security proofs later.

Lemma 2.1. $D \stackrel{c}{\equiv} D'$ provided $E \stackrel{s}{\equiv} E'$ and $(D|E = t) \stackrel{c}{\equiv} (D'|E' = t)$ for any $t$.

Proof. Let $C$ be a circuit which outputs 1 with probabilities $p$ and $p'$ with respect to $D$ and $D'$. Let $p_t$ and $p'_t$ denote the corresponding probabilities with respect to $(D|E = t)$ and $(D'|E' = t)$. Let $q_t = E(t)$ and $q'_t = E'(t)$. Then

$$|p - p'| = \Big|\sum_t q_t p_t - \sum_t q'_t p'_t\Big| \le \sum_t |q_t p_t - q_t p'_t| + \sum_t |q_t p'_t - q'_t p'_t| \le \sum_t q_t |p_t - p'_t| + \sum_t |q_t - q'_t|.$$

So if $\sum_t |q_t - q'_t|$ and every $|p_t - p'_t|$ are negligible, then so is $|p - p'|$.

Some cases later have identical $E$ and $E'$, and we only need to check each $|p_t - p'_t|$.

A family $H$ of functions from $S_1$ to $S_2$ is said to satisfy a pair-wise independent property if for any distinct $\alpha, \alpha' \in S_1$,

$$P_{h \in H}[h(\alpha) = h(\alpha')] = \frac{1}{|S_2|}.$$

Let $(H, H(S_1))$ denote the distribution of $(h, h(v))$ with random $h \in H$ and random $v \in S_1$, and let $(H, S_2)$ denote the uniform distribution over $H \times S_2$. We will use the following lemma, which is a special case of the so-called Leftover Hash Lemma [6, 7].

Lemma 2.2. Let $H$ be any family of functions from $S_1$ to $S_2$ satisfying the pair-wise independent property. Then $d((H, H(S_1)), (H, S_2)) \le \sqrt{|S_2| / |S_1|}$.
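As a sanity check of the bound in Lemma 2.2, the distance can be computed exactly for a tiny family. The sketch below is only an illustration under stated assumptions: it uses the hypothetical family $h_R(\alpha) = \sum_i \alpha_i R_i \bmod p$ from $\{0,1\}^n$ to $GF(p)$ with $p$ prime, which has collision probability $1/p$ for distinct inputs, and the function name is made up for this example.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def lhl_distance(p, n):
    """Exact distance between (h, h(v)) and the uniform distribution over
    H x S2, for the illustrative family h_R(alpha) = sum_i alpha_i R_i mod p."""
    family = list(product(range(p), repeat=n))      # one function per R
    inputs = list(product((0, 1), repeat=n))        # S1 = {0,1}^n
    uniform = Fraction(1, len(family) * p)
    total = Fraction(0)
    for R in family:
        counts = [0] * p
        for alpha in inputs:
            counts[sum(a * r for a, r in zip(alpha, R)) % p] += 1
        for s in range(p):
            joint = Fraction(counts[s], len(family) * len(inputs))
            total += abs(joint - uniform)
    return total / 2

# Bound from Lemma 2.2 for p = 3, n = 4: sqrt(|S2| / |S1|) = sqrt(3 / 16).
bound = sqrt(3 / 16)
```

For these parameters the exact distance is positive (the all-zero $R$ maps every input to 0) and stays below the bound.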

Chapter 3 Oblivious Polynomial Evaluation Protocols

We will present several OPE protocols of different flavors in this section. Assume that both parties have agreed that polynomials are over a finite field $F$ and have degrees at most $d$. The set of such polynomials can be identified with the set $T = F^{d+1}$ in a natural way. Suppose now Alice has a polynomial $P(x) = \sum_{i=0}^{d} a_i x^i \in T$ and Bob has $x_* \in F$.

3.1 The First Protocol for OPE

To make the picture clear, we only discuss the case $F = GF(p)$ for some prime $p$. The generalization to $GF(p^k)$ with $k > 1$ is straightforward. Let $m = \lceil \log_2 |F| \rceil$. Each coefficient $a_i$ in the polynomial can be represented as $a_i = \sum_{j \in [m]} a_{ij} 2^{j-1}$ with $a_{ij} \in \{0, 1\}$. For $i \in [d]$ and $j \in [m]$, let $v_{ij} = 2^{j-1} x_*^i$. Note that for each $i \in [d]$, $\sum_{j \in [m]} a_{ij} v_{ij} = a_i x_*^i$. The idea is to have Bob prepare $(v_{ij})_{j \in [m]}$ and have Alice get those $v_{ij}$ with $a_{ij} = 1$, in some secret way. This is achieved by having Bob prepare the pair $(r_{ij}, v_{ij} + r_{ij})$ for a random noise $r_{ij}$, and having Alice get what she wants via $OT^2_1$. Note that what Alice obtains is $a_{ij} v_{ij} + r_{ij}$. Here is our first protocol.

Protocol 1

1. Bob prepares $dm$ pairs $(r_{ij}, v_{ij} + r_{ij})_{i \in [d], j \in [m]}$, with each $r_{ij}$ chosen randomly from $F$.

2. For each pair $(r_{ij}, v_{ij} + r_{ij})$, Alice runs an independent $OT^2_1$ with Bob to get $r_{ij}$ if $a_{ij} = 0$ and $v_{ij} + r_{ij}$ otherwise.

3. Alice sends to Bob the sum of $a_0$ and those $dm$ values she got. Bob subtracts $\sum_{i,j} r_{ij}$ from it to obtain $P(x_*)$.
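The message flow of Protocol 1 can be traced with a short simulation over $GF(p)$. This is a sketch, not a secure implementation: `ot_1_of_2` is a hypothetical ideal stand-in for $OT^2_1$ that simply hands the chosen element over, both parties run in one process, and the function and variable names are assumptions made for illustration.

```python
import secrets

def ot_1_of_2(pair, choice):
    # Ideal stand-in for OT^2_1: the chooser obtains pair[choice] only.
    # A real protocol would also hide `choice` from the sender.
    return pair[choice]

def protocol1(coeffs, x_star, p):
    """Simulate Protocol 1 over GF(p). coeffs = [a_0, ..., a_d] is Alice's
    polynomial (entries in [0, p)), x_star is Bob's input."""
    d = len(coeffs) - 1
    m = (p - 1).bit_length()                 # m = ceil(log2 p)

    # Step 1 (Bob): pairs (r_ij, v_ij + r_ij) with v_ij = 2^(j-1) * x_star^i.
    pairs = {}
    for i in range(1, d + 1):
        for j in range(1, m + 1):
            r = secrets.randbelow(p)
            v = pow(2, j - 1, p) * pow(x_star, i, p) % p
            pairs[i, j] = (r, (v + r) % p)

    # Step 2 (Alice): one OT per pair, selecting with the bit a_ij of a_i.
    total = coeffs[0]
    for (i, j), pair in pairs.items():
        a_ij = (coeffs[i] >> (j - 1)) & 1
        total += ot_1_of_2(pair, a_ij)

    # Step 3: Alice sends the sum; Bob subtracts the sum of his noise values.
    noise = sum(r for r, _ in pairs.values())
    return (total - noise) % p
```

For instance, with $P(x) = 3 + 5x + 7x^2$ over $GF(101)$ and $x_* = 10$, the call `protocol1([3, 5, 7], 10, 101)` returns $753 \bmod 101 = 46$.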

Lemma 3.1. Protocol 1 is correct when parties are semi-honest.

Proof. The sum Bob obtains in Step 3 is $a_0 + \sum_i \sum_j (a_{ij} v_{ij} + r_{ij}) = P(x_*) + \sum_{i,j} r_{ij}$.

Lemma 3.2. Protocol 1 is secure when parties are semi-honest.

Proof. First, we prove Alice’s security. Suppose $P$ and $P'$ are two distinct polynomials with $P(x_*) = P'(x_*) = y_*$. According to Lemma 2.1, it suffices to show that for any fixed $(r_{ij})_{i \in [d], j \in [m]}$, Alice’s respective message distributions $D$ and $D'$ induced by $P$ and $P'$ are indistinguishable. Note that the last message from Alice is $y_* + \sum_{i,j} r_{ij}$ for both $P$ and $P'$ and can be ignored. So we focus on Alice’s $dm$ messages from the $dm$ independent executions of OT. For $0 \le k \le dm$, let $D_k$ denote the distribution with the first $k$ messages from $D$ and the remaining messages from $D'$. Assume that there exists a distinguisher $C$ for $D$ and $D'$. A standard argument shows that $C$ can also distinguish $D_{k_0 - 1}$ and $D_{k_0}$ for some $k_0$. Note that Alice must select different elements from the pair in the $k_0$’th OT, as otherwise the two distributions are identical. Then one can break Chooser’s security in $OT^2_1$ when Sender has this input, because with Chooser’s messages for different choices replacing the $k_0$’th message of $D_{k_0 - 1}$, we get exactly $D_{k_0 - 1}$ and $D_{k_0}$, which can be distinguished by $C$. As $OT^2_1$ is assumed to be secure, $D$ and $D'$ are indistinguishable, and Alice is secure.

Next, we prove Bob’s security. Note that Bob sends $dm$ messages to Alice for the $dm$ independent executions of OT. Let $x_* \ne x'_*$, let $E$ and $E'$ be Bob’s respective message distributions, and let $E_k$ denote the distribution with the first $k$ messages from $E$ and the remaining messages from $E'$. Suppose a distinguisher for $E$ and $E'$ exists. Then it can also distinguish $E_{k_0 - 1}$ and $E_{k_0}$ for some $k_0$. The pairs in that $k_0$’th OT have the forms $(r, v + r)$ and $(r', v' + r')$, for some fixed $v$ and $v'$ and for random $r$ and $r'$. Alice’s polynomial is fixed, so which element to choose in that $k_0$’th OT is also fixed. Suppose Alice chooses the first one in that pair. Then according to Lemma 2.1, there is a fixed $r_0$ such that $E_{k_0 - 1}$ conditioned on Bob having $(r_0, v + r_0)$ and $E_{k_0}$ conditioned on Bob having $(r_0, v' + r_0)$ are distinguishable. As before, one can then distinguish Sender’s messages when Sender has $(r_0, v + r_0)$ and $(r_0, v' + r_0)$ respectively and Chooser selects the first element, which violates Sender’s security in $OT^2_1$. The case when Alice chooses the second one in that pair can be argued similarly, by noticing that the distribution $(r, v + r)$ and the distribution $(-v + r, r)$ are identical. As $OT^2_1$ is assumed to be secure, so is Bob.

Theorem 3.3. Protocol 1 is correct and secure when parties are semi-honest.

Note that only $dm$ invocations of $OT^2_1$ are required and they can be done concurrently. If $OT^2_1$ can be carried out in one round (e.g. [1]), Protocol 1 runs in one round. Also observe that if $OT^2_1$ can achieve perfect security for Chooser (e.g. [1]), then Protocol 1 is perfectly secure for Alice, in the information-theoretic sense.

3.2 The Second Protocol for OPE

The idea of our second protocol is for Alice to hide the random shares of her polynomial $P$ among other random polynomials, have Bob evaluate on them, and then select those values corresponding to the shares. Recall that $T = F^{d+1}$. For $P \in T$ and $R = (R_1, \ldots, R_n) \in T^n$, define the function $h_{R,P} : \{0, 1\}^n \to T$ as

$$h_{R,P}(\alpha) = P - \sum_{i \in [n]} \alpha_i R_i.$$

It is easy to check that for any $P \in T$, the class $H_P = \{h_{R,P} : R \in T^n\}$ satisfies the pair-wise independent property. Here is our second OPE protocol, which is also based on $OT^2_1$ only.

Protocol 2

1. Alice generates random $R \in T^n$ and $\alpha \in \{0, 1\}^n$ and sends $(R_1, \ldots, R_n, h_{R,P}(\alpha))$ to Bob. Let $R_{n+1} = h_{R,P}(\alpha)$ and $\alpha_{n+1} = 1$.

2. Bob generates random $r \in F^{n+1}$ and prepares $n + 1$ pairs $(r_i, R_i(x_*) + r_i)_{i \in [n+1]}$.

3. For pair $i$, Alice runs an $OT^2_1$ with Bob to get $r_i$ if $\alpha_i = 0$ and $R_i(x_*) + r_i$ otherwise.

4. Alice sends the sum of the $n + 1$ values to Bob. Bob subtracts $\sum_{i=1}^{n+1} r_i$ from it to get $P(x_*)$.
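The four steps of Protocol 2 can be simulated in the same style. Again this is a sketch under assumptions: `ot_1_of_2` is a hypothetical ideal stand-in for $OT^2_1$, polynomials are coefficient lists over $GF(p)$, and the choice $n = (d+1)m + 2\tau$ with a small default `tau` follows the parameter used in the security argument but is otherwise arbitrary here.

```python
import secrets

def ot_1_of_2(pair, choice):
    # Ideal stand-in for OT^2_1 (no actual cryptography).
    return pair[choice]

def evaluate(poly, x, p):
    # Horner evaluation of a coefficient list [c_0, ..., c_d] modulo p.
    acc = 0
    for c in reversed(poly):
        acc = (acc * x + c) % p
    return acc

def protocol2(P, x_star, p, tau=8):
    d = len(P) - 1
    m = (p - 1).bit_length()
    n = (d + 1) * m + 2 * tau

    # Step 1 (Alice): random R_1..R_n, random alpha, and
    # R_{n+1} = h_{R,P}(alpha) = P - sum_i alpha_i R_i, with alpha_{n+1} = 1.
    R = [[secrets.randbelow(p) for _ in range(d + 1)] for _ in range(n)]
    alpha = [secrets.randbelow(2) for _ in range(n)]
    h = [(P[c] - sum(alpha[i] * R[i][c] for i in range(n))) % p
         for c in range(d + 1)]
    R.append(h)
    alpha.append(1)

    # Step 2 (Bob): pairs (r_i, R_i(x_star) + r_i).
    r = [secrets.randbelow(p) for _ in range(n + 1)]
    pairs = [(r[i], (evaluate(R[i], x_star, p) + r[i]) % p)
             for i in range(n + 1)]

    # Step 3 (Alice): select with alpha_i through the OT stand-in.
    total = sum(ot_1_of_2(pairs[i], alpha[i]) for i in range(n + 1))

    # Step 4: Bob removes his noise from the sum Alice sends.
    return (total - sum(r)) % p
```

With the same example as before, `protocol2([3, 5, 7], 10, 101)` also yields 46.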

Theorem 3.4. Protocol 2 is correct and secure when parties are semi-honest.

Proof. The correctness is obvious because the sum Bob obtains in Step 4 is $\sum_{i=1}^{n+1} \alpha_i R_i(x_*) + \sum_{i=1}^{n+1} r_i = P(x_*) + \sum_{i=1}^{n+1} r_i$. Bob’s security proof is almost identical to that of Protocol 1, so we only prove Alice’s security here.

Fix any two polynomials $P, P' \in T$, let $D$ and $D'$ denote Alice’s respective message distributions, and let $E$ and $E'$ be Alice’s respective message distributions in Step 1. According to Lemma 2.1, it suffices to show $E \stackrel{s}{\equiv} E'$ and $(D|E = t) \stackrel{c}{\equiv} (D'|E' = t)$ for each $t$. Using an argument similar to that in Protocol 1, one can show $(D|E = t) \stackrel{c}{\equiv} (D'|E' = t)$ for each $t$, as otherwise one can break Chooser’s security in $OT^2_1$.

Note that the family $H_P$ satisfies the pair-wise independent property and $E$ is the distribution $(H_P, H_P(\{0, 1\}^n))$. By choosing $n = \log |T| + 2\tau = (d + 1)m + 2\tau$, the Leftover Hash Lemma guarantees that the distance between $E$ and the uniform distribution is at most $\sqrt{|T| 2^{-n}} = 2^{-\tau}$, which is negligible. Similarly, $E'$ also has a negligible distance to the uniform one. So $d(E, E')$ is negligible and $E \stackrel{s}{\equiv} E'$. According to Lemma 2.1, Alice is secure.

Note that there are $(n + 1) \log |T| = O(dm(dm + \tau))$ bits sent in Step 1, $O(dm + \tau)$ executions of $OT^2_1$ in Step 3, and $m$ bits sent in Step 4.

3.3 A Protocol for 3-Party OPE

Here we show how to remove the use of $OT^2_1$ with the help of a third party Clark. As a result, our protocol does not rely on any cryptographic assumption and is information-theoretically secure when no collusion exists. Again, we assume that Alice has a polynomial $P \in T$, Bob has $x_* \in F$, and only Bob learns $P(x_*)$. Now the security must also hold against Clark, so that the messages he receives altogether look completely random to him; i.e.,

• Clark cannot distinguish the uniform distribution from the joint distribution of messages he receives from Alice and Bob.

Here is the protocol.

Protocol 3

1. Bob sends random $(r_i)_{i \in [k]} \in F^k$ to Alice. He also sends $(x'_i = x_*^i + r_i)_{i \in [k]}$ to Clark.

2. Alice sends random $(s_i)_{0 \le i \le k} \in F^{k+1}$ to Bob. She also sends $a'_0 = a_0 + s_0 - \sum_{i \in [k]} a_i r_i$ and $(a'_i = a_i + s_i)_{i \in [k]}$ to Clark.

3. Clark sends $y = a'_0 + \sum_{i \in [k]} a'_i x'_i$ to Bob, and Bob gets $P(x_*) = y - (s_0 + \sum_{i \in [k]} x'_i s_i)$.

Theorem 3.5. Protocol 3 is correct and perfectly secure provided no collusion exists.

Proof. The correctness is easy to verify. What Alice or Clark receives is completely random. Bob receives random $(s_i)_{0 \le i \le k}$ in Step 2, and receives $P(x_*) + s_0 + \sum_{i \in [k]} (x_*^i + r_i) s_i$ in Step 3, so he sees the same distribution for any polynomial $P'$ with $P'(x_*) = P(x_*)$. Note that each party is secure even if others are malicious, as long as no collusion exists.
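The correctness claim for Protocol 3 can be checked by simulating the three parties in one process. The sketch below works over $GF(p)$ for a prime $p$; the function and variable names are illustrative assumptions, and the comments mark which party would hold each value.

```python
import secrets

def protocol3(coeffs, x_star, p):
    """Simulate Protocol 3 over GF(p). coeffs = [a_0, ..., a_k] is Alice's
    polynomial and x_star is Bob's input."""
    k = len(coeffs) - 1

    # Step 1 (Bob): r_i goes to Alice, x'_i = x_star^i + r_i goes to Clark.
    r = [secrets.randbelow(p) for _ in range(k)]
    x_masked = [(pow(x_star, i + 1, p) + r[i]) % p for i in range(k)]

    # Step 2 (Alice): s_0..s_k go to Bob; masked coefficients go to Clark.
    s = [secrets.randbelow(p) for _ in range(k + 1)]
    a0_masked = (coeffs[0] + s[0]
                 - sum(coeffs[i + 1] * r[i] for i in range(k))) % p
    a_masked = [(coeffs[i + 1] + s[i + 1]) % p for i in range(k)]

    # Step 3 (Clark): y = a'_0 + sum_i a'_i x'_i goes to Bob.
    y = (a0_masked + sum(a * x for a, x in zip(a_masked, x_masked))) % p

    # Bob unmasks: P(x_star) = y - (s_0 + sum_i x'_i s_i).
    return (y - s[0] - sum(x_masked[i] * s[i + 1] for i in range(k))) % p
```

For $P(x) = 3 + 5x + 7x^2$ over $GF(101)$ and $x_* = 10$, the result is again 46.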

3.4 Generalizations

It is not hard to see that all the protocols in this section can be easily extended to deal with multi-variate polynomials. They can also solve the Inner Product Problem: Alice has a vector $a \in F^n$ while Bob has a vector $x \in F^n$ and wants to learn the inner product $a \cdot x = \sum_{i \in [n]} a_i x_i$. The inner product function can be seen as a linear polynomial in $n$ variables, so the problem of oblivious inner product evaluation is just a special case of the problem of oblivious multi-variate polynomial evaluation. On the other hand, a multi-variate polynomial is just a sum of terms, with each term being a product of a coefficient, owned by one party, and a group of variables, owned by the other. If any inner product can be evaluated in an oblivious way, so can any multi-variate polynomial at any point. So these two problems are equivalent.

In all these problems, Alice and Bob have their own inputs and Bob gets the final result. Later we will see a variation with each input and output shared by the two parties. We call this computing with random shares. Let us use the Inner Product Problem as an example. Suppose that Alice has $u, v \in F^n$ and Bob has $u', v' \in F^n$. They want to compute the inner product of $u + u'$ and $v + v'$, and produce random shares, one for each party, that sum to the inner product. This generalization can be reduced to the original problem in the following way. Note that $(u + u') \cdot (v + v')$ is equal to

$$(u \cdot v) + (u \cdot v' + v \cdot u') + (u' \cdot v').$$

Now Alice generates a random $r \in F$ and prepares the $2(n + 1)$-dimensional vector

$$a = (-r + u \cdot v, u_1, \ldots, u_n, v_1, \ldots, v_n, 1),$$

while Bob prepares the $2(n + 1)$-dimensional vector

$$x = (1, v'_1, \ldots, v'_n, u'_1, \ldots, u'_n, u' \cdot v').$$

Bob can obtain $a \cdot x = -r + (u + u') \cdot (v + v')$ using a protocol for the original problem, and each party now holds a random share of the inner product $(u + u') \cdot (v + v')$. The variation for multi-variate polynomials can be handled similarly.
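The reduction to the original Inner Product Problem can be verified numerically. In the sketch below the oblivious inner product protocol is replaced by a plain dot product over $GF(p)$, which is an assumption made purely to check the algebra; the names `shared_inner_product` and `dot` are illustrative.

```python
import secrets

def dot(a, b, p):
    return sum(x * y for x, y in zip(a, b)) % p

def shared_inner_product(u, v, u_prime, v_prime, p):
    """Return (Alice's share, Bob's share) of (u + u') . (v + v') mod p."""
    r = secrets.randbelow(p)                          # Alice's random share
    a = [(-r + dot(u, v, p)) % p, *u, *v, 1]          # Alice's vector
    x = [1, *v_prime, *u_prime, dot(u_prime, v_prime, p)]   # Bob's vector
    # Stand-in for one run of an oblivious inner product protocol, after
    # which only Bob would learn a . x.
    bob_share = dot(a, x, p)
    return r, bob_share

p = 97
u, v = [1, 2, 3], [4, 5, 6]                    # held by Alice
u_prime, v_prime = [7, 8, 9], [10, 11, 12]     # held by Bob
alice_share, bob_share = shared_inner_product(u, v, u_prime, v_prime, p)
```

Here $(u + u') \cdot (v + v') = 488$, so the two shares sum to $488 \bmod 97 = 3$.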

Chapter 4 Oblivious Polynomial Evaluation for Floating-Point Numbers

4.1 Floating-Point Number System

We first give the definition of a floating-point number system.

Definition 5. A floating-point number is a rational number $b = \pm \sum_{j=1}^{2m} b_j 2^{m-j}$ for some $m$, with $b_j \in \{0, 1\}$. Let $\hat{m}$ denote the floating-point number system containing all such numbers together with standard arithmetic operations.

Such a floating-point number can be represented by 2m + 1 bits: m bits for the fractional part, m bits for the integral part, and 1 bit for the sign. Unlike finite fields, operations in a floating-point number system are not closed and errors may occur because of the limitation of finite precision. An underflow occurs when the produced number needs more bits for the fractional part, and a rounding takes place to convert it into the nearest number in the floating-point number system. An overflow occurs when the produced number needs more bits for the integral part, and the result is left undefined.

When we want to hide an element $v$ of a finite field $F$ in our previous protocols, we generate a pair $(r, r + v)$ with a random $r \in F$, so that any element of the pair itself looks completely random. There is a slight complication for floating-point numbers, but it can be easily fixed.

Lemma 4.1. Suppose $v, v' \in \hat{\ell}$ for some $\ell$ and suppose $k \ge \ell + \tau + 2$. The distributions of $v + r$ and $v' + r'$ with random $r, r' \in \hat{k}$ have a negligible distance.

Proof. The distance is at most $\frac{2(2^{\ell} - 2^{-\ell})}{2^{k} - 2^{-k}} < \frac{2^{\ell+1}}{2^{k-1}} \le 2^{-\tau}$.
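Lemma 4.1 can be checked exactly for tiny parameters. The sketch below assumes that a "random" element of $\hat{k}$ is uniform over the distinct values of the system (the text does not spell this out), represents numbers as exact fractions, and uses illustrative function names.

```python
from collections import Counter
from fractions import Fraction

def system(k):
    """All values of the system k-hat: multiples of 2^-k with absolute
    value at most 2^k - 2^-k."""
    top = 4**k - 1
    return [Fraction(i, 2**k) for i in range(-top, top + 1)]

def masked_distribution(v, k):
    # Counts of v + r for r ranging over all values of k-hat.
    values = system(k)
    return Counter(v + r for r in values), len(values)

def distance(v, v_prime, k):
    """Statistical distance between v + r and v' + r' for uniform r, r'."""
    D, size = masked_distribution(v, k)
    D_prime, _ = masked_distribution(v_prime, k)
    support = set(D) | set(D_prime)
    return sum(abs(Fraction(D[t] - D_prime[t], size)) for t in support) / 2

# Extreme values of the system with l = 1, masked in the system with k = 4,
# so that k = l + tau + 2 holds with tau = 1.
worst_case = distance(Fraction(3, 2), Fraction(-3, 2), 4)
```

For these parameters the distance is $48/511$, well below the bound $2^{-\tau} = 1/2$.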

4.2 An OPE Protocol for Floating-Point Numbers

Assume Alice holds $P(x) = \sum_{i=0}^{d} a_i x^i$, where $a_i \in \hat{m}$, and Bob holds $x_* \in \hat{m}$. For each $i$, let $|a_i| = \sum_{j=1}^{2m} a_{ij} 2^{m-j}$, with $a_{ij} \in \{0, 1\}$. All our previous protocols can be easily modified for floating-point numbers, and here we only demonstrate one, which comes from Protocol 1. Here we use $OT^3_1$, which can be implemented by 2 executions of $OT^2_1$ [10]. Let $k = (d + 1)m + \tau + 2$ and $n = k + \log(2dm)$. Parties agree on the floating-point system $\hat{k}$ for random numbers, and the floating-point system $\hat{n}$ for all arithmetic, so that no underflow or overflow will ever occur. Let $v_{ij} = 2^{m-j} x_*^i$.

Protocol 4

1. Bob prepares $2dm$ 3-tuples $(r_{ij}, v_{ij} + r_{ij}, -v_{ij} + r_{ij})_{i \in [d], j \in [2m]}$, with each $r_{ij}$ chosen randomly from $\hat{k}$.

2. For each 3-tuple $(r_{ij}, v_{ij} + r_{ij}, -v_{ij} + r_{ij})$, Alice runs an $OT^3_1$ with Bob to get $r_{ij}$ if $a_{ij} = 0$, $v_{ij} + r_{ij}$ if $a_{ij} = 1 \wedge a_i > 0$, and $-v_{ij} + r_{ij}$ otherwise.

3. Alice sends to Bob the sum of $a_0$ and those $2dm$ values she got. Bob subtracts $\sum_{i,j} r_{ij}$ from it to obtain $P(x_*)$.
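Protocol 4 can be traced with exact rational arithmetic. The sketch below makes several assumptions: `ot_1_of_3` is a hypothetical ideal stand-in for $OT^3_1$, Python's `Fraction` plays the role of the error-free system $\hat{n}$ (no rounding is emulated), random masks are drawn uniformly from the values of $\hat{k}$, and all names are illustrative.

```python
import secrets
from fractions import Fraction

def ot_1_of_3(triple, choice):
    # Ideal stand-in for OT^3_1 (no actual cryptography).
    return triple[choice]

def random_mask(k):
    # Uniform value of the system k-hat: a multiple of 2^-k with absolute
    # value at most 2^k - 2^-k.
    top = 4**k - 1
    return Fraction(secrets.randbelow(2 * top + 1) - top, 2**k)

def protocol4(coeffs, x_star, m, tau=8):
    """coeffs = [a_0, ..., a_d] and x_star are Fractions in the system m-hat."""
    d = len(coeffs) - 1
    k = (d + 1) * m + tau + 2

    # Step 1 (Bob): 3-tuples (r, v + r, -v + r) with v_ij = 2^(m-j) x_star^i.
    triples = {}
    for i in range(1, d + 1):
        for j in range(1, 2 * m + 1):
            r = random_mask(k)
            v = Fraction(2) ** (m - j) * x_star ** i
            triples[i, j] = (r, v + r, -v + r)

    # Step 2 (Alice): choose by the bit a_ij of |a_i| and the sign of a_i.
    total = coeffs[0]
    for (i, j), triple in triples.items():
        magnitude = int(abs(coeffs[i]) * 2**m)     # |a_i| scaled to an integer
        a_ij = (magnitude >> (2 * m - j)) & 1
        choice = 0 if a_ij == 0 else (1 if coeffs[i] > 0 else 2)
        total += ot_1_of_3(triple, choice)

    # Step 3: Bob subtracts the sum of his masks.
    return total - sum(t[0] for t in triples.values())
```

For $m = 2$, $P(x) = \frac{1}{2} - \frac{3}{4}x + \frac{5}{4}x^2$ and $x_* = \frac{3}{2}$, the call returns $\frac{35}{16}$.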

Note that all the arithmetic is carried out in the system $\hat{n}$, which is large enough to guarantee that no error ever occurs. It is then not hard to verify the correctness of this protocol. Its security is guaranteed by the following.

Lemma 4.2. Protocol 4 is secure when parties are semi-honest.

Proof. Alice’s security proof is almost identical to that of Protocol 1, so we only discuss Bob’s security here. Let $x_*, x'_* \in \hat{m}$, let $E$ and $E'$ be Bob’s respective message distributions, and let $E_k$ denote the distribution with the first $k$ messages from $E$ and the remaining messages from $E'$. Suppose $E_{k_0 - 1}$ and $E_{k_0}$ can be distinguished, for some $k_0$, and the 3-tuples in that $k_0$’th OT have the forms $(r, v + r, -v + r)$ and $(r', v' + r', -v' + r')$, for random $r$ and $r'$ and for some fixed $v$ and $v'$. Let $\ell = (d + 1)m$ and note that $v, v' \in \hat{\ell}$ because $2^{m-j} x^i \in \hat{\ell}$ for any $x \in \hat{m}$, $i \in [d]$ and $j \in [2m]$. Then according to Lemma 4.1, no matter which element Alice chooses, the two distributions of that element have a negligible distance. Using Lemma 2.1 and adapting Bob’s security proof for Protocol 1, one can show that $E$ and $E'$ are indistinguishable.

Note that the generalizations discussed in Section 3.4 also hold for floating-point numbers. That is, Protocol 4 can also be easily modified to deal with multi-variate polynomials and solve the Inner Product Problem, for floating-point numbers. So we have the following theorem.

Theorem 4.3. Oblivious protocols exist for the problem of multi-variate polynomial evaluation and Inner Product Problem over floating-point numbers.

Chapter 5 Oblivious Neural Learning

5.1 Neural Computing and Learning

There are several variants of the neural network model. We only demonstrate our result via 2-layer feedforward neural networks with back-propagation learning. Other variants can be handled similarly.

A 2-layer feedforward neural network has an internal layer of $J$ nodes, with the $j$’th node having a weight vector $u_j = (u_{j1}, \ldots, u_{jI})$, and an output layer of $K$ nodes, with the $k$’th node having a weight vector $w_k = (w_{k1}, \ldots, w_{kJ})$. Each node is associated with an activation function $f(z) = a \tanh(bz)$ (the hyperbolic tangent function). The network takes an input vector $x = (x_1, \ldots, x_I)$ and produces an output vector $o = (o_1, \ldots, o_K)$ in the following way.

Neural Computing

1. Compute $y_j = f(u_j \cdot x)$, for $j \in [J]$. Let $y = (y_1, \ldots, y_J)$.

2. Compute $o_k = f(w_k \cdot y)$, for $k \in [K]$.
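The two steps of Neural Computing amount to a plain forward pass. The sketch below is a non-oblivious reference version with ordinary floats; the function names and the default values $a = b = 1$ are assumptions for illustration.

```python
import math

def f(z, a, b):
    # Activation function f(z) = a * tanh(b * z).
    return a * math.tanh(b * z)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def neural_computing(U, W, x, a=1.0, b=1.0):
    """U[j] is the weight vector u_j of internal node j, W[k] is the weight
    vector w_k of output node k. Returns (y, o)."""
    y = [f(dot(u_j, x), a, b) for u_j in U]     # Step 1
    o = [f(dot(w_k, y), a, b) for w_k in W]     # Step 2
    return y, o

# Hypothetical network with I = 2 inputs, J = 2 internal nodes, K = 1 output.
U = [[0.1, -0.2], [0.3, 0.4]]
W = [[0.5, -0.5]]
y, o = neural_computing(U, W, [1.0, 0.5])
```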

The output vector o may not be correct, and a learning algorithm adjusts the weights according to how the vector o differs from the correct output vector d. The pair (x, d) constitutes a training example. The back-propagation learning (BP-Learning) algorithm adjusts the weights in the following way, with γ being some learning constant.

BP-Learning

1. Compute $\delta_{ok} = \frac{b}{a} (d_k - o_k)(a^2 - o_k^2)$, for $k \in [K]$.

2. Compute $\delta_{yj} = \frac{b}{a} (a^2 - y_j^2) \sum_{k=1}^{K} \delta_{ok} w_{kj}$, for $j \in [J]$.

3. Update $w_{kj} = w_{kj} + \gamma \delta_{ok} y_j$, for $k \in [K]$, $j \in [J]$.

4. Update $u_{ji} = u_{ji} + \gamma \delta_{yj} x_i$, for $i \in [I]$, $j \in [J]$.

The process above can be repeated for a set of training examples.
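One BP-Learning step can be written out directly from the four update rules. The sketch below is a non-oblivious reference version; it assumes the factor in Steps 1 and 2 is $b/a$, which matches the derivative $f'(z) = \frac{b}{a}(a^2 - f(z)^2)$ of $f(z) = a \tanh(bz)$, and all names and the sample values are illustrative.

```python
import math

def forward(U, W, x, a, b):
    y = [a * math.tanh(b * sum(u * xi for u, xi in zip(u_j, x))) for u_j in U]
    o = [a * math.tanh(b * sum(w * yj for w, yj in zip(w_k, y))) for w_k in W]
    return y, o

def bp_learning_step(U, W, x, d, a=1.0, b=1.0, gamma=0.1):
    """Apply one BP-Learning update for the training example (x, d) and
    return the new weight matrices (U, W)."""
    J, K = len(U), len(W)
    y, o = forward(U, W, x, a, b)
    # Step 1: output-layer error terms.
    delta_o = [(b / a) * (d[k] - o[k]) * (a**2 - o[k]**2) for k in range(K)]
    # Step 2: internal-layer error terms, using the weights before the update.
    delta_y = [(b / a) * (a**2 - y[j]**2)
               * sum(delta_o[k] * W[k][j] for k in range(K)) for j in range(J)]
    # Step 3: update output-layer weights.
    W_new = [[W[k][j] + gamma * delta_o[k] * y[j] for j in range(J)]
             for k in range(K)]
    # Step 4: update internal-layer weights.
    U_new = [[U[j][i] + gamma * delta_y[j] * x[i] for i in range(len(x))]
             for j in range(J)]
    return U_new, W_new

# Hypothetical example: one training pair for a 2-2-1 network.
U0, W0 = [[0.1, -0.2], [0.3, 0.4]], [[0.5, -0.5]]
x0, d0 = [1.0, 0.5], [0.8]
U1, W1 = bp_learning_step(U0, W0, x0, d0)
```

On this example a single step moves the output closer to the target $d = 0.8$.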

5.2 Oblivious Neural Computing and Learning

Now we want to carry out neural computing and neural learning in an oblivious way between two parties, Alice and Bob. Oblivious neural computing can be defined in a way similar to oblivious polynomial evaluation, except with Alice’s polynomial replaced by a neural network. For oblivious neural learning, Bob has a set of training examples and wants to train Alice’s neural network so that Bob learns nothing about Alice’s neural network while Alice learns only what is implied by the weight changes. We need to be careful about Bob’s security, as Alice’s neural network has $IJ + JK$ weights and that many weight changes may reveal a lot to Alice. So we do not let Alice know the weight changes induced by each training example, and only let her get the overall weight changes after the training on all examples. Now a learning protocol is secure for Bob if Alice cannot distinguish two training sets that give the same overall weight changes. The larger the training set is, the less Alice knows about Bob’s training examples. This is not a problem, as in practice neural learning typically involves large training sets.

Another scenario is for Bob to keep random shares of those final weights, as long as he is willing to help Alice serve requests from other parties for oblivious neural computing. Later when another party wants to continue the training of Alice’s neural network, Bob only needs to help with his shares for the first training example, and his duty is off after that. Contrary to the previous scenario, Alice cannot learn anything about Bob’s training set in this way.

5.3 Oblivious Activation Function Evaluation

Here we discuss options for evaluating the activation function $f(z) = a \tanh(bz) = a\left(1 - \frac{2}{1 + e^{2bz}}\right)$ in an oblivious way. We will rely on a protocol for oblivious circuit evaluation [12, 5, 11], denoted as OCE, which is efficient for small circuits. Assume that Alice has $x$ while Bob has $y$, and they want to generate random shares of $f(x + y)$ for Alice and Bob. One way is to use an OCE directly, if one can accept that the circuit for $f$ is reasonably small. For cases allowing a large $b$, $f(z)$ is close to the threshold function, which has a very simple circuit, and again we can use OCE directly. Otherwise, we approximate $f$ in a piece-wise way by low-degree polynomials and then apply our OPE protocol to it, as described in the following. Note that $f$ is smooth, so there are intervals $I_0, I_1, \ldots, I_n$ and degree-$d$ polynomials $P_0, P_1, \ldots, P_n$ such that

$$f(z) \approx P_i(z) \text{ for } z \in I_i,$$

for some small $n$ and $d$.$^1$ Let $I$ be the function such that $I(z) = i$ for $z \in I_i$, which has a rather simple circuit and thus an efficient OCE protocol. Let $P_{i,x}(y) = P_i(x + y)$. Here is the oblivious protocol for evaluating the activation function.

Here is the oblivious protocol for evaluating the activation function.

Protocol 5

1. Alice generates random $r_1$. Bob runs OCE with Alice to get $r_2 = I(x + y) - r_1$.

2. Alice generates random $s_1$ and prepares the polynomial

$$Q_x(a, y) = -s_1 + \sum_{i=0}^{n} \frac{\prod_{j \ne i} (a + r_1 - j)}{\prod_{j \ne i} (i - j)} P_{i,x}(y).$$

Bob runs OPE with Alice for $s_2 = Q_x(r_2, y)$.

Since $r_1 + r_2 = I(x + y)$, the fraction in $Q_x(r_2, y)$ equals 1 for $i = I(x + y)$ and 0 for every other $i$. So Alice has $s_1$ and Bob has $s_2$ with $s_1 + s_2 = P_i(x + y)$ for $x + y \in I_i$, and the protocol is correct. The security proof is again similar to previous ones.
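The selector polynomial of Protocol 5 can be exercised with exact arithmetic. In the sketch below the OCE and OPE sub-protocols are replaced by direct computation, and the three-piece approximation (breakpoints at $-1$ and $1$, pieces $-1$, $z$, $1$) is a hypothetical placeholder rather than the approximation used in the paper; random values are drawn from a small integer range only for illustration.

```python
import secrets
from fractions import Fraction

# Hypothetical pieces: I_0 = (-inf, -1), I_1 = [-1, 1), I_2 = [1, inf).
BREAKS = [Fraction(-1), Fraction(1)]
PIECES = [lambda z: Fraction(-1), lambda z: z, lambda z: Fraction(1)]

def interval_index(z):
    # The function I(z): index i of the interval I_i containing z.
    return sum(1 for t in BREAKS if z >= t)

def Q(a_val, y, x, r1, s1):
    """Alice's polynomial Q_x(a, y), with Lagrange selectors in a + r1."""
    n = len(PIECES) - 1
    total = -s1
    for i in range(n + 1):
        num = den = Fraction(1)
        for j in range(n + 1):
            if j != i:
                num *= a_val + r1 - j
                den *= i - j
        total += num / den * PIECES[i](x + y)
    return total

def protocol5(x, y):
    """Return shares (s1, s2) with s1 + s2 = P_i(x + y) for x + y in I_i."""
    r1 = Fraction(secrets.randbelow(1000))     # Alice's random r_1
    r2 = interval_index(x + y) - r1            # stand-in for the OCE step
    s1 = Fraction(secrets.randbelow(1000))     # Alice's random s_1
    s2 = Q(r2, y, x, r1, s1)                   # stand-in for the OPE step
    return s1, s2
```

For example, with $x = \frac{1}{2}$ and $y = \frac{1}{4}$ the sum $x + y$ lies in $I_1$, so the shares add up to $\frac{3}{4}$.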

5.4 Oblivious Neural Algorithms

First we need to determine the possible range of floating-point numbers that can ever occur during computation. Then we can determine an appropriate floating-point number system $\hat{k}$ for random numbers and a system $\hat{n}$ for error-free arithmetic. Here is the protocol for oblivious neural computing, which uses the OCE and OPE protocols with random shares. Assume that Alice has a two-layer neural network and Bob has an input $x$.

Protocol 6

1. For $j \in [J]$, Alice and Bob compute random shares $s_{j1}, s_{j2}$ of the inner product $u_j \cdot x$, and then compute random shares $y_{j1}, y_{j2}$ of $y_j = f(s_{j1} + s_{j2})$. Let $y = (y_1, \ldots, y_J)$.

2. For $k \in [K]$, Alice and Bob compute random shares $t_{k1}, t_{k2}$ of $w_k \cdot y$, and then compute random shares $o_{k1}, o_{k2}$ of $o_k = f(t_{k1} + t_{k2})$.

¹ For example, the error can be bounded by $2 \times 10^{-6}$ with $n = 9$, $d = 9$, $\ell_0 = -7$, $\ell_8 = 7$, $P_0 = -1$,


At the end, Alice can send her shares $o_{k1}$ to Bob for him to obtain the output vector $o$. This is not needed for oblivious learning. Note that the protocol still works when each weight vector is shared by the two parties instead of owned by Alice, which is the case in oblivious learning.
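The data flow of Protocol 6 can be sketched with additive shares computed in the clear. This is only an illustration: each call to `share` stands in for an OPE or OCE invocation that would hand the two parties random shares, so nothing here is oblivious. The helper names, the share range, the toy weights, and $a = b = 1$ are hypothetical.

```python
import math
import random

A, B = 1.0, 1.0  # hypothetical activation parameters a and b


def f(z):
    """Activation function f(z) = a * tanh(b * z)."""
    return A * math.tanh(B * z)


def share(value, rng):
    """Split value into two additive shares (stand-in for a protocol output)."""
    r = rng.uniform(-100.0, 100.0)
    return r, value - r


def layer_shared(weights, in_shares, rng):
    """One layer: shares of each inner product, then shares of f applied to it."""
    out = []
    for w in weights:
        inner = sum(wi * (p + q) for wi, (p, q) in zip(w, in_shares))
        s1, s2 = share(inner, rng)           # shares s_{j1}, s_{j2} (or t_{k1}, t_{k2})
        out.append(share(f(s1 + s2), rng))   # shares y_{j1}, y_{j2} (or o_{k1}, o_{k2})
    return out


def protocol6_in_the_clear(U, W, x, rng):
    """Steps 1 and 2: hidden layer, then output layer, always on shares."""
    x_shares = [(0.0, xi) for xi in x]  # Bob holds the whole input
    y_shares = layer_shared(U, x_shares, rng)
    return layer_shared(W, y_shares, rng)


def forward_plain(U, W, x):
    """Ordinary forward pass, used only to check the shared computation."""
    y = [f(sum(u * xi for u, xi in zip(u_j, x))) for u_j in U]
    return [f(sum(w * yj for w, yj in zip(w_k, y))) for w_k in W]


# Toy two-layer network with J = 2 hidden units and K = 1 output unit.
U = [[0.5, -0.3, 0.8], [-0.2, 0.1, 0.4]]
W = [[0.7, -0.6]]
x = [1.0, 2.0, -1.0]

o_shares = protocol6_in_the_clear(U, W, x, random.Random(0))
o = [o1 + o2 for o1, o2 in o_shares]  # what Bob learns if Alice sends her shares
print(o, forward_plain(U, W, x))
```

Reconstructing shares in floating point introduces small rounding errors, which is why the text fixes number systems $\hat{k}$ and $\hat{n}$ in advance; the sketch simply tolerates them.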

Theorem 5.1. Oblivious neural computing can be achieved by Protocol 6.

Proof. The correctness is easy to verify. The security relies on the security of the protocol for oblivious polynomial evaluation with random shares and the protocol for oblivious evaluation of the activation function. Breaking the security of Protocol 6 gives a way to break one of these protocols, each of which has been shown to be secure.

An oblivious neural learning protocol can be derived similarly. Now only the protocol for OPE with random shares is needed.

Protocol 7

1. Alice and Bob compute random shares of each $\delta_{ok} = \frac{b}{a}(d_k - o_k)(a^2 - o_k^2)$.

2. Alice and Bob compute random shares of each $\delta_{yj} = \frac{b}{a}(a^2 - y_j^2) \sum_{k=1}^{K} \delta_{ok} w_{kj}$.

3. Alice and Bob compute random shares of each $w_{kj} = w_{kj} + \gamma \delta_{ok} y_j$.

4. Alice and Bob compute random shares of each $u_{ji} = u_{ji} + \gamma \delta_{yj} x_i$.
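The four update rules of Protocol 7 can be written out as a single training step in the clear, without any sharing. The factor $\frac{b}{a}(a^2 - o^2)$ is the derivative of $f(z) = a \tanh(bz)$ expressed through its output $o = f(z)$. The sketch assumes the usual squared-error objective $E = \frac{1}{2} \sum_k (d_k - o_k)^2$, which this section does not restate; the function names, the toy network, and the learning rate are hypothetical.

```python
import math

A, B = 1.0, 1.0  # hypothetical activation parameters a and b


def f(z):
    """Activation function f(z) = a * tanh(b * z)."""
    return A * math.tanh(B * z)


def forward(U, W, x):
    """Forward pass: hidden outputs y and network outputs o."""
    y = [f(sum(u * xi for u, xi in zip(u_j, x))) for u_j in U]
    o = [f(sum(w * yj for w, yj in zip(w_k, y))) for w_k in W]
    return y, o


def train_step(U, W, x, d, gamma):
    """One application of steps 1 to 4 for a training example (x, d)."""
    y, o = forward(U, W, x)
    # Step 1: output-layer error signals.
    delta_o = [(B / A) * (dk - ok) * (A * A - ok * ok) for dk, ok in zip(d, o)]
    # Step 2: hidden-layer error signals, using the weights before the update.
    delta_y = [(B / A) * (A * A - yj * yj)
               * sum(delta_o[k] * W[k][j] for k in range(len(W)))
               for j, yj in enumerate(y)]
    # Step 3: update the output-layer weights.
    W_new = [[W[k][j] + gamma * delta_o[k] * y[j] for j in range(len(y))]
             for k in range(len(W))]
    # Step 4: update the hidden-layer weights.
    U_new = [[U[j][i] + gamma * delta_y[j] * x[i] for i in range(len(x))]
             for j in range(len(U))]
    return U_new, W_new


def squared_error(U, W, x, d):
    """Assumed objective E = 1/2 * sum_k (d_k - o_k)^2."""
    _, o = forward(U, W, x)
    return 0.5 * sum((dk - ok) ** 2 for dk, ok in zip(d, o))


U = [[0.5, -0.3, 0.8], [-0.2, 0.1, 0.4]]
W = [[0.7, -0.6]]
x, d, gamma = [1.0, 2.0, -1.0], [0.5], 0.1
U1, W1 = train_step(U, W, x, d, gamma)
print(squared_error(U, W, x, d), squared_error(U1, W1, x, d))
```

In the oblivious version each product in these formulas is a low-degree polynomial in values held as shares, which is why only OPE with random shares is needed.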

The learning process can be repeated for a set of training examples. At the end of the whole process, Bob reveals his shares of the weights obtained in the last iteration, and Alice derives the resulting neural network. The correctness is easy to verify. The security can be proved in the same way as before. Now Alice cannot distinguish among training sets that give the same overall weight changes. So we have the following theorem.

Theorem 5.2. Oblivious neural learning can be achieved by the combination of Protocol 6 and Protocol 7.

As discussed before, an alternative scenario is for Bob not to give his final shares to Alice, but instead to help Alice with her future tasks. In this way, Bob’s training set is secure and Alice cannot learn anything about it.


Appendix A: Proof of Lemma 2.2

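Let $\ell = |H||S_2|$. We know from Cauchy-Schwarz that $\sum_{h,v} \left| P_{g,u}[(g, g(u)) = (h, v)] - 1/\ell \right|$ is at most

$$\begin{aligned}
\sqrt{\ell} \sqrt{\sum_{h,v} \left( P_{g,u}[(g, g(u)) = (h, v)] - 1/\ell \right)^2}
&= \sqrt{\ell \sum_{h,v} P_{g,u}[(g, g(u)) = (h, v)]^2 - 2 + 1} \\
&= \sqrt{\ell \, P_{h,h',u,u'}[(h, h(u)) = (h', h'(u'))] - 1} \\
&= \sqrt{\ell \, P_{h,h'}[h = h'] \, P_{h,u,u'}[h(u) = h(u')] - 1} \\
&\le \sqrt{|S_2| \left( 1/|S_1| + 1/|S_2| \right) - 1} \\
&= \sqrt{|S_2|/|S_1|},
\end{aligned}$$

where the inequality is because

$$\begin{aligned}
P_{h,u,u'}[h(u) = h(u')]
&\le P_{u,u'}[u = u'] + P_{h,u,u'}[h(u) = h(u') \mid u \neq u'] \\
&= 1/|S_1| + 1/|S_2|.
\end{aligned}$$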
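The step that produces the terms $-2 + 1$ under the square root can be spelled out as follows. Here $p_{h,v}$ is shorthand (not used in the original) for $P_{g,u}[(g, g(u)) = (h, v)]$, and the expansion assumes, as the proof suggests, that $(h, v)$ ranges over all $\ell = |H||S_2|$ pairs, so that the $p_{h,v}$ sum to 1.

$$\begin{aligned}
\ell \sum_{h,v} \left( p_{h,v} - \frac{1}{\ell} \right)^2
&= \ell \sum_{h,v} p_{h,v}^2 - 2 \sum_{h,v} p_{h,v} + \ell \cdot \ell \cdot \frac{1}{\ell^2} \\
&= \ell \sum_{h,v} p_{h,v}^2 - 2 + 1.
\end{aligned}$$

The next line of the proof then uses the fact that $\sum_{h,v} p_{h,v}^2$ is the probability that two independent samples $(h, h(u))$ and $(h', h'(u'))$ coincide, and the factor $\ell \, P_{h,h'}[h = h'] = |H||S_2| \cdot \frac{1}{|H|} = |S_2|$ gives the bound in terms of $|S_1|$ and $|S_2|$.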


Bibliography

[1] M. Bellare and S. Micali, Non-interactive oblivious transfer and applications. CRYPTO 1989, 547-557.

[2] D. Boneh, Decision Diffie-Hellman problem. Algorithmic Number Theory 1998, 48-63.

[3] D. Bleichenbacher and P. Nguyen, Noisy polynomial interpolation and noisy Chinese remaindering. EUROCRYPT 2000, 53-69.

[4] Niv Gilboa, Two party RSA key generation. CRYPTO 1999, 116-129.

[5] O. Goldreich, S. Micali, and A. Wigderson, How to play any mental game or a completeness theorem for protocols with honest majority. STOC 1987, 218-229.

[6] J. Håstad, R. Impagliazzo, L. Levin, and M. Luby, Construction of a pseudo-random generator from any one-way function. SIAM Journal on Computing 28(4), 1364-1396 (1999).

[7] R. Impagliazzo and D. Zuckerman, How to recycle random bits. FOCS 1989, 248-253.

[8] Y. Ishai and E. Kushilevitz, Randomizing polynomials: a new representation with applications to round-efficient secure computation. STOC 2000.

[9] Y. Lindell and B. Pinkas, Privacy preserving data mining. CRYPTO 2000, 36-54.

[10] M. Naor and B. Pinkas, Oblivious transfer and polynomial evaluation. STOC 1999, 245-254.

[11] T. Sander, A. Young, and M. Yung, Non-interactive CryptoComputing for $NC^1$. FOCS 1999, 554-567.

[12] A. C. Yao, How to generate and exchange secrets (Extended Abstract). FOCS 1986, 162-167.

[13] J. M. Zurada, Introduction to artificial neural systems. PWS Publishing, 1994.
