
Oblivious transfer (OT) protocols are closely related to PIR. A communication-efficient oblivious transfer protocol can be used to implement PIR, and a PIR protocol that additionally limits the information the user obtains can be used to implement oblivious transfer. Naor and Pinkas [34] subsequently showed how to turn any PIR protocol into a 1-out-of-n oblivious transfer protocol $OT^n_1$ with one invocation of a single-database PIR protocol and a logarithmic number of invocations of a 1-out-of-2 oblivious transfer protocol $OT^2_1$. Di Crescenzo, Malkin, and Ostrovsky [16] showed that any oblivious transfer protocol can be constructed entirely from invocations of a PIR protocol.

Single-server PIR is close to the notion of oblivious transfer. The two concepts differ mainly in two respects:

1. PIR protocols pay great attention to communication cost: if the communication complexity is too high, a PIR protocol is no more efficient than downloading the entire database. Oblivious transfer protocols, in contrast, impose no requirement on communication complexity.

2. Oblivious transfer protocols protect the privacy of both the user and the server: the server cannot learn which item the user retrieved, and the user cannot obtain any information beyond the item he wants to retrieve. PIR protocols protect only the user: the server cannot learn which item the user retrieved, but nothing prevents the user from obtaining other information from the server.

Although PIR and oblivious transfer are very similar, they focus on different goals, and so their applications differ as well.

3.5 Applications

To illustrate the applications of PIR, let us first picture a scenario.

Online services are now highly developed. Many websites collect users' behavioral data in order to show the most suitable advertisements and search results. Suppose you are a consumer who uses a shopping website to buy basketball shoes, suits, and basketball cards. One day you get a front-row ticket to an NBA game, and you think this is a rare opportunity to watch a game from so close. So you decide to buy a professional camera to take pictures, and you search for cameras on the website. The algorithm behind the website detects that you have never searched for a camera before and concludes that you are not familiar with cameras. It then recommends a more expensive camera in the search results, so you spend more money because of what you searched for in the past. The same can happen on a reservation website: if you search for a room in a place you have never searched for before, the website may recommend a more expensive room there. PIR can solve this problem.

PIR also has applications [1, 6, 36] in the following problem domains:

1. E-commerce: As mentioned above, a supplier can adjust the product selection or even the price to make more profit, and can also sell search records to advertising companies. Using PIR, consumers can privately retrieve results, and a user's advertising client can privately retrieve advertisements based on a profile built online, with all data cached locally. Content providers that display advertisements to users can be charged in a privacy-preserving manner, so that the advertising network does not learn the users' interests.

2. Real-time stock market: An investor in the stock market needs to follow the latest market information and price changes at all times. If the server manager learns every decision a particular investor makes, he can exploit that information. The investor may therefore prefer a platform that reveals nothing about the stocks he is interested in. Investors can then make decisions more safely, without their decisions being affected by leaked information.

3. Biodatabase: With the advancement of technology, more and more biometric technologies, such as DNA identification and fingerprint identification, are entering our lives. Suppose a pharmaceutical organization purchases specific genomic sequence information from a public DNA database [8]. It needs this information to produce new medicines, so the choice of sequences may be a trade secret. If a competing organization learns which genomic sequences were retrieved, it could follow the clues

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

to figure out which new medicine is being studied, and might release a competing product first. PIR can prevent this situation.

4. Patent databases: When applying for a patent on a new invention, the inventor must search the patent database to make sure that earlier patents do not overlap substantially with his invention. He wants to perform this search without his search terms being kept in the query log of the patent database. Current patent database systems allow curious or malicious database administrators to infer the user's interests directly from the query log, or from the queries in real time as they execute.

In all the applications above, users do not want to reveal the sensitive content of their queries to the server or to the public. PIR addresses this problem. Because PIR pays great attention to communication cost, it can also protect the privacy of real-time queries. PIR will therefore become increasingly important in many different areas.

4 Group Homomorphic Encryption based PIR Protocols

4.1 Group Homomorphic Encryption based PIR Protocols

In this chapter, we recall the group homomorphic encryption based PIR protocols that were inductively introduced in [40]. Here, we let HEPKE = (KeyGen, Enc, Dec) be an IND-CPA secure group homomorphic encryption.

One-Dimensional Group Homomorphic Encryption based PIR

In the one-dimensional setting, $S$ has an $n$-element data set $X = \{x_i\}_{i=1}^{n}$, and $U$ wants to learn the value $x_i$, where $1 \le i \le n$. The one-dimensional group homomorphic encryption based PIR protocol is as follows:

1. Query Generation (QG):

$U$ first chooses an element $p \in G_1$ which satisfies $\mathrm{ord}(p) > N$, and then generates a query array $Q = \{q_k\}_{k=1}^{n}$, where

$$q_k = \begin{cases} \mathrm{Enc}(p), & \text{if } k = i, \\ \mathrm{Enc}(ID_{G_1}), & \text{otherwise,} \end{cases} \qquad q_k \in G_2.$$

Finally, $U$ sends $Q$ to $S$.

2. Response Generation (RG):

After receiving $Q$, $S$ computes

$$R = \sum_{k=1}^{n} x_k \odot q_k$$

and sends it back.

4.2 Example of Homomorphic Encryption based PIR Protocols

In this section, we give a toy example to show how the group homomorphic encryption based PIR protocol works. We assume $S$ has a one-dimensional database that stores the following data:

DB = […]

In particular, the identity of $G_1$ is 0 (i.e., $ID_{G_1} = 0$). Now, suppose $U$ wants to retrieve the value of the second element. $U$ interacts with $S$ as follows.

1. Query Generation (QG):

$U$ first picks $k = 2 \in G_1$, and then generates a query array $Q = [\mathrm{Enc}(0), \mathrm{Enc}(2), \mathrm{Enc}(0), \mathrm{Enc}(0)]$. $U$ sends $Q$ to $S$.

2. Response Generation (RG):

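$S$ computes $R = \sum_{k=1}^{4} x_k \odot q_k$ and sends it back to $U$.

3. Response Retrieval (RR):

$U$ first computes $\mathrm{Dec}(R) = 8$ and then obtains the second element by computing:

$$8 \times 2^{-1} = 4.$$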
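The three steps of this toy example can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: it uses a toy Paillier instance with the common simplification $g = n + 1$ and the tiny primes 101 and 113, and the database values other than the second element (which the example gives as 4) are placeholders. The function names are hypothetical. In Paillier, the operation $x_k \odot q_k$ becomes a ciphertext exponentiation and the sum becomes a ciphertext product.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation, assuming the simplification g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu = L(g^lam mod n^2)^(-1) mod n, with L(u) = (u - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def enc(pk, m):
    n, g = pk
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def dec(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def pir_query(pk, size, index, k):
    """QG: Enc(k) at the wanted position, Enc(0) (the identity) elsewhere."""
    return [enc(pk, k if i == index else 0) for i in range(size)]

def pir_response(pk, db, query):
    """RG: multiply q_i^{x_i}; the plaintext becomes sum(x_i * m_i)."""
    n, _ = pk
    resp = 1
    for x, q in zip(db, query):
        resp = (resp * pow(q, x, n * n)) % (n * n)
    return resp

def pir_retrieve(pk, sk, resp, k):
    """RR: decrypt, then undo the factor k (integer division in Z_m)."""
    return dec(pk, sk, resp) // k

pk, sk = keygen(101, 113)           # toy primes, far too small to be secure
db = [3, 4, 1, 2]                   # only the second value (4) is from the text
query = pir_query(pk, len(db), 1, 2)
resp = pir_response(pk, db, query)
print(dec(pk, sk, resp))            # 8
print(pir_retrieve(pk, sk, resp, 2))  # 4
```

The division in `pir_retrieve` is exact only because $x \cdot k$ stays below the plaintext modulus, which is the role played by the size conditions on the chosen element.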

4.3 Two-Dimensional Group Homomorphic Encryption based PIR

The one-dimensional scheme can be extended to a two-dimensional scheme. In the two-dimensional setting, similar to the one-dimensional setting, $S$ has $n \times n$ data elements $X = \{x_{i,j}\}_{i,j=1}^{n}$, and $U$ wants to learn the $n$ values $\{x_{i,j}\}_{j=1}^{n}$ of row $i$, where $1 \le i \le n$. The two-dimensional group homomorphic encryption based PIR protocol is as follows:

1. Query Generation (QG):

$U$ first chooses an element $p \in G_1$ which satisfies $\mathrm{ord}(p) > N$, and then generates a query array $Q = \{q_k\}_{k=1}^{n}$, where

$$q_k = \begin{cases} \mathrm{Enc}(p), & \text{if } k = i, \\ \mathrm{Enc}(ID_{G_1}), & \text{otherwise,} \end{cases} \qquad q_k \in G_2.$$

Finally, $U$ sends $Q$ to $S$.

2. Response Generation (RG):

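After receiving $Q$, $S$ computes

$$R_j = \sum_{k=1}^{n} x_{k,j} \odot q_k, \qquad 1 \le j \le n,$$

and sends $R_1, \dots, R_n$ back.

3. Response Retrieval (RR):

For each $j$, $U$ computes $\mathrm{Dec}(R_j) = x_{i,j} * p$ and then $x_{i,j} * p * p^{-1} = x_{i,j}$.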

Although we present the equation $x_{i,j} * p * p^{-1} = x_{i,j}$ here, how $U$ actually obtains $x_i$ depends on the properties of the groups $G_1, G_2$ used in the scheme. If $G_1, G_2$ are multiplicative groups, $U$ can compute $x_i = \log_p(x_i * p)$. If $G_1, G_2$ are additive groups, $U$ can easily compute $x_i$ by division, with the help of the generator of the group. Since an additive group is more efficient than a multiplicative group, in practice we use $G_1, G_2 = \langle Z_m, + \rangle$ as the implementation architecture for fast calculation of $x_i$, where $m$ is a large number.

4.4 Example of Two-Dimensional Group Homomorphic Encryption based PIR

In this section, we give a toy example to show how the two-dimensional group homomorphic encryption based PIR protocol works. We assume $S$ has a two-dimensional database that stores the following data:

DB = […]

In particular, the identity of $G_1$ is 0 (i.e., $ID_{G_1} = 0$). Now, suppose $U$ wants to retrieve the values of the second row. $U$ interacts with $S$ as follows.

1. Query Generation (QG):

$U$ first picks $k = 2 \in G_1$, and then generates a query array $Q = [\mathrm{Enc}(0), \mathrm{Enc}(2), \mathrm{Enc}(0), \mathrm{Enc}(0)]$. $U$ sends $Q$ to $S$.

2. Response Generation (RG):

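$S$ computes $R$ as below.

$$R = (R_1, R_2, R_3, R_4) = (\mathrm{Enc}(8), \mathrm{Enc}(4), \mathrm{Enc}(2), \mathrm{Enc}(6)).$$

3. Response Retrieval (RR):

$U$ first computes $\mathrm{Dec}(R) = (8, 4, 2, 6)$ and then obtains the data of the second row by computing:

$$(8, 4, 2, 6) \times 2^{-1} = (4, 2, 1, 3).$$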
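The column-wise computation of this example can be traced with the short Python sketch below. It is only a plaintext-level trace of the arithmetic that the homomorphic operations carry out under encryption, so by itself it offers no privacy. The second row $(4, 2, 1, 3)$ is the one recovered in the example; the other rows, the modulus `M`, and the function names are placeholders assumed for illustration.

```python
M = 1_000_003  # hypothetical modulus m of the additive group <Z_m, +>

def response_2d(db, query_plain, m=M):
    """Plaintext view of RG: R_j = sum_k x[k][j] * q[k] (mod m) for each column j."""
    n_cols = len(db[0])
    return [
        sum(row[j] * q for row, q in zip(db, query_plain)) % m
        for j in range(n_cols)
    ]

def retrieve_row(dec_r, k):
    """Plaintext view of RR: remove the factor k from every decrypted entry."""
    return [v // k for v in dec_r]

db = [
    [1, 0, 2, 1],  # placeholder row, not from the text
    [4, 2, 1, 3],  # second row, as recovered in the worked example
    [3, 1, 2, 1],  # placeholder row
    [2, 2, 0, 1],  # placeholder row
]
query_plain = [0, 2, 0, 0]  # plaintexts behind [Enc(0), Enc(2), Enc(0), Enc(0)]

dec_r = response_2d(db, query_plain)
print(dec_r)                   # [8, 4, 2, 6]
print(retrieve_row(dec_r, 2))  # [4, 2, 1, 3]
```

Because every query entry except the second is the identity 0, only the second row contributes to each column sum, which is why the other rows do not affect the result.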

In this chapter, inspired by [40], we present an improved construction of the PIR protocol using homomorphic encryption. In contrast to [40], where only one data value is retrieved at a time, our work allows multiple values to be retrieved from the repository at once. In the following, we first provide a two-value protocol and then extend it to a multi-value one. We note that we only present the one-dimensional setting here; as shown in Section 4.3, our protocols can be extended to the two-dimensional setting by having the database side execute the response computation $n$ times. We also give an example in Section 5.3.

5.1 Two-value Group Homomorphic Encryption based PIR Protocol

In this protocol, $S$ has an $n$-element data set $X = \{x_i\}_{i=1}^{n}$, and $U$ wants to retrieve two values $x_1, x_2 \in X$ from $S$, where $x_1, x_2 \in G_1$ and $G_1, G_2$ are the additive group $\langle Z_m, + \rangle$. At first, $U$ selects two secret elements $k_1, k_2 \in G_1$. We note that $k_1$ and $k_2$ are encrypted by the homomorphic encryption into random-looking ciphertexts, and thus, as long as $k_1$ and $k_2$ are chosen properly, their values do not affect the security. They must satisfy the following three equations:

[…]

Because $U$ knows the locations of $x_1$ and $x_2$, $U$ then follows the steps below to obtain $x_1$ and $x_2$.

1. Query Generation (QG):

2. Response Generation (RG):

After receiving $Q$, $S$ computes

$$R = \sum_{i=1}^{n} x_i \odot q_i.$$

5.2 Multi-value Group Homomorphic Encryption based PIR Protocol

Based on the two-value setting, we can obtain a multi-value setting as follows. In this protocol, $S$ has an $n$-element data set $X = \{x_i\}_{i=1}^{n}$, and $U$ wants to retrieve $r$ values $x_1, \cdots, x_r \in X$ from $S$, where $x_1, \cdots, x_r \in G_1$ and $G_1, G_2$ are the additive group $\langle Z_m, + \rangle$. At first, $U$ selects $r$ secret elements $k_1, \cdots, k_r \in G_1$. We note that $k_1, \cdots, k_r$ are encrypted by the homomorphic encryption into random-looking ciphertexts, and thus, as long as $k_1, \cdots, k_r$ are chosen properly, their values do not affect the security.

Which satisfy the following equations:



2. Response Generation (RG):

After receiving $Q$, $S$ computes

$$R = \sum_{i=1}^{n} x_i \odot q_i$$

and sends it back.


3. Response Retrieval (RR):

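$U$ first computes

$$\mathrm{Dec}(R) = x_1 * k_1 + x_2 * k_2 + \cdots + x_r * k_r.$$

$U$ can then retrieve $x_1$ as follows, using the fact that every stored value satisfies $x_i \le N$:

(a) The partial sums are bounded by the next key:

$$\begin{aligned}
&\because\ k_2 > N * k_1 \\
&\therefore\ x_1 * k_1 \le N * k_1 < k_2 \\
&\because\ k_3 > (k_2 + k_1) * N \\
&\therefore\ x_1 * k_1 + x_2 * k_2 \le N * k_1 + N * k_2 = N * (k_1 + k_2) < k_3 \\
&\quad\vdots \\
&\because\ k_r > (k_{r-1} + \cdots + k_2 + k_1) * N \\
&\therefore\ x_1 * k_1 + x_2 * k_2 + \cdots + x_{r-1} * k_{r-1} \\
&\qquad \le N * k_1 + N * k_2 + \cdots + N * k_{r-1} \\
&\qquad = N * (k_1 + k_2 + \cdots + k_{r-1}) \\
&\qquad < k_r
\end{aligned}$$

(b) Reducing modulo the keys, from the largest to the smallest, removes one term at a time:

$$\begin{aligned}
(x_1 * k_1 + x_2 * k_2 + \cdots + x_r * k_r) \bmod k_r &= x_1 * k_1 + x_2 * k_2 + \cdots + x_{r-1} * k_{r-1} \\
(x_1 * k_1 + x_2 * k_2 + \cdots + x_{r-1} * k_{r-1}) \bmod k_{r-1} &= x_1 * k_1 + x_2 * k_2 + \cdots + x_{r-2} * k_{r-2} \\
&\ \ \vdots \\
(x_1 * k_1 + x_2 * k_2) \bmod k_2 &= x_1 * k_1
\end{aligned}$$

(c) $(x_1 * k_1) * k_1^{-1} = x_1$.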
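The retrieval argument above can be checked on plain integers with the following Python sketch. It works on the decrypted value only and involves no encryption. The helper names `choose_keys`, `pack`, and `unpack` are hypothetical, and `unpack` recovers all $r$ values by peeling off the largest key first (quotient, then remainder), which is a variant of the modular chain in step (b) rather than a procedure stated in the text. The sketch assumes that every value is at most $N$ and that the packed sum stays below the group modulus $m$.

```python
def choose_keys(r, N, k1=1):
    """Smallest keys satisfying k_j > N * (k_1 + ... + k_{j-1}) for j = 2..r."""
    keys = [k1]
    for _ in range(r - 1):
        keys.append(N * sum(keys) + 1)
    return keys

def pack(values, keys):
    """What Dec(R) equals: x_1*k_1 + ... + x_r*k_r (assumed smaller than m)."""
    return sum(x * k for x, k in zip(values, keys))

def unpack(total, keys):
    """Recover x_r, ..., x_1: the partial sum below k_j is always less than k_j."""
    values = []
    for k in reversed(keys):
        values.append(total // k)  # quotient is the value attached to key k
        total %= k                 # remainder is the sum over the smaller keys
    return values[::-1]

keys = choose_keys(2, N=5, k1=2)
print(keys)                # [2, 11]
packed = pack([4, 3], keys)
print(packed)              # 41
print(unpack(packed, keys))  # [4, 3]
```

With $N = 5$ and $k_1 = 2$ the smallest admissible second key is 11, so the sketch reproduces the key pair $(2, 11)$ used in the worked example of this chapter.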

In this section, we give a toy example to show how our proposed protocol works. We assume $S$ has a two-dimensional database that stores the following data:

DB = […]

In particular, the identity of $G_1$ is 0 (i.e., $ID_{G_1} = 0$). Now, suppose $U$ wants to retrieve the data of the second and third rows. $U$ interacts with $S$ as follows.

1. Query Generation (QG):

$U$ first picks $k_1 = 2, k_2 = 11 \in G_1$, and then generates a query array $Q = [\mathrm{Enc}(0), \mathrm{Enc}(2), \mathrm{Enc}(11), \mathrm{Enc}(0)]$.

U sends Q to S.

2. Response Generation (RG):

$S$ computes $R$ as below.

$$R = (\mathrm{Enc}(41), \mathrm{Enc}(15), \mathrm{Enc}(24), \mathrm{Enc}(17)) \bmod m.$$

$S$ then sends $R$ back to $U$.

3. Response Retrieval (RR):

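$U$ first computes $\mathrm{Dec}(R) = (41, 15, 24, 17)$ and then obtains the data of the second and third rows by the following steps.

• Second row: $(41, 15, 24, 17) \bmod 11 = (8, 4, 2, 6)$, and $(8, 4, 2, 6) \times 2^{-1} = (4, 2, 1, 3)$.

• Third row: $((41, 15, 24, 17) - (8, 4, 2, 6)) \times 11^{-1} = (33, 11, 22, 11) \times 11^{-1} = (3, 1, 2, 1)$.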


6 Security Proof

We now provide a rigorous proof that our proposed protocol is secure, that is, even if an adversary has access to the transmitted data, he/she cannot obtain any information about what $U$ wants to retrieve. To be more precise, following the idea of [3], we show that if there is a PPT algorithm $A$ that can distinguish whether some $q \in G_2$ is an encryption of $ID_{G_1}$ or an encryption of $p \in G_1$ with non-negligible advantage, then there is another algorithm $C$ that can win the IND-CPA security game of the group homomorphic encryption by using $A$.

Here, we first introduce some useful notation. Let (KeyGen, Enc, Dec) be an IND-CPA secure group homomorphic encryption. Then, $Q_0$ denotes $\{q_k = \mathrm{Enc}(ID_{G_1})\}_{k=1}^{n}$, and $Q_i$, for $i \in [1, n]$, denotes $\{q_k = \mathrm{Enc}(ID_{G_1})\}_{k=1}^{n}$ except that $q_i = \mathrm{Enc}(p)$, where $p \in G_1$. Finally, $\Pr[A(Q_i) = 1]$ denotes the probability that the PPT algorithm $A$, given $Q_i$, outputs 1, i.e., decides that $q_i$ is an encryption of $p$ rather than an encryption of $ID_{G_1}$.

Theorem 1. If the underlying group homomorphic encryption is IND-CPA secure, then there is no PPT algorithm $A$ that can obtain the information that the user retrieved in our proposed protocol with non-negligible probability.

Proof. In order to prove this theorem, we first show that Lemmas 1 and 2 hold. Here, $\lambda$ is the security parameter of the underlying group homomorphic encryption.

Lemma 1. If there is a PPT algorithm $A$ such that $\Pr[A(Q_i) = 1] - \Pr[A(Q_0) = 1] = \epsilon > negl(\lambda)$, then there is another PPT algorithm $C$ that can win the IND-CPA game of the underlying group homomorphic encryption with probability $1/2 + \epsilon/2$, which exceeds $1/2$ by a non-negligible amount.

Proof. In the following, we show that $C$ can use $A$ as a black-box algorithm to win the IND-CPA game of the underlying group homomorphic encryption with non-negligible advantage.


$C$ first computes $n - 1$ values $\{q_k = \mathrm{Enc}(ID_{G_1})\}_{k=1, k \ne i}^{n}$ using the public key $pk$ generated by $\mathrm{KeyGen}(1^{\lambda})$. Then, he/she sends $m_0 = ID_{G_1}$ and $m_1 = p$ as the challenge messages to the IND-CPA game, and receives $c \leftarrow \mathrm{Enc}(m_b)$, where $b \in \{0, 1\}$ is randomly chosen. Finally, he/she sets $q_i = c$, and then outputs $A(Q = \{q_k\}_{k=1}^{n})$ as the guess.

Here, if $b = 0$, the distribution of $Q$ is the same as the distribution of $Q_0$; if $b = 1$, the distribution of $Q$ is the same as the distribution of $Q_i$. Therefore, the probability of $C$ winning the IND-CPA game, $Adv_{C}^{IND\text{-}CPA}$, can be expressed as:

$$\begin{aligned}
&\Pr[A(Q) = 0 \mid b = 0] \cdot \Pr[b = 0] + \Pr[A(Q) = 1 \mid b = 1] \cdot \Pr[b = 1] \\
&= (1/2)(1 - \Pr[A(Q_0) = 1]) + (1/2)\Pr[A(Q_i) = 1] \\
&= 1/2 + (1/2)(\Pr[A(Q_i) = 1] - \Pr[A(Q_0) = 1]) \\
&= 1/2 + \epsilon/2.
\end{aligned}$$

Lemma 2. If there is a PPT algorithm $A$ such that $\Pr[A(Q_j) = 1] - \Pr[A(Q_0) = 1] = \epsilon > negl(\lambda)$, then there is another PPT algorithm $C$ that can win the IND-CPA game of the underlying group homomorphic encryption with non-negligible advantage $\epsilon/2$.

Proof. The proof of this lemma is the same as that of Lemma 1. Therefore, we omit it here.

Since the underlying group homomorphic encryption is IND-CPA secure, $\Pr[A(Q_i) = 1] - \Pr[A(Q_0) = 1] \le negl(\lambda)$ and $\Pr[A(Q_j) = 1] - \Pr[A(Q_0) = 1] \le negl(\lambda)$. Furthermore, we get the following equation:


$$\begin{aligned}
&|\Pr[A(Q_i) = 1] - \Pr[A(Q_j) = 1]| \\
&= |(\Pr[A(Q_i) = 1] - \Pr[A(Q_0) = 1]) - (\Pr[A(Q_j) = 1] - \Pr[A(Q_0) = 1])| \\
&= |negl(\lambda) - negl(\lambda)| = negl(\lambda)
\end{aligned}$$

Thus, if the underlying group homomorphic encryption is IND-CPA secure, then there is no PPT algorithm $A$ that can obtain the information that the user retrieved with non-negligible probability.


7 Discussion and Analysis

In this section, we first discuss the maximum number of data that can be retrieved at a time (i.e., the maximum value of r), and how to increase it. Then, we analyze the communication cost of our proposed protocols.

For a more precise analysis, we let $G_1, G_2$ be the same additive group, i.e., $\langle Z_m, + \rangle$, where $m$ is a large number.

Table 7.1: The communication cost of retrieving two and multiple values compared to [40] when the database is one- and two-dimensional. Here, we assume that the operators of the two protocols are based on the additive group, i.e., $G_1, G_2 = \langle Z_m, + \rangle$, where $m$ is a large number. In addition, we let $n$ represent the number of data in an array, and $r$ represent the number of values retrieved.

| Protocol | One-dimensional, two-value | One-dimensional, multi-value | Two-dimensional, two-value | Two-dimensional, multi-value |
|---|---|---|---|---|
| [40] | $2(n + 1) \log m$ | $r(n + 1) \log m$ | $4\sqrt{n} \log m$ | $2r\sqrt{n} \log m$ |
| Ours | $(n + 1) \log m$ | $(n + 1) \log m$ | $2\sqrt{n} \log m$ | $2\sqrt{n} \log m$ |

7.1 The Maximum Number of Retrieved Data

In Section 5, we proposed two-value and multi-value PIR protocols, which allow users to retrieve two and multiple values at a time, respectively, thus greatly improving efficiency compared to [40]. In an implementation, however, the maximum number of values is limited by the encryption architecture adopted. Here, we illustrate the limitation on $r$ using our multi-value PIR protocol.

As in Section 5.2, $U$ first selects $r$ values $k_1, k_2, \cdots, k_r \in G_1$ which satisfy the following equations:

[…]

Therefore, the inequality $m > \sum_{i=1}^{r} k_i * N$ will approximate to:

[…]

Assume that the group homomorphic encryption we adopt is 1024-bit, so that the maximum value of $m$ is $2^{1024}$ (if $m$ is larger than $2^{1024}$, the group homomorphic encryption will not work properly). In order to retrieve the maximum number of values (i.e., to maximize $r$), we have to minimize the parameters (e.g., $k_1$, $N$). For instance, let $k_1$ be 2 and let the values stored in the database be binary, which means the value of $N$ is 2. Then the maximum number of values that can be retrieved at a time is $r = 1022$ in the above environment.

Here, we note that although increasing $m$ directly increases the maximum $r$ (for the same $N$), it requires setting up the group homomorphic encryption with more bits (e.g., 2048-bit), which increases the execution time of encryption and decryption.

In this section, we analyze the communication cost of our proposed protocol and of [40] when $U$ wants to retrieve $r$ values. We consider two scenarios, namely a one-dimensional database and a two-dimensional database.

In the one-dimensional setting, in our proposed protocol, $U$ has to transfer a query array $Q = \{q_i\}_{i=1}^{n}$ to $S$, where $q_i \in Z_m$. Thus, the communication cost of $U$ is $n \log m$. As for $S$, who transfers the single element $R = \sum_{i=1}^{n} x_i \cdot q_i \bmod m$, the communication cost is $\log m$. Therefore, the total cost is $(n + 1) \log m$.

On the other hand, in [40], since only one value can be retrieved at a time, $U$ and $S$ must execute the protocol $r$ times in order to retrieve $r$ values. Therefore, the communication cost is $r(n + 1) \log m$.

In the two-dimensional setting, $U$ works the same as in the one-dimensional setting, for both our proposed protocol and [40]. However, $S$ must transfer $R_1, \cdots, R_n \in Z_m$ in this setting. Therefore, the communication cost of our proposed protocol and of [40] is $2n \log m$ and $2rn \log m$, respectively. As shown in Table 7.1, our proposed protocol reduces the communication cost by a factor of $r$ when $U$ attempts to retrieve $r$ values from either a one- or a two-dimensional database.
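The counting in this analysis can be written as a small Python helper, given here as a sketch. The function name `cost_bits` and its parameters are hypothetical; `n` is the length of the query array as used in the paragraphs above, `m_bits` stands for $\log m$, and `batched=True` models the proposed protocol (one run for all $r$ values) while `batched=False` models [40] (one run per value).

```python
def cost_bits(n, r, m_bits, two_dimensional=False, batched=True):
    """Total communication in bits for retrieving r values.

    The query is always n group elements. The response is one element in the
    one-dimensional setting and n elements (R_1..R_n) in the two-dimensional one.
    """
    query = n * m_bits
    response = (n if two_dimensional else 1) * m_bits
    runs = 1 if batched else r  # [40] repeats the whole protocol r times
    return runs * (query + response)

n, r, m_bits = 100, 5, 1024
ours_1d = cost_bits(n, r, m_bits)                 # (n + 1) log m
os07_1d = cost_bits(n, r, m_bits, batched=False)  # r (n + 1) log m
ours_2d = cost_bits(n, r, m_bits, two_dimensional=True)                 # 2 n log m
os07_2d = cost_bits(n, r, m_bits, two_dimensional=True, batched=False)  # 2 r n log m
print(os07_1d // ours_1d, os07_2d // ours_2d)  # 5 5
```

In both settings the ratio between the two costs is exactly $r$, which is the factor stated in the text.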

As mentioned in Section 7.1, if the settings of the group homomorphic encryption are fixed, the number of values that can be retrieved at a time (i.e., $r$) depends on the maximum value stored in the database (i.e., $N$) in our protocol. The relationship between $N$ and $r$ is shown in Table 7.2.

Table 7.2: The relation between the number of values that can be retrieved at a time (i.e., $r$) and the maximum value stored in the database (i.e., $N$).

| $N$ | $2$ | $2^{5}$ | $2^{10}$ | $2^{50}$ | $2^{100}$ | $2^{500}$ |
|---|---|---|---|---|---|---|
| $r$ | 1022 | 204.6 | 92.9 | 20.5 | 10.2 | 2.0 |

Although the communication cost of [40] is fixed regardless of the maximum value stored in the database, in a real scenario the maximum value stored in the database would not be as large as $2^{50}$. That is, even though our protocol is limited by the maximum stored value, we can significantly reduce the communication cost compared to [40]. Moreover, if the user wants to retrieve more than $r$ values, he simply runs the protocol several times: if the user wants to retrieve $t$ values with $t > r$, he only has to execute the protocol $\lceil t/r \rceil$ times. This means that our protocol still greatly reduces the communication cost.


8 Experiment and Result

In this chapter, we present an experiment with our protocol. The experiments were conducted with an Intel Core(TM) i5-8400 CPU @2.80GHz, 16GB of DDR4 RAM, and an NVIDIA GeForce GTX 1060M 6GB DDR5 GPU. We used Python 2.7 to implement both our protocol and the protocol of [40]. To make a fair comparison, both implementations use the Paillier group homomorphic encryption mechanism as a component.

8.1 Paillier Cryptosystem

The Paillier cryptosystem [37], proposed in 1999, is a public-key cryptosystem with a homomorphic property. The cryptosystem consists of three algorithms, described as follows: $\mathrm{KeyGen}(1^{\lambda})$, $\mathrm{Enc}(m, pk)$, and $\mathrm{Dec}(c, sk)$.

• $\mathrm{KeyGen}(1^{\lambda}) \rightarrow (pk, sk)$:

1. Randomly and independently choose two large prime numbers $p$ and $q$ such that $\gcd(pq, (p - 1)(q - 1)) = 1$, where gcd denotes the greatest common divisor.

2. Compute $n = pq$ and $\lambda = \mathrm{lcm}(p - 1, q - 1)$, where lcm denotes the least common multiple.

3. Select a random integer $g$ where $g \in Z^{*}_{n^2}$.

4. Ensure $n$ divides the order of $g$ by checking the existence of the following modular multiplicative inverse: $\mu = (L(g^{\lambda} \bmod n^2))^{-1} \bmod n$, where $L(u) = (u - 1)/n$.

5. The public key is $pk = (n, g)$. The private key is $sk = (\lambda, \mu)$.

• $\mathrm{Enc}(m, pk)$:

1. Let $m$ be a message to be encrypted, where $0 \le m < n$.


2. Select a random $r$ where $0 < r < n$ and $r \in Z^{*}_{n}$ (i.e., ensure $\gcd(r, n) = 1$).

3. Compute the ciphertext as: $c = g^{m} \cdot r^{n} \bmod n^2$.

• $\mathrm{Dec}(c, sk)$:

1. Let $c$ be the ciphertext to decrypt, where $c \in Z^{*}_{n^2}$.

2. Compute the plaintext message as: $m = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$.

The Paillier cryptosystem is an additive homomorphic encryption, which means that, given only the public key and the ciphertexts $c_1$ and $c_2$ of $m_1$ and $m_2$, we can still compute a ciphertext of $m_1 + m_2$, as below:

$$D(E(m_1, r_1) \cdot E(m_2, r_2) \bmod n^2) = m_1 + m_2 \bmod n.$$
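The three algorithms and the additive property can be exercised with the Python 3 sketch below. It follows the KeyGen steps as listed above, with a randomly sampled $g$, but it uses toy primes and is meant only as an illustration: it is not the code used in the experiments and is not secure. The function names are chosen here for illustration, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import random

def L(u, n):
    """L(u) = (u - 1) / n, exact for u congruent to 1 modulo n."""
    return (u - 1) // n

def keygen(p, q):
    assert math.gcd(p * q, (p - 1) * (q - 1)) == 1
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    while True:
        g = random.randrange(2, n * n)
        if math.gcd(g, n) != 1:
            continue  # g must lie in Z*_{n^2}
        l_val = L(pow(g, lam, n * n), n)
        if math.gcd(l_val, n) == 1:  # the inverse mu exists
            mu = pow(l_val, -1, n)
            return (n, g), (lam, mu)

def enc(pk, m):
    n, g = pk
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def dec(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return (L(pow(c, lam, n * n), n) * mu) % n

pk, sk = keygen(101, 113)  # toy primes for illustration only
n = pk[0]
c1, c2 = enc(pk, 20), enc(pk, 22)
print(dec(pk, sk, (c1 * c2) % (n * n)))  # 42: product of ciphertexts adds plaintexts
print(dec(pk, sk, pow(c1, 3, n * n)))    # 60: exponentiation scales the plaintext
```

The second line of output shows the scalar multiplication that the PIR response relies on: raising a ciphertext to a database value multiplies the hidden plaintext by that value.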

8.2 Result

We used four different key sizes of the Paillier cryptosystem, namely 32-bit, 128-bit, 512-bit, and 1024-bit, to construct our protocol and the protocol of [40]. In the following, we compare the execution time of our protocol with that of [40]; the results are shown in Fig. 8.1 to Fig. 8.4. The blue line "our" represents our protocol, and the red line "OS07" represents the protocol of [40].


Figure 8.1: Execution time comparison between ours and [40] under 32-bit setting

Figure 8.2: Execution time comparison between ours and [40] under 128-bit setting


Figure 8.3: Execution time comparison between ours and [40] under 512-bit setting

