National Chengchi University (國立政治大學)
For brevity, we omit the pk and sk from the encryption and decryption algorithms in the following sections.
2.3 Public Key Encryption
In this section, we recall the definition and security requirements of public key encryption from [26].
A public key encryption scheme PKE consists of the following three probabilistic polynomial-time (PPT) algorithms:
• KeyGen(1^λ) → (pk, sk): The key generation algorithm takes the security parameter λ as input and outputs a public/private key pair (pk, sk).
• Enc(m, pk): The encryption algorithm takes a plaintext message m and a public key pk as input, and outputs a ciphertext ct.
• Dec(ct, sk): The decryption algorithm takes a ciphertext ct and a private key sk as input, and outputs a plaintext message m.
Definition 2 (Correctness). We say that a public key encryption scheme is correct if for any (pk, sk) ← KeyGen(1^λ) and any message m, we have
Pr[Dec(Enc(m, pk), sk) = m] = 1.
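The three algorithms and the correctness condition can be illustrated with a minimal sketch. The text does not fix a concrete scheme, so this example assumes textbook ElGamal over a toy group; the parameters P, Q, G and the function names are hypothetical and far too small to be secure.

```python
import secrets

# Toy group parameters (assumption, far too small to be secure):
# P = 2Q + 1 is a safe prime and G = 4 generates the subgroup of prime order Q.
P = 2039
Q = 1019
G = 4


def keygen():
    """KeyGen(1^lambda) -> (pk, sk); the security parameter is fixed by P here."""
    sk = secrets.randbelow(Q - 1) + 1      # secret exponent in [1, Q-1]
    pk = pow(G, sk, P)
    return pk, sk


def enc(m, pk):
    """Enc(m, pk) -> ct, for a message m in [1, P-1]."""
    r = secrets.randbelow(Q - 1) + 1       # fresh randomness for every ciphertext
    return pow(G, r, P), (m * pow(pk, r, P)) % P


def dec(ct, sk):
    """Dec(ct, sk) -> m."""
    c1, c2 = ct
    shared = pow(c1, sk, P)                # equals pk^r
    return (c2 * pow(shared, -1, P)) % P   # modular inverse (Python 3.8+)


# Correctness check: Dec(Enc(m, pk), sk) = m for every tested message.
pk, sk = keygen()
assert all(dec(enc(m, pk), sk) == m for m in range(1, 100))
```

The final assertion is an empirical check of Definition 2 on a range of messages; it does not say anything about security.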
The security notion of indistinguishability under chosen plaintext attack (IND-CPA) for public key encryption is defined by the following game played between an adversary A and a challenger C.
Security Game: IND-CPA
• KeyGen: The challenger runs (pk, sk) ← KeyGen(1^λ). Then the challenger sends pk to the adversary and keeps sk secret.
• Query: In this phase, the adversary can adaptively ask the challenger for the ciphertext of any message of its choice.
• Challenge: The adversary chooses two messages m0, m1 of the same length and sends them to the challenger. Then, the challenger chooses a random bit b ∈ {0, 1} and generates a ciphertext ct∗ ← Enc(mb, pk). Finally, the challenger sends ct∗ to the adversary.
• Guess: The adversary returns a bit b∗. If b∗ = b, we say the adversary wins the game.
Definition 3. We say that a public key encryption scheme is IND-CPA secure if every PPT adversary wins the above game with only negligible advantage, that is, |Pr[b∗ = b] − 1/2| is negligible in λ.
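The game can be sketched as a small harness. To show what losing the game looks like, the example below assumes a deliberately insecure scheme, deterministic textbook RSA with toy parameters, together with a hypothetical adversary that simply re-encrypts m0 and compares. All names and parameters are illustrative and do not come from [26]. Because the adversary holds pk, it can produce ciphertexts itself, so the Query phase needs no separate oracle here.

```python
import secrets

# Toy textbook RSA (assumption, insecure): N = 61 * 53, E * D = 1 mod lcm(60, 52).
N, E, D = 3233, 17, 2753


def keygen():
    return (N, E), D


def enc(m, pk):
    n, e = pk
    return pow(m, e, n)          # deterministic: no randomness is used


class ReencryptAdversary:
    """Chooses two messages, then re-encrypts m0 and compares with ct*."""

    def choose(self, pk):
        self.pk = pk
        self.m0, self.m1 = 2, 3
        return self.m0, self.m1

    def guess(self, ct_star):
        return 0 if enc(self.m0, self.pk) == ct_star else 1


def ind_cpa_game(adversary):
    """Runs one IND-CPA game and returns True if the adversary wins."""
    pk, _sk = keygen()                      # KeyGen: the challenger keeps sk
    m0, m1 = adversary.choose(pk)           # Challenge: adversary picks m0, m1
    b = secrets.randbelow(2)                # challenger's random bit
    ct_star = enc((m0, m1)[b], pk)
    return adversary.guess(ct_star) == b    # Guess: win if b* == b


wins = sum(ind_cpa_game(ReencryptAdversary()) for _ in range(200))
print(f"win rate: {wins / 200}")
```

Since encryption is deterministic, this adversary wins every run, so its advantage is 1/2 rather than negligible. A randomized Enc is therefore necessary for IND-CPA security.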
3 Private Information Retrieval
Figure 3.1: Three steps of private information retrieval
3.1 Overview of Private Information Retrieval
Cloud services, and cloud storage in particular, play an increasingly important role in the development of data networks. However, users cannot be sure that the service provider will not act maliciously on the uploaded data. The privacy and security of cloud storage has therefore become a major concern, which can be divided into two parts: data privacy and data query privacy. PIR focuses on data query privacy, ensuring that the service provider cannot learn which data the user has retrieved.
In 1995, Chor et al. [12] proposed private information retrieval (PIR), a protocol that allows a user to retrieve data from a server database without letting the server learn anything about which data was retrieved. Concretely, a PIR protocol allows a user to retrieve the i-th bit of an n-bit database without revealing the value of i to the database server. A good PIR protocol should have communication complexity considerably lower than n bits, the cost of downloading the entire database. The authors first show that, with a single copy of the n-bit database, achieving information-theoretic privacy requires at least n bits of communication.
They then construct a protocol that uses multiple non-communicating copies of the database to keep the communication cost below n bits, which is more efficient than downloading the entire database.
However, this PIR protocol requires several replicated databases. The more copies there are, the more privacy issues have to be considered, and the higher the hardware cost.
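The idea behind multi-database PIR can be illustrated with the simplest two-server XOR construction. This is only a sketch of the privacy idea: in this basic variant each query is still n bits long, so it does not achieve the sublinear communication of the schemes in [12], and the function names are hypothetical. It assumes the two servers hold identical copies of the database and do not communicate with each other.

```python
import secrets


def make_queries(n, i):
    """Returns two n-bit selection vectors that differ only at position i."""
    q1 = [secrets.randbelow(2) for _ in range(n)]   # uniformly random subset
    q2 = q1.copy()
    q2[i] ^= 1                                      # flip membership of index i
    return q1, q2


def server_answer(db, query):
    """Each server XORs together the database bits selected by its query."""
    acc = 0
    for bit, selected in zip(db, query):
        acc ^= bit & selected
    return acc


def retrieve(db, i):
    """User side: the two answers differ exactly by the bit db[i]."""
    q1, q2 = make_queries(len(db), i)
    return server_answer(db, q1) ^ server_answer(db, q2)


db = [1, 0, 1, 1, 0, 0, 1, 0]
assert all(retrieve(db, i) == db[i] for i in range(len(db)))
```

Each query on its own is a uniformly random bit vector, so a single server learns nothing about i. Privacy fails if the two servers compare their queries, which is why the copies must not communicate.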
Although Chor et al. [12] provide a good solution for protecting the privacy of data queries, the need for multiple replicated databases raises the problems described above. In 1997, Kushilevitz and Ostrovsky proposed the first PIR protocol in the single-database setting [19], called "computational private information retrieval" (cPIR), in which the data no longer needs to be copied to several databases as in [12]. cPIR is a three-step interaction between a user and a server. First, the user generates a query Q from the index of the data to be retrieved and sends it to the server. Second, the server computes a response R from Q and the entire database, and returns R to the user. Finally, the user recovers the desired data from R, without revealing to the server which data was retrieved.
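The three steps (query Q, response R, extraction) can be sketched with an additively homomorphic encryption scheme. This is not the construction of [19]; it assumes a toy Paillier instance with hypothetical, insecure parameters, and the query here is linear in the database size, so it illustrates the message flow rather than an efficient protocol. As elsewhere in the text, pk and sk are omitted from the function arguments.

```python
import math
import secrets

# Toy Paillier parameters (assumption, insecure key size).
P_, Q_ = 1009, 1013
N = P_ * Q_
N2 = N * N
LAM = (P_ - 1) * (Q_ - 1) // math.gcd(P_ - 1, Q_ - 1)   # lcm(p-1, q-1)
MU = pow(LAM, -1, N)                                    # valid for generator g = N + 1


def enc(m):
    """Paillier encryption with g = N + 1: (1 + m*N) * r^N mod N^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def dec(c):
    """Paillier decryption: L(c^LAM mod N^2) * MU mod N, with L(u) = (u-1)/N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def user_query(i, n_items):
    """Step 1: Q is an encrypted unit vector with Enc(1) at position i."""
    return [enc(1 if j == i else 0) for j in range(n_items)]


def server_answer(db, query):
    """Step 2: R = prod_j Q[j]^db[j], an encryption of sum_j db[j] * e_j = db[i]."""
    r = 1
    for x, c in zip(db, query):
        r = r * pow(c, x, N2) % N2
    return r


def user_extract(r):
    """Step 3: the user decrypts R to obtain db[i]."""
    return dec(r)


db = [7, 0, 42, 13, 5]          # each entry must be smaller than N
i = 2
assert user_extract(server_answer(db, user_query(i, len(db)))) == db[i]
```

The server touches every database entry when computing R. If it skipped some entries, it would learn that the skipped indices were not requested.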
Since the pioneering work of [12] and [19], this topic has received growing attention from researchers. PIR protocols can be classified into two broad categories:
1. information-theoretic private information retrieval
2. computational private information retrieval
Information-theoretic PIR relies on multiple servers to achieve information-theoretic privacy, for example [4, 13, 17]. cPIR relies on a single server and cryptographic algorithms, achieving computational privacy at the cost of some server-side computation, for example [7, 10, 25, 28, 30, 31]. Some PIR protocols are instead based on trusted hardware, which is assumed to reside in the server and to answer the user's query without revealing any query information to the server, for example [32, 43, 47]. There are also several variants of cPIR, which we discuss later.
In 2010, Gertner [48] first proposed a new concept called symmetrically-private information retrieval (sPIR), in which both data privacy and user privacy are guaranteed. That is, each time the sPIR protocol is invoked, the user learns only one physical bit and nothing else about the data. In the same year, the first lattice-based PIR was proposed by Aguilar-Melchor and Gaborit [9]. The security of the protocol is based on a lattice hardness assumption called "The Differential Hidden Lattice Problem", so the protocol is expected to resist quantum attacks. Another work studies sPIR based on blind quantum computing [45]; its protocol reduces not only the honest user's computational burden during communication but also the cost of quantum hardware devices in a practical implementation. Moreover, in 2014, Dong and Chen [11] proposed a PIR protocol with lower communication cost. The protocol uses tree-based compression and fully homomorphic encryption [15] as building blocks, which reduces the communication cost to O(log log n). In 2016, Aguilar-Melchor, Barrier, and Fousse [1] proposed "XPIR", a fast cPIR implementation. In 2019, Heidarzadeh and Anoosheh [27] proposed an IPIR-SI scheme for a multi-user variant of PIR, which allows multiple users to each privately retrieve a distinct message from a server with the help of a trusted agent.
One line of cPIR research is "Homomorphic Encryption-Based Private Information Retrieval".
Current single-database PIR protocols achieve low communication cost but require the database to spend an enormous amount of computational power. The lattice-based PIR proposed by Aguilar-Melchor and Gaborit [9] has a computational cost of a few thousand bit operations per bit in the database, and generating the query matrices on the user side is inefficient. In 2008, Aguilar-Melchor and Gaborit [2] proposed another approach that addresses this problem with homomorphic encryption. Homomorphic encryption is often a natural way to construct a variety of privacy-preserving protocols. In 2009, Gentry [21, 22, 23, 24] constructed the first fully homomorphic encryption scheme using lattice-based cryptography. Fully homomorphic encryption is a scheme that allows one to compute arbitrary functions over encrypted data without the decryption key. In 2013, Yi, Kaosar, Paulet, and Bertino [49] proposed single-database PIR protocols from fully homomorphic encryption. In 2007, Ostrovsky and Skeith [40] proposed a new method that constructs the PIR protocol from a secure group-homomorphic encryption scheme, and Yerukhimovich [3] further analyzed the protocol in 2015. Using only group-homomorphic encryption in single-server PIR reduces the computation on the server and makes it easy for the user to generate queries. This is the line of work that we improve upon.
Having surveyed the different kinds of PIR, we now focus on single-server cPIR. We discuss single-server cPIR in more detail, together with group-based PIR using homomorphic encryption.