
Challenge of Data Confidentiality

Once storage servers are placed on the network, control of them is no longer in the data owner's hands. Data confidentiality has therefore received increasing attention.

2.3.1 Cleartext Storage

Early networked storage systems were proposed for robust storage with simple access control mechanisms, and the data are stored in cleartext. When the storage system uses an erasure code for data robustness and each storage server stores one codeword symbol, the system offers a limited degree of data confidentiality, because fewer than k storage servers cannot recover the original message, where the message has k symbols.
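The k-out-of-n behaviour described above can be illustrated with a minimal sketch. It assumes a polynomial-evaluation erasure code over the prime field GF(257), which is a choice made here for illustration and not necessarily the code used by the systems surveyed; the names `PRIME`, `encode` and `decode` are hypothetical. Note that fewer than k symbols still reveal linear relations among the message symbols, so this is weaker than cryptographic confidentiality.

```python
PRIME = 257  # assumed field size: every message symbol is an integer in [0, 257)

def encode(message, n):
    """Treat the k message symbols as polynomial coefficients and give
    server i (i = 1..n) the pair (i, f(i) mod PRIME) as its codeword symbol."""
    return [(i, sum(m * pow(i, j, PRIME) for j, m in enumerate(message)) % PRIME)
            for i in range(1, n + 1)]

def decode(shares, k):
    """Recover the k message symbols from any k codeword symbols by
    Gauss-Jordan elimination on the Vandermonde system modulo PRIME."""
    if len(shares) < k:
        raise ValueError("fewer than k symbols: the message is not determined")
    # Augmented matrix: one row [x^0, x^1, ..., x^(k-1) | y] per symbol.
    rows = [[pow(x, j, PRIME) for j in range(k)] + [y] for x, y in shares[:k]]
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], PRIME - 2, PRIME)  # inverse via Fermat's little theorem
        rows[col] = [(v * inv) % PRIME for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % PRIME
                           for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]

message = [72, 101, 108]            # k = 3 symbols
symbols = encode(message, 6)        # n = 6 storage servers
recovered = decode([symbols[5], symbols[1], symbols[3]], 3)
```

Any three of the six symbols suffice to rebuild the message, while two symbols leave the linear system underdetermined.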

2.3.2 Symmetric Encryption

As the hardware and software of storage servers improved, the storage servers could handle more computation. Many newer networked storage systems provide stronger data confidentiality by storing data in encrypted form.

Once a storage system receives data from the owner, the data are encrypted with a symmetric encryption scheme such as DES or AES before being stored on the physical drives. Blaze's CFS [6], TCFS [8], StegFS [7] and NCryptfs [9] are file systems that encrypt data before writing them to storage drives. These file systems protect only the stored data at rest and assume that the storage servers are fully trusted.

Other large-scale networked storage systems, such as OceanStore [41], Plutus [14] and Tahoe [36], use encryption schemes to protect data confidentiality against both internal and external attackers. In OceanStore, all information that enters the system must be encrypted, while the owner manages the access control. For example, when the owner wants to share a datum with others, he needs to distribute the symmetric key to the authorized readers. Similarly, in both Plutus and Tahoe, a user needs to encrypt files with distinct symmetric keys and manage all of the keys by himself. A file before and after being encrypted in Plutus is illustrated in Figure 2.15. A newer encryption service [47] is available to cloud storage users of Amazon Simple Storage Service. The encryption service encrypts the user's data with AES and stores the ciphertext in the cloud storage system for the user. Again, this application assumes that the servers that encrypt the data are fully trusted.

Figure 2.15: File 1 before and after being encrypted in a Plutus system. The file is divided into blocks, and each file block is encrypted with a distinct symmetric key. All symmetric keys are encrypted with another symmetric key MK. The encrypted symmetric keys are stored with the encrypted file blocks. A user may use a different MK for each file.
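The key layout described for Plutus (one symmetric key per block, all block keys encrypted under MK and stored next to the encrypted blocks) can be sketched as follows. This is only an illustration under stated assumptions: `keystream_xor` is a toy XOR cipher built from SHA-256 that stands in for a real symmetric cipher such as AES and is not secure, and the function names and the 16-byte block size are hypothetical rather than taken from Plutus.

```python
import hashlib
import secrets

def keystream_xor(key, nonce, data):
    """Toy stand-in for a symmetric cipher: XOR data with a SHA-256 counter
    keystream. Encryption and decryption are the same operation. NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def encrypt_file(data, master_key, block_size=16):
    """Encrypt each block under a fresh block key and wrap that key under MK.
    Returns a list of (wrapped block key, encrypted block) pairs."""
    stored = []
    for index, start in enumerate(range(0, len(data), block_size)):
        nonce = index.to_bytes(8, "big")
        block_key = secrets.token_bytes(32)          # distinct key per block
        wrapped_key = keystream_xor(master_key, nonce, block_key)
        encrypted_block = keystream_xor(block_key, nonce,
                                        data[start:start + block_size])
        stored.append((wrapped_key, encrypted_block))
    return stored

def decrypt_file(stored, master_key):
    """Unwrap each block key with MK, then decrypt the corresponding block."""
    plain = bytearray()
    for index, (wrapped_key, encrypted_block) in enumerate(stored):
        nonce = index.to_bytes(8, "big")
        block_key = keystream_xor(master_key, nonce, wrapped_key)
        plain.extend(keystream_xor(block_key, nonce, encrypted_block))
    return bytes(plain)

mk = secrets.token_bytes(32)                         # the user's MK for this file
ciphertext = encrypt_file(b"File 1 is split into blocks before encryption.", mk)
```

With this layout the user keeps only MK per file, while everything else can be stored on the server.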

However, the data confidentiality of the above systems is guaranteed only when the storage servers are honest and secure, or when users take responsibility for managing a huge number of symmetric keys. In most cases, the storage servers are assumed to follow the user-defined access policy on the stored data and to keep the stored data in encrypted form at all times. Trusting all storage servers is sometimes unrealistic, especially when the storage system is decentralized and distributed over a large geographic area: any one of the storage servers could be vulnerable to internal or external attacks. On the other hand, the burden of key management on users should be reduced or moved to the servers. Hence, stronger data confidentiality with low overhead on users is required. The data should remain secret even if all storage servers are compromised, and users should store as few keys as possible and spend as little effort as possible on key management.

2.3.3 Public Key Encryption

Applying a public key encryption scheme in a centralized networked storage system gives a straightforward solution to the data confidentiality issue. A user encrypts the data and then stores the ciphertext in the system. The central authority simply treats the ciphertext as raw data, just as in the non-encrypted case. Similarly, strong data confidentiality is also achievable in a decentralized system with replication technology. For instance, Farsite [10] uses hybrid encryption to protect data confidentiality and provide an access control mechanism. In Farsite, a datum is first encrypted with a symmetric encryption scheme, and the symmetric key is then encrypted with the owner's public key. The user only needs to store his secret key for all of his data. When he wants to share some data with another user, he encrypts the corresponding symmetric keys with the authorized user's public key and stores the ciphertext in the storage system. The overhead of key management is mainly moved to the servers, because all keys except the user's secret key are stored on the storage servers (in encrypted form).
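The hybrid key handling described for Farsite can be sketched as below. The sketch makes several assumptions for illustration: a toy ElGamal-style key wrap over the small Mersenne prime 2^127 - 1 stands in for a real public key encryption scheme, and the names `keygen`, `wrap_key`, `unwrap_key`, `share` and the `key_table` dictionary are hypothetical, not Farsite's interface. The parameters are far too small for real use.

```python
import hashlib
import secrets

PRIME = 2**127 - 1   # toy modulus (a Mersenne prime); illustration only
BASE = 3             # assumed base element for the toy scheme

def keygen():
    """Return a (secret key, public key) pair."""
    sk = secrets.randbelow(PRIME - 2) + 1
    return sk, pow(BASE, sk, PRIME)

def _mask(shared, length):
    # Derive a byte mask from the Diffie-Hellman style shared value.
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()[:length]

def wrap_key(sym_key, pk):
    """Encrypt a symmetric key (at most 32 bytes) under a public key."""
    r = secrets.randbelow(PRIME - 2) + 1
    mask = _mask(pow(pk, r, PRIME), len(sym_key))
    return pow(BASE, r, PRIME), bytes(a ^ b for a, b in zip(sym_key, mask))

def unwrap_key(wrapped, sk):
    """Recover the symmetric key with the matching secret key."""
    c1, body = wrapped
    mask = _mask(pow(c1, sk, PRIME), len(body))
    return bytes(a ^ b for a, b in zip(body, mask))

def share(key_table, owner_sk, reader, reader_pk):
    """The owner re-wraps the file key for an authorized reader; the server
    stores only wrapped keys and never sees the file key itself."""
    file_key = unwrap_key(key_table["owner"], owner_sk)
    key_table[reader] = wrap_key(file_key, reader_pk)

owner_sk, owner_pk = keygen()
reader_sk, reader_pk = keygen()
file_key = secrets.token_bytes(16)                  # symmetric key for one datum
key_table = {"owner": wrap_key(file_key, owner_pk)}  # stored on the servers
share(key_table, owner_sk, "reader", reader_pk)
```

Each user holds a single secret key, and the per-datum symmetric keys live on the servers only in wrapped form.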

2.3.4 Motivation

To the best of our knowledge, little research addresses data confidentiality against the collusion of all storage servers in a decentralized networked storage system that uses erasure codes. This is the gap that our results fill. We provide a secure cloud storage system that offers strong data confidentiality in a decentralized environment and good data availability by using erasure codes. Our key technique is the combination of a public key encryption scheme and a variant of a random linear code. As a result, the data are stored in an encrypted and encoded form on each storage server, and no storage server has the decryption key. Access rights are managed entirely by the data owner. Data confidentiality is guaranteed even if all storage servers are corrupted at the same time.

Chapter 3

Erasure Codes and System Models

In this chapter, we introduce our basic algebraic notation and the erasure codes we use. This special erasure code is one of our key techniques for achieving both robustness and parallelism in our cloud storage system. We assume that there is no central authority among the storage servers, and we introduce our first system model and an advanced one. We also describe the threat model used to measure the security level of cloud storage systems.

3.1 Bilinear Map and Assumptions

Bilinear map. Let $G_1$, $G_2$ be cyclic multiplicative groups¹ with prime order $p$ and let $g \in G_1$ be a generator. A polynomial-time computable map $\tilde{e} : G_1 \times G_1 \to G_2$ is a bilinear map if it has bilinearity and non-degeneracy: for any $x, y \in \mathbb{Z}_p$, $\tilde{e}(g^x, g^y) = \tilde{e}(g, g)^{xy}$, and $\tilde{e}(g, g)$ is not the identity element in $G_2$. In fact, $\tilde{e}(g, g)$ is a generator of $G_2$. Let $\mathrm{Gen}(1^\lambda)$ be an algorithm generating $(p, G_1, G_2, \tilde{e}, g)$, where $\lambda$ is the length of $p$.

¹ It can also be described as additive groups over points on an elliptic curve.
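The two defining properties can be checked numerically on a toy example. The sketch below is not a cryptographic pairing: it assumes the order-1019 subgroup of squares modulo the safe prime 2039 for both groups and computes the map by brute-force discrete logarithms, which is only possible because the group is tiny. The names `P`, `Q`, `G`, `DLOG`, `pairing` and `check_bilinearity` are hypothetical.

```python
P = 1019   # prime group order p (toy size)
Q = 2039   # safe prime Q = 2P + 1; the squares mod Q form a subgroup of order P
G = 4      # generator g of that subgroup (4 = 2^2 is a square and not 1)

# Discrete-log table for the toy group: maps g^i to i.
DLOG = {pow(G, i, Q): i for i in range(P)}

def pairing(u, v):
    """Toy bilinear map: e(g^x, g^y) = g^(x*y), with x and y looked up
    in the discrete-log table. Here both groups are the same subgroup."""
    return pow(G, (DLOG[u] * DLOG[v]) % P, Q)

def check_bilinearity(x, y):
    """Check e(g^x, g^y) == e(g, g)^(x*y) for the given exponents."""
    left = pairing(pow(G, x, Q), pow(G, y, Q))
    right = pow(pairing(G, G), (x * y) % P, Q)
    return left == right

bilinear = all(check_bilinearity(x, y) for x in range(20) for y in range(20))
non_degenerate = pairing(G, G) != 1   # e(g, g) is not the identity element
```

Because the discrete logarithm is easy in this toy group, the example only illustrates the algebraic definition, not any hardness assumption.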

Let $x \in_R X$ denote that $x$ is chosen uniformly at random from the set $X$.

Bilinear Diffie-Hellman assumption. With the above parameters, given $g, g^x, g^y, g^z$, where $x$, $y$, and $z$ are randomly chosen from $\mathbb{Z}_p$, the bilinear Diffie-Hellman problem is to find $\tilde{e}(g, g)^{xyz}$. The assumption is that it is hard to solve this problem with non-negligible probability in polynomial time. Formally, for any probabilistic polynomial-time algorithm $\mathcal{A}$, the following probability is negligible (in $\lambda$):

$$\Pr\left[\mathcal{A}(g, g^x, g^y, g^z) = \tilde{e}(g, g)^{xyz} : x, y, z \in_R \mathbb{Z}_p\right]$$

Decisional Bilinear Diffie-Hellman assumption. This assumption is that, given $g, g^x, g^y, g^z$, it is hard to distinguish $\tilde{e}(g, g)^{xyz}$ from a random element of $G_2$. Formally, for any probabilistic polynomial-time algorithm $\mathcal{A}$, the following is negligible (in $\lambda$):

$$\left|\, \Pr\left[\mathcal{A}(g, g^x, g^y, g^z, Q_b) = b : x, y, z, r \in_R \mathbb{Z}_p;\ Q_0 = \tilde{e}(g, g)^{xyz};\ Q_1 = \tilde{e}(g, g)^{r};\ b \in_R \{0, 1\}\right] - \frac{1}{2} \,\right|$$
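The quantity bounded by the decisional assumption can be estimated empirically as a distinguishing game. The sketch below assumes a toy group (the order-1019 subgroup of squares modulo 2039) with a brute-force pairing, so it deliberately does not satisfy the assumption: an adversary that computes discrete logarithms wins almost always, while a coin-flipping adversary has advantage close to zero. All names (`dbdh_challenge`, `guessing_adversary`, `dlog_adversary`, `estimate_advantage`) are hypothetical.

```python
import random

P, Q, G = 1019, 2039, 4   # toy parameters: order p, modulus, generator g
DLOG = {pow(G, i, Q): i for i in range(P)}   # feasible only in a tiny group

def pairing(u, v):
    """Toy bilinear map e(g^x, g^y) = g^(x*y) via discrete-log lookup."""
    return pow(G, (DLOG[u] * DLOG[v]) % P, Q)

def dbdh_challenge(rng):
    """Sample x, y, z, r and a bit b; return (g, g^x, g^y, g^z, Q_b) and b."""
    x, y, z, r = (rng.randrange(P) for _ in range(4))
    b = rng.randrange(2)
    base = pairing(G, G)
    q0 = pow(base, (x * y * z) % P, Q)   # the real value e(g, g)^(xyz)
    q1 = pow(base, r, Q)                 # a random element e(g, g)^r
    return (G, pow(G, x, Q), pow(G, y, Q), pow(G, z, Q), q1 if b else q0), b

def guessing_adversary(instance, rng):
    return rng.randrange(2)              # ignores the instance entirely

def dlog_adversary(instance, rng):
    g, gx, gy, gz, qb = instance
    real = pow(pairing(g, g), (DLOG[gx] * DLOG[gy] * DLOG[gz]) % P, Q)
    return 0 if qb == real else 1        # breaks the toy group via discrete logs

def estimate_advantage(adversary, trials=2000, seed=0):
    """Monte Carlo estimate of |Pr[adversary outputs b] - 1/2|."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        instance, b = dbdh_challenge(rng)
        wins += adversary(instance, rng) == b
    return abs(wins / trials - 0.5)
```

In a group where the assumption is believed to hold, no efficient adversary is expected to do noticeably better than the coin-flipping one.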
