
A Secure Erasure Code-Based Cloud Storage System with Secure Data Forwarding

Hsiao-Ying Lin, Member, IEEE, and Wen-Guey Tzeng, Member, IEEE

Abstract: A cloud storage system, consisting of a collection of storage servers, provides long-term storage services over the Internet. Storing data in a third party's cloud system causes serious concern over data confidentiality. General encryption schemes protect data confidentiality, but also limit the functionality of the storage system because only a few operations are supported over encrypted data. Constructing a secure storage system that supports multiple functions is challenging when the storage system is distributed and has no central authority. We propose a threshold proxy re-encryption scheme and integrate it with a decentralized erasure code such that a secure distributed storage system is formulated. The distributed storage system not only supports secure and robust data storage and retrieval, but also lets a user forward his data in the storage servers to another user without retrieving the data back. The main technical contribution is that the proxy re-encryption scheme supports encoding operations over encrypted messages as well as forwarding operations over encoded and encrypted messages. Our method fully integrates encrypting, encoding, and forwarding. We analyze and suggest suitable parameters for the number of copies of a message dispatched to storage servers and the number of storage servers queried by a key server. These parameters allow more flexible adjustment between the number of storage servers and robustness.

Index Terms: Decentralized erasure code, proxy re-encryption, threshold cryptography, secure storage system.


1 INTRODUCTION

As high-speed networks and ubiquitous Internet access have become available in recent years, many services are provided on the Internet such that users can use them from anywhere at any time. For example, the email service is probably the most popular one. Cloud computing is a concept that treats the resources on the Internet as a unified entity, a cloud. Users just use services without being concerned about how computation is done and storage is managed. In this paper, we focus on designing a cloud storage system for robustness, confidentiality, and functionality. A cloud storage system is considered as a large-scale distributed storage system that consists of many independent storage servers.

Data robustness is a major requirement for storage systems. There have been many proposals for storing data over storage servers [1], [2], [3], [4], [5]. One way to provide data robustness is to replicate a message such that each storage server stores a copy of the message. It is very robust because the message can be retrieved as long as one storage server survives. Another way is to encode a message of k symbols into a codeword of n symbols by erasure coding. To store a message, each of its codeword symbols is stored in a different storage server. A storage server failure corresponds to an erasure error of the codeword symbol. As long as the number of failed servers is under the tolerance threshold of the erasure code, the message can be recovered from the codeword symbols stored in the available storage servers by the decoding process. This provides a tradeoff between the storage size and the tolerance threshold of failed servers. A decentralized erasure code is an erasure code that independently computes each codeword symbol for a message. Thus, the encoding process for a message can be split into n parallel tasks of generating codeword symbols. A decentralized erasure code is suitable for use in a distributed storage system. After the message symbols are sent to storage servers, each storage server independently computes a codeword symbol for the received message symbols and stores it. This finishes the encoding and storing process. The recovery process is the same.

Storing data in a third party's cloud system causes serious concern over data confidentiality. In order to provide strong confidentiality for messages in storage servers, a user can encrypt messages by a cryptographic method before applying an erasure code method to encode and store messages. When he wants to use a message, he needs to retrieve the codeword symbols from storage servers, decode them, and then decrypt them by using cryptographic keys. There are three problems in the above straightforward integration of encryption and encoding. First, the user has to do most of the computation, and the communication traffic between the user and storage servers is high. Second, the user has to manage his cryptographic keys. If the device on which the user stores the keys is lost or compromised, the security is broken. Finally, besides data storing and retrieving, it is hard for storage servers to directly support other functions. For example, storage servers cannot directly forward a user's messages to another one. The owner of messages has to retrieve, decode, decrypt, and then forward them to another user.

• H.-Y. Lin is with the Intelligent Information and Communications Research Center, Department of Computer Science, National Chiao Tung University, No. 1001, University Road, Hsinchu City 30010, Taiwan. E-mail: hsiaoying.lin@gmail.com.

• W.-G. Tzeng is with the Department of Computer Science, National Chiao Tung University, No. 1001, University Road, Hsinchu City 30010, Taiwan. E-mail: wgtzeng@cs.nctu.edu.tw.

Manuscript received 21 Mar. 2011; revised 12 Sept. 2011; accepted 18 Sept. 2011; published online 30 Sept. 2011.

Recommended for acceptance by J. Weissman.

For information on obtaining reprints of this article, please send e-mail to: tpds@computer.org, and reference IEEECS Log Number tpds-2011-03-0162. Digital Object Identifier no. 10.1109/TPDS.2011.252.

In this paper, we address the problem of forwarding data to another user by storage servers directly under the command of the data owner. We consider the system model that consists of distributed storage servers and key servers. Since storing cryptographic keys in a single device is risky, a user distributes his cryptographic key to key servers that shall perform cryptographic functions on behalf of the user. These key servers are highly protected by security mechanisms. To fit the distributed structure of systems, we require that servers independently perform all operations. With this consideration, we propose a new threshold proxy re-encryption scheme and integrate it with a secure decentralized code to form a secure distributed storage system. The encryption scheme supports encoding operations over encrypted messages and forwarding operations over encrypted and encoded messages. The tight integration of encoding, encryption, and forwarding makes the storage system efficiently meet the requirements of data robustness, data confidentiality, and data forwarding. Accomplishing the integration with consideration of a distributed structure is challenging. Our system meets the requirements that storage servers independently perform encoding and re-encryption and key servers independently perform partial decryption. Moreover, we consider the system in a more general setting than previous works. This setting allows more flexible adjustment between the number of storage servers and robustness.

Our contributions. Assume that there are n distributed storage servers and m key servers in the cloud storage system. A message is divided into k blocks and represented as a vector of k symbols. Our contributions are as follows:

1. We construct a secure cloud storage system that supports the function of secure data forwarding by using a threshold proxy re-encryption scheme. The encryption scheme supports decentralized erasure codes over encrypted messages and forwarding operations over encrypted and encoded messages. Our system is highly distributed: storage servers independently encode and forward messages, and key servers independently perform partial decryption.

2. We present a general setting for the parameters of our secure cloud storage system. Our parameter setting of $n = ak^c$ supersedes the previous one of $n = ak\sqrt{k}$, where $c \ge 1.5$ and $a > \sqrt{2}$ [6]. Our result $n = ak^c$ allows the number of storage servers to be much greater than the number of blocks of a message. In practical systems, the number of storage servers is much larger than k. The sacrifice is a slight increase in the total number of copies of an encrypted message symbol sent to storage servers. Nevertheless, the storage size in each storage server does not increase because each storage server stores an encoded result (a codeword symbol), which is a combination of encrypted message symbols.

2 RELATED WORKS

We briefly review distributed storage systems, proxy re-encryption schemes, and integrity checking mechanisms.

2.1 Distributed Storage Systems

In the early years, the Network-Attached Storage (NAS) [7] and the Network File System (NFS) [8] provided extra storage devices over the network such that a user could access the storage devices via a network connection. Afterward, many improvements on scalability, robustness, efficiency, and security were proposed [1], [2], [9].

A decentralized architecture for storage systems offers good scalability, because a storage server can join or leave without control of a central authority. To provide robustness against server failures, a simple method is to make replicas of each message and store them in different servers. However, this method is expensive as z replicas result in z times of expansion.

One way to reduce the expansion rate is to use erasure codes to encode messages [10], [11], [12], [13], [5]. A message is encoded as a codeword, which is a vector of symbols, and each storage server stores a codeword symbol. A storage server failure is modeled as an erasure error of the stored codeword symbol. Random linear codes support distributed encoding, that is, each codeword symbol is independently computed. To store a message of k blocks, each storage server linearly combines the blocks with randomly chosen coefficients and stores the codeword symbol and coefficients. To retrieve the message, a user queries k storage servers for the stored codeword symbols and coefficients and solves the linear system. Dimakis et al. [13] considered the case that $n = ak$ for a fixed constant a. They showed that distributing each block of a message to v randomly chosen storage servers is enough to have a probability $1 - k/p - o(1)$ of a successful data retrieval, where $v = b \ln k$, $b > 5a$, and p is the order of the used group. The sparsity parameter $v = b \ln k$ is the number of storage servers to which a block is sent. The larger v is, the higher both the communication cost and the successful retrieval probability are. The system provides only weak data confidentiality because an attacker can compromise k storage servers to get the message.
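The store-and-retrieve procedure of a sparse random linear code can be sketched as follows. This is a minimal illustration under our own assumptions, not the construction of [13]: blocks are integers in a prime field, the modulus `P` and the helper names `store`, `solve`, and `retrieve` are hypothetical, and a retrieval simply retries when the queried coefficient matrix happens to be singular.

```python
import random

P = 2_147_483_647  # prime field modulus (2^31 - 1), chosen only for illustration


def store(blocks, n, v, rng):
    """Send each block to v random servers; each server keeps one combination."""
    k = len(blocks)
    received = [set() for _ in range(n)]
    for i in range(k):
        for _ in range(v):                      # v picks, with replacement
            received[rng.randrange(n)].add(i)
    servers = []
    for got in received:
        # Coefficient is random and nonzero for received blocks, 0 otherwise.
        coeffs = [rng.randrange(1, P) if i in got else 0 for i in range(k)]
        symbol = sum(c * b for c, b in zip(coeffs, blocks)) % P
        servers.append((coeffs, symbol))
    return servers


def solve(rows, rhs):
    """Gauss-Jordan elimination mod P; returns None if the matrix is singular."""
    k = len(rows)
    aug = [list(r) + [y] for r, y in zip(rows, rhs)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [x * inv % P for x in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[k] for row in aug]


def retrieve(servers, k, rng, attempts=100):
    """Query k random servers and solve; retry if the system is singular."""
    for _ in range(attempts):
        picked = rng.sample(servers, k)
        out = solve([c for c, _ in picked], [s for _, s in picked])
        if out is not None:
            return out
    return None
```

A larger `v` makes a singular coefficient matrix less likely at the price of more traffic during storage, which is the tradeoff described above.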

Lin and Tzeng [6] addressed robustness and confidentiality issues by presenting a secure decentralized erasure code for the networked storage system. In addition to storage servers, their system consists of key servers, which hold cryptographic key shares and work in a distributed way. In their system, stored messages are encrypted and then encoded. To retrieve a message, key servers query storage servers for the user. As long as the number of available key servers is over a threshold t, the message can be successfully retrieved with an overwhelming probability. One of their results shows that when there are n storage servers with $n = ak\sqrt{k}$, the parameter v is $b\sqrt{k}\ln k$ with $b > 5a$, and each key server queries 2 storage servers for each retrieval request, the probability of a successful retrieval is at least $1 - k/p - o(1)$.

2.2 Proxy Re-Encryption Schemes

Proxy re-encryption schemes were proposed by Mambo and Okamoto [14] and Blaze et al. [15]. In a proxy re-encryption scheme, a proxy server can transfer a ciphertext under a public key $PK_A$ to a new one under another public key $PK_B$ by using the re-encryption key $RK_{A \to B}$. The server does not know the plaintext during transformation. Ateniese et al. [16] proposed some proxy re-encryption schemes and applied them to the sharing function of secure storage systems. In their work, messages are first encrypted by the owner and then stored in a storage server. When a user wants to share his messages, he sends a re-encryption key to the storage server. The storage server re-encrypts the encrypted messages for the authorized user. Thus, their system has data confidentiality and supports the data forwarding function. Our work further integrates encryption, re-encryption, and encoding such that storage robustness is strengthened.
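To make the re-encryption idea concrete, the following sketch follows the well-known ElGamal-style bidirectional approach usually attributed to Blaze et al. [15]. It is not the scheme proposed in this paper. The group parameters are toy values chosen for readability and offer no security, and `rekey` takes both secret keys, which a real deployment would replace by a protocol between the two parties.

```python
import random

Q = 2039   # toy safe prime, Q = 2*P + 1 (far too small to be secure)
P = 1019   # prime order of the subgroup of squares modulo Q
G = 4      # generator of that subgroup


def keygen(rng):
    sk = rng.randrange(1, P)
    return sk, pow(G, sk, Q)


def encrypt(pk, m, rng):
    r = rng.randrange(1, P)
    return (m * pow(G, r, Q)) % Q, pow(pk, r, Q)   # (m * g^r, g^(a*r))


def rekey(sk_a, sk_b):
    return sk_b * pow(sk_a, -1, P) % P             # b / a mod P


def reencrypt(rk, ct):
    c1, c2 = ct
    return c1, pow(c2, rk, Q)                      # g^(a*r) becomes g^(b*r)


def decrypt(sk, ct):
    c1, c2 = ct
    g_r = pow(c2, pow(sk, -1, P), Q)               # recover g^r
    return c1 * pow(g_r, -1, Q) % Q
```

The proxy only raises one ciphertext component to the power `rk`, so it never sees the plaintext or either secret key.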

Type-based proxy re-encryption schemes proposed by Tang [17] provide a finer granularity on the granted right of a re-encryption key. In this kind of proxy re-encryption scheme, a user can decide which type of messages he wants to share and with whom. Key-private proxy re-encryption schemes were proposed by Ateniese et al. [18]. In a key-private proxy re-encryption scheme, given a re-encryption key, a proxy server cannot determine the identity of the recipient. This kind of proxy re-encryption scheme provides a higher privacy guarantee against proxy servers. Although most proxy re-encryption schemes use pairing operations, there exist proxy re-encryption schemes without pairing [19].

2.3 Integrity Checking Functionality

Another important functionality of cloud storage is integrity checking. After a user stores data into the storage system, he no longer possesses the data at hand. The user may want to check whether the data are properly stored in storage servers. The concept of provable data possession [20], [21] and the notion of proof of storage [22], [23], [24] have been proposed. Later, public auditability of stored data was addressed in [25]. Nevertheless, all of them consider the messages in the cleartext form.

3 SCENARIO

We present the scenario of the storage system, the threat model that we consider for the confidentiality issue, and a discussion for a straightforward solution.

3.1 System Model

As shown in Fig. 1, our system model consists of users, n storage servers $SS_1, SS_2, \ldots, SS_n$, and m key servers $KS_1, KS_2, \ldots, KS_m$. Storage servers provide storage services and key servers provide key management services. They work independently. Our distributed storage system consists of four phases: system setup, data storage, data forwarding, and data retrieval. These four phases are described as follows.

In the system setup phase, the system manager chooses system parameters and publishes them. Each user A is assigned a public-secret key pair $(PK_A, SK_A)$. User A distributes his secret key $SK_A$ to key servers such that each key server $KS_i$ holds a key share $SK_{A,i}$, $1 \le i \le m$. The key is shared with a threshold t.

In the data storage phase, user A encrypts his message M and dispatches it to storage servers. A message M is decomposed into k blocks $m_1, m_2, \ldots, m_k$ and has an identifier ID. User A encrypts each block $m_i$ into a ciphertext $C_i$ and sends it to v randomly chosen storage servers. Upon receiving ciphertexts from a user, each storage server linearly combines them with randomly chosen coefficients into a codeword symbol and stores it. Note that a storage server may receive fewer than k message blocks and we assume that all storage servers know the value k in advance.

In the data forwarding phase, user A forwards his encrypted message with an identifier ID stored in storage servers to user B such that B can decrypt the forwarded message by his secret key. To do so, A uses his secret key $SK_A$ and B's public key $PK_B$ to compute a re-encryption key $RK^{ID}_{A \to B}$ and then sends $RK^{ID}_{A \to B}$ to all storage servers. Each storage server uses the re-encryption key to re-encrypt its codeword symbol for later retrieval requests by B. The re-encrypted codeword symbol is the combination of ciphertexts under B's public key. In order to distinguish re-encrypted codeword symbols from intact ones, we call them original codeword symbols and re-encrypted codeword symbols, respectively.

In the data retrieval phase, user A requests to retrieve a message from storage servers. The message is either stored by him or forwarded to him. User A sends a retrieval request to key servers. Upon receiving the retrieval request and executing a proper authentication process with user A, each key server $KS_i$ requests u randomly chosen storage servers to get codeword symbols and does partial decryption on the received codeword symbols by using the key share $SK_{A,i}$. Finally, user A combines the partially decrypted codeword symbols to obtain the original message M.

System recovering. When a storage server fails, a new one is added. The new storage server queries k available storage servers, linearly combines the received codeword symbols as a new one and stores it. The system is then recovered.

3.2 Threat Model

We consider data confidentiality for both data storage and data forwarding. In this threat model, an attacker wants to break data confidentiality of a target user. To do so, the attacker colludes with all storage servers, nontarget users, and up to $(t - 1)$ key servers. The attacker analyzes stored messages in storage servers, the secret keys of nontarget users, and the shared keys stored in key servers. Note that the storage servers store all re-encryption keys provided by users. The attacker may try to generate a new re-encryption key from stored re-encryption keys. We formally model this attack by the standard chosen plaintext attack¹ of the proxy re-encryption scheme in a threshold version, as shown in Fig. 2.

Fig. 1. A general system model of our work.

1. Systems against chosen ciphertext attacks are more secure than systems against the chosen plaintext attack. Here, we only consider the chosen plaintext attack because a homomorphic encryption scheme is not secure against chosen ciphertext attacks. Consider a multiplicative homomorphic encryption scheme, where $D(SK, E(PK, m_1) \odot E(PK, m_2)) = m_1 \cdot m_2$ for the encryption function E, the decryption function D, a pair of public key PK and secret key SK, an operation $\odot$, and two messages $m_1$ and $m_2$. Given a challenge ciphertext C, where $C = E(PK, m_1)$, the attacker chooses $m_2$, computes $E(PK, m_2)$, and computes $C' = C \odot E(PK, m_2)$. The attacker queries $C'$ to the decryption oracle. The response $m = m_1 \cdot m_2$ from the decryption oracle reveals the plaintext $m_1$ to the attacker since $m_1 = m \cdot m_2^{-1}$.

The challenger C provides the system parameters. After the attacker A chooses a target user T, the challenger gives him $(t - 1)$ key shares of the secret key $SK_T$ of the target user T to model $(t - 1)$ compromised key servers. Then, the attacker can query secret keys of other users and all re-encryption keys except those from T to other users. This models compromised nontarget users and storage servers. In the challenge phase, the attacker chooses two messages $M_0$ and $M_1$ with the identifiers $ID_0$ and $ID_1$, respectively. The challenger throws a random coin b and encrypts the message $M_b$ with T's public key $PK_T$. After getting the ciphertext from the challenger, the attacker outputs a bit $b'$ for guessing b. In this game, the attacker wins if and only if $b' = b$. The advantage of the attacker is defined as $|1/2 - \Pr[b' = b]|$.

A cloud storage system modeled as above is secure if no probabilistic polynomial time attacker wins the game with a nonnegligible advantage. A secure cloud storage system implies that an unauthorized user or server cannot get the content of stored messages, and a storage server cannot generate re-encryption keys by itself. If a storage server could generate a re-encryption key from the target user to another user B, the attacker could win the security game by re-encrypting the ciphertext to B and decrypting the re-encrypted ciphertext using the secret key $SK_B$. Therefore, this model addresses the security of data storage and data forwarding.

3.3 A Straightforward Solution

A straightforward solution to supporting the data forwarding function in a distributed storage system is as follows: when the owner A wants to forward a message to user B, he downloads the encrypted message and decrypts it by using his secret key. He then encrypts the message by using B's public key and uploads the new ciphertext. When B wants to retrieve the forwarded message from A, he downloads the ciphertext and decrypts it by his secret key. The whole data forwarding process needs three communication rounds for A's downloading and uploading and B's downloading. The communication cost is linear in the length of the forwarded message. The computation cost is the decryption and encryption for the owner A, and the decryption for user B.

Proxy re-encryption schemes can significantly decrease the communication and computation cost of the owner. In a proxy re-encryption scheme, the owner sends a re-encryption key to storage servers such that storage servers perform the re-encryption operation for him. Thus, the communication cost of the owner is independent of the length of the forwarded message and the computation cost of re-encryption is taken care of by storage servers. Proxy re-encryption schemes significantly reduce the overhead of the data forwarding function in a secure storage system.

4 CONSTRUCTION OF SECURE CLOUD STORAGE SYSTEMS

Before presenting our storage system, we briefly introduce the algebraic setting, the hardness assumption, an erasure code over exponents, and our approach.

Bilinear map. Let $\mathbb{G}_1$ and $\mathbb{G}_2$ be cyclic multiplicative groups² with a prime order $p$ and $g \in \mathbb{G}_1$ be a generator. A map $\tilde{e}: \mathbb{G}_1 \times \mathbb{G}_1 \to \mathbb{G}_2$ is a bilinear map if it is efficiently computable and has the properties of bilinearity and nondegeneracy: for any $x, y \in \mathbb{Z}_p^*$, $\tilde{e}(g^x, g^y) = \tilde{e}(g, g)^{xy}$ and $\tilde{e}(g, g)$ is not the identity element in $\mathbb{G}_2$. Let $\mathrm{Gen}(1^\lambda)$ be an algorithm generating $(g, \tilde{e}, \mathbb{G}_1, \mathbb{G}_2, p)$, where $\lambda$ is the length of $p$. Let $x \in_R X$ denote that $x$ is randomly chosen from the set $X$.

Decisional bilinear Diffie-Hellman assumption. This assumption is that it is computationally infeasible to distinguish the distributions $(g, g^x, g^y, g^z, \tilde{e}(g, g)^{xyz})$ and $(g, g^x, g^y, g^z, \tilde{e}(g, g)^r)$, where $x, y, z, r \in_R \mathbb{Z}_p^*$. Formally, for any probabilistic polynomial time algorithm $\mathcal{A}$, the following is negligible (in $\lambda$):

$$\left| \Pr\left[\mathcal{A}(g, g^x, g^y, g^z, Q_b) = b : x, y, z, r \in_R \mathbb{Z}_p^*;\ Q_0 = \tilde{e}(g, g)^{xyz};\ Q_1 = \tilde{e}(g, g)^r;\ b \in_R \{0, 1\}\right] - 1/2 \right|.$$

Erasure coding over exponents. We consider that the message domain is the cyclic multiplicative group $\mathbb{G}_2$ described above. An encoder generates a generator matrix $G = [g_{i,j}]$ for $1 \le i \le k$, $1 \le j \le n$ as follows: for each row, the encoder randomly selects an entry and randomly sets a value from $\mathbb{Z}_p^*$ to the entry. The encoder repeats this step v times with replacement for each row. An entry of a row can be selected multiple times but is only set to one value. The values of the remaining entries are set to 0. Let the message be $(m_1, m_2, \ldots, m_k) \in \mathbb{G}_2^k$. The encoding process is to generate $(w_1, w_2, \ldots, w_n) \in \mathbb{G}_2^n$, where $w_j = m_1^{g_{1,j}} m_2^{g_{2,j}} \cdots m_k^{g_{k,j}}$ for $1 \le j \le n$. The first step of the decoding process is to compute the inverse of a $k \times k$ submatrix K of G. Let K consist of the columns $j_1, j_2, \ldots, j_k$ of G, that is, $K = [g_{i,j_l}]$ for $1 \le i, l \le k$. Let $K^{-1} = [d_{i,j}]_{1 \le i,j \le k}$. The final step of the decoding process is to compute $m_i = w_{j_1}^{d_{1,i}} w_{j_2}^{d_{2,i}} \cdots w_{j_k}^{d_{k,i}}$ for $1 \le i \le k$. An example is shown in Fig. 3. User A stores two messages $m_1$ and $m_2$ into four storage servers. When the storage servers $SS_1$ and $SS_3$ are available and the $k \times k$ submatrix K is invertible, user A can decode $m_1$ and $m_2$ from the codeword symbols $w_1, w_3$ and the coefficients $(g_{1,1}, 0), (0, g_{2,3})$, which are stored in the storage servers $SS_1$ and $SS_3$.
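The encoding and decoding steps over exponents can be sketched as follows. As an assumption made only for illustration, the pairing target group is replaced by the order-P subgroup of squares modulo a toy safe prime Q; the helper names and all numeric values are hypothetical.

```python
Q, P = 2039, 1019   # toy safe prime Q = 2P + 1; the squares mod Q have order P


def _prod(values):
    acc = 1
    for v in values:
        acc = acc * v % Q
    return acc


def encode(msg, gen):
    """w_j = prod_i msg[i] ** gen[i][j] (mod Q) for every column j of gen."""
    n = len(gen[0])
    return [_prod(pow(m, row[j], Q) for m, row in zip(msg, gen)) for j in range(n)]


def invert(mat):
    """Inverse of a square matrix mod P by Gauss-Jordan; None if singular."""
    k = len(mat)
    aug = [[x % P for x in row] + [int(i == r) for i in range(k)]
           for r, row in enumerate(mat)]
    for col in range(k):
        piv = next((r for r in range(col, k) if aug[r][col]), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [x * inv % P for x in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def decode(symbols, gen, cols):
    """Recover the message from the codeword symbols in the columns `cols`."""
    k = len(gen)
    K = [[gen[i][j] for j in cols] for i in range(k)]
    D = invert(K)
    if D is None:
        return None
    # m_i = prod_l w_{j_l} ** d_{l,i}
    return [_prod(pow(symbols[cols[l]], D[l][i], Q) for l in range(k))
            for i in range(k)]
```

Message symbols must lie in the order-P subgroup (here, squares such as 9 or 25) so that exponents can be reduced modulo P during decoding.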

2. It can also be described as additive groups over points on an elliptic curve.

Fig. 2. The security game for the chosen plaintext attack.

Our approach. We use a threshold proxy re-encryption scheme with multiplicative homomorphic property. An encryption scheme is multiplicative homomorphic if it supports a group operation $\odot$ on encrypted plaintexts without decryption

$$D(SK, E(PK, m_1) \odot E(PK, m_2)) = m_1 \cdot m_2,$$

where E is the encryption function, D is the decryption function, and $(PK, SK)$ is a pair of public key and secret key. Given two coefficients $g_1$ and $g_2$, two message symbols $m_1$ and $m_2$ can be encoded to a codeword symbol $m_1^{g_1} m_2^{g_2}$ in the encrypted form

$$C = E(PK, m_1)^{g_1} \odot E(PK, m_2)^{g_2} = E(PK, m_1^{g_1} \cdot m_2^{g_2}).$$

Thus, a multiplicative homomorphic encryption scheme supports the encoding operation over encrypted messages. We then convert a proxy re-encryption scheme with multiplicative homomorphic property into a threshold version. A secret key is shared to key servers with a threshold value t via the Shamir secret sharing scheme [26], where $t \ge k$. In our system, to decrypt for a set of k message symbols, each key server independently queries 2 storage servers and partially decrypts two encrypted codeword symbols. As long as t key servers are available, k codeword symbols are obtained from the partially decrypted ciphertexts.
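The encoding of encrypted symbols can be checked on any multiplicatively homomorphic scheme. The sketch below uses plain ElGamal over a toy group as a stand-in for the pairing-based scheme of this paper; the parameters, keys, and function names are assumptions chosen for readability and provide no security.

```python
import random

Q, P, G = 2039, 1019, 4   # toy group: G generates the order-P subgroup mod Q


def encrypt(pk, m, rng):
    r = rng.randrange(1, P)
    return pow(G, r, Q), m * pow(pk, r, Q) % Q       # (g^r, m * pk^r)


def decrypt(sk, ct):
    c1, c2 = ct
    return c2 * pow(pow(c1, sk, Q), -1, Q) % Q       # c2 / c1^sk


def encode(cts, coeffs):
    """Combine ciphertexts as prod_i C_i ** g_i, component by component."""
    a = b = 1
    for (c1, c2), g in zip(cts, coeffs):
        a = a * pow(c1, g, Q) % Q
        b = b * pow(c2, g, Q) % Q
    return a, b
```

Decrypting the combined ciphertext yields the codeword symbol $m_1^{g_1} m_2^{g_2}$, so a storage server can encode without ever holding a decryption key.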

4.1 A Secure Cloud Storage System with Secure Forwarding

As described in Section 3.1, there are four phases of our storage system.

System setup. The algorithm $\mathrm{SetUp}(1^\lambda)$ generates the system parameters $\mu$. A user uses $\mathrm{KeyGen}(\mu)$ to generate his public and secret key pair and $\mathrm{ShareKeyGen}(\cdot)$ to share his secret key to a set of m key servers with a threshold t, where $k \le t \le m$. The user locally stores the third component of his secret key.

• $\mathrm{SetUp}(1^\lambda)$. Run $\mathrm{Gen}(1^\lambda)$ to obtain $(g, h, \tilde{e}, \mathbb{G}_1, \mathbb{G}_2, p)$, where $\tilde{e}: \mathbb{G}_1 \times \mathbb{G}_1 \to \mathbb{G}_2$ is a bilinear map, g and h are generators of $\mathbb{G}_1$, and both $\mathbb{G}_1$ and $\mathbb{G}_2$ have the prime order p. Set $\mu = (g, h, \tilde{e}, \mathbb{G}_1, \mathbb{G}_2, p, f)$, where $f: \mathbb{Z}_p^* \times \{0, 1\}^* \to \mathbb{Z}_p^*$ is a one-way hash function.

• $\mathrm{KeyGen}(\mu)$. For a user A, the algorithm selects $a_1, a_2, a_3 \in_R \mathbb{Z}_p^*$ and sets

$$PK_A = (g^{a_1}, h^{a_2}), \quad SK_A = (a_1, a_2, a_3).$$

• $\mathrm{ShareKeyGen}(SK_A, t, m)$. This algorithm shares the secret key $SK_A$ of a user A to a set of m key servers by using two polynomials $f_{A,1}(z)$ and $f_{A,2}(z)$ of degree $(t - 1)$ over the finite field GF(p)

$$f_{A,1}(z) = a_1 + v_1 z + v_2 z^2 + \cdots + v_{t-1} z^{t-1} \pmod{p},$$

$$f_{A,2}(z) = a_2^{-1} + v_1 z + v_2 z^2 + \cdots + v_{t-1} z^{t-1} \pmod{p},$$

where $v_1, v_2, \ldots, v_{t-1} \in_R \mathbb{Z}_p^*$. The key share of the secret key $SK_A$ to the key server $KS_i$ is $SK_{A,i} = (f_{A,1}(i), f_{A,2}(i))$, where $1 \le i \le m$.
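The sharing step is standard Shamir secret sharing [26]. A minimal sketch for one polynomial is given below, assuming a toy prime `P` in place of the group order and hypothetical helper names; the second polynomial would be handled in the same way with the constant term $a_2^{-1}$.

```python
import random

P = 1019   # toy prime standing in for the group order p


def share(secret, t, m, rng):
    """Return {i: f(i)} for i = 1..m, with deg f = t - 1 and f(0) = secret."""
    coeffs = [secret % P] + [rng.randrange(1, P) for _ in range(t - 1)]
    return {i: sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, m + 1)}


def recover(shares):
    """Lagrange interpolation at z = 0 from a dict {index: share}."""
    total = 0
    for s, y in shares.items():
        lam = 1
        for s2 in shares:
            if s2 != s:
                # Lagrange factor (0 - s2) / (s - s2) mod P
                lam = lam * (-s2) * pow((s - s2) % P, -1, P) % P
        total = (total + y * lam) % P
    return total
```

Any t of the m shares recover the secret, while fewer than t shares leave it undetermined.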

Data storage. When user A wants to store a message of k blocks $m_1, m_2, \ldots, m_k$ with the identifier ID, he computes the identity token $\tau = h^{f(a_3, ID)}$ and performs the encryption algorithm $\mathrm{Enc}(\cdot)$ on $\tau$ and k blocks to get k original ciphertexts $C_1, C_2, \ldots, C_k$. An original ciphertext is indicated by a leading bit $b = 0$. User A sends each ciphertext $C_i$ to v randomly chosen storage servers. A storage server receives a set of original ciphertexts with the same identity token $\tau$ from A. When a ciphertext $C_i$ is not received, the storage server inserts $C_i = (0, 1, \tau, 1)$ to the set. The special format of $(0, 1, \tau, 1)$ is a mark for the absence of $C_i$. The storage server performs $\mathrm{Encode}(\cdot)$ on the set of k ciphertexts and stores the encoded result (codeword symbol).

• $\mathrm{Enc}(PK_A, \tau, m_1, m_2, \ldots, m_k)$. For $1 \le i \le k$, this algorithm computes

$$C_i = (0, \alpha_i, \beta, \gamma_i) = (0, g^{r_i}, \tau, m_i\,\tilde{e}(g^{a_1}, \tau^{r_i})),$$

where $r_i \in_R \mathbb{Z}_p^*$, $1 \le i \le k$, and 0 is the leading bit indicating an original ciphertext.

• $\mathrm{Encode}(C_1, C_2, \ldots, C_k)$. For each ciphertext $C_i$, the algorithm randomly selects a coefficient $g_i$. If some ciphertext $C_i$ is $(0, 1, \tau, 1)$, the coefficient $g_i$ is set to 0. Let $C_i = (0, \alpha_i, \beta, \gamma_i)$. The encoding process is to compute an original codeword symbol $C'$

$$C' = \left(0, \prod_{i=1}^{k} \alpha_i^{g_i}, \beta, \prod_{i=1}^{k} \gamma_i^{g_i}\right) = \left(0, g^{\sum_{i=1}^{k} g_i r_i}, \tau, \prod_{i=1}^{k} m_i^{g_i}\,\tilde{e}(g^{a_1}, \tau)^{\sum_{i=1}^{k} g_i r_i}\right) = \left(0, g^{r'}, \tau, W\,\tilde{e}(g, \tau)^{a_1 r'}\right),$$

where $W = \prod_{i=1}^{k} m_i^{g_i}$ and $r' = \sum_{i=1}^{k} g_i r_i$. The encoded result is $(C', g_1, g_2, \ldots, g_k)$.
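The algebra of Enc and Encode can be checked with a deliberately insecure exponent-tracking model, which is our own device and not part of the scheme: an element $g^x$ of the source group is stored as x, an element $\tilde{e}(g, g)^y$ of the target group as y, and the pairing multiplies exponents. All names and values below are hypothetical.

```python
import random

P = 1019          # toy prime group order (assumption)
IDENTITY = 0      # stands for the group element written as 1 in the text
G_GEN = 1         # stands for the generator g = g^1


def mul(x, y):        # product of two group elements (adds exponents)
    return (x + y) % P


def power(x, c):      # group element raised to the scalar c
    return x * c % P


def pairing(x, y):    # e(g^x, g^y) = e(g, g)^(x*y)
    return x * y % P


def enc(pk1, tau, blocks, rng):
    """C_i = (0, g^r_i, tau, m_i * e(g^a1, tau^r_i)) for every block m_i."""
    out = []
    for m in blocks:
        r = rng.randrange(1, P)
        out.append((0, power(G_GEN, r), tau, mul(m, pairing(pk1, power(tau, r)))))
    return out


def encode(cts, rng):
    """C' = (0, prod alpha_i^g_i, tau, prod gamma_i^g_i) plus the coefficients."""
    tau = cts[0][2]
    alpha, gamma, coeffs = IDENTITY, IDENTITY, []
    for _, a, _, c in cts:
        absent = (a, c) == (IDENTITY, IDENTITY)   # the marker (0, 1, tau, 1)
        g = 0 if absent else rng.randrange(1, P)
        coeffs.append(g)
        alpha = mul(alpha, power(a, g))
        gamma = mul(gamma, power(c, g))
    return (0, alpha, tau, gamma), coeffs
```

In this model the last component of the encoded result equals $W\,\tilde{e}(g, \tau)^{a_1 r'}$, matching the closed form above, and an absent ciphertext contributes nothing because its coefficient is 0.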

Fig. 3. A storage system with random linear coding over exponents.

Data forwarding. User A wants to forward a message to another user B. He needs the first component $a_1$ of his secret key. If A does not possess $a_1$, he queries key servers for key shares. When at least t key servers respond, A recovers the first component $a_1$ of the secret key $SK_A$ via the $\mathrm{KeyRecover}(\cdot)$ algorithm. Let the identifier of the message be ID. User A computes the re-encryption key $RK^{ID}_{A \to B}$ via the $\mathrm{ReKeyGen}(\cdot)$ algorithm and securely sends the re-encryption key to each storage server. By using $RK^{ID}_{A \to B}$, a storage server re-encrypts the original codeword symbol $C'$ with the identifier ID into a re-encrypted codeword symbol $C''$ via the $\mathrm{ReEnc}(\cdot)$ algorithm such that $C''$ is decryptable by using B's secret key. A re-encrypted codeword symbol is indicated by the leading bit $b = 1$. Let the public key $PK_B$ of user B be $(g^{b_1}, h^{b_2})$.

• $\mathrm{KeyRecover}(SK_{A,i_1}, SK_{A,i_2}, \ldots, SK_{A,i_t})$. Let $T = \{i_1, i_2, \ldots, i_t\}$. This algorithm recovers $a_1$ via Lagrange interpolation as follows:

$$a_1 = \left(\sum_{s \in T} f_{A,1}(s) \prod_{s' \in T \setminus \{s\}} \frac{-s'}{s - s'}\right) \bmod p.$$

• $\mathrm{ReKeyGen}(PK_A, SK_A, ID, PK_B)$. This algorithm selects $e \in_R \mathbb{Z}_p^*$ and computes

$$RK^{ID}_{A \to B} = \left((h^{b_2})^{a_1 (f(a_3, ID) + e)},\ h^{a_1 e}\right).$$

• $\mathrm{ReEnc}(RK^{ID}_{A \to B}, C')$. Let $C' = (0, \alpha, \beta, \gamma) = (0, g^{r'}, \tau, W\,\tilde{e}(g^{a_1}, \tau^{r'}))$ for some $r'$ and some $W$, and $RK^{ID}_{A \to B} = (h^{b_2 a_1 (f(a_3, ID) + e)}, h^{a_1 e})$ for some e. The re-encrypted codeword symbol is computed as follows:

$$C'' = \left(1, \alpha, h^{b_2 a_1 (f(a_3, ID) + e)}, \gamma \cdot \tilde{e}(\alpha, h^{a_1 e})\right) = \left(1, g^{r'}, h^{b_2 a_1 (f(a_3, ID) + e)}, W\,\tilde{e}(g, h)^{a_1 r' (f(a_3, ID) + e)}\right).$$

Note that the leading bit 1 indicates $C''$ is a re-encrypted ciphertext.
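The effect of ReKeyGen and ReEnc can be verified in an insecure exponent-tracking model of our own, in which $g^x$ is stored as x, $\tilde{e}(g, g)^y$ as y, and the pairing multiplies exponents. The argument `f_id` stands in for the hash value $f(a_3, ID)$, and every name and number here is hypothetical.

```python
P = 1019   # toy prime group order (assumption)


def mul(x, y):        # product of two group elements (adds exponents)
    return (x + y) % P


def div(x, y):        # quotient of two group elements
    return (x - y) % P


def power(x, c):      # group element raised to the scalar c
    return x * c % P


def pairing(x, y):    # e(g^x, g^y) = e(g, g)^(x*y)
    return x * y % P


def rekeygen(a1, f_id, h, pk_b2, e):
    """RK = ((h^b2)^(a1*(f+e)), h^(a1*e)), where pk_b2 = h^b2."""
    return power(pk_b2, a1 * (f_id + e)), power(h, a1 * e)


def reenc(rk, codeword):
    """C'' = (1, alpha, rk1, gamma * e(alpha, rk2))."""
    rk1, rk2 = rk
    _, alpha, _, gamma = codeword
    return (1, alpha, rk1, mul(gamma, pairing(alpha, rk2)))
```

After re-encryption, raising the third component to $b_2^{-1}$ gives $h^{a_1 (f(a_3, ID) + e)}$, and pairing it with $\alpha$ removes the blinding factor from the last component, which is how user B's key shares are used in the retrieval phase.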

Data retrieval.There are two cases for the data retrieval phase. The first case is that a user A retrieves his own message. When user A wants to retrieve the message with the identifier ID, he informs all key servers with the identity token . A key server first retrieves original codeword symbols from u randomly chosen storage servers and then performs partial decryption ShareDecðÞ on every retrieved original codeword symbol C0. The result of partial decryption is called a partially decrypted codeword symbol. The key server sends the partially decrypted codeword symbols  and the coefficients to user A. After user A collects replies from at least t key servers and at least k of them are originally from distinct storage servers, he executes CombineðÞ on the t partially decrypted codeword symbols to recover the blocks m1; m2; . . . ; mk. The second case is that a user B retrieves a

message forwarded to him. User B informs all key servers directly. The collection and combining parts are the same as the first case except that key servers retrieve re-encrypted codeword symbols and perform partial decryption Share-DecðÞ on re-encrypted codeword symbols.

. ShareDec(SKj; Xi). Xiis a codeword symbol, where

Xi¼ ðb; ; ; ) and b is the indicator for original

and re-encrypted codeword symbols. SKj is a key

share, where SKj¼ ðsk0; sk1Þ. By using the key share

SKj, the partially decrypted codeword symbol i;j of

Xi is generated as follows:

i;j¼ ðb; ; ; skb; Þ:

. Combine(i1;j1; i2;j2; . . . ; it;jt). Let a partially

de-crypted codeword symbol i;jbe ðb; i;j; i;j; 0i;j; i;jÞ.

This algorithm combines t partially decrypted code-word symbols, where i1;j1¼ i2;j2¼    ¼ it;jt¼ ,

j16¼ j26¼ . . . 6¼ jt and there are at least k distinct

values in fi1; i2; . . . ; itg. Let SJ¼ fj1; j2; . . . ; jtg and

S¼ fði1; j1Þ; ði2; j2Þ; . . . ; ðit; jtÞg. Without loss of

gen-erality, let SI¼ fi1; i2; . . . ; ikg be k distinct values in

fi1; i2; . . . ; itg.

In the first case, $b = 0$ for original codeword symbols, and user A wants to retrieve his own message. The algorithm combines the $t$ values $(\beta'_{i_1,j_1}, \beta'_{i_2,j_2}, \ldots, \beta'_{i_t,j_t})$ to obtain $\beta^{a_1} = \beta^{f_{A,1}(0)}$ via the Lagrange interpolation over exponents:

$$\beta^{a_1} = \prod_{(i,j) \in S} \left(\beta'_{i,j}\right)^{\prod_{r \in S_J,\, r \neq j} \frac{-r}{j-r}} = \beta^{f_{A,1}(0)}.$$

For each of the partially decrypted codeword symbols $\zeta_{i,j}$, where $i \in S_I$, the algorithm computes an encoded block

$$w_i = \frac{\gamma_{i,j}}{\tilde{e}(\alpha_{i,j}, \beta^{f_{A,1}(0)})} = \frac{w_i\, \tilde{e}(g^{a_1}, \beta^{r'})}{\tilde{e}(g^{r'}, \beta^{f_{A,1}(0)})}, \qquad (1)$$

for some $r'$, where $f_{A,1}(0) = a_1$. Observe that $w_i = m_1^{g_{1,i}} m_2^{g_{2,i}} \cdots m_k^{g_{k,i}}$ for $i \in S_I$, and there are $k$ such equations. Consider the square matrix $K = [g_{i,j}]$, where $1 \le i \le k$, $j \in S_I$. The decoding process is to compute $K^{-1}$ and output the blocks $m_1, m_2, \ldots, m_k$. The algorithm fails when the square matrix $K$ is noninvertible. We shall analyze the probability of $K$ being noninvertible in Section 4.2.

In the second case, $b = 1$ for re-encrypted codeword symbols, and user B wants to retrieve the message forwarded to him. The algorithm does the following computation to obtain

$$h^{(f(a_3, ID) + e) a_1} = \prod_{(i,j) \in S} \left(\beta'_{i,j}\right)^{\prod_{r \in S_J,\, r \neq j} \frac{-r}{j-r}} = h^{(f(a_3, ID) + e) a_1 b_2 f_{B,2}(0)},$$

where $f_{B,2}(0) = b_2^{-1}$. Again, for each of $\zeta_{i,j}$, where $i \in S_I$, the algorithm computes an encoded block

$$w_i = \frac{\gamma_{i,j}}{\tilde{e}(\alpha_{i,j}, h^{(f(a_3, ID) + e) a_1})} = \frac{w_i\, \tilde{e}(g, h)^{a_1 r' (f(a_3, ID) + e)}}{\tilde{e}(g^{r'}, h^{(f(a_3, ID) + e) a_1})}, \qquad (2)$$

for some $e$ and $r'$. The rest in the second case is the same as that in the first case.
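The interpolation step of Combine(·) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pairing group is replaced by a toy prime-order subgroup of the integers modulo a small prime, and the names `share_secret`, `lagrange_at_zero`, and `combine_in_exponent` are hypothetical. The sketch only shows how $t$ partial results of the form $\beta^{sk_j}$ can be combined into $\beta^{a_1}$ without any party reconstructing $a_1$.

```python
import random

# Toy parameters (assumptions for illustration, far too small for real use).
P = 2039  # safe prime, P = 2*Q + 1
Q = 1019  # prime order of the subgroup of squares modulo P
G = 4     # generator of that order-Q subgroup


def share_secret(secret, t, m, rng):
    """Shamir-share `secret` among servers 1..m with threshold t (arithmetic mod Q)."""
    coeffs = [secret % Q] + [rng.randrange(Q) for _ in range(t - 1)]
    return {
        j: sum(c * pow(j, e, Q) for e, c in enumerate(coeffs)) % Q
        for j in range(1, m + 1)
    }


def lagrange_at_zero(j, indices):
    """Lagrange coefficient prod_{r != j} (-r)/(j - r) mod Q, evaluated at x = 0."""
    num, den = 1, 1
    for r in indices:
        if r != j:
            num = num * (-r) % Q
            den = den * (j - r) % Q
    return num * pow(den, -1, Q) % Q  # modular inverse needs Python 3.8+


def combine_in_exponent(partials):
    """Combine {j: beta^{sk_j}} from t distinct servers into beta^{f(0)}."""
    indices = list(partials)
    result = 1
    for j, value in partials.items():
        result = result * pow(value, lagrange_at_zero(j, indices), P) % P
    return result
```

For example, sharing a secret exponent with `share_secret(123, 3, 5, random.Random(7))`, raising a group element to any three of the resulting shares, and passing those values to `combine_in_exponent` yields the same value as raising the element to 123 directly.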

4.2 Analysis

We analyze storage and computation complexities, correctness, and security of our cloud storage system in this section. Let the bit-length of an element in the group $\mathbb{G}_1$ be $l_1$ and that of an element in $\mathbb{G}_2$ be $l_2$. Let the coefficients $g_{i,j}$ be randomly chosen from $\{0, 1\}^{l_3}$.

Storage cost. To store a message of $k$ blocks, a storage server $SS_j$ stores a codeword symbol $(b, \alpha_j, \beta, \gamma_j)$ and the coefficient vector $(g_{1,j}, g_{2,j}, \ldots, g_{k,j})$. They total $(1 + 2l_1 + l_2 + k l_3)$ bits, where $\alpha_j, \beta \in \mathbb{G}_1$ and $\gamma_j \in \mathbb{G}_2$. The average cost for a message bit stored in a storage server is $(1 + 2l_1 + l_2 + k l_3)/(k l_2)$ bits, which is dominated by $l_3/l_2$ for a sufficiently large $k$. In practice, small coefficients, i.e., $l_3 \ll l_2$, reduce the storage cost in each storage server.
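The storage-cost expression can be evaluated numerically with a short Python sketch. The function names and the bit lengths used in the usage note are hypothetical values chosen for illustration, not parameters taken from the paper.

```python
def storage_bits(k, l1, l2, l3):
    """Bits held by one storage server for a k-block message:
    1 indicator bit, two G1 elements, one G2 element, and k coefficients."""
    return 1 + 2 * l1 + l2 + k * l3


def cost_per_message_bit(k, l1, l2, l3):
    """Average stored bits per message bit, following the (k * l2) denominator in the text."""
    return storage_bits(k, l1, l2, l3) / (k * l2)
```

With the made-up values `l1 = 100`, `l2 = 200`, and `l3 = 20`, the per-bit cost approaches `l3 / l2 = 0.1` as `k` grows, which matches the statement that the ratio of coefficient length to block length dominates for large `k`.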

Computation cost. We measure the computation cost by the number of pairing operations, modular exponentiations in $\mathbb{G}_1$ and $\mathbb{G}_2$, modular multiplications in $\mathbb{G}_1$ and $\mathbb{G}_2$, and arithmetic operations over $GF(p)$. These operations are denoted as Pairing, $\mathrm{Exp}_1$, $\mathrm{Exp}_2$, $\mathrm{Mult}_1$, $\mathrm{Mult}_2$, and $F_p$, respectively. The cost is summarized in Table 1. Computing an $F_p$ takes much less time than computing a $\mathrm{Mult}_1$ or a $\mathrm{Mult}_2$. The time of computing an $\mathrm{Exp}_1$ is $1.5\lceil \log p \rceil$ times as much as the time of computing a $\mathrm{Mult}_1$, on average (by using the square-and-multiply algorithm). Similarly, the time of computing an $\mathrm{Exp}_2$ is $1.5\lceil \log p \rceil$ times as much as the time of computing a $\mathrm{Mult}_2$, on average.
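The factor of 1.5 per exponent bit comes from the structure of square-and-multiply, as the following minimal Python sketch shows. It works over plain integers modulo a number rather than in a pairing group, and the function name is hypothetical. Each exponent bit costs one squaring, and each 1-bit costs one further multiplication, so a random exponent, whose bits are 1 about half the time, needs roughly 1.5 multiplications per bit.

```python
def square_and_multiply(base, exponent, modulus):
    """Return (base**exponent % modulus, number of modular multiplications used)."""
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:  # exponent bits, most significant first
        result = result * result % modulus  # one squaring for every bit
        mults += 1
        if bit == "1":
            result = result * base % modulus  # one extra multiplication for a 1-bit
            mults += 1
    return result, mults
```

For the exponent 13 (binary 1101) the routine uses four squarings and three multiplications, seven operations in total.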

In the data storage phase, a user runs the Enc(·) algorithm and each storage server performs the Encode(·) algorithm. In the Enc(·) algorithm, generating each $\alpha_i$ requires an $\mathrm{Exp}_1$, and generating each $\gamma_i$ requires an $\mathrm{Exp}_1$, a Pairing, and a $\mathrm{Mult}_2$. Hence, for $k$ blocks of a message, the cost is ($k$ Pairing $+\ 2k\ \mathrm{Exp}_1 + k\ \mathrm{Mult}_2$). For the Encode(·) algorithm, each storage server encodes $k$ ciphertexts at most. The cost is $k\ \mathrm{Exp}_1 + (k - 1)\ \mathrm{Mult}_1$ for computing $\alpha$ and $k\ \mathrm{Exp}_2 + (k - 1)\ \mathrm{Mult}_2$ for computing $\gamma$.
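The operation counts for the data storage phase can be tallied with a small Python sketch. The function names are hypothetical, and the counts are copied from the paragraph above; the sketch adds no measurements of its own.

```python
from collections import Counter


def enc_cost(k):
    """Operations a user performs in Enc for a k-block message."""
    return Counter({"Pairing": k, "Exp1": 2 * k, "Mult2": k})


def encode_cost(k):
    """Upper bound on operations one storage server performs in Encode (k ciphertexts)."""
    return Counter({"Exp1": k, "Mult1": k - 1, "Exp2": k, "Mult2": k - 1})
```

Adding the two counters, for example `enc_cost(4) + encode_cost(4)`, gives the combined count for one user and one storage server.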

In the data forwarding phase, a user runs KeyRecover(·) and ReKeyGen(·) and each storage server performs ReEnc(·). In the KeyRecover(·) algorithm, the computation cost is $O(t^2)\ F_p$. In the ReKeyGen(·) algorithm, the computation cost is an $\mathrm{Exp}_1$. In the ReEnc(·) algorithm, the computation cost is a Pairing and a $\mathrm{Mult}_1$.

In the data retrieval phase, each key server runs the ShareDec(·) algorithm and the user performs the Combine(·) algorithm. In the ShareDec(·) algorithm, each key server performs an $\mathrm{Exp}_1$ to get $\beta^{sk_b}$ for a codeword symbol. For a successful retrieval, $t$ key servers would be sufficient; hence, for this step, the total cost of $t$ key servers is $t\ \mathrm{Exp}_1$. The Combine(·) algorithm requires the computation of the Lagrange interpolation over exponents in $\mathbb{G}_1$, the computation of the encoded blocks $w_j$ from the partially decrypted codeword symbols $\zeta_{i,j}$, and the decoding computation, which performs the matrix inversion and the recovery of the blocks $m_i$ from the encoded blocks $w_j$. The Lagrange interpolation over exponents in $\mathbb{G}_1$ needs $O(t^2)\ F_p$, $t\ \mathrm{Exp}_1$, and $(t - 1)\ \mathrm{Mult}_1$. Computing an encoded block $w_j$ needs one Pairing and one modular division, which takes 2 $\mathrm{Mult}_2$. As for the decoding computation, the matrix inversion takes $O(k^3)$ arithmetic operations over $GF(p)$, and the decoding for each block takes $k\ \mathrm{Exp}_2$ and $(k - 1)\ \mathrm{Mult}_2$.
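The decoding computation (matrix inversion over a prime field, then recovery of each block as a product of powers of the encoded blocks) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the target group is replaced by a toy order-`Q` subgroup of the integers modulo `P`, and the names `invert_mod`, `encode_blocks`, and `decode_blocks` are hypothetical.

```python
# Toy parameters (assumptions for illustration only).
P = 2039  # prime modulus, P = 2*Q + 1
Q = 1019  # prime order of the subgroup of squares modulo P


def invert_mod(matrix, q):
    """Invert a k x k matrix over GF(q) by Gauss-Jordan elimination, O(k^3) field operations."""
    k = len(matrix)
    aug = [[x % q for x in row] + [int(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")  # decoding fails in this case
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, q)  # modular inverse needs Python 3.8+
        aug[col] = [x * inv % q for x in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % q for x, y in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def encode_blocks(blocks, K):
    """Encoded block j is the product over i of blocks[i] ** K[i][j], modulo P."""
    k = len(blocks)
    encoded = []
    for j in range(k):
        w = 1
        for i in range(k):
            w = w * pow(blocks[i], K[i][j], P) % P
        encoded.append(w)
    return encoded


def decode_blocks(encoded, K):
    """Block i is the product over j of encoded[j] ** K_inv[j][i], modulo P."""
    k = len(encoded)
    K_inv = invert_mod(K, Q)
    blocks = []
    for i in range(k):
        m = 1
        for j in range(k):
            m = m * pow(encoded[j], K_inv[j][i], P) % P  # k exponentiations per block
        blocks.append(m)
    return blocks
```

The blocks must lie in the order-`Q` subgroup for the exponent arithmetic modulo `Q` to be valid. A singular coefficient matrix raises `ValueError`, which corresponds to the failure case of Combine(·).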

Correctness. There are two cases for correctness: the owner A correctly retrieves his message, and user B correctly retrieves a message forwarded to him. The correctness of encryption and decryption for A can be seen in (1). The correctness of re-encryption and decryption for B can be seen in (2). As long as at least $k$ storage servers are available, a user can retrieve data with an overwhelming probability. Thus, our storage system tolerates $n - k$ server failures.
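The cancellation behind (1) follows from the bilinearity of the pairing $\tilde{e}$. As a brief worked step, write $\beta$ for the shared second component of the codeword symbols and assume, as the right-hand side of (1) suggests, that an encoded codeword symbol has $\alpha = g^{r'}$ and $\gamma = w_i\,\tilde{e}(g^{a_1}, \beta^{r'})$. Then:

```latex
\begin{align*}
\frac{\gamma}{\tilde{e}(\alpha, \beta^{a_1})}
  = \frac{w_i\,\tilde{e}(g^{a_1}, \beta^{r'})}{\tilde{e}(g^{r'}, \beta^{a_1})}
  = \frac{w_i\,\tilde{e}(g, \beta)^{a_1 r'}}{\tilde{e}(g, \beta)^{r' a_1}}
  = w_i .
\end{align*}
```

The same argument applies to (2) with $h^{(f(a_3, ID) + e) a_1}$ in place of $\beta^{a_1}$.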

The probability of a successful retrieval. A successful retrieval is an event that a user successfully retrieves all $k$ blocks of a message, no matter whether the message is owned by him or forwarded to him. The randomness comes from the random selection of storage servers in the data storage phase, the random coefficients chosen by storage servers, and the random selection of key servers in the data retrieval phase. The probability of a successful retrieval depends on $(n, k, u, v)$ and all randomness.

The methodology of analysis is similar to that in [13] and [6]. However, we consider a different system model from the one in [13] and a more flexible parameter setting for $n = ak^c$ than the settings in [13] and [6]. The difference between our system model and the one in [13] is that our system model has key servers. In [13], a single user queries $k$ distinct storage servers to retrieve the data. On the other hand, each key server in our system independently queries $u$ storage servers. The use of distributed key servers increases the level of key protection but makes the analysis harder.

The ratio $n/k$ is considered as a fixed constant in [13]. In [6], the setting is extended to $n = ak^{3/2}$. Our generalization of the parameter setting to $n = ak^c$, where $c \ge 1.5$, allows the number of storage servers to be much greater than the number of blocks of a message. It gives better flexibility for adjustment between the number of storage servers and robustness. This generalization is obtained by observing that $\Pr[E_1]$ is better bounded by choosing $c \ge 1.5$. The proof of Theorem 1 is given in Appendix A, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPDS.2011.252.

Theorem 1. Assume that there are $k$ blocks of a message, $n$ storage servers, and $m$ key servers, where $n = ak^c$, $m \ge t \ge k$, $c \ge 1.5$, and $a$ is a constant with $a > \sqrt{2}$. For $v = bk^{c-1} \ln k$ and $u = 2$ with $b > 5a$, the probability of a successful retrieval is at least $1 - k/p - o(1)$.
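The parameter choices in Theorem 1 can be computed with a short Python sketch. The function name is hypothetical, and two details are assumptions that the theorem does not state: non-integer values are rounded up, and `k` must be at least 2 so that `ln k` is positive.

```python
import math


def theorem1_parameters(k, a, b, c=1.5):
    """Return (n, v, u) for a k-block message under the conditions of Theorem 1."""
    if k < 2:
        raise ValueError("need k >= 2 so that ln k > 0")  # assumption of this sketch
    if not (a > math.sqrt(2) and b > 5 * a and c >= 1.5):
        raise ValueError("Theorem 1 requires a > sqrt(2), b > 5a, and c >= 1.5")
    n = math.ceil(a * k ** c)  # number of storage servers, n = a * k^c
    v = math.ceil(b * k ** (c - 1) * math.log(k))  # v = b * k^(c-1) * ln k
    u = 2  # storage servers queried by each key server
    return n, v, u
```

For instance, `theorem1_parameters(100, 1.5, 8)` corresponds to about 1500 storage servers with `v = 369` and `u = 2`.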

Security. The data confidentiality of our cloud storage system is guaranteed even if all storage servers, nontarget users, and up to $(t - 1)$ key servers are compromised by the attacker. Recall the security game illustrated in Fig. 2. The proof for Theorem 2 is provided in Appendix B, available in the online supplementary material.

Theorem 2. Our cloud storage system described in Section 4.1 is secure under the threat model in Section 3.2 if the decisional bilinear Diffie-Hellman assumption holds.

5 DISCUSSION AND CONCLUSION

TABLE 1
The Computation Cost of Each Algorithm in Our Secure Cloud Storage System

In this paper, we consider a cloud storage system consisting of storage servers and key servers. We integrate a newly proposed threshold proxy re-encryption scheme and erasure codes over exponents. The threshold proxy re-encryption scheme supports encoding, forwarding, and partial decryption operations in a distributed way. To decrypt a message of $k$ blocks that are encrypted and encoded to $n$ codeword symbols, each key server only has to partially decrypt two codeword symbols in our system. By using the threshold proxy re-encryption scheme, we present a secure cloud storage system that provides secure data storage and secure data forwarding functionality in a decentralized structure. Moreover, each storage server independently performs encoding and re-encryption and each key server independently performs partial decryption.

Our storage system and some newly proposed content addressable file systems and storage systems [27], [28], [29] are highly compatible. Our storage servers act as storage nodes in a content addressable storage system for storing content addressable blocks. Our key servers act as access nodes for providing a front-end layer such as a traditional file system interface. Further study on detailed cooperation is required.

ACKNOWLEDGMENTS

The authors thank anonymous reviewers for their valuable comments. The research was supported in part by projects ICTL-100-Q707, ATU-100-W958, NSC 98-2221-E-009-068-MY3, NSC E-009-017-, and NSC 99-2218-E-009-020.

REFERENCES

[1] J. Kubiatowicz, D. Bindel, Y. Chen, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao, “Oceanstore: An Architecture for Global-Scale Persistent Storage,” Proc. Ninth Int’l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 190-201, 2000.

[2] P. Druschel and A. Rowstron, “PAST: A Large-Scale, Persistent Peer-to-Peer Storage Utility,” Proc. Eighth Workshop Hot Topics in Operating System (HotOS VIII), pp. 75-80, 2001.

[3] A. Adya, W.J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J.R. Douceur, J. Howell, J.R. Lorch, M. Theimer, and R. Wattenhofer, “Farsite: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment,” Proc. Fifth Symp. Operating System Design and Implementation (OSDI), pp. 1-14, 2002.

[4] A. Haeberlen, A. Mislove, and P. Druschel, “Glacier: Highly Durable, Decentralized Storage Despite Massive Correlated Failures,” Proc. Second Symp. Networked Systems Design and Implementation (NSDI), pp. 143-158, 2005.

[5] Z. Wilcox-O’Hearn and B. Warner, “Tahoe: The Least-Authority Filesystem,” Proc. Fourth ACM Int’l Workshop Storage Security and Survivability (StorageSS), pp. 21-26, 2008.

[6] H.-Y. Lin and W.-G. Tzeng, “A Secure Decentralized Erasure Code for Distributed Network Storage,” IEEE Trans. Parallel and Distributed Systems, vol. 21, no. 11, pp. 1586-1594, Nov. 2010.

[7] D.R. Brownbridge, L.F. Marshall, and B. Randell, “The Newcastle Connection or Unixes of the World Unite!,” Software Practice and Experience, vol. 12, no. 12, pp. 1147-1162, 1982.

[8] R. Sandberg, D. Goldberg, S. Kleiman, D. Walsh, and B. Lyon, “Design and Implementation of the Sun Network Filesystem,” Proc. USENIX Assoc. Conf., 1985.

[9] M. Kallahalla, E. Riedel, R. Swaminathan, Q. Wang, and K. Fu, “Plutus: Scalable Secure File Sharing on Untrusted Storage,” Proc. Second USENIX Conf. File and Storage Technologies (FAST), pp. 29-42, 2003.

[10] S.C. Rhea, P.R. Eaton, D. Geels, H. Weatherspoon, B.Y. Zhao, and J. Kubiatowicz, “Pond: The Oceanstore Prototype,” Proc. Second USENIX Conf. File and Storage Technologies (FAST), pp. 1-14, 2003.

[11] R. Bhagwan, K. Tati, Y.-C. Cheng, S. Savage, and G.M. Voelker, “Total Recall: System Support for Automated Availability Management,” Proc. First Symp. Networked Systems Design and Implementation (NSDI), pp. 337-350, 2004.

[12] A.G. Dimakis, V. Prabhakaran, and K. Ramchandran, “Ubiquitous Access to Distributed Data in Large-Scale Sensor Networks through Decentralized Erasure Codes,” Proc. Fourth Int’l Symp. Information Processing in Sensor Networks (IPSN), pp. 111-117, 2005.

[13] A.G. Dimakis, V. Prabhakaran, and K. Ramchandran, “Decentralized Erasure Codes for Distributed Networked Storage,” IEEE Trans. Information Theory, vol. 52, no. 6, pp. 2809-2816, June 2006.

[14] M. Mambo and E. Okamoto, “Proxy Cryptosystems: Delegation of the Power to Decrypt Ciphertexts,” IEICE Trans. Fundamentals of Electronics, Comm. and Computer Sciences, vol. E80-A, no. 1, pp. 54-63, 1997.

[15] M. Blaze, G. Bleumer, and M. Strauss, “Divertible Protocols and Atomic Proxy Cryptography,” Proc. Int’l Conf. Theory and Application of Cryptographic Techniques (EUROCRYPT), pp. 127-144, 1998.

[16] G. Ateniese, K. Fu, M. Green, and S. Hohenberger, “Improved Proxy Re-Encryption Schemes with Applications to Secure Distributed Storage,” ACM Trans. Information and System Security, vol. 9, no. 1, pp. 1-30, 2006.

[17] Q. Tang, “Type-Based Proxy Re-Encryption and Its Construction,” Proc. Ninth Int’l Conf. Cryptology in India: Progress in Cryptology (INDOCRYPT), pp. 130-144, 2008.

[18] G. Ateniese, K. Benson, and S. Hohenberger, “Key-Private Proxy Re-Encryption,” Proc. Topics in Cryptology (CT-RSA), pp. 279-294, 2009.

[19] J. Shao and Z. Cao, “CCA-Secure Proxy Re-Encryption without Pairings,” Proc. 12th Int’l Conf. Practice and Theory in Public Key Cryptography (PKC), pp. 357-376, 2009.

[20] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song, “Provable Data Possession at Untrusted Stores,” Proc. 14th ACM Conf. Computer and Comm. Security (CCS), pp. 598-609, 2007.

[21] G. Ateniese, R.D. Pietro, L.V. Mancini, and G. Tsudik, “Scalable and Efficient Provable Data Possession,” Proc. Fourth Int’l Conf. Security and Privacy in Comm. Networks (SecureComm), pp. 1-10, 2008.

[22] H. Shacham and B. Waters, “Compact Proofs of Retrievability,” Proc. 14th Int’l Conf. Theory and Application of Cryptology and Information Security (ASIACRYPT), pp. 90-107, 2008.

[23] G. Ateniese, S. Kamara, and J. Katz, “Proofs of Storage from Homomorphic Identification Protocols,” Proc. 15th Int’l Conf. Theory and Application of Cryptology and Information Security (ASIACRYPT), pp. 319-333, 2009.

[24] K.D. Bowers, A. Juels, and A. Oprea, “HAIL: A High-Availability and Integrity Layer for Cloud Storage,” Proc. 16th ACM Conf. Computer and Comm. Security (CCS), pp. 187-198, 2009.

[25] C. Wang, Q. Wang, K. Ren, and W. Lou, “Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing,” Proc. IEEE 29th Int’l Conf. Computer Comm. (INFOCOM), pp. 525-533, 2010.

[26] A. Shamir, “How to Share a Secret,” ACM Comm., vol. 22, pp. 612-613, 1979.

[27] C. Dubnicki, L. Gryz, L. Heldt, M. Kaczmarczyk, W. Kilian, P. Strzelczak, J. Szczepkowski, C. Ungureanu, and M. Welnicki, “Hydrastor: A Scalable Secondary Storage,” Proc. Seventh Conf. File and Storage Technologies (FAST), pp. 197-210, 2009.

[28] C. Ungureanu, B. Atkin, A. Aranya, S. Gokhale, S. Rago, G. Calkowski, C. Dubnicki, and A. Bohra, “Hydrafs: A High-Throughput File System for the Hydrastor Content-Addressable Storage System,” Proc. Eighth USENIX Conf. File and Storage Technologies (FAST), p. 17, 2010.

[29] W. Dong, F. Douglis, K. Li, H. Patterson, S. Reddy, and P. Shilane, “Tradeoffs in Scalable Data Routing for Deduplication Clusters,” Proc. Ninth USENIX Conf. File and Storage Technologies (FAST), p. 2, 2011.


Hsiao-Ying Lin received the MS and PhD degrees in computer science from National Chiao Tung University, Taiwan, in 2005 and 2010, respectively. Currently, she is working as an assistant research fellow in Intelligent Information and Communications Research Center. Her current research interests include applied cryptography and information security. She is a member of the IEEE.

Wen-Guey Tzeng received the BS degree in computer science and information engineering from National Taiwan University, in 1985, and MS and PhD degrees in computer science from the State University of New York at Stony Brook, in 1987 and 1991, respectively. He joined the Department of Computer and Information Science (now, Department of Computer Science), National Chiao Tung University, Taiwan, in 1991. He now serves as a chairman of the department. His current research interests include cryptology, information security and network security. He is a member of the IEEE.
