
Chapter 2 Background Introduction

2.2 Searchable Encryption

2.2.1 Searchable Encryption

At a high level, a searchable encryption scheme provides a way to “encrypt” a search index so that its contents are hidden except to a party that is given appropriate tokens. More precisely, consider a search index generated over a collection of files (this could be a full-text index or just a keyword index). Using a searchable encryption scheme, the index is encrypted in such a way that (1) given a token for a keyword one can retrieve pointers to the encrypted files that contain the keyword; and (2) without a token the contents of the index are hidden. In addition, the tokens can only be generated with knowledge of a secret key, and the retrieval procedure reveals nothing about the files or the keywords except that the files contain a common keyword.
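The two properties above can be illustrated with a minimal sketch. It is a toy, not a construction from the cited literature: HMAC-SHA256 stands in for a pseudo-random function, file pointers are assumed to be non-negative integers that fit in four bytes, and the names `prf`, `make_token`, `build_index` and `search` are hypothetical. It makes no attempt to hide the number of index entries.

```python
import hashlib
import hmac
import os


def prf(key: bytes, label: bytes, data: bytes) -> bytes:
    """Keyed pseudo-random function (HMAC-SHA256 used as a stand-in)."""
    return hmac.new(key, label + data, hashlib.sha256).digest()


def make_token(secret_key: bytes, keyword: str) -> bytes:
    """Only the holder of secret_key can derive the token for a keyword."""
    return prf(secret_key, b"token|", keyword.encode("utf-8"))


def build_index(secret_key: bytes, files: dict[int, set[str]]) -> dict[bytes, bytes]:
    """Encrypt a keyword index; `files` maps a numeric file pointer to its keywords."""
    postings: dict[str, list[int]] = {}
    for pointer, keywords in files.items():
        for keyword in keywords:
            postings.setdefault(keyword, []).append(pointer)

    encrypted: dict[bytes, bytes] = {}
    for keyword, pointers in postings.items():
        token = make_token(secret_key, keyword)
        for position, pointer in enumerate(sorted(pointers)):
            counter = position.to_bytes(4, "big")
            label = prf(token, b"label|", counter)    # where the entry is stored
            mask = prf(token, b"mask|", counter)[:4]  # pad that hides the pointer
            value = bytes(a ^ b for a, b in zip(pointer.to_bytes(4, "big"), mask))
            encrypted[label] = value
    return encrypted


def search(encrypted: dict[bytes, bytes], token: bytes) -> list[int]:
    """Run by the server: it sees the token and the pointers, never the keyword."""
    results = []
    position = 0
    while True:
        counter = position.to_bytes(4, "big")
        label = prf(token, b"label|", counter)
        if label not in encrypted:
            return results
        mask = prf(token, b"mask|", counter)[:4]
        pointer = bytes(a ^ b for a, b in zip(encrypted[label], mask))
        results.append(int.from_bytes(pointer, "big"))
        position += 1


key = os.urandom(32)
index = build_index(key, {1: {"cloud", "audit"}, 2: {"cloud"}, 3: {"invoice"}})
print(search(index, make_token(key, "cloud")))  # prints [1, 2]
```

Without a token derived from the secret key, the stored labels and masked values look like unrelated random strings; with a token, the server can follow exactly one keyword's entries.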

This last point is worth discussing further as it is crucial to understanding the security guarantee provided by searchable encryption. Notice that over time (i.e., after many searches), knowing that a certain subset of documents contains a word in common may leak some useful information because the server could make some assumptions about the client's search pattern and use this information to make a guess about the keywords being searched. It is important to understand, however, that while searching does leak some information to the provider, what is being leaked is exactly what the provider would learn from the act of returning the appropriate files to the customer (i.e., that these files contain some common keywords). In other words, the information leaked to the cloud provider is not leaked by the cryptographic primitives, but by the manner in which the service is being used (i.e., fetching files based on the exact keyword matches). This leakage seems almost inherent to any efficient and reliable cloud storage service, and at worst, is less information than what is leaked by using a public cloud storage service. The only known alternative, which involves making the service provider return false positives and having the client perform some local filtering, is inefficient in terms of communication and computational complexity.
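What the provider accumulates over many searches can be made concrete with a small sketch. It assumes the server simply logs each opaque token together with the pointers it returned; the function name `summarize_server_view` and the log format are hypothetical.

```python
from collections import Counter


def summarize_server_view(transcript):
    """transcript: list of (token, returned_pointers) pairs as logged by the server."""
    # Search pattern: how often each (opaque) token was submitted.
    search_pattern = Counter(token for token, _ in transcript)
    # Access pattern: which pointers were returned for each token.
    access_pattern = {token: sorted(set(pointers)) for token, pointers in transcript}
    return search_pattern, access_pattern


log = [(b"\x01", [1, 2]), (b"\x02", [3]), (b"\x01", [1, 2])]
repeats, returned = summarize_server_view(log)
print(repeats[b"\x01"], returned[b"\x01"])  # prints: 2 [1, 2]
```

Nothing in this summary depends on breaking the encryption: the same facts would be visible to any provider that returns files in response to exact keyword matches.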

There are many types of searchable encryption schemes, each one appropriate to a particular application scenario. For example, the data processors in consumer and small enterprise architectures could be implemented using symmetric searchable encryption (SSE), while the data processors in a large enterprise architecture could be based on asymmetric searchable encryption (ASE). In the following, we describe each type of scheme in more detail.

2.2.2 Symmetric Searchable Encryption (SSE)

SSE is appropriate in any setting where the party that searches through the data is also the one who generated the data. Borrowing from storage systems terminology, we refer to such a scenario as single writer/single reader (SWSR). SSE schemes were introduced in [32], and improved constructions and security definitions were given in [23, 16, 19].

The main advantages of an SSE are its efficiency and security, whereas the main disadvantage is functionality. SSE schemes are efficient for both the party conducting the encryption and (in some cases) the party performing the search. Encryption is efficient because most SSE schemes are based on symmetric primitives such as block ciphers and pseudo-random functions. As shown in [19], a search can be efficient because the typical usage scenarios for an SSE (i.e., SWSR) allow the data to be pre-processed and stored in efficient data structures.

The security guarantees provided by SSE are, roughly speaking, the following:

(1) without any trapdoors (the search tokens of Section 2.2.1), the server learns nothing about the data except its length and (2) given a trapdoor for a keyword W, the server learns which (encrypted) documents contain W without learning W itself.

Although these security guarantees are stronger than those provided by both asymmetric and efficiently searchable encryption (described below), we stress that they do have their limitations (as described above).

The main disadvantage of an SSE is that the known solutions trade efficiency for functionality. This is easiest to see by looking at two of the main constructions proposed in the literature. In the scheme proposed by Curtmola et al. [19], the search time for the server is optimal (i.e., linear in the number of documents that contain the keyword), but updates to the index are inefficient. In the scheme proposed by Goh [23], by contrast, updates to the index are efficient, but the search time for the server is slow (i.e., linear in the total number of documents). In addition, neither scheme handles searches composed of conjunctions or disjunctions of terms. The only SSE scheme that handles conjunctions [24] is based on pairings on elliptic curves, and is as inefficient as the asymmetric searchable encryption schemes discussed below.
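The difference in search cost between the two index layouts can be illustrated with a rough sketch. It is not the construction of [19] or [23]: plain keyed HMAC tags stand in for the encrypted structures, all function names are hypothetical, and the code only counts server-side steps rather than providing any security.

```python
import hashlib
import hmac


def keyword_tag(key: bytes, keyword: str) -> bytes:
    """Keyed tag for a keyword (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_per_document(key, files):
    """One small index per file: adding a file touches only its own entry."""
    return {fid: {keyword_tag(key, kw) for kw in kws} for fid, kws in files.items()}


def search_per_document(index, tag):
    """The server must test every file, so the cost is linear in the collection size."""
    hits, steps = [], 0
    for fid, tags in index.items():
        steps += 1
        if tag in tags:
            hits.append(fid)
    return sorted(hits), steps


def build_inverted(key, files):
    """One list per keyword, pre-processed over the whole collection."""
    inverted = {}
    for fid, kws in files.items():
        for kw in kws:
            inverted.setdefault(keyword_tag(key, kw), []).append(fid)
    return inverted


def search_inverted(index, tag):
    """The server reads only the matching list, so the cost is linear in the matches."""
    hits = index.get(tag, [])
    return sorted(hits), len(hits)


key = b"k" * 32
files = {i: ({"common", "rare"} if i == 7 else {"common"}) for i in range(100)}
tag = keyword_tag(key, "rare")
_, scan_steps = search_per_document(build_per_document(key, files), tag)
_, list_steps = search_inverted(build_inverted(key, files), tag)
print(scan_steps, list_steps)  # prints: 100 1
```

In this toy, the per-document layout accepts a new file by adding one independent entry, whereas the inverted layout must modify one list for every keyword the new file contains, which hints at why pre-processed structures are harder to update once encrypted.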

2.2.3 Asymmetric Searchable Encryption (ASE)

ASE schemes are appropriate in any setting where the party searching over the data is different from the party that generated the data. We refer to such a scenario as a many writer/single reader (MWSR). ASE schemes were introduced in [11], improved definitions were proposed in [1], and schemes for handling conjunctions were given in [28] and [13].

The main advantage of an ASE is functionality, whereas the main disadvantages are inefficiency and weaker security. Since the writer and reader can be different, ASE schemes are usable in a larger number of settings than SSE schemes. The inefficiency comes from the fact that all known ASE schemes require the evaluation of pairings on elliptic curves, which is a relatively slow operation compared to evaluations of (cryptographic) hash functions or block ciphers. In addition, in the typical usage scenarios for an ASE (i.e., MWSR), the data cannot be stored in efficient data structures.

The security guarantees provided by an ASE are, roughly speaking, the following:

(1) without any trapdoors, the server learns nothing about the data except its length and (2) given a trapdoor for a keyword W, the server learns which (encrypted) documents contain W.

Note that (2) is weaker here than in the SSE setting. In fact, when using an ASE scheme, the server can mount a dictionary attack against the token and figure out which keyword the client is searching for (Byun et al., 2006). It can then use the token (for which it now knows the underlying keyword) and conduct a search to determine which documents contain the (known) keyword. This should not be interpreted as saying that ASE schemes are insecure, only that one has to be very careful about the particular usage scenario and the types of keywords and data being considered.

2.2.4 Efficient ASE (ESE)

ESE schemes are appropriate in any setting where the party that searches through the data is different from the party that generated the data, and where the keywords are difficult to guess. This also falls into the MWSR scenario. ESE schemes were introduced in [8].

The main advantage of an efficient ASE is that searches are more efficient than in (plain) ASE. The main disadvantage, however, is that ESE schemes are also vulnerable to dictionary attacks. In particular, dictionary attacks against an ESE can be performed directly against an encrypted index (as opposed to against a token, as in an ASE).
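A dictionary attack against an encrypted index can be sketched as follows. The sketch assumes, purely for illustration, that index entries are deterministic tags that anyone holding the public key can recompute; SHA-256 over the public key and the keyword stands in for a real deterministic public-key encryption, and all names are hypothetical.

```python
import hashlib


def deterministic_tag(public_key: bytes, keyword: str) -> bytes:
    """Toy stand-in for deterministic public-key encryption of a keyword."""
    return hashlib.sha256(public_key + b"|" + keyword.encode("utf-8")).digest()


def build_tag_index(public_key: bytes, files: dict[int, set[str]]) -> dict[int, set[bytes]]:
    """Any writer can build index entries using only the public key."""
    return {fid: {deterministic_tag(public_key, kw) for kw in kws}
            for fid, kws in files.items()}


def dictionary_attack(public_key, index, dictionary):
    """The server tags every candidate word and matches the tags against the index."""
    lookup = {deterministic_tag(public_key, word): word for word in dictionary}
    return {fid: {lookup[t] for t in tags if t in lookup}
            for fid, tags in index.items()}


public_key = b"toy-public-key"
index = build_tag_index(public_key, {1: {"salary", "x9$kQ-77"}, 2: {"merger"}})
print(dictionary_attack(public_key, index, ["salary", "merger", "budget"]))
# prints: {1: {'salary'}, 2: {'merger'}}
```

The guessable keywords are recovered without any token, while the hard-to-guess keyword stays hidden, which matches the requirement that ESE be used only where keywords are difficult to guess.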

2.2.5 Multi-User SSE (MSSE)

Multi-user SSE (MSSE) schemes are appropriate in any setting where many parties wish to search through data that is generated by a single party. We refer to such a scenario as single writer/many reader (SWMR). MSSE schemes were introduced in [19]. In an MSSE scheme, in addition to being able to encrypt indexes and generate tokens, the owner of the data can also add and revoke a user’s privilege to search through the owner’s data.

From the document: Security Analysis and Improvement of an SSE Protocol in a Client-Server Architecture, NCCU Academic Hub (pp. 17-22)