A Study on Data Outsourcing Security for Location-Based Systems


NATIONAL TAIWAN NORMAL UNIVERSITY
COMPUTER SCIENCE AND INFORMATION ENGINEERING

SUDO: A Secure Database Outsourcing Solution for Location-based Systems

Author: Chen-Ruei Hong
Supervisor: Dr. Ling-Jyh Chen

February 2, 2015

SUDO: A Secure Database Outsourcing Solution for Location-based Systems

Chen-Ruei Hong
Dept. of Computer Science and Information Engineering, National Taiwan Normal University
[email protected]

Abstract

Location-based systems (LBS) represent an emerging genre of applications that exploit positioning technologies and facilitate a wide range of location-based services. Unlike conventional information systems, LBS data management is challenging because LBS data is high dimensional and spatio-temporal in nature, and information leakage may result in location related privacy crises. The issue has become even more complicated, as database outsourcing has become inevitable in view of the emerging popularity of LBS deployment. In this paper, we tackle the research challenge and propose a SecUre Database Outsourcing system, called SUDO. By combining the techniques of Hilbert space-filling curves, different invertible encryption algorithms, and genuine mixed data, we show that SUDO is capable of preserving location privacy and ownership of data for LBS against different attacks. Moreover, the proposed solution is simple, effective, and scalable; and it shows promise in supporting LBS data management with outsourced databases.

Index Terms - Location-based System, Database Outsourcing.

CONTENTS

I. Introduction
II. Problem Statement
III. Related Work
IV. The Proposed System: SUDO
V. Implementation Detail
VI. Evaluation
VII. Discussion
VIII. Conclusion and Future Work
References

I. INTRODUCTION

The quality and reliability of wireless networks and the Global Positioning System have improved steadily in recent decades. Mobile devices, e.g., smartphones and navigation units, are now widely used and have become an inseparable part of daily life. Location-based services (LBS) have therefore been widely studied; they support functions we use every day, such as route guidance, finding nearby services, and avoiding traffic jams. To cope with the ever-growing volume of trajectory data generated each day, database-as-a-service [22] is an attractive solution that offers online access and query management, extensible and scalable storage, economical data management, and reliable disaster protection. How to protect data privacy, maintain data integrity, and obtain accurate services efficiently from an outsourced database (ODB) has become an active research topic.

Location privacy is a well-studied security issue for ODBs. Data privacy means preventing any adversary from learning a user's current or past locations by any means. Without privacy protection, the ownership of data is violated, where ownership of data covers all rights to any use of the data. Attacks [10, 32, 40, 53] on the ownership of location-based data target two things: sensitive attributes and data distribution. Sensitive attributes, such as a location, a client ID, or a device ID (e.g., a phone number or MAC address), can directly link a location to a client. To protect sensitive attributes, encryption schemes such as order-preserving encryption [1, 6, 45] and homomorphic encryption [14, 43, 47] are commonly used, but they make the data hard to access or search. Attaching an index to the encrypted data solves the search problem, but nothing guarantees that the index will not expose the distribution of the data in the ODB. Even without any sensitive attribute, the data distribution remains a valuable asset of the data owner. For example, criminals may want to learn a user's home and work locations for burglary, a child's school for kidnapping, business trips for inferring business partners, client distribution for devising competing business strategies, or the locations of police officers for drug dealing. In this paper, we describe weaknesses in protecting the data distribution of ODB systems in Section II (Problem Statement). In addition, in Section III (Related Work) we survey in detail existing work on preventing the disclosure of data ownership for criminal purposes.

Besides security, integrity is also an important issue for moving object data. Past integrity approaches are basically checksums and duplication. Checksums detect modification well but incur heavy computation overhead. Duplication is not exclusive of checksums and offers fuller integrity than checksums (ignoring update and recovery), but it incurs heavy storage overhead. Since a trajectory contains a large number of moving object data instances, how to balance integrity and performance remains an open question. In this paper, we propose a SecUre Database Outsourcing solution (SUDO) that runs on the trusted gateway and uses duplication for integrity to support location privacy, so that an ODB system achieves accuracy, efficiency, security, and integrity. To the best of our knowledge, the contributions of this paper are as follows:

1) We identify why legal queries on a moving object database (MOD) are dangerous.
2) We propose a robust solution, SUDO, that integrates duplication for integrity with security, and we show that it works. Moreover, SUDO is an indexing scheme that is open to multi-objective optimization.
3) We propose a metric to evaluate data density with an index.
4) We conduct a comprehensive evaluation to demonstrate that sharing is more secure than partitioning.

The remainder of this paper is organized as follows. Section II states the problem of database outsourcing. Section III reviews related work. Section IV introduces the functions and architecture of SUDO. Section V discusses the security factors and the implementation details of SUDO. Section VI evaluates the performance and security of SUDO. Finally, Section VIII concludes the paper and discusses future work.

II. PROBLEM STATEMENT

The state-of-the-art data outsourcing approach is the Unified Client Model (UCM) [12, 21, 33, 55], shown in Figure 1. In UCM, all data and queries sent to the ODB are encoded [21] and tagged with an index by the trusted gateway, which encrypts, decrypts, and computes complex queries on behalf of clients. Besides these computational benefits, UCM lets the trusted gateway analyze the data so that it can offer accurate and optimized solutions to clients. From the ODB's point of view, all clients in UCM appear as a single client, namely the trusted gateway, which gives each individual user anonymity.

Fig. 1. The attack assumption on Unified Client Model

Commonly, there are two security assumptions for an ODB with UCM: 1) the clients, the trusted gateway, and all communication are trusted; and 2) the outsourced database is untrusted. Because the ODB is untrusted, its owner may be curious and try to infer what the outsourced data is from the queries received, the results returned for each query, the query frequency, and the grouping of results. In general, an outsourced moving object database faces the following threats:

1) Background Knowledge Attack [40, 53]: A Background Knowledge Attack refers to any attack enabled by additional information held by the attacker. Attacks of this kind have been well discussed, e.g., the Snapshot Location Attack [19], the Location-Dependent Attack [8], and the Query Tracking Attack [16]. In this paper, we focus on outsourced databases holding moving object data, where there is a more serious threat that combines the features of all the attacks above. In a moving object database, density queries [23, 27] and continuous fixed-size range queries (Figure 2) are commonly used, e.g., cross and inside queries on a space tree and queries based on speed or uncertainty [7]. With an untrusted database, the attacker holds the results of continuous fixed-size range queries, which reveal the exact data density of contiguous areas and thereby expose the adjacency mapping of blocks.

Fig. 2. Example of continuous size fixed range query

2) Known Plaintext Attack [32]: A Known Plaintext Attack occurs when the attacker holds part of the plaintext and the corresponding ciphertext. In this kind of attack, the adversary can learn the index from the way the data is grouped, using the plaintext he holds. In a moving object database, the block adjacency mapping is easily exposed by the trajectories of moving objects. Therefore, the one-bucket-to-one-area approach and order-preserving indexes are vulnerable to the Known Plaintext Attack [38]. The attack can be mounted by accessing data repeatedly, so that the session frequency distinguishes the encrypted data whose plaintext the attacker holds.

3) Collusion Attack [10]: A Collusion Attack occurs when a group of attackers share all their information with one another. For this kind of attack, we assume that attackers may hold multiple identities, as users and as the ODB manager. A group of attackers can perform block edge detection, which exposes the area mapped to a block, and can gather in a small area to mount a Skewness Attack [36] on the data. With an uneven data distribution, attackers can locate the area they target and predict the indexing rule of the encrypted data in the ODB. The same effect may arise without any perpetrator, e.g., when people gather at a ceremony, or from people's preferences and habits in choosing routes and modes of transport.
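As a rough illustration of the density leakage described under the Background Knowledge Attack, the sketch below shows how a curious ODB owner could rank encrypted blocks by density using only the sizes of the result sets it returns for fixed-size range queries. The log format, the function names, and the block identifiers are hypothetical and do not come from the original work.

```python
from collections import defaultdict


def estimate_block_density(query_log):
    """Average result-set size per encrypted block identifier.

    query_log: iterable of (encrypted_block_id, rows_returned) pairs, i.e.
    what an untrusted ODB can observe without decrypting anything
    (hypothetical log format).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for block_id, rows_returned in query_log:
        totals[block_id] += rows_returned
        counts[block_id] += 1
    return {block_id: totals[block_id] / counts[block_id] for block_id in totals}


def rank_blocks_by_density(query_log):
    """Encrypted block identifiers ordered from densest to sparsest."""
    density = estimate_block_density(query_log)
    return sorted(density, key=density.get, reverse=True)


# Hypothetical observations: the block ids are opaque ciphertext labels.
observed = [("a9", 120), ("f3", 4), ("a9", 80), ("7c", 35)]
ranking = rank_blocks_by_density(observed)
```

Even though the block identifiers stay opaque, the ranking alone tells the observer which encrypted blocks correspond to crowded areas, which is the kind of distribution leakage the threats above rely on.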

III. RELATED WORK

In this section, we survey privacy enhancing technologies (PETs) for LBS and for outsourced databases (ODB). For PETs on LBS, we review work dealing with problems between clients and the location service provider. For PETs on ODB, we review the problems of outsourcing location data to an ODB and the corresponding solutions.

A. Privacy Enhancing Technologies on Location-Based Service

In this subsection, we survey approaches for resisting attacks on LBS. In LBS, an adversary may hold retrieval queries and the corresponding query results as snapshots, and try to infer the location information of the data by any means. Background Knowledge Attacks on LBS have been well discussed, and the countermeasures fall into two categories: cloaking and the dummy approach.

1) Cloaking: The idea of cloaking is to draw from the universal set a larger data set that contains more elements, so that every element gains anonymity [25]. Cloaking works at the anonymizer, a trusted third party between the clients and the untrusted LBS server. The anonymizer helps clients create anonymous cloaking sets and refines the results from the LBS. Cloaking has been widely used because it introduces no distortion and no user overhead. Here we survey three attack types [44] on LBS with cloaking: the Snapshot Location Attack, the Location-Dependent Attack, and the Query Tracking Attack.

The Snapshot Location Attack is one in which the attacker holds the query and query result of a particular moment as a snapshot. It comes in two types: the Data Linking Attack and the Data Sampling Attack. The Data Linking Attack [19] occurs when the attacker can associate a particular position with a user in a sparse data area. Gruteser and Grunwald use quad-tree based cloaking to solve this problem. If the number of users in an area exceeds an upper bound, the area is partitioned into equal-sized quadrants, each containing a similar number of users, as cloaking sets. In addition, the tree states are recorded; if the user density changes and falls below the lower bound, the previous quadrant is used again as a quadrant merge. Through their implementation, Gruteser and Grunwald show that tree-based cloaking not only can be used as a multi-dimensional index but also offers sufficient anonymity in practice.

The Data Sampling Attack [15] occurs when the attacker holds different sets for the same query at one moment; the attacker can then analyze the union of the users in the results to find the query originator. Gedik and Liu propose the Clique-Cloak algorithms, which cloak not only a group of users but also a group of cloaking sets. Every message is bound to exactly one cloaking set, which solves the problem of overlapping cloaking sets. Moreover, Gedik and Liu consider the quality of service under cloaking, such as success rate and wait time in practice.

The Location-Dependent Attack [8] occurs when the attacker holds two snapshots that are consecutive in time. In this scenario, the attacker can locate a small group of target users that exist in both snapshots, because the users' location is hinted at by the small union area of the two consecutive snapshots. Pan et al. [44] propose the ICliqueCloak algorithm, which lets the cloaking area grow before any user inside the cloak issues a new query. An enlarged cloaking area offers good anonymity however each cloaked user moves. With the enlarged cloak, the union area becomes larger and the target users can no longer be identified.

The Query Tracking Attack [16] occurs when the attacker holds a continuous query and the corresponding results. A continuous query is a set of queries with the same purpose from the same user; by attacking it, the attacker can analyze the union of the results of the continuous query and find the query originator, who always appears in the results. Gedik and Liu propose the dynamic group concept, which requires that k-anonymity be satisfied, that the same user stay in the same cloaking group, and that one cloaking group serve one query only. The dynamic group concept is shown to resist the Query Sampling Attack and the Query Tracking Attack.
Gedik and Liu also present four algorithm phases that make the dynamic group concept practical.

2) Dummy Approach: The dummy approach also prevents the LBS provider from holding queries and query results for unauthorized and malicious purposes. It is easy to implement, needs no anonymizer, imposes no data density constraint, and is user-defined. The idea is to mix false position data (dummies) with true position data [9] and send them to the LBS as the user location. The LBS then returns all possible query results for both true and fake positions to the user, who must refine the query results himself. There is little Background Knowledge Attack theory against dummy approaches, because the fake data generator runs on the client side and clients can create any kind of fake data to protect themselves. The work in [9] was the first to introduce the idea of dummies and to show that it is workable in practice. Chow and Mokbel show that the anonymity and communication cost of the dummy approach on LBS in a snapshot are both acceptable to clients. Lu et al. [39] propose grid and circular dummy distribution algorithms and evaluate how the anonymous area size and the distribution of dummies affect the anonymity and computing cost of dummies in a snapshot. Duckham and Kulik [13] propose an obfuscation algorithm on the anonymizer under spatial constraints. Dummies are created at road intersections, which helps to obtain approximate K-NN results and to solve the shortest path problem.

Besides dummy approaches on a snapshot, dummies on trajectories are also well researched. You et al. [56] discuss the anonymity of short-term and long-term dummy trajectories and how to create safe dummy trajectories by rotating the true trajectory. They also show the effect of intersections between dummy trajectories and how to calculate the anonymity of a trajectory. Kato et al. [28] propose a pause-position-based approach for use when a mobile user knows his future trajectory in advance. The idea is to divide the trajectory into many time slots, with dummies moving among the positions in each time slot.
Within a time slot, the anonymity of the true data increases because the future movement of the true position and that of the dummies overlap during the slot.

B. Privacy Enhancing Technologies on Outsourced Database

In this subsection, we introduce PETs on ODB that address the problem of efficiency. By their working principles, PETs on ODB can be divided into two categories: 1) encryption, which eases the trade-off between security and efficiency; for example, order-preserving encryption solves the comparison problem on encrypted data in the ODB, and homomorphic encryption solves the calculation problem on the ODB; and 2) indexing, which deals with the efficiency of data access; for example, trees offer quick searching on the ODB, and space-filling curves (SFC) offer not only quick access but also let the ODB learn the relative distance between encrypted data.

1) Encryption: To avoid privacy exposure through unauthorized access, encryption is widely used. Here we survey two important categories of encryption that greatly benefit ODBs: order-preserving encryption and homomorphic encryption. Encryption enlarges the set of candidate answers to a query and the computation cost on the data owner. To lower this overhead, Agrawal et al. [1] first propose order-preserving encryption (OPE) to make encrypted numeric data searchable on the ODB. Agrawal et al. consider the distributions of the plain data and the encrypted data over time and create transform functions as keys to encrypt the data, ensuring that the plain data distribution cannot be inferred from the encrypted data distribution. Pandey and Rouselakis [45] show that OPE cannot achieve indistinguishability under Chosen-Plaintext Attack (IND-CPA) in practice. They propose a secure symmetric OPE built on pseudorandom functions and related primitives, requiring that an OPE scheme look as random as possible. Boldyreva et al. [6] improve the security of [45] and present a modular OPE that allows multi-dimensional range queries.

Just as the search ability of OPE lowers the computation cost, the ability to compute on encrypted data, i.e., homomorphic encryption, is also of great benefit. Among well-known homomorphic encryption schemes, Benaloh [5] and Paillier [43] support addition on encrypted data, RSA [47] and ElGamal [14] support multiplication, and Goldwasser-Micali [18] supports exclusive-or.
However, mathematical computation in practice seldom uses just one kind of circuit. Tu et al. [51] propose privacy homomorphism, an encryption tuple that supports the commonly used circuits of addition, subtraction, and multiplication. Gentry [17] first achieves fully homomorphic encryption with ideal lattices, which allows all circuits on encrypted data. As for implementations, Wong et al. [54] use privacy homomorphism [51] to obtain the distance between two encrypted data items to provide a K-NN service. Lien et al. [37] use Paillier [43] to calculate the distance between one public plain data item and one encrypted data item to provide a K-NN service.

2) Indexing: B-tree [2], B*-tree [3], R-tree [20], R*-tree [4], and TPR*-tree [49] are popular solutions to the searching problem [31, 50], but the search computation cost falls entirely on the data owner, who holds the tree as a key. With the idea of computation outsourcing, more and more systems choose to index with space-filling curves [48], which pass through every part of the space without overlapping, transform multiple dimensions into one dimension, and allow the ODB to find relative locations through the space-filling curve index. The Hilbert curve [24] is the most commonly used curve because of its locality-preserving property [42]. Kim et al. [30] present a declustering algorithm with shifted Hilbert curves that offers a low-cost bucketing algorithm for non-uniform data. Khoshgozaran and Shahabi [29] present an approximate K-NN algorithm using dual Hilbert curves with different directions. Papadopoulos et al. [46] lower the query retrieval cost and solve the exposure problem of null cells by using three kinds of tables, saved in three databases, to hold the Hilbert list, the locations, and the other attributes of the data. Wang et al. [53] solve the non-uniform data distribution problem with multi-layer Hilbert curves. Among other index research, Li and Omiecinski [35] analyze the efficiency and security of one-to-one mapping, order-preserving, and prefix-preserving indexes. LeFevre et al. [34] propose an efficient partition algorithm for two-dimensional data. Yang et al. [55] introduce privacy-preserving queries that expose no retrieval information and use metadata to solve the searching problem.
Wang and Du [52] present a smart partition algorithm that partitions a block into grid cells and merges cells to satisfy k-anonymity. Hore et al. [26] introduce a secure, efficiency-controlled bucketing algorithm for range queries on encrypted data. Hore et al. also present a bucket degradation approach that overlaps the plain data domains to obtain security and load balancing.
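To make the additive homomorphism attributed to Paillier [43] above concrete, the following is a minimal toy sketch of the textbook Paillier construction with the common simplification g = n + 1. The tiny primes, the function names, and the key layout are assumptions chosen for readability; the sketch is not secure and is not the implementation used in this work.

```python
import math
import random


def keygen(p=17, q=19):
    """Toy key generation; p and q are small illustrative primes (assumption)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse mod n.
    mu = pow(lam, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Probabilistic encryption: a fresh random r makes ciphertexts nondeterministic."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts modulo n^2 adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c_sum = add_encrypted(pub, encrypt(pub, 120), encrypt(pub, 77))
total = decrypt(pub, priv, c_sum)
```

The party holding only the public key can combine the two ciphertexts without learning 120 or 77, and only the holder of the private key recovers their sum. Plaintexts must stay below n, which is 323 for these toy primes.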

IV. THE PROPOSED SYSTEM: SUDO

In this section, we present the proposed SecUre Database Outsourcing system (SUDO). The SUDO system follows the Unified Client Model (UCM) [33], and the trusted gateway acts as a 'codec' between clients and outsourced databases. Specifically, there are three functional components (i.e., indexing, encryption, and dispatch) in the encoder, and there are two functional components (i.e., result sieve and computation) in the decoder. Figure 3 shows the architecture of the system, and we discuss each of the components in detail in the following subsections.

Fig. 3. The architecture of the SUDO system.

A. Preambles

We consider moving object data in location-based systems. Without loss of generality, we let $D_{i,j}$ denote the $j$-th data instance of the $i$-th moving object in the database, and

$$D_{i,j} = (I_{i,j}, A_{i,j}, x_{i,j}, y_{i,j}, t_{i,j}), \qquad (1)$$

where $I_{i,j}$ is the set of indices of $D_{i,j}$, $A_{i,j}$ is a set of attributes associated with $D_{i,j}$, $x_{i,j}$ and $y_{i,j}$ are the x- and y-coordinates of $D_{i,j}$, and $t_{i,j}$ is the timestamp of $D_{i,j}$. Note that we do not consider the coordinates of moving objects in higher dimensions (e.g., including the altitude) in this study, but the proposed work can be easily extended to support higher dimensions without additional effort.

Fig. 4. An example of the indexing phase of the SUDO encoder, where the orders of the Hilbert curves, $H_B$ and $H_C$, are set to 2 and 3, respectively. (Axes: $x$, $y$, $t$; labels: $b_{i,j}$, $c_{i,j}$, $\tau_{i,j}$.)

B. SUDO - Encoder

The SUDO encoder is invoked when a moving object query (a database insertion, search, or deletion) is requested. The encoder processes the query in three phases, namely indexing, encryption, and dispatch, which we detail as follows:

1) Indexing: The SUDO encoder uses the time-sliced representation to index the timestamps of moving object data $D_{i,j}$. Specifically, it divides time into a continuous sequence of equal-length slices and labels each slice with a unique number, $\tau$. Then, the SUDO encoder uses a two-layered Hilbert space-filling curve to index the locations of moving objects. More precisely, it divides the map of interest into $2^B$ by $2^B$ equal-sized blocks, and each block is divided again into $2^C$ by $2^C$ equal-sized cells. The first-layer Hilbert curve, $H_B$, indexes the $2^B$ by $2^B$ blocks, and the second-layer Hilbert curve, $H_C$, indexes the $2^C$ by $2^C$ cells of each block. Thus, there are $2^{B+C}$ by $2^{B+C}$ cells in the map in each time slice, and the index of $D_{i,j}$ is the tuple of its corresponding $H_B$, $H_C$, and time slice values, i.e., $\langle b_{i,j}, c_{i,j}, \tau_{i,j} \rangle$. Figure 4 shows an example of the indexing phase of the SUDO encoder. An important design question is whether time slices should be static, with every slice having the same duration, or dynamic, with durations that differ between slices. The choice of time slice and its implementation details are discussed in Section V.

2) Encryption: In the encryption phase, the SUDO encoder encrypts the moving object data, $D_{i,j}$, using three encryption functions:

a) $\Psi_\alpha()$ is used for the encryption of the layer-1 spatial index, i.e., $b_{i,j}$. $\Psi_\alpha()$ must be deterministic, invertible, and yield nearly random (or pseudo-random) permutations. In this study, we suggest implementing $\Psi_\alpha()$ using the Advanced Encryption Standard (AES) block cipher algorithm [11], as it not only meets the requirements but also has a low computational complexity.

b) $\Psi_\beta()$ is used for encrypting the temporal index $\tau_{i,j}$, and it needs to be invertible and order-preserving with variable intervals. We suggest using the Order Preserving Encryption Scheme (OPES) [1] because it satisfies all the criteria and is easy to implement.

c) $\Psi_\gamma()$ is used for encrypting the raw data of $D_{i,j}$, and it has to be invertible and capable of hiding both data information and data frequency. The Paillier cryptosystem [43] is suggested for the implementation of $\Psi_\gamma()$ because it is invertible and nondeterministic. In addition, Paillier is asymmetric, which is useful for authentication at the trusted gateway.

3) Dispatch: There are three tasks for a SUDO encoder in the dispatch phase.

a) Storage Mapping: The encoder maintains a lookup table that maps the encrypted index tuple of the moving object data (e.g., $\langle \Psi_\alpha(b_{i,j}), c_{i,j}, \Psi_\beta(\tau_{i,j}) \rangle$ for $D_{i,j}$) to the outsourced database of its storage.

b) Genuine Mixed Data: For the sake of resilience against falsification attacks, the encoder implements the genuine mixed method [33] to mix the true data (contributed by clients) and the fake data (generated by the trusted gateway) at the storage of each time-sliced cell. The true data are clustered to occupy a continuous space on the storage with two indexes, $\sigma^s_{b',c,\tau'}$ and $\sigma^e_{b',c,\tau'}$, indicating the starting and ending positions for the time-sliced cell of the encrypted index tuple $\langle b', c, \tau' \rangle$.
Figure 5 shows an example of the genuine mixed data for the time-sliced cell $c_{i,j}$, where $K_{\tau'}$ is the number of data instances (including true and fake data) in each time-sliced cell of the encrypted temporal index $\tau'$ (i.e., $\tau' = \Psi_\beta(\tau_{i,j})$).

Fig. 5. An example of the genuine mixed data for the time-sliced cell $c_{i,j}$

Moreover, to ensure data integrity, the fake data are obtained from duplicates of the true data, and r% of the true data must have duplicates (fake data) in the same time-sliced cell [33]. We let $\hat{\delta}_{\tau'}$ and $\bar{\delta}_{\tau'}$ denote the maximum and average number of true data for all time-sliced cells of the encrypted temporal index $\tau'$. The value of $K_{\tau'}$ is obtained by

$$K_{\tau'} = \begin{cases} \hat{\delta}_{\tau'}, & \text{if } \hat{\delta}_{\tau'} > \left(1 + \frac{r}{100}\right)\bar{\delta}_{\tau'} \\ \left(1 + \frac{r}{100}\right)\bar{\delta}_{\tau'}, & \text{otherwise} \end{cases} \qquad (2)$$

where r is set to 20 in this study, as suggested in [33]. When deploying duplicates, if the entries of a cell are full, SUDO places the duplicates in other cells of the same block that still have free entries. This lets a single query both retrieve and audit the data in the same session, because the query results already contain the true data and their duplicates.

c) Cell Defragmentation: To improve storage efficiency, the encoder performs 'defragmentation' to condense several sparse time-sliced cells into the same cell space. More precisely, we let $\langle b'_1, c_1, \tau'_1 \rangle$ and $\langle b'_2, c_2, \tau'_2 \rangle$ be the encrypted indexes of two time-sliced cells; the two cells are deemed eligible for defragmentation if and only if they satisfy the following Sharing Rules: 1) they have the same temporal index (i.e., $\tau_1 = \tau_2$); 2) they have the same cell index but different block indexes (i.e., $b_1 \neq b_2$ and $c_1 = c_2$); and 3) the total number of true data in the two cells is not greater than $\hat{\delta}_{\tau'_1}$ (i.e., the maximum number of true data among all time-sliced cells of the encrypted temporal index $\tau'_1$). Then, the defragmentation procedure appends the true data of the first cell (i.e., with the encrypted index $\langle b'_1, c_1, \tau'_1 \rangle$) to that of the second cell (i.e., with the encrypted index $\langle b'_2, c_2, \tau'_2 \rangle$), and updates the indexes of the starting/ending locations of the true data in the first cell. Next, it updates the lookup table and maps the second cell to the outsourced database of the first cell. It reproduces the fake data such that all fake data are duplicates of the true data, and r% of the true data has duplicates (fake data) for the first cell in its storage. Finally, it releases the storage originally occupied by the second cell. Figure 6 shows an example of the cell defragmentation process.

Besides improving efficiency, SUDO sharing also hides the composition of a block. It is hard for an attacker to learn the adjacency of blocks because the neighbor relations of each shared block are shared as well. Therefore, how to choose the candidate blocks to be shared becomes an important issue that influences not only the data density of shared blocks but also the efficiency of SUDO. The implementation details of how to choose candidate sharing blocks are introduced in Section V.

C. SUDO - Decoder

The objective of the SUDO decoder is to convert the encrypted data into its raw format, and it has two steps:

1) Data Retrieval: In this step, the SUDO decoder identifies the outsourced database based on the encrypted index tuple $\langle b', c, \tau' \rangle$, and it locates the true data stored in the corresponding time-sliced cell by referring to its $\sigma^s_{b',c,\tau'}$ and $\sigma^e_{b',c,\tau'}$ values.
Then, it performs encrypted-domain computation and retrieves those data entries that are true data and satisfy the query criteria.

Fig. 6. The illustration of the storage sharing

2) Data Recovery: After the Data Retrieval step, the SUDO decoder restores the encrypted data to its original form (i.e., plaintext) by applying the inverse function of $\Psi_\gamma()$, i.e., $\Psi_\gamma^{-1}()$. Then, it performs data computation on the plaintext data, and it returns the data satisfying the query criteria to SUDO clients. We note that although it is preferable to have the data calculation take place in the encrypted domain (i.e., on outsourced databases), some calculations may still need to be run in the plaintext domain (i.e., on the trusted gateway, or on SUDO clients) due to the limitations of the encryption function $\Psi_\gamma()$ used. Moreover, the inverse functions of $\Psi_\alpha()$ and $\Psi_\beta()$ may have to be applied in the Data Recovery step in order to answer some specific database queries.
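The indexing phase of the encoder in Section IV-B can be sketched as follows, before any encryption is applied. This is only an illustrative sketch: it uses the widely known iterative Hilbert mapping, while the curve orientation, the square map with origin (0, 0), and the names `hilbert_index`, `sudo_index`, `map_size`, `t0`, and `slice_len` are assumptions not specified in the text. Static, equal-length time slices are assumed.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve over a 2^order by 2^order grid."""
    n = 2 ** order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the next level sees a canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def sudo_index(x, y, t, *, map_size, B, C, t0, slice_len):
    """Return the plaintext index tuple (b, c, tau) for a point (x, y) at time t.

    The map is assumed to be the square [0, map_size) x [0, map_size).
    """
    cells_per_block = 2 ** C
    side = 2 ** (B + C)  # number of cells along one side of the whole map
    gx = min(int(x / map_size * side), side - 1)
    gy = min(int(y / map_size * side), side - 1)
    # Layer 1: which block; layer 2: which cell inside that block.
    b = hilbert_index(B, gx // cells_per_block, gy // cells_per_block)
    c = hilbert_index(C, gx % cells_per_block, gy % cells_per_block)
    tau = int((t - t0) // slice_len)
    return b, c, tau


# Orders 2 and 3 as in Fig. 4; the map size and slice length are illustrative.
example = sudo_index(31.5, 0.5, 25.0, map_size=32.0, B=2, C=3, t0=0.0, slice_len=10.0)
```

In the full encoder, the block index b and the time slice tau would then be passed through $\Psi_\alpha()$ and $\Psi_\beta()$, while the cell index c stays in the clear inside the index tuple.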

V. IMPLEMENTATION DETAIL

SUDO sharing counters most of the attacks based on index grouping, block edge detection, and adjacency mapping. To resist the Known Plaintext Attack via index grouping, sharing hides the composition of each block, and only the trusted gateway has the cell information needed to sieve out the true positive answers. Adjacency mapping is likewise hidden by sharing, because sharing shares not only a block itself but also the neighbors of the block. Moreover, the confusion of block composition also extends the edge of each block and offers anonymity to each block. In this section, we analyze how SUDO counters the attacks mentioned above, the attack factors that we will examine in the evaluation, and the algorithms that address those attack factors. The attack factors of the Background Knowledge Attack and the Collusion Attack can be summed up as a density variation problem. The two main factors of density variation are data number and area coverage. In the following subsections, we first introduce how we decide the time slice length to address the variation of data number. We then introduce how to choose candidate sharing blocks to increase performance and to balance both attack factors, i.e., the variation of data number and of area coverage.

A. How to decide time slice length

In this subsection, we focus on the variation of the data number of blocks across time slices. If the data number varies among blocks within a time slice, blocks can be located. If the data number of a block varies across time slices, the timestamps can be analyzed, because the variation in the data number of a block with the same index appearing at different times helps the adversary reveal the relationship between time and data number. Under a Collusion Attack, the data number may change heavily when attackers gather in a small area at a specific time. We propose the Dynamic Time Slicing (DTS) approach to counter the Background Knowledge Attack and the Collusion Attack. DTS decides how long a time slice should be from the desired entry size. DTS is given as Algorithm 1.
Line 1 decides desired entry as we want base on some factors we will evaluate later, such as the rate of true data that caused by quantity and sensor frequency of moving object. Line 2 collects data because the system may run on real time, or just accumulate data from database. 17.
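The idea of DTS, choosing a target entry size and closing a slice once enough data has been collected, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the entry size is approximated by a plain record count, timestamps are assumed to arrive in order, and each new slice is assumed to start where the previous one ended.

```python
from typing import Iterable, List, Tuple


def dynamic_time_slices(timestamps: Iterable[float], target: int) -> List[Tuple[float, float, int]]:
    """Cut a time-ordered stream into slices of `target` records each.

    Returns one (start, length, count) tuple per closed slice. Records that
    arrive after the last closed slice stay pending and are not reported.
    """
    slices = []
    start = None
    count = 0
    for t in timestamps:
        if start is None:
            start = t                      # S: start time of the first slice
        count += 1
        if count >= target:                # entry size reached the target W
            slices.append((start, t - start, count))   # L = E - S with E = t
            start, count = t, 0            # assumed: next slice starts at E
    return slices
```

With a target of 3 and timestamps 0, 1, 2, 10, 11, 30, 31, 32, 33, the slices have lengths 2, 28 and 3: dense periods produce short slices and sparse periods produce long ones, while the count per slice stays constant.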

Algorithm 1 Dynamic Time Slicing (S: start time; L: length of time slice)
1: Decide W as the target Kτ we want;
2: Collect data until Kτ ≥ W, then set E as the end time;
3: Return L = E − S;

Algorithm 2 Density Considered Sharing (B: blocks; R: shared blocks result)
1: Predict the mean number of shared blocks;
2: Decide the priority for choosing sharing blocks;
3: Share B based on the Sharing Rules;
4: Add the local result of the shared blocks to R;
5: Optimize R by data density;
6: Return R;

The start time and end time may be limited by the timer unit of the trusted gateway. To avoid problems with the time unit, we can use a time sequence number instead, so that each datum receives a unique sequence number when it enters the trusted gateway. After collecting, we can slice a time duration whose entry size Kτ is as we set, following Equation 2 in Section IV. In general, DTS controls the entry size of time slices, resists the Background Knowledge Attack on data quantity, and resists the Collusion Attack, which creates uneven data and thereby causes the entry size to vary over time. In the security evaluation in Section VI, we show the effect of DTS with different data quantities and show that the DTS approach has few limitations.

B. How to choose candidate sharing blocks

In this subsection, we take both data density factors into account: the data number of a block and the area coverage of a block. Coverage variation was ignored in past work, such as the works discussed in Related Work. To prevent coverage variation from exposing data density, we propose Density Considered Sharing (DCS), which balances the density of each shared block while sharing and shows that SUDO supports open multi-objective optimization. DCS is given as Algorithm 2. SUDO supports open multi-objective optimization because all blocks have the same size, so sharing can happen whenever two blocks satisfy the sharing rules.

Factors of index balancing, such as retrieve query frequency, insertion frequency, and percentage of retro data, are all important factors that may expose information about an area. In SUDO, block composition is hidden by sharing. Therefore, insertion and retro data pose no risk. Hence, we use density as the consideration here, both to resist the Background Knowledge Attack and to show that SUDO supports open multi-objective optimization. The optimization in DCS generally follows [41]: each local result has a different probability of being selected, determined by the density estimation metric that we introduce in the Evaluation. We offer two techniques: predicting the mean number of shared blocks, and deciding the priority for choosing sharing blocks. The first predicts how many blocks may be used given Kτ and predicts the mean data number of cells. The second gives each block a priority by estimating the cell with the maximum data number and the mean data number of the cells inside the block. Both lead to excellent sharing results.

In line 1, predicting how many blocks may be used given Kτ and the mean data number of cells leads to better heuristic functions, which in turn produce better sharing results. Following [41], the sharing probability of each candidate block is based on a heuristic function that measures how likely that block is to be chosen. If we know how many shared blocks may be used, we can obtain a better coefficient of variation on both area coverage and data number, the two main factors of data density. However, if we cannot predict the number of shared blocks, we can use only the true data rate as the heuristic.

In line 2, giving each block a priority reduces the probability of a collision, in which the maximum-data cells of two blocks have the same cell number. We therefore give high priority to blocks with a high mean data number, so that blocks with a low mean data number, which can be used to make up the blocks with a high mean, are not used up first. The priority of candidate sharing blocks can thus be set by the highest data number of a cell in the block and the mean data number of the cells in the block.
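The priority rule of line 2 can be illustrated with a short sketch. The function name and the input layout are hypothetical, and the choice to compare the maximum cell count first and the mean cell count second is an assumption, since the text names both quantities without fixing their order.

```python
from statistics import mean


def sharing_priority(blocks):
    """Order candidate blocks by (max cell count, mean cell count), highest first.

    `blocks` maps a block id to the list of data counts of its cells.
    """
    return sorted(
        blocks,
        key=lambda b: (max(blocks[b]), mean(blocks[b])),
        reverse=True,
    )
```

Given blocks a = [1, 2, 3], b = [9, 0, 0] and c = [3, 3, 3], block b comes first because it holds the fullest cell, and c precedes a because the two tie on the maximum but c has the higher mean.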

VI. EVALUATION

In this section, we first introduce the input data set and define the evaluation metrics for performance and security. We then evaluate the effect of block and cell size on performance and give suggestions for choosing proper block and cell sizes. After evaluating performance, we evaluate how secure SUDO is with respect to the density requirement on a MOD database, using the density metric we define, and we compare DCS with other works.

A. Data Set

The input data set is the Greater Taipei Bus data from 2012/01/01 to 2013/12/12. Greater Taipei is in northern Taiwan and extends from +121°16'31.0008" to +122°1'25.9998" in longitude, about 75,439 m, and from +24°40'50.9982" to +25°18'11.0016" in latitude, about 69,185 m. The valid area of Greater Taipei is shown in Figure 7. Based on the bus service hours, the bus service data we collect begin at about 5 o'clock and end at 12 o'clock. There are 6,111 buses in total and 3,800 per day on average. The sensor rate is one record per minute for each bus.

Fig. 7. The valid area of Greater Taipei

In general, the distribution and quantity of the bus data are similar to those of people in Taipei. A special case is that buses go to the repair station at the end of each day when off duty. The first bus departs at about 5 o'clock, and there are two rush hours: in the morning from about 7 to 9 o'clock and in the afternoon from about 17 to 19 o'clock. Unfortunately, we could not obtain data from 2013/1/15 to 2013/3/10, so we leave that period of data loss blank. Based on the features of the bus data, we show that SUDO resists each of the attacks mentioned in Section II.

B. Evaluation Metrics

In this subsection, we introduce the data types in the SUDO ODB, define performance equations based on the percentage of each data type, and explain how the distribution of duplicates helps SUDO check data integrity quickly. We then explain the attack factors on data density and define the equation for data density, which we use to evaluate the security SUDO achieves compared with other works.

1) Evaluation Metrics of Performance

After SUDO dispatch, there are four types of data inside a cell entry Kτ of a SUDO shared block: 1) true data, denoted as δ; 2) duplicates inside the cell that match their corresponding true data, denoted as λC; 3) duplicates inside the block that match their corresponding true data, denoted as λB; and 4) secondary duplicates inside the block that match their corresponding true data, denoted as S. We therefore use the True Data Rate (TDR) in Equation 3 to measure how much true data is inside a shared block.

\[ TDR = \frac{\delta}{K_\tau} \tag{3} \]

In SUDO, a retrieving query (the query for true data) and an auditing query (the query for duplicates to check integrity) can be done within one query session, because the duplicates are used to make up the data distribution. We therefore define the Self-Validation Rate (SVR) as the rate of data and their corresponding duplicates inside an area. According to the area size, cell or block, we use SVRC in Equation 4 to measure the SVR of a cell and SVRB in Equation 5 to measure the SVR of a block.

\[ SVR_C = \frac{\delta + \lambda_C}{K_\tau} \tag{4} \]

\[ SVR_B = \frac{\delta + \lambda_C + \lambda_B}{K_\tau} \tag{5} \]
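Equations 3 to 5 translate directly into a small helper. The function name and argument names are hypothetical, and the sketch assumes that the cell entry Kτ is simply the sum of the four data types listed above.

```python
def cell_metrics(true_data, cell_dups, block_dups, secondary_dups):
    """Return (TDR, SVR_C, SVR_B) for one cell entry, following Equations 3-5."""
    # Assumption: the entry size is the total of the four data types.
    k_tau = true_data + cell_dups + block_dups + secondary_dups
    if k_tau == 0:
        raise ValueError("empty cell entry")
    tdr = true_data / k_tau
    svr_c = (true_data + cell_dups) / k_tau
    svr_b = (true_data + cell_dups + block_dups) / k_tau
    return tdr, svr_c, svr_b
```

For a cell entry of 100 records made of 70 true data, 20 in-cell duplicates, 8 in-block duplicates and 2 secondary duplicates, the three rates are 0.70, 0.90 and 0.98.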

The three metrics above perform best when they are close to 1. TDR, however, must account for the minimum amount of duplicates we want. In Section IV, we set the duplicates to 20 percent of the true data, so TDR performs best when it is close to 0.83. Note that we ignore the secondary duplicates S in SVR, because a single duplicate is enough both to check whether data has been falsified and to pave the data distribution, as we show later.

2) Evaluation Metric of Security

Density variation is the main factor of the Background Knowledge Attack and the Collusion Attack. We use the Coefficient of Data Density (CDD) in Equation 6 to express the data density variation of the blocks in a time slice.

\[ CDD = (1 + CV(P))(1 + CV(A))(1 + CV(R)) \tag{6} \]

In CDD, CV(P) denotes the coefficient of variation (CV) of the positive answers of the blocks in a time slice, CV(A) denotes the CV of the area coverage of the blocks in a time slice, and CV(R) denotes the CV of the routing cost (total distance) from one block to the rest in a time slice. In general, CDD is composed of two factors: 1) data density inside blocks, through CV(P) and CV(A); and 2) density of blocks, through CV(R). CDD is at its best when it is close to 1, which means there is no density variation among the blocks in a time slice.

C. Evaluation on Performance

In this subsection, we show the relationship of TDR, SVR, and entry size Kτ with block and cell size. We aim not only to show the performance of SUDO but also to give guidance on how to choose proper block and cell sizes.

Fig. 8. Relationship between Avg. TDR and the Hilbert order of block and cell (B, C)
Fig. 9. Relationship between Avg. SVRC and the Hilbert order of block and cell (B, C)
Fig. 10. Relationship between Avg. SVRB and the Hilbert order of block and cell (B, C)
Fig. 11. Relationship between Avg. Kτ and the Hilbert order of block and cell (B, C)

Figure 8 shows that offering a large number of order-preserving cells requires large-scale data or a high requirement for complete integrity and recovery. Moreover, if order-preserving cells are not needed, the TDR of SUDO reaches its best state of 0.83 for any block size when C equals 0. From Figures 8 and 9, taking the difference between the average SVRC and the average TDR, each cell contains about 20 to 50 percent duplicates in the (B, C) region where TDR ≥ 0.65. 20 percent duplicates can detect a single falsification with a probability of about 33 percent, while 50 percent duplicates can detect a single falsification with a probability of about 67 percent.

From Figures 8 and 10, the (B, C) region with large TDR has fewer secondary duplicates. In general, if TDR ≥ 0.65, then SVRB ≥ 0.95. Thus, to obtain the largest number of cells, about 50 percent duplicates may be needed to hide the data distribution inside a shared block. With SVRB ≥ 0.95, we can even omit the secondary duplicates S in SUDO, since the CV over the cells inside a block will not exceed 0.06.

Figure 11 was produced with static time slices of 30 minutes each. It shows that the cell entry size is essentially not influenced by sharing with different TDR. Thus, to build a random moving object database with a normal distribution of data density, we can use the four figures above to choose proper block and cell sizes. Based on these four figures and the average bus speed of about 250 m/min, we choose (B, C) = (8, 2) as the default, so that each block is about 295 × 270 m².
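The figures quoted in this subsection can be reproduced with a short calculation. If the duplicates amount to a fraction d of the true data and nothing else is stored, then TDR = 1 / (1 + d), which gives 0.83 for d = 0.2. One reading that reproduces the detection figures, offered here only as an assumption, is that a falsification is caught when the tampered record belongs to a (true, duplicate) pair, which covers a fraction 2d / (1 + d) of all records. The function names below are hypothetical.

```python
def best_tdr(dup_ratio):
    """TDR when duplicates are `dup_ratio` times the true data and nothing else is stored."""
    return 1.0 / (1.0 + dup_ratio)


def detection_probability(dup_ratio):
    """Assumed model: share of records that belong to a (true, duplicate) pair, 2d / (1 + d)."""
    return 2.0 * dup_ratio / (1.0 + dup_ratio)
```

Under this reading, d = 0.2 gives a detection probability of about 0.33 and d = 0.5 gives about 0.67, in line with the values stated above.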

Fig. 12. Example of Dynamic Time Slicing with Static Time Slicing
Fig. 13. Effect of Kτ with instance quantity and shared block number

D. Evaluation on Security

In this subsection, we first evaluate the variation in data number. After showing that variation in data number is not a concern, we add the area coverage factor and evaluate the variation in data density. We then show that sharing handles data density much better than partition.

1) Evaluation on Variation of Data Number

In this subsection, we evaluate the effect of the DTS approach and Density Considered Sharing in resisting the Background Knowledge Attack and the Collusion Attack. We show that DTS handles data number variation across time slices, while data variation within a slice is handled by integrity data. In Figure 12, the number of buses at a single position with static τ varies heavily. The entry size clearly varies like a "W", because buses have two rush hours, in the morning and in the afternoon, and most buses go to the repair station when off duty at the end of the night, which resembles a Collusion Attack. The entry size on a working day also differs from that on a holiday, such as 2012/05/05 and 2012/05/06.

Fig. 14. Effect of Kτ with average time slice length and data rates

Hence, the overall variation over time is easily exposed to the Background Knowledge Attack. In contrast, DTS counters these two attacks. In Figure 12, DTS is set with a target Kτ of 450 on (B, C) = (8, 2). In general, the CV of the data number with DTS is below 4 percent for every target Kτ we desired. We can also see that Kτ with DTS is much lower each morning: the number of shared blocks is small, so the last shared block, which does not reach the entry requirement, lowers the average. Figure 13 shows that the entry size is not influenced by the quantity of instances; only the number of shared blocks reflects the change in data quantity. Recall that we apply interval-considered order-preserving encryption to time slices, which hides the variation in quantity across time slices. Figure 14 shows that the data rates do not change heavily with the target Kτ and that the time slice length simply grows with the target Kτ. From Figures 13 and 14, we conclude that DTS has few limitations.

2) Evaluation on Variation of Data Density

In this subsection, we compare Density Considered Sharing with partition-based works. We present experimental results showing that SUDO is a more secure solution for outsourcing a MOD database. Figure 15 shows that the Avg. CDD of SUDO is always lower than that of MHBL [53] or the quad tree. Partition-based works such as the B-tree, KD-tree, R-tree, and their variants do not change the density of each area but only group data, so we expect other partition-based works outside our experiment to be worse than SUDO. Moreover, works that do not consider area coverage perform even worse than a simple grid, which only partitions the map into grids. Figure 15 also shows that SUDO can perform multi-objective optimization: sharing with density as the objective, as in DCS, is much better than random sharing. In addition, the density of SUDO does not change heavily with the density change of the raw data.

Fig. 15. Comparison of CDD between DCS and works that consider data quantity only

TABLE I. COMPARISON OF DATA DENSITY FACTORS ON DCS AND OTHER WORKS
Approach                       Avg. K   Avg. CDD   Avg. CV(P)   Avg. CV(A)   Avg. SVRB   Avg. TDR
MHBL / Quad Tree               623      3.89       0.79         0.8          x           x
Simple Grid, 256²              43       2.98       1.97         0            x           x
Random Sharing, (B,C)=(8,2)    470      2.14       0.13         0.89         0.96        0.76
DCS, (B,C)=(8,2)               431      1.47       0.16         0.26         0.96        0.71

Table I lists the detailed settings of Figure 15. For MHBL [53], we choose the entry size Kτ from 100 to 1200 and obtain the optimal CDD at an average Kτ of about 693. The simple grid only partitions the map into 256 × 256 grids with no optimizing approach. The simple grid reflects the variation in the raw data density of the Greater Taipei Bus data, whereas SUDO is about twice as good as the raw data.
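The CDD values discussed here follow Equation 6, which can be sketched as below. The function names are hypothetical, and the use of the population standard deviation for the coefficient of variation is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0 if the mean is 0)."""
    mu = mean(values)
    return pstdev(values) / mu if mu else 0.0


def cdd(positives, areas, routing_costs):
    """Coefficient of Data Density for the blocks of one time slice (Equation 6)."""
    return (
        (1 + coefficient_of_variation(positives))
        * (1 + coefficient_of_variation(areas))
        * (1 + coefficient_of_variation(routing_costs))
    )
```

When every block has the same number of positive answers, the same area coverage and the same routing cost, all three CV terms vanish and the function returns 1, the best state described for CDD.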

VII. DISCUSSION

In this section, we discuss the differences between SUDO and the existing works referenced in Section III. For privacy enhancing technologies on location-based services, we surveyed cloaking and dummy approaches. For privacy enhancing technologies on outsourced databases, we surveyed encryption and indexing approaches. In the following paragraphs, we discuss their most significant features and briefly compare each with SUDO in turn.

Compared with the cloaking approach, the anonymity of SUDO is based on fake data and on data in other blocks whose cells have the same relative Hilbert value, whereas the anonymity of cloaking is based on data in the neighboring area. To counter attacks on multiple and continuous snapshots, SUDO keeps the data encrypted in the ODB and queries on the space index. SUDO keeps the same (or a similar) data number in cells and shares block storage, which avoids exposing the data distribution and identifying users. Thus, SUDO can be viewed as a dummy-based work, which we discuss in the next paragraph.

Compared with dummy approaches, the dummies in SUDO are more valuable because they also serve as the duplicates for integrity in the ODB. Moreover, the dummies of SUDO exist only in transfers between the ODB and the LBS server, so they do not add computation overhead to clients. As far as we know, SUDO is the first work that uses dummies in an ODB and hides the data distribution through the deployment of duplicates. SUDO sharing hides the distribution, and the percentage of duplicates in SUDO is dynamic and differs from cell to cell.

Some OPE and homomorphic encryption schemes, such as order preserving encryption [1] and Paillier [43], are used in SUDO for the reasons given in Section IV. SUDO is compatible with implementation works because of the homomorphic encryption we chose. SUDO focuses on efficient and secure access to a MOD on an ODB through a multi-dimensional order-preserving index, which we discuss in the next paragraph.

Compared with space-filling curve works, SUDO targets moving object databases, which have more security problems, and SUDO is more secure thanks to block sharing and duplicates for integrity. Compared with secure indexing works, SUDO addresses the security issues of MOD, yet it is deterministic and order preserving in the cell layer, using relative coordinates, which gives better efficiency. In addition, the spirit of SUDO's security approach differs from that of existing works: SUDO keeps the data number of cells equal by means of integrity data, and block sharing is a block-layer approach, whereas existing works merge or partition areas as indexes. Moreover, SUDO is a compromise solution that addresses the existing problems and considers the probable problems of MOD on ODB.

VIII. CONCLUSION AND FUTURE WORK

In this paper, we identify why legal queries on a moving object database are full of threats. We then propose SecUre Database Outsourcing (SUDO) for location-based systems to counter the existing attacks and the threats that may arise in moving object databases. SUDO first solves the data density disclosure problem through the deployment arrangement of integrity data. In general, SUDO has the following advantages.

SUDO is robust. SUDO integrates several works and their advantages, which allows it to meet the four important outsourcing factors.

1) Accuracy: SUDO follows the UCM outsourcing mode, which imposes no computation overhead on clients.
2) Efficiency: SUDO retains extensibility to outsourcing works through the Hilbert curve and homomorphic encryption. Moreover, SUDO can sieve out candidate answers before decryption thanks to the continuous arrangement of data in cells. Separating the temporal index from the spatial index makes searching smarter and keeps the indexing independent of time.
3) Security: SUDO resists not only the existing attacks but also the potential threats on moving object databases. In addition, SUDO sharing hides block adjacency and offers anonymity to blocks. In general, SUDO protects data ownership against disclosure of both sensitive attributes and data distribution, as introduced in the Introduction.
4) Integrity: SUDO integrates duplication for integrity and uses a smart data deployment arrangement that allows the data retrieving query and the auditing query to be combined in one query session. Duplication in SUDO thus offers high security for the density problem and saves the connection time of one session.

SUDO sharing supports open multi-objective optimization. SUDO sharing is the first work that supports open multi-objective optimization, in that any factor of index balancing can be handled in SUDO as long as the factor differs among blocks. For example, we use SUDO to address density variation in moving object databases.

With the feature of open optimization, we conduct a comprehensive evaluation to demonstrate that sharing is more secure than partition. In existing works, the data density disclosure problem is hard to solve. We explain why partition has difficulty handling the density problem and show that sharing is much more secure than partition in dealing with the variation of data density that would expose the data distribution of location-based data.

Having summarized the features of SUDO, we now describe future work. In general, bus data is limited by static routes and service hours. We are designing a plan to apply SUDO to large-scale data without these limitations, including location data from cell phones, vehicular networks, mobile sensors, infrastructure monitors or sensors, smart cards, and any behavior that exposes an ID at a location, such as paying bills. Besides addressing more possible issues, we are designing advanced MOD queries that outsource computation for SUDO. The results will be reported in the near future.

REFERENCES

[1] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu. Order-preserving encryption for numeric data. In ACM SIGMOD, 2004.
[2] R. Bayer and E. McCreight. Organization and maintenance of large ordered indices. In Proceedings of the 1970 ACM SIGFIDET (Now SIGMOD) Workshop on Data Description, Access and Control (SIGFIDET ’70), pages 107–141, New York, NY, USA, 1970. ACM.
[3] R. Bayer and K. Unterauer. Prefix b-trees. ACM Trans. Database Syst., 2(1):11–26, Mar. 1977.
[4] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The r*-tree: An efficient and robust access method for points and rectangles. SIGMOD Rec., 19(2):322–331, May 1990.
[5] J. Benaloh. Dense probabilistic encryption. In Proceedings of the workshop on selected areas of cryptography, pages 120–128, 1994.
[6] A. Boldyreva, N. Chenette, and A. O’Neill. Order-preserving encryption revisited: Improved security analysis and alternative solutions. In Proceedings of the 31st Annual Conference on Advances in Cryptology (CRYPTO’11), 2011.
[7] R. Cheng, D. V. Kalashnikov, and S. Prabhakar. Querying imprecise data in moving object environments. IEEE Transactions on Knowledge and Data Engineering (TKDE’04), 16(9):1112–1127, Sept. 2004.
[8] R. Cheng, Y. Zhang, E. Bertino, and S. Prabhakar. Preserving user location privacy in mobile data management infrastructures. In Proceedings of the 6th International Conference on Privacy Enhancing Technologies (PET’06), pages 393–412, Berlin, Heidelberg, 2006. Springer-Verlag.
[9] C.-Y. Chow and M. F. Mokbel. Trajectory privacy in location-based services and data publication. ACM SIGKDD Explorations Newsletter, 13(1):19–29, Aug. 2011.
[10] S. Craver, B.-L. Yeo, and M. Yeung. Technical trials and legal tribulations. Commun. ACM, 41(7):45–54, July 1998.
[11] J. Daemen and V. Rijmen. The Design of Rijndael: AES - The Advanced Encryption Standard. Information Security and Cryptography. Springer, 2002.
[12] E. Damiani, S. Vimercati, S. Jajodia, S. Paraboschi, and P. Samarati. Balancing confidentiality and efficiency in untrusted relational dbmss. In Proceedings of the 10th ACM conference on Computer and Communications Security (CCS’03), pages 93–102. ACM, 2003.
[13] M. Duckham and L. Kulik. A formal model of obfuscation and negotiation for location privacy. In Proceedings of the Third International Conference on Pervasive Computing (PERVASIVE’05), pages 152–170, Berlin, Heidelberg, 2005. Springer-Verlag.
[14] T. El Gamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Proceedings of CRYPTO 84 on Advances in Cryptology, pages 10–18, New York, NY, USA, 1985. Springer-Verlag New York, Inc.
[15] B. Gedik and L. Liu. Location privacy in mobile systems: A personalized anonymization model. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS ’05), pages 620–629, Washington, DC, USA, 2005. IEEE Computer Society.
[16] B. Gedik and L. Liu. Location privacy in mobile systems: A personalized anonymization model. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS ’05), pages 620–629, Washington, DC, USA, 2005. IEEE Computer Society.
[17] C. Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing (STOC ’09).
[18] S. Goldwasser and S. Micali. Probabilistic encryption & how to play mental poker keeping secret all partial information. In Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing (STOC ’82), pages 365–377, New York, NY, USA, 1982. ACM.
[19] M. Gruteser and D. Grunwald. Anonymous usage of location-based services through spatial and temporal cloaking. In Proceedings of the 1st International Conference on Mobile Systems, Applications and Services (MobiSys ’03), pages 31–42, New York, NY, USA, 2003. ACM.

[20] A. Guttman. R-trees: A dynamic index structure for spatial searching. SIGMOD Rec., 14(2):47–57, June 1984.
[21] H. Hacigümüş, B. Iyer, C. Li, and S. Mehrotra. Executing sql over encrypted data in the database-service-provider model. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data, pages 216–227. ACM, 2002.
[22] H. Hacigumus, B. Iyer, and S. Mehrotra. Providing database as a service. In Proceedings of the 18th International Conference on Data Engineering (ICDE’02), pages 29–38. IEEE, 2002.
[23] X. Hao, X. Meng, and J. Xu. Continuous density queries for moving objects. In Proceedings of the Seventh ACM International Workshop on Data Engineering for Wireless and Mobile Access (MobiDE ’08), pages 1–7, New York, NY, USA, 2008. ACM.
[24] D. Hilbert. Ueber die stetige Abbildung einer Linie auf ein Flächenstück. Mathematische Annalen, 38(3):459–460, 1891.
[25] D. S. Hochbaum and A. Pathria. Analysis of the greedy approach in problems of maximum k-coverage. Naval Research Logistics (NRL), 45(6):615–627, 1998.
[26] B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu. Secure multidimensional range queries over outsourced data. The International Journal on Very Large Data Bases (The VLDB Journal), 21(3):333–358, June 2012.
[27] C. S. Jensen, D. Lin, B. C. Ooi, and R. Zhang. Effective density queries on continuously moving objects. In Proceedings of the 22nd International Conference on Data Engineering (ICDE ’06), pages 71–, Washington, DC, USA, 2006. IEEE Computer Society.
[28] R. Kato, M. Iwata, T. Hara, A. Suzuki, X. Xie, Y. Arase, and S. Nishio. A dummy-based anonymization method based on user trajectory with pauses. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems (SIGSPATIAL ’12), pages 249–258, New York, NY, USA, 2012. ACM.
[29] A. Khoshgozaran and C. Shahabi. Blind evaluation of nearest neighbor queries using space transformation to preserve location privacy. In Proceedings of the 10th International Conference on Advances in Spatial and Temporal Databases (SSTD’07), pages 239–257, Berlin, Heidelberg, 2007. Springer-Verlag.
[30] H. Kim, M. A. Lopez, S. T. Leutenegger, and K. Li. Efficient declustering of non-uniform multidimensional data using shifted hilbert curves. In Proceedings of the 9th International Conference on Database Systems for Advances Applications (DASFAA’04), pages 694–707, 2004.
[31] K.-C. Kim and S.-W. Yun. Mr-tree: A cache-conscious main memory spatial index structure for mobile gis. In Proceedings of the 4th International Conference on Web and Wireless Geographical Information Systems (W2GIS’04), pages 167–180, Berlin, Heidelberg, 2005. Springer-Verlag.
[32] W. Kozaczuk. Enigma: how the German machine cipher was broken, and how it was read by the Allies in World War Two. Univ Pubns of Amer, 1984.
[33] W.-S. Ku, L. Hu, C. Shahabi, and H. Wang. A query integrity assurance scheme for accessing outsourced spatial databases. Geoinformatica, 17(1):97–124, January 2013.
[34] K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k-anonymity. In Proceedings of the 22nd International Conference on Data Engineering (ICDE ’06), pages 25–, Washington, DC, USA, 2006. IEEE Computer Society.
[35] J. Li and E. R. Omiecinski. Efficiency and security trade-off in supporting range queries on encrypted databases. In Proceedings of the 19th Annual IFIP WG 11.3 Working Conference on Data and Applications Security (DBSec’05), pages 69–83, Berlin, Heidelberg, 2005. Springer-Verlag.
[36] N. Li and T. Li. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proceedings of IEEE 23rd International Conference on Data Engineering (ICDE’07), volume 7, pages 106–115, 2007.
[37] I.-T. Lien, Y.-H. Lin, J.-R. Shieh, and J.-L. Wu. A novel privacy preserving location-based service protocol with secret circular shift for k-nn search. Trans. Info. For. Sec., 8(6):863–873, June 2013.

[38] K. Liu, C. Giannella, and H. Kargupta. An attacker’s view of distance preserving maps for privacy preserving data mining. In Proceedings of the 10th European Conference on Principle and Practice of Knowledge Discovery in Databases (PKDD’06), pages 297–308, Berlin, Heidelberg, 2006. Springer-Verlag.
[39] H. Lu, C. S. Jensen, and M. L. Yiu. Pad: Privacy-area aware, dummy-based location privacy in mobile services. In Proceedings of the Seventh ACM International Workshop on Data Engineering for Wireless and Mobile Access (MobiDE ’08), pages 16–23, New York, NY, USA, 2008. ACM.
[40] A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD’07), 1(1):3, 2007.
[41] E. M. Manz, J. Haddock, and J. Mittenthal. Optimization of an automated manufacturing system simulation model using simulated annealing. In Proceedings of the 21st Conference on Winter Simulation (WSC ’89), pages 390–395, New York, NY, USA, 1989. ACM.
[42] B. Moon, H. V. Jagadish, C. Faloutsos, and J. H. Saltz. Analysis of the clustering properties of hilbert space-filling curve. Technical report, College Park, MD, USA, 1996.
[43] P. Paillier. Public-key cryptosystems based on composite degree residuosity classes. In Eurocrypt, 1999.
[44] X. Pan, J. Xu, and X. Meng. Protecting location privacy against location-dependent attacks in mobile services. IEEE Transactions on Knowledge and Data Engineering (TKDE’12), 24(8):1506–1519, Aug. 2012.
[45] O. Pandey and Y. Rouselakis. Property preserving symmetric encryption. In Proceedings of the 31st Annual International Conference on Theory and Applications of Cryptographic Techniques (EUROCRYPT’12), pages 375–391, Berlin, Heidelberg, 2012. Springer-Verlag.
[46] S. Papadopoulos, S. Bakiras, and D. Papadias. Nearest neighbor search with strong location privacy. Proceedings of the VLDB Endowment (Proc. VLDB Endow.), 3(1-2):619–629, Sept. 2010.

[47] R. L. Rivest, L. Adleman, and M. L. Dertouzos. On data banks and privacy homomorphisms. Foundations of secure computation, 4(11):169–180, 1978.
[48] H. Sagan. Space-filling curves, volume 18. Springer-Verlag New York, 1994.
[49] Y. Tao, D. Papadias, and J. Sun. The tpr*-tree: An optimized spatio-temporal access method for predictive queries. In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), volume 29, pages 790–801. VLDB Endowment, 2003.
[50] Y. Theodoridis, D. Papadias, E. Stefanakis, and T. Sellis. Direction relations and two-dimensional range queries: Optimisation techniques. Data Knowl. Eng., 27(3):313–336, Oct. 1998.
[51] S. Tu, M. F. Kaashoek, S. Madden, and N. Zeldovich. Processing analytical queries over encrypted data. Proceedings of the VLDB Endowment (Proc. VLDB Endow.), 6(5):289–300, Mar. 2013.
[52] J. Wang and X. Du. A secure multi-dimensional partition based index in das. In Proceedings of the 10th Asia-Pacific Web Conference on Progress in WWW Research and Development (APWeb’08), pages 319–330, Berlin, Heidelberg, 2008. Springer-Verlag.
[53] S.-L. Wang, C.-Y. Chen, I.-H. Ting, and T.-P. Hong. Anonymous spatial query on non-uniform data. In Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services (IIWAS ’12), pages 126–131, New York, NY, USA, 2012. ACM.
[54] W. K. Wong, D. W.-l. Cheung, B. Kao, and N. Mamoulis. Secure knn computation on encrypted databases. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD ’09), pages 139–152, New York, NY, USA, 2009. ACM.
[55] Z. Yang, S. Zhong, and R. N. Wright. Privacy-preserving queries on encrypted data. In Proceedings of the 11th European Conference on Research in Computer Security (ESORICS’06), pages 479–495, Berlin, Heidelberg, 2006. Springer-Verlag.
[56] T.-H. You, W.-C. Peng, and W.-C. Lee. Protecting moving trajectories with dummies. In Proceedings of the 2007 International Conference on Mobile Data Management (MDM ’07), pages 278–282, Washington, DC, USA, 2007. IEEE Computer Society.
