Privacy-Preserving Compressive Sensing for Crowdsensing based Trajectory Recovery


Linghe Kong^∗‡, Liang He^†, Xiao-Yang Liu^‡, Yu Gu^§, Min-You Wu^‡, Xue Liu^∗

∗McGill University, Canada, Email: linghe.kong@mail.mcgill.ca, xueliu@cs.mcgill.ca

†University of Michigan, USA, Email: lianghe.umich@gmail.com

‡Shanghai Jiao Tong University, China, Email:{linghe.kong, yanglet, mwu}@sjtu.edu.cn

§IBM Research - Austin, USA, Email: yugu@us.ibm.com

Abstract: Location based services have experienced explosive growth and have evolved from utilizing a single location to utilizing the whole trajectory. Due to hardware and energy constraints, a trajectory usually contains many missing data. Crowdsensing provides a promising method for accurately recovering the complete trajectory. This method exploits the correlation among multiple users’ trajectories together with the compressive sensing technique, and significantly outperforms conventional interpolation methods in accuracy. However, because trajectories expose users’ daily activities, privacy is a major concern in crowdsensing. Existing solutions tackle accurate trajectory recovery and privacy independently, and no single design addresses both challenges simultaneously. Therefore, in this paper we propose a novel Privacy-Preserving Compressive Sensing (PPCS) scheme, which encrypts a trajectory with several other trajectories while maintaining the homomorphic obfuscation property for compressive sensing. Under PPCS, adversaries can only capture the encrypted data, so user privacy is preserved. Furthermore, the homomorphic obfuscation property guarantees that the recovery accuracy of PPCS is comparable to the state-of-the-art compressive sensing design. Based on two publicly available traces with numerous users and long durations, we conduct extensive simulations to evaluate PPCS. The results demonstrate that PPCS achieves a high accuracy of <53 m and a large distortion between the encrypted and the original trajectories (a commonly adopted metric of privacy strength) of >9,000 m, even when up to 50% of the original data are missing.

I. INTRODUCTION

Location based services (LBSs) [18, 37] have experienced explosive growth recently and are evolving from utilizing a single location [7] to harnessing the complete trajectory of a mobile user [23, 26, 40]. For example, the Moves application, which automatically tracks both the activities and the trajectories of users, has been downloaded over 4 million times since its launch in Jan. 2013 and has been acquired by Facebook [1].

Although GPS is universally available on modern devices, the trajectory of a mobile user may be incomplete due to non-line-of-sight conditions to satellites [29]. In addition, since GPS consumes a significant amount of energy, it is only activated periodically to conserve energy [22]. Consequently, trajectory recovery [30], which estimates the missing data in an incomplete trajectory, is one of the fundamental components of LBSs. For instance, trippermap [4] in Flickr can automatically reproduce a user’s travelling path based on her geotagged photos.

Considerable effort has been devoted to interpolation methods for trajectory recovery. With a single user’s incomplete trajectory data, methods such as nearest neighbors [31] and linear interpolation [34] can attain only coarse-grained accuracy. More recently, Rallapalli et al. [29] reveal that the trajectories of multiple users within the same geographic area are strongly correlated. For instance, students on the same campus have similar timetables; vehicles on the same segment of freeway move with similar velocities. Leveraging such correlations, the crowdsensing technology [10, 11, 16, 17, 28, 35] provides a promising recovery method, which recovers all users’ trajectories together using compressive sensing (CS). This crowdsensing recovery method has been shown to be superior to interpolation methods that use only single-user data [29].

While the crowdsensing recovery method achieves better accuracy, the major drawback of applying it in practice is its requirement to collect all users’ location data, which raises great concerns about potential privacy leakage [32, 36]. In crowdsensing especially, users are willing to contribute their personal data only when their privacy is preserved. Currently, the most commonly adopted privacy-preserving approach is anonymization [24]. Nevertheless, recent studies [12, 38] reveal that the anonymization mechanism alone is inadequate. To further improve privacy, dummification [19] and obfuscation [13, 15, 27] methods are introduced, which inject fake trajectories and perturb original trajectories, respectively. Although dummification and obfuscation methods reasonably protect user privacy, they also pollute the original data, which decreases the recovery accuracy of the current crowdsensing recovery method.

To tackle the challenges of accurate trajectory recovery and privacy preservation simultaneously, we design a novel encryption method named K-vector perturbation (KVP) to attain both objectives. The main idea of KVP is to use a private key to perturb a user’s trajectory with K other trajectories while maintaining the homomorphic obfuscation property for compressive sensing. Based on KVP, we propose the privacy-preserving compressive sensing (PPCS) scheme, which includes three major steps. First, every user encrypts her incomplete location data by KVP and transmits the data in encrypted form to the crowdsensing server. Second, the server does not need to decrypt the data but directly recovers all users’ encrypted data together with CS. Third, a user downloads her corresponding recovered data and decrypts her own trajectory by inverse KVP. Under PPCS, adversaries may capture the encrypted data but do not know the private key, so the users’ privacy is preserved. Furthermore, PPCS guarantees that the recovery accuracy is the same whether CS operates on the encrypted data or on the original data, which we call the homomorphic obfuscation property.

The contributions of this paper are summarized as follows:

• To the best of our knowledge, this is the first work to jointly optimize the data recovery accuracy and user privacy preservation in crowdsensing.

• We propose a systematic PPCS scheme for crowdsensing based trajectory recovery, which integrates the novel homomorphic obfuscation method KVP into the compressive sensing framework to achieve recovery accuracy and privacy preservation simultaneously. Since the design of KVP is simple, PPCS is easy to implement.

• We theoretically prove that PPCS achieves the same recovery error bound as CS. Moreover, we prove that the expectation of distortion between encrypted data and original data is relatively large compared with the size of the area, which indicates the effective data perturbation for privacy preservation. We also prove that PPCS can protect the user privacy as long as there are no more than K actual data being exposed, where K can be proactively controlled according to the user requirement on privacy.

• Extensive simulations are conducted to evaluate PPCS, based on two publicly available traces from Beijing and Shanghai with a large number of users, long durations, and mixed mobility modes including walking, biking, and driving. The evaluation results show the effectiveness of PPCS. Typically, using PPCS on the Beijing traces achieves an average accuracy within 53 meters and an average distortion of more than 9,000 meters, even when up to 50% of the original data are missing.

The remainder of this paper is organized as follows. In Section II, we formulate the trajectory recovery problem. In Section III, we investigate the mobility property in real traces. We describe the design of PPCS in Section IV, and analyze the theoretical recovery accuracy and privacy in Section V. In Section VI, we evaluate our scheme based on trace-driven simulations. In Section VII, we review the related work. We conclude this work in Section VIII.

II. PRELIMINARIES

In this section, we introduce the trajectory recovery model, the adversary model, and the formal definition of our problem.

A. Trajectory Recovery Model

A trajectory is composed of a sequence of locations that a user traverses, each represented by its longitude x and latitude y, as shown in Fig. 1(a). The user’s current location (x, y) can be obtained through the GPS module on her mobile device. In an N-user system where the total duration of interest consists of T time slots, the trajectory of the i-th user is represented by two 1 × T vectors, where $x_{ij}$ and $y_{ij}$ are the longitude and latitude at the j-th time slot, respectively (i = 1, 2, ..., N and j = 1, 2, ..., T).

Fig. 1. Trajectory model. (a) Two vectors are used to record the longitude x and the latitude y data of the trajectory of the user whose ID is 3. (b) When some location data are missing, the corresponding elements in the vectors are null. It is not easy to directly recover the accurate trajectory because several paths are possible on the map.

The location data of a user could be partially missing for reasons such as non-line-of-sight conditions to GPS satellites and energy management of the GPS module on mobile devices [22]. In Fig. 1(b), the null elements in the vectors indicate the data missing at their corresponding time slots.

Trajectory recovery is not effective if it is performed for individual users independently. For example, as shown in Fig. 1(b), the location data for the 5-th to the 7-th time slots are missing. Even though the map matching [33] method is utilized to narrow down the field of candidates, there are still three possible trajectories provided by linear interpolation [31].

To address the weakness of single-user recovery, crowdsensing recovery exploits the correlation among users and recovers all users’ trajectories together using compressive sensing, which has been verified to outperform existing methods [29] and is referred to as the state-of-the-art in this paper.

The notations of the crowdsensing recovery are defined as:

• Trajectory Matrix is the set of N users’ actual trajectories, defined as $X = (x_{ij})_{N \times T}$. We only illustrate the definitions and derivations related to the longitude X in the following sections. All results for the latitude Y are similar to those for X and are omitted for conciseness.

• Binary Index Matrix indicates whether a location datum in X is missing, and is defined as

$$\Phi = (\phi_{ij})_{N \times T} = \begin{cases} 0 & \text{if } x_{ij} \text{ is missing,} \\ 1 & \text{otherwise.} \end{cases} \tag{1}$$

• Sensed Matrix consists of the sensed location data from GPS. Due to the potential data missing, elements in the sensed matrix S are either $x_{ij}$ (i.e., sensed location data) or 0 (i.e., missing data). Thus, S can be expressed as¹

$$S = X \circ \Phi. \tag{2}$$

• Recovered Matrix is generated by recovering the missing data in the sensed matrix S to approximate the actual trajectories X. The recovered matrix is denoted by $\hat{X}$.

• Compressive Sensing (CS) is an advanced recovery technique [6, 9] utilized to recover the missing data in S. We use $f_{cs}$ to denote the CS operation, thus $\hat{X} = f_{cs}(S)$.

¹ In this paper, ‘XΦ’ represents the matrix multiplication of X and Φ, while ‘X ◦ Φ’ represents the element-wise multiplication of X and Φ.
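The notation above can be illustrated with a minimal NumPy sketch. The matrix sizes, the made-up longitude range, and the 30% loss probability are assumptions chosen only for illustration; they do not come from the traces used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 4, 8                                    # hypothetical: 4 users, 8 time slots
X = rng.uniform(116.0, 117.0, size=(N, T))     # made-up longitudes (trajectory matrix)

# Binary index matrix of Eq. (1): 0 where a sample is missing, 1 otherwise.
Phi = (rng.random((N, T)) > 0.3).astype(int)   # assumed ~30% random loss

# Sensed matrix of Eq. (2): element-wise (Hadamard) product, not a matrix product.
S = X * Phi

# Fraction of missing entries (the data loss ratio).
alpha = 1.0 - Phi.mean()
```

Note that `X * Phi` is the element-wise product ‘X ◦ Φ’; the matrix product ‘XΦ’ would be written `X @ Phi` and is not even defined for these shapes unless N = T.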

Fig. 2. Adversary models. (Diagram labels: user, server, eavesdropper, hacker, stalker.)

B. User Models and Adversary Models

We consider a system consisting of two types of mobile users: public and private users. Public users are willing to share their trajectories and private users want to avoid the exposure of their trajectories. For example, in an urban traffic scenario, buses can be treated as public users, and personal vehicles are good examples of private users.

As leakage of personal trajectories can lead to unauthorized surveillance and tracking, adversaries are motivated to obtain private users’ trajectories. In Fig. 2, we illustrate the adversary models that threaten the privacy in crowdsensing recovery, which are categorized as eavesdroppers, hackers, and stalkers.

• Eavesdroppers and hackers: An eavesdropper could capture the data traffic between users and the crowdsensing server by hijacking the communication channels. A hacker could access and obtain all data in the server. Because eavesdroppers and hackers can obtain the same set of information, we do not differentiate them in the rest of the paper.

• Stalkers: A stalker can track a user for a short while and obtain k actual location data of that user. Without loss of generality, we assume that k is a small number compared with the total number of data in a complete trajectory, because a stalker cannot always tail after the user.

All adversaries potentially have the following capabilities: (i) they have the same algorithms as ours to recover the trajectory; (ii) they can exploit the same existing map matching methods [25, 33] as we do to further improve their estimation accuracy.

C. Problem Definition

In this paper, we consider the accurate and privacy-preserving trajectory recovery problem. This problem is challenging because the two objectives appear to conflict with each other. On one hand, a highly accurate recovery can be achieved by the crowdsensing method. However, this method requires collecting data from all users, which creates the potential for privacy leakage. On the other hand, the privacy objective is to avoid the exposure of users’ trajectories, which is contrary to the basic requirement of crowdsensing. Existing methods cannot satisfy the two objectives simultaneously.

To address this dilemma in crowdsensing based trajectory recovery, we propose the PPCS scheme, in which a novel homomorphic obfuscation method for CS is designed to preserve the user privacy and guarantee the recovery accuracy as well.

III. TRACE PREPROCESSING AND VALIDATION

Before describing the design of the PPCS scheme, we introduce two real traces and validate their low-rank properties, which imply strong correlation among multiple users’ trajectories within the same area [29].

TABLE I
SELECTED REAL-WORLD MOBILITY TRACES

Name      Size                    Area            Mobility Mode
Beijing   116 users × 355 slots   70 × 85 km²     Walk/Bike/Car
Shanghai  74 users × 399 slots    100 × 100 km²   Taxi/Bus

Fig. 3. Low-rank property in the investigated mobility traces. (x-axis: i; y-axis: CDF of top-i singular values; curves: Beijing, marked at i = 7, and Shanghai, marked at i = 13.)

A. Preprocessing of Real-World Mobility Traces

The evaluation of our design is based on two publicly available mobility traces: Geolife [2] and SUVnet [3]. These two traces have a large number of users, long durations, and mixed mobility modes. Geolife records the GPS trajectories of 178 users from April 2007 to October 2011 in Beijing, in which the major user mobility modes include walking, biking, and driving. SUVnet records the trajectories of over 2000 taxis and 300 buses in the urban area of Shanghai.

However, the raw traces from Geolife and SUVnet cannot be directly utilized for low-rank validation, because a significant amount of their data is missing. To guarantee the integrity of the ground truth, we perform trace preprocessing on the raw data to select their complete subsets and build the trajectory matrices, which are then utilized in our evaluations. The two selected traces, denoted as Beijing and Shanghai respectively, are described in Table I, including their sizes, areas, and mobility modes.

B. Validating the Low Rank Property

As CS is a major component of PPCS, we first need to validate whether the trajectory matrices are low-rank, which is the requirement for the CS operation [9].

Although the low-rank property has been studied in [29], each of their traces has only one mobility mode: either human walking or car driving. The mobility mode in our selected traces, which mixes walking, biking, and driving, is a more general scenario. In addition, some of their traces [29] are synthetic, whereas our traces are raw data gathered from real applications, which inherently contain noise. Hence, we still need to verify whether such traces are universally low-rank.

We verify the low-rank property of the selected traces by Singular Value Decomposition (SVD), which is an effective non-parametric technique for rank investigation [21]. According to SVD, an N × T matrix X can be decomposed as

$$X = U \Lambda V' = \sum_{i=1}^{\min(N,T)} \sigma_i u_i v_i', \tag{3}$$

where U and V are two unitary matrices with i-th columns $u_i$ and $v_i$, $V'$ is the transpose of V, and Λ is an N × T diagonal matrix containing the singular values $\sigma_i$ of X. Typically, the singular values are sorted as $\sigma_i \ge \sigma_{i+1}$ (i = 1, 2, ..., min(N, T)), where min(N, T) is the number of singular values. The rank of the matrix X, denoted by r, is the number of its non-zero singular values. The matrix is low-rank if $r \ll \min(N, T)$. If the top-$\hat{r}$ singular values are a good approximation of the total singular values, i.e.,

$$\sum_{i=1}^{\hat{r}} \sigma_i \approx \sum_{i=1}^{\min(N,T)} \sigma_i, \tag{4}$$

this matrix is considered to be near low-rank, and $\hat{r}$ is treated as its rank.

Fig. 4. PPCS overview. (Step I: KVP encryption at the users; Step II: CS recovery at the server; Step III: KVP decryption at the users.)

The CDFs of the singular values obtained from the Beijing and Shanghai traces are shown in Fig. 3, where the x-axis represents the index i of the i-th largest singular value, and the y-axis is the ratio between the sum of the top-i singular values and the sum of all singular values. We find that the total singular values are well approximated by only a few top singular values in both traces. For example, the top-7 $\sigma_i$ of the Beijing trace and the top-13 $\sigma_i$ of the Shanghai trace account for more than 95% of their respective totals, while the total numbers of $\sigma_i$ are 116 and 74, respectively. This observation reveals that both traces have the near low-rank property. Hence, CS can be applied to them to achieve a promising recovery accuracy.
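The low-rank check described above can be sketched in a few lines of NumPy. The helper names `top_singular_ratio` and `approx_rank`, the 95% threshold argument, and the synthetic rank-3 test matrix are assumptions for illustration; the real Beijing and Shanghai matrices are not reproduced here.

```python
import numpy as np

def top_singular_ratio(X):
    """Ratio of the sum of the top-i singular values to the sum of all of them."""
    sigma = np.linalg.svd(X, compute_uv=False)   # returned in descending order
    return np.cumsum(sigma) / np.sum(sigma)

def approx_rank(X, threshold=0.95):
    """Smallest i whose top-i singular values reach `threshold` of the total."""
    ratio = top_singular_ratio(X)
    return int(np.searchsorted(ratio, threshold) + 1)

# Synthetic stand-in with the same shape as the Beijing matrix (116 x 355):
# an exactly rank-3 matrix plus a small amount of noise.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((116, 3)) @ rng.standard_normal((3, 355))
X_demo += 0.001 * rng.standard_normal((116, 355))

ratio = top_singular_ratio(X_demo)
r_hat = approx_rank(X_demo)
```

Plotting `ratio` against i would give a curve of the same kind as Fig. 3; for the synthetic matrix the curve saturates after at most three singular values.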

IV. PPCS SCHEME

To address the privacy issue in conventional CS, we present a simple but efficient trajectory recovery scheme, Privacy-Preserving Compressive Sensing (PPCS), in this section.

A. Scheme Overview

The proposed PPCS consists of three steps. First, users encrypt their sensed data and transmit the encrypted trajectories to the server. Note that the encrypted trajectories may not be complete because of the data missing issue. Second, the server performs CS on the collective data to recover the missing part of the encrypted trajectory for all users. Third, any individual user downloads the recovered and encrypted trajectory from the server, and decrypts it to obtain her original trajectory. An overview of these three steps is shown in Fig. 4. Briefly, the advantages of PPCS are:

• The design of PPCS is simple, and thus it is easy to implement in practice.

• PPCS takes advantage of CS to provide significant privacy preservation strength while guaranteeing the accuracy of recovered trajectories.

• The high-complexity CS recovery is computed at the centralized server side. The distributed computing at the user side consists only of the low-complexity encryption and decryption.

• The communication overhead of every user is very small.

Analyses of these advantages are provided in Sections V-A to V-D. Moreover, the case of no public vectors in PPCS is discussed in Section V-E.

B. Encrypt the Sensed Trajectories at Individual Users

The core component of PPCS is to encrypt the sensed trajectories at private users, so that only their encrypted trajectories are available at the server. Denote $f_{en}$ as the encryption operation. With a sensed trajectory $S_{(i)}$ of user i, the encrypted trajectory $\bar{S}_{(i)}$ can be represented as

$$\bar{S}_{(i)} = f_{en}(S_{(i)}), \tag{5}$$

where $S_{(i)}$ denotes the i-th row vector of the matrix S.

Next, we explain in detail how the encryption operates. In the system under consideration, public users are willing to share their trajectories, which are available at the server.

In the first phase of encryption, a private user i randomly downloads K public vectors $D_{(1)}, D_{(2)}, \ldots, D_{(K)}$ from all public vectors at the server, which are used to generate the encrypted vector $\bar{S}_{(i)}$. Downloading only K vectors does not incur much communication overhead. In addition, random downloading brings more uncertainty for privacy preservation.

Then, user i generates a length-(K + 1) random vector $\langle \psi_{i,0}, \psi_{i,1}, \psi_{i,2}, \ldots, \psi_{i,K} \rangle$ as her private key, which is not shared with anyone else, including the server. Any key satisfies $\psi_{i,j} \in (0, 1)$ and $\sum_{j=0}^{K} \psi_{i,j} = 1$. With the public vectors and the private key, user i generates her encrypted vector $\bar{S}_{(i)}$ as

$$\bar{S}_{(i)} = (\psi_{i,0} S_{(i)} + \psi_{i,1} D_{(1)} + \cdots + \psi_{i,K} D_{(K)}) \circ \Phi_{(i)}. \tag{6}$$

To demonstrate the encryption operation, let us consider the example shown in Fig. 5. Assume a private user i = 4 has downloaded K = 2 public vectors from the server (i.e., $D_{(1)}$, $D_{(2)}$), and has generated the length-3 key $\langle \psi_{4,0}, \psi_{4,1}, \psi_{4,2} \rangle$. The three vectors $S_{(4)}$, $D_{(1)}$, and $D_{(2)}$ are summed with weights $\psi_{4,0}$, $\psi_{4,1}$, and $\psi_{4,2}$, respectively. For each null element in $S_{(4)}$, the corresponding element in the resultant sum vector is treated as missing data, and the encrypted vector $\bar{S}_{(4)}$ is then transmitted to the server.
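The encryption step of Eq. (6) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `make_private_key` and `kvp_encrypt` are hypothetical, the Dirichlet draw is just one convenient way to obtain K + 1 weights in (0, 1) that sum to 1 (the paper does not specify how keys are sampled), and the coordinates are made up.

```python
import numpy as np

def make_private_key(K, rng):
    """Hypothetical key generator: K+1 positive weights that sum to 1."""
    return rng.dirichlet(np.ones(K + 1))

def kvp_encrypt(s_i, phi_i, D, psi):
    """Eq. (6): weighted sum of the user's vector and K public vectors,
    masked so that missing slots stay missing (zero)."""
    mixed = psi[0] * s_i + psi[1:] @ D     # D has shape (K, T)
    return mixed * phi_i

rng = np.random.default_rng(1)
K, T = 2, 8
phi_4 = np.array([1, 1, 1, 1, 0, 0, 0, 1])   # slots 5-7 missing, as in Fig. 1(b)
s_4 = rng.uniform(0, 10, T) * phi_4          # made-up sensed vector of user 4
D = rng.uniform(0, 10, (K, T))               # made-up public vectors D(1), D(2)
psi_4 = make_private_key(K, rng)             # private key, kept on the device
enc_4 = kvp_encrypt(s_4, phi_4, D, psi_4)    # what is sent to the server
```

Only `enc_4` leaves the device; `psi_4` stays with the user, and the server cannot tell which public vectors were mixed in or with what weights.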

Fig. 5. KVP encryption. (The vectors S(4) of Private 4, D(1) of Public 1, and D(2) of Public 2 are weighted by ψ4,0, ψ4,1, and ψ4,2 and summed into the encrypted vector.)

This encryption method is referred to as K-Vector Perturbation (KVP) in this paper, because 1) from the aspect of matrix operation, $f_{en}$ is essentially a linear combination of K vectors in a matrix, and 2) the physical meaning of $f_{en}$ is to perturb the user trajectory with K other public trajectories.

Intuitively, the length of the private key dominates the difficulty for adversaries of decrypting the original data. Hence, the value of K determines the privacy preservation strength offered by KVP. We will further discuss the impact of K on the performance of PPCS in Section V-C.

C. Recover the Encrypted Trajectories at the Server

After collecting the encrypted trajectories from all private users and the original trajectories of all public users, the server forms the encrypted matrix $\bar{S}$ of size N × T. Then, the crowdsensing recovery method applies CS on $\bar{S}$, and the completed encrypted trajectory matrix is obtained as $\hat{\bar{X}} = f_{cs}(\bar{S})$.

The $f_{cs}$ operation adopted in this paper is the standard CS recovery. Briefly, the procedures of $f_{cs}$ are as follows:

• Assume that $\hat{X}$ can be factorized into matrices L and R according to the SVD:

$$\hat{X} = U \Lambda V' = L R', \tag{7}$$

where $L = U \Lambda^{1/2}$ and $R = V \Lambda^{1/2}$.

• Estimate the L and R matrices by

$$\min\left( (\|L\|_2^2 + \|R'\|_2^2) + \lambda \|B \circ (L R') - S\|_2^2 \right), \tag{8}$$

where the Lagrange multiplier λ allows a tunable tradeoff between rank approximation (because real data are near low-rank rather than exactly low-rank) and fitting accuracy.

In Eq. (8), 1) B (the binary index matrix Φ of Eq. (1)) and S are known, 2) every $\|\cdot\|_2^2$ term is non-negative, and 3) the optimal values approach 0 when all non-negative parts are minimized. Hence, L and R can be obtained by iterative computation, e.g., a genetic algorithm.

In summary, $f_{cs}$ amounts to solving $\hat{X} = L R'$ from Eq. (8) with input S. Please refer to [6] [29] for the detailed CS operation.
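One way to minimize an objective of the form of Eq. (8) is alternating least squares (ALS). The sketch below is an illustration under stated assumptions, not the solver used in this paper (which mentions a genetic algorithm as one option): the function names, the rank, λ, the iteration count, and the synthetic data are all hypothetical, and B is taken to be the binary index matrix. Each half-step solves a small ridge-regression problem exactly, so the objective never increases.

```python
import numpy as np

def objective(L, R, B, S, lam):
    """Value of the Eq. (8) objective for factors L (N x r) and R (T x r)."""
    return np.sum(L**2) + np.sum(R**2) + lam * np.sum((B * (L @ R.T) - S) ** 2)

def cs_recover(S, B, rank=3, lam=100.0, iters=30, seed=0):
    """ALS sketch: alternately solve for the rows of L and the rows of R."""
    rng = np.random.default_rng(seed)
    N, T = S.shape
    L = rng.standard_normal((N, rank))
    R = rng.standard_normal((T, rank))
    I = np.eye(rank)
    history = [objective(L, R, B, S, lam)]
    for _ in range(iters):
        for i in range(N):                       # update user factors
            obs = B[i] == 1
            Ro = R[obs]
            L[i] = np.linalg.solve(I + lam * Ro.T @ Ro, lam * Ro.T @ S[i, obs])
        for j in range(T):                       # update time-slot factors
            obs = B[:, j] == 1
            Lo = L[obs]
            R[j] = np.linalg.solve(I + lam * Lo.T @ Lo, lam * Lo.T @ S[obs, j])
        history.append(objective(L, R, B, S, lam))
    return L @ R.T, history

# Synthetic demonstration: a rank-2 matrix with roughly half the entries missing.
rng = np.random.default_rng(0)
X_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
B = (rng.random((20, 30)) > 0.5).astype(int)
S = X_true * B
X_hat, history = cs_recover(S, B, rank=2)
```

The returned `X_hat` plays the role of $L R'$ in Eq. (7), and `history` records the objective after each sweep so that convergence can be inspected.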

D. Decrypting the Recovered Trajectories at Individual Users

After the encrypted trajectories are recovered at the server, any individual user can download her corresponding encrypted trajectory and apply the decryption operation. Specifically, user i downloads $\hat{\bar{X}}_{(i)}$ from the server, and locally decrypts it with the public vectors and her private key as

$$\hat{X}_{(i)} = \left( \hat{\bar{X}}_{(i)} - (\psi_{i,1} D_{(1)} + \cdots + \psi_{i,K} D_{(K)}) \right) / \psi_{i,0}, \tag{9}$$

where $\hat{\bar{X}}_{(i)}$ is the recovered encrypted trajectory and $\hat{X}_{(i)}$ is the approximation of $X_{(i)}$, i.e., the recovered complete trajectory of user i. Due to the local decryption and the private key, $\hat{X}_{(i)}$ is known only to user i herself. Then, the user can exploit map matching methods [33], which normally adjust the recovered trajectory by matching the nearest roads in the map, to further improve the accuracy.
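The decryption of Eq. (9) is the algebraic inverse of the KVP mixing, which the following sketch checks on made-up data. The function name `kvp_decrypt`, the key values, and the noise level are assumptions for illustration. The second half shows that any error in the recovered encrypted vector is scaled by 1/ψ0 after decryption.

```python
import numpy as np

def kvp_decrypt(x_enc_hat, D, psi):
    """Eq. (9): remove the public-vector contribution, then rescale by psi_0."""
    return (x_enc_hat - psi[1:] @ D) / psi[0]

rng = np.random.default_rng(2)
K, T = 3, 10
x = rng.uniform(0, 10, T)                 # made-up complete trajectory
D = rng.uniform(0, 10, (K, T))            # made-up public vectors
psi = np.array([0.5, 0.2, 0.2, 0.1])      # hypothetical private key, sums to 1

x_enc = psi[0] * x + psi[1:] @ D          # encrypted (complete) trajectory
x_dec = kvp_decrypt(x_enc, D, psi)        # exact round trip

# If the server's recovery is off by `noise`, decryption scales it by 1/psi_0.
noise = 0.01 * rng.standard_normal(T)
x_dec_noisy = kvp_decrypt(x_enc + noise, D, psi)
```

With ψ0 = 0.5 in this example, a recovery error of 1 unit on the encrypted vector becomes 2 units after decryption.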

At the end of this design, we discuss the impact of $\psi_{i,0}$. In PPCS, $\psi_{i,0}$ determines the weight of the original vector in the encrypted vector. On one hand, $\psi_{i,0}$ cannot be too small. When $\psi_{i,0} \to 0$, the weight of $X_{(i)}$ in the encrypted $\bar{X}_{(i)}$ is too small, which results in poor recovery accuracy. On the other hand, $\psi_{i,0}$ cannot be too large. When $\psi_{i,0} \to 1$, the encrypted $\bar{X}_{(i)}$ approaches $X_{(i)}$, which loses the effect of encryption. Empirically, we find that setting $\psi_{i,0}$ in the range [0.2, 0.8] can guarantee high recovery accuracy and privacy. The other weights still satisfy $\psi_{i,j} \in (0, 1)$ for $j \neq 0$ and $\sum_{j=0}^{K} \psi_{i,j} = 1$.

V. PPCS ANALYSIS

In this section, we analyze the performance of PPCS in three metrics: the trajectory recovery accuracy, the privacy preservation against eavesdroppers, and the privacy preservation against stalkers. Its complexity analysis is also presented.

A. Accuracy Analysis

Although recovering trajectories by CS has been shown to achieve a promising accuracy [29], we still need to make sure that the KVP encryption operation does not degrade the accuracy.

We adopt the same metric as in [29] to evaluate the recovery accuracy, namely, the recovery error ε. For user i, the recovery error $\epsilon_{(i)}$ is the geometric mean of the distance between the actual trajectory and the recovered trajectory, defined as

$$\epsilon_{(i)} = \frac{\|X_{(i)} - \hat{X}_{(i)}\|_2}{T}, \tag{10}$$

where $\|X_{(i)} - \hat{X}_{(i)}\|_2 = \sqrt{\sum_{j=1}^{T} (x_{ij} - \hat{x}_{ij})^2}$, and T is the total number of time slots along the trajectory.
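As a small worked instance of Eq. (10), the sketch below computes the recovery error for one user; the function name `recovery_error` and the toy vectors are assumptions for illustration.

```python
import numpy as np

def recovery_error(x_true, x_rec):
    """Eq. (10): Euclidean norm of the difference divided by the number of slots T."""
    x_true = np.asarray(x_true, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    return np.linalg.norm(x_true - x_rec) / x_true.size

# Toy example with T = 4: the difference vector is (3, 4, 0, 0), whose norm is 5.
eps_demo = recovery_error([0, 0, 0, 0], [3, 4, 0, 0])   # 5 / 4 = 1.25
```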

With this accuracy metric, we have the following theorem stating that the KVP encryption operation does not degrade the recovery accuracy.

Theorem 5.1: The proposed KVP is a homomorphic obfuscation method for CS. We define the homomorphic obfuscation property as follows. If a matrix X is near low-rank, the recovery accuracy of user i satisfies

$$\sup \|X_{(i)} - \hat{X}_{(i)}\|_2 = \sup \|X_{(i)} - \tilde{X}_{(i)}\|_2, \tag{11}$$

where sup denotes the upper bound of $\|\cdot\|_2$, $\hat{X}$ is the set of trajectories recovered by CS with KVP (i.e., $\hat{X} = f_{de}(f_{cs}(f_{en}(X \circ \Phi)))$), $\tilde{X}$ is the set of trajectories recovered by CS directly (i.e., $\tilde{X} = f_{cs}(X \circ \Phi)$), and $\hat{X}_{(i)}$ is the recovered trajectory of user i.

Proof: When a matrix is near low-rank and its approximate rank is r, the value $\sum_{i=1}^{\min(N,T)} \sigma_i - \sum_{i=1}^{r} \sigma_i$ can be considered as noise [6], which is denoted as ξ.

According to existing work on CS-based matrix completion [6] [14], we have the accuracy upper bound

$$\sup \|X - \tilde{X}\|_2 = 4 \sqrt{\frac{2 \min(N, T)}{1 - \alpha}}\, \xi_1, \tag{12}$$

where α is the data loss ratio in X, and $\xi_1$ is the noise of X.

Similarly, the accuracy upper bound for the encrypted matrix $\bar{X}$ can be represented as

$$\sup \|\bar{X} - \hat{\bar{X}}\|_2 = 4 \sqrt{\frac{2 \min(N, T)}{1 - \alpha}}\, \xi_2, \tag{13}$$

where $\hat{\bar{X}}$ is the encrypted matrix recovered by CS and $\xi_2$ is the noise of $\bar{X}$.

From Fig. 5, we know that the KVP operation does not change the number of missing data. Consequently, the loss ratio α in Eq. (12) and in Eq. (13) has the same value. Combining Eq. (12) and Eq. (13), we have

$$\frac{\sup \|\bar{X} - \hat{\bar{X}}\|_2}{\sup \|X - \tilde{X}\|_2} = \frac{\xi_2}{\xi_1}. \tag{14}$$

It is difficult to obtain the exact values of $\xi_1$ and $\xi_2$, which highly depend on the specific data. However, KVP is a basic linear transformation, which can be presented as $\bar{X} = \Psi X$, where Ψ is the matrix of private keys ψ. Treating this transformation as a measurement operation in CS, we can obtain the noise ratio according to CS theory [6],

$$\frac{\xi_2}{\xi_1} = \frac{|\mu(\Phi, \Psi)|}{|\mu(\Phi, I)|}. \tag{15}$$

Recall that Φ is the binary index matrix indicating the missing data. The coherence operation µ in Eq. (15) is defined as

$$\mu(\Phi, I) = \max_{1 \le i \ne j \le T} |\langle \Phi^{(i)}, I^{(j)} \rangle|, \tag{16}$$

where $\Phi^{(i)}$ is the i-th column vector of $\Phi_{N \times T}$, and $\langle \Phi^{(i)}, I^{(j)} \rangle$ is the inner product of two vectors, i.e., $\langle \Phi^{(i)}, I^{(j)} \rangle = (\Phi^{(i)})' I^{(j)}$.

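By the design of KVP, the Ψ matrix has the following form, shown here for the example K = 2 of Fig. 5:

$$\Psi = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ \psi_{3,1} & \psi_{3,2} & \psi_{3,0} & 0 & 0 \\ \vdots & \vdots & 0 & \ddots & 0 \\ \psi_{N,1} & \psi_{N,2} & 0 & 0 & \psi_{N,0} \end{bmatrix}. \tag{17}$$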

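Combining Eq. (14), Eq. (15), Eq. (16), and Eq. (17), we can calculate the recovery error of user i as

$$\frac{\sup \|\bar{X}_{(i)} - \hat{\bar{X}}_{(i)}\|_2}{\sup \|X_{(i)} - \tilde{X}_{(i)}\|_2} = \frac{|\mu(\Phi, \Psi^{(i)})|}{|\mu(\Phi, I^{(i)})|} = \frac{\psi_{i,0}}{1}, \tag{18}$$

where $\bar{X}_{(i)}$ and $\hat{\bar{X}}_{(i)}$ denote the encrypted trajectory of user i and its recovery by CS. Because 1) the decryption operation $f_{de}(\hat{\bar{X}}_{(i)}) = \hat{X}_{(i)}$ is also a linear transformation according to Eq. (9); 2) all other variables such as the D’s and ψ’s are known; and 3) $\sum_{j=0}^{K} \psi_{i,j} = 1$, we know that the error is linearly amplified according to the weights:

$$\frac{\sup \|X_{(i)} - \hat{X}_{(i)}\|_2}{\sup \|\bar{X}_{(i)} - \hat{\bar{X}}_{(i)}\|_2} = \frac{\psi_{i,0} + \psi_{i,1} + \cdots + \psi_{i,K}}{\psi_{i,0}} = \frac{1}{\psi_{i,0}}.$$

Combining the above two equations, we have

$$\sup \|X_{(i)} - \hat{X}_{(i)}\|_2 = \sup \|X_{(i)} - \tilde{X}_{(i)}\|_2, \tag{19}$$

and the theorem is proved.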

B. Privacy Preservation against Eavesdroppers

Privacy preservation is offered by PPCS. We discuss how PPCS protects against privacy leakage to eavesdroppers (in this subsection) and stalkers (in the next subsection).

The location data are encrypted by individual users before being transmitted to the server. In this way, only encrypted data (the encrypted sensed trajectories $\bar{S}$ sent from the users, or the complete encrypted trajectories $\hat{\bar{X}}$ recovered by CS) can be captured by eavesdroppers. These eavesdroppers can only infer the original user trajectory based on the exposed encrypted data $\hat{\bar{X}}$. Therefore, we adopt the distortion δ defined in [40] to measure the similarity between the encrypted and the original data of every user:

$$\delta_{(i)} = \frac{\sum_{j=1}^{T} |\hat{\bar{X}}_{(i,j)} - X_{(i,j)}|}{T}. \tag{20}$$

The value of δ represents the average per-location distortion between the encrypted and the original trajectories, and a larger δ indicates stronger privacy preservation against eavesdroppers. In practice, the complete trajectory X is not always available due to the missing data issue. In this case, we use the recovered $\hat{X}$ in place of X in Eq. (20).
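A minimal sketch of the distortion metric of Eq. (20) for one user and one coordinate follows; the function name `distortion` and the toy numbers are assumptions for illustration.

```python
import numpy as np

def distortion(x_enc, x_orig):
    """Eq. (20): mean absolute per-slot difference between the encrypted
    (recovered) trajectory and the original trajectory."""
    x_enc = np.asarray(x_enc, dtype=float)
    x_orig = np.asarray(x_orig, dtype=float)
    return np.mean(np.abs(x_enc - x_orig))

# Toy example with T = 3: absolute differences are 1, 0 and 3, so delta = 4/3.
delta_demo = distortion([1.0, 1.0, 5.0], [0.0, 1.0, 2.0])
```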

The PPCS scheme exploits KVP to obfuscate the user’s personal trajectory. Since several trajectories are mixed into one trajectory, even if an eavesdropper steals this combined trajectory, it is not easy to distinguish the original one. Next, we derive the distribution of the distortion δ.

The encrypted vector is obtained by linearly combining the user’s vector and K public vectors with weights ψ, and this encryption operation exhibits significant randomness in that 1) the K public vectors are randomly selected from all public vectors and 2) the weight vector $\langle \psi_{i,0}, \psi_{i,1}, \ldots, \psi_{i,K} \rangle$ is randomly generated. With this randomness, the original locations are mapped to other locations that are still within the area of interest. As a result, we can use the random distance distribution to approximate the distortion between a given location and its encrypted data.

Consider a w × h rectangular area. The distortion distribution P(δ ≤ d) can be presented by a piecewise function [5]

$$P(\delta \le d) = \begin{cases} \dfrac{2}{w^2 h^2} \left( G(d) - G(0) \right), & d \in [0, h] \\[2mm] \dfrac{2}{w^2 h^2} \left( G(h) - G(0) \right), & d \in (h, w] \\[2mm] \dfrac{2}{w^2 h^2} \left( G(h) - G(\sqrt{d^2 - w^2}) \right) + F_h(\sqrt{d^2 - w^2}), & d \in (w, \eta] \end{cases} \tag{21}$$

where

$$G(z) = \int (h - z) \sqrt{d^2 - z^2} \left( 2w - \sqrt{d^2 - z^2} \right) dz,$$

$F_h(z) = 1 - (1 - z/h)^2$, and $\eta = \sqrt{w^2 + h^2}$. With this distribution, the average distance between randomly selected points (i.e., the expectation of the distortion $\bar{\delta}$) can be easily obtained.

Fig. 6. PDF of the distortion by KVP. (x-axis: distance (km); y-axis: PDF; curves: simulation and analysis.)

To validate our analysis of the distortion distribution, we simulate a rectangular area with w = 4 km and h = 3 km, and randomly generate a set of trajectories including a total of 1 × 10⁶ locations. Then we apply KVP to these trajectories and record the distances between the original locations and their corresponding encrypted locations. The statistical distribution of these distances is shown in Fig. 6, along with the probability distribution calculated according to Eq. (21). The average distance of these location pairs is also shown in the figure (i.e., $\bar{\delta}$ ≈ 1.83 km).

Two observations are obtained from these results. First, the distortion shows significant randomness over a large distance range. Second, the average distortion between the original and encrypted locations is relatively large compared to the area. These observations indicate that there is no obvious pattern from which the original locations could be inferred through the encrypted locations.

C. Privacy Preservation against Stalkers

Another adversary model is the stalker, who can obtain k actual location data of a user’s trajectory. A stalker has two alternative methods to recover the trajectory: 1) Crack the private key based on the k data. 2) Run the crowdsensing recovery using these k data.

In the first method, protecting user privacy requires that the trajectory X̂(i) in Eq. (9) be unsolvable even if the stalker knows the encrypted data X̂′(i), the decryption function fde, and k (k < K) actual location data of X(i). The PPCS scheme relies on the private keys to defend against stalkers. From Eq. (9), we know

\[
\hat{X}'_{(i)} = \psi_{i,0}\hat{X}_{(i)} + \psi_{i,1}D_{(1)} + \cdots + \psi_{i,K}D_{(K)}. \tag{22}
\]

It is possible for a stalker to obtain X̂′(i) by hacking the server, and she may also obtain the public vectors D(j) (j = 1, 2, ···, K). However, because the private keys ψi,j (j = 0, 1, ···, K) are known only to the users themselves, a stalker needs at least K + 1 elements of X(i) to solve Eq. (22), according to the theory of underdetermined systems [8]. As a result, the stalker cannot resolve the original trajectory as long as the condition k ≤ K holds. Since K is the control parameter adopted in PPCS, we can proactively adjust the number of public vectors used in the encryption operation according to the requirements of individual users.
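The counting argument above can be checked numerically. In the following Python sketch, each location known to the stalker contributes one linear equation in the K + 1 unknown key entries, so k ≤ K known locations give a coefficient matrix of rank at most k, which is below the number of unknowns. The data are randomly generated integers, the helper names rank and stalker_system are hypothetical, and recovery noise is ignored. It is an illustration of the underdetermined-system argument, not an attack implementation.

```python
from fractions import Fraction
import random


def rank(rows):
    """Rank of a matrix (list of lists of Fractions) via exact Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def stalker_system(x, public_vecs, known_slots):
    """Coefficient rows of: psi_0 * x[t] + sum_j psi_j * D_j[t] = x_enc[t], one per known slot t."""
    return [[x[t]] + [d[t] for d in public_vecs] for t in known_slots]


rng = random.Random(1)
T, K = 12, 4
x = [Fraction(rng.randint(1, 999)) for _ in range(T)]            # hypothetical trajectory
public_vecs = [[Fraction(rng.randint(1, 999)) for _ in range(T)]
               for _ in range(K)]                                # hypothetical public vectors

unknowns = K + 1                                                 # psi_0 ... psi_K
rank_k_equals_K = rank(stalker_system(x, public_vecs, range(K)))
rank_k_equals_K_plus_1 = rank(stalker_system(x, public_vecs, range(K + 1)))
```

With k = K the rank stays below K + 1, so infinitely many key vectors are consistent with what the stalker knows. Only with K + 1 known locations can the system reach full rank for generic data.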

In the second method, a stalker poses as a private user and joins the crowdsensing recovery. Many practical factors significantly affect privacy preservation, such as the number of exposed location data k, the mobility model of users, and the map structure. Some of these factors are not easy to formulate. Hence, for this method, we conduct real-trace-based simulations to verify the privacy preservation in practice (refer to Section VI-B).

D. Complexity Analysis

1) Computational Complexity: The KVP operation runs locally on the user side. In KVP, K + 1 vectors of size 1 × T are processed in the encryption and decryption steps, which requires a computational complexity of O((K + 1)T). This cost takes negligible computing time given the capability of current GHz-level mobile devices.

At the server side, the main task is CS computing, which requires a computational complexity of O(rNTϱ) [29], where r is the rank of the to-be-recovered matrix and ϱ is the number of iterations. Our experience with the Beijing and Shanghai traces reveals that ϱ ≤ 5 in most cases. Since the server typically has strong computational capability, the CS operation can respond in real time.

2) Communication Overhead: To execute fen, a user downloads K public vectors D(j) from the server and then uploads an encrypted vector S′(i). Hence, the communication overhead is O((K + 1)T). Moreover, to execute fde, the user downloads X̂′(i), which requires another communication overhead of O(T). As an example, with K = 10, T = 500, and a 16-bit operating system (i.e., 2 bytes per vector element), the total amount of data exchanged is about (10 + 2) × 500 × 16/8 = 12 KB, which is a very light overhead for modern mobile applications.
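The arithmetic in this example can be reproduced with a small Python helper. The function name kvp_traffic_bytes and its parameter bits_per_value are hypothetical and only mirror the counting in the paragraph above: K downloaded public vectors, one uploaded encrypted vector, and one downloaded recovered vector, each of length T.

```python
def kvp_traffic_bytes(K, T, bits_per_value=16):
    """Total bytes exchanged by one user for one encrypt/recover/decrypt round."""
    download_public = K * T      # K public vectors needed by f_en
    upload_encrypted = T         # one encrypted vector sent to the server
    download_recovered = T       # recovered encrypted vector needed by f_de
    values = download_public + upload_encrypted + download_recovered
    return values * bits_per_value // 8


total = kvp_traffic_bytes(K=10, T=500)
print(total)  # 12000 bytes, i.e. about 12 KB
```

The total grows linearly in both K and T, so doubling the number of public vectors roughly doubles the download volume.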

E. Design Discussion

At the end of the analysis, we discuss an interesting design concern: the absence of public users.

Public users are optional in PPCS. Even if there are no public users, PPCS still works. To take the place of public vectors, the server can provide historical vectors (e.g., any trajectory from the same time interval yesterday), as long as the low-rank property is maintained. In addition, when a user has a high privacy requirement, she will demand a large number K of public vectors according to the analysis in Section V-C. A shortage of public vectors for such a K can also be addressed with historical vectors.

VI. PERFORMANCE EVALUATION

In this section, we evaluate the performance of PPCS in terms of both the data accuracy and the privacy.

A. Simulation Settings

We evaluate PPCS based on two real-world traces: walk, bike, and car data in Geolife [2], and taxi and bus data in SUVnet [3]. Using the same method as in Section III, we preprocess the raw data of Geolife and SUVnet by selecting complete trajectories as the ground truth for our simulations. The selected traces are named the Beijing traces, with a size of 116 users × 355 slots, and the Shanghai traces, with a size of 74 users × 399 slots. Their detailed descriptions are listed in Table I.

Fig. 7. Recovery accuracy comparison (CDF vs. recovery error in m; curves: PPCS and CS). (a) CDF of recovery error in Beijing traces. (b) CDF of recovery error in Shanghai traces.

Fig. 8. Recovery accuracy vs loss ratio (average error in m vs. loss ratio; curves: PPCS and CS). (a) Recovery error vs loss ratio in Beijing traces. (b) Recovery error vs loss ratio in Shanghai traces.

Fig. 9. Distortion against eavesdroppers (curves: Beijing and Shanghai). (a) Distortion distribution (CDF vs. distortion in km). (b) Distortion vs K (distortion in km vs. number of public trajectories).

In the trace-driven simulations, we randomly generate a 0-1 matrix Φ with the same size as the original data trace. An element in Φ takes the value 0 if its corresponding element in the data trace is missing and 1 otherwise. The ratio of 0 elements to the total number of elements in Φ is controlled by the data loss ratio α, which is set to 0.5 unless otherwise specified. Then, we generate the sensed matrix S according to Eq. (2), i.e., S = X ◦ Φ. The proposed PPCS is applied to the sensed matrix S with K public vectors, and the recovered matrix X̂ is obtained. Without loss of generality, the top K rows in the original traces are treated as the public traces, and K = 10 by default. The results reported in the following are averaged over 100 simulation runs.
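The masking step S = X ◦ Φ can be sketched in Python as follows. The helper names make_mask and hadamard, the small matrix size, and the uniformly random placement of an exact number of zeros are assumptions made for illustration. The text only specifies that the fraction of zeros in Φ is controlled by α. The matrix X here holds one hypothetical coordinate per user and time slot rather than real trace data.

```python
import random


def make_mask(n_users, n_slots, loss_ratio, rng):
    """Return a 0-1 matrix with round(loss_ratio * n_users * n_slots) zeros placed uniformly at random."""
    total = n_users * n_slots
    n_missing = round(loss_ratio * total)
    missing = set(rng.sample(range(total), n_missing))
    return [
        [0 if (i * n_slots + t) in missing else 1 for t in range(n_slots)]
        for i in range(n_users)
    ]


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


rng = random.Random(42)
N, T, alpha = 6, 10, 0.5
X = [[rng.uniform(0.0, 4.0) for _ in range(T)] for _ in range(N)]  # hypothetical data trace
Phi = make_mask(N, T, alpha, rng)
S = hadamard(X, Phi)   # sensed matrix: missing entries become 0
```

Fixing the number of zeros, rather than drawing each entry independently, keeps the realized loss ratio equal to α in every run, which makes averages over repeated runs easier to compare.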

We adopt the state-of-the-art CS-based crowdsensing recovery method in [29] as the baseline, which is referred to as CS in the remainder of this section. With respect to the adversary model, we consider that an eavesdropper can steal all encrypted data X̂′ and that the stalker has a part (< 10%) of the real trajectory data.

B. Performance Analysis

1) Recovery Accuracy: We first evaluate the recovery accuracy with the default setting of α = 0.5 and K = 10. The distributions of the recovery errors obtained by PPCS and CS with the two real-world traces are shown in Fig. 7. For example, Fig. 7(a) shows that the recovery errors of 50% of users' trajectories are less than 10 m and the average recovery error is 53 m when applying PPCS to the Beijing traces.

Two observations are obtained from Fig. 7. First, the recovery accuracies obtained with PPCS and CS are comparable in both traces, which validates Theorem 5.1, namely that PPCS can achieve recovery accuracy similar to that of the state-of-the-art CS method. Second, the recovery errors are small. For instance, the recovery errors of 80% of users in the Beijing and Shanghai traces are less than 100 m and 200 m, respectively. These errors are tolerable in many cases because mechanisms such as map matching [25, 33] can eliminate their impact on the final recovered trajectories.

To gain more insight into the impact of data loss on recovery accuracy, we apply PPCS to the two traces with α varying from 0.1 to 0.7; the results are shown in Fig. 8. The recovery error clearly increases with α. For example, with the Beijing traces, PPCS achieves an average recovery error of 20 m when α = 0.2, and the error increases to 124 m when α = 0.7. The recovery errors with PPCS and CS are comparable in all the explored cases, which agrees with the observation in Fig. 7.

An interesting observation is that PPCS slightly improves on the accuracy of CS. A possible reason is that the linear transformations in KVP make the low-rank property of the trajectory matrix more pronounced and thus improve the recovery accuracy.

2) Privacy against Eavesdroppers: Keeping α = 0.5 and K = 10, we next investigate the perturbation distortion obtained with PPCS. In Fig. 9(a), we can see that the distortion between the original and encrypted trajectories is large. For example, the distortion of 50% of trajectories is over 4,000 m and the average distortion is more than 9,000 m with the Beijing traces. Such distortion distances are quite large compared with the road segment lengths in Beijing. As a result, even if the encrypted trajectory is exposed to adversaries, the information leakage on the original trajectory is small, indicating a strong privacy preservation level. Another observation is that the distortion distribution shows no clear pattern. For example, the distortion distribution with the Shanghai traces is nearly linear, whereas that with the Beijing traces resembles a piecewise function. This patternless feature indicates that even if adversaries obtain a large number of encrypted trajectories, training methods based on this information would not help them infer the original trajectories.

TABLE II
RECOVERY ERROR WHEN STALKERS HAVE PARTIAL ORIGINAL DATA.

Recovery error ε    Stalker (5%)    Stalker (10%)    User (50%)
ε of Beijing        409.36 m        366.47 m         38.99 m
ε of Shanghai       2510.48 m       1723.16 m        189.96 m

As analyzed in Section V, PPCS needs a number of public (or historical) trajectories to perform the encryption. To investigate the impact of the number of public trajectories on the distortion, we apply PPCS to the two traces with the number of public traces varying from 10 to 60 (the total numbers of users in the Beijing and Shanghai traces are 116 and 74, respectively). The results in Fig. 9(b) show no clear relation between the distortion and the number of available public traces. This observation alleviates the concern that the available number of public trajectories might significantly degrade the distortion performance of PPCS. In these experiments, the privacy-preserving level of PPCS is therefore largely independent of the number of public vectors.

3) Privacy against Stalkers: A stalker can treat the k exposed data of a user as a trajectory with (T − k) missing data and then use PPCS to recover this trajectory. Next, we evaluate the privacy preservation offered by PPCS against stalkers; the results are shown in Table II. Recall that k is a small number in the stalker model. We evaluate the cases in which 5% and 10% of the actual location data are captured by the stalker. In the 5% case, the recovery error is more than 2,500 m with the Shanghai traces and more than 400 m with the Beijing traces. It is difficult to obtain the actual trajectory with such large errors. Even if a stalker has 10% of the actual trajectory data, she cannot achieve a satisfactory recovery accuracy. In contrast, when the data loss ratio is α = 0.5, private users under PPCS achieve an accuracy that is always under 40 m with the Beijing traces. In summary, PPCS is able to effectively protect privacy even when a few original data are exposed.

C. Illustrative Results

To give a clear view of the results obtained with PPCS, Fig. 10 shows the trajectories recovered by PPCS, CS, and a stalker, together with the encrypted trajectory seen by an eavesdropper, using a 10-location original trajectory. All trajectories are fitted to roads by the map-matching method proposed in [33].

In Fig. 10(a), the recovered trajectories by PPCS and CS [29] are drawn when 4 locations along the trajectory are missing. We can see when 40% of original data are missing, PPCS still recovers the original trajectory with a high accuracy that is comparable to the result of CS. Moreover, Fig. 10(a) also shows the recovered trajectory of a stalker who applies PPCS with 3 stalked actual locations, which is a totally different trajectory compared with the original one.

Fig. 10. Illustrative results of PPCS. (The dots are the PPCS / CS / Stalker / Encrypted results. The lines are the map matching results based on the dots.) (a) The recovered trajectories by PPCS and CS are similar to the original one, but the trajectory recovered by a stalker is much different (map reference: N 39° 58' 29'', E 116° 18' 39''). (b) The encrypted trajectory distorts the original one to a different area (refer to the latitude and the longitude) against eavesdroppers (map reference: N 39° 58' 55'', E 116° 21' 34'').

The encrypted trajectory is shown in Fig. 10(b). Comparing the latitude N° and the longitude E° in Fig. 10(b) with those in Fig. 10(a), we find that the distortion between the encrypted trajectory and the original one is relatively large, indicating a strong defense against eavesdroppers. Furthermore, the encrypted results also form a plausible trajectory on the map. This indicates that eavesdroppers cannot easily determine whether the hacked trajectories are encrypted or not.

In addition, even if an adversary can eavesdrop and stalk simultaneously, she can only obtain two separate results, 'encrypted' as shown in Fig. 10(b) and 'stalker' in Fig. 10(a), with no further improvement in inferring the original trajectory.

VII. RELATED WORK

In this section, we discuss the related work in the literature. There are two important research topics involved in this work: trajectory recovery and trajectory privacy.

A. Trajectory recovery

We classify existing efforts on trajectory recovery into two categories: single user recovery and crowdsensing recovery. Single user recovery reconstructs a trajectory based on a user's own location data. Many classic missing-data estimation methods, such as nearest neighbors (NN) [34] and linear interpolation [31], can be used to recover a trajectory on a user's own mobile device. These methods avoid the data leakage issue because no data exchange is required; however, their recovery accuracy is usually limited [29].

Crowdsensing recovery reconstructs all users' trajectories together based on their trajectory correlations and thus significantly improves the recovery accuracy compared with single user recovery. Currently, compressive sensing (CS) [9] is an advanced recovery technique in diverse applications [20, 39]. For trajectory recovery, CS-based crowdsensing recovery [29] also produces a near-optimal approximation for missing data recovery. Although CS provides high accuracy, it requires data transmission and a computing server, and thus degrades user privacy.

B. Trajectory privacy

Existing trajectory privacy works rely on three primary methods: anonymization, dummification, and obfuscation. First, a user adopting the anonymization method [24] transmits her location data under an anonymous identifier instead of her ID. However, recent studies [12, 38] reveal that the anonymization mechanism alone is inadequate to preserve privacy. Second, a user adopting the dummification method [19] transmits her location data together with a set of generated fake data. Although dummification increases privacy, it introduces additional data and disturbs the original correlations, which decreases the recovery accuracy. Third, the obfuscation method either perturbs a user's location data by mixing in other trajectories [15, 27] or cloaks the data into a spatial region [13]. Existing obfuscation methods blur the original data and thus conflict with the goal of accurate recovery.

VIII. CONCLUSION

With the increasing popularity of location based services, it is important to consider quality of service and user privacy simultaneously. Focusing on the trajectory recovery service, in this paper we design a novel PPCS scheme that uses crowdsensing to accurately recover trajectories while preserving privacy. The core design of PPCS leverages a matrix transformation to incorporate privacy preservation into compressive sensing. Through extensive trace-based simulations, we demonstrate that PPCS not only effectively preserves user privacy against eavesdroppers and stalkers, but also achieves accuracy comparable to the state-of-the-art CS design. Although we focus on trajectory recovery in this work, the general PPCS scheme can also be used in other privacy-preserving data recovery applications.

Acknowledgment: This research was supported in part by the NSERC Discovery Grant 341823, NSERC Collaborative Research and Development Grant CRDPJ418713, Canada Foundation for Innovation (CFI) Leaders Opportunity Fund 23090, NSFC grant No. 61373155, No. 91438121, No. 61303202, and China Postdoctoral Science Foundation grant No. 2014M560334.

REFERENCES

[1] Facebook acquires company behind Moves fitness app. http://www.theverge.com/2014/4/24/5647084/facebook-acquires-moves-fitness-app.

[2] GeoLife Data Collected by Microsoft Research Asia. http://research.microsoft.com/en-us/projects/geolife/default.aspx.

[3] SUVnet Data Collected by Shanghai Jiao Tong University. http://wirelesslab.sjtu.edu.cn/download.html.

[4] Trippermap service in flickr. http://www.flickr.com/services/apps/5121/.

[5] V. S. Alagar. The distribution of the distance between random points. Journal of Applied Probability, pages 558–566, 1976.

[6] E. J. Candes and Y. Plan. Matrix completion with noise. Proceedings of the IEEE, 98(6):925–936, 2010.

[7] C.-Y. Chow, M. F. Mokbel, and W. G. Aref. Casper: Query processing for location services without compromising privacy. ACM Transactions on Database Systems, 34(4):24–48, 2009.

[8] J. W. Demmel and N. J. Higham. Improved error bounds for underdetermined system solvers. SIAM Journal on Matrix Analysis and Applications, 14(1):1–14, 1993.

[9] D. L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006.

[10] Z. Feng, Y. Zhu, Q. Zhang, L. M. Ni, and A. V. Vasilakos. Trac: Truthful auction for location-aware collaborative sensing in mobile crowdsourcing. In IEEE INFOCOM, 2014.

[11] R. K. Ganti, F. Ye, and H. Lei. Mobile crowdsensing: current state and future challenges. IEEE Communications Magazine, 49(11):32–39, 2011.

[12] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan. Private queries in location based services: anonymizers are not necessary. In ACM SIGMOD, 2008.

[13] M. Gruteser and D. Grunwald. Anonymous usage of location-based services through spatial and temporal cloaking. In ACM MobiSys, 2003.

[14] C. Hegde, P. Indyk, and L. Schmidt. Approximation-tolerant model-based compressive sensing. In ACM/SIAM SODA, 2014.

[15] B. Hoh and M. Gruteser. Protecting location privacy through path confusion. In IEEE SecureComm, 2005.

[16] X. Ju, H. Zhang, and D. Sakamuri. Neteye: a user-centered wireless sensor network testbed for high-fidelity, robust experimentation. International Journal of Communication Systems, 25(9):1213–1229, 2012.

[17] J. Jun, Y. Gu, L. Cheng, B. Lu, J. Sun, T. Zhu, and J. Niu. Social-loc: Improving indoor localization with social sensing. In ACM SenSys, 2013.

[18] T. Jung and X.-Y. Li. Search me if you can: privacy-preserving location query service. In IEEE INFOCOM, 2013.

[19] H. Kido, Y. Yanagisawa, and T. Satoh. An anonymous communication technique using dummies for location-based services. In IEEE ICPS, 2005.

[20] L. Kong, M. Xia, X.-Y. Liu, M.-Y. Wu, and X. Liu. Data loss and reconstruction in sensor networks. In IEEE INFOCOM, 2013.

[21] A. Lakhina, K. Papagiannaki, M. Crovella, C. Diot, E. D. Kolaczyk, and N. Taft. Structural analysis of network traffic flows. In ACM SIGMETRICS, 2004.

[22] J. Liu, B. Priyantha, T. Hart, H. S. Ramos, A. A. Loureiro, and Q. Wang. Energy efficient GPS sensing with cloud offloading. In ACM SenSys, 2012.

[23] S. Liu, S. Wang, K. Jayarajah, A. Misra, and R. Krishnan. Todmis: Mining communities from trajectories. In ACM CIKM, 2013.

[24] C. Y. Ma, D. K. Yau, N. K. Yip, and N. S. Rao. Privacy vulnerability of published anonymous mobility traces. In ACM MOBICOM, 2010.

[25] P. Newson and J. Krumm. Hidden markov map matching through noise and sparseness. In ACM SIGSPATIAL, 2009.

[26] D. Niculescu and B. Nath. Trajectory based forwarding and its applications. In ACM MOBICOM, 2003.

[27] D. Quercia, I. Leontiadis, L. McNamara, C. Mascolo, and J. Crowcroft. Spotme if you can: Randomized responses for location obfuscation on mobile phones. In IEEE ICDCS, 2011.

[28] K. K. Rachuri, C. Mascolo, M. Musolesi, and P. J. Rentfrow. Sociablesense: exploring the trade-offs of adaptive sampling and computation offloading for social sensing. In ACM MOBICOM, 2011.

[29] S. Rallapalli, L. Qiu, Y. Zhang, and Y.-C. Chen. Exploiting temporal stability and low-rank structure for localization in mobile networks. In ACM MOBICOM, 2010.

[30] R. Rosales and S. Sclaroff. 3D trajectory recovery for tracking multiple objects and trajectory guided recognition of actions. In IEEE CVPR, 1999.

[31] G. Scaglia, A. Rosales, L. Quintero, V. Mut, and R. Agarwal. A linear-interpolation-based controller design for trajectory tracking of mobile robots. Elsevier Control Engineering Practice, 18(3):318–329, 2010.

[32] I. Singh, M. Butkiewicz, H. V. Madhyastha, S. V. Krishnamurthy, and S. Addepalli. Twitsper: Tweeting privately. IEEE Security & Privacy, 11(3):46–50, 2013.

[33] A. Thiagarajan, L. Ravindranath, H. Balakrishnan, S. Madden, L. Girod, et al. Accurate, low-energy trajectory mapping for mobile devices. In USENIX NSDI, 2011.

[34] W. K. Wong, D. W.-l. Cheung, B. Kao, and N. Mamoulis. Secure kNN computation on encrypted databases. In ACM SIGMOD, 2009.

[35] W. Xi, J. Zhao, X.-Y. Li, K. Zhao, S. Tang, X. Liu, and Z. Jiang. Electronic frog eye: Counting crowd using wifi. In IEEE INFOCOM, 2014.

[36] M. Xia, L. Gong, Y. Lv, Z. Qi, and X. Liu. Effective real-time android application auditing. In IEEE S&P, 2015.

[37] T. Xu and Y. Cai. Feeling-based location privacy protection for location-based services. In ACM CCS, 2009.

[38] H. Zang and J. Bolot. Anonymization of location data does not work: A large-scale measurement study. In ACM MOBICOM, 2011.

[39] B. Zhang, X. Cheng, N. Zhang, Y. Cui, Y. Li, and Q. Liang. Sparse target counting and localization in sensor networks based on compressive sensing. In IEEE INFOCOM, 2011.

[40] J. Zhu, K.-H. Kim, P. Mohapatra, and P. Congdon. An adaptive privacy-preserving scheme for location tracking of a mobile user. In IEEE SECON, 2013.