Generating Dummies with Patterns
4.2 Intersection Pattern-based Scheme
4.2.2 K-intersect Dummy Generation
Rotation dummy generation yields only one intersection between the user’s trajectory and the dummies. If the number of intersections between a user and the dummies is increased, it becomes more difficult for adversaries to identify the user’s true trajectory, and LD thus decreases. However, dst is also strongly affected. Moreover, when attackers have some background knowledge about users, the intersections between a user and the dummies can reduce the exposure of the user’s trajectory and thus protect the user’s location privacy. In this scheme, we therefore increase the number of intersections between the user trajectory and the dummy trajectories.
Figure 4.6: An example of 2-intersect pattern scheme
Figure 4.7: The problem of multiple intersections
This scheme selects k points from the intersection candidate set to be the intersection points. The paths between the intersection points are then generated by the randomized dummy generation. The intersection points can be represented as $C = \{C_1, C_2, \ldots, C_k\}$, where $C_i$ is the $i$th intersection point. For the part of the trajectory not covered by the intersection points, the rotation dummy generation is used. Explicitly, $C_1$ and $C_k$ can be regarded as the rotation points, and the following formula is used:
$$\frac{(m - k) \cdot dst_r + k \cdot dst_k}{m} = dst \;\Longrightarrow\; dst_r = \frac{m \cdot dst - k \cdot dst_k}{m - k}$$
Here $dst_k$ denotes the distance deviation and $a$ denotes the length of the trajectory between the cutting points $C_1$ and $C_k$. The combination of the paths described above is output as the dummy’s moving pattern. An example of 2-intersect dummy generation is shown in Figure 4.6, where the dotted circles mark the intersection points. The trajectory inside the dotted square is generated by the randomized dummy approach, and the trajectory outside the square is generated by the rotation dummy approach.
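The rearranged formula can be checked numerically. The following is a minimal sketch that simply solves the weighted-average relation for the rotation part; the function name and the input values are hypothetical and are not taken from the paper.

```python
def rotation_deviation(m: int, k: int, dst: float, dst_k: float) -> float:
    """Solve ((m - k) * dst_r + k * dst_k) / m = dst for dst_r."""
    if not 0 <= k < m:
        raise ValueError("need 0 <= k < m so that (m - k) is positive")
    return (m * dst - k * dst_k) / (m - k)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
m, k, dst, dst_k = 10, 2, 5.0, 1.0
dst_r = rotation_deviation(m, k, dst, dst_k)

# Plugging dst_r back into the left-hand side recovers the target dst.
recovered = ((m - k) * dst_r + k * dst_k) / m
print(dst_r, recovered)
```

With these inputs the rotation part must deviate more than the target (6.0 versus 5.0) to compensate for the small deviation assumed for the other part.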
Note that increasing the number of intersection points is not always beneficial, because the more intersections there are between the user’s and the dummy’s trajectories, the smaller the user’s distance deviation becomes. For example, in Figure 4.7, the dummy’s trajectory is too close to the user’s trajectory. This harms dst, and an attacker may compromise the user’s privacy. In this paper, we set k to two, which yields more intersections than the rotation dummy generation without degrading the privacy quality measured by dst.
As mentioned before, the selection of the intersection candidate set depends on several factors. We explore two of them. First, a candidate should not be an important place to the user. For example, if we choose a user’s home as the intersection point, the dummies and the user will stay in the same cell for a long period of time, so the dummy cannot effectively protect the user’s location privacy. Second, in order to increase cache utilization, we develop a cache scheme that places intersection points where users are likely to visit.
To determine which places are important, we consider staying time and sensitive places. Clearly, choosing a place where the user stays for a long time as the intersection point decreases location anonymity. The other type of important place is the sensitive area [6], that is, a place (e.g., a hospital or nightclub) where the user does not want others to know that he or she is present. For example, when shopping in a mall, most people may not be very concerned even if their locations are known. However, users may worry about their locations being exposed in other places (e.g., a hospital). Our method excludes these important places, namely the places where the user stayed for more than a threshold of $T_{max}$ slots and the user-specific sensitive areas $S_i$, and the remaining positions form the candidate position set.
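The exclusion rule above can be sketched as a simple filter. This is only an illustration under stated assumptions: cells are represented as grid coordinates, the stay durations and sensitive cells are made-up values, and the uniform random choice of k points is a placeholder rather than the selection rule used in the paper.

```python
import random
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # hypothetical grid-cell representation


def candidate_positions(stay_slots: Dict[Cell, int],
                        sensitive: Set[Cell],
                        t_max: int) -> List[Cell]:
    """Keep cells that are neither long-stay (> t_max slots) nor sensitive."""
    return [cell for cell, slots in stay_slots.items()
            if slots <= t_max and cell not in sensitive]


def pick_intersections(candidates: List[Cell], k: int,
                       rng: random.Random) -> List[Cell]:
    """Placeholder choice: draw k distinct candidates uniformly at random."""
    return rng.sample(candidates, k)


# Made-up data: number of time slots the user spent in each visited cell.
stay_slots = {(0, 0): 12, (1, 2): 1, (2, 5): 2, (3, 3): 1, (4, 1): 3}
sensitive = {(3, 3)}          # e.g. a hospital cell marked by the user
candidates = candidate_positions(stay_slots, sensitive, t_max=3)
chosen = pick_intersections(candidates, k=2, rng=random.Random(0))
print(candidates)             # [(1, 2), (2, 5), (4, 1)]
```

The home cell (0, 0) is dropped because of its long stay, and (3, 3) is dropped because it is sensitive, regardless of how briefly the user stayed there.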
Dummy techniques increase communication cost as a side effect, since the service provider must create a reply message not only for the true position data but also for each dummy. In dummy methods, the LBS returns data for both the user and the dummies, and the user filters out the replies to the dummy requests. That is undoubtedly a waste of resources. Caching dummy data for future use is possible if the intersection points are selected in advance at places where the user is likely to stay. Not only the intersection but also the area near the intersection can be cached. To this end, the dummy generation scheme adds a dummy trajectory that intersects the user’s trajectory at different time slots. Consider Figure 4.8 as an example, where kNN queries are issued. Initially, we retrieve more than k data sources and let dummy X arrive at the intersection (2,5) before the user does. Thus, the data returned for dummy X can be reused when the user arrives at intersection (2,5).
Figure 4.8: An example of knn query caching.
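The caching idea can be sketched with a small in-memory cache keyed by cell. This is an assumption-laden illustration: the class name, the stand-in query function, and the returned items are hypothetical, and the sketch ignores cache staleness and the retrieval of extra data sources around the intersection.

```python
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]  # hypothetical grid-cell representation


class AnswerCache:
    """Stores the answer of the first query issued at a cell and reuses it."""

    def __init__(self, query_fn: Callable[[Cell], List[str]]) -> None:
        self._query_fn = query_fn
        self._store: Dict[Cell, List[str]] = {}
        self.server_queries = 0  # how many requests actually reached the LBS

    def lookup(self, cell: Cell) -> List[str]:
        if cell not in self._store:
            self._store[cell] = self._query_fn(cell)
            self.server_queries += 1
        return self._store[cell]


def fake_knn(cell: Cell) -> List[str]:
    """Stand-in for an LBS kNN query; returns made-up item names."""
    return [f"poi-{cell[0]}-{cell[1]}-{i}" for i in range(3)]


cache = AnswerCache(fake_knn)
answer_for_dummy = cache.lookup((2, 5))  # dummy X reaches (2,5) first
answer_for_user = cache.lookup((2, 5))   # the user arrives later: cache hit
print(cache.server_queries)              # 1
```

Only the dummy’s earlier visit reaches the service provider; the user’s later arrival at the same intersection is served locally.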
Based on the above concept, we employ query cost to evaluate the performance.
The query cost $QC_d$ is defined as:
$$QC_d = (n + 1) \sum_{i=1}^{m} SZ_i - \big((n + 1)\, I_{bt} + I_{ft} + I_{at}\big) \sum_{j=1}^{k} SZ_j$$
The first term represents the query cost of the dummy method, where $SZ_i$ and $SZ_j$ denote the sizes of the answer messages, and n and m denote the total number of dummies and the total number of time slots, respectively. The second term represents the query cost saved by the cache. $I_{bt}$ counts the intersection points reached before the user arrives, $I_{at}$ counts the intersection points at which the dummy and the user arrive at the same time, and $I_{ft}$ counts the intersection points reached after the user arrives. Because the $I_{bt}$ points benefit the cache scheme, neither the user nor the dummies need to issue a query when the user arrives at such a point. To lower the query cost, one should therefore increase $I_{bt}$.
Dummies should arrive at $PL_i$ early; that is, $PL_d^{t_1} = PL^{t_2}$ with $t_1 < t_2$, where $PL_d^{t_1}$ is the dummy’s location at time slot $t_1$ and $PL^{t_2}$ is the user’s location at time slot $t_2$. Figure 4.8 shows an example of the query cost: dummy X arrives at position (2,5) before the user, and there are two dummies and five time slots. Assuming that the size of each answer message (SZ) is 10, it can be verified that $QC_d = (2 + 1) \cdot 50 - (3 \cdot 1 + 0 + 1) \cdot 20 = 70$ in this example.
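The worked example can be reproduced with a few lines of code. The sketch below evaluates the query cost definition directly; the function and parameter names are hypothetical, and the message sizes are passed as explicit lists so that the two sums are visible.

```python
from typing import Sequence


def query_cost(n: int,
               sizes: Sequence[float],
               cached_sizes: Sequence[float],
               i_bt: int, i_ft: int, i_at: int) -> float:
    """(n + 1) * sum(SZ_i) - ((n + 1) * I_bt + I_ft + I_at) * sum(SZ_j)."""
    total = (n + 1) * sum(sizes)                        # cost without caching
    saved = ((n + 1) * i_bt + i_ft + i_at) * sum(cached_sizes)
    return total - saved


# Values from the example: two dummies, five time slots, message size 10,
# and a second sum equal to 20 (two messages of size 10).
qc = query_cost(n=2, sizes=[10] * 5, cached_sizes=[10] * 2,
                i_bt=1, i_ft=0, i_at=1)
print(qc)  # 70
```

Setting `i_bt=0` in the same call raises the cost to 130, which illustrates why intersections reached before the user arrives contribute the largest saving.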