
Social-Temporal Group Query

In the following, we first extend SGQ to STGQ by exploring the temporal dimension and formulate the problem in Section 3.4.1. STGQ is more complex than SGQ because there may exist numerous activity periods with different candidate groups. An intuitive approach is to first find the SGQ solution for each individual activity period and then select the one with the minimum total social distance. However, this approach is computationally expensive. To address this issue, in Section 3.4.2, we identify pivot time slots, the only time slots required to be explored in the temporal dimension, to facilitate efficient STGQ processing. Moreover, we propose the availability pruning strategy, which leverages the correlation in the available time slots among candidate attendees to avoid exploring unsuitable activity periods.

3.4.1 Problem Definition

STGQ generalizes SGQ by considering the available time of each candidate attendee via the availability constraint, which ensures that all selected attendees are available in a period of m time slots. We are given an activity initiator q and her social graph G = (V, E), where each vertex v is a candidate attendee and the distance on each edge $e_{u,v}$ connecting vertices u and v represents their social closeness. A social-temporal group query STGQ(p, s, k, m), where p is an activity size, s is a social radius constraint, k is an acquaintance constraint, and m is an activity length, finds a time slot t and a set F of p vertices from G to minimize the total social distance between q and every vertex in F, i.e., $\sum_{u \in F} d_{u,q}$, where $d_{u,q}$ is the length of the minimum-distance path between u and q with at most s edges, such that each vertex u in F is allowed to share no edge with at most k other vertices in F, and u is available from time slot t to t + m − 1.
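To make the three constraints concrete, the following is a minimal sketch of a checker that evaluates one candidate answer (F, t). It is not part of STGSelect; the function names, the dictionary-based graph encoding, and the toy data are assumptions made for illustration, and the sketch takes the group F to include the initiator q (as in Example 3.4.1) and leaves the check |F| = p to the caller.

```python
from math import inf

def bounded_distances(edges, q, s):
    """Shortest distances from q over paths with at most s edges.

    edges maps an undirected pair (u, v) to its social distance (hypothetical encoding).
    """
    dist = {q: 0.0}
    for _ in range(s):
        nxt = dict(dist)
        # Relax every edge once, reading only from the previous round,
        # so that after r rounds the paths use at most r edges.
        for (u, v), w in edges.items():
            for a, b in ((u, v), (v, u)):
                if a in dist and dist[a] + w < nxt.get(b, inf):
                    nxt[b] = dist[a] + w
        dist = nxt
    return dist

def evaluate(F, t, edges, avail, q, s, k, m):
    """Return the total social distance of (F, t), or None if (F, t) is infeasible."""
    dist = bounded_distances(edges, q, s)
    period = set(range(t, t + m))          # slots t .. t + m - 1
    for u in F:
        # Social radius and availability constraints.
        if u not in dist or not period <= avail[u]:
            return None
        # Acquaintance constraint: at most k members of F share no edge with u.
        strangers = sum(1 for v in F
                        if v != u and (u, v) not in edges and (v, u) not in edges)
        if strangers > k:
            return None
    return sum(dist[u] for u in F)

# Toy instance (made up, not taken from the figures of this chapter).
edges = {("q", "a"): 1.0, ("a", "b"): 2.0, ("q", "b"): 5.0}
avail = {"q": {1, 2, 3}, "a": {1, 2, 3}, "b": {2, 3}}
group = {"q", "a", "b"}
```

With s = 2 the path q, a, b (length 3) is preferred over the direct edge of length 5, which shows why the social radius constraint changes the objective value and not only feasibility.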

3.4.2 Algorithm Design

STGQ is also NP-hard because SGQ is the special case of STGQ in which every candidate attendee is available in all time slots. An intuitive approach to evaluating STGQ is to consider the social dimension and the temporal dimension separately, by sequentially exploring each time slot t and the candidate attendees who are available from t to t + m − 1 (i.e., m consecutive time slots). However, the running time grows significantly as the number of time slots increases. Therefore, we devise Algorithm STGSelect, which exploits the following features in the temporal dimension to reduce the search space and running time.

Pivot time slot. We consider only a limited number of slots, namely, the pivot time slots, to find the solution. STGSelect still returns optimal solutions even though only a subset of the slots is considered.

Access ordering. In addition to interior unfamiliarity and exterior expansibility discussed earlier, we further consider the solution quality and the feasibility based on the availability constraint. Algorithm STGSelect first constructs $V_S$ with vertices that have more available time slots in common in order to find an initial feasible solution, and then chooses the vertices in $V_A$ with smaller social distances to improve the solution.

Availability pruning. In addition to the distance and acquaintance pruning discussed in Section 3.3.2, we propose the availability pruning strategy to stop the algorithm when selecting any vertex from $V_A$ can never lead to a solution with m available time slots.

To find the optimal solution, STGSelect is expected to have exponential time complexity because STGQ is NP-hard. In the worst case, all candidate groups in all time slots may need to be considered. However, as shown in Section 3.5, the average running time of the proposed algorithm with the above strategies can be effectively reduced, especially for a large m. In the following, we describe the details of Algorithm STGSelect, paying special attention to the temporal dimension. Instead of considering the interval from t to t + m − 1 for each time slot t, our algorithm leverages the pivot time slots defined as follows to reduce running time.

Lemma 3.4.1. A time slot is a pivot time slot if the ID of the slot is im, where i is a positive integer. Any feasible solution to STGQ must include exactly one pivot time slot.

Proof. If a solution does not span a pivot time slot, it must have fewer than m slots because there are only m − 1 time slots between any two consecutive pivot time slots. If a solution contains more than one pivot time slot, it includes more than m slots. Neither case is feasible. Moreover, there must exist an integer i such that the optimal solution resides in the interval from slot (i − 1)m + 1 to slot (i + 1)m − 1, corresponding to pivot time slot im. If the optimal solution is not located in this interval, it must either include at least two pivot time slots, and thereby be infeasible, or reside in the corresponding interval for pivot time slot (i − 1)m or (i + 1)m. The lemma follows.
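The counting argument behind Lemma 3.4.1 can be checked mechanically. The sketch below (function names are illustrative assumptions, and slots are taken to be numbered from 1) enumerates the pivot slots and computes the unique pivot contained in an activity period that starts at slot t.

```python
def pivot_slots(num_slots, m):
    """All pivot time slots im (i = 1, 2, ...) among slots 1..num_slots."""
    return list(range(m, num_slots + 1, m))

def pivot_of_window(t, m):
    """The unique multiple of m inside the activity period [t, t + m - 1]."""
    return ((t + m - 1) // m) * m

def pivots_in_window(t, m):
    """Brute-force count of pivots in [t, t + m - 1], used only as a cross-check."""
    return [x for x in range(t, t + m) if x % m == 0]

# For m = 3 and a horizon of 7 slots, the pivots are slots 3 and 6.
pivots = pivot_slots(7, 3)
```

Because every period of m consecutive slots contains exactly one pivot, grouping the candidate periods by pivot partitions the temporal search space without losing any feasible period.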

Definition 3.4.1. Every vertex v in the feasible graph $G_F^{im} = (V_F^{im}, E_F^{im})$ for pivot time slot im has at least m consecutive available time slots in the interval from slot (i − 1)m + 1 to slot (i + 1)m − 1. Moreover, there exists a path from q to v with at most s edges.
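As an illustration of Definition 3.4.1, the vertex set of the feasible graph can be obtained by a hop-bounded breadth-first search followed by a scan of each schedule. This is only a sketch under assumed data structures (an adjacency dictionary and one set of available slots per vertex); the names and the toy data are hypothetical.

```python
from collections import deque

def within_hops(adj, q, s):
    """Vertices reachable from q by a path with at most s edges (BFS)."""
    hops = {q: 0}
    queue = deque([q])
    while queue:
        u = queue.popleft()
        if hops[u] == s:
            continue                      # do not expand beyond s edges
        for v in adj.get(u, ()):
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return set(hops)

def has_run(avail_v, pivot, m):
    """True if avail_v has m consecutive slots inside [pivot - m + 1, pivot + m - 1]."""
    run = 0
    for t in range(pivot - m + 1, pivot + m):
        run = run + 1 if t in avail_v else 0
        if run >= m:
            return True
    return False

def feasible_vertices(adj, avail, q, s, pivot, m):
    """Vertex set of the feasible graph for the given pivot slot."""
    return {v for v in within_hops(adj, q, s) if has_run(avail[v], pivot, m)}

# Toy instance (made up): d is three hops away, b is unavailable at the pivot.
adj = {"q": ["a", "b"], "a": ["q", "c"], "b": ["q"], "c": ["a", "d"], "d": ["c"]}
avail = {"q": {1, 2, 3, 4, 5}, "a": {2, 3, 4}, "b": {1, 2, 4, 5},
         "c": {3, 4, 5}, "d": {1, 2, 3, 4, 5}}
```

Since the window has only 2m − 1 slots, any run of m consecutive slots inside it necessarily covers the pivot, so a vertex that is unavailable at the pivot slot is always excluded.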

For each pivot time slot im, Algorithm STGSelect extends SGSelect by considering the temporal information when moving a vertex from $V_A$ to $V_S$. Specifically, let $T_S$ denote the set of consecutive time slots in which all vertices in $V_S$ are available, where $T_S$ must contain slot im. In other words, $T_S$ yields a feasible activity period for the STGQ when $V_S$ includes p vertices satisfying the acquaintance constraint and $|T_S| \ge m$. At each iteration, for each vertex in $V_A$, Algorithm STGSelect considers the social distance to q during the selection to reduce the objective value. However, we also consider the temporal availability of the vertex, to avoid choosing a vertex that adds little to the total social distance but leads to redundant examination of solutions that are eventually disqualified by the availability constraint. In other words, in addition to interior unfamiliarity and exterior expansibility as described in Section 3.3.2, we define the notion of temporal extensibility as follows.

Definition 3.4.2. The temporal extensibility of $V_S$ is

$$X(V_S) = |T_S| - m.$$

A larger temporal extensibility ensures that many vertices in $V_A$ with good quality in the temporal dimension can be selected by our algorithm afterward.
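The quantity in Definition 3.4.2 can be computed directly from the schedules. The sketch below assumes that each schedule is a set of slot IDs and that the common run is restricted to the window from slot (i − 1)m + 1 to slot (i + 1)m − 1 around the pivot, in line with Definition 3.4.1; the function names and the sample schedules are hypothetical.

```python
def common_run(avail_sets, pivot, m):
    """Consecutive slots around the pivot in which every vertex of V_S is available."""
    common = set.intersection(*avail_sets)
    if pivot not in common:
        return []                          # T_S must contain the pivot slot
    lo = hi = pivot
    while lo - 1 >= pivot - m + 1 and lo - 1 in common:
        lo -= 1
    while hi + 1 <= pivot + m - 1 and hi + 1 in common:
        hi += 1
    return list(range(lo, hi + 1))

def temporal_extensibility(avail_sets, pivot, m):
    """X(V_S) = |T_S| - m for the vertices whose schedules are given."""
    return len(common_run(avail_sets, pivot, m)) - m

# Hypothetical schedules: two vertices that share slots 1..5 around pivot 3.
wide = [{1, 2, 3, 4, 5}, {1, 2, 3, 4, 5, 6, 7}]
# Adding a vertex that is free only in slots 2 and 3 shrinks T_S below m = 3.
narrow = wide + [{2, 3}]
```

A negative value means that the common run is already shorter than the activity length, so no extension of the current vertex set can be feasible for this pivot.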

Temporal Extensibility Condition. To consider both the solution quality and feasibility in the temporal dimension, Algorithm STGSelect chooses the vertex u with the minimum social distance to q, and u must satisfy

$$X(V_S \cup \{u\}) \ge (m - 1)\left[\frac{p - |V_S \cup \{u\}|}{p}\right]^{\phi},$$

where $\phi \ge 1$ and $\frac{p - |V_S \cup \{u\}|}{p}$ is the proportion of attendees that have not been considered.

The RHS grows when ϕ decreases, and the above condition then requires that the result $V_S \cup \{u\}$ be more temporally extensible, i.e., more available time slots are shared by all vertices in the result, and hence more vertices in $V_A$ are eligible to be selected at later iterations. In the extreme case, if ϕ = 1, the above condition requires that the result contain almost 2m − 1 available time slots when $V_S = \{q\}$, because the RHS is close to m − 1. In contrast, as ϕ grows, our algorithm is able to choose a vertex with a smaller social distance because more vertices can satisfy the above condition. Please note that ϕ is increased by the algorithm if there exists no vertex in $V_A$ that can satisfy the above condition, and the RHS approaches 0 in this case. For the case that leads to $X(V_S \cup \{u\}) < 0$, we remove u from $V_A$ because adding u to $V_S$ results in solutions that are infeasible in the temporal dimension.
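The selection rule described above can be sketched as follows. The text does not specify how ϕ is increased, so doubling ϕ up to a cap is an assumption of this sketch, as are the function names and the tuple layout of the candidates; the limit case in which the RHS approaches 0 is handled by accepting any candidate with non-negative temporal extensibility.

```python
def rhs(m, p, size_after, phi):
    """Right-hand side (m - 1) * ((p - |V_S ∪ {u}|) / p) ** phi of the condition."""
    return (m - 1) * ((p - size_after) / p) ** phi

def pick(candidates, m, p, vs_size, phi=1.0, max_phi=64.0):
    """Choose the next vertex.

    candidates: list of (vertex, social_distance, X(V_S ∪ {vertex})).
    Returns (vertex or None, phi in effect when the choice was made).
    """
    # Candidates with X < 0 are removed: they can never satisfy the condition.
    viable = [c for c in candidates if c[2] >= 0]
    if not viable:
        return None, phi
    while phi <= max_phi:
        bound = rhs(m, p, vs_size + 1, phi)
        ok = [c for c in viable if c[2] >= bound]
        if ok:
            return min(ok, key=lambda c: c[1])[0], phi
        phi *= 2                           # hypothetical schedule for increasing phi
    # Limit case: the RHS tends to 0, so every viable candidate qualifies.
    return min(viable, key=lambda c: c[1])[0], phi

# Made-up candidates: "a" is socially closest but has no temporal slack,
# "b" is farther but leaves two spare slots, "c" is temporally infeasible.
candidates = [("a", 10, 0), ("b", 20, 2), ("c", 5, -1)]
```

With m = 3, p = 4, and ϕ = 2, a vertex set that grows to two members faces a bound of 2 × (2/4)² = 1/2, so a small ϕ favors temporal slack while a large ϕ lets social distance dominate the choice.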

In addition to distance pruning and acquaintance pruning, which consider the social dimension, we propose availability pruning in the temporal dimension. The strategy enables our algorithm to stop exploring $V_A$ if there exists no solution that can satisfy the availability constraint. The above temporal extensibility considers the available time slots for vertices in $V_S$. In contrast, availability pruning reduces the search space according to the available time slots of vertices in $V_A$. Specifically, for each pivot time slot im, let $t_A^+$ and $t_A^-$ denote the time slots closest to im in which no vertex in $V_A$ is available, where $t_A^+ > im$ and $t_A^- < im$. We are then able to stop considering $V_A$ when $t_A^+ - t_A^- \le m$. In this case, no feasible solution exists because the interval from $t_A^- + 1$ to $t_A^+ - 1$ contains fewer than m time slots. This strategy can be further improved by considering the number of vertices that are not available in each time slot, and the availability pruning strategy is formally specified as follows.

Lemma 3.4.2. The availability pruning strategy stops selecting a vertex from $V_A$ to $V_S$ if

$$t_A^+(|V_A| - p + |V_S| + 1) - t_A^-(|V_A| - p + |V_S| + 1) \le m,$$

where $t_A^+(n)$ and $t_A^-(n)$ denote the time slots closest to im in which at least n vertices in $V_A$ are not available, with $t_A^+(n) > im$ and $t_A^-(n) < im$. Moreover, the availability pruning strategy prunes only search space that contains no feasible solution.

Proof. If the above condition holds, at most $p - |V_S| - 1$ vertices of $V_A$ are available in each of the above two slots, and we can never find a feasible solution because Algorithm STGSelect is required to choose $p - |V_S|$ vertices from $V_A$ with a common available interval of at least m time slots. The lemma follows.
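The test in Lemma 3.4.2 can be sketched as a scan outward from the pivot slot. The sketch below treats slots outside the horizon 1..num_slots as unavailable to everyone and prunes immediately when fewer than p − |V_S| candidates remain; both conventions, the function names, and the sample schedules are assumptions made for illustration.

```python
def closest_blocked(avail, VA, n, pivot, step, num_slots):
    """Nearest slot in direction step where at least n vertices of VA are unavailable."""
    t = pivot + step
    while 1 <= t <= num_slots:
        if sum(1 for v in VA if t not in avail[v]) >= n:
            return t
        t += step
    return t   # 0 or num_slots + 1: nobody is available outside the horizon

def availability_prune(avail, VA, vs_size, p, pivot, m, num_slots):
    """True if no choice of p - |V_S| vertices from VA can share m slots around the pivot."""
    n = len(VA) - p + vs_size + 1
    if n <= 0:
        return True                        # fewer than p - |V_S| candidates remain
    t_plus = closest_blocked(avail, VA, n, pivot, +1, num_slots)
    t_minus = closest_blocked(avail, VA, n, pivot, -1, num_slots)
    return t_plus - t_minus <= m

# Hypothetical schedules: two of the three candidates are busy in slots 4 and 7.
avail = {"v3": {5, 6}, "v6": {4, 5, 6, 7}, "v8": {5, 6}}
VA = ["v3", "v6", "v8"]
free = {v: set(range(1, 8)) for v in VA}
```

For these schedules with pivot 6, m = 3, p = 4, and |V_S| = 2, the threshold is n = 2, the blocked slots are 4 and 7, and 7 − 4 = 3 ≤ m, so the branch is pruned; with the fully free schedules the scan reaches the horizon boundaries and no pruning occurs.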

Theorem 3.4.1. STGSelect obtains the optimal solution to STGQ.

Proof. Each pivot time slot is separated from its neighboring pivot time slot by m − 1 time slots. Therefore, Lemma 3.4.1 shows that any feasible solution must include exactly one pivot time slot. In addition, the proposed algorithm considers the interval with 2m − 1 slots for each pivot time slot, and we derive the best solution by extending Algorithm SGSelect with the temporal extensibility. Moreover, Lemma 3.4.2 shows that the availability pruning discards $V_A$ only when no feasible solution satisfying the availability constraint can be obtained by incorporating any vertex from $V_A$. The solution obtained by Algorithm STGSelect is optimal because the algorithm chooses the pivot time slot and the corresponding group with the smallest total social distance at the end of the algorithm. The theorem follows.

In the following, Example 3.4.1 illustrates Algorithm STGSelect, and the pseudocode of STGSelect can be found in Appendix B.

Example 3.4.1. In this illustrative example for STGSelect, we extend the SGQ in Example 3.3.1 by setting the length of the activity to 3 (i.e., m = 3). When processing an STGQ, the schedules of candidate attendees provided in Figure 3.2(c) must be considered as well. Since m = 3, $ts_3$ and $ts_6$ are selected as the pivot time slots. For the first pivot time slot $ts_3$, $V_S = \{v_7\}$ and $V_A = \{v_2, v_3, v_4, v_6, v_8\}$ in the beginning. As obtained in Example 3.3.1, both the exterior expansibility condition and the interior unfamiliarity condition hold when selecting $v_2$. Note that STGSelect also evaluates the temporal extensibility condition when selecting a vertex to ensure feasibility in the temporal dimension. Since

$$(m - 1)\left[\frac{p - |V_S \cup \{v_2\}|}{p}\right]^{\phi} = 2 \times \left(\frac{2}{4}\right)^{2} = \frac{1}{2}$$

(assuming ϕ = 2) and $X(V_S \cup \{v_2\}) = 2$ (see footnote 9), the temporal extensibility condition also holds, and hence we can move $v_2$ from $V_A$ to $V_S$. Now we have $V_S = \{v_2, v_7\}$ and $V_A = \{v_3, v_4, v_6, v_8\}$. The subsequent vertex selection order is identical to that in Example 3.3.1 since the temporal constraint is not violated, and we obtain the first feasible solution $\{v_2, v_4, v_6, v_7\}$ (total social distance = 64), available in the activity period $[ts_2, ts_4]$. Later, when we attempt to select $v_3$ in the state $V_S = \{v_2, v_4, v_7\}$ and $V_A = \{v_3\}$, we find that the temporal extensibility condition does not hold, and we increase ϕ since there is no other vertex in $V_A$ that we can choose. However, since $X(V_S \cup \{v_3\}) = 2 - 3 = -1$, the temporal extensibility condition does not hold even when the RHS of the inequality approaches 0. Therefore, we remove $v_3$ and backtrack to the state $V_S = \{v_7\}$ and $V_A = \{v_3, v_4, v_6, v_8\}$. As shown in Example 3.3.1, the later branches violate the social constraints and hence lead to no feasible group. Therefore, $\{v_2, v_4, v_6, v_7\}$ is the only feasible group available in activity periods extended from the pivot time slot $ts_3$.

Next, we process the second pivot time slot, $ts_6$. Unlike for $ts_3$, we have $V_S = \{v_7\}$ and $V_A = \{v_2, v_3, v_6, v_8\}$ in the beginning: since $v_4$ is not available in the pivot time slot, we can remove it directly without further consideration. We then obtain $V_S = \{v_2, v_7\}$ and $V_A = \{v_3, v_6, v_8\}$ because selecting $v_2$ violates no constraint. Note that the LHS of the availability pruning condition is $t_A^+(|V_A| - p + |V_S| + 1) - t_A^-(|V_A| - p + |V_S| + 1) = t_A^+(3 - 4 + 2 + 1) - t_A^-(3 - 4 + 2 + 1) = t_A^+(2) - t_A^-(2)$. Since 2 vertices in $V_A$, namely $v_3$ and $v_8$, are not available in $ts_4$, we have $t_A^-(2) = 4$. Likewise, since $v_3$ and $v_8$ are not available in $ts_7$, we have $t_A^+(2) = 7$. Therefore, the availability pruning condition holds since $t_A^+(2) - t_A^-(2) = 7 - 4 = 3 \le m$, and we can stop selecting vertices from $V_A$ to $V_S$. Then we backtrack one step to the state $V_S = \{v_7\}$ and $V_A = \{v_3, v_6, v_8\}$. We can skip this final branch since the acquaintance pruning condition holds. Therefore, there exists no feasible group available in activity periods extended from $ts_6$. Finally, we return the group $\{v_2, v_4, v_6, v_7\}$ and the time period $[ts_2, ts_4]$ as the optimal result.

9. $T_S = \{ts_1, ts_2, ts_3, ts_4, ts_5\}$ since $v_2$ and $v_7$ are both available in these slots. Hence $|T_S| = 5$ and $X(V_S \cup \{v_2\}) = 5 - 3 = 2$.