
4.4 Index Construction and Maintenance

4.4.1 Node Indexing of AST Using SB

To effectively reduce redundant node processing in CSGQs, it is crucial to create SBs with the minimum number of nodes while ensuring solution optimality by considering each kind of node (i.e., pruned nodes, solution nodes, and internal nodes) in the AST. Here we address this issue by deriving a set of node selection rules for building SBs under various query parameters. We first focus on the acquaintance constraint $k$ in Rule 1, then turn to the social radius constraint $s$ in Rule 2, and finally combine both in Rule 3.

Rule 1: node indexing for different k

(1) Pruned nodes. We categorize pruned nodes into four types: IU-pruned nodes, EE-pruned nodes, acquaintance-pruned nodes, and distance-pruned nodes, which correspond to Eqs. (3.2), (3.3), (3.5), and (3.4) in Section 3.3.2, respectively. Given $s$ and $k$ of the first SGQ, we examine whether a pruned node is needed in the $(s,k')$-SB for processing a new SGQ with $k'$ as follows.

• IU-pruned nodes. No IU-pruned node appears in any $(s,k')$-SB with $k' \le k$, since such a $k'$ is at least as tight an acquaintance constraint as $k$. On the other hand, when $k' > k$, an IU-pruned node is not included in any $(s,k')$-SB if $k' < U(V_S)$, since insufficient social tightness within $V_S$ prevents this node from becoming a solution. Therefore, an IU-pruned node only appears in the $(s,k')$-SB where

$$k' \ge \max\{U(V_S),\; k + 1\}. \tag{4.1}$$

Example 4.4.1. Figure 4.2(b) presents an illustrative example with an IU-pruned node $P_1$ to identify the corresponding SBs. $P_1$ is generated in the first query with $(s, k) = (3, 3)$, and its $V_S$ and $V_A$ are $\{v_2, v_6, v_7, v_8, v_9, v_{11}\}$ and $\{v_1, v_4, v_5, v_{10}\}$, respectively. It is not necessary to calculate $U(V_S)$ here, since $U(V_S) = 4$ was derived when solving the first query. According to Eq. (4.1), when $s$ remains unchanged, $P_1$ only needs to appear in the $(3,k')$-SBs with $k' \ge 4$, which are (3,4)-SB, (3,5)-SB, and (3,6)-SB.

• EE-pruned nodes. As with the IU-pruned nodes, no EE-pruned node appears in any $(s,k')$-SB with $k' \le k$, for the same reason. Moreover, an EE-pruned node will be pruned again in any $(s,k')$-SB if $k' - k < p - |V_S| - A(V_S)$, since the social connectivity between $V_S$ and $V_A$ is still too small with respect to $k'$. Therefore, an EE-pruned node only appears in the $(s,k')$-SB where

$$k' \ge \max\{p - |V_S| - A(V_S) + k,\; k + 1\}.$$

• Acquaintance-pruned nodes. An acquaintance-pruned node is included in an $(s,k')$-SB only if $k' > k$ and

$$\sum_{v \in M_A} |V_A \cap N_v| \ge (p - |V_S|)(p - |V_S| - k' - 1)$$

(i.e., Eq. (3.5) does not hold, so acquaintance pruning is not triggered). That is, an acquaintance-pruned node only appears in the $(s,k')$-SB where

$$k' \ge \max\Big\{p - |V_S| - 1 - \frac{\sum_{v \in M_A} |V_A \cap N_v|}{p - |V_S|},\; k + 1\Big\}.$$

Note that the value of $\sum_{v \in M_A} |V_A \cap N_v|$ has already been derived in the first query and does not change when $k$ is replaced by $k'$. Exploiting these unchanged parts helps reduce computation when processing succeeding SGQs.

• Distance-pruned nodes. In contrast, distance-pruned nodes need to appear, and may be expanded, in the $(s,k')$-SB when $k' < k$: since $k'$ represents a tighter acquaintance constraint, the solutions that trimmed off the distance-pruned nodes may no longer be feasible. However, including every distance-pruned node in all $(s,k')$-SBs in this situation is not necessary. Instead, we employ the distance pruning strategy again to filter out the distance-pruned nodes that can never yield a better solution in each SB. Specifically, if the solutions generated in the previous queries are feasible under $k'$, the one with the smallest total social distance is kept in the $(s,k')$-SB, and this total social distance is then used to update $D$ in distance pruning for filtering. On the other hand, distance-pruned nodes are not included in $(s,k')$-SBs when $k' \ge k$, because the original solutions that trimmed off these nodes remain feasible and are still better.

In the above cases, we determined whether a pruned node appears in an $(s,k')$-SB according to its original pruning type. However, a node may also be trimmed off by a different pruning strategy under the new parameters. For example, when a distance-pruned node is included in an $(s,k')$-SB with $k' < k$, it may violate the new, tighter interior unfamiliarity condition and be trimmed off by IU pruning. To further reduce the number of nodes, we therefore examine each type of pruned node against the other three pruning strategies with the corresponding $s$ and $k'$. These examinations are almost the same as in Section 3.3.2, except that some parts of the inequalities have already been derived and can be reused directly.
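The $k'$ thresholds of Rule 1-(1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function and parameter names are hypothetical, and the quantities $U(V_S)$, $A(V_S)$, and the inner-degree sum are assumed to have been stored with the node when the first query was solved. Because $k'$ is an integer, the acquaintance bound is rounded up with integer arithmetic.

```python
def iu_min_k(u_vs: int, k: int) -> int:
    """Smallest k' at which an IU-pruned node enters an (s, k')-SB (Eq. (4.1))."""
    return max(u_vs, k + 1)


def ee_min_k(p: int, size_vs: int, a_vs: int, k: int) -> int:
    """Smallest k' at which an EE-pruned node enters an (s, k')-SB."""
    return max(p - size_vs - a_vs + k, k + 1)


def acq_min_k(p: int, size_vs: int, inner_degree_sum: int, k: int) -> int:
    """Smallest k' at which an acquaintance-pruned node enters an (s, k')-SB.

    inner_degree_sum is the stored sum of |V_A ∩ N_v| over v in M_A.
    """
    remaining = p - size_vs  # attendees still to be chosen; > 0 for a pruned node
    # ceil(remaining - 1 - sum / remaining) computed without floating point
    bound = remaining - 1 - inner_degree_sum // remaining
    return max(bound, k + 1)


# Example 4.4.1: U(V_S) = 4 and the first query used k = 3.
first_k = iu_min_k(u_vs=4, k=3)
# Assumption: k' ranges up to 6, as suggested by the SBs listed in the example.
print([(3, k_new) for k_new in range(first_k, 7)])
```

The values used for the EE and acquaintance cases in any further experiment would have to come from the stored node; none are given in the text, so only the IU case is instantiated here.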

(2) Solution nodes. It is desirable for the SBs to include solution nodes to facilitate early pruning in the new query. A solution node here can be any node with $|V_S| = p$, i.e., any feasible solution (not necessarily the optimal one). Specifically, any solution node can be selected in an $(s,k')$-SB if $k' \ge U(V_S)$, since the solution node still satisfies the acquaintance constraint $k'$. Nevertheless, if there is already another solution node in the $(s,k')$-SB, we only need to keep the one with the smaller total social distance to facilitate distance pruning afterwards.

(3) Internal nodes. To minimize the storage overhead, no internal node is included in $(s,k')$-SBs, since all feasible solutions expanded from an internal node either are the solution nodes in its sub-tree or can be expanded from the pruned nodes in its sub-tree.
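The solution-node selection of Rule 1-(2) amounts to keeping, per SB, the feasible solution with the smallest total social distance. The following sketch illustrates this under stated assumptions: the `SolutionNode` type and the `update_sb_solution` helper are hypothetical names, and $U(V_S)$ and the total social distance are assumed to be stored with each solution node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SolutionNode:
    members: frozenset      # V_S with |V_S| = p
    unfamiliarity: int      # U(V_S), derived when the node was created
    total_distance: float   # total social distance to the initiator


def update_sb_solution(
    current: Optional[SolutionNode], candidate: SolutionNode, k_new: int
) -> Optional[SolutionNode]:
    """Return the solution node to keep in the (s, k')-SB with k' = k_new."""
    if candidate.unfamiliarity > k_new:
        return current  # candidate violates the acquaintance constraint k'
    if current is None or candidate.total_distance < current.total_distance:
        return candidate  # the smaller distance gives a tighter bound D
    return current


a = SolutionNode(frozenset({"v1", "v2"}), unfamiliarity=1, total_distance=40.0)
b = SolutionNode(frozenset({"v1", "v3"}), unfamiliarity=2, total_distance=35.0)
kept = update_sb_solution(update_sb_solution(None, a, k_new=1), b, k_new=1)
```

In this made-up instance `b` is closer but has $U(V_S) = 2 > k' = 1$, so the SB keeps `a`.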

Rule 2: node indexing for different s

(1) Pruned nodes.

• IU-pruned nodes. No IU-pruned node needs to be included in any $(s',k)$-SB, since changing the social radius constraint does not increase the connectivity between the existing vertices in $V_S$. Thus, all the IU-pruned nodes are infeasible with any $s'$.

• EE-pruned nodes. In contrast to the IU-pruned nodes, some EE-pruned nodes may be successfully expanded to generate new sub-trees when $s' > s$, since new candidate attendees may appear in $V_A$. Therefore, if $s' > s$, it is necessary to derive the corresponding $V_A^{s'}$ (i.e., the candidate attendees within $s'$ hops from the initiator) for an EE-pruned node.[2] We then update the social distance according to different $s'$ to keep track of the status of the pruned node.[3] An $(s',k)$-SB includes an EE-pruned node only if its $V_A^{s'}$ is large enough that $A(V_S) \ge p - |V_S|$, implying that Eq. (3.3) does not hold and EE pruning is not triggered.

[2] The tightest social radius constraint that allows a vertex $v$ to be included as a candidate can be identified from the radius graph extraction procedure, and it is the smallest $i$ such that $d^i_{v,q} < \infty$.

Example 4.4.2. Figure 4.2(b) presents an illustrative example with an EE-pruned node $P_5$ to identify the corresponding SBs. $P_5$ is generated in the first query with $(s, k) = (3, 3)$, and its $V_S$ and $V_A$ are $\{v_3, v_6, v_7, v_8, v_9\}$ and $\{v_1, v_4, v_5, v_{10}, v_{11}\}$, respectively. According to Rule 2-(1), an EE-pruned node is only considered for the $(s',k)$-SBs with $s' > s$. Since $s_{max} = 4$, where $s_{max}$ is the largest possible $s'$, $P_5$ may only stay in the (4,3)-SB. The radius graph extraction procedure identifies the tightest social radius constraint under which each vertex becomes a candidate; here no new vertex joins at $s' = 4$, so $V_A^4 = V_A^3$. Since $V_A^{s'}$ is unchanged, $A(V_S)$ remains the same when $s' = 4$, which indicates that Eq. (3.3) still holds and trims off $P_5$ again. By excluding $P_5$ from the $(s',3)$-SBs, the node selection rules reduce the processing time of the succeeding queries.

Note that, although $V_S$ contains the same set of vertices under different $s'$, the social distances of the vertices in $V_S$ may change and affect the later distance pruning. Therefore, in addition to tracking each node's $V_A^{s'}$ for different $s'$, we update the corresponding $V_S$ for different $s'$, denoted as $V_S^{s'}$. Moreover, $V_S^{s'}$ or $V_A^{s'}$ for the same node under different $s'$ tend to share many common vertices. Therefore, to efficiently maintain $V_S^{s'}$ and $V_A^{s'}$ under different $s'$, we hierarchically save the differences among them. That is, we first save a base node for $V_S^{s'}$ or $V_A^{s'}$ with the smallest $s'$. When new candidates join, or when the social distance of any vertex becomes smaller for a larger $s'$, these new candidates or the differences in social distance are recorded in a delta node. With the base node and the delta nodes, we can dynamically generate the corresponding $V_S^{s'}$ and $V_A^{s'}$ of the specified $s'$ for further expansion. Example 4.4.3 illustrates how the base node and delta nodes work.

[3] The social distance of any vertex in $V_A^{s'}$ for different $s'$ can also be derived from the radius graph extraction procedure, since the social distance of a vertex $v$ for $s'$ is exactly $d^i_{v,q}$ with $i = s'$.

Table 4.2: (a) The social distances from different vertices to v7 under various s, and (b)

Example 4.4.3. This example employs a query with a smaller group size $p$ to illustrate the function of the base node and delta nodes. Assume that $v_7$ in Figure 4.2(a) issues a query with $(p, s, k) = (4, 1, 1)$. With the radius graph extraction, we obtain the social distance of each vertex under different $s'$. Part of the results are listed in Table 4.2(a); the social distance of a vertex may decrease as $s'$ increases. Moreover, a vertex is included as a candidate in $V_A^{s'}$ if its social distance becomes smaller than $\infty$ under the social radius constraint $s'$. Here we consider as an example an EE-pruned node $P_5$ with $V_S = \{v_6(20), v_7(0)\}$ and $V_A = \{v_4(27), v_9(13), v_{11}(23)\}$ generated in the query. (The number in parentheses next to $v_i$ is the social distance from $v_i$ to the initiator $v_7$.) A naive approach to handling various $s'$ is to generate a standalone copy of $P_5$ for each $s'$ from $s + 1 = 2$ to $s_{max} = 4$, i.e., $CP_5^2$, $CP_5^3$, and $CP_5^4$ in Table 4.2(b), where $s_{max}$ is the largest possible $s'$. However, there is a large overlap among $CP_5^2$, $CP_5^3$, and $CP_5^4$. Therefore, in the following, we show how to maintain these copies compactly using the base node and the delta nodes.

First, the base node $BP_5^{2-4}$ contains the $V_S^{s'}$ and $V_A^{s'}$ of $P_5$ with the smallest $s'$ (i.e., 2). Here the index $2-4$ means that this node is used when reconstructing the $V_S^{s'}$ and $V_A^{s'}$ with $s' = 2$, 3, or 4. When $s'$ increases to 3, it is necessary to record the newly joined vertex (i.e., $v_{12}$) and the difference of social distance (i.e., $-5$ for $v_5$) in the delta node $DP_5^{3-4}$. The unchanged vertices can be omitted to save space. Since there is no further change when $s' = 4$, no more delta nodes need to be generated.

When a new query comes in, we only need to take the base node and use the delta nodes to add new candidates or update the social distances as necessary. Thus, the $V_S^{s'}$ and $V_A^{s'}$ that fit the new social radius constraint can be dynamically generated when needed.

• Acquaintance-pruned nodes. Similar to the EE-pruned nodes, some acquaintance-pruned nodes may be expanded into new sub-trees when $s' > s$, since new candidate attendees may appear in $V_A$. Specifically, an $(s',k)$-SB includes an acquaintance-pruned node only if its $V_A^{s'}$ is large enough such that

$$\sum_{v \in M_A^{s'}} |V_A^{s'} \cap N_v| \ge (p - |V_S|)(p - |V_S| - k - 1),$$

where $M_A^{s'}$ is the set of $p - |V_S|$ vertices in $V_A^{s'}$ with the largest inner degrees. The above inequality indicates that Eq. (3.5) does not hold and the node is not pruned.

• Distance-pruned nodes. In contrast, most distance-pruned nodes need to be reconsidered when $s$ changes, except those with $V_S^{s'}$ violating the social radius constraint (i.e., $\max_{v \in V_S^{s'}} h_v > s'$, where $h_v$ is the number of hops from the initiator to a vertex $v$). The reason is that when $s' > s$, the newly included vertices in $V_A^{s'}$ may create shorter paths to the initiator. Alternatively, when $s' < s$, the total social distance of the solution used in the previous distance pruning may increase. Either way, the distance pruning condition may no longer hold, and its pruned nodes need to be included in the $(s',k)$-SB for further examination.

Here we also create the base node and the delta nodes for a distance-pruned node to compactly maintain its $V_S^{s'}$ and $V_A^{s'}$ for different $s'$, so that the social distance of any vertex can be updated. We then apply the distance pruning strategy again, so that the $(s',k)$-SB includes only the updated distance-pruned nodes that can generate a better solution.
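The base node and delta nodes used above for the pruned nodes can be sketched as a small Python structure. This is a minimal sketch under assumptions: the class names, field names, and the `reconstruct` helper are hypothetical, and the distances below are made-up illustrative values rather than entries of Table 4.2. One vertex-to-distance map stands for either $V_S^{s'}$ or $V_A^{s'}$.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BaseNode:
    s_from: int                  # smallest s' covered by this base node
    distances: Dict[str, int]    # vertex -> social distance at s' = s_from


@dataclass
class DeltaNode:
    s_from: int                                                      # delta applies for s' >= s_from
    added: Dict[str, int] = field(default_factory=dict)              # new candidate -> distance
    distance_change: Dict[str, int] = field(default_factory=dict)    # vertex -> signed change


def reconstruct(base: BaseNode, deltas: List[DeltaNode], s_new: int) -> Dict[str, int]:
    """Rebuild the vertex set (with social distances) for radius constraint s_new."""
    if s_new < base.s_from:
        raise ValueError("s_new is below the smallest radius covered by the base node")
    result = dict(base.distances)  # copy so the stored base node is never modified
    for delta in sorted(deltas, key=lambda d: d.s_from):
        if delta.s_from > s_new:
            break  # later deltas only apply to larger radii
        for vertex, change in delta.distance_change.items():
            result[vertex] += change
        result.update(delta.added)
    return result


# Illustrative values only (not taken from Table 4.2).
base = BaseNode(s_from=2, distances={"v4": 27, "v9": 13})
deltas = [DeltaNode(s_from=3, added={"v12": 31}, distance_change={"v4": -5})]
at_s2 = reconstruct(base, deltas, 2)
at_s4 = reconstruct(base, deltas, 4)
```

Because unchanged vertices are stored only once in the base node, each additional radius costs storage proportional to what actually changes at that radius.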

(2) Solution nodes. To facilitate early pruning for the succeeding queries and avoid missing the optimal solution, the $(s',k)$-SB includes the solution nodes that satisfy the social radius constraint $s'$. For each solution node, we update the social distance of every vertex in $V_S^{s'}$, so that the node is associated with the correct total social distance. For each $(s',k)$-SB, we only keep the solution node with the smallest total social distance to reduce the storage overhead.

(3) Internal nodes. In contrast to the $(s,k')$-SB, internal nodes in the AST play a more important role in the $(s',k)$-SB, because new candidate attendees may join when $s' > s$. Therefore, the internal nodes of the AST need to be cached for the $(s',k)$-SBs with $s' > s$, so that new candidates can be added to the existing internal nodes without generating them all over again. Similar to the pruned nodes, we maintain $V_S^{s'}$ and $V_A^{s'}$ of each internal node for different $s'$ using the base node and the delta nodes, so that we can dynamically generate the corresponding $V_S^{s'}$ and $V_A^{s'}$ of the specified $s'$ for further expansion.

Rule 3: node indexing for different s and k

Although the node indexes for different $k$ and different $s$ have been presented in Rule 1 and Rule 2, respectively, carefully combining the two rules when both $s$ and $k$ change can further reduce the number of nodes to include in SBs. Therefore, we explore the generalized case as follows.

(1) Pruned nodes.

• IU-pruned nodes. Rule 1-(1) indicates that an IU-pruned node is included in the $(s,k')$-SB if $k' \ge U(V_S)$.[4] Rule 2-(1) shows that, when $k$ is fixed, an IU-pruned node is not in any $(s',k)$-SB regardless of $s'$; however, when constructing $(s',k')$-SBs, $s'$ can still be used to reduce the number of IU-pruned nodes. That is, a pruned node is included in an $(s',k')$-SB only if all vertices in its $V_S$ are within $s'$ hops of the initiator, i.e., $\max_{v \in V_S} h_v \le s'$. Combining the inequalities, an IU-pruned node is included in an $(s',k')$-SB only if $k' \ge U(V_S)$ and $\max_{v \in V_S} h_v \le s'$.

Example 4.4.4. We revisit $P_1$ in Example 4.4.1, with $V_S = \{v_2, v_6, v_7, v_8, v_9, v_{11}\}$ and $V_A = \{v_1, v_4, v_5, v_{10}\}$. Since $U(V_S) = 4$ was already derived in the first query, $P_1$ will only appear in $(s',k')$-SBs with $k' \ge 4$. Moreover, because the vertices in $V_S$ are all within one hop of the initiator, $\max_{v \in V_S} h_v = 1$ holds. Therefore, the IU-pruned node $P_1$ will only appear in $(s',k')$-SBs with $s' \ge 1$ and $k' \ge 4$, such as (2,4)-SB and (3,4)-SB in Table 4.1.

[4] The original rule listed in Rule 1-(1) is $k' \ge \max\{U(V_S), k + 1\}$. However, it can be simplified by observing that $U(V_S)$ must exceed $k$; otherwise, the IU pruning would not have happened. Similarly, the rules of EE-pruned and acquaintance-pruned nodes used later are also simplified.

• EE-pruned nodes. According to Rule 1-(1), an EE-pruned node is included in the $(s,k')$-SB if

$$k' \ge p - |V_S| - A(V_S) + k. \tag{4.2}$$

We employ the social radius constraint to reduce the number of nodes included. Specifically, it is not necessary to keep the pruned nodes for the $(s',k')$-SBs with $s' < \max_{v \in V_S} h_v$, even if their $k'$ satisfies Eq. (4.2). Other $(s',k')$-SBs whose $k'$ does not satisfy Eq. (4.2) can include an EE-pruned node only if their $s'$ is large enough that its $A(V_S)$ with new candidates in $V_A^{s'}$ is no smaller than $p - |V_S|$, implying that this node will not be pruned by EE pruning again.

• Acquaintance-pruned nodes. According to Rule 1-(1), an acquaintance-pruned node is included in the $(s,k')$-SB if

$$k' \ge p - |V_S| - 1 - \frac{\sum_{v \in M_A} |V_A \cap N_v|}{p - |V_S|}. \tag{4.3}$$

We again use the social radius constraint to reduce the number of nodes included. The pruned nodes for the $(s',k')$-SBs with $s' < \max_{v \in V_S} h_v$ are not needed, even if their $k'$ satisfies Eq. (4.3). Other $(s',k')$-SBs whose $k'$ does not satisfy Eq. (4.3) can include an acquaintance-pruned node only if their $s'$ is large enough such that

$$\sum_{v \in M_A^{s'}} |V_A^{s'} \cap N_v| \ge (p - |V_S|)(p - |V_S| - k' - 1).$$

• Distance-pruned nodes. A distance-pruned node may be successfully expanded when $k' < k$ or when $s' \ne s$, according to Rule 1-(1) or Rule 2-(1), respectively. Similarly, we can reduce the number of included nodes by excluding the distance-pruned nodes with $\max_{v \in V_S} h_v > s'$, i.e., those whose $V_S$ violates the social radius constraint. We further reuse the inequality of the distance pruning strategy (i.e., Eq. (3.4)) by replacing $D$ with the total social distance of the current best solution in the $(s',k')$-SB. If the inequality holds, the distance-pruned node is not required in the SB, since it would be pruned again.
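The acquaintance re-check used above for a new pair $(s', k')$ can be sketched as follows. This is an illustrative sketch, not the thesis implementation: the function name and the adjacency-map representation are assumptions, and the inner degree of a candidate $v$ is taken to be $|V_A^{s'} \cap N_v|$ as in the inequality. Returning `False` when fewer than $p - |V_S|$ candidates exist is an added guard, since such a node cannot be completed into a group of size $p$.

```python
from typing import Dict, Iterable, Set


def passes_acquaintance_check(
    candidates: Iterable[str],
    neighbors: Dict[str, Set[str]],
    p: int,
    size_vs: int,
    k_new: int,
) -> bool:
    """True if the node is NOT acquaintance-pruned under k' = k_new.

    candidates is V_A^{s'}; neighbors maps each vertex v to N_v.
    """
    remaining = p - size_vs          # attendees still to be selected
    cand = set(candidates)
    if len(cand) < remaining:
        return False                 # guard (assumption): group cannot reach size p
    # Inner degree of v: number of its neighbors that are also candidates.
    inner = sorted((len(cand & neighbors.get(v, set())) for v in cand), reverse=True)
    top_sum = sum(inner[:remaining])  # sum over M_A^{s'}
    return top_sum >= remaining * (remaining - k_new - 1)


# Made-up graph: a - b - c is a path among the candidates.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
ok = passes_acquaintance_check(["a", "b", "c"], graph, p=4, size_vs=2, k_new=0)
```

In this made-up instance the two largest inner degrees are 2 and 1, so the left-hand side is 3 and the right-hand side is $2 \cdot (2 - 0 - 1) = 2$; the node would be kept.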

(2) Solution nodes. Combining Rule 1-(2) and Rule 2-(2), a solution node is included in the $(s',k')$-SBs with $k' \ge U(V_S)$ and $s' \ge \max_{v \in V_S} h_v$, i.e., those in which it satisfies the acquaintance and social radius constraints, respectively.

(3) Internal nodes. Rule 2-(3) indicates that the $(s',k')$-SBs with $s' > s$ should include internal nodes, since new candidates may join as $s$ increases. For changes in $k$, the internal nodes with $U(V_S) > k'$ violate the acquaintance constraint $k'$ and do not generate a solution. Therefore, the $(s',k')$-SB only includes an internal node if $k' \ge U(V_S)$ and $s' \ge \max_{v \in V_S} h_v$.
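Under Rule 3, IU-pruned, solution, and internal nodes share the same two-part test: $k' \ge U(V_S)$ and $s' \ge \max_{v \in V_S} h_v$. The sketch below enumerates the SBs that pass this shared test for one node. The function name is hypothetical, and the parameter ranges ($s'$ from 1 to 4, $k'$ from 1 to 6) are assumptions inferred from $s_{max} = 4$ and the SBs named in Example 4.4.1, since Table 4.1 is not reproduced here. Internal nodes are additionally needed only where $s' > s$ by Rule 2-(3), which the sketch does not encode.

```python
from typing import Iterable, List, Tuple


def sbs_for_node(
    u_vs: int, max_hop: int, s_values: Iterable[int], k_values: Iterable[int]
) -> List[Tuple[int, int]]:
    """List the (s', k') pairs whose SB passes the shared Rule 3 test for a node.

    u_vs is U(V_S); max_hop is the largest hop count h_v over v in V_S.
    """
    k_list = list(k_values)  # materialize so the inner loop can be repeated
    return [
        (s_new, k_new)
        for s_new in s_values
        for k_new in k_list
        if k_new >= u_vs and s_new >= max_hop
    ]


# Example 4.4.4: P_1 has U(V_S) = 4 and all of V_S within one hop.
pairs = sbs_for_node(u_vs=4, max_hop=1, s_values=range(1, 5), k_values=range(1, 7))
```

With these assumed ranges, the result contains (2,4) and (3,4), matching the SBs named in Example 4.4.4, and excludes every SB with $k' < 4$.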