Input: the graph G and the given list of cliques C.
Output: CD, storing the approximate clique distances.
r ← |C|, n ← |V(G)|
Initialize CD as an r × r matrix with every element set to ∞
Initialize D as an n × n matrix with every element set to ∞
Initialize Π as an n × n matrix with every element set to NIL
for Ci ∈ C
The algorithm BFS-Revised maintains an additional variable BRANCH. Consider a BFS tree T computed by BFS-Revised. Every node except s has a parent on T. If BFS-Revised starts from a node s in the clique Cs, then all neighbors of s in Cs are at level one of T (the node s is at level zero). For every node w, the variable BRANCH[w] refers to the ancestor of w at level one if s reaches w via a neighbor that is also in Cs, or to s if s reaches w without passing through any neighbor belonging to Cs.
Let v = BRANCH[w]. If v ≠ s, we may have reached a clique from s along a shortest path that passes through v, one of the neighbors of s in the clique Cs. By Lemma 3.4, the distance D[v, w] is the clique distance. Otherwise we compare the distance D[s, w] with the currently known clique distance estimate.
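The bookkeeping described above can be sketched in Python. This is a minimal illustration, not the thesis's BFS-Revised: the function name `bfs_revised`, the adjacency-dict input, and the choice that a level-one node inside Cs is its own branch are all assumptions made for this example.

```python
from collections import deque

def bfs_revised(adj, s, clique_s):
    """BFS from s on an unweighted graph, also recording BRANCH.

    adj      : dict mapping each node to a list of its neighbors
    clique_s : set of nodes forming the clique that contains s
    Returns (dist, parent, branch).
    """
    dist = {s: 0}
    parent = {s: None}
    branch = {s: s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in dist:
                continue  # already discovered
            dist[w] = dist[u] + 1
            parent[w] = u
            if u == s:
                # Level-one node. Assumption: a neighbor of s inside the
                # clique starts its own branch; any other neighbor maps to s.
                branch[w] = w if w in clique_s else s
            else:
                # Deeper nodes inherit the branch of their parent.
                branch[w] = branch[u]
            queue.append(w)
    return dist, parent, branch

# Hypothetical graph: clique {a, b, c}, a path b - d - e, and a pendant f at a.
adj = {
    "a": ["b", "c", "f"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b", "e"],
    "e": ["d"],
    "f": ["a"],
}
dist, parent, branch = bfs_revised(adj, "a", {"a", "b", "c"})
```

Here node e is reached through b, a neighbor of a inside the clique, so `branch["e"]` is `"b"`, while f is reached without passing through the clique, so `branch["f"]` is `"a"`.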
Note that Algorithm 3.2 computes only approximate results. To see how it fails to compute exact results, consider the graphs shown in Figure 3.2 and Figure 3.3. The correct clique distance cd(C1, C3), shown in Figure 3.2, is d(b, j) = 4. If BFS-Revised starting from node a expands the children of e before expanding b, BFS-Revised(a) reaches C3 at node j without taking the correct clique shortest path. In this particular case Algorithm 3.2 returns an estimate with an additive error of one. Because Algorithm 3.2 expands nodes in arbitrary order, such failures cannot be prevented. We therefore conclude that Algorithm 3.2 computes approximate clique distances with an additive error of at most one.
Figure 3.3: The Failed BFS-Tree Rooted at a
Now we prove the correctness of our algorithms. Algorithm 3.3 is just a trivial loop. Algorithm 3.2 is a modified version of breadth-first search. It traverses every node and computes the shortest path from the starting node to every node in the graph G. If a newly found clique is reached via a neighbor in the same clique as the starting node, then by Lemma 3.3 and Lemma 3.4 the correct clique distance is reported.
Otherwise, we cannot filter out the error cases.
The procedure BFS-Revised runs in O(m + n) time. The time complexity of Algorithm 3.3 is O(r(m + n)) since it invokes BFS-Revised at most r times, where r is the number of cliques in C. Our algorithm performs better than the straightforward algorithm when r is much smaller than the number of nodes n.
Chapter 4
On Transformation to All-Pairs Shortest Paths Problem
In this chapter we report a technique that transforms any instance of the clique distances problem into an instance of the APSP problem. The transformation is beneficial because it lets us use the algorithms established for the APSP problem to solve the clique distances problem. The converse does not hold: we cannot solve the APSP problem by solving the clique distances problem, since doing so yields only a partial solution of the APSP problem.
4.1 A Failed Attempt
Before introducing the final construction, we demonstrate a failed attempt.
Definition 4.1 (Pitfall roof node). For each clique Ci = {u1, u2, u3, . . . , uk−1, uk} in C, a newly created roof node vci is inserted into V(G). A roof node vci is adjacent only to the nodes belonging to the corresponding clique Ci and is disconnected from the rest of the nodes (including all other roof nodes). We then add the edges (vci, u) for u ∈ Ci, each with weight zero.
However, this design has a pitfall. Assume vci is a roof node of clique Ci.
For any two nodes x, y ∈ Ci there is a path P(x, vci, y) of length zero, which is shorter than any edge belonging to Ci. The shortest path found by an APSP algorithm therefore includes P(x, vci, y). This shortest path passes through the roof node vci and is not a possible path in the original graph. Therefore, the transformation cannot fulfill our objectives.
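The pitfall can be reproduced on a tiny example. The sketch below is illustrative only: the triangle clique, its edge weight of 5, the node name `roof`, and the helper functions are assumptions made for this example, and Dijkstra's algorithm stands in for an arbitrary shortest-path routine.

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest path lengths on a weighted adjacency dict."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def add_edge(adj, u, v, w):
    """Insert an undirected edge of weight w."""
    adj.setdefault(u, {})[v] = w
    adj.setdefault(v, {})[u] = w

# Hypothetical clique Ci = {u1, u2, u3}; every clique edge has weight 5.
graph = {}
for u, v in [("u1", "u2"), ("u1", "u3"), ("u2", "u3")]:
    add_edge(graph, u, v, 5)
before = dijkstra(graph, "u1")["u2"]

# Definition 4.1: attach a roof node with zero-weight edges.
for u in ("u1", "u2", "u3"):
    add_edge(graph, "roof", u, 0)
after = dijkstra(graph, "u1")["u2"]

print(before, after)
```

Before the roof node is added, the distance from u1 to u2 is the weight of the direct edge. Afterwards the zero-weight detour through the roof node is shorter, so distances between original nodes are no longer preserved.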
4.2 Correction
We must therefore give weight to the edges adjacent to the roof nodes.
Definition 4.2 (Roof node). The construction steps remain unchanged, except that heavier edges are created instead of zero-weight edges. We add the edges (vci, uj) for uj ∈ Ci, each with weight wmax + 1, where wmax = max_{e ∈ E(G)} w(e). The set R = {vci | Ci ∈ C} contains all roof nodes on G.
In fact, smaller edge weights suffice. Given a clique Ci and the corresponding roof node vci, let wci be the maximum edge weight among the edges in Ci. Then it suffices for the edges adjacent to vci to have weight at least (1 + wci)/2, since this is enough to prevent the roof node from being included in any shortest path between nodes of G.
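The bound can be checked in one line. In the sketch below, ω denotes the weight given to every edge incident to the roof node of Ci (a symbol introduced here only for illustration), and x, y are any two nodes of Ci.

```latex
% x, y \in C_i; \omega is the weight of every edge (v_{c_i}, u) with u \in C_i
\omega \ge \frac{1 + w_{c_i}}{2}
\;\Longrightarrow\;
w(x, v_{c_i}) + w(v_{c_i}, y) = 2\omega \ge 1 + w_{c_i} > w_{c_i} \ge w(x, y)
```

Since Ci is a clique, the edge (x, y) always exists, so any detour through the roof node is strictly longer than the direct edge it would replace.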
Lemma 4.1. For every pair of cliques Ci and Cj with corresponding roof nodes vci and vcj, the desired clique distance is cd(Ci, Cj) = d(vci, vcj) − 2 × (wmax + 1).
Proof. The roof nodes are constructed and inserted into G according to Definition 4.2.
After all the roof nodes are created, the result is a new graph G′. Because negative edge weights are not allowed, the all-pairs shortest paths on G′ can be computed by any algorithm solving the APSP problem.
Assume the shortest path between vci and vcj is P(vci, s, . . . , t, vcj), where the nodes s and t belong to Ci and Cj respectively. This must be the case because every roof node is connected only to the nodes of its corresponding clique and is isolated from the rest of the nodes of the original graph G. So the edge (vci, s) must be taken. The edge (t, vcj) is taken for the same reason.
We argue that the path P(s, . . . , t) is the clique shortest path between Ci and Cj, because P(vci, s, . . . , t, vcj) is the shortest path between vci and vcj. If the path between s and t were not of minimal length, the shortest path algorithm would have found nodes s′ ∈ Ci and t′ ∈ Cj, with s′ ≠ s or t′ ≠ t, such that d(s′, t′) < d(s, t).
This contradicts our assumption. Since the edges (vci, s) and (t, vcj) both cost wmax + 1, clearly cd(Ci, Cj) = d(vci, vcj) − 2 × (wmax + 1).
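The identity in Lemma 4.1 can be checked on a small weighted graph. The sketch below is not the thesis's algorithm: it runs Dijkstra's algorithm from each roof node instead of a general APSP routine, and the function names, the roof node names `roof0`, `roof1`, ... (assumed not to clash with existing node names), and the example graph are all hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest path lengths on a weighted adjacency dict."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def add_roof_nodes(adj, cliques):
    """Build G' per Definition 4.2: one roof node per clique, weight wmax + 1."""
    w_max = max(w for nbrs in adj.values() for w in nbrs.values())
    roof_weight = w_max + 1
    new_adj = {u: dict(nbrs) for u, nbrs in adj.items()}  # copy of G
    roofs = []
    for i, clique in enumerate(cliques):
        roof = f"roof{i}"  # hypothetical name for the roof node of clique i
        new_adj[roof] = {}
        for u in clique:
            new_adj[roof][u] = roof_weight
            new_adj[u][roof] = roof_weight
        roofs.append(roof)
    return new_adj, roofs, roof_weight

def clique_distances(adj, cliques):
    """cd[i][j] = d(roof_i, roof_j) - 2 * (wmax + 1), as in Lemma 4.1."""
    new_adj, roofs, roof_weight = add_roof_nodes(adj, cliques)
    r = len(cliques)
    cd = [[0] * r for _ in range(r)]
    for i in range(r):
        dist = dijkstra(new_adj, roofs[i])
        for j in range(r):
            if i != j:
                cd[i][j] = dist.get(roofs[j], float("inf")) - 2 * roof_weight
    return cd

# Hypothetical graph: two triangles joined by the path b -2- x -3- d.
edges = [("a", "b", 1), ("a", "c", 1), ("b", "c", 1),
         ("d", "e", 1), ("d", "f", 1), ("e", "f", 1),
         ("b", "x", 2), ("x", "d", 3)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

cd = clique_distances(graph, [["a", "b", "c"], ["d", "e", "f"]])
print(cd)
```

In this example wmax = 3, so each roof edge weighs 4. The shortest path between the two roof nodes has length 4 + 2 + 3 + 4 = 13, and subtracting 2 × 4 leaves 5, which equals d(b, d), the shortest distance between the two cliques.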
After the transformation, we may compute the clique distances by computing the all-pairs shortest paths among all of the roof nodes. Algorithm 4.2 illustrates the whole algorithm.
Algorithm 4.1 Clique-Distances-Roof-Nodes