So we must assign weights to the edges adjacent to the roof nodes.
Definition 4.2 (Roof node). The construction steps remain unchanged, except that the new edges receive a positive weight instead of weight zero. For each clique $C_i$ we add the edges $(v_{c_i}, u_j)$ for all $u_j \in C_i$. Each of these edges has weight $w_{max} + 1$, where $w_{max} = \max_{e \in E(G)} w(e)$. The set $R = \{v_{c_i} \mid C_i \in C\}$ contains all roof nodes of G.
In fact, smaller edge weights suffice. Given a clique $C_i$ and its corresponding roof node $v_{c_i}$, let $w_{c_i}$ be the maximum edge weight among the edges in $C_i$. Then it is enough for the edges adjacent to $v_{c_i}$ to have weight at least $\frac{1 + w_{c_i}}{2}$, since this already prevents the roof node from appearing as an intermediate node on any shortest path.
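The bound can be checked with a short derivation. The symbol $x$ for the weight of the edges at $v_{c_i}$ and the nodes $u, u'$ are introduced here only for illustration; the sketch assumes that a detour through the roof node enters and leaves via two distinct nodes of $C_i$.

```latex
% x denotes the weight of each edge incident to v_{c_i} (symbol assumed here).
% A path that uses v_{c_i} as an intermediate node enters and leaves it through
% two distinct nodes u, u' \in C_i, which are adjacent because C_i is a clique.
\[
  x \ge \frac{1 + w_{c_i}}{2}
  \;\Longrightarrow\;
  w(u, v_{c_i}) + w(v_{c_i}, u') = 2x \ge 1 + w_{c_i} > w_{c_i} \ge w(u, u').
\]
```

The detour through the roof node is therefore strictly longer than the direct clique edge $(u, u')$, so no shortest path needs it.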
Lemma 4.1. For every pair of cliques $C_i$ and $C_j$ with corresponding roof nodes $v_{c_i}$ and $v_{c_j}$, the clique distance is $cd(C_i, C_j) = d(v_{c_i}, v_{c_j}) - 2 \times (w_{max} + 1)$.
Proof. The roof nodes are constructed according to Definition 4.2 and inserted into G.
Once all the roof nodes have been created, the result is a new graph $G'$. Because negative edge weights are not allowed, the all-pairs shortest paths on $G'$ can be computed by any algorithm that solves the APSP problem.
Let the shortest path between $v_{c_i}$ and $v_{c_j}$ be $P(v_{c_i}, s, \ldots, t, v_{c_j})$, where the nodes s and t belong to $C_i$ and $C_j$, respectively. This must be the case because every roof node is connected only to the nodes of its corresponding clique and is isolated from the remaining nodes of the original graph G. So the edge $(v_{c_i}, s)$ must be taken, and the edge $(t, v_{c_j})$ is taken for the same reason.
We argue that the subpath $P(s, \ldots, t)$ is a shortest path between the cliques $C_i$ and $C_j$, because $P(v_{c_i}, s, \ldots, t, v_{c_j})$ is a shortest path between $v_{c_i}$ and $v_{c_j}$. If the subpath between s and t did not have minimal length, there would be nodes $s' \in C_i$ and $t' \in C_j$ with $s' \neq s$ or $t' \neq t$ such that $d(s', t') < d(s, t)$, and the path through $s'$ and $t'$ would be shorter than P.
This contradicts our assumption. Since the edges $(v_{c_i}, s)$ and $(t, v_{c_j})$ both have weight $w_{max} + 1$, we obtain $cd(C_i, C_j) = d(v_{c_i}, v_{c_j}) - 2 \times (w_{max} + 1)$.
After this transformation, we can compute the clique distances by computing the shortest paths among all of the roof nodes. Algorithm 4.1 illustrates the whole procedure.
Algorithm 4.1 Clique-Distances-Roof-Nodes
Clique-Distances-Roof-Nodes(G, C)
Input: the graph G and the list of given cliques C
Output: the matrix CD storing the clique distances
1 r ← |C|
...
9 do Compute the shortest path tree rooted at $v_{c_i}$
10 CD[i, j] ← $d(v_{c_i}, v_{c_j}) - 2 \times (w_{max} + 1)$ for every node $v_{c_j} \in R$
11 return CD
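The procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the thesis's own implementation: the graph is assumed to be an undirected adjacency dictionary with non-negative weights, Dijkstra's algorithm stands in for the unspecified shortest-path-tree step, and the names `build_graph`, `add_roof_nodes`, `dijkstra`, `clique_distances` and the roof labels `("roof", i)` are hypothetical.

```python
import heapq
import itertools
import math


def build_graph(edges):
    """Build an undirected adjacency dict from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def add_roof_nodes(graph, cliques):
    """Return (G', roof nodes, roof edge weight) in the spirit of Definition 4.2."""
    w_max = max((w for nbrs in graph.values() for w in nbrs.values()), default=0)
    roof_weight = w_max + 1
    augmented = {u: dict(nbrs) for u, nbrs in graph.items()}
    roofs = []
    for i, clique in enumerate(cliques):
        roof = ("roof", i)  # hypothetical label, assumed not to clash with node names
        augmented[roof] = {}
        for u in clique:
            augmented[roof][u] = roof_weight
            augmented.setdefault(u, {})[roof] = roof_weight
        roofs.append(roof)
    return augmented, roofs, roof_weight


def dijkstra(graph, source):
    """Single-source shortest distances for non-negative edge weights."""
    dist = {source: 0}
    counter = itertools.count()  # tie-breaker so nodes are never compared
    heap = [(0, next(counter), source)]
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, next(counter), v))
    return dist


def clique_distances(graph, cliques):
    """CD[i][j] = d(roof_i, roof_j) - 2 * (w_max + 1), following Lemma 4.1."""
    augmented, roofs, roof_weight = add_roof_nodes(graph, cliques)
    r = len(cliques)
    cd = [[0] * r for _ in range(r)]
    for i, roof in enumerate(roofs):
        dist = dijkstra(augmented, roof)
        for j, other in enumerate(roofs):
            if i != j:
                cd[i][j] = dist.get(other, math.inf) - 2 * roof_weight
    return cd


if __name__ == "__main__":
    # Two unit-weight triangles joined by the path c - x - d.
    g = build_graph([
        ("a", "b", 1), ("a", "c", 1), ("b", "c", 1),
        ("d", "e", 1), ("d", "f", 1), ("e", "f", 1),
        ("c", "x", 1), ("x", "d", 1),
    ])
    print(clique_distances(g, [["a", "b", "c"], ["d", "e", "f"]]))
    # prints [[0, 2], [2, 0]]
```

In the example, $w_{max} = 1$, so every roof edge has weight 2; the roof-to-roof distance is 6, and subtracting $2 \times (w_{max} + 1) = 4$ leaves the clique distance 2. The diagonal is set to zero directly rather than derived from the formula.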
Lemma 4.1 makes it possible to compute clique distances with any known algorithm for the APSP problem. Although the number of nodes in the graph increases after the roof nodes are inserted, the insertion expands the node set V(G) only slightly, because we require every clique to contain at least three nodes. Let n denote the number of nodes in the original graph G. We conclude that
$|R| = o(n)$, so the time complexity is still dominated by n and the overall time complexity remains unchanged.
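For reference, the size of the transformed graph can be written out explicitly. The sketch below assumes that m denotes the number of edges of G and that $G'$ denotes the graph obtained after inserting the roof nodes; one roof node is added per clique and one edge per clique member.

```latex
\[
  |V(G')| = n + r, \qquad
  |E(G')| = m + \sum_{i=1}^{r} |C_i|,
  \qquad \text{where } r = |R| = |C|.
\]
```

With $|R| = o(n)$, the node count is $n + o(n)$; the edge count grows by the total size of the given cliques.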
Note that the input graph is not restricted to unweighted graphs; Lemma 4.1 extends safely to weighted graphs, and the identical procedure can be used to construct the roof nodes. The design of the roof nodes also handles the case in which some of the given cliques in C overlap. By definition, the clique distance between overlapping cliques is zero.
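Both remarks can be illustrated on a small weighted instance with overlapping cliques. The sketch below is an assumption-laden example rather than the thesis's method: it uses the Floyd–Warshall algorithm as one possible APSP solver, numbers the original nodes 0..n-1, places roof node i at index n + i, and the function name `clique_distances_apsp` is hypothetical.

```python
import math


def clique_distances_apsp(n, edges, cliques):
    """Clique distances via Floyd-Warshall on the graph with roof nodes.

    n       -- number of original nodes, labelled 0..n-1
    edges   -- undirected (u, v, weight) triples with non-negative weights
    cliques -- list of node lists; roof node i gets index n + i (assumed layout)
    """
    r = len(cliques)
    w_max = max((w for _, _, w in edges), default=0)
    roof_weight = w_max + 1
    size = n + r

    # Distance matrix of the transformed graph G'.
    dist = [[math.inf] * size for _ in range(size)]
    for v in range(size):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for i, clique in enumerate(cliques):
        roof = n + i
        for u in clique:
            dist[roof][u] = dist[u][roof] = roof_weight

    # Standard Floyd-Warshall relaxation.
    for k in range(size):
        for a in range(size):
            for b in range(size):
                if dist[a][k] + dist[k][b] < dist[a][b]:
                    dist[a][b] = dist[a][k] + dist[k][b]

    # Lemma 4.1: subtract the two roof edges; the diagonal is set to zero.
    return [
        [0 if i == j else dist[n + i][n + j] - 2 * roof_weight for j in range(r)]
        for i in range(r)
    ]


if __name__ == "__main__":
    edges = [
        (0, 1, 2), (0, 2, 3), (1, 2, 1),   # clique A = {0, 1, 2}
        (2, 3, 4), (2, 4, 2), (3, 4, 5),   # clique B = {2, 3, 4}, shares node 2 with A
        (5, 6, 1), (5, 7, 1), (6, 7, 2),   # clique C = {5, 6, 7}
        (4, 5, 3),                         # bridge between B and C
    ]
    cliques = [[0, 1, 2], [2, 3, 4], [5, 6, 7]]
    print(clique_distances_apsp(8, edges, cliques))
    # prints [[0, 0, 5], [0, 0, 3], [5, 3, 0]]
```

Cliques A and B share node 2, so their roof nodes are joined by a path of exactly two roof edges and the computed distance is zero. The distance between A and C is 5, realized by the path 2, 4, 5 with weights 2 and 3.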
Chapter 5
Concluding Remarks
The clique diameter problem and the clique distance problem provide insight into the distances between every pair of communities. Even if we solve an instance of the clique distance problem, we cannot solve an instance of the APSP problem from the solution obtained.
Starting from a straightforward algorithm, we obtain an algorithm that runs in O(r(n + m)) time and approximates the clique distances within an additive error of one. If the number of cliques r is much smaller than the number of nodes n, our approximation algorithm runs faster than the straightforward algorithm.
We solve any instance of the clique distance problem by transforming it into an instance of the APSP problem. However, it is not possible to solve an APSP problem by solving a clique distance problem. After adding the roof nodes to the input graph, we can recover the clique distances from the shortest paths joining the corresponding roof nodes. Any improvement on the APSP problem immediately carries over to our problems.
We state future work below. As stated in Chapter 1, there are different models of community structure. The first possible direction is to devise algorithms for other models of community structure, for example the k-clique or k-club models. Since k-clique and k-club are more practical models than cliques, computation techniques based on them could capture community distances much more precisely.
The second direction of future work is to evaluate our algorithms on real-world datasets. Our algorithms are based on breadth-first search, which has a good external-memory implementation [24], so our methods could perform well in practice. In the future we may set up experiments to evaluate and improve our algorithms.
Bibliography
[1] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM J. Comput., 28(4):1167–1181, 1999.
[2] Richard D. Alba. A graph-theoretic definition of a sociometric clique. Journal of Mathematical Sociology, 3:3–113, 1973.
[3] Noga Alon, Zvi Galil, and Oded Margalit. On the exponent of the all pairs shortest path problem. J. Comput. Syst. Sci., 54(2):255–262, 1997.
[4] Timothy Chan. All-pairs shortest paths with real weights in $O(n^3/\log n)$ time. Algorithmica, 50:236–243, 2008.
[5] Timothy M. Chan. More algorithms for all-pairs shortest paths in weighted graphs. SIAM J. Comput., 39(5):2075–2089, 2010.
[6] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. J. Symb. Comput., 9(3):251–280, 1990.
[7] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, Cambridge, MA, 2nd edition, 2001.
[8] Pierluigi Crescenzi, Roberto Grossi, Claudio Imbrenda, Leonardo Lanzi, and Andrea Marino. Finding the diameter in real-world graphs - experimentally turning a lower bound into an upper bound. In Mark de Berg and Ulrich Meyer, editors, ESA (1), volume 6346 of Lecture Notes in Computer Science, pages 302–313. Springer, 2010.
[9] Edsger W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1:269–271, 1959.
[10] Dorit Dor, Shay Halperin, and Uri Zwick. All-pairs almost shortest paths. SIAM J. Comput., 29(5):1740–1759, 2000.
[11] P. Erdős and A. Rényi. On random graphs. I. Publ. Math. Debrecen, 6:290–297, 1959.
[12] Santo Fortunato. Community detection in graphs. Physics Reports, 486(3–5):75–174, 2010.
[13] Michael L. Fredman. New bounds on the complexity of the shortest path problem. SIAM J. Comput., 5(1):83–89, 1976.
[14] Zvi Galil and Oded Margalit. All pairs shortest distances for graphs with small integer length edges. Inf. Comput., 134(2):103–139, 1997.
[15] Zvi Galil and Oded Margalit. All pairs shortest paths for graphs with small integer length edges. J. Comput. Syst. Sci., 54(2):243–254, 1997.
[16] Yijie Han. Improved algorithm for all pairs shortest paths. Inf. Process. Lett., 91(5):245–250, 2004.
[17] Yijie Han. An $O(n^3 (\log\log n/\log n)^{5/4})$ time algorithm for all pairs shortest path. Algorithmica, 51(4):428–434, 2008.
[18] Donald B. Johnson. Efficient algorithms for shortest paths in sparse networks. J. ACM, 24:1–13, January 1977.
[19] D.R. Karger, D. Koller, and S.J. Phillips. Finding the hidden path: time bounds for all-pairs shortest paths. Foundations of Computer Science, Annual IEEE Symposium on, 0:560–568, 1991.
[20] L.R. Kerr. The effect of algebraic structure on the computational complexity. PhD thesis, Cornell University, Ithaca, N.Y., 1970.
[21] R. Luce. Connectivity and generalized cliques in sociometric group structure. Psychometrika, 15:169–190, 1950.
[22] Clémence Magnien, Matthieu Latapy, and Michel Habib. Fast computation of empirically tight bounds for the diameter of massive graphs. ACM Journal of Experimental Algorithmics, 13, 2008.
[23] Catherine C. McGeoch. All-pairs shortest paths and the essential subgraph. Algorithmica, 13(5):426–441, 1995.
[24] Kurt Mehlhorn and Ulrich Meyer. External-memory breadth-first search with sublinear I/O. In Rolf H. Möhring and Rajeev Raman, editors, ESA, volume 2461 of Lecture Notes in Computer Science, pages 723–735. Springer, 2002.
[25] Robert J. Mokken. Cliques, clubs and clans. Quality & Quantity, 13(2):161–173, April 1979.
[26] M. E. J. Newman. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences of the United States of America, 98(2):404–409, January 2001.
[27] Raimund Seidel. On the all-pairs-shortest-path problem. In STOC, pages 745–749. ACM, 1992.
[28] Avi Shoshan and Uri Zwick. All pairs shortest paths in undirected graphs with integer weights. In FOCS, pages 605–615, 1999.
[29] Philip M. Spira. A new algorithm for finding all shortest paths in a graph of positive arcs in average time $O(n^2 \log^2 n)$. SIAM J. Comput., 2(1):28–32, 1973.
[30] Tadao Takaoka. A new upper bound on the complexity of the all pairs shortest path problem. Information Processing Letters, 43(4):195–199, 1992.
[31] Tadao Takaoka. A faster algorithm for the all-pairs shortest path problem and its application. In Kyung-Yong Chwa and J. Ian Munro, editors, COCOON, volume 3106 of Lecture Notes in Computer Science, pages 278–289. Springer, 2004.
[32] Tadao Takaoka. An $O(n^3 \log\log n/\log n)$ time algorithm for the all-pairs shortest path problem. Information Processing Letters, 96(5):155–161, 2005.
[33] Gideon Yuval. An algorithm for finding all shortest paths using $n^{2.81}$ infinite-precision multiplications. Inf. Process. Lett., 4(6):155–156, 1976.
[34] Uri Zwick. Exact and approximate distances in graphs - a survey. In Friedhelm Meyer auf der Heide, editor, ESA, volume 2161 of Lecture Notes in Computer Science, pages 33–48. Springer, 2001.
[35] Uri Zwick. All pairs shortest paths using bridging sets and rectangular matrix multiplication. J. ACM, 49(3):289–317, 2002.
[36] Uri Zwick. A slightly improved sub-cubic algorithm for the all pairs shortest paths problem with real edge lengths. In Rudolf Fleischer and Gerhard Trippen, editors, Algorithms and Computation, volume 3341 of Lecture Notes in Computer Science, pages 841–843. Springer Berlin / Heidelberg, 2005.