Abstract
The problem of k-hop reachability between two vertices in a graph has received considerable attention in recent years. A substantial number of algorithms have been proposed with the goal of improving the searching efficiency of the k-hop reachability between two vertices in a graph. However, searching and traversing are challenging tasks, especially in large-scale graphs. Furthermore, the existing algorithms propounded by different scholars are not satisfactory in terms of feasibility and scalability when applied to different kinds of graphs. In this work, we propose a new algorithm, called Zip, in an attempt to efficiently determine the common contacts between any two random vertices in a large-scale graph. First, we describe a novel algorithm for constructing the graph index via binary searching which maintains the adjacent list of each vertex in order. Second, we present the ways to achieve a sequential k-hop contact set by using the loser tree, a merge sorting algorithm. Finally, we develop an efficient algorithm for querying common contacts and an optimized strategy for k-hop contact set serialization. Experimental results on synthetic and real datasets show that the proposed Zip algorithm outperforms existing state-of-the-art algorithms (e.g., breadth-first searching, GRAIL, the graph stratification algorithm).
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Cheng J, Shang Z, Cheng H et al. Efficient processing of k-hop reachability queries. The VLDB Journal, 2014, 23(2): 227–252.
Cheng J, Shang Z, Cheng H et al. K-Reach: Who is in your small world. Proceedings of the VLDB Endowment, 2012, 5(11): 1292–1303.
Yildirim H, Chaoji V, Zaki M J. GRAIL: Scalable reachability index for large graphs. Proceedings of the VLDB Endowment, 2010, 3(1/2): 276–284.
Yildirim H, Zaki M J. Graph indexing for reachability queries. In Proc. the 29th International Conference on Data Engineering Workshops (ICDEW), Mar. 2010, pp. 321–324.
Fu A W C, Wu H, Cheng J, Chu S, Wong R. IS-LABEL: An independent-set based labeling scheme for point-to-point distance querying on large graphs. Proceedings of the VLDB Endowment, 2013, 6(6): 457–468.
Akiba T, Iwata Y, Yoshida Y. Fast exact shortest-path distance queries on large networks by pruned landmark labeling. arXiv: 1304.4661, 2013. http://arxiv.org/-abs/1304.4661, May 2015.
Wei F. TEDI: Efficient shortest path query answering on graphs. ACM SIGMOD International Conference on Management of Data, Jun. 2010, pp. 99–110.
Hasanzadeh F, Naghibzadeh M. GRU: Efficient reachability answering for large graphs using united interval labeling. In Proc. the 3rd International Advance Computing Conference, Feb. 2013, pp. 1011–1017.
Jin R, Ruan N, You B, Wang H. Hub-accelerator: Fast and exact shortest path computation in large social networks. arXiv: 1305.0507, 2013. http://arxiv.org/abs/1305.0507, May 2015.
Beamer S, Asanovic K, Patterson D. Direction-optimizing breadth-first search. In Proc. International SC Conference for High Performance Computing, Networking, Storage and Analysis, Nov. 2012, Article No. 12.
Jagadish H V. A compression technique to materialize transitive closure. ACM Transactions on Database Systems, 1990, 15(4): 558–598.
Zhao X, Sala A, Zheng H et al. Efficient shortest paths on massive social graphs. In Proc. the 7th Int. Conf. Applications and Worksharing, Oct. 2011, pp. 77–86.
Chen Y, Chen Y. An efficient algorithm for answering graph reachability queries. In Proc. the 24th Int. Conf. Data Engineering, Apr. 2008, pp. 893–902.
Knuth D E. Optimum binary search trees. Acta Informatica, 1971, 1(1): 14–25.
Nowak R. Generalized binary search. In Proc. the 46th IEEE Annual Allerton Conference on Communication, Control, and Computing, Sept. 2008, pp. 568–574.
Caselles V, Meinhardt E, Monasse P. Constructing the tree of shapes of an image by fusion of the trees of connected components of upper and lower level sets. Positivity, 2008, 12(1): 55–73.
Jin R, Xiang Y, Ruan N et al. 3-HOP: A high-compression indexing scheme for reachability query. In Proc. SIGMOD Conference, June 29–July 2, 2009, pp. 813–836.
Jin R, Xiang Y, Ruan N et al. Efficiently answering reachability queries on very large directed graphs. In Proc. ACM SIGMOD, Jun. 2008, pp. 595–608.
Agrawal R, Borgida A, Jagadish H V. Efficient management of transitive relationships in large data and knowledge bases. In Proc. ACM Symp. Management of Data, May 31–June 2, 1989, pp. 253–262.
Cohen E, Halperin E, Kaplan H et al. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 2002, 32(5): 1338–1355.
Shang H, Kitsuregawa M. Efficient breadth-first search on large graphs with skewed degree distributions. In Proc. the 16th EDBT, Mar. 2013, pp. 311–322.
Xiao Y, Wu W, Pei J, Wang W, He Z. Efficiently indexing shortest paths by exploiting symmetry in graphs. In Proc. the 12th EDBT, Mar. 2009, pp. 493–504.
Potamias M, Bonchi F, Castillo C et al. Fast shortest path distance estimation in large networks. In Proc. the 18th ACM CIKM, Nov. 2009, pp. 867–876.
Author information
Authors and Affiliations
Corresponding author
Additional information
This work was supported by the Youth Teacher Startup Fund of South China Normal University of China under Grant No. 14KJ18 and the National High Technology Research and Development 863 Program of China under Grant No. 2013AA01A212.
Rights and permissions
About this article
Cite this article
Tang, H., Mu, S., Huang, J. et al. Zip: An Algorithm Based on Loser Tree for Common Contacts Searching in Large Graphs. J. Comput. Sci. Technol. 30, 799–809 (2015). https://doi.org/10.1007/s11390-015-1561-y
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-015-1561-y