Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3178876.3186125acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article
Free access

Listing k-cliques in Sparse Real-World Graphs*

Published: 23 April 2018 Publication History

Abstract

Motivated by recent studies in the data mining community which require to efficiently list all k-cliques, we revisit the iconic algorithm of Chiba and Nishizeki and develop the most efficient parallel algorithm for such a problem. Our theoretical analysis provides the best asymptotic upper bound on the running time of our algorithm for the case when the input graph is sparse. Our experimental evaluation on large real-world graphs shows that our parallel algorithm is faster than state-of-the-art algorithms, while boasting an excellent degree of parallelism. In particular, we are able to list all k-cliques (for any k) in graphs containing up to tens of millions of edges as well as all $10$-cliques in graphs containing billions of edges, within a few minutes and a few hours respectively. Finally, we show how our algorithm can be employed as an effective subroutine for finding the k-clique core decomposition and an approximate k-clique densest subgraphs in very large real-world graphs.

References

[1]
{n. d.}. http://research.nii.ac.jp/~uno/codes.htm. ({n. d.}).
[2]
Nesreen K Ahmed, Jennifer Neville, Ryan A Rossi, and Nick Duffield. 2015. Efficient graphlet counting for large networks. In Data Mining (ICDM), 2015 IEEE International Conference on. IEEE, 1--10.
[3]
Noga Alon, Raphael Yuster, and Uri Zwick. 1995. Color-coding. J. ACM 42, 4 (July 1995), 844--856.
[4]
Noga Alon, Raphael Yuster, and Uri Zwick. 1997. Finding and counting given length cycles. Algorithmica 17, 3 (1997), 209--223.
[5]
Albert Angel, Nick Koudas, Nikos Sarkas, Divesh Srivastava, Michael Svendsen, and Srikanta Tirthapura. 2014. Dense subgraph maintenance under streaming edge weight updates for real-time story identification. VLDB J. 23, 2 (2014), 175--199.
[6]
Oana Denisa Balalau, Francesco Bonchi, T.-H. Hubert Chan, Francesco Gullo, and Mauro Sozio. 2015. Finding Subgraphs with Maximum Total Density and Limited Overlap. In WSDM. 379--388.
[7]
Luca Becchetti, Paolo Boldi, Carlos Castillo, and Aristides Gionis. 2008. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In SIGKDD. 16--24.
[8]
Austin R Benson, David F Gleich, and Jure Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016), 163--166.
[9]
Piotr Berman and Marek Karpinski. 1999. On some tighter inapproximability results. In International Colloquium on Automata, Languages, and Programming. Springer, 200--209.
[10]
Andreas Björklund, Rasmus Pagh, Virginia Vassilevska Williams, and Uri Zwick. 2014. Listing triangles. In Automata, Languages, and Programming. Springer, 223--234.
[11]
Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi. 2017. Counting Graphlets: Space vs Time. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM '17). ACM, New York, NY, USA, 557--566.
[12]
Coen Bron and Joep Kerbosch. 1973. Algorithm 457: finding all cliques of an undirected graph. Commun. ACM 16, 9 (1973), 575--577.
[13]
Gregory Buehrer and Kumar Chellapilla. 2008. A scalable pattern mining approach to web graph compression with communities. In WSDM. 95--106.
[14]
Moses Charikar. 2000. Greedy approximation algorithms for finding dense components in a graph. In Approximation Algorithms for Combinatorial Optimization. Springer, 84--95.
[15]
James Cheng, Linhong Zhu, Yiping Ke, and Shumo Chu. 2012. Fast algorithms for maximal clique enumeration with limited memory. In SIGKDD. ACM, 1240--1248.
[16]
Norishige Chiba and Takao Nishizeki. 1985. Arboricity and subgraph listing algorithms. SIAM J. Comput. 14, 1 (1985), 210--223.
[17]
Shumo Chu and James Cheng. 2011. Triangle listing in massive networks and its applications. In SIGKDD. 672--680.
[18]
Alessio Conte, Roberto De Virgilio, Antonio Maccioni, Maurizio Patrignani, and Riccardo Torlone. 2016. Finding All Maximal Cliques in Very Large Social Networks Categories. (2016).
[19]
Maximilien Danisch, T-H Hubert Chan, and Mauro Sozio. 2017. Large Scale Density-friendly Graph Decomposition via Convex Programming. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 233--242.
[20]
Yon Dourisboure, Filippo Geraci, and Marco Pellegrini. 2009. Extraction and classification of dense implicit communities in the Web graph. TWEB 3, 2 (2009).
[21]
Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, and John H. Thornton Jr. 2009. Migration motif: a spatial - temporal pattern mining approach for financial markets. In SIGKDD. 1135--1144.
[22]
Alessandro Epasto, Silvio Lattanzi, and Mauro Sozio. 2015. Efficient Densest Subgraph Computation in Evolving Graphs. In Proceedings of the 24th International Conference on World Wide Web, WWW 2015, Florence, Italy, May 18--22, 2015. 300--310.
[23]
David Eppstein, Maarten Löffler, and Darren Strash. 2010. Listing all maximal cliques in sparse graphs in near-optimal time. Springer.
[24]
David Eppstein, Maarten Löffler, and Darren Strash. 2013. Listing all maximal cliques in large sparse real-world graphs. Journal of Experimental Algorithmics (JEA) 18 (2013), 3--1.
[25]
Irene Finocchi, Marco Finocchi, and Emanuele G Fusco. 2015. Clique counting in mapreduce: Algorithms and experiments. Journal of Experimental Algorithmics (JEA) 20 (2015), 1--7.
[26]
Eugene Fratkin, Brian T. Naughton, Douglas L. Brutlag, and Serafim Batzoglou. 2006. MotifCut: regulatory motifs finding with maximum density subgraphs. In Proceedings 14th International Conference on Intelligent Systems for Molecular Biology 2006, Fortaleza, Brazil, August 6--10, 2006. 156--157.
[27]
David Gibson, Ravi Kumar, and Andrew Tomkins. 2005. Discovering Large Dense Subgraphs in Massive Graphs. In PVLDB. 721--732.
[28]
Enrico Gregori, Luciano Lenzini, and Simone Mainardi. 2013. Parallel k-clique community detection on large-scale networks. IEEE Transactions on Parallel and Distributed Systems 24, 8 (2013), 1651--1660.
[29]
Mikhail Rudoy (https://cstheory.stackexchange.com/users/34401/mikhail rudoy). {n. d.}. Is this vertex ordering optimization NP-Hard? Theoretical Computer Science Stack Exchange. ({n. d.}).
[30]
Leonidas D. Iasemidis, Deng-Shan Shiau, Wanpracha Art Chaovalitwongse, J. Chris Sackellares, Panos M. Pardalos, Jose C. Principe, Paul R. Carney, Awadhesh Prasad, Balaji Veeramani, and Kostas Tsakalis. 2003. Adaptive epileptic seizure prediction system. IEEE Trans. Biomed. Engineering 50, 5 (2003), 616--627.
[31]
Shweta Jain and C Seshadhri. 2017. A Fast and Provable Method for Estimating Clique Counts Using Turán's Theorem. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 441--449.
[32]
Ruoming Jin, Yang Xiang, Ning Ruan, and David Fuhry. 2009. 3-HOP: a highcompression indexing scheme for reachability query. In SIGMOD. 813--826.
[33]
Maksim Kitsak, Lazaros K Gallos, Shlomo Havlin, Fredrik Liljeros, Lev Muchnik, H Eugene Stanley, and Hernán A Makse. 2010. Identification of influential spreaders in complex networks. Nature physics 6, 11 (2010), 888--893.
[34]
Tamara G Kolda, Ali Pinar, Todd Plantenga, C Seshadhri, and Christine Task. 2014. Counting triangles in massive graphs with MapReduce. SIAM Journal on Scientific Computing 36, 5 (2014), S48--S77.
[35]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a social network or a news media?. In Proceedings of the 19th international conference on World wide web. ACM, 591--600.
[36]
Matthieu Latapy. 2008. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theoretical Computer Science 407, 1 (2008), 458--473.
[37]
Victor E. Lee, Ning Ruan, Ruoming Jin, and Charu C. Aggarwal. 2010. A Survey of Algorithms for Dense Subgraph Discovery. In Managing and Mining Graph Data. 303--336.
[38]
Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. http://snap.stanford.edu/data. (June 2014).
[39]
Matthaios Letsios, Oana Denisa Balalau, Maximilien Danisch, Emmanuel Orsini, and Mauro Sozio. 2016. Finding heaviest k-subgraphs and events in social media. In Data Mining Workshops (ICDMW), 2016 IEEE 16th International Conference on. IEEE, 113--120.
[40]
Kazuhisa Makino and Takeaki Uno. 2004. New algorithms for enumerating all maximal cliques. In Algorithm Theory-SWAT 2004. Springer, 260--272.
[41]
David W. Matula and Leland L. Beck. 1983. Smallest-last Ordering and Clustering and Graph Coloring Algorithms. J. ACM 30, 3 (July 1983), 417--427.
[42]
Michael Mitzenmacher, Jakub Pachocki, Richard Peng, Charalampos Tsourakakis, and Shen Chen Xu. 2015. Scalable large near-clique detection in large-scale networks via sampling. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 815--824.
[43]
Gergely Palla, Imre Derényi, Illés Farkas, and Tamás Vicsek. 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 7043 (2005), 814--818.
[44]
Ali Pinar, C. Seshadhri, and Vaidyanathan Vishal. 2017. ESCAPE: Efficiently Counting All 5-Vertex Subgraphs. In Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3--7, 2017. 1431--1440.
[45]
Angela P. Presson, Eric M. Sobel, Jeanette C. Papp, Charlyn J. Suarez, Toni Whistler, Mangalathu S. Rajeevan, Suzanne D. Vernon, and Steve Horvath. 2008. Integrated Weighted Gene Co-expression Network Analysis with an Application to Chronic Fatigue Syndrome. BMC Systems Biology 2 (2008), 95.
[46]
Ahmet Erdem Sariyüce, C. Seshadhri, Ali Pinar, and Ümit V. Çatalyürek. 2015. Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions. In WWW. 927--937.
[47]
Matthew C Schmidt, Nagiza F Samatova, Kevin Thomas, and Byung-Hoon Park. 2009. A scalable, parallel algorithm for maximal clique enumeration. J. Parallel and Distrib. Comput. 69, 4 (2009), 417--428.
[48]
Kijung Shin, Tina Eliassi-Rad, and Christos Faloutsos. 2016. CoreScope: Graph Mining Using k-Core Analysis Patterns, Anomalies and Algorithms. In Data Mining (ICDM), 2016 IEEE 16th International Conference on. IEEE, 469--478.
[49]
UNO Takeaki. 2012. Implementation issues of clique enumeration algorithm. Special issue: Theoretical computer science and discrete mathematics, Progress in Informatics 9 (2012), 25--30.
[50]
Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theoretical Computer Science 363, 1 (2006), 28--42.
[51]
Charalampos Tsourakakis. 2015. The k-clique densest subgraph problem. In Proceedings of the 24th international conference on world wide web. ACM, 1122-- 1132.
[52]
Stanley Wasserman and Katherine Faust. 1994. Social network analysis: Methods and applications. Vol. 8. Cambridge university press.
[53]
Xiao Zhou and Takao Nishizeki. 1999. Edge-Coloring and f-Coloring for Various Classes of Graphs. MATCH Commun. Math. Comput. Chem 51 (1999), 111--118.

Cited By

View all
  • (2024)k-Clique counting on large scale-graphs: a surveyPeerJ Computer Science10.7717/peerj-cs.250110(e2501)Online publication date: 18-Nov-2024
  • (2024)Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368197517:11(2946-2959)Online publication date: 30-Aug-2024
  • (2024)Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic GraphsProceedings of the VLDB Endowment10.14778/3681954.368195817:11(2722-2734)Online publication date: 30-Aug-2024
  • Show More Cited By

Index Terms

  1. Listing k-cliques in Sparse Real-World Graphs*

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    WWW '18: Proceedings of the 2018 World Wide Web Conference
    April 2018
    2000 pages
    ISBN:9781450356398
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    In-Cooperation

    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland

    Publication History

    Published: 23 April 2018

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. k-clique listing and counting
    2. real-world graph algorithms

    Qualifiers

    • Research-article

    Funding Sources

    • Ile-de-France Region
    • European Commission
    • French National Agency of Research
    • Google

    Conference

    WWW '18
    Sponsor:
    • IW3C2
    WWW '18: The Web Conference 2018
    April 23 - 27, 2018
    Lyon, France

    Acceptance Rates

    WWW '18 Paper Acceptance Rate 170 of 1,155 submissions, 15%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)493
    • Downloads (Last 6 weeks)81
    Reflects downloads up to 30 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)k-Clique counting on large scale-graphs: a surveyPeerJ Computer Science10.7717/peerj-cs.250110(e2501)Online publication date: 18-Nov-2024
    • (2024)Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368197517:11(2946-2959)Online publication date: 30-Aug-2024
    • (2024)Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic GraphsProceedings of the VLDB Endowment10.14778/3681954.368195817:11(2722-2734)Online publication date: 30-Aug-2024
    • (2024)Efficient Parallel D-Core Decomposition at ScaleProceedings of the VLDB Endowment10.14778/3675034.367505417:10(2654-2667)Online publication date: 1-Jun-2024
    • (2024)Aligning Human and Computational Coherence EvaluationsComputational Linguistics10.1162/coli_a_0051850:3(893-952)Online publication date: 1-Sep-2024
    • (2024)A Counting-based Approach for Efficient k-Clique Densest Subgraph DiscoveryProceedings of the ACM on Management of Data10.1145/36549222:3(1-27)Online publication date: 30-May-2024
    • (2024)Efficient k-Clique Listing: An Edge-Oriented Branching StrategyProceedings of the ACM on Management of Data10.1145/36392622:1(1-26)Online publication date: 26-Mar-2024
    • (2024)Exploiting Fine-Grained Redundancy in Set-Centric Graph Pattern MiningProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638507(175-187)Online publication date: 2-Mar-2024
    • (2024)Towards Optimal Output-Sensitive Clique Listing or: Listing Cliques from Smaller CliquesProceedings of the 56th Annual ACM Symposium on Theory of Computing10.1145/3618260.3649663(923-934)Online publication date: 10-Jun-2024
    • (2024)Decentralized Privacy Preservation for Critical Connections in GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.340664136:11(5911-5925)Online publication date: Nov-2024
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media