Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Shared-memory Parallel Maximal Clique Enumeration from Static and Dynamic Graphs

Published: 16 March 2020 Publication History

Abstract

Maximal Clique Enumeration (MCE) is a fundamental graph mining problem and is useful as a primitive in identifying dense structures in a graph. Due to the high computational cost of MCE, parallel methods are imperative for dealing with large graphs. We present shared-memory parallel algorithms for MCE, with the following properties: (1) the parallel algorithms are provably work-efficient relative to a state-of-the-art sequential algorithm, (2) the algorithms have a provably small parallel depth, showing they can scale to a large number of processors, and (3) our implementations on a multicore machine show good speedup and scaling behavior with increasing number of cores and are substantially faster than prior shared-memory parallel algorithms for MCE; for instance, on certain input graphs, while prior works either ran out of memory or did not complete in five hours, our implementation finished within a minute using 32 cores. We also present work-efficient parallel algorithms for maintaining the set of all maximal cliques in a dynamic graph that is changing through the addition of edges.

References

[1]
David Avis and Komei Fukuda. 1993. Reverse search for enumeration. Disc. Appl. Math. 65 (1993), 21--46.
[2]
Guy E. Blelloch and Bruce M. Maggs. 2010. Parallel algorithms. In Algorithms and Theory of Computation Handbook. Chapman and Hall\CRC, 25--25.
[3]
Robert David Blumofe. 1995. Executing Multithreaded Programs Efficiently. Ph.D. Dissertation. Massachusetts Institute of Technology.
[4]
Robert D. Blumofe and Charles E. Leiserson. 1999. Scheduling multithreaded computations by work stealing. J. ACM 46, 5 (1999), 720--748.
[5]
Coen Bron and Joep Kerbosch. 1973. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM 16, 9 (1973), 575--577.
[6]
Frédéric Cazals and Chinmay Karande. 2008. A note on the problem of reporting maximal cliques. Theor. Comput. Sci. 407, 1 (2008), 564--568.
[7]
Annie Chateau, Pierre Riou, and Eric Rivals. 2011. Approximate common intervals in multiple genome comparison. In Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM’11). IEEE, 131--134.
[8]
Yu Chen and Gordon M. Crippen. 2005. A novel approach to structural alignment using realistic structural and environmental information. Protein Sci. 14, 12 (2005), 2935--2946.
[9]
Norishige Chiba and Takao Nishizeki. 1985. Arboricity and subgraph listing algorithms. SIAM J. Comput. 14 (1985), 210--223. Issue 1.
[10]
Alessio Conte, Tiziano De Matteis, Daniele De Sensi, Roberto Grossi, Andrea Marino, and Luca Versari. 2018. D2K: Scalable community detection in massive networks via small-diameter k-plexes. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery 8 Data Mining. ACM, 1272--1281.
[11]
Alessio Conte, Roberto Grossi, Andrea Marino, and Luca Versari. 2016. Sublinear-space bounded-delay enumeration for massive network analytics: Maximal cliques. In Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming (ICALP’16), Vol. 148. 1--148.
[12]
Apurba Das, Seyed-Vahid Sanei-Mehri, and Srikanta Tirthapura. 2018. Shared-memory parallel maximal clique enumeration. In Proceedings of the IEEE 25th International Conference on High Performance Computing (HiPC’18). IEEE, 62--71.
[13]
Apurba Das, Michael Svendsen, and Srikanta Tirthapura. 2019. Incremental maintenance of maximal cliques in a dynamic graph. VLDB J. (Apr. 2019).
[14]
Naga Shailaja Dasari, Ranjan Desh, and Mohammad Zubair. 2014. ParK: An efficient algorithm for k-core decomposition on multicore processors. In Proceedings of the IEEE International Conference on Big Data. IEEE, 9--16.
[15]
Nan Du, Bin Wu, Liutong Xu, Bai Wang, and Xin Pei. 2006. A parallel algorithm for enumerating all maximal cliques in complex network. In Proceedings of the IEEE International Conference on Data Mining Workshop. IEEE, 320--324.
[16]
Nan Du, Bin Wu, Liutong Xu, Bai Wang, and Pei Xin. 2009. Parallel algorithm for enumerating maximal cliques in complex network. In Mining Complex Data. Springer, 207--221.
[17]
Dongsheng Duan, Yuhua Li, Ruixuan Li, and Zhengding Lu. 2012. Incremental K-clique clustering in dynamic social networks. Artific. Intell. Rev. 38 (2012), 129--147.
[18]
David Eppstein, Maarten Löffler, and Darren Strash. 2010. Listing all maximal cliques in sparse graphs in near-optimal time. In Proceedings of the International Symposium on Algorithms and Computation. 403--414.
[19]
David Eppstein and Darren Strash. 2011. Listing all maximal cliques in large sparse real-world graphs. In Exper. Algor., Vol. 6630. 364--375.
[20]
James Fox, Tim Roughgarden, C. Seshadhri, and Fan Wei. 2018. Finding cliques in social networks: A new distribution-free model. In Proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP'18), Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella (Eds.), Vol. 107. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 55:1--55:15.
[21]
Helen M. Grindley, Peter J. Artymiuk, David W. Rice, and Peter Willett. 1993. Identification of tertiary structure resemblance in proteins using a maximal common subgraph isomorphism algorithm. J. Molec. Biol. 229, 3 (1993), 707--721.
[22]
Masahiro Hattori, Yasushi Okuno, Susumu Goto, and Minoru Kanehisa. 2003. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J. Amer. Chem. Soc. 125, 39 (2003), 11853--11865.
[23]
David S. Johnson, Mihalis Yannakakis, and Christos H. Papadimitriou. 1988. On generating all maximal independent sets. Inf. Proc. Lett. 27, 3 (1988), 119--123.
[24]
Pall F. Jonsson and Paul A. Bates. 2006. Global topological features of cancer proteins in the human interactome. Bioinformatics 22, 18 (2006), 2291--2297.
[25]
Humayun Kabir and Kamesh Madduri. 2017. Parallel k-core decomposition on multicore platforms. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshop. IEEE, 1482--1491.
[26]
Humayun Kabir and Kamesh Madduri. 2017. Parallel k-truss decomposition on multicore systems. In Proceedings of the IEEE Conference on High Performance Extreme Computing. 1--7.
[27]
Humayun Kabir and Kamesh Madduri. 2017. Shared-memory graph truss decomposition. In Proceedings of the IEEE 24th International Conference on High Performance Computing (HiPC’17). IEEE, 13--22.
[28]
Ina Koch. 2001. Enumerating all connected maximal common subgraphs in two graphs. Theor. Comput. Sci. 250, 1 (2001), 1--30.
[29]
Shungo Koichi, Masaki Arisaka, Hiroyuki Koshino, Atsushi Aoki, Satoru Iwata, Takeaki Uno, and Hiroko Satoh. 2014. Chemical structure elucidation from 13C NMR chemical shifts: Efficient data processing using bipartite matching and maximal clique algorithms. J. Chem. Inf. Model. 54, 4 (2014), 1027--1035.
[30]
D. Koller and N. Friedman. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press.
[31]
Frank Kose, Wolfram Weckwerth, Thomas Linke, and Oliver Fiehn. 2001. Visualizing plant metabolomic correlation networks using clique-metabolite matrices. Bioinformatics 17, 12 (2001), 1198--1208.
[32]
Jérôme Kunegis. 2013. Konect: The Koblenz network collection. In Proceedings of the International World Wide Web Conferences. ACM, 1343--1350.
[33]
Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. Retrieved from http://snap.stanford.edu/data.
[34]
Brenton Lessley, Talita Perciano, Manish Mathai, Hank Childs, and E. Wes Bethel. 2017. Maximal clique enumeration with data-parallel primitives. In Proceedings of the IEEE Symposium on Large Data Analysis and Visualization. IEEE, 16--25.
[35]
Yinuo Li, Zhiyuan Shao, Dongxiao Yu, Xiaofei Liao, and Hai Jin. 2019. Fast maximal clique enumeration for real-world graphs. In Proceedings of the International Conference on Database Systems for Advanced Applications. Springer, 641--658.
[36]
Guimei Liu and Limsoon Wong. 2008. Effective pruning techniques for mining quasi-cliques. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 33--49.
[37]
Li Lu, Yunhong Gu, and Robert Grossman. 2010. dmaximalcliques: A distributed algorithm for enumerating all maximal cliques and maximal clique distribution. In Proceedings of the IEEE International Conference on Data Mining Workshop. IEEE, 1320--1327.
[38]
Kazuhisa Makino and Takeaki Uno. 2004. New algorithms for enumerating all maximal cliques. In Proceedings of the Scandinavian Workshop on Algorithm Theory. 260--272.
[39]
S. Mohseni-Zadeh, Pierre Brézellec, and J.-L. Risler. 2004. Cluster-C, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques. Comp. Biol. Chem. 28, 3 (2004), 211--218.
[40]
Alberto Montresor, Francesco De Pellegrini, and Daniele Miorandi. 2013. Distributed k-core decomposition. Trans. Parallel Distrib. Syst. 24, 2 (2013), 288--300.
[41]
John W. Moon and Leo Moser. 1965. On cliques in graphs. Israel J. Math. 3, 1 (1965), 23--28.
[42]
Arko Provo Mukherjee and Srikanta Tirthapura. 2017. Enumerating maximal bicliques from a large graph using MapReduce. IEEE Trans. Serv. Comput. 10, 5 (2017), 771--784.
[43]
Thorsten J. Ottosen and Jirı Vomlel. 2010. Honour thy neighbour: Clique maintenance in dynamic graphs. In Proceedings of the International Conference on Probabilistic Graphical Models. 201--208.
[44]
Gergely Palla, Imre Derényi, Illés Farkas, and Tamás Vicsek. 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 7043 (2005), 814--818.
[45]
Hongchao Qin, Rong-Hua Li, Guoren Wang, Lu Qin, Yurong Cheng, and Ye Yuan. 2019. Mining periodic cliques in temporal networks. In Proceedings of the IEEE 35th International Conference on Data Engineering (ICDE’19). IEEE, 1130--1141.
[46]
Oleg Rokhlenko, Ydo Wexler, and Zohar Yakhini. 2007. Similarities and differences of gene expression in yeast stress conditions. Bioinformatics 23, 2 (2007), e184--e190.
[47]
Ryan A. Rossi and Nesreen K. Ahmed. 2015. The network data repository with interactive graph analytics and visualization. In Proceedings of the AAAI Conference on Artificial Intelligence. 4292--4293.
[48]
Pablo San Segundo, Jorge Artieda, and Darren Strash. 2018. Efficiently enumerating all maximal cliques with bit-parallelism. Comput. Operat. Res. 92 (2018), 37--46.
[49]
Seyed-Vahid Sanei-Mehri, Apurba Das, and Srikanta Tirthapura. 2018. Enumerating top-k quasi-cliques. In Proceedings of the IEEE International Conference on Big Data (Big Data’18). IEEE, 1107--1112.
[50]
Ahmet Erdem Sariyuce, C. Seshadhri, and Ali Pinar. 2017. Parallel local algorithms for core, truss, and nucleus decompositions. arXiv preprint arXiv:1704.00386 (2017).
[51]
Matthew C. Schmidt, Nagiza F. Samatova, Kevin Thomas, and Byung-Hoon Park. 2009. A scalable, parallel algorithm for maximal clique enumeration. J. Parallel Distrib. Comput. 69, 4 (2009), 417--428.
[52]
Ori Shalev and Nir Shavit. 2006. Split-ordered lists: Lock-free extensible hash tables. J. ACM 53, 3 (2006), 379--405.
[53]
Julian Shun. 2017. Shared-memory Parallelism Can Be Simple, Fast, and Scalable. Morgan 8 Claypool.
[54]
Volker Stix. 2004. Finding all maximal cliques in dynamic graphs. Comput. Optim. Appl. 27, 2 (2004), 173--186.
[55]
Michael Svendsen, Arko Provo Mukherjee, and Srikanta Tirthapura. 2015. Mining maximal cliques from a large graph using mapreduce: Tackling highly uneven subproblem sizes. J. Parallel Distrib. Comput. 79 (2015), 104--114.
[56]
Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theoret. Comput. Sci. 363, 1 (2006), 28--42.
[57]
Shuji Tsukiyama, Mikio Ide, Hiromu Ariyoshi, and Isao Shirakawa. 1977. A new algorithm for generating all the maximal independent sets. SIAM J. Comput. 6, 3 (1977), 505--517.
[58]
Takeaki Uno. 2010. An efficient algorithm for solving pseudo clique enumeration problem. Algorithmica 56, 1 (2010), 3--16.
[59]
Zhuo Wang, Qun Chen, Boyi Hou, Bo Suo, Zhanhuai Li, Wei Pan, and Zachary G. Ives. 2017. Parallelizing maximal clique and k-plex enumeration over graph data. J. Parallel Distrib. Comput. 106 (2017), 79--91.
[60]
Bin Wu, Shengqi Yang, Haizhou Zhao, and Bai Wang. 2009. A distributed algorithm to enumerate all maximal cliques in mapreduce. In Proceedings of the Frontier of Computer Science and Technology. IEEE, 45--51.
[61]
Ting Yu and Mengchi Liu. 2019. A memory efficient clique enumeration method for sparse graphs with a parallel implementation. Parallel Comput. 87 (2019).
[62]
Long Yuan, Lu Qin, Wenjie Zhang, Lijun Chang, and Jianye Yang. 2017. Index-based densest clique percolation community search in networks. IEEE Trans. Knowl. Data Eng. 30, 5 (2017), 922--935.
[63]
Bing Zhang, Byung-Hoon Park, Tatiana Karpinets, and Nagiza F. Samatova. 2008. From pull-down data to protein interaction networks and complexes with biological relevance. Bioinformatics 24, 7 (2008), 979--986.
[64]
Chen Zhang, Ying Zhang, Wenjie Zhang, Lu Qin, and Jianye Yang. 2019. Efficient maximal spatial clique enumeration. In Proceedings of the IEEE 35th International Conference on Data Engineering (ICDE’19). IEEE, 878--889.
[65]
Yun Zhang, Faisal N. Abu-Khzam, Nicole E. Baldwin, Elissa J. Chesler, Michael A. Langston, and Nagiza F. Samatova. 2005. Genome-scale computational approaches to memory-intensive applications in systems biology. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. IEEE Computer Society, 12.

Cited By

View all
  • (2024)Incremental Maximal Clique Enumeration for Hybrid Edge Changes in Large Dynamic GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.331139836:4(1650-1666)Online publication date: Apr-2024
  • (2024)Accelerating Maximal Bicliques Enumeration with GPU on large scale networkFuture Generation Computer Systems10.1016/j.future.2024.07.021Online publication date: Jul-2024
  • (2024)Distributed algorithm for parallel computation of the n queens solutionsExpert Systems with Applications: An International Journal10.1016/j.eswa.2024.124677255:PCOnline publication date: 1-Dec-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Parallel Computing
ACM Transactions on Parallel Computing  Volume 7, Issue 1
Special Issue on Innovations in Systems for Irregular Applications, Part 1 and Regular Paper
March 2020
182 pages
ISSN:2329-4949
EISSN:2329-4957
DOI:10.1145/3387354
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 March 2020
Accepted: 01 December 2019
Revised: 01 July 2019
Received: 01 December 2018
Published in TOPC Volume 7, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Parallel algorithm
  2. batch parallel algorithm
  3. dynamic graph
  4. maximal clique enumeration
  5. scalable graph processing
  6. work depth analysis

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • US National Science Foundation

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)40
  • Downloads (Last 6 weeks)1
Reflects downloads up to 22 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Incremental Maximal Clique Enumeration for Hybrid Edge Changes in Large Dynamic GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.331139836:4(1650-1666)Online publication date: Apr-2024
  • (2024)Accelerating Maximal Bicliques Enumeration with GPU on large scale networkFuture Generation Computer Systems10.1016/j.future.2024.07.021Online publication date: Jul-2024
  • (2024)Distributed algorithm for parallel computation of the n queens solutionsExpert Systems with Applications: An International Journal10.1016/j.eswa.2024.124677255:PCOnline publication date: 1-Dec-2024
  • (2024)On Compressing Historical Cliques in Temporal GraphsDatabase Systems for Advanced Applications10.1007/978-981-97-5552-3_3(37-53)Online publication date: 1-Oct-2024
  • (2023)Fast Parallel Algorithms for Enumeration of Simple, Temporal, and Hop-constrained CyclesACM Transactions on Parallel Computing10.1145/361164210:3(1-35)Online publication date: 4-Aug-2023
  • (2023)Efficient Maximal Biclique Enumeration on Large Uncertain Bipartite GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.327211035:12(12634-12648)Online publication date: 1-Dec-2023
  • (2023)Parallelizing Maximal Clique Enumeration on GPUsProceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques10.1109/PACT58117.2023.00022(162-175)Online publication date: 21-Oct-2023
  • (2023)A fast approximate method for k-edge connected component detection in graphs with high accuracyInformation Sciences10.1016/j.ins.2023.03.009633(384-409)Online publication date: Jul-2023
  • (2022)A Distributed Algorithm for Computing Groups in IoT SystemsInternational Journal of Software Science and Computational Intelligence10.4018/IJSSCI.30036314:1(1-21)Online publication date: 1-Jan-2022
  • (2022)Maximal clique enumeration problem on graphs: status and challengesSCIENTIA SINICA Informationis10.1360/SSI-2021-015552:5(784)Online publication date: 14-Apr-2022
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media