Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3581784.3607059acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
research-article

Efficient Maximal Biclique Enumeration on GPUs

Published: 11 November 2023 Publication History

Abstract

Maximal biclique enumeration (MBE) in bipartite graphs is an important problem in data mining with many real-world applications. All existing solutions for MBE are designed for CPUs. Parallel MBE algorithms for GPUs are needed for MBE acceleration leveraging its many computing cores. However, enumerating maximal bicliques using GPUs has three main challenges including large memory requirement, thread divergence, and load imbalance. In this paper, we propose GMBE, the first highly-efficient GPU solution for the MBE problem. To overcome the challenges, we design a node-reuse approach to reduce GPU memory usage, a pro-active pruning method using the vertex's local neighborhood size to alleviate thread divergence, and a load-aware task scheduling framework to achieve load balance among threads within GPU warps and blocks. Our experimental results show that GMBE on an NVIDIA A100 GPU can achieve 70.6× speedup over the state-of-the-art parallel MBE algorithm ParMBE on a 96-core CPU machine.

Supplemental Material

MP4 File - SC23 video presentation for "Efficient Maximal Biclique Enumeration on GPUs"
SC23 video presentation for the main program paper "Efficient Maximal Biclique Enumeration on GPUs" by Zhe Pan, Shuibing He, Xu Li, Xuechen Zhang, Rui Wang and Gang Chen

References

[1]
2022. CUDA C++ programming guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/.
[2]
2023. CUDA from Wikipedia. https://en.wikipedia.org/wiki/CUDA.
[3]
2023. GeForce RTX 20 Series. https://www.nvidia.com/en-gb/geforce/20-series/.
[4]
2023. NVIDIA A100 Tensor Core GPU. https://www.nvidia.com/en-gb/data-center/a100/.
[5]
2023. NVIDIA Nsight Compute. https://developer.nvidia.com/nsight-compute/.
[6]
2023. NVIDIA V100 Tensor Core GPU. https://www.nvidia.com/en-gb/data-center/v100/.
[7]
2023. Single instruction, multiple threads (SIMT) from Wikipedia. https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads.
[8]
Aman Abidi, Rui Zhou, Lu Chen, and Chengfei Liu. 2020. Pivot-based Maximal Biclique Enumeration. In IJCAI. 3558--3564.
[9]
Gabriela Alexe, Sorin Alexe, Yves Crama, Stephan Foldes, Peter L Hammer, and Bruno Simeone. 2004. Consensus algorithms for the generation of all maximal bicliques. Discrete Applied Mathematics 145, 1 (2004), 11--21.
[10]
Mohammad Allahbakhsh, Aleksandar Ignjatovic, Boualem Benatallah, Seyed-Mehdi-Reza Beheshti, Elisa Bertino, and Norman Foo. 2013. Collusion detection in online rating systems. In Web Technologies and Applications: 15th Asia-Pacific Web Conference, APWeb 2013, Sydney, Australia, April 4--6, 2013. Proceedings 15. Springer, 196--207.
[11]
Mohammad Almasri, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, and Wen-mei Hwu. 2022. Parallel k-clique counting on gpus. In Proceedings of the 36th ACM International Conference on Supercomputing. 1--14.
[12]
Tariq Alusaifeer, Sheela Ramanna, Christopher J Henry, and James Peters. 2013. GPU implementation of MCE approach to finding near neighbourhoods. In International Conference on Rough Sets and Knowledge Technology. Springer, 251--262.
[13]
Martin Burtscher, Rupesh Nasre, and Keshav Pingali. 2012. A quantitative study of irregular programs on GPUs. In 2012 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 141--151.
[14]
Lu Chen, Chengfei Liu, Rui Zhou, Jiajie Xu, and Jianxin Li. 2022. Efficient maximal biclique enumeration for large sparse bipartite graphs. Proceedings of the VLDB Endowment 15, 8 (2022), 1559--1571.
[15]
Xuhao Chen et al. 2022. Efficient and Scalable Graph Pattern Mining on GPUs. In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22). 857--877.
[16]
Xuhao Chen, Li-Wen Chang, Christopher I Rodrigues, Jie Lv, Zhiying Wang, and Wen-Mei Hwu. 2014. Adaptive cache management for energy-efficient GPU computing. In 2014 47th Annual IEEE/ACM international symposium on microarchitecture. IEEE, 343--355.
[17]
Xuhao Chen, Roshan Dathathri, Gurbinder Gill, Loc Hoang, and Keshav Pingali. 2021. Sandslash: a two-level framework for efficient graph pattern mining. In Proceedings of the ACM International Conference on Supercomputing. 378--391.
[18]
Apurba Das and Srikanta Tirthapura. 2019. Shared-memory parallel maximal biclique enumeration. In 2019 IEEE 26th International Conference on High Performance Computing (HiPC). IEEE, 34--43.
[19]
David Eppstein. 1994. Arboricity and bipartite subgraph listing algorithms. Information processing letters 51, 4 (1994), 207--211.
[20]
Wentian Guo, Yuchen Li, and Kian-Lee Tan. 2020. Exploiting reuse for GPU subgraph enumeration. IEEE Transactions on Knowledge and Data Engineering (TKDE) 34, 9 (2020), 4231--4244.
[21]
Kshitij Gupta, Jeff A Stuart, and John D Owens. 2012. A study of persistent threads style GPU programming for GPGPU workloads. In 2012 Innovative Parallel Computing (InPar). 1--14.
[22]
Danny Hermelin and George Manoussakis. 2021. Efficient enumeration of maximal induced bicliques. Discrete Applied Mathematics 303 (2021), 253--261.
[23]
George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on scientific Computing 20, 1 (1998), 359--392.
[24]
Kyle Kloster, Blair D Sullivan, and Andrew van der Poel. 2019. Mining maximal induced bicliques using odd cycle transversals. In Proceedings of the 2019 SIAM International Conference on Data Mining. SIAM, 324--332.
[25]
Jérôme Kunegis. 2013. Konect: the koblenz network collection. In Proceedings of the 22nd international conference on world wide web. 1343--1350.
[26]
Jure Leskovec and Andrej Krevl. 2014. SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data.
[27]
Brenton Lessley, Talita Perciano, Manish Mathai, Hank Childs, and E Wes Bethel. 2017. Maximal clique enumeration with data-parallel primitives. In 2017 IEEE 7th Symposium on Large Data Analysis and Visualization (LDAV). IEEE, 16--25.
[28]
Yuan Lin and Vinod Grover. 2018. Using cuda warp-level primitives. https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/.
[29]
Guimei Liu, Kelvin Sim, and Jinyan Li. 2006. Efficient mining of large maximal bicliques. In Data Warehousing and Knowledge Discovery: 8th International Conference, DaWaK 2006, Krakow, Poland, September 4--8, 2006. Proceedings 8. Springer, 437--448.
[30]
Bingqing Lyu, Lu Qin, Xuemin Lin, Ying Zhang, Zhengping Qian, and Jingren Zhou. 2020. Maximum biclique search at billion scale. Proceedings of the VLDB Endowment (2020).
[31]
Ziyi Ma, Yuling Liu, Yikun Hu, Jianye Yang, Chubo Liu, and Huadong Dai. 2022. Efficient maintenance for maximal bicliques in bipartite graph streams. World Wide Web 25, 2 (2022), 857--877.
[32]
Arko Provo Mukherjee and Srikanta Tirthapura. 2016. Enumerating maximal bicliques from a large graph using mapreduce. IEEE Transactions on Services Computing (TSC) 10, 5 (2016), 771--784.
[33]
Ron Rymon. 1992. Search through systematic set enumeration. In Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning (KR). 539--550.
[34]
Amos Tanay, Roded Sharan, and Ron Shamir. 2002. Discovering statistically significant biclusters in gene expression data. In Proceedings of the Tenth International Conference on Intelligent Systems for Molecular Biology. 136--144.
[35]
Kai Wang, Wenjie Zhang, Xuemin Lin, Lu Qin, and Alexander Zhou. 2022. Efficient personalized maximum biclique search. In 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 498--511.
[36]
Yi-Wen Wei, Wei-Mei Chen, and Hsin-Hung Tsai. 2021. Accelerating the Bron-Kerbosch algorithm for maximal clique enumeration using GPUs. IEEE Transactions on Parallel and Distributed Systems (TPDS) 32, 9 (2021), 2352--2366.
[37]
Martin Winter, Mathias Parger, Daniel Mlakar, and Markus Steinberger. 2021. Are dynamic memory managers on gpus slow? a survey and benchmarks. In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 219--233.
[38]
Jianye Yang, Yun Peng, Dian Ouyang, Wenjie Zhang, Xuemin Lin, and Xiang Zhao. 2023. (p, q)-biclique counting and enumeration for large sparse bipartite graphs. The VLDB Journal (2023), 1--25.
[39]
Yun Zhang, Charles A Phillips, Gary L Rogers, Erich J Baker, Elissa J Chesler, and Michael A Langston. 2014. On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types. BMC bioinformatics 15 (2014), 1--18.

Cited By

View all
  • (2024)SyncMalloc: A Synchronized Host-Device Co-Management System for GPU Dynamic Memory Allocation across All ScalesProceedings of the 53rd International Conference on Parallel Processing10.1145/3673038.3673069(179-188)Online publication date: 12-Aug-2024
  • (2024)AMBEA: Aggressive Maximal Biclique Enumeration in Large Bipartite Graph ComputingIEEE Transactions on Computers10.1109/TC.2024.344186473:12(2664-2677)Online publication date: Dec-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SC '23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2023
1428 pages
ISBN:9798400701092
DOI:10.1145/3581784
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2023

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. maximal biclique enumeration
  2. GPU

Qualifiers

  • Research-article

Conference

SC '23
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)295
  • Downloads (Last 6 weeks)8
Reflects downloads up to 17 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)SyncMalloc: A Synchronized Host-Device Co-Management System for GPU Dynamic Memory Allocation across All ScalesProceedings of the 53rd International Conference on Parallel Processing10.1145/3673038.3673069(179-188)Online publication date: 12-Aug-2024
  • (2024)AMBEA: Aggressive Maximal Biclique Enumeration in Large Bipartite Graph ComputingIEEE Transactions on Computers10.1109/TC.2024.344186473:12(2664-2677)Online publication date: Dec-2024

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media