Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3627673.3679101acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
tutorial
Open access

Systems for Scalable Graph Analytics and Machine Learning: Trends and Methods

Published: 21 October 2024 Publication History

Abstract

Graph-theoretic algorithms and graph machine learning models are essential tools for addressing many real-life problems, such as social network analysis and bioinformatics. To support large-scale graph analytics, graph-parallel systems have been actively developed for over one decade, such as Google's Pregel and Spark's GraphX, which (i) promote a think-like-a-vertex computing model and target (ii) iterative algorithms and (iii) those problems that output a value for each vertex. However, this model is too restricted for supporting the rich set of heterogeneous operations for graph analytics and machine learning that many real applications demand.
In recent years, two new trends emerge in graph-parallel systems research: (1) a novel think-like-a-task computing model that can efficiently support the various computationally expensive problems of subgraph search; and (2) scalable systems for learning graph neural networks. These systems effectively complement the diversity needs of graph-parallel tools that can flexibly work together in a comprehensive graph processing pipeline for real applications, with the capability of capturing structural features. This tutorial will provide an effective categorization of the recent systems in these two directions based on their computing models and adopted techniques, and will review the key design ideas of these systems.

References

[1]
[n. d.]. BigGraph@CUHK. http://www.cse.cuhk.edu.hk/systems/graph/.
[2]
[n. d.]. GNN Tutorials. https://graph-neural-networks.github.io/.
[3]
Emily Alsentzer, Samuel G. Finlayson, Michelle M. Li, and Marinka Zitnik. 2020. Subgraph Neural Networks. In NeurIPS.
[4]
Shumo Chu and James Cheng. 2012. Triangle listing in massive networks. ACM Trans. Knowl. Discov. Data 6, 4 (2012), 17:1--17:32.
[5]
Vinícius Vitor dos Santos Dias, Carlos H. C. Teixeira, Dorgival O. Guedes, Wagner Meira Jr., and Srinivasan Parthasarathy. 2019. Fractal: A General-Purpose Graph Pattern Mining System. In SIGMOD Conference 2019. ACM, 1357--1374.
[6]
Fabrizio Frasca, Beatrice Bevilacqua, Michael M. Bronstein, and Haggai Maron. 2022. Understanding and Extending Subgraph GNNs by Rethinking Their Symmetries. In NeurIPS.
[7]
Guimu Guo, Da Yan, M. Tamer Özsu, Zhe Jiang, and Jalal Khalil. 2020. Scalable Mining of Maximal Quasi-Cliques: An Algorithm-System Codesign Approach. Proc. VLDB Endow. 14, 4 (2020), 573--585.
[8]
Wentian Guo, Yuchen Li, Mo Sha, Bingsheng He, Xiaokui Xiao, and Kian-Lee Tan. 2020. GPU-Accelerated Subgraph Enumeration on Partitioned Graphs. In SIGMOD. ACM, 1067--1082.
[9]
William L. Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive Representation Learning on Large Graphs. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA, USA. 1024--1034. https://proceedings.neurips. cc/paper/2017/hash/5dd9db5e033da9c6fb5ba83c7a7ebea9-Abstract.html
[10]
Guanxian Jiang, China Qihui Zhou, Tatiana Jin, Boyang Li, Yunjian Zhao, Yichao Li, and James Cheng. 2022. VSGM: View-Based GPU-Accelerated Subgraph Matching on Large Graphs. In SC. IEEE Computer Society, 739--753.
[11]
Shirui Pan and Xingquan Zhu. 2013. Graph Classification with Imbalanced Class Distributions and Noise. In IJCAI, Francesca Rossi (Ed.). IJCAI/AAAI, 1586--1592.
[12]
Martin S. R. Paradesi, Doina Caragea, and William H. Hsu. 2007. Structural Prediction of Protein-Protein Interactions in Saccharomyces cerevisiae. In IEEE BIBE. IEEE Computer Society, 1270--1274.
[13]
Erasmo Purificato, Ludovico Boratto, and Ernesto William De Luca. 2023. Leveraging Graph Neural Networks for User Profiling: Recent Advances and Open Challenges. In CIKM. ACM, 5216--5219.
[14]
Hiroto Saigo, Sebastian Nowozin, Tadashi Kadowaki, Taku Kudo, and Koji Tsuda. 2009. gBoost: a mathematical programming approach to graph classification and regression. Mach. Learn. 75, 1 (2009), 69--89.
[15]
Andrew Stolman, Caleb Levy, C. Seshadhri, and Aneesh Sharma. 2022. Classic Graph Structural Features Outperform Factorization-Based Graph Embedding Methods on Community Labeling. In SDM. SIAM, 388--396.
[16]
Xibo Sun and Qiong Luo. 2023. Efficient GPU-Accelerated Subgraph Matching. Proc. ACM Manag. Data 1, 2 (2023), 181:1--181:26.
[17]
Hanchen Wang, Rong Hu, Ying Zhang, Lu Qin, Wei Wang, and Wenjie Zhang. 2022. Neural Subgraph Counting with Wasserstein Estimator. In SIGMOD. ACM, 160--175.
[18]
Yihua Wei and Peng Jiang. 2022. STMatch: Accelerating Graph Pattern Matching on GPU with Stack-Based Loop Optimizations. In SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, TX, USA, November 13--18, 2022, FelixWolf, Sameer Shende, Candace Culhane, Sadaf R. Alam, and Heike Jagode (Eds.). IEEE, 53:1--53:13.
[19]
Lizhi Xiang, Arif Khan, Edoardo Serra, Mahantesh Halappanavar, and Aravind Sukumaran-Rajam. 2021. cuTS: scaling subgraph isomorphism on distributed multi-GPU systems using trie based data structure. In SC. ACM, 69.
[20]
Bo Xiong, Mojtaba Nayyeri, Daniel Daza, and Michael Cochez. 2023. Reasoning beyond Triples: Recent Advances in Knowledge Graph Embeddings. In CIKM. ACM, 5228--5231.
[21]
Da Yan, Yingyi Bu, Yuanyuan Tian, and Amol Deshpande. 2017. Big Graph Analytics Platforms. Found. Trends Databases 7, 1--2 (2017), 1--195.
[22]
Da Yan, Yingyi Bu, Yuanyuan Tian, Amol Deshpande, and James Cheng. 2016. Big Graph Analytics Systems. In SIGMOD, Fatma Özcan, Georgia Koutrika, and Sam Madden (Eds.). ACM, 2241--2243.
[23]
Da Yan, James Cheng, Kai Xing, Yi Lu, Wilfred Ng, and Yingyi Bu. 2014. Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees. Proc. VLDB Endow. 7, 14 (2014), 1821--1832.
[24]
Da Yan, Guimu Guo, Md Mashiur Rahman Chowdhury, M. Tamer Özsu, Wei-Shinn Ku, and John C. S. Lui. 2020. G-thinker: A Distributed Framework for Mining Subgraphs in a Big Graph. In ICDE 2020. IEEE, 1369--1380.
[25]
Da Yan, Guimu Guo, Jalal Khalil, M. Tamer Özsu,Wei-Shinn Ku, and John C. S. Lui. 2022. G-thinker: a general distributed framework for finding qualified subgraphs in a big graph with load balancing. VLDB J. 31, 2 (2022), 287--320.
[26]
Da Yan, Yuanyuan Tian, and James Cheng. 2017. Systems for Big Graph Analytics. Springer. https://doi.org/10.1007/978--3--319--58217--7
[27]
Rex Ying, Zhaoyu Lou, Jiaxuan You, Chengtao Wen, Arquimedes Canedo, and Jure Leskovec. 2020. Neural Subgraph Matching. CoRR abs/2007.03092 (2020). arXiv:2007.03092 https://arxiv.org/abs/2007.03092
[28]
Lyuheng Yuan, Akhlaque Ahmad, Da Yan, Jiao Han, Xiaodong Yu, and Yang Zhou. 2024. FG2-AIMD: A Memory-Efficient Subgraph-Centric Framework for Efficient Subgraph Search on GPUs. In ICDE. IEEE Computer Society.
[29]
Lyuheng Yuan, Da Yan, Jiao Han, Akhlaque Ahmad, Yang Zhou, and Zhe Jiang. 2024. Faster Depth-First Subgraph Matching on GPUs. In ICDE. IEEE Computer Society.
[30]
Li Zeng, Lei Zou, and M Tamer Özsu. 2022. SGSI--A Scalable GPU-friendly Subgraph Isomorphism Algorithm. IEEE Transactions on Knowledge and Data Engineering (2022).
[31]
Li Zeng, Lei Zou, M. Tamer Özsu, Lin Hu, and Fan Zhang. 2020. GSI: GPU-friendly Subgraph Isomorphism. In ICDE. IEEE, 1249--1260.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024
5705 pages
ISBN:9798400704369
DOI:10.1145/3627673
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 October 2024

Check for updates

Author Tags

  1. GNN
  2. graph
  3. graph neural network
  4. structure
  5. subgraph
  6. system
  7. vertex

Qualifiers

  • Tutorial

Funding Sources

  • NSF
  • South Big Data Innovation Hub
  • DOE

Conference

CIKM '24
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 58
    Total Downloads
  • Downloads (Last 12 months)58
  • Downloads (Last 6 weeks)58
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media