DOI: 10.1145/3589335.3651920
Research article, Open access

Graph Coarsening via Convolution Matching for Scalable Graph Neural Network Training

Published: 13 May 2024

Abstract

Graph summarization as a preprocessing step is an effective and complementary technique for scalable graph neural network (GNN) training. In this work, we propose the Coarsening Via Convolution Matching (ConvMatch) algorithm and a highly scalable variant, A-ConvMatch, for creating summarized graphs that preserve the output of graph convolution. We evaluate ConvMatch on six real-world link prediction and node classification graph datasets, and show it is efficient and preserves prediction performance while significantly reducing the graph size. Notably, ConvMatch achieves up to 95% of the prediction performance of GNNs on node classification while trained on graphs summarized down to 1% the size of the original graph. Furthermore, on link prediction tasks, ConvMatch consistently outperforms all baselines, achieving up to a 2X improvement.
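The abstract's core objective, summarizing a graph while preserving the output of graph convolution, can be illustrated on a toy example. The sketch below is not the paper's ConvMatch procedure: it uses a simple row-normalized propagation D^-1 (A + S) X as a stand-in for graph convolution, and the partition matrix P, the mean feature pooling, and the self-loop weighting by cluster size are all assumptions made for this illustration. It merges two structurally identical nodes and checks that the convolution output, lifted back to the original nodes, is unchanged.

```python
import numpy as np

def rw_conv(adj, self_w, feats):
    """Row-normalized propagation D^-1 (A + S) X, where the diagonal S
    carries self-loop weights (cluster sizes after coarsening)."""
    a_hat = adj + np.diag(self_w)
    return (a_hat / a_hat.sum(axis=1, keepdims=True)) @ feats

# Original 4-node bipartite graph: nodes 2 and 3 are structural twins
# (same neighbors, same features), so merging them should be lossless.
A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [1, 1, 0, 0]], dtype=float)
X = np.array([[1.0], [2.0], [5.0], [5.0]])

# Partition matrix P maps original nodes to supernodes {0}, {1}, {2,3}.
P = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
sizes = P.sum(axis=0)              # cluster sizes: [1, 1, 2]
A_c = P.T @ A @ P                  # coarsened, edge-weighted adjacency
X_c = (P.T @ X) / sizes[:, None]   # supernode features = mean of members

out_full = rw_conv(A, np.ones(4), X)
out_coarse = rw_conv(A_c, sizes, X_c)

# Lift the coarse output back to the original nodes and compare.
print(np.abs(P @ out_coarse - out_full).max())
```

Merging structural twins is the lossless special case; a coarsening method that targets this objective must, in general, trade reduction in graph size against deviation in the convolution output when the merged nodes are only approximately similar.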

Supplemental Material

Supplemental video (MOV file)


Published In

WWW '24: Companion Proceedings of the ACM Web Conference 2024
May 2024, 1928 pages
ISBN: 9798400701726
DOI: 10.1145/3589335
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. graph convolutions
      2. graph neural networks
      3. graph summarization


Conference

WWW '24: The ACM Web Conference 2024
May 13-17, 2024, Singapore, Singapore

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions, 23%
