DOI: 10.1145/3637528.3671940
Research article · Open access

AGS-GNN: Attribute-guided Sampling for Graph Neural Networks

Published: 24 August 2024

Abstract

We propose AGS-GNN, a novel attribute-guided sampling algorithm for Graph Neural Networks (GNNs). AGS-GNN exploits the node features and the connectivity structure of a graph while simultaneously adapting to both homophily and heterophily. In homophilic graphs, vertices of the same class are more likely to be adjacent, whereas in heterophilic graphs vertices of different classes tend to be adjacent. GNNs have been successfully applied to homophilic graphs, but applying them to heterophilic graphs remains challenging: the state-of-the-art GNNs for heterophilic graphs use the full neighborhood of a node instead of sampling it, and hence neither scale to large graphs nor support inductive learning. We develop dual-channel sampling techniques based on feature-similarity and feature-diversity that select, for each node, subsets of neighbors capturing adaptive information from homophilic and heterophilic neighborhoods. AGS-GNN is currently the only algorithm that explicitly controls homophily in the sampled subgraph through similar and diverse neighborhood samples. For diverse neighborhood sampling, we employ submodular function maximization, a novel contribution in this context. We pre-compute the sampling distributions in parallel, achieving the desired scalability. On an extensive collection of 35 small (< 100K nodes) and large (≥ 100K nodes) homophilic and heterophilic graphs, we demonstrate the superiority of AGS-GNN over state-of-the-art approaches. AGS-GNN achieves test accuracy comparable to the best-performing heterophilic GNNs, even outperforming methods that use the entire graph for node classification. AGS-GNN converges faster than methods that sample neighborhoods randomly, and it can be incorporated into existing GNN models that employ node or graph sampling.
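To make the sampling pipeline concrete, here is a minimal sketch of the similarity channel: rank a node's neighbors by feature similarity and convert the ranks into a sampling distribution. The cosine-similarity measure and the r**(-alpha) rank weighting below are illustrative assumptions on our part, not the paper's exact scheme:

```python
import numpy as np

def similarity_sampling_probs(x_u, x_nbrs, alpha=2.0):
    """Rank the neighbors of node u by cosine similarity to u's features
    and turn the ranks into a sampling distribution (similarity channel)."""
    # Cosine similarity between u and each of its neighbors.
    sim = (x_nbrs @ x_u) / (
        np.linalg.norm(x_nbrs, axis=1) * np.linalg.norm(x_u) + 1e-12
    )
    # Rank 1 = most similar; weight rank r as r**(-alpha) so that
    # higher-ranked (more similar) neighbors are sampled more often.
    order = np.argsort(-sim)
    ranks = np.empty(len(sim))
    ranks[order] = np.arange(1, len(sim) + 1)
    weights = ranks ** (-alpha)
    return weights / weights.sum()

rng = np.random.default_rng(seed=0)
x_u = rng.normal(size=16)            # feature vector of the target node
x_nbrs = rng.normal(size=(10, 16))   # feature vectors of its 10 neighbors
p = similarity_sampling_probs(x_u, x_nbrs)
picked = rng.choice(10, size=4, replace=False, p=p)  # similarity-biased sample
```

Because the probabilities depend only on node features and adjacency, they can be pre-computed once per node, in parallel, before training. The diversity channel replaces the similarity ranking with a submodular selection order (see the sketch under Supplemental Material below).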

Supplemental Material

PPTX File - Presentation Slides for Promotional Video
MOV File - Promotional Video

Both supplements carry the same description: We introduce AGS-GNN, an attribute-guided sampling algorithm for graph neural networks that samples the neighbors of a node based on feature similarity and diversity. It handles graphs with a mix of homophilic and heterophilic nodes by relying on the smoothing assumption (labels and features are positively correlated) for homophilic nodes and on the heterophily assumption (heterophilic nodes with the same label should have similar label diversity in their neighborhoods) for heterophilic nodes. In the pre-computation step of AGS-GNN, the neighbors of each node are ranked by similarity and diversity, and these ranks are converted into sampling probabilities. Two subgraphs are sampled according to these probabilities, passed to a homophilic and a heterophilic GNN, respectively, and learned adaptively. AGS-GNN outperforms state-of-the-art GNNs on both homophilic and heterophilic graphs, converges faster than random sampling, can easily be incorporated into existing GNNs, and offers scalable and parallel computation.
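The pre-computation step described above ranks neighbors for the diversity channel via submodular maximization. A minimal sketch follows, assuming a facility-location objective with an RBF similarity kernel (both illustrative choices on our part, not necessarily the paper's exact objective), using the classic greedy algorithm with its (1 - 1/e) approximation guarantee:

```python
import numpy as np

def greedy_diverse_neighbors(feats, k):
    """Pick k neighbors by greedily maximizing a facility-location
    submodular function over their feature vectors (diversity channel)."""
    n, d = feats.shape
    # Pairwise neighbor-to-neighbor similarities (RBF kernel, one
    # illustrative choice among many).
    sq_dists = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=-1)
    sim = np.exp(-sq_dists / d)
    selected = []
    coverage = np.zeros(n)  # how well each neighbor is represented so far
    for _ in range(min(k, n)):
        # Marginal gain in total coverage if candidate j were added.
        gain = np.maximum(sim, coverage).sum(axis=1) - coverage.sum()
        gain[selected] = -np.inf  # never re-pick a selected neighbor
        j = int(np.argmax(gain))
        selected.append(j)
        coverage = np.maximum(coverage, sim[j])
    return selected  # the greedy order doubles as the diversity ranking

rng = np.random.default_rng(seed=1)
neighbor_feats = rng.normal(size=(12, 8))
diverse = greedy_diverse_neighbors(neighbor_feats, k=4)
```

The greedy selection order provides the diversity ranking, which can then be converted into sampling probabilities in the same way as the similarity ranking.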



Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024, 6901 pages
ISBN: 9798400704901
DOI: 10.1145/3637528
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. graph neural networks
2. heterophily
3. submodular functions


Funding Sources

• U.S. Department of Energy, Office of Science

Conference

KDD '24

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
