DOI: 10.5555/3540261.3541479

Adversarial graph augmentation to improve graph contrastive learning

Published: 10 June 2024

Abstract

Self-supervised learning of graph neural networks (GNNs) is sorely needed because of the widespread label scarcity in real-world graph/network data. Graph contrastive learning (GCL), which trains GNNs to maximize the correspondence between the representations of the same graph in its different augmented forms, can yield robust and transferable GNNs even without using labels. However, GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and deliver sub-par performance in downstream tasks. Here, we propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing the adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing it with state-of-the-art GCL methods, achieving performance gains of up to 14% in unsupervised, 6% in transfer, and 3% in semi-supervised learning settings across 18 benchmark datasets for the tasks of molecule property regression and classification, and social network classification.
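In terms of the training objective, the principle above can be sketched as a min-max game: standard GCL performs only the inner maximization with a fixed, hand-picked augmentation, whereas AD-GCL additionally optimizes the augmentation against the encoder. The following fragment is a reconstruction from the abstract's description; the paper's exact formulation, constraint set, and regularization are in the full text.

```latex
% Sketch of the AD-GCL principle: the encoder f maximizes a mutual-information
% lower bound I(f(G); f(T(G))) between a graph G and its augmented view T(G),
% while the augmentation T is optimized adversarially over a constrained
% family \mathcal{T} (e.g., learnable edge dropping).
\min_{T \in \mathcal{T}} \; \max_{f} \; I\bigl( f(G);\, f(T(G)) \bigr)
```

Below is a minimal, self-contained PyTorch sketch of this min-max training with a trainable edge-dropping augmenter, the instantiation named in the abstract. It is illustrative only: all names (TinyGNN, EdgeDropAugmenter, embed_batch) are hypothetical, the encoder is a toy one-layer message-passing network rather than the paper's GIN encoder, and the drop-ratio penalty weight is an assumed placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGNN(nn.Module):
    """Toy encoder: one round of message passing over a (soft) adjacency, then sum pooling."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)

    def forward(self, x, adj):
        h = F.relu(self.lin(adj @ x + x))  # aggregate neighbors plus self-loop
        return h.sum(dim=0)                # graph-level embedding

class EdgeDropAugmenter(nn.Module):
    """Learns a keep-logit per edge; samples soft Bernoulli masks via Gumbel-sigmoid."""
    def __init__(self, n_edges, temp=1.0):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_edges))
        self.temp = temp

    def soft_adj(self, edge_index, n_nodes):
        u = torch.rand_like(self.logits).clamp(1e-6, 1 - 1e-6)
        noise = torch.log(u) - torch.log1p(-u)        # Logistic(0, 1) noise
        keep = torch.sigmoid((self.logits + noise) / self.temp)
        adj = torch.zeros(n_nodes, n_nodes)
        adj[edge_index[0], edge_index[1]] = keep      # differentiable soft edge mask
        return adj, keep

def info_nce(z1, z2, tau=0.5):
    """InfoNCE loss: matching (graph, augmented-view) pairs are in-batch positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    return F.cross_entropy(z1 @ z2.t() / tau, torch.arange(z1.size(0)))

def embed_batch(encoder, graphs, augmenters, n_nodes=10):
    """Encode each graph alongside its adversarially augmented view."""
    z_a, z_v, keeps = [], [], []
    for (x, ei), aug in zip(graphs, augmenters):
        dense = torch.zeros(n_nodes, n_nodes)
        dense[ei[0], ei[1]] = 1.0                     # original adjacency
        adj, keep = aug.soft_adj(ei, n_nodes)
        z_a.append(encoder(x, dense))
        z_v.append(encoder(x, adj))
        keeps.append(keep.mean())
    return torch.stack(z_a), torch.stack(z_v), torch.stack(keeps)

# Toy batch: 8 random graphs, each with 10 nodes, 16 features, and 30 edges.
torch.manual_seed(0)
graphs = [(torch.randn(10, 16), torch.randint(0, 10, (2, 30))) for _ in range(8)]

encoder = TinyGNN(16, 32)
augmenters = [EdgeDropAugmenter(30) for _ in graphs]  # one augmenter per graph, for simplicity
enc_opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
aug_opt = torch.optim.Adam([p for a in augmenters for p in a.parameters()], lr=1e-3)

for step in range(200):
    # Encoder step: MINIMIZE the InfoNCE loss, i.e. maximize the MI lower bound.
    z_a, z_v, _ = embed_batch(encoder, graphs, augmenters)
    loss = info_nce(z_a, z_v)
    enc_opt.zero_grad(); loss.backward(); enc_opt.step()

    # Augmenter step: MAXIMIZE the same loss, penalizing the expected drop ratio
    # (weight 0.5 is an assumed placeholder) so the degenerate solution of
    # deleting every edge is ruled out; AD-GCL likewise regularizes its augmenter.
    z_a, z_v, keeps = embed_batch(encoder, graphs, augmenters)
    aug_loss = -info_nce(z_a, z_v) + 0.5 * (1.0 - keeps).mean()
    aug_opt.zero_grad(); aug_loss.backward(); aug_opt.step()
```

The Gumbel-sigmoid relaxation keeps edge dropping differentiable, so the augmenter can be trained by gradient descent alongside the encoder; the drop-ratio penalty bounds how aggressive the adversary can be, which is the role the paper's constrained augmentation family plays.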

Supplementary Material

Supplemental material (3540261.3541479_supp.pdf)



Published In

NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, December 2021, 30517 pages

Publisher

Curran Associates Inc., Red Hook, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited
