GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

Authors:

Jie Tang

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 1150 - 1160

https://doi.org/10.1145/3394486.3403168

Published: 20 August 2020

Abstract

Graph representation learning has emerged as a powerful technique for addressing real-world problems. Its recent developments have benefited various downstream graph learning tasks, such as node classification, similarity search, and graph classification. However, prior work on graph representation learning focuses on domain-specific problems and trains a dedicated model for each graph dataset, which usually does not transfer to out-of-domain data. Inspired by recent advances in pre-training in natural language processing and computer vision, we design Graph Contrastive Coding (GCC), a self-supervised graph neural network pre-training framework, to capture universal network topological properties across multiple networks. We design GCC's pre-training task as subgraph instance discrimination within and across networks, and leverage contrastive learning to enable graph neural networks to learn intrinsic and transferable structural representations. We conduct extensive experiments on three graph learning tasks and ten graph datasets. The results show that GCC, pre-trained on a collection of diverse datasets, can achieve performance competitive with or better than its task-specific, trained-from-scratch counterparts. This suggests that the pre-training and fine-tuning paradigm holds great potential for graph representation learning.
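
The abstract names contrastive learning over subgraph instances but does not spell out the objective. As a minimal sketch, assuming an InfoNCE-style loss (a common choice for instance discrimination, not confirmed by the abstract itself), the idea can be illustrated as follows. The function name `info_nce_loss`, the temperature value, and the toy embedding vectors are all hypothetical; in practice the vectors would come from a graph neural network encoding sampled subgraphs.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE-style loss for a single query embedding (illustrative only).

    query:         embedding of one subgraph instance
    positive_key:  embedding of another view of the same instance
    negative_keys: embeddings of other instances
    The temperature of 0.07 is an assumed default, not taken from the source.
    """
    q = normalize(query)
    keys = [normalize(positive_key)] + [normalize(k) for k in negative_keys]
    # Similarity between the query and every key, scaled by the temperature.
    logits = [dot(q, k) / temperature for k in keys]
    # Numerically stable log-sum-exp over all keys.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive key (index 0).
    return log_denominator - logits[0]


# Toy embeddings: the positive key is close to the query, the negatives are not.
query = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss = info_nce_loss(query, positive, negatives)
```

The loss is small when the query is closer to its positive key than to any negative key, and grows when a negative key is more similar, which is the behavior instance discrimination relies on.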

Index Terms

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
2. Information systems
  1. Information systems applications
    1. Data mining
  2. World Wide Web
    1. Web applications
      1. Social networks

Information & Contributors

Information

Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

August 2020

3664 pages

ISBN: 9781450379984

DOI: 10.1145/3394486

General Chairs: Rajesh Gupta (UC San Diego, USA), Yan Liu (USC, USA)
Program Chairs: Mohak Shah (LG Electronics, USA), Suju Rajan (LinkedIn, USA)
Publications Chairs: Jiliang Tang (Michigan State, USA), B. Aditya Prakash (Georgia Tech, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Key R&D Program of China
NSFC for Distinguished Young Scholar
NSFC

Conference

KDD '20

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

July 6 - 10, 2020

CA, Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
