research-article

Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models

Authors:

Benedek Rozemberczki,

Rik SarkarAuthors Info & Claims

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management

Pages 1325 - 1334

https://doi.org/10.1145/3340531.3411866

Published: 19 October 2020 Publication History

Abstract

In this paper, we propose a flexible notion of characteristic functions defined on graph vertices to describe the distribution of vertex features at multiple scales. We introduce FEATHER, a computationally efficient algorithm to calculate a specific variant of these characteristic functions where the probability weights of the characteristic function are defined as the transition probabilities of random walks. We argue that features extracted by this procedure are useful for node level machine learning tasks. We discuss the pooling of these node representations, resulting in compact descriptors of graphs that can serve as features for graph classification algorithms. We analytically prove that FEATHER describes isomorphic graphs with the same representation and exhibits robustness to data corruption. Using the node feature characteristic functions we define parametric models where evaluation points of the functions are learned parameters of supervised classifiers. Experiments on real world large datasets show that our proposed algorithm creates high quality representations, performs transfer learning efficiently, exhibits robustness to hyperparameter changes and scales linearly with the input size.

Supplementary Material

MP4 File (3340531.3411866.mp4)

In this paper, we propose a flexible notion of characteristic functions defined on graph vertices to describe the distribution of vertex features at multiple scales. We introduce FEATHER, a computationally efficient algorithm to calculate a specific variant of these characteristic functions where the probability weights of the characteristic function are defined as the transition probabilities of random walks. We argue that features extracted by this procedure are useful for node level machine learning tasks. We discuss the pooling of these node representations, resulting in compact descriptors of graphs that can serve as features for graph classification algorithms. We analytically prove that FEATHER describes isomorphic graphs with the same representation and exhibits robustness to data corruption. Using the node feature characteristic functions we define parametric models where evaluation points of the functions are learned parameters of supervised classifiers.

Download
14.38 MB

References

[1]

Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Kristina Lerman, Hrayr Harutyunyan, Greg Ver Steeg, and Aram Galstyan. 2019. MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing. In International Conference on Machine Learning.

[2]

Nesreen K Ahmed, Ryan A Rossi, John Boaz Lee, Theodore L Willke, Rong Zhou, Xiangnan Kong, and Hoda Eldardiry. 2019. role2vec: Role-based network embeddings. In Proc. DLG KDD.

[3]

Shaosheng Cao, Wei Lu, and Qiongkai Xu. 2015. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM international on conference on information and knowledge management. ACM, 891--900.

Digital Library

[4]

Hong Chen and Hisashi Koga. 2019. GL2vec: Graph Embedding Enriched by Line Graphs with Edge Features. In International Conference on Neural Information Processing. Springer, 3--14.

Digital Library

[5]

Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh. 2019. Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks. In International Conference on Knowledge Discovery and Data Mining.

Digital Library

[6]

Edith Cohen, Daniel Delling, Thomas Pajor, and Renato F Werneck. 2014. Distance-Based Influence in Networks: Computation and Maximization. arXiv preprint arXiv:1410.6976 (2014).

[7]

Anirban DasGupta. 2011. Characteristic Functions and Applications. 293--322.

[8]

Nathan de Lara and Pineau Edouard. 2018. A simple baseline algorithm for graph classification. In Advances in Neural Information Processing Systems.

[9]

Aaron Defazio, Francis Bach, and Simon Lacoste-Julien. 2014. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In Advances in neural information processing systems. 1646--1654.

Digital Library

[10]

Claire Donnat, Marinka Zitnik, David Hallac, and Jure Leskovec. 2018. Learning structural node embeddings via diffusion wavelets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1320--1329.

Digital Library

[11]

Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.

[12]

Feng Gao, Guy Wolf, and Matthew Hirn. 2019. Geometric Scattering for Graph Data Analysis. In Proceedings of the 36th International Conference on Machine Learning, Vol. 97. 2122--2131.

[13]

Hongyang Gao and Shuiwang Ji. 2019. Graph U-nets. In Proceedings of The 36th International Conference on Machine Learning.

[14]

Thomas G"artner, Peter Flach, and Stefan Wrobel. 2003. On graph kernels: Hardness results and efficient alternatives. In Learning theory and kernel machines. Springer, 129--143.

[15]

Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 855--864.

Digital Library

[16]

N. Halko, P. G. Martinsson, and J. A. Tropp. 2011. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions. SIAM Rev., Vol. 53, 2 (2011), 217--288.

Digital Library

[17]

Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive Representation Learning on Large Graphs. In Advances in Neural Information Processing Systems.

[18]

Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. Rolx: structural role extraction & mining in large graphs. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. 1231--1239.

Digital Library

[19]

Tamás Horváth, Thomas G"artner, and Stefan Wrobel. 2004. Cyclic pattern kernels for predictive graph mining. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. 158--167.

Digital Library

[20]

George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on scientific Computing, Vol. 20, 1 (1998), 359--392.

Digital Library

[21]

Diederik Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations.

[22]

Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations.

[23]

Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. 2019. Predict then Propagate: Graph Neural Networks meet Personalized PageRank. In International Conference on Learning Representations.

[24]

Junhyun Lee, Inyeop Lee, and Jaewoo Kang. 2019. Self-Attention Graph Pooling. In Proceedings of the 36th International Conference on Machine Learning.

[25]

Lizi Liao, Xiangnan He, Hanwang Zhang, and Tat-Seng Chua. 2018. Attributed Social Network Embedding. IEEE Transactions on Knowledge and Data Engineering, Vol. 30, 12 (2018), 2257--2270.

Digital Library

[26]

Annamalai Narayanan, Mahinthan Chandramohan, Rajasekar Venkatesan, Lihui Chen, and Yang Liu. 2017. graph2vec: Learning distributed representations of graphs. (2017).

[27]

Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. 2016. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 1105--1114.

Digital Library

[28]

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-learn: Machine learning in Python. Journal of machine learning research, Vol. 12, Oct (2011), 2825--2830.

Digital Library

[29]

Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 701--710.

Digital Library

[30]

Bryan Perozzi, Vivek Kulkarni, Haochen Chen, and Steven Skiena. 2017. Don't Walk, Skip!: online learning of multi-scale network embeddings. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017. ACM, 258--265.

Digital Library

[31]

Bryan Perozzi and Steven Skiena. 2015. Exact age prediction in social networks. In Proceedings of the 24th International Conference on World Wide Web. 91--92.

Digital Library

[32]

Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Kuansan Wang, and Jie Tang. 2018. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 459--467.

Digital Library

[33]

Benedek Rozemberczki, Carl Allen, and Rik Sarkar. 2019 a. Multi-scale Attributed Node Embedding. arXiv preprint arXiv:1909.13021 (2019).

[34]

Benedek Rozemberczki, Ryan Davies, Rik Sarkar, and Charles Sutton. 2019 b. GEMSEC: Graph Embedding with Self Clustering. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2019. ACM, 65--72.

Digital Library

[35]

Benedek Rozemberczki, Oliver Kiss, and Rik Sarkar. 2020. Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs. In Proceedings of the 29th ACM International on Conference on Information and Knowledge Management (CIKM '20). ACM.

Digital Library

[36]

Benedek Rozemberczki and Rik Sarkar. 2018. Fast Sequence-Based Embedding with Diffusion Graphs. In International Workshop on Complex Networks. Springer, 99--107.

[37]

Nino Shervashidze, Pascal Schweitzer, Erik Jan Van Leeuwen, Kurt Mehlhorn, and Karsten M Borgwardt. 2011. Weisfeiler-lehman graph kernels. Journal of Machine Learning Research, Vol. 12, 77 (2011), 2539--2561.

Digital Library

[38]

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res., Vol. 15, 1 (2014), 1929--1958.

Digital Library

[39]

Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. 2015. Line: Large-scale information network embedding. In Proceedings of the 24th international conference on world wide web. 1067--1077.

Digital Library

[40]

Anton Tsitsulin, Davide Mottin, Panagiotis Karras, Alexander Bronstein, and Emmanuel Müller. 2018. Netlsd: hearing the shape of a graph. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2347--2356.

Digital Library

[41]

Petar Velivc ković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph Attention Networks. In International Conference on Learning Representations.

[42]

Saurabh Verma and Zhi-Li Zhang. 2017. Hunt for the unique, stable, sparse and fast feature learning on graphs. In Advances in Neural Information Processing Systems. 88--98.

[43]

Felix Wu, Amauri Souza, Tianyi Zhang, Christopher Fifty, Tao Yu, and Kilian Weinberger. 2019. Simplifying Graph Convolutional Networks. In International Conference on Machine Learning.

[44]

Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, and Edward Chang. 2015. Network representation learning with rich text information. In Twenty-Fourth International Joint Conference on Artificial Intelligence.

Digital Library

[45]

Hong Yang, Shirui Pan, Peng Zhang, Ling Chen, Defu Lian, and Chengqi Zhang. 2018. Binarized attributed network embedding. In 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 1476--1481.

[46]

Shuang Yang and Bo Yang. 2018. Enhanced Network Embedding with Text Information. In 2018 24th International Conference on Pattern Recognition (ICPR). IEEE, 326--331.

[47]

Daokun Zhang, Jie Yin, Xingquan Zhu, and Chengqi Zhang. 2018b. SINE: Scalable Incomplete Network Embedding. In 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 737--746.

[48]

Muhan Zhang, Zhicheng Cui, Marion Neumann, and Yixin Chen. 2018a. An End-to-End Deep Learning Architecture for Graph Classification. In AAAI. 4438--4445.

Cited By

Lu HWang LMa XCheng JZhou M(2025)A survey of graph neural networks and their industrial applicationsNeurocomputing10.1016/j.neucom.2024.128761614(128761)Online publication date: Jan-2025
https://doi.org/10.1016/j.neucom.2024.128761
Wang SLiu WChen LZong S(2025)Influence maximization based on discrete particle swarm optimization on multilayer networkInformation Systems10.1016/j.is.2024.102466127(102466)Online publication date: Jan-2025
https://doi.org/10.1016/j.is.2024.102466
Ahmad WWang B(2025)A learning-based influence maximization framework for complex networks via K-core hierarchies and reinforcement learningExpert Systems with Applications10.1016/j.eswa.2024.125393259(125393)Online publication date: Jan-2025
https://doi.org/10.1016/j.eswa.2024.125393
Show More Cited By

Index Terms

Characteristic Functions on Graphs: Birds of a Feather, from Statistical Descriptors to Parametric Models
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning

Recommendations

On augmenting topological graph representations for attributed graphs▪
Abstract
Graph representations based on embedding methods allow for easier analysis of the network structure and can be used for a variety of tasks, such as link prediction and node classification. These methods have been shown to be effective ...
Highlights
- This study proposes an attributed graph augmentation framework for attributed graphs
Bipartite graph capsule network
Abstract
Graphs have been widely adopted in various fields, where many graph models are developed. Most of previous research focuses on unipartite or homogeneous graph analysis. In this graphs, the relationships between the same type of entities are ...
The nonorientable genus of joins of complete graphs with large edgeless graphs

We show that for n=4 and n>=6, K"n has a nonorientable embedding in which all the facial walks are hamilton cycles. Moreover, when n is odd there is such an embedding that is 2-face-colorable. Using these results we consider the join of an edgeless ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management

October 2020

3619 pages

ISBN:9781450368599

DOI:10.1145/3340531

General Chairs:
Mathieu d'Aquin
DSI, Insight, NUI Galway, Ireland
,
Stefan Dietze
GESIS, Cologne, Germany, Heinrich-Heine-University Düsseldorf, Germany, L3S Research Center, Germany
,
Program Chairs:
Claudia Hauff
TU Delft, The Netherlands
,
Edward Curry
DSI, Insight, NUI Galway, Ireland
,
Philippe Cudre Mauroux
eXascale, University of Fribourg, Switzerland

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 October 2020

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

EPSRC

Conference

CIKM '20

Sponsor:

CIKM '20: The 29th ACM International Conference on Information and Knowledge Management

October 19 - 23, 2020

Virtual Event, Ireland

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Sponsor:
sigir
sigir

The 34th ACM International Conference on Information and Knowledge Management

November 10 - 14, 2025

Seoul , Republic of Korea

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

168
Total Citations
View Citations
516
Total Downloads

Downloads (Last 12 months)129
Downloads (Last 6 weeks)24

Reflects downloads up to 09 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Lu HWang LMa XCheng JZhou M(2025)A survey of graph neural networks and their industrial applicationsNeurocomputing10.1016/j.neucom.2024.128761614(128761)Online publication date: Jan-2025
https://doi.org/10.1016/j.neucom.2024.128761
Wang SLiu WChen LZong S(2025)Influence maximization based on discrete particle swarm optimization on multilayer networkInformation Systems10.1016/j.is.2024.102466127(102466)Online publication date: Jan-2025
https://doi.org/10.1016/j.is.2024.102466
Ahmad WWang B(2025)A learning-based influence maximization framework for complex networks via K-core hierarchies and reinforcement learningExpert Systems with Applications10.1016/j.eswa.2024.125393259(125393)Online publication date: Jan-2025
https://doi.org/10.1016/j.eswa.2024.125393
Choi IKim W(2024)Enhancing Exchange-Traded Fund Price Predictions: Insights from Information-Theoretic Networks and Node EmbeddingsEntropy10.3390/e2601007026:1(70)Online publication date: 12-Jan-2024
https://doi.org/10.3390/e26010070
叶梦(2024)Identifying Vital Nodes Based on Epidemic Spreading Probability EntropyModeling and Simulation10.12677/mos.2024.13439813:04(4405-4415)Online publication date: 2024
https://doi.org/10.12677/mos.2024.134398
Zhang QLiang F(2024)Bayesian Analysis of Exponential Random Graph Models Using Stochastic Gradient Markov Chain Monte CarloBayesian Analysis10.1214/23-BA136419:2Online publication date: 1-Jun-2024
https://doi.org/10.1214/23-BA1364
Gu HZheng SLiu XXie HLui J(2024)Online Incentive Protocol Design for Reposting Service in Online Social NetworksACM Transactions on the Web10.1145/3696473Online publication date: 19-Sep-2024
https://dl.acm.org/doi/10.1145/3696473
Zhang YMukhopadhyay RChaintreau A(2024)Network Fairness Ambivalence: When does social network capital mitigate or amplify unfairness?Proceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36560178:2(1-28)Online publication date: 29-May-2024
https://dl.acm.org/doi/10.1145/3656017
Wu JDai HKent KYen JXu CWang Y(2024)Open Set Dandelion Network for IoT Intrusion DetectionACM Transactions on Internet Technology10.1145/363982224:1(1-26)Online publication date: 9-Jan-2024
https://dl.acm.org/doi/10.1145/3639822
Yang RShi J(2024)Efficient High-Quality Clustering for Large Bipartite GraphsProceedings of the ACM on Management of Data10.1145/36392782:1(1-27)Online publication date: 26-Mar-2024
https://dl.acm.org/doi/10.1145/3639278
Show More Cited By

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents