DOI: 10.1145/2623330.2623732
research-article

DeepWalk: online learning of social representations

Published: 24 August 2014

Abstract

We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs.
DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data.
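To make the walk-as-sentence idea concrete, the following is a minimal sketch of that pipeline, not the authors' released implementation: it generates truncated random walks with networkx and feeds them as token sequences to a SkipGram Word2Vec model from gensim. The function names, walk length, number of walks per vertex, and embedding dimensionality below are illustrative assumptions.

import random
import networkx as nx
from gensim.models import Word2Vec

def random_walk(graph, start, walk_length=40):
    # One truncated random walk from `start`, returned as string tokens,
    # since Word2Vec treats each token as a vocabulary word.
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = list(graph.neighbors(walk[-1]))
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return [str(v) for v in walk]

def deepwalk_embeddings(graph, num_walks=10, walk_length=40, dims=64, window=5):
    # Build a corpus of walk "sentences": num_walks shuffled passes over all
    # vertices, then fit SkipGram (sg=1) with hierarchical softmax (hs=1).
    walks = []
    nodes = list(graph.nodes())
    for _ in range(num_walks):
        random.shuffle(nodes)
        walks.extend(random_walk(graph, v, walk_length) for v in nodes)
    return Word2Vec(walks, vector_size=dims, window=window,
                    min_count=1, sg=1, hs=1, workers=4)

# Example on Zachary's karate club graph:
# model = deepwalk_embeddings(nx.karate_club_graph())
# vector_for_vertex_0 = model.wv["0"]  # 64-dimensional representation of vertex 0

Treating vertex identifiers as vocabulary tokens is what lets an off-the-shelf word-embedding optimizer produce the continuous vertex representations described above.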
DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real-world applications such as network classification and anomaly detection.
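The multi-label classification results summarized above are typically measured by training a simple classifier on the learned representations with only a fraction of nodes labeled. The sketch below is an assumed evaluation harness using scikit-learn's one-vs-rest logistic regression and micro/macro F1; the split size and names are illustrative, not the authors' exact protocol.

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

def evaluate_embeddings(X, Y, labeled_fraction=0.1, seed=0):
    # X: (n_nodes, dims) embedding matrix; Y: (n_nodes, n_labels) binary label
    # matrix. Assumes every label occurs in the small training split.
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, train_size=labeled_fraction, random_state=seed)
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    clf.fit(X_train, Y_train)          # one binary classifier per label
    Y_pred = clf.predict(X_test)
    return (f1_score(Y_test, Y_pred, average="micro"),
            f1_score(Y_test, Y_pred, average="macro"))

Sweeping labeled_fraction downward mimics the sparse-label regime in which the abstract reports the largest gains over baselines.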

Supplementary Material

MP4 File (p701-sidebyside.mp4)



Information & Contributors

Information

Published In

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN:9781450329569
DOI:10.1145/2623330
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 August 2014


Author Tags

  1. deep learning
  2. latent representations
  3. learning with partial labels
  4. network classification
  5. online learning
  6. social networks

Qualifiers

  • Research-article

Conference

KDD '14

Acceptance Rates

KDD '14 Paper Acceptance Rate: 151 of 1,036 submissions, 15%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1,990
  • Downloads (Last 6 weeks)147
Reflects downloads up to 18 Sep 2024

Citations

Cited By

  • (2025) Restage: Relation Structure-Aware Hierarchical Heterogeneous Graph Embedding. Tsinghua Science and Technology, 30(1):198-214. DOI: 10.26599/TST.2023.9010147. Online publication date: Feb-2025.
  • (2025) SiGNN: A spike-induced graph neural network for dynamic graph representation learning. Pattern Recognition, 158:111026. DOI: 10.1016/j.patcog.2024.111026. Online publication date: Feb-2025.
  • (2025) Collaborative graph neural networks for augmented graphs: A local-to-global perspective. Pattern Recognition, 158:111020. DOI: 10.1016/j.patcog.2024.111020. Online publication date: Feb-2025.
  • (2024) Evolutionary Perturbation Attack on Temporal Link Prediction. Journal of the Physical Society of Japan, 93(7). DOI: 10.7566/JPSJ.93.074002. Online publication date: 15-Jul-2024.
  • (2024) APTAnet: an atom-level peptide-TCR interaction affinity prediction model. Biophysics Reports, 10(1):1. DOI: 10.52601/bpr.2023.230037. Online publication date: 2024.
  • (2024) VCGERG. International Journal of Information Security and Privacy, 18(1):1-21. DOI: 10.4018/IJISP.342596. Online publication date: 15-May-2024.
  • (2024) A Market Convergence Prediction Framework Based on a Supply Chain Knowledge Graph. Sustainability, 16(4):1696. DOI: 10.3390/su16041696. Online publication date: 19-Feb-2024.
  • (2024) Ethereum Phishing Scam Detection Based on Data Augmentation Method and Hybrid Graph Neural Network Model. Sensors, 24(12):4022. DOI: 10.3390/s24124022. Online publication date: 20-Jun-2024.
  • (2024) Machine Learning for Multimodal Mental Health Detection: A Systematic Review of Passive Sensing Approaches. Sensors, 24(2):348. DOI: 10.3390/s24020348. Online publication date: 6-Jan-2024.
  • (2024) SSRES: A Student Academic Paper Social Recommendation Model Based on a Heterogeneous Graph Approach. Mathematics, 12(11):1667. DOI: 10.3390/math12111667. Online publication date: 27-May-2024.