Abstract
Network embeddings have become very popular in learning effective feature representations of networks. Motivated by the recent successes of embeddings in natural language processing, researchers have tried to find network embeddings in order to exploit machine learning algorithms for mining tasks like node classification and edge prediction. However, most of the work focuses on distributed representations of nodes that are inherently ill-suited to tasks such as community detection which are intuitively dependent on subgraphs. Here, we formulate subgraph embedding problem based on two intuitive properties of subgraphs and propose Sub2Vec, an unsupervised algorithm to learn feature representations of arbitrary subgraphs. We also highlight the usability of Sub2Vec by leveraging it for network mining tasks, like community detection and graph classification. We show that Sub2Vec gets significant gains over state-of-the-art methods. In particular, Sub2Vec offers an approach to generate a richer vocabulary of meaningful features of subgraphs for representation and reasoning.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Code in Python available at: https://goo.gl/Ef4q8g.
- 2.
- 3.
References
Bach, F.R., Jordan, M.I.: Learning spectral clustering. In: NIPS, vol. 16 (2003)
Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding and clustering. In: NIPS, vol. 14, pp. 585–591 (2001)
Bhagat, S., Cormode, G., Muthukrishnan, S.: Node classification in social networks. In: Aggarwal, C. (ed.) Social Network Data Analytics, pp. 115–148. Springer, Boston (2011). https://doi.org/10.1007/978-1-4419-8462-3_5
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theor. Exp. (2008)
Bousquet, O., Bottou, L.: The tradeoffs of large scale learning. In: NIPS, pp. 161–168 (2008)
Carey, F.A., Sundberg, R.J.: Advanced Organic Chemistry. Part A: Structure and Mechanisms. Springer, New York (2007). https://doi.org/10.1007/978-0-387-44899-2
Cheng, K., Li, J., Liu, H.: Unsupervised feature selection in signed social networks. In: KDD 2017, pp. 777–786. ACM (2017)
Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. National Acad. Sci. 99(12), 7821–7826 (2002)
Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: SIGKDD, pp. 855–864. ACM (2016)
Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML, vol. 14, pp. 1188–1196 (2014)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS, pp. 3111–3119 (2013)
Milojević, S.: Power law distributions in information science: making the case for logarithmic binning. J. Am. Soc. Inf. Sci. Technol. 61, 2417–2425 (2010)
Narayanan, A., Chandramohan, M., Chen, L., Liu, Y., Saminathan, S.: subgraph2vec: Learning distributed representations of rooted sub-graphs from large graphs. arXiv preprint arXiv:1606.08928 (2016)
Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: Online learning of social representations. In: SIGKDD, pp. 701–710. ACM (2014)
Rehurek, R., Sojka, P.: Software framework for topic modelling with large corpora. In: LREC 2010 Workshop on New Challenges for NLP Frameworks. Citeseer (2010)
Ribeiro, L.F., Saverese, P.H., Figueiredo, D.R.: struc2vec: Learning node representations from structural identity. In: KDD 2017, pp. 385–394. ACM (2017)
Riesen, K., Bunke, H.: Graph Classification and Clustering Based on Vector Space Embedding. World Scientific Publishing Co. Inc., River Edge (2010)
Shervashidze, N., Schweitzer, P., Leeuwen, E.J.V., Mehlhorn, K., Borgwardt, K.M.: Weisfeiler-lehman graph kernels. J. Mach. Learn. Res. 12(Sep), 2539–2561 (2011)
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: large-scale information network embedding. In: WWW, pp. 1067–1077. ACM (2015)
Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. science 290(5500), 2319–2323 (2000)
Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: SIGKDD, pp. 1225–1234. ACM (2016)
Wang, X., Cui, P., Wang, J., Pei, J., Zhu, W., Yang, S.: Community Preserving Network Embedding, pp. 203–209 (2017)
Whang, J.J., Dhillon, I.S., Gleich, D.F.: Non-exhaustive, overlapping k-means. In: SDM, pp. 936–944. SIAM (2015)
Yanardag, P., Vishwanathan, S.: Deep graph kernels. In: SIGKDD, pp. 1365–1374. ACM (2015)
Yang, J., Leskovec, J.: Overlapping community detection at scale: a nonnegative matrix factorization approach. In: WSDM, pp. 587–596. ACM (2013)
Acknowledgements
This paper is based on work partially supported by the NSF (CAREER-IIS-1750407, DGE-1545362, and IIS-1633363), the NEH (HG-229283-15), ORNL (Task Order 4000143330) and from the Maryland Procurement Office (H98230-14-C-0127), and a Facebook faculty gift.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Adhikari, B., Zhang, Y., Ramakrishnan, N., Prakash, B.A. (2018). Sub2Vec: Feature Learning for Subgraphs. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science(), vol 10938. Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_14
Download citation
DOI: https://doi.org/10.1007/978-3-319-93037-4_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93036-7
Online ISBN: 978-3-319-93037-4
eBook Packages: Computer ScienceComputer Science (R0)