Sub2Vec: Feature Learning for Subgraphs

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10938)

Abstract

Network embeddings have become a popular way to learn effective feature representations of networks. Motivated by the recent success of embeddings in natural language processing, researchers have sought network embeddings that allow machine learning algorithms to be applied to mining tasks such as node classification and edge prediction. However, most of this work focuses on distributed representations of nodes, which are inherently ill-suited to tasks such as community detection that intuitively depend on subgraphs. Here, we formulate the subgraph embedding problem based on two intuitive properties of subgraphs and propose Sub2Vec, an unsupervised algorithm that learns feature representations of arbitrary subgraphs. We also demonstrate the usability of Sub2Vec by applying it to network mining tasks such as community detection and graph classification. We show that Sub2Vec achieves significant gains over state-of-the-art methods. In particular, Sub2Vec offers a way to generate a richer vocabulary of meaningful subgraph features for representation and reasoning.

Notes

  1. Code in Python available at: https://goo.gl/Ef4q8g.

  2. http://mlcb.is.tuebingen.mpg.de/Mitarbeiter/Nino/Graphkernels/.

  3. snap.stanford.edu.

Acknowledgements

This paper is based on work partially supported by the NSF (CAREER-IIS-1750407, DGE-1545362, and IIS-1633363), the NEH (HG-229283-15), ORNL (Task Order 4000143330), the Maryland Procurement Office (H98230-14-C-0127), and a Facebook faculty gift.

Corresponding author

Correspondence to Bijaya Adhikari.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

Cite this paper

Adhikari, B., Zhang, Y., Ramakrishnan, N., Prakash, B.A. (2018). Sub2Vec: Feature Learning for Subgraphs. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science (LNAI), vol 10938. Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_14

  • DOI: https://doi.org/10.1007/978-3-319-93037-4_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93036-7

  • Online ISBN: 978-3-319-93037-4

  • eBook Packages: Computer Science (R0)
