Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

HSEM: highly scalable node embedding for link prediction in very large-scale social networks

  • Published:
World Wide Web Aims and scope Submit manuscript

Abstract

Very large-scale social networks are typically sparse and dynamic and often have millions of nodes and billions of links. Link prediction in very large-scale networks is a challenging task for most existing methods. This paper investigates the link prediction problem in very large-scale networks. We propose a model, namely, a very large-scale network co-occurrence embedding algorithm (Hsem), which learns the co-occurrence features of node pairs to embed nodes into a vector with a lower and fixed dimension. A damping-based random walk algorithm is also developed to extract the hierarchical structural information of node pairs. Moreover, we design an incremental learning algorithm based on Hsem to meet the requirements of highly dynamic evolution in very large-scale networks, which significantly improves learning speed while maintaining accuracy. Finally, we present controlled experiments employing the proposed and baseline methods on eight challenging real-world large-scale datasets with a fixed dropout percentage and calculate the time overhead and receiver operating characteristics. The results show that Hsem performs comparably to current state-of-the-art methods and consistently outperforms the baselines in different experimental settings.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

Notes

  1. http://www.useit.com.cn/thread-17562-1-1.html

  2. http://kdmlab.org/bear

  3. https://github.com/tangjianpku/LINE

  4. http://konect.uni-koblenz.de/networks/Youtube-u-growth

  5. http://konect.uni-koblenz.de/networks/flickr-growth

References

  1. Adamic, L.A., Adar, E.: Friends and neighbors on the web. Soc. Networks 25(3), 211–230 (2003)

    Article  Google Scholar 

  2. Balduini, M., Bozzon, A., Della Valle, E., Huang, Y., Houben, G.J.: Recommending venues using continuous predictive social media analytics. IEEE Internet Comput. 18(5), 28–35 (2014)

    Article  Google Scholar 

  3. Chen, Z., Chen, M., Weinberger, K.Q., Zhang, W.: Marginalized denoising for link prediction and multi-label learning. In: AAAI, pp. 1707–1713 (2015)

  4. Dong, Y., Tang, J., Wu, S., Tian, J., Chawla, N.V., Rao, J., Cao, H.: Link prediction and recommendation across heterogeneous social networks. In: ICDM, pp. 181–190 (2012)

  5. Ermis, B., Acar, E., Cemgil, A.T.: Link prediction in heterogeneous data via generalized coupled tensor factorization. Data Min. Knowl. Disc. 29(1), 203–236 (2015)

    Article  MathSciNet  Google Scholar 

  6. Fire, M., Tenenboim-Chekina, L., Puzis, R., Lesser, O., Rokach, L., Elovici, Y.: Computationally efficient link prediction in a variety of social networks. ACM Trans. Intell. Syst. Technol. 5(1), 10–25 (2013)

    Article  Google Scholar 

  7. Garcia-Gasulla, D., Cortés, U.: Link Prediction in Very Large Directed Graphs: Exploiting Hierarchical Properties in Parallel. KNOW@LOD (2014)

  8. Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 855–864 (2016)

  9. Hisano, R.: Semi-supervised graph embedding approach to dynamic link prediction arXiv preprint arXiv (2016)

  10. Jeh, G., Widom, J.: SimRank: a measure of structural-context similarity. In: KDD, pp. 538–543 (2002)

  11. Katz, L.: A new status index derived from sociometric analysis. Psychometrika (1953)

  12. Konstas, I., Stathopoulos, V., Jose, J.M.: On social networks and collaborative recommendation. In: Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 195–202 (2009)

  13. Lefakis, L., Fleuret, F.: Reservoir Boosting : Between Online and Offline Ensemble Learning, pp. 1412–1420 (2013)

  14. Li, J., Dani, H., Hu, X., Tang, J., Chang, Y., Liu, H.: Attributed network embedding for learning in a dynamic environment. arXiv:1706.01860 (2017)

  15. Li, J., Xia, F., Wang, W., Chen, Z., Asabere, N. Y., Jiang, H.: ACRec: a co-authorship based random walk model for academic collaboration recommendation. In: Proceedings of the companion publication of the 23rd international conference on World wide web companion (2014)

  16. Liao, L., He, X., Zhang, H., Chua, T.: Attributed social network embedding. arXiv:1705.04969 (2017)

  17. Liben-Nowell, D., Kleinberg, J.M.: The link-prediction problem for social networks. JASIST 58(7), 1019–1031 (2007)

    Article  Google Scholar 

  18. Lichtenwalter, R.N., Lussier, J.T., Chawla, N.V.: New Perspectives and Methods in Link Prediction. In: SIGKDD, pp. 243–252 (2010)

  19. Lü, L., Jin, C. H.: Zhou, T.: Similarity index based on local paths for link prediction of complex networks. Phys. Rev. E 80(4), 046,122 (2009)

    Article  Google Scholar 

  20. Lü, L., Zhou, T.: Link prediction in complex networks: a survey. Phys. A Stat. Mech. Appl. 390(6), 1150–1170 (2011)

    Article  Google Scholar 

  21. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv.org cs.CL, pp. 1–12 (2013)

  22. Morin, F., Bengio, Y.: Hierarchical probabilistic neural network language model. In: Proceedings of the international workshop on artificial intelligence and statistics, pp. 246–252 (2005)

  23. Nguyen, C.H., Mamitsuka, H.: Latent feature kernels for link prediction on sparse graphs. IEEE Trans. Neural Netw. Learn. Syst. 23(11), 1793–1804 (2012)

    Article  Google Scholar 

  24. Pan, J., Yang, H., Faloutsos, C., Duygulu, P.: Automatic multimedia cross-modal correlation discovery. In: SIGKDD, pp. 653–658 (2004)

  25. Papadimitriou, A., Symeonidis, P., Manolopoulos, Y.: Fast and accurate link prediction in social networking systems. J. Syst. Softw. 85(9), 2119–2132 (2012)

    Article  Google Scholar 

  26. Raymond, R., Kashima, H.: Fast and Scalable Algorithms for Semi-Supervised Link Prediction on Static and Dynamic Graphs. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) Machine Learning and Knowledge Discovery in Databases, pp. 131–147. Springer, Berlin (2010)

    Chapter  Google Scholar 

  27. Sarkar, P., Chakrabarti, D., Jordan, M.: Nonparametric link prediction in large scale dynamic networks. Electron J Stat 8(2), 2022–2065 (2014)

    Article  MathSciNet  Google Scholar 

  28. Shin, K., Jung, J., Lee, S., Kang, U.: BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs. In: SIGMOD, pp. 1571–1585 (2015)

  29. Symeonidis, P., Perentis, C.: Link prediction in multi-modal social networks. In: ECML/PKDD, vol. 8726, pp. 147–162 (2014)

    Chapter  Google Scholar 

  30. Tabourier, L., Libert, A.S., Lambiotte, R.: RankMerging: learning to rank in large-scale social networks. In: ECML-PKDD Workshop (2014)

  31. Tang, J., Chang, S., Aggarwal, C.C., Liu, H.: Negative link prediction in social media. In: WSDM, pp. 87–96 (2015)

  32. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: Large-Scale Information Network Embedding. In: Proceedings of the 24Th International Conference on World Wide Web, pp. 1067–1077 (2015)

  33. Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 1225–1234 (2016)

  34. Wang, P., Xu, B., Wu, Y., Zhou, X.: Link prediction in social networks: the state-of-the-art. Sci. China Inf. Sci. 58(1), 1–38 (2015)

    Google Scholar 

  35. Wang, X., He, D., Chen, D., Xu, J.: Clustering-based collaborative filtering for link prediction. In: AAAI, pp. 332–338 (2015)

  36. Wu, Z., Li, Y.: Link prediction based on multi-steps resource allocation. In: WI-IAT, pp. 355–360 (2014)

  37. Xia, F., Chen, Z., Wang, W., Li, J.: MVCWalker: Random walk-based most valuable collaborators recommendation exploiting academic factors. IEEE Trans. Emerg. Topics Comput. 2(3), 364–375 (2014)

    Article  Google Scholar 

  38. Zhang, J., Fang, Z., Chen, W., Tang, J.: Diffusion of following links in microblogging networks. IEEE Trans. Knowl. Data Eng. 27(8), 2093–2106 (2015)

    Article  Google Scholar 

  39. Zhou, T., Lü, L., Zhang, Y.C.: Predicting missing links via local information. Europ. Phys. J. B. 71(4), 623–630 (2009)

    Article  Google Scholar 

  40. Zhou, X., Liang, X., Zhang, H., Ma, Y.: Cross-platform identification of anonymous identical users in multiple social media networks. IEEE Trans. Knowl. Data Eng. 28(2), 411–424 (2016)

    Article  Google Scholar 

  41. Zhu, L., Guo, D., Yin, J., Steeg, G.V., Galstyan, A.: Scalable temporal latent space inference for link prediction in dynamic social networks. IEEE Trans. Knowl. Data Eng. 28(10), 2765–2777 (2016)

    Article  Google Scholar 

Download references

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant Nos. 71531012 and 71271211), the Natural Science Foundation of Beijing (Grant No. 4172032), the Outstanding Innovative Talents Cultivation Funded Programs 2018 of Renmin University of China, and the Opening Project of State Key Laboratory of Digital Publishing Technology of Founder Group.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xun Liang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Social Computing and Big Data Applications

Guest Editors: Xiaoming Fu, Hong Huang, Gareth Tyson, Lu Zheng, and Gang Wang

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhiyuli, A., Liang, X. & Chen, Y. HSEM: highly scalable node embedding for link prediction in very large-scale social networks. World Wide Web 22, 2799–2824 (2019). https://doi.org/10.1007/s11280-018-0649-z

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11280-018-0649-z

Keywords

Navigation