Abstract
Very large-scale social networks are typically sparse and dynamic and often have millions of nodes and billions of links. Link prediction in very large-scale networks is a challenging task for most existing methods. This paper investigates the link prediction problem in very large-scale networks. We propose a model, namely, a very large-scale network co-occurrence embedding algorithm (Hsem), which learns the co-occurrence features of node pairs to embed nodes into a vector with a lower and fixed dimension. A damping-based random walk algorithm is also developed to extract the hierarchical structural information of node pairs. Moreover, we design an incremental learning algorithm based on Hsem to meet the requirements of highly dynamic evolution in very large-scale networks, which significantly improves learning speed while maintaining accuracy. Finally, we present controlled experiments employing the proposed and baseline methods on eight challenging real-world large-scale datasets with a fixed dropout percentage and calculate the time overhead and receiver operating characteristics. The results show that Hsem performs comparably to current state-of-the-art methods and consistently outperforms the baselines in different experimental settings.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Adamic, L.A., Adar, E.: Friends and neighbors on the web. Soc. Networks 25(3), 211–230 (2003)
Balduini, M., Bozzon, A., Della Valle, E., Huang, Y., Houben, G.J.: Recommending venues using continuous predictive social media analytics. IEEE Internet Comput. 18(5), 28–35 (2014)
Chen, Z., Chen, M., Weinberger, K.Q., Zhang, W.: Marginalized denoising for link prediction and multi-label learning. In: AAAI, pp. 1707–1713 (2015)
Dong, Y., Tang, J., Wu, S., Tian, J., Chawla, N.V., Rao, J., Cao, H.: Link prediction and recommendation across heterogeneous social networks. In: ICDM, pp. 181–190 (2012)
Ermis, B., Acar, E., Cemgil, A.T.: Link prediction in heterogeneous data via generalized coupled tensor factorization. Data Min. Knowl. Disc. 29(1), 203–236 (2015)
Fire, M., Tenenboim-Chekina, L., Puzis, R., Lesser, O., Rokach, L., Elovici, Y.: Computationally efficient link prediction in a variety of social networks. ACM Trans. Intell. Syst. Technol. 5(1), 10–25 (2013)
Garcia-Gasulla, D., Cortés, U.: Link Prediction in Very Large Directed Graphs: Exploiting Hierarchical Properties in Parallel. KNOW@LOD (2014)
Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 855–864 (2016)
Hisano, R.: Semi-supervised graph embedding approach to dynamic link prediction arXiv preprint arXiv (2016)
Jeh, G., Widom, J.: SimRank: a measure of structural-context similarity. In: KDD, pp. 538–543 (2002)
Katz, L.: A new status index derived from sociometric analysis. Psychometrika (1953)
Konstas, I., Stathopoulos, V., Jose, J.M.: On social networks and collaborative recommendation. In: Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 195–202 (2009)
Lefakis, L., Fleuret, F.: Reservoir Boosting : Between Online and Offline Ensemble Learning, pp. 1412–1420 (2013)
Li, J., Dani, H., Hu, X., Tang, J., Chang, Y., Liu, H.: Attributed network embedding for learning in a dynamic environment. arXiv:1706.01860 (2017)
Li, J., Xia, F., Wang, W., Chen, Z., Asabere, N. Y., Jiang, H.: ACRec: a co-authorship based random walk model for academic collaboration recommendation. In: Proceedings of the companion publication of the 23rd international conference on World wide web companion (2014)
Liao, L., He, X., Zhang, H., Chua, T.: Attributed social network embedding. arXiv:1705.04969 (2017)
Liben-Nowell, D., Kleinberg, J.M.: The link-prediction problem for social networks. JASIST 58(7), 1019–1031 (2007)
Lichtenwalter, R.N., Lussier, J.T., Chawla, N.V.: New Perspectives and Methods in Link Prediction. In: SIGKDD, pp. 243–252 (2010)
Lü, L., Jin, C. H.: Zhou, T.: Similarity index based on local paths for link prediction of complex networks. Phys. Rev. E 80(4), 046,122 (2009)
Lü, L., Zhou, T.: Link prediction in complex networks: a survey. Phys. A Stat. Mech. Appl. 390(6), 1150–1170 (2011)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv.org cs.CL, pp. 1–12 (2013)
Morin, F., Bengio, Y.: Hierarchical probabilistic neural network language model. In: Proceedings of the international workshop on artificial intelligence and statistics, pp. 246–252 (2005)
Nguyen, C.H., Mamitsuka, H.: Latent feature kernels for link prediction on sparse graphs. IEEE Trans. Neural Netw. Learn. Syst. 23(11), 1793–1804 (2012)
Pan, J., Yang, H., Faloutsos, C., Duygulu, P.: Automatic multimedia cross-modal correlation discovery. In: SIGKDD, pp. 653–658 (2004)
Papadimitriou, A., Symeonidis, P., Manolopoulos, Y.: Fast and accurate link prediction in social networking systems. J. Syst. Softw. 85(9), 2119–2132 (2012)
Raymond, R., Kashima, H.: Fast and Scalable Algorithms for Semi-Supervised Link Prediction on Static and Dynamic Graphs. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) Machine Learning and Knowledge Discovery in Databases, pp. 131–147. Springer, Berlin (2010)
Sarkar, P., Chakrabarti, D., Jordan, M.: Nonparametric link prediction in large scale dynamic networks. Electron J Stat 8(2), 2022–2065 (2014)
Shin, K., Jung, J., Lee, S., Kang, U.: BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs. In: SIGMOD, pp. 1571–1585 (2015)
Symeonidis, P., Perentis, C.: Link prediction in multi-modal social networks. In: ECML/PKDD, vol. 8726, pp. 147–162 (2014)
Tabourier, L., Libert, A.S., Lambiotte, R.: RankMerging: learning to rank in large-scale social networks. In: ECML-PKDD Workshop (2014)
Tang, J., Chang, S., Aggarwal, C.C., Liu, H.: Negative link prediction in social media. In: WSDM, pp. 87–96 (2015)
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: Large-Scale Information Network Embedding. In: Proceedings of the 24Th International Conference on World Wide Web, pp. 1067–1077 (2015)
Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 1225–1234 (2016)
Wang, P., Xu, B., Wu, Y., Zhou, X.: Link prediction in social networks: the state-of-the-art. Sci. China Inf. Sci. 58(1), 1–38 (2015)
Wang, X., He, D., Chen, D., Xu, J.: Clustering-based collaborative filtering for link prediction. In: AAAI, pp. 332–338 (2015)
Wu, Z., Li, Y.: Link prediction based on multi-steps resource allocation. In: WI-IAT, pp. 355–360 (2014)
Xia, F., Chen, Z., Wang, W., Li, J.: MVCWalker: Random walk-based most valuable collaborators recommendation exploiting academic factors. IEEE Trans. Emerg. Topics Comput. 2(3), 364–375 (2014)
Zhang, J., Fang, Z., Chen, W., Tang, J.: Diffusion of following links in microblogging networks. IEEE Trans. Knowl. Data Eng. 27(8), 2093–2106 (2015)
Zhou, T., Lü, L., Zhang, Y.C.: Predicting missing links via local information. Europ. Phys. J. B. 71(4), 623–630 (2009)
Zhou, X., Liang, X., Zhang, H., Ma, Y.: Cross-platform identification of anonymous identical users in multiple social media networks. IEEE Trans. Knowl. Data Eng. 28(2), 411–424 (2016)
Zhu, L., Guo, D., Yin, J., Steeg, G.V., Galstyan, A.: Scalable temporal latent space inference for link prediction in dynamic social networks. IEEE Trans. Knowl. Data Eng. 28(10), 2765–2777 (2016)
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. 71531012 and 71271211), the Natural Science Foundation of Beijing (Grant No. 4172032), the Outstanding Innovative Talents Cultivation Funded Programs 2018 of Renmin University of China, and the Opening Project of State Key Laboratory of Digital Publishing Technology of Founder Group.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Special Issue on Social Computing and Big Data Applications
Guest Editors: Xiaoming Fu, Hong Huang, Gareth Tyson, Lu Zheng, and Gang Wang
Rights and permissions
About this article
Cite this article
Zhiyuli, A., Liang, X. & Chen, Y. HSEM: highly scalable node embedding for link prediction in very large-scale social networks. World Wide Web 22, 2799–2824 (2019). https://doi.org/10.1007/s11280-018-0649-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11280-018-0649-z