Abstract
Semi-supervised network representation learning has become a focus of the graph mining community; it aims to learn low-dimensional vector representations of vertices using partial label information. In particular, graph neural networks integrate structural information with side information such as vertex attributes to learn node representations. Although existing semi-supervised graph learning performs well with limited labeled data, it is still often hampered when the labeled dataset is very small. To mitigate this issue, we propose PMNRL, a pseudo-multitask learning framework for semi-supervised network representation learning that boosts the expressive power of graph networks such as vanilla GCN (Graph Convolutional Networks) and GAT (Graph Attention Networks). In PMNRL, by leveraging the community structures in networks, we create a pseudo task that classifies nodes' community affiliation, and jointly learn two tasks (i.e., the original task and the pseudo task). The proposed scheme exploits the inherent connection between structural proximity and label similarity to improve performance without resorting to more labels. The framework is implemented in two ways: a two-stage method and an end-to-end method. In the two-stage method, communities are first detected, and the community affiliations are then used as "labels" alongside the original labels to train the joint model. In the end-to-end method, unsupervised community learning is integrated into the representation learning process through shared layers and task-specific layers, so as to learn both common and task-specific features simultaneously. Experimental results on three real-world benchmark networks demonstrate that our framework improves the vanilla models without any additional labels, especially when labels are scarce.
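The joint objective sketched in the abstract (a shared propagation layer feeding an original-task head and a community-affiliation head, trained on a weighted sum of the two losses) can be illustrated in a few lines. Everything below is a minimal, hypothetical sketch: the toy graph, the made-up community labels (which in the two-stage variant would come from a community detector such as Louvain), the random head weights, and the weighting factor `lam` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 6 nodes in two obvious communities, 4-dim node features.
# Community labels are illustrative stand-ins for a detector's output.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
X = rng.normal(size=(6, 4))
y_label = np.array([0, 0, 0, 1, 1, 1])   # original task labels
y_comm = np.array([0, 0, 0, 1, 1, 1])    # pseudo task: community affiliation
labeled = np.array([0, 3])               # only a few nodes are labeled

# Shared GCN-style propagation: D^{-1/2} (A + I) D^{-1/2} X
A_hat = A + np.eye(6)
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
P = (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
H = P @ X                                # shared-layer representation

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Task-specific heads (random weights here; training would fit these).
W_label = rng.normal(size=(4, 2))
W_comm = rng.normal(size=(4, 2))
p_label = softmax(H @ W_label)
p_comm = softmax(H @ W_comm)

# Joint objective: cross-entropy on the few labeled nodes for the original
# task, plus a community-affiliation loss on all nodes, weighted by lam.
lam = 0.5
loss_label = -np.mean(np.log(p_label[labeled, y_label[labeled]]))
loss_comm = -np.mean(np.log(p_comm[np.arange(6), y_comm]))
loss = loss_label + lam * loss_comm
print(loss)
```

The design point the sketch captures is that the pseudo task needs no extra annotation: its "labels" come from the graph structure itself, yet gradients from both heads flow through the same shared representation.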
References
Amyar A, Modzelewski R, Li H, Ruan S (2020) Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: classification and segmentation. Comput Biol Med 126:104037
Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828
Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exper 2008(10):P10008
Bo D, Wang X, Shi C, Zhu M, Lu E, Cui P (2020) Structural deep clustering network. In: Proceedings of The Web Conference 2020, pp 1400–1410
Chiang WL, Liu X, Si S, Li Y, Bengio S, Hsieh CJ (2019) Cluster-GCN: an efficient algorithm for training deep and large graph convolutional networks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 257–266
Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111
Cordasco G, Gargano L (2010) Community detection via semi-synchronous label propagation algorithms. In: 2010 IEEE International workshop on: Business applications of social network analysis (BASNA). IEEE, pp 1–8
Hamilton WL (2020) Graph representation learning. Synthesis Lect Artif Intell Mach Learn 14(3):1–159
Hamilton WL, Ying R, Leskovec J (2017) Representation learning on graphs: Methods and applications. arXiv:1709.05584
Huang YA, Chan KC, You ZH, Hu P, Wang L, Huang ZA (2020) Predicting microRNA–disease associations from lncRNA–microRNA interactions via multiview multitask learning. Brief Bioinform
Khosla M, Setty V, Anand A (2019) A comparative study for unsupervised network representation learning. IEEE Trans Knowl Data Eng
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv:1609.02907
Lee JB, Rossi RA, Kong X, Kim S, Koh E, Rao A (2019) Graph convolutional networks with motif-based attention. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp 499–508
Li B, Pi D (2020) Network representation learning: a systematic literature review. Neural Comput Appl:1–33
Liao Q, Ding Y, Jiang ZL, Wang X, Zhang C, Zhang Q (2019) Multi-task deep convolutional neural network for cancer diagnosis. Neurocomputing 348:66–73
Lu G, Gan J, Yin J, Luo Z, Li B, Zhao X (2020) Multi-task learning using a hybrid representation for text classification. Neural Comput Appl 32(11):6467–6480
Lv G, Wang S, Liu B, Chen E, Zhang K (2019) Sentiment classification by leveraging the shared knowledge from a sequence of domains. In: International conference on database systems for advanced applications. Springer, pp 795–811
van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(Nov):2579–2605
Mohan A, Pramod K (2019) Network representation learning: models, methods and applications. SN Appl Sci 1(9):1014
Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76(3):036106
Sen P, Namata G, Bilgic M, Getoor L, Galligher B, Eliassi-Rad T (2008) Collective classification in network data. AI Mag 29(3):93–93
Tran PV (2018) Multi-task graph autoencoders. arXiv:1811.02798
Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y (2018) Graph attention networks. International Conference on Learning Representations. https://openreview.net/forum?id=rJXMpikCZ. Accepted as poster
Veličković P, Fedus W, Hamilton WL, Liò P, Bengio Y, Hjelm RD (2019) Deep graph infomax. In: ICLR (Poster)
Xie J, Girshick R, Farhadi A (2016) Unsupervised deep embedding for clustering analysis. In: International conference on machine learning, pp 478–487
Xie Y, Jin P, Gong M, Zhang C, Yu B (2020) Multi-task network representation learning. Front Neurosci 14
Xu L, Wei X, Cao J, Philip SY (2019) Multi-task network embedding. Int J Data Sci Anal 8(2):183–198
Yang X, Jiang X, Tian C, Wang P, Zhou F, Fujita H (2020) Inverse projection group sparse representation for tumor classification: a low rank variation dictionary approach. Knowl-Based Syst 196:105768
Zhang D, Yin J, Zhu X, Zhang C (2018) Network representation learning: a survey. IEEE Trans Big Data
Zhang Y, Yang Q (2017) A survey on multi-task learning. arXiv:1707.08114
Acknowledgements
This work is partially supported by National Natural Science Foundation of China (No. 61873218), Southwest Petroleum University Innovation Base Funding (No. 642) and Southwest Petroleum University Scientific Research Starting Project (No. 2019QHZ016).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Python code and datasets
The code is implemented in the PyTorch framework (Python) and can be found at https://github.com/roger40/CINS_ML-group/tree/master/Paper%20codes/PMNRL. The three datasets are public and can be obtained from [21] or from the above link.
Cite this article
Wang, B., Dai, Z., Kong, D. et al. Boosting semi-supervised network representation learning with pseudo-multitasking. Appl Intell 52, 8118–8133 (2022). https://doi.org/10.1007/s10489-021-02844-y