Abstract
The interactions between humans and their environment, comprising living and non-living entities, can be studied via Social Network Analysis (SNA). Node classification and community detection tasks remain open research problems in SNA; hence, SNA has become an interesting and appealing domain in Artificial Intelligence (AI) research. Immanent facts about social network structures can be effectively harnessed for training AI models in a bid to solve node classification and community detection problems in SNA. Accordingly, crucial aspects such as the individual attributes of spatial social actors, and the underlying patterns of relationships binding these social actors, must be taken into consideration in the course of analyzing the social network; these factors determine the nature and dynamics of a given social network. In this paper, we propose a unique framework, Representation Learning via Knowledge-Graph Embeddings and ConvNet (RLVECN), for studying and extracting meaningful facts from social network structures to aid in node classification as well as community detection tasks. Our proposition utilizes an edge-sampling approach for exploiting features of the social graph, via learning the context of each actor with respect to its neighboring actors/nodes, with the goal of generating a vector-space embedding per actor. Successively, these relatively low-dimensional vector embeddings are fed as input features to a downstream classifier for classification tasks over the social graph/network. Herein, RLVECN has been trained, tested, and evaluated on real-world social networks.
Keywords
- Node classification
- Feature learning
- Feature extraction
- Dimensionality reduction
- Semi-supervised learning
1 Introduction and Related Literature
Humans inhabit a planet comprising several systems and ecosystems, and interaction is a natural phenomenon and characteristic of any given system or ecosystem. Thus, relationships between the constituent entities of a given system/ecosystem are a strategy for survival and are essential for the sustenance of that system/ecosystem. Owing to recent AI advances, these real-world complex systems and ecosystems can be effectively modelled as social network structures for analysis. Social network graphs [22] are intricate structures which present analytical challenges to Machine Learning (ML) and Deep Learning (DL) models because of their dynamic nature, complex links, and occasionally massive size. In this regard, we have proposed a new hybrid, DL-based model (RLVECN) based on an iterative learning approach [1] for solving (node) classification as well as clustering problems in SNA via an edge-sampling strategy.
In SNA, the classification of nodes induces the formation of cluster(s); clusters, in turn, give rise to homophily in social networks. Basically, learning in RLVECN is induced via semi-supervised training. The architecture of RLVECN comprises two (2) distinct Representation Learning (RL) layers, viz: a Knowledge-Graph (Vector) Embeddings (VE) layer and a Convolutional Neural Network (ConvNet) layer [13]. Both of these layers are trained by means of unsupervised learning. These layers are essentially feature-extraction and dimensionality-reduction layers where underlying knowledge and viable facts are automatically extracted from the social network structure [15]. The vector-embedding layer is responsible for projecting the feature representation of the social graph to a q-dimensional real-number space, \( \mathbb {R}^q \). This is done by associating a real-number vector with every unique actor/node in the social network such that the cosine similarity of any given tie/edge (a pair of actors) captures a significant degree of correlation between the two associated actors. Furthermore, the ConvNet layer, which feeds on the vector-embedding layer, is responsible for further extraction of apparent features and/or representations from the social graph. Finally, a classification layer succeeds the RL layers; it is trained by means of supervised learning. The classifier is based on a Neural Network (NN) architecture assembled from multiple, deep layers of stacked perceptrons (NN units) [6]. Every low-dimensional feature (X), extracted by the representation-learning layers, is mapped to a corresponding output label (Y); and these (X, Y) pairs are used to supervise the training of the classifier such that it can effectively and efficiently learn to identify clusters and classify actors within a given social network structure. Hence, the novelties of our research contribution are as stated below:
(1) Proposition of a DL-based and hybrid model, RLVECN, which is aimed at solving node classification problems in SNA.
(2) Detailed benchmarking reports with respect to classic objective functions used for classification tasks.
(3) Comparative analysis, between RLVECN and state-of-the-art methodologies, against standard real-world social networks.
RLVECN is capable of learning the non-linear distributed features enmeshed in a social network [9]. We have evaluated RLVECN against an array of state-of-the-art models and RL methodologies which serve as our baselines, viz:
(i) DeepWalk: Online Learning of Social Representations [19].
(ii) GCN: Semi-Supervised Classification with Graph Convolutional Networks [11].
(iii) LINE: Large-scale Information Network Embedding [25].
(iv) Node2Vec: Scalable Feature Learning for Networks [8].
(v) SDNE: Structural Deep Network Embedding [26].
2 Proposed Methodology and Framework
2.1 Definition of Problem
Definition 1
Social Network, SN: A tuple, as expressed via Eq. 1, comprising a set of actors/vertices, V; a set of ties/edges, E; a metadata function, \( f_V \), which extends the definition of the vertices’ set by mapping it to a given set of attributes, \( V^{\prime } \); and a metadata function, \( f_E \), which extends the definition of the edges’ set by mapping it to a given set of attributes, \( E^{\prime } \). Thus, a graph function, \( G (V, E) \subset SN \).

\( SN = (V, E, f_V, f_E) \)  (1)
Definition 2
Knowledge Graph, KG: \( (\mathbb {E}, \mathbb {R}) \) is a set comprising entities, \( \mathbb {E} \), and relations, \( \mathbb {R} \), between the entities. Thus, a KG [24, 28] is defined via a set of triples, t : (u, p, v) , where \( u, v \in \mathbb {E} \) and \( p \in \mathbb {R} \). Also, a KG [27] can be modelled as a social network, SN, such that: \( \mathbb {E} \rightarrow V \) and \( \mathbb {R} \rightarrow E \) and \( (\mathbb {E}, \mathbb {R}) \vdash f_V, f_E \).
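As a small, self-contained illustration of Definition 2 (the entities and relation names are hypothetical), a triple set and its re-modelling as a social network:

```python
# A set of KG triples t = (u, p, v), re-modelled as SN = (V, E) per Definition 2.
triples = [
    ("alice", "follows", "bob"),
    ("bob", "cites", "carol"),
]

V = {u for (u, _, _) in triples} | {v for (_, _, v) in triples}  # entities -> actors
E = [(u, v) for (u, _, v) in triples]                            # relations -> ties
f_E = {(u, v): p for (u, p, v) in triples}                       # relation type kept as edge metadata
```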
Definition 3
Knowledge-Graph (Vector) Embeddings, X: The vector-space embeddings, X, generated by the embedding layer are based on a mapping function, f, expressed via Eq. 2:

\( f : V \rightarrow X \in \mathbb {R}^q \)  (2)

f projects the representation of the graph’s actors to a q-dimensional real space, \( \mathbb {R}^q \), such that the existent ties between any given pair of actors, \( (u_i, v_j) \), remain preserved via the homomorphism from V to X.
Definition 4
Node Classification: Consider a SN comprising partially labelled actors (or vertices), \( V_{lbl} \subset V : V_{lbl} \rightarrow Y_{lbl} \), and unlabelled vertices defined such that: \( V_{ulb} = V - V_{lbl} \). A node-classification task aims at training a predictive function, \( f : V \rightarrow Y \), that learns to predict the labels, Y, for all actors or vertices, \( V \subset SN \), via knowledge harnessed from the mapping: \( V_{lbl} \rightarrow Y_{lbl} \).
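For instance, a minimal sketch (hypothetical actors and labels) of the partition underlying Definition 4:

```python
# Partially labelled actors supervise a predictor for the unlabelled remainder.
V = ["a", "b", "c", "d"]
Y_lbl = {"a": "class0", "b": "class1"}      # the mapping V_lbl -> Y_lbl
V_lbl = set(Y_lbl)                          # labelled vertices
V_ulb = [v for v in V if v not in V_lbl]    # V - V_lbl: vertices whose labels f must predict
```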
2.2 Proposed Methodology
Our proposition, RLVECN, comprises two (2) distinct Feature Learning (FL) layers and one (1) classification layer (Fig. 2).
Representation Learning - Knowledge-Graph Embeddings Layer: Given a social network, SN, defined by a set of actors/vertices, \( V : U \subset V \, \forall \, \{u_m, v_m\} \in V \), and \( M : m \in M \) denotes the number of unique actors in SN. Additionally, let the ties/edges in SN be defined such that: \( E \subset \{U \times V\} \); where \( u_i \in V \) and \( v_j \in V \) represent a source_vertex and a target_vertex in E, respectively.
The objective function of the vector-embedding layer aims at maximizing the average logarithmic probability of the source_vertex, \( u_i \), being predicted as a neighboring actor to the target_vertex, \( v_j \), with respect to all training pairs, \( \forall \, (u_i, v_j) \in E \). Formally, the function is expressed as in Eq. 3:

\( \frac{1}{|E|} \sum _{(u_i, v_j) \in E} \log Pr(u_i|v_j) \)  (3)
Consequently, in order to compute \( Pr(u_i|v_j) \), we have to quantify the proximity of each target_vertex, \( v_j \), with respect to its source_vertex, \( u_i \). The vector-embedding model measures this adjacency/proximity as the cosine similarity between \( v_j \) and its corresponding \( u_i \); thus, the similarity is calculated as the dot product between the target_vertex and the source_vertex. Mathematically, \( Pr(u_i|v_j) \) is computed via a softmax function as defined in Eq. 4:

\( Pr(u_i|v_j) = \frac{\exp (u_i \cdot v_j)}{\sum _{m=1}^{M} \exp (u_m \cdot v_j)} \)  (4)
Hence, the objective function of our vector-embedding model with respect to the SN is as expressed by Eq. 5:

\( \max \; \frac{1}{|E|} \sum _{(u_i, v_j) \in E} \log \frac{\exp (u_i \cdot v_j)}{\sum _{m=1}^{M} \exp (u_m \cdot v_j)} \)  (5)
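As an illustrative sketch of Eqs. 3, 4, and 5 (PyTorch; the actor count, embedding size, and the use of full-softmax training over a mini-batch of sampled edges are assumptions, not the authors' exact settings):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

M, q = 1000, 128                      # assumed: number of unique actors, embedding size
src_emb = nn.Embedding(M, q)          # one u-vector per actor
tgt_emb = nn.Embedding(M, q)          # one v-vector per actor

def edge_nll(u_idx, v_idx):
    """Negative of Eq. 3 for a mini-batch of sampled edges (u_i, v_j)."""
    v = tgt_emb(v_idx)                       # (B, q)
    scores = v @ src_emb.weight.t()          # dot product with every candidate u: (B, M)
    return F.cross_entropy(scores, u_idx)    # = -mean log softmax(scores)[u_i], i.e. Eq. 4

u_idx = torch.tensor([0, 4, 7])              # source vertices of three sampled edges
v_idx = torch.tensor([1, 2, 3])              # corresponding target vertices
loss = edge_nll(u_idx, v_idx)                # minimizing this maximizes Eq. 5
```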
Representation Learning - ConvNet Layer: This layer comprises three (3) RL or FL operations, namely: convolution, non-linearity, and pooling operations. RLVECN utilizes a one-dimensional (1D) convolution layer [14] which is sandwiched between the vector-embedding and classification layers. Equation 6 expresses the 1D-convolution operation:

\( f_i = (x * k)_i = \sum _{j} k_j \cdot x_{i-j} \)  (6)
where \( f_{i} \) represents a cell/matrix position in the Feature Map; \( k_{j} \) denotes a cell position in the Kernel; and \( x_{i-j} \) denotes a cell/matrix position in the 1D-Input (data) matrix.
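For concreteness, a minimal numpy sketch of Eq. 6, computing a "valid" 1D feature map:

```python
import numpy as np

# Each feature-map cell f_i is the sum of kernel cells k_j multiplied by
# input cells x_{i-j} (discrete 1D convolution, Eq. 6).
def conv1d(x, k):
    n, m = len(x), len(k)
    f = np.zeros(n - m + 1)
    for i in range(m - 1, n):            # positions where every x_{i-j} exists
        f[i - m + 1] = sum(k[j] * x[i - j] for j in range(m))
    return f

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # 1D-Input matrix
k = np.array([0.5, 0.25])                # Kernel
print(conv1d(x, k))                      # Feature Map: [1.25, 2.0, 2.75, 3.5]
```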
The non-linearity operation is a rectified linear unit (ReLU) function which introduces non-linearity after the convolution operation, since real-world problems usually exist in non-linear form(s). As a result, each cell of the rectified feature/activation map is computed via: \( r_i \in R : r_i = g(f_i) = \max (0, f_i) \).
The pooling operation is responsible for reducing the input width of each rectified activation map while retaining its vital properties. In this regard, the Max Pooling function is defined such that the resultant pooled (or downsampled) feature map is generated via: \( p_i \in P : P = h(R) = maxPool(R) \).
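Continuing the numpy sketch above, the non-linearity and pooling operations can be expressed as:

```python
import numpy as np

def relu(f):                                   # r_i = max(0, f_i)
    return np.maximum(0.0, f)

def max_pool(r, width=2):                      # non-overlapping max-pooling windows
    return np.array([r[i:i + width].max()
                     for i in range(0, len(r) - width + 1, width)])

f = np.array([1.25, 2.0, 2.75, 3.5])           # feature map from the conv1d sketch above
p = max_pool(relu(f))                          # pooled feature map: [2.0, 3.5]
```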
Classification - Multi-Layer Perceptron (MLP) Classifier Layer: This is the last layer of our proposed RLVECN’s architecture, and it succeeds the RL layers. The pooled feature maps, generated by the Representation Learning layers, contain high-level features extracted from the constituent actors of the social network structure. Hence, the classification layer utilizes these extracted “high-level features” for identifying clusters, based on the respective classes, contained in the social graph. In this regard, a MLP [5] function is defined as a mathematical function, \( f_{c} \), mapping some set of input values, P, to their respective output labels, Y. In other words, \( Y = f_{c}(P, \varTheta ) \), where \( \varTheta \) denotes a set of parameters. The MLP [4] function learns the values of \( \varTheta \) that result in the best approximation of the decision, Y, for the input set, P. The MLP classifier output is a probability distribution which indicates the likelihood of a feature representation belonging to a particular output class. Our MLP [10] classifier is modelled such that sequential layers of NN units are stacked against each other to form a Deep Neural Network (DNN) structure [3, 16].
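A minimal sketch of such a stacked-perceptron classifier (PyTorch; the input dimension and hidden-layer widths are assumptions):

```python
import torch.nn as nn

def make_mlp(in_dim, n_classes, hidden=(256, 128)):   # hidden widths are assumptions
    layers, prev = [], in_dim
    for h in hidden:
        layers += [nn.Linear(prev, h), nn.ReLU()]     # stacked perceptron (NN) units
        prev = h
    layers += [nn.Linear(prev, n_classes), nn.Softmax(dim=-1)]  # probability distribution Y
    return nn.Sequential(*layers)

f_c = make_mlp(in_dim=64, n_classes=7)   # maps pooled features P to labels Y; 7 classes as in Cora
```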
Node Classification Algorithm: Defined via Algorithm 1.
2.3 Proposed Architecture/Framework
Figure 2 illustrates the architecture of our proposition, RLVECN.
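For illustration, the following hedged PyTorch sketch chains the three layers described above. All dimensions are assumptions, and the per-node input is a simplification (the actual model samples edges, per Algorithm 1):

```python
import torch
import torch.nn as nn

# Sketch of the pipeline: vector embeddings -> 1D ConvNet (conv, ReLU,
# max-pool) -> MLP classifier. Sizes are illustrative, not the authors' settings.
class RLVECNSketch(nn.Module):
    def __init__(self, n_actors, q=128, n_classes=7):
        super().__init__()
        self.embed = nn.Embedding(n_actors, q)        # knowledge-graph (vector) embeddings
        self.conv = nn.Sequential(                    # ConvNet RL layer
            nn.Conv1d(1, 16, kernel_size=3),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        pooled = 16 * ((q - 2) // 2)                  # flattened size after conv + pool
        self.mlp = nn.Sequential(                     # MLP classification layer
            nn.Linear(pooled, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, node_idx):
        x = self.embed(node_idx).unsqueeze(1)         # (B, 1, q): one channel per actor
        h = self.conv(x).flatten(1)                   # extracted high-level features
        return self.mlp(h)                            # class scores (Y)

model = RLVECNSketch(n_actors=2708)                   # e.g. Cora has 2,708 actors
scores = model(torch.tensor([0, 1, 2]))               # classify three actors
```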
3 Data Sets and Materials
3.1 Data Sets
With regard to Table 1 herein, six (6) real-world benchmark social-graph data sets were employed for experimentation and evaluation, viz: Cora [20, 23], CiteSeer [20, 23], Facebook Page-Page webgraph [21], PubMed-Diabetes [17], Internet-Industry partnerships [2, 12], and Terrorists-Relationship [29].
3.2 Data Preprocessing
All benchmark data sets ought to comprise actors and ties already encoded as discrete data (natural-number format). However, the Cora, CiteSeer, Facebook-Page2Page, PubMed-Diabetes, and Terrorists-Relation data sets are made up of nodes and/or edges encoded in mixed formats (categorical and numerical formats). Thus, it is necessary to transcode these non-numeric (categorical) entities to their respective discrete (numeric) data representations, without semantic loss, via an injective function that maps each distinct entry in the categorical-entity domain to a distinct numeric value in the discrete-data codomain, \( f_m : categorical \rightarrow discrete \). Thereafter, the numeric representation of each benchmark data set is normalized, \( f_n : discrete \rightarrow continuous \), prior to training against RLVECN and the benchmark models. Also, only edgelist ties, \( E \subset G \), whose constituent actors are present in the nodelist, \( V \subset G \), were used for training/testing our model.
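A minimal Python sketch of this preprocessing pipeline (the function names and the min-max choice of \( f_n \) are illustrative assumptions):

```python
def transcode(values):
    """f_m: injective map from distinct categorical entries to distinct natural numbers."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

def normalize(xs):
    """f_n: min-max normalization of discrete codes to the continuous range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs] if hi > lo else [0.0] * len(xs)

nodelist = ["paperA", "paperB", "paperC"]
edgelist = [("paperA", "paperB"), ("paperB", "paperD")]       # paperD is absent from the nodelist
edges = [e for e in edgelist                                  # keep only ties whose actors exist in V
         if e[0] in nodelist and e[1] in nodelist]
ids = normalize(transcode(nodelist))                          # discrete -> continuous encoding
```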
4 Experiment, Results, and Discussions
RLVECN’s experimentation setup was tuned in accordance with the hyperparameters shown in Table 2. Our evaluations herein were recorded with reference to a range of objective functions. Thus, Categorical Cross Entropy was employed as the cost/loss function; while the fitness/utility was measured based on the following metrics: Precision (PC), Recall (RC), F-measure or F1-score (F1), Accuracy (AC), and Area Under the Receiver Operating Characteristic Curve (RO). Moreover, the objective functions have been computed against each benchmark data set with regard to the constituent classes (or categories) present in each data set. The Support (SP) represents the number of ground-truth samples per class/category for each data set.
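For illustration, these fitness metrics could be computed with scikit-learn as follows (toy labels and probabilities; only the library calls are real):

```python
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             roc_auc_score)

y_true = [0, 1, 2, 1, 0, 2]                       # ground-truth classes
y_pred = [0, 1, 1, 1, 0, 2]                       # predicted classes
y_prob = [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2], [0.2, 0.5, 0.3],
          [0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]  # per-class probabilities

ac = accuracy_score(y_true, y_pred)                            # AC
pc, rc, f1, sp = precision_recall_fscore_support(y_true, y_pred)  # PC, RC, F1, SP per class
ro = roc_auc_score(y_true, y_prob, multi_class="ovr")          # RO (one-vs-rest AUC)
```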
In a bid to avoid sample bias across-the-board, we have used exactly the same SP for all models, inclusive of the RLVECN model. However, since RLVECN is based on an edge-sampling technique, the SP recorded against the RLVECN model represents the number of edges/ties used for computation, as explained in Algorithm 1. With regard to the standard node-classification tasks herein, the performance of our RLVECN model, benchmarked against five (5) popular baselines (DeepWalk, GCN, LINE, Node2Vec, SDNE) and evaluated on the validation/test samples of the benchmark data sets, is documented in Tables 3, 4, 5, and 6, respectively. Consequently, Fig. 3 graphically shows the learning-progress curves during the node-classification tasks using our proposed RLVECN model when training over the benchmark data sets: the dotted-black lines represent learning progress over the training set, and the dotted-blue lines represent learning progress over the test set.
Tables 3, 4, 5, and 6 clearly tabulate our results as a multi-classification task over the benchmark data sets. Thus, for each class per data set, we have laid emphasis on the F1 (the harmonic mean of the PC and RC metrics) and RO. Therefore, we have highlighted the model which performed best (based on the F1 and RO metrics) for each classification task using a bold font. Additionally, we have employed a point-based ranking standard to ascertain the fittest model for each node-classification task. The model with the highest aggregate of points signifies the fittest model for the specified task, and so on in descending order of aggregate points. Accordingly, as can be seen from our tabular results, our proposed methodology (RLVECN) is at the top with the highest fitness points.
L2 regularization (\( L2 = 0.04 \)) [7] and early stopping [18] were used herein as add-on regularization techniques to overcome the overfitting incurred during RLVECN’s training. Early stopping was applied to RLVECN’s training over the benchmark data sets as follows: 50 epochs (Cora, CiteSeer, Facebook-Page2Page, PubMed-Diabetes, Terrorists-Relation) and 180 epochs (Internet-Industry-Partnership). We have used a mini-batch size of 256 for training and testing/validation so as to ensure that sufficient patterns are extracted by RLVECN during training before its network weights are updated.
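A hedged sketch of this training setup (PyTorch), reusing the model sketch from Sect. 2.3: the L2 coefficient and mini-batch size come from the text above, while the optimizer choice, patience rule, and helper functions are illustrative assumptions:

```python
import torch

# L2 penalty via weight_decay; Adam is an assumed optimizer choice.
optimizer = torch.optim.Adam(model.parameters(), weight_decay=0.04)
best, patience, wait = float("inf"), 5, 0        # patience value is an assumption

for epoch in range(200):
    train_one_epoch(model, optimizer, batch_size=256)   # hypothetical helper
    val_loss = evaluate(model)                          # hypothetical helper
    if val_loss < best:
        best, wait = val_loss, 0
    else:
        wait += 1
        if wait >= patience:                            # early stopping: halt when no improvement
            break
```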
5 Limitations, Conclusion, Future Work, and Acknowledgements
The benchmark models evaluated herein were implemented using their default parameters. We were not able to evaluate GCN [11] against the Facebook Page-Page webgraph, PubMed-Diabetes, and Internet-Industry-Partnership data sets, because none of these data sets possesses the vectorized feature set required by the GCN model for input-data processing. Overall, RLVECN’s remarkable performance with respect to our benchmarking results can be attributed to the following:
(1) The RL kernel of RLVECN is constituted of two (2) distinct layers of FL, viz: Knowledge-Graph Embeddings and ConvNet [13].
(2) The high-quality data preprocessing techniques employed herein with respect to the benchmark data sets. We ensured that all constituent actors of a given social graph were transcoded to their respective discrete data representations, without any loss in semantics, and normalized prior to training and/or testing.
As future work, we intend to expand our experimentation scope to include more social network data sets and benchmark models. This research was made possible by International Business Machines (IBM), SHARCNET, and Compute Canada (www.computecanada.ca).
References
Aggarwal, C.C. (ed.): Social Network Data Analytics. Springer, Boston (2011). https://doi.org/10.1007/978-1-4419-8462-3
Batagelj, V., Doreian, P., Ferligoj, A., Kejzar, N. (eds.): Understanding Large Temporal Networks and Spatial Networks: Exploration, Pattern Searching, Visualization and Network Evolution. Wiley, Hoboken (2014)
Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2, 1–113 (2009)
Deng, L.M., Yu, D.H.: Deep Learning: Methods and Applications. Foundations and trends in signal processing. Now Publishers (2014)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
Goodfellow, I.G., Bengio, Y., Courville, A.C. (eds.): Deep Learning. MIT Press, Cambridge (2017)
Géron, A. (ed.): Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media Inc., Newton (2017)
Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016, pp. 855–864 (2016)
Hinton, G.E.: Learning multiple layers of representation. TRENDS Cognit. Sci. 11(10), 428–433 (2007)
Hinton, G.E., et al.: Deep neural networks for acoustic modeling in speech recognition. IEEE Sig. Process. Mag. 29, 82–97 (2012)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (ICLR), abs/1609.02907 (2017)
Krebs, V.E.: Organizational adaptability quotient. In: IBM Global Services (2008)
Molokwu, B.C.: Event prediction in complex social graphs using one-dimensional convolutional neural network. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI (2019)
Molokwu, B.C.: Event prediction in social graphs using 1-dimensional convolutional neural network. In: Canadian Conference on AI (2019)
Molokwu, B.C., Kobti, Z.: Event prediction in complex social graphs via feature learning of vertex embeddings. In: Gedeon, T., Wong, K.W., Lee, M. (eds.) Neural Information Processing, pp. 573–580. Springer, Cham (2019)
Molokwu, B.C., Kobti, Z.: Spatial event prediction via multivariate time series analysis of neighboring social units using deep neural networks. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2019)
Namata, G., London, B., Getoor, L., Huang, B.: Query-driven active surveying for collective classification. In: Proceedings of the Workshop on Mining and Learning with Graphs, MLG-2012 (2012)
Patterson, J., Gibson, A. (eds.): Deep Learning: A Practitioner’s Approach. O’Reilly Media Inc., Newton (2017)
Perozzi, B., Al-Rfou’, R., Skiena, S.: Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, abs/1403.6652 (2014)
Rossi, R.A., Ahmed, N.K.: The network data repository with interactive graph analytics and visualization. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (2015). http://networkrepository.com
Rozemberczki, B., Allen, C., Sarkar, R.: Multi-scale attributed node embedding (2019). arXiv:1909.13021
Scott, J. (ed.): Social Network Analysis. SAGE Publications Ltd., Newbury Park (2017)
Sen, P., Namata, G., Bilgic, M., Getoor, L., Gallagher, B., Eliassi-Rad, T.: Collective classification in network data. AI Mag. 29, 93–106 (2008)
Tabacof, P., Costabello, L.: Probability calibration for knowledge graph embedding models. In: International Conference on Learning Representations (ICLR), abs/1912.10000 (2020)
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web (2015)
Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016)
Yang, S., Tian, J., Zhang, H., Yan, J., He, H., Jin, Y.: Transms: knowledge graph embedding for complex relations by multidirectional semantics. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI (2019)
Zhang, Q., Sun, Z., Hu, W., Chen, M., Guo, L., Qu, Y.: Multi-view knowledge graph embedding for entity alignment. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI. abs/1906.02390 (2019)
Zhao, B., Sen, P., Getoor, L.: Entity and relationship labeling in affiliation networks. In: Proceedings of the 23rd International Conference on Machine Learning (2006)