CN114861766A - Dynamic link prediction method and system based on multi-granularity evolution - Google Patents
Dynamic link prediction method and system based on multi-granularity evolution Download PDFInfo
- Publication number
- CN114861766A CN114861766A CN202210387628.6A CN202210387628A CN114861766A CN 114861766 A CN114861766 A CN 114861766A CN 202210387628 A CN202210387628 A CN 202210387628A CN 114861766 A CN114861766 A CN 114861766A
- Authority
- CN
- China
- Prior art keywords
- dynamic
- granularity
- graph
- nodes
- evolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 49
- 235000019580 granularity Nutrition 0.000 claims abstract description 44
- 238000000605 extraction Methods 0.000 claims abstract description 13
- 238000013507 mapping Methods 0.000 claims description 10
- 230000007246 mechanism Effects 0.000 claims description 9
- 238000004422 calculation algorithm Methods 0.000 claims description 7
- 238000012549 training Methods 0.000 claims description 7
- 238000004590 computer program Methods 0.000 claims description 6
- 238000000354 decomposition reaction Methods 0.000 claims description 6
- 238000007477 logistic regression Methods 0.000 claims description 5
- 238000010606 normalization Methods 0.000 claims description 4
- 238000000638 solvent extraction Methods 0.000 claims description 2
- 230000004927 fusion Effects 0.000 abstract description 8
- 230000002708 enhancing effect Effects 0.000 abstract 1
- 239000011159 matrix material Substances 0.000 description 12
- 239000013598 vector Substances 0.000 description 8
- 230000000694 effects Effects 0.000 description 6
- 238000010586 diagram Methods 0.000 description 5
- 239000002245 particle Substances 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 230000006870 function Effects 0.000 description 4
- 230000002776 aggregation Effects 0.000 description 3
- 238000004220 aggregation Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 230000003993 interaction Effects 0.000 description 3
- 230000004913 activation Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000005065 mining Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 238000007500 overflow downdraw method Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 239000000284 extract Substances 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 230000001902 propagating effect Effects 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Evolutionary Biology (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Business, Economics & Management (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a dynamic link prediction method and a dynamic link prediction system based on multi-granularity evolution. The method comprises the following steps: dividing the graph under each time slice of the dynamic graph to obtain a multi-granularity subgraph; extracting structural features of nodes from the multi-granularity subgraph; learning a dynamic evolution rule of the graph from the multi-granularity subgraphs to obtain dynamic evolution characteristics of the subgraphs with different granularities, and fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain dynamic evolution characteristics of the nodes; and fusing the structural characteristics of the nodes and the dynamic evolution characteristics of the nodes to obtain a node representation containing space-time characteristics, and predicting a future link according to the node representation containing the space-time characteristics. The invention can mine richer graph information, thereby enhancing the extraction capability of the structural characteristics and improving the accuracy of link prediction; the invention can fully realize the fusion of dynamic characteristics, enhance the learning ability of the dynamic characteristics and further improve the performance of link prediction.
Description
Technical Field
The invention belongs to the technical field of information, and particularly relates to a dynamic link prediction method and system based on multi-granularity evolution.
Background
Many real-world complex information systems can be modeled in graph form, and research on graph data is going into further in recent years. Because the relationship information between the entities is important for the correlation calculation on the graph, and the link prediction can predict the possibility that the relationship exists between the nodes which are not formed, effective information is mined, and the method is one of important tasks in the graph mining field. Meanwhile, graph data in the real world often dynamically evolves along with time, link prediction for a dynamic graph has a research value, the goal is to learn the structural evolution of the graph from past information to predict a future link, and wide attention is attracted. For example, chinese patent (application No. 201910440098.5, application publication No. CN 110413844a) proposes a dynamic link prediction method based on a spatio-temporal attention depth model, in an encoding stage, the method uses an adjacent matrix sequence of different time slices of a dynamic graph as input, uses a spatial attention mechanism to focus on neighboring nodes in a GCN-attention module to update a feature vector of each node under the corresponding time slice, inputs the feature vectors of T time slices into an LSTM-attention module to obtain a hidden layer vector, calculates a context vector for the hidden layer vectors of T times, inputs the context vector into a decoder as a spatio-temporal feature vector, decodes the input spatio-temporal feature vector by the decoder, and outputs a probability matrix obtained by decoding and indicating whether a link exists between a node and a node, that is, thereby realizing prediction of a dynamic link. Chinese patent (application No. 202110280461.9, application publication No. CN 113065974A) proposes a link prediction method based on dynamic network representation learning, which constructs a similarity matrix of a snapshot network by calculating similarity values among nodes of a dynamic network through a similarity-based aggregation strategy; applying the graph convolution neural network to a single snapshot network for feature aggregation, guiding a feature aggregation process by using an adjacency matrix and a similarity matrix, and determining low-dimensional feature representation of a node; and inputting the low-dimensional feature representation of the node into a logistic regression classifier to obtain a link prediction result of the dynamic network.
The existing dynamic link prediction method focuses on extracting the structural features and the dynamic features of a dynamic graph from the whole graph, but the method ignores that the graphs under different layers contain different structures and change rules, and the feature extraction is incomplete, so that the dynamic link prediction effect is not accurate enough. For example, in a social network, different groups are included, such as a group of students, a group of company associates, and so on. The interaction (sending messages and e-mails) among different groups in a period of time forms a dynamic graph, some groups have more interaction and dense structure, and some groups have less interaction and sparse structure. The subgraphs formed by different groups contain different structures and dynamic change modes, and the structural features and the dynamic features of different groups are learned distinctively, so that richer dynamic graph information is mined. If the characteristics are extracted from the whole graph, some noise may be introduced, so that the predicted edges in a dense structure are less than the actual edges, and the predicted edges in a sparse structure are more than the actual edges, so that the overall dynamic link prediction effect is poor.
Disclosure of Invention
Aiming at the defects of the existing link prediction method, the invention provides a dynamic link prediction method and a dynamic link prediction system based on multi-granularity evolution.
The method comprises the steps of dividing a graph under each time slice by a certain strategy to construct a multi-granularity subgraph. Then, a multi-granularity-based structural feature extraction module is designed, features are extracted from subgraphs under different granularities and are fused, and therefore the structural features of the subgraphs are captured. Then, a multi-granularity-based dynamic evolution fusion module is designed for learning the dynamic rules of the graph from the subgraphs under different granularities, and further, the dynamic characteristics are selectively fused by considering different importance of the characteristics under different granularities, and structural characteristics are simultaneously fused, so that node representation containing space-time characteristics is finally obtained, and the relation between future entities is more accurately predicted.
Specifically, the technical scheme adopted by the invention is as follows:
a dynamic link prediction method based on multi-granularity evolution comprises the following steps:
dividing the graph under each time slice of the dynamic graph to obtain a multi-granularity subgraph;
extracting structural features of nodes from the multi-granularity subgraph;
learning a dynamic evolution rule of the graph from the multi-granularity subgraphs to obtain dynamic evolution characteristics of the subgraphs with different granularities, and fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain dynamic evolution characteristics of the nodes;
and fusing the structural characteristics of the nodes and the dynamic evolution characteristics of the nodes to obtain a node representation containing space-time characteristics, and predicting a future link according to the node representation containing the space-time characteristics.
Further, dividing the graph under each time slice by adopting a k-tress subgraph decomposition algorithm to obtain the multi-granularity subgraph.
Further, a graph convolution network based on the disentanglement propagation and mapping operation is adopted and applied to the k-tress subgraph to extract the structural features of the nodes.
Further, the graph convolution network based on the disentanglement propagation and mapping operation has the following formula:
att l =σ(MLP(X l ))l=0,1,2,…,J
X out =softmax(sum(att 0 οX 0 ,…,att J οX J ))
wherein l is the number of GCN layers; j is the maximum depth of propagation;is a node feature, N represents the number of nodes, d represents a feature dimension; MLP denotes full join operation; x l Representing depthA node characteristic when l;the method is used for adaptively adjusting the information which needs to be reserved at different propagation depths of each node; att is represented by ° l Each element in (1) is represented by X with a corresponding d-dimensional node J Multiplying; softmax denotes normalization operation; sum denotes a summation operation; final output X out Is obtained by combining a plurality of propagation layers.
Further, the learning of the dynamic evolution law of the graph from the multi-granularity subgraph is to use RNN to learn the dynamic evolution law of different granularity subgraphs.
Further, the fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain the dynamic evolution characteristics of the nodes includes: and introducing a multi-head self-attention mechanism, learning the weight on each granularity through the self-attention mechanism, multiplying the weight by the node representation of the corresponding granularity, and summing to obtain the fused node dynamic evolution characteristic representation.
Further, the predicting the future link according to the node representation containing the space-time characteristics comprises: and performing Hadamard product operation on two node representations corresponding to the link, predicting the score of the link through the node representations, and training a logistic regression classifier to judge whether the link exists or not so as to predict the link on the graph at the next time.
A dynamic link prediction system based on multi-granularity evolution adopting the method comprises the following steps:
the multi-granularity graph dividing module is used for dividing the graph under each time slice of the dynamic graph to obtain a multi-granularity subgraph;
the multi-granularity structural feature extraction module is used for extracting structural features of the nodes from the multi-granularity subgraph;
the multi-granularity dynamic characteristic learning module is used for learning the dynamic evolution rule of the graph from the multi-granularity subgraph to obtain the dynamic evolution characteristics of the subgraphs with different granularities and fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain the dynamic evolution characteristics of the nodes;
and the dynamic link prediction module is used for fusing the structural characteristics of the nodes and the dynamic evolution characteristics of the nodes to obtain a node representation containing space-time characteristics, and predicting a future link according to the node representation containing the space-time characteristics.
The invention has the following beneficial effects:
1. aiming at the problems that the existing dynamic link prediction method focuses on extracting features from the whole graph and neglects the sub-graph structures with different granularities to have different structural features, the graph is divided into multi-granularity sub-graphs according to a certain strategy, then the features are extracted from the sub-graphs, the sub-graphs with different granularities in the same time slice are connected, the mode of the development from a local dense structure to a global sparse structure is learned, and richer graph information is mined, so that the extraction capability of the structural features is enhanced, and the accuracy of link prediction is improved.
2. Aiming at the problem that dynamic feature learning is not sufficient in a dynamic link prediction method, the invention provides a multi-granularity dynamic feature evolution fusion method, and the dynamic evolution features of the dynamic feature evolution fusion method are learned on a sub-graph time sequence under different granularities, so that the dynamic features of multiple granularities are obtained. Then, different weights are set for the features with different granularities by combining a self-attention mechanism, the contribution of important features is increased, the fusion of dynamic features is fully realized, the learning capability of the dynamic features is enhanced, and the performance of link prediction is further improved.
3. The method and the system can be used in the fields of social networks, recommendation systems and the like. For example, when used in a social network, the nodes in the present invention may represent users in the social network, and the predicted links may represent potential (future) friend relationships between the users in the social network. When the method is used for a recommendation system, the nodes in the method can represent users and commodities in the recommendation system, the predicted links connect the users and the commodities, and potential commodities (potential commodities represent commodities which are loved by the users but not bought yet) can be recommended to the users.
Drawings
FIG. 1 is an example of a k-tress diagram.
FIG. 2 is a schematic flow diagram of the process of the present invention.
Fig. 3 is a schematic diagram of a dynamic link prediction architecture proposed by the method of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to specific examples and the accompanying drawings.
The invention provides a dynamic link prediction method based on multi-granularity evolution, which provides a complete algorithm framework and mainly comprises 4 parts of dynamic multi-granularity map construction, multi-granularity structure feature extraction, multi-granularity dynamic feature evolution fusion and dynamic link prediction, and specifically comprises the following steps:
The invention defines the dynamic graph G as a time series G 1 ,G 2 ,…,G T And (4) separating various structures of the graph at each time from 1 to T by a certain strategy, and excavating abundant structure information with distinguishing force to help improve the link prediction capability. According to requirements, the k-tress subgraph decomposition algorithm in the community mining field is introduced. First, the concepts of support and k-tress subgraphs are introduced, which are defined as follows:
definition 1.support (e), if an edge e in the graph G is in k triangles, the support of the edge is (e) k-2.
Given a graph G and an integer k, the k-tress subgraph is a maximum subgraph G in G, in which the support of all edges is (e) > < k-2.
FIG. 1 shows a decomposed subgraph according to the k-tress definition. Wherein, the left one is the original image and is not changed. By definition, the edge support in a 2-tress subgraph should be greater than 0, meaning that each edge should be in at least 0 triangles, and any edge satisfies this condition, i.e., the graph itself. By analogy, each edge in a 3-tress subgraph is in at least 1 triangle, as shown by the right two. The 4-tress diagram is shown on the right.
Thus, a multi-granularity subgraph divided by k-tress decomposition is obtained. The multi-granularity subgraphs obtained by the k-tress strategy have many excellent characteristics, and the k-tress subgraphs require all edges to be at least in (k-2) triangles. It is known that triangles are the fundamental modules of real-world networks because a link is caused by a triangle closure, which has a high local clustering coefficient. In a social network, a triangle indicates that two friends have a common friend, indicating that the relationship between the three friends is firm and stable. Further, if two people have more friends, their friendship is more stable, and it can be emphasized whether any two friends have a firm relationship. On the link prediction task, common neighbors of nodes are considered in the k-tress strategy, a subgraph with a higher k value has a denser structure and fewer nodes, and intuitively, the more the common neighbors of two nodes are, the more the nodes are likely to become friends or interact with each other, so that the link prediction task has strong advantages.
And applying a k-tress decomposition algorithm to the graph under each time slice to obtain a multi-granularity subgraph sequence corresponding to the time slice.
And 2, learning the multi-granularity structural features.
In order to extract enough features from a multi-granularity map obtained over multiple timeslices to reflect the diversity of the structure, we need to propagate the features in a larger domain. One of the most common ways is GCN (Graph convolution Network), which extracts features by two operations of propagation and mapping, as the most classical GCN formula:
wherein, X l For node representation matrices, adjacency matrices with self-loopsIs an adjacency matrix a to which an identity matrix I is added. Each element a in the adjacency matrix ij Indicating whether there is an edge between node i and node j, 0 representing no edge, and 1 representing an edge. Is a degree matrix with self-loops, where diag () represents a diagonal matrix, the elements on the diagonal representing the degrees of the corresponding nodes,to representThe value in row i and column j represents whether there is an edge between node i and node j. Superscript l denotes the number of GCN layers, W l Represents the parameter of the l-th layer, σ () is a nonlinear activation function. The propagation operation isFor communicating the node characteristics to neighboring nodes. The mapping operation is σ (P) l-1 W l ) For mapping the node representation to a feature space.
However, the mode of the entanglement propagation and mapping operation can cause the propagation depth of the GCN to be limited, and the features cannot be propagated in a large field. Therefore, the invention designs a graph convolution network based on disentanglement propagation and mapping operation, named as Modified GCN, to extract features, which can alleviate the over-fitting problem of general GCN, thereby propagating information in a larger range. The formula is as follows:
att l =σ(MLP(X l ))l=0,1,2,…,J
X out =softmax(sum(att 0 οX 0 ,…,att J οX J ))
where l is the number of GCN layers, and also the depth. J is the maximum depth of propagation and,is node feature, N represents the number of nodes, d represents the feature dimension, and MLP represents the full join operation. X l Representing the node characteristics at depth l.The method is used for adaptively adjusting the information which each node needs to keep at different propagation depths. Att is represented by ° l Each element in (1) is represented by X with a corresponding d-dimensional node J Multiplication, softmax denotes normalization, sum denotes summation. Final output X out Is obtained by combining a plurality of propagation layers.
And applying the Modified GCN to the k-tress subgraph to acquire the characteristics. Because the feature propagation and the feature mapping in the Modified GCN are carried out separately, the overfitting problem can be effectively relieved, and therefore the Modified GCN has a better effect on capturing high-order structure information.
Applying the Modified GCN to the subgraph on each time slice, and expressing the formula as follows:
the sub-graph with the granularity of K is represented by K, the value range of K is 2-K, K is the sub-graph with the maximum granularity, which is manually set and divided by applying a K-tress algorithm, T represents a time slice index, the range is 1-T, and T is the latest time slice.Representing the node representation of the k-tress subgraph in the t-time slice,a adjacency matrix representing k-tress subgraphs in a t-time slice. Thus we get the node representation at all granularities for all time slicesIn the figure G t The multi-granularity sequence of the node representation is as followsIn the particle sizeAt k, the nodes represent a time sequence of
To better learn structural features, RNNs are well suited to handle highly related sequences, assuming that there is a potential process of structural evolution in each temporally multi-granular subgraph sequence. In the figure G t In the above, the sequence is inverted toAnd inputs it to the RNN to learn structural information. The mathematical representation is:
…
whereinA hidden state is represented in the form of,representing the input features. In the input sequence, subgraph structures with higher k values are denser and have fewer nodes. In this process, structural information at different granularities is consolidated. Final hidden state outputContaining the structural attributes of the graph under the entire time slice t. This is performed for each time slice and a structural feature representation is finally obtained
And 3, multi-granularity dynamic feature learning.
In the time dimension, subgraphs of different granularities contribute differently to the evolution of the whole graph. Processing subgraphs of different granularities differently helps to model the complete graph evolution process.
RNN is used to learn the dynamic evolution law of different granularity subgraphs. At each particle size k, willInput RNN, formulated as follows:
…
where T is the length of the time series,indicating a hidden state. The dynamic feature learning module is similar to the structure fusion module except for their inputs, where the input sequences are time sequences of different granularity, with the purpose of capturing the dynamic evolution features. Note that the structure embedding sequenceAlso input into RNN and output structural feature S T 。
In order to obtain the final node representation, a plurality of granularity learned subgraph dynamic evolution features need to be fused. In order to obtain the importance degrees of different granularities, a multi-head self-attention mechanism is introduced, the weight on each granularity is learned through the self-attention mechanism, and then the weight is multiplied by the node representation of the corresponding granularity and summed to obtain the fused node dynamic evolution characteristic representation.
Node v i The attention coefficient formula at k particle size is as follows:
wherein w k And w k’ As the parameter(s) is (are),is represented by nodes, and the attention coefficient can be obtained after normalizationThe fused nodes are represented as:
sigma is an activation function, and finally the dynamic evolution characteristic expression H of all the nodes is obtained out . Final node representation H emb Including structural and dynamic evolution features, formulated as follows, whereinRepresenting a matrix splicing operation:
and 4, predicting a future link.
Finally, updating network parameters according to the optimization function until the model converges and the effect on the test set is optimal, and training to obtain an optimal model; performing Hadamard product operation on the representation of two nodes corresponding to the link, and predicting the link through the representation of the nodesAnd training the logistic regression classifier to judge whether the link exists or not so as to predict the graph G at the next time T T+1 And (c) a link.
Dynamic graph { G) input below 1 ,G 2 ,...,G T The invention is further described by taking the example as an example.
Fig. 2 is a flow chart of dynamic link prediction, including input, multi-granularity graph partitioning, multi-granularity structural feature extraction, multi-granularity dynamic feature extraction fusion, dynamic link prediction, and output.
Fig. 3 is a schematic diagram of the dynamic link prediction architecture of the present invention. The framework mainly comprises 4 parts of multi-granularity graph division, multi-granularity structural feature extraction, multi-granularity dynamic feature extraction fusion and dynamic link prediction.
In { G 1 ,G 2 ,…,G T Applying a k-tress decomposition algorithm on each time slice to obtain a corresponding multi-granularity map, as shown in the left side of fig. 3, taking a multi-granularity sub-map decomposed by Gt asWhere the superscript represents time and the subscript represents the number of the multi-particle size plot. As the definition of k-tress is known, the subgraph sequence starts from a 2-tress subgraph (original graph), and k _ max is a value set manually.
And 2, learning the multi-granularity structural features.
Firstly, applying Modified GCN to each granularity graph of each time to extract features to obtainThen, a multi-granularity subgraph sequence of each time t is reversely writtenThe RNN is input to capture potential structural features. This is performed for each time slice and finally a structural feature representation is obtained
First, RNNs are used to learn the dynamic evolution rules of different granularity subgraphs. At each particle size k, willInput RNN to obtain dynamic feature representationThe dynamic characteristics of the multiple granularities are expressed asIn combination with the self-attention mechanism, the importance (weight) of each granularity is calculated, and the dynamic feature representation is multiplied and summed with the weight to obtain the total dynamic feature representation. Finally, combining the structural characteristics and the dynamic characteristics to obtain a final node representation H emb 。
And 4, predicting the dynamic link.
Finally, updating network parameters according to an optimization function until the model converges and the effect on the test set reaches the optimum, and training to obtain an optimum model; performing Hadamard product operation on the representation of two nodes corresponding to the link, predicting the score of the link through the representation of the nodes, and training a logistic regression classifier to judge whether the link exists or not so as to predict the graph G at the next time T T+1 And (c) a link.
Experimental data: in the experiment, AUC is used as an evaluation index, and the AUC is an index for measuring the learning capacity of the model, and represents the probability of a positive sample before a negative sample, wherein the range is 0-1, and the closer to 1, the better the model effect is. The applied data sets UCI, AS, MATH, FACEBOOK, ask and ENRON are public graph data sets, and the information is shown in table 1.
TABLE 1 data set information
Data set | Number of nodes | Number of edges | Number of time slices |
UCI | 1899 | 59835 | 7 |
AS | 6828 | 1947704 | 100 |
MATH | 24740 | 323357 | 77 |
60730 | 607487 | 27 | |
ASKU | 74924 | 356822 | 21 |
ENRON | 87036 | 530284 | 38 |
Comparative experiments GCRN, dynagem, DynAE, dynarnn, DynAERNN, EvloveGCN and CTGCN are all existing advanced dynamic link prediction methods, as explained below:
GCRN: the model combines a GCN model and an RNN model, finds dynamic patterns in the graph, and generates node embedding in the dynamic graph.
DynGEM: the work is a graph embedding method based on depth self-coding, SDNE is expanded into a dynamic graph, and the method can be used for generating the embedded representation of the graph snapshot at the current moment based on the embedded representation of the graph snapshot at the previous moment, namely the embedded representation of the graph snapshot at the previous moment is used as an initial value, performing gradient training, and incrementally processing the dynamic graph, and has very good performance.
Dyngraph2 vec: the main dynamic graph feature of the work proposes an embedding method, and the dynamic graph learns the evolved structure and predicts the invisible link. It utilizes multiple non-linear layers to learn structural patterns in each network while learning temporal transitions in the network using recursive layers. There are three variants of this model, DynAE, dynarnn, and DynAERNN.
EvolveGCN: the framework is a dynamic variant of the GCN, and the framework does not adopt a node embedding mode, but adopts the GCN to learn the weight of a single moment in a time dimension. Meanwhile, the correlation of GCN weight among different time instants is considered, and GCN parameters are evolved by using RNN, so that the sequence dynamics is captured.
CTGCN: the model proposes a time-series GCN, combining the GCN and RNN, so that it captures the connectivity and structural information of the graph, while capturing the dynamics.
The experimental results are shown in table 2, and some methods cannot be tested on a large graph due to memory limitations, and are indicated by "-".
TABLE 2 dynamic Link prediction experiment AUC results
Based on the same inventive concept, another embodiment of the present invention provides a dynamic link prediction system based on multi-granularity evolution using the above method, which includes:
the multi-granularity graph dividing module is used for dividing the graph under each time slice of the dynamic graph to obtain multi-granularity subgraphs;
the multi-granularity structural feature extraction module is used for extracting structural features of the nodes from the multi-granularity subgraph;
the multi-granularity dynamic characteristic learning module is used for learning the dynamic evolution rule of the graph from the multi-granularity subgraph to obtain the dynamic evolution characteristics of the subgraphs with different granularities and fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain the dynamic evolution characteristics of the nodes;
and the dynamic link prediction module is used for fusing the structural characteristics of the nodes and the dynamic evolution characteristics of the nodes to obtain a node representation containing space-time characteristics, and predicting a future link according to the node representation containing the space-time characteristics.
Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory storing a computer program configured to be executed by the processor, and a processor, the computer program comprising instructions for performing the steps of the inventive method.
Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.
The particular embodiments of the present invention disclosed above are illustrative only and are not intended to be limiting, since various alternatives, modifications, and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The invention should not be limited to the disclosure of the embodiments in the present specification, but the scope of the invention is defined by the appended claims.
Claims (10)
1. A dynamic link prediction method based on multi-granularity evolution is characterized by comprising the following steps:
dividing the graph under each time slice of the dynamic graph to obtain a multi-granularity subgraph;
extracting structural features of nodes from the multi-granularity subgraph;
learning a dynamic evolution rule of the graph from the multi-granularity subgraphs to obtain dynamic evolution characteristics of the subgraphs with different granularities, and fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain dynamic evolution characteristics of the nodes;
and fusing the structural characteristics of the nodes and the dynamic evolution characteristics of the nodes to obtain a node representation containing space-time characteristics, and predicting a future link according to the node representation containing the space-time characteristics.
2. The method of claim 1, wherein the multi-granularity subgraph is obtained by partitioning the graph under each time slice by using a k-tress subgraph decomposition algorithm.
3. The method according to claim 2, wherein a graph convolution network based on de-entanglement propagation and mapping operations is adopted and applied to the k-tress subgraph to extract the structural features of the nodes.
4. The method according to claim 3, wherein the graph convolution network based on the disentanglement propagation and mapping operation has the following formula:
att l =σ(MLP(X l ))l=0,1,2,…,J
wherein l is the number of GCN layers; j is the maximum depth of propagation;is a node feature, N represents the number of nodes, d represents a feature dimension; MLP denotes full join operation; x l Representing the node characteristics when the depth is l;the method is used for adaptively adjusting the information which needs to be reserved at different propagation depths of each node;stands for att l Each element in (1) is represented by X with a corresponding d-dimensional node J Multiplying; softmax denotes normalization operation; sum denotes a summation operation; final output X out Is obtained by combining a plurality of propagation layers.
5. The method of claim 1, wherein learning the dynamic evolution law of the graph from the multi-granularity subgraphs is by using RNN to learn the dynamic evolution law of different granularity subgraphs.
6. The method of claim 1, wherein the fusing the dynamic evolution features of the subgraphs of different granularities to obtain the dynamic evolution features of the nodes comprises: and introducing a multi-head self-attention mechanism, learning the weight on each granularity through the self-attention mechanism, multiplying the weight by the node representation of the corresponding granularity, and summing to obtain the fused node dynamic evolution characteristic representation.
7. The method of claim 1, wherein predicting future links from node representations containing spatio-temporal features comprises: and performing Hadamard product operation on two node representations corresponding to the link, predicting the score of the link through the node representations, and training a logistic regression classifier to judge whether the link exists or not so as to predict the link on the graph at the next time.
8. A dynamic link prediction system based on multi-granularity evolution adopting the method of any claim 1 to 7, which is characterized by comprising:
the multi-granularity graph dividing module is used for dividing the graph under each time slice of the dynamic graph to obtain a multi-granularity subgraph;
the multi-granularity structural feature extraction module is used for extracting structural features of the nodes from the multi-granularity subgraph;
the multi-granularity dynamic characteristic learning module is used for learning the dynamic evolution rule of the graph from the multi-granularity subgraph to obtain the dynamic evolution characteristics of the subgraphs with different granularities and fusing the dynamic evolution characteristics of the subgraphs with different granularities to obtain the dynamic evolution characteristics of the nodes;
and the dynamic link prediction module is used for fusing the structural characteristics of the nodes and the dynamic evolution characteristics of the nodes to obtain a node representation containing space-time characteristics, and predicting a future link according to the node representation containing the space-time characteristics.
9. An electronic apparatus, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a computer, implements the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210387628.6A CN114861766A (en) | 2022-04-13 | 2022-04-13 | Dynamic link prediction method and system based on multi-granularity evolution |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210387628.6A CN114861766A (en) | 2022-04-13 | 2022-04-13 | Dynamic link prediction method and system based on multi-granularity evolution |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114861766A true CN114861766A (en) | 2022-08-05 |
Family
ID=82630912
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210387628.6A Pending CN114861766A (en) | 2022-04-13 | 2022-04-13 | Dynamic link prediction method and system based on multi-granularity evolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114861766A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118094020A (en) * | 2024-01-30 | 2024-05-28 | 苏州科技大学 | Topology adaptive graph layout method, system and storage medium for large graph visualization |
-
2022
- 2022-04-13 CN CN202210387628.6A patent/CN114861766A/en active Pending
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118094020A (en) * | 2024-01-30 | 2024-05-28 | 苏州科技大学 | Topology adaptive graph layout method, system and storage medium for large graph visualization |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230169140A1 (en) | Graph convolutional networks with motif-based attention | |
CN114117220B (en) | Deep reinforcement learning interactive recommendation system and method based on knowledge enhancement | |
CN110263280B (en) | Multi-view-based dynamic link prediction depth model and application | |
CN112685504B (en) | Production process-oriented distributed migration chart learning method | |
CN113065974A (en) | Link prediction method based on dynamic network representation learning | |
CN111325340B (en) | Information network relation prediction method and system | |
CN113378913A (en) | Semi-supervised node classification method based on self-supervised learning | |
WO2016165058A1 (en) | Social prediction | |
CN113761250A (en) | Model training method, merchant classification method and device | |
CN113554100B (en) | Web service classification method for enhancing attention network of special composition picture | |
CN115907001A (en) | Knowledge distillation-based federal diagram learning method and automatic driving method | |
CN113240086B (en) | Complex network link prediction method and system | |
CN118134017A (en) | Method for predicting social network link by adopting impulse neural network | |
CN112819024B (en) | Model processing method, user data processing method and device and computer equipment | |
CN116992151A (en) | Online course recommendation method based on double-tower graph convolution neural network | |
CN109523012A (en) | Based on Variational Solution Used coupled modes to the expression learning method of symbol directed networks | |
CN114880479B (en) | Heterogeneous graph convolution rumor detection method based on multistage interaction and graph reconstruction | |
CN114861766A (en) | Dynamic link prediction method and system based on multi-granularity evolution | |
CN115188022A (en) | Human behavior identification method based on consistency semi-supervised deep learning | |
Hu et al. | Data Customization-based Multiobjective Optimization Pruning Framework for Remote Sensing Scene Classification | |
CN115481215A (en) | Partner prediction method and prediction system based on temporal partner knowledge graph | |
Zhang et al. | Knowledge graph driven recommendation model of graph neural network | |
CN114842247B (en) | Characteristic accumulation-based graph convolution network semi-supervised node classification method | |
Jiang et al. | Dynamic adaptive and adversarial graph convolutional network for traffic forecasting | |
CN116975686A (en) | Method for training student model, behavior prediction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |