
CN112508170B - Multiple-correlated time series prediction system and method based on a generative adversarial network - Google Patents


Info

Publication number
CN112508170B
CN112508170B (application CN202011299519.6A)
Authority
CN
China
Prior art keywords
time sequence
interaction
network
matrix
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011299519.6A
Other languages
Chinese (zh)
Other versions
CN112508170A (en)
Inventor
吴伟杰
黄芳
吴琪
欧阳洋
禹克强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202011299519.6A
Publication of CN112508170A
Application granted
Publication of CN112508170B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a system and a method for predicting multiple correlated time series based on a generative adversarial network, aiming at solving, end to end, the problem of predicting multiple correlated time series that arises widely in real life. Compared with other existing time series prediction models, the method can simultaneously capture the complex interaction relationships among multiple time series, giving it a unique advantage in multiple-correlated time series prediction tasks. In such problems, the complex interaction relationships between the time series are hidden inside the data, and conventional methods cannot extract them directly. The invention generates the interaction relationships with one generator, obtains predicted values with another generator, and optimizes the generated interaction relationships with a discriminator. Extracting the interactions directly from the data in this way avoids reliance on other prior knowledge.

Description

Multiple-correlated time series prediction system and method based on a generative adversarial network
Technical Field
The invention relates to the technical field of time series prediction, and in particular to a multiple-correlated time series prediction system and method based on a generative adversarial network.
Background
Time series data are found widely across society, the economy, and daily life, and their analysis and study have important value and significance. Precipitation, electricity consumption, commodity sales, and stock price fluctuations can all be regarded as typical time series data. Discovering the patterns of change in such series in order to predict future trends is of great significance for guiding the rational use of water resources, organizing production effectively, reducing inventory, and improving returns. The time series prediction problem has long attracted the attention of researchers worldwide. The ARIMA model (Autoregressive Integrated Moving Average) is widely used for time series prediction, but it is a linear model grounded in the linear regression theory of statistics and has obvious shortcomings when fitting data with complex patterns. In recent years, with the continuing development of artificial intelligence, many machine learning and deep learning methods have also been applied to time series prediction. Models such as support vector machines and neural networks have strong fitting capability and are well suited to predictive analysis of complex time series data. So far, most of this research has focused on single time series prediction and has achieved good results and applications. In the real world, however, the problems encountered tend to be more complex, and the object of study often contains multiple time series.
Compared with single time series prediction, predicting multiple time series requires considering not only the temporal relationships within each series but also the mutual influence between series, which is called an interaction relationship (Interaction Relationship). Most current time series prediction methods struggle to capture the influence of the complex interaction relationships among different series on the prediction result, and therefore often fail to achieve a satisfactory prediction effect. For time series prediction with multiple correlated series, effectively capturing these hidden interactions among the series is thus the biggest challenge in solving the problem.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. The invention therefore provides a multiple-correlated time series prediction system and method based on a generative adversarial network, which can simultaneously capture the complex interaction relationships among multiple time series and the temporal relationships within each series, giving it a unique advantage in multiple-correlated time series prediction tasks.
In a first aspect of the present invention, there is provided a multiple-correlated time series prediction system based on a generative adversarial network, comprising:
an interaction matrix generator, which maps an original random vector into an interaction matrix;
a predicted value generator, which obtains an intermediate feature representation from the time series interaction graph using a graph convolutional network, and processes the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series; the time series interaction graph is generated from the interaction matrix and a time series feature matrix;
a time series discriminator, which is trained on false and real time series samples; after training, the time series discriminator feeds gradient information back to the interaction matrix generator and the predicted value generator. The false time series samples are formed by appending the predicted values to the original time series feature vectors, and the real time series samples by appending the true values.
According to the embodiment of the invention, at least the following technical effects are achieved:
Compared with other existing time series prediction models, the system can simultaneously capture the complex interaction relationships among multiple time series, giving it a unique advantage in multiple-correlated time series prediction tasks.
In multiple-correlated time series prediction problems, the complex interaction relationships between the series are hidden inside the data, and conventional methods cannot extract them directly. The system generates the interaction relationships with one generator, obtains predicted values with another generator, and optimizes the generated interaction relationships with a discriminator. Extracting the interactions directly from the data in this way avoids reliance on other prior knowledge.
According to some embodiments of the invention, the interaction matrix generator comprises a transposed convolutional network.
According to some embodiments of the invention, the recurrent neural network is a long short-term memory network.
According to some embodiments of the invention, the network depth of the graph convolutional network is set to 3 or 4 layers.
In a second aspect of the present invention, there is provided a multiple-correlated time series prediction method based on a generative adversarial network, applied to a multiple-correlated time series prediction system based on a generative adversarial network; the system comprises an interaction matrix generator, a predicted value generator and a time series discriminator, which are connected with one another in pairs.
The method comprises the following steps:
mapping an original random vector into an interaction matrix through the interaction matrix generator;
constructing a time series interaction graph from the interaction matrix and the time series feature matrix;
obtaining an intermediate feature representation from the time series interaction graph using a graph convolutional network through the predicted value generator, and processing the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series;
training the time series discriminator on false and real time series samples, and feeding gradient information back to the interaction matrix generator and the predicted value generator through the trained discriminator; the false time series samples are formed by appending the predicted values to the original time series feature vectors, and the real time series samples by appending the true values.
According to the embodiment of the invention, at least the following technical effects are achieved:
Compared with other existing time series prediction models, the method can simultaneously capture the complex interaction relationships among multiple time series, giving it a unique advantage in multiple-correlated time series prediction tasks.
In multiple-correlated time series prediction problems, the complex interaction relationships between the series are hidden inside the data, and conventional methods cannot extract them directly. The method generates the interaction relationships with one generator, obtains predicted values with another generator, and optimizes the generated interaction relationships with a discriminator. Extracting the interactions directly from the data in this way avoids reliance on other prior knowledge.
According to some embodiments of the invention, mapping the original random vector into the interaction matrix comprises the steps of:
mapping the original random vector into a three-dimensional feature representation through a fully connected layer, processing the three-dimensional feature representation through transposed convolution layers to obtain an output matrix, and symmetrizing the output matrix to obtain the interaction matrix.
According to some embodiments of the invention, the recurrent neural network is a long short-term memory network.
According to some embodiments of the invention, the network depth of the graph convolutional network is set to 3 or 4 layers.
In a third aspect of the present invention, there is provided a multiple-correlated time series prediction apparatus based on a generative adversarial network, comprising: at least one control processor and a memory communicatively connected with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable it to perform the multiple-correlated time series prediction method based on a generative adversarial network according to the second aspect of the invention.
In a fourth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the multiple-correlated time series prediction method based on a generative adversarial network according to the second aspect of the present invention.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a time series interaction graph comprising five time series according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a multiple-correlated time series prediction system based on a generative adversarial network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the workflow of the interaction matrix generator according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the workflow of the predicted value generator according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the time series discriminator according to an embodiment of the present invention;
FIG. 6 is a graph of experimental data on the influence of the interaction matrix generator on performance according to an embodiment of the present invention;
FIG. 7 is a graph of experimental data on the influence of GCN network depth on performance according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a multiple-correlated time series prediction apparatus based on a generative adversarial network according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
For ease of understanding, multiple-correlated time series prediction is described first:
Assume the object of study contains n time series T_1, T_2, …, T_n and the time to predict is t+1, so the available data features are T_1^[t-w+1,t], T_2^[t-w+1,t], …, T_n^[t-w+1,t], the historical values of each series within a sliding window of length w. The goal of multiple-correlated time series prediction is to train a model f mapping these data features to the predicted value of each series at time t+1:
[T_1^{t+1}, T_2^{t+1}, …, T_n^{t+1}] = f(T_1^[t-w+1,t], T_2^[t-w+1,t], …, T_n^[t-w+1,t])   (1)
The time series feature vector T_i^[t-w+1,t] consists of the historical values of series i covered by a sliding window of length w at time t; the feature vector of the i-th series is:
T_i^[t-w+1,t] = [T_i^{t-w+1}, T_i^{t-w+2}, …, T_i^t]   (2)
The time series feature matrix X_{n×w} is composed of the n time series feature vectors, one per row. The number of columns equals the window length w and the number of rows equals the number of series n:
X_{n×w} = [ T_1^{t-w+1}  T_1^{t-w+2}  …  T_1^t
            T_2^{t-w+1}  T_2^{t-w+2}  …  T_2^t
            …
            T_n^{t-w+1}  T_n^{t-w+2}  …  T_n^t ]   (3)
Time series interaction graph G: the interactions of the time series are represented as a weighted undirected graph G = (V, E). V is the set of nodes, each node corresponding to a time series feature vector, so the node set V can be described by the time series feature matrix X_{n×w}. The weighted edge set E represents the weighted adjacency between nodes of the interaction graph and describes the interaction relationships between different time series. The adjacency matrix of the interaction graph, denoted A_{n×n}, is called the interaction matrix; each matrix element corresponds to one edge in E, and the value range of A_{n×n} is defined in formula (4). Fig. 1 shows a time series interaction graph comprising five time series.
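As an illustration of the feature vector of formula (2) and the feature matrix X_{n×w}, the sliding window can be materialized in a few lines of NumPy. This is a hypothetical sketch (the helper name, 0-based indexing, and toy data are assumptions, not from the patent):

```python
import numpy as np

def feature_matrix(series, t, w):
    """Build the time series feature matrix X (n x w): row i holds the
    last w values of series i up to and including time t (0-based)."""
    series = np.asarray(series)              # shape (n, total_length)
    return series[:, t - w + 1 : t + 1]      # each row is T_i^[t-w+1, t]

# Toy example: n = 3 series of length 10, window w = 4, current time t = 6.
data = np.arange(30).reshape(3, 10)
X = feature_matrix(data, t=6, w=4)           # rows cover times 3..6
```

Each row of `X` is then a node feature vector of the time series interaction graph G.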
A first embodiment;
Provided is a multiple-correlated time series prediction system based on a generative adversarial network. The system comprises an interaction matrix generator, a predicted value generator and a time series discriminator, denoted G_M, G_P and D respectively; it is a deep learning model based on a generative adversarial network, and its overall architecture is shown in fig. 2.
The interaction matrix generator G_M maps an original random vector into an interaction matrix.
As an alternative embodiment, the generator G_M comprises a transposed convolutional network, as shown in fig. 3: a fully connected layer maps the original random vector into a three-dimensional feature representation, transposed convolution layers process the three-dimensional feature representation to obtain an output matrix, and the output matrix is symmetrized to obtain the interaction matrix. The interaction matrix represents the interaction relationships between the different time series.
The role of the generator G_M is to generate a two-dimensional matrix. In this embodiment G_M is implemented with a transposed convolutional network. The effect of a transposed convolution is exactly the opposite of a convolution: it converts a coarse-grained representation into a fine-grained one, and is equivalent to an up-sampling method. The transposed convolutional network has the properties of local connectivity and convolution-kernel parameter sharing, which greatly reduce the number of network parameters compared with a fully connected neural network and make it more efficient on large-scale data. As shown in fig. 3, a high-dimensional random noise vector sampled from a Gaussian distribution is used as the input to G_M and is mapped by a fully connected layer into a three-dimensional feature representation whose dimensions are length, width, and number of channels. The transposed convolution layers then continue to process this feature representation; with each layer, the number of channels decreases while the length and width increase, and the network finally outputs an n × n × 1 tensor, where n is the number of time series to be processed. The interaction matrix is obtained by a matrix symmetrization of the transposed convolutional network's output. The symmetrization operation is shown in formula (5), where O is the output matrix and A is the symmetric matrix.
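Formula (5) itself is not reproduced in the text above; a common symmetrization, assumed here purely for illustration, averages the output matrix with its transpose. A minimal NumPy sketch:

```python
import numpy as np

def symmetrize(O):
    """Turn the transposed-convolution output O into a symmetric
    interaction matrix A. Averaging O with its transpose is one standard
    choice; the patent's exact formula (5) is not shown, so this is an
    assumption."""
    return (O + O.T) / 2.0

O = np.array([[0.0, 0.8],
              [0.2, 0.0]])   # toy 2 x 2 network output
A = symmetrize(O)
```

The resulting `A` satisfies A = Aᵀ, as required for the adjacency matrix of a weighted undirected graph.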
The predicted value generator G_P obtains an intermediate feature representation from the time series interaction graph using a graph convolutional network, and processes the intermediate feature representation with a recurrent neural network to obtain the predicted value of each time series; the time series interaction graph is generated from the interaction matrix and the time series feature matrix.
In multiple-correlated time series prediction there are two dependencies to resolve: 1) the interaction relationships between the series; 2) the temporal relationships within each series. The interaction relationships are obtained through the generator G_M, and the purpose of the generator G_P is to process both kinds of dependency together. The workflow of G_P is shown in fig. 4. First, the interaction matrix generated by G_M is used as the adjacency matrix of the time series interaction graph and the time series feature matrix as the node feature matrix; together the two matrices define the graph. The feature vector on each node contains the temporal relationships within that series, while the weighted edges between nodes carry the interaction relationships between series. An intermediate feature representation is then obtained by processing the interaction graph with a graph convolutional network (GCN). From the perspective of graph embedding (Graph Embedding), the GCN embeds the topology information of the interaction graph, i.e. the information on the interaction edges, into the output feature representation. The feature representation obtained by GCN processing therefore carries two kinds of information: 1) the information in the time series feature matrix, containing the temporal relationships within each series; 2) the information in the interaction matrix, containing the interaction relationships between series. Finally, the intermediate feature representation is processed by a recurrent neural network to generate the final predicted values.
As an alternative embodiment, the process of obtaining an intermediate feature representation via GCN processing is provided below:
The generator G_P models the interactions between time series with a GCN. The graph convolution layer used in G_P is shown in formula (6):
H = σ( D̂^{-1/2} Â D̂^{-1/2} X W )   (6)
where Â = A + I, A is the adjacency matrix obtained by symmetrizing the interaction matrix, and I is the n-dimensional identity matrix. Converting A into Â is equivalent to adding to each node a self-loop edge pointing to itself; this is done so that a node's own original information is not lost during the graph convolution. D̂ is the degree matrix corresponding to Â: its main-diagonal elements are D̂_ii = Σ_j Â_ij and all other elements are 0. Multiplying Â on the left and right by D̂^{-1/2} is the normalization step of the graph convolution, which prevents the scales of the node features from becoming inconsistent after convolution. X is the time series feature matrix on the graph, each row of which is a time series feature vector, and W is a trainable parameter of the GCN. H is the representation matrix obtained after the graph convolution. Writing the propagation as a whole in matrix form gives formula (7):
H = D̂^{-1/2} Â D̂^{-1/2} X W   (7)
and writing X and H as stacks of row vectors gives formulas (8) and (9):
X = [x_1; x_2; …; x_n]   (8)
H = [h_1; h_2; …; h_n]   (9)
Because multiplying by the parameter matrix W does not change the dimensions relevant to the analysis, its influence is ignored here. Each node feature vector output after one graph convolution layer is then given by formula (10):
h_i = Σ_j ( Â_ij / √(D̂_ii D̂_jj) ) x_j   (10)
so the output feature vector is in fact a weighted sum of the feature vectors of the node itself and all of its adjacent nodes before the layer, where the weight coefficients are the normalized edge weights between the nodes.
According to the above analysis, processing the time series interaction graph with the GCN models the interaction relationships between the time series. The core operation is the weighted fusion of the feature vectors on a node and on its first-order neighbors, according to the correlation magnitudes carried by the edges between nodes, producing a new feature vector representation. The receptive field of a node during graph convolution is the set of all its first-order neighbors, and the larger a neighbor's correlation, the larger its weight coefficient and its influence on the node's newly generated feature vector.
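The normalized propagation just described can be sketched in NumPy. This is a minimal illustration of the self-loop, normalization, and weighted-sum behavior; it omits the nonlinearity, and the toy adjacency, shapes, and identity weights are assumptions for inspection:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution propagation: add self-loops, symmetrically
    normalize by the degree matrix of A_hat, then propagate features."""
    A_hat = A + np.eye(A.shape[0])            # self-loop edges
    d = A_hat.sum(axis=1)                     # degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D_hat^{-1/2}
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])               # 3-node path graph
X = np.random.randn(3, 4)                     # 3 series, window w = 4
W = np.eye(4)                                 # identity weights for inspection
H = gcn_layer(A, X, W)
```

With identity weights, node 0's output is exactly the weighted sum x_0/2 + x_1/√6 of its own and its neighbor's features, matching the weighted-fusion interpretation above (degrees of Â are 2 and 3 here).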
As an alternative embodiment, the recurrent neural network used by the generator G_P is a long short-term memory network (LSTM). Processing the time series interaction graph with the GCN yields an intermediate feature representation that contains both the temporal relationships within each series and the complex interactions between series. After extracting the interaction relationships, G_P must generate future predicted values from the intermediate feature representation, that is, extract the temporal relationships. Recurrent neural networks are suited to processing sequential data, and the long short-term memory network is a typical representative; this embodiment uses an LSTM to extract the temporal relationships and generate the future predicted value of each series. The LSTM uses "gate" structures to let information selectively affect the state of the network at each time step. A gate is a neural network using a sigmoid activation function together with an element-wise multiplication, and is not described in detail here. The gates of the LSTM are defined as follows:
z = tanh(W_z [h_{t-1}, x_t] + b_z)   (11)
i = sigmoid(W_i [h_{t-1}, x_t] + b_i)   (12)
f = sigmoid(W_f [h_{t-1}, x_t] + b_f)   (13)
o = sigmoid(W_o [h_{t-1}, x_t] + b_o)   (14)
c_t = f ⊙ c_{t-1} + i ⊙ z   (15)
h_t = o ⊙ tanh(c_t)   (16)
where i, f and o denote the input gate, forget gate and output gate respectively; c_t denotes the memory cell at time t, which can be regarded as a representation vector of the sequence information seen so far; h_t denotes the output value at time t; and W and b denote the weight and bias parameters of each gate in the LSTM.
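The gate definitions (11)–(16) can be sketched as a single NumPy step. The dimensions and random parameters below are illustrative assumptions, not the patent's configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step implementing formulas (11)-(16). `params` maps each
    gate name to its (W, b) pair acting on the concatenation [h_{t-1}, x_t]."""
    concat = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    z = np.tanh(params["z"][0] @ concat + params["z"][1])      # (11)
    i = sigmoid(params["i"][0] @ concat + params["i"][1])      # (12)
    f = sigmoid(params["f"][0] @ concat + params["f"][1])      # (13)
    o = sigmoid(params["o"][0] @ concat + params["o"][1])      # (14)
    c_t = f * c_prev + i * z                                   # (15)
    h_t = o * np.tanh(c_t)                                     # (16)
    return h_t, c_t

rng = np.random.default_rng(0)
dim_h, dim_x = 3, 2
params = {g: (rng.normal(size=(dim_h, dim_h + dim_x)), np.zeros(dim_h))
          for g in ("z", "i", "f", "o")}
h, c = np.zeros(dim_h), np.zeros(dim_h)
h, c = lstm_step(rng.normal(size=dim_x), h, c, params)
```

Since o ∈ (0, 1) and |tanh(c_t)| < 1, each component of the output h_t stays strictly inside (−1, 1), which is the "selective influence" the gates provide.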
In the generator G_P, the GCN performs the graph convolution according to the interaction relationships between the time series, producing an intermediate feature representation that fuses the temporal relationships within each series with the complex interactions between series. This feature representation is fed into the LSTM, which processes the sequences and generates a predicted value for each time series.
The time series discriminator D is trained on false and real time series samples; after training, it feeds gradient information back to the interaction matrix generator and the predicted value generator. The false time series samples are formed by appending the predicted values to the original time series feature vectors, and the real time series samples by appending the true values.
In GAN, the arbiter acts as a party to the game with the producer, and needs to make a correct distinction between the data generated by the producer and the real data. In the present system, the generator G P generates a predicted value for each time series, and if the predicted value is directly simulated in GAN, the meaning of distinguishing the predicted value from the true value by the arbiter is not great. The system is implemented by adding the predicted value generated by the generator G P to the back of the original time-series feature vector to construct a false time-series sample, and adding the true value to the back of the original time-series feature vector to construct a true time-series sample. The specific forms of the dummy time series samples and the real time series samples are represented by the formula (17) and the formula (18), respectively.
T̃i [t-w+1,t+1] = [Ti t-w+1, Ti t-w+2, …, Ti t, T̂i t+1] (17)
Ti [t-w+1,t+1] = [Ti t-w+1, Ti t-w+2, …, Ti t, Ti t+1] (18)
where T̂i t+1 denotes the predicted value and Ti t+1 the true value at time t+1.
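The construction in equations (17) and (18) amounts to appending one value to the length-w feature window, as this small sketch shows (variable names are illustrative):

```python
import numpy as np

def make_samples(window, y_true, y_pred):
    """Append the true value (real sample, eq. 18) or the predicted
    value (fake sample, eq. 17) to the length-w feature window."""
    real = np.concatenate([window, [y_true]])
    fake = np.concatenate([window, [y_pred]])
    return real, fake

window = np.array([1.0, 2.0, 3.0, 4.0])   # T_i over t-w+1 .. t, w = 4
real, fake = make_samples(window, y_true=5.0, y_pred=4.7)
```

Both samples share the same w observed values; they differ only in the final entry, which is what the discriminator must learn to tell apart.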
The function of the discriminator D is to correctly distinguish the real and fake time-series samples constructed above; its structure is shown in fig. 5, and the discriminator has two inputs. The first is an embedding layer (Embedding), which accepts a one-hot encoded (One-hot Encoding) vector and outputs a low-dimensional dense vector. This one-hot vector is a very high-dimensional sparse vector that identifies which time series in the dataset the current sample comes from. The second input is a bidirectional long short-term memory network (Bidirectional LSTM), which accepts a time-series sample. The bidirectional LSTM combines two unidirectional LSTMs: at each time step t, the input is given simultaneously to the two LSTMs running in opposite directions. The two networks compute independently, each producing its own hidden state and output at that step, and apart from direction the two unidirectional LSTMs are completely symmetric. The forward LSTM's output at the last time step encodes the forward temporal information of the sample, the backward LSTM's output at the first time step encodes the reverse temporal information, and the output of the bidirectional network is simply the concatenation of these two output vectors. Finally, the embedding-layer output and the bidirectional-LSTM output are concatenated and fed into a fully connected network, which gives the probability that the input time-series sample is real.
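The concatenation of the forward LSTM's last output with the backward LSTM's output at the first time step can be sketched as follows, using a toy recurrence as a stand-in for the LSTM so the code stays self-contained:

```python
import numpy as np

def bilstm_encode(seq, run_lstm):
    """Bidirectional encoding: concatenate the forward LSTM's last
    state with the backward LSTM's state at the first time step,
    i.e. its last state when run over the reversed sequence."""
    h_fwd = run_lstm(seq)[-1]          # forward pass, last output
    h_bwd = run_lstm(seq[::-1])[-1]    # backward pass, first time step
    return np.concatenate([h_fwd, h_bwd])

def toy_lstm(seq):
    """Tiny stand-in recurrence with a 2-dim hidden state."""
    h, states = np.zeros(2), []
    for x in seq:
        h = np.tanh(0.5 * h + np.array([x, -x]))
        states.append(h)
    return states

code = bilstm_encode([0.1, 0.2, 0.3], toy_lstm)   # 4-dim BiLSTM code
```

The resulting vector has twice the hidden dimension, one half carrying forward and the other reverse temporal information.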
From the system's overall perspective, the generator G M fits the data distribution of the interaction matrix, G M(z; θ M); the generator G P fits the data distribution G P(G M(z; θ M), X; θ P) of each time series' predicted value given the time-series feature matrix and the interaction matrix; and the ultimate goal of both generators is to fool the discriminator D, which outputs a probability D(y, X; θ D) for each constructed time-series sample. The optimization in which G M, G P and D compete as the two parties of a minimax game (Minimax Game) can be expressed formally as equation (19):
min over G M, G P, max over D: E y~p_data [log D(y, X; θ D)] + E z~p_z [log(1 - D(G P(G M(z; θ M), X; θ P), X; θ D))] (19)
Finally, the complete adversarial training algorithm of the system is summarized as the pseudo code in table 1:
TABLE 1
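The adversarial objective of equation (19) reduces in practice to binary cross-entropy terms for the discriminator and the generators. The sketch below shows this standard GAN loss form; it is an assumption about the system's exact implementation, included to make the minimax structure concrete.

```python
import numpy as np

def bce(p, label):
    """Binary cross-entropy for a single probability p and target label."""
    p = np.clip(p, 1e-7, 1 - 1e-7)   # numerical safety
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def d_loss(p_real, p_fake):
    """Discriminator term: classify real samples as 1, fake as 0."""
    return bce(p_real, 1.0) + bce(p_fake, 0.0)

def g_loss(p_fake):
    """Generators' term: maximize the probability that D calls fakes real."""
    return bce(p_fake, 1.0)
```

A confident discriminator (p_real high, p_fake low) has a small d_loss, while the generators' loss shrinks as D's output on fake samples approaches 1, which is exactly the opposing pull of the minimax game.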
The overall workflow of the system can be divided into a generation process, a discrimination process and an adversarial process. In the generation process, the generator G M first converts a raw random vector into an interaction matrix, which is then combined with the time-series feature matrix to construct a time-series interaction graph. The generator G P then extracts the interactions between time-series nodes and the temporal dependencies within each individual series from the interaction graph, generating a future prediction for each time series. In the discrimination process, real and fake samples are first constructed: a real time-series sample is obtained by a matrix concatenation operation (Concatenate) of the real data (Real Data) in the time-series feature matrix with the real label (Real Target); a fake time-series sample is obtained by concatenating the real data of the feature matrix with the fake label (Fake Target) generated by G P. The discriminator D is then trained with these real and fake samples as a training set; training is complete when it can correctly distinguish them. In the final adversarial process, the trained discriminator is fixed as an evaluation function for the generators, and the network parameters of G M and G P are adjusted to maximize the probability that the fake samples they generate are judged real by the discriminator.
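The three-phase workflow above can be condensed into the following skeleton. G_M, G_P and the data here are simplistic stand-ins for the real networks; only the data flow between the phases mirrors the description.

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, w = 4, 8
X = rng.normal(size=(n_series, w))      # time-series feature matrix
y_true = rng.normal(size=n_series)      # real next-step targets

def G_M(z):
    """Stand-in interaction matrix generator: noise -> symmetric matrix."""
    O = np.outer(z, z)
    return (O + O.T) / 2                # symmetrized (assumed form)

def G_P(A, X):
    """Stand-in predicted-value generator: graph + features -> predictions."""
    return (A @ X).mean(axis=1)

for step in range(3):
    # 1. generation: build the interaction graph and predict
    z = rng.normal(size=n_series)
    A = G_M(z)
    y_pred = G_P(A, X)
    # 2. discrimination: real = window + true value, fake = window + prediction
    real = np.concatenate([X, y_true[:, None]], axis=1)
    fake = np.concatenate([X, y_pred[:, None]], axis=1)
    # 3. adversarial: D's gradient feedback would update G_M and G_P here
```

Each iteration produces one batch of real and fake samples of identical shape, which is what the discriminator consumes in the second phase.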
Compared with other existing time-series prediction models, the system can capture both the complex interactions among multiple time series and the temporal dependencies within each series, giving it unique advantages in the task of predicting multiple related time series.
In the multiple-related-time-series prediction problem, the complex interactions between series are hidden inside the data, and conventional methods cannot extract them directly. The system generates the interaction relationships with one generator, obtains predicted values with another, and optimizes the generated interactions through the discriminator. Extracting interactions directly from the data in this way avoids reliance on additional prior knowledge. In addition, the system implements the interaction matrix generator with a transposed convolution network, which improves scalability.
A second embodiment;
This embodiment provides simulation results for the system, which comprises the generator G M, the generator G P and the discriminator D; G M is built from a transposed convolution network, and the predicted-value generator G P is composed of a graph convolution network and an LSTM network.
To verify the effectiveness of the system, its prediction performance is compared with that of other reference methods on different data sets and evaluated with different metrics, verifying both effectiveness and applicability. The influence of the system's structure on prediction performance, namely the effect of the generator G M structure and of the GCN network depth, is then investigated experimentally.
1. A data set;
(1) Store Item Demand Dataset, which provides daily sales records of 50 different items in 10 different stores; each item's sales record starts on January 1, 2013 and ends on December 31, 2017. That is, the data set contains 500 time series, each spanning 1826 days.
(2) Web Traffic Dataset, which records the change of Wikipedia website traffic over time. The full data set contains approximately 145,000 time series, each representing the daily visits to one Wikipedia page, recorded over a period of 804 days beginning in 2015 and ending in September 2017. The data set contains missing values; the data used in the experiments are 500 time series screened from the set that contain no missing values.
(3) NOAA China Dataset, weather data recorded by weather stations at different locations in China, provided by the United States National Oceanic and Atmospheric Administration. This embodiment extracts the daily temperature data of 400 different weather stations from 2015 to 2018 as experimental data.
2. Setting an experiment;
(1) Setting system parameters;
The system is implemented with the PyTorch deep learning framework. In the generator G M, the dimension of the random noise vector is set to 128, and its distribution follows a Gaussian distribution. In the generator G P, the number of GCN layers is set to 3, the number of LSTM hidden layers to 3, and the hidden-layer dimension to 64, so the LSTM output requires a fully connected layer to transform the dimension from 64 to 1. In the discriminator D, the embedding-layer dimension is set to 8, the number of hidden layers of the bidirectional LSTM to 3, and the hidden-layer dimension to 64. During training, the learning rate is set to 0.001, the batch size to 32, Adam is used as the optimization algorithm, and Dropout is used to avoid overfitting, with the Dropout parameter set to 0.2.
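The hyper-parameters listed above can be collected into one configuration mapping; the key names below are illustrative, not the authors' actual variable names.

```python
# Experimental settings from the text, gathered as a single config dict.
config = {
    "noise_dim": 128,          # G_M input: Gaussian random vector
    "gcn_layers": 3,           # G_P graph-convolution depth
    "lstm_layers": 3,          # G_P LSTM hidden layers
    "lstm_hidden": 64,         # reduced to 1 by a final FC layer
    "embed_dim": 8,            # discriminator embedding layer
    "bilstm_layers": 3,        # discriminator BiLSTM hidden layers
    "bilstm_hidden": 64,
    "lr": 1e-3,
    "batch_size": 32,
    "optimizer": "Adam",
    "dropout": 0.2,
}
```

Keeping the settings in one dict makes it easy to log them alongside results when reproducing the experiments.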
(2) A reference method;
Autoregressive Integrated Moving Average (ARIMA), a very widely used time-series prediction method that first makes the series stationary through differencing and then combines an AR model and an MA model to predict future values;
Vector Autoregression (VAR), often used for multidimensional time-series prediction, which can account for correlations between variables of different dimensions;
Support Vector Regression (SVR), a well-known machine learning model with solid mathematical foundations;
LightGBM (LGB), a gradient-boosted tree model proposed and implemented by Microsoft, which solves classification and regression problems and has shown strong predictive performance in numerous data mining competitions;
Long Short-Term Memory (LSTM), a recurrent neural network model well suited to processing sequential data;
Gated Recurrent Unit (GRU), also a recurrent neural network model, which modifies and simplifies the gating mechanism of the LSTM and trains more efficiently.
(3) Simulation results;
TABLE 2
Table 2 compares the prediction accuracy of the present system with six other methods on the three data sets: Store Item, Web Traffic, and NOAA China. The table shows that the system achieves the best prediction on all three data sets under both the MAE and RMSE criteria. Among the comparison methods, ARIMA is a single-time-series method that does not consider the interactions existing between series in the multiple-related-series setting, and its prediction is the worst in the experiments. VAR converts the multiple-related-series problem into a multidimensional time-series problem; it can capture correlations between series to some extent, but because it is a linear model with limited capacity for fitting complex patterns, it only outperforms ARIMA. SVR and LGB are both excellent machine learning models with very close results, LGB being slightly better overall. LSTM and GRU are both deep learning models with very similar structures; LSTM predicts slightly better than GRU, but GRU trains noticeably faster. LGB performs better than LSTM on the Store Item and NOAA China data sets, while LSTM performs better on the Web Traffic data set. Taken together, the system's prediction on all three data sets surpasses the other six methods, demonstrating that it holds a clear advantage in multiple related time-series prediction problems.
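The MAE and RMSE criteria used in table 2 have the routine definitions below, included for completeness:

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error between targets and predictions."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean squared error between targets and predictions."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.0, 2.5, 2.0])
```

RMSE penalizes large errors more heavily than MAE, which is why the two criteria can rank close methods differently.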
3. Effect of the interaction matrix generator G M on performance;
A comparative experiment examined how implementing the interaction matrix generator G M with a fully connected neural network versus a transposed convolutional neural network affects the system's prediction performance; the results are shown in fig. 6. The two subplots in each row show the prediction performance on one data set for the system implemented with each of the two networks (FCN denotes the fully connected network and TConv the transposed convolutional network). The horizontal axis is the number of time series in the data set used by the model; the vertical axis is the model's prediction error, with MAE as the evaluation metric in the left column of subplots and RMSE in the right column. The experiments show that when the number of time series is small, for example between 50 and 100, the two implementations perform about equally. As the number of time series grows, however, the transposed-convolution implementation outperforms the fully connected one, and the larger the number of series, the more pronounced the gap. This advantage of the transposed convolutional network likely stems from its local connectivity and parameter sharing, like a convolutional network, which make it more efficient at processing two-dimensional grid data and thus yield better prediction performance for the interaction matrix generator G M.
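The parameter-sharing argument can be made concrete with a rough parameter-count comparison for producing an N x N interaction matrix: a single fully connected layer from a d-dimensional code needs d*N*N weights, while one (transposed) convolution layer's weight count depends only on kernel size and channel counts, not on N. The formulas below are for single layers and ignore biases; they are a back-of-the-envelope sketch, not the patent's architecture.

```python
def fc_params(d, N):
    """Weights in one fully connected layer mapping d -> N*N."""
    return d * N * N

def tconv_params(c_in, c_out, k):
    """Weights in one transposed-convolution layer with a k x k kernel;
    independent of the output spatial size N."""
    return c_in * c_out * k * k

# e.g. a 128-dim code, 500 time series, vs. a 4x4 kernel with 64 channels
fc_500 = fc_params(128, 500)        # grows quadratically with N
tc = tconv_params(64, 1, 4)         # constant in N
```

The quadratic growth of the fully connected layer with the number of series matches the experimental finding that the gap widens as the data set grows.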
4. The influence of the depth of the GCN network on the performance;
This experiment studies how the number of GCN layers affects the system's prediction performance; the results are shown in fig. 7. Five-fold cross-validation is used to measure the system's prediction performance on the training and test sets of each of the three data sets; the evaluation metric in the first row of subplots is MAE and in the second row RMSE. Under both criteria, the system has the best fitting ability (minimum training-set error) and the best generalization ability (minimum test-set error) when the GCN has 3 to 4 layers. With fewer than 3 layers the system underfits the data, and both training error and generalization error decrease gradually as layers are added. With more than 6 layers the system begins to overfit, and the generalization error increases markedly as further layers are added.
A third embodiment;
There is provided a method for predicting multiple related time series based on a generative adversarial network, comprising the steps of:
s100, mapping an original random vector into an interaction matrix through an interaction matrix generator;
s200, constructing a time sequence interaction diagram according to the interaction matrix and the time sequence feature matrix;
S300, obtaining intermediate feature representations from the time-series interaction graph with a graph convolution network through the predicted-value generator, and processing the intermediate feature representations with a recurrent neural network to obtain a predicted value for each time series;
S400, training the time-series discriminator with fake and real time-series samples, and feeding gradient information back to the interaction matrix generator and the predicted-value generator through the trained discriminator; the fake time-series samples are generated by appending predicted values to the original time-series feature vectors, and the real time-series samples by appending true values.
As an alternative embodiment, mapping the original random vector into an interaction matrix comprises the steps of:
Mapping the original random vector into a three-dimensional feature representation through a fully connected layer, processing the three-dimensional representation with a transposed convolution layer to obtain an output matrix, and symmetrizing the output matrix to obtain the interaction matrix.
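The symmetrization step can be realized, for example, by averaging the output matrix with its transpose. This particular formula is an assumption, since the document's symmetrization equation is given as an image and other symmetrizations are possible.

```python
import numpy as np

def symmetrize(O):
    """One common symmetrization (assumed here): average the
    transposed-convolution output O with its transpose."""
    return (O + O.T) / 2

O = np.array([[0.0, 1.0],
              [3.0, 0.0]])
A = symmetrize(O)   # a symmetric interaction matrix
```

A symmetric A is what the graph convolution in G P expects, since the interaction between two time series is taken to be mutual.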
As an alternative embodiment, the recurrent neural network is a long short-term memory network.
As an alternative embodiment, the network depth of the graph convolution network is set to 3 or 4 layers.
It should be noted that, since the method embodiment and the system embodiment are based on the same inventive concept, the corresponding content in the system embodiment is also applicable to the method embodiment, and is not repeated here.
A fourth embodiment;
Referring to fig. 8, there is provided a multiple-correlation time-series prediction apparatus based on a generative adversarial network, which may be any type of intelligent terminal such as a mobile phone, tablet computer or personal computer. Specifically, the apparatus includes one or more control processors and a memory, exemplified here by one control processor. The control processor and the memory may be connected by a bus or otherwise; a bus connection is taken as the example here.
The memory, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the multiple-correlation time-series prediction apparatus based on a generative adversarial network in the embodiments of the present invention. The control processor implements the multiple-correlation time-series prediction method based on a generative adversarial network of the above method embodiments by running the non-transitory software programs, instructions, and modules stored in the memory.
The memory may include a program storage area and a data storage area; the program storage area may store an operating system and at least one application required for a function, and the data storage area may store data created by use of the multiple-correlation time-series prediction system based on a generative adversarial network, and the like. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory optionally includes memory located remotely relative to the control processor, the remote memory being connectable via a network to the multiple-correlation time-series prediction apparatus based on a generative adversarial network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory and, when executed by the one or more control processors, perform the multiple-correlation time-series prediction method based on a generative adversarial network of the above method embodiments.
Embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions for causing one or more control processors to perform the multiple-correlation time-series prediction method based on a generative adversarial network of the above method embodiments.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented in software plus a general purpose hardware platform. Those skilled in the art will appreciate that all or part of the flow of the method of the above-described embodiments may be implemented by a computer program to instruct related hardware, and the program may be stored in a computer readable storage medium, and the program may include the flow of the embodiment of the method as described above when executed. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (8)

1. A multiple-correlation time-series prediction system based on a generative adversarial network, used for different time series of weather data recorded by weather stations at different locations, the weather data being daily temperature data, the system comprising:
An interaction matrix generator mapping the original random vector into an interaction matrix; the interaction matrix is used for representing interaction relations among different time sequences of the meteorological data;
The interaction matrix generator comprises a transposed convolution network, wherein a full connection layer of the transposed convolution network is used for mapping an original random vector into a three-dimensional characteristic representation, and a transposed convolution layer of the transposed convolution network is used for processing the three-dimensional characteristic representation to obtain an output matrix and carrying out symmetrical processing on the output matrix to obtain the interaction matrix;
the symmetrization process comprises the following steps:
wherein O is an output matrix, and A is an interaction matrix;
A predicted value generator for obtaining an intermediate feature representation from the time series interaction graph using a graph convolution network, and for processing the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series; the time sequence interaction graph is generated by the interaction matrix and a time sequence feature matrix;
A time-series discriminator trained on fake time-series samples and real time-series samples, the trained discriminator being used for feeding gradient information back to the interaction matrix generator and the predicted-value generator; the fake time-series samples are generated by appending the predicted value to the original time-series feature vector, and the real time-series samples by appending the true value.
2. The multiple-correlation time-series prediction system based on a generative adversarial network of claim 1, wherein the recurrent neural network is a long short-term memory network.
3. The multiple-correlation time-series prediction system based on a generative adversarial network of claim 1, wherein the network depth of the graph convolution network is set to 3 or 4 layers.
4. A method for predicting multiple related time series based on a generative adversarial network, applied to a multiple-correlation time-series prediction system based on a generative adversarial network, the system comprising an interaction matrix generator, a predicted-value generator and a time-series discriminator, which are connected to one another in pairs;
The method is used for different time sequences of meteorological data recorded by meteorological stations at different positions, wherein the meteorological data are data of daily temperature, and comprises the following steps:
Mapping the original random vector into an interaction matrix through the interaction matrix generator; the interaction matrix is used for representing interaction relations among different time sequences of the meteorological data; the interaction matrix generator comprises a transposed convolution network, wherein a full connection layer of the transposed convolution network is used for mapping an original random vector into a three-dimensional characteristic representation, and a transposed convolution layer of the transposed convolution network is used for processing the three-dimensional characteristic representation to obtain an output matrix and carrying out symmetrical processing on the output matrix to obtain the interaction matrix;
the symmetrization process comprises the following steps:
wherein O is an output matrix, and A is an interaction matrix;
Constructing a time sequence interaction diagram according to the interaction matrix and the time sequence feature matrix; the mapping of the original random vector into the interaction matrix comprises the following steps:
Mapping the original random vector into three-dimensional feature representation through a full connection layer, processing the three-dimensional feature representation by using a transposition convolution layer to obtain an output matrix, and carrying out symmetric processing on the output matrix to obtain the interaction matrix;
Obtaining intermediate feature representations from the time sequence interaction graph by using a graph convolution network through the predicted value generator, and processing the intermediate feature representations by using a cyclic neural network to obtain predicted values of each time sequence;
Training the time-series discriminator with the fake time-series samples and the real time-series samples, and feeding gradient information back to the interaction matrix generator and the predicted-value generator through the trained time-series discriminator; the fake time-series samples are generated by appending the predicted value to the original time-series feature vector, and the real time-series samples by appending the true value.
5. The method of claim 4, wherein the recurrent neural network is a long short-term memory network.
6. The method for predicting multiple related time series based on a generative adversarial network of claim 4, wherein the network depth of the graph convolution network is set to 3 or 4 layers.
7. A multiple-correlation time-series prediction apparatus based on a generative adversarial network, comprising: at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the method for predicting multiple related time series based on a generative adversarial network of any one of claims 4 to 6.
8. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method for predicting multiple related time series based on a generative adversarial network of any one of claims 4 to 6.
CN202011299519.6A 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation of countermeasure network Active CN112508170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011299519.6A CN112508170B (en) 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation of countermeasure network

Publications (2)

Publication Number Publication Date
CN112508170A CN112508170A (en) 2021-03-16
CN112508170B true CN112508170B (en) 2024-08-16

Family

ID=74958158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011299519.6A Active CN112508170B (en) 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation of countermeasure network

Country Status (1)

Country Link
CN (1) CN112508170B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673742B (en) * 2021-07-02 2022-06-14 华南理工大学 Distribution transformer area load prediction method, system, device and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110138595A (en) * 2019-04-12 2019-08-16 中国科学院深圳先进技术研究院 Time link prediction technique, device, equipment and the medium of dynamic weighting network
CN111126758A (en) * 2019-11-15 2020-05-08 中南大学 Academic team influence propagation prediction method, device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10650045B2 (en) * 2016-02-05 2020-05-12 Sas Institute Inc. Staged training of neural networks for improved time series prediction performance
KR101856170B1 (en) * 2017-09-20 2018-05-10 주식회사 모비젠 Apparatus for predicting error generation time of system based on time-series data and method thereof
RU2715024C1 (en) * 2019-02-12 2020-02-21 Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк) Method of trained recurrent neural network debugging
US20200342968A1 (en) * 2019-04-24 2020-10-29 GE Precision Healthcare LLC Visualization of medical device event processing
CN111612206B (en) * 2020-03-30 2022-09-02 清华大学 Neighborhood people stream prediction method and system based on space-time diagram convolution neural network
CN111475546A (en) * 2020-04-09 2020-07-31 大连海事大学 Financial time sequence prediction method for generating confrontation network based on double-stage attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant