
CN112508170B - Multiple-correlated time series prediction system and method based on a generative adversarial network - Google Patents


Info

Publication number
CN112508170B
CN112508170B (application CN202011299519.6A)
Authority
CN
China
Prior art keywords
time sequence
interaction
network
matrix
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011299519.6A
Other languages
Chinese (zh)
Other versions
CN112508170A (en)
Inventor
吴伟杰
黄芳
吴琪
欧阳洋
禹克强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202011299519.6A
Publication of CN112508170A
Application granted
Publication of CN112508170B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a system and a method for predicting multiple correlated time series based on a generative adversarial network, aiming at solving, end to end, the problem of predicting multiple correlated time series that arises widely in real life. Compared with other existing time series prediction models, the method can simultaneously capture the complex interaction relationships among multiple time series, giving it a unique advantage in multiple-correlated time series prediction tasks. In such problems, the complex interaction relationships between the time series are hidden inside the data, and conventional methods cannot extract them directly. The invention generates the interaction relationships with one generator, obtains predicted values with another generator, and optimizes the generated interaction relationships with a discriminator. Extracting the interactions directly from the data in this way avoids reliance on other prior knowledge.

Description

Multiple-correlated time series prediction system and method based on a generative adversarial network
Technical Field
The invention relates to the technical field of time series prediction, and in particular to a multiple-correlated time series prediction system and method based on a generative adversarial network.
Background
Time series data are found widely across society, the economy, and daily life, and their analysis and study have important value and significance. Precipitation, electricity consumption, commodity sales, and stock price fluctuations can all be regarded as typical time series data. Discovering the patterns of change in such series in order to predict future trends is of great significance for guiding the rational use of water resources, organizing production effectively, reducing inventory, and improving returns. The time series prediction problem has long attracted the attention of researchers worldwide. The ARIMA model (Autoregressive Integrated Moving Average) is widely used for time series prediction, but it is a linear model grounded in the linear regression theory of statistics and has obvious shortcomings when fitting data with complex patterns. In recent years, with the continuing development of artificial intelligence, many machine learning and deep learning methods have also been applied to time series prediction. Models such as support vector machines and neural networks have strong fitting capability and are well suited to predictive analysis of complex time series data. So far, most of this research has focused on single time series prediction and has achieved good results and applications. In the real world, however, the problems encountered tend to be more complex, and the object of study often contains multiple time series.
Compared with single time series prediction, predicting multiple time series requires considering not only the temporal relationships within each series but also the mutual influence between series, which is called an interaction relationship (Interaction Relationship). Most current time series prediction methods struggle to capture the influence of the complex interaction relationships among different series on the prediction result, and therefore often fail to achieve a satisfactory prediction effect. For time series prediction with multiple correlated series, effectively capturing these hidden interactions among the series is thus the biggest challenge in solving the problem.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. The invention therefore provides a multiple-correlated time series prediction system and method based on a generative adversarial network, which can simultaneously capture the complex interaction relationships among multiple time series and the temporal relationships within each series, giving it a unique advantage in multiple-correlated time series prediction tasks.
In a first aspect of the present invention, there is provided a multiple-correlated time series prediction system based on a generative adversarial network, comprising:
an interaction matrix generator, which maps an original random vector into an interaction matrix;
a predicted value generator, which obtains an intermediate feature representation from the time series interaction graph using a graph convolutional network, and processes the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series; the time series interaction graph is generated from the interaction matrix and a time series feature matrix;
a time series discriminator, which is trained on false and real time series samples; after training, the time series discriminator feeds gradient information back to the interaction matrix generator and the predicted value generator. The false time series samples are formed by appending the predicted values to the original time series feature vectors, and the real time series samples by appending the true values.
According to the embodiment of the invention, at least the following technical effects are achieved:
Compared with other existing time series prediction models, the system can simultaneously capture the complex interaction relationships among multiple time series, giving it a unique advantage in multiple-correlated time series prediction tasks.
In multiple-correlated time series prediction problems, the complex interaction relationships between the series are hidden inside the data, and conventional methods cannot extract them directly. The system generates the interaction relationships with one generator, obtains predicted values with another generator, and optimizes the generated interaction relationships with a discriminator. Extracting the interactions directly from the data in this way avoids reliance on other prior knowledge.
According to some embodiments of the invention, the interaction matrix generator comprises a transposed convolutional network.
According to some embodiments of the invention, the recurrent neural network is a long short-term memory network.
According to some embodiments of the invention, the network depth of the graph convolutional network is set to 3 or 4 layers.
In a second aspect of the present invention, there is provided a multiple-correlated time series prediction method based on a generative adversarial network, applied to a multiple-correlated time series prediction system based on a generative adversarial network; the system comprises an interaction matrix generator, a predicted value generator and a time series discriminator, which are connected with one another in pairs.
The method comprises the following steps:
mapping an original random vector into an interaction matrix through the interaction matrix generator;
constructing a time series interaction graph from the interaction matrix and the time series feature matrix;
obtaining an intermediate feature representation from the time series interaction graph using a graph convolutional network through the predicted value generator, and processing the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series;
training the time series discriminator on false and real time series samples, and feeding gradient information back to the interaction matrix generator and the predicted value generator through the trained discriminator; the false time series samples are formed by appending the predicted values to the original time series feature vectors, and the real time series samples by appending the true values.
According to the embodiment of the invention, at least the following technical effects are achieved:
Compared with other existing time series prediction models, the method can simultaneously capture the complex interaction relationships among multiple time series, giving it a unique advantage in multiple-correlated time series prediction tasks.
In multiple-correlated time series prediction problems, the complex interaction relationships between the series are hidden inside the data, and conventional methods cannot extract them directly. The method generates the interaction relationships with one generator, obtains predicted values with another generator, and optimizes the generated interaction relationships with a discriminator. Extracting the interactions directly from the data in this way avoids reliance on other prior knowledge.
According to some embodiments of the invention, mapping the original random vector into the interaction matrix comprises the steps of:
mapping the original random vector into a three-dimensional feature representation through a fully connected layer, processing the three-dimensional feature representation through transposed convolution layers to obtain an output matrix, and symmetrizing the output matrix to obtain the interaction matrix.
According to some embodiments of the invention, the recurrent neural network is a long short-term memory network.
According to some embodiments of the invention, the network depth of the graph convolutional network is set to 3 or 4 layers.
In a third aspect of the present invention, there is provided a multiple-correlated time series prediction apparatus based on a generative adversarial network, comprising: at least one control processor and a memory communicatively connected with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable it to perform the multiple-correlated time series prediction method based on a generative adversarial network according to the second aspect of the invention.
In a fourth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the multiple-correlated time series prediction method based on a generative adversarial network according to the second aspect of the present invention.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a time series interaction graph comprising five time series according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a multiple-correlated time series prediction system based on a generative adversarial network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the workflow of the interaction matrix generator according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the workflow of the predicted value generator according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the time series discriminator according to an embodiment of the present invention;
FIG. 6 is a graph of experimental data on the influence of the interaction matrix generator on performance according to an embodiment of the present invention;
FIG. 7 is a graph of experimental data on the influence of GCN network depth on performance according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a multiple-correlated time series prediction apparatus based on a generative adversarial network according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
For ease of understanding, multiple-correlated time series prediction is described first:
Assume the object of study contains n time series T_1, T_2, …, T_n and the time to predict is t+1, so the available data features are T_1^[t-w+1,t], T_2^[t-w+1,t], …, T_n^[t-w+1,t], the historical values of each series within a sliding window of length w. The goal of multiple-correlated time series prediction is to train a model f mapping these data features to the predicted value of each series at time t+1:
[T_1^{t+1}, T_2^{t+1}, …, T_n^{t+1}] = f(T_1^[t-w+1,t], T_2^[t-w+1,t], …, T_n^[t-w+1,t])   (1)
The time series feature vector T_i^[t-w+1,t] consists of the historical values of series i covered by a sliding window of length w at time t; the feature vector of the i-th series is:
T_i^[t-w+1,t] = [T_i^{t-w+1}, T_i^{t-w+2}, …, T_i^t]   (2)
The time series feature matrix X_{n×w} is composed of the n time series feature vectors, one per row. The number of columns equals the window length w and the number of rows equals the number of series n:
X_{n×w} = [ T_1^{t-w+1}  T_1^{t-w+2}  …  T_1^t
            T_2^{t-w+1}  T_2^{t-w+2}  …  T_2^t
            …
            T_n^{t-w+1}  T_n^{t-w+2}  …  T_n^t ]   (3)
Time series interaction graph G: the interactions of the time series are represented as a weighted undirected graph G = (V, E). V is the set of nodes, each node corresponding to a time series feature vector, so the node set V can be described by the time series feature matrix X_{n×w}. The weighted edge set E represents the weighted adjacency between nodes of the interaction graph and describes the interaction relationships between different time series. The adjacency matrix of the interaction graph, denoted A_{n×n}, is called the interaction matrix; each matrix element corresponds to one edge in E, and the value range of A_{n×n} is defined in formula (4). Fig. 1 shows a time series interaction graph comprising five time series.
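As an illustration of the feature vector of formula (2) and the feature matrix X_{n×w}, the sliding window can be materialized in a few lines of NumPy. This is a hypothetical sketch (the helper name, 0-based indexing, and toy data are assumptions, not from the patent):

```python
import numpy as np

def feature_matrix(series, t, w):
    """Build the time series feature matrix X (n x w): row i holds the
    last w values of series i up to and including time t (0-based)."""
    series = np.asarray(series)              # shape (n, total_length)
    return series[:, t - w + 1 : t + 1]      # each row is T_i^[t-w+1, t]

# Toy example: n = 3 series of length 10, window w = 4, current time t = 6.
data = np.arange(30).reshape(3, 10)
X = feature_matrix(data, t=6, w=4)           # rows cover times 3..6
```

Each row of `X` is then a node feature vector of the time series interaction graph G.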
A first embodiment;
Provided is a multiple-correlated time series prediction system based on a generative adversarial network. The system comprises an interaction matrix generator, a predicted value generator and a time series discriminator, denoted G_M, G_P and D respectively; it is a deep learning model based on a generative adversarial network, and its overall architecture is shown in fig. 2.
The interaction matrix generator G_M maps an original random vector into an interaction matrix.
As an alternative embodiment, the generator G_M comprises a transposed convolutional network, as shown in fig. 3: a fully connected layer maps the original random vector into a three-dimensional feature representation, transposed convolution layers process the three-dimensional feature representation to obtain an output matrix, and the output matrix is symmetrized to obtain the interaction matrix. The interaction matrix represents the interaction relationships between the different time series.
The role of the generator G_M is to generate a two-dimensional matrix. In this embodiment G_M is implemented with a transposed convolutional network. The effect of a transposed convolution is exactly the opposite of a convolution: it converts a coarse-grained representation into a fine-grained one, and is equivalent to an up-sampling method. The transposed convolutional network has the properties of local connectivity and convolution-kernel parameter sharing, which greatly reduce the number of network parameters compared with a fully connected neural network and make it more efficient on large-scale data. As shown in fig. 3, a high-dimensional random noise vector sampled from a Gaussian distribution is used as the input to G_M and is mapped by a fully connected layer into a three-dimensional feature representation whose dimensions are length, width, and number of channels. The transposed convolution layers then continue to process this feature representation; with each layer, the number of channels decreases while the length and width increase, and the network finally outputs an n × n × 1 tensor, where n is the number of time series to be processed. The interaction matrix is obtained by a matrix symmetrization of the transposed convolutional network's output. The symmetrization operation is shown in formula (5), where O is the output matrix and A is the symmetric matrix.
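Formula (5) itself is not reproduced in the text above; a common symmetrization, assumed here purely for illustration, averages the output matrix with its transpose. A minimal NumPy sketch:

```python
import numpy as np

def symmetrize(O):
    """Turn the transposed-convolution output O into a symmetric
    interaction matrix A. Averaging O with its transpose is one standard
    choice; the patent's exact formula (5) is not shown, so this is an
    assumption."""
    return (O + O.T) / 2.0

O = np.array([[0.0, 0.8],
              [0.2, 0.0]])   # toy 2 x 2 network output
A = symmetrize(O)
```

The resulting `A` satisfies A = Aᵀ, as required for the adjacency matrix of a weighted undirected graph.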
The predicted value generator G_P obtains an intermediate feature representation from the time series interaction graph using a graph convolutional network, and processes the intermediate feature representation with a recurrent neural network to obtain the predicted value of each time series; the time series interaction graph is generated from the interaction matrix and the time series feature matrix.
In multiple-correlated time series prediction there are two dependencies to resolve: 1) the interaction relationships between the series; 2) the temporal relationships within each series. The interaction relationships are obtained through the generator G_M, and the purpose of the generator G_P is to process both kinds of dependency together. The workflow of G_P is shown in fig. 4. First, the interaction matrix generated by G_M is used as the adjacency matrix of the time series interaction graph and the time series feature matrix as the node feature matrix; together the two matrices define the graph. The feature vector on each node contains the temporal relationships within that series, while the weighted edges between nodes carry the interaction relationships between series. An intermediate feature representation is then obtained by processing the interaction graph with a graph convolutional network (GCN). From the perspective of graph embedding (Graph Embedding), the GCN embeds the topology information of the interaction graph, i.e. the information on the interaction edges, into the output feature representation. The feature representation obtained by GCN processing therefore carries two kinds of information: 1) the information in the time series feature matrix, containing the temporal relationships within each series; 2) the information in the interaction matrix, containing the interaction relationships between series. Finally, the intermediate feature representation is processed by a recurrent neural network to generate the final predicted values.
As an alternative embodiment, the process of obtaining an intermediate feature representation via GCN processing is provided below:
The generator G_P models the interactions between time series with a GCN. The graph convolution layer used in G_P is shown in formula (6):
H = σ( D̂^{-1/2} Â D̂^{-1/2} X W )   (6)
where Â = A + I, A is the adjacency matrix obtained by symmetrizing the interaction matrix, and I is the n-dimensional identity matrix. Converting A into Â is equivalent to adding to each node a self-loop edge pointing to itself; this is done so that a node's own original information is not lost during the graph convolution. D̂ is the degree matrix corresponding to Â: its main-diagonal elements are D̂_ii = Σ_j Â_ij and all other elements are 0. Multiplying Â on the left and right by D̂^{-1/2} is the normalization step of the graph convolution, which prevents the scales of the node features from becoming inconsistent after convolution. X is the time series feature matrix on the graph, each row of which is a time series feature vector, and W is a trainable parameter of the GCN. H is the representation matrix obtained after the graph convolution. Writing the propagation as a whole in matrix form gives formula (7):
H = D̂^{-1/2} Â D̂^{-1/2} X W   (7)
and writing X and H as stacks of row vectors gives formulas (8) and (9):
X = [x_1; x_2; …; x_n]   (8)
H = [h_1; h_2; …; h_n]   (9)
Because multiplying by the parameter matrix W does not change the dimensions relevant to the analysis, its influence is ignored here. Each node feature vector output after one graph convolution layer is then given by formula (10):
h_i = Σ_j ( Â_ij / √(D̂_ii D̂_jj) ) x_j   (10)
so the output feature vector is in fact a weighted sum of the feature vectors of the node itself and all of its adjacent nodes before the layer, where the weight coefficients are the normalized edge weights between the nodes.
According to the above analysis, processing the time series interaction graph with the GCN models the interaction relationships between the time series. The core operation is the weighted fusion of the feature vectors on a node and on its first-order neighbors, according to the correlation magnitudes carried by the edges between nodes, producing a new feature vector representation. The receptive field of a node during graph convolution is the set of all its first-order neighbors, and the larger a neighbor's correlation, the larger its weight coefficient and its influence on the node's newly generated feature vector.
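The normalized propagation just described can be sketched in NumPy. This is a minimal illustration of the self-loop, normalization, and weighted-sum behavior; it omits the nonlinearity, and the toy adjacency, shapes, and identity weights are assumptions for inspection:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution propagation: add self-loops, symmetrically
    normalize by the degree matrix of A_hat, then propagate features."""
    A_hat = A + np.eye(A.shape[0])            # self-loop edges
    d = A_hat.sum(axis=1)                     # degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D_hat^{-1/2}
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])               # 3-node path graph
X = np.random.randn(3, 4)                     # 3 series, window w = 4
W = np.eye(4)                                 # identity weights for inspection
H = gcn_layer(A, X, W)
```

With identity weights, node 0's output is exactly the weighted sum x_0/2 + x_1/√6 of its own and its neighbor's features, matching the weighted-fusion interpretation above (degrees of Â are 2 and 3 here).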
As an alternative embodiment, the recurrent neural network used by the generator G_P is a long short-term memory network (LSTM). Processing the time series interaction graph with the GCN yields an intermediate feature representation that contains both the temporal relationships within each series and the complex interactions between series. After extracting the interaction relationships, G_P must generate future predicted values from the intermediate feature representation, that is, extract the temporal relationships. Recurrent neural networks are suited to processing sequential data, and the long short-term memory network is a typical representative; this embodiment uses an LSTM to extract the temporal relationships and generate the future predicted value of each series. The LSTM uses "gate" structures to let information selectively affect the state of the network at each time step. A gate is a neural network using a sigmoid activation function together with an element-wise multiplication, and is not described in detail here. The gates of the LSTM are defined as follows:
z = tanh(W_z [h_{t-1}, x_t] + b_z)   (11)
i = sigmoid(W_i [h_{t-1}, x_t] + b_i)   (12)
f = sigmoid(W_f [h_{t-1}, x_t] + b_f)   (13)
o = sigmoid(W_o [h_{t-1}, x_t] + b_o)   (14)
c_t = f ⊙ c_{t-1} + i ⊙ z   (15)
h_t = o ⊙ tanh(c_t)   (16)
where i, f and o denote the input gate, forget gate and output gate respectively; c_t denotes the memory cell at time t, which can be regarded as a representation vector of the sequence information seen so far; h_t denotes the output value at time t; and W and b denote the weight and bias parameters of each gate in the LSTM.
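The gate definitions (11)–(16) can be sketched as a single NumPy step. The dimensions and random parameters below are illustrative assumptions, not the patent's configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step implementing formulas (11)-(16). `params` maps each
    gate name to its (W, b) pair acting on the concatenation [h_{t-1}, x_t]."""
    concat = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    z = np.tanh(params["z"][0] @ concat + params["z"][1])      # (11)
    i = sigmoid(params["i"][0] @ concat + params["i"][1])      # (12)
    f = sigmoid(params["f"][0] @ concat + params["f"][1])      # (13)
    o = sigmoid(params["o"][0] @ concat + params["o"][1])      # (14)
    c_t = f * c_prev + i * z                                   # (15)
    h_t = o * np.tanh(c_t)                                     # (16)
    return h_t, c_t

rng = np.random.default_rng(0)
dim_h, dim_x = 3, 2
params = {g: (rng.normal(size=(dim_h, dim_h + dim_x)), np.zeros(dim_h))
          for g in ("z", "i", "f", "o")}
h, c = np.zeros(dim_h), np.zeros(dim_h)
h, c = lstm_step(rng.normal(size=dim_x), h, c, params)
```

Since o ∈ (0, 1) and |tanh(c_t)| < 1, each component of the output h_t stays strictly inside (−1, 1), which is the "selective influence" the gates provide.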
In the generator G_P, the GCN performs the graph convolution according to the interaction relationships between the time series, producing an intermediate feature representation that fuses the temporal relationships within each series with the complex interactions between series. This feature representation is fed into the LSTM, which processes the sequences and generates a predicted value for each time series.
The time series discriminator D is trained on false and real time series samples; after training, it feeds gradient information back to the interaction matrix generator and the predicted value generator. The false time series samples are formed by appending the predicted values to the original time series feature vectors, and the real time series samples by appending the true values.
In GAN, the arbiter acts as a party to the game with the producer, and needs to make a correct distinction between the data generated by the producer and the real data. In the present system, the generator G P generates a predicted value for each time series, and if the predicted value is directly simulated in GAN, the meaning of distinguishing the predicted value from the true value by the arbiter is not great. The system is implemented by adding the predicted value generated by the generator G P to the back of the original time-series feature vector to construct a false time-series sample, and adding the true value to the back of the original time-series feature vector to construct a true time-series sample. The specific forms of the dummy time series samples and the real time series samples are represented by the formula (17) and the formula (18), respectively.
T̃i [t-w+1,t+1] = [Ti t-w+1, Ti t-w+2, …, Ti t, T̂i t+1] (17)
Ti [t-w+1,t+1] = [Ti t-w+1, Ti t-w+2, …, Ti t, Ti t+1] (18)
where T̂i t+1 denotes the predicted value and Ti t+1 the true value at time t+1.
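The construction in equations (17) and (18) amounts to appending one value to the length-w feature window, as this small sketch shows (variable names are illustrative):

```python
import numpy as np

def make_samples(window, y_true, y_pred):
    """Append the true value (real sample, eq. 18) or the predicted
    value (fake sample, eq. 17) to the length-w feature window."""
    real = np.concatenate([window, [y_true]])
    fake = np.concatenate([window, [y_pred]])
    return real, fake

window = np.array([1.0, 2.0, 3.0, 4.0])   # T_i over t-w+1 .. t, w = 4
real, fake = make_samples(window, y_true=5.0, y_pred=4.7)
```

Both samples share the same w observed values; they differ only in the final entry, which is what the discriminator must learn to tell apart.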
The function of the discriminator D is to correctly distinguish the real and fake time-series samples constructed above; its structure is shown in fig. 5, and the discriminator has two inputs. The first is an embedding layer (Embedding), which accepts a one-hot encoded (One-hot Encoding) vector and outputs a low-dimensional dense vector. This one-hot vector is a very high-dimensional sparse vector that identifies which time series in the dataset the current sample comes from. The second input is a bidirectional long short-term memory network (Bidirectional LSTM), which accepts a time-series sample. The bidirectional LSTM combines two unidirectional LSTMs: at each time step t, the input is given simultaneously to the two LSTMs running in opposite directions. The two networks compute independently, each producing its own hidden state and output at that step, and apart from direction the two unidirectional LSTMs are completely symmetric. The forward LSTM's output at the last time step encodes the forward temporal information of the sample, the backward LSTM's output at the first time step encodes the reverse temporal information, and the output of the bidirectional network is simply the concatenation of these two output vectors. Finally, the embedding-layer output and the bidirectional-LSTM output are concatenated and fed into a fully connected network, which gives the probability that the input time-series sample is real.
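The concatenation of the forward LSTM's last output with the backward LSTM's output at the first time step can be sketched as follows, using a toy recurrence as a stand-in for the LSTM so the code stays self-contained:

```python
import numpy as np

def bilstm_encode(seq, run_lstm):
    """Bidirectional encoding: concatenate the forward LSTM's last
    state with the backward LSTM's state at the first time step,
    i.e. its last state when run over the reversed sequence."""
    h_fwd = run_lstm(seq)[-1]          # forward pass, last output
    h_bwd = run_lstm(seq[::-1])[-1]    # backward pass, first time step
    return np.concatenate([h_fwd, h_bwd])

def toy_lstm(seq):
    """Tiny stand-in recurrence with a 2-dim hidden state."""
    h, states = np.zeros(2), []
    for x in seq:
        h = np.tanh(0.5 * h + np.array([x, -x]))
        states.append(h)
    return states

code = bilstm_encode([0.1, 0.2, 0.3], toy_lstm)   # 4-dim BiLSTM code
```

The resulting vector has twice the hidden dimension, one half carrying forward and the other reverse temporal information.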
From the system's overall perspective, the generator G M fits the data distribution of the interaction matrix, G M(z; θ M); the generator G P fits the data distribution G P(G M(z; θ M), X; θ P) of each time series' predicted value given the time-series feature matrix and the interaction matrix; and the ultimate goal of both generators is to fool the discriminator D, which outputs a probability D(y, X; θ D) for each constructed time-series sample. The optimization in which G M, G P and D compete as the two parties of a minimax game (Minimax Game) can be expressed formally as equation (19):
min over G M, G P, max over D: E y~p_data [log D(y, X; θ D)] + E z~p_z [log(1 - D(G P(G M(z; θ M), X; θ P), X; θ D))] (19)
Finally, the complete adversarial training algorithm of the system is summarized as the pseudo code in table 1:
TABLE 1
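The adversarial objective of equation (19) reduces in practice to binary cross-entropy terms for the discriminator and the generators. The sketch below shows this standard GAN loss form; it is an assumption about the system's exact implementation, included to make the minimax structure concrete.

```python
import numpy as np

def bce(p, label):
    """Binary cross-entropy for a single probability p and target label."""
    p = np.clip(p, 1e-7, 1 - 1e-7)   # numerical safety
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def d_loss(p_real, p_fake):
    """Discriminator term: classify real samples as 1, fake as 0."""
    return bce(p_real, 1.0) + bce(p_fake, 0.0)

def g_loss(p_fake):
    """Generators' term: maximize the probability that D calls fakes real."""
    return bce(p_fake, 1.0)
```

A confident discriminator (p_real high, p_fake low) has a small d_loss, while the generators' loss shrinks as D's output on fake samples approaches 1, which is exactly the opposing pull of the minimax game.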
The overall workflow of the system can be divided into a generation process, a discrimination process and an adversarial process. In the generation process, the generator G M first converts a raw random vector into an interaction matrix, which is then combined with the time-series feature matrix to construct a time-series interaction graph. The generator G P then extracts the interactions between time-series nodes and the temporal dependencies within each individual series from the interaction graph, generating a future prediction for each time series. In the discrimination process, real and fake samples are first constructed: a real time-series sample is obtained by a matrix concatenation operation (Concatenate) of the real data (Real Data) in the time-series feature matrix with the real label (Real Target); a fake time-series sample is obtained by concatenating the real data of the feature matrix with the fake label (Fake Target) generated by G P. The discriminator D is then trained with these real and fake samples as a training set; training is complete when it can correctly distinguish them. In the final adversarial process, the trained discriminator is fixed as an evaluation function for the generators, and the network parameters of G M and G P are adjusted to maximize the probability that the fake samples they generate are judged real by the discriminator.
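The three-phase workflow above can be condensed into the following skeleton. G_M, G_P and the data here are simplistic stand-ins for the real networks; only the data flow between the phases mirrors the description.

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, w = 4, 8
X = rng.normal(size=(n_series, w))      # time-series feature matrix
y_true = rng.normal(size=n_series)      # real next-step targets

def G_M(z):
    """Stand-in interaction matrix generator: noise -> symmetric matrix."""
    O = np.outer(z, z)
    return (O + O.T) / 2                # symmetrized (assumed form)

def G_P(A, X):
    """Stand-in predicted-value generator: graph + features -> predictions."""
    return (A @ X).mean(axis=1)

for step in range(3):
    # 1. generation: build the interaction graph and predict
    z = rng.normal(size=n_series)
    A = G_M(z)
    y_pred = G_P(A, X)
    # 2. discrimination: real = window + true value, fake = window + prediction
    real = np.concatenate([X, y_true[:, None]], axis=1)
    fake = np.concatenate([X, y_pred[:, None]], axis=1)
    # 3. adversarial: D's gradient feedback would update G_M and G_P here
```

Each iteration produces one batch of real and fake samples of identical shape, which is what the discriminator consumes in the second phase.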
Compared with other existing time-series prediction models, the system can capture both the complex interactions among multiple time series and the temporal dependencies within each series, giving it unique advantages in the task of predicting multiple related time series.
In the multiple-related-time-series prediction problem, the complex interactions between series are hidden inside the data, and conventional methods cannot extract them directly. The system generates the interaction relationships with one generator, obtains predicted values with another, and optimizes the generated interactions through the discriminator. Extracting interactions directly from the data in this way avoids reliance on additional prior knowledge. In addition, the system implements the interaction matrix generator with a transposed convolution network, which improves scalability.
A second embodiment;
This embodiment provides simulation results for the system, which comprises the generator G M, the generator G P and the discriminator D; G M is built from a transposed convolution network, and the predicted-value generator G P is composed of a graph convolution network and an LSTM network.
To verify the effectiveness of the system, its prediction performance is compared with that of other reference methods on different data sets and evaluated with different metrics, verifying both effectiveness and applicability. The influence of the system's structure on prediction performance, namely the effect of the generator G M structure and of the GCN network depth, is then investigated experimentally.
1. A data set;
(1) Store Item Demand Dataset, which provides daily sales records of 50 different items in 10 different stores; each item's sales record starts on January 1, 2013 and ends on December 31, 2017. That is, the data set contains 500 time series, each spanning 1826 days.
(2) Web Traffic Dataset, which records the change of Wikipedia website traffic over time. The full data set contains approximately 145,000 time series, each representing the daily visits to one Wikipedia page, recorded over a period of 804 days beginning in 2015 and ending in September 2017. The data set contains missing values; the data used in the experiments are 500 time series screened from the set that contain no missing values.
(3) NOAA China Dataset, weather data recorded by weather stations at different locations in China, provided by the United States National Oceanic and Atmospheric Administration. This embodiment extracts the daily temperature data of 400 different weather stations from 2015 to 2018 as experimental data.
2. Setting an experiment;
(1) Setting system parameters;
The system is implemented with the PyTorch deep learning framework. In the generator G M, the dimension of the random noise vector is set to 128, and its distribution follows a Gaussian distribution. In the generator G P, the number of GCN layers is set to 3, the number of LSTM hidden layers to 3, and the hidden-layer dimension to 64, so the LSTM output requires a fully connected layer to transform the dimension from 64 to 1. In the discriminator D, the embedding-layer dimension is set to 8, the number of hidden layers of the bidirectional LSTM to 3, and the hidden-layer dimension to 64. During training, the learning rate is set to 0.001, the batch size to 32, Adam is used as the optimization algorithm, and Dropout is used to avoid overfitting, with the Dropout parameter set to 0.2.
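The hyper-parameters listed above can be collected into one configuration mapping; the key names below are illustrative, not the authors' actual variable names.

```python
# Experimental settings from the text, gathered as a single config dict.
config = {
    "noise_dim": 128,          # G_M input: Gaussian random vector
    "gcn_layers": 3,           # G_P graph-convolution depth
    "lstm_layers": 3,          # G_P LSTM hidden layers
    "lstm_hidden": 64,         # reduced to 1 by a final FC layer
    "embed_dim": 8,            # discriminator embedding layer
    "bilstm_layers": 3,        # discriminator BiLSTM hidden layers
    "bilstm_hidden": 64,
    "lr": 1e-3,
    "batch_size": 32,
    "optimizer": "Adam",
    "dropout": 0.2,
}
```

Keeping the settings in one dict makes it easy to log them alongside results when reproducing the experiments.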
(2) A reference method;
Autoregressive Integrated Moving Average (ARIMA), a very widely used time-series prediction method that first makes the series stationary through differencing and then combines an AR model and an MA model to predict future values;
Vector Autoregression (VAR), often used for multidimensional time-series prediction, which can account for correlations between variables of different dimensions;
Support Vector Regression (SVR), a well-known machine learning model with solid mathematical foundations;
LightGBM (LGB), a gradient-boosted tree model proposed and implemented by Microsoft, which solves classification and regression problems and has shown strong predictive performance in numerous data mining competitions;
Long Short-Term Memory (LSTM), a recurrent neural network model well suited to processing sequential data;
Gated Recurrent Unit (GRU), also a recurrent neural network model, which modifies and simplifies the gating mechanism of the LSTM and trains more efficiently.
(3) Simulation results;
TABLE 2
Table 2 compares the prediction accuracy of the present system with six other methods on the three data sets: Store Item, Web Traffic, and NOAA China. The table shows that the system achieves the best prediction on all three data sets under both the MAE and RMSE criteria. Among the comparison methods, ARIMA is a single-time-series method that does not consider the interactions existing between series in the multiple-related-series setting, and its prediction is the worst in the experiments. VAR converts the multiple-related-series problem into a multidimensional time-series problem; it can capture correlations between series to some extent, but because it is a linear model with limited capacity for fitting complex patterns, it only outperforms ARIMA. SVR and LGB are both excellent machine learning models with very close results, LGB being slightly better overall. LSTM and GRU are both deep learning models with very similar structures; LSTM predicts slightly better than GRU, but GRU trains noticeably faster. LGB performs better than LSTM on the Store Item and NOAA China data sets, while LSTM performs better on the Web Traffic data set. Taken together, the system's prediction on all three data sets surpasses the other six methods, demonstrating that it holds a clear advantage in multiple related time-series prediction problems.
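The MAE and RMSE criteria used in table 2 have the routine definitions below, included for completeness:

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error between targets and predictions."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean squared error between targets and predictions."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.0, 2.5, 2.0])
```

RMSE penalizes large errors more heavily than MAE, which is why the two criteria can rank close methods differently.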
3. Effect of the interaction matrix generator G M on performance;
A comparative experiment examined how implementing the interaction matrix generator G M with a fully connected neural network versus a transposed convolutional neural network affects the system's prediction performance; the results are shown in fig. 6. The two subplots in each row show the prediction performance on one data set for the system implemented with each of the two networks (FCN denotes the fully connected network and TConv the transposed convolutional network). The horizontal axis is the number of time series in the data set used by the model; the vertical axis is the model's prediction error, with MAE as the evaluation metric in the left column of subplots and RMSE in the right column. The experiments show that when the number of time series is small, for example between 50 and 100, the two implementations perform about equally. As the number of time series grows, however, the transposed-convolution implementation outperforms the fully connected one, and the larger the number of series, the more pronounced the gap. This advantage of the transposed convolutional network likely stems from its local connectivity and parameter sharing, like a convolutional network, which make it more efficient at processing two-dimensional grid data and thus yield better prediction performance for the interaction matrix generator G M.
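The parameter-sharing argument can be made concrete with a rough parameter-count comparison for producing an N x N interaction matrix: a single fully connected layer from a d-dimensional code needs d*N*N weights, while one (transposed) convolution layer's weight count depends only on kernel size and channel counts, not on N. The formulas below are for single layers and ignore biases; they are a back-of-the-envelope sketch, not the patent's architecture.

```python
def fc_params(d, N):
    """Weights in one fully connected layer mapping d -> N*N."""
    return d * N * N

def tconv_params(c_in, c_out, k):
    """Weights in one transposed-convolution layer with a k x k kernel;
    independent of the output spatial size N."""
    return c_in * c_out * k * k

# e.g. a 128-dim code, 500 time series, vs. a 4x4 kernel with 64 channels
fc_500 = fc_params(128, 500)        # grows quadratically with N
tc = tconv_params(64, 1, 4)         # constant in N
```

The quadratic growth of the fully connected layer with the number of series matches the experimental finding that the gap widens as the data set grows.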
4. The influence of the depth of the GCN network on the performance;
This experiment studies how the number of GCN layers affects the system's prediction performance; the results are shown in fig. 7. Five-fold cross-validation is used to measure the system's prediction performance on the training and test sets of each of the three data sets; the evaluation metric in the first row of subplots is MAE and in the second row RMSE. Under both criteria, the system has the best fitting ability (minimum training-set error) and the best generalization ability (minimum test-set error) when the GCN has 3 to 4 layers. With fewer than 3 layers the system underfits the data, and both training error and generalization error decrease gradually as layers are added. With more than 6 layers the system begins to overfit, and the generalization error increases markedly as further layers are added.
A third embodiment;
There is provided a method for predicting multiple related time series based on a generative adversarial network, comprising the steps of:
s100, mapping an original random vector into an interaction matrix through an interaction matrix generator;
s200, constructing a time sequence interaction diagram according to the interaction matrix and the time sequence feature matrix;
S300, obtaining intermediate feature representations from the time-series interaction graph with a graph convolution network through the predicted-value generator, and processing the intermediate feature representations with a recurrent neural network to obtain a predicted value for each time series;
S400, training the time-series discriminator with fake and real time-series samples, and feeding gradient information back to the interaction matrix generator and the predicted-value generator through the trained discriminator; the fake time-series samples are generated by appending predicted values to the original time-series feature vectors, and the real time-series samples by appending true values.
As an alternative embodiment, mapping the original random vector into an interaction matrix comprises the steps of:
Mapping the original random vector into a three-dimensional feature representation through a fully connected layer, processing the three-dimensional representation with a transposed convolution layer to obtain an output matrix, and symmetrizing the output matrix to obtain the interaction matrix.
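The symmetrization step can be realized, for example, by averaging the output matrix with its transpose. This particular formula is an assumption, since the document's symmetrization equation is given as an image and other symmetrizations are possible.

```python
import numpy as np

def symmetrize(O):
    """One common symmetrization (assumed here): average the
    transposed-convolution output O with its transpose."""
    return (O + O.T) / 2

O = np.array([[0.0, 1.0],
              [3.0, 0.0]])
A = symmetrize(O)   # a symmetric interaction matrix
```

A symmetric A is what the graph convolution in G P expects, since the interaction between two time series is taken to be mutual.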
As an alternative embodiment, the recurrent neural network is a long short-term memory network.
As an alternative embodiment, the network depth of the graph convolution network is set to 3 or 4 layers.
It should be noted that, since the method embodiment and the system embodiment are based on the same inventive concept, the corresponding content in the system embodiment is also applicable to the method embodiment, and is not repeated here.
A fourth embodiment;
Referring to fig. 8, there is provided a multiple-correlation time-series prediction apparatus based on a generative adversarial network, which may be any type of intelligent terminal such as a mobile phone, tablet computer or personal computer. Specifically, the apparatus includes one or more control processors and a memory, exemplified here by one control processor. The control processor and the memory may be connected by a bus or otherwise; a bus connection is taken as the example here.
The memory, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the multiple-correlation time-series prediction apparatus based on a generative adversarial network in the embodiments of the present invention. The control processor implements the multiple-correlation time-series prediction method based on a generative adversarial network of the above method embodiments by running the non-transitory software programs, instructions, and modules stored in the memory.
The memory may include a program storage area and a data storage area; the program storage area may store an operating system and at least one application required for a function, and the data storage area may store data created by use of the multiple-correlation time-series prediction system based on a generative adversarial network, and the like. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory optionally includes memory located remotely relative to the control processor, the remote memory being connectable via a network to the multiple-correlation time-series prediction apparatus based on a generative adversarial network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory and, when executed by the one or more control processors, perform the multiple-correlation time-series prediction method based on a generative adversarial network of the above method embodiments.
Embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions for causing one or more control processors to perform the multiple-correlation time-series prediction method based on a generative adversarial network of the above method embodiments.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented in software plus a general purpose hardware platform. Those skilled in the art will appreciate that all or part of the flow of the method of the above-described embodiments may be implemented by a computer program to instruct related hardware, and the program may be stored in a computer readable storage medium, and the program may include the flow of the embodiment of the method as described above when executed. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (8)

1. A multiple-correlation time-series prediction system based on a generative adversarial network, used for different time series of weather data recorded by weather stations at different locations, the weather data being daily temperature data, the system comprising:
An interaction matrix generator mapping the original random vector into an interaction matrix; the interaction matrix is used for representing interaction relations among different time sequences of the meteorological data;
The interaction matrix generator comprises a transposed convolution network, wherein a full connection layer of the transposed convolution network is used for mapping an original random vector into a three-dimensional characteristic representation, and a transposed convolution layer of the transposed convolution network is used for processing the three-dimensional characteristic representation to obtain an output matrix and carrying out symmetrical processing on the output matrix to obtain the interaction matrix;
the symmetrization process comprises the following steps:
wherein O is an output matrix, and A is an interaction matrix;
A predicted value generator for obtaining an intermediate feature representation from the time series interaction graph using a graph convolution network, and for processing the intermediate feature representation using a recurrent neural network to obtain a predicted value for each time series; the time sequence interaction graph is generated by the interaction matrix and a time sequence feature matrix;
A time-series discriminator trained on fake time-series samples and real time-series samples, the trained discriminator being used for feeding gradient information back to the interaction matrix generator and the predicted-value generator; the fake time-series samples are generated by appending the predicted value to the original time-series feature vector, and the real time-series samples by appending the true value.
2. The multiple-correlation time-series prediction system based on a generative adversarial network of claim 1, wherein the recurrent neural network is a long short-term memory network.
3. The multiple-correlation time-series prediction system based on a generative adversarial network of claim 1, wherein the network depth of the graph convolution network is set to 3 or 4 layers.
4. A method for predicting multiple related time series based on a generative adversarial network, applied to a multiple-correlation time-series prediction system based on a generative adversarial network, the system comprising an interaction matrix generator, a predicted-value generator and a time-series discriminator, which are connected to one another in pairs;
The method is used for different time sequences of meteorological data recorded by meteorological stations at different positions, wherein the meteorological data are data of daily temperature, and comprises the following steps:
Mapping the original random vector into an interaction matrix through the interaction matrix generator; the interaction matrix is used for representing interaction relations among different time sequences of the meteorological data; the interaction matrix generator comprises a transposed convolution network, wherein a full connection layer of the transposed convolution network is used for mapping an original random vector into a three-dimensional characteristic representation, and a transposed convolution layer of the transposed convolution network is used for processing the three-dimensional characteristic representation to obtain an output matrix and carrying out symmetrical processing on the output matrix to obtain the interaction matrix;
the symmetrization process comprises the following steps:
wherein O is an output matrix, and A is an interaction matrix;
Constructing a time sequence interaction diagram according to the interaction matrix and the time sequence feature matrix; the mapping of the original random vector into the interaction matrix comprises the following steps:
Mapping the original random vector into three-dimensional feature representation through a full connection layer, processing the three-dimensional feature representation by using a transposition convolution layer to obtain an output matrix, and carrying out symmetric processing on the output matrix to obtain the interaction matrix;
Obtaining intermediate feature representations from the time sequence interaction graph by using a graph convolution network through the predicted value generator, and processing the intermediate feature representations by using a cyclic neural network to obtain predicted values of each time sequence;
Training the time-series discriminator with the fake time-series samples and the real time-series samples, and feeding gradient information back to the interaction matrix generator and the predicted-value generator through the trained time-series discriminator; the fake time-series samples are generated by appending the predicted value to the original time-series feature vector, and the real time-series samples by appending the true value.
5. The method of claim 4, wherein the recurrent neural network is a long short-term memory network.
6. The method for predicting multiple related time series based on a generative adversarial network of claim 4, wherein the network depth of the graph convolution network is set to 3 or 4 layers.
7. A multiple-correlation time-series prediction apparatus based on a generative adversarial network, comprising: at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the method for predicting multiple related time series based on a generative adversarial network of any one of claims 4 to 6.
8. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method for predicting multiple related time series based on a generative adversarial network of any one of claims 4 to 6.
CN202011299519.6A 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation of countermeasure network Active CN112508170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011299519.6A CN112508170B (en) 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation of countermeasure network

Publications (2)

Publication Number Publication Date
CN112508170A CN112508170A (en) 2021-03-16
CN112508170B true CN112508170B (en) 2024-08-16

Family

ID=74958158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011299519.6A Active CN112508170B (en) 2020-11-19 2020-11-19 Multi-correlation time sequence prediction system and method based on generation of countermeasure network

Country Status (1)

Country Link
CN (1) CN112508170B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673742B (en) * 2021-07-02 2022-06-14 华南理工大学 Distribution transformer area load prediction method, system, device and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110138595A (en) * 2019-04-12 2019-08-16 中国科学院深圳先进技术研究院 Time link prediction technique, device, equipment and the medium of dynamic weighting network
CN111126758A (en) * 2019-11-15 2020-05-08 中南大学 Academic team influence propagation prediction method, device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10650045B2 (en) * 2016-02-05 2020-05-12 Sas Institute Inc. Staged training of neural networks for improved time series prediction performance
KR101856170B1 (en) * 2017-09-20 2018-05-10 주식회사 모비젠 Apparatus for predicting error generation time of system based on time-series data and method thereof
RU2715024C1 (en) * 2019-02-12 2020-02-21 Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк) Method of trained recurrent neural network debugging
US20200342968A1 (en) * 2019-04-24 2020-10-29 GE Precision Healthcare LLC Visualization of medical device event processing
CN111612206B (en) * 2020-03-30 2022-09-02 清华大学 Neighborhood people stream prediction method and system based on space-time diagram convolution neural network
CN111475546A (en) * 2020-04-09 2020-07-31 大连海事大学 Financial time sequence prediction method for generating confrontation network based on double-stage attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant