
CN115953902B - Traffic flow prediction method based on multi-view space-time diagram convolutional network - Google Patents

Traffic flow prediction method based on multi-view space-time diagram convolutional network

Info

Publication number
CN115953902B
Authority
CN
China
Prior art keywords
time
graph
space
diagram
traffic flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310132783.8A
Other languages
Chinese (zh)
Other versions
CN115953902A (en)
Inventor
顾军华
冀震雷
张亚娟
蒋家海
金建铭
郭睿哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei University of Technology
Priority to CN202310132783.8A
Publication of CN115953902A
Application granted
Publication of CN115953902B
Legal status: Active

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T — CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 — Road transport of goods or passengers
    • Y02T10/10 — Internal combustion engine [ICE] based vehicles
    • Y02T10/40 — Engine management systems

Landscapes

  • Traffic Control Systems (AREA)

Abstract

The invention relates to a traffic flow prediction method based on a multi-view space-time diagram convolutional network, which takes the traffic data on a road network, a static graph adjacency matrix and a trend-similarity graph adjacency matrix as inputs and realizes accurate prediction of traffic flow by alternately extracting temporal and spatial features and then performing feature fusion with an output module. The invention extracts the spatial dependency relationships in the road from different angles, capturing them from both the global and the local perspective, and proposes a multi-view space-time graph convolutional network to realize traffic flow prediction. The dependency relationships among the road observation points are modeled from different angles, and the multi-view graph convolution module then extracts the spatial features, increasing the module's ability to capture spatial dependence and improving the prediction performance of the model.

Description

Traffic flow prediction method based on multi-view space-time diagram convolutional network
Technical Field
The invention belongs to the field of traffic flow prediction and relates to a traffic flow prediction method based on a multi-view space-time diagram convolutional network. The prediction method uses a graph convolutional neural network and a convolutional neural network to realize the prediction of traffic flow.
Background
With the development of sensor technology, traffic data shows a trend of explosive growth, and traffic systems have formally entered the era of traffic big data. Intelligent traffic systems use traffic big data to control urban traffic. In recent years, with the support of deep learning technology, intelligent traffic systems have achieved great success, and traffic flow prediction, one of the basic components of an intelligent traffic system, has attracted the attention of more and more researchers. The change patterns in traffic data vary over time and space: the traffic flow at an observation point is affected not only by its historical traffic flow but also by the flow at the surrounding observation points, and properly handling these influence factors is one of the important challenges in realizing accurate traffic flow prediction.
With the advent of deep learning, deep learning models such as convolutional neural networks and graph convolutional networks are used to mine the complex spatial dependencies in traffic data, while models such as recurrent neural networks are mainly used to process the temporal dependencies. Although existing deep learning models have achieved good results in traffic flow prediction tasks, some problems remain. Conventional traffic flow prediction methods typically use adjacency matrices to describe the spatial dependencies of roads; e.g., MTGNN uses an adaptive adjacency matrix to capture spatial dependencies, yet adaptive adjacency matrices often reflect only the local spatial dependencies in the road, ignoring the global ones. In the real world, additional spatial relationships may exist between road segments due to regional functions (e.g., close relationships may exist between residential and work areas). In order to better capture the local and global spatial dependence in a road network and to solve the problems of the MTGNN model, the invention proposes a multi-view-based space-time graph convolutional network model, MVSTGCN.
Disclosure of Invention
Aiming at the traffic flow prediction problem on a road network, the invention provides a traffic flow prediction method based on a multi-view space-time graph convolutional network. Specifically, the invention relates to a novel space-time feature extraction method that realizes complex spatial dependency extraction by considering the interaction relations among the road observation points from multiple angles in the spatial dimension, and extracts the temporal dependence with a gated temporal convolution module in the temporal dimension. To reduce the data redundancy caused by extracting spatial dependencies from multiple angles and to improve the robustness of the model, a feature-separation idea is introduced. Finally, extensive experiments on two real traffic data sets verify the feasibility and effectiveness of the designed network.
The technical scheme of the invention is as follows:
a traffic flow prediction method based on a multi-view space-time diagram convolutional network, which is characterized by comprising the following steps:
S1, acquiring a traffic data set and constructing the static graph adjacency matrix A_road of the road network, wherein the traffic data set comprises the traffic data at each observation point in the road network at different moments;
S2, acquiring the trend-similarity graph adjacency matrix A_dtw;
S3, constructing the MVSTGCN model:
the MVSTGCN model comprises an output module and a plurality of space-time feature extraction layers connected in series; stacking multiple space-time feature extraction layers yields space-time features at different scales, the output of the current space-time feature extraction layer serving as the input of the next one while also being recorded into the output module; when all space-time feature extraction layers have extracted their space-time features, the output module gives the prediction result;
each space-time feature extraction layer comprises a gated temporal convolution module and a multi-view graph convolution module connected in series, the multi-view graph convolution module extracting spatial features and the gated temporal convolution module extracting temporal features;
the input of the gated temporal convolution module is the traffic data processed by the linear conversion layer, and its output is connected to the multi-view graph convolution module; the static graph adjacency matrix A_road and the trend-similarity graph adjacency matrix A_dtw also serve as inputs of the multi-view graph convolution module; a residual connection links the input of the gated temporal convolution module with the output of the multi-view graph convolution module;
the output module comprises at least two fully connected layers (FC), and its activation function is ELU;
S4, training the MVSTGCN model with the traffic data set to finally obtain a MVSTGCN model for traffic flow prediction.
The specific process of step S1 is as follows:
S11, obtaining the graph structure information G=(V,E) of the road network, wherein V={v_1, v_2, ..., v_N} represents the set of observation points in the road network and E represents the set of edges corresponding to the connection relations between observation points: <v_i, v_j> ∈ E if observation points v_i and v_j are directly connected, otherwise no direct relation exists between the two observation points; this yields the static graph adjacency matrix A_road;
S12, acquiring the traffic data of the road network and cleaning the data;
S13, dividing the cleaned traffic data into time slices, the traffic data of all observation points in one time slice corresponding to a feature matrix X^(t) = (x_1^(t), x_2^(t), ..., x_N^(t)), wherein x_i^(t) represents the traffic data of observation point v_i in the t-th time slice and N is the number of observation points;
S14, stacking the time-slice data to obtain the final traffic data set X = [X^(1), X^(2), ..., X^(Len)], wherein Len represents the total number of time steps of the traffic data and is far greater than the input time step length T of the MVSTGCN model.
The calculation process of the gated temporal convolution module is shown in formula (2):
Z = Tanh(TCN(X′)) * σ(TCN(X′))  (2)
wherein: TCN represents a dilated causal convolutional network; X′ represents the feature matrix of the traffic data processed by the linear conversion layer; Z represents the output of the gated temporal convolution module; Tanh and σ represent the Tanh and Sigmoid activation functions respectively; * represents element-wise multiplication;
the output module comprises two fully connected layers (FC) and performs a nonlinear transformation on the space-time features at different scales to obtain the final prediction result.
The multi-view graph convolution module consists of three multi-scale graph convolutional networks (MS-GCN) and an aggregation layer; the three MS-GCNs respectively extract the static-graph-specific spatial features, the trend-similarity-graph-specific spatial features and the common spatial features on both graphs; the operation process is expressed as:
Q_rs = MS-GCN1(Z, A_road)  (7)
Q_ss = MS-GCN2(Z, A_dtw)  (8)
Q_rc = MS-GCN3(Z, A_road)  (9)
Q_sc = MS-GCN3(Z, A_dtw)  (10)
wherein: MS-GCN1, MS-GCN2 and MS-GCN3 represent the three multi-scale graph convolutional networks; Z represents the output of the gated temporal convolution module; A_dtw and A_road represent the trend-similarity graph adjacency matrix and the static graph adjacency matrix; Q_rs, Q_ss, Q_rc and Q_sc represent the static-graph-specific, trend-similarity-graph-specific, static-graph-common and trend-similarity-graph-common spatial features respectively;
the multi-scale graph convolutional network is expressed as formula (6):
Q = Σ_{l=0}^{m} θ_l H^(l)  (6)
wherein θ_l represents the weight of the l-th layer, obtained by the attention module (when l=0, θ_0=1, representing the residual relationship of layer 0); Q represents the output of the multi-scale graph convolutional network MS-GCN; m represents the number of graph convolution layers; H^(l) represents the output of the l-th layer of the graph convolutional network;
after the spatial features are obtained, they are spliced, and the spliced feature vector is input into the aggregation layer for feature aggregation, as shown in formula (11):
Q′ = Agg(Concat(Q_rs, Q_ss, Q_rc, Q_sc))  (11)
wherein: Concat denotes the vector concatenation operation; Agg represents the feature aggregation operation; Q′ is the output of the multi-view graph convolution module; the aggregation layer is a fully connected layer or channel attention.
The total loss L of the MVSTGCN model is the sum of the loss function L_gcn of the multi-view graph convolution module and the prediction loss L_model, i.e. L = L_model + L_gcn, where L_model is calculated as:
L_model = (1/(S·N·D)) Σ_{s=1}^{S} Σ_{i=1}^{N} Σ_{u=1}^{D} | X̂_{i,u}^{(t+s)} − Y_{i,u}^{(t+s)} |
wherein: S represents the predicted time steps; N represents the number of observation points; D represents the dimension of the feature to be predicted (D is fixed to 1 in the traffic flow prediction task, representing the traffic flow feature); X̂ is the prediction result; Y is the real result; t represents the t-th time slice; s denotes the predicted s-th time slice; Θ represents the parameter set of MVSTGCN;
the loss function L_gcn of the multi-view graph convolution module is defined as:
L_gcn = αL_1 − βL_2  (14)
wherein α and β represent hyperparameters, L_1 represents the similarity of the common spatial features of the static graph and the trend-similarity graph, and L_2 represents the sum of the similarity between the static-graph-specific and static-graph-common spatial features and the similarity between the specific and common spatial features of the trend-similarity graph; the similarities are obtained by F-norm calculation.
In step S2, a dynamic time warping (DTW) algorithm is adopted to calculate the similarity of the data sequences between observation points and to construct the trend similarity matrix Sim, wherein Sim_ij corresponds to the minimum alignment distance between observation points v_i and v_j;
a threshold function is then used to select the connection relations: when Sim_ij is smaller than the threshold, a direct connection is set between observation points v_i and v_j, otherwise no connection is set, finally yielding the trend-similarity graph adjacency matrix A_dtw;
alternatively, a top-K value is set, and the positions of the adjacency matrix corresponding to the top K values of Sim_ij within the same row of the trend similarity matrix are set to 1 while the rest are set to 0.
The dynamic time warping (DTW) algorithm needs a search window to be set that limits the DTW when searching for alignment relationships.
The invention also protects a computer readable storage medium, in which a computer program is stored, which computer program is adapted to perform the traffic flow prediction method based on a multi-view space-time diagram convolutional network when loaded by a computer.
Compared with the prior art, the invention has the beneficial effects that:
The invention designs a multi-view space-time graph convolutional network for traffic flow prediction using deep convolutional neural network algorithms. The method takes the traffic data on the road network, the static graph adjacency matrix and the trend-similarity graph adjacency matrix as inputs, alternately extracts temporal and spatial features, and then performs feature fusion with the output module, realizing accurate prediction of traffic flow. Compared with STG-NCDE, the baseline with the best prediction accuracy in the embodiment, the invention obtains better prediction performance on the real traffic data sets PeMS03 and PeMS08. Specifically, on the PeMS03 data set, MAE is reduced by 5.33%, MAPE by 3.91% and RMSE by 6.34%; on the PeMS08 data set, MAE is reduced by 1.49%, MAPE by 1.51% and RMSE by 1.57% (for all three evaluation indices, smaller is better).
The invention extracts the spatial dependency relationships in the road from different angles, capturing them from both the global and the local perspective, and proposes the multi-view space-time graph convolutional network to realize traffic flow prediction. The dependency relationships among the road observation points are modeled from different angles, and the multi-view graph convolution module then extracts the spatial features, increasing the module's ability to capture spatial dependence and improving the prediction performance of the model.
The proposed multi-view space-time graph convolutional network extracts potential spatial dependence and strengthens the prediction effect. The gated temporal convolution module mainly comprises two dilated causal convolutional networks activated by different activation functions; it has a simple structure and no sequential dependency in its calculation, so parallel computation is possible, training time is short and no error accumulates.
Drawings
FIG. 1 is a schematic diagram of the structure of the dilated causal convolutional network TCN.
FIG. 2 is a schematic diagram of the structure of the gated temporal convolution module.
FIG. 3 is a schematic diagram of the architecture of a multi-scale graph convolutional network.
FIG. 4 is a schematic diagram of the structure of the multi-view convolution module.
FIG. 5 is a schematic structural diagram of the MVSTGCN model.
FIG. 6 is a comparison graph of the single-step prediction MAE results.
FIG. 7 is a comparison graph of the single-step prediction MAPE results.
FIG. 8 is a comparison graph of the single-step prediction RMSE results.
Detailed Description
In order to make the technical scheme of the invention clearer, the invention is further described below with reference to the attached drawings.
According to the method, potential space-dependent features in the traffic data are extracted by the multi-view graph convolution module and potential time-dependent features by the gated temporal convolution module; temporal and spatial feature extraction is performed alternately to obtain space-time dependency relationships at different scales, which are input into the output module for nonlinear data conversion to finally obtain the prediction result. The multi-scale graph convolutional network performs message passing based on an adjacency matrix to capture spatial features; the multi-view graph convolution module, composed of three multi-scale graph convolutional networks, extracts the specific and common spatial features of the traffic data based on the respective adjacency matrices, playing a feature-separation role and reflecting the spatial dependency relationships; the output module, composed of several fully connected layers and activation functions, performs nonlinear data conversion on the input space-time features.
The invention is realized by the following steps:
Firstly, acquiring traffic data of a certain area through a crawler script or an API interface provided by a traffic management department, preprocessing the acquired traffic data, and then constructing a traffic data set;
The road network is represented: the traffic flow prediction problem is a typical multivariate time series prediction problem whose goal is to predict the traffic flow over a future period from historical traffic flow data. The invention represents the road network by an undirected graph G=(V,E), where V is the set of observation points in the road network, N=|V| represents the number of observation points, and E is defined as the set of edges, representing the connectivity or distance between observation points. A ∈ R^(N×N) represents the adjacency matrix of the undirected graph G, and A_ij represents the adjacency relation between observation points v_i and v_j: when <v_i, v_j> ∈ E, a direct road connection exists between v_i and v_j without passing through other observation points, and A_ij takes the value 1 (or the distance between the two observation points); otherwise no direct relation exists between the two observation points and A_ij is 0. X^(t) ∈ R^(N×D) represents the feature matrix formed by the data at each observation point at the t-th moment, D being the number of features of the data at an observation point, typically including flow, speed and occupancy;
traffic data at various observation points in the road network at different times constitute a traffic data set.
A road network refers to a road system formed by various roads interconnecting and interweaving into a net-like distribution within a certain area. When performing a traffic prediction task, the area for which traffic is to be predicted is first selected; the connection relationships among the roads in this area form the road network. The positions in the road network where traffic data statistics are collected are called observation points; the connection relations between observation points can be obtained from the road network and are represented by the static graph adjacency matrix A_road.
Second, determining the input and output of the model: for the traffic flow prediction task, the input of the model is the historical traffic data X^(t−T+1):(t) ∈ R^(N×D×T) of the N observation points over the past T time steps; the goal is to learn a mapping function f that outputs the traffic flow data X^(t+1):(t+S) ∈ R^(N×D×S) of the N observation points over the future S time steps, expressed as:
X^(t+1):(t+S) = f(X^(t−T+1):(t); G)  (1)
Traffic flow prediction predicts the traffic data of the next S time steps from the traffic data of the past T time steps; the historical data is cut off at time t, so the output starts from time t+1.
Third step: dividing the traffic data set and normalizing the data: following the usual ratio, 60% of the data is used for training, 20% for validation and 20% for testing; the data are normalized with the Z-Score method and recorded as Temp_train, Temp_val and Temp_test respectively. Temp_train is processed as follows: the traffic flow data of its first T time steps form the input of sample 1 and the data of time steps T+1 to T+S form the real result of sample 1; the data of time steps T+1 to 2T form the input of sample 2 and the data of time steps 2T+1 to 2T+S form the real result of sample 2; and so on, finally obtaining the training set X_train for model training and its corresponding real results Y. Temp_val and Temp_test are processed in the same manner to obtain the validation set X_val and the test set X_test with their corresponding real results Y_val and Y_test. Each sample corresponds to the static graph adjacency matrix A_road and the trend-similarity graph adjacency matrix A_dtw.
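As an illustration of this third step, the following is a minimal Python sketch (not the patented implementation) of the Z-Score normalization with training-set statistics and the cutting of (T input, S target) sample pairs described above; the file name pems_flow.npy and the array shape (Len, N, D) are assumptions for the example.

import numpy as np

def build_samples(data, T=12, S=12):
    # Slide a window with stride T: T steps form a sample's input and the
    # following S steps form its real result, as described above.
    xs, ys = [], []
    for start in range(0, data.shape[0] - T - S + 1, T):
        xs.append(data[start:start + T])
        ys.append(data[start + T:start + T + S])
    return np.stack(xs), np.stack(ys)

data = np.load("pems_flow.npy")              # hypothetical file, shape (Len, N, D)
n = data.shape[0]
parts = [data[:int(0.6 * n)], data[int(0.6 * n):int(0.8 * n)], data[int(0.8 * n):]]
mean, std = parts[0].mean(), parts[0].std()  # Z-Score statistics from the training part
Temp_train, Temp_val, Temp_test = [(p - mean) / std for p in parts]
X_train, Y_train = build_samples(Temp_train)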
Fourth step: constructing the MVSTGCN model, namely the multi-view space-time graph convolutional network model:
The MVSTGCN model comprises an output module and a plurality of space-time feature extraction layers connected in series; stacking multiple space-time feature extraction layers yields space-time features at different scales, the output of the current space-time feature extraction layer serving as the input of the next one while also being recorded into the output module; when all space-time feature extraction layers have extracted their space-time features, the output module gives the prediction result;
each space-time feature extraction layer comprises a gated temporal convolution module and a multi-view graph convolution module connected in series, the multi-view graph convolution module extracting spatial features and the gated temporal convolution module extracting temporal features;
the input of the gated temporal convolution module is the traffic data processed by the linear conversion layer, and its output is connected to the multi-view graph convolution module; the static graph adjacency matrix A_road and the trend-similarity graph adjacency matrix A_dtw also serve as inputs of the multi-view graph convolution module; a residual connection links the input of the gated temporal convolution module with the output of the multi-view graph convolution module.
The output module comprises at least two fully connected layers (FC), its activation function is ELU, and the final prediction result is obtained by nonlinearly transforming the space-time features at different scales.
The overall structure of the MVSTGCN model is shown in fig. 5. A_road, the traffic data and A_dtw in fig. 5 represent the inputs of the model; Linear represents a linear conversion layer that performs dimension conversion on the traffic data so that the data matches the input dimension of the model. The residual connection means that the original input of each feature extraction layer is connected to its output, preventing vanishing gradients when training the model, i.e. O = Z′ + g(Z′), where O represents the final output of the space-time feature extraction layer, Z′ the input of the space-time feature extraction layer, and g(Z′) the output of Z′ after passing through the gated temporal convolution module and the multi-view graph convolution module. ELU represents the ELU activation function, denoted P(x) and calculated as P(x) = x for x > 0 and P(x) = α(e^x − 1) otherwise, where x is the input data of the ELU. FC stands for fully connected layer. Two ELUs and two FCs constitute an output module. + represents element-wise addition. The dashed boxes in the figure represent the stacked space-time feature extraction layers.
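As a concrete reading of fig. 5, the sketch below shows one way the output module could be realized in PyTorch; implementing the FC layers as 1×1 convolutions and the channel sizes are assumptions of this sketch, not a definitive reproduction of the patented structure.

import torch.nn as nn

class OutputModule(nn.Module):
    # Two ELUs and two FCs applied to the sum of the recorded skip outputs
    # of all space-time feature extraction layers.
    def __init__(self, skip_ch, hidden_ch, horizon):
        super().__init__()
        self.fc1 = nn.Conv2d(skip_ch, hidden_ch, kernel_size=(1, 1))  # FC as 1x1 conv
        self.fc2 = nn.Conv2d(hidden_ch, horizon, kernel_size=(1, 1))
        self.elu = nn.ELU()

    def forward(self, skip_sum):          # skip_sum: (B, skip_ch, N, 1)
        h = self.fc1(self.elu(skip_sum))  # ELU -> FC
        return self.fc2(self.elu(h))      # ELU -> FC -> (B, S, N, 1) prediction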
In order to capture the time dependency relationships in traffic data, the invention adopts a gated temporal convolution module to extract temporal features. The gated temporal convolution module (see fig. 2) is composed of two dilated causal convolutional networks (TCN) of identical structure, activated by the Tanh and Sigmoid activation functions respectively: the Tanh activation function filters the output of the dilated causal convolutional network, while the Sigmoid activation function plays a gating role, controlling the proportion of information passed to the next layer. Dilated convolution ensures that the model processes a long enough sequence with a small number of parameters, and causal convolution ensures that the model cannot see future information. The final calculation process of the gated temporal convolution module is:
Z = Tanh(TCN(X′)) * σ(TCN(X′))  (2)
wherein: TCN represents a dilated causal convolutional network; X′ represents the feature matrix of the input traffic data processed by the linear conversion layer; Z represents the output of the gated temporal convolution module; Tanh and σ represent the Tanh and Sigmoid activation functions respectively; * represents element-wise multiplication.
The structure of the dilated causal convolutional network is shown in fig. 1. Each row in fig. 1 represents the input of the current layer. Taking the input layer as an example, when a sequence of length 8 is input, features are extracted with a one-dimensional dilated causal convolution kernel of size 2 and dilation rate 1; the extracted data occupy the last seven positions of the second-to-last row, whose first position is a padding bit (no padding bit is needed in the actual construction), and the first three positions of the third row are likewise padding bits. Some connections are omitted to keep the picture concise; for example, the circle in the third column of the second-to-last row is connected to the second and third columns of the bottom row. Causal convolution means that connections exist only with the current position and some previous positions, preventing future data from being seen. The dilation rate is the step between the positions a convolution kernel touches, and it grows with depth; the output layer is the first row, and because a dilated causal convolution kernel of size 2 and dilation rate 4 is used there, it only receives the calculation results of the data in columns 8 and 4 of the second row, the data in columns 5 to 7 being skipped. The calculation at each position is:
F(x)(t) = Σ_{a=0}^{k−1} r_a · x(t − d×a)
wherein x ∈ R^T represents the input and T the length of the input sequence (T is 8 in the figure); r ∈ R^k represents the parameter set of the convolution kernel and k its size (k is 2 in the figure); t represents the currently calculated sequence position; d is the dilation rate of the convolution kernel; r_a denotes the a-th parameter of the convolution kernel, the subscript starting from 0; and x(t − d×a) denotes the (t − d×a)-th position of the input, t being the t-th time slice.
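A minimal PyTorch sketch of the gated temporal convolution module follows: left padding of (k−1)·d positions makes the one-dimensional convolution causal, and two separately parameterized TCN branches implement formula (2); the default kernel size and dilation rate are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalConv(nn.Module):
    # 1-D dilated causal convolution: position t sees only t, t-d, ..., t-(k-1)d.
    def __init__(self, channels, k=2, d=1):
        super().__init__()
        self.pad = (k - 1) * d               # left padding keeps causality
        self.conv = nn.Conv1d(channels, channels, kernel_size=k, dilation=d)

    def forward(self, x):                    # x: (B, C, T)
        return self.conv(F.pad(x, (self.pad, 0)))

class GatedTCN(nn.Module):
    # Z = Tanh(TCN(X')) * Sigmoid(TCN(X')), formula (2); the two branches
    # have identical structure but separate parameters.
    def __init__(self, channels, k=2, d=1):
        super().__init__()
        self.filter = DilatedCausalConv(channels, k, d)
        self.gate = DilatedCausalConv(channels, k, d)

    def forward(self, x):
        return torch.tanh(self.filter(x)) * torch.sigmoid(self.gate(x))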
In order to capture potential spatial dependencies in traffic data, a multi-view graph convolution module is designed. The graph convolutional network generalizes the traditional convolutional neural network from Euclidean data to graphs with non-Euclidean structure, enabling message passing on the graph and thus the extraction of spatial dependence. The invention extends the traditional graph convolution method on this basis to capture multi-view information, letting the multi-view graph convolution module aggregate observation-point features through the static graph and the trend similarity graph. The flow of the multi-view graph convolution module is as follows:
(1) Building the trend similarity graph: based on the similarity of the flow change trends between observation points, a trend similarity graph is constructed as follows. The distances between every two observation points are calculated; because the interactions of different observation points have time lags, the common cosine-similarity and Euclidean-distance measures cannot align the time series well, so in this embodiment the pairwise distances between observation points are calculated with a dynamic time warping (DTW) algorithm. The dynamic time warping algorithm allows alignment relations across time steps and better handles the distance computation of sequences made inconsistent by time-lag relationships; it uses the dynamic programming idea to calculate the minimum distance between sequences. The dynamic programming transfer equation used is expressed as:
γ(w,h) = dis(q_w, c_h) + min(γ(w−1,h−1), γ(w−1,h), γ(w,h−1))  (3)
where γ(w,h) represents the minimum alignment distance between the time series (0,w) of observation point v_q and the time series (0,h) of observation point v_c; dis(q_w, c_h) represents the distance difference (flow difference) between the w-th time step of observation point v_q and the h-th time step of observation point v_c, w and h being time steps.
The minimum distances between observation-point sequences computed with this algorithm form the trend similarity matrix Sim; a threshold function then selects the connection relations: when Sim_ij is smaller than the threshold, the corresponding position of the adjacency matrix is set to 1, otherwise to 0, finally yielding the trend-similarity graph adjacency matrix A_dtw.
Common distance measures mostly use cosine similarity and Euclidean distance; DTW is adopted here to realize cross-time-step relation alignment because, from the perspective of spatial change, traffic data exhibits time-lag relationships. Conventional DTW has a time complexity of O(n^2) when calculating the distance between sequences, which is expensive for time series.
In this embodiment the search window of the DTW when searching for alignment relations is preferably limited, e.g. searching 12 time steps backward, which reduces the complexity to O(n); because the interactions between traffic-sequence observation points all occur within a limited time range, limiting the search window does not affect the final performance. The specific window needs to be selected through experiments, each problem being analyzed on its own.
For the minimum distance between the i-th and j-th observation points in the road, assuming the time series of the two observation points both have length 12, the value of Sim_ij is γ(12,12). The threshold belongs to the hyperparameters of the network, since different thresholds give the matrix different sparsity.
Alternatively, the positions of the adjacency matrix corresponding to the top K values of Sim_ij within the same row of the trend similarity matrix can be set to 1 and the rest to 0, as required; K is also a hyperparameter, different K determining the sparsity of the matrix.
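The construction of the trend similarity graph can be sketched as follows, assuming per-point flow series of shape (N, T); the banded loop implements transfer equation (3) with the search window limited to w steps, and a threshold function turns Sim into A_dtw.

import numpy as np

def dtw_windowed(q, c, w=12):
    # Minimum alignment distance of equation (3) with the search window
    # limited to w steps, reducing the cost from O(n^2) toward O(n*w).
    n, m = len(q), len(c)
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            dis = abs(q[i - 1] - c[j - 1])   # flow difference
            g[i, j] = dis + min(g[i - 1, j - 1], g[i - 1, j], g[i, j - 1])
    return g[n, m]

def trend_graph(series, threshold, w=12):
    # series: (N, T) flow sequence per observation point -> A_dtw of shape (N, N).
    N = series.shape[0]
    sim = np.array([[dtw_windowed(series[i], series[j], w) for j in range(N)]
                    for i in range(N)])
    return (sim < threshold).astype(float)   # threshold function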
(2) Constructing the multi-scale graph convolutional network MS-GCN: based on a given adjacency matrix A and feature matrix Z, spatial features are extracted from the feature matrix Z with a graph convolutional network (GCN); in each graph convolution layer, an attention module assigns a weight factor according to the information content of the current layer. The operation of a graph convolution layer is expressed as:
H^(l+1) = f(H^(l), A) = ReLU(A H^(l) W^(l+1) + b^(l+1))  (4)
H^(0) = Z ∈ R^(N×D)  (5)
wherein H^(l) represents the output of the l-th layer of the graph convolutional network (l an integer between 0 and m) and is also the input of layer l+1; W^(l+1) and b^(l+1) represent the learnable model parameters of layer l+1; ReLU represents the ReLU activation function. To dynamically adjust the influence weights of different layers, the attention module assigns different weight factors according to the information content of the current layer; the structure of the multi-scale graph convolutional network MS-GCN is shown in fig. 3, where the nodes on each circle represent the neighbor information to be aggregated by the current layer. The calculation process of the final multi-scale graph convolutional network is expressed as:
Q = Σ_{l=0}^{m} θ_l H^(l)  (6)
wherein θ_l represents the weight of the l-th layer, obtained by the attention module (when l=0, θ_0=1, representing the residual relationship of layer 0); Q represents the output of the multi-scale graph convolutional network MS-GCN; m represents the number of graph convolution layers.
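A minimal PyTorch sketch of the MS-GCN follows; the layer propagation is formula (4) and the weighted sum is formula (6). Deriving θ_l as a sigmoid over a learned projection of each layer's output is an assumption standing in for the attention module, whose exact form the text leaves open.

import torch
import torch.nn as nn

class MSGCN(nn.Module):
    # Stack m graph convolution layers, then combine H^(0)..H^(m) with the
    # layer weights theta_l (theta_0 = 1, the residual of layer 0).
    def __init__(self, dim, m=2):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(m))
        self.att = nn.Linear(dim, 1)                   # assumed attention module

    def forward(self, Z, A):                           # Z: (N, D'), A: (N, N)
        hs = [Z]                                       # H^(0) = Z, formula (5)
        for lin in self.layers:
            hs.append(torch.relu(A @ lin(hs[-1])))     # formula (4)
        theta = [torch.ones(())] + [torch.sigmoid(self.att(h).mean()) for h in hs[1:]]
        return sum(t * h for t, h in zip(theta, hs))   # formula (6)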
(3) Constructing the multi-view graph convolution module: on the basis of the multi-scale graph convolutional network, the multi-view graph convolution module is constructed; its structure is shown in fig. 4. Specifically, the multi-view graph convolution module is composed of three multi-scale graph convolutional networks and one aggregation layer. The three MS-GCNs respectively extract the static-graph-specific spatial features, the trend-similarity-graph-specific spatial features and the common spatial features on both graphs. The operation process is expressed as:
Q_rs = MS-GCN1(Z, A_road)  (7)
Q_ss = MS-GCN2(Z, A_dtw)  (8)
Q_rc = MS-GCN3(Z, A_road)  (9)
Q_sc = MS-GCN3(Z, A_dtw)  (10)
wherein: MS-GCN1, MS-GCN2 and MS-GCN3 represent the three multi-scale graph convolutional networks mentioned previously; Z represents the feature matrix formed by the traffic data processed by the gated temporal convolution module; A_dtw and A_road represent the trend-similarity graph adjacency matrix and the static graph adjacency matrix; Q_rs, Q_ss, Q_rc and Q_sc represent the static-graph-specific, trend-similarity-graph-specific, static-graph-common and trend-similarity-graph-common spatial features respectively, where the subscript c denotes common, the subscript r denotes the static graph, s in the first position denotes the trend-similarity graph, and s in the second position denotes specific.
After the spatial features are obtained, they are spliced, and the spliced feature vector is input into the aggregation layer for feature aggregation; the calculation process is defined as:
Q′ = Agg(Concat(Q_rs, Q_ss, Q_rc, Q_sc))  (11)
wherein: Concat denotes the vector concatenation operation; Agg represents the feature aggregation operation, whose specific implementation may use a fully connected layer or channel attention.
(4) Loss function setting: in order to ensure that the three multi-scale graph convolutional networks MS-GCN play a feature-separation role, corresponding loss functions need to be defined to guarantee the learning result of the model. Let Q_rc ∈ R^(N×D×T), Q_sc ∈ R^(N×D×T), Q_rs ∈ R^(N×D×T) and Q_ss ∈ R^(N×D×T) represent the static-graph common, trend-similarity-graph common, static-graph specific and trend-similarity-graph specific spatial features respectively. The two common spatial features Q_rc and Q_sc are first dimension-transformed and then normalized to yield Q′_rc ∈ R^(N×DT) and Q′_sc ∈ R^(N×DT). Because Q_rc and Q_sc represent the common spatial features of the static graph and the trend-similarity graph respectively, the two matrices should be extremely similar; the metric is calculated with the F-norm:
L_1 = ‖ Q′_rc − Q′_sc ‖_F  (12)
A corresponding formula measures the difference between the space-specific information and the space-common information: specific and common information should have different characterizations, so the corresponding matrices should differ greatly. With Q′_rs and Q′_ss denoting the dimension-transformed and normalized specific spatial features of the static graph and of the trend-similarity graph respectively, the metric is likewise calculated with the F-norm:
L_2 = ‖ Q′_rs − Q′_rc ‖_F + ‖ Q′_ss − Q′_sc ‖_F  (13)
The loss function L_gcn of the final multi-view graph convolution module is defined as:
L_gcn = αL_1 − βL_2  (14)
wherein α and β represent hyperparameters controlling the degree of influence of this loss function on the model parameters, L_1 represents the similarity of the common spatial features of the static graph and the trend-similarity graph, and L_2 represents the sum of the similarity between the static-graph-specific and static-graph-common spatial features and the similarity between the specific and common spatial features of the trend-similarity graph.
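Under the F-norm reading of formulas (12) to (14), the feature-separation loss can be sketched as below; the flattening and normalization mirror the dimension transform described above, and the default α and β are placeholders for the tuned hyperparameters.

import torch
import torch.nn.functional as F

def gcn_loss(q_rs, q_ss, q_rc, q_sc, alpha=0.5, beta=0.5):
    def flat_norm(q):                        # dimension transform + normalization
        q = q.reshape(q.shape[0], -1)        # (N, D*T)
        return F.normalize(q, dim=-1)
    q_rs, q_ss, q_rc, q_sc = map(flat_norm, (q_rs, q_ss, q_rc, q_sc))
    L1 = torch.norm(q_rc - q_sc, p="fro")                                     # formula (12)
    L2 = torch.norm(q_rs - q_rc, p="fro") + torch.norm(q_ss - q_sc, p="fro")  # formula (13)
    return alpha * L1 - beta * L2                                             # formula (14)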
Fifth step: training the model parameters: a trend similarity graph is constructed from Temp_train to obtain the trend-similarity graph adjacency matrix A_dtw. Then X_train, the trend-similarity graph adjacency matrix A_dtw and the static graph adjacency matrix A_road are used as the input of the MVSTGCN model (the two adjacency matrices only take effect in the multi-view graph convolution module; they do not participate in the calculation of the other modules) to obtain the prediction result of the MVSTGCN model; the prediction result and the real result Y are used for the loss calculation, and a back-propagation algorithm updates the MVSTGCN model parameters. The specific flow of the data is as follows (a minimal sketch of one training step is given after the list):
(1) The traffic data in the training set X_train are dimension-transformed to obtain X′, which is input into the first space-time feature extraction layer;
(2) the space-time feature extraction layer first extracts temporal features from the input with the gated temporal convolution module, obtaining the temporal features Z; the multi-view graph convolution module then performs spatial feature extraction on Z based on the trend-similarity graph adjacency matrix A_dtw and the static graph adjacency matrix A_road, obtaining the space-time features Q′^(p) output by the current layer, Q′^(p) denoting the output result of the p-th space-time feature extraction layer;
(3) the space-time features Q′^(p) are input into the next space-time feature extraction layer while also being recorded in the variable Skip;
(4) steps (2) and (3) are repeated until the data have passed through all space-time feature extraction layers;
(5) the variable Skip is input into the output module, whose output result is the prediction result X̂ of the model;
(6) the prediction result X̂ and the real result Y are used to calculate the prediction loss L_model of the model, and the loss function of the multi-view graph convolution module is added to the prediction loss to obtain the final loss of the model, L = L_model + L_gcn, where L_model is calculated as:
L_model = (1/(S·N·D)) Σ_{s=1}^{S} Σ_{i=1}^{N} Σ_{u=1}^{D} | X̂_{i,u}^{(t+s)} − Y_{i,u}^{(t+s)} |
wherein: S represents the predicted time steps; N represents the number of observation points; D represents the dimension of the feature to be predicted (fixed to 1 in the traffic flow prediction task, representing the traffic flow feature); t represents the t-th time slice; s denotes the predicted s-th time slice; Θ represents all parameters of MVSTGCN; i represents the i-th observation point; u the u-th feature.
(7) The parameters of the model are updated using a back propagation algorithm.
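The listed data flow corresponds to one optimization step like the Python sketch below; that the model returns both the prediction and the four view features (so that L = L_model + L_gcn can be formed, with L_model taken as the mean absolute error) is an assumed interface for this sketch.

import torch

def train_step(model, optimizer, x, y, A_road, A_dtw):
    # One pass of steps (1)-(7): forward, total loss, backpropagation.
    optimizer.zero_grad()
    pred, views = model(x, A_road, A_dtw)   # assumed model interface
    L_model = torch.abs(pred - y).mean()    # prediction loss (MAE)
    L = L_model + gcn_loss(*views)          # total loss L = L_model + L_gcn
    L.backward()                            # step (7): update the parameters
    optimizer.step()
    return L.item()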
Sixth step: verifying the performance of the model: the validation set X_val is input into the MVSTGCN model, and the prediction loss L_model of the MVSTGCN model on the validation set is calculated;
Seventh step: the fifth and sixth steps are repeated until the prediction loss L_model on the validation set is less than the set threshold or the maximum number of repetitions is reached.
Eighth step: testing the prediction performance of the MVSTGCN model: the test set X_test is input into the MVSTGCN model, traffic flow prediction is performed, and the performance of the final MVSTGCN model is evaluated.
(1) Overall performance assessment of the MVSTGCN model
The invention has been tested on the two real traffic flow data sets PeMS03 and PeMS08. The details of the data sets are shown in table 1; both data sets use a time step of 5 minutes.
Table 1: data set information
The effectiveness and accuracy of the traffic flow prediction method based on the multi-view space-time graph convolutional network provided by the invention are verified through comparison experiments on the real data sets; the experimental results are shown in table 2.
The invention is compared with other reference methods, including FC-LSTM, DCRNN, STGCN, ASTGCN, STG2Seq, Graph WaveNet, STSGCN, STFGNN, Z-GCNETs and STG-NCDE (all reference methods are the English abbreviations of models in the published literature), and is evaluated with three metrics: the mean absolute error MAE, the root mean square error RMSE and the mean absolute percentage error MAPE. For all three measurement indices, smaller is better; the corresponding calculation formulas are:
MAE = (1/S) Σ_{s=1}^{S} | y_s − ŷ_s |
RMSE = √( (1/S) Σ_{s=1}^{S} (y_s − ŷ_s)² )
MAPE = (100%/S) Σ_{s=1}^{S} | y_s − ŷ_s | / y_s
wherein: y_s is the observed value of the traffic flow, ŷ_s the predicted value of the traffic flow, S the length of the time series to be predicted, and s the s-th predicted time step, with value range (1, S).
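The three metrics can be computed as in the NumPy sketch below; the eps guard against zero flow values in MAPE is an implementation choice of this sketch, not part of the definitions above.

import numpy as np

def evaluate(y_true, y_pred, eps=1e-5):
    # MAE, RMSE and MAPE over all predicted steps; smaller is better.
    err = y_pred - y_true
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    mape = (np.abs(err) / np.clip(np.abs(y_true), eps, None)).mean() * 100
    return mae, rmse, mape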
As can be seen from table 2, the prediction performance of the MVSTGCN model is better than the other baseline models on both data sets. Through multi-angle spatial dependency extraction, MVSTGCN obtains spatial features with richer semantics, giving the model better prediction performance.
Table 2: MVSTGCN results of Performance comparison with reference model
To further verify the performance of MVSTGCN, a single-step prediction error comparison with the two most advanced models, Z-GCNETs and STG-NCDE, was carried out on the PeMS dataset, and the experimental comparison results were plotted as line graphs. The single-step prediction results for MAE, MAPE and RMSE are shown in figs. 6 to 8.
(2) To evaluate and understand the effects and performance of the key components in the MVSTGCN model proposed by the invention, ablation studies were performed on the PeMS03 dataset; the variants of MVSTGCN are named and described as follows:
MVSTGCN/w.o.R: only the static graph adjacency matrix is used, with an ordinary graph convolutional network instead of the multi-view graph convolution module; the dependency relationships constructed from other angles are not considered.
MVSTGCN/w.o.C: on the basis of the single adjacency matrix, a multi-scale graph convolutional network replaces the ordinary graph convolutional network, i.e. an attention module is added to the ordinary GCN of MVSTGCN/w.o.R, forming the multi-scale graph convolutional network.
MVSTGCN/w.o.D: the static graph adjacency matrix and the trend similarity graph are both used for spatial dependency extraction, but the shared-parameter part in fig. 4 is not included, i.e. only two MS-GCNs are used.
The results of the ablation experiments are shown in table 3.
Table 3: ablation experiments
Experimental results show that the multi-view graph convolution module, the multi-scale graph convolutional network and the feature-separation concept are all critical to the performance of MVSTGCN. The fusion of the spatial features on the trend similarity graph and the static graph proves the effectiveness of multi-view feature extraction, and the multi-scale graph convolutional network dynamically adjusts the weights of the model's intermediate outputs, achieving an adaptive effect. The MVSTGCN model of the invention achieves better traffic flow prediction results.
Matters not described in detail in the invention are applicable to the prior art.

Claims (8)

1. A traffic flow prediction method based on a multi-view space-time diagram convolutional network, which is characterized by comprising the following steps:
S1, acquiring a traffic data set and constructing the static graph adjacency matrix A_road of the road network, wherein the traffic data set comprises the traffic data at each observation point in the road network at different moments;
S2, acquiring the trend-similarity graph adjacency matrix A_dtw;
S3, constructing the MVSTGCN model:
the MVSTGCN model comprises an output module and a plurality of space-time feature extraction layers connected in series; stacking multiple space-time feature extraction layers yields space-time features at different scales, the output of the current space-time feature extraction layer serving as the input of the next one while also being recorded into the output module; when all space-time feature extraction layers have extracted their space-time features, the output module gives the prediction result;
each space-time feature extraction layer comprises a gated temporal convolution module and a multi-view graph convolution module connected in series, the multi-view graph convolution module extracting spatial features and the gated temporal convolution module extracting temporal features;
the input of the gated temporal convolution module is the traffic data processed by the linear conversion layer, and its output is connected to the multi-view graph convolution module; the static graph adjacency matrix A_road and the trend-similarity graph adjacency matrix A_dtw also serve as inputs of the multi-view graph convolution module; a residual connection links the input of the gated temporal convolution module with the output of the multi-view graph convolution module;
the output module comprises at least two fully connected layers (FC), and its activation function is ELU;
S4, training the MVSTGCN model with the traffic data set to finally obtain a MVSTGCN model for traffic flow prediction.
2. The traffic flow prediction method based on a multi-view space-time diagram convolutional network according to claim 1, wherein,
The specific process of step S1 is:
S11, obtaining the graph structure information G=(V,E) of the road network, wherein V={v_1, v_2, ..., v_N} represents the set of observation points in the road network and E represents the set of edges corresponding to the connection relations between observation points: <v_i, v_j> ∈ E if observation points v_i and v_j are directly connected, otherwise no direct relation exists between the two observation points; this yields the static graph adjacency matrix A_road;
S12, acquiring the traffic data of the road network and cleaning the data;
S13, dividing the cleaned traffic data into time slices, the traffic data of all observation points in one time slice corresponding to a feature matrix X^(t) = (x_1^(t), x_2^(t), ..., x_N^(t)), wherein x_i^(t) represents the traffic data of observation point v_i in the t-th time slice and N is the number of observation points;
S14, stacking the time-slice data to obtain the final traffic data set X = [X^(1), X^(2), ..., X^(Len)], wherein Len represents the total number of time steps of the traffic data and is far greater than the input time step length T of the MVSTGCN model.
3. The traffic flow prediction method based on a multi-view space-time diagram convolutional network according to claim 1, wherein,
The calculation process of the gated temporal convolution module is shown in formula (2):
Z = Tanh(TCN(X′)) * σ(TCN(X′))  (2)
wherein: TCN represents a dilated causal convolutional network; X′ represents the feature matrix of the traffic data processed by the linear conversion layer; Z represents the output of the gated temporal convolution module; Tanh and σ represent the Tanh and Sigmoid activation functions respectively; * represents element-wise multiplication;
the output module comprises two fully connected layers (FC) and performs a nonlinear transformation on the space-time features at different scales to obtain the final prediction result.
4. The traffic flow prediction method based on a multi-view space-time diagram convolutional network according to claim 1, wherein,
The multi-view graph convolution module consists of three multi-scale graph convolutional networks (MS-GCN) and an aggregation layer; the three MS-GCNs respectively extract the static-graph-specific spatial features, the trend-similarity-graph-specific spatial features and the common spatial features on both graphs; the operation process is expressed as:
Q_rs = MS-GCN1(Z, A_road)  (7)
Q_ss = MS-GCN2(Z, A_dtw)  (8)
Q_rc = MS-GCN3(Z, A_road)  (9)
Q_sc = MS-GCN3(Z, A_dtw)  (10)
wherein: MS-GCN1, MS-GCN2 and MS-GCN3 represent the three multi-scale graph convolutional networks; Z represents the output of the gated temporal convolution module; A_dtw and A_road represent the trend-similarity graph adjacency matrix and the static graph adjacency matrix; Q_rs, Q_ss, Q_rc and Q_sc represent the static-graph-specific, trend-similarity-graph-specific, static-graph-common and trend-similarity-graph-common spatial features respectively;
the multi-scale graph convolutional network is expressed as formula (6):
Q = Σ_{l=0}^{m} θ_l H^(l)  (6)
wherein θ_l represents the weight of the l-th layer, obtained by the attention module (when l=0, θ_0=1, representing the residual relationship of layer 0); Q represents the output of the multi-scale graph convolutional network MS-GCN; m represents the number of graph convolution layers; H^(l) represents the output of the l-th layer of the graph convolutional network;
after the spatial features are obtained, they are spliced, and the spliced feature vector is input into the aggregation layer for feature aggregation, as shown in formula (11):
Q′ = Agg(Concat(Q_rs, Q_ss, Q_rc, Q_sc))  (11)
wherein: Concat denotes the vector concatenation operation; Agg represents the feature aggregation operation; Q′ is the output of the multi-view graph convolution module; the aggregation layer is a fully connected layer or channel attention.
5. The traffic flow prediction method based on the multi-view space-time diagram convolution network according to claim 4, wherein the total loss L of the MVSTGCN model is the sum of the loss function L_gcn of the multi-view graph convolution module and the prediction loss L_model, i.e. L = L_model + L_gcn, where L_model is calculated as:
L_model = (1/(S·N·D)) Σ_{s=1}^{S} Σ_{i=1}^{N} Σ_{u=1}^{D} | X̂_{i,u}^{(t+s)} − Y_{i,u}^{(t+s)} |
wherein: S represents the predicted time steps; N represents the number of observation points; D represents the dimension of the feature to be predicted (D is fixed to 1 in the traffic flow prediction task, representing the traffic flow feature); X̂ is the prediction result; Y is the real result; t represents the t-th time slice; s denotes the predicted s-th time slice; Θ represents the parameter set of MVSTGCN;
the loss function L_gcn of the multi-view graph convolution module is defined as:
L_gcn = αL_1 − βL_2  (14)
wherein α and β represent hyperparameters, L_1 represents the similarity of the common spatial features of the static graph and the trend-similarity graph, and L_2 represents the sum of the similarity between the static-graph-specific and static-graph-common spatial features and the similarity between the specific and common spatial features of the trend-similarity graph; the similarities are obtained by F-norm calculation.
6. The traffic flow prediction method based on the multi-view space-time diagram convolutional network according to claim 1, wherein in S2 a dynamic time warping (DTW) algorithm is adopted to calculate the similarity of the data sequences between observation points and to construct the trend similarity matrix Sim, wherein Sim_ij corresponds to the minimum alignment distance between observation points v_i and v_j;
a threshold function is then used to select the connection relations: when Sim_ij is smaller than the threshold, a direct connection is set between observation points v_i and v_j, otherwise no connection is set, finally yielding the trend-similarity graph adjacency matrix A_dtw;
alternatively, a top-K value is set, and the positions of the adjacency matrix corresponding to the top K values of Sim_ij within the same row of the trend similarity matrix are set to 1 while the rest are set to 0.
7. The traffic flow prediction method based on a multi-view space-time graph convolutional network according to claim 6, wherein a search window limiting the DTW when searching for alignment relationships needs to be set in the dynamic time warping (DTW) algorithm.
8. A computer readable storage medium, having stored therein a computer program adapted to perform the multi-view space-time diagram convolutional network based traffic flow prediction method according to any one of claims 1-7 when loaded by a computer.
CN202310132783.8A 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time diagram convolutional network Active CN115953902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310132783.8A CN115953902B (en) 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time diagram convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310132783.8A CN115953902B (en) 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time diagram convolutional network

Publications (2)

Publication Number Publication Date
CN115953902A CN115953902A (en) 2023-04-11
CN115953902B true CN115953902B (en) 2024-09-06

Family

ID=87289454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310132783.8A Active CN115953902B (en) 2023-02-20 2023-02-20 Traffic flow prediction method based on multi-view space-time diagram convolutional network

Country Status (1)

Country Link
CN (1) CN115953902B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117236492B * 2023-09-06 2024-03-12 Southwest Jiaotong University Traffic demand prediction method based on dynamic multi-scale graph learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612243A * 2020-05-18 2020-09-01 Hunan University Traffic speed prediction method, system and storage medium
CN113313947A * 2021-05-31 2021-08-27 Hunan University Road condition evaluation method of short-term traffic prediction graph convolution network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023000261A1 (en) * 2021-07-22 2023-01-26 Shenzhen Institutes of Advanced Technology Regional traffic prediction method and device
CN114611814B (en) * 2022-03-21 2024-06-07 Central South University Urban traffic flow prediction method for aggregating multi-scale space-time similarity information
CN115240424A (en) * 2022-07-26 2022-10-25 Shihezi University Multi-view flow prediction method and system based on data driving

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612243A * 2020-05-18 2020-09-01 Hunan University Traffic speed prediction method, system and storage medium
CN113313947A * 2021-05-31 2021-08-27 Hunan University Road condition evaluation method of short-term traffic prediction graph convolution network

Also Published As

Publication number Publication date
CN115953902A (en) 2023-04-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant