CN112884222A

CN112884222A - Time-period-oriented LSTM traffic flow density prediction method

Info

Publication number: CN112884222A
Application number: CN202110185310.5A
Authority: CN
Inventors: 曾昀敏; 陈佩仪; 张衡; 胡栋; 孙艺菲; 姚剑
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2021-02-10
Filing date: 2021-02-10
Publication date: 2021-06-01
Anticipated expiration: 2041-02-10
Also published as: CN112884222B

Abstract

The invention discloses a time-slot-oriented LSTM traffic flow density prediction method. The invention introduces the concept of traffic flow density and obtains a traffic flow density data set for the vehicles in a study area, which effectively reflects the actual congestion in urban traffic hot spot areas. A dynamic time warping method locally stretches or compresses the time axis, so that the similarity between traffic flow density sequence slices is measured more accurately. A K-Means clustering method classifies the slices according to this similarity, which helps reveal the typical patterns of traffic flow density. A separate neural network is trained for each group of traffic flow density data with similar patterns, and the predicted values output by the networks are combined by weighted summation according to either the time labels or the distances to the cluster centers, making full use of the sequence structure of the data. Finally, a similar-day correction is applied to the weighted sum using past data from days of the same working-day type, which improves prediction accuracy.

Description

Time-period-oriented LSTM traffic flow density prediction method

Technical Field

The invention belongs to the field of traffic prediction, and particularly relates to a time-slot-oriented LSTM traffic flow density prediction method.

Background

With the continued development of China's society and economy and the steady progress of urbanization, living standards have improved, and both motor vehicle ownership and road traffic volume have increased rapidly. Traffic congestion in cities, especially large cities, has become a focal problem. Sluggish traffic networks and disordered traffic have become increasingly prominent, putting heavy pressure on urban traffic management, while the resulting wasted commuting time, energy and economic losses, and reduced air quality are also increasingly evident. Urban road congestion now not only restricts further social and economic development but is also an important factor degrading people's travel experience and quality of life. Research on methods for predicting urban road vehicle movement must therefore be strengthened to help solve urban traffic congestion.

Experts and scholars in the field have studied urban road traffic flow prediction extensively, continually proposing prediction methods of higher complexity and better performance or improving existing methods from different angles. These prediction methods can be broadly divided into two categories. The first is based on classical mathematical and physical theory and is suitable for small-scale, simple, single-source traffic flow data, but as traffic systems grow more complex it can no longer meet practical accuracy requirements. The second relies mainly on modern scientific techniques and methods, and includes nonparametric regression models, multidimensional fractal-based methods, spectral analysis methods, neural network models and the like; these methods focus on fitting real traffic flow phenomena well rather than on strict mathematical derivation and clear physical meaning, and they show advantages on more complex traffic flow data. The Long Short-Term Memory (LSTM) neural network is one of the commonly used neural network models and has performed relatively well in traffic flow prediction in recent years. However, because it models only the temporal correlation of traffic flow data and does not consider the objective regularities of actual road traffic, there is still room to improve its accuracy.

Disclosure of Invention

Because deep learning methods that predict a time series for a specific time period or time label are currently very limited, and the idea of clustering is rarely introduced into the prediction process, the invention aims to provide a time period-oriented LSTM (long short-term memory neural network) traffic flow density prediction method that combines four modules: clustering, neural networks, ensemble learning, and similar-day correction. The traffic flow density is defined by the number of track points left by vehicles per unit time, which better reflects traffic congestion.

The technical scheme of the invention is a time slot-oriented LSTM traffic flow density prediction method, which comprises a data processing stage, a training stage, a parameter adjusting stage and a testing stage;

the data processing stage specifically comprises: selecting a research area, selecting a part which has intersection with the research area from the acquired vehicle track information, calculating according to a traffic flow density formula to obtain a traffic flow density data set of each vehicle in the area, and dividing the traffic flow density data set into a training set, a verification set and a test set;

the training phase comprises the following steps:

step S11, segmenting the traffic flow density data in the training set according to a given length to obtain time series slices, recording the time label of each slice, calculating the similarity between the time series slices using a dynamic time warping method, and classifying the traffic flow density slices according to the similarity using a K-Means clustering method to obtain N clusters and their corresponding time labels;

step S12, constructing a time-slot-oriented LSTM neural network, and performing time-series sliding window processing, splicing processing and normalization processing on input data of the neural network according to the specified sliding window length;

step S13, taking the data of N clusters, and respectively training N LSTM neural networks;

the parameter adjusting stage comprises the following steps:

step S21, preprocessing the verification set data, specifically including:

setting the length of the sliding window as seq_len, performing time series sliding window processing on each sequence in the verification set with a window size of seq_len and a stride of 1, and then performing splicing processing and normalization processing;

step S22, inputting the processed verification set data into L neural networks according to given standards, wherein L is less than N, and performing denormalization operation and weighted summation on L prediction results;

step S23, the weighted summation result is corrected by using the past data, and the final prediction result is output;

step S24, adjusting parameters, carrying out clustering and neural network training by using the training set data again, and then repeating S22 and S23 and carrying out error analysis;

step S25, repeating step S24 for a plurality of times, and finally reserving parameters which enable the three indexes to be relatively best in performance as model parameters;

the testing phase comprises the following steps:

step S31, the preprocessing operation of the step S21 is carried out on the test set data;

step S32, inputting the processed test set data into L neural networks according to a given standard, wherein L is less than N, performing inverse normalization operation and weighted summation on L prediction results, correcting the weighted summation result by using past data, and outputting a final prediction result;

step S33, the training set data is reused to carry out clustering and neural network training, and then step S32 is carried out;

and step S34, repeating step S33 M times, and calculating the mean and standard deviation of the MAE, MSE and MAPE between the corrected predicted values and the true values.

Further, the vehicle track information is derived from a network data set;

the research area is an urban traffic hot spot area;

the acquired vehicle track information consists of motor vehicle track points sampled at fixed time intervals;

accumulating the number of track points in a selected research area within a specified interval time to obtain a traffic flow density data set of the specified interval time, wherein the specified interval time is formulated according to a prediction demand;

if the specified interval time is $\Delta t$ and the number of vehicle track points occurring in the research area within that interval is $n$, the traffic flow density is defined as the number of track points per unit time, $n / \Delta t$;

and dividing the traffic flow density data set into a training set, a verification set and a test set.

Further, the specific implementation manner of obtaining N clustering clusters in step S11 is as follows;

firstly, the similarity between time series slices is calculated by utilizing a dynamic time warping method, the concrete realization mode is as follows,

the dynamic time warping method locally stretches or compresses the time axis to calculate the similarity between time series. Let the two time series whose similarity is to be calculated be X and Y, with lengths |X| and |Y|; let $X_i$ denote the traffic flow density at time i in X and $Y_j$ the traffic flow density at time j in Y. The warping path W is defined as:

$W = w_1, w_2, \ldots, w_k, \quad \max(|X|, |Y|) \le k \le |X| + |Y|$

in the formula: $w_1 = (1, 1)$, $w_k = (|X|, |Y|)$;

the cumulative warping distance $D(i, j)$ from (1, 1) to (i, j) is defined as:

$D(i, j) = \mathrm{Dist}(i, j) + \min[D(i-1, j),\ D(i, j-1),\ D(i-1, j-1)]$

in the formula: $\mathrm{Dist}(i, j)$ denotes the distance between $X_i$ and $Y_j$;

solving by dynamic programming gives the warping path distance $D(|X|, |Y|)$;

then, the traffic flow density slice sequences are classified according to the similarity by a K-Means clustering method to obtain the typical patterns of the traffic flow sequences, yielding N clusters and their cluster centers.

Further, the specific implementation manner of performing time-series sliding window processing, splicing processing and normalization processing on the input data according to the specified length in step S12 is as follows;

the time series sliding window processing comprises the following steps:

let the traffic flow density slices in a cluster be $S_1, S_2, \ldots, S_n$, each of length time_span, and let the sliding window length be seq_len. Each $S_i$ (i = 1, ..., n) is first spliced with the seq_len − 1 traffic flow density values that immediately precede it in time in the unsliced traffic flow density data set, giving an extended slice;

time series sliding window processing with a window size of seq_len and a stride of 1 is then applied to each extended slice;

the splicing treatment comprises the following steps:

recording x as a traffic flow density slice cut by the sliding window, and recording y as a traffic flow density value at the next moment of the slice;

then the traffic flow density sliding window data set is obtained and expressed as:

{(x_i,y_i)}_{1≤i≤n×time_span}

the normalization process is: for a window $\{x_0, \ldots, x_n\}$, the normalized traffic flow density is defined as follows:

wherein $x_k \in \{x_0, \ldots, x_n\}$;

the normalized traffic flow density sliding window data set is the data received by the neural network.

Further, the given criterion in step S22 takes one of two forms: one uses the time label as the criterion, and the other uses the distance from the cluster center as the criterion;

with the time label as the criterion, the time label of the input data is determined first, and the neural networks whose training sets contain that time label are then selected; if the training set of a neural network does not contain the time label, the data is not input into that neural network;

with the distance from the cluster center as the criterion, the DTW distance between the input data and the center of the cluster corresponding to each neural network is calculated, and a neural network is selected for prediction when this distance is smaller than a given threshold;

the denormalization operation converts the prediction result back to the scale of $\{x_0, \ldots, x_n\}$ according to the following formula:

wherein $x_k \in \{x_0, \ldots, x_n\}$;

The weighted summation likewise takes two forms, corresponding to the two input criteria;

taking the time label as the criterion, let the L denormalized prediction results be $y_1, \ldots, y_L$. If, in the cluster corresponding to the k-th neural network, the number of slices having the same time label as the input data is $N_k$, and the total number of slices with that time label in the training set is $N$, then the weight of the predicted value output by the k-th neural network is:

$c_k = N_k / N$

taking the distance from the cluster center as the criterion, if the DTW distances between the input data and the cluster centers decrease in sequence and the corresponding prediction results are $y_1, \ldots, y_L$, then the weights $c_1, \ldots, c_L$ satisfy:

$c_1 \le \ldots \le c_L$

$c_1 + \ldots + c_L = 1$

finally, the predicted value obtained by weighting is:

$O_w = \sum_{k=1}^{L} c_k y_k$

Further, the specific implementation manner of step S23 is as follows;

if the input data of the neural network falls within the specified correction time period, and there exist n past time series data $\{p_0, \ldots, p_{n-1}\}$ of the same working-day type as the input data, and the weighted traffic flow density predicted value is $O_w$, the corrected traffic flow density predicted value $O$ is defined as:

in the formula: α represents the similar day correction coefficient, and d represents the correction day range.

Further, the error analysis in step S24 includes the calculation of three indexes MAE, MAPE and MSE, which are defined as follows:
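In their standard form, and using the symbols explained in the paragraph that follows, the three indices can be written as below. This is a sketch of the usual definitions; writing MAPE as a fraction rather than a percentage, and the per-sample index i, are assumptions.

```latex
\mathrm{MAE}  = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - O_i \right|, \qquad
\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N} \left| \frac{y_i - O_i}{y_i} \right|, \qquad
\mathrm{MSE}  = \frac{1}{N}\sum_{i=1}^{N} \left( y_i - O_i \right)^2
```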

in the formula: y represents the true value of the traffic flow density, O represents the predicted value of the traffic flow density, and N represents the number of predicted values.

Compared with the prior art, the invention has the following advantages and beneficial effects:

The concept of traffic flow density is introduced, and a traffic flow density data set for the vehicles in the area is obtained, which effectively reflects the actual congestion in urban traffic hot spot areas.

The dynamic time warping method locally stretches or compresses the time axis, so that the similarity between traffic flow density sequence slices is measured more accurately.

The K-Means clustering method classifies the traffic flow density slice sequences according to this similarity, which helps reveal the typical patterns of traffic flow density.

An initial normalization method is provided, which experiments show to be more stable.

A separate neural network is trained for each group of traffic flow density data with similar patterns, and in the ensemble learning step the predicted values output by the networks are combined by weighted summation according to the time labels or the distances to the cluster centers, making full use of the sequence structure of the data.

A similar-day correction is applied to the weighted summation result using past data of the same working-day type, which further exploits the long-term dependency of the traffic flow density to improve prediction accuracy.

Error analysis is performed with three indices, the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the mean square error (MSE), which describe the deviation and stability of the algorithm from multiple angles.

Drawings

FIG. 1 is a general flow chart of a time slot-oriented LSTM traffic flow density prediction method;

FIG. 2 is a flow chart of a time slot-oriented LSTM traffic flow density prediction method training phase;

FIG. 3 is a flow chart of a validation phase of the time slot-oriented LSTM traffic flow density prediction method;

FIG. 4 is a flow chart of the testing phase of the time slot-oriented LSTM traffic flow density prediction method.

Detailed Description

The model is further described in detail below with reference to the accompanying drawings and examples. The specific embodiments described herein are merely illustrative of the present model and are not intended to be limiting of the present model.

As shown in fig. 1, the invention provides a time slot-oriented LSTM traffic flow density prediction method, which is divided into a data processing stage, a training stage, a parameter adjusting stage and a testing stage, wherein the data processing stage specifically includes:

selecting a research area, selecting a part which has intersection with the area from the acquired vehicle track information, calculating a traffic flow density sequence of vehicles in the area according to a traffic flow density definition formula, and dividing the sequence into a training set, a verification set and a test set.

The vehicle trajectory information is derived from a network data set.

The research area is an urban traffic hot spot area.

The acquired vehicle track information consists of motor vehicle track points sampled at fixed time intervals.

The number of track points within the selected study area is accumulated over each specified interval to obtain the traffic flow density data set for that interval; the specified interval can be set according to the prediction requirement (for example, 2 minutes).

Unlike the traditional definition of traffic flow, the traffic flow density is the number of track points per unit time, which better reflects actual congestion. In practice, traffic volume can be large while traffic still moves smoothly. Vehicle track points form a two-dimensional time series obtained by recording each vehicle's longitude and latitude at a fixed time interval, so if all vehicles together leave a large number of track points in a given area per unit time, the vehicles are very likely lingering in that area, that is, congestion has formed.
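The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the invention: the function name `traffic_flow_density`, the use of timestamps in seconds, and the assumption that the points have already been filtered to the study area are all hypothetical choices. The function returns the raw count n per interval; dividing each count by the interval length only rescales the series.

```python
from collections import Counter


def traffic_flow_density(timestamps, delta_t, t_start, t_end):
    """Count track points per interval of length delta_t (seconds).

    timestamps: times (in seconds) of track points already restricted
    to the study area. Returns one count per full interval in
    [t_start, t_end).
    """
    n_bins = int((t_end - t_start) // delta_t)
    counts = Counter()
    for t in timestamps:
        # Ignore points outside the covered range.
        if t_start <= t < t_start + n_bins * delta_t:
            counts[int((t - t_start) // delta_t)] += 1
    # A Counter returns 0 for intervals that received no points.
    return [counts[b] for b in range(n_bins)]


points = [0, 3, 6, 119, 120, 125, 250]
print(traffic_flow_density(points, delta_t=120, t_start=0, t_end=360))  # [4, 2, 1]
```

A slowly moving vehicle stays inside the area for more sampling instants and so contributes more points per interval, which is why a high count suggests congestion rather than merely high volume.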

The traffic flow density data set is divided into a training set, a verification set and a test set. Specifically, if the acquired vehicle track information covers T days in total, the traffic flow density data of the first T−2 days is used as the training set, the data of day T−1 as the verification set, and the data of day T as the test set.
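As a small illustration of this day-based split, the sketch below assumes the data set is held as a list with one density sequence per day, in chronological order; the name `split_by_day` is hypothetical.

```python
def split_by_day(daily_series):
    """Split per-day density sequences into training, verification and test sets."""
    if len(daily_series) < 3:
        raise ValueError("need at least three days of data")
    train = daily_series[:-2]       # first T-2 days
    verification = daily_series[-2]  # day T-1
    test = daily_series[-1]          # day T
    return train, verification, test


days = [[1, 2], [3, 4], [5, 6], [7, 8]]
train, verification, test = split_by_day(days)
print(len(train), verification, test)  # 2 [5, 6] [7, 8]
```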

As shown in fig. 2, the training phase includes the following steps:

Step S11: the traffic flow density data in the training set is divided according to a given length to obtain time series slices, and the time label of each slice is recorded according to its position within the day. The similarity between the time series slices is calculated using a dynamic time warping method, and the traffic flow density slices are classified according to the similarity using a K-Means clustering method to obtain N clusters; the slices in each cluster carry their own time labels.

The given length may be established according to predicted demand (e.g., 3 hours).

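The Dynamic Time Warping (DTW) method is a sequence similarity measure that locally stretches or compresses the time axis to calculate the similarity between time series. Let the two time series to be compared be X and Y, with lengths |X| and |Y|, and let $X_i$ and $Y_j$ denote the traffic flow density at time i in X and at time j in Y. The warping path W is defined as:

$W = w_1, w_2, \ldots, w_k, \quad \max(|X|, |Y|) \le k \le |X| + |Y|$

in the formula: $w_1 = (1, 1)$, $w_k = (|X|, |Y|)$;

The cumulative warping distance $D(i, j)$ from (1, 1) to (i, j) is defined as:

$D(i, j) = \mathrm{Dist}(i, j) + \min[D(i-1, j),\ D(i, j-1),\ D(i-1, j-1)]$

in the formula: $\mathrm{Dist}(i, j)$ denotes the distance between $X_i$ and $Y_j$. Solving by dynamic programming gives the warping path distance $D(|X|, |Y|)$.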
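The recurrence for the cumulative distance can be evaluated with a small dynamic programming table. The following is a minimal sketch; the point-wise distance is taken to be the absolute difference, which is an assumption, since the text does not fix the form of Dist(i, j).

```python
def dtw_distance(x, y):
    """Cumulative warping distance D(|X|, |Y|) by dynamic programming."""
    inf = float("inf")
    n, m = len(x), len(y)
    # D[i][j] is the cumulative distance from (1, 1) to (i, j), 1-indexed;
    # row 0 and column 0 are padding so the recurrence needs no special cases.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(x[i - 1] - y[j - 1])  # assumed point-wise distance
            D[i][j] = dist + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0
```

The first call shows the effect of warping: the repeated value in the second sequence is absorbed by stretching the time axis, so the two sequences are treated as identical in shape.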

K-Means clustering classifies the traffic flow density slices according to their similarity (measured by dynamic time warping) to obtain the typical patterns of the traffic flow sequences, yielding N clusters and their cluster centers. The value of N may be set according to the prediction requirement.
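One way to combine K-Means with a DTW similarity is sketched below. It is an illustration under stated assumptions rather than the procedure of the invention: slices are assumed to have equal length, the centers are updated as the element-wise mean of their members (a simplification; other DTW-aware averaging schemes exist), the initial centers are simply the first k slices, and the names `kmeans_dtw` and `dtw_distance` are hypothetical.

```python
def dtw_distance(x, y):
    """DTW distance with absolute difference as the point-wise distance."""
    inf = float("inf")
    n, m = len(x), len(y)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(x[i - 1] - y[j - 1])
            D[i][j] = dist + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def kmeans_dtw(slices, k, n_iter=10):
    """Cluster equal-length slices; returns (labels, centers)."""
    centers = [list(s) for s in slices[:k]]  # deterministic initialisation
    labels = [0] * len(slices)
    for _ in range(n_iter):
        # Assignment step: each slice goes to its nearest center under DTW.
        labels = [
            min(range(k), key=lambda c: dtw_distance(s, centers[c]))
            for s in slices
        ]
        # Update step: element-wise mean of the members of each cluster.
        for c in range(k):
            members = [s for s, lab in zip(slices, labels) if lab == c]
            if members:
                centers[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centers


slices = [[1, 2, 3], [10, 11, 12], [1, 2, 4], [10, 12, 12]]
labels, centers = kmeans_dtw(slices, k=2)
print(labels)   # [0, 1, 0, 1]
print(centers)  # [[1.0, 2.0, 3.5], [10.0, 11.5, 12.0]]
```

In the method described here, the time label of each slice would be kept alongside its cluster label so that the labels present in each cluster are known when networks are later selected.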

Step S12: a time period-oriented LSTM neural network is constructed, and time series sliding window processing, splicing processing and normalization processing are performed on the input data of the neural network according to the specified sliding window length.

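The time series sliding window processing is as follows: let the traffic flow density slices in a cluster be $S_1, S_2, \ldots, S_n$, each of length time_span, and let the sliding window length be seq_len. Each $S_i$ (i = 1, ..., n) is first spliced with the seq_len − 1 traffic flow density values that immediately precede it in time in the unsliced traffic flow density data set. Time series sliding window processing with a window size of seq_len and a stride of 1 is then applied to each extended slice.

The splicing processing is as follows: let x be a traffic flow density window cut out by the sliding window, and let y be the traffic flow density value at the moment following that window. The traffic flow density sliding window data set is then:

$\{(x_i, y_i)\}_{1 \le i \le n \times \mathrm{time\_span}}$

The normalization processing is applied to the values $x_k \in \{x_0, \ldots, x_n\}$ of each window, and the normalized traffic flow density sliding window data set is the data received by the neural network.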
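The windowing and splicing for a single slice can be sketched as follows. This reflects one reading of the description: the slice is extended backwards by seq_len − 1 values so that it yields exactly time_span windows, and each target y is the value immediately after its window. The function name `make_windows` and the index-based interface are hypothetical, and normalization is left out because its formula is not reproduced in this text.

```python
def make_windows(series, start, time_span, seq_len):
    """Build (x, y) pairs for the slice series[start:start + time_span].

    series: the unsliced traffic flow density sequence.
    Returns time_span pairs; each x has length seq_len and y is the
    value at the moment following the window.
    """
    if start < seq_len - 1 or start + time_span >= len(series):
        raise ValueError("slice needs seq_len-1 values before it and 1 after it")
    # Extend the slice backwards by seq_len - 1 values.
    extended = series[start - (seq_len - 1): start + time_span]
    pairs = []
    for i in range(time_span):          # stride 1
        x = extended[i:i + seq_len]     # window ending at slice position i
        y = series[start + i + 1]       # value at the next moment
        pairs.append((x, y))
    return pairs


series = list(range(10))
print(make_windows(series, start=3, time_span=4, seq_len=3))
# [([1, 2, 3], 4), ([2, 3, 4], 5), ([3, 4, 5], 6), ([4, 5, 6], 7)]
```

With n slices per cluster this produces n × time_span pairs, which matches the index range of the data set given above.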

Step S13: the data of the N clusters is used to train N LSTM neural networks, one per cluster.

As shown in fig. 3, the parameter adjusting stage includes the following steps:

step S21, preprocessing the verification set data, specifically including:

The length of the sliding window is set as seq_len; time series sliding window processing with a window size of seq_len and a stride of 1 is performed on each sequence in the verification set, followed by the same splicing processing and normalization processing as in S12.

Step S22: the processed data is input into L neural networks (L < N) according to a given criterion, and denormalization and weighted summation are performed on the L prediction results.

The neural network consists of 1 or 2 LSTM layers (the number of layers is a tunable parameter), a Dropout layer following the LSTM layer, and a final fully connected layer whose activation function is linear. The network uses the Adam optimizer; it receives a traffic flow density sequence of length seq_len and outputs one traffic flow density value as the predicted value for the next moment.

Two criteria are available: one based on time labels, and one based on the distance to the cluster centers.

With the time label as the criterion, the time label of the input data is determined first, and the neural networks whose training sets contain that time label are then selected. If the training set of a neural network does not contain the time label, the data is not input into that neural network.

With the distance to the cluster center as the criterion, the DTW distance between the input data and the center of the cluster corresponding to each neural network is calculated, and a network is selected for prediction when this distance is smaller than a given threshold.

This input procedure selects the neural networks trained on data whose pattern resembles that of the data to be predicted, so it exploits the structure of the traffic flow density sequence in a targeted way; networks with weak pattern similarity are skipped, which saves computation.
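The two selection criteria can be sketched as simple filters over the N trained networks. The data layout is an assumption made for illustration: `cluster_labels[k]` holds the time labels of the slices used to train network k, and `distances[k]` is the DTW distance from the input to the center of cluster k. The function names and the label strings are hypothetical.

```python
def select_by_time_label(label, cluster_labels):
    """Indices of networks whose training slices include the given time label."""
    return [k for k, labels in enumerate(cluster_labels) if label in labels]


def select_by_distance(distances, threshold):
    """Indices of networks whose cluster center is closer than the threshold."""
    return [k for k, d in enumerate(distances) if d < threshold]


cluster_labels = [["00-03", "03-06"], ["06-09"], ["06-09", "09-12"]]
print(select_by_time_label("06-09", cluster_labels))       # [1, 2]
print(select_by_distance([0.5, 3.0, 1.2], threshold=2.0))  # [0, 2]
```

In both cases only the selected L networks are evaluated, which is where the saving in computation comes from.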

The denormalization operation converts the prediction result back to the scale of $\{x_0, \ldots, x_n\}$ according to the following formula:

wherein $x_k \in \{x_0, \ldots, x_n\}$;

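The weighted summation likewise takes two forms, corresponding to the two criteria used at input.

Taking the time label as the criterion, let the L denormalized prediction results be $y_1, \ldots, y_L$. If, in the cluster corresponding to the k-th neural network, the number of slices having the same time label as the input data is $N_k$, and the total number of slices with that time label in the training set is $N$, then the weight of the predicted value output by the k-th neural network is:

$c_k = N_k / N$

Taking the distance from the cluster center as the criterion, if the DTW distances between the input data and the cluster centers decrease in sequence and the corresponding prediction results are $y_1, \ldots, y_L$, then the weights $c_1, \ldots, c_L$ satisfy:

$c_1 \le \ldots \le c_L$

$c_1 + \ldots + c_L = 1$

Finally, the predicted value obtained by weighting is:

$O_w = \sum_{k=1}^{L} c_k y_k$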
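A short sketch of the two weighting schemes follows. Both weight formulas are assumptions made for illustration: the time-label weight is taken as the share N_k / N of matching slices, and the distance-based weights use normalised inverse distances, which is only one of many choices that satisfy the ordering and sum-to-one conditions stated above. All function names are hypothetical.

```python
def time_label_weights(counts):
    """counts[k] = N_k; weight assumed to be N_k / N with N = sum of counts."""
    total = sum(counts)
    return [n_k / total for n_k in counts]


def distance_weights(distances):
    """Normalised inverse-distance weights: a larger distance gives a smaller weight."""
    inv = [1.0 / (d + 1e-9) for d in distances]  # small offset avoids division by zero
    total = sum(inv)
    return [w / total for w in inv]


def weighted_prediction(weights, predictions):
    """Weighted sum of the L denormalized predictions."""
    return sum(c * y for c, y in zip(weights, predictions))


w = time_label_weights([3, 1])
print(w, weighted_prediction(w, [100.0, 60.0]))  # [0.75, 0.25] 90.0
```

With distances listed in decreasing order, `distance_weights` returns weights in non-decreasing order, so the network whose cluster center is closest to the input contributes most.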

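Step S23: the weighted summation result is corrected using past data, and the final prediction result is output. If the input data of the neural network falls within the specified correction time period and there exist n past time series data $\{p_0, \ldots, p_{n-1}\}$ of the same working-day type as the input data, the corrected traffic flow density predicted value $O$ is computed from the weighted predicted value $O_w$ and this past data, where α represents the similar-day correction coefficient and d is the range of correction days.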

Step S24: the parameters are adjusted, clustering and neural network training are carried out again using the training set data, and S22 and S23 are then repeated together with error analysis.

The error analysis includes the calculation of three indices, MAE, MAPE and MSE, defined as follows:

in the formula: y represents the true value of the traffic flow density, O represents the predicted value of the traffic flow density, and N represents the number of the predicted values of the traffic flow density.
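The three indices can be computed as in the sketch below, which uses their standard definitions. Reporting MAPE as a fraction rather than a percentage is an assumption, the function name `error_metrics` is hypothetical, and the true values are assumed to be non-zero so that MAPE is defined.

```python
def error_metrics(y_true, y_pred):
    """Return (MAE, MAPE, MSE) for paired true and predicted values."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    mae = sum(abs(y - o) for y, o in pairs) / n
    mape = sum(abs((y - o) / y) for y, o in pairs) / n  # requires y != 0
    mse = sum((y - o) ** 2 for y, o in pairs) / n
    return mae, mape, mse


print(error_metrics([100.0, 200.0], [110.0, 180.0]))  # (15.0, 0.1, 250.0)
```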

Step S25: step S24 is repeated several times, and the parameters that give the best relative performance on the three indices are retained as the model parameters.

As shown in fig. 4, the testing phase compares the predicted results with the actual values to demonstrate the effectiveness of the model. The specific steps are as follows:

step S31, preprocessing the test set data in the same way as S21;

step S32, inputting the processed test set data into L neural networks (L < N) by taking a time label or a distance from a cluster center as a standard, performing denormalization operation and weighted summation on L prediction results, correcting the weighted summation result by using past data, and outputting a final prediction result;

step S33, clustering and neural network training are carried out by using the training set data again, and then step S32 is carried out;

and step S34, repeating step S33 M times, and calculating the mean (Mean) and standard deviation (Std) of the MAE, MSE and MAPE between the corrected predicted values and the true values, so as to simulate the prediction performance of the model under real conditions.
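Aggregating the M repetitions can be sketched with the Python standard library. The input layout (one `(mae, mse, mape)` tuple per repetition) and the name `summarize_runs` are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form is used.

```python
import statistics


def summarize_runs(run_metrics):
    """run_metrics: list of (mae, mse, mape) tuples, one per repetition.

    Returns {metric name: (mean, standard deviation)}.
    """
    summary = {}
    for name, values in zip(("MAE", "MSE", "MAPE"), zip(*run_metrics)):
        summary[name] = (statistics.mean(values), statistics.pstdev(values))
    return summary


runs = [(10.0, 120.0, 0.10), (12.0, 150.0, 0.12), (14.0, 180.0, 0.14)]
for name, (mean, std) in summarize_runs(runs).items():
    print(name, round(mean, 4), round(std, 4))
```

Because clustering and network training are repeated before each run, the standard deviation reflects how sensitive the model is to retraining, while the mean reflects its typical accuracy.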

The effectiveness of the present invention can be further exemplified by the following simulation experiments. It is noted that the parameters used in the experiments do not influence the generality of the present invention.

1) Simulation conditions are as follows:

8G memory, 128G hard disk, Intel Core i3-8750H processor, operating system Windows, simulation software SQL, SPSS and Jupyter Notebook.

2) Simulation content:

Analysis and modeling were performed using the KDD CUP 2020 dataset from the Didi GAIA open data platform. A local area of Chengdu was selected, with a time range from November 1 to November 30, 2016. The original vehicle track data are recorded at intervals of 2-4 seconds; they were aggregated into 2-minute intervals and normalized using the method of the invention. In the experiment, traffic flow density data from November 1 to November 28 were used as the training set, data from November 29 as the verification set, and data from November 30 as the test set.

The model proposed by the invention (denoted TPO) was compared with classical time series prediction models, including the LSTM recurrent neural network (denoted C). Both TPO and C were tuned on the verification set to obtain their respective optimal parameters.

The traffic flow density in the local area on November 30 was predicted and error analysis was performed; the results are shown in Table 1.

Compared with algorithm C, the MAE of the invention is lower by 3.3 × 10⁴ in mean and by 6 × 10⁷ in standard deviation, and the MAPE is lower by 0.09 in mean and by 0.014 in standard deviation, so both the accuracy and the stability of traffic flow density prediction are improved.

TABLE 1 comparison of Performance of the prediction models

The foregoing describes a specific embodiment of the present invention. It should be understood that the above description is only a specific embodiment of the present invention, and is not intended to limit the present invention, and those skilled in the art may make various changes or modifications within the scope of the appended claims without affecting the spirit of the present invention.

Claims

1. A time slot-oriented LSTM traffic flow density prediction method is characterized by comprising the following steps: a data processing stage, a training stage, a parameter adjusting stage and a testing stage;

the training phase comprises the following steps:

step S12, constructing a time-slot-oriented LSTM neural network, and performing time-series sliding window processing, splicing processing and normalization processing on input data of the neural network according to a specified length;

the parameter adjusting stage comprises the following steps:

step S21, preprocessing the verification set data, specifically including:

step S25, repeating step S24 for multiple times, and finally reserving parameters which enable the best relative performance in error analysis as model parameters;

the testing phase comprises the following steps:

and step S34, repeating step S33 M times, and predicting the traffic flow density.

2. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: the vehicle track information is derived from a network data set;

the research area is an urban traffic hot spot area;

accumulating the number of track points in the appointed interval time in the selected research area to obtain a traffic flow density data set of the appointed interval time of each road segment, wherein the appointed interval time is made according to a prediction requirement;

3. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: the specific implementation manner of obtaining the N clusters in step S11 is as follows;

$W = w_1, w_2, \ldots, w_k, \quad \max(|X|, |Y|) \le k \le |X| + |Y|$

in the formula: $w_1 = (1, 1)$, $w_k = (|X|, |Y|)$;

the cumulative warping distance $D(i, j)$ from (1, 1) to (i, j) is defined as:

$D(i, j) = \mathrm{Dist}(i, j) + \min[D(i-1, j),\ D(i, j-1),\ D(i-1, j-1)]$

in the formula: $\mathrm{Dist}(i, j)$ denotes the distance between $X_i$ and $Y_j$;

4. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: the specific implementation manner of performing time series sliding window processing, splicing processing and normalization processing on the input data according to the specified length in step S12 is as follows;

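the time series sliding window processing is as follows: each traffic flow density slice in a cluster is spliced with the seq_len − 1 traffic flow density values that immediately precede it in time in the unsliced traffic flow density data set, and time series sliding window processing with a window size of seq_len and a stride of 1 is then applied to the extended slice;

the splicing processing yields the traffic flow density sliding window data set:

$\{(x_i, y_i)\}_{1 \le i \le n \times \mathrm{time\_span}}$

wherein $x_i$ is a traffic flow density window cut out by the sliding window and $y_i$ is the traffic flow density value at the moment following it; the normalization processing is applied to the values $x_k \in \{x_0, \ldots, x_n\}$ of each window;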

5. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: the given criteria in step S22 are divided into two criteria, one is a criterion of time stamp, and the other is a criterion of distance from the cluster center;

the denormalization operation converts the prediction result back to the scale of $\{x_0, \ldots, x_n\}$ according to the following formula:

wherein $x_k \in \{x_0, \ldots, x_n\}$;

c₁≤...≤c_L

c₁+...+c_L＝1

finally, the predicted values obtained by weighting are:

6. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: the specific implementation of step S23 is as follows;

if there are n past time series data $\{p_0, \ldots, p_{n-1}\}$ of the same working-day type as the input data, and the weighted traffic flow density predicted value is $O_w$, the corrected traffic flow density predicted value $O$ is defined as:

7. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: the error analysis in step S24 includes the calculation of three indices MAE, MAPE and MSE, defined as follows:

8. The time slot-oriented LSTM traffic flow density prediction method according to claim 1, characterized in that: in step S34, the mean and standard deviation of the MAE, MSE, MAPE of the corrected predicted value and the true value are calculated to simulate the prediction effect under the true condition.