Multi-Scale Non-Local Spatio-Temporal Information Fusion Networks for Multi-Step Traffic Flow Forecasting
Figure 1. Differences in flows between regions at different scales.
Figure 2. Visualization of inflow and outflow. (a) Inflow and outflow for a given time interval in the BJTaxi dataset. (b) Inflow and outflow to single areas of the city.
Figure 3. Visualization of traffic flow prediction.
Figure 4. The multi-scale non-local spatio-temporal information fusion network (MN-STFN) framework.
Figure 5. Multi-scale Traffic Flow Pattern Capture block for encoding (MTFPC-E).
Figure 6. Multi-scale Traffic Flow Pattern Capture block for forecasting (MTFPC-F).
Figure 7. The structure of the non-local block.
Figure 8. Comparison of RMSE results for several major models. (a) BJTaxi. (b) NYCBike.
Figure 9. Comparison of MAPE results for several major models. (a) BJTaxi. (b) NYCBike.
Figure 10. Single-step forecast scenario for individual area traffic flow predictions compared to true values.
Figure 11. Visualization of errors in single-step prediction results for different scale models. The prediction error in a region increases as the color of the region gets darker. (a) MTFPC-1. (b) MTFPC-2.
Figure 12. Visual comparison of the predictive performance of models with different hidden dimensions.
Figure 13. Visualization of the error of single-step prediction results for two models with a single sample. The prediction error in a region increases as the color of the region gets darker. (a) MN-STFN-dc. (b) MN-STFN.
Figure 14. Visualization of the influence of input data on individual grids in the prediction results.
Abstract
1. Introduction
- We propose a multi-scale non-local spatio-temporal information fusion network (MN-STFN) that takes gridded traffic flow data from a past period as input and makes accurate, stable multi-step predictions of future traffic flows.
- Our model captures the distinct traffic flow patterns that arise at different scales.
- We add a non-local network structure to the model to better capture the direct spatio-temporal traffic connections between the local and global parts of the urban region in the time-series traffic flow data.
- We compare our model with multiple baseline models on two public datasets from Beijing and New York. Experiments show that our model achieves lower Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) than existing models.
2. Related Work
3. Preliminaries
3.1. Problem Formulation
3.2. Convolutional Neural Network
3.3. Convolutional Long Short-Term Memory
4. Methods
4.1. Encoding Net for Traffic Flow Pattern
4.2. Forecasting Net
4.3. Non-Local Block
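As a rough illustration of the kind of operation a non-local block performs, the sketch below follows the general embedded-Gaussian formulation of Wang et al. (Non-local neural networks, listed in the references): every grid cell attends to every other grid cell, and the aggregated response is added back to the input through a residual connection. It is a minimal NumPy sketch, not the exact block used in MN-STFN; the function name `non_local_block`, the weight names, and the 8 × 8 grid are assumptions made for illustration, and the 1 × 1 convolutions are written as plain matrix products.

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax along one axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Embedded-Gaussian non-local operation on one feature map (illustrative).

    x:                    (C, H, W) feature map
    w_theta, w_phi, w_g:  (C_inner, C) weights, standing in for 1x1 convolutions
    w_out:                (C, C_inner) weights mapping back to C channels
    Returns the output map (C, H, W) and the (H*W, H*W) attention matrix.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)               # (C, N), N = H*W grid cells
    theta = w_theta @ flat                   # (C_inner, N) query embedding
    phi = w_phi @ flat                       # (C_inner, N) key embedding
    g = w_g @ flat                           # (C_inner, N) value embedding
    attn = softmax(theta.T @ phi, axis=-1)   # (N, N): row i weights all cells j
    y = g @ attn.T                           # y[:, i] = sum_j attn[i, j] * g[:, j]
    z = w_out @ y + flat                     # residual connection
    return z.reshape(c, h, w), attn

# Hypothetical example: 2 channels (inflow, outflow) on an 8x8 grid.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8, 8))
w_theta = rng.normal(size=(4, 2))
w_phi = rng.normal(size=(4, 2))
w_g = rng.normal(size=(4, 2))
w_out = rng.normal(size=(2, 4))
z, attn = non_local_block(x, w_theta, w_phi, w_g, w_out)
print(z.shape, attn.shape)
```

Because the attention matrix covers all pairs of grid cells, distant regions can influence each other in a single layer, at a memory cost that grows with the square of the number of cells.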
5. Experiments
5.1. Datasets
- BJTaxi: This dataset was collected from the real GPS trajectories of Beijing taxis from 2013 to 2016. The trajectories are processed into gridded data (32 × 32 regions) in which each grid cell contains both inflow and outflow features. The data are divided into 30-min intervals, giving 22,459 available time intervals. We use the last four weeks of data as test data and the rest for training.
- NYCBike: This dataset was collected from the trajectories of New York public bicycles from April 2014 to September 2014. The grid size of the region is 16 × 8, the data are collected at one-hour intervals, and there are 4392 available time intervals. We use the last 10 days of data as test data and the rest for training.
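The split sizes implied by these descriptions can be worked out with a few lines of arithmetic. The sketch below assumes that the test windows contain no missing intervals, which the text does not state, so the counts should be read as upper bounds on the test set size; the helper name `intervals_in` is hypothetical.

```python
def intervals_in(days, interval_minutes):
    """Number of time intervals in a window of whole days."""
    return days * 24 * 60 // interval_minutes

# Totals taken from the dataset descriptions above.
bj_total, ny_total = 22459, 4392

bj_test = intervals_in(days=28, interval_minutes=30)   # last four weeks, 30-min steps
ny_test = intervals_in(days=10, interval_minutes=60)   # last 10 days, 1-h steps

bj_train = bj_total - bj_test
ny_train = ny_total - ny_test

print("BJTaxi  train/test:", bj_train, bj_test)   # 21115 1344
print("NYCBike train/test:", ny_train, ny_test)   # 4152 240
```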
5.2. Baselines
- SVR: Support Vector Regression (SVR) applies Support Vector Machines (SVMs) to regression problems. Unlike traditional linear regression models, SVR can handle non-linear relationships and copes well with high-dimensional and noisy data.
- LSTM: A variant of recurrent neural networks (RNNs) for processing and modeling time series and other sequence data with temporal dependencies. LSTM is designed to mitigate the vanishing gradient problem of traditional RNNs so that it can better capture long-term dependencies.
- ST-SSL [14]: A spatio-temporal self-supervised learning framework for traffic prediction that builds a spatio-temporal convolutional module on top of a complementary self-supervised learning paradigm to enhance traffic pattern representation and identify spatial and temporal heterogeneity in traffic flows. Because of structural limitations of the model, only its single-step prediction performance is compared.
- ConvLSTM [23]: It combines convolutional networks with LSTM, which allows it to capture local dependencies in spatio-temporal data.
- SA-ConvLSTM [17]: A variant of ConvLSTM that captures long-term dependencies in time series by introducing self-attention and additional memory units.
- ST-ResNet [3]: A traffic flow prediction model based on deep residual networks. By stacking residual units, it processes traffic flow data from three different periods separately to capture the spatio-temporal correlations in the traffic data.
- AttConvLSTM [19]: A multi-step traffic flow prediction model based on a sequence-to-sequence architecture. It models the influence between distant regions by applying an attention mechanism to the hidden states at different time steps.
5.3. Evaluation Metrics and Settings
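For reference, the two metrics reported in the result tables, RMSE and MAPE, can be computed as in the following minimal sketch. The definitions are the standard ones; the `min_flow` threshold that excludes near-zero ground-truth cells from MAPE is an assumption added here to avoid division by zero, and the source does not say whether or how the authors mask such cells.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error over all grid cells and channels."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mape(y_true, y_pred, min_flow=1.0):
    """Mean Absolute Percentage Error as a fraction (0.25 means 25%).

    Cells whose true flow is below `min_flow` are skipped; this masking
    rule is an illustrative assumption, not taken from the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = np.abs(y_true) >= min_flow
    return float(np.mean(np.abs(y_pred[mask] - y_true[mask]) / np.abs(y_true[mask])))

# Toy 2x2 flow grid with one empty cell.
y_true = [[10, 20], [0, 40]]
y_pred = [[12, 18], [1, 44]]
print(rmse(y_true, y_pred))   # 2.5
print(mape(y_true, y_pred))   # mean of 0.2, 0.1, 0.1 over the three non-empty cells
```

RMSE is expressed in the same units as the flow counts and is dominated by busy cells, whereas MAPE weights every retained cell equally regardless of its volume, which is why the two metrics can rank models differently.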
5.4. Comparing with Baselines
- On the BJTaxi dataset, our model (MN-STFN) outperforms the other existing methods. It is the first model to reduce the single-step RMSE below 15, which shows that our non-local network with multi-scale traffic pattern capture can effectively reduce prediction error. For 2–5-step prediction, its RMSE approaches that of SA-ConvLSTM, but it still achieves the minimum error at every step. Our model also has the lowest MAPE, outperforming all other models in both single-step and multi-step prediction.
- On the NYCBike dataset, our model achieves better RMSE than the other models for 1–4-step prediction, while ST-ResNet performs better for 5-step prediction. In terms of MAPE, ST-SSL performs better for single-step prediction, while our model performs better than the other models for multi-step prediction.
- Apart from the better single-step MAPE of ST-SSL, the differences among the models on the NYCBike dataset are less pronounced than on the BJTaxi dataset. One likely reason is that NYCBike is much smaller than BJTaxi, its traffic flow varies less, and its flow patterns are relatively simple, so existing methods already capture them well; consequently, except for SVR, the models perform similarly on this dataset. In addition, the NYCBike area is not very large, so its flow patterns vary little across scales, which limits the benefit of our multi-scale structure. Nevertheless, the predictions of our model on NYCBike still match or exceed those of the best existing models, and as the prediction area expands, our model can better capture complex traffic flow patterns and deliver excellent prediction performance.
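The margin between MN-STFN and SA-ConvLSTM on BJTaxi can be read directly from the reported RMSE values. The short sketch below only re-expresses numbers from the BJTaxi RMSE table as per-step gaps; it adds no new results, and the variable names are illustrative.

```python
# RMSE on BJTaxi for prediction steps 1-5, copied from the results table.
mn_stfn = [14.13, 15.54, 16.35, 17.67, 17.95]
sa_convlstm = [15.66, 16.35, 17.28, 17.78, 18.38]

# Positive gap means MN-STFN has the lower error at that step.
gaps = [round(b - a, 2) for a, b in zip(mn_stfn, sa_convlstm)]

for step, gap in enumerate(gaps, start=1):
    print(f"step {step}: SA-ConvLSTM minus MN-STFN = {gap:.2f}")
# gaps: 1.53, 0.81, 0.93, 0.11, 0.43
```

The gap is largest for single-step prediction and much smaller at steps 4 and 5, although it stays positive throughout.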
5.5. Effect of Multi-Scale Traffic Flow Pattern Capture
5.6. Effect of Low-Scale Hidden-State Dimensions
5.7. Effect of Non-Local Block
- MN-STFN-dc: It uses a transposed convolution layer instead of a non-local block to transform the hidden-state dimension into the predicted output dimension.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Zheng, Y.; Capra, L.; Wolfson, O.; Yang, H. Urban computing: Concepts, methodologies, and applications. ACM Trans. Intell. Syst. Technol. TIST 2014, 5, 1–55.
2. Zhang, J.; Zheng, Y.; Qi, D.; Li, R.; Yi, X. DNN-based prediction model for spatio-temporal data. In Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Burlingame, CA, USA, 31 October–3 November 2016; pp. 1–4.
3. Zhang, J.; Zheng, Y.; Qi, D. Deep spatio-temporal residual networks for citywide crowd flows prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
4. Chen, Q.; Song, X.; Yamada, H.; Shibasaki, R. Learning deep representation from big and heterogeneous data for traffic accident inference. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
5. Jayarajah, K.; Tan, A.; Misra, A. Understanding the Interdependency of Land Use and Mobility for Urban Planning. In Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium, Singapore, 8–12 October 2018.
6. Wang, Y.; Tong, D.; Li, W.; Liu, Y. Optimizing the spatial relocation of hospitals to reduce urban traffic congestion: A case study of Beijing. Trans. GIS 2019, 23, 365–386.
7. Chen, Y.; Wu, G.; Chen, Y.; Xia, Z. Spatial Location Optimization of Fire Stations with Traffic Status and Urban Functional Areas. Appl. Spat. Anal. Policy 2023, 16, 771–788.
8. Ali, A.; Terada, K. A framework for human tracking using kalman filter and fast mean shift algorithms. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, Kyoto, Japan, 27 September–4 October 2009; pp. 1028–1033.
9. Kumar, S.V.; Vanajakshi, L. Short-term traffic flow prediction using seasonal ARIMA model with limited input data. Eur. Transp. Res. Rev. 2015, 7, 21.
10. Zheng, J.; Ni, L.M. An unsupervised framework for sensing individual and cluster behavior patterns from human mobile data. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing, Pittsburgh, PA, USA, 5–8 September 2012; pp. 153–162.
11. Ye, Y.; Zheng, Y.; Chen, Y.; Feng, J.; Xie, X. Mining individual life pattern based on location history. In Proceedings of the 2009 Tenth International Conference on Mobile Data Management: Systems, Services and Middleware, Washington, DC, USA, 18–20 May 2009; pp. 1–10.
12. Fu, X.; Yu, G.; Liu, Z. Spatial–temporal convolutional model for urban crowd density prediction based on mobile-phone signaling data. IEEE Trans. Intell. Transp. Syst. 2021, 23, 14661–14673.
13. Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv 2017, arXiv:1709.04875.
14. Ji, J.; Wang, J.; Huang, C.; Wu, J.; Xu, B.; Wu, Z.; Zhang, J.; Zheng, Y. Spatio-temporal self-supervised learning for traffic flow prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 4356–4364.
15. Zhou, Z.; Wang, Y.; Xie, X.; Qiao, L.; Li, Y. STUaNet: Understanding uncertainty in spatiotemporal collective human mobility. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 1868–1879.
16. Zhang, Y.; Li, Y.; Zhou, X.; Luo, J.; Zhang, Z.L. Urban traffic dynamics prediction—a continuous spatial-temporal meta-learning approach. ACM Trans. Intell. Syst. Technol. TIST 2022, 13, 1–19.
17. Lin, Z.; Li, M.; Zheng, Z.; Cheng, Y.; Yuan, C. Self-attention convlstm for spatiotemporal prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11531–11538.
18. Wang, D.; Yang, Y.; Ning, S. DeepSTCL: A deep spatio-temporal ConvLSTM for travel demand prediction. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–8.
19. Liu, C.H.; Piao, C.; Ma, X.; Yuan, Y.; Tang, J.; Wang, G.; Leung, K.K. Modeling citywide crowd flows using attentive convolutional lstm. In Proceedings of the 2021 IEEE 37th International Conference on Data Engineering (ICDE), Chania, Greece, 19–22 April 2021; pp. 217–228.
20. Zheng, H.; Lin, F.; Feng, X.; Chen, Y. A hybrid deep learning model with attention-based conv-LSTM networks for short-term traffic flow prediction. IEEE Trans. Intell. Transp. Syst. 2020, 22, 6910–6920.
21. Li, Z.; Han, Y.; Xu, Z.; Zhang, Z.; Sun, Z.; Chen, G. PMGCN: Progressive Multi-Graph Convolutional Network for Traffic Forecasting. ISPRS Int. J. Geo-Inf. 2023, 12, 241.
22. Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv 2017, arXiv:1707.01926.
23. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810.
24. Pan, Z.; Liang, Y.; Wang, W.; Yu, Y.; Zheng, Y.; Zhang, J. Urban traffic prediction from spatio-temporal data using deep meta learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1720–1730.
25. Pan, Z.; Ke, S.; Yang, X.; Liang, Y.; Yu, Y.; Zhang, J.; Zheng, Y. AutoSTG: Neural Architecture Search for Predictions of Spatio-Temporal Graph. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 1846–1855.
26. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010.
27. Liu, L.; Zhang, R.; Peng, J.; Li, G.; Du, B.; Lin, L. Attentive crowd flow machines. In Proceedings of the 26th ACM international conference on Multimedia, Seoul, Republic of Korea, 22–26 October 2018; pp. 1553–1561.
28. Wang, X.; Girshick, R.; Gupta, A.; He, K. Non-local neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7794–7803.
29. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.

Datasets | Time Interval | Taxis/Bikes | Regions | Available Time Interval | Data Type |
---|---|---|---|---|---|
BJTaxi | 30 min | 34k+ | 32 × 32 | 22,459 | Taxi GPS |
NYCBike | 1 h | 6.8k+ | 16 × 8 | 4392 | Bike rent |

RMSE of step prediction on BJTaxi

Methods | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 |
---|---|---|---|---|---|
SVR | 39.26 | 44.49 | 48.51 | 51.79 | 54.38 |
LSTM | 16.21 | 18.49 | 19.25 | 20.02 | 20.85 |
ST-SSL | 17.82 | - | - | - | - |
ConvLSTM | 17.46 | 21.19 | 24.10 | 26.39 | 29.23 |
SA-ConvLSTM | 15.66 | 16.35 | 17.28 | 17.78 | 18.38 |
ST-ResNet | 16.42 | 17.80 | 18.65 | 19.35 | 19.77 |
AttConvLSTM | 19.11 | 19.68 | 22.32 | 24.81 | 26.87 |
MN-STFN | 14.13 | 15.54 | 16.35 | 17.67 | 17.95 |

MAPE of step prediction on BJTaxi

Methods | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 |
---|---|---|---|---|---|
SVR | 0.4281 | 0.4409 | 0.4499 | 0.4568 | 0.4626 |
LSTM | 0.3661 | 0.3973 | 0.4020 | 0.4336 | 0.4384 |
ST-SSL | 0.2196 | - | - | - | - |
ConvLSTM | 0.2940 | 0.3599 | 0.4151 | 0.4742 | 0.5481 |
SA-ConvLSTM | 0.2294 | 0.2322 | 0.2419 | 0.2436 | 0.2505 |
ST-ResNet | 0.2252 | 0.2381 | 0.2455 | 0.2563 | 0.2586 |
AttConvLSTM | 0.3727 | 0.4022 | 0.4676 | 0.5510 | 0.5660 |
MN-STFN | 0.2081 | 0.2294 | 0.2354 | 0.2420 | 0.2464 |

RMSE of step prediction on NYCBike

Methods | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 |
---|---|---|---|---|---|
SVR | 10.75 | 12.61 | 13.86 | 14.57 | 16.79 |
LSTM | 4.85 | 5.74 | 5.62 | 6.08 | 6.71 |
ST-SSL | 4.62 | - | - | - | - |
ConvLSTM | 5.65 | 7.25 | 8.41 | 9.29 | 9.55 |
SA-ConvLSTM | 4.68 | 5.22 | 5.62 | 6.08 | 6.71 |
ST-ResNet | 4.51 | 5.03 | 5.49 | 6.07 | 6.54 |
AttConvLSTM | 4.89 | 5.76 | 6.15 | 6.70 | 7.41 |
MN-STFN | 4.49 | 5.08 | 5.50 | 5.93 | 6.68 |

MAPE of step prediction on NYCBike

Methods | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 |
---|---|---|---|---|---|
SVR | 0.5832 | 0.6489 | 0.6959 | 0.5993 | 0.6136 |
LSTM | 0.4370 | 0.4917 | 0.5510 | 0.5677 | 0.6345 |
ST-SSL | 0.3018 | - | - | - | - |
ConvLSTM | 0.5025 | 0.6021 | 0.6472 | 0.6482 | 0.6793 |
SA-ConvLSTM | 0.4072 | 0.4153 | 0.4303 | 0.4415 | 0.4860 |
ST-ResNet | 0.3920 | 0.4396 | 0.4624 | 0.4772 | 0.5023 |
AttConvLSTM | 0.4321 | 0.4576 | 0.4728 | 0.5058 | 0.5729 |
MN-STFN | 0.3895 | 0.4082 | 0.4286 | 0.4322 | 0.4848 |

RMSE of step prediction for different block numbers

Block Number | Step 1 | Step 3 | Step 5 |
---|---|---|---|
MTFPC-1 | 14.21 | 16.42 | 17.82 |
MTFPC-2 | 14.13 | 16.35 | 17.95 |
MTFPC-3 | 14.41 | 16.62 | 17.96 |

RMSE of step prediction for different input dimensions

Input Dim | Step 1 | Step 3 | Step 5 |
---|---|---|---|
8 | 14.48 | 17.26 | 18.87 |
16 | 14.13 | 16.35 | 17.95 |
32 | 14.67 | 16.52 | 17.99 |

RMSE of step prediction with and without the non-local block

Methods | Step 1 | Step 3 | Step 5 |
---|---|---|---|
MN-STFN-dc | 14.32 | 16.75 | 18.32 |
MN-STFN | 14.13 | 16.35 | 17.95 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lu, S.; Chen, H.; Teng, Y. Multi-Scale Non-Local Spatio-Temporal Information Fusion Networks for Multi-Step Traffic Flow Forecasting. ISPRS Int. J. Geo-Inf. 2024, 13, 71. https://doi.org/10.3390/ijgi13030071