Prediction of Parking Space Availability Using Improved MAT-LSTM Network
Figure 1. The research flowchart and content framework of this paper.
Figure 2. TCN architecture.
Figure 3. Structures of the RNN and the LSTM network.
Figure 4. Schematic of the MAT structure. FC represents a linear transformation, Attention indicates the application of attention pooling to each head, and Concat denotes the concatenation of the *h* attention-pooling outputs.
Figure 5. Schematic of the scaled dot-product attention structure. Note: MatMul represents the matrix product, Scale denotes normalization through division by $\sqrt{d_k}$, and Softmax calculates the weights.
Figure 6. Location and surroundings of the analyzed parking lots.
Figure 7. True and predicted NAP values for part of the D1 test set.
Figure 8. Convergence curves for TCN, LSTM, and MAT-LSTM (Dataset D1): (a) comparison of validation set MSE curves and (b) comparison of test set MSE curves.
Figure 9. Error curves for LSTM and MAT-LSTM: (a) Dataset D1; (b) Dataset D2; (c) Dataset D3; (d) Dataset S5; and (e) Dataset S10.
Figure A1. NAP of D1, D2, and D3 in the year 2021.
Abstract
1. Introduction
- The effectiveness and accuracy of two neural network models, TCN and LSTM, were compared to identify the most suitable method for single-input, single-output time series prediction problems. For the short-term prediction of available parking spaces, three real and two synthetic datasets were used to train and test the network models. While the TCN has a simple structure, trains rapidly, and provides high efficiency, the LSTM network can fully leverage its ability to capture longer-range historical features, achieving greater accuracy on long-term series problems.
- The LSTM network was optimized by implementing a multi-head attention (MAT) module. The improved MAT-LSTM network effectively models the internal correlations within temporal data and fully exploits high-level temporal features. Ablation experiments showed that, at an acceptable additional time cost, the proposed MAT-LSTM network successfully captured intrinsic correlations between temporal data and improved the accuracy of short-term parking space availability predictions.
2. Related Work
- For short-term prediction of NAP, which performs better: an RNN with LSTM, or the structurally simpler CNN? Previous studies usually discussed CNNs and RNNs independently. When selecting a network baseline, a key prerequisite is comparing the efficiency and accuracy of the two architectures on the same dataset, which is missing from previous studies. Moreover, different articles use different evaluation metrics, such as MAE in [16], RMSE in [17], and MSE in [19], which makes it difficult to compare the two frameworks within one study or across the results of different studies (a brief illustration of how these metrics diverge on identical predictions is given at the end of this section).
- If an RNN with LSTM outperforms a CNN, what are its advantages over the CNN? Are there further methods available to enhance its advantages?
- We simultaneously trained and tested the TCN and LSTM networks on three real datasets and two synthetic datasets and evaluated their characteristics in terms of training time, accuracy, and convergence rate.
- We improved the LSTM network by integrating a preceding MAT module to capture the features and relationships in long time series and compared the predictive performance of the improved MAT-LSTM network with that of the classic LSTM network.
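To make the metric mismatch concrete, the short Python sketch below computes MAE, MSE, and RMSE for the same toy NAP predictions (the values are made up for illustration). Identical outputs yield three different scores, which is why results quoted under different metrics cannot be compared directly across studies.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    """Mean squared error."""
    return np.mean((y_true - y_pred) ** 2)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(mse(y_true, y_pred))

# Hypothetical NAP series: same predictions, three different scores
y_true = np.array([150.0, 148.0, 145.0, 140.0])
y_pred = np.array([151.0, 146.0, 147.0, 138.0])
print(mae(y_true, y_pred))   # 1.75
print(mse(y_true, y_pred))   # 3.25
print(rmse(y_true, y_pred))  # ~1.80
```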
3. Proposed Approach
3.1. Classic TCN Architecture
3.2. Classic LSTM Network Architecture
- Forget Gate Computation: The forget gate decides how much of the cell state from the previous moment is retained in the current cell state, essentially determining what information to discard from the cell state. This gate reads $h_{t-1}$ and $x_t$ and, after passing through a sigmoid layer, outputs a number between 0 and 1, denoted $f_t$, which is multiplied element-wise by each number in the cell state $C_{t-1}$. A value of $f_t$ equal to 0 signifies complete discarding, whereas 1 indicates complete retention. The output of the forget gate is expressed as:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
- Input Gate Computation: The input gate determines how much of the current input $x_t$ is retained in the current cell state and consists of two parts. The first part is a sigmoid layer, which decides which values to update, denoted $i_t$; the second part is a tanh layer, which creates a new candidate cell state vector $\tilde{C}_t$, incorporating the information to be updated into the cell state. The computations for the input gate are expressed as:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \qquad \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$
- Updating the Old Cell State: After processing through the input gate, the old cell state $C_{t-1}$ is updated to $C_t$ by multiplying the old state with $f_t$ to discard the information marked for removal and then adding $i_t \ast \tilde{C}_t$, thus completing the update of the cell state. It is represented as follows:
$$C_t = f_t \ast C_{t-1} + i_t \ast \tilde{C}_t$$
- Output Gate Computation: The output gate decides how much of the current cell state $C_t$ to output, with a sigmoid layer, denoted $o_t$, determining which part of the cell state will be output. The cell state is processed through a tanh function to yield values between −1 and 1, which are then multiplied by the output of the sigmoid gate, so that only the selected part of the state is output (a runnable sketch of all four steps follows this list). This is represented as:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \ast \tanh(C_t)$$
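As a complement to the four gate equations above, the following minimal NumPy sketch advances one LSTM cell by a single time step. The stacked weight layout and the hidden size of 30 (borrowed from the hyperparameter table, Table 3) are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following the four gate equations above.

    W maps the concatenated [h_{t-1}, x_t] onto the stacked gate
    pre-activations [f, i, c~, o]; b is the matching bias vector.
    """
    z = W @ np.concatenate([h_prev, x_t]) + b
    n = h_prev.shape[0]
    f_t = sigmoid(z[0 * n:1 * n])        # forget gate
    i_t = sigmoid(z[1 * n:2 * n])        # input gate
    c_tilde = np.tanh(z[2 * n:3 * n])    # candidate cell state
    o_t = sigmoid(z[3 * n:4 * n])        # output gate
    c_t = f_t * c_prev + i_t * c_tilde   # cell-state update
    h_t = o_t * np.tanh(c_t)             # hidden state / output
    return h_t, c_t

# Example: hidden size 30, scalar NAP value as the per-step input
rng = np.random.default_rng(0)
n, d = 30, 1
W = rng.normal(scale=0.1, size=(4 * n, n + d))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
h, c = lstm_cell_step(rng.normal(size=d), h, c, W, b)
```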
3.3. Improved MAT-LSTM Network Architecture
- The inputs, denoted queries (Q), keys (K), and values (V), undergo linear transformations that project the matrices Q, K, and V, each of dimension $d_{model}$, into correspondingly dimensioned subspaces of size $d_k$, $d_k$, and $d_v$, respectively. This projection maps the inputs into more narrowly defined subspaces, permitting the different attention heads to discern distinct facets of the data.
- The scaled dot-product attention mechanism is deployed to compute each head's output, as depicted in Figure 5. First, the dot product of Q and the transpose of K ($QK^T$) is computed and scaled by dividing by the square root of the key dimension ($\sqrt{d_k}$), a factor that mitigates the softmax function's propensity for vanishing gradients during training, a common occurrence when dot products grow excessively large. After scaling, the softmax function is applied to each row of the scaled scores, producing a matrix of attention weights that reflects the significance of each value in relation to each query. Finally, the softmax output is multiplied by V, yielding the output matrix for each attention head: a weighted sum of the values in which the weights mirror the attention each value receives from the respective queries. In matrix form, $\text{Attention}(Q, K, V) = \text{softmax}(QK^T/\sqrt{d_k})\,V$.
- These stages are repeated for every head, and the respective results are merged: the outputs of all attention heads are concatenated along the feature dimension to form a unified matrix that embodies the information accrued from all heads.
- The assembled matrix from step 3 undergoes an additional linear transformation. The resulting matrix serves as the input to the subsequent layers of the neural network, or constitutes the final output when it belongs to the terminal layer of a sequential processing model. A compact sketch of these four steps is given below.
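The four steps above condense into a short NumPy sketch of multi-head scaled dot-product attention. The weight matrices, sequence length, and head count used here are placeholders for illustration; the paper's MAT module performs the same computation with trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, h):
    """Multi-head scaled dot-product attention over a sequence X.

    X: (T, d_model); Wq, Wk, Wv, Wo: (d_model, d_model);
    h: number of heads, with d_k = d_model // h per head.
    """
    T, d_model = X.shape
    d_k = d_model // h
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # step 1: linear projections
    heads = []
    for i in range(h):
        s = slice(i * d_k, (i + 1) * d_k)         # this head's subspace
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_k)
        heads.append(softmax(scores) @ V[:, s])   # step 2: attention pooling
    concat = np.concatenate(heads, axis=-1)       # step 3: concatenate heads
    return concat @ Wo                            # step 4: final linear layer

# Toy usage with random inputs and weights
T, d_model, h = 6, 16, 4
rng = np.random.default_rng(1)
X = rng.normal(size=(T, d_model))
Wq, Wk, Wv, Wo = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
out = multi_head_attention(X, Wq, Wk, Wv, Wo, h)  # shape (6, 16)
```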
4. Experiments and Results
4.1. Datasets
4.2. Network Evaluation Method
4.3. Training and Results
4.4. Accuracy and Efficiency Comparison between LSTM and TCN
- LSTM exhibited instability. As previously mentioned, datasets D1, D2, and D3 serve as repeated validations for assessing network stability: a method that performs similarly across these datasets has good stability and repeatability. However, Table 4 reveals a two-order-of-magnitude difference between LSTM's MSE on D3 and its MSE on D1 and D2, suggesting weak stability. Therefore, the accuracy comparisons below focus solely on the D1 and D2 results, excluding D3.
- LSTM demonstrated a higher NAP prediction accuracy than the TCN. On the real datasets, LSTM achieved an average MSE of 0.0436, surpassing the TCN's average of 0.5612. On the synthetic datasets, LSTM's average MSE was likewise lower than the TCN's 0.0057.
- Regarding the average training time per epoch, LSTM required less training time than the TCN. Table 5 shows that, for datasets comprising 525,600 data points and 300 training epochs, the average training time for LSTM was 01:57:38, compared with 02:12:05 for the TCN.
- LSTM initially exhibited larger errors than the TCN. As depicted in Figure 8, for identical data, the initial MSE of LSTM was two orders of magnitude greater than that of the TCN; within the first 10 epochs in particular, the TCN's MSE curve was markedly lower than LSTM's. After 30 epochs, however, the slopes of both curves gradually decreased, and the curves converged.
4.5. Improvement of MAT-LSTM
- MAT-LSTM enhanced the prediction accuracy. Table 4 shows that, comparing the test set MSE of the two networks, MAT-LSTM achieved a significantly higher prediction accuracy on D1 and D2 than LSTM. Overall, MAT-LSTM exhibited a 23% reduction in the average MSE compared with the traditional LSTM, indicating a substantial accuracy improvement owing to the network enhancement. On the D1 dataset, MAT-LSTM even achieved a 48% higher accuracy than LSTM. While the accuracy of LSTM was marginally higher than that of MAT-LSTM in the initial stages of training on the S5 and S10 synthetic datasets, after multiple training rounds MAT-LSTM reached a lower final average MSE, surpassing that of LSTM by 29%.
- The convergence speed of MAT-LSTM was accelerated. As depicted in Figure 9, MAT-LSTM attained a lower MSE within the first five epochs and reached a steady state earlier than LSTM on datasets D1, D2, and D3. In contrast, the MSE of the conventional LSTM required 10–40 epochs to decrease to a similar level as that of MAT-LSTM. On the S5 and S10 synthetic datasets, however, both networks converged essentially simultaneously, possibly because the normalization process expedited the decline of the loss function; both exhibited a significant downward trend within five epochs and stabilized thereafter.
- MAT-LSTM required more computational time. The addition of the MAT module increased the computational workload, lengthening each training cycle. As indicated in Table 5, MAT-LSTM consumed 16% more training time per cycle than LSTM on the D1, D2, and D3 datasets, and 20% more on the S5 and S10 datasets.
5. Discussion
- Within a reasonable range, more universal algorithm models should be explored. The algorithm proposed in this study relies only on historical parking-availability time series, which limits the universality of the model: it must be trained on the dataset of each parking lot to obtain a targeted model. In practical applications, the ultimate expectation is a more universal, unified model whose algorithm is applicable to predicting parking space availability across different regions and different types of parking lots.
- The prediction length should be optimized. Prediction errors tend to accumulate over time, so long-term predictions are generally less accurate than short-term ones. While short-term predictions offer higher confidence in their accuracy, they require more intensive training and computation, potentially leading to redundant computation and wasted resources. By using the same historical data to predict different time lengths, comparing the resulting accuracies, and weighing the computational requirements, an optimal prediction period can be determined.
- A multi-input, single- or multi-output network model should be studied to explore the impact of multiple explanatory variables on response variables, as has been done to predict the number of highway accidents [24] and incidents of exceeding the bridge design traffic load [25]. In this setting, the response variable could be the average availability of a parking lot within a specific time window, and the explanatory variables could include time patterns (hour of the day, day of the week, and holidays [26,27]) and contextual factors (weather conditions [28,29,30], characteristics of the parking lot's area, characteristics of the building/enterprise served by the parking lot [31], and so on). This approach would simultaneously improve the generalization of predictive models across multiple parking lots, thereby promoting their practical applicability to stakeholders.
- Although ANNs are considered "black box" models, feature-importance indicators can effectively rank predictive variables by their significance, thereby revealing the impact of each predictor on the response variable. However, owing to the complexity and variability of transportation, research in this area remains challenging: the attributes, user-behavior characteristics, and other aspects of each parking lot differ significantly and may change over time. Research on this issue should therefore be both targeted and universal, combining the individual attributes of parking lots for analysis while summarizing statistical patterns from a large number of datasets.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
References
- She, F.; Qiu, J.; Tang, M. Simulation of prediction for free parking spaces in large parking lots. Appl. Res. Comput. 2019, 36, 851–854. [Google Scholar]
- Wang, Q.; Li, H. Design of Parking Space Management System Based on Internet of Things Technology. In Proceedings of the 2022 IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers, Dalian, China, 14–16 April 2022. [Google Scholar]
- Gao, G.; Ding, Y.; Jiang, F.; Li, C. Prediction of Parking Guidance Space Based on BP Neural Networks. Comput. Syst. Appl. 2017, 26, 236–239. [Google Scholar]
- Zhao, W.; Zhang, Y. Application research of parking lot free parking number prediction based on back propagation neural network. J. Harbin Univ. Commer. Nat. Sci. Ed. 2015, 2015, 44–46. [Google Scholar]
- Ji, Y.; Tang, D.; Blythe, P.; Guo, W.; Wang, W. Short-term forecasting of available parking space using wavelet neural network model. IET Intell. Transp. Syst. 2014, 9, 202–209. [Google Scholar] [CrossRef]
- Ji, Y.; Chen, X.; Wang, W.; Hu, B. Short-term forecasting of parking space using particle swarm optimization-wavelet neural network model. J. Jilin Univ. (Eng. Technol. Ed.) 2016, 46, 399–405. [Google Scholar]
- Chen, H.; Tu, X.; Wang, Y.; Zheng, J. Short-Term Parking Space Prediction Based on Wavelet-ELM Neural Networks. J. Jilin Univ. (Sci. Ed.) 2017, 55, 388–392. [Google Scholar]
- Xie, K. Shared Characteristics and Allocation Optimization Methods of Motor Vehicle Parking Spaces in Universities in Urban Centers. Master’s Thesis, Southeast University, Nanjing, China, 2014. [Google Scholar]
- Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. arXiv 2018, arXiv:1803.01271v2. [Google Scholar]
- Li, W.; Wang, X. Time Series Prediction Method Based on Simplified LSTM Neural Network. J. Beijing Univ. Technol. 2021, 47, 480–488. [Google Scholar]
- Zeng, C.; Mia, C.; Wang, K.; Cui, Z. Predicting vacant parking space availability: A DWT-Bi-LSTM model. Phys. A Stat. Mech. Appl. 2022, 599, 127498. [Google Scholar] [CrossRef]
- Luo, X.; Li, D.; Yang, Y.; Zhang, S. Short-term Traffic Flow Prediction Based on KNN-LSTM. J. Beijing Univ. Technol. 2018, 44, 1521–1527. [Google Scholar]
- Shao, W.; Zhang, Y.; Guo, B.; Qin, K.; Chan, J.; Salim, F.D. Parking Availability Prediction with Long Short Term Memory Model. In Proceedings of the Green, Pervasive, and Cloud Computing, Uberlândia, Brazil, 26–28 May 2019. [Google Scholar]
- Tian, Y.; Wei, C.; Xu, D. Traffic Flow Prediction Based on Stack AutoEncoder and Long Short-Term Memory Network. In Proceedings of the IEEE International Conference on Automation, Electronics and Electrical Engineering (AUTEEE), Shenyang, China, 20–22 November 2020. [Google Scholar]
- Bandara, K.; Bergmeir, C.; Hewamalage, H. LSTM-MSNet: Leveraging Forecasts on Sets of Related Time Series With Multiple Seasonal Patterns. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 1586–1599. [Google Scholar] [CrossRef] [PubMed]
- Camero, A.; Toutouh, J.; Stolfi, D.H.; Alba, E. Evolutionary Deep Learning for Car Park Occupancy Prediction in Smart Cities. In Learning and Intelligent Optimization; Coello Coello, C.A., Ed.; Springer International Publishing AG: Cham, Switzerland, 2019; pp. 386–401. [Google Scholar]
- Li, J.; Li, J.; Zhang, H. Deep Learning Based Parking Prediction on Cloud Platform. In Proceedings of the International Conference on Big Data Computing and Communications (BIGCOM), Chicago, IL, USA, 7–9 August 2018. [Google Scholar]
- Lea, C.; Vidal, R.; Reiter, A.; Hager, G.D. Temporal Convolutional Networks: A Unified Approach to Action Segmentation. arXiv 2016, arXiv:1608.08242v1. [Google Scholar]
- Shang, K.; Wan, Z.; Zhang, Y.; Cui, Z.; Zhang, Z.; Jiang, C.; Zhang, F. Intelligent Short-Term Multiscale Prediction of Parking Space Availability Using an Attention-Enhanced Temporal Convolutional Network. ISPRS Int. J. Geo-Inf. 2023, 12, 208. [Google Scholar] [CrossRef]
- Peng, P.; Chen, Y.; Lin, W.; Wang, J.Z. Attention-based CNN-LSTM for high-frequency multiple cryptocurrency trend prediction. Expert Syst. Appl. 2024, 237, 121520. [Google Scholar] [CrossRef]
- Mnih, V.; Heess, N.; Graves, A. Recurrent Models of Visual Attention. arXiv 2014, arXiv:1406.6247. [Google Scholar]
- Bhosale, S.; Chakraborty, R.; Kopparapu, S.K. Deep Encoded Linguistic and Acoustic Cues for Attention Based End to End Speech Emotion Recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Virtual, Barcelona, Italy, 4–8 May 2020. [Google Scholar]
- Yin, W.; Ebert, S.; Schütze, H. Attention-Based Convolutional Neural Network for Machine Comprehension. arXiv 2016, arXiv:1602.04341v1. [Google Scholar]
- Alqatawna, A.; Álvarez, R.; García-Moreno, S. Comparison of Multivariate Regression Models and Artificial Neural Networks for Prediction Highway Traffic Accidents in Spain: A Case Study. Transp. Res. Proc. 2021, 58, 277–284. [Google Scholar] [CrossRef]
- Ventura, R.; Barabino, B.; Maternini, G. Prediction of the severity of exceeding design traffic loads on highway bridges. Heliyon 2024, 10, e23374. [Google Scholar] [CrossRef]
- Vlahogianni, E.; Kepaptsoglou, K.; Tsetsos, V.; Karlaftis, M.G. A Real-Time Parking Prediction System for Smart Cities. J. Intell. Transp. Syst. 2016, 20, 192–204. [Google Scholar] [CrossRef]
- Shao, W.; Salim, F.D.; Song, A.; Bouguettaya, A. Clustering Big Spatiotemporal-Interval Data. IEEE Trans. Big Data. 2016, 2, 190–203. [Google Scholar] [CrossRef]
- Badii, C.; Nesi, P.; Paoli, P. Predicting Available Parking Slots on Critical and Regular Services by Exploiting a Range of Open Data. IEEE Access 2018, 6, 44059–44071. [Google Scholar] [CrossRef]
- Provoost, J.C.; Kamilaris, A.; Wismans, L.J.J.; van der Drift, J.S.; van Keulen, M. Predicting parking occupancy via machine learning in the web of things. Internet Things 2020, 12, 100301. [Google Scholar] [CrossRef]
- Zhang, F.; Liu, Y.; Feng, N.; Yang, C.; Zhai, J.; Zhang, S.; He, B.; Lin, J.; Du, X. Periodic Weather-Aware LSTM With Event Mechanism for Parking Behavior Prediction. IEEE Trans. Knowl. Data Eng. 2022, 34, 5896–5909. [Google Scholar] [CrossRef]
- Xiao, X.; Peng, Z.; Lin, Y.; Jin, Z.; Shao, W.; Chen, R.; Cheng, N.; Mao, G. Parking Prediction in Smart Cities: A Survey. IEEE Trans. Intell. Transp. Syst. 2023, 24, 10302–10326. [Google Scholar] [CrossRef]
Table 1. Parking slot counts and errors for the three parking lots.

| Parking Lot | Parking Slots (Recorded) | Parking Slots (Verified) | Error |
|---|---|---|---|
| D1 | 152 | 152 | 0 |
| D2 | 132 | 132 | 0 |
| D3 | 150 | 150 | 0 |
Table 2. Attributes and statistics of the real and synthetic datasets.

| Dataset | Attribute | Min | Max | SD |
|---|---|---|---|---|
| D1 | Real | 0 | 157 | 58.68 |
| D2 | Real | 0 | 132 | 29.00 |
| D3 | Real | 0 | 151 | 42.22 |
| S5 | Synthetic | 1 | 2210 | 1245.83 |
| S10 | Synthetic | 1 | 4450 | 622.89 |
Table 3. Hyperparameters of the three network models (epochs: 300 on the real datasets, 120 on the synthetic datasets).

| Model | Batch Size | Epochs | Layers | Hidden Units/Layer | Kernel Size | Learning Rate |
|---|---|---|---|---|---|---|
| TCN | 128 | 300/120 | 8 | 30 | 13 | 0.004 |
| LSTM | 128 | 300/120 | 8 | 30 | 13 | 0.0001–0.001 |
| MAT-LSTM | 128 | 300/120 | 8 | 30 | 13 | 0.0001 |
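For concreteness, the following PyTorch sketch shows one plausible way to wire the Table 3 settings (batch size 128, 8 layers of 30 hidden units, learning rate 0.0001) into a MAT-LSTM model. The head count, the scalar-input embedding, and the exact attention-before-LSTM wiring are illustrative assumptions; the table does not fully specify the architecture.

```python
import torch
import torch.nn as nn

class MATLSTM(nn.Module):
    """Sketch of a MAT module feeding an LSTM stack (Table 3 settings).

    The head count (2 here) and layer ordering are assumptions made
    for illustration; only the sizes come from the table.
    """
    def __init__(self, d_model=30, n_layers=8, n_heads=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)          # scalar NAP -> features
        self.mat = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.lstm = nn.LSTM(d_model, d_model, num_layers=n_layers,
                            batch_first=True)       # 8 layers x 30 units
        self.head = nn.Linear(d_model, 1)           # next-step NAP value

    def forward(self, x):                           # x: (batch, T, 1)
        z = self.embed(x)
        z, _ = self.mat(z, z, z)                    # preceding MAT module
        z, _ = self.lstm(z)
        return self.head(z[:, -1])                  # predict from last step

model = MATLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Table 3 rate
x = torch.randn(128, 60, 1)                         # batch of 128 windows
y_hat = model(x)                                    # shape (128, 1)
```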
Table 4. Test set MSE on the real datasets (D1, D2, D3) and the synthetic datasets (S5, S10).

| Model | D1 | D2 | D3 | Mean of D1 & D2 | S5 | S10 | Mean |
|---|---|---|---|---|---|---|---|
| TCN | 0.7903 | 0.3321 | 0.3424 | 0.5612 | 0.0046 | 0.0067 | 0.0057 |
| LSTM | 0.0279 | 0.0592 | 0.0005 | 0.0436 | | | |
| MAT-LSTM | 0.0145 | 0.0526 | 0.0321 | 0.0336 | | | |
Table 5. Average training time (hh:mm:ss) on the real datasets (300 epochs) and the synthetic datasets (120 epochs).

| Model | D1 | D2 | D3 | Mean | S5 | S10 | Mean |
|---|---|---|---|---|---|---|---|
| TCN | 02:11:49 | 02:12:06 | 02:12:19 | 02:12:05 | 00:53:12 | 00:53:26 | 00:53:19 |
| LSTM | 01:56:44 | 01:59:01 | 01:57:10 | 01:57:38 | 00:47:57 | 00:47:59 | 00:47:58 |
| MAT-LSTM | 02:15:14 | 02:18:04 | 02:17:05 | 02:16:48 | 00:55:54 | 00:56:07 | 00:56:01 |