Abstract
The rapid progress of industrial development, urbanization, and traffic has caused air quality degradation that negatively affects human health and environmental sustainability, especially in developed countries. However, because the number of available sensors is limited, the air quality index is not monitored at many locations. Many studies, using both statistical and machine learning approaches, have therefore been proposed to estimate the air quality value at an arbitrary location. Most existing work performs interpolation with traditional techniques that leverage distance information. In this work, we propose a novel deep-learning-based model for air quality value estimation. The approach follows the encoder–decoder paradigm, with the encoder and decoder trained separately using different training mechanisms. For the encoder, we propose a new self-supervised graph representation learning approach for spatio-temporal data. For the decoder, we design a deep interpolation layer that employs two attention mechanisms and a fully connected layer; it uses air quality data at known stations, distance information, and meteorological information at the target point to predict air quality at arbitrary locations. The experimental results demonstrate that our proposed model significantly improves estimation accuracy over state-of-the-art approaches. In terms of MAE, our model improves estimation accuracy by 4.93% to 34.88% on the UK dataset and by 6.89% to 31.94% on the Beijing dataset. In terms of RMSE, the average improvements of our method on the two datasets are 13.33% and 14.37%, respectively. The corresponding figures for MAPE are 36.05% and 13.25%, and for MDAPE 24.48% and 36.33%. Furthermore, the \(R^2\) score attained by our proposed model also improves considerably, with increases of 5.39% and 32.58% over the comparison benchmarks. Our source code and data are available at https://github.com/duclong1009/Unsupervised-Air-Quality-Estimation.
Data availability
The code and datasets generated and/or analyzed during the current study are available at https://github.com/duclong1009/Unsupervised-Air-Quality-Estimation
References
World Health Organization (WHO) (2016) Ambient air pollution: a global assessment of exposure and burden of disease
Tai AP, Mickley LJ, Jacob DJ (2010) Correlations between fine particulate matter (pm2.5) and meteorological variables in the United States: implications for the sensitivity of pm2.5 to climate change. Atmos Environ 44(32):3976–3984. https://doi.org/10.1016/j.atmosenv.2010.06.060
Kulmala M (2018) Build a global Earth observatory. Nature Publishing Group
Rahmati Aidinlou H, Nikbakht AM (2022) Fuzzy-based modeling of thermohydraulic aspect of solar air heater roughened with inclined broken roughness. Neural Comput Appl 34(3):2393–2412. https://doi.org/10.1007/s00521-021-06547-w
Liu X, Jayaratne R, Thai P, Kuhn T, Zing I, Christensen B, Lamont R, Dunbabin M, Zhu S, Gao J, Wainwright D, Neale D, Kan R, Kirkwood J, Morawska L (2020) Low-cost sensors as an alternative for long-term air quality monitoring. Environ Res 185:109438. https://doi.org/10.1016/j.envres.2020.109438
deSouza P, Anjomshoaa A, Duarte F, Kahn R, Kumar P, Ratti C (2020) Air quality monitoring using mobile low-cost sensors mounted on trash-trucks: methods development and lessons learned. Sustain Cities Soc 60:102239. https://doi.org/10.1016/j.scs.2020.102239
Motlagh NH, Lagerspetz E, Nurmi P, Li X, Varjonen S, Mineraud J, Siekkinen M, Rebeiro-Hargrave A, Hussein T, Petaja T, Kulmala M, Tarkoma S (2020) Toward massive scale air quality monitoring. IEEE Commun Mag 58(2):54–59. https://doi.org/10.1109/MCOM.001.1900515
Idrees Z, Zheng L (2020) Low cost air pollution monitoring systems: a review of protocols and enabling technologies. J Ind Inf Integr 17:100123. https://doi.org/10.1016/j.jii.2019.100123
Lin Y-C, Lee S-J, Ouyang C-S, Wu C-H (2020) Air quality prediction by neuro-fuzzy modeling approach. Appl Soft Comput 86:105898. https://doi.org/10.1016/j.asoc.2019.105898
Xiao X, Jin Z, Wang S, Xu J, Peng Z, Wang R, Shao W, Hui Y (2022) A dual-path dynamic directed graph convolutional network for air quality prediction. Sci Total Environ 827:154298. https://doi.org/10.1016/j.scitotenv.2022.154298
Wang J, Li J, Wang X, Wang J, Huang M (2021) Air quality prediction using CT-LSTM. Neural Comput Appl 33(10):4779–4792. https://doi.org/10.1007/s00521-020-05535-w
Wang J, Song G (2018) A deep spatial-temporal ensemble model for air quality prediction. Neurocomputing 314:198–206. https://doi.org/10.1016/j.neucom.2018.06.049
Han J, Liu H, Zhu H, Xiong H, Dou D (2021) Joint air quality and weather prediction based on multi-adversarial spatiotemporal networks. Proceed AAAI Conf Artif Intell 35:4081–4089. https://doi.org/10.1609/aaai.v35i5.16529
Chen P-C, Lin Y-T (2022) Exposure assessment of pm2.5 using smart spatial interpolation on regulatory air quality stations with clustering of densely-deployed microsensors. Environ Pollut 292:118401. https://doi.org/10.1016/j.envpol.2021.118401
Beauchamp M, Malherbe L, de Fouquet C, Létinois L, Tognet F (2018) A polynomial approximation of the traffic contributions for kriging-based interpolation of urban air quality model. Environ Modell Softw 105:132–152. https://doi.org/10.1016/j.envsoft.2018.03.033
Li J, Heap AD (2011) A review of comparative studies of spatial interpolation methods in environmental sciences: performance and impact factors. Eco Inform 6(3):228–241. https://doi.org/10.1016/j.ecoinf.2010.12.003
Noi E, Murray AT (2022) Interpolation biases in assessing spatial heterogeneity of outdoor air quality in Moscow, Russia. Land Use Policy 112:105783. https://doi.org/10.1016/j.landusepol.2021.105783
Xu C, Wang J, Hu M, Wang W (2022) A new method for interpolation of missing air quality data at monitor stations. Environ Int 169:107538. https://doi.org/10.1016/j.envint.2022.107538
Alimissis A, Philippopoulos K, Tzanis C, Deligiorgi D (2018) Spatial estimation of urban air pollution with the use of artificial neural network models. Atmos Environ 191:205–213. https://doi.org/10.1016/j.atmosenv.2018.07.058
Ma J, Ding Y, Cheng JC, Jiang F, Wan Z (2019) A temporal-spatial interpolation and extrapolation method based on geographic long short-term memory neural network for pm 2.5. J Clean Prod 237:117729. https://doi.org/10.1016/j.jclepro.2019.11772
Qi Z, Wang T, Song G, Hu W, Li X, Zhang Z (2018) Deep air learning: Interpolation, prediction, and feature analysis of fine-grained air quality. IEEE Trans Knowl Data Eng 30(12):2285–2297. https://doi.org/10.1109/TKDE.2018.2823740
Li L, Girguis M, Lurmann F, Pavlovic N, McClure C, Franklin M, Wu J, Oman LD, Breton C, Gilliland F, Habre R (2020) Ensemble-based deep learning for estimating pm2.5 over California with multisource big data including wildfire smoke. Environ Int 145:106143. https://doi.org/10.1016/j.envint.2020.106143
Rijal N, Gutta RT, Cao T, Lin J, Bo Q, Zhang J (2018) Ensemble of deep neural networks for estimating particulate matter from images. In: 2018 IEEE 3rd International conference on image, vision and computing (ICIVC), pp 733–738. https://doi.org/10.1109/ICIVC.2018.8492790
Dixit E, Jindal V (2022) Ieesep: an intelligent energy efficient stable election routing protocol in air pollution monitoring WSNS. Neural Comput Appl. https://doi.org/10.1007/s00521-022-07027-5
Ari D, Alagoz BB (2022) An effective integrated genetic programming and neural network model for electronic nose calibration of air pollution monitoring application. Neural Comput Appl. https://doi.org/10.1007/s00521-022-07129-0
Al-Janabi S, Alkaim A, Al-Janabi E, Aljeboree A, Mustafa M (2021) Intelligent forecaster of concentrations (pm2.5, pm10, no2, co, o3, so2) caused air pollution (IFCSAP). Neural Comput Appl 33(21):14199–14229. https://doi.org/10.1007/s00521-021-06067-7
Wardana I, Gardner JW, Fahmy SA (2022) Estimation of missing air pollutant data using a spatiotemporal convolutional autoencoder. Neural Comput Appl. https://doi.org/10.1007/s00521-022-07224-2
Liang Y, Ke S, Zhang J, Yi X, Zheng Y (2018) Geoman: Multi-level attention networks for geo-sensory time series prediction. In: Proceedings of the twenty-seventh international joint conference on artificial intelligence, IJCAI-18, pp 3428–3434. https://doi.org/10.24963/ijcai.2018/476
Zhao J, Deng F, Cai Y, Chen J (2018) Long short-term memory–fully connected (LSTM-FC) neural network for pm2.5 concentration prediction. Chemosphere. https://doi.org/10.1016/j.chemosphere.2018.12.128
Qi Y, Li Q, Karimian H, Liu D (2019) A hybrid model for spatiotemporal forecasting of pm2.5 based on graph convolutional neural network and long short-term memory. Sci Total Environ. https://doi.org/10.1016/j.scitotenv.2019.01.333
Ma J, Ding Y, Gan VJL, Lin C, Wan Z (2019) Spatiotemporal prediction of pm2.5 concentrations at different time granularities using IDW-BLSTM. IEEE Access 7:107897–107907
Guo C, Liu G, Lyu L, Chen CH (2020) An unsupervised pm2.5 estimation method with different Spatio-temporal resolutions based on KIDW-TCGRU. IEEE Access 8:190263–190276. https://doi.org/10.1109/ACCESS.2020.3032420
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International conference on learning representations. ICLR ’17. https://doi.org/10.48550/ARXIV.1609.02907
Liu Y, Jin M, Pan S, Zhou C, Zheng Y, Xia F, Yu P (2022) Graph self-supervised learning: a survey. IEEE Transactions on knowledge and data engineering abs/2103.00111, 1–1. https://doi.org/10.1109/TKDE.2022.3172903
Kipf TN, Welling M (2016) Variational graph auto-encoders. CoRR abs/1611.07308. https://doi.org/10.48550/ARXIV.1611.07308
Wang C, Pan S, Long G, Zhu X, Jiang J (2017) Mgae: marginalized graph autoencoder for graph clustering. In: Proceedings of the 2017 ACM on conference on information and knowledge management. CIKM ’17, pp. 889–898. https://doi.org/10.1145/3132847.3132967
Jin W, Derr T, Liu H, Wang Y, Wang S, Liu Z, Tang J (2020) Self-supervised learning on graphs: deep insights and new direction. CoRR abs/2006.10141. https://doi.org/10.48550/ARXIV.2006.10141
Hu Z, Fan C, Chen T, Chang K-W, Sun Y (2019) Pre-training graph neural networks for generic structural feature extraction. In: ICLR 2019 Workshop: representation learning on graphs and manifolds. https://doi.org/10.48550/ARXIV.1905.13728
Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: Online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International conference on knowledge discovery and data mining. KDD ’14, pp 701–710. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2623330.2623732
Grover A, Leskovec J (2016) Node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International conference on knowledge discovery and data mining. KDD ’16, pp 855–864. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2939672.2939754
Zhu Y, Xu Y, Yu F, Liu Q, Wu S, Wang L (2020) Deep Graph Contrastive Representation Learning. In: ICML Workshop on Graph Representation Learning and Beyond. https://doi.org/10.48550/ARXIV.2006.04131
Hamilton WL, Ying R, Leskovec J (2017) Inductive representation learning on large graphs. NIPS’17, pp 1025–1035. Curran Associates Inc., Red Hook, NY, USA. https://doi.org/10.48550/ARXIV.1706.02216
Velickovic P, Fedus W, Hamilton WL, Liò P, Bengio Y, Hjelm RD (2019) Deep graph infomax. In: 7th International conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. https://doi.org/10.48550/ARXIV.1809.10341
Opolka FL, Solomon A, Cangea C, Velickovic P, Liò P, Hjelm RD (2019) Spatio-temporal deep graph infomax. ICLR 2019 abs/1904.06316. https://doi.org/10.48550/ARXIV.1904.06316
Winarno E, Hadikurniawati W, Rosso RN (2017) Location based service for presence system using haversine method. In: 2017 International conference on innovative and creative information technology (ICITech), pp 1–4. https://doi.org/10.1109/INNOCIT.2017.8319153. IEEE
Copernicus: ERA5 Hourly Data on Single Levels from 1959 to Present. https://doi.org/10.24381/cds.adbb2d47. https://cds.climate.copernicus.eu/cdsapp/#!/dataset/reanalysis-era5-single-levels. Accessed 2019-09-30
Li S, Xie G, Ren J, Guo L, Yang Y, Xu X (2020) Urban pm2.5 concentration prediction via attention-based CNN-LSTM. Appl Sci. https://doi.org/10.3390/app10061953
Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. In: NIPS 2014 Workshop on deep learning, December. https://doi.org/10.48550/ARXIV.1412.3555
Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240. https://doi.org/10.2307/143141
Cichowicz R, Wielgosinski G, Fetter W (2020) Effect of wind speed on the level of particulate matter pm10 concentration in atmospheric air during winter season in vicinity of large combustion plant. J Atmos Chem 77:1–14. https://doi.org/10.1007/s10874-020-09401-w
Zhao L, Song Y, Zhang C, Liu Y, Wang P, Lin T, Deng M, Li H (2020) T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans Intell Transp Syst 21(9):3848–3858. https://doi.org/10.1109/tits.2019.2935152
Kingma DP, Ba J (2015) Adam: A method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) 3rd International conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings. https://doi.org/10.48550/ARXIV.1412.6980
Reani M, Lowe D, Gledson A, Topping D, Jay C (2022) UK daily meteorology, air quality, and pollen measurements for 2016–2019, with estimates for missing data. Sci Data 9(1):43. https://doi.org/10.1038/s41597-022-01135-6
Wang H. Air pollution and meteorological data in Beijing 2017–2018. https://doi.org/10.7910/DVN/USXCAK
Colchado LE, Villanueva E, Ochoa-Luna J (2021) A neural network architecture with an attention-based layer for spatial prediction of fine particulate matter. In: 2021 IEEE 8th International conference on data science and advanced analytics (DSAA), pp 1–10. https://doi.org/10.1109/DSAA53316.2021.9564200
Chen Y, Zang L, Du W, Xu D, Shen G, Zhang Q, Zou Q, Chen J, Zhao M, Yao D (2018) Ambient air pollution of particles and gas pollutants, and the predicted health risks from long-term exposure to pm2.5 in Zhejiang province, China. Environ Sci Pollut Res 25(24):23833–23844. https://doi.org/10.1007/s11356-018-2420-5
Chen Z, Xie X, Cai J, Chen D, Gao B, He B, Cheng N, Xu B (2018) Understanding meteorological influences on pm\(_{2.5}\) concentrations across china: a temporal and spatial perspective. Atmos Chem Phys 18(8):5343–5358
Wang J, Ogawa S (2015) Effects of meteorological conditions on pm2.5 concentrations in Nagasaki, Japan. Int J Environ Res Public Health 12:9089–101. https://doi.org/10.3390/ijerph120809089
Mi K, Zhuang R, Zhang Z, Gao J, Pei Q (2019) Spatiotemporal characteristics of pm2.5 and its associated gas pollutants, a case in china. Sustain Cities Soc 45:287–295. https://doi.org/10.1016/j.scs.2018.11.004
Li K, Bai K (2019) Spatiotemporal associations between pm2.5 and so2 as well as no2 in China from 2015 to 2018. Int J Environ Res Public Health 16(13):2352. https://doi.org/10.3390/ijerph16132352
Hart S (1989) Shapley value. In: Eatwell J, Milgate M, Newman P (eds), pp 210–216. Palgrave Macmillan UK, London. https://doi.org/10.1007/978-1-349-20181-5_25
Jia M, Zhao T, Cheng X, Gong S, Zhang X, Tang L, Liu D, Wu X, Wang L, Chen Y (2017) Inverse relations of pm2.5 and o3 in air compound pollution between cold and hot seasons over an urban area of east china. Atmosphere. https://doi.org/10.3390/atmos8030059
Fu H, Zhang Y, Liao C, Mao L, Wang Z, Hong N (2020) Investigating PM(2.5) responses to other air pollutants and meteorological factors across multiple temporal scales. Sci Rep 10(1):15639. https://doi.org/10.1038/s41598-020-72722-z
Acknowledgements
This work was funded by Vingroup Joint Stock Company (Vingroup JSC), Vingroup, and supported by the Vingroup Innovation Foundation (VINIF) under project code VINIF.2020.DA09. This research is partially funded by Hanoi University of Science and Technology (HUST) under grant number T2022-PC-049. Viet Hung Vu and Duc Long Nguyen were funded by Vingroup Joint Stock Company and supported by the Domestic Master/PhD Scholarship Programme of Vingroup Innovation Foundation (VINIF), Vingroup Big Data Institute (VINBIGDATA), under Grant VINIF.2022.Ths.BK.05 and VINIF.2022.Ths.BK.07, respectively.
Ethics declarations
Conflict of interest
All authors declare that they have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A Details of hyper-parameter settings
All our experiments are conducted on an NVIDIA GeForce RTX 2080 Ti graphics card with CUDA version 11.4. The approach is implemented with the deep-learning framework PyTorch version 3.8. In our implementation, we use a batch size of 32 and the Adam optimizer [52]. The self-supervised training of the embedding is carried out for 30 epochs with an initial learning rate of \(1 \times 10^{-3}\). The supervised models are also trained for 30 epochs with an initial learning rate of \(1 \times 10^{-3}\). We use early stopping with a patience of 10 epochs to select the best model weights.
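The settings above can be collected into a small configuration object, and the early-stopping rule can be sketched as follows. This is a minimal, framework-free illustration rather than the authors' implementation: the names `TrainConfig`, `EarlyStopping`, and `run_with_early_stopping` are hypothetical, and monitoring a validation loss is an assumption, since the appendix does not state which quantity early stopping tracks.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrainConfig:
    """Hyper-parameters reported in Appendix A (field names are hypothetical)."""
    batch_size: int = 32
    learning_rate: float = 1e-3
    max_epochs: int = 30
    patience: int = 10


class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch: Optional[int] = None
        self.bad_epochs = 0

    def step(self, epoch: int, val_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            # Improvement: remember this epoch (a real loop would also
            # checkpoint the model weights here) and reset the counter.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_with_early_stopping(
    val_losses: List[float], config: TrainConfig
) -> Tuple[Optional[int], float, int]:
    """Replay a sequence of per-epoch losses (assumed to be validation losses).

    Returns (best_epoch, best_loss, epochs_run).
    """
    stopper = EarlyStopping(config.patience)
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[: config.max_epochs]):
        epochs_run = epoch + 1
        if stopper.step(epoch, loss):
            break
    return stopper.best_epoch, stopper.best_loss, epochs_run
```

With a patience of 10 and a budget of 30 epochs, training stops early only if the best epoch occurs within the first 20 epochs; otherwise the full 30 epochs are run and the best checkpoint is kept.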
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Vu, V.H., Nguyen, D.L., Nguyen, T.H. et al. Self-supervised air quality estimation with graph neural network assistance and attention enhancement. Neural Comput & Applic 36, 11171–11193 (2024). https://doi.org/10.1007/s00521-024-09637-7
DOI: https://doi.org/10.1007/s00521-024-09637-7