
Spatio-Temporal Enhanced Contrastive and Contextual Learning for Weather Forecasting

Published: 06 February 2024

Abstract

Weather forecasting is of great importance for human life and many real-world fields, e.g., traffic prediction, agricultural production, and tourism. Existing methods fall roughly into two categories: theory-driven methods (e.g., numerical weather prediction (NWP)) and data-driven methods. Theory-driven methods require a complex simulation of the physical evolution of the atmosphere on supercomputers, whereas most data-driven methods learn the underlying laws from historical weather records via deep learning models. However, some data-driven methods simply treat all weather variables of the monitoring stations as a whole and fail to exploit the fine-grained correlations across different stations, while others rely on large neural networks with massive numbers of learnable parameters. To alleviate these defects, we propose a spatio-temporal contrastive self-supervision method and a generative contextual self-supervised technique that capture spatial and temporal dependencies at the station level and the variable level, respectively. Through these well-designed self-supervised tasks, uncomplicated networks gain a strong ability to capture latent representations of time-varying weather changes. Thereafter, an effective encoder-decoder fine-tuning framework consisting of three self-supervised encoders is proposed. Extensive experiments on four real-world weather condition datasets demonstrate that our method outperforms state-of-the-art models and empirically validate the feasibility of each self-supervised task.
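The abstract does not spell out the paper's exact contrastive objective, but the station-level contrastive task can be sketched with a generic InfoNCE-style loss: embeddings of two views of the same station form positive pairs, and all other stations in the batch act as negatives. The function name, embedding shapes, and temperature value below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over station embeddings (illustrative).

    anchors, positives: (n_stations, d) arrays; row i of `positives` is an
    augmented view of station i, and every other row serves as a negative.
    """
    # L2-normalize so dot products become cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # matching pairs sit on the diagonal; maximize their log-probability
    return float(-np.mean(np.diag(log_softmax)))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                      # 8 stations, 16-d embeddings
loss_aligned = info_nce(z, z + 0.01 * rng.normal(size=z.shape))
loss_random = info_nce(z, rng.normal(size=z.shape))
```

Under this objective, nearly identical views of the same station yield a much lower loss than unrelated embeddings, which is the signal the self-supervised encoder trains on; the variable-level generative task described in the abstract would instead mask variables and reconstruct them.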


Cited By

  • Spatio-Temporal Predictive Modeling Techniques for Different Domains: A Survey, ACM Computing Surveys, vol. 57, no. 2, pp. 1–42, Sep. 2024, doi: 10.1145/3696661
  • Spatio-temporal Graph Normalizing Flow for Probabilistic Traffic Prediction, in Proc. 33rd ACM Int. Conf. Inf. Knowl. Manage., 2024, pp. 45–55, doi: 10.1145/3627673.3679705


Published In

IEEE Transactions on Knowledge and Data Engineering, Volume 36, Issue 8, Aug. 2024, 711 pages

Publisher

IEEE Educational Activities Department, United States


Qualifiers

  • Research-article

