
Spatio-Temporal Enhanced Contrastive and Contextual Learning for Weather Forecasting

Published: 06 February 2024

Abstract

Weather forecasting is of great importance for human life and many real-world fields, e.g., traffic prediction, agricultural production, and tourism. Existing methods fall roughly into two categories: theory-driven methods (e.g., numerical weather prediction (NWP)) and data-driven methods. Theory-driven methods require a complex simulation of the physical evolution of the atmosphere on supercomputers, whereas most data-driven methods learn the underlying laws from historical weather records via deep learning models. However, some data-driven methods simply treat all weather variables of the monitoring stations as a whole and fail to exploit the fine-grained correlations across different stations, while others rely on large neural networks with massive numbers of learnable parameters. To alleviate these defects, we propose a spatio-temporal contrastive self-supervision method and a generative contextual self-supervised technique that capture spatial and temporal dependencies at the station level and the variable level, respectively. Through these well-designed self-supervised tasks, uncomplicated networks gain a strong ability to capture latent representations of time-varying weather changes. Thereafter, an effective encoder-decoder fine-tuning framework consisting of three self-supervised encoders is proposed. Extensive experiments on four real-world weather condition datasets demonstrate that our method outperforms state-of-the-art models and empirically validate the feasibility of each self-supervised task.
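The abstract does not spell out the paper's exact contrastive objective, but the station-level contrastive task can be sketched with a generic InfoNCE-style loss: embeddings of two views of the same station form positive pairs, and all other stations in the batch act as negatives. The function name, embedding shapes, and temperature value below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over station embeddings (illustrative).

    anchors, positives: (n_stations, d) arrays; row i of `positives` is an
    augmented view of station i, and every other row serves as a negative.
    """
    # L2-normalize so dot products become cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # matching pairs sit on the diagonal; maximize their log-probability
    return float(-np.mean(np.diag(log_softmax)))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                      # 8 stations, 16-d embeddings
loss_aligned = info_nce(z, z + 0.01 * rng.normal(size=z.shape))
loss_random = info_nce(z, rng.normal(size=z.shape))
```

Under this objective, nearly identical views of the same station yield a much lower loss than unrelated embeddings, which is the signal the self-supervised encoder trains on; the variable-level generative task described in the abstract would instead mask variables and reconstruct them.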


Cited By

  • Spatio-Temporal Predictive Modeling Techniques for Different Domains: A Survey, ACM Computing Surveys, vol. 57, no. 2, pp. 1–42, Sep. 2024, doi: 10.1145/3696661
  • Spatio-temporal Graph Normalizing Flow for Probabilistic Traffic Prediction, in Proc. 33rd ACM Int. Conf. Inf. Knowl. Manage., 2024, pp. 45–55, doi: 10.1145/3627673.3679705


Published In

IEEE Transactions on Knowledge and Data Engineering, Volume 36, Issue 8, Aug. 2024, 711 pages

Publisher

IEEE Educational Activities Department, United States


Qualifiers

  • Research-article

