research-article

From reanalysis to satellite observations: gap-filling with imbalanced learning

Published: 01 April 2022

Abstract

Increasing the spatial coverage and temporal resolution of Earth surface monitoring can significantly improve forecasting and monitoring capabilities in smart cities, for example in extreme weather forecasting, ecosystem monitoring, and anthropogenic impact monitoring. Satellite observations are an essential data source for Earth surface monitoring, yet most of them contain data gaps caused by factors such as the limitations of measuring equipment, environmental interference, and delayed or lost data updates. Although many efforts have been made over the last decade to fill these gaps, existing techniques cannot address the problem efficiently. In this paper, we extensively study the gap-filling problem of satellite observations using imbalanced learning. Specifically, we propose a framework called Reanalysis to Satellite (R2S) to simulate satellite observations with reanalysis data. Within the R2S framework, we propose a generic method called Spatial Temporal Match (STM), which matches reanalysis data with satellite observations to construct the Reanalysis-Satellite (R-S) dataset used to train the model. Based on the R-S dataset, we propose a novel method called Semi-imbalanced (SIMBA), which handles the imbalance problem of gap-filling by taking advantage of both traditional machine learning and imbalanced learning. We construct a hybrid model in the R2S framework for Soil Moisture Active Passive (SMAP) satellite observations of tropical cyclone wind speed. Extensive experiments demonstrate that the hybrid model outperforms the traditional machine learning model and closely approximates in situ observations.
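The abstract does not spell out how STM pairs the two data sources, so the following is only a minimal sketch of one plausible kind of spatiotemporal matching, not the authors' method. It assumes each sample is a row of (time in hours, latitude, longitude), and that a satellite sample is paired with the closest reanalysis sample lying within a time tolerance and a latitude/longitude tolerance. The function name `match_space_time`, the tolerances `max_hours` and `max_deg`, and the normalized cost are all hypothetical choices made for illustration.

```python
import numpy as np

def match_space_time(sat, rean, max_hours=1.5, max_deg=0.25):
    """Pair each satellite sample with its closest reanalysis sample.

    sat, rean: arrays of shape (n, 3) with columns (time_hours, lat, lon).
    Returns a list of (sat_index, rean_index) pairs; satellite samples
    with no reanalysis sample inside both tolerances are left unmatched.
    All tolerances here are illustrative assumptions, not published values.
    """
    pairs = []
    for i, (t, lat, lon) in enumerate(sat):
        dt = np.abs(rean[:, 0] - t)
        dlat = np.abs(rean[:, 1] - lat)
        # Wrap longitude differences so that 179.9 and -179.95 count as close.
        dlon = np.abs((rean[:, 2] - lon + 180.0) % 360.0 - 180.0)
        ok = (dt <= max_hours) & (dlat <= max_deg) & (dlon <= max_deg)
        if not ok.any():
            continue
        # Normalized cost: each term is at most 1 for admissible candidates.
        cost = dt / max_hours + dlat / max_deg + dlon / max_deg
        cost = np.where(ok, cost, np.inf)
        pairs.append((i, int(np.argmin(cost))))
    return pairs

# Toy data: the first satellite sample sits next to the date line,
# the second has no reanalysis sample close enough in time.
sat = np.array([[0.0, 10.0, 179.9],
                [5.0, -20.0, 30.0]])
rean = np.array([[0.5, 10.1, -179.95],
                 [0.0, 12.0, 179.9],
                 [10.0, -20.0, 30.0]])
print(match_space_time(sat, rean))
```

Matched pairs of this kind could then serve as rows of a training table in the spirit of the R-S dataset, with reanalysis variables as features and the satellite value as the target. Real reanalysis grids are usually regular, in which case index arithmetic or interpolation would likely replace the brute-force search used here.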

Published In

Geoinformatica, Volume 26, Issue 2
Apr 2022
147 pages

Publisher

Kluwer Academic Publishers
United States

Publication History

Published: 01 April 2022
Accepted: 24 September 2020
Revision received: 13 September 2020
Received: 10 August 2020

Author Tags

1. Imbalanced learning
2. Gap-filling
3. Satellite observations
4. Reanalysis data
5. Tropical cyclone
6. Smart city
