DeepDR: A Two-Level Deep Defect Recognition Framework for Meteorological Satellite Images
Figure 1. Examples of satellite cloud images with noise points and noise lines; noise points are marked with red boxes. (a,b) show satellite cloud images with noise points, while (c,d) show images with noise lines.
Figure 2. Illustration of the proposed DeepDR framework.
Figure 3. Illustration of the Transformer-based noise image classifier.
Figure 4. Illustration of the proposed pseudo-label-based noise region segmentation.
Figure 5. Example images from the noise point dataset.
Figure 6. Example images from the noise line dataset.
Figure 7. Selected results from the noise point experiment. (a,b) are remote sensing satellite images containing noise; (c,d) are normal remote sensing satellite images.
Figure 8. Selected results from the noise line experiment. (a,b) are remote sensing satellite images containing noise lines; (c,d) are normal remote sensing satellite images.
Figure 9. Precision of all methods on normal and noise point images.
Figure 10. Recall of all methods on normal and noise point images.
Figure 11. F1 score of all methods on normal and noise point images.
Figure 12. Precision of all methods on normal and noise line images.
Figure 13. Recall of all methods on normal and noise line images.
Figure 14. F1 score of all methods on normal and noise line images.
Figure 15. Visualization results of image segmentation methods on meteorological satellite images containing noise points; noise points are marked with red boxes.
Figure 16. Visualization results of image segmentation methods on meteorological satellite images containing noise lines.
Abstract
1. Introduction
- We propose a novel two-level deep defect recognition framework for meteorological satellite images. To our knowledge, this problem has not been explored previously through deep learning methods.
- We develop a Transformer-based noise image classification method to identify whether a meteorological satellite image contains noise points or noise lines. We also construct and release two datasets for noise image classification to evaluate the proposed method.
- We design a pseudo-label-based training strategy for image segmentation models that detect regions containing noise points or noise lines, without requiring manually annotated noise masks (a minimal sketch of the idea follows this list).
- Comprehensive experiments have been conducted to evaluate the proposed noise image classification method and the training strategy of noise region segmentation. The results demonstrate that our method outperforms state-of-the-art methods in addressing the noise image classification problem, and our training strategy can effectively construct image segmentation models to detect real noise regions.
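To make the pseudo-label idea concrete, here is a minimal sketch under one plausible reading (ours, not necessarily the authors' exact procedure): noise of a known shape is injected into clean patches, so the injection mask serves as a free segmentation label. All function and parameter names are hypothetical.

```python
import numpy as np

def make_pseudo_labeled_sample(clean_patch, num_points=20, rng=None):
    """Inject synthetic noise points into a clean patch; the injection
    mask doubles as a pseudo segmentation label for training."""
    rng = rng or np.random.default_rng()
    noisy = clean_patch.copy()
    mask = np.zeros(clean_patch.shape[:2], dtype=np.uint8)
    h, w = mask.shape
    ys = rng.integers(0, h, size=num_points)
    xs = rng.integers(0, w, size=num_points)
    noisy[ys, xs] = 255   # salt-like impulse noise at random pixels
    mask[ys, xs] = 1      # pseudo ground truth for the segmentation model
    return noisy, mask
```

Pairs of (noisy, mask) produced this way can then be fed to any off-the-shelf segmentation model in place of hand-labeled masks.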
2. Related Work
2.1. Noise Image Classification
2.2. Image Segmentation
3. Deep Defect Recognition
3.1. Problem Definition
3.2. Overall Framework
Algorithm 1 DeepDR: Two-Level Deep Defect Recognition Framework
Data: Original satellite image S
Result: Defect detection results (normalPatches, pointMasks, lineMasks)
1: patches ← CropImage(S, patchSize)
2: noisePointPatches ← { }
3: noiseLinePatches ← { }
4: normalPatches ← { }
5: for each patch Xi in patches do
6:     isNoisePoint ← TransformerClassifierPoint(Xi)
7:     if isNoisePoint then
8:         noisePointPatches.append(Xi)
9:         continue
10:    else
11:        isNoiseLine ← TransformerClassifierLine(Xi)
12:        if isNoiseLine then
13:            noiseLinePatches.append(Xi)
14:            continue
15:        else
16:            normalPatches.append(Xi)
17:        end if
18:    end if
19: end for
20: pointMasks ← { }
21: lineMasks ← { }
22: for each patch in noisePointPatches do
23:     pointMask ← NoisePointSegmentation(patch)
24:     pointMasks.append(pointMask)
25: end for
26: for each patch in noiseLinePatches do
27:     lineMask ← NoiseLineSegmentation(patch)
28:     lineMasks.append(lineMask)
29: end for
30: return normalPatches, pointMasks, lineMasks
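The control flow of Algorithm 1 maps directly onto a few lines of code. Below is a minimal Python sketch of the two-level pipeline; `crop_image` and the four model callables are hypothetical stand-ins for the trained classifiers and segmenters, not the authors' released API.

```python
def crop_image(image, patch_size):
    """Tile an H x W (x C) array into non-overlapping square patches."""
    h, w = image.shape[:2]
    return [image[y:y + patch_size, x:x + patch_size]
            for y in range(0, h - patch_size + 1, patch_size)
            for x in range(0, w - patch_size + 1, patch_size)]

def deep_dr(image, patch_size, classify_point, classify_line,
            segment_points, segment_lines):
    """Level 1: route each patch by the two classifier verdicts.
    Level 2: segment only the patches flagged as noisy."""
    normal, point_patches, line_patches = [], [], []
    for patch in crop_image(image, patch_size):
        if classify_point(patch):        # noise points present?
            point_patches.append(patch)
        elif classify_line(patch):       # otherwise, noise lines present?
            line_patches.append(patch)
        else:
            normal.append(patch)
    point_masks = [segment_points(p) for p in point_patches]
    line_masks = [segment_lines(p) for p in line_patches]
    return normal, point_masks, line_masks
```

The two-level design means the (relatively expensive) segmentation models run only on the subset of patches the classifiers flag, rather than on every patch of the full satellite image.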
3.3. Transformer-Based Noise Image Classification
3.4. Pseudo-Label-Based Noise Region Segmentation
4. Experiments
4.1. Dataset
4.2. Experimental Setup
4.2.1. Implementation Details
4.2.2. Evaluation Protocol
- Accuracy measures the overall effectiveness of a classification model: the proportion of correctly predicted instances among all instances. A high accuracy indicates correct predictions across all classes, while a low accuracy indicates frequent misclassification. It is defined as
  $$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$
  where TP, TN, FP, and FN denote True Positives, True Negatives, False Positives, and False Negatives, respectively.
- Precision measures the accuracy of positive predictions, i.e., the model's ability to correctly identify instances of the positive class. High precision means a positive prediction is likely to be correct, minimizing false positives:
  $$\mathrm{Precision} = \frac{TP}{TP + FP}.$$
  Precision is particularly important when false positives carry significant consequences.
- Recall measures how completely the model captures all relevant instances of a class, emphasizing its ability to avoid missing positives:
  $$\mathrm{Recall} = \frac{TP}{TP + FN}.$$
  Recall is crucial when the cost of missing positive instances (false negatives) is high, and a high recall indicates a model sensitive to the presence of positives.
- F1 Score balances precision and recall as their harmonic mean, which is especially useful under uneven class distributions or when trading off false positives against false negatives:
  $$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$
- mIoU (mean Intersection over Union) is a widely used metric for semantic segmentation. It assesses pixel-wise classification accuracy by measuring the overlap between the predicted and ground-truth masks of each class:
  $$\mathrm{IoU} = \frac{TP}{TP + FP + FN},$$
  where TP, FP, and FN count true positive, false positive, and false negative pixels, respectively. mIoU averages IoU over all N classes:
  $$\mathrm{mIoU} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{IoU}_i.$$
  A higher mIoU indicates that the model delineates objects or regions within an image more accurately. (A code sketch of all five metrics follows this list.)
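For concreteness, all five metrics reduce to simple arithmetic over confusion counts; a minimal sketch (ours, not the paper's evaluation code):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

def mean_iou(per_class_counts):
    """mIoU from a list of per-class (tp, fp, fn) pixel counts."""
    ious = [tp / (tp + fp + fn) for tp, fp, fn in per_class_counts]
    return sum(ious) / len(ious)
```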
4.3. Experimental Results
4.4. Ablation Study
4.5. Performance on Different Subclasses
4.6. Comparison Results of Noise Region Segmentations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Kim, M.; Song, H.; Kim, Y. Direct Short-Term Forecast of Photovoltaic Power through a Comparative Study between COMS and Himawari-8 Meteorological Satellite Images in a Deep Neural Network. Remote Sens. 2020, 12, 2357.
2. Vyas, S.S.; Bhattacharya, B.K. Agricultural drought early warning from geostationary meteorological satellites: Concept and demonstration over semi-arid tract in India. Environ. Monit. Assess. 2020, 192, 1–15.
3. Wang, Z.; Zhang, D. Progressive switching median filter for the removal of impulse noise from highly corrupted images. IEEE Trans. Circuits Syst. II Analog Digit. Signal Process. 1999, 46, 78–80.
4. Chen, T.; Wu, H.R. Adaptive impulse detection using center-weighted median filters. IEEE Signal Process. Lett. 2001, 8, 1–3.
5. Chen, T.; Ma, K.K.; Chen, L.H. Tri-state median filter for image denoising. IEEE Trans. Image Process. 1999, 8, 1834–1838.
6. Ferdous, H.; Siraj, T.; Setu, S.J.; Anwar, M.M.; Rahman, M.A. Machine learning approach towards satellite image classification. In Proceedings of the International Conference on Trends in Computational and Cognitive Engineering (TCCE 2020); Springer: Berlin/Heidelberg, Germany, 2021; pp. 627–637.
7. Valero Medina, J.A.; Alzate Atehortúa, B.E. Comparison of maximum likelihood, support vector machines, and random forest techniques in satellite images classification. Tecnura 2019, 23, 13–26.
8. Kulkarni, S.; Kelkar, V. Classification of multispectral satellite images using ensemble techniques of bagging, boosting and AdaBoost. In Proceedings of the 2014 International Conference on Circuits, Systems, Communication and Information Technology Applications (CSCITA), Mumbai, India, 4–5 April 2014; IEEE: New York, NY, USA, 2014; pp. 253–258.
9. Unnikrishnan, A.; Sowmya, V.; Soman, K. Deep AlexNet with reduced number of trainable parameters for satellite image classification. Procedia Comput. Sci. 2018, 143, 931–938.
10. Pritt, M.; Chern, G. Satellite image classification with deep learning. In Proceedings of the 2017 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), Washington, DC, USA, 10–17 October 2017; IEEE: New York, NY, USA, 2017; pp. 1–7.
11. Wang, D.; Zhuang, L.; Gao, L.; Sun, X.; Zhao, X.; Plaza, A. Sliding dual-window-inspired reconstruction network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–15.
12. Wang, D.; Zhuang, L.; Gao, L.; Sun, X.; Huang, M.; Plaza, A. BockNet: Blind-block reconstruction network with a guard window for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16.
13. Li, Y.; Jiang, T.; Xie, W.; Lei, J.; Du, Q. Sparse coding-inspired GAN for hyperspectral anomaly detection in weakly supervised learning. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–11.
14. Cheng, X.; Zhang, M.; Lin, S.; Li, Y.; Wang, H. Deep self-representation learning framework for hyperspectral anomaly detection. IEEE Trans. Instrum. Meas. 2024, 73, 1–16.
15. Zhuang, L.; Ng, M.K.; Gao, L.; Wang, Z. Eigen-CNN: Eigenimages plus eigennoise level maps guided network for hyperspectral image denoising. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–18.
16. Wang, M.; Gao, L.; Ren, L.; Sun, X.; Chanussot, J. Hyperspectral simultaneous anomaly detection and denoising: Insights from integrative perspective. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 13966–13980.
17. Li, X.; Ding, M.; Gu, Y.; Pižurica, A. An end-to-end framework for joint denoising and classification of hyperspectral images. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3269–3283.
18. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, 5–9 October 2015; Part III; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
19. Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A nested U-Net architecture for medical image segmentation. In Proceedings of the 4th International Workshop on Deep Learning in Medical Image Analysis (DLMIA 2018) and 8th International Workshop on Multimodal Learning for Clinical Decision Support (ML-CDS 2018), Held in Conjunction with MICCAI 2018, Granada, Spain, 20 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
20. Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587.
21. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
22. Li, R.; Duan, C.; Zheng, S.; Zhang, C.; Atkinson, P.M. MACU-Net for semantic segmentation of fine-resolution remotely sensed images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
23. Li, R.; Wang, L.; Zhang, C.; Duan, C.; Zheng, S. A2-FPN for semantic segmentation of fine-resolution remotely sensed images. Int. J. Remote Sens. 2022, 43, 1131–1155.
24. Ma, Y.; Wang, Y.; Liu, X.; Wang, H. SWINT-RESNet: An improved remote sensing image segmentation model based on Transformer. IEEE Geosci. Remote Sens. Lett. 2024, 21, 8003005.
25. Li, J.; Cheng, S. AFENet: An attention-focused feature enhancement network for the efficient semantic segmentation of remote sensing images. Remote Sens. 2024, 16, 4392.
26. Kang, X.; Hong, Y.; Duan, P.; Li, S. Fusion of hierarchical class graphs for remote sensing semantic segmentation. Inf. Fusion 2024, 109, 102409.
27. Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the International Conference on Machine Learning (ICML), PMLR, Online, 18–24 July 2021; pp. 10347–10357.
28. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
29. Hosmer, D.W., Jr.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression; John Wiley & Sons: Hoboken, NJ, USA, 2013; Volume 398.
30. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
31. Zhang, H. The optimality of naive Bayes. In Proceedings of the Seventeenth International Florida Artificial Intelligence Research Society Conference (FLAIRS), Miami Beach, FL, USA, 2004.
32. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
33. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
34. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
35. Freund, Y.; Schapire, R.; Abe, N. A short introduction to boosting. J. Jpn. Soc. Artif. Intell. 1999, 14, 771–780.
36. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28.
37. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1–9.
38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
39. Cao, L. A MobileNetV2 model of transfer learning is employed for remote sensing image classification. Adv. Eng. Technol. Res. 2024, 10, 596.
40. Xie, M.; Tang, Q.; Yang, K.; Ma, Y.; Zhao, S.; Feng, X.; Hao, W. Image classification based on improved VGG network. In Proceedings of the Fifth International Conference on Computer Vision and Data Mining (ICCVDM), Changchun, China, 19–21 July 2024; Volume 13272, pp. 352–356.
41. Dosovitskiy, A.; et al. An image is worth 16×16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
Classification results on the noise point dataset.

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| K-Nearest Neighbors [30] | 84.00 | 98.99 | 68.70 | 81.11 |
| Naive Bayes [31] | 84.80 | 87.83 | 80.80 | 84.17 |
| Decision Tree [32] | 83.15 | 95.47 | 69.60 | 80.51 |
| Random Forest [33] | 82.55 | 98.37 | 66.20 | 79.14 |
| AdaBoost [35] | 88.55 | 97.30 | 79.30 | 87.38 |
| Support Vector Machine [36] | 90.95 | 99.88 | 82.00 | 90.06 |
| Logistic Regression [29] | 93.70 | 98.88 | 88.40 | 93.35 |
| Multilayer Perceptron [34] | 94.30 | 97.74 | 90.70 | 94.09 |
| AlexNet [37] | 97.00 | 98.16 | 95.80 | 96.96 |
| ResNet50 [38] | 98.95 | 99.59 | 98.30 | 98.94 |
| MobileNetV2-Adv [39] | 91.95 | 94.20 | 89.40 | 91.74 |
| VGG16-AdvNet [40] | 92.35 | 97.32 | 87.10 | 91.93 |
| Ours | 99.15 | 99.40 | 98.90 | 99.15 |
Classification results on the noise line dataset.

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| K-Nearest Neighbors [30] | 94.35 | 99.89 | 88.80 | 94.02 |
| Naive Bayes [31] | 90.65 | 96.89 | 84.00 | 89.98 |
| Decision Tree [32] | 94.95 | 99.13 | 90.70 | 94.73 |
| Random Forest [33] | 94.70 | 99.67 | 89.70 | 94.42 |
| AdaBoost [35] | 96.35 | 99.36 | 93.30 | 96.24 |
| Support Vector Machine [36] | 97.60 | 99.79 | 95.40 | 97.55 |
| Logistic Regression [29] | 98.35 | 99.69 | 97.00 | 98.33 |
| Multilayer Perceptron [34] | 98.25 | 99.49 | 97.00 | 98.23 |
| AlexNet [37] | 98.65 | 99.80 | 97.50 | 98.63 |
| ResNet50 [38] | 99.05 | 99.40 | 98.70 | 99.05 |
| MobileNetV2-Adv [39] | 95.20 | 99.45 | 90.90 | 94.98 |
| VGG16-AdvNet [40] | 98.10 | 99.90 | 96.30 | 98.07 |
| Ours | 99.25 | 99.40 | 99.10 | 99.25 |
Comparison of ViT and DeiT backbones on the noise point and noise line datasets.

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| ViT (points) | 98.10 | 98.59 | 97.60 | 98.10 |
| DeiT (points) | 99.15 | 99.40 | 98.90 | 99.15 |
| ViT (lines) | 99.00 | 99.60 | 98.40 | 99.00 |
| DeiT (lines) | 99.25 | 99.40 | 99.10 | 99.25 |
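The backbone comparison above uses standard architectures; for instance, a DeiT classifier of the kind compared here can be instantiated with the timm library. The model name, input size, and binary head below are illustrative assumptions, not the paper's exact configuration.

```python
import timm
import torch

# Binary noise/normal classifier on top of a pretrained DeiT backbone.
model = timm.create_model("deit_base_patch16_224", pretrained=True, num_classes=2)
model.eval()

patch = torch.randn(1, 3, 224, 224)     # one preprocessed satellite patch
with torch.no_grad():
    logits = model(patch)
is_noisy = logits.argmax(dim=1).item() == 1
```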
Efficiency comparison of the classification models (inference time in seconds).

| Method | Flops | Parameters | Noise Points (s) | Noise Lines (s) |
|---|---|---|---|---|
| AlexNet [37] | | | 0.264 | 0.245 |
| ResNet50 [38] | | | 1.873 | 1.970 |
| MobileNetV2-Adv [39] | | | 1.375 | 1.356 |
| VGG16-AdvNet [40] | | | 0.505 | 0.514 |
| Ours | | | 0.501 | 0.505 |
Ablation of the self-attention mechanism on the noise point dataset.

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| Without self-attention | 94.30 | 97.74 | 90.70 | 94.09 |
| With self-attention | 99.15 | 99.40 | 98.90 | 99.15 |

Ablation of the self-attention mechanism on the noise line dataset.

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| Without self-attention | 98.25 | 99.49 | 97.00 | 98.23 |
| With self-attention | 99.25 | 99.40 | 99.10 | 99.25 |
mIoU (%) of the compared segmentation models on the noise point and noise line data.

| Type | Unet | Unet++ | DeepLabV3 | DeepLabV3+ | A2-FPN | MACU-Net |
|---|---|---|---|---|---|---|
| Noise Points | 58.50 | 58.12 | 52.24 | 57.05 | 58.06 | 57.09 |
| Noise Lines | 69.80 | 72.14 | 66.42 | 71.47 | 71.48 | 71.99 |
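The compared baselines are all standard architectures; U-Net, U-Net++, DeepLabV3, and DeepLabV3+ have reference implementations in the segmentation_models_pytorch package, as sketched below. The encoder choice and single-channel noise mask output are our assumptions, not the paper's reported setup.

```python
import segmentation_models_pytorch as smp
import torch

# One binary output channel: noise region vs. background.
model = smp.Unet(encoder_name="resnet34", encoder_weights="imagenet",
                 in_channels=3, classes=1)
model.eval()

patch = torch.randn(1, 3, 256, 256)      # one satellite image patch
with torch.no_grad():
    mask_logits = model(patch)            # shape (1, 1, 256, 256)
noise_mask = mask_logits.sigmoid() > 0.5  # thresholded binary noise mask
```

Swapping `smp.Unet` for `smp.UnetPlusPlus`, `smp.DeepLabV3`, or `smp.DeepLabV3Plus` reproduces the other baselines with the same interface.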
Efficiency comparison of the segmentation models (inference time in seconds).

| Metric | Unet | Unet++ | DeepLabV3 | DeepLabV3+ | A2-FPN | MACU-Net |
|---|---|---|---|---|---|---|
| Flops | | | | | | |
| Parameters | | | | | | |
| Points (s) | 8.66 | 6.39 | 7.46 | 4.19 | 3.93 | 8.48 |
| Lines (s) | 9.25 | 6.28 | 7.39 | 4.04 | 3.84 | 8.40 |