ARSOD-YOLO: Enhancing Small Target Detection for Remote Sensing Images
Figure 1. Some examples of remote sensing images.
Figure 2. YOLOv8 network architecture.
Figure 3. ARSOD-YOLO network architecture.
Figure 4. The basic structure of AFEM. It consists of GhostModule and MLP as the basic components.
Figure 5. Structural diagrams of AKSFFM, C2f, and Bottleneck.
Figure 6. Images illustrating the different categories of the dataset [43].
Figure 7. Comparison of AI-TOD with other benchmark datasets [44].
Figure 8. Visualization of mAP effects of different modules.
Figure 9. PR curves for categories in the VEDAI dataset.
Figure 10. PR curves for categories in the AI-TOD dataset.
Figure 11. Visual comparison of object detection models: ARSOD-YOLO vs. YOLOv3, YOLOv5, and YOLOv10 on VEDAI dataset images.
Abstract
1. Introduction
2. Related Works
2.1. Small Target Detection Algorithm Based on Feature Enhancement
2.2. Small Target Detection Algorithm Based on Feature Fusion
2.3. Remote Sensing Small Target Detection Method Based on YOLO
3. Materials and Methods
3.1. Baseline Model
3.2. ARSOD-YOLO
3.3. Adaptive Feature Enhancement Module
3.4. Adaptive Multi-Convolutional Kernel Feature Fusion Module
3.5. Loss Function
4. Experiment
4.1. Related Indexes
4.2. Datasets
4.3. Ablation Experiments
4.4. Comparative Experiments on Loss Function
4.5. Comparative Experiments
4.6. Visual Experiment on VEDAI Dataset
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Li, Z.; Wang, Y.; Zhang, N.; Zhang, Y.; Zhao, Z.; Xu, D.; Ben, G.; Gao, Y. Deep learning-based object detection techniques for remote sensing images: A survey. Remote Sens. 2022, 14, 2385.
- Zhang, M.; Li, W.; Zhang, Y.; Tao, R.; Du, Q. Hyperspectral and LiDAR data classification based on structural optimization transmission. IEEE Trans. Cybern. 2022, 53, 3153–3164.
- Shi, T.; Gong, J.; Hu, J.; Zhi, X.; Zhang, W.; Zhang, Y.; Zhang, P.; Bao, G. Feature-enhanced CenterNet for small object detection in remote sensing images. Remote Sens. 2022, 14, 5488.
- Ran, Q.; Wang, Q.; Zhao, B.; Wu, Y.; Pu, S.; Li, Z. Lightweight oriented object detection using multiscale context and enhanced channel attention in remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2021, 14, 5786–5795.
- Ruan, H.; Qian, W.; Zheng, Z.; Peng, Y. A Decoupled Semantic–Detail Learning Network for Remote Sensing Object Detection in Complex Backgrounds. Electronics 2023, 12, 3201.
- Zhang, Y.; Li, W.; Sun, W.; Tao, R.; Du, Q. Single-source domain expansion network for cross-scene hyperspectral image classification. IEEE Trans. Image Process. 2023, 32, 1498–1512.
- Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276.
- Zheng, Z.; Zhong, Y.; Wang, J.; Ma, A. Foreground-aware relation network for geospatial object segmentation in high spatial resolution remote sensing imagery. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 4096–4105.
- Pang, J.; Li, C.; Shi, J.; Xu, Z.; Feng, H. R2-CNN: Fast tiny object detection in large-scale remote sensing images. arXiv 2019, arXiv:1902.06042.
- Zhang, W.; Wang, S.; Thachan, S.; Chen, J.; Qian, Y. Deconv R-CNN for small object detection on remote sensing images. In Proceedings of the IGARSS 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 2483–2486.
- Aggarwal, A.; Mittal, M.; Battineni, G. Generative adversarial network: An overview of theory and applications. Int. J. Inf. Manag. Data Insights 2021, 1, 100004.
- Courtrai, L.; Pham, M.T.; Lefèvre, S. Small object detection in remote sensing images based on super-resolution with auxiliary generative adversarial networks. Remote Sens. 2020, 12, 3152.
- Bai, Y.; Zhang, Y.; Ding, M.; Ghanem, B. Sod-mtgan: Small object detection via multi-task generative adversarial network. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 206–221.
- Wu, J.; Xu, S. From point to region: Accurate and efficient hierarchical small object detection in low-resolution remote sensing images. Remote Sens. 2021, 13, 2620.
- Ma, T.; Yang, Z.; Wang, J.; Sun, S.; Ren, X.; Ahmad, U. Infrared small target detection network with generate label and feature mapping. IEEE Geosci. Remote. Sens. Lett. 2022, 19, 1–5.
- Zhao, Y.; Zhao, J.; Zhao, C.; Xiong, W.; Li, Q.; Yang, J. Robust real-time object detection based on deep learning for very high resolution remote sensing images. In Proceedings of the IGARSS 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 14 November 2019; pp. 1314–1317.
- Bell, S.; Zitnick, C.L.; Bala, K.; Girshick, R. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2874–2883.
- Yang, X.; Yang, J.; Yan, J.; Zhang, Y.; Zhang, T.; Guo, Z.; Sun, X.; Fu, K. Scrdet: Towards more robust detection for small, cluttered and rotated objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8232–8241.
- Tang, X.; Du, D.K.; He, Z.; Liu, J. Pyramidbox: A context-assisted single shot face detector. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 797–813.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
- Liu, S.; Huang, D.; Wang, Y. Learning spatial fusion for single-shot object detection. arXiv 2019, arXiv:1911.09516.
- Ghiasi, G.; Lin, T.Y.; Le, Q.V. Nas-fpn: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7036–7045.
- Zhang, S.; Wen, L.; Bian, X.; Lei, Z.; Li, S.Z. Single-shot refinement neural network for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4203–4212.
- Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
- Lim, J.S.; Astrid, M.; Yoon, H.J.; Lee, S.I. Small object detection using context and attention. In Proceedings of the IEEE 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Jeju Island, Republic of Korea, 13–16 April 2021; pp. 181–186.
- Liu, Z.; Gao, G.; Sun, L.; Fang, Z. HRDNet: High-resolution Detection Network for Small Objects. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China, 5–9 July 2021.
- Nie, H.; Pang, H.; Ma, M.; Zheng, R. A Lightweight Remote Sensing Small Target Image Detection Algorithm Based on Improved YOLOv8. Sensors 2024, 24, 2952.
- Zhao, D.; Shao, F.; Liu, Q.; Yang, L.; Zhang, H.; Zhang, Z. A Small Object Detection Method for Drone-Captured Images Based on Improved YOLOv7. Remote Sens. 2024, 16, 1002.
- Wang, P.; Sun, X.; Diao, W.; Fu, K. FMSSD: Feature-merged single-shot detection for multiscale objects in large-scale remote sensing imagery. IEEE Trans. Geosci. Remote. Sens. 2019, 58, 3377–3390.
- Liang, X.; Zhang, J.; Zhuo, L.; Li, Y.; Tian, Q. Small object detection in unmanned aerial vehicle images using feature fusion and scaling-based single shot detector with spatial context analysis. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 1758–1770.
- Bai, Y.; Li, R.; Gou, S.; Zhang, C.; Chen, Y.; Zheng, Z. Cross-connected bidirectional pyramid network for infrared small-dim target detection. IEEE Geosci. Remote. Sens. Lett. 2022, 19, 1–5.
- Li, Y.; Huang, Q.; Pei, X.; Chen, Y.; Jiao, L.; Shang, R. Cross-layer attention network for small object detection in remote sensing imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2020, 14, 2148–2161.
- Gong, Y.; Yu, X.; Ding, Y.; Peng, X.; Zhao, J.; Han, Z. Effective fusion factor in FPN for tiny object detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 1160–1168.
- Cao, J.; Bao, W.; Shang, H.; Yuan, M.; Cheng, Q. GCL-YOLO: A GhostConv-based lightweight yolo network for UAV small object detection. Remote Sens. 2023, 15, 4932.
- Liu, K.; Huang, J.; Li, X. Eagle-eye-inspired attention for object detection in remote sensing. Remote Sens. 2022, 14, 1743.
- Li, P.; Che, C. SeMo-YOLO: A multiscale object detection network in satellite remote sensing images. In Proceedings of the IEEE 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–8.
- Wan, D.; Lu, R.; Wang, S.; Shen, S.; Xu, T.; Lang, X. Yolo-hr: Improved yolov5 for object detection in high-resolution optical remote sensing images. Remote Sens. 2023, 15, 614.
- Qu, Z.; Zhu, F.; Qi, C. Remote sensing image target detection: Improvement of the YOLOv3 model with auxiliary networks. Remote Sens. 2021, 13, 3908.
- Xu, D.; Wu, Y. Improved YOLO-V3 with DenseNet for multi-scale remote sensing target detection. Sensors 2020, 20, 4276.
- Zheng, Z.; Wang, P.; Ren, D.; Liu, W.; Ye, R.; Hu, Q.; Zuo, W. Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation. IEEE Trans. Cybern. 2020, 52, 8574–8586.
- Liu, Z.; Gao, G.; Sun, L.; Fang, Z. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. Proc. AAAI Conf. Artif. Intell. 2019, 34, 12993–13000.
- Razakarivony, S.; Jurie, F. Vehicle detection in aerial imagery: A small target detection benchmark. J. Vis. Commun. Image Represent. 2016, 34, 187–203.
- Liu, C.; Gao, G.; Huang, Z.; Hu, Z.; Liu, Q.; Wang, Y. YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images. IEEE Trans. Intell. Transp. Syst. 2024, 25, 13863–13875.
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-scale Dataset for Object Detection in Aerial Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018.
- Lam, D.; Kuzma, R.; Mcgee, K.; Dooley, S.; Laielli, M.; Klaric, M.; Bulatov, Y.; Mccord, B. xView: Objects in Context in Overhead Imagery. arXiv 2018, arXiv:1802.07856.
- Li, K.; Wan, G.; Cheng, G.; Meng, L.; Han, J. Object detection in optical remote sensing images: A survey and a new benchmark. ISPRS J. Photogramm. Remote. Sens. 2020, 159, 296–307.
- Zhu, P.; Wen, L.; Bian, X.; Ling, H.; Hu, Q. Vision Meets Drones: A Challenge; Springer: Cham, Switzerland, 2018.
Dataset | BiFPN | AFEM | AKSFFM | WIoU | mAP50 | mAP50–95 |
---|---|---|---|---|---|---|
VEDAI | × | × | × | × | 0.712 | 0.459 |
VEDAI | ✓ | × | × | × | 0.718 | 0.461 |
VEDAI | ✓ | ✓ | × | × | 0.725 | 0.472 |
VEDAI | ✓ | ✓ | ✓ | × | 0.739 | 0.466 |
VEDAI | × | × | × | ✓ | 0.729 | 0.469 |
VEDAI | ✓ | × | ✓ | × | 0.735 | 0.463 |
VEDAI | ✓ | ✓ | ✓ | ✓ | 0.743 | 0.469 |
Method | mAP50 | mAP50–95 |
---|---|---|
CIoU | 0.735 | 0.463 |
SIoU | 0.704 | 0.449 |
Focal_CIoU | 0.734 | 0.455 |
GIoU | 0.729 | 0.46 |
WIoU | 0.743 | 0.469 |
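All of the losses compared in this table build on the intersection over union (IoU) between a predicted and a ground-truth box. As a minimal sketch of that shared core, and not the authors' implementation, the following computes IoU and GIoU for axis-aligned boxes. The `(x1, y1, x2, y2)` box format and the function names are assumptions made for illustration; the extra terms that CIoU, SIoU, and WIoU add on top are not shown.

```python
def _area(x1, y1, x2, y2):
    """Area of an axis-aligned box; degenerate boxes have zero area."""
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2)."""
    inter = _area(max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3]))
    union = _area(*a) + _area(*b) - inter
    # Smallest box enclosing both inputs, used by the GIoU penalty term.
    hull = _area(min(a[0], b[0]), min(a[1], b[1]),
                 max(a[2], b[2]), max(a[3], b[3]))
    if union == 0.0 or hull == 0.0:
        return 0.0, 0.0
    iou = inter / union
    giou = iou - (hull - union) / hull
    return iou, giou


# The same 2-pixel shift applied to a 4x4 box and to a 40x40 box.
small = iou_and_giou((0, 0, 4, 4), (2, 0, 6, 4))
large = iou_and_giou((0, 0, 40, 40), (2, 0, 42, 40))
print(f"small: IoU={small[0]:.3f}, GIoU={small[1]:.3f}")
print(f"large: IoU={large[0]:.3f}, GIoU={large[1]:.3f}")
```

In this toy case the 2-pixel shift drops the IoU of the 4x4 box to about 0.33, while the 40x40 box stays at about 0.90, which illustrates why the choice of box regression loss matters more for small targets.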
Model | Category | P | R | mAP50 |
---|---|---|---|---|
YOLOv8n | car | 0.787 | 0.87 | 0.922 |
YOLOv8n | truck | 0.537 | 0.571 | 0.616 |
YOLOv8n | pick-up | 0.692 | 0.836 | 0.853 |
YOLOv8n | tractor | 0.715 | 0.6 | 0.609 |
YOLOv8n | camping_car | 0.726 | 0.784 | 0.786 |
YOLOv8n | boat | 0.499 | 0.667 | 0.706 |
YOLOv8n | van | 0.636 | 0.699 | 0.759 |
YOLOv8n | other | 0.448 | 0.383 | 0.399 |
YOLOv8n | large | 1 | 0.617 | 0.756 |
YOLOv8n | all | 0.671 | 0.67 | 0.712 |
ARSOD-YOLO | car | 0.82 | 0.84 | 0.895 |
ARSOD-YOLO | truck | 0.638 | 0.543 | 0.636 |
ARSOD-YOLO | pick-up | 0.686 | 0.834 | 0.8 |
ARSOD-YOLO | tractor | 0.74 | 0.568 | 0.715 |
ARSOD-YOLO | camping_car | 0.694 | 0.789 | 0.816 |
ARSOD-YOLO | boat | 0.623 | 0.63 | 0.675 |
ARSOD-YOLO | van | 0.819 | 0.72 | 0.797 |
ARSOD-YOLO | other | 0.611 | 0.468 | 0.523 |
ARSOD-YOLO | large | 0.923 | 0.667 | 0.831 |
ARSOD-YOLO | all | 0.728 | 0.673 | 0.743 |
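The "all" rows of this table can be checked against the per-category values: mAP50 is the unweighted mean of the per-category AP50 scores. The sketch below copies the AP50 column from the table and recomputes the mean for both models, then lists which categories gain or lose the most. The dictionary layout and the helper name `mean_ap` are illustrative choices, not part of the paper.

```python
# Per-category AP50 on VEDAI, copied from the table above.
ap50 = {
    "YOLOv8n": {
        "car": 0.922, "truck": 0.616, "pick-up": 0.853, "tractor": 0.609,
        "camping_car": 0.786, "boat": 0.706, "van": 0.759, "other": 0.399,
        "large": 0.756,
    },
    "ARSOD-YOLO": {
        "car": 0.895, "truck": 0.636, "pick-up": 0.8, "tractor": 0.715,
        "camping_car": 0.816, "boat": 0.675, "van": 0.797, "other": 0.523,
        "large": 0.831,
    },
}


def mean_ap(per_class):
    """Unweighted mean of per-category AP values."""
    return sum(per_class.values()) / len(per_class)


for model, per_class in ap50.items():
    print(f"{model}: mAP50 = {mean_ap(per_class):.3f}")

# Per-category change of ARSOD-YOLO relative to the YOLOv8n baseline.
gains = {c: ap50["ARSOD-YOLO"][c] - ap50["YOLOv8n"][c] for c in ap50["YOLOv8n"]}
best = max(gains, key=gains.get)
worst = min(gains, key=gains.get)
print(f"largest gain: {best} ({gains[best]:+.3f})")
print(f"largest drop: {worst} ({gains[worst]:+.3f})")
```

The recomputed means match the reported 0.712 and 0.743. The per-category view also shows that the overall gain is not uniform: some categories improve substantially while others regress slightly.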
Model | Category | P | R | mAP50 |
---|---|---|---|---|
YOLOv8n | airplane | 0.434 | 0.616 | 0.589 |
YOLOv8n | bridge | 0.562 | 0.334 | 0.331 |
YOLOv8n | storage-tank | 0.843 | 0.652 | 0.739 |
YOLOv8n | ship | 0.603 | 0.64 | 0.629 |
YOLOv8n | swimming-pool | 1 | 0.0335 | 0.0604 |
YOLOv8n | vehicle | 0.672 | 0.681 | 0.671 |
YOLOv8n | person | 0.671 | 0.282 | 0.26 |
YOLOv8n | windmill | 0.397 | 0.0152 | 0.0591 |
YOLOv8n | all | 0.616 | 0.407 | 0.417 |
ARSOD-YOLO | airplane | 0.714 | 0.6 | 0.648 |
ARSOD-YOLO | bridge | 0.518 | 0.289 | 0.307 |
ARSOD-YOLO | storage-tank | 0.849 | 0.671 | 0.748 |
ARSOD-YOLO | ship | 0.731 | 0.596 | 0.668 |
ARSOD-YOLO | swimming-pool | 0.439 | 0.31 | 0.307 |
ARSOD-YOLO | vehicle | 0.746 | 0.682 | 0.707 |
ARSOD-YOLO | person | 0.535 | 0.267 | 0.307 |
ARSOD-YOLO | windmill | 0.331 | 0.182 | 0.136 |
ARSOD-YOLO | all | 0.608 | 0.45 | 0.478 |
Method | P | R | mAP50 | mAP50–95 | FLOPs (G) |
---|---|---|---|---|---|
YOLOv3t | 0.51 | 0.564 | 0.553 | 0.3 | 13 |
YOLOv5n | 0.602 | 0.511 | 0.517 | 0.302 | 5.9 |
YOLOv9t | 0.817 | 0.591 | 0.654 | 0.429 | 11.1 |
YOLOv8n | 0.671 | 0.67 | 0.712 | 0.462 | 8.1 |
YOLOv10n | 0.729 | 0.658 | 0.726 | 0.465 | 8.3 |
RT-DETR | 0.455 | 0.436 | 0.455 | 0.436 | 103.56 |
Fast-RCNN | - | - | 0.459 | - | 196.2 |
SSD | - | - | 0.451 | - | - |
TPH-YOLO | - | - | 0.584 | 0.338 | 270.9 |
Ours | 0.728 | 0.673 | 0.743 | 0.469 | 25.1 |
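To read this comparison as a trade-off between accuracy and compute, the gains over the YOLOv8n baseline can be computed directly from the rows above. The following is a small sketch using values copied from the table; the dictionary keys and the helper `gain_over` are hypothetical names chosen for illustration.

```python
# Selected rows from the VEDAI comparison table above.
rows = {
    "YOLOv8n": {"map50": 0.712, "map50_95": 0.462, "gflops": 8.1},
    "YOLOv10n": {"map50": 0.726, "map50_95": 0.465, "gflops": 8.3},
    "Ours": {"map50": 0.743, "map50_95": 0.469, "gflops": 25.1},
}


def gain_over(model, base="YOLOv8n"):
    """Accuracy gain in percentage points and FLOPs ratio versus a baseline row."""
    m, b = rows[model], rows[base]
    return {
        "map50_points": round((m["map50"] - b["map50"]) * 100, 1),
        "map50_95_points": round((m["map50_95"] - b["map50_95"]) * 100, 1),
        "flops_ratio": round(m["gflops"] / b["gflops"], 2),
    }


for name in ("YOLOv10n", "Ours"):
    print(name, gain_over(name))
```

For the proposed model this gives a gain of 3.1 points in mAP50 and 0.7 points in mAP50–95 over YOLOv8n, at roughly 3.1 times the FLOPs.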
Method | P | R | mAP50 | mAP50–95 | FLOPs (G) |
---|---|---|---|---|---|
YOLOv3t | 0.475 | 0.298 | 0.309 | 0.131 | 13.1 |
YOLOv5n | 0.698 | 0.311 | 0.311 | 0.342 | 6.1 |
YOLOv8n | 0.616 | 0.407 | 0.417 | 0.18 | 8.2 |
YOLOv9t | 0.592 | 0.277 | 0.384 | 0.299 | 12 |
YOLOv10n | 0.538 | 0.203 | 0.455 | 0.203 | 8.2 |
RT-DETR | 0.614 | 0.229 | 0.191 | 0.121 | 108.1 |
HANet | - | - | 0.529 | 0.210 | - |
Ours | 0.608 | 0.45 | 0.478 | 0.209 | 25.3 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qiu, Y.; Zheng, X.; Hao, X.; Zhang, G.; Lei, T.; Jiang, P. ARSOD-YOLO: Enhancing Small Target Detection for Remote Sensing Images. Sensors 2024, 24, 7472. https://doi.org/10.3390/s24237472