LARS: Remote Sensing Small Object Detection Network Based on Adaptive Channel Attention and Large Kernel Adaptation
Figure 1. The overall architecture of LARS.
Figure 2. Structure of the ACA block.
Figure 3. Structure of the LKA block.
Figure 4. Average detection accuracy per category on the DOTA-v2.0 dataset. Each point represents the accuracy of a comparison model in a given category, the horizontal axis represents different models, and the vertical axis represents the AP50 value for each category.
Figure 5. Average detection accuracy for all categories on the DOTA-v2.0 dataset. LARS is represented by the red bar, and the highest detection accuracy of 63.01 was achieved on this dataset.
Figure 6. Comparison of mAP50 and mAP95 metrics on the VisDrone dataset.
Figure 7. Comparison of results from different strategies for decomposition of large kernels using (kernel, dilation) format.
Figure 8. Comparison of detection performance with different blocks added.
Figure 9. Visualization of detection performance after adding different blocks, with the parts of the detection results where it is difficult to find differences circled in ellipses.
Figure 10. Evaluation metric analysis for proposed model.
Figure 11. Comparison of normalized confusion matrices without (Left) and with (Right) the LBN block on the DOTA-v2.0 dataset, where BG represents the background.
Figure 12. Precision–recall curve and F1–confidence curve for proposed model.
Figure 13. Examples of test results on the DOTA-v2.0 and VisDrone datasets.
Abstract
1. Introduction
- To address the insufficient attention paid to the features of small remote sensing objects when features of different scales are processed, the ACA block is proposed. This block applies adaptive attention weighting based on the input feature dimensions, guiding the model to focus on local information.
- The LKA block is designed to address the incorrect detection of small remote sensing objects caused by the loss of local information when large kernel convolutions are applied to remote sensing images. This block dynamically adjusts the spatial receptive field according to the varying background around the detection area. It is guided by the weight information extracted by the ACA block, which enhances the model’s ability to extract contextual information around small objects.
- The LBN method is designed to resolve the classification confusion caused by the correlation between samples. This method improves the consistency analysis capabilities during adaptive learning, alleviating the decline in classification accuracy caused by sample misclassification.
2. Related Works
3. Method
3.1. Network Overview
3.2. ACA Block
3.3. LKA Block
Algorithm 1: KA Block: Core of LKA
Input: Image tensor x
Output: Tensor after processing X
1 | Initialize convolutional layers; |
2 | Generate features F_1 = Conv(x), F_2 = Conv(F_1); |
3 | Obtain concatenated features F = Concat(F_1, F_2); |
4 | Calculate statistics S_1 = P_1(F), S_2 = P_2(F); |
5 | Obtain aggregate weights S = (Concat(S_1, S_2)); |
6 | Obtain the weight matrix W = S[0]*F_1 + S[1]*F_2; |
7 | Calculate the weighted sum X = W*x; |
8 | Return X; |
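The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it works on a single-channel 2D map, uses fixed averaging kernels in place of learned convolution weights, takes the two statistics to be the mean and the maximum across the two feature maps, and applies a hypothetical 2×2 mixing matrix followed by a sigmoid to produce the aggregate weights. The kernel sizes and dilations follow the (5, 1) + (7, 3) sequence reported in the ablation; all function names are illustrative.

```python
import math


def conv2d_same(x, kernel, dilation=1):
    """Single-channel 2D convolution, stride 1, zero padding (same size)."""
    h, w, n = len(x), len(x[0]), len(kernel)
    half = (n - 1) // 2 * dilation  # offset of the kernel centre
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(n):
                r = i + a * dilation - half
                if not 0 <= r < h:
                    continue
                for b in range(n):
                    c = j + b * dilation - half
                    if 0 <= c < w:
                        acc += kernel[a][b] * x[r][c]
            out[i][j] = acc
    return out


def box_kernel(n):
    """Hypothetical fixed averaging kernel standing in for learned weights."""
    return [[1.0 / (n * n)] * n for _ in range(n)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def pointwise(fn, *maps):
    """Apply fn element-wise across equally sized 2D maps."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


def ka_block(x, mix=((0.5, 0.5), (0.5, -0.5))):
    # Step 2: two stacked convolutions, the second applied to the first.
    f1 = conv2d_same(x, box_kernel(5), dilation=1)
    f2 = conv2d_same(f1, box_kernel(7), dilation=3)
    # Steps 3-4: pool across the two concatenated maps (assumed avg and max).
    s_avg = pointwise(lambda a, b: (a + b) / 2.0, f1, f2)
    s_max = pointwise(max, f1, f2)
    # Step 5: mix the statistics into one weight map per branch (assumed form).
    s = [pointwise(lambda a, m, c=c: sigmoid(c[0] * a + c[1] * m), s_avg, s_max)
         for c in mix]
    # Step 6: W = S[0]*F_1 + S[1]*F_2.
    wgt = pointwise(lambda s0, a, s1, b: s0 * a + s1 * b, s[0], f1, s[1], f2)
    # Step 7: X = W * x (element-wise).
    return pointwise(lambda wv, xv: wv * xv, wgt, x)


if __name__ == "__main__":
    x = [[0.0] * 8 for _ in range(8)]
    x[4][4] = 1.0
    out = ka_block(x)
    print(len(out), len(out[0]), out[4][4] > 0)
```

Because the last step multiplies the weight map by the input, the output keeps the input's spatial size and is zero wherever the input is zero; only the weighting changes with the surrounding context.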
3.4. LBN Method
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Comparative Experiments
4.4. Ablation Experiments
4.5. Results Analysis
5. Conclusions
- (1) To address the model’s lack of interpretability, we are exploring mathematical formulations that explain how the model works, in order to understand its inner workings and decision-making process more clearly.
- (2) The model has a large number of parameters, leaving room for a more lightweight design. Future research can focus on reducing the model parameters and computational complexity through compression techniques to meet the requirements of applications in resource-constrained environments.
- (3) There is still room for improvement in the model’s accuracy, especially for small object detection against complex backgrounds in remote sensing images. Future work can further enhance the accuracy and robustness through optimization algorithms or modifications to the model structure.
- (4) We also find that the imbalance in the number of samples per class in the dataset can significantly affect the detection results. Future research can consider learning methods that improve the model’s detection of classes with fewer samples.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Liu, D.; Zhang, J.; Qi, Y.; Wu, Y.; Zhang, Y. Tiny Object Detection in Remote Sensing Images Based on Object Reconstruction and Multiple Receptive Field Adaptive Feature Enhancement. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–13.
- Shihabudeen, H.; Rajeesh, J. A detail review and analysis on deep learning based fusion of IR and visible images. In AIP Conference Proceedings; AIP Publishing: Melville, NY, USA, 2024; Volume 2965.
- Cheng, G.; Yuan, X.; Yao, X.; Yan, K. Towards large-scale small object detection: Survey and benchmarks. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 13467–13488.
- Wang, Y.; Bashir, S.M.A.; Khan, M.; Ullah, Q.; Wang, R.; Song, Y.; Guo, Z.; Niu, Y. Remote sensing image super-resolution and object detection: Benchmark and state of the art. Expert Syst. Appl. 2022, 197, 116793.
- Xie, Q.; Zhou, D.; Tang, R.; Feng, H. A Deep CNN-Based Detection Method for Multi-Scale Fine-Grained Objects in Remote Sensing Images. IEEE Access 2024, 12, 15622–15630.
- Chadwick, A.J.; Coops, N.C.; Bater, C.W.; Martens, L.A.; White, B. Transferability of a Mask R–CNN Model for the Delineation and Classification of Two Species of Regenerating Tree Crowns to Untrained Sites. Sci. Remote Sens. 2024, 9, 100109.
- Zhu, Z.; He, X.; Qi, G.; Li, Y.; Cong, B.; Liu, Y. Brain Tumor Segmentation Based on the Fusion of Deep Semantics and Edge Information in Multimodal MRI. Inf. Fusion 2023, 91, 376–387.
- Sagar, A.S.; Chen, Y.; Xie, Y.; Kim, H.S. MSA R-CNN: A comprehensive approach to remote sensing object detection and scene understanding. Expert Syst. Appl. 2024, 241, 122788.
- Zhu, Z.; Sun, M.; Qi, G.; Li, Y.; Gao, X.; Liu, Y. Sparse Dynamic Volume TransUNet with Multi-Level Edge Fusion for Brain Tumor Segmentation. Comput. Biol. Med. 2024, 172, 108284.
- Zhu, Z.; Wang, Z.; Qi, G.; Mazur, N.; Yang, P.; Liu, Y. Brain Tumor Segmentation in MRI with Multi-Modality Spatial Information Enhancement and Boundary Shape Correction. Pattern Recognit. 2024, 153, 110553.
- Ghadi, Y.Y.; Rafique, A.A.; Al Shloul, T.; Alsuhibany, S.A.; Jalal, A.; Park, J. Robust object categorization and Scene classification over remote sensing images via features fusion and fully convolutional network. Remote Sens. 2022, 14, 1550.
- Qu, J.; Tang, Z.; Zhang, L.; Zhang, Y.; Zhang, Z. Remote Sensing Small Object Detection Network Based on Attention Mechanism and Multi-Scale Feature Fusion. Remote Sens. 2023, 15, 2728.
- Ghaffarian, S.; Valente, J.; Van Der Voort, M.; Tekinerdogan, B. Effect of attention mechanism in deep learning-based remote sensing image processing: A systematic literature review. Remote Sens. 2021, 13, 2965.
- Wang, J.; Li, W.; Zhang, M.; Chanussot, J. Large Kernel Sparse ConvNet Weighted by Multi-Frequency Attention for Remote Sensing Scene Understanding. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–12.
- Xiang, S.; Liang, Q. Remote Sensing Image Compression with Long-Range Convolution and Improved Non-Local Attention Model. Signal Process. 2023, 209, 109005.
- Wang, W.; Li, S.; Shao, J.; Jumahong, H. LKC-Net: Large Kernel Convolution Object Detection Network. Sci. Rep. 2023, 13, 9535.
- Li, Y.; Hou, Q.; Zheng, Z.; Cheng, M.M.; Yang, J.; Li, X. Large Selective Kernel Network for Remote Sensing Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1 October 2023; pp. 16794–16805.
- Ding, X.; Zhang, X.; Han, J.; Ding, G. Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11963–11975.
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proc. Int. Conf. Mach. Learn. 2015, 37, 448–456.
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer Normalization. arXiv 2016, arXiv:1607.06450.
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3974–3983.
- Du, D.; Zhu, P.; Wen, L.; Bian, X.; Lin, H.; Hu, Q.; Peng, T.; Zheng, J.; Wang, X.; Zhang, Y.; et al. VisDrone-DET2019: The Vision Meets Drone Object Detection in Image Challenge Results. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27 October–2 November 2019.
- Jiang, J.; Zhong, X.; Chang, Z.; Gao, X. Object Detection of Transmission Tower Based on DPM. In Proceedings of the 4th International Conference on Information Technologies and Electrical Engineering, Lviv, Ukraine, 19–21 May 2021; pp. 1–5.
- Ren, Y.; Zhu, C.; Xiao, S. Small Object Detection in Optical Remote Sensing Images via Modified Faster R-CNN. Appl. Sci. 2018, 8, 813.
- Lim, J.S.; Astrid, M.; Yoon, H.J.; Lee, S.I. Small Object Detection Using Context and Attention. In Proceedings of the 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Jeju, Republic of Korea, 13–16 April 2021; pp. 181–186.
- Yan, J.; Hu, X.; Zhang, K.; Shi, T.; Zhu, G.; Zhang, Y. Detection of Dim Small Ground Targets in SAR Remote Sensing Image Based on Multi-Level Feature Fusion. J. Imaging Sci. Technol. 2023, 67, 1.
- Fan, F.; Zhang, M.; Yu, D.; Li, J.; Zhou, S.; Liu, Y. Lightweight Context Awareness and Feature Enhancement for Anchor-Free Remote Sensing Target Detection. IEEE Sens. J. 2024, 24, 10714–10726.
- Du, Z.; Liang, Y. Object Detection of Remote Sensing Image Based on Multi-Scale Feature Fusion and Attention Mechanism. IEEE Access 2024, 12, 8619–8632.
- Paoletti, M.E.; Moreno-Alvarez, S.; Haut, J.M. Multiple attention-guided capsule networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–20.
- Yan, R.; Yan, L.; Cao, Y.; Geng, G.; Zhou, P. One-Stop Multiscale Reconciliation Attention Network with Scribble Supervision for Salient Object Detection in Optical Remote Sensing Images. Appl. Intell. 2024, 54, 1–19.
- Liu, C.; Zhang, S.; Hu, M.; Song, Q. Object Detection in Remote Sensing Images Based on Adaptive Multi-Scale Feature Fusion Method. Remote Sens. 2024, 16, 907.
- Dong, P.; Wang, B.; Cong, R.; Sun, H.H.; Li, C. Transformer with Large Convolution Kernel Decoder Network for Salient Object Detection in Optical Remote Sensing Images. Comput. Vis. Image Underst. 2024, 240, 103917.
- Sharshar, A.; Matsun, A. Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection. arXiv 2023, arXiv:2311.12956.
- Cha, K.; Seo, J.; Lee, T. A Billion-Scale Foundation Model for Remote Sensing Images. arXiv 2023, arXiv:2304.05215.
- Lee, H.; Song, M.; Koo, J. Hausdorff distance matching with adaptive query denoising for rotated detection transformer. arXiv 2023, arXiv:2305.07598.
- Xie, X.; Cheng, G.; Wang, J. Oriented R-CNN and Beyond. Int. J. Comput. Vis. 2024, 132, 2420–2442.
- Han, J.; Ding, J.; Li, J.; Xia, G.S. Align Deep Features for Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–11.
- Li, W.; Chen, Y.; Hu, K.; Zhu, J. Oriented RepPoints for Aerial Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1829–1838.
- Biswas, D.; Tešić, J. Progressive Domain Adaptation with Contrastive Learning for Object Detection in the Satellite Imagery. arXiv 2022, arXiv:2209.02564.
- Zhao, Z.; Li, S. OASL: Orientation-Aware Adaptive Sampling Learning for Arbitrary Oriented Object Detection. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103740.
- Zhao, J.; Ding, Z.; Zhou, Y.; Zhu, H.; Du, W.; Yao, R.; Saddik, A.E. Efficient Decoder for End-to-End Oriented Object Detection in Remote Sensing Images. arXiv 2023, arXiv:2311.17629.
- Xie, X.; Cheng, G.; Rao, C.; Lang, C.; Han, J. Oriented Object Detection via Contextual Dependence Mining and Penalty-Incentive Allocation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–10.
- Zhang, M.; Yue, K.; Li, B.; Guo, J.; Li, Y.; Gao, X. Single-Frame Infrared Small Target Detection via Gaussian Curvature Inspired Network. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–13.
- Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 3520–3529.
- Xu, C.; Ding, J.; Wang, J.; Yang, W.; Yu, H.; Yu, L.; Xia, G.S. Dynamic Coarse-to-Fine Learning for Oriented Tiny Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7318–7328.
- Yang, X.; Yang, X.; Yang, J.; Ming, Q.; Wang, W.; Tian, Q.; Yan, J. Learning High-Precision Bounding Box for Rotated Object Detection Via Kullback-Leibler Divergence. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2021; Volume 34, pp. 18381–18394.
- Jocher, G. Ultralytics YOLOv8. Available online: https://github.com/ultralytics/ultralytics (accessed on 12 March 2024).
- Hou, L.; Lu, K.; Xue, J.; Li, Y. Shape-adaptive selection and measurement for oriented object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; Volume 36, pp. 923–932.
- Nie, G.; Huang, H. Multi-oriented object detection in aerial images with double horizontal rectangles. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 4932–4944.
- Xu, Y.; Fu, M.; Wang, Q.; Wang, Y. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 1452–1459.
- Cheng, C.; Yao, Y.; Li, S.; Li, K.; Xie, X. Dual aligned oriented detector. IEEE Trans. Geosci. Remote Sens. 2020, 43, 1452–1459.
- Yuan, X.; Cheng, G.; Yan, K.; Zeng, Q.; Han, J. Small object detection via coarse-to-fine proposal generation and imitation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 6317–6327.
- Shang, J.; Wang, J.; Liu, S.; Wang, C.; Zheng, B. Small Target Detection Algorithm for UAV Aerial Photography Based on Improved YOLOv5s. Electronics 2023, 12, 2434.
- Liu, H.; Duan, X.; Lou, H.; Gu, J.; Chen, H. Improved GBS-YOLOv5 Algorithm Based on YOLOv5 Applied to UAV Intelligent Traffic. Sci. Rep. 2023, 13, 9577.
- Ding, K.; Li, X.; Guo, W.; Wu, L. Improved object detection algorithm for drone-captured dataset based on yolov5. In Proceedings of the 2022 2nd International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China, 14–16 January 2022; pp. 895–899.
- Tang, S.; Fang, Y.; Zhang, S. HIC-YOLOv5: Improved YOLOv5 for Small Object Detection. arXiv 2023, arXiv:2309.16393.
- Yang, C.; Huang, Z.; Wang, N. QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 13668–13677.
- Du, B.; Huang, Y.; Chen, J.; Huang, D. Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 13435–13444.
- Yu, W.; Yang, T.; Chen, C. Towards Resolving the Challenge of Long-Tail Distribution in UAV Images for Object Detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Online, 5–9 January 2021; pp. 3258–3267.
- Akyon, F.C.; Altinuc, S.O.; Temizel, A. Slicing Aided Hyper Inference and Fine-Tuning for Small Object Detection. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 966–970.
- Liu, S.; Zha, J.; Sun, J.; Li, Z.; Wang, G. EdgeYOLO: An edge-real-time object detector. In Proceedings of the 2023 42nd Chinese Control Conference (CCC), Tianjin, China, 24–26 July 2023; pp. 7507–7512.
Area Subset | Small: Extremely Small | Small: Relatively Small | Small: Generally Small | Normal |
---|---|---|---|---|
Area Range | (0, 144] | (144, 400] | (400, 1024] | (1024, 2000] |
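The area ranges in the table above amount to a simple lookup on half-open intervals. The sketch below shows one way to assign an object to a subset; the function name and the choice to return `None` outside the tabulated ranges are assumptions made for illustration, and the area is taken to be in the same units as the table.

```python
# Upper bounds (inclusive) of each area subset, taken from the table.
AREA_SUBSETS = [
    (144, "Extremely Small"),
    (400, "Relatively Small"),
    (1024, "Generally Small"),
    (2000, "Normal"),
]


def area_subset(area):
    """Return the subset name for a box area, or None if it is out of range."""
    if area <= 0:
        return None  # the first interval is open at 0
    for upper, name in AREA_SUBSETS:
        if area <= upper:
            return name
    return None  # larger than the last tabulated bound


if __name__ == "__main__":
    for a in (100, 144, 145, 1024, 1500, 2500):
        print(a, area_subset(a))
```

Each interval is closed on the right, so an area of exactly 144 falls in the Extremely Small subset and 145 in the Relatively Small subset.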
Category | Abbr. | Inst. Count | Category | Abbr. | Inst. Count |
---|---|---|---|---|---|
plane | PL | 23,930 | large vehicle | LV | 89,353 |
ship | SH | 251,883 | small vehicle | SV | 1,235,658 |
storage tank | ST | 79,497 | helicopter | HC | 893 |
baseball diamond | BD | 3834 | roundabout | RA | 6809 |
tennis court | TC | 9396 | soccer ball field | SBF | 2404 |
basketball court | BC | 3556 | swimming pool | SP | 20,095 |
ground track field | GTF | 4933 | container crane | CC | 3887 |
harbor | HB | 29,581 | airport | AP | 5905 |
bridge | BR | 21,433 | helipad | HP | 611 |
Training | / | 268,627 | Test/Test-dev | / | 353,346 |
Validation | / | 81,048 | Test-challenge | / | 1,690,637 |
Category | Instance Count |
---|---|
airplane | 31,529 |
helicopter | 1395 |
small-vehicle | 463,072 |
large-vehicle | 15,333 |
ship | 61,916 |
container | 138,223 |
storage-tank | 35,027 |
swimming-pool | 26,953 |
windmill | 26,755 |
Train | 344,228 |
Validation | 159,573 |
Test | 296,402 |
Total | 800,203 |
Category | Instance Count |
---|---|
pedestrian | 79,337 |
people | 27,059 |
bicycle | 10,477 |
car | 144,865 |
van | 24,950 |
truck | 12,871 |
tricycle | 4803 |
awning-tricycle | 3243 |
bus | 5926 |
motor | 29,642 |
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HB | SP | HC | CC | AP | HP | mAP50 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
multi-stage | |||||||||||||||||||
BFM [34] | 80.12 | 54.12 | 50.07 | 65.68 | 43.98 | 60.07 | 67.85 | 79.11 | 64.38 | 60.56 | 45.98 | 58.26 | 58.31 | 64.82 | 69.84 | 32.78 | 89.37 | 11.07 | 58.69 |
RHINO [35] | 79.74 | 58.79 | 48.13 | 67.12 | 57.21 | 59.11 | 69.48 | 83.54 | 65.14 | 74.05 | 47.93 | 60.49 | 58.43 | 63.25 | 55.59 | 48.49 | 82.06 | 14.40 | 60.72 |
ORCB [36] | 78.65 | 51.80 | 47.15 | 65.78 | 43.35 | 58.29 | 60.89 | 82.83 | 63.51 | 59.50 | 43.40 | 55.79 | 52.90 | 56.18 | 54.13 | 27.55 | 66.24 | 5.22 | 54.06 |
S2A-Net [37] | 77.84 | 51.31 | 43.72 | 62.59 | 47.51 | 50.58 | 57.86 | 80.73 | 59.11 | 65.32 | 36.43 | 52.60 | 45.36 | 52.46 | 40.12 | 0 | 62.81 | 11.11 | 49.86 |
O-Rep [38] | 73.02 | 46.68 | 42.37 | 63.05 | 47.06 | 50.28 | 58.64 | 78.84 | 57.12 | 66.77 | 35.21 | 50.76 | 48.77 | 51.62 | 34.23 | 6.17 | 64.66 | 5.87 | 48.95 |
PDACL [39] | 83.08 | 68.53 | 44.31 | 58.33 | 63.04 | 79.12 | 88.18 | 93.87 | 58.51 | 72.95 | 54.01 | 54.84 | 73.21 | 57.80 | 40.70 | 3.05 | 61.41 | 49.67 | 61.21 |
OASL [40] | 76.65 | 55.46 | 46.33 | 62.49 | 53.18 | 56.62 | 66.16 | 80.75 | 63.07 | 67.03 | 44.89 | 55.68 | 54.24 | 59.04 | 59.15 | 35.00 | 77.81 | 14.65 | 57.18 |
one-stage | |||||||||||||||||||
RRF [41] | 77.58 | 49.96 | 38.60 | 53.82 | 54.97 | 57.12 | 68.93 | 77.88 | 59.59 | 71.92 | 40.17 | 51.22 | 53.00 | 57.12 | 49.66 | 25.42 | 66.92 | 5.17 | 53.28 |
DFDet [42] | 75.44 | 52.17 | 42.28 | 60.17 | 48.80 | 53.36 | 62.67 | 78.15 | 56.85 | 66.52 | 40.78 | 53.05 | 48.42 | 59.23 | 51.40 | 25.47 | 66.29 | 16.38 | 53.19 |
GCI-Net [43] | 79.18 | 51.57 | 47.50 | 66.61 | 43.30 | 58.07 | 60.73 | 82.85 | 64.47 | 59.62 | 44.31 | 56.66 | 52.71 | 56.73 | 53.04 | 26.10 | 66.41 | 14.42 | 54.68 |
O-RCNN [44] | 77.95 | 50.29 | 46.73 | 65.24 | 42.61 | 54.56 | 60.02 | 79.08 | 61.69 | 59.42 | 42.26 | 56.89 | 51.11 | 56.16 | 59.33 | 25.81 | 60.67 | 9.17 | 53.28 |
DCFL [45] | 79.49 | 55.97 | 50.15 | 61.59 | 49.01 | 55.33 | 59.31 | 81.81 | 66.52 | 60.06 | 52.87 | 56.71 | 57.83 | 58.13 | 60.35 | 35.66 | 78.65 | 13.03 | 57.66 |
R3Det [46] | 75.44 | 50.95 | 41.16 | 61.61 | 41.11 | 45.76 | 49.65 | 78.52 | 54.97 | 60.79 | 42.07 | 53.20 | 43.08 | 49.55 | 34.09 | 36.26 | 68.65 | 0.06 | 47.26 |
YOLOv8 [47] | 91.89 | 72.87 | 43.42 | 60.95 | 62.13 | 79.11 | 85.79 | 94.51 | 60.78 | 75.22 | 42.36 | 56.53 | 75.99 | 68.35 | 48.51 | 2.24 | 33.82 | 0.15 | 58.51 |
SASM [48] | 70.30 | 40.62 | 37.01 | 59.03 | 40.21 | 45.46 | 44.60 | 78.58 | 49.34 | 60.73 | 29.89 | 46.57 | 42.95 | 48.31 | 28.13 | 1.82 | 76.37 | 0.74 | 44.53 |
Ours | 94.53 | 73.45 | 51.43 | 65.02 | 65.49 | 81.32 | 87.66 | 94.82 | 70.20 | 79.41 | 54.17 | 62.53 | 80.42 | 72.50 | 42.67 | 8.01 | 54.81 | 3.90 | 63.01 |
Method | Publication | AP | AP50 | AP75 | APeS | APrS | APgS | APN |
---|---|---|---|---|---|---|---|---|
multi-stage | ||||||||
S2A-Net [37] | TGRS’22 | 28.3 | 69.6 | 13.1 | 10.2 | 22.8 | 35.8 | 29.5 |
O-Rep [38] | CVPR’22 | 26.3 | 58.8 | 19.0 | 9.4 | 22.6 | 32.4 | 28.5 |
DHRec [49] | TPAMI’22 | 30.1 | 68.8 | 19.8 | 10.6 | 24.6 | 40.3 | 34.6 |
one-stage | ||||||||
GV [50] | TPAMI’21 | 31.7 | 70.8 | 22.6 | 11.7 | 27.0 | 41.1 | 33.8 |
O-RCNN [44] | ICCV’21 | 34.4 | 70.7 | 28.6 | 12.5 | 28.6 | 44.5 | 36.7 |
DODet [51] | TGRS’22 | 31.6 | 68.1 | 23.4 | 11.3 | 26.3 | 41.0 | 33.5 |
CFINet [52] | ICCV’23 | 34.4 | 73.1 | 26.1 | 13.5 | 29.3 | 44.0 | 35.9 |
Ours | - | 49.4 | 72.1 | 59.3 | 15.2 | 30.5 | 45.4 | 37.7 |
Method | mAP50 (%) | mAP95 (%) |
---|---|---|
UTY5S [53] | 36.41 | 20.18 |
IGUIT [54] | 35.32 | 20.04 |
DCFL [45] | 32.14 | - |
IOD [55] | 42.93 | 24.62 |
HIC-YOLOv5 [56] | 44.32 | 25.99 |
QueryDet [57] | 48.15 | 28.71 |
CEASC [58] | 50.74 | 28.46 |
DSH-Net [59] | 51.81 | 30.94 |
SAHI [60] | 43.59 | - |
EdgeYOLO [61] | 44.85 | - |
Ours | 52.87 | 33.92 |
(k,d) Sequence | Precision (%) | Recall (%) | mAP50 (%) | mAP95 (%) |
---|---|---|---|---|
(23, 1) | 68.53 | 50.47 | 52.94 | 32.68 |
(3, 1) + (5, 1) + (7, 1) + (9, 1) | 70.90 | 51.54 | 54.45 | 34.27 |
(5, 1) + (7, 3) | 73.97 | 54.72 | 57.34 | 41.12 |
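One way to read the (k, d) sequences in the table above is through the receptive field each one covers. The sketch below assumes stride-1 convolutions applied in sequence, for which the receptive field is 1 plus the sum of (k − 1)·d over the layers. The per-channel weight count additionally assumes depthwise kernels, which the table does not state. Function names are illustrative.

```python
def receptive_field(seq):
    """Receptive field of stacked stride-1 convolutions given (kernel, dilation) pairs."""
    rf = 1
    for k, d in seq:
        rf += (k - 1) * d
    return rf


def depthwise_weights(seq):
    """Weights per channel if every layer is a depthwise k x k kernel (assumption)."""
    return sum(k * k for k, _ in seq)


if __name__ == "__main__":
    strategies = {
        "(23, 1)": [(23, 1)],
        "(3, 1) + (5, 1) + (7, 1) + (9, 1)": [(3, 1), (5, 1), (7, 1), (9, 1)],
        "(5, 1) + (7, 3)": [(5, 1), (7, 3)],
    }
    for name, seq in strategies.items():
        print(name, receptive_field(seq), depthwise_weights(seq))
```

Under these assumptions, (5, 1) + (7, 3) reaches the same 23-pixel receptive field as the single (23, 1) kernel with 74 rather than 529 weights per channel, while the four-kernel sequence covers 21 pixels with 164 weights.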
Block | Precision (%) | Recall (%) | mAP50 (%) | mAP95 (%) | ||
---|---|---|---|---|---|---|
LKA | ACA | LBN | ||||
70.62 | 50.08 | 52.33 | 36.06 | |||
√ | 67.98 | 51.38 | 53.63 | 37.72 | ||
√ | √ | 72.34 | 51.69 | 54.28 | 37.95 | |
√ | √ | 66.54 | 53.22 | 55.95 | 38.85 | |
√ | √ | √ | 73.97 | 54.72 | 57.40 | 41.12 |
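As a rough worked example, the precision and recall columns above can be combined into a single F1 value, the harmonic mean of the two. This is only an illustration at the single operating point reported in the table; it is not a figure reported by the authors, and the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # First row of the table (no blocks) and last row (all three blocks).
    print(round(f1_score(70.62, 50.08), 2))
    print(round(f1_score(73.97, 54.72), 2))
```

Because the harmonic mean is pulled toward the smaller of the two values, the gain in recall from 50.08 to 54.72 accounts for most of the difference between the two rows.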
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, Y.; Yang, Y.; An, Y.; Sun, Y.; Zhu, Z. LARS: Remote Sensing Small Object Detection Network Based on Adaptive Channel Attention and Large Kernel Adaptation. Remote Sens. 2024, 16, 2906. https://doi.org/10.3390/rs16162906