BSFCDet: Bidirectional Spatial–Semantic Fusion Network Coupled with Channel Attention for Object Detection in Satellite Images
Figure 1. Challenges of RSOD. The object to be detected is surrounded by a yellow box. RSIs where the objects are (a) small, (b) multiscale, and (c) difficult to distinguish from the background.
Figure 2. Structure of YOLOv4-tiny.
Figure 3. Structure of SPP.
Figure 4. Structure of BSFCDet.
Figure 5. Structure of SPPF-G.
Figure 6. Structure diagram of the ECAM.
Figure 7. Structure diagram of Resblock_M.
Figure 8. Detection examples of BSFCDet compared with other algorithms on the DIOR dataset. (a–c) Comparisons of detection examples for images 1–3. The algorithms shown, from left to right, are BSFCDet, YOLOv3, YOLOv4-tiny, and YOLOv3-tiny.
Figure 9. Detection examples of BSFCDet compared with other algorithms on the RSOD dataset. (a–c) Comparisons of detection examples for images 1–3. The algorithms shown, from left to right, are BSFCDet, YOLOv3, YOLOv4-tiny, and YOLOv3-tiny.
Figure 10. Detection examples of BSFCDet compared with other algorithms on the DOTA dataset. (a–c) Comparisons of detection examples for images 1–3. The algorithms shown, from left to right, are BSFCDet, YOLOv3, YOLOv4-tiny, and YOLOv3-tiny.
Figure 11. Detection examples of BSFCDet compared with other algorithms on the VOC12 dataset. (a–c) Comparisons of detection examples for images 1–3. The algorithms shown, from left to right, are BSFCDet, YOLOv3, YOLOv4-tiny, and YOLOv3-tiny.
Abstract
1. Introduction
- In military reconnaissance, RSOD techniques can detect aircraft, missiles, and other military equipment and facilities, enabling the rapid acquisition of military intelligence; they are an important part of modern military systems.
- In urban planning, the relevant departments use RSOD techniques to quickly obtain data on urban topography, traffic conditions, and related factors, supporting the coordination of urban spatial layouts and the rational use of urban land.
- In agricultural monitoring, RSOD techniques are used to monitor crop growth, pests, and other conditions so that preventive measures can be taken to reduce economic losses.
- The proportion of small objects is large. Small objects are easily missed because they carry little feature information, which degrades detection performance. For example, Figure 1a contains many small targets, which poses a challenge for RSOD.
- The scales of objects vary greatly. Objects of the same or different categories in RSIs can differ considerably in scale, and the apparent scale of the same object changes with resolution. As shown in Figure 1b, the playground is large while the cars are small, so the detection algorithm needs strong scale adaptability.
- The background is complex. As shown in Figure 1c, remote sensing imaging is affected by illumination, weather, and terrain, which increases the background noise of RSIs and makes objects hard to detect.
- We designed SPPF-G for feature fusion, which enriches the spatial features of small targets and improves small-object detection accuracy (a minimal sketch is given after this list).
- We modified the two-layer feature pyramid network (FPN) [19] into a three-layer BiFPN-G that integrates deep semantic information with shallow spatial information, improving the model's detection of multiscale objects.
- We proposed the ECAM to enhance object information and suppress background noise, and we proposed Resblock_M, a residual block with lower computational cost, to balance accuracy and speed.
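As a concrete point of reference for the list above, the following is a minimal PyTorch sketch of an SPPF-G-style module, assuming it pairs the cascaded-pooling SPPF pattern with Ghost convolutions (GhostConv, Han et al.) in place of standard convolutions. The channel widths, activation, and kernel sizes here are illustrative assumptions rather than the authors' exact specification; the input/output shape matches the (16, 1024, 64, 64) tensors in the SPPF-G vs. SPP timing comparison reported later.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution (Han et al.): a primary conv produces half the
    output channels; a cheap depthwise conv produces the other half."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.LeakyReLU(0.1, inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.LeakyReLU(0.1, inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

class SPPFG(nn.Module):
    """SPPF-G sketch: three cascaded 5x5 max-pools (same receptive fields
    as SPP's parallel 5/9/13 pooling, but reusing intermediate results),
    with Ghost convolutions before and after the pooling stack."""
    def __init__(self, c_in, c_out):
        super().__init__()
        c_mid = c_in // 2
        self.cv1 = GhostConv(c_in, c_mid, k=1)
        self.pool = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)
        self.cv2 = GhostConv(c_mid * 4, c_out, k=1)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))

x = torch.randn(16, 1024, 64, 64)
print(SPPFG(1024, 1024)(x).shape)  # torch.Size([16, 1024, 64, 64])
```

Because the three cascaded 5 × 5 pools cover the receptive fields of SPP's parallel 5/9/13 pooling while sharing computation, this pattern is consistent with the shorter forward and backward times reported for SPPF-G relative to SPP.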
2. Methods
2.1. The Existing Methods
2.1.1. YOLOv4-Tiny
2.1.2. Spatial Pyramid Pooling
2.2. The Proposed Methods
2.2.1. Structure of BSFCDet
2.2.2. SPPF-G
2.2.3. BiFPN-G
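For context, the defining BiFPN operation (Tan et al., EfficientDet) is fast normalized fusion: same-scale feature maps are combined with learnable, non-negative weights normalized to sum to approximately one. The sketch below illustrates only this fusion step, under the assumption that BiFPN-G retains it; the three-layer layout and Ghost convolutions of BiFPN-G are not shown, and all names are illustrative.

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """BiFPN-style weighted fusion: learnable non-negative weights,
    normalized to sum to ~1, applied to same-shape feature maps."""
    def __init__(self, n_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, feats):
        w = torch.relu(self.w)            # keep weights non-negative
        w = w / (w.sum() + self.eps)      # fast normalization
        return sum(wi * fi for wi, fi in zip(w, feats))

# e.g., fusing a top-down feature with a lateral feature at the same scale
fuse = FastNormalizedFusion(2)
p_td = fuse([torch.randn(1, 256, 32, 32), torch.randn(1, 256, 32, 32)])
```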
2.2.4. ECAM
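Judging from the cited ECA-Net work (Wang et al.), the ECAM presumably builds on efficient channel attention: global average pooling followed by a 1D convolution across channels, with the kernel size chosen adaptively from the channel count. The sketch below shows that underlying ECA block only; any BSFCDet-specific additions in the ECAM are not reflected here.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention: a channel descriptor from global
    average pooling is passed through a 1D conv whose kernel size k is
    derived from the channel count, then used to reweight channels."""
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1  # force an odd kernel size
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        y = x.mean(dim=(2, 3))                         # (B, C) descriptor
        y = self.conv(y.unsqueeze(1)).squeeze(1)       # cross-channel interaction
        return x * torch.sigmoid(y)[:, :, None, None]  # channel reweighting

x = torch.randn(2, 512, 16, 16)
print(ECA(512)(x).shape)  # torch.Size([2, 512, 16, 16])
```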
2.2.5. Resblock_M
3. Results
3.1. Dataset and Preprocessing
3.2. Experimental Setup and Comparative Methods
3.3. Evaluation Metrics
3.4. Ablation Experiment
3.5. Comparative Experiments
3.5.1. Comparative Experiments on DIOR
3.5.2. Comparative Experiments on RSOD
3.5.3. Comparative Experiments on DOTA
3.5.4. Comparative Experiments on VOC12
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Wang, Y.; Bashir, S.M.A.; Khan, M.; Ullah, Q.; Wang, R.; Song, Y.; Guo, Z.; Niu, Y. Remote sensing image super-resolution and object detection: Benchmark and state of the art. Expert Syst. Appl. 2022, 197, 116793. [Google Scholar] [CrossRef]
- Ma, W.; Li, N.; Zhu, H.; Jiao, L.; Tang, X.; Guo, Y.; Hou, B. Feature Split–Merge–Enhancement Network for Remote Sensing Object Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5616217. [Google Scholar] [CrossRef]
- Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001. [Google Scholar]
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; IEEE: San Diego, CA, USA, 2005; Volume 1, pp. 886–893. [Google Scholar]
- Felzenszwalb, P.; McAllester, D.; Ramanan, D. A discriminatively trained, multiscale, deformable part model. In Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Chen, S.; Zhan, R.; Zhang, J. Geospatial Object Detection in Remote Sensing Imagery Based on Multiscale Single-Shot Detector with Activated Semantics. Remote Sens. 2018, 10, 820. [Google Scholar] [CrossRef]
- Fu, Y.; Wu, F.; Zhao, J. Context-Aware and Depthwise-based Detection on Orbit for Remote Sensing Image. In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018. [Google Scholar]
- Schilling, H.; Bulatov, D.; Niessner, R.; Middelmann, W.; Soergel, U. Detection of Vehicles in Multisensor Data via Multibranch Convolutional Neural Networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 4299–4316. [Google Scholar] [CrossRef]
- Hou, L.; Lu, K.; Xue, J.; Hao, L. Cascade detector with feature fusion for arbitrary-oriented objects in remote sensing images. In Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK, 6–10 July 2020. [Google Scholar]
- Qu, J.; Su, C.; Zhang, Z.; Razi, A. Dilated Convolution and Feature Fusion SSD Network for Small Object Detection in Remote Sensing Images. IEEE Access 2020, 8, 82832–82843. [Google Scholar] [CrossRef]
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Yang, X.; Sun, H.; Sun, X.; Yan, M.; Guo, Z.; Fu, K. Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multitask Rotation Region Convolutional Neural Network. IEEE Access 2018, 6, 50839–50849. [Google Scholar] [CrossRef]
- Yang, X.; Sun, H.; Fu, K.; Yang, J.; Sun, X.; Yan, M.; Guo, Z. Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks. Remote Sens. 2018, 10, 132. [Google Scholar] [CrossRef]
- Zou, F.; Xiao, W.; Ji, W.; He, K.; Yang, Z.; Song, J.; Zhou, H.; Li, K. Arbitrary-oriented object detection via dense feature fusion and attention model for remote sensing super-resolution image. Neural Comput. Appl. 2020, 32, 14549–14562. [Google Scholar] [CrossRef]
- Fu, K.; Chang, Z.; Zhang, Y.; Xu, G.; Zhang, K.; Sun, X. Rotation-aware and multi-scale convolutional neural network for object detection in remote sensing images. ISPRS J. Photogramm. Remote Sens. 2020, 161, 294–308. [Google Scholar] [CrossRef]
- Wang, P.; Sun, X.; Diao, W.; Fu, K. FMSSD: Feature-Merged Single-Shot Detection for Multiscale Objects in Large-Scale Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2019, 58, 3377–3390. [Google Scholar] [CrossRef]
- Zhang, Y.; You, Y.; Wang, R.; Liu, F.; Liu, J. Nearshore vessel detection based on Scene-mask R-CNN in remote sensing image. In Proceedings of the 2018 International Conference on Network Infrastructure and Digital Content (IC-NIDC), Guiyang, China, 22–24 August 2018; pp. 76–80. [Google Scholar]
- Li, Q.; Mou, L.; Jiang, K.; Liu, Q.; Wang, Y.; Zhu, X.X. Hierarchical Region Based Convolution Neural Network for Multiscale Object Detection in Remote Sensing Images. In Proceedings of the IGARSS 2018–2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 4355–4358. [Google Scholar]
- Yang, X.; Yang, J.; Yan, J.; Zhang, Y.; Zhang, T.; Guo, Z.; Sun, X.; Fu, K. SCRDet: Towards more robust detection for small, cluttered and rotated objects. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–3 November 2019; pp. 8232–8241. [Google Scholar]
- Li, C.; Luo, B.; Hong, H.; Su, X.; Wang, Y.; Liu, J.; Wang, C.; Zhang, J.; Wei, L. Object Detection Based on Global-Local Saliency Constraint in Aerial Images. Remote Sens. 2020, 12, 1435. [Google Scholar] [CrossRef]
- Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles. arXiv 2022, arXiv:2206.02424. [Google Scholar]
- Luo, W.; Zhang, Z.; Fu, P.; Wei, G.; Wang, D.; Li, X.; Shao, Q.; He, Y.; Wang, H.; Zhao, Z.; et al. Intelligent Grazing UAV Based on Airborne Depth Reasoning. Remote Sens. 2022, 14, 4188. [Google Scholar] [CrossRef]
- Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2020), Washington, DC, USA, 14–19 June 2020; pp. 10781–10790. [Google Scholar]
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11531–11539. [Google Scholar]
- Jiang, Z.; Zhao, L.; Li, S.; Jia, Y. Real-time object detection method for embedded devices. In Proceedings of the Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020. [Google Scholar]
- Han, K.; Wang, Y.H.; Tian, Q.; Guo, J.Y.; Xu, C.J.; Xu, C. GhostNet: More Features from Cheap Operations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 1577–1586. [Google Scholar]
- Li, K.; Wan, G.; Cheng, G.; Meng, L.; Han, J. Object detection in optical remote sensing images: A survey and a new benchmark. ISPRS J. Photogramm. Remote Sens. 2020, 159, 296–307. [Google Scholar] [CrossRef]
- Long, Y.; Gong, Y.; Xiao, Z.; Liu, Q. Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 2486–2498. [Google Scholar] [CrossRef]
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar]
- Everingham, M.; Eslami, S.M.A.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes Challenge: A Retrospective. Int. J. Comput. Vis. 2014, 111, 98–136. [Google Scholar] [CrossRef]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 8–16 October 2016; pp. 21–37. [Google Scholar]
- Cheng, G.; Si, Y.; Hong, H.; Yao, X.; Guo, L. Cross-Scale Feature Fusion for Object Detection in Optical Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 431–435. [Google Scholar] [CrossRef]
- Wang, J.; Gong, Z.; Liu, X.; Guo, H.; Yu, D.; Ding, L. Object Detection Based on Adaptive Feature-Aware Method in Optical Remote Sensing Images. Remote Sens. 2022, 14, 3616. [Google Scholar] [CrossRef]
- Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint triplets for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–3 November 2019; pp. 6568–6577. [Google Scholar]
- Lang, L.; Xu, K.; Zhang, Q.; Wang, D. Fast and Accurate Object Detection in Remote Sensing Images Based on Lightweight Deep Neural Network. Sensors 2021, 21, 5460. [Google Scholar] [CrossRef] [PubMed]
- Buzzy, M.; Thesma, V.; Davoodi, M.; Velni, J.M. Real-Time Plant Leaf Counting Using Deep Object Detection Networks. Sensors 2020, 20, 6896. [Google Scholar] [CrossRef] [PubMed]
- Arriaga, O.; Valdenegro-Toro, M.; Plöger, P. Real-time Convolutional Neural Networks for Emotion and Gender Classification. arXiv 2017, arXiv:1710.07557. [Google Scholar]
- Huang, Z.; Li, W.; Xia, X.-G.; Wang, H.; Jie, F.; Tao, R. LO-Det: Lightweight Oriented Object Detection in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 60, 3067470. [Google Scholar] [CrossRef]
- Wei, H.; Zhang, Y.; Chang, Z.; Li, H.; Wang, H.; Sun, X. Oriented objects as pairs of middle lines. ISPRS J. Photogramm. Remote Sens. 2020, 169, 268–279. [Google Scholar] [CrossRef]
- Xu, T.; Sun, X.; Diao, W.; Zhao, L.; Fu, K.; Wang, H. ASSD: Feature Aligned Single-Shot Detection for Multiscale Objects in Aerial Imagery. IEEE Trans. Geosci. Remote Sens. 2021, 60, 3089170. [Google Scholar] [CrossRef]
- Cheng, G.; Zhou, P.; Han, J. Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7405–7415. [Google Scholar] [CrossRef]
- Li, K.; Cheng, G.; Bu, S.; You, X. Rotation-Insensitive and Context-Augmented Object Detection in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2017, 56, 2337–2348. [Google Scholar] [CrossRef]
- Cheng, G.; Zhou, P.; Han, J. Rifd-cnn: Rotation-invariant and fisher discriminative convolutional neural networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2884–2893. [Google Scholar]
- Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. arXiv 2019, arXiv:1905.05055. [Google Scholar] [CrossRef]
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L.S. Soft-NMS: Improving object detection with one line of code. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [Google Scholar]
Module | GPU_mem (GB) | Forward (ms) | Backward (ms) | Input | Output
---|---|---|---|---|---
SPPF-G | 4.401 | 54.84 | 114.3 | (16,1024,64,64) | (16,1024,64,64) |
SPP | 4.586 | 80.9 | 171.2 | (16,1024,64,64) | (16,1024,64,64) |
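The measurement protocol behind these numbers is not specified in this version; one plausible way to obtain per-module forward/backward latencies and peak GPU memory on CUDA is sketched below. The function name, warm-up, and iteration counts are illustrative assumptions.

```python
import torch

def time_fwd_bwd(module, x, iters=100, warmup=10):
    """Rough per-call forward/backward latency (ms) and peak memory (GB),
    measured with CUDA events. A sketch, not the authors' protocol."""
    module = module.cuda().train()
    x = x.cuda().requires_grad_(True)
    for _ in range(warmup):                  # warm up kernels and allocator
        module(x).sum().backward()
    start, mid, end = (torch.cuda.Event(enable_timing=True) for _ in range(3))
    fwd_ms = bwd_ms = 0.0
    for _ in range(iters):
        start.record()
        y = module(x)
        mid.record()
        y.sum().backward()
        end.record()
        torch.cuda.synchronize()             # wait so event timings are valid
        fwd_ms += start.elapsed_time(mid)
        bwd_ms += mid.elapsed_time(end)
    mem_gb = torch.cuda.max_memory_allocated() / 1024 ** 3
    return fwd_ms / iters, bwd_ms / iters, mem_gb

# e.g., time_fwd_bwd(SPPFG(1024, 1024), torch.randn(16, 1024, 64, 64))
```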
Dataset | Categories | Images | Instances | Year | Train:Val:Test Split |
---|---|---|---|---|---|
DIOR | 20 | 23,463 | 192,472 | 2018 | 1:1:2 |
RSOD | 4 | 976 | 6950 | 2017 | 7:1:2 |
DOTAv1.0 | 15 | 20,889 | 188,282 | 2017 | 7:1:2 |
VOC12 | 20 | 17,125 | 40,138 | 2012 | 7:1:2 |
Dataset | Anchors (w, h) |
---|---|
DIOR | (7,7)(8,16)(14,28)(19,11)(26,47)(36,20)(57,60)(107,128)(278,280) |
RSOD | (14,15)(22,23)(30,29)(36,39)(47,48)(55,62)(68,74)(80,92)(223,275) |
DOTAv1.0 | (11,10)(20,22)(29,41)(43,25)(44,98)(47,45)(78,63)(100,106)(169,202) |
VOC12 | (21,40)(52,68)(62,141)(97,272)(159,155)(177,380)(303,495)(372,267)(549,556) |
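The anchor-selection procedure is not reproduced here; YOLO-family detectors typically derive such dataset-specific anchors by IoU-based k-means clustering of the training-set box sizes, as popularized by YOLOv2/v3. A minimal sketch under that assumption, with `boxes` standing for an (N, 2) array of label widths and heights:

```python
import numpy as np

def kmeans_anchors(wh, k=9, iters=300, seed=0):
    """IoU-based k-means over (w, h) box sizes: each box is assigned to
    the anchor it overlaps most (boxes compared as if co-centered), and
    anchors are updated as cluster means."""
    wh = np.asarray(wh, dtype=np.float64)
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        inter = (np.minimum(wh[:, None, 0], centers[None, :, 0]) *
                 np.minimum(wh[:, None, 1], centers[None, :, 1]))
        union = (wh[:, None, 0] * wh[:, None, 1] +
                 centers[None, :, 0] * centers[None, :, 1] - inter)
        assign = np.argmax(inter / union, axis=1)   # max IoU = min (1 - IoU)
        new_centers = np.array([wh[assign == j].mean(axis=0)
                                if np.any(assign == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):       # converged
            break
        centers = new_centers
    return centers[np.argsort(centers.prod(axis=1))]  # sort small to large

# anchors = kmeans_anchors(boxes, k=9)  # boxes: (N, 2) widths/heights
```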
Parameter | Configuration |
---|---|
Operating System | Ubuntu 18.04 |
CPU | Intel Core i9-9900X CPU @3.50 GHz × 20 |
GPU | NVIDIA RTX 2080 Ti |
Language | Python 3.8 |
Framework | PyTorch |
Metric | Baseline | +S | +S+B | +S+B+E | +S+B+E+R
---|---|---|---|---|---
Precision | 42.5 | 49.8 | 50.5 | 51.0 | 44.6
Recall | 61.4 | 66.2 | 69.7 | 71.4 | 68.4
F1 | 50.2 | 56.8 | 58.6 | 59.5 | 54.0
mAP@0.5 | 53.0 | 62.5 | 65.5 | 66.6 | 63.2
mAP@0.5:0.95 | 30.2 | 35.4 | 38.0 | 39.2 | 36.5
FPS | 270 | 258 | 220 | 209 | 222
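The metrics above follow the standard object-detection definitions, with TP, FP, and FN counted at a given IoU threshold:

$$
\mathrm{Precision}=\frac{TP}{TP+FP},\qquad
\mathrm{Recall}=\frac{TP}{TP+FN},\qquad
F1=\frac{2\times\mathrm{Precision}\times\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}
$$

$$
\mathrm{AP}=\int_{0}^{1}P(R)\,\mathrm{d}R,\qquad
\mathrm{mAP}=\frac{1}{N}\sum_{i=1}^{N}\mathrm{AP}_{i}
$$

mAP@0.5 is evaluated at an IoU threshold of 0.5; mAP@0.5:0.95 averages mAP over IoU thresholds from 0.5 to 0.95 in steps of 0.05; FPS denotes inference frames per second.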
Methods | Backbone | mAP@0.5 | mAP@0.5:0.95 | FPS | GPU | Year
---|---|---|---|---|---|---|
Simple-CNN | VGG16 | 66.5 | — | 13 | GT710 | 2021 |
CSFF | ResNet-101 | 68.0 | — | 15 | RTX-3090 | 2021 |
CF2PN [47] | VGG16 | 67.3 | — | 20 | RTX2080Ti | 2021 |
AFADet | VGG16 | 66.1 | — | 26 | RTX2080Ti | 2022 |
YOLOv5l | — | 71.2 | 45.8 | 28 | GTX1070Ti | 2020 |
ASSD-Lite [48] | MobileNetv2 | 63.3 | — | 35 | GTX1080Ti | 2021 |
YOLOv5m | Focus-CSP-SPP | 69.8 | 44.6 | 46 | GTX1070Ti | 2020 |
AFADet-300 | VGG16 | 57.4 | — | 61 | RTX2080Ti | 2022 |
YOLOv4 | CSPDarkNet53 | 74.4 | 51.4 | 62 | RTX2080Ti | 2020 |
LO-Det | MobileNetv2 | 58.7 | — | 65 | RTX3090 | 2021 |
YOLOv3 | DarkNet53 | 57.1 | 49.9 | 89 | RTX2080Ti | 2018 |
YOLOv5s | CSP Focus | 65.4 | 39.0 | 118 | GTX1070Ti | 2020 |
FANet | 17-layer-CNN | 56.5 | — | 228 | RTX2080Ti | 2021 |
YOLOv4-tiny | CSPDarkNet53tiny | 53.0 | 30.2 | 270 | RTX2080Ti | 2020 |
YOLOv3-tiny | — | 51.0 | 26.0 | 275 | RTX2080Ti | 2018 |
BSFCDet | CSPDarkNet53tiny-R | 63.2 | 36.5 | 222 | RTX2080Ti | 2022 |
Class | R-CNN | RICNN [49] | RICAOD [50] | YOLOv3-Tiny | YOLOv4-Tiny | Faster R-CNN
---|---|---|---|---|---|---
Backbone | VGG16 | VGG16 | VGG16 | — | CSPDarkNet53tiny | VGG16 |
Airplane | 35.6 | 39.1 | 42.2 | 73.3 | 58.7 | 53.6 |
Airport | 43.0 | 61.0 | 69.7 | 25.5 | 56.1 | 49.3 |
Expressway toll station | 33.5 | 36.6 | 49.3 | 50.2 | 48.7 | 55.2 |
Basketball court | 62.3 | 66.3 | 79.0 | 82.5 | 74.4 | 66.2 |
Bridge | 15.6 | 25.3 | 27.7 | 22.1 | 22.5 | 28.0 |
Chimney | 53.7 | 63.3 | 68.9 | 71.7 | 72.0 | 70.9 |
Dam | 33.7 | 41.1 | 50.1 | 22.7 | 47.5 | 62.3 |
Expressway service area | 50.2 | 51.7 | 60.5 | 44.0 | 54.7 | 69.0 |
Baseball field | 53.8 | 60.1 | 62.0 | 75.1 | 71.8 | 78.8 |
Golf course | 50.1 | 55.9 | 64.4 | 28.5 | 60.4 | 68.0 |
Overpass | 30.9 | 39.0 | 46.8 | 40.7 | 46.8 | 50.1 |
Harbor | 39.5 | 43.5 | 42.3 | 34.8 | 51.3 | 50.2 |
Ground track field | 49.3 | 58.9 | 65.3 | 56.5 | 64.5 | 56.9 |
Tennis court | 54.0 | 63.5 | 70.3 | 81.8 | 79.8 | 75.2 |
Stadium | 60.8 | 61.1 | 53.5 | 63.7 | 55.4 | 73.0 |
Storage tank | 18.0 | 19.1 | 24.5 | 52.7 | 37.0 | 39.8 |
Ship | 9.1 | 9.1 | 11.7 | 66.7 | 42.2 | 27.7 |
Train station | 36.1 | 46.1 | 53.3 | 22.7 | 36.3 | 38.6 |
Vehicle | 9.1 | 11.4 | 20.4 | 33.0 | 26.5 | 23.6 |
Windmill | 16.4 | 31.5 | 56.2 | 72.4 | 52.6 | 45.4 |
mAP | 37.7 | 44.2 | 50.9 | 51.0 | 53.0 | 54.1 |
Class | RIFD-CNN [51] | YOLOv3 | SSD | Faster R-CNN with FPN | BSFCDet
---|---|---|---|---|---
Backbone | VGG16 | DarkNet53 | VGG16 | ResNet-50 | CSPDarkNet53tiny-R | |
Airplane | 56.6 | 72.2 | 59.5 | 54.1 | 79.0 | |
Airport | 53.2 | 29.2 | 72.7 | 71.4 | 62.7 | |
Expressway toll station | 56.0 | 54.4 | 53.1 | 62.1 | 57.4 | |
Basketball court | 69.0 | 78.6 | 75.7 | 81.0 | 86.0 | |
Bridge | 29.0 | 31.2 | 29.7 | 42.6 | 29.7 | |
Chimney | 71.5 | 69.7 | 65.8 | 72.5 | 74.0 | |
Dam | 63.1 | 26.9 | 56.6 | 57.5 | 39.9 | |
Expressway service area | 69.0 | 48.6 | 63.5 | 68.7 | 62.5 | |
Baseball field | 79.9 | 74.0 | 72.4 | 63.3 | 82.0 | |
Golf course | 68.9 | 31.1 | 65.3 | 73.1 | 65.4 | |
Overpass | 51.1 | 49.7 | 48.1 | 56.0 | 50.8 | |
Harbor | 51.2 | 44.9 | 49.4 | 42.8 | 50.2 | |
Ground track field | 62.4 | 61.1 | 68.6 | 76.5 | 66.6 | |
Tennis court | 79.5 | 87.3 | 76.3 | 81.2 | 85.5 | |
Stadium | 73.6 | 70.6 | 61.0 | 57.0 | 67.2 | |
Storage tank | 41.5 | 68.7 | 46.6 | 53.5 | 63.8 | |
Ship | 31.7 | 87.4 | 59.2 | 71.8 | 83.8 | |
Train station | 40.1 | 29.4 | 55.1 | 53.0 | 44.1 | |
Vehicle | 28.5 | 48.3 | 27.4 | 43.1 | 42.1 | |
Windmill | 46.9 | 78.7 | 65.7 | 80.9 | 71.7 | |
mAP | 56.1 | 57.1 | 58.6 | 63.1 | 63.2 |
Methods | Aircraft (AI) | Oil Tank (OI) | Overpass (OV) | Playground (PL) | mAP@0.5 | mAP@0.5:0.95 | FPS
---|---|---|---|---|---|---|---|
CenterNet | — | — | — | — | 75.4 | — | 88 |
YOLOv3-tiny | 90.7 | 92.2 | 61.3 | 88.9 | 83.3 | 51.2 | 276 |
SSD300 | 70.1 | 90.3 | 78.4 | 100 | 84.7 | — | 54 |
FANet | 87.1 | 99.0 | 56.6 | 97.9 | 85.1 | — | 228 |
YOLOv4-tiny | 90.9 | 91.8 | 68.0 | 90.0 | 85.2 | 53.7 | 274 |
Faster R-CNN | 86.4 | 88.3 | 80.2 | 91.2 | 86.5 | — | 18 |
AFADet-300 | 69.8 | 96.9 | 93.4 | 99.9 | 90.0 | — | 61 |
YOLOv3 | 97.4 | 97.6 | 71.0 | 94.6 | 90.1 | 64.0 | 80 |
YOLOv5s | — | — | — | — | 90.8 | — | 79 |
YOLOv4 | 98.4 | 98.5 | 76.3 | 98.4 | 92.9 | 65.7 | 60 |
BSFCDet | 95.7 | 97.8 | 81.7 | 96.0 | 92.8 | 58.5 | 227 |
Methods | mAP@0.5 | mAP@0.5:0.95 | FPS
---|---|---|---|
YOLOv4 | 95.1 | 77.3 | 61 |
YOLOv3 | 93.2 | 75.6 | 86 |
YOLOv4-tiny | 85.3 | 57.6 | 263 |
YOLOv3-tiny | 84.3 | 52.0 | 280
BSFCDet | 91.3 | 61.4 | 240 |
Methods | mAP@0.5 | mAP@0.5:0.95 | FPS
---|---|---|---|
YOLOv3-tiny | 44.0 | 20.7 | 283
YOLOv4-tiny | 46.5 | 23.3 | 284 |
R-CNN | 53.3 | — | 50 |
YOLOv1 | 57.9 | — | 95 |
SPP-Net | 58.1 | — | 51 |
YOLOv3 | 66.0 | 47.0 | 78
YOLOv4 | 68.3 | 49.1 | 71 |
BSFCDet | 58.5 | 36.3 | 238 |