Long-Tailed Object Detection for Multimodal Remote Sensing Images
"> Figure 1
<p>Comparison of visible light and infrared images. (<b>a</b>) Image examples from the LLVIP dataset. In low-light environments, infrared images can provide more information than visible light images. (<b>b</b>) Image examples from the VEDAI dataset. In good lighting conditions, visible light images can provide more color, texture and other detail information.</p> "> Figure 2
<p>A combination plot of instance number and detection accuracy on the VEDAI dataset.</p> "> Figure 3
<p>An overview of the model’s overall framework. Our model is based on YOLOv8 architecture, and the main contributions are as follows: (1) we propose the dynamic feature fusion module to achieve efficient and accurate fusion of multi-source images; (2) the proposed instance-balanced mosaic reduces the underfitting of the model for tail instances by balancing the sampling of instances; (3) class-balanced BCE loss not only considers the learning difficulty of instances, but also balances the learning difficulty between classes.</p> "> Figure 4
<p>Dynamic feature fusion module. It consists of two parts: a calculation of image entropy module and a multimodal feature fusion module composed of multiple Conv layers. Among them, IR represents infrared image and RGB represents visible light image.</p> "> Figure 5
<p>A schematic diagram of the instance-balanced mosaic method. It consists of three steps: first, generating a gray image as the image to be stitched and randomly selecting four images; second, randomly selecting a stitching center point in the gray image, and calculating the stitching point and direction of the selected images in turn; third, combining the four images into the gray image according to the stitching point and direction to generate a new image.</p> "> Figure 6
<p>(<b>a</b>) Class distribution diagram of FLIR dataset. (<b>b</b>) Class distribution diagram of VEDAI dataset. (Because the LLVIP dataset has only one category, this figure does not show it).</p> "> Figure 7
<p>(<b>a</b>) Normalized confusion matrix of the baseline methods. (<b>b</b>) Our method.</p> "> Figure 8
<p>(<b>a</b>) Normalized confusion matrix of the baseline method. (<b>b</b>) Normalized confusion matrix of our method.</p> "> Figure 9
<p>(<b>a</b>–<b>e</b>) Detection visualization on FLIR dataset. Yellow ellipses represent the targets missed by the model in detection, and purple ellipses indicate the targets falsely detected by the model.</p> "> Figure 10
<p>Parameter analysis of hyperparameter <math display="inline"><semantics> <mi>γ</mi> </semantics></math> on VEDAI and FLIR datasets.</p> ">
Abstract
1. Introduction
- We propose a dynamic feature fusion module based on image information entropy, which dynamically adjusts the fusion coefficients according to the information entropy of each input image, enabling the model to capture more features. Compared with similar methods, this module significantly improves detection accuracy without a significant increase in computational complexity; it is also simple and can easily be inserted into other object detection networks to fuse features from multi-source remote sensing images (a minimal sketch is given after this list).
- We propose an instance-balanced mosaic data augmentation method based on instance number, which alleviates the long-tailed problem by resampling tail-class instances during data augmentation, providing the model with rich tail-class features (see the second sketch after this list).
- We propose a class-balanced BCE loss for long-tailed object detection. This loss weights the loss signal according to the number of samples in each class, balancing the learning difficulty across classes and improving detection accuracy on tail instances (see the third sketch after this list).
- We conduct extensive experiments on three public benchmark datasets to verify the performance of our method. Compared with the baseline methods, our method substantially improves performance; the experimental results and ablation analysis demonstrate the effectiveness of each proposed component.
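To make the first contribution concrete, the following is a minimal sketch of entropy-guided fusion weighting. It assumes the fusion coefficients are made proportional to the Shannon entropy of each modality's intensity histogram, which is one plausible reading of Section 3.2; the paper's exact normalization, and where the coefficients enter the Conv fusion layers, are defined there. The function names here are illustrative only.

```python
import numpy as np

def image_entropy(gray: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy (bits) of a grayscale image's intensity histogram."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist.astype(np.float64) / hist.sum()
    p = p[p > 0]                           # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def entropy_fusion_weights(ir: np.ndarray, rgb_gray: np.ndarray):
    """Fusion coefficients proportional to each modality's entropy (assumption)."""
    h_ir, h_rgb = image_entropy(ir), image_entropy(rgb_gray)
    s = h_ir + h_rgb
    return h_ir / s, h_rgb / s

# Illustrative use: weight the two modality feature maps before the Conv fusion
# layers of the dynamic feature fusion module.
# w_ir, w_rgb = entropy_fusion_weights(ir_img, rgb_gray_img)
# fused = w_ir * feat_ir + w_rgb * feat_rgb
```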
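The instance-balanced mosaic (Section 3.3) follows the three steps in the Figure 5 caption: create a gray canvas, pick a random stitching center, and paste four sampled images around it. The sketch below is an illustration under stated assumptions: the tail-biased sampling weight (inverse frequency of the rarest class in an image) is our guess at the resampling rule, each source image is assumed to be at least canvas-sized, and bounding-box remapping is omitted.

```python
import random
import numpy as np

def sample_tail_biased(image_ids, labels_per_image, class_counts, k=4):
    """Sample k images, favoring those that contain rare (tail) classes."""
    weights = [max((1.0 / class_counts[c] for c in labels_per_image[i]),
                   default=1e-3)          # small weight for unlabeled images
               for i in image_ids]
    return random.choices(image_ids, weights=weights, k=k)

def mosaic(images, out_size=640, fill=114):
    """Stitch four images around a random center on a gray canvas."""
    canvas = np.full((out_size, out_size, 3), fill, dtype=np.uint8)
    cx = random.randint(out_size // 4, 3 * out_size // 4)
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    quadrants = [(slice(0, cy), slice(0, cx)),                # top-left
                 (slice(0, cy), slice(cx, out_size)),         # top-right
                 (slice(cy, out_size), slice(0, cx)),         # bottom-left
                 (slice(cy, out_size), slice(cx, out_size))]  # bottom-right
    for img, (ys, xs) in zip(images, quadrants):
        h, w = ys.stop - ys.start, xs.stop - xs.start
        canvas[ys, xs] = img[:h, :w]  # crop to fit; boxes must be remapped too
    return canvas
```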
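For the class-balanced BCE loss (Section 3.4), here is a hedged PyTorch sketch. We assume the per-class weight grows with inverse class frequency, modulated by the hyperparameter γ analyzed in Figure 10; the paper's exact weighting formula is given in Section 3.4, and this version only illustrates the idea.

```python
import torch
import torch.nn.functional as F

def class_balanced_bce(logits, targets, class_counts, gamma=0.5):
    """BCE whose per-class weight grows as the class gets rarer.

    logits, targets: (N, C) float tensors; class_counts: per-class instance
    numbers. Inverse-frequency weighting with exponent gamma is an assumption.
    """
    counts = torch.as_tensor(class_counts, dtype=torch.float32)
    w = (counts.sum() / counts) ** gamma  # rarer class -> larger weight
    w = w / w.mean()                      # keep the overall loss scale stable
    # weight of shape (C,) broadcasts across the batch dimension
    return F.binary_cross_entropy_with_logits(logits, targets, weight=w)
```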
2. Related Work
2.1. Long-Tailed Object Detection
2.2. Multimodal Feature Fusion
3. Methodology
3.1. Basic Architecture
3.2. Dynamic Feature Fusion Module
3.2.1. Calculation of Image Entropy
3.2.2. Multimodal Feature Fusion
3.3. Instance-Balanced Mosaic
3.4. Class-Balanced BCE Loss
4. Experiment and Results
4.1. Dataset
4.2. Implementation Details
4.3. Accuracy Metrics
4.4. Comparisons with Previous Methods
4.4.1. Experiment Results of the VEDAI Dataset
4.4.2. Experiment Results of the FLIR Dataset
4.4.3. Experiment Results of the LLVIP Dataset
5. Discussion
5.1. Ablation Study
5.1.1. Ablation Studies for Each Module
5.1.2. Impact of Instance-Balanced Mosaic
5.2. Parameter Analysis
5.3. Effect of Operators
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Earthdata. What Is Remote Sensing?|Earthdata. 2021. Available online: https://www.earthdata.nasa.gov/learn/backgrounders/remote-sensing (accessed on 11 September 2023).
- Chi, M.; Plaza, A.; Benediktsson, J.A.; Sun, Z.; Shen, J.; Zhu, Y. Big Data for Remote Sensing: Challenges and Opportunities. Proc. IEEE 2016, 104, 2207–2219. [Google Scholar] [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 8–16 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 91–99. [Google Scholar]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Janakiramaiah, B.; Kalyani, G.; Karuna, A.; Prasad, L.N.; Krishna, M. Military object detection in defense using multi-level capsule networks. Soft Comput. 2021, 27, 1045–1059. [Google Scholar] [CrossRef]
- Kadhim, N.; Mourshed, M.; Bray, M. Advances in remote sensing applications for urban sustainability. Euro-Mediterr. J. Environ. Integr. 2016, 1, 1–15. [Google Scholar]
- Rezaei, M.; Azarmi, M.; Pour Mir, F.M. Traffic-Net: 3D Traffic Monitoring Using a Single Camera. arXiv 2021, arXiv:2109.09165. [Google Scholar] [CrossRef]
- Ma, T.J. Remote sensing detection enhancement. J. Big Data 2021, 8, 1–13. [Google Scholar] [CrossRef]
- Liu, Z.; Miao, Z.; Zhan, X.; Wang, J.; Gong, B.; Yu, S.X. Large-scale long-tailed recognition in an open world. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2537–2546. [Google Scholar]
- Li, B.; Yao, Y.; Tan, J.; Zhang, G.; Yu, F.; Lu, J.; Luo, Y. Equalized focal loss for dense long-tailed object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1–10. [Google Scholar]
- Zang, Y.; Zhou, K.; Huang, C.; Loy, C.C. Semi-Supervised and Long-Tailed Object Detection with CascadeMatch. Int. J. Comput. Vis. 2023, 131, 987–1001. [Google Scholar] [CrossRef]
- Wang, T.; Zhu, Y.; Zhao, C.; Zeng, W.; Wang, J.; Tang, M. Adaptive class suppression loss for long-tail object detection. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 3103–3112. [Google Scholar]
- Zhao, W.; Liu, J.; Liu, Y.; Zhao, F.; He, Y.; Lu, H. Teaching teachers first and then student: Hierarchical distillation to improve long-tailed object recognition in aerial images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12. [Google Scholar] [CrossRef]
- Duan, Y.; Liu, X.; Jatowt, A.; Yu, H.T.; Lynden, S.; Kim, K.S.; Matono, A. Long-Tailed Graph Representation Learning via Dual Cost-Sensitive Graph Convolutional Network. Remote Sens. 2022, 14, 3295. [Google Scholar] [CrossRef]
- Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef]
- YOLOv8. 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 11 September 2023).
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Liu, L.; Ouyang, W.; Wang, X.; Fieguth, P.; Chen, J.; Liu, X.; Pietikäinen, M. Deep learning for generic object detection: A survey. Int. J. Comput. Vis. 2020, 128, 261–318. [Google Scholar] [CrossRef]
- Yaman, B.; Mahmud, T.; Liu, C.H. Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection. arXiv 2023, arXiv:2305.08069. [Google Scholar]
- Li, Y.; Wang, T.; Kang, B.; Tang, S.; Wang, C.; Li, J.; Feng, J. Overcoming classifier imbalance for long-tail object detection with balanced group softmax. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10991–11000. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Tan, J.; Wang, C.; Li, B.; Li, Q.; Ouyang, W.; Yin, C.; Yan, J. Equalization loss for long-tailed object recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11662–11671. [Google Scholar]
- Liu, J.; Fan, X.; Jiang, J.; Liu, R.; Luo, Z. Learning a Deep Multi-Scale Feature Ensemble and an Edge-Attention Guidance for Image Fusion. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 105–119. [Google Scholar] [CrossRef]
- Zhang, J.; Lei, J.; Xie, W.; Fang, Z.; Li, Y.; Du, Q. SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15. [Google Scholar] [CrossRef]
- Chen, C.; Zhao, X.; Wang, J.; Li, D.; Guan, Y.; Hong, J. Dynamic graph convolutional network for assembly behavior recognition based on attention mechanism and multi-scale feature fusion. Sci. Rep. 2022, 12, 7394. [Google Scholar] [CrossRef]
- Li, J.; Li, B.; Jiang, Y.; Cai, W. MSAt-GAN: A generative adversarial network based on multi-scale and deep attention mechanism for infrared and visible light image fusion. Complex Intell. Syst. 2022, 8, 4753–4781. [Google Scholar] [CrossRef]
- Ai, Y.; Liu, X.; Zhai, H.; Li, J.; Liu, S.; An, H.; Zhang, W. Multi-Scale Feature Fusion with Attention Mechanism Based on CGAN Network for Infrared Image Colorization. Appl. Sci. 2023, 13, 4686. [Google Scholar] [CrossRef]
- Ahmed, M.R.; Ashrafi, A.F.; Ahmed, R.U.; Shatabda, S.; Islam, A.M.; Islam, S. DoubleU-NetPlus: A novel attention and context-guided dual U-Net with multi-scale residual feature fusion network for semantic segmentation of medical images. Neural Comput. Appl. 2023, 35, 14379–14401. [Google Scholar] [CrossRef]
- YOLOv5. 2021. Available online: https://github.com/ultralytics/yolov5 (accessed on 14 September 2023).
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000. [Google Scholar] [CrossRef]
- Sobel, I. Camera Models and Machine Perception; Technical Report; Stanford University: Stanford, CA, USA, 1968. [Google Scholar]
- Prewitt, J.M. Object enhancement and extraction. Pict. Process. Psychopictorics 1970, 10, 15–19. [Google Scholar]
- Roberts, L.G. Machine perception of three-dimensional solids. In Optical and Electro-Optical Information Processing; Massachusetts Institute of Technology: Cambridge, MA, USA, 1963; pp. 159–197. [Google Scholar]
- Kapur, J.; Sahoo, P.; Wong, A. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. 1986, 29, 273–285. [Google Scholar] [CrossRef]
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012. [Google Scholar]
- Feng, C.; Zhong, Y.; Gao, Y.; Scott, M.R.; Huang, W. TOOD: Task-aligned one-stage object detection. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar]
- Zhang, H.; Fromont, E.; Lefevre, S.; Avignon, B. Multispectral fusion for object detection with cyclic fuse-and-refine blocks. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Virtual, 25–28 October 2020; pp. 1016–1020. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Sharma, M.; Dhanaraj, M.; Karnam, S.; Chachlakis, D.G.; Ptucha, R.; Markopoulos, P.P.; Saber, E. YOLOrs: Object Detection in Multimodal Remote Sensing Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 1497–1508. [Google Scholar] [CrossRef]
- Pham, M.T.; Courtrai, L.; Friguet, C.; Lefèvre, S.; Baussard, A. YOLO-Fine: One-Stage Detector of Small Objects under Various Backgrounds in Remote Sensing Images. Remote Sens. 2020, 12, 2501. [Google Scholar] [CrossRef]
- Fang, Q.; Wang, Z. Cross-Modality Attentive Feature Fusion for Object Detection in Multispectral Remote Sensing Imagery. Pattern Recognit. 2022, 130, 108786. [Google Scholar]
- Zhang, H.; Fromont, E.; Lefevre, S.; Avignon, B. Guided attentive feature fusion for multispectral pedestrian detection. In Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021. [Google Scholar] [CrossRef]
- Wang, K.; Fang, B.; Qian, J.; Yang, S.; Zhou, X.; Zhou, J. Perspective transformation data augmentation for object detection. IEEE Access 2019, 8, 4935–4943. [Google Scholar] [CrossRef]
| Dataset | #Classes | Image Size | #Train | #Test |
|---|---|---|---|---|
| FLIR | 3 | 640 × 512 | 4129 | 1013 |
| LLVIP | 1 | 1280 × 1024 | 12,025 | 3463 |
| VEDAI | 8 | 1024 × 1024 | 1089 | 121 |
| Model | Data | Car | Pickup | Camping | Truck | Other | Tractor | Boat | Van | Total / mAP50 |
|---|---|---|---|---|---|---|---|---|---|---|
| Instances | – | 134 | 95 | 39 | 30 | 20 | 19 | 17 | 10 | 364 |
| YOLOv3 [41] | IR | 80.2 | 67.0 | 65.6 | 47.8 | 25.9 | 40.1 | 32.7 | 53.3 | 51.5 |
| | RGB | 83.1 | 71.5 | 69.1 | 59.3 | 48.9 | 67.3 | 33.5 | 55.7 | 61.1 |
| | Multi | 84.6 | 72.7 | 67.1 | 62.0 | 43.0 | 65.2 | 37.1 | 58.3 | 61.3 |
| YOLOv4 [20] | IR | 80.5 | 67.9 | 68.8 | 53.7 | 30.0 | 44.2 | 25.4 | 51.4 | 52.8 |
| | RGB | 83.7 | 73.4 | 71.2 | 59.1 | 51.7 | 65.9 | 34.3 | 60.3 | 62.4 |
| | Multi | 85.5 | 72.8 | 72.4 | 62.8 | 48.9 | 69.0 | 34.3 | 54.7 | 62.6 |
| YOLOv5s [32] | IR | 77.3 | 65.3 | 66.5 | 51.6 | 25.9 | 42.4 | 21.9 | 48.9 | 49.9 |
| | RGB | 80.1 | 58.0 | 66.1 | 51.5 | 45.8 | 64.4 | 21.6 | 40.9 | 54.8 |
| | Multi | 80.8 | 68.5 | 69.1 | 54.7 | 46.8 | 64.3 | 24.3 | 46.0 | 56.8 |
| YOLOrs [42] | IR | 82.0 | 73.9 | 63.8 | 54.2 | 44.0 | 54.4 | 22.0 | 43.4 | 54.7 |
| | RGB | 85.3 | 72.9 | 70.3 | 50.7 | 42.7 | 76.8 | 18.7 | 38.9 | 57.0 |
| | Multi | 84.2 | 78.3 | 68.8 | 52.6 | 46.8 | 67.9 | 21.5 | 57.9 | 59.7 |
| YOLO-Fine [43] | IR | 76.8 | 74.4 | 64.7 | 63.5 | 45.0 | 78.1 | 70.0 | 77.9 | 68.2 |
| | RGB | 79.7 | 74.5 | 77.1 | 81.0 | 37.3 | 70.7 | 60.8 | 63.6 | 68.8 |
| YOLOFusion [44] | IR | 86.7 | 75.9 | 66.6 | 77.1 | 43.0 | 62.3 | 70.7 | 84.3 | 70.8 |
| | RGB | 91.1 | 82.3 | 75.1 | 78.3 | 33.3 | 81.2 | 71.8 | 62.2 | 71.9 |
| | Multi | 91.7 | 85.9 | 78.9 | 78.1 | 54.7 | 71.9 | 71.7 | 75.2 | 75.9 |
| SuperYOLO [27] | IR | 87.9 | 81.4 | 76.9 | 61.6 | 39.4 | 60.6 | 46.1 | 71.0 | 65.6 |
| | RGB | 90.3 | 82.7 | 76.7 | 68.6 | 53.9 | 79.5 | 58.1 | 70.3 | 72.5 |
| | Multi | 91.1 | 85.7 | 79.3 | 70.2 | 57.3 | 80.4 | 60.2 | 76.5 | 75.1 |
| Ours | IR | 88.9 | 87.4 | 81.3 | 78.4 | 60.2 | 80.7 | 70.1 | 75.7 | 76.6 |
| | RGB | 88.4 | 80.1 | 77.7 | 78.8 | 51.3 | 69.8 | 64.0 | 71.4 | 72.7 |
| | Multi | 91.6 | 87.2 | 84.1 | 85.1 | 72.0 | 84.7 | 75.7 | 80.7 | 81.0 |
| Model | Data | Backbone | mAP50 | mAP50:95 |
|---|---|---|---|---|
| Faster R-CNN [5] | RGB | ResNet50 | 64.9 | 28.9 |
| Faster R-CNN [5] | IR | ResNet50 | 74.4 | 37.6 |
| SSD [4] | RGB | VGG16 | 52.2 | 21.8 |
| SSD [4] | IR | VGG16 | 65.5 | 29.6 |
| YOLOv3 [41] | RGB | Darknet53 | 58.3 | 25.7 |
| YOLOv3 [41] | IR | Darknet53 | 73.6 | 36.8 |
| YOLOv5 [32] | RGB | CSPD53 | 67.8 | 31.8 |
| YOLOv5 [32] | IR | CSPD53 | 73.9 | 39.5 |
| YOLOv8 [19] | RGB | - | 60.7 | 28.8 |
| YOLOv8 [19] | IR | - | 73.7 | 38.3 |
| CFR3 [40] | Multi | VGG16 | 72.4 | - |
| GAFF [45] | Multi | ResNet18 | 72.9 | 37.5 |
| YOLOFusion [44] | Multi | CFB | 78.7 | 40.2 |
| SuperYOLO [27] | Multi | YOLO | 74.6 | 39.4 |
| Ours | Multi | YOLO | 78.5 | 42.6 |
| Model | Data | Backbone | mAP50 | mAP50:95 |
|---|---|---|---|---|
| Faster R-CNN [5] | RGB | ResNet50 | 91.4 | 49.2 |
| Faster R-CNN [5] | IR | ResNet50 | 96.1 | 61.1 |
| SSD [4] | RGB | VGG16 | 82.6 | 39.8 |
| SSD [4] | IR | VGG16 | 90.2 | 53.5 |
| YOLOv3 [41] | RGB | Darknet53 | 87.1 | - |
| YOLOv3 [41] | IR | Darknet53 | 94.0 | - |
| YOLOv5 [32] | RGB | CSPD53 | 90.8 | - |
| YOLOv5 [32] | IR | CSPD53 | 96.5 | - |
| YOLOv8 [19] | RGB | - | 92.5 | 54.1 |
| YOLOv8 [19] | IR | - | 96.6 | 63.2 |
| YOLOFusion [44] | Multi | CFB | 97.5 | 63.6 |
| SuperYOLO [27] | Multi | YOLO | 96.7 | 65.2 |
| Ours | Multi | YOLO | 97.7 | 67.6 |
| Dataset | DFF | IBM | CBB | mAP50 | mAP50:95 |
|---|---|---|---|---|---|
| VEDAI | | | | 72.7 | 44.3 |
| | ✓ | | | 75.4 | 50.1 |
| | ✓ | ✓ | | 77.7 | 51.4 |
| | ✓ | ✓ | ✓ | 81.0 | 54.5 |
| FLIR | | | | 60.7 | 28.8 |
| | ✓ | | | 75.2 | 40.4 |
| | ✓ | ✓ | | 76.2 | 41.3 |
| | ✓ | ✓ | ✓ | 78.5 | 42.6 |
| LLVIP | | | | 92.5 | 54.1 |
| | ✓ | | | 97.7 | 67.6 |
| Augmentation | VEDAI | FLIR | LLVIP |
|---|---|---|---|
| Mosaic | 79.7 | 76.2 | 97.2 |
| Shear | 76.2 | 75.1 | 96.6 |
| Perspective | 74.4 | 76.0 | 96.9 |
| IBM (ours) | 81.0 | 78.5 | 97.7 |
| Dataset | Sobel | Prewitt | Roberts |
|---|---|---|---|
| VEDAI | 81.0 | 80.1 | 80.6 |
| FLIR | 78.5 | 78.3 | 78.6 |
| LLVIP | 97.7 | 97.7 | 97.6 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, J.; Yu, M.; Li, S.; Zhang, J.; Hu, S. Long-Tailed Object Detection for Multimodal Remote Sensing Images. Remote Sens. 2023, 15, 4539. https://doi.org/10.3390/rs15184539