DOI: 10.1145/3460268.3460276
research-article
Open access

A Novel UAV Aerial Vehicle Detection Method Based on Attention Mechanism and Multi-scale Feature Cross Fusion

Published: 30 July 2021

Abstract

With the rapid development of artificial intelligence, a growing number of researchers have applied deep learning to object detection with considerable success. Vehicle detection in UAV images is a specialized subfield of object detection. Because vehicles in such images have low resolution, appear against complex backgrounds, and carry little image information, it is challenging to extract robust visual and spatial features with a deep network and to localize objects accurately in complex scenes. In this paper, drawing on the characteristics of vehicles in aerial images, we design a novel feature pyramid network, the channel-spatial attention fused feature pyramid network (CSF-FPN), with Faster R-CNN as the base framework. CSF-FPN introduces a hybrid attention mechanism and a feature cross-fusion module, so that the generated feature maps have enhanced spatial and channel interdependence and carry richer semantic information. Integrating CSF-FPN into Faster R-CNN greatly improves detection performance on small objects. Experimental results on the VEDAI dataset show that the proposed framework can effectively detect vehicles in large scenes and at varying azimuths. Compared with existing advanced methods, it improves both mAP and F1-score.
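The abstract does not specify how the hybrid attention mechanism is built, so the following is only a minimal sketch of the general idea of applying channel attention followed by spatial attention to a feature map. It is not the authors' CSF-FPN module. The function names (`channel_attention`, `spatial_attention`, `hybrid_attention`) are hypothetical, and the learned layers that a real attention module would contain are replaced here by parameter-free averages so the example runs with the Python standard library alone.

```python
import math


def sigmoid(x):
    """Logistic function, used to squash attention scores into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Reweight each channel by the sigmoid of its global average.

    fmap is a list of channels; each channel is a list of rows of floats.
    Assumption: a real module would pass the pooled values through a
    learned MLP; that step is omitted in this sketch.
    """
    out = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        weight = sigmoid(mean)
        out.append([[v * weight for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Reweight each spatial position by the sigmoid of its cross-channel mean.

    Assumption: a real module would apply a learned convolution to the
    pooled map; that step is omitted in this sketch.
    """
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    mask = [
        [sigmoid(sum(fmap[k][i][j] for k in range(channels)) / channels)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[k][i][j] * mask[i][j] for j in range(width)]
         for i in range(height)]
        for k in range(channels)
    ]


def hybrid_attention(fmap):
    """Apply channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(fmap))


# Toy feature map: 2 channels of size 2x2.
feature_map = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
attended = hybrid_attention(feature_map)
```

The output keeps the shape of the input; each value is scaled by two factors in (0, 1), one shared across its channel and one shared across its spatial position.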

Published In

AIEE '21: Proceedings of the 2021 2nd International Conference on Artificial Intelligence in Electronics Engineering
January 2021
102 pages
ISBN:9781450389273
DOI:10.1145/3460268
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. UAV image
  2. attention mechanism
  3. deep learning
  4. feature pyramid
  5. vehicle detection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

AIEE 2021

Cited By

  • (2024) UAV image detection based on multi-scale spatial attention mechanism with hybrid dilated convolution. 2024 3rd International Conference on Image Processing and Media Computing (ICIPMC), pp. 279-284. DOI: 10.1109/ICIPMC62364.2024.10586685. Online publication date: 17-May-2024
  • (2024) Deep learning-based object detection in maritime unmanned aerial vehicle imagery. Engineering Applications of Artificial Intelligence, 128:C. DOI: 10.1016/j.engappai.2023.107513. Online publication date: 14-Mar-2024
  • (2024) Deep Landscape Design Evaluation System with Multi-scale Visual Attention Mechanism. Intelligent 3D Technologies and Augmented Reality, pp. 13-24. DOI: 10.1007/978-981-97-5184-6_2. Online publication date: 3-Sep-2024
  • (2023) Object Detection of UAV Images from Orthographic Perspective Based on Improved YOLOv5s. Sustainability, 15:19 (14564). DOI: 10.3390/su151914564. Online publication date: 7-Oct-2023
  • (2023) Multiple Flying Object Detection using AlexNet Architecture for Aerial Surveillance Applications. 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), pp. 1-6. DOI: 10.1109/ICCCNT56998.2023.10307613. Online publication date: 6-Jul-2023
  • (2023) UAV positioning based on pure azimuth passive. 2023 IEEE 2nd International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA), pp. 586-590. DOI: 10.1109/EEBDA56825.2023.10090790. Online publication date: 24-Feb-2023
