Abstract
With the development of unmanned aerial vehicle (UAV) technology, using UAVs to detect objects has become a research focus. However, most current deep-learning-based object detection algorithms are resource intensive, making them difficult to deploy on embedded devices with limited memory and computing power, such as UAVs. To meet these challenges, this paper proposes a lightweight object detection network in UAV vision (LODNU) based on YOLOv4, which can meet the application requirements of resource-constrained devices while maintaining detection accuracy. Building on YOLOv4, LODNU uses depth-wise separable convolution to reconstruct the backbone network and reduce the number of model parameters, and embeds an improved coordinate attention module in the backbone to strengthen the extraction of key object features. Meanwhile, an adaptive scale-weighted feature fusion module is added to the path aggregation network to improve the accuracy of multi-scale object detection. In addition, to balance the sample sizes across classes, we propose a patching data augmentation method. LODNU achieves performance close to that of YOLOv4 with far fewer parameters, striking a better balance between model size and accuracy on the VisDrone2019 dataset. Experimental results show that LODNU has only 13.1% of the parameters and 13.6% of the floating-point operations of YOLOv4. The mean average precision of LODNU on the VisDrone2019 dataset is 31.4%, which is 77% of that of YOLOv4.
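As context for the parameter savings the abstract attributes to depth-wise separable convolution, the sketch below (not from the paper; the 256-channel, 3 × 3 layer shape is an illustrative assumption) compares the parameter count of a standard convolution with its depth-wise separable factorization:

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    # Standard convolution: one k x k filter per (input, output) channel pair.
    return c_in * c_out * k * k

def dw_separable_params(c_in: int, c_out: int, k: int) -> int:
    # Depth-wise step: one k x k filter applied per input channel,
    # followed by a 1x1 point-wise convolution that mixes channels.
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Illustrative layer: 256 -> 256 channels, 3x3 kernel (biases omitted).
std = conv_params(256, 256, 3)          # 589,824 parameters
sep = dw_separable_params(256, 256, 3)  # 67,840 parameters
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For this layer the factorization keeps roughly 11.5% of the parameters, which is the same order of reduction the paper reports for the whole model (13.1% of YOLOv4's parameters), though LODNU's exact savings also depend on the attention and fusion modules it adds.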
Availability of data and materials
Not applicable.
Acknowledgements
This study is supported by Open Project of Key Laboratory of Ministry of Public Security for Road Traffic Safety, No.2021Zdsyskfkt04.
Funding
This study is supported by Open Project of Key Laboratory of Ministry of Public Security for Road Traffic Safety, No.2021Zdsyskfkt04.
Contributions
NC wrote the main manuscript text. YL provided suggestions for revising the manuscript. ZY provided funding. ZL suggested the structure of the manuscript. SW provided support with the experimental equipment. JW helped with the typesetting of the manuscript. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Code availability
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, N., Li, Y., Yang, Z. et al. LODNU: lightweight object detection network in UAV vision. J Supercomput 79, 10117–10138 (2023). https://doi.org/10.1007/s11227-023-05065-x