Abstract
In complex traffic scenarios, a fast and accurate real-time detection system for non-motorized vehicles is crucial for safe driving. D-YOLO is a lightweight real-time detection method for non-motorized vehicles based on an enhanced YOLOv4-tiny. Because the computing capability of mobile devices is typically constrained, we first reduce the number of model parameters. We then introduce dilated convolution and depthwise separable convolution into the network’s Cross Stage Partial network (CSPNet) to produce DCSPNet, which offers improved performance. Coordinate Attention (CA) is incorporated to strengthen the network’s ability to extract effective features, and a spatial pyramid pooling (SPP) module is added to the neck network to enrich the feature representation of non-motorized vehicles in the feature layers. Finally, we evaluate the proposed model on our dataset. The experimental results show that D-YOLO has a model size of only 6.7 MB, 16.5 MB smaller than YOLOv4-tiny; its detection speed is about 25% faster than that of YOLOv4-tiny; it has approximately 58% fewer model parameters; and its mAP of 70.36% is 2.01% higher than that of YOLOv4-tiny. These results show that D-YOLO delivers both accuracy and real-time performance, satisfying the demands of real-time detection of non-motorized vehicles in intelligent traffic scenarios.
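The parameter savings reported above stem largely from replacing standard convolutions with depthwise separable convolutions. A minimal sketch of the parameter-count arithmetic is given below; the channel widths (256 in, 256 out, 3 × 3 kernel) are illustrative values chosen for this example, not the actual layer sizes of D-YOLO.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out

def dsc_params(k, c_in, c_out):
    """Depthwise separable convolution: a k x k depthwise convolution
    (one filter per input channel) followed by a 1 x 1 pointwise
    convolution that mixes channels."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 256, 256)  # 3*3*256*256 = 589,824 weights
dsc = dsc_params(3, 256, 256)   # 3*3*256 + 256*256 = 67,840 weights
print(std, dsc, round(std / dsc, 2))  # reduction factor of about 8.69x
```

For a k × k kernel the reduction factor approaches 1/c_out + 1/k², which is why swapping in depthwise separable convolutions shrinks the backbone substantially at a modest cost in representational power.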
Code Availability
The code that supports the findings of this study is available from the corresponding author upon reasonable request.
Availability of data and materials
Datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Acknowledgements
This study was supported in part by grants from The National Natural Science Foundation of China, Grant/Award Numbers: 61461053, 61461054.
Funding
The National Natural Science Foundation of China, Grant/Award Numbers: 61461053, 61461054
Author information
Authors and Affiliations
Contributions
Yushan Li: Conceptualization, Software, Writing - original draft preparation.
Hongwei Ding: Supervision, Writing - original draft preparation.
Peng Hu: Data curation, Visualization.
Zhijun Yang: Data curation, Investigation.
Guanbo Wang: Supervision, Software, Validation.
Corresponding author
Ethics declarations
Consent for Publication
The author agrees to publication in the journal indicated below and also to publication of the article in English by Springer in Springer’s corresponding English-language journal.
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Y., Ding, H., Hu, P. et al. Real-time detection algorithm for non-motorized vehicles based on D-YOLO model. Multimed Tools Appl 83, 61673–61696 (2024). https://doi.org/10.1007/s11042-023-14385-2