Research on Non-Pooling YOLOv5 Based Algorithm for the Recognition of Randomly Distributed Multiple Types of Parts
<p>Traditional YOLOv5 network.</p> "> Figure 2
<p>Basic Inception Architecture.</p> "> Figure 3
<p>Dimensionality-reducing inception architecture.</p> "> Figure 4
<p>Spatial pyramid pooling (SPP) operation.</p> "> Figure 5
<p>SPP structure.</p> "> Figure 6
<p>Spatial pyramid convolutions (SPC) structure.</p> "> Figure 7
<p>Comparison of activation functions.</p> "> Figure 8
<p>Comparison of the mAPs of the algorithms at <span class="html-italic">IOU</span> = 0.5.</p> "> Figure 9
<p>Comparison of the precisions of algorithms.</p> "> Figure 10
<p>Comparison of the recalls of the algorithms.</p> "> Figure 11
<p>Comparison of the average of mAPs from <span class="html-italic">IOU</span> = 0.5 to <span class="html-italic">IOU</span> = 0.95 for algorithms.</p> "> Figure 12
<p>Loss of Mish-NP.</p> "> Figure 13
<p>Example of an input image.</p> "> Figure 14
<p>Results of Traditional YOLOv5.</p> "> Figure 15
<p>Results of NP-YOLOv5.</p> "> Figure 16
<p>Results of AM-YOLOv5 and Faster-RCNN.</p> ">
Abstract
:1. Introduction
2. Related Work
2.1. Traditional YOLOv5 Network
2.2. Training on Multiscale Images
3. Proposed Method
3.1. Non-Pooling YOLOv5
3.2. Selection of Activation Function
4. Experiments and Results
4.1. Evaluation Indices of Detection Performance
4.2. Comparison of Different Algorithms
4.3. Experimental Result
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Tekin, B.; Sinha, S.N.; Fua, P. Real-Time Seamless Single Shot 6D Object Pose Prediction. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 292–301. [Google Scholar]
- Peng, S.; Zhou, X.; Liu, Y.; Lin, H.; Huang, Q.; Bao, H. PVNet: Pixel-Wise Voting Network for 6DoF Object Pose Estimation. IEEE Trans. Pattern. Anal. Mach. Intell. 2022, 44, 3212–3223. [Google Scholar] [CrossRef] [PubMed]
- Iriondo, A.; Lazkano, E.; Ansuategi, A. Affordance-Based Grasping Point Detection Using Graph Convolutional Networks for Industrial Bin-Picking Applications. Sensors 2021, 21, 816. [Google Scholar] [CrossRef] [PubMed]
- Charles, R.Q.; Su, H.; Kaichun, M.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef] [Green Version]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Adv. Neural Inf. Process. Syst. 2017, 39, 1137–1149. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Ma, W.; Wang, X.; Yu, J. A Lightweight Feature Fusion Single Shot Multibox Detector for Garbage Detection. IEEE Access 2020, 8, 188577–188586. [Google Scholar] [CrossRef]
- Zhao, Y.; Han, R.; Rao, Y. A New Feature Pyramid Network for Object Detection. In Proceedings of the 2019 International Conference on Virtual Reality and Intelligent Systems (ICVRIS), Jishou, China, 14–15 September 2019; pp. 428–431. [Google Scholar]
- Zhang, Y.; Han, J.H.; Kwon, Y.W.; Moon, Y.S. A New Architecture of Feature Pyramid Network for Object Detection. In Proceedings of the 2020 IEEE 6th International Conference on Computer and Communications (ICCC), Chengdu, China, 11–14 December 2020; pp. 1224–1228. [Google Scholar]
- Mathew, A.B.; Kurian, S. Identification of Malicious Code Variants using Spp-Net Model and Color Images. In Proceedings of the 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS), Rupnagar, India, 26–28 November 2020; pp. 581–585. [Google Scholar]
- Wang, X.; Wang, S.; Cao, J.; Wang, Y. Data-Driven Based Tiny-YOLOv3 Method for Front Vehicle Detection Inducing SPP-Net. IEEE Access 2020, 8, 110227–110236. [Google Scholar] [CrossRef]
- Jarrett, K.; Kavukcuoglu, K.; Ranzato, M.; LeCun, Y. What is the best multi-stage architecture for object recognition? In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 2146–2153. [Google Scholar]
- Ramachandran, P.; Zoph, B.; Le, Q.V. Searching for Activation Functions. arXiv 2017, arXiv:1710.05941. [Google Scholar]
- Misra, D. Mish: A Self Regularized Non-Monotonic Activation Function. arXiv 2019, arXiv:1908.08681. [Google Scholar]
- He, W.; Wu, Y.; Li, X. Attention Mechanism for Neural Machine Translation: A survey. In Proceedings of the 2021 IEEE 5th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Kyoto, Japan, 15–17 October 2021; pp. 1485–1489. [Google Scholar]
- Cai, W.; Wang, Y.; Ma, J.; Jin, Q. CAN: Effective cross features by global attention mechanism and neural network for ad click prediction. Tsinghua Sci. Technol. 2022, 27, 186–195. [Google Scholar] [CrossRef]
- Du, Y.; Du, L.; Li, L. An SAR Target Detector Based on Gradient Harmonized Mechanism and Attention Mechanism. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
- Gao, Y.; Gong, H.; Ding, X.; Guo, B. Image Recognition Based on Mixed Attention Mechanism in Smart Home Appliances. In Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Xi’an, China, 12–14 March 2021; pp. 1501–1505. [Google Scholar]
- Woo, S.H.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the COMPUTER VISION—ECCV 2018, PT VII, Munich, Germany, 8–14 September 2018; Volume 11211, pp. 3–19. [Google Scholar]
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768. [Google Scholar]
- Szegedy, C.; Wei, L.; Yangqing, J.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Zheng, Z.; Wang, P.; Ren, D.; Liu, W.; Ye, R.; Hu, Q.; Zuo, W. Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation. arXiv 2020, arXiv:2005.03572. [Google Scholar] [CrossRef] [PubMed]
Method | [email protected] | Precision | Recall |
---|---|---|---|
YOLOv5 | 0.93 | 0.88 | 0.93 |
Mish-YOLOv5 | 0.94 | 0.94 | 0.93 |
SiLU-NP | 0.94 | 0.95 | 0.95 |
Mish-NP | 0.96 | 0.96 | 0.96 |
AM-YOLOv5 | 0.91 | 0.86 | 0.90 |
Faster-RCNN | 0.85 | 0.82 | 0.86 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yu, Z.; Zhang, L.; Gao, X.; Huang, Y.; Liu, X. Research on Non-Pooling YOLOv5 Based Algorithm for the Recognition of Randomly Distributed Multiple Types of Parts. Sensors 2022, 22, 9335. https://doi.org/10.3390/s22239335
Yu Z, Zhang L, Gao X, Huang Y, Liu X. Research on Non-Pooling YOLOv5 Based Algorithm for the Recognition of Randomly Distributed Multiple Types of Parts. Sensors. 2022; 22(23):9335. https://doi.org/10.3390/s22239335
Chicago/Turabian StyleYu, Zehua, Ling Zhang, Xingyu Gao, Yang Huang, and Xiaoke Liu. 2022. "Research on Non-Pooling YOLOv5 Based Algorithm for the Recognition of Randomly Distributed Multiple Types of Parts" Sensors 22, no. 23: 9335. https://doi.org/10.3390/s22239335
APA StyleYu, Z., Zhang, L., Gao, X., Huang, Y., & Liu, X. (2022). Research on Non-Pooling YOLOv5 Based Algorithm for the Recognition of Randomly Distributed Multiple Types of Parts. Sensors, 22(23), 9335. https://doi.org/10.3390/s22239335