Abstract
Detecting surface defects in steel is crucial for maintaining product quality and preventing financial losses from inferior goods. However, subtle imperfections that escape human observation remain a significant challenge for existing algorithms. To address this, we introduce MS-YOLOv5s, an approach designed to better distinguish defects from the background. First, we present a new neck module, the comprehensive position feature pyramid networks (CPFPN), which improves the precision of detecting barely noticeable flaws. It does so by applying the spatial channel attention module (SCAM) to intermediate feature maps, retaining more positional information from the original image. Second, the method adopts multi-scale learning, dynamically adjusting the input image size during training to amplify the differences between defects and background. MS-YOLOv5s achieves a mean average precision (mAP) of 80.5% on the NEU-DET dataset and 65.7% on the GC10-DET dataset, demonstrating robust performance across scenarios and outperforming many existing methods in identifying defects on steel surfaces.
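The multi-scale learning step can be illustrated with a minimal sketch. The abstract does not state the sampling range or schedule, so the version below assumes the pattern used by the stock YOLOv5 training script: draw a target size between 0.5 and 1.5 times the base size, snap it to a multiple of the network stride, and scale the batch to match. The function name `pick_multiscale_shape` and the defaults `base_size=640` and `stride=32` are hypothetical choices for illustration, not values taken from the paper.

```python
import random


def pick_multiscale_shape(height, width, base_size=640, stride=32, rng=random):
    """Return a (new_height, new_width) pair for one training batch.

    Assumed scheme (not confirmed by the paper): a target size for the longer
    side is drawn from [0.5, 1.5] x base_size and snapped down to a multiple
    of `stride`; both sides are then scaled by the same ratio and rounded up
    to a multiple of `stride`.
    """
    low = int(base_size * 0.5)
    high = int(base_size * 1.5) + stride  # randrange excludes the upper bound
    target = rng.randrange(low, high) // stride * stride

    longest = max(height, width)
    # Integer ceiling division keeps the result exact (no float rounding).
    new_h = -(-(height * target) // (longest * stride)) * stride
    new_w = -(-(width * target) // (longest * stride)) * stride
    return new_h, new_w


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    for step in range(3):
        print(step, pick_multiscale_shape(640, 640, rng=rng))
```

In a training loop, the returned shape would typically be passed to an image resize routine (for example bilinear interpolation) before the forward pass, so the network sees the same defects at several apparent sizes.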
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, C., Zhou, M., Liang, Y., Pan, W., Gao, Z. (2024). MS-YOLOv5s: An Improved YOLOv5s for the Detection of Imperceptible Defects on Steel Surfaces. In: Huang, DS., Zhang, C., Guo, J. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14871. Springer, Singapore. https://doi.org/10.1007/978-981-97-5609-4_31
DOI: https://doi.org/10.1007/978-981-97-5609-4_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5608-7
Online ISBN: 978-981-97-5609-4
eBook Packages: Computer Science, Computer Science (R0)