
B-FPN SSD: an SSD algorithm based on a bidirectional feature fusion pyramid

  • Original article
  • The Visual Computer

Abstract

This paper proposes B-FPN SSD, a Single Shot MultiBox Detector (SSD) algorithm built on a bidirectional feature fusion pyramid (B-FPN). First, a B-FPN structure is constructed that fuses the feature layers in both directions, which improves detection accuracy. Second, coordinate attention (CA) is introduced to emphasize the important channel features while preserving their location information. Finally, the loss function is optimized, which speeds up model convergence and further improves the detection accuracy of the network. Experimental results show that on the VOC2007 dataset the proposed algorithm reaches an mAP of 76.48%, which is 3.52 percentage points higher than that of the SSD algorithm. On the COCO 2017 dataset, its mAP is 3.85 percentage points higher than that of the SSD algorithm. Compared with other mainstream object detection algorithms, the proposed algorithm has certain advantages in detection accuracy while also achieving satisfactory detection speed. Finally, in the special environment of iron ore transportation, the algorithm recognizes foreign objects with an accuracy of 98.26%.
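
To make the idea of bidirectional feature fusion concrete, the following is a minimal pure-Python sketch of a top-down pass followed by a bottom-up pass over a toy feature pyramid. It is not the B-FPN structure of the paper, whose layer layout is not given in the abstract. The function names, the element-wise addition, the nearest-neighbour upsampling, and the 2x2 max pooling are all assumptions chosen for illustration.

```python
# Hypothetical sketch of bidirectional (top-down + bottom-up) feature fusion.
# Assumptions, not the paper's design: fusion is element-wise addition,
# upsampling is nearest-neighbour, downsampling is 2x2 max pooling, and each
# pyramid level is exactly half the size of the previous one.

def upsample2x(fm):
    """Nearest-neighbour upsampling: every value becomes a 2x2 block."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def downsample2x(fm):
    """2x2 max pooling with stride 2 (height and width must be even)."""
    return [
        [max(fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1])
         for j in range(0, len(fm[0]), 2)]
        for i in range(0, len(fm), 2)
    ]

def add(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bidirectional_fuse(pyramid):
    """Fuse a list of 2D maps ordered from fine (large) to coarse (small)."""
    n = len(pyramid)
    # Top-down pass: coarse, semantically strong features flow to finer levels.
    top_down = [None] * n
    top_down[-1] = pyramid[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = add(pyramid[i], upsample2x(top_down[i + 1]))
    # Bottom-up pass: fine localisation detail flows back to coarser levels.
    fused = [top_down[0]]
    for i in range(1, n):
        fused.append(add(top_down[i], downsample2x(fused[i - 1])))
    return fused

# Toy pyramid with constant maps so the flow of information is easy to trace.
pyramid = [
    [[1] * 4 for _ in range(4)],   # 4x4, finest level
    [[10] * 2 for _ in range(2)],  # 2x2
    [[100]],                       # 1x1, coarsest level
]
fused = bidirectional_fuse(pyramid)
for level in fused:
    print(len(level), "x", len(level[0]), "value", level[0][0])
```

Each fused level keeps its original resolution but now mixes information from every other level. In a real detector the maps would be multi-channel tensors and the additions would typically be followed by learned convolutions.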

Data availability statement

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is partially supported by the National Natural Science Foundation of China (No. U1804147), the Innovative Scientists and Technicians Team of Henan Provincial High Education (No. 20IRTSTHN019), and the Science and Technology Project of Henan Province (No. 212102210508).

Author information

Corresponding author

Correspondence to Junjia Bi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Q., Bi, J., Zhang, J. et al. B-FPN SSD: an SSD algorithm based on a bidirectional feature fusion pyramid. Vis Comput 39, 6265–6277 (2023). https://doi.org/10.1007/s00371-022-02727-4

  • DOI: https://doi.org/10.1007/s00371-022-02727-4
