Abstract
This paper proposes B-FPN SSD, a Single Shot MultiBox Detector (SSD) algorithm built on a bidirectional feature fusion pyramid (B-FPN). First, a B-FPN structure is constructed to fuse the feature layers in both directions, which improves detection accuracy. Second, coordinate attention (CA) is introduced so that the network emphasizes important channel features while preserving their location information. Third, the loss function is optimized, which speeds up model convergence and further improves detection accuracy. Experimental results show that on the VOC2007 dataset the proposed algorithm reaches an mAP of 76.48%, which is 3.52% higher than that of the SSD algorithm; on the COCO 2017 dataset, its mAP is 3.85% higher than that of SSD. Compared with other mainstream object detection algorithms, the proposed algorithm has an advantage in detection accuracy while maintaining satisfactory detection speed. In the special environment of iron ore transportation, it recognizes foreign objects with an accuracy of 98.26%.
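The idea of bidirectional fusion can be illustrated with a minimal sketch. The abstract does not specify the B-FPN topology, so the code below is only an assumption of one generic form: a top-down pass followed by a bottom-up pass over single-channel feature maps, with nearest-neighbour upsampling, 2x2 max pooling, and element-wise addition standing in for the learned convolutions a real network would use. All function names here are hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (hypothetical type)


def upsample2x(x: Grid) -> Grid:
    """Nearest-neighbour upsampling by a factor of 2 in height and width."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(x: Grid) -> Grid:
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    return [
        [max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
         for j in range(0, len(x[0]), 2)]
        for i in range(0, len(x), 2)
    ]


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two maps of identical shape."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def bidirectional_fuse(levels: List[Grid]) -> List[Grid]:
    """Fuse a pyramid in both directions.

    levels[0] is the highest-resolution map; each following level is
    assumed to be exactly half the height and width of the previous one.
    """
    # Top-down pass: carry coarse, semantic information to finer levels.
    top_down: List[Grid] = [levels[-1]]
    for level in reversed(levels[:-1]):
        top_down.insert(0, add(level, upsample2x(top_down[0])))

    # Bottom-up pass: carry fine, localisation detail back to coarser levels.
    fused: List[Grid] = [top_down[0]]
    for i in range(1, len(levels)):
        fused.append(add(top_down[i], downsample2x(fused[i - 1])))
    return fused


# Toy pyramid: 4x4, 2x2 and 1x1 maps filled with 1, 10 and 100.
pyramid = [
    [[1.0] * 4 for _ in range(4)],
    [[10.0] * 2 for _ in range(2)],
    [[100.0]],
]
fused = bidirectional_fuse(pyramid)
print([level[0][0] for level in fused])  # [111.0, 221.0, 321.0]
```

In the toy run, every output level contains contributions from all three inputs, which is the property a one-directional (top-down only) pyramid lacks for its coarser levels.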
Data availability statement
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China (No. U1804147), the Innovative Scientists and Technicians Team of Henan Provincial High Education (20IRTSTHN019), and the Science and Technology Project of Henan Province (No. 212102210508).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Q., Bi, J., Zhang, J. et al. B-FPN SSD: an SSD algorithm based on a bidirectional feature fusion pyramid. Vis Comput 39, 6265–6277 (2023). https://doi.org/10.1007/s00371-022-02727-4
DOI: https://doi.org/10.1007/s00371-022-02727-4