Abstract
Fire detection is challenging: flame and smoke targets vary widely in morphology, appear against complex backgrounds, and are often densely distributed, while available datasets are small and class-imbalanced. These factors lead to low accuracy and poor real-time performance in existing fire detection models. We propose EA-YOLO, a flame and smoke detection model based on efficient multi-scale feature enhancement. To improve the network's ability to extract flame and smoke features, we integrate an efficient attention mechanism, Multi Channel Attention (MCA), into the backbone and reduce the model's parameter count by introducing the RepVB module. We also design a multi-weighted, multi-directional feature neck, the Multidirectional Feature Pyramid Network (MDFPN), to strengthen the fusion of flame and smoke feature information. Finally, we redesign the CIoU loss function by introducing the Slide weighting function to mitigate the imbalance between hard and easy samples. To address the small size of existing fire datasets, we build two new datasets, Fire-smoke and Ro-fire-smoke; the latter additionally supports validation of model robustness. Experiments show that our method outperforms the YOLOv7 baseline by 6.5% and 7.3% on the Fire-smoke and Ro-fire-smoke datasets, respectively, at a detection speed of 74.6 frames per second. To further demonstrate the advantages of EA-YOLO, we compare it against several state-of-the-art (SOTA) models on the public FASDD dataset, where it performs highly favorably. These results show that our method achieves high fire detection accuracy while preserving real-time performance. The source code and datasets are available at https://github.com/DIADH/DIADH.YOLO.
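As a concrete illustration of the loss redesign mentioned above, the sketch below shows the Slide weighting function as defined in the YOLO-FaceV2 paper, applied to a CIoU-style regression term. The abstract does not specify exactly how EA-YOLO combines the weight with CIoU, so the helper `slide_weighted_ciou_loss` and the multiplicative combination are illustrative assumptions, not the paper's formulation.

```python
import math


def slide_weight(iou: float, mu: float) -> float:
    """Slide weighting function (as introduced in YOLO-FaceV2).

    `mu` is the adaptive threshold, typically the mean IoU over all
    candidate boxes; samples near the threshold are the "hard" ones
    and receive exponentially boosted weights.
    """
    if iou <= mu - 0.1:
        return 1.0                 # easy samples keep unit weight
    elif iou < mu:
        return math.exp(1.0 - mu)  # just below the threshold: fixed boost
    else:
        return math.exp(1.0 - iou)  # above the threshold: decays as IoU -> 1


def slide_weighted_ciou_loss(ciou: float, iou: float, mu: float) -> float:
    """Hypothetical combination: scale the usual CIoU loss term
    (1 - CIoU) by the Slide weight so hard samples dominate."""
    return slide_weight(iou, mu) * (1.0 - ciou)


# Example: with a batch-mean IoU of 0.55, a box just below the
# threshold is weighted more heavily than an easy, well-matched box.
mu = 0.55
print(slide_weighted_ciou_loss(ciou=0.50, iou=0.50, mu=mu))  # hard sample
print(slide_weighted_ciou_loss(ciou=0.90, iou=0.90, mu=mu))  # easy sample
```

Because the weight peaks for samples near the mean-IoU threshold, hard samples contribute more gradient than easy ones, which is the stated goal of the redesign.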
Data Availability
Our datasets will be made publicly available at https://github.com/DIADH/DIADH.YOLO.
Acknowledgements
Let's embrace the world together.
Funding
The National Natural Science Foundation of China (62103096); the Natural Science Foundation of Hainan Province (623MS071); the "Chunhui Plan" cooperative scientific research project of the Ministry of Education (HZKY20220314); the Hainan Province Science and Technology Special Fund (ZDYF2022SHFZ105); the Natural Science Foundation of Heilongjiang Province (LH2023H001, LH2022F009); the Hainan Provincial Joint Project of Sanya Yazhou Bay Science and Technology City (2021JJLH0025); the Guiding Innovation Foundation of Northeast Petroleum University (Grant No. 15071202203).
Contributions
The successful completion of this research owes much to the guidance and assistance of my supervisor and team members, to whom I express my heartfelt gratitude.

Research design and execution: The design and execution of this study benefited from the collaborative efforts of the entire team. I contributed to defining the research questions, designing the experiments, and collecting and analyzing the data, and I bore responsibility for the research process as a whole.

Literature review and theoretical support: I extensively reviewed and synthesized the literature in the field, drawing valuable theoretical support and research insights that provided an essential foundation for this study.

Data collection and analysis: I actively participated in collecting and organizing the experimental data and analyzed it thoroughly using statistical methods, providing reliable explanations and inferences for the research findings.

Presentation and discussion of results: I participated in presenting and discussing the results and contributed significantly to deriving the conclusions through argumentation and interpretation. I also helped review and revise the results to ensure their accuracy and reliability.

Paper writing and revision: I drafted the initial manuscript and revised it repeatedly to ensure the clarity, accuracy, and completeness of its content, carefully considering and responding to feedback and suggestions to continuously improve the paper's quality and expression.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, D., Qian, Y., Lu, J. et al. Ea-yolo: efficient extraction and aggregation mechanism of YOLO for fire detection. Multimedia Systems 30, 287 (2024). https://doi.org/10.1007/s00530-024-01489-4