Abstract
Ensuring the safety of workers and machinery during operations is a critical task on construction sites. However, construction sites are inherently complex and dynamic environments, which often gives rise to occlusions. When detecting occluded objects on construction sites, general vision-based approaches tend to exhibit lower accuracy and may even miss detections entirely, creating potential safety hazards. To address this issue, this paper proposes a vision-based approach for detecting occluded objects in construction sites. First, the proposed detection algorithm adopts the state-of-the-art YOLOv7 as its backbone, and a novel channel attention mechanism is employed to enhance its ability to capture the contextual information of occluded objects. Then, a design scheme for the detection head is presented that integrates the Scylla Intersection over Union (SIoU) loss function with a modified non-maximum suppression (NMS) strategy. With the SIoU loss, the network can compute the loss values of occluded objects more accurately. To ensure that the network selects the predicted box that most closely aligns with the ground truth, the Euclidean distance is used as a spatial penalty factor during the NMS stage. Together, these two strategies allow the proposed method to preserve both the category information and the bounding boxes of occluded objects, making it possible to detect them. Finally, detailed experiments are conducted to verify the proposed method. Experimental results demonstrate that the proposed method can improve detection accuracy. Moreover, it outperforms existing baselines in detecting occluded objects in dynamic construction sites.
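As a purely illustrative aid, the sketch below shows one way a Euclidean-distance spatial penalty can be folded into the NMS stage, in the spirit of DIoU-NMS: the overlap used for suppression is discounted by the normalized squared distance between box centers, so heavily overlapping but spatially offset boxes (typical of occluded objects) are less likely to be suppressed. The function name `distance_penalized_nms`, the use of box centers, the suppression threshold, and the normalization by the enclosing-box diagonal are assumptions for exposition, not the paper's exact implementation.

```python
# Minimal sketch (assumed formulation, not the authors' code): NMS with a
# Euclidean-distance spatial penalty, DIoU-NMS style.
import numpy as np

def distance_penalized_nms(boxes, scores, iou_thresh=0.5):
    """boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences."""
    order = scores.argsort()[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = order[1:]

        # Plain IoU between the selected box and the remaining candidates.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)

        # Squared Euclidean distance between box centers, normalized by the
        # diagonal of the smallest enclosing box (the spatial penalty factor).
        cx_i, cy_i = (boxes[i, 0] + boxes[i, 2]) / 2, (boxes[i, 1] + boxes[i, 3]) / 2
        cx_r, cy_r = (boxes[rest, 0] + boxes[rest, 2]) / 2, (boxes[rest, 1] + boxes[rest, 3]) / 2
        ex1 = np.minimum(boxes[i, 0], boxes[rest, 0])
        ey1 = np.minimum(boxes[i, 1], boxes[rest, 1])
        ex2 = np.maximum(boxes[i, 2], boxes[rest, 2])
        ey2 = np.maximum(boxes[i, 3], boxes[rest, 3])
        diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9
        dist_penalty = ((cx_i - cx_r) ** 2 + (cy_i - cy_r) ** 2) / diag2

        # Suppress a candidate only if its penalized overlap still exceeds the
        # threshold, so spatially separated (likely occluded) objects survive.
        order = rest[(iou - dist_penalty) <= iou_thresh]
    return keep
```

Called on the candidate boxes and confidence scores produced by the detector, the function returns the indices of the boxes to keep; the distance penalty reduces the chance that the box of a partially hidden object is discarded merely because it overlaps a nearby object's box.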
Data availability
The dataset on moving objects in construction sites is available at https://doi.org/10.1016/j.autcon.2020.103482.
Acknowledgements
This study is partly supported by the National Natural Science Foundation of China (62076150 and 62133008), the Key Technology Research and Development Program of Shandong Province (2021CXGC011205, 2021TSGC1053, and 2022TSGC2157), and the Natural Science Foundation of Shandong Province (ZR2021QF077 and ZR2023QF020).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Q., Liu, H., Peng, W. et al. A vision-based approach for detecting occluded objects in construction sites. Neural Comput & Applic 36, 10825–10837 (2024). https://doi.org/10.1007/s00521-024-09580-7