
CN115841651B - Constructor intelligent monitoring system based on computer vision and deep learning - Google Patents


Info

Publication number
CN115841651B
CN115841651B (application CN202211602196.2A)
Authority
CN
China
Prior art keywords
module
state
analysis
deep learning
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211602196.2A
Other languages
Chinese (zh)
Other versions
CN115841651A (en)
Inventor
陈祺荣
陈科宇
杭世杰
林俊
汤序霖
陈钰开
李晨慧
李卫勇
朱东烽
杨哲
杨健明
聂勤文
张华健
邬学文
汪爽
练月荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jishi Construction Group Co ltd
Guangdong Yuncheng Architectural Technology Co ltd
Hainan University
Original Assignee
Guangzhou Jishi Construction Group Co ltd
Guangdong Yuncheng Architectural Technology Co ltd
Hainan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jishi Construction Group Co ltd, Guangdong Yuncheng Architectural Technology Co ltd, and Hainan University
Priority to CN202211602196.2A
Publication of CN115841651A
Application granted
Publication of CN115841651B
Legal status: Active


Landscapes

  • Image Analysis (AREA)

Abstract

本发明公开了基于计算机视觉与深度学习的施工人员智能监测系统,包括智能统计模块和智能监测模块,智能统计模块包括视频采集模块、深度学习算法分析模块及分析结果显示模块,智能监测模块包括安全帽识别模块、工作服识别模块、状态识别模块及警报提醒模块。本发明不仅可以实现对施工现场人员及车辆进、出数量以及当地内人员与车辆总数的智能统计,而且还可以利用图像识别技术分别对通过工地入口的施工人员的着装及状态进行识别,并对不符合标准的施工人员进行提醒,从而可以有效地避免因施工人员未佩带安全帽、未穿着工作服或精神状态欠佳而导致的安全事故现象的发生,进而可以有效地提高施工现场的施工安全性能。

The invention discloses an intelligent monitoring system for construction personnel based on computer vision and deep learning, comprising an intelligent statistics module and an intelligent monitoring module. The intelligent statistics module includes a video acquisition module, a deep learning algorithm analysis module, and an analysis result display module; the intelligent monitoring module includes a safety helmet recognition module, a work clothes recognition module, a state recognition module, and an alarm reminder module. The invention not only provides intelligent statistics on the numbers of personnel and vehicles entering and leaving the construction site and the totals of personnel and vehicles on site, but also uses image recognition technology to identify the clothing and state of construction workers passing through the site entrance and to remind those who do not meet the standards. This effectively prevents safety accidents caused by workers not wearing safety helmets or work clothes, or being in a poor mental state, thereby effectively improving the construction safety of the site.

Description

基于计算机视觉与深度学习的施工人员智能监测系统Intelligent monitoring system for construction workers based on computer vision and deep learning

技术领域technical field

本发明涉及智能统计技术领域,具体来说,涉及基于计算机视觉与深度学习的施工人员智能监测系统。The invention relates to the technical field of intelligent statistics, in particular to an intelligent monitoring system for construction personnel based on computer vision and deep learning.

背景技术Background technique

目前，在建筑施工过程中，要求施工现场原则上实施封闭式管理，设立进出场门禁系统。当前建筑施工现场所使用的门禁系统多为由三辊闸，摆闸，翼闸，不锈钢栅栏门等通道系统与感应卡读写器结合构成的系统，所有管理人员和施工人员提前将人脸或者二代身份证录入系统后便可自由进出，人员未经登记和授权则无法擅自进入施工现场，同时在施工现场出入口安装监控摄像头，将一段时间内的视频录像存储在本地服务器内以便查阅。At present, in the process of building construction, the construction site is in principle required to implement closed management, with an access control system set up for entry and exit. Most access control systems currently used at construction sites combine channel systems such as tripod turnstiles, swing gates, flap barriers, and stainless steel fence gates with proximity card readers: after managers and construction workers register their faces or second-generation ID cards in the system in advance, they may enter and exit freely, while unregistered and unauthorized personnel cannot enter the construction site. At the same time, surveillance cameras are installed at the site entrances and exits, and video recordings over a period of time are stored on a local server for later review.

然而,目前的门禁系统在实际应用时存在以下缺陷:However, the current access control system has the following defects in actual application:

1)建筑工地出入口处人员出入频繁，人员无序流动，现场管理困难；2)采用纸质记录考勤数据，难以准确统计实际工时，且数据易被篡改；3)施工现场人员数量多、分包单位多、工种和岗位多的特点，项目管理人员无法清晰和及时掌握现场的施工作业人员数量、各工种人员数量和各专业分包单位人数，不利于现场管理效率提高；4)发生建筑劳务人员工资纠纷时，监管部门难以取证，维权困难重重；5)车辆以及车辆内人员进出场统计困难。1) People enter and exit frequently at the construction site gates, the flow of personnel is disorderly, and on-site management is difficult; 2) Attendance is recorded on paper, making it difficult to accurately count actual working hours, and the data is easily tampered with; 3) Construction sites involve many workers, many subcontracting units, and many trades and positions, so project managers cannot clearly and promptly grasp the number of workers on site, the number in each trade, or the headcount of each specialized subcontractor, which hinders on-site management efficiency; 4) In the event of wage disputes involving construction laborers, it is difficult for supervisory departments to obtain evidence, making rights protection difficult; 5) It is difficult to count vehicles and the personnel inside them entering and leaving the site.

针对相关技术中的问题，目前尚未提出有效的解决方案。No effective solution has yet been proposed for these problems in the related art.

发明内容Contents of the invention

针对相关技术中的问题,本发明提出基于计算机视觉与深度学习的施工人员智能监测系统,以克服现有相关技术所存在的上述技术问题。Aiming at the problems in the related technologies, the present invention proposes an intelligent monitoring system for construction personnel based on computer vision and deep learning to overcome the above-mentioned technical problems in the existing related technologies.

为此，本发明采用的具体技术方案如下：To this end, the specific technical solution adopted by the present invention is as follows:

基于计算机视觉与深度学习的施工人员智能监测系统,该系统包括智能统计模块和智能监测模块;An intelligent monitoring system for construction workers based on computer vision and deep learning, which includes an intelligent statistics module and an intelligent monitoring module;

其中，所述智能统计模块用于利用训练好的深度学习算法对通过工地入口的施工人员进行检测，并通过跟踪算法对目标进行跟踪，在目标碰撞检测线时进行计数并统计，同时通过客户端进行实时展示；Here, the intelligent statistics module is used to detect construction workers passing through the site entrance with a trained deep learning algorithm, track the targets with a tracking algorithm, count them when a target crosses the detection line, and display the results in real time through the client;

所述智能监测模块用于利用预设的图像识别技术分别对通过工地入口的施工人员的着装及状态进行识别,并对不符合标准的施工人员进行提醒。The intelligent monitoring module is used to use preset image recognition technology to identify the clothing and status of the construction personnel passing through the entrance of the construction site, and remind the construction personnel who do not meet the standards.

进一步的,所述智能统计模块包括视频采集模块、深度学习算法分析模块及分析结果显示模块;Further, the intelligent statistics module includes a video acquisition module, a deep learning algorithm analysis module and an analysis result display module;

所述视频采集模块用于根据架设在施工现场各出入口的监控摄像头采集实时监控画面,并将实时监控画面输入POE交换机经转换后得到初始视频素材;The video acquisition module is used to collect real-time monitoring pictures according to the monitoring cameras erected at each entrance and exit of the construction site, and input the real-time monitoring pictures into the POE switch to obtain initial video material after conversion;

所述深度学习算法分析模块用于通过训练好的深度学习算法实时输出与初始视频素材相对应的视频分析画面以及人员统计数据,并将分析画面和统计结果输出至客户端;The deep learning algorithm analysis module is used to output in real time the video analysis picture corresponding to the initial video material and the personnel statistics data through the trained deep learning algorithm, and output the analysis picture and statistical results to the client;

所述分析结果显示模块用于在客户端显示视频分析画面及人员与车辆的统计数据,还用于管理人员根据需求切换不同出入口摄像头所获取的分析画面以及统计数据。The analysis result display module is used for displaying video analysis pictures and statistical data of personnel and vehicles on the client side, and is also used for managers to switch analysis pictures and statistical data obtained by different entrance and exit cameras according to requirements.

进一步的,所述深度学习算法分析模块包括检测区域设定模块、视频分析模块、数据统计模块及数据传输模块;Further, the deep learning algorithm analysis module includes a detection area setting module, a video analysis module, a data statistics module and a data transmission module;

所述检测区域设定模块用于根据不同出入口的画面布局,通过对相应的检测范围进行调整来控制各个画面中的实际检测范围,实现检测区域的设定;The detection area setting module is used to control the actual detection range in each screen by adjusting the corresponding detection range according to the screen layout of different entrances and exits, so as to realize the setting of the detection area;

所述视频分析模块用于通过撞线检测点坐标来对初始视频素材中的图像进行分析,实现对检测区域人员及车辆的分析及统计;The video analysis module is used to analyze the image in the initial video material through the coordinates of the line collision detection point, so as to realize the analysis and statistics of the personnel and vehicles in the detection area;

所述数据统计模块用于通过目标框是否撞线及目标框撞线区域的颜色来实现对目标进出的判定和统计;The data statistics module is used to realize the judgment and statistics of target entry and exit through whether the target frame hits the line and the color of the target frame's line-crossing area;

所述数据传输模块用于将分析画面和统计结果通过RTSP服务器以及HTTP推送服务输出至客户端。The data transmission module is used to output the analysis screen and statistical results to the client through the RTSP server and the HTTP push service.

进一步的,所述检测区域的设定包括以下步骤:Further, the setting of the detection area includes the following steps:

依照所需区域的形状依次确定各处端点位置并通过一个数组来表达,且该数组的每个元素为表征图形端点的二元数组,通过数组的调整来实现检测区域的调整及设定。According to the shape of the required area, the positions of the endpoints are sequentially determined and expressed by an array, and each element of the array is a binary array representing the endpoint of the graph. The adjustment and setting of the detection area is realized through the adjustment of the array.
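The endpoint-array representation described above can be sketched as follows. This is an illustrative Python sketch, not the patent's code: `point_in_region` and the sample `gate_area` polygon are assumed names, and membership is tested with standard ray casting.

```python
# Illustrative sketch (not the patented implementation): a detection area
# expressed as an array whose elements are two-element arrays of endpoint
# coordinates, plus a ray-casting test for point membership.

def point_in_region(region, x, y):
    """Return True if (x, y) lies inside the polygon given by `region`.

    `region` is a list of [x, y] endpoint pairs listed in order along
    the boundary, as described in the text.
    """
    inside = False
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        # Count boundary crossings of a horizontal ray cast to the right.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Adjusting the detection area is just adjusting the array of endpoints.
gate_area = [[100, 50], [500, 50], [500, 400], [100, 400]]  # rectangle
print(point_in_region(gate_area, 300, 200))  # → True  (inside)
print(point_in_region(gate_area, 50, 200))   # → False (outside)
```

Reshaping the detection area, as the text notes, then requires only editing the array of endpoint pairs rather than any algorithm code.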

进一步的,所述通过撞线检测点坐标来对初始视频素材中的图像进行分析,实现对检测区域人员及车辆的分析及统计包括以下步骤:Further, the analysis of the image in the initial video material through the coordinates of the line collision detection point to realize the analysis and statistics of the personnel and vehicles in the detection area includes the following steps:

获取当前图像中所有物体的位置参数及类别;Obtain the position parameters and categories of all objects in the current image;

判断新一帧图像中是否有物体的几何中心与上一帧图像中某个物体几何中心的偏移量在预设偏移量内,若是,则判定两个物体为同一物体,拥有相同ID,若否,则判定新一帧图像中存在新物体,并为该物体赋予新ID;Determine whether the geometric center of an object in the new frame of image is within the preset offset from the geometric center of an object in the previous frame of image. If so, determine that the two objects are the same object and have the same ID. If not, it is determined that there is a new object in the new frame image, and a new ID is assigned to the object;

采用矩形表示物体范围且已知物体ID的图像，设某个物体范围的参数为x1,y1,x2,y2，其中x1<x2，y1<y2，撞线检测点坐标为(check_point_x,check_point_y)，check_point_x=x1，check_point_y=int[y1+(y2-y1)*0.6]，int指对运算结果取整；For an image in which object ranges are represented by rectangles and object IDs are known, let the parameters of an object's range be x1, y1, x2, y2, where x1 < x2 and y1 < y2. The line-collision detection point coordinates are (check_point_x, check_point_y), with check_point_x = x1 and check_point_y = int[y1 + (y2 - y1) * 0.6], where int denotes rounding the result to an integer;

判断物体撞线检测点是否位于判定区内,若是,则对该物体进行统计运算,若否,则不进行统计运算。It is judged whether the object collision line detection point is located in the judgment area, if yes, the statistical calculation is performed on the object, and if not, the statistical calculation is not performed.
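The steps above, matching detections across frames by geometric-centre offset, assigning IDs, and computing the line-collision check point, can be sketched as follows. This is a minimal Python sketch with assumed names such as `assign_ids`, not the patent's implementation.

```python
import itertools

# Illustrative sketch: match each new detection to the previous frame by
# geometric-centre offset, reuse the ID on a match, otherwise assign a
# fresh ID, then compute the line-collision check point from the box.

_next_id = itertools.count(1)

def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def assign_ids(prev_tracks, new_boxes, max_offset=30.0):
    """prev_tracks: {obj_id: box}; returns {obj_id: box} for the new frame."""
    tracks = {}
    for box in new_boxes:
        cx, cy = center(box)
        match = None
        for obj_id, old_box in prev_tracks.items():
            ox, oy = center(old_box)
            # Same object if the centre moved less than the preset offset.
            if abs(cx - ox) <= max_offset and abs(cy - oy) <= max_offset:
                match = obj_id
                break
        tracks[match if match is not None else next(_next_id)] = box
    return tracks

def check_point(box):
    """check_point_x = x1, check_point_y = int[y1 + (y2 - y1) * 0.6]."""
    x1, y1, x2, y2 = box
    return x1, int(y1 + (y2 - y1) * 0.6)

prev = {7: (100, 100, 140, 200)}
new = assign_ids(prev, [(105, 104, 145, 204), (400, 50, 440, 150)])
print(sorted(new))                        # ID 7 kept; far box gets a new ID
print(check_point((100, 100, 140, 200)))  # → (100, 160)
```

Placing the check point 60% of the way down the box, rather than at its centre, roughly tracks a person's hip line, so the count triggers when the body (not just the head) reaches the detection line.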

进一步的,所述通过目标框是否撞线及目标框撞线区域的颜色来实现对目标进出的判定和统计包括以下步骤:Further, the determination and statistics of the target entry and exit through whether the target frame hits the line and the color of the target frame's line-crossing area include the following steps:

规定当前摄像头所捕捉到的全部画面为检测区域,并设定蓝色与黄色的条形区域为判定区,且当目标上行撞黄线时记为进,目标下行撞蓝线时记为出;It is stipulated that all the pictures captured by the current camera are the detection area, and the blue and yellow bar areas are set as the judgment area, and when the target hits the yellow line up, it is recorded as entering, and when the target hits the blue line down, it is recorded as out;

获取施工现场各出入口监控摄像头采集的实时监控画面,并对获取的实时监控画面进行尺寸缩小处理;Obtain the real-time monitoring pictures collected by the monitoring cameras at the entrances and exits of the construction site, and reduce the size of the acquired real-time monitoring pictures;

判断缩小后的实时监控画面的检测区域中是否有目标出现,若否,则将该监控画面视为无效画面并进行忽略清理,若是,则为目标赋框后输出;Judging whether there is a target in the detection area of the reduced real-time monitoring screen, if not, the monitoring screen is regarded as an invalid screen and ignored and cleaned up, if so, it is output after the target is framed;

检测目标框是否撞线及目标框撞线区域的颜色,并依据目标框撞线区域的颜色对目标的进出进行判定与统计。Detect whether the target frame hits the line and the color of the target frame's line-colliding area, and judge and count the entry and exit of the target according to the color of the target frame's line-colliding area.
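The entry/exit judgment above can be sketched as follows, assuming the yellow and blue strips are horizontal bands given by y-ranges of the check point. Band positions and function names are illustrative, not taken from the patent.

```python
# Minimal sketch of the entry/exit judgment: a target moving up into the
# yellow strip counts as "in", a target moving down into the blue strip
# counts as "out". Band coordinates below are assumed for illustration.

YELLOW_BAND = (200, 220)  # (y_top, y_bottom) of the yellow strip, assumed
BLUE_BAND = (240, 260)    # (y_top, y_bottom) of the blue strip, assumed

def in_band(y, band):
    return band[0] <= y <= band[1]

def judge_crossing(prev_y, curr_y):
    """Return 'in', 'out', or None for one frame-to-frame check-point move."""
    moving_up = curr_y < prev_y
    moving_down = curr_y > prev_y
    if moving_up and in_band(curr_y, YELLOW_BAND) and not in_band(prev_y, YELLOW_BAND):
        return "in"
    if moving_down and in_band(curr_y, BLUE_BAND) and not in_band(prev_y, BLUE_BAND):
        return "out"
    return None

counts = {"in": 0, "out": 0}
trajectory = [300, 270, 250, 230, 210]  # check-point y moving upward
for prev_y, curr_y in zip(trajectory, trajectory[1:]):
    result = judge_crossing(prev_y, curr_y)
    if result:
        counts[result] += 1
print(counts)  # → {'in': 1, 'out': 0}
```

Note the asymmetry: the upward track passes through the blue band too, but only a downward entry into blue counts as "out", which is what keeps a single person from being double-counted.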

进一步的,所述智能监测模块包括安全帽识别模块、工作服识别模块、状态识别模块及警报提醒模块;Further, the intelligent monitoring module includes a safety helmet identification module, a work clothes identification module, a state identification module and an alarm reminder module;

所述安全帽识别模块用于利用图像识别技术对通过工地入口的未佩带安全帽的施工人员进行识别;The safety helmet identification module is used to identify construction personnel who pass through the entrance of the construction site without wearing safety helmets by using image recognition technology;

所述工作服识别模块用于利用图像识别技术对通过工地入口的未穿着工作服的施工人员进行识别;The work clothes identification module is used to use image recognition technology to identify construction workers who pass through the entrance of the construction site without wearing work clothes;

所述状态识别模块用于利用图像识别技术对通过工地入口的施工人员的精神状态进行识别;The state identification module is used to identify the mental state of construction workers passing through the entrance of the construction site by using image recognition technology;

所述警报提醒模块用于对未佩戴安全帽、未穿着工作服及精神状态不符合标准的施工人员进行提醒。The alarm reminder module is used to remind construction workers who do not wear safety helmets, work clothes and whose mental state does not meet the standards.

进一步的,所述状态识别模块包括施工人员面部图像获取模块、融合特征识别模块及状态识别结果输出模块;Further, the state recognition module includes a construction personnel facial image acquisition module, a fusion feature recognition module and a state recognition result output module;

所述施工人员面部图像获取模块用于获取施工现场各出入口实时监控画面中施工人员的面部图像;The facial image acquisition module of the construction personnel is used to obtain the facial images of the construction personnel in the real-time monitoring screen of each entrance and exit of the construction site;

所述融合特征识别模块用于利用基于独立特征融合的面部状态识别算法对施工人员的面部图像进行识别,实现对进入施工现场的施工人员的精神状态进行识别;The fusion feature recognition module is used to identify the facial image of the construction personnel by using the facial state recognition algorithm based on independent feature fusion, so as to realize the identification of the mental state of the construction personnel entering the construction site;

所述状态识别结果输出模块用于输出施工现场各出入口实时监控画面中施工人员的精神状态信息。The state recognition result output module is used to output the mental state information of the construction workers in the real-time monitoring screen of each entrance and exit of the construction site.

进一步的,所述融合特征识别模块包括全局状态特征提取模块、局部状态特征提取模块、状态特征融合模块及状态特征分析识别模块;Further, the fusion feature recognition module includes a global state feature extraction module, a local state feature extraction module, a state feature fusion module, and a state feature analysis and recognition module;

所述全局状态特征提取模块用于通过离散余弦变换提取面部图像的全局状态特征,并利用独立成分分析技术对全局状态特征的相关性进行去除,得到独立全局状态特征;The global state feature extraction module is used to extract the global state feature of the facial image by discrete cosine transform, and utilizes the independent component analysis technique to remove the correlation of the global state feature to obtain the independent global state feature;

所述局部状态特征提取模块用于提取图像序列中眼部和嘴部区域的特征，并分别对眼部和嘴部区域进行Gabor小波变换和特征融合，得到两个局部区域的动态多尺度特征作为面部图像的局部状态特征；The local state feature extraction module is used to extract features of the eye and mouth regions in the image sequence and to perform Gabor wavelet transform and feature fusion on the eye and mouth regions respectively, obtaining dynamic multi-scale features of the two local regions as the local state features of the facial image;

所述状态特征融合模块用于将独立全局状态特征与局部状态特征进行融合,在全局特征中加入局部细节信息,得到面部状态融合特征;The state feature fusion module is used to fuse independent global state features and local state features, and add local detail information to the global features to obtain facial state fusion features;

所述状态特征分析识别模块用于通过预设的分类器对得到的面部状态融合特征进行分析与识别，得到施工人员的精神状态信息，其中，所述施工人员的精神状态包括清醒状态、轻度疲劳状态、中度疲劳状态及重度疲劳状态。The state feature analysis and recognition module is used to analyze and recognize the obtained facial state fusion features with a preset classifier to obtain the mental state information of the construction worker, where the mental state includes an awake state, a mild fatigue state, a moderate fatigue state, and a severe fatigue state.

进一步的，所述预设的分类器通过AdaBoost算法挑选部分特征，去除冗余特征并进行训练得到，所述预设的分类器的计算公式为：Further, the preset classifier is obtained by using the AdaBoost algorithm to select a subset of features, remove redundant features, and train on them; the calculation formula of the preset classifier is:

H(X) = sign( Σ_{t=1}^{T} a_t · h_t(X) )

其中，T表示算法循环的最终次数，a_t表示分类器h_t(X)被挑选的权值，由AdaBoost算法学习确定，X=(X1,X2,…,XT)表示挑选出来的面部图像序列的动态Gabor特征。Here, T is the final number of algorithm iterations, a_t is the weight assigned to the selected weak classifier h_t(X), learned by the AdaBoost algorithm, and X = (X1, X2, …, XT) denotes the selected dynamic Gabor features of the facial image sequence.
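The classifier described above, a weighted vote of T selected weak classifiers h_t(X) with learned weights a_t, follows the standard AdaBoost form. Below is a minimal, self-contained sketch trained on toy one-dimensional features standing in for the dynamic Gabor features; all names and data are illustrative, not the patent's trained model.

```python
import math

# Sketch of AdaBoost: select the best weak classifier per round, weight it
# by a_t = 0.5 * ln((1 - err) / err), and classify with the sign of the
# weighted sum H(X) = sign(sum_t a_t * h_t(X)). Toy data, illustrative only.

def train_adaboost(samples, labels, stumps, rounds):
    """samples: list of feature vectors; labels: +1/-1; stumps: candidate
    weak classifiers (functions x -> +1/-1). Returns [(a_t, h_t), ...]."""
    n = len(samples)
    w = [1.0 / n] * n          # sample weights
    chosen = []
    for _ in range(rounds):
        # Pick the weak classifier with the lowest weighted error.
        best = min(stumps, key=lambda h: sum(
            wi for wi, x, y in zip(w, samples, labels) if h(x) != y))
        err = sum(wi for wi, x, y in zip(w, samples, labels) if best(x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clamp away from 0 and 1
        a = 0.5 * math.log((1 - err) / err)    # weight a_t for h_t
        chosen.append((a, best))
        # Re-weight samples: boost the misclassified ones.
        w = [wi * math.exp(-a * y * best(x))
             for wi, x, y in zip(w, samples, labels)]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen

def strong_classify(chosen, x):
    """H(X) = sign(sum_t a_t * h_t(X))."""
    s = sum(a * h(x) for a, h in chosen)
    return 1 if s >= 0 else -1

# Toy 1-D features standing in for dynamic Gabor features (illustrative).
samples = [[0.1], [0.4], [0.35], [0.8], [0.9], [0.7]]
labels = [-1, -1, -1, 1, 1, 1]
stumps = [lambda x, t=t: 1 if x[0] > t else -1 for t in (0.2, 0.5, 0.75)]
model = train_adaboost(samples, labels, stumps, rounds=3)
print([strong_classify(model, x) for x in samples])  # → [-1, -1, -1, 1, 1, 1]
```

The per-round selection step is what implements the feature picking the text describes: each round keeps one weak classifier (one feature) and discards the redundant rest.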

本发明的有益效果为:The beneficial effects of the present invention are:

1)通过利用训练好的深度学习算法对通过工地入口的施工人员进行检测，并利用跟踪算法对目标进行跟踪，在目标碰撞检测线时进行计数并统计，从而可以实现对施工现场人员及车辆进、出数量以及当地内人员与车辆总数的智能统计，此外，本发明还可以利用图像识别技术分别对通过工地入口的施工人员的着装及状态进行识别，并对不符合标准的施工人员进行提醒，从而可以有效地避免因施工人员未佩带安全帽、未穿着工作服或精神状态欠佳而导致的安全事故现象的发生，进而可以有效地提高施工现场的施工安全性能。1) A trained deep learning algorithm detects construction workers passing through the site entrance, a tracking algorithm tracks the targets, and counting is performed when a target crosses the detection line, enabling intelligent statistics on the numbers of personnel and vehicles entering and leaving the construction site as well as the totals of personnel and vehicles on site. In addition, the invention uses image recognition technology to identify the clothing and state of construction workers passing through the site entrance and reminds those who do not meet the standards, effectively preventing safety accidents caused by workers not wearing safety helmets or work clothes, or being in a poor mental state, and thereby effectively improving the construction safety of the site.

2)本发明不仅可以在各类光照条件下识别准确率均能达到95%以上，而且还可以在保证统计数据准确的同时，使得分析画面与原画面时间差不超过1秒，具有识别准确率高和识别速度快的优点。2) The invention achieves a recognition accuracy above 95% under all kinds of lighting conditions and, while keeping the statistics accurate, keeps the time lag between the analysis picture and the original picture within 1 second, offering both high recognition accuracy and fast recognition speed.

3)基于本发明所使用的算法与UI所包含的可塑性，可根据不同的需求加入不同的功能，如人脸识别、工种识别等，可拓展性高；此外，本发明所需硬件设备价格低廉，在节约成本的同时提高现场管理效率获取更多的经济效益。3) Owing to the flexibility of the algorithm and UI used in the invention, different functions such as face recognition and trade recognition can be added according to different needs, giving high extensibility. Moreover, the hardware required by the invention is inexpensive, improving on-site management efficiency and yielding greater economic benefit while saving costs.

附图说明Description of drawings

为了更清楚地说明本发明实施例或现有技术中的技术方案，下面将对实施例中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本发明的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图获得其他的附图。In order to explain the technical solutions in the embodiments of the present invention or the prior art more clearly, the accompanying drawings needed in the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

图1是根据本发明实施例的基于计算机视觉与深度学习的施工人员智能监测系统的结构框图;Fig. 1 is a structural block diagram of an intelligent monitoring system for construction personnel based on computer vision and deep learning according to an embodiment of the present invention;

图2是根据本发明实施例的基于计算机视觉与深度学习的施工人员智能监测系统中深度学习算法分析模块的结构框图;2 is a structural block diagram of a deep learning algorithm analysis module in an intelligent monitoring system for construction personnel based on computer vision and deep learning according to an embodiment of the present invention;

图3是根据本发明实施例的基于计算机视觉与深度学习的施工人员智能监测系统中状态识别模块的结构框图;3 is a structural block diagram of a state recognition module in an intelligent monitoring system for construction personnel based on computer vision and deep learning according to an embodiment of the present invention;

图4是根据本发明实施例的基于计算机视觉与深度学习的施工人员智能监测系统中融合特征识别模块的结构框图。Fig. 4 is a structural block diagram of a fusion feature recognition module in an intelligent monitoring system for construction workers based on computer vision and deep learning according to an embodiment of the present invention.

图中:In the picture:

1、智能统计模块；2、智能监测模块；11、视频采集模块；12、深度学习算法分析模块；121、检测区域设定模块；122、视频分析模块；123、数据统计模块；124、数据传输模块；13、分析结果显示模块；21、安全帽识别模块；22、工作服识别模块；23、状态识别模块；231、施工人员面部图像获取模块；232、融合特征识别模块；2321、全局状态特征提取模块；2322、局部状态特征提取模块；2323、状态特征融合模块；2324、状态特征分析识别模块；233、状态识别结果输出模块；24、警报提醒模块。1. Intelligent statistics module; 2. Intelligent monitoring module; 11. Video acquisition module; 12. Deep learning algorithm analysis module; 121. Detection area setting module; 122. Video analysis module; 123. Data statistics module; 124. Data transmission module; 13. Analysis result display module; 21. Safety helmet recognition module; 22. Work clothes recognition module; 23. State recognition module; 231. Construction personnel facial image acquisition module; 232. Fusion feature recognition module; 2321. Global state feature extraction module; 2322. Local state feature extraction module; 2323. State feature fusion module; 2324. State feature analysis and recognition module; 233. State recognition result output module; 24. Alarm reminder module.

具体实施方式Detailed ways

为进一步说明各实施例，本发明提供有附图，这些附图为本发明揭露内容的一部分，其主要用以说明实施例，并可配合说明书的相关描述来解释实施例的运作原理，配合参考这些内容，本领域普通技术人员应能理解其他可能的实施方式以及本发明的优点，图中的组件并未按比例绘制，而类似的组件符号通常用来表示类似的组件。In order to further illustrate the various embodiments, the present invention provides accompanying drawings, which form part of the disclosure of the present invention and mainly serve to illustrate the embodiments; together with the relevant descriptions in the specification, they explain the operating principles of the embodiments. With reference to this content, those of ordinary skill in the art will understand other possible implementations and the advantages of the present invention. The components in the figures are not drawn to scale, and similar reference symbols are generally used to denote similar components.

根据本发明的实施例,提供了基于计算机视觉与深度学习的施工人员智能监测系统。According to an embodiment of the present invention, an intelligent monitoring system for construction personnel based on computer vision and deep learning is provided.

现结合附图和具体实施方式对本发明进一步说明，如图1-图4所示，根据本发明实施例的基于计算机视觉与深度学习的施工人员智能监测系统，该系统包括智能统计模块1和智能监测模块2；The present invention will now be further described with reference to the accompanying drawings and specific embodiments. As shown in Figures 1-4, the intelligent monitoring system for construction personnel based on computer vision and deep learning according to an embodiment of the present invention includes an intelligent statistics module 1 and an intelligent monitoring module 2;

所述智能统计模块1用于利用训练好的深度学习算法对通过工地入口的施工人员进行检测，并通过跟踪算法对目标进行跟踪，在目标碰撞检测线时进行计数并统计，同时通过客户端进行实时展示；The intelligent statistics module 1 is used to detect construction workers passing through the site entrance with a trained deep learning algorithm, track the targets with a tracking algorithm, count them when a target crosses the detection line, and display the results in real time through the client;

其中,所述智能统计模块1包括视频采集模块11、深度学习算法分析模块12及分析结果显示模块13;Wherein, the intelligent statistical module 1 includes a video acquisition module 11, a deep learning algorithm analysis module 12 and an analysis result display module 13;

所述视频采集模块11用于根据架设在施工现场各出入口的监控摄像头采集实时监控画面，并将实时监控画面输入POE交换机经转换后得到初始视频素材；The video acquisition module 11 is used to capture real-time monitoring pictures from the surveillance cameras erected at each entrance and exit of the construction site, and to feed them into a POE switch for conversion into initial video material;

所述深度学习算法分析模块12用于通过训练好的深度学习算法实时输出与初始视频素材相对应的视频分析画面以及人员统计数据,并将分析画面和统计结果输出至客户端;The deep learning algorithm analysis module 12 is used to output in real time the video analysis picture corresponding to the initial video material and the personnel statistics data through the trained deep learning algorithm, and output the analysis picture and statistical results to the client;

其中，本实施例的算法采用的主干特征提取网络优于跨级部分网络，降低了显存消耗，在加强卷积神经网络的学习能力的同时，也进一步加宽网络，保证算法精度。在主干特征提取网络的训练策略中，为保障复杂环境下的识别精度，将多张图像进行拼接，模拟复杂环境下的物体。本算法还将更改后图像进行再次决策，改善决策边界的薄弱环境，提高系统鲁棒性。该算法识别物体有如下三步：Here, the backbone feature-extraction network adopted by the algorithm of this embodiment outperforms a cross-stage partial network and reduces GPU memory consumption; while strengthening the convolutional neural network's learning capacity, it also further widens the network to preserve accuracy. In the training strategy of the backbone feature-extraction network, multiple images are stitched together to simulate objects in complex environments, safeguarding recognition accuracy under such conditions. The algorithm also makes a second decision on the modified images, shoring up weak regions of the decision boundary and improving system robustness. The algorithm recognizes objects in the following three steps:

首先,采用步长为2、卷积核为3×3的卷积层对图像进行5次采样,从而提取图像的主干特征,并产生五张不同尺寸特征层。First, the image is sampled 5 times with a convolutional layer with a stride of 2 and a convolution kernel of 3×3 to extract the main features of the image and generate five feature layers of different sizes.

其次，将较小尺寸特征层进行多尺度感受野融合并进行最大值池化，再通过张量拼接将处理后的小尺寸特征图与较大尺寸图进行参数聚合。故本算法在不同尺寸检测下依旧适用。Secondly, the smaller feature layers undergo multi-scale receptive-field fusion and max pooling, after which the processed small feature maps are aggregated with the larger maps by tensor concatenation, so the algorithm remains applicable to detection at different scales.

最后，将特征层划分为不同尺寸的网格，目的在于能检测大小相差较大的各类目标，避免目标丢失。再对不同尺寸网格上产生多规格的先验框，每个先验框返回物体为预设物体类别概率，再将多个先验框中包含最大置信度的先验框作为物体实际位置，返回所有预设物体类别的置信度与物体实际位置。Finally, the feature layers are divided into grids of different sizes so that targets of widely varying sizes can be detected without loss. Prior boxes of multiple specifications are then generated on the grids of different sizes; each prior box returns the probability that the object belongs to each preset category, the prior box with the maximum confidence among them is taken as the object's actual position, and the confidences of all preset categories are returned together with that position.
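The three steps above can be illustrated numerically. This sketch (with an assumed input size of 640 and assumed helper names, not the patent's code) shows how five stride-2 samplings yield five feature layers of halving size, and how the prior box with maximum confidence is kept as the object's position.

```python
# Illustrative sketch of the pipeline above: five stride-2 samplings halve
# the spatial size each time, producing five feature layers; among the
# prior boxes proposed on the grids, the one with the highest confidence
# is kept as the object's actual position. Sizes and data are assumed.

def feature_layer_sizes(input_size, samplings=5, stride=2):
    """Spatial sizes after each stride-2 sampling of an input_size image."""
    sizes = []
    size = input_size
    for _ in range(samplings):
        size = size // stride
        sizes.append(size)
    return sizes

def best_prior_box(prior_boxes):
    """prior_boxes: [(confidence, (x1, y1, x2, y2), class_name), ...].
    Keep the box with maximum confidence as the object's actual position."""
    return max(prior_boxes, key=lambda b: b[0])

print(feature_layer_sizes(640))  # → [320, 160, 80, 40, 20]
boxes = [(0.42, (10, 10, 50, 90), "person"),
         (0.87, (12, 8, 52, 88), "person"),
         (0.31, (0, 0, 60, 100), "vehicle")]
print(best_prior_box(boxes)[0])  # → 0.87
```

The coarse 20x20 layer suits large, close targets while the fine 320x320 layer suits small, distant ones, which is why dividing the layers into grids of different sizes avoids losing targets of widely varying scale.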

具体的，置信度区间是人为设定，置信度在算法识别出物体后给出。在测试中，置信度最低值为0.3，置信度区间为(0.3,1)，在此区间下，由于待检测物体种类较为单一，自然环境变化小，识别正确率较高。最佳置信度区间受环境因素影响大，故不存在适用于不同环境的最佳置信度区间，应在实际环境下多次调试置信度区间才能得到对应条件下最佳置信度区间。Specifically, the confidence interval is set manually, and the confidence is output by the algorithm after it recognizes an object. In testing, the minimum confidence was 0.3, giving a confidence interval of (0.3, 1); within this interval, because the types of objects to be detected are fairly uniform and the natural environment varies little, the recognition accuracy is high. The optimal confidence interval is strongly affected by environmental factors, so no single optimal interval suits all environments; the interval should be tuned repeatedly in the actual environment to find the optimum for the given conditions.
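The confidence-interval filtering described above can be sketched as follows. The function name and sample detections are illustrative; only the lower bound of 0.3 comes from the test setting in the text.

```python
# Sketch of confidence filtering: detections whose confidence falls inside
# a tunable interval (0.3, 1] are kept; the bounds are meant to be re-tuned
# per deployment environment, as the text notes. Sample data is assumed.

def filter_by_confidence(detections, low=0.3, high=1.0):
    """detections: [(confidence, label), ...] -> detections kept."""
    return [d for d in detections if low < d[0] <= high]

raw = [(0.95, "person"), (0.25, "person"), (0.40, "vehicle"), (0.30, "person")]
kept = filter_by_confidence(raw)
print(kept)  # → [(0.95, 'person'), (0.4, 'vehicle')]
```

Because the interval is open at 0.3, a detection at exactly the threshold is discarded; raising `low` trades recall for precision in noisier environments.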

具体的,所述深度学习算法分析模块12包括检测区域设定模块121、视频分析模块122、数据统计模块123及数据传输模块124;Specifically, the deep learning algorithm analysis module 12 includes a detection area setting module 121, a video analysis module 122, a data statistics module 123 and a data transmission module 124;

所述检测区域设定模块121用于根据不同出入口的画面布局，通过对相应的检测范围进行调整来控制各个画面中的实际检测范围，实现检测区域（检测区域包含全部画面范围，判定区仅包含蓝黄条形区，如果人员出现在画面中但未通过蓝黄条区，则该人员会被捕捉到但不会计入统计数据）的设定；The detection area setting module 121 is used to control the actual detection range in each picture by adjusting the corresponding detection range according to the picture layout of different entrances and exits, thereby setting the detection area (the detection area covers the whole picture, while the judgment zone covers only the blue and yellow strips; if a person appears in the picture but does not pass through the blue-yellow strip area, that person is captured but not counted in the statistics);

During on-site testing of this embodiment it was found that the video detection range needs to be adjusted for different areas. By default, the entire picture captured by a camera is treated as the detection area for recognition, but in practice the whole monitoring picture rarely needs to participate in recognition. For example, the main personnel entrance of the project site used during development mainly shows the turnstile lane and the guard duty room on the right. By default the algorithm identifies and detects every person in the picture, so someone walking back and forth inside the duty room would also be judged as "entering" or "exiting", and such misjudgments introduce large errors into the system's statistics. Therefore, when the system is deployed on a project site, the detection range of the corresponding background algorithm must be adjusted according to the picture layout of each entrance and exit, and the actual detection range in each picture is controlled by setting a detection area in the algorithm code. Any point on the image can be specified by a two-element array; to set a detection area, the endpoint positions are specified in turn following the shape of the required region, so that the detection area is expressed as an array whose elements are two-element arrays representing the endpoints of the shape, and adjusting the array adjusts the geometric region. Everyone who enters the picture of this lane is identified, but only those who pass through the detection area shown in the figure are counted; people moving outside the detection area do not affect the statistics.
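Expressed in code, the detection area described above can be held as an array of endpoint pairs and tested with a standard point-in-polygon check. The following is a minimal sketch; the region coordinates and function names are illustrative, not taken from the patent's implementation:

```python
def point_in_region(point, region):
    """Ray-casting test: is (x, y) inside the polygon given as a list
    of (x, y) endpoint pairs, as described for the detection area?"""
    x, y = point
    inside = False
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        # Count crossings of a horizontal ray cast from (x, y) to the right
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# A hypothetical rectangular detection area covering the turnstile lane only
detection_area = [(100, 200), (500, 200), (500, 400), (100, 400)]
print(point_in_region((300, 300), detection_area))  # person in the lane -> True
print(point_in_region((600, 300), detection_area))  # person in the duty room -> False
```

A person detected at (600, 300) would still be identified and boxed, but because the point lies outside the region the statistics are left untouched, matching the behaviour described above.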

The video analysis module 122 is used to analyze the images in the initial video material by means of the coordinates of line-collision detection points, realizing the analysis and counting of personnel and vehicles in the detection area;

After initialization, the algorithm in this embodiment can identify and report the category and extent of each object in the image. The algorithm uses a rectangular box to locate an object: in the plane, a rectangle is fully determined by one diagonal, e.g. the four parameters given by the top-right vertex (x1, y1) and the bottom-left vertex (x2, y2), so the algorithm only needs to return these four parameters together with the object category. The four returned parameters are used to determine the line-collision detection point and to draw the rectangle. After the position parameters and categories of all objects in the current image are obtained, if the geometric center of an object in the new frame is within a set offset of the geometric center of an object in the previous frame, the two are treated as the same object with the same ID; if a new object appears in the new frame, it is assigned a new ID. After this processing, an image is obtained in which each object's extent is represented by a rectangle and each object's ID is known. Let the parameters of an object's extent be x1, y1, x2, y2, where x1 < x2 and y1 < y2. The line-collision detection point is (check_point_x, check_point_y), with check_point_x = x1 and check_point_y = int[y1 + (y2 - y1) * 0.6], where int denotes rounding the result to an integer. The algorithm performs the counting operation only when an object's line-collision detection point lies within the judgment zone; otherwise no counting is performed.
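The check-point formula and the centre-offset ID matching described above can be sketched as follows. This is a simplified illustration; `max_offset` and the greedy matching strategy are assumptions, not the patent's exact values:

```python
def check_point(x1, y1, x2, y2):
    # Line-collision detection point as defined in the text:
    # the left edge, 60% of the way down the bounding box
    return (x1, int(y1 + (y2 - y1) * 0.6))

def assign_ids(prev_objects, detections, max_offset=30.0):
    """prev_objects: {id: (cx, cy)}; detections: list of (x1, y1, x2, y2).
    Reuse an ID when the geometric centre moved by at most max_offset,
    otherwise allocate a new one (greedy first-match, for illustration)."""
    next_id = max(prev_objects, default=-1) + 1
    result = {}
    for (x1, y1, x2, y2) in detections:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        match = None
        for oid, (px, py) in prev_objects.items():
            if oid not in result and ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5 <= max_offset:
                match = oid
                break
        if match is None:
            match, next_id = next_id, next_id + 1
        result[match] = (cx, cy)
    return result

print(check_point(100, 200, 180, 400))  # (100, 320)
```

A box (100, 200, 180, 400) gets its check point at (100, 320): 60% down its left edge, which sits roughly at leg height so that the point reliably crosses a floor-painted judgment zone.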

The data statistics module 123 is used to judge and count targets entering and exiting according to whether the target box hits a line and the color of the region the target box hits;

When implementing the personnel counting function, this embodiment adopts a more intuitive and accurate judgment method: judgment zones are preset in the algorithm as blue and yellow bar regions. A target moving up and hitting the yellow line is recorded as entering, and a target moving down and hitting the blue line is recorded as exiting, and the numbers of pedestrians entering and exiting are counted accordingly. At run time the video is processed as follows: each frame is first downscaled and then checked for targets. If no target appears, the frame is treated as invalid, ignored, and discarded; if a target is present, the algorithm draws a box around it and outputs the result, and finally checks whether the target box hits a line and whether the region it hits is blue or yellow, which serves as the basis for judging entry or exit.
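The up-through-yellow / down-through-blue counting rule can be illustrated with a minimal sketch that tracks the vertical position of an object's check point between frames. The line positions and the crossing test below are simplified assumptions, not the embodiment's exact geometry:

```python
def update_counts(prev_y, curr_y, yellow_y, blue_y, counts):
    """Hypothetical crossing check: moving up through the yellow line
    counts as 'in'; moving down through the blue line counts as 'out'.
    In image coordinates, 'up' means a decreasing y value."""
    if prev_y > yellow_y >= curr_y:      # upward crossing of the yellow line
        counts['in'] += 1
    elif prev_y < blue_y <= curr_y:      # downward crossing of the blue line
        counts['out'] += 1
    return counts

counts = {'in': 0, 'out': 0}
update_counts(310, 290, yellow_y=300, blue_y=320, counts=counts)  # walks up past yellow
update_counts(315, 330, yellow_y=300, blue_y=320, counts=counts)  # walks down past blue
print(counts)  # {'in': 1, 'out': 1}
```

Using two lines of different colours, rather than a single line, lets the direction of travel be read off from which line was hit, which is the "more intuitive and accurate" judgment the text refers to.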

The data transmission module 124 is used to output the analysis picture and statistical results to the client through an RTSP server and an HTTP push service.

After the video material has been recognized and analyzed by the deep learning algorithm, the analysis picture must be output to the client. To minimize the delay between the real-time monitoring picture and the analysis picture displayed in the client, this embodiment transmits via an RTSP server. RTSP (Real Time Streaming Protocol) is a real-time streaming protocol in which both client and server may issue requests, that is, RTSP can be bidirectional, and the serving server can be switched according to the actual load, avoiding the delays caused by concentrating too much load on a single server. A video-stream interface supporting the RTSP protocol is first written into the algorithm, and the RTSP stream is then obtained from the camera's IP address, port number, device user name, password, and other information. The analysis picture is delivered to the client by feeding the stream address into the algorithm's preset interface.
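As a sketch, the RTSP stream address assembled from the camera's IP address, port, user name and password might look like the following. The URL path is vendor-specific and is shown here only as an assumption:

```python
def rtsp_url(ip, port, user, password, channel=1):
    # Typical RTSP URL layout; the "/chN/main" path varies by camera vendor
    # (assumption for illustration, not the embodiment's actual path)
    return f"rtsp://{user}:{password}@{ip}:{port}/ch{channel}/main"

url = rtsp_url("192.168.1.64", 554, "admin", "secret")
print(url)  # rtsp://admin:secret@192.168.1.64:554/ch1/main

# The stream could then be opened by a video interface supporting RTSP,
# e.g. with OpenCV:  cap = cv2.VideoCapture(url)
```

Port 554 is the RTSP default; the credentials here are placeholders.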

Similar to the output of the real-time analysis picture, the statistics produced by the deep learning algorithm also need to be transmitted to the client in real time; this embodiment uses HTTP for the data transmission. HTTP (Hyper Text Transfer Protocol) has the advantages of being simple, flexible, and easy to extend. In this embodiment an HTTP program is first written on the server side, i.e. inside the deep learning algorithm; a TCP connection from client to server is then created and a data request message is composed in the client, so that sending the message constitutes an HTTP request. On receiving the request, the server composes a response according to the request content and returns it over the reused TCP connection; the client parses and reads the response content and finally displays it on the client user interface in real time.
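The request/response exchange described above can be sketched by composing the HTTP response the server side would send back, with the statistics carried as JSON. This is a minimal illustration, not the embodiment's actual message format:

```python
import json

def stats_response(stats):
    """Build a minimal HTTP/1.1 response carrying the statistics as JSON,
    mirroring the server side of the exchange described above (sketch)."""
    body = json.dumps(stats)
    return (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n" + body
    )

print(stats_response({"in": 12, "out": 7, "on_site": 5}).splitlines()[0])  # HTTP/1.1 200 OK
```

The client's side of the exchange is symmetric: it writes a request message on the TCP connection, then parses the status line, headers, and JSON body of a response like the one built here.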

The analysis result display module 13 is used to display the video analysis picture and the statistics of personnel and vehicles on the client, and also allows managers to switch, as required, between the analysis pictures and statistics obtained from the cameras at different entrances and exits.

To present the real-time video analysis picture to on-site managers more intuitively and let them retrieve the analysis pictures of different entrances and exits as needed, this embodiment combines the deep learning algorithm with Unreal Engine 4, adds RTSP and HTTP push services to the algorithm, displays the obtained video analysis picture and personnel statistics on the client in real time, and adds further functions to the client.

The functions of the client mainly include:

(1) Personnel and vehicle statistics, including the numbers of personnel and vehicles entering and exiting and the total numbers of personnel and vehicles currently on site;

(2) View switching: when multiple cameras are installed at different entrances and exits, the client can switch between cameras and obtain in real time the personnel and vehicle entry/exit data recorded by the selected camera.

The intelligent monitoring module 2 is used to identify, by means of preset image recognition technology, the dress and state of construction workers passing through the site entrance, and to remind workers who do not meet the standards.

The intelligent monitoring module 2 comprises a safety helmet recognition module 21, a work clothes recognition module 22, a state recognition module 23, and an alarm reminder module 24;

The safety helmet recognition module 21 is used to identify, by image recognition technology, construction workers passing through the site entrance without wearing safety helmets;

The work clothes recognition module 22 is used to identify, by image recognition technology, construction workers passing through the site entrance without wearing work clothes;

The state recognition module 23 is used to identify, by image recognition technology, the mental state of construction workers passing through the site entrance;

Specifically, the state recognition module 23 comprises a worker facial image acquisition module 231, a fusion feature recognition module 232, and a state recognition result output module 233;

The worker facial image acquisition module 231 is used to acquire the facial images of construction workers from the real-time monitoring pictures of the entrances and exits of the construction site;

The fusion feature recognition module 232 is used to recognize the workers' facial images with a facial state recognition algorithm based on independent feature fusion, thereby identifying the mental state of workers entering the construction site;

The fusion feature recognition module 232 comprises a global state feature extraction module 2321, a local state feature extraction module 2322, a state feature fusion module 2323, and a state feature analysis and recognition module 2324;

The global state feature extraction module 2321 is used to extract the global state features of the facial image by the discrete cosine transform (DCT) and to remove the correlations among the global state features with the independent component analysis (ICA) technique, yielding independent global state features;

The discrete cosine transform is a commonly used image data compression method. For an M×N digital image f(x, y), its 2D discrete cosine transform is defined as:

$$C(u,v)=\alpha(u)\,\alpha(v)\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\cos\frac{(2x+1)u\pi}{2M}\cos\frac{(2y+1)v\pi}{2N}$$

where u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; $\alpha(u)=\sqrt{1/M}$ for u = 0 and $\sqrt{2/M}$ otherwise, with $\alpha(v)$ defined analogously using N;

The discrete cosine transform has the property that when the frequency-domain factors u and v are large, the DCT coefficient C(u, v) is small, while the large-valued C(u, v) are concentrated in the upper-left region where u and v are small, which is also where the useful information concentrates. In this embodiment the useful information of this region is extracted as the global fatigue feature of the image.
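The definition above can be sketched directly in code, keeping the low-frequency top-left block of coefficients as the global feature. This is a direct O(M²N²) evaluation for illustration only; a real implementation would use a fast transform, and the size of the block kept is an assumption:

```python
import math

def dct2(img):
    """Direct 2-D DCT-II of an M x N image, term-for-term the formula above."""
    M, N = len(img), len(img[0])
    def alpha(k, K):
        return math.sqrt(1.0 / K) if k == 0 else math.sqrt(2.0 / K)
    C = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            s = 0.0
            for x in range(M):
                for y in range(N):
                    s += (img[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * M))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            C[u][v] = alpha(u, M) * alpha(v, N) * s
    return C

img = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
C = dct2(img)
# Keep only the low-frequency top-left block as the global feature
feature = [C[u][v] for u in range(2) for v in range(2)]
print(len(feature))  # 4
```

For a constant image all energy lands in C(0,0), which is why the small-u, small-v corner carries the bulk of the information the text describes.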

Independent component analysis is an effective method for the blind source separation problem: through a transformation matrix, ICA can successfully separate mutually independent source signals from a mixed signal. Applied to fatigue feature extraction, this property not only reduces the dimensionality of the fatigue feature vector but also reduces the higher-order correlations among its components.

Besides the second-order statistical correlations that the PCA method can remove, facial expression images also contain a large amount of higher-order statistical correlation; removing these higher-order correlations with the ICA method therefore yields features with greater discriminative power. The basic idea of the ICA algorithm is to represent a set of random variables with a set of basis functions while assuming the components to be statistically independent, or as independent as possible.

In this embodiment, ICA is used to separate mutually independent features from the global DCT features of the facial image sequence through a transformation matrix; this not only reduces the dimensionality of the feature vector but also reduces the higher-order correlations among its components, yielding more discriminative independent global features.

The local state feature extraction module 2322 is used to extract the features of the eye and mouth regions in the image sequence and to apply Gabor wavelet transformation and feature fusion to the eye and mouth regions respectively, obtaining the dynamic multi-scale features of the two local regions as the local state features of the facial image;

The visual signs of facial fatigue occur at different scales: some are large, whole-face cues and some are subtle, small-scale ones, so single-scale analysis can hardly extract all the important features of a fatigued expression. Multi-scale decomposition, which extracts for analysis the features at the scales and orientations carrying the most fatigue information, analyzes the facial visual information more effectively. For facial video image sequences, because the different facial movements during fatigue occur at different scales, a multi-scale method is needed to analyze the fatigue information.

The Gabor wavelet is a powerful tool for multi-scale analysis. Compared with the DCT, the Gabor transform achieves optimal localization in the time and frequency domains simultaneously; its coefficients describe the gray-level characteristics of the neighborhood of a given image position, and it is insensitive to illumination, position, and similar factors, making it suitable for representing the local features of a face. Since the DCT emphasizes the global information of the image, it tends to ignore the local information that matters more in face recognition. This embodiment therefore introduces the Gabor wavelet transform to extract local multi-scale image features and fuses them with the independent global features, adding the image's local detail to the global information.
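A Gabor filter bank over several orientations (and, analogously, several scales) can be sketched as follows; the kernel parameters are illustrative, not the embodiment's settings:

```python
import math

def gabor_kernel(size, sigma, theta, lam, psi=0.0, gamma=0.5):
    """Real part of a Gabor filter in its standard formulation:
    a Gaussian envelope (sigma, aspect ratio gamma) modulating a
    cosine carrier of wavelength lam at orientation theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            row.append(math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2 * sigma * sigma))
                       * math.cos(2 * math.pi * xr / lam + psi))
        kernel.append(row)
    return kernel

# A small bank over four orientations; varying sigma/lam adds the scales
bank = [gabor_kernel(15, sigma=4.0, theta=t * math.pi / 4, lam=8.0) for t in range(4)]
print(len(bank), len(bank[0]), len(bank[0][0]))  # 4 15 15
```

Convolving the eye and mouth regions with each kernel in such a bank and concatenating the responses gives the multi-scale, multi-orientation local features the text describes.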

The state feature fusion module 2323 is used to fuse the independent global state features with the local state features, adding local detail information to the global features to obtain the facial state fusion features;

The state feature analysis and recognition module 2324 is used to analyze and recognize the obtained facial state fusion features with a preset classifier to obtain the worker's mental state information, where the worker's mental state comprises an awake state (eyes opening normally, active eyeballs, upright head, focused attention, level eyebrows), a mild fatigue state (reduced eyeball activity, dull gaze, drooping eyebrows, furrowed forehead, more frequent head turning, listlessness), a moderate fatigue state (eye closing, yawning and nodding, severely drooping eyebrows, pronounced facial muscle deformation), and a severe fatigue state (an increasing tendency of the eyes to close, sustained eye closure, and scattered attention).

The preset classifier is obtained by using the AdaBoost algorithm to select a subset of the features, remove redundant features, and train; the preset classifier is computed as:

$$H(X)=\operatorname{sign}\Bigl(\sum_{t=1}^{T} a_t\,h_t(X)\Bigr)$$

where T is the final number of algorithm iterations, a_t is the weight with which the weak classifier h_t(X) is selected, determined by AdaBoost learning, and X = (X_1, X_2, ..., X_T) denotes the dynamic Gabor features of the selected facial image sequence.
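The AdaBoost combination of T selected weak classifiers, weighted by the learned a_t and thresholded by sign, can be sketched with hypothetical threshold stumps as the weak classifiers (the stumps and weights below are illustrative, not learned values):

```python
def strong_classifier(weak_learners, weights, x):
    """H(x) = sign(sum_t a_t * h_t(x)): the AdaBoost weighted vote.
    weak_learners: callables returning +1/-1; weights: the learned a_t."""
    score = sum(a * h(x) for a, h in zip(weights, weak_learners))
    return 1 if score >= 0 else -1

# Three hypothetical threshold stumps on a 1-D fatigue feature
h1 = lambda x: 1 if x > 0.2 else -1
h2 = lambda x: 1 if x > 0.5 else -1
h3 = lambda x: 1 if x > 0.8 else -1

print(strong_classifier([h1, h2, h3], [0.9, 0.6, 0.3], 0.6))  # 1
print(strong_classifier([h1, h2, h3], [0.9, 0.6, 0.3], 0.1))  # -1
```

In training, AdaBoost picks each h_t as the single feature that best separates the reweighted samples, which is how it performs the feature selection and redundancy removal the text mentions.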

The state recognition result output module 233 is used to output the mental state information of the construction workers in the real-time monitoring pictures of the entrances and exits of the construction site.

The alarm reminder module 24 is used to remind construction workers who are not wearing safety helmets or work clothes, or whose mental state does not meet the standard.

In summary, by means of the above technical solution, the present invention detects construction workers passing through the site entrance with a trained deep learning algorithm, tracks the targets with a tracking algorithm, and counts them when a target hits the detection line, thereby providing intelligent statistics of the numbers of personnel and vehicles entering and exiting the construction site and of the totals on site. In addition, the present invention uses image recognition technology to identify the dress and state of workers passing through the site entrance and reminds those who do not meet the standards, effectively preventing safety accidents caused by workers not wearing safety helmets or work clothes or being in poor mental condition, and thus effectively improving construction safety on site.

Meanwhile, the present invention not only achieves a recognition accuracy above 95% under all kinds of lighting conditions, but also, while keeping the statistics accurate, keeps the time difference between the analysis picture and the original picture below 1 second, offering both high recognition accuracy and fast recognition.

Meanwhile, thanks to the flexibility of the algorithm and UI used in the present invention, different functions such as face recognition and trade recognition can be added for different requirements, giving high extensibility; in addition, the hardware required by the present invention is inexpensive, saving cost while improving on-site management efficiency and yielding greater economic benefit.

The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. The constructor intelligent monitoring system based on computer vision and deep learning is characterized by comprising an intelligent statistics module (1) and an intelligent monitoring module (2);
the intelligent statistics module (1) is used for detecting constructors passing through a construction site entrance by using a trained deep learning algorithm, tracking targets by using a tracking algorithm, counting and counting when the targets collide with a detection line, and displaying in real time by a client;
the intelligent monitoring module (2) is used for respectively identifying the dressing and the state of constructors passing through the site entrance by utilizing a preset image identification technology and reminding constructors which do not accord with the standard;
the intelligent statistics module (1) comprises a video acquisition module (11), a deep learning algorithm analysis module (12) and an analysis result display module (13);
the video acquisition module (11) is used for acquiring real-time monitoring pictures according to monitoring cameras erected at various entrances and exits of a construction site, and inputting the real-time monitoring pictures into a POE switch to obtain an initial video material after conversion;
the deep learning algorithm analysis module (12) is used for outputting video analysis pictures and personnel statistics data corresponding to the initial video materials in real time through a trained deep learning algorithm, and outputting the analysis pictures and statistics results to a client;
the analysis result display module (13) is used for displaying video analysis pictures and statistics data of personnel and vehicles on the client side, and is also used for enabling management personnel to switch the analysis pictures and the statistics data obtained by cameras at different entrances and exits according to requirements;
the deep learning algorithm analysis module (12) comprises a detection area setting module (121), a video analysis module (122), a data statistics module (123) and a data transmission module (124);
the detection area setting module (121) is used for controlling the actual detection range in each picture by adjusting the corresponding detection range according to the picture layout of different entrances and exits so as to realize the setting of the detection area;
the video analysis module (122) is used for analyzing images in the initial video material through the coordinates of line-collision detection points, so that analysis and statistics of personnel and vehicles in a detection area are realized;
the data statistics module (123) is used for judging and counting targets entering and exiting according to whether the target box hits a line and the color of the area where the target box hits the line;
the data transmission module (124) is used for outputting the analysis picture and the statistical result to the client through the RTSP server and the HTTP push service.
2. The intelligent constructor monitoring system based on computer vision and deep learning as set forth in claim 1, wherein the setting of the detection area comprises the steps of:
the positions of all the endpoints are determined in sequence according to the shape of the required area and expressed by an array, each element of the array is a binary array representing the endpoint of the graph, and the adjustment and the setting of the detection area are realized by adjusting the array.
3. The intelligent monitoring system for constructors based on computer vision and deep learning according to claim 2, wherein the analysis of the image in the initial video material by the coordinates of the collision detection point, the analysis and statistics of the personnel and vehicles in the detection area are realized by the following steps:
acquiring position parameters and categories of all objects in a current image;
judging whether the geometric center of an object in the new frame image and the offset of a geometric center of a certain object in the previous frame image are within a preset offset, if so, judging that the two objects are the same object and have the same ID, if not, judging that the new object exists in the new frame image, and assigning a new ID for the object;
using an image with a known object ID in which a rectangle represents the object range, and taking the parameters of an object range as x1, y1, x2, y2, where x1 < x2 and y1 < y2, the coordinates of the line-collision detection point are (check_point_x, check_point_y), where check_point_x = x1, check_point_y = int[y1 + (y2 - y1)*0.6], and int means rounding the operation result;
and judging whether the object collision detection point is positioned in the judging area, if so, carrying out statistical operation on the object, and if not, not carrying out statistical operation.
4. The intelligent monitoring system for constructors based on computer vision and deep learning according to claim 3, wherein the judging and counting of targets entering and exiting according to whether the target box hits a line and the color of the hit area comprises the following steps:
the method comprises the steps that all pictures captured by a current camera are defined as detection areas, a blue and yellow strip-shaped area is set as a judging area, and the pictures are recorded as entering when a target goes up to hit a yellow line, and recorded as exiting when the target goes down to hit a blue line;
acquiring real-time monitoring pictures acquired by monitoring cameras at all entrances and exits of a construction site, and performing size reduction treatment on the acquired real-time monitoring pictures;
judging whether a target appears in the detection area of the reduced real-time monitoring picture, if not, regarding the monitoring picture as an invalid picture and performing neglect cleaning, if so, framing the target and outputting;
detecting whether the target box hits a line and the color of the area where the target box hits the line, and judging and counting the target's entry and exit according to that color.
5. The intelligent monitoring system for constructors based on computer vision and deep learning according to claim 1, wherein the intelligent monitoring module (2) comprises a safety helmet identification module (21), a work clothes identification module (22), a state identification module (23) and an alarm reminding module (24);
the helmet identification module (21) is used for identifying constructors without wearing the helmet, which pass through the site entrance, by utilizing an image identification technology;
the work clothes identification module (22) is used for identifying constructors who do not wear work clothes and pass through a worksite entrance by utilizing an image identification technology;
the state identification module (23) is used for identifying the mental state of constructors passing through the worksite entrance by utilizing an image identification technology;
the alarm reminding module (24) is used for reminding constructors who do not wear safety helmets, do not wear work clothes and do not accord with the spirit state standard.
6. The intelligent monitoring system for constructors based on computer vision and deep learning according to claim 5, wherein the state recognition module (23) comprises an constructor face image acquisition module (231), a fusion feature recognition module (232) and a state recognition result output module (233);
the constructor face image acquisition module (231) is used for acquiring face images of constructors in real-time monitoring pictures of all entrances and exits of a construction site;
the fusion characteristic recognition module (232) is used for recognizing the facial image of the constructor by utilizing a facial state recognition algorithm based on independent characteristic fusion, so as to recognize the mental state of the constructor entering the construction site;
the state identification result output module (233) is used for outputting the mental state information of constructors in real-time monitoring pictures of all entrances and exits of the construction site.
7. The intelligent constructor monitoring system based on computer vision and deep learning of claim 6, wherein the fusion feature recognition module (232) comprises a global state feature extraction module (2321), a local state feature extraction module (2322), a state feature fusion module (2323) and a state feature analysis recognition module (2324);
the global state feature extraction module (2321) is used for extracting global state features of the facial image through discrete cosine transformation, and removing correlation of the global state features by utilizing an independent component analysis technology to obtain independent global state features;
the local state feature extraction module (2322) is used for extracting features of an eye region and a mouth region in an image sequence, and performing Gabort wavelet transformation and feature fusion on the eye region and the mouth region respectively to obtain dynamic multi-scale features of two local regions as local state features of a facial image;
the state feature fusion module (2323) is used for fusing the independent global state feature and the local state feature, and adding local detail information into the global feature to obtain a face state fusion feature;
the state feature analysis and recognition module (2324) is used for analyzing and recognizing the obtained facial state fusion features through a preset classifier to obtain mental state information of constructors, wherein the mental states of the constructors comprise an awake state, a mild fatigue state, a moderate fatigue state and a severe fatigue state.
8. The intelligent monitoring system for constructors based on computer vision and deep learning according to claim 7, wherein the preset classifier is obtained by selecting part of the features, removing redundant features and training through an AdaBoost algorithm, and the calculation formula of the preset classifier is as follows:

H(X) = sign( Σ_{t=1}^{T} a_t·h_t(X) )

wherein T represents the final number of algorithm loops, a_t represents the weight of the weak classifier h_t(X), determined by AdaBoost algorithm learning, and X = (X_1, X_2, …, X_T) represents the dynamic Gabor features of the selected facial image sequence.
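The weighted-vote form H(X) = sign(Σ_t a_t·h_t(X)) described in claim 8 can be illustrated with a minimal AdaBoost sketch. The one-dimensional threshold stumps and the toy data below are assumptions for demonstration only; the patent's weak classifiers operate on Gabor features of face image sequences.

```python
import numpy as np

def train_adaboost(x, y, rounds):
    n = len(x)
    w = np.full(n, 1.0 / n)            # sample weight distribution D_t
    stumps = []                        # (threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for thr in x:                  # exhaustive search over stump candidates
            for pol in (1, -1):
                pred = pol * np.sign(x - thr + 1e-12)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, pol)
        err, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weak-classifier weight a_t
        pred = pol * np.sign(x - thr + 1e-12)
        w *= np.exp(-alpha * y * pred)          # re-weight: emphasise mistakes
        w /= w.sum()
        stumps.append((thr, pol, alpha))
    return stumps

def strong_classify(stumps, x):
    # H(x) = sign( sum_t a_t * h_t(x) )
    total = sum(alpha * pol * np.sign(x - thr + 1e-12)
                for thr, pol, alpha in stumps)
    return np.sign(total)

x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([-1, -1, -1, 1, 1, 1])
model = train_adaboost(x, y, rounds=3)
print(strong_classify(model, x))  # separates the toy set correctly
```

In a multi-class setting (awake / mild / moderate / severe fatigue, as in claim 7), this binary vote would typically be extended via one-vs-rest classifiers or a multi-class variant such as SAMME.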
CN202211602196.2A 2022-12-13 2022-12-13 Constructor intelligent monitoring system based on computer vision and deep learning Active CN115841651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211602196.2A CN115841651B (en) 2022-12-13 2022-12-13 Constructor intelligent monitoring system based on computer vision and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211602196.2A CN115841651B (en) 2022-12-13 2022-12-13 Constructor intelligent monitoring system based on computer vision and deep learning

Publications (2)

Publication Number Publication Date
CN115841651A CN115841651A (en) 2023-03-24
CN115841651B true CN115841651B (en) 2023-08-22

Family

ID=85578558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211602196.2A Active CN115841651B (en) 2022-12-13 2022-12-13 Constructor intelligent monitoring system based on computer vision and deep learning

Country Status (1)

Country Link
CN (1) CN115841651B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116167545A (en) * 2023-04-24 2023-05-26 青建集团股份公司 BIM intelligent building site management platform system and method
CN117519948B (en) * 2023-12-11 2024-04-26 广东筠诚建筑科技有限公司 Method and system for realizing computing resource adjustment under building construction based on cloud platform

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894378A (en) * 2010-06-13 2010-11-24 南京航空航天大学 Method and system for visual tracking of moving target based on dual regions of interest
CN202013603U (en) * 2011-03-03 2011-10-19 苏州市慧视通讯科技有限公司 Statistical device for passenger flow information
CN104618685A (en) * 2014-12-29 2015-05-13 国家电网公司 Intelligent image analysis method for power supply business hall video monitoring
CN107545224A (en) * 2016-06-29 2018-01-05 珠海优特电力科技股份有限公司 The method and device of transformer station personnel Activity recognition
CN108052882A (en) * 2017-11-30 2018-05-18 广东云储物联视界科技有限公司 A kind of operating method of intelligent safety defense monitoring system
CN109447168A (en) * 2018-11-05 2019-03-08 江苏德劭信息科技有限公司 A kind of safety cap wearing detection method detected based on depth characteristic and video object
WO2020034902A1 (en) * 2018-08-11 2020-02-20 昆山美卓智能科技有限公司 Smart desk having status monitoring function, monitoring system server, and monitoring method
CN111950399A (en) * 2020-07-28 2020-11-17 福建省漳州纵文信息科技有限公司 Intelligent integrated equipment for face recognition and monitoring of computer room
WO2022022368A1 (en) * 2020-07-28 2022-02-03 宁波环视信息科技有限公司 Deep-learning-based apparatus and method for monitoring behavioral norms in jail

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2019100806A4 (en) * 2019-07-24 2019-08-29 Dynamic Crowd Measurement Pty Ltd Real-Time Crowd Measurement And Management Systems And Methods Thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Helmet Detection And Number Plate Recognition Using Deep Learning; Pushkar Sathe et al.; 2022 IEEE Region 10 Symposium; pp. 1-6 *

Also Published As

Publication number Publication date
CN115841651A (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN109819208B (en) Intensive population security monitoring management method based on artificial intelligence dynamic monitoring
CN109670441B (en) Method, system, terminal and computer readable storage medium for realizing wearing recognition of safety helmet
CN102201146B (en) Fire smoke recognition method in zero-illumination environment based on active infrared video
CN104951773B (en) A kind of real-time face recognition monitoring system
CN115841651B (en) Constructor intelligent monitoring system based on computer vision and deep learning
JP6549797B2 (en) Method and system for identifying head of passerby
CN105070053B (en) A kind of intelligent traffic monitoring video camera for recognizing rule-breaking vehicle motor pattern
CN104091176B (en) Portrait comparison application technology in video
CN106778609A (en) A kind of electric power construction field personnel uniform wears recognition methods
CN105913037A (en) Face identification and radio frequency identification based monitoring and tracking system
CN108052859A (en) A kind of anomaly detection method, system and device based on cluster Optical-flow Feature
CN108009473A (en) Based on goal behavior attribute video structural processing method, system and storage device
CN103077423B (en) To run condition detection method based on crowd&#39;s quantity survey of video flowing, local crowd massing situation and crowd
CN109298785A (en) A man-machine joint control system and method for monitoring equipment
CN106156688A (en) A kind of dynamic human face recognition methods and system
CN110569772A (en) A method for detecting the state of people in a swimming pool
KR20200071799A (en) object recognition and counting method using deep learning artificial intelligence technology
CN112396658A (en) Indoor personnel positioning method and positioning system based on video
CN113158752A (en) Intelligent safety management and control system for electric power staff approach operation
CN101464946A (en) Detection method based on head identification and tracking characteristics
CN108376246A (en) A kind of identification of plurality of human faces and tracking system and method
CN101751744A (en) Detection and early warning method of smoke
CN114005167B (en) Long-distance sight line estimation method and device based on human skeleton key points
CN109190475A (en) A kind of recognition of face network and pedestrian identify network cooperating training method again
CN101980245A (en) A Passenger Flow Statistics Method Based on Adaptive Template Matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant