CN114387587A - Fatigue driving monitoring method - Google Patents
- Publication number
- CN114387587A (application CN202210040471.XA)
- Authority
- CN
- China
- Prior art keywords
- driver
- image
- multiplied
- fatigue driving
- calculating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention provides a fatigue driving monitoring method in the technical field of fatigue driving monitoring. The method comprises: constructing a prediction model for open/closed-eye detection and calculating the blink frequency of the driver; estimating the head posture of the driver by calculating the Euler angles of the head; and constructing a prediction model that extracts features from the facial image and the binocular image to estimate the gaze area of the driver. The fatigue state of the driver is monitored by tracking the blink frequency, the head posture angles and the gaze area in real time, which improves the accuracy of fatigue driving detection. The method is non-contact: it monitors fatigue driving automatically using only head and face images of the driver collected by a single camera installed in the vehicle, is inexpensive to implement, and does not interfere with normal driving, which increases its market application value.
Description
Technical Field
The invention belongs to the technical field of fatigue driving monitoring, and particularly relates to a fatigue driving monitoring method.
Background
In today's fast-paced society, traffic accidents occur frequently, and fatigue driving is a major cause of them, so monitoring whether a driver is driving while fatigued is of great significance. At present, fatigue driving monitoring methods at home and abroad generally fall into three categories. The first monitors the driving characteristics of the vehicle (such as steering wheel movement patterns and accelerator pedal usage) to infer whether the driver is fatigued. This approach works only in specific environments and fails in complex road conditions such as mountain roads and muddy roads. The second focuses on electrical signals of the human body, such as the electrooculogram (EOG), electrocardiogram (ECG) and electroencephalogram (EEG); it requires the driver to wear a large number of sensors, which can interfere with normal driving and increase driving risk, so it is severely limited. The third is based on computer vision; with the rapid development of computer vision technology, vision-based fatigue driving monitoring has become increasingly popular.
Chinese patent CN 203562073U, "Device for reminding driver of safe driving", provides an annular steering wheel sleeve that vibrates at fixed intervals to remind the driver to concentrate. In essence, it reminds the driver at fixed times rather than monitoring the driver's state and issuing a reminder in real time when the driver is fatigued. Chinese patent CN 108010272A, "Fatigue driving reminding device", designs a device that reminds the driver to rest after a fixed period of driving. Chinese patent CN 110299014A, "Safety driving prompting device", designs a device that monitors driver fatigue by analyzing state data such as steering wheel movement. Because it estimates the driver's fatigue state from steering wheel data and the like, this method cannot be applied to road sections with complex driving environments such as mountain roads and muddy roads.
In production, many automotive companies work with third-party companies to design and develop products that monitor driver status. For example, an in-vehicle alarm system tracks the driver's steering pattern and generates a warning signal when an abnormal deviation is detected. Alternatively, an in-vehicle rest assistance system judges the driver's fatigue level from steering wheel movement patterns and pedal usage and, once fatigue driving is detected, warns the driver through steering wheel vibration, sound signals and the like. A popular fatigue driving monitoring product on the market judges the driver's fatigue state by monitoring the pupils and eyelids through wireless glasses, and is currently used in commercial vehicles. However, it is unfriendly to nearsighted drivers, is expensive, and is not suitable for large-scale deployment.
Disclosure of Invention
In view of the above problems, it is an object of the present invention to provide a fatigue driving monitoring method, which can monitor the driving state of a driver in real time by capturing a facial image of the driver, and prevent the driver from fatigue driving without interfering with the normal driving behavior of the driver.
In order to achieve the above object, the present invention provides a fatigue driving monitoring method, comprising:
constructing a detection model for open/closed-eye detection, and calculating the blink frequency of the driver;
estimating the head posture of the driver by calculating the Euler angles of the driver's head posture;
constructing a detection model for the driver's gaze area, and detecting the gaze area of the driver from an eye image and a face image of the driver;
and judging whether the driver is driving while fatigued according to the blink frequency, the head posture or the gaze area of the driver.
The constructing of a detection model for open/closed-eye detection and the calculating of the blink frequency of the driver comprise:
step 1.1: collecting facial image data of drivers to construct an open/closed-eye detection data set;
step 1.2: constructing a LeNet neural network as the prediction model for open/closed-eye detection, and training it on the open/closed-eye detection data set;
step 1.3: for the facial images of the driver to be monitored, predicting open or closed eyes with the trained LeNet neural network, and recording the total duration T spanned by 20 successive frames predicted as closed-eye.
The estimating of the head posture of the driver by calculating the Euler angles of the driver's head posture comprises:
step 2.1: locating the 68 facial key points in each image with a cascaded regression tree algorithm to obtain the two-dimensional coordinates of the 68 key points;
step 2.2: calculating the rotation matrix rot_vector of the head with a Perspective-n-Point (PnP) pose solving algorithm from the generic three-dimensional coordinates of the head key points and the two-dimensional coordinates of the 68 facial key points obtained in step 2.1;
step 2.3: converting the rotation matrix into a pitch angle (pitch), a yaw angle (yaw) and a roll angle (roll) in the spatial coordinate system to represent the head posture of the driver.
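As an illustration of step 2.3, the following minimal sketch converts a 3 × 3 rotation matrix into pitch, yaw and roll. The text does not state which axis convention is used, so the sketch assumes R = Rz(roll) · Ry(yaw) · Rx(pitch), with pitch about the X axis, yaw about the Y axis and roll about the Z axis; the function names `euler_to_matrix` and `matrix_to_euler` are hypothetical and not taken from the patent.

```python
import math

def euler_to_matrix(pitch, yaw, roll):
    """Build R = Rz(roll) * Ry(yaw) * Rx(pitch); angles in radians (assumed convention)."""
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def matrix_to_euler(R):
    """Recover (pitch, yaw, roll) in degrees from a 3x3 rotation matrix.

    Valid away from gimbal lock (|yaw| < 90 degrees), which is the
    usual range for a driver facing a dashboard camera.
    """
    sin_yaw = max(-1.0, min(1.0, -R[2][0]))  # clamp against rounding error
    yaw = math.asin(sin_yaw)
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return tuple(math.degrees(a) for a in (pitch, yaw, roll))

# Round trip: angles in, matrix, angles out.
R = euler_to_matrix(math.radians(10), math.radians(-25), math.radians(5))
pitch, yaw, roll = matrix_to_euler(R)
```

If the pose solver returns a rotation vector rather than a matrix, it would first have to be converted to a matrix (for example with a Rodrigues transform) before this step.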
The constructing of a prediction model for extracting features of the facial image and the binocular image and the estimating of the gaze area of the driver comprise:
step 3.1: using the public data set DDGC-DB1 as the training set;
step 3.2: resizing every sample in the training set to a uniform 224 × 224;
step 3.3: cropping the face image from the resized image, and denoting its width as L_1 and its height as H_1;
step 3.4: cropping the binocular image centered on the geometric center of the eye region, and denoting the distance between the two eye corners as L_2 and the height as H_2;
step 3.5: extracting features from the cropped face image and the cropped binocular image with the convolutional layers of a VGG16 neural network to obtain three 1 × 4096 feature vectors, denoted ξ, ψ and γ;
step 3.6: calculating the weighted sum vector Γ of the vectors obtained in step 3.5;
calculating the Euclidean distance o_1 between the vectors ξ and ψ:
$$o_1 = \sqrt{\sum_{k=0}^{4095} (\xi_k - \psi_k)^2}$$
calculating the Euclidean distance o_2 between the vectors ξ and γ:
$$o_2 = \sqrt{\sum_{k=0}^{4095} (\xi_k - \gamma_k)^2}$$
$$\Gamma = \xi + o_1 \psi + o_2 \gamma$$
where ξ_k, ψ_k and γ_k denote the k-th element of the vectors ξ, ψ and γ respectively, k = 0, 1, 2, …, 4095;
ψ = [ψ_0, ψ_1, …, ψ_4095], γ = [γ_0, γ_1, …, γ_4095], ξ = [ξ_0, ξ_1, …, ξ_4095];
step 3.7: passing the sum vector Γ through two fully connected layers and normalizing the result with softmax to obtain an output vector R, wherein each element of R is the predicted probability of one category, and the position of the maximum element is the predicted gaze area.
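The fusion in step 3.6 and the softmax normalization in step 3.7 can be sketched as follows. This is only an illustration on short toy vectors rather than 4096-element VGG16 features; the two fully connected layers are omitted, and the function names `euclidean`, `fuse` and `softmax` are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def fuse(xi, psi, gamma):
    """Weighted sum: xi + o1 * psi + o2 * gamma, with distance-based weights."""
    o1 = euclidean(xi, psi)    # distance between xi and psi
    o2 = euclidean(xi, gamma)  # distance between xi and gamma
    return [x + o1 * p + o2 * g for x, p, g in zip(xi, psi, gamma)]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: o1 = 5 (a 3-4-5 triangle), o2 = 0 (identical vectors).
fused = fuse([3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [3.0, 0.0, 0.0])
probs = softmax(fused)
predicted_region = probs.index(max(probs))  # position of the largest probability
```

Because the weights o_1 and o_2 grow with the distance between feature vectors, the fused vector emphasizes the features that differ most from ξ.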
The judging of whether the driver is driving while fatigued according to the blink frequency, the head posture or the gaze area of the driver is specifically as follows:
for the facial images of the driver to be monitored, predicting open or closed eyes with the trained LeNet neural network and recording the total duration T spanned by 20 successive frames predicted as closed-eye; if T is less than or equal to one minute or greater than or equal to two minutes, the driver is determined to be driving while fatigued;
when the maximum changes of pitch, yaw and roll within a period of time are all less than or equal to a set threshold, the driver is determined to be driving while fatigued;
for the facial images of the driver to be monitored, detecting the gaze area from the cropped face image and binocular image; when the predicted gaze area remains the same throughout a certain period of time, the driver is determined to be driving while fatigued.
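The three criteria above can be combined as in the following minimal sketch, which treats them as alternatives (any one suffices), matching the "or" in the method statement. The function name `fatigue_flags`, its parameters and the default angle threshold of 20 degrees are hypothetical; the text only speaks of "a set threshold" and "a certain period of time".

```python
def fatigue_flags(T_seconds, angle_changes, gaze_regions, angle_threshold=20.0):
    """Evaluate the three fatigue criteria for one observation period.

    T_seconds       -- duration spanned by 20 successive closed-eye frames
    angle_changes   -- maximum change of (pitch, yaw, roll) in the period, degrees
    gaze_regions    -- predicted gaze-area labels collected over the period
    angle_threshold -- assumed value for the "set threshold"
    """
    blink = T_seconds <= 60 or T_seconds >= 120
    pose = all(change <= angle_threshold for change in angle_changes)
    gaze = len(set(gaze_regions)) == 1  # every prediction is the same area
    return blink, pose, gaze

def is_fatigued(T_seconds, angle_changes, gaze_regions, angle_threshold=20.0):
    """The driver is flagged if any one of the three criteria is met."""
    return any(fatigue_flags(T_seconds, angle_changes, gaze_regions, angle_threshold))

# Normal blinking, a 30-degree head turn and a moving gaze: not flagged.
alert = is_fatigued(90, (5.0, 30.0, 4.0), [1, 2, 1, 3])
```

Returning the individual flags as well as the combined decision makes it possible to report which criterion triggered the warning.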
The step 1.1 comprises the following steps:
step 1.1.1: respectively acquiring facial image data of different angles of the head of N drivers;
step 1.1.2: positioning 68 key points of the human face in each image;
step 1.1.3: after the key points of the face are obtained, the direction and the size of the face are corrected;
step 1.1.4: intercepting the corrected eye image as a sample in a training set;
step 1.1.5: each sample was labeled as open or closed.
Said step 1.1.3 comprises:
step S1-1: with the upper left corner of the image as the origin, the horizontal direction as the horizontal axis and the vertical direction as the vertical axis, calculating the rotation center point A(a_x, a_y) of the image from the coordinates of the left-corner key point P_37 of the left eye and the right-corner key point P_46 of the right eye,
where P_37.x and P_37.y denote the abscissa and ordinate of the facial key point numbered 37, and P_46.x and P_46.y denote the abscissa and ordinate of the facial key point numbered 46;
step S1-2: when the image deflects in the horizontal direction, correcting the image in the horizontal direction according to the rotation angle alpha;
step S1-3: and the corrected images are zoomed to ensure that the eye sizes of all the images are consistent.
The step S1-2 includes:
step SS1-1: with the upper left corner of the image as the origin, the horizontal direction as the X axis and the vertical direction as the Y axis, calculating the height difference h between point P_37 and point P_46 in the Y-axis direction; when the head is not tilted in the horizontal direction, this height difference is 0;
h = P_37.y - P_46.y
when h < 0, the head is tilted to the left in the horizontal direction, and vice versa;
step SS1-2: calculating the distance r between point P_37 and point P_46:
$$r = \sqrt{(P_{37}.x - P_{46}.x)^2 + (P_{37}.y - P_{46}.y)^2}$$
step SS1-3: calculating the rotation angle α from h and r;
step SS1-4: rotating the picture by α degrees according to the rotation angle to correct the picture in the horizontal direction.
The step S1-3 includes:
step SS2-1: calculating the distance d in the X-axis direction between the right-corner key point P_40 of the left eye and the left-corner key point P_43 of the right eye, i.e. d = P_43.x - P_40.x;
step SS2-2: calculating the scaling factor scale = D/d, where D is an arbitrary constant;
step SS 2-3: and zooming the picture according to the zooming scale.
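The geometric quantities used for the correction can be sketched as below. The formulas for the rotation center and the rotation angle are not legible in this text, so the sketch assumes the midpoint of P_37 and P_46 for the center and α = arcsin(h / r) for the angle; both are assumptions, as is the function name `alignment_params`. The default D = 90 follows the 90-pixel eye-corner distance mentioned in the detailed description.

```python
import math

def alignment_params(p37, p46, p40, p43, D=90.0):
    """Return (rotation center, rotation angle in degrees, scale factor).

    Each point is an (x, y) pair in image coordinates, origin at top left.
    """
    # Assumed: rotation center is the midpoint of the two outer eye corners.
    center = ((p37[0] + p46[0]) / 2, (p37[1] + p46[1]) / 2)
    h = p37[1] - p46[1]                            # height difference along Y
    r = math.hypot(p46[0] - p37[0], p46[1] - p37[1])  # distance P37 to P46
    # Assumed: the tilt angle follows from the right triangle with sides h, r.
    alpha = math.degrees(math.asin(h / r))
    # In the described flow d is measured on the rotated image; here it is
    # taken directly from the supplied points for simplicity.
    d = p43[0] - p40[0]
    scale = D / d
    return center, alpha, scale

# A level face whose inner eye corners are 45 px apart is scaled by 2.
center, alpha, scale = alignment_params((100, 100), (200, 100), (130, 100), (175, 100))
```

The rotation and scaling themselves would then be applied with an image library, using `center`, `alpha` and `scale` as inputs to an affine transform.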
The step 1.2 comprises the following steps:
step 1.2.1: resizing each sample image to 3 × 32 × 32 as the input of the LeNet neural network;
step 1.2.2: convolving the input image with 6 convolution kernels of size 5 × 5, stride 1 and no edge padding, to obtain a feature map of size 6 × 28 × 28;
step 1.2.3: applying max pooling to the resulting feature map with a pooling window of size 2 × 2, stride 2 and no edge padding, which converts the feature map to 6 × 14 × 14;
step 1.2.4: convolving the feature map obtained in step 1.2.3 with 16 convolution kernels of size 5 × 5, stride 1 and no edge padding, which converts the feature map to 16 × 10 × 10;
step 1.2.5: applying max pooling to the feature map obtained in step 1.2.4 with a pooling window of size 2 × 2, stride 2 and no edge padding, which converts the feature map to 16 × 5 × 5;
step 1.2.6: convolving the feature map obtained in step 1.2.5 with 120 convolution kernels of size 5 × 5, stride 1 and no edge padding, which converts the feature map to 120 × 1 × 1;
step 1.2.7: feeding the feature vector obtained in step 1.2.6 into a fully connected layer F1 with 120 neurons to obtain a feature vector of size 1 × 120;
step 1.2.8: feeding the feature vector obtained in step 1.2.7 into a fully connected layer F2 with 2 neurons to obtain a feature vector of size 1 × 2;
step 1.2.9: normalizing the vector obtained in step 1.2.8 with softmax to obtain the probabilities that the input eye picture shows an open or a closed eye.
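The feature map sizes in steps 1.2.2 to 1.2.6 follow from the standard output-size relation out = (n + 2p - k) / s + 1 for an input of width n, kernel size k, stride s and padding p. The short sketch below reproduces them for a 3 × 32 × 32 input; the helper names `out_size` and `lenet_shapes` are hypothetical.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

def lenet_shapes(channels=3, size=32):
    """Trace (channels, height, width) through the five conv/pool layers."""
    layers = [
        ("conv", 6, 5, 1),      # step 1.2.2
        ("pool", None, 2, 2),   # step 1.2.3
        ("conv", 16, 5, 1),     # step 1.2.4
        ("pool", None, 2, 2),   # step 1.2.5
        ("conv", 120, 5, 1),    # step 1.2.6
    ]
    shapes = [(channels, size, size)]
    for kind, out_channels, kernel, stride in layers:
        c, h, w = shapes[-1]
        c = out_channels if kind == "conv" else c  # pooling keeps the channel count
        shapes.append((c, out_size(h, kernel, stride), out_size(w, kernel, stride)))
    return shapes

shapes = lenet_shapes()
```

The final 120 × 1 × 1 map flattens to the 120 values that feed the fully connected layer F1.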
The invention has the beneficial effects that:
The invention provides a fatigue driving monitoring method that monitors the fatigue state of the driver by tracking the blink frequency, the head posture angles and the gaze area in real time, which improves the accuracy of fatigue driving detection. The method is non-contact: it monitors fatigue driving automatically using only head and face images of the driver collected by a single camera installed in the vehicle, is inexpensive to implement, and does not interfere with normal driving, which increases its market application value.
Drawings
FIG. 1 is a flow chart of a method for monitoring fatigue driving according to the present invention;
FIG. 2 is a diagram of location information of 68 key points of a face according to the present invention;
FIG. 3 is a schematic view of a gaze area prediction network according to the present invention;
FIG. 4 is a schematic view of the division of the gaze area in the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples. After the camera captures a face image, the 68 facial key points are located with the landmark model of dlib. The face image is then geometrically corrected (size and orientation adjusted) so that the face is always horizontal and the distance between the right corner of the left eye and the left corner of the right eye is always 90 pixels. The eye image is then cropped, open/closed-eye detection is performed, and the total duration T spanned by 20 successive frames detected as closed-eye is recorded. Next, the 3D head angles are calculated from the 68 facial key points with the solvePnP algorithm in OpenCV. Finally, the gaze area is detected from the face image, so that dangerous driving behavior can be monitored at the same time as fatigue driving. By combining three monitoring modes (blink frequency, head posture change and gaze area), the method captures fatigue driving behavior accurately and sensitively; in addition, gaze area monitoring can capture distracted driving behavior (operating the radio, a mobile phone and the like), which helps ensure driving safety.
As shown in fig. 1, the method for monitoring fatigue driving provided by the present invention comprehensively determines whether a driver is in fatigue driving by estimating a blinking frequency, a head pose, and a gaze region of the driver, and includes:
constructing a detection model for detecting opening/closing of eyes, and calculating the blinking frequency of a driver; the method comprises the following steps:
step 1.1: acquiring 20 facial image data of a driver to prepare a training sample set as an open/closed eye detection data set, wherein a camera for acquiring facial images of the driver can be arranged in the center of an instrument panel or at a position where normal driving of the driver is not affected, such as an interior rearview mirror; the method comprises the following steps:
step 1.1.1: respectively collecting facial image data of different angles of the face of N drivers;
step 1.1.2: locating the 68 facial key points in each image; in a specific implementation a cascaded regression tree algorithm (the landmark model of dlib) is used, although other methods such as deep learning may also be used; the positions of the 68 key points are shown in FIG. 2.
Step 1.1.3: after the key points of the face are obtained, the direction and the size of the face are corrected; the method comprises the following steps:
step S1-1: with the upper left corner of the image as the origin, the horizontal direction as the horizontal axis (X axis) and the vertical direction as the vertical axis (Y axis), calculating the rotation center point A(a_x, a_y) of the image from the coordinates of the left-corner key point P_37 of the left eye (point 37 in FIG. 2) and the right-corner key point P_46 of the right eye (point 46 in FIG. 2),
where P_37.x and P_37.y denote the abscissa and ordinate of the facial key point numbered 37, and P_46.x and P_46.y denote the abscissa and ordinate of the facial key point numbered 46;
step S1-2: when the image deflects in the horizontal direction, correcting the image in the horizontal direction according to the rotation angle alpha; the method comprises the following steps:
step SS1-1: with the upper left corner of the image as the origin, the horizontal direction as the X axis and the vertical direction as the Y axis, calculating the height difference h between point P_37 and point P_46 in the Y-axis direction, which is 0 when the head is not tilted in the horizontal direction;
h = P_37.y - P_46.y
when h < 0, the head is tilted to the left in the horizontal direction, and vice versa;
step SS1-2: calculating the distance r between point P_37 and point P_46:
$$r = \sqrt{(P_{37}.x - P_{46}.x)^2 + (P_{37}.y - P_{46}.y)^2}$$
step SS1-3: calculating the rotation angle α from h and r;
step SS1-4: rotating the picture by α degrees according to the rotation angle to correct the picture in the horizontal direction; the rotaimage method in the OpenCV toolkit or the like can be used;
step S1-3: the corrected images are zoomed to ensure that the eye sizes of all the images are consistent; the method comprises the following steps:
step SS2-1: calculating the distance d in the X-axis direction between the right-corner key point P_40 of the left eye (point 40 in FIG. 2) and the left-corner key point P_43 of the right eye (point 43 in FIG. 2), i.e. d = P_43.x - P_40.x;
step SS2-2: calculating the scaling factor scale = D/d, where D is an arbitrary constant;
step SS 2-3: zooming the picture according to the zooming scale;
step 1.1.4: intercepting the corrected eye image as a sample in a training set;
step 1.1.5: labeling each sample with open eyes or closed eyes;
step 1.2: constructing a LeNet neural network as a prediction model for detecting open/closed eyes, and training by using a training sample set; the method comprises the following steps:
step 1.2.1: resizing each sample image to 3 × 32 × 32 as the input of the LeNet neural network;
step 1.2.2: convolving the input image with 6 convolution kernels of size 5 × 5, stride 1 and no edge padding, to obtain a feature map of size 6 × 28 × 28;
step 1.2.3: applying max pooling to the resulting feature map with a pooling window of size 2 × 2, stride 2 and no edge padding, which converts the feature map to 6 × 14 × 14;
step 1.2.4: convolving the feature map obtained in step 1.2.3 with 16 convolution kernels of size 5 × 5, stride 1 and no edge padding, which converts the feature map to 16 × 10 × 10;
step 1.2.5: applying max pooling to the feature map obtained in step 1.2.4 with a pooling window of size 2 × 2, stride 2 and no edge padding, which converts the feature map to 16 × 5 × 5;
step 1.2.6: convolving the feature map obtained in step 1.2.5 with 120 convolution kernels of size 5 × 5, stride 1 and no edge padding, which converts the feature map to 120 × 1 × 1;
step 1.2.7: feeding the feature vector obtained in step 1.2.6 into a fully connected layer F1 with 120 neurons to obtain a feature vector of size 1 × 120;
step 1.2.8: feeding the feature vector obtained in step 1.2.7 into a fully connected layer F2 with 2 neurons to obtain a feature vector of size 1 × 2;
step 1.2.9: normalizing the vector obtained in step 1.2.8 with softmax to obtain the probabilities that the input eye picture shows an open or a closed eye;
model training: blink detection is performed on the eye image with the trained LeNet neural network, whose architecture is as follows: the input image has size 3 × 32 × 32, and the following operations are applied to it:
(1) convolving the input image with 6 convolution kernels of size 5 × 5, stride 1 and no padding, to obtain a 6 × 28 × 28 feature map;
(2) applying max pooling to the resulting feature map with a pooling window of size 2 × 2, stride 2 and no padding, which converts the feature map to 6 × 14 × 14;
(3) convolving the feature map obtained in step (2) with 16 convolution kernels of size 5 × 5, stride 1 and no padding, to obtain a 16 × 10 × 10 feature map;
(4) applying max pooling to the feature map obtained in step (3) with a pooling window of size 2 × 2, stride 2 and no padding, which converts the feature map to 16 × 5 × 5;
(5) convolving the feature map obtained in step (4) with 120 convolution kernels of size 5 × 5, stride 1 and no padding, to obtain a 120 × 1 × 1 feature map;
(6) feeding the feature vector obtained in step (5) into a fully connected layer (F1) with 120 neurons;
(7) feeding the feature vector obtained in step (6) into a fully connected layer (F2) with 2 neurons;
(8) normalizing the vector obtained in step (7) with softmax to obtain the final open/closed-eye prediction.
Blink detection: under normal conditions a person blinks 15-20 times per minute. In the early stage of fatigue driving the driver blinks frequently to relieve fatigue, and in a state of excessive fatigue the blink frequency falls below 15 times per minute. The time taken for 20 blinks is therefore measured, and if it is less than one minute or more than two minutes, the driver is considered to be driving while fatigued. A blink is the process of going from open eyes to closed eyes and back to open eyes. Tests show that, under normal conditions, a camera with a frame rate of 30 frames per second captures only one closed-eye frame per blink, so the number of blinks can be simplified to the number of closed-eye frames, and the total time spanned by every 20 closed-eye frames is measured to judge whether the driver is driving while fatigued.
step 1.3: for the facial images of the driver to be monitored, predicting open or closed eyes with the trained LeNet neural network, and recording the total duration T spanned by 20 successive frames detected as closed-eye; whether the driver is driving while fatigued is judged from T: if T is less than or equal to one minute or greater than or equal to two minutes, the driver is determined to be driving while fatigued.
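The counting rule in step 1.3 can be sketched as follows, assuming a stream of per-frame closed-eye predictions at 30 frames per second and, as stated above, one closed-eye frame per blink. How each counting window starts is not specified in the text; the sketch assumes a new window begins on the frame after the previous one ends. The function names are hypothetical.

```python
def blink_window_durations(closed_flags, fps=30.0, frames_per_window=20):
    """Return the duration T (seconds) of each window of 20 closed-eye frames.

    closed_flags -- iterable of booleans, True where the frame is closed-eye
    """
    durations = []
    start = None   # index of the first frame of the current window
    count = 0      # closed-eye frames seen in the current window
    for i, closed in enumerate(closed_flags):
        if start is None:
            start = i
        if closed:
            count += 1
            if count == frames_per_window:
                durations.append((i - start + 1) / fps)
                start, count = None, 0  # assumed: next window starts afresh
    return durations

def is_fatigued_by_blinks(T, low=60.0, high=120.0):
    """Fatigue if 20 blinks take at most one minute or at least two minutes."""
    return T <= low or T >= high

# One blink every 4.5 s: 20 blinks span 90 s, inside the normal range.
normal = ([False] * 134 + [True]) * 20
T = blink_window_durations(normal)[0]
```

A blink that spans several closed-eye frames would be counted more than once by this rule, so a real implementation may need to merge adjacent closed-eye frames first.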
Head pose estimation: during normal driving the driver must keep assessing the driving environment, so the head posture changes; if no large-angle head rotation (more than 20 degrees) in the horizontal direction is detected for a long time (more than 10 minutes), the driver is considered to be fatigued.
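One way to apply this rule to a stream of yaw angles is a sliding time window, sketched below. The sketch interprets "large-angle rotation" as the spread between the largest and smallest yaw seen inside the window, which is an assumption; the class name `YawStillnessMonitor` and its parameters are hypothetical.

```python
from collections import deque

class YawStillnessMonitor:
    """Flags fatigue when yaw varies by no more than a limit over a full window."""

    def __init__(self, window_s=600.0, min_rotation_deg=20.0):
        self.window_s = window_s                  # 10 minutes by default
        self.min_rotation_deg = min_rotation_deg  # 20 degrees by default
        self.samples = deque()                    # (timestamp_s, yaw_deg)
        self.first_t = None                       # time of the first sample

    def update(self, t, yaw_deg):
        """Add one sample; return True if the driver looks fatigued."""
        if self.first_t is None:
            self.first_t = t
        self.samples.append((t, yaw_deg))
        # Discard samples that have fallen out of the time window.
        while t - self.samples[0][0] > self.window_s:
            self.samples.popleft()
        # Do not judge until a full window of data has been observed.
        if t - self.first_t < self.window_s:
            return False
        yaws = [y for _, y in self.samples]
        return max(yaws) - min(yaws) <= self.min_rotation_deg

# Short window for illustration: no head movement for 10 s triggers the flag.
monitor = YawStillnessMonitor(window_s=10.0)
results = [monitor.update(float(t), 0.0) for t in range(11)]
```

The same structure could track pitch and roll by keeping one monitor per angle.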
Estimating the head posture of the driver by calculating the Euler angles of the driver's head posture; the method comprises the following steps:
step 2.1: locating the 68 facial key points in each image with a cascaded regression tree algorithm (the landmark model of dlib) to obtain the two-dimensional coordinates of the 68 key points, as shown in FIG. 2;
step 2.2: calculating the rotation matrix rot_vector of the head with a Perspective-n-Point (PnP) pose solving algorithm from the generic three-dimensional coordinates of the head key points and the two-dimensional coordinates of the 68 facial key points obtained in step 2.1;
The generic 3D coordinates of the 68 key points are as follows:
Key point 1: -73.393523, -29.801432, -47.667532
Key point 2: -72.775014, -10.949766, -45.909403
Key point 3: -70.533638, 7.929818, -44.84258
Key point 4: -66.850058, 26.07428, -43.141114
Key point 5: -59.790187, 42.56439, -38.635298
Key point 6: -48.368973, 56.48108, -30.750622
Key point 7: -34.121101, 67.246992, -18.456453
Key point 9: 0.098749, 77.061286, 0.881698
Key point 10: 17.477031, 74.758448, -5.181201
Key point 11: 32.648966, 66.929021, -19.176563
Key point 12: 46.372358, 56.311389, -30.77057
Key point 13: 57.34348, 42.419126, -37.628629
Key point 14: 64.388482, 25.45588, -40.886309
Key point 15: 68.212038, 6.990805, -42.281449
Key point 16: 70.486405, -11.666193, -44.142567
Key point 17: 71.375822, -30.365191, -47.140426
Key point 19: -51.287588, -58.769795, -7.268147
Key point 20: -37.8048, -61.996155, -0.442051
Key point 21: -24.022754, -61.033399, 6.606501
Key point 23: 12.056636, -57.391033, 12.051204
Key point 24: 25.106256, -61.902186, 7.315098
Key point 25: 38.338588, -62.777713, 1.022953
Key point 26: 51.191007, -59.302347, -5.349435
Key point 27: 60.053851, -50.190255, -11.615746
Key point 28: 0.65394, -42.19379, 13.380835
Key point 29: 0.804809, -30.993721, 21.150853
Key point 30: 0.992204, -19.944596, 29.284036
Key point 31: 1.226783, -8.414541, 36.94806
Key point 32: -14.772472, 2.598255, 20.132003
Key point 33: -7.180239, 4.751589, 23.536684
Key point 34: 0.55592, 6.5629, 25.944448
Key point 35: 8.272499, 4.661005, 23.695741
Key point 36: 15.214351, 2.643046, 20.858157
Key point 37: -46.04729, -37.471411, -7.037989
Key point 38: -37.674688, -42.73051, -3.021217
Key point 39: -27.883856, -42.711517, -1.353629
Key point 40: -19.648268, -36.754742, 0.111088
Key point 41: -28.272965, -35.134493, 0.147273
Key point 43: 19.265868, -37.032306, 0.665746
Key point 44: 27.894191, -43.342445, -0.24766
Key point 45: 37.437529, -43.110822, -1.696435
Key point 46: 45.170805, -38.086515, -4.894163
Key point 47: 38.196454, -35.532024, -0.282961
Key point 48: 28.764989, -35.484289, 1.172675
Key point 49: -28.916267, 28.612716, 2.24031
Key point 50: -17.533194, 22.172187, 15.934335
Key point 51: -6.68459, 19.029051, 22.611355
Key point 52: 0.381001, 20.721118, 23.748437
Key point 53: 8.375443, 19.03546, 22.721995
Key point 54: 18.876618, 22.394109, 15.610679
Key point 55: 28.794412, 28.079924, 3.217393
Key point 56: 19.057574, 36.298248, 14.987997
Key point 57: 8.956375, 39.634575, 22.554245
Key point 58: 0.381549, 40.395647, 23.591626
Key point 59: -7.428895, 39.836405, 22.406106
Key point 61: -24.37749, 28.677771, 4.785684
Key point 62: -6.897633, 25.475976, 20.893742
Key point 63: 0.340663, 26.014269, 22.220479
Key point 64: 8.444722, 25.326198, 21.02552
Key point 65: 24.474473, 28.323008, 5.712776
Key point 66: 8.449166, 30.596216, 20.671489
Key point 67: 0.205322, 31.408738, 21.90367
Key point 68: -7.198266, 30.844876, 20.328022
Step 2.3: convert the rotation matrix into a pitch angle (pitch, rotation about the X axis), a yaw angle (yaw, rotation about the Y axis) and a roll angle (roll, rotation about the Z axis) in a right-handed Cartesian coordinate system to represent the head pose of the driver. Specifically:
convert the rotation matrix rot_vector into a quaternion whose components are denoted w, p, q and k;
where rot_vector[0][0] is the element in row 1, column 1 of the rotation matrix; rot_vector[1][0] is the element in row 2, column 1; and rot_vector[2][0] is the element in row 3, column 1;
calculate the Euler angles about the three axes of the right-handed Cartesian coordinate system:
When the maximum change in each of pitch, yaw and roll over a period of time is less than or equal to the set threshold, the driver is judged to be driving while fatigued.
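The conversion formulas of step 2.3 are not present in this text, so the following is only a sketch of one common formulation, not necessarily the one used in the patent. It assumes that rot_vector is a 3 x 1 axis-angle (Rodrigues) vector, which is the form a PnP solver such as OpenCV's `solvePnP` returns and which matches the three elements rot_vector[0][0], rot_vector[1][0] and rot_vector[2][0] named above. The Euler angle convention and both function names are assumptions.

```python
import math
from typing import Sequence, Tuple


def rotation_vector_to_quaternion(
    rot_vector: Sequence[Sequence[float]],
) -> Tuple[float, float, float, float]:
    """Convert a 3x1 axis-angle vector into a unit quaternion (w, p, q, k)."""
    rx, ry, rz = rot_vector[0][0], rot_vector[1][0], rot_vector[2][0]
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)  # rotation angle in radians
    if theta < 1e-12:
        return 1.0, 0.0, 0.0, 0.0  # no rotation
    s = math.sin(theta / 2.0) / theta
    return math.cos(theta / 2.0), rx * s, ry * s, rz * s


def quaternion_to_euler_degrees(
    w: float, p: float, q: float, k: float
) -> Tuple[float, float, float]:
    """Return (pitch, yaw, roll) in degrees: rotations about X, Y and Z."""
    pitch = math.atan2(2.0 * (w * p + q * k), 1.0 - 2.0 * (p * p + q * q))
    sin_yaw = max(-1.0, min(1.0, 2.0 * (w * q - k * p)))  # clamp rounding error
    yaw = math.asin(sin_yaw)
    roll = math.atan2(2.0 * (w * k + p * q), 1.0 - 2.0 * (q * q + k * k))
    return math.degrees(pitch), math.degrees(yaw), math.degrees(roll)
```

Under these assumptions, a pure 30-degree rotation about the Y axis gives a yaw of 30 degrees with pitch and roll at zero.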
Gaze region estimation: during normal driving, the driver must observe the road environment, so the gaze region changes. Whether the driver is fatigued is judged from the change of the gaze point: if the gaze point does not change within 5 minutes, the driver can be considered fatigued. Driving safety is also affected by distracted driving behavior, such as the driver operating the air conditioner, the radio, a mobile phone or other devices. Besides judging fatigue through gaze point detection, the system can therefore also issue a prompt when the driver is not paying attention to driving. Gaze region estimation includes:
1) Data set: the public data set DDGC-DB1 is used;
2) Model training: gaze region estimation uses a multi-branch strategy, as shown in Fig. 3. First, the face is corrected with the same method used in blink detection; then the face image and the eye images are cropped and resized to 224 × 224, which can be done with the resize method of the OpenCV library. Each eye image is centered on the geometric center of the eye, and its width is the distance between the two corners of that eye. The same image size and cropping method are used when estimating the gaze region, so that the data used for prediction match the data set used to train the model.
Features are extracted from the cropped face image and the cropped images of the two eyes with the convolutional layers of a VGG16 neural network, giving three 1 × 4096 feature vectors, denoted ξ, ψ and γ. The convolutional layer parameters of the VGG16 neural network are shown in Table 1, where Conv denotes a convolutional layer, Relu the activation function, and Pool a pooling layer.
Table 1. Model parameters of the VGG16 neural network
A detection model of the driver's gaze region is constructed, with a VGG16 neural network as the feature extraction model, and the gaze region of the driver is detected from the eye images and the face image of the driver, as follows:
Step 3.1: take the public data set DDGC-DB1 as the training set of the model;
Step 3.2: resize every sample in the training set to a uniform 224 × 224;
Step 3.3: crop the face image from the resized image, and denote its width as L1 and its height as H1;
Step 3.4: crop the images of the two eyes, each centered on the geometric center of the eye region, and denote the distance between the two corners of the eye as L2 and the height as H2;
Step 3.5: extract features from the cropped face image and the cropped images of the two eyes with the convolutional layers of a VGG16 neural network, giving three 1 × 4096 feature vectors, denoted ξ, ψ and γ;
Step 3.6: calculate the weighted sum vector Γ of the vectors obtained in step 3.5;
calculate the Euclidean distance o1 between vectors ξ and ψ:
$o_1 = \sqrt{\sum_{k=0}^{4095} (\xi_k - \psi_k)^2}$
calculate the Euclidean distance o2 between vectors ξ and γ:
$o_2 = \sqrt{\sum_{k=0}^{4095} (\xi_k - \gamma_k)^2}$
$\Gamma = \xi + o_1 \psi + o_2 \gamma$
where ξ_k, ψ_k and γ_k denote the k-th elements of the vectors ξ, ψ and γ respectively, k = 0, 1, 2, ..., 4095;
$\psi = [\psi_0, \psi_1, \ldots, \psi_{4095}]$, $\gamma = [\gamma_0, \gamma_1, \ldots, \gamma_{4095}]$, $\xi = [\xi_0, \xi_1, \ldots, \xi_{4095}]$;
Step 3.7: pass the sum vector Γ through two fully connected layers, then normalize the result with softmax to obtain a vector R. The elements of R are the predicted probabilities of the categories, and the position of the largest element is the predicted gaze region.
The acquired facial image of the driver to be monitored is input into the trained gaze region detection model, which outputs the predicted gaze region. If, during a certain period, the driver's gaze regions are the instrument panel region, the equipment control region (navigation, audio and the like) and the front passenger glove box region, the driver is considered to be driving while distracted. The division of gaze regions is shown in Fig. 4: region (1) is the left rearview mirror of the vehicle; region (2) is the instrument panel; region (3) is the equipment control region; region (4) is the front passenger glove box; region (5) is the right rearview mirror; the front windshield is divided into 9 regions numbered (6) to (14); the part of the left front door window outside the rearview mirror region is numbered (15); the part of the right front door window outside the rearview mirror region is numbered (16); and the steering wheel region is numbered (17).
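Steps 3.6 and 3.7 can be illustrated with a small pure-Python sketch. It covers only the distance-weighted fusion and the softmax and argmax at the end; the two fully connected layers are omitted, so the `logits` passed to `predict_region` stand in for their output. All function names are hypothetical, and the 1-based mapping from element position to region number is an assumption.

```python
import math
from typing import List, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuse_features(
    xi: Sequence[float], psi: Sequence[float], gamma: Sequence[float]
) -> List[float]:
    """Weighted sum xi + o1 * psi + o2 * gamma, with o1, o2 as in step 3.6."""
    o1 = euclidean_distance(xi, psi)
    o2 = euclidean_distance(xi, gamma)
    return [x + o1 * p + o2 * g for x, p, g in zip(xi, psi, gamma)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax; the outputs sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_region(logits: Sequence[float]) -> int:
    """Return the 1-based position of the most probable class (assumed numbering)."""
    probs = softmax(logits)
    return probs.index(max(probs)) + 1
```

With two-element toy vectors xi = [1, 0], psi = [1, 2] and gamma = [4, 4], the distances are o1 = 2 and o2 = 5, so the fused vector is [23, 24]. Real feature vectors would have 4096 elements.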
Whether the driver is driving while fatigued is judged from the blinking frequency, the head pose or the gaze region of the driver. Specifically:
for the head image of the driver to be monitored, the trained LeNet neural network predicts whether the eyes are open or closed, and the total duration T spanned by 20 adjacent frames predicted as closed eyes is measured; if T is less than or equal to one minute, or T is greater than or equal to two minutes, the driver is determined to be driving while fatigued;
when the maximum changes in pitch, yaw and roll over a period of time are all less than or equal to a set threshold, the driver is judged to be driving while fatigued;
for the head image of the driver to be monitored, the trained gaze region detection model detects the gaze region from the cropped face image and images of the two eyes, and when the predictions over a certain period are all the same gaze region, the driver is judged to be driving while fatigued.
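The head pose condition can be sketched as a check over a window of Euler angle samples. This is an illustration under assumptions: "maximum change" is interpreted here as the range (maximum minus minimum) of each angle within the window, the default threshold of 20 degrees is borrowed from the large-angle figure mentioned in the head pose section, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Angles = Tuple[float, float, float]  # (pitch, yaw, roll) in degrees


def max_angle_changes(samples: Sequence[Angles]) -> Angles:
    """Range of pitch, yaw and roll over the window of samples."""
    if not samples:
        raise ValueError("need at least one (pitch, yaw, roll) sample")
    pitches, yaws, rolls = zip(*samples)
    return (
        max(pitches) - min(pitches),
        max(yaws) - min(yaws),
        max(rolls) - min(rolls),
    )


def head_pose_static(samples: Sequence[Angles], threshold_deg: float = 20.0) -> bool:
    """True when every angle changed by no more than the threshold in the window."""
    return all(change <= threshold_deg for change in max_angle_changes(samples))
```

The caller is expected to pass only the samples from the monitoring period of interest, for example the last 10 minutes.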
In this embodiment, the fatigue driving determination conditions are set as follows.
While driving, the driver is considered to be driving while fatigued if any one of the following conditions occurs:
the time taken for the driver to blink 20 times is more than 2 minutes or less than 1 minute;
no large-angle rotation of the driver's head (over 20 degrees) is detected for a long time (10 minutes);
the gaze region of the driver does not change within 5 minutes.
If any one of these three conditions is satisfied, the driver is considered to be driving while fatigued. In addition, if within 3 seconds (150 frames) the driver's gaze falls in region (3), region (4) or region (17) for 75 or more frames, the driver is considered to be driving while distracted.
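The gaze-based conditions of this embodiment can be sketched as frame counts over a list of per-frame region predictions. This is a minimal sketch, assuming one predicted region number per frame; the function names are hypothetical, and the window for the unchanged-gaze check is passed in frames because the text gives it only in minutes.

```python
from typing import Sequence

DISTRACTION_REGIONS = {3, 4, 17}  # regions named in the distraction rule
WINDOW_FRAMES = 150               # 3 seconds of video according to the text
MIN_DISTRACTED_FRAMES = 75        # at least this many frames in those regions


def is_distracted(regions: Sequence[int]) -> bool:
    """True if 75 or more of the latest 150 frames fall in regions 3, 4 or 17."""
    window = regions[-WINDOW_FRAMES:]
    if len(window) < WINDOW_FRAMES:
        return False  # assumption: wait for a full window before deciding
    hits = sum(1 for r in window if r in DISTRACTION_REGIONS)
    return hits >= MIN_DISTRACTED_FRAMES


def gaze_unchanged(regions: Sequence[int], window_frames: int) -> bool:
    """True if the latest window_frames predictions are all the same region."""
    window = regions[-window_frames:]
    return len(window) == window_frames and len(set(window)) == 1
```

Since 150 frames correspond to 3 seconds, the 5-minute condition would be `gaze_unchanged(regions, 15000)` at that frame rate.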
Claims (10)
1. A method of monitoring fatigue driving, comprising:
constructing a detection model for detecting open/closed eyes, and calculating the blinking frequency of a driver;
estimating the head pose of the driver by calculating the Euler angles of the head pose;
constructing a detection model of the driver's gaze region, and detecting the gaze region of the driver from an eye image and a face image of the driver;
and judging whether the driver is driving while fatigued according to the blinking frequency, the head pose or the gaze region of the driver.
2. The fatigue driving monitoring method according to claim 1, wherein the constructing a detection model for detecting the opening/closing of eyes, and calculating the blinking frequency of the driver comprises:
step 1.1: collecting facial image data of a driver to construct an open/closed eye detection data set;
step 1.2: constructing a LeNet neural network as a model for detecting open/closed eyes, and training by using an open/closed eye detection data set;
step 1.3: for the facial image of the driver to be monitored, using the trained LeNet neural network to predict whether the eyes are open or closed, and measuring the total duration T spanned by 20 adjacent frames whose detection result is closed eyes.
3. The fatigue driving monitoring method according to claim 2, wherein estimating the head pose of the driver by calculating the Euler angles of the head pose comprises:
step 2.1: locating the 68 facial key points in each image using a cascaded regression tree algorithm to obtain the two-dimensional coordinates of the 68 key points;
step 2.2: calculating the rotation matrix rot_vector of the head with a Perspective-n-Point (PnP) pose-solving algorithm from the generic three-dimensional coordinates of the head key points and the two-dimensional coordinates of the 68 facial key points obtained in step 2.1;
step 2.3: converting the rotation matrix into a pitch angle (pitch), a yaw angle (yaw) and a roll angle (roll) in a spatial coordinate system to represent the head pose of the driver.
4. The fatigue driving monitoring method according to claim 3, wherein constructing the detection model of the driver's gaze region and detecting the gaze region of the driver from the eye image and the face image of the driver comprises:
step 3.1: taking the public data set DDGC-DB1 as the training set of the model;
step 3.2: resizing every sample in the training set to a uniform 224 × 224;
step 3.3: cropping the face image from the resized image, and denoting its width as L1 and its height as H1;
step 3.4: cropping the images of the two eyes, each centered on the geometric center of the eye region, and denoting the distance between the two corners of the eye as L2 and the height as H2;
step 3.5: extracting features from the cropped face image and the cropped images of the two eyes with the convolutional layers of a VGG16 neural network, giving three 1 × 4096 feature vectors, denoted ξ, ψ and γ;
step 3.6: calculating the weighted sum vector Γ of the vectors obtained in step 3.5;
calculating the Euclidean distance o1 between vectors ξ and ψ:
$o_1 = \sqrt{\sum_{k=0}^{4095} (\xi_k - \psi_k)^2}$
calculating the Euclidean distance o2 between vectors ξ and γ:
$o_2 = \sqrt{\sum_{k=0}^{4095} (\xi_k - \gamma_k)^2}$
$\Gamma = \xi + o_1 \psi + o_2 \gamma$
where ξ_k, ψ_k and γ_k denote the k-th elements of the vectors ξ, ψ and γ respectively, k = 0, 1, 2, ..., 4095;
$\psi = [\psi_0, \psi_1, \ldots, \psi_{4095}]$, $\gamma = [\gamma_0, \gamma_1, \ldots, \gamma_{4095}]$, $\xi = [\xi_0, \xi_1, \ldots, \xi_{4095}]$;
step 3.7: passing the sum vector Γ through two fully connected layers, then normalizing the result with softmax to obtain a vector R, wherein the elements of R are the predicted probabilities of the categories, and the position of the largest element is the predicted gaze region.
5. The fatigue driving monitoring method according to claim 4, wherein judging whether the driver is driving while fatigued according to the blinking frequency, the head pose or the gaze region of the driver specifically comprises:
for the facial image of the driver to be monitored, using the trained LeNet neural network to predict whether the eyes are open or closed, measuring the total duration T spanned by 20 adjacent frames predicted as closed eyes, and, if T is less than or equal to one minute or T is greater than or equal to two minutes, determining that the driver is driving while fatigued;
when the maximum changes in pitch, yaw and roll over a period of time are all less than or equal to a set threshold, judging that the driver is driving while fatigued;
and, for the facial image of the driver to be monitored, detecting the gaze region from the cropped face image and images of the two eyes, and judging that the driver is driving while fatigued when the predictions over a certain period are all the same gaze region.
6. A method as claimed in claim 2, wherein step 1.1 comprises:
step 1.1.1: respectively acquiring facial image data of different angles of the head of N drivers;
step 1.1.2: positioning 68 key points of the human face in each image;
step 1.1.3: after the key points of the face are obtained, the direction and the size of the face are corrected;
step 1.1.4: intercepting the corrected eye image as a sample in a training set;
step 1.1.5: each sample was labeled as open or closed.
7. A method for monitoring fatigue driving according to claim 6, wherein said step 1.1.3 comprises:
step S1-1: taking the upper-left corner of the image as the origin, the horizontal direction as the horizontal axis and the vertical direction as the vertical axis, calculating the rotation center point a(ax, ay) of the image from the left eye-corner key point P37 of the left eye and the right eye-corner key point P46 of the right eye:
where P37.x and P37.y denote the abscissa and ordinate of the facial key point numbered 37, and P46.x and P46.y denote the abscissa and ordinate of the facial key point numbered 46;
step S1-2: when the image is deflected in the horizontal direction, correcting the image in the horizontal direction according to the rotation angle α;
step S1-3: scaling the corrected images so that the eye sizes in all images are consistent.
8. The fatigue driving monitoring method according to claim 7, wherein the step S1-2 includes:
step SS1-1: taking the upper-left corner of the image as the origin, the horizontal direction as the X axis and the vertical direction as the Y axis, calculating the height difference h between point P37 and point P46 in the Y-axis direction, h being 0 when the head is not deflected in the horizontal direction;
$h = P_{37}.y - P_{46}.y$
when h < 0, the head is deflected to the left in the horizontal direction; otherwise it is deflected to the right;
step SS1-2: calculating the distance r between point P37 and point P46:
$r = \sqrt{(P_{37}.x - P_{46}.x)^2 + (P_{37}.y - P_{46}.y)^2}$
step SS1-3: calculating the rotation angle α:
step SS1-4: rotating the picture by α degrees in the horizontal direction to correct the picture in the horizontal direction.
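The quantities used for the horizontal correction can be sketched as below. The formulas for the rotation center and for α are not present in this text, so two assumptions are made and should be read as such: the rotation center is taken as the midpoint of P37 and P46, and α is taken as arcsin(h / r). The function names are hypothetical, and points are (x, y) tuples in image coordinates.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y), origin at the upper-left image corner


def rotation_center(p37: Point, p46: Point) -> Point:
    """Assumed rotation center: midpoint between the two outer eye corners."""
    return ((p37[0] + p46[0]) / 2.0, (p37[1] + p46[1]) / 2.0)


def rotation_angle_degrees(p37: Point, p46: Point) -> float:
    """Assumed rotation angle: alpha = asin(h / r), returned in degrees."""
    h = p37[1] - p46[1]                              # height difference along Y
    r = math.hypot(p37[0] - p46[0], p37[1] - p46[1])  # distance between the points
    if r == 0.0:
        raise ValueError("P37 and P46 coincide; the angle is undefined")
    return math.degrees(math.asin(h / r))
```

When the two eye corners lie at the same height, h is 0 and the angle is 0, which agrees with the statement that h is 0 for an undeflected head.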
9. The fatigue driving monitoring method according to claim 7, wherein the step S1-3 includes:
step SS2-1: calculating the distance d in the X-axis direction between the right eye-corner key point P40 of the left eye and the left eye-corner key point P43 of the right eye, i.e. d = P43.x - P40.x;
step SS2-2: calculating the scaling ratio scale = D/d, where D is an arbitrary constant;
step SS2-3: scaling the picture according to the scaling ratio.
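The scaling step can be sketched in a few lines. This assumes the ratio is D divided by d, with D a chosen target distance between the inner eye corners; the function names and the rounding of the output size to whole pixels are assumptions of this sketch.

```python
from typing import Tuple


def eye_scale(p40_x: float, p43_x: float, target_distance: float) -> float:
    """Scaling ratio D / d, where d = P43.x - P40.x and D is the chosen constant."""
    d = p43_x - p40_x
    if d <= 0.0:
        raise ValueError("P43 must lie to the right of P40 in image coordinates")
    return target_distance / d


def scaled_size(width: int, height: int, scale: float) -> Tuple[int, int]:
    """Image size after scaling, rounded to whole pixels."""
    return (round(width * scale), round(height * scale))
```

For example, with d = 40 pixels and D = 60, the ratio is 1.5, so a 200 × 100 image becomes 300 × 150.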
10. A method as claimed in claim 2, wherein said step 1.2 comprises:
step 1.2.1: resizing each sample image to 3 × 32 × 32 as the input of the LeNet neural network;
step 1.2.2: convolving the input image with 6 convolution kernels of size 5 × 5, stride 1 and no edge padding, to obtain a feature map of size 6 × 28 × 28;
step 1.2.3: applying max pooling to the obtained feature map with a pooling window of size 2 × 2, stride 2 and no edge padding, so that the feature map becomes 6 × 14 × 14;
step 1.2.4: convolving the feature map obtained in step 1.2.3 with 16 convolution kernels of size 5 × 5, stride 1 and no edge padding, so that the feature map becomes 16 × 10 × 10;
step 1.2.5: applying max pooling to the feature map obtained in step 1.2.4 with a pooling window of size 2 × 2, stride 2 and no edge padding, so that the feature map becomes 16 × 5 × 5;
step 1.2.6: convolving the feature map obtained in step 1.2.5 with 120 convolution kernels of size 5 × 5, stride 1 and no edge padding, so that the feature map becomes 120 × 1 × 1;
step 1.2.7: inputting the feature vector obtained in step 1.2.6 into a fully connected layer F1 with 120 neurons to obtain a feature vector of size 1 × 120;
step 1.2.8: inputting the feature vector obtained in step 1.2.7 into a fully connected layer F2 with 2 neurons to obtain a feature vector of size 1 × 2;
step 1.2.9: normalizing the vector obtained in step 1.2.8 with softmax to obtain the probabilities that the input eye picture is open or closed.
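The feature map sizes in steps 1.2.2 to 1.2.6 follow from the usual output-size rule for convolution and pooling without padding, out = (in - kernel) / stride + 1. The short sketch below recomputes them for a 32 × 32 input; the layer names and the helper functions are hypothetical labels for this illustration.

```python
from typing import List, Optional, Tuple


def output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet_feature_shapes(input_size: int = 32) -> List[Tuple[str, int, int, int]]:
    """Return (layer, channels, height, width) after each layer of steps 1.2.2-1.2.6."""
    # (name, output channels or None for pooling, kernel size, stride)
    layers: List[Tuple[str, Optional[int], int, int]] = [
        ("conv1", 6, 5, 1),
        ("pool1", None, 2, 2),
        ("conv2", 16, 5, 1),
        ("pool2", None, 2, 2),
        ("conv3", 120, 5, 1),
    ]
    shapes = []
    channels, size = 3, input_size
    for name, out_channels, kernel, stride in layers:
        size = output_size(size, kernel, stride)
        if out_channels is not None:  # pooling keeps the channel count
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes
```

Calling `lenet_feature_shapes()` reproduces the sequence 6 × 28 × 28, 6 × 14 × 14, 16 × 10 × 10, 16 × 5 × 5 and 120 × 1 × 1 given in the steps above.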
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210040471.XA CN114387587A (en) | 2022-01-14 | 2022-01-14 | Fatigue driving monitoring method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114387587A true CN114387587A (en) | 2022-04-22 |
Family
ID=81201276
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||