CN110348350A - Driver state detection method based on facial expressions - Google Patents

Driver state detection method based on facial expressions
- Publication number: CN110348350A (application CN201910584900.8A)
- Authority: CN (China)
- Prior art keywords: facial expression, driver, convolution, layer, image
- Prior art date: 2019-07-01
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications

- G06F18/2135 — Pattern recognition: feature extraction by transforming the feature space, based on approximation criteria, e.g. principal component analysis
- G06F18/2193 — Pattern recognition: validation and performance evaluation based on specific statistical tests
- G06V20/597 — Context or environment of the image inside a vehicle: recognising the driver's state or behaviour, e.g. attention or drowsiness
- G06V40/161 — Human faces: detection; localisation; normalisation
- G06V40/174 — Facial expression recognition
Abstract
Description
Technical Field
The invention belongs to the technical field of driver state monitoring and, more specifically, relates to a driver state detection method based on facial expressions, i.e., a method that detects the driver's facial expressions in real time and uses them to judge the driver's current driving state.
Background Art
The driver's state plays a vital role in safe driving; detecting it in real time goes a long way toward ensuring that the driver drives safely.
Current approaches to analysing and judging the driver's state fall into two broad categories: contact and non-contact. Contact methods judge the driver's state from physiological signals such as EEG and EMG collected by wearable devices; their main drawbacks are that the measurement itself interferes with safe driving and that the equipment is costly. Non-contact methods fall into three sub-categories. The first judges the driver's state from the vehicle's trajectory, but it is strongly affected by the road environment and has low accuracy. The second judges the driver's state from real-time measurements such as the steering-wheel angle and the force applied to the brake and clutch pedals, but it is strongly affected by the driver's personal driving habits. The third uses computer vision: a camera captures the driver's face, the current facial expression is recognised from the image, and the driver's state is detected in real time. Because this approach offers good real-time performance and high accuracy, computer-vision-based driver state detection is the current mainstream direction.
Facial expressions play an important role in human communication; compared with media such as text and speech, they express emotion more intuitively and accurately. This mode of emotional interaction is already used in scenarios such as virtual reality, digital entertainment, communication and video conferencing, and human-computer interaction. Driver state detection based on facial expressions is therefore more capable and more user-friendly than simple fatigue detection. Facial expression recognition generally comprises three parts: face image preprocessing, facial expression feature learning, and facial expression classification; the driver's state is then detected from the classified expression.
However, existing facial-expression-based driver state detection uses a large number of parameters and runs slowly, which hurts the real-time performance of the detection; its accuracy also leaves room for improvement.
Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art and provide a driver state detection method based on facial expressions that improves both the real-time performance and the accuracy of driver state detection.
To achieve the above objective, the facial-expression-based driver state detection method of the present invention comprises the following steps:
(1) Acquire the driver's facial image
A camera mounted in front of the driver captures a video stream; a Haar-feature + AdaBoost face detection algorithm detects the driver's face region in the video frames, yielding the facial image.
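The text specifies only "Haar features + AdaBoost" and leaves the implementation open. A minimal sketch of this step, assuming OpenCV's bundled pretrained frontal-face Haar cascade (a Viola-Jones-style Haar + AdaBoost detector) as the concrete detector:

```python
import cv2

# Assumption: OpenCV's pretrained frontal-face cascade stands in for the
# patent's unspecified Haar-feature + AdaBoost detector.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)          # camera mounted in front of the driver
ret, frame = cap.read()            # one frame of the video stream
if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        face_img = frame[y:y + h, x:x + w]   # facial image passed to step (2)
cap.release()
```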
(2) Preprocess the acquired facial image
First, convert the facial image to grayscale:
Gray = 0.3R + 0.59G + 0.11B
where Gray is the pixel's grayscale value and R, G, and B are its red, green, and blue values;
Then apply Gamma correction:
I = Gray^γ
where I is the corrected grayscale value and γ = 0.5;
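A minimal sketch of the two formulas above, assuming an 8-bit RGB input that is normalised to [0, 1] before the power law is applied (the text does not state the value range):

```python
import numpy as np

def preprocess(rgb):
    """Grayscale conversion followed by Gamma correction (gamma = 0.5)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = 0.3 * r + 0.59 * g + 0.11 * b    # Gray = 0.3R + 0.59G + 0.11B
    gray = gray / 255.0                     # assumption: 8-bit input, map to [0, 1]
    return gray ** 0.5                      # I = Gray^gamma with gamma = 0.5
```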
Finally, process the grayscaled, Gamma-corrected image with PCA (principal component analysis):
Treat the facial image as a matrix X of n rows and m columns. First zero-mean each row of X, then compute the covariance matrix of X and its eigenvalues and corresponding eigenvectors. Arrange the eigenvectors as rows of a matrix O in descending order of their eigenvalues, and take the first K rows of O to form the matrix P (K rows, n columns). The facial image Y = PX is then the image reduced to K dimensions;
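A sketch of the PCA step exactly as described, with the image rows treated as the variables; NumPy is an implementation choice, not part of the patent:

```python
import numpy as np

def pca_reduce(X, K):
    """Project the n x m facial image X onto its top-K principal directions."""
    Xc = X - X.mean(axis=1, keepdims=True)     # zero-mean each row of X
    C = np.cov(Xc)                             # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # sort descending by eigenvalue
    P = eigvecs[:, order[:K]].T                # matrix P: K rows, n columns
    return P @ X                               # Y = PX, the K-dimensional image
```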
(3) Driver facial expression recognition
3.1) Construct the facial expression recognition convolutional neural network
The facial expression recognition convolutional neural network consists of four sequentially connected stages, followed by an average pooling layer, a dropout layer, and a softmax classifier;
Each stage comprises a first convolutional layer with a 3×3 kernel and stride 2; second and third convolutional layers with 3×3 kernels and stride 1; a pooling layer; and an inception structure. The pooling layers of the first three stages use a 3×3 kernel with stride 2; the pooling layer of the fourth stage uses a 3×3 kernel with stride 1;
The inception structure consists of four feature-map processing branches operating in parallel plus a filter concatenation layer. The first branch applies a 3×3 pooling operation to the input feature map, then a 1×1 convolution, and sends the result to the filter concatenation layer. The second branch applies a 1×1 convolution to the input feature map and sends the result to the filter concatenation layer. The third branch applies a 1×1 convolution to the input feature map, then convolves the result separately with a 3×1 kernel and a 1×3 kernel; both resulting feature maps are sent to the filter concatenation layer. The fourth branch applies a 1×1 convolution to the input feature map, then a 3×3 convolution, and then convolves that result separately with a 3×1 kernel and a 1×3 kernel; both resulting feature maps are sent to the filter concatenation layer. The filter concatenation layer concatenates the feature maps from the four branches to produce the connected feature map;
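The text fixes the kernel sizes but not the channel widths or the pooling type of the first branch. A PyTorch sketch of the four-branch module, with the channel count c and max pooling as assumptions:

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Four parallel branches plus filter concatenation, as described.
    The channel width c and the use of max pooling are assumptions."""
    def __init__(self, c_in, c=32):
        super().__init__()
        # branch 1: 3x3 pooling on the input, then a 1x1 convolution
        self.b1 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(c_in, c, 1))
        # branch 2: a single 1x1 convolution
        self.b2 = nn.Conv2d(c_in, c, 1)
        # branch 3: 1x1 convolution, then parallel 3x1 and 1x3 convolutions
        self.b3 = nn.Conv2d(c_in, c, 1)
        self.b3_31 = nn.Conv2d(c, c, (3, 1), padding=(1, 0))
        self.b3_13 = nn.Conv2d(c, c, (1, 3), padding=(0, 1))
        # branch 4: 1x1 then 3x3, then parallel 3x1 and 1x3 convolutions
        self.b4 = nn.Sequential(nn.Conv2d(c_in, c, 1),
                                nn.Conv2d(c, c, 3, padding=1))
        self.b4_31 = nn.Conv2d(c, c, (3, 1), padding=(1, 0))
        self.b4_13 = nn.Conv2d(c, c, (1, 3), padding=(0, 1))

    def forward(self, x):
        b3, b4 = self.b3(x), self.b4(x)
        # filter concatenation of all six feature maps along the channel axis
        return torch.cat([self.b1(x), self.b2(x),
                          self.b3_31(b3), self.b3_13(b3),
                          self.b4_31(b4), self.b4_13(b4)], dim=1)
```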
The first convolutional layer of the first stage receives the K-dimensional facial image produced by the dimensionality reduction. The image passes through the first, second, and third convolutional layers in turn, then through the pooling layer, and the pooled feature maps are processed by the inception structure to produce the connected feature map. The connected feature map from the first stage is passed to the second stage, which applies the same processing as the first; the second stage's output is passed to the third stage and the third stage's output to the fourth, each applying the same processing. The connected feature map from the fourth stage is fed to the average pooling layer; after average pooling, a proportion of the activations is dropped in the dropout layer, and the result is fed to the softmax classifier, which outputs the facial expression;
3.2) Train the facial expression recognition convolutional neural network
Feed K-dimensional facial images labelled with facial expressions into the network constructed in step 3.1) and train it, obtaining the trained facial expression recognition convolutional neural network;
During training, the activation function is ReLU, the optimisation algorithm is SGD (stochastic gradient descent), the initialisation method is Xavier, and the learning rate is:
lr = base_lr × (1 − iter/max_iter) × 0.5
where base_lr = 0.01 is the initial learning rate, iter is the current iteration number, and max_iter is the maximum number of iterations;
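Read literally, this schedule decays linearly from half the base rate to zero over training; a one-line sketch:

```python
def learning_rate(it, max_iter, base_lr=0.01):
    """lr = base_lr * (1 - it / max_iter) * 0.5, as given in the text."""
    return base_lr * (1.0 - it / max_iter) * 0.5
```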
3.3) After processing by steps (1) and (2), the acquired driver facial image is fed into the trained facial expression recognition convolutional neural network, which outputs the driver's facial expression;
(4) Output the result
Once the driver's facial expression has been recognised, the driver's state is obtained and displayed on a screen in real time, and the driver can be prompted promptly. When the driver shows a facial expression unsuited to driving, such as anger, the system issues an unfit-to-drive alert, promptly giving the driver an effective reminder or applying a series of measures to relieve the driver's current unsuitable driving state.
The objective of the invention is achieved as follows:
In the facial-expression-based driver state detection method of the present invention, grayscale conversion, Gamma correction, and PCA dimensionality reduction shrink the facial image and enhance its features. On this basis, the invention builds a facial expression recognition convolutional neural network consisting of four sequentially connected stages, an average pooling layer, a dropout layer, and a softmax classifier, with a small parameter count. Its inception structure adopts a new design that splits the traditional regular 3×3 convolution into a 1×3 convolution and a 3×1 convolution. This saves a large number of parameters, speeds up computation, and reduces overfitting, while adding an extra layer of nonlinearity that expands the model's expressive capacity, letting it handle more and richer spatial features and increasing feature diversity. This inception design makes the facial expression recognition network more lightweight while detecting better, i.e., it improves the accuracy of driver state detection.
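A quick check of the parameter saving from this factorization, for a hypothetical layer with 64 input and 64 output channels (the patent gives no concrete widths):

```python
import torch.nn as nn

count = lambda m: sum(p.numel() for p in m.parameters())
full = nn.Conv2d(64, 64, 3, padding=1)                     # one 3x3 convolution
factored = nn.Sequential(nn.Conv2d(64, 64, (1, 3), padding=(0, 1)),
                         nn.Conv2d(64, 64, (3, 1), padding=(1, 0)))
print(count(full), count(factored))   # 36928 vs 24704: roughly a third fewer
```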
Brief Description of the Drawings
Fig. 1 is a flow chart of the facial-expression-based driver state detection method of the present invention;
Fig. 2 is a block diagram of one specific embodiment of the facial expression recognition convolutional neural network;
Fig. 3 is a block diagram of the inception structure in the network shown in Fig. 2.
Detailed Description
Specific embodiments of the present invention are described below with reference to the accompanying drawings, so that those skilled in the art can better understand the invention. Note that in the following description, detailed accounts of known functions and designs are omitted where they would dilute the main content of the invention.
Embodiment
In the present invention, a Haar-feature + AdaBoost face detection algorithm first detects the driver's face region, i.e., the facial image; the detected facial image is then preprocessed and fed into the constructed facial expression recognition convolutional neural network, which detects the driver's current facial expression in real time and yields the driver's driving state.
Fig. 1 is a flow chart of the facial-expression-based driver state detection method of the present invention.
In this embodiment, as shown in Fig. 1, the facial-expression-based driver state detection method of the present invention comprises the following steps:
Step S1: Acquire the driver's facial image
A camera mounted in front of the driver captures a video stream, and a Haar-feature + AdaBoost face detection algorithm detects the driver's face region in the video frames, i.e., the facial image: Haar-like features are extracted from the image and fed into an AdaBoost classifier, which locates the region containing the driver's face; the framed face region is taken as the facial image for subsequent processing.
Step S2: Preprocess the acquired facial image
First, convert the facial image to grayscale:
Gray = 0.3R + 0.59G + 0.11B
where Gray is the pixel's grayscale value and R, G, and B are its red, green, and blue values.
Then apply Gamma correction:
I = Gray^γ
where I is the corrected grayscale value and γ = 0.5.
Finally, process the grayscaled, Gamma-corrected image with PCA: treat the facial image as a matrix X of n rows and m columns; first zero-mean each row of X, then compute the covariance matrix of X and its eigenvalues and corresponding eigenvectors; arrange the eigenvectors as rows of a matrix O in descending order of their eigenvalues, and take the first K rows of O to form the matrix P (K rows, n columns); the facial image Y = PX is the image reduced to K dimensions. This completes the preprocessing of the driver's face image and eases the subsequent expression recognition by the neural network. The grayscale conversion and PCA dimensionality reduction shrink the facial image and improve real-time performance, while Gamma correction enhances the image features and improves recognition accuracy.
Step S3: Driver facial expression recognition
Step S3.1: Construct the facial expression recognition convolutional neural network
In this embodiment, as shown in Fig. 2, the facial expression recognition convolutional neural network consists of four sequentially connected stages, followed by an average pooling layer, a dropout layer, and a softmax classifier.
Each of the four stages comprises a first convolutional layer with a 3×3 kernel and stride 2; second and third convolutional layers with 3×3 kernels and stride 1; a pooling layer; and an inception structure. The pooling layers of the first three stages use a 3×3 kernel with stride 2; the pooling layer of the fourth stage uses a 3×3 kernel with stride 1.
In this embodiment, as shown in Fig. 3, the inception structure consists of four feature-map processing branches operating in parallel plus a filter concatenation layer. The first branch applies a 3×3 pooling operation to the input feature map, then a 1×1 convolution, and sends the result to the filter concatenation layer. The second branch applies a 1×1 convolution to the input feature map and sends the result to the filter concatenation layer. The third branch applies a 1×1 convolution to the input feature map, then convolves the result separately with a 3×1 kernel and a 1×3 kernel; both resulting feature maps are sent to the filter concatenation layer. The fourth branch applies a 1×1 convolution to the input feature map, then a 3×3 convolution, and then convolves that result separately with a 3×1 kernel and a 1×3 kernel; both resulting feature maps are sent to the filter concatenation layer. The filter concatenation layer concatenates the feature maps from the four branches to produce the connected feature map.
In this embodiment, the inception structure adopts a new design that splits a traditional regular convolution such as the 3×3 convolution into a 1×3 convolution and a 3×1 convolution. This saves a large number of parameters, speeds up computation, and reduces overfitting, while adding an extra layer of nonlinearity that expands the model's expressive capacity, letting it handle more and richer spatial features and increasing feature diversity. This special structural design makes the facial expression recognition network more lightweight while detecting better.
The first convolutional layer of the first stage receives the K-dimensional facial image produced by the dimensionality reduction. The image passes through the first, second, and third convolutional layers in turn, then through the pooling layer, and the pooled feature maps are processed by the inception structure to produce the connected feature map. The connected feature map from the first stage is passed to the second stage, which applies the same processing as the first; the second stage's output is passed to the third stage and the third stage's output to the fourth, each applying the same processing. The connected feature map from the fourth stage is fed to the average pooling layer; after average pooling, a proportion of the activations is dropped in the dropout layer, and the result is fed to the softmax classifier, which outputs the facial expression.
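Putting the pieces together, a skeleton of the whole network under the same assumptions: placeholder channel widths chosen so that each stage's inception output matches the next stage's input, max pooling, a dropout rate of 0.5, and a single-channel input map for the PCA-reduced image. It reuses the InceptionBlock class from the earlier sketch:

```python
import torch.nn as nn

class ExpressionNet(nn.Module):
    """Four stages, then average pooling, dropout and a softmax classifier.
    All channel widths, the pooling type, the dropout rate and the
    single-channel input are placeholder assumptions."""
    def __init__(self, n_classes=7, p_drop=0.5):
        super().__init__()
        blocks = []
        for i, (c_in, c_out) in enumerate([(1, 48), (48, 96), (96, 96), (96, 192)]):
            last = (i == 3)                    # fourth stage pools with stride 1
            blocks += [
                nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(c_out, c_out, 3, stride=1, padding=1), nn.ReLU(),
                nn.Conv2d(c_out, c_out, 3, stride=1, padding=1), nn.ReLU(),
                nn.MaxPool2d(3, stride=1 if last else 2, padding=1),
                # six branch outputs of c_out // 6 channels concatenate to c_out
                InceptionBlock(c_out, c=c_out // 6),
            ]
        self.features = nn.Sequential(*blocks)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Dropout(p_drop),
                                  nn.Linear(192, n_classes), nn.Softmax(dim=1))

    def forward(self, x):
        return self.head(self.features(x))
```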
Step S3.2: Train the facial expression recognition convolutional neural network
Feed K-dimensional facial images labelled with facial expressions into the network constructed in step S3.1 and train it, obtaining the trained facial expression recognition convolutional neural network.
During training, the activation function is ReLU, the optimisation algorithm is SGD, the initialisation method is Xavier, and the learning rate is:
lr = base_lr × (1 − iter/max_iter) × 0.5
where base_lr = 0.01 is the initial learning rate, iter is the current iteration number, and max_iter is the maximum number of iterations.
Step S3.3: After processing by steps (1) and (2), the acquired driver facial image is fed into the trained facial expression recognition convolutional neural network to obtain the driver's facial expression.
In this embodiment, seven basic driver facial expressions are output. The training data are a subset of a facial expression database; after preprocessing, they are fed into the facial expression recognition convolutional neural network for training, and the remaining part of the database is used for testing.
Step S4: Obtain the driver's driving state
From the recognised facial expression, the driver's state is obtained and displayed on a screen in real time, and the driver can be prompted promptly. When the driver shows a facial expression unsuited to driving, such as anger, the system issues an unfit-to-drive alert, promptly giving the driver an effective reminder or applying a series of measures to relieve the driver's current unsuitable driving state.
In this embodiment, training on the data set and recognising the driver's facial expressions verified the correctness and effectiveness of the improved inception structure proposed by the present invention.
Facial expression recognition was carried out with the VGG network model, the Inception V2 network model, the ResNet network model, and the facial-expression-based driver state detection method of the present invention; the recognition results are shown in Table 1.

Table 1: comparison of algorithm recognition rates

As Table 1 shows, the present invention can raise its accuracy by dynamically adding or removing improved inception structures, and so adapts to different conditions; once the number of improved inception structures grows to a certain point, its accuracy exceeds that of the best VGG network model while using fewer parameters. The present invention therefore holds a clear advantage in both the accuracy and the real-time performance of driver facial expression recognition.
Although illustrative specific embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be clear that the invention is not limited to the scope of these specific embodiments. To those of ordinary skill in the art, various changes will be apparent as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all inventions and creations that make use of the inventive concept are under protection.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910584900.8A CN110348350B (en) | 2019-07-01 | 2019-07-01 | Driver state detection method based on facial expressions |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110348350A (en) | 2019-10-18 |
CN110348350B (en) | 2022-03-25 |
Family
ID=68177588
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910584900.8A Active CN110348350B (en) | 2019-07-01 | 2019-07-01 | Driver state detection method based on facial expressions |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110348350B (en) |
Family application events
- 2019-07-01 (CN): application CN201910584900.8A granted as patent CN110348350B (en), status active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018133034A1 (en) * | 2017-01-20 | 2018-07-26 | Intel Corporation | Dynamic emotion recognition in unconstrained scenarios |
EP3355247A1 (en) * | 2017-01-27 | 2018-08-01 | STMicroelectronics Srl | A method of operating neural networks, corresponding network, apparatus and computer program product |
CN108108677A (en) * | 2017-12-12 | 2018-06-01 | Chongqing University of Posts and Telecommunications | Facial expression recognition method based on improved CNN |
CN108491858A (en) * | 2018-02-11 | 2018-09-04 | Nanjing University of Posts and Telecommunications | Fatigue driving detection method and system based on convolutional neural networks |
CN109034090A (en) * | 2018-08-07 | 2018-12-18 | Nantong University | Emotion recognition system and method based on body movements |
CN109376692A (en) * | 2018-11-22 | 2019-02-22 | Hohai University, Changzhou Campus | A Transfer Convolutional Neural Network Method for Facial Expression Recognition |
Non-Patent Citations (3)
Title |
---|
YAO A et al.: "HoloNet: towards robust emotion recognition in the wild", Proceedings of the 18th ACM International Conference on Multimodal Interaction *
DANG Hongshe et al.: "Multi-person expression recognition algorithm based on convolutional neural networks", Modern Computer *
GAN Lutao: "Research on driver state analysis methods based on facial expressions", China Master's Theses Full-text Database (Engineering Science & Technology II) *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111325190A (en) * | 2020-04-01 | 2020-06-23 | 京东方科技集团股份有限公司 | Expression recognition method and device, computer equipment and readable storage medium |
WO2021196928A1 (en) * | 2020-04-01 | 2021-10-07 | 京东方科技集团股份有限公司 | Expression recognition method and apparatus, computer device, and readable storage medium |
US20220343683A1 (en) * | 2020-04-01 | 2022-10-27 | Boe Technology Group Co., Ltd. | Expression Recognition Method and Apparatus, Computer Device, and Readable Storage Medium |
CN111325190B (en) * | 2020-04-01 | 2023-06-30 | 京东方科技集团股份有限公司 | Expression recognition method and device, computer equipment and readable storage medium |
US12002289B2 (en) * | 2020-04-01 | 2024-06-04 | Boe Technology Group Co., Ltd. | Expression recognition method and apparatus, computer device, and readable storage medium |
CN111563468A (en) * | 2020-05-13 | 2020-08-21 | 电子科技大学 | A method for detecting abnormal driver behavior based on neural network attention |
CN111563468B (en) * | 2020-05-13 | 2023-04-07 | 电子科技大学 | Driver abnormal behavior detection method based on attention of neural network |
CN111402143A (en) * | 2020-06-03 | 2020-07-10 | 腾讯科技(深圳)有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN111832416A (en) * | 2020-06-16 | 2020-10-27 | 杭州电子科技大学 | A method for motor imagery EEG signal recognition based on enhanced convolutional neural network |
CN113642467A (en) * | 2021-08-16 | 2021-11-12 | 江苏师范大学 | Facial expression recognition method based on improved VGG network model |
CN113642467B (en) * | 2021-08-16 | 2023-12-01 | 江苏师范大学 | Facial expression recognition method based on improved VGG network model |
Also Published As
Publication number | Publication date |
---|---|
CN110348350B (en) | 2022-03-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348350A (en) | A kind of driver status detection method based on facial expression | |
CN110532900B (en) | Facial Expression Recognition Method Based on U-Net and LS-CNN | |
CN107609638B (en) | A Method for Optimizing Convolutional Neural Networks Based on Linear Encoders and Interpolated Sampling | |
CN111046964B (en) | Convolutional neural network-based human and vehicle infrared thermal image identification method | |
JP6788264B2 (en) | Facial expression recognition method, facial expression recognition device, computer program and advertisement management system | |
CN108615010A (en) | Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern | |
CN108491858A (en) | Method for detecting fatigue driving based on convolutional neural networks and system | |
CN108664947A (en) | A kind of fatigue driving method for early warning based on Expression Recognition | |
CN108875674A (en) | A kind of driving behavior recognition methods based on multiple row fusion convolutional neural networks | |
CN110399821A (en) | Customer Satisfaction Acquisition Method Based on Facial Expression Recognition | |
CN106845351A (en) | It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term | |
CN106650786A (en) | Image recognition method based on multi-column convolutional neural network fuzzy evaluation | |
CN110046575A (en) | Based on the remote sensing images scene classification method for improving residual error network | |
CN105005765A (en) | Facial expression identification method based on Gabor wavelet and gray-level co-occurrence matrix | |
CN110059593B (en) | Facial expression recognition method based on feedback convolutional neural network | |
CN111507227B (en) | Multi-student individual segmentation and state autonomous identification method based on deep learning | |
CN107742095A (en) | Chinese sign language recognition method based on convolutional neural network | |
CN107862692A (en) | A kind of ribbon mark of break defect inspection method based on convolutional neural networks | |
CN110443296B (en) | Hyperspectral image classification-oriented data adaptive activation function learning method | |
CN112990007B (en) | Facial expression recognition method and system based on regional grouping and internal association fusion | |
CN109815920A (en) | Gesture recognition method based on convolutional neural network and adversarial convolutional neural network | |
CN107622261A (en) | Face age estimation method and device based on deep learning | |
CN105956570B (en) | Smile recognition method based on lip features and deep learning | |
CN106503619B (en) | Gesture recognition method based on BP neural network | |
CN110096991A (en) | A kind of sign Language Recognition Method based on convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |