
CN117351463A - Parameter detection method and device - Google Patents

Parameter detection method and device

Info

Publication number
CN117351463A
CN117351463A CN202210748481.9A
Authority
CN
China
Prior art keywords
vector
facial feature
value
parameter
feature parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210748481.9A
Other languages
Chinese (zh)
Inventor
李源
周欣文
黄�俊
沈鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Momenta Suzhou Technology Co Ltd
Original Assignee
Momenta Suzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Momenta Suzhou Technology Co Ltd filed Critical Momenta Suzhou Technology Co Ltd
Priority to CN202210748481.9A priority Critical patent/CN117351463A/en
Priority to PCT/CN2023/085382 priority patent/WO2024001365A1/en
Publication of CN117351463A publication Critical patent/CN117351463A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The application provides a parameter detection method and device, wherein the method comprises the following steps: acquiring a first video frame image, the first video frame image including a face image of a driver; determining a first prediction vector of a first facial feature parameter, wherein the first prediction vector is obtained by inputting the first video frame image into a preset first model, and the first model is used for detecting the first facial feature parameter of the driver in the image; and calculating a detection value of the first facial feature parameter and the confidence of the detection value according to the first prediction vector of the first facial feature parameter. The method and the device make the detection result of the facial feature parameters of the driver more accurate.

Description

Parameter detection method and device
Technical Field
The present disclosure relates to the field of image detection technologies, and in particular, to a method and apparatus for detecting parameters.
Background
In a driver monitoring system (Driver Monitor System, DMS), facial feature parameters of the driver, such as the line of sight and facial key points, need to be detected based on images captured by a camera, and the driving behavior of the driver, such as whether the driver is distracted while driving, is then detected based on these facial feature parameters. However, in the prior art, the detection results of the driver's facial feature parameters are not accurate enough, so the results of the DMS system's driving behavior detection for the driver are also not accurate enough, which affects the user experience.
Disclosure of Invention
The application provides a parameter detection method and device, which can enable the detection result of facial feature parameters of a driver to be more accurate.
In a first aspect, an embodiment of the present application provides a parameter detection method, including: acquiring a first video frame image, the first video frame image including a face image of a driver; determining a first prediction vector of a first facial feature parameter, wherein the first prediction vector is obtained by inputting the first video frame image into a preset first model, and the first model is used for detecting the first facial feature parameter of the driver in the image; and calculating a detection value of the first facial feature parameter and the confidence of the detection value according to the first prediction vector of the first facial feature parameter. In this method, not only the detection value of the first facial feature parameter of the first video frame image is calculated, but also the confidence of that detection value, so the confidence can be used to characterize how reliable the detection value is, which makes the detection result of the first facial feature parameter of the first video frame image more accurate.
The first model may be a parameter detection model in the following embodiment. The first prediction vector is a prediction vector output by the parameter detection model in the following embodiment.
In one possible implementation, the calculating the detection value of the first facial feature parameter and the confidence of the detection value according to the first prediction vector of the first facial feature parameter includes: normalizing the first prediction vector to obtain a first prediction probability vector of the first facial feature parameter; and calculating the detection value of the first facial feature parameter according to the first prediction probability vector, and calculating the confidence of the detection value according to the first prediction probability vector.
In one possible implementation, the first prediction probability vector is denoted as P = [S_0, S_1, …, S_N], where N is a natural number and S_0, S_1, …, S_N are respectively vector elements of the first prediction probability vector, and the calculating the detection value of the first facial feature parameter according to the first prediction probability vector includes:
calculating the detection value v̂ of the first facial feature parameter according to the following formula:
v̂ = Σ_{i=0}^{N} (i · S_i)
wherein i is an integer and i ∈ [0, N].
In one possible implementation, the first prediction probability vector is denoted as P = [S_0, S_1, …, S_N], where N is a natural number and S_0, S_1, …, S_N are respectively vector elements of the first prediction probability vector, and the calculating the confidence of the detection value according to the first prediction probability vector includes:
calculating the confidence conf of the detection value using the following formula:
conf = max(P).
In one possible implementation, the first prediction probability vector is denoted as P = [S_0, S_1, …, S_N], where N is a natural number and S_0, S_1, …, S_N are respectively vector elements of the first prediction probability vector, and the calculating the confidence of the detection value according to the detection value of the first facial feature parameter includes:
calculating the confidence conf of the first facial feature parameter using the following formula:
conf = S_round(v̂)
wherein v̂ is the detection value of the first facial feature parameter, and round represents a rounding function.
In one possible implementation manner, the normalizing the first prediction vector to obtain a first prediction probability vector of the first facial feature parameter includes:
and carrying out normalization processing on the first prediction vector by using a normalization exponential function to obtain the first prediction probability vector.
In one possible implementation manner, the training method of the first model includes: obtaining a first sample, the first sample comprising: a first image, a first parameter value of a first facial feature parameter of the first image; inputting the first sample into a preset initial model to obtain a second predictive vector of the first facial feature parameter; normalizing the second predictive vector to obtain a second predictive probability vector of the first facial feature parameter; encoding the first parameter value to obtain a true value vector of the first facial feature parameter; calculating a predicted deviation value by using a preset loss function according to the second predicted probability vector and the true value vector; adjusting parameters of the initial model according to the predicted deviation value; and when the initial model is not converged, acquiring a new sample as the first sample, and returning to the step of inputting the first sample into a preset initial model until the initial model is converged, so as to obtain the first model.
In one possible implementation, the preset initial model includes: a convolutional neural network and a fully-connected network, the convolutional neural network being used as the input end of the preset initial model and the fully-connected network being used as the output end of the preset initial model; the output end of the fully-connected network is provided with N+1 neurons for the first facial feature parameter, and the outputs of the N+1 neurons are used as the vector elements of the second prediction vector of the first facial feature parameter.
In a second aspect, an embodiment of the present application provides a method for establishing a parameter detection model, including: obtaining a first sample, the first sample comprising: a first image, a first parameter value of a first facial feature parameter of the first image; inputting the first sample into a preset initial model to obtain a second predictive vector of the first facial feature parameter; normalizing the second predictive vector to obtain a second predictive probability vector of the first facial feature parameter; encoding the first parameter value to obtain a true value vector of the first facial feature parameter; calculating a predicted deviation value by using a preset loss function according to the second predicted probability vector and the true value vector; adjusting parameters of the initial model according to the predicted deviation value; and when the initial model is not converged, acquiring a new sample as the first sample, and returning to the step of inputting the first sample into a preset initial model until the initial model is converged, so as to obtain the parameter detection model.
In one possible implementation, the preset initial model includes: a convolutional neural network and a fully-connected network, the convolutional neural network being used as the input end of the preset initial model and the fully-connected network being used as the output end of the preset initial model; the convolutional neural network can effectively extract features of the first sample and input the features into the fully-connected network, and the fully-connected network outputs the prediction vector of the first facial feature parameter according to the features; the output end of the fully-connected network is provided with N+1 neurons for the first facial feature parameter, and the outputs of the N+1 neurons are used as the vector elements of the second prediction vector of the first facial feature parameter.
In one possible implementation manner, the normalizing the second prediction vector includes: and normalizing the second prediction vector by using a normalization exponential function.
In one possible implementation manner, the encoding the first parameter value to obtain the true value vector of the first facial feature parameter includes: encoding the first parameter value by using a Gaussian distribution function to obtain a true value vector of the first facial feature parameter; alternatively, the first parameter value is encoded using an arbitrary distribution function, resulting in a true value vector for the first facial feature parameter.
In one possible implementation, when the first parameter value is encoded using a Gaussian distribution function, the calculating a predicted deviation value by using a preset loss function according to the second predictive probability vector and the true value vector includes:
calculating the i-th vector element L_i included in the predicted deviation value using the following formula of the loss function:
L_i = T_i · log(T_i / S_i)
wherein S_i is the i-th vector element of the second predictive probability vector, T_i is the i-th vector element of the true value vector, i is an integer, and i ∈ [0, N].
In one possible implementation, when the first parameter value is encoded using an arbitrary distribution function, the calculating a predicted deviation value by using a preset loss function according to the second predictive probability vector and the true value vector includes:
calculating the predicted deviation value using the following formula of the loss function:
L(S_m, S_{m+1}) = -((m+1-v)·log(S_m) + (v-m)·log(S_{m+1}))
wherein m = int(v), v is the first parameter value, int is a rounding operation, S_m is the m-th vector element of the second predictive probability vector, and S_{m+1} is the (m+1)-th vector element of the second predictive probability vector.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors coupled with a memory, the memory having one or more computer programs stored therein; the one or more computer programs, when executed by the processor, cause the electronic device to perform the method of any of the first aspect.
In a fourth aspect, embodiments of the present application provide an electronic device, including: one or more processors coupled with a memory, the memory having one or more computer programs stored therein; the one or more computer programs, when executed by the processor, cause the electronic device to perform the method of any of the second aspects.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the method of any of the first aspects.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the method of any of the second aspects.
In a seventh aspect, embodiments of the present application provide a computer program product comprising a computer program which, when run on a computer, causes the computer to perform the method of any one of the first aspects.
In an eighth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when run on a computer, causes the computer to perform the method of any of the second aspects.
In a ninth aspect, the present application provides a computer program for performing the method of the first or second aspect when the computer program is executed by a computer.
In one possible design, the program in the ninth aspect may be stored in whole or in part on a storage medium packaged with the processor, or in part or in whole on a memory not packaged with the processor.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a system structure to which the parameter detection method provided in the embodiment of the present application is applicable;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a method for establishing a parameter detection model according to an embodiment of the present application;
fig. 4 is a schematic flow chart of a parameter detection method according to an embodiment of the present application;
Fig. 5 is another flow chart of the parameter detection method according to the embodiment of the present application.
Detailed Description
The terminology used in the description section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application.
In the DMS system, it is necessary to detect facial feature parameters of the driver, such as a line-of-sight parameter, a face key point, and the like, based on an image captured by a camera, and further detect driving behavior of the driver, such as whether or not the driver is distracted during driving, and the like, based on the facial feature parameters of the driver. However, in the prior art, the detection result of the facial feature parameter of the driver is often not accurate enough, so that the detection result of the DMS system for detecting the driving behavior of the driver is not accurate enough.
Take the case where the facial feature parameter of the driver is the driver's line-of-sight parameter and this line-of-sight parameter is to be detected. A line-of-sight detection model can be preset in the DMS system; the input of the line-of-sight detection model is an image captured by the camera, and the output is the pitch angle and the yaw angle of the driver's line of sight in the image. However, if the image is captured while the driver blinks or closes the eyes, or while the driver's eyes are occluded, the driver's pupils are not visible in the image, that is, the image contains no pupil image of the driver. The line-of-sight detection model still outputs pitch-angle and yaw-angle values for such an image, and these angle values are in fact not accurate. If the angle values of images in which the driver's pupils are not visible are used as the data basis of subsequent driver distraction detection, the distraction detection result will be biased.
Therefore, the embodiment of the application provides a parameter detection method and a parameter detection model establishment method, which can enable the detection result of the facial feature parameters of the driver to be more accurate, and further enable the detection result of the DMS system on the driving behavior detection of the driver to be more accurate.
An architecture of a system to which the parameter detection method of the embodiment of the present application may be applied is exemplarily described. Fig. 1 is a schematic structural diagram of a DMS system to which the parameter detection method according to the embodiment of the present application is applicable, where the DMS system includes: a camera 110 and an electronic device 120; wherein,
the camera 110 is used to capture still images or video. The camera 110 may be disposed in a vehicle and positioned in front of the driver to be aimed at the driver's face. Still images or videos captured by the camera 110 include a driver's face image while the driver is sitting in the driving position. The object generates an optical image through the lens and projects the optical image onto the photosensitive element of the camera 110. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the electronic device 120.
The electronic device 120 is configured to convert an electrical signal into a digital image signal, detect a facial feature parameter of a driver based on the digital image signal, detect a driving behavior of the driver based on a detection result of the facial feature parameter of the driver, and the like. The electronic device 120 may be disposed in the same vehicle as the camera 110.
In another embodiment, an image signal processor (image signal processor, ISP) may be disposed in the camera 110, and the electrical signal may be converted into a digital image signal by the ISP, so that the camera 110 directly transmits the digital image signal converted by the ISP to the electronic device 120, and the electronic device 120 does not need to perform the conversion from the electrical signal to the digital image signal.
The electronic device referred to in the embodiments of the present application may be a vehicle-mounted device, an Internet-of-Vehicles terminal, a computer, a laptop computer, a handheld communication device, a handheld computing device, and/or another device for communicating over a wireless system or a next-generation communication system, such as a mobile terminal in a 5G network or a mobile terminal in a future evolved public land mobile network (Public Land Mobile Network, PLMN), etc.
Fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 200 includes: processor 210, memory 220, display 230, etc. The electronic device 120 described above may be implemented by the electronic device 200 shown in fig. 2.
Optionally, to make the electronic device more complete in function, the electronic device may further include one or more of an antenna, a mobile communication module, a wireless communication module, an audio module, a speaker, a receiver, a microphone, an earphone interface, a charging management module, a power management module, a battery, and the like, which is not limited in the embodiments of the present application.
Processor 210 may include one or more processing units such as, for example: processor 210 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an ISP, a controller, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
The electrical signal transmitted by the camera can be converted into a digital image signal by the ISP. The ISP may output the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV.
The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that the processor 210 has just used or recycled. If the processor 210 needs to reuse the instruction or data, it may be called directly from the memory. Repeated accesses are avoided and the latency of the processor 210 is reduced, thereby improving the efficiency of the system.
Memory 220 may be used to store computer executable program code that includes instructions. The memory 220 may include a stored program area and a stored data area. The storage program area may store an application program (such as a sound playing function, an image playing function, etc.) required for at least one function of the operating system, etc. The storage data area may store data created during use of the electronic device 200 (e.g., audio data, etc.), and so on. In addition, the memory 220 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like. The processor 210 performs various functional applications of the electronic device 200 and data processing by executing instructions stored in the memory 220 and/or instructions stored in a memory provided in the processor.
In the embodiment shown in fig. 2, the memory 220 is provided in the electronic device 200 as an example, and in other embodiments provided in the embodiments of the present application, the memory 220 may not be provided in the electronic device 200, and in this case, the memory 220 may be connected to the electronic device 200 through an interface provided by the electronic device 200, and may further be connected to the processor 210 in the electronic device 200.
The display 230 is used to display images, videos, and the like. The display 230 includes a display panel. The display panel may employ a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 200 may include one or more displays 230.
Hereinafter, a method for detecting parameters according to the embodiments of the present application will be described in detail with reference to the above system architecture and the structure of the electronic device.
First, an exemplary description is given of a parameter detection model establishing method according to an embodiment of the present application.
In the embodiment of the application, a training sample set is preset, and the training sample set comprises a plurality of samples. The number of samples is not limited in the embodiments of the present application.
Each sample includes: an image, parameter values for at least 1 facial feature parameter in the image.
Alternatively, facial feature parameters may include, but are not limited to: a driver's line of sight parameter, a driver's head pose parameter, a position parameter of eye feature points, a position parameter of facial feature areas, a position parameter of a driver's body area, and the like.
Alternatively, the driver's gaze parameters may include, but are not limited to: pitch angle, yaw angle, etc.
Alternatively, the driver's head pose parameters may include, but are not limited to: pitch angle, yaw angle, roll angle, etc.
Alternatively, the ocular feature points may include, but are not limited to: eye center point, corner of eye point, eyelid point, pupil point, etc.
Alternatively, facial feature regions may include, but are not limited to: eye area, mouth area, nose area, etc.
Alternatively, the body feature region may include, but is not limited to: hand area, torso area, etc.
In this embodiment, an initial model is preset, where the initial model may include: convolutional neural networks and fully-connected networks; the convolutional neural network is used as an input end of an initial model and used for receiving training samples, and the convolutional neural network can effectively extract characteristics of the samples and input the characteristics into the fully-connected network; the fully-connected network is used as an output end of the initial model to output the predictive vector of the facial feature parameters.
In the embodiment of the present application, the prediction vector of each facial feature parameter is an (N+1)-dimensional array. That is, for each facial feature parameter, the output end of the fully-connected network is provided with N+1 neurons that output the vector elements of that facial feature parameter, so that the vector elements output by its corresponding N+1 neurons constitute the prediction vector of that facial feature parameter. N is a natural number.
For example, for the parameter of the pitch angle of the line of sight, the output end of the fully-connected network comprises N+1 neurons for the pitch angle, and for the parameter of the yaw angle of the line of sight, the output end of the fully-connected network comprises N+1 neurons for the yaw angle.
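By way of illustration only, the following is a minimal PyTorch-style sketch of such an initial model, with a generic convolutional backbone and a fully-connected head that outputs N+1 values per facial feature parameter; the class name, layer sizes, and input resolution are assumptions made for the example, not details taken from this application.

```python
# Minimal sketch of the initial model described above (assumptions: PyTorch,
# a small generic convolutional backbone, N+1 output neurons per facial feature parameter).
import torch
import torch.nn as nn

class ParameterDetectionModel(nn.Module):
    def __init__(self, num_bins: int, num_parameters: int = 1):
        """num_bins = N + 1 output neurons per facial feature parameter."""
        super().__init__()
        # Convolutional neural network: input end, extracts features from the image.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Fully-connected network: output end, one group of N+1 neurons per parameter.
        self.head = nn.Linear(32, num_bins * num_parameters)
        self.num_bins = num_bins
        self.num_parameters = num_parameters

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        features = self.backbone(image)
        y = self.head(features)
        # One prediction vector y = [y_0, ..., y_N] per facial feature parameter.
        return y.view(-1, self.num_parameters, self.num_bins)

# Example: pitch angle and yaw angle of the line of sight, each with N+1 = 181 neurons.
model = ParameterDetectionModel(num_bins=181, num_parameters=2)
prediction = model(torch.randn(1, 3, 224, 224))  # shape: (1, 2, 181)
```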
The implementation flow of the parameter detection model building method in the embodiment of the present application is described in the following by way of example. The electronic device for executing the parameter detection model establishing method according to the embodiment of the present application may be the electronic device 120, or may be an electronic device that is disposed outside the vehicle, which is not limited in the embodiment of the present application. As shown in fig. 3, the parameter detection model building method may include:
step 301: acquiring a first sample; the first sample includes: the first image, a first parameter value of a first facial feature parameter.
Step 302: inputting the first sample into a preset initial model to obtain a prediction vector of the first facial feature parameter.
The prediction vector of the first facial feature parameter is an (N+1)-dimensional array, and in the embodiment of the present application the prediction vector is denoted as y = [y_0, y_1, …, y_N];
the vector elements y_0, y_1, …, y_N of the prediction vector y are the output values of the N+1 neurons corresponding to the first facial feature parameter.
The first facial feature parameter in this step and the subsequent step 303 is a first facial feature parameter of the first sample, and the first sample is omitted for convenience of description.
Step 303: and carrying out normalization processing on the predicted vector of the first facial feature parameter to obtain a predicted probability vector of the first facial feature parameter.
The normalization processing in the embodiment of the present application refers to mapping each vector element in the above-described prediction vector to an interval [0,1] by calculation.
In the embodiment of the present application, the predictive probability vector of the first facial feature parameter is denoted as P.
In one possible implementation, the prediction vector y may be normalized using the softmax function, and the resulting prediction probability vector is P = softmax(y) = [S_0, S_1, …, S_N].
The Softmax function is also known as a normalized exponential function.
The prediction probability vector is also an (N+1)-dimensional array, and the sum of the vector elements of the prediction probability vector is 1, i.e., Σ_{i=0}^{N} S_i = 1.
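As a concrete illustration of this normalization step, the sketch below (plain NumPy, with illustrative values) turns a prediction vector y into the prediction probability vector P = softmax(y); the max-subtraction is only a standard numerical-stability detail added for the example.

```python
# Sketch of step 303: normalize the prediction vector y with softmax (NumPy, illustrative).
import numpy as np

def softmax(y: np.ndarray) -> np.ndarray:
    """Map the (N+1)-dimensional prediction vector y to a probability vector P;
    each element of P lies in [0, 1] and the elements sum to 1."""
    e = np.exp(y - np.max(y))  # subtracting max(y) improves numerical stability
    return e / e.sum()

y = np.array([0.1, 2.0, 0.5, -1.0])   # example prediction vector with N + 1 = 4 elements
P = softmax(y)                         # prediction probability vector
assert abs(P.sum() - 1.0) < 1e-9       # the sum of the vector elements is 1
```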
Step 304: and encoding the first parameter value to obtain a true value vector of the first facial feature parameter.
Let the first parameter value be v and the true value vector be T; the true value vector is an (N+1)-dimensional array, denoted T = [t_0, t_1, …, t_N].
In one possible implementation, the first parameter value v may be encoded using a Gaussian distribution function to obtain the true value vector T, with a calculation formula, for example, as follows:
t_i = (1 / (√(2π)·σ)) · exp(-(i - v)² / (2σ²))
wherein i is an integer and i ∈ [0, N]; σ is the standard deviation of the Gaussian distribution. The specific value of σ can be set according to the actual situation of the first parameter value: if the first parameter value is difficult to predict or its accuracy is low, σ can be set relatively large, so that the obtained true value vector T is more dispersed; otherwise, σ is set relatively small, so that the obtained true value vector T is less dispersed.
In another possible implementation, the first parameter value v may be encoded using an arbitrary distribution function, resulting in the true value vector T.
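The following sketch illustrates one way such an encoding could be implemented. The exact Gaussian normalization is not reproduced from the application, so the vector is simply renormalized to sum to 1 (an assumption), and a two-bin linear interpolation is shown as one example of an encoding built from an arbitrary distribution function, chosen because it matches the loss formula given in step 305 below.

```python
# Sketch of step 304: encode the ground-truth parameter value v into a true value vector T.
# Assumptions: the bins are the integers 0..N, the Gaussian vector is renormalized to sum
# to 1, and two-bin linear interpolation is one possible "arbitrary distribution" encoding.
import numpy as np

def encode_gaussian(v: float, num_bins: int, sigma: float = 2.0) -> np.ndarray:
    """True value vector T with a Gaussian centered on v; a larger sigma gives a more dispersed T."""
    i = np.arange(num_bins, dtype=np.float64)
    t = np.exp(-((i - v) ** 2) / (2.0 * sigma ** 2))
    return t / t.sum()

def encode_two_bin(v: float, num_bins: int) -> np.ndarray:
    """True value vector that splits the probability mass between the two bins adjacent to v."""
    t = np.zeros(num_bins, dtype=np.float64)
    m = int(v)                            # m = int(v), rounding down for positive v
    t[m] = (m + 1) - v                    # e.g. v = 46.8 gives t[46] = 0.2
    t[min(m + 1, num_bins - 1)] = v - m   # and t[47] = 0.8
    return t

T_gauss = encode_gaussian(46.8, num_bins=181)
T_interp = encode_two_bin(46.8, num_bins=181)
```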
Step 305: and calculating a predicted deviation value by using a preset loss function according to the true value vector T and the predicted probability vector P of the first facial feature parameters.
The predicted deviation value is set to L below.
If the true value vector T is encoded using a Gaussian distribution function, in one possible implementation the Kullback-Leibler divergence may be used as the loss function. In this case the calculated predicted deviation value L may be an (N+1)-dimensional array, denoted L = [l_0, l_1, …, l_N], and the i-th vector element l_i of the predicted deviation value may be calculated using the following loss function:
l_i = t_i · log(t_i / S_i)
wherein i is an integer and i ∈ [0, N].
If the true value vector T is encoded using an arbitrary distribution function, in one possible implementation the predicted deviation value may be calculated using the following loss function:
L(S_m, S_{m+1}) = -((m+1-v)·log(S_m) + (v-m)·log(S_{m+1}))
wherein m = int(v), int is a rounding operation, S_m is the m-th vector element of the prediction probability vector P calculated in step 303, and S_{m+1} is the (m+1)-th vector element of the prediction probability vector P calculated in step 303.
For example, assuming that the first facial feature parameter is the pitch angle of the line of sight and the first parameter value of the pitch angle is 46.8, then m = int(46.8) = 46, and the predicted deviation value calculated using the above loss function is L = -((46+1-46.8)·log(S_46) + (46.8-46)·log(S_47)) = -(0.2·log(S_46) + 0.8·log(S_47)).
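Under the same assumptions as the encoding sketch above, the two loss variants of step 305 could be computed as follows; the epsilon guard against log(0) is an implementation detail added for the example, not something specified by the application.

```python
# Sketch of step 305: compute the predicted deviation value from P and the encoding of v.
import numpy as np

EPS = 1e-12  # guard against log(0); an implementation detail of this sketch only

def kl_loss(T: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Per-element Kullback-Leibler divergence l_i = t_i * log(t_i / S_i), used with the
    Gaussian-encoded true value vector; a scalar loss can be taken as the sum of the elements."""
    return np.where(T > 0, T * np.log((T + EPS) / (P + EPS)), 0.0)

def two_bin_loss(P: np.ndarray, v: float) -> float:
    """L(S_m, S_{m+1}) = -((m+1-v)*log(S_m) + (v-m)*log(S_{m+1})), used when the first
    parameter value v is encoded with the two-bin (arbitrary distribution) scheme."""
    m = int(v)
    return float(-((m + 1 - v) * np.log(P[m] + EPS) + (v - m) * np.log(P[m + 1] + EPS)))

# Worked example from the text: pitch angle v = 46.8 gives
# L = -(0.2 * log(S_46) + 0.8 * log(S_47)).
P = np.full(181, 1.0 / 181)
print(two_bin_loss(P, 46.8))
```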
Step 306: and adjusting parameters of the preset model by using the predicted deviation value.
How to use the predicted deviation value to adjust the parameters of the preset model can be implemented by using related techniques, which are not described herein.
Step 307: and judging whether the preset initial model is converged, if so, obtaining a parameter detection model, and if not, executing step 308.
The determining whether the preset model converges may be implemented by using a related technology, for example, determining whether the predicted deviation value meets a preset convergence condition, or determining whether the adjustment value of the parameter of the preset model meets the preset convergence condition, which is not limited in the embodiment of the present application.
Step 308: and obtaining a new sample as a first sample, and returning to execute the step 302 and the subsequent steps until the preset initial model converges to obtain the parameter detection model in the embodiment of the application.
It should be noted that, after step 308, the method may further include the steps of testing the parameter detection model obtained by training by using a test sample, adjusting parameters of the parameter detection model according to a test result, and the like, so as to optimize the parameter detection model.
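Putting steps 301 to 308 together, a training loop for the initial model might look like the sketch below (PyTorch). The dataset format, optimizer settings, and convergence test are illustrative assumptions, a single facial feature parameter is assumed, and the Gaussian encoding with the KL loss is used; none of these details are prescribed by the application.

```python
# Sketch of the training flow of fig. 3 (steps 301-308); dataset, optimizer and
# convergence criterion are illustrative assumptions, and num_parameters = 1 is assumed.
import torch
import torch.nn.functional as F

def train(model, dataset, num_bins, sigma=2.0, lr=1e-3, max_steps=10000, tol=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for step, (image, v) in enumerate(dataset):          # step 301: a sample (first image, value v)
        y = model(image).squeeze(1)                      # step 302: prediction vector y
        P = torch.softmax(y, dim=-1)                     # step 303: prediction probability vector
        i = torch.arange(num_bins, dtype=torch.float32)
        T = torch.exp(-(i - v) ** 2 / (2 * sigma ** 2))  # step 304: Gaussian true value vector
        T = T / T.sum()
        loss = F.kl_div(P.log(), T.unsqueeze(0), reduction="batchmean")  # step 305: deviation
        optimizer.zero_grad()
        loss.backward()                                  # step 306: adjust model parameters
        optimizer.step()
        if abs(prev_loss - loss.item()) < tol or step >= max_steps:      # step 307: converged?
            break                                        # step 308 otherwise: take the next sample
        prev_loss = loss.item()
    return model
```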
The input of the parameter detection model established by the method shown in fig. 3 is an image, and the output is a prediction vector of the first facial feature parameter of the image; the prediction vector is an (N+1)-dimensional array. Compared with a prior-art parameter detection model, which outputs only a single parameter value, the output of this parameter detection model carries data information of more dimensions, which improves the accuracy of its detection result for the first facial feature parameter. The parameter detection model may be provided in the electronic device 120 to support the electronic device in detecting the first facial feature parameter of an image.
The parameter detection method provided in the embodiment of the present application is described below as an example.
Fig. 4 is a flowchart of a method for detecting parameters according to an embodiment of the present application, as shown in fig. 4, where the method may be applied to the electronic device, and the method may include:
step 401: a first video frame image is acquired.
The first video frame image obtained in this step may be a frame of video frame image including a face image of the driver, which is captured by the camera in real time.
Step 402: a first predictive vector of a first facial feature parameter of a first video frame image is determined, the first predictive vector being obtainable by inputting the first video frame image into a preset first model.
For convenience of description, the first facial feature parameter of the first video frame image is briefly described as the first facial feature parameter.
The parameter detection model used in this step is the parameter detection model obtained by the parameter detection model establishing method described above; therefore, continuing the description of fig. 3 above, the prediction vector of the first facial feature parameter is denoted as y.
Step 403: and calculating the detection value of the first facial feature parameter and the confidence coefficient of the detection value according to the prediction vector of the first facial feature parameter.
For convenience of description, the confidence of the detected value of the first facial feature parameter will be also referred to as the confidence of the first facial feature parameter in the following description.
Optionally, the step may include:
normalizing the predictive vector of the first facial feature parameter to obtain a predictive probability vector;
and calculating the detection value of the first facial feature parameter and the confidence coefficient of the detection value according to the predictive probability vector.
It should be noted that the specific normalization method applied to the prediction vector y in this step generally needs to be the same as the normalization method used when establishing the parameter detection model, so as to ensure the accuracy of the detection value and the confidence obtained by the subsequent calculation. Taking the above normalization of the prediction vector y using the softmax function as an example, and continuing the description of fig. 3, the obtained prediction probability vector is P = softmax(y) = [S_0, S_1, …, S_N].
The detection value of the first facial feature parameter is denoted as v̂. The calculating the detection value of the first facial feature parameter according to the prediction probability vector P may include:
calculating the detection value v̂ of the first facial feature parameter using the following calculation formula:
v̂ = Σ_{i=0}^{N} (i · S_i)
In one possible implementation, if the parameter detection model is established by encoding the first parameter value using a Gaussian distribution function, the calculating the confidence of the detection value v̂ according to the prediction probability vector P may include:
calculating the confidence conf of the detection value v̂ according to the prediction probability vector P using the following formula:
conf = max(P)
That is, the maximum value among S_0 ~ S_N may be selected as the confidence conf of the detection value v̂.
In another possible implementation, if the parameter detection model is established by encoding the first parameter value using an arbitrary distribution function, the calculating the confidence of the detection value v̂ according to the prediction probability vector P may include:
calculating the confidence conf according to the detection value v̂ obtained from the prediction probability vector P, using the following formula:
conf = S_round(v̂)
where round represents a rounding function.
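Collecting the formulas of step 403 in one place, the inference-side computation could be sketched as follows (NumPy); the helper name and the example values are illustrative.

```python
# Sketch of step 403: detection value and confidence from the prediction probability vector P.
import numpy as np

def detect(P: np.ndarray, use_max_confidence: bool = True):
    """Return (v_hat, conf) for one facial feature parameter.

    v_hat = sum_i (i * S_i); conf = max(P) for the Gaussian-encoded case, or
    conf = S_round(v_hat) for the arbitrary-distribution case.
    """
    i = np.arange(len(P))
    v_hat = float(np.sum(i * P))                 # detection value
    if use_max_confidence:
        conf = float(np.max(P))                  # conf = max(P)
    else:
        conf = float(P[int(round(v_hat))])       # conf = S_round(v_hat)
    return v_hat, conf

P = np.zeros(181)
P[46], P[47] = 0.2, 0.8
print(detect(P))   # detection value close to 46.8, confidence 0.8
```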
Through the steps, the detection value of the first facial feature parameter of the first video frame image can be calculated, and the confidence coefficient of the detection value can be calculated, so that the confidence coefficient can be used for representing the credibility of the detection value, and the detection result of the first facial feature parameter of the first video frame image is more accurate.
Alternatively, the detected value and the confidence of the first facial feature parameter of the first video frame image may be used as a data base for detecting the driving behavior of the driver. At this time, as shown in fig. 5, the parameter detection method according to the embodiment of the present application may further include the following step 404 after step 403.
Step 404: and detecting the driving behavior of the driver according to the detection value and the confidence level of the first facial feature parameter.
When the driving behavior of the driver is detected according to the detection value and the confidence of the first facial feature parameter, whether the detection value is suitable as a data basis for driving behavior detection can be evaluated according to the confidence of the detection value. For example, detection values whose confidence is lower than a certain threshold are filtered out and not used as the data basis for detecting the driving behavior of the driver, so that the data basis for the driving behavior detection of the driver is more accurate and effective, and the accuracy of the driving behavior detection of the driver can be improved.
For example, the driving behavior may be a distraction behavior of the driver, and the first facial feature parameter may be a pitch angle and a yaw angle of the line of sight, and the step may include:
and detecting the distraction behavior of the driver according to the detection value and the confidence coefficient of the sight pitch angle and the detection value and the confidence coefficient of the sight yaw angle.
The detecting the distraction behavior of the driver according to the detection value and the confidence coefficient of the sight pitch angle and the detection value and the confidence coefficient of the sight yaw angle may specifically include:
Acquiring a detection value and a confidence coefficient of a sight pitch angle of each frame of video frame image in a first time period, and acquiring a detection value and a confidence coefficient of a sight yaw angle of each frame of video frame image in the first time period;
filtering the video frame images in the first time period according to the confidence coefficient of the sight pitch angle and the confidence coefficient of the sight yaw angle of each frame of video frame image to obtain effective video frame images;
and detecting the distraction behavior of the driver according to the detection value of the sight pitch angle and the detection value of the sight yaw angle of the effective video frame image.
It should be noted that the first time period may be a period of a preset duration ending at the shooting time of the first video frame image; the specific value of the preset duration is not limited in the embodiments of the present application. The first video frame image is the last video frame image in the first time period.
In one possible implementation, when filtering the video frame images according to the confidence of the line-of-sight pitch angle and the confidence of the line-of-sight yaw angle of each video frame image, a first threshold may be set for the confidence of the line-of-sight pitch angle and a second threshold for the confidence of the line-of-sight yaw angle. Video frame images whose line-of-sight pitch-angle confidence is lower than the first threshold and/or whose line-of-sight yaw-angle confidence is lower than the second threshold are filtered out, and the remaining video frame images, that is, those whose line-of-sight pitch-angle confidence is not lower than the first threshold and whose line-of-sight yaw-angle confidence is not lower than the second threshold, are used as the effective video frame images.
The above-mentioned detection of the distraction behavior of the driver according to the detection value of the line-of-sight pitch angle and the detection value of the line-of-sight yaw angle of the effective video frame image may be implemented using a related distraction behavior detection method, and the embodiment of the present application is not limited.
Through the filtering of the video frame images, the video frame images with low confidence coefficient of the sight pitch angle and/or low confidence coefficient of the sight yaw angle can be filtered, for example, the video frame images without driver pupil images in the video frame images have low confidence coefficient of the sight pitch angle and/or low confidence coefficient of the sight yaw angle, and can be filtered, so that the basic data of the driver distraction behavior detection are more accurate and effective, and the accuracy of the driver distraction behavior detection can be improved.
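The frame-filtering logic described above could be sketched as follows; the threshold values and the per-frame record fields are assumptions made for the example, and the distraction detector itself is left to a related-art method.

```python
# Sketch of the confidence-based frame filtering; thresholds and record fields are illustrative.
from typing import Dict, List

PITCH_CONF_THRESHOLD = 0.5   # first threshold, for the line-of-sight pitch-angle confidence
YAW_CONF_THRESHOLD = 0.5     # second threshold, for the line-of-sight yaw-angle confidence

def select_effective_frames(frames: List[Dict]) -> List[Dict]:
    """Keep only frames whose pitch and yaw confidences both reach their thresholds;
    the remaining frames are the effective video frame images."""
    return [
        f for f in frames
        if f["pitch_conf"] >= PITCH_CONF_THRESHOLD and f["yaw_conf"] >= YAW_CONF_THRESHOLD
    ]

# Example: a frame in which the pupils were not visible gets low confidences and is dropped.
frames = [
    {"pitch": 5.0, "pitch_conf": 0.9, "yaw": -2.0, "yaw_conf": 0.8},
    {"pitch": 0.0, "pitch_conf": 0.1, "yaw": 0.0, "yaw_conf": 0.2},   # eyes closed or occluded
]
effective = select_effective_frames(frames)   # only the first frame remains
```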
The embodiment of the application also provides an electronic device, which includes: a processor coupled to a memory, the memory having a computer program stored therein; when the computer program is executed by the processor, the electronic device implements the methods provided in the embodiments shown in fig. 3 to 5 of the present application.
The embodiment of the application also provides electronic equipment, which comprises a storage medium and a central processing unit, wherein the storage medium can be a nonvolatile storage medium, a computer executable program is stored in the storage medium, and the central processing unit is connected with the nonvolatile storage medium and executes the computer executable program to realize the methods provided by the embodiments shown in fig. 3-5 of the application.
Embodiments of the present application also provide a computer-readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the methods provided by the embodiments shown in fig. 3-5 of the present application.
Embodiments of the present application also provide a computer program product comprising a computer program which, when run on a computer, causes the computer to perform the methods provided by the embodiments shown in fig. 3-5 of the present application.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "and/or", describes an association relation of association objects, and indicates that there may be three kinds of relations, for example, a and/or B, and may indicate that a alone exists, a and B together, and B alone exists. Wherein A, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of the following" and the like means any combination of these items, including any combination of single or plural items. For example, at least one of a, b and c may represent: a, b, c, a and b, a and c, b and c or a and b and c, wherein a, b and c can be single or multiple.
Those of ordinary skill in the art will appreciate that the various elements and algorithm steps described in the embodiments disclosed herein can be implemented as a combination of electronic hardware, computer software, and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In several embodiments provided herein, any of the functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The foregoing is merely specific embodiments of the present application, and any person skilled in the art may easily conceive of changes or substitutions within the technical scope of the present application, which should be covered by the protection scope of the present application. The protection scope of the present application shall be subject to the protection scope of the claims.

Claims (18)

1. A method for detecting a parameter, comprising:
acquiring a first video frame image; the first video frame image includes a face image of a driver;
determining a first predictive vector of a first facial feature parameter, wherein the first predictive vector is obtained by inputting the first video frame image into a preset first model; the first model is used for detecting a first facial feature parameter of a driver in the image;
and calculating the detection value of the first facial feature parameter and the confidence coefficient of the detection value according to the first prediction vector of the first facial feature parameter.
2. The method of claim 1, wherein the calculating the detected value of the first facial feature parameter and the confidence of the detected value based on the first predictive vector of the first facial feature parameter comprises:
normalizing the first predictive vector to obtain a first predictive probability vector of the first facial feature parameter;
And calculating the detection value of the first facial feature parameter and the confidence coefficient of the detection value according to the first predictive probability vector.
3. The method of claim 2, wherein the first predictive probability vector is denoted as P = [S_0, S_1, ..., S_N], N is a natural number, and S_0, S_1, ..., S_N are respectively vector elements of the first predictive probability vector, and the calculating the detection value of the first facial feature parameter according to the first predictive probability vector comprises:
calculating the detection value v̂ of the first facial feature parameter according to the following formula:
v̂ = Σ_{i=0}^{N} (i · S_i)
wherein i is an integer and i ∈ [0, N].
4. A method according to claim 2 or 3, characterized in that the first predictive probability vector is denoted as P = [S_0, S_1, ..., S_N], N is a natural number, and S_0, S_1, ..., S_N are respectively vector elements of the first predictive probability vector, and the calculating the confidence of the detection value according to the first predictive probability vector comprises:
calculating the confidence conf of the detection value using the following formula:
conf = max(P).
5. A method according to claim 2 or 3, wherein the first predictive probability vector is denoted as P = [S_0, S_1, ..., S_N], N is a natural number, and S_0, S_1, ..., S_N are respectively vector elements of the first predictive probability vector, and the calculating the confidence of the detection value according to the detection value of the first facial feature parameter comprises:
calculating the confidence conf of the first facial feature parameter using the following formula:
conf = S_round(v̂)
wherein v̂ is the detection value of the first facial feature parameter, and round represents a rounding function.
6. A method according to claim 2 or 3, wherein normalizing the first prediction vector to obtain a first prediction probability vector for the first facial feature parameter comprises:
and carrying out normalization processing on the first prediction vector by using a normalization exponential function to obtain the first prediction probability vector.
7. A method according to any one of claims 1 to 3, wherein the training method of the first model comprises:
obtaining a first sample, the first sample comprising: a first image, a first parameter value of a first facial feature parameter of the first image;
inputting the first sample into a preset initial model to obtain a second predictive vector of the first facial feature parameter;
normalizing the second predictive vector to obtain a second predictive probability vector of the first facial feature parameter;
encoding the first parameter value to obtain a true value vector of the first facial feature parameter;
Calculating a predicted deviation value by using a preset loss function according to the second predicted probability vector and the true value vector;
adjusting parameters of the initial model according to the predicted deviation value;
and when the initial model is not converged, acquiring a new sample as the first sample, and returning to the step of inputting the first sample into a preset initial model until the initial model is converged, so as to obtain the first model.
8. The method of claim 7, wherein the preset initial model comprises: a convolutional neural network and a fully-connected network, the convolutional neural network being used as the input of the preset initial model, and the fully-connected network being used as the output of the preset initial model;
the output end of the fully-connected network is provided with N+1 neurons aiming at the first facial feature parameter, and the output of the N+1 neurons is used as a vector element of a second prediction vector of the first facial feature parameter.
9. The parameter detection model building method is characterized by comprising the following steps of:
obtaining a first sample, the first sample comprising: a first image, a first parameter value of a first facial feature parameter of the first image;
Inputting the first sample into a preset initial model to obtain a second predictive vector of the first facial feature parameter;
normalizing the second predictive vector to obtain a second predictive probability vector of the first facial feature parameter;
encoding the first parameter value to obtain a true value vector of the first facial feature parameter;
calculating a predicted deviation value by using a preset loss function according to the second predicted probability vector and the true value vector;
adjusting parameters of the initial model according to the predicted deviation value;
and when the initial model has not converged, acquiring a new sample as the first sample and returning to the step of inputting the first sample into the preset initial model, until the initial model converges, thereby obtaining the parameter detection model.
10. The method of claim 9, wherein the preset initial model comprises: a convolutional neural network used as the input of the preset initial model, and a fully-connected network used as the output of the preset initial model;
the output end of the fully-connected network is provided with N+1 neurons for the first facial feature parameter, and the outputs of the N+1 neurons are used as the vector elements of a second prediction vector of the first facial feature parameter.
11. The method according to claim 9 or 10, wherein normalizing the second prediction vector comprises:
normalizing the second prediction vector by using a normalized exponential function.
12. The method according to claim 9 or 10, wherein said encoding the first parameter value to obtain a true value vector of the first facial feature parameter comprises:
encoding the first parameter value by using a Gaussian distribution function to obtain a true value vector of the first facial feature parameter; or,
and encoding the first parameter value by using an arbitrary distribution function to obtain a true value vector of the first facial feature parameter.
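As an illustration of the Gaussian-distribution encoding of claim 12, the sketch below builds a length-(N+1) true value vector centred at the first parameter value; the width sigma and the final normalisation to unit sum are assumptions, since the claim does not fix them.

```python
import numpy as np

def encode_gaussian(v: float, n_bins: int, sigma: float = 1.0) -> np.ndarray:
    """Encode the first parameter value v into a true value vector by
    evaluating a Gaussian centred at v over the N+1 discrete values."""
    idx = np.arange(n_bins)
    t = np.exp(-0.5 * ((idx - v) / sigma) ** 2)
    return t / t.sum()  # assumed normalisation so the elements sum to 1

T = encode_gaussian(v=42.3, n_bins=91)
print(int(T.argmax()), float(T.sum()))  # peak near index 42, sum close to 1
```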
13. The method according to claim 12, wherein, when the first parameter value is encoded using a Gaussian distribution function, said calculating a predicted deviation value by using a preset loss function according to the second predicted probability vector and the true value vector comprises:
calculating the i-th vector element L_i of the predicted deviation value using the following formula of the loss function:
wherein S_i is the i-th vector element of the second predictive probability vector, T_i is the i-th vector element of the true value vector, i is an integer, and i ∈ [0, N].
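The per-element formula of claim 13 is not reproduced in the text above. Purely as an assumption consistent with the surrounding definitions (Gaussian-encoded soft labels T and predicted probabilities S), the sketch below uses the element-wise cross-entropy L_i = -T_i log(S_i); the actual claimed formula may differ.

```python
import numpy as np

def elementwise_loss(S: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Assumed per-element deviation L_i = -T_i * log(S_i)."""
    return -T * np.log(S + 1e-12)

S = np.array([0.10, 0.20, 0.50, 0.20])   # second predictive probability vector
T = np.array([0.05, 0.25, 0.50, 0.20])   # Gaussian-encoded true value vector
print(elementwise_loss(S, T))
```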
14. The method according to claim 12, wherein, when the first parameter value is encoded using an arbitrary distribution function, said calculating a predicted deviation value by using a preset loss function according to the second predicted probability vector and the true value vector comprises:
the predicted deviation value is calculated using the following equation for the loss function:
L(S_m, S_{m+1}) = -((m+1-v)log(S_m) + (v-m)log(S_{m+1}))
wherein m = int(v), v is the first parameter value, int is a rounding operation, S_m is the m-th vector element of the second predictive probability vector, and S_{m+1} is the (m+1)-th vector element of the second predictive probability vector.
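The claim-14 formula can be implemented directly; the sketch below is illustrative. With m = int(v), the loss weights the log-probabilities of the two discrete values bracketing v by their distances to v.

```python
import math

def interpolated_loss(S, v):
    """Loss of claim 14: L(S_m, S_{m+1}) = -((m+1-v)log(S_m) + (v-m)log(S_{m+1}))."""
    m = int(v)
    return -((m + 1 - v) * math.log(S[m]) + (v - m) * math.log(S[m + 1]))

S = [0.05, 0.10, 0.60, 0.20, 0.05]       # second predictive probability vector
print(interpolated_loss(S, v=2.3))       # weights log(S[2]) by 0.7 and log(S[3]) by 0.3
```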
15. An electronic device, comprising:
one or more processors coupled with a memory, the memory having one or more computer programs stored therein; the one or more computer programs, when executed by the processor, cause the electronic device to perform the method of any of claims 1-8.
16. An electronic device, comprising:
one or more processors coupled with a memory, the memory having one or more computer programs stored therein; the one or more computer programs, when executed by the processor, cause the electronic device to perform the method of any of claims 9-14.
17. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when run on a computer, causes the computer to perform the method according to any of claims 1 to 8.
18. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when run on a computer, causes the computer to perform the method of any of claims 9 to 14.
CN202210748481.9A 2022-06-28 2022-06-28 Parameter detection method and device Pending CN117351463A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210748481.9A CN117351463A (en) 2022-06-28 2022-06-28 Parameter detection method and device
PCT/CN2023/085382 WO2024001365A1 (en) 2022-06-28 2023-03-31 Parameter measurement method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210748481.9A CN117351463A (en) 2022-06-28 2022-06-28 Parameter detection method and device

Publications (1)

Publication Number Publication Date
CN117351463A true CN117351463A (en) 2024-01-05

Family

ID=89356236

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210748481.9A Pending CN117351463A (en) 2022-06-28 2022-06-28 Parameter detection method and device

Country Status (2)

Country Link
CN (1) CN117351463A (en)
WO (1) WO2024001365A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113196279B (en) * 2018-12-18 2024-02-09 华为技术有限公司 Facial attribute identification method and electronic equipment
CN110070067B (en) * 2019-04-29 2021-11-12 北京金山云网络技术有限公司 Video classification method, training method and device of video classification method model and electronic equipment
US11347308B2 (en) * 2019-07-26 2022-05-31 Samsung Electronics Co., Ltd. Method and apparatus with gaze tracking
CN112257503A (en) * 2020-09-16 2021-01-22 深圳微步信息股份有限公司 Sex age identification method, device and storage medium
CN112307900A (en) * 2020-09-27 2021-02-02 北京迈格威科技有限公司 Method and device for evaluating facial image quality and electronic equipment
CN113780249B (en) * 2021-11-10 2022-02-15 腾讯科技(深圳)有限公司 Expression recognition model processing method, device, equipment, medium and program product

Also Published As

Publication number Publication date
WO2024001365A1 (en) 2024-01-04

Similar Documents

Publication Publication Date Title
US12136210B2 (en) Image processing method and apparatus
US11797084B2 (en) Method and apparatus for training gaze tracking model, and method and apparatus for gaze tracking
US20210382542A1 (en) Screen wakeup method and apparatus
CN108491775B (en) Image correction method and mobile terminal
CN108230291B (en) Object recognition system training method, object recognition method, device and electronic equipment
US20200380279A1 (en) Method and apparatus for liveness detection, electronic device, and storage medium
CN105512632A (en) In vivo detection method and device
WO2020124993A1 (en) Liveness detection method and apparatus, electronic device, and storage medium
CN115565212B (en) Image processing method, neural network model training method and device
CN113435568A (en) Computer-implemented method and electronic system for training neural networks
CN111589138B (en) Action prediction method, device, equipment and storage medium
CN111612723B (en) Image restoration method and device
CN107423663A (en) A kind of image processing method and terminal
CN116665282A (en) Face recognition model training method, face recognition method and device
US12094075B2 (en) Electronic device generating image and method for operating the same
CN111091102B (en) Video analysis device, server, system and method for protecting identity privacy
CN114760417B (en) Image shooting method and device, electronic equipment and storage medium
US11710353B2 (en) Spoof detection based on challenge response analysis
CN117351463A (en) Parameter detection method and device
CN113505756A (en) Face living body detection method and device
WO2021189321A1 (en) Image processing method and device
WO2022227916A1 (en) Image processing method, image processor, electronic device, and storage medium
WO2023137923A1 (en) Person re-identification method and apparatus based on posture guidance, and device and storage medium
CN114973347B (en) Living body detection method, device and equipment
KR20140095601A (en) Pose classification apparatus and pose classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination