
WO2020125623A1 - Method and device for live body detection, storage medium, and electronic device - Google Patents

Method and device for live body detection, storage medium, and electronic device Download PDF

Info

Publication number
WO2020125623A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional color
image
depth
face
color image
Prior art date
Application number
PCT/CN2019/125957
Other languages
French (fr)
Chinese (zh)
Inventor
侯允
刘耀勇
陈岩
Original Assignee
上海瑾盛通信科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海瑾盛通信科技有限公司
Publication of WO2020125623A1 publication Critical patent/WO2020125623A1/en

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 — Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 — Detection; Localisation; Normalisation
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 — Spoof detection, e.g. liveness detection
    • G06V40/45 — Detection of the body part being alive

Definitions

  • the present application relates to the technical field of face recognition, in particular to a living body detection method, device, storage medium and electronic equipment.
  • electronic devices use relevant face recognition technology not only to distinguish between individual users, but also to perform living body detection on users. For example, an electronic device can obtain an image of the user's face (such as an RGB-D image captured through a depth camera, for example a structured light camera or a time-of-flight camera) and determine from it whether the user's face is a living face.
  • the embodiments of the present application provide a living body detection method, device, storage medium, and electronic equipment, which can reduce the hardware cost of the electronic equipment for living body detection.
  • an embodiment of the present application provides a living body detection method, which is applied to an electronic device, the electronic device includes a monocular camera, and the living body detection method includes:
  • shooting a face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;
  • inputting the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;
  • inputting the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.
  • an embodiment of the present application provides a living body detection device, which is applied to an electronic device, the electronic device includes a monocular camera, and the living body detection device includes:
  • a color image acquisition module configured to shoot the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected
  • a depth image acquisition module configured to input the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;
  • a living body face detection module configured to input the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.
  • an embodiment of the present application provides a storage medium on which a computer program is stored; when the computer program runs on a computer, it causes the computer to execute:
  • shooting a face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected;
  • inputting the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;
  • inputting the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.
  • an embodiment of the present application provides an electronic device including a processor, a memory, and a monocular camera.
  • the memory has a computer program
  • the processor is configured to execute, by calling the computer program:
  • shooting a face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;
  • inputting the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;
  • inputting the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.
  • FIG. 1 is a schematic flowchart of a living body detection method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of the living body detection performed by the electronic device through the living body detection model in the embodiment of the present application.
  • FIG. 3 is another schematic flowchart of the living body detection method provided by the embodiment of the present application.
  • FIG. 4 is a schematic diagram of constructing a training sample set in an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a living body detection device provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 7 is another schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the related art proposes a living body detection technology based on a depth camera, such as a structured light camera or a time-of-flight camera; however, its implementation requires that the electronic device be equipped with an additional depth camera, which increases the hardware cost of living body detection.
  • the embodiments of the present application firstly provide a living body detection method, which realizes living body detection based on a monocular camera commonly configured in electronic devices, without increasing the hardware cost of the electronic devices.
  • the execution subject of the living body detection method may be the living body detection device provided in the embodiment of the present application, or an electronic device integrated with the living body detection device.
  • the living body detection device may be implemented by hardware or software, and the electronic device may be a smartphone, tablet computer, PDA, notebook computer, or desktop computer that is equipped with a processor and has processing capability.
  • An embodiment of the present application provides a living body detection method, including:
  • the two-dimensional color image and the depth image are input into a pre-trained living body detection model for living body detection, and a detection result is obtained.
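The three steps of the method can be sketched end to end. The code below is an illustrative stand-in only: `estimate_depth` and `detect_liveness` are hypothetical placeholders for the pre-trained depth estimation model and living body detection model described in this application.

```python
import numpy as np

def estimate_depth(color_image):
    """Stand-in for the pre-trained depth estimation model: maps an
    H x W x 3 color image to an H x W depth image (same resolution)."""
    # Hypothetical placeholder: uses luminance as a fake depth cue.
    return color_image.mean(axis=2)

def detect_liveness(color_image, depth_image):
    """Stand-in for the pre-trained living body detection model."""
    # Hypothetical placeholder: a real model would run a CNN here.
    return bool(depth_image.std() > 0.0)

# Step 1: the monocular camera yields a 2-D color image (random stand-in).
color = np.random.rand(64, 64, 3)
# Step 2: estimate the corresponding depth image from the color image alone.
depth = estimate_depth(color)
# Step 3: feed both images into the liveness model to get the detection result.
result = detect_liveness(color, depth)
```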
  • the living body detection model is a convolutional neural network model including a convolutional layer, a pooling layer, and a fully connected layer connected in sequence, and inputting the two-dimensional color image and the depth image into the pre-trained living body detection model to obtain the detection result includes:
  • inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction to obtain the joint global features of the two-dimensional color image and the depth image includes:
  • preprocessing the two-dimensional color image to obtain the face area image in the two-dimensional color image includes:
  • An ellipse template, a circular template or a rectangular template is used to extract the face area image from the two-dimensional color image.
  • before the face to be detected is captured through the monocular camera to obtain a two-dimensional color image of the face to be detected, the method further includes:
  • a plurality of different living human faces are photographed by the monocular camera to obtain multiple two-dimensional color live human face sample images, and a depth image corresponding to each of the two-dimensional color live human face sample images is obtained to obtain multiple first depth images;
  • a plurality of different non-living human faces are photographed through the monocular camera to obtain multiple two-dimensional color non-living human face sample images, and a depth image corresponding to each of the two-dimensional color non-living human face sample images is obtained to obtain multiple second depth images;
  • a convolutional neural network is used to perform model training on the training sample set to obtain the convolutional neural network model.
  • before the convolutional neural network is used to perform model training on the training sample set to obtain the convolutional neural network model, the method further includes:
  • acquiring the depth image corresponding to each of the two-dimensional color live human face sample images to obtain multiple first depth images includes:
  • a depth image corresponding to each two-dimensional color live human face sample image is generated to obtain a plurality of first depth images.
  • the living body detection method further includes:
  • each of the two-dimensional color live human face sample images and each of the two-dimensional color non-live human face sample images is used as a training input, the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image are used as target outputs, and supervised model training is performed to obtain the depth estimation model.
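As a minimal sketch of this supervised setup, the snippet below fits a per-pixel linear map from color to depth by gradient descent on squared error; the linear model and the synthetic data are stand-ins for the real depth estimation network and sample images, not the patent's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training pairs: color sample images as inputs, their
# corresponding (first / second) depth images as supervision targets.
colors = rng.random((32, 8, 8, 3))                 # 32 tiny 8x8 color samples
true_w = np.array([0.3, 0.6, 0.1])                 # synthetic ground truth
depths = colors @ true_w                           # target depth images

# A per-pixel linear model depth ~ w.RGB is the simplest stand-in for the
# depth estimation network; fit it by gradient descent on squared error.
w = np.zeros(3)
lr = 0.5
for _ in range(200):
    residual = colors @ w - depths                 # shape (32, 8, 8)
    grad = np.einsum('nhwc,nhw->c', colors, residual) / depths.size
    w -= lr * grad

mse = float(((colors @ w - depths) ** 2).mean())   # should be near zero
```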
  • before inputting the two-dimensional color image into the pre-trained depth estimation model for depth estimation, the method further includes:
  • FIG. 1 is a schematic flowchart of a living body detection method provided by an embodiment of the present application.
  • the flow of the living body detection method provided by the embodiment of the present application may be as follows:
  • a face to be detected is photographed through a monocular camera to obtain a two-dimensional color image of the face to be detected.
  • the electronic device can photograph the face to be detected through its configured monocular camera when it receives an operation that requires face recognition for user identity detection, such as an unlock operation or a payment operation based on face recognition. Since the monocular camera is only sensitive to two-dimensional color information, a two-dimensional color image of the face to be detected is captured.
  • an electronic device may be configured with a front monocular camera (also commonly known as a front camera) and a rear monocular camera (also commonly known as a rear camera);
  • the imaging capability of the rear monocular camera is usually higher than that of the front monocular camera. When the electronic device shoots the face to be detected through a monocular camera, it can by default perform the shooting operation through the front monocular camera; it can also by default perform the shooting operation through the rear monocular camera; or it can predict, based on real-time pose information, which of the front and rear monocular cameras is facing the face to be detected, so that the shooting operation is automatically performed by the camera facing the face to be detected.
  • for example, suppose the current unlocking method adopted by the electronic device is "face unlocking"; when the electronic device receives the trigger operation for face unlocking, it by default uses the front monocular camera to shoot the face to be detected, thereby obtaining a two-dimensional color image of the face to be detected.
  • similarly, suppose the payment method currently adopted by the electronic device is "face-swiping payment"; when the electronic device receives the trigger operation for face-swiping payment, it by default shoots the face to be detected through the front monocular camera, thereby obtaining a two-dimensional color image of the face to be detected.
  • the captured two-dimensional color image is input to a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image.
  • a depth estimation model for depth estimation is pre-trained, where the depth estimation model may be stored locally in the electronic device or may be stored in a remote server.
  • after acquiring the two-dimensional color image of the face to be detected, the electronic device calls the pre-trained depth estimation model locally or from the remote server, inputs the two-dimensional color image of the face to be detected into the depth estimation model, and uses the depth estimation model to perform depth estimation on the two-dimensional color image to obtain a depth image corresponding to the two-dimensional color image.
  • the resolution of the estimated depth image is the same as the resolution of the two-dimensional color image.
  • the pixel value of each pixel in the depth image describes the distance from the corresponding pixel in the two-dimensional color image to the aforementioned monocular camera (that is, the monocular camera that captured the two-dimensional color image).
  • for example, after obtaining the two-dimensional color image of the face to be detected through the front monocular camera, the electronic device calls a locally stored, pre-trained depth estimation model and uses it to perform depth estimation on the two-dimensional color image, obtaining a depth image corresponding to the two-dimensional color image.
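To make the pixel-value convention concrete, the sketch below converts a normalized depth map into camera distances. The 300 mm to 800 mm working range is an illustrative assumption; real models may emit metric depth directly.

```python
import numpy as np

# A depth image with the same resolution as a (2x2) color image; each pixel
# value describes the distance of the corresponding color pixel from the
# monocular camera. Here the model is assumed to emit values in [0, 1].
depth_normalized = np.array([[0.0, 0.25],
                             [0.5, 1.0]])

# Hypothetical conversion to millimeters, assuming a 300 mm - 800 mm
# working range (the range itself is an illustrative assumption).
near_mm, far_mm = 300.0, 800.0
depth_mm = near_mm + depth_normalized * (far_mm - near_mm)
```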
  • a two-dimensional color image and its corresponding depth image are input into a pre-trained living body detection model for living body detection, and a detection result is obtained.
  • the living body detection model for living body detection is also pre-trained; the living body detection model may be stored locally in the electronic device or may be stored in a remote server.
  • the pre-trained living body detection model is called locally or from the remote server, and the previously acquired two-dimensional color image and its corresponding depth image are input into the pre-trained living body detection model.
  • the living body detection model performs living body detection on the face to be detected based on the input two-dimensional color image and the corresponding depth image, obtaining a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.
  • the electronic device calls a locally stored, pre-trained depth estimation model and uses it to perform depth estimation on the two-dimensional color image to obtain a corresponding depth image; it then calls the locally stored, pre-trained living body detection model and inputs the previously obtained two-dimensional color image and its corresponding depth image into the living body detection model for living body detection, obtaining a detection result. If the detection result is that the face to be detected is a living face, the face to be detected is the real face of a person with vital signs; if the detection result is that the face to be detected is a non-living face, the face to be detected is not the real face of a person with vital signs and may be, for example, a previously captured face image or face video.
  • the electronic device in the embodiment of the present application can first obtain the two-dimensional color image of the face to be detected by the configured monocular camera, and then input the obtained two-dimensional color image into the pre-trained depth estimation model Perform depth estimation to obtain a depth image corresponding to a two-dimensional color image, and finally input the previously obtained two-dimensional color image and its corresponding depth image into a pre-trained living body detection model for living body detection to obtain a detection result.
  • the electronic device can thus realize living body detection using a commonly configured monocular camera rather than an additional depth camera, which reduces the hardware cost of living body detection.
  • FIG. 3 is another schematic flowchart of the living body detection method provided by the embodiment of the present application.
  • the living body detection method may be applied to an electronic device, and the flow of the living body detection method may include:
  • the electronic device trains a depth estimation model and a living body detection model in advance using a machine learning algorithm, where the living body detection model is a convolutional neural network model.
  • the electronic device uses a machine learning algorithm to train the two models in advance. It should be noted that after training the depth estimation model and the living body detection model, the electronic device may store both models locally, may store both models on a remote server, or may store one of them locally and the other on a remote server.
  • machine learning algorithms can include: decision tree models, logistic regression models, Bayesian models, neural network models, clustering models, and so on.
  • machine learning algorithms can be divided according to various situations. For example, based on the learning method, machine learning algorithms can be divided into: supervised learning algorithms, unsupervised learning algorithms, semi-supervised learning algorithms, reinforcement learning algorithms, and so on.
  • under supervised learning, the input data is called "training data", and each set of training data has a clear label or result, such as "spam" and "non-spam" in an anti-spam system, or the digits "1, 2, 3, 4" in handwritten digit recognition.
  • Common application scenarios of supervised learning are classification problems and regression problems.
  • Common algorithms are Logistic Regression and Backward Propagation Neural Network.
  • in unsupervised learning, the data is not specifically labeled; the model has to infer some internal structure of the data.
  • Common application scenarios include association rule learning and clustering.
  • Common algorithms include Apriori algorithm and k-Means algorithm.
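A minimal k-Means sketch on unlabeled two-cluster data illustrates the unsupervised setting; the synthetic data and the deterministic initialization (first and last point) are choices made purely to keep the example stable.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unlabeled data: two well-separated groups (no labels are given).
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                  rng.normal(5.0, 0.1, (20, 2))])

# Minimal k-Means with k=2: alternate point assignment and centroid update.
centroids = np.array([data[0], data[-1]])
for _ in range(10):
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    centroids = np.array([data[labels == k].mean(axis=0) for k in range(2)])
```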
  • Semi-supervised learning algorithm: in this learning mode, the input data is partially labeled.
  • This learning model can be used for type recognition, but the model first needs to learn the internal structure of the data in order to reasonably organize the data for prediction.
  • Application scenarios include classification and regression.
  • such algorithms include extensions of commonly used supervised learning algorithms, which first attempt to model the unlabeled data and then make predictions on the labeled data on that basis, such as graph inference algorithms or the Laplacian support vector machine (Laplacian SVM).
  • Reinforcement learning algorithm: in this learning mode, the input data serves as feedback to the model. Unlike in supervised learning, where input data is only used to check whether the model is right or wrong, under reinforcement learning the input data is fed back directly to the model, and the model must adjust immediately.
  • Common application scenarios include dynamic systems and robot control.
  • Common algorithms include Q-Learning and Temporal Difference Learning.
  • Regression algorithms: common regression algorithms include Ordinary Least Squares, Logistic Regression, Stepwise Regression, Multivariate Adaptive Regression Splines, and Locally Estimated Scatterplot Smoothing.
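As a concrete instance, Ordinary Least Squares can be solved directly with a least-squares solver; the tiny synthetic dataset below is illustrative only.

```python
import numpy as np

# Ordinary Least Squares on a tiny synthetic dataset: fit y = a*x + b.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix with an intercept column; solve min ||X w - y||^2.
X = np.column_stack([x, np.ones_like(x)])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b = w
```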
  • Common algorithms also include k-Nearest Neighbor (KNN), Learning Vector Quantization (LVQ), and Self-Organizing Map (SOM).
  • Regularization methods: common algorithms include Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net.
  • Decision tree algorithms: common algorithms include Classification And Regression Tree (CART), ID3 (Iterative Dichotomiser 3), C4.5, Chi-squared Automatic Interaction Detection (CHAID), Decision Stump, Random Forest, Multivariate Adaptive Regression Splines (MARS), and Gradient Boosting Machine (GBM).
  • Bayesian algorithms: common algorithms include the Naive Bayesian algorithm, Averaged One-Dependence Estimators (AODE), and Bayesian Belief Network (BBN).
  • a convolutional neural network is used to train the living body detection model, that is, the living body detection model is a convolutional neural network model, where the convolutional neural network model includes a convolutional layer, a pooling layer, and a fully connected layer.
  • the electronic device shoots the face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected.
  • the electronic device may photograph the face to be detected through its configured monocular camera when it receives an operation that requires face recognition for user identity detection, such as an unlock operation or a payment operation based on face recognition. Since the monocular camera is only sensitive to two-dimensional color information, a two-dimensional color image of the face to be detected is captured.
  • an electronic device may be configured with a front monocular camera (also commonly known as a front camera) and a rear monocular camera (also commonly known as a rear camera);
  • the imaging capability of the rear monocular camera is usually higher than that of the front monocular camera. When the electronic device shoots the face to be detected through a monocular camera, it can by default perform the shooting operation through the front monocular camera; it can also by default perform the shooting operation through the rear monocular camera; or it can predict, based on real-time pose information, which of the front and rear monocular cameras is facing the face to be detected, so that the shooting operation is automatically performed by the camera facing the face to be detected.
  • for example, suppose the current unlocking method adopted by the electronic device is "face unlocking"; when the electronic device receives the trigger operation for face unlocking, it by default uses the front monocular camera to shoot the face to be detected, thereby obtaining a two-dimensional color image of the face to be detected.
  • similarly, suppose the payment method currently adopted by the electronic device is "face-swiping payment"; when the electronic device receives the trigger operation for face-swiping payment, it by default shoots the face to be detected through the front monocular camera, thereby obtaining a two-dimensional color image of the face to be detected.
  • the electronic device inputs the captured two-dimensional color image into a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image.
  • after acquiring the two-dimensional color image of the face to be detected through the monocular camera, the electronic device calls the pre-trained depth estimation model locally or from a remote server, inputs the two-dimensional color image of the face to be detected into the depth estimation model, and uses the depth estimation model to perform depth estimation on the two-dimensional color image to obtain a depth image corresponding to the two-dimensional color image.
  • the resolution of the estimated depth image is the same as the resolution of the two-dimensional color image.
  • the pixel value of each pixel in the depth image describes the distance from the corresponding pixel in the two-dimensional color image to the aforementioned monocular camera (that is, the monocular camera that captured the two-dimensional color image).
  • for example, after obtaining the two-dimensional color image of the face to be detected through the front monocular camera, the electronic device calls a locally stored, pre-trained depth estimation model and uses it to perform depth estimation on the two-dimensional color image, obtaining a depth image corresponding to the two-dimensional color image.
  • the electronic device inputs the aforementioned two-dimensional color image and its corresponding depth image into the convolutional layer of the convolutional neural network model for feature extraction, obtaining the joint global features of the two-dimensional color image and the depth image.
  • after inputting the two-dimensional color image captured by the monocular camera into the pre-trained depth estimation model and obtaining the depth image corresponding to the two-dimensional color image, the electronic device calls the pre-trained living body detection model locally or from the remote server.
  • the electronic device inputs the aforementioned two-dimensional color image and its corresponding depth image into the convolutional layer of the convolutional neural network model for feature extraction (feature extraction maps the original image data to the hidden-layer feature space, thereby obtaining the corresponding global features), obtaining the global features of the two-dimensional color image and the global features of the depth image.
  • the global features of the two-dimensional color image and the global features of the depth image are combined in the convolutional layer to obtain the joint global features of the foregoing two-dimensional color image and the foregoing depth image.
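Inside the network, joining the two sets of global features amounts to concatenating feature maps along the channel axis; the channel counts (16 and 8) and spatial size (32x32) below are illustrative assumptions, not values from this application.

```python
import numpy as np

# Feature maps produced by the convolutional layer for the color image and
# for the depth image; the channel counts (16 and 8) are illustrative.
color_features = np.random.rand(16, 32, 32)
depth_features = np.random.rand(8, 32, 32)

# Joining the two sets of global features = concatenation along channels.
joint_features = np.concatenate([color_features, depth_features], axis=0)
```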
  • the electronic device inputs the obtained joint global features into the pooling layer of the convolutional neural network model to perform feature dimensionality reduction, obtaining the dimensionality-reduced joint global features.
  • the joint global features of the two-dimensional color image and the depth image output by the convolutional layer are input into the pooling layer of the convolutional neural network model, which performs downsampling; downsampling retains the salient components of the joint global features and achieves their feature dimensionality reduction.
  • downsampling can be achieved by means of maximum pooling or mean pooling.
  • the joint global features are subjected to feature dimensionality reduction through the pooling layer, obtaining dimensionality-reduced joint global features (for example, of dimension 10*10).
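A 2x2 max-pooling step sketches this dimensionality reduction, shrinking an assumed 20x20 feature map to 10x10 while keeping the strongest activation in each block (the input size is an assumption for illustration).

```python
import numpy as np

# An assumed 20x20 feature map; 2x2 max pooling halves each spatial
# dimension (20x20 -> 10x10), retaining the maximum activation per block.
feature = np.arange(400, dtype=float).reshape(20, 20)
pooled = feature.reshape(10, 2, 10, 2).max(axis=(1, 3))
```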
  • the electronic device inputs the dimensionality-reduced joint global features into the fully connected layer of the convolutional neural network model for classification processing, obtaining a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.
  • the fully connected layer is used to implement the function of the classifier.
  • Each node of the fully connected layer is connected to all output nodes of the pooling layer.
  • a node of the fully connected layer is called a neuron in the fully connected layer.
  • the number of neurons in the fully connected layer can be determined according to actual application requirements; for example, the number of neurons in the fully connected layer can be set to 4096.
  • the dimensionality-reduced joint global features output by the pooling layer are input into the fully connected layer for classification processing, obtaining a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.
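The classification step can be sketched as a fully connected layer followed by a softmax over the two classes (living / non-living); the weights below are random stand-ins, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensionality-reduced joint global features, flattened for the fully
# connected layer (10*10 = 100 inputs); weights are random stand-ins.
features = rng.random(100)
W = rng.normal(size=(2, 100))        # two output neurons: living / non-living
b = np.zeros(2)

# Fully connected layer followed by a numerically stable softmax.
logits = W @ features + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()
is_living = bool(probs[0] > probs[1])
```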
  • the electronic device preprocesses the two-dimensional color image to obtain the face area image in the two-dimensional color image;
  • the electronic device preprocesses the aforementioned depth image to obtain the face area image in the aforementioned depth image;
  • the electronic device inputs the face area image in the two-dimensional color image and the face area image in the depth image into the convolutional layer for feature extraction to obtain the joint global features of the two-dimensional color image and the depth image.
  • when the electronic device inputs the aforementioned two-dimensional color image and its corresponding depth image into the convolutional layer of the convolutional neural network model for feature extraction, it does not input the original two-dimensional color image and the original depth image directly; instead, the two-dimensional color image and the depth image are each preprocessed to obtain the face area image in the two-dimensional color image and the face area image in the depth image.
  • for example, an elliptical template, a circular template, or a rectangular template can be used to extract the face area image from the two-dimensional color image and from the depth image respectively, thereby obtaining the face area image in the two-dimensional color image and the face area image in the depth image.
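Both template styles can be sketched as follows; the bounding box coordinates are hypothetical and would come from a face detector in practice.

```python
import numpy as np

# Stand-in color and depth images, plus a hypothetical face bounding box
# (top, left, height, width) that would come from a face detector.
color = np.random.rand(100, 100, 3)
depth = np.random.rand(100, 100)
top, left, h, w = 20, 30, 50, 40

# Rectangular template: crop the face region from both images.
face_color = color[top:top + h, left:left + w]
face_depth = depth[top:top + h, left:left + w]

# Elliptical template: keep only pixels inside the ellipse inscribed in
# the same box, zeroing out the background corners.
yy, xx = np.ogrid[:h, :w]
ellipse = ((yy - h / 2) / (h / 2)) ** 2 + ((xx - w / 2) / (w / 2)) ** 2 <= 1.0
face_color_ellipse = face_color * ellipse[..., None]
```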
  • the electronic device shoots a plurality of different living human faces through the monocular camera to obtain multiple two-dimensional color live human face sample images, and obtains a depth image corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images;
  • the electronic device shoots multiple different non-living human faces through the monocular camera to obtain multiple two-dimensional color non-living human face sample images, and obtains a depth image corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images;
  • the electronic device uses each two-dimensional color live human face sample image and its corresponding first depth image as a positive sample, and each two-dimensional color non-living human face sample image and its corresponding second depth image as a negative sample, to construct a training sample set;
  • the electronic device adopts a convolutional neural network to perform model training on the training sample set to obtain a convolutional neural network model as a living body detection model.
  • the electronic device can shoot the faces of users of different skin colors, genders, and ages (i.e., live faces) through its configured monocular camera to obtain multiple two-dimensional color live human face sample images.
  • the electronic device also obtains a depth image corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images.
  • the electronic device can also be connected to an external depth camera, and the monocular camera and the external depth camera shoot simultaneously. In this way, while the electronic device obtains the two-dimensional color live human face sample image through the monocular camera, the external depth camera captures the depth image of the same live human face; the captured depth image is then aligned with the two-dimensional color live human face sample image, and the aligned depth image is recorded as the first depth image of that two-dimensional color live human face sample image.
  • the electronic device can also shoot different non-living human faces, such as facial images, facial videos, human face masks, and human head models, through its configured monocular camera to obtain multiple two-dimensional color non-living human face sample images.
  • the electronic device also obtains a depth image corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images.
  • the electronic device can also be connected to an external depth camera, and the monocular camera and the external depth camera shoot simultaneously. In this way, while the electronic device obtains the two-dimensional color non-living human face sample image through the monocular camera, the external depth camera captures the depth image of the same non-living human face; the captured depth image is then aligned with the two-dimensional color non-living human face sample image, and the aligned depth image is recorded as the second depth image of that two-dimensional color non-living human face sample image.
  • each two-dimensional color live human face sample image and its corresponding first depth image are used as a positive sample, and each two-dimensional color non-live human face sample image and its corresponding second depth image are used as a negative sample, to construct a training sample set, as shown in Figure 4.
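The construction of the training sample set described above can be sketched as follows (a minimal illustration with hypothetical names; real samples would be captured images rather than zero arrays):

```python
import numpy as np

def build_training_set(live_pairs, nonlive_pairs):
    """Pair each sample image with its depth image and a liveness label."""
    samples = []
    for color, first_depth in live_pairs:        # positive samples, label 1
        samples.append((color, first_depth, 1))
    for color, second_depth in nonlive_pairs:    # negative samples, label 0
        samples.append((color, second_depth, 0))
    return samples

live = [(np.zeros((64, 64, 3)), np.zeros((64, 64))) for _ in range(3)]
fake = [(np.zeros((64, 64, 3)), np.zeros((64, 64))) for _ in range(2)]
train_set = build_training_set(live, fake)
print(len(train_set))                            # 5 samples in total
print(sum(label for _, _, label in train_set))   # 3 positive samples
```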
  • after completing the construction of the training sample set, the electronic device uses a convolutional neural network to perform model training on the constructed training sample set to obtain a convolutional neural network model, which serves as the living body detection model for living body detection.
  • when the convolutional neural network is used to perform model training on the constructed training sample set, a supervised learning method or an unsupervised learning method may be used, which can be selected by a person of ordinary skill in the art according to actual needs.
  • before the convolutional neural network is used to perform model training on the training sample set to obtain the convolutional neural network model serving as the living body detection model, the method further includes:
  • the electronic device performs sample expansion processing on the training sample set according to a preset sample expansion strategy.
  • the sample expansion of the training sample set can increase the diversity of the samples, so that the trained convolutional neural network model has stronger robustness.
  • the sample expansion strategy may be set to perform one or more of small rotation, scaling, and inversion on the positive samples/negative samples in the training sample set.
  • for example, a two-dimensional color live human face sample image and its corresponding first depth image can be rotated by the same amplitude to obtain a rotated two-dimensional color live human face sample image and a rotated first depth image.
  • the rotated two-dimensional color live human face sample image and the rotated first depth image together form a new positive sample.
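A minimal sketch of this sample expansion, assuming NumPy arrays; only horizontal inversion is shown here, and small rotations or scaling would be applied to the pair in exactly the same way so that the color image and its depth image stay aligned:

```python
import numpy as np

def expand_pair(color, depth):
    """Return the original pair plus a horizontally inverted copy.

    The same transform is applied to both images so the pair stays aligned.
    """
    flipped = (np.flip(color, axis=1).copy(), np.flip(depth, axis=1).copy())
    return [(color, depth), flipped]

color = np.arange(12).reshape(2, 2, 3)   # toy 2x2 color sample image
depth = np.arange(4).reshape(2, 2)       # its toy first depth image
augmented = expand_pair(color, depth)
print(len(augmented))  # 2 aligned pairs: original + flipped
```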
  • when acquiring the depth image corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images, the following may be performed:
  • the electronic device receives the calibrated distance from each pixel in each two-dimensional color live human face sample image to the monocular camera;
  • the electronic device generates a depth image corresponding to each two-dimensional color live human face sample image according to the distance from each pixel in each two-dimensional color live human face sample image to the monocular camera, and obtains a plurality of first depth images.
  • the distance from each pixel point in a two-dimensional color live human face sample image to the monocular camera can be manually calibrated, and the electronic device generates a depth image corresponding to that two-dimensional color live human face sample image according to the distance from each pixel in the sample image to the monocular camera, and records it as the first depth image.
  • the electronic device can receive the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera, and according to the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera , A depth image corresponding to each two-dimensional color live human face sample image is generated, and multiple first depth images are obtained.
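A minimal sketch of generating a depth image from the calibrated per-pixel distances, assuming the distances are available as a NumPy array (the millimetre unit and working range are illustrative assumptions):

```python
import numpy as np

def distances_to_depth_image(distances_mm, max_range_mm=4000):
    """Clip the calibrated distances to a working range and store them
    as 16-bit depth values (millimetres are an illustrative unit)."""
    clipped = np.clip(distances_mm, 0, max_range_mm)
    return clipped.astype(np.uint16)

# manually calibrated pixel-to-camera distances for one sample image
dists = np.array([[500.0, 750.5],
                  [9999.0, 1200.0]])
depth = distances_to_depth_image(dists)
print(depth.dtype, depth.max())  # uint16 4000
```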
  • when acquiring the depth image corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images, the following may be performed:
  • the electronic device receives the calibrated distance from each pixel in the two-dimensional color non-living face sample image to the monocular camera;
  • the electronic device generates a depth image corresponding to each two-dimensional color non-living human face sample image according to the distance from each pixel in each two-dimensional color non-living human face sample image to the monocular camera, and obtains a plurality of second depth images.
  • when a machine learning algorithm is used to obtain the depth estimation model, the following may be performed:
  • the electronic device uses each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image as training inputs, uses the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image as the target outputs, and performs supervised model training to obtain the depth estimation model.
  • the electronic device can use the multiple acquired two-dimensional color live human face sample images and their corresponding first depth images, together with the multiple two-dimensional color non-live human face sample images and their corresponding second depth images, to train the depth estimation model.
  • the electronic device can directly use each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image as training inputs, use the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image as the target outputs, and perform supervised model training to obtain the depth estimation model.
  • the electronic device uses the two-dimensional color live human face sample image as a training input, and uses the first depth image of the two-dimensional color live human face sample image as a corresponding target output ;
  • the electronic device uses the two-dimensional color non-living human face sample image as a training input, and uses the second depth image of the two-dimensional color non-living human face sample image as the corresponding target output.
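The input/target pairing of this supervised training can be illustrated with a deliberately simplified stand-in: a convolutional network would be used in practice, but here a per-pixel linear map fitted by least squares on synthetic data shows the same arrangement of training inputs and target outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
colors = rng.random((10, 8, 8, 3))        # live + non-live color sample images
true_w = np.array([0.3, 0.5, 0.2])
depths = colors @ true_w                  # synthetic first/second depth images

X = colors.reshape(-1, 3)                 # training inputs: color pixels
y = depths.reshape(-1)                    # target outputs: depth values
w, *_ = np.linalg.lstsq(X, y, rcond=None) # supervised fit of the estimator

pred = colors[0] @ w                      # depth estimate for one image
print(np.allclose(pred, depths[0]))       # True on this synthetic data
```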
  • FIG. 5 is a schematic structural diagram of a living body detection device according to an embodiment of the present application.
  • the living body detection device is applied to an electronic device, the electronic device includes a monocular camera, the living body detection device includes a color image acquisition module 501, a depth image acquisition module 502, and a living face detection module 503, as follows:
  • the color image acquisition module 501 is used to shoot a face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected;
  • the depth image acquisition module 502 is used to input the captured two-dimensional color image into a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image;
  • the living body face detection module 503 is used to input a two-dimensional color image and its corresponding depth image into a pre-trained living body detection model for living body detection to obtain a detection result.
  • the living body detection model is a convolutional neural network model, which includes a convolution layer, a pooling layer, and a fully connected layer connected in sequence. When inputting the two-dimensional color image and its corresponding depth image into the pre-trained living body detection model for living body detection to obtain the detection result, the living face detection module 503 can be used to:
  • input the two-dimensional color image and its corresponding depth image into the convolution layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image;
  • input the joint global feature into the pooling layer for feature dimensionality reduction to obtain the joint global feature after dimensionality reduction;
  • input the joint global feature after dimensionality reduction into the fully connected layer for classification processing to obtain the detection result that the face to be detected is a living face, or the detection result that the face to be detected is a non-living face.
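A hypothetical NumPy sketch of this forward pass (layer shapes and kernel sizes are illustrative assumptions, not the trained model): the color image and its depth image are stacked into a four-channel input, the convolution layer extracts the joint feature, the pooling layer reduces its dimension, and the fully connected layer produces the two-class live/non-live scores:

```python
import numpy as np

rng = np.random.default_rng(1)

def conv2d(x, k):
    """Valid convolution of an HxWxC input with kh x kw x C x F kernels."""
    kh, kw, _, f = k.shape
    h, w, _ = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1, f))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw, :]
            out[i, j] = np.tensordot(patch, k, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(out, 0)                       # ReLU

def max_pool(x, s=2):
    h, w, c = x.shape
    return x[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s, c).max(axis=(1, 3))

color = rng.random((16, 16, 3))
depth = rng.random((16, 16, 1))
joint_in = np.concatenate([color, depth], axis=2)   # 4-channel joint input

kernels = rng.standard_normal((3, 3, 4, 8)) * 0.1
feat = conv2d(joint_in, kernels)                    # joint global feature
pooled = max_pool(feat)                             # feature dimensionality reduction
fc_w = rng.standard_normal((pooled.size, 2)) * 0.01
logits = pooled.reshape(-1) @ fc_w                  # live vs non-live scores
print(logits.shape)  # (2,)
```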
  • the living face detection module 503 may be used for:
  • the face area image in the two-dimensional color image and the face area image in the depth image are input to the convolutional layer for feature extraction to obtain a joint global feature of the two-dimensional color image and the depth image.
  • the living body detection device further includes a model training module, which is used to:
  • multiple different live human faces are captured through the monocular camera to obtain multiple two-dimensional color live human face sample images, and a depth image corresponding to each two-dimensional color live human face sample image is obtained to obtain multiple first depth images;
  • multiple different non-living human faces are captured through the monocular camera to obtain multiple two-dimensional color non-living human face sample images, and a depth image corresponding to each two-dimensional color non-living human face sample image is obtained to obtain multiple second depth images;
  • each two-dimensional color live human face sample image and its corresponding first depth image are used as a positive sample, and each two-dimensional color non-live human face sample image and its corresponding second depth image are used as a negative sample, to construct a training sample set;
  • a convolutional neural network is used to perform model training on the training sample set to obtain a convolutional neural network model as the living body detection model.
  • before the convolutional neural network is used to perform model training on the training sample set, the model training module is further used to: perform sample expansion processing on the training sample set according to a preset sample expansion strategy.
  • the model training module when acquiring depth images corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images, the model training module may be used to:
  • receive the calibrated distance from each pixel in each two-dimensional color live human face sample image to the monocular camera, and generate, according to those distances, a depth image corresponding to each two-dimensional color live human face sample image to obtain a plurality of first depth images.
  • the model training module when acquiring depth images corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images, the model training module may be used to:
  • receive the calibrated distance from each pixel in each two-dimensional color non-living human face sample image to the monocular camera, and generate, according to those distances, a depth image corresponding to each two-dimensional color non-living human face sample image to obtain a plurality of second depth images.
  • the model training module can also be used to:
  • use each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image as training inputs, use the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image as the target outputs, and perform supervised model training to obtain the depth estimation model.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the stored computer program is executed on a computer, it causes the computer to perform the steps in the living body detection method provided in this embodiment, or causes the computer to perform the steps in the model training method provided in this embodiment.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), and so on.
  • An embodiment of the present application also provides an electronic device, including a memory and a processor, where the processor, by calling a computer program stored in the memory, executes the steps in the living body detection method provided in this embodiment, or executes the steps in the model training method provided in this embodiment.
  • an electronic device is also provided.
  • the electronic device includes a processor 701, a memory 702, and a monocular camera 703.
  • the processor 701 is electrically connected to the memory 702 and the monocular camera 703.
  • the processor 701 is the control center of the electronic device; it uses various interfaces and lines to connect the various parts of the entire electronic device, and executes the various functions of the electronic device and processes data by running or loading the computer program stored in the memory 702 and calling the data stored in the memory 702.
  • the memory 702 may be used to store software programs and modules.
  • the processor 701 runs computer programs and modules stored in the memory 702 to execute various functional applications and data processing.
  • the memory 702 may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, computer programs required by at least one function (such as a sound playback function, an image playback function, etc.), and the like; the storage data area may store data created through the use of the electronic device, and the like.
  • the memory 702 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 702 may further include a memory controller to provide the processor 701 with access to the memory 702.
  • the monocular camera 703 may include a camera having one or more lenses and an image sensor, capable of capturing external image data.
  • the processor 701 in the electronic device loads the instructions corresponding to the processes of one or more computer programs into the memory 702 according to the following steps, and the processor 701 runs the computer programs stored in the memory 702 to implement the following functions:
  • the monocular camera 703 shoots the face to be detected to obtain a two-dimensional color image of the face to be detected;
  • the two-dimensional color image is input into a pre-trained depth estimation model for depth estimation to obtain a depth image corresponding to the two-dimensional color image; the two-dimensional color image and the corresponding depth image are input into a pre-trained living body detection model for living body detection, and the detection result is obtained.
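The functions above can be sketched end to end with stand-in model objects (all names are hypothetical; the real models are the pre-trained depth estimation model and living body detection model):

```python
import numpy as np

class FakeDepthEstimator:
    """Stand-in for the pre-trained depth estimation model."""
    def predict(self, color):
        return color.mean(axis=2)            # one depth value per pixel

class FakeLivenessModel:
    """Stand-in for the pre-trained living body detection model."""
    def predict(self, color, depth):
        return "live" if depth.std() > 0.01 else "non-live"

def detect(color, depth_model, live_model):
    depth = depth_model.predict(color)       # depth estimation step
    return live_model.predict(color, depth)  # living body detection step

frame = np.random.default_rng(2).random((32, 32, 3))  # monocular capture
result = detect(frame, FakeDepthEstimator(), FakeLivenessModel())
print(result)
```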
  • FIG. 7 is another schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device further includes components such as an input unit 704 and an output unit 705.
  • the input unit 704 can be used to receive input numbers, character information, or user characteristic information (such as fingerprints), and generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
  • the output unit 705 may be used to display information input by the user or information provided to the user, such as a screen.
  • the processor 701 in the electronic device loads the instructions corresponding to the processes of one or more computer programs into the memory 702 according to the following steps, and the processor 701 runs the computer programs stored in the memory 702 to implement the following functions:
  • the monocular camera 703 shoots the face to be detected to obtain a two-dimensional color image of the face to be detected;
  • the two-dimensional color image is input into a pre-trained depth estimation model for depth estimation to obtain a depth image corresponding to the two-dimensional color image; the two-dimensional color image and the corresponding depth image are input into a pre-trained living body detection model for living body detection, and the detection result is obtained.
  • the living body detection model is a convolutional neural network model, which includes a convolution layer, a pooling layer, and a fully connected layer connected in sequence. When inputting the two-dimensional color image and its corresponding depth image into the pre-trained living body detection model for living body detection to obtain the detection result, the processor 701 can execute:
  • the two-dimensional color image and its corresponding depth image are input into the convolution layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image;
  • the joint global feature will be input into the pooling layer for feature dimensionality reduction, and the joint global feature after dimensionality reduction will be obtained;
  • the joint global features after dimensionality reduction are input into the fully connected layer for classification processing to obtain the detection result that the face to be detected is a living face, or the detection result that the face to be detected is a non-living face.
  • the processor 701 may execute:
  • the face area image in the two-dimensional color image and the face area image in the depth image are input to the convolutional layer for feature extraction to obtain a joint global feature of the two-dimensional color image and the depth image.
  • the processor 701 may execute:
  • before the monocular camera 703 is used to photograph the face to be detected to obtain the two-dimensional color image of the face to be detected, the monocular camera 703 is used to photograph multiple different live human faces to obtain multiple two-dimensional color live human face sample images, and a depth image corresponding to each two-dimensional color live human face sample image is obtained to obtain multiple first depth images;
  • multiple different non-living human faces are captured by the monocular camera 703 to obtain multiple two-dimensional color non-living human face sample images, and a depth image corresponding to each two-dimensional color non-living human face sample image is obtained to obtain multiple second depth images;
  • each two-dimensional color live human face sample image and its corresponding first depth image are used as a positive sample, and each two-dimensional color non-live human face sample image and its corresponding second depth image are used as a negative sample, to construct a training sample set;
  • a convolutional neural network is used to perform model training on the training sample set to obtain a convolutional neural network model as the living body detection model.
  • before the convolutional neural network is used to perform model training on the training sample set, the processor 701 may execute: performing sample expansion processing on the training sample set according to a preset sample expansion strategy.
  • the processor 701 may execute:
  • the calibrated distance from each pixel in each two-dimensional color live human face sample image to the monocular camera is received, and according to those distances a depth image corresponding to each two-dimensional color live human face sample image is generated to obtain a plurality of first depth images.
  • the processor 701 may execute:
  • the calibrated distance from each pixel in each two-dimensional color non-living human face sample image to the monocular camera is received, and according to those distances a depth image corresponding to each two-dimensional color non-living human face sample image is generated to obtain a plurality of second depth images.
  • the processor 701 may also execute:
  • each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image are used as training inputs, the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image are used as the target outputs, and supervised model training is performed to obtain the depth estimation model.
  • the computer program may be stored in a computer-readable storage medium, such as the memory of an electronic device, and executed by at least one processor in the electronic device; the execution process may include, for example, the process of an embodiment of the living body detection method.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.
  • each functional module may be integrated into one processing chip, or each module may exist alone physically, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. If an integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.


Abstract

A method and device for live body detection, a storage medium, and an electronic device. The method comprises: first, photographing via a monocular camera a two-dimensional color image of a face to be detected (101), then, inputting the two-dimensional color image into a pretrained depth estimation model for depth estimation to produce a corresponding depth image (102), and finally, inputting the two-dimensional color image and the depth image corresponding thereto into a pretrained live body detection model for live body detection to produce a detection result (103).

Description

Living body detection method, device, storage medium and electronic equipment

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 20, 2018, with application number 201811565579.0 and entitled "Living body detection method, device, storage medium and electronic equipment", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the technical field of face recognition, and in particular to a living body detection method, device, storage medium and electronic equipment.

Background

At present, electronic devices use related face recognition technology not only to distinguish individual users, but also to perform living body detection on users. For example, an electronic device can obtain an RGB-D image of a user's face (such as a captured image of the user's face) through a depth camera such as a structured light camera or a time-of-flight camera, and can thereby determine whether the user's face is a living face.
Summary of the invention

The embodiments of the present application provide a living body detection method, device, storage medium, and electronic equipment, which can reduce the hardware cost of implementing living body detection on an electronic device.

In a first aspect, an embodiment of the present application provides a living body detection method applied to an electronic device, where the electronic device includes a monocular camera, and the living body detection method includes:

shooting the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;

inputting the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;

inputting the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.

In a second aspect, an embodiment of the present application provides a living body detection device applied to an electronic device, where the electronic device includes a monocular camera, and the living body detection device includes:

a color image acquisition module, configured to shoot the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;

a depth image acquisition module, configured to input the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;

a living body face detection module, configured to input the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.

In a third aspect, an embodiment of the present application provides a storage medium on which a computer program is stored; when the computer program runs on a computer, it causes the computer to execute:

shooting the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;

inputting the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;

inputting the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.

In a fourth aspect, an embodiment of the present application provides an electronic device including a processor, a memory, and a monocular camera, where the memory stores a computer program, and the processor, by calling the computer program, is configured to execute:

shooting the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;

inputting the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;

inputting the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.
Brief description of the drawings

FIG. 1 is a schematic flowchart of a living body detection method provided by an embodiment of the present application.

FIG. 2 is a schematic diagram of an electronic device performing living body detection through a living body detection model in an embodiment of the present application.

FIG. 3 is another schematic flowchart of the living body detection method provided by an embodiment of the present application.

FIG. 4 is a schematic diagram of constructing a training sample set in an embodiment of the present application.

FIG. 5 is a schematic structural diagram of a living body detection device provided by an embodiment of the present application.

FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

FIG. 7 is another schematic structural diagram of an electronic device provided by an embodiment of the present application.

Detailed description

Please refer to the drawings, where the same component symbols represent the same components. The principles of the present application are illustrated by implementation in an appropriate computing environment. The following description is based on the illustrated specific embodiments of the present application and should not be regarded as limiting other specific embodiments not detailed herein.

At present, face recognition technology is widely used for unlocking electronic devices, secure payment, and so on, but a non-living face image, a non-living face video, a face mask, or a head model can easily be used to impersonate another person and cause losses to users. To address this defect in face recognition technology, the related art has proposed living body detection technologies based on depth cameras such as structured light cameras or time-of-flight cameras; however, their implementation requires the electronic device to be equipped with an additional depth camera, which increases the cost of implementing living body detection on the electronic device. For this reason, an embodiment of the present application first provides a living body detection method that implements living body detection based on a monocular camera commonly configured on electronic devices, without increasing the hardware cost of the electronic device. The execution subject of the living body detection method may be the living body detection device provided in the embodiments of the present application, or an electronic device integrated with the living body detection device; the living body detection device may be implemented in hardware or software, and the electronic device may be a smartphone, tablet computer, palmtop computer, notebook computer, desktop computer, or another device configured with a processor and having processing capability.
An embodiment of the present application provides a living body detection method, including:

shooting the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;

inputting the two-dimensional color image into a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image;

inputting the two-dimensional color image and the depth image into a pre-trained living body detection model for living body detection to obtain a detection result.
In an embodiment, the living body detection model is a convolutional neural network model including a convolutional layer, a pooling layer, and a fully connected layer connected in sequence, and inputting the two-dimensional color image and the depth image into the pre-trained living body detection model to obtain the detection result includes:

inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction to obtain a joint global feature of the two-dimensional color image and the depth image;

inputting the joint global feature into the pooling layer for feature dimensionality reduction to obtain a dimensionality-reduced joint global feature; and

inputting the dimensionality-reduced joint global feature into the fully connected layer for classification processing to obtain a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.
In an embodiment, inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image includes:

preprocessing the two-dimensional color image to obtain a face region image in the two-dimensional color image;

preprocessing the depth image to obtain a face region image in the depth image; and

inputting the face region image in the two-dimensional color image and the face region image in the depth image into the convolutional layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image.
In an embodiment, preprocessing the two-dimensional color image to obtain the face region image in the two-dimensional color image includes:

extracting the face region image from the two-dimensional color image using an elliptical template, a circular template, or a rectangular template.
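As a minimal sketch of the elliptical-template extraction described above, the following pure-Python function zeroes out pixels outside the ellipse inscribed in the image rectangle; the assumption that the face is centered in the frame, and the nested-list grayscale image format, are illustrative choices, not details from the application.

```python
def ellipse_mask_crop(image):
    # Keep only pixels inside the ellipse inscribed in the image rectangle
    # (assumed to cover the face region); zero out everything outside it.
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0   # ellipse center
    ry, rx = h / 2.0, w / 2.0               # ellipse radii
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            inside = ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0
            row.append(image[y][x] if inside else 0)
        out.append(row)
    return out

img = [[5] * 6 for _ in range(6)]   # uniform 6x6 grayscale image
masked = ellipse_mask_crop(img)
```

A circular template is the special case where the two radii are equal; a rectangular template reduces to a plain crop of the bounding box.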
In an embodiment, before shooting the face to be detected through the monocular camera to obtain the two-dimensional color image of the face to be detected, the method further includes:

shooting a plurality of different living faces through the monocular camera to obtain a plurality of two-dimensional color living face sample images, and acquiring a depth image corresponding to each of the two-dimensional color living face sample images to obtain a plurality of first depth images;

shooting a plurality of different non-living faces through the monocular camera to obtain a plurality of two-dimensional color non-living face sample images, and acquiring a depth image corresponding to each of the two-dimensional color non-living face sample images to obtain a plurality of second depth images;

constructing a training sample set by taking each two-dimensional color living face sample image and its corresponding first depth image as a positive sample, and taking each two-dimensional color non-living face sample image and its corresponding second depth image as a negative sample; and

performing model training on the training sample set using a convolutional neural network to obtain the convolutional neural network model.
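The construction of the training sample set above can be sketched as follows; the string placeholders stand in for actual image arrays, and the `(color, depth, label)` tuple layout with label 1 for positive (living) and 0 for negative (non-living) samples is one possible encoding, not one prescribed by the application.

```python
def build_training_set(live_pairs, spoof_pairs):
    # live_pairs: (color sample image, first depth image) for living faces.
    # spoof_pairs: (color sample image, second depth image) for non-living faces.
    samples = []
    for color, depth in live_pairs:
        samples.append((color, depth, 1))   # positive sample
    for color, depth in spoof_pairs:
        samples.append((color, depth, 0))   # negative sample
    return samples

live = [("live_rgb_0", "live_depth_0"), ("live_rgb_1", "live_depth_1")]
spoof = [("spoof_rgb_0", "spoof_depth_0")]
train_set = build_training_set(live, spoof)
```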
In an embodiment, before performing model training on the training sample set using the convolutional neural network to obtain the convolutional neural network model, the method further includes:

performing sample expansion processing on the training sample set according to a preset sample expansion strategy.
In an embodiment, acquiring the depth image corresponding to each of the two-dimensional color living face sample images to obtain the plurality of first depth images includes:

receiving a calibrated distance from each pixel in each of the two-dimensional color living face sample images to the monocular camera; and

generating, according to the distance from each pixel in each two-dimensional color living face sample image to the monocular camera, the depth image corresponding to each two-dimensional color living face sample image, to obtain the plurality of first depth images.
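Generating a depth image from the calibrated per-pixel distances can be sketched as a simple per-pixel mapping. The 8-bit quantization and the near/far clamping range below are assumptions for illustration only, since the application does not fix a particular depth encoding.

```python
def distances_to_depth_image(distances, near=0.2, far=2.0):
    # Map each calibrated camera-to-pixel distance (in meters, hypothetical
    # range) to an 8-bit depth value. The resulting depth image has the same
    # resolution as the distance grid, and hence as the color sample image.
    def to_pixel(d):
        d = min(max(d, near), far)                  # clamp into assumed range
        return round(255 * (d - near) / (far - near))
    return [[to_pixel(d) for d in row] for row in distances]

dist = [[0.2, 1.1],
        [2.0, 0.65]]                                # calibrated distances
depth_img = distances_to_depth_image(dist)
```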
In an embodiment, the living body detection method further includes:

performing supervised model training by taking each two-dimensional color living face sample image and each two-dimensional color non-living face sample image as training inputs, and taking the first depth image corresponding to each two-dimensional color living face sample image and the second depth image corresponding to each two-dimensional color non-living face sample image as target outputs, to obtain the depth estimation model.
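To illustrate the supervised setup described above (sample images as training inputs, their depth images as target outputs), the following toy example fits a one-parameter linear "model" by gradient descent on mean squared error; both the model and the scalar data are stand-ins for illustration, not the actual depth estimation network or its training procedure.

```python
# Toy supervised training: predict depth = w * intensity, and fit w by
# gradient descent on the MSE between predicted and target depth values.
inputs  = [0.2, 0.4, 0.6, 0.8]   # pixel intensities (training input)
targets = [0.1, 0.2, 0.3, 0.4]   # calibrated depths (target output)

w, lr = 0.0, 0.5
for _ in range(200):
    # d(MSE)/dw averaged over the training pairs
    grad = sum(2 * (w * x - y) * x for x, y in zip(inputs, targets)) / len(inputs)
    w -= lr * grad

mse = sum((w * x - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)
```

Here the data were constructed so that the optimal weight is 0.5; gradient descent converges to it and drives the MSE toward zero, which is the same input-to-target supervision principle the depth estimation model relies on at much larger scale.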
In an embodiment, before inputting the two-dimensional color image into the pre-trained depth estimation model for depth estimation, the method further includes:

calling the depth estimation model locally or calling the depth estimation model from a server.
Please refer to FIG. 1, which is a schematic flowchart of the living body detection method provided by an embodiment of the present application. As shown in FIG. 1, the flow of the living body detection method provided by the embodiment of the present application may be as follows:

In 101, a face to be detected is shot through a monocular camera to obtain a two-dimensional color image of the face to be detected.

In the embodiment of the present application, when the electronic device receives an operation that requires face recognition for user identity verification, such as a face-recognition-based unlock operation or a face-recognition-based payment operation, it shoots the face to be detected through the configured monocular camera. Since a monocular camera is sensitive only to two-dimensional color information, the shooting yields a two-dimensional color image of the face to be detected.

It should be noted that, at present, an electronic device is usually provided with two monocular cameras: a front monocular camera (commonly known as the front camera) and a rear monocular camera (commonly known as the rear camera), and the imaging capability of the rear monocular camera is higher than that of the front monocular camera. Accordingly, when shooting the face to be detected through a monocular camera, the electronic device may by default perform the shooting operation through the front monocular camera to shoot the face to be detected; it may instead by default perform the shooting operation through the rear monocular camera; or it may predict, according to real-time pose information, which of the front and rear monocular cameras faces the face to be detected, and automatically perform the shooting operation through that camera.

For example, if the unlocking mode currently adopted by the electronic device is "face unlocking", then when the electronic device receives a trigger operation for face unlocking, it shoots the face to be detected through the front monocular camera by default, thereby obtaining the two-dimensional color image of the face to be detected.

For another example, if the payment mode currently adopted by the electronic device is "face-scanning payment", then when the electronic device receives a trigger operation for face-scanning payment, it shoots the face to be detected through the front monocular camera by default, thereby obtaining the two-dimensional color image of the face to be detected.
In 102, the captured two-dimensional color image is input into a pre-trained depth estimation model for depth estimation to obtain a depth image corresponding to the two-dimensional color image.

It should be noted that, in the embodiment of the present application, a depth estimation model for depth estimation is trained in advance, and the depth estimation model may be stored locally on the electronic device or on a remote server. In this way, after acquiring the two-dimensional color image of the face to be detected through the monocular camera, the electronic device calls the pre-trained depth estimation model locally or from the remote server, inputs the two-dimensional color image of the face to be detected into the model, and performs depth estimation on the two-dimensional color image through the model to obtain the depth image corresponding to the two-dimensional color image.

It should be noted that the resolution of the estimated depth image is the same as that of the two-dimensional color image, and the pixel value of each pixel in the depth image describes the distance from the corresponding pixel in the two-dimensional color image to the aforementioned monocular camera (that is, the monocular camera that captured the two-dimensional color image).

For example, after capturing the two-dimensional color image of the face to be detected through the front monocular camera, the electronic device calls the locally stored, pre-trained depth estimation model and performs depth estimation on the two-dimensional color image through the model to obtain the depth image corresponding to the two-dimensional color image.
In 103, the two-dimensional color image and its corresponding depth image are input into a pre-trained living body detection model for living body detection to obtain a detection result.

It should be noted that, in the embodiment of the present application, in addition to the depth estimation model pre-trained for depth estimation, a living body detection model is pre-trained for living body detection, and the living body detection model may be stored locally on the electronic device or on a remote server. In this way, after inputting the two-dimensional color image captured by the monocular camera into the pre-trained depth estimation model and obtaining the depth image corresponding to the two-dimensional color image, the electronic device calls the pre-trained living body detection model locally or from the remote server, inputs the previously acquired two-dimensional color image and its corresponding depth image into the model, and performs living body detection on the face to be detected through the model based on the input two-dimensional color image and its corresponding depth image, to obtain a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.

For example, referring to FIG. 2, after capturing the two-dimensional color image of the face to be detected through the front monocular camera, the electronic device calls the locally stored, pre-trained depth estimation model, performs depth estimation on the two-dimensional color image through the model to obtain the depth image corresponding to the two-dimensional color image, then calls the locally stored, pre-trained living body detection model, and inputs the previously obtained two-dimensional color image and its corresponding depth image into the living body detection model for living body detection to obtain a detection result. If the detection result is that the face to be detected is a living face, the face to be detected is the real face of a person with vital signs; if the detection result is that the face to be detected is a non-living face, the face to be detected is not the real face of a person with vital signs, and may be, for example, a pre-captured face image or face video.

It can be seen from the above that the electronic device in the embodiment of the present application can first capture a two-dimensional color image of the face to be detected through the configured monocular camera, then input the captured two-dimensional color image into the pre-trained depth estimation model for depth estimation to obtain a depth image corresponding to the two-dimensional color image, and finally input the previously obtained two-dimensional color image and its corresponding depth image into the pre-trained living body detection model for living body detection to obtain a detection result. As a result, the electronic device can perform living body detection using the commonly configured monocular camera, without resorting to an additionally configured depth camera, which reduces the hardware cost of living body detection for the electronic device.
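The three-step flow of FIG. 1 can be sketched as a minimal Python pipeline. Here `capture_color_image`, `DepthEstimationModel`, and `LivenessModel` are hypothetical stand-ins for the monocular camera and the two pre-trained models described above, not components defined by the application; the "flat depth map" heuristic in the stub merely illustrates why estimated depth helps reject replayed photos.

```python
# Minimal sketch of the three-step liveness pipeline (hypothetical stubs).

def capture_color_image():
    # Stand-in for the monocular camera: a 2x2 RGB image as nested lists.
    return [[(120, 100, 90), (130, 110, 95)],
            [(125, 105, 92), (128, 108, 94)]]

class DepthEstimationModel:
    # Stand-in for the pre-trained depth estimation model: maps each pixel
    # of the 2D color image to a scalar depth value (same resolution).
    def predict(self, color_image):
        return [[sum(px) / 3.0 for px in row] for row in color_image]

class LivenessModel:
    # Stand-in for the pre-trained liveness detection model, which consumes
    # the color image together with its estimated depth image.
    def predict(self, color_image, depth_image):
        # A real model runs a CNN; this stub just checks that the depth map
        # is not flat (a flat depth map is typical of a replayed photo).
        values = [d for row in depth_image for d in row]
        return "live" if max(values) - min(values) > 1e-6 else "non-live"

color = capture_color_image()                       # step 101
depth = DepthEstimationModel().predict(color)       # step 102
result = LivenessModel().predict(color, depth)      # step 103
```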
Please refer to FIG. 3, which is another schematic flowchart of the living body detection method provided by an embodiment of the present application. The living body detection method may be applied to an electronic device, and the flow of the living body detection method may include:

In 201, the electronic device trains a depth estimation model and a living body detection model using a machine learning algorithm, where the living body detection model is a convolutional neural network model.

In the embodiment of the present application, the electronic device trains the depth estimation model and the living body detection model in advance using a machine learning algorithm. It should be noted that, after training the depth estimation model and the living body detection model, the electronic device may store both models locally, store both models on a remote server, or store one of the two models locally and the other on a remote server.

The machine learning algorithm may include: a decision tree model, a logistic regression model, a Bayesian model, a neural network model, a clustering model, and so on.
Machine learning algorithms can be categorized in various ways. For example, based on the learning mode, machine learning algorithms can be divided into supervised learning algorithms, unsupervised learning algorithms, semi-supervised learning algorithms, reinforcement learning algorithms, and so on.

In supervised learning, the input data is called "training data", and each set of training data has a definite label or result, such as "spam" and "non-spam" in an anti-spam system, or "1, 2, 3, 4" in handwritten digit recognition. Common application scenarios of supervised learning include classification problems and regression problems. Common algorithms include Logistic Regression and the Back Propagation Neural Network.

In unsupervised learning, the data is not specifically labeled, and the model is used to infer some internal structure of the data. Common application scenarios include learning of association rules and clustering. Common algorithms include the Apriori algorithm and the k-Means algorithm.

In semi-supervised learning, the input data is partially labeled. This kind of learning model can be used for type recognition, but the model first needs to learn the internal structure of the data in order to organize the data reasonably for prediction. Application scenarios include classification and regression, and the algorithms include extensions of commonly used supervised learning algorithms; these algorithms first attempt to model the unlabeled data and then make predictions for the labeled data on that basis. Examples include Graph Inference and the Laplacian Support Vector Machine (Laplacian SVM).

In reinforcement learning, the input data serves as feedback to the model. Unlike in supervised learning, where the input data is merely a way to check whether the model is right or wrong, in reinforcement learning the input data is fed back directly to the model, and the model must adjust to it immediately. Common application scenarios include dynamic systems and robot control. Common algorithms include Q-Learning and temporal difference learning.
In addition, based on the similarity of their function and form, machine learning algorithms can also be divided into:

Regression algorithms. Common regression algorithms include: Ordinary Least Squares, Logistic Regression, Stepwise Regression, Multivariate Adaptive Regression Splines, and Locally Estimated Scatterplot Smoothing.

Instance-based algorithms, including k-Nearest Neighbor (KNN), Learning Vector Quantization (LVQ), and the Self-Organizing Map (SOM).

Regularization methods. Common algorithms include: Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net.

Decision tree algorithms. Common algorithms include: Classification And Regression Tree (CART), ID3 (Iterative Dichotomiser 3), C4.5, Chi-squared Automatic Interaction Detection (CHAID), Decision Stump, Random Forest, Multivariate Adaptive Regression Splines (MARS), and the Gradient Boosting Machine (GBM).

Bayesian algorithms, including: the Naive Bayes algorithm, Averaged One-Dependence Estimators (AODE), and the Bayesian Belief Network (BBN).
For example, in the embodiment of the present application, a convolutional neural network is used to train the living body detection model, that is, the living body detection model is a convolutional neural network model, where the convolutional neural network model includes a convolutional layer, a pooling layer, and a fully connected layer.
In 202, the electronic device shoots the face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected.

In the embodiment of the present application, when the electronic device receives an operation that requires face recognition for user identity verification, such as a face-recognition-based unlock operation or a face-recognition-based payment operation, it shoots the face to be detected through the configured monocular camera. Since a monocular camera is sensitive only to two-dimensional color information, the shooting yields a two-dimensional color image of the face to be detected.

It should be noted that, at present, an electronic device is usually provided with two monocular cameras: a front monocular camera (commonly known as the front camera) and a rear monocular camera (commonly known as the rear camera), and the imaging capability of the rear monocular camera is higher than that of the front monocular camera. Accordingly, when shooting the face to be detected through a monocular camera, the electronic device may by default perform the shooting operation through the front monocular camera to shoot the face to be detected; it may instead by default perform the shooting operation through the rear monocular camera; or it may predict, according to real-time pose information, which of the front and rear monocular cameras faces the face to be detected, and automatically perform the shooting operation through that camera.

For example, if the unlocking mode currently adopted by the electronic device is "face unlocking", then when the electronic device receives a trigger operation for face unlocking, it shoots the face to be detected through the front monocular camera by default, thereby obtaining the two-dimensional color image of the face to be detected.

For another example, if the payment mode currently adopted by the electronic device is "face-scanning payment", then when the electronic device receives a trigger operation for face-scanning payment, it shoots the face to be detected through the front monocular camera by default, thereby obtaining the two-dimensional color image of the face to be detected.
In 203, the electronic device inputs the captured two-dimensional color image into the pre-trained depth estimation model for depth estimation to obtain a depth image corresponding to the two-dimensional color image.

After acquiring the two-dimensional color image of the face to be detected through the monocular camera, the electronic device calls the pre-trained depth estimation model locally or from a remote server, inputs the two-dimensional color image of the face to be detected into the model, and performs depth estimation on the two-dimensional color image through the model to obtain the depth image corresponding to the two-dimensional color image.

It should be noted that the resolution of the estimated depth image is the same as that of the two-dimensional color image, and the pixel value of each pixel in the depth image describes the distance from the corresponding pixel in the two-dimensional color image to the aforementioned monocular camera (that is, the monocular camera that captured the two-dimensional color image).

For example, after capturing the two-dimensional color image of the face to be detected through the front monocular camera, the electronic device calls the locally stored, pre-trained depth estimation model and performs depth estimation on the two-dimensional color image through the model to obtain the depth image corresponding to the two-dimensional color image.
In 204, the electronic device inputs the aforementioned two-dimensional color image and its corresponding depth image into the convolutional layer of the convolutional neural network model for feature extraction to obtain a joint global feature of the two-dimensional color image and the depth image.

In the embodiment of the present application, after inputting the two-dimensional color image captured by the monocular camera into the pre-trained depth estimation model and obtaining the depth image corresponding to the two-dimensional color image, the electronic device calls the pre-trained living body detection model locally or from a remote server, and uses the living body detection model, that is, the previously trained convolutional neural network model, to perform living body detection.

First, the electronic device inputs the aforementioned two-dimensional color image and its corresponding depth image into the convolutional layer of the convolutional neural network model for feature extraction (feature extraction maps the original image data to a hidden-layer feature space, from which the corresponding global features are obtained), obtaining the global feature of the two-dimensional color image and the global feature of the depth image. Then, in the convolutional layer, the global feature of the two-dimensional color image and the global feature of the depth image are combined to obtain the joint global feature of the two-dimensional color image and the depth image.
In 205, the electronic device inputs the obtained joint global feature into the pooling layer of the convolutional neural network model for feature dimensionality reduction to obtain a dimensionality-reduced joint global feature.

In the embodiment of the present application, in order to reduce the amount of computation and improve the efficiency of living body detection, the joint global feature of the two-dimensional color image and the depth image output by the convolutional layer is input into the pooling layer of the convolutional neural network model for downsampling, that is, the salient elements of the joint global feature are retained, achieving feature dimensionality reduction of the joint global feature. Downsampling can be implemented by max pooling, mean pooling, or the like.

For example, suppose the convolutional layer outputs a 20×20 joint global feature; after feature dimensionality reduction by the pooling layer, a 10×10 dimensionality-reduced joint global feature is obtained.
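The 20×20 → 10×10 reduction in the example corresponds to pooling with a 2×2 window and stride 2. A minimal pure-Python max-pooling sketch (shown on a 4×4 feature map for brevity; the values are illustrative):

```python
def max_pool_2x2(feature_map):
    # Downsample a 2D feature map with a 2x2 window and stride 2, keeping
    # the maximum (most salient) value in each window. Assumes even h and w.
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 6, 2],
        [3, 2, 1, 4]]
pooled = max_pool_2x2(fmap)   # 4x4 -> 2x2
```

Replacing `max` with an average of the four window values would give mean pooling instead; either way, a 20×20 input yields a 10×10 output.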
在206中,电子设备将降维后的联合全局特征输入卷积神经网络模型的全连接层进行分类处理,得到待检测人脸为活体人脸的检测结果,或者得到待检测人脸为非活体人脸的检测结果。In 206, the electronic device inputs the joint global features after dimensionality reduction into the fully connected layer of the convolutional neural network model for classification processing, and obtains the detection result that the face to be detected is a living face, or the face to be detected is a non-living body Face detection results.
其中，全连接层用于实现分类器的功能，其每一个结点都与池化层的所有输出结点相连，全连接层的一个结点即称为全连接层中的一个神经元，全连接层中神经元的数量可以根据实际应用的需求而定，比如，可以将全连接层的神经元数量设置为4096个，等等。Among them, the fully connected layer is used to implement the function of a classifier. Each of its nodes is connected to all output nodes of the pooling layer, and one node of the fully connected layer is called a neuron in the fully connected layer. The number of neurons in the fully connected layer can be determined according to the requirements of the actual application; for example, the number of neurons in the fully connected layer can be set to 4096, and so on.
本申请实施例中,池化层所输出的降维后的联合全局特征将被输入到全连接层进行分类处理,得到待检测人脸为活体人脸的检测结果,或者得到待检测人脸为非活体人脸的检测结果。In the embodiment of the present application, the dimensionality-reduced joint global features output by the pooling layer will be input to the fully connected layer for classification processing to obtain the detection result that the face to be detected is a living face, or the face to be detected is Detection results of non-living human faces.
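下述Python示例以随机权重草拟全连接层对降维后特征的二分类过程(权重仅为假设性占位，并非训练所得参数)。The following Python example sketches the binary classification that the fully connected layer performs on the dimension-reduced features, using random stand-in weights (hypothetical placeholders, not trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal(100)           # flattened dimension-reduced joint feature (e.g. 10x10)
W = rng.standard_normal((2, 100)) * 0.01  # stand-in for trained fully connected weights, 2 classes
b = np.zeros(2)

logits = W @ feat + b                     # fully connected layer: one output neuron per class
probs = np.exp(logits - logits.max())
probs /= probs.sum()                      # softmax over {living face, non-living face}
result = "living" if probs[0] >= probs[1] else "non-living"
```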
在一实施方式中，在将前述二维彩色图像及其对应的深度图像输入卷积神经网络模型的卷积层进行特征提取，得到前述二维彩色图像和前述深度图像的联合全局特征时，可以执行：In one embodiment, when the aforementioned two-dimensional color image and its corresponding depth image are input into the convolutional layer of the convolutional neural network model for feature extraction to obtain the joint global features of the aforementioned two-dimensional color image and the aforementioned depth image, the following may be performed:
(1)电子设备对前述二维彩色图像进行预处理,得到前述二维彩色图像中的人脸区域图像;(1) The electronic device preprocesses the two-dimensional color image to obtain the face area image in the two-dimensional color image;
(2)电子设备对前述深度图像进行预处理,得到前述深度图像中的人脸区域图像;(2) The electronic device preprocesses the aforementioned depth image to obtain the face area image in the aforementioned depth image;
(3)电子设备将前述二维彩色图像中的人脸区域图像和前述深度图像中的人脸区域图像输入前述卷积层进行特征提取,得到前述二维彩色图像和前述深度图像的联合全局特征。(3) The electronic device inputs the face area image in the two-dimensional color image and the face area image in the depth image to the convolutional layer for feature extraction to obtain the combined global features of the two-dimensional color image and the depth image .
为进一步提升活体检测的效率，电子设备在将前述二维彩色图像及其对应的深度图像输入卷积神经网络模型的卷积层进行特征提取时，并不是将原始的前述二维彩色图像和原始的前述深度图像输入到卷积神经网络的卷积层进行特征提取，而是先分别对前述二维彩色图像和前述深度图像进行预处理，得到前述二维彩色图像中的人脸区域图像以及前述深度图像中的人脸区域图像。In order to further improve the efficiency of living body detection, when the electronic device inputs the aforementioned two-dimensional color image and its corresponding depth image into the convolutional layer of the convolutional neural network model for feature extraction, it does not feed the original two-dimensional color image and the original depth image directly into the convolutional layer; instead, it first preprocesses the two-dimensional color image and the depth image separately to obtain the face area image in the two-dimensional color image and the face area image in the depth image.
其中,电子设备在对前述二维彩色图像和前述深度图像进行预处理时,可以采用椭圆形模板、圆形模板或者矩形模板等方式分别从前述二维彩色图像和前述深度图像中提取人脸区域图像,由此得到前述二维彩色图像中的人脸区域图像和前述深度图像中的人脸区域图像。Wherein, when the electronic device preprocesses the two-dimensional color image and the depth image, the face area can be extracted from the two-dimensional color image and the depth image by using an oval template, a circular template, or a rectangular template, etc., respectively. Image, thereby obtaining the face area image in the aforementioned two-dimensional color image and the face area image in the aforementioned depth image.
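下述Python示例以矩形模板为例，草拟从二维彩色图像及其对齐深度图像中裁剪人脸区域的预处理过程(人脸框坐标为假设值)。Taking the rectangular template as an example, the following Python sketch crops the face region from the two-dimensional color image and its aligned depth image (the face box coordinates are hypothetical):

```python
import numpy as np

def crop_face(img, box):
    # Rectangular-template preprocessing: keep only the face region (x, y, w, h).
    x, y, w, h = box
    return img[y:y + h, x:x + w]

rgb = np.zeros((480, 640, 3), dtype=np.uint8)   # 2D color image
depth = np.zeros((480, 640), dtype=np.float32)  # aligned depth image
box = (200, 100, 160, 200)                      # hypothetical detected face box
face_rgb, face_depth = crop_face(rgb, box), crop_face(depth, box)
```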
在一实施方式中,在采用机器学习算法训练得到活体检测模型时,可以执行:In one embodiment, when using a machine learning algorithm to train to obtain a living body detection model, you can execute:
(1)电子设备通过单目摄像头对多个不同活体人脸进行拍摄，得到多个二维彩色活体人脸样本图像，并获取各二维彩色活体人脸样本图像对应的深度图像，得到多个第一深度图像；(1) The electronic device shoots a plurality of different living human faces through the monocular camera to obtain a plurality of two-dimensional color live human face sample images, and obtains a depth image corresponding to each two-dimensional color live human face sample image to obtain a plurality of first depth images;
(2)电子设备通过单目摄像头对多个不同非活体人脸进行拍摄，得到多个二维彩色非活体人脸样本图像，并获取各二维彩色非活体人脸样本图像对应的深度图像，得到多个第二深度图像；(2) The electronic device shoots a plurality of different non-living human faces through the monocular camera to obtain a plurality of two-dimensional color non-living human face sample images, and obtains a depth image corresponding to each two-dimensional color non-living human face sample image to obtain a plurality of second depth images;
(3)电子设备将各二维彩色活体人脸样本图像及其对应的第一深度图像作为正样本、将各二维彩色非活体人脸样本图像及其对应的第二深度图像作为负样本，构建训练样本集；(3) The electronic device takes each two-dimensional color live human face sample image and its corresponding first depth image as a positive sample, and each two-dimensional color non-live human face sample image and its corresponding second depth image as a negative sample, to construct a training sample set;
(4)电子设备采用卷积神经网络对训练样本集进行模型训练,得到卷积神经网络模型,作为活体检测模型。(4) The electronic device adopts a convolutional neural network to perform model training on the training sample set to obtain a convolutional neural network model as a living body detection model.
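上述步骤(1)~(3)中训练样本集的构建可草拟如下(文件名均为假设的示例)。The construction of the training sample set in steps (1) to (3) above can be sketched as follows (all file names are hypothetical examples):

```python
def build_training_set(live_pairs, spoof_pairs):
    # Positive samples: (color image, first depth image, label 1);
    # negative samples: (color image, second depth image, label 0).
    samples = [(rgb, depth, 1) for rgb, depth in live_pairs]
    samples += [(rgb, depth, 0) for rgb, depth in spoof_pairs]
    return samples

# Hypothetical file names standing in for captured sample images.
live_pairs = [("live_rgb_0.png", "live_depth_0.png")]
spoof_pairs = [("spoof_rgb_0.png", "spoof_depth_0.png")]
train_set = build_training_set(live_pairs, spoof_pairs)
```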
其中，一方面，电子设备可以通过其配置的单目摄像头对不同肤色、不同性别以及不同年龄段的用户的人脸(即活体人脸)进行拍摄，得到多个二维彩色活体人脸样本图像，此外，电子设备还获取各二维彩色活体人脸样本图像对应的深度图像，得到多个第一深度图像。Among them, on the one hand, the electronic device can shoot the faces of users of different skin colors, different genders and different ages (i.e., living faces) through its configured monocular camera to obtain multiple two-dimensional color live human face sample images. In addition, the electronic device also obtains a depth image corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images.
比如，电子设备还可以外接深度摄像头，在通过单目摄像头对任一活体人脸进行拍摄时，同步通过外接的深度摄像头进行拍摄，这样，电子设备将通过单目摄像头拍摄得到该活体人脸的二维彩色活体人脸样本图像，通过外接的深度摄像头拍摄得到该活体人脸的深度图像，然后将拍摄得到的深度图像和二维彩色活体人脸样本图像进行对齐，将对齐后的深度图像记为二维彩色活体人脸样本图像的第一深度图像。For example, the electronic device can also be connected to an external depth camera. When shooting any living face through the monocular camera, it simultaneously shoots through the external depth camera. In this way, the electronic device obtains a two-dimensional color live human face sample image of the living face through the monocular camera and a depth image of the living face through the external depth camera, then aligns the captured depth image with the two-dimensional color live human face sample image, and records the aligned depth image as the first depth image of the two-dimensional color live human face sample image.
另一方面，电子设备还可以通过其配置的单目摄像头对不同人脸图像、人脸视频、人脸面具以及人头模型等非活体人脸进行拍摄，得到多个二维彩色非活体人脸样本图像，此外，电子设备还获取各二维彩色非活体人脸样本图像对应的深度图像，得到多个第二深度图像。On the other hand, the electronic device can also shoot non-living human faces such as different face images, face videos, face masks and human head models through its configured monocular camera to obtain multiple two-dimensional color non-living human face sample images. In addition, the electronic device also obtains a depth image corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images.
比如，电子设备还可以外接深度摄像头，在通过单目摄像头对任一非活体人脸进行拍摄时，同步通过外接的深度摄像头进行拍摄，这样，电子设备将通过单目摄像头拍摄得到该非活体人脸的二维彩色非活体人脸样本图像，通过外接的深度摄像头拍摄得到该非活体人脸的深度图像，然后将拍摄得到的深度图像和二维彩色非活体人脸样本图像进行对齐，将对齐后的深度图像记为二维彩色非活体人脸样本图像的第二深度图像。For example, the electronic device can also be connected to an external depth camera. When shooting any non-living human face through the monocular camera, it simultaneously shoots through the external depth camera. In this way, the electronic device obtains a two-dimensional color non-living human face sample image of the non-living face through the monocular camera and a depth image of the non-living face through the external depth camera, then aligns the captured depth image with the two-dimensional color non-living human face sample image, and records the aligned depth image as the second depth image of the two-dimensional color non-living human face sample image.
电子设备在获取到多个二维彩色活体人脸样本图像及其对应的第一深度图像，以及获取到多个二维彩色非活体人脸样本图像及其对应的第二深度图像之后，将各二维彩色活体人脸样本图像及其对应的第一深度图像作为正样本、将各二维彩色非活体人脸样本图像及其对应的第二深度图像作为负样本，构建训练样本集，如图4所示。After acquiring the multiple two-dimensional color live human face sample images and their corresponding first depth images, as well as the multiple two-dimensional color non-live human face sample images and their corresponding second depth images, the electronic device takes each two-dimensional color live human face sample image and its corresponding first depth image as a positive sample, and each two-dimensional color non-live human face sample image and its corresponding second depth image as a negative sample, to construct a training sample set, as shown in FIG. 4.
电子设备在完成训练样本集的构建之后,采用卷积神经网络对构建的训练样本集进行模型训练,得到卷积神经网络模型,作为用于活体检测的活体检测模型。After completing the construction of the training sample set, the electronic device uses a convolutional neural network to perform model training on the constructed training sample set to obtain a convolutional neural network model as a living body detection model for living body detection.
应当说明的是,在采用卷积神经网络对构建的训练样本集进行模型训练时,可以采用监督学习方法,也可以采用非监督学习方法,具体可由本领域普通技术人员根据实际需要进行选取。It should be noted that, when a convolutional neural network is used to perform model training on the constructed training sample set, a supervised learning method or an unsupervised learning method may be used, which can be specifically selected by a person of ordinary skill in the art according to actual needs.
在一实施方式中,在采用卷积神经网络对训练样本集进行模型训练,得到卷积神经网络模型,作为活体检测模型之前,还包括:In one embodiment, before the convolutional neural network is used to perform model training on the training sample set to obtain a convolutional neural network model, which is used as a living body detection model, it further includes:
电子设备按照预设的样本扩充策略对训练样本集进行样本扩充处理。The electronic device performs sample expansion processing on the training sample set according to a preset sample expansion strategy.
本申请实施例中，通过对训练样本集进行样本扩充能够增加样本的多样性，使得训练得到的卷积神经网络模型具有更强的鲁棒性。其中，样本扩充策略可以设置为对训练样本集中的正样本/负样本进行小幅度的旋转、缩放、反转中的一种或多种。In the embodiment of the present application, expanding the samples of the training sample set can increase the diversity of the samples, so that the trained convolutional neural network model has stronger robustness. The sample expansion strategy may be set to perform one or more of small-amplitude rotation, scaling, and flipping on the positive samples/negative samples in the training sample set.
比如，对于训练样本集中的由一个二维彩色活体人脸样本图像及其对应的第一深度图像组成的正样本，可以对其中的二维彩色活体人脸样本图像及其对应的第一深度图像进行相同幅度的旋转，得到旋转后的二维彩色活体人脸样本图像以及旋转后的第一深度图像，由旋转后的二维彩色活体人脸样本图像和旋转后的第一深度图像组成一个新的正样本。For example, for a positive sample in the training sample set composed of a two-dimensional color live human face sample image and its corresponding first depth image, the two-dimensional color live human face sample image and its corresponding first depth image can be rotated by the same amplitude to obtain a rotated two-dimensional color live human face sample image and a rotated first depth image, and the rotated two-dimensional color live human face sample image and the rotated first depth image form a new positive sample.
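下述Python示例草拟样本扩充中的反转操作：对彩色图像与深度图像施加相同的水平反转，使扩充后的样本对仍保持像素对齐。The following Python sketch illustrates the flip operation in sample expansion: the same horizontal flip is applied to the color image and the depth image, so the expanded sample pair stays pixel-aligned:

```python
import numpy as np

def flip_pair(rgb, depth):
    # Apply the SAME horizontal flip to the color image and its depth image
    # so the new positive/negative sample stays pixel-aligned.
    return rgb[:, ::-1], depth[:, ::-1]

rgb = np.arange(12).reshape(2, 2, 3)   # tiny stand-in color image
depth = np.arange(4).reshape(2, 2)     # its aligned depth image
aug_rgb, aug_depth = flip_pair(rgb, depth)
```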
在一实施方式中,在获取各二维彩色活体人脸样本图像对应的深度图像,得到多个第一深度图像时,可以执行:In one embodiment, when acquiring depth images corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images, the following may be performed:
(1)电子设备接收标定的各二维彩色活体人脸样本图像中各像素点到单目摄像头的距离;(1) The electronic device receives the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera;
(2)电子设备根据各二维彩色活体人脸样本图像中各像素点到单目摄像头的距离,生成各二维彩色活体人脸样本图像对应的深度图像,得到多个第一深度图像。(2) The electronic device generates a depth image corresponding to each two-dimensional color live human face sample image according to the distance from each pixel in each two-dimensional color live human face sample image to the monocular camera, and obtains a plurality of first depth images.
其中，对于电子设备通过单目摄像头所拍摄得到的任一二维彩色活体人脸样本图像，可以手工标定该二维彩色活体人脸样本图像中各像素点到单目摄像头的距离，并由电子设备根据接收到的标定的该二维彩色活体人脸样本图像中各像素点到单目摄像头的距离，生成该二维彩色活体人脸样本图像对应的深度图像，记为第一深度图像。For any two-dimensional color live human face sample image captured by the electronic device through the monocular camera, the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera can be manually calibrated, and the electronic device generates a depth image corresponding to the two-dimensional color live human face sample image according to the received calibrated distance from each pixel in the two-dimensional color live human face sample image to the monocular camera, and records it as a first depth image.
由此,电子设备可以接收标定的各二维彩色活体人脸样本图像中各像素点到单目摄像头的距离,并根据各二维彩色活体人脸样本图像中各像素点到单目摄像头的距离,生成各二维彩色活体人脸样本图像对应的深度图像,得到多个第一深度图像。Thus, the electronic device can receive the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera, and according to the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera , A depth image corresponding to each two-dimensional color live human face sample image is generated, and multiple first depth images are obtained.
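下述Python示例草拟如何根据标定的各像素点到摄像头的距离生成深度图像(距离值为假设示例，归一化到8位灰度仅为一种可能的表示方式)。The following Python sketch shows one possible way to generate a depth image from the calibrated per-pixel distances to the camera (the distance values are hypothetical, and normalizing to 8-bit gray is only one possible representation):

```python
import numpy as np

def distances_to_depth_image(dist_m):
    # Normalize per-pixel calibrated camera distances (in meters) into an 8-bit depth image.
    d = np.asarray(dist_m, dtype=np.float32)
    lo, hi = float(d.min()), float(d.max())
    if hi == lo:
        return np.zeros(d.shape, dtype=np.uint8)
    return np.round((d - lo) / (hi - lo) * 255).astype(np.uint8)

dist = [[0.50, 0.52], [0.54, 0.60]]  # hypothetical calibrated distances for a 2x2 patch
depth_img = distances_to_depth_image(dist)
```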
在一实施方式中,在获取各二维彩色非活体人脸样本图像对应的深度图像,得到多个第二深度图像时,可以执行:In an embodiment, when acquiring depth images corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images, the following may be performed:
电子设备接收标定的各二维彩色非活体人脸样本图像中各像素点到单目摄像头的距离;The electronic device receives the calibrated distance from each pixel in the two-dimensional color non-living face sample image to the monocular camera;
电子设备根据各二维彩色非活体人脸样本图像中各像素点到单目摄像头的距离,生成各二维彩色非活体人脸样本图像对应的深度图像,得到多个第二深度图像。The electronic device generates a depth image corresponding to each two-dimensional color non-living human face sample image according to the distance from each pixel in each two-dimensional color non-living human face sample image to the monocular camera, and obtains a plurality of second depth images.
在一实施方式中,在采用机器学习算法训练得到深度估计模型时,可以执行:In an embodiment, when a machine learning algorithm is used to obtain a depth estimation model, the following may be performed:
电子设备将各二维彩色活体人脸样本图像和各二维彩色非活体人脸样本图像作为训练输入、将各二维彩色活体人脸样本图像对应的第一深度图像和各二维彩色非活体人脸样本图像对应的第二深度图像作为目标输出，进行有监督模型训练，得到深度估计模型。The electronic device takes each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image as the training input, takes the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image as the target output, and performs supervised model training to obtain the depth estimation model.
应当说明的是，在本申请实施例中，电子设备除了利用获取的多个二维彩色活体人脸样本图像及其对应的多个第一深度图像、以及多个二维彩色非活体人脸样本图像及其对应的多个第二深度图像来训练活体检测模型之外，还可以利用获取的多个二维彩色活体人脸样本图像及其对应的多个第一深度图像、以及多个二维彩色非活体人脸样本图像及其对应的多个第二深度图像，来训练得到深度估计模型。其中，电子设备可以直接将各二维彩色活体人脸样本图像和各二维彩色非活体人脸样本图像作为训练输入、将各二维彩色活体人脸样本图像对应的第一深度图像和各二维彩色非活体人脸样本图像对应的第二深度图像作为目标输出，进行有监督模型训练，得到深度估计模型。It should be noted that, in the embodiment of the present application, in addition to using the acquired multiple two-dimensional color live human face sample images and their corresponding first depth images, as well as the multiple two-dimensional color non-live human face sample images and their corresponding second depth images, to train the living body detection model, the electronic device can also use these acquired sample images and their corresponding first and second depth images to train the depth estimation model. The electronic device can directly take each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image as the training input, take the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image as the target output, and perform supervised model training to obtain the depth estimation model.
比如，对于任一二维彩色活体人脸样本图像，电子设备将该二维彩色活体人脸样本图像作为训练输入，将该二维彩色活体人脸样本图像的第一深度图像作为对应的目标输出；同样的，对于任一二维彩色非活体人脸样本图像，电子设备将该二维彩色非活体人脸样本图像作为训练输入，将该二维彩色非活体人脸样本图像的第二深度图像作为对应的目标输出。For example, for any two-dimensional color live human face sample image, the electronic device takes the two-dimensional color live human face sample image as the training input and the first depth image of the two-dimensional color live human face sample image as the corresponding target output; similarly, for any two-dimensional color non-live human face sample image, the electronic device takes the two-dimensional color non-live human face sample image as the training input and the second depth image of the two-dimensional color non-live human face sample image as the corresponding target output.
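下述Python玩具示例以逐像素线性映射代替真实的深度估计网络，草拟上述有监督训练的输入-目标关系(数据均为合成的假设示例)。The following Python toy example sketches the input-target relationship of the supervised training described above, replacing the real depth-estimation network with a per-pixel linear map (all data are synthetic, hypothetical examples):

```python
import numpy as np

# Toy supervised setup: inputs stand in for flattened color sample images,
# and targets stand in for their corresponding depth images.
rng = np.random.default_rng(0)
X = rng.random((8, 16))      # 8 flattened "color images" (live and non-live samples)
Y = 2.0 * X + 0.1            # their synthetic "ground-truth depth maps"

w, b = 0.0, 0.0
for _ in range(2000):        # gradient descent on the mean-squared depth error
    err = w * X + b - Y
    w -= 0.1 * (err * X).mean()
    b -= 0.1 * err.mean()
```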
本申请实施例还提供一种活体检测装置。请参照图5,图5为本申请实施例提供的活体检测装置的结构示意图。其中该活体检测装置应用于电子设备,该电子设备包括单目摄像头,该活体检测装置包括彩色图像获取模块501、深度图像获取模块502以及活体人脸检测模块503,如下:An embodiment of the present application also provides a living body detection device. Please refer to FIG. 5, which is a schematic structural diagram of a living body detection device according to an embodiment of the present application. Wherein the living body detection device is applied to an electronic device, the electronic device includes a monocular camera, the living body detection device includes a color image acquisition module 501, a depth image acquisition module 502, and a living face detection module 503, as follows:
彩色图像获取模块501,用于通过单目摄像头对待检测人脸进行拍摄,得到待检测人脸的二维彩色图像;The color image acquisition module 501 is used to shoot a face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected;
深度图像获取模块502,用于将拍摄得到的二维彩色图像输入预先训练的深度估计模型进行深度估计,得到对应二维彩色图像的深度图像;The depth image acquisition module 502 is used to input the captured two-dimensional color image into a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image;
活体人脸检测模块503,用于将二维彩色图像及其对应的深度图像输入预先训练的活体检测模型进行活体检测,得到检测结果。The living body face detection module 503 is used to input a two-dimensional color image and its corresponding depth image into a pre-trained living body detection model for living body detection to obtain a detection result.
在一实施方式中，活体检测模型为卷积神经网络模型，包括依次连接的卷积层、池化层和全连接层，在将二维彩色图像及其对应的深度图像输入预先训练的活体检测模型进行活体检测，得到检测结果时，活体人脸检测模块503可以用于：In one embodiment, the living body detection model is a convolutional neural network model including a convolutional layer, a pooling layer and a fully connected layer connected in sequence. When the two-dimensional color image and its corresponding depth image are input into the pre-trained living body detection model for living body detection to obtain the detection result, the living face detection module 503 may be configured to:
将前述二维彩色图像及其对应的深度图像输入卷积层进行特征提取,得到前述二维彩色图像和前述深度图像的联合全局特征;Input the aforementioned two-dimensional color image and its corresponding depth image into the convolution layer for feature extraction, and obtain the joint global features of the aforementioned two-dimensional color image and the aforementioned depth image;
将得到的联合全局特征输入池化层进行特征降维，得到降维后的联合全局特征；Input the obtained joint global features into the pooling layer for feature dimensionality reduction to obtain the dimension-reduced joint global features;
将降维后的联合全局特征输入全连接层进行分类处理,得到待检测人脸为活体人脸的检测结果,或者得到待检测人脸为非活体人脸的检测结果。The joint global features after dimensionality reduction are input into the fully connected layer for classification processing to obtain the detection result that the face to be detected is a living face, or the detection result that the face to be detected is a non-living face.
在一实施方式中，在将前述二维彩色图像及其对应的深度图像输入卷积层进行特征提取，得到前述二维彩色图像和前述深度图像的联合全局特征时，活体人脸检测模块503可以用于：In one embodiment, when the aforementioned two-dimensional color image and its corresponding depth image are input into the convolutional layer for feature extraction to obtain the joint global features of the aforementioned two-dimensional color image and the aforementioned depth image, the living face detection module 503 may be configured to:
对前述二维彩色图像进行预处理,得到前述二维彩色图像中的人脸区域图像;Preprocessing the aforementioned two-dimensional color image to obtain the face area image in the aforementioned two-dimensional color image;
对前述深度图像进行预处理,得到前述深度图像中的人脸区域图像;Preprocessing the aforementioned depth image to obtain the face area image in the aforementioned depth image;
将前述二维彩色图像中的人脸区域图像和前述深度图像中的人脸区域图像输入前述卷积层进行特征提取,得到前述二维彩色图像和前述深度图像的联合全局特征。The face area image in the two-dimensional color image and the face area image in the depth image are input to the convolutional layer for feature extraction to obtain a joint global feature of the two-dimensional color image and the depth image.
在一实施方式中,活体检测装置还包括模型训练模块,用于:In an embodiment, the living body detection device further includes a model training module, which is used to:
在通过单目摄像头对待检测人脸进行拍摄，得到待检测人脸的二维彩色图像之前，通过单目摄像头对多个不同活体人脸进行拍摄，得到多个二维彩色活体人脸样本图像，并获取各二维彩色活体人脸样本图像对应的深度图像，得到多个第一深度图像；Before shooting the face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected, shoot a plurality of different living human faces through the monocular camera to obtain a plurality of two-dimensional color live human face sample images, and obtain a depth image corresponding to each two-dimensional color live human face sample image to obtain a plurality of first depth images;
通过单目摄像头对多个不同非活体人脸进行拍摄，得到多个二维彩色非活体人脸样本图像，并获取各二维彩色非活体人脸样本图像对应的深度图像，得到多个第二深度图像；Shoot a plurality of different non-living human faces through the monocular camera to obtain a plurality of two-dimensional color non-living human face sample images, and obtain a depth image corresponding to each two-dimensional color non-living human face sample image to obtain a plurality of second depth images;
将各二维彩色活体人脸样本图像及其对应的第一深度图像作为正样本、将各二维彩色非活体人脸样本图像及其对应的第二深度图像作为负样本,构建训练样本集;Use each two-dimensional color live human face sample image and its corresponding first depth image as a positive sample, and each two-dimensional color non-live human face sample image and its corresponding second depth image as a negative sample to construct a training sample set;
采用卷积神经网络对训练样本集进行模型训练,得到卷积神经网络模型,作为活体检测模型。A convolutional neural network is used to model the training sample set, and a convolutional neural network model is obtained as a living body detection model.
在一实施方式中,在采用卷积神经网络对训练样本集进行模型训练之前,模型训练模块:In one embodiment, before the convolutional neural network is used to train the training sample set, the model training module:
按照预设的样本扩充策略对训练样本集进行样本扩充处理。Perform sample expansion processing on the training sample set according to the preset sample expansion strategy.
在一实施方式中,在获取各二维彩色活体人脸样本图像对应的深度图像,得到多个第一深度图像时,模型训练模块可以用于:In one embodiment, when acquiring depth images corresponding to each two-dimensional color live human face sample image to obtain multiple first depth images, the model training module may be used to:
接收标定的各二维彩色活体人脸样本图像中各像素点到单目摄像头的距离;Receive the distance from each pixel in the two-dimensional color live human face sample image to the monocular camera;
根据各二维彩色活体人脸样本图像中各像素点到单目摄像头的距离,生成各二维彩色活体人脸样本图像对应的深度图像,得到多个第一深度图像。According to the distance between each pixel in each two-dimensional color live human face sample image and the monocular camera, a depth image corresponding to each two-dimensional color live human face sample image is generated to obtain a plurality of first depth images.
在一实施方式中,在获取各二维彩色非活体人脸样本图像对应的深度图像,得到多个第二深度图像时,模型训练模块可以用于:In one embodiment, when acquiring depth images corresponding to each two-dimensional color non-living human face sample image to obtain multiple second depth images, the model training module may be used to:
接收标定的各二维彩色非活体人脸样本图像中各像素点到单目摄像头的距离;Receive the distance between each pixel in the two-dimensional color non-living face sample image calibrated to the monocular camera;
根据各二维彩色非活体人脸样本图像中各像素点到单目摄像头的距离,生成各二维彩色非活体人脸样本图像对应的深度图像,得到多个第二深度图像。According to the distance between each pixel in each two-dimensional color non-living human face sample image and the monocular camera, a depth image corresponding to each two-dimensional color non-living human face sample image is generated to obtain a plurality of second depth images.
在一实施方式中,模型训练模块还可以用于:In one embodiment, the model training module can also be used for:
将各二维彩色活体人脸样本图像和各二维彩色非活体人脸样本图像作为训练输入、将各二维彩色活体人脸样本图像对应的第一深度图像和各二维彩色非活体人脸样本图像对应的第二深度图像作为目标输出，进行有监督模型训练，得到深度估计模型。Take each two-dimensional color live human face sample image and each two-dimensional color non-live human face sample image as the training input, take the first depth image corresponding to each two-dimensional color live human face sample image and the second depth image corresponding to each two-dimensional color non-live human face sample image as the target output, and perform supervised model training to obtain the depth estimation model.
本申请实施例提供一种计算机可读的存储介质，其上存储有计算机程序，当其存储的计算机程序在计算机上执行时，使得计算机执行如本实施例提供的活体检测方法中的步骤，或者使得计算机执行如本实施例提供的模型训练方法中的步骤。其中，存储介质可以是磁碟、光盘、只读存储器(Read Only Memory，ROM)或者随机存取存储器(Random Access Memory，RAM)等。An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the stored computer program is executed on a computer, the computer is caused to perform the steps in the living body detection method provided in this embodiment, or to perform the steps in the model training method provided in this embodiment. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM), etc.
本申请实施例还提供一种电子设备，包括存储器，处理器，处理器通过调用存储器中存储的计算机程序，执行本实施例提供的活体检测方法中的步骤，或者执行如本实施例提供的模型训练方法中的步骤。An embodiment of the present application also provides an electronic device including a memory and a processor; by calling the computer program stored in the memory, the processor performs the steps in the living body detection method provided in this embodiment, or performs the steps in the model training method provided in this embodiment.
在一实施例中,还提供一种电子设备。请参照图6,电子设备包括处理器701、存储器702以及单目摄像头703。其中,处理器701与存储器702和单目摄像头703电性连接。In an embodiment, an electronic device is also provided. Referring to FIG. 6, the electronic device includes a processor 701, a memory 702, and a monocular camera 703. The processor 701 is electrically connected to the memory 702 and the monocular camera 703.
处理器701是电子设备的控制中心，利用各种接口和线路连接整个电子设备的各个部分，通过运行或加载存储在存储器702内的计算机程序，以及调用存储在存储器702内的数据，执行电子设备的各种功能并处理数据。The processor 701 is the control center of the electronic device, and uses various interfaces and lines to connect the various parts of the entire electronic device. By running or loading the computer program stored in the memory 702 and calling the data stored in the memory 702, it performs the various functions of the electronic device and processes data.
存储器702可用于存储软件程序以及模块，处理器701通过运行存储在存储器702的计算机程序以及模块，从而执行各种功能应用以及数据处理。存储器702可主要包括存储程序区和存储数据区，其中，存储程序区可存储操作系统、至少一个功能所需的计算机程序(比如声音播放功能、图像播放功能等)等；存储数据区可存储根据电子设备的使用所创建的数据等。此外，存储器702可以包括高速随机存取存储器，还可以包括非易失性存储器，例如至少一个磁盘存储器件、闪存器件、或其他非易失性固态存储器件。相应地，存储器702还可以包括存储器控制器，以提供处理器701对存储器702的访问。The memory 702 may be used to store software programs and modules, and the processor 701 executes various functional applications and data processing by running the computer programs and modules stored in the memory 702. The memory 702 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a computer program required by at least one function (such as a sound playback function, an image playback function, etc.), and the like; the storage data area may store data created according to the use of the electronic device, and the like. In addition, the memory 702 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 702 may further include a memory controller to provide the processor 701 with access to the memory 702.
单目摄像头703可以包括具有一个或多个透镜和图像传感器的照相机，能够捕捉外界的图像数据。The monocular camera 703 may include a camera having one or more lenses and an image sensor, capable of capturing image data of the outside world.
在本申请实施例中，电子设备中的处理器701会按照如下的步骤，将一个或一个以上的计算机程序的进程对应的指令加载到存储器702中，并由处理器701运行存储在存储器702中的计算机程序，从而实现各种功能，如下：In the embodiment of the present application, the processor 701 in the electronic device loads the instructions corresponding to the processes of one or more computer programs into the memory 702 according to the following steps, and runs the computer programs stored in the memory 702, thereby implementing various functions, as follows:
通过单目摄像头703对待检测人脸进行拍摄,得到待检测人脸的二维彩色图像;The monocular camera 703 shoots the face to be detected to obtain a two-dimensional color image of the face to be detected;
将拍摄得到的二维彩色图像输入预先训练的深度估计模型进行深度估计,得到对应二维彩色图像的深度图像;Input the captured two-dimensional color image into a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image;
将二维彩色图像及其对应的深度图像输入预先训练的活体检测模型进行活体检测,得到检测结果。The two-dimensional color image and the corresponding depth image are input into a pre-trained living body detection model for living body detection, and the detection result is obtained.
请参照图7,图7为本申请实施例提供的电子设备的另一结构示意图,与图6所示电子设备的区别在于,电子设备还包括输入单元704和输出单元705等组件。Please refer to FIG. 7, which is another schematic structural diagram of an electronic device provided by an embodiment of the present application. The difference from the electronic device shown in FIG. 6 is that the electronic device further includes components such as an input unit 704 and an output unit 705.
其中,输入单元704可用于接收输入的数字、字符信息或用户特征信息(比如指纹),以及产生与用户设置以及功能控制有关的键盘、鼠标、操作杆、光学或者轨迹球信号输入等。The input unit 704 can be used to receive input numbers, character information, or user characteristic information (such as fingerprints), and generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
输出单元705可用于显示由用户输入的信息或提供给用户的信息,如屏幕。The output unit 705 may be used to display information input by the user or information provided to the user, such as a screen.
在本申请实施例中，电子设备中的处理器701会按照如下的步骤，将一个或一个以上的计算机程序的进程对应的指令加载到存储器702中，并由处理器701运行存储在存储器702中的计算机程序，从而实现各种功能，如下：In the embodiment of the present application, the processor 701 in the electronic device loads the instructions corresponding to the processes of one or more computer programs into the memory 702 according to the following steps, and runs the computer programs stored in the memory 702, thereby implementing various functions, as follows:
通过单目摄像头703对待检测人脸进行拍摄,得到待检测人脸的二维彩色图像;The monocular camera 703 shoots the face to be detected to obtain a two-dimensional color image of the face to be detected;
将拍摄得到的二维彩色图像输入预先训练的深度估计模型进行深度估计,得到对应二维彩色图像的深度图像;Input the captured two-dimensional color image into a pre-trained depth estimation model to perform depth estimation to obtain a depth image corresponding to the two-dimensional color image;
将二维彩色图像及其对应的深度图像输入预先训练的活体检测模型进行活体检测,得到检测结果。The two-dimensional color image and the corresponding depth image are input into a pre-trained living body detection model for living body detection, and the detection result is obtained.
在一实施方式中，活体检测模型为卷积神经网络模型，包括依次连接的卷积层、池化层和全连接层，在将二维彩色图像及其对应的深度图像输入预先训练的活体检测模型进行活体检测，得到检测结果时，处理器701可以执行：In one embodiment, the living body detection model is a convolutional neural network model including a convolutional layer, a pooling layer and a fully connected layer connected in sequence. When the two-dimensional color image and its corresponding depth image are input into the pre-trained living body detection model for living body detection to obtain the detection result, the processor 701 may execute:
将前述二维彩色图像及其对应的深度图像输入卷积层进行特征提取,得到前述二维彩色图像和前述深度图像的联合全局特征;Input the aforementioned two-dimensional color image and its corresponding depth image into the convolution layer for feature extraction, and obtain the joint global features of the aforementioned two-dimensional color image and the aforementioned depth image;
将得到联合全局特征输入池化层进行特征降维,得到降维后的联合全局特征;The joint global feature will be input into the pooling layer for feature dimensionality reduction, and the joint global feature after dimensionality reduction will be obtained;
将降维后的联合全局特征输入全连接层进行分类处理,得到待检测人脸为活体人脸的检测结果,或者得到待检测人脸为非活体人脸的检测结果。The joint global features after dimensionality reduction are input into the fully connected layer for classification processing to obtain the detection result that the face to be detected is a living face, or the detection result that the face to be detected is a non-living face.
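The convolution → pooling → fully connected pipeline described above can be sketched as a minimal NumPy forward pass. All layer sizes and weights below are illustrative placeholders, not the patent's actual model; the only structural point carried over is the 4-channel input formed by stacking the two-dimensional color image with its corresponding depth image:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, kernels):
    """Valid convolution + ReLU: x is (C, H, W), kernels is (K, C, kh, kw)."""
    K, C, kh, kw = kernels.shape
    H, W = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((K, H, W))
    for k in range(K):
        for i in range(H):
            for j in range(W):
                out[k, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * kernels[k])
    return np.maximum(out, 0)

def max_pool(x, s=2):
    """s-by-s max pooling: the feature dimensionality reduction step."""
    C, H, W = x.shape
    return x[:, :H // s * s, :W // s * s].reshape(C, H // s, s, W // s, s).max(axis=(2, 4))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# RGB image stacked with its estimated depth map -> 4-channel input (toy size)
rgb   = rng.random((3, 16, 16))
depth = rng.random((1, 16, 16))
x = np.concatenate([rgb, depth], axis=0)               # (4, 16, 16)

feat   = conv2d(x, rng.standard_normal((8, 4, 3, 3)))  # joint global feature
pooled = max_pool(feat)                                # dimensionality reduction
logits = pooled.reshape(-1) @ rng.standard_normal((pooled.size, 2))
probs  = softmax(logits)                               # [p(living), p(non-living)], sums to 1
```

In a trained model the kernel and fully-connected weights would come from the training procedure described below; here they are random, so only the shapes and data flow are meaningful.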
In one embodiment, when inputting the two-dimensional color image and its corresponding depth image into the convolutional layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image, the processor 701 may execute:
Preprocessing the two-dimensional color image to obtain a face region image in the two-dimensional color image;
Preprocessing the depth image to obtain a face region image in the depth image;
Inputting the face region image in the two-dimensional color image and the face region image in the depth image into the convolutional layer for feature extraction, to obtain the joint global feature of the two-dimensional color image and the depth image.
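As an illustration of this preprocessing, both images can be cropped with the same rectangular template so that the color and depth face regions stay pixel-aligned. The bounding box is assumed to come from a separate face detector, which the patent does not fix and this sketch does not include:

```python
import numpy as np

def crop_face(image, box):
    """Cut the face region out of an image using a rectangular template.

    image: (H, W) depth map or (H, W, 3) color image.
    box:   (top, left, height, width) of the detected face (assumed given).
    """
    t, l, h, w = box
    return image[t:t + h, l:l + w]

color = np.zeros((480, 640, 3), dtype=np.uint8)   # placeholder color image
depth = np.zeros((480, 640), dtype=np.float32)    # placeholder depth image
box = (120, 200, 240, 240)                        # hypothetical detection

# The same template is applied to both images so the two crops stay aligned.
face_color = crop_face(color, box)
face_depth = crop_face(depth, box)
```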
In one embodiment, before shooting the face to be detected through the monocular camera 703 to obtain the two-dimensional color image of the face to be detected, the processor 701 may execute:
Shooting a plurality of different living faces through the monocular camera 703 to obtain a plurality of two-dimensional color living face sample images, and acquiring a depth image corresponding to each two-dimensional color living face sample image, to obtain a plurality of first depth images;
Shooting a plurality of different non-living faces through the monocular camera 703 to obtain a plurality of two-dimensional color non-living face sample images, and acquiring a depth image corresponding to each two-dimensional color non-living face sample image, to obtain a plurality of second depth images;
Taking each two-dimensional color living face sample image and its corresponding first depth image as a positive sample, and each two-dimensional color non-living face sample image and its corresponding second depth image as a negative sample, to construct a training sample set;
Performing model training on the training sample set using a convolutional neural network, to obtain a convolutional neural network model as the living body detection model.
In one embodiment, before performing model training on the training sample set using the convolutional neural network, the processor 701 may execute:
Performing sample expansion processing on the training sample set according to a preset sample expansion strategy.
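The patent leaves the expansion strategy as a preset. One common choice is geometric plus photometric augmentation; the sketch below is a hypothetical strategy (mirroring both images, jittering only the color brightness), not the patent's:

```python
import numpy as np

rng = np.random.default_rng(42)

def expand_sample(color, depth):
    """Expand one (color, depth) sample into several.

    Illustrative strategy: keep the original, mirror both images so they stay
    aligned, and brightness-jitter the color image only (depth values are
    geometric and must not be photometrically altered).
    """
    out = [(color, depth)]
    out.append((color[:, ::-1], depth[:, ::-1]))         # horizontal mirror
    gain = rng.uniform(0.8, 1.2)                         # brightness jitter
    out.append((np.clip(color * gain, 0, 255), depth))
    return out

color = rng.integers(0, 256, (8, 8, 3)).astype(np.float64)
depth = rng.random((8, 8))
expanded = expand_sample(color, depth)   # 3 training samples from 1
```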
In one embodiment, when acquiring the depth image corresponding to each two-dimensional color living face sample image to obtain the plurality of first depth images, the processor 701 may execute:
Receiving a calibrated distance from each pixel in each two-dimensional color living face sample image to the monocular camera 703;
Generating, according to the distance from each pixel in each two-dimensional color living face sample image to the monocular camera 703, a depth image corresponding to each two-dimensional color living face sample image, to obtain the plurality of first depth images.
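Since the calibrated distances are given per pixel, a depth image can be generated by clipping and quantizing them. The near/far working range below is an assumption for illustration, not a value from the patent:

```python
import numpy as np

def distances_to_depth_image(distances, near=0.2, far=2.0):
    """Turn calibrated per-pixel camera distances (meters) into an 8-bit
    depth image. near/far define an assumed working range; distances outside
    it are clipped before quantization."""
    d = np.clip(distances, near, far)
    return np.round((d - near) / (far - near) * 255).astype(np.uint8)

# e.g. a face held at ~0.5 m, with the nose region slightly closer -- a flat
# photo of a face would instead produce a near-constant depth image
distances = np.full((4, 4), 0.5)
distances[1:3, 1:3] = 0.45
depth_img = distances_to_depth_image(distances)
```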
In one embodiment, when acquiring the depth image corresponding to each two-dimensional color non-living face sample image to obtain the plurality of second depth images, the processor 701 may execute:
Receiving a calibrated distance from each pixel in each two-dimensional color non-living face sample image to the monocular camera 703;
Generating, according to the distance from each pixel in each two-dimensional color non-living face sample image to the monocular camera 703, a depth image corresponding to each two-dimensional color non-living face sample image, to obtain the plurality of second depth images.
In one embodiment, the processor 701 may further execute:
Taking each two-dimensional color living face sample image and each two-dimensional color non-living face sample image as training input, and the first depth image corresponding to each two-dimensional color living face sample image and the second depth image corresponding to each two-dimensional color non-living face sample image as target output, and performing supervised model training to obtain the depth estimation model.
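The patent does not fix the architecture of the depth estimation model; to illustrate only the supervised input → target setup, here is a toy per-pixel linear regressor trained with gradient descent on synthetic (color, depth) pairs. A practical implementation would use a far larger model such as an encoder-decoder CNN:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for the training set: per-pixel RGB features (inputs)
# and their calibrated depths (targets). Shapes and values are illustrative.
X = rng.random((500, 3))                                 # per-pixel RGB
true_w = np.array([0.2, -0.5, 1.0])                      # hidden mapping
y = X @ true_w + 0.01 * rng.standard_normal(500)         # noisy depth targets

w = np.zeros(3)
for _ in range(2000):                                    # supervised training:
    grad = 2 * X.T @ (X @ w - y) / len(y)                # minimize MSE between
    w -= 0.1 * grad                                      # prediction and target

pred = X @ w                                             # learned depth estimate
mse = np.mean((pred - y) ** 2)
```

The point carried over from the text is only the training signal: color images are the input and the corresponding depth images are the regression target.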
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
It should be noted that, for the living body detection method of the embodiments of the present application, a person of ordinary skill in the art can understand that all or part of the process of implementing the living body detection method of the embodiments of the present application can be completed by controlling relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, for example in a memory of an electronic device, and executed by at least one processor in the electronic device; the execution process may include the flow of the embodiments of the living body detection method. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
For the living body detection apparatus of the embodiments of the present application, its functional modules may be integrated into one processing chip, each module may exist physically alone, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. When the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The living body detection method, apparatus, storage medium, and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, for those skilled in the art, there will be changes in the specific implementations and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (20)

  1. A living body detection method, applied to an electronic device, the electronic device comprising a monocular camera, wherein the method comprises:
    Shooting a face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;
    Inputting the two-dimensional color image into a pre-trained depth estimation model for depth estimation, to obtain a depth image corresponding to the two-dimensional color image;
    Inputting the two-dimensional color image and the depth image into a pre-trained living body detection model for living body detection, to obtain a detection result.
  2. The living body detection method according to claim 1, wherein the living body detection model is a convolutional neural network model comprising a convolutional layer, a pooling layer, and a fully connected layer connected in sequence, and the inputting the two-dimensional color image and the depth image into the pre-trained living body detection model to obtain the detection result comprises:
    Inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction, to obtain a joint global feature of the two-dimensional color image and the depth image;
    Inputting the joint global feature into the pooling layer for feature dimensionality reduction, to obtain a dimensionality-reduced joint global feature;
    Inputting the dimensionality-reduced joint global feature into the fully connected layer for classification, to obtain a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.
  3. The living body detection method according to claim 2, wherein the inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image comprises:
    Preprocessing the two-dimensional color image to obtain a face region image in the two-dimensional color image;
    Preprocessing the depth image to obtain a face region image in the depth image;
    Inputting the face region image in the two-dimensional color image and the face region image in the depth image into the convolutional layer for feature extraction, to obtain the joint global feature of the two-dimensional color image and the depth image.
  4. The living body detection method according to claim 3, wherein the preprocessing the two-dimensional color image to obtain the face region image in the two-dimensional color image comprises:
    Extracting the face region image from the two-dimensional color image using an elliptical template, a circular template, or a rectangular template.
  5. The living body detection method according to claim 2, wherein before the shooting the face to be detected through the monocular camera to obtain the two-dimensional color image of the face to be detected, the method further comprises:
    Shooting a plurality of different living faces through the monocular camera to obtain a plurality of two-dimensional color living face sample images, and acquiring a depth image corresponding to each of the two-dimensional color living face sample images, to obtain a plurality of first depth images;
    Shooting a plurality of different non-living faces through the monocular camera to obtain a plurality of two-dimensional color non-living face sample images, and acquiring a depth image corresponding to each of the two-dimensional color non-living face sample images, to obtain a plurality of second depth images;
    Taking each of the two-dimensional color living face sample images and its corresponding first depth image as a positive sample, and each of the two-dimensional color non-living face sample images and its corresponding second depth image as a negative sample, to construct a training sample set;
    Performing model training on the training sample set using a convolutional neural network, to obtain the convolutional neural network model.
  6. The living body detection method according to claim 5, wherein before the performing model training on the training sample set using the convolutional neural network to obtain the convolutional neural network model, the method further comprises:
    Performing sample expansion processing on the training sample set according to a preset sample expansion strategy.
  7. The living body detection method according to claim 5, wherein the acquiring a depth image corresponding to each of the two-dimensional color living face sample images to obtain a plurality of first depth images comprises:
    Receiving a calibrated distance from each pixel in each of the two-dimensional color living face sample images to the monocular camera;
    Generating, according to the distance from each pixel in each of the two-dimensional color living face sample images to the monocular camera, a depth image corresponding to each of the two-dimensional color living face sample images, to obtain the plurality of first depth images.
  8. The living body detection method according to claim 5, wherein the living body detection method further comprises:
    Taking each of the two-dimensional color living face sample images and each of the two-dimensional color non-living face sample images as training input, and the first depth image corresponding to each of the two-dimensional color living face sample images and the second depth image corresponding to each of the two-dimensional color non-living face sample images as target output, and performing supervised model training to obtain the depth estimation model.
  9. The living body detection method according to claim 1, wherein before the inputting the two-dimensional color image into the pre-trained depth estimation model for depth estimation, the method further comprises:
    Calling the depth estimation model locally, or calling the depth estimation model from a server.
  10. A living body detection apparatus, applied to an electronic device, wherein the apparatus comprises:
    a color image acquisition module, configured to shoot a face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected;
    a depth image acquisition module, configured to input the two-dimensional color image into a pre-trained depth estimation model to obtain a depth image corresponding to the two-dimensional color image;
    a living face detection module, configured to input the two-dimensional color image and the depth image into a pre-trained living body detection model to obtain a detection result.
  11. A storage medium having a computer program stored thereon, wherein, when the computer program runs on a computer, the computer is caused to execute:
    Shooting a face to be detected through a monocular camera to obtain a two-dimensional color image of the face to be detected;
    Inputting the two-dimensional color image into a pre-trained depth estimation model for depth estimation, to obtain a depth image corresponding to the two-dimensional color image;
    Inputting the two-dimensional color image and the depth image into a pre-trained living body detection model for living body detection, to obtain a detection result.
  12. An electronic device, comprising a processor, a memory, and a monocular camera, the memory storing a computer program, wherein the processor is configured to execute, by calling the computer program:
    Shooting a face to be detected through the monocular camera to obtain a two-dimensional color image of the face to be detected;
    Inputting the two-dimensional color image into a pre-trained depth estimation model for depth estimation, to obtain a depth image corresponding to the two-dimensional color image;
    Inputting the two-dimensional color image and the depth image into a pre-trained living body detection model for living body detection, to obtain a detection result.
  13. The electronic device according to claim 12, wherein the living body detection model is a convolutional neural network model comprising a convolutional layer, a pooling layer, and a fully connected layer connected in sequence, and when inputting the two-dimensional color image and the depth image into the pre-trained living body detection model to obtain the detection result, the processor is configured to execute:
    Inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction, to obtain a joint global feature of the two-dimensional color image and the depth image;
    Inputting the joint global feature into the pooling layer for feature dimensionality reduction, to obtain a dimensionality-reduced joint global feature;
    Inputting the dimensionality-reduced joint global feature into the fully connected layer for classification, to obtain a detection result that the face to be detected is a living face, or a detection result that the face to be detected is a non-living face.
  14. The electronic device according to claim 13, wherein, when inputting the two-dimensional color image and the depth image into the convolutional layer for feature extraction to obtain the joint global feature of the two-dimensional color image and the depth image, the processor is configured to execute:
    Preprocessing the two-dimensional color image to obtain a face region image in the two-dimensional color image;
    Preprocessing the depth image to obtain a face region image in the depth image;
    Inputting the face region image in the two-dimensional color image and the face region image in the depth image into the convolutional layer for feature extraction, to obtain the joint global feature of the two-dimensional color image and the depth image.
  15. The electronic device according to claim 14, wherein, when preprocessing the two-dimensional color image to obtain the face region image in the two-dimensional color image, the processor is configured to execute:
    Extracting the face region image from the two-dimensional color image using an elliptical template, a circular template, or a rectangular template.
  16. The electronic device according to claim 13, wherein, before shooting the face to be detected through the monocular camera to obtain the two-dimensional color image of the face to be detected, the processor is further configured to execute:
    Shooting a plurality of different living faces through the monocular camera to obtain a plurality of two-dimensional color living face sample images, and acquiring a depth image corresponding to each of the two-dimensional color living face sample images, to obtain a plurality of first depth images;
    Shooting a plurality of different non-living faces through the monocular camera to obtain a plurality of two-dimensional color non-living face sample images, and acquiring a depth image corresponding to each of the two-dimensional color non-living face sample images, to obtain a plurality of second depth images;
    Taking each of the two-dimensional color living face sample images and its corresponding first depth image as a positive sample, and each of the two-dimensional color non-living face sample images and its corresponding second depth image as a negative sample, to construct a training sample set;
    Performing model training on the training sample set using a convolutional neural network, to obtain the convolutional neural network model.
  17. The electronic device according to claim 16, wherein, before performing model training on the training sample set using the convolutional neural network to obtain the convolutional neural network model, the processor is further configured to execute:
    Performing sample expansion processing on the training sample set according to a preset sample expansion strategy.
  18. The electronic device according to claim 16, wherein, when acquiring the depth image corresponding to each of the two-dimensional color living face sample images to obtain the plurality of first depth images, the processor is configured to execute:
    Receiving a calibrated distance from each pixel in each of the two-dimensional color living face sample images to the monocular camera;
    Generating, according to the distance from each pixel in each of the two-dimensional color living face sample images to the monocular camera, a depth image corresponding to each of the two-dimensional color living face sample images, to obtain the plurality of first depth images.
  19. The electronic device according to claim 16, wherein the processor is further configured to execute:
    Taking each of the two-dimensional color living face sample images and each of the two-dimensional color non-living face sample images as training input, and the first depth image corresponding to each of the two-dimensional color living face sample images and the second depth image corresponding to each of the two-dimensional color non-living face sample images as target output, and performing supervised model training to obtain the depth estimation model.
  20. The electronic device according to claim 12, wherein, before inputting the two-dimensional color image into the pre-trained depth estimation model for depth estimation, the processor is further configured to execute:
    Calling the depth estimation model locally, or calling the depth estimation model from a server.
PCT/CN2019/125957 2018-12-20 2019-12-17 Method and device for live body detection, storage medium, and electronic device WO2020125623A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811565579.0 2018-12-20
CN201811565579.0A CN109635770A (en) 2018-12-20 2018-12-20 Biopsy method, device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2020125623A1 true WO2020125623A1 (en) 2020-06-25


Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797745A (en) * 2020-06-28 2020-10-20 北京百度网讯科技有限公司 Training and predicting method, device, equipment and medium of object detection model
CN111914758A (en) * 2020-08-04 2020-11-10 成都奥快科技有限公司 Face in-vivo detection method and device based on convolutional neural network
CN111985427A (en) * 2020-08-25 2020-11-24 深圳前海微众银行股份有限公司 Living body detection method, living body detection apparatus, and readable storage medium
CN112069936A (en) * 2020-08-21 2020-12-11 深圳市商汤科技有限公司 Attack point testing method and related device, electronic equipment and storage medium
CN112183357A (en) * 2020-09-29 2021-01-05 深圳龙岗智能视听研究院 Deep learning-based multi-scale in-vivo detection method and system
CN112200057A (en) * 2020-09-30 2021-01-08 汉王科技股份有限公司 Face living body detection method and device, electronic equipment and storage medium
CN112434647A (en) * 2020-12-09 2021-03-02 浙江光珀智能科技有限公司 Human face living body detection method
CN112699811A (en) * 2020-12-31 2021-04-23 中国联合网络通信集团有限公司 Living body detection method, apparatus, device, storage medium, and program product
CN113378715A (en) * 2021-06-10 2021-09-10 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN113542527A (en) * 2020-11-26 2021-10-22 腾讯科技(深圳)有限公司 Face image transmission method and device, electronic equipment and storage medium
CN114550274A (en) * 2022-03-16 2022-05-27 北京人人云图信息技术有限公司 Face fraud recognition method based on mask face detection
US20230112452A1 (en) * 2020-04-16 2023-04-13 Samsung Electronics Co., Ltd. Method and apparatus for testing liveness

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635770A (en) * 2018-12-20 2019-04-16 上海瑾盛通信科技有限公司 Biopsy method, device, storage medium and electronic equipment
CN110245645B (en) * 2019-06-21 2021-06-08 北京字节跳动网络技术有限公司 Face living body identification method, device, equipment and storage medium
CN110334628B (en) * 2019-06-26 2021-07-27 华中科技大学 Outdoor monocular image depth estimation method based on structured random forest
CN110674759A (en) * 2019-09-26 2020-01-10 深圳市捷顺科技实业股份有限公司 Monocular face in-vivo detection method, device and equipment based on depth map
CN111091063B (en) * 2019-11-20 2023-12-29 北京迈格威科技有限公司 Living body detection method, device and system
CN112861586B (en) * 2019-11-27 2022-12-13 马上消费金融股份有限公司 Living body detection, image classification and model training method, device, equipment and medium
CN111881706B (en) * 2019-11-27 2021-09-03 马上消费金融股份有限公司 Living body detection, image classification and model training method, device, equipment and medium
CN111191521B (en) * 2019-12-11 2022-08-12 智慧眼科技股份有限公司 Face living body detection method and device, computer equipment and storage medium
CN111046845A (en) * 2019-12-25 2020-04-21 上海骏聿数码科技有限公司 Living body detection method, device and system
TWI722872B (en) * 2020-04-17 2021-03-21 技嘉科技股份有限公司 Face recognition device and face recognition method
CN113553887A (en) * 2020-04-26 2021-10-26 华为技术有限公司 Monocular camera-based in-vivo detection method and device and readable storage medium
CN111753658A (en) * 2020-05-20 2020-10-09 高新兴科技集团股份有限公司 Post sleep warning method and device and computer equipment
CN112036331B (en) * 2020-09-03 2024-04-09 腾讯科技(深圳)有限公司 Living body detection model training method, device, equipment and storage medium
CN112115831B (en) * 2020-09-10 2024-03-15 深圳印像数据科技有限公司 Living body detection image preprocessing method
CN112270303A (en) * 2020-11-17 2021-01-26 北京百度网讯科技有限公司 Image recognition method and device and electronic equipment
CN112508812B (en) * 2020-12-01 2024-08-27 厦门美图之家科技有限公司 Image color cast correction method, model training method, device and equipment
CN113435408A (en) * 2021-07-21 2021-09-24 北京百度网讯科技有限公司 Face living body detection method and device, electronic equipment and storage medium
CN113705428B (en) * 2021-08-26 2024-07-19 北京市商汤科技开发有限公司 Living body detection method and device, electronic equipment and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180157938A1 (en) * 2016-12-07 2018-06-07 Samsung Electronics Co., Ltd. Target detection method and apparatus
CN108876833A (en) * 2018-03-29 2018-11-23 北京旷视科技有限公司 Image processing method, image processing apparatus and computer readable storage medium
CN109034102A (en) * 2018-08-14 2018-12-18 腾讯科技(深圳)有限公司 Human face in-vivo detection method, device, equipment and storage medium
CN109635770A (en) * 2018-12-20 2019-04-16 上海瑾盛通信科技有限公司 Biopsy method, device, storage medium and electronic equipment

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005215750A (en) * 2004-01-27 2005-08-11 Canon Inc Face detecting device and face detecting method
JP6027070B2 (en) * 2014-09-24 2016-11-16 富士フイルム株式会社 Area detection apparatus, area detection method, image processing apparatus, image processing method, program, and recording medium
GB201508074D0 (en) * 2015-05-12 2015-06-24 Apical Ltd People detection
US9691152B1 (en) * 2015-08-14 2017-06-27 A9.Com, Inc. Minimizing variations in camera height to estimate distance to objects
CN107871134A (en) * 2016-09-23 2018-04-03 北京眼神科技有限公司 Face detection method and device
CN108171204B (en) * 2018-01-17 2019-09-17 百度在线网络技术(北京)有限公司 Detection method and device
CN108537152B (en) * 2018-03-27 2022-01-25 百度在线网络技术(北京)有限公司 Method and apparatus for detecting living body
CN108764024B (en) * 2018-04-09 2020-03-24 平安科技(深圳)有限公司 Device and method for generating face recognition model and computer readable storage medium
CN108960127B (en) * 2018-06-29 2021-11-05 厦门大学 Shielded pedestrian re-identification method based on adaptive depth measurement learning
CN108898112A (en) * 2018-07-03 2018-11-27 东北大学 Near-infrared face liveness detection method and system
CN109003297B (en) * 2018-07-18 2020-11-24 亮风台(上海)信息科技有限公司 Monocular depth estimation method, device, terminal and storage medium

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230112452A1 (en) * 2020-04-16 2023-04-13 Samsung Electronics Co., Ltd. Method and apparatus for testing liveness
US11836235B2 (en) * 2020-04-16 2023-12-05 Samsung Electronics Co., Ltd. Method and apparatus for testing liveness
CN111797745A (en) * 2020-06-28 2020-10-20 北京百度网讯科技有限公司 Training and predicting method, device, equipment and medium of object detection model
CN111914758A (en) * 2020-08-04 2020-11-10 成都奥快科技有限公司 Face in-vivo detection method and device based on convolutional neural network
CN112069936A (en) * 2020-08-21 2020-12-11 深圳市商汤科技有限公司 Attack point testing method and related device, electronic equipment and storage medium
CN111985427A (en) * 2020-08-25 2020-11-24 深圳前海微众银行股份有限公司 Living body detection method, living body detection apparatus, and readable storage medium
CN112183357A (en) * 2020-09-29 2021-01-05 深圳龙岗智能视听研究院 Deep learning-based multi-scale in-vivo detection method and system
CN112183357B (en) * 2020-09-29 2024-03-26 深圳龙岗智能视听研究院 Multi-scale living body detection method and system based on deep learning
CN112200057A (en) * 2020-09-30 2021-01-08 汉王科技股份有限公司 Face living body detection method and device, electronic equipment and storage medium
CN112200057B (en) * 2020-09-30 2023-10-31 汉王科技股份有限公司 Face living body detection method and device, electronic equipment and storage medium
CN113542527A (en) * 2020-11-26 2021-10-22 腾讯科技(深圳)有限公司 Face image transmission method and device, electronic equipment and storage medium
CN113542527B (en) * 2020-11-26 2023-08-18 腾讯科技(深圳)有限公司 Face image transmission method and device, electronic equipment and storage medium
CN112434647A (en) * 2020-12-09 2021-03-02 浙江光珀智能科技有限公司 Human face living body detection method
CN112699811A (en) * 2020-12-31 2021-04-23 中国联合网络通信集团有限公司 Living body detection method, apparatus, device, storage medium, and program product
CN112699811B (en) * 2020-12-31 2023-11-03 中国联合网络通信集团有限公司 Living body detection method, apparatus, device, storage medium, and program product
CN113378715A (en) * 2021-06-10 2021-09-10 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN113378715B (en) * 2021-06-10 2024-01-05 北京华捷艾米科技有限公司 Living body detection method based on color face image and related equipment
CN114550274A (en) * 2022-03-16 2022-05-27 北京人人云图信息技术有限公司 Face fraud recognition method based on mask face detection

Also Published As

Publication number Publication date
CN109635770A (en) 2019-04-16

Similar Documents

Publication Publication Date Title
WO2020125623A1 (en) Method and device for live body detection, storage medium, and electronic device
US11645506B2 (en) Neural network for skeletons from input images
Kumar et al. Face detection techniques: a review
CN107766786B (en) Activity test method and activity test computing device
US10002313B2 (en) Deeply learned convolutional neural networks (CNNS) for object localization and classification
WO2021043168A1 (en) Person re-identification network training method and person re-identification method and apparatus
US11704907B2 (en) Depth-based object re-identification
US20190279052A1 (en) Image recognition method and apparatus, image verification method and apparatus, learning method and apparatus to recognize image, and learning method and apparatus to verify image
WO2019227479A1 (en) Method and apparatus for generating face rotation image
WO2017088432A1 (en) Image recognition method and device
WO2021018245A1 (en) Image classification method and apparatus
Yang et al. Facial expression recognition based on dual-feature fusion and improved random forest classifier
JP7228961B2 (en) Neural network learning device and its control method
CN111183455A (en) Image data processing system and method
CN109963072B (en) Focusing method, focusing device, storage medium and electronic equipment
CN114339054B (en) Method and device for generating photographing mode and computer readable storage medium
WO2021217919A1 (en) Facial action unit recognition method and apparatus, and electronic device, and storage medium
Liu RETRACTED ARTICLE: Video Face Detection Based on Deep Learning
JP7360217B2 (en) Method for obtaining data from an image of an object of a user having biometric characteristics of the user
US20220277579A1 (en) Clustered dynamic graph convolutional neural network (cnn) for biometric three-dimensional (3d) hand recognition
Cui et al. Improving the face recognition system by hybrid image preprocessing
Srinivas et al. E-CNN-FFE: An Enhanced Convolutional Neural Network for Facial Feature Extraction and Its Comparative Analysis with FaceNet, DeepID, and LBPH Methods
CN113128289B (en) Face recognition feature extraction calculation method and equipment
Chatterjee Deep Convolutional Neural Networks for the Face and Iris Based Presentation Attack Mitigation
Sarkar Partial Face Detection and Illumination Estimation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19901333

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19901333

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 23.12.2021)
