CN112541576B - Biological living body identification neural network construction method of RGB monocular image - Google Patents
Biological living body identification neural network construction method of RGB monocular image
- Publication number
- CN112541576B (application CN202011475744.0A)
- Authority
- CN
- China
- Prior art keywords
- layer
- attention
- convolution
- module
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
- G06V40/45—Detection of the body part being alive
Abstract
The invention discloses a biological living body identification neural network for RGB monocular images, which comprises a root module, a plurality of repeatable modules, a feature extraction module and an output module connected sequentially from front to back. The root module comprises a convolution layer, a batch normalization layer and an activation layer which are sequentially connected and packaged from front to back. The repeatable module comprises a convolution layer, a batch normalization layer, an activation layer, a depth separable convolution layer, a batch normalization layer, an activation layer, a convolution layer and a batch normalization layer which are sequentially connected and packaged from front to back; if the repeatable module performs downsampling, a spatial attention layer is also provided at its tail end. The feature extraction module comprises a global average pooling layer, a full-connection layer, an activation layer and a regularization layer which are sequentially connected and packaged from front to back. The output module is a full-connection layer whose weight values are regularized. Through this scheme, the invention achieves simple logic, a small technical workload and high calculation accuracy.
Description
Technical Field
The invention relates to the technical field of biological living body identification in computer face recognition, in particular to a biological living body identification neural network of RGB monocular images and a construction method thereof.
Background
In the technical field of living organism identification within computer face recognition, face recognition technology uses a camera or sensor to collect face images and related information and performs functions such as identity comparison, identity confirmation and attribute identification. At present, computer face recognition is widely applied in security, attendance, finance, transportation, intelligent terminals and many other fields. In practical applications, it must be accurately determined whether the face collected by the camera or sensor comes from a living organism, i.e. a real natural person, rather than from a non-living attack, such as spoofing with a photograph or a mobile-phone video. Biological living body identification is therefore essential to the safety and reliability of the whole face recognition process.
Currently, a variety of biological living body identification techniques exist. For example, a 3D structured-light lens may be used to reconstruct the three-dimensional structure of the shooting target, or an infrared sensor may be used to collect infrared characteristic information of the target. The Chinese patent application No. 202011114943.9, entitled "Face vein combined face recognition method and device", uses an infrared camera to collect face vein images and an RGB camera to collect living and non-living face images, then fuses the vein images with the face photographs to form preprocessed living and non-living face images; the three-channel images are combined to form a new face image. Because the distribution of facial vein images is relatively complex, the merging operation is computationally expensive and increases the hardware investment cost.
The Chinese invention patent application No. 201710478894.9, entitled "Photo-spoofing convolutional neural network training method and human face living body detection method", constructs a training set, acquires images from it, detects the face in each image, crops and normalizes the face, and feeds it into a convolutional neural network comprising an input layer, several convolutional layers, a ReLU layer, a max-pooling layer, a full-connection layer, a Dropout layer and a SoftmaxWithLoss layer; the convolutional neural network is then trained. The disadvantage is that such networks and loss functions require high consistency between the data distribution of the training samples and that of samples in practical applications, and are prone to overfitting. If the samples collected in practice differ markedly from the training samples due to illumination, hardware, environment and other factors, the classification accuracy of the neural network is low.
Further, the Chinese patent application No. 201911358984.X, entitled "A multi-feature multi-model living body face recognition method", proceeds as follows: acquire the RGB face image to be identified; decompose the whole area of the RGB face image into several local areas and obtain the RGB image associated with each local area; perform a feature transformation on each local RGB image to obtain the corresponding HSV image, and combine the RGB and HSV images to form the input image information; feed the input image information into each neural network model of the corresponding classification network model to obtain the model features output by each model for that local area; and feed the model features output by all neural network models of all classification network models into a feature output layer to form a feature output matrix, which is input into a fusion feature network model to output the living body face recognition result for the RGB face image. Although this method obtains multiple features and corresponding training models through segmentation, it depends directly on the resolution of the images and photographs; if the resolution and size of a fake photograph are sufficient to match a real living human, recognition errors still occur.
Therefore, although the above methods can identify non-living attacks to a certain extent, they all require specific hardware to collect the related information, at additional cost. For some applications, such as mobile internet applications that must perform authentication through the front camera of a mobile phone, these techniques cannot be used.
As is well known, a monocular RGB image provides more limited information than methods such as 3D structured light. Moreover, in practical applications, the environment and illumination during image acquisition, the attack modes of non-living organisms, and the specifications and models of the acquisition cameras all vary widely. As a result, neural networks trained on monocular RGB images tend to have lower accuracy and are prone to overfitting, i.e. they perform well in some scenarios while their performance degrades significantly in others.
Therefore, there is an urgent need for a biological living body recognition neural network that uses an ordinary camera, and a construction method thereof, with simple logic, a small calculation workload and high precision.
Disclosure of Invention
Aiming at the problems, the invention aims to provide a biological living body identification neural network of RGB monocular images and a construction method thereof, and adopts the following technical scheme:
the biological living body identification neural network of the RGB monocular image comprises a root module, a plurality of repeatable modules, a feature extraction module and an output module which are sequentially connected from front to back;
the root module comprises a convolution layer, a batch normalization layer and an activation layer which are sequentially connected and packaged from front to back;
the repeatable module comprises a convolution layer, a batch normalization layer, an activation layer, a depth separable convolution layer, a batch normalization layer, an activation layer, a convolution layer and a batch normalization layer which are sequentially connected and packaged from front to back; if the repeatable module performs downsampling, a spatial attention layer is also provided at the tail end of the repeatable module;
the feature extraction module comprises a global average pooling layer, a full-connection layer, an activation layer and a regularization layer which are sequentially connected and packaged from front to back;
the output module is a full-connection layer with weight values subjected to regularization treatment.
A construction method of a biological living body identification neural network of RGB monocular images comprises the following steps:
the root module is obtained by connecting and packaging the convolution layer, the batch normalization layer and the activation layer from front to back in sequence;
sequentially connecting and packaging a convolution layer, a batch normalization layer, an activation layer, a depth separable convolution layer, a batch normalization layer, an activation layer, a convolution layer and a batch normalization layer from front to back to obtain a repeatable module; if the repeatable module performs downsampling, a spatial attention layer is also provided at the tail end of the repeatable module;
the feature extraction module is obtained by sequentially connecting and packaging a global average pooling layer, a full connecting layer, an activating layer and a regularization layer from front to back;
the full-connection layer with the weight value subjected to regularization treatment is used as an output module;
and sequentially connecting the root module, the repeatable modules, the feature extraction module and the output module to obtain the convolutional neural network.
Further, the step size of the depth separable convolution layer is set to 2.
Still further, the input and output of the spatial attention layer are computed by the following steps:
using the input feature tensor of the spatial attention module together with an attention mapping function, obtaining an attention map that has the same size as the input feature tensor and a single channel;
performing element multiplication between the attention map and the input feature tensor, with the attention map broadcast-expanded in the channel direction, to obtain the weighted attention feature tensor, with the expression:

X′ = F_at(X) ⊙ X

where X is the input feature tensor of the spatial attention module, X′ is the weighted attention feature tensor, ⊙ denotes element-wise multiplication, and F_at is the attention mapping function;
splicing the weighted attention characteristic tensor and the input characteristic tensor along the channel direction to obtain an output tensor of the spatial attention layer, wherein the expression is as follows:
X_out = concat([X, X′], axis="channel")
where X_out is the output tensor of the spatial attention layer, concat is the channel concatenation function, and the axis parameter specifies that the tensors are concatenated along the channel dimension.
Still further, the attention map of the spatial attention module is generated by a first convolution layer, a second convolution layer and an activation layer using a sigmoid function, which are sequentially arranged and packaged from front to back; and the expression of the attention mapping function is:

F_at(X) = sigmoid(Conv_2(Conv_1(X)))

where sigmoid is the sigmoid activation function, and Conv_1 and Conv_2 are the convolution operation functions of the first and second convolution layers, respectively.
Preferably, the convolution kernel size of the first convolution layer is 1x1, and the number of convolution output channels is 1; the convolution kernel size of the second convolution layer is one of 3x3, 5x5 and 7x7, and the number of convolution output channels is 1.
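As a concrete illustration, the following is a minimal PyTorch sketch of this spatial attention layer; the class and argument names are illustrative assumptions, while the two single-channel convolutions, the sigmoid, the broadcast element-wise multiplication and the channel concatenation follow the text above.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Spatial attention layer: attention map -> weighted tensor -> concat."""

    def __init__(self, in_channels: int, kernel_size: int = 3):
        super().__init__()
        assert kernel_size in (3, 5, 7)  # choices named in the text
        # First convolution layer: 1x1 kernel, 1 output channel.
        self.conv1 = nn.Conv2d(in_channels, 1, kernel_size=1)
        # Second convolution layer: 3x3/5x5/7x7 kernel, 1 output channel.
        self.conv2 = nn.Conv2d(1, 1, kernel_size, padding=kernel_size // 2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention map F_at(X): single channel, same spatial size as x.
        att = self.sigmoid(self.conv2(self.conv1(x)))
        # Broadcast along the channel direction and weight the input:
        x_weighted = att * x  # X' = F_at(X) ⊙ X
        # Concatenate input and weighted tensors along the channel axis.
        return torch.cat([x, x_weighted], dim=1)
```

Note that the concatenation doubles the channel count of the layer's output relative to its input, which any module that follows must account for.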
Preferably, the regularization layer employs L2 regularization.
Further, the full-connection layer with regularized weight values adopts L2 regularization, the number of output neurons of the full-connection layer is 1, and the expression is:

w = Normalize(w_0)

cosθ = Dense_w(Y)

where w_0 is the original weight of the full-connection layer, Normalize is the L2 regularization function, Dense is the full-connection operation function with w as its regularized weight, Y is the input tensor of the full-connection layer, and cosθ is the output tensor of the full-connection layer, whose mathematical meaning is the cosine of the angle between the vectors Y and w.
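A sketch of this output layer, assuming PyTorch; per the embodiment described later, the layer uses no bias, and the input feature Y is taken to be already L2-normalized by the feature extraction module, so the output is exactly the cosine of the angle between Y and w.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineOutputLayer(nn.Module):
    """Full-connection output layer with L2-regularized weight and no bias."""

    def __init__(self, in_features: int):
        super().__init__()
        # w0: the unnormalized weight of the single output neuron.
        self.w0 = nn.Parameter(torch.randn(1, in_features))

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        w = F.normalize(self.w0, dim=1)  # w = Normalize(w0)
        return F.linear(y, w)            # cos(theta) = Dense_w(Y)
```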
Further, the output of the convolutional neural network is the cosine of the angle between the feature vector output by the feature extraction module and the regularized weight vector of the output full-connection layer:
when the training sample of the convolutional neural network is a living sample, the loss function of the neural network takes a live-sample form; when the training sample is a non-living sample, the loss function takes a non-live form. Here cosθ denotes the output of the full-connection layer with regularized weight values, and s and m are hyperparameters, where s may take any value between 10 and 90 and m any value between 0.2 and 0.7.
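The two loss expressions appear only as images in the original publication and are not reproduced in this text. As a labelled assumption, a softplus-hinge pair consistent with the behaviour stated in the advantages below — compressing the live-sample angle, and penalising a non-live sample only while cosθ > m − 1, i.e. while its angle is smaller than arccos(m − 1) — would be:

```latex
% Assumed reconstruction; the patent's original expressions are not reproduced here.
L_{\mathrm{live}} = \log\!\left(1 + e^{-s(\cos\theta - m)}\right), \qquad
L_{\mathrm{non\text{-}live}} = \log\!\left(1 + e^{\,s\left(\cos\theta - (m-1)\right)}\right)
```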
Preferably, the hyperparameter s takes the value 30 and the hyperparameter m takes the value 0.5.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention adopts a spatial attention layer; the weighted attention feature tensor within it carries spatial attention information, so spatial structure information that may appear in an attack image, such as a screen bezel, can be attended to more accurately, effectively improving network accuracy;
(2) The invention concatenates the weighted attention feature tensor with the input tensor of the attention layer along the channel dimension; the weighted attention feature tensor attends accurately to spatial structure information, while the input tensor of the attention layer better preserves texture detail information, such as the moiré patterns that may appear in an attack image when a screen is re-photographed; this channel-wise concatenation exploits both kinds of information and thereby improves the accuracy of the network;
(3) The spatial attention layer is used only when a repeatable module performs downsampling; if a repeatable module does not downsample, no attention layer is added, so network accuracy is improved while the parameters and computation of the network increase only slightly;
(4) In the feature extraction layer and the output layer, both the extracted features and the weight values of the output full-connection layer are regularized. During training, the loss function is asymmetric: for a living sample, the angle between its feature vector and the regularized weight of the output full-connection layer is compressed to be as small as possible; for non-living samples, the loss penalizes samples whose angle to that weight is smaller than arccos(m-1) and imposes no penalty on samples whose angle is larger than arccos(m-1).
(5) By contrast, the traditional cross-entropy loss penalizes all non-living samples, which easily causes the network to overfit the training samples. The loss function in the invention reduces overfitting and improves the generalization performance of the network, so the trained network performs well in different application scenarios.
In conclusion, the method has simple logic, a small technical workload and high calculation precision, and has high practical and popularization value in the technical field of biological living body identification within computer face recognition.
Drawings
For a clearer description of the technical solutions of the embodiments of the present invention, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should not be considered as limiting the scope of protection; a person skilled in the art may obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic view of a root module structure according to the present invention.
Fig. 2 is a schematic diagram of a repeatable module configuration without downsampling according to the present invention.
Fig. 3 is a schematic diagram of a repeatable module configuration for downsampling according to the present invention.
Fig. 4 is a schematic view of a spatial attention layer structure of the present invention.
Fig. 5 is a schematic structural diagram of a feature extraction module according to the present invention.
Fig. 6 is a schematic structural diagram of an output module according to the present invention.
Fig. 7 is a schematic diagram of the overall structure of the present invention.
Detailed Description
For clarity regarding the purposes, technical solutions and advantages of the present application, the present invention is further described below with reference to the accompanying drawings and examples; embodiments of the present invention include, but are not limited to, the following examples. All other embodiments obtained by a person of ordinary skill in the art without undue burden, based on the embodiments herein, fall within the scope of protection of the present application.
Examples
As shown in figs. 1 to 7, this embodiment provides a biological living body identification neural network for RGB monocular images and its construction method. Compared with common image-based living body classification methods, the invention uses a spatial attention mechanism, regularized feature extraction and an asymmetric loss function, which improve the accuracy of the neural network while giving good generalization across different application scenarios.
In the first step, the root module is obtained by sequentially connecting and packaging a convolution layer, a batch normalization layer and an activation layer from front to back. In this embodiment, the convolution kernel size of the convolution layer is set to 3×3 and the number of output channels is 32.
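A minimal sketch of this root module, assuming PyTorch; the stride, padding and the choice of ReLU as the activation are assumptions the text leaves open.

```python
import torch.nn as nn

def make_root_module(in_channels: int = 3) -> nn.Sequential:
    """Root module: 3x3 convolution (32 channels) -> batch norm -> activation."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(32),
        nn.ReLU(inplace=True),
    )
```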
In the second step, a repeatable module is obtained by sequentially connecting and packaging, from front to back, a convolution layer, a batch normalization layer, an activation layer, a depth separable convolution layer, a batch normalization layer, an activation layer, a convolution layer, a batch normalization layer and an optional spatial attention layer.
The following should be noted: if downsampling is needed within a repeatable module, it is performed in the depth separable convolution layer, i.e. the stride of the depth separable convolution layer is set to 2, and a spatial attention layer is added to the module. If no downsampling is performed, the stride of the depth separable convolution layer is set to 1 and no spatial attention layer is added. A sketch of such a module follows.
In this embodiment, the input and output of the spatial attention layer are computed by the following steps:
(1) Using the input feature tensor of the spatial attention module together with an attention mapping function, obtain an attention map that has the same size as the input feature tensor and a single channel;
(2) Element multiplication: perform element multiplication between the attention map and the input feature tensor, with the attention map broadcast-expanded in the channel direction, to obtain the weighted attention feature tensor, with the expression:

X′ = F_at(X) ⊙ X

where X is the input feature tensor of the spatial attention module, X′ is the weighted attention feature tensor, ⊙ denotes element-wise multiplication, and F_at is the attention mapping function, whose output is a single-channel attention map of the same size as the input tensor.
(3) Splicing the weighted attention characteristic tensor and the input characteristic tensor along the channel direction to obtain an output tensor of the spatial attention layer, wherein the expression is as follows:
X_out = concat([X, X′], axis="channel")
where X_out is the output tensor of the spatial attention layer, concat is the channel concatenation function, and the axis parameter specifies that the tensors are concatenated along the channel dimension.
In this embodiment, the attention map of the spatial attention module is generated by a first convolution layer, a second convolution layer and an activation layer using a sigmoid function, which are sequentially arranged and packaged from front to back; and the expression of the attention mapping function is:

F_at(X) = sigmoid(Conv_2(Conv_1(X)))

where sigmoid is the sigmoid activation function, and Conv_1 and Conv_2 are the convolution operation functions of the first and second convolution layers, respectively.
In this embodiment, 11 repeatable modules are arranged in total and connected sequentially. Downsampling, with the addition of a spatial attention layer, is performed at the 2nd, 6th, 10th and 11th modules. The convolution kernel size of Conv_1 is 1x1 with 1 output channel; the convolution kernel size of Conv_2 is 3x3, also with 1 output channel.
In the third step, the feature extraction module is obtained by sequentially connecting and packaging a global average pooling layer, a full-connection layer, an activation layer and a regularization layer from front to back; the number of output units of the full-connection layer in this step is 256.
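A sketch of this feature extraction module, assuming PyTorch, a ReLU activation, and that the "regularization layer" L2-normalizes the 256-dimensional feature vector (a reading consistent with the cosine output that follows, but an assumption where the text is not explicit).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureExtraction(nn.Module):
    """Global average pooling -> 256-unit FC -> activation -> L2 normalization."""

    def __init__(self, in_channels: int, feat_dim: int = 256):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, feat_dim)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.pool(x).flatten(1)   # global average pooling
        y = self.act(self.fc(y))      # full-connection layer + activation
        return F.normalize(y, dim=1)  # L2-regularized feature vector
```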
In the fourth step, the full-connection layer with regularized weight values (the special full-connection layer) is used as the output module. The main differences between the special full-connection layer and an ordinary full-connection layer are that the weight values of the special layer are regularized and that it uses no bias.
In this embodiment, when the training sample of the convolutional neural network is a living sample, the loss function of the neural network takes the live-sample form given above; when the training sample is a non-living sample, it takes the non-live form. Here cosθ denotes the output of the full-connection layer with regularized weight values, the hyperparameter s is 30, and the hyperparameter m is 0.5.
In the fifth step, the root module, the several repeatable modules, the feature extraction module and the output module are connected sequentially to obtain the living body identification neural network.
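Tying the pieces together, a sketch of this fifth step under the embodiment's configuration (11 repeatable modules, downsampling with spatial attention at the 2nd, 6th, 10th and 11th); the per-module channel widths and the ×2 expansion factor are assumptions, since the patent does not specify them. It reuses make_root_module, RepeatableModule, FeatureExtraction and CosineOutputLayer from the sketches above.

```python
import torch.nn as nn

def build_liveness_network() -> nn.Sequential:
    downsample_at = {2, 6, 10, 11}  # 1-indexed positions of downsampling modules
    ch = 32                         # output channels of the root module
    modules: list[nn.Module] = [make_root_module()]
    for i in range(1, 12):          # 11 repeatable modules in total
        ds = i in downsample_at
        modules.append(RepeatableModule(ch, ch * 2, ch, downsample=ds))
        if ds:
            ch *= 2                 # spatial attention concat doubles channels
    modules.append(FeatureExtraction(ch))    # ch = 512 after four doublings
    modules.append(CosineOutputLayer(256))   # cosine output on the 256-d feature
    return nn.Sequential(*modules)
```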
to verify the feasibility and good performance of the method, the present example was tested with a proprietary live dataset. The data set contains 5 thousands of real person images collected under good illumination conditions, and 2 thousands of attack samples are respectively attacked by using a mobile phone screen and a photo. The test set contains test sets with good illumination (the same scene) and darker illumination (other scenes). The control group used the same training set and test set, but in the neural network used, the spatial attention layer was removed, the special fully-connected layer was replaced with the normal fully-connected layer, and the loss function of the neural network was replaced with the Softmax function commonly used for the classification network. The experimental results are as follows:
as can be seen from the results, the recognition accuracy of the embodiment is high, and the recognition method has good generalization performance for different application scenes, particularly scenes not included in the training set.
In summary, for the field of biological living body identification using RGB monocular images, the invention improves the recognition accuracy of the network by adding a spatial attention layer, a special full-connection layer and an asymmetric loss function, and generalizes well to different application scenarios. Compared with similar technologies, the invention has outstanding substantive features and represents notable progress, with high practical and popularization value.
The above embodiments are only preferred embodiments of the present invention and are not intended to limit its scope of protection; all changes made by adopting the design principle of the present invention without creative work on that basis shall fall within the scope of protection of the present invention.
Claims (6)
1. A construction method of a biological living body recognition neural network for RGB monocular images, characterized in that the neural network comprises a root module, a plurality of repeatable modules, a feature extraction module and an output module which are sequentially connected from front to back;
the root module comprises a convolution layer, a batch normalization layer and an activation layer which are sequentially connected and packaged from front to back;
the repeatable module comprises a convolution layer, a batch normalization layer, an activation layer, a depth separable convolution layer, a batch normalization layer, an activation layer, a convolution layer and a batch normalization layer which are sequentially connected and packaged from front to back; if the repeatable module performs downsampling, a spatial attention layer is also provided at the tail end of the repeatable module;
the feature extraction module comprises a global average pooling layer, a full-connection layer, an activation layer and a regularization layer which are sequentially connected and packaged from front to back;
the output module is a full-connection layer with weight values subjected to regularization treatment;
the input and output of the spatial attention layer are computed by the following steps:
using the input feature tensor of the spatial attention module together with an attention mapping function, obtaining an attention map that has the same size as the input feature tensor and a single channel;
performing element multiplication between the attention map and the input feature tensor, with the attention map broadcast-expanded in the channel direction, to obtain the weighted attention feature tensor, with the expression:

X′ = F_at(X) ⊙ X

where X is the input feature tensor of the spatial attention module, X′ is the weighted attention feature tensor, ⊙ denotes element-wise multiplication, and F_at is the attention mapping function;
splicing the weighted attention characteristic tensor and the input characteristic tensor along the channel direction to obtain an output tensor of the spatial attention layer, wherein the expression is as follows:
X_out = concat([X, X′], axis="channel")
where X_out is the output tensor of the spatial attention layer, concat is the channel concatenation function, and the axis parameter specifies that the tensors are concatenated along the channel dimension;
the attention map of the spatial attention module is generated by a first convolution layer, a second convolution layer and an activation layer using a sigmoid function, which are sequentially arranged and packaged from front to back; and the expression of the attention mapping function is:

F_at(X) = sigmoid(Conv_2(Conv_1(X)))

where sigmoid is the sigmoid activation function, and Conv_1 and Conv_2 are the convolution operation functions of the first and second convolution layers, respectively;
the output of the neural network is the cosine of the angle between the feature vector output by the feature extraction module and the regularized weight vector of the output full-connection layer:
when the training sample of the neural network is a living sample, the loss function of the neural network takes a live-sample form, and when the training sample is a non-living sample, the loss function takes a non-live form, where cosθ denotes the output of the full-connection layer with regularized weight values, and s and m are hyperparameters; the value range of s is any value between 10 and 90, and the value range of m is any value between 0.2 and 0.7.
2. The method for constructing a biological living body recognition neural network for RGB monocular images according to claim 1, wherein the step size of the depth separable convolutional layer is set to 2.
3. The construction method for a biological living body recognition neural network for RGB monocular images according to claim 1, wherein the convolution kernel size of the first convolution layer is 1x1, and the number of convolution output channels is 1; the convolution kernel size of the second convolution layer is one of 3x3, 5x5 and 7x7, and the number of convolution output channels is 1.
4. The method for constructing a biological living body recognition neural network for RGB monocular images according to claim 1, wherein the regularization layer adopts L2 regularization.
5. The construction method of a biological living body recognition neural network for an RGB monocular image according to claim 1, wherein the full-connection layer with regularized weight values adopts L2 regularization, the number of output neurons of the full-connection layer is 1, and the expression is:

w = Normalize(w_0)

cosθ = Dense_w(Y)

where w_0 is the original weight of the full-connection layer, Normalize is the L2 regularization function, Dense is the full-connection operation function with w as its regularized weight, Y is the input tensor of the full-connection layer, and cosθ is the output tensor of the full-connection layer, whose mathematical meaning is the cosine of the angle between the vectors Y and w.
6. The construction method of a biological living body recognition neural network for an RGB monocular image according to claim 1, wherein the hyperparameter s takes a value of 30 and the hyperparameter m takes a value of 0.5.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202011475744.0A | 2020-12-14 | 2020-12-14 | Biological living body identification neural network construction method of RGB monocular image
Publications (2)

Publication Number | Publication Date
---|---
CN112541576A | 2021-03-23
CN112541576B | 2024-02-20
Family
ID=75020167