
CN112819947B - Three-dimensional face reconstruction method, device, electronic device and storage medium - Google Patents

Three-dimensional face reconstruction method, device, electronic device and storage medium

Info

Publication number
CN112819947B
Authority
CN
China
Prior art keywords
face
dimensional
parameters
texture
shape
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110151906.3A
Other languages
Chinese (zh)
Other versions
CN112819947A (en)
Inventor
俞云杰
黄晗
郭彦东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110151906.3A
Publication of CN112819947A
Application granted
Publication of CN112819947B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/50 Lighting effects
    • G06T15/506 Illumination models

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract


The present application discloses a method, an apparatus, an electronic device, and a storage medium for reconstructing a three-dimensional face, relating to the technical field of electronic devices. The method comprises: obtaining shape parameters and texture parameters of a face to be reconstructed; inputting the shape parameters into a first model to obtain a three-dimensional face shape output by the first model; inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network; and generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face includes texture information generated based on the face texture map. By generating the three-dimensional face shape and/or the face texture map with a trained generative adversarial network, the generated target three-dimensional face has rich detail features, thereby improving the reconstruction effect of the three-dimensional face.

Description

Three-dimensional face reconstruction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of electronic devices, and in particular, to a method and apparatus for reconstructing a three-dimensional face, an electronic device, and a storage medium.
Background
In the fields of computer vision and computer graphics, three-dimensional face reconstruction is a topic of steadily growing interest, with wide applications in face recognition, face editing, human-computer interaction, expression-driven animation, augmented reality, virtual reality, and other fields. Three-dimensional face reconstruction from a single RGB image, i.e., reconstructing the three-dimensional face shape and texture from a single RGB image, has been an enduring research topic. At present, three-dimensional faces reconstructed from a single RGB image lack detail and realism.
Disclosure of Invention
In view of the above problems, the present application provides a method, an apparatus, an electronic device, and a storage medium for reconstructing a three-dimensional face, so as to solve the above problems.
In a first aspect, an embodiment of the present application provides a method for reconstructing a three-dimensional face. The method includes: obtaining shape parameters and texture parameters of a face to be reconstructed; inputting the shape parameters into a first model to obtain a three-dimensional face shape output by the first model; inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network; and generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face includes texture information generated based on the face texture map.
In a second aspect, an embodiment of the application provides a three-dimensional face reconstruction device comprising a parameter acquisition module, a face shape acquisition module, a texture map acquisition module, and a three-dimensional face generation module. The parameter acquisition module is used to acquire shape parameters and texture parameters of a face to be reconstructed; the face shape acquisition module is used to input the shape parameters into a first model to acquire the three-dimensional face shape output by the first model; the texture map acquisition module is used to input the texture parameters into a second model to acquire the face texture map output by the second model, at least one of the first model and the second model being obtained by training a generative adversarial network; and the three-dimensional face generation module is used to generate a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
In a third aspect, an embodiment of the present application provides an electronic device comprising a memory and a processor, the memory coupled to the processor, the memory storing instructions that when executed by the processor perform the above-described method.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having program code stored therein, the program code being callable by a processor to perform the above method.
The method, device, electronic equipment, and storage medium for reconstructing a three-dimensional face provided by the embodiments of the application acquire the shape parameters and texture parameters of the face to be reconstructed; input the shape parameters into the first model to obtain the three-dimensional face shape output by the first model; input the texture parameters into the second model to obtain the face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network; and generate the target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map. The three-dimensional face shape and/or the face texture map are thus generated by a trained generative adversarial network, so that the generated target three-dimensional face has rich detail features and the reconstruction effect of the three-dimensional face is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a three-dimensional face reconstruction method according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a three-dimensional face reconstruction method according to another embodiment of the present application;
Fig. 3 is a schematic flow chart of a three-dimensional face reconstruction method according to still another embodiment of the present application;
Fig. 4 is a flowchart illustrating step S306 of the three-dimensional face reconstruction method shown in Fig. 3;
Fig. 5 is a schematic flow chart of a three-dimensional face reconstruction method according to another embodiment of the present application;
Fig. 6 is a schematic flow chart of a three-dimensional face reconstruction method according to still another embodiment of the present application;
Fig. 7 is a schematic flow chart of a three-dimensional face reconstruction method according to still another embodiment of the present application;
Fig. 8 is a schematic flow chart of a three-dimensional face reconstruction method according to still another embodiment of the present application;
Fig. 9 shows a block diagram of a three-dimensional face reconstruction device according to an embodiment of the present application;
Fig. 10 shows a block diagram of an electronic device for performing a three-dimensional face reconstruction method according to an embodiment of the present application;
Fig. 11 shows a storage unit for storing or carrying program code implementing a three-dimensional face reconstruction method according to an embodiment of the present application.
Detailed Description
To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings.
At present, the classical method for reconstructing a three-dimensional face from a single RGB image is the three-dimensional morphable model (3D Morphable Model, 3DMM). Its basic idea is to collect three-dimensional faces in advance and build a database. The shape, expression, and texture vectors of the faces are analyzed into principal components, converting them into independent representations; on this basis, any face can be represented as a linear combination of the average face and the principal components. The inventor found that 3DMM has two important but frequently neglected problems: 1. After the original three-dimensional face scan data are collected, a prefabricated template face is used to align the original face data. However, limited by the low quality of the template face (small numbers of vertices and patches), the aligned face "smooths" the original face data, so that some facial detail information (such as acne, wrinkles, and the like) is lost. 2. Existing 3DMM-based methods start from an average face and obtain three-dimensional facial appearance features as a linear combination of the average face and the principal components; appearance features obtained this way are over-regularized, so the generated faces tend toward the average face and lack realism.
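The linear combination at the heart of 3DMM can be written as S = S_mean + A_id·α + A_exp·β. A minimal sketch of this background technique follows; the array names and shapes are illustrative and are not taken from the patent:

```python
import numpy as np

def linear_3dmm_shape(S_mean, A_id, A_exp, alpha, beta):
    """Classic 3DMM: a face shape as the average face plus a linear
    combination of identity and expression principal components.

    S_mean: (3N,) flattened mean-face vertex coordinates
    A_id:   (3N, K_id) identity principal components
    A_exp:  (3N, K_exp) expression principal components
    alpha:  (K_id,) identity coefficients
    beta:   (K_exp,) expression coefficients
    """
    return S_mean + A_id @ alpha + A_exp @ beta
```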
To solve the above problems, the inventor, through long-term research, proposes the three-dimensional face reconstruction method, device, electronic equipment, and storage medium of the present application, in which the three-dimensional face shape and/or the face texture map are generated by a trained generative adversarial network, so that the generated target three-dimensional face has rich detail features and the three-dimensional face reconstruction effect is improved. The specific three-dimensional face reconstruction method is described in detail in the following embodiments.
Referring to fig. 1, fig. 1 is a flow chart illustrating a three-dimensional face reconstruction method according to an embodiment of the present application. The method generates the three-dimensional face shape and/or the face texture map through a trained generative adversarial network, so that the generated target three-dimensional face has rich detail features and the three-dimensional face reconstruction effect is improved. In a specific embodiment, the method is applied to the three-dimensional face reconstruction device 200 shown in fig. 9 and to the electronic apparatus 100 (fig. 10) provided with the device 200. The flow of this embodiment is described below taking an electronic device as an example; it will be understood that the electronic device may be a smart phone, a tablet computer, a desktop computer, a wearable electronic device, etc., which is not limited herein. The flow shown in fig. 1 is described in detail below; the three-dimensional face reconstruction method may specifically include the following steps:
Step S101, obtaining shape parameters and texture parameters of a face to be reconstructed.
In this embodiment, the face to be reconstructed may be a three-dimensional face to be reconstructed, so obtaining the shape parameters and texture parameters of the face to be reconstructed may include obtaining the shape parameters and texture parameters of the three-dimensional face to be reconstructed. The shape parameters may include identity shape parameters and expression parameters; in that case, obtaining the shape parameters and texture parameters of the face to be reconstructed may include obtaining the identity shape parameters, the expression parameters, and the texture parameters of the three-dimensional face to be reconstructed.
As a way, a two-dimensional color face can be acquired through an RGB camera, the acquired two-dimensional color face is processed to obtain a three-dimensional face corresponding to the two-dimensional color face as a face to be reconstructed, and shape parameters and texture parameters of the face to be reconstructed are obtained.
In some embodiments, after the shape parameters of the face to be reconstructed are obtained, the shape parameters may be initialized, and after the texture parameters of the face to be reconstructed are obtained, the texture parameters may be initialized.
Step S102, inputting the shape parameters into a first model to obtain the three-dimensional face shape output by the first model.
In some embodiments, the first model and the second model may be trained and set in advance, where at least one of the first model and the second model is obtained by training a generative adversarial network (GAN). Thus, as one approach, both the first model and the second model may be trained generative adversarial networks. As another approach, the first model may be a trained generative adversarial network and the second model a principal component analysis statistical model. As yet another approach, the first model may be a principal component analysis statistical model and the second model a trained generative adversarial network.
In this embodiment, after the shape parameters of the face to be reconstructed are obtained, they may be input into the first model to obtain the three-dimensional face shape output by the first model. When the first model is a trained generative adversarial network, the shape parameters are input into that network to obtain the three-dimensional face shape it outputs; when the first model is a principal component analysis statistical model, the shape parameters are input into that model to obtain the three-dimensional face shape it outputs.
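To make the two alternatives concrete, a minimal sketch of the principal component analysis path is shown below; the class and function names are illustrative, and a trained GAN generator exposing the same call signature could be substituted as the first model:

```python
import numpy as np

class PCAShapeModel:
    """Principal component analysis statistical model: maps shape
    parameters to a 3D face shape as mean + basis @ params."""

    def __init__(self, mean, basis):
        self.mean = mean    # (3N,) mean face shape
        self.basis = basis  # (3N, K) principal-component basis

    def __call__(self, shape_params):
        return self.mean + self.basis @ shape_params

def shape_from_first_model(first_model, shape_params):
    # first_model may be a PCAShapeModel or a trained GAN generator;
    # either way it maps shape parameters to a 3D face shape.
    return first_model(shape_params)
```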
Step S103, inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
In this embodiment, after the texture parameters of the face to be reconstructed are obtained, they may be input into the second model to obtain the face texture map output by the second model. When the second model is a trained generative adversarial network, the texture parameters are input into that network to obtain the face texture map it outputs; when the second model is a principal component analysis statistical model, the texture parameters are input into that model to obtain the face texture map it outputs.
Step S104, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
In this embodiment, after the three-dimensional face shape and the face texture map are obtained, a target three-dimensional face may be generated based on the three-dimensional face shape and the face texture map, where the target three-dimensional face includes texture information generated based on the face texture map, that is, the generated target three-dimensional face is a textured three-dimensional face.
In some embodiments, after the three-dimensional face shape and the face texture map are obtained, the three-dimensional face shape and the face texture map may be combined to generate the target three-dimensional face.
According to the three-dimensional face reconstruction method provided by this embodiment of the application, the shape parameters and texture parameters of the face to be reconstructed are acquired; the shape parameters are input into the first model to obtain the three-dimensional face shape output by the first model; the texture parameters are input into the second model to obtain the face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network; and the target three-dimensional face is generated based on the three-dimensional face shape and the face texture map, the target three-dimensional face comprising texture information generated based on the face texture map. The three-dimensional face shape and/or the face texture map are thus generated by a trained generative adversarial network, so that the generated target three-dimensional face has rich detail features and the reconstruction effect of the three-dimensional face is improved.
Referring to fig. 2, fig. 2 is a flow chart illustrating a three-dimensional face reconstruction method according to another embodiment of the present application. In this embodiment, the first model is a trained generative adversarial network, and the method for reconstructing a three-dimensional face may specifically include the following steps:
step S201, obtaining the shape parameters and texture parameters of the face to be reconstructed.
The specific description of step S201 refers to step S101, and is not repeated here.
Step S202, inputting the shape parameters into the trained generative adversarial network to obtain the face coordinate map output by the trained generative adversarial network.
In this embodiment, the first model is a trained generative adversarial network. As one approach, a training data set may first be collected, in which the attributes or features of one class of data differ from those of another class; the generative adversarial network is then trained and modeled on the collected training data set according to a preset algorithm, so that it learns the underlying mapping from the training data set and a trained generative adversarial network is obtained. In this embodiment, the training data set may be, for example, a plurality of shape parameters and a plurality of face coordinate maps having correspondence.
In some embodiments, after the shape parameters of the face to be reconstructed are obtained, they may be input into the trained generative adversarial network, and the face coordinate map output by the trained generative adversarial network may be obtained.
Step S203, obtaining the three-dimensional face shape based on the face coordinate map.
In this embodiment, after the face coordinate map is obtained, the three-dimensional face shape may be obtained based on the face coordinate map. In some embodiments, after the face coordinate map is obtained, the coordinates of each vertex of the three-dimensional face may be obtained based on the face coordinate map, and the three-dimensional face shape may be obtained based on the coordinates of each vertex of the three-dimensional face.
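As an illustration of this step: if the face coordinate map is stored as a UV position map whose three channels hold the (x, y, z) coordinates of face surface points, per-vertex coordinates can be read out directly. A minimal sketch under that assumption (the position-map layout is not specified by the patent):

```python
import numpy as np

def vertices_from_coordinate_map(coord_map, uv_coords):
    """Sample per-vertex 3D coordinates from a face coordinate (position) map.

    coord_map: (H, W, 3) array; channels hold x, y, z coordinates
    uv_coords: (N, 2) integer UV pixel locations, one per mesh vertex
    Returns:   (N, 3) vertex coordinates forming the 3D face shape
    """
    u, v = uv_coords[:, 0], uv_coords[:, 1]
    return coord_map[v, u, :]  # rows indexed by v, columns by u
```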
Step S204, inputting the texture parameters into the second model to obtain the face texture map output by the second model.
Step S205, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
The specific description of step S204 to step S205 refer to step S103 to step S104, and are not repeated here.
Compared with the three-dimensional face reconstruction method shown in fig. 1, the method of this further embodiment obtains a face coordinate map from the shape parameters through a trained generative adversarial network and then derives the three-dimensional face shape from the face coordinate map, thereby improving the accuracy of the obtained three-dimensional face shape.
Referring to fig. 3, fig. 3 is a flow chart illustrating a three-dimensional face reconstruction method according to still another embodiment of the present application. The following will describe the flow shown in fig. 3 in detail, and the method for reconstructing a three-dimensional face specifically may include the following steps:
step S301, obtaining shape parameters and texture parameters of a face to be reconstructed.
Step S302, inputting the shape parameters into a first model to obtain the three-dimensional face shape output by the first model.
Step S303, inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
Step S304, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
The specific description of step S301 to step S304 refer to step S101 to step S104, and are not described herein.
Step S305, rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate a two-dimensional rendered face.
In this embodiment, after the target three-dimensional face is obtained, it may be rendered based on the rendering function, the camera parameters, and the illumination parameters to generate a two-dimensional rendered face.
In some embodiments, based on a camera model, the camera parameters may be parameterized as p_c = [x_c, y_c, z_c, x'_c, y'_c, z'_c, f_c], where [x_c, y_c, z_c] is the camera position, [x'_c, y'_c, z'_c] the camera orientation, and f_c the focal length. Based on an illumination model, the illumination parameters may be parameterized as p_l = [x_l, y_l, z_l, r_l, g_l, b_l, r_a, g_a, b_a], with point light source position [x_l, y_l, z_l], light color [r_l, g_l, b_l], and ambient light color [r_a, g_a, b_a]. Finally, the whole image may be rendered as I_R = R(P(S, G), p_c, p_l), where S and G denote the first model and the second model respectively (e.g., S the first generative adversarial network and G the second generative adversarial network), P is a function that maps the three-dimensional face shape to spatial coordinates and the face texture map to texture coordinates, and R is the rendering function that renders the three-dimensional face into a two-dimensional image; the projection may be a weak perspective projection or a perspective projection. Thus, as one approach, the target three-dimensional face (built from the three-dimensional face shape and the face texture map) may be rendered with the rendering function R, the camera parameters p_c, and the illumination parameters p_l to generate the two-dimensional rendered face.
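A minimal sketch of the parameterization above; the dataclass fields mirror p_c and p_l, while render_fn stands in for the rendering function R and is hypothetical:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class CameraParams:                              # p_c
    position: Tuple[float, float, float]         # [x_c, y_c, z_c]
    orientation: Tuple[float, float, float]      # [x'_c, y'_c, z'_c]
    focal_length: float                          # f_c

@dataclass
class LightParams:                               # p_l
    light_pos: Tuple[float, float, float]        # [x_l, y_l, z_l]
    light_color: Tuple[float, float, float]      # [r_l, g_l, b_l]
    ambient: Tuple[float, float, float]          # [r_a, g_a, b_a]

def render_face(shape, texture_map, p_c: CameraParams, p_l: LightParams, render_fn):
    # I_R = R(P(shape, texture), p_c, p_l): project the textured 3D face
    # to a 2D image under the given camera and illumination parameters;
    # render_fn may use weak perspective or perspective projection.
    return render_fn(shape, texture_map, p_c, p_l)
```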
Step S306, acquiring the pose of the face to be reconstructed.
In this embodiment, the pose of the face to be reconstructed may be acquired. As an implementation manner, key point detection can be performed on the face to be reconstructed to obtain the face key points corresponding to the face to be reconstructed, and the pose of the face to be reconstructed is obtained based on the face key points.
Referring to fig. 4, fig. 4 is a flowchart illustrating a step S306 of the method for reconstructing a three-dimensional face shown in fig. 3 according to the present application. The following will describe the flow shown in fig. 4 in detail, and the method may specifically include the following steps:
Step S3061, performing key point detection on the face to be reconstructed and obtaining a plurality of three-dimensional face key points of the face to be reconstructed as first face key points.
In some embodiments, key point detection may be performed on the face to be reconstructed to obtain a plurality of three-dimensional face key points as the first face key points; in other embodiments, the detection may additionally yield a plurality of two-dimensional face key points of the face to be reconstructed as second face key points.
In this embodiment, the face to be reconstructed may be obtained, and an advanced, pre-trained face key point detector may be used to detect a plurality of two-dimensional face key points, taken as the second face key points, and a plurality of three-dimensional face key points, taken as the first face key points. As one approach, 68 two-dimensional face key points and 68 three-dimensional face key points may be detected by the detector. As another approach, the 68 two-dimensional face key points may be detected first, and the 68 three-dimensional face key points then generated from them by back projection.
Step S3062, acquiring the pose of the face to be reconstructed based on the first face key points.
In some embodiments, after the first face key points are obtained, the pose of the face to be reconstructed may be obtained from them. As one approach, Euler angles may be calculated from the first face key points, and the pose of the face to be reconstructed determined from the Euler angles, where the Euler angles comprise a roll angle, a pitch angle, and a yaw angle.
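One way to realize this calculation is to rigidly align a frontal template's landmarks to the detected 3D key points and read the Euler angles off the resulting rotation matrix. A sketch under that assumption; the template landmarks and the z-y-x angle convention are illustrative choices, not prescribed by the patent:

```python
import numpy as np

def pose_from_landmarks(kps_3d, template_3d):
    """Estimate roll/pitch/yaw (degrees) from 3D face key points by
    rigidly aligning a frontal template to the detected landmarks
    (Kabsch algorithm); one of several possible realizations."""
    P = kps_3d - kps_3d.mean(axis=0)             # detected landmarks, centered
    Q = template_3d - template_3d.mean(axis=0)   # frontal template, centered
    U, _, Vt = np.linalg.svd(Q.T @ P)            # cross-covariance H = Q^T P
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # reflection correction
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # rotation: template -> face
    yaw = np.degrees(np.arctan2(R[1, 0], R[0, 0]))   # z-y-x Euler convention
    pitch = np.degrees(np.arcsin(-R[2, 0]))
    roll = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    return roll, pitch, yaw
```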
Step S307, calculating the key point loss between the face to be reconstructed and the target three-dimensional face based on the relationship between the pose and a preset pose.
In some embodiments, a preset pose may be set and stored in advance as the basis for judging the pose of the face to be reconstructed. Therefore, in this embodiment, after the pose of the face to be reconstructed is obtained, it may be compared with the preset pose to obtain the relationship between the two, and the key point loss between the face to be reconstructed and the target three-dimensional face is calculated based on that relationship.
In some embodiments, a preset roll angle, a preset pitch angle, and a preset yaw angle may be set and stored in advance, where the preset roll angle serves as the judgment basis for the roll angle of the face to be reconstructed, the preset pitch angle for its pitch angle, and the preset yaw angle for its yaw angle. As one approach, when the roll angle is greater than the preset roll angle, the pitch angle is greater than the preset pitch angle, and/or the yaw angle is greater than the preset yaw angle, the pose may be determined to be greater than the preset pose; when the roll angle is not greater than the preset roll angle, the pitch angle is not greater than the preset pitch angle, and the yaw angle is not greater than the preset yaw angle, the pose may be determined to be not greater than the preset pose.
Therefore, in this embodiment, after the first face key points are obtained, the roll angle of the face to be reconstructed may be calculated from them and compared with the preset roll angle to obtain their relationship; likewise, the pitch angle may be calculated and compared with the preset pitch angle, and the yaw angle calculated and compared with the preset yaw angle. The key point loss between the face to be reconstructed and the target three-dimensional face is then calculated based on these relationships.
In some embodiments, when the relationship between the pose of the face to be reconstructed and the preset pose indicates that the pose is greater than the preset pose, a plurality of three-dimensional face key points of the target three-dimensional face may be obtained as third face key points, and the key point loss between the first face key points and the third face key points calculated. As one approach, this is done by performing key point detection on the target three-dimensional face to obtain the plurality of three-dimensional face key points as the third face key points and then calculating the key point loss between the first and third face key points.
In some embodiments, when the relationship indicates that the pose is not greater than the preset pose, a plurality of two-dimensional face key points corresponding to the three-dimensional face key points of the target three-dimensional face may be obtained as fourth face key points, and the key point loss between the second face key points and the fourth face key points calculated. As one approach, key point detection is performed on the target three-dimensional face to obtain a plurality of three-dimensional face key points, which are projected into the pixel coordinate system according to the camera parameters to obtain the corresponding two-dimensional face key points as the fourth face key points; the key point loss between the second and fourth face key points is then calculated.
In some embodiments, the key point loss between the face to be reconstructed and the target three-dimensional face may be calculated as L_lan = ||M(I_0) - M'(p_s, p_e, p_c)||_2, where M(I_0) denotes the first or second face key points of the image to be reconstructed, M'(p_s, p_e, p_c) denotes the (projected) coordinates of the corresponding third or fourth face key points of the target three-dimensional face, and || · ||_2 denotes the L2 norm.
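A direct transcription of L_lan as a NumPy function; which key-point pair is compared (3D first/third or 2D second/fourth) follows from the pose test described above:

```python
import numpy as np

def landmark_loss(kps_detected, kps_model):
    """L_lan = ||M(I_0) - M'(p_s, p_e, p_c)||_2.

    kps_detected: (K, D) key points of the face to be reconstructed
    kps_model:    (K, D) corresponding key points of the target 3D face
                  (projected to pixel coordinates when comparing in 2D)
    """
    return np.linalg.norm(kps_detected - kps_model)  # L2 (Frobenius) norm
```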
Step S308, performing iterative optimization on the shape parameters based on the key point loss to obtain first optimized shape parameters.
In this embodiment, after the key point loss is calculated, the shape parameters may be iteratively optimized based on it to obtain the first optimized shape parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the key point loss with respect to the shape parameters and update them iteratively; the iteration continues until the key point loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the iterative optimization of the shape parameters is complete and the first optimized shape parameters are obtained.
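A minimal PyTorch-style sketch of this loop, assuming the pipeline from shape parameters to key point loss is differentiable; all names are illustrative. The camera parameters in step S310 below can be refined with the same loop:

```python
import torch

def optimize_shape_params(p_s, keypoint_loss_fn, steps=200, lr=1e-2, tol=1e-6):
    """Gradient-descent refinement of the shape parameters p_s against
    the key point loss, stopping on convergence or an iteration cap."""
    p = p_s.clone().requires_grad_(True)
    opt = torch.optim.SGD([p], lr=lr)
    prev = float("inf")
    for _ in range(steps):                  # preset iteration threshold
        opt.zero_grad()
        loss = keypoint_loss_fn(p)          # L_lan evaluated through the pipeline
        loss.backward()                     # gradient of the loss w.r.t. p
        opt.step()
        if abs(prev - loss.item()) < tol:   # loss has (almost) converged
            break
        prev = loss.item()
    return p.detach()                       # the first optimized shape parameters
```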
Step S309, inputting the first optimized shape parameter as a new shape parameter into the first model.
In some embodiments, after the first optimized shape parameter is obtained, the first optimized shape parameter may be input as a new shape parameter into the first model, so as to obtain a new three-dimensional face shape through the first model, and obtain a new target three-dimensional face through the new three-dimensional face shape, thereby improving accuracy of the obtained three-dimensional face.
Step S310, performing iterative optimization on the camera parameters based on the key point loss to obtain optimized camera parameters.
In this embodiment, after the key point loss is calculated, the camera parameters may be iteratively optimized based on it to obtain optimized camera parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the key point loss with respect to the camera parameters and update them iteratively; the iteration continues until the key point loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the iterative optimization of the camera parameters is complete and the optimized camera parameters are obtained.
Step S311, taking the optimized camera parameters as new camera parameters to render the target three-dimensional face.
In some embodiments, after obtaining the optimized camera parameters, the optimized camera parameters may be used as new camera parameters to render the target three-dimensional face, obtain a new two-dimensional rendered face, and improve accuracy of the obtained two-dimensional rendered face.
Compared with the three-dimensional face reconstruction method shown in fig. 1, the method of this further embodiment calculates the key point loss between the face to be reconstructed and the target three-dimensional face and optimizes the shape parameters and the camera parameters based on the key point loss, thereby improving the realism of the target three-dimensional face.
Referring to fig. 5, fig. 5 is a flow chart illustrating a three-dimensional face reconstruction method according to another embodiment of the present application. The following details about the flow shown in fig. 5, the method for reconstructing a three-dimensional face specifically may include the following steps:
Step S401, obtaining shape parameters and texture parameters of a face to be reconstructed.
Step S402, inputting the shape parameters into a first model to obtain the three-dimensional face shape output by the first model.
Step S403, inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
Step S404, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
The specific description of step S401 to step S404 refer to step S101 to step S104, and are not described herein.
Step S405, rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate a two-dimensional rendered face.
The specific description of step S405 is referred to step S305, and is not repeated here.
Step S406, acquiring the biometric information of the face to be reconstructed as first biometric information, and acquiring the biometric information of the two-dimensional rendered face as second biometric information.
In some embodiments, an advanced face recognition network may be used to capture the biometric information of the face to be reconstructed, obtained as the first biometric information, and the biometric information of the two-dimensional rendered face, obtained as the second biometric information. The biometric information may include, but is not limited to, face shape, eyebrow length, eye size, nose height, lip thickness, and the like. The face recognition network may be denoted F_n(I), where n indexes its convolutional layers.
Step S407, calculating the biometric loss between the face to be reconstructed and the two-dimensional rendered face based on the first biometric information and the second biometric information.
In this embodiment, after the first biometric information of the face to be reconstructed and the second biometric information of the two-dimensional rendered face are obtained, the biometric loss between the two faces may be calculated from them. In some embodiments, a first face feature vector may be computed from the first biometric information and a second face feature vector from the second biometric information, and the biometric loss then calculated from these two feature vectors.
In some embodiments, the biometric loss between the face to be reconstructed and the two-dimensional rendered face may be calculated from the two feature vectors, where F_n(I_0) denotes the first face feature vector of the face to be reconstructed and F_n(I_R) denotes the second face feature vector of the two-dimensional rendered face.
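The exact distance the patent uses between F_n(I_0) and F_n(I_R) is not reproduced here; cosine distance is a common choice for face identity features, so the sketch below assumes it:

```python
import torch
import torch.nn.functional as F

def biometric_loss(feat_input, feat_rendered):
    """Distance between the face feature vectors F_n(I_0) and F_n(I_R);
    cosine distance is an assumed (common) choice, not the patent's
    confirmed formula."""
    return 1.0 - F.cosine_similarity(feat_input, feat_rendered, dim=-1).mean()
```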
Step S408, performing iterative optimization on the shape parameters and the texture parameters based on the biometric loss to obtain second optimized shape parameters and first optimized texture parameters.
In this embodiment, after the biometric loss is calculated, the shape parameters may be iteratively optimized based on it to obtain the second optimized shape parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the biometric loss with respect to the shape parameters and update them iteratively; the iteration continues until the biometric loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the second optimized shape parameters are obtained.
In this embodiment, after the biometric loss is calculated, the texture parameters may likewise be iteratively optimized based on it to obtain the first optimized texture parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the biometric loss with respect to the texture parameters and update them iteratively; the iteration continues until the biometric loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the first optimized texture parameters are obtained.
Step S409, inputting the second optimized shape parameter as a new shape parameter into the first model, and inputting the first optimized texture parameter as a new texture parameter into the second model.
In some embodiments, after the second optimized shape parameter is obtained, the second optimized shape parameter may be input as a new shape parameter into the first model, so as to obtain a new three-dimensional face shape through the first model, and obtain a new target three-dimensional face through the new three-dimensional face shape, thereby improving accuracy of the obtained three-dimensional face.
In some embodiments, after the first optimized texture parameter is obtained, the first optimized texture parameter may be input as a new texture parameter into the second model, so as to obtain new face texture coordinates through the second model, and obtain a new target three-dimensional face through the new face texture coordinates, thereby improving accuracy of the obtained three-dimensional face.
Compared with the three-dimensional face reconstruction method shown in fig. 1, the method of this further embodiment calculates the biometric loss between the face to be reconstructed and the two-dimensional rendered face and optimizes the shape parameters and the texture parameters based on the biometric loss, thereby improving the realism of the target three-dimensional face.
Referring to fig. 6, fig. 6 is a flow chart illustrating a three-dimensional face reconstruction method according to still another embodiment of the present application. The following will describe the flow shown in fig. 6 in detail, and the method for reconstructing a three-dimensional face specifically may include the following steps:
step S501, obtaining shape parameters and texture parameters of a face to be reconstructed.
Step S502, inputting the shape parameters into a first model to obtain the three-dimensional face shape output by the first model.
Step S503, inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
Step S504, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
The specific description of step S501 to step S504 refer to step S101 to step S104, and are not described herein.
Step S505, rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate a two-dimensional rendered face.
The specific description of step S505 refers to step S305, and is not described herein.
Step S506, acquiring the attribute content information of the face to be reconstructed as first attribute content information, and acquiring the attribute content information of the two-dimensional rendered face as second attribute content information.
In some embodiments, the attribute content information of the face to be reconstructed may be captured using the features of the middle convolutional layers of an advanced face recognition network and taken as the first attribute content information; likewise, the attribute content information of the two-dimensional rendered face may be captured and taken as the second attribute content information. The attribute content information may include expression, pose, illumination, etc., which is not limited herein.
Step S507, calculating the attribute content loss between the face to be reconstructed and the two-dimensional rendered face based on the first attribute content information and the second attribute content information.
In this embodiment, after the first attribute content information of the face to be reconstructed and the second attribute content information of the two-dimensional rendered face are obtained, the attribute content loss between the two faces may be calculated based on the first and second attribute content information.
In some embodiments, the attribute content loss between the face to be reconstructed and the two-dimensional rendered face may be calculated from the middle-layer features, where F_j(I_0) denotes the middle-layer features of the face to be reconstructed and F_j(I_R) denotes the middle-layer features of the two-dimensional rendered face.
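A sketch of computing such a mid-layer content loss: a forward hook grabs the intermediate activations F_j, and an L2 distance between them is assumed as the loss form (the patent's exact formula is not reproduced in the source):

```python
import torch

def grab_mid_features(network, layer, image):
    """Run `image` through `network` and return `layer`'s activations."""
    feats = {}
    handle = layer.register_forward_hook(lambda m, i, out: feats.update(f=out))
    network(image)
    handle.remove()
    return feats["f"]

def attribute_content_loss(network, layer, face_input, face_rendered):
    # Mid-layer (perceptual) loss between F_j(I_0) and F_j(I_R);
    # the L2 form is an assumption, not the patent's confirmed formula.
    f0 = grab_mid_features(network, layer, face_input)
    fr = grab_mid_features(network, layer, face_rendered)
    return torch.mean((f0 - fr) ** 2)
```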
Step S508, performing iterative optimization on the shape parameters and the texture parameters based on the attribute content loss to obtain third optimized shape parameters and second optimized texture parameters.
In this embodiment, after the attribute content loss is calculated, the shape parameters may be iteratively optimized based on it to obtain the third optimized shape parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the attribute content loss with respect to the shape parameters and update them iteratively; the iteration continues until the attribute content loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the third optimized shape parameters are obtained.
In this embodiment, after the attribute content loss is calculated, the texture parameters may likewise be iteratively optimized based on it to obtain the second optimized texture parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the attribute content loss with respect to the texture parameters and update them iteratively; the iteration continues until the attribute content loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the second optimized texture parameters are obtained.
Step S509, inputting the third optimized shape parameters as new shape parameters into the first model, and inputting the second optimized texture parameters as new texture parameters into the second model.
In some embodiments, after the third optimized shape parameter is obtained, the third optimized shape parameter may be input as a new shape parameter into the first model, so as to obtain a new three-dimensional face shape through the first model, and obtain a new target three-dimensional face through the new three-dimensional face shape, thereby improving accuracy of the obtained three-dimensional face.
In some embodiments, after the second optimized texture parameter is obtained, the second optimized texture parameter may be input as a new texture parameter into the second model, so as to obtain a new face texture coordinate through the second model, and obtain a new target three-dimensional face through the new face texture coordinate, thereby improving accuracy of the obtained three-dimensional face.
Compared with the three-dimensional face reconstruction method shown in fig. 1, the method of this further embodiment calculates the attribute content loss between the face to be reconstructed and the two-dimensional rendered face and optimizes the shape parameters and the texture parameters based on the attribute content loss, thereby improving the realism of the target three-dimensional face.
Referring to fig. 7, fig. 7 is a flow chart illustrating a three-dimensional face reconstruction method according to still another embodiment of the present application. The following details about the flow shown in fig. 7, the method for reconstructing a three-dimensional face specifically may include the following steps:
Step S601, obtaining shape parameters and texture parameters of a face to be reconstructed.
Step S602, inputting the shape parameters into a first model to obtain the three-dimensional face shape output by the first model.
Step S603, inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
Step S604, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
The specific description of step S601 to step S604 refer to step S101 to step S104, and are not described herein.
Step S605, rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate a two-dimensional rendered face.
The specific description of step S605 is referred to step S305, and will not be repeated here.
Step S606, obtaining the pixel information of the face to be reconstructed as first pixel information, and obtaining the pixel information of the two-dimensional rendered face as second pixel information.
In some embodiments, pixel information of a face to be reconstructed is obtained as first pixel information, and pixel information of a two-dimensional rendered face is obtained as second pixel information. The pixel extraction module may be used to obtain the pixel information of the face to be reconstructed and the pixel information of the two-dimensional rendered face, which is not limited herein.
Step S607, calculating the pixel loss of the face to be reconstructed and the two-dimensional rendered face based on the first pixel information and the second pixel information.
In this embodiment, after the first pixel information of the face to be reconstructed and the second pixel information of the two-dimensional rendered face are obtained, the pixel loss of the face to be reconstructed and the two-dimensional rendered face may be calculated based on the first pixel information and the second pixel information.
In some implementations, the pixel loss between the face to be reconstructed and the two-dimensional rendered face may be calculated as L_pix = ||I_0 - I_R||_1, where I_0 denotes the face to be reconstructed, I_R denotes the two-dimensional rendered face, and || · ||_1 denotes the L1 norm.
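L_pix transcribes directly; a minimal sketch (using the mean rather than the sum of absolute differences, an implementation choice):

```python
import torch

def pixel_loss(face_input, face_rendered):
    """L_pix = ||I_0 - I_R||_1: per-pixel absolute difference between the
    face to be reconstructed and the two-dimensional rendered face."""
    return torch.mean(torch.abs(face_input - face_rendered))
```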
Step S608, performing iterative optimization on the illumination parameters based on the pixel loss to obtain optimized illumination parameters.
In this embodiment, after the pixel loss is calculated, the illumination parameters may be iteratively optimized based on it to obtain the optimized illumination parameters. As an implementation, a gradient descent algorithm may be used to calculate the gradient of the pixel loss with respect to the illumination parameters and update them iteratively; the iteration continues until the pixel loss has (almost) converged, or until the number of iterations reaches a preset threshold, at which point the optimized illumination parameters are obtained.
In some embodiments, after the pixel loss is obtained through calculation, the shape parameter and the texture parameter may be further iteratively optimized based on the pixel loss, which is not described herein.
Step S609, rendering the target three-dimensional face with the optimized illumination parameters as new illumination parameters.
In some embodiments, after the optimized illumination parameters are obtained, they can be used as new illumination parameters to render the target three-dimensional face, thereby obtaining a new two-dimensional rendered face and improving its accuracy.
In the three-dimensional face reconstruction method according to still another embodiment of the present application, compared with the three-dimensional face reconstruction method shown in fig. 1, the pixel loss between the face to be reconstructed and the two-dimensional rendered face is calculated, and the illumination parameters (and optionally the shape and texture parameters) are optimized based on the pixel loss, so as to improve the realism of the target three-dimensional face.
Referring to fig. 8, fig. 8 is a flow chart illustrating a three-dimensional face reconstruction method according to another embodiment of the present application. The flow shown in fig. 8 is detailed below; the three-dimensional face reconstruction method may specifically include the following steps:
Step S701, obtaining a first training data set, wherein the first training data set comprises a plurality of shape parameters and a plurality of three-dimensional face shapes, and the shape parameters and the three-dimensional face shapes are in one-to-one correspondence.
In this embodiment, a first training data set is first acquired. The first training data set may include a plurality of shape parameters and a plurality of three-dimensional face shapes, where the plurality of shape parameters and the plurality of three-dimensional face shapes correspond one-to-one. As one way, the first training data set may also include a plurality of shape parameters and a plurality of face coordinate maps, where the plurality of shape parameters and the plurality of face coordinate maps are in one-to-one correspondence.
In some embodiments, the first training data set may be stored locally in the electronic device, may be stored in another device and transmitted to the electronic device, may be stored in a server and transmitted to the electronic device, may be collected in real time by the electronic device, and the like, and is not limited herein.
Step S702, training a generative adversarial network with the plurality of shape parameters as input parameters and the plurality of three-dimensional face shapes as output parameters to obtain a trained first generative adversarial network.
As one approach, after the plurality of shape parameters and the plurality of three-dimensional face shapes are obtained, they may be used as the first training data set to train a generative adversarial network, obtaining the trained first generative adversarial network. In some embodiments, the generative adversarial network may be trained with the plurality of shape parameters as input parameters and the plurality of three-dimensional face shapes as output parameters. In addition, after the trained first generative adversarial network is obtained, its accuracy can be verified by judging whether the three-dimensional face shape it outputs for the input shape parameters meets a preset requirement; when the output does not meet the preset requirement, the first training data set may be re-collected to retrain the generative adversarial network, or a plurality of first training data sets may be acquired to correct the trained first generative adversarial network, which is not limited herein.
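For illustration only, one adversarial training step for such a parameter-to-shape network might look like the following PyTorch sketch; the generator `gen`, discriminator `disc`, and the two optimizers are assumed to be defined elsewhere, and the binary cross-entropy objective shown here is one common choice rather than the objective prescribed by this embodiment:

```python
import torch
import torch.nn as nn

def gan_train_step(gen, disc, shape_params, real_maps, opt_g, opt_d):
    """One training step: gen maps shape parameters to face coordinate maps,
    disc scores maps as real or generated (returning one logit per sample)."""
    bce = nn.BCEWithLogitsLoss()
    batch = real_maps.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Discriminator step: distinguish real coordinate maps from generated ones.
    opt_d.zero_grad()
    fake_maps = gen(shape_params).detach()  # detach so only disc is updated
    d_loss = bce(disc(real_maps), real_labels) + bce(disc(fake_maps), fake_labels)
    d_loss.backward()
    opt_d.step()

    # Generator step: produce maps that the discriminator scores as real.
    opt_g.zero_grad()
    g_loss = bce(disc(gen(shape_params)), real_labels)
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```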
In the training process of the first generative adversarial network, the three-dimensional face shape (face coordinate map) at the lowest resolution is trained first, and the generator and discriminator are trained until stable at that resolution. Training then transitions to the next higher resolution. A new convolution layer is added and processed as a residual block: the low-resolution feature map is directly upsampled by a factor of 2 to the higher resolution and converted into an RGB image F1 with a 1x1 convolution; in parallel, the upsampled feature map is passed through the new 3x3 convolution and then converted into an RGB image F2 with a new 1x1 convolution. F1 is multiplied by the weight 1-a, F2 is multiplied by the weight a, and the two are added, with the weight coefficient a gradually transitioning from 0 to 1; this ensures a smooth fade-in of the new 3x3 convolution layer. The generator and discriminator are again trained until stable at this resolution. Finally, training transitions to ever higher resolutions: a new residual block is added and processed in the same way at each level, and the above steps are repeated until the original resolution of the three-dimensional face shape (face coordinate map) is reached.
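The fade-in described above can be sketched as follows (illustrative PyTorch; the layer objects passed in are assumptions):

```python
import torch.nn.functional as F

def fade_in_to_rgb(feat_low, to_rgb_low, conv3x3, to_rgb_high, a):
    """Smooth transition to a newly added resolution level.
    feat_low:    feature map at the previous (lower) resolution.
    to_rgb_low:  existing 1x1 convolution producing an RGB image (F1 branch).
    conv3x3:     newly added 3x3 convolution at the higher resolution.
    to_rgb_high: new 1x1 convolution producing an RGB image (F2 branch).
    a:           fade-in weight, gradually increased from 0 to 1."""
    up = F.interpolate(feat_low, scale_factor=2, mode="nearest")  # 2x upsample
    f1 = to_rgb_low(up)               # branch 1: upsampled features -> RGB
    f2 = to_rgb_high(conv3x3(up))     # branch 2: new 3x3 conv, then 1x1 -> RGB
    return (1.0 - a) * f1 + a * f2    # F1 * (1 - a) + F2 * a
```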
Step S703, obtaining a second training data set, wherein the second training data set comprises a plurality of texture parameters and a plurality of face texture maps, and the texture parameters and the face texture maps are in one-to-one correspondence.
In this embodiment, the second training data set is first acquired. The second training data set may include a plurality of texture parameters and a plurality of face texture maps, where the plurality of texture parameters and the plurality of face texture maps are in one-to-one correspondence.
In some embodiments, the second training data set may be stored locally in the electronic device, may be stored in another device and transmitted to the electronic device, may be stored in a server and transmitted to the electronic device, may be collected in real time by the electronic device, and the like, and is not limited herein.
In some embodiments, the face coordinate map obtained by performing face registration on a sample face may be added to the first training data set, and the face texture map obtained from the sample face may be added to the second training data set. As one approach, the process of performing face registration on the sample face may include the following steps:
S1, obtaining a sample face, where the sample face is a textured three-dimensional face; projecting the textured three-dimensional face scan data to two dimensions to generate a frontal face image; generating 68 two-dimensional face key points of the frontal face image using a two-dimensional key point detector; and back-projecting the 68 two-dimensional face key points to generate 68 three-dimensional face key points.
S2, obtaining the 68 three-dimensional face key points of the original three-dimensional face scan data according to S1, and registering them with the 68 three-dimensional face key points of a standard face template through a Procrustes transformation. In this way, the pose and size of all original three-dimensional face scan data are aligned with the standard face template.
S3, registering the neutral face data in the original three-dimensional face scan data with the standard face template using the NICP (non-rigid ICP) algorithm, where neutral face data refers to a natural face that does not make any expression.
S4, for the other 19 (or another number of) expressive face scans in the original three-dimensional face scan data, transferring the expressions of a set of template faces to the registered neutral face using a deformation transfer algorithm, so that the corresponding expressions are generated (for example, the neutral expression and the mouth-opening expression of known template faces can be transferred to the registered neutral face).
S5, registering the expressive face data in the original three-dimensional face scan data with the expressive face templates deformed in S4 using the NICP algorithm, so as to generate more accurate expressive face meshes.
S6, using an example-based face rigging algorithm, constructing the 20 (or another number of) expressive face meshes of each person generated in S3 and S5 into a blendshape model of that person. The result is a neutral face expression B_0 and 46 FACS blendshapes B_1, ..., B_46, so that any expression H of the person can be expressed as a linear combination of the blendshapes, in the standard delta form H = B_0 + Σ_i a_i (B_i - B_0), where a_i are the mixing coefficients.
S7, sampling the blendshape parameter vector a from the Gaussian distribution N(μ=0, σ=1) and normalizing it with an exponential (softmax) normalization function, thereby performing data augmentation on the three-dimensional faces (a sketch follows below). The augmented face data are converted into face coordinate maps (which may be converted into three-dimensional face shapes) and face texture maps.
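The sketch referenced in S7 is given below (NumPy; the delta-blendshape form and the array shapes are assumptions for illustration):

```python
import numpy as np

def blend_expression(neutral, blendshapes, a):
    """H = B_0 + sum_i a_i * (B_i - B_0).
    neutral:     (V, 3) neutral face vertices (B_0).
    blendshapes: (46, V, 3) FACS blendshape targets (B_1..B_46).
    a:           (46,) mixing coefficients."""
    deltas = blendshapes - neutral[None, :, :]
    return neutral + np.tensordot(a, deltas, axes=1)

def sample_expression(neutral, blendshapes, rng=None):
    """Data augmentation per S7: sample coefficients from N(mu=0, sigma=1)
    and normalize them with a softmax (exponential normalization)."""
    if rng is None:
        rng = np.random.default_rng()
    raw = rng.standard_normal(blendshapes.shape[0])
    a = np.exp(raw) / np.exp(raw).sum()   # softmax over the 46 coefficients
    return blend_expression(neutral, blendshapes, a)
```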
Step S704, training a generative adversarial network with the plurality of texture parameters as input parameters and the plurality of face texture maps as output parameters to obtain a trained second generative adversarial network.
As one approach, after the plurality of texture parameters and the plurality of face texture maps are obtained, they may be used as the second training data set to train a generative adversarial network, obtaining the trained second generative adversarial network. In some embodiments, the generative adversarial network may be trained with the plurality of texture parameters as input parameters and the plurality of face texture maps as output parameters. In addition, after the trained second generative adversarial network is obtained, its accuracy may be verified by judging whether the face texture map it outputs for the input texture parameters meets a preset requirement; when the output does not meet the preset requirement, the second training data set may be re-collected to retrain the generative adversarial network, or a plurality of second training data sets may be acquired to correct the trained second generative adversarial network, which is not limited herein.
The training process of the second generative adversarial network mirrors that of the first: the face texture map at the lowest resolution is trained first, and the generator and discriminator are trained until stable at that resolution. Training then transitions to the next higher resolution, where the newly added convolution layer is processed as a residual block with the same fade-in scheme (F1 weighted by 1-a, F2 weighted by a, with the weight coefficient a gradually transitioning from 0 to 1), and the generator and discriminator are again trained until stable. A new residual block is added and the steps are repeated at each level until the original resolution of the face texture map is reached.
Step S705, obtaining the shape parameters and the texture parameters of the face to be reconstructed.
Step S706, inputting the shape parameters into a first model to obtain the three-dimensional face shape output by the first model.
Step S707, inputting the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
Step S708, generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map.
For the specific description of steps S705 to S708, refer to steps S101 to S104; details are not repeated here.
In still another embodiment of the present application, a generative adversarial network is further trained with a plurality of shape parameters and a plurality of three-dimensional face shapes to obtain a trained first generative adversarial network capable of generating a three-dimensional face shape from shape parameters, and a generative adversarial network is trained with a plurality of texture parameters and a plurality of face texture maps to obtain a trained second generative adversarial network capable of generating a face texture map from texture parameters, thereby improving the accuracy of the generated three-dimensional face shape and face texture map.
Referring to fig. 9, fig. 9 is a block diagram of a three-dimensional face reconstruction device according to an embodiment of the present application. The block diagram shown in fig. 9 is described below; the three-dimensional face reconstruction device 200 includes a parameter obtaining module 210, a face shape obtaining module 220, a texture map obtaining module 230, and a three-dimensional face generating module 240, where:
The parameter obtaining module 210 is configured to obtain a shape parameter and a texture parameter of a face to be reconstructed.
The face shape obtaining module 220 is configured to input the shape parameter into a first model, and obtain a three-dimensional face shape output by the first model.
Further, the first model is a trained generative adversarial network, and the face shape obtaining module 220 includes a face coordinate map obtaining sub-module and a face shape obtaining sub-module, wherein:
The face coordinate map obtaining sub-module is configured to input the shape parameters into the trained generative adversarial network to obtain the face coordinate map output by the trained generative adversarial network.
The face shape obtaining sub-module is configured to obtain the three-dimensional face shape based on the face coordinate map.
The texture map obtaining module 230 is configured to input the texture parameters into a second model to obtain a face texture map output by the second model, wherein at least one of the first model and the second model is obtained by training a generative adversarial network.
The three-dimensional face generating module 240 is configured to generate a target three-dimensional face based on the three-dimensional face shape and the face texture map, where the target three-dimensional face includes texture information generated based on the face texture map.
Further, the three-dimensional face reconstruction device 200 further includes a two-dimensional rendered face generation module, wherein:
The two-dimensional rendered face generation module is configured to render the target three-dimensional face based on a rendering function, camera parameters, and illumination parameters to generate a two-dimensional rendered face.
Further, the three-dimensional face reconstruction device 200 further includes a pose acquisition module, a key point loss calculation module, a first iterative optimization module, and a first optimization parameter substitution module, wherein:
The pose acquisition module is configured to acquire the pose of the face to be reconstructed.
Further, the pose acquisition module includes a first face key point acquisition sub-module and a pose acquisition sub-module, wherein:
The first face key point acquisition sub-module is configured to perform key point detection on the face to be reconstructed and obtain a plurality of three-dimensional face key points of the face to be reconstructed as first face key points.
Further, the first face key point acquisition sub-module includes a second face key point obtaining unit, wherein:
The second face key point obtaining unit is configured to perform key point detection on the face to be reconstructed, obtain a plurality of three-dimensional face key points of the face to be reconstructed as first face key points, and obtain a plurality of two-dimensional face key points of the face to be reconstructed as second face key points.
The pose acquisition sub-module is configured to acquire the pose of the face to be reconstructed based on the first face key points.
The key point loss calculation module is configured to calculate the key point loss of the face to be reconstructed and the target three-dimensional face based on the relationship between the pose and a preset pose.
Further, the key point loss calculation module includes a first key point loss calculation sub-module and a second key point loss calculation sub-module, wherein:
The first key point loss calculation sub-module is configured to, when the pose is greater than the preset pose, acquire a plurality of three-dimensional face key points of the target three-dimensional face as third face key points and calculate the key point loss of the first face key points and the third face key points.
The second key point loss calculation sub-module is configured to, when the pose is not greater than the preset pose, acquire a plurality of two-dimensional face key points corresponding to the plurality of three-dimensional face key points of the target three-dimensional face as fourth face key points and calculate the key point loss of the second face key points and the fourth face key points.
And the first iterative optimization module is used for carrying out iterative optimization on the shape parameters based on the key point loss to obtain first optimized shape parameters.
And the first optimization parameter substitution module is used for inputting the first optimization shape parameter into the first model as a new shape parameter.
Further, the three-dimensional face reconstruction device 200 further includes an optimized camera parameter obtaining module and a second optimized parameter replacing module, wherein:
And the optimized camera parameter obtaining module is used for carrying out iterative optimization on the camera parameters based on the key point loss to obtain optimized camera parameters.
And the second optimization parameter substitution module is used for rendering the target three-dimensional face by taking the optimized camera parameters as new camera parameters.
Further, the three-dimensional face reconstruction device 200 further includes a biometric information acquisition module, a biometric loss calculation module, a second iterative optimization module, and a third optimization parameter substitution module, wherein:
the biological characteristic information acquisition module is used for acquiring biological characteristic information of the face to be reconstructed as first biological characteristic information and acquiring biological characteristic information of the two-dimensional rendering face as second biological characteristic information.
The biological characteristic loss calculation module is used for calculating the biological characteristic loss of the face to be reconstructed and the two-dimensional rendering face based on the first biological characteristic information and the second biological characteristic information.
And the second iterative optimization module is used for carrying out iterative optimization on the shape parameter and the texture parameter based on the biological feature loss to obtain a second optimized shape parameter and a first optimized texture parameter.
And a third optimization parameter substitution module, configured to input the second optimization shape parameter as a new shape parameter into the first model, and input the first optimization texture parameter as a new texture parameter into the second model.
Further, the three-dimensional face reconstruction device 200 further includes an attribute content information acquisition module, an attribute content loss calculation module, a third iterative optimization module, and a fourth optimization parameter substitution module, wherein:
The attribute content information acquisition module is used for acquiring the attribute content information of the face to be reconstructed as first attribute content information and acquiring the attribute content information of the two-dimensional rendering face as second attribute content information.
And the attribute content loss calculation module is used for calculating the attribute content loss of the face to be reconstructed and the two-dimensional rendering face based on the first attribute content information and the second attribute content information.
And the third iterative optimization module is used for carrying out iterative optimization on the shape parameter and the texture parameter based on the attribute content loss to obtain a third optimized shape parameter and a second optimized texture parameter.
And a fourth optimization parameter substitution module, configured to input the third optimization shape parameter as a new shape parameter into the first model, and input the second optimization texture parameter as a new texture parameter into the second model.
Further, the three-dimensional face reconstruction device 200 further includes a pixel information acquisition module, a pixel loss calculation module, a fourth iterative optimization module, and a fifth optimization parameter substitution module, wherein:
The pixel information acquisition module is configured to acquire the pixel information of the face to be reconstructed as first pixel information and acquire the pixel information of the two-dimensional rendered face as second pixel information.
And the pixel loss calculation module is used for calculating the pixel loss of the face to be reconstructed and the two-dimensional rendering face based on the first pixel information and the second pixel information.
And a fourth iterative optimization module, configured to iteratively optimize the illumination parameter based on the pixel loss, to obtain an optimized illumination parameter.
And a fifth optimization parameter substitution module, configured to render the target three-dimensional face with the optimized illumination parameter as a new illumination parameter.
Further, the three-dimensional face reconstruction device 200 further includes a first training data set acquisition module and a first generative adversarial network obtaining module, wherein:
The first training data set acquisition module is used for acquiring a first training data set, wherein the first training data set comprises a plurality of shape parameters and a plurality of three-dimensional face shapes, and the shape parameters and the three-dimensional face shapes are in one-to-one correspondence.
The first generative adversarial network obtaining module is configured to train a generative adversarial network with the plurality of shape parameters as input parameters and the plurality of three-dimensional face shapes as output parameters to obtain the trained first generative adversarial network.
Further, the three-dimensional face reconstruction device 200 further includes a second training data set acquisition module and a second generative adversarial network obtaining module, wherein:
The second training data set acquisition module is used for acquiring a second training data set, wherein the second training data set comprises a plurality of texture parameters and a plurality of face texture maps, and the texture parameters and the face texture maps are in one-to-one correspondence.
The second generative adversarial network obtaining module is configured to train a generative adversarial network with the plurality of texture parameters as input parameters and the plurality of face texture maps as output parameters to obtain the trained second generative adversarial network.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus and modules described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
In the several embodiments provided by the present application, the coupling of the modules to each other may be electrical, mechanical, or in other forms.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
Referring to fig. 10, a block diagram of an electronic device 100 according to an embodiment of the application is shown. The electronic device 100 may be a smart phone, a tablet computer, an electronic book reader, or another device capable of running application programs. The electronic device 100 of the present application may include one or more processors 110, a memory 120, a touch screen 130, and one or more application programs, where the one or more application programs may be stored in the memory 120 and configured to be executed by the one or more processors 110 to perform the methods described in the foregoing method embodiments.
The processor 110 may include one or more processing cores. The processor 110 connects the various parts of the electronic device 100 through various interfaces and lines, and performs the various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and invoking data stored in the memory 120. Optionally, the processor 110 may be implemented in at least one hardware form of digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 110 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing the content to be displayed; and the modem handles wireless communication. It will be appreciated that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The memory 120 may include random access memory (RAM) or read-only memory (ROM). The memory 120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created by the electronic device 100 in use (such as phonebook, audio and video data, and chat record data).
The touch screen 130 is used to display information input by the user, information provided to the user, and various graphical user interfaces of the electronic device 100, which may be composed of graphics, text, icons, numbers, video, and any combination thereof. In one example, the touch screen 130 may be a liquid crystal display (LCD) or an organic light-emitting diode (OLED) display, which is not limited herein.
Referring to fig. 11, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable medium 300 stores program code that can be invoked by a processor to perform the methods described in the foregoing method embodiments.
The computer-readable storage medium 300 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 300 comprises a non-transitory computer-readable storage medium. The computer-readable storage medium 300 has storage space for program code 310 that performs any of the method steps described above. The program code can be read from or written into one or more computer program products. The program code 310 may, for example, be compressed in a suitable form.
In summary, the three-dimensional face reconstruction method, device, electronic device, and storage medium provided by the embodiments of the present application acquire the shape parameters and texture parameters of a face to be reconstructed; input the shape parameters into a first model to obtain the three-dimensional face shape output by the first model; input the texture parameters into a second model to obtain the face texture map output by the second model, where at least one of the first model and the second model is obtained by training a generative adversarial network; and generate a target three-dimensional face based on the three-dimensional face shape and the face texture map, where the target three-dimensional face includes texture information generated based on the face texture map. Since the three-dimensional face shape and/or the face texture map is generated by a trained generative adversarial network, the generated target three-dimensional face has rich detail features, improving the reconstruction effect of the three-dimensional face.
It should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application rather than to limit them. Although the present application has been described in detail with reference to the above embodiments, those skilled in the art will understand that the technical solutions described in the above embodiments may still be modified, or some technical features thereof may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (11)

1. A method for reconstructing a three-dimensional face, characterized in that the method comprises:
acquiring shape parameters and texture parameters of a face to be reconstructed;
inputting the shape parameters into a trained first generative adversarial network to obtain a three-dimensional face shape output by the first generative adversarial network;
inputting the texture parameters into a trained second generative adversarial network to obtain a face texture map output by the second generative adversarial network;
generating a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map;
rendering the target three-dimensional face based on a rendering function, camera parameters, and illumination parameters to generate a two-dimensional rendered face;
performing key point detection on the face to be reconstructed, obtaining a plurality of three-dimensional face key points of the face to be reconstructed as first face key points, and obtaining a plurality of two-dimensional face key points of the face to be reconstructed as second face key points;
acquiring a pose of the face to be reconstructed based on the first face key points;
when the pose is greater than a preset pose, obtaining a plurality of three-dimensional face key points of the target three-dimensional face as third face key points, and calculating a key point loss of the first face key points and the third face key points;
when the pose is not greater than the preset pose, obtaining a plurality of two-dimensional face key points corresponding to the plurality of three-dimensional face key points of the target three-dimensional face as fourth face key points, and calculating a key point loss of the second face key points and the fourth face key points;
iteratively optimizing the shape parameters based on the key point loss to obtain first optimized shape parameters; and
inputting the first optimized shape parameters into the first generative adversarial network as new shape parameters.

2. The method according to claim 1, characterized in that inputting the shape parameters into the trained first generative adversarial network to obtain the three-dimensional face shape output by the first generative adversarial network comprises:
inputting the shape parameters into the first generative adversarial network to obtain a face coordinate map output by the first generative adversarial network; and
obtaining the three-dimensional face shape based on the face coordinate map.

3. The method according to claim 1, characterized in that the method further comprises:
iteratively optimizing the camera parameters based on the key point loss to obtain optimized camera parameters; and
rendering the target three-dimensional face with the optimized camera parameters as new camera parameters.

4. The method according to claim 1, characterized in that after rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate the two-dimensional rendered face, the method further comprises:
acquiring biometric information of the face to be reconstructed as first biometric information, and acquiring biometric information of the two-dimensional rendered face as second biometric information;
calculating a biometric loss of the face to be reconstructed and the two-dimensional rendered face based on the first biometric information and the second biometric information;
iteratively optimizing the shape parameters and the texture parameters based on the biometric loss to obtain second optimized shape parameters and first optimized texture parameters; and
inputting the second optimized shape parameters into the first generative adversarial network as new shape parameters, and inputting the first optimized texture parameters into the second generative adversarial network as new texture parameters.

5. The method according to claim 1, characterized in that after rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate the two-dimensional rendered face, the method further comprises:
acquiring attribute content information of the face to be reconstructed as first attribute content information, and acquiring attribute content information of the two-dimensional rendered face as second attribute content information;
calculating an attribute content loss of the face to be reconstructed and the two-dimensional rendered face based on the first attribute content information and the second attribute content information;
iteratively optimizing the shape parameters and the texture parameters based on the attribute content loss to obtain third optimized shape parameters and second optimized texture parameters; and
inputting the third optimized shape parameters into the first generative adversarial network as new shape parameters, and inputting the second optimized texture parameters into the second generative adversarial network as new texture parameters.

6. The method according to claim 1, characterized in that after rendering the target three-dimensional face based on the rendering function, the camera parameters, and the illumination parameters to generate the two-dimensional rendered face, the method further comprises:
acquiring pixel information of the face to be reconstructed as first pixel information, and acquiring pixel information of the two-dimensional rendered face as second pixel information;
calculating a pixel loss of the face to be reconstructed and the two-dimensional rendered face based on the first pixel information and the second pixel information;
iteratively optimizing the illumination parameters based on the pixel loss to obtain optimized illumination parameters; and
rendering the target three-dimensional face with the optimized illumination parameters as new illumination parameters.

7. The method according to any one of claims 1-6, characterized in that before inputting the shape parameters into the trained first generative adversarial network to obtain the three-dimensional face shape output by the first generative adversarial network, the method further comprises:
acquiring a first training data set, wherein the first training data set comprises a plurality of shape parameters and a plurality of three-dimensional face shapes, and the plurality of shape parameters and the plurality of three-dimensional face shapes are in one-to-one correspondence; and
training a generative adversarial network with the plurality of shape parameters as input parameters and the plurality of three-dimensional face shapes as output parameters to obtain the trained first generative adversarial network.

8. The method according to any one of claims 1-6, characterized in that before inputting the texture parameters into the trained second generative adversarial network to obtain the face texture map output by the second generative adversarial network, the method further comprises:
acquiring a second training data set, wherein the second training data set comprises a plurality of texture parameters and a plurality of face texture maps, and the plurality of texture parameters and the plurality of face texture maps are in one-to-one correspondence; and
training a generative adversarial network with the plurality of texture parameters as input parameters and the plurality of face texture maps as output parameters to obtain the trained second generative adversarial network.

9. A three-dimensional face reconstruction device, characterized in that the device comprises:
a parameter acquisition module, configured to acquire shape parameters and texture parameters of a face to be reconstructed;
a face shape obtaining module, configured to input the shape parameters into a trained first generative adversarial network to obtain a three-dimensional face shape output by the first generative adversarial network;
a texture map obtaining module, configured to input the texture parameters into a trained second generative adversarial network to obtain a face texture map output by the second generative adversarial network;
a three-dimensional face generation module, configured to generate a target three-dimensional face based on the three-dimensional face shape and the face texture map, wherein the target three-dimensional face comprises texture information generated based on the face texture map;
a two-dimensional rendered face generation module, configured to render the target three-dimensional face based on a rendering function, camera parameters, and illumination parameters to generate a two-dimensional rendered face;
a pose acquisition module, configured to perform key point detection on the face to be reconstructed, obtain a plurality of three-dimensional face key points of the face to be reconstructed as first face key points, and obtain a plurality of two-dimensional face key points of the face to be reconstructed as second face key points;
the pose acquisition module being further configured to acquire a pose of the face to be reconstructed based on the first face key points;
a key point loss calculation module, configured to, when the pose is greater than a preset pose, obtain a plurality of three-dimensional face key points of the target three-dimensional face as third face key points, and calculate a key point loss of the first face key points and the third face key points;
the key point loss calculation module being further configured to, when the pose is not greater than the preset pose, obtain a plurality of two-dimensional face key points corresponding to the plurality of three-dimensional face key points of the target three-dimensional face as fourth face key points, and calculate a key point loss of the second face key points and the fourth face key points;
a first iterative optimization module, configured to iteratively optimize the shape parameters based on the key point loss to obtain first optimized shape parameters; and
a first optimized parameter replacement module, configured to input the first optimized shape parameters into the first generative adversarial network as new shape parameters.

10. An electronic device, characterized by comprising a memory and a processor, the memory being coupled to the processor and storing instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 1-8.

11. A computer-readable storage medium, characterized in that program code is stored in the computer-readable storage medium, and the program code can be invoked by a processor to perform the method according to any one of claims 1-8.
CN202110151906.3A 2021-02-03 2021-02-03 Three-dimensional face reconstruction method, device, electronic device and storage medium Active CN112819947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110151906.3A CN112819947B (en) 2021-02-03 2021-02-03 Three-dimensional face reconstruction method, device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110151906.3A CN112819947B (en) 2021-02-03 2021-02-03 Three-dimensional face reconstruction method, device, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN112819947A CN112819947A (en) 2021-05-18
CN112819947B true CN112819947B (en) 2025-02-11

Family

ID=75861151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110151906.3A Active CN112819947B (en) 2021-02-03 2021-02-03 Three-dimensional face reconstruction method, device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112819947B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269862B (en) * 2021-05-31 2024-06-21 中国科学院自动化研究所 Scene self-adaptive fine three-dimensional face reconstruction method, system and electronic equipment
CN113221847A (en) * 2021-06-07 2021-08-06 广州虎牙科技有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113592988B (en) * 2021-08-05 2023-06-30 北京奇艺世纪科技有限公司 Three-dimensional virtual character image generation method and device
CN113723317B (en) * 2021-09-01 2024-04-09 京东科技控股股份有限公司 Reconstruction method and device of 3D face, electronic equipment and storage medium
CN113762147B (en) * 2021-09-06 2023-07-04 网易(杭州)网络有限公司 Facial expression migration method and device, electronic equipment and storage medium
CN113838176B (en) * 2021-09-16 2023-09-15 网易(杭州)网络有限公司 Model training method, three-dimensional face image generation method and three-dimensional face image generation equipment
CN116012513B (en) * 2021-10-20 2024-11-29 腾讯科技(深圳)有限公司 Face model generation method, device, equipment and storage medium
CN114125273B (en) * 2021-11-05 2023-04-07 维沃移动通信有限公司 Face focusing method and device and electronic equipment
CN114241102B (en) * 2021-11-11 2024-04-19 清华大学 Face detail reconstruction and editing method based on parameterized model
CN114037802A (en) * 2021-11-24 2022-02-11 Oppo广东移动通信有限公司 Three-dimensional face model reconstruction method and device, storage medium and computer equipment
CN114399424B (en) * 2021-12-23 2025-01-07 北京达佳互联信息技术有限公司 Model training methods and related equipment
CN114339190B (en) * 2021-12-29 2023-06-23 中国电信股份有限公司 Communication method, device, equipment and storage medium
CN116485986B (en) * 2022-01-20 2025-02-14 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and electronic device
CN114549291A (en) * 2022-02-24 2022-05-27 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
CN116778076B (en) * 2022-03-11 2025-01-21 腾讯科技(深圳)有限公司 A method for constructing a face sample and a related device
CN114549729B (en) * 2022-03-25 2024-11-22 北京奇艺世纪科技有限公司 Texture map generation method and device, electronic device and storage medium
CN114693876B (en) * 2022-04-06 2024-08-06 北京字跳网络技术有限公司 Digital person generation method, device, storage medium and electronic equipment
CN114821404B (en) * 2022-04-08 2023-07-25 马上消费金融股份有限公司 Information processing method, device, computer equipment and storage medium
CN114783022B (en) * 2022-04-08 2023-07-21 马上消费金融股份有限公司 Information processing method, device, computer equipment and storage medium
CN114926591A (en) * 2022-05-25 2022-08-19 广州图匠数据科技有限公司 Multi-branch deep learning 3D face reconstruction model training method, system and medium
CN115205707B (en) * 2022-09-13 2022-12-23 阿里巴巴(中国)有限公司 Sample image generation method, storage medium, and electronic device
CN115439610B (en) * 2022-09-14 2024-04-26 中国电信股份有限公司 Training method and training device for model, electronic equipment and readable storage medium
CN116894911B (en) * 2023-03-28 2024-12-03 网易(杭州)网络有限公司 Three-dimensional reconstruction method, device, electronic device and readable storage medium
CN116714251B (en) * 2023-05-16 2024-05-31 北京盈锋科技有限公司 Character three-dimensional printing system, method, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978759A (en) * 2019-03-27 2019-07-05 北京市商汤科技开发有限公司 Image processing method and device and image generate the training method and device of network
CN111524216A (en) * 2020-04-10 2020-08-11 北京百度网讯科技有限公司 Method and device for generating three-dimensional face data
CN111882643A (en) * 2020-08-10 2020-11-03 网易(杭州)网络有限公司 Three-dimensional face construction method and device and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093490B (en) * 2013-02-02 2015-08-26 浙江大学 Based on the real-time face animation method of single video camera
CN110020620B (en) * 2019-03-29 2021-07-30 中国科学院深圳先进技术研究院 A face recognition method, device and equipment in a large posture

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978759A (en) * 2019-03-27 2019-07-05 北京市商汤科技开发有限公司 Image processing method and device and image generate the training method and device of network
CN111524216A (en) * 2020-04-10 2020-08-11 北京百度网讯科技有限公司 Method and device for generating three-dimensional face data
CN111882643A (en) * 2020-08-10 2020-11-03 网易(杭州)网络有限公司 Three-dimensional face construction method and device and electronic equipment

Also Published As

Publication number Publication date
CN112819947A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN112819947B (en) Three-dimensional face reconstruction method, device, electronic device and storage medium
CN112002014B (en) Fine structure-oriented three-dimensional face reconstruction method, system and device
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
US10679046B1 (en) Machine learning systems and methods of estimating body shape from images
WO2021093453A1 (en) Method for generating 3d expression base, voice interactive method, apparatus and medium
CN109214282B (en) A kind of three-dimension gesture critical point detection method and system neural network based
WO2020000814A1 (en) Computer-implemented method for generating composite image, apparatus for generating composite image, and computer-program product
WO2020042720A1 (en) Human body three-dimensional model reconstruction method, device, and storage medium
WO2021120834A1 (en) Biometrics-based gesture recognition method and apparatus, computer device, and medium
Zhong et al. Towards practical sketch-based 3d shape generation: The role of professional sketches
CN109544606B (en) Fast automatic registration method and system based on multiple Kinects
CN103425964B (en) Image processing equipment and image processing method
CN108550176A (en) Image processing method, equipment and storage medium
US11880913B2 (en) Generation of stylized drawing of three-dimensional shapes using neural networks
JP6863596B2 (en) Data processing device and data processing method
CN112116720A (en) Three-dimensional point cloud augmentation method, device, storage medium and computer equipment
CN112785712A (en) Three-dimensional model generation method and device and electronic equipment
CN109934926B (en) Model data processing method, device, readable storage medium and equipment
US20230079478A1 (en) Face mesh deformation with detailed wrinkles
CN117372604A (en) 3D face model generation method, device, equipment and readable storage medium
CN115984447B (en) Image rendering method, device, equipment and medium
CN108549484A (en) Man-machine interaction method and device based on human body dynamic posture
CN118537916A (en) Space-time cross attention network sign language identification method and device based on whole body characteristics
CN118736092A (en) A method and system for rendering virtual human at any viewing angle based on three-dimensional Gaussian splashing
CN111275610A (en) Method and system for processing face aging image

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant