CN110689625B - Automatic generation method and device for customized face mixed expression model - Google Patents
- Publication number: CN110689625B (application CN201910840594.XA)
- Authority: CN (China)
- Prior art keywords: human face, model, face, dimensional, neutral
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06T19/20—Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
- G06T7/55—Depth or shape recovery from multiple images
- G06V40/161—Human faces: Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
- G06V40/168—Feature extraction; Face representation
- G06T2200/04—Indexing scheme involving 3D image data
- G06T2200/08—Indexing scheme involving all processing steps from image acquisition to 3D model generation
- G06T2207/10016—Video; Image sequence
- G06T2219/2021—Shape modification
Abstract
The invention discloses a method and a device for automatically generating a customized face mixed expression model. The method comprises the following steps: performing non-rigid registration of a three-dimensional face template model using the depth map and face feature points corresponding to each frame of an RGB-D image sequence, and deforming the template model according to the non-rigid registration result and Shape from Shading to generate a neutral three-dimensional face model; processing the neutral three-dimensional face model and a face mixed model template through Deformation Transfer to generate a customized face mixed model; and deforming the neutral three-dimensional face model sequentially through the customized face mixed model, the Warping Field and Shape from Shading to generate a face tracking result with which the customized face mixed model is updated. The method can generate a vivid facial expression model in real time.
Description
Technical Field
The invention relates to the technical field of three-dimensional reconstruction of facial animation, in particular to an automatic generation method and device of a customized facial mixed expression model.
Background
A high-precision customized facial mixed expression model comprises the shapes a face takes when making particular expressions; the different shapes form the expression bases of the mixed model. In fields such as film, animation and games, three-dimensional facial animation can be generated quickly from a set of expression coefficients.
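As a minimal illustration of the mixed (blendshape) model just described, a face can be evaluated as a neutral mesh plus coefficient-weighted expression bases. The following Python sketch uses this linear formulation; the array shapes, names and toy data are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def evaluate_blendshape(neutral, deltas, coeffs):
    """Combine a neutral face mesh with weighted expression offsets.

    neutral: (V, 3) vertex positions of the neutral face.
    deltas:  (K, V, 3) per-expression vertex offsets from neutral.
    coeffs:  (K,) expression coefficients, clamped to [0, 1].
    """
    coeffs = np.clip(np.asarray(coeffs, dtype=float), 0.0, 1.0)
    # Weighted sum of expression bases added onto the neutral mesh.
    return neutral + np.tensordot(coeffs, deltas, axes=1)

# Toy example: two expression bases on a 3-vertex mesh (illustrative only).
neutral = np.zeros((3, 3))
deltas = np.array([
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],   # base 0 moves vertex 0
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],   # base 1 moves vertex 1
], dtype=float)
face = evaluate_blendshape(neutral, deltas, [0.5, 1.0])
```

Each new coefficient vector yields a new facial shape without re-solving any geometry, which is what makes blendshape animation fast enough for film and game pipelines.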
Customized face mixed expression models are three-dimensional facial expression models frequently needed in film and animation production for creating facial animation, and can also be used for face tracking tasks. Commonly used methods for producing high-precision face mixed models often require expensive equipment, while simple automatic methods struggle to meet the precision requirement and cannot recover facial details such as moles and wrinkles.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
To this end, an object of the present invention is to provide an automatic generation method for a customized face mixed model, which performs high-precision tracking of faces in color and depth sequences and uses the high-precision tracking result directly to generate the customized face mixed model.
The invention also aims to provide an automatic generating device for the customized human face mixed expression model.
In order to achieve the above object, an embodiment of the present invention provides an automatic generation method for a customized face mixed expression model, including:
S1, acquiring an RGB-D image sequence containing the user's neutral expression, performing non-rigid registration of a three-dimensional face template model using the depth map and face feature points corresponding to each frame of the RGB-D image sequence, inputting each vertex of the non-rigid registration result into the depth map corresponding to each frame to generate a deformation data set, and deforming the three-dimensional face template model according to the deformation data set;
S2, reconstructing the details of the face in the non-rigidly registered three-dimensional face model through the Shape from Shading technology on the last frame of the RGB-D image sequence, and generating a neutral three-dimensional face model from the deformed and the reconstructed three-dimensional face template models;
S3, processing the neutral three-dimensional face model and the face mixed model template through the Deformation Transfer technology to generate a customized face mixed model;
S4, deforming the neutral three-dimensional face model sequentially through the customized face mixed model, the Warping Field technology and the Shape from Shading technology, so as to track the face in the RGB-D image sequence and generate a face tracking result;
and S5, updating the customized face mixed model according to the face tracking result.
According to the automatic generation method of the customized face mixed expression model, non-rigid registration of the three-dimensional face template model is performed using the depth map and face feature points corresponding to each frame of the RGB-D image sequence, and the template model is deformed according to the non-rigid registration result and Shape from Shading to generate a neutral three-dimensional face model; the neutral three-dimensional face model and the face mixed model template are processed through Deformation Transfer to generate a customized face mixed model; and the neutral three-dimensional face model is deformed sequentially through the customized face mixed model, the Warping Field and Shape from Shading to generate a face tracking result with which the customized face mixed model is updated. The faces in the color and depth sequence are tracked with high precision, and the tracking result is used directly to generate the customized face mixed model, so that a high-precision customized face mixed expression model is generated automatically and a vivid facial expression model can be produced in real time.
In addition, the automatic generation method for the customized face mixed expression model according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the acquiring the RGB-D image sequence containing the user neutral expression includes:
the user keeps a neutral expression and rotates the head upward, downward, leftward and rightward in turn, and each frame of the user's expression is captured to form the RGB-D image sequence.
Further, in an embodiment of the present invention, the inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame image to generate a deformation data set includes:
inputting each vertex in the non-rigid registration result into a depth map corresponding to each frame of image to generate depth data, screening the depth data to generate effective depth data, and fusing the effective depth data into an array with the same size as the human face three-dimensional template model to generate the deformation data set.
Further, in an embodiment of the present invention, the S4 specifically includes:
S41, deforming the neutral three-dimensional face model through the customized face mixed model to generate the expression coefficients of the customized face mixed model;
S42, deforming the three-dimensional face model obtained in S41 using the Warping Field technology;
and S43, deforming the three-dimensional face model obtained in S42 through the Shape from Shading technology to generate a reconstruction result of the current neutral three-dimensional face model.
Further, in an embodiment of the present invention, the face tracking result includes:
and the reconstruction result of the current neutral human face three-dimensional model and the expression coefficient of the human face mixed model.
In order to achieve the above object, an embodiment of another aspect of the present invention provides an automatic generating apparatus for a customized mixed facial expression model, including:
the processing module is used for acquiring an RGB-D image sequence containing user neutral expression, performing non-rigid registration on a human face three-dimensional template model by using a depth map and a human face feature point corresponding to each frame of image of the RGB-D image sequence, inputting each vertex in a non-rigid registration result into the depth map corresponding to each frame of image to generate a deformation data set, and deforming the human face three-dimensional template model according to the deformation data set;
the first generation module is used for reconstructing the details of the face in the non-rigidly registered three-dimensional face model through the Shape from Shading technology on the last frame of the RGB-D image sequence, and for generating a neutral three-dimensional face model from the deformed and the reconstructed three-dimensional face template models;
the second generation module is used for processing the neutral human face three-dimensional model and the human face mixed model template through a Deformation Transfer technology to generate a customized human face mixed model;
the tracking module is used for deforming the neutral three-dimensional face model sequentially through the customized face mixed model, the Warping Field technology and the Shape from Shading technology, so as to track the face in the RGB-D image sequence and generate a face tracking result;
and the updating module is used for updating the customized face mixing model according to the face tracking result.
The automatic generation device of the customized face mixed expression model according to the embodiment of the invention performs non-rigid registration of the three-dimensional face template model using the depth map and face feature points corresponding to each frame of the RGB-D image sequence, and deforms the template model according to the non-rigid registration result and Shape from Shading to generate a neutral three-dimensional face model; it processes the neutral three-dimensional face model and the face mixed model template through Deformation Transfer to generate a customized face mixed model; and it deforms the neutral three-dimensional face model sequentially through the customized face mixed model, the Warping Field and Shape from Shading to generate a face tracking result with which the customized face mixed model is updated. The faces in the color and depth sequence are tracked with high precision, and the tracking result is used directly to generate the customized face mixed model, so that a high-precision customized face mixed expression model is generated automatically and a vivid facial expression model can be produced in real time.
In addition, the automatic customized face mixed expression model generation device according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the acquiring the RGB-D image sequence containing the user neutral expression includes:
the user keeps a neutral expression and rotates the head upward, downward, leftward and rightward in turn, and each frame of the user's expression is captured to form the RGB-D image sequence.
Further, in an embodiment of the present invention, the inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame image to generate a deformation data set includes:
inputting each vertex in the non-rigid registration result into a depth map corresponding to each frame of image to generate depth data, screening the depth data to generate effective depth data, and fusing the effective depth data into an array with the same size as the human face three-dimensional template model to generate the deformation data set.
Further, in one embodiment of the present invention, the tracking module comprises: a first deformation unit, a second deformation unit and a third deformation unit;
the first deformation unit is used for deforming the neutral human face three-dimensional model through the customized human face mixed model to generate an expression coefficient of the customized human face mixed model;
the second deformation unit is used for deforming, through the Warping Field technology, the three-dimensional face model output by the first deformation unit;
and the third deformation unit is used for deforming, through the Shape from Shading technology, the three-dimensional face model output by the second deformation unit to generate a reconstruction result of the current neutral three-dimensional face model.
Further, in an embodiment of the present invention, the face tracking result includes:
and the reconstruction result of the current neutral human face three-dimensional model and the expression coefficient of the human face mixed model.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a method for automatically generating a customized mixed facial expression model according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an automatic customized human face mixed expression model generation device according to an embodiment of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a method and an apparatus for automatically generating a customized face mixed expression model according to an embodiment of the present invention with reference to the accompanying drawings.
First, an automatic generation method of a customized face mixed expression model proposed according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 1 is a flowchart of an automatic generation method of a customized face mixed expression model according to an embodiment of the invention.
As shown in fig. 1, the method for automatically generating the customized human face mixed expression model includes the following steps:
and step S1, acquiring an RGB-D image sequence containing user neutral expression, performing non-rigid registration on the human face three-dimensional template model by using a depth map and human face feature points corresponding to each frame of image of the RGB-D image sequence, inputting each vertex in a non-rigid registration result into the depth map corresponding to each frame of image to generate a deformation data set, and deforming the human face three-dimensional template model according to the deformation data set.
Further, the user expression images of each frame are collected to form an RGB-D image sequence by keeping the user in a neutral expression and sequentially rotating the head in the upward, downward, leftward and rightward directions.
The resolution of the RGB-D image sequence used in the embodiments of the present invention is 640 x 480.
Further, in an embodiment of the present invention, inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame image to generate a deformation data set, includes:
inputting each vertex in the non-rigid registration result into a depth map corresponding to each frame of image to generate depth data, screening the depth data to generate effective depth data, and fusing the effective depth data into an array with the same size as the human face three-dimensional template model to generate a deformation data set.
Specifically, each frame of the RGB-D image sequence is processed to obtain its depth map and the face feature points detected in the image. In each frame, the depth map and the detected feature points are used to perform non-rigid registration of the three-dimensional face template model, which is an existing template model. Each vertex of the non-rigid registration result is then projected into the depth map of that frame, and depth data close to the vertex are kept as valid data. The valid data are fused into an array of the same size as the template model; the fused result serves as the data term for deforming the template model, i.e. the deformation data set, which is used to deform the three-dimensional face template model.
It is understood that the depth map contains three-dimensional coordinate points: the three-dimensional coordinates of each vertex in the non-rigid registration result are compared with the coordinates in the depth map, and only depth data lying within a small distance of the vertex are taken as valid data.
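The vertex-to-depth-map screening and fusion described above can be sketched as follows. The pinhole projection, the camera intrinsics (fx, fy, cx, cy) and the distance threshold are illustrative assumptions; the patent does not specify them.

```python
import numpy as np

def fuse_depth_into_template(vertices, depth_map, fx, fy, cx, cy, max_dist=0.01):
    """Project registered template vertices into a depth map and keep only
    samples whose back-projected 3D point lies close to the vertex.

    vertices:  (V, 3) camera-space vertex positions from non-rigid registration.
    depth_map: (H, W) depth in metres, 0 where invalid.
    Returns (V, 3) fused target positions and a (V,) validity mask.
    """
    H, W = depth_map.shape
    fused = vertices.copy()
    valid = np.zeros(len(vertices), dtype=bool)
    for i, (x, y, z) in enumerate(vertices):
        if z <= 0:
            continue
        u = int(round(fx * x / z + cx))   # pinhole projection to pixel coords
        v = int(round(fy * y / z + cy))
        if not (0 <= u < W and 0 <= v < H):
            continue
        d = depth_map[v, u]
        if d <= 0:
            continue
        # Back-project the depth sample and screen by 3D distance.
        p = np.array([(u - cx) * d / fx, (v - cy) * d / fy, d])
        if np.linalg.norm(p - vertices[i]) < max_dist:
            fused[i] = p
            valid[i] = True
    return fused, valid

# Toy check: one vertex agreeing with the depth map, one far behind it.
dm = np.zeros((5, 5))
dm[2, 2] = 1.0
verts = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 2.0]])
fused, valid = fuse_depth_into_template(verts, dm, 100.0, 100.0, 2.0, 2.0)
```

Accumulating the valid samples across frames into an array the size of the template mesh gives the deformation data set the text describes.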
Step S2: on the last frame of the RGB-D image sequence, the details of the face in the non-rigidly registered three-dimensional face model are reconstructed through the Shape from Shading technology, and a neutral three-dimensional face model is generated from the deformed and the reconstructed three-dimensional face template models.
Specifically, in the last frame of the RGB-D image sequence, the facial details in the non-rigidly registered three-dimensional face model are reconstructed by the Shape from Shading technique, and the deformed three-dimensional face template model from step S1 and the reconstructed three-dimensional face template model from step S2 are combined to generate the neutral three-dimensional face model.
It can be understood that the face in the input color and depth sequence keeps a neutral expression and undergoes only rigid motion, and the three-dimensional reconstruction of the neutral face is completed by deforming the three-dimensional face template model. During reconstruction, the non-rigid registration result of the template model is used to fuse a more accurate three-dimensional face model; the fused model in turn yields a better non-rigid registration result, and the two steps are iterated alternately.
In traditional joint reconstruction methods, the reconstructed face mesh has no fixed topology. In the embodiment of the invention, the face produced by the fusion method shares the topology of the face template model.
Step S3: the neutral three-dimensional face model and the face mixed model template are processed through the Deformation Transfer technology to generate the customized face mixed model.
After the three-dimensional reconstruction of the neutral face is completed, the customized face mixed model is preliminarily initialized using the Deformation Transfer technology.
Specifically, the Deformation Transfer technology takes the reconstructed high-precision neutral face model and the face mixed model of the template as input, and outputs the initialization result of the customized face mixed model.
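As a rough illustration of what Deformation Transfer achieves here, the sketch below copies per-vertex expression offsets from the template's expression bases onto the reconstructed neutral face. This vertex-delta shortcut assumes identical topology and vertex order and is a simplification of the full per-triangle Deformation Transfer method of Sumner and Popovic, not the patent's exact procedure.

```python
import numpy as np

def delta_transfer(src_neutral, src_expression, tgt_neutral):
    """Copy the per-vertex displacement of a template expression onto a new
    neutral face. Simplified stand-in for Deformation Transfer: both meshes
    must share the same topology and vertex order.
    """
    return tgt_neutral + (src_expression - src_neutral)

# Toy data (illustrative): a 3-vertex template and a reconstructed user face.
src_neutral = np.zeros((3, 3))
src_smile = src_neutral.copy()
src_smile[0] = [0.0, 0.2, 0.0]        # template 'smile' lifts vertex 0
tgt_neutral = np.ones((3, 3))         # reconstructed neutral user face
tgt_smile = delta_transfer(src_neutral, src_smile, tgt_neutral)
```

Applying the transfer once per expression base of the template yields an initial customized mixed model for the new identity.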
Step S4: the neutral three-dimensional face model is deformed sequentially through the customized face mixed model, the Warping Field technology and the Shape from Shading technology, so as to track the face in the RGB-D image sequence and generate a face tracking result.
Further, in an embodiment of the present invention, the method further includes:
S41, deforming the neutral three-dimensional face model through the customized face mixed model to generate the expression coefficients of the customized face mixed model;
S42, deforming the three-dimensional face model obtained in S41 using the Warping Field technology;
and S43, deforming the three-dimensional face model obtained in S42 through the Shape from Shading technology to generate a reconstruction result of the current neutral three-dimensional face model.
The face tracking result comprises a reconstruction result of the current neutral face three-dimensional model and an expression coefficient of the face mixed model.
Specifically, the face in the input color and depth sequence is tracked with high precision using the current customized face mixed model, the Warping Field and Shape from Shading, finally yielding the high-precision reconstruction of the current frame's face model and the corresponding expression coefficients of the face mixed model.
The face mixed model tracking method used in this embodiment does not restrict the space in which the face mixed model can change, so the model varies with a higher degree of freedom and the high-precision face mixed model can be updated.
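The expression-coefficient solve in step S41 can be sketched as an unconstrained least-squares fit of the blendshape coefficients to the observed per-vertex offsets, followed by clamping to [0, 1]. The patent does not describe its actual solver; a production tracker would typically use a properly constrained or regularized solve, so this is only an illustrative stand-in.

```python
import numpy as np

def fit_expression_coeffs(deltas, target_offset):
    """Least-squares fit of blendshape coefficients to an observed offset.

    deltas:        (K, V, 3) expression bases (offsets from neutral).
    target_offset: (V, 3) tracked face minus neutral face.
    """
    K = deltas.shape[0]
    A = deltas.reshape(K, -1).T          # (3V, K) design matrix
    b = target_offset.reshape(-1)        # (3V,) observed offsets
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.clip(coeffs, 0.0, 1.0)     # crude stand-in for a constrained solve

# Toy data (illustrative): two orthogonal bases on a 2-vertex mesh.
deltas = np.array([
    [[1, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 1, 0]],
], dtype=float)
target = 0.5 * deltas[0] + 0.25 * deltas[1]
coeffs = fit_expression_coeffs(deltas, target)
```

The residual motion that the coefficients cannot explain is what the Warping Field and Shape from Shading stages then absorb.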
Step S5: the customized face mixed model is updated according to the face tracking result.
Specifically, the high-precision reconstruction result of the face model and the corresponding expression coefficients are used to update the customized face mixed model.
The motion of each vertex in the updated customized face mixed model is solved separately, and a mask is used to keep the semantics of each expression base in the mixed model unchanged.
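The masked per-base update can be sketched as follows: each expression base is moved toward the per-vertex motion solved from tracking, but only inside the region the base is allowed to affect, which is how the mask preserves each base's semantics. The mask weights and the single blending step are illustrative assumptions, not the patent's solver.

```python
import numpy as np

def masked_base_update(base_delta, observed_delta, mask, step=1.0):
    """Update one expression base toward observed vertex motion, restricted
    by a per-vertex mask so the base keeps its semantic region (e.g. a
    'jaw open' base never edits the eyelids).

    base_delta:     (V, 3) current offsets of one expression base.
    observed_delta: (V, 3) per-vertex motion solved from the tracking result.
    mask:           (V,) weights in [0, 1]; 0 outside the base's region.
    """
    m = np.asarray(mask, dtype=float)[:, None]
    return base_delta + step * m * (observed_delta - base_delta)

# Toy data (illustrative): vertex 2 lies outside this base's masked region.
base = np.zeros((3, 3))
observed = np.ones((3, 3))
mask = np.array([1.0, 0.5, 0.0])
updated = masked_base_update(base, observed, mask)
```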
According to the automatic generation method of the customized face mixed expression model provided by the embodiment of the invention, non-rigid registration of the three-dimensional face template model is performed using the depth map and face feature points corresponding to each frame of the RGB-D image sequence, and the template model is deformed according to the non-rigid registration result and Shape from Shading to generate a neutral three-dimensional face model; the neutral three-dimensional face model and the face mixed model template are processed through Deformation Transfer to generate a customized face mixed model; and the neutral three-dimensional face model is deformed sequentially through the customized face mixed model, the Warping Field and Shape from Shading to generate a face tracking result with which the customized face mixed model is updated. The faces in the color and depth sequence are tracked with high precision, and the tracking result is used directly to generate the customized face mixed model, so that a high-precision customized face mixed expression model is generated automatically and a vivid facial expression model can be produced in real time.
Next, an automatic customized face mixed expression model generation apparatus proposed according to an embodiment of the present invention is described with reference to the drawings.
Fig. 2 is a schematic structural diagram of an automatic customized human face mixed expression model generation device according to an embodiment of the invention.
As shown in fig. 2, the apparatus for automatically generating customized mixed facial expression model includes: a processing module 100, a first generating module 200, a second generating module 300, a tracking module 400, and an updating module 500.
The processing module 100 is configured to obtain an RGB-D image sequence including a user neutral expression, perform non-rigid registration on a three-dimensional face template model by using a depth map and a face feature point corresponding to each frame of image of the RGB-D image sequence, input each vertex in a non-rigid registration result into the depth map corresponding to each frame of image to generate a deformation data set, and deform the three-dimensional face template model according to the deformation data set.
The first generation module 200 is used for reconstructing the details of the face in the non-rigidly registered three-dimensional face model through the Shape from Shading technology on the last frame of the RGB-D image sequence, and for generating a neutral three-dimensional face model from the deformed and the reconstructed three-dimensional face template models.
The second generation module 300 is configured to process the neutral three-dimensional face model and the face mixed model template through the Deformation Transfer technology to generate the customized face mixed model.
The tracking module 400 is configured to deform the neutral three-dimensional face model sequentially through the customized face mixed model, the Warping Field technology and the Shape from Shading technology, so as to track the face in the RGB-D image sequence and generate a face tracking result.
And the updating module 500 is used for updating the customized face mixing model according to the face tracking result.
The apparatus can generate a better neutral face reconstruction result, achieve high-precision tracking of the human face, and generate a high-precision human face mixed model.
Further, in an embodiment of the present invention, acquiring an RGB-D image sequence containing a neutral expression of a user includes:
The user keeps a neutral expression and rotates the head upward, downward, left and right in sequence, and each frame of the user's expression is collected to form an RGB-D image sequence.
Further, in an embodiment of the present invention, inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame image to generate a deformation data set, includes:
inputting each vertex in the non-rigid registration result into a depth map corresponding to each frame of image to generate depth data, screening the depth data to generate effective depth data, and fusing the effective depth data into an array with the same size as the human face three-dimensional template model to generate a deformation data set.
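A minimal sketch of the screening-and-fusion step above, assuming a pinhole depth camera with intrinsic matrix `K` and a hypothetical depth-agreement threshold; neither the camera model nor the screening criterion is specified in the patent text, so both are assumptions for illustration:

```python
import numpy as np

def fuse_depth_into_vertex_array(vertices, depth_map, K, max_dev=0.02):
    # Project each registered vertex into one frame's depth map and keep
    # only plausible samples. `max_dev` (metres) is a hypothetical
    # screening threshold for "effective depth data".
    fused = np.zeros(len(vertices))            # one slot per template vertex
    valid = np.zeros(len(vertices), dtype=bool)
    h, w = depth_map.shape
    for i, (x, y, z) in enumerate(vertices):
        if z <= 0:
            continue                           # behind the camera: discard
        u = int(round(K[0, 0] * x / z + K[0, 2]))   # pixel column
        r = int(round(K[1, 1] * y / z + K[1, 2]))   # pixel row
        if 0 <= u < w and 0 <= r < h:
            d = depth_map[r, u]
            # screen: keep the sample only if the measured depth agrees
            # with the vertex's current depth
            if d > 0 and abs(d - z) < max_dev:
                fused[i] = d
                valid[i] = True
    return fused, valid
```

Accumulating `fused`/`valid` over every frame of the sequence yields an array with one entry per template vertex, i.e. "an array with the same size as the human face three-dimensional template model".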
Further, in one embodiment of the present invention, the tracking module comprises: a first deforming unit, a second deforming unit and a third deforming unit;
the first deformation unit is used for deforming the neutral human face three-dimensional model through the customized human face mixed model to generate an expression coefficient of the customized human face mixed model;
the second deformation unit is used for deforming the deformed neutral human face three-dimensional model in the first deformation unit through a Warping Field technology;
and the third deformation unit is used for deforming the neutral human face three-dimensional model deformed in the second deformation unit through the Shape from Shading technology to generate a reconstruction result of the current neutral human face three-dimensional model.
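The first deformation unit's step can be sketched as the standard least-squares fit used in blendshape tracking: express the observed face as the neutral model plus a weighted sum of blendshape offsets and solve for the weights. The clamp to [0, 1] and the plain `lstsq` solver are assumptions, since the patent only names the step:

```python
import numpy as np

def fit_blendshape_coefficients(target, neutral, blendshapes):
    # Solve min_w || target - (neutral + D w) || in the least-squares
    # sense, where column k of D is the offset of blendshape k from the
    # neutral model. Returns the "expression coefficients" of the text.
    D = np.stack([(b - neutral).ravel() for b in blendshapes], axis=1)
    w, *_ = np.linalg.lstsq(D, (target - neutral).ravel(), rcond=None)
    return np.clip(w, 0.0, 1.0)   # assumed valid range for blend weights

def apply_blendshapes(neutral, blendshapes, w):
    # Deform the neutral model with the fitted expression coefficients.
    out = neutral.copy()
    for wi, b in zip(w, blendshapes):
        out += wi * (b - neutral)
    return out
```

The second and third units would then further deform `apply_blendshapes(...)`'s output with a warping field and a shading-based refinement, which are beyond this sketch.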
Further, in one embodiment of the present invention, the face tracking result includes:
and reconstructing a result of the current neutral human face three-dimensional model and the expression coefficient of the human face mixed model.
It should be noted that the explanation of the foregoing embodiment of the method for automatically generating a customized mixed facial expression model is also applicable to the apparatus of this embodiment, and details are not described here.
According to the apparatus for automatically generating a customized human face mixed expression model provided by the embodiment of the invention, non-rigid registration is performed on the human face three-dimensional template model by using the depth map and the human face feature points corresponding to each frame of the RGB-D image sequence, and the human face three-dimensional template model is deformed according to the non-rigid registration result and Shape from Shading to generate a neutral human face three-dimensional model; the neutral human face three-dimensional model and the human face mixed model template are processed through Deformation Transfer to generate a customized human face mixed model; and the neutral human face three-dimensional model is sequentially deformed through the customized human face mixed model, the Warping Field and Shape from Shading to generate a human face tracking result, which is used to update the customized human face mixed model. The human face in the color-and-depth sequence is tracked with high precision, and the tracking result is directly used to generate the customized human face mixed model, so that automatic generation of a high-precision customized human face mixed expression model is realized and a vivid facial expression model can be generated in real time.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (6)
1. A method for automatically generating a customized face mixed expression model is characterized by comprising the following steps:
s1, acquiring an RGB-D image sequence containing user neutral expression, using a depth map and a face feature point corresponding to each frame of image of the RGB-D image sequence to perform non-rigid registration on a face three-dimensional template model, inputting each vertex in a non-rigid registration result into the depth map corresponding to each frame of image to generate a deformation data set, and inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame of image to generate a deformation data set, including: inputting each vertex in the non-rigid registration result into a depth map corresponding to each frame of image to generate depth data, screening the depth data to generate effective depth data, and fusing the effective depth data into an array with the same size as the human face three-dimensional template model to generate the deformation data set; deforming the human face three-dimensional template model according to the deformation data set;
S2, reconstructing details of the human face in the non-rigidly registered human face three-dimensional model through a Shape from Shading technology on the last frame of the RGB-D image sequence, and generating a neutral human face three-dimensional model according to the deformed human face three-dimensional template model and the reconstructed human face three-dimensional template model;
s3, processing the neutral human face three-dimensional model and the human face mixed model template through a Deformation Transfer technology to generate a customized human face mixed model;
s4, sequentially deforming the neutral human face three-dimensional model through the customized human face mixed model, the Warping Field technology and the Shape from shaping technology so as to track the human face in the RGB-D image sequence and generate a human face tracking result;
s5, updating the customized face mixing model according to the face tracking result;
wherein, the S4 specifically includes:
s41, deforming the neutral human face three-dimensional model through the customized human face mixed model to generate an expression coefficient of the customized human face mixed model;
s42, deforming the deformed neutral human face three-dimensional model in the S41 by using a Warping Field technology;
S43, deforming the neutral human face three-dimensional model deformed in S42 through the Shape from Shading technology to generate a reconstruction result of the current neutral human face three-dimensional model.
2. The method for automatically generating a customized human face mixed expression model according to claim 1, wherein the obtaining of the RGB-D image sequence containing the user neutral expression comprises:
The user keeps a neutral expression and rotates the head upward, downward, left and right in sequence, and each frame of the user's expression is collected to form the RGB-D image sequence.
3. The method of claim 1, wherein the face tracking result comprises:
and the reconstruction result of the current neutral human face three-dimensional model and the expression coefficient of the human face mixed model.
4. An automatic generation device for a customized face mixed expression model is characterized by comprising:
the processing module is used for acquiring an RGB-D image sequence containing a user neutral expression, performing non-rigid registration on a human face three-dimensional template model by using the depth map and human face feature points corresponding to each frame of image of the RGB-D image sequence, and inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame of image to generate a deformation data set, wherein inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame of image to generate the deformation data set includes: inputting each vertex in the non-rigid registration result into the depth map corresponding to each frame of image to generate depth data, screening the depth data to generate effective depth data, and fusing the effective depth data into an array with the same size as the human face three-dimensional template model to generate the deformation data set; and deforming the human face three-dimensional template model according to the deformation data set;
the first generation module is used for reconstructing details of the human face in the non-rigidly registered human face three-dimensional model through a Shape from Shading technology on the last frame of the RGB-D image sequence, and generating a neutral human face three-dimensional model according to the deformed human face three-dimensional template model and the reconstructed human face three-dimensional template model;
the second generation module is used for processing the neutral human face three-dimensional model and the human face mixed model template through a Deformation Transfer technology to generate a customized human face mixed model;
the tracking module is used for sequentially deforming the neutral human face three-dimensional model through the customized human face mixed model, the Warping Field technology and the Shape from Shading technology, so as to track the human face in the RGB-D image sequence and generate a human face tracking result;
the updating module is used for updating the customized face mixing model according to the face tracking result;
wherein the tracking module comprises: a first deforming unit, a second deforming unit and a third deforming unit;
the first deformation unit is used for deforming the neutral human face three-dimensional model through the customized human face mixed model to generate an expression coefficient of the customized human face mixed model;
the second deformation unit is used for deforming the deformed neutral human face three-dimensional model in the first deformation unit by using a Warping Field technology;
and the third deformation unit is used for deforming the neutral human face three-dimensional model deformed in the second deformation unit through the Shape from Shading technology to generate a reconstruction result of the current neutral human face three-dimensional model.
5. The apparatus for automatically generating customized human face mixed expression model according to claim 4, wherein the acquiring of the RGB-D image sequence containing the user neutral expression comprises:
The user keeps a neutral expression and rotates the head upward, downward, left and right in sequence, and each frame of the user's expression is collected to form the RGB-D image sequence.
6. The apparatus of claim 4, wherein the face tracking result comprises:
and the reconstruction result of the current neutral human face three-dimensional model and the expression coefficient of the human face mixed model.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910840594.XA CN110689625B (en) | 2019-09-06 | 2019-09-06 | Automatic generation method and device for customized face mixed expression model |
PCT/CN2020/108965 WO2021042961A1 (en) | 2019-09-06 | 2020-08-13 | Method and device for automatically generating customized facial hybrid emoticon model |
US17/462,113 US20210390792A1 (en) | 2019-09-06 | 2021-08-31 | Method and device for customizing facial expressions of user |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910840594.XA CN110689625B (en) | 2019-09-06 | 2019-09-06 | Automatic generation method and device for customized face mixed expression model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110689625A CN110689625A (en) | 2020-01-14 |
CN110689625B true CN110689625B (en) | 2021-07-16 |
Family
ID=69107913
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910840594.XA Active CN110689625B (en) | 2019-09-06 | 2019-09-06 | Automatic generation method and device for customized face mixed expression model |
Country Status (3)
Country | Link |
---|---|
US (1) | US20210390792A1 (en) |
CN (1) | CN110689625B (en) |
WO (1) | WO2021042961A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689625B (en) * | 2019-09-06 | 2021-07-16 | 清华大学 | Automatic generation method and device for customized face mixed expression model |
CN118411453B (en) * | 2024-07-03 | 2024-09-03 | 紫光摩度教育科技有限公司 | Digital human-computer interaction method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198523A (en) * | 2013-04-26 | 2013-07-10 | 清华大学 | Three-dimensional non-rigid body reconstruction method and system based on multiple depth maps |
CN108154550A (en) * | 2017-11-29 | 2018-06-12 | 深圳奥比中光科技有限公司 | Face real-time three-dimensional method for reconstructing based on RGBD cameras |
CN108711185A (en) * | 2018-05-15 | 2018-10-26 | 清华大学 | Joint rigid moves and the three-dimensional rebuilding method and device of non-rigid shape deformations |
CN109472820A (en) * | 2018-10-19 | 2019-03-15 | 清华大学 | Monocular RGB-D camera real-time face method for reconstructing and device |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8861800B2 (en) * | 2010-07-19 | 2014-10-14 | Carnegie Mellon University | Rapid 3D face reconstruction from a 2D image and methods using such rapid 3D face reconstruction |
EP2852932A1 (en) * | 2012-05-22 | 2015-04-01 | Telefónica, S.A. | A method and a system for generating a realistic 3d reconstruction model for an object or being |
US9317954B2 (en) * | 2013-09-23 | 2016-04-19 | Lucasfilm Entertainment Company Ltd. | Real-time performance capture with on-the-fly correctives |
CN106327571B (en) * | 2016-08-23 | 2019-11-05 | 北京的卢深视科技有限公司 | A kind of three-dimensional face modeling method and device |
EP3330927A1 (en) * | 2016-12-05 | 2018-06-06 | THOMSON Licensing | Method and apparatus for sculpting a 3d model |
US10572720B2 (en) * | 2017-03-01 | 2020-02-25 | Sony Corporation | Virtual reality-based apparatus and method to generate a three dimensional (3D) human face model using image and depth data |
CN109584353B (en) * | 2018-10-22 | 2023-04-07 | 北京航空航天大学 | Method for reconstructing three-dimensional facial expression model based on monocular video |
CN110689625B (en) * | 2019-09-06 | 2021-07-16 | 清华大学 | Automatic generation method and device for customized face mixed expression model |
- 2019-09-06: CN application CN201910840594.XA (patent CN110689625B), status: active
- 2020-08-13: WO application PCT/CN2020/108965, status: application filing
- 2021-08-31: US application US17/462,113 (publication US20210390792A1), status: pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198523A (en) * | 2013-04-26 | 2013-07-10 | 清华大学 | Three-dimensional non-rigid body reconstruction method and system based on multiple depth maps |
CN108154550A (en) * | 2017-11-29 | 2018-06-12 | 深圳奥比中光科技有限公司 | Face real-time three-dimensional method for reconstructing based on RGBD cameras |
CN108711185A (en) * | 2018-05-15 | 2018-10-26 | 清华大学 | Joint rigid moves and the three-dimensional rebuilding method and device of non-rigid shape deformations |
CN109472820A (en) * | 2018-10-19 | 2019-03-15 | 清华大学 | Monocular RGB-D camera real-time face method for reconstructing and device |
Non-Patent Citations (1)
Title |
---|
Mesh Modification Using Deformation Gradients; Robert Walker Sumner; Baidu Scholar; 2005-12-15; Section 3 (p. 37) and Section 4.1 (p. 85) *
Also Published As
Publication number | Publication date |
---|---|
US20210390792A1 (en) | 2021-12-16 |
WO2021042961A1 (en) | 2021-03-11 |
CN110689625A (en) | 2020-01-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108596974B (en) | Dynamic scene robot positioning and mapping system and method | |
CN100407798C (en) | Three-dimensional geometric mode building system and method | |
CN109003325A (en) | A kind of method of three-dimensional reconstruction, medium, device and calculate equipment | |
JP2023106284A (en) | Digital twin modeling method and system for teleoperation environment of assembly robot | |
CN105006016B (en) | A kind of component-level 3 D model construction method of Bayesian network constraint | |
CN108053437B (en) | Three-dimensional model obtaining method and device based on posture | |
CN104346824A (en) | Method and device for automatically synthesizing three-dimensional expression based on single facial image | |
CN112734890B (en) | Face replacement method and device based on three-dimensional reconstruction | |
Li et al. | Avatarcap: Animatable avatar conditioned monocular human volumetric capture | |
CN108876814A (en) | A method of generating posture stream picture | |
CN104778736B (en) | The clothes three-dimensional animation generation method of single video content driven | |
CN110689625B (en) | Automatic generation method and device for customized face mixed expression model | |
Zuo et al. | Sparsefusion: Dynamic human avatar modeling from sparse rgbd images | |
CN115951784B (en) | Method for capturing and generating motion of wearing human body based on double nerve radiation fields | |
CN111292427A (en) | Bone displacement information acquisition method, device, equipment and storage medium | |
CN114255285A (en) | Method, system and storage medium for fusing three-dimensional scenes of video and urban information models | |
JP2003061936A (en) | Moving three-dimensional model formation apparatus and method | |
Orvalho et al. | Transferring the rig and animations from a character to different face models | |
US20120154393A1 (en) | Apparatus and method for creating animation by capturing movements of non-rigid objects | |
Li | The influence of digital twins on the methods of film and television creation | |
Noborio et al. | Experimental results of 2D depth-depth matching algorithm based on depth camera Kinect v1 | |
Wu et al. | Example-based real-time clothing synthesis for virtual agents | |
CN109308732B (en) | Component grid fusion method and system based on control distortion of the mesh | |
Tisserand et al. | Automatic 3D garment positioning based on surface metric | |
CN111402256B (en) | Three-dimensional point cloud target detection and attitude estimation method based on template |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||