
CN109472820B - Monocular RGB-D camera real-time face reconstruction method and device - Google Patents

Monocular RGB-D camera real-time face reconstruction method and device

Info

Publication number
CN109472820B
CN109472820B (application number CN201811222294.7A)
Authority
CN
China
Prior art keywords
dimensional coordinates, rigid motion, face, current, human face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811222294.7A
Other languages
Chinese (zh)
Other versions
CN109472820A (en)
Inventor
徐枫
冯铖锃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201811222294.7A
Publication of CN109472820A
Application granted
Publication of CN109472820B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a monocular RGB-D camera real-time face reconstruction method and device. The method comprises the following steps: detecting the positions of the face feature points on an input RGB face image with an advanced face feature point detection algorithm; obtaining the three-dimensional coordinates of each feature point of the current frame from the positions of the face feature points; acquiring the current three-dimensional coordinates of each face feature point on the key frame; obtaining the global rigid motion from the key frame to the current frame from the three-dimensional coordinates and the current three-dimensional coordinates, yielding a rigid motion result; using the rigid motion result as the initialization of ICP to fine-tune the rigid motion of the face; and applying the rigid motion result to the key frame model to update the TSDF representation of the model. The method effectively removes the depth of non-face regions, removes the influence of non-rigid motion, and improves the accuracy of rigid motion estimation by using the face feature points.

Description

Monocular RGB-D camera real-time face reconstruction method and device
Technical Field
The invention relates to the technical field of three-dimensional reconstruction, in particular to a monocular RGB-D camera real-time face reconstruction method and device.
Background
In the related art, three-dimensional reconstruction is a research hotspot in computer vision and computer graphics; it is one of the core technologies of virtual reality/augmented reality, autonomous driving, robotics and other fields, and has wide application. In recent years, much work has used consumer-grade depth cameras (such as the Microsoft Kinect and Intel RealSense) to perform real-time three-dimensional reconstruction of general scenes and objects.
Most of this work uses the ICP algorithm to rigidly register the reconstructed geometry with the input point cloud of the current frame, estimating the rigid motion (global rotation and translation) of the current frame relative to the key frame. This approach is severely limited when the camera or the reconstructed object moves quickly, and reconstruction failures caused by inaccurate rigid motion estimation are common.
Disclosure of Invention
The present application is based on the recognition and discovery by the inventors of the following problems:
the real-time three-dimensional reconstruction of the monocular RGB-D camera is a research hotspot in the field of computer graphics and computer vision, and how to rapidly and accurately reconstruct information such as the geometry, reflectivity, ambient illumination and the like of a common object according to input data of the monocular RGB-D camera is an important research topic. Advanced reconstruction techniques in recent years mostly use Iterative Closest Point (ICP) based algorithms in the geometric registration stage, but such methods generally only cope with slow camera or object motion.
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide a monocular RGB-D camera real-time face reconstruction method that effectively removes the depth of non-face regions, removes the influence of non-rigid motion, and improves the accuracy of rigid motion estimation by using face feature points.
Another object of the present invention is to provide a monocular RGB-D camera real-time face reconstruction device.
In order to achieve the above object, an embodiment of one aspect of the present invention provides a monocular RGB-D camera real-time face reconstruction method comprising the following steps: step S1: detecting the positions of the face feature points on an input RGB face image with an advanced face feature point detection algorithm; step S2: obtaining the three-dimensional coordinates of each feature point of the current frame from the positions of the face feature points; step S3: acquiring the current three-dimensional coordinates of each face feature point on the key frame; step S4: obtaining the global rigid motion from the key frame to the current frame from the three-dimensional coordinates and the current three-dimensional coordinates, yielding a rigid motion result; step S5: using the rigid motion result as the initialization of ICP (Iterative Closest Point) to fine-tune the rigid motion of the face; and step S6: applying the rigid motion result to the key frame model to update the TSDF (truncated signed distance function) representation of the model.
The monocular RGB-D camera real-time face reconstruction method of the embodiment of the invention takes the particularity of the face structure into account and uses advanced facial feature point detection to improve the accuracy of real-time face reconstruction with a monocular RGB-D camera. As a new way of estimating global rigid motion for special targets such as faces, it can handle real-time three-dimensional face reconstruction under fast face motion: it effectively removes the depth of non-face regions, removes the influence of non-rigid motion, and improves the accuracy of rigid motion estimation by using the face feature points.
In addition, the monocular RGB-D camera real-time face reconstruction method according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the step S1 further includes: dividing the feature points of the outer circle of the face into a left feature point and a right feature point; respectively fitting the left characteristic point and the right characteristic point by using an exponential function curve, and after fitting, reserving depth data of a region which is positioned above the two curves simultaneously; and setting the depth values outside the area to be zero.
Further, in an embodiment of the present invention, the step S2 further includes: finding the corresponding position of each of the remaining inner feature points on the depth image, and obtaining the three-dimensional coordinates of each feature point of the current frame by back projection through the intrinsic matrix of the depth camera.
Further, in an embodiment of the present invention, the step S3 further includes: rendering the currently reconstructed model into its corresponding depth map, and obtaining from it the current three-dimensional coordinates of the feature points on the key frame model.
Further, in an embodiment of the present invention, the step S4 further includes: modeling the global rigid motion as an optimization problem, the optimization target being

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$

where $R$ and $t$ respectively denote the rigid rotation and translation to be optimized, $n$ is the number of feature points, $p_i^{live}$ denotes the three-dimensional coordinates of the $i$-th feature point of the current input frame, and $p_i^{key}$ denotes the three-dimensional coordinates of the $i$-th feature point of the key frame.
In order to achieve the above object, an embodiment of another aspect of the present invention provides a monocular RGB-D camera real-time face reconstruction device comprising: a detection module for detecting the positions of the face feature points on an input RGB face image with an advanced face feature point detection algorithm; a first processing module for obtaining the three-dimensional coordinates of each feature point of the current frame from the positions of the face feature points; an acquisition module for acquiring the current three-dimensional coordinates of each face feature point on the key frame; a second processing module for obtaining the global rigid motion from the key frame to the current frame from the three-dimensional coordinates and the current three-dimensional coordinates, yielding a rigid motion result; an initialization module for using the rigid motion result as the initialization of ICP (Iterative Closest Point) to fine-tune the rigid motion of the face; and an update module for applying the rigid motion result to the key frame model to update the TSDF representation of the model.
The monocular RGB-D camera real-time face reconstruction device of the embodiment of the invention takes the particularity of the face structure into account, uses advanced facial feature point detection to improve the accuracy of real-time face reconstruction with a monocular RGB-D camera, and provides a new way of estimating global rigid motion for special targets such as faces.
In addition, the monocular RGB-D camera real-time face reconstruction device according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the detection module is further configured to divide the feature points outside the face into a left feature point and a right feature point, respectively curve-fit the left feature point and the right feature point with an exponential function, and after the curve-fit, keep the depth data of the area located above the two curves at the same time, and set the depth values outside the area to zero.
Further, in an embodiment of the present invention, the first processing module is further configured to find the corresponding position of each of the remaining inner feature points on the depth image, and to obtain the three-dimensional coordinates of each feature point of the current frame by back projection through the intrinsic matrix of the depth camera.
Further, in an embodiment of the present invention, the obtaining module is further configured to render the currently reconstructed model into its corresponding depth map and obtain from it the current three-dimensional coordinates of the feature points on the key frame model.
Further, in an embodiment of the present invention, the second processing module is further configured to model the global rigid motion as an optimization problem, the optimization target being

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$

where $R$ and $t$ respectively denote the rigid rotation and translation to be optimized, $n$ is the number of feature points, $p_i^{live}$ denotes the three-dimensional coordinates of the $i$-th feature point of the current input frame, and $p_i^{key}$ denotes the three-dimensional coordinates of the $i$-th feature point of the key frame.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a monocular RGB-D camera real-time face reconstruction method according to one embodiment of the present invention;
FIG. 2 is a flow chart of a monocular RGB-D camera real-time face reconstruction method according to one embodiment of the present invention;
FIG. 3 is a graph comparing the estimation of rigid motion using feature points and the estimation of ICP according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a monocular RGB-D camera real-time face reconstruction device according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to elements that are the same or similar or have the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.
The monocular RGB-D camera real-time face reconstruction method and apparatus proposed in the embodiment of the present invention will be described below with reference to the accompanying drawings, and first, the monocular RGB-D camera real-time face reconstruction method proposed in the embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 1 is a flowchart of a monocular RGB-D camera real-time face reconstruction method according to an embodiment of the present invention.
As shown in fig. 1, the monocular RGB-D camera real-time face reconstruction method includes the following steps:
step S1: and detecting the positions of the human face characteristic points on the input human face RGB image through an advanced human face characteristic point detection algorithm.
Further, in an embodiment of the present invention, the step S1 further includes: dividing the outer-contour feature points of the face into left feature points and right feature points; fitting the left feature points and the right feature points with an exponential function curve each; after fitting, retaining only the depth data of the region lying above both curves; the depth values outside that region are set to zero.
It can be understood that, as shown in fig. 2, the positions of the face feature points are detected on the input RGB face image with an advanced face feature point detection algorithm; this step uses only the outer-contour feature points. The embodiment of the invention divides the outer-contour feature points into a left half and a right half and fits each half with an exponential function curve. After fitting, only the depth data of the region lying above both curves is retained; the parts outside this region are considered not to belong to the face, and the depth values there are set to zero.
It should be noted that, in the embodiment of the present invention, an RGB image with a resolution of 640 × 480 and a depth image with the same resolution are used, and the RGB image and the depth image are aligned in advance, so that pixels at the same position on two images have a corresponding relationship, which is only an example and is not limited specifically herein.
Specifically, the removal of the depth data of non-face regions in the embodiment of the present invention proceeds as follows:
the input depth image usually contains depth data of a non-face area, such as shoulders, a background and the like, because the movement of the face is inconsistent with the movement of the non-face area in the rotation process, non-rigid movement is generated on the whole, and the depth data of the non-face area is automatically removed by utilizing a curve enclosed by peripheral feature points of the face.
The embodiment of the invention divides the outer-contour feature points of the face into a left half and a right half and fits each half with an exponential function curve. After fitting, only the region lying above both curves is retained; the parts outside this region do not belong to the face, so the depth values there are set to zero.
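As a concrete illustration, the following is a minimal sketch of this masking step. It assumes an exponential model of the form y = a·exp(b·x) + c for each half of the outer contour (the patent does not state the exact parametrization) and 2D landmarks in pixel coordinates; `exp_curve` and `mask_non_face_depth` are hypothetical names.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_curve(x, a, b, c):
    # Assumed exponential form; the patent only says each half of the
    # outer contour is fitted with an exponential function curve.
    return a * np.exp(b * x) + c

def mask_non_face_depth(depth, left_pts, right_pts):
    """Zero out depth everywhere below either fitted contour curve.

    depth:     (H, W) depth image, already aligned to the RGB image
    left_pts:  (N, 2) pixel coords (x, y) of the left outer landmarks
    right_pts: (M, 2) pixel coords (x, y) of the right outer landmarks
    """
    h, w = depth.shape
    xs = np.arange(w)
    ys = np.arange(h)[:, None]                 # (H, 1) row indices
    masked = depth.copy()
    for pts in (left_pts, right_pts):
        params, _ = curve_fit(exp_curve, pts[:, 0], pts[:, 1], maxfev=5000)
        curve_y = exp_curve(xs, *params)       # curve height per column
        # Image y grows downward, so "above the curve" means y < curve_y;
        # pixels below either curve are treated as non-face and zeroed.
        masked[ys > curve_y[None, :]] = 0
    return masked
```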
Step S2: and obtaining the three-dimensional coordinates of each characteristic point of the current frame according to the positions of the human face characteristic points.
Further, in an embodiment of the present invention, the step S2 further includes: finding the corresponding position of each of the remaining inner feature points on the depth image, and obtaining the three-dimensional coordinates of each feature point of the current frame by back projection through the intrinsic matrix of the depth camera.
It can be understood that, as shown in fig. 2, the embodiment of the present invention starts from the pixel coordinates of the face feature points on the RGB image obtained in step S1; in contrast to step S1, it uses the remaining inner feature points instead of the outer-contour ones. The corresponding position of each feature point is found on the depth image; because the RGB image and the depth image are aligned, the pixel coordinates are the same on both. Finally, back projection through the intrinsic matrix of the depth camera yields the three-dimensional coordinates $\{p_i^{live} \mid p_i^{live} \in \mathbb{R}^3,\ i = 1, \ldots, n\}$ of the feature points of the current frame.
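For illustration, back projection under the standard pinhole model can be sketched as follows; `back_project` is a hypothetical helper, and it assumes a metric depth image already aligned to the RGB image, as the text describes.

```python
import numpy as np

def back_project(pixels, depth, K):
    """Back-project 2D feature points to 3D depth-camera coordinates.

    pixels: (n, 2) array of (u, v) pixel coordinates of the feature points
    depth:  (H, W) aligned depth image (meters)
    K:      (3, 3) depth-camera intrinsic matrix
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    u, v = pixels[:, 0], pixels[:, 1]
    z = depth[v.astype(int), u.astype(int)]    # depth lookup at each landmark
    x = (u - cx) * z / fx                      # invert the pinhole projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)         # one row per p_i^live
```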
Step S3: and acquiring the current three-dimensional coordinates of each human face characteristic point on the key frame.
Further, in an embodiment of the present invention, the step S3 further includes: rendering the currently reconstructed model into its corresponding depth map, and obtaining from it the current three-dimensional coordinates of the feature points on the key frame model.
It can be understood that, as shown in fig. 2, when calculating the current three-dimensional coordinates of each face feature point on the key frame, the embodiment of the present invention needs to render the currently reconstructed model into its corresponding depth map, and then calculate the three-dimensional coordinates of the feature points on the key frame model, $\{p_i^{key} \mid p_i^{key} \in \mathbb{R}^3,\ i = 1, \ldots, n\}$, using a method similar to that in step S2.
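The render step can be sketched as a crude point-splat z-buffer; a real-time system would rasterize the mesh on the GPU, so this is only a stand-in, with `render_depth` a hypothetical name and the model vertices assumed to already be expressed in the render camera's coordinate frame.

```python
import numpy as np

def render_depth(vertices, K, width=640, height=480):
    """Splat model vertices into a z-buffered depth map (nearest depth wins)."""
    depth = np.full((height, width), np.inf)
    z = vertices[:, 2]
    front = z > 1e-6                           # ignore points behind the camera
    u = np.round(vertices[front, 0] * K[0, 0] / z[front] + K[0, 2]).astype(int)
    v = np.round(vertices[front, 1] * K[1, 1] / z[front] + K[1, 2]).astype(int)
    ok = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    for ui, vi, zi in zip(u[ok], v[ok], z[front][ok]):
        if zi < depth[vi, ui]:                 # keep the closest surface point
            depth[vi, ui] = zi
    depth[np.isinf(depth)] = 0.0               # mark empty pixels as invalid
    return depth
```

The keyframe coordinates $\{p_i^{key}\}$ of the feature points can then be recovered by back-projecting this rendered depth map at the landmark pixels, exactly as in the previous sketch.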
Step S4: and obtaining global rigid motion from the key frame to each frame according to the three-dimensional coordinates and the current three-dimensional coordinates to obtain a rigid motion result.
It can be understood that, as shown in fig. 2, the global rigid motion $R$ and $t$ from the key frame to the current frame is calculated from the three-dimensional coordinates of these two groups of feature points; the embodiment of the invention models this as an optimization problem.
Wherein, in an embodiment of the present invention, the step S4 further includes: modeling the global rigid motion as an optimization problem, the optimization target being

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$

where $R$ and $t$ respectively denote the rigid rotation and translation to be optimized, $n$ is the number of feature points, $p_i^{live}$ denotes the three-dimensional coordinates of the $i$-th feature point of the current input frame, and $p_i^{key}$ denotes the three-dimensional coordinates of the $i$-th feature point of the key frame.
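The objective above is the classical least-squares rigid alignment problem, which has a closed-form solution via the SVD (the Kabsch construction). The patent does not name its solver, so the following is a sketch under that assumption; `fit_rigid_motion` is a hypothetical name.

```python
import numpy as np

def fit_rigid_motion(p_key, p_live):
    """Least-squares rigid motion (R, t) such that R @ p_key + t ≈ p_live.

    p_key, p_live: (n, 3) arrays of corresponding feature-point coordinates.
    """
    mu_k = p_key.mean(axis=0)
    mu_l = p_live.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (p_key - mu_k).T @ (p_live - mu_l)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the optimal orthogonal matrix
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_l - R @ mu_k
    return R, t
```

Because the landmark correspondences are known, no iteration is needed here, which is what makes this estimate a cheap and robust initialization even under fast motion.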
Step S5: the rigid motion result is used as an initialization of the ICP to fine tune the rigid motion of the face.
It will be appreciated that embodiments of the present invention, as shown in fig. 2, use this estimate as an initialization of ICP to further fine tune the rigid motion of the face.
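How the landmark-based estimate seeds the refinement can be sketched with a plain point-to-point ICP loop; real pipelines typically use a projective point-to-plane ICP on the GPU, so this is illustrative only, and it reuses the hypothetical `fit_rigid_motion` helper from the previous sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def refine_with_icp(model_pts, live_pts, R, t, iters=10):
    """Point-to-point ICP refinement starting from the landmark-based (R, t).

    model_pts: (N, 3) points sampled from the keyframe model
    live_pts:  (M, 3) points from the current input point cloud
    """
    tree = cKDTree(live_pts)                   # nearest-neighbor search structure
    for _ in range(iters):
        moved = model_pts @ R.T + t            # apply the current estimate
        _, idx = tree.query(moved)             # closest live point per model point
        # Re-solve the closed-form alignment on the updated correspondences
        R, t = fit_rigid_motion(model_pts, live_pts[idx])
    return R, t
```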
Step S6: rigid motion results are applied to the keyframe model to update the TSDF representation of the model.
It can be understood that, as shown in fig. 2, the embodiment of the present invention applies the currently estimated rigid motion result to the key frame model to update the TSDF representation of the model. A comparison of rigid motion estimated with the feature points against the ICP-only estimate is shown in fig. 3.
Specifically, following steps S2-S6, the embodiment of the present invention estimates the global rigid motion accurately using the feature points, as follows:
In each frame, the three-dimensional coordinates of two groups of feature points are calculated: one group is the three-dimensional coordinates of the feature points of the current input frame, and the other is the three-dimensional coordinates of the feature points after the key frame is updated.
The three-dimensional coordinates of the face feature points of the current frame's input point cloud can be calculated from the pixel coordinates of the two-dimensional feature points detected on the RGB image of that frame and the intrinsic matrix of the depth camera: after the feature points are detected on the RGB image, the pixel coordinates of each feature point on the depth map are looked up, and the three-dimensional coordinates of each feature point in the depth camera coordinate system, $\{p_i^{live} \mid p_i^{live} \in \mathbb{R}^3,\ i = 1, \ldots, n\}$, are obtained through the depth camera intrinsic matrix, where $n$ is the number of face feature points used. The outer-contour feature points are not used here, because their semantic positions on the face may change under different poses.
For the three-dimensional coordinates of the face feature points on the key frame: since the reconstructed face model is updated in every frame, its surface becomes more and more complete and its noise keeps decreasing. In each frame, the currently reconstructed model therefore needs to be rendered into its corresponding depth map, and the three-dimensional coordinates of the feature points on the key frame model, $\{p_i^{key} \mid p_i^{key} \in \mathbb{R}^3,\ i = 1, \ldots, n\}$, are then calculated with a method similar to the one used for the feature points of the input point cloud.
According to the three-dimensional coordinates of the two groups of feature points, the global rigid motion $R$ and $t$ from the key frame to the current frame is calculated by modeling it as an optimization problem with the objective:

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$
in the embodiment of the invention, the estimation result is used as the initialization of ICP (inductively coupled plasma) to further fine-tune rigid motion of the face, and because some feature points are shielded under a large posture, for example, when the angle of the side face exceeds 45 degrees, the calculation of three-dimensional coordinates of part of the feature points is inaccurate.
Finally, the currently estimated rigid motion result is applied to the key frame model, and the TSDF representation of the model is updated.
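A minimal sketch of that fusion step follows, assuming standard weighted-average TSDF integration in the style of Curless and Levoy over a flat list of voxel centers; the patent does not spell out its integration rule, and `update_tsdf` and its voxel layout are hypothetical.

```python
import numpy as np

def update_tsdf(tsdf, weight, voxels, depth, K, R, t, trunc=0.01):
    """Fuse one depth frame into the keyframe TSDF under rigid motion (R, t).

    voxels:       (N, 3) voxel centers in keyframe/model coordinates
    tsdf, weight: (N,) running TSDF values and integration weights
    """
    cam = voxels @ R.T + t                     # voxels into the current camera frame
    z = cam[:, 2]
    front = z > 1e-6                           # only voxels in front of the camera
    u = np.full(z.shape, -1, dtype=int)
    v = np.full(z.shape, -1, dtype=int)
    u[front] = np.round(cam[front, 0] * K[0, 0] / z[front] + K[0, 2])
    v[front] = np.round(cam[front, 1] * K[1, 1] / z[front] + K[1, 2])
    h, w = depth.shape
    ok = front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = depth[v[ok], u[ok]]                    # observed depth along each voxel's ray
    sdf = d - z[ok]                            # signed distance to the surface
    valid = (d > 0) & (sdf > -trunc)           # skip voxels far behind the surface
    new = np.clip(sdf[valid] / trunc, -1.0, 1.0)
    i = np.where(ok)[0][valid]
    tsdf[i] = (tsdf[i] * weight[i] + new) / (weight[i] + 1.0)  # running average
    weight[i] += 1.0
    return tsdf, weight
```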
According to the monocular RGB-D camera real-time face reconstruction method of the embodiment of the invention, the particularity of the face structure is taken into account and advanced facial feature point detection is used to improve the accuracy of real-time face reconstruction with a monocular RGB-D camera. As a new way of estimating global rigid motion for special targets such as faces, the method can handle real-time three-dimensional face reconstruction under fast face motion: it effectively removes the depth of non-face regions, removes the influence of non-rigid motion, and improves the accuracy of rigid motion estimation by using the face feature points.
Next, a monocular RGB-D camera real-time face reconstruction device according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 4 is a schematic structural diagram of a monocular RGB-D camera real-time face reconstruction device according to an embodiment of the present invention.
As shown in fig. 4, the monocular RGB-D camera real-time face reconstruction device 10 includes: a detection module 100, a first processing module 200, an acquisition module 300, a second processing module 400, an initialization module 500, and an update module 600.
The detection module 100 is configured to detect the positions of the face feature points on the input RGB face image through an advanced face feature point detection algorithm. The first processing module 200 is configured to obtain three-dimensional coordinates of each feature point of the current frame according to the position of the feature point of the face. The obtaining module 300 is configured to obtain current three-dimensional coordinates of each facial feature point on the key frame. The second processing module 400 is configured to obtain a global rigid motion from the key frame to each frame according to the three-dimensional coordinates and the current three-dimensional coordinates, so as to obtain a rigid motion result. The initialization module 500 is used to use the rigid motion result as the initialization of ICP to fine tune the rigid motion of the face. The update module 600 is configured to apply the rigid motion results to the keyframe model to update the TSDF representation of the model. The device 10 of the embodiment of the invention effectively removes the depth of the non-face area, removes the influence of non-rigid motion, and can improve the accuracy of rigid motion estimation by using the human face characteristic points.
Further, in an embodiment of the present invention, the detection module 100 is further configured to divide the outer-contour feature points of the face into left and right feature points, fit each group with an exponential function curve, retain, after fitting, only the depth data of the region lying above both curves, and set the depth values outside that region to zero.
Further, in an embodiment of the present invention, the first processing module 200 is further configured to find the corresponding position of each of the remaining inner feature points on the depth image, and to obtain the three-dimensional coordinates of each feature point of the current frame by back projection through the intrinsic matrix of the depth camera.
Further, in an embodiment of the present invention, the obtaining module 300 is further configured to render the currently reconstructed model into its corresponding depth map and obtain from it the current three-dimensional coordinates of the feature points on the key frame model.
Further, in an embodiment of the present invention, the second processing module 400 is further configured to model the global rigid motion as an optimization problem, the optimization target being

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$

where $R$ and $t$ respectively denote the rigid rotation and translation to be optimized, $n$ is the number of feature points, $p_i^{live}$ denotes the three-dimensional coordinates of the $i$-th feature point of the current input frame, and $p_i^{key}$ denotes the three-dimensional coordinates of the $i$-th feature point of the key frame.
It should be noted that the explanation of the embodiment of the monocular RGB-D camera real-time face reconstruction method is also applicable to the monocular RGB-D camera real-time face reconstruction device of the embodiment, and details are not repeated here.
According to the monocular RGB-D camera real-time face reconstruction device of the embodiment of the invention, the particularity of the face structure is taken into account and advanced facial feature point detection is used to improve the accuracy of real-time face reconstruction with a monocular RGB-D camera. As a new way of estimating global rigid motion for special targets such as faces, the device can handle real-time three-dimensional face reconstruction under fast face motion: it effectively removes the depth of non-face regions, removes the influence of non-rigid motion, and improves the accuracy of rigid motion estimation by using the face feature points.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and are therefore not to be considered limiting of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediate medium. Also, a first feature "on," "over," or "above" a second feature may be directly or diagonally above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature "under," "below," or "beneath" a second feature may be directly or obliquely below the second feature, or may simply indicate that the first feature is at a lower level than the second feature.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. A monocular RGB-D camera real-time face reconstruction method is characterized by comprising the following steps:
step S1: detecting the positions of the human face characteristic points on the input human face RGB image through an advanced human face characteristic point detection algorithm; the step S1 further includes: dividing the feature points of the outer circle of the face into a left feature point and a right feature point; respectively fitting the left characteristic point and the right characteristic point by using an exponential function curve, and after fitting, reserving depth data of a region which is positioned above the two curves simultaneously; setting the depth value outside the area to be zero;
step S2: obtaining the three-dimensional coordinates of each characteristic point of the current frame according to the positions of the human face characteristic points;
step S3: acquiring the current three-dimensional coordinates of each human face characteristic point on the key frame;
step S4: obtaining global rigid motion from the key frame to the current frame according to the three-dimensional coordinates and the current three-dimensional coordinates to obtain a rigid motion result;
step S5: using the rigid motion result as the initialization of an iterative closest point algorithm to fine-tune the rigid motion of the face; and
step S6: applying the rigid motion results to the keyframe model to update the TSDF representation of the model.
2. The monocular RGB-D camera real-time face reconstruction method according to claim 1, wherein the step S2 further includes:
and searching the corresponding position of each feature point on the depth image according to the residual internal feature points, and obtaining the three-dimensional coordinates of each feature point of the current frame through back projection of the internal reference matrix of the depth camera.
3. The monocular RGB-D camera real-time face reconstruction method according to claim 1, wherein the step S3 further includes:
rendering the corresponding depth map of the current reconstructed model, and acquiring the current three-dimensional coordinates of the feature points on the key frame model.
4. The monocular RGB-D camera real-time face reconstruction method according to claim 1, wherein the step S4 further includes:
modeling global rigid motion as an optimization problem, the optimization target being:

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$

wherein $R$ and $t$ respectively represent the rigid rotation and translation to be optimized, $n$ is the number of feature points, $p_i^{live}$ represents the three-dimensional coordinates of the $i$-th feature point of the current input frame, and $p_i^{key}$ represents the three-dimensional coordinates of the $i$-th feature point of the key frame.
5. A monocular RGB-D camera real-time face reconstruction device, characterized by comprising:
the detection module is used for detecting the positions of the human face characteristic points on the input human face RGB image through an advanced human face characteristic point detection algorithm; the detection module is further used for dividing the feature points of the outer circle of the face into a left feature point and a right feature point, fitting the left feature point and the right feature point respectively by using an exponential function curve, reserving depth data of an area which is simultaneously positioned above the two curves after fitting, and setting depth values outside the area to be zero;
the first processing module is used for obtaining the three-dimensional coordinates of each characteristic point of the current frame according to the position of the face characteristic point;
the acquisition module is used for acquiring the current three-dimensional coordinates of each facial feature point on the key frame;
the second processing module is used for obtaining global rigid motion from the key frame to the current frame according to the three-dimensional coordinates and the current three-dimensional coordinates so as to obtain a rigid motion result;
the initialization module is used for using the rigid motion result as the initialization of an iterative closest point algorithm to fine-tune the rigid motion of the face; and
and the updating module is used for applying the rigid motion result to the key frame model so as to update the TSDF representation of the model.
6. The monocular RGB-D camera real-time face reconstruction device of claim 5, wherein the first processing module is further configured to find the corresponding position of each feature point on the depth image according to the remaining inner feature points, and to obtain the three-dimensional coordinates of each feature point of the current frame by back projection through the intrinsic matrix of the depth camera.
7. The monocular RGB-D camera real-time face reconstruction device of claim 5, wherein the obtaining module is further configured to render the currently reconstructed model into its corresponding depth map and obtain current three-dimensional coordinates of feature points on the keyframe model.
8. The monocular RGB-D camera real-time face reconstruction device of claim 5, wherein the second processing module is further configured to model global rigid motion as an optimization problem, the optimization target being:

$$\min_{R,t} \sum_{i=1}^{n} \left\| R\,p_i^{key} + t - p_i^{live} \right\|_2^2$$

wherein $R$ and $t$ respectively represent the rigid rotation and translation to be optimized, $n$ is the number of feature points, $p_i^{live}$ represents the three-dimensional coordinates of the $i$-th feature point of the current input frame, and $p_i^{key}$ represents the three-dimensional coordinates of the $i$-th feature point of the key frame.
CN201811222294.7A 2018-10-19 2018-10-19 Monocular RGB-D camera real-time face reconstruction method and device Active CN109472820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811222294.7A CN109472820B (en) 2018-10-19 2018-10-19 Monocular RGB-D camera real-time face reconstruction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811222294.7A CN109472820B (en) 2018-10-19 2018-10-19 Monocular RGB-D camera real-time face reconstruction method and device

Publications (2)

Publication Number Publication Date
CN109472820A CN109472820A (en) 2019-03-15
CN109472820B true CN109472820B (en) 2021-03-16

Family

ID=65665744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811222294.7A Active CN109472820B (en) 2018-10-19 2018-10-19 Monocular RGB-D camera real-time face reconstruction method and device

Country Status (1)

Country Link
CN (1) CN109472820B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109949412B (en) * 2019-03-26 2021-03-02 腾讯科技(深圳)有限公司 Three-dimensional object reconstruction method and device
CN110363858B (en) * 2019-06-18 2022-07-01 新拓三维技术(深圳)有限公司 Three-dimensional face reconstruction method and system
CN110533773A (en) * 2019-09-02 2019-12-03 北京华捷艾米科技有限公司 A kind of three-dimensional facial reconstruction method, device and relevant device
CN110689625B (en) * 2019-09-06 2021-07-16 清华大学 Automatic generation method and device for customized face mixed expression model
CN110910452B (en) * 2019-11-26 2023-08-25 上海交通大学 Low-texture industrial part pose estimation method based on deep learning
CN113221600B (en) * 2020-01-21 2022-06-21 魔门塔(苏州)科技有限公司 Method and device for calibrating image feature points
CN113674161A (en) * 2021-07-01 2021-11-19 清华大学 Face deformity scanning completion method and device based on deep learning
CN113902847B (en) * 2021-10-11 2024-04-16 岱悟智能科技(上海)有限公司 Monocular depth image pose optimization method based on three-dimensional feature constraint

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8682066B2 (en) * 2010-03-11 2014-03-25 Ramot At Tel-Aviv University Ltd. Devices and methods of reading monochromatic patterns
KR101556992B1 (en) * 2014-03-13 2015-10-05 손우람 3d scanning system using facial plastic surgery simulation
CN106289181B (en) * 2015-05-22 2018-12-18 北京雷动云合智能技术有限公司 A kind of real-time SLAM method of view-based access control model measurement
CN106934827A (en) * 2015-12-31 2017-07-07 杭州华为数字技术有限公司 The method for reconstructing and device of three-dimensional scenic
CN106446815B (en) * 2016-09-14 2019-08-09 浙江大学 A kind of simultaneous localization and mapping method
CN106910242B (en) * 2017-01-23 2020-02-28 中国科学院自动化研究所 Method and system for carrying out indoor complete scene three-dimensional reconstruction based on depth camera
CN108549873B (en) * 2018-04-19 2019-12-24 北京华捷艾米科技有限公司 Three-dimensional face recognition method and three-dimensional face recognition system

Also Published As

Publication number Publication date
CN109472820A (en) 2019-03-15

Similar Documents

Publication Publication Date Title
CN109472820B (en) Monocular RGB-D camera real-time face reconstruction method and device
CN110998659B (en) Image processing system, image processing method, and program
US9420265B2 (en) Tracking poses of 3D camera using points and planes
CN108564616B (en) Fast robust RGB-D indoor three-dimensional scene reconstruction method
CN108629843B (en) Method and equipment for realizing augmented reality
EP2656309B1 (en) Method for determining a parameter set designed for determining the pose of a camera and for determining a three-dimensional structure of the at least one real object
CN111144213B (en) Object detection method and related equipment
JP2007310707A (en) Apparatus and method for estimating posture
CN108225319B (en) Monocular vision rapid relative pose estimation system and method based on target characteristics
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
CN110111388A (en) Three-dimension object pose parameter estimation method and visual apparatus
CN112083403B (en) Positioning tracking error correction method and system for virtual scene
CN113393503B (en) Classification-driven shape prior deformation category-level object 6D pose estimation method
CN113744315B (en) Semi-direct vision odometer based on binocular vision
CN114494150A (en) Design method of monocular vision odometer based on semi-direct method
CN111829522B (en) Instant positioning and map construction method, computer equipment and device
CN109872343B (en) Weak texture object posture tracking method, system and device
CN116468786A (en) Semantic SLAM method based on point-line combination and oriented to dynamic environment
CN105339981B (en) Method for using one group of primitive registration data
CN108694348B (en) Tracking registration method and device based on natural features
KR20220161340A (en) Image processing system and method
CN113723432B (en) Intelligent identification and positioning tracking method and system based on deep learning
CN117726747A (en) Three-dimensional reconstruction method, device, storage medium and equipment for complementing weak texture scene
Fan et al. Collaborative three-dimensional completion of color and depth in a specified area with superpixels
US11417063B2 (en) Determining a three-dimensional representation of a scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant