Driving High-Resolution Facial Scans with Video Performance Capture

Published: 29 December 2014

Abstract

We present a process for rendering a realistic facial performance with control of viewpoint and illumination. The performance is based on one or more high-quality geometry and reflectance scans of an actor in static poses, driven by one or more video streams of a performance. We compute optical flow correspondences between neighboring video frames, and a sparse set of correspondences between static scans and video frames. The latter are made possible by leveraging the relightability of the static 3D scans to match the viewpoint(s) and appearance of the actor in videos taken in arbitrary environments. As optical flow tends to compute proper correspondence for some areas but not others, we also compute a smoothed, per-pixel confidence map for every computed flow, based on normalized cross-correlation. These flows and their confidences yield a set of weighted triangulation constraints among the static poses and the frames of a performance. Given a single artist-prepared face mesh for one static pose, we optimally combine the weighted triangulation constraints, along with a shape regularization term, into a consistent 3D geometry solution over the entire performance that is drift free by construction. In contrast to previous work, even partial correspondences contribute to drift minimization, for example, where a successful match is found in the eye region but not the mouth. Our shape regularization employs a differential shape term based on a spatially varying blend of the differential shapes of the static poses and neighboring dynamic poses, weighted by the associated flow confidences. These weights also permit dynamic reflectance maps to be produced for the performance by blending the static scan maps. Finally, as the geometry and maps are represented on a consistent artist-friendly mesh, we render the resulting high-quality animated face geometry and animated reflectance maps using standard rendering tools.
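
As a concrete illustration of the confidence computation described above, here is a minimal sketch in Python. It is a hedged reconstruction under stated assumptions, not the authors' code: the function name, window size, smoothing sigma, and the clamping of negative correlations are illustrative choices of ours. The idea is to warp the neighboring frame back by the flow, measure windowed normalized cross-correlation against the source frame, and smooth the result into a per-pixel confidence map.

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter, map_coordinates

def ncc_confidence(src, dst, flow, win=7, sigma=2.0):
    """Per-pixel flow confidence from normalized cross-correlation.

    src, dst: float grayscale images of shape (H, W).
    flow: (H, W, 2) displacements (dx, dy) mapping src pixels into dst.
    Returns a smoothed (H, W) confidence map in [0, 1].
    """
    h, w = src.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Resample dst at the flow-displaced positions (row, column order).
    warped = map_coordinates(dst, [ys + flow[..., 1], xs + flow[..., 0]],
                             order=1, mode='nearest')
    # Windowed means, variances, and covariance via a box filter.
    k = np.full((win, win), 1.0 / (win * win))
    mu_s = convolve(src, k, mode='nearest')
    mu_w = convolve(warped, k, mode='nearest')
    cov = convolve(src * warped, k, mode='nearest') - mu_s * mu_w
    var_s = convolve(src * src, k, mode='nearest') - mu_s ** 2
    var_w = convolve(warped * warped, k, mode='nearest') - mu_w ** 2
    ncc = cov / np.sqrt(np.maximum(var_s * var_w, 1e-8))
    # Anticorrelated or textureless regions get zero confidence;
    # smoothing yields a smoothed, per-pixel confidence map.
    return gaussian_filter(np.clip(ncc, 0.0, 1.0), sigma)
```

In the pipeline the abstract describes, such a map would weight each flow-derived triangulation constraint, so that a reliable match in, say, the eye region still contributes to drift minimization even when the mouth region fails to match.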

Supplementary Material

fyffe.zip: Supplemental movie and image files for "Driving High-Resolution Facial Scans with Video Performance Capture"
a8.mp4: MP4 video file

Published In

ACM Transactions on Graphics, Volume 34, Issue 1
November 2014
153 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2702692
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 December 2014
Accepted: 01 June 2014
Revised: 01 May 2014
Received: 01 December 2013
Published in TOG Volume 34, Issue 1

Author Tags

  1. facial animation
  2. temporal correspondence

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 16
  • Downloads (last 6 weeks): 4
Reflects downloads up to 21 Nov 2024

Cited By

  • (2024) Universal Facial Encoding of Codec Avatars from VR Headsets. ACM Transactions on Graphics 43, 4, 1-22. DOI: 10.1145/3658234. Online: 19-Jul-2024.
  • (2024) Formulating facial mesh tracking as a differentiable optimization problem: a backpropagation-based solution. Visual Intelligence 2, 1. DOI: 10.1007/s44267-024-00054-x. Online: 19-Jul-2024.
  • (2023) A Study on the Digital Twin Pipeline for Facial Digitizing System Optimization. Journal of Broadcast Engineering 28, 5, 530-544. DOI: 10.5909/JBE.2023.28.5.530. Online: 30-Sep-2023.
  • (2023) DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance. ACM Transactions on Graphics 42, 4, 1-16. DOI: 10.1145/3592094. Online: 26-Jul-2023.
  • (2023) RelightableHands: Efficient Neural Relighting of Articulated Hand Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'23), 16663-16673. DOI: 10.1109/CVPR52729.2023.01599.
  • (2023) MEGANE: Morphable Eyeglass and Avatar Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'23), 12769-12779. DOI: 10.1109/CVPR52729.2023.01228.
  • (2023) Implicit Neural Head Synthesis via Controllable Local Deformation Fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'23), 416-426. DOI: 10.1109/CVPR52729.2023.00048.
  • (2022) Rig Inversion by Training a Differentiable Rig Function. In SIGGRAPH Asia 2022 Technical Communications, 1-4. DOI: 10.1145/3550340.3564218. Online: 6-Dec-2022.
  • (2022) FDLS: A Deep Learning Approach to Production Quality, Controllable, and Retargetable Facial Performances. In Proceedings of the 2022 Digital Production Symposium, 1-9. DOI: 10.1145/3543664.3543672. Online: 7-Aug-2022.
  • (2022) A moving Eulerian-Lagrangian particle method for thin film and foam simulation. ACM Transactions on Graphics 41, 4, 1-17. DOI: 10.1145/3528223.3530174. Online: 22-Jul-2022.
