MIG Conference Proceedings · Research Article
DOI: 10.1145/3623264.3624470

MUNCH: Modelling Unique 'N Controllable Heads

Published: 15 November 2023

Abstract

The automated generation of 3D human heads has been an intriguing and challenging task for computer vision researchers. Prevailing methods synthesize realistic avatars but offer limited control over the diversity and quality of rendered outputs and suffer from weak correlation between the shape and texture of the character. We propose a method that offers quality, diversity, control, and realism along with an explainable network design, all features desirable to game-design artists in the domain. First, our proposed Geometry Generator identifies disentangled latent directions and generates novel and diverse samples. A Render Map Generator then learns to synthesize multiple high-fidelity physically-based render maps, including Albedo, Glossiness, Specular, and Normals. For artists preferring fine-grained control over the output, we introduce a novel Color Transformer Model that allows semantic color control over the generated maps. We also introduce quantifiable metrics called Uniqueness and Novelty, and a combined metric to test the overall performance of our model. A demo of both shapes and textures can be found at https://munch-seven.vercel.app/. We will release our model along with the synthetic dataset.
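The abstract names Uniqueness and Novelty as quantifiable metrics but does not define them on this page. As a loose, hypothetical illustration only (not the paper's actual formulation), a uniqueness-style score could be computed as the mean nearest-neighbor distance from generated embeddings to a reference set:

```python
import numpy as np

def uniqueness_score(generated: np.ndarray, reference: np.ndarray) -> float:
    """Hypothetical uniqueness-style metric (not the paper's definition):
    mean Euclidean distance from each generated embedding to its nearest
    neighbor in a reference set. A higher score suggests generated heads
    sit further from existing (e.g. training) examples."""
    # Pairwise distances, shape (n_generated, n_reference)
    dists = np.linalg.norm(generated[:, None, :] - reference[None, :, :], axis=-1)
    # Distance from each generated sample to its closest reference sample
    nearest = dists.min(axis=1)
    return float(nearest.mean())
```

For example, a single generated embedding at the origin with references at (3, 4) and (6, 8) yields a score of 5.0, the distance to the closer reference.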

Supplementary Material

Supplementary material, including high-resolution outputs (MUNCH_Supplementary.pdf)


Cited By

  • (2024) GANtlitz: Ultra High Resolution Generative Model for Multi-Modal Face Textures. Computer Graphics Forum 43:2. https://doi.org/10.1111/cgf.15039. Online publication date: 24-Apr-2024.


Published In

MIG '23: Proceedings of the 16th ACM SIGGRAPH Conference on Motion, Interaction and Games
November 2023
224 pages
ISBN:9798400703935
DOI:10.1145/3623264

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. 3D reconstruction
  2. Image-based modeling
  3. Mesh processing
  4. Photogrammetry
  5. Shape analysis

Qualifiers

  • Research-article
  • Research
  • Refereed limited


