
FaceTuneGAN: Face autoencoder for convolutional expression transfer using neural generative adversarial networks

Published: 01 February 2023

Abstract

In this paper, we present FaceTuneGAN, a new 3D face model representation that decomposes and encodes facial identity and facial expression separately. We propose a first adaptation to 3D face geometry of image-to-image translation networks, which have been used successfully in the 2D domain. Leveraging recently released large face scan databases (FaceScape and CoMA), a neural network has been trained to decouple factors of variation with a better knowledge of the face, enabling facial expression transfer and neutralization of expressive faces. Specifically, we design an adversarial architecture that adapts the base architecture of FUNIT and uses SpiralNet++ for our convolutional and sampling operations. Applied to these two datasets, FaceTuneGAN achieves better identity decomposition and face neutralization than state-of-the-art techniques. It also outperforms the classical deformation transfer approach by predicting blendshapes closer to ground-truth data and with fewer undesired artifacts caused by overly different facial morphologies between source and target.
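The SpiralNet++ convolution the abstract builds on can be illustrated with a short sketch: each vertex gathers the features of its neighbourhood along a precomputed spiral ordering, and one shared linear filter is applied to the flattened result. This is not the authors' released code; the function name, shapes, and the toy 4-vertex spiral table below are hypothetical, and the real operator runs on learned mesh hierarchies.

```python
import numpy as np

def spiral_conv(x, spiral_indices, weight, bias):
    """One SpiralNet++-style convolution on a triangle mesh.

    x              -- (V, C_in) per-vertex features
    spiral_indices -- (V, L) precomputed spiral ordering of each
                      vertex's neighbourhood (vertex ids)
    weight         -- (L * C_in, C_out) shared linear filter
    bias           -- (C_out,)
    """
    V, C_in = x.shape
    L = spiral_indices.shape[1]
    # Gather each vertex's spiral neighbourhood and flatten it, so a
    # single dense matrix product applies the filter at every vertex.
    gathered = x[spiral_indices].reshape(V, L * C_in)
    return gathered @ weight + bias

# Toy 4-vertex mesh: each spiral lists the centre vertex first.
x = np.arange(8, dtype=float).reshape(4, 2)
spirals = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 0], [3, 0, 1]])
w = np.ones((6, 1))
b = np.zeros(1)
out = spiral_conv(x, spirals, w, b)   # (4, 1): one filter response per vertex
```

Because the spiral fixes an explicit neighbour ordering, the operator is anisotropic (unlike spectral mesh convolutions) while remaining a plain matrix multiply.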


Highlights

3D face model representation.
Morphology aware expression transfer.
Style based 3D facial generative model.
Semi-supervised learning of 3D identity and expression separation.
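The "style based" generation in the highlights above is, in FUNIT-like architectures, usually realized by injecting the style (identity) code into the decoder through adaptive instance normalization (Huang and Belongie, reference [53]). A minimal NumPy sketch with hypothetical names, assuming the style code has already been mapped to per-channel mean and std:

```python
import numpy as np

def adain(content, style_mean, style_std, eps=1e-5):
    """Adaptive instance normalization.

    content    -- (N, C) feature rows (e.g. per-vertex decoder features)
    style_mean -- (C,) target per-channel mean predicted from the style code
    style_std  -- (C,) target per-channel std predicted from the style code
    """
    mu = content.mean(axis=0)
    sigma = content.std(axis=0)
    # Whiten the content statistics, then re-colour them with the
    # statistics carried by the style (identity) code.
    return style_std * (content - mu) / (sigma + eps) + style_mean

feats = np.array([[1.0, 2.0], [3.0, 4.0]])
styled = adain(feats, style_mean=np.zeros(2), style_std=np.ones(2))
```

After the call, the feature statistics match the style code's target statistics regardless of what the content encoder produced, which is what lets one decoder render many identities.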

References

[1]
Alexander O., Rogers M., Lambeth W., Chiang J.Y., Ma W.C., Wang C.C., Debevec P., The Digital Emily Project: Achieving a photorealistic digital actor, IEEE Comput Graph Appl 30 (4) (2010) 20–31,.
[2]
Li R., Bladin K., Zhao Y., Chinara C., Ingraham O., Xiang P., Ren X., Prasad P., Kishore B., Xing J., Li H., Learning formation of physically-based face attributes, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 2020, pp. 3407–3416,. URL: http://arxiv.org/abs/2004.03458. arXiv:2004.03458.
[3]
Wei S.E., Saragih J., Simon T., Harley A.W., Lombardi S., Perdoch M., Hypes A., Wang D., Badino H., Sheikh Y., VR facial animation via multiview image translation, ACM Trans Graph 38 (4) (2019) 1–16,.
[4]
Bottino A., De Simone M., Laurentini A., Sforza C., A new 3-D tool for planning plastic surgery, IEEE Trans Biomed Eng 59 (12) (2012) 3439–3449,.
[5]
Little A.C., Jones B.C., Debruine L.M., The many faces of research on face perception, Philos Trans R Soc B 366 (1571) (2011) 1634–1637,. URL: https://royalsocietypublishing.org/doi/10.1098/rstb.2010.0386.
[6]
Danieau F., Gubins I., Olivier N., Dumas O., Denis B., Lopez T., Mollet N., Frager B., Avril Q., Automatic generation and stylization of 3d facial rigs, in: 26th IEEE conference on virtual reality and 3D user interfaces, VR 2019 - Proceedings, IEEE, 2019, pp. 784–792,. URL: https://ieeexplore.ieee.org/document/8798208/.
[7]
Olivier N., Hoyet L., Danieau F., Argelaguet F., Avril Q., Lecuyer A., Guillotel P., Multon F., The impact of stylization on face recognition, in: Proceedings - SAP 2020: ACM symposium on applied perception, ACM, 2020, pp. 1–9,. URL: https://dl.acm.org/doi/10.1145/3385955.3407930.
[8]
Olivier N., Kerbiriou G., Arguelaguet F., Avril Q., Danieau F., Guillotel P., Hoyet L., Multon F., Study on automatic 3d facial caricaturization: from rules to deep learning, Frontiers in Virtual Reality 2 (2022),.
[9]
Sumner R.W., Popović J., Deformation transfer for triangle meshes, ACM Trans Graph 23 (3) (2004) 399–405,.
[10]
Goodfellow I.J., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., Bengio Y., Generative adversarial nets, in: Ghahramani Z., Welling M., Cortes C., Lawrence N., Weinberger K.Q. (Eds.), Advances in neural information processing systems, Vol. 3, Curran Associates, Inc., 2014, pp. 2672–2680,. URL: https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf.
[11]
Mirza M., Osindero S., Conditional generative adversarial nets, 2014, URL: http://arxiv.org/abs/1411.1784. arXiv:1411.1784.
[12]
Radford A., Metz L., Chintala S., Unsupervised representation learning with deep convolutional generative adversarial networks, in: 4th international conference on learning representations, ICLR 2016 - Conference track proceedings, 2016, URL: http://arxiv.org/abs/1511.06434. arXiv:1511.06434.
[13]
Karras T., Aila T., Laine S., Lehtinen J., Progressive growing of GANs for improved quality, stability, and variation, in: 6th international conference on learning representations, ICLR 2018 - Conference track proceedings, 2018, URL: http://arxiv.org/abs/1710.10196. arXiv:1710.10196.
[14]
Isola P., Zhu J.Y., Zhou T., Efros A.A., Image-to-image translation with conditional adversarial networks, in: Proceedings - 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, Vol. 2017-Janua, 2017, pp. 5967–5976,. URL: http://arxiv.org/abs/1611.07004. arXiv:1611.07004.
[15]
Zhu J.Y., Park T., Isola P., Efros A.A., Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Proceedings of the IEEE international conference on computer vision, Vol. 2017-Octob, 2017, pp. 2242–2251,. URL: http://arxiv.org/abs/1703.10593. arXiv:1703.10593.
[16]
Liu M.-Y., Huang X., Mallya A., Karras T., Aila T., Lehtinen J., Kautz J., Few-shot unsupervised image-to-image translation, in: IEEE/CVF international conference on computer vision (ICCV), 2019, URL: http://arxiv.org/abs/1905.01723.
[17]
Karras T., Laine S., Aila T., A style-based generator architecture for generative adversarial networks, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, Vol. 2019-June, 2019, pp. 4396–4405,. URL: http://arxiv.org/abs/1812.04948. arXiv:1812.04948.
[18]
Choi Y., Uh Y., Yoo J., Ha J.W., StarGAN v2: Diverse image synthesis for multiple domains, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 2020, pp. 8185–8194,. URL: http://arxiv.org/abs/1912.01865. arXiv:1912.01865.
[19]
Gong S., Chen L., Bronstein M., Zafeiriou S., SpiralNet++: A fast and highly efficient mesh convolution operator, in: Proceedings - 2019 international conference on computer vision workshop, ICCVW 2019, 2019, pp. 4141–4148,. URL: http://arxiv.org/abs/1911.05856. arXiv:1911.05856.
[20]
Egger B., Smith W.A., Tewari A., Wuhrer S., Zollhoefer M., Beeler T., Bernard F., Bolkart T., Kortylewski A., Romdhani S., Theobalt C., Blanz V., Vetter T., 3D morphable face models - past, present, and future, ACM Trans Graph 39 (5) (2020). URL: http://arxiv.org/abs/1909.01815. arXiv:1909.01815.
[21]
Blanz V., Vetter T., A morphable model for the synthesis of 3D faces, in: Proceedings of the 26th annual conference on computer graphics and interactive techniques, SIGGRAPH 1999, ACM Press, 1999, pp. 187–194,.
[22]
Vlasic D., Brand M., Pfister H., Popović J., Face transfer with multilinear models, ACM Trans Graph 24 (3) (2005) 426–433,.
[23]
Ferrari C., Berretti S., Pala P., Del Bimbo A., A sparse and locally coherent morphable face model for dense semantic correspondence across heterogeneous 3D faces, IEEE Trans Pattern Anal Mach Intell (2021). arXiv:2006.03840.
[24]
Li T., Bolkart T., Black M.J., Li H., Romero J., Learning a model of facial shape and expression from 4D scans, ACM Trans Graph 36 (6) (2017) 1–17,. URL: https://dl.acm.org/doi/10.1145/3130800.3130813.
[25]
Wang M., Panagakis Y., Snape P., Zafeiriou S., Learning the multilinear structure of visual data, in: Proceedings - 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, Vol. 2017-Janua, 2017, pp. 6053–6061,.
[26]
Bolkart T., Wuhrer S., A robust multilinear model learning framework for 3D faces, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, Vol. 2016-Decem, 2016, pp. 4911–4919,.
[27]
Ranjan A., Bolkart T., Sanyal S., Black M.J., Generating 3D faces using convolutional mesh autoencoders, in: Ferrari V., Hebert M., Sminchisescu C., Weiss Y. (Eds.), Lecture notes in computer science (Including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), in: LNCS, vol. 11207, Springer International Publishing, 2018, pp. 725–741,. URL: http://arxiv.org/abs/1807.10267 http://link.springer.com/10.1007/978-3-030-01219-9_43. arXiv:1807.10267.
[28]
Odena A., Olah C., Shlens J., Conditional image synthesis with auxiliary classifier gans, in: 34th international conference on machine learning, ICML 2017, Vol. 6, 2017, pp. 4043–4055. URL: http://arxiv.org/abs/1610.09585. arXiv:1610.09585.
[29]
Abrevaya V.F., Boukhayma A., Wuhrer S., Boyer E., A decoupled 3D facial shape model by adversarial training, in: Proceedings of the IEEE international conference on computer vision, Vol. 2019-Octob, IEEE, 2019, pp. 9418–9427,. URL: https://hal.archives-ouvertes.fr/hal-02064711. arXiv:1902.03619.
[30]
Jiang Z.H., Wu Q., Chen K., Zhang J., Disentangled representation learning for 3D face shape, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, Vol. 2019-June, 2019, pp. 11949–11958,. URL: http://arxiv.org/abs/1902.09887. arXiv:1902.09887.
[31]
Zhang Z., Yu C., Li H., Sun J., Liu F., Learning distribution independent latent representation for 3D face disentanglement, in: Proceedings - 2020 international conference on 3D vision, 3DV 2020, 2020, pp. 848–857,.
[32]
Gatys L.A., Ecker A.S., Bethge M., Image style transfer using convolutional neural networks, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, Vol. 2016-Decem, IEEE, 2016, pp. 2414–2423,. URL: http://ieeexplore.ieee.org/document/7780634/.
[33]
Li C., Wand M., Precomputed real-time texture synthesis with markovian generative adversarial networks, in: Lecture notes in computer science (Including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), in: LNCS, vol. 9907, 2016, pp. 702–716,. arXiv:1604.04382.
[34]
Ledig C., Theis L., Huszár F., Caballero J., Cunningham A., Acosta A., Aitken A., Tejani A., Totz J., Wang Z., Shi W., Photo-realistic single image super-resolution using a generative adversarial network, in: Proceedings - 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, Vol. 2017-Janua, 2017, pp. 105–114,. arXiv:1609.04802.
[35]
Zhu J.Y., Krähenbühl P., Shechtman E., Efros A.A., Generative visual manipulation on the natural image manifold, in: Lecture notes in computer science (Including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), in: LNCS, vol. 9909, 2016, pp. 597–613,. URL: http://arxiv.org/abs/1609.03552. arXiv:1609.03552.
[36]
Taigman Y., Polyak A., Wolf L., Unsupervised cross-domain image generation, ICLR (Poster) (2017) URL: http://arxiv.org/abs/1611.02200.
[37]
Liu M.Y., Breuel T., Kautz J., Unsupervised image-to-image translation networks, Adv Neural Inf Process Syst 2017-Decem (2017) 701–709. URL: http://arxiv.org/abs/1703.00848. arXiv:1703.00848.
[38]
Bruna J., Zaremba W., Szlam A., LeCun Y., Spectral networks and deep locally connected networks on graphs, in: 2nd international conference on learning representations, ICLR 2014 - Conference track proceedings, 2014, URL: http://arxiv.org/abs/1312.6203. arXiv:1312.6203.
[39]
Sinha A., Bai J., Ramani K., Deep learning 3D shape surfaces using geometry images, in: Lecture notes in computer science (Including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), in: LNCS, vol. 9910, 2016, pp. 223–240,.
[40]
Moschoglou S., Ploumpis S., Nicolaou M.A., Papaioannou A., Zafeiriou S., 3DFaceGAN: Adversarial nets for 3D face representation, generation, and translation, Int J Comput Vis 128 (10–11) (2020) 2534–2551,. URL: http://arxiv.org/abs/1905.00307. arXiv:1905.00307.
[41]
Lim I., Dielen A., Campen M., Kobbelt L., A simple approach to intrinsic correspondence learning on unstructured 3D meshes, in: Lecture notes in computer science (Including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), in: LNCS, vol. 11131, 2019, pp. 349–362,. URL: http://arxiv.org/abs/1809.06664. arXiv:1809.06664.
[42]
Bouritsas G., Bokhnyak S., Ploumpis S., Zafeiriou S., Bronstein M., Neural 3D morphable models: Spiral convolutional networks for 3D shape representation learning and generation, in: Proceedings of the IEEE international conference on computer vision, Vol. 2019-Octob, 2019, pp. 7212–7221,. arXiv:1905.02876.
[43]
Qi C.R., Su H., Mo K., Guibas L.J., PointNet: Deep learning on point sets for 3D classification and segmentation, in: Proceedings - 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, Vol. 2017-Janua, 2017, pp. 77–85,. URL: http://arxiv.org/abs/1612.00593. arXiv:1612.00593.
[44]
Xu J., Sun X., Zhang Z., Zhao G., Lin J., Understanding and improving layer normalization, Adv Neural Inf Process Syst 32 (2019) URL: http://arxiv.org/abs/1911.07013. arXiv:1911.07013.
[45]
Segu M., Grinvald M., Siegwart R., Tombari F., 3DSNet: Unsupervised shape-to-shape 3D style transfer, 2020, URL: http://arxiv.org/abs/2011.13388. arXiv:2011.13388.
[46]
Meng Z., Liu P., Cai J., Han S., Tong Y., Identity-aware convolutional neural network for facial expression recognition, in: IEEE international conference on automatic face & gesture recognition (FG 2017), IEEE, 2017, pp. 558–565.
[47]
Liu X., Kumar B.V., Jia P., You J., Hard negative generation for identity-disentangled facial expression recognition, Pattern Recognit 88 (2019) 1–12.
[48]
Ali K., Hughes C.E., Facial expression recognition by using a disentangled identity-invariant expression representation, in: ICPR, IEEE, 2020, pp. 9460–9467.
[49]
Bai M., Xie W., Shen L., Disentangled feature based adversarial learning for facial expression recognition, in: International conference on image processing, ICIP, Vol. 2019-Septe, 2019, pp. 31–35.
[50]
Tran L., Yin X., Liu X., Disentangled representation learning GAN for pose-invariant face recognition, in: Proceedings - 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, Vol. 2017-Janua, 2017, pp. 1283–1292,.
[51]
Zhang Z., Zhai S., Yin L., Identity-based adversarial training of deep CNNs for facial action unit recognition, in: British machine vision conference 2018, BMVC 2018, 2019.
[52]
Garland M., Heckbert P.S., Surface simplification using quadric error metrics, in: Proceedings of the 24th annual conference on computer graphics and interactive techniques, SIGGRAPH 1997, ACM Press, 1997, pp. 209–216,.
[53]
Huang X., Belongie S., Arbitrary style transfer in real-time with adaptive instance normalization, in: Proceedings of the IEEE international conference on computer vision, Vol. 2017-Octob, IEEE, 2017, pp. 1510–1519,. URL: http://ieeexplore.ieee.org/document/8237429/. arXiv:1703.06868.
[54]
Hui L., Li X., Chen J., He H., Yang J., Unsupervised multi-domain image translation with domain-specific encoders/decoders, in: Proceedings - International conference on pattern recognition, Vol. 2018-Augus, 2018, pp. 2044–2049,. arXiv:1712.02050.
[55]
Zhou T., Krähenbühl P., Aubry M., Huang Q., Efros A.A., Learning dense correspondence via 3D-guided cycle consistency, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, Vol. 2016-Decem, 2016, pp. 117–126,. URL: http://arxiv.org/abs/1604.05383. arXiv:1604.05383.
[56]
Mescheder L., Geiger A., Nowozin S., Which training methods for GANs do actually converge?, in: 35th international conference on machine learning, ICML 2018, Vol. 8, 2018, pp. 5589–5626. URL: http://arxiv.org/abs/1801.04406. arXiv:1801.04406.
[57]
Desbrun M., Meyer M., Schröder P., Barr A.H., Implicit fairing of irregular meshes using diffusion and curvature flow, in: Proceedings of the 26th annual conference on computer graphics and interactive techniques, SIGGRAPH 1999, ACM Press, 1999, pp. 317–324,. URL: http://portal.acm.org/citation.cfm?doid=311535.311576.
[58]
Nealen A., Igarashi T., Sorkine O., Alexa M., Laplacian mesh optimization, in: Proceedings - GRAPHITE 2006: 4th international conference on computer graphics and interactive techniques in Australasia and Southeast Asia, ACM Press, 2006, pp. 381–389,. URL: http://portal.acm.org/citation.cfm?doid=1174429.1174494 http://dl.acm.org/citation.cfm?id=1174494.
[59]
Johnson J., Ravi N., Reizenstein J., Novotny D., Tulsiani S., Lassner C., Branson S., Accelerating 3D deep learning with PyTorch3D, in: SIGGRAPH Asia 2020 courses, 2020, p. 1,. arXiv:2007.08501.
[60]
Yang H., Zhu H., Wang Y., Huang M., Shen Q., Yang R., Cao X., FaceScape: A large-scale high quality 3D face dataset and detailed riggable 3D face prediction, in: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 2020, pp. 598–607,. arXiv:2003.13989.
[61]
Kacem A., Cherenkova K., Aouada D., Disentangled face identity representations for joint 3D face recognition and expression neutralisation, 2021, URL: http://arxiv.org/abs/2104.10273. arXiv:2104.10273.
[62]
Kingma D.P., Ba J.L., Adam: A method for stochastic optimization, in: 3rd international conference on learning representations, ICLR 2015 - Conference track proceedings, 2015, arXiv:1412.6980.
[63]
He K., Zhang X., Ren S., Sun J., Delving deep into rectifiers: Surpassing human-level performance on imagenet classification, in: Proceedings of the IEEE international conference on computer vision, Vol. 2015 Inter, IEEE, 2015, pp. 1026–1034,. URL: http://ieeexplore.ieee.org/document/7410480/. arXiv:1502.01852.
[64]
Chandran P., Bradley D., Gross M., Beeler T., Semantic deep face models, in: Proceedings - 2020 international conference on 3D vision, 3DV 2020, 2020, pp. 345–354,.
[65]
Kingma D.P., Welling M., Auto-encoding variational bayes, in: 2nd international conference on learning representations, ICLR 2014 - Conference track proceedings, 2014, arXiv:1312.6114.
[66]
Clevert D.A., Unterthiner T., Hochreiter S., Fast and accurate deep network learning by exponential linear units (ELUs), in: 4th international conference on learning representations, ICLR 2016 - Conference track proceedings, 2016, arXiv:1511.07289.





          Published In

          Computers and Graphics  Volume 110, Issue C
          Feb 2023
          189 pages

          Publisher

          Pergamon Press, Inc.

          United States


          Author Tags

          1. Computers and graphics
          2. Formatting
          3. Guidelines

          Qualifiers

          • Research-article


          Cited By

• (2024) MeshWGAN: Mesh-to-Mesh Wasserstein GAN With Multi-Task Gradient Penalty for 3D Facial Geometric Age Transformation, IEEE Transactions on Visualization and Computer Graphics 30 (8), 4927–4940, 10.1109/TVCG.2023.3284500, online publication date: 1-Aug-2024
• (2024) FISTNet, Information Fusion 112 (C), 10.1016/j.inffus.2024.102572, online publication date: 1-Dec-2024
• (2024) Non-corresponding and topology-free 3D face expression transfer, The Visual Computer: International Journal of Computer Graphics 40 (10), 7057–7074, 10.1007/s00371-024-03473-5, online publication date: 1-Oct-2024
• (2023) MORGAN: MPEG Original Reference Geometric Avatar Neutral, Proceedings of the 28th International ACM Conference on 3D Web Technology, 1–10, 10.1145/3611314.3615909, online publication date: 9-Oct-2023
• (2023) Editorial Note, Computers and Graphics 110 (C), A1–A3, 10.1016/j.cag.2023.01.014, online publication date: 1-Feb-2023
