Abstract
2D caricature editing has achieved strong results. However, modeling 3D exaggerated caricature faces (ECF) with flexible shape and texture editing still falls short of high-quality results. This paper aims to model the shape and texture variations of 3D caricatures in a learnable parameter space. To this end, we propose a novel framework for highly controllable editing of 3D caricatures. Our model consists mainly of texture and shape hyper-networks, texture and shape Sirens, and a projection module. Specifically, the two hyper-networks take the texture and shape latent codes as inputs and learn compact parameter spaces for the two Siren modules. The texture and shape Sirens model the deformation variations of textural styles and geometric shapes. We further incorporate precise control of the camera parameters in the projection module to improve the quality of the generated ECF results. Our method supports flexible online editing and the swapping of textural features between 3D caricatures. To train and test the model, we contribute a 3D caricature face dataset with textures. Experiments and user evaluations demonstrate that our method generates diverse, high-fidelity caricatures and achieves better editing capabilities than state-of-the-art methods.
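The relationship between a hyper-network and a Siren described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `HyperNetwork`, the function `siren_forward`, all layer sizes, the frequency `omega0`, and the reading that the shape Siren maps a 3D point to a displacement are assumptions made for illustration only. The idea shown is simply that a latent code is mapped to the weights of a small sine-activated network, so editing the code changes the function that network computes.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map; weights is a list of rows, x is a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperNetwork:
    """Hypothetical hyper-network: latent code -> parameters of a 2-layer sine MLP."""

    def __init__(self, latent_dim, in_dim, hidden_dim, out_dim, seed=0):
        rng = random.Random(seed)
        # (rows, cols) of each target layer's weight matrix (assumed sizes).
        self.shapes = [(hidden_dim, in_dim), (out_dim, hidden_dim)]
        n_params = sum(r * c + r for r, c in self.shapes)
        # A single linear layer stands in for the hyper-network here.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(latent_dim)]
                  for _ in range(n_params)]
        self.b = [0.0] * n_params

    def __call__(self, z):
        flat = linear(self.W, self.b, z)
        layers, k = [], 0
        for r, c in self.shapes:
            # Unflatten the predicted vector into a weight matrix and a bias.
            w = [flat[k + i * c: k + (i + 1) * c] for i in range(r)]
            k += r * c
            b = flat[k: k + r]
            k += r
            layers.append((w, b))
        return layers


def siren_forward(layers, x, omega0=30.0):
    """Sine-activated hidden layer followed by a linear output layer."""
    (w1, b1), (w2, b2) = layers
    hidden = [math.sin(omega0 * v) for v in linear(w1, b1, x)]
    return linear(w2, b2, hidden)


# Assumed usage: a shape code yields a network that maps a 3D point to an offset.
shape_hyper = HyperNetwork(latent_dim=4, in_dim=3, hidden_dim=8, out_dim=3)
shape_code = [0.1, -0.2, 0.3, 0.4]
point = [0.5, 0.1, -0.3]
offset = siren_forward(shape_hyper(shape_code), point)
deformed_point = [p + d for p, d in zip(point, offset)]
```

Under the same assumptions, a texture branch would use a second `HyperNetwork` whose predicted network outputs a color instead of an offset. Swapping textures between two caricatures would then amount to evaluating one caricature's geometry with the other's texture code.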
Data availability
Data is available on reasonable request from the corresponding author.
Acknowledgements
This research is supported by the National Key R&D Program of China (No. 2022ZD0115902), the National Natural Science Foundation of China (No. 62102208), the Beijing Natural Science Foundation (No. 4232023), the Young Elite Scientists Sponsorship Program by BAST (No. BYESS2023382), the Beijing Emerging Interdisciplinary Platform for Medicine and Engineering in Sports (EIPMES), and the Open Project Program of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. VRLAB2024C06).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Lin, Y., Dai, J., Pan, J. et al. Free editing of Shape and Texture with Deformable Net for 3D Caricature Generation. Vis Comput 40, 4675–4687 (2024). https://doi.org/10.1007/s00371-024-03461-9