SemanticGAN: Facial Image Editing with Semantic to Realize Consistency

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13536)


Abstract

Recent work has shown that face editing can be performed in the latent space of Generative Adversarial Networks (GANs). However, the attributes entangled in the latent space are difficult to decouple, which leads to inconsistent face editing. In this work, we propose a simple yet effective method, named SemanticGAN, to realize consistent face editing. First, we obtain fine-grained editing of the attribute-related regions; at this stage we focus on how accurately the edited images possess the target attributes, rather than on inconsistencies introduced in irrelevant regions. Second, we optimize the attribute-independent regions to ensure that the edited face image remains consistent with the raw image. Specifically, we apply generated semantic segmentation maps to distinguish the edited regions from the unedited regions. Extensive qualitative and quantitative results validate the proposed method, and comparisons show that SemanticGAN achieves satisfactory, image-consistent editing results.
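The consistency step described above, keeping attribute-independent regions identical to the raw image by using a semantic segmentation map to separate edited from unedited regions, can be illustrated with a minimal mask-compositing sketch. This is not the paper's implementation; the function `mask_composite` and the toy arrays are hypothetical, and the sketch assumes a binary mask where 1 marks attribute-related pixels.

```python
import numpy as np

def mask_composite(original: np.ndarray, edited: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Blend an edited face image with the original using a semantic mask.

    Pixels where mask == 1 (attribute-related regions, e.g. hair or mouth)
    are taken from the edited image; all other pixels are copied from the
    original, so attribute-independent regions stay consistent with it.
    """
    m = mask.astype(original.dtype)[..., None]  # (H, W) -> (H, W, 1), broadcasts over RGB
    return m * edited + (1.0 - m) * original

# Toy example: a 4x4 "image" where only the top-left 2x2 block is edited.
orig = np.zeros((4, 4, 3))   # stand-in for the raw image
edit = np.ones((4, 4, 3))    # stand-in for the globally edited image
m = np.zeros((4, 4))
m[:2, :2] = 1.0              # semantic mask: attribute-related region
out = mask_composite(orig, edit, m)
```

In the toy example, `out` keeps the original values everywhere except the masked 2x2 block, which takes the edited values; a real pipeline would obtain the mask from a segmentation network rather than by hand.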



Author information

Corresponding author

Correspondence to Huijie Fan.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Luan, X., Yang, N., Fan, H., Tang, Y. (2022). SemanticGAN: Facial Image Editing with Semantic to Realize Consistency. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_34

  • DOI: https://doi.org/10.1007/978-3-031-18913-5_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18912-8

  • Online ISBN: 978-3-031-18913-5

  • eBook Packages: Computer Science, Computer Science (R0)
