Abstract
Recent work has shown that face editing can be performed in the latent space of Generative Adversarial Networks (GANs). However, attributes are difficult to decouple in the latent space, which leads to inconsistent face editing. In this work, we propose a simple yet effective method named SemanticGAN to realize consistent face editing. First, we perform fine editing on the attribute-related regions; at this stage we mainly consider whether the edited image possesses the target attribute, rather than whether the editing leaves irrelevant regions inconsistent. Second, we optimize the attribute-independent regions to ensure that the edited face image remains consistent with the raw image. Specifically, we apply a generated semantic segmentation to distinguish the edited regions from the unedited regions. Extensive qualitative and quantitative results validate the proposed method, and comparisons show that SemanticGAN achieves satisfactory, image-consistent editing results.
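To make the two-stage idea concrete, the following is a minimal sketch of the segmentation-guided consistency step, not the authors' actual implementation. It assumes a pretrained GAN generator, a latent edit direction, and a segmentation network that marks attribute-related regions; all names (`consistent_edit`, `segmenter`, `direction`) are hypothetical placeholders.

```python
# Minimal sketch: keep attribute-independent regions consistent with the raw
# image by compositing the edited and original images under a semantic mask.
# All callables here are hypothetical stand-ins, not the paper's code.
import torch

def consistent_edit(generator, segmenter, w, direction, alpha=1.0):
    """Edit a latent code, then restore attribute-independent regions
    from the original image using a semantic segmentation mask."""
    original = generator(w)                    # raw image from latent code w
    edited = generator(w + alpha * direction)  # coarse latent-space edit

    # Hypothetical segmenter returning a soft mask in [0, 1], ~1 on
    # attribute-related regions (e.g., hair for a hair-color edit).
    mask = segmenter(edited)                   # shape: (N, 1, H, W)

    # Edited content inside the mask, original content outside, so
    # attribute-independent regions stay consistent with the raw image.
    return mask * edited + (1.0 - mask) * original
```

Under this reading, the first stage only needs the latent edit to hit the target attribute inside the mask, while the second stage enforces consistency everywhere else.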
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Luan, X., Yang, N., Fan, H., Tang, Y. (2022). SemanticGAN: Facial Image Editing with Semantic to Realize Consistency. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_34
Print ISBN: 978-3-031-18912-8
Online ISBN: 978-3-031-18913-5