Abstract
Training well-performing Generative Adversarial Networks (GANs) with limited data has long been challenging. Existing methods either require sufficient data (over 100 training images) or generate images of low quality and low diversity. To address this problem, we propose a new Cross-domain Semantic Relation (CSR) loss, which improves the generative model by preserving the relationships between instances in the source domain and the generated images. In addition, a perceptual similarity loss and a discriminative contrastive loss are designed to further enrich the diversity of generated images and stabilize training. Experiments on nine publicly available few-shot datasets, with comparisons against nine state-of-the-art methods, show that our approach outperforms all baselines. Finally, ablation studies on the three proposed loss functions confirm that each is essential for few-shot image generation. Code is available at https://github.com/gouayao/CSR.
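The abstract's central idea, preserving pairwise relations between source-domain instances and their generated counterparts, can be illustrated with a minimal sketch. The formulation below is an assumption, not the paper's exact loss: it turns each batch's pairwise cosine similarities into a probability distribution and penalizes the KL divergence between the source and generated distributions. The function name `csr_style_loss`, the temperature `tau`, and the feature inputs are all hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def csr_style_loss(src_feats, gen_feats, tau=0.1):
    """Hypothetical sketch of a cross-domain semantic relation loss:
    encourage generated images to keep the pairwise relations that
    their source-domain anchors have with each other."""
    def relation(feats):
        f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        sim = f @ f.T                          # pairwise cosine similarity
        n = sim.shape[0]
        mask = ~np.eye(n, dtype=bool)          # drop self-similarity
        return softmax(sim[mask].reshape(n, n - 1) / tau, axis=1)
    p = relation(src_feats)                    # source relation distribution
    q = relation(gen_feats)                    # generated relation distribution
    # mean KL(p || q) over the batch
    return float(np.mean(np.sum(p * (np.log(p + 1e-8) - np.log(q + 1e-8)), axis=1)))
```

Under this sketch the loss is zero when the generated features reproduce the source batch's relation structure exactly, and grows as the relations diverge; in practice the features would come from encoder activations rather than raw arrays.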
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
About this article
Cite this article
Gou, Y., Li, M., Lv, Y. et al. Rethinking cross-domain semantic relation for few-shot image generation. Appl Intell 53, 22391–22404 (2023). https://doi.org/10.1007/s10489-023-04602-8