
Image generation step by step: animation generation-image translation

Published in Applied Intelligence

Abstract

Generative adversarial networks play an important role in image generation, but generating high-resolution images from complex data sets remains a challenging goal. In this paper, we propose the LGAN (Link Generative Adversarial Network) model, which effectively improves the quality of synthesized images. The LGAN model consists of two parts, G1 and G2. G1 handles unconditional generation: it produces anime images with highly abstract features, using few coefficients yet continuous image elements that cover the overall features of the image. G2 handles conditional generation (image translation) and consists of a mapping network and a super-resolution network. The mapping network maps the output of G1 onto a real-world image that has been processed by semantic segmentation or edge detection; the super-resolution network then super-resolves the mapped image to increase its resolution. In comparison tests against WGAN, SAGAN, WGAN-GP and PG-GAN, the proposed LGAN(SEG) achieves leading scores of 64.36 and 12.28, demonstrating the model's superiority.
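To make the two-stage pipeline concrete, the sketch below shows one way such an architecture could be wired together in PyTorch. All module names, layer sizes, and the conditioning format (a one-channel segmentation or edge map) are illustrative assumptions; this is a minimal sketch of the G1 → mapping → super-resolution flow described above, not the paper's actual networks.

```python
# Minimal sketch of the two-stage LGAN-style pipeline from the abstract.
# Every architecture detail here (channel counts, image sizes, upscaling
# factor) is an assumption for illustration only.
import torch
import torch.nn as nn

class G1(nn.Module):
    """Unconditional generator: latent noise -> coarse 64x64 anime image (assumed size)."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),  # 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),    # 8x8
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),      # 16x16
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),       # 32x32
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),                                # 64x64
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

class MappingNet(nn.Module):
    """Conditional translation: coarse image + segmentation/edge map -> mapped image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, 64, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh(),
        )

    def forward(self, coarse, cond):
        # Concatenate the G1 output with the 1-channel condition map.
        return self.net(torch.cat([coarse, cond], dim=1))

class SuperResNet(nn.Module):
    """Super-resolution stage: upscales the mapped image 4x via sub-pixel convolution."""
    def __init__(self, scale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(64, 3 * scale ** 2, 3, 1, 1),
            nn.PixelShuffle(scale), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

# End-to-end pass: G1 -> mapping -> super-resolution.
g1, mapper, sr = G1(), MappingNet(), SuperResNet()
z = torch.randn(2, 128)            # latent codes
cond = torch.rand(2, 1, 64, 64)    # segmentation or edge map (assumed 1-channel)
coarse = g1(z)                     # (2, 3, 64, 64)
mapped = mapper(coarse, cond)      # (2, 3, 64, 64)
final = sr(mapped)                 # (2, 3, 256, 256)
print(final.shape)
```

The key design point the sketch illustrates is the decoupling: the unconditional stage only has to model coarse global structure, while the conditional stage handles realistic detail and resolution, so neither network must solve the full high-resolution generation problem alone.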





Funding

This work is partially supported by the National Natural Science Foundation of China (61461053) and by the Yunnan University China Postgraduate Science Foundation under Grant 2020306.

Author information


Corresponding author

Correspondence to Hongwei Ding.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Jing, B., Ding, H., Yang, Z. et al. Image generation step by step: animation generation-image translation. Appl Intell 52, 8087–8100 (2022). https://doi.org/10.1007/s10489-021-02835-z

