Survey on leveraging pre-trained generative adversarial networks for image editing and restoration

  • Review
  • Science China Information Sciences

Abstract

Generative adversarial networks (GANs) have drawn enormous attention due to their simple yet effective training mechanism and superior image generation quality. With the ability to generate photorealistic high-resolution (e.g., 1024 × 1024) images, recent GAN models have greatly narrowed the gap between generated images and real ones. Consequently, many recent studies show growing interest in exploiting pre-trained GAN models through their well-disentangled latent space and learned GAN priors. In this study, we briefly review recent progress on leveraging pre-trained large-scale GAN models from three aspects: (1) the training of large-scale generative adversarial networks, (2) the exploration and understanding of pre-trained GAN models, and (3) the use of these models for downstream tasks such as image restoration and editing.

References

  1. Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. In: Proceedings of International Conference on Neural Information Processing Systems, 2014

  2. Denton E L, Chintala S, Fergus R, et al. Deep generative image models using a laplacian pyramid of adversarial networks. In: Proceedings of International Conference on Neural Information Processing Systems, 2015

  3. Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2016

  4. Zhang H, Xu T, Li H, et al. StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 5907–5915

  5. Zhang H, Goodfellow I, Metaxas D, et al. Self-attention generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2019. 7354–7363

  6. Mao X, Li Q, Xie H, et al. Least squares generative adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 2794–2802

  7. Berthelot D, Schumm T, Metz L. BEGAN: boundary equilibrium generative adversarial networks. 2017. ArXiv:1703.10717

  8. Jolicoeur-Martineau A. The relativistic discriminator: a key element missing from standard gan. In: Proceedings of International Conference on Learning Representations, 2019

  9. Arjovsky M, Chintala S, Bottou L. Wasserstein generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2017. 214–223

  10. Gulrajani I, Ahmed F, Arjovsky M, et al. Improved training of wasserstein GANs. In: Proceedings of International Conference on Neural Information Processing Systems, 2017. 5769–5779

  11. Miyato T, Kataoka T, Koyama M, et al. Spectral normalization for generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2018

  12. Karras T, Aila T, Laine S, et al. Progressive growing of GANs for improved quality, stability, and variation. In: Proceedings of International Conference on Learning Representations, 2018

  13. Brock A, Donahue J, Simonyan K. Large scale GAN training for high fidelity natural image synthesis. In: Proceedings of International Conference on Learning Representations, 2018

  14. Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 4401–4410

  15. Karras T, Laine S, Aittala M, et al. Analyzing and improving the image quality of StyleGAN. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 8110–8119

  16. Karras T, Aittala M, Hellsten J, et al. Training generative adversarial networks with limited data. 2020. ArXiv:2006.06676

  17. Karras T, Aittala M, Laine S, et al. Alias-free generative adversarial networks. In: Proceedings of International Conference on Neural Information Processing Systems, 2021

  18. Isola P, Zhu J Y, Zhou T, et al. Image-to-image translation with conditional adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 1125–1134

  19. Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 2223–2232

  20. Choi Y, Choi M, Kim M, et al. StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 8789–8797

  21. He Z, Zuo W, Kan M, et al. AttGAN: facial attribute editing by only changing what you want. IEEE Trans Image Process, 2019, 28: 5464–5478

    Article  MathSciNet  Google Scholar 

  22. Liu M, Ding Y, Xia M, et al. STGAN: a unified selective transfer network for arbitrary image attribute editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 3673–3682

  23. Choi Y, Uh Y, Yoo J, et al. StarGAN v2: diverse image synthesis for multiple domains. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 8188–8197

  24. Ledig C, Theis L, Huszár F, et al. Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 4681–4690

  25. Wang X, Yu K, Wu S, et al. ESRGAN: enhanced super-resolution generative adversarial networks. In: Proceedings of European Conference on Computer Vision, 2018

  26. Wang X, Xie L, Dong C, et al. Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. In: Proceedings of IEEE International Conference on Computer Vision Workshops, 2021. 1905–1914

  27. Zhang K, Liang J, van Gool L, et al. Designing a practical degradation model for deep blind image super-resolution. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 4791–4800

  28. Kupyn O, Budzan V, Mykhailych M, et al. DeblurGAN: blind motion deblurring using conditional adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 8183–8192

  29. Kupyn O, Martyniuk T, Wu J, et al. DeblurGAN-v2: deblurring (orders-of-magnitude) faster and better. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 8878–8887

  30. Zheng S, Zhu Z, Zhang X, et al. Distribution-induced bidirectional generative adversarial network for graph representation learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 7224–7233

  31. Zhu H, Peng X, Chandrasekhar V, et al. DehazeGAN: when image dehazing meets differential programming. In: Proceedings of International Joint Conference on Artificial Intelligence, 2018. 1234–1240

  32. Zhu H, Cheng Y, Peng X, et al. Single-image dehazing via compositional adversarial network. IEEE Trans Cybern, 2019, 51: 829–838

    Article  Google Scholar 

  33. Mehta A, Sinha H, Narang P, et al. HiDeGAN: a hyperspectral-guided image dehazing GAN. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. 212–213

  34. Dong Y, Liu Y, Zhang H, et al. FD-GAN: generative adversarial networks with fusion-discriminator for single image dehazing. In: Proceedings of AAAI Conference on Artificial Intelligence, 2020. 10729–10736

  35. Liu Z, Luo P, Wang X, et al. Deep learning face attributes in the wild. In: Proceedings of IEEE International Conference on Computer Vision, 2015. 3730–3738

  36. Voynov A, Babenko A. Unsupervised discovery of interpretable directions in the GAN latent space. In: Proceedings of International Conference on Learning Representations, 2020. 9786–9796

  37. Yu F, Seff A, Zhang Y, et al. LSUN: construction of a large-scale image dataset using deep learning with humans in the loop. 2015. ArXiv:1506.03365

  38. Zhu J, Shen Y, Zhao D, et al. In-domain GAN inversion for real image editing. In: Proceedings of European Conference on Computer Vision, 2020. 592–608

  39. Rudin L I, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D-Nonlinear Phenomena, 1992, 60: 259–268

  40. Buades A, Coll B, Morel J M. A non-local algorithm for image denoising. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005. 60–65

  41. Elad M, Aharon M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process, 2006, 15: 3736–3745

    Article  MathSciNet  Google Scholar 

  42. Li B, Gou Y, Gu S, et al. You only look yourself: unsupervised and untrained single image dehazing neural network. Int J Comput Vis, 2021, 129: 1754–1767

    Article  Google Scholar 

  43. Shoshan A, Mechrez R, Zelnik-Manor L. Dynamic-Net: tuning the objective without re-training for synthesis tasks. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 3215–3223

  44. Gou Y, Li B, Liu Z, et al. CLEARER: multi-scale neural architecture search for image restoration. In: Proceedings of International Conference on Neural Information Processing Systems, 2020, 33: 17129–17140

  45. Bau D, Zhu J Y, Strobelt H, et al. GAN dissection: visualizing and understanding generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2019

  46. Bau D, Zhu J Y, Wulff J, et al. Seeing what a GAN cannot generate. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 4502–4511

  47. Goetschalckx L, Andonian A, Oliva A, et al. GANalyze: toward visual definitions of cognitive image properties. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 5744–5753

  48. Härkönen E, Hertzmann A, Lehtinen J, et al. GANSpace: discovering interpretable GAN controls. In: Proceedings of International Conference on Neural Information Processing Systems, 2020

  49. Suzuki R, Koyama M, Miyato T, et al. Spatially controllable image synthesis with internal representation collaging. 2018. ArXiv:1811.10153

  50. Bau D, Strobelt H, Peebles W, et al. Semantic photo manipulation with a generative image prior. ACM Trans Graph, 2019, 38: 1–11

    Article  Google Scholar 

  51. Tewari A, Elgharib M, Bharaj G, et al. StyleRig: rigging StyleGAN for 3D control over portrait images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 6142–6151

  52. Abdal R, Zhu P, Mitra N J, et al. StyleFlow: attribute-conditioned exploration of StyleGAN-generated images using conditional continuous normalizing flows. ACM Trans Graph, 2021, 40: 1–21

    Article  Google Scholar 

  53. Menon S, Damian A, Hu S, et al. Pulse: self-supervised photo upsampling via latent space exploration of generative models. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 2437–2445

  54. Richardson E, Alaluf Y, Patashnik O, et al. Encoding in style: a StyleGAN encoder for image-to-image translation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 2287–2296

  55. Chan K C, Wang X, Xu X, et al. GLEAN: generative latent bank for large-factor image super-resolution. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 14245–14254

  56. Wang X, Li Y, Zhang H, et al. Towards real-world blind face restoration with generative facial prior. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 9168–9178

  57. Yang T, Ren P, Xie X, et al. GAN prior embedded network for blind face restoration in the wild. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 672–681

  58. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521: 436–444

    Article  Google Scholar 

  59. Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009. 248–255

  60. Lee C H, Liu Z, Wu L, et al. MaskGAN: towards diverse and interactive facial image manipulation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 5549–5558

  61. Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition. Proc IEEE, 1998, 86: 2278–2324

    Article  Google Scholar 

  62. Netzer Y, Wang T, Coates A, et al. Reading digits in natural images with unsupervised feature learning. In: Proceedings of_NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011

  63. Krizhevsky A. Learning multiple layers of features from tiny images. 2009. https://www.cs.toronto.edu/kriz/learning-features-2009-TR.pdf

  64. Liu Z, Yan S, Luo P, et al. Fashion landmark detection in the wild. In: Proceedings of European Conference on Computer Vision, 2016. 229–245

  65. Cordts M, Omran M, Ramos S, et al. The cityscapes dataset for semantic urban scene understanding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. 3213–3223

  66. Shao S, Li Z, Zhang T, et al. Objects365: a large-scale, high-quality dataset for object detection. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 8430–8439

  67. Zhou B, Lapedriza A, Khosla A, et al. Places: a 10 million image database for scene recognition. IEEE Trans Pattern Anal Mach Intell, 2017, 40: 1452–1464

    Article  Google Scholar 

  68. Krasin I, Duerig T, Alldrin N, et al. OpenImages: a public dataset for large-scale multi-label and multi-class image classification. 2017. https://storage.googleapis.com/openimages/web/index.html

  69. Salimans T, Goodfellow I, Zaremba W, et al. Improved techniques for training GANs. In: Proceedings of International Conference on Neural Information Processing Systems, 2016

  70. Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. 2818–2826

  71. Gurumurthy S, Sarvadevabhatla S R K, Babu R V. DeliGAN: generative adversarial networks for diverse and limited data. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 166–174

  72. Che T, Li Y, Jacob A P, et al. Mode regularized generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2017

  73. Zhou Z, Zhang W, Wang J. Inception score, label smoothing, gradient vanishing and −log(D(x)) alternative. 2017. ArXiv:1708.01729

  74. Zhou Z, Cai H, Rong S, et al. Activation maximization generative adversarial nets. In: Proceedings of International Conference on Learning Representations, 2018

  75. Heusel M, Ramsauer H, Unterthiner T, et al. GANs trained by a two time-scale update rule converge to a local nash equilibrium. In: Proceedings of International Conference on Neural Information Processing Systems, 2017

  76. Bonneel N, Rabin J, Peyré G, et al. Sliced and radon Wasserstein barycenters of measures. J Math Imaging Vision, 2015, 51: 22–45

    Article  MathSciNet  Google Scholar 

  77. Kolouri S, Nadjahi K, Simsekli U, et al. Generalized sliced Wasserstein distances. 2019. ArXiv:1902.00434

  78. Shmelkov K, Schmid C, Alahari K. How good is my GAN? In: Proceedings of European Conference on Computer Vision, 2018. 213–229

  79. Kynkäänniemi T, Karras T, Laine S, et al. Improved precision and recall metric for assessing generative models. 2019. ArXiv:1904.06991

  80. Khrulkov V, Oseledets I. Geometry score: a method for comparing generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2018. 2621–2629

  81. Wang Z, Bovik A C, Sheikh H R, et al. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process, 2004, 13: 600–612

    Article  Google Scholar 

  82. Zhang R, Isola P, Efros A A, et al. The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 586–595

  83. Borji A. Pros and cons of GAN evaluation measures. Comput Vision Image Understanding, 2019, 179: 41–65

    Article  Google Scholar 

  84. Wang Z, She Q, Ward T E. Generative adversarial networks in computer vision. ACM Comput Surv, 2022, 54: 1–38

    Google Scholar 

  85. Kang M, Shin J, Park J. StudioGAN: a taxonomy and benchmark of gans for image synthesis. 2022. ArXiv:2206.09479

  86. Mescheder L, Geiger A, Nowozin S. Which training methods for GANs do actually converge? In: Proceedings of International Conference on Learning Representations, 2018. 3481–3490

  87. Huang X, Belongie S. Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 1501–1510

  88. Tancik M, Srinivasan P, Mildenhall B, et al. Fourier features let networks learn high frequency functions in low dimensional domains. In: Proceedings of International Conference on Neural Information Processing Systems, 2020. 7537–7547

  89. Mirza M, Osindero S. Conditional generative adversarial nets. 2014. ArXiv:1411.1784

  90. Perarnau G, van de Weijer J, Raducanu B, et al. Invertible conditional GANs for image editing. In: Proceedings of NeurIPSW, 2016

  91. Abdal R, Qin Y, Wonka P. Image2StyleGAN: how to embed images into the stylegan latent space? In: Proceedings of IEEE International Conference on Computer Vision, 2019. 4432–4441

  92. Liu Y, Li Q, Sun Z, et al. Style intervention: how to achieve spatial disentanglement with style-based generators? 2020. ArXiv:2011.09699

  93. Wu Z, Lischinski D, Shechtman E. Stylespace analysis: disentangled controls for StyleGAN image generation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 12863–12872

  94. Xu J, Xu H, Ni B, et al. Hierarchical style-based networks for motion synthesis. In: Proceedings of European Conference on Computer Vision, 2020. 178–194

  95. Zhang L, Bai X, Gao Y. SalS-GAN: spatially-adaptive latent space in StyleGAN for real image embedding. In: Proceedings of ACM International Conference on Multimedia, 2021. 5176–5184

  96. Zhu P, Abdal R, Qin Y, et al. Improved StyleGAN embedding: where are the good latents? 2020. ArXiv:2012.09036

  97. Abdal R, Qin Y, Wonka P. Image2StyleGAN++: how to edit the embedded images? In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 8296–8305

  98. Kang K, Kim S, Cho S. GAN inversion for out-of-range images with geometric transformations. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 13941–13949

  99. Cherepkov A, Voynov A, Babenko A. Navigating the GAN parameter space for semantic image editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 3671–3680

  100. Feng Q, Shah V, Gadde R, et al. Near perfect GAN inversion. 2022. ArXiv:2202.11833

  101. Donahue J, Krähenbühl P, Darrell T. Adversarial feature learning. In: Proceedings of International Conference on Learning Representations, 2017

  102. Dumoulin V, Belghazi I, Poole B, et al. Adversarially learned inference. 2016. ArXiv:1606.00704

  103. Zhu J Y, Krähenbühl P, Shechtman E, et al. Generative visual manipulation on the natural image manifold. In: Proceedings of European Conference on Computer Vision, 2016. 597–613

  104. Creswell A, Bharath A A. Inverting the generator of a generative adversarial network. IEEE Trans Neural Netw Learn Syst, 2019, 30: 1967–1974

    Article  Google Scholar 

  105. Lipton Z C, Tripathi S. Precise recovery of latent vectors from generative adversarial networks. In: Proceedings of International Conference on Learning Representations Workshops, 2017

  106. Shah V, Hegde C. Solving linear inverse problems using GAN priors: an algorithm with provable guarantees. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2018. 4609–4613

  107. Ma F, Ayaz U, Karaman S. Invertibility of convolutional generative networks from partial measurements. In: Proceedings of International Conference on Neural Information Processing Systems, 2018. 9651–9660

  108. Raj A, Li Y, Bresler Y. GAN-based projector for faster recovery with convergence guarantees in linear inverse problems. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 5602–5611

  109. Bau D, Zhu J Y, Wulff J, et al. Inverting layers of a large generator. In: Proceedings of International Conference on Learning Representations Workshops, 2019. 4

  110. Shen Y, Gu J, Tang X, et al. Interpreting the latent space of GANs for semantic face editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 9243–9252

  111. Daras G, Odena A, Zhang H, et al. Your local GAN: designing two dimensional local attention mechanisms for generative models. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 14531–14539

  112. Gu J, Shen Y, Zhou B. Image processing using multi-code GAN prior. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 3012–3021

  113. Anirudh R, Thiagarajan J J, Kailkhura B, et al. MimicGAN: robust projection onto image manifolds with corruption mimicking. Int J Comput Vis, 2020, 128: 2459–2477

    Article  Google Scholar 

  114. Pan X, Zhan X, Dai B, et al. Exploiting deep generative prior for versatile image restoration and manipulation. IEEE Trans Pattern Anal Mach Intell, 2022, 44: 7474–7489

    Article  Google Scholar 

  115. Viazovetskyi Y, Ivashkin V, Kashin E. StyleGAN2 distillation for feed-forward image manipulation. In: Proceedings of European Conference on Computer Vision, 2020. 170–186

  116. Collins E, Bala R, Price B, et al. Editing in style: uncovering the local semantics of GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 5771–5780

  117. Pidhorskyi S, Adjeroh D A, Doretto G. Adversarial latent autoencoders. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 14104–14113

  118. Huh M, Zhang R, Zhu J Y, et al. Transforming and projecting images into class-conditional generative networks. In: Proceedings of European Conference on Computer Vision, 2020. 17–34

  119. Nitzan Y, Bermano A, Li Y, et al. Face identity disentanglement via latent space mapping. ACM Trans Graph, 2020, 39: 1–14

    Article  Google Scholar 

  120. Aberdam A, Simon D, Elad M. When and how can deep generative models be inverted? 2020. ArXiv:2006.15555

  121. Guan S, Tai Y, Ni B, et al. Collaborative learning for faster StyleGAN embedding. 2020. ArXiv:2007.01758

  122. Shen Y, Zhou B. Closed-form factorization of latent semantics in GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 1532–1540

  123. Xu Y, Shen Y, Zhu J, et al. Generative hierarchical features from synthesizing images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 4432–4442

  124. Tewari A, Elgharib M, R M B, et al. PIE: portrait image embedding for semantic control. ACM Trans Graph, 2020, 39: 1–14

    Article  Google Scholar 

  125. Bartz C, Bethge J, Yang H, et al. One model to reconstruct them all: a novel way to use the stochastic noise in StyleGAN. In: Proceedings of British Machine Vision Association, 2020

  126. Wang H P, Yu N, Fritz M. Hijack-GAN: unintended-use of pretrained, black-box GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 7872–7881

  127. Zhuang P, Koyejo O O, Schwing A. Enjoy your editing: controllable GANs for image editing via latent space navigation. In: Proceedings of International Conference on Learning Representations, 2021

  128. Alaluf Y, Patashnik O, Cohen-Or D. Only a matter of style: age transformation using a style-based regression model. ACM Trans Graph, 2021, 40: 1–12

    Article  Google Scholar 

  129. Tov O, Alaluf Y, Nitzan Y, et al. Designing an encoder for StyleGAN image manipulation. ACM Trans Graph, 2021, 40: 1–14

    Article  Google Scholar 

  130. Patashnik O, Wu Z, Shechtman E, et al. StyleCLIP: text-driven manipulation of StyleGAN imagery. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 2085–2094

  131. Chai L, Wulff J, Isola P. Using latent space regression to analyze and leverage compositionality in GANs. In: Proceedings of International Conference on Learning Representations, 2021

  132. Chai L, Zhu J Y, Shechtman E, et al. Ensembling with deep generative views. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 14997–15007

  133. Alaluf Y, Patashnik O, Cohen-Or D. ReStyle: a residual-based StyleGAN encoder via iterative refinement. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6711–6720

  134. Wei T, Chen D, Zhou W, et al. E2Style: improve the efficiency and effectiveness of StyleGAN inversion. IEEE Trans Image Process, 2022, 31: 3267–3280

    Article  Google Scholar 

  135. Xu Y, Du Y, Xiao W, et al. From continuity to editability: inverting GANs with consecutive images. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 13910–13918

  136. Wang T, Zhang Y, Fan Y, et al. High-fidelity GAN inversion for image attribute editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 11379–11388

  137. Schwettmann S, Hernandez E, Bau D, et al. Toward a visual concept vocabulary for GAN latent space. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6804–6812

  138. Alaluf Y, Tov O, Mokady R, et al. HyperStyle: StyleGAN inversion with hypernetworks for real image editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 18511–18521

  139. Peebles W, Zhu J Y, Zhang R, et al. GAN-supervised dense visual alignment. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 13470–13481

  140. Dinh T M, Tran A T, Nguyen R, et al. HyperInverter: improving StyleGAN inversion via hypernetwork. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 11389–11398

  141. Alaluf Y, Patashnik O, Wu Z, et al. Third time’s the charm? Image and video editing with StyleGAN3. 2022. ArXiv:2201.13433

  142. Frühstück A, Singh K K, Shechtman E, et al. InsetGAN for full-body image generation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 7723–7732

  143. Wu Y, Yang Y L, Jin X. HairMapper: removing hair from portraits using gans. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 4227–4236

  144. Parmar G, Li Y, Lu J, et al. Spatially-adaptive multilayer selection for GAN inversion and editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 11399–11409

  145. Zhou B, Zhao H, Puig X, et al. Scene parsing through ADE20K dataset. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 633–641

  146. Chen B C, Chen C S, Hsu W H. Cross-age reference coding for age-invariant face recognition and retrieval. In: Proceedings of European Conference on Computer Vision, 2014. 768–783

  147. Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context. In: Proceedings of European Conference on Computer Vision, 2014. 740–755

  148. Wah C, Branson S, Welinder P, et al. The Caltech-UCSD birds-200-2011 dataset. 2011. http://www.vision.caltech.edu/visipedia/CUB-200.html

  149. Anonymous, The Danbooru Community, Branwen G. Danbooru2021: a large-scale crowdsourced and tagged anime illustration dataset. 2021. https://www.gwern.net/Danbooru

  150. Nilsback M E, Zisserman A. Automated flower classification over a large number of classes. In: Proceedings of the 6th Indian Conference on Computer Vision, Graphics & Image Processing, 2008. 722–729

  151. Huang G B, Mattar M, Berg T, et al. Labeled faces in the wild: a database forstudying face recognition in unconstrained environments. In: Proceedings of Workshop on Faces in ‘Real-Life’ Images: Detection, Alignment, and Recognition, 2008

  152. Skorokhodov I, Sotnikov G, Elhoseiny M. Aligning latent and image spaces to connect the unconnectable. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 14144–14153

  153. Lake B M, Salakhutdinov R, Tenenbaum J B. Human-level concept learning through probabilistic program induction. Science, 2015, 350: 1332–1338

    Article  MathSciNet  Google Scholar 

  154. Zhou B, Lapedriza A, Xiao J, et al. Learning deep features for scene recognition using places database. In: Proceedings of International Conference on Neural Information Processing Systems, 2014

  155. Parkhi O M, Vedaldi A, Zisserman A, et al. Cats and dogs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012. 3498–3505

  156. Livingstone S R, Russo F A. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): a dynamic, multimodal set of facial and vocal expressions in North American English. Plos One, 2018, 13: e0196391

    Article  Google Scholar 

  157. Krause J, Stark M, Deng J, et al. 3D object representations for fine-grained categorization. In: Proceedings of IEEE International Conference on Computer Vision Workshops, 2013. 554–561

  158. Naik N, Philipoom J, Raskar R, et al. Streetscore-predicting the perceived safety of one million streetscapes. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. 779–785

  159. Laffont P Y, Ren Z, Tao X, et al. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Trans Graph, 2014, 33: 1–11

    Article  Google Scholar 

  160. Yu A, Grauman K. Fine-grained visual comparisons with local learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2014. 192–199

  161. Liu D C, Nocedal J. On the limited memory BFGS method for large scale optimization. Math Programming, 1989, 45: 503–528

    Article  MathSciNet  Google Scholar 

  162. Kingma D P, Ba J. Adam: a method for stochastic optimization. 2014. ArXiv:1412.6980

  163. Deng J, Guo J, Xue N, et al. ArcFace: additive angular margin loss for deep face recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 4690–4699

  164. Huang Y, Wang Y, Tai Y, et al. CurricularFace: adaptive curriculum learning loss for deep face recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 5901–5910

  165. He K, Fan H, Wu Y, et al. Momentum contrast for unsupervised visual representation learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 9729–9738

  166. Donahue J, Simonyan K. Large scale adversarial representation learning. In: Proceedings of International Conference on Neural Information Processing Systems, 2019. 32

  167. Kingma D P, Dhariwal P. Glow: generative flow with invertible 1 × 1 convolutions. In: Proceedings of International Conference on Neural Information Processing Systems, 2018. 10236–10245

  168. Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. In: Proceedings of International Conference on Neural Information Processing Systems, 2020. 6840–6851

  169. Tousi A, Jeong H, Han J, et al. Automatic correction of internal units in generative neural networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 7932–7940

  170. Bau D, Zhou B, Khosla A, et al. Network dissection: quantifying interpretability of deep visual representations. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 6541–6549

  171. Carter S, Armstrong Z, Schubert L, et al. Activation atlas. Distill, 2019. https://distill.pub/2019/activation-atlas

  172. Bau D, Liu S, Wang T, et al. Rewriting a deep generative model. In: Proceedings of European Conference on Computer Vision, 2020. 351–369

  173. Langner O, Dotsch R, Bijlstra G, et al. Presentation and validation of the Radboud Faces Database. Cognition Emotion, 2010, 24: 1377–1388

  174. Ramesh A, Choi Y, LeCun Y. A spectral regularizer for unsupervised disentanglement. 2018. ArXiv:1812.01161

  175. Chen X, Duan Y, Houthooft R, et al. InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In: Proceedings of International Conference on Neural Information Processing Systems, 2016. 2172–2180

  176. Peebles W, Peebles J, Zhu J Y, et al. The Hessian penalty: a weak prior for unsupervised disentanglement. In: Proceedings of European Conference on Computer Vision, 2020. 581–597

  177. Zhu X, Xu C, Tao D. Learning disentangled representations with latent variation predictability. In: Proceedings of European Conference on Computer Vision, 2020. 684–700

  178. Zhu X, Xu C, Tao D. Where and what? Examining interpretable disentangled representations. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 5861–5870

  179. Wei Y, Shi Y, Liu X, et al. Orthogonal Jacobian regularization for unsupervised disentanglement in image generation. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6721–6730

  180. He Z, Kan M, Shan S. EigenGAN: layer-wise eigen-learning for GANs. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 14408–14417

  181. Jahanian A, Chai L, Isola P. On the “steerability” of generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2020

  182. Zhu J, Shen Y, Xu Y, et al. Region-based semantic factorization in GANs. 2022. ArXiv:2202.09649

  183. Wang B, Ponce C R. A geometric analysis of deep generative image models and its applications. In: Proceedings of International Conference on Learning Representations, 2021

  184. Tzelepis C, Tzimiropoulos G, Patras I. WarpedGANSpace: finding non-linear RBF paths in GAN latent space. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6393–6402

  185. Wang X, Yu K, Dong C, et al. Deep network interpolation for continuous imagery effect transition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 1692–1701

  186. Selvaraju R R, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 618–626

  187. Pan X, Dai B, Liu Z, et al. Do 2D GANs know 3D shape? Unsupervised 3D shape reconstruction from 2D image GANs. 2020. ArXiv:2011.00844

  188. Zhang J, Chen X, Cai Z, et al. Unsupervised 3D shape completion through GAN inversion. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 1768–1777

  189. Kingma D P, Welling M. Auto-encoding variational Bayes. 2013. ArXiv:1312.6114

  190. van den Oord A, Kalchbrenner N, Kavukcuoglu K. Pixel recurrent neural networks. In: Proceedings of International Conference on Machine Learning, 2016. 1747–1756

  191. Ramesh A, Dhariwal P, Nichol A, et al. Hierarchical text-conditional image generation with CLIP latents. 2022. ArXiv:2204.06125

  192. Saharia C, Chan W, Saxena S, et al. Photorealistic text-to-image diffusion models with deep language understanding. 2022. ArXiv:2205.11487

  193. Zhang D, Han J, Cheng G, et al. Weakly supervised object localization and detection: a survey. IEEE Trans Pattern Anal Mach Intell, 2021, 44: 5866–5885

  194. Han J, Zhang D, Cheng G, et al. Advanced deep-learning techniques for salient and category-specific object detection: a survey. IEEE Signal Process Mag, 2018, 35: 84–100

  195. Zhang D, Tian H, Han J. Few-cost salient object detection with adversarial-paced learning. In: Proceedings of International Conference on Neural Information Processing Systems, 2020. 33: 12236–12247

  196. Frid-Adar M, Klang E, Amitai M, et al. Synthetic data augmentation using GAN for improved liver lesion classification. In: Proceedings of International Symposium on Biomedical Imaging, 2018. 289–293

  197. Huang S W, Lin C T, Chen S P, et al. AugGAN: cross domain adaptation with GAN-based data augmentation. In: Proceedings of European Conference on Computer Vision, 2018. 718–731

  198. Zhang Y, Ling H, Gao J, et al. DatasetGAN: efficient labeled data factory with minimal human effort. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 10145–10155

  199. Han M, Zheng H, Wang C, et al. Leveraging GAN priors for few-shot part segmentation. In: Proceedings of ACM International Conference on Multimedia, 2022. 1339–1347

  200. Schlegl T, Seeböck P, Waldstein S M, et al. f-AnoGAN: fast unsupervised anomaly detection with generative adversarial networks. Med Image Anal, 2019, 54: 30–44

  201. Dunn I, Pouget H, Melham T, et al. Adaptive generation of unrestricted adversarial inputs. 2019. ArXiv:1905.02463

  202. Wang X, He K, Hopcroft J E. AT-GAN: a generative attack model for adversarial transferring on generative adversarial nets. 2019. ArXiv:1904.07793

  203. Ojha U, Li Y, Lu J, et al. Few-shot image generation via cross-domain correspondence. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 10743–10752

  204. Gu J, Liu L, Wang P, et al. StyleNeRF: a style-based 3D-aware generator for high-resolution image synthesis. In: Proceedings of International Conference on Learning Representations, 2022

  205. He J, Shi W, Chen K, et al. GCFSR: a generative and controllable face super resolution method without facial and GAN priors. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 1889–1898

  206. Li X, Chen C, Lin X, et al. From face to natural image: learning real degradation for blind image super-resolution. In: Proceedings of European Conference on Computer Vision, 2022

  207. Li B, Liu X, Hu P, et al. All-in-one image restoration for unknown corruption. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 17452–17462

  208. Lyu Z, Xu X, Yang C, et al. Accelerating diffusion models via early stop of the diffusion process. 2022. ArXiv:2205.12524

  209. Grover A, Dhar M, Ermon S. Flow-GAN: combining maximum likelihood and adversarial learning in generative models. In: Proceedings of AAAI Conference on Artificial Intelligence, 2018

Acknowledgements

This work was supported by National Natural Science Foundation of China (Grant Nos. U19A2073, 62006064), Hong Kong RGC RIF (Grant No. R5001-18), and 2020 Heilongjiang Provincial Natural Science Foundation Joint Guidance Project (Grant No. LH2020C001).

Author information

Corresponding author

Correspondence to Wangmeng Zuo.

About this article

Cite this article

Liu, M., Wei, Y., Wu, X. et al. Survey on leveraging pre-trained generative adversarial networks for image editing and restoration. Sci. China Inf. Sci. 66, 151101 (2023). https://doi.org/10.1007/s11432-022-3679-0

  • DOI: https://doi.org/10.1007/s11432-022-3679-0
