Abstract
Generative adversarial networks (GANs) have drawn enormous attention due to their simple yet effective training mechanism and superior image generation quality. With the ability to generate photorealistic high-resolution (e.g., 1024 × 1024) images, recent GAN models have greatly narrowed the gap between generated images and real ones. Consequently, many recent studies have shown growing interest in taking advantage of pre-trained GAN models by exploiting their well-disentangled latent spaces and learned priors. In this study, we briefly review recent progress on leveraging pre-trained large-scale GAN models from three aspects: (1) the training of large-scale generative adversarial networks, (2) exploring and understanding pre-trained GAN models, and (3) leveraging these models for subsequent tasks such as image restoration and editing.
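To make the inversion-based workflow surveyed here concrete, the following minimal sketch recovers a latent code z such that G(z) reconstructs a target by gradient descent on a pixel-wise loss, the core idea behind optimization-based GAN inversion. This is an illustrative toy, not any particular method from the literature: the "generator" is a fixed linear map standing in for a pre-trained GAN, and the names (`generator`, `invert`) and hyperparameters are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a pre-trained generator: a fixed linear map from an
# 8-dim latent space to a 32-dim "image" space. A real GAN generator is a
# deep network, but the inversion loop below is structurally identical.
W = rng.standard_normal((32, 8))

def generator(z):
    return W @ z

def invert(target, steps=500, lr=0.01):
    """Recover a latent code by minimizing ||G(z) - target||^2."""
    z = np.zeros(8)
    for _ in range(steps):
        residual = generator(z) - target   # G(z) - x
        grad = 2.0 * W.T @ residual        # analytic gradient of the loss
        z -= lr * grad
    return z

# Sanity check: invert an image the generator itself produced.
z_true = rng.standard_normal(8)
x = generator(z_true)
z_hat = invert(x)
print(np.allclose(generator(z_hat), x, atol=1e-3))
```

With a deep generator the analytic gradient is replaced by automatic differentiation, and in practice the loss is augmented with perceptual and latent-regularization terms, as discussed in the inversion works reviewed below.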
References
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. In: Proceedings of International Conference on Neural Information Processing Systems, 2014
Denton E L, Chintala S, Fergus R, et al. Deep generative image models using a Laplacian pyramid of adversarial networks. In: Proceedings of International Conference on Neural Information Processing Systems, 2015
Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2016
Zhang H, Xu T, Li H, et al. StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 5907–5915
Zhang H, Goodfellow I, Metaxas D, et al. Self-attention generative adversarial networks. In: Proceedings of International Conference on Machine Learning, 2019. 7354–7363
Mao X, Li Q, Xie H, et al. Least squares generative adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 2794–2802
Berthelot D, Schumm T, Metz L. BEGAN: boundary equilibrium generative adversarial networks. 2017. ArXiv:1703.10717
Jolicoeur-Martineau A. The relativistic discriminator: a key element missing from standard GAN. In: Proceedings of International Conference on Learning Representations, 2019
Arjovsky M, Chintala S, Bottou L. Wasserstein generative adversarial networks. In: Proceedings of International Conference on Machine Learning, 2017. 214–223
Gulrajani I, Ahmed F, Arjovsky M, et al. Improved training of Wasserstein GANs. In: Proceedings of International Conference on Neural Information Processing Systems, 2017. 5769–5779
Miyato T, Kataoka T, Koyama M, et al. Spectral normalization for generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2018
Karras T, Aila T, Laine S, et al. Progressive growing of GANs for improved quality, stability, and variation. In: Proceedings of International Conference on Learning Representations, 2018
Brock A, Donahue J, Simonyan K. Large scale GAN training for high fidelity natural image synthesis. In: Proceedings of International Conference on Learning Representations, 2019
Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 4401–4410
Karras T, Laine S, Aittala M, et al. Analyzing and improving the image quality of StyleGAN. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 8110–8119
Karras T, Aittala M, Hellsten J, et al. Training generative adversarial networks with limited data. 2020. ArXiv:2006.06676
Karras T, Aittala M, Laine S, et al. Alias-free generative adversarial networks. In: Proceedings of International Conference on Neural Information Processing Systems, 2021
Isola P, Zhu J Y, Zhou T, et al. Image-to-image translation with conditional adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 1125–1134
Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 2223–2232
Choi Y, Choi M, Kim M, et al. StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 8789–8797
He Z, Zuo W, Kan M, et al. AttGAN: facial attribute editing by only changing what you want. IEEE Trans Image Process, 2019, 28: 5464–5478
Liu M, Ding Y, Xia M, et al. STGAN: a unified selective transfer network for arbitrary image attribute editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 3673–3682
Choi Y, Uh Y, Yoo J, et al. StarGAN v2: diverse image synthesis for multiple domains. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 8188–8197
Ledig C, Theis L, Huszár F, et al. Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 4681–4690
Wang X, Yu K, Wu S, et al. ESRGAN: enhanced super-resolution generative adversarial networks. In: Proceedings of European Conference on Computer Vision, 2018
Wang X, Xie L, Dong C, et al. Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. In: Proceedings of IEEE International Conference on Computer Vision Workshops, 2021. 1905–1914
Zhang K, Liang J, van Gool L, et al. Designing a practical degradation model for deep blind image super-resolution. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 4791–4800
Kupyn O, Budzan V, Mykhailych M, et al. DeblurGAN: blind motion deblurring using conditional adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 8183–8192
Kupyn O, Martyniuk T, Wu J, et al. DeblurGAN-v2: deblurring (orders-of-magnitude) faster and better. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 8878–8887
Zheng S, Zhu Z, Zhang X, et al. Distribution-induced bidirectional generative adversarial network for graph representation learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 7224–7233
Zhu H, Peng X, Chandrasekhar V, et al. DehazeGAN: when image dehazing meets differential programming. In: Proceedings of International Joint Conference on Artificial Intelligence, 2018. 1234–1240
Zhu H, Cheng Y, Peng X, et al. Single-image dehazing via compositional adversarial network. IEEE Trans Cybern, 2019, 51: 829–838
Mehta A, Sinha H, Narang P, et al. HiDeGAN: a hyperspectral-guided image dehazing GAN. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. 212–213
Dong Y, Liu Y, Zhang H, et al. FD-GAN: generative adversarial networks with fusion-discriminator for single image dehazing. In: Proceedings of AAAI Conference on Artificial Intelligence, 2020. 10729–10736
Liu Z, Luo P, Wang X, et al. Deep learning face attributes in the wild. In: Proceedings of IEEE International Conference on Computer Vision, 2015. 3730–3738
Voynov A, Babenko A. Unsupervised discovery of interpretable directions in the GAN latent space. In: Proceedings of International Conference on Machine Learning, 2020. 9786–9796
Yu F, Seff A, Zhang Y, et al. LSUN: construction of a large-scale image dataset using deep learning with humans in the loop. 2015. ArXiv:1506.03365
Zhu J, Shen Y, Zhao D, et al. In-domain GAN inversion for real image editing. In: Proceedings of European Conference on Computer Vision, 2020. 592–608
Rudin L I, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D-Nonlinear Phenomena, 1992, 60: 259–268
Buades A, Coll B, Morel J M. A non-local algorithm for image denoising. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005. 60–65
Elad M, Aharon M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process, 2006, 15: 3736–3745
Li B, Gou Y, Gu S, et al. You only look yourself: unsupervised and untrained single image dehazing neural network. Int J Comput Vis, 2021, 129: 1754–1767
Shoshan A, Mechrez R, Zelnik-Manor L. Dynamic-Net: tuning the objective without re-training for synthesis tasks. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 3215–3223
Gou Y, Li B, Liu Z, et al. CLEARER: multi-scale neural architecture search for image restoration. In: Proceedings of International Conference on Neural Information Processing Systems, 2020, 33: 17129–17140
Bau D, Zhu J Y, Strobelt H, et al. GAN dissection: visualizing and understanding generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2019
Bau D, Zhu J Y, Wulff J, et al. Seeing what a GAN cannot generate. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 4502–4511
Goetschalckx L, Andonian A, Oliva A, et al. GANalyze: toward visual definitions of cognitive image properties. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 5744–5753
Härkönen E, Hertzmann A, Lehtinen J, et al. GANSpace: discovering interpretable GAN controls. In: Proceedings of International Conference on Neural Information Processing Systems, 2020
Suzuki R, Koyama M, Miyato T, et al. Spatially controllable image synthesis with internal representation collaging. 2018. ArXiv:1811.10153
Bau D, Strobelt H, Peebles W, et al. Semantic photo manipulation with a generative image prior. ACM Trans Graph, 2019, 38: 1–11
Tewari A, Elgharib M, Bharaj G, et al. StyleRig: rigging StyleGAN for 3D control over portrait images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 6142–6151
Abdal R, Zhu P, Mitra N J, et al. StyleFlow: attribute-conditioned exploration of StyleGAN-generated images using conditional continuous normalizing flows. ACM Trans Graph, 2021, 40: 1–21
Menon S, Damian A, Hu S, et al. PULSE: self-supervised photo upsampling via latent space exploration of generative models. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 2437–2445
Richardson E, Alaluf Y, Patashnik O, et al. Encoding in style: a StyleGAN encoder for image-to-image translation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 2287–2296
Chan K C, Wang X, Xu X, et al. GLEAN: generative latent bank for large-factor image super-resolution. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 14245–14254
Wang X, Li Y, Zhang H, et al. Towards real-world blind face restoration with generative facial prior. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 9168–9178
Yang T, Ren P, Xie X, et al. GAN prior embedded network for blind face restoration in the wild. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 672–681
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521: 436–444
Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009. 248–255
Lee C H, Liu Z, Wu L, et al. MaskGAN: towards diverse and interactive facial image manipulation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 5549–5558
Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition. Proc IEEE, 1998, 86: 2278–2324
Netzer Y, Wang T, Coates A, et al. Reading digits in natural images with unsupervised feature learning. In: Proceedings of NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011
Krizhevsky A. Learning multiple layers of features from tiny images. 2009. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf
Liu Z, Yan S, Luo P, et al. Fashion landmark detection in the wild. In: Proceedings of European Conference on Computer Vision, 2016. 229–245
Cordts M, Omran M, Ramos S, et al. The cityscapes dataset for semantic urban scene understanding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. 3213–3223
Shao S, Li Z, Zhang T, et al. Objects365: a large-scale, high-quality dataset for object detection. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 8430–8439
Zhou B, Lapedriza A, Khosla A, et al. Places: a 10 million image database for scene recognition. IEEE Trans Pattern Anal Mach Intell, 2017, 40: 1452–1464
Krasin I, Duerig T, Alldrin N, et al. OpenImages: a public dataset for large-scale multi-label and multi-class image classification. 2017. https://storage.googleapis.com/openimages/web/index.html
Salimans T, Goodfellow I, Zaremba W, et al. Improved techniques for training GANs. In: Proceedings of International Conference on Neural Information Processing Systems, 2016
Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. 2818–2826
Gurumurthy S, Sarvadevabhatla S R K, Babu R V. DeliGAN: generative adversarial networks for diverse and limited data. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 166–174
Che T, Li Y, Jacob A P, et al. Mode regularized generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2017
Zhou Z, Zhang W, Wang J. Inception score, label smoothing, gradient vanishing and −log(D(x)) alternative. 2017. ArXiv:1708.01729
Zhou Z, Cai H, Rong S, et al. Activation maximization generative adversarial nets. In: Proceedings of International Conference on Learning Representations, 2018
Heusel M, Ramsauer H, Unterthiner T, et al. GANs trained by a two time-scale update rule converge to a local nash equilibrium. In: Proceedings of International Conference on Neural Information Processing Systems, 2017
Bonneel N, Rabin J, Peyré G, et al. Sliced and radon Wasserstein barycenters of measures. J Math Imaging Vision, 2015, 51: 22–45
Kolouri S, Nadjahi K, Simsekli U, et al. Generalized sliced Wasserstein distances. 2019. ArXiv:1902.00434
Shmelkov K, Schmid C, Alahari K. How good is my GAN? In: Proceedings of European Conference on Computer Vision, 2018. 213–229
Kynkäänniemi T, Karras T, Laine S, et al. Improved precision and recall metric for assessing generative models. 2019. ArXiv:1904.06991
Khrulkov V, Oseledets I. Geometry score: a method for comparing generative adversarial networks. In: Proceedings of International Conference on Machine Learning, 2018. 2621–2629
Wang Z, Bovik A C, Sheikh H R, et al. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process, 2004, 13: 600–612
Zhang R, Isola P, Efros A A, et al. The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 586–595
Borji A. Pros and cons of GAN evaluation measures. Comput Vision Image Understanding, 2019, 179: 41–65
Wang Z, She Q, Ward T E. Generative adversarial networks in computer vision. ACM Comput Surv, 2022, 54: 1–38
Kang M, Shin J, Park J. StudioGAN: a taxonomy and benchmark of GANs for image synthesis. 2022. ArXiv:2206.09479
Mescheder L, Geiger A, Nowozin S. Which training methods for GANs do actually converge? In: Proceedings of International Conference on Machine Learning, 2018. 3481–3490
Huang X, Belongie S. Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 1501–1510
Tancik M, Srinivasan P, Mildenhall B, et al. Fourier features let networks learn high frequency functions in low dimensional domains. In: Proceedings of International Conference on Neural Information Processing Systems, 2020. 7537–7547
Mirza M, Osindero S. Conditional generative adversarial nets. 2014. ArXiv:1411.1784
Perarnau G, van de Weijer J, Raducanu B, et al. Invertible conditional GANs for image editing. In: Proceedings of NIPS Workshops, 2016
Abdal R, Qin Y, Wonka P. Image2StyleGAN: how to embed images into the StyleGAN latent space? In: Proceedings of IEEE International Conference on Computer Vision, 2019. 4432–4441
Liu Y, Li Q, Sun Z, et al. Style intervention: how to achieve spatial disentanglement with style-based generators? 2020. ArXiv:2011.09699
Wu Z, Lischinski D, Shechtman E. StyleSpace analysis: disentangled controls for StyleGAN image generation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 12863–12872
Xu J, Xu H, Ni B, et al. Hierarchical style-based networks for motion synthesis. In: Proceedings of European Conference on Computer Vision, 2020. 178–194
Zhang L, Bai X, Gao Y. SalS-GAN: spatially-adaptive latent space in StyleGAN for real image embedding. In: Proceedings of ACM International Conference on Multimedia, 2021. 5176–5184
Zhu P, Abdal R, Qin Y, et al. Improved StyleGAN embedding: where are the good latents? 2020. ArXiv:2012.09036
Abdal R, Qin Y, Wonka P. Image2StyleGAN++: how to edit the embedded images? In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 8296–8305
Kang K, Kim S, Cho S. GAN inversion for out-of-range images with geometric transformations. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 13941–13949
Cherepkov A, Voynov A, Babenko A. Navigating the GAN parameter space for semantic image editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 3671–3680
Feng Q, Shah V, Gadde R, et al. Near perfect GAN inversion. 2022. ArXiv:2202.11833
Donahue J, Krähenbühl P, Darrell T. Adversarial feature learning. In: Proceedings of International Conference on Learning Representations, 2017
Dumoulin V, Belghazi I, Poole B, et al. Adversarially learned inference. 2016. ArXiv:1606.00704
Zhu J Y, Krähenbühl P, Shechtman E, et al. Generative visual manipulation on the natural image manifold. In: Proceedings of European Conference on Computer Vision, 2016. 597–613
Creswell A, Bharath A A. Inverting the generator of a generative adversarial network. IEEE Trans Neural Netw Learn Syst, 2019, 30: 1967–1974
Lipton Z C, Tripathi S. Precise recovery of latent vectors from generative adversarial networks. In: Proceedings of International Conference on Learning Representations Workshops, 2017
Shah V, Hegde C. Solving linear inverse problems using GAN priors: an algorithm with provable guarantees. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2018. 4609–4613
Ma F, Ayaz U, Karaman S. Invertibility of convolutional generative networks from partial measurements. In: Proceedings of International Conference on Neural Information Processing Systems, 2018. 9651–9660
Raj A, Li Y, Bresler Y. GAN-based projector for faster recovery with convergence guarantees in linear inverse problems. In: Proceedings of IEEE International Conference on Computer Vision, 2019. 5602–5611
Bau D, Zhu J Y, Wulff J, et al. Inverting layers of a large generator. In: Proceedings of International Conference on Learning Representations Workshops, 2019. 4
Shen Y, Gu J, Tang X, et al. Interpreting the latent space of GANs for semantic face editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 9243–9252
Daras G, Odena A, Zhang H, et al. Your local GAN: designing two dimensional local attention mechanisms for generative models. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 14531–14539
Gu J, Shen Y, Zhou B. Image processing using multi-code GAN prior. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 3012–3021
Anirudh R, Thiagarajan J J, Kailkhura B, et al. MimicGAN: robust projection onto image manifolds with corruption mimicking. Int J Comput Vis, 2020, 128: 2459–2477
Pan X, Zhan X, Dai B, et al. Exploiting deep generative prior for versatile image restoration and manipulation. IEEE Trans Pattern Anal Mach Intell, 2022, 44: 7474–7489
Viazovetskyi Y, Ivashkin V, Kashin E. StyleGAN2 distillation for feed-forward image manipulation. In: Proceedings of European Conference on Computer Vision, 2020. 170–186
Collins E, Bala R, Price B, et al. Editing in style: uncovering the local semantics of GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 5771–5780
Pidhorskyi S, Adjeroh D A, Doretto G. Adversarial latent autoencoders. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 14104–14113
Huh M, Zhang R, Zhu J Y, et al. Transforming and projecting images into class-conditional generative networks. In: Proceedings of European Conference on Computer Vision, 2020. 17–34
Nitzan Y, Bermano A, Li Y, et al. Face identity disentanglement via latent space mapping. ACM Trans Graph, 2020, 39: 1–14
Aberdam A, Simon D, Elad M. When and how can deep generative models be inverted? 2020. ArXiv:2006.15555
Guan S, Tai Y, Ni B, et al. Collaborative learning for faster StyleGAN embedding. 2020. ArXiv:2007.01758
Shen Y, Zhou B. Closed-form factorization of latent semantics in GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 1532–1540
Xu Y, Shen Y, Zhu J, et al. Generative hierarchical features from synthesizing images. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 4432–4442
Tewari A, Elgharib M, Mallikarjun B R, et al. PIE: portrait image embedding for semantic control. ACM Trans Graph, 2020, 39: 1–14
Bartz C, Bethge J, Yang H, et al. One model to reconstruct them all: a novel way to use the stochastic noise in StyleGAN. In: Proceedings of British Machine Vision Conference, 2020
Wang H P, Yu N, Fritz M. Hijack-GAN: unintended-use of pretrained, black-box GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 7872–7881
Zhuang P, Koyejo O O, Schwing A. Enjoy your editing: controllable GANs for image editing via latent space navigation. In: Proceedings of International Conference on Learning Representations, 2021
Alaluf Y, Patashnik O, Cohen-Or D. Only a matter of style: age transformation using a style-based regression model. ACM Trans Graph, 2021, 40: 1–12
Tov O, Alaluf Y, Nitzan Y, et al. Designing an encoder for StyleGAN image manipulation. ACM Trans Graph, 2021, 40: 1–14
Patashnik O, Wu Z, Shechtman E, et al. StyleCLIP: text-driven manipulation of StyleGAN imagery. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 2085–2094
Chai L, Wulff J, Isola P. Using latent space regression to analyze and leverage compositionality in GANs. In: Proceedings of International Conference on Learning Representations, 2021
Chai L, Zhu J Y, Shechtman E, et al. Ensembling with deep generative views. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 14997–15007
Alaluf Y, Patashnik O, Cohen-Or D. ReStyle: a residual-based StyleGAN encoder via iterative refinement. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6711–6720
Wei T, Chen D, Zhou W, et al. E2Style: improve the efficiency and effectiveness of StyleGAN inversion. IEEE Trans Image Process, 2022, 31: 3267–3280
Xu Y, Du Y, Xiao W, et al. From continuity to editability: inverting GANs with consecutive images. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 13910–13918
Wang T, Zhang Y, Fan Y, et al. High-fidelity GAN inversion for image attribute editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 11379–11388
Schwettmann S, Hernandez E, Bau D, et al. Toward a visual concept vocabulary for GAN latent space. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6804–6812
Alaluf Y, Tov O, Mokady R, et al. HyperStyle: StyleGAN inversion with hypernetworks for real image editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 18511–18521
Peebles W, Zhu J Y, Zhang R, et al. GAN-supervised dense visual alignment. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 13470–13481
Dinh T M, Tran A T, Nguyen R, et al. HyperInverter: improving StyleGAN inversion via hypernetwork. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 11389–11398
Alaluf Y, Patashnik O, Wu Z, et al. Third time’s the charm? Image and video editing with StyleGAN3. 2022. ArXiv:2201.13433
Frühstück A, Singh K K, Shechtman E, et al. InsetGAN for full-body image generation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 7723–7732
Wu Y, Yang Y L, Jin X. HairMapper: removing hair from portraits using GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 4227–4236
Parmar G, Li Y, Lu J, et al. Spatially-adaptive multilayer selection for GAN inversion and editing. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 11399–11409
Zhou B, Zhao H, Puig X, et al. Scene parsing through ADE20K dataset. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 633–641
Chen B C, Chen C S, Hsu W H. Cross-age reference coding for age-invariant face recognition and retrieval. In: Proceedings of European Conference on Computer Vision, 2014. 768–783
Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context. In: Proceedings of European Conference on Computer Vision, 2014. 740–755
Wah C, Branson S, Welinder P, et al. The Caltech-UCSD birds-200-2011 dataset. 2011. http://www.vision.caltech.edu/visipedia/CUB-200.html
Anonymous, The Danbooru Community, Branwen G. Danbooru2021: a large-scale crowdsourced and tagged anime illustration dataset. 2021. https://www.gwern.net/Danbooru
Nilsback M E, Zisserman A. Automated flower classification over a large number of classes. In: Proceedings of the 6th Indian Conference on Computer Vision, Graphics & Image Processing, 2008. 722–729
Huang G B, Mattar M, Berg T, et al. Labeled faces in the wild: a database for studying face recognition in unconstrained environments. In: Proceedings of Workshop on Faces in ‘Real-Life’ Images: Detection, Alignment, and Recognition, 2008
Skorokhodov I, Sotnikov G, Elhoseiny M. Aligning latent and image spaces to connect the unconnectable. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 14144–14153
Lake B M, Salakhutdinov R, Tenenbaum J B. Human-level concept learning through probabilistic program induction. Science, 2015, 350: 1332–1338
Zhou B, Lapedriza A, Xiao J, et al. Learning deep features for scene recognition using places database. In: Proceedings of International Conference on Neural Information Processing Systems, 2014
Parkhi O M, Vedaldi A, Zisserman A, et al. Cats and dogs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012. 3498–3505
Livingstone S R, Russo F A. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): a dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE, 2018, 13: e0196391
Krause J, Stark M, Deng J, et al. 3D object representations for fine-grained categorization. In: Proceedings of IEEE International Conference on Computer Vision Workshops, 2013. 554–561
Naik N, Philipoom J, Raskar R, et al. Streetscore: predicting the perceived safety of one million streetscapes. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. 779–785
Laffont P Y, Ren Z, Tao X, et al. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Trans Graph, 2014, 33: 1–11
Yu A, Grauman K. Fine-grained visual comparisons with local learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2014. 192–199
Liu D C, Nocedal J. On the limited memory BFGS method for large scale optimization. Math Programming, 1989, 45: 503–528
Kingma D P, Ba J. Adam: a method for stochastic optimization. 2014. ArXiv:1412.6980
Deng J, Guo J, Xue N, et al. ArcFace: additive angular margin loss for deep face recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 4690–4699
Huang Y, Wang Y, Tai Y, et al. CurricularFace: adaptive curriculum learning loss for deep face recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 5901–5910
He K, Fan H, Wu Y, et al. Momentum contrast for unsupervised visual representation learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2020. 9729–9738
Donahue J, Simonyan K. Large scale adversarial representation learning. In: Proceedings of International Conference on Neural Information Processing Systems, 2019. 32
Kingma D P, Dhariwal P. Glow: generative flow with invertible 1 × 1 convolutions. In: Proceedings of International Conference on Neural Information Processing Systems, 2018. 10236–10245
Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. In: Proceedings of International Conference on Neural Information Processing Systems, 2020. 6840–6851
Tousi A, Jeong H, Han J, et al. Automatic correction of internal units in generative neural networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 7932–7940
Bau D, Zhou B, Khosla A, et al. Network dissection: quantifying interpretability of deep visual representations. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017. 6541–6549
Carter S, Armstrong Z, Schubert L, et al. Activation atlas. Distill, 2019. https://distill.pub/2019/activation-atlas
Bau D, Liu S, Wang T, et al. Rewriting a deep generative model. In: Proceedings of European Conference on Computer Vision, 2020. 351–369
Langner O, Dotsch R, Bijlstra G, et al. Presentation and validation of the Radboud Faces Database. Cognition Emotion, 2010, 24: 1377–1388
Ramesh A, Choi Y, LeCun Y. A spectral regularizer for unsupervised disentanglement. 2018. ArXiv:1812.01161
Chen X, Duan Y, Houthooft R, et al. InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In: Proceedings of International Conference on Neural Information Processing Systems, 2016. 2172–2180
Peebles W, Peebles J, Zhu J Y, et al. The hessian penalty: a weak prior for unsupervised disentanglement. In: Proceedings of European Conference on Computer Vision, 2020. 581–597
Zhu X, Xu C, Tao D. Learning disentangled representations with latent variation predictability. In: Proceedings of European Conference on Computer Vision, 2020. 684–700
Zhu X, Xu C, Tao D. Where and what? Examining interpretable disentangled representations. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 5861–5870
Wei Y, Shi Y, Liu X, et al. Orthogonal jacobian regularization for unsupervised disentanglement in image generation. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6721–6730
He Z, Kan M, Shan S. EigenGAN: layer-wise eigen-learning for GANs. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 14408–14417
Jahanian A, Chai L, Isola P. On the “steerability” of generative adversarial networks. In: Proceedings of International Conference on Learning Representations, 2020
Zhu J, Shen Y, Xu Y, et al. Region-based semantic factorization in GANs. 2022. ArXiv:2202.09649
Wang B, Ponce C R. A geometric analysis of deep generative image models and its applications. In: Proceedings of International Conference on Learning Representations, 2021
Tzelepis C, Tzimiropoulos G, Patras I. WarpedGANSpace: finding non-linear RBF paths in GAN latent space. In: Proceedings of IEEE International Conference on Computer Vision, 2021. 6393–6402
Wang X, Yu K, Dong C, et al. Deep network interpolation for continuous imagery effect transition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2019. 1692–1701
Selvaraju R R, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 618–626
Pan X, Dai B, Liu Z, et al. Do 2D GANs know 3D shape? Unsupervised 3D shape reconstruction from 2D image GANs. 2020. ArXiv:2011.00844
Zhang J, Chen X, Cai Z, et al. Unsupervised 3D shape completion through GAN inversion. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 1768–1777
Kingma D P, Welling M. Auto-encoding variational Bayes. 2013. ArXiv:1312.6114
van den Oord A, Kalchbrenner N, Kavukcuoglu K. Pixel recurrent neural networks. In: Proceedings of International Conference on Machine Learning, 2016. 1747–1756
Ramesh A, Dhariwal P, Nichol A, et al. Hierarchical text-conditional image generation with CLIP latents. 2022. ArXiv:2204.06125
Saharia C, Chan W, Saxena S, et al. Photorealistic text-to-image diffusion models with deep language understanding. 2022. ArXiv:2205.11487
Zhang D, Han J, Cheng G, et al. Weakly supervised object localization and detection: a survey. IEEE Trans Pattern Anal Mach Intell, 2021, 44: 5866–5885
Han J, Zhang D, Cheng G, et al. Advanced deep-learning techniques for salient and category-specific object detection: a survey. IEEE Signal Process Mag, 2018, 35: 84–100
Zhang D, Tian H, Han J. Few-cost salient object detection with adversarial-paced learning. In: Proceedings of International Conference on Neural Information Processing Systems, 2020. 33: 12236–12247
Frid-Adar M, Klang E, Amitai M, et al. Synthetic data augmentation using GAN for improved liver lesion classification. In: Proceedings of International Symposium on Biomedical Imaging, 2018. 289–293
Huang S W, Lin C T, Chen S P, et al. AugGAN: cross domain adaptation with GAN-based data augmentation. In: Proceedings of European Conference on Computer Vision, 2018. 718–731
Zhang Y, Ling H, Gao J, et al. DatasetGAN: efficient labeled data factory with minimal human effort. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 10145–10155
Han M, Zheng H, Wang C, et al. Leveraging GAN priors for few-shot part segmentation. In: Proceedings of ACM International Conference on Multimedia, 2022. 1339–1347
Schlegl T, Seeböck P, Waldstein S M, et al. f-AnoGAN: fast unsupervised anomaly detection with generative adversarial networks. Med Image Anal, 2019, 54: 30–44
Dunn I, Pouget H, Melham T, et al. Adaptive generation of unrestricted adversarial inputs. 2019. ArXiv:1905.02463
Wang X, He K, Hopcroft J E. AT-GAN: a generative attack model for adversarial transferring on generative adversarial nets. 2019. ArXiv:1904.07793
Ojha U, Li Y, Lu J, et al. Few-shot image generation via cross-domain correspondence. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2021. 10743–10752
Gu J, Liu L, Wang P, et al. StyleNeRF: a style-based 3D aware generator for high-resolution image synthesis. In: Proceedings of International Conference on Learning Representations, 2022
He J, Shi W, Chen K, et al. GCFSR: a generative and controllable face super resolution method without facial and GAN priors. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 1889–1898
Li X, Chen C, Lin X, et al. From face to natural image: learning real degradation for blind image super-resolution. In: Proceedings of European Conference on Computer Vision, 2022
Li B, Liu X, Hu P, et al. All-in-one image restoration for unknown corruption. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2022. 17452–17462
Lyu Z, Xu X, Yang C, et al. Accelerating diffusion models via early stop of the diffusion process. 2022. ArXiv:2205.12524
Grover A, Dhar M, Ermon S. Flow-GAN: combining maximum likelihood and adversarial learning in generative models. In: Proceedings of AAAI Conference on Artificial Intelligence, 2018
Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant Nos. U19A2073, 62006064), Hong Kong RGC RIF (Grant No. R5001-18), and 2020 Heilongjiang Provincial Natural Science Foundation Joint Guidance Project (Grant No. LH2020C001).
Cite this article
Liu, M., Wei, Y., Wu, X. et al. Survey on leveraging pre-trained generative adversarial networks for image editing and restoration. Sci. China Inf. Sci. 66, 151101 (2023). https://doi.org/10.1007/s11432-022-3679-0