A Cost-Effective Method for Improving and Re-purposing Large, Pre-trained GANs by Fine-Tuning Their Class-Embeddings

  • Conference paper
  • First Online:
Computer Vision – ACCV 2020 (ACCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12625)

Abstract

Large, pre-trained generative models have become increasingly popular and useful to both the research and wider communities. Specifically, BigGAN, a class-conditional Generative Adversarial Network trained on ImageNet, achieved excellent, state-of-the-art capability in generating realistic photos. However, fine-tuning or training BigGANs from scratch is practically impossible for most researchers and engineers because (1) GAN training is often unstable and suffers from mode collapse; and (2) the training requires a significant amount of computation: 256 Google TPUs for 2 days or 8 × V100 GPUs for 15 days. Importantly, many pre-trained generative models in both NLP and image domains have been found to contain biases that are harmful to society. Thus, we need computationally feasible methods for modifying and re-purposing these huge, pre-trained models for downstream tasks. In this paper, we propose a cost-effective optimization method for improving and re-purposing BigGANs by fine-tuning only the class-embedding layer. We show the effectiveness of our model-editing approach in three tasks: (1) significantly improving the realism and diversity of samples for complete mode-collapse classes; (2) re-purposing ImageNet BigGANs to generate images for Places365; and (3) de-biasing or improving the sample diversity of selected ImageNet classes.
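
The core operation described above, optimizing only the class-embedding layer while keeping the rest of the pre-trained generator frozen, can be sketched in a few lines of PyTorch. The sketch below is illustrative only and is not the authors' released implementation (see the code link in the Notes below); the attribute name `generator.shared` for the class-embedding table, the forward signature `generator(z, class_embedding)`, and the placeholder objective `loss_fn` are assumptions modeled on common BigGAN-PyTorch ports.

```python
import torch

def finetune_class_embedding(generator, class_idx, loss_fn,
                             steps=200, lr=0.01, batch_size=8,
                             z_dim=120, device="cuda"):
    """Optimize a single class embedding of a frozen, pre-trained generator.

    Assumptions (not from the paper): `generator.shared` is an nn.Embedding
    holding class embeddings, and the generator's forward pass accepts a
    class-embedding vector directly, as in common BigGAN-PyTorch ports.
    """
    generator.eval()
    # Freeze every generator parameter; only the embedding below is trained.
    for p in generator.parameters():
        p.requires_grad_(False)

    # Copy the embedding row for the target class and make it trainable.
    embedding = generator.shared.weight[class_idx].detach().clone().to(device)
    embedding.requires_grad_(True)
    optimizer = torch.optim.Adam([embedding], lr=lr)

    for _ in range(steps):
        z = torch.randn(batch_size, z_dim, device=device)
        # Feed the trainable embedding in place of the usual class lookup.
        images = generator(z, embedding.unsqueeze(0).expand(batch_size, -1))
        loss = loss_fn(images)  # placeholder realism/diversity objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return embedding.detach()
```

Because only a single embedding vector is updated rather than the full generator, such a procedure fits comfortably on a single GPU, which is the cost argument made in the abstract.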

Notes

  1. Code for reproducibility is available at https://github.com/qilimk/biggan-am.

Author information

Corresponding author

Correspondence to Qi Li.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 75100 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Q., Mai, L., Alcorn, M.A., Nguyen, A. (2021). A Cost-Effective Method for Improving and Re-purposing Large, Pre-trained GANs by Fine-Tuning Their Class-Embeddings. In: Ishikawa, H., Liu, C.-L., Pajdla, T., Shi, J. (eds.) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol. 12625. Springer, Cham. https://doi.org/10.1007/978-3-030-69538-5_32

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-69538-5_32

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69537-8

  • Online ISBN: 978-3-030-69538-5

  • eBook Packages: Computer Science, Computer Science (R0)
