Network Bending: Expressive Manipulation of Deep Generative Models

Terence Broad, Frederic Fol Leymarie & Mick Grierson

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12693)

Included in the following conference series:

International Conference on Computational Intelligence in Music, Sound, Art and Design (Part of EvoStar)

Abstract

We introduce network bending, a new framework for manipulating and interacting with deep generative models. We present a comprehensive set of deterministic transformations that can be inserted as distinct layers into the computational graph of a trained generative neural network and applied during inference. In addition, we present a novel algorithm for analysing the deep generative model and clustering features based on their spatial activation maps. Features are thereby grouped by spatial similarity in an unsupervised fashion, which enables meaningful manipulation of sets of features that correspond to a broad array of semantically significant aspects of the generated images. We outline this framework and demonstrate our results on state-of-the-art deep generative models trained on several image datasets. We show how it allows direct manipulation of semantically meaningful aspects of the generative process and supports a broad range of expressive outcomes.
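
The two ideas in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the toy "generator" (`expand`, `merge`), the helper names (`bend`, `run`, `cluster_channels`) and the use of plain k-means on raw flattened activation maps are all assumptions made for illustration, and the clustering algorithm in the paper may differ. The sketch shows a deterministic transform applied to a chosen set of feature maps between two layers at inference time, and channels grouped by the spatial similarity of their activation maps.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

FeatureMap = List[List[float]]   # one channel: rows x cols
Activations = List[FeatureMap]   # all channels output by one layer
Transform = Callable[[FeatureMap], FeatureMap]
Bends = Dict[int, List[Tuple[Sequence[int], Transform]]]


def scale(factor: float) -> Transform:
    """Build a transform that multiplies every activation by `factor`."""
    return lambda fm: [[v * factor for v in row] for row in fm]


def flip_horizontal(fm: FeatureMap) -> FeatureMap:
    """Mirror a feature map left to right."""
    return [row[::-1] for row in fm]


def bend(acts: Activations, channels: Sequence[int],
         transform: Transform) -> Activations:
    """Apply `transform` to the selected channels; pass the rest through."""
    chosen = set(channels)
    return [transform(fm) if i in chosen else fm for i, fm in enumerate(acts)]


def run(layers: Sequence[Callable], x, bends: Optional[Bends] = None):
    """Run layers in order, applying any bends registered after layer i."""
    bends = bends or {}
    for i, layer in enumerate(layers):
        x = layer(x)
        for channels, transform in bends.get(i, []):
            x = bend(x, channels, transform)
    return x


def cluster_channels(acts: Activations, k: int, iters: int = 10) -> List[int]:
    """Group channels with Lloyd's k-means on flattened activation maps.

    Assumption: raw maps are clustered directly, with the first k maps as
    initial centroids so that the sketch is deterministic.
    """
    points = [[v for row in fm for v in row] for fm in acts]
    centroids = [list(p) for p in points[:k]]

    def nearest(p: List[float]) -> int:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        return dists.index(min(dists))

    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical two-layer "generator": expand to two channels, then sum them.
def expand(x: FeatureMap) -> Activations:
    return [x, [[v + 1.0 for v in row] for row in x]]


def merge(acts: Activations) -> FeatureMap:
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*acts)]


layers = [expand, merge]
x = [[1.0, 2.0], [3.0, 4.0]]
plain = run(layers, x)
bent = run(layers, x, bends={0: [([1], flip_horizontal)]})

# Four toy activation maps: two active top-left, two active bottom-right.
maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]],
        [[0.9, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.8]]]
groups = cluster_channels(maps, k=2)
```

In this sketch the channel indices returned for one cluster could be passed as the `channels` argument of a bend, so that a single transform acts on a whole group of spatially similar features at once.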

Notes

1. Our implementation and the datasets we have used for training the clustering models are publicly available and can be found at: https://github.com/terrybroad/network-bending.

Author information

Authors and Affiliations

Goldsmiths, University of London, London, UK
Terence Broad & Frederic Fol Leymarie
Creative Computing Institute, University of the Arts London, London, UK
Terence Broad & Mick Grierson

Corresponding author

Correspondence to Terence Broad.

Editor information

Editors and Affiliations

University of A Coruña, A Coruña, Spain
Juan Romero
University of Coimbra, Coimbra, Portugal
Tiago Martins
University of A Coruña, A Coruña, Spain
Nereida Rodríguez-Fernández

About this paper

Cite this paper

Broad, T., Leymarie, F.F., Grierson, M. (2021). Network Bending: Expressive Manipulation of Deep Generative Models. In: Romero, J., Martins, T., Rodríguez-Fernández, N. (eds) Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2021. Lecture Notes in Computer Science, vol 12693. Springer, Cham. https://doi.org/10.1007/978-3-030-72914-1_2

DOI: https://doi.org/10.1007/978-3-030-72914-1_2
Published: 02 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72913-4
Online ISBN: 978-3-030-72914-1
eBook Packages: Computer Science, Computer Science (R0)
