Conditional image synthesis with auxiliary classifier GANs

Authors: Augustus Odena, Christopher Olah, Jonathon Shlens

ICML'17: Proceedings of the 34th International Conference on Machine Learning - Volume 70

Pages 2642 - 2651

Published: 06 August 2017

Abstract

In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128 x 128 samples are more than twice as discriminable as artificially resized 32 x 32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.
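
The label-conditioned GAN variant mentioned in the abstract is usually described with two log-likelihood terms: L_S for the source of an image (real vs. generated) and L_C for its class label. The discriminator is trained to maximize L_S + L_C and the generator to maximize L_C - L_S. The abstract does not spell this out, so the following is only a minimal sketch under that assumption, not the authors' implementation. The function names and the `(source_logit, class_logits, label)` tuple layout are hypothetical, and only the Python standard library is used.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def acgan_objectives(real, fake):
    """Return (discriminator objective, generator objective), both to be maximized.

    real, fake: lists of (source_logit, class_logits, label) tuples, one per image
    (a hypothetical layout). A positive source_logit means "looks real".
    """
    # L_S: log-likelihood of the correct source (real vs. generated).
    l_s = mean(log_sigmoid(s) for s, _, _ in real) \
        + mean(log_sigmoid(-s) for s, _, _ in fake)
    # L_C: log-likelihood of the correct class, on real and generated images alike.
    l_c = mean(log_softmax(c)[y] for _, c, y in real) \
        + mean(log_softmax(c)[y] for _, c, y in fake)
    return l_s + l_c, l_c - l_s


# Toy discriminator outputs for one real and one generated image.
real = [(2.0, [3.0, 0.1], 0)]
fake = [(-1.0, [0.2, 1.5], 1)]
d_obj, g_obj = acgan_objectives(real, fake)
print(f"discriminator objective: {d_obj:.4f}, generator objective: {g_obj:.4f}")
```

Both networks share the class term L_C, so the generator is rewarded for producing images the auxiliary classifier can label correctly, while the source term L_S is what the two networks contest.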





Published In

ICML'17: Proceedings of the 34th International Conference on Machine Learning - Volume 70, August 2017, 4208 pages

Publisher: JMLR.org
