DOI: 10.1145/3374587.3374640
research-article

Bi-directional Generation between Attributes and Images

Published: 04 March 2020

Abstract

This paper investigates the problem of generating images from visual attributes and vice versa. Although image recognition is widely studied, bi-directional generation between attributes and images is rarely explored, both because a good bidirectional generative model is hard to learn and because the two modalities have different structures. To address this problem, this paper proposes the bidirectional generative model (BGM), which is based on a variant of variational auto-encoders (VAEs). Attributes in the BGM are represented by attribute functions, which directly ground the meaning of attributes in visual representations and allow the BGM to generate images and attributes bi-directionally. The BGM is applied to the 3D Chairs dataset to verify its validity, achieving 85.2% accuracy in attribute inference and 81.7% accuracy in image reconstruction. These experimental results demonstrate the effectiveness of the BGM.
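
For readers unfamiliar with the VAE family the BGM builds on, the following is a minimal sketch of the two ingredients of a standard VAE objective: the reparameterized latent sample and the closed-form KL term. It is a generic illustration, not the authors' BGM; the abstract does not specify the variant used or the form of the attribute functions. All function names are hypothetical, and the squared-error reconstruction term assumes a Gaussian likelihood with fixed variance (up to constants).

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), so z is a
    deterministic function of (mu, log_var) and the noise eps."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def vae_loss(x, x_recon, mu, log_var):
    """Reconstruction error plus KL regularizer (assumed Gaussian
    likelihood with fixed variance; constants dropped)."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + kl_to_standard_normal(mu, log_var)
```

In a full model, `mu` and `log_var` would come from an encoder network and `x_recon` from a decoder applied to the sampled `z`; both networks are omitted here.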

Published In

CSAI '19: Proceedings of the 2019 3rd International Conference on Computer Science and Artificial Intelligence
December 2019
370 pages
ISBN:9781450376273
DOI:10.1145/3374587

In-Cooperation

  • Shenzhen University

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Variational Auto-encoders
  2. attribute functions
  3. bi-directional generation
  4. image reconstruction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CSAI2019
