Article

Winner-take-all autoencoders

Authors: Alireza Makhzani, Brendan Frey

NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2

Pages 2791 - 2799

Published: 07 December 2015

Abstract

In this paper, we propose a winner-take-all method for learning hierarchical sparse representations in an unsupervised fashion. We first introduce fully-connected winner-take-all autoencoders, which use mini-batch statistics to directly enforce a lifetime sparsity in the activations of the hidden units. We then propose the convolutional winner-take-all autoencoder, which combines the benefits of convolutional architectures and autoencoders for learning shift-invariant sparse representations. We describe a way to train convolutional autoencoders layer by layer, where in addition to lifetime sparsity, a spatial sparsity within each feature map is achieved using winner-take-all activation functions. We show that winner-take-all autoencoders can be used to learn deep sparse representations from the MNIST, CIFAR-10, ImageNet, Street View House Numbers and Toronto Face datasets, and achieve competitive classification performance.
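The two sparsity constraints named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not spell out the exact rules, so the following are assumptions: lifetime sparsity is taken to mean that each hidden unit keeps only its largest activations across the mini-batch, and spatial sparsity is taken to mean that each feature map keeps only its single largest activation. The function names `lifetime_sparsity` and `spatial_sparsity` and the `rate` parameter are hypothetical, chosen for illustration only.

```python
import numpy as np


def lifetime_sparsity(h, rate):
    """Assumed lifetime-sparsity rule (illustrative, not the authors' code).

    h:    array of shape (batch, units) holding hidden activations.
    rate: fraction of the mini-batch for which each unit may stay active.
    For every hidden unit (column), keep its top `rate` fraction of
    activations across the mini-batch and zero the rest.
    """
    batch = h.shape[0]
    k = max(1, int(np.ceil(rate * batch)))
    # The k-th largest value in each column acts as that unit's threshold.
    thresh = np.partition(h, batch - k, axis=0)[batch - k]
    # Ties at the threshold are all kept, so a unit may keep more than k.
    return np.where(h >= thresh, h, 0.0)


def spatial_sparsity(fmaps):
    """Assumed spatial-sparsity rule (illustrative, not the authors' code).

    fmaps: array of shape (batch, channels, height, width).
    Within each feature map, keep only the single largest activation
    and set every other position to zero.
    """
    b, c = fmaps.shape[:2]
    flat = fmaps.reshape(b, c, -1)
    idx = flat.argmax(axis=2)[..., None]  # winner position per map
    out = np.zeros_like(flat)
    np.put_along_axis(out, idx, np.take_along_axis(flat, idx, axis=2), axis=2)
    return out.reshape(fmaps.shape)
```

In this sketch the masks depend only on the current mini-batch, which is one way to read "mini-batch statistics"; a training loop would apply them to the encoder output before decoding.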




Information

Published in: NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2 (December 2015, 3626 pages)

Publisher: MIT Press, Cambridge, MA, United States


Cited By

Yeşil Ç, Korkmaz E (2023). A novel cellular automata-based approach for generating convolutional filters. Machine Vision and Applications 34:3. Online publication date: 27-Mar-2023.
https://dl.acm.org/doi/10.1007/s00138-023-01389-z

Yang Q, Mao J, Wang Z, Hai (2021). Dynamic Regularization on Activation Sparsity for Neural Network Efficiency Improvement. ACM Journal on Emerging Technologies in Computing Systems 17:4 (1-16). Online publication date: 30-Jun-2021.
https://dl.acm.org/doi/10.1145/3447776

Zhu H, Ren Y, Hu J (2021). Establishing A Hybrid Pieceswise Linear Model for Air Quality Prediction Based Missingness Challenges. 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (1705-1710). Online publication date: 17-Oct-2021.
https://dl.acm.org/doi/10.1109/SMC52423.2021.9658931

Fox I, Wiens J (2019). Advocacy learning. Proceedings of the 28th International Joint Conference on Artificial Intelligence (2315-2321). Online publication date: 10-Aug-2019.
https://dl.acm.org/doi/10.5555/3367243.3367361

Majumdar A, Dey L, Chaudhury S, Krishnapuram R, Singla P, Roy R (2019). Deeply Coupled Graph Structured Autoencoder for Domain Adaptation. Proceedings of the ACM India Joint International Conference on Data Science and Management of Data (94-102). Online publication date: 3-Jan-2019.
https://dl.acm.org/doi/10.1145/3297001.3297013

Pang G, Cao L, Chen L, Liu H, Guo Y, Farooq F (2018). Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2041-2050). Online publication date: 19-Jul-2018.
https://dl.acm.org/doi/10.1145/3219819.3220042

Deng J, Xu X, Zhang Z, Fruhholz S, Schuller B (2018). Semisupervised Autoencoders for Speech Emotion Recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing 26:1 (31-43). Online publication date: 1-Jan-2018.
https://dl.acm.org/doi/10.1109/TASLP.2017.2759338

Kim J, Bukhari W, Lee M (2018). Feature Analysis of Unsupervised Learning for Multi-task Classification Using Convolutional Neural Network. Neural Processing Letters 47:3 (783-797). Online publication date: 1-Jun-2018.
https://dl.acm.org/doi/10.1007/s11063-017-9724-1

Chu W, Cai D (2017). Stacked similarity-aware autoencoders. Proceedings of the 26th International Joint Conference on Artificial Intelligence (1561-1567). Online publication date: 19-Aug-2017.
https://dl.acm.org/doi/10.5555/3172077.3172104

Patel A, Nguyen T, Baraniuk R (2016). A probabilistic framework for deep learning. Proceedings of the 30th International Conference on Neural Information Processing Systems (2558-2566). Online publication date: 5-Dec-2016.
https://dl.acm.org/doi/10.5555/3157382.3157384
