research-article

Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

Authors:

Rajesh Ranganath,

Andrew Y. Ng

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning

Pages 609 - 616

https://doi.org/10.1145/1553374.1553453

Published: 14 June 2009

Abstract

There has been much interest in unsupervised learning of hierarchical generative models such as deep belief networks. Scaling such models to full-sized, high-dimensional images remains a difficult problem. To address this problem, we present the convolutional deep belief network, a hierarchical generative model which scales to realistic image sizes. This model is translation-invariant and supports efficient bottom-up and top-down probabilistic inference. Key to our approach is probabilistic max-pooling, a novel technique which shrinks the representations of higher layers in a probabilistically sound way. Our experiments show that the algorithm learns useful high-level visual features, such as object parts, from unlabeled images of objects and natural scenes. We demonstrate excellent performance on several visual recognition tasks and show that our model can perform hierarchical (bottom-up and top-down) inference over full-sized images.
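
The abstract names probabilistic max-pooling but does not spell out its form. As a rough illustration only, the sketch below assumes a block-wise formulation: each pooling unit covers a small block of detection units, at most one detection unit in the block may be on, and the pooling unit is on exactly when one of them is. Under that assumption the block behaves like a softmax over its detection units plus one extra "all off" state. The function names and the `bottom_up` argument are hypothetical and are not taken from the paper.

```python
import math
import random


def pooling_block_posterior(bottom_up):
    """Return (detection_probs, pool_off_prob) for one pooling block.

    bottom_up: list of bottom-up inputs (hypothetical name), one per
    detection unit in the block. Assumption: at most one detection unit
    in the block is on, so the block has len(bottom_up) + 1 states.
    """
    # Shift by the largest exponent (0 belongs to the "all off" state)
    # so that math.exp cannot overflow.
    m = max(0.0, max(bottom_up))
    weights = [math.exp(x - m) for x in bottom_up]
    off_weight = math.exp(-m)
    z = off_weight + sum(weights)
    return [w / z for w in weights], off_weight / z


def sample_block(bottom_up, rng=random):
    """Sample one block state: index of the active unit, or None."""
    probs, _ = pooling_block_posterior(bottom_up)
    u = rng.random()
    cumulative = 0.0
    for idx, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return idx
    return None  # every detection unit is off, so the pooling unit is off


# A 2x2 block with zero input: five equally likely states.
probs, p_off = pooling_block_posterior([0.0, 0.0, 0.0, 0.0])
```

Because the pooling unit's off-probability is a single softmax term, the pooled layer stays a valid probabilistic model while being smaller than the detection layer, which is the shrinking effect the abstract describes.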

Published In

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning

June 2009

1331 pages

ISBN:9781605585161

DOI:10.1145/1553374

General Chair: Andrea Danyluk (Williams College)

Program Chairs: Léon Bottou (NEC Laboratories America), Michael Littman (Rutgers University)

Copyright © 2009 by the author(s)/owner(s).

Sponsors

NSF
Microsoft Research
MITACS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Defense Advanced Research Projects Agency

Conference

ICML '09

Sponsor:

Microsoft Research

ICML '09: The 26th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming

June 14 - 18, 2009

Montreal, Quebec, Canada

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
