Research article, DOI: 10.1145/1553374.1553453

Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

Published: 14 June 2009

Abstract

There has been much interest in unsupervised learning of hierarchical generative models such as deep belief networks. Scaling such models to full-sized, high-dimensional images remains a difficult problem. To address this problem, we present the convolutional deep belief network, a hierarchical generative model which scales to realistic image sizes. This model is translation-invariant and supports efficient bottom-up and top-down probabilistic inference. Key to our approach is probabilistic max-pooling, a novel technique which shrinks the representations of higher layers in a probabilistically sound way. Our experiments show that the algorithm learns useful high-level visual features, such as object parts, from unlabeled images of objects and natural scenes. We demonstrate excellent performance on several visual recognition tasks and show that our model can perform hierarchical (bottom-up and top-down) inference over full-sized images.
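
The probabilistic max-pooling idea can be illustrated with a small sketch. The abstract gives no formulas, so what follows is an illustration under stated assumptions, not the authors' implementation: it assumes each pooling block holds binary detection units of which at most one may be active, so the block behaves like a softmax over its units plus one extra "all off" state. The function name `probabilistic_max_pool` and the argument `block_inputs` are hypothetical.

```python
import math


def probabilistic_max_pool(block_inputs):
    """Sketch of sampling probabilities for one pooling block.

    block_inputs: bottom-up input to each detection unit in the block.
    Assumption: at most one unit in the block is on, and the "all off"
    state has input 0, giving a softmax over len(block_inputs) + 1 states.
    Returns (unit_on_probs, pool_off_prob).
    """
    if not block_inputs:
        raise ValueError("block_inputs must not be empty")

    # Shift by the largest input (or 0, the off state) for numerical stability.
    shift = max(0.0, max(block_inputs))
    unit_weights = [math.exp(x - shift) for x in block_inputs]
    off_weight = math.exp(-shift)

    normalizer = off_weight + sum(unit_weights)
    unit_on_probs = [w / normalizer for w in unit_weights]
    pool_off_prob = off_weight / normalizer
    return unit_on_probs, pool_off_prob


if __name__ == "__main__":
    probs, p_off = probabilistic_max_pool([0.0, 0.0, 0.0, 0.0])
    # The pooling unit is on whenever any detection unit in its block is on.
    print(probs, p_off, 1.0 - p_off)
```

Under these assumptions the pooling unit's activation probability is one minus the "all off" probability, so the pooled layer is smaller than the detection layer by the block size while every probability remains properly normalized.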

Published In

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN:9781605585161
DOI:10.1145/1553374

Sponsors

  • NSF
  • Microsoft Research
  • MITACS

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
