Article

ImageNet classification with deep convolutional neural networks

Authors:

Alex Krizhevsky,

Ilya Sutskever,

Geoffrey E. Hinton

NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1

Pages 1097 - 1105

Published: 03 December 2012

Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers, we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
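The abstract reports top-1 and top-5 error rates without spelling out how they are computed. As a minimal sketch, assuming the usual definition (an example counts as an error when its true label is not among the k classes the model scores highest), the metric could be computed as follows. The function name `top_k_error` and the toy scores and labels are hypothetical and for illustration only; they are not taken from the paper or its code.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is NOT among the k
    highest-scoring classes (assumed definition of top-k error)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Hypothetical toy data: 4 examples, 6 classes (not real ImageNet outputs).
toy_scores = [
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],  # true class 1 is ranked first
    [0.30, 0.10, 0.40, 0.10, 0.05, 0.05],  # true class 0 is ranked second
    [0.05, 0.05, 0.10, 0.20, 0.25, 0.35],  # true class 2 is ranked fourth
    [0.60, 0.10, 0.10, 0.10, 0.05, 0.05],  # true class 0 is ranked first
]
toy_labels = [1, 0, 2, 0]

top1 = top_k_error(toy_scores, toy_labels, k=1)  # 2 of 4 missed -> 0.5
top2 = top_k_error(toy_scores, toy_labels, k=2)  # 1 of 4 missed -> 0.25
top4 = top_k_error(toy_scores, toy_labels, k=4)  # none missed   -> 0.0
```

Under this definition the top-5 error can never exceed the top-1 error, which is consistent with the 17.0% versus 37.5% figures quoted above.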


Published In

NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1

December 2012

3328 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States
