Article

ImageNet classification with deep convolutional neural networks

Published: 03 December 2012

Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers, we employed a recently developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
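
The top-1 and top-5 error rates quoted in the abstract can be read as follows: the top-k error is the fraction of test images whose correct label is not among the k classes the model scores highest. The sketch below is only an illustration of that metric, not the authors' evaluation code. The function name `top_k_error`, the tie-breaking rule, and the toy scores and labels are all assumptions made for the example.

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Fraction of examples whose true label is not among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one example")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score.
        # Assumption: ties are broken in favour of the lower class index.
        ranked = sorted(range(len(row)), key=lambda c: (-row[c], c))
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Hypothetical toy data: 4 examples, 6 classes (not taken from the paper).
toy_scores = [
    [0.05, 0.60, 0.10, 0.10, 0.10, 0.05],  # true class 1 has the highest score
    [0.30, 0.05, 0.25, 0.20, 0.15, 0.05],  # true class 2 has the second-highest score
    [0.40, 0.30, 0.15, 0.10, 0.04, 0.01],  # true class 5 has the lowest score
    [0.10, 0.20, 0.30, 0.25, 0.10, 0.05],  # true class 3 has the second-highest score
]
toy_labels = [1, 2, 5, 3]

top1 = top_k_error(toy_scores, toy_labels, k=1)
top5 = top_k_error(toy_scores, toy_labels, k=5)
print(f"top-1 error: {top1:.2f}, top-5 error: {top5:.2f}")
# prints "top-1 error: 0.75, top-5 error: 0.25"
```

With 1000 classes, as in the contest, the same function applies unchanged; only the length of each score row grows.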




    Published In

    NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1
    December 2012
    3328 pages

    Publisher

    Curran Associates Inc.

    Red Hook, NY, United States



    Cited By

    • (2024) Optimizing Data Acquisition to Enhance Machine Learning Performance. Proceedings of the VLDB Endowment 17:6 (1310-1323). DOI: 10.14778/3648160.3648172. Online publication date: 1-Feb-2024
    • (2024) Face image de-identification based on feature embedding. Journal on Image and Video Processing 2024:1. DOI: 10.1186/s13640-024-00646-z. Online publication date: 5-Sep-2024
    • (2024) Analysis and Research on students' classroom behavior data based on Bidirectional Motion Difference. Proceedings of the 2024 International Symposium on Artificial Intelligence for Education (629-633). DOI: 10.1145/3700297.3700405. Online publication date: 6-Sep-2024
    • (2024) Towards Effective Machine Learning Models for Ransomware Detection via Low-Level Hardware Information. Proceedings of the International Workshop on Hardware and Architectural Support for Security and Privacy 2024 (10-18). DOI: 10.1145/3696843.3696847. Online publication date: 2-Nov-2024
    • (2024) A Silicon Element Quality Inspection System Based on YOLOv8. Proceedings of the International Conference on Machine Learning, Pattern Recognition and Automation Engineering (29-35). DOI: 10.1145/3696687.3696693. Online publication date: 7-Aug-2024
    • (2024) Performance Analysis of CNN Inference/Training with Convolution and Non-Convolution Operations on ASIC Accelerators. ACM Transactions on Design Automation of Electronic Systems 30:1 (1-34). DOI: 10.1145/3696665. Online publication date: 26-Sep-2024
    • (2024) A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability. ACM Computing Surveys 57:2 (1-38). DOI: 10.1145/3696206. Online publication date: 17-Sep-2024
    • (2024) Assessing Pretrained Model Through Transfer Multi-Task Learn For Melanoma Classification. Proceedings of the 2024 8th International Conference on Cloud and Big Data Computing (73-79). DOI: 10.1145/3694860.3694871. Online publication date: 15-Aug-2024
    • (2024) Prioritizing Test Inputs for DNNs Using Training Dynamics. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (1219-1231). DOI: 10.1145/3691620.3695498. Online publication date: 27-Oct-2024
    • (2024) Historical Postcards Retrieval through Vision Foundation Models. Proceedings of the 6th workshop on the analySis, Understanding and proMotion of heritAge Contents (50-56). DOI: 10.1145/3689094.3689471. Online publication date: 28-Oct-2024
