One-Shot Learning of Object Categories

Published: 01 April 2006

Abstract

Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by Maximum Likelihood (ML) and Maximum A Posteriori (MAP) methods. We find that, on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
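
To make the prior-to-posterior update described in the abstract concrete, here is a minimal numerical sketch in Python. It is not the paper's constellation model: the Gaussian category model, the feature dimensionality, and all variable names below are assumptions made purely for illustration. It builds a prior over a new category's parameters from previously learned categories, then compares the ML estimate with the MAP/posterior estimate after a single training example.

```python
# Minimal sketch of one-shot Bayesian learning with a prior borrowed from other
# categories. This is NOT the paper's constellation model: the Gaussian category
# model, dimensionality, and hyperparameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# "Previously learned categories": pretend each category is summarized by a mean
# feature vector; fit a Gaussian prior over a new category's mean from them.
known_category_means = rng.normal(loc=0.0, scale=2.0, size=(100, 5))
mu0 = known_category_means.mean(axis=0)    # prior mean (hyperparameter)
tau2 = known_category_means.var(axis=0)    # prior variance (hyperparameter)

# One-shot data: a single example (n = 1) from a new, unseen category.
true_mean = rng.normal(loc=0.0, scale=2.0, size=5)
sigma2 = 1.0                               # assumed known observation noise
n = 1
x = rng.normal(loc=true_mean, scale=np.sqrt(sigma2), size=(n, 5))
xbar = x.mean(axis=0)

# Maximum Likelihood: ignores the prior entirely (unstable when n is tiny).
mu_ml = xbar

# Posterior for a Gaussian mean with a Gaussian prior: a precision-weighted
# blend of the prior mean and the sample mean (here MAP = posterior mean).
post_var = 1.0 / (1.0 / tau2 + n / sigma2)
mu_post = post_var * (mu0 / tau2 + n * xbar / sigma2)

# The Bayesian predictive density also keeps the remaining parameter uncertainty.
pred_var = sigma2 + post_var

print("ML estimate error:    ", np.linalg.norm(mu_ml - true_mean))
print("Posterior-mean error: ", np.linalg.norm(mu_post - true_mean))
print("Predictive variance:  ", pred_var)
```

In this conjugate Gaussian toy case the MAP estimate coincides with the posterior mean; the fully Bayesian treatment additionally carries the posterior variance into the predictive density, which is what keeps the model informative when only a single example is available.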



Information

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 28, Issue 4
April 2006
175 pages

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 April 2006

Author Tags

  1. Recognition
  2. few images
  3. learning
  4. object categories
  5. priors
  6. unsupervised
  7. variational inference

Qualifiers

  • Research-article

Citations

Cited By

  • (2024) "Learning Task-Specific Embeddings for Few-Shot Classification via Local Weight Adaptation," Proceedings of the 2024 16th International Conference on Machine Learning and Computing, pp. 485-491, doi: 10.1145/3651671.3651746, online 2 Feb. 2024
  • (2024) "SRCPT: Spatial Reconstruction Contrastive Pretext Task for Improving Few-Shot Image Classification," Proceedings of the 2024 16th International Conference on Machine Learning and Computing, pp. 424-432, doi: 10.1145/3651671.3651701, online 2 Feb. 2024
  • (2024) "How to refactor this code? An exploratory study on developer-ChatGPT refactoring conversations," Proceedings of the 21st International Conference on Mining Software Repositories, pp. 202-206, doi: 10.1145/3643991.3645081, online 15 Apr. 2024
  • (2024) "mmSign: mmWave-based Few-Shot Online Handwritten Signature Verification," ACM Transactions on Sensor Networks, vol. 20, no. 4, pp. 1-31, doi: 10.1145/3605945, online 11 May 2024
  • (2024) "Fast and Robust Sparsity-Aware Block Diagonal Representation," IEEE Transactions on Signal Processing, vol. 72, pp. 305-320, doi: 10.1109/TSP.2023.3343565, online 1 Jan. 2024
  • (2024) "Attribute-Based Robotic Grasping With Data-Efficient Adaptation," IEEE Transactions on Robotics, vol. 40, pp. 1566-1579, doi: 10.1109/TRO.2024.3353484, online 12 Jan. 2024
  • (2024) "Property-Aware Relation Networks for Few-Shot Molecular Property Prediction," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 8, pp. 5413-5429, doi: 10.1109/TPAMI.2024.3368090, online 1 Aug. 2024
  • (2024) "Robust Meta-Representation Learning via Global Label Inference and Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 4, pp. 1996-2010, doi: 10.1109/TPAMI.2023.3328184, online 1 Apr. 2024
  • (2024) "Diffusion Mechanism in Residual Neural Network: Theory and Applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 2, pp. 667-680, doi: 10.1109/TPAMI.2023.3272341, online 1 Feb. 2024
  • (2024) "Few-Shot Learning With a Strong Teacher," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 3, pp. 1425-1440, doi: 10.1109/TPAMI.2022.3160362, online 1 Mar. 2024
