Learning Hierarchical Representations of Object Categories for Robot Vision

  • Conference paper
Robotics Research

Part of the book series: Springer Tracts in Advanced Robotics (STAR, volume 66)

Abstract

This paper presents our recently developed approach to constructing a hierarchical representation of visual input that aims to enable recognition and detection of a large number of object categories. Inspired by the principles of efficient indexing, robust matching, and ideas of compositionality, our approach learns a hierarchy of spatially flexible compositions, i.e., parts, in an unsupervised, statistics-driven manner. Starting with simple, frequent features, we learn the statistically most significant compositions (parts composed of parts), which consequently define the next layer. Parts are learned sequentially, layer after layer, optimally adjusting to the visual data. Lower layers are learned in a category-independent way to obtain complex, yet sharable visual building blocks, which is a crucial step towards a scalable representation. Higher layers of the hierarchy, on the other hand, are constructed by using specific categories, achieving a category representation with a small number of highly generalizable parts that gained their structural flexibility through composition within the hierarchy. With the hierarchy built in this way, new categories can be added to the system efficiently and continuously by adding only a small number of parts in the higher layers. The approach is demonstrated on a large collection of images and a variety of object categories.
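The layer-by-layer idea described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes parts are given as `(part_id, x, y)` detections on an integer grid, uses raw co-occurrence counts in place of a statistical significance test, and matches exact offsets rather than spatially flexible ones. The names `learn_layer`, `detect_layer`, `radius`, and `top_k` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def learn_layer(images, radius=2, top_k=2):
    """Return the top_k most frequent pairwise compositions.

    images: list of images, each a list of (part_id, x, y) detections.
    A composition is (part_a, part_b, dx, dy): part_b found at offset
    (dx, dy) from part_a. Assumption: frequency stands in for the
    statistical significance criterion mentioned in the abstract.
    """
    counts = Counter()
    for detections in images:
        # Sorting fixes a canonical order so each pair is counted once.
        for (pa, xa, ya), (pb, xb, yb) in combinations(sorted(detections), 2):
            dx, dy = xb - xa, yb - ya
            if max(abs(dx), abs(dy)) <= radius:
                counts[(pa, pb, dx, dy)] += 1
    return [comp for comp, _ in counts.most_common(top_k)]


def detect_layer(detections, compositions):
    """Re-describe one image in terms of the learned compositions.

    Each match becomes a detection (composition_index, x, y) of the next
    layer, located at the position of the first sub-part.
    """
    present = set(detections)
    next_layer = []
    for idx, (pa, pb, dx, dy) in enumerate(compositions):
        for part, x, y in detections:
            if part == pa and (pb, x + dx, y + dy) in present:
                next_layer.append((idx, x, y))
    return next_layer


# Toy data: part 0 and part 1 repeatedly appear side by side.
img1 = [(0, 0, 0), (1, 1, 0), (0, 5, 5), (1, 6, 5)]
img2 = [(0, 2, 2), (1, 3, 2)]

layer2 = learn_layer([img1, img2], radius=2, top_k=1)
print(layer2)                      # [(0, 1, 1, 0)]
print(detect_layer(img1, layer2))  # [(0, 0, 0), (0, 5, 5)]
```

Feeding the output of `detect_layer` back into `learn_layer` would yield a third layer built from second-layer parts, which mirrors the sequential, layer-after-layer construction in spirit only.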

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Leonardis, A., Fidler, S. (2010). Learning Hierarchical Representations of Object Categories for Robot Vision. In: Kaneko, M., Nakamura, Y. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 66. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14743-2_9

  • DOI: https://doi.org/10.1007/978-3-642-14743-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14742-5

  • Online ISBN: 978-3-642-14743-2

  • eBook Packages: Engineering, Engineering (R0)
