Abstract
This paper presents our recently developed approach to constructing a hierarchical representation of visual input that aims to enable recognition and detection of a large number of object categories. Inspired by the principles of efficient indexing, robust matching, and compositionality, our approach learns a hierarchy of spatially flexible compositions, i.e. parts, in an unsupervised, statistics-driven manner. Starting with simple, frequent features, we learn the statistically most significant compositions (parts composed of parts), which in turn define the next layer. Parts are learned sequentially, layer after layer, optimally adjusting to the visual data. Lower layers are learned in a category-independent way to obtain complex yet sharable visual building blocks, which is a crucial step towards a scalable representation. Higher layers of the hierarchy, on the other hand, are constructed using specific categories, achieving a category representation with a small number of highly generalizable parts that gain their structural flexibility through composition within the hierarchy. Because the hierarchy is built in this way, new categories can be added to the system efficiently and continuously by adding only a small number of parts in the higher layers. The approach is demonstrated on a large collection of images and a variety of object categories.
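The layer-by-layer learning step described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that "statistically most significant" can be approximated by raw co-occurrence frequency, that a composition is a pair of parts with a quantized relative offset, and that detections arrive as (part_id, x, y) tuples. The function name learn_next_layer and the parameters radius, bin_size, min_count and max_parts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def learn_next_layer(images, radius=3, bin_size=1, min_count=2, max_parts=10):
    """Pick frequent pairwise compositions of current-layer parts.

    images: list of images, each a list of (part_id, x, y) detections.
    Returns a list of compositions (part_a, part_b, dx_bin, dy_bin),
    most frequent first. Frequency is a stand-in for statistical
    significance (an assumption of this sketch).
    """
    counts = Counter()
    for detections in images:
        seen = set()  # count each composition at most once per image
        # Sorting gives every pair a canonical order, so (a, b) and (b, a)
        # map to the same composition key.
        for a, b in combinations(sorted(detections), 2):
            (pa, xa, ya), (pb, xb, yb) = a, b
            dx, dy = xb - xa, yb - ya
            if max(abs(dx), abs(dy)) > radius:
                continue  # only spatially close parts may be composed
            # Quantizing the offset gives the composition some spatial slack.
            seen.add((pa, pb, dx // bin_size, dy // bin_size))
        counts.update(seen)
    return [key for key, c in counts.most_common(max_parts) if c >= min_count]


# Toy data: part 0 and part 1 repeatedly appear two pixels apart.
images = [
    [(0, 0, 0), (1, 2, 0)],
    [(0, 5, 5), (1, 7, 5), (2, 6, 9)],
    [(0, 1, 1), (1, 3, 1)],
]
next_layer = learn_next_layer(images)
print(next_layer)
```

In this sketch the selected compositions would serve as the part vocabulary of the next layer; repeating the step on detections of those new parts would yield a deeper hierarchy.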
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leonardis, A., Fidler, S. (2010). Learning Hierarchical Representations of Object Categories for Robot Vision. In: Kaneko, M., Nakamura, Y. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 66. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14743-2_9
DOI: https://doi.org/10.1007/978-3-642-14743-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14742-5
Online ISBN: 978-3-642-14743-2
eBook Packages: Engineering, Engineering (R0)