Perception-Action Based Object Detection from Local Descriptor Combination and Reinforcement Learning

Lucas Paletta,
Gerald Fritz &
Christin Seifert

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3540)

Included in the following conference series:

Scandinavian Conference on Image Analysis

Abstract

This work proposes to learn visual encodings of attention patterns that enable sequential attention for object detection in real-world environments. The system embeds a saccadic decision procedure in a cascaded process in which visual evidence is probed at informative image locations. It is based on the extraction of information-theoretic saliency: informative local image descriptors are determined and provide selected foci of interest. The local information, in terms of codebook vector responses, and the geometric information in the shift of attention together contribute to the recognition states of a Markov decision process. A Q-learner then searches for useful actions towards salient locations, developing a strategy of action sequences directed in state space towards information maximization. The method is evaluated in outdoor object recognition and demonstrates efficient performance.
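
The Q-learning step described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the three recognition states, the two attention-shift actions, and the reward of 1.0 standing in for an information gain are all hypothetical, chosen only to show how a Q-learner comes to prefer shifts towards informative locations. In the paper, the states are derived from codebook vector responses and the geometry of the attention shift.

```python
import random

def q_learning(n_states, n_actions, step, start_state, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=10, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = start_state
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])
            s_next, reward, done = step(s, a)
            # Standard Q-learning target (Watkins and Dayan, 1992).
            target = reward if done else reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q

# Hypothetical toy environment (an assumption, not taken from the paper):
# states 0..2 are recognition states with growing evidence, state 3 is terminal.
# Action 0 shifts attention to an informative location, action 1 to an
# uninformative one.
TERMINAL = 3

def toy_step(state, action):
    if action == 0:
        next_state = state + 1
        return next_state, 1.0, next_state == TERMINAL  # reward mimics information gain
    return state, 0.0, False  # uninformative shift: no new evidence

q_table = q_learning(n_states=4, n_actions=2, step=toy_step, start_state=0)
# Greedy action for each non-terminal recognition state.
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(TERMINAL)]
print(policy)
```

In this toy setting the learned greedy policy selects the informative shift in every recognition state, which mirrors, in a very reduced form, the strategy of action sequences the abstract refers to.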

This work is supported by the European Commission-funded projects MACS under grant number FP6-004381 and MOBVIS under grant number FP6-511051, and by the FWF Austrian Joint Research Project Cognitive Vision under sub-projects S9103-N04 and S9104-N04.

Author information

Authors and Affiliations

Institute of Digital Image Processing, JOANNEUM RESEARCH Forschungsgesellschaft mbH, Wastiangasse 6, A-8010, Graz, Austria
Lucas Paletta, Gerald Fritz & Christin Seifert

Editor information

Editors and Affiliations

Department of Information Technology, Lappeenranta University of Technology, P.O.Box 20, FIN-53851, Lappeenranta, Finland
Heikki Kalviainen
Dept. of Computer Science, University of Joensuu, Finland
Jussi Parkkinen
Department of Information and Computer Sciences, Toyohashi University of Technology, 1-1 Hibarigaoka, Tenpaku-cho, 441-8580, Toyohashi, Japan
Arto Kaarna

Cite this paper

Paletta, L., Fritz, G., Seifert, C. (2005). Perception-Action Based Object Detection from Local Descriptor Combination and Reinforcement Learning. In: Kalviainen, H., Parkkinen, J., Kaarna, A. (eds) Image Analysis. SCIA 2005. Lecture Notes in Computer Science, vol 3540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11499145_65

DOI: https://doi.org/10.1007/11499145_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26320-3
Online ISBN: 978-3-540-31566-7
eBook Packages: Computer Science, Computer Science (R0)
