Abstract
Invariant object recognition is arguably one of the major challenges for contemporary machine vision systems. In contrast, the mammalian visual system performs this task virtually effortlessly. How can we exploit our knowledge of the biological system to improve artificial systems? Our understanding of the mammalian early visual system has been augmented by the discovery that general coding principles could explain many aspects of neuronal response properties. How can such schemes be transferred to system-level performance? In the present study we train cells on a particular variant of the general principle of temporal coherence, the “stability” objective. These cells are trained on unlabeled real-world images without a teaching signal. We show that after training, the cells form a representation that is largely independent of the viewpoint from which the stimulus is seen. This independence extends to previously unseen viewpoints. The resulting representation is better suited for viewpoint-invariant object classification than the cells’ input patterns. This advantage is maintained even if training and classification take place in the presence of a distractor object that is likewise unlabeled. In summary, we show that unsupervised learning using a general coding principle facilitates the classification of real-world objects that are not segmented from the background and undergo complex, non-isomorphic transformations.
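The abstract names the “stability” objective without giving its form. As a minimal sketch, assuming one common way of formalizing temporal coherence (a cell's mean squared frame-to-frame change, normalised by the variance of its activity), the idea can be illustrated as follows. The function name `stability`, the normalisation, and the toy signals are illustrative assumptions and are not taken from the study itself.

```python
import math

def stability(activity):
    """Hypothetical stability score for one cell's activity over time.

    Assumed form: negative mean squared difference between consecutive
    time steps, divided by the variance of the activity. Scores closer
    to zero indicate a response that changes more slowly.
    """
    n = len(activity)
    if n < 2:
        raise ValueError("need at least two time steps")
    mean = sum(activity) / n
    variance = sum((a - mean) ** 2 for a in activity) / n
    if variance == 0.0:
        # A constant response is trivially 'stable' but carries no
        # information, so the normalised score is undefined.
        raise ValueError("constant activity has undefined stability")
    sq_diffs = [(b - a) ** 2 for a, b in zip(activity, activity[1:])]
    return -(sum(sq_diffs) / len(sq_diffs)) / variance

# Toy responses over 200 frames: one varies slowly, one quickly.
slow = [math.sin(2 * math.pi * t / 200) for t in range(200)]
fast = [math.sin(2 * math.pi * t / 5) for t in range(200)]

print(stability(slow), stability(fast))
```

Under this assumed form, the slowly varying response receives the higher (less negative) score. The variance normalisation is what rules out the trivial solution of a cell that never responds at all.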
Cite this article
Einhäuser, W., Hipp, J., Eggert, J. et al. Learning viewpoint invariant object representations using a temporal coherence principle. Biol Cybern 93, 79–90 (2005). https://doi.org/10.1007/s00422-005-0585-8
DOI: https://doi.org/10.1007/s00422-005-0585-8