Abstract
3D object detection and recognition is increasingly used for manipulation and navigation tasks in service robots. It involves segmenting the objects present in a scene, computing a feature descriptor for each object view and, finally, recognizing the object view by comparing it to the known object categories. This paper presents an efficient approach capable of learning and recognizing object categories in an interactive and open-ended manner. Here, “open-ended” means that the set of object categories to be learned is not known in advance. The training instances are extracted from the on-line experiences of a robot, and thus become gradually available over time, rather than at the beginning of the learning process. This paper focuses on two state-of-the-art questions: (1) How to automatically detect, conceptualize and recognize objects in 3D scenes in an open-ended manner? (2) How to acquire and use high-level knowledge obtained from interaction with human users, namely when they provide category labels, in order to improve system performance? The approach starts with a pre-processing step that removes irrelevant data and prepares a suitable point cloud for subsequent processing. Clustering is then applied to detect object candidates, and object views are described using a 3D shape descriptor called the spin-image. Finally, a nearest-neighbor classification rule is used to predict the categories of the detected objects. Leave-one-out cross-validation is used to compute precision and recall, in a classical off-line evaluation setting, for different system parameters. In addition, an on-line evaluation protocol is used to assess the performance of the system in an open-ended setting. Results show that the proposed system is able to interact with human users and to learn new object categories continuously over time.
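The last two stages described in the abstract, nearest-neighbor classification and leave-one-out evaluation, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `OpenEndedNN`, the functions `euclidean` and `leave_one_out`, the Euclidean distance, and the two-dimensional toy descriptors (standing in for spin-images) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OpenEndedNN:
    """Hypothetical instance-based learner: categories appear as labels arrive."""

    def __init__(self):
        self.instances = []  # list of (descriptor, label) pairs

    def teach(self, descriptor, label):
        """Store a labelled object view; an unseen label creates a new category."""
        self.instances.append((tuple(descriptor), label))

    def categories(self):
        return sorted({label for _, label in self.instances})

    def predict(self, descriptor, skip=None):
        """Return the label of the nearest stored view, ignoring index `skip`."""
        best_label, best_dist = None, math.inf
        for i, (stored, label) in enumerate(self.instances):
            if i == skip:
                continue
            dist = euclidean(descriptor, stored)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


def leave_one_out(learner):
    """Per-category (precision, recall), classifying each view with itself held out."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for i, (descriptor, truth) in enumerate(learner.instances):
        guess = learner.predict(descriptor, skip=i)
        if guess == truth:
            tp[truth] += 1
        else:
            fn[truth] += 1
            if guess is not None:
                fp[guess] += 1
    scores = {}
    for c in learner.categories():
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        scores[c] = (precision, recall)
    return scores


# Toy descriptors (assumption): real spin-images are much higher-dimensional.
learner = OpenEndedNN()
for view, label in [([0.0, 0.0], "mug"), ([0.1, 0.0], "mug"),
                    ([1.0, 1.0], "box"), ([1.1, 1.0], "box"), ([0.9, 1.0], "box")]:
    learner.teach(view, label)

print(learner.predict([0.05, 0.1]))
print(leave_one_out(learner))
```

Because the learner only stores labelled views, a category that was unknown at the start can be introduced at any time with a single `teach` call, which is the sense of "open-ended" used in the abstract.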
Cite this article
Kasaei, S.H., Oliveira, M., Lim, G.H. et al. Interactive Open-Ended Learning for 3D Object Recognition: An Approach and Experiments. J Intell Robot Syst 80, 537–553 (2015). https://doi.org/10.1007/s10846-015-0189-z
DOI: https://doi.org/10.1007/s10846-015-0189-z