Abstract
This paper presents an indoor robotic system that integrates a state-of-the-art object detection algorithm, trained with data augmented for an indoor scenario, with mechanisms to localize objects in 3D and display them interactively to a user. Size, weight, and power constraints on a mobile robot limit the type of computing hardware that can be integrated with the robotic platform. On the other hand, the robot’s mobility, if leveraged properly, provides the opportunity to detect objects from different distances and viewpoints as the robot approaches them, giving more robust results. This work adapts a CNN-based algorithm, YOLO, to run on a GPU-enabled board, the Jetson TX1. An innovative method to calculate the object position in the 3D environment map is discussed, along with the problems therein, such as duplicate detections that need to be suppressed. Since multiple objects of the same or different classes may be detected, the user can be overloaded with information, and managing the visualization through human–machine interaction becomes important. A scheme for informative display of objects is implemented that lets the user interactively view object images as well as their positions in the scene. The complete robotic system, including the interactive visualization tool, can be put to various uses such as search and rescue, indoor assistance, patrolling, and surveillance.
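The abstract does not say how duplicate detections are suppressed, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes each detection has already been placed in the 3D map frame, and it greedily keeps the most confident detection of a class within a fixed radius. The names `Detection3D` and `suppress_duplicates`, and the 0.5 m radius, are hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection3D:
    """Hypothetical record for one detection placed in the map frame."""
    label: str                             # object class, e.g. "chair"
    position: Tuple[float, float, float]   # (x, y, z) in metres, map frame
    confidence: float                      # detector score in [0, 1]


def suppress_duplicates(detections: List[Detection3D],
                        radius: float = 0.5) -> List[Detection3D]:
    """Greedy suppression: visit detections from most to least confident
    and drop any that lies within `radius` metres of an already kept
    detection of the same class. The radius is an assumed value."""
    kept: List[Detection3D] = []
    for det in sorted(detections, key=lambda d: d.confidence, reverse=True):
        is_duplicate = any(
            k.label == det.label
            and math.dist(k.position, det.position) < radius
            for k in kept
        )
        if not is_duplicate:
            kept.append(det)
    return kept


if __name__ == "__main__":
    # The same chair seen from two viewpoints, a second chair elsewhere,
    # and a bottle close to the first chair (different class, so kept).
    observations = [
        Detection3D("chair", (1.0, 2.0, 0.0), 0.9),
        Detection3D("chair", (1.1, 2.1, 0.0), 0.6),
        Detection3D("chair", (4.0, 0.0, 0.0), 0.7),
        Detection3D("bottle", (1.05, 2.0, 0.0), 0.5),
    ]
    for det in suppress_duplicates(observations):
        print(det.label, det.position, det.confidence)
```

Comparing only detections of the same class keeps nearby objects of different classes distinct. A fixed radius is the simplest possible criterion; a real system might instead scale it with object size or localization uncertainty.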
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sharma, A.M., Syed, I.A., Sharma, B., Jamal, A., Deodhare, D. (2018). Visual Object Detection for an Autonomous Indoor Robotic System. In: Chaudhuri, B., Kankanhalli, M., Raman, B. (eds) Proceedings of 2nd International Conference on Computer Vision & Image Processing. Advances in Intelligent Systems and Computing, vol 703. Springer, Singapore. https://doi.org/10.1007/978-981-10-7895-8_17
DOI: https://doi.org/10.1007/978-981-10-7895-8_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7894-1
Online ISBN: 978-981-10-7895-8
eBook Packages: Engineering, Engineering (R0)