Abstract
Millions of people worldwide suffer from visual impairment or vision loss. Traditionally, they rely on guide canes or guide dogs to move around and avoid potential obstacles. However, both aids are passive: they cannot provide conceptual knowledge or the semantic content of an environment. To address this issue, this paper presents a vision-based cognitive system that supports the independence of visually impaired people. More specifically, a 3D indoor semantic map is first constructed with a hand-held RGB-D sensor and then deployed for indoor topological localization. Convolutional neural networks are used both to extract semantic information and to infer location, and the extracted semantic information is used to verify the localization results and eliminate errors. This verification effectively improves topological localization performance despite significant appearance changes within an environment. Experiments demonstrate that the proposed method increases both localization accuracy and recall rates, so the system could potentially help visually impaired people move around safely and live independently.
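To make the verification idea concrete, the following minimal Python sketch (an illustration only, not the authors' implementation) matches a query image's CNN feature vector against the nodes of a topological map and accepts a match only if the query's semantic labels overlap with those stored at the node. All names here (localize, topo_map, the 0.8 similarity threshold) are hypothetical.

```python
# Minimal sketch of CNN-feature place matching with semantic verification.
# Node descriptors, labels, and the threshold are illustrative assumptions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two CNN feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def localize(query_feat: np.ndarray, query_labels: set,
             topo_map: list, sim_thresh: float = 0.8):
    """Return the best-matching map node id, verified by semantic overlap.

    topo_map: list of dicts with keys 'id', 'feat' (a CNN descriptor) and
    'labels' (the set of semantic classes observed at that node).
    """
    best_node, best_sim = None, -1.0
    for node in topo_map:
        sim = cosine_similarity(query_feat, node["feat"])
        if sim > best_sim:
            best_node, best_sim = node, sim
    # Reject matches that look similar but disagree semantically.
    if best_sim < sim_thresh or not (query_labels & best_node["labels"]):
        return None  # localization failure; caller may try the next candidate
    return best_node["id"]
```

In this sketch the semantic check acts as a veto: a node that is visually similar but semantically inconsistent (e.g. a corridor matched against an office) is rejected, mirroring how the paper uses semantic information to eliminate false localizations.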
Acknowledgments
We thank Robin Dowling and Ian Dukes from the University of Essex for their technical support. Our thanks also go to Poppy Rees-Smith from Oxford Brookes University for her valuable comments.
Funding
The first two authors were financially supported by a joint scholarship from the China Scholarship Council and the University of Essex.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Liu, Q., Li, R., Hu, H. et al. Indoor Topological Localization Based on a Novel Deep Learning Technique. Cogn Comput 12, 528–541 (2020). https://doi.org/10.1007/s12559-019-09693-5