Abstract
People with blindness and low vision (pBLV) experience significant challenges when locating final destinations or targeting specific objects in unfamiliar environments. Furthermore, beyond initially locating and orienting oneself to a target object, approaching the final target from one's present position is often frustrating and challenging, especially when one drifts away from the initially planned path to avoid obstacles. In this paper, we develop a novel wearable navigation solution that provides real-time guidance for a user to approach a target object of interest efficiently and effectively in unfamiliar environments. Our system contains two key visual computing functions: initial target object localization in 3D and continuous estimation of the user's trajectory, both based on the 2D video captured by a low-cost monocular camera mounted in front of the user's chest. These functions enable the system to suggest an initial navigation path, continuously update the path as the user moves, and offer timely recommendations for correcting the user's path. Our experiments demonstrate that our system operates with an error of less than 0.5 m both outdoors and indoors. The system is entirely vision-based and does not need other sensors for navigation, and the computation can be run on the Jetson processor in the wearable system to facilitate real-time navigation assistance.
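To make the guidance idea concrete, the sketch below shows how a 2D detection of the target in the monocular video could be back-projected into a 3D position and turned into a heading-and-distance cue, run once every 0.5 to 1 s as discussed in the notes. This is a minimal illustrative sketch, not the paper's implementation: the camera intrinsics, bounding-box centre, and metric depth are hypothetical placeholders, whereas in the actual system the depth would come from the 3D target localization module and the user's pose from the trajectory estimation module.

```python
# Illustrative sketch only (not the authors' implementation). All numeric values
# (intrinsics, bounding-box centre, depth) are hypothetical placeholders.
import numpy as np

# Assumed pinhole intrinsics of the chest-mounted camera.
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])

def backproject(u, v, depth, K):
    """Back-project pixel (u, v) at metric depth into camera coordinates."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return ray * depth  # (X, Y, Z) with X right, Y down, Z forward

def guidance(target_cam):
    """Convert a target position in the camera frame into a heading/distance cue."""
    x, _, z = target_cam
    distance = float(np.hypot(x, z))                # ground-plane distance in metres
    heading = float(np.degrees(np.arctan2(x, z)))   # positive = to the user's right
    side = "right" if heading >= 0 else "left"
    return f"target {distance:.1f} m ahead, {abs(heading):.0f} degrees to the {side}"

# Hypothetical detection of the target's bounding-box centre and an assumed depth.
u, v, depth_m = 820.0, 400.0, 3.2
print(guidance(backproject(u, v, depth_m, K)))
```

In the real system this cue would be recomputed as the user moves, so that drifting off the straight-line path (e.g. to avoid an obstacle) triggers an updated correction.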
Notes
- 1.
The video is typically captured at 15 to 30 frames per second, but this processing may be done at a slower rate, e.g., every 0.5 to 1 s.
- 2.
Here we assume, for simplicity, that there is an open space between the target and the user. In practice, more sophisticated algorithms are needed to detect obstacles between the target and the user and to plan the path accordingly. In this work, we focus on the visual processing components.
- 3.
Note that in the KITTI videos and our videos, the camera motion between successive frames is relatively small, leading to very small motion estimation error as well. For the intended navigation application, such analysis only needs to be run between frames with a larger interval, and hence larger errors are likely, but we expect them to be on the same order as the localization error, which is within 0.5 m. (A minimal sketch of such frame-to-frame motion estimation follows these notes.)
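As an illustration of the kind of frame-to-frame motion estimation discussed in note 3, the sketch below recovers the relative camera rotation and the (scale-free) translation direction between two processed frames with OpenCV's five-point essential-matrix solver, the standard building block of monocular visual odometry. The intrinsics and point correspondences are synthetic stand-ins for real feature matches (e.g. ORB keypoints filtered with RANSAC); this is not the paper's exact pipeline.

```python
# Illustrative sketch only: relative camera motion from point correspondences via
# the five-point essential-matrix method. Intrinsics and correspondences are
# synthetic stand-ins for real feature matches between two processed frames.
import numpy as np
import cv2

rng = np.random.default_rng(0)
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])

def project(pts_w, R, t):
    """Project world points with extrinsics [R | t] (x_cam = R x_world + t)."""
    cam = (R @ pts_w.T).T + t
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

# Synthetic 3D scene and a small camera motion between successive processed frames.
pts_w = rng.uniform([-2.0, -1.0, 4.0], [2.0, 1.0, 8.0], size=(200, 3))
R_true, _ = cv2.Rodrigues(np.array([0.0, 0.05, 0.0]).reshape(3, 1))  # ~2.9 deg yaw
t_true = np.array([0.10, 0.0, 0.40])  # metres (absolute scale is unobservable)

pts1 = project(pts_w, np.eye(3), np.zeros(3))  # frame k
pts2 = project(pts_w, R_true, t_true)          # frame k + interval

E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

# Translation is recovered only up to scale; the heading change comes from R.
yaw_deg = np.degrees(np.arctan2(R[0, 2], R[2, 2]))
print(f"estimated yaw change: {yaw_deg:.2f} deg, translation direction: {t.ravel().round(2)}")
```

With a larger interval between the two frames, the feature overlap shrinks and the estimation error grows, which is the effect note 3 refers to.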
Acknowledgments
Research reported in this publication was supported in part by NSF grant 1952180 under the Smart and Connected Community program, by the National Eye Institute of the National Institutes of Health under Award Number R21EY033689, and by DoD grant VR200130, "Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology". The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, NSF, or DoD. Yu Hao was partially supported by the NYUAD Institute (Research Enhancement Fund - RE132).
Ethics declarations
Conflict of Interest
New York University (NYU) and John-Ross Rizzo (JRR) have financial interests in related intellectual property. NYU owns a patent licensed to Tactile Navigation Tools. NYU and JRR are equity holders in and advisors of said company.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hao, Y., Feng, J., Rizzo, JR., Wang, Y., Fang, Y. (2023). Detect and Approach: Close-Range Navigation Support for People with Blindness and Low Vision. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_41
DOI: https://doi.org/10.1007/978-3-031-25075-0_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science (R0)