Abstract
People with blindness and low vision (pBLV) experience significant challenges when locating final destinations or targeting specific objects in unfamiliar environments. Furthermore, beyond initially locating and orienting oneself to a target object, approaching the final target from one's present position is often frustrating and challenging, especially when one drifts away from the initially planned path to avoid obstacles. In this paper, we develop a novel wearable navigation solution that provides real-time guidance for a user to approach a target object of interest efficiently and effectively in unfamiliar environments. Our system contains two key visual computing functions: initial target object localization in 3D and continuous estimation of the user's trajectory, both based on the 2D video captured by a low-cost monocular camera mounted in front of the user's chest. These functions enable the system to suggest an initial navigation path, continuously update the path as the user moves, and offer timely recommendations for correcting the user's path. Our experiments demonstrate that our system operates with an error of less than 0.5 m both outdoors and indoors. The system is entirely vision-based and does not need other sensors for navigation, and the computation can be run on the Jetson processor in the wearable system to facilitate real-time navigation assistance.
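To make the guidance idea concrete, the sketch below shows how a 2D detection of the target in the monocular video could be back-projected into a 3D position and turned into a heading-and-distance cue, run once every 0.5 to 1 s as discussed in the notes. This is a minimal illustrative sketch, not the paper's implementation: the camera intrinsics, bounding-box centre, and metric depth are hypothetical placeholders, whereas in the actual system the depth would come from the 3D target localization module and the user's pose from the trajectory estimation module.

```python
# Illustrative sketch only (not the authors' implementation). All numeric values
# (intrinsics, bounding-box centre, depth) are hypothetical placeholders.
import numpy as np

# Assumed pinhole intrinsics of the chest-mounted camera.
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])

def backproject(u, v, depth, K):
    """Back-project pixel (u, v) at metric depth into camera coordinates."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return ray * depth  # (X, Y, Z) with X right, Y down, Z forward

def guidance(target_cam):
    """Convert a target position in the camera frame into a heading/distance cue."""
    x, _, z = target_cam
    distance = float(np.hypot(x, z))                # ground-plane distance in metres
    heading = float(np.degrees(np.arctan2(x, z)))   # positive = to the user's right
    side = "right" if heading >= 0 else "left"
    return f"target {distance:.1f} m ahead, {abs(heading):.0f} degrees to the {side}"

# Hypothetical detection of the target's bounding-box centre and an assumed depth.
u, v, depth_m = 820.0, 400.0, 3.2
print(guidance(backproject(u, v, depth_m, K)))
```

In the real system this cue would be recomputed as the user moves, so that drifting off the straight-line path (e.g. to avoid an obstacle) triggers an updated correction.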
Notes
- 1.
The video is typically captured at 15 to 30 frames per second, but this processing may be done at a slower rate, e.g., every 0.5 to 1 s.
- 2.
Here we assume, for simplicity, that there is an open space between the target and the user. In practice, more sophisticated algorithms are needed to detect obstacles between the target and the user and to plan the path accordingly. In this work, we focus on the visual processing components.
- 3.
Note that in the KITTI videos and our videos, the camera motion between successive frames is relatively small, leading to very small motion estimation error as well. For the intended navigation application, such analysis only needs to be run between frames with a larger interval, and hence larger errors are likely, but we expect them to be on the same order as the localization error, which is within 0.5 m. (A minimal sketch of such frame-to-frame motion estimation follows these notes.)
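As an illustration of the kind of frame-to-frame motion estimation discussed in note 3, the sketch below recovers the relative camera rotation and the (scale-free) translation direction between two processed frames with OpenCV's five-point essential-matrix solver, the standard building block of monocular visual odometry. The intrinsics and point correspondences are synthetic stand-ins for real feature matches (e.g. ORB keypoints filtered with RANSAC); this is not the paper's exact pipeline.

```python
# Illustrative sketch only: relative camera motion from point correspondences via
# the five-point essential-matrix method. Intrinsics and correspondences are
# synthetic stand-ins for real feature matches between two processed frames.
import numpy as np
import cv2

rng = np.random.default_rng(0)
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])

def project(pts_w, R, t):
    """Project world points with extrinsics [R | t] (x_cam = R x_world + t)."""
    cam = (R @ pts_w.T).T + t
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

# Synthetic 3D scene and a small camera motion between successive processed frames.
pts_w = rng.uniform([-2.0, -1.0, 4.0], [2.0, 1.0, 8.0], size=(200, 3))
R_true, _ = cv2.Rodrigues(np.array([0.0, 0.05, 0.0]).reshape(3, 1))  # ~2.9 deg yaw
t_true = np.array([0.10, 0.0, 0.40])  # metres (absolute scale is unobservable)

pts1 = project(pts_w, np.eye(3), np.zeros(3))  # frame k
pts2 = project(pts_w, R_true, t_true)          # frame k + interval

E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

# Translation is recovered only up to scale; the heading change comes from R.
yaw_deg = np.degrees(np.arctan2(R[0, 2], R[2, 2]))
print(f"estimated yaw change: {yaw_deg:.2f} deg, translation direction: {t.ravel().round(2)}")
```

With a larger interval between the two frames, the feature overlap shrinks and the estimation error grows, which is the effect note 3 refers to.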
Acknowledgments
Research reported in this publication was supported in part by NSF grant 1952180 under the Smart and Connected Community program, by the National Eye Institute of the National Institutes of Health under Award Number R21EY033689, and by DoD grant VR200130, "Delivering Sensory and Semantic Visual Information via Auditory Feedback on Mobile Technology". The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, NSF, or DoD. Yu Hao was partially supported by the NYUAD Institute (Research Enhancement Fund - RE132).
Ethics declarations
Conflict of Interest
New York University (NYU) and John-Ross Rizzo (JRR) have financial interests in related intellectual property. NYU owns a patent licensed to Tactile Navigation Tools. NYU and JRR are equity holders in and advisors of said company.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hao, Y., Feng, J., Rizzo, JR., Wang, Y., Fang, Y. (2023). Detect and Approach: Close-Range Navigation Support for People with Blindness and Low Vision. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_41
DOI: https://doi.org/10.1007/978-3-031-25075-0_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science (R0)