Audio Guide for Visually Impaired People Based on Combination of Stereo Vision and Musical Tones
Figure 1. Indoor navigation system architecture. RANSAC, random sample consensus; i-PDR, iterative pedestrian dead reckoning.
Figure 2. Visual marker recognition scheme.
Figure 3. Target tracking scheme.
Figure 4. Construction of visual and hybrid information.
Figure 5. Components of target-tracking algorithms.
Figure 6. Identification of curves and lines.
Figure 7. Systematics of the disparity map operation.
Figure 8. Sound alert scheme based on obstacle distance.
Figure 9. Obstacle detection scheme using stereo vision and audible alerts.
Figure 10. Protocol flowchart adopted for testing.
Figure 11. The result of target navigation in the lab.
Figure 12. Visual markers under different lighting.
Figure 13. Cataloging collisions of reference group users.
Figure 14. Cataloging of collisions by user.
Figure 15. Detailed score of user-rated items.
Figure 16. Detailed assessment of the two worst user-rated items.
Figure 17. Comparison of the margin of error of the hybrid model and the related works.
Abstract
1. Introduction
2. State of the Art
3. Proposed Indoor Positioning System
- Security: The application is local and embedded. Only the device calculates and processes location information, ensuring data-access privacy;
- Delay or Response Time: Choice of techniques that provide an acceptable response time for visually impaired navigation;
- Robustness: The choice of prediction and correction techniques makes the system less fault-prone than using raw readings directly, and reduces processor utilization, keeping the response time within a tolerable limit;
- Complexity: As a criterion, we chose sample reduction of the data, followed by data fusion and the emission of sound messages as the system output;
- Limitations: Navigation algorithms use probabilistic methods and require constant updating of positioning data to inform the user.
3.1. Preprocessing
3.1.1. Feature Extraction
3.1.2. Sample Reduction
3.1.3. Multisensor Data Fusion
3.2. Position
3.2.1. Static Indoor Positioning
3.2.2. Dynamic Indoor Positioning
3.3. Navigation
3.3.1. Route Rules
3.3.2. Guide
3.4. Obstacle Identification
3.5. Auxiliary Data
- Note A, with a frequency of 440 Hz, indicates horizontal obstacles (ground);
- Note C, with a frequency of 264 Hz, indicates vertical obstacles (floor to ceiling);
- Each note is played in three octaves (bass, medium, treble) to express near, mid-distance, and distant obstacles.
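The note-to-obstacle mapping above can be sketched as a small function. This is an illustration, not the authors' implementation: the base notes come from the text, but the 1 m and 3 m octave cut-offs and all identifiers are assumptions.

```python
# Illustrative sketch of the sonification mapping described above.
# Base notes follow the text (A = 440 Hz for horizontal obstacles,
# C = 264 Hz for vertical ones); the 1 m / 3 m distance cut-offs and
# all names here are assumptions for illustration only.

BASE_FREQ_HZ = {"horizontal": 440.0, "vertical": 264.0}

def obstacle_tone(kind, distance_m):
    """Return the tone frequency (Hz) for an obstacle of the given kind.

    Near obstacles sound in the bass octave (base / 2), mid-distance
    obstacles at the base pitch, and distant ones in the treble
    octave (base * 2).
    """
    base = BASE_FREQ_HZ[kind]
    if distance_m < 1.0:        # near (assumed threshold)
        return base / 2.0       # bass octave
    if distance_m < 3.0:        # mid-distance (assumed threshold)
        return base             # medium octave
    return base * 2.0           # treble octave
```

Under these assumed thresholds, a ground-level obstacle 0.5 m ahead would be signaled with a bass A at 220 Hz.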
4. Evaluation
- The user must complete a full circuit of the scenario without the device, to evaluate the impact of the audio guide on the perception of obstacles;
- The user must complete a full circuit of the scenario with the device;
- At the end of each test run, the user is interviewed to record their usage perceptions.
4.1. Device Setup for the Experiment
4.2. Experiment Setup
- Users were blindfolded to remove the vision-difference factor (partially blind vs. totally blind);
- To avoid information contamination, tested users were interviewed and kept separated from the group.
5. Conclusions and Future Work
- (1) Accuracy is inversely proportional to speed: the more accurate the system, the more complex the algorithmic approach and the more time consumed, and vice versa. It is necessary either to define the primary objective of the system (precision or speed) or to find a balance between the two factors.
- (2) Data collection time is a crucial factor in the viability of indoor navigation. The time taken to collect and identify a candidate position should be as short as possible, so that information is delivered within tolerable safety limits to guide the user.
- (3) The combination of musical beeps and spoken instructions enables the user to avoid obstacles more accurately and provides more efficient guidance along safer routes.
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Mainetti, L.; Patrono, L.; Sergi, I. A survey on indoor positioning systems. In Proceedings of the 2014 22nd International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 17–19 September 2014; pp. 111–120.
- Zengke, L.; Wang, R.; Gao, J.; Wang, J. An Approach to Improve the Positioning Performance of GPS/INS/UWB Integrated System with Two-Step Filter. Remote Sens. 2018, 10, 19.
- Zhu, Y.; Mottaghi, R.; Kolve, E.; Lim, J.J.; Gupta, A.; Fei-Fei, L.; Farhadi, A. Target-driven visual navigation in indoor scenes using deep reinforcement learning. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 3357–3364.
- Bayro Kaiser, E.; Lawo, M. Wearable Navigation System for the Visually Impaired and Blind People. In Proceedings of the IEEE/ACIS 11th International Conference on Computer and Information Science, Shanghai, China, 30 May–1 June 2012.
- Alatise, M.; Hancke, G. Pose Estimation of a Mobile Robot Based on Fusion of IMU Data and Vision Data Using an Extended Kalman Filter. Sensors 2017, 17, 2164.
- Chen, C.; Chai, W.; Wang, S.; Roth, H. A single frame depth visual gyroscope and its integration for robot navigation and mapping in structured indoor environments. J. Intell. Robot. Syst. 2015, 80, 365–374.
- Li, X.; Wang, J.; Liu, C. Heading Estimation with Real-time Compensation Based on Kalman Filter Algorithm for an Indoor Positioning System. ISPRS Int. J. Geo-Inf. 2016, 5, 98.
- Yulong, H.; Yonggang, Z.; Ning, L.; Lin, Z. Particle filter for nonlinear systems with multiple steps randomly delayed measurements. Electron. Lett. 2015, 51, 1859–1861.
- Chuang, R.; Jianping, L.; Yu, W. Map navigation system based on optimal Dijkstra algorithm. In Proceedings of the IEEE 3rd International Conference on Cloud Computing and Intelligence Systems, Shenzhen, China, 27–29 November 2014; pp. 559–564.
- Heya, T.; Arefin, S.; Chakrabarty, A.; Alam, M. Image Processing Based Indoor Localization System for Assisting Visually Impaired People. In Proceedings of the Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS), Wuhan, China, 22–23 March 2018; pp. 1–7.
- Kitt, B.; Geiger, A.; Lategahn, H. Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme. In Proceedings of the IEEE Intelligent Vehicles Symposium, San Diego, CA, USA, 21–24 June 2010; pp. 486–492.
- Xue, H.; Ma, L.; Tan, X. A fast visual map building method using video stream for visual-based indoor localization. In Proceedings of the International Wireless Communications and Mobile Computing Conference (IWCMC), Paphos, Cyprus, 5–9 September 2016; pp. 650–654.
- Zheng, Y.; Shen, G.; Li, L.; Zhao, C.; Li, M.; Zhao, F. Travi-Navi: Self-Deployable Indoor Navigation System. IEEE/ACM Trans. Netw. 2017, 25, 2655–2669.
- Alcantarilla, P.; Yebes, J.; Almazán, J.; Bergasa, L. On Combining Visual SLAM and Dense Scene Flow to Increase the Robustness of Localization and Mapping in Dynamic Environments. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, MN, USA, 14–18 May 2012; pp. 1290–1297.
- Presti, G.; Ahmetovic, D.; Ducci, M.; Bernareggi, C.; Ludovico, L.; Baratè, A.; Avanzini, F.; Mascetti, S. WatchOut: Obstacle Sonification for People with Visual Impairment or Blindness. In Proceedings of ASSETS '19, the 21st International ACM SIGACCESS Conference on Computers and Accessibility, Pittsburgh, PA, USA, 28–30 October 2019; pp. 402–413.
- Massiceti, D.; Hicks, S.; Rheede, J.J. Stereosonic vision: Exploring visual-to-auditory sensory substitution mappings in an immersive virtual reality navigation paradigm. PLoS ONE 2018, 13, e0199389.
- Bujacz, M.; Strumillo, P. Sonification: Review of Auditory Display Solutions in Electronic Travel Aids for the Blind. Arch. Acoust. 2016, 41, 401–414.
- Skulimowski, P.; Owczarek, M.; Radecki, A.; Bujacz, M.; Rzeszotarski, D.; Strumillo, P. Interactive sonification of U-depth images in a navigation aid for the visually impaired. J. Multimodal User Interfaces 2018, 13, 219–230.
- Kumar, S.; Kumar, P.; Pandey, S. Fast integral image computing scheme for vision-based applications. In Proceedings of the 4th IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics (UPCON), Mathura, India, 26–28 October 2017; pp. 491–493.
- Kalra, A.; Chhokar, R.L. A Hybrid Approach Using Sobel and Canny Operator for Digital Image Edge Detection. In Proceedings of the 2016 International Conference on Micro-Electronics and Telecommunication Engineering (ICMETE), Ghaziabad, India, 22–23 September 2016; pp. 305–310.
- Chanama, L.; Wongwitat, O. A comparison of decision tree-based techniques for indoor positioning system. In Proceedings of the IEEE International Conference on Information Networking (ICOIN), Chiang Mai, Thailand, 10–12 January 2018; pp. 732–737.
- Liu, T.; Zhang, X.; Li, Q.; Fang, Z. A Visual-Based Approach for Indoor Radio Map Construction Using Smartphones. Sensors 2017, 17, 1790.
- Zhou, Y.; Chen, H.; Huang, Y.; Luo, Y.; Zhang, Y.; Xie, X. An Indoor Route Planning Method with Environment Awareness. In Proceedings of IGARSS 2018, the IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 2906–2909.
- Krause, J.; Perer, A.; Bertini, E. INFUSE: Interactive feature selection for predictive modeling of high dimensional data. IEEE Trans. Vis. Comput. Graph. 2014, 20, 1614–1623.
- Chen, C.; Yang, B.; Song, S.; Tian, M.; Li, J.; Dai, W.; Fang, L. Calibrate Multiple Consumer RGB-D Cameras for Low-Cost and Efficient 3D Indoor Mapping. Remote Sens. 2018, 10, 328.
- Song, X.; Wang, M.; Qiu, H.; Luo, L. Indoor Pedestrian Self-Positioning Based on Image Acoustic Source Impulse Using a Sensor-Rich Smartphone. Sensors 2018, 18, 4143.
- Zhangaskanov, D.; Zhumatay, N.; Ali, H. Audio-based Smart White Cane for Visually Impaired People. In Proceedings of the 2019 5th International Conference on Control, Automation and Robotics (ICCAR), Beijing, China, 19–22 April 2019; pp. 889–893.
- Spagnol, S.; Hoffmann, R.; Herrera Martínez, M.; Unnthorsson, R. Blind wayfinding with physically-based liquid sounds. Int. J. Hum. Comput. Stud. 2018, 115, 9–19.
Author | Navigation Algorithm | Data Fusion | Alert Type
---|---|---|---
Heya et al., 2018 [10] | SLAM | KNN | Sound
Kitt et al., 2010 [11] | Proximity method | Kalman filter | Visual
Xue et al., 2016 [12] | Proximity method | RANSAC | Visual
Presti et al., 2019 [15] | Proximity method | Weighted average | Polytonic
Massiceti et al., 2018 [16] | Proximity method | KNN | Humming sound
Bujacz et al., 2016 [17] | Proximity method | Particle filter | Humming sound
Alcantarilla et al., 2012 [14] | SLAM | Weighted average | Visual
Chen et al., 2015 [6] | PDR | Kalman filter | Visual
Action | Audio Guide Response
---|---
Move forward | Go ahead
Turn right | Turn right in X meters
Turn left | Turn left in X meters
Turn right immediately | Turn right
Turn left immediately | Turn left
Alert: close obstacle | Stop! Obstacle detected
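The action-to-message table above amounts to a small lookup, sketched below. The dictionary keys and function name are hypothetical; only the message strings follow the table, and the distance placeholder stands in for the table's "X meters".

```python
# Hypothetical lookup for the audio-guide messages tabulated above;
# the keys and function name are illustrative assumptions, while the
# message strings mirror the table.
MESSAGES = {
    "forward": "Go ahead",
    "turn_right_ahead": "Turn right in {d} meters",
    "turn_left_ahead": "Turn left in {d} meters",
    "turn_right_now": "Turn right",
    "turn_left_now": "Turn left",
    "obstacle_close": "Stop! Obstacle detected",
}

def guide_message(action, distance_m=None):
    """Render the spoken instruction for a navigation action."""
    template = MESSAGES[action]
    # Only the look-ahead turn messages carry a distance.
    return template.format(d=distance_m) if "{d}" in template else template
```

For example, `guide_message("turn_right_ahead", 3)` would yield "Turn right in 3 meters", which a text-to-speech engine could then play to the user.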
Location Strategy | Error Margin (m)
---|---
Visual Marker | 0.454 |
Hybrid Marker | 0.108 |
IPS Type | Time (s)
---|---
Location using visual information | 0.17
Hybrid location | 0.07
Technique | Frames Per Second (FPS)
---|---
Image stereo | 9 |
Image stereo with RANSAC | 20 |
Image stereo, RANSAC, and particle filter | 23 |
Region | Height (m) | Distance (m)
---|---|---
Region 1 | 0.101 | 0.212 |
Region 2 | 0.205 | 0.647 |
Region 3 | 0.942 | 0.303 |
Region 4 | 0.942 | 0.129 |
Question | Excellent | Very Good | Good | Satisfactory | Bad
---|---|---|---|---|---
Orientation | 65% | 20% | 5% | 5% | 5%
Independence | 40% | 30% | 25% | 5% | 0%
Location | 80% | 10% | 5% | 5% | 0%
Reliability | 75% | 15% | 6% | 4% | 0%
Response time | 85% | 10% | 3% | 2% | 0%
Usability | 20% | 65% | 10% | 5% | 0%
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Simões, W.C.S.S.; Silva, Y.M.L.R.; Pio, J.L.d.S.; Jazdi, N.; F. de Lucena, V., Jr. Audio Guide for Visually Impaired People Based on Combination of Stereo Vision and Musical Tones. Sensors 2020, 20, 151. https://doi.org/10.3390/s20010151