Abstract
This research addresses the critical need for real-time chart captioning systems for visually impaired individuals in the era of remote communication. The surge in online classes and video meetings has increased reliance on visual data such as charts, which poses a substantial challenge for people with visual impairments. Our study focuses on the development and evaluation of an AI model for real-time interpretation and captioning of charts, with the aim of improving the accessibility and comprehension of visual data for visually impaired users in live settings. By emphasizing real-time performance, the research seeks to bridge the accessibility gap in dynamic, interactive remote environments. The model's effectiveness is assessed in practical scenarios to ensure it meets the immediacy and accuracy requirements of real-time applications. Our work contributes to a more inclusive digital environment, particularly in addressing the challenges posed by the non-face-to-face era.
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00211205, RS-2022-00165818).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Song, Y., Jeong, S., Cho, W., Lim, S.B., Park, J.H. (2024). A Real-Time Chart Explanation System for Visually Impaired Individuals. In: Miesenberger, K., Peňáz, P., Kobayashi, M. (eds.) Computers Helping People with Special Needs. ICCHP 2024. Lecture Notes in Computer Science, vol. 14750. Springer, Cham. https://doi.org/10.1007/978-3-031-62846-7_37
DOI: https://doi.org/10.1007/978-3-031-62846-7_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62845-0
Online ISBN: 978-3-031-62846-7
eBook Packages: Computer Science (R0)