A Real-Time Chart Explanation System for Visually Impaired Individuals

  • Conference paper
Computers Helping People with Special Needs (ICCHP 2024)

Abstract

This research addresses the need for real-time chart captioning systems designed for visually impaired individuals in an era of growing remote communication. The surge in online classes and video meetings has increased reliance on visual data such as charts, which poses a substantial challenge for people with visual impairments. Our study concentrates on developing and evaluating an AI model tailored for real-time interpretation and captioning of charts. The model aims to make visual data more accessible and comprehensible to visually impaired users in live settings. By focusing on real-time performance, the research seeks to bridge the accessibility gap in dynamic, interactive remote environments. The model's effectiveness is assessed in practical scenarios to ensure it meets the immediacy and accuracy requirements of real-time applications. Our work contributes to a more inclusive digital environment, particularly by addressing the challenges of remote, non-face-to-face communication.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00211205, RS-2022-00165818).

Author information

Corresponding author

Correspondence to Joo Hyun Park.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Song, Y., Jeong, S., Cho, W., Lim, SB., Park, J.H. (2024). A Real-Time Chart Explanation System for Visually Impaired Individuals. In: Miesenberger, K., Peňáz, P., Kobayashi, M. (eds) Computers Helping People with Special Needs. ICCHP 2024. Lecture Notes in Computer Science, vol 14750. Springer, Cham. https://doi.org/10.1007/978-3-031-62846-7_37

  • DOI: https://doi.org/10.1007/978-3-031-62846-7_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-62845-0

  • Online ISBN: 978-3-031-62846-7

  • eBook Packages: Computer Science, Computer Science (R0)
