DOI: 10.1145/3597638.3608422
research-article
Open access

VisPhoto: Photography for People with Visual Impairments via Post-Production of Omnidirectional Camera Imaging

Published: 22 October 2023

Abstract

Many people with visual impairments would like to take photographs. However, they often have difficulty pointing the camera at the target. In this paper, we address this problem by proposing a novel photo-taking system called VisPhoto. Unlike conventional methods, VisPhoto generates a photograph in post-production. When the shutter button is pressed, VisPhoto captures an omnidirectional camera image that contains the scene surrounding the camera. In post-production, the system outputs a cropped region as a “photograph” that satisfies the user’s preference. We conducted an experiment consisting of two parts. First, 24 people with visual impairments took photographs with a genuine iPhone camera app, a conventional method, and VisPhoto. Second, 20 sighted people evaluated the quality of the photographs. The experimental results showed that the participants with visual impairments preferred to use VisPhoto to take photographs of difficult targets, whereas they preferred the conventional method for easy targets. Moreover, we found that their preferences for photo-taking methods were influenced by their needs and values regarding photography and their confidence in their photographic abilities.
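
To make the idea of cropping a "photograph" out of an omnidirectional image concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the crop is computed or how the image is stored; this sketch assumes an equirectangular source image and a rectilinear (pinhole) reprojection, which is one common way to do it. All function names and parameters are hypothetical. Given the pixel position of a target in the source image (for example, the center of a detected object), it computes which source pixel each pixel of the output crop would sample.

```python
import math


def pixel_to_angles(x, y, width, height):
    """Map an equirectangular pixel to (yaw, pitch) in radians.

    Assumption: x spans yaw in [-pi, pi) left to right,
    y spans pitch from +pi/2 (top) to -pi/2 (bottom).
    """
    yaw = (x / width - 0.5) * 2.0 * math.pi
    pitch = (0.5 - y / height) * math.pi
    return yaw, pitch


def angles_to_pixel(yaw, pitch, width, height):
    """Inverse of pixel_to_angles; x wraps around, y is clamped."""
    x = (yaw / (2.0 * math.pi) + 0.5) * width
    y = (0.5 - pitch / math.pi) * height
    return x % width, min(max(y, 0.0), height - 1.0)


def crop_sample_coords(yaw0, pitch0, fov_deg, out_w, out_h, src_w, src_h):
    """Return, for every output pixel, the source (x, y) to sample.

    The output is a virtual pinhole camera aimed at (yaw0, pitch0)
    with the given horizontal field of view in degrees.
    """
    focal = (out_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    cos_p, sin_p = math.cos(pitch0), math.sin(pitch0)
    cos_y, sin_y = math.cos(yaw0), math.sin(yaw0)
    rows = []
    for j in range(out_h):
        row = []
        for i in range(out_w):
            # Ray through the pixel center; camera frame is
            # x right, y up, z forward.
            cx = i + 0.5 - out_w / 2.0
            cy = out_h / 2.0 - (j + 0.5)
            cz = focal
            # Tilt the ray up by pitch0 (rotation about the x axis).
            y1 = cy * cos_p + cz * sin_p
            z1 = -cy * sin_p + cz * cos_p
            # Turn the ray by yaw0 (rotation about the y axis).
            x2 = cx * cos_y + z1 * sin_y
            z2 = -cx * sin_y + z1 * cos_y
            # Convert the world-space ray back to spherical angles.
            yaw = math.atan2(x2, z2)
            pitch = math.atan2(y1, math.hypot(x2, z2))
            row.append(angles_to_pixel(yaw, pitch, src_w, src_h))
        rows.append(row)
    return rows
```

A caller would first convert the target's pixel position with pixel_to_angles, pass the resulting angles to crop_sample_coords, and then fill the output image by sampling the source at the returned coordinates (nearest-neighbor or bilinear). Because the crop is computed after capture, the viewing direction and field of view can be chosen freely, which is what lets the user avoid aiming the camera at shutter time.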

Supplemental Material

MP4 File
Video figure

Cited By

  • (2024) ImageExplorer Deployment: Understanding Text-Based and Touch-Based Image Exploration in the Wild. Proceedings of the 21st International Web for All Conference, 59-69. DOI:10.1145/3677846.3677861. Online publication date: 13-May-2024
  • (2024) EditScribe: Non-Visual Image Editing with Natural Language Verification Loops. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-19. DOI:10.1145/3663548.3675599. Online publication date: 27-Oct-2024
  • (2024) WorldScribe: Towards Context-Aware Live Visual Descriptions. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-18. DOI:10.1145/3654777.3676375. Online publication date: 13-Oct-2024

    Published In

    ASSETS '23: Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility
    October 2023
    1163 pages
    ISBN:9798400702204
    DOI:10.1145/3597638
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Photography
    2. object detection
    3. omnidirectional camera
    4. sound recording

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • JST Kakenhi
    • The Telecommunication Advancement Foundation

    Conference

    ASSETS '23

    Acceptance Rates

    ASSETS '23 Paper Acceptance Rate 55 of 182 submissions, 30%;
    Overall Acceptance Rate 436 of 1,556 submissions, 28%
