DOI: 10.1145/3373625.3418031
poster

HoloSound: Combining Speech and Sound Identification for Deaf or Hard of Hearing Users on a Head-mounted Display

Published: 29 October 2020

Abstract

Head-mounted displays can provide private and glanceable speech and sound feedback to deaf and hard of hearing people, yet prior systems have largely focused on speech transcription. We introduce HoloSound, a HoloLens-based augmented reality (AR) prototype that uses deep learning to classify and visualize sound identity and location in addition to providing speech transcription. This poster paper presents a working proof-of-concept prototype and discusses future opportunities for advancing AR-based sound awareness.

Supplementary Material

a71-guo-supplement (a71-guo-supplement.mp4)
This supplementary video demonstrates HoloSound's key functionalities with real-life scenarios and briefly introduces its system design and user interface.

    Information & Contributors

    Information

    Published In

    ASSETS '20: Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility
    October 2020
    764 pages
    ISBN: 9781450371032
    DOI: 10.1145/3373625
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Augmented reality
    2. deaf
    3. hard of hearing
    4. head-mounted display
    5. real-time captioning
    6. sound awareness
    7. sound localization
    8. sound recognition
    9. speech-transcription

    Qualifiers

    • Poster
    • Research
    • Refereed limited

    Funding Sources

    • National Science Foundation Grant
    • University of Washington Reality Lab Grant
    • Google Faculty Research Award

    Conference

    ASSETS '20

    Acceptance Rates

    ASSETS '20 Paper Acceptance Rate 46 of 167 submissions, 28%;
    Overall Acceptance Rate 436 of 1,556 submissions, 28%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 161
    • Downloads (Last 6 weeks): 27
    Reflects downloads up to 23 Nov 2024

    Citations

    Cited By

    • (2024) Exploring Visual Scanning in Augmented Reality: Perspectives From Deaf and Hard of Hearing Users. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-6. DOI: 10.1145/3663548.3688535. Online publication date: 27-Oct-2024
    • (2024) Envisioning Collective Communication Access: A Theoretically-Grounded Review of Captioning Literature from 2013-2023. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-18. DOI: 10.1145/3663548.3675649. Online publication date: 27-Oct-2024
    • (2024) Communication, Collaboration, and Coordination in a Co-located Shared Augmented Reality Game: Perspectives From Deaf and Hard of Hearing People. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-14. DOI: 10.1145/3613904.3642953. Online publication date: 11-May-2024
    • (2024) MyWay: a 3D and audio-enhanced transportation learning kit for the visually impaired teenagers. CCF Transactions on Pervasive Computing and Interaction. DOI: 10.1007/s42486-024-00163-y. Online publication date: 23-Jul-2024
    • (2024) Deaf and Hard of Hearing People’s Perspectives on Augmented Reality Interfaces for Improving the Accessibility of Smart Speakers. Universal Access in Human-Computer Interaction, 334-357. DOI: 10.1007/978-3-031-60881-0_21. Online publication date: 29-Jun-2024
    • (2023) Augmented-Reality Presentation of Household Sounds for Deaf and Hard-of-Hearing People. Sensors 23:17, 7616. DOI: 10.3390/s23177616. Online publication date: 2-Sep-2023
    • (2023) Communication and Collaboration Among DHH People in a Co-located Collaborative Multiplayer AR Environment. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, 1-5. DOI: 10.1145/3597638.3614479. Online publication date: 22-Oct-2023
    • (2023) “Not There Yet”: Feasibility and Challenges of Mobile Sound Recognition to Support Deaf and Hard-of-Hearing People. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, 1-14. DOI: 10.1145/3597638.3608431. Online publication date: 22-Oct-2023
    • (2023) LiveLocalizer: Augmenting Mobile Speech-to-Text with Microphone Arrays, Optimized Localization and Beamforming. Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-3. DOI: 10.1145/3586182.3615789. Online publication date: 29-Oct-2023
    • (2023) DHH People in Co-located Collaborative Multiplayer AR Environments. Companion Proceedings of the Annual Symposium on Computer-Human Interaction in Play, 344-347. DOI: 10.1145/3573382.3616039. Online publication date: 6-Oct-2023
