DOI: 10.1145/3308561.3353772
Research article

Exploration of Automatic Speech Recognition for Deaf and Hard of Hearing Students in Higher Education Classes

Published: 24 October 2019

Abstract

Automatic speech recognition (ASR) programs that generate real-time speech-to-text captions can be provided as supplemental access technologies for deaf and hard of hearing (DHH) students in higher education classes. As part of a pilot program, we implemented ASR as a supplemental access service in biology, statistics, and other courses at our university. To identify the benefits and limitations of ASR as an access technology, we surveyed 26 DHH students and interviewed 8 of these students about their experiences with ASR in their mainstream classes. Participants believed that ASR was beneficial despite the errors that ASR continued to generate; however, the accuracy and readability of ASR need to improve so that students can better access spoken information through ASR. This paper reviews points for researchers to consider when designing and providing ASR as a supplemental access service in educational settings.




        Published In

        ASSETS '19: Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility
        October 2019, 730 pages
        ISBN: 9781450366762
        DOI: 10.1145/3308561
        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. automatic speech recognition
        2. deaf and hard of hearing
        3. feedback
        4. real-time captions

        Qualifiers

        • Research-article

        Conference

        ASSETS '19

        Acceptance Rates

        ASSETS '19 paper acceptance rate: 41 of 158 submissions (26%)
        Overall acceptance rate: 436 of 1,556 submissions (28%)

        Article Metrics

        • Downloads (last 12 months): 97
        • Downloads (last 6 weeks): 13
        Reflects downloads up to 14 Dec 2024

        Cited By
        • (2024) Navigating the Path. Innovative Approaches in Counselor Education for Students With Disabilities, 10.4018/979-8-3693-3342-6.ch006, 141-196. Online publication date: 13-Dec-2024
        • (2024) Envisioning Collective Communication Access: A Theoretically-Grounded Review of Captioning Literature from 2013-2023. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 10.1145/3663548.3675649, 1-18. Online publication date: 27-Oct-2024
        • (2024) Towards Improving Real-Time Head-Worn Display Caption Mediated Conversations with Speaker Feedback for Hearing Conversation Partners. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 10.1145/3613905.3650976, 1-11. Online publication date: 11-May-2024
        • (2024) SeEar: Tailoring Real-time AR Caption Interfaces for Deaf and Hard-of-Hearing (DHH) Students in Specialized Educational Settings. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 10.1145/3613905.3650974, 1-8. Online publication date: 11-May-2024
        • (2024) What Factors Motivate Culturally Deaf People to Want Assistive Technologies? Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 10.1145/3613905.3650871, 1-7. Online publication date: 11-May-2024
        • (2024) Caption Royale: Exploring the Design Space of Affective Captions from the Perspective of Deaf and Hard-of-Hearing Individuals. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 10.1145/3613904.3642258, 1-17. Online publication date: 11-May-2024
        • (2024) “Caption It in an Accessible Way That Is Also Enjoyable”: Characterizing User-Driven Captioning Practices on TikTok. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 10.1145/3613904.3642177, 1-16. Online publication date: 11-May-2024
        • (2024) Inclusive Deaf Education Enabled by Artificial Intelligence: The Path to a Solution. International Journal of Artificial Intelligence in Education, 10.1007/s40593-024-00419-9. Online publication date: 24-Jul-2024
        • (2023) Understanding Social and Environmental Factors to Enable Collective Access Approaches to the Design of Captioning Technology. ACM SIGACCESS Accessibility and Computing, 10.1145/3584732.3584735, 1-1. Online publication date: 15-Feb-2023
        • (2023) Modeling and Improving Text Stability in Live Captions. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 10.1145/3544549.3585609, 1-9. Online publication date: 19-Apr-2023
