research-article

SpeechBubbles: Enhancing Captioning Experiences for Deaf and Hard-of-Hearing People in Group Conversations

Authors:

Tzu-chuan Chen,

Hsien-Hui Tang,

Mike Y. Chen

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems

Paper No.: 293, Pages 1 - 10

https://doi.org/10.1145/3173574.3173867

Published: 21 April 2018

Abstract

Deaf and hard-of-hearing (DHH) individuals encounter difficulties when engaged in group conversations with hearing individuals, due to factors such as simultaneous utterances from multiple speakers and speakers who may be out of view. We interviewed and co-designed with eight DHH participants to address the following challenges: 1) associating utterances with speakers, 2) ordering utterances from different speakers, 3) displaying optimal content length, and 4) visualizing utterances from out-of-view speakers. We evaluated multiple designs for each of the four challenges through a user study with twelve DHH participants. The results showed that participants significantly preferred speech-bubble visualizations over traditional captions. These design preferences guided our development of SpeechBubbles, a real-time speech recognition interface prototype on an augmented reality head-mounted display. Our evaluations further showed that DHH participants preferred the prototype over traditional captions for group conversations.




Index Terms

1. Information systems
  1. Information systems applications
    1. Multimedia information systems
2. Social and professional topics


Published In

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems

April 2018

8489 pages

ISBN:9781450356206

DOI:10.1145/3173574

General Chairs:
Regan Mandryk (University of Saskatchewan, Canada), Mark Hancock (University of Waterloo, Canada)

Program Chairs:
Mark Perry (Brunel University London, UK), Anna Cox (University College London, UK)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CHI '18

Sponsor:

SIGCHI

CHI '18: CHI Conference on Human Factors in Computing Systems

April 21 - 26, 2018

Montreal QC, Canada

Acceptance Rates

CHI '18 Paper Acceptance Rate: 666 of 2,590 submissions, 26%

Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%



