Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech

Published: 08 October 2018

Abstract

The usage of smartphone-based virtual assistants (e.g., Siri or Google Assistant) is growing, and their spread generally has a positive impact on device accessibility, for example for people with disabilities. However, people with dysarthria or other speech impairments may be unable to use these virtual assistants proficiently. This paper investigates to what extent people with ALS-induced dysarthria can be understood by, and get consistent answers from, three widely used smartphone-based assistants: Siri, Google Assistant, and Cortana. We focus on the recognition of Italian dysarthric speech, to study the behavior of the virtual assistants with this specific population, for which no relevant studies are available. We collected and recorded suitable speech samples from people with dysarthria at a dedicated center of the Molinette hospital in Turin, Italy. Starting from those recordings, we investigate and discuss the differences among the assistants in terms of speech recognition and consistency of answers. The results highlight different performance among the virtual assistants. For speech recognition, Google Assistant is the most promising, with a word error rate of around 25% per sentence. For consistency of answers, Siri and Google Assistant provide coherent answers around 60% of the time.
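The abstract reports recognition quality as a word error rate (WER) per sentence but does not spell out how it was computed. As a rough illustration only, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, insertions) between what was said and what the assistant transcribed, divided by the number of words in the reference sentence. The function names, the lowercase/whitespace tokenization, and the Italian sample sentences are assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; the paper's own normalization may differ.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def mean_sentence_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average the per-sentence WER over (reference, hypothesis) pairs."""
    rates = [word_error_rate(ref, hyp) for ref, hyp in pairs]
    if not rates:
        raise ValueError("at least one sentence pair is required")
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Hypothetical sentences, not data from the study.
    samples = [
        ("che tempo fa oggi", "che tempo fa oggi"),
        ("che tempo fa oggi", "che tempo fa"),
    ]
    print(mean_sentence_wer(samples))
```

Averaging per sentence, as done here, weights short and long sentences equally. Pooling all errors over all reference words is another common convention and can give a different figure on the same transcripts.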

Published In

ASSETS '18: Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility
October 2018
508 pages
ISBN:9781450356503
DOI:10.1145/3234695
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accessibility
  2. automatic speech recognition
  3. conversational assistant
  4. dysarthria
  5. speech impairment

Qualifiers

  • Research-article

Conference

ASSETS '18
Acceptance Rates

ASSETS '18 paper acceptance rate: 28 of 108 submissions (26%)
Overall acceptance rate: 436 of 1,556 submissions (28%)
