DOI: 10.1145/332040.332451

The effect of task conditions on the comprehensibility of synthetic speech

Published: 01 April 2000

Abstract

A study with 78 subjects evaluated the comprehensibility of synthetic speech across tasks ranging from short, simple e-mail messages to longer news articles on mostly obscure topics. Comprehension accuracy for each subject was measured for synthetic speech and for recorded human speech. Half the subjects were allowed to take notes while listening; the other half were not. There was no significant difference in comprehension of synthetic speech among the five text-to-speech engines used. Subjects who did not take notes performed significantly worse on all synthetic voice tasks than on recorded speech tasks. Performance for synthetic speech in the non-note-taking condition degraded as the tasks got longer and more complex. Subjects who took notes also did significantly worse in the synthetic voice condition when averaged across all six tasks. However, average performance scores for the last three tasks in this condition were comparable for human and synthetic speech, reflecting a training effect.

Published In

CHI '00: Proceedings of the SIGCHI conference on Human Factors in Computing Systems
April 2000
587 pages
ISBN: 1581132166
DOI: 10.1145/332040

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. comprehension
  2. synthetic speech
  3. text-to-speech
  4. user study

Qualifiers

  • Article

Conference

CHI00
CHI00: Human Factors in Computing Systems
April 1 - 6, 2000
The Hague, The Netherlands

Acceptance Rates

CHI '00 Paper Acceptance Rate 72 of 336 submissions, 21%;
Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
