DOI: 10.1145/1088463.1088494
Article

Audio-visual cues distinguishing self- from system-directed speech in younger and older adults

Published: 04 October 2005

Abstract

Despite interest in developing robust open-microphone engagement techniques for mobile use and natural field contexts, no reliable techniques are currently available. One problem is the lack of empirically grounded models to guide the distinction of how users' audio-visual activity actually differs systematically when they address a computer versus a human partner. In particular, existing techniques have not been designed to handle high levels of user self-talk as a source of "noise," and they typically assume that a user is addressing the system only when facing it while speaking. In the present research, data were collected during two related studies in which adults aged 18-89 interacted multimodally using speech and pen with a simulated map system. Results revealed that people engaged in self-talk before addressing the system over 30% of the time, with no decrease in younger adults' rate of self-talk compared with that of older adults. Speakers' amplitude was lower during 96% of their self-talk, with a substantial 26 dBr amplitude separation observed between self- and system-directed speech. The magnitude of speakers' amplitude separation ranged from approximately 10 to 60 dBr and diminished with age, with 79% of the variance predictable simply from a person's age. In contrast to the clear differentiation of intended addressee revealed by amplitude separation, gaze at the system was not a reliable indicator of speech directed to the system: users looked at the system over 98% of the time during both self- and system-directed speech. These results have implications for the design of more effective open-microphone engagement for mobile and pervasive systems.
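
The amplitude findings suggest a simple engagement heuristic: compare each utterance's level against the speaker's typical system-directed level and treat markedly quieter speech as self-talk. Below is a minimal Python sketch of that idea; it is not the authors' implementation, and the function names, the per-user baseline, and the 13 dB margin (half of the reported mean 26 dBr separation) are illustrative assumptions.

    import numpy as np

    def utterance_db(samples: np.ndarray, ref: float = 1.0) -> float:
        """RMS amplitude of an utterance, in dB relative to `ref` (hence dBr)."""
        rms = np.sqrt(np.mean(np.square(samples.astype(np.float64))))
        return 20.0 * np.log10(max(rms / ref, 1e-12))  # floor avoids log(0)

    def classify_addressee(level_db: float, system_baseline_db: float,
                           margin_db: float = 13.0) -> str:
        """Label speech as self- or system-directed from amplitude alone.

        system_baseline_db: the speaker's typical level when addressing the
        system (e.g., estimated from a few enrollment utterances; hypothetical).
        margin_db: 13.0 is an assumed midpoint of the ~26 dBr mean separation
        reported in the abstract, not a value taken from the paper.
        """
        if level_db >= system_baseline_db - margin_db:
            return "system-directed"
        return "self-directed"

Because the reported separation narrows with age (79% of its variance is predictable from age alone, implying a correlation of roughly -0.89 between age and separation), a deployed system would presumably adapt margin_db per user rather than fix it globally.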





Published In

ICMI '05: Proceedings of the 7th international conference on Multimodal interfaces
October 2005
344 pages
ISBN: 1595930280
DOI: 10.1145/1088463
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. gaze
  2. individual differences
  3. intended addressee
  4. multimodal interaction
  5. open-microphone engagement
  6. spoken amplitude
  7. system adaptation
  8. universal access
  9. user modeling


Conference

ICMI '05

Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions (42%)


Cited By

  • (2016) OPAC Usability Problems of Archives. International Journal of Systems and Service-Oriented Engineering, 6(1), 54-70. DOI: 10.4018/IJSSOE.2016010104. Online publication date: 1-Jan-2016.
  • (2013) Self-talk Discrimination in Human–Robot Interaction Situations for Supporting Social Awareness. International Journal of Social Robotics, 5(2), 277-289. DOI: 10.1007/s12369-013-0179-x. Online publication date: 26-Jan-2013.
  • (2012) Multimodal Interfaces. Human–Computer Interaction Handbook, 405-430. DOI: 10.1201/b11963-22. Online publication date: 14-May-2012.
  • (2008) Implicit user-adaptive system engagement in speech and pen interfaces. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 969-978. DOI: 10.1145/1357054.1357204. Online publication date: 6-Apr-2008.
  • (2008) Multimodal Interfaces. HCI Beyond the GUI, 391-444. DOI: 10.1016/B978-0-12-374017-5.00012-2. Online publication date: 2008.
  • (2006) Toward open-microphone engagement for multiparty interactions. Proceedings of the 8th International Conference on Multimodal Interfaces, 273-280. DOI: 10.1145/1180995.1181049. Online publication date: 2-Nov-2006.
  • (2006) GSI demo. Proceedings of the 8th International Conference on Multimodal Interfaces, 76-83. DOI: 10.1145/1180995.1181012. Online publication date: 2-Nov-2006.
  • (2006) Human perception of intended addressee during computer-assisted meetings. Proceedings of the 8th International Conference on Multimodal Interfaces, 20-27. DOI: 10.1145/1180995.1181002. Online publication date: 2-Nov-2006.
