Speech dialogue with facial displays: multimodal human-computer conversation

Authors:

Akikazu Takeuchi

ACL '94: Proceedings of the 32nd annual meeting on Association for Computational Linguistics

Pages 102 - 109

https://doi.org/10.3115/981732.981747

Published: 27 June 1994

Abstract

Human face-to-face conversation is an ideal model for human-computer dialogue. One of the major features of face-to-face communication is its multiplicity of communication channels that act on multiple modalities. To realize a natural multimodal dialogue, it is necessary to study how humans perceive information and determine the information to which humans are sensitive. A face is an independent communication channel that conveys emotional and conversational signals, encoded as facial expressions. We have developed an experimental system that integrates speech dialogue and facial animation, to investigate the effect of introducing communicative facial expressions as a new modality in human-computer conversation. Our experiments have shown that facial expressions are helpful, especially upon first contact with the system. We have also discovered that featuring facial expressions at an early stage improves subsequent interaction.

Published In

ACL '94: Proceedings of the 32nd annual meeting on Association for Computational Linguistics

June 1994

353 pages

Program Chair:
James Pustejovsky
Brandeis University

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
