Abstract
Embodied Conversational Agents (ECAs) with realistic faces are becoming an intrinsic part of many graphics systems used in HCI applications. A fundamental issue is how people visually perceive the affect of a speaking agent. In this paper we present the first study evaluating the relation between objective and subjective visual perception of emotion displayed on a speaking human face, using both full video and sparse point-rendered representations of the face. We found that objective machine-learning analysis of facial marker motion data correlates with evaluations made by experimental subjects and, in particular, that the lower face region provides informative cues for visual emotion perception. We also found that affect is preserved in the abstract point-rendered representation.
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Deng, Z., Bailenson, J., Lewis, J.P., Neumann, U. (2006). Perceiving Visual Emotions with Speech. In: Gratch, J., Young, M., Aylett, R., Ballin, D., Olivier, P. (eds) Intelligent Virtual Agents. IVA 2006. Lecture Notes in Computer Science, vol 4133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11821830_9
Print ISBN: 978-3-540-37593-7
Online ISBN: 978-3-540-37594-4