Abstract
It is often assumed that one person in a conversation is active (the speaker) while the rest are passive (the listeners). Conversation analysis has shown, however, that listeners take an active part in the conversation, providing feedback signals that help control conversational flow. The face plays a vital role in these backchannel responses. A deeper understanding of facial backchannel signals is crucial for many applications in social signal processing, including the automatic modeling and analysis of conversations and the development of life-like, effective conversational agents. Here, we present results from two experiments testing sensitivity to the context and timing of backchannel responses. We used sequences from a newly recorded database of 5-minute, two-person conversations. Experiment 1 tested how well participants could match backchannel sequences to their corresponding speaker sequences. On average, participants performed well above chance. Experiment 2 tested how sensitive participants were to temporal misalignments of the backchannel sequence. Interestingly, participants were able to estimate the correct temporal alignment for the sequence pairs. Taken together, our results show that human conversational skills are highly tuned to both context and temporal alignment, underscoring the need for accurate modeling of conversations in social signal processing.
© 2013 Springer-Verlag Berlin Heidelberg
Aubrey, A.J., Cunningham, D.W., Marshall, D., Rosin, P.L., Shin, A., Wallraven, C. (2013). The Face Speaks: Contextual and Temporal Sensitivity to Backchannel Responses. In: Park, JI., Kim, J. (eds) Computer Vision - ACCV 2012 Workshops. ACCV 2012. Lecture Notes in Computer Science, vol 7729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37484-5_21
Print ISBN: 978-3-642-37483-8
Online ISBN: 978-3-642-37484-5