
Head movements, facial expressions and feedback in conversations: empirical evidence from Danish multimodal data

  • Original Paper
  • Published in: Journal on Multimodal User Interfaces

Abstract

This article deals with multimodal feedback in two Danish multimodal corpora: a collection of map-task dialogues and a corpus of free conversations recorded in first encounters between pairs of subjects. Machine learning techniques are applied to both data sets to investigate the relations between non-verbal behaviour—more specifically head movements and facial expressions—and speech in the expression of feedback. In the map-task data, we study the extent to which the dialogue act type of linguistic feedback expressions can be classified automatically from the non-verbal features. In the conversational data, on the other hand, non-verbal and speech features are used together to distinguish feedback from other multimodal behaviours. The results of the two sets of experiments indicate, in general, that head movements, and to a lesser extent facial expressions, are important indicators of feedback, and that gestures and speech disambiguate each other in the machine learning process.
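The kind of classification experiment the abstract describes can be made concrete with a minimal sketch. Everything below is illustrative only: the feature names, values, labels and classifier are invented for this example and do not reflect the authors' annotation scheme, feature set or classifier choice; the sketch merely shows how categorical gesture and speech attributes can be combined to learn a feedback/non-feedback distinction.

```python
# Minimal sketch of the kind of classification experiment the abstract
# describes: deciding whether a multimodal behaviour expresses feedback,
# using categorical head-movement, facial-expression and speech features.
# All feature names, values and labels are invented for illustration and
# are NOT the authors' annotation scheme or feature set.
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB

# Hypothetical annotated behaviours: one dict of categorical attributes
# per behaviour, plus a label saying whether it expresses feedback.
samples = [
    {"head": "nod",   "face": "smile", "speech": "ja"},
    {"head": "nod",   "face": "none",  "speech": "mm"},
    {"head": "shake", "face": "frown", "speech": "nej"},
    {"head": "none",  "face": "raise", "speech": "hvor"},
    {"head": "tilt",  "face": "none",  "speech": "og"},
    {"head": "none",  "face": "none",  "speech": "derefter"},
]
labels = ["feedback", "feedback", "feedback", "other", "other", "other"]

# One-hot encode the categorical features so a standard classifier
# can consume them.
X = DictVectorizer(sparse=False).fit_transform(samples)

# A simple probabilistic classifier over the binary indicator features,
# evaluated with 3-fold cross-validation; a real experiment would report
# accuracy against a majority-class baseline on corpus-scale annotations.
scores = cross_val_score(BernoulliNB(), X, labels, cv=3)
print("mean cross-validated accuracy: %.2f" % scores.mean())
```

A real experiment along these lines would of course use far richer features (movement dynamics, prosody, dialogue context) and many more annotated behaviours; the sketch only makes the pipeline of encoding and cross-validated classification concrete.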



Acknowledgments

This work has been carried out in the context of the VKK and NOMCO projects. VKK is funded by the Danish Research Council for the Humanities. NOMCO is funded by the NORDCORP programme under the Nordic Research Councils for the Humanities and the Social Sciences (NOS-HS). We would like to thank our partners in the NOMCO project, Elisabeth Ahlsén, Jens Allwood and Kristiina Jokinen; the annotators Sara Andersen, Josephine B. Arrild, Anette Studsgård and Bjørn Wesseltolvig; and the two anonymous reviewers.

Author information

Correspondence to Patrizia Paggio.


About this article

Cite this article

Paggio, P., Navarretta, C. Head movements, facial expressions and feedback in conversations: empirical evidence from Danish multimodal data. J Multimodal User Interfaces 7, 29–37 (2013). https://doi.org/10.1007/s12193-012-0105-9