DOI: 10.1145/3610661.3616188
SMYLE: A new multimodal resource of talk-in-interaction including neuro-physiological signal

Published: 09 October 2023

Abstract

This article presents the SMYLE corpus, the first multimodal corpus in French (16h) that includes neuro-physiological data from 60 participants engaged in face-to-face storytelling (8.2h) and free-conversation (7.8h) tasks. The originality of this corpus lies first in the fact that it covers all modalities, precisely synchronized, and second in the addition, for the first time at this scale, of neuro-physiological modalities. It is the first corpus of this size offering the opportunity to investigate the cognitive characteristics of spontaneous conversation, including at the brain level. The storytelling task comprises two conditions: a storyteller talking with a “normal” or a “distracted” listener. Contrasting normal and disrupted conversations makes it possible to study, at the behavioral, linguistic and cognitive levels, the complex characteristics and organization of conversation.
In this article, we first present the methodology developed to acquire and synchronize the different sources and types of signal. We then detail the large set of automatic, semi-automatic and manual annotations of the complete dataset. In a last section, we illustrate one application of the corpus by providing preliminary analyses of the annotated data, which reveal the impact of the listener’s distraction on his/her feedback and on the quality of the narration.


Published In

ICMI '23 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction
October 2023
434 pages
ISBN:9798400703218
DOI:10.1145/3610661
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Automatic annotation
  2. Interaction
  3. Multimodal dataset

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICMI '23

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

