DOI: 10.1145/3382507.3418853
research-article
Open access

Multimodal Automatic Coding of Client Behavior in Motivational Interviewing

Published: 22 October 2020

Abstract

Motivational Interviewing (MI) is defined as a collaborative conversation style that evokes the client's own intrinsic reasons for behavioral change. In MI research, the client's attitude toward change (willingness or resistance), as expressed through language, has been identified as an important indicator of subsequent behavior change. Automated coding of these indicators provides a systematic and efficient means for analyzing and assessing MI therapy sessions. In this paper, we study behavioral cues in client language and speech that indicate the client's behavior toward change during a therapy session, using a database of dyadic motivational interviews between therapists and clients with alcohol-related problems. Deep language and voice encoders, i.e., BERT and VGGish, trained on large amounts of data, are used to extract features from each utterance. We develop a neural network to automatically detect the MI codes using both the clients' and therapists' language and the clients' voice, and demonstrate the importance of semantic context in such detection. Additionally, we develop machine learning models for predicting alcohol-use behavioral outcomes of clients through language and voice analysis. Our analysis demonstrates that MI codes can be estimated from clients' textual utterances together with the preceding textual context from both the therapist and the client, reaching an F1-score of 0.72 for a speaker-independent three-class classification. We also report initial results on using the clients' data to predict behavioral outcomes, which outline directions for future work.
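As an illustration of the evaluation setup described in the abstract, the following minimal Python sketch shows how client utterances might be paired with preceding dialogue context, how a speaker-independent split can be formed, and how an F1-score can be computed for three classes. It is not the authors' implementation. The class names, the function names, the context size, the round-robin fold assignment, and the choice of macro averaging are all assumptions made for this example; the abstract does not specify them.

```python
# Hypothetical sketch; all names and defaults below are illustrative assumptions.

LABELS = ("change", "sustain", "neutral")  # placeholder names for the three classes


def build_context_windows(session, n_context=2):
    """Pair each client utterance with up to n_context preceding utterances.

    `session` is a list of (speaker, text, label) tuples in spoken order;
    therapist turns carry the label None.
    """
    samples = []
    for i, (speaker, text, label) in enumerate(session):
        if speaker != "client":
            continue
        # Context may come from either speaker, as in the abstract.
        context = [f"{s}: {t}" for s, t, _ in session[max(0, i - n_context):i]]
        samples.append({"context": context, "utterance": text, "label": label})
    return samples


def speaker_independent_folds(client_ids, n_folds=3):
    """Assign every client to exactly one fold (round-robin over sorted IDs).

    Because all utterances of a client share a fold, no client appears in
    both the training and the test portion of any split.
    """
    unique = sorted(set(client_ids))
    fold_of = {cid: k % n_folds for k, cid in enumerate(unique)}
    return [fold_of[cid] for cid in client_ids]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a full pipeline, the text in each window would be passed to a language encoder such as BERT and the audio to VGGish before classification; those steps are omitted here because they depend on model weights and data that are not part of this page.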

Supplementary Material

MP4 File (3382507.3418853.mp4)
This video provides a summary of the paper "Multimodal Automatic Coding of Client Behavior in Motivational Interviewing", describing the dataset, the developed model, and the results obtained on two tasks: (1) multimodal automated coding of client utterances and (2) multimodal prediction of behavioral outcomes using in-session client data. We provide state-of-the-art multimodal automated coding of client utterances and demonstrate promising results for predicting behavioral outcomes using in-session client data from one-time interaction windows.



Published In

ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction
October 2020
920 pages
ISBN: 9781450375818
DOI: 10.1145/3382507
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. human behavior
  2. machine learning
  3. mental health
  4. motivational interviewing

Qualifiers

  • Research-article

Conference

ICMI '20: International Conference on Multimodal Interaction
October 25 - 29, 2020
Virtual Event, Netherlands

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Article Metrics

  • Downloads (Last 12 months): 219
  • Downloads (Last 6 weeks): 31
Reflects downloads up to 28 Feb 2025

Cited By

  • (2024) Scoping review on natural language processing applications in counselling and psychotherapy. British Journal of Psychology. DOI: 10.1111/bjop.12721. Online publication date: 2-Aug-2024
  • (2024) Integration of BERT Models in NAO Robot for Library Assistance. 2024 Brazilian Symposium on Robotics (SBR) and 2024 Workshop on Robotics in Education (WRE), pp. 7-12. DOI: 10.1109/SBR/WRE63066.2024.10838042. Online publication date: 13-Nov-2024
  • (2024) Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts. 2024 IEEE 12th International Conference on Healthcare Informatics (ICHI), pp. 392-401. DOI: 10.1109/ICHI61247.2024.00057. Online publication date: 3-Jun-2024
  • (2023) Deciphering Entrepreneurial Pitches: A Multimodal Deep Learning Approach to Predict Probability of Investment. Proceedings of the 25th International Conference on Multimodal Interaction, pp. 144-152. DOI: 10.1145/3577190.3614146. Online publication date: 9-Oct-2023
  • (2023) Multimodal Analysis and Assessment of Therapist Empathy in Motivational Interviews. Proceedings of the 25th International Conference on Multimodal Interaction, pp. 406-415. DOI: 10.1145/3577190.3614105. Online publication date: 9-Oct-2023
  • (2023) Towards Cross-Content Conversational Agents for Behaviour Change: Investigating Domain Independence and the Role of Lexical Features in Written Language Around Change. Proceedings of the 5th International Conference on Conversational User Interfaces, pp. 1-13. DOI: 10.1145/3571884.3597136. Online publication date: 19-Jul-2023
  • (2023) Therapist Empathy Assessment in Motivational Interviews. 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 1-8. DOI: 10.1109/ACII59096.2023.10388176. Online publication date: 10-Sep-2023
  • (2023) Natural language processing for mental health interventions: a systematic review and research framework. Translational Psychiatry 13:1. DOI: 10.1038/s41398-023-02592-2. Online publication date: 6-Oct-2023
  • (2022) Automated Detection of the Competency of Delivering Guided Self-Help for Anxiety via Speech and Language Processing. Applied Sciences 12:17 (8608). DOI: 10.3390/app12178608. Online publication date: 28-Aug-2022
  • (2022) Detecting Change Talk in Motivational Interviewing using Verbal and Facial Information. Proceedings of the 2022 International Conference on Multimodal Interaction, pp. 5-14. DOI: 10.1145/3536221.3556607. Online publication date: 7-Nov-2022
