DOI: 10.1145/2988257.2988269

Decision Tree Based Depression Classification from Audio Video and Language Information

Published: 16 October 2016

Abstract

To improve the recognition accuracy for the Depression Classification Sub-Challenge (DCC) of AVEC 2016, this paper proposes a decision tree for depression classification. The tree is constructed according to the distribution of the multimodal predictions of PHQ-8 scores and of participant characteristics (PTSD/depression diagnosis, sleep status, feeling, and personality) obtained by analyzing the participants' transcript files. The proposed gender-specific decision tree provides a way of fusing upper-level language information with the results obtained from low-level audio and visual features. Experiments carried out on the Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) database show that the proposed classification schemes obtain very promising results on the development set, with F1 scores reaching 0.857 for the depressed class and 0.964 for the not-depressed class. Despite over-fitting in training the PHQ-8 prediction models, the classification schemes still perform well on the test set: the F1 score reaches 0.571 for the depressed class and 0.877 for the not-depressed class, an average of 0.724, which is higher than the baseline result of 0.700.
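As a concrete illustration of the fusion scheme the abstract describes, the sketch below hand-codes a small decision tree that falls back on transcript-derived participant characteristics when the multimodal PHQ-8 prediction is ambiguous. The thresholds (7 and 13), the two transcript cues, and the exact gender-specific branching are illustrative assumptions, not the rules from the paper; only the general idea of combining a PHQ-8 score prediction with transcript characteristics in a gender-specific tree comes from the abstract.

```python
# Hypothetical sketch of the fusion idea: a hand-built decision tree that
# combines a multimodal PHQ-8 score prediction with binary cues mined from
# the interview transcript. All thresholds and rules are illustrative.

def classify_depression(phq8_pred, ptsd_mentioned, sleep_problems, gender):
    """Return 'depressed' or 'not depressed' for one participant.

    phq8_pred      -- PHQ-8 score predicted from audio/visual features
    ptsd_mentioned -- whether the transcript suggests a PTSD diagnosis
    sleep_problems -- whether the transcript mentions poor sleep
    gender         -- 'female' or 'male' (gender-specific branching)
    """
    # Confident region: trust the regression output directly
    # (a PHQ-8 score of 10 or more is the commonly used cutoff).
    if phq8_pred >= 13:
        return "depressed"
    if phq8_pred < 7:
        return "not depressed"
    # Ambiguous region: back off to transcript-derived characteristics,
    # with (hypothetical) gender-specific rules.
    if gender == "female":
        if ptsd_mentioned or sleep_problems:
            return "depressed"
    else:
        if ptsd_mentioned and sleep_problems:
            return "depressed"
    return "not depressed"
```

Under these assumed rules, a mid-range predicted score of 9 together with a PTSD mention would be classified as depressed for a female participant but not for a male one, illustrating how the language-level cues only intervene where the low-level prediction is uncertain.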




Published In

AVEC '16: Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge
October 2016
114 pages
ISBN:9781450345163
DOI:10.1145/2988257
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. decision tree
  2. depression classification
  3. multi-modal

Qualifiers

  • Research-article

Funding Sources

  • the VUB Interdisciplinary Research Program through the EMO-App project
  • the Research and Development Program 863 of China
  • the National Natural Science Foundation of China

Conference

MM '16: ACM Multimedia Conference (Sponsor)
October 16, 2016
Amsterdam, The Netherlands

Acceptance Rates

AVEC '16 Paper Acceptance Rate: 12 of 14 submissions (86%)
Overall Acceptance Rate: 52 of 98 submissions (53%)

Article Metrics

  • Downloads (last 12 months): 103
  • Downloads (last 6 weeks): 8
Reflects downloads up to 14 Feb 2025

Cited By
  • An Enhanced Cross‐Attention Based Multimodal Model for Depression Detection. Computational Intelligence 41(1), 2025. DOI: 10.1111/coin.70019
  • Weakly-Supervised Depression Detection in Speech Through Self-Learning Based Label Correction. IEEE Transactions on Audio, Speech and Language Processing 33:748-758, 2025. DOI: 10.1109/TASLPRO.2025.3533370
  • Machine Learning for Multimodal Mental Health Detection: A Systematic Review of Passive Sensing Approaches. Sensors 24(2):348, 2024. DOI: 10.3390/s24020348
  • Development of multimodal sentiment recognition and understanding. Journal of Image and Graphics 29(6):1607-1627, 2024. DOI: 10.11834/jig.240017
  • Multi Fine-Grained Fusion Network for Depression Detection. ACM Transactions on Multimedia Computing, Communications, and Applications 20(8):1-23, 2024. DOI: 10.1145/3665247
  • Detecting Depression With Heterogeneous Graph Neural Network in Clinical Interview Transcript. IEEE Transactions on Computational Social Systems 11(1):1315-1324, 2024. DOI: 10.1109/TCSS.2023.3263056
  • Rethinking Inconsistent Context and Imbalanced Regression in Depression Severity Prediction. IEEE Transactions on Affective Computing 15(4):2154-2168, 2024. DOI: 10.1109/TAFFC.2024.3405584
  • Multimodal Prediction of Obsessive-Compulsive Disorder and Comorbid Depression Severity and Energy Delivered by Deep Brain Electrodes. IEEE Transactions on Affective Computing 15(4):2025-2041, 2024. DOI: 10.1109/TAFFC.2024.3395117
  • Exploring Self-Supervised Models for Depressive Disorder Detection: A Study on Speech Corpora. 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 1-4, 2024. DOI: 10.1109/EMBC53108.2024.10781765
  • SE-DCFN: Semantic-Enhanced Dual Cross-modal Fusion Network for Depression Recognition. 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 1507-1512, 2024. DOI: 10.1109/BIBM62325.2024.10822361
