DOI: 10.1145/3670105.3670110

Research article

Recognition of mild-to-moderate depression based on facial expression and speech

Published: 29 July 2024

Abstract

The behavioral symptoms of patients with mild-to-moderate depression (MMD) are usually subtle, which makes MMD recognition challenging. To fully characterize the differences in facial activity between MMD patients and healthy controls (HCs), a three-level feature construction strategy for facial expression is proposed. Level 1: construct geometric features that preliminarily describe facial activity. Level 2: feed the geometric features into a denoising autoencoder (DAE) to generate a new hidden-layer representation. Level 3: use a Gaussian mixture model (GMM) to further characterize the hidden-layer features. In parallel, a speech feature set is constructed from the fundamental frequency (F0), pitch intensity, mel-frequency cepstral coefficients (MFCCs), and syllable pauses. Finally, the facial-expression and speech features are fused at the feature level, and MMD recognition is performed with four classic classification algorithms. Experimental results show that MMD recognition accuracy reaches 70.3% for the male group and 68.3% for the female group.
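The pipeline described above can be sketched end to end. Everything here is an illustrative assumption rather than the authors' implementation: the data are synthetic, the dimensions are made up, the DAE is approximated with scikit-learn's `MLPRegressor` trained to reconstruct clean frames from noise-corrupted ones, and an SVM stands in for one of the four classic classifiers.

```python
# Hedged sketch: Level-1 geometric features -> Level-2 DAE hidden
# representation -> Level-3 GMM statistics -> feature-level fusion
# with a speech vector -> a classic classifier. Synthetic data only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_subj, n_frames, n_geo = 20, 40, 10           # subjects, frames, features

# Level 1: per-frame geometric features (e.g. landmark distances).
geo = rng.normal(size=(n_subj, n_frames, n_geo))
labels = rng.integers(0, 2, size=n_subj)       # 0 = HC, 1 = MMD (synthetic)

# Level 2: DAE surrogate -- learn to reconstruct clean frames from
# corrupted ones; the hidden activations become the new representation.
flat = geo.reshape(-1, n_geo)
noisy = flat + 0.1 * rng.normal(size=flat.shape)
dae = MLPRegressor(hidden_layer_sizes=(6,), max_iter=500, random_state=0)
dae.fit(noisy, flat)
hidden = np.maximum(0.0, flat @ dae.coefs_[0] + dae.intercepts_[0])  # ReLU layer

# Level 3: a global GMM over hidden frames; each subject is summarized
# by the mean posterior responsibility of each mixture component.
gmm = GaussianMixture(n_components=3, random_state=0).fit(hidden)
face_feat = gmm.predict_proba(hidden).reshape(n_subj, n_frames, -1).mean(axis=1)

# Speech side: stand-in 4-dim vector per subject (in practice F0,
# intensity, MFCC statistics, and pause measures extracted from audio).
speech_feat = rng.normal(size=(n_subj, 4))

# Feature-level fusion, then one of the "classic" classifiers (SVM here).
fused = np.hstack([face_feat, speech_feat])
clf = SVC().fit(fused, labels)
preds = clf.predict(fused)
```

The GMM summary used here (mean component responsibilities per subject) is one common way to pool frame-level features into a fixed-length vector; the paper does not specify its exact pooling, so treat this step as a placeholder.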




Published In

CNIOT '24: Proceedings of the 2024 5th International Conference on Computing, Networks and Internet of Things
May 2024
668 pages
ISBN:9798400716751
DOI:10.1145/3670105

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Denoising autoencoder
  2. Depression recognition
  3. Facial expression
  4. Speech

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • the University Teacher Innovation Foundation of Gansu Province
  • the University Innovation Foundation of Gansu Province
  • the Science and Technology Program of Gansu Province

Conference

CNIOT 2024

Acceptance Rates

Overall Acceptance Rate 39 of 82 submissions, 48%
