Abstract
Post-traumatic stress disorder (PTSD) is a chronic and debilitating mental condition that develops in response to catastrophic life events, such as military combat, sexual assault, and natural disasters. PTSD is characterized by flashbacks of past traumatic events, intrusive thoughts, nightmares, hypervigilance, and sleep disturbance, all of which affect a person’s life and lead to considerable social, occupational, and interpersonal dysfunction. PTSD is diagnosed by medical professionals using a self-assessment questionnaire of PTSD symptoms as defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM). In this paper, for the first time, we collect, annotate, and prepare for public distribution a new video database for automatic PTSD diagnosis, called the PTSD-in-the-wild dataset. The database exhibits “natural” and large variability in acquisition conditions, with different poses, facial expressions, lighting, focus, resolution, age, gender, race, occlusions, and backgrounds. In addition to describing the details of the dataset collection, we provide a benchmark for evaluating machine learning-based approaches on the PTSD-in-the-wild dataset. We also propose and evaluate a deep learning-based approach for PTSD detection with respect to this benchmark. The proposed approach shows very promising results. Interested researchers can download a copy of the PTSD-in-the-wild dataset from http://www.lissi.fr/PTSD-Dataset/.
Data Availability
The datasets generated and/or analysed during the current study are available in the LISSI repository. Interested researchers can download a copy of the PTSD-in-the-wild dataset from http://www.lissi.fr/PTSD-Dataset/
Ethics declarations
Not applicable
Additional information
Moctar Abdoul Latif Sawadogo and Furkan Pala contributed equally to this work.
About this article
Cite this article
Sawadogo, M.A.L., Pala, F., Singh, G. et al. PTSD in the wild: a video database for studying post-traumatic stress disorder recognition in unconstrained environments. Multimed Tools Appl 83, 42861–42883 (2024). https://doi.org/10.1007/s11042-023-17203-x
DOI: https://doi.org/10.1007/s11042-023-17203-x