Authors:
Georgios Drakopoulos; George Pikramenos; Evaggelos Spyrou and Stavros J. Perantonis
Affiliation:
NCSR “Demokritos”, Athens, Greece
Keyword(s):
Emotion Recognition, Affective Computing, Signal Processing, Cepstrum, MFCC Coefficients, Deep Neural Networks, Time Series Analysis, Hidden Markov Chains, Higher Order Data.
Abstract:
Emotion recognition from speech signals is an important field in its own right as well as a mainstay of many multimodal sentiment analysis systems. The latter may also include a broad spectrum of modalities strongly associated with the conscious or subconscious communication of human emotional state, such as visual cues, gestures, body postures, gait, or facial expressions. Typically, emotion discovery from speech signals requires considerably less computation than other modalities, and in the overwhelming majority of studies including the speech modality increases the accuracy of the overall emotion estimation process. The principal algorithmic cornerstones of emotion estimation from speech signals are Hidden Markov Models, time series modeling, cepstrum processing, and deep learning methodologies, the latter two being prime examples of higher order data processing. Additionally, the best-known datasets which serve as emotion recognition benchmarks are described.