A 3-D Audio-Visual Corpus of Affective Communication

Published: 01 October 2010

Abstract

Communication between humans relies deeply on the capability to express and recognize feelings. For this reason, research on human-machine interaction needs to focus on the recognition and simulation of emotional states, a prerequisite of which is the collection of affective corpora. Currently available datasets remain a bottleneck because of the difficulties that arise during the acquisition and labeling of affective data. In this work, we present a new audio-visual corpus covering what are arguably the two most important modalities humans use to communicate their emotional states, namely speech and facial expression, the latter in the form of dense dynamic 3-D face geometries. We acquire high-quality data by working in a controlled environment and use video clips to induce affective states. The annotation of the speech signal includes a transcription of the corpus text into a phonological representation, accurate phone segmentation, fundamental frequency extraction, and signal intensity estimation. We employ a real-time 3-D scanner to acquire dense dynamic facial geometries and track the faces throughout the sequences, achieving full spatial and temporal correspondences. The corpus is a valuable tool for applications such as affective visual speech synthesis and view-independent facial expression recognition.
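To make the acoustic annotation concrete, the sketch below computes the two frame-level signals named in the abstract, fundamental frequency and intensity, for a single utterance. It is an illustrative reconstruction, not the authors' pipeline: the paper does not name its extraction tools, and the librosa calls, the placeholder file name `utterance.wav`, and the 65-300 Hz pitch search range are assumptions chosen for a generic adult-speech recording.

```python
# Illustrative sketch (not the paper's actual tooling) of the two acoustic
# annotations named in the abstract: fundamental frequency (F0) and intensity.
# "utterance.wav" is a placeholder file name, not part of the corpus.
import librosa
import numpy as np

y, sr = librosa.load("utterance.wav", sr=None)  # keep the native sample rate

# F0 contour via the pYIN probabilistic pitch tracker; the 65-300 Hz range
# is a generic choice for adult speech, not a setting from the paper.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=65.0, fmax=300.0, sr=sr, frame_length=2048
)

# Frame-wise intensity as RMS energy, converted to dB.
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=512)[0]
intensity_db = librosa.amplitude_to_db(rms, ref=np.max)

# Time stamps per analysis frame, useful for aligning the contours
# with a phone segmentation.
times = librosa.times_like(f0, sr=sr, hop_length=512)
print(f"{len(times)} frames, voiced fraction: {np.nanmean(voiced_flag):.2f}")
```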

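The dense temporal correspondence mentioned in the abstract comes from tracking a face template through the scan sequence. As a hedged illustration of one ingredient of such tracking, the sketch below implements the Kabsch algorithm, the standard least-squares rigid alignment between two corresponding point sets; the paper's actual registration is non-rigid and more involved, and all arrays here are synthetic placeholders.

```python
# Minimal sketch of one ingredient of temporal correspondence: rigidly
# aligning corresponding vertex sets from two consecutive 3-D scans via
# the Kabsch algorithm. The paper's full non-rigid tracking pipeline is
# more involved; all data below are synthetic placeholders.
import numpy as np

def kabsch(P: np.ndarray, Q: np.ndarray):
    """Best-fit rotation R and translation t mapping points P onto Q
    (both N x 3) in the least-squares sense."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)               # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

# Synthetic demo: a random "template" frame and the same vertices after a
# known rigid motion, standing in for two consecutive scans.
rng = np.random.default_rng(0)
template = rng.normal(size=(500, 3))
a = np.radians(10.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
next_frame = template @ R_true.T + np.array([0.02, -0.01, 0.05])

R, t = kabsch(template, next_frame)
aligned = template @ R.T + t
print("max residual:", np.abs(aligned - next_frame).max())  # ~0 (float error)
```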

Information

Published In

IEEE Transactions on Multimedia, Volume 12, Issue 6
October 2010
122 pages

Publisher

IEEE Press


Author Tags

  1. 3-D face modeling
  2. audio-visual database
  3. emotional speech
  4. face tracking
  5. visual speech modeling

Qualifiers

  • Research-article


Cited By

  • (2024) Integrating Multimodal Affective Signals for Stress Detection from Audio-Visual Data. Proceedings of the 26th International Conference on Multimodal Interaction, 22-32. https://doi.org/10.1145/3678957.3685717. Online: 4 Nov 2024.
  • (2024) ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE. Proceedings of the 17th ACM SIGGRAPH Conference on Motion, Interaction, and Games, 1-12. https://doi.org/10.1145/3677388.3696320. Online: 21 Nov 2024.
  • (2024) MMHead: Towards Fine-grained Multi-modal 3D Facial Animation. Proceedings of the 32nd ACM International Conference on Multimedia, 7966-7975. https://doi.org/10.1145/3664647.3681366. Online: 28 Oct 2024.
  • (2024) DEITalk: Speech-Driven 3D Facial Animation with Dynamic Emotional Intensity Modeling. Proceedings of the 32nd ACM International Conference on Multimedia, 10506-10514. https://doi.org/10.1145/3664647.3681359. Online: 28 Oct 2024.
  • (2024) CLTalk: Speech-Driven 3D Facial Animation with Contrastive Learning. Proceedings of the 2024 International Conference on Multimedia Retrieval, 1175-1179. https://doi.org/10.1145/3652583.3657625. Online: 30 May 2024.
  • (2024) Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance. ACM SIGGRAPH 2024 Conference Papers, 1-13. https://doi.org/10.1145/3641519.3657413. Online: 13 Jul 2024.
  • (2024) Expressive 3D Facial Animation Generation Based on Local-to-Global Latent Diffusion. IEEE Transactions on Visualization and Computer Graphics 30(11), 7397-7407. https://doi.org/10.1109/TVCG.2024.3456213. Online: 1 Nov 2024.
  • (2024) CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation. IEEE Transactions on Circuits and Systems for Video Technology 34(9), 8953-8965. https://doi.org/10.1109/TCSVT.2024.3386836. Online: 1 Sep 2024.
  • (2024) 3D head-talk: speech synthesis 3D head movement face animation. Soft Computing - A Fusion of Foundations, Methodologies and Applications 28(1), 363-379. https://doi.org/10.1007/s00500-023-09292-5. Online: 1 Jan 2024.
  • (2024) ScanTalk: 3D Talking Heads from Unregistered Scans. Computer Vision – ECCV 2024, 19-36. https://doi.org/10.1007/978-3-031-73397-0_2. Online: 29 Sep 2024.
