research-article

Multimodal integration-a statistical view

Published: 01 December 1999

Abstract

We present a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of the parallel input signals in a multimodal system. We first identify the primary factors that influence multimodal recognition performance by evaluating the multimodal recognition probabilities. We then develop two techniques, an estimate approach and a learning approach, designed to optimize recognition accuracy during multimodal integration. We evaluate these methods using Quickset, a speech/gesture multimodal system, and report results based on an empirical corpus collected with Quickset. From an architectural perspective, the integration technique offers enhanced robustness. It is also premised on more realistic assumptions than previous multimodal systems that use semantic fusion. From a methodological standpoint, the evaluation techniques that we describe provide a valuable tool for evaluating multimodal systems.
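
To make "integrating the posterior probabilities of parallel input signals" concrete, here is a minimal sketch of one generic way to combine two recognizers' n-best posteriors. It is not the estimate approach or the learning approach from the article, whose details the abstract does not give. The function name `fuse_posteriors`, the `compatible` set of valid speech/gesture pairs, the exponent weights, and the example labels are all hypothetical. With both weights at 1.0 the score is a plain product, which assumes the two modes are conditionally independent.

```python
from itertools import product


def fuse_posteriors(speech, gesture, compatible, w_speech=1.0, w_gesture=1.0):
    """Rank joint (speech, gesture) hypotheses by a weighted product of posteriors.

    speech, gesture: dicts mapping a hypothesis label to its posterior probability.
    compatible: set of (speech_label, gesture_label) pairs that form a valid command.
    w_speech, w_gesture: hypothetical exponents; 1.0 each gives the plain product.
    Returns a list of ((speech_label, gesture_label), normalized_score),
    sorted from most to least likely. Returns [] if no pair is compatible.
    """
    scores = {}
    for (s, p_s), (g, p_g) in product(speech.items(), gesture.items()):
        if (s, g) in compatible:
            scores[(s, g)] = (p_s ** w_speech) * (p_g ** w_gesture)

    total = sum(scores.values())
    if total == 0:
        return []

    # Renormalize over the compatible pairs so the scores sum to 1.
    ranked = [(pair, score / total) for pair, score in scores.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Made-up posteriors, for illustration only.
    speech = {"create unit": 0.6, "create line": 0.4}
    gesture = {"point": 0.3, "line": 0.7}
    compatible = {("create unit", "point"), ("create line", "line")}
    for pair, score in fuse_posteriors(speech, gesture, compatible):
        print(pair, round(score, 3))
```

In this made-up example the top joint hypothesis differs from the top speech hypothesis, because the gesture evidence outweighs the speech recognizer's preference. A second input mode can correct the first in this way.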


Published In

IEEE Transactions on Multimedia, Volume 1, Issue 4
December 1999
51 pages

Publisher

IEEE Press

Cited By

  • (2023) WavoID: Robust and Secure Multi-modal User Identification via mmWave-voice Mechanism. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pp. 1-15. DOI: 10.1145/3586183.3606775. Online publication date: 29-Oct-2023
  • (2022) Optimization of Multimodal Japanese Teaching Model Using Virtual Reality. Mobile Information Systems, vol. 2022. DOI: 10.1155/2022/7364279. Online publication date: 1-Jan-2022
  • (2022) Multimodal Fusion of Smart Home and Text-based Behavior Markers for Clinical Assessment Prediction. ACM Transactions on Computing for Healthcare 3:4, pp. 1-25. DOI: 10.1145/3531231. Online publication date: 3-Nov-2022
  • (2019) Multimodal integration for interactive conversational systems. The Handbook of Multimodal-Multisensor Interfaces, pp. 21-76. DOI: 10.1145/3233795.3233798. Online publication date: 1-Jul-2019
  • (2017) Multimodal gesture recognition. The Handbook of Multimodal-Multisensor Interfaces, pp. 449-487. DOI: 10.1145/3015783.3015796. Online publication date: 24-Apr-2017
  • (2017) Multimodal speech and pen interfaces. The Handbook of Multimodal-Multisensor Interfaces, pp. 403-447. DOI: 10.1145/3015783.3015795. Online publication date: 24-Apr-2017
  • (2017) The Handbook of Multimodal-Multisensor Interfaces. Online publication date: 24-Apr-2017
  • (2016) Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 38:8, pp. 1548-1568. DOI: 10.1109/TPAMI.2016.2515606. Online publication date: 30-Jun-2016
  • (2015) Coupled hidden conditional random fields for RGB-D human action recognition. Signal Processing 112:C, pp. 74-82. DOI: 10.1016/j.sigpro.2014.08.038. Online publication date: 1-Jul-2015
  • (2014) Latent Semantic Analysis for Multimodal User Input With Speech and Gestures. IEEE/ACM Transactions on Audio, Speech and Language Processing 22:2, pp. 417-429. DOI: 10.1109/TASLP.2013.2294586. Online publication date: 1-Feb-2014
