Busso et al., 2005

Smart room: Participant and speaker localization and identification

Document ID: 1058642703238447655
Author: Busso C; Hernanz S; Chu C; Kwon S; Lee S; Georgiou P; Cohen I; Narayanan S
Publication year: 2005
Publication venue: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005

Snippet

Our long-term objective is to create smart room technologies that are aware of the users' presence and behavior and can become an active, but not intrusive, part of the interaction. In this work, we present a multimodal approach for estimating and tracking the …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
          - G06K9/00268—Feature extraction; Face representation
            - G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
          - G06K9/00288—Classification, e.g. identification
          - G06K9/00228—Detection; Localisation; Normalisation
        - G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
          - G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
        - G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
      - H04N7/00—Television systems
        - H04N7/14—Systems for two-way working
          - H04N7/15—Conference systems

Similar Documents

Publication	Publication Date	Title
Busso et al.	2005	Smart room: Participant and speaker localization and identification
Zotkin et al.	2002	Joint audio-visual tracking using particle filters
Gebru et al.	2017	Audio-visual speaker diarization based on spatiotemporal bayesian fusion
Gatica-Perez et al.	2007	Audiovisual probabilistic tracking of multiple speakers in meetings
Nickel et al.	2005	A joint particle filter for audio-visual speaker tracking
Aarabi et al.	2001	Robust sound localization using multi-source audiovisual information fusion
Zhou et al.	2008	Target detection and tracking with heterogeneous sensors
CN107820037B (en)	2021-03-26	Audio signal, image processing method, device and system
Kapralos et al.	2003	Audiovisual localization of multiple speakers in a video teleconferencing setting
Cech et al.	2013	Active-speaker detection and localization with microphones and cameras embedded into a robotic head
Kirchmaier et al.	2011	Dynamical information fusion of heterogeneous sensors for 3D tracking using particle swarm optimization
Liu et al.	2017	Multiple speaker tracking in spatial audio via PHD filtering and depth-audio fusion
Checka et al.	2003	A probabilistic framework for multi-modal multi-person tracking
Busso et al.	2007	Real-time monitoring of participants' interaction in a meeting using audio-visual sensors
Siracusa et al.	2003	A multi-modal approach for determining speaker location and focus
D'Arca et al.	2016	Robust indoor speaker recognition in a network of audio and video sensors
D'Arca et al.	2012	Person tracking via audio and video fusion
Omologo et al.	2005	Speaker localization in CHIL lectures: Evaluation criteria and results
Izhar et al.	2020	Tracking sound sources for object-based spatial audio in 3D audio-visual production
Salah et al.	2008	Multimodal identification and localization of users in a smart environment
Gatica-Perez et al.	2003	A mixed-state i-particle filter for multi-camera speaker tracking
Zhao et al.	2023	Audio Visual Speaker Localization from EgoCentric Views
Gatica-Perez et al.	2005	Multimodal multispeaker probabilistic tracking in meetings
Kim et al.	2007	Auditory and visual integration based localization and tracking of humans in daily-life environments
Korchagin et al.	2011	Just-in-time multimodal association and fusion from home entertainment