Smart room: Participant and speaker localization and identification
Busso et al., 2005
- Document ID
- 1058642703238447655
- Author
- Busso C
- Hernanz S
- Chu C
- Kwon S
- Lee S
- Georgiou P
- Cohen I
- Narayanan S
- Publication year
- 2005
- Publication venue
- Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
Snippet
Our long-term objective is to create smart room technologies that are aware of the users' presence and behavior and can become an active, but not intrusive, part of the interaction. In this work, we present a multimodal approach for estimating and tracking the …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00228—Detection; Localisation; Normalisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
- G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Busso et al. | | Smart room: Participant and speaker localization and identification |
Zotkin et al. | | Joint audio-visual tracking using particle filters |
Gebru et al. | | Audio-visual speaker diarization based on spatiotemporal Bayesian fusion |
Gatica-Perez et al. | | Audiovisual probabilistic tracking of multiple speakers in meetings |
Nickel et al. | | A joint particle filter for audio-visual speaker tracking |
Aarabi et al. | | Robust sound localization using multi-source audiovisual information fusion |
Zhou et al. | | Target detection and tracking with heterogeneous sensors |
CN107820037B (en) | | Audio signal, image processing method, device and system |
Kapralos et al. | | Audiovisual localization of multiple speakers in a video teleconferencing setting |
Cech et al. | | Active-speaker detection and localization with microphones and cameras embedded into a robotic head |
Kirchmaier et al. | | Dynamical information fusion of heterogeneous sensors for 3D tracking using particle swarm optimization |
Liu et al. | | Multiple speaker tracking in spatial audio via PHD filtering and depth-audio fusion |
Checka et al. | | A probabilistic framework for multi-modal multi-person tracking |
Busso et al. | | Real-time monitoring of participants' interaction in a meeting using audio-visual sensors |
Siracusa et al. | | A multi-modal approach for determining speaker location and focus |
D'Arca et al. | | Robust indoor speaker recognition in a network of audio and video sensors |
D'Arca et al. | | Person tracking via audio and video fusion |
Omologo et al. | | Speaker localization in CHIL lectures: Evaluation criteria and results |
Izhar et al. | | Tracking sound sources for object-based spatial audio in 3D audio-visual production |
Salah et al. | | Multimodal identification and localization of users in a smart environment |
Gatica-Perez et al. | | A mixed-state i-particle filter for multi-camera speaker tracking |
Zhao et al. | | Audio Visual Speaker Localization from EgoCentric Views |
Gatica-Perez et al. | | Multimodal multispeaker probabilistic tracking in meetings |
Kim et al. | | Auditory and visual integration based localization and tracking of humans in daily-life environments |
Korchagin et al. | | Just-in-time multimodal association and fusion from home entertainment |