Combining laser range finders and local steered response power for audio monitoring

Even et al., 2012

Document ID: 197618046977072909
Author: Even J; Ishi C; Heracleous P; Miyashita T; Hagita N
Publication year: 2012
Publication venue: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems

Snippet

This paper presents an audio monitoring system for detecting and identifying people engaged in a conversation. The proposed method is hands-free as it uses a microphone array to acquire the sound. A particularity of the approach is the use of a laser range finder …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit

Similar Documents

Publication	Publication Date	Title
Yoshioka et al.	2019	Advances in online audio-visual meeting transcription
US8392185B2 (en)	2013-03-05	Speech recognition system and method for generating a mask of the system
US8577678B2 (en)	2013-11-05	Speech recognition system and speech recognizing method
Valin et al.	2007	Robust recognition of simultaneous speech by a mobile robot
Yamamoto et al.	2005	Enhanced robot speech recognition based on microphone array source separation and missing feature theory
Nakatani et al.	2013	Dominance based integration of spatial and spectral features for speech enhancement
Ince et al.	2011	Assessment of general applicability of ego noise estimation
Squartini et al.	2012	Environmental robust speech and speaker recognition through multi-channel histogram equalization
Faubel et al.	2011	Improving hands-free speech recognition in a car through audio-visual voice activity detection
Abutalebi et al.	2011	Performance improvement of TDOA-based speaker localization in joint noisy and reverberant conditions
Even et al.	2012	Combining laser range finders and local steered response power for audio monitoring
Yamamoto et al.	2017	Robust i-vector extraction tightly coupled with voice activity detection using deep neural networks
Xiong et al.	2018	Channel selection using neural network posterior probability for speech recognition with distributed microphone arrays in everyday environments
Yamamoto et al.	2007	Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech
Rodomagoulakis et al.	2013	Experiments on far-field multichannel speech processing in smart homes
Rudzyn et al.	2007	Real time robot audition system incorporating both 3D sound source localisation and voice characterisation
Yamada et al.	1998	Hands-free speech recognition based on 3-D Viterbi search using a microphone array
Bergh et al.	2016	Multi-speaker voice activity detection using a camera-assisted microphone array
Hu et al.	2011	Wake-up-word detection for robots using spatial eigenspace consistency and resonant curve similarity
Asaei et al.	2009	Verified speaker localization utilizing voicing level in split-bands
Takashima et al.	2012	Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients
Even et al.	2011	Multi-modal front-end for speaker activity detection in small meetings
Lee et al.	2009	Space-time voice activity detection
Liu et al.	2017	A unified network for multi-speaker speech recognition with multi-channel recordings
Patra et al.	2011	Dimension reduction of feature vectors using WPCA for robust speaker identification system