Even et al., 2012 - Google Patents
Combining laser range finders and local steered response power for audio monitoring
- Document ID
- 197618046977072909
- Author
- Even J
- Ishi C
- Heracleous P
- Miyashita T
- Hagita N
- Publication year
- 2012
- Publication venue
- 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems
Snippet
This paper presents an audio monitoring system for detecting and identifying people engaged in a conversation. The proposed method is hands-free as it uses a microphone array to acquire the sound. A particularity of the approach is the use of a laser range finder …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Yoshioka et al. | Advances in online audio-visual meeting transcription | |
US8392185B2 (en) | Speech recognition system and method for generating a mask of the system | |
US8577678B2 (en) | Speech recognition system and speech recognizing method | |
Valin et al. | Robust recognition of simultaneous speech by a mobile robot | |
Yamamoto et al. | Enhanced robot speech recognition based on microphone array source separation and missing feature theory | |
Nakatani et al. | Dominance based integration of spatial and spectral features for speech enhancement | |
Ince et al. | Assessment of general applicability of ego noise estimation | |
Squartini et al. | Environmental robust speech and speaker recognition through multi-channel histogram equalization | |
Faubel et al. | Improving hands-free speech recognition in a car through audio-visual voice activity detection | |
Abutalebi et al. | Performance improvement of TDOA-based speaker localization in joint noisy and reverberant conditions | |
Even et al. | Combining laser range finders and local steered response power for audio monitoring | |
Yamamoto et al. | Robust i-vector extraction tightly coupled with voice activity detection using deep neural networks | |
Xiong et al. | Channel selection using neural network posterior probability for speech recognition with distributed microphone arrays in everyday environments | |
Yamamoto et al. | Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech | |
Rodomagoulakis et al. | Experiments on far-field multichannel speech processing in smart homes | |
Rudzyn et al. | Real time robot audition system incorporating both 3D sound source localisation and voice characterisation | |
Yamada et al. | Hands-free speech recognition based on 3-D Viterbi search using a microphone array | |
Bergh et al. | Multi-speaker voice activity detection using a camera-assisted microphone array | |
Hu et al. | Wake-up-word detection for robots using spatial eigenspace consistency and resonant curve similarity | |
Asaei et al. | Verified speaker localization utilizing voicing level in split-bands | |
Takashima et al. | Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients | |
Even et al. | Multi-modal front-end for speaker activity detection in small meetings | |
Lee et al. | Space-time voice activity detection | |
Liu et al. | A unified network for multi-speaker speech recognition with multi-channel recordings | |
Patra et al. | Dimension reduction of feature vectors using WPCA for robust speaker identification system |