Even et al., 2012 - Google Patents

Combining laser range finders and local steered response power for audio monitoring

Document ID: 197618046977072909
Authors: Even J, Ishi C, Heracleous P, Miyashita T, Hagita N
Publication year: 2012
Publication venue: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems

Snippet

This paper presents an audio monitoring system for detecting and identifying people engaged in a conversation. The proposed method is hands-free as it uses a microphone array to acquire the sound. A particularity of the approach is the use of a laser range finder …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit

Similar Documents

Publication Publication Date Title
Yoshioka et al. Advances in online audio-visual meeting transcription
US8392185B2 (en) Speech recognition system and method for generating a mask of the system
US8577678B2 (en) Speech recognition system and speech recognizing method
Valin et al. Robust recognition of simultaneous speech by a mobile robot
Yamamoto et al. Enhanced robot speech recognition based on microphone array source separation and missing feature theory
Nakatani et al. Dominance based integration of spatial and spectral features for speech enhancement
Ince et al. Assessment of general applicability of ego noise estimation
Squartini et al. Environmental robust speech and speaker recognition through multi-channel histogram equalization
Faubel et al. Improving hands-free speech recognition in a car through audio-visual voice activity detection
Abutalebi et al. Performance improvement of TDOA-based speaker localization in joint noisy and reverberant conditions
Even et al. Combining laser range finders and local steered response power for audio monitoring
Yamamoto et al. Robust i-vector extraction tightly coupled with voice activity detection using deep neural networks
Xiong et al. Channel selection using neural network posterior probability for speech recognition with distributed microphone arrays in everyday environments
Yamamoto et al. Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech
Rodomagoulakis et al. Experiments on far-field multichannel speech processing in smart homes
Rudzyn et al. Real time robot audition system incorporating both 3D sound source localisation and voice characterisation
Yamada et al. Hands-free speech recognition based on 3-D Viterbi search using a microphone array
Bergh et al. Multi-speaker voice activity detection using a camera-assisted microphone array
Hu et al. Wake-up-word detection for robots using spatial eigenspace consistency and resonant curve similarity
Asaei et al. Verified speaker localization utilizing voicing level in split-bands
Takashima et al. Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients
Even et al. Multi-modal front-end for speaker activity detection in small meetings
Lee et al. Space-time voice activity detection
Liu et al. A unified network for multi-speaker speech recognition with multi-channel recordings
Patra et al. Dimension reduction of feature vectors using WPCA for robust speaker identification system