Nakano et al., 2010 - Google Patents

Distant speech recognition using a microphone array network

Document ID: 17379517150104665740
Author: Nakano A; Nakagawa S; Yamamoto K
Publication year: 2010
Publication venue: IEICE Transactions on Information and Systems

Snippet

In this work, spatial information consisting of the position and orientation angle of an acoustic source is estimated by an artificial neural network (ANN). The estimated position of a speaker in an enclosed space is used to refine the estimated time delays for a delay-and …
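
The link between an estimated speaker position and per-microphone time delays can be sketched as below. This is a minimal illustration and not the authors' method: the snippet is truncated, so the delay-and-sum alignment step is an assumption, as are the function names `relative_delays` and `delay_and_sum`, the speed of sound of 343 m/s, and the use of integer-sample shifts.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def relative_delays(source, mics, c=SPEED_OF_SOUND):
    """Arrival delay at each microphone, in seconds, relative to the earliest one.

    `source` and each entry of `mics` are (x, y, z) coordinates in metres.
    """
    times = [math.dist(source, m) / c for m in mics]
    earliest = min(times)
    return [t - earliest for t in times]


def delay_and_sum(signals, delays, sample_rate):
    """Align channels by integer-sample shifts, then average them.

    `signals` is a list of equal-rate sample lists, one per microphone.
    """
    shifts = [round(d * sample_rate) for d in delays]
    # Only keep samples for which every shifted channel has data.
    length = min(len(s) - k for s, k in zip(signals, shifts))
    return [
        sum(s[n + k] for s, k in zip(signals, shifts)) / len(signals)
        for n in range(length)
    ]


# Hypothetical two-microphone setup with the speaker at the first microphone.
mics = [(0.0, 0.0, 0.0), (0.343, 0.0, 0.0)]
estimated_position = (0.0, 0.0, 0.0)
delays = relative_delays(estimated_position, mics)
```

A better position estimate yields delays closer to the true propagation times, which is why refining the position can improve how well the channels line up before they are combined.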

Classifications

- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0208—Noise filtering
            - G10L21/0216—Noise filtering characterised by the method used for estimating noise
              - G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
                - G10L2021/02166—Microphone arrays; Beamforming
        - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
          - G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
            - G10L15/142—Hidden Markov Models [HMMs]
          - G10L15/18—Speech classification or search using natural language modelling
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
        - G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
        - G10L15/28—Constructional details of speech recognition systems
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
      - G10L17/00—Speaker identification or verification
        - G10L17/04—Training, enrolment or model building

Similar Documents

Publication	Publication Date	Title
EP3707716B1 (en)	2021-12-01	Multi-channel speech separation
Takahashi et al.	2009	Blind spatial subtraction array for speech enhancement in noisy environment
JP5738020B2 (en)	2015-06-17	Speech recognition apparatus and speech recognition method
Saruwatari et al.	2003	Blind source separation combining independent component analysis and beamforming
Swietojanski et al.	2013	Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
Valin et al.	2007	Robust recognition of simultaneous speech by a mobile robot
Kumatani et al.	2011	Channel selection based on multichannel cross-correlation coefficients for distant speech recognition
Yamamoto et al.	2005	Enhanced robot speech recognition based on microphone array source separation and missing feature theory
Potamitis et al.	2003	An integrated system for smart-home control of appliances based on remote speech interaction
US20120095761A1 (en)	2012-04-19	Speech recognition system and speech recognizing method
Nakatani et al.	2013	Dominance based integration of spatial and spectral features for speech enhancement
Novoa et al.	2021	Automatic speech recognition for indoor HRI scenarios
Diaz et al.	2021	Assessing the effect of visual servoing on the performance of linear microphone arrays in moving human-robot interaction scenarios
Delcroix et al.	2013	Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds
Omologo et al.	2001	Speech recognition with microphone arrays
Maas et al.	2011	A two-channel acoustic front-end for robust automatic speech recognition in noisy and reverberant environments
Yamamoto et al.	2005	Making a robot recognize three simultaneous sentences in real-time
Yamada et al.	2002	Distant-talking speech recognition based on a 3-D Viterbi search using a microphone array
Shi et al.	2006	Phase-based dual-microphone speech enhancement using a prior speech model
Nakano et al.	2010	Distant speech recognition using a microphone array network
Nakadai et al.	2008	A robot referee for rock-paper-scissors sound games
Okuno et al.	2011	Robot audition: Missing feature theory approach and active audition
Yamamoto et al.	2007	Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech
Ito et al.	2017	Data-driven and physical model-based designs of probabilistic spatial dictionary for online meeting diarization and adaptive beamforming
Ishi et al.	2006	Robust speech recognition system for communication robots in real environments