Nakano et al., 2010 - Google Patents
Distant speech recognition using a microphone array network
- Document ID
- 17379517150104665740
- Author
- Nakano A
- Nakagawa S
- Yamamoto K
- Publication year
- 2010
- Publication venue
- IEICE transactions on information and systems
Snippet
In this work, spatial information consisting of the position and orientation angle of an acoustic source is estimated by an artificial neural network (ANN). The estimated position of a speaker in an enclosed space is used to refine the estimated time delays for a delay-and …
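The snippet is truncated, so the following is only a minimal sketch of the general idea it points to: once a speaker position is known, the relative arrival delay at each microphone follows from geometry, and those delays can align the channels before averaging them (a plain delay-and-sum beamformer). It assumes the truncated phrase refers to delay-and-sum beamforming, and it does not reproduce the ANN-based estimation from the paper. The function names, the speed of sound value, and the integer-sample alignment are all illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the source


def steering_delays(source, mics, c=SPEED_OF_SOUND):
    """Arrival delay (seconds) at each microphone relative to the earliest arrival.

    `source` and each entry of `mics` are (x, y, z) positions in metres.
    """
    dists = [math.dist(source, m) for m in mics]
    nearest = min(dists)
    return [(d - nearest) / c for d in dists]


def delay_and_sum(channels, delays, fs):
    """Align channels by whole-sample shifts and average them.

    Hypothetical simplification: delays are rounded to integer samples;
    a practical system would typically use fractional-delay filtering.
    """
    shifts = [round(d * fs) for d in delays]
    # Only keep output samples for which every shifted channel has data.
    n = min(len(ch) - s for ch, s in zip(channels, shifts))
    return [
        sum(ch[i + s] for ch, s in zip(channels, shifts)) / len(channels)
        for i in range(n)
    ]


# Two microphones on the x axis, speaker 1 m in front of the first one.
delays = steering_delays((-1.0, 0.0, 0.0), [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
aligned = delay_and_sum([[1, 2, 3, 4, 0], [0, 1, 2, 3, 4]], delays, fs=343)
```

In this toy setup the second microphone hears the same signal one sample later, so shifting it back by one sample before averaging recovers the original waveform.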
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
Similar Documents
Publication | Title |
---|---|
EP3707716B1 (en) | Multi-channel speech separation |
Takahashi et al. | Blind spatial subtraction array for speech enhancement in noisy environment |
JP5738020B2 (en) | Speech recognition apparatus and speech recognition method |
Saruwatari et al. | Blind source separation combining independent component analysis and beamforming |
Swietojanski et al. | Hybrid acoustic models for distant and multichannel large vocabulary speech recognition |
Valin et al. | Robust recognition of simultaneous speech by a mobile robot |
Kumatani et al. | Channel selection based on multichannel cross-correlation coefficients for distant speech recognition |
Yamamoto et al. | Enhanced robot speech recognition based on microphone array source separation and missing feature theory |
Potamitis et al. | An integrated system for smart-home control of appliances based on remote speech interaction |
US20120095761A1 (en) | Speech recognition system and speech recognizing method |
Nakatani et al. | Dominance based integration of spatial and spectral features for speech enhancement |
Novoa et al. | Automatic speech recognition for indoor HRI scenarios |
Diaz et al. | Assessing the effect of visual servoing on the performance of linear microphone arrays in moving human-robot interaction scenarios |
Delcroix et al. | Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds |
Omologo et al. | Speech recognition with microphone arrays |
Maas et al. | A two-channel acoustic front-end for robust automatic speech recognition in noisy and reverberant environments |
Yamamoto et al. | Making a robot recognize three simultaneous sentences in real-time |
Yamada et al. | Distant-talking speech recognition based on a 3-D Viterbi search using a microphone array |
Shi et al. | Phase-based dual-microphone speech enhancement using a prior speech model |
Nakano et al. | Distant speech recognition using a microphone array network |
Nakadai et al. | A robot referee for rock-paper-scissors sound games |
Okuno et al. | Robot audition: Missing feature theory approach and active audition |
Yamamoto et al. | Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech |
Ito et al. | Data-driven and physical model-based designs of probabilistic spatial dictionary for online meeting diarization and adaptive beamforming |
Ishi et al. | Robust speech recognition system for communication robots in real environments |