He et al., 2020 - Google Patents

Mask-based blind source separation and MVDR beamforming in ASR

Document ID: 5638381240622721531
Author: He R; Long Y; Li Y; Liang J
Publication year: 2020
Publication venue: International Journal of Speech Technology

Snippet

This paper presents a front-end enhancement system for automatic speech recognition (ASR) that addresses the cocktail party problem. The cocktail party problem is the task of recognizing the target speech when multiple speakers talk in noisy real-world environments. Many conventional …

Continue reading at link.springer.com (other versions)

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

Similar Documents

Publication	Publication Date	Title
Wang et al.	2021	Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation
CN109830245B (en)	2021-03-12	A method and system for multi-speaker speech separation based on beamforming
Tan et al.	2022	Neural spectrospatial filtering
Chazan et al.	2019	Multi-microphone speaker separation based on deep DOA estimation
Grais et al.	2018	Raw multi-channel audio source separation using multi-resolution convolutional auto-encoders
Schädler et al.	2015	Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition
CN110970053A (en)	2020-04-07	Multichannel speaker-independent voice separation method based on deep clustering
JP6348427B2 (en)	2018-06-27	Noise removal apparatus and noise removal program
Haridas et al.	2018	A novel approach to improve the speech intelligibility using fractional delta-amplitude modulation spectrogram
Sun et al.	2020	Joint constraint algorithm based on deep neural network with dual outputs for single-channel speech separation
Gul et al.	2021	Integration of deep learning with expectation maximization for spatial cue-based speech separation in reverberant conditions
Saleem et al.	2018	Low rank sparse decomposition model based speech enhancement using gammatone filterbank and Kullback–Leibler divergence
Nugraha et al.	2018	Deep neural network based multichannel audio source separation
Li et al.	2019	Speech enhancement algorithm based on sound source localization and scene matching for binaural digital hearing aids
He et al.	2020	Mask-based blind source separation and MVDR beamforming in ASR
Li et al.	2022	Speech enhancement based on binaural sound source localization and cosh measure wiener filtering
Sheeja et al.	2023	Speech dereverberation and source separation using DNN-WPE and LWPR-PCA
Li et al.	2023	MAF-Net: multidimensional attention fusion network for multichannel speech separation
Chen et al.	2021	A multichannel learning-based approach for sound source separation in reverberant environments
Ali et al.	2024	The identification and localization of speaker using fusion techniques and machine learning techniques
Hsu et al.	2023	Array configuration-agnostic personalized speech enhancement using long-short-term spatial coherence
Venkatesan et al.	2018	Deep recurrent neural networks based binaural speech segregation for the selection of closest target of interest
Ghalamiosgouei et al.	2021	Robust Speaker Identification Based on Binaural Masks
Al-Ali et al.	2021	Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments
Habib et al.	2013	Auditory inspired methods for localization of multiple concurrent speakers