He et al., 2020 - Google Patents

Mask-based blind source separation and MVDR beamforming in ASR

Document ID
5638381240622721531
Author
He R
Long Y
Li Y
Liang J
Publication year
2020
Publication venue
International Journal of Speech Technology

Snippet

This paper presents a front-end enhancement system for automatic speech recognition to address the cocktail party problem. The cocktail party problem is the task of recognizing the target speech when multiple speakers talk in noisy real-world environments. Many conventional …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

Similar Documents

Publication Publication Date Title
Wang et al. Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation
CN109830245B (en) A method and system for multi-speaker speech separation based on beamforming
Tan et al. Neural spectrospatial filtering
Chazan et al. Multi-microphone speaker separation based on deep DOA estimation
Grais et al. Raw multi-channel audio source separation using multi-resolution convolutional auto-encoders
Schädler et al. Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition
CN110970053A (en) Multichannel speaker-independent voice separation method based on deep clustering
JP6348427B2 (en) Noise removal apparatus and noise removal program
Haridas et al. A novel approach to improve the speech intelligibility using fractional delta-amplitude modulation spectrogram
Sun et al. Joint constraint algorithm based on deep neural network with dual outputs for single-channel speech separation
Gul et al. Integration of deep learning with expectation maximization for spatial cue-based speech separation in reverberant conditions
Saleem et al. Low rank sparse decomposition model based speech enhancement using gammatone filterbank and Kullback–Leibler divergence
Nugraha et al. Deep neural network based multichannel audio source separation
Li et al. Speech enhancement algorithm based on sound source localization and scene matching for binaural digital hearing aids
He et al. Mask-based blind source separation and MVDR beamforming in ASR
Li et al. Speech enhancement based on binaural sound source localization and cosh measure wiener filtering
Sheeja et al. Speech dereverberation and source separation using DNN-WPE and LWPR-PCA
Li et al. MAF-Net: multidimensional attention fusion network for multichannel speech separation
Chen et al. A multichannel learning-based approach for sound source separation in reverberant environments
Ali et al. The identification and localization of speaker using fusion techniques and machine learning techniques
Hsu et al. Array configuration-agnostic personalized speech enhancement using long-short-term spatial coherence
Venkatesan et al. Deep recurrent neural networks based binaural speech segregation for the selection of closest target of interest
Ghalamiosgouei et al. Robust Speaker Identification Based on Binaural Masks
Al-Ali et al. Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments
Habib et al. Auditory inspired methods for localization of multiple concurrent speakers