Nothing Special   »   [go: up one dir, main page]

Kavalekalam et al., 2017 - Google Patents

Model based binaural enhancement of voiced and unvoiced speech

Kavalekalam et al., 2017

View PDF
Document ID
9796278269130800097
Author
Kavalekalam M
Christensen M
Boldt J
Publication year
Publication venue
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

External Links

Snippet

This paper deals with the enhancement of speech in presence of non-stationary babble noise. A binaural speech enhancement framework is proposed which takes into account both the voiced and unvoiced speech production model. The usage of this model in …
Continue reading at vbn.aau.dk (PDF) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161Number of inputs available containing the signal or the noise to be suppressed
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0202Applications
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003Changing voice quality, e.g. pitch or formants
    • G10L21/007Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013Adapting to target pitch
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/04Training, enrolment or model building

Similar Documents

Publication Publication Date Title
Williamson et al. Time-frequency masking in the complex domain for speech dereverberation and denoising
Zhang et al. ADL-MVDR: All deep learning MVDR beamformer for target speech separation
Wang et al. Complex spectral mapping for single-and multi-channel speech enhancement and robust ASR
Zhao et al. Two-stage deep learning for noisy-reverberant speech enhancement
Taherian et al. Robust speaker recognition based on single-channel and multi-channel speech enhancement
Kuklasiński et al. Maximum likelihood PSD estimation for speech enhancement in reverberation and noise
Sadjadi et al. Hilbert envelope based features for robust speaker identification under reverberant mismatched conditions
Li et al. Multichannel speech enhancement based on time-frequency masking using subband long short-term memory
Kavalekalam et al. Model-based speech enhancement for intelligibility improvement in binaural hearing aids
Hendriks et al. Optimal near-end speech intelligibility improvement incorporating additive noise and late reverberation under an approximation of the short-time SII
Kavalekalam et al. Kalman filter for speech enhancement in cocktail party scenarios using a codebook-based approach
Luo A time-domain real-valued generalized wiener filter for multi-channel neural separation systems
Dadvar et al. Robust binaural speech separation in adverse conditions based on deep neural network with modified spatial features and training target
Li et al. Multichannel online dereverberation based on spectral magnitude inverse filtering
Westhausen et al. Low bit rate binaural link for improved ultra low-latency low-complexity multichannel speech enhancement in Hearing Aids
Li et al. Speech enhancement algorithm based on sound source localization and scene matching for binaural digital hearing aids
Kavalekalam et al. Model based binaural enhancement of voiced and unvoiced speech
Aroudi et al. Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding
Gerkmann Cepstral weighting for speech dereverberation without musical noise
Kavalekalam et al. Binaural speech enhancement using a codebook based approach
Miyazaki et al. Theoretical analysis of parametric blind spatial subtraction array and its application to speech recognition performance prediction
Ji et al. Coherence-Based Dual-Channel Noise Reduction Algorithm in a Complex Noisy Environment.
Kavalekalam et al. Hearing aid-controlled beamformer for binaural speech enhancement using a model-based approach
Li et al. A composite t60 regression and classification approach for speech dereverberation
Schwarz et al. On blocking matrix-based dereverberation for automatic speech recognition