Kavalekalam et al., 2017 - Google Patents
Model based binaural enhancement of voiced and unvoiced speechKavalekalam et al., 2017
View PDF- Document ID
- 9796278269130800097
- Author
- Kavalekalam M
- Christensen M
- Boldt J
- Publication year
- Publication venue
- 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
External Links
Snippet
This paper deals with the enhancement of speech in presence of non-stationary babble noise. A binaural speech enhancement framework is proposed which takes into account both the voiced and unvoiced speech production model. The usage of this model in …
- 238000004519 manufacturing process 0 abstract description 5
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0202—Applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Williamson et al. | Time-frequency masking in the complex domain for speech dereverberation and denoising | |
Zhang et al. | ADL-MVDR: All deep learning MVDR beamformer for target speech separation | |
Wang et al. | Complex spectral mapping for single-and multi-channel speech enhancement and robust ASR | |
Zhao et al. | Two-stage deep learning for noisy-reverberant speech enhancement | |
Taherian et al. | Robust speaker recognition based on single-channel and multi-channel speech enhancement | |
Kuklasiński et al. | Maximum likelihood PSD estimation for speech enhancement in reverberation and noise | |
Sadjadi et al. | Hilbert envelope based features for robust speaker identification under reverberant mismatched conditions | |
Li et al. | Multichannel speech enhancement based on time-frequency masking using subband long short-term memory | |
Kavalekalam et al. | Model-based speech enhancement for intelligibility improvement in binaural hearing aids | |
Hendriks et al. | Optimal near-end speech intelligibility improvement incorporating additive noise and late reverberation under an approximation of the short-time SII | |
Kavalekalam et al. | Kalman filter for speech enhancement in cocktail party scenarios using a codebook-based approach | |
Luo | A time-domain real-valued generalized wiener filter for multi-channel neural separation systems | |
Dadvar et al. | Robust binaural speech separation in adverse conditions based on deep neural network with modified spatial features and training target | |
Li et al. | Multichannel online dereverberation based on spectral magnitude inverse filtering | |
Westhausen et al. | Low bit rate binaural link for improved ultra low-latency low-complexity multichannel speech enhancement in Hearing Aids | |
Li et al. | Speech enhancement algorithm based on sound source localization and scene matching for binaural digital hearing aids | |
Kavalekalam et al. | Model based binaural enhancement of voiced and unvoiced speech | |
Aroudi et al. | Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding | |
Gerkmann | Cepstral weighting for speech dereverberation without musical noise | |
Kavalekalam et al. | Binaural speech enhancement using a codebook based approach | |
Miyazaki et al. | Theoretical analysis of parametric blind spatial subtraction array and its application to speech recognition performance prediction | |
Ji et al. | Coherence-Based Dual-Channel Noise Reduction Algorithm in a Complex Noisy Environment. | |
Kavalekalam et al. | Hearing aid-controlled beamformer for binaural speech enhancement using a model-based approach | |
Li et al. | A composite t60 regression and classification approach for speech dereverberation | |
Schwarz et al. | On blocking matrix-based dereverberation for automatic speech recognition |