Wang, 2008 - Google Patents

Time-frequency masking for speech separation and its potential for hearing aid design

Document ID: 3846170451997686267
Author: Wang D
Publication year: 2008
Publication venue: Trends in amplification

Snippet

A new approach to the separation of speech from speech-in-noise mixtures is the use of time-frequency (TF) masking. Originating in the field of computational auditory scene analysis, TF masking performs separation in the time-frequency domain. This article introduces the TF …
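
The snippet is truncated before it says which kind of mask the article covers. As a rough illustration of separation in the time-frequency domain, the sketch below assumes a binary mask that keeps a TF unit only when its local signal-to-noise ratio exceeds a threshold. The function names, the `lc_db` threshold, and the toy energy values are hypothetical and not taken from the article, and the mixture energy is treated as the plain sum of target and noise energy for simplicity.

```python
import math

def binary_mask(target_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask: 1 where the local SNR in dB exceeds lc_db.

    Inputs are lists of rows (frequency channels) of per-frame energies.
    lc_db is an assumed threshold name, not one defined in the snippet.
    """
    mask = []
    for t_row, n_row in zip(target_energy, noise_energy):
        row = []
        for t, n in zip(t_row, n_row):
            if t <= 0.0:
                keep = False          # no target energy: discard the unit
            elif n <= 0.0:
                keep = True           # target present, no noise: keep the unit
            else:
                keep = 10.0 * math.log10(t / n) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask

def apply_mask(mixture_energy, mask):
    """Weight each TF unit of the mixture by its mask value."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture_energy, mask)]

# Toy data: 2 frequency channels x 3 time frames (made-up values).
target = [[4.0, 0.5, 9.0], [0.1, 2.0, 0.0]]
noise = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
# Simplifying assumption: energies add in the mixture.
mixture = [[t + n for t, n in zip(t_row, n_row)]
           for t_row, n_row in zip(target, noise)]

mask = binary_mask(target, noise)
separated = apply_mask(mixture, mask)
print(mask)
print(separated)
```

In a real system the target and noise energies are not available separately, so the mask has to be estimated from the mixture; the sketch only shows what applying a mask to TF units means.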

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0202—Applications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication	Publication Date	Title
Wang	2008	Time-frequency masking for speech separation and its potential for hearing aid design
Das et al.	2021	Fundamentals, present and future perspectives of speech enhancement
Wang et al.	2018	Supervised speech separation based on deep learning: An overview
Kim et al.	2009	An algorithm that improves speech intelligibility in noise for normal-hearing listeners
Stern et al.	2012	Hearing is believing: Biologically inspired methods for robust automatic speech recognition
Michelsanti et al.	2019	Deep-learning-based audio-visual speech enhancement in presence of Lombard effect
Kollmeier et al.	2008	Perception of speech and sound
Pertilä et al.	2015	Distant speech separation using predicted time–frequency masks from spatial features
Zhao et al.	2018	A deep learning based segregation algorithm to increase speech intelligibility for hearing-impaired listeners in reverberant-noisy conditions
Bramsløw et al.	2018	Improving competing voices segregation for hearing impaired listeners using a low-latency deep neural network algorithm
Monaghan et al.	2017	Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners
Keshavarzi et al.	2018	Use of a deep recurrent neural network to reduce wind noise: Effects on judged speech intelligibility and sound quality
Wang et al.	2018	Speech enhancement for cochlear implant recipients
Zai et al.	2015	Reconstruction of audio waveforms from spike trains of artificial cochlea models
Hansen et al.	2020	A speech perturbation strategy based on “Lombard effect” for enhanced intelligibility for cochlear implant listeners
Priyanka et al.	2023	Multi-channel speech enhancement using early and late fusion convolutional neural networks
Albornoz et al.	2017	Feature extraction based on bio-inspired model for robust emotion recognition
Abel et al.	2014	Novel two-stage audiovisual speech filtering in noisy environments
Venkatesan et al.	2018	Binaural classification-based speech segregation and robust speaker recognition system
Patil et al.	2022	Marathi speech intelligibility enhancement using I-AMS based neuro-fuzzy classifier approach for hearing aid users
Zouhir et al.	2014	A bio-inspired feature extraction for robust speech recognition
Hummersone	2011	A psychoacoustic engineering approach to machine sound source separation in reverberant environments
Li et al.	2021	Improved environment-aware–based noise reduction system for cochlear implant users based on a knowledge transfer approach: Development and usability study
Gößling et al.	2020	Perceptual evaluation of binaural MVDR-based algorithms to preserve the interaural coherence of diffuse noise fields
Maciejewski et al.	2018	Building corpora for single-channel speech separation across multiple domains