Nothing Special   »   [go: up one dir, main page]

Wang, 2008 - Google Patents

Time-frequency masking for speech separation and its potential for hearing aid design

Wang, 2008

View HTML @Full View
Document ID
3846170451997686267
Author
Wang D
Publication year
Publication venue
Trends in amplification

External Links

Snippet

A new approach to the separation of speech from speech-in-noise mixtures is the use of time- frequency (TF) masking. Originated in the field of computational auditory scene analysis, TF masking performs separation in the time-frequency domain. This article introduces the TF …
Continue reading at journals.sagepub.com (HTML) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161Number of inputs available containing the signal or the noise to be suppressed
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0202Applications
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/26Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/04Training, enrolment or model building
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003Changing voice quality, e.g. pitch or formants
    • G10L21/007Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013Adapting to target pitch
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065Adaptation
    • G10L15/07Adaptation to the speaker
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication Publication Date Title
Wang Time-frequency masking for speech separation and its potential for hearing aid design
Das et al. Fundamentals, present and future perspectives of speech enhancement
Wang et al. Supervised speech separation based on deep learning: An overview
Kim et al. An algorithm that improves speech intelligibility in noise for normal-hearing listeners
Stern et al. Hearing is believing: Biologically inspired methods for robust automatic speech recognition
Michelsanti et al. Deep-learning-based audio-visual speech enhancement in presence of Lombard effect
Kollmeier et al. Perception of speech and sound
Pertilä et al. Distant speech separation using predicted time–frequency masks from spatial features
Zhao et al. A deep learning based segregation algorithm to increase speech intelligibility for hearing-impaired listeners in reverberant-noisy conditions
Bramsløw et al. Improving competing voices segregation for hearing impaired listeners using a low-latency deep neural network algorithm
Monaghan et al. Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners
Keshavarzi et al. Use of a deep recurrent neural network to reduce wind noise: Effects on judged speech intelligibility and sound quality
Wang et al. Speech enhancement for cochlear implant recipients
Zai et al. Reconstruction of audio waveforms from spike trains of artificial cochlea models
Hansen et al. A speech perturbation strategy based on “Lombard effect” for enhanced intelligibility for cochlear implant listeners
Priyanka et al. Multi-channel speech enhancement using early and late fusion convolutional neural networks
Albornoz et al. Feature extraction based on bio-inspired model for robust emotion recognition
Abel et al. Novel two-stage audiovisual speech filtering in noisy environments
Venkatesan et al. Binaural classification-based speech segregation and robust speaker recognition system
Patil et al. Marathi speech intelligibility enhancement using I-AMS based neuro-fuzzy classifier approach for hearing aid users
Zouhir et al. A bio-inspired feature extraction for robust speech recognition
Hummersone A psychoacoustic engineering approach to machine sound source separation in reverberant environments
Li et al. Improved environment-aware–based noise reduction system for cochlear implant users based on a knowledge transfer approach: Development and usability study
Gößling et al. Perceptual evaluation of binaural MVDR-based algorithms to preserve the interaural coherence of diffuse noise fields
Maciejewski et al. Building corpora for single-channel speech separation across multiple domains