Kim et al., 2012 - Google Patents

Adaptation mode control with residual noise estimation for beamformer-based multi-channel speech enhancement

Document ID: 14012477935747418023
Authors: Kim S; Kim H; Lee S; Lee Y
Publication year: 2012
Publication venue: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Snippet

In this paper, we propose a new adaptation mode controller (AMC) for a generalized sidelobe canceller (GSC) having prior knowledge of the direction-of-arrival (DOA) of a desired speech source. In order to optimize the adaptation mode of a GSC, the residual …

Classifications

- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0208—Noise filtering
            - G10L21/0216—Noise filtering characterised by the method used for estimating noise
              - G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
                - G10L2021/02166—Microphone arrays; Beamforming
              - G10L21/0232—Processing in the frequency domain
          - G10L21/0272—Voice signal separating
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
        - G10L25/78—Detection of presence or absence of voice signals
          - G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
        - G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
          - G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
        - G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
      - G10L15/00—Speech recognition
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
        - G10L15/08—Speech classification or search
        - G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
      - G10L17/00—Speaker identification or verification
        - G10L17/04—Training, enrolment or model building
      - G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
        - G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
      - H04R3/00—Circuits for transducers, loudspeakers or microphones
        - H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones

Similar Documents

Publication	Publication Date	Title
Parchami et al.	2016	Recent developments in speech enhancement in the short-time Fourier transform domain
Subramanian et al.	2019	Speech enhancement using end-to-end speech recognition objectives
EP2058797B1 (en)	2011-05-04	Discrimination between foreground speech and background noise
KR101726737B1 (en)	2017-04-13	Apparatus for separating multi-channel sound source and method the same
Abramson et al.	2007	Simultaneous detection and estimation approach for speech enhancement
CN101369427A (en)	2009-02-18	Noise reduction by combined beamforming and post-filtering
DEREVERBERATION et al.	2014	REVERB Workshop 2014
Garg et al.	2016	A comparative study of noise reduction techniques for automatic speech recognition systems
US9875748B2 (en)	2018-01-23	Audio signal noise attenuation
Martín-Doñas et al.	2017	Dual-channel DNN-based speech enhancement for smartphones
Naik et al.	2020	A literature survey on single channel speech enhancement techniques
Bohlender et al.	2021	Neural networks using full-band and subband spatial features for mask based source separation
EP3847645B1 (en)	2022-04-13	Determining a room response of a desired source in a reverberant environment
Kodrasi et al.	2018	Single-channel Late Reverberation Power Spectral Density Estimation Using Denoising Autoencoders
Krishnamoorthy et al.	2009	Temporal and spectral processing methods for processing of degraded speech: a review
Tashev et al.	2009	Unified framework for single channel speech enhancement
Kim et al.	2011	Hybrid probabilistic adaptation mode controller for generalized sidelobe canceller-based target-directional speech enhancement
Kim et al.	2022	iDeepMMSE: An improved deep learning approach to MMSE speech and noise power spectrum estimation for speech enhancement
Martín Doñas et al.	2021	Dual-channel eKF-RTF framework for speech enhancement with DNN-based speech presence estimation
Son et al.	2012	Improved speech absence probability estimation based on environmental noise classification
Cheong et al.	2024	Postfilter for Dual Channel Speech Enhancement Using Coherence and Statistical Model-Based Noise Estimation
Aprilyanti et al.	2014	Optimized joint noise suppression and dereverberation based on blind signal extraction for hands-free speech recognition system
Abutalebi et al.	2011	Speech dereverberation in noisy environments using an adaptive minimum mean square error estimator
Bartolewska et al.	2021	Frame-based Maximum a Posteriori Estimation of Second-Order Statistics for Multichannel Speech Enhancement in Presence of Noise