Yu et al., 2010 - Google Patents

Automatic beamforming for blind extraction of speech from music environment using variance of spectral flux-inspired criterion

Document ID: 7064054465510776848
Author: Yu T; Hansen J
Publication year: 2010
Publication venue: IEEE Journal of Selected Topics in Signal Processing

Snippet

This paper addresses the problem of automatic beamforming for blind extraction of speech in a music environment, using multiple microphones. A new criterion is proposed based on the variance of the spectral flux (VSF), which is shown to be a compound measure of the …
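The snippet above is truncated, so the paper's exact VSF criterion is not reproduced here. As a rough illustration only, the sketch below assumes a common textbook definition of spectral flux (the L2 distance between successive normalized magnitude spectra) and takes its variance over frames. The function names, `frame_len`, and `hop` are hypothetical choices for this example and are not taken from the paper.

```python
import cmath
import math
import statistics


def magnitude_spectrum(frame):
    """Magnitude of the one-sided DFT of a real frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(signal, frame_len=64, hop=32):
    """Flux per frame pair: L2 distance between successive normalized spectra."""
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        mag = magnitude_spectrum(signal[start:start + frame_len])
        total = sum(mag) or 1.0  # avoid division by zero on silent frames
        spectra.append([m / total for m in mag])
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(cur, prev)))
        for prev, cur in zip(spectra, spectra[1:])
    ]


def variance_of_spectral_flux(signal, frame_len=64, hop=32):
    """Population variance of the flux sequence (assumed form of a VSF-style measure)."""
    return statistics.pvariance(spectral_flux(signal, frame_len, hop))


# A steady tone has near-constant (near-zero) flux; a tone that switches
# frequency midway produces flux spikes and therefore a larger variance.
steady = [math.sin(2 * math.pi * 4 * t / 64) for t in range(512)]
changing = [math.sin(2 * math.pi * (4 if t < 256 else 12) * t / 64) for t in range(512)]
print(variance_of_spectral_flux(steady) < variance_of_spectral_flux(changing))
```

Under this assumed definition, a signal whose spectrum changes irregularly over time yields a higher flux variance than a spectrally stationary one, which suggests why such a measure could be used to tell sources apart. How the paper turns it into a beamforming criterion is not stated in the snippet.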

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/06—Decision making techniques; Pattern matching strategies
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers

Similar Documents

Publication	Publication Date	Title
CN109830245B (en)	2021-03-12	Multi-speaker voice separation method and system based on beam forming
Saruwatari et al.	2003	Blind source separation combining independent component analysis and beamforming
Vu et al.	2010	Blind speech separation employing directional statistics in an expectation maximization framework
Souden et al.	2013	A multichannel MMSE-based framework for speech source separation and noise reduction
McCowan et al.	2001	Robust speaker recognition using microphone arrays
Kolossa et al.	2004	Nonlinear postprocessing for blind speech separation
Ito et al.	2010	Designing the Wiener post-filter for diffuse noise suppression using imaginary parts of inter-channel cross-spectra
Kumatani et al.	2009	Beamforming with a maximum negentropy criterion
Knaak et al.	2007	Geometrically constrained independent component analysis
Koldovský et al.	2013	Semi-blind noise extraction using partially known position of the target source
Xiao et al.	2016	Beamforming networks using spatial covariance features for far-field speech recognition
Li et al.	2020	Online Directional Speech Enhancement Using Geometrically Constrained Independent Vector Analysis
Hoang et al.	2021	Joint maximum likelihood estimation of power spectral densities and relative acoustic transfer functions for acoustic beamforming
Kovalyov et al.	2023	Dsenet: Directional signal extraction network for hearing improvement on edge devices
Yu et al.	2010	Automatic beamforming for blind extraction of speech from music environment using variance of spectral flux-inspired criterion
Maazaoui et al.	2012	Adaptive blind source separation with HRTFs beamforming preprocessing
Kim et al.	2011	Probabilistic spectral gain modification applied to beamformer-based noise reduction in a car environment
Martín-Doñas et al.	2019	Multi-channel block-online source extraction based on utterance adaptation
Yamaoka et al.	2017	Performance evaluation of nonlinear speech enhancement based on virtual increase of channels in reverberant environments
McCowan et al.	2001	Multi-channel sub-band speech recognition
Rafique et al.	2016	Mixed source prior for the fast independent vector analysis algorithm
Zamani et al.	2019	Convolutive blind source separation with independent vector analysis and beamforming
Goto et al.	2020	Study on geometrically constrained IVA with auxiliary function approach and VCD for in-car communication
Chen et al.	2022	Reference microphone selection and low-rank approximation based multichannel wiener filter with application to speech recognition
Mimura et al.	2017	Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition