Introduction to the Special Issue on Processing Reverberant Speech: Methodologies and Applications
The 17 papers in this special issue focus on the methodologies and applications of processing reverberant speech. The issue highlights some major aspects of the recent progress in the field.
Reverberation Model-Based Decoding in the Logmelspec Domain for Robust Distant-Talking Speech Recognition
The REMOS (REverberation MOdeling for Speech recognition) concept for reverberation-robust distant-talking speech recognition, introduced in “Distant-talking continuous speech recognition based on a novel reverberation model in the feature domain” (A. ...
Model-Based Feature Enhancement for Reverberant Speech Recognition
In this paper, we present a new technique for automatic speech recognition (ASR) in reverberant environments. Our approach is aimed at the enhancement of the logarithmic Mel power spectrum, which is computed at an intermediate stage to obtain the widely ...
Robust Speech Recognition Based on Dereverberation Parameter Optimization Using Acoustic Model Likelihood
Automatic speech recognition (ASR) in reverberant environments is a challenging task. Most dereverberation techniques address this problem through signal processing and enhance the reverberant waveform independently of the speech recognizer. In this ...
Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction
This paper proposes a statistical model-based speech dereverberation approach that can cancel the late reverberation of a reverberant speech signal captured by distant microphones without prior knowledge of the room impulse responses. With this approach,...
Model-Based Dereverberation Preserving Binaural Cues
The human auditory system's ability to localize sound depends mainly on binaural cues, especially interaural time and level differences (ITD and ILD). In the context of digital hearing aids and binaural audio transmission systems, these ...
Correlation-Based and Model-Based Blind Single-Channel Late-Reverberation Suppression in Noisy Time-Varying Acoustical Environments
This paper considers suppression of late reverberation and additive noise in single-channel speech recordings. The reverberation introduces long-term correlation in the observed signal. In the first part of this work, we show how this correlation can be ...
A Non-Intrusive Quality and Intelligibility Measure of Reverberant and Dereverberated Speech
A modulation spectral representation is investigated for non-intrusive quality and intelligibility measurement of reverberant and dereverberated speech. The representation is obtained by means of an auditory-inspired filterbank analysis of critical-band ...
Using Steady-State Suppression to Improve Speech Intelligibility in Reverberant Environments for Elderly Listeners
Reverberation is a major problem for speech communication, and strong reverberation is known to degrade speech intelligibility. This is especially true for elderly listeners and people with hearing impairments. Several approaches have been proposed ...
Using Reverberation to Improve Range and Elevation Discrimination for Small Array Sound Source Localization
Sound source localization (SSL) is an essential task in many applications involving speech capture and enhancement. As such, speaker localization with microphone arrays has received significant research attention. Nevertheless, existing SSL algorithms ...
Binaural Estimation of Sound Source Distance via the Direct-to-Reverberant Energy Ratio for Static and Moving Sources
One of the principal cues believed to be used by listeners to estimate the distance to a sound source is the ratio of energies along the direct and indirect paths to the receiver. In essence, this “direct-to-reverberant” energy ratio reveals the ...
An Acoustic Source Localization and Tracking Framework Using Particle Filtering and Information Theory
The problem of detecting the location of an active acoustic source in an enclosure remains subject to a series of difficulties. Algorithms operate repeatedly on small frames of data from microphone recordings and provide estimates of the current source ...
Beyond the Narrowband Approximation: Wideband Convex Methods for Under-Determined Reverberant Audio Source Separation
We consider the problem of extracting the source signals from an under-determined convolutive mixture assuming known mixing filters. State-of-the-art methods operate in the time-frequency domain and rely on narrowband approximation of the convolutive ...
Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model
This paper addresses the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a zero-mean ...
Glimpsing IVA: A Framework for Overcomplete/Complete/Undercomplete Convolutive Source Separation
Independent vector analysis (IVA) is a method for separating convolutively mixed signals that significantly reduces the occurrence of the well-known permutation problem in frequency-domain blind source separation (BSS). In this paper, we develop a novel ...
Sequential Organization of Speech in Reverberant Environments by Integrating Monaural Grouping and Binaural Localization
Existing binaural approaches to speech segregation place an exclusive burden on cues related to the location of sound sources in space. These approaches can achieve excellent performance in anechoic conditions but degrade rapidly in realistic ...
Dynamic Precedence Effect Modeling for Source Separation in Reverberant Environments
Reverberation continues to present a major problem for sound source separation algorithms. However, humans demonstrate a remarkable robustness to reverberation and many psychophysical and perceptual mechanisms are well documented. The precedence effect ...
Evaluating Source Separation Algorithms With Reverberant Speech
This paper examines the performance of several source separation systems on a speech separation task for which human intelligibility has previously been measured. For anechoic mixtures, automatic speech recognition (ASR) performance on the separated ...