Detection of overlapped speech using lapel microphones in meeting
We propose an overlapped speech detection method for speech recognition and speaker diarization of meetings, where each speaker wears a lapel microphone. Two novel features are utilized as inputs for a GMM-based detector. One is speech power after cross-...
Rapid speaker adaptation using compressive sensing
Speaker-space-based speaker adaptation methods can obtain good performance even if the amount of adaptation data is limited. However, it is difficult to determine the optimal dimension and basis vectors of the subspace for a particular unknown speaker. ...
Speech enhancement based on soft audible noise masking and noise power estimation
This paper presents a perceptual model based speech enhancement algorithm. The proposed algorithm measures the amount of the audible noise in the input noisy speech based on estimation of short-time spectral power of noise signal, and masking threshold ...
Analysis of two-sensors forward BSS structure with post-filters in the presence of coherent and incoherent noise
We consider the speech enhancement problem in a moving car through a blind source separation (BSS) scheme involving two sensors. To correct the distortion brought by this structure, we have proposed in previous work (Djendi et al., 2007) two frequency-...
Classifying the socio-situational settings of transcripts of spoken discourses
In this paper, we investigate automatic classification of the socio-situational settings of transcripts of a spoken discourse. Knowledge of the socio-situational setting can be used to search for content recorded in a particular setting or to select ...
Modified segmental signal-to-noise ratio reflecting spectral masking effect for evaluating the performance of hearing aid algorithms
- Sunhyun Yook,
- Kyoung Won Nam,
- Heepyung Kim,
- See Youn Kwon,
- Dongwook Kim,
- Sangmin Lee,
- Sung Hwa Hong,
- Dong Pyo Jang,
- In Young Kim
Most traditional objective indices don't distinguish between a real sound and a perceived sound, and therefore, these indices have limitations in regard to the evaluation of the real effect of an algorithm under investigation on the auditory perception ...
A Hilbert-fine-structure-derived physical metric for predicting the intelligibility of noise-distorted and noise-suppressed speech
Despite the established importance of temporal fine-structure (TFS) on speech perception in noise, existing speech transmission metrics use primarily envelope information to model speech intelligibility variance. This study proposes a new physical ...
Speech intelligibility for different spatial configurations of target speech and competing noise source in a horizontal and median plane
The speech intelligibility for different configurations of a target signal (speech) and masker (babble noise) in a horizontal and a median plane was investigated. The sources were placed at the front, at the back, or on the right-hand side (at different ...
Speaker-adaptive speech recognition using speaker diarization for improved transcription of large spoken archives
This paper deals with speaker-adaptive speech recognition for large spoken archives. The goal is to improve the recognition accuracy of an automatic speech recognition (ASR) system that is being deployed for transcription of a large archive of Czech ...
On the identification of relevant degradation indicators in super wideband listening quality assessment models
Recently, new objective speech quality evaluation methods, designed and adapted to new high voice quality contexts, have been developed. One benefit of these methods is that they integrate voice quality perceptual dimensions reflecting the effects of ...
Japanese lexical accent recognition for a CALL system by deriving classification equations with perceptual experiments
For non-native learners of Japanese, the pitch accent can be cumbersome to acquire without proper instruction. A Computer Assisted Language Learning (CALL) system could aid these learners in this acquisition provided that it can generate helpful ...
Modulation domain blind speech separation in noisy environments
We propose a noise robust blind speech separation (BSS) method by using two microphones. We perform BSS in the modulation domain to take advantage of the improved signal sparsity and reduced musical tone noise in this domain over the conventional ...