SPCO: Vol 55, No 10

Volume 55, Issue 10, November 2013

Publisher:

Elsevier Science Publishers B. V.
PO Box 211 1000 AE Amsterdam
Netherlands

ISSN: 0167-6393

article

Detection of overlapped speech using lapel microphones in meeting

Pages 941–949

https://doi.org/10.1016/j.specom.2013.06.013

We propose an overlapped speech detection method for speech recognition and speaker diarization of meetings, where each speaker wears a lapel microphone. Two novel features are utilized as inputs for a GMM-based detector. One is speech power after cross-...

article

Rapid speaker adaptation using compressive sensing

Pages 950–963

https://doi.org/10.1016/j.specom.2013.06.012

Speaker-space-based speaker adaptation methods can obtain good performance even if the amount of adaptation data is limited. However, it is difficult to determine the optimal dimension and basis vectors of the subspace for a particular unknown speaker. ...

article

Speech enhancement based on soft audible noise masking and noise power estimation

Rongshan Yu

Pages 964–974

https://doi.org/10.1016/j.specom.2013.05.006

This paper presents a perceptual model based speech enhancement algorithm. The proposed algorithm measures the amount of the audible noise in the input noisy speech based on estimation of short-time spectral power of noise signal, and masking threshold ...

article

Analysis of two-sensors forward BSS structure with post-filters in the presence of coherent and incoherent noise

Pages 975–987

https://doi.org/10.1016/j.specom.2013.06.001

We consider the speech enhancement problem in a moving car through a blind source separation (BSS) scheme involving two sensors. To correct the distortion introduced by this structure, we proposed in previous work (Djendi et al., 2007) two frequency-...

article

Classifying the socio-situational settings of transcripts of spoken discourses

Pages 988–1002

https://doi.org/10.1016/j.specom.2013.06.011

In this paper, we investigate automatic classification of the socio-situational settings of transcripts of a spoken discourse. Knowledge of the socio-situational setting can be used to search for content recorded in a particular setting or to select ...

article

Modified segmental signal-to-noise ratio reflecting spectral masking effect for evaluating the performance of hearing aid algorithms

Pages 1003–1010

https://doi.org/10.1016/j.specom.2013.05.005

Most traditional objective indices don't distinguish between a real sound and a perceived sound, and therefore, these indices have limitations in regard to the evaluation of the real effect of an algorithm under investigation on the auditory perception ...

article

A Hilbert-fine-structure-derived physical metric for predicting the intelligibility of noise-distorted and noise-suppressed speech

Pages 1011–1020

https://doi.org/10.1016/j.specom.2013.06.016

Despite the established importance of temporal fine-structure (TFS) on speech perception in noise, existing speech transmission metrics use primarily envelope information to model speech intelligibility variance. This study proposes a new physical ...

article

Speech intelligibility for different spatial configurations of target speech and competing noise source in a horizontal and median plane

Pages 1021–1032

https://doi.org/10.1016/j.specom.2013.06.009

Speech intelligibility was investigated for different configurations of a target signal (speech) and a masker (babble noise) in the horizontal and median planes. The sources were placed at the front, at the back, or on the right-hand side (at different ...

article

Speaker-adaptive speech recognition using speaker diarization for improved transcription of large spoken archives

Pages 1033–1046

https://doi.org/10.1016/j.specom.2013.06.017

This paper deals with speaker-adaptive speech recognition for large spoken archives. The goal is to improve the recognition accuracy of an automatic speech recognition (ASR) system that is being deployed for transcription of a large archive of Czech ...

article

On the identification of relevant degradation indicators in super wideband listening quality assessment models

Pages 1047–1063

https://doi.org/10.1016/j.specom.2013.06.010

Recently, new objective speech quality evaluation methods, designed and adapted to new high voice quality contexts, have been developed. One interest of these methods is that they integrate voice quality perceptual dimensions reflecting the effects of ...

article

Japanese lexical accent recognition for a CALL system by deriving classification equations with perceptual experiments

Pages 1064–1080

https://doi.org/10.1016/j.specom.2013.07.002

For non-native learners of Japanese, the pitch accent can be cumbersome to acquire without proper instruction. A Computer Assisted Language Learning (CALL) system could aid these learners in this acquisition provided that it can generate helpful ...

article

Modulation domain blind speech separation in noisy environments

Pages 1081–1099

https://doi.org/10.1016/j.specom.2013.06.014

We propose a noise robust blind speech separation (BSS) method by using two microphones. We perform BSS in the modulation domain to take advantage of the improved signal sparsity and reduced musical tone noise in this domain over the conventional ...

Speech Communication
