SPIJST: Vol 27, No 2

Volume 27, Issue 2

Jun 2024

Publisher:

Springer-Verlag
Berlin, Heidelberg

ISSN: 1381-2416

research-article

Improving low-complexity and real-time DeepFilterNet2 for personalized speech enhancement

Pages 299–306, https://doi.org/10.1007/s10772-024-10101-z

Abstract

DeepFilterNet2, a recently proposed real-time and low-complexity speech enhancement (SE) technique, has shown state-of-the-art SE performance in many deep noise suppression tasks. This paper proposes a new approach, termed pDeepFilterNet2, to ...

retraction

Retraction Note: Speaker identification using hybrid neural network support vector machine classifier

Page 307, https://doi.org/10.1007/s10772-024-10105-9

research-article

VAD system under uncontrolled environment: A solution for strengthening the noise robustness using MMSE-SPZC

Pages 309–317, https://doi.org/10.1007/s10772-024-10104-w

Abstract

Voice activity detection (VAD) plays a crucial role in speech processing, serving as a fundamental component for various applications such as speech recognition and communication systems. Numerous approaches have been explored to address the VAD ...

research-article

Feature fusion: research on emotion recognition in English speech

Yongyan Yang

Pages 319–327, https://doi.org/10.1007/s10772-024-10107-7

Abstract

English speech incorporates numerous features associated with the speaker’s emotions, offering valuable cues for emotion recognition. This paper begins by briefly outlining preprocessing approaches for English speech signals. Subsequently, the Mel-...

research-article

An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx

Pages 329–339, https://doi.org/10.1007/s10772-024-10103-x

Abstract

In this study, we introduce an experimental framework for Moroccan dialect speech recognition under various additive noise conditions using the open-source tool PocketSphinx. We curated a corpus comprising the ten most commonly used greetings in ...

research-article

Effect of identical twins on deep speaker embeddings based forensic voice comparison

Pages 341–351, https://doi.org/10.1007/s10772-024-10108-6

Abstract

Deep learning has gained widespread adoption in forensic voice comparison in recent years. It is mainly used to learn speaker representations, known as embedding features or vectors. In this work, the effect of identical twins on two state-of-the-...

research-article

Speech emotion recognition with transfer learning and multi-condition training for noisy environments

Pages 353–365, https://doi.org/10.1007/s10772-024-10109-5

Abstract

This paper explores the use of transfer learning techniques to develop robust speech emotion recognition (SER) models capable of handling noise in real-world environments. Two SER frameworks have been proposed in this work: Framework-1 is a two-...

research-article

Server-side rescoring of spoken entity-centric knowledge queries for virtual assistants

Pages 367–375, https://doi.org/10.1007/s10772-024-10102-y

Abstract

On-device virtual assistants (VAs) powered by automatic speech recognition (ASR) require effective knowledge integration for the challenging entity-rich query recognition. In this paper, we conduct an empirical study of modeling strategies for ...

research-article

Attentional multi-feature fusion for spoofing-aware speaker verification

Pages 377–387, https://doi.org/10.1007/s10772-024-10112-w

Abstract

The Spoofing-Aware Speaker Verification (SASV) system is designed to protect automatic speaker verification (ASV) systems from potential speech spoofing attacks by integrating the ASV and countermeasure systems. The optimization of the ASV system ...

research-article

Fake news detection models using the largest social media ground-truth dataset (TruthSeeker)

Pages 389–404, https://doi.org/10.1007/s10772-024-10106-8

Abstract

Twitter is a powerful platform for communication and information sharing but is also susceptible to spreading false information. This false information has adverse consequences for society and can significantly impact public perception, decision-...

research-article

A robust approach to authorship verification using siamese deep learning: application in phishing email detection

Pages 405–412, https://doi.org/10.1007/s10772-024-10110-y

Abstract

Given the rapid and significant increase in email data, it is crucial for both individuals and organisations to prioritise the implementation of strong cybersecurity measures to combat attacks such as phishing emails. While continuous research has ...

research-article

Anomaly detection with a variational autoencoder for Arabic mispronunciation detection

Pages 413–424, https://doi.org/10.1007/s10772-024-10113-9

Abstract

Computer-assisted language learning (CALL) systems increasingly arouse a significant interest and establish a presence in automated foreign language learning. They enhance traditional learning methods by providing access to various accents and ...

research-article

Temporal feature-based approaches for enhancing phoneme boundary detection and masking in speech

Pages 425–436, https://doi.org/10.1007/s10772-024-10117-5

Abstract

Automatic phoneme boundary detection is a key problem in speech processing and applications. The accurate phoneme segmentation in continuous speech contributes to the improvement of recognition quality. This study proposes an efficient temporal ...

research-article

A quantal model for Algerian vowel features identification using formants and subglottal resonances

Pages 437–445, https://doi.org/10.1007/s10772-024-10114-8

Abstract

The objective of the present study was to test the hypothesis that subglottal resonances (SGRs) serve as quantal boundaries that separate Arabic vowel features. Previous research suggests that SGR frequencies predict discontinuities in vowel ...

research-article

Automatic hate speech detection in audio using machine learning algorithms

Pages 447–469, https://doi.org/10.1007/s10772-024-10116-6

Abstract

Even though every individual is entitled to freedom of speech, some limitations exist when this freedom is used to target and harm another individual or a group of people, as it translates to hate speech. In this study, the proposed research deals ...

research-article

Analyzing acoustic patterns of vowel sounds produced by native Rangri speakers

Pages 471–481, https://doi.org/10.1007/s10772-024-10122-8

Abstract

The foundational-cum-descriptive study aims to document Rangri vowel realizations acoustically. Rangri is a member of the Indo-Aryan language family. Ranghar Muhajir of Pakistani Punjab and some parts of Sindh speak Rangri, a zero-resource ...

research-article

Pathological voice classification system based on CNN-BiLSTM network using speech enhancement and multi-stream approach

Pages 483–502, https://doi.org/10.1007/s10772-024-10120-w

Abstract

Developing a resilient speech classification system for individuals with voice disorders poses a formidable challenge due to the significant variability and distortions inherent in vocal signals. This article outlines the steps to create ...

research-article

RETRACTED ARTICLE: Research on pronunciation accuracy detection of English Chinese consecutive interpretation in English intelligent speech translation terminal

Lei Jin

Page 503, https://doi.org/10.1007/s10772-021-09839-7

research-article

RETRACTED ARTICLE: Construction of voice access clustering model for online shopping user groups based on electronic communication data mining algorithm

Page 505, https://doi.org/10.1007/s10772-021-09863-7

research-article

RETRACTED ARTICLE: Research on life prediction method of rolling bearing based on deep learning and voice interaction technology

Page 507, https://doi.org/10.1007/s10772-021-09873-5

research-article

RETRACTED ARTICLE: Speech fault recognition method of music intelligent player based on communication feature analysis

Dongmei Li

Page 509, https://doi.org/10.1007/s10772-021-09889-x

research-article

RETRACTED ARTICLE: Ice detection and voice alarm of wind turbine blades based on belief network

Page 511, https://doi.org/10.1007/s10772-021-09891-3

research-article

RETRACTED ARTICLE: Accurate recognition of heterogeneous features in super resolution image visualization based on voice remote control system

Page 513, https://doi.org/10.1007/s10772-021-09892-2

research-article

RETRACTED ARTICLE: Dialect recognition from Telugu speech utterances using spectral and prosodic features

Page 515, https://doi.org/10.1007/s10772-021-09854-8

research-article

RETRACTED ARTICLE: Deep learning based cardiovascular disease diagnosis system from heartbeat sound

Page 517, https://doi.org/10.1007/s10772-021-09890-4

research-article

RETRACTED ARTICLE: Wearable sensor based acoustic gait analysis using phase transition-based optimization algorithm on IoT

Page 519, https://doi.org/10.1007/s10772-021-09893-1

International Journal of Speech Technology
