research-article
Improving low-complexity and real-time DeepFilterNet2 for personalized speech enhancement
Abstract

DeepFilterNet2, a recently proposed real-time and low-complexity speech enhancement (SE) technique, has shown state-of-the-art SE performance in many deep noise suppression tasks. This paper proposes a new approach, termed pDeepFilterNet2, to ...

research-article
VAD system under uncontrolled environment: A solution for strengthening the noise robustness using MMSE-SPZC
Abstract

Voice activity detection (VAD) plays a crucial role in speech processing, serving as a fundamental component for various applications such as speech recognition and communication systems. Numerous approaches have been explored to address the VAD ...

research-article
Feature fusion: research on emotion recognition in English speech
Abstract

English speech incorporates numerous features associated with the speaker’s emotions, offering valuable cues for emotion recognition. This paper begins by briefly outlining preprocessing approaches for English speech signals. Subsequently, the Mel-...

research-article
An experiment of Moroccan dialect speech recognition in noisy environments using PocketSphinx
Abstract

In this study, we introduce an experimental framework for Moroccan dialect speech recognition under various additive noise conditions using the open-source tool PocketSphinx. We curated a corpus comprising the ten most commonly used greetings in ...

research-article
Effect of identical twins on deep speaker embeddings based forensic voice comparison
Abstract

Deep learning has gained widespread adoption in forensic voice comparison in recent years. It is mainly used to learn speaker representations, known as embedding features or vectors. In this work, the effect of identical twins on two state-of-the-...

research-article
Speech emotion recognition with transfer learning and multi-condition training for noisy environments
Abstract

This paper explores the use of transfer learning techniques to develop robust speech emotion recognition (SER) models capable of handling noise in real-world environments. Two SER frameworks have been proposed in this work: Framework-1 is a two-...

research-article
Server-side rescoring of spoken entity-centric knowledge queries for virtual assistants
Abstract

On-device virtual assistants (VAs) powered by automatic speech recognition (ASR) require effective knowledge integration for the challenging entity-rich query recognition. In this paper, we conduct an empirical study of modeling strategies for ...

research-article
Attentional multi-feature fusion for spoofing-aware speaker verification
Abstract

The Spoofing-Aware Speaker Verification (SASV) system is designed to protect automatic speaker verification (ASV) systems from potential speech spoofing attacks by integrating the ASV and countermeasure systems. The optimization of the ASV system ...

research-article
Fake news detection models using the largest social media ground-truth dataset (TruthSeeker)
Abstract

Twitter is a powerful platform for communication and information sharing but is also susceptible to spreading false information. This false information has adverse consequences for society and can significantly impact public perception, decision-...

research-article
A robust approach to authorship verification using siamese deep learning: application in phishing email detection
Abstract

Given the rapid and significant increase in email data, it is crucial for both individuals and organisations to prioritise the implementation of strong cybersecurity measures to combat attacks such as phishing emails. While continuous research has ...

research-article
Anomaly detection with a variational autoencoder for Arabic mispronunciation detection
Abstract

Computer-assisted language learning (CALL) systems are attracting growing interest and establishing a presence in automated foreign language learning. They enhance traditional learning methods by providing access to various accents and ...

research-article
Temporal feature-based approaches for enhancing phoneme boundary detection and masking in speech
Abstract

Automatic phoneme boundary detection is a key problem in speech processing and applications. The accurate phoneme segmentation in continuous speech contributes to the improvement of recognition quality. This study proposes an efficient temporal ...

research-article
A quantal model for Algerian vowel features identification using formants and subglottal resonances
Abstract

The objective of the present study was to test the hypothesis that subglottal resonances (SGRs) serve as quantal boundaries that separate Arabic vowel features. Previous research suggests that SGR frequencies predict discontinuities in vowel ...

research-article
Automatic hate speech detection in audio using machine learning algorithms
Abstract

Even though every individual is entitled to freedom of speech, some limitations exist when this freedom is used to target and harm another individual or a group of people, as it translates to hate speech. In this study, the proposed research deals ...

research-article
Analyzing acoustic patterns of vowel sounds produced by native Rangri speakers
Abstract

This foundational and descriptive study aims to document Rangri vowel realizations acoustically. Rangri is a member of the Indo-Aryan language family. The Ranghar Muhajir of Pakistani Punjab and some parts of Sindh speak Rangri, a zero-resource ...

research-article
Pathological voice classification system based on CNN-BiLSTM network using speech enhancement and multi-stream approach
Abstract

Developing a resilient speech classification system for individuals with voice disorders poses a formidable challenge due to the significant variability and distortions inherent in vocal signals. This article outlines the steps to create ...
