research-article
DNN controlled adaptive front-end for replay attack detection systems
Highlights

  • Conventional methods fall short in detecting replay spoofing attacks effectively.
  • Auditory-based dynamic filters can detect artefacts in high-quality replayed signals.
  • Deep neural networks can adaptively learn filter traits based ...

Abstract

Developing robust countermeasures to protect automatic speaker verification systems against replay spoofing attacks is a well-recognized challenge. Current approaches to spoofing detection are generally based on a fixed front-end, typically a ...

research-article
Acoustic properties of non-native clear speech: Korean speakers of English
Highlights

  • Non-native clear speech is acoustically distinct from casual speech.
  • The nature of modifications is the same in native and non-native clear speech.
  • The magnitude of modifications is different in native and non-native clear speech. ...

Abstract

The present study examined the acoustic properties of clear speech produced by non-native speakers of English (L1 Korean), in comparison to native clear speech. L1 Korean speakers of English (N=30) and native speakers of English (N=20) read an ...

review-article
Speech emotion recognition approaches: A systematic review
Abstract

The speech emotion recognition (SER) field has been active since it became a crucial feature in advanced Human–Computer Interaction (HCI), and wide real-life applications use it. In recent years, numerous SER systems have been covered by ...

Highlights

  • The speech emotion recognition (SER) field became crucial in advanced Human–Computer Interaction (HCI).
  • Numerous SER systems have been proposed by researchers using Machine Learning (ML) and Deep Learning (DL).
  • This survey aims to ...

research-article
Model predictive PESQ-ANFIS/FUZZY C-MEANS for image-based speech signal evaluation
Abstract

This paper presents a new method to evaluate the quality of speech signals through images generated from a psychoacoustic model to estimate PESQ (ITU-T P.862) values using a first-order Fuzzy Sugeno approach implemented in the Adaptive Neuro-Fuzzy ...

Highlights

  • Extraction of speech signal factors using image processing techniques.
  • Signal image extracted from a psychoacoustic model.
  • Non-intrusive measurement based on PESQ values trained by ANFIS.
  • Configuration of ANFIS with fuzzy c-...

research-article
Determining spectral stability in vowels: A comparison and assessment of different metrics
Highlights

  • Different metrics for spectral stability identification in vowels are discussed.
  • A new metric is introduced.
  • The different metrics are assessed both on synthesized and natural speech.
  • Higher-dimensional metrics capture spectral ...

Abstract

This study investigated the performance of several metrics used to evaluate spectral stability in vowels. Four metrics suggested in the literature and a newly developed one were tested and compared to the traditional method of associating the ...

research-article
Toward enriched decoding of Mandarin spontaneous speech
Highlights

  • Enriched decoding of spontaneous speech achieves better recognition performance.
  • Part-of-speech features help to reduce the perplexity of the language model.
  • Hierarchical prosodic model enriches the recognition output with break type ...

Abstract

A deep neural network (DNN)-based automatic speech recognition (ASR) method for enriched decoding of Mandarin spontaneous speech is proposed. It adopts an enhanced approach over the baseline model built with factored time delay neural networks (...
