Dinkel et al., 2018 - Google Patents

Investigating raw wave deep neural networks for end-to-end speaker spoofing detection

Document ID
6253518142894034012
Author
Dinkel H
Qian Y
Yu K
Publication year
2018
Publication venue
IEEE/ACM Transactions on Audio, Speech, and Language Processing

Snippet

Recent advances in automatic speaker verification (ASV) have led to increased interest in securing these systems for real-world applications. Malicious spoofing attempts against ASV systems can lead to serious security breaches. A spoofing attack within the context of ASV is …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the analysis technique using neural networks

Similar Documents

Publication Title
Dinkel et al. Investigating raw wave deep neural networks for end-to-end speaker spoofing detection
Kabir et al. A survey of speaker recognition: Fundamental theories, recognition methods and opportunities
Hanifa et al. A review on speaker recognition: Technology and challenges
Kamble et al. Advances in anti-spoofing: from the perspective of ASVspoof challenges
Wu et al. Spoofing and countermeasures for speaker verification: A survey
Aljasem et al. Secure automatic speaker verification (SASV) system through sm-ALTP features and asymmetric bagging
Yoon et al. A new replay attack against automatic speaker verification systems
Biagetti et al. An investigation on the accuracy of truncated DKLT representation for speaker identification with short sequences of speech frames
Joshi et al. A Study of speech emotion recognition methods
Agrawal et al. Prosodic feature based text dependent speaker recognition using machine learning algorithms
Mamyrbayev et al. Development of security systems using DNN and i & x-vector classifiers
Zhu et al. Source tracing: detecting voice spoofing
Babu Rao et al. Automatic Speech Recognition Design Modeling
Neelima et al. Mimicry voice detection using convolutional neural networks
Reimao Synthetic speech detection using deep neural networks
Raghib et al. Emotion analysis and speech signal processing
Shitov et al. Learning acoustic word embeddings with dynamic time warping triplet networks
Rupesh Kumar et al. Generative and discriminative modelling of linear energy sub-bands for spoof detection in speaker verification systems
Gao Audio deepfake detection based on differences in human and machine generated speech
Praksah et al. Analysis of emotion recognition system through speech signal using KNN, GMM & SVM classifier
Dennis et al. Generalized Hough transform for speech pattern classification
Trabelsi et al. Comparison of several acoustic modeling techniques for speech emotion recognition
Khonglah et al. Exploration of deep belief networks for vowel-like regions detection
Manzo-Martínez et al. Analysis of the Impact Using Pre-emphasis Filter, Unvoiced Sounds, Frame Size and Feature Vector Size on Human Emotion Recognition by Voice and Machine Learning
Dulhare et al. A Novel Approach for Speech Emotion Recognition with Facial Expression Analysis