Shi et al., 2021 - Google Patents

Anti-replay: A fast and lightweight voice replay attack detection system

Document ID: 12199470115928641637
Author: Shi Z; Li C; Jin Z; Sun W; Ji X; Xu W
Publication year: 2021
Publication venue: 2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS)

Snippet

Due to the open nature of voice and voice interface, attackers can easily record the user's voice commands and spoof the voice recognition systems by replaying them. Existing voice replay attack detection methods mainly rely on extra hardware to determine the sound …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/10—Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G10L17/005—Speaker recognisers specially adapted for particular applications
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G10L15/08—Speech classification or search
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication	Publication Date	Title
Chen et al.	2021	Who is real Bob? Adversarial attacks on speaker recognition systems
Ahmed et al.	2020	Void: A fast and light voice liveness detection system
US11488605B2 (en)	2022-11-01	Method and apparatus for detecting spoofing conditions
Liu et al.	2018	An MFCC‐based text‐independent speaker identification system for access control
Huang et al.	2020	Audio replay spoof attack detection by joint segment-based linear filter bank feature extraction and attention-enhanced DenseNet-BiLSTM network
US8589167B2 (en)	2013-11-19	Speaker liveness detection
Huang et al.	2019	Audio replay spoof attack detection using segment-based hybrid feature and DenseNet-LSTM network
CN113823293B (en)	2024-04-26	Speaker recognition method and system based on voice enhancement
Adiban et al.	2020	Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge
Huang et al.	2021	Stop deceiving! an effective defense scheme against voice impersonation attacks on smart devices
Zhang et al.	2022	Waveform level adversarial example generation for joint attacks against both automatic speaker verification and spoofing countermeasures
Weng et al.	2015	The SYSU system for the Interspeech 2015 automatic speaker verification spoofing and countermeasures challenge
Chen et al.	2022	Push the limit of adversarial example attack on speaker recognition in physical domain
Huang et al.	2021	Audio replay spoofing attack detection using deep learning feature and long-short-term memory recurrent neural network
Singh et al.	2020	Replay attack detection using excitation source and system features
Nagakrishnan et al.	2022	Generic speech based person authentication system with genuine and spoofed utterances: different feature sets and models
Williams et al.	2023	Privacy-Preserving Occupancy Estimation
Ye et al.	2019	Detection of replay attack based on normalized constant Q cepstral feature
Huang et al.	2019	Audio-replay Attacks Spoofing Detection for Automatic Speaker Verification System
Ahmed et al.	2022	Detecting Replay Attack on Voice-Controlled Systems using Small Neural Networks
Impedovo et al.	2022	An Investigation on Voice Mimicry Attacks to a Speaker Recognition System
Park et al.	2022	User authentication method via speaker recognition and speech synthesis detection
Hajipour et al.	2021	Listening to sounds of silence for audio replay attack detection
Jahanirad et al.	2017	Blind source computer device identification from recorded VoIP calls for forensic investigation