Anti-replay: A fast and lightweight voice replay attack detection system
Shi et al., 2021
- Document ID
- 12199470115928641637
- Author
- Shi Z
- Li C
- Jin Z
- Sun W
- Ji X
- Xu W
- Publication year
- 2021
- Publication venue
- 2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS)
Snippet
Due to the open nature of voice and voice interface, attackers can easily record the user's voice commands and spoof the voice recognition systems by replaying them. Existing voice replay attack detection methods mainly rely on extra hardware to determine the sound …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/10—Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/005—Speaker recognisers specially adapted for particular applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
Similar Documents
Publication | Title |
---|---|
Chen et al. | Who is real bob? adversarial attacks on speaker recognition systems |
Ahmed et al. | Void: A fast and light voice liveness detection system |
US11488605B2 (en) | Method and apparatus for detecting spoofing conditions |
Liu et al. | An MFCC‐based text‐independent speaker identification system for access control |
Huang et al. | Audio replay spoof attack detection by joint segment-based linear filter bank feature extraction and attention-enhanced DenseNet-BiLSTM network |
US8589167B2 (en) | Speaker liveness detection |
Huang et al. | Audio replay spoof attack detection using segment-based hybrid feature and densenet-LSTM network |
CN113823293B (en) | Speaker recognition method and system based on voice enhancement |
Adiban et al. | Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge |
Huang et al. | Stop deceiving! an effective defense scheme against voice impersonation attacks on smart devices |
Zhang et al. | Waveform level adversarial example generation for joint attacks against both automatic speaker verification and spoofing countermeasures |
Weng et al. | The sysu system for the interspeech 2015 automatic speaker verification spoofing and countermeasures challenge |
Shi et al. | Anti-replay: A fast and lightweight voice replay attack detection system |
Chen et al. | Push the limit of adversarial example attack on speaker recognition in physical domain |
Huang et al. | Audio replay spoofing attack detection using deep learning feature and long-short-term memory recurrent neural network |
Singh et al. | Replay attack detection using excitation source and system features |
Nagakrishnan et al. | Generic speech based person authentication system with genuine and spoofed utterances: different feature sets and models |
Williams et al. | Privacy-Preserving Occupancy Estimation |
Ye et al. | Detection of replay attack based on normalized constant q cepstral feature |
Huang et al. | Audio-replay Attacks Spoofing Detection for Automatic Speaker Verification System |
Ahmed et al. | Detecting Replay Attack on Voice-Controlled Systems using Small Neural Networks |
Impedovo et al. | An Investigation on Voice Mimicry Attacks to a Speaker Recognition System |
Park et al. | User authentication method via speaker recognition and speech synthesis detection |
Hajipour et al. | Listening to sounds of silence for audio replay attack detection |
Jahanirad et al. | Blind source computer device identification from recorded VoIP calls for forensic investigation |