
CN117012204B - Defensive method for countermeasure sample of speaker recognition system - Google Patents

Defensive method for countermeasure sample of speaker recognition system

Info

Publication number
CN117012204B
CN117012204B CN202310918349.2A CN202310918349A CN117012204B
Authority
CN
China
Prior art keywords
benign
model
cyclegan
data
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310918349.2A
Other languages
Chinese (zh)
Other versions
CN117012204A (en
Inventor
徐洋
杨凌一
张思聪
谢晓尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Education University
Original Assignee
Guizhou Education University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Education University filed Critical Guizhou Education University
Priority to CN202310918349.2A priority Critical patent/CN117012204B/en
Publication of CN117012204A publication Critical patent/CN117012204A/en
Application granted granted Critical
Publication of CN117012204B publication Critical patent/CN117012204B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/18Artificial neural networks; Connectionist approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/094Adversarial learning
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/04Training, enrolment or model building
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a defense method against adversarial examples for speaker recognition systems, comprising the following steps: (1) create the required data set; (2) construct the network model, building the final model CycleGAN-L2 from an improved CycleGAN-VC2 model; (3) train the model using decremental learning; (4) run performance tests on the test set with the trained model and defend against adversarial examples generated by CW2, MIM, ADA and FGSM. The invention takes CycleGAN-VC2 as the backbone network and adds both adversarial and benign samples to the training set, reducing the side effects of the defense method. Following the idea of decremental learning, benign samples are deleted during training to speed up model training, and the loss function is constrained with the L2 distance to encourage the model to select more features, thereby achieving defense against adversarial examples.

Description

Defensive method for countermeasure sample of speaker recognition system
Technical Field
The invention belongs to the technical field of voice systems, in particular to adversarial defense in speaker recognition, and more specifically relates to a defense method against adversarial examples for speaker recognition systems.
Background
Defending against audio adversarial examples is an important topic in adversarial defense; the effectiveness of the defense directly affects the reliability of identity authentication, forensic verification, and personalized services on smart devices. As audio adversarial attacks continue to evolve and strengthen, it becomes increasingly important to protect speaker recognition systems from malicious interference and attack. When defending against audio adversarial examples, it is critical to defend effectively without degrading accuracy on benign samples; this is essential to maintaining the accuracy and robustness of the speaker recognition system.
Security in the image domain has been widely studied. In the speech domain, however, and especially in speaker recognition systems, defenses against adversarial examples have not been fully explored. Security issues in speaker recognition are not negligible: if a speaker recognition system provides financial or privacy-related services and is not adequately secured, personal property and reputation can be severely compromised.
Adversarial defenses can be divided into two approaches: active defense and passive defense. Active defense uses adversarial examples for data augmentation, retraining the speaker recognition model to improve its robustness. Passive defense adds new components without modifying the original model; according to the function of those components, passive methods can be classified into detection methods and purification methods, which respectively detect and eliminate the influence of adversarial examples.
The patent application with application number 202310123820.9 discloses a universal detection system and method for adversarial examples against speaker recognition systems. The system comprises a multi-channel audio interference module that applies audio perturbations to the input audio to generate a set of audio variants of the original audio; a speaker recognition module that feeds the generated variants into the speaker recognition system and extracts the corresponding score sequence and decision sequence; a stability feature extraction module that extracts statistical features from the score and decision sequences and concatenates them with the score sequence to obtain a stability representation; and a one-class decision module that judges from the stability representation whether the input audio is an adversarial example. A universal detection method is also disclosed. The system can adaptively detect adversarial attacks under various conditions, enhancing the security of voice recognition.
The patent application with application number 202210659947.8 discloses a voiceprint recognition adversarial example detection method based on differing transferability and decision-boundary attacks. It first preprocesses the speaker signals and divides them into a training set and a test set; builds a voiceprint recognition model from the training data; and generates adversarial examples on the target model with different attack methods. A mixed set of clean and adversarial samples is fed into the target model and a detection model to obtain two labels, which are compared for consistency: if inconsistent, the detection value (the adversarial perturbation proportion) is set to 0; if consistent, the sample whose label is unchanged is attacked with a decision-boundary attack to obtain its adversarial perturbation proportion. Decision-boundary attacks on a clean sample set yield a batch of perturbation proportions, from which a detection threshold is determined. Adversarial examples are then detected with this threshold: if a sample's perturbation proportion exceeds the threshold it is a clean sample, otherwise it is an adversarial example.
Both of the above patents are detection methods. The universal detection system preprocesses the audio under test to enrich its variants, feeds them to the recognition system to obtain score and decision sequences, and extracts features from these sequences for detection. The transferability-based method trains two speaker recognition models; if their outputs disagree, an adversarial example is detected. Since some adversarial examples in the input may still go undetected, the HopSkipJumpAttack (HSJA) decision-boundary attack is applied to push them across the decision boundary, and they are detected by comparison with a decision threshold.
The present inventors devised a purification method different from the protection approaches of the above two patents and found no similar patent document.
Disclosure of Invention
The invention aims to provide a defense method against adversarial examples for speaker recognition systems, addressing the slow training of generative adversarial networks and the side effects of defenses. It takes CycleGAN-VC2 as the backbone network and adds both adversarial and benign samples to the training set, reducing the defense's side effects on benign samples; following the idea of decremental learning, benign samples are deleted during training to speed up model training, and the loss function is constrained with the L2 distance to encourage the model to select more features, thereby achieving defense against adversarial examples.
The technical scheme of the invention is as follows:
A defense method against adversarial examples for speaker recognition systems maintains the accuracy and robustness of the speaker recognition system by fusing decremental learning with an improved CycleGAN-VC2. First, benign and adversarial samples are both added to the data set used to train the generator, and decremental learning is integrated into the training process to delete benign data; second, CycleGAN-VC2 is improved by constraining its loss function with the L2 distance. The method comprises the following steps:
step 1, creating the required data set;
step 2, constructing the network model, building the final CycleGAN-L2 model from an improved CycleGAN-VC2 model;
step 3, training the model CycleGAN-L2 using decremental learning;
step 4, performing a performance test on the test set with the trained model, and defending against adversarial examples generated by CW2, MIM, ADA and FGSM.
The steps are specifically as follows:
step 1, obtain the Librispeech voice data set and randomly select 10 speakers from it, using 100 audio files per speaker as the benign data set; perform a PGD attack on the benign data set to generate 1000 adversarial examples as the adversarial data set; combine the benign and adversarial data sets into the natural data set required for the experiments, and divide the natural data set into a training set and a test set at a 9:1 ratio, where the ratio of benign to adversarial samples in both the training set and the test set is 1:1;
step 2, modify the cycle consistency loss L_cyc and the identity mapping loss L_id in the CycleGAN-VC2 model to obtain the CycleGAN-L2 model, with the specific formulas:

L_cyc = E_x[ ||G_ori→nat(G_nat→ori(x)) − x||_2 ] + E_y[ ||G_nat→ori(G_ori→nat(y)) − y||_2 ]
L_id = E_y[ ||G_nat→ori(y) − y||_2 ] + E_x[ ||G_ori→nat(x) − x||_2 ]

wherein G_nat→ori and G_ori→nat are generators; in the cycle consistency loss, G_nat→ori(x) generates benign data y from natural data x, and G_ori→nat(y) generates natural data x from benign data y; in the identity mapping loss, G_nat→ori(y) maps benign data y to itself, and G_ori→nat(x) maps natural data x to itself; the cycle consistency loss L_cyc and the identity mapping loss L_id are each constrained with the L2 distance;
step 3, train the CycleGAN-L2 model with decremental learning: during training, if G_nat→ori receives a benign sample as input and the benign sample it outputs leaves the accuracy of the x-vector speaker recognition model unchanged or lowers it, remove that benign data from the natural data set;
step 4, test the benign and adversarial samples of the test set separately, and generate 1000 adversarial examples each with CW2, MIM, ADA and FGSM to test the defensive effect.
The invention has the following characteristics:
1. For the speaker recognition system, the invention improves the CycleGAN-VC2 model by constraining the loss function with the L2 distance, encouraging the model to select more features during training and improving its learning performance.
2. For the speaker recognition system, the invention trains on the natural data set instead of an adversarial-only data set, so the model also learns the characteristics of benign data, reducing its side effects on benign samples.
3. For the speaker recognition system, the invention applies decremental learning during training, shrinking the training data as training proceeds and greatly reducing the time required to train the generative adversarial network.
Drawings
FIG. 1 is a business flow diagram of the present invention;
FIG. 2 is a primary training flow diagram of the present invention;
FIG. 3 is a secondary training flow diagram of the present invention;
FIG. 4 is a block diagram of a generator;
FIG. 5 is a block diagram of the discriminator;
FIG. 6 compares the effects of different loss functions when defending against PGD;
FIG. 7 shows the waveforms produced by different defenses;
FIG. 8 shows the spectrograms produced by different defenses.
Detailed Description
The invention is further described below with reference to the figures and embodiments.
Referring to FIGS. 1-5, a defense method against adversarial examples for speaker recognition systems maintains the accuracy and robustness of the speaker recognition system by fusing decremental learning with an improved CycleGAN-VC2, comprising the following steps:
step 1, creating the required data set;
step 2, constructing the network model, building the final CycleGAN-L2 model from an improved CycleGAN-VC2 model;
step 3, training the model using decremental learning;
step 4, performing a performance test on the test set with the trained model, and defending against adversarial examples generated by CW2, MIM, ADA and FGSM.
The method comprises the following specific steps:
step 1, obtain the Librispeech voice data set and randomly select 10 speakers from it, using 100 audio files per speaker as the benign data set; perform a PGD attack on the benign data set to generate 1000 adversarial examples as the adversarial data set; combine the benign and adversarial data sets into the natural data set required for the experiments, and divide the natural data set into a training set and a test set at a 9:1 ratio, where the ratio of benign to adversarial samples in both the training set and the test set is 1:1;
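The data-set construction of step 1 can be organized as in the following sketch; the file names, the stand-ins for the PGD-attacked copies, and the split helper are illustrative assumptions, not the patent's actual code:

```python
import random

def make_natural_dataset(n_speakers=10, files_per_speaker=100):
    """Assemble the 'natural' data set: 10 x 100 benign utterances plus
    one adversarial copy per file (the PGD attack itself is elided)."""
    benign = [(f"spk{s:02d}_utt{u:03d}.flac", "benign")
              for s in range(n_speakers) for u in range(files_per_speaker)]
    # stand-ins for the PGD-attacked versions of each benign file
    adversarial = [(path.replace(".flac", "_pgd.flac"), "adv")
                   for path, _ in benign]
    return benign, adversarial

def split_9_to_1(benign, adversarial, seed=0):
    """Split each class 9:1 separately so that both the training set and
    the test set keep the required 1:1 benign/adversarial ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for group in (benign, adversarial):
        items = group[:]
        rng.shuffle(items)
        cut = int(0.9 * len(items))
        train += items[:cut]
        test += items[cut:]
    return train, test

benign, adversarial = make_natural_dataset()
train, test = split_9_to_1(benign, adversarial)
```

With 1000 benign and 1000 adversarial files this yields 1800 training and 200 test samples, each half benign and half adversarial, matching the ratios stated above.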
step 2, modify the cycle consistency loss L_cyc and the identity mapping loss L_id in the CycleGAN-VC2 model to obtain the CycleGAN-L2 model, with the specific formulas:

L_cyc = E_x[ ||G_ori→nat(G_nat→ori(x)) − x||_2 ] + E_y[ ||G_nat→ori(G_ori→nat(y)) − y||_2 ]
L_id = E_y[ ||G_nat→ori(y) − y||_2 ] + E_x[ ||G_ori→nat(x) − x||_2 ]

wherein G_nat→ori and G_ori→nat are generators. In the cycle consistency loss, G_nat→ori(x) generates benign data y from natural data x, and G_ori→nat(y) generates natural data x from benign data y; in the identity mapping loss, G_nat→ori(y) maps benign data y to itself, and G_ori→nat(x) maps natural data x to itself. The cycle consistency loss L_cyc and the identity mapping loss L_id are each constrained with the L2 distance.
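As a numerical illustration of the two L2-constrained losses (the generator callables, batch shapes, and mean-squared formulation are assumptions made for this sketch, not the patent's exact implementation):

```python
import numpy as np

def l2(a, b):
    # mean squared L2 distance between two batches of features
    return float(np.mean((a - b) ** 2))

def cycle_consistency_loss(g_nat2ori, g_ori2nat, x_nat, y_ori):
    """L_cyc: a round trip through both generators should return
    each sample to itself, measured with the L2 distance."""
    return (l2(g_ori2nat(g_nat2ori(x_nat)), x_nat)
            + l2(g_nat2ori(g_ori2nat(y_ori)), y_ori))

def identity_loss(g_nat2ori, g_ori2nat, x_nat, y_ori):
    """L_id: a generator fed data already in its target domain
    should leave it unchanged, again under the L2 distance."""
    return l2(g_nat2ori(y_ori), y_ori) + l2(g_ori2nat(x_nat), x_nat)
```

With identity generators both losses are exactly zero; any deviation is penalized quadratically, which is what encourages the generator to match more features than the L1 distance used in the original CycleGAN-VC2.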
Step 3, train the CycleGAN-L2 model with decremental learning: if, during training, G_nat→ori receives a benign sample as input and the benign sample it outputs leaves the accuracy of the x-vector speaker recognition model unchanged or lowers it, that benign data is removed from the natural data set.
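One way to read this pruning criterion, sketched with a hypothetical generator and accuracy callable (both are assumptions for illustration):

```python
def decremental_prune(train_set, g_nat2ori, accuracy_fn, baseline_acc):
    """Drop a benign sample from the natural training set when passing it
    through G_nat->ori leaves the x-vector accuracy unchanged or lower,
    i.e. keeping it no longer improves the generator."""
    kept = []
    for sample, label in train_set:
        if label == "benign" and accuracy_fn(g_nat2ori(sample)) <= baseline_acc:
            continue  # accuracy unchanged or declined: remove this sample
        kept.append((sample, label))
    return kept
```

Adversarial samples are never pruned; only benign data is removed as training progresses, which is what shrinks the training set and reduces the time needed to train the generative adversarial network.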
Step 4, test the benign and adversarial samples of the test set separately, and generate 1000 adversarial examples each with CW2, MIM, ADA and FGSM to test the defensive effect.
In CycleGAN-VC2, the L1 distance is used for model training. Considering that a speaker recognition scenario involves many speakers, the invention partially modifies the CycleGAN-VC2 model and constrains the loss function with the L2 distance, encouraging the generator to learn more features.
Referring to FIG. 6, to verify the effectiveness of the L2 loss function in the invention, the defensive effects of the two loss variants, CycleGAN-L1 and CycleGAN-L2, were compared on the test set. Here CSI denotes an untargeted attack in closed-set identification, OSI-simple a simple targeted attack in open-set identification, and CSI-hard a hard targeted attack in closed-set identification. As FIG. 6 shows, the accuracy acc_adv of the target model under CycleGAN-L2 is better than under CycleGAN-L1 across the different speaker recognition tasks, verifying the effectiveness of the L2 distance for the invention.
In choosing a defense based on generative adversarial networks, the invention aims to reduce the side effects of the CycleGAN-L2 model on benign samples while retaining a defensive effect against multiple attack types. To this end, the generator G_nat→ori is fed in two passes: the first input is real data comprising both adversarial and benign samples, and the second input is benign samples only, minimizing the adverse effect of the CycleGAN-L2 model on benign samples.
In Tables 1 and 2, acc_ben and acc_adv denote the accuracy of the speaker recognition model on benign samples and on adversarial samples, respectively. The defense primarily targets the following untargeted attacks: FGSM (Fast Gradient Sign Method), a fast gradient sign attack; MIM (Momentum Iterative Fast Gradient Sign Method), a gradient-based momentum iterative attack; PGD (Projected Gradient Descent), a projected gradient descent attack; CW2 (Carlini & Wagner), an optimization-based attack; and ADA (A Highly Stealthy Adaptive Decay Attack), a highly stealthy adaptive attack.
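For context, the gradient-based attacks above reduce to a few lines each; the epsilon, step size, and gradient callable below are illustrative assumptions, not the settings used in the experiments:

```python
import numpy as np

def fgsm(x, grad, eps=0.002):
    """FGSM: a single step of size eps in the direction of the sign of
    the loss gradient with respect to the input audio x."""
    return x + eps * np.sign(grad)

def pgd(x, grad_fn, eps=0.002, alpha=0.0005, steps=10):
    """PGD: iterated FGSM steps, each projected back into the eps-ball
    around the clean audio (MIM would additionally keep a momentum
    term on the gradient)."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into eps-ball
    return x_adv
```

The projection step is what bounds the perturbation, so a PGD adversarial example never deviates from the clean waveform by more than eps per sample.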
QT (Quantization), AS (Average Smoothing) and MS (Median Smoothing) are time-domain methods that defend by quantization, average smoothing and median smoothing, respectively. DS (Down Sampling), LPF (Low Pass Filter) and BPF (Band Pass Filter) are frequency-domain methods that defend by downsampling, low-pass filtering and band-pass filtering, respectively. OPUS and SPEEX are speech-compression methods built on different compression algorithms. CycleGAN-L2 and CycleGAN-L1 are speech-synthesis methods, CycleGAN-L2 being an improvement over CycleGAN-L1.
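The time-domain baselines can be sketched directly on a waveform array; the kernel sizes and quantization level here are illustrative assumptions, not the settings used in the comparison:

```python
import numpy as np

def quantize(x, levels=256):
    """QT: snap samples to a coarse grid, discarding the fine-grained
    perturbation an adversarial example relies on."""
    return np.round(x * levels) / levels

def average_smooth(x, k=3):
    """AS: moving-average filter over the waveform."""
    return np.convolve(x, np.ones(k) / k, mode="same")

def median_smooth(x, k=3):
    """MS: sliding median filter over the waveform."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([np.median(xp[i:i + k]) for i in range(len(x))])
```

All three operate sample-by-sample on the waveform, which is why they are cheap to deploy but, as the tables suggest, less selective than a learned purification model.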
As Table 1 shows, in the closed-set identification task, whether L1 or L2 is used, adding adversarial samples to the training data alongside benign samples keeps acc_ben at 99.9%. Compared with QT, AS, MS, DS, LPF, BPF, OPUS and SPEEX, the CycleGAN-L2 defense dominates in acc_ben. When defending against the other attack methods, CYC-L2 also outperforms the other defenses, with acc_adv of 94.7%, 35.5%, 75.1%, 99.6% and 88.5%, respectively. Only against ADA is CYC-L2 slightly less effective than QT, by just 3.2%.
TABLE 1
TABLE 2
As Table 2 shows, in the open-set identification task, acc_ben with the L2 method is 97.7%, higher than with the L1 method, indicating that L2 outperforms L1 and has minimal side effects on benign samples. CYC-L2 also defends against the other attacks better than the other methods; for example, model accuracy reaches 88.3% when CYC-L2 defends against FGSM. The acc_adv of CYC-L2 differs from that of CYC-L1, QT, AS and LPF by 1.1%, 12.3%, 40.3% and 38.2%, respectively.
The invention provides a defense method against adversarial examples for speaker recognition systems. The model, named CycleGAN-L2, constrains its loss function with the L2 distance to encourage the selection of more features during training, further improving the training effect, and introduces decremental learning, greatly reducing the time required to train the generative adversarial network. To reduce side effects on benign samples, both adversarial and benign samples are added to the training set. Experimental results show that in the closed-set and open-set identification tasks, acc_ben reaches 99.9% and 97.7% respectively, with minimal impact on benign samples. When defending against FGSM, MIM, PGD, CW2 and ADA in open-set identification, acc_adv is better than with other methods, showing resistance to different attacks. FIGS. 7 and 8 visualize the different defense methods against MIM attacks.
In conclusion, the invention converts adversarial examples into benign samples using a generative adversarial network, adds benign samples to the data set so that the target model's recognition accuracy on benign samples is not affected, and uses decremental learning to greatly reduce the model's training time. The method can be deployed on any speaker recognition model.
The foregoing description is only a preferred embodiment of the present invention and is not intended to limit the invention in any way; any simple modification, equivalent change or variation of the above embodiment according to the technical substance of the invention still falls within the scope of the technical scheme of the invention.

Claims (1)

1. A method of defending against adversarial examples for a speaker recognition system, characterized in that: the accuracy and robustness of the speaker recognition system are maintained by fusing decremental learning with an improved CycleGAN-VC2; first, benign and adversarial samples are both added to the data set used to train the generator, and decremental learning is integrated into the training process to delete benign data; second, CycleGAN-VC2 is improved by constraining its loss function with the L2 distance; the method comprises the following steps:
step 1, creating the required data set;
step 2, constructing the network model, building the final CycleGAN-L2 model from an improved CycleGAN-VC2 model;
step 3, training the model CycleGAN-L2 using decremental learning;
step 4, performing a performance test on the test set with the trained model, and defending against adversarial examples generated by CW2, MIM, ADA and FGSM;
step 1 specifically comprises: obtaining the Librispeech voice data set and randomly selecting 10 speakers from it, using 100 audio files per speaker as the benign data set; performing a PGD attack on the benign data set to generate 1000 adversarial examples as the adversarial data set; combining the benign and adversarial data sets into the natural data set required for the experiments; and dividing the natural data set into a training set and a test set at a 9:1 ratio, where the ratio of benign to adversarial samples in both the training set and the test set is 1:1;
step 2 specifically comprises: modifying the cycle consistency loss L_cyc and the identity mapping loss L_id in the CycleGAN-VC2 model to obtain the CycleGAN-L2 model, with the specific formulas:

L_cyc = E_x[ ||G_ori→nat(G_nat→ori(x)) − x||_2 ] + E_y[ ||G_nat→ori(G_ori→nat(y)) − y||_2 ]
L_id = E_y[ ||G_nat→ori(y) − y||_2 ] + E_x[ ||G_ori→nat(x) − x||_2 ]

wherein G_nat→ori and G_ori→nat are generators; in the cycle consistency loss, G_nat→ori(x) generates benign data y from natural data x, and G_ori→nat(y) generates natural data x from benign data y; in the identity mapping loss, G_nat→ori(y) maps benign data y to itself, and G_ori→nat(x) maps natural data x to itself; the cycle consistency loss L_cyc and the identity mapping loss L_id are each constrained with the L2 distance;
the step 3 is specifically that a decrement learning method is adopted to train the CycleGAN-L2 model, and if G is generated in the training process nat→ori The input is benign sample, the benign sample that outputs makes the accuracy of speaker identification model x-vector unchanged or decline then remove the benign data in the natural dataset;
the step 4 specifically includes separately testing benign samples and challenge samples of the test set, and generating 1000 challenge samples by using CW2, MIM, ADA and FGSM, respectively, to perform a defensive effect test.
CN202310918349.2A 2023-07-25 2023-07-25 Defensive method for countermeasure sample of speaker recognition system Active CN117012204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310918349.2A CN117012204B (en) 2023-07-25 2023-07-25 Defensive method for countermeasure sample of speaker recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310918349.2A CN117012204B (en) 2023-07-25 2023-07-25 Defensive method for countermeasure sample of speaker recognition system

Publications (2)

Publication Number Publication Date
CN117012204A CN117012204A (en) 2023-11-07
CN117012204B true CN117012204B (en) 2024-04-09

Family

ID=88566646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310918349.2A Active CN117012204B (en) 2023-07-25 2023-07-25 Defensive method for countermeasure sample of speaker recognition system

Country Status (1)

Country Link
CN (1) CN117012204B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117292690B (en) * 2023-11-24 2024-03-15 南京信息工程大学 Voice conversion active defense method, device, system and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110767216A (en) * 2019-09-10 2020-02-07 浙江工业大学 Voice recognition attack defense method based on PSO algorithm
CN111627429A (en) * 2020-05-20 2020-09-04 浙江工业大学 Defense method and device of voice recognition model based on cycleGAN
WO2021169292A1 (en) * 2020-02-24 2021-09-02 上海理工大学 Adversarial optimization method for training process of generative adversarial neural network
WO2021205746A1 (en) * 2020-04-09 2021-10-14 Mitsubishi Electric Corporation System and method for detecting adversarial attacks
CN115188384A (en) * 2022-06-09 2022-10-14 浙江工业大学 Voiceprint recognition countermeasure sample defense method based on cosine similarity and voice denoising
CN115309897A (en) * 2022-07-27 2022-11-08 方盈金泰科技(北京)有限公司 Chinese multi-modal confrontation sample defense method based on confrontation training and contrast learning
CN116013318A (en) * 2022-12-13 2023-04-25 浙江大学 Countermeasure sample construction method for voiceprint recognition defense module

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7259981B2 (en) * 2019-10-17 2023-04-18 日本電気株式会社 Speaker authentication system, method and program
CN113052203B (en) * 2021-02-09 2022-01-18 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Anomaly detection method and device for multiple types of data

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110767216A (en) * 2019-09-10 2020-02-07 浙江工业大学 Voice recognition attack defense method based on PSO algorithm
WO2021169292A1 (en) * 2020-02-24 2021-09-02 上海理工大学 Adversarial optimization method for training process of generative adversarial neural network
WO2021205746A1 (en) * 2020-04-09 2021-10-14 Mitsubishi Electric Corporation System and method for detecting adversarial attacks
CN111627429A (en) * 2020-05-20 2020-09-04 浙江工业大学 Defense method and device of voice recognition model based on cycleGAN
CN115188384A (en) * 2022-06-09 2022-10-14 浙江工业大学 Voiceprint recognition countermeasure sample defense method based on cosine similarity and voice denoising
CN115309897A (en) * 2022-07-27 2022-11-08 方盈金泰科技(北京)有限公司 Chinese multi-modal confrontation sample defense method based on confrontation training and contrast learning
CN116013318A (en) * 2022-12-13 2023-04-25 浙江大学 Countermeasure sample construction method for voiceprint recognition defense module

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Takuhiro Kaneko, "CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing, 2019-05-17, pp. 6820-6823 *
Yan Fei, Zhang Minglun, Zhang Liqiang, "Adversarial example detection method based on boundary-value invariants," Chinese Journal of Network and Information Security, 2020-02-15 (No. 01), pp. 1-3 *

Also Published As

Publication number Publication date
CN117012204A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN113554089B (en) Image classification countermeasure sample defense method and system and data processing terminal
CN109599109B (en) Confrontation audio generation method and system for white-box scene
Chen et al. Robust deep feature for spoofing detection—The SJTU system for ASVspoof 2015 challenge
CN112883874B (en) Active defense method aiming at deep face tampering
CN117012204B (en) Defensive method for countermeasure sample of speaker recognition system
CN109887496A (en) Orientation confrontation audio generation method and system under a kind of black box scene
CN112287323B (en) Voice verification code generation method based on generation of countermeasure network
CN111881446B (en) Industrial Internet malicious code identification method and device
Peng et al. Pairing Weak with Strong: Twin Models for Defending Against Adversarial Attack on Speaker Verification.
CN115147682B (en) Method and device for generating hidden white box countermeasure sample with mobility
CN114640518B (en) Personalized trigger back door attack method based on audio steganography
Panariello et al. Malafide: a novel adversarial convolutive noise attack against deepfake and spoofing detection systems
Wang et al. ADDITION: Detecting Adversarial Examples With Image-Dependent Noise Reduction
CN113222120B (en) Neural network back door injection method based on discrete Fourier transform
Liu et al. Detecting adversarial audio via activation quantization error
CN113113023A (en) Black box directional anti-attack method and system for automatic voiceprint recognition system
Kaushal et al. The societal impact of Deepfakes: Advances in Detection and Mitigation
Kawa et al. Defense against adversarial attacks on audio deepfake detection
CN116013318A (en) Countermeasure sample construction method for voiceprint recognition defense module
CN116309031A (en) Face counterfeiting active interference method, system, equipment and storage medium
CN112289324B (en) Voiceprint identity recognition method and device and electronic equipment
CN118522290B (en) Voice countermeasure sample generation method and device, electronic equipment and storage medium
CN111353403A (en) Method and system for detecting confrontation sample of deep neural network image
CN113987955B (en) Antagonistic sample defense method based on trap type integrated network
Mahfuz et al. Ensemble noise simulation to handle uncertainty about gradient-based adversarial attacks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant