CN117012204B - Defense method for adversarial samples of a speaker recognition system - Google Patents
- Publication number: CN117012204B
- Application number: CN202310918349.2A
- Authority
- CN
- China
- Prior art keywords
- benign
- model
- cyclegan
- data
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/18—Artificial neural networks; Connectionist approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a defense method for adversarial samples against a speaker recognition system, comprising the following steps: (1) construct the required dataset; (2) build the network model, deriving the final CycleGAN-L2 model from an improved CycleGAN-VC2 model; (3) train the model using decremental learning; (4) run performance tests on the test set with the trained model and defend against adversarial samples generated by CW2, MIM, ADA and FGSM. The invention takes CycleGAN-VC2 as the backbone network and adds both adversarial and benign samples to the training set, reducing the side effects of the defense method. Following the idea of decremental learning, benign samples are deleted during training to accelerate model training, and the loss functions are constrained with the L2 distance to encourage the model to select more features, thereby realizing a defense against adversarial samples.
Description
Technical Field
The invention belongs to the technical field of voice systems, in particular to adversarial defense in speaker recognition, and more particularly relates to a defense method for adversarial samples of a speaker recognition system.
Background
Defending against audio adversarial samples is an important topic in adversarial defense; the quality of such defenses directly affects the reliability of identity authentication, forensic voice analysis, and personalized services on smart devices. As audio adversarial attacks continue to evolve and strengthen, protecting speaker recognition systems from malicious interference and attack is becoming increasingly important. When defending against audio adversarial samples, it is critical to counter the adversarial samples effectively without degrading accuracy on benign samples; this is essential to maintaining the accuracy and robustness of the speaker recognition system.

Security in the image domain has been widely studied. In the speech domain, however, and especially in speaker recognition systems, defense methods against adversarial samples have not been fully explored. Moreover, the security of speaker recognition systems cannot be neglected: if a speaker recognition system provides financial or privacy-related services without adequate protection, personal property and reputation can be seriously compromised.

Adversarial defenses can be divided into two classes, active and passive. Active defense uses adversarial samples for data augmentation, retraining the speaker recognition model to improve its robustness. Passive defense adds new components without modifying the original model; according to the function of the new component, passive methods can be further classified into detection methods and purification methods, which detect and eliminate the influence of adversarial samples, respectively.
The patent application numbered 202310123820.9 discloses a universal detection system and method for adversarial samples against speaker recognition systems. The system comprises a multi-channel audio interference module that applies audio perturbations to the input audio to generate a set of audio variants of the original; a speaker recognition module that feeds the generated variant set into the speaker recognition system and extracts the corresponding score sequence and decision sequence; a stability feature extraction module that computes statistical features of the score and decision sequences and concatenates them with the score sequence to obtain a stability representation; and a single-class decision module that judges from the stability representation whether the input audio is an adversarial sample. A universal detection method is also disclosed. The system adapts to adversarial attack detection under various conditions, enhancing the security of voice recognition.

The patent application numbered 202210659947.8 discloses a voiceprint-recognition adversarial sample detection method based on differing transferability and decision-boundary attacks. It first preprocesses the speaker signals and divides them into a training set and a test set; builds a voiceprint recognition model from the training data; and generates adversarial samples on the target model with different attack methods. A mixed set of clean and adversarial samples is fed into the target model and a detection model to obtain two labels, which are compared for consistency. If they differ, the detection value (the adversarial perturbation proportion) is set to 0; if they agree, the sample is attacked with a decision-boundary method to obtain its perturbation proportion. Decision-boundary attacks on a clean sample set yield a batch of perturbation proportions, from which a detection threshold is determined. A sample whose perturbation proportion exceeds the threshold is judged clean; otherwise it is judged adversarial.

Both patents above are detection methods. The universal detection system preprocesses the audio under test to enrich its variants, feeds them into the recognition system to obtain score and decision sequences, and extracts features from those sequences for detection. The transferability-based method trains two speaker recognition models; inconsistent outputs indicate an adversarial sample, and for adversarial samples that may slip through undetected, a HopSkipJumpAttack (HSJA) decision-boundary attack moves them across the decision boundary so they can be detected by comparison with a decision threshold.

The present inventors have devised a purification method different from the protection schemes of the two patents above, and have found no similar patent document.
Disclosure of Invention
The invention aims to provide a defense method for adversarial samples against speaker recognition systems, addressing the very slow training of generative adversarial networks and the side effects of the defense. It takes CycleGAN-VC2 as the backbone network and adds both adversarial and benign samples to the training set, reducing the side effects of the defense on benign samples; following the idea of decremental learning, it deletes benign samples during training to accelerate model training, and constrains the loss functions with the L2 distance to encourage the model to select more features, thereby realizing a defense against adversarial samples.
The technical scheme of the invention is as follows:
A defense method for adversarial samples against a speaker recognition system maintains the accuracy and robustness of the system by fusing decremental learning with an improved CycleGAN-VC2. First, benign and adversarial samples are both added to the dataset used to train the generator, and decremental learning is integrated into the training process to delete benign data; second, CycleGAN-VC2 is improved by constraining its loss functions with the L2 distance. The method comprises the following steps:
step 1, construct the required dataset;
step 2, build the network model, deriving the final CycleGAN-L2 model from an improved CycleGAN-VC2 model;
step 3, train the CycleGAN-L2 model using decremental learning;
step 4, run performance tests on the test set with the trained model and defend against adversarial samples generated by CW2, MIM, ADA and FGSM.
The steps are detailed as follows:

Step 1: obtain the LibriSpeech voice dataset and randomly select 10 speakers from it, using 100 audio files per speaker as the benign dataset. Apply a PGD attack to the benign dataset to generate 1000 adversarial samples as the adversarial dataset. Merge the benign and adversarial datasets into the natural dataset required for the experiments, and split the natural dataset into a training set and a test set at a ratio of 9:1, with a 1:1 ratio of benign to adversarial samples in both;
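The dataset construction in step 1 (a PGD attack followed by a 9:1 split with balanced classes) can be sketched as below. This is a minimal NumPy illustration: the gradient function, epsilon, step size and iteration count are illustrative assumptions, not the patent's actual x-vector model or attack settings.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.01, alpha=0.002, steps=10):
    """Projected Gradient Descent: repeatedly step along the sign of the
    loss gradient, then project back into the L-infinity ball of radius eps
    around the original sample."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection step
    return x_adv

def split_9_1(benign, adversarial, seed=0):
    """9:1 train/test split that keeps a 1:1 benign/adversarial ratio in
    both splits, as described in step 1. Labels: 0 = benign, 1 = adversarial."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(benign))
    cut = int(0.9 * len(benign))
    train = [(benign[i], 0) for i in idx[:cut]] + [(adversarial[i], 1) for i in idx[:cut]]
    test = [(benign[i], 0) for i in idx[cut:]] + [(adversarial[i], 1) for i in idx[cut:]]
    return train, test
```

Because every adversarial sample is paired with its benign source, the 1:1 class ratio holds in both splits by construction.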
Step 2: modify the cycle-consistency loss L_cyc and the identity-mapping loss L_id in the CycleGAN-VC2 model to obtain the CycleGAN-L2 model, with the specific formulas as follows:

Here G_nat→ori and G_ori→nat are generators. In the cycle-consistency loss, G_nat→ori(x) generates benign data y from x in the natural dataset, and G_ori→nat(y) generates natural data x from the benign sample y. In the identity-mapping loss, G_nat→ori(y) maps benign data y to itself, and G_ori→nat(x) maps natural data x to itself. The cycle-consistency loss L_cyc and the identity-mapping loss L_id are each constrained with the L2 distance;
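The loss formulas themselves did not survive the text extraction. A standard reconstruction — assuming the usual CycleGAN-VC2 cycle-consistency and identity-mapping losses with the L1 norm replaced by the L2 distance, as the surrounding text describes — would be:

```latex
L_{cyc} = \mathbb{E}_{x \sim P_{nat}}\!\left[\, \lVert G_{ori \to nat}(G_{nat \to ori}(x)) - x \rVert_2 \,\right]
        + \mathbb{E}_{y \sim P_{ori}}\!\left[\, \lVert G_{nat \to ori}(G_{ori \to nat}(y)) - y \rVert_2 \,\right]

L_{id}  = \mathbb{E}_{y \sim P_{ori}}\!\left[\, \lVert G_{nat \to ori}(y) - y \rVert_2 \,\right]
        + \mathbb{E}_{x \sim P_{nat}}\!\left[\, \lVert G_{ori \to nat}(x) - x \rVert_2 \,\right]
```

Here P_nat and P_ori denote the natural-data and benign-data distributions; the exact weighting of the two terms in the patent's formulas is not recoverable from this text.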
Step 3: train the CycleGAN-L2 model with the decremental learning method. During training, whenever the input to G_nat→ori is a benign sample and the benign sample it outputs leaves the accuracy of the x-vector speaker recognition model unchanged or lower, that benign sample is removed from the natural dataset;
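The decremental-learning rule in step 3 can be sketched as a filtering pass over the training pool. The generator and accuracy function below are stand-ins for G_nat→ori and the x-vector recognizer (both assumptions for illustration); only the removal logic follows the text.

```python
def decremental_filter(dataset, generator, accuracy_fn, baseline_acc):
    """Decremental-learning step (sketch): drop benign samples whose
    generated output no longer changes the recognizer's accuracy.

    dataset     : list of (sample, is_benign) pairs
    generator   : stand-in for G_nat->ori
    accuracy_fn : stand-in for x-vector accuracy evaluated on one output
    """
    kept = []
    for sample, is_benign in dataset:
        if is_benign:
            out = generator(sample)
            # output leaves accuracy unchanged or lower -> remove the sample
            if accuracy_fn(out) <= baseline_acc:
                continue
        kept.append((sample, is_benign))
    return kept
```

In a full training loop this filter would run periodically, so the training set shrinks as the generator learns to reproduce benign speech, which is what speeds up GAN training.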
Step 4: test the benign samples and the adversarial samples of the test set separately, generating 1000 adversarial samples with each of CW2, MIM, ADA and FGSM to evaluate the defensive effect.
The invention has the following features:
1. For speaker recognition systems, the invention improves the CycleGAN-VC2 model by constraining the loss functions with the L2 distance, encouraging the model to select more features during training and improving its learning performance.
2. For speaker recognition systems, the invention uses a natural dataset in place of a purely adversarial dataset, so that the model learns the characteristics of benign data and its side effects on benign samples are reduced.
3. For speaker recognition systems, the invention applies decremental learning during training, shrinking the training set as the model trains and greatly reducing the time required to train the generative adversarial network.
Drawings
FIG. 1 is a business flow diagram of the present invention;
FIG. 2 is a primary training flow diagram of the present invention;
FIG. 3 is a secondary training flow diagram of the present invention;
FIG. 4 is a block diagram of a generator;
FIG. 5 is a block diagram of the discriminator;
FIG. 6 compares the effect of different loss functions on the defense against PGD;
FIG. 7 shows the waveforms produced by different defenses;
FIG. 8 shows the spectrograms produced by different defenses.
Detailed Description
The invention is further described below by means of the figures and examples.
Referring to FIGS. 1-5, a defense method for adversarial samples against a speaker recognition system maintains the accuracy and robustness of the system by fusing decremental learning with an improved CycleGAN-VC2, comprising the following steps:
step 1, construct the required dataset;
step 2, build the network model, deriving the final CycleGAN-L2 model from an improved CycleGAN-VC2 model;
step 3, train the model using decremental learning;
step 4, run performance tests on the test set with the trained model and defend against adversarial samples generated by CW2, MIM, ADA and FGSM.
The specific steps are as follows:

Step 1: obtain the LibriSpeech voice dataset and randomly select 10 speakers from it, using 100 audio files per speaker as the benign dataset. Apply a PGD attack to the benign dataset to generate 1000 adversarial samples as the adversarial dataset. Merge the benign and adversarial datasets into the natural dataset required for the experiments, and split it into a training set and a test set at a ratio of 9:1, with a 1:1 ratio of benign to adversarial samples in both;
Step 2: modify the cycle-consistency loss L_cyc and the identity-mapping loss L_id in the CycleGAN-VC2 model to obtain the CycleGAN-L2 model, with the specific formulas as follows:

Here G_nat→ori and G_ori→nat are generators. G_nat→ori(x) generates benign data y from x in the natural dataset, and G_ori→nat(y) generates natural data x from the benign sample y. In the identity-mapping loss, G_nat→ori(y) maps benign data y to itself, and G_ori→nat(x) maps natural data x to itself. The cycle-consistency loss L_cyc and the identity-mapping loss L_id are each constrained with the L2 distance.
Step 3: train the CycleGAN-L2 model with the decremental learning method. During training, whenever the input to G_nat→ori is a benign sample and the benign sample it outputs leaves the accuracy of the x-vector speaker recognition model unchanged or lower, that benign sample is removed from the natural dataset.
Step 4: test the benign samples and the adversarial samples of the test set separately, generating 1000 adversarial samples with each of CW2, MIM, ADA and FGSM to evaluate the defensive effect.
CycleGAN-VC2 uses the L1 distance for model training. Considering that a speaker recognition scenario involves many speakers, the invention makes a partial modification on the basis of the CycleGAN-VC2 model, constraining the loss functions with the L2 distance to encourage the generator model to learn more features.
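The practical difference between the L1 and L2 constraints can be illustrated numerically: with the same total residual, the L2 distance penalizes a reconstruction error concentrated in one feature more than the same error spread across many features, nudging the generator to attend to all features rather than a few. This toy comparison is an illustration of that property, not the patent's training code.

```python
import numpy as np

def l1_loss(a, b):
    """Mean absolute error, as in the original CycleGAN-VC2 losses."""
    return np.mean(np.abs(a - b))

def l2_loss(a, b):
    """Root-mean-square (L2) distance used by CycleGAN-L2; squaring
    penalizes large per-feature errors disproportionately."""
    return np.sqrt(np.mean((a - b) ** 2))
```

For a target of zeros, the residuals `[1, 0, 0, 0]` and `[0.25, 0.25, 0.25, 0.25]` have identical L1 loss, but the concentrated residual has twice the L2 loss.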
Referring to FIG. 6, to verify the effectiveness of the L2 loss function in the invention, the defensive effects of the two loss variants, CycleGAN-L1 and CycleGAN-L2, were compared on the test set. CSI denotes an untargeted attack in closed-set identification, OSI-simple a simple targeted attack in open-set identification, and CSI-hard a hard targeted attack in closed-set identification. As FIG. 6 shows, the accuracy acc_adv of the target model under CycleGAN-L2 is better than under CycleGAN-L1 across the different speaker recognition tasks, verifying the effectiveness of the L2 distance for the invention.
In designing the defense based on a generative adversarial network, the invention aims to reduce the side effects of the CycleGAN-L2 model on benign samples while retaining a defensive effect against multiple types of attack.

To this end, the input to the generator G_nat→ori of the invention is supplied in two passes: the first input is real data comprising both adversarial and benign samples, and the second input consists of benign samples only, minimizing the adverse effect of the CycleGAN-L2 model on benign samples.
In Tables 1 and 2, acc_ben and acc_adv denote the accuracy of the speaker recognition model on benign samples and on adversarial samples, respectively. We defend primarily against the following untargeted attacks: FGSM (Fast Gradient Sign Method), a fast gradient-sign attack; MIM (Momentum Iterative Fast Gradient Sign Method), a gradient-based momentum-iteration attack; PGD (Projected Gradient Descent), a projected-gradient-descent attack; CW2 (Carlini & Wagner), an optimization-based attack; and ADA (A Highly Stealthy Adaptive Decay Attack), a highly covert adaptive attack.

QT (Quantization), AS (Average Smoothing) and MS (Median Smoothing) are time-domain methods that defend by quantization, average smoothing and median smoothing, respectively. DS (Down Sampling), LPF (Low Pass Filter) and BPF (Band Pass Filter) are frequency-domain methods that defend by downsampling, low-pass filtering and band-pass filtering, respectively. OPUS and SPEEX are speech-compression methods built on different speech-compression algorithms. CycleGAN-L2 and CycleGAN-L1 are speech-synthesis methods, CycleGAN-L2 being an improvement over CycleGAN-L1.
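Two of the time-domain baselines above (QT and MS) are simple enough to sketch; both try to destroy small adversarial perturbations by coarsening the waveform. The level count and window size below are illustrative assumptions, not the parameters used in the patent's experiments.

```python
import numpy as np

def quantize(audio, q=256):
    """QT baseline: round each sample (range [-1, 1]) to q levels,
    erasing perturbations smaller than the quantization step."""
    return np.round(audio * (q / 2)) / (q / 2)

def median_smooth(audio, k=3):
    """MS baseline: replace each sample with the median of a k-sample
    window (k odd), suppressing isolated adversarial spikes."""
    pad = k // 2
    padded = np.pad(audio, pad, mode="edge")
    return np.array([np.median(padded[i:i + k]) for i in range(len(audio))])
```

Such preprocessing defenses are attractive because they require no retraining, but as Tables 1 and 2 indicate, they generally cost more benign accuracy than the purification approach.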
As Table 1 shows, in the closed-set identification task, whether L1 or L2 is used, as long as adversarial samples are added to the training data alongside benign samples, acc_ben stays at 99.9%. Compared with QT, AS, MS, DS, LPF, BPF, OPUS and SPEEX, the CycleGAN-L2 defense leads in acc_ben. In defending against the other attack methods, CYC-L2 outperforms the other defenses, with acc_adv of 94.7%, 35.5%, 75.1%, 99.6% and 88.5%, respectively. CYC-L2 is, however, slightly weaker than QT in defending against ADA, by only 3.2%.
TABLE 1
TABLE 2
As Table 2 shows, in the open-set identification task the L2 method reaches an acc_ben of 97.7%, higher than the L1 method, indicating that L2 outperforms L1 with minimal side effects on benign samples. CYC-L2 also outperforms the other methods in defending against other attacks; for example, the model accuracy is 88.3% when CYC-L2 defends against FGSM. Compared with CYC-L1, QT, AS and LPF, the acc_adv of CYC-L2 is higher by 1.1%, 12.3%, 40.3% and 38.2%, respectively.
The invention provides a defense method for adversarial samples against speaker recognition systems; the model is named CycleGAN-L2. The model constrains its loss functions with the L2 distance, encouraging it to select more features during training and further improving the training effect, and it introduces decremental learning, greatly reducing the time required to train the generative adversarial network. To reduce the model's side effects on benign samples, both adversarial and benign samples are added to the training set. Experimental results show that in the closed-set and open-set identification tasks, the invention achieves acc_ben of 99.9% and 97.7%, respectively, with minimal impact on benign samples. In defending against FGSM, MIM, PGD, CW2 and ADA in open-set identification, its acc_adv is better than the other methods, giving a degree of robustness to different attacks. FIGS. 7 and 8 are defense visualizations of the different defense methods against the MIM attack.
In conclusion, the invention converts adversarial samples into benign samples with a generative adversarial network, adds benign samples to the dataset so that the target model's recognition accuracy on benign samples is not affected, and uses decremental learning during training to greatly reduce training time; the method can be deployed on any speaker recognition model.
The foregoing description is only a preferred embodiment of the present invention and is not intended to limit the invention in any way; any simple modification, equivalent change or variation of the above embodiment according to the technical substance of the present invention still falls within the scope of the technical scheme of the present invention.
Claims (1)
1. A defense method for adversarial samples against a speaker recognition system, characterized in that: the accuracy and robustness of the speaker recognition system are maintained by fusing decremental learning with an improved CycleGAN-VC2; first, benign and adversarial samples are both added to the dataset used to train the generator, and decremental learning is integrated into the training process to delete benign data; second, CycleGAN-VC2 is improved by constraining its loss functions with the L2 distance; the method comprises the following steps:
step 1, construct the required dataset;
step 2, build the network model, deriving the final CycleGAN-L2 model from an improved CycleGAN-VC2 model;
step 3, train the CycleGAN-L2 model using decremental learning;
step 4, run performance tests on the test set with the trained model and defend against adversarial samples generated by CW2, MIM, ADA and FGSM;
step 1 specifically comprises: obtaining the LibriSpeech voice dataset and randomly selecting 10 speakers from it, using 100 audio files per speaker as the benign dataset; applying a PGD attack to the benign dataset to generate 1000 adversarial samples as the adversarial dataset; merging the benign and adversarial datasets into the natural dataset required for the experiment; and splitting the natural dataset into a training set and a test set at a ratio of 9:1, with a 1:1 ratio of benign to adversarial samples in both;
step 2 specifically comprises: modifying the cycle-consistency loss L_cyc and the identity-mapping loss L_id in the CycleGAN-VC2 model to obtain the CycleGAN-L2 model, with the specific formulas as follows:

where G_nat→ori and G_ori→nat are generators; in the cycle-consistency loss, G_nat→ori(x) generates benign data y from x in the natural dataset, and G_ori→nat(y) generates natural data x from the benign sample y; in the identity-mapping loss, G_nat→ori(y) maps benign data y to itself, and G_ori→nat(x) maps natural data x to itself; the cycle-consistency loss L_cyc and the identity-mapping loss L_id are each constrained with the L2 distance;
the step 3 is specifically that a decrement learning method is adopted to train the CycleGAN-L2 model, and if G is generated in the training process nat→ori The input is benign sample, the benign sample that outputs makes the accuracy of speaker identification model x-vector unchanged or decline then remove the benign data in the natural dataset;
the step 4 specifically includes separately testing benign samples and challenge samples of the test set, and generating 1000 challenge samples by using CW2, MIM, ADA and FGSM, respectively, to perform a defensive effect test.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310918349.2A CN117012204B (en) | 2023-07-25 | 2023-07-25 | Defensive method for countermeasure sample of speaker recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310918349.2A CN117012204B (en) | 2023-07-25 | 2023-07-25 | Defensive method for countermeasure sample of speaker recognition system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117012204A CN117012204A (en) | 2023-11-07 |
CN117012204B true CN117012204B (en) | 2024-04-09 |
Family
ID=88566646
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310918349.2A Active CN117012204B (en) | 2023-07-25 | 2023-07-25 | Defensive method for countermeasure sample of speaker recognition system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117012204B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117292690B (en) * | 2023-11-24 | 2024-03-15 | 南京信息工程大学 | Voice conversion active defense method, device, system and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110767216A (en) * | 2019-09-10 | 2020-02-07 | 浙江工业大学 | Voice recognition attack defense method based on PSO algorithm |
CN111627429A (en) * | 2020-05-20 | 2020-09-04 | 浙江工业大学 | Defense method and device of voice recognition model based on cycleGAN |
WO2021169292A1 (en) * | 2020-02-24 | 2021-09-02 | 上海理工大学 | Adversarial optimization method for training process of generative adversarial neural network |
WO2021205746A1 (en) * | 2020-04-09 | 2021-10-14 | Mitsubishi Electric Corporation | System and method for detecting adversarial attacks |
CN115188384A (en) * | 2022-06-09 | 2022-10-14 | 浙江工业大学 | Voiceprint recognition countermeasure sample defense method based on cosine similarity and voice denoising |
CN115309897A (en) * | 2022-07-27 | 2022-11-08 | 方盈金泰科技(北京)有限公司 | Chinese multi-modal confrontation sample defense method based on confrontation training and contrast learning |
CN116013318A (en) * | 2022-12-13 | 2023-04-25 | 浙江大学 | Countermeasure sample construction method for voiceprint recognition defense module |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7259981B2 (en) * | 2019-10-17 | 2023-04-18 | 日本電気株式会社 | Speaker authentication system, method and program |
CN113052203B (en) * | 2021-02-09 | 2022-01-18 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Anomaly detection method and device for multiple types of data |
2023-07-25 | CN | Application CN202310918349.2A filed; granted as patent CN117012204B (active) |
Non-Patent Citations (2)
Title |
---|
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion; Takuhiro Kaneko; ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing; 2019-05-17; pp. 6820-6823 * |
Adversarial example detection method based on boundary-value invariants; Yan Fei; Zhang Minglun; Zhang Liqiang; Chinese Journal of Network and Information Security; 2020-02-15 (No. 01); pp. 1-3 * |
Also Published As
Publication number | Publication date |
---|---|
CN117012204A (en) | 2023-11-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113554089B (en) | Image classification countermeasure sample defense method and system and data processing terminal | |
CN109599109B (en) | Confrontation audio generation method and system for white-box scene | |
Chen et al. | Robust deep feature for spoofing detection—The SJTU system for ASVspoof 2015 challenge | |
CN112883874B (en) | Active defense method aiming at deep face tampering | |
CN117012204B (en) | Defensive method for countermeasure sample of speaker recognition system | |
CN109887496A (en) | Targeted adversarial audio generation method and system in a black-box scenario | |
CN112287323B (en) | Voice verification code generation method based on generation of countermeasure network | |
CN111881446B (en) | Industrial Internet malicious code identification method and device | |
Peng et al. | Pairing Weak with Strong: Twin Models for Defending Against Adversarial Attack on Speaker Verification. | |
CN115147682B (en) | Method and device for generating hidden white box countermeasure sample with mobility | |
CN114640518B (en) | Personalized trigger back door attack method based on audio steganography | |
Panariello et al. | Malafide: a novel adversarial convolutive noise attack against deepfake and spoofing detection systems | |
Wang et al. | ADDITION: Detecting Adversarial Examples With Image-Dependent Noise Reduction | |
CN113222120B (en) | Neural network back door injection method based on discrete Fourier transform | |
Liu et al. | Detecting adversarial audio via activation quantization error | |
CN113113023A (en) | Black box directional anti-attack method and system for automatic voiceprint recognition system | |
Kaushal et al. | The societal impact of Deepfakes: Advances in Detection and Mitigation | |
Kawa et al. | Defense against adversarial attacks on audio deepfake detection | |
CN116013318A (en) | Countermeasure sample construction method for voiceprint recognition defense module | |
CN116309031A (en) | Face counterfeiting active interference method, system, equipment and storage medium | |
CN112289324B (en) | Voiceprint identity recognition method and device and electronic equipment | |
CN118522290B (en) | Voice countermeasure sample generation method and device, electronic equipment and storage medium | |
CN111353403A (en) | Method and system for detecting confrontation sample of deep neural network image | |
CN113987955B (en) | Antagonistic sample defense method based on trap type integrated network | |
Mahfuz et al. | Ensemble noise simulation to handle uncertainty about gradient-based adversarial attacks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||